MATRIX OPERATIONS - TINY_MATRIX¶
TINY_MATRIX Library
- A lightweight matrix computation library implemented in C++, providing basic matrix operations and linear-algebra functions.
- The design goal is a simple, easy-to-use matrix interface suitable for embedded systems and resource-constrained environments.
Usage Scenario
Compared with the tiny_mat (C) library, tiny_matrix offers richer functionality and greater flexibility, making it suitable for applications that require complex matrix computations. Note, however, that this library is written in C++.
This page documents the full tiny::Mat interface, including core matrix storage, element access, ROI helpers, arithmetic operators, linear-system solvers, matrix decompositions, and eigenvalue routines.
Design Insights¶
Why Two Matrix Libraries? (C + C++)¶
TinySHM provides two matrix libraries with different trade-offs:
| | tiny_mat (C) | tiny::Mat (C++) |
|---|---|---|
| Language | C89 compatible | C++17 |
| Memory | Manual (float buffer[100]) | RAII (auto alloc/free) |
| API style | func(buffer, rows, cols, ...) | A.multiply(B) or A * B |
| Decompositions | QR only (MGS, HQR) | LU, Cholesky, QR, SVD, Jacobi EVP |
| Eigenvalue | Francis QR (C array output) | Francis QR + Jacobi (Mat object output) |
| Solvers | — | Linear systems, pseudo-inverse |
| Binary size | \(\sim\)2 KB | \(\sim\)25 KB |
| Best for | Low-level kernels, C-only projects | SHM algorithms, convenience |
Insight: Choose tiny_mat (C) when…
- You need minimal binary size (ESP32-S3 has 512 KB flash — every KB matters)
- Your codebase is pure C
- You only need basic element-wise operations + QR decomposition
- You want to avoid C++ runtime overhead
Insight: Choose tiny::Mat (C++) when…
- You need matrix multiplication, LU decomposition, SVD, or eigenvalue solvers
- RAII memory management simplifies your code
- Operator overloading (\(A \times B\) vs tiny_mat_mult_f32(...)) improves readability
- You need ROI/view/sub-matrix operations for complex matrix workflows
Memory Model: Row-Major with Padding¶
The Mat class uses row-major storage with optional row padding:
Logical 3×3 matrix:     Storage (step=4):
┌───┬───┬───┐           ┌───┬───┬───┬───┐
│ a │ b │ c │           │ a │ b │ c │ - │
├───┼───┼───┤           ├───┼───┼───┼───┤
│ d │ e │ f │           │ d │ e │ f │ - │
├───┼───┼───┤           ├───┼───┼───┼───┤
│ g │ h │ i │           │ g │ h │ i │ - │
└───┴───┴───┘           └───┴───┴───┴───┘
element(i,j) = data[i * step + j]
| Field | Meaning | Example |
|---|---|---|
| row | Logical rows | 3 |
| col | Logical columns | 3 |
| pad | Placeholder elements per row | 1 |
| step | Physical row width \(= cols + pad\) | 4 |
| element | \(rows \times cols\) | 9 |
| memory | \(rows \times step\) | 12 |
Why Padding?
Padding aligns matrix rows to power-of-two boundaries, enabling SIMD/NEON/ESP-DSP vectorised loads. On ESP32, padding can reduce matrix multiplication time by 2-5× compared with tight (unpadded) storage, while the memory overhead is only \(\text{pad} \times \text{rows}\) floats.
Sub-Matrix Model: View vs Copy¶
The Mat class offers three ways to work with sub-matrices:
| Method | Memory | Write-back? | Use Case |
|---|---|---|---|
| view_roi() | Shallow — shares data with parent | Yes | Temporary sub-block access |
| copy_roi() | Deep — allocates new buffer | Independent | Extracting a sub-matrix for separate computation |
| block() | Deep — allocates new buffer | Independent | Same as copy_roi, different interface |
Mat big(10, 10);
Mat sub_view = big.view_roi(2, 2, 5, 5); // no copy, 5×5 view starting at (2,2)
Mat sub_copy = big.copy_roi(2, 2, 5, 5); // new 5×5 matrix, independent
sub_view(0, 0) = 99; // modifies big(2, 2) too!
sub_copy(0, 0) = 99; // does NOT modify big
View Lifetime
A view (view_roi) becomes dangling if the parent matrix is destroyed or reallocated, since it holds a pointer into the parent's buffer. Always ensure the parent outlives all of its views.
C++ vs C Algorithm Selection¶
The Mat class provides multiple algorithms for the same operation:
| Operation | Available Methods | Insight |
|---|---|---|
| Determinant | laplace (\(O(n!)\)), lu (\(O(n^3)\)), gaussian (\(O(n^3)\)) | Laplace only for \(n \leq 4\); LU/Gaussian for larger |
| Inverse | adjoint (\(O(n^4)\)), gje (\(O(n^3)\)) | GJE (Gauss-Jordan) is always faster for \(n > 3\) |
| Eigenvalue | Single (power iteration), Full (Jacobi), Full (Francis QR) | Single: fastest for top-1. Jacobi: accurate for small \(n\). Francis QR: general \(n\) |
Matrix Size Matters on MCU
For \(n \leq 8\), the algorithmic complexity differences are negligible — all methods run in \(< 1\) ms on ESP32-S3 at 240 MHz. For \(n = 50\) (typical SSI), choose carefully:
- LU is 25× faster than Laplace for determinant
- GJE is 20× faster than adjoint for inverse
- Francis QR is 3× faster than Jacobi for eigenvalues
Decomposition Pipeline¶
The Mat class chains multiple decompositions for different purposes:
             ┌─────────────┐
             │  Mat object │
             └──────┬──────┘
                    │
     ┌──────────────┼──────────────┐
     ▼              ▼              ▼
┌──────────┐   ┌──────────┐   ┌──────────┐
│    LU    │   │ Cholesky │   │    QR    │
│  O(n³)   │   │ O(n³/3)  │   │  O(n³)   │
└────┬─────┘   └────┬─────┘   └────┬─────┘
     │              │              │
     ▼              ▼              ▼
determinant     symmetric      eigenvalue
  inverse          only       (Francis QR)
     └──────────────┴──────────────┘
                    ▼
             ┌──────────────┐
             │ SVD (Jacobi) │
             │    O(n³)     │
             └──────────────┘
Which Decomposition to Use?
- LU: Default choice for determinant and inverse. Works on any square matrix.
- Cholesky: 2× faster than LU but only for symmetric positive-definite matrices (common in SHM covariance matrices).
- QR: More stable than LU for ill-conditioned matrices, but 2× slower.
- SVD: Most stable, gives singular values and condition number, but slowest. Use for rank determination.
LIST OF FUNCTIONS¶
/**
* @file tiny_matrix.hpp
* @author SHUAIWEN CUI (SHUAIWEN001@e.ntu.edu.sg)
* @brief Header file for the matrix submodule (advanced matrix operations) of the tiny_math middleware.
*
* Terminology conventions:
* - pad / padding: number of padded elements at row end (not part of logical matrix values).
* - step: total elements per physical row (logical + padding).
* - stride: spacing for strided sampling in one row (general concept in low-level kernels).
* In this matrix class, row indexing uses `step`; column sampling spacing is fixed to 1.
* The field name `stride` exists only as a deprecated backward-compatible alias of `step`.
* @version 1.0
* @date 2025-04-17
* @note This file is built on top of the mat.h file from the ESP-DSP library.
*
*/
#pragma once
/* DEPENDENCIES */
// TinyMath
#include "tiny_math_config.h"
#include "tiny_vec.h"
#include "tiny_mat.h"
// Standard Libraries
#include <iostream>
#include <stdint.h>
#if MCU_PLATFORM_SELECTED == MCU_PLATFORM_ESP32
// ESP32 DSP C++ Matrix library
#include "mat.h"
#endif
/* STATEMENTS */
namespace tiny
{
class Mat
{
public:
// ============================================================================
// Matrix Metadata
// ============================================================================
int row; ///< number of rows
int col; ///< number of columns
int pad; ///< placeholder elements in storage; not used in math ops
union
{
int step; ///< total elements per row (logical + padding)
int stride; ///< deprecated backward-compatible alias for `step` (not sampling stride)
};
int element; ///< number of elements = rows * cols
int memory; ///< size of the data buffer = rows * step
float *data; ///< pointer to the data buffer
float *temp; ///< pointer to the temporary data buffer
bool ext_buff; ///< flag indicating the matrix uses an external buffer
bool sub_matrix; ///< flag indicating the matrix is a subset of another matrix
// ============================================================================
// Rectangular ROI Structure
// ============================================================================
/**
* @name Region of Interest (ROI) Structure
* @brief This is the structure for ROI
*/
struct ROI
{
int pos_x; ///< starting column index
int pos_y; ///< starting row index
int width; ///< width of ROI (columns)
int height; ///< height of ROI (rows)
ROI(int pos_x = 0, int pos_y = 0, int width = 0, int height = 0);
void resize_roi(int pos_x, int pos_y, int width, int height);
int area_roi(void) const;
};
// ============================================================================
// Printing Functions
// ============================================================================
void print_info() const;
void print_matrix(bool show_padding) const;
// ============================================================================
// Constructors & Destructor
// ============================================================================
/**
* @brief Allocate memory for the matrix according to the memory required.
* @note For ESP32, it will automatically determine if using RAM or PSRAM based on the size of the matrix.
* @note This function sets ext_buff to false and allocates memory based on row * step.
* If allocation fails or parameters are invalid, data will be set to nullptr.
*/
void alloc_mem();
/**
* @brief Default constructor: create a 1x1 matrix with only a zero element.
* @note If memory allocation fails, the object will be in an invalid state (data = nullptr).
* Caller should check the data pointer before using the matrix.
*/
Mat();
/**
* @brief Constructor - create a matrix with the specified number of rows and columns.
* @param rows Number of rows
* @param cols Number of columns
*/
Mat(int rows, int cols);
/**
* @brief Constructor - create a matrix with the specified number of rows, columns and step.
* @param rows Number of rows
* @param cols Number of columns
* @param step Total number of elements in a row (including padding)
*/
Mat(int rows, int cols, int step);
/**
* @brief Constructor - create a matrix with the specified number of rows, columns and external data.
* @param data Pointer to external data buffer
* @param rows Number of rows
* @param cols Number of columns
*/
Mat(float *data, int rows, int cols);
/**
* @brief Constructor - create a matrix with the specified number of rows, columns and external data.
* @param data Pointer to external data buffer
* @param rows Number of rows
* @param cols Number of columns
* @param step Total number of elements in a row (including padding)
*/
Mat(float *data, int rows, int cols, int step);
/**
* @brief Copy constructor - create a matrix with the same properties as the source matrix.
* @param src Source matrix
*/
Mat(const Mat &src);
/**
* @brief Destructor - free the memory allocated for the matrix.
*/
~Mat();
// ============================================================================
// Element Access
// ============================================================================
inline float &operator()(int row, int col) { return data[row * step + col]; }
inline const float &operator()(int row, int col) const { return data[row * step + col]; }
// ============================================================================
// Data Manipulation
// ============================================================================
tiny_error_t copy_paste(const Mat &src, int row_pos, int col_pos);
tiny_error_t copy_head(const Mat &src);
Mat view_roi(int start_row, int start_col, int roi_rows, int roi_cols) const;
Mat view_roi(const Mat::ROI &roi) const;
Mat copy_roi(int start_row, int start_col, int height, int width);
Mat copy_roi(const Mat::ROI &roi);
Mat block(int start_row, int start_col, int block_rows, int block_cols);
void swap_rows(int row1, int row2);
void swap_cols(int col1, int col2);
void clear(void);
// ============================================================================
// Arithmetic Operators
// ============================================================================
Mat &operator=(const Mat &src); // Copy assignment
Mat &operator+=(const Mat &A); // Add matrix
Mat &operator+=(float C); // Add constant
Mat &operator-=(const Mat &A); // Subtract matrix
Mat &operator-=(float C); // Subtract constant
Mat &operator*=(const Mat &A); // Multiply matrix
Mat &operator*=(float C); // Multiply constant
Mat &operator/=(const Mat &B); // Divide matrix
Mat &operator/=(float C); // Divide constant
Mat operator^(int C); // Exponentiation
// ============================================================================
// Linear Algebra - Basic Operations
// ============================================================================
Mat transpose(); // Transpose matrix
float determinant(); // Compute determinant (auto-selects method based on size)
float determinant_laplace(); // Compute determinant using Laplace expansion (O(n!), for small matrices)
float determinant_lu(); // Compute determinant using LU decomposition (O(n³), efficient for large matrices)
float determinant_gaussian(); // Compute determinant using Gaussian elimination (O(n³), efficient for large matrices)
Mat adjoint(); // Compute adjoint matrix
Mat inverse_adjoint(); // Compute inverse using adjoint method
void normalize(); // Normalize matrix
float norm() const; // Compute matrix norm
float dotprod(const Mat &A, const Mat &B); // Dot product
// ============================================================================
// Linear Algebra - Matrix Utilities
// ============================================================================
static Mat eye(int size); // Create identity matrix
static Mat ones(int rows, int cols); // Create matrix filled with ones
static Mat ones(int size); // Create square matrix filled with ones
static Mat augment(const Mat &A, const Mat &B); // Horizontal concatenation [A | B]
static Mat vstack(const Mat &A, const Mat &B); // Vertical concatenation [A; B]
/**
* @brief Gram-Schmidt orthogonalization process
* @note Orthogonalizes a set of vectors using the Gram-Schmidt process
* @param vectors Input matrix where each column is a vector to be orthogonalized
* @param orthogonal_vectors Output matrix for orthogonalized vectors (each column is orthogonal)
* @param coefficients Output matrix for projection coefficients (R matrix in QR decomposition)
* @param tolerance Minimum norm threshold for linear independence check
* @return true if successful, false if input is invalid
*/
static bool gram_schmidt_orthogonalize(const Mat &vectors, Mat &orthogonal_vectors,
Mat &coefficients, float tolerance = 1e-6f);
// ============================================================================
// Linear Algebra - Matrix Operations
// ============================================================================
Mat minor(int target_row, int target_col) const; // Submatrix after removing one row + one column (works on any m x n matrix)
Mat cofactor(int target_row, int target_col) const; // Cofactor submatrix; requires the matrix to be square
Mat gaussian_eliminate() const; // Gaussian elimination
Mat row_reduce_from_gaussian() const; // REF -> RREF via back-substitution
Mat inverse_gje() const; // Inverse using Gauss-Jordan elimination on [A | I]
// ============================================================================
// Linear Algebra - Linear System Solving
// ============================================================================
Mat solve(const Mat &A, const Mat &b) const; // Solve Ax = b using Gaussian elimination
Mat band_solve(Mat A, Mat b, int k) const; // Solve banded system
Mat roots(Mat A, Mat y) const; // Alternative solve method
// ============================================================================
// Matrix Decomposition
// ============================================================================
// Forward declarations (structures defined after class)
struct LUDecomposition;
struct CholeskyDecomposition;
struct QRDecomposition;
struct SVDDecomposition;
// Matrix property checks
/**
* @brief Check if the matrix is symmetric within a given tolerance.
* @param tolerance Maximum allowed difference between A(i,j) and A(j,i) (must be >= 0)
* @return true if matrix is symmetric, false otherwise
*/
bool is_symmetric(float tolerance = 1e-6f) const;
/**
* @brief Check if matrix is positive definite using Sylvester's criterion.
* @param tolerance Tolerance for numerical checks (must be >= 0)
* @param max_minors_to_check Maximum number of leading principal minors to check.
* - If -1: check all minors (complete Sylvester's criterion)
* - If > 0: check first max_minors_to_check minors
* @return true if matrix is positive definite, false otherwise
*/
bool is_positive_definite(float tolerance = 1e-6f, int max_minors_to_check = -1) const;
// Decomposition methods
LUDecomposition lu_decompose(bool use_pivoting = true) const;
CholeskyDecomposition cholesky_decompose() const;
QRDecomposition qr_decompose() const;
SVDDecomposition svd_decompose(int max_iter = 100, float tolerance = 1e-6f) const;
// Solve using decomposition (more efficient for multiple RHS)
static Mat solve_lu(const LUDecomposition &lu, const Mat &b);
static Mat solve_cholesky(const CholeskyDecomposition &chol, const Mat &b);
static Mat solve_qr(const QRDecomposition &qr, const Mat &b); // Least squares solution
// Pseudo-inverse using SVD (for rank-deficient or non-square matrices)
static Mat pseudo_inverse(const SVDDecomposition &svd, float tolerance = 1e-6f);
// ============================================================================
// Eigenvalue & Eigenvector Decomposition
// ============================================================================
// Forward declarations (structures defined after class)
struct EigenPair;
struct EigenDecomposition;
// Single eigenvalue methods (fast, for real-time applications)
/**
* @brief Compute the dominant (largest magnitude) eigenvalue and eigenvector using power iteration.
* @param max_iter Maximum number of iterations (must be > 0)
* @param tolerance Convergence tolerance (must be >= 0). Convergence when |λ_k - λ_{k-1}| < tolerance * |λ_k|
* @return EigenPair containing the dominant eigenvalue, eigenvector, and status
*/
EigenPair power_iteration(int max_iter = 1000, float tolerance = 1e-6f) const;
/**
* @brief Compute the smallest (minimum magnitude) eigenvalue and eigenvector using inverse power iteration.
* @param max_iter Maximum number of iterations (must be > 0)
* @param tolerance Convergence tolerance (must be >= 0). Convergence when |λ_k - λ_{k-1}| < tolerance * max(|λ_k|, 1)
* @return EigenPair containing the smallest eigenvalue, eigenvector, and status
* @note The matrix must be invertible (non-singular) for this method to work.
*/
EigenPair inverse_power_iteration(int max_iter = 1000, float tolerance = 1e-6f) const;
// Complete eigendecomposition methods
/**
* @brief Compute complete eigenvalue decomposition using Jacobi method for symmetric matrices.
* @param tolerance Convergence tolerance (must be >= 0). Convergence when max off-diagonal < tolerance
* @param max_iter Maximum number of iterations (must be > 0)
* @return EigenDecomposition containing all eigenvalues, eigenvectors, and status
* @note Designed for symmetric matrices; apply only to symmetric input for reliable results.
*/
EigenDecomposition eigendecompose_jacobi(float tolerance = 1e-6f, int max_iter = 100) const;
/**
* @brief Compute complete eigenvalue decomposition using QR algorithm for general matrices.
* @param max_iter Maximum number of QR iterations (must be > 0)
* @param tolerance Convergence tolerance (must be >= 0). Convergence when subdiagonal < tolerance
* @return EigenDecomposition containing eigenvalues, eigenvectors, and status
* @note Supports non-symmetric matrices, but may have complex eigenvalues (only real part returned).
*/
EigenDecomposition eigendecompose_qr(int max_iter = 100, float tolerance = 1e-6f) const;
/**
* @brief Automatic eigenvalue decomposition with method selection.
* @param tolerance Convergence tolerance (must be >= 0)
* @param max_iter Maximum number of iterations (must be > 0, default = 100)
* @return EigenDecomposition containing eigenvalues, eigenvectors, and status
* @note Automatically selects Jacobi method for symmetric matrices, QR algorithm for general matrices.
*/
EigenDecomposition eigendecompose(float tolerance = 1e-6f, int max_iter = 100) const;
protected:
private:
};
// ============================================================================
// Matrix Decomposition Structures
// ============================================================================
/**
* @brief Structure to hold LU decomposition results
* @note A = L * U, where L is lower triangular and U is upper triangular
*/
struct Mat::LUDecomposition
{
Mat L; ///< Lower triangular matrix (with unit diagonal)
Mat U; ///< Upper triangular matrix
Mat P; ///< Permutation matrix (if pivoting used)
bool pivoted; ///< Whether pivoting was used
tiny_error_t status; ///< Computation status
LUDecomposition();
};
/**
* @brief Structure to hold Cholesky decomposition results
* @note A = L * L^T, where L is lower triangular (for symmetric positive definite matrices)
*/
struct Mat::CholeskyDecomposition
{
Mat L; ///< Lower triangular matrix
tiny_error_t status; ///< Computation status
CholeskyDecomposition();
};
/**
* @brief Structure to hold QR decomposition results
* @note A = Q * R, where Q is orthogonal and R is upper triangular
*/
struct Mat::QRDecomposition
{
Mat Q; ///< Orthogonal matrix (Q^T * Q = I)
Mat R; ///< Upper triangular matrix
tiny_error_t status; ///< Computation status
QRDecomposition();
};
/**
* @brief Structure to hold SVD decomposition results
* @note A = U * S * V^T, where U and V are orthogonal, S is diagonal (singular values)
*/
struct Mat::SVDDecomposition
{
Mat U; ///< Left singular vectors (orthogonal matrix)
Mat S; ///< Singular values (diagonal matrix or vector)
Mat V; ///< Right singular vectors (orthogonal matrix, V^T)
int rank; ///< Numerical rank of the matrix
int iterations; ///< Number of iterations performed
tiny_error_t status; ///< Computation status
SVDDecomposition();
};
// ============================================================================
// Eigenvalue & Eigenvector Decomposition Structures
// ============================================================================
/**
* @brief Structure to hold a single eigenvalue-eigenvector pair
* @note Used primarily for power iteration method
*/
struct Mat::EigenPair
{
float eigenvalue; ///< Eigenvalue (real part)
Mat eigenvector; ///< Corresponding eigenvector (column vector)
int iterations; ///< Number of iterations performed
tiny_error_t status; ///< Computation status
EigenPair();
};
/**
* @brief Structure to hold complete eigenvalue decomposition results
* @note Contains all eigenvalues and eigenvectors
*/
struct Mat::EigenDecomposition
{
Mat eigenvalues; ///< Eigenvalues (diagonal matrix or vector)
Mat eigenvectors; ///< Eigenvector matrix (each column is an eigenvector)
int iterations; ///< Number of iterations performed
tiny_error_t status; ///< Computation status
EigenDecomposition();
};
// ============================================================================
// Stream Operators
// ============================================================================
std::ostream &operator<<(std::ostream &os, const Mat &m);
std::ostream &operator<<(std::ostream &os, const Mat::ROI &roi);
std::istream &operator>>(std::istream &is, Mat &m);
// ============================================================================
// Global Arithmetic Operators
// ============================================================================
Mat operator+(const Mat &A, const Mat &B);
Mat operator+(const Mat &A, float C);
Mat operator-(const Mat &A, const Mat &B);
Mat operator-(const Mat &A, float C);
Mat operator*(const Mat &A, const Mat &B);
Mat operator*(const Mat &A, float C);
Mat operator*(float C, const Mat &A);
Mat operator/(const Mat &A, float C);
Mat operator/(const Mat &A, const Mat &B);
bool operator==(const Mat &A, const Mat &B);
}