Skip to content

Decomp — Implementation

File Structure

decomp/
├── tiny_decomp.h             (26 lines — public API)
├── tiny_decomp_mgs.c         (118 lines — Modified Gram-Schmidt QR)
└── tiny_decomp_hqr.c         (101 lines — Householder QR)

Dependencies: tiny_cfloat.h (for complex eigenvalues in HQR), <math.h>, <string.h>.


MGS QR (tiny_decomp_mgs.c)

Algorithm

The Modified Gram-Schmidt (MGS) algorithm computes \(A = QR\) where \(A\) is \(m \times k\) (\(m \geq k\)):

For each column j = 0, 1, ..., k-1:
    1. Copy column j of A → qⱼ (A is overwritten in-place)
    2. For each previous column i = 0, ..., j-1:
         rᵢⱼ = qᵢ · qⱼ                           (projection coefficient)
         qⱼ   = qⱼ − rᵢⱼ · qᵢ                    (orthogonalize)
    3. rⱼⱼ = ||qⱼ||                               (norm)
       qⱼ  = qⱼ / rⱼⱼ                              (normalize)

MGS vs Classical GS

MGS orthogonalizes each new column against the already-orthogonalized columns. Classical GS orthogonalises against the original columns — MGS is numerically more stable at no extra cost.

Re-orthogonalization

When reorthogonalize = 1, a second full pass is applied after the first. This improves orthogonality from \(\sim 10^{-4}\) to \(\sim 10^{-7}\) in float32, at the cost of doubling the operation count.

// First pass
mgs_pass_full(A, m, k, R);

// Second pass (if requested)
if (reorthogonalize) {
    for (int j = 0; j < k; j++) {
        for (int i = 0; i < j; i++) {
            float dot = column_dot(Q_i, Q_j, m);
            R[i*k + j] += dot;              // accumulate into R
            for (int r = 0; r < m; r++)
                Q[r*k + j] -= dot * Q[r*k + i];
        }
    }
}

Orthogonality Check

float tiny_mgs_check_orthogonality_f32(const float *Q, int m, int k)
Computes \(\|Q^TQ - I\|_F\) — a value close to 0 indicates good orthogonality.


Householder QR (tiny_decomp_hqr.c)

Algorithm

Householder QR annihilates sub-diagonal elements using reflection matrices:

For each column \(j = 0, 1, ..., n-1\): 1. Form Householder vector \(v_j\) from column \(j\) of \(A^{(j)}\), rows \(j..m\) 2. Apply reflection: \(A^{(j+1)} = (I - 2 v_j v_j^T) A^{(j)}\) 3. Accumulate \(Q = (I - 2 v_1 v_1^T)(I - 2 v_2 v_2^T)...\)

Key Implementation Detail

// Compute Householder vector v from x (length n)
// Returns β = 2/||v||²
float house(float *x, int n, float *v_out)
{
    float sigma = 0;
    for (int i = 1; i < n; i++) sigma += x[i] * x[i];
    v_out[0] = 1;
    float mu = sqrtf(x[0]*x[0] + sigma);
    if (x[0] <= 0) v_out[0] = x[0] - mu;
    else           v_out[0] = -sigma / (x[0] + mu);
    float beta = 2 * v_out[0]*v_out[0] / (sigma + v_out[0]*v_out[0]);
    for (int i = 1; i < n; i++) v_out[i] = x[i] / v_out[0];
    return beta;
}

MGS vs HQR: Implementation Differences

Aspect MGS HQR
Workspace None (in-place) m * n floats for Q (separate output)
Float32 orthogonality \(10^{-4}\) without re-ortho, \(10^{-7}\) with \(\sim 10^{-7}\) (backward stable)
Typical use in TinySHM ERA reference projection SSI covariance projection
Column count limit \(k \leq m\) (tall matrix) No constraint

Memory Layout

Both algorithms operate on row-major, contiguous matrices. MGS overwrites the input \(A\) with \(Q\) in-place:

Before MGS:                    After MGS:
  A (m×k, in-place)            Q (m×k, overwrites A) + R (k×k, separate)
┌─────┬─────┬─────┐          ┌─────┬─────┬─────┐
│ a₁₁ │ a₁₂ │ a₁₃ │          │ q₁₁ │ q₁₂ │ q₁₃ │
│ a₂₁ │ a₂₂ │ a₂₃ │          │ q₂₁ │ q₂₂ │ q₂₃ │
│ a₃₁ │ a₃₂ │ a₃₃ │          │ q₃₁ │ q₃₂ │ q₃₃ │
└─────┴─────┴─────┘          └─────┴─────┴─────┘
                                   R (k×k):
                              ┌─────┬─────┬─────┐
                              │ r₁₁ │ r₁₂ │ r₁₃ │
                              │  0  │ r₂₂ │ r₂₃ │
                              │  0  │  0  │ r₃₃ │
                              └─────┴─────┴─────┘