Decomp — Implementation¶
File Structure¶
decomp/
├── tiny_decomp.h (26 lines — public API)
├── tiny_decomp_mgs.c (118 lines — Modified Gram-Schmidt QR)
└── tiny_decomp_hqr.c (101 lines — Householder QR)
Dependencies: tiny_cfloat.h (for complex eigenvalues in HQR), <math.h>, <string.h>.
MGS QR (tiny_decomp_mgs.c)¶
Algorithm¶
The Modified Gram-Schmidt (MGS) algorithm computes \(A = QR\) where \(A\) is \(m \times k\) (\(m \geq k\)):
For each column j = 0, 1, ..., k-1:
1. Copy column j of A → qⱼ (A is overwritten in-place)
2. For each previous column i = 0, ..., j-1:
rᵢⱼ = qᵢ · qⱼ (projection coefficient)
qⱼ = qⱼ − rᵢⱼ · qᵢ (orthogonalize)
3. rⱼⱼ = ||qⱼ|| (norm)
qⱼ = qⱼ / rⱼⱼ (normalize)
MGS vs Classical GS
MGS computes each projection coefficient rᵢⱼ from the partially orthogonalized residual of column j, whereas Classical GS computes all coefficients from the original column. This makes MGS numerically more stable at no extra cost.
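As a concrete illustration, the steps above reduce to a few lines of C. This is a minimal sketch under the row-major, in-place layout described in Memory Layout below; the library's actual entry point in tiny_decomp.h may differ, and rank-deficient columns (rⱼⱼ = 0) are not handled:
// Minimal single-pass MGS sketch: A (m×k, row-major) is overwritten with Q;
// R (k×k, row-major, lower triangle assumed pre-zeroed) receives the coefficients.
// sqrtf comes from <math.h>.
for (int j = 0; j < k; j++) {
    for (int i = 0; i < j; i++) {
        float rij = 0.0f;                                            // r_ij = q_i · q_j
        for (int r = 0; r < m; r++) rij += A[r*k + i] * A[r*k + j];
        R[i*k + j] = rij;
        for (int r = 0; r < m; r++) A[r*k + j] -= rij * A[r*k + i];  // orthogonalize
    }
    float rjj = 0.0f;                                                // r_jj = ||q_j||
    for (int r = 0; r < m; r++) rjj += A[r*k + j] * A[r*k + j];
    R[j*k + j] = rjj = sqrtf(rjj);
    for (int r = 0; r < m; r++) A[r*k + j] /= rjj;                   // normalize
}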
Re-orthogonalization¶
When reorthogonalize = 1, a second full pass is applied after the first. This improves orthogonality from \(\sim 10^{-4}\) to \(\sim 10^{-7}\) in float32, at the cost of doubling the operation count.
// First pass: A (m×k, row-major) is overwritten with Q; R (k×k) gets the coefficients
mgs_pass_full(A, m, k, R);
// Second pass (if requested)
if (reorthogonalize) {
    for (int j = 0; j < k; j++) {
        for (int i = 0; i < j; i++) {
            // dot product of columns i and j of Q (row-major, column stride k)
            float dot = 0.0f;
            for (int r = 0; r < m; r++)
                dot += Q[r*k + i] * Q[r*k + j];
            R[i*k + j] += dot;                  // accumulate the correction into R
            for (int r = 0; r < m; r++)
                Q[r*k + j] -= dot * Q[r*k + i];
        }
        // the full pass also re-normalizes column j and updates r_jj
    }
}
Orthogonality Check¶
Computes \(\|Q^TQ - I\|_F\); a value close to 0 indicates good orthogonality.
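A sketch of this check under the same row-major layout (the helper name orth_error is illustrative, not part of the public API; sqrtf is from <math.h>):
// ||Q^T Q - I||_F for Q stored row-major as m×k.
static float orth_error(const float *Q, int m, int k)
{
    float err = 0.0f;
    for (int i = 0; i < k; i++) {
        for (int j = 0; j < k; j++) {
            float dot = 0.0f;                  // (Q^T Q)_ij = column i · column j
            for (int r = 0; r < m; r++) dot += Q[r*k + i] * Q[r*k + j];
            if (i == j) dot -= 1.0f;           // subtract the identity
            err += dot * dot;
        }
    }
    return sqrtf(err);
}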
Householder QR (tiny_decomp_hqr.c)¶
Algorithm¶
Householder QR annihilates sub-diagonal elements using reflection matrices:
For each column \(j = 0, 1, ..., n-1\):
1. Form the Householder vector \(v_j\) from column \(j\) of \(A^{(j)}\), rows \(j..m\)
2. Apply the reflection: \(A^{(j+1)} = (I - 2 v_j v_j^T) A^{(j)}\)
3. Accumulate \(Q = (I - 2 v_1 v_1^T)(I - 2 v_2 v_2^T)...\)
Key Implementation Detail¶
// Compute Householder vector v from x (length n)
// Returns β = 2/||v||²
float house(float *x, int n, float *v_out)
{
float sigma = 0;
for (int i = 1; i < n; i++) sigma += x[i] * x[i];
v_out[0] = 1;
float mu = sqrtf(x[0]*x[0] + sigma);
if (x[0] <= 0) v_out[0] = x[0] - mu;
else v_out[0] = -sigma / (x[0] + mu);
float beta = 2 * v_out[0]*v_out[0] / (sigma + v_out[0]*v_out[0]);
for (int i = 1; i < n; i++) v_out[i] = x[i] / v_out[0];
return beta;
}
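For context, once β and v are available, the reflection in step 2 is applied as a rank-1 update, A ← A − β·v·(vᵀA), rather than by forming the reflector explicitly. A minimal sketch of that update (apply_house and its workspace w are illustrative names, not part of the public API):
// Apply (I - beta*v*v^T) to an m×n block A (row-major, leading dimension lda).
// w (length n) is scratch space for v^T * A.
static void apply_house(float *A, int m, int n, int lda,
                        const float *v, float beta, float *w)
{
    for (int j = 0; j < n; j++) {              // w = v^T * A
        w[j] = 0.0f;
        for (int i = 0; i < m; i++) w[j] += v[i] * A[i*lda + j];
    }
    for (int i = 0; i < m; i++)                // A -= beta * v * w^T (rank-1 update)
        for (int j = 0; j < n; j++)
            A[i*lda + j] -= beta * v[i] * w[j];
}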
MGS vs HQR: Implementation Differences¶
| Aspect | MGS | HQR |
|---|---|---|
| Workspace | None (in-place) | m * n floats for Q (separate output) |
| Float32 orthogonality | \(10^{-4}\) without re-ortho, \(10^{-7}\) with | \(\sim 10^{-7}\) (backward stable) |
| Typical use in TinySHM | ERA reference projection | SSI covariance projection |
| Column count limit | \(k \leq m\) (tall matrix) | No constraint |
Memory Layout¶
Both algorithms operate on row-major, contiguous matrices. MGS overwrites the input \(A\) with \(Q\) in-place:
Before MGS: After MGS:
A (m×k, in-place) Q (m×k, overwrites A) + R (k×k, separate)
┌─────┬─────┬─────┐ ┌─────┬─────┬─────┐
│ a₁₁ │ a₁₂ │ a₁₃ │ │ q₁₁ │ q₁₂ │ q₁₃ │
│ a₂₁ │ a₂₂ │ a₂₃ │ │ q₂₁ │ q₂₂ │ q₂₃ │
│ a₃₁ │ a₃₂ │ a₃₃ │ │ q₃₁ │ q₃₂ │ q₃₃ │
└─────┴─────┴─────┘ └─────┴─────┴─────┘
R (k×k):
┌─────┬─────┬─────┐
│ r₁₁ │ r₁₂ │ r₁₃ │
│ 0 │ r₂₂ │ r₂₃ │
│ 0 │ 0 │ r₃₃ │
└─────┴─────┴─────┘
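A quick sanity check of the factorization under this layout is to rebuild \(QR\) and compare it against a saved copy of the input. A minimal sketch (m, k, Q, R, and A_copy are illustrative names; sqrtf is from <math.h>):
// Reconstruction error ||A - Q*R||_F, all matrices row-major.
float err = 0.0f;
for (int i = 0; i < m; i++) {
    for (int j = 0; j < k; j++) {
        float qr = 0.0f;                       // (Q*R)_ij
        for (int p = 0; p < k; p++) qr += Q[i*k + p] * R[p*k + j];
        float d = A_copy[i*k + j] - qr;
        err += d * d;
    }
}
err = sqrtf(err);                              // Frobenius norm of the residual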