Vector — Implementation
File Structure
Dependencies: tiny_math_config.h, ESP-DSP headers (dsps_math.h, dsps_dotprod.h) on ESP32.
Design Pattern
Every function follows the same three-step pattern:
- Validate inputs (null pointers, lengths)
- Dispatch to ESP-DSP on the ESP32 platform when the data is contiguous (all strides = 1)
- Fall back to a generic C loop with stride-based indexing
Example: tiny_vec_add_f32
tiny_error_t tiny_vec_add_f32(const float *input1, const float *input2,
                              float *output, int len,
                              int stride1, int stride2, int stride_out)
{
    // 1. Validate
    if (!input1 || !input2 || !output || len <= 0)
        return TINY_ERR_INVALID_ARG;
    // 2. ESP-DSP fast path (contiguous only)
#if MCU_PLATFORM_SELECTED == MCU_PLATFORM_ESP32
    if (stride1 == 1 && stride2 == 1 && stride_out == 1) {
        // dsps_add_f32 takes explicit step arguments; all are 1 on this path
        dsps_add_f32(input1, input2, output, len, 1, 1, 1);
        return TINY_OK;
    }
#endif
    // 3. Generic fallback with stride
    for (int i = 0; i < len; i++)
        output[i * stride_out] = input1[i * stride1] + input2[i * stride2];
    return TINY_OK;
}
Strided Access Model
TinyVec supports strided access for all element-wise operations:
Stride = 1 (contiguous): Stride = 3 (take every 3rd element):
┌───┬───┬───┬───┬───┬───┐ ┌───┬───┬───┬───┬───┬───┐
│ a │ b │ c │ d │ e │ f │ │ a │ │ │ b │ │ │
│ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ │ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │
└───┴───┴───┴───┴───┴───┘ └───┴───┴───┴───┴───┴───┘
Element \(i\) is accessed at base[i * stride]. This is useful for extracting sub-vectors from matrix rows or from interleaved data.
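As an illustration of the base[i * stride] rule, the following minimal sketch adds one column of a row-major matrix to a contiguous vector. The names sketch_vec_add_f32 and sketch_add_column are hypothetical stand-ins written for this example, not TinyVec functions, and the integer return codes only mimic TINY_OK and TINY_ERR_INVALID_ARG.

```c
#include <stddef.h>

/* Hypothetical stand-in for the generic (strided) fallback path. */
static int sketch_vec_add_f32(const float *in1, const float *in2, float *out,
                              int len, int stride1, int stride2, int stride_out)
{
    if (!in1 || !in2 || !out || len <= 0)
        return -1; /* plays the role of TINY_ERR_INVALID_ARG */
    for (int i = 0; i < len; i++)
        out[i * stride_out] = in1[i * stride1] + in2[i * stride2];
    return 0; /* plays the role of TINY_OK */
}

/* Add column `col` of a row-major rows x cols matrix to a contiguous vector. */
static int sketch_add_column(const float *matrix, int rows, int cols, int col,
                             const float *vec, float *out)
{
    if (!matrix || cols <= 0 || col < 0 || col >= cols)
        return -1;
    /* Element i of the column sits at matrix[col + i * cols],
       so the base pointer is matrix + col and the stride is cols. */
    return sketch_vec_add_f32(matrix + col, vec, out, rows, cols, 1, 1);
}
```

Calling sketch_add_column on a 2 × 3 matrix with col = 1 reads elements 1 and 4 of the flat array. Because the input stride is 3 rather than 1, the equivalent TinyVec call would take the generic loop, not the ESP-DSP path.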
Performance
The ESP-DSP fast path activates only when every stride is 1. Any non-unit stride forces the generic C loop, which is roughly 10× slower.
Platform Dispatch Summary
| Function Group | ESP-DSP Call | Generic Fallback |
|---|---|---|
| add, sub, mul | dsps_add/sub/mul_f32 | Strided C loop |
| addc, subc, mulc | dsps_add/sub/mulc_f32 | C loop with scalar |
| div, divc | — | C loop with div-by-zero guard |
| sqrt, inv_sqrt | dsps_sqrt_f32, dsps_sqrtf32_f32 | sqrtf() C loop |
| sqrtf, inv_sqrtf | dsps_sqrtf32_f32 | sqrtf() fast C loop |
| dotprod | dsps_dotprod_f32 | Sum-of-products C loop |
| dotprode | dsps_dotprod_f32 | Sum-of-products with stride |
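The "sum-of-products with stride" fallback in the last row can be sketched as below. This is only an illustration of the technique: sketch_dotprod_strided_f32, its parameter order, and its integer return codes are assumptions made for this example and may not match the TinyVec signature.

```c
#include <stddef.h>

/* Hypothetical strided dot product: result = sum of a[i*stride_a] * b[i*stride_b]. */
static int sketch_dotprod_strided_f32(const float *a, const float *b, float *result,
                                      int len, int stride_a, int stride_b)
{
    if (!a || !b || !result || len <= 0)
        return -1; /* plays the role of TINY_ERR_INVALID_ARG */

    float acc = 0.0f;
    for (int i = 0; i < len; i++)
        acc += a[i * stride_a] * b[i * stride_b];

    *result = acc;
    return 0; /* plays the role of TINY_OK */
}
```

With a = {1, 0, 2, 0, 3, 0}, stride_a = 2, b = {4, 5, 6}, and stride_b = 1, the loop multiplies 1·4, 2·5, and 3·6 and stores 32 in result.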
Division Safety
tiny_vec_div_f32 and tiny_vec_divc_f32 have a zero-division guard:
if (fabsf(denominator) < TINY_MATH_MIN_DENOMINATOR) {
    if (!allow_divide_by_zero) {
        return TINY_ERR_MATH_ZERO_DIVISION;
    }
    output = (numerator >= 0) ? INFINITY : -INFINITY;
}
- allow_divide_by_zero = false → returns an error code (safe default)
- allow_divide_by_zero = true → returns ±inf (useful for data processing pipelines where occasional zeros are expected)
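To show how the guard might sit inside an element-wise loop, here is a minimal contiguous sketch. The function name, the return codes, and the threshold SKETCH_MIN_DENOMINATOR are assumptions for illustration; the real value of TINY_MATH_MIN_DENOMINATOR is defined by the library. The sketch also returns as soon as it meets a near-zero denominator, which leaves earlier outputs already written. Whether TinyVec behaves the same way is not stated here.

```c
#include <math.h>
#include <stddef.h>

/* Assumed threshold; stands in for TINY_MATH_MIN_DENOMINATOR. */
#define SKETCH_MIN_DENOMINATOR 1e-9f

/* Hypothetical guarded element-wise division (contiguous data only). */
static int sketch_vec_div_f32(const float *num, const float *den, float *out,
                              int len, int allow_divide_by_zero)
{
    if (!num || !den || !out || len <= 0)
        return -1; /* plays the role of TINY_ERR_INVALID_ARG */

    for (int i = 0; i < len; i++) {
        if (fabsf(den[i]) < SKETCH_MIN_DENOMINATOR) {
            if (!allow_divide_by_zero)
                return -2; /* plays the role of TINY_ERR_MATH_ZERO_DIVISION */
            /* Signed infinity chosen from the numerator's sign. */
            out[i] = (num[i] >= 0) ? INFINITY : -INFINITY;
        } else {
            out[i] = num[i] / den[i];
        }
    }
    return 0; /* plays the role of TINY_OK */
}
```

With num = {1, -2, 6} and den = {2, 0, 3}, the strict mode stops at the second element with the zero-division code, while the permissive mode produces 0.5, -inf, and 2.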
Fast vs Standard Sqrt
TinyVec provides two variants of sqrt and inv_sqrt:
| Variant | ESP-DSP | Precision | Speed |
|---|---|---|---|
| sqrt_f32 | dsps_sqrt_f32 | Full (~1e-7) | 1× |
| sqrtf_f32 | dsps_sqrtf32_f32 | Approx (~1e-4) | 2–3× faster |
| inv_sqrt_f32 | dsps_sqrt_f32 + division | Full | 1× |
| inv_sqrtf_f32 | dsps_sqrtf32_f32 + division | Approx | 2–3× faster |
Use sqrtf / inv_sqrtf in inner loops of iterative algorithms where absolute precision is not required.
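To give a feel for the precision side of this trade-off, the sketch below uses a well-known approximation of 1/sqrt(x): a bit-level initial guess followed by one Newton-Raphson step. This is a generic technique chosen for illustration. It is not claimed to be what ESP-DSP or TinyVec does, and its error (on the order of 0.2%) is larger than the figure quoted in the table. Both function names are hypothetical.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for finite x > 0 (illustrative technique only). */
static float sketch_inv_sqrt_approx(float x)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);   /* reinterpret the float's bits safely */
    bits = 0x5f3759dfu - (bits >> 1); /* bit-level initial guess */
    memcpy(&y, &bits, sizeof y);

    /* One Newton-Raphson step on f(y) = 1/y^2 - x. */
    return y * (1.5f - 0.5f * x * y * y);
}

/* Residual |x * y^2 - 1|, which is 0 for an exact inverse square root. */
static float sketch_inv_sqrt_residual(float x)
{
    float y = sketch_inv_sqrt_approx(x);
    return fabsf(x * y * y - 1.0f);
}
```

The residual is small but nonzero, so an iterative algorithm that calls such an approximation in its inner loop should tolerate a relative error of that size, or refine the final result with a full-precision call.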