Notes¶
Dense is the fully-connected (linear) layer: \( y = x W^\top + b \). It powers MLPs and classification heads. Weights are initialised with Xavier-uniform to keep activations well-scaled in deep stacks.
MATH¶
Input x has shape [batch, in_features], output y has shape [batch, out_features]:
\[ y = x W^\top + b \]
Weights [out_features, in_features] (rows = output dim), bias [out_features].
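To make the shapes concrete, here is a minimal reference sketch of the forward pass on flat row-major float buffers. It does not use the library's Tensor class; the function name dense_forward_ref and the std::vector layout are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference forward pass y = x W^T + b on flat row-major buffers.
// x: [batch, in_features], w: [out_features, in_features],
// b: [out_features] or empty (no bias). Returns y: [batch, out_features].
// Hypothetical helper, not part of the library API.
std::vector<float> dense_forward_ref(const std::vector<float> &x,
                                     const std::vector<float> &w,
                                     const std::vector<float> &b,
                                     int batch, int in_features, int out_features)
{
    assert(x.size() == static_cast<std::size_t>(batch) * in_features);
    assert(w.size() == static_cast<std::size_t>(out_features) * in_features);
    assert(b.empty() || b.size() == static_cast<std::size_t>(out_features));

    std::vector<float> y(static_cast<std::size_t>(batch) * out_features, 0.0f);
    for (int n = 0; n < batch; ++n) {
        for (int o = 0; o < out_features; ++o) {
            // Each output is the dot product of one input row with one weight row.
            float acc = b.empty() ? 0.0f : b[o];
            for (int i = 0; i < in_features; ++i)
                acc += x[n * in_features + i] * w[o * in_features + i];
            y[n * out_features + o] = acc;
        }
    }
    return y;
}
```

Because each weight row holds the coefficients of one output feature, the inner loop reads both x and w contiguously.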
Xavier-uniform init¶
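Weights are drawn from \( U(-a, a) \) with \( a = \sqrt{6 / (F_{in} + F_{out})} \), the standard Xavier-uniform bound. Bias is zero-initialised.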
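As an illustration, Xavier-uniform initialisation can be sketched as below. This assumes the standard bound sqrt(6 / (fan_in + fan_out)) with gain 1; whether the library applies an extra gain factor or uses a different random source is not stated here, and both function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Standard Xavier-uniform bound (gain = 1 is an assumption).
float xavier_uniform_bound(int in_features, int out_features)
{
    return std::sqrt(6.0f / static_cast<float>(in_features + out_features));
}

// Fills a flat [out_features, in_features] buffer with samples from U(-a, a).
// Hypothetical helper; the seed parameter is only here to make runs repeatable.
std::vector<float> xavier_uniform_weights(int in_features, int out_features,
                                          unsigned seed)
{
    const float a = xavier_uniform_bound(in_features, out_features);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> dist(-a, a);

    std::vector<float> w(static_cast<std::size_t>(out_features) * in_features);
    for (float &v : w)
        v = dist(rng);
    return w;
}
```

The bound shrinks as either dimension grows, which is what keeps the variance of activations roughly constant from layer to layer.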
CLASS DEFINITION¶
class Dense : public Layer
{
public:
    Tensor weight; // [out_features, in_features]
    Tensor bias;   // [out_features] (empty when use_bias=false)
#if TINY_AI_TRAINING_ENABLED
    Tensor dweight;
    Tensor dbias;
#endif
    Dense(int in_features, int out_features, bool use_bias = true);
    Tensor forward(const Tensor &x) override; // [B, in_feat] → [B, out_feat]
    Tensor backward(const Tensor &grad_out) override;
    void collect_params(std::vector<ParamGroup> &groups) override;
    int in_features() const;
    int out_features() const;
};
BACKWARD¶
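Input is cached in x_cache_ (a clone() of the forward input). The backward equations, writing \( g \) for grad_out with shape [batch, out_features]:
\[ \frac{\partial L}{\partial W} = g^\top x, \qquad \frac{\partial L}{\partial b} = \sum_{n=1}^{B} g_{n}, \qquad \frac{\partial L}{\partial x} = g W \]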
dweight and dbias are accumulated; the optimiser's zero_grad() clears them at the start of every mini-batch.
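The backward pass for y = x W^T + b can be sketched on flat buffers as follows: the weight gradient is grad_out^T x, the bias gradient is the sum of grad_out over the batch, and the input gradient is grad_out W. The function name and the std::vector layout are assumptions for illustration; the library's own backward() works on Tensor objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference backward pass for a dense layer on flat row-major buffers.
// x_cache:  [batch, in_features]   (input saved during forward)
// w:        [out_features, in_features]
// grad_out: [batch, out_features]
// dweight / dbias are ACCUMULATED into (not overwritten); pass an empty
// dbias when the layer has no bias. Returns dx: [batch, in_features].
// Hypothetical helper, not part of the library API.
std::vector<float> dense_backward_ref(const std::vector<float> &x_cache,
                                      const std::vector<float> &w,
                                      const std::vector<float> &grad_out,
                                      std::vector<float> &dweight,
                                      std::vector<float> &dbias,
                                      int batch, int in_features, int out_features)
{
    assert(x_cache.size() == static_cast<std::size_t>(batch) * in_features);
    assert(grad_out.size() == static_cast<std::size_t>(batch) * out_features);
    assert(w.size() == dweight.size());

    std::vector<float> dx(static_cast<std::size_t>(batch) * in_features, 0.0f);
    for (int n = 0; n < batch; ++n) {
        for (int o = 0; o < out_features; ++o) {
            const float g = grad_out[n * out_features + o];
            if (!dbias.empty())
                dbias[o] += g; // db = sum over the batch of grad_out
            for (int i = 0; i < in_features; ++i) {
                dweight[o * in_features + i] += g * x_cache[n * in_features + i]; // dW = g^T x
                dx[n * in_features + i] += g * w[o * in_features + i];            // dx = g W
            }
        }
    }
    return dx;
}
```

Calling this twice without zeroing dweight and dbias doubles their contents, which is why the gradients must be cleared once per mini-batch.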
PARAMETER COLLECTION¶
void Dense::collect_params(std::vector<ParamGroup> &groups)
{
    groups.push_back({&weight, &dweight});
    if (use_bias_) groups.push_back({&bias, &dbias});
}
When use_bias = false, the bias tensor is empty and is not registered.
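To show why each parameter is registered together with its gradient, here is a rough sketch of how an optimiser might consume the collected groups. ParamGroupSketch, sgd_step and zero_grad_sketch are hypothetical stand-ins built on std::vector<float>; the real ParamGroup and optimiser interfaces are not shown in this section and may differ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the library's ParamGroup: a parameter buffer
// paired with the gradient buffer of the same size.
struct ParamGroupSketch {
    std::vector<float> *param;
    std::vector<float> *grad;
};

// Plain SGD update: param -= lr * grad for every registered group.
void sgd_step(std::vector<ParamGroupSketch> &groups, float lr)
{
    for (ParamGroupSketch &g : groups)
        for (std::size_t i = 0; i < g.param->size(); ++i)
            (*g.param)[i] -= lr * (*g.grad)[i];
}

// Clears accumulated gradients so the next mini-batch starts from zero.
void zero_grad_sketch(std::vector<ParamGroupSketch> &groups)
{
    for (ParamGroupSketch &g : groups)
        for (float &v : *g.grad)
            v = 0.0f;
}
```

With this layout, a layer built without a bias contributes one group instead of two, and the optimiser needs no special case for it.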
USAGE¶
Dense fc1(F, 128); // [B, F] → [B, 128]
Dense fc2(128, num_classes); // [B, 128] → [B, num_classes]
Sequential m;
m.add(new Dense(F, 128));
m.add(new ActivationLayer(ActType::RELU));
m.add(new Dense(128, num_classes));
m.add(new ActivationLayer(ActType::SOFTMAX));
Or via the MLP convenience wrapper, which auto-inserts ReLU between hidden Dense layers and a final Softmax.
PERFORMANCE & MEMORY¶
- Param count: F_in * F_out + F_out (with bias).
- Complexity: forward O(B * F_in * F_out); backward of the same order.
- Memory: training adds another ~2× weight (dweight) and ~1× bias (dbias).
- PSRAM: when F_in * F_out ≥ 64 KiB, store weight in PSRAM via Tensor::from_data.
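The parameter-count formula above is easy to turn into a sizing helper when deciding where a layer's weights should live. The sketch below assumes float32 storage; both function names are hypothetical and not part of the library.

```cpp
#include <cstddef>

// Number of trainable parameters: F_in * F_out weights plus F_out biases.
std::size_t dense_param_count(int in_features, int out_features, bool use_bias)
{
    std::size_t n = static_cast<std::size_t>(in_features) * out_features;
    if (use_bias)
        n += static_cast<std::size_t>(out_features);
    return n;
}

// Bytes needed for the weight matrix alone, assuming float32 elements.
std::size_t dense_weight_bytes_f32(int in_features, int out_features)
{
    return static_cast<std::size_t>(in_features) * out_features * sizeof(float);
}
```

For example, a 784 → 128 layer has 100,480 parameters with bias, and its float32 weight matrix occupies 401,408 bytes (392 KiB).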
QUANTISATION HOOKS¶
- INT8 PTQ: quantize_weights(weight, qp) produces an int8_t*, then call tiny_quant_dense_forward_int8 for fully-integer inference.
- FP8: calibrate(weight, TINY_DTYPE_FP8_E4M3) + quantize(weight, buf, qp) saves 4× storage; dequantise back to float at runtime.
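For intuition about what INT8 post-training quantisation does to a weight buffer, here is a generic symmetric per-tensor scheme. It is not the library's quantize_weights implementation, whose parameters and rounding rules are not described here; QuantResult, quantize_symmetric_int8 and dequantize_int8 are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Quantised values plus the scale needed to map them back: real ≈ scale * q.
struct QuantResult {
    std::vector<std::int8_t> q;
    float scale;
};

// Symmetric per-tensor INT8 quantisation: the largest magnitude maps to ±127.
QuantResult quantize_symmetric_int8(const std::vector<float> &w)
{
    float max_abs = 0.0f;
    for (float v : w)
        max_abs = std::max(max_abs, std::fabs(v));

    QuantResult r;
    r.scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f; // avoid divide-by-zero
    r.q.reserve(w.size());
    for (float v : w) {
        long q = std::lround(v / r.scale);      // round to nearest integer
        q = std::max(-127L, std::min(127L, q)); // clamp to the symmetric range
        r.q.push_back(static_cast<std::int8_t>(q));
    }
    return r;
}

// Maps one quantised value back to an approximate float.
float dequantize_int8(std::int8_t q, float scale)
{
    return scale * static_cast<float>(q);
}
```

Each weight shrinks from 4 bytes to 1, and the round-trip error per element is at most half the scale.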