12.3.2 Optimization Process and Performance Trade-offs
Progressive Size Reduction:
We systematically tested different model sizes and found that, while models with up to 128 neurons per layer can fit on the device, the 64-neuron configuration emerged as the best balance point for accuracy and latency.
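To make the size sweep concrete, the following minimal sketch counts the parameters of a small fully connected network at several layer widths. The architecture used here (one input, two hidden layers of equal width, one output) is an assumption chosen for illustration, as is the helper name `mlp_param_count`; the text above does not state the actual layer layout.

```python
def mlp_param_count(hidden_width, hidden_layers=2, n_in=1, n_out=1):
    """Count weights and biases of a fully connected network.

    The layer layout (n_in -> hidden_width x hidden_layers -> n_out)
    is an assumed architecture, not one confirmed by the text.
    """
    sizes = [n_in] + [hidden_width] * hidden_layers + [n_out]
    # Each dense layer contributes (inputs * outputs) weights plus one bias per output.
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


if __name__ == "__main__":
    for width in (128, 64, 32, 16):
        print(f"{width:>3} neurons/layer: {mlp_param_count(width):>6} parameters")
```

Under these assumptions the parameter count grows roughly with the square of the width, so halving the width from 128 to 64 cuts the model to about a quarter of its size.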
Memory versus Accuracy:
Memory constraints became our primary design consideration, forcing us to work backward
from hardware limitations rather than forward from accuracy goals.
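Working backward from the hardware can be sketched as a simple search: start from a memory budget and pick the widest candidate model that fits. The budget value, the one-byte-per-parameter storage cost, the candidate widths, and the two-hidden-layer layout below are all hypothetical placeholders, not figures from the device discussed here.

```python
def mlp_param_count(hidden_width, hidden_layers=2, n_in=1, n_out=1):
    """Weights plus biases for an assumed fully connected layout."""
    sizes = [n_in] + [hidden_width] * hidden_layers + [n_out]
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


def largest_width_that_fits(budget_bytes, bytes_per_param=1,
                            candidates=(16, 32, 64, 128, 256)):
    """Return the widest candidate whose parameters fit the budget, or None.

    budget_bytes and bytes_per_param are hypothetical inputs; a real
    budget must also cover activations and runtime overhead.
    """
    fitting = [w for w in candidates
               if mlp_param_count(w) * bytes_per_param <= budget_bytes]
    return max(fitting) if fitting else None


if __name__ == "__main__":
    hypothetical_budget = 20 * 1024  # placeholder value, not a measured limit
    print(largest_width_that_fits(hypothetical_budget))
```

The widest model that fits is only an upper bound on the search; as the section notes, a smaller configuration may still be the better choice once accuracy and latency are measured.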
Implementation Considerations:
The final model not only fit on the device but also:
Compiled reliably with the NPU toolchain
Maintained adequate precision for sine wave approximation
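One way to reason about "adequate precision" is to measure the error floor that output quantization alone imposes on a sine wave. The sketch below assumes symmetric 8-bit quantization with a scale of 1/127; the text does not state which number format the NPU toolchain uses, so both the format and the helper names are illustrative assumptions. It measures only the round-trip error of the target values, not the error of any trained model.

```python
import math


def quantize_int8(value, scale):
    """Map a real value to an int8 code (assumed symmetric scheme)."""
    q = round(value / scale)
    return max(-128, min(127, q))


def max_roundtrip_error(num_points=1000, scale=1.0 / 127):
    """Largest |sin(x) - dequantized(sin(x))| over one period."""
    worst = 0.0
    for i in range(num_points):
        x = 2 * math.pi * i / (num_points - 1)
        y = math.sin(x)
        y_hat = quantize_int8(y, scale) * scale  # dequantize back to a real value
        worst = max(worst, abs(y - y_hat))
    return worst


if __name__ == "__main__":
    print(f"max round-trip error: {max_roundtrip_error():.5f}")
```

Because rounding moves each value by at most half a quantization step, the result is bounded by `scale / 2`, which gives a lower bound on the error any model with such an output format could achieve.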