SDAA429 June 2026 MSPM0G5187
The Edge AI model can be deployed to hardware via the TI Neural Network Compiler, targeting either the hardware NPU or the host CPU. The TinyEngine NPU is a dedicated accelerator designed to execute neural network computations efficiently, delivering significantly lower inference latency and energy consumption than a general-purpose CPU.
To facilitate performance evaluation, the MSPM0 SDK provides both NPU and CPU implementation examples that users can benchmark; refer to:
Table 5-9 summarizes the performance comparison between the NPU-based and CPU-based Edge AI designs for the digit recognition application.
| Performance Metric | NPU-Based Design | CPU-Based Design |
|---|---|---|
| Accuracy | ~99% | ~99% |
| Flash Usage | 73 KB | 68 KB |
| RAM Usage | 10.9 KB | 8.1 KB |
| Inference Latency | 6.05 ms | 89.81 ms |
| Average Energy per Inference | 424.65 µJ | 6,265.32 µJ |
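The NPU-based design achieves approximately 14.8x lower inference latency than the CPU-based implementation (6.05 ms versus 89.81 ms) and reduces the energy consumed per inference by approximately 93%, at the same ~99% accuracy. The cost is a modest increase in memory footprint: about 5 KB more flash and 2.8 KB more RAM. This makes the NPU a highly efficient choice for power-sensitive edge applications.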
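The headline ratios can be reproduced directly from the values in Table 5-9. The following is a minimal sketch of that arithmetic, not part of the MSPM0 SDK; the dictionary keys and the `compare` helper are names chosen here for illustration only.

```python
# Values copied from Table 5-9 (digit recognition application).
# Key names are illustrative, not SDK identifiers.
NPU = {"latency_ms": 6.05, "energy_uj": 424.65, "flash_kb": 73.0, "ram_kb": 10.9}
CPU = {"latency_ms": 89.81, "energy_uj": 6265.32, "flash_kb": 68.0, "ram_kb": 8.1}


def compare(npu, cpu):
    """Derive the headline comparison figures from two table columns."""
    return {
        # How many times faster one NPU inference is than one CPU inference.
        "latency_speedup": cpu["latency_ms"] / npu["latency_ms"],
        # Percentage of per-inference energy saved by using the NPU.
        "energy_reduction_pct": 100.0 * (1.0 - npu["energy_uj"] / cpu["energy_uj"]),
        # Extra memory the NPU-based build needs, in KB.
        "flash_overhead_kb": npu["flash_kb"] - cpu["flash_kb"],
        "ram_overhead_kb": npu["ram_kb"] - cpu["ram_kb"],
    }


summary = compare(NPU, CPU)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
# latency_speedup: 14.8
# energy_reduction_pct: 93.2
# flash_overhead_kb: 5.0
# ram_overhead_kb: 2.8
```

The same helper can be reused with measurements from a different model or device by substituting the two dictionaries, provided both columns are measured under the same conditions.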