SDAA429 June 2026 MSPM0G5187

 


NPU/CPU Performance Comparison

The Edge AI model can be deployed to hardware with the TI Neural Network Compiler, targeting either the dedicated hardware NPU or the host CPU. The TinyEngine NPU is a hardware accelerator designed specifically to execute neural network computations efficiently, which reduces inference latency and energy consumption relative to a general-purpose CPU.

To facilitate performance evaluation, the MSPM0 SDK provides both NPU and CPU example implementations that users can benchmark.

Table 4-3 summarizes the performance comparison between the NPU-based and CPU-based Edge AI designs for the digit recognition application.

Table 4-3. NPU/CPU Performance Comparison

| Performance Metric | NPU-Based Design | CPU-Based Design |
| --- | --- | --- |
| Accuracy | ~99% | ~99% |
| Flash Usage | 73 KB | 68 KB |
| RAM Usage | 10.9 KB | 8.1 KB |
| Inference Latency | 6.05 ms | 89.81 ms |
| Inference Energy Consumption (AVG) | 424.65 µJ | 6,265.32 µJ |

The NPU-based design achieves approximately 14.8x lower inference latency than the CPU-based implementation (6.05 ms versus 89.81 ms) and reduces the energy consumed per inference by approximately 93%, at the cost of about 5 KB more Flash and 2.8 KB more RAM. This makes it an efficient choice for power-sensitive edge applications.
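The headline ratios can be re-derived from the measured values in the table. The following is a minimal Python sketch of that arithmetic. The function and variable names are illustrative and are not part of the MSPM0 SDK or any TI tool. The average-power estimate assumes that the reported energy figure covers exactly the inference latency window, which the source does not state.

```python
# Measured values copied from the NPU/CPU comparison table (digit recognition).
NPU = {"latency_ms": 6.05, "energy_uj": 424.65, "flash_kb": 73.0, "ram_kb": 10.9}
CPU = {"latency_ms": 89.81, "energy_uj": 6265.32, "flash_kb": 68.0, "ram_kb": 8.1}


def speedup(cpu_latency_ms: float, npu_latency_ms: float) -> float:
    """Latency ratio: how many times shorter the NPU inference is."""
    return cpu_latency_ms / npu_latency_ms


def energy_reduction_pct(cpu_energy_uj: float, npu_energy_uj: float) -> float:
    """Percentage of per-inference energy saved by the NPU design."""
    return 100.0 * (1.0 - npu_energy_uj / cpu_energy_uj)


def average_power_mw(energy_uj: float, latency_ms: float) -> float:
    """Average power during inference; uJ divided by ms yields mW.

    Assumption: the energy figure was integrated over the latency window.
    """
    return energy_uj / latency_ms


print(f"Latency speedup:   {speedup(CPU['latency_ms'], NPU['latency_ms']):.1f}x")
print(f"Energy reduction:  {energy_reduction_pct(CPU['energy_uj'], NPU['energy_uj']):.1f}%")
print(f"NPU average power: {average_power_mw(NPU['energy_uj'], NPU['latency_ms']):.1f} mW")
print(f"CPU average power: {average_power_mw(CPU['energy_uj'], CPU['latency_ms']):.1f} mW")
print(f"Extra Flash (NPU): {NPU['flash_kb'] - CPU['flash_kb']:.1f} KB")
print(f"Extra RAM (NPU):   {NPU['ram_kb'] - CPU['ram_kb']:.1f} KB")
```

Under that assumption, both designs draw a similar average power of roughly 70 mW while inference runs, so the energy saving comes almost entirely from the shorter inference time rather than from a lower instantaneous power draw.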