SPRADP7A February 2025 – March 2025 AM62A3, AM62A3-Q1, AM62A7, AM62A7-Q1, AM67A, TDA4AEN-Q1

 

  Abstract
  Trademarks
  1 Introduction
  2 Building Blocks of an RGB-IR Vision Pipeline
    2.1 CSI Receiver
    2.2 Image Signal Processor
    2.3 Video Processing Unit
    2.4 TI Deep Learning Acceleration
    2.5 GStreamer and TIOVX Frameworks
  3 Performance Considerations and Benchmarking Tools
  4 Reference Design
    4.1 Camera Module
    4.2 Sensor Driver
    4.3 CSI-2 Rx Driver
    4.4 Image Processing
    4.5 Deep Learning for Driver and Occupancy Monitoring
    4.6 Reference Code and Applications
  5 Application Examples and Benchmarking
    5.1 Application 1: Single-stream Capture and Visualization with GST
    5.2 Application 2: Dual-stream Capture and Visualization with GST and TIOVX Frameworks
    5.3 Application 3: Representative OMS-DMS + Video Telephony Pipeline in GStreamer
  6 Summary
  7 References
  8 Revision History

TI Deep Learning Acceleration

Deep learning and neural networks are an increasingly popular strategy to extract meaning and information from imagery and other data. TI's AM6xA and TDA4x SoCs use an in-house developed hardware IP, the C7xMMA, with TI Deep Learning (TIDL) software to accelerate neural network inference.

The C7xMMA is a tightly coupled combination of a C7x SIMD DSP and a matrix multiply accelerator (MMA). The architecture is highly effective for Convolutional Neural Networks (CNNs), a common type of neural network used for vision processing. In most CNNs, matrix multiplication and similar operations compose at least 98% of the total operations. MMAs therefore have a large impact on the computational efficiency of neural network acceleration for vision tasks such as object detection, pixel-level segmentation, and key-point detection.
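The 98% figure can be sanity-checked with a back-of-the-envelope operation count for a single convolution layer. The layer shape below is an arbitrary illustrative example, not taken from any TI model:

```python
# Toy op count for one stride-1, "same"-padded convolution layer:
# multiply-accumulate (matrix-multiply-like) work vs. everything else.
def conv2d_op_counts(h, w, c_in, c_out, k):
    """Return (MAC count, other-op count) for one conv layer."""
    macs = h * w * c_out * (k * k * c_in)  # MMA-friendly multiply-accumulates
    other = h * w * c_out * 2              # bias add + ReLU per output value
    return macs, other

macs, other = conv2d_op_counts(h=224, w=224, c_in=64, c_out=64, k=3)
mac_fraction = macs / (macs + other)       # ~0.997 for this layer shape
```

Even with only a 3x3 kernel and 64 input channels, multiply-accumulates make up well over 98% of the layer's operations, which is why offloading them to the MMA dominates overall inference efficiency.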

Figure 2-3 depicts a general development flow for TIDL on AM6xA and TDA4x processors. This development flow can be entered from multiple points. TI provides GUI-based and command line-based tools that enable users to:

  • Bring your own data (BYOD) and train a TI model
  • Bring your own pretrained model (BYOM) of a custom architecture
  • Evaluate a pretrained, pre-optimized model from TI's Model Zoo

Each of these development actions feeds into the next: developers compile a model for the target SoC and can test accuracy on a PC before deploying to the target. The compilation tools and accelerator are invoked through open-source runtimes such as TensorFlow Lite, ONNX Runtime, or TVM. These runtimes provide a familiar API and allow unaccelerated layers to run on the Arm® A cores, easing usability for a broad range of models. Each of these open-source runtimes (OSRTs) leverages the TIDL runtime (TIDL_RT) under the hood.
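As a minimal sketch of how an OSRT session is steered toward TIDL versus the Arm cores, the helper below builds the provider arguments for an ONNX Runtime session. The provider name "TIDLExecutionProvider" and the "artifacts_folder" option follow the pattern in TI's edgeai-tidl-tools examples; treat them as assumptions to verify against your Processor SDK version:

```python
# Hypothetical helper: assemble the (providers, provider_options) pair
# passed to onnxruntime.InferenceSession(). Provider/option names are
# assumptions based on TI's edgeai-tidl-tools examples.
def tidl_session_config(artifacts_dir, accelerate=True):
    if accelerate:
        # TIDL runs supported layers; unsupported layers fall back to CPU.
        return (["TIDLExecutionProvider", "CPUExecutionProvider"],
                [{"artifacts_folder": artifacts_dir}, {}])
    # Pure Arm A-core execution, useful for accuracy comparison on PC.
    return (["CPUExecutionProvider"], [{}])

providers, options = tidl_session_config("/opt/model-artifacts/mymodel")
# session = onnxruntime.InferenceSession("model.onnx",
#                                        providers=providers,
#                                        provider_options=options)
```

Keeping the CPU provider in the list is what lets layers the compiler could not offload still execute on the A cores, as described above.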

Figure 2-3. TI Deep Learning Development Flow