
Table of Contents

Abstract
1 Introduction
    1.1 Vision Analytics
    1.2 End Equipments
    1.3 Deep learning: State-of-the-art
2 Embedded edge AI system: Design considerations
    2.1 Processors for edge AI: Technology landscape
    2.2 Edge AI with TI: Energy-efficient and Practical AI
        2.2.1 TDA4VM processor architecture
            2.2.1.1 Development platform
    2.3 Software programming
3 Industry standard performance and power benchmarking
    3.1 MLPerf models
    3.2 Performance and efficiency benchmarking
    3.3 Comparison against other SoC Architectures
        3.3.1 Benchmarking against GPU-based architectures
        3.3.2 Benchmarking against FPGA based SoCs
        3.3.3 Summary of competitive benchmarking
4 Conclusion
Revision History
5 References

Deep learning: State-of-the-art

Deep learning model development is an active research area in the AI community. Driven by smartphones and the volume of image data they generate, particular attention has been given to image and vision deep learning functions that can identify faces, scenes, moods, and other information in pictures. A specific type of neural network, the Convolutional Neural Network (CNN), is the enabler of the latest advancements in computer vision. Convolution is a technique for detecting different features in an input image. The convolution process sweeps a kernel, also called a filter, across the image to detect patterns. A kernel is a small matrix (typically 3x3 or 5x5) with a set of weights corresponding to its size, and it typically detects one feature in the image, such as eyes, a nose, or a specific expression.
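To make the kernel sweep concrete, the short sketch below slides a 3x3 filter over a single-channel image to produce a feature map. It is an illustration in plain NumPy only; the function name, kernel values, and image size are example assumptions, not part of any TI API.

```python
import numpy as np

def convolve2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid (no-padding) 2D correlation of a single-channel image with a small kernel."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            # Multiply the kernel element-wise with the image patch under it and sum.
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

# Example: a 3x3 vertical-edge kernel swept across a random 8x8 image.
image = np.random.rand(8, 8)
edge_kernel = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]])
print(convolve2d(image, edge_kernel).shape)  # (6, 6) feature map
```

A real CNN layer applies many such kernels in parallel across all input channels and learns their weights during training, but the sweep-and-sum operation is the same.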

The seminal AlexNet paper [3], published in 2012, showed researchers and industry that deep learning is an extremely effective algorithmic technique for solving computer vision tasks such as classification, object detection, and semantic segmentation. This triggered a series of innovations that continue to improve inference performance and efficiency, targeting a myriad of applications from robots and smart retail carts to autonomous last-mile delivery systems.

Figure 1-3 below shows popular models used in the industry today, starting with AlexNet [4]. In general, there is a clear trade-off between the accuracy of a model and the number of operations it requires, expressed in giga-operations (G-ops) [4].

Figure 1-3 Popular deep learning models

This trade-off highlights the need for SoC architectures that can execute such large computational workloads efficiently. The deep learning community is vibrant, and constant innovation continues to improve model performance and efficiency. TI continually pursues new SoC technology innovations to offer this community best-in-class AI capabilities.
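To see why the operation count in Figure 1-3 translates directly into a hardware requirement, the back-of-the-envelope sketch below estimates theoretical inference throughput from a model's operation count and an accelerator's peak compute. The function and the numbers (a ~4 G-op model, an 8 TOPS accelerator, 50% utilization) are illustrative assumptions, not TDA4VM benchmark results.

```python
def estimated_fps(model_gops: float, peak_tops: float, utilization: float = 0.5) -> float:
    """Theoretical inferences per second: usable ops/s divided by ops per inference."""
    usable_ops_per_s = peak_tops * 1e12 * utilization  # peak compute derated by utilization
    ops_per_inference = model_gops * 1e9               # operations needed for one frame
    return usable_ops_per_s / ops_per_inference

# Example: a ~4 G-op classification model on an 8 TOPS accelerator at 50% utilization.
print(f"{estimated_fps(4.0, 8.0):.0f} inferences/s")  # ~1000 inferences/s
```

The achievable utilization depends heavily on how well the accelerator's memory system and dataflow match the model, which is exactly what the architecture and benchmarking sections of this note examine.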

It is common practice to reuse deep learning network architectures published in the literature, and open-source implementations exist for many of them. TI makes this process even easier with its Model Zoo, a large collection of models optimized for inference speed and low power consumption. The models used for benchmarking in this application note are examples of such open-source models.
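As a minimal sketch of that open-source workflow (not TI's tooling; the model choice, weight tag, and output filename are example assumptions), the snippet below pulls a published architecture with pretrained weights from torchvision and exports it to ONNX, a common interchange format for compiling models to embedded inference targets.

```python
import torch
import torchvision

# Fetch a published architecture with pretrained ImageNet weights.
model = torchvision.models.mobilenet_v2(weights="IMAGENET1K_V1")
model.eval()

# Export to ONNX with a dummy NCHW input matching the model's expected shape.
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "mobilenet_v2.onnx", opset_version=13)
```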