SPRADB8 May 2023 AM62A3, AM62A7

2 AM62A Processor

The AM62A Edge AI Microprocessor, shown in Figure 2-1, is designed for single- or dual-camera applications that are cost- and power-sensitive, yet require intensive image analysis. Vision applications benefit from hardware acceleration to enhance image quality, speed up preprocessing, and accelerate analytics algorithms such as deep neural networks.

Figure 2-1 AM62A Simplified Block Diagram

Figure 2-2 shows a general dataflow for vision analytics applications. Images are produced by the camera, a low-cost raw image sensor. Raw image data enters the processor through the 4-lane MIPI-CSI2 port, which can be split into multiple virtual channels to support more cameras. The ISP enhances the image: it reduces noise, tunes white balance and gain, filters and interpolates color information, and processes High Dynamic Range (HDR) information. For applications with a wide-angle lens, the Lens Distortion Correction (LDC) accelerator reduces the warping effects of the lens. After the image is preprocessed to meet the AI model’s input specification, the hardware accelerator runs the model 50 to 100 times faster than the CPU could; see Table 3-1 for several model benchmarks. An AI model can accomplish tasks such as recognizing food items, locating a barcode on a package, identifying where customers spend the most time, or detecting patterns of theft.

Figure 2-2 AM6xA Vision Application Data Flow
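The stage ordering described above can be sketched in a few lines of Python. This is only an illustration of the dataflow, not TI software: the stage names, the `Frame` dictionary, and the `run_pipeline` helper are hypothetical placeholders, and each stage merely records that it ran instead of touching real image data.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an image buffer plus metadata.
Frame = Dict[str, object]
# (stage name, block that handles it per the text, stage function)
Stage = Tuple[str, str, Callable[[Frame], Frame]]


def _mark(step: str) -> Callable[[Frame], Frame]:
    """Build a placeholder stage that only appends its name to the frame history."""
    def apply(frame: Frame) -> Frame:
        out = dict(frame)
        out["history"] = list(frame.get("history", [])) + [step]
        return out
    return apply


# Order follows the dataflow paragraph; the functions are placeholders.
PIPELINE: List[Stage] = [
    ("capture", "MIPI-CSI2 port", _mark("capture")),
    ("enhance", "ISP", _mark("enhance")),
    ("dewarp", "LDC accelerator", _mark("dewarp")),
    ("preprocess", "not specified here", _mark("preprocess")),
    ("infer", "deep learning accelerator", _mark("infer")),
    ("act", "application code", _mark("act")),
]


def run_pipeline(frame: Frame, use_ldc: bool = True) -> Frame:
    """Pass one frame through every stage; skip dewarping when no wide-angle lens is used."""
    for name, _block, stage_fn in PIPELINE:
        if name == "dewarp" and not use_ldc:
            continue
        frame = stage_fn(frame)
    return frame


result = run_pipeline({"history": []})
print(result["history"])
```

Setting `use_ldc=False` models a camera without a wide-angle lens, where the text implies the LDC step is not needed.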

Once the AI model has run, the application decides how to act on the result, for example by communicating over the network, showing information on a display, or sounding an alarm. When the device is inactive, low power modes drastically reduce power consumption; when running at 100% load, the SoC consumes less than 3 watts at up to 85°C, reducing the need for active cooling. On-device security prevents tampering to protect data and firmware IP.

The AM62A features a 2 TOPS deep learning accelerator that is designed in-house. The accelerator is composed of a 256-bit C7x DSP tightly coupled to a matrix multiply accelerator (MMA). This tight coupling enables fast and efficient data movement to the accelerator, which keeps its utilization high. The 2 TOPS metric refers to the maximum number of operations per second on 8-bit quantized matrices. However, TOPS is not an ideal indicator of deep learning performance, because the same TOPS rating can yield very different inference times and power usage depending on the accelerator architecture and even on the neural network architecture. For this reason, it is more informative to view benchmarks that show inference rate (frames per second).
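A short calculation illustrates why a TOPS rating alone says little about frame rate. The sketch below converts a peak TOPS figure into a theoretical upper bound on frames per second; the `peak_fps` helper, the 4 GOPs-per-frame model cost, and the utilization values are assumptions chosen for illustration, not measured AM62A figures.

```python
def peak_fps(peak_tops: float, giga_ops_per_frame: float, utilization: float = 1.0) -> float:
    """Upper bound on inference rate implied by a TOPS rating.

    peak_tops: accelerator rating in tera-operations per second.
    giga_ops_per_frame: operations one inference needs, in billions (assumed value).
    utilization: fraction of peak throughput the model actually sustains (assumed value).
    """
    if giga_ops_per_frame <= 0:
        raise ValueError("giga_ops_per_frame must be positive")
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    ops_per_second = peak_tops * 1e12 * utilization
    return ops_per_second / (giga_ops_per_frame * 1e9)


# Hypothetical model needing 4 GOPs per frame on a 2 TOPS accelerator.
for util in (1.0, 0.5, 0.25):
    print(f"utilization {util:.2f}: at most {peak_fps(2.0, 4.0, util):.0f} fps")
```

The same 2 TOPS rating gives bounds that differ by a factor of four across these assumed utilization levels, which is why measured frames-per-second benchmarks are the more informative comparison.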