

Jianzhong Xu, Tarkesh Pande

#### ABSTRACT

This white paper describes how the AM62A System On Chip (SoC) can be used in automotive-grade camera mirror systems.

# **Table of Contents**

- 1 Introduction
- 2 AM62A Processor
- 3 Vision Pre-processing Accelerator (VPAC)
  - 3.1 Vision Imaging Sub-System (VISS)
  - 3.2 Lens Distortion Correction (LDC) Block
  - 3.3 Multi-Scalar (MSC) Block
- 4 Deep Learning Acceleration
- 5 Camera Mirror System Data Flow and Latency
- 6 End-to-End Functional Safety
- 7 Example Demonstration
  - 7.1 Hardware Equipment
  - 7.2 Software Components
  - 7.3 Latency Measurement
  - 7.4 Future Improvement on Latency
- 8 Summary
- 9 References

# **List of Figures**

- Figure 1-1. Example Rearview Camera Design With the AM62A
- Figure 2-1. AM62A Simplified Block Diagram
- Figure 3-1. VPAC Block Diagram
- Figure 3-2. Example of Aspherical Mirror Emulation
- Figure 5-1. Typical Data Flow of a Camera Mirror System
- Figure 7-1. CMS Demonstration Hardware Setup
- Figure 7-2. CMS Demonstration Software Stack
- Figure 7-3. Glass-to-Glass Latency Measurement

# List of Tables

- Table 4-1. Example of Deep Learning Performance on the AM62A With C7, MMA at 850 MHz
- Table 5-1. Camera to Display Latency Estimate

#### Trademarks

Arm<sup>®</sup> and Cortex<sup>®</sup> are registered trademarks of Arm Limited.

MIPI<sup>®</sup> is a registered trademark of Mobile Industry Processor Interface Alliance.

Sony<sup>®</sup> is a registered trademark of SONY Corporation.

All trademarks are the property of their respective owners.



# 1 Introduction

Camera mirror systems replace the traditional physical mirrors in vehicles with cameras and displays. A camera mirror system (CMS) uses cameras mounted on the exterior of the vehicle to capture real-time video of the surroundings and shows that video on displays placed where the traditional mirrors would be. A CMS offers several advantages over traditional mirrors, such as a wider field of view, reduced blind spots, and enhanced visibility, especially in challenging driving conditions. A CMS can also offer advanced features such as image enhancement, low-light visibility, and pedestrian and vehicle proximity warnings. Recently, the United Nations Economic Commission for Europe (UNECE) established UN Regulation No. 151<sup>(1)</sup>, which sets specific requirements and standards concerning the approval of motor vehicles with regard to Blind Spot Information Systems (BSIS) for the detection of bicycles. Next-generation CMS designs must not only satisfy current and new regulatory standards but also evolve with the quickly changing landscape.

This paper documents how the features of the AM62A processor enable next-generation automotive camera system design. The paper also highlights critical considerations when designing a CMS, such as latency and functional safety, and how the AM62A was designed with these parameters in mind. The AM62A is targeted at 1-1-1 camera systems such as a rearview mirror or intelligent side-view mirrors in commercial trucks. The convention 1-1-1 stands for 1 camera, 1 processor, and 1 display. Figure 1-1 shows a typical example configuration. The camera is typically located away from the display unit, and in automotive applications FPD-Link is typically used to transmit the camera data.



Figure 1-1. Example Rearview Camera Design With the AM62A

# 2 AM62A Processor

Figure 2-1 shows a simplified block diagram of the AM62A microprocessor. The AM62A is a heterogeneous processor designed for camera analytics applications. Separate hardware accelerators are optimized for different tasks, which enables an optimized power and cost footprint.







Figure 2-1. AM62A Simplified Block Diagram

The main processing, compute, and interface subsystems from a CMS context in the AM62A are as follows:

- Quad Arm<sup>®</sup> Cortex<sup>®</sup>-A53 cores: These cores can run up to 1.4 GHz and provide up to 16.8k Dhrystone Million Instructions Per Second (DMIPS) of performance.
- C7x Digital Signal Processor (DSP) and Matrix Multiplication Accelerator (MMA): TI's deep-learning
  accelerator on the AM62A is capable of 2 Tera Operations Per Second (TOPS) when clocked at 1 GHz.
- Vision Pre-processing Accelerator (VPAC3L): The latest generation in TI Image Signal Processor (ISP) technology for performing image operations, some examples of which are color conversions, chromatic aberration correction, pyramid scaling, and lens distortion correction. The total throughput of VPAC3L is up to 300MP/s.
- Camera Serial Interface (CSI-2 Rx): Mobile Industry Processor Interface (MIPI<sup>®</sup>) CSI-2 v1.3 compliant CSI-2 RX interface supporting 1, 2, 3, or 4 data lane mode of up to 1.5Gbps per lane.
- Display Subsystem and DPI interface: The display subsystem is capable of driving a single display with a typical configuration of 2MP at 60 fps. The pixel clock support is set at 165 MHz. DPI supports a 24-bit RGB parallel interface.
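
As a rough sanity check of the interface figures above, the following sketch compares them against the 1936 × 1100 at 60 fps sensor mode used later in this paper. The 12-bit raw pixel format and the 1080p60 display timing (2200 × 1125 total pixels per frame including blanking) are assumptions made for illustration only, and CSI-2 protocol overhead is ignored.

```python
# Back-of-envelope interface budget for a 1-camera CMS on the AM62A.
# Values marked "assumed" are illustrative and are not taken from this paper.

SENSOR_WIDTH = 1936            # pixels per line (sensor mode used in the demonstration)
SENSOR_HEIGHT = 1100           # lines per frame
SENSOR_FPS = 60
RAW_BITS_PER_PIXEL = 12        # assumed raw pixel format

CSI2_LANES = 4
CSI2_GBPS_PER_LANE = 1.5       # per-lane limit quoted in the list above
VPAC_MP_PER_S = 300            # VPAC3L throughput quoted in the list above
DISPLAY_PCLK_LIMIT_MHZ = 165   # pixel clock limit quoted in the list above

DISPLAY_H_TOTAL = 2200         # assumed 1080p60 timing, including blanking
DISPLAY_V_TOTAL = 1125
DISPLAY_FPS = 60


def sensor_pixel_rate_mp_s():
    """Active pixel rate of the sensor in megapixels per second."""
    return SENSOR_WIDTH * SENSOR_HEIGHT * SENSOR_FPS / 1e6


def sensor_payload_gbps():
    """Raw pixel payload in Gbps, ignoring protocol overhead and blanking."""
    return sensor_pixel_rate_mp_s() * 1e6 * RAW_BITS_PER_PIXEL / 1e9


def csi2_capacity_gbps():
    """Aggregate CSI-2 RX bandwidth with all lanes at the per-lane limit."""
    return CSI2_LANES * CSI2_GBPS_PER_LANE


def display_pixel_clock_mhz():
    """Pixel clock needed for the assumed display timing."""
    return DISPLAY_H_TOTAL * DISPLAY_V_TOTAL * DISPLAY_FPS / 1e6


if __name__ == "__main__":
    print(f"Sensor pixel rate : {sensor_pixel_rate_mp_s():.1f} MP/s "
          f"(VPAC3L limit {VPAC_MP_PER_S} MP/s)")
    print(f"Sensor payload    : {sensor_payload_gbps():.2f} Gbps "
          f"(CSI-2 capacity {csi2_capacity_gbps():.1f} Gbps)")
    print(f"Display pixel clk : {display_pixel_clock_mhz():.1f} MHz "
          f"(limit {DISPLAY_PCLK_LIMIT_MHZ} MHz)")
```

Under these assumptions, a single 2.1MP, 60 fps camera uses well under half of the VPAC3L throughput and about a quarter of the CSI-2 RX bandwidth, and a 1080p60 display fits within the 165 MHz pixel clock.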

## 3 Vision Pre-processing Accelerator (VPAC)

The VPAC subsystem on the AM62A device provides common vision primitive functions for image data processing which is performed at the pixel level. There are three sub-modules in VPAC: Vision Imaging Sub-System (VISS), Lens Distortion Correction (LDC) block, and the Multi-Scalar (MSC) block, as illustrated in Figure 3-1. The VPAC also includes an imaging pipe, which can either be integrated with external camera sensors to operate in on-the-fly mode, or operate in memory-to-memory mode. MSC and LDC can start processing immediately after their upstream block finishes processing.





Figure 3-1. VPAC Block Diagram

## 3.1 Vision Imaging Sub-System (VISS)

The VISS performs image processing on raw data, which includes wide dynamic range (WDR) merge, defect pixel correction (DPC), lens shading correction (LSC), global and local brightness and contrast enhancement (GLBCE), demosaicing, color conversion, and edge enhancement (EE). VISS can operate on sensor data either in on-the-fly mode or in memory-to-memory mode:

- On-the-fly mode: camera data goes directly to VPAC for processing without being first stored to memory. This mode is enabled by integrating the imaging pipe inside VISS with the external camera sensor.
- Memory-to-memory mode: camera data is stored to the memory and VPAC reads the data from memory and then processes the data.

As of Processor SDK release 9.0, only memory-to-memory mode is supported. On-the-fly mode will be supported in future SDK releases.

In addition to performing raw image processing, the VISS also computes image statistics and supports the control loop for auto focus (AF), auto exposure (AE), and auto white balance (AWB).

## 3.2 Lens Distortion Correction (LDC) Block

The LDC is a dedicated hardware accelerator capable of flexible perspective transformations. These transformations are necessary because the final mirror view displayed on the monitor is constructed taking the following into account:

- Input fish-eye lens distortion correction
- Driver viewing perspective adjustment
- Emulation of the aspherical side-view mirror effect digitally

In Figure 3-2, aspherical mirror emulation compresses the right side of the view horizontally so that a wider view can fit into a smaller CMS monitor.
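
The horizontal compression described above can be illustrated with a simple backward mapping from output columns to input columns. This is only a sketch of the idea: the piecewise linear-then-quadratic curve, the `knee_frac` parameter, and the function names are hypothetical, and the sketch does not represent the LDC programming interface or the mapping used to produce Figure 3-2.

```python
def aspheric_source_x(x_out, out_width, in_width, knee_frac=0.6):
    """Return the input column sampled for output column x_out.

    Columns left of the knee map 1:1. Right of the knee the source
    coordinate grows quadratically, so a wider span of the input image
    is squeezed into the remaining output columns.
    """
    if in_width < out_width:
        raise ValueError("input must be at least as wide as the output")
    last_out = out_width - 1
    last_in = in_width - 1
    knee = knee_frac * last_out          # hypothetical start of the compressed zone
    if x_out <= knee:
        return float(x_out)
    span_out = last_out - knee           # output columns right of the knee
    span_in = last_in - knee             # input columns that must fit into them
    t = (x_out - knee) / span_out        # 0 at the knee, 1 at the right edge
    # Slope is 1 at the knee (no visible seam) and increases toward the edge.
    return knee + span_out * t + (span_in - span_out) * t * t


def build_row_map(out_width, in_width, knee_frac=0.6):
    """Source column for every output column of one image row."""
    return [aspheric_source_x(x, out_width, in_width, knee_frac)
            for x in range(out_width)]


if __name__ == "__main__":
    row_map = build_row_map(out_width=1280, in_width=1920)
    for x in (0, 400, 800, 1000, 1279):
        print(f"output column {x:4d} <- input column {row_map[x]:7.1f}")
```

In this sketch the left part of the view is shown unmodified while the right part covers progressively more of the scene per output pixel, which is the qualitative behavior of an aspherical mirror. A real configuration would also fold in the lens distortion correction and the viewing perspective adjustment listed above.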



Figure 3-2. Example of Aspherical Mirror Emulation



## 3.3 Multi-Scalar (MSC) Block

The MSC can generate up to 10 scaled outputs from a given input with various scaling ratios (between 1× and 0.25×). Each of the 10 scaling operations can be configured to perform pyramid scale or inter-octave scale generation.
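
The following sketch shows how a set of scaling ratios translates into output resolutions under the limits stated above (at most 10 outputs, ratios from 0.25× to 1×). The function name and the truncation of fractional sizes are assumptions for illustration and do not describe the MSC driver interface.

```python
MSC_MAX_OUTPUTS = 10     # limit stated above
MSC_MIN_RATIO = 0.25     # smallest scaling ratio stated above
MSC_MAX_RATIO = 1.0      # largest scaling ratio stated above


def msc_output_sizes(width, height, ratios):
    """Return (width, height) of each scaled output for the given ratios."""
    if len(ratios) > MSC_MAX_OUTPUTS:
        raise ValueError(f"at most {MSC_MAX_OUTPUTS} outputs are supported")
    sizes = []
    for ratio in ratios:
        if not MSC_MIN_RATIO <= ratio <= MSC_MAX_RATIO:
            raise ValueError(f"ratio {ratio} is outside 0.25x to 1x")
        # Truncating to whole pixels is an assumption of this sketch.
        sizes.append((int(width * ratio), int(height * ratio)))
    return sizes


if __name__ == "__main__":
    # Pyramid (octave) levels: each level halves the previous one.
    print(msc_output_sizes(1920, 1080, [1.0, 0.5, 0.25]))
    # Inter-octave levels: an intermediate step between each pair of octaves.
    print(msc_output_sizes(1920, 1080, [2 ** (-k / 2) for k in range(5)]))
```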

## 4 Deep Learning Acceleration

Next-generation CMS designs need to identify vehicles, bicycles, and pedestrians reliably and provide proximity warnings. Deep learning is highly effective for these tasks in the automotive context because it handles variability in scale, viewpoint, and lighting conditions, which allows for robust detection performance. TI's deep learning accelerator is the C7x DSP with MMA, which is capable of 2 TOPS of performance. TI provides a model analyzer and model selection tool<sup>(2)</sup> that enables third-party perception stack providers to choose the deep learning model that provides the maximum entitlement in terms of frames per second and accuracy. As an example, Table 4-1 illustrates the performance entitlement of the SSDLite-MobDet-EdgeTPU model when running at 60 fps. This model is found in TI's edgeai-modelzoo<sup>(3)</sup>.

Table 4-1. Example of Deep Learning Performance on the AM62A With C7, MMA at 850 MHz

| Model                  | Model Resolution | Frames per Second | mAP Accuracy on COCO Dataset | Latency (ms) | Deep Learning Utilization | DDR Bandwidth Utilization |
|------------------------|------------------|-------------------|------------------------------|--------------|---------------------------|---------------------------|
| SSDLite-MobDet-EdgeTPU | 320 × 320        | 60                | 29.7                         | 8.35         | 50%                       | 1.09 GB/s                 |
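
One way to read the utilization figure in Table 4-1 is as the fraction of each second that the accelerator spends on inference, that is, the per-frame latency multiplied by the frame rate. This interpretation is an assumption of the sketch below rather than a definition given in this paper, but it reproduces the 50% entry.

```python
def dl_utilization(latency_ms, fps):
    """Fraction of each second spent on inference (assumed definition)."""
    return latency_ms * fps / 1000.0


def per_frame_headroom_ms(latency_ms, fps):
    """Time left in each frame period after inference completes."""
    return 1000.0 / fps - latency_ms


if __name__ == "__main__":
    latency_ms, fps = 8.35, 60   # values from Table 4-1
    print(f"utilization : {dl_utilization(latency_ms, fps):.1%}")
    print(f"headroom    : {per_frame_headroom_ms(latency_ms, fps):.2f} ms per frame")
```

Read this way, roughly half of the accelerator time remains available at 60 fps for additional analytics.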

## 5 Camera Mirror System Data Flow and Latency

In a camera mirror system, the image data from the camera typically goes through the CSI-2 RX interface, ISP, deep learning engine, and finally to the display. Figure 5-1 shows the data flow on the AM62A device.



Figure 5-1. Typical Data Flow of a Camera Mirror System

For camera mirror applications, the latency from camera to display (that is, the so-called glass-to-glass latency) must be as small as possible. Every block in the data flow contributes to the latency. The AM62A SoC has the following differentiating features that can achieve optimal latency:

- High throughput ISP: each of the three blocks of VPAC3L can process 1 pixel per clock cycle, up to 300MP/s
  after accounting for overhead.
- ISP on-the-fly mode: when operating in this mode, VISS processes the camera data on the fly, without waiting for a full frame of data to be available.
- High performance deep learning accelerator: the C7x, MMA deep learning accelerator has 2 TOPS processing capability.
- DDR subsystem supports up to 3733 MT/s.

Table 5-1 shows an exemplary analysis of the latency accumulated block after block in the data path shown in Figure 5-1. A 2.1MP (1936 × 1100) sensor running at 60 fps is used for the analysis. The latency introduced by each block is estimated as the following:

- At 60 fps, the frame duration is 16.67 ms.
- This analysis assumes VISS running in memory-to-memory mode instead of on-the-fly mode. Therefore, VISS has to wait until a whole frame is available before it starts processing.
- The processing time for each of VISS, LDC, and MSC is estimated as 1936 × 1100 pixels / 300 MP/s ≈ 7 ms. The configuration time for each block is conservatively assumed to be 1 ms. Therefore, the total latency is about 8 ms for each block.
- Deep learning latency is estimated to be 8 ms according to Section 4.

| Sensor<br>1936 × 1100 at 60 fps | Frame 0 |         |         |         |         |         |         |
|---------------------------------|---------|---------|---------|---------|---------|---------|---------|
| CSI2-RX                         |         | Frame 0 |         |         |         |         |         |
| VISS                            |         |         | Frame 0 |         |         |         |         |
| LDC                             |         |         |         | Frame 0 |         |         |         |
| MSC                             |         |         |         |         | Frame 0 |         |         |
| Deep Learning                   |         |         |         |         |         | Frame 0 |         |
| Display                         |         |         |         |         |         |         | Frame 0 |
| Time (ms)                       | 0       | 16.67   | 24.67   | 32.67   | 40.67   | 48.67   | 65.33   |

Table 5-1. Camera to Display Latency Estimate

As shown in this analysis, the total latency of the data flow shown in Figure 5-1 is about 65 ms, which is adequate for a typical camera mirror system.
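
The estimate in Table 5-1 can be reproduced with a few lines of arithmetic. The sketch below follows the assumptions listed above (memory-to-memory mode, about 7 ms of processing plus 1 ms of configuration per VPAC block, 8 ms for deep learning) and additionally assumes that the display stage contributes one frame period, which is one reading of the last column of the table. The function and constant names are illustrative.

```python
FPS = 60
FRAME_PERIOD_MS = 1000.0 / FPS     # 16.67 ms at 60 fps
VPAC_PIXELS_PER_S = 300e6          # VPAC3L throughput
CONFIG_TIME_MS = 1.0               # conservative per-block configuration time
DEEP_LEARNING_MS = 8.0             # rounded inference latency from Section 4


def vpac_block_latency_ms(width, height):
    """Processing time rounded to whole milliseconds, plus configuration time."""
    processing_ms = width * height / VPAC_PIXELS_PER_S * 1000.0
    return round(processing_ms) + CONFIG_TIME_MS


def cumulative_latency(width=1936, height=1100, include_deep_learning=True):
    """Return (stage name, elapsed time in ms at the end of that stage)."""
    block_ms = vpac_block_latency_ms(width, height)
    stages = [
        ("Frame capture (sensor, CSI2-RX)", FRAME_PERIOD_MS),
        ("VISS", block_ms),
        ("LDC", block_ms),
        ("MSC", block_ms),
    ]
    if include_deep_learning:
        stages.append(("Deep learning", DEEP_LEARNING_MS))
    stages.append(("Display", FRAME_PERIOD_MS))   # assumed one frame period

    timeline = []
    elapsed_ms = 0.0
    for name, duration_ms in stages:
        elapsed_ms += duration_ms
        timeline.append((name, elapsed_ms))
    return timeline


if __name__ == "__main__":
    for name, elapsed_ms in cumulative_latency():
        print(f"{name:32s} done at {elapsed_ms:6.2f} ms")
```

Calling `cumulative_latency(include_deep_learning=False)` gives the corresponding estimate for a pipeline without the deep learning stage.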

## 6 End-to-End Functional Safety

In a CMS, it is crucial to verify the proper functioning and expected performance of the entire software chain and frame acquisition process, from the image sensor to the display. Any issue that can result in a static frame on the display (commonly referred to as frame freeze) must be detected, and the driver must be alerted promptly. Two methods are employed to detect frame freeze:

1. CRC Signature Comparison: The currently displayed frame is stored back in memory, and a unique signature is generated using the CRC module. The CRC signatures of consecutive frames are compared to determine whether the display is frozen.
2. Frame Statistics and Metadata Comparison: The frame statistics, including metadata such as frame number and timestamp, are transmitted alongside the frame over the display interface. On the receiving end, the metadata is extracted from multiple consecutive frames and compared to ascertain the presence of frame freeze.

These monitoring techniques help verify that frames are displayed continuously and accurately, providing a reliable and visually consistent experience to the driver.
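
The logic of both checks can be sketched in software as follows. This is a minimal illustration only: `zlib.crc32` stands in for the hardware CRC module, and the class name, the `max_repeats` threshold, and the metadata layout are hypothetical and not taken from TI software.

```python
import zlib


class CrcFreezeDetector:
    """Flags a freeze when consecutive frames keep producing the same CRC."""

    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats   # hypothetical tolerance before alerting
        self._last_crc = None
        self._repeats = 0

    def update(self, frame_bytes):
        """Feed one displayed frame; return True if a freeze is suspected."""
        crc = zlib.crc32(frame_bytes)    # software stand-in for the CRC module
        if crc == self._last_crc:
            self._repeats += 1
        else:
            self._repeats = 0
            self._last_crc = crc
        return self._repeats >= self.max_repeats


def metadata_frozen(metadata_window):
    """Return True if (frame_number, timestamp) never changes in the window."""
    if len(metadata_window) < 2:
        return False
    return all(item == metadata_window[0] for item in metadata_window[1:])


if __name__ == "__main__":
    detector = CrcFreezeDetector(max_repeats=2)
    frames = [b"frame-0", b"frame-1", b"frame-1", b"frame-1", b"frame-2"]
    print([detector.update(f) for f in frames])
    print(metadata_frozen([(10, 1000), (10, 1000), (10, 1000)]))
    print(metadata_frozen([(10, 1000), (11, 1016), (12, 1033)]))
```

The repeat threshold trades detection time against false alarms, and a suitable value depends on the safety requirements of the target system.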

## 7 Example Demonstration

An example demonstration was built to showcase the capabilities of the AM62A SoC for camera mirror applications. In this demonstration, raw video data is captured from a camera, processed on the AM62A SoC, and then sent to a display. The camera is connected to the AM62A SoC through a FAKRA cable to simulate the real use case where the camera is placed away from the processor.

#### 7.1 Hardware Equipment

The hardware equipment for this demonstration includes:

- AM62A starter kit evaluation module (EVM)<sup>(4)</sup>
- FPD-Link III camera module with a Sony<sup>®</sup> IMX390 image sensor and a lens with 190° field of view (FOV), running at 1936 × 1100 resolution and 60 fps
- FPD-Link III compatible FAKRA cable to connect a camera up to 15 m away from the AM62A SoC
- DS90UB954-Q1 deserializer EVM to connect the IMX390 camera module to the AM62A starter kit EVM
- A 12.6-inch wide screen rear view display

Figure 7-1 shows the complete setup of this demonstration.



Figure 7-1. CMS Demonstration Hardware Setup

#### 7.2 Software Components

The software components for this demonstration are listed below and are shown in Figure 7-2:

- QNX SDK
- Sensor driver, serializer and deserializer driver, CSI2-RX driver
- VPAC driver running all the processing blocks: VISS, LDC, and MSC
- Auto exposure (AE) and auto white balance (AWB) algorithm
- Display driver
- TI OpenVX based vision application performing the video streaming
- (Deep learning is not included in this demonstration)



Figure 7-2. CMS Demonstration Software Stack



#### 7.3 Latency Measurement

The glass-to-glass latency of this demonstration was measured by streaming a digital stopwatch and calculating the time difference between the actual stopwatch and the displayed stopwatch. Figure 7-3 shows an example of the latency measurement:

- The actual real-time stopwatch was at 09:52:32 when the picture was taken.
- The displayed stopwatch was at 09:52:26 when the picture was taken.
- The displayed stopwatch lagged the actual stopwatch by 6 counts of the last field (hundredths of a second), that is, 60 ms, which was taken as the glass-to-glass latency.



Figure 7-3. Glass-to-Glass Latency Measurement

This demonstration did not perform deep learning analytics, so the measured latency does not include deep learning processing time. According to Table 5-1, the estimated total latency without the deep learning block is about 57 ms (65 ms minus the 8 ms deep learning stage). Allowing for measurement error, the measured latency of 60 ms is very close to the estimated value.
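
The arithmetic behind this measurement can be written out as follows, assuming the stopwatch readings are formatted as minutes:seconds:hundredths. The format is inferred from the 60 ms result and is not stated explicitly in this paper, and the function names are illustrative.

```python
def stopwatch_to_ms(reading):
    """Convert an 'MM:SS:hh' reading (assumed format) to milliseconds."""
    minutes, seconds, hundredths = (int(part) for part in reading.split(":"))
    return (minutes * 60 + seconds) * 1000 + hundredths * 10


def glass_to_glass_latency_ms(actual, displayed):
    """Lag of the displayed stopwatch behind the real one, in milliseconds."""
    return stopwatch_to_ms(actual) - stopwatch_to_ms(displayed)


if __name__ == "__main__":
    latency_ms = glass_to_glass_latency_ms("09:52:32", "09:52:26")
    print(f"measured glass-to-glass latency: {latency_ms} ms")
```

With a stopwatch that resolves hundredths of a second, each reading is quantized to 10 ms, so a difference of a few milliseconds between the measured and estimated latency is within the resolution of the method.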

#### 7.4 Future Improvement on Latency

The demonstration described in Section 7.3 can achieve 60 ms glass-to-glass latency (not including deep learning related processing). A few improvements can be made to further reduce the latency:

- Implementing the on-the-fly mode for VPAC, which can reduce the latency by either the camera data accumulation time or VISS processing time, whichever is shorter. For example, in the 2.1MP at 60 fps case, utilizing on-the-fly mode can reduce latency by the processing time of VISS which is 8 ms.
- Removing MSC. The LDC block of VPAC can also resize the image, though LDC is not as flexible as MSC and LDC does not do anti-aliasing filtering. For certain applications, it is possible to just use LDC to resize the image without using MSC. In that case, the latency can be reduced by the MSC processing time, which is 8 ms for 2.1MP at 60 fps.
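
The two savings above can be combined into a simple projection. The sketch below only performs the arithmetic described in the list, and the resulting figure is a derived estimate, not a measurement reported in this paper. The function names and default values are illustrative.

```python
FRAME_ACCUMULATION_MS = 1000.0 / 60   # time to receive one frame at 60 fps
VISS_PROCESSING_MS = 8.0              # per-block estimate for 2.1MP at 60 fps
MSC_PROCESSING_MS = 8.0               # per-block estimate for 2.1MP at 60 fps


def on_the_fly_saving_ms(frame_accumulation_ms, viss_processing_ms):
    """On-the-fly mode saves the shorter of the two overlapping times."""
    return min(frame_accumulation_ms, viss_processing_ms)


def projected_latency_ms(baseline_ms, use_on_the_fly=False, remove_msc=False):
    """Baseline latency minus the savings of the selected improvements."""
    latency_ms = baseline_ms
    if use_on_the_fly:
        latency_ms -= on_the_fly_saving_ms(FRAME_ACCUMULATION_MS, VISS_PROCESSING_MS)
    if remove_msc:
        latency_ms -= MSC_PROCESSING_MS
    return latency_ms


if __name__ == "__main__":
    measured_ms = 60.0   # glass-to-glass latency measured in Section 7.3
    print(projected_latency_ms(measured_ms, use_on_the_fly=True))
    print(projected_latency_ms(measured_ms, use_on_the_fly=True, remove_msc=True))
```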

## 8 Summary

This white paper presented a camera mirror system design built with the AM62A processor. The unique features of the AM62A, including the image signal processor and deep learning accelerator, provide what is needed to build a low cost, low power, and high performance solution for camera mirror systems. Future development of the demonstration presented in this paper will include deep-learning analytics and further latency improvement.

## 9 References

1. UN Regulation No. 151, Blind Spot Information System for the Detection of Bicycles
2. Texas Instruments, Model Analyzer (Edge AI) (login required)
3. Texas Instruments, edgeai-modelzoo
4. Texas Instruments, SK-AM62A-LP tool page

Mailing Address: Texas Instruments, Post Office Box 655303, Dallas, Texas 75265 Copyright © 2023, Texas Instruments Incorporated