# Application Note **Dual-TDA4x System Solution**

# **TEXAS INSTRUMENTS**

#### Neo Wang, Mahmut Ciftci, and Linjun Meng

#### ABSTRACT

The Jacinto TDA4x SoC family provides highly integrated, high-performance processors with a power-efficient architecture to enable optimized Advanced Driver Assistance Systems (ADAS) solutions. Depending on performance, safety, power, and application requirements, some scenarios might require a dual-TDA4x solution to meet system goals. This application report introduces the motivation and high-level system considerations for dual-SoC solutions based on the TDA4x System-on-Chip (SoC).

### **Table of Contents**

- 1 Introduction
- 2 Dual TDA4 System
  - 2.1 Dual TDA4x SoC System Diagram
  - 2.2 System Consideration and BOM Optimization
- 3 Camera Connection
  - 3.1 Duplicate Front Camera Input to Two TDA4x SoCs
  - 3.2 Connect Front Camera to Only One TDA4x
- 4 Boot Sequence Solution
  - 4.1 Boot Solution Based on Dual Flash
  - 4.2 Boot Solution Based on Single Flash
- 5 Multi-SoC Demo Based on PCIe
- 6 References

# List of Figures

- Figure 2-1. Typical Dual TDA4 Cascading System Diagram
- Figure 3-1. Typical Camera Duplicate Solution
- Figure 3-2. Typical Camera Series Solution
- Figure 4-1. Boot Flow With Second Flash
- Figure 4-2. Boot Flow With First Flash Only

# List of Tables

- Table 1-1. TDA4VM Typical Configuration

#### Trademarks

All trademarks are the property of their respective owners.

# 1 Introduction

The Jacinto TDA4x processor family is a scalable platform with software-compatible products. Based on system requirements, the TDA4x SoC family offers multiple products with different performance, power, and feature sets that customers can choose from. Currently, the TDA4VM is in production, and details can be found on the TDA4VM product page [1]. Table 1-1 shows a typical IP configuration of the TDA4VM. For more information, see the *DRA829/TDA4VM Technical Reference Manual* [2].

Table 1-1. TDA4VM Typical Configuration

| Category              | IP       | TDA4VM Feature              |
|-----------------------|----------|-----------------------------|
| Processor/Accelerator | Arm      | 2x A72 (~25K DMIPS)         |
| Processor/Accelerator | Arm-R5F  | 3x dual R5F (3x ~12K DMIPS) |
| Processor/Accelerator | DSP      | 1x C7x + 2x C66x            |
| Processor/Accelerator | GPU      | GE8430 (100 GFLOPS)         |
| Processor/Accelerator | MMA      | 1x MMAv1, 8 TOPS            |
| Interface             | DDR      | 1x 32b @ 4266 MHz           |
| Interface             | Capture  | 2x CSI-Rx (4L)              |
| Interface             | PCIe     | PCIe Gen3: 4x 2DL           |
| Interface             | Ethernet | 8p switch + 1x RGMII (MCU)  |

A dual-TDA4x cascading solution may be required in the following scenarios:

- Performance Requirement
  - Performance requirements for general processing, deep learning, and customer applications running on the MPU might be too high for a single SoC.
- Functional Safety
  - To meet functional safety requirements, a second TDA4x SoC might be required as a redundant backup for when the main TDA4x device behaves abnormally.
- Power and Thermal Consideration
  - A dual-SoC solution distributes the system load across SoCs, which enables better power and thermal management.
- IO Interfaces
  - In highly integrated system solutions, a single SoC might not provide enough peripheral interfaces, including camera, PCIe, Ethernet, CAN, and LIN interfaces. A dual-SoC solution might be required to meet system interface requirements.

All Jacinto TDA4VM SoCs are configured with Ethernet and PCIe interfaces. This means that all TDA4x family processors can be interconnected via Ethernet and PCIe to meet dual-SoC system requirements. The following sections discuss system considerations for such a solution.

# 2 Dual TDA4 System

#### 2.1 Dual TDA4x SoC System Diagram

An example of a highly integrated ADAS system based on dual TDA4VM SoCs is shown in Figure 2-1. This ADAS system integrates a front camera, four surround view cameras, and four side view cameras on a single PCB and supports the following main features:

- Cameras are connected via the MIPI CSI2-RX interface.
  - Each CSI2-RX port supports 4 lanes at data rates up to 2.5 Gbps per lane, totaling up to 10 Gbps per CSI2-RX port.
- Each TDA4x SoC needs a dedicated power (PMIC) solution; however, the secondary TDA4x PMIC can be connected to the primary TDA4x to implement a wake-up function.
- Front camera systems typically require high-resolution sensors to achieve complex functions such as object detection, parking assist, and so on. In such scenarios, multiple deep learning models need to work together. The original camera data can be duplicated to the two TDA4x SoCs separately through de-serializers such as FPD-Link.
- The four side view cameras and four surround view cameras can be split across the two TDA4x SoCs to balance functionality, processing load, power, and thermal distribution.



Figure 2-1. Typical Dual TDA4 Cascading System Diagram

- The two TDA4x SoCs can be connected via various high-speed interfaces, including PCIe and Ethernet. The PCIe, Ethernet, and SPI interconnections between the two TDA4x SoCs do not require PHY chips; the pins can be connected directly through the PCB. For example, the Ethernet switch (CPSW9G) interconnection between two TDA4x SoCs, or the CPSW9G and CPSW2G interconnection within the same chip, uses a MAC-to-MAC connection without an Ethernet PHY.
  - The PCIe controller supports Gen3 at 8 Gbps per lane and up to 4 lanes per PCIe controller, providing a total throughput of 32 Gbps.
  - The 8-port Ethernet switch supports:
    - 2.5Gb HSGMII and 1Gb SGMII/RGMII on all ports
    - 5Gb/10Gb XFI/USXGMII on two ports
- External DDR memory and flash memories such as eMMC and OSPI/QSPI are required for each TDA4x to achieve the best performance. However, in some scenarios, further optimizations are possible to balance cost and system design.
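The link rates quoted above can be turned into a quick budget check. The following Python sketch is illustrative only: the lane counts and per-lane rates come from the text, while the camera resolution, frame rate, and pixel depth are hypothetical values chosen for the example. It uses raw line rates and ignores protocol and encoding overhead, so real usable bandwidth is lower.

```python
# Link-budget sketch. Link figures are taken from the text above; the camera
# parameters are hypothetical and are NOT values from this application note.

CSI2_LANES = 4
CSI2_GBPS_PER_LANE = 2.5
PCIE_LANES = 4
PCIE_GBPS_PER_LANE = 8.0  # Gen3 raw rate per lane, as quoted in the text


def link_capacity_gbps(lanes: int, gbps_per_lane: float) -> float:
    """Raw aggregate rate of a multi-lane link (overhead ignored)."""
    return lanes * gbps_per_lane


def camera_rate_gbps(width: int, height: int, fps: int, bits_per_pixel: int) -> float:
    """Raw pixel payload rate of one camera stream, excluding blanking."""
    return width * height * fps * bits_per_pixel / 1e9


csi2_port = link_capacity_gbps(CSI2_LANES, CSI2_GBPS_PER_LANE)
pcie_link = link_capacity_gbps(PCIE_LANES, PCIE_GBPS_PER_LANE)

# Hypothetical load: four 1920x1080 cameras at 30 fps, 16 bits per pixel,
# aggregated onto a single CSI2-RX port.
surround = 4 * camera_rate_gbps(1920, 1080, 30, 16)

print(f"CSI2-RX port capacity: {csi2_port} Gbps")
print(f"PCIe link capacity:    {pcie_link} Gbps")
print(f"Four-camera load:      {surround:.2f} Gbps (fits: {surround < csi2_port})")
```

A check like this is only a first-pass filter; sensor blanking, CSI-2 packet overhead, and PCIe encoding all reduce the usable margin.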

#### 2.2 System Consideration and BOM Optimization

A dual-TDA4x based system can enable system cost savings by taking advantage of TDA4x SoC features. The TDA4x SoC integrates many critical processors and IPs required for an ADAS system, including image signal processing (ISP), an MCU/safety island for ASIL-D safety support, an Ethernet switch, and so on. Such integration minimizes the required external components. For the TDA4x, the PMIC, DDR, and flash memory for boot and storage are the main required external devices. In addition, a dual-SoC solution allows further cost savings in the external memory and storage, including DDR, eMMC, and boot flash:

- External DDR memory is needed for each TDA4x.
- eMMC is usually used to store the high-level operating system (HLOS) image and filesystem. If there is no strict restriction on the HLOS boot time on the secondary TDA4x, these boot images can be transferred via PCIe or Ethernet from the primary TDA4x SoC. In this case, the secondary eMMC can be optimized out.
- To achieve a faster startup time, the boot image is usually saved in NOR flash (OSPI/QSPI). OSPI is faster but more expensive than QSPI.
  - If the secondary TDA4x SoC is required to start at the same time as the primary TDA4x, both boot flashes are needed. In that case, using OSPI for the primary TDA4x SoC and QSPI for the secondary TDA4x SoC provides a cost-effective solution.
  - If the secondary TDA4x SoC can boot after the primary TDA4x SoC starts up, the secondary boot flash can be optimized out, because the secondary TDA4x boot image can be transferred via PCIe or Ethernet from the primary TDA4x SoC.



However, the final decision on the system BOM depends on the system requirements and must balance performance against system cost.
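The storage trade-offs in this section can be summarized as a small decision function. The following Python sketch is illustrative only: the `SecondaryBom` type, the `choose_secondary_bom` function, and both input flags are hypothetical names introduced here, and a real design would weigh more factors than these two.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SecondaryBom:
    """External memory devices fitted next to the secondary TDA4x (hypothetical model)."""
    ddr: bool
    boot_flash: Optional[str]  # "QSPI" when fitted, None when omitted
    emmc: bool


def choose_secondary_bom(simultaneous_boot: bool, strict_hlos_boot_time: bool) -> SecondaryBom:
    """Map the two requirements discussed in the text to a secondary-SoC BOM.

    simultaneous_boot: the secondary SoC must start together with the primary.
    strict_hlos_boot_time: the secondary HLOS boot time is tightly constrained.
    """
    return SecondaryBom(
        ddr=True,  # DDR is needed for each TDA4x
        # Starting together needs a local boot flash; otherwise the boot image
        # can arrive from the primary SoC over PCIe or Ethernet.
        boot_flash="QSPI" if simultaneous_boot else None,
        # Without a strict HLOS boot-time limit, the HLOS image and filesystem
        # can also be transferred from the primary SoC.
        emmc=strict_hlos_boot_time,
    )


for sim in (True, False):
    for strict in (True, False):
        print(sim, strict, choose_secondary_bom(sim, strict))
```

The primary TDA4x is assumed to keep its own OSPI boot flash and eMMC in all four combinations.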

# 3 Camera Connection

Generally, the side view cameras and surround view cameras are connected to and processed on the two TDA4x SoCs. However, the front camera typically needs to implement more L2 and L2+ functions, such as Object Detection, Semantic Segmentation, Lane Keeping/Changing Assist, and so on. When the computing power of one TDA4x SoC cannot support such a multi-function implementation, the front camera input needs to be routed to the second TDA4x SoC for additional processing. This section introduces two methods to connect the front camera to the TDA4x SoCs.

#### 3.1 Duplicate Front Camera Input to Two TDA4x SoCs

This solution duplicates the camera source data to two TDA4x SoCs (TDA4-A and TDA4-B), and each TDA4x processor implements different ADAS functions. A reference implementation that distributes the front camera functionality is shown in Figure 3-1.



Figure 3-1. Typical Camera Duplicate Solution



#### Camera in Primary TDA4-A:

- Front camera is used to monitor objects in front of the car:
  - The camera data is input through the CSI2-1 port
  - Preprocessing by C66x DSP (such as format/grayscale conversion and ROI setting)
  - Run inference on deep learning models in the C7x+MMA deep learning accelerator to achieve the Lane Departure Warning (LDW) and Lane Keeping Assist (LKA) functions
  - Postprocessing by C66x DSP (such as line drawing and interpolation), then output the result to the fusion node
- Side view cameras are used to monitor cars and objects coming from the side:
  - The camera data is input through the CSI2-0 port
  - Preprocessing/postprocessing by C66x DSP
  - Run inference on deep learning models in C7x+MMA for Lane Changing Assist (LCA) and Adaptive Cruise Control (ACC)
  - Output the result to the fusion node

#### Camera in Secondary TDA4-B:

- Front camera is used to monitor objects in front of the car:
  - The camera data is input through the CSI2-0 port
  - Preprocessing/postprocessing by C66x DSP
  - Run inference on deep learning models in C7x+MMA to achieve Object Detection (OD) functions, such as Traffic Light Recognition (TLR) and Traffic Sign Recognition (TSR)
  - Output the result to the PCIe TX node
- Surround view cameras are used to detect the environment around the car at low speed:
  - The camera data is input through the CSI2-1 port
  - The codec node is optional for recording (DVR)
  - GPU node for 3D rendering
  - Preprocessing/postprocessing by C66x DSP
  - Run inference on deep learning models in C7x+MMA to achieve the Automated Valet Parking (AVP) and Automated Parking Assist (APA) functions
  - Output the result to the PCIe TX node

#### Fusion:

- All of the results from TDA4-B are transferred to TDA4-A via the PCIe TX/RX nodes, then input to the fusion node.
- All of the results, whether from TDA4-A or TDA4-B, are assembled in the fusion node, which then outputs the final result.
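The data flow described in this section can be written down as a small directed graph and checked for completeness. The following Python sketch is illustrative only: the node names are hypothetical labels for the stages listed above, not SDK identifiers, and the graph models only the result path, not the camera capture or pre/postprocessing stages.

```python
# Result path of the camera-duplicate solution as an adjacency list.
# Keys are "<SoC>.<stage>"; all names are illustrative labels.
PIPELINE = {
    "A.front_ldw_lka": ["A.fusion"],
    "A.side_lca_acc": ["A.fusion"],
    "B.front_od": ["B.pcie_tx"],
    "B.surround_avp_apa": ["B.pcie_tx"],
    "B.pcie_tx": ["A.pcie_rx"],
    "A.pcie_rx": ["A.fusion"],
    "A.fusion": [],
}

INFERENCE_NODES = [
    "A.front_ldw_lka",
    "A.side_lca_acc",
    "B.front_od",
    "B.surround_avp_apa",
]


def reaches(graph, start, target):
    """Depth-first search: return True if target is reachable from start."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph[node])
    return False


# Every inference result should end up in the fusion node on TDA4-A.
all_fused = all(reaches(PIPELINE, n, "A.fusion") for n in INFERENCE_NODES)

# Results produced on TDA4-B must cross the PCIe link first.
via_pcie = sorted(n for n in INFERENCE_NODES if reaches(PIPELINE, n, "B.pcie_tx"))

print("all results fused:", all_fused)
print("results crossing PCIe:", via_pcie)
```

The same structure could describe a different topology by moving the fusion node and reversing the direction of the PCIe edge.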




#### 3.2 Connect Front Camera to Only One TDA4x

This solution connects the front camera and side view cameras to TDA4-A, and transfers intermediate results to TDA4-B for further processing. The fusion of the multiple results is then implemented in TDA4-B, which outputs the final result, as shown in Figure 3-2.



Figure 3-2. Typical Camera Series Solution

The difference between this scenario and the previous one is that the results of TDA4-A are transferred through the PCIe TX node. The intermediate deep learning inference results are received by the PCIe RX node in TDA4-B, and those results are used as the source data for further processing in TDA4-B. The surround view camera results are input to the fusion node along with the deep learning results. The final results are output from TDA4-B.

# 4 Boot Sequence Solution

As Section 2.2 shows, depending on the system BOM, different peripheral devices and boot modes may be required, which leads to different boot flows and boot times. Software needs to handle the boot flow according to the specific hardware design. This section introduces two typical boot sequence solutions for reference.



#### 4.1 Boot Solution Based on Dual Flash

In this case, both TDA4x SoCs have their own flash for system boot. Figure 4-1 shows the boot sequence of the dual TDA4x system. The advantage of this boot sequence solution is that the two TDA4x SoCs boot in parallel, which can shorten the start-up time of the whole dual TDA4x system.



Figure 4-1. Boot Flow With Second Flash

The key features and process are as follows:

- Both the primary and secondary TDA4x use the OSPI/QSPI boot mode.
- The boot images are stored in OSPI (for the primary TDA4x) or QSPI (for the secondary TDA4x) to achieve faster boot times. In addition, system images of the MCU2\_x/MCU3\_x cores can also be stored in flash, which can further shorten the start-up time.
- The primary TDA4x SoC needs to initialize and configure some hardware interfaces, such as Ethernet and PCIe. These interfaces are needed for the subsequent transmission of images to the secondary TDA4x SoC.
- The primary TDA4x SoC continues its boot flow to wake up its other cores after transferring the secondary TDA4x's images. The secondary TDA4x SoC boots from its QSPI first, then boots its other cores after receiving the images from the primary TDA4x SoC.
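The dual-flash flow can be read as two timelines that meet at the image transfer. The following Python sketch estimates the system-ready time under a simplified model. The function name, parameter names, and all durations are hypothetical, and the model assumes that the transfer can start only once the primary interface is configured and the secondary SoC has booted from its QSPI. Measured stage times from real hardware would replace the example numbers.

```python
def dual_flash_boot_ms(p_flash, p_if_init, image_xfer, p_cores, s_flash, s_cores):
    """Estimate when both SoCs are fully booted, in milliseconds.

    p_flash:    primary boots from OSPI
    p_if_init:  primary configures Ethernet/PCIe
    image_xfer: primary transfers the remaining images to the secondary
    p_cores:    primary wakes up its other cores
    s_flash:    secondary boots from QSPI (runs in parallel with the primary)
    s_cores:    secondary boots its other cores from the received images
    """
    # Assumption: the transfer needs both sides ready.
    xfer_start = max(p_flash + p_if_init, s_flash)
    images_received = xfer_start + image_xfer
    primary_done = images_received + p_cores
    secondary_done = images_received + s_cores
    return max(primary_done, secondary_done)


# Hypothetical stage durations, for illustration only.
ready = dual_flash_boot_ms(
    p_flash=200, p_if_init=50, image_xfer=300,
    p_cores=400, s_flash=350, s_cores=450,
)
print(f"system ready after about {ready} ms")
```

In this model the two flash boots overlap, so only the slower of the two paths up to the transfer contributes to the total.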




#### 4.2 Boot Solution Based on Single Flash

In this case, only the primary TDA4x has a boot flash, as shown in Figure 4-2. The advantage of this boot sequence is lower system cost; the disadvantage is a longer system boot time, because the secondary TDA4x SoC depends on the primary TDA4x SoC to start booting.



Figure 4-2. Boot Flow With First Flash Only

The key features and process are as follows:

- The primary TDA4x SoC uses the OSPI/QSPI boot mode, while the secondary TDA4x SoC boots using the SPI, USB, Ethernet, or PCIe boot mode, depending on hardware support.
- The primary TDA4x SoC boots from its own OSPI, then wakes up the secondary TDA4x PMIC via the I2C connection, then initializes and configures the hardware interface that is used to boot the secondary TDA4x SoC. In parallel, it loads the boot images for the other cores (except A72) from eMMC to DDR.
- After its PMIC is enabled, the secondary TDA4x SoC first receives the boot image from the primary TDA4x SoC and starts to boot. After that, it loads all of the core images from eMMC to DDR to boot the whole system.
- After both TDA4x SoCs have completed booting, subsequent data transfer can happen via Ethernet/PCIe.
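The ordering constraints in the single-flash flow can be captured as a dependency graph, from which a valid step order follows mechanically. The following Python sketch uses the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later). The step names are hypothetical labels for the stages described above, and the dependencies are one reading of the text rather than a definitive boot specification.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must finish before it can start.
# Step names are illustrative labels, not SDK or bootloader identifiers.
BOOT_DEPS = {
    "primary_boot_ospi": set(),
    "wake_secondary_pmic_i2c": {"primary_boot_ospi"},
    "init_boot_interface": {"wake_secondary_pmic_i2c"},
    # Runs in parallel with the PMIC wake-up and interface setup.
    "primary_load_core_images": {"primary_boot_ospi"},
    "secondary_receive_boot_image": {"wake_secondary_pmic_i2c", "init_boot_interface"},
    "secondary_load_core_images": {"secondary_receive_boot_image"},
    "runtime_data_transfer": {"primary_load_core_images", "secondary_load_core_images"},
}

# static_order() yields one valid sequential order and raises CycleError
# if the dependencies are contradictory.
order = list(TopologicalSorter(BOOT_DEPS).static_order())

for step_number, step in enumerate(order, start=1):
    print(step_number, step)
```

Steps with no dependency between them, such as loading the primary core images and waking the secondary PMIC, may appear in either order and can run in parallel on real hardware.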

# 5 Multi-SoC Demo Based on PCIe

The Jacinto 7 SDK [3] provides a reference application to showcase the multi-SoC concept. In this example, three TDA4x EVMs [4] are connected using the PCIe interface. The first EVM captures camera frames, and those frames are transferred to the third EVM via the second EVM over PCIe. The third EVM drives the display to show the camera frames on a screen.

For more details, see https://software-dl.ti.com/jacinto7/esd/processor-sdk-rtos-jacinto7/06_02_00_21/exports/docs/vision_apps/docs/user_guide/APP_PCIE_VIDEO.html

For additional details on PCIe and Ethernet connections, see [5] and [6].



# 6 References

1. Texas Instruments: TDA4VM Product Page
2. Texas Instruments: *DRA829/TDA4VM Technical Reference Manual*
3. Texas Instruments: Software Development Kit for TDA4VM/DRA829 Jacinto Processors
4. Texas Instruments: TDA4VM/DRA829 Evaluation Module
5. Texas Instruments: PCIe Interconnect Solution
6. Texas Instruments: MAC2MAC Solution Based on DRA829 EVM


Mailing Address: Texas Instruments, Post Office Box 655303, Dallas, Texas 75265 Copyright © 2022, Texas Instruments Incorporated