SPRADP7A February   2025  – March 2025 AM62A3 , AM62A3-Q1 , AM62A7 , AM62A7-Q1 , AM67A , TDA4AEN-Q1

 

  Abstract
  Trademarks
  1 Introduction
  2 Building Blocks of an RGB-IR Vision Pipeline
    2.1 CSI Receiver
    2.2 Image Signal Processor
    2.3 Video Processing Unit
    2.4 TI Deep Learning Acceleration
    2.5 GStreamer and TIOVX Frameworks
  3 Performance Considerations and Benchmarking Tools
  4 Reference Design
    4.1 Camera Module
    4.2 Sensor Driver
    4.3 CSI-2 Rx Driver
    4.4 Image Processing
    4.5 Deep Learning for Driver and Occupancy Monitoring
    4.6 Reference Code and Applications
  5 Application Examples and Benchmarking
    5.1 Application 1: Single-stream Capture and Visualization with GST
    5.2 Application 2: Dual-stream Capture and Visualization with GST and TIOVX Frameworks
    5.3 Application 3: Representative OMS-DMS + Video Telephony Pipeline in GStreamer
  6 Summary
  7 References
  8 Revision History

Application 2: Dual-stream Capture and Visualization with GST and TIOVX Frameworks

This section analyzes an application that visualizes both RGB and Infrared streams simultaneously.

The application is implemented in both GStreamer and TIOVX to allow a comparison between the two frameworks. Figure 5-5 shows the GStreamer version and Figure 5-6 shows the equivalent application constructed with TIOVX. The GST commands and TIOVX code are available in the GitHub repository [6].

Figure 5-5 Dual Stream Capture and Display with GStreamer
Figure 5-6 Dual Stream Capture and Display with TIOVX

Frames arrive through V4L2 from the corresponding /dev/videoX entries for the RGB and IR streams and are processed by the on-chip ISP. Both streams are downscaled to a resolution that fits the monitor, and the IR stream (grayscale) is converted to the same color format as the RGB stream. The streams are then combined into a single frame by a mosaic function and displayed on a monitor through the Linux KMS or DRM interface.

The TIOVX and GStreamer applications are equivalent in terms of the processing functions involved, but there are a few key differences. The TIOVX application builds a TIOVX graph to handle the inner body of the application, which in this case comprises the ISP, downscaling, color-conversion, and image-merging (mosaic) functions. Input from V4L2 and output through KMS or DRM are handled using Linux-level APIs outside the TIOVX graph. GStreamer, in contrast, provides plugins that wrap these API calls. The TIOVX application is compiled into a binary and then run, whereas the GStreamer pipeline can be expressed as a single string and run from the command line.
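To illustrate what "a single string" means here, the following Python sketch assembles a structural skeleton of the dual-stream pipeline in gst-launch-1.0 syntax. This is not the command from the repository [6]. The element names (tiovxisp, tiovxmultiscaler, tiovxdlcolorconvert, tiovxmosaic) are assumed from TI's edgeai-gst-plugins, the device paths and output size are hypothetical, and the sensor-specific ISP properties that a working pipeline needs are deliberately omitted.

```python
def build_pipeline(rgb_dev, ir_dev, width, height):
    """Return a gst-launch-1.0 style skeleton of the dual-stream pipeline.

    Assumptions: element names come from TI's edgeai-gst-plugins; the
    sensor-specific ISP properties are omitted, so this string shows the
    structure only and is not runnable as-is on a target.
    """
    scaled = f"video/x-raw,width={width},height={height}"
    # Mosaic element merges both branches and feeds the KMS/DRM display sink.
    sink = "tiovxmosaic name=mosaic ! kmssink"
    # RGB branch: capture -> ISP -> downscale -> mosaic.
    rgb = (f"v4l2src device={rgb_dev} ! tiovxisp ! tiovxmultiscaler ! "
           f"{scaled} ! mosaic.")
    # IR branch: same stages plus grayscale-to-NV12 color conversion.
    ir = (f"v4l2src device={ir_dev} ! tiovxisp ! tiovxmultiscaler ! "
          f"{scaled} ! tiovxdlcolorconvert ! video/x-raw,format=NV12 ! mosaic.")
    return " ".join([sink, rgb, ir])


if __name__ == "__main__":
    # Hypothetical device nodes and display tile size.
    print(build_pipeline("/dev/video3", "/dev/video4", 960, 540))
```

The IR branch differs from the RGB branch only by the color-conversion stage, which mirrors the processing description above.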

Performance Statistics and SoC Resource Utilization

This section analyzes resource utilization on AM62A74 while running these applications. The measurement application receives remote core utilization through TIOVX and reads other information, such as DDR utilization and temperature, through memory-mapped registers. This perf_stats application is part of the SDK under the /opt/edgeai-gst-apps/scripts/perf_stats directory. SoC utilization is sampled every 500 ms; the samples are collected across a 20-second window (600 frames for each of the RGB and IR streams) and averaged into the double bar chart shown in Figure 5-7.
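The sampling arithmetic can be sketched in a few lines of Python. This is not the perf_stats implementation; it only reproduces the numbers implied by the text (a 500 ms interval over 20 seconds, and 600 frames per stream, which corresponds to 30 frames per second) and shows one way to reduce a list of utilization samples to a mean with 25th and 75th percentile bounds. The function names are hypothetical.

```python
import statistics

SAMPLE_INTERVAL_S = 0.5   # utilization sampling interval from the text
WINDOW_S = 20             # measurement window from the text
FPS = 30                  # implied by 600 frames per stream in 20 s


def expected_counts():
    """Return (utilization samples, frames per stream) for the window."""
    samples = int(WINDOW_S / SAMPLE_INTERVAL_S)
    frames = FPS * WINDOW_S
    return samples, frames


def summarize(samples):
    """Return (mean, 25th percentile, 75th percentile) of the samples."""
    q1, _median, q3 = statistics.quantiles(samples, n=4)
    return statistics.mean(samples), q1, q3


if __name__ == "__main__":
    print(expected_counts())
    # Synthetic utilization percentages, for illustration only.
    print(summarize([10, 20, 30, 40]))
```

Each bar in the chart therefore averages 40 utilization samples per core or accelerator.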

Figure 5-7 Utilization Comparison of GStreamer and TIOVX For Dual-stream Visualization Pipeline

Error bars in this chart represent the 25th and 75th percentiles. Between the two frameworks, accelerator and DDR utilization are at parity for the application running at 30 FPS (per input stream) without frame drops. Notably, the MPU (quad Arm® Cortex®-A53 in SMP mode), denoted as mpu1_0 in Figure 5-7, has higher utilization for GStreamer than for TIOVX. This is due to increased signaling between the CPU complex and any remote core or accelerator. The C7xMMA is unused in this application. Otherwise, core and HWA utilization is very similar between the two frameworks, with TIOVX being slightly more efficient.

Table 5-1 shows the latency through individual components of the pipeline. Frame capture and display latency is not included. The table lists the latency of each processing task for GStreamer and for TIOVX, along with the total latency of each path. TIOVX is generally faster, especially for color conversion; the infrared path is most affected by the difference in application frameworks.

Table 5-1 GStreamer vs. TIOVX Latencies

Function                              GStreamer (ms)   TIOVX (ms)
VISS ISP (Infrared)                   18.5             13.9
VISS ISP (RGB)                        17.6             14.1
MSC Downscaling (Infrared)            14.3             13.7
MSC Downscaling (RGB)                 21.2             20.5
Color conversion (Infrared->NV12)     19.2             0.64
Mosaic combining images (RGB + IR)    5.5              4.7
Sum latencies (IR path)               57.5             32.9
Sum latencies (RGB path)              44.3             39.3
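The path totals in Table 5-1 can be reproduced from the per-stage values. The short Python sketch below assumes that the IR path consists of the ISP, downscaling, color conversion, and mosaic stages, and that the RGB path consists of the ISP, downscaling, and mosaic stages. That grouping is inferred from the table, and the dictionary keys are illustrative names.

```python
# Per-stage latency in ms from Table 5-1: (GStreamer, TIOVX).
LATENCY_MS = {
    "viss_ir": (18.5, 13.9),
    "viss_rgb": (17.6, 14.1),
    "msc_ir": (14.3, 13.7),
    "msc_rgb": (21.2, 20.5),
    "color_convert_ir": (19.2, 0.64),
    "mosaic": (5.5, 4.7),
}

# Assumed stage grouping per path (inferred from the table totals).
IR_PATH = ["viss_ir", "msc_ir", "color_convert_ir", "mosaic"]
RGB_PATH = ["viss_rgb", "msc_rgb", "mosaic"]

GSTREAMER, TIOVX = 0, 1


def path_total(stages, framework):
    """Sum the stage latencies of one path, rounded to 0.1 ms."""
    return round(sum(LATENCY_MS[s][framework] for s in stages), 1)


if __name__ == "__main__":
    for name, stages in (("IR", IR_PATH), ("RGB", RGB_PATH)):
        print(name, path_total(stages, GSTREAMER), path_total(stages, TIOVX))
```

Almost all of the gap between the two IR path totals comes from the color-conversion row (19.2 ms versus 0.64 ms).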

Each GStreamer plugin internally wraps a TIOVX node, so a TIOVX node always measures faster than the equivalent GStreamer plugin. Measurements in GStreamer are captured from Linux before and after the plugin runs, whereas measurements in TIOVX can be captured and reported by the remote core running the operation. TIOVX shows a noticeable improvement over GStreamer, especially for the color conversion from grayscale to NV12 format(1).

Another way to compare these applications is in terms of interrupts, that is, how frequently inter-processor communication occurs (for example, A53 messaging C7x or R5F messaging A53). Fewer interrupts are better, because the processor can then address pending signals from different peripherals and accelerators more quickly. The interrupt count is measured from Linux by reading the number of interrupts that the A53 cores (running Linux) received before and after each application runs for 600 frames.
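A before-and-after measurement of this kind can be sketched in Python by taking two snapshots of /proc/interrupts and subtracting them. This is an illustrative sketch, not the script used for the published numbers. The keyword used to select mailbox lines ("mailbox") is an assumption, and the actual interrupt labels depend on the kernel and device tree.

```python
from pathlib import Path


def read_snapshot(path="/proc/interrupts"):
    """Read the current interrupt table as text."""
    return Path(path).read_text()


def mailbox_interrupts(text, keyword="mailbox"):
    """Sum per-CPU counts for every interrupt line containing keyword.

    The keyword is an assumption; check the labels on the target system.
    Returns a dict mapping the interrupt label (last column) to a total.
    """
    lines = text.strip().splitlines()
    n_cpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        fields = line.split()
        if keyword not in line or len(fields) < 1 + n_cpus:
            continue
        counts = fields[1:1 + n_cpus]       # skip the "IRQ:" column
        if not all(c.isdigit() for c in counts):
            continue
        label = fields[-1]
        totals[label] = totals.get(label, 0) + sum(int(c) for c in counts)
    return totals


def delta(before, after):
    """Interrupts received between two snapshots, per label."""
    return {k: after[k] - before.get(k, 0) for k in after}
```

A typical use is to call mailbox_interrupts(read_snapshot()) once before starting the application and once after 600 frames, and then pass both results to delta().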

Table 5-2 lists the number of interrupts to Linux across a 20-second duration (600 frames per stream). GStreamer shows more interrupts than TIOVX because the Cortex-A53 cores running Linux must be notified between each plugin or pipeline element. Interrupt counts for each core's mailbox are captured from /proc/interrupts.

Table 5-2 GStreamer vs. TIOVX Interrupt Counts

Core                     GStreamer Application   TIOVX Application
DM R5F (manage VPAC)     13,469                  10,589
C7x                      0 (unused)              0 (unused)
MCU R5F                  0 (unused)              0 (unused)
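To put the DM R5F counts in Table 5-2 in proportion, the following Python sketch normalizes them by the 600 frame intervals of the measurement window and computes the ratio between the two frameworks. The function names are illustrative and the inputs are taken directly from the table.

```python
FRAMES = 600                 # frame intervals in the 20-second window
GSTREAMER_IRQS = 13_469      # DM R5F mailbox interrupts, GStreamer
TIOVX_IRQS = 10_589          # DM R5F mailbox interrupts, TIOVX


def per_frame(interrupts, frames=FRAMES):
    """Average number of interrupts per frame interval."""
    return interrupts / frames


def ratio(a, b):
    """How many times larger a is than b."""
    return a / b


if __name__ == "__main__":
    print(f"GStreamer: {per_frame(GSTREAMER_IRQS):.1f} interrupts per frame interval")
    print(f"TIOVX:     {per_frame(TIOVX_IRQS):.1f} interrupts per frame interval")
    print(f"Ratio:     {ratio(GSTREAMER_IRQS, TIOVX_IRQS):.2f}")
```

This works out to about 22.4 interrupts per frame interval for GStreamer and about 17.6 for TIOVX, so GStreamer generates roughly 27% more interrupts to Linux in this application.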

The data in Table 5-2 reflect the general trend that GStreamer is less efficient than TIOVX in terms of CPU interrupts. This is because TIOVX allows all cores to communicate directly, whereas GStreamer requires communication between cores to pass through the Linux host (A53). Adding AI processing with TIDL shows a similar pattern for C7x interrupts.

In some GStreamer pipelines, the latency of an individual plugin can be affected by adjacent plugins and by the presence of queues, since queues affect how GStreamer chooses to multithread plugins and portions of the pipeline. This also explains why the GStreamer application's latencies in Table 5-1 exceed those shown in Figure 5-4.

(1) Color conversion is not necessary in a production application; it is included here so that the RGB and IR streams can be visualized simultaneously by stitching frames, which requires both frames to be in the same format.