SPRADP7A February 2025 – March 2025 AM62A3, AM62A3-Q1, AM62A7, AM62A7-Q1, AM67A, TDA4AEN-Q1
This section analyzes an application that visualizes both RGB and Infrared streams simultaneously.
The application is built in both GStreamer and TIOVX to allow comparison between the two. Figure 5-5 shows the application in GStreamer, and Figure 5-6 shows the equivalent application constructed with TIOVX. The GStreamer commands and TIOVX code are available on GitHub [6].
Frames arrive through V4L2 from the corresponding /dev/videoX entries for the RGB and IR streams and are processed by the on-chip ISP. Both streams are downscaled to a resolution that fits the monitor, and the IR stream (in grayscale) is converted to the same color format as the RGB stream. The streams are then combined into a single frame with a mosaic feature before being displayed on a monitor through the Linux KMS or DRM interface.
The TIOVX and GStreamer applications are equivalent in terms of the processing functions involved, but there are a few key differences. The TIOVX application builds a TIOVX graph to handle the inner body of the application, which in this case is the ISP, downscaling, color-conversion, and image-merging (mosaic) features. Input from V4L2 and output through KMS or DRM are handled using Linux-level APIs outside the TIOVX graph. In GStreamer, by contrast, these API calls are handled by readily available plugins. The TIOVX application is compiled into a binary and run, whereas the GStreamer pipeline can be represented as a single string that can be run from the command line.
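To illustrate what "a single string" means here, the following minimal Python sketch assembles a gst-launch-style description of the dual-stream pipeline. The element names (`tiovxisp`, `tiovxmultiscaler`, `tiovxdlcolorconvert`, `tiovxmosaic`, `kmssink`) are assumptions based on TI's edgeai-gst-plugins and upstream GStreamer, and required properties such as sensor tuning files and mosaic window positions are deliberately omitted. The string is illustrative only and is not expected to run unmodified; the actual commands are in the GitHub repository [6]. The function name, the `/dev/videoY` placeholder, and the 960x540 output size are hypothetical.

```python
def build_dual_stream_pipeline(rgb_device, ir_device, width, height):
    """Return a gst-launch-1.0 style description (illustrative, properties omitted)."""
    # Caps filter that requests the downscaled resolution from the scaler.
    scaled = f"video/x-raw,width={width},height={height}"

    # Mosaic element is declared once by name; each branch links to it with "mosaic."
    sink = "tiovxmosaic name=mosaic ! kmssink"

    # RGB branch: capture -> ISP -> downscale -> mosaic input.
    rgb = (
        f"v4l2src device={rgb_device} ! tiovxisp ! tiovxmultiscaler ! "
        f"{scaled} ! queue ! mosaic."
    )

    # IR branch: same as RGB plus a grayscale-to-NV12 conversion before the mosaic.
    ir = (
        f"v4l2src device={ir_device} ! tiovxisp ! tiovxmultiscaler ! "
        f"{scaled} ! tiovxdlcolorconvert ! video/x-raw,format=NV12 ! queue ! mosaic."
    )
    return " ".join([sink, rgb, ir])


# Hypothetical placeholders: the real device nodes depend on the board setup.
description = build_dual_stream_pipeline("/dev/videoX", "/dev/videoY", 960, 540)
print(description)
```

A string of this shape would be passed to `gst-launch-1.0` on the command line, which is what makes GStreamer pipelines quick to iterate on compared with rebuilding a TIOVX binary.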
This section analyzes resource utilization on AM62A74 while running these applications. The measurement application receives remote core utilizations through TIOVX and reads other information, such as DDR utilization and temperature, through memory-mapped registers. This perf_stats application is part of the SDK under the /opt/edgeai-gst-apps/scripts/perf_stats directory. The sampling interval for SoC utilization is 500 ms; samples are collected across a 20-second window (600 frames each for the RGB and IR streams) and averaged into the double bar chart shown in Figure 5-7.
Figure 5-7 Utilization Comparison of GStreamer and TIOVX For Dual-stream Visualization Pipeline

Error bars in this chart represent the 25th and 75th percentiles. Between the two frameworks, accelerator and DDR utilization is at parity for the application running at 30 FPS (per input stream) without frame drops. Notably, the MPU (quad Arm® Cortex®-A53 in SMP mode), denoted as mpu1_0 in Figure 5-7, has higher utilization for GStreamer than for TIOVX. This is due to the additional signaling between the CPU complex and the remote cores and accelerators in the GStreamer application. C7xMMA is unused in this application. Otherwise, core and HWA utilizations are very similar between the two application frameworks, with TIOVX being slightly more efficient.
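The averaging and error bars described above can be sketched as follows. This is a minimal illustration, not the perf_stats implementation: the sample values are hypothetical, and the quartile method (Python's default "exclusive" method in `statistics.quantiles`) is an assumption, since the source does not state how the percentiles were computed.

```python
from statistics import mean, quantiles

SAMPLE_INTERVAL_S = 0.5   # 500 ms sampling interval (from the text)
WINDOW_S = 20             # 20-second measurement window (from the text)
FPS = 30                  # frames per second per input stream (from the text)


def summarize(samples):
    """Return the mean plus the 25th and 75th percentiles of utilization samples."""
    q1, _median, q3 = quantiles(samples, n=4)
    return {"mean": mean(samples), "p25": q1, "p75": q3}


num_samples = int(WINDOW_S / SAMPLE_INTERVAL_S)   # samples in one window
frames_per_stream = FPS * WINDOW_S                # frames seen by each stream

# Hypothetical utilization samples (percent) standing in for one core.
samples = [20 + (i % 5) for i in range(num_samples)]
summary = summarize(samples)
print(num_samples, frames_per_stream, summary)
```

With a 500 ms interval, the 20-second window yields 40 samples per core, which is what each bar and its error bars summarize.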
Table 5-1 shows latency through individual components of the pipeline. Frame capture and display latency are not included. The table lists the GStreamer and TIOVX latency for each processing task in the application, along with the total latency of each path. TIOVX is generally faster, especially for color conversion; the infrared path is most affected by the difference in application frameworks.
| Function | GStreamer (ms) | TIOVX (ms) |
|---|---|---|
| VISS ISP (Infrared) | 18.5 | 13.9 |
| VISS ISP (RGB) | 17.6 | 14.1 |
| MSC Downscaling (Infrared) | 14.3 | 13.7 |
| MSC Downscaling (RGB) | 21.2 | 20.5 |
| Color conversion (Infrared->NV12) | 19.2 | 0.64 |
| Mosaic combining images (RGB + IR) | 5.5 | 4.7 |
| Sum latencies (IR path) | 57.5 | 32.9 |
| Sum latencies (RGB path) | 44.3 | 39.3 |
The GStreamer plugins wrap TIOVX nodes internally, so a TIOVX node always measures faster than the equivalent GStreamer plugin. Measurements in GStreamer are captured from Linux before and after the plugin runs, whereas measurements in TIOVX can be captured and reported by the remote core running the operation. TIOVX shows a noticeable improvement over GStreamer, especially for the color conversion from grayscale to NV12 format.
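The path totals in Table 5-1 can be reproduced from the per-stage values. The short sketch below uses only the numbers from the table; the variable and function names are illustrative.

```python
# Per-stage latency in milliseconds: (GStreamer, TIOVX), copied from Table 5-1.
LATENCY_MS = {
    "VISS ISP (Infrared)": (18.5, 13.9),
    "VISS ISP (RGB)": (17.6, 14.1),
    "MSC Downscaling (Infrared)": (14.3, 13.7),
    "MSC Downscaling (RGB)": (21.2, 20.5),
    "Color conversion (Infrared->NV12)": (19.2, 0.64),
    "Mosaic combining images (RGB + IR)": (5.5, 4.7),
}

IR_PATH = [
    "VISS ISP (Infrared)",
    "MSC Downscaling (Infrared)",
    "Color conversion (Infrared->NV12)",
    "Mosaic combining images (RGB + IR)",
]
RGB_PATH = [
    "VISS ISP (RGB)",
    "MSC Downscaling (RGB)",
    "Mosaic combining images (RGB + IR)",
]

GSTREAMER, TIOVX = 0, 1


def path_total(stages, framework):
    """Sum the stage latencies of one path, rounded to 0.1 ms as in the table."""
    return round(sum(LATENCY_MS[stage][framework] for stage in stages), 1)


for name, path in (("IR", IR_PATH), ("RGB", RGB_PATH)):
    print(name, path_total(path, GSTREAMER), path_total(path, TIOVX))
```

The IR path includes the color-conversion stage and the RGB path does not, which is why the framework difference is much larger on the IR path: the conversion alone drops from 19.2 ms to 0.64 ms, a factor of about 30.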
Another way to compare these applications is by the number of interrupts, that is, how frequently inter-processor communication occurs (A53 messaging C7x, R5F messaging A53, and so forth). Fewer interrupts are better, as this allows the processor to address pending signals from different peripherals and accelerators more quickly. The interrupt count is measured from Linux, before and after each application runs for 600 frames, to determine the number of interrupts received by the A53 cores (running Linux).
Table 5-2 Number of interrupts to Linux across a 20-second duration (600 frames per stream). GStreamer shows more interrupts than TIOVX because the Cortex-A53 cores running Linux must be notified between each plugin or pipeline element. Interrupt counts for each core's mailbox are captured from /proc/interrupts.
| Remote core | GStreamer Application | TIOVX Application |
|---|---|---|
| DM R5F (manage VPAC) | 13,469 | 10,589 |
| C7x | 0 (unused) | 0 (unused) |
| MCU R5F | 0 (unused) | 0 (unused) |
The data in Table 5-2 reflects the general trend that GStreamer generates more CPU interrupts than TIOVX. This is because TIOVX allows all cores to communicate directly, whereas GStreamer routes communication between cores through the Linux host (A53). Adding AI processing with TIDL shows a similar pattern for C7x interrupts.
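The before-and-after measurement from /proc/interrupts can be sketched as below. This is a minimal illustration under stated assumptions: the parser expects the usual layout (a header row of CPU names followed by one row per interrupt with per-CPU counts), and the `mbox-r5-0` label and the `mbox` keyword in the sample text are hypothetical stand-ins for whatever name the mailbox interrupt has on the target.

```python
def parse_interrupt_totals(text):
    """Map (irq, description) to the interrupt count summed across all CPUs."""
    lines = [line for line in text.splitlines() if line.strip()]
    num_cpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        irq, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for field in fields[:num_cpus]:       # some rows have fewer count columns
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        totals[(irq.strip(), description)] = sum(counts)
    return totals


def interrupt_delta(before, after, keyword):
    """Interrupts received between two snapshots for rows matching keyword."""
    return sum(
        count - before.get(key, 0)
        for key, count in after.items()
        if keyword in key[1]
    )


# Hypothetical snapshots; on the target these would be two reads of /proc/interrupts.
BEFORE = """
           CPU0       CPU1       CPU2       CPU3
 11:       1000        900        800        700     GICv3  30 Level     arch_timer
240:        100          0          0          0     GICv3 108 Level     mbox-r5-0
"""
AFTER = """
           CPU0       CPU1       CPU2       CPU3
 11:       1500       1400       1300       1200     GICv3  30 Level     arch_timer
240:        350          0          0          0     GICv3 108 Level     mbox-r5-0
"""

delta = interrupt_delta(
    parse_interrupt_totals(BEFORE), parse_interrupt_totals(AFTER), "mbox"
)
print(delta)
```

Applied to the DM R5F row of Table 5-2, the same subtraction gives 13,469 interrupts for GStreamer against 10,589 for TIOVX, a difference of 2,880, or roughly 21% fewer interrupts with TIOVX.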