When building a vision application, the following performance metrics need to be considered and benchmarked:
- End-to-end latency: The latency between capturing an image and generating the analytics result must be as low as possible to allow for timely decision-making and responsive actions.
- Video throughput (frames per second): Images must be captured and processed at the desired frame rate without frame drops.
- CPU load: The load that the vision pipeline places on the general-purpose CPU cores (Arm Cortex-A53 on AM62A) must be minimal, since all image processing is done on hardware accelerators, leaving the CPU free for other application and system tasks.
- DDR utilization: The DDR read and write operations by the vision pipeline must leave enough bandwidth for other system tasks.
- Hardware accelerator (HWA) load (ISP, VPU, C7x/MMA): Each HWA is dedicated to a specific function and cannot be used for other purposes, so the vision pipeline can utilize the HWAs close to 100%, keeping only a small margin.
The EdgeAI SDK for AM62A provides several tools to benchmark these performance metrics:
- Perf_stats tool [5]: Measures the load on CPU cores and HWAs, as well as DDR utilization.
- GStreamer debug trace: Setting the environment variable GST_DEBUG_FILE redirects GStreamer debug messages to a file. An EdgeAI SDK script (/opt/edgeai-gst-apps/scripts/gst_tracers/parse_gst_tracers.py) processes these messages and estimates the processing time of each element in the GStreamer pipeline (see the example after this list).
- GStreamer plugin fpsdisplaysink: Displays the throughput of the pipeline in frames per second (fps), as shown in the example after this list.
- Custom GStreamer plugin tiperfoverlay: Overlays the CPU loads, DDR utilization, HWA loads, and fps on the display, or prints them to the terminal console (see the sketch after this list).
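As a minimal sketch of the debug-trace flow, the commands below enable the GStreamer latency tracer, redirect its output to a file, run a placeholder pipeline, and post-process the log with the SDK script. The trace file path and the test pipeline are illustrative, and passing the log file as a positional argument to parse_gst_tracers.py is an assumption; check the script's usage on the target.

```
# Enable the GStreamer latency tracer and redirect debug output to a file.
# GST_TRACERS, GST_DEBUG, and GST_DEBUG_FILE are standard GStreamer variables;
# /tmp/gst_trace.log is an arbitrary example path.
export GST_TRACERS="latency(flags=element)"
export GST_DEBUG="GST_TRACER:7"
export GST_DEBUG_FILE=/tmp/gst_trace.log

# Run the pipeline to be benchmarked (placeholder pipeline shown here).
gst-launch-1.0 videotestsrc num-buffers=600 ! videoconvert ! fakesink

# Estimate per-element processing time from the recorded trace.
# The positional log-file argument is an assumption; see the script's help.
python3 /opt/edgeai-gst-apps/scripts/gst_tracers/parse_gst_tracers.py /tmp/gst_trace.log
```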
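For throughput, a common recipe is to wrap the sink with fpsdisplaysink and disable its on-screen text overlay so that the fps measurements appear in gst-launch-1.0's verbose console output; videotestsrc and fakesink below are stand-ins for the actual capture source and display sink.

```
# Report pipeline throughput on the console: fpsdisplaysink wraps a sink
# (fakesink here) and, with -v, its fps measurements are printed as
# last-message property updates.
gst-launch-1.0 -v videotestsrc num-buffers=600 ! video/x-raw,width=1280,height=720 \
  ! fpsdisplaysink text-overlay=false video-sink=fakesink
```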
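For tiperfoverlay, the sketch below assumes the element is placed just before the display sink so the statistics are drawn on top of the video; run gst-inspect-1.0 on the target to see the plugin's actual properties (for example, whether console printing is selected through a property).

```
# List the custom plugin's pads and properties on the target.
gst-inspect-1.0 tiperfoverlay

# Illustrative pipeline: overlay CPU/DDR/HWA loads and fps on the displayed video.
# videotestsrc and kmssink are placeholders for the real capture and display elements.
gst-launch-1.0 videotestsrc ! video/x-raw,width=1280,height=720 ! tiperfoverlay ! kmssink
```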