SPRAD90 February 2023 AM62A3, AM62A3-Q1, AM62A7, AM62A7-Q1

 


STREAM

STREAM is a microbenchmark that measures data memory system performance without any data reuse. It is designed to miss in the caches and to exercise the data prefetcher and speculative accesses. It operates on double-precision (64-bit) floating-point data, but on most modern processors memory access, not arithmetic, is the bottleneck. It reports four individual scores: copy, scale (multiply by a constant), add (sum of two arrays), and triad (multiply-accumulate).

  • Copy: Measures the memory transfer rate without any arithmetic operation, a[i] = b[i]
  • Scale: Includes a simple arithmetic operation, a[i] = k × b[i]
  • Add: Includes three memory accesses in addition to the arithmetic operation, a[i] = b[i] + c[i]
  • Triad: Combines scale and add in one operation, a[i] = b[i] + k × c[i]
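
The four kernels listed above can be sketched in a few lines of Python. This is only an illustration of the loop bodies, not the benchmark source, and pure Python is far too slow to measure memory bandwidth. The function names, the ARRAYS_TOUCHED table, and the bytes_counted helper are hypothetical names introduced here. The helper assumes the usual STREAM-style accounting, in which every array read and every array written counts toward the total, with 8-byte elements.

```python
# Illustrative sketch of the four STREAM kernels; not the benchmark source.
ELEMENT_BYTES = 8  # double-precision (64-bit) elements


def copy(a, b):
    """a[i] = b[i]: one read and one write per element, no arithmetic."""
    for i in range(len(a)):
        a[i] = b[i]


def scale(a, b, k):
    """a[i] = k * b[i]: one read, one write, one multiply."""
    for i in range(len(a)):
        a[i] = k * b[i]


def add(a, b, c):
    """a[i] = b[i] + c[i]: two reads, one write, one add."""
    for i in range(len(a)):
        a[i] = b[i] + c[i]


def triad(a, b, c, k):
    """a[i] = b[i] + k * c[i]: two reads, one write, multiply-accumulate."""
    for i in range(len(a)):
        a[i] = b[i] + k * c[i]


# Arrays touched per element (reads plus the single write).
# Assumption: each access is counted once, as in STREAM-style accounting.
ARRAYS_TOUCHED = {"copy": 2, "scale": 2, "add": 3, "triad": 3}


def bytes_counted(kernel, n):
    """Bytes attributed to one pass of `kernel` over n elements."""
    return ARRAYS_TOUCHED[kernel] * ELEMENT_BYTES * n
```

For example, one triad pass over four elements touches three arrays, so bytes_counted("triad", 4) returns 96, while a copy pass over the same length is counted as 64 bytes.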

For bandwidth, each byte read counts as one and each byte written counts as one, so the resulting score is double the bandwidth reported by LMBench. Table 3-3 shows the measured bandwidth and the efficiency compared to the theoretical wire rate. The wire rate is the LPDDR4 transfer rate in MT/s multiplied by the bus width. To get the overall maximum achieved throughput, the command used is stream -M 16M -P 4 -N 10, which runs four parallel threads and 10 iterations. The Arm Cortex-A53 clock frequency is set to 1.4 GHz in this test.

Table 3-3 Stream Benchmarks

Benchmark   LPDDR4-3200MT/s-32-Bit Bandwidth   LPDDR4-3200MT/s-32-Bit Efficiency
copy        7,780 MB/s                         61%
scale       7,815 MB/s                         61%
add         6,868 MB/s                         54%
triad       6,871 MB/s                         54%
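
The efficiency column can be cross-checked from the wire-rate definition given earlier (transfer rate times bus width). The short sketch below does that arithmetic. The function names are hypothetical. The 12,800 MB/s wire rate is derived here and not quoted from the report, and the sketch assumes that 1 MB is 10^6 bytes and that percentages are rounded to the nearest integer.

```python
def wire_rate_mb_s(mega_transfers_per_s, bus_width_bits):
    """Theoretical wire rate in MB/s: transfers per second times bytes per transfer."""
    return mega_transfers_per_s * bus_width_bits // 8


def efficiency_percent(measured_mb_s, wire_mb_s):
    """Measured bandwidth as a percentage of the wire rate, rounded to an integer."""
    return round(100 * measured_mb_s / wire_mb_s)


# LPDDR4-3200 MT/s on a 32-bit bus: 3200 * 4 bytes = 12800 MB/s (derived, assumed units).
WIRE_MB_S = wire_rate_mb_s(3200, 32)

# Measured bandwidths from the table above, in MB/s.
MEASURED_MB_S = {"copy": 7780, "scale": 7815, "add": 6868, "triad": 6871}

for name, mb_s in MEASURED_MB_S.items():
    print(f"{name}: {efficiency_percent(mb_s, WIRE_MB_S)}%")
```

Under these assumptions the computed percentages are 61, 61, 54, and 54, which agrees with the efficiency column of the table.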