SPRADO9 March   2025 AM62L

 


CoreMark-Pro

CoreMark®-Pro tests the entire processor, adding comprehensive support for multi-core technology, a combination of integer and floating-point workloads, and data sets that exercise larger memory subsystems. The CoreMark-Pro workloads use all levels of cache, with a data memory footprint of up to 3MB. Many, but not all, of the tests also use POSIX threads (pthreads) to make use of multiple cores. The score scales with the number of cores, but the scaling is always less than linear (the dual-core score is less than 2x the single-core score).

CoreMark-Pro must not be confused with the smaller CoreMark, which, like Dhrystone, is a microbenchmark that fits in the L1 caches of a modern processor.

CoreMark-Pro is not included in the SDK and can be downloaded from the EEMBC CoreMark-Pro repository on GitHub. In this test, the code is cloned and built directly on the AM62Lx EVM. The following steps clone, build, and run CoreMark-Pro on the target:

  1. Clone the repository.
    root@am62lxx-evm:~# git clone https://github.com/eembc/coremark-pro.git
  2. Build CoreMark-Pro.
    root@am62lxx-evm:~# cd coremark-pro/
    root@am62lxx-evm:~/coremark-pro# make TARGET=linux64 build-all
  3. Run CoreMark-Pro. Use certify-all to run all nine CoreMark-Pro workloads, and XCMD to set the number of cores (here, -c2 for two cores).
    root@am62lxx-evm:~/coremark-pro# make TARGET=linux64 certify-all XCMD='-c2'
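
The three steps above can be collected into one small script. This is a minimal sketch, not part of the SDK or of CoreMark-Pro: the repository URL, TARGET value, and make goals are taken from the steps, while the step and coremark_pro helpers and the DRY_RUN and CORES variables are hypothetical names introduced only for this example. By default the script only prints the commands; setting DRY_RUN=0 executes them.

```shell
#!/bin/sh
# Sketch: print (or, with DRY_RUN=0, execute) the clone/build/run steps.
# DRY_RUN and CORES are hypothetical knobs introduced for this example.
REPO_URL=https://github.com/eembc/coremark-pro.git
TARGET=linux64
DRY_RUN=${DRY_RUN:-1}
CORES=${CORES:-2}

step() {
    # Echo the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

coremark_pro() {
    # $1 = number of cores passed to CoreMark-Pro through XCMD.
    step git clone "$REPO_URL"                          || return 1
    step cd coremark-pro                                || return 1
    step make TARGET="$TARGET" build-all                || return 1
    step make TARGET="$TARGET" certify-all XCMD="-c$1"  || return 1
}

coremark_pro "$CORES"
```

Running the script with no variables set lists the four commands for a dual-core run; running it with CORES=1 DRY_RUN=0 on the target would perform a single-core run.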

Benchmark output:

root@am62lxx-evm:~/coremark-pro# make TARGET=linux64 certify-all XCMD='-c2'
.
.
WORKLOAD RESULTS TABLE

                                                 MultiCore SingleCore           
Workload Name                                     (iter/s)   (iter/s)    Scaling
----------------------------------------------- ---------- ---------- ----------
cjpeg-rose7-preset                                   71.43      37.04       1.93
core                                                  0.52       0.27       1.93
linear_alg-mid-100x100-sp                            24.58      12.92       1.90
loops-all-mid-10k-sp                                  0.72       0.43       1.67
nnet_test                                             1.96       1.01       1.94
parser-125k                                           6.85       7.09       0.97
radix2-big-64k                                       32.47      22.28       1.46
sha-test                                            138.89      73.53       1.89
zip-test                                             33.90      20.00       1.69

MARK RESULTS TABLE

Mark Name                                        MultiCore SingleCore    Scaling
----------------------------------------------- ---------- ---------- ----------
CoreMark-PRO                                       1189.42     710.32       1.67
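
The Scaling column in the output above is the ratio of the MultiCore result to the SingleCore result. The following sketch recomputes that ratio from the printed rows and shows it next to the reported value. The scaling_table helper is a hypothetical name for this example, and the here-document holds rows copied from the output above; on the target, the same filter could read a saved log of the benchmark run instead.

```shell
#!/bin/sh
# Recompute Scaling = MultiCore / SingleCore for each result row.
# Input rows: name, MultiCore, SingleCore, reported Scaling.
# Header and separator lines are skipped because their fields are not numeric.
scaling_table() {
    awk 'NF == 4 && $2 ~ /^[0-9.]+$/ && $3 ~ /^[0-9.]+$/ && $3 > 0 {
        # Columns printed: name, recomputed scaling, reported scaling.
        printf "%-28s %6.2f %6.2f\n", $1, $2 / $3, $4
    }'
}

scaling_table <<'EOF'
Workload Name                                     (iter/s)   (iter/s)    Scaling
----------------------------------------------- ---------- ---------- ----------
cjpeg-rose7-preset                                   71.43      37.04       1.93
core                                                  0.52       0.27       1.93
linear_alg-mid-100x100-sp                            24.58      12.92       1.90
loops-all-mid-10k-sp                                  0.72       0.43       1.67
nnet_test                                             1.96       1.01       1.94
parser-125k                                           6.85       7.09       0.97
radix2-big-64k                                       32.47      22.28       1.46
sha-test                                            138.89      73.53       1.89
zip-test                                             33.90      20.00       1.69
CoreMark-PRO                                       1189.42     710.32       1.67
EOF
```

Because the printed iter/s values are themselves rounded, a recomputed ratio can differ from the reported one in the last digit. The parser-125k row is the only workload in this run where the dual-core result is lower than the single-core result.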

All official CoreMark-Pro rules have been satisfied; for example, the execution time of each workload is at least 1000 times the minimum timer resolution. Table 2-5 shows the CoreMark-Pro results for single and dual A53 cores at 1.25GHz.

Table 2-5 CoreMark-Pro Results

Arm Cortex-A53    At 1.25GHz [iter/s]    Parallel Scaling
--------------    -------------------    ----------------
Single core                       710                   1
Dual core                       1,189                1.67