Device Benchmarks

The benchmarks in this section illustrate device-level performance. See the core benchmarks for the performance of a single core.

TMS320C6678 FFT performance vs. number of cores used

The following graph shows the relative increase in performance as more cores of the C6678 multicore DSP are used for FFT processing. Note that as the number of cores increases to 8, system performance levels off at roughly 6x the single-core performance. At that point the DDR bandwidth limit has been reached, and processing in the DSP is bounded by the data throughput into and out of external memory. Other types of data processing may or may not hit this limit, depending on the specific algorithm's data and processing requirements.

FFT computation time in ms as a function of the number of cores used

FFT size    1x C66x    2x C66x    4x C66x    8x C66x
16k           0.473      0.261      0.159      0.131
32k           0.915      0.478      0.278      0.198
64k           1.857      0.922      0.508      0.315
128k          4.1        2.004      1.06       0.641
256k          8.795      4.323      2.228      1.186
512k         18.669      9.291      4.704      3.103
1024k        38.557     19.328      9.605      6.403
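
The scaling behavior described above can be checked directly from the table. The following is a minimal Python sketch, not part of the original benchmark code: it copies the timings from the table and computes the speedup relative to one core, plus the parallel efficiency (speedup divided by core count). The names CORES, TIMES_MS, speedups, and efficiency are illustrative assumptions.

```python
# FFT computation time in ms, copied from the table above.
# Each tuple holds the times for 1, 2, 4, and 8 C66x cores.
CORES = (1, 2, 4, 8)
TIMES_MS = {
    "16k": (0.473, 0.261, 0.159, 0.131),
    "32k": (0.915, 0.478, 0.278, 0.198),
    "64k": (1.857, 0.922, 0.508, 0.315),
    "128k": (4.1, 2.004, 1.06, 0.641),
    "256k": (8.795, 4.323, 2.228, 1.186),
    "512k": (18.669, 9.291, 4.704, 3.103),
    "1024k": (38.557, 19.328, 9.605, 6.403),
}


def speedups(times):
    """Return the speedup over the single-core time for each core count."""
    base = times[0]
    return tuple(base / t for t in times)


def efficiency(times, cores=CORES):
    """Return parallel efficiency (speedup / number of cores) per core count."""
    return tuple(s / n for s, n in zip(speedups(times), cores))


if __name__ == "__main__":
    # Print one row per FFT size: speedup at 2, 4, and 8 cores.
    for size, times in TIMES_MS.items():
        s = speedups(times)
        print(f"{size:>6}: " + "  ".join(f"{n}x -> {v:.2f}" for n, v in zip(CORES[1:], s[1:])))
```

With these figures, the 8-core speedup is about 3.6x for the 16k FFT and about 6.0x for the 512k and 1024k FFTs, which matches the roughly 6x plateau attributed to the DDR bandwidth limit.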


Multicore Fixed and Floating-Point Digital Signal Processor