The TMS320C64x DSPs (including the SM320C6414-EP, SM320C6415-EP, and SM320C6416-EP devices) are the highest-performance fixed-point DSP generation in the TMS320C6000 DSP platform. The SM320C64x (C64x) device is based on the second-generation high-performance, advanced VelociTI very-long-instruction-word (VLIW) architecture (VelociTI.2) developed by Texas Instruments (TI), making these DSPs an excellent choice for multichannel and multifunction applications. The C64x is a code-compatible member of the C6000 DSP platform.
With performance of up to 4000 million instructions per second (MIPS) at a clock rate of 500 MHz, the C64x devices offer cost-effective solutions to high-performance DSP programming challenges. The C64x DSPs possess the operational flexibility of high-speed controllers and the numerical capability of array processors. The C64x DSP core processor has 64 general-purpose registers of 32-bit word length and eight highly independent functional units 2 multipliers for a 32-bit result and 6 arithmetic logic units (ALUs) with VelociTI.2 extensions. The VelociTI.2 extensions in the eight functional units include new instructions to accelerate the performance in key applications and extend the parallelism of the VelociTI architecture. The C64x can produce four 32-bit multiply-accumulates (MACs) per cycle for a total of 2400
million MACs per second (MMACS), or eight 8-bit MACs per cycle for a total of 4800 MMACS. The C64x DSP also has application-specific hardware logic, on-chip memory, and additional on-chip peripherals similar to the other C6000 DSP platform devices.
The C6416 device has two high-performance embedded coprocessors [Viterbi decoder coprocessor (VCP) and turbo decoder coprocessor (TCP)] that significantly speed up channel-decoding operations on chip. The VCP operating at CPU clock divided-by-4 can decode over 500 7.95-Kbps adaptive multi-rate (AMR) (K = 9, R = 1/3) voice channels. The VCP supports constraint lengths K = 5, 6, 7, 8, and 9, rates R = 1/2, 1/3, and 1/4, and flexible polynomials, while generating hard decisions or soft decisions. The TCP
operating at CPU clock divided-by-2 can decode up to 36 384-Kbps or 6 2-Mbps turbo encoded channels (assuming iterations). The TCP implements the max*log-map algorithm and is designed to support all polynomials and rates required by Third-Generation Partnership Projects (3GPP and 3GPP2), with fully programmable frame length and turbo interleaver. Decoding parameters, such as the number of iterations and stopping criteria, are also programmable. Communications between the VCP/TCP and the CPU are
carried out through the EDMA controller.
The C64x uses a two-level cache-based architecture and has a powerful and diverse set of peripherals. The level 1 program (L1P) cache is a 128K-bit direct-mapped cache and the level 1 data (L1D) cache is a 128K-bit 2-way set-associative cache. The level 2 memory/cache (L2) consists of an 8M-bit memory space that is shared between program and data space. L2 memory can be configured as mapped memory or combinations of cache (up to 256K bytes) and mapped memory. The peripheral set includes 3 multichannel buffered serial ports (McBSPs), an 8-bit universal test and operations PHY interface for
asynchronous transfer mode (ATM) slave (UTOPIA slave) port (C6415/C6416 only), 3 32-bit
general-purpose timers, a user-configurable 16-bit or 32-bit host-port interface (HPI16/HPI32), a peripheral component interconnect (PCI) (C6415/C6416 only), a general-purpose input/output port (GPIO) with 16 GPIO pins, and two glueless external memory interfaces (64-bit EMIFA and 16-bit EMIFB), both of which are capable of interfacing to synchronous and asynchronous memories and peripherals.
The C64x has a complete set of development tools that includes an advanced C compiler with C64x-specific enhancements, an assembly optimizer to simplify programming and scheduling, and a Windows debugger interface for visibility into source code execution.(3)(4)