DaVinci Digital Media System-on-Chip - TMS320DM6446

TMS320DM6446 (NRND)

DaVinci Digital Media System-on-Chip

 

Description

The TMS320DM6446 (also referenced as DM6446) leverages TI's DaVinci™ technology to meet the networked media encode and decode application processing needs of next-generation embedded devices.

The DM6446 enables OEMs and ODMs to quickly bring to market devices featuring robust operating systems support, rich user interfaces, high processing performance, and long battery life through the maximum flexibility of a fully integrated mixed processor solution.

The dual-core architecture of the DM6446 provides benefits of both DSP and Reduced Instruction Set Computer (RISC) technologies, incorporating a high-performance TMS320C64x+™ DSP core and an ARM926EJ-S core.

The ARM926EJ-S is a 32-bit RISC processor core that performs 32-bit or 16-bit instructions and processes 32-bit, 16-bit, or 8-bit data. The core uses pipelining so that all parts of the processor and memory system can operate continuously.

The ARM core incorporates: A coprocessor 15 (CP15) and protection module Data and program Memory Management Units (MMUs) with table look-aside buffers. Separate 16K-byte instruction and 8K-byte data caches. Both are four-way associative with virtual index virtual tag (VIVT).

The TMS320C64x+™ DSPs are the highest-performance fixed-point DSP generation in the TMS320C6000™ DSP platform. It is based on an enhanced version of the second-generation high-performance, advanced very-long-instruction-word (VLIW) architecture developed by Texas Instruments (TI), making these DSP cores an excellent choice for digital media applications. The C64x is a code-compatible member of the C6000™ DSP platform. The TMS320C64x+ DSP is an enhancement of the C64x+™ DSP with added functionality and an expanded instruction set.

Any reference to the C64x™ DSP or C64x™ CPU also applies, unless otherwise noted, to the C64x+™ DSP and C64x+™ CPU, respectively.

With performance of up to 6480 million instructions per second (MIPS) at a clock rate of 810 MHz, the C64x+ core offers solutions to high-performance DSP programming challenges. The DSP core possesses the operational flexibility of high-speed controllers and the numerical capability of array processors. The C64x+ DSP core processor has 64 general-purpose registers of 32-bit word length and eight highly independent functional units--two multipliers for a 32-bit result and six arithmetic logic units (ALUs). The eight functional units include instructions to accelerate the performance in video and imaging applications. The DSP core can produce four 16-bit multiply-accumulates (MACs) per cycle for a total of 3240 million MACs per second (MMACS), or eight 8-bit MACs per cycle for a total of 6480 MMACS. For more details on the C64x+ DSP, see the TMS320C64x/C64x+ DSP CPU and Instruction Set Reference Guide (literature number SPRU732).

The DM6446 also has application-specific hardware logic, on-chip memory, and additional on-chip peripherals similar to the other C6000 DSP platform devices. The DM6446 core uses a two-level cache-based architecture. The Level 1 program cache (L1P) is a 256K-bit direct mapped cache and the Level 1 data cache (L1D) is a 640K-bit 2-way set-associative cache. The Level 2 memory/cache (L2) consists of an 512K-bit memory space that is shared between program and data space. L2 memory can be configured as mapped memory, cache, or combinations of the two.

The peripheral set includes: 2 configurable video ports; a 10/100 Mb/s Ethernet MAC (EMAC) with a Management Data Input/Output (MDIO) module; an inter-integrated circuit (I2C) Bus interface; one audio serial port (ASP); 2 64-bit general-purpose timers each configurable as 2 independent 32-bit timers; 1 64-bit watchdog timer; up to 71-pins of general-purpose input/output (GPIO) with programmable interrupt/event generation modes, multiplexed with other peripherals; 3 UARTs with hardware handshaking support on 1 UART; 3 pulse width modulator (PWM) peripherals; and 2 external memory interfaces: an asynchronous external memory interface (EMIFA) for slower memories/peripherals, and a higher speed synchronous memory interface for DDR2.

The DM6446 device includes a Video Processing Subsystem (VPSS) with two configurable video/imaging peripherals: 1 Video Processing Front-End (VPFE) input used for video capture, 1 Video Processing Back-End (VPBE) output with imaging co-processor (VICP) used for display.

The Video Processing Front-End (VPFE) is comprised of a CCD Controller (CCDC), a Preview Engine (Previewer), Histogram Module, Auto-Exposure/White Balance/Focus Module (H3A), and Resizer. The CCDC is capable of interfacing to common video decoders, CMOS sensors, and Charge Coupled Devices (CCDs). The Previewer is a real-time image processing engine that takes raw imager data from a CMOS sensor or CCD and converts from an RGB Bayer Pattern to YUV4:2:2. The Histogram and H3A modules provide statistical information on the raw color data for use by the DM6446. The Resizer accepts image data for separate horizontal and vertical resizing from 1/4x to 4x in increments of 256/N, where N is between 64 and 1024.

The Video Processing Back-End (VPBE) is comprised of an On-Screen Display Engine (OSD) and a Video Encoder (VENC). The OSD engine is capable of handling 2 separate video windows and 2 separate OSD windows. Other configurations include 2 video windows, 1 OSD window, and 1 attribute window allowing up to 8 levels of alpha blending. The VENC provides four analog DACs that run at 54 MHz, providing a means for composite NTSC/PAL video, S-Video, and/or Component video output. The VENC also provides up to 24 bits of digital output to interface to RGB888 devices. The digital output is capable of 8/16-bit BT.656 output and/or CCIR.601 with separate horizontal and vertical syncs.

The Ethernet Media Access Controller (EMAC) provides an efficient interface between the DM644x and the network. The DM6446 EMAC support both 10Base-T and 100Base-TX, or 10 Mbits/second (Mbps) and 100 Mbps in either half- or full-duplex mode, with hardware flow control and quality of service (QOS) support.

The Management Data Input/Output (MDIO) module continuously polls all 32 MDIO addresses in order to enumerate all PHY devices in the system. Once a PHY candidate has been selected by the ARM, the MDIO module transparently monitors its link state by reading the PHY status register. Link change events are stored in the MDIO module and can optionally interrupt the ARM, allowing the ARM to poll the link status of the device without continuously performing costly MDIO accesses.

The HPI, I2C, SPI, USB2.0, and VLYNQ ports allow DM6446 to easily control peripheral devices and/or communicate with host processors. The DM6446 also provides multimedia card support, MMC/SD, with SDIO support.

The DM6446 also includes a Video/Imaging Co-processor (VICP) to offload many video and imaging processing tasks from the DSP core, making more DSP MIPS available for common video and imaging algorithms. For more information on the VICP enhanced codecs, such as H.264 and MPEG4, please contact your nearest TI sales representative.

The rich peripheral set provides the ability to control external peripheral devices and communicate with external processors. For details on each of the peripherals, see the related sections later in this document and the associated peripheral reference guides listed in Section 2.8.3.1, Related Documentation From Texas Instruments.

The DM6446 has a complete set of development tools for both the ARM and DSP. These include C compilers, a DSP assembly optimizer to simplify programming and scheduling, and a Windows™ debugger interface for visibility into source code execution.

Features

  • High-Performance Digital Media SoC
    • 513-, 594-, 810-MHz C64x+™ Clock Rates
    • 256.5-, 297-, 405-MHz ARM926EJ-S™ Clock Rates
    • Eight 32-Bit C64x+ Instructions/Cycle
    • 4104, 4752, 6480 C64x+ MIPS
    • Fully Software-Compatible With C64x / ARM9™
    • Extended Temperature Devices Available
  • Advanced Very-Long-Instruction-Word (VLIW) TMS320C64x+™ DSP Core
    • Eight Highly Independent Functional Units
      • Six ALUs (32-/40-Bit), Each Supports Single 32-Bit, Dual 16-Bit, or Quad 8-Bit Arithmetic per Clock Cycle
      • Two Multipliers Support Four 16 x 16-Bit Multiplies (32-Bit Results) per Clock Cycle or Eight 8 x 8-Bit Multiplies (16-Bit Results) per Clock Cycle
    • Load-Store Architecture With Non-Aligned Support
    • 64 32-Bit General-Purpose Registers
    • Instruction Packing Reduces Code Size
    • All Instructions Conditional
    • Additional C64x+™ Enhancements
      • Protected Mode Operation
      • Exceptions Support for Error Detection and Program Redirection
      • Hardware Support for Modulo Loop Operation
  • C64x+ Instruction Set Features
    • Byte-Addressable (8-/16-/32-/64-Bit Data)
    • 8-Bit Overflow Protection
    • Bit-Field Extract, Set, Clear
    • Normalization, Saturation, Bit-Counting
    • Compact 16-Bit Instructions
    • Additional Instructions to Support Complex Multiplies
  • C64x+ L1/L2 Memory Architecture
    • 32K-Byte L1P Program RAM/Cache (Direct Mapped)
    • 80K-Byte L1D Data RAM/Cache (2-Way Set-Associative)
    • 64K-Byte L2 Unified Mapped RAM/Cache (Flexible RAM/Cache Allocation)
  • ARM926EJ-S Core
    • Support for 32-Bit and 16-Bit (Thumb® Mode) Instruction Sets
    • DSP Instruction Extensions and Single Cycle MAC
    • ARM® Jazelle®: Technology
    • EmbeddedICE-RT™ Logic for Real-Time Debug
  • ARM9 Memory Architecture
    • 16K-Byte Instruction Cache
    • 8K-Byte Data Cache
    • 16K-Byte RAM
    • 8K-Byte ROM
  • Embedded Trace Buffer™ (ETB11™) With 4KB Memory for ARM9 Debug
  • Endianness: Little Endian for ARM and DSP
  • Video Imaging Co-Processor (VICP)
  • Video Processing Subsystem
    • Front End Provides:
      • CCD and CMOS Imager Interface
      • BT.601/BT.656 Digital YCbCr 4:2:2 (8-/16-Bit) Interface
      • Preview Engine for Real-Time Image Processing
      • Glueless Interface to Common Video Decoders
      • Histogram Module
      • Auto-Exposure, Auto-White Balance and Auto-Focus Module
      • Resize Engine Resize
        • Images From 1/4x to 4x
        • Separate Horizontal/Vertical Control
    • Back End Provides:
      • Hardware On-Screen Display (OSD)
      • Four 54-MHz DACs for a Combination of
        • Composite NTSC/PAL Video
        • Luma/Chroma Separate Video (S-video)
        • Component (YPbPr or RGB) Video (Progressive)
      • Digital Output
        • 8-/16-bit YUV or up to 24-Bit RGB
        • HD Resolution
        • Up to 2 Video Windows
  • External Memory Interfaces (EMIFs)
    • 32-Bit DDR2 SDRAM Memory Controller With 256M-Byte Address Space (1.8-V I/O)
      • Up to 167-MHz Controller (A-513, -594)
      • Up to 189-MHz Controller (-810)
    • Asynchronous 16-Bit-Wide EMIF (EMIFA) With 128M-Byte Address Reach
      • Flash Memory Interfaces
        • NOR (8-/16-Bit-Wide Data)
        • NAND (8-/16-Bit-Wide Data)
  • Flash Card Interfaces
    • Multimedia Card (MMC)/Secure Digital (SD) with Secure Data I/O (SDIO)
    • Compact Flash Controller With True IDE Mode
    • SmartMedia
  • Enhanced Direct-Memory-Access (EDMA3) Controller (64 Independent Channels)
  • Two 64-Bit General-Purpose Timers (Each Configurable as Two 32-Bit Timers)
  • One 64-Bit Watch Dog Timer
  • Three UARTs (One with RTS and CTS Flow Control)
  • One Serial Peripheral Interface (SPI) With Two Chip-Selects
  • Master/Slave Inter-Integrated Circuit (I2C Bus™)
  • Audio Serial Port (ASP)
    • I2S
    • AC97 Audio Codec Interface
    • Standard Voice Codec Interface (AIC12)
  • 10/100 Mb/s Ethernet MAC (EMAC)
    • IEEE 802.3 Compliant
    • Media Independent Interface (MII)
  • VLYNQ™ Interface (FPGA Interface)
  • Host Port Interface (HPI) with 16-Bit Multiplexed Address/Data
  • USB Port With Integrated 2.0 PHY
    • USB 2.0 High-/Full-Speed (480-Mbps) Client
    • USB 2.0 High-/Full-/Low-Speed Host (Mini-Host, Supporting One External Device)
  • Three Pulse Width Modulator (PWM) Outputs
  • On-Chip ARM ROM Bootloader (RBL) to Boot From NAND Flash or UART
  • ATA/ATAPI I/F (ATA/ATAPI-6 Specification)
  • Individual Power-Saving Modes for ARM/DSP
  • Flexible PLL Clock Generators
  • IEEE-1149.1 (JTAG) Boundary-Scan-Compatible
  • Up to 71 General-Purpose I/O (GPIO) Pins (Multiplexed With Other Device Functions)
  • 361-Pin Pb-Free BGA Package(ZWT Suffix), 0.8-mm Ball Pitch
  • 0.09-µm/6-Level Cu Metal Process (CMOS)
  • 3.3-V and 1.8-V I/O, 1.2-V Internal (513, 594)
  • 3.3-V and 1.8-V I/O, 1.2-V DAC and USB, 1.3-V Internal (810 only)
  • Applications:
    • Digital Media
    • Networked Media Encode/Decode
    • Video Imaging

All other trademarks are the property of their respective owners

View more

Featured tools and software