# C55x v3.x CPU Mnemonic Instruction Set Reference Guide 

## Preface

## Read This First

## About This Manual

The C55x™ CPU is a fixed-point digital signal processor (DSP) in the TMS320™ family, and it can use either of two forms of the instruction set: a mnemonic form or an algebraic form. This book is a reference for the mnemonic form of the instruction set. It contains information about the instructions used for all types of operations. For information on the algebraic instruction set, see C55x v3.1 CPU Algebraic Instruction Set Reference Guide, SWPU068.

## Notational Conventions

This book uses the following conventions.
- In syntax descriptions, the instruction is in a bold typeface. Portions of a syntax in bold must be entered as shown. Here is an example of an instruction syntax:

  **LMS** Xmem, Ymem, ACx, ACy

  LMS is the instruction, and it has four operands: Xmem, Ymem, ACx, and ACy. When you use LMS, the operands should be actual dual data-memory operand values and accumulator values. A comma and a space (optional) must separate the four values.
- Square brackets, [ and ], identify an optional parameter. If you use an optional parameter, specify the information within the brackets; do not type the brackets themselves.

## Related Documentation From Texas Instruments

The following books describe the C55x™ devices and related support tools. To obtain a copy of any of these TI documents, call the Texas Instruments Literature Response Center at (800) 477-8924. When ordering, please identify the book by its title and literature number.

TMS320C55x Technical Overview (SPRU393). This overview is an introduction to the TMS320C55x™ digital signal processor (DSP). The TMS320C55x is the latest generation of fixed-point DSPs in the TMS320C5000™ DSP platform. Like the previous generations, this processor is optimized for high performance and low-power operation. This book describes the CPU architecture, low-power enhancements, and embedded emulation features of the TMS320C55x.

C55x v3.x CPU Reference Guide (literature number SWPU073) describes the architecture, registers, and operation of the v3.x CPU for the C55x™.

C55x v3.x CPU Algebraic Instruction Set Reference Guide (literature number SWPU068) describes the algebraic instructions individually. It also includes a summary of the instruction set, a list of the instruction opcodes, and a cross-reference to the mnemonic instruction set.

TMS320C55x Programmer's Guide (literature number SPRU376) describes ways to optimize C and assembly code for the TMS320C55x™ DSPs and explains how to write code that uses special features and instructions of the DSP.

TMS320C55x Optimizing C Compiler User's Guide (literature number SPRU281) describes the TMS320C55x™ C Compiler. This C compiler accepts ANSI standard C source code and produces assembly language source code for TMS320C55x devices.

TMS320C55x Assembly Language Tools User's Guide (literature number SPRU280) describes the assembly language tools (assembler, linker, and other tools used to develop assembly language code), assembler directives, macros, common object file format, and symbolic debugging directives for TMS320C55x™ devices.

## Trademarks

TMS320, TMS320C54x, TMS320C55x, C54x, and C55x are trademarks of Texas Instruments.

## Contents

1 Terms, Symbols, and Abbreviations
Lists and defines the terms, symbols, and abbreviations used in the TMS320C55x DSP mnemonic instruction set summary and in the individual instruction descriptions.
1.1 Instruction Set Terms, Symbols, and Abbreviations
1.2 Instruction Set Conditional (cond) Fields
1.3 Affect of Status Bits
1.3.1 Accumulator Overflow Status Bit (ACOVx)
1.3.2 C54CM Status Bit
1.3.3 CARRY Status Bit
1.3.4 FRCT Status Bit
1.3.5 INTM Status Bit
1.3.6 M40 Status Bit
1.3.7 RDM Status Bit
1.3.8 SATA Status Bit
1.3.9 SATD Status Bit
1.3.10 SMUL Status Bit
1.3.11 SXMD Status Bit
1.3.12 Test Control Status Bit (TCx)
1.4 Instruction Set Notes and Rules
1.4.1 Notes
1.4.2 Rules
1.5 Nonrepeatable Instructions
2 Parallelism Features and Rules
Describes the parallelism features and rules of the TMS320C55x DSP mnemonic instruction set.
2.1 Parallelism Features
2.2 Parallelism Basics
2.3 Resource Conflicts
2.3.1 Operators
2.3.2 Address Generation Units
2.3.3 Buses
2.4 Soft-Dual Parallelism
2.4.1 Soft-Dual Parallelism of MAR Instructions
2.5 Execute Conditionally Instructions
2.6 Other Exceptions
3 Introduction to Addressing Modes
Provides an introduction to the addressing modes of the TMS320C55x DSP.
3.1 Introduction to the Addressing Modes
3.2 Absolute Addressing Modes
3.2.1 k16 Absolute Addressing Mode
3.2.2 k23 Absolute Addressing Mode
3.2.3 I/O Absolute Addressing Mode
3.3 Direct Addressing Modes
3.3.1 DP Direct Addressing Mode
3.3.2 SP Direct Addressing Mode
3.3.3 Register-Bit Direct Addressing Mode
3.3.4 PDP Direct Addressing Mode
3.4 Indirect Addressing Modes
3.4.1 AR Indirect Addressing Mode
3.4.2 Dual AR Indirect Addressing Mode
3.4.3 CDP Indirect Addressing Mode
3.4.4 Coefficient Indirect Addressing Mode
3.5 Circular Addressing
4 Instruction Set Summary
Summary of the TMS320C55x mnemonic instruction set.
5 Instruction Set Descriptions
Detailed information on the TMS320C55x DSP mnemonic instruction set.
AADD (Modify Auxiliary or Temporary Register Content by Addition)
AADD (Modify Data Stack Pointer)
AADD (Modify Extended Auxiliary Register Content by Addition)
ABDST (Absolute Distance)
ABS (Absolute Value)
ADD (Addition)
ADD (Dual 16-Bit Additions)
ADD::MOV (Addition with Parallel Store Accumulator Content to Memory)
ADDSUB (Dual 16-Bit Addition and Subtraction)
ADDSUBCC (Addition or Subtraction Conditionally)
ADDSUBCC (Addition, Subtraction, or Move Accumulator Content Conditionally)
ADDSUB2CC (Addition or Subtraction Conditionally with Shift)
ADDV (Addition with Absolute Value)
AMAR (Modify Auxiliary Register Content)
AMAR (Modify Extended Auxiliary Register Content)
AMAR (Parallel Modify Auxiliary Register Contents)
AMAR::MAC (Modify Auxiliary Register Content with Parallel Multiply and Accumulate)
AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
AMAR::MPY (Modify Auxiliary Register Content with Parallel Multiply)
AMOV (Load Extended Auxiliary Register with Immediate Value)
AMOV (Modify Auxiliary or Temporary Register Content)
AMOV (Modify Extended Auxiliary Register Content)
AND (Bitwise AND)
ASUB (Modify Auxiliary or Temporary Register Content by Subtraction)
ASUB (Modify Extended Auxiliary Register Content by Subtraction)
B (Branch Unconditionally)
BAND (Bitwise AND Memory with Immediate Value and Compare to Zero)
BCC (Branch Conditionally)
BCC (Branch on Auxiliary Register Not Zero)
BCC (Compare and Branch)
BCLR (Clear Accumulator, Auxiliary, or Temporary Register Bit)
BCLR (Clear Memory Bit)
BCLR (Clear Status Register Bit)
BCNT (Count Accumulator Bits)
BFXPA (Expand Accumulator Bit Field)
BFXTR (Extract Accumulator Bit Field)
BNOT (Complement Accumulator, Auxiliary, or Temporary Register Bit)
BNOT (Complement Memory Bit)
BSET (Set Accumulator, Auxiliary, or Temporary Register Bit)
BSET (Set Memory Bit)
BSET (Set Status Register Bit)
BTST (Test Accumulator, Auxiliary, or Temporary Register Bit)
BTST (Test Memory Bit)
BTSTCLR (Test and Clear Memory Bit)
BTSTNOT (Test and Complement Memory Bit)
BTSTP (Test Accumulator, Auxiliary, or Temporary Register Bit Pair)
BTSTSET (Test and Set Memory Bit)
CALL (Call Unconditionally)
CALLCC (Call Conditionally)
CMP (Compare Memory with Immediate Value)
CMP (Compare Accumulator, Auxiliary, or Temporary Register Content)
CMPAND (Compare Accumulator, Auxiliary, or Temporary Register Content with AND)
CMPOR (Compare Accumulator, Auxiliary, or Temporary Register Content with OR)
.CR (Circular Addressing Qualifier)
DELAY (Memory Delay)
EXP (Compute Exponent of Accumulator Content)
FIRSADD (Symmetrical Finite Impulse Response Filter)
FIRSSUB (Antisymmetrical Finite Impulse Response Filter)
IDLE
INTR (Software Interrupt)
.LK (Lock Access Qualifier)
LMS (Least Mean Square)
LMSF (Least Mean Square)
.LR (Linear Addressing Qualifier)
MAC (Multiply and Accumulate)
MACMZ (Multiply and Accumulate with Parallel Delay)
MAC::MAC (Parallel Multiply and Accumulates)
MAC::MAS (Multiply and Accumulate with Parallel Multiply and Subtract)
MAC::MPY (Multiply and Accumulate with Parallel Multiply)
MACM::MOV (Multiply and Accumulate with Parallel Load Accumulator from Memory)
MACM::MOV (Multiply and Accumulate with Parallel Store Accumulator Content to Memory)
MANT::NEXP (Compute Mantissa and Exponent of Accumulator Content)
MAS (Multiply and Subtract)
MAS::MAC (Multiply and Subtract with Parallel Multiply and Accumulate)
MAS::MAS (Parallel Multiply and Subtracts)
MAS::MPY (Multiply and Subtract with Parallel Multiply)
MASM::MOV (Multiply and Subtract with Parallel Load Accumulator from Memory)
MASM::MOV (Multiply and Subtract with Parallel Store Accumulator Content to Memory)
MAX (Compare Accumulator, Auxiliary, or Temporary Register Content Maximum)
MAXDIFF (Compare and Select Accumulator Content Maximum)
MIN (Compare Accumulator, Auxiliary, or Temporary Register Content Minimum)
MINDIFF (Compare and Select Accumulator Content Minimum)
mmap (Memory-Mapped Register Access Qualifier)
MOV (Load Accumulator from Memory)
MOV (Load Accumulator Pair from Memory)
MOV (Load Accumulator with Immediate Value)
MOV (Load Accumulator, Auxiliary, or Temporary Register from Memory)
MOV (Load Accumulator, Auxiliary, or Temporary Register Content with Immediate Value)
MOV (Load Auxiliary or Temporary Register Pair from Memory)
MOV (Load CPU Register from Memory)
MOV (Load CPU Register with Immediate Value)
MOV (Load Extended Auxiliary Register from Memory)
MOV (Load Memory with Immediate Value)
MOV (Move Accumulator Content to Auxiliary or Temporary Register)
MOV (Move Accumulator, Auxiliary, or Temporary Register Content)
MOV (Move Auxiliary or Temporary Register Content to Accumulator)
MOV (Move Auxiliary or Temporary Register Content to CPU Register)
MOV (Move CPU Register Content to Auxiliary or Temporary Register)
MOV (Move Extended Auxiliary Register Content)
MOV (Move Memory to Memory)
MOV (Store Accumulator Content to Memory)
MOV (Store Accumulator Pair Content to Memory)
MOV (Store Accumulator, Auxiliary, or Temporary Register Content to Memory)
MOV (Store Auxiliary or Temporary Register Pair Content to Memory)
MOV (Store CPU Register Content to Memory)
MOV (Store Extended Auxiliary Register Content to Memory)
MOV::MOV (Load Accumulator from Memory with Parallel Store Accumulator Content to Memory)
MPY (Multiply)
MPY::MAC (Multiply with Parallel Multiply and Accumulate)
MPY::MAS (Multiply with Parallel Multiply and Subtract)
MPY::MPY (Parallel Multiplies)
MPYM::MOV (Multiply with Parallel Store Accumulator Content to Memory)
NEG (Negate Accumulator, Auxiliary, or Temporary Register Content)
NOP (No Operation)
NOT (Complement Accumulator, Auxiliary, or Temporary Register Content)
OR (Bitwise OR)
POP (Pop Top of Stack)
POPBOTH (Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers)
port (Peripheral Port Register Access Qualifiers)
PSH (Push to Top of Stack)
PSHBOTH (Push Accumulator or Extended Auxiliary Register Content to Stack Pointers)
RESET (Software Reset)
RET (Return Unconditionally)
RETCC (Return Conditionally)
RETI (Return from Interrupt)
ROL (Rotate Left Accumulator, Auxiliary, or Temporary Register Content)
ROR (Rotate Right Accumulator, Auxiliary, or Temporary Register Content)
ROUND (Round Accumulator Content)
RPT (Repeat Single Instruction Unconditionally)
RPTADD (Repeat Single Instruction Unconditionally and Increment CSR)
RPTB (Repeat Block of Instructions Unconditionally)
RPTCC (Repeat Single Instruction Conditionally)
RPTSUB (Repeat Single Instruction Unconditionally and Decrement CSR)
SAT (Saturate Accumulator Content)
SFTCC (Shift Accumulator Content Conditionally)
SFTL (Shift Accumulator Content Logically)
SFTL (Shift Accumulator, Auxiliary, or Temporary Register Content Logically)
SFTS (Signed Shift of Accumulator Content)
SFTS (Signed Shift of Accumulator, Auxiliary, or Temporary Register Content)
SQA (Square and Accumulate)
SQDST (Square Distance)
SQR (Square)
SQS (Square and Subtract)
SUB (Dual 16-Bit Subtractions)
SUB (Subtraction)
SUB::MOV (Subtraction with Parallel Store Accumulator Content to Memory)
SUBADD (Dual 16-Bit Subtraction and Addition)
SUBC (Subtract Conditionally)
SWAP (Swap Accumulator Content)
SWAP (Swap Auxiliary Register Content)
SWAP (Swap Auxiliary and Temporary Register Content)
SWAP (Swap Temporary Register Content)
SWAPP (Swap Accumulator Pair Content)
SWAPP (Swap Auxiliary Register Pair Content)
SWAPP (Swap Auxiliary and Temporary Register Pair Content)
SWAPP (Swap Temporary Register Pair Content)
SWAP4 (Swap Auxiliary and Temporary Register Pairs Content)
TRAP (Software Trap)
XCC (Execute Conditionally)
XOR (Bitwise Exclusive OR)
6 Instruction Opcodes in Sequential Order
Provides the opcode in sequential order for each TMS320C55x DSP instruction syntax.
6.1 Instruction Set Opcodes
6.2 Instruction Set Opcode Symbols and Abbreviations
7 Cross-Reference of Mnemonic and Algebraic Instruction Sets
Cross-Reference of TMS320C55x DSP Algebraic and Mnemonic Instruction Sets.
8 Index

## Figures

5-1 Status Registers Bit Mapping
5-2 Status Registers Bit Mapping
5-3 Effects of a Software Reset on Status Registers
5-4 Legal Uses of Repeat Block of Instructions Unconditionally (RPTBLOCAL) Instruction

## Tables

1-1 Instruction Set Terms, Symbols, and Abbreviations
1-2 Operators Used in Instruction Set
1-3 Instruction Set Conditional (cond) Field
1-4 Nonrepeatable Instructions
3-1 Addressing-Mode Operands
3-2 Absolute Addressing Modes
3-3 Direct Addressing Modes
3-4 Indirect Addressing Modes
3-5 DSP Mode Operands for the AR Indirect Addressing Mode
3-6 Control Mode Operands for the AR Indirect Addressing Mode
3-7 Dual AR Indirect Operands
3-8 CDP Indirect Operands
3-9 Coefficient Indirect Operands
3-10 Circular Addressing Pointers
4-1 Mnemonic Instruction Set Summary
5-1 Opcodes for Load CPU Register from Memory Instruction
5-2 Opcodes for Load CPU Register with Immediate Value Instruction
5-3 Opcodes for Move Auxiliary or Temporary Register Content to CPU Register Instruction
5-4 Opcodes for Move CPU Register Content to Auxiliary or Temporary Register Instruction
5-5 Opcodes for Store CPU Register Content to Memory Instruction
5-6 Effects of a Software Reset on DSP Registers
6-1 Instruction Set Opcodes
6-2 Instruction Set Opcode Symbols and Abbreviations
7-1 Cross-Reference of Mnemonic and Algebraic Instruction Sets

# Terms, Symbols, and Abbreviations 

This chapter lists and defines the terms, symbols, and abbreviations used in the TMS320C55x™ DSP mnemonic instruction set summary and in the individual instruction descriptions. Also provided are instruction set notes and rules and a list of nonrepeatable instructions.

Topics:

- 1.1 Instruction Set Terms, Symbols, and Abbreviations
- 1.2 Instruction Set Conditional (cond) Fields
- 1.3 Affect of Status Bits
- 1.4 Instruction Set Notes and Rules
- 1.5 Nonrepeatable Instructions

### 1.1 Instruction Set Terms, Symbols, and Abbreviations

Table 1-1 lists the terms, symbols, and abbreviations used and Table 1-2 lists the operators used in the instruction set summary and in the individual instruction descriptions.

Table 1-1. Instruction Set Terms, Symbols, and Abbreviations

| Symbol | Meaning |
| :---: | :---: |
| [ ] | Optional operands |
| 40 | If the optional 40 keyword is applied to the instruction, the instruction provides the option to locally set M40 to 1 for the execution of the instruction |
| ACB | Bus that brings D-unit registers to A-unit and P-unit operators |
| ACOVx | Accumulator overflow status bit: ACOV0, ACOV1, ACOV2, ACOV3 |
| ACw, ACx, ACy, ACz | Accumulator: AC0, AC1, AC2, AC3 |
| ARn_mod | Content of selected auxiliary register (ARn) is premodified or postmodified in the address generation unit. |
| ARx, ARy | Auxiliary register: <br> AR0, AR1, AR2, AR3, AR4, AR5, AR6, AR7 |
| AU | A unit |
| Baddr | Register bit address |
| BitIn | Shifted bit in: Test control flag 2 (TC2) or CARRY status bit |
| BitOut | Shifted bit out: Test control flag 2 (TC2) or CARRY status bit |
| BORROW | Logical complement of CARRY status bit |
| C, Cycles | Execution in cycles. For conditional instructions, $x / y$ field means: <br> $x$ cycle, if the condition is true. <br> y cycle, if the condition is false. |
| CA | Coefficient address generation unit |
| CARRY | Value of CARRY status bit |
| Cmem | Coefficient indirect operand referencing a 16-bit or 32-bit value in data space |
| cond | Condition based on accumulator (ACx) value, auxiliary register (ARx) value, temporary register (Tx) value, test control (TCx) flag, or CARRY status bit. See section 1.2. |
| CR | Coefficient Read bus |
| CSR | Computed single-repeat register |
| DA | Data address generation unit |
| DR | Data Read bus |
| dst | Destination accumulator (ACx), lower 16 bits of auxiliary register (ARx), or temporary register (Tx): <br> AC0, AC1, AC2, AC3 <br> AR0, AR1, AR2, AR3, AR4, AR5, AR6, AR7 <br> T0, T1, T2, T3 |
| DU | D unit |
| DW | Data Write bus |
| Dx | Data address label coded on x bits (absolute address) |
| E | Indicates if the instruction contains a parallel enable bit. |
| kx | Unsigned constant coded on x bits |
| Kx | Signed constant coded on x bits |
| Lmem | Long-word single data memory access (32-bit data access). Same legal inputs as Smem. |
| lx | Program address label coded on x bits (unsigned offset relative to program counter register) |
| Lx | Program address label coded on x bits (signed offset relative to program counter register) |
| Operator | Operator(s) used by an instruction. |
| Pipe, Pipeline | Pipeline phase in which the instruction executes: <br> AD Address <br> D Decode <br> R Read <br> X Execute |
| pmad | Program memory address |
| Px | Program or data address label coded on x bits (absolute address) |
| RELOP | Relational operators: <br> `==` equal to <br> `<` less than <br> `>=` greater than or equal to <br> `!=` not equal to |
| R or rnd | If the optional R or rnd keyword is applied to the instruction, rounding is performed in the instruction |
| RPTC | Single-repeat counter register |
| S, Size | Instruction size in bytes. |
| SA | Stack address generation unit |
| saturate | If the optional saturate keyword is applied to the input operand, the 40-bit output of the operation is saturated |
| SHFT | 4-bit immediate shift value, 0 to 15 |
| SHIFTW | 6-bit immediate shift value, -32 to +31 |
| Smem | Word single data memory access (16-bit data access) |
| SP | Data stack pointer |
| src | Source accumulator (ACx), lower 16 bits of auxiliary register (ARx), or temporary register (Tx): <br> AC0, AC1, AC2, AC3 <br> AR0, AR1, AR2, AR3, AR4, AR5, AR6, AR7 <br> T0, T1, T2, T3 |
| SSP | System stack pointer |
| STx | Status register: ST0, ST1, ST2, ST3 |
| TAx, TAy | Auxiliary register (ARx) or temporary register (Tx): <br> AR0, AR1, AR2, AR3, AR4, AR5, AR6, AR7 <br> T0, T1, T2, T3 |
| TCx, TCy | Test control flag: TC1, TC2 |
| TRNx | Transition register: TRN0, TRN1 |
| Tx, Ty | Temporary register: T0, T1, T2, T3 |
| U or uns | If the optional U or uns keyword is applied to the input operand, the operand is zero extended |
| XACdst | Destination extended register: All 23 bits of coefficient data pointer (XCDP), and extended auxiliary register (XARx): <br> XAR0, XAR1, XAR2, XAR3, XAR4, XAR5, XAR6, XAR7 |
| XACsrc | Source extended register: All 23 bits of coefficient data pointer (XCDP), and extended auxiliary register (XARx): <br> XAR0, XAR1, XAR2, XAR3, XAR4, XAR5, XAR6, XAR7 |
| XAdst | Destination extended register: All 23 bits of data stack pointer (XSP), system stack pointer (XSSP), data page pointer (XDP), coefficient data pointer (XCDP), and extended auxiliary register (XARx): <br> XAR0, XAR1, XAR2, XAR3, XAR4, XAR5, XAR6, XAR7 |
| XARx | All 23 bits of extended auxiliary register: XAR0, XAR1, XAR2, XAR3, XAR4, XAR5, XAR6, XAR7 |
| XAsrc | Source extended register: All 23 bits of data stack pointer (XSP), system stack pointer (XSSP), data page pointer (XDP), coefficient data pointer (XCDP), and extended auxiliary register (XARx): <br> XAR0, XAR1, XAR2, XAR3, XAR4, XAR5, XAR6, XAR7 |
| xdst | Accumulator: AC0, AC1, AC2, AC3 <br> Destination extended register: All 23 bits of data stack pointer (XSP), system stack pointer (XSSP), data page pointer (XDP), coefficient data pointer (XCDP), and extended auxiliary register (XARx): <br> XAR0, XAR1, XAR2, XAR3, XAR4, XAR5, XAR6, XAR7 |
| xsrc | Accumulator: AC0, AC1, AC2, AC3 <br> Source extended register: All 23 bits of data stack pointer (XSP), system stack pointer (XSSP), data page pointer (XDP), coefficient data pointer (XCDP), and extended auxiliary register (XARx): <br> XAR0, XAR1, XAR2, XAR3, XAR4, XAR5, XAR6, XAR7 |
| Xmem, Ymem | Indirect dual data memory access (two data accesses) |

Table 1-2. Operators Used in Instruction Set

| Symbols | Operators | Evaluation |
| :---: | :---: | :---: |
| `+` `-` `~` | Unary plus, minus, 1s complement | Right to left |
| `*` `/` `%` | Multiplication, division, modulo | Left to right |
| `+` `-` | Addition, subtraction | Left to right |
| `<<` `>>` | Signed left shift, right shift | Left to right |
| `<<<` `>>>` | Logical left shift, logical right shift | Left to right |
| `<` `<=` | Less than, less than or equal to | Left to right |
| `>` `>=` | Greater than, greater than or equal to | Left to right |
| `==` `!=` | Equal to, not equal to | Left to right |
| `&` | Bitwise AND | Left to right |
| `\|` | Bitwise OR | Left to right |
| `^` | Bitwise exclusive OR (XOR) | Left to right |

Note: Unary +, -, and * have higher precedence than the binary forms.

### 1.2 Instruction Set Conditional (cond) Fields

Table 1-3 lists the testing conditions available in the cond field of the conditional instructions.

Table 1-3. Instruction Set Conditional (cond) Field

| Bit or Register | Condition (cond) Field | For Condition to be True ... |
| :---: | :---: | :---: |
| Accumulator | Tests the accumulator (ACx) content against 0. The comparison against 0 depends on the M40 status bit: <br> If M40 = 0, ACx(31-0) is compared to 0. <br> If M40 = 1, ACx(39-0) is compared to 0. |  |
|  | ACx == #0 | ACx content is equal to 0 |
|  | ACx < #0 | ACx content is less than 0 |
|  | ACx > #0 | ACx content is greater than 0 |
|  | ACx != #0 | ACx content is not equal to 0 |
|  | ACx <= #0 | ACx content is less than or equal to 0 |
|  | ACx >= #0 | ACx content is greater than or equal to 0 |
| Accumulator Overflow Status Bit | Tests the accumulator overflow status bit (ACOVx) against 1; when the optional ! symbol is used before the bit designation, the bit can be tested against 0. When this condition is used, the corresponding ACOVx is cleared to 0. |  |
|  | overflow(ACx) | ACOVx bit is set to 1 |
|  | !overflow(ACx) | ACOVx bit is cleared to 0 |
| Auxiliary Register | Tests the auxiliary register (ARx) content against 0. |  |
|  | ARx == #0 | ARx content is equal to 0 |
|  | ARx < #0 | ARx content is less than 0 |
|  | ARx > #0 | ARx content is greater than 0 |
|  | ARx != #0 | ARx content is not equal to 0 |
|  | ARx <= #0 | ARx content is less than or equal to 0 |
|  | ARx >= #0 | ARx content is greater than or equal to 0 |
| CARRY Status Bit | Tests the CARRY status bit against 1; when the optional ! symbol is used before the bit designation, the bit can be tested against 0. |  |
|  | CARRY | CARRY bit is set to 1 |
|  | !CARRY | CARRY bit is cleared to 0 |
| Temporary Register | Tests the temporary register (Tx) content against 0. |  |
|  | Tx == #0 | Tx content is equal to 0 |
|  | Tx < #0 | Tx content is less than 0 |
|  | Tx > #0 | Tx content is greater than 0 |
|  | Tx != #0 | Tx content is not equal to 0 |
|  | Tx <= #0 | Tx content is less than or equal to 0 |
|  | Tx >= #0 | Tx content is greater than or equal to 0 |
| Test Control Flags | Tests the test control flags (TC1 and TC2) independently against 1; when the optional ! symbol is used before the flag designation, the flag can be tested independently against 0. |  |
|  | TCx | TCx flag is set to 1 |
|  | !TCx | TCx flag is cleared to 0 |
|  | TC1 and TC2 can be combined with AND (&), OR (\|), and XOR (^) logical bit combinations: |  |
|  | TC1 & TC2 | TC1 AND TC2 is equal to 1 |
|  | !TC1 & TC2 | !TC1 AND TC2 is equal to 1 |
|  | TC1 & !TC2 | TC1 AND !TC2 is equal to 1 |
|  | !TC1 & !TC2 | !TC1 AND !TC2 is equal to 1 |
|  | TC1 \| TC2 | TC1 OR TC2 is equal to 1 |
|  | !TC1 \| TC2 | !TC1 OR TC2 is equal to 1 |
|  | TC1 \| !TC2 | TC1 OR !TC2 is equal to 1 |
|  | !TC1 \| !TC2 | !TC1 OR !TC2 is equal to 1 |
|  | TC1 ^ TC2 | TC1 XOR TC2 is equal to 1 |
|  | !TC1 ^ TC2 | !TC1 XOR TC2 is equal to 1 |
|  | TC1 ^ !TC2 | TC1 XOR !TC2 is equal to 1 |
|  | !TC1 ^ !TC2 | !TC1 XOR !TC2 is equal to 1 |
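
As an illustration only, the test control flag combinations listed in Table 1-3 can be evaluated with a small Python sketch. The function name `eval_tc_cond` and the whitespace-separated string format are hypothetical conveniences for this example; they are not assembler syntax rules or TI tooling.

```python
import operator

# Logical bit combinations listed in Table 1-3 for TC1 and TC2.
OPS = {"&": operator.and_, "|": operator.or_, "^": operator.xor}


def _term(token: str, flags: dict) -> int:
    """Return the flag value, complemented when the token starts with '!'."""
    negate = token.startswith("!")
    value = flags[token.lstrip("!")]
    return value ^ 1 if negate else value


def eval_tc_cond(cond: str, tc1: int, tc2: int) -> bool:
    """Evaluate a condition such as '!TC1 & TC2' (hypothetical helper)."""
    flags = {"TC1": tc1 & 1, "TC2": tc2 & 1}
    parts = cond.split()
    if len(parts) == 1:          # single flag: 'TC1', '!TC2', ...
        return bool(_term(parts[0], flags))
    left, op, right = parts      # combination: '<flag> <op> <flag>'
    return bool(OPS[op](_term(left, flags), _term(right, flags)))


print(eval_tc_cond("!TC1 & TC2", tc1=0, tc2=1))  # True
print(eval_tc_cond("TC1 | TC2", tc1=0, tc2=0))   # False
```

The sketch requires spaces around the operator; that restriction belongs to this example, not to the instruction set.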

### 1.3 Affect of Status Bits

### 1.3.1 Accumulator Overflow Status Bit (ACOVx)

The ACOV[0-3] status bits depend on M40:

- When M40 = 0, overflow is detected at bit position 31
- When M40 = 1, overflow is detected at bit position 39

If an overflow is detected, the destination accumulator overflow status bit is set to 1.
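
The M40 dependence can be sketched in Python. This is a behavioral illustration, not TI code: the function name `detect_overflow` is hypothetical, and it assumes that "overflow detected at bit position 31 (or 39)" can be modeled as the exact signed result falling outside the 32-bit (or 40-bit) two's-complement range.

```python
def detect_overflow(result: int, m40: int) -> bool:
    """Return True if the exact signed result does not fit the width chosen by M40.

    Assumption: M40 = 0 selects a 32-bit signed range, M40 = 1 a 40-bit one.
    """
    bits = 40 if m40 else 32
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return not (lowest <= result <= highest)


# 7FFF FFFFh + 1 does not fit in 32 bits but does fit in 40 bits.
print(detect_overflow(0x7FFFFFFF + 1, m40=0))  # True
print(detect_overflow(0x7FFFFFFF + 1, m40=1))  # False
```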

### 1.3.2 C54CM Status Bit

- When C54CM = 0, the enhanced mode, the CPU supports code originally developed for a TMS320C55x™ DSP.
- When C54CM = 1, the compatible mode, all the C55x CPU resources remain available; therefore, as you translate code, you can take advantage of the additional features on the C55x DSP to optimize your code. This mode must be set when you are porting code that was originally developed for a TMS320C54x™ DSP.


### 1.3.3 CARRY Status Bit

- When M40 = 0, the carry/borrow is detected at bit position 31
- When M40 = 1, the carry/borrow is detected at bit position 39

When performing a logical shift or signed shift that affects the CARRY status bit and the shift count is zero, the CARRY status bit is cleared to 0.
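
A minimal Python sketch of the carry position, assuming that "carry detected at bit position 31 (or 39)" means the carry out of that bit in an unsigned addition. The helper name `add_with_carry` is hypothetical, and borrow handling for subtraction is not modeled.

```python
def add_with_carry(a: int, b: int, m40: int) -> tuple[int, int]:
    """Add two operands and return (truncated sum, carry out of the top bit).

    Assumption: the top bit is bit 31 when M40 = 0 and bit 39 when M40 = 1.
    """
    bits = 40 if m40 else 32
    mask = (1 << bits) - 1
    total = (a & mask) + (b & mask)
    return total & mask, (total >> bits) & 1


print(add_with_carry(0xFFFFFFFF, 1, m40=0))  # (0, 1)
print(add_with_carry(0xFFFFFFFF, 1, m40=1))  # (4294967296, 0)
```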

### 1.3.4 FRCT Status Bit

- When FRCT = 0, the fractional mode is OFF and results of multiply operations are not shifted.
- When FRCT = 1, the fractional mode is ON and results of multiply operations are shifted left by 1 bit to eliminate an extra sign bit.
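
To see why the 1-bit shift removes the extra sign bit, consider two signed 16-bit Q15 fractions. The sketch below is illustrative only; the function name `multiply` and the Q15/Q31 interpretation are assumptions for the example, and saturation or rounding of the product is not modeled.

```python
def multiply(a: int, b: int, frct: int) -> int:
    """Multiply two signed 16-bit operands; shift left by 1 bit when FRCT = 1."""
    product = a * b
    return product << 1 if frct else product


half_q15 = 0x4000  # 0.5 in Q15 format (assumed operand format)

# FRCT = 0: the raw product carries two sign bits (Q30).
print(hex(multiply(half_q15, half_q15, frct=0)))  # 0x10000000
# FRCT = 1: the shifted product is 0.25 in Q31.
print(hex(multiply(half_q15, half_q15, frct=1)))  # 0x20000000
```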


### 1.3.5 INTM Status Bit

The INTM bit globally enables or disables the maskable interrupts. This bit has no effect on nonmaskable interrupts (those that cannot be blocked by software).

- When INTM = 0, all unmasked interrupts are enabled.
- When INTM = 1, all maskable interrupts are disabled.


### 1.3.6 M40 Status Bit

- When M40 = 0:
  - overflow is detected at bit position 31
  - the carry/borrow is detected at bit position 31
  - saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)
  - TMS320C54x™ DSP compatibility mode
  - for conditional instructions, the comparison against 0 (zero) is performed on 32 bits, ACx(31-0)
- When M40 = 1:
  - overflow is detected at bit position 39
  - the carry/borrow is detected at bit position 39
  - saturation values are 7F FFFF FFFFh (positive overflow) or 80 0000 0000h (negative overflow)
  - for conditional instructions, the comparison against 0 (zero) is performed on 40 bits, ACx(39-0)
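
The saturation values listed above are the extremes of the 32-bit and 40-bit signed ranges, written as 40-bit accumulator patterns. The following Python sketch reproduces them; the function name `saturate` is hypothetical, and the sketch simply clamps an exact result without modeling when the hardware actually applies saturation.

```python
MASK40 = (1 << 40) - 1  # 40-bit accumulator image


def saturate(value: int, m40: int) -> int:
    """Clamp a signed value to the M40-selected range; return the 40-bit pattern."""
    bits = 40 if m40 else 32
    highest = (1 << (bits - 1)) - 1
    lowest = -(1 << (bits - 1))
    clamped = max(lowest, min(highest, value))
    return clamped & MASK40  # two's-complement image in 40 bits


print(f"{saturate(1 << 35, m40=0):010X}")     # 007FFFFFFF
print(f"{saturate(-(1 << 35), m40=0):010X}")  # FF80000000
print(f"{saturate(1 << 45, m40=1):010X}")     # 7FFFFFFFFF
print(f"{saturate(-(1 << 45), m40=1):010X}")  # 8000000000
```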


### 1.3.6.1 M40 Status Bit When Sign Shifting

In D-unit shifter:

- When shifting to the LSBs:
  - when M40 = 0, the input to the shifter is modified according to SXMD and then the modified input is shifted according to the shift quantity:
    - if SXMD = 0, 0 is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
    - if SXMD = 1, bit 31 of the source operand is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
  - bit 39 is extended according to SXMD
  - the shifted-out bit is extracted at bit position 0
- When shifting to the MSBs:
  - 0 is inserted at bit position 0
  - if M40 = 0, the shifted-out bit is extracted at bit position 31
  - if M40 = 1, the shifted-out bit is extracted at bit position 39
- After shifting, unless otherwise noted, when M40 = 0:
  - overflow is detected at bit position 31 (if an overflow is detected, the destination ACOVx bit is set)
  - the carry/borrow is detected at bit position 31
  - if SATD = 1, when an overflow is detected, ACx saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)
  - TMS320C54x™ DSP compatibility mode
- After shifting, unless otherwise noted, when M40 = 1:
  - overflow is detected at bit position 39 (if an overflow is detected, the destination ACOVx bit is set)
  - the carry/borrow is detected at bit position 39
  - if SATD = 1, when an overflow is detected, ACx saturation values are 7F FFFF FFFFh (positive overflow) or 80 0000 0000h (negative overflow)

In A-unit ALU:
$\square$ When shifting to the LSBs, bit 15 is sign extended
$\square$ When shifting to the MSBs, 0 is inserted at bit position 0
$\square$ After shifting, unless otherwise noted:

- overflow is detected at bit position 15 (if an overflow is detected, the destination ACOVx bit is set)
- if SATA = 1, when an overflow is detected, register saturation values are 7FFFh (positive overflow) or 8000h (negative overflow)


### 1.3.6.2 M40 Status Bit When Logically Shifting

In D-unit shifter:

- When shifting to the LSBs:
  - if M40 = 0, 0 is inserted at bit position 31 and the guard bits (39-32) of the destination accumulator are cleared
  - if M40 = 1, 0 is inserted at bit position 39
  - the shifted-out bit is extracted at bit position 0 and stored in the CARRY status bit
- When shifting to the MSBs:
  - 0 is inserted at bit position 0
  - if M40 = 0, the shifted-out bit is extracted at bit position 31 and stored in the CARRY status bit, and the guard bits (39-32) of the destination accumulator are cleared
  - if M40 = 1, the shifted-out bit is extracted at bit position 39 and stored in the CARRY status bit

In A-unit ALU:

- When shifting to the LSBs:
  - 0 is inserted at bit position 15
  - the shifted-out bit is extracted at bit position 0 and stored in the CARRY status bit
- When shifting to the MSBs:
  - 0 is inserted at bit position 0
  - the shifted-out bit is extracted at bit position 15 and stored in the CARRY status bit
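
As a rough illustration of the D-unit rules above, the following Python sketch models a logical shift by exactly one bit position. The function names are hypothetical, and multi-bit shift counts are deliberately left out; each function returns the new 40-bit accumulator pattern together with the bit that would be stored in CARRY.

```python
MASK40 = (1 << 40) - 1
MASK32 = (1 << 32) - 1


def logical_shift_right_1(acc, m40):
    """Shift one position toward the LSBs; return (result, carry)."""
    acc &= MASK40
    carry = acc & 1                        # shifted-out bit leaves at bit 0
    if m40:
        result = acc >> 1                  # 0 enters at bit position 39
    else:
        result = (acc & MASK32) >> 1       # 0 enters at bit 31, guard bits cleared
    return result, carry


def logical_shift_left_1(acc, m40):
    """Shift one position toward the MSBs; return (result, carry)."""
    acc &= MASK40
    if m40:
        carry = (acc >> 39) & 1            # shifted-out bit leaves at bit 39
        result = (acc << 1) & MASK40
    else:
        carry = (acc >> 31) & 1            # shifted-out bit leaves at bit 31
        result = (acc << 1) & MASK32       # guard bits (39-32) cleared
    return result, carry


print(logical_shift_right_1(0xFF80000001, 0))
print(logical_shift_left_1(0x0080000001, 0))
```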


### 1.3.7 RDM Status Bit

When the optional rnd or R keyword is applied to the instruction, then rounding is performed in the D-unit shifter. This is done according to RDM:

- When RDM = 0, the biased rounding to the infinite is performed. 8000h ($2^{15}$) is added to the 40-bit shift result.
- When RDM = 1, the unbiased rounding to the nearest is performed. According to the value of the 17 LSBs of the 40-bit shift result, 8000h ($2^{15}$) is added:

```
if (8000h < bit(15-0) < 10000h)
    add 8000h to the 40-bit shift result
else if (bit(15-0) == 8000h)
    if (bit(16) == 1)
        add 8000h to the 40-bit shift result
```

If a rounding has been performed, the 16 lowest bits of the result are cleared to 0.
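
The two rounding modes can be compared with a small model. This Python sketch is an illustration under stated assumptions, not a description of the hardware: the name `round_rdm` is hypothetical, the input is treated as a raw 40-bit pattern, and overflow detection or saturation after the addition is not modeled (the sum simply wraps to 40 bits).

```python
MASK40 = (1 << 40) - 1


def round_rdm(value40, rdm):
    """Round a 40-bit pattern per RDM and clear its 16 lowest bits."""
    value40 &= MASK40
    low16 = value40 & 0xFFFF               # bit(15-0)
    bit16 = (value40 >> 16) & 1            # bit(16)
    if rdm == 0:
        value40 += 0x8000                  # biased: always add 2**15
    elif low16 > 0x8000 or (low16 == 0x8000 and bit16 == 1):
        value40 += 0x8000                  # unbiased: ties go to the even result
    return (value40 & MASK40) & ~0xFFFF    # 16 lowest bits cleared to 0


# A tie (low half exactly 8000h) with bit 16 clear shows the difference.
print(f"{round_rdm(0x28000, 0):X}")        # biased rounding
print(f"{round_rdm(0x28000, 1):X}")        # unbiased rounding
```

The only inputs on which the two modes differ are exact ties (bits 15-0 equal to 8000h) where bit 16 is 0.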

### 1.3.8 SATA Status Bit

This status bit controls operations performed in the A unit.

- When SATA = 0, no saturation is performed.
- When SATA = 1 and an overflow is detected, the destination register is saturated to 7FFFh (positive overflow) or 8000h (negative overflow).


### 1.3.9 SATD Status Bit

This status bit controls operations performed in the D unit.

- When SATD $=0$, no saturation is performed.
- When SATD $=1$ and an overflow is detected, the destination register is saturated.


### 1.3.10 SMUL Status Bit

- When SMUL = 0, the saturation mode is OFF.
- When SMUL = 1, the saturation mode is ON. When SMUL = 1, FRCT = 1, and SATD = 1, the result of 1 8000h × 1 8000h is saturated to 00 7FFF FFFFh (regardless of the value of the M40 bit). This forces the product of the two negative numbers to be a positive number. For multiply-and-accumulate/subtract instructions, the saturation is performed after the multiplication and before the addition/subtraction.
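
The special case above can be made concrete with a small model. This Python sketch rests on two assumptions that are not stated in this section: that 1 8000h is the 17-bit two's-complement encoding of -32768, and that FRCT = 1 shifts the product left by one bit. The function name `fractional_product` is hypothetical.

```python
MASK40 = (1 << 40) - 1


def fractional_product(x, y, smul=1, frct=1, satd=1):
    """Multiply two signed operands and return the 40-bit result pattern."""
    product = x * y
    if frct:
        product <<= 1                      # assumed fractional-mode shift
    if smul and frct and satd and x == -0x8000 and y == -0x8000:
        product = 0x7FFFFFFF               # saturated per the SMUL rule
    return product & MASK40


print(f"{fractional_product(-0x8000, -0x8000):010X}")          # SMUL = 1
print(f"{fractional_product(-0x8000, -0x8000, smul=0):010X}")  # SMUL = 0
```

Without the saturation, the shifted product is 2^31, which no longer fits in a signed 32-bit value; that is the case the SMUL bit exists to catch.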


### 1.3.11 SXMD Status Bit

This status bit controls operations performed in the D unit.

- When SXMD = 0, input operands are zero extended.
- When SXMD = 1, input operands are sign extended.
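
A minimal Python sketch of the two extension modes follows, assuming a 16-bit input operand that is widened to the 40-bit accumulator width. The function name `extend_operand` is hypothetical.

```python
def extend_operand(word16, sxmd):
    """Widen a 16-bit operand to 40 bits according to SXMD."""
    word16 &= 0xFFFF
    if sxmd and (word16 & 0x8000):
        return word16 | 0xFFFFFF0000       # copy bit 15 into bits 39-16
    return word16                          # upper bits stay 0


print(f"{extend_operand(0x8000, 0):010X}")  # zero extended
print(f"{extend_operand(0x8000, 1):010X}")  # sign extended
```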


### 1.3.12 Test Control Status Bit (TCx)

The test control status bits (TC1 or TC2) hold the result of a test performed by the instruction.

### 1.4 Instruction Set Notes and Rules

### 1.4.1 Notes

- Mnemonic syntax keywords and operand modifiers are case insensitive. You can write:

  ABDST *AR0, *ar1, AC0, ac1

  or

  aBdST *ar0, *aR1, aC0, Ac1

- Operands for commutative operations (+, *, \&, |, ^) can be arranged in any order.


### 1.4.2 Rules

- Simple instructions are not allowed to span multiple lines. The one exception is a single instruction that uses the double-colon (::) notation to imply parallelism. Such an instruction may be split across lines following the :: notation.

The following example shows a single instruction (dual multiply) occupying two lines:

```
MPYR40 uns(Xmem), uns(Cmem), ACx
:: MPYR40 uns(Ymem), uns(Cmem), ACy
```

- User-defined parallelism instructions (using || notation) are allowed to span multiple lines. For example, all of the following instructions are legal:

```
MOV AC0, AC1 || MOV AC2, AC3

mov AC0, AC1 ||
mOV AC2, AC3

mov AC0, AC1
    || MOV AC2, AC3

mOV AC0, AC1
||
MOV AC2, AC3
```


### 1.4.2.1 Reserved Words

Register names are reserved and they may not be used as names of identifiers, labels, etc. Mnemonic syntax names are not reserved.

### 1.4.2.2 Mnemonic Syntax Roots

The following root words are used in the mnemonic syntax.

| Root | Meaning |
| :--- | :--- |
| ABS | Absolute value |
| ADD | Addition |
| AND | Bitwise AND |
| B | Branch |
| CALL | Function call |
| CLR | Assign the value to 0 |
| CMP | Compare |
| CNT | Count |
| EXP | Exponent |
| MAC | Multiply and accumulate |
| MAR | Modify auxiliary register content |
| MAS | Multiply and subtract |
| MAX | Maximum |
| MIN | Minimum |
| MOV | Move data |
| MPY | Multiply |
| NEG | Negate (2s complement) |
| NOT | Bitwise complement (1s complement) |
| OR | Bitwise OR |
| POP | Pop from top of the stack |
| PSH | Push to top of the stack |
| RET | Return |
| ROL | Rotate left |
| ROR | Rotate right |
| RPT | Repeat |
| SAT | Saturate |
| SET | Assign the value to 1 |
| SFT | Shift (left or right depending on sign of shift count) |
| SQA | Square and add |
| SQR | Square |
| SQS | Square and subtract |
| SUB | Subtraction |
| SWAP | Swap register contents |
| TST | Test bit |
| XOR | Bitwise exclusive-OR (XOR) |
| XPA | Expand |
| XTR | Extract |

### 1.4.2.3 Mnemonic Syntax Prefixes

The following prefixes are used in the mnemonic syntax.

| Prefix | Meaning |
| :--- | :--- |
| A | Instruction happens in address phase and is subject to circular addressing effects. Also, it occurs in the DAGEN functional unit and cannot be placed in parallel with any instruction that uses dual addressing mode. |
| B | Bit instruction. Note that B is also a root (branch), suffix (borrow), and prefix (bit). The differences in context should prevent any confusion. |

### 1.4.2.4 Mnemonic Syntax Suffixes

Suffixes can be combined. For the multiply variant instructions, the combination order is: M K R \{40, A, Z, or U\}. This list does not imply that all of the suffixes will ever be combined at once; but, when they are combined, they will be in this order.

| Suffix | Meaning |
| :--- | :--- |
| 40 | Enables the M40 mode (all 40 bits of the accumulator count) |
| B | Borrow |
| C | Carry |
| CC | Conditional |
| I | Enable interrupts |
| K | Multiply has a constant operand |
| L | Logical shift (left or right depending on sign of shift count) |
| M | This instruction has the option of assigning a memory operand to T3, regardless of whether that assignment actually occurs. |
| R | Round |
| S | Signed shift (left or right depending on sign of shift count) |
| U | Unsigned |
| V | Absolute value |
| Z | Delay on the memory operand |

### 1.4.2.5 Literal and Address Operands

Literals in the mnemonic strings are denoted as K or k fields. In the Smem address modes that require an offset, the offset is also a literal (K16 or k3). 8-bit and 16-bit literals are allowed to be linktime-relocatable; for other literals, the value must be known at assembly time.

Addresses are the elements of the mnemonic strings denoted by P, L, and I. Further, 16-bit and 24-bit absolute address Smem modes are addresses, as is the dma Smem mode, denoted by the @ syntax. Addresses may be assembly-time constants or symbolic linktime-known constants or expressions.

Both literals and addresses follow syntax rule 1. For addresses only, rules 2 and 3 also apply.

## Rule 1

A valid address or literal is a \# followed by one of the following:

- a number (\#123)
- an identifier (\#FOO)
- a parenthesized expression (\#(FOO + 2))

Note that \# is not used inside the expression.

## Rule 2

When an address is used in a dma, the address does not need to have a leading \#, be it a number, a symbol or an expression. These are all legal:

```
@#123
@123
@#foo
@foo
@#(foo+2)
@(foo+2)
```


## Rule 3

When used in contexts other than dma (such as branch targets or Smem absolute addresses), addresses generally need a leading \#. As a convenience, the \# may be omitted in front of an identifier. These are all legal:

| Branch | Absolute Address |
| :--- | :--- |
| B \#123 | *(\#123) |
| B \#foo | *(\#foo) |
| B foo | *(foo) |
| B \#(foo + 2) | *(\#(foo + 2)) |

These are illegal:

| Branch | Absolute Address |
| :--- | :--- |
| B 123 | *(123) |
| B (foo + 2) | *((foo + 2)) |
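
Rules 1 through 3 can be summarized as a small checker. The Python sketch below is only an illustration of the rules as stated here, not the assembler's actual parser: the function name `is_valid_operand` and the context labels are hypothetical, numbers are limited to decimal digits, and the expression inside the parentheses is not itself validated. The text passed in is the part that follows `@`, follows the branch mnemonic, or sits inside `*( )`.

```python
import re

_IDENT = r"[A-Za-z_]\w*"
_NUMBER = r"\d+"
_PAREN = r"\(.*\)"                         # parenthesized expression, unchecked
_ANY = rf"({_NUMBER}|{_IDENT}|{_PAREN})"


def is_valid_operand(text, context):
    """context: 'literal', 'dma', or 'address' (branch target or absolute)."""
    text = text.replace(" ", "")
    if re.fullmatch("#" + _ANY, text):
        return True                        # Rule 1: leading # is always accepted
    if context == "dma":
        return re.fullmatch(_ANY, text) is not None    # Rule 2: # is optional
    if context == "address":
        return re.fullmatch(_IDENT, text) is not None  # Rule 3: bare identifier only
    return False                           # literals always need the #


for operand in ["#123", "foo", "123", "(foo + 2)", "#(foo + 2)"]:
    print(operand, is_valid_operand(operand, "address"))
```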


### 1.4.2.6 Memory Operands

- Syntax of Smem is the same as that of Lmem or Baddr.
- In the following instruction syntaxes, Smem cannot reference a memory-mapped register (MMR). No instruction can access a byte within a memory-mapped register. If Smem is an MMR in one of the following syntaxes, the DSP sends a hardware bus-error interrupt (BERRINT) request to the CPU.

```
MOV [uns(]high_byte(Smem)[)], dst
MOV [uns(]low_byte(Smem)[)], dst
MOV high_byte(Smem) << #SHIFTW, ACx
MOV low_byte(Smem) << #SHIFTW, ACx
MOV src, high_byte(Smem)
MOV src, low_byte(Smem)
```

- Syntax of Xmem is the same as that of Ymem.
- Syntax of coefficient operands, Cmem:
  - *CDP
  - *CDP+
  - *CDP-
  - *(CDP + T0), when C54CM = 0
  - *(CDP + AR0), when C54CM = 1

When an instruction uses a Cmem operand with paralleled instructions, the pointer modification of the Cmem operand must be the same for both instructions of the paralleled pair or the assembler generates an error. For example:

```
MAC *AR2+, *CDP+, AC0
:: MAC *AR3+, *CDP+, AC1
```

- An optional mmr prefix is allowed to be specified for indirect memory operands, for example, mmr(*AR0). This is an assertion by you that this is an access to a memory-mapped register. The assembler checks whether such access is legal in given circumstances.

  The mmr prefix is supported for Xmem, Ymem, indirect Smem, indirect Lmem, and Cmem operands. It is not supported for direct memory operands; it is expected that an explicit mmap() instruction is used in conjunction with direct memory operands to indicate MMR access.

  Note that the mmr prefix is part of the syntax. It is an implementation restriction that mmr cannot exchange positions with other prefixes around the memory operand, such as dbl or uns. If several prefixes are specified, mmr must be the innermost prefix. Thus, uns(mmr(*AR0)) is legal, but mmr(uns(*AR0)) is not legal.
- The following indirect operands cannot be used for accesses to I/O space. An instruction using one of these operands requires a 2-byte extension for the constant. This extension would prevent the use of the port() qualifier needed to indicate an I/O-space access.

```
*ARn(#K16)
*+ARn(#K16)
*CDP(#K16)
*+CDP(#K16)
```

Also, the following instructions that include the delay operation cannot be used for accesses to I/O space:

```
DELAY Smem
MACM[R]Z [T3 = ] Smem, Cmem, ACx
```

Any illegal access to I/O space will generate a hardware bus-error interrupt (BERRINT) to be handled by the CPU.

### 1.4.2.7 Operand Modifiers

Operand modifiers look like function calls on operands. Note that uns is an operand modifier meaning unsigned and that the instruction suffix U also means unsigned. The operand modifier uns is used when the operand is modified on the way to the rest of the operation (MAC). The instruction suffix U is used when the whole operation is affected (MPYMU, CMPU, BCCU).

| Modifier | Meaning |
| :--- | :--- |
| dbl | Access a true 32-bit memory operand |
| dual | Access a 32-bit memory operand for use as two independent 16-bit halves of the given operation |
| HI | Access upper 16 bits of the accumulator |
| high_byte | Access the high byte of the memory location |
| LO | Access lower 16 bits of the accumulator |
| low_byte | Access the low byte of the memory location |
| pair | Dual register access |
| rnd | Round |
| saturate | Saturate |
| uns | Unsigned operand (not used in MOV instructions) |

When an instruction uses a Cmem operand with paralleled instructions and the Cmem operand is defined as unsigned (uns), both Cmem operands of the paralleled pair must be defined as unsigned (and reciprocally).

When an instruction uses both Xmem and Ymem operands with paralleled instructions and the Xmem operand is defined as unsigned (uns), the Ymem operand must also be defined as unsigned (and reciprocally).

### 1.5 Nonrepeatable Instructions

Table 1-4 lists the instructions that cannot be used in a repeatable instruction.
Table 1-4. Nonrepeatable Instructions

| Instruction Description | Mnemonic Syntax That Cannot Be Repeated |
| :---: | :---: |
| B: Branch Unconditionally | B ACx |
|  | B L7 |
|  | B L16 |
|  | B P24 |
| BCC: Branch Conditionally | BCC l4, cond |
|  | BCC L8, cond |
|  | BCC L16, cond |
|  | BCC P24, cond |
| BCC: Branch on Auxiliary Register Not Zero | BCC L16, ARn_mod != \#0 |
| BCC: Compare and Branch | BCC[U] L8, src RELOP K8 |
| BCLR: Clear Status Register Bit | BCLR k4, STx_55 |
|  | BCLR f-name |
| BSET: Set Status Register Bit | BSET k4, STx_55 |
|  | BSET f-name |
| CALL: Call Unconditionally | CALL ACx |
|  | CALL L16 |
|  | CALL P24 |
| CALLCC: Call Conditionally | CALLCC L16, cond |
|  | CALLCC P24, cond |
| IDLE | IDLE |
| INTR: Software Interrupt | INTR k5 |
| MOV: Load CPU Register from Memory | MOV Smem, DP |
|  | MOV dbl(Lmem), RETA |
| MOV: Load CPU Register with Immediate Value | MOV k16, DP |
| MOV: Move CPU Register Content to Auxiliary or Temporary Register | MOV RPTC, TAx |
| MOV: Store CPU Register Content to Memory | MOV RETA, dbl(Lmem) |
| RESET: Software Reset | RESET |
| RET: Return Unconditionally | RET |
| RETCC: Return Conditionally | RETCC cond |
| RETI: Return from Interrupt | RETI |
| ROUND: Round Accumulator Content | ROUND [ACx,] ACy |
| RPT: Repeat Single Instruction Unconditionally | RPT k8 |
|  | RPT k16 |
|  | RPT CSR |
| RPTADD: Repeat Single Instruction Unconditionally and Increment CSR | RPTADD CSR, TAx |
|  | RPTADD CSR, k4 |
| RPTB: Repeat Block of Instructions Unconditionally | RPTBLOCAL pmad |
| RPTCC: Repeat Single Instruction Conditionally | RPTCC k8, cond |
| RPTSUB: Repeat Single Instruction Unconditionally and Decrement CSR | RPTSUB CSR, k4 |
| TRAP: Software Trap | TRAP k5 |
| XCC: Execute Conditionally | XCC [label, ]cond |

## Parallelism Features and Rules

This chapter describes the parallelism features and rules of the TMS320C55x ${ }^{\text {TM }}$ DSP mnemonic instruction set.
Topics:

- 2.1 Parallelism Features
- 2.2 Parallelism Basics
- 2.3 Resource Conflicts
- 2.4 Soft-Dual Parallelism
- 2.5 Execute Conditionally Instructions
- 2.6 Other Exceptions

### 2.1 Parallelism Features

The C55x ${ }^{\text {TM }}$ DSP architecture enables you to execute two instructions in parallel within the same cycle of execution. The types of parallelism are:

- Built-in parallelism within a single instruction.

Some instructions perform two different operations in parallel. Double colons, ::, are used to separate the two operations. This type of parallelism is also called implied parallelism. For example:

| Instruction | Description |
| :--- | :--- |
| MPY *AR0, *CDP, AC0 <br> :: MPY *AR1, *CDP, AC1 | This is a single instruction. The data referenced by AR0 is multiplied by the coefficient referenced by CDP. At the same time, the data referenced by AR1 is multiplied by the same coefficient (CDP). |

- User-defined parallelism between two instructions.

Two instructions may be paralleled by you or the C compiler. The parallel bars, \|\|, are used to separate the two instructions to be executed in parallel. For example:

| Instruction | Description |
| :--- | :--- |
| MPYM *AR1-, *CDP, AC1 <br> \|\| XOR AR2, T1 | The first instruction performs a multiplication in the D-unit. The second instruction performs a logical operation in the A-unit ALU. |

- Built-in parallelism can be combined with user-defined parallelism. For example:

| Instruction | Description |
| :--- | :--- |
| MPYM T3 = *AR3+, AC1, AC2 <br> \|\| MOV \#5, AR1 | The first instruction includes implied parallelism. The second instruction is paralleled by you. |

### 2.2 Parallelism Basics

In the parallel pair, all of these constraints must be met:

- Total size of both instructions may not exceed 6 bytes.
- No resource conflicts as detailed in section 2.3.
- One instruction must have a parallel enable bit or the pair must qualify for soft-dual parallelism as detailed in section 2.4.
- No memory operand may use an addressing mode that requires a constant that is 16 bits or larger:

```
*abs16(#k16)
*(#k23)
port(#k16)
*ARn(K16)
*+ARn(K16)
*CDP(K16)
*+CDP(K16)
```

- The following instructions cannot be in parallel:
  - BCC P24, cond
  - CALLCC P24, cond
  - IDLE
  - INTR k5
  - RESET
  - TRAP k5
- Neither instruction in the parallel pair can use any of these instruction or operand modifiers:

```
mmap()
port()
<instruction>.CR
<instruction>.LR
```

- A particular register or memory location can only be written once per pipeline phase. Violations of this rule take many forms. Loading the same register twice is a simple case. Other cases include:
  - Conflicting address mode modifications (for example, *AR2+ versus *AR2-)
  - Combining a SWAP instruction (modifies all of its registers) with any other instruction that writes one of the same registers
- Data stack pointer (XSP) or system stack pointer (XSSP) modifications cannot be combined with any of the following instructions:
  - Call Conditionally (if (cond) call instructions)
  - Call Unconditionally (call instructions)
  - Push to Top of Stack (push instructions)
  - Pop from Top of Stack (pop instructions)
  - Return Conditionally (if (cond) return instructions)
  - Return Unconditionally (return instructions)
  - Return from Interrupt (return_int instructions)
  - trap or intr instructions
- When both instructions in a parallel pair modify a status bit, the value of that status bit becomes undefined.


### 2.3 Resource Conflicts

Every instruction uses some set of operators, address generation units, and buses, collectively called resources, while executing. To determine which resources are used by a specific instruction, see Table 4-1. Two instructions in parallel use all the resources of the individual instructions. A resource conflict occurs when two instructions use a combination of resources that is not supported on the C55x device. This section details the resource conflicts.

### 2.3.1 Operators

You may use each of these operators only once:

- D Unit ALU
- D Unit Shift
- D Unit Swap
- A Unit Swap
- A Unit ALU
- P Unit

For an instruction that uses multiple operators, any other instruction that uses one or more of those same operators may not be placed in parallel.

### 2.3.2 Address Generation Units

You may use no more than the indicated number of data address generation units:

- 2 Data Address (DA) Generation Units
- 1 Coefficient Address (CA) Generation Unit
- 1 Stack Address (SA) Generation Unit


### 2.3.3 Buses

You may use no more than the indicated number of buses:

- 2 Data Read (DR) Buses
- 1 Coefficient Read (CR) Bus
- 2 Data Write (DW) Buses
- 1 ACB Bus - brings D-unit registers to A-unit and P-unit operators
- 1 KAB Bus - Constant Bus
- 1 KDB Bus - Constant Bus
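
The limits in sections 2.3.1 through 2.3.3 amount to a per-resource budget for the parallel pair. The following Python sketch shows one way to check a pair against that budget. It is illustrative only: the function name `resource_conflicts` and the short resource keys are hypothetical, and the per-instruction usage in the example is made up, since the real usage comes from Table 4-1.

```python
from collections import Counter

# Maximum uses of each resource across both instructions of a parallel pair.
LIMITS = {
    "D Unit ALU": 1, "D Unit Shift": 1, "D Unit Swap": 1,
    "A Unit Swap": 1, "A Unit ALU": 1, "P Unit": 1,   # operators (2.3.1)
    "DA": 2, "CA": 1, "SA": 1,                        # address generation (2.3.2)
    "DR": 2, "CR": 1, "DW": 2,                        # buses (2.3.3)
    "ACB": 1, "KAB": 1, "KDB": 1,
}


def resource_conflicts(first, second):
    """Return the sorted names of resources the pair uses beyond the limit."""
    total = Counter(first) + Counter(second)
    return sorted(name for name, used in total.items() if used > LIMITS[name])


# Hypothetical usage figures, for illustration only.
multiply = {"D Unit ALU": 1, "DR": 1, "DA": 1}
logical = {"A Unit ALU": 1}
accumulate = {"D Unit ALU": 1, "KDB": 1}

print(resource_conflicts(multiply, logical))      # no conflict
print(resource_conflicts(multiply, accumulate))   # both need the D Unit ALU
```

A pair that passes this check must still satisfy the other constraints in section 2.2, such as the 6-byte size limit.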


### 2.4 Soft-Dual Parallelism

Instructions that reference memory operands do not have parallel enable bits. Two such instructions may still be combined with a type of parallelism called soft-dual parallelism. The constraints of soft-dual parallelism are:

- Both memory operands must meet the constraints of the dual AR indirect addressing mode (Xmem and Ymem), as described in section 3.4.2. The operands available for the dual AR indirect addressing mode are:

```
*ARn
*ARn+
*ARn-
*(ARn + AR0)
*(ARn + T0)
*(ARn - AR0)
*(ARn - T0)
*ARn(AR0)
*ARn(T0)
*(ARn + T1)
*(ARn - T1)
```

- Neither instruction can contain any of the following:
  - Instructions embedding high_byte(Smem) and low_byte(Smem):
    - MOV [uns(]high_byte(Smem)[)], dst
    - MOV [uns(]low_byte(Smem)[)], dst
    - MOV low_byte(Smem) << \#SHIFTW, ACx
    - MOV high_byte(Smem) << \#SHIFTW, ACx
    - MOV src, high_byte(Smem)
    - MOV src, low_byte(Smem)
  - These instructions that read and write the same memory location:
    - BCLR src, Smem
    - BNOT src, Smem
    - BSET src, Smem
    - BTSTCLR k4, Smem, TCx
    - BTSTNOT k4, Smem, TCx
    - BTSTSET k4, Smem, TCx
- With regard to soft-dual parallelism, the AMAR Smem instruction has the same properties as any memory reference instruction.


### 2.4.1 Soft-Dual Parallelism of MAR Instructions

Although the following modify auxiliary register (MAR) instructions do not reference memory and do not have parallel enable bits, they may be combined together or with any other memory reference instructions (not limited to Xmem/Ymem) to form soft-dual parallelism.

- AADD TAx, TAy
- AADD k8, TAx
- AMOV TAx, TAy
- AMOV k8, TAx
- ASUB TAx, TAy
- ASUB k8, TAx

Note that this is not the full list of MAR instructions; instructions AMOV D16, TAx and AMAR Smem are not included.

### 2.5 Execute Conditionally Instructions

The parallelization of the execute conditionally (XCC) instructions does not adhere to the descriptions in this chapter. All of the specific instances of legal XCC parallelism are covered in the XCC descriptions in Chapter 5.

### 2.6 Other Exceptions

The following are other exceptions not covered elsewhere in this chapter.

- An instruction that reads the repeat counter register (RPTC) may not be combined with any single-repeat instruction:
  - RPT
  - RPTADD
  - RPTSUB
  - RPTCC


## Introduction to Addressing Modes

This chapter provides an introduction to the addressing modes of the TMS320C55x™ DSP.
Topics:

- 3.1 Introduction to the Addressing Modes
- 3.2 Absolute Addressing Modes
- 3.3 Direct Addressing Modes
- 3.4 Indirect Addressing Modes
- 3.5 Circular Addressing

### 3.1 Introduction to the Addressing Modes

The TMS320C55x DSP supports three types of addressing modes that enable flexible access to data memory, to memory-mapped registers, to register bits, and to I/O space:

- The absolute addressing mode allows you to reference a location by supplying all or part of an address as a constant in an instruction.
- The direct addressing mode allows you to reference a location using an address offset.
- The indirect addressing mode allows you to reference a location using a pointer.

Each addressing mode provides one or more types of operands. An instruction that supports an addressing-mode operand has one of the following syntax elements listed in Table 3-1.

Table 3-1. Addressing-Mode Operands

| Syntax Element(s) | Description |
| :--- | :--- |
| Baddr | When an instruction contains Baddr, that instruction can access one or two bits in an accumulator (AC0-AC3), an auxiliary register (AR0-AR7), or a temporary register (T0-T3). Only the register bit test/set/clear/complement instructions support Baddr. As you write one of these instructions, replace Baddr with a compatible operand. |
| Cmem | When an instruction contains Cmem, that instruction can access a single word (16 bits) of data from data memory. As you write the instruction, replace Cmem with a compatible operand. |
| Lmem | When an instruction contains Lmem, that instruction can access a long word (32 bits) of data from data memory or from memory-mapped registers. As you write the instruction, replace Lmem with a compatible operand. |
| Smem | When an instruction contains Smem, that instruction can access a single word (16 bits) of data from data memory, from I/O space, or from a memory-mapped register. As you write the instruction, replace Smem with a compatible operand. |
| Xmem and Ymem | When an instruction contains Xmem and Ymem, that instruction can perform two simultaneous 16-bit accesses to data memory. As you write the instruction, replace Xmem and Ymem with compatible operands. |

### 3.2 Absolute Addressing Modes

Table 3-2 lists the absolute addressing modes available.
Table 3-2. Absolute Addressing Modes

| Addressing Mode | Description |
| :--- | :--- |
| k16 absolute | This mode uses the 7-bit register called DPH (high part of the extended data page register) and a 16-bit unsigned constant to form a 23-bit data-space address. This mode is used to access a memory location or a memory-mapped register. |
| k23 absolute | This mode enables you to specify a full address as a 23-bit unsigned constant. This mode is used to access a memory location or a memory-mapped register. |
| I/O absolute | This mode enables you to specify an I/O address as a 16-bit unsigned constant. This mode is used to access a location in I/O space. |

### 3.2.1 k16 Absolute Addressing Mode

The k16 absolute addressing mode uses the operand *abs16(\#k16), where k16 is a 16-bit unsigned constant. DPH (the high part of the extended data page register) and k16 are concatenated to form a 23-bit data-space address.

An instruction using this addressing mode encodes the constant as a 2-byte extension to the instruction. Because of the extension, an instruction using this mode cannot be executed in parallel with another instruction.
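
The concatenation described above can be written out as a short calculation. This Python sketch is illustrative; the function name `k16_absolute_address` is hypothetical.

```python
def k16_absolute_address(dph, k16):
    """Concatenate 7-bit DPH and 16-bit k16 into a 23-bit data-space address."""
    if not 0 <= dph <= 0x7F:
        raise ValueError("DPH is a 7-bit value")
    if not 0 <= k16 <= 0xFFFF:
        raise ValueError("k16 is a 16-bit unsigned constant")
    return (dph << 16) | k16               # DPH supplies bits 22-16


print(f"{k16_absolute_address(0x01, 0x2345):06X}")
```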

### 3.2.2 k23 Absolute Addressing Mode

The k23 absolute addressing mode uses the operand *(\#k23), where k23 is a 23-bit unsigned constant. An instruction using this addressing mode encodes the constant as a 3-byte extension to the instruction (the most-significant bit of this 3-byte extension is discarded). Because of the extension, an instruction using this mode cannot be executed in parallel with another instruction.

Instructions using the operand *(\#k23) to access the memory operand Smem cannot be used in a repeatable instruction. See Table 1-4 for a list of these instructions.

### 3.2.3 I/O Absolute Addressing Mode

The I/O absolute addressing mode uses the port() operand qualifier. Enclose a 16-bit unsigned constant in the parentheses of the port() qualifier, port(\#k16); there is no preceding asterisk, *, in this operand.

An instruction using this addressing mode encodes the constant as a 2-byte extension to the instruction. Because of the extension, an instruction using this mode cannot be executed in parallel with another instruction. The DELAY and MACMZ instructions cannot use this mode.

### 3.3 Direct Addressing Modes

Table 3-3 lists the direct addressing modes available.
Table 3-3. Direct Addressing Modes

| Addressing Mode | Description |
| :--- | :--- |
| DP direct | This mode uses the main data page specified by DPH (high part of the extended data page register) in conjunction with the data page register (DP). This mode is used to access a memory location or a memory-mapped register. |
| SP direct | This mode uses the main data page specified by SPH (high part of the extended stack pointers) in conjunction with the data stack pointer (SP). This mode is used to access stack values in data memory. |
| Register-bit direct | This mode uses an offset to specify a bit address. This mode is used to access one register bit or two adjacent register bits. |
| PDP direct | This mode uses the peripheral data page register (PDP) and an offset to specify an I/O address. This mode is used to access a location in I/O space. |

The DP direct and SP direct addressing modes are mutually exclusive. The mode selected depends on the CPL bit in status register ST1_55:

| CPL | Addressing Mode Selected |
| :--- | :--- |
| 0 | DP direct addressing mode |
| 1 | SP direct addressing mode |

The register-bit and PDP direct addressing modes are independent of the CPL bit.

### 3.3.1 DP Direct Addressing Mode

When an instruction uses the DP direct addressing mode, a 23-bit address is formed. The 7 MSBs are taken from DPH, which selects one of the 128 main data pages (0 through 127). The 16 LSBs are the sum of two values:

- The value in the data page register (DP). DP identifies the start address of a 128-word local data page within the main data page. This start address can be any address within the selected main data page.
- A 7-bit offset (Doffset) calculated by the assembler. The calculation depends on whether you are accessing data memory or a memory-mapped register (using the mmap() qualifier).

The concatenation of DPH and DP is called the extended data page register (XDP). You can load DPH and DP individually, or you can use an instruction that loads XDP.
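
The address formation can be sketched as follows. This Python model is illustrative: the function name `dp_direct_address` is hypothetical, Doffset is taken as already computed by the assembler, and the sum of DP and Doffset is assumed to wrap within the 16 LSBs so that the result stays on the main data page selected by DPH.

```python
def dp_direct_address(dph, dp, doffset):
    """Form a 23-bit address from DPH, DP, and the 7-bit Doffset."""
    if not 0 <= doffset <= 0x7F:
        raise ValueError("Doffset is a 7-bit value")
    low16 = (dp + doffset) & 0xFFFF        # assumed wrap within the main page
    return ((dph & 0x7F) << 16) | low16    # DPH supplies the 7 MSBs


print(f"{dp_direct_address(0x03, 0x1000, 0x05):06X}")
```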

### 3.3.2 SP Direct Addressing Mode

When an instruction uses the SP direct addressing mode, a 23-bit address is formed. The 7 MSBs are taken from SPH. The 16 LSBs are the sum of the SP value and a 7-bit offset that you specify in the instruction. The offset can be a value from 0 to 127. The concatenation of SPH and SP is called the extended data stack pointer (XSP). You can load SPH and SP individually, or you can use an instruction that loads XSP.

On the first main data page, addresses 00 0000h-00 005Fh are reserved for the memory-mapped registers. If any of your data stack is in main data page 0, make sure it uses only addresses 00 0060h-00 FFFFh on that page.
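
A short model makes the reserved-range caution easy to check. In this Python sketch the function names `sp_direct_address` and `overlaps_mmr` are hypothetical, and the sum of SP and the offset is assumed to wrap within the 16 LSBs.

```python
def sp_direct_address(sph, sp, offset):
    """Form a 23-bit address from SPH, SP, and a 7-bit offset (0 to 127)."""
    if not 0 <= offset <= 127:
        raise ValueError("offset must be in the range 0 to 127")
    return ((sph & 0x7F) << 16) | ((sp + offset) & 0xFFFF)


def overlaps_mmr(address):
    """True if the address falls in the range reserved for the MMRs."""
    return 0x000000 <= address <= 0x00005F


address = sp_direct_address(0x00, 0x0050, 5)
print(f"{address:06X}", overlaps_mmr(address))
```

A stack access that resolves into 00 0000h-00 005Fh, as in this example, lands on the memory-mapped registers rather than on stack data.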

### 3.3.3 Register-Bit Direct Addressing Mode

In the register-bit direct addressing mode, the offset you supply in the operand, @bitoffset, is an offset from the LSB of the register. For example, if bitoffset is 0, you are addressing the LSB of a register. If bitoffset is 3, you are addressing bit 3 of the register.

Only the register bit test/set/clear/complement instructions support this mode. These instructions enable you to access bits in the following registers only: the accumulators (AC0-AC3), the auxiliary registers (AR0-AR7), and the temporary registers (T0-T3).

### 3.3.4 PDP Direct Addressing Mode

When an instruction uses the PDP direct addressing mode, a 16-bit I/O address is formed. The 9 MSBs are taken from the 9-bit peripheral data page register (PDP) that selects one of the 512 peripheral data pages (0 through 511). Each page has 128 words (0 to 127). You select a particular word by specifying a 7-bit offset (Poffset) in the instruction. For example, to access the first word on a page, use an offset of 0.

You must use a port() qualifier to indicate that you are accessing an I/O-space location rather than a data-memory location. The port() qualifier must enclose the qualified read or write operand.
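
The 16-bit I/O address is the 9-bit page number followed by the 7-bit word offset. A minimal Python sketch, with the hypothetical function name `pdp_direct_address`:

```python
def pdp_direct_address(pdp, poffset):
    """Form a 16-bit I/O address from 9-bit PDP and 7-bit Poffset."""
    if not 0 <= pdp <= 511:
        raise ValueError("PDP selects page 0 through 511")
    if not 0 <= poffset <= 127:
        raise ValueError("Poffset selects word 0 through 127")
    return (pdp << 7) | poffset            # PDP supplies the 9 MSBs


print(f"{pdp_direct_address(1, 0):04X}")   # first word of peripheral page 1
```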

### 3.4 Indirect Addressing Modes

Table 3-4 lists the indirect addressing modes available. You may use these modes for linear addressing or circular addressing.

Table 3-4. Indirect Addressing Modes

| Addressing Mode | Description |
| :--- | :--- |
| AR indirect | This mode uses one of eight auxiliary registers (AR0-AR7) to point to data. The way the <br> CPU uses the auxiliary register to generate an address depends on whether you are <br> accessing data space (memory or memory-mapped registers), individual register bits, <br> or I/O space. |
| Dual AR indirect | This mode uses the same address-generation process as the AR indirect addressing <br> mode. This mode is used with instructions that access two or more data-memory <br> locations. |
| CDP indirect | This mode uses the coefficient data pointer (CDP) to point to data. The way the CPU <br> uses CDP to generate an address depends on whether you are accessing data space <br> (memory or memory-mapped registers), individual register bits, or I/O space. |
| Coefficient indirect | This mode uses the same address-generation process as the CDP indirect addressing <br> mode. This mode is available to support instructions that can access a coefficient in data <br> memory at the same time they access two other data-memory values using the dual AR <br> indirect addressing mode. |

### 3.4.1 AR Indirect Addressing Mode

The AR indirect addressing mode uses an auxiliary register ARn (n = 0, 1, 2, 3, 4, 5, 6, or 7) to point to data. The way the CPU uses ARn to generate an address depends on the access type:

| For An Access To ... | ARn Contains ... |
| :--- | :--- |
| Data space <br> (memory or registers) | The 16 least significant bits (LSBs) of a 23-bit address. The 7 most significant bits (MSBs) are supplied by ARnH, which is the high part of extended auxiliary register XARn. For accesses to data space, use an instruction that loads XARn; ARn can be individually loaded, but ARnH cannot be loaded. |
| A register bit (or bit pair) | A bit number. Only the register bit test/set/clear/complement instructions support AR indirect accesses to register bits. These instructions enable you to access bits in the following registers only: the accumulators (AC0-AC3), the auxiliary registers (AR0-AR7), and the temporary registers (T0-T3). |
| I/O space | A 16-bit I/O address. |

The AR indirect addressing-mode operands available depend on the ARMS bit of status register ST2_55:

| ARMS | DSP Mode or Control Mode |
| :--- | :--- |
| 0 | DSP mode. The CPU can use the list of DSP mode operands (Table 3-5), which provide efficient execution of DSP-intensive applications. |
| 1 | Control mode. The CPU can use the list of control mode operands (Table 3-6), which enable optimized code size for control system applications. |

Table 3-5 introduces the DSP mode operands available for the AR indirect addressing mode. Table 3-6 introduces the control mode operands. When using the tables, keep in mind that:

- Both pointer modification and address generation are linear or circular according to the pointer configuration in status register ST2_55. The content of the appropriate 16-bit buffer start address register (BSA01, BSA23, BSA45, or BSA67) is added only if circular addressing is activated for the chosen pointer.
- All additions to and subtractions from the pointers are done modulo 64K. You cannot address data across main data pages without changing the value in the extended auxiliary register (XARn).
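
To make the modulo-64K behavior concrete, here is a small Python sketch of linear AR indirect accesses to data space for the simplest operands. It is a simplified model under stated assumptions, not a description of the hardware: the function name `ar_indirect` is hypothetical, circular addressing is ignored, and `step` stands for 1 (16-bit operation) or 2 (32-bit operation).

```python
from typing import Tuple


def ar_indirect(operand: str, arnh: int, arn: int, step: int = 1) -> Tuple[int, int]:
    """Return (23-bit address used, new ARn) for a linear data-space access."""
    if operand == "*ARn":            # no modification
        used, new = arn, arn
    elif operand == "*ARn+":         # post-increment
        used, new = arn, (arn + step) & 0xFFFF
    elif operand == "*ARn-":         # post-decrement
        used, new = arn, (arn - step) & 0xFFFF
    elif operand == "*+ARn":         # pre-increment
        new = (arn + step) & 0xFFFF
        used = new
    elif operand == "*-ARn":         # pre-decrement
        new = (arn - step) & 0xFFFF
        used = new
    else:
        raise ValueError("operand not covered by this sketch")
    # ARnH supplies the 7 MSBs and is never changed by the pointer update.
    return ((arnh & 0x7F) << 16) | used, new
```

Note how a post-increment from FFFFh wraps to 0000h while ARnH stays the same, which is why the pointer cannot cross a main data page boundary on its own.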

Table 3-5. DSP Mode Operands for the AR Indirect Addressing Mode

| Operand | Pointer Modification | Supported Access Types |
| :---: | :---: | :---: |
| *ARn | ARn is not modified. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn+ | ARn is incremented after the address is generated: <br> If 16-bit/1-bit operation: ARn = ARn + 1 <br> If 32-bit/2-bit operation: ARn = ARn + 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn- | ARn is decremented after the address is generated: <br> If 16-bit/1-bit operation: ARn = ARn - 1 <br> If 32-bit/2-bit operation: ARn = ARn - 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *+ARn | ARn is incremented before the address is generated: <br> If 16-bit/1-bit operation: ARn = ARn + 1 <br> If 32-bit/2-bit operation: ARn = ARn + 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *-ARn | ARn is decremented before the address is generated: <br> If 16-bit/1-bit operation: ARn = ARn - 1 <br> If 32-bit/2-bit operation: ARn = ARn - 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + AR0) | The 16-bit signed constant in AR0 is added to ARn after the address is generated: ARn = ARn + AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + T0) | The 16-bit signed constant in T0 is added to ARn after the address is generated: ARn = ARn + T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - AR0) | The 16-bit signed constant in AR0 is subtracted from ARn after the address is generated: ARn = ARn - AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - T0) | The 16-bit signed constant in T0 is subtracted from ARn after the address is generated: ARn = ARn - T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(AR0) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in AR0 is used as an offset from that base pointer. <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(T0) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in T0 is used as an offset from that base pointer. <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(T1) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in T1 is used as an offset from that base pointer. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + T1) | The 16-bit signed constant in T1 is added to ARn after the address is generated: ARn = ARn + T1 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - T1) | The 16-bit signed constant in T1 is subtracted from ARn after the address is generated: ARn = ARn - T1 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + AR0B) | The 16-bit signed constant in AR0 is added to ARn after the address is generated: ARn = ARn + AR0 <br> (The addition is done with reverse carry propagation) <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. <br> Note: When this bit-reverse operand is used, ARn cannot be used as a circular pointer. If ARn is configured in ST2_55 for circular addressing, the corresponding buffer start address register value (BSAxx) is added to ARn, but ARn is not modified so as to remain inside a circular buffer. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + T0B) | The 16-bit signed constant in T0 is added to ARn after the address is generated: ARn = ARn + T0 <br> (The addition is done with reverse carry propagation) <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. <br> Note: When this bit-reverse operand is used, ARn cannot be used as a circular pointer. If ARn is configured in ST2_55 for circular addressing, the corresponding buffer start address register value (BSAxx) is added to ARn, but ARn is not modified so as to remain inside a circular buffer. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - AR0B) | The 16-bit signed constant in AR0 is subtracted from ARn after the address is generated: ARn = ARn - AR0 <br> (The subtraction is done with reverse carry propagation) <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. <br> Note: When this bit-reverse operand is used, ARn cannot be used as a circular pointer. If ARn is configured in ST2_55 for circular addressing, the corresponding buffer start address register value (BSAxx) is added to ARn, but ARn is not modified so as to remain inside a circular buffer. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - T0B) | The 16-bit signed constant in T0 is subtracted from ARn after the address is generated: ARn = ARn - T0 <br> (The subtraction is done with reverse carry propagation) <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. <br> Note: When this bit-reverse operand is used, ARn cannot be used as a circular pointer. If ARn is configured in ST2_55 for circular addressing, the corresponding buffer start address register value (BSAxx) is added to ARn, but ARn is not modified so as to remain inside a circular buffer. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) |
| *ARn(#K16) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant (K16) is used as an offset from that base pointer. <br> Note: When an instruction uses this operand, the constant is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using this operand cannot be executed in parallel with another instruction. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) |
| *+ARn(#K16) | The 16-bit signed constant (K16) is added to ARn before the address is generated: ARn = ARn + K16 <br> Note: When an instruction uses this operand, the constant is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using this operand cannot be executed in parallel with another instruction. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) |
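
The bit-reverse operands in Table 3-5 add with reverse carry propagation, meaning the carry travels from the MSB toward the LSB. One common way to model this in software is to reverse the bits of both operands, add normally, and reverse the result. The Python sketch below takes that approach; the function names are hypothetical, only the addition case is shown, and it is assumed (not confirmed by this section) that a carry out of bit 0 is discarded.

```python
from typing import List


def reverse_bits16(value: int) -> int:
    """Mirror the 16 bits of `value` (bit 0 <-> bit 15, and so on)."""
    result = 0
    for _ in range(16):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(arn: int, step: int) -> int:
    """Add two 16-bit values with the carry propagating from MSB to LSB."""
    total = (reverse_bits16(arn) + reverse_bits16(step)) & 0xFFFF
    return reverse_bits16(total)


def bit_reversed_sequence(start: int, step: int, count: int) -> List[int]:
    """Pointer values produced by repeated post-modification with `step`."""
    values, pointer = [], start
    for _ in range(count):
        values.append(pointer)
        pointer = reverse_carry_add(pointer, step)
    return values
```

With a step of 4 (half of an 8-point buffer), `bit_reversed_sequence(0, 4, 8)` yields the familiar bit-reversed index order 0, 4, 2, 6, 1, 5, 3, 7.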

Table 3-6. Control Mode Operands for the AR Indirect Addressing Mode

| Operand | Pointer Modification | Supported Access Types |
| :---: | :---: | :---: |
| *ARn | ARn is not modified. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn+ | ARn is incremented after the address is generated: <br> If 16-bit/1-bit operation: ARn = ARn + 1 <br> If 32-bit/2-bit operation: ARn = ARn + 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn- | ARn is decremented after the address is generated: <br> If 16-bit/1-bit operation: ARn = ARn - 1 <br> If 32-bit/2-bit operation: ARn = ARn - 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + AR0) | The 16-bit signed constant in AR0 is added to ARn after the address is generated: ARn = ARn + AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + T0) | The 16-bit signed constant in T0 is added to ARn after the address is generated: ARn = ARn + T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - AR0) | The 16-bit signed constant in AR0 is subtracted from ARn after the address is generated: ARn = ARn - AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - T0) | The 16-bit signed constant in T0 is subtracted from ARn after the address is generated: ARn = ARn - T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(AR0) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in AR0 is used as an offset from that base pointer. <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(T0) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in T0 is used as an offset from that base pointer. <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(#K16) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant (K16) is used as an offset from that base pointer. <br> Note: When an instruction uses this operand, the constant is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using this operand cannot be executed in parallel with another instruction. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) |
| *+ARn(#K16) | The 16-bit signed constant (K16) is added to ARn before the address is generated: ARn = ARn + K16 <br> Note: When an instruction uses this operand, the constant is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using this operand cannot be executed in parallel with another instruction. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) |
| *ARn(short(#k3)) | ARn is not modified. ARn is used as a base pointer. The 3-bit unsigned constant (k3) is used as an offset from that base pointer. k3 is in the range 1 to 7. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) |
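
The constant-offset operands differ mainly in whether ARn itself changes. The following Python sketch contrasts them for linear data-space accesses. The function names are hypothetical, and the sketch assumes that the offset addition wraps modulo 64K like the other pointer arithmetic described in this section.

```python
from typing import Tuple


def to_signed16(value: int) -> int:
    """Interpret a 16-bit pattern as a signed constant."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def ar_k16_offset(arnh: int, arn: int, k16: int) -> int:
    """*ARn(#K16): ARn is the base, K16 the signed offset; ARn is unchanged."""
    return ((arnh & 0x7F) << 16) | ((arn + to_signed16(k16)) & 0xFFFF)


def ar_k16_premodify(arnh: int, arn: int, k16: int) -> Tuple[int, int]:
    """*+ARn(#K16): K16 is added to ARn first; returns (address, new ARn)."""
    new_arn = (arn + to_signed16(k16)) & 0xFFFF
    return ((arnh & 0x7F) << 16) | new_arn, new_arn


def ar_short_offset(arnh: int, arn: int, k3: int) -> int:
    """*ARn(short(#k3)): unsigned offset in the range 1 to 7; ARn is unchanged."""
    if not 1 <= k3 <= 7:
        raise ValueError("k3 must be in the range 1 to 7")
    return ((arnh & 0x7F) << 16) | ((arn + k3) & 0xFFFF)
```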

### 3.4.2 Dual AR Indirect Addressing Mode

The dual AR indirect addressing mode enables you to make two data-memory accesses through the eight auxiliary registers, AR0-AR7. As with single AR indirect accesses to data space, the CPU uses an extended auxiliary register to create each 23-bit address. You can use linear addressing or circular addressing for each of the two accesses.

You may use the dual AR indirect addressing mode for:

- Executing an instruction that makes two 16-bit data-memory accesses. In this case, the two data-memory operands are designated in the instruction syntax as Xmem and Ymem. For example:
ADD Xmem, Ymem, ACx
- Executing two instructions in parallel. In this case, each instruction must access a single memory value, designated in the instruction syntaxes as Smem or Lmem. For example:

```
MOV Smem, dst
|| AND Smem, src, dst
```

The operand of the first instruction is treated as an Xmem operand, and the operand of the second instruction is treated as a Ymem operand.

The available dual AR indirect operands are a subset of the AR indirect operands. The ARMS status bit does not affect the set of dual AR indirect operands available.

## Note:

The assembler rejects code in which dual operands use the same auxiliary register with two different auxiliary register modifications. You can use the same ARn for both operands if one of the operands is *ARn or *ARn(T0); neither modifies ARn.
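
The rule in this note can be read as a small predicate. The Python sketch below is one interpretation of the wording above, not the assembler's actual check; the function name and the way operands are passed (register name plus operand form written with the `ARn` placeholder) are hypothetical.

```python
# Operand forms that the note names as not modifying ARn.
NON_MODIFYING = {"*ARn", "*ARn(T0)"}


def dual_operands_accepted(x_reg: str, x_form: str, y_reg: str, y_form: str) -> bool:
    """Return True if the Xmem/Ymem operand pair passes the rule in the note."""
    if x_reg != y_reg:
        return True   # different auxiliary registers: the rule does not apply
    if x_form == y_form:
        return True   # same register, but not two *different* modifications
    # Same register with different forms: allowed only if one form leaves ARn alone.
    return x_form in NON_MODIFYING or y_form in NON_MODIFYING
```

For instance, `*AR1+` paired with `*AR1-` is rejected in this model, while `*AR1+` paired with `*AR1` is accepted.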

Table 3-7 introduces the operands available for the dual AR indirect addressing mode. Note that:

- Both pointer modification and address generation are linear or circular according to the pointer configuration in status register ST2_55. The content of the appropriate 16-bit buffer start address register (BSA01, BSA23, BSA45, or BSA67) is added only if circular addressing is activated for the chosen pointer.
- All additions to and subtractions from the pointers are done modulo 64K. You cannot address data across main data pages without changing the value in the extended auxiliary register (XARn).

Table 3-7. Dual AR Indirect Operands

| Operand | Pointer Modification | Supported Access Types |
| :---: | :---: | :---: |
| *ARn | ARn is not modified. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *ARn+ | ARn is incremented after the address is generated: <br> If 16-bit operation: ARn = ARn + 1 <br> If 32-bit operation: ARn = ARn + 2 | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *ARn- | ARn is decremented after the address is generated: <br> If 16-bit operation: ARn = ARn - 1 <br> If 32-bit operation: ARn = ARn - 2 | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *(ARn + AR0) | The 16-bit signed constant in AR0 is added to ARn after the address is generated: ARn = ARn + AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *(ARn + T0) | The 16-bit signed constant in T0 is added to ARn after the address is generated: ARn = ARn + T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *(ARn - AR0) | The 16-bit signed constant in AR0 is subtracted from ARn after the address is generated: ARn = ARn - AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *(ARn - T0) | The 16-bit signed constant in T0 is subtracted from ARn after the address is generated: ARn = ARn - T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *ARn(AR0) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in AR0 is used as an offset from that base pointer. <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *ARn(T0) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in T0 is used as an offset from that base pointer. <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *(ARn + T1) | The 16-bit signed constant in T1 is added to ARn after the address is generated: ARn = ARn + T1 | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *(ARn - T1) | The 16-bit signed constant in T1 is subtracted from ARn after the address is generated: ARn = ARn - T1 | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |

### 3.4.3 CDP Indirect Addressing Mode

The CDP indirect addressing mode uses the coefficient data pointer (CDP) to point to data. The way the CPU uses CDP to generate an address depends on the access type:

| For An Access To ... | CDP Contains ... |
| :--- | :--- |
| Data space <br> (memory or registers) | The 16 least significant bits (LSBs) of a 23-bit address. The 7 most significant bits (MSBs) are supplied by CDPH, the high part of the extended coefficient data pointer (XCDP). |
| A register bit (or bit pair) | A bit number. Only the register bit test/set/clear/complement instructions support CDP indirect accesses to register bits. These instructions enable you to access bits in the following registers only: the accumulators (AC0-AC3), the auxiliary registers (AR0-AR7), and the temporary registers (T0-T3). |
| I/O space | A 16-bit I/O address. |

Table 3-8 introduces the operands available for the CDP indirect addressing mode. Note that:

- Both pointer modification and address generation are linear or circular according to the pointer configuration in status register ST2_55. The content of the 16-bit buffer start address register BSAC is added only if circular addressing is activated for CDP.
- All additions to and subtractions from CDP are done modulo 64K. You cannot address data across main data pages without changing the value of CDPH (the high part of the extended coefficient data pointer).

Table 3-8. CDP Indirect Operands

| Operand | Pointer Modification | Supported Access Types |
| :--- | :--- | :--- |
| *CDP | CDP is not modified. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register-bit (Baddr) <br> I/O-space (Smem) |
| *CDP+ | CDP is incremented after the address is generated: <br> If 16-bit/1-bit operation: CDP = CDP + 1 <br> If 32-bit/2-bit operation: CDP = CDP + 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register-bit (Baddr) <br> I/O-space (Smem) |
| *CDP- | CDP is decremented after the address is generated: <br> If 16-bit/1-bit operation: CDP = CDP - 1 <br> If 32-bit/2-bit operation: CDP = CDP - 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register-bit (Baddr) <br> I/O-space (Smem) |
| *CDP(#K16) | CDP is not modified. CDP is used as a base pointer. The 16-bit signed constant (K16) is used as an offset from that base pointer. <br> Note: When an instruction uses this operand, the constant is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using this operand cannot be executed in parallel with another instruction. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register-bit (Baddr) |
| *+CDP(#K16) | The 16-bit signed constant (K16) is added to CDP before the address is generated: CDP = CDP + K16 <br> Note: When an instruction uses this operand, the constant is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using this operand cannot be executed in parallel with another instruction. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register-bit (Baddr) |

### 3.4.4 Coefficient Indirect Addressing Mode

The coefficient indirect addressing mode uses the same address-generation process as the CDP indirect addressing mode for data-space accesses. The coefficient indirect addressing mode is supported by select memory-to-memory move and memory initialization instructions and by the following arithmetic instructions:

- Dual multiply (accumulate/subtract)
- Finite impulse response filter
- Multiply
- Multiply and accumulate
- Multiply and subtract

Instructions using the coefficient indirect addressing mode to access data are mainly instructions performing operations with three memory operands per cycle. Two of these operands (Xmem and Ymem) are accessed with the dual AR indirect addressing mode. The third operand (Cmem) is accessed with the coefficient indirect addressing mode. The Cmem operand is carried on the BB bus.

Keep the following facts about the BB bus in mind as you use the coefficient indirect addressing mode:

- The BB bus is not connected to external memory. If a Cmem operand is accessed through the BB bus, the operand must be in internal memory.
- Although the following instructions access Cmem operands, they do not use the BB bus to fetch the 16-bit or 32-bit Cmem operand.

| Instruction <br> Syntax | Description of <br> Cmem Access | Bus Used to <br> Access Cmem |
| :--- | :--- | :--- |
| MOV Cmem, Smem | 16-bit read from Cmem | DB |
| MOV Smem, Cmem | 16-bit write to Cmem | EB |
| MOV Cmem, dbl(Lmem) | 32-bit read from Cmem | CB for most significant <br> word (MSW) <br> DB for least significant <br> word (LSW) |
| MOV dbl(Lmem), Cmem | 32-bit write to Cmem | FB for MSW <br> EB for LSW |

Consider the following instruction syntax. In one cycle, two multiplications can be performed in parallel. One memory operand (Cmem) is common to both multiplications, while dual AR indirect operands (Xmem and Ymem) are used for the other values in the multiplication.

```
MPY Xmem, Cmem, ACx
:: MPY Ymem, Cmem, ACy
```

To access three memory values (as in the above example) in a single cycle, the value referenced by Cmem must be located in a memory bank different from the one containing the Xmem and Ymem values.

Table 3-9 introduces the operands available for the coefficient indirect addressing mode. Note that:

- Both pointer modification and address generation are linear or circular according to the pointer configuration in status register ST2_55. The content of the 16-bit buffer start address register BSAC is added only if circular addressing is activated for CDP.
- All additions to and subtractions from CDP are done modulo 64K. You cannot address data across main data pages without changing the value of CDPH (the high part of the extended coefficient data pointer).

Table 3-9. Coefficient Indirect Operands

| Operand | Pointer Modification | Supported Access Type |
| :---: | :---: | :---: |
| *CDP | CDP is not modified. | Data-memory |
| *CDP+ | CDP is incremented after the address is generated: <br> If 16-bit operation: CDP = CDP + 1 <br> If 32-bit operation: CDP = CDP + 2 | Data-memory |
| *CDP- | CDP is decremented after the address is generated: <br> If 16-bit operation: CDP = CDP - 1 <br> If 32-bit operation: CDP = CDP - 2 | Data-memory |
| *(CDP + AR0) | The 16-bit signed constant in AR0 is added to CDP after the address is generated: CDP = CDP + AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory |
| *(CDP + T0) | The 16-bit signed constant in T0 is added to CDP after the address is generated: CDP = CDP + T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory |

### 3.5 Circular Addressing

Circular addressing can be used with any of the indirect addressing modes. Each of the eight auxiliary registers (AR0-AR7) and the coefficient data pointer (CDP) can be independently configured to be linearly or circularly modified as it acts as a pointer to data or to register bits (see Table 3-10). This configuration is done with a bit (ARnLC or CDPLC) in status register ST2_55. To choose circular modification, set the bit.

Table 3-10. Circular Addressing Pointers

| Pointer | Linear/Circular Configuration Bit | Supplier of Main Data Page | Buffer Start Address Register | Buffer Size Register |
| :---: | :---: | :---: | :---: | :---: |
| AR0 | ST2_55(0) = AR0LC | AR0H | BSA01 | BK03 |
| AR1 | ST2_55(1) = AR1LC | AR1H | BSA01 | BK03 |
| AR2 | ST2_55(2) = AR2LC | AR2H | BSA23 | BK03 |
| AR3 | ST2_55(3) = AR3LC | AR3H | BSA23 | BK03 |
| AR4 | ST2_55(4) = AR4LC | AR4H | BSA45 | BK47 |
| AR5 | ST2_55(5) = AR5LC | AR5H | BSA45 | BK47 |
| AR6 | ST2_55(6) = AR6LC | AR6H | BSA67 | BK47 |
| AR7 | ST2_55(7) = AR7LC | AR7H | BSA67 | BK47 |
| CDP | ST2_55(8) = CDPLC | CDPH | BSAC | BKC |

Each auxiliary register ARn has its own linear/circular configuration bit in ST2_55:

| ARnLC | ARn Is Used For ... |
| :--- | :--- |
| 0 | Linear addressing |
| 1 | Circular addressing |

The CDPLC bit in status register ST2_55 configures the DSP to use CDP for linear addressing or circular addressing:

| CDPLC | CDP Is Used For ... |
| :--- | :--- |
| 0 | Linear addressing |
| 1 | Circular addressing |

You can use the circular addressing instruction qualifier, .CR, if you want every pointer used by the instruction to be modified circularly. Just add .CR to the end of the instruction mnemonic (for example, ADD.CR). The circular addressing instruction qualifier overrides the linear/circular configuration in ST2_55.
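
The pointer configuration in Table 3-10 and the .CR override can be captured as a lookup table. The Python sketch below is illustrative only: the names `CIRCULAR_CONFIG`, `uses_circular_addressing`, and `buffer_registers` are hypothetical, and the sketch decides only whether a pointer is modified circularly, not how the circular address itself is computed.

```python
from typing import Tuple

# Pointer -> (ST2_55 bit position, buffer start address register, buffer size register)
CIRCULAR_CONFIG = {
    "AR0": (0, "BSA01", "BK03"),
    "AR1": (1, "BSA01", "BK03"),
    "AR2": (2, "BSA23", "BK03"),
    "AR3": (3, "BSA23", "BK03"),
    "AR4": (4, "BSA45", "BK47"),
    "AR5": (5, "BSA45", "BK47"),
    "AR6": (6, "BSA67", "BK47"),
    "AR7": (7, "BSA67", "BK47"),
    "CDP": (8, "BSAC", "BKC"),
}


def uses_circular_addressing(pointer: str, st2_55: int, cr_qualifier: bool = False) -> bool:
    """True if `pointer` is modified circularly for this instruction."""
    if cr_qualifier:
        return True  # .CR overrides the ST2_55 configuration
    bit, _, _ = CIRCULAR_CONFIG[pointer]
    return bool((st2_55 >> bit) & 1)


def buffer_registers(pointer: str) -> Tuple[str, str]:
    """Return the (buffer start address, buffer size) register names for `pointer`."""
    _, bsa, bk = CIRCULAR_CONFIG[pointer]
    return bsa, bk
```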

## Instruction Set Summary

This chapter provides a summary of the TMS320C55x™ DSP mnemonic instruction set (Table 4-1). With each instruction, you will find the availability of a parallel enable bit, the word count (size), the cycle time, the pipeline phase in which the instruction executes, the operator unit in which the instruction executes, how many of each address generation unit are used, and how many of each bus are used.

Table 4-1 does not list all of the resources that may be used by an instruction; it lists only those that may result in a resource conflict and thus prevent two instructions from being in parallel. If an instruction lists nothing in a particular column, that particular resource will never be in conflict for that instruction.

The column heads of Table 4-1 are:
$\square$ Instruction: In cases where the resource usage of an instruction varies with the kinds of registers, you see the notation <name>-AU for A-unit registers and <name>-DU for D-unit registers. So, dst-AU is a destination that is an A-unit register and src-DU is a source that is a D-unit register. In the few cases where that notation is insufficient, you see the cases listed in the Notes column.
$\square$ E: Whether that instruction has a parallel enable bit
$\square$ S: The size of the instruction in bytes
$\square$ C: Number of cycles required for the instruction
$\square$ Pipe: The pipeline phase in which the instruction executes:

| Name | Phase |
| :--- | :--- |
| AD | Address |
| D | Decode |
| R | Read |
| X | Execute |

$\square$ Operator: Which operator(s) are used by this instruction. When an instruction uses multiple operators, any other instruction that uses one or more of those same operators may not be placed in parallel.

$\square$ Address Generation Unit: How many of each address generation unit is used. The address generation units are:

| Name | Unit |
| :--- | :--- |
| DA | Data Address Generation Unit |
| CA | Coefficient Address Generation Unit |
| SA | Stack Address Generation Unit |

$\square$ Buses: How many of each bus is used. The buses are:

| Name | Bus |
| :--- | :--- |
| DR | Data Read |
| CR | Coefficient Read |
| DW | Data Write |
| ACB | Brings D-unit registers to A-unit and P-unit operators |
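
The Operator rule in the column list above lends itself to a simple set-intersection check. The Python sketch below is a minimal illustration under stated assumptions: the helper names are hypothetical, Operator cells are assumed to be written as names joined by `+` (as in `DU_BIT + AU_ALU`), and any shared operator is treated as a conflict. It checks the Operator column only and ignores every other parallelism constraint.

```python
def parse_operators(cell):
    """Split an Operator cell such as 'DU_BIT + AU_ALU' into a set of names."""
    return {name.strip() for name in cell.split("+") if name.strip()}


def operators_conflict(cell_a, cell_b):
    """Return True if two instructions share at least one operator."""
    return bool(parse_operators(cell_a) & parse_operators(cell_b))


# Example: an instruction using DU_BIT + AU_ALU against one using AU_ALU.
shared = operators_conflict("DU_BIT + AU_ALU", "AU_ALU")
```

An empty Operator cell yields an empty set, so an instruction that lists no operator never conflicts in this model.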


Table 4-1. Mnemonic Instruction Set Summary (Continued)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  | ADD Smem, [src,] dst-DU | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . | See Note 1. |
| [5] | ADD ACx << Tx, ACy | Y | 2 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [6] | ADD ACx << \#SHIFTW, ACy | Y | 3 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [7] | ADD K16 << \#16, [ACx,] ACy | N | 4 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |
| [8] | ADD K16 << \#SHFT, [ACx,] ACy | N | 4 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [9] | ADD Smem << Tx, [ACx,] ACy | N | 3 | 1 | X | DU_SHIFT | 1 | . | . | 1 | . | . | . |  |
| [10] | ADD Smem << \#16, [ACx,] ACy | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . |  |
| [11] | ADD [uns(]Smem[)], CARRY, [ACx,] ACy | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . |  |
| [12] | ADD [uns(]Smem[)], [ACx,] ACy | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . |  |
| [13] | ADD [uns(]Smem[)] << \#SHIFTW, [ACx,] ACy | N | 4 | 1 | X | DU_SHIFT | 1 | . | . | 1 | . | . | . |  |
| [14] | ADD dbl(Lmem), [ACx,] ACy | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 | . | . | . |  |
| [15] | ADD Xmem, Ymem, ACx | N | 3 | 1 | X | DU_ALU | 2 | . | . | 2 | . | . | . |  |
| [16] | ADD K16, Smem | N | 4 | 1 | X | DU_ALU | 1 | . | . | 1 | . | 1 | . |  |
|  | ADDV: Addition with Absolute Value (page 5-54) |  |  |  |  |  |  |  |  |  |  |  |  |  |
|  | ADD[R]V [ACx,] ACy | Y | 2 | 1 | X | DU_MAC1 | . | . | . | . | . | . | . |  |
|  | ADD: Dual 16-Bit Additions (page 5-35) |  |  |  |  |  |  |  |  |  |  |  |  |  |
| [1] | ADD dual(Lmem), [ACx,] ACy | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 | . | . | . |  |
| [2] | ADD dual(Lmem), Tx, ACx | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 | . | . | . |  |

## ADD::MOV: Addition with Parallel Store Accumulator Content to Memory (page 5-40)



Notes: 1) dst-DU, src-AU or dst-DU, src-DU
2) dst-DU, src-AU or dst-AU, src-DU





## ASUB: Modify Auxiliary or Temporary Register Content by Subtraction (page 5-85)

| No. | Instruction | E | S | C | Pipe |
| :--- | :--- | :--- | :--- | :--- | :--- |
| [1] | ASUB TAx, TAy | N | 3 | 1 | AD |
| [2] | ASUB P8, TAx | N | 3 | 1 | AD |

ASUB: Modify Extended Auxiliary Register Content by Subtraction (page 5-89)
ASUB XACsrc, XACdst

B: Branch Unconditionally (page 5-91)

| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | B ACx | N | 2 | 10 | X | PU_UNIT |
| [2] | B L7 | Y | 2 | $6^{\dagger}$ | AD | PU_UNIT |
| [3] | B L16 | Y | 3 | $6^{\dagger}$ | AD | PU_UNIT |
| [4] | B P24 | N | 4 | 5 | D | PU_UNIT |

$\dagger$ These instructions execute in 3 cycles if the addressed instruction is in the instruction buffer unit.

## BAND: Bitwise AND Memory with Immediate Value and Compare to Zero (page 5-95)

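| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | BAND Smem, k16, TC1 | N | 4 | 1 | X | AU_ALU | 1 | . | . | 1 |
| [2] | BAND Smem, k16, TC2 | N | 4 | 1 | X | AU_ALU | 1 | . | . | 1 |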

## BCC: Branch Conditionally (page 5-96)

| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | BCC l4, cond | N | 2 | $6/5^{\dagger}$ | R | PU_UNIT |
| [2] | BCC L8, cond | Y | 3 | $6/5^{\dagger}$ | R | PU_UNIT |
| [3] | BCC L16, cond | N | 4 | $6/5^{\dagger}$ | R | PU_UNIT |
| [4] | BCC P24, cond | N | 5 | $5/5^{\dagger}$ | R | PU_UNIT |

$\dagger \mathrm{x} / \mathrm{y}$ cycles: x cycles $=$ condition true, y cycles $=$ condition false
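
The x/y notation can be turned into a rough average cost when the fraction of taken branches is known. The Python sketch below is only a back-of-the-envelope illustration: the function names are hypothetical, the cell strings (for example `6/5†`) are an assumed plain-text rendering of the C column, and the weighted average ignores pipeline and instruction buffer effects.

```python
def parse_cycles(cell):
    """Parse a C-column entry such as '6/5' or '1' into (true, false) cycles."""
    text = cell.replace("†", "").strip()
    if "/" in text:
        when_true, when_false = text.split("/")
        return int(when_true), int(when_false)
    # A single number applies whether or not a condition is involved.
    return int(text), int(text)


def average_cycles(cell, true_fraction):
    """Weighted average of the condition-true and condition-false counts."""
    when_true, when_false = parse_cycles(cell)
    return true_fraction * when_true + (1.0 - true_fraction) * when_false
```

For a 6/5 entry whose condition is true a quarter of the time, this gives 0.25 × 6 + 0.75 × 5 = 5.25 cycles on average.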

BCC: Branch on Auxiliary Register Not Zero (page 5-100)

| Instruction | E | S | C | Pipe | Operator |
| :--- | :---: | :---: | :---: | :---: | :---: |
| BCC L16, ARn_mod != \#0 | N | 4 | $6/5^{\dagger}$ | AD | PU_UNIT |

BCC: Compare and Branch (page 5-103)

| Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| BCC[U] L8, src-AU RELOP K8 | N | 4 | $7/6^{\dagger}$ | X | AU_ALU + PU_UNIT | . | . | . | . | . | . | . |  |
| BCC[U] L8, src-DU RELOP K8 | N | 4 | $7/6^{\dagger}$ | X | DU_ALU + PU_UNIT | . | . | . | . | . | . | . |  |

$\dagger$ x/y cycles: x cycles = condition true, y cycles = condition false

BCLR: Clear Accumulator, Auxiliary, or Temporary Register Bit (page 5-106)

$\dagger$ When this instruction is decoded to modify status bit CAFRZ (15), CAEN (14), or CACLR (13), the CPU pipeline is flushed and the instruction is executed in 5 cycles regardless of the instruction context.

## BCNT: Count Accumulator Bits (page 5-111)

| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | BCNT ACx, ACy, TC1, Tx | Y | 3 | 1 | X | DU_BIT + AU_ALU |
| [2] | BCNT ACx, ACy, TC2, Tx | Y | 3 | 1 | X | DU_BIT + AU_ALU |



BFXPA: Expand Accumulator Bit Field (page 5-112)
BFXPA k16, ACx, dst-AU




BFXTR: Extract Accumulator Bit Field (page 5-113)
BFXTR k16, ACx, dst-AU


BNOT: Complement Accumulator, Auxiliary, or Temporary Register Bit (page 5-114)


BSET: Set Accumulator, Auxiliary, or Temporary Register Bit (page 5-116)

| Instruction | E | S | C | Pipe | Operator |
| :--- | :---: | :---: | :---: | :---: | :---: |
| BSET Baddr, src-AU | N | 3 | 1 | X | AU_ALU |
| BSET Baddr, src-DU | N | 3 | 1 | X | DU_BIT |

BSET: Set Memory Bit (page 5-117)
BSET src, Smem


BSET: Set Status Register Bit (page 5-118)

| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | BSET k4, ST0_55 | Y | 2 | 1 | X | AU_ALU |
| [2] | BSET k4, ST1_55 | Y | 2 | 1 | X | AU_ALU |
| [3] | BSET k4, ST2_55 | Y | 2 | 1 | X | AU_ALU |
| [4] | BSET k4, ST3_55 | Y | 2 | $1^{\dagger}$ | X | AU_ALU |
| [5] | BSET f-name | Y | 2 | $1^{\dagger}$ | X | AU_ALU |

$\dagger$ When this instruction is decoded to modify status bit CAFRZ (15), CAEN (14), or CACLR (13), the CPU pipeline is flushed and the instruction is executed in 5 cycles regardless of the instruction context.


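BTSTCLR: Test and Clear Memory Bit (page 5-126)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | BTSTCLR k4, Smem, TC1 | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | 1 | . |
| [2] | BTSTCLR k4, Smem, TC2 | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | 1 | . |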

## BTSTNOT: Test and Complement Memory Bit (page 5-127)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | BTSTNOT k4, Smem, TC1 | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | 1 | . |
| [2] | BTSTNOT k4, Smem, TC2 | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | 1 | . |

BTSTP: Test Accumulator, Auxiliary, or Temporary Register Bit Pair (page 5-128)




## CALLCC: Call Conditionally (page 5-135)

| No. | Instruction | E | S | C | Pipe | Operator |  |  |  |  |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | CALLCC L16, cond | N | 4 | $6/5^{\dagger}$ | R | PU_UNIT | 1 | 1 |  | 2 |
| [2] | CALLCC P24, cond | N | 5 | $5/5^{\dagger}$ | R | PU_UNIT | 1 | 1 |  | 2 |

$\dagger \mathrm{x} / \mathrm{y}$ cycles: x cycles $=$ condition true, y cycles $=$ condition false

## CMP: Compare Memory with Immediate Value (page 5-141)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | CMP Smem == K16, TC1 | N | 4 | 1 | X | AU_ALU | 1 | . | . | 1 |
| [2] | CMP Smem == K16, TC2 | N | 4 | 1 | X | AU_ALU | 1 | . | . | 1 |

CMP: Compare Accumulator, Auxiliary, or Temporary Register Content (page 5-143)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | CMP[U] src-AU RELOP dst-AU, TC1 | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | CMP[U] src RELOP dst, TC1 | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 | See Note 2. |
|  | CMP[U] src-DU RELOP dst-DU, TC1 | Y | 3 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |
| [2] | CMP[U] src-AU RELOP dst-AU, TC2 | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | CMP[U] src RELOP dst, TC2 | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 | See Note 2. |
|  | CMP[U] src-DU RELOP dst-DU, TC2 | Y | 3 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |

CMPAND: Compare Accumulator, Auxiliary, or Temporary Register Content with AND (page 5-145)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | CMPAND[U] src-AU RELOP dst-AU, TCy, TCx | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | CMPAND[U] src RELOP dst, TCy, TCx | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 | See Note 2. |
|  | CMPAND[U] src-DU RELOP dst-DU, TCy, TCx | Y | 3 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |
| [2] | CMPAND[U] src-AU RELOP dst-AU, !TCy, TCx | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | CMPAND[U] src RELOP dst, !TCy, TCx | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 | See Note 2. |
|  | CMPAND[U] src-DU RELOP dst-DU, !TCy, TCx | Y | 3 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |


| No. Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| CMPOR: Compare Accumulator, Auxiliary, or Temporary Register Content with OR (page |  |  |  |  |  |  |  |  |  |  |  |  |  |
| [1] CMPOR[U] src-AU RELOP dst-AU, TCy, TCx | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
| CMPOR[U] src RELOP dst, TCy, TCx | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 | See Note 2. |
| CMPOR[U] src-DU RELOP dst-DU, TCy, TCx | Y | 3 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |
| [2] CMPOR[U] src-AU RELOP dst-AU, !TCy, TCx | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
| CMPOR[U] src RELOP dst, !TCy, TCx | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 | See Note 2. |
| CMPOR[U] src-DU RELOP dst-DU, !TCy, TCx | Y | 3 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |
| .CR: Circular Addressing Qualifier (page 5-155) |  |  |  |  |  |  |  |  |  |  |  |  |  |
| <instruction>.CR | N | 1 | 1 | AD |  |  |  |  |  |  |  |  |  |
| DELAY: Memory Delay (page 5-156) |  |  |  |  |  |  |  |  |  |  |  |  |  |
| DELAY Smem | N | 2 | 1 | X |  | 2 | 1 | . | 1 | 1 | 1 | . |  |

EXP: Compute Exponent of Accumulator Content (page 5-157)


FIRSADD: Finite Impulse Response Filter, Symmetrical (page 5-158)

| Instruction | E | S |
| :--- | :--- | :--- |
| FIRSADD Xmem, Ymem, Cmem, ACx, ACy | N | 4 |

FIRSSUB: Finite Impulse Response Filter, Antisymmetrical (page 5-160)

FIRSSUB Xmem, Ymem, Cmem, ACx, ACy


IDLE (page 5-162)
IDLE

INTR: Software Interrupt (page 5-163)
INTR k5

N 230
PU_UNIT
$-\mathrm{ALU}+$

: Lock Access Qualifier (page 5-165)

LMS: Least Mean Square (page 5-167)
LMS Xmem, Ymem, ACx, ACy


<instruction>.LR

MAC: Multiply and Accumulate (page 5-174)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAC[R] ACx, Tx, ACy[, ACy] | Y | 2 | 1 | X | DU_MAC1 | . | . | . | . | . | . | . |
| [2] | MAC[R] ACy, Tx, ACx, ACy | Y | 2 | 1 | X | DU_MAC1 | . | . | . | . | . | . | . |
| [3] |  | Y | 3 | 1 | X | DU_MAC1 | . | . | . | . | . | . | . |
| [4] |  | N | 4 | 1 | X | DU_MAC1 | . | . | . | . | . | . | . |
| [5] |  | N | 3 | 1 | X | DU_MAC1 | 1 | 1 | . | 1 | 1 | . | . |
| [6] |  | N | 3 | 1 | X | DU_MAC1 | 1 | . | . | 1 | . | . | . |
| [7] |  | N | 3 | 1 | X | DU_MAC1 | 1 | . | . | 1 | . | . | . |
| [8] |  | N | 4 | 1 | X | DU_MAC1 | 1 | . | . | 1 | . | . | . |
| [9] |  | N | 4 | 1 | X | DU_MAC1 | 2 | . | . | 2 | . | . | . |
| [10] |  | N | 4 | 1 | X | DU_MAC1 | 2 | . | . | 2 | . | . | . |
| [11] | MAC[R] Smem, uns(Cmem), ACx | N | 3 | 1 | X | DU_MAC1 | 1 | 1 | . | 1 | 1 | . | . |

MACMZ: Multiply and Accumulate with Parallel Delay (page 5-191)

| Instruction | E | S | C | Operator |
| :--- | :---: | :---: | :---: | :---: |
| MACM[R]Z [T3 = ]Smem, Cmem, ACx | N | 3 | 1 | DU_MAC1 |


## MAC::MAS: Multiply and Accumulate with Parallel Multiply and Subtract (page 5-228)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | DR | CR |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 1 | 2 |
| [2] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 1 | 2 |
| [3] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 2 | 2 |
| [4] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 2 | 2 |
| [5] | MAC[R][40] [uns(]Ymem[)], [uns(]HI(coef(Cmem))[)], ACy, :: MAS[R][40] [uns(]Xmem[)], [uns(]LO(coef(Cmem))[)], ACx | N | 5 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | 2 | 2 |
| [6] | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(coef(Cmem))[)], ACy >> \#16, :: MAS[R][40] [uns(]LO(Xmem)[)], [uns(]LO(coef(Cmem))[)], ACx | N | 5 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | 2 | 2 |

MAC::MPY: Multiply and Accumulate with Parallel Multiply (page 5-248)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | DR | CR |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | 2 | 1 |
| [2] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 1 | 2 |
| [3] |  | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 1 | 2 |
| [4] |  | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 2 | 2 |
| [5] |  | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 2 | 2 |
| [6] |  | N | 5 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | 2 | 2 |

MACM::MOV: Multiply and Accumulate with Parallel Load Accumulator from Memory (page 5-267)
MACM[R] [T3 = ]Xmem, Tx, ACx
:: MOV Ymem << \#16, AC

MACM::MOV: Multiply and Accumulate with Parallel Store Accumulator Content to Memory (page 5-269)



## MAS::MAC: Multiply and Subtract with Parallel Multiply and Accumulate (page 5-286)

[1] MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], AC
[2] MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16
[3] MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx
[4] MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx

MAS::MAS: Parallel Multiply and Subtracts (page 5-297)
[1] MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MAS[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy
[2] MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx
[3] MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy
[4] MAS[R][40] [uns(]HI(Ymem)[)], [uns(]HI(coef(Cmem))[)], ACy, :: MAS[R][40] [uns(]LO(Xmem)[)], [uns(]LO(coef(Cmem))[)], AC



## MAS::MPY: Multiply and Subtract with Parallel Multiply (page 5-309)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | DR | CR |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | 2 | 1 |
| [2] | MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 1 | 2 |
| [3] | MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 2 | 2 |

MASM::MOV: Multiply and Subtract with Parallel Load Accumulator from Memory (page 5-318)

| Instruction | E | S | C | Pipe | Operator |  |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| MASM[R] [T3 = ]Xmem, Tx, ACx | N | 4 | 1 | X | DU_MAC1 | 2 |


MASM::MOV: Multiply and Subtract with Parallel Store Accumulator Content to Memory (page 5-320)
MASM[R] [T3 = ]Xmem, Tx, ACy
:: MOV HI(ACx << T2), Ymem

MAX: Compare Accumulator, Auxiliary, or Temporary Register Content Maximum (page 5-322)


MAXDIFF: Compare and Select Accumulator Content Maximum (page 5-325)

- $\left\lvert\, \begin{array}{lll}\text { y } & 3 & 1\end{array}\right.$
[2] DMAXDIFF ACx, ACy, ACz, ACw, TRN
MIN: Compare Accumulator, Auxiliary, or Temporary Register Content Minimum (page 5-331)


MINDIFF: Compare and Select Accumulator Content Minimum (page 5-334)
[1] MINDIFF ACx, ACy, ACz, ACw



| No. Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| mmap: Memory-Mapped Register Access Qualifier (page 5-340) |  |  |  |  |  |  |  |  |  |  |  |  |  |
| mmap | N | 1 | 1 | D |  | . | . | . | . | . | . | . |  |
| MOV: Load Accumulator from Memory (page 5-342) |  |  |  |  |  |  |  |  |  |  |  |  |  |
| [1] MOV [rnd(]Smem << Tx[)], ACx | N | 3 | 1 | X | DU_SHIFT | 1 | . | . | 1 | . | . | . |  |
| [2] MOV low_byte(Smem) << \#SHIFTW, ACx | N | 3 | 1 | X | DU_SHIFT | 1 | . | . | 1 | . | . | . |  |
| [3] MOV high_byte(Smem) << \#SHIFTW, ACx | N | 3 | 1 | X | DU_SHIFT | 1 | . | . | 1 | . | . | . |  |
| [4] MOV Smem << \#16, ACx | N | 2 | 1 | X | DU_LOAD | 1 | . | . | 1 | . | . | . |  |
| [5] MOV [uns(]Smem[)], ACx | N | 3 | 1 | X | DU_LOAD | 1 | . | . | 1 | . | . | . |  |
| [6] MOV [uns(]Smem[)] << \#SHIFTW, ACx | N | 4 | 1 | X | DU_SHIFT | 1 | . | . | 1 | . | . | . |  |
| [7] MOV[40] dbl(Lmem), ACx | N | 3 | 1 | X | DU_LOAD | 1 | . | . | 2 | . | . | . |  |
| [8] MOV Xmem, Ymem, ACx | N | 3 | 1 | X | DU_LOAD | 2 | . | . | 2 | . | . | . |  |
| MOV: Load Accumulator Pair from Memory (page 5-351) |  |  |  |  |  |  |  |  |  |  |  |  |  |
| [1] MOV dbl(Lmem), pair(HI(ACx)) | N | 3 | 1 | X | DU_LOAD | 1 | . | . | 2 | . | . | . |  |
| [2] MOV dbl(Lmem), pair(LO(ACx)) | N | 3 | 1 | X | DU_LOAD | 1 | . | . | 2 | . | . | . |  |

MOV: Load Accumulator with Immediate Value (page 5-354)

| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | MOV K16 << \#16, ACx | N | 4 | 1 | X | DU_LOAD |
| [2] | MOV K16 << \#SHFT, ACx | N | 4 | 1 | X | DU_SHIFT |




## MOV: Load Accumulator, Auxiliary, or Temporary Register from Memory (page 5-357)


## MOV: Load Accumulator, Auxiliary, or Temporary Register with Immediate Value (page 5-363)

| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | MOV k4, dst-AU | Y | 2 | 1 | X | AU_LOAD |
|  | MOV k4, dst-DU | Y | 2 | 1 | X | DU_LOAD |
| [2] | MOV -k4, dst-AU | Y | 2 | 1 | X | AU_LOAD |
|  | MOV -k4, dst-DU | Y | 2 | 1 | X | DU_LOAD |
| [3] | MOV K16, dst-AU | N | 4 | 1 | X | AU_LOAD |
|  | MOV K16, dst-DU | N | 4 | 1 | X | DU_LOAD |

MOV: Load Auxiliary or Temporary Register Pair from Memory (page 5-367)
MOV dbl(Lmem), pair(TAx)

MOV: Load CPU Register from Memory (page 5-368)



| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [16] | MOV Smem, SP | N | 3 | 1 | X | AU_LOAD | 1 | . | . | 1 | . | . | . |  |
| [17] | MOV Smem, SSP | N | 3 | 1 | X | AU_LOAD | 1 | . | . | 1 | . | . | . |  |
| [18] | MOV Smem, TRN0 | N | 3 | 1 | X | DU_LOAD | 1 | . | . | 1 | . | . | . |  |
| [19] | MOV Smem, TRN1 | N | 3 | 1 | X | DU_LOAD | 1 | . | . | 1 | . | . | . |  |
| [20] | MOV dbl(Lmem), RETA | N | 3 | 5 | X |  | 1 | . | . | 2 | . | . | . |  |

MOV: Load CPU Register with Immediate Value (page 5-371)

| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | MOV k12, BK03 | Y | 3 | 1 | AD | AU_LOAD |
| [2] | MOV k12, BK47 | Y | 3 | 1 | AD | AU_LOAD |
| [3] | MOV k12, BKC | Y | 3 | 1 | AD | AU_LOAD |
| [4] | MOV k12, BRC0 | Y | 3 | 1 | AD |  |
| [5] | MOV k12, BRC1 | Y | 3 | 1 | AD |  |
| [6] | MOV k12, CSR | Y | 3 | 1 | AD |  |
| [7] | MOV k7, DPH | Y | 3 | 1 | AD | AU_LOAD |
| [8] | MOV k9, PDP | Y | 3 | 1 | AD | AU_LOAD |
| [9] | MOV k16, BSA01 | N | 4 | 1 | AD | AU_LOAD |
| [10] | MOV k16, BSA23 | N | 4 | 1 | AD | AU_LOAD |
| [11] | MOV k16, BSA45 | N | 4 | 1 | AD | AU_LOAD |
| [12] | MOV k16, BSA67 | N | 4 | 1 | AD | AU_LOAD |
| [13] | MOV k16, BSAC | N | 4 | 1 | AD | AU_LOAD |
| [14] | MOV k16, CDP | N | 4 | 1 | AD | AU_LOAD |
| [15] | MOV k16, DP | N | 4 | 1 | AD | AU_LOAD |
| [16] | MOV k16, SP | N | 4 | 1 | AD | AU_LOAD |
| [17] | MOV k16, SSP | N | 4 | 1 | AD | AU_LOAD |

## MOV: Load Extended Auxiliary Register from Memory (page 5-373)



MOV: Load Memory with Immediate Value (page 5-374)

| No. | Instruction | E | S | C | Pipe |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MOV K8, Smem | N | 3 | 1 | X |
| [2] | MOV K16, Smem | N | 4 | 1 | X |

MOV: Move Accumulator Content to Auxiliary or Temporary Register (page 5-375)

| Instruction | E | S | C | Pipe | Operator |
| :--- | :---: | :---: | :---: | :---: | :---: |
| MOV HI(ACx), TAx | Y | 2 | 1 | X | AU_ALU |



MOV: Move Accumulator, Auxiliary, or Temporary Register Content (page 5-376)


MOV: Move Auxiliary or Temporary Register Content to Accumulator (page 5-378)

| Instruction | E | S | C | Pipe | Operator |
| :--- | :---: | :---: | :---: | :---: | :---: |
| MOV TAx, HI(ACx) | Y |  | 1 | X | DU_ALU |

MOV: Move Auxiliary or Temporary Register Content to CPU Register (page 5-379)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MOV TAx, BRC0 | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
| [2] | MOV TAx, BRC1 | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
| [3] | MOV TAx, CDP | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
| [4] | MOV TAx, CSR | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
| [5] | MOV TAx, SP | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
| [6] | MOV TAx, SSP | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | MOV: Move CPU Register Content to Auxiliary or Temporary Register (page 5-381) |  |  |  |  |  |  |  |  |  |  |  |  |  |
| [1] | MOV BRC0, TAx | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
| [2] | MOV BRC1, TAx | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
| [3] | MOV CDP, TAx | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
| [4] | MOV RPTC, TAx | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
| [5] | MOV SP, TAx | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
| [6] | MOV SSP, TAx | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |



## MOV: Move Memory to Memory (page 5-384)

| No. | Instruction | E | S | C | Pipe |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MOV Cmem, Smem | N | 3 | 1 | X |
| [2] | MOV Smem, Cmem | N | 3 | 1 | X |
| [3] | MOV Cmem, dbl(Lmem) | N | 3 | 1 | X |
| [4] | MOV dbl(Lmem), Cmem | N | 3 | 1 | X |
| [5] | MOV dbl(Xmem), dbl(Ymem) | N | 3 | 1 | X |
| [6] | MOV Xmem, Ymem | N | 3 | 1 | X |

MOV: Store Accumulator Content to Memory (page 5-391)
[1] MOV HI(ACx), Smem
[2] MOV [rnd(]HI(ACx)[)], Smem
$\begin{array}{lll}\mathrm{N} & 2 & 1\end{array}$

|  |  |
| :--- | :--- |
| DU_SHIFT |  |
| DU_SHIFT |  |
| DU_SHIFT |  |
| DU_SHIFT |  |
| DU_SHIFT |  |
| DU_SHIFT |  |
| DU_SHIFT |  |
| DU_SHIFT |  |
| DU_SHIFT |  |
| DU_SHIFT |  |
| DU_SHIFT |  |


| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 1 |
| :---: | :---: | :---: | :---: | :---: | :---: |
| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 1 |
| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 1 |
| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 1 |
| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 1 |
| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 1 |
| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 1 |
| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 1 |
| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 1 |
| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 1 |
| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 2 |
| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 2 |
| 1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 2 |


| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [15] | MOV PDP, Smem | N | 3 | 1 | X |  | 1 | . | . | . | . | 1 | . |  |
| [16] | MOV SP, Smem | N | 3 | 1 | X |  | 1 | . | . | . | . | 1 | . |  |
| [17] | MOV SSP, Smem | N | 3 | 1 | X |  | 1 | . | . | . | . | 1 | . |  |
| [18] | MOV TRN0, Smem | N | 3 | 1 | X |  | 1 | . | . | . | . | 1 | . |  |
| [19] | MOV TRN1, Smem | N | 3 | 1 | X |  | 1 | . | . | . | . | 1 | . |  |
| [20] | MOV RETA, dbl(Lmem) | N | 3 | 5 | X |  | 1 | . | . | . | . | 2 | . |  |
|  | MOV: Store Extended Auxiliary Register Content to Memory (page 5-427) |  |  |  |  |  |  |  |  |  |  |  |  |  |
|  | MOV XAsrc, dbl(Lmem) | N | 3 | 1 | X |  | 1 | . | . | . | . | 2 | . |  |

MOV::MOV: Load Accumulator from Memory with Parallel Store Accumulator Content to Memory (page 5-428)

> MOV Xmem <<\#16, ACy
> :: MOV HI(ACx << T2), Ymem

MPY: Multiply (page 5-430)
[1] MPY[R] [ACx,] ACy


MPY::MAC: Multiply with Parallel Multiply and Accumulate (page 5-446)
[1] MPY[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx


| N | 4 | 1 | X | DU_MAC1 + DU_MAC2 |
| :--- | :--- | :--- | :--- | :--- |
| N | 4 | 1 | X | DU_MAC1 + DU_MAC2 |


| 2 | 1 | $\cdot$ | 2 | 1 |
| :--- | :--- | :--- | :--- | :--- |
| 1 | 1 | $\cdot$ | 1 | 2 |


| E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | . | 2 | 2 | . | . |  |
| N | 5 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | . | 2 | 2 | . | . |  |



MPY::MAS: Multiply with Parallel Multiply and Subtract (page 5-458)
[1] MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy
:: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx
[2] MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy
:: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx
[3] MPY[R][40] [uns(]Ymem[)], [uns(]HI(coef(Cmem))[)], ACy
:: MAS[R][40] [uns(]Xmem[)], [uns(]LO(coef(Cmem))[)], ACx

| N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 1 | 2 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 2 | 2 |
| N | 5 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | 2 | 2 |

MPY::MPY: Parallel Multiplies (page 5-468)
[1] MPY[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx
:: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy

| N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | . | 2 | 1 |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | . | 1 | 2 |
| N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | . | 2 | 2 |
| N | 5 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | . | 2 | 2 |


[3] MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy
:: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx
[4] MPY[R][40] [uns(]Ymem[)], [uns(]HI(coef(Cmem))[)], ACy
MPYM::MOV: Multiply with Parallel Store Accumulator Content to Memory (page 5-480)
MPYM[R] [T3 = ]Xmem, Tx, ACy
:: MOV HI(ACx << T2), Ymem
N 4 1 X DU_MAC1 + DU_SHIFT
2


NEG: Negate Accumulator, Auxiliary, or Temporary Register Content (page 5-483)


NOP: No Operation (page 5-485)
[1] NOP

| Y | 1 | 1 | D |
| :--- | :--- | :--- | :--- |
| Y | 2 | 1 | D |


| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| NOT: Complement Accumulator, Auxiliary, or Temporary Register Content (page 5-486) |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
|  | NOT [src-AU,] dst-AU | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | NOT [src-DU,] dst-AU | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 |  |
|  | NOT [src,] dst-DU | Y | 2 | 1 | X | DU_ALU | . | . | . | . | . | . | . | See Note 1. |
| OR: Bitwise OR (page 5-487) |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| [1] | OR src-AU, dst-AU | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | OR src-DU, dst-AU | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 |  |
|  | OR src, dst-DU | Y | 2 | 1 | X | DU_ALU | . | . | . | . | . | . | . | See Note 1. |
| [2] | OR k8, src-AU, dst-AU | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | OR k8, src-DU, dst-AU | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 |  |
|  | OR k8, src, dst-DU | Y | 3 | 1 | X | DU_ALU | . | . | . | . | . | . | . | See Note 1. |
| [3] | OR k16, src-AU, dst-AU | N | 4 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | OR k16, src-DU, dst-AU | N | 4 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 |  |
|  | OR k16, src, dst-DU | N | 4 | 1 | X | DU_ALU | . | . | . | . | . | . | . | See Note 1. |
| [4] | OR Smem, src-AU, dst-AU | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | . | . |  |
|  | OR Smem, src-DU, dst-AU | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | . | 1 |  |
|  | OR Smem, src, dst-DU | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . | See Note 1. |
| [5] | OR ACx <<\#SHIFTW[, ACy] | Y | 3 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [6] | OR k16 <<\#16, [ACx,] ACy | N | 4 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |
| [7] | OR k16 <<\#SHFT, [ACx,] ACy | N | 4 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [8] | OR k16, Smem | N | 4 | 1 | X | AU_ALU | 1 | . | . | 1 | . | 1 | . |  |

Notes: 1) dst-DU, src-AU or dst-DU, src-DU



POP: Pop Top of Stack (page 5-496)

| No. | Instruction |
| ---: | :--- |
| [1] | POP dst1-AU, dst2-AU |
|  | POP dst1-DU, dst2-DU |
|  | POP dst1-AU, dst2-DU |


| Y | 2 | 1 | x | AU_LOAD | 1 | 1 | 2 | . | . |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Y | 2 | 1 | x | DU_LOAD | 1 | 1 | 2 |  | . |
| Y | 2 | 1 | X | AU_LOAD + DU_LOAD | 1 | 1 | 2 |  | . |
| Y | 2 | 1 | X | AU_LOAD + DU_LOAD | 1 | 1 | 2 |  | . |
| Y | 2 | 1 | x | AU_LOAD | 1 | 1 | 1 |  | . |
| Y | 2 | 1 | x | DU_LOAD | 1 | 1 | 1 |  | . |
| N | 3 | 1 | x | AU_LOAD | 1 | 1 | 2 |  | 1 |
| N | 3 | 1 | x | DU_LOAD | 1 | 1 | 2 |  | 1 |
| Y | 2 | 1 | x | DU_LOAD | 1 | 1 | 2 |  | . |
| N | 2 | 1 | x |  | 1 | 1 | 1 |  | 1 |
| N | 2 | 1 | X |  | 1 | 1 | 2 |  | 2 |

[6] POP dbl(Lmem)
POPBOTH: Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers (page 5-503)

| POPBOTH xdst-AU | Y | 2 | 1 | X | AU_LOAD | 1 | 1 | 2 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| POPBOTH xdst-DU | Y | 2 | 1 | x | DU_LOAD | 1 | 1 | 2 |

port: Peripheral Port Register Access Qualifiers (page 5-504)




PSHBOTH: Push Accumulator or Extended Auxiliary Register Content to Stack Pointers (page 5-513)

PSHBOTH xsrc

RESET: Software Reset (page 5-514)
RESET

Y 2 1 X 1 . 1
N 2 ? D PU_UNIT . . .

RET: Return Unconditionally (page 5-518)
RET

RETCC: Return Conditionally (page 5-520)
RETCC cond
† x/y cycles: x cycles = condition true, y cycles = condition false
RETI: Return from Interrupt (page 5-522)
RETI

| N | 2 | 5 | D | PU_UNIT | 1 | . | 1 | 2 |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |

ROL: Rotate Left Accumulator, Auxiliary, or Temporary Register Content (page 5-524)


ROR: Rotate Right Accumulator, Auxiliary, or Temporary Register Content (page 5-526)


ROUND: Round Accumulator Content (page 5-528)
ROUND [ACx,] ACy

Y 2 1 X DU_ALU

RPT: Repeat Single Instruction Unconditionally (page 5-530)



SFTS: Signed Shift of Accumulator, Auxiliary, or Temporary Register Content (page 5-574)



| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [5] | SUB src-AU, Smem, dst-AU | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | . | . |  |
|  | SUB src-DU, Smem, dst-AU | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | . | 1 |  |
|  | SUB src, Smem, dst-DU | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . | See Note 1. |
| [6] | SUB ACx << Tx, ACy | Y | 2 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [7] | SUB ACx <<\#SHIFTW, ACy | Y | 3 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [8] | SUB K16 <<\#16, [ACx,] ACy | N | 4 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |
| [9] | SUB K16 <<\#SHFT, [ACx,] ACy | N | 4 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [10] | SUB Smem << Tx, [ACx,] ACy | N | 3 | 1 | X | DU_SHIFT | 1 | . | . | 1 | . | . | . |  |
| [11] | SUB Smem <<\#16, [ACx,] ACy | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . |  |
| [12] | SUB ACx, Smem <<\#16, ACy | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . |  |
| [13] | SUB [uns(]Smem[)], BORROW, [ACx,] ACy | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . |  |
| [14] | SUB [uns(]Smem[)], [ACx,] ACy | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . |  |
| [15] | SUB [uns(]Smem[)] <<\#SHIFTW, [ACx,] ACy | N | 4 | 1 | X | DU_SHIFT | 1 | . | . | 1 | . | . | . |  |
| [16] | SUB dbl(Lmem), [ACx,] ACy | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 | . | . | . |  |
| [17] | SUB ACx, dbl(Lmem), ACy | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 | . | . | . |  |
| [18] | SUB Xmem, Ymem, ACx | N | 3 | 1 | X | DU_ALU | 2 | . | . | 2 | . | . | . |  |

SUB::MOV: Subtraction with Parallel Store Accumulator Content to Memory (page 5-624)

SUB Xmem <<\#16, ACx, ACy
:: MOV HI(ACy << T2), Ymem

SUBADD: Dual 16-Bit Subtraction and Addition (page 5-626)
[1] SUBADD Tx, Smem, ACx
[2] SUBADD Tx, dual(Lmem), ACx

N 3 1 X DU_ALU 1 . . 1
N 3 1 X DU_ALU 1 . . 2

2

SUBC: Subtract Conditionally (page 5-631)

SUBC Smem, [ACx,] ACy
N




XOR: Bitwise Exclusive OR (XOR) (page 5-655)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | XOR src-AU, dst-AU | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | XOR src-DU, dst-AU | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 |  |
|  | XOR src, dst-DU | Y | 2 | 1 | X | DU_ALU | . | . | . | . | . | . | . | See Note 1. |
| [2] | XOR k8, src-AU, dst-AU | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | XOR k8, src-DU, dst-AU | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 |  |
|  | XOR k8, src, dst-DU | Y | 3 | 1 | X | DU_ALU | . | . | . | . | . | . | . | See Note 1. |
| [3] | XOR k16, src-AU, dst-AU | N | 4 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | XOR k16, src-DU, dst-AU | N | 4 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 |  |
|  | XOR k16, src, dst-DU | N | 4 | 1 | X | DU_ALU | . | . | . | . | . | . | . | See Note 1. |
| [4] | XOR Smem, src-AU, dst-AU | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | . | . |  |
|  | XOR Smem, src-DU, dst-AU | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | . | 1 |  |
|  | XOR Smem, src, dst-DU | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . | See Note 1. |
| [5] | XOR ACx <<\#SHIFTW[, ACy] | Y | 3 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [6] | XOR k16 <<\#16, [ACx,] ACy | N | 4 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |
| [7] | XOR k16 <<\#SHFT, [ACx,] ACy | N | 4 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [8] | XOR k16, Smem | N | 4 | 1 | X | AU_ALU | 1 | . | . | 1 | . | 1 | . |  |

[^2]SWPU067E

## Instruction Set Descriptions

This chapter provides detailed information on the TMS320C55x™ DSP mnemonic instruction set.

See section 1.1, Instruction Set Terms, Symbols, and Abbreviations, for definitions of symbols and abbreviations used in the description of each instruction. See Chapter 4 for a summary of the instruction set.

## AADD

Modify Auxiliary or Temporary Register Content by Addition

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | AADD TAx, TAy | No | 3 | 1 | AD |
| [2] | AADD P8, TAx | No | 3 | 1 | AD |

## Description

These instructions perform, in the A-unit address generation units:

- an addition between two auxiliary or temporary registers, TAx and TAy, storing the result in TAy
- an addition between the auxiliary or temporary register TAx and a program address defined by a program address label assembled into unsigned P8, storing the result in TAx

The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

If the destination register is an auxiliary register and the corresponding bit (ARnLC) in status register ST2_55 is set to 1, the circular buffer management controls the result stored in the destination register.

## Status Bits

Affected by ST2_55

Affects none

## See Also

See the following other related instructions:

- AADD (Modify Extended Auxiliary Register Content by Addition)
- AMAR (Modify Auxiliary Register Content)
- AMAR (Modify Extended Auxiliary Register Content)
- AMOV (Modify Auxiliary or Temporary Register Content)
- ASUB (Modify Auxiliary or Temporary Register Content by Subtraction)


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | AADD TAx, TAy | No | 3 | 1 | AD |

## Opcode

0001 010E | FSSS xxxx | FDDD 0000

0001 010E | FSSS xxxx | FDDD 1000

The assembler selects the opcode depending on the instruction position in a paralleled pair.

## Operands

TAx, TAy

## Description

This instruction performs, in the A-unit address generation units, an addition between two auxiliary or temporary registers, TAy and TAx, and stores the result in TAy. The content of TAx is considered signed:

TAy = TAy + TAx

The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

If the destination register is an auxiliary register and the corresponding bit (ARnLC) in status register ST2_55 is set to 1, the circular buffer management controls the result stored in the destination register.

## Compatibility with C54x devices (C54CM = 1)

In the translated code section, the AADD instruction must be executed with C54CM set to 1.

When circular modification is selected for the destination auxiliary register, this instruction modifies the selected destination auxiliary register by using BK03 as the circular buffer size register; BK47 is not used.

## Status Bits

Affected by ST2_55

Affects none

## Repeat

This instruction can be repeated.

## Example 1

| Syntax | Description |
| :--- | :--- |
| AADD T0, AR0 | The content of AR0 is added to the signed content of T0 and the result is stored in AR0. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| XAR0 | 01 0000 | XAR0 | 00 8000 |
| T0 | 8000 | T0 | 8000 |
## Example 2

| Syntax | Description |
| :--- | :--- |
| AADD T1, T0 | The content of T0 is added to the content of T1 and the result is stored in T0. |
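
As a rough illustration of `TAy = TAy + TAx`, the sketch below models the 16-bit addition in Python. It assumes linear addressing only (the ARnLC bit is 0), so circular buffer management is not modeled; the function name `aadd` is hypothetical and not part of any TI tool.

```python
MASK16 = 0xFFFF  # auxiliary and temporary registers are 16 bits wide


def aadd(tax: int, tay: int) -> int:
    """Return the new TAy after AADD TAx, TAy (linear mode assumed).

    Because the result is truncated to 16 bits, treating TAx as signed
    or unsigned yields the same bit pattern.
    """
    return (tay + tax) & MASK16


# Example 1 above: AR0 = 0000h, T0 = 8000h, so AR0 becomes 8000h.
print(f"{aadd(0x8000, 0x0000):04X}")
```

A real device would additionally apply circular modification when ARnLC = 1, which this sketch deliberately leaves out.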

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | AADD P8, TAx | No | 3 | 1 | AD |

## Opcode

0001 010E | PPPP PPPP | FDDD 0100

0001 010E | PPPP PPPP | FDDD 1100

The assembler selects the opcode depending on the instruction position in a paralleled pair.

## Operands

TAx, P8

## Description

This instruction performs, in the A-unit address generation units, an addition between the auxiliary or temporary register TAx and a program address defined by a program address label assembled into unsigned P8, and stores the result in TAx:

TAx = TAx + P8

The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

If the destination register is an auxiliary register and the corresponding bit (ARnLC) in status register ST2_55 is set to 1, the circular buffer management controls the result stored in the destination register.

## Compatibility with C54x devices (C54CM = 1)

In the translated code section, the AADD instruction must be executed with C54CM set to 1.

When circular modification is selected for the destination auxiliary register, this instruction modifies the selected destination auxiliary register by using BK03 as the circular buffer size register; BK47 is not used.

## Status Bits

Affected by ST2_55

Affects none

## Repeat

This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| AADD \#255, T0 | The unsigned 8-bit value (255) is added to the content of T0 and the result is stored in T0. |

## AADD

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | AADD K8, SP | Yes | 2 | 1 | AD |

## Opcode

0100 111E | KKKK KKKK

## Operands

K8

## Description

This instruction performs an addition in the A-unit data-address generation unit (DAGEN) in the address phase of the pipeline. The 8-bit signed constant, K8, is sign extended to 16 bits and added to the data stack pointer (SP):

SP = SP + K8

When in 32-bit stack configuration, the system stack pointer (SSP) is also modified. Updates of the SP and SSP (depending on the stack configuration) should not be executed in parallel with this instruction.

## Status Bits

Affected by none

Affects none

## Repeat

This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| AADD \#127, SP | The 8-bit value (127) is sign extended to 16 bits and added to the stack pointer (SP). |
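
The sign extension of K8 described above can be sketched in Python as follows. This is only an illustrative model: the starting SP value of 1000h is made up for the demonstration, the helper names are hypothetical, and the SSP update in 32-bit stack configuration is not modeled.

```python
def sign_extend8(k8: int) -> int:
    """Interpret the low 8 bits of k8 as a two's-complement value."""
    k8 &= 0xFF
    return k8 - 0x100 if k8 & 0x80 else k8


def aadd_k8_sp(k8: int, sp: int) -> int:
    """Return the new 16-bit SP after AADD K8, SP."""
    return (sp + sign_extend8(k8)) & 0xFFFF


# AADD #127, SP with an assumed SP of 1000h moves SP up by 127 words.
print(f"{aadd_k8_sp(127, 0x1000):04X}")
# An encoded K8 of FFh is -1 after sign extension, so SP moves down by one.
print(f"{aadd_k8_sp(0xFF, 0x1000):04X}")
```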

## AADD

## Modify Extended Auxiliary Register Content by Addition

## Syntax Characteristics



## Compatibility with C54x devices (C54CM = 1)

None.
Status Bits Affected by
Affects
Repeat
This instruction can be repeated.

See Also See the following other related instructions:

- AADD (Modify Auxiliary or Temporary Register Content by Addition)
- AMAR (Modify Extended Auxiliary Register Content)
- AMAR (Parallel Modify Auxiliary Register Contents)
- AMAR::MAC (Modify Auxiliary Register Content with Parallel Multiply and Accumulate)
- AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
- AMAR::MPY (Modify Auxiliary Register Content with Parallel Multiply)
- AMOV (Modify Auxiliary or Temporary Register Content)
- ASUB (Modify Auxiliary or Temporary Register Content by Subtraction)
- ASUB (Modify Extended Auxiliary Register Content by Subtraction)


## Example 1

| Syntax | Description |
| :--- | :--- |
| AADD XAR0, XAR1 | The content of XAR0 is added to XAR1 and stored in XAR1. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| XAR0 | 12 3456 | XAR0 | 12 3456 |
| XAR1 | 43 5634 | XAR1 | 55 8A8A |

## Example 2

| Syntax | Description |
| :--- | :--- |
| AADD XAR7, XCDP | The content of XAR7 is added to XCDP and stored in XCDP. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| XCDP | 00 8000 | XCDP | 01 0080 |
| XAR7 | 00 8080 | XAR7 | 00 8080 |

Execution
(XACdst) + (XACsrc) -> XACdst
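
The two examples above can be checked with a small Python model of the extended-register addition. The sketch assumes the extended auxiliary registers are 23 bits wide and that the sum simply wraps at 23 bits; both points are assumptions for illustration, and the function name is hypothetical.

```python
MASK23 = 0x7FFFFF  # assumed width of an extended auxiliary register


def aadd_extended(xsrc: int, xdst: int) -> int:
    """Return the new destination after (XACdst) + (XACsrc) -> XACdst."""
    return (xdst + xsrc) & MASK23


# Example 1: XAR0 = 12 3456, XAR1 = 43 5634
print(f"{aadd_extended(0x123456, 0x435634):06X}")
# Example 2: XAR7 = 00 8080, XCDP = 00 8000
print(f"{aadd_extended(0x008080, 0x008000):06X}")
```

Note that in Example 2 the carry out of the low 16 bits propagates into the upper bits, which is what distinguishes this form from the 16-bit AADD TAx, TAy.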

## ABDST

## Absolute Distance

## Syntax Characteristics



The absolute value of accumulator ACx content is computed and added to accumulator ACy content through the D-unit MAC. When an overflow is detected according to M40:

- the destination accumulator overflow status bit (ACOVy) is set
- the destination register (ACy) is saturated according to SATD

The Ymem content shifted left 16 bits is subtracted from the Xmem content shifted left 16 bits in the D-unit ALU.

- Input operands (Xmem and Ymem) are sign extended to 40 bits according to SXMD.
- The CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow is the logical complement of CARRY.
- When an overflow is detected according to M40:
  - the destination accumulator overflow status bit (ACOVx) is set
  - the destination register (ACx) is saturated according to SATD


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, the subtract operation does not have any overflow detection, report, and saturation after the shifting operation.

## Status Bits

Affected by C54CM, FRCT, M40, SATD, SXMD

Affects ACOVx, ACOVy, CARRY

## Repeat

This instruction can be repeated.

## See Also

See the following other related instructions:

- SQDST (Square Distance)


## Example



## ABS

Absolute Value

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | ABS [src,] dst | Yes | 2 | 1 | X |

## Opcode

0011 001E | FSSS FDDD

## Operands

dst, src

## Description

This instruction computes the absolute value of the source register (src):

- When the destination register (dst) is an accumulator:
  - The operation is performed on 40 bits in the D-unit ALU.
  - If an auxiliary or temporary register is the source operand of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended to 40 bits according to SXMD.
  - If M40 = 0, the sign of the source register is extracted at bit position 31. If src(31) = 1, the source register content is negated. If src(31) = 0, the source register content is moved to the destination accumulator.
  - If M40 = 1, the sign of the source register is extracted at bit position 39. If src(39) = 1, the source register content is negated. If src(39) = 0, the source register content is moved to the destination accumulator.
  - During the 40-bit move operation, an overflow and the CARRY status bit are detected according to M40:
    - The destination accumulator overflow status bit (ACOVx) is set.
    - The destination register is saturated according to SATD.
    - The CARRY status bit is updated as follows: If the result of the operation stored in the destination register is 0, CARRY is set; otherwise, CARRY is cleared.
- When the destination register (dst) is an auxiliary or temporary register:
  - The operation is performed on 16 bits in the A-unit ALU.
  - If an accumulator is the source operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
  - The sign of the source register is extracted at bit position 15. If src(15) = 1, the source register content is negated. If src(15) = 0, the source register content is moved to the destination register. Overflow is detected at bit position 15.
  - The destination register is saturated according to SATA.
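
The accumulator-destination case can be sketched in Python as shown below. This is a simplified model under stated assumptions: overflow detection, ACOVx, and SATD saturation are not modeled, SXMD = 0 is assumed to mean zero extension, and the function names are hypothetical. It reproduces the sign-bit selection by M40 and the CARRY rule from the description.

```python
MASK40 = (1 << 40) - 1


def sign_extend16_to_40(value16: int, sxmd: int) -> int:
    """Extend a 16-bit register value to 40 bits according to SXMD."""
    value16 &= 0xFFFF
    if sxmd and (value16 & 0x8000):
        return value16 | (MASK40 & ~0xFFFF)
    return value16


def abs_acc(src40: int, m40: int) -> tuple[int, int]:
    """Return (result, carry) for ABS with an accumulator destination.

    Saturation and the ACOVx flag are intentionally left out.
    """
    src40 &= MASK40
    sign_bit = 39 if m40 else 31        # sign position selected by M40
    if (src40 >> sign_bit) & 1:
        result = (-src40) & MASK40      # 40-bit two's-complement negate
    else:
        result = src40
    carry = 1 if result == 0 else 0     # CARRY is set only for a zero result
    return result, carry


# ABS AC0, AC1 with AC0 = 82 0000 1234 and M40 = 1
print(f"{abs_acc(0x8200001234, 1)[0]:010X}")
# ABS AR1, AC1 with AR1 = 8700h, SXMD = 1, M40 = 0
print(f"{abs_acc(sign_extend16_to_40(0x8700, 1), 0)[0]:010X}")
```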


## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, this instruction is executed as if the M40 status bit were locally set to 1. To ensure compatibility with respect to overflow detection and saturation of the destination accumulator, this instruction must be executed with M40 = 0.

## Status Bits

Affected by C54CM, M40, SATA, SATD, SXMD

Affects ACOVx, CARRY

## Repeat

This instruction can be repeated.

## See Also

See the following other related instructions:

- ADDV (Addition with Absolute Value)

## Example 1

| Syntax | Description |
| :--- | :--- |
| ABS AC0, AC1 | The absolute value of the content of AC0 is stored in AC1. |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC1 | 00 0000 2000 | AC1 | 7D FFFF EDCC |
| AC0 | 82 0000 1234 | AC0 | 82 0000 1234 |
| M40 | 1 | M40 | 1 |

## Example 2

| Syntax | Description |
| :--- | :--- |
| ABS AR1, AC1 | The absolute value of the content of AR1 is stored in AC1. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC1 | 00 0000 2000 | AC1 | 00 0000 0000 |
| AR1 | 0000 | AR1 | 0000 |
| CARRY | 0 | CARRY | 1 |

## Example 3

| Syntax | Description |
| :--- | :--- |
| ABS AR1, AC1 | The absolute value of the content of AR1 is stored in AC1. Since SXMD = 1, AR1 content is sign extended. The resulting 40-bit data is negated since M40 = 0 and AR1(31) = 1. |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC1 | 00 0000 2000 | AC1 | 00 0000 7900 |
| AR1 | 8700 | AR1 | 8700 |
| M40 | 0 | M40 | 0 |
| SXMD | 1 | SXMD | 1 |

## Example 4

| Syntax | Description |
| :--- | :--- |
| ABS AC0, T1 | The absolute value of the content of AC0(15-0) is stored in T1. The sign bit is extracted at AC0(15). Since AC0(15) = 0, T1 = AC0(15-0). |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| T1 | 2000 | T1 | 1234 |
| AC0 | 80 0002 1234 | AC0 | 80 0002 1234 |

## Example 5



## ADD

Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :--- | :--- | :--- | :--- | :--- | :--- |
| [1] | ADD [src,] dst | Yes | 2 | 1 | X |
| [2] | ADD k4, dst | Yes | 2 | 1 | X |
| [3] | ADD K16, [src,] dst | No | 4 | 1 | X |
| [4] | ADD Smem, [src,] dst | No | 3 | 1 | X |
| [5] | ADD ACx << Tx, ACy | Yes | 2 | 1 | X |
| [6] | ADD ACx <<\#SHIFTW, ACy | Yes | 3 | 1 | X |
| [7] | ADD K16 <<\#16, [ACx,] ACy | No | 4 | 1 | X |
| [8] | ADD K16 <<\#SHFT, [ACx,] ACy | No | 4 | 1 | X |
| [9] | ADD Smem << Tx, [ACx,] ACy | No | 3 | 1 | X |
| [10] | ADD Smem <<\#16, [ACx,] ACy | No | 3 | 1 | X |
| [11] | ADD [uns(]Smem[)], CARRY, [ACx,] ACy | No | 3 | 1 | X |
| [12] | ADD [uns(]Smem[)], [ACx,] ACy | No | 3 | 1 | X |
| [13] | ADD [uns(]Smem[)] <<\#SHIFTW, [ACx,] ACy | No | 4 | 1 | X |
| [14] | ADD dbl(Lmem), [ACx,] ACy | No | 3 | 1 | X |
| [15] | ADD Xmem, Ymem, ACx | No | 3 | 1 | X |
| [16] | ADD K16, Smem | No | 4 | 1 | X |

## Description

These instructions perform an addition operation.

## Status Bits

Affected by CARRY, C54CM, M40, SATA, SATD, SXMD

Affects ACOVx, ACOVy, CARRY

## See Also

See the following other related instructions:

- ADD (Dual 16-Bit Additions)
- ADD::MOV (Addition with Parallel Store Accumulator Content to Memory)
- ADDSUB (Dual 16-Bit Addition and Subtraction)
- ADDSUBCC (Addition or Subtraction Conditionally)
- ADDSUBCC (Addition, Subtraction, or Move Accumulator Content Conditionally)
- ADDSUB2CC (Addition or Subtraction Conditionally with Shift)
- ADDV (Addition with Absolute Value)
- SUB (Subtraction)
- SUBADD (Dual 16-Bit Subtraction and Addition)

## Addition

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | ADD [src,] dst | Yes | 2 | 1 | X |

## Opcode

0010 010E | FSSS FDDD

## Operands

dst, src

## Description

This instruction performs an addition operation between two registers:

dst = dst + src

- When the destination (dst) operand is an accumulator:
  - The operation is performed on 40 bits in the D-unit ALU.
  - Input operands are sign extended to 40 bits according to SXMD.
  - If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
  - Overflow detection and the CARRY status bit depend on M40.
  - When an overflow is detected, the accumulator is saturated according to SATD.
- When the destination (dst) operand is an auxiliary or temporary register:
  - The operation is performed on 16 bits in the A-unit ALU.
  - If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
  - Addition overflow detection is done at bit position 15.
  - When an overflow is detected, the destination register is saturated according to SATA.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

## Status Bits

Affected by M40, SATA, SATD, SXMD

Affects ACOVx, CARRY

## Repeat

This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| ADD AC1, AC0 | The content of AC1 is added to the content of AC0 and the result is stored in AC0. |
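
For the auxiliary or temporary register destination case of this syntax, overflow detection at bit position 15 and saturation under SATA can be sketched in Python. The saturation values 7FFFh and 8000h are an assumption (the usual signed 16-bit limits) and are not stated in this section; the function names are hypothetical.

```python
def to_signed16(value: int) -> int:
    """Interpret the low 16 bits as a two's-complement value."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def add16(src: int, dst: int, sata: bool) -> tuple[int, bool]:
    """Return (new dst, overflow) for a 16-bit A-unit ALU addition.

    When sata is true, an overflowing result is clamped to the assumed
    limits 7FFFh / 8000h; otherwise the result wraps modulo 2**16.
    """
    total = to_signed16(src) + to_signed16(dst)
    overflow = total > 0x7FFF or total < -0x8000
    if overflow and sata:
        total = 0x7FFF if total > 0 else -0x8000
    return total & 0xFFFF, overflow


# 7000h + 2000h overflows the signed 16-bit range.
print(add16(0x7000, 0x2000, sata=True))
print(add16(0x7000, 0x2000, sata=False))
```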

## Addition

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | ADD k4, dst | Yes | 2 | 1 | X |

## Opcode

0100 000E | kkkk FDDD

## Operands

dst, k4

## Description

This instruction performs an addition operation between a register content and a 4-bit unsigned constant, k4:

dst = dst + k4

- When the destination (dst) operand is an accumulator:
  - The operation is performed on 40 bits in the D-unit ALU.
  - Overflow detection and the CARRY status bit depend on M40.
  - When an overflow is detected, the accumulator is saturated according to SATD.
- When the destination (dst) operand is an auxiliary or temporary register:
  - The operation is performed on 16 bits in the A-unit ALU.
  - Addition overflow detection is done at bit position 15.
  - When an overflow is detected, the destination register is saturated according to SATA.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

## Status Bits

Affected by M40, SATA, SATD

Affects ACOVx, CARRY

## Repeat

This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| ADD \#15, AC0 | The content of AC0 is added to an unsigned 4-bit value (15) and the result is stored in AC0. |

## Addition

## Syntax Characteristics


```
dst = src + K16
```

- When the destination (dst) operand is an accumulator:
  - The operation is performed on 40 bits in the D-unit ALU.
  - If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
  - The 16-bit constant, K16, is sign extended to 40 bits according to SXMD.
  - Overflow detection and the CARRY status bit depend on M40.
  - When an overflow is detected, the accumulator is saturated according to SATD.
- When the destination (dst) operand is an auxiliary or temporary register:
  - The operation is performed on 16 bits in the A-unit ALU.
  - If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
  - Addition overflow detection is done at bit position 15.
  - When an overflow is detected, the destination register is saturated according to SATA.

Compatibility with C54x devices (C54CM = 1)
When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | M40, SATA, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, CARRY |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| ADD \#2E00h, AC0, AC1 | The content of AC0 is added to the signed 16-bit value (2E00h) and the result is stored in AC1. |

## Addition

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | ADD Smem, [src,] dst | No | 3 | 1 | X |
| Opcode |  | 1101 0110 AAAA AAAI FDDD FSSS |  |  |  |
| Operands |  | dst, Smem, src |  |  |  |
| Description |  | This instruction performs an addition operation between a register content and the content of a memory (Smem) location: $\mathrm{dst} = \mathrm{src} + \mathrm{Smem}$ |  |  |  |

$\square$ When the destination (dst) operand is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
■ The content of the memory location is sign extended to 40 bits according to SXMD.
■ Overflow detection and the CARRY status bit depend on M40.
■ When an overflow is detected, the accumulator is saturated according to SATD.
$\square$ When the destination (dst) operand is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
■ Addition overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.

Compatibility with C54x devices (C54CM = 1)
When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | M40, SATA, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, CARRY |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| ADD *AR3+, T0, T1 | The content of T0 is added to the content addressed by AR3 and the result is stored in T1. AR3 is incremented by 1. |

| Before | After |
| :--- | :--- |
| AR3 0302 | AR3 0303 |
| 302 EF00 | 302 EF00 |
| T0 3300 | T0 3300 |
| T1 0 | T1 2200 |
| CARRY 0 | CARRY 1 |
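
The Before/After values of this example can be checked with a short model. The sketch below is an illustration in Python, not C55x code. The helper name `add16` is hypothetical, and it models only the 16-bit wraparound and the carry out of bit 15; SATA saturation and overflow reporting are left out.

```python
def add16(a: int, b: int) -> tuple[int, int]:
    """Add two 16-bit words; return (16-bit result, carry out of bit 15)."""
    total = (a & 0xFFFF) + (b & 0xFFFF)
    return total & 0xFFFF, total >> 16


# Values from the example: T0 = 3300h, memory word at address 302h = EF00h.
t1, carry = add16(0x3300, 0xEF00)
ar3_after = (0x0302 + 1) & 0xFFFF  # *AR3+ post-increments AR3 by 1
print(f"T1={t1:04X} CARRY={carry} AR3={ar3_after:04X}")
```

The sum 3300h + EF00h is 12200h, so the 16-bit result is 2200h and the carry out is 1, in line with the table.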

## Addition

## Syntax Characteristics


When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1:

$\square$ An intermediary shift operation is performed as if M40 is locally set to 1, and no overflow detection, report, or saturation is done after the shifting operation.
$\square$ The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
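
The shift-quantity rule for C54CM = 1 can be sketched as follows. This is a hedged Python model, not device code: the function name `c54cm_shift_quantity` is hypothetical, and reading the "modulo 16 operation" as "add 16 to values in -32 to -17" is an assumption chosen because it yields the stated -16 to -1 range.

```python
def c54cm_shift_quantity(tx: int) -> int:
    """Shift quantity derived from Tx when C54CM = 1 (illustrative model)."""
    six = tx & 0x3F                              # keep the 6 LSBs of Tx
    quantity = six - 64 if six & 0x20 else six   # read them as signed: -32..+31
    if -32 <= quantity <= -17:
        quantity += 16                           # assumed fold of -32..-17 onto -16..-1
    return quantity


for tx in (0x000C, 0x001F, 0xFFF0, 0xFFEF, 0xFFE0):
    print(f"Tx={tx:04X} -> shift {c54cm_shift_quantity(tx):+d}")
```

Under this model, only the low six bits of Tx matter, so Tx = 0020h and Tx = FFE0h give the same shift quantity.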

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |

Example

| Syntax | Description |
| :--- | :--- |
| ADD AC1 << T0, AC0 | The content of AC1 shifted by the content of T0 is added to the content of AC0 <br> and the result is stored in AC0. |

## Addition

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [6] | ADD ACx << \#SHIFTW, ACy | Yes | 3 | 1 | X |
| Opcod |  | O00E ${ }^{\text {DDSS }}$ | 0011 | $1 \mid \mathrm{xxSH}$ | IFTW |
| Operands |  | ACx, ACy, SHIFTW |  |  |  |
| Description |  | This instruction performs an addition operation between an accumulator content ACy and an accumulator content ACx shifted by the 6-bit value, SHIFTW: |  |  |  |
|  |  | $ACy = ACy + (ACx \ll \#SHIFTW)$ |  |  |  |

$\square$ The operation is performed on 40 bits in the D-unit shifter.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and the CARRY status bit depend on M40.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |

## Example

| Syntax | Description |
| :--- | :--- |
| ADD AC1 <<\#31, AC0 | The content of AC1 shifted left by 31 bits is added to the content of AC0 and the <br> result is stored in AC0. |

## Addition

## Syntax Characteristics

$ACy = ACx + (K16 \ll \#16)$

$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and the CARRY status bit depend on M40.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.


## Addition

## Syntax Characteristics



$\square$ The operation is performed on 40 bits in the D-unit shifter.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and the CARRY status bit depend on M40.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| ADD \#FFFFh <<\#15, AC1, AC0 | A signed 16-bit value (FFFFh) shifted left by 15 bits is added to the <br> content of AC1 and the result is stored in AC0. |

Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [9] | ADD Smem << Tx, [ACx,] ACy | No | 3 | 1 | X |
| Opcode |  | 1101 1101 AAAA AAAI SSDD SS00 |  |  |  |
| Operands |  | ACx, ACy, Tx, Smem |  |  |  |
| Description |  | This instruction performs an addition operation between an accumulator content ACx and the content of a memory (Smem) location shifted by the content of Tx: $ACy = ACx + (Smem \ll Tx)$ |  |  |  |

$\square$ The operation is performed on 40 bits in the D-unit shifter.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and the CARRY status bit depend on M40.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1:

$\square$ An intermediary shift operation is performed as if M40 is locally set to 1, and no overflow detection, report, or saturation is done after the shifting operation.
$\square$ The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |

## Example

| Syntax | Description |
| :--- | :--- |
| ADD *AR1 << T0, AC1, AC0 | The content addressed by AR1 shifted left by the content of T0 is added <br> to the content of AC1 and the result is stored in AC0. |


| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | 00 | 0000 | 0000 | AC0 | 00 | 2330 | 0000 |
| AC1 | 00 | 2300 | 0000 | AC1 | 00 | 2300 | 0000 |
| T0 |  |  | 000C | T0 |  |  | 000C |
| AR1 |  |  | 0200 | AR1 |  |  | 0200 |
| 200 |  |  | 0300 | 200 |  |  | 0300 |
| SXMD |  |  | 0 | SXMD |  |  | 0 |
| M40 |  |  | 0 | M40 |  |  | 0 |
| ACOV0 |  |  | 0 | ACOV0 |  |  | 0 |
| CARRY |  |  | 0 | CARRY |  |  | 1 |
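
The AC0 value in this example follows from $ACy = ACx + (Smem \ll Tx)$. The Python sketch below reproduces only that arithmetic for a non-negative memory operand and a left shift. The helper names are hypothetical, and sign extension, overflow detection, saturation, and the CARRY bit are deliberately not modeled.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def add_shifted_smem(acx: int, smem: int, shift: int) -> int:
    """Return (acx + (smem << shift)) truncated to 40 bits (left shifts only)."""
    return (acx + ((smem & 0xFFFF) << shift)) & MASK40


def fmt40(value: int) -> str:
    """Format a 40-bit value as guard bits, high word, low word."""
    return f"{value >> 32:02X} {(value >> 16) & 0xFFFF:04X} {value & 0xFFFF:04X}"


# AC1 = 00 2300 0000, memory word at 200h = 0300h, T0 = 000Ch (shift by 12)
ac0 = add_shifted_smem(0x0023000000, 0x0300, 0x000C)
print(fmt40(ac0))
```

Shifting 0300h left by 12 gives 30 0000h, and adding it to 00 2300 0000 gives 00 2330 0000, the AC0 value in the After column.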

## Addition

## Syntax Characteristics



$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and the CARRY status bit depend on M40. If the result of the addition generates a carry, the CARRY status bit is set; otherwise, the CARRY status bit is not affected.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| ADD *AR3 << \#16, AC1, AC0 | The content addressed by AR3 shifted left by 16 bits is added to the <br> content of AC1 and the result is stored in AC0. |

## Addition

## Syntax Characteristics

| No. | Syntax |  | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [11] | ADD [uns(]Smem[)], CARRY, [ACx,] ACy |  | No | 3 | 1 | X |
| Opcode |  |  | 1111 \| AAAA | AAA | I ${ }^{\text {SSDD }}$ | 100u |
| Operands |  | ACx, ACy, Smem |  |  |  |  |
| Description |  | This instruction performs an addition operation of the accumulator content ACx, the content of a memory (Smem) location, and the value of the CARRY status bit: |  |  |  |  |
|  |  | $ACy = ACx + Smem + CARRY$ |  |  |  |  |

$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are extended to 40 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
$\square$ Overflow detection and the CARRY status bit depend on M40.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | CARRY, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| ADD uns(*AR3), CARRY, AC1, AC0 | The CARRY status bit and the unsigned content addressed by AR3 are added to the content of AC1 and the result is stored in AC0. |

## Syntax Characteristics


$ACy = ACx + uns(Smem)$

$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are extended to 40 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
$\square$ Overflow detection and the CARRY status bit depend on M40.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.

Compatibility with C54x devices (C54CM =1)
When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| ADD uns(*AR3), AC1, AC0 | The unsigned content addressed by AR3 is added to the content of AC1 and the result is stored in AC0. |

## Addition

## Syntax Characteristics



$\square$ The operation is performed on 40 bits in the D-unit shifter.
$\square$ Input operands are extended to 40 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and the CARRY status bit depend on M40.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, an intermediary shift operation is performed as if M40 is locally set to 1, and no overflow detection, report, or saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| ADD uns(*AR3) << \#31, AC1, AC0 | The unsigned content addressed by AR3 shifted left by 31 bits is added to the content of AC1 and the result is stored in AC0. |

## Addition

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  | ADD dbl(Lmem), [ACx,] ACy | No | 3 | 1 | X |
| Opcod |  | 1101 \| AAA | AAA | I ${ }^{\text {SSDD }}$ | 000n |
| Operands |  | ACx, ACy, Lmem |  |  |  |
| Description |  | This instruction performs an addition operation between an accumulator content ACx and the content of data memory operand dbl(Lmem): |  |  |  |

$ACy = ACx + dbl(Lmem)$

$\square$ The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
■ if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ Overflow detection and the CARRY status bit depend on M40.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.
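
The dbl(Lmem) alignment rule can be illustrated with a small model. The Python sketch below is only an assumption-labeled illustration: `dbl_word_addresses` and `read_dbl` are hypothetical helpers, and memory is represented as a dictionary of 16-bit words keyed by word address.

```python
def dbl_word_addresses(lmem: int) -> tuple[int, int]:
    """Return (MSW address, LSW address) for a dbl(Lmem) access."""
    msw = lmem
    lsw = lmem + 1 if lmem % 2 == 0 else lmem - 1
    return msw, lsw


def read_dbl(memory: dict[int, int], lmem: int) -> int:
    """Assemble the 32-bit long word addressed by Lmem."""
    msw, lsw = dbl_word_addresses(lmem)
    return ((memory[msw] & 0xFFFF) << 16) | (memory[lsw] & 0xFFFF)


memory = {0x0300: 0x1234, 0x0301: 0x5678}  # made-up contents for illustration
print(f"{read_dbl(memory, 0x0300):08X}")   # even address
print(f"{read_dbl(memory, 0x0301):08X}")   # odd address
```

With these made-up contents, the even address reads 12345678h and the odd address reads 56781234h: the same word pair is accessed in both cases, and only the role of each word changes.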


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| ADD dbl(*AR3+), AC1, AC0 | The content (long word) addressed by AR3 is added to the content of AC1 and the result is stored in AC0. Because this is a long-operand instruction, AR3 is incremented by 2. |

## Addition

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  | ADD Xmem, Ymem, ACx | No | 3 | 1 | X |
| Opcod |  | 0001 \| XXXM | MMY | Y ${ }^{\text {YMMM }}$ | OODD |
| Operands |  | ACx, Xmem, Ymem |  |  |  |
| Description |  | This instruction performs an addition operation between the content of data memory operand Xmem shifted left 16 bits, and the content of data memory operand Ymem shifted left 16 bits: |  |  |  |
|  |  | $ACx = (Xmem \ll \#16) + (Ymem \ll \#16)$ |  |  |  |

$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and the CARRY status bit depend on M40.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, CARRY |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| ADD *AR3, *AR4, AC0 | The content addressed by AR3 shifted left by 16 bits is added to the content <br> addressed by AR4 shifted left by 16 bits and the result is stored in AC0. |

## Addition

## Syntax Characteristics


$Smem = Smem + K16$

$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are sign extended to 40 bits according to SXMD and shifted by 16 bits to the MSBs before being added.
$\square$ Addition overflow is detected at bit position 31. If an overflow is detected, accumulator 0 overflow status bit (ACOV0) is set.
$\square$ Addition carry report in CARRY status bit is extracted at bit position 31.
$\square$ If SATD is 1 when an overflow is detected, the result is saturated before being stored in memory. Saturation values are 7FFFh or 8000h.
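
The memory-update behavior described above can be sketched in Python. This is a simplified model under stated assumptions: SXMD = 1 (both operands treated as signed 16-bit values), and overflow or carry "at bit position 31" after the 16-bit left shift is treated as equivalent to overflow or carry of the 16-bit addition. The function names are hypothetical.

```python
def to_signed16(word: int) -> int:
    """Interpret a 16-bit pattern as a signed value (assumes SXMD = 1)."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def add_k16_to_smem(smem: int, k16: int, satd: bool) -> tuple[int, bool, int]:
    """Return (new Smem word, overflow flag, carry) for Smem = Smem + K16."""
    total = to_signed16(smem) + to_signed16(k16)
    overflow = not -0x8000 <= total <= 0x7FFF
    carry = ((smem & 0xFFFF) + (k16 & 0xFFFF)) >> 16
    if overflow and satd:
        result = 0x7FFF if total > 0 else 0x8000  # saturation values from the text
    else:
        result = total & 0xFFFF
    return result, overflow, carry


print(add_k16_to_smem(0x7000, 0x2000, satd=True))
print(add_k16_to_smem(0x7000, 0x2000, satd=False))
```

With SATD set, 7000h + 2000h saturates to 7FFFh; with SATD cleared, the same addition wraps to 9000h. In both cases the overflow flag of the model is raised.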

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOV0, CARRY |

## ADD

Dual 16-Bit Additions

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | ADD dual(Lmem), [ACx,] ACy | No | 3 | 1 | X |
| [2] | ADD dual(Lmem), Tx, ACx | No | 3 | 1 | X |


| Description | These instructions perform two paralleled addition operations in one cycle. <br> The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath). |
| :---: | :---: |
| Status Bits | Affected by C54CM, SATD, SXMD |
|  | Affects ACOVx, ACOVy, CARRY |
| See Also | See the following other related instructions: |
|  | - ADD (Addition) |
|  | - ADD::MOV (Addition with Parallel Store Accumulator Content to Memory) |
|  | - ADDSUB (Dual 16-Bit Addition and Subtraction) |
|  | - ADDSUBCC (Addition or Subtraction Conditionally) |
|  | - ADDSUBCC (Addition, Subtraction, or Move Accumulator Content Conditionally) |
|  | - ADDSUB2CC (Addition or Subtraction Conditionally with Shift) |
|  | - SUBADD (Dual 16-Bit Subtraction and Addition) |

Dual 16-Bit Additions

## Syntax Characteristics



The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

$\square$ The data memory operand dbl(Lmem) is divided into two 16-bit parts:
■ the lower part is used as one of the 16-bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
$\square$ The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
■ if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
$\square$ For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVy) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.
$\square$ For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.

| Status Bits | Affected by | C16, C54CM, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |

Example

| Syntax | Description |
| :--- | :--- |
| ADD dual(*AR3), AC1, AC0 | Both instructions are performed in parallel. When the Lmem address is even (AR3 = even): The content of AC1(39-16) is added to the content addressed by AR3 and the result is stored in AC0(39-16). The content of AC1(15-0) is added to the content addressed by AR3 + 1 and the result is stored in AC0(15-0). |

Dual 16-Bit Additions

## Syntax Characteristics



The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

$\square$ The temporary register Tx:
■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
$\square$ The data memory operand dbl(Lmem) is divided into two 16-bit parts:
■ the lower part is used as one of the 16-bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
$\square$ The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
■ if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
$\square$ For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.
$\square$ For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
$\square$ Independently on each data path, if SATD = 1 when an overflow is detected on the data path, a saturation is performed:
■ For the operations performed in the ALU low part, saturation values are 7FFFh and 8000h.
■ For the operations performed in the ALU high part, saturation values are 00 7FFFh and FF 8000h.
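
The dual 16-bit datapath can be sketched as two independent lanes. The Python model below is a rough illustration under stated assumptions: SXMD = 1, C54CM = 0, and carry at bit position 31 treated as the carry out of the 16-bit high-lane addition. The function names are hypothetical.

```python
def s16(word: int) -> int:
    """Interpret a 16-bit pattern as signed (assumes SXMD = 1)."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def dual_add(hi_mem: int, lo_mem: int, tx: int, satd: bool) -> tuple[int, bool, int]:
    """Model HI(ACx) = HI(Lmem) + Tx :: LO(ACx) = LO(Lmem) + Tx.

    Returns (40-bit accumulator value, overflow flag, carry).
    """
    hi_total = s16(hi_mem) + s16(tx)
    lo_total = s16(lo_mem) + s16(tx)
    hi_ovf = not -0x8000 <= hi_total <= 0x7FFF   # overflow at bit position 31
    lo_ovf = not -0x8000 <= lo_total <= 0x7FFF   # overflow at bit position 15
    if satd and hi_ovf:
        hi24 = 0x007FFF if hi_total > 0 else 0xFF8000
    else:
        hi24 = hi_total & 0xFFFFFF               # guard bits stay with the high lane
    if satd and lo_ovf:
        lo16 = 0x7FFF if lo_total > 0 else 0x8000
    else:
        lo16 = lo_total & 0xFFFF
    carry = ((hi_mem & 0xFFFF) + (tx & 0xFFFF)) >> 16
    return (hi24 << 16) | lo16, hi_ovf or lo_ovf, carry


acc, overflow, carry = dual_add(0x1000, 0x0002, 0x0003, satd=True)
print(f"{acc:010X} overflow={overflow} carry={carry}")
```

Each lane is saturated independently, which is why the high lane uses the 24-bit values 00 7FFFh and FF 8000h while the low lane uses 7FFFh and 8000h.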


## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, this instruction is executed as if SATD is locally cleared to 0 . Overflow is only detected and reported for the computation performed in the higher 24-bit datapath (overflow is detected at bit position 31).

| Status Bits | Affected by | C54CM, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, CARRY |

Example

| Syntax | Description |
| :--- | :--- |
| ADD dual(*AR3), T0, AC0 | Both instructions are performed in parallel. When the Lmem address is even <br> (AR3 = even): The content of T0 is added to the content addressed by AR3 <br> and the result is stored in AC0(39-16). The duplicated content of T0 is added <br> to the content addressed by AR3 + 1 and the result is stored in AC0(15-0). |


## ADD::MOV <br> Addition with Parallel Store Accumulator Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | ADD Xmem << \#16, ACx, ACy :: MOV HI(ACy << T2), Ymem | No | 4 | 1 | X |
| Opcode |  | 1000 0111 XXXM MMYY YMMM SSDD 100x xxxx |  |  |  |
| Operands |  | ACx, ACy, T2, Xmem, Ymem |  |  |  |
| Description |  | This instruction performs two operations in parallel, addition and store: $ACy = ACx + (Xmem \ll \#16)$ :: $Ymem = HI(ACy \ll T2)$ |  |  |  |

The first operation performs an addition between an accumulator content ACx and the content of data memory operand Xmem shifted left by 16 bits.

$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and the CARRY status bit depend on M40. When C54CM = 1, an intermediary shift operation is performed as if M40 is locally set to 1, and no overflow detection, report, or saturation is done after the shifting operation.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.

The second operation shifts the accumulator ACy by the content of T2 and stores ACy(31-16) to data memory operand Ymem. If the 16-bit value in T2 is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.

$\square$ The input operand is shifted in the D-unit shifter according to SXMD.
$\square$ After the shift, the high part of the accumulator, ACy(31-16), is stored to the memory location.
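
The store half of the instruction can be sketched as a shift-count saturation followed by extraction of bits 31-16. The Python model below is a simplified illustration: it assumes SXMD = 1 and C54CM = 0, treats a negative count as an arithmetic right shift, and ignores any overflow handling in the shifter. The function names are hypothetical.

```python
def saturate_shift(t2: int) -> int:
    """Clamp the signed 16-bit value in T2 to the range -32..+31."""
    t2 &= 0xFFFF
    signed = t2 - 0x10000 if t2 & 0x8000 else t2
    return max(-32, min(31, signed))


def store_hi(acy: int, t2: int) -> int:
    """Return the 16-bit word HI(ACy << T2), i.e. bits 31-16 after the shift."""
    value = acy & ((1 << 40) - 1)
    if value & (1 << 39):                 # read the 40-bit accumulator as signed
        value -= 1 << 40
    shift = saturate_shift(t2)
    shifted = value << shift if shift >= 0 else value >> -shift
    return (shifted >> 16) & 0xFFFF


print(saturate_shift(0x0040), saturate_shift(0xFF00))
print(f"{store_hi(0x0000123400, 0x0004):04X}")
```

For the illustrative input 00 0012 3400 and T2 = 4, the shifted value is 00 0123 4000, so the stored word is 0123h.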

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When this instruction is executed with $\mathrm{C} 54 \mathrm{CM}=1$, the 6 LSBs of T2 are used to determine the shift quantity. The 6 LSBs of T2 define a shift quantity within -32 to +31 . When the 16 -bit value in T2 is between -32 to -17 , a modulo 16 operation transforms the shift quantity to within -16 to -1 .


## ADDSUB <br> Dual 16-Bit Addition and Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | ADDSUB Tx, Smem, ACx | No | 3 | 1 | X |
| [2] | ADDSUB Tx, dual(Lmem), ACx | No | 3 | 1 | X |

Description These instructions perform two paralleled arithmetical operations in one cycle, an addition and a subtraction.

The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16 -bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

| Status Bits | Affected by | C54CM, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy, CARRY |

See Also See the following other related instructions:

$\square$ ADD (Addition)
$\square$ ADD (Dual 16-Bit Additions)
$\square$ SUB (Dual 16-Bit Subtractions)
$\square$ SUB (Subtraction)
$\square$ SUBADD (Dual 16-Bit Subtraction and Addition)


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | ADDSUB Tx, Smem, ACx | No | 3 | 1 | X |
| Opcode |  | 1101 1110 AAAA AAAI SSDD 1000 |  |  |  |
| Operands |  | ACx, Smem, Tx |  |  |  |

The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16 -bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

$\square$ The data memory operand Smem:
■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
$\square$ The temporary register Tx:
■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
$\square$ For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.
$\square$ For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
$\square$ Independently on each data path, if SATD = 1 when an overflow is detected on the data path, a saturation is performed:
■ For the operations performed in the ALU low part, saturation values are 7FFFh and 8000h.
■ For the operations performed in the ALU high part, saturation values are 00 7FFFh and FF 8000h.


## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, this instruction is executed as if SATD is locally cleared to 0. Overflow is only detected and reported for the computation performed in the higher 24-bit datapath (overflow is detected at bit position 31).

| Status Bits | Affected by | C54CM, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, CARRY |
| Repeat |  |  |

## Example

| Syntax | Description |
| :--- | :--- |
| ADDSUB T1, *AR1, AC1 | Both instructions are performed in parallel. The content addressed by AR1 is added to the content of T1 and the result is stored in AC1(39-16). The duplicated content of T1 is subtracted from the duplicated content addressed by AR1 and the result is stored in AC1(15-0). |

| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC1 | 00 | 2300 | 0000 | AC1 | 00 |  |  |
| T1 |  |  | 4000 | T1 |  |  | 4000 |
| AR1 |  |  | 0201 | AR1 |  |  | 0201 |
| 201 |  |  | E300 | 201 |  |  | E300 |
| SXMD |  |  | 1 | SXMD |  |  | 1 |
| M40 |  |  | 1 | M40 |  |  | 1 |
| ACOV0 |  |  | 0 | ACOV0 |  |  |  |
| CARRY |  |  | 0 | CARRY |  |  |  |
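
For the operand values used in this example (T1 = 4000h, memory word at 201h = E300h), the two lanes can be worked through with a short Python sketch. This is a simplified model, not a statement of the documented result: it assumes SXMD = 1 and no overflow on either lane, so saturation is not modeled, and the helper names are hypothetical.

```python
def s16(word: int) -> int:
    """Interpret a 16-bit pattern as signed (assumes SXMD = 1)."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def addsub(tx: int, smem: int) -> int:
    """Model HI(ACx) = Smem + Tx :: LO(ACx) = Smem - Tx, without saturation."""
    hi24 = (s16(smem) + s16(tx)) & 0xFFFFFF   # high lane includes the 8 guard bits
    lo16 = (s16(smem) - s16(tx)) & 0xFFFF
    return (hi24 << 16) | lo16


acc = addsub(0x4000, 0xE300)
print(f"{acc >> 32:02X} {(acc >> 16) & 0xFFFF:04X} {acc & 0xFFFF:04X}")
```

Under these assumptions, the high lane is E300h + 4000h = 2300h (with a carry out of the 16-bit addition) and the low lane is E300h - 4000h = A300h, so the model yields 00 2300 A300. This value is computed by the sketch rather than quoted from the table.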

## Dual 16-Bit Addition and Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | ADDSUB Tx, dual(Lmem), ACx | No | 3 | 1 | X |
| Opcode |  | 1110 1110 AAAA AAAI SSDD 110x |  |  |  |
| Operands |  | ACx, Lmem, Tx |  |  |  |
| Description |  | This instruction performs two paralleled arithmetical operations in one cycle, an addition and subtraction: $HI(ACx) = HI(Lmem) + Tx$ :: $LO(ACx) = LO(Lmem) - Tx$ |  |  |  |

The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16 -bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

$\square$ The temporary register Tx:
■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
$\square$ The data memory operand dbl(Lmem) is divided into two 16-bit parts:
■ the lower part is used as one of the 16-bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
$\square$ The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
■ if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
$\square$ For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.

| Status Bits | Affected by | C16, C54CM, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, CARRY |

## Example

| Syntax | Description |
| :--- | :--- |
| ADDSUB T0, dual(*AR3), AC0 | Both instructions are performed in parallel. When the Lmem address is even (AR3 = even): The content of T0 is added to the content addressed by AR3 and the result is stored in AC0(39-16). The duplicated content of T0 is subtracted from the content addressed by AR3 + 1 and the result is stored in AC0(15-0). |
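
The dual 16-bit data flow described for this syntax can be approximated with a small Python model. This is only a sketch under stated assumptions: SXMD = 1 (both operands sign extended), no saturation (SATD is not modeled), and the helper names `to_signed` and `addsub_dual` are hypothetical, not part of any TI tool.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def addsub_dual(hi_word, lo_word, tx):
    """Model HI(ACx) = HI(Lmem) + Tx :: LO(ACx) = LO(Lmem) - Tx (assumes SXMD = 1).

    Returns (acx, acov): the 40-bit accumulator image and the overflow flag.
    """
    t = to_signed(tx, 16)
    hi = to_signed(hi_word, 16) + t  # high path: 16 data bits plus 8 guard bits
    lo = to_signed(lo_word, 16) - t  # low path: 16 bits only

    def fits16(v):
        return -0x8000 <= v <= 0x7FFF

    # Overflow at bit 31 (high path) or bit 15 (low path) means the
    # result no longer fits in a signed 16-bit field.
    acov = not (fits16(hi) and fits16(lo))

    acx = ((hi & 0xFFFFFF) << 16) | (lo & 0xFFFF)
    return acx, acov


acx, acov = addsub_dual(0xE300, 0xE300, 0x4000)
print(f"AC = {acx:010X}, ACOV = {int(acov)}")
```

The two 16-bit words are passed separately, so the even/odd address alignment rule has to be applied by the caller before invoking the model.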

## ADDSUBCC <br> Addition or Subtraction Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | ADDSUBCC Smem, ACx, TC1, ACy | No | 3 | 1 | X |
| [2] | ADDSUBCC Smem, ACx, TC2, ACy | No | 3 | 1 | X |

| Opcode | TC1 | 1101 1110 | AAAA AAAI | SSDD 0000 |
| :--- | :--- | :--- | :--- | :--- |
|  | TC2 | 1101 1110 | AAAA AAAI | SSDD 0001 |

Operands ACx, ACy, Smem, TCx

Description This instruction evaluates the selected TCx status bit and based on the result of the test, either an addition or a subtraction is performed. Evaluation of the condition on the TCx status bit is performed during the Execute phase of the instruction.


| TC1 or TC2 | Operation |
| :---: | :---: |
| 0 | ACy = ACx - (Smem << #16) |
| 1 | ACy = ACx + (Smem << #16) |

- TCx = 0, then ACy = ACx - (Smem << #16):

  This instruction subtracts the content of a memory (Smem) location shifted left by 16 bits from accumulator ACx and stores the result in accumulator ACy.

  - The operation is performed on 40 bits in the D-unit ALU.
  - Input operands are sign extended to 40 bits according to SXMD.
  - The shift operation is equivalent to the signed shift instruction.
  - Overflow detection and the CARRY status bit depend on M40.
  - When an overflow is detected, the accumulator is saturated according to SATD.
- TCx = 1, then ACy = ACx + (Smem << #16):

  This instruction performs an addition operation between accumulator ACx and the content of a memory (Smem) location shifted left by 16 bits and stores the result in accumulator ACy.

  - The operation is performed on 40 bits in the D-unit ALU.
  - Input operands are sign extended to 40 bits according to SXMD.
  - The shift operation is equivalent to the signed shift instruction.
  - Overflow detection and the CARRY status bit depend on M40.
  - When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD, TCx |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |
| See Also | See the following other related instructions: |  |

- ADDSUBCC (Addition, Subtraction, or Move Accumulator Content Conditionally)
- ADDSUB2CC (Addition or Subtraction Conditionally with Shift)

## Example 1

| Syntax | Description |
| :--- | :--- |
| ADDSUBCC *AR3, AC1, TC1, AC0 | If TC1 = 1, the content addressed by AR3 shifted left by 16 bits is <br> added to the content of AC1 and the result is stored in AC0. If <br> TC1 = 0, the content addressed by AR3 shifted left by 16 bits is <br> subtracted from the content of AC1 and the result is stored in AC0. |

## Example 2



## ADDSUBCC

Addition, Subtraction, or Move Accumulator Content Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | ADDSUBCC Smem, ACx, TC1, TC2, ACy | No | 3 | 1 | X |

Opcode 1101 1110 | AAAA AAAI | SSDD 0010

Operands ACx, ACy, Smem, TC1, TC2

Description This instruction evaluates the TCx status bits and based on the result of the test, an addition, a subtraction, or a move is performed. Evaluation of the condition on the TCx status bits is performed during the Execute phase of the instruction.


| TC1 | TC2 | Operation |
| :---: | :---: | :---: |
| 0 | 0 | ACy $=\mathrm{ACx}-($ Smem $\ll \# 16)$ |
| 0 | 1 | $\mathrm{ACy}=\mathrm{ACx}$ |
| 1 | 0 | $\mathrm{ACy}=\mathrm{ACx}+($ Smem $\ll \# 16)$ |
| 1 | 1 | $\mathrm{ACy}=\mathrm{ACx}$ |

- TC2 = 1, then ACy = ACx:

  This instruction moves the content of ACx to ACy.

  - The 40-bit move operation is performed in the D-unit ALU.
  - During the 40-bit move operation, an overflow is detected according to M40:
    - the destination accumulator overflow status bit (ACOVy) is set.
    - the destination register (ACy) is saturated according to SATD.
- TC1 = 0 and TC2 = 0, then ACy = ACx - (Smem << #16):

  This instruction subtracts the content of a memory (Smem) location shifted left by 16 bits from accumulator ACx and stores the result in accumulator ACy.

  - The operation is performed on 40 bits in the D-unit ALU.
  - Input operands are sign extended to 40 bits according to SXMD.
  - The shift operation is equivalent to the signed shift instruction.
  - Overflow detection and the CARRY status bit depend on M40.
  - When an overflow is detected, the accumulator is saturated according to SATD.
- TC1 = 1 and TC2 = 0, then ACy = ACx + (Smem << #16):

  This instruction performs an addition operation between accumulator ACx and the content of a memory (Smem) location shifted left by 16 bits and stores the result in accumulator ACy.

  - The operation is performed on 40 bits in the D-unit ALU.
  - Input operands are sign extended to 40 bits according to SXMD.
  - The shift operation is equivalent to the signed shift instruction.
  - Overflow detection and the CARRY status bit depend on M40.
  - When an overflow is detected, the accumulator is saturated according to SATD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD, TC1, TC2 |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |
| See Also | See the following other related instructions: |  |

- ADDSUBCC (Addition or Subtraction Conditionally)
- ADDSUB2CC (Addition or Subtraction Conditionally with Shift)

## Example

| Syntax | Description |
| :--- | :--- |
| ADDSUBCC *AR3, AC1, TC1, TC2, AC0 | If TC2 = 1, the content of AC1 is stored in AC0. If TC2 = 0 and TC1 = 1, the content addressed by AR3 shifted left by 16 bits is added to the content of AC1 and the result is stored in AC0. If TC2 = 0 and TC1 = 0, the content addressed by AR3 shifted left by 16 bits is subtracted from the content of AC1 and the result is stored in AC0. |
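
The TC1/TC2 decision table for this syntax can be written out as a short Python sketch. It assumes SXMD = 1 and ignores overflow detection, saturation, and the CARRY bit; `sext16` and `addsubcc` are hypothetical helper names used only for illustration.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def sext16(word):
    """Sign extend a 16-bit memory word (assumes SXMD = 1)."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def addsubcc(acx, smem, tc1, tc2):
    """Return the 40-bit ACy image selected by the TC1 and TC2 status bits."""
    operand = sext16(smem) << 16
    if tc2:            # TC2 = 1: move ACx to ACy, whatever TC1 is
        result = acx
    elif tc1:          # TC1 = 1, TC2 = 0: ACy = ACx + (Smem << #16)
        result = acx + operand
    else:              # TC1 = 0, TC2 = 0: ACy = ACx - (Smem << #16)
        result = acx - operand
    return result & MASK40


for tc1, tc2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(tc1, tc2, f"{addsubcc(0x0000010000, 0x0001, tc1, tc2):010X}")
```

Testing TC2 first mirrors the table: both rows with TC2 = 1 reduce to a plain move.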

## ADDSUB2CC <br> Addition or Subtraction Conditionally with Shift

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | ADDSUB2CC Smem, ACx, Tx, TC1, TC2, ACy | No | 3 | 1 | X |

Operands ACx, ACy, Tx, Smem, TC1, TC2


| TC1 | TC2 | Operation |
| :---: | :---: | :---: |
| 0 | 0 | $\mathrm{ACy}=\mathrm{ACx}-($ Smem $\ll \mathrm{Tx})$ |
| 0 | 1 | $\mathrm{ACy}=\mathrm{ACx}-($ Smem $\ll \# 16)$ |
| 1 | 0 | $\mathrm{ACy}=\mathrm{ACx}+($ Smem $\ll \mathrm{Tx})$ |
| 1 | 1 | $\mathrm{ACy}=\mathrm{ACx}+($ Smem $\ll \# 16)$ |

- TC1 = 0 and TC2 = 0, then ACy = ACx - (Smem << Tx):

  This instruction subtracts the content of a memory (Smem) location shifted left by the content of Tx from an accumulator ACx and stores the result in accumulator ACy.
- TC1 = 0 and TC2 = 1, then ACy = ACx - (Smem << #16):

  This instruction subtracts the content of a memory (Smem) location shifted left by 16 bits from an accumulator ACx and stores the result in accumulator ACy.

  - The operation is performed on 40 bits in the D-unit shifter.
  - Input operands are sign extended to 40 bits according to SXMD.
  - The shift operation is equivalent to the signed shift instruction.
  - Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
  - When an overflow is detected, the accumulator is saturated according to SATD.
- TC1 = 1 and TC2 = 0, then ACy = ACx + (Smem << Tx):

  This instruction performs an addition operation between an accumulator ACx and the content of a memory (Smem) location shifted left by the content of Tx and stores the result in accumulator ACy.
- TC1 = 1 and TC2 = 1, then ACy = ACx + (Smem << #16):

  This instruction performs an addition operation between an accumulator ACx and the content of a memory (Smem) location shifted left by 16 bits and stores the result in accumulator ACy.

  - The operation is performed on 40 bits in the D-unit shifter.
  - Input operands are sign extended to 40 bits according to SXMD.
  - The shift operation is equivalent to the signed shift instruction.
  - Overflow detection and the CARRY status bit depend on M40.
  - When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$ :

- An intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.
- The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
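
The shift-quantity rule for C54CM = 1 can be illustrated with a few lines of Python. This is a sketch of one reading of the text: the "modulo 16 operation" is assumed to map -32 to -16 and -17 to -1, which matches the stated output range. The function name `c54cm_shift_quantity` is hypothetical.

```python
def c54cm_shift_quantity(tx):
    """Derive the effective shift count from Tx when C54CM = 1 (assumed reading)."""
    q = tx & 0x3F               # keep only the 6 LSBs of Tx
    if q & 0x20:
        q -= 0x40               # interpret as signed 6-bit value: -32..+31
    if -32 <= q <= -17:
        # Python's % always returns 0..15 here, so this yields -16..-1.
        q = (q % 16) - 16
    return q


for tx in (0x0002, 0x001F, 0x0020, 0x002F, 0x0030, 0xFFFF):
    print(f"Tx = {tx:04X} -> shift {c54cm_shift_quantity(tx)}")
```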

| Status Bits | Affected by | C54CM, M40, SATD, SXMD, TC1, TC2 |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |
| See Also | See the following other related instructions: |  |

- ADDSUBCC (Addition or Subtraction Conditionally)
- ADDSUBCC (Addition, Subtraction, or Move Accumulator Content Conditionally)

## Example

| Syntax | Description |
| :--- | :--- |
| ADDSUB2CC *AR2, AC0, T1, TC1, TC2, AC2 | With TC1 = 1 and TC2 = 0, the content addressed by AR2 shifted left by the content of T1 is added to the content of AC0 and the result is stored in AC2. The addition generates an overflow. |


| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | 00 | EC00 | 0000 | AC0 | 00 | EC00 | 0000 |
| AC2 | 00 | 0000 | 0000 | AC2 | 00 | EC00 | CC00 |
| AR2 |  |  | 0201 | AR2 |  |  | 0201 |
| 201 |  |  | 3300 | 201 |  |  | 3300 |
| T1 |  |  | 0002 | T1 |  |  | 0002 |
| TC1 |  |  | 1 | TC1 |  |  | 1 |
| TC2 |  |  | 0 | TC2 |  |  | 0 |
| M40 |  |  | 0 | M40 |  |  | 0 |
| ACOV2 |  |  | 0 | ACOV2 |  |  | 1 |
| CARRY |  |  | 0 | CARRY |  |  | 0 |
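
The register values in this example can be cross-checked with a rough Python model of the instruction. The sketch assumes SXMD = 1, SATD = 0 (no saturation), C54CM = 0, and a non-negative shift count in Tx; CARRY is not modeled. The names `sext` and `addsub2cc` are hypothetical.

```python
MASK40 = (1 << 40) - 1


def sext(value, bits):
    """Sign extend the low `bits` bits of value."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def addsub2cc(acx, smem, tx, tc1, tc2, m40=0):
    """Return (acy, acov) for a non-negative Tx, assuming SXMD = 1 and SATD = 0."""
    shift = 16 if tc2 else tx               # TC2 selects the fixed #16 shift
    operand = sext(smem, 16) << shift
    a = sext(acx, 40)
    result = a + operand if tc1 else a - operand  # TC1 selects add or subtract
    width = 40 if m40 else 32               # overflow checked at bit 39 or bit 31
    limit = 1 << (width - 1)
    acov = int(not (-limit <= result < limit))
    return result & MASK40, acov


acy, acov = addsub2cc(0x00EC000000, 0x3300, 0x0002, tc1=1, tc2=0, m40=0)
print(f"AC2 = {acy:010X}, ACOV2 = {acov}")
```

With the Before values from the table, the model returns AC2 = 00 EC00 CC00 and ACOV2 = 1, in agreement with the After column.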

## ADDV

Syntax Characteristics

See Also See the following other related instructions:

- ABS (Absolute Value)
- ADD (Addition)
- ADDSUBCC (Addition or Subtraction Conditionally)
- ADDSUBCC (Addition, Subtraction, or Move Accumulator Content Conditionally)
- ADDSUB2CC (Addition or Subtraction Conditionally with Shift)

## Example

| Syntax | Description |
| :--- | :--- |
| ADDV AC1, AC0 | The absolute value of AC1 is added to the content of AC0 and the result is stored <br> in AC0. |

## AMAR

Modify Auxiliary Register Content

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | AMAR Smem | No | 2 | 1 | AD |

Opcode 1011 0100 | AAAA AAAI

Operands Smem

Description This instruction performs, in the A-unit address generation units, the auxiliary register modification specified by Smem as if a word single data memory operand access was made. The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

If the destination register is an auxiliary register and the corresponding bit (ARnLC) in status register ST2_55 is set to 1, the circular buffer management controls the result stored in the destination register.

## Compatibility with C54x devices (C54CM = 1)

In the translated code section, the AMAR instruction must be executed with C54CM set to 1.

When circular modification is selected for the destination auxiliary register, this instruction modifies the selected destination auxiliary register by using BK03 as the circular buffer size register; BK47 is not used.

| Status Bits | Affected by | ST2_55 |
| :--- | :--- | :--- |
|  | Affects | none |
| Repeat | This instruction can be repeated. |  |


See Also See the following other related instructions:

- AADD (Modify Auxiliary or Temporary Register Content by Addition)
- AADD (Modify Extended Auxiliary Register Content by Addition)
- AMAR (Modify Extended Auxiliary Register Content)
- AMAR (Parallel Modify Auxiliary Register Contents)
- AMAR::MAC (Modify Auxiliary Register Content with Parallel Multiply and Accumulate)
- AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
- AMAR::MPY (Modify Auxiliary Register Content with Parallel Multiply)
- AMOV (Modify Auxiliary or Temporary Register Content)
- ASUB (Modify Auxiliary or Temporary Register Content by Subtraction)
- ASUB (Modify Extended Auxiliary Register Content by Subtraction)

## Example

| Syntax | Description |
| :--- | :--- |
| AMAR *AR3+ | The content of AR3 is incremented by 1. |
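
The difference between linear and circular modification for `AMAR *AR3+` can be sketched in Python. The circular rule used here is an assumption that this section does not spell out: the register is treated as an index into a buffer of `bk` words and wraps back to the start once it reaches the buffer size. The function name `amar_post_increment` is hypothetical.

```python
def amar_post_increment(arn, circular=False, bk=0):
    """Return ARn after a *ARn+ modification.

    Linear mode wraps at 16 bits. Circular mode (ARnLC = 1) is modeled as an
    index that wraps at the buffer size bk; this is an assumed simplification.
    """
    if not circular:
        return (arn + 1) & 0xFFFF
    nxt = arn + 1
    return nxt - bk if nxt >= bk else nxt


print(hex(amar_post_increment(0x0200)))                # linear increment
print(amar_post_increment(3, circular=True, bk=4))     # wraps to buffer start
```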

## AMAR

## Modify Extended Auxiliary Register Content

## Syntax Characteristics



## AMAR

Parallel Modify Auxiliary Register Contents

## Syntax Characteristics



## AMAR::MAC <br> Modify Auxiliary Register Content with Parallel Multiply and Accumulate

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | AMAR Xmem <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx | No | 4 | 1 | X |
| [2] | AMAR Xmem <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx >> #16 | No | 4 | 1 | X |

Description These instructions perform two parallel operations in one cycle: modify auxiliary register (MAR), and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy
See Also See the following other related instructions:

- AMAR (Modify Auxiliary Register Content)
- AMAR::MPY (Modify Auxiliary Register Content with Parallel Multiply)
- AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
- MAC (Multiply and Accumulate)

Modify Auxiliary Register Content with Parallel Multiply and Accumulate
Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | AMAR Xmem <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx | No | 4 | 1 | X |

Opcode 1000 0011 | XXXM MMYY | YMMM 11mm | uuxx DDg%

Operands ACx, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: modify auxiliary register (MAR), and multiply and accumulate (MAC):

```
mar(Xmem)
:: ACx = ACx + (Ymem * Cmem)
```

The operations are executed in the two D-unit MACs. The first operation performs an auxiliary register modification. The auxiliary register modification is specified by the content of data memory operand Xmem.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode, extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.
- For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- AMAR Xmem
- AMAR Ymem
- AMAR Cmem

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx
Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| AMAR *AR3+ <br> :: MAC uns(*AR4), uns(*CDP), AC0 | Both instructions are performed in parallel. AR3 is incremented by 1. The unsigned content addressed by AR4 multiplied by the unsigned content addressed by the coefficient data pointer register (CDP) is added to the content of AC0 and the result is stored in AC0. |
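
The multiply-and-accumulate half of this instruction can be approximated with the Python sketch below. It is a simplified model under stated assumptions: SXMD = 1 for operands without uns, no rounding (R keyword absent), no SMUL handling, and no saturation. The names `extend17`, `sext40`, and `mac` are hypothetical and the model works on exact Python integers rather than on the hardware data paths.

```python
MASK40 = (1 << 40) - 1


def extend17(word, unsigned):
    """Extend a 16-bit word to 17 bits: zero extend for uns, else sign extend."""
    word &= 0xFFFF
    if unsigned or not (word & 0x8000):
        return word
    return word - 0x10000


def sext40(value):
    """Interpret a 40-bit accumulator image as a signed integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def mac(acx, ymem, cmem, uns_y=False, uns_c=False, frct=0, m40=0):
    """Return (acx_new, acov) for ACx = ACx + (Ymem * Cmem)."""
    product = extend17(ymem, uns_y) * extend17(cmem, uns_c)
    if frct:
        product <<= 1                 # FRCT = 1: multiplier output shifted left by 1
    result = sext40(acx) + product
    width = 40 if m40 else 32         # overflow detection depends on M40
    limit = 1 << (width - 1)
    acov = int(not (-limit <= result < limit))
    return result & MASK40, acov


print(mac(0, 0xFFFF, 0xFFFF))                          # signed: (-1) * (-1)
print(mac(0, 0xFFFF, 0xFFFF, True, True, m40=1))       # unsigned: 65535 * 65535
```

Passing `m40=1` corresponds to the optional 40 keyword, which locally sets M40 to 1 for the instruction.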

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [2] | AMAR Xmem <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx >> \#16 | No | 4 | 1 | X |

Operands ACx, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: modify auxiliary register (MAR), and multiply and accumulate (MAC):

`mar(Xmem) :: ACx = (ACx >> #16) + (Ymem * Cmem)`

The operations are executed in the two D-unit MACs. The first operation performs an auxiliary register modification. The auxiliary register modification is specified by the content of data memory operand Xmem.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode, extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx shifted right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACx(39).
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.
- For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- AMAR Xmem
- AMAR Ymem
- AMAR Cmem
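
What distinguishes this syntax from syntax [1] is the `ACx >> #16` term, which is shifted with sign extension from ACx(39) before the product is added. A minimal Python sketch of that shift follows; `asr16_40` is a hypothetical helper name and the function models only the shift, not the multiply or the overflow logic.

```python
MASK40 = (1 << 40) - 1


def asr16_40(acx):
    """Arithmetic right shift of a 40-bit accumulator image by 16 bits."""
    value = acx & MASK40
    if value & (1 << 39):
        value -= 1 << 40          # recover the signed value from bit 39
    return (value >> 16) & MASK40  # Python's >> keeps the sign on negative ints


print(f"{asr16_40(0x0012345678):010X}")
print(f"{asr16_40(0x8000000000):010X}")
```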


| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx |
| Repeat | This instruction can be repeated. |  |

## Example


## AMAR::MAS

## Syntax Characteristics



The operations are executed in the two D-unit MACs. The first operation performs an auxiliary register modification. The auxiliary register modification is specified by the content of data memory operand Xmem.

The second operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode, extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.
- For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- AMAR Xmem
- AMAR Ymem
- AMAR Cmem

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx
Repeat This instruction can be repeated.
See Also See the following other related instructions:

- AMAR (Modify Auxiliary Register Content)
- AMAR::MAC (Modify Auxiliary Register Content with Parallel Multiply and Accumulate)
- AMAR::MPY (Modify Auxiliary Register Content with Parallel Multiply)
- MAS (Multiply and Subtract)


## Example

| Syntax | Description |
| :--- | :--- |
| AMAR *AR3+ <br> :: MAS uns(*AR4), uns(*CDP), AC0 | Both instructions are performed in parallel. AR3 is incremented by 1. The unsigned content addressed by AR4 multiplied by the unsigned content addressed by the coefficient data pointer register (CDP) is subtracted from the content of AC0 and the result is stored in AC0. |

## AMAR::MPY <br> Modify Auxiliary Register Content with Parallel Multiply

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | AMAR Xmem <br> :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx | No | 4 | 1 | X |

Opcode 1000 0010 | XXXM MMYY | YMMM 11mm | uuxx DDg%

Operands ACx, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: modify auxiliary register (MAR) and multiply:

```
mar(Xmem)
:: ACx = Ymem * Cmem
```

The operations are executed in the two D-unit MACs. The first operation performs an auxiliary register modification. The auxiliary register modification is specified by the content of data memory operand Xmem.

The second operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode, extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.
- For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- AMAR Xmem
- AMAR Ymem
- AMAR Cmem

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx |
| Repeat | This instruction can be repeated. |  |
| See Also | See the following other related instructions: |  |

- AMAR (Modify Auxiliary Register Content)
- AMAR::MAC (Modify Auxiliary Register Content with Parallel Multiply and Accumulate)
- AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
- MPY (Multiply)

## Example

| Syntax | Description |
| :--- | :--- |
| AMAR *AR3+ <br> :: MPY uns(*AR4), uns(*CDP), AC0 | Both instructions are performed in parallel. AR3 is incremented by 1. The unsigned content addressed by AR4 is multiplied by the unsigned content addressed by the coefficient data pointer register (CDP) and the result is stored in AC0. |

## AMOV

## Load Extended Auxiliary Register with Immediate Value

## Syntax Characteristics



## AMOV

Modify Auxiliary or Temporary Register Content
Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | AMOV TAx, TAy | No | 3 | 1 | AD |
| [2] | AMOV P8, TAx | No | 3 | 1 | AD |
| [3] | AMOV D16, TAx | No | 4 | 1 | AD |

Description These instructions perform, in the A-unit address generation units:

- a move from auxiliary or temporary register TAx to auxiliary or temporary register TAy
- a load in the auxiliary or temporary register TAx of a program address defined by a program address label assembled into P8
- a load in the auxiliary or temporary register TAx of the absolute data address signed constant D16

The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

Status Bits
Affected by none
Affects none
See Also See the following other related instructions:

- AADD (Modify Auxiliary or Temporary Register Content by Addition)
- AMAR (Modify Auxiliary Register Content)
- AMAR (Modify Extended Auxiliary Register Content)
- ASUB (Modify Auxiliary or Temporary Register Content by Subtraction)
- MOV (Load Auxiliary or Temporary Register from Memory)

Modify Auxiliary or Temporary Register Content

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | AMOV TAx, TAy | No | 3 | 1 | AD |

| Opcode | 0001 010E | FSSS xxxx | FDDD 0001 |
| :--- | :--- | :--- | :--- |
|  | 0001 010E | FSSS xxxx | FDDD 1001 |

The assembler selects the opcode depending on the instruction position in a paralleled pair.

Operands TAx, TAy

Description This instruction performs, in the A-unit address generation units, a move from the auxiliary or temporary register TAx to auxiliary or temporary register TAy. The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.
Example 1

| Syntax | Description |
| :--- | :--- |
| AMOV AR1, AR0 | The content of AR1 is copied to AR0. |

## Example 2

| Syntax | Description |
| :--- | :--- |
| AMOV T1, T0 | The content of T1 is copied to T0. |

Modify Auxiliary or Temporary Register Content

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | AMOV P8, TAx | No | 3 | 1 | AD |

| Opcode | 0001 010E | PPPP PPPP | FDDD 0101 |
| :--- | :--- | :--- | :--- |
|  | 0001 010E | PPPP PPPP | FDDD 1101 |

The assembler selects the opcode depending on the instruction position in a paralleled pair.

## Operands

Description

Status Bits

Repeat This instruction can be repeated.

## Example 1

| Syntax | Description |
| :--- | :--- |
| AMOV \#255, AR0 | The unsigned 8-bit value (255) is copied to AR0. |

## Example 2

| Syntax | Description |
| :--- | :--- |
| AMOV \#255, T0 | The unsigned 8-bit value (255) is copied to T0. |

## AMOV

## Modify Extended Auxiliary Register Content

## Syntax Characteristics



## Example 1

| Syntax | Description |
| :--- | :--- |
| mar(XAR1 = XAR0) | The content of XAR0 is copied to XAR1. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| XAR0 | 12 3456 | XAR0 | 12 3456 |
| XAR1 | 43 5634 | XAR1 | 12 3456 |

## Example 2

| Syntax | Description |
| :--- | :--- |
| mar(XCDP = XAR7) | The content of XAR7 is copied to XCDP. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| XCDP | 008000 | XCDP | 014000 |
| XAR7 | 014000 | XAR7 | 014000 |

## Execution

(XACsrc) -> XACdst

## AND

Bitwise AND

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | AND src, dst | Yes | 2 | 1 | X |
| $[2]$ | AND k8, src, dst | Yes | 3 | 1 | X |
| $[3]$ | AND k16, src, dst | No | 4 | 1 | X |
| $[4]$ | AND Smem, src, dst | No | 3 | 1 | X |
| $[5]$ | AND ACx << \#SHIFTW[, ACy] | Yes | 3 | 1 | X |
| $[6]$ | AND k16 <<\#16, [ACx,] ACy | No | 4 | 1 | X |
| $[7]$ | AND k16 <<\#SHFT, [ACx,] ACy | No | 4 | 1 | X |
| $[8]$ | AND k16, Smem | No | 4 | 1 | X |

Description These instructions perform a bitwise AND operation:

- In the D-unit, if the destination operand is an accumulator.
- In the A-unit ALU, if the destination operand is an auxiliary or temporary register.
- In the A-unit ALU, if the destination operand is the memory.


| Status Bits | Affected by | C54CM |
| :--- | :--- | :--- |
|  | Affects | none |

See Also See the following other related instructions:

- BAND (Bitwise AND Memory with Immediate Value and Compare to Zero)
- OR (Bitwise OR)
- XOR (Bitwise Exclusive OR)


## Bitwise AND

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | AND src, dst | Yes | 2 | 1 | X |

| Opcode | 0010 100E | FSSS FDDD |
| :--- | :---: | :---: |

Operands dst, src

Description This instruction performs a bitwise AND operation between two registers:

dst = dst & src

- When the destination (dst) operand is an accumulator:
  - The operation is performed on 40 bits in the D-unit ALU.
  - Input operands are zero extended to 40 bits.
  - If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are zero extended.
- When the destination (dst) operand is an auxiliary or temporary register:
  - The operation is performed on 16 bits in the A-unit ALU.
  - If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
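The width rules above can be checked with a small behavioral model. The following Python sketch is illustrative only and is not TI tooling; the function name `and_reg` and its boolean flags are hypothetical. It assumes 40-bit accumulators and 16-bit auxiliary or temporary registers, as the description states.

```python
ACC_MASK = (1 << 40) - 1   # accumulators are 40 bits wide
TA_MASK = 0xFFFF           # auxiliary/temporary registers are 16 bits wide


def and_reg(src, dst, src_is_acc, dst_is_acc):
    """Model of AND src, dst: return the new dst value (hypothetical helper)."""
    if dst_is_acc:
        # 40-bit operation; a TAx source contributes its 16 LSBs, zero extended.
        operand = src & (ACC_MASK if src_is_acc else TA_MASK)
        return (dst & ACC_MASK) & operand
    # 16-bit operation; an accumulator source contributes only its 16 LSBs.
    return (dst & TA_MASK) & (src & TA_MASK)


# Operand values taken from the AND AC0, AC1 example in this section.
new_ac1 = and_reg(0x7E23554FC0, 0x0FE3405678, src_is_acc=True, dst_is_acc=True)
print(f"{new_ac1:010X}")
```

With an accumulator source and a 16-bit destination, only the low 16 bits of the accumulator take part, which is why the model masks `src` before the AND.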

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| AND AC0, AC1 | The content of AC0 is ANDed with the content of AC1 and the result is stored in AC1. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AC0 | 7E 2355 4FC0 | AC0 | 7E 2355 4FC0 |
| AC1 | 0F E340 5678 | AC1 | 0E 2340 4640 |

## Bitwise AND

## Syntax Characteristics



- The shift and AND operations are performed in one cycle in the D-unit shifter.
- When M40 = 0 and C54CM = 0, input operands ACx(31-0) are zero extended to 40 bits. Otherwise, ACx(39-0) is used as is.
- The input operand (ACx) is shifted by a 6-bit immediate value in the D-unit shifter.
- The CARRY status bit is not affected by the logical shift operation.


## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, the intermediary logical shift is performed as if M40 is locally set to 1. The 8 upper bits of the 40-bit intermediary result are not cleared.

| Status Bits | Affected by | C54CM, M40 |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

Example

| Syntax | Description |
| :--- | :--- |
| AND AC1 << \#30, AC0 | The content of AC0 is ANDed with the content of AC1 logically shifted left by 30 bits and the result is stored in AC0. |

Bitwise AND

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[6]$ | AND k16 << \#16, [ACx,] ACy | No | 4 | 1 | X |

| Opcode | 0111 1010 | kkkk kkkk | kkkk kkkk | SSDD 010x |
| :--- | :---: | :---: | :---: | :---: |

## Operands

Description This instruction performs a bitwise AND operation between an accumulator (ACx) content and a 16-bit unsigned constant, k16, shifted left by 16 bits:

```
ACy = ACx & (k16 <<< #16)
```

- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are zero extended to 40 bits.
- The input operand (k16) is shifted 16 bits to the MSBs.
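As a rough illustration of the three points above, the Python sketch below models the operation on 40-bit values. The helper name `and_k16_shl16` is hypothetical and the code is a behavioral approximation, not a cycle-accurate or official model.

```python
ACC_MASK = (1 << 40) - 1   # accumulators are 40 bits wide


def and_k16_shl16(acx, k16):
    """Model of AND k16 << #16, [ACx,] ACy: return the new ACy value."""
    if not 0 <= k16 <= 0xFFFF:
        raise ValueError("k16 must be an unsigned 16-bit constant")
    # The constant is zero extended to 40 bits and lands in bits 31-16.
    mask = (k16 << 16) & ACC_MASK
    return (acx & ACC_MASK) & mask


# A mask of FFFFh keeps bits 31-16 and clears bits 39-32 and 15-0.
print(f"{and_k16_shl16(0x7E23554FC0, 0xFFFF):010X}")
```

Because the shifted constant is zero extended, the guard bits (39-32) and the low half (15-0) of the result are always 0 in this model.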
| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

Example

| Syntax | Description |
| :--- | :--- |
| AND \#FFFFh <<\#16, AC1, AC0 | The content of AC1 is ANDed with the unsigned 16-bit value (FFFFh) <br> logically shifted left by 16 bits and the result is stored in AC0. |

Bitwise AND

## Syntax Characteristics



- The operation is performed on 16 bits in the A-unit ALU.
- The result is stored in memory.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| AND \#0FC0, *AR1 | The content addressed by AR1 is ANDed with the unsigned 16-bit value (FC0h) and the result is stored in the location addressed by AR1. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| *AR1 | 5678 | *AR1 | 0640 |

## ASUB

Modify Auxiliary or Temporary Register Content by Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | ASUB TAx, TAy | No | 3 | 1 | AD |
| $[2]$ | ASUB P8, TAx | No | 3 | 1 | AD |

Description These instructions perform, in the A-unit address generation units:

- a subtraction between two auxiliary or temporary registers, TAy and TAx, and store the result in TAy
- a subtraction between the auxiliary or temporary register TAx and a program address defined by a program address label assembled into unsigned P8, and store the result in TAx

The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

If the destination register is an auxiliary register and the corresponding bit (ARnLC) in status register ST2_55 is set to 1, the circular buffer management controls the result stored in the destination register.

| Status Bits | Affected by | ST2_55 |
| :--- | :--- | :--- |
|  | Affects | none |

See Also See the following other related instructions:

- AADD (Modify Auxiliary or Temporary Register Content by Addition)
- AMAR (Modify Auxiliary Register Content)
- AMAR (Modify Extended Auxiliary Register Content)
- AMOV (Modify Auxiliary or Temporary Register Content)

Modify Auxiliary or Temporary Register Content by Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | ASUB TAx, TAy | No | 3 | 1 | AD |

## Operands

Description This instruction performs, in the A-unit address generation units, a subtraction between two auxiliary or temporary registers, TAy and TAx, and stores the result in TAy. The content of TAx is considered signed:

TAy = TAy - TAx

The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

If the destination register is an auxiliary register and the corresponding bit (ARnLC) in status register ST2_55 is set to 1, the circular buffer management controls the result stored in the destination register.

## Compatibility with C54x devices (C54CM = 1)

In the translated code section, the ASUB instruction must be executed with C54CM set to 1.

When circular modification is selected for the destination auxiliary register, this instruction modifies the selected destination auxiliary register by using BK03 as the circular buffer size register; BK47 is not used.

| Status Bits | Affected by | ST2_55 |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

## Example 1

| Syntax | Description |
| :--- | :--- |
| ASUB T0, AR0 | The signed content of T0 is subtracted from the content of AR0 and the result is stored in AR0. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| XAR0 | 018000 | XAR0 | 010000 |
| T0 | 8000 | T0 | 8000 |
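The before/after values in this example can be reproduced with a short Python sketch. It is a simplified model under stated assumptions: circular addressing is disabled (ARnLC = 0), the subtraction wraps on 16 bits, and the upper bits of the extended register are left untouched, which matches the XAR0 values shown. The function name `asub_ar` is hypothetical.

```python
def asub_ar(xar, tax):
    """Model of ASUB TAx, ARy in linear mode: return the new extended register value.

    Assumptions: ARnLC = 0 (no circular buffer management) and only the
    low 16 bits (ARy) of the extended register are modified.
    """
    high = xar & 0x7F0000            # upper bits of the extended register, unchanged
    low = (xar - tax) & 0xFFFF       # 16-bit two's-complement wraparound
    return high | low


# XAR0 = 018000h, T0 = 8000h gives XAR0 = 010000h, as in the table above.
print(f"{asub_ar(0x018000, 0x8000):06X}")
```

In 16-bit modular arithmetic the signed and unsigned interpretations of TAx give the same bit pattern, so the model does not need a separate sign-extension step.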

## Example 2

| Syntax | Description |
| :--- | :--- |
| ASUB T1, T0 | The content of T1 is subtracted from the content of T0 and the result is stored in T0. |

## Modify Auxiliary or Temporary Register Content by Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[2]$ | ASUB P8, TAx | No | 3 | 1 | AD |

| Opcode | 0001 010E | PPPP PPPP | FDDD 0110 |
| :--- | :---: | :---: | :---: |
|  | 0001 010E | PPPP PPPP | FDDD 1110 |

The assembler selects the opcode depending on the instruction position in a paralleled pair.

## Operands

TAx, P8

Description This instruction performs, in the A-unit address generation units, a subtraction between the auxiliary or temporary register TAx and a program address defined by a program address label assembled into unsigned P8, and stores the result in TAx:

TAx = TAx - P8

The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

If the destination register is an auxiliary register and the corresponding bit (ARnLC) in status register ST2_55 is set to 1, the circular buffer management controls the result stored in the destination register.

## Compatibility with C54x devices (C54CM = 1)

In the translated code section, the ASUB instruction must be executed with C54CM set to 1.

When circular modification is selected for the destination auxiliary register, this instruction modifies the selected destination auxiliary register by using BK03 as the circular buffer size register; BK47 is not used.

| Status Bits | Affected by | ST2_55 |
| :--- | :--- | :--- |
|  | Affects | none |

## Example

| Syntax | Description |
| :--- | :--- |
| ASUB \#255, AR0 | The unsigned 8-bit value (255) is subtracted from the signed content of AR0 and <br> the result is stored in AR0. |

## ASUB

Modify Extended Auxiliary Register Content by Subtraction

## Syntax Characteristics



## Compatibility with C54x devices (C54CM = 1)

None.
Status Bits Affected by
Affects
Repeat This instruction can be repeated.

See Also See the following other related instructions:

- AADD (Modify Auxiliary or Temporary Register Content by Addition)
- AADD (Modify Extended Auxiliary Register Content by Addition)
- AMAR (Modify Extended Auxiliary Register Content)
- AMAR (Parallel Modify Auxiliary Register Contents)
- AMAR::MAC (Modify Auxiliary Register Content with Parallel Multiply and Accumulate)
- AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
- AMAR::MPY (Modify Auxiliary Register Content with Parallel Multiply)
- AMOV (Modify Auxiliary or Temporary Register Content)
- ASUB (Modify Auxiliary or Temporary Register Content by Subtraction)

## Example 1

| Syntax | Description |
| :--- | :--- |
| ASUB XAR0, XAR1 | The content of XAR0 is subtracted from XAR1 and stored in XAR1. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| XAR0 | 12 3456 | XAR0 | 12 3456 |
| XAR1 | 43 5634 | XAR1 | 31 21DE |

## Example 2

| Syntax | Description |
| :--- | :--- |
| ASUB XAR7, XCDP | The content of XAR7 is subtracted from XCDP and stored in XCDP. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| XCDP | 008000 | XCDP | 007000 |
| XAR7 | 001000 | XAR7 | 001000 |

Execution
(XACdst) - (XACsrc) -> XACdst
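The execution line can be sanity-checked against both examples with a minimal Python sketch. The 23-bit register width used here is an assumption of this model, and `asub_xar` is a hypothetical name, not part of any TI tool.

```python
# Assumption: extended auxiliary registers (XARx, XCDP) are 23 bits wide.
XAR_MASK = (1 << 23) - 1


def asub_xar(xac_src, xac_dst):
    """Model of ASUB XACsrc, XACdst: (XACdst) - (XACsrc) -> XACdst."""
    return (xac_dst - xac_src) & XAR_MASK


print(f"{asub_xar(0x123456, 0x435634):06X}")  # Example 1 operands
print(f"{asub_xar(0x001000, 0x008000):06X}")  # Example 2 operands
```

The source operand is left unchanged; only the destination value is returned.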

## B

Branch Unconditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | B ACx | No | 2 | 10 | X |
| $[2]$ | B L7 | Yes | 2 | $6^{\dagger}$ | AD |
| $[3]$ | B L16 | Yes | 3 | $6^{\dagger}$ | AD |
| $[4]$ | B P24 | No | 4 | 5 | D |

† This instruction executes in 3 cycles if the addressed instruction is in the instruction buffer unit.

Description This instruction branches to a 24-bit program address defined by the content of the 24 lowest bits of an accumulator (ACx), or to a program address defined by the program address label assembled into Lx or P24.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat These instructions cannot be repeated.

See Also See the following other related instructions:

- BCC (Branch Conditionally)
- BCC (Branch on Auxiliary Register Not Zero)
- BCC (Compare and Branch)
- CALL (Call Unconditionally)

## Branch Unconditionally

Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles $^{\dagger}$ | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[2]$ | B L7 | Yes | 2 | 6 | AD |
| $[3]$ | B L16 | Yes | 3 | 6 | AD |

$\dagger$ Executes in 3 cycles if the addressed instruction is in the instruction buffer unit.

| Opcode | L7 | 0100 | 101E | 0LLL | LLLL |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  | L16 | LLLL | LLLL | LLLL | LLLL |

Operands Lx

Description This instruction branches to a program address defined by a program address label assembled into Lx.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction cannot be repeated.

Example

| Syntax | Description |
| :--- | :--- |
| B branch | Program control is passed to the absolute address defined by branch. |



## BAND

Bitwise AND Memory with Immediate Value and Compare to Zero

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | BAND Smem, k16, TC1 | No | 4 | 1 | X |
| $[2]$ | BAND Smem, k16, TC2 | No | 4 | 1 | X |


| Opcode | TC1 | 1111 0010 | AAAA AAAI | kkkk kkkk | kkkk kkkk |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  | TC2 | 1111 0011 | AAAA AAAI | kkkk kkkk | kkkk kkkk |

Operands k16, Smem, TCx

Description This instruction performs a bit field manipulation in the A-unit ALU. The 16-bit field mask, k16, is ANDed with the memory (Smem) operand and the result is compared to 0:

```
if( ((Smem) AND k16 ) == 0)
    TCx = 0
else
    TCx = 1
```

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | TCx |

Repeat This instruction can be repeated.

See Also See the following other related instructions:

- AND (Bitwise AND)

## Example

| Syntax | Description |
| :--- | :--- |
| BAND *AR0, \#0060h, TC1 | The unsigned 16-bit value (0060h) is ANDed with the content addressed by AR0. The result is not 0, so TC1 is set to 1. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| *AR0 | 0040 | *AR0 | 0040 |
| TC1 | 0 | TC1 | 1 |
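The test performed by BAND can be expressed as a tiny Python model. This is only a sketch of the pseudocode given in the description; the function name `band` is hypothetical, and the memory operand is passed in as a plain integer.

```python
def band(smem, k16):
    """Model of BAND Smem, k16, TCx: return the new TCx value (0 or 1)."""
    # TCx is cleared when no masked bit is set, and set otherwise.
    return 0 if (smem & k16 & 0xFFFF) == 0 else 1


# Values from the example: *AR0 = 0040h, mask = 0060h.
print(band(0x0040, 0x0060))
```

Note that the memory operand itself is not modified; only the selected test/control bit changes.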

## BCC

## Branch Conditionally

Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles $^{\dagger}$ | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | BCC l4, cond | No | 2 | $6 / 5$ | R |
| $[2]$ | BCC L8, cond | Yes | 3 | $6 / 5$ | R |
| $[3]$ | BCC L16, cond | No | 4 | $6 / 5$ | R |
| $[4]$ | BCC P24, cond | No | 5 | $5 / 5$ | R |

$\dagger \mathrm{x} / \mathrm{y}$ cycles: x cycles $=$ condition true, y cycles $=$ condition false

Description These instructions evaluate a single condition defined by the cond field in the read phase of the pipeline. If the condition is true, a branch occurs to the program address label assembled into l4, Lx, or P24. There is a 1-cycle latency on the condition setting. A single condition can be tested as determined by the cond field of the instruction. See Table 1-3 for a list of conditions.

The instruction selection depends on the branch offset between the current PC value and the program branch address specified by the label.

These instructions cannot be repeated.
| Status Bits | Affected by | ACOVx, CARRY, C54CM, M40, TCx |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

See Also See the following other related instructions:

- B (Branch Unconditionally)
- BCC (Branch on Auxiliary Register Not Zero)
- BCC (Compare and Branch)
- CALLCC (Call Conditionally)


## Branch Conditionally

Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles $^{\dagger}$ | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | BCC l4, cond | No | 2 | $6 / 5$ | R |

$\dagger \mathrm{x} / \mathrm{y}$ cycles: x cycles $=$ condition true, y cycles $=$ condition false

## Opcode

## Operands

Description

| Status Bits | Affected by | ACOVx, CARRY, C54CM, M40, TCx |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

Repeat This instruction cannot be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| BCC branch, AC0 != \#0 | The content of AC0 is not equal to 0, control is passed to the program address <br> label defined by branch. |



## Branch Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles $^{\dagger}$ | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[2]$ | BCC L8, cond | Yes | 3 | $6 / 5$ | $R$ |
| $[3]$ | BCC L16, cond | No | 4 | $6 / 5$ | $R$ |

$\dagger \mathrm{x} / \mathrm{y}$ cycles: x cycles $=$ condition true, y cycles $=$ condition false


## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, the comparison of accumulators to 0 is performed as if M40 was set to 1.

| Status Bits | Affected by | ACOVx, CARRY, C54CM, M40, TCx |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

Repeat This instruction cannot be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| BCC branch, AC0 != \#0 | The content of AC0 is not equal to 0, control is passed to the program address <br> label defined by branch. |



## Branch Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles $^{\dagger}$ | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[4]$ | BCC P24, cond | No | 5 | $5 / 5$ | R |

$\dagger \mathrm{x} / \mathrm{y}$ cycles: x cycles $=$ condition true, y cycles $=$ condition false


## Operands

Description

| Status Bits | Affected by | ACOVx, CARRY, C54CM, M40, TCx |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

Repeat This instruction cannot be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| BCC branch, AC0 != \#0 | The content of AC0 is not equal to 0, control is passed to the program address <br> label defined by branch. |



## BCC

Branch on Auxiliary Register Not Zero

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles $^{\dagger}$ | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | BCC L16, ARn_mod != \#0 | No | 4 | $6 / 5$ | AD |

$\dagger \mathrm{x} / \mathrm{y}$ cycles: x cycles $=$ condition true, y cycles $=$ condition false

## Opcode

1111 1100 | AAAA AAAI | LLLL LLLL | LLLL LLLL

## Operands

ARn_mod, L16

Description This instruction performs a conditional branch (selected auxiliary register content not equal to 0) of the program counter (PC). The program branch address is specified as a 16-bit signed offset, L16, relative to PC. Use this instruction to branch within a 64K-byte window centered on the current PC value.

The possible addressing operands can be grouped into three categories:

- ARx not modified (ARx as base pointer), some examples:
  - *AR1; No modification or offset
  - *AR1(\#15); Use 16-bit immediate value (15) as offset
  - *AR1(T0); Use content of T0 as offset
  - *AR1(short(\#4)); Use 3-bit immediate value (4) as offset
- ARx modified before being compared to 0, some examples:
  - *-AR1; Decrement by 1 before comparison
  - *+AR1(\#20); Add 16-bit immediate value (20) before comparison
- ARx modified after being compared to 0, some examples:
  - *AR1+; Increment by 1 after comparison
  - *(AR1 - T1); Subtract content of T1 after comparison

1) The content of the selected auxiliary register (ARn) is premodified in the address generation unit.
2) The (premodified) content of ARn is compared to 0 and sets the condition in the address phase of the pipeline.
3) If the content of ARn is not equal to 0, a branch occurs. If the content is equal to 0, the instructions are executed in sequence.
4) The content of ARn is postmodified in the address generation unit.
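The four-step sequence above can be sketched in Python as follows. This is an illustrative model, not TI code: the function and parameter names are hypothetical, the pre- and postmodification are reduced to simple signed deltas, and the branch is taken when the register is nonzero, following the instruction title and the examples in this section. The 4-byte fall-through distance comes from the Size column of the syntax table.

```python
def bcc_arn_not_zero(arn, pc, target, pre_delta=0, post_delta=0, size=4):
    """Model of BCC L16, ARn_mod != #0: return (new_arn, new_pc).

    pre_delta / post_delta are hypothetical stand-ins for the pre- and
    postmodifiers (for example, *-AR1 is pre_delta=-1, *AR3- is post_delta=-1).
    """
    arn = (arn + pre_delta) & 0xFFFF          # 1) premodification
    taken = arn != 0                          # 2) compare to 0
    new_pc = target if taken else pc + size   # 3) branch or fall through
    arn = (arn + post_delta) & 0xFFFF         # 4) postmodification
    return arn, new_pc


# Operands of Example 2 in this section: AR3 = 0000h, postdecrement, not taken.
new_ar3, new_pc = bcc_arn_not_zero(0x0000, 0x00400F, 0x004015, post_delta=-1)
print(f"{new_ar3:04X} {new_pc:06X}")
```

Because the postmodification happens after the comparison, a register that is 0 on entry still wraps to FFFFh in the model even though the branch is not taken.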

## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1:

- The premodifier *ARn(T0) is not available; *ARn(AR0) is available.
- The postmodifiers *(ARn + T0) and *(ARn - T0) are not available; *(ARn + AR0) and *(ARn - AR0) are available.

The legality of the modifier usage is checked by the assembler when using the .c54cm_on and .c54cm_off assembler directives.

| Status Bits | Affected by | C54CM |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction cannot be repeated.
See Also See the following other related instructions:

- B (Branch Unconditionally)
- BCC (Branch Conditionally)
- BCC (Compare and Branch)


## Example 1

| Syntax | Description |
| :--- | :--- |
| BCC branch, *AR1(\#6) != \#0 | The content of AR1 is compared to 0. The content is not 0, program control <br> is passed to the program address label defined by branch. |


|  | BCC branch, *AR1(\#6) != \#0 | address: | 004004 |
| :---: | :---: | :---: | :---: |
|  | ... | ; | 00400A |
| branch |  | ; | 00400C |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AR1 | 0005 | AR1 | 0005 |
| PC | 004004 | PC | 00400C |

## Example 2

| Syntax | Description |
| :--- | :--- |
| BCC branch, *AR3- != \#0 | The content of AR3 is compared to 0. The content is 0, program control is <br> passed to the next instruction (the branch is not taken). AR3 is decremented by <br> 1 after the comparison. |


|  | BCC branch, *AR3- != \#0 | address: | 00400F |
| :---: | :---: | :---: | :---: |
|  | $\ldots$ | ; | 004013 |
| branch | ... ... | ; | 004015 |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AR3 | 0000 | AR3 | FFFF |
| PC | 00400F | PC | 004013 |

## BCC

Compare and Branch

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles $^{\dagger}$ | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | BCC[U] L8, src RELOP K8 | No | 4 | $7 / 6$ | $X$ |

$\dagger \mathrm{x} / \mathrm{y}$ cycles: x cycles $=$ condition true, y cycles $=$ condition false

## Opcode

0110 1111 | FSSS ccxu | KKKK KKKK | LLLL LLLL

## Operands

K8, L8, RELOP, src

Description This instruction performs a comparison operation between a source (src) register content and an 8-bit signed value, K8. The instruction performs a comparison in the D-unit ALU or in the A-unit ALU. The comparison is performed in the execute phase of the pipeline. If the result of the comparison is true, a branch occurs.

The program branch address is specified as an 8-bit signed offset, L8, relative to the program counter (PC). Use this instruction to branch within a 256-byte window centered on the current PC value.

The comparison depends on the optional U keyword and, for accumulator comparisons, on M40.

- In the case of an unsigned comparison, the 8-bit constant, K8, is zero extended to:
  - 16 bits, if the source (src) operand is an auxiliary or temporary register.
  - 40 bits, if the source (src) operand is an accumulator.
- In the case of a signed comparison, the 8-bit constant, K8, is sign extended to:
  - 16 bits, if the source (src) operand is an auxiliary or temporary register.
  - 40 bits, if the source (src) operand is an accumulator.

As the following table shows, the U keyword specifies an unsigned comparison; M40 defines the comparison bit width of the accumulator.

| U | src | Comparison Type |
| :---: | :---: | :--- |
| no | TAx | 16-bit signed comparison in A-unit ALU |
| no | ACx | if M40 = 0, 32-bit signed comparison in D-unit ALU <br> if M40 = 1, 40-bit signed comparison in D-unit ALU |
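The interaction between the U keyword, the operand type, and M40 can be sketched in Python. This is a behavioral approximation with hypothetical names (`compare_k8`, `as_signed`); the set of relational operators in `RELOPS` is an assumption of the sketch. Comparing on the 32-bit width after a 40-bit extension of K8 gives the same result as extending K8 directly to 32 bits, so the model extends to the comparison width.

```python
import operator

# Assumption: RELOP is one of these four relational operators.
RELOPS = {"==": operator.eq, "<": operator.lt, ">=": operator.ge, "!=": operator.ne}


def as_signed(value, width):
    """Interpret the low `width` bits of value as a two's-complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def compare_k8(src, k8, relop, src_is_acc, m40=0, unsigned=False):
    """Return True if `src RELOP K8` holds, i.e. the branch would be taken."""
    # TAx: 16 bits; ACx: 32 bits when M40 = 0, 40 bits when M40 = 1.
    width = (40 if m40 else 32) if src_is_acc else 16
    mask = (1 << width) - 1
    if unsigned:
        a, b = src & mask, k8 & 0xFF                     # K8 zero extended
    else:
        a, b = as_signed(src, width), as_signed(k8, 8)   # K8 sign extended
    return RELOPS[relop](a, b)


# FFh compares equal to FFFFh only when both are treated as signed (-1).
print(compare_k8(0xFFFF, 0xFF, "==", src_is_acc=False))
print(compare_k8(0xFFFF, 0xFF, "==", src_is_acc=False, unsigned=True))
```

The same accumulator value can compare differently depending on M40, because the guard bits (39-32) only take part in the 40-bit comparison.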


## Example 2

| Syntax | Description |
| :--- | :--- |
| BCC branch, T1 != \#1 | The content of T1 is not equal to 1, program control is passed to the next <br> instruction (the branch is not taken). |


|  | BCC branch, T1 != \#1 |  |  |
| :---: | :---: | :---: | :---: |
|  | $\ldots$ | address: | 00407D |
| branch |  |  | 004080 |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| T1 | 0000 | T1 | 0000 |
| PC | 4079 | PC | 407D |

## BCLR

Clear Accumulator, Auxiliary, or Temporary Register Bit

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | BCLR Baddr, src | No | 3 | 1 | $X$ |

## Opcode

## Operands

Baddr, src

Description This instruction performs a bit manipulation:

- In the D-unit ALU, if the source (src) register operand is an accumulator.
- In the A-unit ALU, if the source (src) register operand is an auxiliary or temporary register.

The instruction clears to 0 a single bit, as defined by the bit addressing mode, Baddr, of the source register.

The generated bit address must be within:

- 0-39 when accessing accumulator bits (only the 6 LSBs of the generated bit address are used to determine the bit position). If the generated bit address is not within 0-39, the selected register bit value does not change.
- 0-15 when accessing auxiliary or temporary register bits (only the 4 LSBs of the generated address are used to determine the bit position).
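The bit-address rules above can be illustrated with a short Python model. The function name `bclr_reg` and the `src_is_acc` flag are hypothetical; the sketch only encodes the masking and range behavior stated in the two bullets.

```python
ACC_MASK = (1 << 40) - 1


def bclr_reg(src, baddr, src_is_acc):
    """Model of BCLR Baddr, src: return the register value with one bit cleared."""
    if src_is_acc:
        bit = baddr & 0x3F                 # only the 6 LSBs select the bit
        if bit > 39:
            return src & ACC_MASK          # outside 0-39: register unchanged
        return src & ACC_MASK & ~(1 << bit)
    bit = baddr & 0xF                      # only the 4 LSBs select the bit
    return src & 0xFFFF & ~(1 << bit)


print(f"{bclr_reg(0xFFFFFFFFFF, 39, True):010X}")   # clears the top guard bit
print(f"{bclr_reg(0xFFFFFFFFFF, 40, True):010X}")   # out of range, unchanged
```

For auxiliary and temporary registers every 4-bit value is a valid position, so the model has no out-of-range case on that path.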

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

See Also See the following other related instructions:

- BCLR (Clear Memory Bit)
- BCLR (Clear Status Register Bit)

## BCLR

Clear Memory Bit

## Syntax Characteristics



## BCLR

Clear Status Register Bit

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | BCLR k4, ST0_55 | Yes | 2 | 1 | X |
| [2] | BCLR k4, ST1_55 | Yes | 2 | 1 | X |
| [3] | BCLR k4, ST2_55 | Yes | 2 | 1 | X |
| [4] | BCLR k4, ST3_55 | Yes | 2 | 1† | X |
| [5] | BCLR f-name | Yes | 2 | 1† | X |

† When this instruction is decoded to modify status bit CAFRZ (15), CAEN (14), or CACLR (13), the CPU pipeline is flushed and the instruction is executed in 5 cycles regardless of the instruction context.

| Opcode | ST0 | 0100 | 011E | kkkk | 0000 |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  | ST1 | 0100 | 011E | kkkk | 0010 |
|  | ST2 | 0100 | 011E | kkkk | 0100 |
|  | ST3 | 0100 | 011E | kkkk | 0110 |

## Operands

f-name, k4, STx_55

Description These instructions perform a bit manipulation in the A-unit ALU.

These instructions clear to 0 a single bit, as defined by a 4-bit immediate value, k4, or the one-bit-wide status bit field name, f-name, in the selected status register (ST0_55, ST1_55, ST2_55, or ST3_55).

The BCLR k4, ST0_55 instruction cannot access the DP bit field mapped in ST0_55; therefore, k4 cannot have a value of 0-8.

The BCLR k4, ST1_55 instruction cannot access the ASM bit field in ST1_55; therefore, k4 cannot have a value of 0-4.
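The k4 restrictions can be captured in a few lines of Python. This is a sketch under the assumption that a forbidden k4 is simply rejected; how the assembler or CPU actually reacts is not specified here. The names `FORBIDDEN_K4` and `bclr_status` are hypothetical.

```python
# k4 values excluded by the text: bits 0-8 of ST0_55 (DP) and bits 0-4 of ST1_55 (ASM).
FORBIDDEN_K4 = {"ST0_55": range(0, 9), "ST1_55": range(0, 5)}


def bclr_status(reg_name, value, k4):
    """Model of BCLR k4, STx_55: return the status register value with bit k4 cleared."""
    if not 0 <= k4 <= 15:
        raise ValueError("k4 must be a 4-bit immediate value")
    if k4 in FORBIDDEN_K4.get(reg_name, ()):
        raise ValueError(f"bit {k4} of {reg_name} cannot be accessed with BCLR k4")
    return value & 0xFFFF & ~(1 << k4)


# Clearing AR2LC (bit 2) in ST2_55 = 0006h.
print(f"{bclr_status('ST2_55', 0x0006, 2):04X}")
```

Raising an exception is a modeling choice that makes an invalid operand visible; it is not a description of hardware behavior.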

## Compatibility with C54x devices (C54CM = 1)

C55x DSP status registers bit mapping (Figure 5-1, page 5-110) does not correspond to C54x DSP status registers bits.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | Selected status bits |

Repeat This instruction cannot be repeated.

See Also See the following other related instructions:

- BCLR (Clear Accumulator, Auxiliary, or Temporary Register Bit)
- BCLR (Clear Memory Bit)
- BSET (Set Status Register Bit)


## Example 1

| Syntax | Description |
| :--- | :--- |
| BCLR AR2LC, ST2_55 | The ST2_55 bit position defined by the label (AR2LC, bit 2) is cleared to 0. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| ST2_55 | 0006 | ST2_55 | 0002 |

## Example 2

| Syntax | Description |
| :--- | :--- |
| BCLR AR2LC | The ST2_55 AR2LC (bit 2) is cleared to 0. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| ST2_55 | 0006 | ST2_55 | 0002 |

Figure 5-1. Status Registers Bit Mapping

ST0_55

| 15 | 14 | 13 | 12 | 11 | 10 | 9 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ACOV2† | ACOV3† | TC1† | TC2 | CARRY | ACOV0 | ACOV1 |
| R/W-0 | R/W-0 | R/W-1 | R/W-1 | R/W-1 | R/W-0 | R/W-0 |

ST1_55

| 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| BRAF | CPL | XF | HM | INTM | M40† | SATD | SXMD |
| R/W-0 | R/W-0 | R/W-1 | R/W-0 | R/W-1 | R/W-0 | R/W-0 | R/W-1 |

| 7 | 6 | 5 | 4-0 |
| :---: | :---: | :---: | :---: |
| C16 | FRCT | C54CM† | ASM |
| R/W-0 | R/W-0 | R/W-1 | R/W-0 |

ST2_55

| 15 | 14-13 | 12 | 11 | 10 | 9 | 8 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ARMS | Reserved | DBGM | EALLOW | RDM | Reserved | CDPLC |
| R/W-0 |  | R/W-1 | R/W-0 | R/W-0 |  | R/W-0 |

| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AR7LC | AR6LC | AR5LC | AR4LC | AR3LC | AR2LC | AR1LC | AR0LC |
| R/W-0 | R/W-0 | R/W-0 | R/W-0 | R/W-0 | R/W-0 | R/W-0 | R/W-0 |

ST3_55

| 15 | 14 | 13 | 12 | 11-8 |
| :---: | :---: | :---: | :---: | :---: |
| CAFRZ† | CAEN† | CACLR† | HINT‡ | Reserved (always write 1100b) |
| R/W-0 | R/W-0 | R/W-0 | R/W-1 |  |

| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| CBERR ${ }^{\dagger}$ | MPNMC§ | SATA ${ }^{\dagger}$ | Reserved |  | CLKOFF | SMUL | SST |
| R/W-0 | R/W-pins | R/W-0 |  |  | R/W-0 | R/W-0 | R/W-0 |

Legend: R = Read; W = Write; -n = Value after reset

† Highlighted bit: If you write to the protected address of the status register, a write to this bit has no effect, and the bit always appears as a 0 during read operations.

‡ The HINT bit is not used for all C55x host port interfaces (HPIs). Consult the documentation for the specific C55x DSP.

§ The reset value of MPNMC may be dependent on the state of predefined pins at reset. To check this for a particular C55x DSP, see the boot loader section of its data sheet.

## BCNT

## Count Accumulator Bits

## Syntax Characteristics



## Operands

ACx, ACy, Tx, TCx

Description This instruction performs bit field manipulation in the D-unit shifter. The result is stored in the selected temporary register (Tx). The A-unit ALU is used to make the move operation.

Accumulator ACx is ANDed with accumulator ACy. The number of bits set to 1 in the intermediary result is evaluated and stored in the selected temporary register (Tx). If the number of bits is even, the selected TCx status bit is cleared to 0. If the number of bits is odd, the selected TCx status bit is set to 1.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | TCx |

Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| BCNT AC1, AC2, TC1, T1 | The content of AC1 is ANDed with the content of AC2. The number of bits <br> set to 1 in the result is evaluated and stored in T1. Because the number of bits set <br> to 1 is odd, TC1 is set to 1. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC1 | 7E 2355 4FC0 | AC1 | 7E 2355 4FC0 |
| AC2 | 0F E340 5678 | AC2 | 0F E340 5678 |
| T1 | 0000 | T1 | 000B |
| TC1 | 0 | TC1 | 1 |
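
The counting and parity behavior described for BCNT can be sketched as a small Python model. This is an illustrative sketch, not TI code: the function name `bcnt` and the 40-bit accumulator mask are assumptions made here.

```python
ACC_MASK = (1 << 40) - 1  # assumption: accumulators are modeled as 40-bit values


def bcnt(acx: int, acy: int) -> tuple[int, int]:
    """Return (Tx, TCx) for a modeled BCNT ACx, ACy, TCx, Tx."""
    masked = (acx & acy) & ACC_MASK   # intermediary result: ACx AND ACy
    count = bin(masked).count("1")    # number of bits set to 1
    return count, count & 1           # TCx is 1 when the count is odd


t1, tc1 = bcnt(0x7E23554FC0, 0x0FE3405678)
print(f"T1={t1:04X} TC1={tc1}")  # values taken from the example table
```

With the register values from the example, the model gives a count of 11 (000Bh) and an odd parity flag, which matches the After column.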

## BFXPA

Expand Accumulator Bit Field

## Syntax Characteristics



## BFXTR

## Extract Accumulator Bit Field

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | BFXTR k16, ACx, dst | No | 4 | 1 | X |

Opcode

## Operands

ACx, dst, k16

## Description

This instruction performs a bit field manipulation in the D-unit shifter. When the destination register (dst) is an A-unit register (ARx or Tx), a dedicated bus carries the output of the D-unit shifter directly into dst.

The 16-bit field mask, k16, is scanned from the least significant bit (LSB) to the most significant bit (MSB). For each mask bit set to 1, the corresponding bit among the 16 LSBs of the source accumulator (ACx) is extracted and packed toward the LSBs. The result is stored in dst.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.
See Also See the following other related instructions:

- BFXPA (Expand Accumulator Bit Field)

Example

| Syntax | Description |
| :--- | :--- |
| BFXTR \#8024h, AC0, T2 | Each bit of the unsigned 16-bit value (8024h) is scanned from the LSB to the MSB to test for a 1. If the bit is set to 1, the corresponding bit in AC0 is extracted and packed toward the LSB in T2; otherwise, the corresponding bit in AC0 is not extracted. The result is stored in T2. |

| Execution |  |  |  |  |
| :--- | :---: | :---: | :---: | :---: |
| \#k16 (8024h) | 1000 | 0000 | 0010 | 0100 |
| AC0 (15-0) | 0101 | 0101 | 1010 | 1010 |
| T2 | 0000 | 0000 | 0000 | 0010 |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC0 | 00 2300 55AA | AC0 | 00 2300 55AA |
| T2 | 0000 | T2 | 0002 |
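
The mask-driven extraction can be sketched in Python as follows. This is a minimal model under the description above; the function name `bfxtr` is hypothetical and the model ignores the hardware data path.

```python
def bfxtr(k16: int, acx: int) -> int:
    """Model BFXTR k16, ACx, dst: gather the ACx bits selected by the mask."""
    result = 0
    out_pos = 0
    for bit in range(16):                     # scan the mask from LSB to MSB
        if (k16 >> bit) & 1:                  # mask bit set: extract this bit
            result |= ((acx >> bit) & 1) << out_pos
            out_pos += 1                      # pack toward the LSBs
    return result


print(f"T2={bfxtr(0x8024, 0x00230055AA):04X}")  # values from the example
```

For the example mask 8024h, bits 2, 5, and 15 of AC0 are selected; only bit 5 is set in 55AAh, so the packed result is 0002h.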

## BNOT <br> Complement Accumulator, Auxiliary, or Temporary Register Bit

## Syntax Characteristics



## Operands

Baddr, src

## Description

This instruction performs a bit manipulation:

- In the D-unit ALU, if the source (src) register operand is an accumulator.
- In the A-unit ALU, if the source (src) register operand is an auxiliary or temporary register.

The instruction complements a single bit, as defined by the bit addressing mode, Baddr, of the source register.

The generated bit address must be within:

- 0-39 when accessing accumulator bits (only the 6 LSBs of the generated bit address are used to determine the bit position). If the generated bit address is not within 0-39, the selected register bit value does not change.
- 0-15 when accessing auxiliary or temporary register bits (only the 4 LSBs of the generated address are used to determine the bit position).

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.
See Also See the following other related instructions:

- BNOT (Complement Memory Bit)
- NOT (Complement Accumulator, Auxiliary, or Temporary Register Content)


## Example

| Syntax | Description |
| :--- | :--- |
| BNOT AR1, T0 | The bit at the position defined by the content of AR1(3-0) in T0 is complemented. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| T0 | E000 | T0 | F000 |
| AR1 | 000C | AR1 | 000C |
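
The bit-address rules for the register form of BNOT can be sketched in Python. This is an illustrative model only; the function name `bnot_reg` and the `is_accumulator` flag are assumptions introduced here.

```python
def bnot_reg(value: int, bit_addr: int, is_accumulator: bool) -> int:
    """Complement one bit of a register, following the stated address rules."""
    if is_accumulator:
        pos = bit_addr & 0x3F        # only the 6 LSBs select the position
        if pos > 39:                 # outside 0-39: register is unchanged
            return value
    else:
        pos = bit_addr & 0x0F        # only the 4 LSBs select the position
    return value ^ (1 << pos)        # complement the selected bit


print(f"T0={bnot_reg(0xE000, 0x000C, is_accumulator=False):04X}")
```

With T0 = E000h and AR1 = 000Ch, bit 12 is complemented, which reproduces the F000h result in the example.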

## BNOT <br> Complement Memory Bit

## Syntax Characteristics

| No. Syntax |  | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| BNOT src, Smem |  | No | 3 | 1 | X |
| Opcode |  | 0011 \| | A | I FSSS | 111x |
| Operands | Smem, src |  |  |  |  |
| Description | This instruction performs a bit manipulation in the A-unit ALU. The instruction complements a single bit, as defined by the content of the source (src) operand, of a memory (Smem) location. |  |  |  |  |
| Status Bits | Affected by <br> Affects |  |  |  |  |
| Repeat | This instructior |  |  |  |  |
| See Also | See the follow BCLR BNOT ( BSET NOT (C | ions: <br> Auxiliary, <br> uxiliary, or | Tem <br> mpora | rary Regi <br> Register | ister Bit) <br> Content) |
| Example |  |  |  |  |  |
| Syntax | Description |  |  |  |  |
| BNOT AC0, *AR3 | The bit at the position defined by AC0(3-0) in the content addressed by AR3 is complemented. |  |  |  |  |

## BSET

Set Accumulator, Auxiliary, or Temporary Register Bit

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | BSET Baddr, src | No | 3 | 1 | X |

## Opcode

## Operands

Baddr, src

## Description

This instruction performs a bit manipulation:

- In the D-unit ALU, if the source (src) register operand is an accumulator.
- In the A-unit ALU, if the source (src) register operand is an auxiliary or temporary register.

The instruction sets to 1 a single bit, as defined by the bit addressing mode, Baddr, of the source register.

The generated bit address must be within:

- 0-39 when accessing accumulator bits (only the 6 LSBs of the generated bit address are used to determine the bit position). If the generated bit address is not within $0-39$, the selected register bit value does not change.
- 0-15 when accessing auxiliary or temporary register bits (only the 4 LSBs of the generated address are used to determine the bit position).

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.
See Also See the following other related instructions:

- BCLR (Clear Accumulator, Auxiliary, or Temporary Register Bit)
- BNOT (Complement Accumulator, Auxiliary, or Temporary Register Bit)
- BSET (Set Memory Bit)
- BSET (Set Status Register Bit)


## Example

| Syntax | Description |
| :--- | :--- |
| BSET AR3, AC0 | The bit at the position defined by the content of AR3(4-0) in AC0 is set to 1. |

## BSET

Set Memory Bit

## Syntax Characteristics

| No. Syntax |  | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| BSET src, Smem |  | No | 3 | 1 | X |
| Opcode |  | 0011 \| | A A | AI $\mid$ FSSS | S 1100 |
| Operands | Smem, src |  |  |  |  |
| Description | This instruction performs a bit manipulation in the A-unit ALU. The instruction sets to 1 a single bit, as defined by the content of the source (src) operand, of a memory (Smem) location. |  |  |  |  |
| Status Bits | Affected by <br> Affects |  |  |  |  |
| Repeat | This instruc |  |  |  |  |
| See Also | See the follow BCLR BNOT BSET BSET | tions: <br> or Tempor | ry Re | ister Bit) |  |
| Example |  |  |  |  |  |
| Syntax | Description |  |  |  |  |
| BSET AC0, *AR3 | The bit at the position defined by AC0(3-0) in the content addressed by AR3 is set to 1. |  |  |  |  |

## BSET <br> Set Status Register Bit

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | BSET k4, ST0_55 | Yes | 2 | 1 | X |
| [2] | BSET k4, ST1_55 | Yes | 2 | 1 | X |
| [3] | BSET k4, ST2_55 | Yes | 2 | 1 | X |
| [4] | BSET k4, ST3_55 | Yes | 2 | 1$^{\dagger}$ | X |
| [5] | BSET f-name | Yes | 2 | 1$^{\dagger}$ | X |

$\dagger$ When this instruction is decoded to modify status bit CAFRZ (15), CAEN (14), or CACLR (13), the CPU pipeline is flushed and the instruction is executed in 5 cycles regardless of the instruction context.

| Opcode | ST0 | 0100 | 011E | kkkk | 0001 |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  | ST1 | 0100 | 011E | kkkk | 0011 |
|  | ST2 | 0100 | 011E | kkkk | 0101 |
|  | ST3 | 0100 | 011E | kkkk | 0111 |

## Operands

Description These instructions perform a bit manipulation in the A-unit ALU.
These instructions set to 1 a single bit, as defined by a 4-bit immediate value, k 4 , or the one-bit-wide status bit field name, f-name, in the selected status register (ST0_55, ST1_55, ST2_55, or ST3_55).

The BSET k4, ST0_55 instruction cannot be used to access the DP field mapped in ST0_55. Therefore, k4 cannot have a value of 0-8.

The BSET k4, ST1_55 instruction cannot be used to access the ASM bit field in ST1_55. Therefore, k4 cannot have a value of 0-4.

Compatibility with C54x devices (C54CM = 1)
The C55x DSP status register bit mapping (Figure 5-2) does not correspond to the C54x DSP status register bits.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | Selected status bits |

Repeat This instruction cannot be repeated.
See Also See the following other related instructions:

- BCLR (Clear Status Register Bit)
- BSET (Set Accumulator, Auxiliary, or Temporary Register Bit)
- BSET (Set Memory Bit)

## Example 1

| Syntax | Description |
| :--- | :--- |
| BSET CARRY, ST0_55 | The ST0_55 bit position defined by the label (CARRY, bit 11) is set to 1. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| ST0_55 | 0000 | ST0_55 | 0800 |

## Example 2

| Syntax | Description |
| :--- | :--- |
| BSET CARRY | The ST0_55 CARRY (bit 11) is set to 1. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| ST0_55 | 0000 | ST0_55 | 0800 |
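
The k4 restrictions stated in the description can be sketched as a Python check. This is a hedged model, not an assembler rule set: the function name `bset_status`, the `FORBIDDEN_K4` table, and the use of `ValueError` are assumptions made for illustration.

```python
# k4 values that fall inside multi-bit fields (DP in ST0_55, ASM in ST1_55)
FORBIDDEN_K4 = {"ST0_55": range(0, 9), "ST1_55": range(0, 5)}


def bset_status(reg_name: str, value: int, k4: int) -> int:
    """Model BSET k4, STx_55: set one status bit, rejecting disallowed k4."""
    if not 0 <= k4 <= 15:
        raise ValueError("k4 is a 4-bit immediate (0-15)")
    if k4 in FORBIDDEN_K4.get(reg_name, ()):
        raise ValueError(f"k4={k4} is not allowed for {reg_name}")
    return (value | (1 << k4)) & 0xFFFF


print(f"ST0_55={bset_status('ST0_55', 0x0000, 11):04X}")  # CARRY is bit 11
```

Setting bit 11 (CARRY) of a cleared ST0_55 yields 0800h in this model.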

Figure 5-2. Status Registers Bit Mapping
ST0_55

| 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8-0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ACOV2 $^{\dagger}$ | ACOV3 $^{\dagger}$ | TC1 $^{\dagger}$ | TC2 | CARRY | ACOV0 | ACOV1 | DP |
| R/W-0 | R/W-0 | R/W-1 | R/W-1 | R/W-1 | R/W-0 | R/W-0 | R/W-0 |



ST1_55

| 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| BRAF | CPL | XF | HM | INTM | M40 $^{\dagger}$ | SATD | SXMD |
| R/W-0 | R/W-0 | R/W-1 | R/W-0 | R/W-1 | R/W-0 | R/W-0 | R/W-1 |

| 7 | 6 | 5 | 4-0 |
| :---: | :---: | :---: | :---: |
| C16 | FRCT | C54CM $^{\dagger}$ | ASM |
| R/W-0 | R/W-0 | R/W-1 | R/W-0 |

ST2_55

| 15 | 14-13 | 12 | 11 | 10 | 9 | 8 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ARMS | Reserved | DBGM | EALLOW | RDM | Reserved | CDPLC |
| R/W-0 |  | R/W-1 | R/W-0 | R/W-0 |  | R/W-0 |

| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AR7LC | AR6LC | AR5LC | AR4LC | AR3LC | AR2LC | AR1LC | AR0LC |
| R/W-0 | R/W-0 | R/W-0 | R/W-0 | R/W-0 | R/W-0 | R/W-0 | R/W-0 |

## ST3_55

| 15 | 14 | 13 | 12 | 11-8 |
| :---: | :---: | :---: | :---: | :---: |
| CAFRZ $^{\dagger}$ | CAEN $^{\dagger}$ | CACLR $^{\dagger}$ | HINT $^{\ddagger}$ | Reserved (always write 1100b) |
| R/W-0 | R/W-0 | R/W-0 | R/W-1 |  |


| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| CBERR ${ }^{\dagger}$ | MPNMC§ | SATA ${ }^{\dagger}$ | Reserved |  | CLKOFF | SMUL | SST |
| R/W-0 | R/W-pins | R/W-0 |  |  | R/W-0 | R/W-0 | R/W-0 |

Legend: $\mathrm{R}=$ Read; $\mathrm{W}=$ Write; $-n=$ Value after reset
$\dagger$ Highlighted bit: If you write to the protected address of the status register, a write to this bit has no effect, and the bit always appears as a 0 during read operations.
$\ddagger$ The HINT bit is not used for all C55x host port interfaces (HPIs). Consult the documentation for the specific C55x DSP.
§ The reset value of MPNMC may be dependent on the state of predefined pins at reset. To check this for a particular C55x DSP, see the boot loader section of its data sheet.

## Syntax Characteristics



## Example

| Syntax | Description |
| :--- | :--- |
| BTST @\#12, T0, TC1 | The bit at the position defined by the register bit address (12) in T0 is tested and <br> the tested bit is copied into TC1. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| T0 | FE00 | T0 | FE00 |
| TC1 | 0 | TC1 | 1 |

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | BTST src, Smem, TCx | No | 3 | 1 | X |
| [2] | BTST k4, Smem, TCx | No | 3 | 1 | X |

Description These instructions perform a bit manipulation in the A-unit ALU. These instructions test a single bit of a memory (Smem) location. The bit tested is defined by either the content of the source (src) operand or a 4-bit immediate value, k4. The tested bit is copied into the selected TCx status bit.

For instruction [1], the generated bit address must be within 0-15 (only the 4 LSBs of the register are used to determine the bit position).

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | TCx |

See Also See the following other related instructions:

- BCLR (Clear Memory Bit)
- BNOT (Complement Memory Bit)
- BSET (Set Memory Bit)
- BTST (Test Accumulator, Auxiliary, or Temporary Register Bit)
- BTSTCLR (Test and Clear Memory Bit)
- BTSTNOT (Test and Complement Memory Bit)
- BTSTP (Test Accumulator, Auxiliary, or Temporary Register Bit Pair)
- BTSTSET (Test and Set Memory Bit)

## Test Memory Bit

## Syntax Characteristics



## BTSTCLR Test and Clear Memory Bit

## Syntax Characteristics



## BTSTNOT <br> Test and Complement Memory Bit

## Syntax Characteristics



Example

| Syntax | Description |
| :--- | :--- |
| BTSTNOT \#12, *AR0, TC1 | The bit at the position defined by the unsigned 4-bit value (12) in the <br> content addressed by AR0 is tested and the tested bit is copied into TC1. <br> The selected bit (12) in the content addressed by AR0 is complemented. |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| *AR0 | 0040 | *AR0 | 1040 |
| TC1 |  | TC1 | 0 |
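
The test-then-complement sequence in this example can be sketched in Python. This is a minimal model of the described behavior; the function name `btstnot` is hypothetical, and memory is reduced to a single 16-bit word.

```python
def btstnot(mem_word: int, k4: int) -> tuple[int, int]:
    """Model BTSTNOT k4, Smem, TCx: return (TCx, updated memory word)."""
    tested = (mem_word >> k4) & 1                 # copy the tested bit to TCx
    updated = (mem_word ^ (1 << k4)) & 0xFFFF     # then complement that bit
    return tested, updated


tc1, word = btstnot(0x0040, 12)
print(f"TC1={tc1} *AR0={word:04X}")
```

Starting from 0040h, bit 12 is 0, so the model copies 0 into TC1 and leaves 1040h in memory.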

## BTSTP

Test Accumulator, Auxiliary, or Temporary Register Bit Pair

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | BTSTP Baddr, src | No | 3 | 1 | X |

## Opcode

1110 1100 AAAA AAAI FSSS 010x

## Operands

Baddr, src

Description This instruction performs a bit manipulation:

- In the D-unit ALU, if the source (src) register operand is an accumulator.
- In the A-unit ALU, if the source (src) register operand is an auxiliary or temporary register.

The instruction tests two consecutive bits of the source register location as defined by the bit addressing mode, Baddr and Baddr + 1. The tested bits are copied into status bits TC1 and TC2:

- TC1 receives the bit defined by Baddr.
- TC2 receives the bit defined by Baddr + 1.

The generated bit address must be within:

- 0-38 when accessing accumulator bits (only the 6 LSBs of the generated bit address are used to determine the bit position). If the generated bit address is not within 0-38:
  - If the generated bit address is 39, bit 39 of the register is stored into TC1 and 0 is stored into TC2.
  - In all other cases, 0 is stored into TC1 and TC2.
- 0-14 when accessing auxiliary or temporary register bits (only the 4 LSBs of the generated address are used to determine the bit position). If the generated bit address is not within 0-14:
  - If the generated bit address is 15, bit 15 of the register is stored into TC1 and 0 is stored into TC2.
  - In all other cases, 0 is stored into TC1 and TC2.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | TC1, TC2 |

Repeat This instruction can be repeated.
See Also See the following other related instructions:

- BCLR (Clear Accumulator, Auxiliary, or Temporary Register Bit)
- BNOT (Complement Accumulator, Auxiliary, or Temporary Register Bit)
- BSET (Set Accumulator, Auxiliary, or Temporary Register Bit)
- BTST (Test Accumulator, Auxiliary, or Temporary Register Bit)
- BTST (Test Memory Bit)


## Example

| Syntax | Description |
| :--- | :--- |
| BTSTP AR1(T0), AC0 | The bit at the position defined by the content of AR1(T0) in AC0 is tested and the <br> tested bit is copied into TC1. The bit at the position defined by the content of <br> AR1(T0) + 1 in AC0 is tested and the tested bit is copied into TC2. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC0 | E0 1234 0000 | AC0 | E0 1234 0000 |
| AR1 | 0026 | AR1 | 0026 |
| T0 | 0001 | T0 | 0001 |
| TC1 | 0 | TC1 | 1 |
| TC2 | 0 | TC2 | 0 |
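
The boundary rules for the bit pair can be sketched in Python. This is an illustrative model only: the function name `btstp` and the `is_accumulator` flag are assumptions, and the bit address is taken as AR1 + T0 for the AR1(T0) operand in the example.

```python
def btstp(value: int, bit_addr: int, is_accumulator: bool) -> tuple[int, int]:
    """Model BTSTP Baddr, src: return (TC1, TC2) per the stated range rules."""
    if is_accumulator:
        pos, top = bit_addr & 0x3F, 39    # 6 LSBs used, highest bit is 39
    else:
        pos, top = bit_addr & 0x0F, 15    # 4 LSBs used, highest bit is 15
    if pos < top:                         # both Baddr and Baddr + 1 exist
        return (value >> pos) & 1, (value >> (pos + 1)) & 1
    if pos == top:                        # only the top bit exists
        return (value >> top) & 1, 0
    return 0, 0                           # all other cases


print(btstp(0xE012340000, 0x0026 + 0x0001, is_accumulator=True))
```

In the example, the generated address is 39, so TC1 receives bit 39 of AC0 and TC2 is forced to 0.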

## BTSTSET

Test and Set Memory Bit

## Syntax Characteristics



## CALL

## Call Unconditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | CALL ACx | No | 2 | 10 | X |
| [2] | CALL L16 | Yes | 3 | 6 | AD |
| [3] | CALL P24 | No | 4 | 5 | D |

## Description

This instruction passes control to a specified subroutine program address defined by the content of the 24 lowest bits of the accumulator, ACx, or a program address label assembled into L16 or P24.

Before beginning a called subroutine, the CPU automatically saves the value of two internal registers: the program counter (PC) and a loop context register. The CPU can use these values to re-establish the context of the interrupted program sequence when the subroutine is done.

In the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks (in memory). When the CPU returns from a subroutine, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are saved to registers, so that these values can always be restored quickly. These special registers are the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions.

These instructions cannot be repeated.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

See Also See the following other related instructions:

- B (Branch Unconditionally)
- CALLCC (Call Conditionally)
- RET (Return Unconditionally)
- RETCC (Return Conditionally)

## Call Unconditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | CALL ACx | No | 2 | 10 | X |

## Opcode

1001 0010 xxxx xxSS

## Operands

ACx

Description
This instruction passes control to a specified subroutine program address defined by the content of the 24 lowest bits of the accumulator, $A C x$.

In the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

- The data stack pointer (SP) is decremented by 1 word in the address phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the address phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the subroutine program address. The active control flow execution context flags are cleared.

|  | System Stack (SSP) |  | Data Stack (SP) |
| :---: | :---: | :---: | :---: |
| After Save $\rightarrow$ SSP = x-1 | (Loop bits):PC(23-16) | After Save $\rightarrow$ SP = y-1 | PC(15-0) |
| Before Save $\rightarrow$ SSP = x | Previously saved data | Before Save $\rightarrow$ SP = y | Previously saved data |
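
The slow-return save sequence listed above can be sketched in Python. This is a simplified model under stated assumptions: memory is a dictionary of 16-bit words, the stack pointers are treated as 16-bit values, and the loop context bits are assumed to occupy the upper 8 bits of the system stack word. The function name `call_slow_return` is hypothetical.

```python
def call_slow_return(mem: dict, sp: int, ssp: int,
                     return_addr: int, loop_bits: int, target: int):
    """Model the slow-return save: returns the new (SP, SSP, PC)."""
    sp = (sp - 1) & 0xFFFF                    # SP decremented by 1 word
    mem[sp] = return_addr & 0xFFFF            # push PC(15-0) on the data stack
    ssp = (ssp - 1) & 0xFFFF                  # SSP decremented by 1 word
    # push (Loop bits):PC(23-16); the 8-bit loop field width is an assumption
    mem[ssp] = ((loop_bits & 0xFF) << 8) | ((return_addr >> 16) & 0xFF)
    return sp, ssp, target & 0xFFFFFF         # PC loaded with a 24-bit address


memory = {}
sp, ssp, pc = call_slow_return(memory, 0x1000, 0x2000, 0x012345, 0x00, 0x00ABCD)
print(f"SP={sp:04X} SSP={ssp:04X} PC={pc:06X}")
```

After the call in this sketch, the data stack holds the low 16 bits of the return address at the new SP, and the system stack holds the upper 8 bits together with the loop context at the new SSP.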


| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction cannot be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| CALL AC0 | Program control is passed to the program address defined by the content of AC0(23-0). |

## Call Unconditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | CALL L16 | Yes | 3 | 6 | AD |

## Opcode

0000 100E LLLL LLLL LLLL LLLL

## Operands

L16

## Description

This instruction passes control to a specified subroutine program address defined by a program address label assembled into L16.

In the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

- The data stack pointer (SP) is decremented by 1 word in the address phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the address phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the subroutine program address. The active control flow execution context flags are cleared.

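|  | System Stack (SSP) |  | Data Stack (SP) |
| :---: | :---: | :---: | :---: |
| After Save $\rightarrow$ SSP = x-1 | (Loop bits):PC(23-16) | After Save $\rightarrow$ SP = y-1 | PC(15-0) |
| Before Save $\rightarrow$ SSP = x | Previously saved data | Before Save $\rightarrow$ SP = y | Previously saved data |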



## Call Unconditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [3] | CALL P24 | No | 4 | 5 | D |

## Opcode

0110 1100 PPPP PPPP PPPP PPPP PPPP PPPP

## Operands

P24

## Description

This instruction passes control to a specified subroutine program address defined by a program address label assembled into P24.

In the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

- The data stack pointer (SP) is decremented by 1 word in the address phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the address phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the subroutine program address. The active control flow execution context flags are cleared.

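|  | System Stack (SSP) |  | Data Stack (SP) |
| :---: | :---: | :---: | :---: |
| After Save $\rightarrow$ SSP = x-1 | (Loop bits):PC(23-16) | After Save $\rightarrow$ SP = y-1 | PC(15-0) |
| Before Save $\rightarrow$ SSP = x | Previously saved data | Before Save $\rightarrow$ SP = y | Previously saved data |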


| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction cannot be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| CALL FOO | Program control is passed to the program address label (FOO) assembled into an absolute <br> address defined by the 24-bit value. |

## CALLCC

## Call Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles $^{\dagger}$ | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | CALLCC L16, cond | No | 4 | 6/5 | R |
| [2] | CALLCC P24, cond | No | 5 | 5/5 | R |

$\dagger$ x/y cycles: x cycles = condition true, y cycles = condition false

Description These instructions evaluate a single condition defined by the cond field in the read phase of the pipeline. If the condition is true, a subroutine call occurs to the program address defined by the program address label assembled into L16 or P24. There is a 1-cycle latency on the condition setting. A single condition can be tested as determined by the cond field of the instruction. See Table 1-3 for a list of conditions.

Before beginning a called subroutine, the CPU automatically saves the value of two internal registers: the program counter (PC) and a loop context register. The CPU can use these values to re-establish the context of the interrupted program sequence when the subroutine is done.

In the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks (in memory). When the CPU returns from a subroutine, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are saved to registers, so that these values can always be restored quickly. These special registers are the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions.

The instruction selection depends on the branch offset between the current PC value and program subroutine address specified by the label.

These instructions cannot be repeated.

| Status Bits | Affected by | ACOVx, CARRY, C54CM, M40, TCx |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

See Also See the following other related instructions:

- BCC (Branch Conditionally)
- CALL (Call Unconditionally)
- RETCC (Return Conditionally)
- RET (Return Unconditionally)

## Call Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles $^{\dagger}$ | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | CALLCC L16, cond | No | 4 | 6/5 | R |

$\dagger$ x/y cycles: x cycles = condition true, y cycles = condition false

## Opcode

0110 1110 xCCC CCCC LLLL LLLL LLLL LLLL

## Operands

cond, L16

## Description

This instruction evaluates a single condition defined by the cond field in the read phase of the pipeline. If the condition is true, a subroutine call occurs to the program address defined by the program address label assembled into L16. There is a 1-cycle latency on the condition setting. A single condition can be tested as determined by the cond field of the instruction. See Table 1-3 for a list of conditions.

When a subroutine call occurs in the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

- The data stack pointer (SP) is decremented by 1 word in the read phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the read phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the subroutine program address. The active control flow execution context flags are cleared.

|  | System Stack (SSP) |  | Data Stack (SP) |
| :---: | :---: | :---: | :---: |
| After Save $\rightarrow$ SSP = x-1 | (Loop bits):PC(23-16) | After Save $\rightarrow$ SP = y-1 | PC(15-0) |
| Before Save $\rightarrow$ SSP = x | Previously saved data | Before Save $\rightarrow$ SP = y | Previously saved data |

## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, the comparison of accumulators to 0 is performed as if M40 was set to 1.

| Status Bits | Affected by | ACOVx, CARRY, C54CM, M40, TCx |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

Repeat This instruction cannot be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| CALLCC (subroutine), AC1 >= \#2000h | Because the content of AC1 is equal to or greater than 2000h, control is passed to the program address label, subroutine. The program counter (PC) is loaded with the subroutine program address. |

Call Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles $^{\dagger}$ | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | CALLCC P24, cond | No | 5 | 5/5 | R |

$\dagger$ x/y cycles: x cycles = condition true, y cycles = condition false

## Opcode

0110 1001 xCCC CCCC PPPP PPPP PPPP PPPP PPPP PPPP

## Operands

cond, P24

## Description

This instruction evaluates a single condition defined by the cond field in the read phase of the pipeline. If the condition is true, a subroutine call occurs to the program address defined by the program address label assembled into P24. There is a 1 -cycle latency on the condition setting. A single condition can be tested as determined by the cond field of the instruction. See Table 1-3 for a list of conditions.

When a subroutine call occurs in the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

- The data stack pointer (SP) is decremented by 1 word in the read phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the read phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the subroutine program address. The active control flow execution context flags are cleared.



## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, the comparison of accumulators to 0 is performed as if M40 was set to 1.

| Status Bits | Affected by | ACOVx, CARRY, C54CM, M40, TCx |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

Repeat This instruction cannot be repeated.

## CMP

Compare Memory with Immediate Value

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | CMP Smem == K16, TC1 | No | 4 | 1 | X |
| [2] | CMP Smem == K16, TC2 | No | 4 | 1 | X |


| Opcode | TC1 | 1111 | 0000 | AAAA | AAAI | KKKK | KKKK | KKKK | KKKK |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  | TC2 | 1111 | 0001 | AAAA | AAAI | KKKK | KKKK | KKKK | KKKK |

Operands K16, Smem, TCx

Description This instruction performs a comparison in the A-unit ALU. The data memory operand Smem is compared to the 16-bit signed constant, K16. If they are equal, the TCx status bit is set to 1; otherwise, it is cleared to 0.

```
if ((Smem) == K16)
    TCx = 1
else
    TCx = 0
```

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | TCx |
Repeat This instruction can be repeated.
See Also See the following other related instructions:

- CMP (Compare Accumulator, Auxiliary, or Temporary Register Content)

Example 1

| Syntax | Description |
| :--- | :--- |
| CMP *AR1 $+==\# 400 h$, TC1 | The content addressed by AR1 is compared to the signed 16-bit value <br> (400h). Because they are equal, TC1 is set to 1. AR1 is incremented by 1. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AR1 | 0285 | AR1 | 0286 |
| 0285 | 0400 | 0285 | 0400 |
| TC1 | 0 | TC1 | 1 |
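
The behavior in Example 1 can be sketched in a few lines of Python. This is only an illustrative model, not TI code: the names `to_signed16` and `cmp_mem_imm` are hypothetical, memory is represented as a plain dictionary, and the `*AR1+` post-increment is assumed to wrap at 16 bits.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def cmp_mem_imm(memory, ar1, k16):
    """Model of CMP *AR1+ == #k16, TCx. Returns (tcx, new_ar1)."""
    tcx = 1 if to_signed16(memory[ar1]) == to_signed16(k16) else 0
    # *AR1+ post-increments AR1 by 1 after the access (16-bit wrap assumed).
    return tcx, (ar1 + 1) & 0xFFFF


# Values taken from the Before column of Example 1.
memory = {0x0285: 0x0400}
tc1, ar1 = cmp_mem_imm(memory, 0x0285, 0x0400)
print(tc1, hex(ar1))  # prints: 1 0x286
```

The result matches the After column: TC1 is 1 and AR1 holds 0286.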

## Example 2

| Syntax | Description |
| :--- | :--- |
| CMP *AR1 == #400h, TC2 | The content addressed by AR1 is compared to the signed 16-bit value <br> (400h). Because they are not equal, TC2 is cleared to 0. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AR1 | 0285 | AR1 |  |
| TC2 |  | TC2 |  |

## CMP

Compare Accumulator, Auxiliary, or Temporary Register Content

## Syntax Characteristics



| U | src | dst | Comparison Type |
| :---: | :---: | :---: | :--- |
| no | TAx | TAy | 16-bit signed comparison in A-unit ALU |
| no | TAx | ACy | 16-bit signed comparison in A-unit ALU |
| no | ACx | TAy | 16-bit signed comparison in A-unit ALU |
| no | ACx | ACy | if M40 = 0, 32-bit signed comparison in D-unit ALU <br> if M40 = 1, 40-bit signed comparison in D-unit ALU |
| yes | TAx | TAy | 16-bit unsigned comparison in A-unit ALU |
| yes | TAx | ACy | 16-bit unsigned comparison in A-unit ALU |
| yes | ACx | TAy | 16-bit unsigned comparison in A-unit ALU |
| yes | ACx | ACy | if M40 = 0, 32-bit unsigned comparison in D-unit ALU <br> if M40 = 1, 40-bit unsigned comparison in D-unit ALU |
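
The width and signedness rules in the table above can be modeled as follows. This is a hedged sketch rather than a description of the hardware: `field`, `compare` and `RELOPS` are hypothetical names, and the operator set (`==`, `<`, `>=`, `!=`) is assumed from the RELOP definition given elsewhere in this guide. The C54CM = 1 case is not modeled separately; pass `m40=1` to mimic it.

```python
import operator

# Assumed RELOP set; only == and >= appear in the examples of this section.
RELOPS = {"==": operator.eq, "<": operator.lt, ">=": operator.ge, "!=": operator.ne}


def field(value, bits, unsigned):
    """Take the low `bits` bits of value, unsigned or two's-complement signed."""
    value &= (1 << bits) - 1
    if not unsigned and value >> (bits - 1):
        value -= 1 << bits
    return value


def compare(src, dst, relop, src_is_acc, dst_is_acc, unsigned=False, m40=0):
    """Return 1 if `src RELOP dst` holds under the table's rules, else 0."""
    if src_is_acc and dst_is_acc:
        bits = 40 if m40 else 32  # accumulator pair: D-unit ALU
    else:
        bits = 16                 # any TAx operand: A-unit ALU, low 16 bits
    return int(RELOPS[relop](field(src, bits, unsigned), field(dst, bits, unsigned)))


# CMP AC1 == T1, TC1 with AC1 = 00 0028 0400 and T1 = 0400
tc1_eq = compare(0x0000280400, 0x0400, "==", True, False)
# CMP T1 >= AC1, TC1 with T1 = 0500 and AC1 = 80 0000 0400
tc1_ge = compare(0x0500, 0x8000000400, ">=", False, True)
print(tc1_eq, tc1_ge)
```

Both calls use the register values from the two examples in this section. Because one operand is a TAx register in each case, only bits 15-0 of AC1 take part in the comparison.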

## Compatibility with C54x devices (C54CM = 1)

Contrary to the corresponding C54x instruction, the C55x register comparison instruction is performed in the execute phase of the pipeline.

When C54CM = 1, the conditions testing the accumulator contents are all performed as if M40 was set to 1.


## Example 1

| Syntax | Description |
| :--- | :--- |
| CMP AC1 == T1, TC1 | The signed content of AC1(15-0) is compared to the content of T1 and because <br> they are equal, TC1 is set to 1. |


| Before |  |  |  | After |  |  |  |
| :--- | ---: | ---: | ---: | :--- | ---: | ---: | ---: |
| AC1 | 00 | 0028 | 0400 | AC1 | 00 | 0028 | 0400 |
| T1 |  |  | 0400 | T1 |  |  | 0400 |
| TC1 |  |  | 0 | TC1 |  |  | 1 |

## Example 2

| Syntax | Description |
| :--- | :--- |
| CMP T1 >= AC1, TC1 | The content of T1 is compared to the signed content of AC1(15-0). Because the content of <br> T1 is greater than the content of AC1(15-0), TC1 is set to 1. |


| Before |  |  |  | After |  |  |  |
| :--- | ---: | ---: | ---: | :--- | ---: | ---: | ---: |
| T1 |  |  | 0500 | T1 |  |  | 0500 |
| AC1 | 80 | 0000 | 0400 | AC1 | 80 | 0000 | 0400 |
| TC1 |  |  | 0 | TC1 |  |  | 1 |

## CMPAND

Compare Accumulator, Auxiliary, or Temporary Register Content with AND

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | CMPAND[U] src RELOP dst, TCy, TCx | Yes | 3 | 1 | X |
| $[2]$ | CMPAND[U] src RELOP dst, !TCy, TCx | Yes | 3 | 1 | X |

Description These instructions perform a comparison in the D-unit ALU or in the A-unit ALU. The contents of two accumulators, auxiliary registers, or temporary registers are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU.

| Status Bits | Affected by | C54CM, M40, TCy |
| :--- | :--- | :--- |
|  | Affects | TCx |

See Also See the following other related instructions:

- CMP (Compare Memory with Immediate Value)
- CMP (Compare Accumulator, Auxiliary, or Temporary Register Content)
- CMPOR (Compare Accumulator, Auxiliary, or Temporary Register Content with OR)
- MAX (Compare Accumulator, Auxiliary, or Temporary Register Content Maximum)
- MIN (Compare Accumulator, Auxiliary, or Temporary Register Content Minimum)


## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :--- | :--- | :---: | :---: | :---: | :---: |
|  | CMPAND[U] src RELOP dst, TCy, TCx |  |  |  |  |
| $[1 a]$ | CMPAND[U] src RELOP dst, TC2, TC1 | Yes | 3 | 1 | X |
| $[1 b]$ | CMPAND[U] src RELOP dst, TC1, TC2 | Yes | 3 | 1 | X |

## Opcode

0001 001E | FSSS cc01 | FDDD 0utt

Operands dst, RELOP, src, TC1, TC2

Description This instruction performs a comparison in the D-unit ALU or in the A-unit ALU. The contents of two accumulators, auxiliary registers, or temporary registers are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0. The result of the comparison is ANDed with TCy; TCx is updated with this operation.

The comparison depends on the optional U keyword and on M40 for accumulator comparisons. As the following table shows, the U keyword specifies an unsigned comparison and M40 defines the comparison bit width for accumulator comparisons.

| U | src | dst | Comparison Type |
| :---: | :---: | :---: | :--- |
| no | TAx | TAy | 16-bit signed comparison in A-unit ALU |
| no | TAx | ACy | 16-bit signed comparison in A-unit ALU |
| no | ACx | TAy | 16-bit signed comparison in A-unit ALU |
| no | ACx | ACy | if M40 = 0, 32-bit signed comparison in D-unit ALU <br> if M40 = 1, 40-bit signed comparison in D-unit ALU |
| yes | TAx | TAy | 16-bit unsigned comparison in A-unit ALU |
| yes | TAx | ACy | 16-bit unsigned comparison in A-unit ALU |
| yes | ACx | TAy | 16-bit unsigned comparison in A-unit ALU |
| yes | ACx | ACy | if M40 = 0, 32-bit unsigned comparison in D-unit ALU <br> if M40 = 1, 40-bit unsigned comparison in D-unit ALU |

## Compatibility with C54x devices (C54CM = 1)

Contrary to the corresponding C54x instruction, the C55x register comparison instruction is performed in the execute phase of the pipeline.

When C54CM = 1, the conditions testing the accumulator contents are all performed as if M40 was set to 1.

| Status Bits | Affected by | C54CM, M40, TCy |
| :--- | :--- | :--- |
|  | Affects | TCx |

Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| CMPAND AC1 == AC2, TC1, TC2 | The content of AC1(31-0) is compared to the content of AC2(31-0). <br> Because the contents are equal (true), TC2 = TC1 & 1. |


| Before |  |  | After |  |  |  |  |
| :--- | ---: | ---: | ---: | :--- | ---: | ---: | ---: |
| AC1 | 80 | 0028 | 0400 | AC1 | 80 | 0028 | 0400 |
| AC2 | 00 | 0028 | 0400 | AC2 | 00 | 0028 | 0400 |
| M40 |  |  | 0 | M40 |  |  | 0 |
| TC1 |  |  | 1 | TC1 |  |  | 1 |
| TC2 |  |  | 0 | TC2 |  |  | 1 |

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :--- | :--- | :---: | :---: | :---: | :---: |
|  | CMPAND[U] src RELOP dst, !TCy, TCx |  |  |  |  |
| $[2 a]$ | CMPAND[U] src RELOP dst, !TC2, TC1 | Yes | 3 | 1 | X |
| $[2 b]$ | CMPAND[U] src RELOP dst, !TC1, TC2 | Yes | 3 | 1 | X |

## Opcode

0001 001E | FSSS cc01 | FDDD 1utt

Operands dst, RELOP, src, TC1, TC2

Description This instruction performs a comparison in the D-unit ALU or in the A-unit ALU. The contents of two accumulators, auxiliary registers, or temporary registers are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0. The result of the comparison is ANDed with the complement of TCy; TCx is updated with this operation.

The comparison depends on the optional U keyword and on M40 for accumulator comparisons. As the following table shows, the U keyword specifies an unsigned comparison and M40 defines the comparison bit width for accumulator comparisons.

| U | src | dst | Comparison Type |
| :---: | :---: | :---: | :--- |
| no | TAx | TAy | 16-bit signed comparison in A-unit ALU |
| no | TAx | ACy | 16-bit signed comparison in A-unit ALU |
| no | ACx | TAy | 16-bit signed comparison in A-unit ALU |
| no | ACx | ACy | if M40 = 0, 32-bit signed comparison in D-unit ALU <br> if M40 = 1, 40-bit signed comparison in D-unit ALU |
| yes | TAx | TAy | 16-bit unsigned comparison in A-unit ALU |
| yes | TAx | ACy | 16-bit unsigned comparison in A-unit ALU |
| yes | ACx | TAy | 16-bit unsigned comparison in A-unit ALU |
| yes | ACx | ACy | if M40 = 0, 32-bit unsigned comparison in D-unit ALU <br> if M40 = 1, 40-bit unsigned comparison in D-unit ALU |

## Compatibility with C54x devices (C54CM = 1)

Contrary to the corresponding C54x instruction, the C55x register comparison instruction is performed in the execute phase of the pipeline.

When C54CM = 1, the conditions testing the accumulator contents are all performed as if M40 was set to 1.

| Status Bits | Affected by | C54CM, M40, TCy |
| :--- | :--- | :--- |
|  | Affects | TCx |

Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| CMPAND AC1 == AC2, !TC1, TC2 | The content of AC1(31-0) is compared to the content of AC2(31-0). <br> Because the contents are equal (true), TC2 = !TC1 & 1. |


| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC1 | 80 | 0028 | 0400 | AC1 | 80 | 0028 | 0400 |
| AC2 | 00 | 0028 | 0400 | AC2 | 00 | 0028 | 0400 |
| M40 |  |  | 0 | M40 |  |  | 0 |
| TC1 |  |  | 1 | TC1 |  |  | 1 |
| TC2 |  |  | 0 | TC2 |  |  | 0 |
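
The two CMPAND forms differ only in whether TCy is complemented before the AND. The following Python sketch models the accumulator-to-accumulator equality case used in the CMPAND examples. The function name `cmpand` and its parameters are hypothetical, and only the `==` operator is modeled.

```python
def cmpand(ac_src, ac_dst, tcy, complement=False, m40=0):
    """Model of CMPAND ACx == ACy, [!]TCy, TCx. Returns the new TCx value."""
    # With M40 = 0 only bits 31-0 are compared; with M40 = 1, bits 39-0.
    mask = (1 << (40 if m40 else 32)) - 1
    equal = int((ac_src & mask) == (ac_dst & mask))
    operand = (tcy ^ 1) if complement else tcy
    return equal & operand


# Register values from the Before columns: AC1 = 80 0028 0400, AC2 = 00 0028 0400, TC1 = 1
tc2_plain = cmpand(0x8000280400, 0x0000280400, tcy=1)
tc2_complement = cmpand(0x8000280400, 0x0000280400, tcy=1, complement=True)
print(tc2_plain, tc2_complement)  # prints: 1 0
```

The guard bits of AC1 and AC2 differ, but with M40 = 0 they are ignored, so the comparison is true in both cases and the final TC2 value is decided by TC1 or its complement.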

## CMPOR

Compare Accumulator, Auxiliary, or Temporary Register Content with OR

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | CMPOR[U] src RELOP dst, TCy, TCx | Yes | 3 | 1 | X |
| $[2]$ | CMPOR[U] src RELOP dst, !TCy, TCx | Yes | 3 | 1 | X |

Description These instructions perform a comparison in the D-unit ALU or in the A-unit ALU. The contents of two accumulators, auxiliary registers, or temporary registers are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU.

| Status Bits | Affected by | C54CM, M40, TCy |
| :--- | :--- | :--- |
|  | Affects | TCx |

See Also See the following other related instructions:

- CMP (Compare Memory with Immediate Value)
- CMP (Compare Accumulator, Auxiliary, or Temporary Register Content)
- CMPAND (Compare Accumulator, Auxiliary, or Temporary Register Content with AND)
- MAX (Compare Accumulator, Auxiliary, or Temporary Register Content Maximum)
- MIN (Compare Accumulator, Auxiliary, or Temporary Register Content Minimum)


## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :--- | :--- | :---: | :---: | :---: | :---: |
|  | CMPOR[U] src RELOP dst, TCy, TCx |  |  |  |  |
| $[1 a]$ | CMPOR[U] src RELOP dst, TC2, TC1 | Yes | 3 | 1 | X |
| $[1 b]$ | CMPOR[U] src RELOP dst, TC1, TC2 | Yes | 3 | 1 | X |

## Opcode

Operands dst, RELOP, src, TC1, TC2

Description This instruction performs a comparison in the D-unit ALU or in the A-unit ALU. The contents of two accumulators, auxiliary registers, or temporary registers are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0. The result of the comparison is ORed with TCy; TCx is updated with this operation.

The comparison depends on the optional U keyword and on M40 for accumulator comparisons. As the following table shows, the U keyword specifies an unsigned comparison and M40 defines the comparison bit width for accumulator comparisons.

| U | src | dst | Comparison Type |
| :---: | :---: | :---: | :--- |
| no | TAx | TAy | 16-bit signed comparison in A-unit ALU |
| no | TAx | ACy | 16-bit signed comparison in A-unit ALU |
| no | ACx | TAy | 16-bit signed comparison in A-unit ALU |
| no | ACx | ACy | if M40 = 0, 32-bit signed comparison in D-unit ALU <br> if M40 = 1, 40-bit signed comparison in D-unit ALU |
| yes | TAx | TAy | 16-bit unsigned comparison in A-unit ALU |
| yes | TAx | ACy | 16-bit unsigned comparison in A-unit ALU |
| yes | ACx | TAy | 16-bit unsigned comparison in A-unit ALU |
| yes | ACx | ACy | if M40 = 0, 32-bit unsigned comparison in D-unit ALU <br> if M40 = 1, 40-bit unsigned comparison in D-unit ALU |



## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :--- | :--- | :---: | :---: | :---: | :---: |
|  | CMPOR[U] src RELOP dst, !TCy, TCx |  |  |  |  |
| $[2 a]$ | CMPOR[U] src RELOP dst, !TC2, TC1 | Yes | 3 | 1 | X |
| $[2 b]$ | CMPOR[U] src RELOP dst, !TC1, TC2 | Yes | 3 | 1 | X |

## Opcode

0001 001E | FSSS cc10 | FDDD 1utt

Operands dst, RELOP, src, TC1, TC2

Description This instruction performs a comparison in the D-unit ALU or in the A-unit ALU. The contents of two accumulators, auxiliary registers, or temporary registers are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0. The result of the comparison is ORed with the complement of TCy; TCx is updated with this operation.

The comparison depends on the optional U keyword and on M40 for accumulator comparisons. As the following table shows, the U keyword specifies an unsigned comparison and M40 defines the comparison bit width for accumulator comparisons.

| U | src | dst | Comparison Type |
| :---: | :---: | :---: | :--- |
| no | TAx | TAy | 16-bit signed comparison in A-unit ALU |
| no | TAx | ACy | 16-bit signed comparison in A-unit ALU |
| no | ACx | TAy | 16-bit signed comparison in A-unit ALU |
| no | ACx | ACy | if M40 = 0, 32-bit signed comparison in D-unit ALU <br> if M40 = 1, 40-bit signed comparison in D-unit ALU |
| yes | TAx | TAy | 16-bit unsigned comparison in A-unit ALU |
| yes | TAx | ACy | 16-bit unsigned comparison in A-unit ALU |
| yes | ACx | TAy | 16-bit unsigned comparison in A-unit ALU |
| yes | ACx | ACy | if M40 = 0, 32-bit unsigned comparison in D-unit ALU <br> if M40 = 1, 40-bit unsigned comparison in D-unit ALU |
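
How CMPOR combines the comparison result with TCy can be summarized as a small truth table. The sketch below is illustrative only: `cmpor` is a hypothetical name, and the comparison itself is reduced to a precomputed true/false input so that only the OR step is modeled.

```python
def cmpor(comparison_true, tcy, complement=False):
    """Return the new TCx: comparison result ORed with TCy or with its complement."""
    operand = (tcy ^ 1) if complement else tcy
    return int(bool(comparison_true)) | operand


# Columns: comparison result, TCy, TCx for the TCy form, TCx for the !TCy form
rows = [
    (result, tcy, cmpor(result, tcy), cmpor(result, tcy, complement=True))
    for result in (0, 1)
    for tcy in (0, 1)
]
for row in rows:
    print(*row)
```

TCx ends up 0 only when the comparison is false and the selected operand (TCy, or its complement in the `!TCy` form) is also 0.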



## .CR

Circular Addressing Qualifier

## Syntax Characteristics



## DELAY

## Syntax Characteristics



## EXP

## Syntax Characteristics



## FIRSADD

Symmetrical Finite Impulse Response Filter

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
|  | FIRSADD Xmem, Ymem, Cmem, ACx, ACy | No | 4 | 1 | X |

Opcode 1000 0101 | XXXM MMYY | Y 11 mm DD D DDU%

Operands ACx, ACy, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations: multiply and accumulate (MAC), and addition. The operation is executed:

```
ACy = ACy + (ACx * Cmem)
:: ACx = (Xmem << #16) + (Ymem << #16)
```

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of ACx(32-16) and the content of a data memory operand Cmem, addressed using the coefficient addressing mode, sign extended to 17 bits.

- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

The second operation performs an addition operation between the content of data memory operand Xmem, shifted left 16 bits, and the content of data memory operand Ymem, shifted left 16 bits.

- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40. When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, no overflow detection, report, and saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, FRCT, M40, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy, CARRY |

Repeat This instruction can be repeated.

See Also See the following other related instructions:

- FIRSSUB (Antisymmetrical Finite Impulse Response Filter)

## Example

| Syntax | Description |
| :--- | :--- |
| FIRSADD *AR0, *AR1, *CDP, AC0, AC1 | The content of AC0(32-16) multiplied by the content addressed <br> by the coefficient data pointer register (CDP) is added to the <br> content of AC1 and the result is stored in AC1. The content <br> addressed by AR0 shifted left by 16 bits is added to the content <br> addressed by AR1 shifted left by 16 bits and the result is stored <br> in AC0. |


| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | 00 | 6900 | 0000 | AC0 | 00 | 2300 | 0000 |
| AC1 | 00 | 0023 | 0000 | AC1 | FF | D8ED | 3F00 |
| *AR0 |  |  | 3400 | *AR0 |  |  | 3400 |
| *AR1 |  |  | EF00 | *AR1 |  |  | EF00 |
| *CDP |  |  | A067 | *CDP |  |  | A067 |
| ACOV0 |  |  | 0 | ACOV0 |  |  | 0 |
| ACOV1 |  |  | 0 | ACOV1 |  |  | 0 |
| CARRY |  |  | 0 | CARRY |  |  | 1 |
| FRCT |  |  | 0 | FRCT |  |  | 0 |
| SXMD |  |  | 0 | SXMD |  |  | 0 |
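
The arithmetic behind this example can be checked with a short Python model. It is a sketch under stated assumptions, not a cycle-accurate description: `signed` and `firsadd` are hypothetical names, FRCT = 0, SXMD = 0 and M40 = 0 are hard-coded to match the example, memory operands are assumed to be zero extended when SXMD = 0, and overflow flags and saturation are not modeled. Only bits 31-0 of the new ACx are returned.

```python
MASK40 = (1 << 40) - 1


def signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def firsadd(acx, acy, xmem, ymem, cmem):
    """Return (new ACx bits 31-0, new 40-bit ACy, CARRY) for FRCT = SXMD = M40 = 0."""
    # MAC: ACx(32-16) times Cmem (sign extended), accumulated into ACy.
    product = signed(acx >> 16, 17) * signed(cmem, 16)
    new_acy = (signed(acy, 40) + product) & MASK40
    # ALU: both memory operands are shifted left by 16 bits and added.
    total = ((xmem & 0xFFFF) << 16) + ((ymem & 0xFFFF) << 16)
    carry = (total >> 32) & 1  # carry out of bit 31, since M40 = 0
    return total & 0xFFFFFFFF, new_acy, carry


# Values from the Before column of the example.
new_acx, new_acy, carry = firsadd(0x0069000000, 0x0000230000, 0x3400, 0xEF00, 0xA067)
print(f"{new_acx:08X} {new_acy:010X} {carry}")  # prints: 23000000 FFD8ED3F00 1
```

The printed values agree with the After column: AC0(31-0) = 2300 0000, AC1 = FF D8ED 3F00, and CARRY = 1.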

## FIRSSUB

Antisymmetrical Finite Impulse Response Filter

## Syntax Characteristics



The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of ACx(32-16) and the content of a data memory operand Cmem, addressed using the coefficient addressing mode, sign extended to 17 bits.

- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

The second operation subtracts the content of data memory operand Ymem, shifted left 16 bits, from the content of data memory operand Xmem, shifted left 16 bits.

- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
- When an overflow is detected, the accumulator is saturated according to SATD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, no overflow detection, report, and saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, FRCT, M40, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy, CARRY |

Repeat This instruction can be repeated.

See Also See the following other related instructions:

- FIRSADD (Symmetrical Finite Impulse Response Filter)

Example

| Syntax | Description |
| :--- | :--- |
| FIRSSUB *AR0, *AR1, *CDP, AC0, AC1 | The content of AC0(32-16) multiplied by the content addressed <br> by the coefficient data pointer register (CDP) is added to the <br> content of AC1 and the result is stored in AC1. The content <br> addressed by AR1 shifted left by 16 bits is subtracted from the <br> content addressed by AR0 shifted left by 16 bits and the result <br> is stored in AC0. |


| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | 00 | 6900 | 0000 | AC0 | 00 | 4500 | 0000 |
| AC1 | 00 | 0023 | 0000 | AC1 | FF | D8ED | 3F00 |
| *AR0 |  |  | 3400 | *AR0 |  |  | 3400 |
| *AR1 |  |  | EF00 | *AR1 |  |  | EF00 |
| *CDP |  |  | A067 | *CDP |  |  | A067 |
| ACOV0 |  |  | 0 | ACOV0 |  |  | 0 |
| ACOV1 |  |  | 0 | ACOV1 |  |  | 0 |
| CARRY |  |  | 0 | CARRY |  |  | 0 |
| FRCT |  |  | 0 | FRCT |  |  | 0 |
| SXMD |  |  | 0 | SXMD |  |  | 0 |
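
The subtraction half of this example, including the borrow-to-CARRY relationship, can be illustrated as follows. This is a minimal sketch with a hypothetical function name (`firssub_alu`); it assumes SXMD = 0 (operands zero extended) and M40 = 0 (borrow evaluated on bits 31-0), as in the example, and it does not model overflow or saturation.

```python
def firssub_alu(xmem, ymem):
    """Return (bits 31-0 of (Xmem << 16) - (Ymem << 16), CARRY) for SXMD = M40 = 0."""
    x = (xmem & 0xFFFF) << 16
    y = (ymem & 0xFFFF) << 16
    borrow = 1 if x < y else 0
    carry = borrow ^ 1  # CARRY is the logical complement of the borrow bit
    return (x - y) & 0xFFFFFFFF, carry


# *AR0 = 3400 and *AR1 = EF00, from the Before column of the example.
acx_low32, carry = firssub_alu(0x3400, 0xEF00)
print(f"{acx_low32:08X} {carry}")  # prints: 45000000 0
```

Because 3400 0000 is smaller than EF00 0000 as an unsigned 32-bit value, a borrow occurs and CARRY is 0, which matches the After column.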

## IDLE

Idle

## Syntax Characteristics


## INTR

## Software Interrupt

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | INTR k5 | No | 2 | 3 | D |

Opcode

Operands k5

Description This instruction passes control to a specified interrupt service routine (ISR) and interrupts are globally disabled (INTM bit is set to 1 after ST1_55 content is pushed onto the data stack pointer). The ISR address is stored at the interrupt vector address defined by the content of an interrupt vector pointer (IVPD or IVPH) combined with the 5-bit constant, k5. This instruction is executed regardless of the value of INTM bit.

Note:
DBSTAT (the debug status register) holds debug context information used during emulation. Make sure the ISR does not modify the value that will be returned to DBSTAT.

Before beginning an ISR, the CPU automatically saves the value of some CPU registers and two internal registers: the program counter (PC) and a loop context register. The CPU can use these values to re-establish the context of the interrupted program sequence when the ISR is done.

In the slow-return process (default), the return address (from the PC), the loop context bits, and some CPU registers are stored to the stacks (in memory). When the CPU returns from an ISR, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are saved to registers, so that these values can always be restored quickly. These special registers are the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions. Some CPU registers are saved to the stacks (in memory). For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

When control is passed to the ISR:

- The data stack pointer (SP) is decremented by 1 word in the address phase of the pipeline. The status register 2 (ST2_55) content is pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the address phase of the pipeline. The 7 higher bits of status register 0 (ST0_55) concatenated with 9 zeroes are pushed to the top of SSP.
- The SP is decremented by 1 word in the access phase of the pipeline. The status register 1 (ST1_55) content is pushed to the top of SP.
- The SSP is decremented by 1 word in the access phase of the pipeline. The debug status register (DBSTAT) content is pushed to the top of SSP.
- The SP is decremented by 1 word in the read phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The SSP is decremented by 1 word in the read phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the ISR program address. The active control flow execution context flags are cleared.

When the software interrupt is acknowledged, the corresponding bits in IFR0 and IFR1 are cleared.
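
The order of the stack pushes listed above can be visualized with a small Python model. This is a sketch under several assumptions that are not stated in this section: the function name `intr_push_context` and its parameters are hypothetical, SP and SSP are treated as 16-bit word pointers, the loop context bits are assumed to occupy the upper byte of the last system-stack word with PC(23-16) in the lower byte, and the stacks are represented as dictionaries mapping addresses to pushed words.

```python
def intr_push_context(sp, ssp, st0, st1, st2, dbstat, return_pc, loop_bits):
    """Return (new_sp, new_ssp, data_stack, system_stack) after the slow-return pushes."""
    data_stack, system_stack = {}, {}
    # One (SP word, SSP word) pair per pipeline phase, in the order described above.
    pairs = [
        (st2, st0 & 0xFE00),       # address phase: ST2_55 / ST0_55(15-9) followed by 9 zeroes
        (st1, dbstat),             # access phase: ST1_55 / DBSTAT
        (return_pc & 0xFFFF,       # read phase: PC(15-0) / loop bits with PC(23-16)
         ((loop_bits & 0xFF) << 8) | ((return_pc >> 16) & 0xFF)),  # byte layout assumed
    ]
    for sp_word, ssp_word in pairs:
        sp = (sp - 1) & 0xFFFF     # each pointer is decremented by 1 word before the write
        ssp = (ssp - 1) & 0xFFFF
        data_stack[sp] = sp_word & 0xFFFF
        system_stack[ssp] = ssp_word & 0xFFFF
    return sp, ssp, data_stack, system_stack


# Arbitrary illustrative register values, not taken from the manual.
sp, ssp, data_stack, system_stack = intr_push_context(
    sp=0x1000, ssp=0x2000, st0=0xFFFF, st1=0x1111, st2=0x2222,
    dbstat=0x3333, return_pc=0x012345, loop_bits=0xAB,
)
for address in sorted(data_stack, reverse=True):
    print(f"SP  {address:04X}: {data_stack[address]:04X}")
for address in sorted(system_stack, reverse=True):
    print(f"SSP {address:04X}: {system_stack[address]:04X}")
```

Both stacks grow downward by three words, so a matching return sequence has to pop three words from each stack in the reverse order.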


| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | INTM, IFR0, IFR1 |

Repeat This instruction cannot be repeated.

See Also See the following other related instructions:

- RETI (Return from Interrupt)
- TRAP (Software Trap)


## Example

| Syntax | Description |
| :--- | :--- |
| INTR #3 | Program control is passed to the specified interrupt service routine. The interrupt vector address <br> is defined by the content of an interrupt vector pointer (IVPD) combined with the unsigned <br> 5-bit value (3). |

## .LK

## Lock Access Qualifier

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | .LK | No | 2 | 1 | D |

Opcode 0100 0101 1111 0010

Operands none

This is an operand qualifier that can be paralleled with any of 13 instructions (listed below) that execute a read-modify-write operation on a specific memory operand. If the .LK qualifier is applied to any of these 13 instructions, the lock signal is activated in the same cycle as the read request, and the corresponding write request follows this read request. Because stalls are generated, no memory request issued by another instruction can be placed between the locked read request and its write request. This also provides a suitable interface with the OCP.

This operand qualifier cannot be executed:

- Alone
- In parallel with instructions other than the 13 lock instructions

None of the 13 instructions using the .LK qualifier can be combined with any other user-defined parallelism instruction.

The 13 lock instructions which can be paralleled with the .LK qualifier are listed in the table below.

| Number | Algebraic | Mnemonic |
| :---: | :---: | :---: |
| 1 | TC1 = bit(Smem, k4), bit(Smem, k4) = \#1 | BTSTSET k4, Smem, TC1 |
| 2 | TC2 = bit(Smem, k4), bit(Smem, k4) = \#1 | BTSTSET k4, Smem, TC2 |
| 3 | TC1 = bit(Smem, k4), bit(Smem, k4) = \#0 | BTSTCLR k4, Smem, TC1 |
| 4 | TC2 = bit(Smem, k4), bit(Smem, k4) = \#0 | BTSTCLR k4, Smem, TC2 |
| 5 | TC1 = bit(Smem, k4), cbit(Smem, k4) | BTSTNOT k4, Smem, TC1 |
| 6 | TC2 = bit(Smem, k4), cbit(Smem, k4) | BTSTNOT k4, Smem, TC2 |
| 7 | bit(Smem, src) = \#1 | BSET src, Smem |
| 8 | bit(Smem, src) = \#0 | BCLR src, Smem |
| 9 | cbit(Smem, src) | BNOT src, Smem |
| 10 | Smem = Smem \& k16 | AND k16, Smem |
| 11 | Smem = Smem \| k16 | OR k16, Smem |
| 12 | Smem = Smem ^ k16 | XOR k16, Smem |
| 13 | Smem = Smem + k16 | ADD k16, Smem |

None of the 13 instructions with the .LK qualifier is allowed in the conditional execution context applied by the "if(cond) execute(D_unit)" instruction, due to OCP compliance. The cases below are illegal and rejected by the code-gen tools:

## LMS

## Least Mean Square

## Syntax Characteristics



The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, sign extended to 17 bits, and the content of data memory operand Ymem, sign extended to 17 bits.

- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

The second operation performs an addition between an accumulator content and the content of data memory operand Xmem shifted left by 16 bits.

- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40. When an overflow is detected, the accumulator is saturated according to SATD.
- Rounding is performed according to RDM.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, the rounding is performed without clearing the 16 lowest bits of ACx. The addition operation has no overflow detection, report, and saturation after the shifting operation.

| Status Bits | Affected by | C54CM, FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy, CARRY |

## Example

| Syntax | Description |
| :--- | :--- |
| LMS *AR0, *AR1, AC0, AC1 | The content addressed by AR0 multiplied by the content addressed by AR1 <br> is added to the content of AC1 and the result is stored in AC1. The content <br> addressed by AR0 shifted left by 16 bits is added to the content of AC0. The <br> result is rounded and stored in AC0. |


| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | 00 | 1111 | 2222 | AC0 | 00 | 2111 | 0000 |
| AC1 | 00 | 1000 | 0000 | AC1 | 00 | 1200 | 0000 |
| *AR0 |  |  | 1000 | *AR0 |  |  | 1000 |
| *AR1 |  |  | 2000 | *AR1 |  |  | 2000 |
| ACOV0 |  |  | 0 | ACOV0 |  |  | 0 |
| ACOV1 |  |  | 0 | ACOV1 |  |  | 0 |
| CARRY |  |  | 0 | CARRY |  |  | 0 |
| FRCT |  |  | 0 | FRCT |  |  | 0 |
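
The numbers in this example can be reproduced with a short Python model of the two parallel operations. It is a hedged sketch: `signed` and `lms` are hypothetical names, FRCT = 0 and C54CM = 0 are assumed, rounding is assumed to be the biased RDM = 0 form (add 2^15, then clear the 16 lowest bits), Xmem is assumed to be sign extended (SXMD = 1), and overflow, saturation and CARRY are not modeled.

```python
MASK40 = (1 << 40) - 1


def signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def lms(acx, acy, xmem, ymem):
    """Return (new ACx, new ACy) as 40-bit values for FRCT = 0, RDM = 0, C54CM = 0."""
    # MAC: ACy = ACy + Xmem * Ymem
    product = signed(xmem, 16) * signed(ymem, 16)
    new_acy = (signed(acy, 40) + product) & MASK40
    # ALU: ACx = rnd(ACx + (Xmem << 16)); biased rounding, low 16 bits cleared
    total = signed(acx, 40) + (signed(xmem, 16) << 16)
    rounded = (total + 0x8000) & ~0xFFFF
    return rounded & MASK40, new_acy


# Values from the Before column of the example.
new_ac0, new_ac1 = lms(0x0011112222, 0x0010000000, 0x1000, 0x2000)
print(f"{new_ac0:010X} {new_ac1:010X}")  # prints: 0021110000 0012000000
```

The output corresponds to the After column: AC0 = 00 2111 0000 and AC1 = 00 1200 0000.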

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | LMSF Xmem, Ymem, ACx, ACy | No | 4 | 1 | X |

Opcode 1000 0111 | XXXM MMYY | YMMM SSDD | 0110 0001

Operands ACx, ACy, Xmem, Ymem, T3

Description This instruction performs three parallel operations in one cycle. The operations are executed in the D-unit MAC and D-unit ALU. The instruction is executed:

    ACx = T3 * (Ymem)
    ACy = ACy + (Xmem) * (Ymem)
    Xmem = HI(rnd(ACx + (Xmem) << #16))

The first operation performs a multiplication in D-unit MAC1. The input operands of the multiplier are the content of data register T3 and the content of data memory operand Ymem. The implied T3 operand is sign extended to 17 bits in the MAC1. The data memory operand Ymem is addressed by DAGEN path Y by using Ymem addressing mode, driven on the CDB bus, and sign extended to 17 bits in the MAC1.

- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.

When an overflow is detected, the accumulator is saturated according to SATD.

The second operation performs a multiplication and an addition in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Xmem and the content of data memory operand Ymem. The data memory operand Xmem is addressed by DAGEN path X by using Xmem addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2. The other data memory operand Ymem is addressed by DAGEN path Y by using the Ymem addressing mode, driven on data bus CDB, and sign extended to 17 bits in the MAC2.

- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

The third operation performs an addition between an accumulator content and the content of data memory operand Xmem in the D-unit ALU. The data memory operand Xmem is driven on the DDB bus as described in the above second operation, sign extended to 40 bits according to SXMD, shifted to the left by 16 bits, and supplied to D-unit ALU.

- The shift operation is identical to the arithmetic shift instruction. Therefore, overflow detection, reporting, and saturation are performed after the shift operation.
- Overflow and CARRY detection operate as if the M40 bit were locally set to 0.
- Addition overflow is always detected at bit position 31.
- The addition carry reported in the CARRY status bit is always extracted at bit position 31.
- Rounding is always performed on the result of the addition. The rounding operation depends on the RDM status value.
- When RDM is 0, biased rounding to infinity is performed: $2^{15}$ is added to the 40-bit result of the accumulation.
- When RDM is 1, unbiased rounding to the nearest is performed: depending on the value of the 17 LSBs of the 40-bit result of the accumulation, $2^{15}$ is added as described by the following pseudocode.

```
if (2^15 < bit(15-0) < 2^16)
    add 2^15 to the 40-bit result of the accumulation
else if (bit(15-0) == 2^15)
    if (bit(16) == 1)
        add 2^15 to the 40-bit result of the accumulation
```

- When an overflow is detected on the result of the rounding, the accumulator is saturated according to SATD. Note that no overflow detection is performed on the intermediate result after the addition but before the rounding.
- If an overflow resulting from the shift, the addition, or the rounding is detected, the accumulator 0 overflow status bit (ACOV0) is set. (In exceptional cases, the rounding operation may suppress the overflow report even if the result of the addition overflowed.)
- When an overflow is detected, the result is saturated according to SATD before being stored in memory. Saturation values are 7FFFh or 8000h.
- The result of the third operation, the high part of ACx, is stored into the data memory location addressed by Xmem via the Ebus.
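
The two rounding modes described above can be modeled in a few lines. The following Python sketch is illustrative only: the function name `rnd` and the representation of the accumulator as an unsigned 40-bit integer are assumptions, and clearing the 16 LSBs after rounding is inferred from the C54x compatibility note for this instruction rather than stated in the pseudocode.

```python
MASK40 = (1 << 40) - 1

def rnd(acc40: int, rdm: int) -> int:
    """Round a 40-bit accumulator pattern at bit 16 (hypothetical helper)."""
    low16 = acc40 & 0xFFFF          # bit(15-0) in the pseudocode
    bit16 = (acc40 >> 16) & 1       # bit(16) in the pseudocode
    if rdm == 0:
        # Biased rounding: 2^15 is always added.
        acc40 += 1 << 15
    elif low16 > 0x8000 or (low16 == 0x8000 and bit16 == 1):
        # Unbiased rounding: an exact tie is rounded up only when bit 16 is 1.
        acc40 += 1 << 15
    # Assumption: the 16 LSBs are cleared after rounding.
    return acc40 & MASK40 & ~0xFFFF

# An exact tie (low half == 8000h) shows the difference between the modes.
tie_odd = 0x003EFF8000    # bit 16 is 1
tie_even = 0x003EFE8000   # bit 16 is 0
print(hex(rnd(tie_even, 0)), hex(rnd(tie_even, 1)))
```

With RDM = 0 both tie values round up; with RDM = 1 only `tie_odd` does, which is what removes the bias over many operations.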


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with $\mathrm{M} 40=0$ and $\mathrm{C} 54 \mathrm{CM}=1$, compatibility is ensured because the instruction follows the implementation of the LMS instruction:
- The rounding is performed without clearing the 16 lowest bits of ACx.
- The addition operation has no overflow detection, report, or saturation after the shifting operation.

| Status Bits | Affected by | C54CM, FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy, ACOV0, CARRY |

Repeat This instruction can be repeated.

| Syntax | Description |
| :--- | :--- |
| lmsf(*AR2-,*AR3+,AC0,AC1); <br> SXM=1, FRCT=1; <br> assuming 4KW bank DARAM | The product of the content addressed by AR2 and the content addressed by AR3 is added to the content of AC1 and the result is stored in AC1. The content addressed by AR2, shifted to the left by 16 bits, is added to the content of AC0. The result is rounded and stored in AC0. |

Execution

T3[16:0] * (Ymem)[16:0] -> ACx[39:0]
ACy[39:0] + (Xmem)[16:0] * (Ymem)[16:0] -> ACy[39:0]
HI(rnd(ACx[39:0] + ((Xmem) << \#16))) -> Xmem

| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | 00 | 3FFF | 8000 | AC0 | 00 | 0200 | 0000 |
| AC1 | 00 | 0000 | 8000 | AC1 | 00 | 0004 | 8000 |
| T3 |  |  | 8000 | T3 |  |  | 8000 |
| XAR2 |  | 00 | 30FF | XAR2 |  | 00 | 30FE |
| XAR3 |  | 00 | 2000 | XAR3 |  | 00 | 2001 |
| Data memory |  |  |  |  |  |  |  |
| 2000h |  |  | FE00 | 2000h |  |  | FE00 |
| 30FFh |  |  | FF00 | 30FFh |  |  | 3F00 |
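
The Before/After values above can be checked with a small arithmetic model. This Python sketch is only an approximation under stated assumptions: the helper names `s16`, `s40`, and `lmsf` are hypothetical, RDM = 0 is assumed (the example does not state it), all three operations read the register state from before the instruction, and overflow detection and saturation are not modeled.

```python
MASK40 = (1 << 40) - 1

def s16(word):
    """Interpret a 16-bit pattern as a signed value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def s40(acc):
    """Interpret a 40-bit pattern as a signed value."""
    acc &= MASK40
    return acc - (1 << 40) if acc >> 39 else acc

def lmsf(acx, acy, t3, xmem, ymem, frct=1):
    """Return (new ACx, new ACy, new Xmem), all computed from the old state."""
    shift = 1 if frct else 0
    # First operation: ACx = T3 * Ymem
    new_acx = (s16(t3) * s16(ymem)) << shift
    # Second operation: ACy = ACy + Xmem * Ymem
    new_acy = s40(acy) + ((s16(xmem) * s16(ymem)) << shift)
    # Third operation: Xmem = HI(rnd(old ACx + (Xmem << 16))), RDM = 0 assumed
    rounded = s40(acx) + (s16(xmem) << 16) + (1 << 15)
    new_xmem = (rounded >> 16) & 0xFFFF
    return new_acx & MASK40, new_acy & MASK40, new_xmem

acx, acy, xmem = lmsf(0x003FFF8000, 0x0000008000, 0x8000, 0xFF00, 0xFE00)
print(f"AC0={acx:010X} AC1={acy:010X} mem[30FFh]={xmem:04X}")
```

Under these assumptions the model yields AC0 = 00 0200 0000, AC1 = 00 0004 8000, and 3F00 at address 30FFh, matching the After column.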

## .LR

Linear Addressing Qualifier

## Syntax Characteristics



## MAC

## Multiply and Accumulate

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAC[R] ACx, Tx, ACy[, ACy] | Yes | 2 | 1 | X |
| [2] | MAC[R] ACy, Tx, ACx, ACy | Yes | 2 | 1 | X |
| [3] | MACK[R] Tx, K8, [ACx,] ACy | Yes | 3 | 1 | X |
| [4] | MACK[R] Tx, K16, [ACx,] ACy | No | 4 | 1 | X |
| [5] | MACM[R] [T3 = ]Smem, Cmem, ACx | No | 3 | 1 | X |
| [6] | MACM[R] [T3 = ]Smem, [ACx,] ACy | No | 3 | 1 | X |
| [7] | MACM[R] [T3 = ]Smem, Tx, [ACx,] ACy | No | 3 | 1 | X |
| [8] | MACMK[R] [T3 = ]Smem, K8, [ACx,] ACy | No | 4 | 1 | X |
| [9] | MACM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], [ACx,] ACy | No | 4 | 1 | X |
| [10] | MACM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], ACx >> \#16[, ACy] | No | 4 | 1 | X |
| [11] | MAC[R] Smem, uns(Cmem), ACx | No | 3 | 1 | X |

Description This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are:

- ACx(32-16)
- The content of Tx, sign extended to 17 bits
- The 8-bit signed constant, K8, sign extended to 17 bits
- The 16-bit signed constant, K16, sign extended to 17 bits
- The content of a memory (Smem) location, sign extended to 17 bits
- The content of a data memory operand Cmem, addressed using the coefficient addressing mode, sign extended to 17 bits
- The content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand Ymem, extended to 17 bits

Status Bits
Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy
See Also
See the following other related instructions:

- AMAR::MAC (Modify Auxiliary Register Content with Parallel Multiply and Accumulate)
- MACMZ (Multiply and Accumulate with Parallel Delay)
- MAC::MAC (Parallel Multiply and Accumulates)
- MAC::MAS (Multiply and Accumulate with Parallel Multiply and Subtract)
- MAC::MPY (Multiply and Accumulate with Parallel Multiply)
- MACM::MOV (Multiply and Accumulate with Parallel Load Accumulator from Memory)
- MACM::MOV (Multiply and Accumulate with Parallel Store Accumulator Content to Memory)
- MAS (Multiply and Subtract)
- MAS::MAC (Multiply and Subtract with Parallel Multiply and Accumulate)
- MPY::MAC (Multiply with Parallel Multiply and Accumulate)
- MPY::MAS (Multiply with Parallel Multiply and Subtract)


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MAC[R] ACx, Tx, ACy[, ACy] | Yes | 2 | 1 | X |
| Opcode |  |  |  |  |  |

## Operands

Description

Status Bits

Repeat

## Example

| Syntax | Description |
| :--- | :--- |
| MAC AC1, T0, AC0 | The product of the content of AC1 and the content of T0 is added to the content of <br> AC0. The result is stored in AC0. |


## Syntax Characteristics



| Syntax | Description |
| :--- | :--- |
| MACR AC1, T1, AC0, AC1 | The product of the content of AC1 and the content of T1 is added to the content of AC0. The result is rounded and stored in AC1. |

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [3] | MACK[R] Tx, K8, [ACx,] ACy | Yes | 3 | 1 | X |
| Opcode |  |  |  |  |  |

## Operands

ACx, ACy, K8, Tx

Description This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the 8-bit signed constant, K8, sign extended to 17 bits:

$\mathrm{ACy}=\mathrm{ACx}+(\mathrm{Tx} * \mathrm{K8})$

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD
Affects ACOVy
Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MACK T0, \#FFh, AC1, AC0 | The product of the content of T0 and a signed 8-bit value (FFh) is added to <br> the content of AC1. The result is stored in AC0. |
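
Because K8 is a signed constant, the value FFh in the example above stands for -1 once it is sign extended. The following Python sketch illustrates this; the helper names `sext` and `mack` and the register values are hypothetical, and rounding, overflow detection, and saturation are left out.

```python
MASK40 = (1 << 40) - 1

def sext(value, bits):
    """Sign extend the low `bits` bits of `value` to a Python int."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def mack(acx, tx, k, k_bits=8, frct=0):
    """Model ACy = ACx + (Tx * K) for a signed K8 or K16 constant."""
    product = sext(tx, 16) * sext(k, k_bits)
    if frct:
        product <<= 1               # fractional mode: shift the product left by 1
    return (sext(acx, 40) + product) & MASK40

# Hypothetical values: AC1 = 00 0000 1000h, T0 = 0002h, K8 = FFh (-1)
print(f"{mack(0x0000001000, 0x0002, 0xFF):010X}")
```

The same helper covers the K16 form by passing `k_bits=16`, where FFFFh is likewise interpreted as -1.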


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | MACK[R] Tx, K16, [ACx,] ACy | No | 4 | 1 | X |

Opcode

## Operands

ACx, ACy, K16, Tx

## Description

This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the 16-bit signed constant, K16, sign extended to 17 bits:

$\mathrm{ACy}=\mathrm{ACx}+(\mathrm{Tx} * \mathrm{K16})$

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVy
Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| MACK T0, \#FFFFh, AC1, AC0 | The product of the content of T0 and a signed 16-bit value (FFFFh) is <br> added to the content of AC1. The result is stored in AC0. |

Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [5] | MACM[R] [T3 = ]Smem, Cmem, ACx | No | 3 | 1 | X |
| Opcode |  | 0001 AA | AAAA A | AAAI $\mid$ U\%DD | 01mm |
| Operands | ACx, Cmem, Smem |  |  |  |  |

## Description

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

## Example

| Syntax | Description |
| :--- | :--- |
| MACMR *AR1, *CDP, AC2 | The product of the content addressed by AR1 and the content addressed by <br> the coefficient data pointer register (CDP) is added to the content of AC2. <br> The result is rounded and stored in AC2. The result generated an overflow. |


| Before |  |  | After |  |  |
| :--- | ---: | ---: | :--- | ---: | ---: |
| AC2 | 00 EC00 | 0000 | AC2 | 00 EC00 | 0000 |
| AR1 |  | 0302 | AR1 |  | 0302 |
| CDP |  | 0202 | CDP |  | 0202 |
| 302 |  | FE00 | 302 |  | FE00 |
| 202 |  | 0040 | 202 |  | 0040 |
| ACOV2 |  | 0 | ACOV2 |  | 1 |
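
The table above can be reproduced with a short model. This Python sketch rests on several assumptions that the example does not state: FRCT = 0, RDM = 0, M40 = 0 (overflow detected at bit 31), and SATD = 0 (no saturation). The function name `macmr` is hypothetical.

```python
MASK40 = (1 << 40) - 1

def s16(word):
    """Interpret a 16-bit pattern as a signed value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def s40(acc):
    """Interpret a 40-bit pattern as a signed value."""
    acc &= MASK40
    return acc - (1 << 40) if acc >> 39 else acc

def macmr(acx, smem, cmem):
    """Model ACx = rnd(ACx + Smem * Cmem); return (new ACx, overflow flag)."""
    total = s40(acx) + s16(smem) * s16(cmem)       # FRCT = 0 assumed
    total = (total + (1 << 15)) & ~0xFFFF          # RDM = 0; 16 LSBs cleared
    # M40 = 0 assumed: the result must fit in a signed 32-bit range.
    overflow = not (-(1 << 31) <= total < (1 << 31))
    return total & MASK40, overflow

ac2, acov2 = macmr(0x00EC000000, 0xFE00, 0x0040)
print(f"AC2={ac2:010X} ACOV2={int(acov2)}")
```

The product FE00h * 0040h is -8000h, so the sum is 00 EBFF 8000h and rounds back up to 00 EC00 0000h. That value does not fit in 32 signed bits, which is why ACOV2 is set even though AC2 appears unchanged.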


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [6] | MACM[R] [T3 = ]Smem, [ACx,] ACy | No | 3 | 1 | X |

## Operands

ACx, ACy, Smem

## Description

This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are ACx(32-16) and the content of a memory location (Smem), sign extended to 17 bits:
$\mathrm{ACy}=\mathrm{ACy}+($ Smem * ACx $)$
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

## Status Bits

Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVy

Repeat This instruction can be repeated.

Example

| Syntax | Description |
| :--- | :--- |
| MACM *AR3, AC0, AC1 | The product of the content addressed by AR3 and the content of AC0 is added <br> to the content of AC1. The result is stored in AC1. |

## Syntax Characteristics



| Syntax | Description |
| :--- | :--- |
| MACM *AR3, T0, AC1, AC0 | The product of the content addressed by AR3 and the content of T0 is <br> added to the content of AC1. The result is stored in AC0. |

## Syntax Characteristics


$\mathrm{ACy}=\mathrm{ACx}+(\mathrm{Smem} * \mathrm{K8})$

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

Status Bits Affected by FRCT, M40, RDM, SATD
Affects ACOVy
Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MACMK *AR3, \#FFh, AC1, AC0 | The product of the content addressed by AR3 and a signed 8-bit value (FFh) is added to the content of AC1. The result is stored in AC0. |

Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [9] | MACM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], [ACx,] ACy | No | 4 | 1 | X |

Opcode 1000 0110 $\mid$ XXXM MMYY $\mid$ YMMM SSDD $\mid$ 001g uuU%

## Operands

ACx, ACy, Xmem, Ymem

## Description

This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand Ymem, extended to 17 bits:

```
ACy = ACx + (Xmem * Ymem)
```

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.
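
The effect of the uns keyword on the multiplier inputs can be sketched as follows. This Python model is an illustration, not a description of the hardware: the names `extend17` and `macm` are hypothetical, SXMD = 1 is assumed for the signed case, and rounding, overflow detection, and saturation are omitted.

```python
MASK40 = (1 << 40) - 1

def extend17(word, uns):
    """Extend a 16-bit memory word to the 17-bit multiplier input."""
    word &= 0xFFFF
    if uns:
        return word                                   # zero extended
    return word - 0x10000 if word & 0x8000 else word  # sign extended (SXMD = 1)

def macm(acx, xmem, ymem, uns_x=False, uns_y=False, frct=0):
    """Model ACy = ACx + (Xmem * Ymem) with optional uns() on each operand."""
    product = extend17(xmem, uns_x) * extend17(ymem, uns_y)
    if frct:
        product <<= 1
    acx &= MASK40
    acc = acx - (1 << 40) if acx >> 39 else acx        # signed 40-bit view
    return (acc + product) & MASK40

# The same bit pattern FFFFh is -1 when signed and 65535 when unsigned.
print(f"{macm(0, 0xFFFF, 0x0002):010X}")
print(f"{macm(0, 0xFFFF, 0x0002, uns_x=True):010X}")
```

The 17-bit width is what lets one multiplier handle both cases: an unsigned 16-bit word always fits in 17 signed bits.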


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.



## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [10] | MACM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], ACx >> \#16[, ACy] | No | 4 | 1 | X |

Opcode

## Operands

ACx, ACy, Xmem, Ymem

## Description

This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand Ymem, extended to 17 bits:

```
ACy = (ACx >> #16) + (Xmem * Ymem)
```

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACx(39).
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MACM uns(*AR3), uns(*AR4), AC1 >> \#16, AC0 | The product of the unsigned content addressed by AR3 and the unsigned content addressed by AR4 is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC0. |
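
The shifted accumulation can be illustrated with a brief model. In this Python sketch the function name `macm_shift16` and the operand values are hypothetical; both memory operands are treated as unsigned, as in the example above, and FRCT = 0 is assumed. Rounding and overflow handling are not modeled.

```python
MASK40 = (1 << 40) - 1

def macm_shift16(acx, xmem, ymem):
    """Model ACy = (ACx >> #16) + uns(Xmem) * uns(Ymem)."""
    acx &= MASK40
    acc = acx - (1 << 40) if acx >> 39 else acx   # signed 40-bit view
    shifted = acc >> 16        # Python's >> on ints replicates the sign (bit 39)
    product = (xmem & 0xFFFF) * (ymem & 0xFFFF)   # uns(): zero extended
    return (shifted + product) & MASK40

# A positive and a negative source accumulator (hypothetical values).
print(f"{macm_shift16(0x0012340000, 0x0002, 0x0003):010X}")
print(f"{macm_shift16(0xFF80000000, 0xFFFF, 0x0002):010X}")
```

Shifting the running sum right by 16 bits before adding the next partial product is the usual way to align partial products in a multi-word multiplication, which is the typical use for this form.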


## Syntax Characteristics



## Note:

The uns keyword is mandatory for this instruction.

The data memory operand Smem is addressed by DAGEN path X by using the Smem addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand, Cmem, is addressed by DAGEN path C by using the coefficient addressing mode, driven on data bus BDB, and zero extended to 17 bits in the MAC1:

```
ACx = ACx + (Smem * Cmem)
```

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

This instruction can be used to compute an intermediate product and accumulate it with the other partial results of a double-precision multiplication. It also frees up one DAGEN operator (DAGEN path Y) for a store instruction executed in parallel.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :---: | :---: | :---: |
|  | Affects | ACOVx |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC *AR3-, uns(*CDP+), AC0 | The product of the content addressed by AR3 and the content addressed by the coefficient data pointer register (CDP) is added to the content of AC0. The result is stored in AC0. AR3 is decremented by 1 and CDP is incremented by 1. |


Execution

rnd(ACx + (Smem)[16:0] * uns(Cmem)[16:0])

| Before |  |  | After |  |  |
| :--- | ---: | ---: | :--- | ---: | ---: |
| AC0 | 00 0000 | 8000 | AC0 | FF FF00 | 8000 |
| XAR3 | 00 | 1001 | XAR3 | 00 | 1000 |
| Data memory |  |  |  |  |  |
| 1001h |  | FE00 | 1001h |  | FE00 |
| XCDP | 00 | 2000 | XCDP | 00 | 2001 |
| Coeff memory |  |  |  |  |  |
| 2000h |  | 8000 | 2000h |  | 8000 |
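
The register and memory values above can be checked with a small state model. This Python sketch is a simplification under stated assumptions: the machine state is a plain dictionary, the function name `mac_uns_cdp` is hypothetical, FRCT = 0 is assumed, no rounding is applied (the example uses MAC without R), and only the low 16 bits of XAR3 and XCDP are tracked.

```python
MASK40 = (1 << 40) - 1

def mac_uns_cdp(state):
    """Model MAC *AR3-, uns(*CDP+), AC0 on a dictionary-based state."""
    mem = state["mem"]
    smem = mem[state["AR3"]] & 0xFFFF
    smem = smem - 0x10000 if smem & 0x8000 else smem   # Smem is sign extended
    cmem = mem[state["CDP"]] & 0xFFFF                  # uns(Cmem) is zero extended
    acc = state["AC0"] & MASK40
    acc = acc - (1 << 40) if acc >> 39 else acc        # signed 40-bit view
    state["AC0"] = (acc + smem * cmem) & MASK40
    state["AR3"] -= 1                                  # *AR3- : post-decrement
    state["CDP"] += 1                                  # *CDP+ : post-increment
    return state

state = {
    "AC0": 0x0000008000, "AR3": 0x1001, "CDP": 0x2000,
    "mem": {0x1001: 0xFE00, 0x2000: 0x8000},
}
mac_uns_cdp(state)
print(f"AC0={state['AC0']:010X} AR3={state['AR3']:04X} CDP={state['CDP']:04X}")
```

Here FE00h is read as -512 and 8000h as +32768, so the product is -1000000h and AC0 becomes FF FF00 8000h. Had Cmem been sign extended instead, the product would have been positive.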

## MACMZ

Multiply and Accumulate with Parallel Delay

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MACM[R]Z [T3 = ]Smem, Cmem, ACx | No | 3 | 1 | X |
| Opcode $\quad\|11010000\|$ AAAA AAAI ${ }^{\text {U\%DD }}$ ( ${ }^{\text {axmm }}$ |  |  |  |  |  |
| Operands | ACx, Cmem, Smem |  |  |  |  |
| Description | This instruction performs a multiplication and an accumulation in the D-unit MAC in parallel with the delay memory instruction. The input operands of the multiplier are the content of a memory location (Smem), sign extended to 17 bits, and the content of a data memory operand (Cmem), addressed using the coefficient addressing mode and sign extended to 17 bits. |  |  |  |  |

```
ACx = ACx + (Smem * Cmem)
:: delay(Smem)
```

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

The soft dual memory addressing mode mechanism cannot be applied to this instruction. This instruction cannot use the port(\#k16) addressing mode or be paralleled with the port() operand qualifier.

This instruction cannot be used for accesses to I/O space. Any illegal access to I/O space generates a hardware bus-error interrupt (BERRINT) to be handled by the CPU.
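
The combination of a multiply-accumulate with a memory delay can be sketched in Python as follows. The function name `macmz`, the dictionary-based memory, and the addresses are hypothetical; the delay is modeled as a copy of the Smem word to the next higher address, and FRCT = 0 with no rounding or saturation is assumed.

```python
MASK40 = (1 << 40) - 1

def s16(word):
    """Interpret a 16-bit pattern as a signed value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def macmz(mem, smem_addr, cmem_addr, acx):
    """Model ACx = ACx + (Smem * Cmem) :: delay(Smem); return the new ACx."""
    acx &= MASK40
    acc = acx - (1 << 40) if acx >> 39 else acx       # signed 40-bit view
    acc += s16(mem[smem_addr]) * s16(mem[cmem_addr])
    mem[smem_addr + 1] = mem[smem_addr]               # delay: copy to next higher address
    return acc & MASK40

# Hypothetical sample at 0200h and coefficient at 0300h.
mem = {0x0200: 0x0003, 0x0201: 0x0000, 0x0300: 0x0005}
ac0 = macmz(mem, 0x0200, 0x0300, 0)
print(f"AC0={ac0:010X} mem[0201h]={mem[0x0201]:04X}")
```

Folding the delay into the multiply-accumulate is what makes the instruction convenient for FIR-style delay lines: each tap is consumed and shifted along in the same step.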

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

Repeat This instruction can be repeated.

See Also
See the following other related instructions:

- AMAR::MAC (Modify Auxiliary Register Content with Parallel Multiply and Accumulate)
- MAC (Multiply and Accumulate)
- MAC::MAC (Parallel Multiply and Accumulates)
- MAC::MAS (Multiply and Accumulate with Parallel Multiply and Subtract)
- MAC::MPY (Multiply and Accumulate with Parallel Multiply)
- MACM::MOV (Multiply and Accumulate with Parallel Load Accumulator from Memory)
- MACM::MOV (Multiply and Accumulate with Parallel Store Accumulator Content to Memory)
- MAS::MAC (Multiply and Subtract with Parallel Multiply and Accumulate)
- MPY::MAC (Multiply with Parallel Multiply and Accumulate)
- MPY::MAS (Multiply with Parallel Multiply and Subtract)

## Example

| Syntax | Description |
| :--- | :--- |
| MACMZ *AR3, *CDP, AC0 | The product of the content addressed by AR3 and the content addressed by the coefficient data pointer register (CDP) is added to the content of AC0. The result is stored in AC0. The content addressed by AR3 is copied into the next higher address. |

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | No | 4 | 1 | X |
| [2] | MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx >> \#16 :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | No | 4 | 1 | X |
| [3] | MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx >> \#16 :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16 | No | 4 | 1 | X |
| [4] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [5] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx >> \#16 | No | 4 | 1 | X |
| [6] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx >> \#16 | No | 4 | 1 | X |
| [7] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [8] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 | No | 4 | 1 | X |
| [9] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 | No | 4 | 1 | X |
| [10] | MAC[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | No | 5 | 1 | X |
| [11] | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 | No | 5 | 1 | X |
| [12] | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MAC[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 | No | 5 | 1 | X |


| Description | These instructions perform two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs. |
| :--- | :--- |
| Status Bits | Affected by FRCT, M40, RDM, SATD, SMUL, SXMD |
|  | Affects ACOVx, ACOVy |
| See Also |  |

## Parallel Multiply and Accumulates

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | No | 4 | 1 | X |

## Opcode

1000 0011 $\mid$ XXXM MMYY $\mid$ YMMM 00mm $\mid$ UuDD DDg%

## Operands

ACx, ACy, Cmem, Xmem, Ymem

## Description

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle:

```
ACx = ACx + (Xmem * Cmem)
:: ACy = ACy + (Ymem * Cmem)
```

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- AMAR Xmem
- AMAR Ymem
- AMAR Cmem
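
The data flow of the two parallel operations, which share a single coefficient operand, can be sketched in Python. This is a simplified model: the function name `dual_mac` and the operand values are hypothetical, all operands are treated as signed (no uns keyword), FRCT = 0 is assumed, and rounding, overflow detection, and saturation are omitted.

```python
MASK40 = (1 << 40) - 1

def s16(word):
    """Interpret a 16-bit pattern as a signed value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def s40(acc):
    """Interpret a 40-bit pattern as a signed value."""
    acc &= MASK40
    return acc - (1 << 40) if acc >> 39 else acc

def dual_mac(acx, acy, xmem, ymem, cmem):
    """Model ACx = ACx + (Xmem * Cmem) :: ACy = ACy + (Ymem * Cmem)."""
    coeff = s16(cmem)                      # one coefficient feeds both MAC units
    new_acx = s40(acx) + s16(xmem) * coeff
    new_acy = s40(acy) + s16(ymem) * coeff
    return new_acx & MASK40, new_acy & MASK40

# Hypothetical operands: two data words multiplied by the same coefficient.
ac0, ac1 = dual_mac(0x0000000010, 0x0000000020, 0x0003, 0xFFFE, 0x0004)
print(f"AC0={ac0:010X} AC1={ac1:010X}")
```

Sharing Cmem between the two multipliers is what allows, for example, two output samples of a filter to be advanced with a single coefficient fetch.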




## Syntax Characteristics

| No. | Syntax |  | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [2] | MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx >> \#16 :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy |  | No | 4 | 1 | X |
| Opcode |  | \|1000 0011 | XXXM | MMYY ${ }^{\text {Y }}$ | 1 | \| | DDg\% |
| Operands |  | ACx, ACy, Cmem, Xmem, Ymem |  |  |  |  |
| Description |  | This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle: |  |  |  |  |

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACx(39).
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
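
The operand extension and accumulation steps listed above can be sketched in Python. This is a minimal model under stated assumptions, not TI's implementation: the helper names (`extend17`, `to_signed40`, `mac`) and the sample operand values are hypothetical, SXMD = 1 is assumed for the signed case, and rounding, overflow detection, and saturation are left out (the sum simply wraps to 40 bits).

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def extend17(word16, uns):
    """Extend a 16-bit memory word to a 17-bit signed multiplier input.

    uns=True models zero extension; uns=False models sign extension
    (SXMD = 1 is assumed here).
    """
    word16 &= 0xFFFF
    if uns or word16 < 0x8000:
        return word16
    return word16 - 0x10000


def to_signed40(acc40):
    """Interpret a 40-bit accumulator pattern as a signed integer."""
    acc40 &= MASK40
    return acc40 - (1 << 40) if acc40 & (1 << 39) else acc40


def mac(acc40, x, c, uns_x=False, uns_c=False, frct=0, shift16=False):
    """One MAC data flow: acc (optionally >> 16) + x * c, wrapped to 40 bits."""
    product = extend17(x, uns_x) * extend17(c, uns_c)
    if frct:
        product <<= 1  # FRCT = 1: multiplier output shifted left by 1 bit
    acc = to_signed40(acc40)
    if shift16:
        acc >>= 16  # >> on a negative Python int replicates the sign bit
    return (acc + product) & MASK40


# Syntax [2] shape: the first flow uses ACx >> #16, the second uses ACy as is.
# Both flows share the same coefficient word (illustrative values only).
coeff = 0x4000
acx = mac(0x0008000000, 0xFE00, coeff, uns_x=True, uns_c=True, shift16=True)
acy = mac(0x0000008000, 0xFF00, coeff, uns_x=True, uns_c=True)
print(f"ACx = {acx:010X}, ACy = {acy:010X}")
```

Keeping the accumulator as a signed Python integer until the final mask makes the sign-extending shift a plain `>>`, which is why `to_signed40` is applied before shifting.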

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(*AR3), uns(*CDP), AC0 >> \#16 <br> :: MAC uns(*AR4), uns(*CDP), AC1 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is added to the content of AC1. The result is stored in AC1. |


## Syntax Characteristics



The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator bit 39.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.
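
The interaction of M40, the optional 40 keyword, and SATD can be illustrated with a short Python sketch. The function names are hypothetical. The overflow positions (bit 31 when M40 = 0, bit 39 when M40 = 1) and the saturation values are assumptions taken from the general C55x description of M40 and SATD rather than from this section, so treat the sketch as illustrative only.

```python
MASK40 = (1 << 40) - 1


def effective_m40(m40_bit, keyword_40):
    """The optional 40 keyword sets M40 to 1 for this instruction only."""
    return 1 if keyword_40 else m40_bit


def detect_and_saturate(result, m40, satd):
    """Check an exact (unbounded) sum against the active accumulator width.

    Returns (40-bit accumulator pattern, overflow flag).
    Assumed widths: 32 bits when M40 = 0, 40 bits when M40 = 1.
    """
    width = 40 if m40 else 32
    lowest = -(1 << (width - 1))
    highest = (1 << (width - 1)) - 1
    overflow = result < lowest or result > highest
    if overflow and satd:
        # Assumed behavior: clamp to the nearest representable value.
        result = highest if result > highest else lowest
    return result & MASK40, overflow


total = 0x7FFF0000 + 0x00020000  # exceeds the signed 32-bit range
print(detect_and_saturate(total, effective_m40(0, False), satd=1))
print(detect_and_saturate(total, effective_m40(0, True), satd=1))
```

With the 40 keyword the same sum fits in the 40-bit range, so no overflow is reported and the value is stored unchanged.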

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- AMAR Xmem
- AMAR Ymem
- AMAR Cmem

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :---: | :---: |
| MAC uns(*AR3), uns(*CDP), AC0 >> \#16 <br> :: MAC uns(*AR4), uns(*CDP), AC1 >> \#16 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. |

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

| 1111 | 1101 | AAAA AAAI | 0001 | 01mm | DDDD uug% |
| :--- | :--- | :--- | :--- | :--- | :--- |

## Operands

ACx, ACy, Cmem, Smem

## Description

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs:

```
ACy = ACy + (Smem * HI(Cmem))
:: ACx = ACx + (Smem * LO(Cmem))
```

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
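
The HI/LO address pairing described for the coefficient operand (HI at EA, LO at EA+1 when EA is even and EA-1 when EA is odd) can be sketched as follows. The function name and the memory contents are hypothetical and serve only to illustrate the rule.

```python
def coefficient_addresses(ea):
    """Return (HI address, LO address) for a Cmem long-word access at EA."""
    hi_addr = ea
    # "Next address of EA": EA+1 when EA is even, EA-1 when EA is odd.
    lo_addr = ea + 1 if ea % 2 == 0 else ea - 1
    return hi_addr, lo_addr


# Illustrative coefficient memory, keyed by word address.
coeff_memory = {0x2000: 0x8000, 0x2001: 0x4000}

hi_addr, lo_addr = coefficient_addresses(0x2000)
hi_word, lo_word = coeff_memory[hi_addr], coeff_memory[lo_addr]
print(f"HI(Cmem) @ {hi_addr:04X}h = {hi_word:04X}, "
      f"LO(Cmem) @ {lo_addr:04X}h = {lo_word:04X}")
```

The even/odd rule is equivalent to toggling the least significant address bit (`ea ^ 1`), which keeps both words inside the same aligned long word.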

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MAC uns(*AR3-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |

|  | Before | After |
| :--- | :--- | :--- |
| AC0 | 00 0000 8000 | 00 3F80 8000 |
| XAR3 | 00 10FF | 00 10FE |
| Data memory 10FFh | FE00 | FE00 |
| XCDP | 00 2000 | 00 2002 |
| Coeff memory 2001h | 4000 | 4000 |
| AC1 | 00 0000 8000 | 00 7F00 8000 |
| Coeff memory 2000h | 8000 | 8000 |
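
The Before and After values in this example can be checked with a few lines of Python. This is a sketch under stated assumptions: all operands are treated as unsigned (as in the example syntax), FRCT = 0 is inferred from the results, no rounding or overflow occurs, and the function name `dual_mac_smem` is hypothetical.

```python
MASK40 = (1 << 40) - 1


def dual_mac_smem(memory, xar3, xcdp, ac1, ac0):
    """Model of: MAC uns(*AR3-), uns(HI(*CDP+)), AC1
              :: MAC uns(*AR3-), uns(LO(*CDP+)), AC0
    """
    smem = memory[xar3]          # shared by both MAC units
    hi_coeff = memory[xcdp]      # HI(Cmem) at the effective address
    lo_coeff = memory[xcdp ^ 1]  # LO(Cmem) at the paired address
    ac1 = (ac1 + smem * hi_coeff) & MASK40
    ac0 = (ac0 + smem * lo_coeff) & MASK40
    # AR3 is post-decremented by 1; CDP+ with HI/LO advances by 2.
    return ac1, ac0, xar3 - 1, xcdp + 2


memory = {0x10FF: 0xFE00, 0x2000: 0x8000, 0x2001: 0x4000}
ac1, ac0, xar3, xcdp = dual_mac_smem(memory, 0x10FF, 0x2000,
                                     ac1=0x0000008000, ac0=0x0000008000)
print(f"AC1 = {ac1:010X}, AC0 = {ac0:010X}, "
      f"XAR3 = {xar3:06X}, XCDP = {xcdp:06X}")
```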


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [5] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx >> \#16 | No | 4 | 1 | X |

## Opcode

1111 1101 | …AAAI | 00… …mm | D… uug%

## Operands

ACx, ACy, Cmem, Smem

## Description

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs:

    ACy = ACy + (Smem * HI(Cmem))
    :: ACx = (ACx >> #16) + (Smem * LO(Cmem))

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACx(39).
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
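
The 16-bit right shift with sign extension of accumulator bit 39 can be modeled as below. This is a minimal sketch; the function name is hypothetical and the input values are illustrative.

```python
MASK40 = (1 << 40) - 1


def shift_right_16_sign_extended(acc40):
    """Shift a 40-bit accumulator right by 16, replicating bit 39."""
    acc40 &= MASK40
    if acc40 & (1 << 39):
        acc40 -= 1 << 40   # reinterpret the pattern as a negative value
    return (acc40 >> 16) & MASK40


positive = shift_right_16_sign_extended(0x0008000000)
negative = shift_right_16_sign_extended(0xFF80000000)
print(f"{positive:010X} {negative:010X}")
```

For a negative accumulator the vacated upper bits are filled with ones, so the shifted value keeps its sign before the product is added.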

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MAC uns(*AR3-), uns(LO(*CDP+)), AC0 >> \#16 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |


Execution

    ACy + M40(rnd(uns(Smem)[16:0] * uns(HI(Cmem))[16:0])) -> ACy

|  | Before | After |
| :--- | :--- | :--- |
| AC0 | 00 0800 0000 | 00 3F80 0800 |
| XAR3 | 00 10FF | 00 10FE |
| Data memory 10FFh | FE00 | FE00 |
| XCDP | 00 2000 | 00 2002 |
| Coeff memory 2001h | 4000 | 4000 |
| AC1 | 00 0000 8000 | 00 7F00 8000 |
| Coeff memory 2000h | 8000 | 8000 |


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [6] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx >> \#16 | No | 4 | 1 | X |

## Opcode

## Operands

ACx, ACy, Cmem, Smem

## Description

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs:

    ACy = (ACy >> #16) + (Smem * HI(Cmem))
    :: ACx = (ACx >> #16) + (Smem * LO(Cmem))

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator bit 39.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(*AR3-), uns(HI(*CDP+)), AC1 >> \#16 <br> :: MAC uns(*AR3-), uns(LO(*CDP+)), AC0 >> \#16 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |


Execution

    (ACy >> #16) + M40(rnd(uns(Smem)[16:0] * uns(HI(Cmem))[16:0])) -> ACy
    (ACx >> #16) + M40(rnd(uns(Smem)[16:0] * uns(LO(Cmem))[16:0])) -> ACx

|  | Before | After |
| :--- | :--- | :--- |
| AC0 | 00 0800 0000 | 00 3F80 0800 |
| XAR3 | 00 10FF | 00 10FE |
| Data memory 10FFh | FE00 | FE00 |
| XCDP | 00 2000 | 00 2002 |
| Coeff memory 2001h | 4000 | 4000 |
| AC1 | 00 0800 0000 | 00 7F00 0800 |
| Coeff memory 2000h | 8000 | 8000 |
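
As a cross-check of this example, the sketch below applies the shift to both accumulators before adding the products. It assumes unsigned operands (as in the example syntax), FRCT = 0, and no rounding or overflow; the helper name `shift_right_16` is hypothetical.

```python
MASK40 = (1 << 40) - 1


def shift_right_16(acc40):
    """Signed view of a 40-bit accumulator, shifted right by 16 bits."""
    acc40 &= MASK40
    if acc40 & (1 << 39):
        acc40 -= 1 << 40
    return acc40 >> 16


memory = {0x10FF: 0xFE00, 0x2000: 0x8000, 0x2001: 0x4000}
xar3, xcdp = 0x10FF, 0x2000
smem = memory[xar3]

# ACy = (ACy >> #16) + (Smem * HI(Cmem)), with ACy = AC1
ac1 = (shift_right_16(0x0008000000) + smem * memory[xcdp]) & MASK40
# ACx = (ACx >> #16) + (Smem * LO(Cmem)), with ACx = AC0
ac0 = (shift_right_16(0x0008000000) + smem * memory[xcdp ^ 1]) & MASK40

xar3 -= 1   # *AR3- post-decrement
xcdp += 2   # CDP+ with HI/LO advances by 2
print(f"AC1 = {ac1:010X}, AC0 = {ac0:010X}")
```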

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [7] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

…AAAI | 01… 1… | …mm DD… | …ug%

## Operands

ACx, ACy, Cmem, Lmem

## Description

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs:

```
ACy = ACy + (HI(Lmem) * HI(Cmem))
:: ACx = ACx + (LO(Lmem) * LO(Cmem))
```

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(Cmem). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 <br> :: MAC uns(LO(*AR3-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. |


|  | Before | After |
| :--- | :--- | :--- |
| AC0 | 00 0000 8000 | 00 3F80 8000 |
| XAR3 | 00 10FE | 00 10FC |
| Data memory 10FFh | FE00 | FE00 |
| XCDP | 00 2000 | 00 2002 |
| Coeff memory 2001h | 4000 | 4000 |
| AC1 | 00 0000 8000 | 00 7F80 8000 |
| Data memory 10FEh | FF00 | FF00 |
| Coeff memory 2000h | 8000 | 8000 |
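
The long-word form of this example can be traced the same way. In the sketch below, both the data operand and the coefficient operand supply a HI word at the effective address and a LO word at the paired address. It assumes unsigned operands, FRCT = 0, and no rounding or overflow; `dual_mac_lmem` is a hypothetical name.

```python
MASK40 = (1 << 40) - 1


def dual_mac_lmem(memory, xar3, xcdp, ac1, ac0):
    """Model of: MAC uns(HI(*AR3-)), uns(HI(*CDP+)), AC1
              :: MAC uns(LO(*AR3-)), uns(LO(*CDP+)), AC0
    """
    hi_data, lo_data = memory[xar3], memory[xar3 ^ 1]
    hi_coeff, lo_coeff = memory[xcdp], memory[xcdp ^ 1]
    ac1 = (ac1 + hi_data * hi_coeff) & MASK40   # MAC2: HI words
    ac0 = (ac0 + lo_data * lo_coeff) & MASK40   # MAC1: LO words
    # With HI/LO, AR3- steps back by 2 and CDP+ steps forward by 2.
    return ac1, ac0, xar3 - 2, xcdp + 2


memory = {0x10FE: 0xFF00, 0x10FF: 0xFE00, 0x2000: 0x8000, 0x2001: 0x4000}
ac1, ac0, xar3, xcdp = dual_mac_lmem(memory, 0x10FE, 0x2000,
                                     ac1=0x0000008000, ac0=0x0000008000)
print(f"AC1 = {ac1:010X}, AC0 = {ac0:010X}, "
      f"XAR3 = {xar3:06X}, XCDP = {xcdp:06X}")
```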


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [8] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 | No | 4 | 1 | X |

## Opcode

| 1111 | 1101 | AAAA AAAI | 0110 | 00mm | DDDD uug% |
| :--- | :--- | :--- | :--- | :--- | :--- |

## Operands

ACx, ACy, Cmem, Lmem

## Description

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs:

    ACy = ACy + (HI(Lmem) * HI(Cmem))
    :: ACx = (ACx >> #16) + (LO(Lmem) * LO(Cmem))

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(Cmem). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACx(39).
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.


## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 <br> :: MAC uns(LO(*AR3-)), uns(LO(*CDP+)), AC0 >> \#16 | Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. |



## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [9] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 | No | 4 | 1 | X |

## Opcode

## Operands

ACx, ACy, Cmem, Lmem

## Description

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs:

```
ACy = (ACy >> #16) + (HI(Lmem) * HI(Cmem))
:: ACx = (ACx >> #16) + (LO(Lmem) * LO(Cmem))
```

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(Cmem). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator bit 39.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
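The accumulator shift in this variant can be illustrated with a small behavioral model. The following Python sketch is not TI code: the helper names `to_signed40`, `shift_right_16`, and `shifted_mac` are hypothetical, and the model assumes FRCT = 0 with no rounding, overflow detection, or saturation. It only shows how a 40-bit accumulator is shifted right by 16 bits with sign extension of bit 39 before the product is added.

```python
ACC_BITS = 40
ACC_MASK = (1 << ACC_BITS) - 1

def to_signed40(value):
    """Interpret the low 40 bits of value as a two's-complement number."""
    value &= ACC_MASK
    return value - (1 << ACC_BITS) if value >> (ACC_BITS - 1) else value

def shift_right_16(acc):
    """Arithmetic right shift by 16: bit 39 is copied into the vacated bits."""
    return (to_signed40(acc) >> 16) & ACC_MASK

def shifted_mac(acc, a, b):
    """(acc >> #16) + a * b for already-extended operands, wrapped to 40 bits.

    Assumption: FRCT = 0, no rounding, no overflow detection or saturation.
    """
    return (to_signed40(shift_right_16(acc)) + a * b) & ACC_MASK
```

For example, `shift_right_16(0xFF80000000)` yields `0xFFFFFF8000`: the negative accumulator stays negative because bit 39 fills the upper bits.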

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.



## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :--- | :--- | :---: | :---: | :---: | :---: |
| [10] | MAC[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | No | 5 (*) | 1 | X |

(*) 1 LSB is allocated to instruction slot \#2.

## Opcode

`1001 0011 | XXXM MMYY | YMMM 00mm | uuDD DDg%`

## Operands

ACx, ACy, Cmem, Xmem, Ymem

## Description

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs:

    ACy = ACy + (Ymem * HI(Cmem))
    :: ACx = ACx + (Xmem * LO(Cmem))

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the content of data memory operand HI(Cmem) which is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(Cmem) which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
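The "next address of EA" rule (EA+1 when EA is even, EA-1 when EA is odd) amounts to toggling the least significant address bit. The Python sketch below illustrates this under that reading; `paired_address` and `fetch_hi_lo` are hypothetical helper names, and memory is modeled as a plain dictionary of word addresses.

```python
def paired_address(ea):
    """Return the other word of the aligned long-word pair containing ea."""
    return ea ^ 1  # even EA -> EA + 1, odd EA -> EA - 1

def fetch_hi_lo(memory, ea):
    """Read HI(operand) at ea and LO(operand) at the paired address.

    Assumption: memory is a dict mapping word addresses to 16-bit values.
    """
    return memory[ea], memory[paired_address(ea)]
```

With coefficient words at 2000h and 2001h and EA = 2000h, `fetch_hi_lo` returns the word at 2000h as HI(Cmem) and the word at 2001h as LO(Cmem).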

- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs but the Ymem operand cannot.
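The effect of the uns keyword on a 16-bit memory operand can be sketched as follows. This is an illustrative Python model, not TI code: `extend17` is a hypothetical name, and the signed path assumes SXMD = 1 so that the operand is sign extended.

```python
def extend17(word, uns=False):
    """Extend a 16-bit memory word to the 17-bit multiplier input.

    uns=True  -> zero extension (value 0..65535)
    uns=False -> sign extension (value -32768..32767); assumes SXMD = 1
    """
    word &= 0xFFFF
    if uns or word < 0x8000:
        return word
    return word - 0x10000
```

The same bit pattern FE00h is therefore read as 65024 with uns and as -512 without it.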

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat |  |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MAC uns(*AR2-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is added to the content of AC0. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |



## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [11] | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 | No | 5 (*) | 1 | X |

(*) 1 LSB is allocated to instruction slot \#2.

## Opcode

`1001 0011 | XXXM`

## Operands

ACx, ACy, Cmem, Xmem, Ymem

## Description

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs:

```
ACy = ACy + (HI (Ymem) * HI (Cmem))
:: ACx = (ACx >> #16) + (LO(Xmem) * LO(Cmem))
```

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the contents of data memory operand HI(Cmem) which is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the contents of data memory operand LO(Cmem) which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs but the Ymem operand cannot.
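The FRCT rule above can be illustrated with a minimal Python sketch. The function name `multiply` is hypothetical, operands are taken as already extended to 17 bits, and SMUL saturation is deliberately left out of the model. The 1-bit left shift is what turns the product of two Q15 fractions into a Q31 fraction.

```python
def multiply(a, b, frct=0):
    """Multiplier output for already-extended operands.

    Assumption: SMUL handling is omitted; only the FRCT shift is modeled.
    """
    product = a * b
    return product << 1 if frct else product

# 0.5 * 0.5 in Q15 (0x4000 each) gives 0.25 in Q31 when FRCT = 1.
quarter_q31 = multiply(0x4000, 0x4000, frct=1)
```

Here `quarter_q31 / 2**31` evaluates to 0.25, whereas the unshifted product with FRCT = 0 is 0x10000000.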

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 <br> :: MAC uns(LO(*AR2-)), uns(LO(*CDP+)), AC0 >> \#16 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |



## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [12] | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MAC[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 | No | 5 (*) | 1 | X |

(*) 1 LSB is allocated to instruction slot \#2.

## Opcode

`1001 0011 | XXXM MMYY | YMMM 11mm | uuDD DDg%`

## Operands

ACx, ACy, Cmem, Xmem, Ymem

## Description

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs:

```
ACy = (ACy >> #16) + (HI (Ymem) * HI (Cmem))
:: ACx = (ACx >> #16) + (LO(Xmem) * LO(Cmem))
```

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the content of data memory operand HI(Cmem) which is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(Cmem) which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
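How M40 and SATD interact can be sketched with a small Python model. This is an illustration under stated assumptions rather than a definitive description: it assumes that overflow is detected at bit position 31 when M40 = 0 and at bit position 39 when M40 = 1, and that saturation clamps to the largest or smallest value of that width. The names `wrap40` and `detect_and_saturate` are hypothetical.

```python
def wrap40(value):
    """Keep only 40 bits and reinterpret them as a signed number."""
    value &= (1 << 40) - 1
    return value - (1 << 40) if value >> 39 else value

def detect_and_saturate(value, m40=0, satd=1):
    """Return (accumulator value, overflow flag) for a signed result.

    Assumption: overflow is checked against 32 bits when M40 = 0 and
    against 40 bits when M40 = 1; SATD = 1 clamps, SATD = 0 wraps.
    """
    bits = 40 if m40 else 32
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    overflow = value < low or value > high
    if overflow and satd:
        return (high if value > high else low), True
    return wrap40(value), overflow
```

Under these assumptions, a result of 80000000h overflows and clamps to 7FFFFFFFh when M40 = 0, but is accepted unchanged when the M40 keyword (or M40 = 1) widens the check to 40 bits.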


## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 >> \#16 <br> :: MAC uns(LO(*AR2-)), uns(LO(*CDP+)), AC0 >> \#16 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |
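The pointer updates described in the example can be modeled in a few lines of Python. This is a simplified sketch: `post_modify` is a hypothetical helper, and circular addressing and extended-address wraparound are ignored. It only captures the stated step sizes, 1 for a plain post-modification and 2 when the pointer is used with HI/LO.

```python
def post_modify(pointer, op, hi_lo=False):
    """Apply a post-increment ('+') or post-decrement ('-') to a pointer.

    Assumption: linear addressing only; the step is 2 when the operand is
    accessed with HI/LO and 1 otherwise.
    """
    step = 2 if hi_lo else 1
    if op == "+":
        return pointer + step
    if op == "-":
        return pointer - step
    raise ValueError("op must be '+' or '-'")
```

For instance, `post_modify(0x2000, "+", hi_lo=True)` gives 0x2002 for `*CDP+` used with HI/LO, and `post_modify(0x10FF, "-")` gives 0x10FE for `*AR3-`.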



## MAC::MAS

Multiply and Accumulate With Parallel Multiply and Subtract

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [2] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [3] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [4] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [5] | MAC[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | No | 5 | 1 | X |
| [6] | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MAS[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx | No | 5 | 1 | X |

## Description

These instructions perform two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |

## See Also

See the following other related instructions:

- AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
- MAC (Multiply and Accumulate)
- MAC::MAC (Parallel Multiply and Accumulates)
- MAC::MPY (Multiply and Accumulate with Parallel Multiply)
- MAS (Multiply and Subtract)
- MAS::MAC (Multiply and Subtract with Parallel Multiply and Accumulate)
- MAS::MAS (Parallel Multiply and Subtracts)
- MASM::MOV (Multiply and Subtract with Parallel Load Accumulator from Memory)
- MASM::MOV (Multiply and Subtract with Parallel Store Accumulator Content to Memory)
- MAS::MPY (Multiply and Subtract with Parallel Multiply)
- MPY::MAS (Multiply with Parallel Multiply and Subtract)

## Syntax Characteristics


The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.


Execution

`ACy + M40(rnd(uns(Smem)[16:0] * uns(HI(Cmem))[16:0])) -> ACy`

`ACx - M40(rnd(uns(Smem)[16:0] * uns(LO(Cmem))[16:0])) -> ACx`
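The two execution expressions can be mirrored by a short Python model. This is a sketch under simplifying assumptions, not TI code: the function name `mac_mas` is hypothetical, all operands are treated as unsigned (as if uns were applied), FRCT = 0, and rounding, overflow detection, and saturation are omitted. Results are wrapped to 40 bits.

```python
MASK40 = (1 << 40) - 1

def mac_mas(acy, acx, smem, hi_cmem, lo_cmem):
    """Model of: ACy + Smem*HI(Cmem) -> ACy  ::  ACx - Smem*LO(Cmem) -> ACx.

    Assumption: unsigned 16-bit operands, FRCT = 0, no rnd, no saturation.
    The shared Smem value feeds both multipliers.
    """
    new_acy = (acy + smem * hi_cmem) & MASK40
    new_acx = (acx - smem * lo_cmem) & MASK40
    return new_acy, new_acx
```

Both results depend on the same Smem value, which reflects the note that the Smem data is shared by MAC1 and MAC2.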


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

`1111 1101 | AAAA AAAI | 0010 01mm | DDDD ugg%`

## Operands

ACx, ACy, Cmem, Smem

## Description

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs:

```
ACy = (ACy >> #16) + (Smem * HI (Cmem))
:: ACx = ACx - (Smem * LO(Cmem))
```

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACy(39).
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(*AR3-), uns(HI(*CDP+)), AC1 >> \#16 <br> :: MAS uns(*AR3-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |


Execution

`(ACy >> #16) + M40(rnd(uns(Smem)[16:0] * uns(HI(Cmem))[16:0])) -> ACy`

`ACx - M40(rnd(uns(Smem)[16:0] * uns(LO(Cmem))[16:0])) -> ACx`

| Location | Before | After |
| :--- | :--- | :--- |
| AC0 | 00 0000 8000 | FF C080 8000 |
| XAR3 | 00 10FF | 00 10FE |
| Data memory 10FFh | FE00 | FE00 |
| XCDP | 00 2000 | 00 2002 |
| Coeff memory 2001h | 4000 | 4000 |
| AC1 | 00 0800 0000 | 00 7F00 0800 |
| Coeff memory 2000h | 8000 | 8000 |
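The Before and After values in this example can be reproduced with a short Python model. This is an illustrative sketch with hypothetical names (`signed40`, `mac_shift_mas`); it assumes FRCT = 0, unsigned operands as written in the example, and no rounding or saturation, which is sufficient here because neither result overflows.

```python
MASK40 = (1 << 40) - 1

def signed40(value):
    """Interpret the low 40 bits of value as a two's-complement number."""
    value &= MASK40
    return value - (1 << 40) if value >> 39 else value

def mac_shift_mas(ac1, ac0, smem, hi_cmem, lo_cmem):
    """(AC1 >> #16) + Smem*HI(Cmem) -> AC1  ::  AC0 - Smem*LO(Cmem) -> AC0."""
    new_ac1 = ((signed40(ac1) >> 16) + smem * hi_cmem) & MASK40
    new_ac0 = (signed40(ac0) - smem * lo_cmem) & MASK40
    return new_ac1, new_ac0

# Values taken from the Before column of the example.
data = {0x10FF: 0xFE00}                   # word addressed by AR3
coeff = {0x2000: 0x8000, 0x2001: 0x4000}  # HI(Cmem) at 2000h, LO(Cmem) at 2001h

ac1, ac0 = mac_shift_mas(0x0008000000, 0x0000008000,
                         data[0x10FF], coeff[0x2000], coeff[0x2001])
print(f"AC1 = {ac1:010X}, AC0 = {ac0:010X}")
# prints: AC1 = 007F000800, AC0 = FFC0808000
```

The printed values match the After column: AC1 = 00 7F00 0800 and AC0 = FF C080 8000.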


## Syntax Characteristics



## Operands

ACx, ACy, Cmem, Lmem

## Description

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs:

```
ACy = ACy + (HI (Lmem) * HI (Cmem))
:: ACx = ACx - (LO(Lmem) * LO(Cmem))
```

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(Cmem). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
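The rounding step selected by the rnd keyword can be sketched in Python. The model below rests on an assumption about RDM that is not spelled out in this section: RDM = 0 is taken to add 8000h and clear bits 15-0 (biased rounding), and RDM = 1 is taken to round to the nearest value with ties resolved so that bit 16 ends up even (unbiased rounding). The function name `round_acc` is hypothetical.

```python
def round_acc(value, rdm=0):
    """Round an accumulator value at bit 16 and clear bits 15-0.

    Assumption: rdm=0 adds 0x8000 unconditionally; rdm=1 adds 0x8000 only
    when the low 16 bits exceed 0x8000, or equal 0x8000 with bit 16 set.
    """
    low = value & 0xFFFF
    if rdm == 0:
        value += 0x8000
    elif low > 0x8000 or (low == 0x8000 and (value >> 16) & 1):
        value += 0x8000
    return value & ~0xFFFF
```

The two modes differ only on exact ties: under these assumptions, 28000h rounds to 30000h with RDM = 0 and to 20000h with RDM = 1.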

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 <br> :: MAS uns(LO(*AR3-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is subtracted from the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. |



## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [4] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

`1111 1101 | AAAA AAAI | 0110 01mm | DDDD uug%`

## Operands

ACx, ACy, Cmem, Lmem

## Description

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs:

    ACy = (ACy >> #16) + (HI(Lmem) * HI(Cmem))
    :: ACx = ACx - (LO(Lmem) * LO(Cmem))

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(Cmem). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACy(39).
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
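
The HI()/LO() address pairing described above can be sketched in a few lines of Python. This is only an illustrative model of the rule "EA+1 when EA is even, EA-1 when EA is odd"; the function name `hi_lo_addresses` is hypothetical and not part of any TI tool.

```python
def hi_lo_addresses(ea):
    """Return (HI address, LO address) for a HI()/LO() operand pair.

    HI() uses the effective address itself; LO() uses the "next"
    address: EA+1 when EA is even, EA-1 when EA is odd.
    """
    hi_addr = ea
    lo_addr = ea + 1 if ea % 2 == 0 else ea - 1
    return hi_addr, lo_addr


# Even EA: the pair is (EA, EA+1). Odd EA: the pair is (EA, EA-1).
for ea in (0x2000, 0x10FF):
    hi_addr, lo_addr = hi_lo_addresses(ea)
    print(f"EA={ea:04X}h  HI at {hi_addr:04X}h  LO at {lo_addr:04X}h")
```

In both cases the two words come from the same even-aligned long word, which is why the pair can be fetched together.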

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :---: | :---: |
| MAC uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 >> \#16 :: MAS uns(LO(*AR3-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is subtracted from the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. |



Multiply and Accumulate With Parallel Multiply and Subtract

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [5] | MAC[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | No | 5 (*) | 1 | X |

(*) 1 LSB is allocated to instruction slot \#2.

## Opcode

## Operands

ACx, ACy, Cmem, Xmem, Ymem

## Description

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs:

```
ACy = ACy + (Ymem * HI(Cmem))
:: ACx = ACx - (Xmem * LO(Cmem))
```

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the content of data memory operand HI(Cmem), which is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(Cmem), which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs, but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
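
As an illustration of the operand handling listed above, the following Python sketch models the 17-bit extension selected by uns and the 1-bit left shift applied when FRCT = 1. The helper names `extend17` and `multiply` are hypothetical, and the sketch assumes that sign extension is in effect whenever uns is not applied; it does not model SMUL, rounding, or saturation.

```python
def extend17(word, uns):
    """Extend a 16-bit memory word to a 17-bit multiplier input.

    uns=True  -> zero extension (value 0 .. 65535)
    uns=False -> sign extension (value -32768 .. 32767), assumed here
    """
    word &= 0xFFFF
    if uns or word < 0x8000:
        return word
    return word - 0x10000


def multiply(x, y, x_uns=False, y_uns=False, frct=0):
    """Multiply two extended operands; shift left by 1 bit if FRCT = 1."""
    product = extend17(x, x_uns) * extend17(y, y_uns)
    return product << 1 if frct else product


print(extend17(0xFE00, uns=True))    # 65024
print(extend17(0xFE00, uns=False))   # -512
print(hex(multiply(0x4000, 0x4000, frct=1)))  # 0x20000000
```

The last line shows why FRCT matters for fractional (Q15) data: 0.5 times 0.5 yields 0.25 in Q31 only after the extra left shift.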

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :---: | :---: |
| MAC uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MAS uns(*AR2-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |



Multiply and Accumulate With Parallel Multiply and Subtract

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [6] | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], <br> ACy >> \#16 <br> :: MAS[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx | No | 5 (*) | 1 | X |

(*) 1 LSB is allocated to instruction slot \#2.

## Opcode

1001 0100 | XXXM MMYY | YMMM 00mm | uuDD DDg\%

## Operands

ACx, ACy, Cmem, Xmem, Ymem

## Description

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs:

ACy = (ACy >> \#16) + (HI(Ymem) * HI(Cmem))
:: ACx = ACx - (LO(Xmem) * LO(Cmem))

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the content of data memory operand $\mathrm{HI}(\mathrm{Cmem})$ which is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(Cmem), which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs, but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
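
The data flow of this MAC::MAS form can be sketched as a small Python model. It is a simplified illustration, not a cycle-accurate description: it assumes unsigned operands, FRCT = 0, no rounding, and no overflow or saturation handling. The function names and the operand values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def to_signed40(value):
    """Interpret a 40-bit pattern as a signed integer (sign bit 39)."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def mac_mas(acy, acx, y_hi, x_lo, c_hi, c_lo):
    """ACy = (ACy >> 16) + y_hi * c_hi  ::  ACx = ACx - x_lo * c_lo."""
    # Python's >> on a negative int is arithmetic, which matches the
    # sign extension from ACy(39) during the right shift.
    new_acy = (to_signed40(acy) >> 16) + (y_hi & 0xFFFF) * (c_hi & 0xFFFF)
    new_acx = to_signed40(acx) - (x_lo & 0xFFFF) * (c_lo & 0xFFFF)
    return new_acy & MASK40, new_acx & MASK40


# Hypothetical values: ACy = 0x20000, ACx = 100, small operands.
acy, acx = mac_mas(0x0000020000, 100, y_hi=3, x_lo=5, c_hi=4, c_lo=6)
print(f"ACy = {acy:010X}, ACx = {acx:010X}")  # ACy = 000000000E, ACx = 0000000046
```

The two results are independent of each other, which is what allows the two D-unit MACs to compute them in the same cycle.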


## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 >> \#16 <br> :: MAS uns(LO(*AR2-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |



## MAC::MPY Multiply and Accumulate with Parallel Multiply

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | No | 4 | 1 | X |
| [2] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [3] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [4] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [5] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [6] | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 :: MPY[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx | No | 5 | 1 | X |

## Description

These instructions perform two parallel operations in one cycle: multiply and accumulate (MAC) and multiply. The operations are executed in the two D-unit MACs.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |

## See Also

See the following other related instructions:

- AMAR::MAC (Modify Auxiliary Register Content with Parallel Multiply and Accumulate)
- MAC (Multiply and Accumulate)
- MACMZ (Multiply and Accumulate with Parallel Delay)
- MAC::MAC (Parallel Multiply and Accumulates)
- MAC::MAS (Multiply and Accumulate with Parallel Multiply and Subtract)
- MACM::MOV (Multiply and Accumulate with Parallel Load Accumulator from Memory)
- MACM::MOV (Multiply and Accumulate with Parallel Store Accumulator Content to Memory)
- MAS::MPY (Multiply and Subtract with Parallel Multiply)
- MPY::MAC (Multiply with Parallel Multiply and Accumulate)
- MPY::MAS (Multiply with Parallel Multiply and Subtract)

Multiply and Accumulate With Parallel Multiply
Syntax Characteristics


The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.
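
The interaction of M40 and SATD can be illustrated with a short Python sketch. It rests on an assumption taken from the general C55x CPU description rather than from this section: overflow is detected at bit position 31 when M40 = 0 and at bit position 39 when M40 = 1, and saturation clamps to the largest or smallest value representable at that width. The function name `detect_and_saturate` is hypothetical.

```python
MASK40 = (1 << 40) - 1


def detect_and_saturate(value, m40, satd):
    """Return (40-bit accumulator pattern, overflow flag).

    value is the exact signed result before it is written back.
    Assumption: overflow is checked against a 32-bit range when
    M40 = 0 and against a 40-bit range when M40 = 1.
    """
    bits = 40 if m40 else 32
    max_val = (1 << (bits - 1)) - 1
    min_val = -(1 << (bits - 1))
    overflow = value > max_val or value < min_val
    if overflow and satd:
        value = max_val if value > 0 else min_val
    return value & MASK40, overflow


# M40 = 0, SATD = 1: a result just above 7FFF FFFFh is clamped.
acc, ovf = detect_and_saturate(0x80000000, m40=0, satd=1)
print(f"{acc:010X} overflow={ovf}")  # 007FFFFFFF overflow=True

# Same result with M40 = 1 (for example, via the 40 keyword): no overflow.
acc, ovf = detect_and_saturate(0x80000000, m40=1, satd=1)
print(f"{acc:010X} overflow={ovf}")  # 0080000000 overflow=False
```

The second call shows the practical effect of the 40 keyword: the guard bits absorb a result that would otherwise set the overflow status bit.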

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- AMAR Xmem
- AMAR Ymem
- AMAR Cmem



## Multiply and Accumulate With Parallel Multiply

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

1111 1101 | AAAA AAAI | 0000 10mm

## Operands

ACx, ACy, Cmem, Smem

## Description

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply. The operations are executed in the two D-unit MACs:

```
ACy = ACy + (Smem * HI(Cmem))
:: ACx = Smem * LO(Cmem)
```

The first operation performs a multiplication and an accumulation in the $D$-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand $\mathrm{HI}(\mathrm{Cmem})$. The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared to MAC1 and MAC2). The other data memory operand $\mathrm{HI}(\mathrm{Cmem})$ is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MPY uns(*AR3-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |

```
Execution
ACy+M40 (rnd (uns (Smem) [16:0]*uns (HI (Cmem)) [16:0])) -> ACy
M40 (rnd (uns (Smem) [16:0]*uns (LO (Cmem)) [16:0])) -> ACx
```

MAC::MPY

| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | FF | 8000 | 0000 | AC0 | 00 | 3F80 | 0000 |
| XAR3 |  | 00 | 10FF | XAR3 |  | 00 | 10FE |
| Data memory |  |  |  |  |  |  |  |
| 10FFh |  |  | FE00 | 10FFh |  |  | FE00 |
| XCDP |  | 00 | 2000 | XCDP |  | 00 | 2002 |
| Coeff memory |  |  |  |  |  |  |  |
| 2001h |  |  | 4000 | 2001h |  |  | 4000 |
| AC1 | 00 | 0000 | 8000 | AC1 | 00 | 7F00 | 8000 |
| Coeff memory |  |  |  |  |  |  |  |
| 2000h |  |  | 8000 | 2000h |  |  | 8000 |
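
The Before/After values in the table above can be reproduced with a short Python model of the execution lines. This is a sketch under stated assumptions: both operands carry uns, FRCT = 0, no rounding is requested, and no overflow occurs. The function name `mac_mpy` and the `memory` dictionary are hypothetical illustration aids.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def mac_mpy(acy, smem, cmem_hi, cmem_lo):
    """ACy = ACy + uns(Smem) * uns(HI(Cmem))  ::  ACx = uns(Smem) * uns(LO(Cmem))."""
    smem &= 0xFFFF
    new_acy = (acy + smem * (cmem_hi & 0xFFFF)) & MASK40
    new_acx = (smem * (cmem_lo & 0xFFFF)) & MASK40
    return new_acy, new_acx


# Memory contents from the table: AR3 -> 10FFh, CDP -> 2000h (even),
# so HI(Cmem) is at 2000h and LO(Cmem) is at 2001h.
memory = {0x10FF: 0xFE00, 0x2000: 0x8000, 0x2001: 0x4000}

ac1, ac0 = mac_mpy(0x0000008000, memory[0x10FF], memory[0x2000], memory[0x2001])
print(f"AC1 = {ac1:010X}")  # AC1 = 007F008000
print(f"AC0 = {ac0:010X}")  # AC0 = 003F800000
```

Note that the previous content of AC0 (FF 8000 0000) plays no role, because the MPY half overwrites its destination rather than accumulating into it.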

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [3] | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

1111 1101 | AAAA AAAI | 0010 10mm | DDDD uug\%

## Operands

ACx, ACy, Cmem, Smem

## Description

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply. The operations are executed in the two D-unit MACs:

```
ACy = (ACy >> #16) + (Smem * HI (Cmem))
:: ACx = Smem * LO(Cmem)
```

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared to MAC1 and MAC2). The other data memory operand $\mathrm{HI}(\mathrm{Cmem})$ is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACy(39).
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.


## Execution

(ACy>>\#16)+M40(rnd(uns(Smem)[16:0]*uns(HI(Cmem))[16:0])) -> ACy
M40(rnd(uns(Smem)[16:0]*uns(LO(Cmem))[16:0])) -> ACx

| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | FF | 8000 | 0000 | AC0 | 00 | 3F80 | 0000 |
| XAR3 |  | 00 | 10FF | XAR3 |  | 00 | 10FE |
| Data memory |  |  |  |  |  |  |  |
| 10FFh |  |  | FE00 | 10FFh |  |  | FE00 |
| XCDP |  | 00 | 2000 | XCDP |  | 00 | 2002 |
| Coeff memory |  |  |  |  |  |  |  |
| 2001h |  |  | 4000 | 2001h |  |  | 4000 |
| AC1 | 00 | 0800 | 0000 | AC1 | 00 | 7F00 | 0800 |
| Coeff memory |  |  |  |  |  |  |  |
| 2000h |  |  | 8000 | 2000h |  |  | 8000 |
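
The effect of the `>> #16` on the source accumulator in this example can be checked with a few lines of Python. The sketch assumes unsigned operands, FRCT = 0, no rounding, and no overflow; the helper names `shift_right16` and `mac_shift16` are hypothetical.

```python
MASK40 = (1 << 40) - 1


def shift_right16(acc):
    """Shift a 40-bit accumulator right by 16 bits, sign extending from bit 39."""
    acc &= MASK40
    signed = acc - (1 << 40) if acc & (1 << 39) else acc
    return (signed >> 16) & MASK40


def mac_shift16(acy, smem, cmem_hi):
    """ACy = (ACy >> 16) + uns(Smem) * uns(HI(Cmem))."""
    return (shift_right16(acy) + (smem & 0xFFFF) * (cmem_hi & 0xFFFF)) & MASK40


# Values from the table: AC1 = 00 0800 0000, Smem = FE00h, HI(Cmem) = 8000h.
ac1 = mac_shift16(0x0008000000, 0xFE00, 0x8000)
print(f"AC1 = {ac1:010X}")  # AC1 = 007F000800

# A negative accumulator keeps its sign through the shift.
print(f"{shift_right16(0xFF80000000):010X}")  # FFFFFF8000
```

The shifted accumulator contributes only 0800h to the low word, which is why the After value of AC1 ends in 0800 while the product FE00h * 8000h = 7F00 0000h fills the upper word.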

Multiply and Accumulate With Parallel Multiply

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

1111 1101 | AAAA AAAI | 01 10mm | DDDD uug\%

## Operands

ACx, ACy, Cmem, Lmem

## Description

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply. The operations are executed in the two D-unit MACs:

ACy = ACy + (HI(Lmem) * HI(Cmem))
:: ACx = LO(Lmem) * LO(Cmem)

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(Cmem). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 <br> :: MPY uns(LO(*AR3-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. |

MAC::MPY Multiply and Accumulate with Parallel Multiply


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [5] | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

1111 1101 | AAAA AAAI | 01 10mm | DDDD uug\%

## Operands

ACx, ACy, Cmem, Lmem

## Description

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply. The operations are executed in the two D-unit MACs:

```
ACy = (ACy >> #16) + (HI(Lmem) * HI(Cmem))
:: ACx = LO(Lmem) * LO(Cmem)
```

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand $\mathrm{HI}(\mathrm{Lmem})$ and the content of data memory operand $\mathrm{HI}(\mathrm{Cmem})$. The data memory operand $\mathrm{HI}(\mathrm{Lmem})$ is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand $\mathrm{HI}(\mathrm{Cmem})$ is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACy(39).
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 >> \#16 <br> :: MPY uns(LO(*AR3-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. |

## Execution

(ACy >> \#16) + M40(rnd(uns(HI(Lmem))[16:0] * uns(HI(Cmem))[16:0])) -> ACy

M40(rnd(uns(LO(Lmem))[16:0] * uns(LO(Cmem))[16:0])) -> ACx

| Before |  |  |  | After |  |  |  |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| AC0 | FF | 8000 | 0000 | AC0 |  |  |  |
| XAR3 |  |  |  | XAR3 |  |  |  |
| Data memory |  |  |  |  |  |  |  |
| 10FFh |  |  | FE00 | 10FFh |  |  | FE00 |
| XCDP |  | 00 | 2000 | XCDP |  | 00 | 2002 |
| Coeff memory |  |  |  |  |  |  |  |
| 2001h |  |  | 4000 | 2001h |  |  | 4000 |
| AC1 | 00 | 0800 | 0000 | AC1 | 00 | 7F80 | 0800 |
| Data memory |  |  |  |  |  |  |  |
| 10FEh |  |  | FF00 | 10FEh |  |  | FF00 |
| Coeff memory |  |  |  |  |  |  |  |
| 2000h |  |  | 8000 | 2000h |  |  | 8000 |
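
The data flow of this example can be modeled in a few lines of Python. This is only an illustrative sketch, not TI code: the helper names (`sext`, `operand17`, `pair_address`, `mac_mpy_hi_lo`) are hypothetical, memory is a plain dictionary, and rounding, overflow detection, and saturation are not modeled. The sketch also assumes that EA for Lmem is 10FEh, because the XAR3 value is not legible in the table above.

```python
MASK40 = (1 << 40) - 1

def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def operand17(word, uns):
    """Extend a 16-bit word to the 17-bit multiplier input (zero or sign extension)."""
    return word & 0xFFFF if uns else sext(word, 16)

def pair_address(ea):
    """Address of the LO word: EA+1 when EA is even, EA-1 when EA is odd."""
    return ea ^ 1

def mac_mpy_hi_lo(mem, ea_lmem, ea_cmem, acy, uns=True, frct=0):
    """Return (ACx, ACy) for MAC HI(...), HI(...), ACy >> #16 :: MPY LO(...), LO(...), ACx."""
    hi = operand17(mem[ea_lmem], uns) * operand17(mem[ea_cmem], uns)
    lo = (operand17(mem[pair_address(ea_lmem)], uns)
          * operand17(mem[pair_address(ea_cmem)], uns))
    if frct:
        hi, lo = hi << 1, lo << 1          # FRCT = 1 shifts each product left by 1
    new_acy = (sext(acy, 40) >> 16) + hi   # ACy is shifted right with sign extension
    return lo & MASK40, new_acy & MASK40

# Memory contents taken from the Before column above.
mem = {0x10FE: 0xFF00, 0x10FF: 0xFE00, 0x2000: 0x8000, 0x2001: 0x4000}
acx, acy = mac_mpy_hi_lo(mem, 0x10FE, 0x2000, acy=0x00_0800_0000)
print(f"AC1 = {acy:010X}")
```

Under these assumptions the sketch reproduces the AC1 value 00 7F80 0800 shown in the After column.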

Multiply and Accumulate With Parallel Multiply

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [6] | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], <br> ACy >> \#16 <br> :: MPY[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx | No | 5 (*) | 1 | X |

(*) 1 LSB is allocated to instruction slot \#2.


Operands ACx, ACy, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply. The operations are executed in the two D-unit MACs:

```
ACy = (ACy >> #16) + (HI (Ymem) * HI (Cmem))
:: ACx = LO(Xmem) * LO(Cmem)
```

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the content of data memory operand $\mathrm{HI}(\mathrm{Cmem})$ which is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(Cmem) which is addressed by DAGEN path $C$ with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MAC uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 >> \#16 <br> :: MPY uns(LO(*AR2-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |


## Execution

M40(rnd(uns(Xmem)[16:0] * uns(LO(Cmem))[16:0])) -> ACx

M40(rnd((ACy >> \#16) + uns(Ymem)[16:0] * uns(HI(Cmem))[16:0])) -> ACy

| Before |  |  |  | After |  |  |  |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| AC0 | FF | 8000 | 0000 | AC0 | 00 | 3F80 | 0000 |
| XAR2 |  | 00 | 10FE | XAR2 |  | 00 | 10FD |
| XAR3 |  | 00 | 20FE | XAR3 |  | 00 | 20FD |
| Data memory |  |  |  |  |  |  |  |
| 10FEh |  |  | FE00 | 10FEh |  |  | FE00 |
| XCDP |  | 00 | 2000 | XCDP |  | 00 | 2002 |
| Coeff memory |  |  |  |  |  |  |  |
| 2001h |  |  | 4000 | 2001h |  |  | 4000 |
| AC1 | 00 | 0800 | 0000 | AC1 |  | 7F80 | 8000 |
| Data memory |  |  |  |  |  |  |  |
| 20FEh |  |  | FF00 | 20FFh |  |  | FF00 |
| Coeff memory |  |  |  |  |  |  |  |
| 2000h |  |  | 8000 | 2000h |  |  | 8000 |

## MACM::MOV

Multiply and Accumulate with Parallel Load Accumulator from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MACM[R] [T3 = ]Xmem, Tx, ACx <br> :: MOV Ymem << \#16, ACy | No | 4 | 1 | X |

Opcode
Operands
Description This instruction performs two operations in parallel: multiply and accumulate (MAC) and load:

```
ACx = ACx + (Tx * Xmem)
:: ACy = Ymem << #16
```

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of data memory operand Xmem, sign extended to 17 bits.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.
- This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

The second operation loads the content of data memory operand Ymem, which has been shifted to the left by 16 bits, to accumulator ACy.

- The input operand is sign extended to 40 bits according to SXMD.



## MACM::MOV

Multiply and Accumulate with Parallel Store Accumulator Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MACM[R] [T3 = ]Xmem, Tx, ACy <br> :: MOV HI(ACx << T2), Ymem | No | 4 | 1 | X |

Opcode 1000 0111 | XXXM MMYY | YMMM SSDD | 001x ssU%

Operands ACx, ACy, Tx, Xmem, Ymem

Description This instruction performs two operations in parallel: multiply and accumulate (MAC) and store:

```
ACy = rnd(ACy + (Tx * Xmem)),
:: Ymem = HI (ACx << T2) [,T3 = Xmem]
```

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of data memory operand Xmem, sign extended to 17 bits.

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.
- This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

The second operation shifts the accumulator ACx by the content of T2 and stores ACx(31-16) to data memory operand Ymem. If the 16-bit value in T2 is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.

- The input operand is shifted in the D-unit shifter according to SXMD.
- After the shift, the high part of the accumulator, ACx(31-16), is stored to the memory location.
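
The shift-count saturation and the store of ACx(31-16) can be sketched as follows. This is an illustrative model only: `sext` and `store_hi_shifted` are hypothetical names, the accumulator is treated as a signed 40-bit value (as with SXMD = 1), and the saturate option, rounding, and the C54CM = 1 behavior are not modeled.

```python
def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def store_hi_shifted(acx, t2):
    """Return the 16-bit word written to Ymem by MOV HI(ACx << T2), Ymem."""
    shift = max(-32, min(31, sext(t2, 16)))   # saturate the shift count to -32..+31
    value = sext(acx, 40)
    # A negative count shifts right; Python's >> on a signed int keeps the sign.
    shifted = value << shift if shift >= 0 else value >> -shift
    return (shifted >> 16) & 0xFFFF           # bits 31-16 of the shifted value

print(hex(store_hi_shifted(0x00_1234_5678, 0x0004)))   # T2 = +4, left shift
print(hex(store_hi_shifted(0x00_1234_5678, 0xFFFC)))   # T2 = -4, right shift
```

A T2 value such as 8000h (-32768) is clamped to -32 before the shift, so it behaves exactly like T2 = FFE0h (-32).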


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When this instruction is executed with C54CM = 1, the 6 LSBs of T2 determine the shift quantity. The 6 LSBs of T2 define a shift quantity within -32 to +31. When the 16-bit value in T2 is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.

- If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

$$
\begin{aligned}
& \text{ACy} = \text{rnd}(\text{ACy} + (\text{Tx} * \text{Xmem})), \\
& \text{Ymem} = \text{HI}(\text{saturate}(\text{uns}(\text{ACx} \ll \text{T2})))\ [,\ \text{T3} = \text{Xmem}]
\end{aligned}
$$

- If the SST bit = 1 and the SXMD bit = 1, then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

$$
\begin{aligned}
& \text{ACy} = \text{rnd}(\text{ACy} + (\text{Tx} * \text{Xmem})), \\
& \text{Ymem} = \text{HI}(\text{saturate}(\text{ACx} \ll \text{T2}))\ [,\ \text{T3} = \text{Xmem}]
\end{aligned}
$$

Status Bits Affected by C54CM, FRCT, M40, RDM, SATD, SMUL, SST, SXMD
Affects ACOVy
Repeat This instruction can be repeated.
See Also See the following other related instructions:

- AMAR::MAC (Modify Auxiliary Register Content with Parallel Multiply and Accumulate)
- MAC (Multiply and Accumulate)
- MACMZ (Multiply and Accumulate with Parallel Delay)
- MAC::MAC (Parallel Multiply and Accumulates)
- MAC::MPY (Multiply and Accumulate with Parallel Multiply)
- MACM::MOV (Multiply and Accumulate with Parallel Load Accumulator from Memory)
- MASM::MOV (Multiply and Subtract with Parallel Store Accumulator Content to Memory)
- MPY::MAC (Multiply with Parallel Multiply and Accumulate)
- MAC::MAS (Multiply and Accumulate with Parallel Multiply and Subtract)
- MPY::MAS (Multiply with Parallel Multiply and Subtract)


## Example

| Syntax | Description |
| :--- | :--- |
| MACM *AR3, T0, AC0 | Both instructions are performed in parallel. The product of the content addressed by AR3 and the content of T0 is added to the content of AC0. The result is stored in AC0. The content of AC1 is shifted by the content of T2, and AC1(31-16) is stored at the address of AR4. |

## MANT::NEXP

Compute Mantissa and Exponent of Accumulator Content

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MANT ACx, ACy <br> :: NEXP ACx, Tx | Yes | 3 | 1 | X2 |

## Opcode

0001 000E | DDSS 1001 | xxdd xxxx

## Operands

## Description

Status Bits

Repeat
See Also See the following other related instructions:

- EXP (Compute Exponent of Accumulator Content)


## Example 1

| Syntax | Description |
| :--- | :--- |
| MANT AC0, AC1 <br> :: NEXP AC0, T1 | The exponent is computed by subtracting the number of leading bits in the content of AC0 from 8. The exponent value is a signed 2s-complement value in the -31 to 8 range and is stored in T1. The mantissa is computed by aligning the content of AC0 on a signed 32-bit representation. The mantissa value is stored in AC1. |

| Before |  |  |  | After |  |  |  |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| AC0 | 21 | 0A0A | 0A0A | AC0 | 21 | 0A0A | 0A0A |
| AC1 | FF | FFFF | F001 | AC1 | 00 | 4214 | 1414 |
| T1 |  |  | 0000 | T1 |  |  | 0007 |

## Example 2

| Syntax | Description |
| :--- | :--- |
| MANT AC0, AC1 <br> :: NEXP AC0, T1 | The exponent is computed by subtracting the number of leading bits in the content of AC0 from 8. The exponent value is a signed 2s-complement value in the -31 to 8 range and is stored in T1. The mantissa is computed by aligning the content of AC0 on a signed 32-bit representation. The mantissa value is stored in AC1. |

| Before |  |  |  | After |  |  |  |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| AC0 | 00 | E804 | 0000 | AC0 | 00 | E804 | 0000 |
| AC1 | FF | FFFF | F001 | AC1 | 00 | 7402 | 0000 |
| T1 |  |  | 0000 | T1 |  |  | 0001 |
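
The two examples can be checked with a short Python model. This is a sketch under stated assumptions rather than a description of the hardware: `mant_nexp` is a hypothetical name, the "number of leading bits" is interpreted as the count of bits below bit 39 that repeat the sign bit (an interpretation chosen because it reproduces both examples), and a zero accumulator is rejected because its handling is not described in the text above.

```python
MASK40 = (1 << 40) - 1

def mant_nexp(acx):
    """Return (mantissa, exponent) as 40-bit and 16-bit register images."""
    acx &= MASK40
    sign = acx >> 39
    value = acx - (1 << 40) if sign else acx
    if value == 0:
        raise ValueError("zero input is not covered by this sketch")
    # Count the bits below bit 39 that merely repeat the sign bit.
    leading = 0
    for bit in range(38, -1, -1):
        if (acx >> bit) & 1 != sign:
            break
        leading += 1
    exponent = 8 - leading                     # falls in the -31 to 8 range
    # Align the value so that it fits a signed 32-bit representation.
    mantissa = value >> exponent if exponent >= 0 else value << -exponent
    return mantissa & MASK40, exponent & 0xFFFF

for ac0 in (0x21_0A0A_0A0A, 0x00_E804_0000):
    mantissa, exponent = mant_nexp(ac0)
    print(f"AC1 = {mantissa:010X}, T1 = {exponent:04X}")
```

With the AC0 values of Example 1 and Example 2, the sketch yields the AC1 and T1 values listed in the After columns.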

## MAS

Multiply and Subtract

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAS[R] Tx, [ACx, ] ACy | Yes | 2 | 1 | X |
| [2] | MASM[R] [T3 = ]Smem, Cmem, ACx | No | 3 | 1 | X |
| [3] | MASM[R] [T3 = ]Smem, [ACx,] ACy | No | 3 | 1 | X |
| [4] | MASM[R] [T3 = ]Smem, Tx, [ACx,] ACy | No | 3 | 1 | X |
| [5] | MASM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], [ACx,] ACy | No | 4 | 1 | X |
| [6] | MAS[R] Smem, uns(Cmem), ACx | No | 3 | 1 | X |

Description This instruction performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are:

- ACx(32-16)
- The content of Tx, sign extended to 17 bits
- The content of a memory location (Smem), sign extended to 17 bits
- The content of a data memory operand Cmem, addressed using the coefficient addressing mode and sign extended to 17 bits
- The content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand Ymem, extended to 17 bits

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy
See Also See the following other related instructions:

- AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
- MAC (Multiply and Accumulate)
- MAS::MAC (Multiply and Subtract with Parallel Multiply and Accumulate)
- MAS::MAS (Parallel Multiply and Subtracts)
- MAS::MPY (Multiply and Subtract with Parallel Multiply)
- MASM::MOV (Multiply and Subtract with Parallel Load Accumulator from Memory)
- MASM::MOV (Multiply and Subtract with Parallel Store Accumulator Content to Memory)


## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MAS[R] Tx, [ACx,] ACy | Yes | 2 | 1 | X |

## Opcode

0101 011E | DDSS ss1%

## Operands

ACx, ACy, Tx

Description This instruction performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are ACx(32-16) and the content of Tx, sign extended to 17 bits:

ACy = ACy - (ACx * Tx)

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVy |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MASR T1, AC0, AC1 | The product of the content of AC0 and the content of T1 is subtracted from the <br> content of AC1. The result is rounded and stored in AC1. |


| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | 00 | EC00 | 0000 | AC0 | 00 | EC00 | 0000 |
| AC1 | 00 | 3400 | 0000 | AC1 | 00 | 1680 | 0000 |
| T1 |  |  | 2000 | T1 |  |  | 2000 |
| M40 |  |  | 0 | M40 |  |  | 0 |
| ACOV1 |  |  | 0 | ACOV1 |  |  | 0 |
| FRCT |  |  | 0 | FRCT |  |  | 0 |
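
The arithmetic of this example can be traced with a small Python model. It is a sketch, not TI code: `sext` and `mas` are hypothetical names, rounding assumes RDM = 0 (add 8000h, then clear bits 15-0), and overflow detection and saturation are left out.

```python
MASK40 = (1 << 40) - 1

def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def mas(acx, tx, acy, frct=0, rnd=False):
    """Model ACy = ACy - (ACx * Tx) for MAS[R] Tx, ACx, ACy."""
    a = sext(acx >> 16, 17)        # ACx(32-16) as a signed 17-bit operand
    b = sext(tx, 16)               # Tx, sign extended
    product = a * b
    if frct:
        product <<= 1              # FRCT = 1 shifts the product left by 1
    result = sext(acy, 40) - product
    if rnd:
        # Assumption: RDM = 0, so 2^15 is added and bits 15-0 are cleared.
        result = (result + 0x8000) & ~0xFFFF
    return result & MASK40

ac1 = mas(acx=0x00_EC00_0000, tx=0x2000, acy=0x00_3400_0000, rnd=True)
print(f"AC1 = {ac1:010X}")
```

Here EC00h times 2000h gives 1D80 0000h, and 3400 0000h minus that product is 1680 0000h, which agrees with the After column.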


## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | MASM[R] [T3 = ]Smem, Cmem, ACx | No | 3 | 1 | X |

## Opcode

1101 0001 | AAAA AAAI | U%DD 10mm

## Operands

Description This instruction performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of a memory location (Smem), sign extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and sign extended to 17 bits:
ACx = ACx - (Smem * Cmem)

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MASMR *AR1, *CDP, AC2 | The product of the content addressed by AR1 and the content addressed by the coefficient data pointer register (CDP) is subtracted from the content of AC2. The result is rounded and stored in AC2. |


| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC2 | 00 | EC00 | 0000 | AC2 | 00 | EC01 | 0000 |
| AR1 |  |  | 0302 | AR1 |  |  | 0302 |
| CDP |  |  | 0202 | CDP |  |  | 0202 |
| 302 |  |  | FE00 | 302 |  |  | FE00 |
| 202 |  |  | 0040 | 202 |  |  | 0040 |
| ACOV2 |  |  | 0 | ACOV2 |  |  | 1 |
| SATD |  |  | 0 | SATD |  |  | 0 |
| RDM |  |  | 0 | RDM |  |  | 0 |
| FRCT |  |  | 0 | FRCT |  |  | 0 |
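
The following Python sketch shows one way to read this example, including the overflow flag. It is an illustration under assumptions, not a definitive model: `sext` and `masm_r` are hypothetical names, M40 = 0 is assumed (the table does not list M40), rounding follows RDM = 0, and with SATD = 0 the result is left unsaturated. The order in which the hardware rounds and checks for overflow is not modeled in detail.

```python
MASK40 = (1 << 40) - 1

def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def masm_r(smem, cmem, acx, m40=0):
    """Return (ACx, ACOVx) for MASMR Smem, Cmem, ACx with FRCT = 0, RDM = 0, SATD = 0."""
    product = sext(smem, 16) * sext(cmem, 16)
    result = sext(acx, 40) - product
    result = (result + 0x8000) & ~0xFFFF       # rounding, RDM = 0 assumed
    bits = 40 if m40 else 32                   # overflow is judged on 40 or 32 bits
    overflow = not -(1 << (bits - 1)) <= result < (1 << (bits - 1))
    return result & MASK40, int(overflow)      # SATD = 0: no saturation applied

ac2, acov2 = masm_r(smem=0xFE00, cmem=0x0040, acx=0x00_EC00_0000)
print(f"AC2 = {ac2:010X}, ACOV2 = {acov2}")
```

FE00h is -512, so the product is -8000h; subtracting it and rounding gives 00 EC01 0000, which does not fit in a signed 32-bit value and therefore sets ACOV2 under the M40 = 0 assumption.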


## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [3] | MASM[R] [T3 = ]Smem, [ACx,] ACy | No | 3 | 1 | X |

Opcode

## Operands

ACx, ACy, Smem

## Description

This instruction performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are ACx(32-16) and the content of a memory location (Smem), sign extended to 17 bits:

ACy = ACy - (Smem * ACx)

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

## Example

| Syntax | Description |
| :--- | :--- |
| MASM *AR3, AC1, AC0 | The product of the content addressed by AR3 and the content of AC1 is subtracted from the content of AC0. The result is stored in AC0. |

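## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | MASM[R] [T3 = ]Smem, Tx, [ACx,] ACy | No | 3 | 1 | X |

- When an overflow is detected, the accumulator is saturated according to SATD.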

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVy |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MASM *AR3, T0, AC1, AC0 | The product of the content addressed by AR3 and the content of T0 is subtracted from the content of AC1. The result is stored in AC0. |

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [5] | MASM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], [ACx,] ACy | No | 4 | 1 | X |

## Operands

ACx, ACy, Xmem, Ymem

## Description

This instruction performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand Ymem, extended to 17 bits:

ACy = ACx - (Xmem * Ymem)
- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.
This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.


| Before |  |  |  | After |  |  |  |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| AC3 | 00 | 2300 | EC00 | AC3 | FF | B3E0 | EC00 |
| AR2 |  |  | 302 | AR2 |  |  | 303 |
| AR3 |  |  | 202 | AR3 |  |  | 203 |
| ACOV3 |  |  | 0 | ACOV3 |  |  | 0 |
| 302 |  |  | FE00 | 302 |  |  | FE00 |
| 202 |  |  | 7000 | 202 |  |  | 7000 |
| FRCT |  |  | 0 | FRCT |  |  | 0 |


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [6] | MAS[R] Smem, uns(Cmem), ACx | No | 3 | 1 | X |

Opcode 1101 0000 | AAAA AAAI | 0%DD 11mm

Operands ACx, Cmem, Smem

Description This instruction performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of a data memory location (Smem) and the content of a data memory operand (Cmem):

Note: The uns keywo… s instructio…

The data memory operand Smem is addressed by DAGEN path X by using the Smem addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand Cmem is addressed by DAGEN path C by using the coefficient addressing mode, driven on data bus BDB, and zero extended to 17 bits in the MAC1.

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

This instruction can be used in double-precision arithmetic to compute an intermediate product and subtract it from another partial result. Because it leaves one DAGEN operator (DAGEN path Y) free, a store instruction can be executed in parallel with it.

## Compatibility with C54x devices (C54CM = 1)

None.


## Execution

rnd(ACx - (Smem)[16:0] * uns(Cmem)[16:0]) -> ACx

| Before |  |  |  | After |  |  |  |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| AC0 | 00 | 0000 | 8000 | AC0 | 00 | 0100 | 8000 |
| XAR3 |  | 00 | 1001 | XAR3 |  | 00 | 1000 |
| Data memory |  |  |  |  |  |  |  |
| 1001h |  |  | FE00 | 1001h |  |  | FE00 |
| XCDP |  | 00 | 2000 | XCDP |  | 00 | 2001 |
| Coeff memory |  |  |  |  |  |  |  |
| 2000h |  |  | 8000 | 2000h |  |  | 8000 |
## MAS::MAC

## Multiply and Subtract with Parallel Multiply and Accumulate

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | No | 4 | 1 | X |
| [2] | MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16 | No | 4 | 1 | X |
| [3] | MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [4] | MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

Description These instructions perform two parallel operations in one cycle: multiply and subtract (MAS) and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

Status Bits
Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy
See Also See the following other related instructions:

- AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
- MAS (Multiply and Subtract)
- MAS::MAS (Parallel Multiply and Subtracts)
- MAS::MPY (Multiply and Subtract with Parallel Multiply)
- MASM::MOV (Multiply and Subtract with Parallel Load Accumulator from Memory)
- MASM::MOV (Multiply and Subtract with Parallel Store Accumulator Content to Memory)

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | No | 4 | 1 | X |

Opcode

Operands ACx, ACy, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: multiply and subtract (MAS) and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs:

ACx = ACx - (Xmem * Cmem) <br> :: ACy = ACy + (Ymem * Cmem)

The first operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.
- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

■ AMAR Xmem

- AMAR Ymem
- AMAR Cmem
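
The data flow described above can be modeled in a few lines of Python. This is only a sketch for checking arithmetic by hand, not a description of the hardware: the function names `extend17` and `mas_mac` are hypothetical, a single `uns` flag is applied to all three operands (the syntax allows it per operand), SXMD is assumed to be 1, and rounding, overflow detection, and saturation are left out.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def extend17(word, uns=False):
    """Extend a 16-bit memory word to a 17-bit multiplier input.

    uns=True  -> zero extension (0..65535)
    uns=False -> sign extension (-32768..32767), assuming SXMD = 1
    """
    word &= 0xFFFF
    if uns or word < 0x8000:
        return word
    return word - 0x10000


def mas_mac(acx, acy, xmem, ymem, cmem, uns=False, frct=0):
    """Model MAS Xmem, Cmem, ACx :: MAC Ymem, Cmem, ACy.

    Accumulators are passed and returned as 40-bit patterns.
    """
    c = extend17(cmem, uns)
    px = extend17(xmem, uns) * c
    py = extend17(ymem, uns) * c
    if frct:  # FRCT = 1: multiplier output shifted left by 1 bit
        px <<= 1
        py <<= 1
    # 40-bit two's-complement wraparound; saturation is not modeled here
    return (acx - px) & MASK40, (acy + py) & MASK40


acx, acy = mas_mac(0x8000, 0x8000, 0xFE00, 0xFF00, 0x4000, uns=True)
print(f"ACx = {acx:010X}, ACy = {acy:010X}")
```

Both products use the same Cmem word, which is why the model extends `cmem` once and reuses it.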


| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example



Multiply and Subtract with Parallel Multiply and Accumulate

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> #16 | No | 4 | 1 | X |

## Opcode

`1000 0100 | XXXM MMYY | YMMM 00mm | UuDD DDg%`

## Operands

ACx, ACy, Cmem, Xmem, Ymem

## Description

This instruction performs two parallel operations in one cycle: multiply and subtract (MAS) and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs:

```
ACx = ACx - (Xmem * Cmem)
:: ACy = (ACy >> #16) + (Ymem * Cmem)
```

The first operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACy(39).
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- AMAR Xmem
- AMAR Ymem
- AMAR Cmem
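
The only difference from syntax [1] is the 16-bit right shift of ACy with sign extension from bit 39. A minimal Python sketch of that step follows; the helper names `to_signed40` and `mac_shift16` are hypothetical, and the product is passed in already computed so that the example isolates the shift.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit pattern as a signed integer (sign bit is bit 39)."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def mac_shift16(acy, product):
    """Model ACy = (ACy >> #16) + product on a 40-bit accumulator."""
    # Python's >> on a negative int is an arithmetic shift, so the
    # vacated upper bits are filled with copies of ACy(39).
    shifted = to_signed40(acy) >> 16
    return (shifted + product) & MASK40


print(f"{mac_shift16(0xFF80000000, 0):010X}")  # negative accumulator stays negative
```

Converting to a signed integer before shifting is what reproduces the sign extension; shifting the raw 40-bit pattern would fill the upper bits with zeros instead.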

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :---: | :---: |
| MAS uns(*AR3), uns(*CDP), AC0 <br> :: MAC uns(*AR4), uns(*CDP), AC1 >> #16 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is subtracted from the content of AC0. The result is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. |


## Syntax Characteristics



The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.


[^5]

| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | 00 | 0000 | 8000 | AC0 | 00 | 3F80 | 8000 |
| XAR3 |  | 00 | 10FF | XAR3 |  | 00 | 10FE |
| Data memory |  |  |  |  |  |  |  |
| 10FFh |  |  | FE00 | 10FFh |  |  | FE00 |
| XCDP |  | 00 | 2000 | XCDP |  | 00 | 2002 |
| Coeff memory |  |  |  |  |  |  |  |
| 2001h |  |  | 4000 | 2001h |  |  | 4000 |
| AC1 | 00 | 0000 | 8000 | AC1 | FF | 8100 | 8000 |
| Coeff memory |  |  |  |  |  |  |  |
| 2000h |  |  | 8000 | 2000h |  |  | 8000 |
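
The After values in this table can be checked with plain integer arithmetic. The sketch below assumes unsigned operands and FRCT = 0, with no rounding; those settings are inferred from the numbers in the table, not stated with it. The helper `fmt40` is a hypothetical name that prints a 40-bit accumulator in the same three-field layout as the table.

```python
MASK40 = (1 << 40) - 1


def fmt40(value):
    """Format a 40-bit accumulator as 'GG HHHH LLLL', like the table rows."""
    value &= MASK40
    return f"{value >> 32:02X} {(value >> 16) & 0xFFFF:04X} {value & 0xFFFF:04X}"


smem = 0xFE00   # word at 10FFh, shared by both MAC units
hi_c = 0x8000   # coefficient word at 2000h, HI(Cmem)
lo_c = 0x4000   # coefficient word at 2001h, LO(Cmem)

ac1 = 0x8000 - smem * hi_c   # multiply and subtract
ac0 = 0x8000 + smem * lo_c   # multiply and accumulate

print("AC1 =", fmt40(ac1))
print("AC0 =", fmt40(ac0))
```

Under these assumptions the script prints `FF 8100 8000` for AC1 and `00 3F80 8000` for AC0, matching the After column.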

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

`1111 1101 | AAAA AAAI | 0101 11mm | DDDD uug%`

## Operands

ACx, ACy, Cmem, Lmem

## Description

This instruction performs two parallel operations in one cycle: multiply and subtract (MAS) and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs:

```
ACy = ACy - (HI (Lmem) * HI (Cmem))
:: ACx = ACx + (LO(Lmem) * LO(Cmem))
```

The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(Cmem). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
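
The "next address of EA" rule (EA+1 when EA is even, EA-1 when EA is odd) amounts to toggling the least significant address bit. The following Python sketch illustrates that pairing; `hi_lo_addresses` and `fetch_long` are hypothetical helper names, and memory is modeled as a plain dictionary of 16-bit words.

```python
def hi_lo_addresses(ea):
    """Return (HI address, LO address) for a long-word access at EA.

    HI is read at EA itself; LO is read at EA+1 when EA is even and
    at EA-1 when EA is odd, which is EA with bit 0 toggled.
    """
    return ea, ea ^ 1


def fetch_long(memory, ea):
    """Fetch the (HI, LO) word pair for the long word addressed by EA."""
    hi_addr, lo_addr = hi_lo_addresses(ea)
    return memory[hi_addr], memory[lo_addr]


coeff = {0x2000: 0x8000, 0x2001: 0x4000}
print([hex(w) for w in fetch_long(coeff, 0x2000)])
```

With an odd EA the same two words are fetched in swapped roles, so the choice of EA determines which word feeds MAC2 and which feeds MAC1.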

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAS uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 <br> :: MAC uns(LO(*AR3-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. |


| Execution |
| :--- |
| ACy - M40(rnd(uns(HI(Lmem))[16:0] * uns(HI(Cmem))[16:0])) -> ACy |
| ACx + M40(rnd(uns(LO(Lmem))[16:0] * uns(LO(Cmem))[16:0])) -> ACx |

| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | 00 | 0000 | 8000 | AC0 | 00 | 3F80 | 8000 |
| XAR3 |  | 00 | 10FE | XAR3 |  | 00 | 10FC |
| Data memory |  |  |  |  |  |  |  |
| 10FFh |  |  | FE00 | 10FFh |  |  | FE00 |
| XCDP |  | 00 | 2000 | XCDP |  | 00 | 2002 |
| Coeff memory |  |  |  |  |  |  |  |
| 2001h |  |  | 4000 | 2001h |  |  | 4000 |
| AC1 | 00 | 0000 | 8000 | AC1 | FF | 8080 | 8000 |
| Data memory |  |  |  |  |  |  |  |
| 10FEh |  |  | FF00 | 10FEh |  |  | FF00 |
| Coeff memory |  |  |  |  |  |  |  |
| 2000h |  |  | 8000 | 2000h |  |  | 8000 |

## MAS::MAS

Parallel Multiply and Subtracts

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MAS[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | No | 4 | 1 | X |
| [2] | MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [3] | MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [4] | MAS[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx | No | 5 | 1 | X |

## Description

These instructions perform two parallel multiply and subtract (MAS) operations in one cycle. The operations are executed in the two D-unit MACs.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |

## See Also

See the following other related instructions:

- AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
- MAC::MAC (Parallel Multiply and Accumulates)
- MAS (Multiply and Subtract)
- MAS::MAC (Multiply and Subtract with Parallel Multiply and Accumulate)
- MAS::MPY (Multiply and Subtract with Parallel Multiply)
- MASM::MOV (Multiply and Subtract with Parallel Load Accumulator from Memory)
- MASM::MOV (Multiply and Subtract with Parallel Store Accumulator Content to Memory)
- MPY::MPY (Parallel Multiplies)


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAS[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | No | 4 | 1 | X |

## Opcode

`1000 0101 | XXXM MMYY | YMMM 01mm | UuDD DDg%`

## Operands

ACx, ACy, Cmem, Xmem, Ymem

## Description

This instruction performs two parallel multiply and subtract (MAS) operations in one cycle. The operations are executed in the two D-unit MACs:

```
ACx = ACx - (Xmem * Cmem)
:: ACy = ACy - (Ymem * Cmem)
```

The first operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- AMAR Xmem
- AMAR Ymem
- AMAR Cmem
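
The overflow and saturation bullets above can be sketched as follows. The bit positions and saturation values used here are assumptions that do not appear in this section: the model takes overflow to be detected at bit 31 when M40 = 0 and at bit 39 when M40 = 1, and saturates to the largest or smallest value of the corresponding width when SATD = 1. The function name `accumulate` is hypothetical.

```python
MASK40 = (1 << 40) - 1


def accumulate(result, m40=0, satd=0):
    """Apply overflow detection and optional saturation.

    result: exact signed integer result of the add or subtract.
    Returns (40-bit accumulator pattern, overflow flag).
    Assumed widths: 32-bit range when M40 = 0, 40-bit range when M40 = 1.
    """
    bits = 40 if m40 else 32
    hi = (1 << (bits - 1)) - 1
    lo = -(1 << (bits - 1))
    overflow = not (lo <= result <= hi)
    if overflow and satd:
        result = hi if result > hi else lo
    return result & MASK40, overflow


value, acov = accumulate(0x7FFFFFFF + 1, m40=0, satd=1)
print(f"{value:010X}", acov)
```

The same exact result can overflow with M40 = 0 and fit with M40 = 1, which is what the optional 40 keyword changes for a single instruction.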

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :---: | :---: |
| MAS uns(*AR3), uns(*CDP), AC0 <br> :: MAS uns(*AR4), uns(*CDP), AC1 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is subtracted from the content of AC0. The result is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is subtracted from the content of AC1. The result is stored in AC1. |

## Syntax Characteristics



The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
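
One way to read the FRCT bullet: when both inputs are signed Q15 fractions, their integer product is in Q30 format, and the 1-bit left shift realigns it to Q31. The Python sketch below illustrates that interpretation; the Q-format reading and the helper names are assumptions for illustration, and SMUL handling is not modeled.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a signed integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def q15_to_float(word):
    """Interpret a 16-bit word as a signed Q15 fraction in [-1, 1)."""
    return to_signed16(word) / 32768.0


def fractional_product(a, b, frct=1):
    """Multiply two signed 16-bit words; shift left by 1 bit when FRCT = 1."""
    product = to_signed16(a) * to_signed16(b)
    return product << 1 if frct else product


def q31_to_float(value):
    """Interpret a signed integer as a Q31 fraction."""
    return value / 2147483648.0


p = fractional_product(0x4000, 0x4000)           # 0.5 * 0.5
print(hex(p), q31_to_float(p))
print(hex(fractional_product(0x8000, 0x8000)))   # -1 * -1 does not fit in 32 signed bits
```

The second call shows the one input pair whose shifted product, 80000000h, exceeds the largest positive signed 32-bit value; this is the kind of multiplication overflow the SMUL bullet concerns.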

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAS uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MAS uns(*AR3-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |

[^6]


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [3] | MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

`1111 1101 | AAAA AAAI | 0111 00mm | DDDD uug%`

## Operands

ACx, ACy, Cmem, Lmem

## Description

This instruction performs two parallel multiply and subtract (MAS) operations in one cycle. The operations are executed in the two D-unit MACs:

`ACy = ACy - (HI(Lmem) * HI(Cmem))` <br> `:: ACx = ACx - (LO(Lmem) * LO(Cmem))`

The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(Cmem). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
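
The rounding bullet only says that rounding follows RDM. As a rough illustration, the sketch below assumes the two behaviors usually associated with that bit: with RDM = 0, 8000h is added and bits 15-0 are cleared; with RDM = 1, ties (low field exactly 8000h) round up only when bit 16 is set. Both behaviors are assumptions not stated in this section, and `round40` is a hypothetical name.

```python
MASK40 = (1 << 40) - 1


def round40(acc, rdm=0):
    """Round a 40-bit accumulator pattern at bit 16 (assumed RDM behavior)."""
    acc &= MASK40
    low = acc & 0xFFFF
    if rdm == 0:
        acc += 0x8000                      # always add half an LSB of the high part
    elif low > 0x8000 or (low == 0x8000 and acc & 0x10000):
        acc += 0x8000                      # ties go up only when bit 16 is odd
    return acc & MASK40 & ~0xFFFF          # clear bits 15-0


for pattern in (0x8000, 0x18000, 0x8001):
    print(f"{pattern:06X}: RDM=0 -> {round40(pattern, 0):06X}, RDM=1 -> {round40(pattern, 1):06X}")
```

The two modes differ only on exact ties, which is why 8000h and 18000h are the interesting test inputs.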

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAS uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 <br> :: MAS uns(LO(*AR3-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is subtracted from the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. |

[^7]

| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | 00 | 0000 | 8000 | AC0 | FF | C080 | 8000 |
| XAR3 |  | 00 | 10FE | XAR3 |  | 00 | 10FC |
| Data memory |  |  |  |  |  |  |  |
| 10FFh |  |  | FE00 | 10FFh |  |  | FE00 |
| XCDP |  | 00 | 2000 | XCDP |  | 00 | 2002 |
| Coeff memory |  |  |  |  |  |  |  |
| 2001h |  |  | 4000 | 2001h |  |  | 4000 |
| AC1 | 00 | 0000 | 8000 | AC1 | FF | 8080 | 8000 |
| Data memory |  |  |  |  |  |  |  |
| 10FEh |  |  | FF00 | 10FEh |  |  | FF00 |
| Coeff memory |  |  |  |  |  |  |  |
| 2000h |  |  | 8000 | 2000h |  |  | 8000 |

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | MAS[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx | No | 5 (*) | 1 | X |

(*) 1 LSB is allocated to instruction slot #2.

## Opcode

## Operands

## Description

This instruction performs two parallel multiply and subtract (MAS) operations in one cycle. The operations are executed in the two D-unit MACs:

`ACy = ACy - (HI(Ymem) * HI(Cmem))` <br> `:: ACx = ACx - (LO(Xmem) * LO(Cmem))`

The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of data memory operand HI(Cmem), which is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(Cmem), which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots #1 and #2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs, but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |


## Example

| Syntax | Description |
| :--- | :--- |
| MAS uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 <br> :: MAS uns(LO(*AR2-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |


| Execution |
| :--- |
| M40(rnd(ACx - uns(Xmem)[16:0] * uns(LO(Cmem))[16:0])) -> ACx |
| M40(rnd(ACy - uns(Ymem)[16:0] * uns(HI(Cmem))[16:0])) -> ACy |

| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | 00 | 0000 | 8000 | AC0 | FF | C080 | 8000 |
| XAR2 |  | 00 | 10FE | XAR2 |  | 00 | 10FD |
| XAR3 |  | 00 | 20FE | XAR3 |  | 00 | 20FD |
| Data memory |  |  |  |  |  |  |  |
| 10FEh |  |  | FE00 | 10FEh |  |  | FE00 |
| XCDP |  | 00 | 2000 | XCDP |  | 00 | 2002 |
| Coeff memory |  |  |  |  |  |  |  |
| 2001h |  |  | 4000 | 2001h |  |  | 4000 |
| AC1 | 00 | 0000 | 8000 | AC1 | FF | 8080 | 8000 |
| Data memory |  |  |  |  |  |  |  |
| 20FEh |  |  | FF00 | 20FEh |  |  | FF00 |
| Coeff memory |  |  |  |  |  |  |  |
| 2000h |  |  | 8000 | 2000h |  |  | 8000 |

## MAS::MPY <br> Multiply and Subtract with Parallel Multiply

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | No | 4 | 1 | X |
| [2] | MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [3] | MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

Description These instructions perform two parallel operations in one cycle: multiply and subtract (MAS) and multiply. The operations are executed in the two D-unit MACs.

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy
See Also See the following other related instructions:

- AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
- MAC::MAS (Multiply and Accumulate with Parallel Multiply and Subtract)
- MAC::MPY (Multiply and Accumulate with Parallel Multiply)
- MAS (Multiply and Subtract)
- MAS::MAC (Multiply and Subtract with Parallel Multiply and Accumulate)
- MAS::MAS (Parallel Multiply and Subtracts)
- MASM::MOV (Multiply and Subtract with Parallel Load Accumulator from Memory)
- MASM::MOV (Multiply and Subtract with Parallel Store Accumulator Content to Memory)

Multiply and Subtract With Parallel Multiply
Syntax Characteristics


The first operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
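The operand extension and fractional shift described in the list above can be sketched in a few lines of Python. This is only an illustrative model, not TI code: the helper names `extend17` and `multiply` are hypothetical, and the sketch assumes that SXMD = 0 results in zero extension of operands that do not carry the uns keyword.

```python
def extend17(word, uns=False, sxmd=1):
    """Extend a 16-bit memory word to a 17-bit signed multiplier input.

    Assumption: with uns applied (or SXMD = 0) the word is zero extended,
    otherwise it is sign extended from bit 15.
    """
    word &= 0xFFFF
    if uns or sxmd == 0:
        return word                      # zero extension: 0 .. 65535
    return word - 0x10000 if word & 0x8000 else word


def multiply(x, y, frct=0):
    """Multiply two extended operands; FRCT = 1 shifts the product left by 1."""
    product = x * y
    return product << 1 if frct else product


unsigned_operand = extend17(0xFE00, uns=True)    # 65024
signed_operand = extend17(0xFE00, uns=False)     # -512
```

Rounding (RDM), multiplier overflow handling (SMUL), and accumulator saturation (SATD) are deliberately left out of this sketch.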

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- AMAR Xmem
- AMAR Ymem
- AMAR Cmem

| Status Bits | Affected by FRCT, M40, RDM, SATD, SMUL, SXMD |  |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MAS uns(*AR3), uns(*CDP), AC0 <br> :: MPY uns(*AR4), uns(*CDP), AC1 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is subtracted from the content of AC0. The result is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is stored in AC1. |

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

1111 1101 | AAAA AAAI | 0001 00mm | DDDD uug%

## Operands

ACx, ACy, Cmem, Smem

## Description

This instruction performs two parallel operations in one cycle: multiply and subtract (MAS) and multiply. The operations are executed in the two D-unit MACs:

`ACy = ACy - (Smem * HI(Cmem))` <br>
`:: ACx = Smem * LO(Cmem)`

The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand $\mathrm{HI}(\mathrm{Cmem})$. The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared to MAC1 and MAC2). The other data memory operand $\mathrm{HI}(\mathrm{Cmem})$ is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA +1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
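The HI/LO address pairing described in the two paragraphs above (HI at EA, LO at EA + 1 when EA is even and at EA - 1 when EA is odd) can be illustrated with a small sketch. The function name `hi_lo_addresses` is hypothetical and only models the address arithmetic, not the bus activity.

```python
def hi_lo_addresses(ea):
    """Return (HI address, LO address) for an effective address EA.

    HI(Cmem) is read at EA itself; LO(Cmem) is read at the other word of
    the same long-word pair: EA + 1 when EA is even, EA - 1 when EA is odd.
    """
    lo = ea + 1 if ea % 2 == 0 else ea - 1
    return ea, lo


hi_addr, lo_addr = hi_lo_addresses(0x2000)   # (0x2000, 0x2001)
```

Because the two addresses differ only in bit 0, the LO address is equivalently `ea ^ 1`.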

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |

Example

| Syntax | Description |
| :--- | :--- |
| MAS uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MPY uns(*AR3-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |

```
Execution
ACy-M40 (rnd (uns (Smem) [16:0]*uns (HI (Cmem)) [16:0])) -> ACy
M40 (rnd (uns (Smem) [16:0]*uns (LO (Cmem)) [16:0])) -> ACx
```


| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | FF | 8000 | 0000 | AC0 | 00 | 3F80 | 0000 |
| XAR3 |  | 00 | 10FF | XAR3 |  | 00 | 10FE |
| Data memory |  |  |  |  |  |  |  |
| 10FFh |  |  | FE00 | 10FFh |  |  | FE00 |
| XCDP |  | 00 | 2000 | XCDP |  | 00 | 2002 |
| Coeff memory |  |  |  |  |  |  |  |
| 2001h |  |  | 4000 | 2001h |  |  | 4000 |
| AC1 | 00 | 0000 | 8000 | AC1 | FF | 8100 | 8000 |
| Coeff memory |  |  |  |  |  |  |  |
| 2000h |  |  | 8000 | 2000h |  |  | 8000 |
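The register values in the table above can be reproduced with a short Python model of the two data flows. This is a sketch under stated assumptions, not a definitive implementation: it assumes FRCT = 0, no rounding, and no overflow or saturation, none of which the table states explicitly. The names `mas_mpy_uns`, `to_signed40`, and `MASK40` are hypothetical.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit accumulator pattern as a signed integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def mas_mpy_uns(acy, smem, c_hi, c_lo):
    """Model MAS uns(Smem), uns(HI(Cmem)), ACy :: MPY uns(Smem), uns(LO(Cmem)), ACx.

    All three memory words are treated as unsigned (uns keyword applied).
    Returns (new ACy, new ACx) as 40-bit patterns.
    """
    smem, c_hi, c_lo = smem & 0xFFFF, c_hi & 0xFFFF, c_lo & 0xFFFF
    new_acy = (to_signed40(acy) - smem * c_hi) & MASK40   # multiply and subtract
    new_acx = (smem * c_lo) & MASK40                      # plain multiply
    return new_acy, new_acx


# Values from the example: *AR3 = FE00h, HI(*CDP) = 8000h, LO(*CDP) = 4000h.
ac1, ac0 = mas_mpy_uns(0x0000008000, 0xFE00, 0x8000, 0x4000)
print(f"AC1 = {ac1:010X}, AC0 = {ac0:010X}")
```

Under these assumptions the model yields AC1 = FF 8100 8000 and AC0 = 00 3F80 0000, matching the After column.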

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [3] | MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

1111 1101 | AAAA AAAI | 0101 00mm | DDDD uug%

## Operands

ACx, ACy, Cmem, Lmem

## Description

This instruction performs two parallel operations in one cycle: multiply and subtract (MAS) and multiply. The operations are executed in the two D-unit MACs:

```
ACy = ACy - (HI(Lmem) * HI(Cmem))
:: ACx = LO(Lmem) * LO(Cmem)
```

The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand $\mathrm{HI}(\mathrm{Lmem})$ and the content of data memory operand $\mathrm{HI}(\mathrm{Cmem})$. The data memory operand $\mathrm{HI}(\mathrm{Lmem})$ is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand $\mathrm{HI}(\mathrm{Cmem})$ is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of $E A$ ( $E A+1$ when $E A$ is even, $E A-1$ when $E A$ is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of $E A$ ( $E A+1$ when $E A$ is even, $E A-1$ when $E A$ is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example


| Syntax | Description |
| :--- | :--- |
| MAS uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 <br> :: MPY uns(LO(*AR3-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. |


Execution

`ACy - M40(rnd(uns(HI(Lmem))[16:0] * uns(HI(Cmem))[16:0])) -> ACy` <br>
`M40(rnd(uns(LO(Lmem))[16:0] * uns(LO(Cmem))[16:0])) -> ACx`

| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | FF | 8000 | 0000 | AC0 | 00 | 3F80 | 0000 |
| XAR3 |  | 00 | 10FE | XAR3 |  | 00 | 10FC |
| Data memory |  |  |  |  |  |  |  |
| 10FFh |  |  | FE00 | 10FFh |  |  | FE00 |
| XCDP |  | 00 | 2000 | XCDP |  | 00 | 2002 |
| Coeff memory |  |  |  |  |  |  |  |
| 2001h |  |  | 4000 | 2001h |  |  | 4000 |
| AC1 | 00 | 0000 | 8000 | AC1 | FF | 8080 | 8000 |
| Data memory |  |  |  |  |  |  |  |
| 10FEh |  |  | FF00 | 10FEh |  |  | FF00 |
| Coeff memory |  |  |  |  |  |  |  |
| 2000h |  |  | 8000 | 2000h |  |  | 8000 |

## MASM::MOV <br> Multiply and Subtract with Parallel Load Accumulator from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :--- | :--- | :---: | :---: | :---: | :---: |
| [1] | MASM[R] [T3 = ]Xmem, Tx, ACx <br> :: MOV Ymem << #16, ACy | No | 4 | 1 | X |

## Opcode

## Operands

ACx, ACy, Tx, Xmem, Ymem

## Description

This instruction performs two operations in parallel: multiply and subtract (MAS) and load:

```
ACx = ACx - (Tx * Xmem)
:: ACy = Ymem << #16
```

The first operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of data memory operand Xmem, sign extended to 17 bits.

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

The second operation loads the content of data memory operand Ymem, which has been shifted to the left by 16 bits, into the accumulator ACy.

- The input operand is sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- The input operand is shifted to the left by 16 bits according to M40.
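
The load path of the second operation can be sketched as follows. This is an illustrative model only: the function name `load_shifted` is hypothetical, the sketch assumes SXMD = 0 means zero extension, and it ignores the M40-dependent overflow detection mentioned above.

```python
MASK40 = (1 << 40) - 1


def load_shifted(ymem, sxmd=1):
    """Model ACy = Ymem << #16 and return the 40-bit accumulator pattern.

    Assumption: the 16-bit word is sign extended when SXMD = 1 and
    zero extended when SXMD = 0 before the left shift by 16 bits.
    """
    word = ymem & 0xFFFF
    if sxmd and word & 0x8000:
        word -= 0x10000                 # sign extend from bit 15
    return (word << 16) & MASK40


positive = load_shifted(0x1234)          # 0x0012340000
negative = load_shifted(0x8000, sxmd=1)  # 0xFF80000000
```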

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

See Also See the following other related instructions:

- AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
- MACM::MOV (Multiply and Accumulate with Parallel Load Accumulator from Memory)
- MAS (Multiply and Subtract)
- MAS::MAC (Multiply and Subtract with Parallel Multiply and Accumulate)
- MAS::MAS (Parallel Multiply and Subtracts)
- MAS::MPY (Multiply and Subtract with Parallel Multiply)
- MASM::MOV (Multiply and Subtract with Parallel Store Accumulator Content to Memory)


## Example

| Syntax | Description |
| :--- | :--- |
| MASM *AR3, T0, AC0 <br> :: MOV *AR4 << #16, AC1 | Both instructions are performed in parallel. The product of the content addressed by AR3 and the content of T0 is subtracted from the content of AC0. The result is stored in AC0. The content addressed by AR4, which has been shifted to the left by 16 bits, is stored in AC1. |

## MASM::MOV <br> Multiply and Subtract with Parallel Store Accumulator Content to Memory

## Syntax Characteristics



The first operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of data memory operand Xmem, sign extended to 17 bits.

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

The second operation shifts the accumulator ACx by the content of T2 and stores ACx(31-16) to data memory operand Ymem. If the 16-bit value in T2 is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.

- The input operand is shifted in the D-unit shifter according to SXMD.
- After the shift, the high part of the accumulator, ACx(31-16), is stored to the memory location.
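
The store path described above (clamp the T2 shift to -32..+31, shift, then store bits 31-16) can be sketched as follows. This is a simplified model, not the exact shifter behavior: the name `store_hi_shifted` is hypothetical, and the sketch assumes SXMD = 1 (sign-propagating right shifts) with no saturation of the shifted value.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit accumulator pattern as a signed integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def store_hi_shifted(acx, t2):
    """Return the 16-bit word stored to Ymem for MOV HI(ACx << T2), Ymem."""
    t2 &= 0xFFFF
    shift = t2 - 0x10000 if t2 & 0x8000 else t2      # T2 as a signed 16-bit value
    shift = max(-32, min(31, shift))                 # saturate the shift quantity
    value = to_signed40(acx)
    # Positive quantities shift left, negative quantities shift right.
    shifted = value << shift if shift >= 0 else value >> -shift
    return (shifted >> 16) & 0xFFFF                  # bits 31-16 after the shift


word = store_hi_shifted(0x0012345678, 4)             # 0x2345
```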


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When this instruction is executed with C54CM = 1, the 6 LSBs of T2 determine the shift quantity. The 6 LSBs of T2 define a shift quantity within -32 to +31. When the 16-bit value in T2 is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
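
One plausible reading of the C54CM = 1 shift rule in the paragraph above is sketched below. The function name `c54cm_shift_quantity` is hypothetical, and applying the modulo 16 fold to the signed 6-bit field (rather than to the full 16-bit T2 value) is an assumption of this sketch.

```python
def c54cm_shift_quantity(t2):
    """Derive the effective shift quantity from T2 when C54CM = 1."""
    shift = t2 & 0x3F              # only the 6 LSBs of T2 are used
    if shift & 0x20:
        shift -= 0x40              # interpret the field as signed: -32 .. +31
    if -32 <= shift <= -17:
        shift += 16                # modulo 16 fold into -16 .. -1
    return shift


quantity = c54cm_shift_quantity(0xFFE0)   # T2 = -32 gives -16
```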

- If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

$$
\begin{aligned}
& \mathrm{ACy} = \operatorname{rnd}(\mathrm{ACy} - (\mathrm{Tx} * \mathrm{Xmem})) \\
& \mathrm{Ymem} = \mathrm{HI}(\operatorname{saturate}(\operatorname{uns}(\mathrm{ACx} \ll \mathrm{T2})))\,[, \mathrm{T3} = \mathrm{Xmem}]
\end{aligned}
$$

- If the SST bit = 1 and the SXMD bit = 1, then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

$$
\begin{aligned}
& \mathrm{ACy} = \operatorname{rnd}(\mathrm{ACy} - (\mathrm{Tx} * \mathrm{Xmem})) \\
& \mathrm{Ymem} = \mathrm{HI}(\operatorname{saturate}(\mathrm{ACx} \ll \mathrm{T2}))\,[, \mathrm{T3} = \mathrm{Xmem}]
\end{aligned}
$$

| Status Bits | Affected by | C54CM, FRCT, M40, RDM, SATD, SMUL, SST, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy |
| Repeat | This instruction can be repeated. |  |

## See Also

See the following other related instructions:

- AMAR::MAS (Modify Auxiliary Register Content with Parallel Multiply and Subtract)
- MACM::MOV (Multiply and Accumulate with Parallel Store Accumulator Content to Memory)
- MAS (Multiply and Subtract)
- MAS::MAC (Multiply and Subtract with Parallel Multiply and Accumulate)
- MAS::MAS (Parallel Multiply and Subtracts)
- MAS::MPY (Multiply and Subtract with Parallel Multiply)
- MASM::MOV (Multiply and Subtract with Parallel Load Accumulator from Memory)

## Example

| Syntax | Description |
| :--- | :--- |
| MASM *AR3, T0, AC0 <br> :: MOV HI(AC1 << T2), *AR4 | Both instructions are performed in parallel. The product of the content addressed by AR3 and the content of T0 is subtracted from the content of AC0. The result is stored in AC0. The content of AC1 is shifted by the content of T2, and AC1(31-16) is stored at the address of AR4. |

## MAX

Compare Accumulator, Auxiliary, or Temporary Register Content Maximum

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MAX [src,] dst | Yes | 2 | 1 | X |

## Opcode

0010 111E | FSSS FDDD

## Operands

dst, src

## Description

This instruction performs a maximum comparison in the D-unit ALU or in the A-unit ALU. The contents of two accumulators, auxiliary registers, or temporary registers are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0.

- When the destination operand (dst) is an accumulator:
  - If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended to 40 bits according to SXMD.
  - The operation is performed on 40 bits in the D-unit ALU:

If M40 = 0, the src(31-0) content is compared to the dst(31-0) content. The extremum value is stored in dst. If the extremum value is the src content, the CARRY status bit is cleared to 0; otherwise, it is set to 1.

```
step1:if (src(31-0) > dst(31-0))
step2:{ CARRY = 0; dst(39-0) = src(39-0) }
    else
step3: CARRY = 1
```

If M40 = 1, the src(39-0) content is compared to the dst(39-0) content. The extremum value is stored in dst. If the extremum value is the src content, the CARRY status bit is cleared to 0; otherwise, it is set to 1.

```
step1:if (src(39-0) > dst(39-0))
step2:{ CARRY = 0; dst(39-0) = src(39-0) }
    else
step3: CARRY = 1
```

  - There is no overflow detection, overflow report, or saturation.
- When the destination operand (dst) is an auxiliary or temporary register:
  - If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
  - The operation is performed on 16 bits in the A-unit ALU:

The src(15-0) content is compared to the dst(15-0) content. The extremum value is stored in dst.

```
step1:if (src(15-0) > dst(15-0))
step2:dst = src
```

- There is no overflow detection and saturation.


## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, this instruction is executed as if the M40 status bit was locally set to 1. When the destination operand (dst) is an auxiliary or temporary register, the instruction execution is not impacted by the C54CM status bit. When the destination operand (dst) is an accumulator, this instruction always compares the source operand (src) with AC1 as follows:

- If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended to 40 bits according to SXMD
- The operation is performed on 40 bits in the D-unit ALU:

The src(39-0) content is compared to the AC1(39-0) content. The extremum value is stored in dst. If the extremum value is the src content, the CARRY status bit is cleared to 0; otherwise, it is set to 1.

```
step1:if (src(39-0) > AC1(39-0))
step2:{ CARRY = 0; dst(39-0) = src(39-0) }
    else
step3: { CARRY = 1; dst(39-0) = AC1(39-0) }
```

■ There is no overflow detection, overflow report, and saturation.

| Status Bits | Affected by | C54CM, M40, SXMD |
| :--- | :--- | :--- |
|  | Affects | CARRY |
| Repeat | This instruction can be repeated. |  |

## See Also

See the following other related instructions:

- CMP (Compare Memory with Immediate Value)
- CMP (Compare Accumulator, Auxiliary, or Temporary Register Content)
- CMPAND (Compare Accumulator, Auxiliary, or Temporary Register Content with AND)
- CMPOR (Compare Accumulator, Auxiliary, or Temporary Register Content with OR)
- MAXDIFF (Compare and Select Accumulator Content Maximum)
- MIN (Compare Accumulator, Auxiliary, or Temporary Register Content Minimum)

## Example 1

| Syntax | Description |
| :--- | :--- |
| MAX AC2, AC1 | Because the content of AC2 is less than the content of AC1, the content of AC1 remains the same and the CARRY status bit is set to 1. |


| Before |  |  |  | After |  |  |  |
| :--- | ---: | ---: | ---: | :--- | ---: | ---: | ---: |
| AC2 | 00 | 0000 | 0000 | AC2 | 00 | 0000 | 0000 |
| AC1 | 00 | 8500 | 0000 | AC1 | 00 | 8500 | 0000 |
| SXMD |  |  | 1 | SXMD |  |  | 1 |
| M40 |  |  | 0 | M40 |  |  | 0 |
| CARRY |  |  | 0 | CARRY |  |  | 1 |

## Example 2



## Example 3

| Syntax | Description |
| :--- | :--- |
| MAX AC1, T1 | The content of AC1(15-0) is greater than the content of T1, the content of AC1(15-0) is stored in T1 and the CARRY status bit is cleared to 0. |

| Before |  |  |  | After |  |  |  |
| :--- | ---: | ---: | ---: | :--- | ---: | ---: | ---: |
| AC1 | 00 | 0000 | 8020 | AC1 | 00 | 0000 | 8020 |
| T1 |  |  | 8010 | T1 |  |  | 8020 |
| CARRY |  |  | 0 | CARRY |  |  | 0 |

## MAXDIFF

Compare and Select Accumulator Content Maximum

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MAXDIFF ACx, ACy, ACz, ACw | Yes | 3 | 1 | X |
| [2] | DMAXDIFF ACx, ACy, ACz, ACw, TRNx | Yes | 3 | 1 | X |

## Description

Instruction [1] performs two paralleled 16-bit extremum selections in the D-unit ALU. Instruction [2] performs a single 40-bit extremum selection in the D-unit ALU.

| Status Bits | Affected by | C54CM, M40, SATD |
| :--- | :--- | :--- |
|  | Affects | ACOVw, CARRY |

## See Also

See the following other related instructions:

- CMP (Compare Accumulator, Auxiliary, or Temporary Register Content)
- MAX (Compare Accumulator, Auxiliary, or Temporary Register Content Maximum)
- MINDIFF (Compare and Select Accumulator Content Minimum)


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MAXDIFF ACx, ACy, ACz, ACw | Yes | 3 | 1 | X |

## Opcode

0001 000E | DDSS 1100 | SSDD nnnn

## Operands

ACw, ACx, ACy, ACz

## Description

This instruction performs two paralleled 16-bit extremum selections in the D-unit ALU. The two operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulators are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit data path).

For each datapath (high and low):

- ACx and ACy are the source accumulators.
- The differences are stored in accumulator ACw.
- The subtraction computation is equivalent to the dual 16-bit subtractions instruction.
- For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVw) is set.
  - For the operations performed in the ALU low part, overflow is detected at bit position 15.
  - For the operations performed in the ALU high part, overflow is detected at bit position 31.
- For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
- Independently on each data path, if SATD = 1 when an overflow is detected on the data path, a saturation is performed:
  - For the operations performed in the ALU low part, saturation values are 7FFFh (positive) and 8000h (negative).
  - For the operations performed in the ALU high part, saturation values are 00 7FFFh (positive) and FF 8000h (negative).
- The extremum is stored in accumulator ACz.
- The extremum is searched considering the selected bit width of the accumulators:
  - For the lower 16-bit data path, the sign bit is extracted at bit position 15.
  - For the higher 24-bit data path, the sign bit is extracted at bit position 31.

According to the extremum found, a decision bit is shifted in TRNx from the MSBs to the LSBs:

- TRN0 tracks the decision for the high part data path
- TRN1 tracks the decision for the low part data path

If the extremum value is the ACx high or low part, the decision bit is cleared to 0 ; otherwise, it is set to 1 :

```
TRN0 = TRN0 >> #1
TRN1 = TRN1 >> #1
ACw(39-16) = ACy(39-16) - ACx(39-16)
ACw(15-0) = ACy(15-0) - ACx(15-0)
if (ACx(31-16) > ACy(31-16))
    { bit(TRN0, 15) = #0 ; ACz(39-16) = ACx(39-16) }
else
    { bit(TRN0, 15) = #1 ; ACz(39-16) = ACy(39-16) }
if (ACx(15-0) > ACy(15-0))
    { bit(TRN1, 15) = #0 ; ACz(15-0) = ACx(15-0) }
else
    { bit(TRN1, 15) = #1 ; ACz(15-0) = ACy(15-0) }
```
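
The selection and decision-bit logic in the pseudocode above can be modeled in Python as a rough sketch. The names `s16` and `maxdiff_select` are hypothetical. The sketch assumes signed 16-bit comparisons on each data path (consistent with the sign-bit positions listed earlier) and leaves out the ACw difference, overflow, CARRY, and saturation handling.

```python
def s16(value):
    """Interpret the low 16 bits of value as a signed 16-bit integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def maxdiff_select(acx, acy, trn0, trn1):
    """Return (ACz, TRN0, TRN1) after the dual maximum selection."""
    trn0 = (trn0 & 0xFFFF) >> 1          # make room for the new decision bit
    trn1 = (trn1 & 0xFFFF) >> 1

    # High data path: compare bits 31-16, copy bits 39-16 of the winner.
    if s16(acx >> 16) > s16(acy >> 16):
        high = (acx >> 16) & 0xFFFFFF    # decision bit stays 0
    else:
        high = (acy >> 16) & 0xFFFFFF
        trn0 |= 0x8000                   # decision bit set to 1

    # Low data path: compare and copy bits 15-0.
    if s16(acx) > s16(acy):
        low = acx & 0xFFFF
    else:
        low = acy & 0xFFFF
        trn1 |= 0x8000

    return (high << 16) | low, trn0, trn1


# Inputs taken from the MAXDIFF AC0, AC1, AC2, AC1 example in this section.
acz, trn0, trn1 = maxdiff_select(0x1024002222, 0x9000000000, 0x1000, 0x0100)
```

With the example inputs the sketch gives ACz = 10 2400 2222, TRN0 = 0800h, and TRN1 = 0080h.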

## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, this instruction is executed as if SATD is locally cleared to 0. Overflow is only detected and reported for the computation performed in the higher 24-bit data path (overflow is detected at bit position 31).

| Status Bits | Affected by | C54CM, SATD |
| :--- | :--- | :--- |
|  | Affects | ACOVw, CARRY |
| Repeat | This instruction can be repeated. |  |

Example

| Syntax | Description |
| :---: | :---: |
| MAXDIFF AC0, AC1, AC2, AC1 | The difference is stored in AC1. The content of AC0(39-16) is subtracted from the content of AC1(39-16) and the result is stored in AC1(39-16). Since SATD = 1 and an overflow is detected, AC1(39-16) = FF 8000h (saturation). The content of AC0(15-0) is subtracted from the content of AC1(15-0) and the result is stored in AC1(15-0). The maximum is stored in AC2. The content of TRN0 and TRN1 is shifted to the right by 1 bit. Because AC0(31-16) is greater than AC1(31-16), AC0(39-16) is stored in AC2(39-16) and TRN0(15) is cleared to 0. Because AC0(15-0) is greater than AC1(15-0), AC0(15-0) is stored in AC2(15-0) and TRN1(15) is cleared to 0. |


| Before |  |  |  | After |  |  |  |
| :--- | ---: | ---: | ---: | :--- | ---: | :--- | ---: |
| AC0 | 10 | 2400 | 2222 | AC0 | 10 | 2400 | 2222 |
| AC1 | 90 | 0000 | 0000 | AC1 | FF | 8000 | DDDE |
| AC2 | 00 | 0000 | 0000 | AC2 | 10 | 2400 | 2222 |
| SATD |  |  | 1 | SATD |  |  | 1 |
| TRN0 |  |  | 1000 | TRN0 |  |  | 0800 |
| TRN1 |  |  | 0100 | TRN1 |  |  | 0080 |
| ACOV1 |  |  | 0 | ACOV1 |  |  | 1 |
| CARRY |  |  | 1 | CARRY |  |  | 0 |

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2a] | DMAXDIFF ACx, ACy, ACz, ACw, TRN0 | Yes | 3 | 1 | X |
| [2b] | DMAXDIFF ACx, ACy, ACz, ACw, TRN1 | Yes | 3 | 1 | X |

| Opcode | TRN0 | 0001 000E | DDSS 1101 | SSDD xxx0 |
| :--- | :--- | :--- | :--- | :--- |
|  | TRN1 | 0001 000E | DDSS 1101 | SSDD xxx1 |

Operands ACw, ACx, ACy, ACz, TRNx

Description This instruction performs a single 40-bit extremum selection in the D-unit ALU.

This instruction performs a maximum search.

$\square$ ACx and ACy are the two source accumulators.
$\square$ The difference between the source accumulators is stored in accumulator ACw.
■ The subtraction computation is equivalent to the subtraction instruction.
■ Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
$\square$ The extremum between the source accumulators is stored in accumulator ACz.
■ The extremum computation is similar to the compare register content maximum instruction. However, the CARRY status bit is not updated by the extremum search but by the subtraction instruction.
■ According to the extremum found, a decision bit is shifted in TRNx from the MSBs to the LSBs. If the extremum value is ACx, the decision bit is cleared to 0; otherwise, it is set to 1.

| Status Bits | Affected by | C54CM, M40, SATD |
| :--- | :--- | :--- |
|  | Affects | ACOVw, CARRY |

Example


## MIN

Compare Accumulator, Auxiliary, or Temporary Register Content Minimum

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MIN [src,] dst | Yes | 2 | 1 | X |

## Opcode

0011 000E | FSSS FDDD

## Operands

dst, src

## Description

This instruction performs a minimum comparison in the D-unit ALU or in the A-unit ALU. The contents of two accumulators, auxiliary registers, or temporary registers are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0.

- When the destination operand (dst) is an accumulator:

■ If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended to 40 bits according to SXMD.

■ The operation is performed on 40 bits in the D-unit ALU:

If M40 = 0, src(31-0) content is compared to dst(31-0) content. The extremum value is stored in dst. If the extremum value is the src content, the CARRY status bit is cleared to 0; otherwise, it is set to 1.

```
step1:if (src(31-0) < dst(31-0))
step2:{ CARRY = 0; dst(39-0) = src(39-0) }
    else
step3: CARRY = 1
```

If M40 = 1, src(39-0) content is compared to dst(39-0) content. The extremum value is stored in dst. If the extremum value is the src content, the CARRY status bit is cleared to 0; otherwise, it is set to 1.

```
step1:if (src(39-0) < dst(39-0))
step2:{ CARRY = 0; dst(39-0) = src(39-0) }
    else
step3: CARRY = 1
```

■ There is no overflow detection, overflow report, or saturation.

- When the destination operand (dst) is an auxiliary or temporary register:

■ If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.

■ The operation is performed on 16 bits in the A-unit ALU:

The src(15-0) content is compared to the dst(15-0) content. The extremum value is stored in dst.

```
step1:if (src(15-0) < dst(15-0))
step2:dst = src
```

■ There is no overflow detection or saturation.
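The accumulator case described above can be sketched in Python to show how M40 changes the width of the comparison. This is an illustrative model under assumptions: the names `min_acc` and `signed` are hypothetical, the comparison is assumed to be a signed two's-complement compare, and the auxiliary or temporary register case is not modeled.

```python
MASK40 = (1 << 40) - 1

def signed(value, bits):
    """Interpret the low `bits` bits of value as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def min_acc(src, dst, m40=0):
    """Model MIN src, dst with an accumulator destination.

    Returns (new_dst, carry).
    """
    width = 40 if m40 else 32            # M40 selects the compared field
    if signed(src, width) < signed(dst, width):
        return src & MASK40, 0           # src is the extremum; all 40 bits are copied
    return dst & MASK40, 1               # dst is kept

# Bit 32 of src is ignored when M40 = 0 but decides the result when M40 = 1.
print(min_acc(0x0100000000, 0x0000000001, m40=0))
print(min_acc(0x0100000000, 0x0000000001, m40=1))
```

In the first call only bits 31-0 are compared, so src (whose low 32 bits are zero) is selected and CARRY is cleared. In the second call the full 40-bit values are compared and dst is kept.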

## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, this instruction is executed as if the M40 status bit was locally set to 1. When the destination operand (dst) is an auxiliary or temporary register, the instruction execution is not impacted by the C54CM status bit. When the destination operand (dst) is an accumulator, this instruction always compares the source operand (src) with AC1 as follows:

■ If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended to 40 bits according to SXMD.

■ The operation is performed on 40 bits in the D-unit ALU:

The src(39-0) content is compared to AC1(39-0) content. The extremum value is stored in dst. If the extremum value is the src content, the CARRY status bit is cleared to 0; otherwise, it is set to 1.

```
step1:if (src(39-0) < AC1(39-0))
step2:{ CARRY = 0; dst(39-0) = src(39-0) }
    else
step3:{ CARRY = 1; dst(39-0) = AC1(39-0) }
```

| Status Bits | Affected by | C54CM, M40, SXMD |
| :--- | :--- | :--- |
|  | Affects | CARRY |



## MINDIFF

Compare and Select Accumulator Content Minimum

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MINDIFF ACx, ACy, ACz, ACw | Yes | 3 | 1 | X |
| [2] | DMINDIFF ACx, ACy, ACz, ACw, TRNx | Yes | 3 | 1 | X |

Description Instruction [1] performs two paralleled 16-bit extremum selections in the D-unit ALU. Instruction [2] performs a single 40-bit extremum selection in the D-unit ALU.

| Status Bits | Affected by | C54CM, M40, SATD |
| :--- | :--- | :--- |
|  | Affects | ACOVw, CARRY |

See Also See the following other related instructions:

- CMP (Compare Accumulator, Auxiliary, or Temporary Register Content)
- MAXDIFF (Compare and Select Accumulator Content Maximum)
- MIN (Compare Accumulator, Auxiliary, or Temporary Register Content Minimum)


## Syntax Characteristics



For each datapath (high and low):
$\square$ ACx and ACy are the source accumulators.
$\square$ The differences are stored in accumulator ACw.
$\square$ The subtraction computation is equivalent to the dual 16-bit subtractions instruction.
$\square$ For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOV) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.
$\square$ For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
$\square$ Independently on each data path, if SATD = 1 when an overflow is detected on the data path, a saturation is performed:
■ For the operations performed in the ALU low part, saturation values are 7FFFh (positive) and 8000h (negative).
■ For the operations performed in the ALU high part, saturation values are 00 7FFFh (positive) and FF 8000h (negative).
$\square$ The extremum is stored in accumulator ACz.
$\square$ The extremum is searched considering the selected bit width of the accumulators:
■ for the lower 16-bit data path, the sign bit is extracted at bit position 15
■ for the higher 24-bit data path, the sign bit is extracted at bit position 31
$\square$ According to the extremum found, a decision bit is shifted in TRNx from the MSBs to the LSBs:
■ TRN0 tracks the decision for the high part data path
■ TRN1 tracks the decision for the low part data path

If the extremum value is the ACx high or low part, the decision bit is cleared to 0; otherwise, it is set to 1:

```
TRN0 = TRN0 >> #1
TRN1 = TRN1 >> #1
ACw(39-16) = ACy(39-16) - ACx(39-16)
ACw(15-0) = ACy(15-0) - ACx(15-0)
if (ACx(31-16) < ACy(31-16))
    { bit(TRN0, 15) = #0 ; ACz(39-16) = ACx(39-16) }
else
    { bit(TRN0, 15) = #1 ; ACz(39-16) = ACy(39-16) }
if (ACx(15-0) < ACy(15-0))
    { bit(TRN1, 15) = #0 ; ACz(15-0) = ACx(15-0) }
else
    { bit(TRN1, 15) = #1 ; ACz(15-0) = ACy(15-0) }
```
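The full dual 16-bit operation, including the differences, saturation, ACOV and CARRY, can be sketched as follows. This is a hedged model, not a reference implementation: the names `mindiff`, `saturate16`, `s16` and `s24` are hypothetical, C54CM = 0 is assumed, the high-path subtraction is assumed to operate on the signed 24-bit values in bits 39-16, and CARRY is assumed to be the complement of the borrow out of bits 31-16. Those assumptions were chosen because they reproduce the MINDIFF example in this section.

```python
def s16(v):
    """Low 16 bits of v as a signed value."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v

def s24(v):
    """Low 24 bits of v as a signed value."""
    v &= 0xFFFFFF
    return v - 0x1000000 if v & 0x800000 else v

def saturate16(diff, satd):
    """Check diff against the signed 16-bit range; return (value, overflow)."""
    overflow = not -0x8000 <= diff <= 0x7FFF
    if overflow and satd:
        diff = 0x7FFF if diff > 0 else -0x8000
    return diff, int(overflow)

def mindiff(acx, acy, trn0, trn1, satd=1):
    """Model dual 16-bit MINDIFF; returns (acw, acz, trn0, trn1, acov, carry)."""
    xh, yh = (acx >> 16) & 0xFFFFFF, (acy >> 16) & 0xFFFFFF
    xl, yl = acx & 0xFFFF, acy & 0xFFFF
    # Differences ACy - ACx; the high path carries the 8 guard bits.
    dh, ovh = saturate16(s24(yh) - s24(xh), satd)
    dl, ovl = saturate16(s16(yl) - s16(xl), satd)
    acw = ((dh & 0xFFFFFF) << 16) | (dl & 0xFFFF)
    # Assumed: CARRY = complement of the borrow out of bits 31-16.
    carry = int((yh & 0xFFFF) >= (xh & 0xFFFF))
    trn0, trn1 = (trn0 & 0xFFFF) >> 1, (trn1 & 0xFFFF) >> 1
    if s16(xh) < s16(yh):
        zh = xh
    else:
        zh, trn0 = yh, trn0 | 0x8000
    if s16(xl) < s16(yl):
        zl = xl
    else:
        zl, trn1 = yl, trn1 | 0x8000
    return acw, (zh << 16) | zl, trn0, trn1, ovh | ovl, carry

result = mindiff(0x1024002222, 0x008000DDDE, 0x0800, 0x0040)
print(" ".join(f"{v:X}" for v in result))
```

The sources are read before any destination is written, which matters when ACw names one of the source accumulators, as in `MINDIFF AC0, AC1, AC2, AC1`.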


## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, this instruction is executed as if SATD is locally cleared to 0. Overflow is only detected and reported for the computation performed in the higher 24-bit data path (overflow is detected at bit position 31).

| Status Bits | Affected by | C54CM, SATD |
| :--- | :--- | :--- |
|  | Affects | ACOVw, CARRY |
| Repeat | This instruction can be repeated. |  |

Example

| Syntax | Description |
| :---: | :---: |
| MINDIFF AC0, AC1, AC2, AC1 | The difference is stored in AC1. The content of AC0(39-16) is subtracted from the content of AC1(39-16) and the result is stored in AC1(39-16). Since SATD = 1 and an overflow is detected, AC1(39-16) = FF 8000h (saturation). The content of AC0(15-0) is subtracted from the content of AC1(15-0) and the result is stored in AC1(15-0). The minimum is stored in AC2 (sign bit extracted at bits 31 and 15). The content of TRN0 and TRN1 is shifted to the right by 1 bit. Because AC0(31-16) is greater than or equal to AC1(31-16), AC1(39-16) is stored in AC2(39-16) and TRN0(15) is set to 1. Because AC0(15-0) is greater than or equal to AC1(15-0), AC1(15-0) is stored in AC2(15-0) and TRN1(15) is set to 1. |


| Before |  |  |  | After |  |  |  |
| :--- | ---: | ---: | ---: | :--- | ---: | ---: | ---: |
| AC0 | 10 | 2400 | 2222 | AC0 | 10 | 2400 | 2222 |
| AC1 | 00 | 8000 | DDDE | AC1 | FF | 8000 | BBBC |
| AC2 | 10 | 2400 | 2222 | AC2 | 00 | 8000 | DDDE |
| SATD |  |  | 1 | SATD |  |  | 1 |
| TRN0 |  |  | 0800 | TRN0 |  |  | 8400 |
| TRN1 |  |  | 0040 | TRN1 |  |  | 8020 |
| ACOV1 |  |  | 0 | ACOV1 |  |  | 1 |
| CARRY |  |  | 0 | CARRY |  |  | 1 |

Compare and Select Accumulator Content Minimum

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2a] | DMINDIFF ACx, ACy, ACz, ACw, TRN0 | Yes | 3 | 1 | X |
| [2b] | DMINDIFF ACx, ACy, ACz, ACw, TRN1 | Yes | 3 | 1 | X |



$\square$ ACx and ACy are the two source accumulators.
$\square$ The difference between the source accumulators is stored in accumulator ACw.
■ The subtraction computation is equivalent to the subtraction instruction.
■ Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
$\square$ The extremum between the source accumulators is stored in accumulator ACz.
■ The extremum computation is similar to the compare register content minimum instruction. However, the CARRY status bit is not updated by the extremum search but by the subtraction instruction.
■ According to the extremum found, a decision bit is shifted in TRNx from the MSBs to the LSBs. If the extremum value is ACx, the decision bit is cleared to 0; otherwise, it is set to 1.

```
If M40 = 0:
    TRNx = TRNx >> #1
    ACw(39-0) = ACy(39-0) - ACx(39-0)
    if (ACx(31-0) < ACy(31-0))
        { bit(TRNx, 15) = #0 ; ACz(39-0) = ACx(39-0) }
    else
        { bit(TRNx, 15) = #1 ; ACz(39-0) = ACy(39-0) }
If M40 = 1:
    TRNx = TRNx >> #1
    ACw(39-0) = ACy(39-0) - ACx(39-0)
    if (ACx(39-0) < ACy(39-0))
        { bit(TRNx, 15) = #0 ; ACz(39-0) = ACx(39-0) }
    else
        { bit(TRNx, 15) = #1 ; ACz(39-0) = ACy(39-0) }
```
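The pseudocode above translates almost directly into Python. The sketch below is illustrative only: `dmindiff` and `signed` are hypothetical names, the comparison is assumed to be signed, and the difference is simply wrapped to 40 bits, so overflow detection, saturation (SATD) and the CARRY update are not modeled.

```python
MASK40 = (1 << 40) - 1

def signed(value, bits):
    """Interpret the low `bits` bits of value as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def dmindiff(acx, acy, trn, m40=0):
    """Model DMINDIFF ACx, ACy, ACz, ACw, TRNx; returns (acw, acz, trn)."""
    trn = (trn & 0xFFFF) >> 1            # make room for the new decision bit
    acw = (acy - acx) & MASK40           # difference, wrapped (no saturation here)
    width = 40 if m40 else 32            # M40 selects the compared field
    if signed(acx, width) < signed(acy, width):
        acz = acx & MASK40               # ACx is the minimum, decision bit = 0
    else:
        acz = acy & MASK40
        trn |= 0x8000                    # ACy is the minimum, decision bit = 1
    return acw, acz, trn

print(dmindiff(3, 5, 0x0001))
print(dmindiff(5, 3, 0x8000))
```

Because the decision bit enters at bit 15 and older decisions move toward the LSBs, sixteen successive executions leave the most recent decision in the MSB of TRNx.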


## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, this instruction is executed as if the M40 status bit was locally set to 1. However, to ensure compatibility for overflow detection and saturation of the destination accumulator, this instruction must be executed with M40 = 0.

| Status Bits | Affected by | C54CM, M40, SATD |
| :--- | :--- | :--- |
|  | Affects | ACOVw, CARRY |
| Repeat | This instruction can be repeated. |  |

Example

| Syntax | Description |
| :--- | :--- |
| DMINDIFF AC0, AC1, AC2, AC3, TRN0 | The difference is stored in AC3. The content of AC0 is subtracted from the content of AC1 and the result is stored in AC3. The minimum is stored in AC2. The content of TRN0 is shifted to the right by 1 bit. If AC0 is less than AC1, AC0 is stored in AC2 and TRN0(15) is cleared to 0; otherwise, AC1 is stored in AC2 and TRN0(15) is set to 1. |

## mmap

Memory-Mapped Register Access Qualifier

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | mmap | No | 1 | 1 | D |

## Opcode

1001 1000

## Operands

none

## Description

This is an operand qualifier that can be paralleled with any instruction making a Smem or Lmem direct memory access (dma). This operand qualifier allows you to locally prevent the dma access from being relative to the data stack pointer (SP) or the local data page register (DP). It forces the dma access to be relative to the memory-mapped register (MMR) data page start address, 00 0000h.

This operand qualifier cannot be executed:
$\square$ as a stand-alone instruction (assembler generates an error message)
$\square$ in parallel with instructions not embedding an Smem or Lmem data memory operand
$\square$ in parallel with instructions loading or storing a byte to a register (see Load Accumulator, Auxiliary, or Temporary Register from Memory instructions [2] and [3]; Load Accumulator from Memory instructions [2] and [3]; and Store Accumulator, Auxiliary, or Temporary Register Content to Memory instructions [2] and [3])

The MMRs are mapped as 16-bit data entities between addresses 0h and 5Fh. The scratch-pad memory that is mapped between addresses 60h and 7Fh of each main data page of 64K words cannot be accessed through this mechanism.
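The address rule stated above can be captured in a small check. This is only a sketch: `mmap_address` is a hypothetical helper, and raising an exception for offsets outside 0h to 5Fh is a modeling choice made here, not a description of what the hardware or the assembler does.

```python
MMR_PAGE_START = 0x000000   # MMR data page start address stated in the description
MMR_LAST = 0x5F             # last MMR address stated in the text

def mmap_address(offset):
    """Address reached by an mmap-qualified dma access (hypothetical helper)."""
    if not 0 <= offset <= MMR_LAST:
        # 60h-7Fh is scratch-pad memory and is not reachable through mmap.
        raise ValueError(f"offset {offset:#04x} is not a memory-mapped register")
    # The access is relative to the MMR page start, not to DP or SP.
    return MMR_PAGE_START + offset

print(hex(mmap_address(0x10)))
```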

Any instruction using the mmap modifier cannot be combined with any other user-defined parallelism instruction. The following instruction is not valid:

```
MOV AR1, MMAP(@BSAC)
|| RSBT CDPLC
```

The following instruction is valid:

```
MOV AR1, MMAP(@BSAC)
```

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

## Example

| Syntax | Description |
| :--- | :--- |
| MOV AC0, T2 <br> \|\| mmap | The content of AC0(15-0) is copied into T2. |

## MOV

Load Accumulator from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MOV [rnd(]Smem << Tx[)], ACx | No | 3 | 1 | X |
| [2] | MOV low_byte(Smem) << #SHIFTW, ACx | No | 3 | 1 | X |
| [3] | MOV high_byte(Smem) << #SHIFTW, ACx | No | 3 | 1 | X |
| [4] | MOV Smem << #16, ACx | No | 2 | 1 | X |
| [5] | MOV [uns(]Smem[)], ACx | No | 3 | 1 | X |
| [6] | MOV [uns(]Smem[)] << #SHIFTW, ACx | No | 4 | 1 | X |
| [7] | MOV[40] dbl(Lmem), ACx | No | 3 | 1 | X |
| [8] | MOV Xmem, Ymem, ACx | No | 3 | 1 | X |

## Description

This instruction loads a 16-bit signed constant, K16, the content of a memory (Smem) location, the content of a data memory operand (Lmem), or the content of dual data memory operands (Xmem and Ymem) to a selected accumulator (ACx).

| Status Bits | Affected by | C54CM, M40, RDM, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

## See Also

See the following other related instructions:

- MACM::MOV (Multiply and Accumulate with Parallel Load Accumulator from Memory)
- MASM::MOV (Multiply and Subtract with Parallel Load Accumulator from Memory)
- MOV (Load Accumulator Pair from Memory)
- MOV (Load Accumulator with Immediate Value)
- MOV (Load Accumulator, Auxiliary, or Temporary Register from Memory)
- MOV (Load Accumulator, Auxiliary, or Temporary Register with Immediate Value)
- MOV (Load Auxiliary or Temporary Register Pair from Memory)
- MOV::MOV (Load Accumulator from Memory with Parallel Store Accumulator Content to Memory)

Load Accumulator from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | MOV low_byte(Smem) << #SHIFTW, ACx | No | 3 | 1 | X |

## Opcode

1110 0001 | AAAA AAAI | DDSH IFTW

## Operands

Description This instruction loads the low-byte content of a memory (Smem) location shifted by the 6-bit value, SHIFTW, to the accumulator (ACx):

```
ACx = low_byte(Smem) << #SHIFTW
```

$\square$ The content of the memory location is sign extended to 40 bits according to SXMD.
$\square$ The input operand is shifted by the 6-bit value in the D-unit shifter. The shift operation is equivalent to the signed shift instruction.
$\square$ In this instruction, Smem cannot reference a memory-mapped register (MMR). This instruction cannot access a byte within an MMR. If Smem is an MMR, the DSP sends a hardware bus-error interrupt (BERRINT) request to the CPU.
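A rough Python model of the load is shown below. It rests on several assumptions that the text does not spell out: the byte is taken to be sign extended from bit 7 when SXMD = 1, the 6-bit SHIFTW is treated as a signed value in the range -32 to 31 with negative values shifting right, and overflow detection and saturation after the shift are omitted. The name `load_low_byte_shifted` is hypothetical.

```python
MASK40 = (1 << 40) - 1

def load_low_byte_shifted(smem, shiftw, sxmd=1):
    """Model ACx = low_byte(Smem) << #SHIFTW (no overflow or saturation)."""
    if not -32 <= shiftw <= 31:
        raise ValueError("SHIFTW is assumed to be a signed 6-bit value")
    byte = smem & 0xFF                   # low byte of the 16-bit memory word
    if sxmd and byte & 0x80:
        byte -= 0x100                    # assumed: sign extension from bit 7
    # Python's >> on negative ints is arithmetic, matching a signed shift.
    value = byte << shiftw if shiftw >= 0 else byte >> -shiftw
    return value & MASK40

print(f"{load_low_byte_shifted(0x12FE, 4, sxmd=1):010X}")
print(f"{load_low_byte_shifted(0x12FE, 4, sxmd=0):010X}")
```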

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, no overflow detection, report, or saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MOV low_byte(*AR3) << #31, AC0 | The low-byte content addressed by AR3 is shifted left by 31 bits and loaded into AC0. |

Load Accumulator from Memory

## Syntax Characteristics



$\square$ The content of the memory location is sign extended to 40 bits according to SXMD.
$\square$ The input operand is shifted by the 6-bit value in the D-unit shifter. The shift operation is equivalent to the signed shift instruction.
$\square$ In this instruction, Smem cannot reference a memory-mapped register (MMR). This instruction cannot access a byte within an MMR. If Smem is an MMR, the DSP sends a hardware bus-error interrupt (BERRINT) request to the CPU.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, no overflow detection, report, or saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MOV high_byte(*AR3) << #31, AC0 | The high-byte content addressed by AR3 is shifted left by 31 bits and loaded into AC0. |

Load Accumulator from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[4]$ | MOV Smem <<\#16, ACx | No | 2 | 1 | X |

## Opcode

1011 00DD | AAAA AAAI

## Operands

Description This instruction loads the content of a memory (Smem) location shifted left by 16 bits to the accumulator (ACx):

```
ACx = Smem << #16
```

$\square$ The input operand is sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ The input operand is shifted left by 16 bits according to M40.
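The load can be sketched in Python using the values from this instruction's example (memory word 3400h at address 200h, AR3 = 0200h). The helper `load_shift16` is a hypothetical name, and the model leaves out overflow detection and saturation, which depend on M40, SATD and C54CM.

```python
MASK40 = (1 << 40) - 1

def load_shift16(smem, sxmd=1):
    """Model ACx = Smem << #16 (overflow and saturation not modeled)."""
    value = smem & 0xFFFF
    if sxmd and value & 0x8000:
        value -= 0x10000                 # sign extension according to SXMD
    return (value << 16) & MASK40

memory = {0x0200: 0x3400}
ar3 = 0x0200
ac1 = load_shift16(memory[ar3])          # MOV *AR3+ << #16, AC1
ar3 += 1                                 # post-increment of AR3
print(f"{ac1 >> 32:02X} {(ac1 >> 16) & 0xFFFF:04X} {ac1 & 0xFFFF:04X} AR3={ar3:04X}")
```

The printed accumulator value is 00 3400 0000 with AR3 = 0201h.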

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, overflow detection, report, and saturation are done after the shifting operation.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

## Example

| Syntax | Description |
| :--- | :--- |
| MOV *AR3+ << #16, AC1 | The content addressed by AR3 shifted left by 16 bits is loaded into AC1. AR3 is incremented by 1. |


| Before |  |  |  | After |  |  |  |
| :--- | ---: | ---: | ---: | :--- | ---: | ---: | ---: |
| AC1 | 00 | 0200 | FC00 | AC1 | 00 | 3400 | 0000 |
| AR3 |  |  | 0200 | AR3 |  |  | 0201 |
| 200 |  |  | 3400 | 200 |  |  | 3400 |

Load Accumulator from Memory

## Syntax Characteristics



$\square$ The memory operand is extended to 40 bits according to uns.

■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.

■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
$\square$ The load operation in the accumulator uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
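The interaction between the uns keyword and SXMD can be illustrated with a short Python sketch. `load_acc` is a hypothetical name; the model assumes that SXMD = 0 means zero extension when uns is not used.

```python
MASK40 = (1 << 40) - 1

def load_acc(smem, uns=False, sxmd=1):
    """Model MOV [uns(]Smem[)], ACx: extend a 16-bit word to 40 bits."""
    value = smem & 0xFFFF
    # uns forces zero extension; otherwise SXMD decides.
    if not uns and sxmd and value & 0x8000:
        value -= 0x10000
    return value & MASK40

print(f"{load_acc(0x8001, uns=True):010X}")
print(f"{load_acc(0x8001, uns=False, sxmd=1):010X}")
print(f"{load_acc(0x8001, uns=False, sxmd=0):010X}")
```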


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with M40 = 0, compatibility is ensured.

| Status Bits | Affected by | SXMD |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| MOV uns(*AR3), AC0 | The content addressed by AR3 is zero extended to 40 bits and loaded into AC0. |

## Syntax Characteristics



$\square$ The memory operand is extended to 40 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
$\square$ The input operand is shifted by the 6-bit value in the D-unit shifter. The shift operation is equivalent to the signed shift instruction.


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, no overflow detection, report, or saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx |

## Example

| Syntax | Description |
| :--- | :--- |
| MOV uns(*AR3) << #31, AC0 | The content addressed by AR3 is zero extended to 40 bits, shifted left by 31 bits, and loaded into AC0. |

Load Accumulator from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [8] | MOV Xmem, Ymem, ACx | No | 3 | 1 | X |

## Opcode

1000 0001 | XXXM MMYY | YMMM 10DD

## Operands

ACx, Xmem, Ymem

## Description

This instruction performs a dual 16-bit load of accumulator high and low parts:

```
LO(ACx) = Xmem
:: HI(ACx) = Ymem
```

The operation is executed in dual 16-bit mode; however, it is independent of the 40-bit D-unit ALU. The 16 lower bits of the accumulator are separated from the higher 24 bits and the 8 guard bits are attached to the higher 16-bit datapath.

$\square$ The data memory operand Xmem is loaded as a 16-bit operand to the destination accumulator (ACx) low part. According to SXMD, the data memory operand Ymem is sign extended to 24 bits and loaded to the destination accumulator (ACx) high part.
$\square$ For the load operations in higher accumulator bits, overflow detection is performed at bit position 31. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
$\square$ If SATD is 1 when an overflow is detected on the higher data path, a saturation is performed with a saturation value of 00 7FFFh.
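The dual load can be sketched as below. The interpretation of the overflow case is an assumption made for this sketch: when SXMD = 0, a Ymem value of 8000h or above is zero extended to a 24-bit value that no longer fits the signed 16-bit range, which is treated here as the overflow at bit position 31 and is why only a positive saturation value appears. `dual_load` is a hypothetical name, and C54CM = 0 is assumed.

```python
def dual_load(xmem, ymem, sxmd=1, satd=1):
    """Model MOV Xmem, Ymem, ACx; returns (acx, acov)."""
    lo = xmem & 0xFFFF                   # low part is loaded as-is
    hi = ymem & 0xFFFF
    if sxmd and hi & 0x8000:
        hi -= 0x10000                    # sign extension to 24 bits
    # Assumed overflow rule: the 24-bit high part must fit in 16 signed bits.
    acov = int(not -0x8000 <= hi <= 0x7FFF)
    if acov and satd:
        hi = 0x7FFF                      # saturation value 00 7FFFh
    return ((hi & 0xFFFFFF) << 16) | lo, acov

print(tuple(f"{v:X}" for v in dual_load(0x1234, 0x8001, sxmd=1)))
print(tuple(f"{v:X}" for v in dual_load(0x1234, 0x8001, sxmd=0)))
```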


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, this instruction is executed as if SATD was locally cleared to 0.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MOV *AR3, *AR4, AC0 | The content at the location addressed by AR4, sign extended to 24 bits, is loaded into AC0(39-16) and the content at the location addressed by AR3 is loaded into AC0(15-0). |

## MOV

Load Accumulator Pair from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MOV dbl(Lmem), pair(HI(ACx)) | No | 3 | 1 | X |
| [2] | MOV dbl(Lmem), pair(LO(ACx)) | No | 3 | 1 | X |

Description This instruction loads the content of a data memory operand (Lmem) to the selected accumulator pair, ACx and AC(x + 1).

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOV(x + 1) |

## See Also

See the following other related instructions:

- MACM::MOV (Multiply and Accumulate with Parallel Load Accumulator from Memory)
- MASM::MOV (Multiply and Subtract with Parallel Load Accumulator from Memory)
- MOV (Load Accumulator from Memory)
- MOV (Load Accumulator with Immediate Value)
- MOV (Load Accumulator, Auxiliary, or Temporary Register from Memory)
- MOV (Load Accumulator, Auxiliary, or Temporary Register with Immediate Value)
- MOV (Load Auxiliary or Temporary Register Pair from Memory)
- MOV::MOV (Load Accumulator from Memory with Parallel Store Accumulator Content to Memory)

Load Accumulator Pair from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MOV dbl(Lmem), pair(HI(ACx)) | No | 3 | 1 | X |

## Opcode

1110 1101 | AAAA AAAI | xxDD 101x

## Operands

Description

Status Bit

Repeat This instruction can be repeated.

## Example



Load Accumulator Pair from Memory

## Syntax Characteristics



## MOV

Load Accumulator with Immediate Value

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MOV K16 << #16, ACx | No | 4 | 1 | X |
| [2] | MOV K16 << #SHFT, ACx | No | 4 | 1 | X |

Description This instruction loads a 16-bit signed constant, K16, to a selected accumulator (ACx).

Status Bits Affected by C54CM, M40, SATD, SXMD

See Also See the following other related instructions:

- MACM::MOV (Multiply and Accumulate with Parallel Load Accumulator from Memory)
- MASM::MOV (Multiply and Subtract with Parallel Load Accumulator from Memory)
- MOV (Load Accumulator from Memory)
- MOV (Load Accumulator Pair from Memory)
- MOV (Load Accumulator, Auxiliary, or Temporary Register from Memory)
- MOV (Load Accumulator, Auxiliary, or Temporary Register with Immediate Value)
- MOV (Load Auxiliary or Temporary Register Pair from Memory)
- MOV::MOV (Load Accumulator from Memory with Parallel Store Accumulator Content to Memory)

Load Accumulator with Immediate Value

## Syntax Characteristics



## MOV

Load Accumulator, Auxiliary, or Temporary Register from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MOV Smem, dst | No | 2 | 1 | X |
| [2] | MOV [uns(]high_byte(Smem)[)], dst | No | 3 | 1 | X |
| [3] | MOV [uns(]low_byte(Smem)[)], dst | No | 3 | 1 | X |

Description This instruction loads the content of a memory (Smem) location to a selected destination (dst) register.

| Status Bits | Affected by | M40, SXMD |
| :--- | :--- | :--- |
|  | Affects | none |

## See Also

See the following other related instructions:

- MACM::MOV (Multiply and Accumulate with Parallel Load Accumulator from Memory)
- MASM::MOV (Multiply and Subtract with Parallel Load Accumulator from Memory)
- MOV (Load Accumulator from Memory)
- MOV (Load Accumulator Pair from Memory)
- MOV (Load Accumulator with Immediate Value)
- MOV (Load Accumulator, Auxiliary, or Temporary Register with Immediate Value)
- MOV (Load Auxiliary or Temporary Register Pair from Memory)
- MOV (Store Accumulator, Auxiliary, or Temporary Register Content to Memory)
- MOV::MOV (Load Accumulator from Memory with Parallel Store Accumulator Content to Memory)

Load Accumulator, Auxiliary, or Temporary Register from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [2] | MOV [uns(]high_byte(Smem)[)], dst | No | 3 | 1 | X |
| Opcode |  | 1111 \| AA | AA | I ${ }^{\text {F }}$ F | 000u |
| Operands | dst, Smem |  |  |  |  |
| Description | This instruction loads the high-byte content of a memory (Smem) location to the destination (dst) register: |  |  |  |  |

$\square$ When the destination register is an accumulator:
■ The memory operand is extended to 40 bits according to uns.
- If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
- If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
■ The load operation in the destination register uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
$\square$ When the destination register is an auxiliary or temporary register:
■ The memory operand is extended to 16 bits according to uns.
- If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 16 bits.
- If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 16 bits regardless of SXMD.
■ The load operation in the destination register uses a dedicated path independent of the A-unit ALU.
$\square$ In this instruction, Smem cannot reference a memory-mapped register (MMR). This instruction cannot access a byte within an MMR. If Smem is an MMR, the DSP sends a hardware bus-error interrupt (BERRINT) request to the CPU.
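The difference between the two destination kinds is easiest to see side by side. The Python sketch below is illustrative: `load_byte` and its parameters are hypothetical, and the byte is assumed to be sign extended from its bit 7. It covers both the high-byte and low-byte forms, since they differ only in which byte is selected.

```python
def load_byte(smem, high, uns, dst_is_acc, sxmd=1):
    """Model MOV [uns(]high_byte/low_byte(Smem)[)], dst."""
    byte = (smem >> 8) & 0xFF if high else smem & 0xFF
    width = 40 if dst_is_acc else 16     # accumulator vs. auxiliary/temporary register
    # Accumulators honor SXMD; auxiliary and temporary registers sign extend
    # regardless of SXMD. The uns keyword forces zero extension in both cases.
    sign_extend = (not uns) and (bool(sxmd) or not dst_is_acc)
    if sign_extend and byte & 0x80:
        byte -= 0x100
    return byte & ((1 << width) - 1)

# High byte 80h loaded into an accumulator and into a temporary register, SXMD = 0.
print(f"{load_byte(0x80FF, True, False, True, sxmd=0):X}")
print(f"{load_byte(0x80FF, True, False, False, sxmd=0):X}")
```

With SXMD = 0 the accumulator receives 80h zero extended, while the auxiliary or temporary register still receives the sign-extended value FF80h.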


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

| Status Bits | Affected by | M40, SXMD |
| :--- | :--- | :--- |
|  | Affects | none |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MOV uns(high_byte(*AR3)), AC0 | The high-byte content addressed by AR3 is zero extended to 40 bits and loaded into AC0. |

Load Accumulator, Auxiliary, or Temporary Register from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [3] | MOV [uns(]low_byte(Smem)[)], dst | No | 3 | 1 | X |

Opcode 1111 AAAA AAAI FDDD 001u

Operands dst, Smem

Description This instruction loads the low-byte content of a memory (Smem) location to the destination (dst) register:

dst = low_byte(Smem)

$\square$ When the destination register is an accumulator:

- The memory operand is extended to 40 bits according to uns.
- If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
- If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
- The load operation in the destination register uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.

$\square$ When the destination register is an auxiliary or temporary register:

- The memory operand is extended to 16 bits according to uns.
- If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 16 bits.
- If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 16 bits regardless of SXMD.
- The load operation in the destination register uses a dedicated path independent of the A-unit ALU.

$\square$ In this instruction, Smem cannot reference a memory-mapped register (MMR). This instruction cannot access a byte within an MMR. If Smem is an MMR, the DSP sends a hardware bus-error interrupt (BERRINT) request to the CPU.

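The extension rules for the byte loads can be sketched as a small Python model. This is an illustrative sketch, not TI code: the function name `load_byte` and its parameters are hypothetical, and it assumes that "sign extended according to SXMD" means sign extension happens only when SXMD = 1.

```python
def load_byte(word, high, uns, sxmd=1, to_accumulator=True):
    """Model MOV [uns(]high_byte/low_byte(Smem)[)], dst for one 16-bit memory word."""
    # Select the requested byte of the 16-bit memory word.
    byte = (word >> 8) & 0xFF if high else word & 0xFF
    width = 40 if to_accumulator else 16
    # Assumption: accumulator loads sign extend only when SXMD = 1;
    # auxiliary/temporary register loads sign extend regardless of SXMD.
    sign_extend = not uns and (sxmd == 1 or not to_accumulator)
    if sign_extend and byte & 0x80:
        byte -= 0x100  # reinterpret the byte as a negative number
    # Wrap the result into the destination register width.
    return byte & ((1 << width) - 1)
```

For example, with the memory word 80FFh, `load_byte(0x80FF, high=True, uns=True)` models `MOV uns(high_byte(*AR3)), AC0` and yields 80h, while dropping `uns` with SXMD = 1 yields the 40-bit value FF FFFF FF80h.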

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | M40, SXMD |
| :--- | :--- | :--- |
|  | Affects |  |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MOV uns(low_byte(*AR3)), AC0 | The low-byte content addressed by AR3 is zero extended to 40 bits <br> and loaded into AC0. |

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | MOV k4, dst | Yes | 2 | 1 | X |
| $[2]$ | MOV -k4, dst | Yes | 2 | 1 | X |
| $[3]$ | MOV K16, dst | No | 4 | 1 | X |

Description This instruction loads a 4-bit unsigned constant, k4; the 2s complement representation of the 4-bit unsigned constant; or a 16-bit signed constant, K16, to a selected destination (dst) register.

Status Bits Affected by M40, SXMD
Affects none
See Also See the following other related instructions:

- MACM::MOV (Multiply and Accumulate with Parallel Load Accumulator from Memory)
- MASM::MOV (Multiply and Subtract with Parallel Load Accumulator from Memory)
- MOV (Load Accumulator from Memory)
- MOV (Load Accumulator Pair from Memory)
- MOV (Load Accumulator with Immediate Value)
- MOV (Load Accumulator, Auxiliary, or Temporary Register from Memory)
- MOV (Load Auxiliary or Temporary Register Pair from Memory)
- MOV::MOV (Load Accumulator from Memory with Parallel Store Accumulator Content to Memory)

Load Accumulator, Auxiliary, or Temporary Register with Immediate Value

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MOV k4, dst | Yes | 2 | 1 | X |

Opcode 0011 110E kkkk FDDD

Operands dst, k4

Description This instruction loads the 4-bit unsigned constant, k4, to the destination (dst) register:

dst = k4

$\square$ When the destination register is an accumulator:

- The 4-bit constant, k4, is zero extended to 40 bits.
- The load operation in the destination register uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.

$\square$ When the destination register is an auxiliary or temporary register:

- The 4-bit constant, k4, is zero extended to 16 bits.
- The load operation in the destination register uses a dedicated path independent of the A-unit ALU.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | M40 |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MOV \#2, AC0 | AC0 is loaded with the unsigned 4-bit value (2). |

Load Accumulator, Auxiliary, or Temporary Register with Immediate Value

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[2]$ | MOV -k4, dst | Yes | 2 | 1 | X |

## Opcode

0011 111E kkkk FDDD

Operands dst, k4

Description This instruction loads the 2s complement representation of the 4-bit unsigned constant, k4, to the destination (dst) register:

```
dst = -k4
```

$\square$ When the destination register is an accumulator:

- The 4-bit constant, k4, is negated in the I-unit, loaded into the accumulator, and sign extended to 40 bits before being processed by the D-unit as a signed constant.
- The load operation in the destination register uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.

$\square$ When the destination register is an auxiliary or temporary register:

- The 4-bit constant, k4, is zero extended to 16 bits and negated in the I-unit before being processed by the A-unit as a signed K16 constant.
- The load operation in the destination register uses a dedicated path independent of the A-unit ALU.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | M40 |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| MOV \#-2, AC0 | AC0 is loaded with a 2s complement representation of the unsigned 4-bit value (2). |

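The register contents produced by the `k4` and `-k4` forms can be checked with a short Python sketch. The helper name `load_k4` is hypothetical; it only models the final register value (zero extension for `k4`, 2s complement for `-k4`), not the I-unit, A-unit, or D-unit datapaths.

```python
def load_k4(k4, negate=False, to_accumulator=True):
    """Model the register value after MOV k4, dst or MOV -k4, dst."""
    if not 0 <= k4 <= 15:
        raise ValueError("k4 must be a 4-bit unsigned constant")
    # Accumulators are 40 bits wide; auxiliary/temporary registers are 16 bits.
    width = 40 if to_accumulator else 16
    value = -k4 if negate else k4
    # Masking a negative Python int yields its 2s complement representation.
    return value & ((1 << width) - 1)
```

Under this model, `MOV #-2, AC0` leaves FF FFFF FFFEh in AC0, and the same constant loaded into an auxiliary or temporary register gives FFFEh.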

## MOV

Load Auxiliary or Temporary Register Pair from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MOV dbl(Lmem), pair(TAx) | No | 3 | 1 | X |

Opcode 1101 AAAA AAAI FDDD 111x

Operands Lmem, TAx

Description This instruction loads the 16 highest bits of data memory operand (Lmem) to the temporary or auxiliary register (TAx) and loads the 16 lowest bits of data memory operand to the temporary or auxiliary register TA(x + 1):

pair(TAx) = Lmem

- The load operation in the destination register uses a dedicated path independent of the A-unit ALU.
- Valid auxiliary registers are AR0, AR2, AR4, and AR6.
- Valid temporary registers are T0 and T2.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by |  |
| :--- | :--- | :--- |
|  | Affects |  |

See Also See the following other related instructions:

- AMOV (Modify Auxiliary or Temporary Register Content)
- MOV (Load Accumulator, Auxiliary, or Temporary Register from Memory)
- MOV (Load Accumulator, Auxiliary, or Temporary Register with Immediate Value)

## Example

| Syntax | Description |
| :--- | :--- |
| MOV dbl(*AR2), pair(T0) | The 16 highest bits of the content at the location addressed by AR2 are loaded <br> into T0 and the 16 lowest bits of the content at the location addressed by AR2 + 1 <br> are loaded into T1. |

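The pair load in the example above can be modeled in a few lines of Python. This is a minimal sketch: memory is a plain dictionary of 16-bit words, the register file is a dictionary, and the helper name `load_register_pair` and the data values in the usage note are hypothetical.

```python
def load_register_pair(memory, address, registers, first, second):
    """Model MOV dbl(Lmem), pair(TAx): TAx and TA(x+1) get two consecutive words."""
    # The word at the addressed location holds the 16 highest bits of Lmem.
    registers[first] = memory[address] & 0xFFFF
    # The word at the next location holds the 16 lowest bits of Lmem.
    registers[second] = memory[address + 1] & 0xFFFF
    return registers
```

With `memory = {0x200: 0x3400, 0x201: 0x0FD3}`, calling `load_register_pair(memory, 0x200, {}, "T0", "T1")` models `MOV dbl(*AR2), pair(T0)` with AR2 = 0200h and leaves T0 = 3400h and T1 = 0FD3h.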
## MOV

Load CPU Register from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | MOV Smem, BK03 | No | 3 | 1 | X |
| $[2]$ | MOV Smem, BK47 | No | 3 | 1 | X |
| $[3]$ | MOV Smem, BKC | No | 3 | 1 | X |
| $[4]$ | MOV Smem, BSA01 | No | 3 | 1 | X |
| $[5]$ | MOV Smem, BSA23 | No | 3 | 1 | X |
| $[6]$ | MOV Smem, BSA45 | No | 3 | 1 | X |
| $[7]$ | MOV Smem, BSA67 | No | 3 | 1 | X |
| $[8]$ | MOV Smem, BSAC | No | 3 | 1 | X |
| $[9]$ | MOV Smem, BRC0 | No | 3 | 1 | X |
| $[10]$ | MOV Smem, BRC1 | No | 3 | 1 | X |
| $[11]$ | MOV Smem, CDP | No | 3 | 1 | X |
| $[12]$ | MOV Smem, CSR | No | 3 | 1 | X |
| $[13]$ | MOV Smem, DP | No | 3 | 1 | X |
| $[14]$ | MOV Smem, DPH | No | 3 | 1 | X |
| $[15]$ | MOV Smem, PDP | No | 3 | 1 | X |
| $[16]$ | MOV Smem, SP | No | 3 | 1 | X |
| $[17]$ | MOV Smem, SSP | No | 3 | 1 | X |
| $[18]$ | MOV Smem, TRN0 | No | 3 | 1 | X |
| $[19]$ | MOV Smem, TRN1 | No | 3 | 1 | X |
| $[20]$ | MOV dbl(Lmem), RETA | No | 3 | 5 | X |

Opcode See Table 5-1 (page 5-370).

Operands Lmem, Smem

| Description | Instructions [1] through [19] load the content of a memory (Smem) location to the destination CPU register. This instruction uses a dedicated datapath independent of the A-unit ALU and the D-unit operators to perform the operation. The content of the memory location is zero extended to the bitwidth of the destination CPU register. |
| :---: | :---: |
|  | The operation is performed in the execute phase of the pipeline. There is a 3-cycle latency between PDP, DP, SP, SSP, CDP, BSAx, BKx, BRCx, and CSR loads and their use in the address phase by the A-unit address generator units or by the P-unit loop control management. |
|  | For instruction [10], when BRC1 is loaded, the block repeat save register (BRS1) is also loaded with the same value. |
|  | Instruction [20] loads the content of data memory operand (Lmem) to the 24-bit RETA register (the return address of the calling subroutine) and to the 8-bit CFCT register (active control flow execution context flags of the calling subroutine): |
|  | - The 16 highest bits of Lmem are loaded into the CFCT register and into the 8 highest bits of the RETA register. |
|  | - The 16 lowest bits of Lmem are loaded into the 16 lowest bits of the RETA register. |
|  | When instruction [20] is decoded, the CPU pipeline is flushed and the instruction is executed in 5 cycles, regardless of the instruction context. |
| Status Bits | Affected by none |
|  | Affects none |
| Repeat | Instructions [13] and [20] cannot be repeated; all other instructions can be repeated. |
| See Also | See the following other related instructions: |
|  | - MOV (Load CPU Register with Immediate Value) |

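The split of the 32-bit Lmem operand performed by instruction [20] can be sketched as follows. The description above states only that the 16 highest bits feed CFCT and the 8 highest bits of RETA; the sketch assumes that CFCT takes bits 31-24 and RETA(23-16) takes bits 23-16, and the helper name `load_reta` is hypothetical.

```python
def load_reta(lmem):
    """Model MOV dbl(Lmem), RETA: return (RETA, CFCT) for a 32-bit operand."""
    # Assumption: the upper byte of the 16 highest bits goes to the 8-bit CFCT.
    cfct = (lmem >> 24) & 0xFF
    # The remaining 24 bits form RETA: bits 23-16 from the high word,
    # bits 15-0 from the 16 lowest bits of Lmem.
    reta = lmem & 0xFFFFFF
    return reta, cfct
```

For an illustrative operand A512 3456h, the model returns RETA = 12 3456h and CFCT = A5h.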
Table 5-1. Opcodes for Load CPU Register from Memory Instruction

| No. | Syntax | Opcode |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MOV Smem, BK03 | 110 | 1100 | AAAA | AAAI | 1001 | xx10 |
| [2] | MOV Smem, BK47 | 110 | 1100 | AAAA | AAAI | 1010 | xx10 |
| [3] | MOV Smem, BKC | 110 | 1100 | AAAA | AAAI | 1011 | xx10 |
| [4] | MOV Smem, BSA01 | 110 | 1100 | AAAA | AAAI | 0010 | xx10 |
| [5] | MOV Smem, BSA23 | 110 | 1100 | AAAA | AAAI | 0011 | xx10 |
| [6] | MOV Smem, BSA45 | 110 | 1100 | AAAA | AAAI | 0100 | xx10 |
| [7] | MOV Smem, BSA67 | 110 | 1100 | AAAA | AAAI | 0101 | xx10 |
| [8] | MOV Smem, BSAC | 110 | 1100 | AAAA | AAAI | 0110 | xx10 |
| [9] | MOV Smem, BRC0 | 110 | 1100 | AAAA | AAAI | x001 | xx11 |
| [10] | MOV Smem, BRC1 | 110 | 1100 | AAAA | AAAI | x010 | xx11 |
| [11] | MOV Smem, CDP | 110 | 1100 | AAAA | AAAI | 0001 | xx10 |
| [12] | MOV Smem, CSR | 110 | 1100 | AAAA | AAAI | x000 | xx11 |
| [13] | MOV Smem, DP | 110 | 1100 | AAAA | AAAI | 0000 | xx10 |
| [14] | MOV Smem, DPH | 110 | 1100 | AAAA | AAAI | 1100 | xx10 |
| [15] | MOV Smem, PDP | 110 | 1100 | AAAA | AAAI | 1111 | xx10 |
| [16] | MOV Smem, SP | 110 | 1100 | AAAA | AAAI | 0111 | xx10 |
| [17] | MOV Smem, SSP | 110 | 1100 | AAAA | AAAI | 1000 | xx10 |
| [18] | MOV Smem, TRN0 | 110 | 1100 | AAAA | AAAI | x011 | xx11 |
| [19] | MOV Smem, TRN1 | 110 | 1100 | AAAA | AAAI | x100 | xx11 |
| [20] | MOV dbl(Lmem), RETA | 111 | 1101 | AAAA | AAAI | xxxx | 011x |

## MOV

Load CPU Register with Immediate Value

Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MOV k12, BK03 | Yes | 3 | 1 | AD |
| [2] | MOV k12, BK47 | Yes | 3 | 1 | AD |
| [3] | MOV k12, BKC | Yes | 3 | 1 | AD |
| [4] | MOV k12, BRC0 | Yes | 3 | 1 | AD |
| [5] | MOV k12, BRC1 | Yes | 3 | 1 | AD |
| [6] | MOV k12, CSR | Yes | 3 | 1 | AD |
| [7] | MOV k7, DPH | Yes | 3 | 1 | AD |
| [8] | MOV k9, PDP | Yes | 3 | 1 | AD |
| [9] | MOV k16, BSA01 | No | 4 | 1 | AD |
| [10] | MOV k16, BSA23 | No | 4 | 1 | AD |
| [11] | MOV k16, BSA45 | No | 4 | 1 | AD |
| [12] | MOV k16, BSA67 | No | 4 | 1 | AD |
| [13] | MOV k16, BSAC | No | 4 | 1 | AD |
| [14] | MOV k16, CDP | No | 4 | 1 | AD |
| [15] | MOV k16, DP | No | 4 | 1 | AD |
| [16] | MOV k16, SP | No | 4 | 1 | AD |
| [17] | MOV k16, SSP | No | 4 | 1 | AD |

Opcode See Table 5-2 (page 5-372).
Operands kx
Description This instruction loads the unsigned constant, $k x$, to the destination CPU register. This instruction uses a dedicated datapath independent of the A-unit ALU and the D-unit operators to perform the operation. The constant is zero extended to the bitwidth of the destination CPU register.

For instruction [5], when BRC1 is loaded, the block repeat save register (BRS1) is also loaded with the same value.

The operation is performed in the address phase of the pipeline.

| Status Bits | Affected by none |
| :--- | :--- |
|  | Affects none |
| Repeat | Instruction [15] cannot be repeated; all other instructions can be repeated. |
| See Also | See the following other related instructions: |
|  | $\square$ MOV (Load CPU Register from Memory) |

Table 5-2. Opcodes for Load CPU Register with Immediate Value Instruction

| No. | Syntax | Opcode |  |  |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MOV k12, BK03 |  | 0001 | 011E | kkkk | kkkk | kkkk | 0100 |  |
| [2] | MOV k12, BK47 |  | 0001 | 011E | kkkk | kkkk | kkkk | 0101 |  |
| [3] | MOV k12, BKC |  | 0001 | 011E | kkkk | kkkk | kkkk | 0110 |  |
| [4] | MOV k12, BRC0 |  | 0001 | 011E | kkkk | kkkk | kkkk | 1001 |  |
| [5] | MOV k12, BRC1 |  | 0001 | 011E | kkkk | kkkk | kkkk | 1010 |  |
| [6] | MOV k12, CSR |  | 0001 | 011E | kkkk | kkkk | kkkk | 1000 |  |
| [7] | MOV k7, DPH |  | 0001 | 011E | xxxx | xkkk | kkkk | 0000 |  |
| [8] | MOV k9, PDP |  | 0001 | 011E | xxxk | kkkk | kkkk | 0011 |  |
| [9] | MOV k16, BSA01 | 0111 | 1000 k | kkkk | kkkk | kkkk | kkkk | xxx0 | 011x |
| [10] | MOV k16, BSA23 | 0111 | 1000 k | kkkk | kkkk | kkkk | kkkk | xxx0 | 100x |
| [11] | MOV k16, BSA45 | 0111 | 1000 k | kkkk | kkkk | kkkk | kkkk | xxx0 | 101x |
| [12] | MOV k16, BSA67 | 0111 | 1000 k | kkkk | kkkk | kkkk | kkkk | xxx0 | 110x |
| [13] | MOV k16, BSAC | 0111 | 1000 k | kkkk | kkkk | kkkk | kkkk | xxx0 | 111x |
| [14] | MOV k16, CDP | 0111 | 1000 k | kkkk | kkkk | kkkk | kkkk | xxx0 | 010x |
| [15] | MOV k16, DP | 0111 | 1000 k | kkkk | kkkk | kkkk | kkkk | xxx0 | 000x |
| [16] | MOV k16, SP | 0111 | 1000 k | kkkk | kkkk | kkkk | kkkk | xxx1 | 000x |
| [17] | MOV k16, SSP | 0111 | 1000 k | kkkk | kkkk | kkkk | kkkk | xxx0 | 001x |

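The 24-bit encodings of the `k12` forms in Table 5-2 can be assembled mechanically. The sketch below is illustrative only: `encode_mov_k12` and `K12_REGISTER_CODES` are hypothetical names, the register codes are copied from rows [1] and [3] through [6] of the table, and E is assumed to be the parallel enable bit.

```python
# Low-nibble register codes taken from Table 5-2, rows [1] and [3]-[6].
K12_REGISTER_CODES = {
    "BK03": 0b0100,
    "BKC": 0b0110,
    "BRC0": 0b1001,
    "BRC1": 0b1010,
    "CSR": 0b1000,
}

def encode_mov_k12(k12, register, parallel_enable=0):
    """Build the 24-bit pattern 0001 011E kkkk kkkk kkkk rrrr."""
    if not 0 <= k12 <= 0xFFF:
        raise ValueError("k12 must fit in 12 bits")
    if parallel_enable not in (0, 1):
        raise ValueError("parallel_enable must be 0 or 1")
    first_byte = 0b0001_0110 | parallel_enable  # E is the LSB of the first byte
    return (first_byte << 16) | (k12 << 4) | K12_REGISTER_CODES[register]
```

As a usage example, `format(encode_mov_k12(0x123, "BK03"), "024b")` gives `000101100001001000110100`, which reads as 0001 0110 0001 0010 0011 0100 against the table row for BK03.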
## MOV

Load Extended Auxiliary Register from Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
|  | MOV dbl(Lmem), XAdst | No | 3 | 1 | X |

| Opcode | 1110 1101 AAAA AAAI XDDD 1111 |
| :--- | :--- |
| Operands | Lmem, XAdst |
| Description | This instruction loads the lower 23 bits of the data addressed by data memory operand (Lmem) to the 23-bit destination register (XARx, XSP, XSSP, XDP, or XCDP). |
| Status Bits | Affected by none |
| Repeat | This instruction can be repeated. |
| See Also | See the following other related instructions: <br> AMAR (Modify Extended Auxiliary Register Content) <br> AMOV (Load Extended Auxiliary Register with Immediate Value) <br> MOV (Move Extended Auxiliary Register Content) <br> MOV (Store Extended Auxiliary Register Content to Memory) |

## Example

| Syntax | Description |
| :--- | :--- |
| MOV dbl(*AR3), XAR1 | The 7 lowest bits of the content at the location addressed by AR3 and the 16 bits of the content at the location addressed by AR3 + 1 are loaded into XAR1. |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| XAR1 | 00 0000 | XAR1 | 12 0FD3 |
| AR3 | 0200 | AR3 | 0200 |
| 200 | 3492 | 200 | 3492 |
| 201 | 0FD3 | 201 | 0FD3 |

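The arithmetic behind the Before/After values above can be reproduced with a short Python sketch. The helper name `load_extended_register` is hypothetical, and memory is modeled as a dictionary of 16-bit words.

```python
def load_extended_register(memory, address):
    """Model MOV dbl(Lmem), XAdst: keep the lower 23 bits of the 32-bit operand."""
    high = memory[address] & 0x007F      # 7 lowest bits of the first word
    low = memory[address + 1] & 0xFFFF   # all 16 bits of the second word
    return (high << 16) | low
```

With locations 200h = 3492h and 201h = 0FD3h, the 7 lowest bits of 3492h are 12h, so the function returns 12 0FD3h, matching the XAR1 value in the table.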

## MOV

Move Accumulator Content to Auxiliary or Temporary Register

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | MOV HI(ACx), TAx | Yes | 2 | 1 | X |

## Opcode

0100 010E 00SS FDDD

## Operands

ACx, TAx

Description This instruction moves the high part of the accumulator, ACx(31-16), to the destination auxiliary or temporary register (TAx):

TAx = HI(ACx)

The 16-bit move operation is performed in the A-unit ALU.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

Status Bits Affected by M40
Affects none
Repeat This instruction can be repeated.
See Also See the following other related instructions:

- MOV (Move Accumulator, Auxiliary, or Temporary Register Content)
- MOV (Move Auxiliary or Temporary Register Content to Accumulator)


## Example

| Syntax | Description |
| :--- | :--- |
| MOV HI(AC0), AR2 | The content of AC0(31-16) is copied to AR2. |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC0 | 01 E500 0030 | AC0 | 01 E500 0030 |
| AR2 | 0200 | AR2 | E500 |

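The register notation used in these tables (guard bits, high word, low word) and the HI() extraction can be illustrated in Python. Both helper names below are hypothetical.

```python
def parse_accumulator(text):
    """Convert the 'GG HHHH LLLL' notation used in the tables to a 40-bit integer."""
    guard, high, low = (int(part, 16) for part in text.split())
    return (guard << 32) | (high << 16) | low

def move_hi(acc40):
    """Model MOV HI(ACx), TAx: return bits 31-16 of the accumulator."""
    return (acc40 >> 16) & 0xFFFF
```

Applied to the example, `move_hi(parse_accumulator("01 E500 0030"))` returns E500h, the value shown for AR2 after execution. The guard bits (01h) do not take part in the move.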


## Example

| Syntax | Description |
| :--- | :--- |
| MOV AC0, AC1 | The content of AC0 is copied to AC1. Because an overflow occurred, ACOV1 is <br> set to 1. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC0 | 01 E500 0030 | AC0 | 01 E500 0030 |
| AC1 | 00 2800 0200 | AC1 | 01 E500 0030 |
| M40 | 0 | M40 | 0 |
| SATD | 0 | SATD | 0 |
| ACOV1 | 0 | ACOV1 | 1 |

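The overflow reported in the `MOV AC0, AC1` example can be reproduced with a small model. This is a sketch under stated assumptions: overflow is checked against the signed 32-bit range when M40 = 0 and the signed 40-bit range when M40 = 1, and saturation (when SATD = 1) clamps to the limits of that range. The function name `move_accumulator` is hypothetical.

```python
MASK40 = (1 << 40) - 1

def move_accumulator(src40, m40=0, satd=0):
    """Return (destination value, overflow flag) for an accumulator-to-accumulator move."""
    # Interpret the 40-bit pattern as a signed number.
    value = src40 - (1 << 40) if src40 & (1 << 39) else src40
    bits = 40 if m40 else 32
    low_limit, high_limit = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    overflow = not low_limit <= value <= high_limit
    if overflow and satd:
        # Assumed saturation behavior: clamp to the nearest range limit.
        value = high_limit if value > high_limit else low_limit
    return value & MASK40, int(overflow)
```

For AC0 = 01 E500 0030h with M40 = 0 and SATD = 0, the model returns the unchanged value with the overflow flag set, which agrees with AC1 = 01 E500 0030h and ACOV1 = 1 in the table.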
## MOV

Move Auxiliary or Temporary Register Content to Accumulator

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | MOV TAx, HI(ACx) | Yes | 2 | 1 | X |
## Opcode

0101 001E FSSS 00DD

## Operands

ACx, TAx

Description This instruction moves the content of the auxiliary or temporary register (TAx) to the high part of the accumulator, ACx(31-16):

HI(ACx) = TAx

$\square$ The 16-bit move operation is performed in the D-unit ALU.
$\square$ During the 16-bit move operation, an overflow is detected according to M40:

- The destination accumulator overflow status bit (ACOVx) is set.
- The destination accumulator (ACx) is saturated according to SATD.

$\square$ If the source (src) register is an auxiliary or temporary register, the 16 LSBs of the source register are sign extended to 40 bits according to SXMD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by M40, SATD, SXMD |
| :--- | :--- |
|  | Affects ACOVx |
| Repeat | This instruction can be repeated. |
| See Also | See the following other related instructions: |
|  | $\square$ MOV (Move Accumulator Content to Auxiliary or Temporary Register) |
|  | $\square$ MOV (Move Accumulator, Auxiliary, or Temporary Register Content) |
|  | $\square$ MOV (Move Auxiliary or Temporary Register Content to CPU Register) |
|  | $\square$ MOV (Move Extended Auxiliary Register Content) |

## Example

| Syntax | Description |
| :--- | :--- |
| MOV T0, HI(AC0) | The content of T0 is copied to AC0(31-16). |

## MOV

Move Auxiliary or Temporary Register Content to CPU Register

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | MOV TAx, BRC0 | Yes | 2 | 1 | X |
| $[2]$ | MOV TAx, BRC1 | Yes | 2 | 1 | X |
| $[3]$ | MOV TAx, CDP | Yes | 2 | 1 | X |
| $[4]$ | MOV TAx, CSR | Yes | 2 | 1 | X |
| $[5]$ | MOV TAx, SP | Yes | 2 | 1 | X |
| $[6]$ | MOV TAx, SSP | Yes | 2 | 1 | X |


| Opcode | See Table 5-3 (page 5-380). |
| :--- | :--- |
| Operands | TAx |

Description This instruction moves the content of the auxiliary or temporary register (TAx) to the selected CPU register. All the move operations are performed in the execute phase of the pipeline and the A-unit ALU is used to transfer the content of the registers.

There is a 3-cycle latency between SP, SSP, CDP, TAx, CSR, and BRCx update and their use in the address phase by the A-unit address generator units or by the P-unit loop control management.

For instruction [2] when BRC1 is loaded with the content of TAx, the block repeat save register (BRS1) is also loaded with the same value.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.
See Also See the following other related instructions:

- MOV (Move Accumulator Content to Auxiliary or Temporary Register)
- MOV (Move Accumulator, Auxiliary, or Temporary Register Content)
- MOV (Move Auxiliary or Temporary Register Content to Accumulator)
- MOV (Move CPU Register Content to Auxiliary or Temporary Register)
- MOV (Move Extended Auxiliary Register Content)


## Example

| Syntax | Description |
| :--- | :--- |
| MOV T1, BRC1 | The content of T1 is copied to the block repeat register (BRC1) and to the block <br> repeat save register (BRS1). |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| T1 | 0034 | T1 | 0034 |
| BRC1 | 00EA | BRC1 | 0034 |
| BRS1 | 00EA | BRS1 | 0034 |

Table 5-3. Opcodes for Move Auxiliary or Temporary Register Content to CPU Register Instruction

| No. | Syntax | Opcode |  |
| :---: | :--- | :---: | :--- |
| $[1]$ | MOV TAx, BRC0 | 0101 | 001 E |
| $[2]$ | MOV TAx, BRC1 | 1110 |  |
| $[3]$ | MOV TAx, CDP | 0101 | 001 E |
| 4$]$ | MOV TAx, CSR | 1101 |  |
| $[5]$ | MOV TAx, SP | 0101 | 001 E |
| $6 S S S$ | 1010 |  |  |
| $[6]$ | MOV TAx, SSP | 0101 | 001 E |

## MOV

Move CPU Register Content to Auxiliary or Temporary Register

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | MOV BRC0, TAx | Yes | 2 | 1 | X |
| $[2]$ | MOV BRC1, TAx | Yes | 2 | 1 | X |
| $[3]$ | MOV CDP, TAx | Yes | 2 | 1 | X |
| $[4]$ | MOV SP, TAx | Yes | 2 | 1 | X |
| $[5]$ | MOV SSP, TAx | Yes | 2 | 1 | X |
| $[6]$ | MOV RPTC, TAx | Yes | 2 | 1 | X |

Opcode See Table 5-4 (page 5-382).

Operands TAx

Description This instruction moves the content of the selected CPU register to the auxiliary or temporary register (TAx). All the move operations are performed in the execute phase of the pipeline and the A-unit ALU is used to transfer the content of the registers.

For instructions [1] and [2], BRCx is decremented in the address phase of the last instruction of a loop. These instructions have a 3-cycle latency requirement versus the last instruction of a loop.

For instructions [3], [4], and [5], there is a 3-cycle latency between SP, SSP, CDP, and TAx update and their use in the address phase by the A-unit address generator units or by the P-unit loop control management.

Status Bits Affected by none
Affects none
Repeat Instruction [6] cannot be repeated; all other instructions can be repeated.
See Also See the following other related instructions:

- MOV (Move Accumulator Content to Auxiliary or Temporary Register)
- MOV (Move Auxiliary or Temporary Register Content to CPU Register)
- MOV (Store CPU Register Content to Memory)


## Example

| Syntax | Description |
| :--- | :--- |
| MOV BRC1, T1 | The content of block repeat register (BRC1) is copied to T1. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| T1 | 0034 | T1 | 00EA |
| BRC1 | 00EA | BRC1 | 00EA |

## Table 5-4. Opcodes for Move CPU Register Content to Auxiliary or Temporary Register Instruction

| No. | Syntax | Opcode |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MOV BRC0, TAx | 0100 | 010E | 1100 | FDDD |
| [2] | MOV BRC1, TAx | 0100 | 010E | 1101 | FDDD |
| [3] | MOV CDP, TAx | 0100 | 010E | 1010 | FDDD |
| [4] | MOV SP, TAx | 0100 | 010E | 1000 | FDDD |
| [5] | MOV SSP, TAx | 0100 | 010E | 1001 | FDDD |
| [6] | MOV RPTC, TAx | 0100 | 010E | 1110 | FDDD |

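Table 5-4 can be read in the reverse direction to recognize these instructions from a 16-bit word. The decoder below is an illustrative sketch: the names `SOURCE_CODES` and `decode_cpu_register_move` are hypothetical, the FDDD field is returned as a raw 4-bit number because its register mapping is defined elsewhere in the manual, and E is assumed to be the parallel enable bit.

```python
# Third-nibble source codes taken from Table 5-4.
SOURCE_CODES = {
    0b1100: "BRC0",
    0b1101: "BRC1",
    0b1010: "CDP",
    0b1000: "SP",
    0b1001: "SSP",
    0b1110: "RPTC",
}

def decode_cpu_register_move(word):
    """Decode 0100 010E ssss FDDD; return (source, FDDD, E) or None if no match."""
    if (word >> 9) != 0b0100010:      # the 7 fixed leading bits
        return None
    source = SOURCE_CODES.get((word >> 4) & 0xF)
    if source is None:                # other nibbles belong to other instructions
        return None
    return source, word & 0xF, (word >> 8) & 1
```

For instance, the word 44D5h (0100 0100 1101 0101) decodes to source BRC1 with FDDD = 0101 and E = 0.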
## MOV

## Move Extended Auxiliary Register Content

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MOV xsrc, xdst | No | 2 | 1 | X |

Opcode 1001 0000 XSSS XDDD

Operands xdst, xsrc

Description This instruction moves the content of the source register (xsrc) to the destination register (xdst):

xdst = xsrc

$\square$ When the destination register (xdst) is an accumulator (ACx) and the source register (xsrc) is a 23-bit register (XARx, XSP, XSSP, XDP, or XCDP):

- The 23-bit move operation is performed in the D-unit ALU.
- The upper bits of ACx are filled with 0.

$\square$ When the source register (xsrc) is an accumulator (ACx) and the destination register (xdst) is a 23-bit register (XARx, XSP, XSSP, XDP, or XCDP):

- The 23-bit move operation is performed in the A-unit ALU.
- The lower 23 bits of ACx are loaded into xdst.

$\square$ When both the source register (xsrc) and the destination register (xdst) are accumulators, the Move Accumulator Content instruction (MOV src, dst) is assembled.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.
See Also See the following other related instructions:

- AMAR (Modify Extended Auxiliary Register Content)
- AMOV (Load Extended Auxiliary Register with Immediate Value)
- MOV (Load Extended Auxiliary Register from Memory)
- MOV (Store Extended Auxiliary Register Content to Memory)

## Example

| Syntax | Description |
| :--- | :--- |
| MOV AC0, XAR1 | The lower 23 bits of AC0 are loaded into XAR1. |

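The two directions of the extended-register move reduce to a 23-bit mask. The sketch below uses hypothetical helper names and illustrative data values; it models only the resulting register contents.

```python
MASK23 = (1 << 23) - 1
MASK40 = (1 << 40) - 1

def accumulator_to_extended(acc40):
    """Model MOV ACx, xdst: the lower 23 bits of ACx are loaded into xdst."""
    return acc40 & MASK23

def extended_to_accumulator(xsrc23):
    """Model MOV xsrc, ACx: the 23-bit value is placed in ACx, upper bits are 0."""
    return (xsrc23 & MASK23) & MASK40
```

For an illustrative accumulator value 00 00FF 1234h, `accumulator_to_extended` returns 7F 1234h, since bit 23 and above are discarded.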
## MOV

Move Memory to Memory

Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | MOV Cmem, Smem | No | 3 | 1 | X |
| $[2]$ | MOV Smem, Cmem | No | 3 | 1 | X |
| $[3]$ | MOV Cmem, dbl(Lmem) | No | 3 | 1 | X |
| $[4]$ | MOV dbl(Lmem), Cmem | No | 3 | 1 | X |
| $[5]$ | MOV dbl(Xmem), dbl(Ymem) | No | 3 | 1 | X |
| $[6]$ | MOV Xmem, Ymem | No | 3 | 1 | X |


| Description | These instructions store the content of a memory location to a memory location. They use a dedicated datapath to perform the operation. |
| :--- | :--- |
| Status Bits | Affected by none <br> Affects none |
| See Also | See the following other related instructions: <br> MOV (Store Accumulator, Auxiliary, or Temporary Register Content to Memory) |

Move Memory to Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | MOV Cmem, Smem | No | 3 | 1 | X |

Opcode 1110 1111 AAAA AAAI xxxx 00mm

Operands Cmem, Smem

Description This instruction stores the content of a data memory operand Cmem, addressed using the coefficient addressing mode, to a memory (Smem) location:

Smem = Cmem

For this instruction, the Cmem operand is not accessed through the BB bus. On all C55x-based devices, the Cmem operand may be mapped in external or internal memory space.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| MOV *CDP, *(\#0500h) | The content addressed by the coefficient data pointer register (CDP) is copied to <br> address 0500h. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| *CDP | 3400 | *CDP | 3400 |
| 500 | 0000 | 500 | 3400 |

Move Memory to Memory

## Syntax Characteristics




| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AR0 | 0300 | AR0 | 0300 |
| AR1 | 0400 | AR1 | 0400 |
| 300 | 3400 | 300 | 3400 |
| 301 | 0FD3 | 301 | 0FD3 |
| 400 | 0000 | 400 | 3400 |
| 401 | 0000 | 401 | 0FD3 |

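The memory-to-memory moves amount to word copies, which a dictionary-based Python model makes easy to check against the Before/After tables. The helper names are hypothetical, and the double-word version assumes the two words are copied in address order from an even-aligned source to an even-aligned destination.

```python
def move_word(memory, source, destination):
    """Copy one 16-bit word, as in MOV Cmem, Smem."""
    memory[destination] = memory[source] & 0xFFFF

def move_double_word(memory, source, destination):
    """Copy two consecutive 16-bit words, as in a dbl() memory-to-memory move."""
    for offset in (0, 1):
        memory[destination + offset] = memory[source + offset] & 0xFFFF
```

Starting from locations 300h = 3400h and 301h = 0FD3h with 400h and 401h cleared, `move_double_word(memory, 0x300, 0x400)` leaves 400h = 3400h and 401h = 0FD3h, as in the table above. The source locations are unchanged.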

## MOV

Store Accumulator Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | MOV HI(ACx), Smem | No | 2 | 1 | X |
| $[2]$ | MOV [rnd(]HI(ACx)[)], Smem | No | 3 | 1 | X |
| $[3]$ | MOV ACx << Tx, Smem | No | 3 | 1 | X |
| $[4]$ | MOV [rnd(]HI(ACx << Tx)[)], Smem | No | 3 | 1 | X |
| $[5]$ | MOV ACx << \#SHIFTW, Smem | No | 3 | 1 | X |
| $[6]$ | MOV HI(ACx << \#SHIFTW), Smem | No | 3 | 1 | X |
| $[7]$ | MOV [rnd(]HI(ACx << \#SHIFTW)[)], Smem | No | 4 | 1 | X |
| $[8]$ | MOV [uns(][rnd(]HI[(saturate](ACx)[)))], Smem | No | 3 | 1 | X |
| $[9]$ | MOV [uns(][rnd(]HI[(saturate](ACx << Tx)[)))], Smem | No | 3 | 1 | X |
| $[10]$ | MOV [uns(][rnd(]HI[(saturate](ACx << \#SHIFTW)[)))], Smem | No | 4 | 1 | X |
| $[11]$ | MOV ACx, dbl(Lmem) | No | 3 | 1 | X |
| $[12]$ | MOV [uns(]saturate(ACx)[)], dbl(Lmem) | No | 3 | 1 | X |
| $[13]$ | MOV ACx >> \#1, dual(Lmem) | No | 3 | 1 | X |
| $[14]$ | MOV ACx, Xmem, Ymem | No | 3 | 1 | X |

Description This instruction stores the content of the selected accumulator (ACx) to a memory (Smem) location, to a data memory operand (Lmem), or to dual data memory operands (Xmem and Ymem).

Status Bits Affected by C54CM, RDM, SXMD
Affects none
See Also See the following other related instructions:

- ADD::MOV (Addition with Parallel Store Accumulator Content to Memory)
- MACM::MOV (Multiply and Accumulate with Parallel Store Accumulator Content to Memory)
- MASM::MOV (Multiply and Subtract with Parallel Store Accumulator Content to Memory)
- MOV (Load Accumulator, Auxiliary, or Temporary Register from Memory)
- MOV (Store Accumulator Pair Content to Memory)
- MOV (Store Accumulator, Auxiliary, or Temporary Register Content to Memory)
- MOV (Store Auxiliary or Temporary Register Pair Content to Memory)
- MOV::MOV (Load Accumulator from Memory with Parallel Store Accumulator Content to Memory)
- MPYM::MOV (Multiply with Parallel Store Accumulator Content to Memory)
- SUB::MOV (Subtraction with Parallel Store Accumulator Content to Memory)


## Store Accumulator Content to Memory


Rounding is performed in the D-unit shifter according to RDM, if the optional rnd keyword is applied to the input operand.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with C54CM = 1, overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40-bit result of the shift and round operation:

$\square$ If the SST bit = 1 and the SXMD bit = 0, then the saturate, rnd, and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem = HI(saturate(uns(rnd(ACx))))

$\square$ If the SST bit = 1 and the SXMD bit = 1, then only the saturate and rnd keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem = HI(saturate(rnd(ACx)))

- If the optional uns keyword is applied to the input operand, then bits 39-32 of the result are compared to 0.
- If the optional uns keyword is not applied to the input operand, then bits 39-31 of the result are compared to bit 39 of the input operand and SXMD.

| Status Bits | Affected by | C54CM, RDM, SST, SXMD |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MOV rnd(HI(AC0)), *AR3 | The content of AC0(31-16) is rounded and stored at the location addressed by AR3. |

## Store Accumulator Content to Memory

## Syntax Characteristics


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with C54CM = 1, the 6 LSBs of Tx determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the 16-bit value in Tx is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
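The C54CM = 1 shift-quantity rule can be sketched in Python. This is a hedged model: the function name `c54cm_shift_count` is illustrative, and reading the "modulo 16 operation" as adding 16 to quantities in the range -32 to -17 is an interpretation that yields the -16 to -1 range the text describes.

```python
def c54cm_shift_count(tx: int) -> int:
    """Shift quantity taken from Tx when C54CM = 1 (illustrative model)."""
    s = tx & 0x3F          # keep only the 6 LSBs of Tx
    if s >= 32:
        s -= 64            # read them as a signed 6-bit value, -32..+31
    if -32 <= s <= -17:
        s += 16            # assumed modulo-16 fold into -16..-1
    return s
```

Under this reading, a Tx value of -20 yields a shift quantity of -4, and a Tx value of 0045h yields +5 because only the 6 LSBs are used.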

- If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = LO(saturate(uns(ACx << Tx)))

- If the SST bit = 1 and the SXMD bit = 1, then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = LO(saturate(ACx << Tx))

| Status Bits | Affected by | C54CM, RDM, SST, SXMD |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MOV AC0 << T0, *AR3 | The content of AC0 is shifted by the content of T0 and AC0(15-0) is stored at the location addressed by AR3. |

## Store Accumulator Content to Memory

Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | MOV [rnd(]HI(ACx << Tx)[)], Smem | No | 3 | 1 | X |
| Opcode | 0111 AAAA AAAI SSss 10x% |  |  |  |  |
| Operands | ACx, Smem, Tx |  |  |  |  |
| Description | This instruction shifts the accumulator, ACx, by the content of Tx and stores the high part of the accumulator, ACx(31-16), to the memory (Smem) location: |  |  |  |  |

```
Smem = HI (rnd(ACx << Tx))
```

If the 16-bit value in Tx is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value. The input operand is shifted in the D-unit shifter according to SXMD. Rounding is performed in the D-unit shifter according to RDM, if the optional rnd keyword is applied to the input operand.
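The shift-count saturation and the HI store can be sketched in Python. This is a simplified model under stated assumptions: positive counts shift left and negative counts shift right, a right shift extends bit 39 when SXMD = 1 and fills with zeros when SXMD = 0, and overflow detection and rounding are left out. All function names are illustrative.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def clamp_shift(tx16: int) -> int:
    """Read Tx as a signed 16-bit value and saturate it to -32..+31."""
    t = tx16 & 0xFFFF
    if t & 0x8000:
        t -= 0x10000
    return max(-32, min(31, t))


def shift40(acc40: int, count: int, sxmd: int) -> int:
    """Shift a 40-bit image; left for count >= 0, right otherwise."""
    acc40 &= MASK40
    if count >= 0:
        return (acc40 << count) & MASK40
    if sxmd and (acc40 >> 39) & 1:
        acc40 -= 1 << 40   # assumed sign extension from bit 39
    return (acc40 >> -count) & MASK40


def store_hi_shifted(acc40: int, tx16: int, sxmd: int = 1) -> int:
    """Model of Smem = HI(ACx << Tx) without rnd: bits 31-16 of the result."""
    return (shift40(acc40, clamp_shift(tx16), sxmd) >> 16) & 0xFFFF
```

For example, `store_hi_shifted(0x0012345678, 4)` models `MOV HI(AC0 << T0), *AR3` with T0 = 4 and yields 2345h, while a Tx value of 100 is first saturated to +31.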

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with C54CM = 1, the 6 LSBs of Tx determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the 16-bit value in Tx is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.

- If the SST bit = 1 and the SXMD bit = 0, then the saturate, rnd, and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = HI(saturate(uns(rnd(ACx << Tx))))

- If the SST bit = 1 and the SXMD bit = 1, then only the saturate and rnd keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = HI(saturate(rnd(ACx << Tx)))

| Status Bits | Affected by | C54CM, RDM, SST, SXMD |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MOV rnd(HI(AC0 << T0)), *AR3 | The content of AC0 is shifted by the content of T0, is rounded, and <br> AC0(31-16) is stored at the location addressed by AR3. |

Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [5] | MOV ACx << \#SHIFTW, Smem | No | 3 | 1 | X |
| Opcode | 1110 1001 AAAA AAAI SSSH IFTW |  |  |  |  |
| Operands | ACx, SHIFTW, Smem |  |  |  |  |
| Description | This instruction shifts the accumulator, ACx, by the 6-bit value, SHIFTW, and stores the low part of the accumulator, ACx(15-0), to the memory (Smem) location: <br> Smem = LO(ACx << \#SHIFTW) |  |  |  |  |

The input operand is shifted by the 6-bit value in the D-unit shifter according to SXMD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with C54CM = 1, overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40-bit result of the shift and round operation:

- If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = LO(saturate(uns(ACx << \#SHIFTW)))

- If the SST bit = 1 and the SXMD bit = 1, then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = LO(saturate(ACx << \#SHIFTW))

- If the optional uns keyword is applied to the input operand, then bits 39-32 of the result are compared to 0.
- If the optional uns keyword is not applied to the input operand, then bits 39-31 of the result are compared to bit 39 of the input operand and SXMD.

| Status Bits | Affected by | C54CM, RDM, SST, SXMD |
| :--- | :--- | :--- |
|  | Affects | none |

## Example

| Syntax | Description |
| :--- | :--- |
| MOV AC0 << \#31, *AR3 | The content of AC0 is shifted left by 31 bits and AC0(15-0) is stored at the location addressed by AR3. |

## Store Accumulator Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [6] | MOV HI(ACx << \#SHIFTW), Smem | No | 3 | 1 | X |
| Opcode | 1110 1010 AAAA AAAI SSSH IFTW |  |  |  |  |
| Operands | ACx, SHIFTW, Smem |  |  |  |  |
| Description | This instruction shifts the accumulator, ACx, by the 6-bit value, SHIFTW, and stores the high part of the accumulator, ACx(31-16), to the memory (Smem) location: <br> Smem = HI(ACx << \#SHIFTW) |  |  |  |  |

The input operand is shifted by the 6-bit value in the D-unit shifter according to SXMD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with C54CM = 1, overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40-bit result of the shift and round operation:

- If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = HI(saturate(uns(ACx << \#SHIFTW)))

- If the SST bit = 1 and the SXMD bit = 1, then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = HI(saturate(ACx << \#SHIFTW))

- If the optional uns keyword is applied to the input operand, then bits 39-32 of the result are compared to 0.
- If the optional uns keyword is not applied to the input operand, then bits 39-31 of the result are compared to bit 39 of the input operand and SXMD.

| Status Bits | Affected by | C54CM, RDM, SST, SXMD |
| :--- | :--- | :--- |
|  | Affects | none |

## Example

| Syntax | Description |
| :--- | :--- |
| MOV HI(AC0 << \#31), *AR3 | The content of AC0 is shifted left by 31 bits and AC0(31-16) is stored at the location addressed by AR3. |

Syntax Characteristics


The input operand is shifted by the 6-bit value in the D-unit shifter according to SXMD. Rounding is performed in the D-unit shifter according to RDM, if the optional rnd keyword is applied to the input operand.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with C54CM = 1, overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40-bit result of the shift and round operation:

- If the SST bit = 1 and the SXMD bit = 0, then the saturate, rnd, and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = HI(saturate(uns(rnd(ACx << \#SHIFTW))))

- If the SST bit = 1 and the SXMD bit = 1, then only the saturate and rnd keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = HI(saturate(rnd(ACx << \#SHIFTW)))

- If the optional uns keyword is applied to the input operand, then bits 39-32 of the result are compared to 0.
- If the optional uns keyword is not applied to the input operand, then bits 39-31 of the result are compared to bit 39 of the input operand and SXMD.

| Status Bits | Affected by | C54CM, RDM, SST, SXMD |
| :--- | :--- | :--- |
|  | Affects | none |

## Example

| Syntax | Description |
| :--- | :--- |
| MOV rnd(HI(AC0 << \#31)), *AR3 | The content of AC0 is shifted left by 31 bits, is rounded, and AC0(31-16) is stored at the location addressed by AR3. |

Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [8] | MOV [uns(] [rnd(]HI[(saturate](ACx)[)))], Smem | No | 3 | 1 | X |
| Opcode | 1110 1000 AAAA AAAI SSxx x1u% |  |  |  |  |
| Operands | ACx, Smem |  |  |  |  |

Description This instruction stores the high part of the accumulator, ACx(31-16), to the memory (Smem) location:

Smem = HI(saturate(uns(rnd(ACx))))

- When the C54CM bit = 0 or the SST bit = 0, the saturate and uns keywords are optional and can be applied or not.
- Input operands are considered signed or unsigned according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is considered unsigned.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is considered signed.
- If the optional rnd keyword is applied to the input operand, rounding is performed in the D-unit shifter according to RDM.
- When a rounding overflow is detected and if the optional saturate keyword is applied to the input operand, the 40-bit output of the operation is saturated:
  - If the optional uns keyword is applied to the input operand, saturation value is 00 FFFF FFFFh.
  - If the optional uns keyword is not applied, saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow).
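The saturation values listed above can be sketched in Python. This is a minimal model under one loud assumption: overflow is reduced to a plain range check on the mathematical result, which is a simplification and not the shifter's bit comparison. The function name `saturate40` and the constant names are illustrative.

```python
MASK40 = (1 << 40) - 1

SAT_UNS = 0x00FFFFFFFF  # 00 FFFF FFFFh
SAT_POS = 0x007FFFFFFF  # 00 7FFF FFFFh, positive overflow
SAT_NEG = 0xFF80000000  # FF 8000 0000h, negative overflow


def saturate40(result: int, uns: bool) -> int:
    """Return the 40-bit image stored when the saturate keyword is applied.

    `result` is the unbounded mathematical value after shift/round.
    """
    if uns:
        # Assumed check: anything above 32 bits overflows.
        return SAT_UNS if result > SAT_UNS else result & MASK40
    if result > 0x7FFFFFFF:
        return SAT_POS
    if result < -0x80000000:
        return SAT_NEG
    return result & MASK40
```

In this sketch a signed result of 8000 0000h clamps to 00 7FFF FFFFh, and the same value with uns applied passes through unchanged because it fits in 32 unsigned bits.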


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with C54CM = 1, overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40-bit result of the round operation:

- If the SST bit = 1 and the SXMD bit = 0, then the saturate, rnd, and uns keywords are applied to the instruction regardless of the optional keywords selected by the user.
- If the SST bit = 1 and the SXMD bit = 1, then only the saturate and rnd keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = HI(saturate(rnd(ACx)))

- If the optional uns keyword is applied to the input operand, then bits 39-32 of the result are compared to 0.
- If the optional uns keyword is not applied to the input operand, then bits 39-31 of the result are compared to bit 39 of the input operand and SXMD.

| Status Bits | Affected by | C54CM, RDM, SST, SXMD |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MOV uns(rnd(HI(saturate(AC0)))), *AR3 | The unsigned content of AC0 is rounded, is saturated, and AC0(31-16) is stored at the location addressed by AR3. |

## Store Accumulator Content to Memory

## Syntax Characteristics


```
Smem = HI (saturate (uns(rnd(ACx << Tx))))
```

If the 16-bit value in Tx is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.

- When the C54CM bit = 0 or the SST bit = 0, the saturate and uns keywords are optional and can be applied or not.
- Input operands are considered signed or unsigned according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is considered unsigned.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is considered signed.
- The input operand is shifted in the D-unit shifter according to SXMD.
- When shifting, the sign position of the input operand is compared to the shift quantity.
  - If the optional uns keyword is applied to the input operand, this comparison is performed against bit 32 of the shifted operand.
  - If the optional uns keyword is not applied, this comparison is performed against bit 31 of the shifted operand that is considered signed (the sign is defined by bit 39 of the input operand and SXMD).
  - An overflow is generated accordingly.
- If the optional rnd keyword is applied to the input operand, rounding is performed in the D-unit shifter according to RDM.
- When a shift or rounding overflow is detected and if the optional saturate keyword is applied to the input operand, the 40-bit output of the operation is saturated:
  - If the optional uns keyword is applied to the input operand, saturation value is 00 FFFF FFFFh.
  - If the optional uns keyword is not applied, saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow).


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with C54CM = 1:

- If the SST bit = 1 and the SXMD bit = 0, then the saturate, rnd, and uns keywords are applied to the instruction regardless of the optional keywords selected by the user.
- If the SST bit = 1 and the SXMD bit = 1, then only the saturate and rnd keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = HI(saturate(rnd(ACx << Tx)))
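The keyword-forcing rule for this HI store form can be summarized as a small Python decision function. It is a sketch of the two bullets above only; the function name `effective_keywords` and the representation of keywords as a set of strings are illustrative choices, not part of any TI tool.

```python
def effective_keywords(user: set, c54cm: int, sst: int, sxmd: int) -> set:
    """Keywords actually applied to the HI store, given the status bits.

    `user` holds the optional keywords written in the source line,
    drawn from {"saturate", "rnd", "uns"}.
    """
    if c54cm == 1 and sst == 1:
        if sxmd == 0:
            # saturate, rnd, and uns are forced regardless of `user`
            return {"saturate", "rnd", "uns"}
        # only saturate and rnd are applied regardless of `user`
        return {"saturate", "rnd"}
    # C54CM = 0 or SST = 0: the programmer's own selection is used
    return set(user)
```

For instance, with C54CM = 1, SST = 1, and SXMD = 1, an instruction written with only uns behaves as if written with saturate and rnd.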

| Status Bits | Affected by | C54CM, RDM, SST, SXMD |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MOV uns(rnd(HI(saturate(AC0 << T0)))), *AR3 | The unsigned content of AC0 is shifted by the content of T0, is rounded, is saturated, and AC0(31-16) is stored at the location addressed by AR3. |

## Store Accumulator Content to Memory

## Syntax Characteristics

- When the C54CM bit = 0 or the SST bit = 0, the saturate and uns keywords are optional and can be applied or not.
- Input operands are considered signed or unsigned according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is considered unsigned.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is considered signed.
- The input operand is shifted by the 6-bit value in the D-unit shifter according to SXMD.
- When shifting, the sign position of the input operand is compared to the shift quantity.
  - If the optional uns keyword is applied to the input operand, this comparison is performed against bit 32 of the shifted operand.
  - If the optional uns keyword is not applied, this comparison is performed against bit 31 of the shifted operand that is considered signed (the sign is defined by bit 39 of the input operand and SXMD).
  - An overflow is generated accordingly.
- If the optional rnd keyword is applied to the input operand, rounding is performed in the D-unit shifter according to RDM.
- When a shift or rounding overflow is detected and if the optional saturate keyword is applied to the input operand, the 40-bit output of the operation is saturated:
  - If the optional uns keyword is applied to the input operand, saturation value is 00 FFFF FFFFh.
  - If the optional uns keyword is not applied, saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow).

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with C54CM = 1, overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40-bit result of the shift and round operation.

- If the SST bit = 1 and the SXMD bit = 0, then the saturate, rnd, and uns keywords are applied to the instruction regardless of the optional keywords selected by the user.
- If the SST bit = 1 and the SXMD bit = 1, then only the saturate and rnd keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  Smem = HI(saturate(rnd(ACx << \#SHIFTW)))

- If the optional uns keyword is applied to the input operand, then bits 39-32 of the result are compared to 0.
- If the optional uns keyword is not applied to the input operand, then bits 39-31 of the result are compared to bit 39 of the input operand and SXMD.


## Store Accumulator Content to Memory

## Syntax Characteristics


Operands ACx, Lmem

Description This instruction stores the content of the accumulator, ACx(31-0), to the data memory operand (Lmem):

dbl(Lmem) = ACx

The store operation to the memory location uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
Status Bits Affected by none

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MOV AC0, dbl(*AR3) | The content of AC0 is stored at the locations addressed by AR3 and AR3 + 1. |
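The 32-bit store can be sketched in Python with memory modeled as a dictionary of 16-bit words. The word order is an assumption: the sketch puts ACx(31-16) at the addressed location and ACx(15-0) at the next one, which matches how the accumulator-pair store in this guide places the 16 highest bits of Lmem, and it only holds for an even address in this model. The name `store_dbl` is illustrative.

```python
def store_dbl(mem: dict, addr: int, acc40: int) -> None:
    """Model of dbl(Lmem) = ACx: write ACx(31-0) as two 16-bit words.

    Guard bits 39-32 are not stored. Word order is an assumption
    (high word at `addr`, low word at `addr + 1`).
    """
    mem[addr] = (acc40 >> 16) & 0xFFFF      # ACx(31-16)
    mem[addr + 1] = acc40 & 0xFFFF          # ACx(15-0)
```

Calling `store_dbl(mem, 0x0300, 0x123456789A)` leaves 3456h at 0300h and 789Ah at 0301h in this model.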

## Store Accumulator Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [12] | MOV [uns(]saturate(ACx)[)], dbl(Lmem) | No | 3 | 1 | X |
| Opcode | 1011 AAAA AAAI xxSS 10u1 |  |  |  |  |
| Operands | ACx, Lmem |  |  |  |  |
| Description | This instruction stores the content of the accumulator, ACx(31-0), to the data memory operand (Lmem): |  |  |  |  |

```
dbl(Lmem) = saturate(uns(ACx))
```

- When the C54CM bit = 0 or the SST bit = 0, the saturate and uns keywords are optional and can be applied or not.
- Input operands are considered signed or unsigned according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is considered unsigned.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is considered signed.
- The 40-bit output of the operation is saturated:
  - If the optional uns keyword is applied to the input operand, saturation value is 00 FFFF FFFFh.
  - If the optional uns keyword is not applied, saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow).
- The store operation to the memory location uses the D-unit shifter.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with C54CM = 1, overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40-bit result of the shift and round operation.

- If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user.





## Store Accumulator Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [14] | MOV ACx, Xmem, Ymem | No | 3 | 1 | X |
| Opcode | 1000 0000 XXXM MMYY YMMM |  |  |  |  |
| Operands | ACx, Xmem, Ymem |  |  |  |  |
| Description | This instruction performs two store operations in parallel: <br> Xmem = LO(ACx) <br> :: Ymem = HI(ACx) |  |  |  |  |

- The 16 lowest bits of the accumulator, ACx(15-0), are stored to data memory operand Xmem.
- The 16 highest bits, ACx(31-16), are stored to data memory operand Ymem.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.


## Example

| Syntax | Description |
| :--- | :--- |
| MOV AC0, *AR1, *AR2 | The content of AC0(15-0) is stored at the location addressed by AR1 and the content of AC0(31-16) is stored at the location addressed by AR2. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AC0 | 01 4500 0030 | AC0 | 01 4500 0030 |
| AR1 | 0200 | AR1 | 0200 |
| AR2 | 0201 | AR2 | 0201 |
| 200 | 3400 | 200 | 0030 |
| 201 | 0FD3 | 201 | 4500 |
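The dual store can be checked against the example values with a short Python model. Memory is represented as a dictionary of 16-bit words, and the name `store_dual` is illustrative; only the LO/HI split described for this instruction is modeled.

```python
def store_dual(mem: dict, xaddr: int, yaddr: int, acc40: int) -> None:
    """Model of Xmem = LO(ACx) :: Ymem = HI(ACx)."""
    mem[xaddr] = acc40 & 0xFFFF            # ACx(15-0) to Xmem
    mem[yaddr] = (acc40 >> 16) & 0xFFFF    # ACx(31-16) to Ymem


# Values taken from the example: AC0 = 01 4500 0030h, AR1 = 0200h, AR2 = 0201h
mem = {0x0200: 0x3400, 0x0201: 0x0FD3}
store_dual(mem, 0x0200, 0x0201, 0x0145000030)
```

After the call, the model holds 0030h at location 200h and 4500h at location 201h, matching the After column.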

## MOV

Store Accumulator Pair Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | MOV pair(HI(ACx)), dbl(Lmem) | No | 3 | 1 | X |
| $[2]$ | MOV pair(LO(ACx)), dbl(Lmem) | No | 3 | 1 | X |

Description This instruction stores the content of the selected accumulator pair, ACx and AC(x + 1), to a data memory operand (Lmem).

Status Bits Affected by none
Affects none

## See Also

See the following other related instructions:

- ADD::MOV (Addition with Parallel Store Accumulator Content to Memory)
- MACM::MOV (Multiply and Accumulate with Parallel Store Accumulator Content to Memory)
- MASM::MOV (Multiply and Subtract with Parallel Store Accumulator Content to Memory)
- MOV (Load Accumulator, Auxiliary, or Temporary Register from Memory)
- MOV (Store Accumulator Content to Memory)
- MOV (Store Accumulator, Auxiliary, or Temporary Register Content to Memory)
- MOV (Store Auxiliary or Temporary Register Pair Content to Memory)
- MOV::MOV (Load Accumulator from Memory with Parallel Store Accumulator Content to Memory)
- MPYM::MOV (Multiply with Parallel Store Accumulator Content to Memory)
- SUB::MOV (Subtraction with Parallel Store Accumulator Content to Memory)

## Store Accumulator Pair Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | MOV pair(LO(ACx)), dbl(Lmem) | No | 3 | 1 | X |
| Opcode | 1011 AAAA AAAI xxSS 1111 |  |  |  |  |
| Operands | ACx, Lmem |  |  |  |  |
| Description | This instruction stores the 16 lowest bits of the accumulator, ACx(15-0), to the 16 highest bits of the data memory operand (Lmem) and stores the 16 lowest bits of AC(x + 1) to the 16 lowest bits of data memory operand (Lmem): <br> Lmem = pair(LO(ACx)) <br> The store operation to the memory location uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs. <br> Valid accumulators are AC0/AC1 and AC2/AC3. |  |  |  |  |

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MOV pair(LO(AC0)), dbl(*AR3) | The content of AC0(15-0) is stored at the location addressed by AR3 and the content of AC1(15-0) is stored at the location addressed by AR3 + 1. |
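The pair store can be modeled in Python as follows. This is a sketch under stated assumptions: accumulators are passed as a four-element list indexed by accumulator number, memory is a dictionary of 16-bit words, and the word placement follows the example above (ACx at the addressed location, AC(x + 1) at the next one). The name `store_pair_lo` is illustrative.

```python
def store_pair_lo(mem: dict, addr: int, acc: list, x: int) -> None:
    """Model of MOV pair(LO(ACx)), dbl(Lmem)."""
    if x not in (0, 2):
        # Only AC0/AC1 and AC2/AC3 are valid accumulator pairs.
        raise ValueError("valid pairs are AC0/AC1 and AC2/AC3")
    mem[addr] = acc[x] & 0xFFFF          # ACx(15-0), high half of Lmem
    mem[addr + 1] = acc[x + 1] & 0xFFFF  # AC(x+1)(15-0), low half of Lmem
```

The validity check mirrors the restriction on accumulator pairs; a real assembler would reject an invalid pair at build time rather than at run time.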

## MOV

Store Accumulator, Auxiliary, or Temporary Register Content to Memory

Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | MOV src, Smem | No | 2 | 1 | X |
| $[2]$ | MOV src, high_byte(Smem) | No | 3 | 1 | X |
| $[3]$ | MOV src, low_byte(Smem) | No | 3 | 1 | X |

Description This instruction stores the content of the selected source (src) register to a memory (Smem) location.

Status Bits Affected by none
Affects none
See Also See the following other related instructions:

- ADD::MOV (Addition with Parallel Store Accumulator Content to Memory)
- MACM::MOV (Multiply and Accumulate with Parallel Store Accumulator Content to Memory)
- MASM::MOV (Multiply and Subtract with Parallel Store Accumulator Content to Memory)
- MOV (Load Accumulator, Auxiliary, or Temporary Register from Memory)
- MOV (Store Accumulator Content to Memory)
- MOV (Store Accumulator Pair Content to Memory)
- MOV (Store Auxiliary or Temporary Register Pair Content to Memory)
- MOV::MOV (Load Accumulator from Memory with Parallel Store Accumulator Content to Memory)
- MPYM::MOV (Multiply with Parallel Store Accumulator Content to Memory)
- SUB::MOV (Subtraction with Parallel Store Accumulator Content to Memory)


## Store Accumulator, Auxiliary, or Temporary Register Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MOV src, Smem | No | 2 | 1 | X |
| Opcode | 1100 FSSS AAAA AAAI |  |  |  |  |
| Operands | Smem, src |  |  |  |  |
| Description | This instruction stores the content of the source (src) register to a memory (Smem) location: <br> Smem = src |  |  |  |  |

- When the source register is an accumulator:
  - The low part of the accumulator, ACx(15-0), is stored to the memory location.
  - The store operation to the memory location uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
- When the source register is an auxiliary or temporary register:
  - The content of the auxiliary or temporary register is stored to the memory location.
  - The store operation to the memory location uses a dedicated path independent of the A-unit ALU.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.
## Example

| Syntax | Description |
| :--- | :--- |
| MOV AC0, *(\#0E10h) | The content of AC0(15-0) is stored at location E10h. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AC0 | 23 0400 6500 | AC0 | 23 0400 6500 |
| 0E10 | 0000 | 0E10 | 6500 |

## Store Accumulator, Auxiliary, or Temporary Register Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | MOV src, high_byte(Smem) | No | 3 | 1 | X |
| Opcode | 1110 0101 AAAA AAAI FSSS 01x0 |  |  |  |  |
| Operands | Smem, src |  |  |  |  |
| Description | This instruction stores the low byte (bits 7-0) of the source (src) register to the high byte (bits 15-8) of the memory (Smem) location. The low byte (bits 7-0) of Smem is unchanged: <br> high_byte(Smem) = src |  |  |  |  |

- When the source register is an accumulator:
  - The low part of the accumulator, ACx(7-0), is stored to the high byte of the memory location.
  - The store operation to the memory location uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
- When the source register is an auxiliary or temporary register:
  - The low part (bits 7-0) of the auxiliary or temporary register is stored to the high byte of the memory location.
  - The store operation to the memory location uses a dedicated path independent of the A-unit ALU.
- In this instruction, Smem cannot reference a memory-mapped register (MMR). This instruction cannot access a byte within an MMR. If Smem is an MMR, the DSP sends a hardware bus-error interrupt (BERRINT) request to the CPU.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.
## Example

| Syntax | Description |
| :--- | :--- |
| MOV AC1, high_byte(*AR1) | The content of AC1(7-0) is stored in the high byte (bits 15-8) at the location addressed by AR1. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AC1 | 20 FC00 6788 | AC1 | 20 FC00 6788 |
| AR1 | 0200 | AR1 | 0200 |
| 200 | 6903 | 200 | 8803 |
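The byte store can be checked against the example values with a short Python model. Memory is a dictionary of 16-bit words and the name `store_high_byte` is illustrative; the MMR restriction and the bus-error interrupt are not modeled.

```python
def store_high_byte(mem: dict, addr: int, src: int) -> None:
    """Model of high_byte(Smem) = src.

    Bits 7-0 of the source replace bits 15-8 of the memory word;
    bits 7-0 of the memory word are left unchanged.
    """
    mem[addr] = ((src & 0xFF) << 8) | (mem[addr] & 0x00FF)


# Values taken from the example: AC1 = 20 FC00 6788h, AR1 = 0200h
mem = {0x0200: 0x6903}
store_high_byte(mem, 0x0200, 0x20FC006788)
```

The word at location 200h changes from 6903h to 8803h in this model, matching the After column.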

## Store Accumulator, Auxiliary, or Temporary Register Content to Memory

## Syntax Characteristics



## MOV

Store Auxiliary or Temporary Register Pair Content to Memory
Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MOV pair(TAx), dbl(Lmem) | No | 3 | 1 | X |
| Opcode | 1011 AAAA AAAI FSSS 1100 |  |  |  |  |
| Operands | TAx, Lmem |  |  |  |  |
| Description | This instruction stores the content of the temporary or auxiliary register (TAx) to the 16 highest bits of the data memory operand (Lmem) and stores the content of TA(x + 1) to the 16 lowest bits of data memory operand (Lmem). <br> The store operation to the memory location uses a dedicated path independent of the A-unit ALU. <br> Valid auxiliary registers are AR0, AR2, AR4, and AR6. <br> Valid temporary registers are T0 and T2. |  |  |  |  |

Status Bits Affected by none

Repeat This instruction can be repeated.

See Also See the following other related instructions:

- MOV (Load Accumulator, Auxiliary, or Temporary Register from Memory)
- MOV (Store Accumulator, Auxiliary, or Temporary Register Content to Memory)

## Example

| Syntax | Description |
| :--- | :--- |
| MOV pair(T0), dbl(*AR2) | The content of T0 is stored at the location addressed by AR2 and the content of T1 is stored at the location addressed by AR2 + 1. |

## MOV

Store CPU Register Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MOV BK03, Smem | No | 3 | 1 | X |
| [2] | MOV BK47, Smem | No | 3 | 1 | X |
| [3] | MOV BKC, Smem | No | 3 | 1 | X |
| [4] | MOV BSA01, Smem | No | 3 | 1 | X |
| [5] | MOV BSA23, Smem | No | 3 | 1 | X |
| [6] | MOV BSA45, Smem | No | 3 | 1 | X |
| [7] | MOV BSA67, Smem | No | 3 | 1 | X |
| [8] | MOV BSAC, Smem | No | 3 | 1 | X |
| [9] | MOV BRC0, Smem | No | 3 | 1 | X |
| [10] | MOV BRC1, Smem | No | 3 | 1 | X |
| [11] | MOV CDP, Smem | No | 3 | 1 | X |
| [12] | MOV CSR, Smem | No | 3 | 1 | X |
| [13] | MOV DP, Smem | No | 3 | 1 | X |
| [14] | MOV DPH, Smem | No | 3 | 1 | X |
| [15] | MOV PDP, Smem | No | 3 | 1 | X |
| [16] | MOV SP, Smem | No | 3 | 1 | X |
| [17] | MOV SSP, Smem | No | 3 | 1 | X |
| [18] | MOV TRN0, Smem | No | 3 | 1 | X |
| [19] | MOV TRN1, Smem | No | 3 | 1 | X |
| [20] | MOV RETA, dbl(Lmem) | No | 3 | 5 | X |

Description These instructions store the content of the selected source CPU register to a memory (Smem) location or a data memory operand (Lmem).

For instructions [9] and [10], the block repeat register (BRCx) is decremented in the address phase of the last instruction of the loop. These instructions have a 3-cycle latency requirement versus the last instruction of the loop.

For instruction [20], the content of the 24-bit RETA register (the return address of the calling subroutine) and the 8-bit CFCT register (active control flow execution context flags of the calling subroutine) are stored to the data memory operand (Lmem):

## Example 2

| Syntax | Description |
| :--- | :--- |
| MOV SSP, *AR1+ | The content of the system stack pointer (SSP) is stored in the location addressed <br> by AR1. AR1 is incremented by 1. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AR1 | 0201 | AR1 |  |
| SSP |  | SSP |  |
| 201 | 0000 | 201 |  |

## Example 3

| Syntax | Description |
| :--- | :--- |
| MOV TRN0, *AR1+ | The content of the transition register (TRN0) is stored in the location addressed <br> by AR1. AR1 is incremented by 1. |


| Before | After |  |  |
| :--- | :--- | :--- | :--- |
| AR1 | 0202 | AR1 | 0203 |
| TRN0 | 3490 | TRN0 | 3490 |
| 202 | 0000 | 202 | 3490 |

## Example 4

| Syntax | Description |
| :--- | :--- |
| MOV TRN1, *AR1+ | The content of the transition register (TRN1) is stored in the location addressed <br> by AR1. AR1 is incremented by 1. |


| Before | After |  |  |
| :--- | :--- | :--- | :--- |
| AR1 | 0203 | AR1 | 0204 |
| TRN1 | 0020 | TRN1 | 0020 |
| 203 | 0000 | 203 | 0020 |

## Example 5

| Syntax | Description |
| :--- | :--- |
| MOV RETA, dbl(*AR3) | The contents of the RETA and CFCT are stored in the location addressed by AR3 <br> and AR3 + 1. |

Table 5-5. Opcodes for Store CPU Register Content to Memory Instruction

| No. | Syntax | Opcode |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MOV BK03, Smem | 1110 | 0101 | AAAA | AAAI | 1001 | 10xx |
| [2] | MOV BK47, Smem | 1110 | 0101 | AAAA | AAAI | 1010 | 10xx |
| [3] | MOV BKC, Smem | 1110 | 0101 | AAAA | AAAI | 1011 | 10xx |
| [4] | MOV BSA01, Smem | 1110 | 0101 | AAAA | AAAI | 0010 | 10xx |
| [5] | MOV BSA23, Smem | 1110 | 0101 | AAAA | AAAI | 0011 | 10xx |
| [6] | MOV BSA45, Smem | 1110 | 0101 | AAAA | AAAI | 0100 | 10xx |
| [7] | MOV BSA67, Smem | 1110 | 0101 | AAAA | AAAI | 0101 | 10xx |
| [8] | MOV BSAC, Smem | 1110 | 0101 | AAAA | AAAI | 0110 | 10xx |
| [9] | MOV BRC0, Smem | 1110 | 0101 | AAAA | AAAI | x001 | 11xx |
| [10] | MOV BRC1, Smem | 1110 | 0101 | AAAA | AAAI | x010 | 11xx |
| [11] | MOV CDP, Smem | 1110 | 0101 | AAAA | AAAI | 0001 | 10xx |
| [12] | MOV CSR, Smem | 1110 | 0101 | AAAA | AAAI | x000 | 11xx |
| [13] | MOV DP, Smem | 1110 | 0101 | AAAA | AAAI | 0000 | 10xx |
| [14] | MOV DPH, Smem | 1110 | 0101 | AAAA | AAAI | 1100 | 10xx |
| [15] | MOV PDP, Smem | 1110 | 0101 | AAAA | AAAI | 1111 | 10xx |
| [16] | MOV SP, Smem | 1110 | 0101 | AAAA | AAAI | 0111 | 10xx |
| [17] | MOV SSP, Smem | 1110 | 0101 | AAAA | AAAI | 1000 | 10xx |
| [18] | MOV TRN0, Smem | 1110 | 0101 | AAAA | AAAI | x011 | 11xx |
| [19] | MOV TRN1, Smem | 1110 | 0101 | AAAA | AAAI | x100 | 11xx |
| [20] | MOV RETA, dbl(Lmem) | 1110 | 1011 | AAAA | AAAI | xxxx | 01xx |

## MOV

## Syntax Characteristics



## Example

| Syntax | Description |
| :--- | :--- |
| MOV XAR1, dbl(*AR3) | The 7 highest bits of XAR1 are moved to the 7 lowest bits of the location <br> addressed by AR3, the 9 highest bits are filled with 0, and the 16 lowest bits of <br> XAR1 are moved to the location addressed by AR3 + 1. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| XAR1 | 7F 3492 | XAR1 | 7F 3492 |
| AR3 | 0200 | AR3 | 0200 |
| 200 | 3765 | 200 | 007F |
| 201 | 0FD3 | 201 | 3492 |
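The split described in this example can be checked with a short Python sketch. `store_xar` is a hypothetical helper name, not part of any TI tool; it only models how the 23-bit register value is divided into the two 16-bit words.

```python
def store_xar(xar):
    """Split a 23-bit extended auxiliary register into the two stored words.

    Returns (word at the addressed location, word at the next location).
    """
    high_word = (xar >> 16) & 0x7F   # 7 highest bits; the 9 upper bits of the word are 0
    low_word = xar & 0xFFFF          # 16 lowest bits
    return high_word, low_word


words = store_xar(0x7F3492)  # XAR1 value from the example
```

With the XAR1 value from the example, the two words are 007F and 3492, which agrees with the description of locations 200 and 201.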

## MOV::MOV

Load Accumulator from Memory with Parallel Store Accumulator Content to Memory

## Syntax Characteristics



The first operation loads the content of data memory operand Xmem shifted left by 16 bits to the accumulator ACy.

- The input operand is sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- The input operand is shifted left by 16 bits according to M40.

The second operation shifts the accumulator ACx by the content of T2 and stores ACx(31-16) to data memory operand Ymem. If the 16-bit value in T2 is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.

- The input operand is shifted in the D-unit shifter according to SXMD.
- After the shift, the high part of the accumulator, ACx(31-16), is stored to the memory location.
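The shift-count saturation and the high-word store of the second operation can be sketched as follows. This is a simplified model under stated assumptions: `clamp_shift` and `shifted_high_word` are hypothetical names, the accumulator is a signed Python integer, and rounding, saturation, and the M40/SXMD modes are not modeled.

```python
def clamp_shift(t2):
    """Saturate a signed 16-bit T2 value to the -32..+31 shift range."""
    return max(-32, min(31, t2))


def shifted_high_word(acx, t2):
    """Return bits 31-16 of ACx after shifting by the clamped T2 value.

    Simplified: positive counts shift left, negative counts shift right
    (arithmetic shift); no rounding or saturation is applied.
    """
    shift = clamp_shift(t2)
    shifted = acx << shift if shift >= 0 else acx >> -shift
    return (shifted >> 16) & 0xFFFF
```

For instance, a T2 value of 100 is treated as +31 and a value of -40 as -32 before the shift is performed.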


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When this instruction is executed with C54CM = 1, the 6 LSBs of T2 determine the shift quantity. The 6 LSBs of T2 define a shift quantity within -32 to +31. When the 16-bit value in T2 is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.

- If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  ACy = Xmem << \#16 <br> Ymem = HI(saturate(uns(ACx << T2)))

- If the SST bit = 1 and the SXMD bit = 1, then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  ACy = Xmem << \#16 <br> Ymem = HI(saturate(ACx << T2))
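The C54CM = 1 shift-count rule stated at the start of this compatibility section can be read as the following Python sketch. It is one interpretation of the text, not a verified hardware model: `c54cm_shift_count` is a hypothetical name, and the "modulo 16" step is taken to mean adding 16 to counts in the range -32 to -17.

```python
def c54cm_shift_count(t2):
    """Shift count derived from the 6 LSBs of T2 when C54CM = 1 (assumed reading)."""
    count = t2 & 0x3F
    if count >= 32:          # interpret the 6 bits as a signed value in -32..+31
        count -= 64
    if -32 <= count <= -17:  # "modulo 16" step: map -32..-17 onto -16..-1
        count += 16
    return count
```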
Status Bits Affected by C54CM, M40, RDM, SATD, SST, SXMD

Affects ACOVy

Repeat This instruction can be repeated.

See Also See the following other related instructions:

- MOV (Load Accumulator from Memory)
- MOV (Load Accumulator Pair from Memory)
- MOV (Load Accumulator with Immediate Value)
- MOV (Load Accumulator, Auxiliary, or Temporary Register from Memory)
- MOV (Load Accumulator, Auxiliary, or Temporary Register with Immediate Value)

## Example

| Syntax | Description |
| :--- | :--- |
| MOV *AR3 << \#16, AC0 <br> :: MOV HI(AC1 << T2), *AR4 | Both instructions are performed in parallel. The content addressed by AR3 shifted left by 16 bits is stored in AC0. The content of AC1 is shifted by the content of T2, and AC1(31-16) is stored at the address of AR4. |

## MPY

## Multiply

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MPY[R] [ACx,] ACy | Yes | 2 | 1 | X |
| [2] | MPY[R] Tx, [ACx, ] ACy | Yes | 2 | 1 | X |
| [3] | MPYK[R] K8, [ACx, ] ACy | Yes | 3 | 1 | X |
| [4] | MPYK[R] K16, [ACx,] ACy | No | 4 | 1 | X |
| [5] | MPYM[R] [T3 = ]Smem, Cmem, ACx | No | 3 | 1 | X |
| [6] | MPYM[R] [T3 = ]Smem, [ACx, ] ACy | No | 3 | 1 | X |
| [7] | MPYMK[R] [T3 = ]Smem, K8, ACx | No | 4 | 1 | X |
| [8] | MPYM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], ACx | No | 4 | 1 | X |
| [9] | MPYM[R][U] [T3 = ]Smem, Tx, ACx | No | 3 | 1 | X |
| [10] | MPY[R] Smem, uns(Cmem), ACx | No | 3 | 1 | X |

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are:

- ACx(32-16)
- The content of Tx, sign extended to 17 bits
- The 8-bit signed constant, K8, sign extended to 17 bits
- The 16-bit signed constant, K16, sign extended to 17 bits
- The content of a memory location (Smem), sign extended to 17 bits
- The content of a data memory operand Cmem, addressed using the coefficient addressing mode, sign extended to 17 bits
- The content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand Ymem, extended to 17 bits
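The operand preparation listed above can be illustrated with a few Python helpers. The function names are hypothetical and serve only to show what sign extension to 17 bits, zero extension to 17 bits, and the ACx(32-16) field mean numerically.

```python
def sign_extend_17(word):
    """Sign extend a 16-bit register or memory word to the 17-bit multiplier input."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def zero_extend_17(word):
    """Zero extend a 16-bit word to 17 bits (the uns() case)."""
    return word & 0xFFFF


def acc_bits_32_16(acc):
    """Extract ACx(32-16) from a 40-bit accumulator as a signed 17-bit value."""
    field = (acc >> 16) & 0x1FFFF
    return field - 0x20000 if field & 0x10000 else field
```

The same 16-bit pattern 8000h is therefore -32768 when sign extended and +32768 when zero extended.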

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL

Affects ACOVx, ACOVy

See Also See the following other related instructions:

- AMAR::MPY (Modify Auxiliary Register Content with Parallel Multiply)
- MAC (Multiply and Accumulate)
- MAC::MAS (Multiply and Accumulate with Parallel Multiply and Subtract)
- MAC::MPY (Multiply and Accumulate with Parallel Multiply)
- MAS (Multiply and Subtract)
- MAS::MPY (Multiply and Subtract with Parallel Multiply)
- MPY::MAC (Multiply with Parallel Multiply and Accumulate)
- MPY::MAS (Multiply with Parallel Multiply and Subtract)
- MPY::MPY (Parallel Multiplies)
- MPYM::MOV (Multiply with Parallel Store Accumulator Content to Memory)
- SQR (Square)

Multiply

## Syntax Characteristics



| Syntax | Description |
| :--- | :--- |
| MPY AC1, AC0 | The product of the content of AC1 and the content of AC0 is stored in AC1. |


| Before |  |  | After |  |  |  |  |
| :--- | ---: | ---: | ---: | :--- | ---: | ---: | ---: |
| AC0 | 02 | 6000 | 3400 | AC0 | 02 | 6000 | 3400 |
| AC1 | 00 | C000 | 0000 | AC1 | 00 | 4800 | 0000 |
| M40 |  |  | 1 | M40 |  |  | 1 |
| FRCT |  |  | 0 | FRCT |  |  | 0 |
| ACOV1 |  |  | 0 | ACOV1 |  |  | 0 |
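The arithmetic behind this example can be reproduced in Python. The sketch assumes FRCT = 0 (as shown in the table) and no rounding; `acc_bits_32_16` is a hypothetical helper that extracts the signed 17-bit multiplier input from a 40-bit accumulator.

```python
def acc_bits_32_16(acc):
    """Extract ACx(32-16) from a 40-bit accumulator as a signed 17-bit value."""
    field = (acc >> 16) & 0x1FFFF
    return field - 0x20000 if field & 0x10000 else field


ac0 = 0x0260003400  # AC0 before, written "02 6000 3400" in the table
ac1 = 0x00C0000000  # AC1 before, written "00 C000 0000" in the table

# FRCT = 0, so the multiplier output is not shifted left
product = acc_bits_32_16(ac0) * acc_bits_32_16(ac1)
result_40 = product & 0xFFFFFFFFFF
formatted = f"{result_40 >> 32:02X} {(result_40 >> 16) & 0xFFFF:04X} {result_40 & 0xFFFF:04X}"
```

Only bits 32-16 of each accumulator enter the multiplier (6000h and C000h here), and their product, 00 4800 0000, is the accumulator value that changes in the After column.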

## Multiply

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | MPY[R] Tx, [ACx,] ACy | Yes | 2 | 1 | X |

Opcode 0101 100E | DDSS ss0%

Operands ACx, ACy, Tx

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are ACx(32-16) and the content of Tx, sign extended to 17 bits:

ACy = ACx * Tx

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
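The steps above can be sketched as a small Python model. It is deliberately incomplete and rests on assumptions that are not stated in this section: overflow is checked against the signed 32-bit range when M40 = 0 and the signed 40-bit range when M40 = 1, saturation clamps to the limit of that range when SATD = 1, and rounding (R keyword, RDM) and SMUL-dependent behavior are left out. All function names are hypothetical.

```python
MASK_40 = 0xFFFFFFFFFF


def sign_extend_17(word):
    """Sign extend a 16-bit word to the 17-bit multiplier input."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def acc_bits_32_16(acc):
    """Extract ACx(32-16) as a signed 17-bit value."""
    field = (acc >> 16) & 0x1FFFF
    return field - 0x20000 if field & 0x10000 else field


def mpy(acx, tx, frct=0, m40=0, satd=0):
    """Simplified model of ACy = ACx * Tx; returns (acy, overflow_detected)."""
    product = acc_bits_32_16(acx) * sign_extend_17(tx)
    if frct:
        product <<= 1                      # FRCT = 1: shift multiplier output left by 1
    bits = 40 if m40 else 32               # assumed overflow boundary
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    overflow = not low <= product <= high
    if overflow and satd:
        product = high if product > high else low
    return product & MASK_40, overflow
```

For example, `mpy(0x00C0000000, 0x6000)` returns the 40-bit pattern 00 4800 0000 with no overflow, while the FRCT = 1 product of FFFFh (from ACx) and 7FFFh exceeds the 32-bit range in this model and is clamped to 00 7FFF FFFF when `satd=1`.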


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

| Status Bits | Affected by FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- |
|  | Affects ACOVy |
| Repeat | This instruction can be repeated. |
| Example |  |
| Syntax | Description |
| MPY T0, AC1, AC0 | The product of the content of AC1 and the content of T0 is stored in AC0. |

Multiply

## Syntax Characteristics



## Multiply

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | MPYK[R] K16, [ACx,] ACy | No | 4 | 1 | X |

## Opcode

Operands ACx, ACy, K16

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are ACx(32-16) and the 16-bit signed constant, K16, sign extended to 17 bits:

ACy = ACx * K16

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL

Affects ACOVy

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MPYK \#-64, AC1, AC0 | The product of the content of AC1 and a signed 16-bit value (-64) is stored in <br> AC0. |

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [5] | MPYM[R] [T3 = ]Smem, Cmem, ACx | No | 3 | 1 | X |

Opcode 1101 0001 | AAAA AAAI | U%DD 00mm

Operands ACx, Cmem, Smem

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of a memory location (Smem), sign extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and sign extended to 17 bits:

ACx = Smem * Cmem

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.


## Multiply

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [6] | MPYM[R] [T3 = ]Smem, [ACx,] ACy | No | 3 | 1 | X |

Opcode 1101 0011 | AAAA AAAI | U%DD 00SS

Operands ACx, ACy, Smem

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are ACx(32-16) and the content of a memory location (Smem), sign extended to 17 bits:

ACy = Smem * ACx

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

## Status Bits

Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVy
Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MPYM *AR3, AC1, AC0 | The product of the content addressed by AR3 and the content of AC1 is stored in <br> AC0. |

## Multiply

## Syntax Characteristics



Multiply

## Syntax Characteristics



Operands ACx, Xmem, Ymem

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand Ymem, extended to 17 bits:

ACx = Xmem * Ymem

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.


## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [9] | MPYM[R][U] [T3 = ]Smem, Tx, ACx | No | 3 | 1 | X |

Opcode 1101 0011 | AAAA AAAI | U%DD u1ss

Operands ACx, Smem, Tx

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of a memory location (Smem), sign extended to 17 bits:

ACx = Tx * Smem

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is extended to 40 bits according to U.
  - If the optional U keyword is applied to the instruction, the 32-bit result is zero extended to 40 bits.
  - If the optional U keyword is not applied to the instruction, the 32-bit result is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
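The effect of the U keyword on the extension of the 32-bit result can be illustrated with a short Python sketch. `extend_to_40` is a hypothetical helper; it models only the extension step and none of the other steps in the list.

```python
def extend_to_40(result_32, unsigned=False):
    """Extend a 32-bit multiplier result to a 40-bit accumulator value.

    unsigned=True models the U keyword (zero extension); otherwise the
    result is sign extended.
    """
    result_32 &= 0xFFFFFFFF
    if unsigned or not result_32 & 0x80000000:
        return result_32                  # guard bits 39-32 are 00
    return result_32 | 0xFF00000000       # guard bits 39-32 are FF
```

A result of 8000 0000h thus becomes 00 8000 0000 with the U keyword and FF 8000 0000 without it.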

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

| Status Bits | Affected by FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- |
| Repeat | This instruction can be repeated. |
| Example |  |
| Syntax | Description |
| MPYMU *AR3, T0, AC0 | The product of the content addressed by AR3 and the content of T0 is stored as <br> an unsigned result in AC0. |

Multiply

## Syntax Characteristics



## Note:

The uns keyword is mandatory for this instruction.

The data memory operand Smem is addressed by DAGEN path X using the Smem addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand, Cmem, is addressed by DAGEN path C using the coefficient addressing mode, driven on data bus BDB, and zero extended to 17 bits in the MAC1.

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, these buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

This instruction can be used to compute the intermediate multiplication result of a double-precision multiplication. Because it leaves one DAGEN operator (DAGEN path Y) free, a store instruction can be executed in parallel with it.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- |
|  | Affects ACOVx |
| Repeat | This instruction can be repeated. |

## Example

| Syntax | Description |
| :--- | :--- |
| MPY *AR3-, uns(*CDP+), AC0 | The product of the content addressed by AR3 and the content addressed <br> by the coefficient data pointer register (CDP) is stored in AC0. AR3 is decremented by 1 and CDP is incremented by 1. |


| Execution |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
| rnd((Smem)[16:0] * uns(Cmem)[16:0]) -> ACx |  |  |  |  |  |
| Before |  |  | After |  |  |
| AC0 | FF 8000 | 0000 | AC0 | FF FF00 | 0000 |
| XAR3 | 00 | 1001 | XAR3 | 00 | 1000 |
| Data memory |  |  |  |  |  |
| 1001h |  | FE00 | 1001h |  | FE00 |
| XCDP | 00 | 2000 | XCDP | 00 | 2000 |
| Coeff memory |  |  |  |  |  |
| 2000h |  | 8000 | 2000h |  | 8000 |
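The After value of AC0 in this example can be reproduced with a few lines of Python. The sketch assumes FRCT = 0 and that rounding has no effect on this particular product (its 16 lowest bits are already zero); the variable names are illustrative.

```python
smem = 0xFE00  # data memory at 1001h, sign extended to 17 bits
cmem = 0x8000  # coefficient memory at 2000h, zero extended because of uns()

smem_17 = smem - 0x10000 if smem & 0x8000 else smem  # -512
cmem_17 = cmem                                       # +32768

product = smem_17 * cmem_17          # FRCT assumed to be 0: no left shift
ac0 = product & 0xFFFFFFFFFF         # 40-bit two's-complement pattern
formatted = f"{ac0 >> 32:02X} {(ac0 >> 16) & 0xFFFF:04X} {ac0 & 0xFFFF:04X}"
```

The signed operand FE00h (-512) times the unsigned operand 8000h (32768) gives -16777216, whose 40-bit pattern is FF FF00 0000.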

## MPY::MAC

Multiply with Parallel Multiply and Accumulate

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MPY[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16 | No | 4 | 1 | X |
| [2] | MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [3] | MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [4] | MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | No | 5 | 1 | X |


Description These instructions perform two parallel operations in one cycle: multiply and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD

Affects ACOVx, ACOVy

See Also See the following other related instructions:

- MAC (Multiply and Accumulate)
- MAC::MAC (Parallel Multiply and Accumulates)
- MPY (Multiply)

Multiply with Parallel Multiply and Accumulate

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MPY[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16 | No | 4 | 1 | X |

Opcode 1000 0100 | XXXM MMYY | YMMM 10mm | UuDD DDg%

Operands ACx, ACy, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: multiply and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs:

ACx = Xmem * Cmem <br> :: ACy = (ACy >> \#16) + (Ymem * Cmem)

The first operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy shifted right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACy(39).
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
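The data flow of the two parallel operations can be sketched in Python. This is a simplified model with hypothetical names: a single `uns_c` flag is used for Cmem in both operations, and rounding, overflow detection, saturation, and the 40 keyword are not modeled.

```python
MASK_40 = 0xFFFFFFFFFF


def extend_17(word, unsigned):
    """Zero extend (uns) or sign extend a 16-bit word to 17 bits."""
    word &= 0xFFFF
    if unsigned or word < 0x8000:
        return word
    return word - 0x10000


def to_signed_40(acc):
    """Interpret a 40-bit accumulator pattern as a signed integer."""
    acc &= MASK_40
    return acc - (1 << 40) if acc & (1 << 39) else acc


def mpy_mac(xmem, ymem, cmem, acy, uns_x=False, uns_y=False, uns_c=False, frct=0):
    """Return (ACx, ACy) for ACx = Xmem * Cmem :: ACy = (ACy >> #16) + (Ymem * Cmem)."""
    c = extend_17(cmem, uns_c)
    p1 = extend_17(xmem, uns_x) * c
    p2 = extend_17(ymem, uns_y) * c
    if frct:                      # FRCT = 1: both multiplier outputs shift left by 1
        p1 <<= 1
        p2 <<= 1
    # Python's >> on a negative int is arithmetic, matching the sign extension of ACy(39)
    acy_new = (to_signed_40(acy) >> 16) + p2
    return p1 & MASK_40, acy_new & MASK_40
```

With small illustrative operands, `mpy_mac(2, 3, 4, 0x0000050000)` yields ACx = 8 and ACy = 5 + 12 = 17.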

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- AMAR Xmem
- AMAR Ymem
- AMAR Cmem

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| MPY uns(*AR3), uns(*CDP), AC0 <br> :: MAC uns(*AR4), uns(*CDP), AC1 >> \#16 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. |

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

Opcode

Operands ACx, ACy, Cmem, Smem

Description This instruction performs two parallel operations in one cycle: multiply and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs:

ACy = Smem * HI(Cmem) <br> :: ACx = ACx + (Smem * LO(Cmem))

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA + 1 when EA is even, EA - 1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
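The address rule for the HI(Cmem) and LO(Cmem) halves (EA for the high part; EA + 1 when EA is even, EA - 1 when EA is odd for the low part) can be written as a small Python sketch. `hi_lo_addresses` is a hypothetical helper name.

```python
def hi_lo_addresses(ea):
    """Return (address of HI(Cmem), address of LO(Cmem)) for effective address ea."""
    lo = ea + 1 if ea % 2 == 0 else ea - 1
    return ea, lo
```

The rule amounts to toggling the least significant address bit, so `ea ^ 1` gives the same low-part address.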

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, these buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- |
|  | Affects ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |

## Example

| Syntax | Description |
| :--- | :--- |
| MPY uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MAC uns(*AR3-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of the CDP is added to the content of AC0. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |

[^9]

Multiply with Parallel Multiply and Accumulate

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [3] | MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

Opcode 1111 1101 | AAAA AAAI | 0100 01mm | DDDD uug%

Operands ACx, ACy, Cmem, Lmem

Description This instruction performs two parallel operations in one cycle: multiply and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs:

```
ACy = HI (Lmem) * HI (Cmem)
:: ACx = ACx + (LO(Lmem) * LO(Cmem))
```

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(Cmem). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA + 1 when EA is even, EA - 1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA + 1 when EA is even, EA - 1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
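The HI()/LO() address pairing and the two parallel results described above can be modeled in a few lines. The following Python sketch is illustrative only: the function names `hi_lo_addresses` and `mpy_mac_lmem` and the dictionary-based memory are assumptions made for this example, and the model assumes signed operands (no uns), FRCT = 0, no rounding, and no overflow or saturation handling.

```python
def hi_lo_addresses(ea):
    """Return (hi_addr, lo_addr) for a long-word access at effective address ea.

    HI() uses EA itself; LO() uses the next address of EA:
    EA+1 when EA is even, EA-1 when EA is odd.
    """
    lo = ea + 1 if ea % 2 == 0 else ea - 1
    return ea, lo


def s16(word):
    """Interpret a 16-bit memory word as a signed value (sign extension)."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def mpy_mac_lmem(mem, ea_x, ea_c, acx):
    """Model ACy = HI(Lmem) * HI(Cmem) :: ACx = ACx + LO(Lmem) * LO(Cmem).

    mem is a hypothetical dict mapping word addresses to 16-bit words.
    Returns the new (ACx, ACy) as plain Python integers.
    """
    hi_x, lo_x = hi_lo_addresses(ea_x)
    hi_c, lo_c = hi_lo_addresses(ea_c)
    acy = s16(mem[hi_x]) * s16(mem[hi_c])        # MAC2: multiply only
    acx = acx + s16(mem[lo_x]) * s16(mem[lo_c])  # MAC1: multiply and accumulate
    return acx, acy
```

For an even EA such as 2000h the pair is (2000h, 2001h); for an odd EA such as 10FFh it is (10FFh, 10FEh).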

## Compatibility with C54x devices (C54CM = 1)

None.

## Status Bits

Affected by FRCT, M40, RDM, SATD, SMUL

Affects ACOVx, ACOVy

## Repeat

This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MPY uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 <br> :: MAC uns(LO(*AR3-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of the CDP is added to the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. |



## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | No | 5 (*) | 1 | X |

(*) 1 LSB is allocated to instruction slot \#2.

## Opcode

| 1001 0010 | XXXM MMYY | YMMM 01mm | uuDD DDg% |
| :--- | :--- | :--- | :--- |

## Operands

ACx, ACy, Cmem, Xmem, Ymem

## Description

This instruction performs two parallel operations in one cycle: multiply and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs:

ACy = Ymem * HI(Cmem)
:: ACx = ACx + (Xmem * LO(Cmem))

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the content of data memory operand HI(Cmem), which is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(Cmem), which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
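As a rough illustration of the operand extension and FRCT rules listed above, the following Python sketch models a single multiplier input path. The names `extend17` and `multiply` are hypothetical, the sketch assumes that sign extension is in effect whenever uns is not applied (the SXMD dependency is not modeled), and SMUL overflow detection is left out.

```python
def extend17(word16, uns=False):
    """Extend a 16-bit memory word to a 17-bit multiplier input.

    With uns the word is zero extended (0..65535); otherwise it is
    sign extended (-32768..32767). The value is returned as a Python int.
    """
    word16 &= 0xFFFF
    if uns or not (word16 & 0x8000):
        return word16
    return word16 - 0x10000


def multiply(x16, y16, x_uns=False, y_uns=False, frct=0):
    """Multiply two extended operands; shift left by 1 bit when FRCT = 1."""
    product = extend17(x16, x_uns) * extend17(y16, y_uns)
    return product << 1 if frct else product
```

For example, the word FE00h is read as 65024 with uns and as -512 without it, so the same memory contents can give very different products.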

## Compatibility with C54x devices (C54CM = 1)

None.

## Status Bits

Affected by FRCT, M40, RDM, SATD, SMUL, SXMD

Affects ACOVx, ACOVy

## Repeat

This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MPY uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MAC uns(*AR2-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is added to the content of AC0. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |



## MPY::MAS

Multiply with Parallel Multiply and Subtract

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [2] | MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [3] | MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | No | 5 | 1 | X |


## Description

These instructions perform two parallel operations in one cycle: multiply and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

## Status Bits

Affected by FRCT, M40, RDM, SATD, SMUL

Affects ACOVx, ACOVy

## See Also

See the following other related instructions:

- MPY (Multiply)
- MAS (Multiply and Subtract)
- MAS::MAS (Parallel Multiply and Subtracts)

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

| 11111101 | AAAA | AAAI 0000 | 1 | | D | uug% |

## Operands

ACx, ACy, Cmem, Smem

## Description

This instruction performs two parallel operations in one cycle: multiply and multiply and subtract (MAS). The operations are executed in the two D-unit MACs:

```
ACy = Smem * HI(Cmem)
:: ACx = ACx - (Smem * LO(Cmem))
```

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

## Status Bits

Affected by FRCT, M40, RDM, SATD, SMUL

Affects ACOVx, ACOVy

## Repeat

This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MPY uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MAS uns(*AR3-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of the CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |

[^10]

| Register / location | Before | After |
| :--- | :---: | :---: |
| AC0 | 00 0000 8000 | FF C080 8000 |
| AC1 | FF 8000 0000 | 00 7F00 0000 |
| XAR3 | 00 10FF | 00 10FE |
| Data memory 10FFh | FE00 | FE00 |
| XCDP | 00 2000 | 00 2002 |
| Coeff memory 2000h | 8000 | 8000 |
| Coeff memory 2001h | 4000 | 4000 |
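The Before/After values in this example can be checked with ordinary integer arithmetic. The sketch below is a hypothetical Python model, not TI tooling; it assumes FRCT = 0, no rounding, and no overflow, and the helper name `fmt40` is invented here to print a value as a 40-bit accumulator.

```python
MASK40 = (1 << 40) - 1


def fmt40(value):
    """Format an integer as a 40-bit accumulator: guard bits, high word, low word."""
    v = value & MASK40
    return f"{v >> 32:02X} {(v >> 16) & 0xFFFF:04X} {v & 0xFFFF:04X}"


# Initial state taken from the Before column of the example.
mem = {0x10FF: 0xFE00, 0x2000: 0x8000, 0x2001: 0x4000}
ar3, cdp = 0x10FF, 0x2000
ac0 = 0x0000008000

smem = mem[ar3]        # uns(*AR3-): unsigned read, shared by both MACs
hi_c = mem[cdp]        # uns(HI(*CDP+)): word at EA
lo_c = mem[cdp + 1]    # uns(LO(*CDP+)): EA is even, so LO is at EA+1

ac1 = smem * hi_c          # MPY result goes to AC1
ac0 = ac0 - smem * lo_c    # MAS result: AC0 minus the product

ar3 -= 1               # *AR3- post-decrement by 1
cdp += 2               # *CDP+ with HI/LO post-increment by 2

print(fmt40(ac0), fmt40(ac1))
```

Under these assumptions the model gives AC0 = FF C080 8000 and AC1 = 00 7F00 0000, which agrees with the After column.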

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Operands

ACx, ACy, Cmem, Lmem

## Description

This instruction performs two parallel operations in one cycle: multiply and multiply and subtract (MAS). The operations are executed in the two D-unit MACs:

ACy = HI(Lmem) * HI(Cmem)
:: ACx = ACx - (LO(Lmem) * LO(Cmem))

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(Cmem). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

## Status Bits

Affected by FRCT, M40, RDM, SATD, SMUL

Affects ACOVx, ACOVy

## Repeat

This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MPY uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 <br> :: MAS uns(LO(*AR3-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of the CDP is subtracted from the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. |



Multiply with Parallel Multiply and Subtract

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [3] | MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | No | 5 (*) | 1 | X |

(*) 1 LSB is allocated to instruction slot \#2.

## Opcode

## Operands

ACx, ACy, Cmem, Xmem, Ymem

## Description

This instruction performs two parallel operations in one cycle: multiply and multiply and subtract (MAS). The operations are executed in the two D-unit MACs:

ACy = Ymem * HI(Cmem)
:: ACx = ACx - (Xmem * LO(Cmem))

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the contents of data memory operand HI(Cmem), which is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(Cmem), which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.
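The overflow and saturation rules above can be sketched as follows. This Python model is an assumption-laden illustration: the function name `saturate` is hypothetical, and the saturation limits used (00 7FFF FFFFh and FF 8000 0000h when M40 = 0, 7F FFFF FFFFh and 80 0000 0000h when M40 = 1) are not stated in this section and are assumed from the general C55x accumulator saturation behavior.

```python
MASK40 = (1 << 40) - 1


def saturate(value, m40=0, satd=0):
    """Return (accumulator_bits, overflow) for an exact integer result.

    Overflow is detected at 32 bits when M40 = 0 and at 40 bits when M40 = 1.
    When SATD = 1 an overflowing result is clamped to the signed limit
    (assumed limits, see the text above); otherwise it is kept as computed.
    The accumulator is returned as a 40-bit two's complement pattern.
    """
    bits = 40 if m40 else 32
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    overflow = value < lo or value > hi
    if overflow and satd:
        value = hi if value > hi else lo
    return value & MASK40, overflow
```

With M40 = 0 the value FE00 0000h is reported as an overflow, while the same value with the local M40 option applied fits in the 40-bit accumulator and is stored unchanged.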

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

## Status Bits

## Repeat

This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MPY uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MAS uns(*AR2-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |


Execution:

M40(rnd(ACx - uns(Xmem)[16:0] * uns(LO(Cmem))[16:0])) -> ACx

M40(rnd(uns(Ymem)[16:0] * uns(HI(Cmem))[16:0])) -> ACy

| Register / location | Before | After |
| :--- | :---: | :---: |
| AC0 | 00 0000 8000 | FF C080 8000 |
| AC1 | FF 8000 0000 | 00 7F80 0000 |
| XAR2 | 00 10FE | 00 10FD |
| XAR3 | 00 20FE | 00 20FD |
| Data memory 10FEh | FE00 | FE00 |
| Data memory 20FEh | FF00 | FF00 |
| XCDP | 00 2000 | 00 2002 |
| Coeff memory 2000h | 8000 | 8000 |
| Coeff memory 2001h | 4000 | 4000 |

## MPY::MPY

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | MPY[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | No | 4 | 1 | X |
| [2] | MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [3] | MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |
| [4] | MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy :: MPY[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | No | 5 | 1 | X |

## Description

These instructions perform two parallel multiply operations in one cycle. The operations are executed in the two D-unit MACs.

## Status Bits

Affected by FRCT, M40, RDM, SATD, SMUL, SXMD

Affects ACOVx, ACOVy

## See Also

See the following other related instructions:

- AMAR::MPY (Modify Auxiliary Register Content with Parallel Multiply)
- MAC::MAC (Parallel Multiply and Accumulates)
- MAC::MAS (Multiply and Accumulate with Parallel Multiply and Subtract)
- MAC::MPY (Multiply and Accumulate with Parallel Multiply)
- MAS::MAC (Multiply and Subtract with Parallel Multiply and Accumulate)
- MAS::MAS (Parallel Multiply and Subtracts)
- MAS::MPY (Multiply and Subtract with Parallel Multiply)
- MPY (Multiply)


## Parallel Multiplies

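## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MPY[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | No | 4 | 1 | X |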


The first operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
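Rounding according to RDM is only named in the list above. As a hedged sketch, assuming the usual C55x behavior (RDM = 0 adds 8000h, RDM = 1 rounds to the nearest with ties resolved on bit 16, and in both cases bits 15-0 are then cleared), the step can be modeled in Python; the function name `round_acc` is hypothetical.

```python
def round_acc(value, rdm=0):
    """Round an accumulator value at bit 16 and clear bits 15-0.

    rdm = 0: biased rounding, always add 8000h before clearing.
    rdm = 1: unbiased rounding, add 8000h only when the low 16 bits exceed
             8000h, or equal 8000h while bit 16 is 1 (assumed tie rule).
    """
    low = value & 0xFFFF
    if rdm == 0:
        value += 0x8000
    else:
        bit16 = (value >> 16) & 1
        if low > 0x8000 or (low == 0x8000 and bit16):
            value += 0x8000
    return value & ~0xFFFF
```

The two modes differ only on exact ties: 1234 8000h rounds up to 1235 0000h with RDM = 0 but stays at 1234 0000h with RDM = 1 under the assumed tie rule.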

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional 40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- AMAR Xmem
- AMAR Ymem
- AMAR Cmem

## Status Bits

Affected by FRCT, M40, RDM, SATD, SMUL, SXMD

## Repeat

This instruction can be repeated.

## Parallel Multiplies

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

| 1111 1101 | AAAA AAAI | 0000 00mm | DDDD uug% |
| :--- | :--- | :--- | :--- |

## Operands

ACx, ACy, Cmem, Smem

## Description

This instruction performs two parallel multiply operations in one cycle. The operations are executed in the D-unit MACs:

ACy = Smem * HI(Cmem)
:: ACx = Smem * LO(Cmem)

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(Cmem). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

## Status Bits

Affected by FRCT, M40, RDM, SATD, SMUL

Affects ACOVx, ACOVy

## Repeat

This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| MPY uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MPY uns(*AR3-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |

[^11]

## Parallel Multiplies

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [3] | MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | No | 4 | 1 | X |

## Opcode

| 11111101 | AAAA | AAAI 0 | 00 | m D | uug% |

## Operands

ACx, ACy, Cmem, Lmem

## Description

This instruction performs two parallel multiply operations in one cycle. The operations are executed in the D-unit MACs:

ACy = HI(Lmem) * HI(Cmem)
:: ACx = LO(Lmem) * LO(Cmem)

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(Cmem). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(Cmem) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(Cmem). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(Cmem) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |

Example

| Syntax | Description |
| :--- | :--- |
| MPY uns(HI(*AR3-)), uns(HI(*CDP+)), AC1 <br> :: MPY uns(LO(*AR3-)), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AC0 | FF 8000 0000 | AC0 | 00 3F80 0000 |
| AC1 | FF 8000 0000 | AC1 | 00 7F80 0000 |
| XAR3 | 00 10FE | XAR3 | 00 10FC |
| XCDP | 00 2000 | XCDP | 00 2002 |
| Data memory |  |  |  |
| 10FEh | FF00 | 10FEh | FF00 |
| 10FFh | FE00 | 10FFh | FE00 |
| Coeff memory |  |  |  |
| 2000h | 8000 | 2000h | 8000 |
| 2001h | 4000 | 2001h | 4000 |
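The arithmetic in the example above can be checked with a small behavioral sketch. This is not TI code: the function names (`extend17`, `mpy`, `dual_mpy_hi_lo`) and the dictionary used as memory are hypothetical, and the model deliberately ignores rounding, SMUL overflow detection, and saturation. It only illustrates the 17-bit operand extension, the optional FRCT shift, and the HI/LO addressing rule (EA for the high part, EA+1 or EA-1 for the low part).

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def extend17(word, uns):
    """Extend a 16-bit memory word to a 17-bit signed multiplier input."""
    word &= 0xFFFF
    if uns or word < 0x8000:
        return word          # zero extension (or a non-negative signed value)
    return word - 0x10000    # sign extension of a negative 16-bit value


def mpy(x, y, uns_x=False, uns_y=False, frct=0):
    """One multiply; the result is returned as a 40-bit accumulator pattern."""
    product = extend17(x, uns_x) * extend17(y, uns_y)
    if frct:
        product <<= 1        # FRCT = 1 shifts the multiplier output left by 1
    return product & MASK40  # two's-complement wrap stands in for sign extension


def next_address(ea):
    """Address of the LO part: EA+1 when EA is even, EA-1 when EA is odd."""
    return ea + 1 if ea % 2 == 0 else ea - 1


def dual_mpy_hi_lo(mem, ar, cdp, uns=True, frct=0):
    """Return (ACy, ACx) for MPY HI(...), HI(...), ACy :: MPY LO(...), LO(...), ACx."""
    ac_hi = mpy(mem[ar], mem[cdp], uns, uns, frct)
    ac_lo = mpy(mem[next_address(ar)], mem[next_address(cdp)], uns, uns, frct)
    return ac_hi, ac_lo


# Memory contents taken from the Before column of the example (FRCT = 0 assumed).
memory = {0x10FE: 0xFF00, 0x10FF: 0xFE00, 0x2000: 0x8000, 0x2001: 0x4000}
ac1, ac0 = dual_mpy_hi_lo(memory, ar=0x10FE, cdp=0x2000)
print(f"AC1 = {ac1:010X}, AC0 = {ac0:010X}")
```

With the Before values of the example, the sketch yields the After values of AC1 and AC0 shown in the table.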

## Parallel Multiplies

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | No | 5 (*) | 1 | X |

(*) 1 LSB is allocated to instruction slot \#2.

## Opcode

10010010 | XXXM MMYY | YMMM 00mm | UuDD DDg%

## Operands

ACx, ACy, Cmem, Xmem, Ymem

## Description

This instruction performs two parallel multiply operations in one cycle. The operations are executed in the D-unit MACs:

```
ACy = Ymem * HI(Cmem)
:: ACx = Xmem * LO(Cmem)
```

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the content of data memory operand $\mathrm{HI}(\mathrm{Cmem})$ which is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(Cmem) which is addressed by DAGEN path $C$ with the next address of $E A$ ( $E A+1$ when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs, but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

## Compatibility with C54x devices (C54CM = 1)

None.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :---: | :---: |
| MPY uns(*AR3-), uns(HI(*CDP+)), AC1 <br> :: MPY uns(*AR2-), uns(LO(*CDP+)), AC0 | Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AC0 | FF 8000 0000 | AC0 | 00 3F80 0000 |
| AC1 | FF 8000 0000 | AC1 | 00 7F80 0000 |
| XAR2 | 00 10FE | XAR2 | 00 10FD |
| XAR3 | 00 20FE | XAR3 | 00 20FD |
| XCDP | 00 2000 | XCDP | 00 2002 |
| Data memory |  |  |  |
| 10FEh | FE00 | 10FEh | FE00 |
| 20FEh | FF00 | 20FEh | FF00 |
| Coeff memory |  |  |  |
| 2000h | 8000 | 2000h | 8000 |
| 2001h | 4000 | 2001h | 4000 |

## MPYM::MOV

Multiply with Parallel Store Accumulator Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | MPYM[R] [T3 = ]Xmem, Tx, ACy <br> :: MOV HI(ACx << T2), Ymem | No | 4 | 1 | X |

## Opcode

## Operands

ACx, ACy, Tx, Xmem, Ymem

## Description

This instruction performs two operations in parallel, a multiply and a store:

```
ACy = rnd(Tx * Xmem)
:: Ymem = HI(ACx << T2) [,T3 = Xmem]
```

The first operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of data memory operand Xmem, sign extended to 17 bits.

- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

The second operation shifts the accumulator ACx by the content of T2 and stores ACx(31-16) to data memory operand Ymem. If the 16-bit value in T2 is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.

- The input operand is shifted in the D-unit shifter according to SXMD.
- After the shift, the high part of the accumulator, ACx(31-16), is stored to the memory location.
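
The two parallel operations can be sketched as a pair of Python functions. This is an illustrative model only, with hypothetical names (`sext`, `mpym_r`, `store_hi_shifted`). It assumes biased rounding (add 8000h, then clear the 16 LSBs), which is the behavior we take RDM = 0 to select, and it assumes sign-extended shifting (SXMD = 1). Overflow detection, saturation, and the C54CM shift rules are not modeled.

```python
MASK40 = (1 << 40) - 1


def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mpym_r(tx, xmem, frct=1):
    """ACy = rnd(Tx * Xmem), with both inputs treated as signed 16-bit values."""
    product = sext(tx, 16) * sext(xmem, 16)
    if frct:
        product <<= 1      # FRCT = 1: multiplier output shifted left by 1
    product += 0x8000      # assumed RDM = 0 rounding: add 2**15 ...
    product &= ~0xFFFF     # ... then clear the 16 LSBs
    return product & MASK40


def store_hi_shifted(acx, t2):
    """Ymem = HI(ACx << T2): shift by T2 (saturated to -32..+31), keep bits 31-16."""
    shift = max(-32, min(31, sext(t2, 16)))
    value = sext(acx, 40)  # assumed SXMD = 1: arithmetic (sign-extended) shift
    value = value << shift if shift >= 0 else value >> -shift
    return (value >> 16) & 0xFFFF


print(f"AC1 = {mpym_r(0x4000, 0x4000):010X}")
print(f"Ymem = {store_hi_shifted(0xFF84211234, 0x0004):04X}")
```

A negative T2 produces a right shift in this sketch, and values of T2 outside -32 to +31 are clamped before shifting, following the description above.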


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When this instruction is executed with C54CM = 1, the 6 LSBs of T2 determine the shift quantity. The 6 LSBs of T2 define a shift quantity within -32 to +31. When the 16-bit value in T2 is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.

- If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  `ACy = rnd(Tx * Xmem)` <br> `Ymem = HI(saturate(uns(ACx << T2))) [,T3 = Xmem]`

- If the SST bit = 1 and the SXMD bit = 1, then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

  `ACy = rnd(Tx * Xmem)` <br> `Ymem = HI(saturate(ACx << T2)) [,T3 = Xmem]`

| Status Bits | Affected by | C54CM, FRCT, M40, RDM, SATD, SMUL, SST, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy |
| Repeat | This instruction can be repeated. |  |

## See Also

See the following other related instructions:

- ADD::MOV (Addition with Parallel Store Accumulator Content to Memory)
- MACM::MOV (Multiply and Accumulate with Parallel Store Accumulator Content to Memory)
- MASM::MOV (Multiply and Subtract with Parallel Store Accumulator Content to Memory)
- MOV (Store Accumulator Content to Memory)
- MPY (Multiply)
- SUB::MOV (Subtraction with Parallel Store Accumulator Content to Memory)

## Example

| Syntax | Description |
| :--- | :--- |
| MPYMR *AR0+, T0, AC1 <br> :: MOV HI(AC0 << T2), *AR1+ | Both instructions are performed in parallel. The content addressed by AR0 is multiplied by the content of T0. Since FRCT = 1, the result is multiplied by 2, rounded, and stored in AC1. The content of AC0 is shifted by the content of T2, and AC0(31-16) is stored at the address of AR1. AR0 and AR1 are both incremented by 1. |


| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | FF | 8421 | 1234 | AC0 | FF | 8421 | 1234 |
| AC1 | 00 | 0000 | 0000 | AC1 | 00 | 2000 | 0000 |
| AR0 |  |  | 0200 | AR0 |  |  | 0201 |
| AR1 |  |  | 0300 | AR1 |  |  | 0301 |
| T0 |  |  | 4000 | T0 |  |  | 4000 |
| T2 |  |  | 0004 | T2 |  |  | 0004 |
| 200 |  |  | 4000 | 200 |  |  | 4000 |
| 300 |  |  | 1111 | 300 |  |  | 4211 |
| FRCT |  |  | 1 | FRCT |  |  | 1 |
| ACOV1 |  |  | 0 | ACOV1 |  |  | 0 |
| CARRY |  |  | 0 | CARRY |  |  | 0 |

## NEG

## Negate Accumulator, Auxiliary, or Temporary Register Content

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | NEG [src,] dst | Yes | 2 | 1 | X |

## Opcode

0011 010E | FSSS FDDD

## Operands

dst, src

## Description

This instruction computes the 2s complement of the content of the source register (src):

dst = -src

This instruction clears the CARRY status bit to 0 for all nonzero values of src. If src equals 0, the CARRY status bit is set to 1.

- When the destination operand (dst) is an accumulator:
  - The operation is performed on 40 bits in the D-unit ALU.
  - Input operands are sign extended to 40 bits according to SXMD.
  - If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
  - Overflow detection and the CARRY status bit depend on M40.
  - When an overflow is detected, the accumulator is saturated according to SATD.
- When the destination operand (dst) is an auxiliary or temporary register:
  - The operation is performed on 16 bits in the A-unit ALU.
  - If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
  - Overflow detection is done at bit position 15.
  - When an overflow is detected, the destination register is saturated according to SATA.
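
The 16-bit (A-unit) case described above can be sketched as follows. The function name `neg16` and its return convention are hypothetical, and the saturation value 7FFFh is an assumption about what SATA = 1 produces on positive overflow. The sketch shows why only one input overflows: negating 8000h (-32768) yields +32768, which does not fit in 16 signed bits.

```python
def neg16(src, sata=True):
    """Model of NEG with an auxiliary or temporary register destination.

    Returns (result, carry, overflow). Only the 16 LSBs of src are used,
    as when an accumulator is the source operand.
    """
    src &= 0xFFFF
    carry = 1 if src == 0 else 0            # CARRY is set only when src equals 0
    signed = src - 0x10000 if src & 0x8000 else src
    result = -signed
    overflow = result > 0x7FFF              # overflow detected at bit position 15
    if overflow and sata:
        result = 0x7FFF                     # assumed positive saturation value
    return result & 0xFFFF, carry, overflow


for value in (0x0001, 0x0000, 0x8000):
    result, carry, overflow = neg16(value)
    print(f"NEG {value:04X} -> {result:04X}, CARRY={carry}, overflow={overflow}")
```

With `sata=False`, the overflowing case wraps around to 8000h instead of saturating.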

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | M40, SATA, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, CARRY |

## NOP

## No Operation

## Syntax Characteristics

| No. Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: |
| [1] NOP | Yes | 1 | 1 | D |
| [2] NOP_16 | Yes | 2 | 1 | D |
| Opcode |  |  | 001 | 000E |
| Operands | none |  |  |  |
| Description | Instruction [1] increments the program counter register (PC) by 1 byte. Instruction [2] increments the PC by 2 bytes. |  |  |  |
| Status Bits | Affected by none |  |  |  |
|  | Affects none |  |  |  |
| Repeat | This instruction can be repeated. |  |  |  |
| Example |  |  |  |  |
| Syntax | Description |  |  |  |
| NOP | The program counter (PC) is incremented by 1 byte. |  |  |  |

## NOT

Complement Accumulator, Auxiliary, or Temporary Register Content

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | NOT [src,] dst | Yes | 2 | 1 | X |

## Opcode

0011 011E | FSSS FDDD

## Operands

dst, src

## Description

This instruction computes the 1s complement (bitwise complement) of the content of the source register (src).

- When the destination (dst) operand is an accumulator:
  - The bit inversion is performed on 40 bits in the D-unit ALU and the result is stored in the destination accumulator.
  - If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are zero extended.
- When the destination (dst) operand is an auxiliary or temporary register:
  - The bit inversion is performed on 16 bits in the A-unit ALU and the result is stored in the destination auxiliary or temporary register.
  - If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

## Repeat

This instruction can be repeated.

## See Also

See the following other related instructions:

- BNOT (Complement Accumulator, Auxiliary, or Temporary Register Bit)
- BNOT (Complement Memory Bit)
- NEG (Negate Accumulator, Auxiliary, or Temporary Register Content)


## Example

| Syntax | Description |
| :--- | :--- |
| NOT AC0, AC1 | The content of AC0 is complemented and the result is stored in AC1. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AC0 | 7E 2355 4FC0 | AC0 | 7E 2355 4FC0 |
| AC1 | 00 2300 5678 | AC1 | 81 DCAA B03F |
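
The example above can be reproduced with a short sketch of the two destination cases. The helper names `not_acc` and `not_reg16` are hypothetical; the code only models the bit inversion and the operand widths stated in the description.

```python
MASK40 = (1 << 40) - 1


def not_acc(src, src_is_16bit_register=False):
    """NOT with an accumulator destination: inversion on 40 bits."""
    if src_is_16bit_register:
        src &= 0xFFFF          # auxiliary/temporary source is zero extended
    return ~src & MASK40


def not_reg16(src):
    """NOT with an auxiliary or temporary register destination: 16 bits."""
    return ~src & 0xFFFF       # only the 16 LSBs of the source are used


print(f"AC1 = {not_acc(0x7E23554FC0):010X}")
```

Because a 16-bit register source is zero extended before the inversion, the upper 24 bits of the accumulator result are all ones in that case.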

## OR

## Bitwise OR

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | OR src, dst | Yes | 2 | 1 | X |
| $[2]$ | OR k8, src, dst | Yes | 3 | 1 | X |
| $[3]$ | OR k16, src, dst | No | 4 | 1 | X |
| $[4]$ | OR Smem, src, dst | No | 3 | 1 | X |
| $[5]$ | OR ACx << \#SHIFTW[, ACy] | Yes | 3 | 1 | X |
| $[6]$ | OR k16 << \#16, [ACx,] ACy | No | 4 | 1 | X |
| $[7]$ | OR k16 << \#SHFT, [ACx,] ACy | No | 4 | 1 | X |
| $[8]$ | OR k16, Smem | No | 4 | 1 | X |

## Description

These instructions perform a bitwise OR operation:

- In the D-unit, if the destination operand is an accumulator.
- In the A-unit ALU, if the destination operand is an auxiliary or temporary register.
- In the A-unit ALU, if the destination operand is the memory.

| Status Bits | Affected by | C54CM |
| :--- | :--- | :--- |
|  | Affects | none |

## See Also

See the following other related instructions:

- AND (Bitwise AND)
- XOR (Bitwise Exclusive OR)

## Bitwise OR

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | OR k8, src, dst | Yes | 3 | 1 | X |

## Opcode

## Operands

dst, k8, src

## Description

This instruction performs a bitwise OR operation between a source (src) register content and an 8-bit value, k8:

dst = src | k8

- When the destination (dst) operand is an accumulator:
  - The operation is performed on 40 bits in the D-unit ALU.
  - Input operands are zero extended to 40 bits.
  - If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are zero extended.
- When the destination (dst) operand is an auxiliary or temporary register:
  - The operation is performed on 16 bits in the A-unit ALU.
  - If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| OR \#FFh, AC1, AC0 | The content of AC1 is ORed with the unsigned 8-bit value (FFh) and the result is stored in AC0. |

## Bitwise OR

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [3] | OR k16, src, dst | No | 4 | 1 | X |

## Opcode

## Operands

dst, k16, src

## Description

This instruction performs a bitwise OR operation between a source (src) register content and a 16-bit unsigned constant, k16:

dst = src | k16

- When the destination (dst) operand is an accumulator:
  - The operation is performed on 40 bits in the D-unit ALU.
  - Input operands are zero extended to 40 bits.
  - If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are zero extended.
- When the destination (dst) operand is an auxiliary or temporary register:
  - The operation is performed on 16 bits in the A-unit ALU.
  - If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| OR \#FFFFh, AC1, AC0 | The content of AC1 is ORed with the unsigned 16-bit value (FFFFh) and the result is stored in AC0. |

## Bitwise OR

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [6] | OR k16 << \#16, [ACx,] ACy | No | 4 | 1 | X |

## Opcode

0111 1010 | kkkk kkkk | kkkk kkkk | SSDD 011x

## Operands

ACx, ACy, k16

## Description

This instruction performs a bitwise OR operation between an accumulator (ACx) content and a 16-bit unsigned constant, k16, shifted to the left by 16 bits:

ACy = ACx | (k16 <<< \#16)

- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are zero extended to 40 bits.
- The input operand (k16) is shifted 16 bits to the MSBs.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |
| Repeat | This instruction can be repeated. |  |

Example

| Syntax | Description |
| :--- | :--- |
| OR \#FFFFh << \#16, AC1, AC0 | The content of AC1 is ORed with the unsigned 16-bit value (FFFFh) <br> logically shifted to the left by 16 bits and the result is stored in AC0. |

## Bitwise OR

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [8] | OR k16, Smem | No | 4 | 1 | X |

## Opcode

## Operands

k16, Smem

## Description

This instruction performs a bitwise OR operation between a memory location (Smem) and a 16-bit unsigned constant, k16:

Smem = Smem | k16

- The operation is performed on 16 bits in the A-unit ALU.
- The result is stored in memory.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| OR \#0FC0h, *AR1 | The content addressed by AR1 is ORed with the unsigned 16-bit value (FC0h) and the result is stored in the location addressed by AR1. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| *AR1 | 5678 | *AR1 | 5FF8 |
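
Two of the OR forms described in this section, the shifted-constant accumulator form and the memory form, can be sketched in a few lines. The function names and the dictionary used as data memory are hypothetical, and the address 0200h chosen for the location that AR1 points to is an arbitrary assumption (the example does not give it).

```python
MASK40 = (1 << 40) - 1


def or_k16_shl16(acx, k16):
    """ACy = ACx | (k16 <<< #16), performed on 40 bits."""
    return (acx | ((k16 & 0xFFFF) << 16)) & MASK40


def or_k16_smem(mem, addr, k16):
    """Smem = Smem | k16, performed on 16 bits and written back to memory."""
    mem[addr] = (mem[addr] | k16) & 0xFFFF


data = {0x0200: 0x5678}        # 0200h is an assumed address for *AR1
or_k16_smem(data, 0x0200, 0x0FC0)
print(f"*AR1 = {data[0x0200]:04X}")
print(f"ACy  = {or_k16_shl16(0x0000001234, 0xFFFF):010X}")
```

Since k16 is zero extended before the shift, the guard bits (39-32) of the accumulator are never set by the shifted-constant form unless they were already set in ACx.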

## POP

Pop Top of Stack

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | POP dst1, dst2 | Yes | 2 | 1 | X |
| $[2]$ | POP dst | Yes | 2 | 1 | X |
| $[3]$ | POP dst, Smem | No | 3 | 1 | X |
| $[4]$ | POP dbl(ACx) | Yes | 2 | 1 | X |
| $[5]$ | POP Smem | No | 2 | 1 | X |
| $[6]$ | POP dbl(Lmem) | No | 2 | 1 | X |

## Description

These instructions move the content of the data memory location addressed by the data stack pointer (SP) to:

- an accumulator, auxiliary, or temporary register
- a data memory location

When the destination register is an accumulator, the guard bits and the 16 higher bits of the accumulator, ACx(39-16), are reloaded (unchanged) with the current value and are not modified by these instructions.

The increment operation performed on SP is done by the A-unit address generator dedicated to the stack addressing management.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

## See Also

See the following other related instructions:

- POPBOTH (Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers)
- PSH (Push to Top of Stack)
- PSHBOTH (Push Accumulator or Extended Auxiliary Register Content to Stack Pointers)

Pop Top of Stack

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | POP dst | Yes | 2 | 1 | X |

## Opcode

0101 000E | FDDD x010

## Operands

dst

## Description

This instruction moves the content of the 16-bit data memory location pointed to by SP to destination register dst.

When the destination register, dst, is an accumulator, the content of the 16-bit data memory operand is moved to the destination accumulator low part, ACx(15-0). The guard bits and the 16 higher bits of the accumulator, ACx(39-16), are reloaded (unchanged) with the current value and are not modified by this instruction. SP is incremented by 1.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

## Repeat

This instruction cannot be repeated with a single conditional or unconditional repeat instruction. It can be repeated in other repeat instructions.

## Example

| Syntax | Description |
| :--- | :--- |
| POP AC0 | The content of the memory location pointed to by the data stack pointer (SP) is copied to AC0(15-0). Bits 39-16 of AC0 are unchanged. The SP is incremented by 1. |

Pop Top of Stack

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[4]$ | POP dbl(ACx) | Yes | 2 | 1 | X |

## Opcode

0101 000E | xxDD x011

## Operands

ACx

## Description

This instruction moves the content of the 16-bit data memory location pointed to by SP to the accumulator high part ACx(31-16) and moves the content of the 16-bit data memory location pointed to by SP + 1 to the accumulator low part ACx(15-0).

The guard bits of the accumulator, ACx(39-32), are reloaded (unchanged) with the current value and are not modified by this instruction. SP is incremented by 2.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

## Repeat

This instruction cannot be repeated with a single conditional or unconditional repeat instruction. It can be repeated in other repeat instructions.

## Example

| Syntax | Description |
| :--- | :--- |
| POP dbl(AC1) | The content of the memory location pointed to by the data stack pointer (SP) is copied to AC1(31-16) and the content of the memory location pointed to by SP + 1 is copied to AC1(15-0). Bits 39-32 of AC1 are unchanged. The SP is incremented by 2. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AC1 | 03 3800 FC00 | AC1 | 03 5644 F800 |
| SP | 0304 | SP | 0306 |
| 304 | 5644 | 304 | 5644 |
| 305 | F800 | 305 | F800 |
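
The register and stack updates in this example can be traced with a minimal sketch. The function name `pop_dbl` and the dictionary standing in for the data stack are hypothetical; the code models only what the description states: the word at SP goes to bits 31-16, the word at SP + 1 goes to bits 15-0, the guard bits are preserved, and SP advances by 2.

```python
def pop_dbl(mem, sp, acx):
    """Model of POP dbl(ACx). Returns the new (ACx, SP) pair."""
    high = mem[sp] & 0xFFFF          # goes to ACx(31-16)
    low = mem[sp + 1] & 0xFFFF       # goes to ACx(15-0)
    guard = acx & 0xFF00000000       # ACx(39-32) is left unchanged
    return guard | (high << 16) | low, sp + 2


stack = {0x0304: 0x5644, 0x0305: 0xF800}
ac1, sp = pop_dbl(stack, sp=0x0304, acx=0x033800FC00)
print(f"AC1 = {ac1:010X}, SP = {sp:04X}")
```

The stack memory itself is not modified by the pop; only the accumulator and SP change.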

Pop Top of Stack

## Syntax Characteristics

| No. Syntax |  |  | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [5] POP Smem |  |  | No | 2 | 1 | X |
| Opcode |  |  |  | 10 | 11 \| AA | A AAAI |
| Operands | Smem |  |  |  |  |  |
| Description | This instruction moves the content of the 16-bit data memory location pointed by SP to data memory (Smem) location. SP is incremented by 1 . |  |  |  |  |  |
| Status Bits | Affected by | none |  |  |  |  |
|  | Affects | none |  |  |  |  |
| Repeat | This instruction can be repeated. |  |  |  |  |  |

Example

| Syntax | Description |
| :--- | :--- |
| POP *AR1 | The content of the memory location pointed to by the data stack pointer (SP) is copied to the location addressed by AR1. The SP is incremented by 1. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AR1 | 0200 | AR1 |  |
| SP | 0300 | SP |  |
| 200 | 3400 | 200 |  |

## port

Peripheral Port Register Access Qualifiers

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | port(Smem) | No | 1 | 1 | D |
| $[2]$ | port(k16) | No | 3 | 1 | D |

## Opcode

10011001

## Operands

k16, Smem

## Description

These operand qualifiers allow you to locally disable access toward the data memory and enable access to the 64K-word I/O space. The I/O data location is specified by the Smem, Xmem, or Ymem fields.

- An operand qualifier may be included in any instruction making a word single data memory access Smem or Xmem that is used in a read operation, except the DELAY and MACMZ instructions.
- A readport() operand qualifier cannot be used in any instruction making a dual memory access Xmem or Ymem that is used in a read operation. There is an exception for the instructions making a dual read/write memory access of the type Ymem = Xmem, or Smem = coeff, where the readport() qualifier can be used.
- An operand qualifier may be included in any instruction making a word single data memory access Smem or Ymem that is used in a write operation, except the DELAY and MACMZ instructions.
- A writeport() operand qualifier cannot be used in any instruction making a dual memory access Xmem or Ymem that is used in a write operation. There is an exception for the instructions making a dual read/write memory access of the type Ymem = Xmem, or coeff = Smem, where the writeport() qualifier can be used.
- An operand qualifier cannot be executed as a stand-alone instruction (the assembler generates an error message).

Any instruction making a word single data memory access Smem (except those listed above) can use the port(k16) addressing mode to access the 64K-word I/O space with an immediate address. When an instruction uses port(k16), the 16-bit unsigned constant, k16, is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using port(k16) cannot be executed in parallel with another instruction.

The following indirect operands cannot be used for accesses to I/O space. An instruction using one of these operands requires a 2-byte extension to the instruction. Because of the extension, an instruction using one of the following indirect operands cannot be executed with these operand qualifiers.

- *ARn(\#K16)
- *+ARn(\#K16)
- *CDP(\#K16)
- *+CDP(\#K16)

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

## Repeat

An instruction using this operand qualifier can be repeated.

## Example 1

| Syntax | Description |
| :--- | :--- |
| MOV port(*CDP+), T2 | The content addressed by CDP (I/O address) is loaded into T2. After being used <br> for the address, CDP is incremented by 1. |

## Example 2

| Syntax | Description |
| :--- | :--- |
| MOV *CDP, port(\#456h) | The content addressed by CDP is written to I/O address 456h. |
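
The effect of the qualifier can be pictured as selecting a second, independent address space for one operand of an instruction. The sketch below is purely conceptual: the `Machine` class, its dictionaries, and the convention that uninitialized locations read as 0 are all assumptions made for illustration, and the source address 2000h used for CDP is arbitrary.

```python
class Machine:
    """Two separate word-addressed spaces: data memory and 64K-word I/O."""

    def __init__(self):
        self.data = {}
        self.io = {}

    def read(self, addr, port=False):
        """Read a 16-bit word; port=True redirects the access to I/O space."""
        if port:
            return self.io.get(addr & 0xFFFF, 0)   # I/O space holds 64K words
        return self.data.get(addr, 0)

    def write(self, addr, value, port=False):
        """Write a 16-bit word; port=True redirects the access to I/O space."""
        if port:
            self.io[addr & 0xFFFF] = value & 0xFFFF
        else:
            self.data[addr] = value & 0xFFFF


# Sketch of Example 2, MOV *CDP, port(#456h), with CDP assumed to hold 2000h.
m = Machine()
m.data[0x2000] = 0xBEEF
m.write(0x456, m.read(0x2000), port=True)
print(f"I/O[456h] = {m.io[0x456]:04X}")
```

The same numeric address refers to different locations depending on whether the qualifier is present, which is why the write in the example leaves data memory address 456h untouched.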

## PSH

Push to Top of Stack

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | PSH src1, src2 | Yes | 2 | 1 | X |
| $[2]$ | PSH src | Yes | 2 | 1 | X |
| $[3]$ | PSH src,Smem | No | 3 | 1 | X |
| $[4]$ | PSH dbl(ACx) | Yes | 2 | 1 | X |
| $[5]$ | PSH Smem | No | 2 | 1 | X |
| $[6]$ | PSH dbl(Lmem) | No | 2 | 1 | X |

## Description

These instructions move one or two operands to the data memory location addressed by the data stack pointer (SP). The operands may be:

- an accumulator, auxiliary, or temporary register
- a data memory location

The decrement operation performed on SP is done by the A-unit address generator dedicated to the stack addressing management.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

## See Also

See the following other related instructions:

- POP (Pop Top of Stack)
- POPBOTH (Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers)
- PSHBOTH (Push Accumulator or Extended Auxiliary Register Content to Stack Pointers)

Push to Top of Stack

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | PSH src1, src2 | Yes | 2 | 1 | X |

Opcode

## Operands

Description

Status Bits

Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| PSH AR0, AC1 | The data stack pointer (SP) is decremented by 2. The content of AR0 is copied to the memory location pointed to by SP and the content of AC1(15-0) is copied to the memory location pointed to by SP + 1. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AR0 | 0300 | AR0 | 0300 |
| AC1 | 03 5644 F800 | AC1 | 03 5644 F800 |
| SP | 0300 | SP | 02FE |
| 2FE | 0000 | 2FE | 0300 |
| 2FF | 0000 | 2FF | F800 |
| 300 | 5890 | 300 | 5890 |
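
The push of two registers in the example above can be traced with a short sketch. The function name `psh_pair` and the dictionary used as stack memory are hypothetical. The model follows the example's description: SP is decremented by 2 first, src1 is written at the new SP, and the 16 LSBs of src2 are written at SP + 1.

```python
def psh_pair(mem, sp, src1, src2):
    """Model of PSH src1, src2. Returns the new SP; mem is updated in place."""
    sp -= 2                          # pre-decrement: the stack grows downward
    mem[sp] = src1 & 0xFFFF          # first operand lands at the new top of stack
    mem[sp + 1] = src2 & 0xFFFF      # only bits 15-0 of an accumulator are pushed
    return sp


stack = {0x02FE: 0x0000, 0x02FF: 0x0000, 0x0300: 0x5890}
# AR0 = 0300h; the accumulator value is chosen so that AC1(15-0) = F800h.
sp = psh_pair(stack, sp=0x0300, src1=0x0300, src2=0x000000F800)
print(f"SP = {sp:04X}, [2FE] = {stack[0x02FE]:04X}, [2FF] = {stack[0x02FF]:04X}")
```

The word previously at the top of the stack (address 300h) is left untouched, since both writes go below the old SP.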

Push to Top of Stack

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[4]$ | PSH dbl(ACx) | Yes | 2 | 1 | X |

## Opcode

0101 000E | xxSS x111

## Operands

ACx

## Description

This instruction decrements SP by 2, then moves the content of the accumulator high part ACx(31-16) to the 16-bit data memory location pointed to by SP and moves the content of the accumulator low part ACx(15-0) to the 16-bit data memory location pointed to by SP + 1.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

## Repeat

This instruction cannot be repeated with a single conditional or unconditional repeat instruction. It can be repeated in other repeat instructions.

## Example

| Syntax | Description |
| :--- | :--- |
| PSH dbl(AC0) | The data stack pointer (SP) is decremented by 2. The content of AC0(31-16) is copied to the memory location pointed to by SP and the content of AC0(15-0) is copied to the memory location pointed to by SP + 1. |

Push to Top of Stack

Syntax Characteristics

| No. Syntax |  |  | Parallel <br> Enable Bit | Size | Cycles |  | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [5] PSH Smem |  |  | No | 2 | 1 |  | X |
| Opcode |  |  |  | 01 | \| 1 A | A | AAAI |
| Operands | Smem |  |  |  |  |  |  |
| Description | This instruction decrements SP by 1, then moves the content of the data memory (Smem) location to the data memory location pointed to by SP. |  |  |  |  |  |  |
| Status Bits | Affected by | none |  |  |  |  |  |
|  | Affects | none |  |  |  |  |  |
| Repeat | This instruction can be repeated. |  |  |  |  |  |  |

Example

| Syntax | Description |
| :--- | :--- |
| PSH *AR1 | The data stack pointer (SP) is decremented by 1. The content addressed by AR1 is copied <br> to the memory location pointed by SP. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| *AR1 | 6903 | *AR1 | 6903 |
| SP | 0305 | SP | 0304 |
| 304 | 0000 | 304 | 6903 |
| 305 | 0300 | 305 | 0300 |
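The Before/After values of this example can be reproduced with a small Python model. It is only a sketch under stated assumptions: `push_smem` and the dictionary standing in for data memory are hypothetical names, and SP is assumed to be a 16-bit word address.

```python
def push_smem(value, sp, mem):
    """Model PSH Smem: decrement SP by 1, then store the 16-bit operand at SP."""
    sp = (sp - 1) & 0xFFFF        # assumption: SP is a 16-bit word address
    mem[sp] = value & 0xFFFF
    return sp


# Values taken from the example: *AR1 = 6903h, SP = 0305h.
mem = {0x0304: 0x0000, 0x0305: 0x0300}
sp = push_smem(0x6903, 0x0305, mem)
```

The word at 305h is left untouched because the stack grows toward lower addresses.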

Push to Top of Stack

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
|  | PSH dbl(Lmem) | No | 2 | 1 | X |

## Opcode

10 1 \| AAAA A AAAI

## Operands

Lmem

## Description

This instruction decrements SP by 2, then moves the 16 highest bits of data memory location Lmem to the 16-bit data memory location pointed by SP and moves the 16 lowest bits of data memory location Lmem to the 16-bit data memory location pointed by SP + 1.

When Lmem is at an even address, the two 16-bit values pushed onto the stack are stored at memory location Lmem in the same order. When Lmem is at an odd address, the two 16-bit values pushed onto the stack are stored at memory location Lmem in the reverse order.

## Status Bits

Affected by none

Affects none

## Repeat

This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| PSH dbl(*AR3-) | The data stack pointer (SP) is decremented by 2. The 16 highest bits of the content at the location addressed by AR3 are copied to the memory location pointed by SP and the 16 lowest bits of the content at the location addressed by AR3 are copied to the memory location pointed by SP + 1. Because this instruction is a long-operand instruction, AR3 is decremented by 2 after the execution. |

## PSHBOTH

Push Accumulator or Extended Auxiliary Register Content to Stack Pointers

## Syntax Characteristics



## RESET

## Software Reset

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | RESET | No | 2 | ? | D |

## Opcode

1001 0100 \| xxxx xxxx

## Operands

none

## Description

This instruction performs a nonmaskable software reset that can be used at any time to put the device in a known state.

The reset instruction affects ST0_55, ST1_55, ST2_55, IFR0, IFR1, and T2 (Table 5-6 and Figure 5-3); status register ST3_55 and the interrupt vectors pointer registers (IVPD and IVPH) are not affected. When the reset instruction is acknowledged, INTM is set to 1 to disable maskable interrupts. All pending interrupts in IFR0 and IFR1 are cleared. The initialization of the system control register, the interrupt vectors pointer, and the peripheral registers is different from the initialization performed by a hardware reset.

## Status Bits

Affected by none

Affects IFR0, IFR1, ST0_55, ST1_55, ST2_55

## Repeat

This instruction cannot be repeated.

## Table 5-6. Effects of a Software Reset on DSP Registers

| Register | Bit | Reset Value | Comment |
| :---: | :---: | :---: | :---: |
| T2 | All | 0 | All bits are cleared. To ensure TMS320C54x DSP compatibility, instructions affected by ASM bit will use a shift count of 0 (no shift). |
| IFR0 | All | 0 | All pending interrupt flags are cleared. |
| IFR1 | All | 0 | All pending interrupt flags are cleared. |
| ST0_55 | ACOV2 | 0 | AC2 overflow flag is cleared. |
|  | ACOV3 | 0 | AC3 overflow flag is cleared. |
|  | TC1 | 1 | Test control flag 1 is set. |
|  | TC2 | 1 | Test control flag 2 is set. |
|  | CARRY | 1 | CARRY bit is set. |
|  | ACOV0 | 0 | AC0 overflow flag is cleared. |
|  | ACOV1 | 0 | AC1 overflow flag is cleared. |
|  | DP | 0 | All bits are cleared, data page 0 is selected. |
| ST1_55 | BRAF | 0 | This flag is cleared. |
|  | CPL | 0 | The DP (rather than SP) direct addressing mode is selected. Direct accesses to data space are made relative to the data page register (DP). |
|  | XF | 1 | External flag is set. |
|  | HM | 0 | When an active HOLD signal forces the DSP to place its external interface in the high-impedance state, the DSP continues executing code from internal memory. |
|  | INTM | 1 | Maskable interrupts are globally disabled. |
|  | M40 | 0 | 32-bit (rather than 40-bit) computation mode is selected for the D unit. |
|  | SATD | 0 | CPU will not saturate overflow results in the D unit. |
|  | SXMD | 1 | Sign-extension mode is on. |
|  | C16 | 0 | Dual 16-bit mode is off. For an instruction that is affected by C16, the D-unit ALU performs one 32-bit operation rather than two parallel 16-bit operations. |
|  | FRCT | 0 | Results of multiply operations are not shifted. |
|  | C54CM | 1 | TMS320C54x-compatibility mode is on. |
|  | ASM | 0 | Instructions affected by ASM will use a shift count of 0 (no shift). |
| ST2_55 | ARMS | 0 | When you use the AR indirect addressing mode, the DSP mode (rather than control mode) operands are available. |
|  | DBGM | 1 | Debug events are disabled. |
|  | EALLOW | 0 | A program cannot write to the non-CPU emulation registers. |
|  | RDM | 0 | When an instruction specifies that an operand should be rounded, the CPU uses rounding to the infinite (rather than rounding to the nearest). |
|  | CDPLC | 0 | CDP is used for linear addressing (rather than circular addressing). |
|  | AR7LC | 0 | AR7 is used for linear addressing. |
|  | AR6LC | 0 | AR6 is used for linear addressing. |
|  | AR5LC | 0 | AR5 is used for linear addressing. |
|  | AR4LC | 0 | AR4 is used for linear addressing. |
|  | AR3LC | 0 | AR3 is used for linear addressing. |
|  | AR2LC | 0 | AR2 is used for linear addressing. |
|  | AR1LC | 0 | AR1 is used for linear addressing. |
|  | AR0LC | 0 | AR0 is used for linear addressing. |

Figure 5-3. Effects of a Software Reset on Status Registers
ST0_55


## ST1_55

| 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| BRAF | CPL | XF | HM | INTM | M40 | SATD | SXMD |
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 |


| 7 | 6 | 5 | 4-0 |
| :---: | :---: | :---: | :---: |
| C16 | FRCT | C54CM | ASM |
| 0 | 0 | 1 | 0 |

ST2_55


| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AR7LC | AR6LC | AR5LC | AR4LC | AR3LC | AR2LC | AR1LC | AR0LC |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
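As a cross-check of Table 5-6 against Figure 5-3, the ST1_55 reset values can be packed into a single 16-bit word. The Python sketch below is illustrative only: the names `ST1_55_FIELDS` and `pack` are hypothetical, and it assumes that C54CM occupies bit 5 and that ASM is the 5-bit field in bits 4-0.

```python
# (field name, position of the field's least significant bit, width in bits, reset value)
ST1_55_FIELDS = [
    ("BRAF", 15, 1, 0), ("CPL", 14, 1, 0), ("XF", 13, 1, 1), ("HM", 12, 1, 0),
    ("INTM", 11, 1, 1), ("M40", 10, 1, 0), ("SATD", 9, 1, 0), ("SXMD", 8, 1, 1),
    ("C16", 7, 1, 0), ("FRCT", 6, 1, 0),
    ("C54CM", 5, 1, 1),   # assumption: bit 5
    ("ASM", 0, 5, 0),     # assumption: bits 4-0
]


def pack(fields):
    """Combine the per-field values into one register word."""
    word = 0
    for name, lsb, width, value in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        word |= value << lsb
    return word


ST1_55_RESET = pack(ST1_55_FIELDS)
print(f"ST1_55 after software reset: {ST1_55_RESET:04X}h")
```

Only XF, INTM, SXMD, and C54CM contribute set bits, which makes it easy to confirm the packed word by hand.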

## RET

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | RET | Yes | 2 | 5 | D |

## Opcode

0100 100E \| xxxx x100

## Operands

none

## Description

This instruction passes control back to the calling subroutine.

After returning from a called subroutine, the CPU restores the value of two internal registers: the program counter (PC) and a loop context register. The CPU uses these values to re-establish the context of the program sequence.

In the slow-return process (default), the return address (from the PC) and the loop context bits are restored from the stacks (in memory). When the CPU returns from a subroutine, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are restored from the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

When a return from a subroutine occurs:

- The loop context bits concatenated with the 8 MSBs of the return address are popped from the top of the system stack pointer (SSP). The SSP is incremented by 1 word in the address phase of the pipeline.
- The 16 LSBs of the return address are popped from the top of the data stack pointer (SP). The SP is incremented by 1 word in the address phase of the pipeline.
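The two pops of the slow-return process can be sketched as follows. This is a hedged illustration, not a description of the hardware: `ret_slow` and the memory dictionary are hypothetical, and the layout of the SSP word (loop context bits in the upper byte, 8 MSBs of the return address in the lower byte) is an assumption drawn from the phrase "concatenated with".

```python
def ret_slow(sp, ssp, mem):
    """Pop one word from each stack and rebuild the 24-bit return address."""
    ssp_word = mem[ssp]                        # loop context bits : return address MSBs
    sp_word = mem[sp]                          # 16 LSBs of the return address
    loop_context = (ssp_word >> 8) & 0xFF      # assumption: upper byte of the SSP word
    pc = ((ssp_word & 0xFF) << 16) | sp_word   # 8 MSBs followed by 16 LSBs
    return pc, loop_context, (sp + 1) & 0xFFFF, (ssp + 1) & 0xFFFF


# Hypothetical stack content: return address 011234h, loop context 40h.
stack = {0x02FF: 0x1234, 0x01FF: 0x4001}
pc, loop_context, sp, ssp = ret_slow(0x02FF, 0x01FF, stack)
```

Both stack pointers advance by one word, which mirrors the paired push performed by the call.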


## Status Bits

Affected by none

Affects none

## Repeat

This instruction cannot be repeated.

## See Also

See the following other related instructions:

- CALL (Call Unconditionally)
- CALLCC (Call Conditionally)
- RETCC (Return Conditionally)
- RETI (Return from Interrupt)

## Example

| Syntax | Description |
| :--- | :--- |
| RET | The program counter is loaded with the return address of the calling subroutine. |

## RETCC

Return Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles† | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | RETCC cond | Yes | 3 | 5/5 | R |

† x/y cycles: x cycles = condition true, y cycles = condition false

## Opcode

## Operands

cond

## Description

This instruction evaluates a single condition defined by the cond field in the read phase of the pipeline. If the condition is true, a return occurs to the return address of the calling subroutine. There is a 1-cycle latency on the condition setting. A single condition can be tested as determined by the cond field of the instruction. See Table 1-3 for a list of conditions.

After returning from a called subroutine, the CPU restores the value of two internal registers: the program counter (PC) and a loop context register. The CPU uses these values to re-establish the context of the program sequence.

In the slow-return process (default), the return address (from the PC) and the loop context bits are restored from the stacks (in memory). When the CPU returns from a subroutine, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are restored from the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

When a return from a subroutine occurs:

- The loop context bits concatenated with the 8 MSBs of the return address are popped from the top of the system stack pointer (SSP). The SSP is incremented by 1 word in the read phase of the pipeline.
- The 16 LSBs of the return address are popped from the top of the data stack pointer (SP). The SP is incremented by 1 word in the read phase of the pipeline.



## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, the comparison of accumulators to 0 is performed as if M40 was set to 1.

## Status Bits

Affected by ACOVx, CARRY, C54CM, M40, TCx

Affects ACOVx

## Repeat

This instruction cannot be repeated.

## See Also

See the following other related instructions:

- CALL (Call Unconditionally)
- CALLCC (Call Conditionally)
- RET (Return Unconditionally)
- RETI (Return from Interrupt)

## Example

| Syntax | Description |
| :--- | :--- |
| RETCC ACOV0 = \#0 | The AC0 overflow bit is equal to 0, so the program counter (PC) is loaded with the return address of the calling subroutine. |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| ACOV0 |  | ACOV0 | 0 |
| PC |  | PC | (return address) |
| SP |  | SP |  |
## RETI

Return from Interrupt

## Syntax Characteristics

| No. Syntax | Parallel Enable Bit Size |
| :---: | :---: |
| [1] RETI | No 205 |

## Opcode

0100 100E \| xxxx x101

## Operands

none

## Description

This instruction passes control back to the interrupted task.

After returning from an interrupt service routine (ISR), the CPU automatically restores the value of some CPU registers and two internal registers: the program counter (PC) and a loop context register. The CPU uses these values to re-establish the context of the program sequence.

In the slow-return process (default), the return address (from the PC), the loop context bits, and some CPU registers are restored from the stacks (in memory). When the CPU returns from an ISR, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are restored from the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions. Some CPU registers are restored from the stacks (in memory). For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

When a return from an interrupt occurs:

- The loop context bits concatenated with the 8 MSBs of the return address are popped from the top of the system stack pointer (SSP). The SSP is incremented by 1 word in the address phase of the pipeline.
- The 16 LSBs of the return address are popped from the top of the data stack pointer (SP). The SP is incremented by 1 word in the address phase of the pipeline.
- The debug status register (DBSTAT) content is popped from the top of SSP. The SSP is incremented by 1 word in the access phase of the pipeline.
- The status register 1 (ST1_55) content is popped from the top of SP. The SP is incremented by 1 word in the access phase of the pipeline.
- The 7 higher bits of status register 0 (ST0_55) concatenated with 9 zeroes are popped from the top of SSP. The SSP is incremented by 1 word in the read phase of the pipeline.
- The status register 2 (ST2_55) content is popped from the top of SP. The SP is incremented by 1 word in the read phase of the pipeline.
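The pop sequence above amounts to three paired reads, one word from each stack per pipeline phase. The Python sketch below models only that ordering. Everything in it is hypothetical (`reti_slow`, the memory dictionary, the sample words), and two points are assumptions rather than statements from this guide: the loop context bits are taken from the upper byte of the first SSP word, and the 9 low bits of ST0_55 are left unchanged by the return.

```python
def reti_slow(sp, ssp, mem, st0):
    """Pop the three SSP/SP word pairs restored by RETI, in the listed order."""
    def pop_pair(sp, ssp):
        return mem[ssp], mem[sp], (sp + 1) & 0xFFFF, (ssp + 1) & 0xFFFF

    # Address phase: loop context bits and return address.
    hi, lo, sp, ssp = pop_pair(sp, ssp)
    loop_context = (hi >> 8) & 0xFF              # assumption: upper byte
    pc = ((hi & 0xFF) << 16) | lo
    # Access phase: DBSTAT from SSP, ST1_55 from SP.
    dbstat, st1, sp, ssp = pop_pair(sp, ssp)
    # Read phase: 7 higher bits of ST0_55 from SSP, ST2_55 from SP.
    st0_word, st2, sp, ssp = pop_pair(sp, ssp)
    st0 = (st0_word & 0xFE00) | (st0 & 0x01FF)   # assumption: bits 8-0 are kept
    return {"pc": pc, "loop_context": loop_context, "dbstat": dbstat,
            "st0": st0, "st1": st1, "st2": st2, "sp": sp, "ssp": ssp}


# Hypothetical stack frames for illustration.
frames = {0x02FA: 0x1234, 0x02FB: 0x2920, 0x02FC: 0xF000,
          0x01FA: 0x0001, 0x01FB: 0x0000, 0x01FC: 0x3800}
state = reti_slow(0x02FA, 0x01FA, frames, st0=0x0005)
```

Each stack pointer advances by three words in total, one per phase.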



## ROL

Rotate Left Accumulator, Auxiliary, or Temporary Register Content

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
|  | ROL BitOut, src, BitIn, dst |  |  |  |  |
| [1] | ROL TC2, src, TC2, dst | Yes | 3 | 1 | X |
| [2] | ROL TC2, src, CARRY, dst | Yes | 3 | 1 | X |
| [3] | ROL CARRY, src, TC2, dst | Yes | 3 | 1 | X |
| [4] | ROL CARRY, src, CARRY, dst | Yes | 3 | 1 | X |

## Opcode

## Operands

dst, src

## Description

This instruction performs a bitwise rotation to the MSBs. Either TC2 or CARRY can be used to shift in one bit (BitIn) or to store the shifted out bit (BitOut). The one bit in BitIn is shifted into the source (src) operand and the shifted out bit is stored to BitOut.

- When the destination (dst) operand is an accumulator:
  - if an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the register are zero extended to 40 bits
  - the operation is performed on 40 bits in the D-unit shifter
  - BitIn is inserted at bit position 0
  - BitOut is extracted at a bit position according to M40
- When the destination (dst) operand is an auxiliary or temporary register:
  - if an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation
  - the operation is performed on 16 bits in the A-unit ALU
  - BitIn is inserted at bit position 0
  - BitOut is extracted at bit position 15

Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

## Status Bits

Affected by CARRY, M40, TC2

Affects CARRY, TC2

## Repeat

This instruction can be repeated.

## See Also

See the following other related instructions:

- ROR (Rotate Right Accumulator, Auxiliary, or Temporary Register Content)

## Example

| Syntax | Description |
| :--- | :--- |
| ROL CARRY, AC1, TC2, AC1 | The value of TC2 (1) before the execution of the instruction is shifted into the LSB of AC1 and bit 31 shifted out from AC1 is stored in the CARRY status bit. The rotated value is stored in AC1. Because M40 = 0, the guard bits (39-32) are cleared. |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC1 | 0F E340 5678 | AC1 | 00 C680 ACF1 |
| TC2 | 1 | TC2 | 1 |
| CARRY | 1 | CARRY | 1 |
| M40 | 0 | M40 | 0 |
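The rotation in this example can be checked with a short Python model. It is a sketch under stated assumptions: `rol_acc` and `rol_reg16` are hypothetical names, and BitOut is assumed to come from bit 39 when M40 = 1 (the example only confirms bit 31 for M40 = 0).

```python
def rol_acc(acc40, bit_in, m40):
    """Rotate an accumulator left by one bit; return (result, BitOut)."""
    width = 40 if m40 else 32                # assumption: bit 39 is used when M40 = 1
    bit_out = (acc40 >> (width - 1)) & 1     # bit leaving the rotated field
    result = ((acc40 << 1) | (bit_in & 1)) & ((1 << width) - 1)
    return result, bit_out                   # with M40 = 0 the guard bits end up cleared


def rol_reg16(reg, bit_in):
    """Rotate a 16-bit auxiliary or temporary register left; BitOut is bit 15."""
    return ((reg << 1) | (bit_in & 1)) & 0xFFFF, (reg >> 15) & 1


# ROL CARRY, AC1, TC2, AC1 with AC1 = 0F E340 5678, TC2 = 1, M40 = 0.
ac1, carry = rol_acc(0x0FE3405678, 1, 0)
```

Masking to the active width is what clears the guard bits in the M40 = 0 case.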

## ROR

Rotate Right Accumulator, Auxiliary, or Temporary Register Content

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
|  | ROR BitIn, src, BitOut, dst |  |  |  |  |
| [1] | ROR TC2, src, TC2, dst | Yes | 3 | 1 | X |
| [2] | ROR TC2, src, CARRY, dst | Yes | 3 | 1 | X |
| [3] | ROR CARRY, src, TC2, dst | Yes | 3 | 1 | X |
| [4] | ROR CARRY, src, CARRY, dst | Yes | 3 | 1 | X |

## Opcode

0001 001E \| FSSS xx11 \| FDDD 1xvv

## Operands

dst, src

## Description

This instruction performs a bitwise rotation to the LSBs. Either TC2 or CARRY can be used to shift in one bit (BitIn) or to store the shifted out bit (BitOut). The one bit in BitIn is shifted into the source (src) operand and the shifted out bit is stored to BitOut.

- When the destination (dst) operand is an accumulator:
  - if an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the register are zero extended to 40 bits
  - the operation is performed on 40 bits in the D-unit shifter
  - BitIn is inserted at a bit position according to M40
  - BitOut is extracted at bit position 0
- When the destination (dst) operand is an auxiliary or temporary register:
  - if an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation
  - the operation is performed on 16 bits in the A-unit ALU
  - BitIn is inserted at bit position 15
  - BitOut is extracted at bit position 0

Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

## Status Bits

Affected by CARRY, M40, TC2

Affects CARRY, TC2

## Repeat

This instruction can be repeated.

## See Also

See the following other related instructions:

- ROL (Rotate Left Accumulator, Auxiliary, or Temporary Register Content)

## Example

| Syntax | Description |
| :--- | :--- |
| ROR TC2, AC0, TC2, AC1 | The value of TC2 (1) before the execution of the instruction is shifted into bit 31 of AC0 and the LSB shifted out from AC0 is stored in TC2. The rotated value is stored in AC1. Because M40 = 0, the guard bits (39-32) are cleared. |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC0 | 5F B000 1234 | AC0 | 5F B000 1234 |
| AC1 | 00 C680 ACF1 | AC1 | 00 D800 091A |
| TC2 | 1 | TC2 | 0 |
| M40 | 0 | M40 | 0 |
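A matching Python sketch for the right rotation follows. As before, this is illustrative only: `ror_acc` is a hypothetical name, and inserting BitIn at bit 39 when M40 = 1 is an assumption (the example confirms bit 31 for M40 = 0).

```python
def ror_acc(acc40, bit_in, m40):
    """Rotate an accumulator right by one bit; return (result, BitOut)."""
    width = 40 if m40 else 32                        # assumption: bit 39 when M40 = 1
    bit_out = acc40 & 1                              # the LSB is shifted out
    body = (acc40 & ((1 << width) - 1)) >> 1         # guard bits dropped when M40 = 0
    result = body | ((bit_in & 1) << (width - 1))    # BitIn enters at the top of the field
    return result, bit_out


# ROR TC2, AC0, TC2, AC1 with AC0 = 5F B000 1234, TC2 = 1, M40 = 0.
ac1, tc2 = ror_acc(0x5FB0001234, 1, 0)
```

The source accumulator AC0 is not modified; only the destination and the BitOut status bit change.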

## ROUND

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | ROUND [ACx,] ACy | Yes | 2 | 1 | X |

## Opcode

0101 010E \| DDSS S 1

## Operands

ACx, ACy

## Description

This instruction performs a rounding of the source accumulator ACx. The rounding operation depends on RDM:

- When RDM = 0, rounding to the infinite is performed: 8000h is added to the 40-bit source accumulator ACx.
- When RDM = 1, rounding to the nearest is performed. The rounding operation depends on the 17 LSBs of the 40-bit source accumulator ACx:
  - if (8000h < bit(15-0) < 10000h), 8000h is added to the 40-bit source accumulator ACx
  - else if (bit(15-0) = 8000h) and (bit(16) = 1), 8000h is added to the 40-bit source accumulator ACx

If a rounding has been performed, the 16 lowest bits of the result are cleared.

- Addition overflow detection depends on M40.
- No addition carry report is stored in CARRY.
- If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, the rounding is performed without clearing the LSBs of accumulator ACx.


## Status Bits

Affected by C54CM, M40, RDM, SATD

Affects ACOVy

## Repeat

This instruction cannot be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| ROUND AC0, AC1 | The content of AC0 is added to 8000h, the 16 LSBs are cleared to 0, and the result is stored in AC1. M40 is cleared to 0, so overflow is detected at bit 31; SATD is cleared to 0, so AC1 is not saturated. |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC0 | EF 0FF0 8023 | AC0 | EF 0FF0 8023 |
| AC1 | 00 0000 0000 | AC1 | EF 0FF1 0000 |
| RDM | 1 | RDM | 1 |
| M40 | 0 | M40 | 0 |
| SATD | 0 | SATD | 0 |
| ACOV1 | 0 | ACOV1 | 1 |
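The example values can be reproduced with the Python sketch below. It is a simplified model, not the hardware definition: `round_acc` and `to_signed40` are hypothetical names, SATD is assumed to be 0 (no saturation), the 16 LSBs are assumed to be cleared whenever C54CM = 0, and overflow is modeled as the result no longer fitting in a signed 32-bit (M40 = 0) or 40-bit (M40 = 1) value.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit pattern as a two's-complement integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def round_acc(acx, rdm, m40, c54cm=0):
    """Return (ACy, overflow flag) for ROUND ACx, ACy, assuming SATD = 0."""
    value = to_signed40(acx)
    low16 = acx & 0xFFFF
    bit16 = (acx >> 16) & 1
    if rdm == 0:
        add = True                                    # rounding to the infinite
    else:
        add = low16 > 0x8000 or (low16 == 0x8000 and bit16 == 1)
    if add:
        value += 0x8000
    if not c54cm:
        value &= ~0xFFFF                              # assumption: LSBs always cleared
    limit = 1 << (39 if m40 else 31)
    overflow = int(not (-limit <= value < limit))     # assumption: range-based check
    return value & MASK40, overflow


# ROUND AC0, AC1 with AC0 = EF 0FF0 8023, RDM = 1, M40 = 0.
ac1, acov1 = round_acc(0xEF0FF08023, rdm=1, m40=0)
```

With RDM = 1, an exact half (low word 8000h) is rounded up only when bit 16 is 1, which is the tie case the two RDM settings treat differently.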

## RPT

Repeat Single Instruction Unconditionally
## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | RPT k8 | Yes | 2 | 1 | AD |
| [2] | RPT k16 | Yes | 3 | 1 | AD |
| [3] | RPT CSR | Yes | 2 | 1 | AD |

## Description

This instruction repeats the next instruction or the next two paralleled instructions the number of times specified by the content of the computed single repeat register (CSR) + 1 or an immediate value, kx + 1. This value is loaded into the repeat counter register (RPTC). The maximum number of executions of a given instruction or paralleled instructions is $2^{16}-1$ (65535).

The repeat single mechanism triggered by these instructions is interruptible.

These instructions cannot be repeated.

These instructions cannot be used as the last instruction in a repeat loop structure.

Two paralleled instructions can be repeated when following the parallelism general rules.

See section 1.5 for a list of instructions that cannot be used in a repeat single mechanism.

## Status Bits

Affected by none

Affects none

## See Also

See the following other related instructions:

- RPTADD (Repeat Single Instruction Unconditionally and Increment CSR)
- RPTB (Repeat Block of Instructions Unconditionally)
- RPTCC (Repeat Single Instruction Conditionally)
- RPTSUB (Repeat Single Instruction Unconditionally and Decrement CSR)


## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | RPT k8 | Yes | 2 | 1 | AD |
| [2] | RPT k16 | Yes | 3 | 1 | AD |

## Opcode

| Operand | Opcode |
| :--- | :--- |
| k8 | 01001 0E \| kk kkkk |
| k16 | 0000 110E \| kkkk kkkk \| kkkk kkkk |

## Operands

kx

## Description

This instruction repeats the next instruction or the next two paralleled instructions the number of times specified by an immediate value, kx + 1. The repeat counter register (RPTC):

- Is loaded with the immediate value in the address phase of the pipeline.
- Is decremented by 1 in the decode phase of the repeated instruction.
- Contains 0 at the end of the repeat single mechanism.
- Must not be accessed when it is being decremented in the repeat single mechanism.

The repeat single mechanism triggered by this instruction is interruptible.

Two paralleled instructions can be repeated when following the parallelism general rules.

This instruction cannot be used as the last instruction in a repeat loop structure.

See section 1.5 for a list of instructions that cannot be used in a repeat single mechanism.

## Status Bits

Affected by none

Affects none

## Repeat

This instruction cannot be repeated.

## Example 1

| Syntax | Description |
| :--- | :--- |
| RPT \#3 | The single instruction following the repeat instruction is repeated four times. |
| MACM *AR3+, *AR4+, AC1 |  |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC1 | 00 0000 0000 | AC1 | 00 3376 AD10 |
| AR3 | 0200 | AR3 | 0204 |
| AR4 | 0400 | AR4 | 0404 |
| 200 | AC03 | 200 | AC03 |
| 201 | 3468 | 201 | 3468 |
| 202 | FE00 | 202 | FE00 |
| 203 | 23DC | 203 | 23DC |
| 400 | D768 | 400 | D768 |
| 401 | 6987 | 401 | 6987 |
| 402 | 3400 | 402 | 3400 |
| 403 | 7900 | 403 | 7900 |
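The result in this example can be reproduced by modeling the repeat counter and the multiply-accumulate in Python. This is a sketch, not a cycle-accurate model: `rpt_macm` and `to_signed16` are hypothetical names, the operands are assumed to be multiplied as signed 16-bit values with FRCT = 0, and overflow and saturation handling are ignored.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a two's-complement integer."""
    return word - 0x10000 if word & 0x8000 else word


def rpt_macm(count_field, mem, ar3, ar4, acc):
    """Run MACM *AR3+, *AR4+, AC1 under RPT: count_field + 1 executions."""
    rptc = count_field                       # RPTC is loaded with the immediate value
    while True:
        acc += to_signed16(mem[ar3]) * to_signed16(mem[ar4])
        ar3 += 1                             # post-increment of both pointers
        ar4 += 1
        if rptc == 0:
            break
        rptc -= 1                            # decremented once per repeated execution
    return acc & 0xFFFFFFFFFF, ar3, ar4, rptc


# Memory content from the example table.
data = {0x200: 0xAC03, 0x201: 0x3468, 0x202: 0xFE00, 0x203: 0x23DC,
        0x400: 0xD768, 0x401: 0x6987, 0x402: 0x3400, 0x403: 0x7900}
ac1, ar3, ar4, rptc = rpt_macm(3, data, 0x200, 0x400, 0)
```

RPTC ends at 0, matching the statement that the counter contains 0 at the end of the repeat single mechanism.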

## Example 2

| Syntax | Description |
| :--- | :--- |
| RPT \#513 | A single instruction is repeated as defined by the unsigned 16-bit value + 1 (513 + 1). |

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [3] | RPT CSR | Yes | 2 | 1 | AD |

## Opcode

0100 100E \| xxxx x000

## Operands

none

## Description

This instruction repeats the next instruction or the next two paralleled instructions the number of times specified by the content of the computed single repeat register (CSR) + 1. The repeat counter register (RPTC):

- Is loaded with CSR content in the address phase of the pipeline.
- Is decremented by 1 in the decode phase of the repeated instruction.
- Contains 0 at the end of the repeat single mechanism.
- Must not be accessed when it is being decremented in the repeat single mechanism.

The repeat single mechanism triggered by this instruction is interruptible.

Two paralleled instructions can be repeated when following the parallelism general rules.

This instruction cannot be used as the last instruction in a repeat loop structure.

See section 1.5 for a list of instructions that cannot be used in a repeat single mechanism.

## Status Bits

Affected by none

Affects none

## Repeat

This instruction cannot be repeated.

Example

| Syntax | Description |
| :--- | :--- |
| RPT CSR | The single instruction following the repeat instruction is repeated as defined <br> by the content of CSR +1. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC1 | 00 0000 0000 | AC1 | 00 3376 AD10 |
| CSR | 0003 | CSR | 0003 |
| AR3 | 0200 | AR3 | 0204 |
| AR4 | 0400 | AR4 | 0404 |
| 200 | AC03 | 200 | AC03 |
| 201 | 3468 | 201 | 3468 |
| 202 | FE00 | 202 | FE00 |
| 203 | 23DC | 203 | 23DC |
| 400 | D768 | 400 | D768 |
| 401 | 6987 | 401 | 6987 |
| 402 | 3400 | 402 | 3400 |
| 403 | 7900 | 403 | 7900 |

## RPTADD

Repeat Single Instruction Unconditionally and Increment CSR

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | RPTADD CSR, TAx | Yes | 2 | 1 | X |
| [2] | RPTADD CSR, k4 | Yes | 2 | 1 | X |

## Description

These instructions repeat the next instruction or the next two paralleled instructions the number of times specified by the content of the computed single repeat register (CSR) + 1. This value is loaded into the repeat counter register (RPTC). The maximum number of executions of a given instruction or paralleled instructions is $2^{16}-1$ (65535).

With the A-unit ALU, these instructions allow the content of CSR to be incremented. The CSR modification is performed in the execute phase of the pipeline; there is a 3-cycle latency between the CSR modification and its usage in the address phase.

The repeat single mechanism triggered by these instructions is interruptible.
Two paralleled instructions can be repeated when following the parallelism general rules.

These instructions cannot be repeated.
These instructions cannot be used as the last instruction in a repeat loop structure.

See section 1.5 for a list of instructions that cannot be used in a repeat single mechanism.

## Status Bits

Affected by none

Affects none

## See Also

See the following other related instructions:

- RPT (Repeat Single Instruction Unconditionally)
- RPTB (Repeat Block of Instructions Unconditionally)
- RPTCC (Repeat Single Instruction Conditionally)
- RPTSUB (Repeat Single Instruction Unconditionally and Decrement CSR)

Repeat Single Instruction Unconditionally and Increment CSR

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | RPTADD CSR, TAx | Yes | 2 | 1 | X |

Opcode

## Operands

## Description

## Status Bits

Repeat
Example

| Syntax | Description |
| :--- | :--- |
| RPTADD CSR, T1 | A single instruction is repeated as defined by the content of CSR + 1. The content <br> of CSR is incremented by the content of temporary register T1. |

Repeat Single Instruction Unconditionally and Increment CSR

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | RPTADD CSR, k4 | Yes | 2 | 1 | X |

## Opcode

0100 100E \| kkkk x010

## Operands

k4

## Description

This instruction repeats the next instruction or the next two paralleled instructions the number of times specified by the content of the computed single repeat register (CSR) + 1. The repeat counter register (RPTC):

- Is loaded with CSR content in the address phase of the pipeline.
- Is decremented by 1 in the decode phase of the repeated instruction.
- Contains 0 at the end of the repeat single mechanism.
- Must not be accessed when it is being decremented in the repeat single mechanism.

With the A-unit ALU, this instruction allows the content of CSR to be incremented by k4. The CSR modification is performed in the execute phase of the pipeline; there is a 3-cycle latency between the CSR modification and its usage in the address phase.

The repeat single mechanism triggered by this instruction is interruptible.
Two paralleled instructions can be repeated when following the parallelism general rules.

This instruction cannot be used as the last instruction in a repeat loop structure.

See section 1.5 for a list of instructions that cannot be used in a repeat single mechanism.
## Status Bits

Affected by none

Affects none

## Repeat

This instruction cannot be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| RPTADD CSR, \#2 | A single instruction is repeated as defined by the content of CSR + 1. The content <br> of CSR is incremented by the unsigned 4-bit value (2). |
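The interplay between the repeat count and the CSR update can be sketched in Python. The helper `rptadd` is a hypothetical name, the starting CSR value of 3 is chosen for illustration, and the 16-bit wraparound of CSR is an assumption; pipeline latency is not modeled.

```python
def rptadd(csr, increment):
    """Return (number of executions, updated CSR) for one RPTADD."""
    executions = csr + 1                    # RPTC is loaded with the CSR content first
    new_csr = (csr + increment) & 0xFFFF    # assumption: CSR wraps as a 16-bit register
    return executions, new_csr


# Three successive RPTADD CSR, #2 instructions starting from a hypothetical CSR = 3.
csr = 3
history = []
for _ in range(3):
    runs, csr = rptadd(csr, 2)
    history.append(runs)
```

Because the count is taken before CSR is incremented, each pass repeats its instruction two more times than the previous pass.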

## RPTB

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | RPTBLOCAL pmad | Yes | 2 | 1 | AD |
| [2] | RPTB pmad | Yes | 3 | 1 | AD |

## Description

These instructions repeat a block of instructions the number of times specified by:

- The content of BRC0 + 1, if no loop has already been detected.
- The content of BRS1 + 1, if one level of the loop has already been detected.

Loop structures defined by these instructions must have the following characteristics:

- The minimum number of instructions executed within one loop iteration is 2.
- The minimum number of cycles executed within one loop iteration is 2.
- Since the result of updating BRCx (and BRAF in C54CM = 1) within 3 instruction cycles from the end of the loop is uncertain (effective in the same iteration or the next iteration depending on the pipeline state), this operation is prohibited.
- The block-repeat operation can only be cleared by branching to a destination address outside the active block-repeat loop.
- The C54CM bit in ST1_55 cannot be modified within a block-repeat loop.

These instructions cannot be repeated.
See section 1.5 for a list of instructions that cannot be used in a repeat block mechanism.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |

See Also See the following other related instructions:

- RPT (Repeat Single Instruction Unconditionally)
- RPTADD (Repeat Single Instruction Unconditionally and Increment CSR)
- RPTCC (Repeat Single Instruction Conditionally)
- RPTSUB (Repeat Single Instruction Unconditionally and Decrement CSR)


## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | RPTBLOCAL pmad | Yes | 2 | 1 | AD |

Description This instruction repeats a block of instructions the number of times specified by:

- The content of BRC0 + 1, if no loop has already been detected. In this case:
  - In the address phase of the pipeline, RSA0 is loaded with the program address of the first instruction of the loop.
  - The program address (pmad) of the last instruction of the loop (that may be two parallel instructions) is computed in the address phase of the pipeline and stored in REA0.
  - BRC0 is decremented at the address phase of the last instruction of the loop when its content is not equal to 0.
  - BRC0 contains 0 after the block-repeat operation has ended.
- The content of BRS1 + 1, if one level of the loop has already been detected. In this case:
  - BRC1 is loaded with the content of BRS1 in the address phase of the repeat block instruction.
  - In the address phase of the pipeline, RSA1 is loaded with the program address of the first instruction of the loop.
  - The program address of the last instruction of the loop (that may be two parallel instructions) is computed in the address phase of the pipeline and stored in REA1.
  - BRC1 is decremented at the address phase of the last instruction of the loop when its content is not equal to 0.
  - BRC1 contains 0 after the block-repeat operation has ended.
  - BRS1 content is not impacted by the block-repeat operation.
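
The counter behavior described above can be illustrated with a small behavioral model. This is only a sketch in Python, not CPU-accurate: the function name `run_block_repeat` and its return tuple are hypothetical, pipeline phases are ignored, and the model assumes exactly one inner loop nested in one outer loop.

```python
def run_block_repeat(brc0, brs1):
    """Model an outer block-repeat loop (level 0) containing one inner loop (level 1).

    Returns (outer_runs, inner_runs, final_brc0, final_brc1, final_brs1).
    """
    outer_runs = 0
    inner_runs = 0
    brc1 = 0
    while True:
        outer_runs += 1
        # Inner repeat-block instruction: BRC1 is reloaded from BRS1 every time.
        brc1 = brs1
        while True:
            inner_runs += 1
            if brc1 == 0:   # counter exhausted: fall out of the inner loop
                break
            brc1 -= 1       # decremented only when its content is not 0
        if brc0 == 0:       # counter exhausted: fall out of the outer loop
            break
        brc0 -= 1
    return outer_runs, inner_runs, brc0, brc1, brs1


# BRC0 = 3 and BRS1 = 1: the outer body runs 4 times, the inner body 2 times per pass.
print(run_block_repeat(3, 1))
```

In this model both counters end at 0 and BRS1 keeps its value, which matches the bullets above.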

Loop structures defined by this instruction must have the following characteristics:

- The minimum number of instructions executed within one loop iteration is 2.
- The minimum number of cycles executed within one loop iteration is 2.
- The maximum loop size is 128 bytes.
- Updating BRCx (and BRAF when C54CM = 1) within 3 instruction cycles of the end of the loop is prohibited, because the result is uncertain: the update may take effect in the same iteration or in the next iteration, depending on the pipeline state.
- The C54CM bit in ST1_55 cannot be modified within a block-repeat loop.
- The following instructions cannot be used as the last instruction in the loop structure:

| RPT | RPTCC | RPTADD |
| :--- | :--- | :--- |
| RPTSUB | XCC | XCCPART |

## Note:

Instructions if (cond) execute (AD_Unit), or if (cond) execute (D_Unit) must be replaced with their mnemonic ID: XCC and XCCPART as the last instruction in the loop structure if the instruction is executed with the instruction with which it is paralleled (if (cond) execute (AD_Unit) || instruction_executes conditionally)

A local loop is defined as when all the code of the loop is repeatedly executed from within the instruction buffer queue (IBQ):

- All the code of the local loop must fit within the 128-byte, 4-byte-aligned IBQ; therefore, local repeat blocks are limited to 128 bytes minus the 0 to 3 bytes of first-instruction misalignment. The 128th byte of the IBQ can only occur in a paralleled instruction. See Figure 5-4 for legal uses of the RPTBLOCAL instruction.
- The following instructions cannot be used in any form in a local loop code:

| BCC | CALL | IDLE |
| :--- | :--- | :--- |
| INTR | RESET | RET |
| RPTB | TRAP |  |

- Nested local repeat block (RPTBLOCAL) instructions are allowed.
- The only branch instructions allowed in a RPTBLOCAL structure are the branch instructions with a target branch address pointing to an instruction included within the loop code and being at a higher address than the branching instruction. In this case, the branch conditionally (BCC) instruction is executed in 3 cycles and the condition is evaluated in the address phase of the pipeline (there is a 3-cycle latency on the condition setting).

## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1:

- This instruction only uses block-repeat level 0; block-repeat level 1 is disabled.
- The block-repeat active flag (BRAF) is set to 1. BRAF is cleared to 0 at the end of the block-repeat operation when BRC0 contains 0.
- You can stop an active block-repeat operation by clearing BRAF to 0.
- Block-repeat control registers for level 1 are not used. Nested block-repeat operations are supported using the C54x convention with context save/restore and BRAF. When an interrupt is acknowledged, unlike the C54x device, BRAF is captured into the control-flow context register (CFCT), and saved to the stack. You can use a block/local loop instruction in an interrupt without preserving BRAF (while preserving BRC0, RSA0, and REA0).

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |
| Repeat | This instruction cannot be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| RPTBLOCAL | A block of instructions is repeated as defined by the content of BRC0 + 1. |


|  | Address | BRC0 | RSA0 | REA0 | BRS1 |
| :---: | :---: | :---: | :---: | :---: | :---: |
| MOV \#3, BRC0 |  | 0003 | 0000 | 0000 | 0000 |
| RPTBLOCAL \{ | 004003 | ?* | 4005 | 400D | ? |
| ... | 004005 | ? | ? | ? | ? |
| ... | 00400D | DTZ** | ? | ? | ? |
| \} |  | 0000 | 4005 | 400D | 0000 |
| *?: Unchanged <br> **DTZ: Decrease till zero |  |  |  |  |  |

Figure 5-4. Legal Uses of Repeat Block of Instructions Unconditionally (RPTBLOCAL) Instruction
(a) 128-Byte Unaligned Loop: Legal Use

The entire local repeat block and the next instruction reside in the IBQ; therefore, this code is accepted by the assembler.

(b) 129-Byte Unaligned Loop with Single Instruction at End of Loop: Illegal Use

```
    ...                         ; no alignment directive
RPTBLOCAL {
    1st instruction
    ...                         ; 129-byte loop body
    Last instruction            ; nonparalleled (single)
}
next instruction
```

The RPTBLOCAL instruction is not aligned; the next instruction may not be fetched in the IBQ. Because the last instruction of the local repeat block is a nonparalleled (single) instruction, the CPU must confirm that the next instruction does not have a parallel enable bit; therefore, this code is rejected by the assembler.

(c) 129-Byte Unaligned Loop with Paralleled Instruction at End of Loop—Legal Use


The RPTBLOCAL instruction is not aligned; the next instruction may not be fetched in the IBQ. Because the last instruction of the local repeat block is a paralleled instruction, the CPU does not need to confirm that the next instruction does not have a parallel enable bit; therefore, this code is accepted by the assembler.

(d) 129-Byte Aligned Loop with Single Instruction at End of Loop: Legal Use

        align 4                 ; alignment directive
    RPTBLOCAL {
        1st instruction
        ...                     ; 129-byte loop body
        Last instruction        ; nonparalleled (single)
    }
    next instruction

The RPTBLOCAL instruction is aligned, so the entire local repeat block and the next instruction reside in the IBQ. Because the next instruction is in the IBQ , the CPU can confirm that the next instruction does not have a parallel enable bit; therefore, this code is accepted by the assembler.

(e) 130-Byte Unaligned Loop: Illegal Use

        ...                     ; no alignment directive
    RPTBLOCAL {
        1st instruction
        ...                     ; 130-byte loop body
    }
    next instruction
    ...
The RPTBLOCAL instruction is not aligned; the entire local repeat block may not reside in the IBQ. Because the last instruction of the local repeat block may not reside in the IBQ, this code is rejected by the assembler.
(f) 130-Byte Aligned Loop with Single Instruction at End of Loop—Legal Use

```
    align 4                     ; alignment directive
    NOP_16 || NOP               ; 3-byte instruction
RPTBLOCAL {
    1st instruction
    ...                         ; 130-byte loop body
    Last instruction            ; nonparalleled (single)
}
next instruction
```

The NOP instructions are aligned so the RPTBLOCAL instruction, the entire local repeat block, and the next instruction reside in the IBQ. Because the next instruction is in the IBQ, the CPU can confirm that the next instruction does not have a parallel enable bit; therefore, this code is accepted by the assembler.

(g) 132-Byte Aligned Loop with Paralleled Instruction at End of Loop—Legal Use

```
    align 4                     ; alignment directive
    NOP_16                      ; 2-byte instruction
RPTBLOCAL {
    1st instruction
    ...                         ; 132-byte loop body
    Last instruction            ; paralleled
}
next instruction
```

The NOP instruction is aligned, so the RPTBLOCAL instruction and the entire local repeat block reside in the IBQ; the next instruction is not fetched in the IBQ. Because the last instruction of the local repeat block is a paralleled instruction, the CPU does not need to confirm that the next instruction does not have a parallel enable bit; therefore, this code is accepted by the assembler.

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | RPTB pmad | Yes | 3 | 1 | AD |
| Opcode |  |  |  |  |  |

Operands pmad

Description This instruction repeats a block of instructions the number of times specified by:

- The content of BRC0 + 1, if no loop has already been detected. In this case:
  - In the address phase of the pipeline, RSA0 is loaded with the program address of the first instruction of the loop.
  - The program address (pmad) of the last instruction of the loop (that may be two parallel instructions) is computed in the address phase of the pipeline and stored in REA0.
  - BRC0 is decremented at the address phase of the last instruction of the loop when its content is not equal to 0.
  - BRC0 contains 0 after the block-repeat operation has ended.
- The content of BRS1 + 1, if one level of the loop has already been detected. In this case:
  - BRC1 is loaded with the content of BRS1 in the address phase of the repeat block instruction.
  - In the address phase of the pipeline, RSA1 is loaded with the program address of the first instruction of the loop.
  - The program address of the last instruction of the loop (that may be two parallel instructions) is computed in the address phase of the pipeline and stored in REA1.
  - BRC1 is decremented at the address phase of the last instruction of the loop when its content is not equal to 0.
  - BRC1 contains 0 after the block-repeat operation has ended.
  - BRS1 content is not impacted by the block-repeat operation.

Loop structures defined by these instructions must have the following characteristics:

- The minimum number of instructions executed within one loop iteration is 2.
- The minimum number of cycles executed within one loop iteration is 2.
- The maximum loop size is 64 Kbytes.
- The block-repeat operation can only be cleared by branching to a destination address outside the active block-repeat loop.
- Updating BRCx (and BRAF when C54CM = 1) within 3 instruction cycles of the end of the loop is prohibited, because the result is uncertain: the update may take effect in the same iteration or in the next iteration, depending on the pipeline state.
- The C54CM bit in ST1_55 cannot be modified within a block-repeat loop.
- The following instructions cannot be used as the last instruction in the loop structure:

| RPT | RPTCC | RPTADD |
| :--- | :--- | :--- |
| RPTSUB | XCC |  |

- See section 1.5 for a list of instructions that cannot be used in the block-repeat loop code.

## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1:

- This instruction only uses block-repeat level 0; block-repeat level 1 is disabled.
- The block-repeat active flag (BRAF) is set to 1. BRAF is cleared to 0 at the end of the block-repeat operation when BRC0 contains 0.
- You can stop an active block-repeat operation by clearing BRAF to 0.
- Block-repeat control registers for level 1 are not used. Nested block-repeat operations are supported using the C54x convention with context save/restore and BRAF. The control-flow context register (CFCT) values are not used.
- BRAF is automatically cleared to 0 when a far branch (FB) or far call (FCALL) instruction is executed.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |
| Repeat | This instruction cannot be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| RPTB | A block of instructions is repeated as defined by the content of BRC0 + 1. A second loop of instructions is repeated as defined by the content of BRS1 + 1 (BRC1 is loaded with the content of BRS1). |

|  | Address | BRC0 | RSA0 | REA0 | BRS1 | BRC1 | RSA1 | REA1 |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| MOV \#3, BRC0 |  | 0003 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 |
| MOV \#1, BRC1 |  | ?* | ? | ? | 0001 | 0001 | ? | ? |
| RPTB \{ | 004006 | ? | 4009 | 4017 | ? | ? | ? | ? |
|  | 004009 | ? | ? | ? | ? | ? | ? | ? |
| RPTBLOCAL \{ | 00400B | ? | ? | ? | ? | (BRS1) | 400D | 4015 |
|  | 00400D | ? | ? | ? | ? | ? | ? | ? |
| ... | 004015 | ? | ? | ? | ? | DTZ** | ? | ? |
| \} |  |  |  |  |  |  |  |  |
|  | 004017 | DTZ** | ? | ? | ? | ? | ? | ? |
| \} |  | 0000 | 4009 | 4017 | 0001 | 0000 | 400D | 4015 |
| *?: Unchanged <br> **DTZ: Decrease till zero |  |  |  |  |  |  |  |  |

## RPTCC <br> Repeat Single Instruction Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | RPTCC k8, cond | Yes | 3 | 1 | AD |

Opcode 0000 000E xCCC CCCC kkkk kkkk

Operands cond, k8

Description This instruction evaluates a single condition defined by the cond field and, as long as the condition is true, the next instruction or the next two paralleled instructions is repeated the number of times specified by an 8-bit immediate value, k8 + 1. The maximum number of executions of a given instruction or paralleled instructions is $2^{8}-1$ (255). See Table 1-3 for a list of conditions.

The 8 LSBs of the repeat counter register (RPTC):

- Are loaded with the immediate value at the address phase of the pipeline.
- Are decremented by 1 in the decode phase of the repeated instruction.

The 8 MSBs of RPTC:

- Are loaded with the cond code at the address phase of the pipeline.
- Are untouched during the instruction execution.
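
The RPTC layout described above (condition code in the 8 MSBs, count in the 8 LSBs) can be sketched as simple bit packing. This is a minimal Python illustration; the helper names `pack_rptc` and `unpack_rptc` are hypothetical, and the condition code 41h is taken from this section's own example (RPTCC #7, AC1 > #0 initializes RPTC to 4107h), not from the condition encoding table.

```python
def pack_rptc(cond_code, k8):
    """Build the 16-bit RPTC value: cond code in bits 15-8, count in bits 7-0."""
    if not 0 <= cond_code <= 0xFF:
        raise ValueError("cond_code must fit in 8 bits")
    if not 0 <= k8 <= 0xFF:
        raise ValueError("k8 must fit in 8 bits")
    return (cond_code << 8) | k8


def unpack_rptc(rptc):
    """Split a 16-bit RPTC value into (cond_code, count)."""
    return (rptc >> 8) & 0xFF, rptc & 0xFF


rptc = pack_rptc(0x41, 7)
print(f"{rptc:04X}")        # value loaded at the address phase
print(f"{rptc - 1:04X}")    # after the first decrement of the 8 LSBs
```

Only the low byte changes while the instruction repeats; the high byte keeps the condition code.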

At each step of the iteration, the condition defined by the cond field is tested in the execute phase of the pipeline. When the condition becomes false, the instruction repetition stops.

- If the condition becomes false at any execution of the repeated instruction, the 8 LSBs of RPTC are corrected to indicate exactly how many iterations were not performed.
- Since the condition is evaluated in the execute phase of the repeated instruction, when the condition is tested false, some of the succeeding iterations of that repeated instruction may have gone through the address, access, and read phases of the pipeline. Therefore, they may have modified the pointer registers used in the DAGEN units to generate data memory operand addresses in the address phase.

When the instruction structure is exited, reading the computed single-repeat register (CSR) content enables you to determine how many instructions have gone through the address phase of the pipeline. You may then use the Repeat Single Instruction Unconditionally instruction [3] to rewind the pointer registers. Note that this must only be performed when a false condition has been met inside the instruction structure.

The following table provides the 8 LSBs of RPTC and CSR once the instruction structure is exited.

| If the condition is not met | RPTC[7:0] content <br> after exiting loop | CSR content <br> after exiting loop |
| :--- | :---: | :---: |
| At 1st iteration | RPTCinit + 1 | 4 |
| At 2nd iteration | RPTCinit | 4 |
| At 3rd iteration | RPTCinit - 1 | 4 |
| ... | ... | ... |
| At RPTCinit - 2 iteration | 4 | 3 |
| At RPTCinit - 1 iteration | 3 | 2 |
| At RPTCinit iteration | 2 | 1 |
| At RPTCinit + 1 iteration | 1 | 0 |
| Never | 0 | 0 |

RPTCinit is the number of requested iterations minus 1.

The repeat single mechanism triggered by this instruction is interruptible. Saving and restoring the RPTC content in ISRs enables you to preserve the instruction structure context.

Instead of programming a number of iterations (minus 1) equal to 0, it is recommended that you use the conditional execute() structure.

This instruction cannot be used as the last or the second to last instruction in a repeat loop structure.

See section 1.5 for a list of instructions that cannot be used in a repeat single mechanism.

In addition, any store-to-memory instruction including push instructions cannot be used in a conditional repeat single mechanism.

## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, the comparison of accumulators to 0 is performed as if M40 was set to 1.

| Status Bits | Affected by | ACOVx, CARRY, C54CM, M40, TCx |
| :--- | :--- | :--- |
|  | Affects | ACOVx |
| Repeat | This instruction cannot be repeated. |  |

See Also See the following other related instructions:

- RPT (Repeat Single Instruction Unconditionally)
- RPTADD (Repeat Single Instruction Unconditionally and Increment CSR)

## Example

| Syntax | Description |
| :--- | :--- |
| RPTCC \#7, AC1 > \#0 | As long as the content of AC1 is greater than 0 and the repeat counter is not equal to 0, the next single instruction is repeated as defined by the unsigned 8-bit value (7) + 1. At the address phase of the pipeline, RPTC is automatically initialized to 4107h and then is immediately decreased to 4106h. |


## RPTSUB <br> Repeat Single Instruction Unconditionally and Decrement CSR

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | RPTSUB CSR, k4 | Yes | 2 | 1 | X |

Opcode 0100 100E kkkk x011

## Operands k4

Description This instruction repeats the next instruction or the next two paralleled instructions the number of times specified by the content of the computed single repeat register (CSR) + 1. The repeat counter register (RPTC):

- Is loaded with CSR content in the address phase of the pipeline.
- Is decremented by 1 in the decode phase of the repeated instruction.
- Contains 0 at the end of the repeat single mechanism.
- Must not be accessed when it is being decremented in the repeat single mechanism.

With the A-unit ALU, this instruction allows the content of CSR to be decremented by k4. The CSR modification is performed in the execute phase of the pipeline; there is a 3-cycle latency between the CSR modification and its usage in the address phase.
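
As a rough illustration of the two effects described above (the repeat count taken from CSR, and CSR then decremented by k4), here is a small Python model. The function name `rptsub` is hypothetical, pipeline latency is not modeled, and the 16-bit wraparound on the subtraction is an assumption that this section does not state.

```python
def rptsub(csr, k4):
    """Model RPTSUB CSR, k4.

    Returns (executions, final_rptc, new_csr).
    """
    if not 0 <= k4 <= 0xF:
        raise ValueError("k4 is an unsigned 4-bit value")
    rptc = csr                 # RPTC is loaded with the CSR content
    executions = 0
    while True:
        executions += 1        # the repeated instruction executes
        if rptc == 0:
            break
        rptc -= 1              # one decrement per repeated execution
    new_csr = (csr - k4) & 0xFFFF   # assumption: 16-bit wraparound
    return executions, rptc, new_csr


# CSR = 5, k4 = 2: the instruction runs CSR + 1 = 6 times and CSR becomes 3.
print(rptsub(5, 2))
```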

The repeat single mechanism triggered by this instruction is interruptible.
Two paralleled instructions can be repeated when following the parallelism general rules.

This instruction cannot be used as the last instruction in a repeat loop structure.

See section 1.5 for a list of instructions that cannot be used in a repeat single mechanism.

| Status Bits | Affected by | none |
| :--- | :--- | :--- |
|  | Affects | none |
| Repeat | This instruction cannot be repeated. |  |

## See Also

See the following other related instructions:

- RPT (Repeat Single Instruction Unconditionally)
- RPTADD (Repeat Single Instruction Unconditionally and Increment CSR)
- RPTB (Repeat Block of Instructions Unconditionally)
- RPTCC (Repeat Single Instruction Conditionally)


## Example

| Syntax | Description |
| :--- | :--- |
| RPTSUB CSR, \#2 | A single instruction is repeated as defined by the content of CSR + 1. The content <br> of CSR is decremented by the unsigned 4-bit value (2). |

## SAT

## Saturate Accumulator Content

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | SAT[R] [ACx,] ACy | Yes | 2 | 1 | X |

Opcode 0101 010E DDSS 110%

Operands ACx, ACy

Description This instruction performs a saturation of the source accumulator ACx to the 32-bit width frame in the D-unit ALU.

- A rounding is performed if the optional R keyword is applied to the instruction. The rounding operation depends on RDM:
  - When RDM = 0, the biased rounding to the infinite is performed. 8000h ($2^{15}$) is added to the 40-bit source accumulator ACx.
  - When RDM = 1, the unbiased rounding to the nearest is performed. According to the value of the 17 LSBs of the 40-bit source accumulator ACx, 8000h ($2^{15}$) is added:

```
if (8000h < bit(15-0) < 10000h)
    add 8000h to the 40-bit source accumulator ACx
else if (bit(15-0) == 8000h)
    if (bit(16) == 1)
        add 8000h to the 40-bit source accumulator ACx
```

If a rounding has been performed, the 16 lowest bits of the result are cleared to 0.

- An overflow is detected at bit position 31.
- No addition carry report is stored in CARRY status bit.
- If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the destination register is saturated. Saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow).
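
The rounding and saturation rules above can be sketched as a behavioral model. This is a Python approximation only: the function names are hypothetical, the order of operations (round, detect overflow at bit 31, saturate, then clear the 16 LSBs) is an assumption chosen to match this section's SATR example, and the C54CM = 1 variant is not modeled.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit pattern as a signed integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def sat(acx, rounding=False, rdm=0):
    """Model SAT[R] on a 40-bit accumulator pattern.

    Returns (result_40bit_pattern, overflow_flag).
    """
    value = to_signed40(acx)
    if rounding:
        low16 = value & 0xFFFF
        if rdm == 0:
            value += 0x8000                       # biased rounding
        elif low16 > 0x8000 or (low16 == 0x8000 and (value >> 16) & 1):
            value += 0x8000                       # unbiased rounding
    overflow = not (-(1 << 31) <= value <= (1 << 31) - 1)
    if overflow:
        value = (1 << 31) - 1 if value > 0 else -(1 << 31)
    if rounding:
        value &= ~0xFFFF                          # clear the 16 lowest bits
    return value & MASK40, overflow


# SATR AC0, AC1 with AC0 = 00 7FFF 8000 and RDM = 0, as in the example of this section.
result, acov = sat(0x007FFF8000, rounding=True, rdm=0)
print(f"{result:010X}", acov)
```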


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, the rounding is performed without clearing the LSBs of accumulator ACx.

| Status Bits | Affected by | C54CM, RDM |
| :--- | :--- | :--- |
|  | Affects | ACOVy |

## Example 2

| Syntax | Description |
| :--- | :--- |
| SATR AC0, AC1 | The 32-bit width content of AC0 is saturated. The saturated value, 00 7FFF FFFFh, is rounded, 16 LSBs are cleared, and stored in AC1. |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC0 | 00 7FFF 8000 | AC0 | 00 7FFF 8000 |
| AC1 | 00 0000 0000 | AC1 | 00 7FFF 0000 |
| RDM | 0 | RDM | 0 |
| ACOV1 | 0 | ACOV1 | 1 |

## SFTCC

## Shift Accumulator Content Conditionally

## Syntax Characteristics


## Example 1

| Syntax | Description |
| :--- | :--- |
| SFTCC AC0, TC1 | Because AC0(31) XORed with AC0(30) equals 1, the content of AC0 is not shifted <br> left and TC1 is set to 1. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC0 | FF 8765 0055 | AC0 | FF 8765 0055 |
| TC1 | 0 | TC1 | 1 |

## Example 2

| Syntax | Description |
| :--- | :--- |
| SFTCC AC0, TC2 | Because AC0(31) XORed with AC0(30) equals 0, the content of AC0 is shifted left <br> by 1 bit and TC2 is cleared to 0. |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC0 | 00 1234 0000 | AC0 | 00 2468 0000 |
| TC2 | 0 | TC2 | 0 |
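
The two SFTCC examples can be reproduced with a short Python model. It is only a sketch of the behavior shown in the examples: the function name `sftcc` is hypothetical, and any additional cases from the full instruction description (which is not part of this excerpt) are not modeled.

```python
MASK40 = (1 << 40) - 1


def sftcc(acx):
    """Conditionally shift a 40-bit accumulator pattern left by 1 bit.

    Returns (new_acx, tc): tc is 1 when bits 31 and 30 differ (no shift),
    otherwise the value is shifted left by 1 and tc is 0.
    """
    bit31 = (acx >> 31) & 1
    bit30 = (acx >> 30) & 1
    if bit31 ^ bit30:
        return acx & MASK40, 1
    return (acx << 1) & MASK40, 0


for before in (0xFF87650055, 0x0012340000):
    after, tc = sftcc(before)
    print(f"{before:010X} -> {after:010X}, TC = {tc}")
```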

## SFTL

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | SFTL ACx, Tx[, ACy] | Yes | 2 | 1 | X |
| [2] | SFTL ACx, \#SHIFTW[, ACy] | Yes | 3 | 1 | X |

Description These instructions perform an unsigned shift by an immediate value, SHIFTW, or the content of a temporary register (Tx) in the D-unit shifter.

| Status Bits | Affected by | C54CM, M40 |
| :--- | :--- | :--- |
|  | Affects | CARRY |

## See Also

See the following other related instructions:
- SFTCC (Shift Accumulator Content Conditionally)
- SFTL (Shift Accumulator, Auxiliary, or Temporary Register Content Logically)
- SFTS (Signed Shift of Accumulator Content)
- SFTS (Signed Shift of Accumulator, Auxiliary, or Temporary Register Content)

Shift Accumulator Content Logically

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | SFTL ACx, Tx[, ACy] | Yes | 2 | 1 | X |

Opcode 0101 110E DDSS ss00

## Operands

## Description

| Status Bits | Affected by | C54CM, M40 |
| :--- | :--- | :--- |
|  | Affects | CARRY |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| SFTL AC0, T0, AC1 | The content of AC0 is logically shifted right by the content of T0 and the result is stored in AC1. There is a right shift because the content of T0 is negative (-6). Because M40 = 0, the guard bits (39-32) are cleared. |

| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC0 | 5F B000 1234 | AC0 | 5F B000 1234 |
| AC1 | 00 C680 ACF0 | AC1 | 00 02C0 0048 |
| T0 | FFFA | T0 | FFFA |
| M40 | 0 | M40 | 0 |
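
The arithmetic of this example can be checked with a small Python sketch. The detailed description of SFTL is not present in this excerpt, so the model below is an assumption built only from the example text: with M40 = 0 the guard bits are cleared before a logical shift, and a negative T0 means a right shift. The names `sftl` and `to_signed16` are hypothetical, shift-count limits are not modeled, and the M40 = 1 branch is a guess.

```python
def to_signed16(word):
    """Interpret a 16-bit register pattern as a signed integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def sftl(acx, shift, m40=0):
    """Logical shift of an accumulator pattern; negative shift means right shift."""
    width = 40 if m40 else 32             # assumption: M40 selects the operand width
    mask = (1 << width) - 1
    value = acx & mask                    # M40 = 0: guard bits 39-32 are cleared
    if shift >= 0:
        return (value << shift) & mask
    return value >> -shift


t0 = to_signed16(0xFFFA)                  # -6, as in the example
print(t0, f"{sftl(0x5FB0001234, t0):010X}")
```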

## SFTL <br> Shift Accumulator, Auxiliary, or Temporary Register Content Logically

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | SFTL dst, \#1 | Yes | 2 | 1 | X |
| [2] | SFTL dst, \#-1 | Yes | 2 | 1 | X |

Description These instructions perform an unsigned shift by 1 bit:

- In the D-unit shifter, if the destination operand is an accumulator (ACx).
- In the A-unit ALU, if the destination operand is an auxiliary or temporary register (TAx).

| Status Bits | Affected by | C54CM, M40 |
| :--- | :--- | :--- |
|  | Affects | CARRY |

See Also See the following other related instructions:

- SFTCC (Shift Accumulator Content Conditionally)
- SFTL (Shift Accumulator Content Logically)
- SFTS (Signed Shift of Accumulator Content)
- SFTS (Signed Shift of Accumulator, Auxiliary, or Temporary Register Content)

## SFTS

## Signed Shift of Accumulator Content

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | SFTS ACx, Tx[, ACy] | Yes | 2 | 1 | X |
| [2] | SFTSC ACx, Tx[, ACy] | Yes | 2 | 1 | X |
| [3] | SFTS ACx, \#SHIFTW[, ACy] | Yes | 3 | 1 | X |
| [4] | SFTSC ACx, \#SHIFTW[, ACy] | Yes | 3 | 1 | X |

Description These instructions perform a signed shift by an immediate value, SHIFTW, or by the content of a temporary register (Tx) in the D-unit shifter.

| Status Bits | Affected by | C54CM, M40, SATA, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy, CARRY |

See Also See the following other related instructions:

- SFTCC (Shift Accumulator Content Conditionally)
- SFTL (Shift Accumulator Content Logically)
- SFTL (Shift Accumulator, Auxiliary, or Temporary Register Content Logically)
- SFTS (Signed Shift of Accumulator, Auxiliary, or Temporary Register Content)


## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | SFTS ACx, Tx[, ACy] | Yes | 2 | 1 | X |
| Opcode |  |  |  |  |  |

## Operands

Description This instruction shifts the accumulator (ACx) content by the content of the temporary register (Tx). If the 16-bit value contained in Tx is out of the -32 to +31 range, the shift is saturated to -32 or +31 and the shift operation is performed with this value; a destination accumulator overflow is reported when such saturation occurs.

- The operation is performed on 40 bits in the D-unit shifter.
- When M40 = 0, the input to the shifter is modified according to SXMD and then the modified input is shifted by the Tx content:
  - if SXMD = 0, 0 is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
  - if SXMD = 1, bit 31 of the source operand is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
- The sign position of the source operand is compared to the shift quantity. This comparison depends on M40:
  - if M40 = 0, comparison is performed versus bit 31
  - if M40 = 1, comparison is performed versus bit 39
- 0 is inserted at bit position 0.
- The shifted-out bit is extracted according to M40.
- After shifting, unless otherwise noted, when M40 = 0:
  - overflow is detected at bit position 31 (if an overflow is detected, the destination ACOVy bit is set)
  - if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)
- After shifting, unless otherwise noted, when M40 = 1:
  - overflow is detected at bit position 39 (if an overflow is detected, the destination ACOVy bit is set)
  - if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 7F FFFF FFFFh (positive overflow) or 80 0000 0000h (negative overflow)
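
The guard-bit substitution that prepares the shifter input when M40 = 0 can be sketched in a few lines of Python. This covers only the input-preparation step from the list above, not the shift, overflow detection, or saturation; the function name `shifter_input` is hypothetical.

```python
MASK40 = (1 << 40) - 1


def shifter_input(acx, m40, sxmd):
    """Return the 40-bit pattern presented to the D-unit shifter."""
    acx &= MASK40
    if m40:
        return acx                        # M40 = 1: all 40 bits are used as is
    low32 = acx & 0xFFFFFFFF
    if sxmd and (low32 >> 31) & 1:
        return 0xFF00000000 | low32       # SXMD = 1: guard bits copy bit 31
    return low32                          # guard bits replaced by 0


for m40, sxmd in ((0, 0), (0, 1), (1, 0)):
    print(m40, sxmd, f"{shifter_input(0x5FB0001234, m40, sxmd):010X}")
```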


## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1:

- These instructions are executed as if M40 status bit was locally set to 1.
- There is no overflow detection, overflow report, and saturation performed by the D-unit shifter.
- The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 to -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
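
How the shift quantity is derived from Tx in the two modes can be sketched as follows. This is a Python illustration under stated assumptions: the names `to_signed` and `shift_quantity` are hypothetical, and the "modulo 16 operation" is read here as adding 16 to values in the -32 to -17 range, which is one interpretation consistent with the stated result range of -16 to -1.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def shift_quantity(tx, c54cm=0):
    """Return (shift, saturated) derived from the 16-bit Tx pattern."""
    if c54cm:
        shift = to_signed(tx, 6)          # only the 6 LSBs of Tx are used
        if -32 <= shift <= -17:
            shift += 16                   # assumed reading of "modulo 16"
        return shift, False
    shift = to_signed(tx, 16)
    clamped = max(-32, min(31, shift))    # saturate to the -32..+31 range
    return clamped, clamped != shift


print(shift_quantity(0xFFFA))             # in range, no saturation
print(shift_quantity(40))                 # saturated to +31
print(shift_quantity(0x20, c54cm=1))      # -32 folded into the -16..-1 range
```

In the C54CM = 0 case the second element of the result marks the situation in which, according to the description, a destination accumulator overflow is reported.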

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy |

Example

| Syntax | Description |
| :--- | :--- |
| SFTS AC1, T0, AC0 | The content of AC1 is shifted by the content of T0 and the result is stored in AC0. |

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[2]$ | SFTSC ACx, Tx[, ACy] | Yes | 2 | 1 | $X$ |

## Opcode

0101 110E DDSS ss10

## Operands

ACx, ACy, Tx

Description This instruction shifts the accumulator (ACx) content by the content of the temporary register (Tx) and stores the shifted-out bit in the CARRY status bit. If the 16-bit value contained in Tx is out of the -32 to +31 range, the shift is saturated to -32 or +31 and the shift operation is performed with this value; a destination accumulator overflow is reported when such saturation occurs.

The operation is performed on 40 bits in the D-unit shifter.

- When M40 $=0$, the input to the shifter is modified according to SXMD and then the modified input is shifted by the Tx content:

■ if $S X M D=0,0$ is substituted for the guard bits (39-32) as the input, instead of $\operatorname{ACx}(39-32)$, to the shifter

■ if $\operatorname{SXMD}=1$, bit 31 of the source operand is substituted for the guard bits (39-32) as the input, instead of $\operatorname{ACx}(39-32)$, to the shifter
$\square$ The sign position of the source operand is compared to the shift quantity. This comparison depends on M4O:
■ if $\mathrm{M} 40=0$, comparison is performed versus bit 31

- if $\mathrm{M} 40=1$, comparison is performed versus bit 39
- 0 is inserted at bit position 0.
- The shifted-out bit is extracted according to M40 and stored in the CARRY status bit. When the shift count is zero, $\mathrm{Tx}=0$, the CARRY status bit is cleared to 0 .
- After shifting, unless otherwise noted, when $\mathrm{M} 40=0$ :

■ overflow is detected at bit position 31 (if an overflow is detected, the destination ACOVy bit is set)
■ if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)

After shifting, unless otherwise noted, when $\mathrm{M} 40=1$ :

- overflow is detected at bit position 39 (if an overflow is detected, the destination ACOVy bit is set)
- if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 7F FFFF FFFFh (positive overflow) or 80 0000 0000h (negative overflow)


## Compatibility with C54x devices (C54CM = 1)

When $\mathrm{C} 54 \mathrm{CM}=1$ :
$\square$ These instructions are executed as if M40 status bit was locally set to 1 .
$\square$ There is no overflow detection, overflow report, and saturation performed by the D-unit shifter.
$\square$ The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31 . When the value is between -32 to -17 , a modulo 16 operation transforms the shift quantity to within -16 to -1 .

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |

Example

| Syntax | Description |
| :--- | :--- |
| SFTSC AC2, T1 | The content of AC2 is shifted left by the content of T1 and the saturated result is <br> stored in AC2. The shifted out bit is stored in the CARRY status bit. Since <br> SATD = 1 and M40 = 0, AC2 = FF 8000 0000 (saturation). |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC2 | 80 AA00 1234 | AC2 | FF 8000 0000 |
| T1 | 0005 | T1 | 0005 |
| CARRY | 0 | CARRY | 1 |
| M40 | 0 | M40 | 0 |
| ACOV2 | 0 | ACOV2 | 1 |
| SXMD | 1 | SXMD | 1 |
| SATD | 1 | SATD | 1 |
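
The example above can be reproduced with a rough Python model of the left-shift case. This is a sketch under stated assumptions, not a description of the hardware: the function name `sftsc_left` is hypothetical, only shift counts from 0 to 31 are handled, and overflow is found by an arithmetic range check that is assumed to be equivalent to the sign-position comparison described in the text.

```python
MASK40 = (1 << 40) - 1


def sftsc_left(acx, shift, m40=0, sxmd=1, satd=1):
    """Model a left SFTSC (0 <= shift <= 31). Returns (result, carry, overflow)."""
    assert 0 <= shift <= 31
    acx &= MASK40
    if m40 == 0:
        width = 32
        value = acx & 0xFFFFFFFF
        # Guard bits are replaced by bit 31 (SXMD = 1) or by 0 (SXMD = 0).
        if sxmd and value >> 31:
            value -= 1 << 32
    else:
        width = 40
        value = acx - (1 << 40) if acx >> 39 else acx
    # CARRY takes the last bit shifted out of the sign position; cleared for shift 0.
    carry = 0 if shift == 0 else (acx >> (width - shift)) & 1
    shifted = value << shift               # 0 is inserted at bit position 0
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    overflow = int(not lo <= shifted <= hi)
    if overflow and satd:
        shifted = hi if value > 0 else lo  # saturate toward the sign of the input
    return shifted & MASK40, carry, overflow


result, carry, acov = sftsc_left(0x80AA001234, 5, m40=0, sxmd=1, satd=1)
print(f"{result:010X} CARRY={carry} ACOV2={acov}")
```

With the register values from the table, the model yields the saturated value FF 8000 0000 with CARRY and ACOV2 both set, which matches the After column.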

Signed Shift of Accumulator Content

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[3]$ | SFTS ACx, \#SHIFTW[, ACy] | Yes | 3 | 1 | X |

## Opcode

## Operands

ACx, ACy, SHIFTW

Description

This instruction shifts the accumulator (ACx) content by a 6-bit value, SHIFTW.

- The operation is performed on 40 bits in the D-unit shifter.
- When M40 $=0$, the input to the shifter is modified according to SXMD and then the modified input is shifted by the 6 -bit value, SHIFTW:
■ if $S X M D=0,0$ is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
■ if $\operatorname{SXMD}=1$, bit 31 of the source operand is substituted for the guard bits (39-32) as the input, instead of $\operatorname{ACx}(39-32)$, to the shifter
$\square$ The sign position of the source operand is compared to the shift quantity. This comparison depends on M40:
■ if $\mathrm{M} 40=0$, comparison is performed versus bit 31
- if $\mathrm{M} 40=1$, comparison is performed versus bit 39
$\square 0$ is inserted at bit position 0 .
- The shifted-out bit is extracted according to M40.
- After shifting, unless otherwise noted, when $\mathrm{M} 40=0$ :
- overflow is detected at bit position 31 (if an overflow is detected, the destination ACOVy bit is set)
■ if SATD = 1 , when an overflow is detected, the destination accumulator saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)
- After shifting, unless otherwise noted, when $M 40=1$ :
- overflow is detected at bit position 39 (if an overflow is detected, the destination ACOVy bit is set)
■ if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 7F FFFF FFFFh (positive overflow) or 80 0000 0000h (negative overflow)


## Compatibility with C54x devices (C54CM = 1)

When $\mathrm{C} 54 \mathrm{CM}=1$, these instructions are executed as if M40 status bit was locally set to 1 . There is no overflow detection, overflow report, and saturation performed by the D-unit shifter.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy |

## Example 2

| Syntax | Description |
| :--- | :--- |
| SFTS AC1, \#-32 | The content of AC1 is shifted right by 32 bits and the result is stored in AC1. |

Signed Shift of Accumulator Content

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[4]$ | SFTSC ACx, \#SHIFTW[, ACy] | Yes | 3 | 1 | $X$ |

## Opcode

0001 000E $\mid$ DDSS 0110 $\mid$ xxSH IFTW

## Operands

ACx, ACy, SHIFTW

Description

This instruction shifts the accumulator (ACx) content by a 6-bit value, SHIFTW, and stores the shifted-out bit in the CARRY status bit.
$\square$ The operation is performed on 40 bits in the D-unit shifter.

- When M40 $=0$, the input to the shifter is modified according to SXMD and then the modified input is shifted by the 6 -bit value, SHIFTW:

■ if $S X M D=0,0$ is substituted for the guard bits (39-32) as the input, instead of $\operatorname{ACx}(39-32)$, to the shifter

■ if $\operatorname{SXMD}=1$, bit 31 of the source operand is substituted for the guard bits (39-32) as the input, instead of $\operatorname{ACx}(39-32)$, to the shifter
- The sign position of the source operand is compared to the shift quantity. This comparison depends on M40:

■ if $\mathrm{M} 40=0$, comparison is performed versus bit 31

- if $\mathrm{M} 40=1$, comparison is performed versus bit 39
$\square 0$ is inserted at bit position 0 .
$\square$ The shifted-out bit is extracted according to M40 and stored in the CARRY status bit. When the shift count is zero, $\mathrm{SHIFTW}=0$, the CARRY status bit is cleared to 0 .
- After shifting, unless otherwise noted, when $\mathrm{M} 40=0$ :

■ overflow is detected at bit position 31 (if an overflow is detected, the destination ACOVy bit is set)

■ if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)


## SFTS

Signed Shift of Accumulator, Auxiliary, or Temporary Register Content

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | SFTS dst, \#-1 | Yes | 2 | 1 | $X$ |
| $[2]$ | SFTS dst, \#1 | Yes | 2 | 1 | $X$ |


Description

These instructions perform a shift of 1 bit:

- In the D-unit shifter, if the destination operand is an accumulator (ACx).
- In the A-unit ALU, if the destination operand is an auxiliary or temporary register (TAx).

| Status Bits | Affected by | C54CM, M40, SATA, SATD, SXMD |
| :--- | :--- | :--- |

See Also

See the following other related instructions:

- SFTCC (Shift Accumulator Content Conditionally)

## Syntax Characteristics



If the destination operand (dst) is an auxiliary or temporary register:

- The operation is performed on 16 bits in the A-unit ALU.
$\square$ Bit 15 is sign extended.


## Compatibility with C54x devices (C54CM = 1)

When $\mathrm{C} 54 \mathrm{CM}=1$, these instructions are executed as if M40 status bit was locally set to 1 . There is no overflow detection, overflow report, and saturation performed by the D-unit shifter.

| Status Bits | Affected by | C54CM, M40, SXMD |
| :--- | :--- | :--- |
|  | Affects | none |

## Example

| Syntax | Description |
| :--- | :--- |
| SFTS AC0, \#-1 | The content of AC0 is shifted right by 1 bit and the result is stored in AC0. |

Signed Shift of Accumulator, Auxiliary, or Temporary Register Content

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[2]$ | SFTS dst, \#1 | Yes | 2 | 1 | X |

## Opcode

0100 010E $\mid$ 01x1 FDDD

## Operands

dst

Description

This instruction shifts the content of the destination register (dst) left by 1 bit.

If the destination operand (dst) is an accumulator:
$\square$ The operation is performed on 40 bits in the D-unit shifter.
$\square$ When M40 $=0$, the input to the shifter is modified according to SXMD and then the modified input is shifted left by 1 bit:
■ if $\operatorname{SXMD}=0,0$ is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
■ if $\operatorname{SXMD}=1$, bit 31 of the source operand is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
- The sign position of the source operand is compared to the shift quantity. This comparison depends on M40:
■ if $\mathrm{M} 40=0$, comparison is performed versus bit 31

- if $\mathrm{M} 40=1$, comparison is performed versus bit 39
- 0 is inserted at bit position 0 .
- The shifted-out bit is extracted according to M40.
- After shifting, unless otherwise noted, when $\mathrm{M} 40=0$ :
- overflow is detected at bit position 31 (if an overflow is detected, the destination ACOVx bit is set)
■ if SATD = 1 , when an overflow is detected, the destination accumulator saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)
- After shifting, unless otherwise noted, when $M 40=1$ :
- overflow is detected at bit position 39 (if an overflow is detected, the destination ACOVx bit is set)

■ if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 7F FFFF FFFFh (positive overflow) or 80 0000 0000h (negative overflow)

If the destination operand (dst) is an auxiliary or temporary register:

- The operation is performed on 16 bits in the A-unit ALU.
- 0 is inserted at bit position 0 .
$\square$ After shifting, unless otherwise noted:
■ overflow is detected at bit position 15 (if an overflow is detected, the destination ACOVx bit is set)
- if SATA $=1$, when an overflow is detected, the destination register saturation values are 7FFFh (positive overflow) or 8000h (negative overflow)


## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, these instructions are executed as if the M40 status bit was locally set to 1. No overflow detection, overflow report, or saturation is performed by the D-unit shifter.

| Status Bits | Affected by | C54CM, M40, SATA, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx |
| Repeat | This instruction can be repeated. |  |

Example

| Syntax | Description |
| :--- | :--- |
| SFTS T2, \#1 | The content of T2 is shifted left by 1 bit and the result is stored in T2. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| T2 | EF27 | T2 | DE4E |
| SATA | 1 | SATA | 1 |
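
The 16-bit A-unit case can be checked with a short Python model. This is a sketch only: the function name `sfts_left1_tax` is hypothetical, and overflow at bit position 15 is modeled as a signed 16-bit range check.

```python
def sfts_left1_tax(value, sata=1):
    """Model SFTS dst, #1 for a 16-bit auxiliary or temporary register.

    Returns (result, overflow).
    """
    value &= 0xFFFF
    signed = value - 0x10000 if value & 0x8000 else value
    shifted = signed * 2                  # left shift by 1; 0 enters at bit 0
    overflow = int(not -0x8000 <= shifted <= 0x7FFF)
    if overflow and sata:
        # Saturation values for the A-unit: 7FFFh (positive) or 8000h (negative)
        shifted = 0x7FFF if signed > 0 else -0x8000
    return shifted & 0xFFFF, overflow


result, overflow = sfts_left1_tax(0xEF27, sata=1)
print(f"T2 = {result:04X}, overflow = {overflow}")
```

For T2 = EF27 the shifted value stays inside the signed 16-bit range, so no saturation occurs and the model returns DE4E, as in the table.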

## SQA

## Square and Accumulate

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | SQA[R] [ACx,] ACy | Yes | 2 | 1 | $X$ |
| $[2]$ | SQAM[R] [T3 = ]Smem, [ACx,] ACy | No | 3 | 1 | $X$ |


Description

This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are:

- ACx(32-16)
- the content of a memory (Smem) location, sign extended to 17 bits

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |

Square and Accumulate

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | SQA[R] [ACx,] ACy | Yes | 2 | 1 | $X$ |

## Opcode

0101 010E $\mid$ DDSS 001\%

## Operands

ACx, ACy

Description

This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are ACx(32-16):

$ACy = ACy + (ACx * ACx)$

- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.

- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
$\square$ When an addition overflow is detected, the accumulator is saturated according to SATD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVy |
| Repeat | This instruction can be repeated. |  |

Example

| Syntax | Description |
| :--- | :--- |
| SQA AC1, AC0 | The content of AC1 squared is added to the content of AC0 and the result is stored in AC0. |

## Square and Accumulate

## Syntax Characteristics



## SQDST

Square Distance
Syntax Characteristics


The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are ACx(32-16).
$\square$ If $\operatorname{FRCT}=1$, the output of the multiplier is shifted left by 1 bit.

- Multiplication overflow detection depends on SMUL.
$\square$ The 32 -bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
$\square$ When an addition overflow is detected, the accumulator is saturated according to SATD.

The second operation subtracts the content of data memory operand Ymem, shifted left 16 bits, from the content of data memory operand Xmem, shifted left 16 bits.

- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When $\mathrm{C} 54 \mathrm{CM}=1$, during the subtraction an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, FRCT, M40, SATD, SMUL, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy, CARRY |
| Repeat |  |  |

## Example

| Syntax | Description |
| :--- | :--- |
| SQDST *AR0, *AR1, AC0, AC1 | The content of AC0 squared is added to the content of AC1 and the <br> result is stored in AC1. The content addressed by AR1 shifted left by <br> 16 bits is subtracted from the content addressed by AR0 shifted left by <br> 16 bits and the result is stored in AC0. |


| Before |  | After |  |
| :--- | ---: | :--- | ---: |
| AC0 | FF ABCD 0000 | AC0 | FF FFAB 0000 |
| AC1 | 00 0000 0000 | AC1 | 00 1BB1 8229 |
| *AR0 | 0055 | *AR0 | 0055 |
| *AR1 | 00AA | *AR1 | 00AA |
| ACOV0 | 0 | ACOV0 | 0 |
| ACOV1 | 0 | ACOV1 | 0 |
| CARRY | 0 | CARRY | 0 |
| FRCT | 0 | FRCT | 0 |
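
The two parallel operations of SQDST can be traced with a simplified Python model that reproduces the table above. It is a sketch under assumptions: the names `sqdst` and `sign_extend` are hypothetical, overflow detection, saturation, and SMUL handling are omitted, and CARRY is modeled as the complement of the unsigned borrow out of bit 31 (M40 = 0) or bit 39 (M40 = 1).

```python
MASK40 = (1 << 40) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sqdst(acx, acy, xmem, ymem, frct=0, sxmd=1, m40=0):
    """Model SQDST Xmem, Ymem, ACx, ACy. Returns (new_acx, new_acy, carry)."""
    # First operation: ACy = ACy + ACx(32-16) * ACx(32-16)
    x17 = sign_extend(acx >> 16, 17)
    product = x17 * x17
    if frct:
        product <<= 1                      # fractional mode shifts the product left by 1
    new_acy = (sign_extend(acy, 40) + product) & MASK40

    # Second operation: ACx = (Xmem << 16) - (Ymem << 16)
    if sxmd:
        xv, yv = sign_extend(xmem, 16) << 16, sign_extend(ymem, 16) << 16
    else:
        xv, yv = (xmem & 0xFFFF) << 16, (ymem & 0xFFFF) << 16
    new_acx = (xv - yv) & MASK40

    # CARRY is the logical complement of the borrow (assumed unsigned compare).
    mask = (1 << (40 if m40 else 32)) - 1
    carry = int((xv & mask) >= (yv & mask))
    return new_acx, new_acy, carry


ac0, ac1, carry = sqdst(0xFFABCD0000, 0x0000000000, 0x0055, 0x00AA)
print(f"AC0={ac0:010X} AC1={ac1:010X} CARRY={carry}")
```

With the Before values, AC0(32-16) is the negative number ABCDh, its square is 1BB1 8229h, and the subtraction borrows, so CARRY stays 0.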

## SQR

Square
Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | SQR[R] [ACx,] ACy | Yes | 2 | 1 | $X$ |
| $[2]$ | SQRM[R] [T3 = ]Smem, ACx | No | 3 | 1 | $X$ |

Description

Status Bits

See Also
See the following other related instructions:

- MPY (Multiply)
- SQA (Square and Accumulate)
- SQDST (Square Distance)
- SQS (Square and Subtract)

Square

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | SQR[R] [ACx,] ACy | Yes | 2 | 1 | $X$ |

## Opcode

## Operands

ACx, ACy

Description

This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are ACx(32-16):

$ACy = ACx * ACx$

- If $\operatorname{FRCT}=1$, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
$\square$ Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM =1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVy |
| Repeat | This instruction can be repeated. |  |

Example

| Syntax | Description |
| :--- | :--- |
| SQR AC1, AC0 | The content of AC1 is squared and the result is stored in AC0. |

## Square

Syntax Characteristics


## Operands

ACx, Smem

Description

This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of a memory (Smem) location, sign extended to 17 bits:

$ACx = Smem * Smem$

- If FRCT $=1$, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
$\square$ The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx |
| Repeat | This instruction can be repeated. |  |

## Example

| Syntax | Description |
| :--- | :--- |
| SQRM *AR3, AC0 | The content addressed by AR3 is squared and the result is stored in AC0. |

## SQS

## Square and Subtract

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | SQS[R] [ACx,] ACy | Yes | 2 | 1 | $X$ |
| $[2]$ | SQSM[R] [T3 = ]Smem, [ACx,] ACy | No | 3 | 1 | $X$ |


Description

This instruction performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are:

- ACx(32-16)
- the content of a memory (Smem) location, sign extended to 17 bits

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy |

See Also

See the following other related instructions:

Square and Subtract

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | SQS[R] [ACx,] ACy | Yes | 2 | 1 | $X$ |

## Opcode

0101 010E $\mid$ DDSS 010\%

## Operands

ACx, ACy

Description

This instruction performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are ACx(32-16):

$ACy = ACy - (ACx * ACx)$

- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.

- Multiplication overflow detection depends on SMUL.
$\square$ The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- Rounding is performed according to RDM, if the optional R keyword is applied to the instruction.
$\square$ Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

Compatibility with C54x devices (C54CM = 1)
When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | FRCT, M40, RDM, SATD, SMUL |
| :--- | :--- | :--- |
|  | Affects | ACOVy |

## Square and Subtract

Syntax Characteristics


## SUB

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | SUB dual(Lmem), [ACx,] ACy | No | 3 | 1 | $X$ |
| $[2]$ | SUB ACx, dual(Lmem), ACy | No | 3 | 1 | $X$ |
| $[3]$ | SUB dual(Lmem), Tx, ACx | No | 3 | 1 | $X$ |
| $[4]$ | SUB Tx, dual(Lmem), ACx | No | 3 | 1 | $X$ |

Description

These instructions perform two paralleled subtraction operations in one cycle. The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

| Status Bits | Affected by | C54CM, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy, CARRY |

See Also

See the following other related instructions:

- ADDSUB (Dual 16-Bit Addition and Subtraction)
- ADDSUBCC (Addition or Subtraction Conditionally)
- ADDSUBCC (Addition, Subtraction, or Move Accumulator Content Conditionally)
- ADDSUB2CC (Addition or Subtraction Conditionally with Shift)
- SUB (Subtraction)
- SUB::MOV (Subtraction with Parallel Store Accumulator Content to Memory)
- SUBADD (Dual 16-Bit Subtraction and Addition)
- SUBC (Subtract Conditionally)


## Syntax Characteristics



The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16 -bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit data path).

- The data memory operand dbl(Lmem) is divided into two 16-bit parts:

■ the lower part is used as one of the 16 -bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part

- The data memory operand dbl(Lmem) addresses are aligned:

■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1

■ if Lmem address is odd: most significant word = Lmem, least significant word $=$ Lmem -1

- For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVy) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
- For the operations performed in the ALU high part, overflow is detected at bit position 31.
- For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
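
The address alignment rule for dbl(Lmem) can be expressed as a small Python helper. This is an illustrative sketch: the names `dbl_lmem_addresses` and `read_dbl`, and the use of a dictionary as word-addressed memory, are assumptions made for the example.

```python
def dbl_lmem_addresses(lmem):
    """Return (most significant word address, least significant word address)."""
    if lmem % 2 == 0:
        return lmem, lmem + 1      # even address: LSW follows at Lmem + 1
    return lmem, lmem - 1          # odd address: LSW precedes at Lmem - 1


def read_dbl(memory, lmem):
    """Assemble the 32-bit operand from two 16-bit words of a dict-based memory."""
    msw_addr, lsw_addr = dbl_lmem_addresses(lmem)
    return ((memory[msw_addr] & 0xFFFF) << 16) | (memory[lsw_addr] & 0xFFFF)


memory = {0x0100: 0x1234, 0x0101: 0x5678}   # hypothetical memory contents
print(f"{read_dbl(memory, 0x0100):08X}")
print(f"{read_dbl(memory, 0x0101):08X}")
```

In both cases the word at Lmem itself supplies the high part, so reading the same pair of words through an odd address swaps the two halves.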



## Syntax Characteristics



The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16 -bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).
$\square$ The data memory operand dbl(Lmem) is divided into two 16-bit parts:
■ the lower part is used as one of the 16 -bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
$\square$ The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1

■ if Lmem address is odd: most significant word = Lmem, least significant word $=$ Lmem -1

- For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVy) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
- For the operations performed in the ALU high part, overflow is detected at bit position 31.
- For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.

| Status Bits | Affected by | C16, C54CM, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |

Example

| Syntax | Description |
| :--- | :--- |
| SUB AC1, dual(*AR3), AC0 | Both instructions are performed in parallel. When the Lmem address is even (AR3 = even): The content of AC1(39-16) is subtracted from the content addressed by AR3 and the result is stored in AC0(39-16). The content of AC1(15-0) is subtracted from the content addressed by AR3 + 1 and the result is stored in AC0(15-0). |

## Dual 16-Bit Subtractions

Syntax Characteristics


The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

- The temporary register Tx:

■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
$\square$ The data memory operand dbl(Lmem) is divided into two 16-bit parts:
■ the lower part is used as one of the 16 -bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part

- The data memory operand dbl(Lmem) addresses are aligned:

■ if Lmem address is even: most significant word = Lmem, least significant word $=$ Lmem +1

■ if Lmem address is odd: most significant word = Lmem, least significant word $=$ Lmem -1
$\square$ For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.

- For the operations performed in the ALU low part, overflow is detected at bit position 15 .

■ For the operations performed in the ALU high part, overflow is detected at bit position 31.
- For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.

## Dual 16-Bit Subtractions

Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | SUB Tx, dual(Lmem), ACx | No | 3 | 1 | X |

## Opcode

1110 1110 $\mid$ AAAA AAAI $\mid$ ssDD 101x

## Operands

ACx, Tx, Lmem

Description

This instruction performs two paralleled subtraction operations in one cycle:

HI(ACx) = HI(Lmem) - Tx
:: LO(ACx) = LO(Lmem) - Tx

The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

- The temporary register Tx:

■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
$\square$ The data memory operand dbl(Lmem) is divided into two 16-bit parts:
■ the lower part is used as one of the 16 -bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part

- The data memory operand dbl(Lmem) addresses are aligned:

■ if Lmem address is even: most significant word = Lmem, least significant word $=$ Lmem +1

■ if Lmem address is odd: most significant word = Lmem, least significant word $=$ Lmem -1
$\square$ For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.

- For the operations performed in the ALU low part, overflow is detected at bit position 15 .

■ For the operations performed in the ALU high part, overflow is detected at bit position 31.

| Status Bits | Affected by | C16, C54CM, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, CARRY |

## Example

| Syntax | Description |
| :--- | :--- |
| SUB T0, dual(*AR3), AC0 | Both instructions are performed in parallel. When the Lmem address is <br> even (AR3 = even): The content of T0 is subtracted from the content <br> addressed by AR3 and the result is stored in AC0(39-16). The duplicated <br> content of T0 is subtracted from the content addressed by AR3 + 1 and the <br> result is stored in AC0(15-0). |
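
The dual 16-bit data flow of this syntax can be sketched in Python. The model below is an illustration with deliberate limits: the names `sub_tx_dual` and `sign_extend` are hypothetical, SXMD = 1 is assumed, saturation and the CARRY status bit are not modeled, and overflow at bit positions 31 and 15 is treated as a signed 16-bit range check on each half.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sub_tx_dual(tx, lmem_hi, lmem_lo):
    """Model HI(ACx) = HI(Lmem) - Tx :: LO(ACx) = LO(Lmem) - Tx.

    Returns (acx, overflow) where acx is the 40-bit accumulator value.
    """
    t = sign_extend(tx, 16)                 # Tx is duplicated for both halves
    hi = sign_extend(lmem_hi, 16) - t       # high part, kept on 24 bits
    lo = sign_extend(lmem_lo, 16) - t       # low part, kept on 16 bits
    overflow = int(not (-0x8000 <= hi <= 0x7FFF and -0x8000 <= lo <= 0x7FFF))
    acx = ((hi & 0xFFFFFF) << 16) | (lo & 0xFFFF)
    return acx, overflow


acx, acov = sub_tx_dual(tx=0x0001, lmem_hi=0x0005, lmem_lo=0x0003)
print(f"AC0 = {acx:010X}, ACOV0 = {acov}")
```

The high result occupies AC0(39-16) including the guard bits, while the low result is confined to AC0(15-0), so a borrow in the low half never propagates into the high half.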

## SUB

## Subtraction

Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | SUB [src,] dst | Yes | 2 | 1 | X |
| [2] | SUB k4, dst | Yes | 2 | 1 | X |
| [3] | SUB K16, [src,] dst | No | 4 | 1 | X |
| [4] | SUB Smem, [src,] dst | No | 3 | 1 | X |
| [5] | SUB src, Smem, dst | No | 3 | 1 | X |
| [6] | SUB ACx << Tx, ACy | Yes | 2 | 1 | X |
| [7] | SUB ACx << \#SHIFTW, ACy | Yes | 3 | 1 | X |
| [8] | SUB K16 << \#16, [ACx, ] ACy | No | 4 | 1 | X |
| [9] | SUB K16 << \#SHFT, [ACx,] ACy | No | 4 | 1 | X |
| [10] | SUB Smem << Tx, [ACx,] ACy | No | 3 | 1 | X |
| [11] | SUB Smem << \#16, [ACx,] ACy | No | 3 | 1 | X |
| [12] | SUB ACx, Smem << \#16, ACy | No | 3 | 1 | X |
| [13] | SUB [uns(]Smem[)], BORROW, [ACx,] ACy | No | 3 | 1 | X |
| [14] | SUB [uns(]Smem[)], [ACx,] ACy | No | 3 | 1 | X |
| [15] | SUB [uns(]Smem[)] << \#SHIFTW, [ACx,] ACy | No | 4 | 1 | X |
| [16] | SUB dbl(Lmem), [ACx,] ACy | No | 3 | 1 | X |
| [17] | SUB ACx, dbl(Lmem), ACy | No | 3 | 1 | X |
| [18] | SUB Xmem, Ymem, ACx | No | 3 | 1 | X |

Description

These instructions perform a subtraction operation.

| Status Bits | Affected by | CARRY, C54CM, M40, SATA, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy, CARRY |

See Also

See the following other related instructions:

- ADD (Addition)
- ADDSUB (Dual 16-Bit Addition and Subtraction)
- ADDSUBCC (Addition or Subtraction Conditionally)
- ADDSUBCC (Addition, Subtraction, or Move Accumulator Content Conditionally)
- ADDSUB2CC (Addition or Subtraction Conditionally with Shift)
- SUB (Dual 16-Bit Subtractions)
- SUB::MOV (Subtraction with Parallel Store Accumulator Content to Memory)
- SUBADD (Dual 16-Bit Subtraction and Addition)
- SUBC (Subtract Conditionally)


## Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | SUB [src,] dst | Yes | 2 | 1 | X |

## Opcode

0010 011E $\mid$ FSSS FDDD

## Operands

dst, src

Description

This instruction performs a subtraction operation between two registers:

$dst = dst - src$

$\square$ When the destination operand (dst) is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ Input operands are sign extended to 40 bits according to SXMD.
■ If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
■ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
$\square$ When the destination operand (dst) is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
■ Overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.
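
The auxiliary or temporary register case can be modeled in a few lines. The following Python sketch is illustrative only: the names `to_signed16` and `sub16` are hypothetical, and the saturation values 7FFFh and 8000h are an assumption (they are the usual 16-bit limits, but this section does not state them for the A-unit ALU).

```python
def to_signed16(value):
    """Interpret the 16 LSBs of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sub16(dst, src, sata):
    """Model dst = dst - src on 16 bits, as in the A-unit ALU case.

    Returns (result, overflow). Overflow is detected at bit position 15,
    that is, when the exact difference does not fit in 16 signed bits.
    """
    exact = to_signed16(dst) - to_signed16(src)
    overflow = not (-0x8000 <= exact <= 0x7FFF)
    if overflow and sata:
        # Assumed saturation values: 7FFFh (positive) and 8000h (negative).
        exact = 0x7FFF if exact > 0 else -0x8000
    return exact & 0xFFFF, overflow
```

Under these assumptions, `sub16(0x8000, 0x0001, True)` stays at 8000h, while the same call with `sata=False` wraps around to 7FFFh.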

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

Status Bits Affected by M40, SATA, SATD, SXMD
Affects ACOVx, CARRY
Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| SUB AC1, AC0 | The content of AC1 is subtracted from the content of AC0 and the result is stored in AC0. |

## Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :--- | :--- | :---: | :---: | :---: | :---: |
| [2] | SUB k4, dst | Yes | 2 | 1 | X |

Opcode | 0100 011E | kkkk FDDD |

Operands dst, k4

Description This instruction subtracts a 4-bit unsigned constant, k4, from a register:

dst = dst - k4

$\square$ When the destination operand (dst) is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
$\square$ When the destination operand (dst) is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ Overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.
Status Bits Affected by M40, SATA, SATD
Affects ACOVx, CARRY
Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| SUB \#15, AC0 | An unsigned 4-bit value (15) is subtracted from the content of AC0 and the result <br> is stored in AC0. |

## Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[3]$ | SUB K16, [src,] dst | No | 4 | 1 | X |

Opcode | 0111 1100 | KKKK KKKK | KKKK KKKK | FDDD FSSS |

Operands dst, K16, src

Description This instruction subtracts a 16-bit signed constant, K16, from a register:

dst = src - K16

$\square$ When the destination operand (dst) is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
■ The 16-bit constant, K16, is sign extended to 40 bits according to SXMD.
■ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
$\square$ When the destination operand (dst) is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
■ Overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

Status Bits Affected by M40, SATA, SATD, SXMD
Affects ACOVx, CARRY
Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| SUB \#FFFFh, AC1, AC0 | A signed 16-bit value (FFFFh) is subtracted from the content of AC1 and the <br> result is stored in AC0. |
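
The effect of SXMD on the constant in this example can be illustrated with a small model. This is a sketch under stated assumptions, not the device's defined behavior: the helper names `extend16` and `sub_k16` are hypothetical, SXMD = 0 is assumed to mean zero extension, and overflow detection and saturation are left out.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def extend16(value, sxmd):
    """Extend a 16-bit value: sign extension if SXMD = 1,
    zero extension otherwise (assumption for SXMD = 0)."""
    value &= 0xFFFF
    if sxmd and value & 0x8000:
        return value - 0x10000
    return value


def sub_k16(src, k16, sxmd):
    """Model dst = src - K16 as a 40-bit pattern, without overflow handling."""
    return (src - extend16(k16, sxmd)) & MASK40
```

With SXMD = 1, FFFFh is treated as -1, so `sub_k16(0x10, 0xFFFF, 1)` gives 11h; with SXMD = 0 the same constant is treated as 65535.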

## Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :--- | :--- | :---: | :---: | :---: | :---: |
| [4] | SUB Smem, [src,] dst | No | 3 | 1 | X |

Opcode | 1101 0111 | AAAA AAAI | FDDD FSSS |

Operands dst, Smem, src

Description This instruction subtracts the content of a memory (Smem) location from a register:

dst = src - Smem

$\square$ When the destination operand (dst) is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
■ The content of the memory location is sign extended to 40 bits according to SXMD.
■ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
$\square$ When the destination operand (dst) is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
■ Overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

Status Bits Affected by M40, SATA, SATD, SXMD
Affects ACOVx, CARRY
Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| SUB *AR3, AC1, AC0 | The content addressed by AR3 is subtracted from the content of AC1 and the result is stored in AC0. |

## Subtraction

## Syntax Characteristics



```
dst = Smem - src
```

$\square$ When the destination operand (dst) is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
■ The content of the memory location is sign extended to 40 bits according to SXMD.
■ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
$\square$ When the destination operand (dst) is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
■ Overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

Status Bits Affected by M40, SATA, SATD, SXMD
Affects ACOVx, CARRY
Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| SUB AC1, *AR3, AC0 | The content of AC1 is subtracted from the content addressed by AR3 and the result is stored in AC0. |

## Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [6] | SUB ACx << Tx, ACy | Yes | 2 | 1 | X |

Opcode | 0101 101E | DDSS ss01 |

Operands ACx, ACy, Tx

Description This instruction subtracts an accumulator content ACx shifted by the content of Tx from an accumulator content ACy:

ACy = ACy - (ACx << Tx)

$\square$ The operation is performed on 40 bits in the D-unit shifter.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1:

$\square$ An intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.
$\square$ The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
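
The C54CM = 1 shift-quantity rule can be written out as a short model. This Python sketch is one reading of the text, assuming the 6 LSBs are interpreted as a two's-complement value and that the "modulo 16 operation" adds 16 to values in the range -32 to -17; the function name `c54cm_shift_quantity` is hypothetical.

```python
def c54cm_shift_quantity(tx):
    """Derive the shift quantity from the 6 LSBs of a 16-bit Tx register
    when C54CM = 1 (illustrative model only)."""
    quantity = tx & 0x3F          # keep the 6 LSBs
    if quantity >= 32:            # two's-complement: 32..63 means -32..-1
        quantity -= 64
    if -32 <= quantity <= -17:    # assumed meaning of the modulo 16 step
        quantity += 16            # maps -32..-17 onto -16..-1
    return quantity
```

Under this reading, the effective shift quantity always lies within -16 to +31.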

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

## Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [7] | SUB ACx << \#SHIFTW, ACy | Yes | 3 | 1 | X |

Opcode | 0001 000E | DDSS 0100 | xxSH IFTW |

Operands ACx, ACy, SHIFTW

Description This instruction subtracts an accumulator content ACx shifted by the 6-bit value, SHIFTW, from an accumulator content ACy:

ACy = ACy - (ACx << \#SHIFTW)

$\square$ The operation is performed on 40 bits in the D-unit shifter.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| SUB AC1 << \#31, AC0 | The content of AC1 shifted left by 31 bits is subtracted from the content of AC0 <br> and the result is stored in AC0. |

## Subtraction

## Syntax Characteristics



## Subtraction

## Syntax Characteristics


$\square$ The operation is performed on 40 bits in the D-unit shifter.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

Status Bits Affected by M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| SUB \#9800h << \#5, AC0, AC1 | A signed 16-bit value (9800h) shifted left by 5 bits is subtracted from the <br> content of AC0 and the result is stored in AC1. |

## Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [10] | SUB Smem << Tx, [ACx,] ACy | No | 3 | 1 | X |

Opcode | 1101 1101 | AAAA AAAI | SSDD ss01 |

Operands ACx, ACy, Smem, Tx

Description This instruction subtracts the content of a memory (Smem) location shifted by the content of Tx from an accumulator content ACx:

ACy = ACx - (Smem << Tx)

$\square$ The operation is performed on 40 bits in the D-unit shifter.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1:

$\square$ An intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.
$\square$ The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

Example

| Syntax | Description |
| :--- | :--- |
| SUB *AR3 << T0, AC1, AC0 | The content addressed by AR3 shifted by the content of T0 is subtracted <br> from the content of AC1 and the result is stored in AC0. |

## Subtraction

## Syntax Characteristics


$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and CARRY status bit depends on M40. If the result of the subtraction generates a borrow, the CARRY status bit is cleared; otherwise, the CARRY status bit is not affected.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| SUB *AR3 << \#16, AC1, AC0 | The content addressed by AR3 shifted left by 16 bits is subtracted from the <br> content of AC1 and the result is stored in AC0. |

## Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [12] | SUB ACx, Smem << \#16, ACy | No | 3 | 1 | X |

Opcode | 1101 1110 | AAAA AAAI | SSDD 0110 |

Operands ACx, ACy, Smem

Description This instruction subtracts an accumulator content ACx from the content of a memory (Smem) location shifted left by 16 bits:

ACy = (Smem << \#16) - ACx

$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| SUB AC1, *AR3 << \#16, AC0 | The content of AC1 is subtracted from the content addressed by AR3 <br> shifted left by 16 bits and the result is stored in AC0. |

## Subtraction

## Syntax Characteristics


$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are extended to 40 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
$\square$ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

| Status Bits | Affected by | CARRY, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |

## Example

| Syntax | Description |
| :--- | :--- |
| SUB uns(*AR1), BORROW, AC0, AC1 | The complement of the CARRY bit (1) and the unsigned content <br> addressed by AR1 (F000h) are subtracted from the content of AC0 <br> and the result is stored in AC1. |


| Before |  |  |  | After |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| AC0 | 00 | EC00 | 0000 | AC0 | 00 | EC00 | 0000 |
| AC1 | 00 | 0000 | 0000 | AC1 | 00 | EBFF | 0FFF |
| AR1 |  |  | 0302 | AR1 |  |  | 0302 |
| 302 |  |  | F000 | 302 |  |  | F000 |
| CARRY |  |  | 0 | CARRY |  |  | 1 |
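
The arithmetic in this example can be checked with a short model. The sketch below is illustrative only: `sub_with_borrow` is a hypothetical name, the memory operand is always zero extended (the uns case), overflow and saturation are ignored, and the new CARRY is derived from a borrow out of the full 40-bit subtraction, which is an assumption because the manual says the exact behavior depends on M40.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def sub_with_borrow(acx, smem, carry):
    """Model ACy = ACx - uns(Smem) - BORROW, where BORROW is the
    logical complement of the incoming CARRY bit.

    Returns (acy, new_carry). new_carry is 0 when the subtraction
    generates a borrow and 1 otherwise.
    """
    borrow = 1 - carry
    exact = (acx & MASK40) - (smem & 0xFFFF) - borrow
    new_carry = 0 if exact < 0 else 1
    return exact & MASK40, new_carry
```

Starting from AC0 = 00 EC00 0000h, a memory value of F000h, and CARRY = 0, the model yields 00 EBFF 0FFFh with CARRY = 1, which agrees with the Before and After values shown.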

## Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [14] | SUB [uns(]Smem[)], [ACx,] ACy | No | 3 | 1 | X |

Opcode | 1101 1111 | AAAA AAAI | SSDD 111u |

Operands ACx, ACy, Smem

Description This instruction subtracts the content of a memory (Smem) location from an accumulator content ACx:

ACy = ACx - Smem

$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are extended to 40 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
$\square$ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

Status Bits Affected by M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.
Example

| Syntax | Description |
| :--- | :--- |
| SUB uns(*AR3), AC1, AC0 | The unsigned content addressed by AR3 is subtracted from the content of AC1 <br> and the result is stored in AC0. |

## Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [15] | SUB [uns(]Smem[)] << \#SHIFTW, [ACx,] ACy | No | 4 | 1 | X |

Opcode | 1111 1001 | AAAA AAAI | uxSH IFTW | SSDD 01xx |

Operands ACx, ACy, SHIFTW, Smem

Description This instruction subtracts the content of a memory (Smem) location shifted by the 6-bit value, SHIFTW, from an accumulator content ACx:

```
ACy = ACx - (Smem << #SHIFTW)
```

$\square$ The operation is performed on 40 bits in the D-unit shifter.
$\square$ Input operands are extended to 40 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured. When C54CM $=1$, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

| Status Bits | Affected by | C54CM, M40, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |
| Repeat | This instruction can be repeated. |  |

Example

| Syntax | Description |
| :--- | :--- |
| SUB uns(*AR3) << \#31, AC1, AC0 | The unsigned content addressed by AR3 shifted left by 31 bits is <br> subtracted from the content of AC1 and the result is stored in AC0. |

## Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel Enable Bit | Size | Cycles | Pipeline |
| :---: | :---: | :---: | :---: | :---: | :---: |
| [16] | SUB dbl(Lmem), [ACx,] ACy | No | 3 | 1 | X |

Operands ACx, ACy, Lmem

Description This instruction subtracts the content of data memory operand dbl(Lmem) from an accumulator content ACx:

```
ACy = ACx - dbl (Lmem)
```

$\square$ The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
■ if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.
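
The dbl(Lmem) alignment rule can be sketched as follows. This Python model is illustrative only: the function names and the dictionary used as word-addressed memory are hypothetical, and sign extension of the 32-bit value to 40 bits is not shown.

```python
def dbl_word_addresses(lmem):
    """Return (msw_address, lsw_address) for a dbl(Lmem) access.

    Even address: MSW at Lmem, LSW at Lmem + 1.
    Odd address:  MSW at Lmem, LSW at Lmem - 1.
    """
    if lmem % 2 == 0:
        return lmem, lmem + 1
    return lmem, lmem - 1


def read_dbl(memory, lmem):
    """Assemble the 32-bit long word from a word-addressed memory mapping."""
    msw_address, lsw_address = dbl_word_addresses(lmem)
    return ((memory[msw_address] & 0xFFFF) << 16) | (memory[lsw_address] & 0xFFFF)
```

For a memory holding 1234h at address 0200h and 5678h at address 0201h, reading at 0200h gives 1234 5678h, while reading at the odd address 0201h gives 5678 1234h.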


## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.
Status Bits Affected by M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| SUB dbl(*AR3+), AC1, AC0 | The content (long word) addressed by AR3 and AR3 + 1 is subtracted from the <br> content of AC1 and the result is stored in AC0. Because this instruction is a <br> long-operand instruction, AR3 is incremented by 2 after the execution. |

## Subtraction

## Syntax Characteristics


$\square$ The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
■ if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with $\mathrm{M} 40=0$, compatibility is ensured.

Status Bits Affected by M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

## Subtraction

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :--- | :--- | :---: | :---: | :---: | :---: |
| [18] | SUB Xmem, Ymem, ACx | No | 3 | 1 | X |

Opcode | 1000 0001 | XXXM MMYY | YMMM 01DD |

Operands ACx, Xmem, Ymem

Description

Status Bits

Repeat
Example

| Syntax | Description |
| :--- | :--- |
| SUB *AR3, *AR4, AC0 | The content addressed by AR4 shifted left by 16 bits is subtracted from the <br> content addressed by AR3 shifted left by 16 bits and the result is stored in AC0. |
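
The operation described in this example can be modeled as below. This is a minimal sketch with hypothetical names; it assumes both memory operands are sign extended (SXMD = 1) and it omits overflow detection, saturation, and the CARRY bit.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def to_signed16(value):
    """Interpret the 16 LSBs of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sub_dual_mem(xmem, ymem):
    """Model ACx = (Xmem << 16) - (Ymem << 16) as a 40-bit pattern."""
    exact = (to_signed16(xmem) << 16) - (to_signed16(ymem) << 16)
    return exact & MASK40
```

For instance, with 0003h addressed by AR3 and 0001h addressed by AR4, the modeled accumulator value is 00 0002 0000h.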

## SUB::MOV <br> Subtraction with Parallel Store Accumulator Content to Memory

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | SUB Xmem << \#16, ACx, ACy <br> :: MOV HI(ACy << T2), Ymem | No | 4 | 1 | X |

Opcode
Operands
Description This instruction performs two operations in parallel: subtraction and store:

```
ACy = (Xmem << #16) - ACx
:: Ymem = HI(ACy << T2)
```

The first operation subtracts an accumulator content from the content of data memory operand Xmem shifted left by 16 bits.

$\square$ The operation is performed on 40 bits in the D-unit ALU.
$\square$ Input operands are sign extended to 40 bits according to SXMD.
$\square$ The shift operation is equivalent to the signed shift instruction.
$\square$ Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit. When C54CM = 1, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.
$\square$ When an overflow is detected, the accumulator is saturated according to SATD.

The second operation shifts the accumulator ACy by the content of T2 and stores ACy(31-16) to data memory operand Ymem. If the 16-bit value in T2 is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.
$\square$ The input operand is shifted in the D-unit shifter according to SXMD.
$\square$ After the shift, the high part of the accumulator, ACy(31-16), is stored to the memory location.
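
The store half of this instruction can be sketched as follows. The Python model is illustrative only: the function names are hypothetical, the shift is assumed to be arithmetic (SXMD = 1), a negative shift quantity is assumed to mean a right shift, and rounding (RDM) and store saturation (SST) are not modeled.

```python
def saturate_shift_count(t2):
    """Clamp the signed 16-bit value in T2 to the range -32 to +31."""
    signed = t2 - 0x10000 if t2 & 0x8000 else t2 & 0xFFFF
    return max(-32, min(31, signed))


def store_hi(acy, t2):
    """Model Ymem = HI(ACy << T2) for a 40-bit accumulator value.

    Positive counts shift left, negative counts shift right
    (assumption). Bits 31-16 of the shifted value are returned.
    """
    value = acy & ((1 << 40) - 1)
    if value & (1 << 39):         # sign extend the 40-bit pattern
        value -= 1 << 40
    count = saturate_shift_count(t2)
    shifted = value << count if count >= 0 else value >> -count
    return (shifted >> 16) & 0xFFFF
```

For an accumulator holding 00 1234 5678h, a T2 value of 0 stores 1234h, a value of 4 stores 2345h, and a value of FFFCh (that is, -4) stores 0123h.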

## Compatibility with C54x devices (C54CM = 1)

When this instruction is executed with M40 = 0, compatibility is ensured. When this instruction is executed with C54CM = 1, the 6 LSBs of T2 are used to determine the shift quantity. The 6 LSBs of T2 define a shift quantity within -32 to +31. When the 16-bit value in T2 is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.

$\square$ If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

ACy = (Xmem << \#16) - ACx
Ymem = HI(saturate(uns(ACy << T2)))

$\square$ If the SST bit = 1 and the SXMD bit = 1, then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

ACy = (Xmem << \#16) - ACx
Ymem = HI(saturate(ACy << T2))
Status Bits Affected by C54CM, M40, RDM, SATD, SST, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.
See Also See the following other related instructions:
$\square$ ADDSUB (Dual 16-Bit Addition and Subtraction)
$\square$ ADDSUBCC (Addition or Subtraction Conditionally)
$\square$ ADDSUBCC (Addition, Subtraction, or Move Accumulator Content Conditionally)
$\square$ ADDSUB2CC (Addition or Subtraction Conditionally with Shift)
$\square$ SUB (Dual 16-Bit Subtractions)
$\square$ SUB (Subtraction)
$\square$ SUBADD (Dual 16-Bit Subtraction and Addition)
$\square$ SUBC (Subtract Conditionally)


## Example

| Syntax | Description |
| :--- | :--- |
| SUB *AR3 << \#16, AC1, AC0 <br> :: MOV HI(AC0 << T2), *AR4 | Both instructions are performed in parallel. The content of AC1 is subtracted from the content addressed by AR3 shifted left by 16 bits and the result is stored in AC0. The content of AC0 is shifted by the content of T2, and AC0(31-16) is stored at the address of AR4. |

## SUBADD <br> Dual 16-Bit Subtraction and Addition

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | SUBADD Tx, Smem, ACx | No | 3 | 1 | X |
| [2] | SUBADD Tx, dual(Lmem), ACx | No | 3 | 1 | X |

Description These instructions perform a subtraction and an addition in parallel in one cycle.

The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

| Status Bits | Affected by | C54CM, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, ACOVy, CARRY |

See Also See the following other related instructions:
$\square$ ADD (Addition)
$\square$ ADD (Dual 16-Bit Additions)
$\square$ ADDSUB (Dual 16-Bit Addition and Subtraction)
$\square$ SUB (Dual 16-Bit Subtractions)
$\square$ SUB (Subtraction)


## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :--- | :--- | :---: | :---: | :---: | :---: |
| [1] | SUBADD Tx, Smem, ACx | No | 3 | 1 | X |

Opcode | 1101 1110 | AAAA AAAI | SSDD 1001 |

Operands ACx, Smem, Tx

The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

$\square$ The data memory operand Smem:
■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
$\square$ The temporary register Tx:
■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
$\square$ For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.
$\square$ For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.

- Independently on each data path, if SATD = 1 when an overflow is detected on the data path, a saturation is performed:
  - For the operations performed in the ALU low part, saturation values are 7FFFh and 8000h.
  - For the operations performed in the ALU high part, saturation values are 00 7FFFh and FF 8000h.

## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, this instruction is executed as if SATD is locally cleared to 0. Overflow is only detected and reported for the computation performed in the higher 24-bit datapath (overflow is detected at bit position 31).

| Status Bits | Affected by | C54CM, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, CARRY |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| SUBADD T0, *AR3, AC0 | Both instructions are performed in parallel. The content of T0 is subtracted from the content addressed by AR3 and the result is stored in AC0(39-16). The duplicated content of T0 is added to the duplicated content addressed by AR3 and the result is stored in AC0(15-0). |
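
The dual 16-bit behaviour of SUBADD Tx, Smem, ACx can be sketched as a small behavioural model. This is an illustrative approximation, not the device's implementation: the function names are hypothetical, the low datapath is assumed to treat its operands as signed 16-bit values for overflow detection, and the CARRY bit and the C54CM = 1 case are not modelled.

```python
def sign_extend_16(value):
    """Interpret a 16-bit pattern as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def subadd(tx, smem, sxmd=1, satd=0):
    """Model SUBADD Tx, Smem, ACx; returns (ACx as a 40-bit integer, ACOVx)."""
    # High datapath: Smem - Tx on 24 bits, operands extended according to SXMD.
    extend = sign_extend_16 if sxmd else (lambda v: v & 0xFFFF)
    high = extend(smem) - extend(tx)
    # Low datapath: Smem + Tx on 16 bits (assumed signed in this sketch).
    low = sign_extend_16(smem) + sign_extend_16(tx)

    overflow = False
    if not -0x8000 <= high <= 0x7FFF:   # overflow detected at bit position 31
        overflow = True
        if satd:
            high = 0x7FFF if high > 0 else -0x8000
    if not -0x8000 <= low <= 0x7FFF:    # overflow detected at bit position 15
        overflow = True
        if satd:
            low = 0x7FFF if low > 0 else -0x8000

    # Guard bits travel with the high part: 24 bits above, 16 bits below.
    acx = ((high & 0xFFFFFF) << 16) | (low & 0xFFFF)
    return acx, overflow
```

With Tx = 0001h and Smem = 0003h the model places the difference (2) in ACx(39-16) and the sum (4) in ACx(15-0).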

## Syntax Characteristics



The operations are executed on 40 bits in the D-unit ALU, which is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

- The temporary register Tx:
  - is used as one of the 16-bit operands of the ALU low part
  - is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
- The data memory operand dbl(Lmem) is divided into two 16-bit parts:
  - the lower part is used as one of the 16-bit operands of the ALU low part
  - the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
- The data memory operand dbl(Lmem) addresses are aligned:
  - if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
  - if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
- For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.
  - For the operations performed in the ALU low part, overflow is detected at bit position 15.
  - For the operations performed in the ALU high part, overflow is detected at bit position 31.
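
The dbl(Lmem) alignment rule above can be expressed as a short helper. This is only a sketch of the addressing rule as stated; the function name is hypothetical.

```python
def dbl_word_addresses(lmem):
    """Return (most significant word address, least significant word address)."""
    if lmem % 2 == 0:
        # Even address: the least significant word follows at Lmem + 1.
        return lmem, lmem + 1
    # Odd address: the least significant word precedes at Lmem - 1.
    return lmem, lmem - 1
```

In both cases the two words share the same even/odd aligned pair; only the role of each word changes.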

| Status Bits | Affected by | C16, C54CM, SATD, SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVx, CARRY |

## Example

| Syntax | Description |
| :--- | :--- |
| SUBADD T0, dual(*AR3), AC0 | Both instructions are performed in parallel. When the Lmem address is <br> even (AR3 = even): The content of T0 is subtracted from the content <br> addressed by AR3 and the result is stored in AC0(39-16). The duplicated <br> content of T0 is added to the content addressed by AR3 + 1 and the result <br> is stored in AC0(15-0). |

## SUBC

## Subtract Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | SUBC Smem, [ACx,] ACy |  |  |  |  |

Operands ACx, ACy, Smem

Description This instruction performs a conditional subtraction in the D-unit ALU. The D-unit shifter is not used to perform the memory operand shift.

- The 16-bit data memory operand Smem is sign extended to 40 bits according to SXMD, shifted left by 15 bits, and subtracted from the content of the source accumulator ACx.
  - The shift operation is equivalent to the signed shift instruction.
  - Overflow and carry are always detected at bit position 31. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
- If an overflow is detected and reported in accumulator overflow bit ACOVy, no saturation is performed on the result of the operation.
- If the result of the subtraction is greater than or equal to 0 (bit 39 = 0), the result is shifted left by 1 bit, added to 1, and stored in the destination accumulator ACy.
- If the result of the subtraction is less than 0 (bit 39 = 1), the source accumulator ACx is shifted left by 1 bit and stored in the destination accumulator ACy.

```
if ((ACx - (Smem << #15)) >= 0)
    ACy = ((ACx - (Smem << #15)) << #1) + 1
else
    ACy = ACx << #1
```

This instruction is used to make a 16-step, 16-bit by 16-bit division. The divisor and the dividend are both assumed to be positive in this instruction. SXMD affects this operation:

- If SXMD = 1, the divisor must have a 0 value in the most significant bit
- If SXMD = 0, any 16-bit divisor value produces the expected result

The dividend, which is in the source accumulator ACx, must be positive (bit 31 = 0) during the computation.
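
The conditional-subtract step and its use for division can be sketched as follows. This is a behavioural approximation with hypothetical function names; the status bits are not modelled, and the placement of the quotient in bits 15-0 and the remainder in bits 31-16 after 16 steps is stated here as an assumption based on how the pseudocode accumulates result bits.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit pattern as a signed integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def subc(acx, smem, sxmd=0):
    """One SUBC step: returns the new 40-bit ACy value."""
    operand = smem & 0xFFFF
    if sxmd and operand & 0x8000:
        operand -= 0x10000              # sign extend according to SXMD
    source = to_signed40(acx)
    diff = to_signed40(source - (operand << 15))
    if diff >= 0:                       # bit 39 of the difference is 0
        result = (diff << 1) + 1
    else:
        result = source << 1
    return result & MASK40


def divide16(dividend, divisor):
    """Repeat SUBC 16 times; returns (quotient, remainder) under the stated assumption."""
    ac = dividend & 0xFFFF
    for _ in range(16):
        ac = subc(ac, divisor)
    return ac & 0xFFFF, (ac >> 16) & 0xFFFF
```

A single call such as `subc(0x2343000000, 0x0200)` reproduces the AC1 value shown in Example 1 of this instruction.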

| Status Bits | Affected by | SXMD |
| :--- | :--- | :--- |
|  | Affects | ACOVy, CARRY |

Repeat This instruction can be repeated.

See Also See the following other related instructions:

- ADDSUBCC (Addition or Subtraction Conditionally)
- ADDSUBCC (Addition, Subtraction, or Move Accumulator Content Conditionally)
- ADDSUB2CC (Addition or Subtraction Conditionally with Shift)
- SUB (Subtraction)
- SUB::MOV (Subtraction with Parallel Store Accumulator Content to Memory)
- SUBADD (Dual 16-Bit Subtraction and Addition)

## Example 1

| Syntax | Description |
| :--- | :--- |
| SUBC *AR1, AC0, AC1 | The content addressed by AR1 shifted left by 15 bits is subtracted from the <br> content of AC0. The result is greater than 0; therefore, the result is shifted left by <br> 1 bit, added to 1, and the new result stored in AC1. The result generated an <br> overflow and a carry. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AC0 | 23 4300 0000 | AC0 | 23 4300 0000 |
| AC1 | 00 0000 0000 | AC1 | 46 8400 0001 |
| AR1 | 300 | AR1 | 300 |
| 300 | 200 | 300 | 200 |
| SXMD | 0 | SXMD | 0 |
| ACOV1 | 0 | ACOV1 | 1 |
| CARRY | 0 | CARRY | 1 |

## Example 2

| Syntax | Description |
| :--- | :--- |
| RPT CSR <br> SUBC *AR1, AC1 | The content addressed by AR1 shifted left by 15 bits is subtracted from the content of AC1. The result is greater than 0; therefore, the result is shifted left by 1 bit, added to 1, and the new result stored in AC1. The content addressed by AR1 shifted left by 15 bits is subtracted from the content of AC1. The result is greater than 0; therefore, the result is shifted left by 1 bit, added to 1, and the new result stored in AC1. The result generated a carry. |

| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AC1 | 00 0746 0000 | AC1 | 00 1A18 0007 |
| AR1 | 200 | AR1 | 200 |
| 200 | 0100 | 200 | 0100 |
| CSR | 1 | CSR | 0 |
| ACOV1 | 0 | ACOV1 | 0 |
| CARRY | 0 | CARRY | 1 |

## SWAP

## Syntax Characteristics



## SWAP

## Swap Auxiliary Register Content

## Syntax Characteristics



## SWAP

## Swap Auxiliary and Temporary Register Content

## Syntax Characteristics



## Example

| Syntax | Description |
| :--- | :--- |
| SWAP AR4, T0 | The content of AR4 is moved to T0 and the content of T0 is moved to AR4. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| T0 | 6500 | T0 | 0300 |
| AR4 | 0300 | AR4 | 6500 |

## SWAP

## Swap Temporary Register Content

## Syntax Characteristics



## SWAPP

## Swap Accumulator Pair Content

## Syntax Characteristics



## Example

| Syntax | Description |
| :--- | :--- |
| SWAPP AC0, AC2 | The following two swap instructions are performed in parallel: the content of AC0 <br> is moved to AC2 and the content of AC2 is moved to AC0, and the content of AC1 <br> is moved to AC3 and the content of AC3 is moved to AC1. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AC0 | 01 E500 0030 | AC0 | 00 2800 0200 |
| AC1 | 00 FFFF 0000 | AC1 | 00 8800 0800 |
| AC2 | 00 2800 0200 | AC2 | 01 E500 0030 |
| AC3 | 00 8800 0800 | AC3 | 00 FFFF 0000 |

## SWAPP

## Swap Auxiliary Register Pair Content

## Syntax Characteristics



## Example

| Syntax | Description |
| :--- | :--- |
| SWAPP AR0, AR2 | The following two swap instructions are performed in parallel: the content of AR0 <br> is moved to AR2 and the content of AR2 is moved to AR0, and the content of AR1 <br> is moved to AR3 and the content of AR3 is moved to AR1. |
|  |  |
| Before | After |
| AR0 | AR0 |
| AR1 | 0200 |

## SWAPP

## Swap Auxiliary and Temporary Register Pair Content

## Syntax Characteristics



## Example

| Syntax | Description |
| :--- | :--- |
| SWAPP AR4, T0 | The following two swap instructions are performed in parallel: the content of AR4 <br> is moved to T0 and the content of T0 is moved to AR4, and the content of AR5 is <br> moved to T1 and the content of T1 is moved to AR5. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AR4 | 0200 | AR4 | 6788 |
| AR5 | 0300 | AR5 | 0200 |
| T0 | 6788 | T0 | 0200 |
| T1 | 0200 | T1 | 0300 |

## SWAPP

## Swap Temporary Register Pair Content

## Syntax Characteristics



Operands T0, T2

Description This instruction performs two parallel moves between four temporary registers (T0 and T2, T1 and T3) in one cycle. These operations are performed in a dedicated datapath independent of the A-unit operators. Temporary register swapping is performed in the address phase of the pipeline.

This instruction performs two parallel moves:

- the content of T0 to T2, and reciprocally the content of T2 to T0
- the content of T1 to T3, and reciprocally the content of T3 to T1
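
The parallel nature of the two moves can be illustrated with a minimal register-file model. The dictionary and the `swap_pairs` helper are hypothetical; the initial values are the ones used in the SWAPP T0, T2 example of this section.

```python
def swap_pairs(regs, pairs):
    """Exchange each (a, b) register pair; all reads happen before any write."""
    snapshot = dict(regs)
    for a, b in pairs:
        regs[a], regs[b] = snapshot[b], snapshot[a]
    return regs


# SWAPP T0, T2 exchanges T0 with T2 and T1 with T3 in the same cycle.
regs = {"T0": 0x0200, "T1": 0x0300, "T2": 0x6788, "T3": 0x0200}
swap_pairs(regs, [("T0", "T2"), ("T1", "T3")])
```

Taking a snapshot first mirrors the fact that neither move observes the result of the other.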

Status Bits Affected by none
Affects none

Repeat This instruction can be repeated.

See Also See the following other related instructions:

- SWAP (Swap Temporary Register Content)
- SWAPP (Swap Accumulator Pair Content)
- SWAPP (Swap Auxiliary Register Pair Content)
- SWAPP (Swap Auxiliary and Temporary Register Pair Content)


## Example

| Syntax | Description |
| :--- | :--- |
| SWAPP T0, T2 | The following two swap instructions are performed in parallel: the content of T0 is <br> moved to T2 and the content of T2 is moved to T0, and the content of T1 is <br> moved to T3 and the content of T3 is moved to T1. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| T0 | 0200 | T0 | 6788 |
| T1 | 0300 | T1 | 0200 |
| T2 | 6788 | T2 | 0200 |
| T3 | 0200 | T3 | 0300 |

## SWAP4

## Swap Auxiliary and Temporary Register Pairs Content

## Syntax Characteristics


## Example

| Syntax | Description |
| :--- | :--- |
| SWAP4 AR4, T0 | The following four swap instructions are performed in parallel: the content of AR4 <br> is moved to T0 and the content of T0 is moved to AR4, the content of AR5 is <br> moved to T1 and the content of T1 is moved to AR5, the content of AR6 is moved <br> to T2 and the content of T2 is moved to AR6, and the content of AR7 is moved to <br> T3 and the content of T3 is moved to AR7. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AR4 | 0200 | AR4 | 0030 |
| AR5 | 0300 | AR5 | 0200 |
| AR6 | 0240 | AR6 | 3400 |
| AR7 | 0400 | AR7 | 0FD3 |
| T0 | 0030 | T0 | 0200 |
| T1 | 0200 | T1 | 0300 |
| T2 | 3400 | T2 | 0240 |
| T3 | 0FD3 | T3 | 0400 |

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :--- | :--- | :---: | :---: | :---: | :---: |
| [1] | TRAP k5 | No | 2 | ? | D |

Opcode 1001 0101 1xxk kkkk

Operands k5

Description This instruction passes control to a specified interrupt service routine (ISR). It does not affect the INTM bit in ST1_55 or the DBGM bit in ST2_55. The ISR address is stored at the interrupt vector address defined by the content of an interrupt vector pointer (IVPD or IVPH) combined with the 5-bit constant, k5. This instruction is executed regardless of the value of the INTM bit; it is not maskable.

## Note:

DBSTAT (the debug status register) holds debug context information used during emulation. Make sure the ISR does not modify the value that will be returned to DBSTAT.

Before beginning an ISR, the CPU automatically saves the value of some CPU registers and two internal registers: the program counter (PC) and a loop context register. The CPU can use these values to re-establish the context of the interrupted program sequence when the ISR is done.

In the slow-return process (default), the return address (from the PC), the loop context bits, and some CPU registers are stored to the stacks (in memory). When the CPU returns from an ISR, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are saved to registers, so that these values can always be restored quickly. These special registers are the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions. Some CPU registers are saved to the stacks (in memory). For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

When control is passed to the ISR:

- The data stack pointer (SP) is decremented by 1 word in the address phase of the pipeline. The status register 2 (ST2_55) content is pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the address phase of the pipeline. The 7 higher bits of status register 0 (ST0_55) concatenated with 9 zeroes are pushed to the top of SSP.
- The SP is decremented by 1 word in the access phase of the pipeline. The status register 1 (ST1_55) content is pushed to the top of SP.
- The SSP is decremented by 1 word in the access phase of the pipeline. The debug status register (DBSTAT) content is pushed to the top of SSP.
- The SP is decremented by 1 word in the read phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The SSP is decremented by 1 word in the read phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.

The PC is loaded with the ISR program address. The active control flow execution context flags are cleared.
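
The order of the six pushes in the slow-return process can be sketched with a simple model. Everything here is illustrative: the dictionary keys, the `LOOP_CTX` field, and the function name are hypothetical, the stack pointers are treated as plain 16-bit word addresses, and the loop context bits are assumed to occupy the upper byte of the last word pushed to SSP.

```python
def trap_context_save(cpu, memory):
    """Push the context saved on TRAP; cpu maps register names to values."""
    def push(pointer, value):
        cpu[pointer] = (cpu[pointer] - 1) & 0xFFFF   # decrement by 1 word
        memory[cpu[pointer]] = value & 0xFFFF        # store at the new top

    # Address phase
    push("SP", cpu["ST2_55"])
    push("SSP", cpu["ST0_55"] & 0xFE00)              # 7 MSBs followed by 9 zeroes
    # Access phase
    push("SP", cpu["ST1_55"])
    push("SSP", cpu["DBSTAT"])
    # Read phase
    push("SP", cpu["PC"] & 0xFFFF)                   # 16 LSBs of the return address
    push("SSP", (cpu["LOOP_CTX"] << 8) | ((cpu["PC"] >> 16) & 0xFF))
```

After the call, SP and SSP have each moved down by three words, and the two stacks hold matching halves of the saved context.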


## XCC

## Execute Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | XCC $[$ label, ]cond | No | 2 | 1 | AD |
| $[2]$ | XCCPART [label, ]cond | No | 2 | 1 | X |

Description These instructions evaluate a single condition defined by the cond field and allow you to control execution of all operations implied by the instruction or part of the instruction. See Table 1-3 for a list of conditions.

Instruction [1] allows you to control the entire execution flow from the address phase to the execute phase of the pipeline. Instruction [2] allows you to only control the execution flow from the execute phase of the pipeline. The use of a label, where control of the execute conditionally instruction ends, is optional.
- These instructions may be executed alone.
- These instructions may be executed with two paralleled instructions.
- These instructions may be executed with the instruction with which they are paralleled.
- These instructions may be executed with the previous instruction.
- These instructions may be executed with the previous instruction and two paralleled instructions.
- These instructions cannot be repeated.
- These instructions cannot be used as the last instruction in a repeat loop structure.
- These instructions cannot control the execution of the following program control instructions:

| B (branch) | BCC | IDLE | INTR | XCC |
| :--- | :--- | :--- | :--- | :--- |
| CALL | CALLCC | RPT | RPTCC | XCCPART |
| RET | RETCC | RETI | RPTB | RPTBLOCAL |
| RESET | TRAP |  |  |  |

Status Bits Affected by ACOVx, CARRY, C54CM, M40, TCx
Affects ACOVx

## Execute Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | XCC [label, ]cond | No | 2 | 1 | AD |

Opcode The assembler selects the opcode depending on the instruction position in a paralleled pair:

- 1001 0110 0CCC CCCC
- 1001 1110 0CCC CCCC
- 1001 1111 0CCC CCCC

Operands cond

Description This instruction evaluates a single condition defined by the cond field and allows you to control the execution flow of an instruction, or instructions, from the address phase to the execute phase of the pipeline. See Table 1-3 for a list of conditions.

When this instruction moves into the address phase of the pipeline, the condition specified in the cond field is evaluated. If the tested condition is true, the conditional instruction(s) is read and executed; if the tested condition is false, the conditional instruction(s) is not read and program control is passed to the instruction following the conditional instruction(s) or to the program address defined by label. There is a 3-cycle latency for the condition testing.
- This instruction may be executed alone:

```
XCC [label, ]cond
instruction_executes_conditionally
[label:]
```

- This instruction may be executed with two paralleled instructions:

```
    XCC [label, ]cond
    instruction_1_executes_conditionally
    || instruction_2_executes_conditionally
[label:]
```

- This instruction may be executed with the instruction with which it is paralleled:

XCC [label, ]cond
|| instruction_executes_conditionally
[label:]

- This instruction may be executed with a previous instruction:

```
    previous_instruction
    || XCC [label, ] cond
    instruction_executes_conditionally
[label:]
```

- This instruction may be executed with a previous instruction and two paralleled instructions:

```
    previous_instruction
    || XCC [label, ]cond
    instruction_1_executes_conditionally
    || instruction_2_executes_conditionally
[label:]
```

This instruction cannot be used as the last instruction in a repeat loop structure.

This instruction cannot control the execution of the following program control instructions:

| B (branch) | BCC | IDLE | INTR | XCC |
| :--- | :--- | :--- | :--- | :--- |
| CALL | CALLCC | RPT | RPTCC | XCCPART |
| RET | RETCC | RETI | RPTB | RPTBLOCAL |
| RESET | TRAP |  |  |  |

## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, the comparison of accumulators to 0 is performed as if M40 was set to 1.

Status Bits Affected by ACOVx, CARRY, C54CM, M40, TCx
Affects ACOVx
Repeat This instruction cannot be repeated.

## Example 1

| Syntax | Description |
| :--- | :--- |
| XCC branch, AR0 != \#0 | Because the content of AR0 is not equal to 0, the next (ADD) instruction is executed. The <br> content of AC0 is added to the content addressed by AR2 and the result is stored <br> in AC0. AR2 is incremented by 1. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AR0 | 3000 | AR0 | 3000 |
| AR2 | 0405 | AR2 | 0406 |
| 405 | EF00 | 405 | EF00 |
| AC0 | 00 0000 000C | AC0 |  |

## Example 2

| Syntax | Description |
| :--- | :--- |
| XCC AR0 != \#0 | Because the content of AR0 is equal to 0, the next (ADD) instruction is not executed and <br> control is passed to the instruction following the conditionally executed (ADD) <br> instruction. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AR0 | 0000 | AR0 | 0000 |
| AR2 | 0405 | AR2 | 0405 |
| 405 | EF00 | 405 | EF00 |
| AC0 | 00 0000 000C | AC0 |  |

## Execute Conditionally

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [2] | XCCPART [label, ]cond | No | 2 | 1 | X |

Opcode The assembler selects the opcode depending on the instruction position in a paralleled pair:

- 1001 0110 1CCC CCCC
- 1001 1110 1CCC CCCC
- 1001 1111 1CCC CCCC

Operands cond

Description This instruction evaluates a single condition defined by the cond field and allows you to control the execution flow of an instruction, or instructions, from the execute phase of the pipeline. This instruction differs from instruction [1] because in this instruction operations performed in the address phase are always executed. See Table 1-3 for a list of conditions.

When this instruction moves into the execute phase of the pipeline, the condition specified in the cond field is evaluated. If the tested condition is true, the conditional instruction(s) is read and executed; if the tested condition is false, the conditional instruction(s) is not read and program control is passed to the instruction following the conditional instruction(s) or to the program address defined by label. There is a 0-cycle latency for the condition testing.

- This instruction may be executed alone:

XCCPART [label, ]cond
instruction_executes_conditionally
[label:]

- This instruction may be executed with two paralleled instructions:

```
XCCPART [label, ]cond
instruction_1_executes_conditionally
|| instruction_2_executes_conditionally
[label:]
```

- This instruction may be executed with the instruction with which it is paralleled. When this instruction syntax is used and the instruction to be executed conditionally is a store-to-memory instruction, there is a 1-cycle latency for the condition setting.

XCCPART [label, ]cond
|| instruction_executes_conditionally
[label:]

- This instruction may be executed with a previous instruction:

```
    previous_instruction
    || XCCPART [label, ]cond
    instruction_executes_conditionally
[label:]
```

$\square$ This instruction may be executed with a previous instruction and two paralleled instructions:

```
    previous_instruction
    || XCCPART [label, ]cond
    instruction_1_executes_conditionally
    || instruction_2_executes_conditionally
[label:]
```

This instruction cannot be used as the last instruction in a repeat loop structure.

This instruction cannot control the execution of the following program control instructions:

| B (branch) | BCC | IDLE | INTR | XCC |
| :--- | :--- | :--- | :--- | :--- |
| CALL | CALLCC | RPT | RPTCC | XCCPART |
| RET | RETCC | RETI | RPTB | RPTBLOCAL |
| RESET | TRAP |  |  |  |

## Compatibility with C54x devices (C54CM = 1)

When C54CM = 1, the comparison of accumulators to 0 is performed as if M40 was set to 1.

Status Bits Affected by ACOVx, CARRY, C54CM, M40, TCx
Affects ACOVx

Repeat This instruction cannot be repeated.

## Example 1

| Syntax | Description |
| :--- | :--- |
| XCCPART branch, AR0 != \#0 | Because the content of AR0 is not equal to 0, the next (ADD) instruction is executed. <br> The content of AC0 is added to the content addressed by AR2 and the <br> result is stored in AC0. AR2 is incremented by 1. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AR0 | 3000 | AR0 | 3000 |
| AR2 | 0405 | AR2 | 0406 |
| 405 | EF00 | 405 | EF00 |
| AC0 | 00 0000 000C | AC0 |  |

## Example 2

| Syntax | Description |
| :--- | :--- |
| XCCPART AR0 != \#0 | Because the content of AR0 is equal to 0, the next (ADD) instruction is not executed and <br> control is passed to the instruction following the conditionally executed (ADD) <br> instruction; however, since the next (ADD) instruction includes a pointer <br> modification, AR2 is incremented by 1 in the address phase. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AR0 | 0000 | AR0 | 0000 |
| AR2 | 0405 | AR2 | 0406 |
| 405 | EF00 | 405 | EF00 |
| AC0 | 00 0000 000C | AC0 |  |
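
The difference between XCC and XCCPART shown in the two Example 2 cases can be sketched as a small model of a guarded ADD with a post-incremented pointer. The function name and argument layout are hypothetical, the condition is fixed to AR0 != #0, and the addition is simplified to an unsigned 40-bit add that ignores SXMD.

```python
def run_guarded_add(mnemonic, ar0, ar2, ac0, memory):
    """Model 'XCC/XCCPART AR0 != #0' guarding an ADD that post-increments AR2.

    Returns the resulting (AR2, AC0) pair.
    """
    condition = ar0 != 0
    # XCC gates the whole instruction from the address phase onward;
    # XCCPART only gates the execute phase, so pointer updates always happen.
    address_phase_runs = condition or mnemonic == "XCCPART"
    if condition:
        ac0 = (ac0 + memory[ar2]) & 0xFFFFFFFFFF   # simplification: no sign extension
    if address_phase_runs:
        ar2 = (ar2 + 1) & 0xFFFF
    return ar2, ac0
```

With AR0 = 0, the XCC form leaves AR2 at 0405h while the XCCPART form advances it to 0406h, and AC0 is unchanged in both cases.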

## XOR

Bitwise Exclusive OR (XOR)

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[1]$ | XOR src, dst | Yes | 2 | 1 | X |
| $[2]$ | XOR k8, src, dst | Yes | 3 | 1 | X |
| $[3]$ | XOR k16, src, dst | No | 4 | 1 | X |
| $[4]$ | XOR Smem, src, dst | No | 3 | 1 | X |
| $[5]$ | XOR ACx <<\#SHIFTW[, ACy] | Yes | 3 | 1 | X |
| $[6]$ | XOR k16 <<\#16, [ACx,] ACy | No | 4 | 1 | X |
| $[7]$ | XOR k16 <<\#SHFT, [ACx,] ACy | No | 4 | 1 | X |
| $[8]$ | XOR k16, Smem | No | 4 | 1 | X |

Description These instructions perform a bitwise exclusive-OR (XOR) operation:

- In the D-unit, if the destination operand is an accumulator.
- In the A-unit ALU, if the destination operand is an auxiliary or temporary register.
- In the A-unit ALU, if the destination operand is the memory.

Status Bits Affected by C54CM
Affects none

See Also See the following other related instructions:

- AND (Bitwise AND)
- OR (Bitwise OR)

## Bitwise Exclusive OR (XOR)

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | XOR src, dst | Yes | 2 | 1 | X |

Opcode 0010 110E FSSS FDDD

## Operands

Description

Status Bits

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| XOR AC0, AC1 | The content of AC0 is XORed with the content of AC1 and the result is stored in AC1. |


| Before |  | After |  |
| :--- | :--- | :--- | :--- |
| AC0 | 7E 2355 4FC0 | AC0 | 7E 2355 4FC0 |
| AC1 | 0F E340 5678 | AC1 | 71 C015 19B8 |

## Bitwise Exclusive OR (XOR)

## Syntax Characteristics



## Bitwise Exclusive OR (XOR)

## Syntax Characteristics



## Bitwise Exclusive OR (XOR)

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [4] | XOR Smem, src, dst | No | 3 | 1 | X |

Opcode 1101 1011 AAAA AAAI FDDD FSSS

Operands dst, Smem, src

Description This instruction performs a bitwise exclusive-OR (XOR) operation between a source (src) register content and a memory (Smem) location: dst = src ^ Smem

- When the destination (dst) operand is an accumulator:
  - The operation is performed on 40 bits in the D-unit ALU.
  - Input operands are zero extended to 40 bits.
  - If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are zero extended.
- When the destination (dst) operand is an auxiliary or temporary register:
  - The operation is performed on 16 bits in the A-unit ALU.
  - If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.

| Status Bits | Affected by |  |
| :--- | :--- | :--- |
|  | Affects |  |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| XOR *AR3, AC1, AC0 | The content of AC1 is XORed with the content addressed by AR3 and the result is stored in AC0. |
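
The two operating widths of XOR Smem, src, dst can be sketched as follows. The function and its `dst_is_accumulator` flag are hypothetical names used only for illustration; the register values in the usage note are arbitrary.

```python
def xor_smem(src, smem, dst_is_accumulator):
    """Model dst = src ^ Smem for an accumulator or a 16-bit register destination."""
    if dst_is_accumulator:
        # D-unit ALU: 40-bit operation, Smem zero extended to 40 bits.
        return (src & 0xFFFFFFFFFF) ^ (smem & 0xFFFF)
    # A-unit ALU: 16-bit operation on the 16 LSBs of the source.
    return (src ^ smem) & 0xFFFF
```

For a source value of 0F E340 5678h and a memory word of 00FFh, the accumulator form yields 0F E340 5687h, while the 16-bit form yields 5687h.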

## Bitwise Exclusive OR (XOR)

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [5] | XOR ACx << \#SHIFTW[, ACy] | Yes | 3 | 1 | X |

Opcode 0001 000E DDSS 0010 xxSH IFTW

## Operands

Description

| Status Bits | Affected by | C54CM, M40 |
| :--- | :--- | :--- |
|  | Affects | none |

Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| XOR AC1 << \#30, AC0 | The content of AC0 is XORed with the content of AC1 logically shifted left by <br> 30 bits and the result is stored in AC0. |

## Bitwise Exclusive OR (XOR)

## Syntax Characteristics

| No. | Syntax | Parallel <br> Enable Bit | Size | Cycles | Pipeline |
| :---: | :--- | :---: | :---: | :---: | :---: |
| $[6]$ | XOR k16 <<\#16, [ACx,] ACy | No | 4 | 1 | X |


Operands ACx, ACy, k16

Description This instruction performs a bitwise exclusive-OR (XOR) operation between an accumulator (ACx) content and a 16-bit unsigned constant, k16, shifted left by 16 bits:

ACy = ACx ^ (k16 <<< \#16)

- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are zero extended to 40 bits.
- The input operand (k16) is shifted 16 bits to the MSBs.
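
The three points above amount to a single masked XOR, sketched here with a hypothetical function name. The accumulator value in the usage note is arbitrary.

```python
MASK40 = (1 << 40) - 1


def xor_k16_shifted(acx, k16):
    """Model ACy = ACx ^ (k16 <<< #16) on 40 bits."""
    # k16 is unsigned and zero extended, so only bits 31-16 of ACx can change.
    return (acx & MASK40) ^ ((k16 & 0xFFFF) << 16)
```

With k16 = FFFFh, the effect is to invert bits 31-16 of the accumulator: 7E 2355 4FC0h becomes 7E DCAA 4FC0h, with the guard bits and the low word unchanged.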

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.

## Example

| Syntax | Description |
| :--- | :--- |
| XOR \#FFFFh <<\#16, AC1, AC0 | The content of AC1 is XORed with the unsigned 16-bit value (FFFFh) <br> logically shifted left by 16 bits and the result is stored in AC0. |

## Bitwise Exclusive OR (XOR)

## Syntax Characteristics



## Bitwise Exclusive OR (XOR)

## Syntax Characteristics



# Instruction Opcodes in Sequential Order 

This chapter lists the opcodes, in sequential order, for each TMS320C55x™ DSP instruction syntax.

Topic

- 6.1 Instruction Set Opcodes
- 6.2 Instruction Set Opcode Symbols and Abbreviations

### 6.1 Instruction Set Opcodes

Table 6-1 lists the opcodes of the instruction set. See Table 6-2 for a list of the symbols and abbreviations used in the instruction set opcode. See Table 1-1 and Table 1-2 for a list of the terms, symbols, and abbreviations used in the mnemonic syntax.
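
As an illustration of how the opcode patterns in Table 6-1 can be read, the sketch below converts a pattern string into a mask and value pair, treating `0` and `1` as fixed bits and every other symbol as a variable field. The helper names are hypothetical, and the opcode is assumed to be supplied as an integer with its first byte in the most significant position.

```python
def pattern_to_mask_value(pattern):
    """Convert a pattern such as '0010110E FSSSFDDD' to (mask, value)."""
    mask = value = 0
    for symbol in pattern.replace(" ", ""):
        mask <<= 1
        value <<= 1
        if symbol in "01":              # fixed opcode bit
            mask |= 1
            value |= int(symbol)
    return mask, value


def matches(pattern, word):
    """Return True if the fixed bits of the pattern agree with the opcode word."""
    mask, value = pattern_to_mask_value(pattern)
    return word & mask == value
```

For example, the pattern `0010110E FSSSFDDD` (XOR src, dst) produces the mask FE00h and the value 2C00h, so the word 2C01h matches it but does not match `0010101E FSSSFDDD` (OR src, dst).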

Table 6-1. Instruction Set Opcodes

| Opcode | Mnemonic syntax |
| :---: | :---: |
| 0000000E xCCCCCCC kkkkkkkk | RPTCC k8, cond |
| 0000001E xCCCCCCC xxxxxxxx | RETCC cond |
| 0000010E xCCCCCCC LLLLLLLL | BCC L8, cond |
| 0000011E LLLLLLLL LLLLLLLL | B L16 |
| 0000100E LLLLLLLL LLLLLLLL | CALL L16 |
| 0000110E kkkkkkkk kkkkkkkk | RPT k16 |
| 0000111E llllllll llllllll | RPTB pmad |
| 0001000E DDSS0000 xxSHIFTW | AND ACx << \#SHIFTW[, ACy] |
| 0001000E DDSS0001 xxSHIFTW | OR ACx << \#SHIFTW[, ACy] |
| 0001000E DDSS0010 xxSHIFTW | XOR ACx << \#SHIFTW[, ACy] |
| 0001000E DDSS0011 xxSHIFTW | ADD ACx << \#SHIFTW, ACy |
| 0001000E DDSS0100 xxSHIFTW | SUB ACx << \#SHIFTW, ACy |
| 0001000E DDSS0101 xxSHIFTW | SFTS ACx, \#SHIFTW[, ACy] |
| 0001000E DDSS0110 xxSHIFTW | SFTSC ACx, \#SHIFTW[, ACy] |
| 0001000E DDSS0111 xxSHIFTW | SFTL ACx, \#SHIFTW[, ACy] |
| 0001000E xxSS1000 xxddxxxx | EXP ACx, Tx |
| 0001000E DDSS1001 xxddxxxx | MANT ACx, ACy <br> :: NEXP ACx, Tx |
| 0001000E xxSS1010 SSddxxxt | BCNT ACx, ACy, TCx, Tx |
| 0001000E DDSS1100 SSDDnnnn | MAXDIFF ACx, ACy, ACz, ACw |
| 0001000E DDSS1101 SSDDxxxr | DMAXDIFF ACx, ACy, ACz, ACw, TRNx |
| 0001000E DDSS1110 SSDDxxxx | MINDIFF ACx, ACy, ACz, ACw |
| 0001000E DDSS1111 SSDDxxxr | DMINDIFF ACx, ACy, ACz, ACw, TRNx |
| 0001001E FSSScc00 FDDDxuxt | CMP[U] src RELOP dst, TCx |
| 0001001E FSSScc01 FDDD0utt | CMPAND[U] src RELOP dst, TCy, TCx |
| 0001001E FSSScc01 FDDD1utt | CMPAND[U] src RELOP dst, !TCy, TCx |
| 0001001E FSSScc10 FDDD0utt | CMPOR[U] src RELOP dst, TCy, TCx |
| 0001001E FSSScc10 FDDD1utt | CMPOR[U] src RELOP dst, !TCy, TCx |
| 0001001 E FSSSxx11 FDDD0xvv | ROL BitOut, src, Bitln, dst |
| 0001001 E FSSSxx11 FDDD1xvv | ROR Bitln, src, BitOut, dst |
| 0001010E FSSSxxxx FDDD0000 | AADD TAx, TAy |
| 0001010E FSSSxxxx FDDD0001 | AMOV TAx, TAy |
| 0001010 E FSSSxxxx FDDD0010 | ASUB TAx, TAy |
| 0001010 E PPPPPPPP FDDD0100 | AADD P8, TAx |
| 0001010 E PPPPPPPP FDDD0101 | AMOV P8, TAx |
| 0001010 E PPPPPPPP FDDD0110 | ASUB P8, TAx |
| 0001010 E FSSSxxxx FDDD1000 | AADD TAx, TAy |
| 0001010 E FSSSxxxx FDDD1001 | AMOV TAx, TAy |
| 0001010 E FSSSxxxx FDDD1010 | ASUB TAx, TAy |
| 0001010 E PPPPPPPP FDDD1100 | AADD P8, TAx |
| 0001010 E PPPPPPPP FDDD1101 | AMOV P8, TAx |
| 0001010 E PPPPPPPP FDDD1110 | ASUB P8, TAx |
| 0001010E XACS0001 XACD0000 <br> (Note: for DAG_X) | AADD XACsrc, XACdst |
| 0001010E XACS0001 XACD0001 <br> (Note: for DAG_X) | AMOV XACsrc, XACdst |
| 0001010E XACS0001 XACD0010 <br> (Note: for DAG_X) | ASUB XACsrc, XACdst |
| 0001010E XACS0001 XACD1000 <br> (Note: for DAG_Y) | AADD XACsrc, XACdst |
| 0001010E XACS0001 XACD1001 <br> (Note: for DAG_Y) | AMOV XACsrc, XACdst |
| 0001010E XACS0001 XACD1010 <br> (Note: for DAG_Y) | ASUB XACsrc, XACdst |
| 0001011 E xxxxxkkk kkkk0000 | MOV k7, DPH |
| 0001011E xxxkkkkk kkkk0011 | MOV k9, PDP |
| 0001011 E kkkkkkkk kkkk0100 | MOV k12, BK03 |
| 0001011 E kkkkkkkk kkkk0101 | MOV k12, BK47 |
| 0001011 E kkkkkkkk kkkk0110 | MOV k12, BKC |
| 0001011 E kkkkkkkk kkkk1000 | MOV k12, CSR |
| 0001011 E kkkkkkkk kkkk1001 | MOV k12, BRC0 |
| 0001011 E kkkkkkkk kkkk1010 | MOV k12, BRC1 |
| 0001100E kkkkkkkk FDDDFSSS | AND k8, src, dst |
| 0001101 E kkkkkkkk FDDDFSSS | OR k8, src, dst |
| 0001110E kkkkkkkk FDDDFSSS | XOR k8, src, dst |
| 0001111 E KKKKKKKK SSDDxx0\% | MPYK[R] K8, [ACx,] ACy |
| 0001111 E KKKKKKKK SSDDss1\% | MACK[R] Tx, K8, [ACx,] ACy |
| 0010000E | NOP |
| 0010001 E FSSSFDDD | MOV src, dst |
| 0010010 E FSSSFDDD | ADD [src,] dst |
| 0010011 E FSSSFDDD | SUB [src,] dst |
| 0010100E FSSSFDDD | AND src, dst |
| 0010101 E FSSSFDDD | OR src, dst |
| 0010110E FSSSFDDD | XOR src, dst |
| 0010111 E FSSSFDDD | MAX [src,] dst |
| 0011000E FSSSFDDD | MIN [src,] dst |
| 0011001 E FSSSFDDD | ABS [src,] dst |
| 0011010 E FSSSFDDD | NEG [src,] dst |
| 0011011 E FSSSFDDD | NOT [src,] dst |
| 0011100E FSSSFDDD <br> (Note: FSSS = src1, FDDD = src2) | PSH src1, src2 |
| 0011101E FSSSFDDD <br> (Note: FSSS = dst1, FDDD = dst2) | POP dst1, dst2 |
| 0011110 E kkkkFDDD | MOV k4, dst |
| 0011111 E kkkkFDDD | MOV -k4, dst |
| 0100000E kkkkFDDD | ADD k4, dst |
| 0100010111110010 | .LK |
| 0100001E kkkkFDDD | SUB k4, dst |
| 0100010E 00SSFDDD | MOV HI(ACx), TAx |
| 0100010E 01x0FDDD | SFTS dst, \#-1 |
| 0100010 E 01x1FDDD | SFTS dst, \#1 |
| 0100010E 1000FDDD | MOV SP, TAx |
| 0100010E 1001FDDD | MOV SSP, TAx |
| 0100010E 1010FDDD | MOV CDP, TAx |
| 0100010E 1100FDDD | MOV BRC0, TAx |
| 0100010E 1101FDDD | MOV BRC1, TAx |
| 0100010E 1110FDDD | MOV RPTC, TAx |
| 0100011 E kkkk0000 | BCLR k4, ST0_55 |
| 0100011 E kkkk0001 | BSET k4, ST0_55 |
| 0100011 E kkkk0010 | BCLR k4, ST1_55 |
| 0100011 E kkkk0011 | BSET k4, ST1_55 |
| 0100011 E kkkk0100 | BCLR k4, ST2_55 |
| 0100011 E kkkk0101 | BSET k4, ST2_55 |
| 0100011 E kkkk0110 | BCLR k4, ST3_55 |
| 0100011 E kkkk0111 | BSET k4, ST3_55 |
| 0100100E xxxxx000 | RPT CSR |
| 0100100E FSSSx001 | RPTADD CSR, TAx |
| 0100100E kkkkx010 | RPTADD CSR, k4 |
| 0100100E kkkkx011 | RPTSUB CSR, k4 |
| 0100100E xxxxx100 | RET |
| 01001000 xxxxx100 | RETI |
| 0100101E 0LLLLLLL | B L7 |
| 0100101E 1lllllll | RPTBLOCAL pmad |
| 0100110 E kkkkkkkk | RPT k8 |
| 0100111 E KKKKKKKK | AADD K8,SP |
| 0101000E FDDDx000 | SFTL dst, \#1 |
| 0101000E FDDDx001 | SFTL dst, \#-1 |
| 0101000E FDDDx010 | POP dst |
| 0101000E xxDDx011 | POP dbl(ACx) |
| 0101000 E FSSSx110 | PSH src |
| 0101000E xxSSx111 | PSH dbl(ACx) |
| 0101000E XDDD0100 | POPBOTH xdst |
| 0101000E XSSS0101 | PSHBOTH xsrc |
| 0101001E FSSS00DD | MOV TAx, HI(ACx) |
| 0101001E FSSS1000 | MOV TAx, SP |
| 0101001E FSSS1001 | MOV TAx, SSP |
| 0101001E FSSS1010 | MOV TAx, CDP |
| 0101001E FSSS1100 | MOV TAx, CSR |
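
The eight `BCLR k4, STn_55` / `BSET k4, STn_55` rows above follow one layout, `0100011E kkkk0rrs`, where `rr` is the status register number and `s` is 1 for BSET and 0 for BCLR. The Python sketch below packs those fields into a 16-bit value. The function name is hypothetical, and it assumes the leftmost bit printed in the table is the most significant bit of the word; it says nothing about how the bytes are laid out in program memory.

```python
def encode_status_bit_op(set_bit: bool, k4: int, st: int, e: int = 0) -> int:
    """Pack BCLR/BSET k4, STn_55 as 0100011E kkkk0rrs (hypothetical helper)."""
    if not 0 <= k4 <= 15:
        raise ValueError("k4 must fit in 4 bits")
    if st not in (0, 1, 2, 3):
        raise ValueError("status register must be ST0_55..ST3_55")
    if e not in (0, 1):
        raise ValueError("E is a single bit")
    word = 0b0100011 << 9        # fixed 7-bit opcode prefix
    word |= e << 8               # parallel enable bit E
    word |= k4 << 4              # bit number inside the status register
    word |= st << 1              # rr: which STn_55 register
    word |= 1 if set_bit else 0  # s: 1 = BSET, 0 = BCLR
    return word


print(f"{encode_status_bit_op(True, 11, 1):016b}")
```

For example, `encode_status_bit_op(True, 11, 1)` produces the bit pattern listed for `BSET k4, ST1_55` with `kkkk = 1011` and `E = 0`.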

Table 6-1. Instruction Set Opcodes (Continued)

| Opcode | Mnemonic syntax |
| :---: | :---: |
| 0101001 E FSSS1101 | MOV TAx, BRC1 |
| 0101001E FSSS1110 | MOV TAx, BRC0 |
| 0101010E DDSS000\% | ADD[R]V [ACx,] ACy |
| 0101010E DDSS001\% | SQA[R] [ACx, ] ACy |
| 0101010E DDSS010\% | SQS[R] [ACx, ] ACy |
| 0101010E DDSS011\% | MPY[R] [ACx, ] ACy |
| 0101010E DDSS100\% | SQR[R] [ACx, ] ACy |
| 0101010E DDSS101\% | ROUND [ACx,] ACy |
| 0101010E DDSS110\% | SAT[R] [ACx, ] ACy |
| 0101011E DDSSss0\% | MAC[R] ACx, Tx, ACy[, ACy] |
| 0101011E DDSSss1\% | MAS[R] Tx, [ACx, ] ACy |
| 0101100E DDSSss0\% | MPY[R] Tx, [ACx,] ACy |
| 0101100E DDSSss1\% | MAC[R] ACy, Tx, ACx, ACy |
| 0101101E DDSSss00 | ADD ACx << Tx, ACy |
| 0101101E DDSSss01 | SUB ACx << Tx, ACy |
| 0101101E DDxxxx1t | SFTCC ACx, TCx |
| 0101110E DDSSss00 | SFTL ACx, Tx[, ACy] |
| 0101110E DDSSss01 | SFTS ACx, Tx[, ACy] |
| 0101110E DDSSss10 | SFTSC ACx, Tx[, ACy] |
| 0101111E 00kkkkkk | SWAP () |
| 01100lll lCCCCCCC | BCC l4, cond |
| 01101000 xCCCCCCC PPPPPPPP PPPPPPPP PPPPPPPP | BCC P24, cond |
| 01101001 xCCCCCCC PPPPPPPP PPPPPPPP PPPPPPPP | CALLCC P24, cond |
| 01101010 PPPPPPPP PPPPPPPP PPPPPPPP | B P24 |
| 01101100 PPPPPPPP PPPPPPPP PPPPPPPP | CALL P24 |
| 01101101 xCCCCCCC LLLLLLLL LLLLLLLL | BCC L16, cond |
| 01101110 xCCCCCCC LLLLLLLL LLLLLLLL | CALLCC L16, cond |
| 01101111 FSSSccxu KKKKKKKK LLLLLLLL | BCC[U] L8, src RELOP K8 |
| 01110000 KKKKKKKK KKKKKKKK SSDDSHFT | ADD K16 << \#SHFT, [ACx,] ACy |
| 01110001 KKKKKKKK KKKKKKKK SSDDSHFT | SUB K16 << \#SHFT, [ACx,] ACy |
| 01110010 kkkkkkkk kkkkkkkk SSDDSHFT | AND k16 << \#SHFT, [ACx,] ACy |
| 01110011 kkkkkkkk kkkkkkkk SSDDSHFT | OR k16 << \#SHFT, [ACx,] ACy |
| 01110100 kkkkkkkk kkkkkkkk SSDDSHFT | XOR k16 << \#SHFT, [ACx, ] ACy |
| 01110101 KKKKKKKK KKKKKKKK xxDDSHFT | MOV K16 << \#SHFT, ACx |
| 01110110 kkkkkkkk kkkkkkkk FDDD00SS | BFXTR k16, ACx, dst |
| 01110110 kkkkkkkk kkkkkkkk FDDD01SS | BFXPA k16, ACx, dst |
| 01110110 KKKKKKKK KKKKKKKK FDDD10xx | MOV K16, dst |
| 01110111 DDDDDDDD DDDDDDDD FDDDxxxx | AMOV D16, TAx |
| 01111000 kkkkkkkk kkkkkkkk xxx0000x | MOV k16, DP |
| 01111000 kkkkkkkk kkkkkkkk xxx0001x | MOV k16, SSP |
| 01111000 kkkkkkkk kkkkkkkk xxx0010x | MOV k16, CDP |
| 01111000 kkkkkkkk kkkkkkkk xxx0011x | MOV k16, BSA01 |
| 01111000 kkkkkkkk kkkkkkkk xxx0100x | MOV k16, BSA23 |
| 01111000 kkkkkkkk kkkkkkkk xxx0101x | MOV k16, BSA45 |
| 01111000 kkkkkkkk kkkkkkkk xxx0110x | MOV k16, BSA67 |
| 01111000 kkkkkkkk kkkkkkkk xxx0111x | MOV k16, BSAC |
| 01111000 kkkkkkkk kkkkkkkk xxx1000x | MOV k16, SP |
| 01111001 KKKKKKKK KKKKKKKK SSDDxx0\% | MPYK[R] K16, [ACx,] ACy |
| 01111001 KKKKKKKK KKKKKKKK SSDDss1\% | MACK[R] Tx, K16, [ACx,] ACy |
| 01111010 KKKKKKKK KKKKKKKK SSDD000x | ADD K16 <<\#16, [ACx,] ACy |
| 01111010 KKKKKKKK KKKKKKKK SSDD001x | SUB K16 <<\#16, [ACx,] ACy |
| 01111010 kkkkkkkk kkkkkkkk SSDD010x | AND k16 <<\#16, [ACx,] ACy |
| 01111010 kkkkkkkk kkkkkkkk SSDD011x | OR k16 <<\#16, [ACx,] ACy |
| 01111010 kkkkkkkk kkkkkkkk SSDD100x | XOR k16 <<\#16, [ACx,] ACy |
| 01111010 KKKKKKKK KKKKKKKK xxDD101x | MOV K16 <<\#16, ACx |
| 01111010 xxxxxxxx xxxxxxxx xxxx110x | IDLE |
| 01111011 KKKKKKKK KKKKKKKK FDDDFSSS | ADD K16, [src,] dst |
| 01111100 KKKKKKKK KKKKKKKK FDDDFSSS | SUB K16, [src,] dst |
| 01111101 kkkkkkkk kkkkkkkk FDDDFSSS | AND k16, src, dst |
| 01111110 kkkkkkkk kkkkkkkk FDDDFSSS | OR k16, src, dst |
| 01111111 kkkkkkkk kkkkkkkk FDDDFSSS | XOR k16, src, dst |
| 10000000 XXXMMMYY YMMM00xx | MOV dbl(Xmem), dbl(Ymem) |
| 10000000 XXXMMMYY YMMM01xx | MOV Xmem, Ymem |
| 10000000 XXXMMMYY YMMM10SS | MOV ACx, Xmem, Ymem |

Table 6-1. Instruction Set Opcodes (Continued)

| Opcode |  |  |  | Mnemonic syntax |
| :---: | :---: | :---: | :---: | :---: |
| 10000001 | XXXMMMYY | YMMM00DD |  | ADD Xmem, Ymem, ACx |
| 10000001 | XXXMMMYY | YMMM01DD |  | SUB Xmem, Ymem, ACx |
| 10000001 | XXXMMMYY | YMMM10DD |  | MOV Xmem, Ymem, ACx |
| 10000010 | XXXMMMYY | YMMM00mm | uuDDDDg\% | MPY[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy |
| 10000010 | XXXMMMYY | YMMM01mm | uuDDDDg\% | MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy |
| 10000010 | XXXMMMYY | YMMM10mm | uuDDDDg\% | MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy |
| 10000010 | XXXMMMYY | YMMM11mm | uuxxDDg\% | AMAR Xmem <br> :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx |
| 10000011 | XXXMMMYY | YMMM00mm | uuDDDDg\% | MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy |
| 10000011 | XXXMMMYY | YMMM01mm | uuDDDDg\% | MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy |
| 10000011 | XXXMMMYY | YMMM10mm | uuDDDDg\% | MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx >> \#16 <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy |
| 10000011 | XXXMMMYY | YMMM11mm | uuxxDDg\% | AMAR Xmem <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx |
| 10000100 | XXXMMMYY | YMMM00mm | uuDDDDg\% | MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16 |
| 10000100 | XXXMMMYY | YMMM01mm | uuxxDDg\% | AMAR Xmem <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx >> \#16 |
| 10000100 | XXXMMMYY | YMMM10mm | uuDDDDg\% | MPY[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16 |
| 10000100 | XXXMMMYY | YMMM11mm | uuDDDDg\% | MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx >> \#16 <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16 |
| 10000101 | XXXMMMYY | YMMM00mm | uuxxDDg\% | AMAR Xmem <br> :: MAS[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx |
| 10000101 | XXXMMMYY | YMMM01mm | uuDDDDg\% | MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAS[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy |
| 10000101 | XXXMMMYY | YMMM10mm | xxxxxxxx | AMAR Xmem, Ymem, Cmem |
| 10000101 | XXXMMMYY | YMMM11mm | DDx0DDU\% | FIRSADD Xmem, Ymem, Cmem, ACx, ACy |
| 10000101 | XXXMMMYY | YMMM11mm | DDx1DDU\% | FIRSSUB Xmem, Ymem, Cmem, ACx, ACy |
| 10000110 | XXXMMMYY | YMMMxxDD | 000guuU\% | MPYM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], ACx |
| 10000110 | XXXMMMYY | YMMMSSDD | 001guuU\% | MACM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], [ACx,] ACy |
| 10000110 | XXXMMMYY | YMMMSSDD | 010guuU\% | MACM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], ACx >> \#16[, ACy] |
| 10000110 | XXXMMMYY | YMMMSSDD | 011guuU\% | MASM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], [ACx,] ACy |
| 10000110 | XXXMMMYY | YMMMDDDD | 100xssU\% | MASM[R] [T3 = ]Xmem, Tx, ACx <br> :: MOV Ymem << \#16, ACy |
| 10000110 | XXXMMMYY | YMMMDDDD | 101xssU\% | MACM[R] [T3 = ]Xmem, Tx, ACx <br> :: MOV Ymem << \#16, ACy |
| 10000110 | XXXMMMYY | YMMMDDDD | 110xxxx\% | LMS Xmem, Ymem, ACx, ACy |
| 10000110 | XXXMMMYY | YMMMDDDD | 1110xxn\% | SQDST Xmem, Ymem, ACx, ACy |
| 10000110 | XXXMMMYY | YMMMDDDD | 1111xxn\% | ABDST Xmem, Ymem, ACx, ACy |
| 10000111 | XXXMMMYY | YMMMSSDD | 000xssU\% | MPYM[R] [T3 = ]Xmem, Tx, ACy :: MOV HI(ACx << T2), Ymem |
| 10000111 | XXXMMMYY | YMMMSSDD | 001xssU\% | MACM[R] [T3 = ]Xmem, Tx, ACy <br> :: MOV HI(ACx << T2), Ymem |
| 10000111 | XXXMMMYY | YMMMSSDD | 010xssU\% | MASM[R] [T3 = ]Xmem, Tx, ACy :: MOV HI(ACx << T2), Ymem |
| 10000111 | XXXMMMYY | YMMMSSDD | 01100001 | LMSF Xmem, Ymem, ACx, ACy |
| 10000111 | XXXMMMYY | YMMMSSDD | 100xxxxx | ADD Xmem << \#16, ACx, ACy <br> :: MOV HI(ACy << T2), Ymem |
| 10000111 | XXXMMMYY | YMMMSSDD | 101xxxxx | SUB Xmem << \#16, ACx, ACy <br> :: MOV HI(ACy << T2), Ymem |
| 10000111 | XXXMMMYY | YMMMSSDD | 110xxxxx | MOV Xmem <<\#16, ACy <br> :: MOV HI(ACx << T2), Ymem |
| 10010000 | XSSSXDDD |  |  | MOV xsrc, xdst |
| 10010001 | xxxxxxSS |  |  | B ACx |
| 10010010 | XXXMMMYY | YMMM00mm | uuDDDDg\% | MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, <br> :: MPY[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx |
| 10010010 | XXXMMMYY | YMMM01mm | uuDDDDg\% | MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, <br> :: MAC[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx |
| 10010010 | XXXMMMYY | YMMM1 0mm | uuDDDDg\% | MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, :: MAS[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx |
| 10010010 | xxxxxxSS |  |  | CALL ACx |

Table 6-1. Instruction Set Opcodes (Continued)

|  | Opcode | Mnemonic syntax |
| :---: | :---: | :---: |
| 10010011 | XXXMMMYY YMMM00mm uuDDDDg\% | MAC[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, <br> :: MAC[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx |
| 10010011 | XXXMMMYY YMMM01mm uuDDDDg\% | MAC[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, <br> :: MAS[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx |
| 10010011 | XXXMMMYY YMMM10mm uuDDDDg\% | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy, <br> :: MAC[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 |
| 10010011 | XXXMMMYY YMMM11mm uuDDDDg\% | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16, <br> :: MAC[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 |
| 10010100 | XXXMMMYY YMMM00mm uuDDDDg\% | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16, <br> :: MAS[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx |
| 10010100 | XXXMMMYY YMMM10mm uuDDDDg\% | MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16, <br> :: MPY[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx |
| 10010100 | xxxxxxxx | RESET |
| 10010101 | 0xxkkkkk | INTR k5 |
| 10010101 | 1xxkkkkk | TRAP k5 |
| 10010101 | XXXMMMYY YMMMO1mm uuDDDDg\% | MAS[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy, :: MAS[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx |
| 10010110 | 0CCCCCCC | XCC [label, ]cond |
| 10010110 | 1CCCCCCC | XCCPART [label, ]cond |
| 10011000 |  | mmap |
| 10011001 |  | port(Smem) |
| 10011010 |  | port(Smem) |
| 10011100 |  | <instruction>.LR |
| 10011101 |  | <instruction>.CR |
| 10011110 | 0CCCCCCC | XCC [label, ]cond |
| 10011110 | 1CCCCCCC | XCCPART [label, ]cond |
| 10011111 | 0CCCCCCC | XCC [label, ]cond |
| 10011111 | 1CCCCCCC | XCCPART [label, ]cond |
| 1010FDDD | AAAAAAAI | MOV Smem, dst |
| 101100DD | AAAAAAAI | MOV Smem <<\#16, ACx |
| 10110100 | AAAAAAAI | AMAR Smem |

Table 6-1. Instruction Set Opcodes (Continued)

| Opcode | Mnemonic syntax |
| :---: | :---: |
| 10110101 AAAAAAAI | PSH Smem |
| 10110110 AAAAAAAI | DELAY Smem |
| 10110111 AAAAAAAI | PSH dbl(Lmem) |
| 10111000 AAAAAAAI | POP dbl(Lmem) |
| 10111011 AAAAAAAI | POP Smem |
| 101111SS AAAAAAAI | MOV HI(ACx), Smem |
| 1100FSSS AAAAAAAI | MOV src, Smem |
| 11010000 AAAAAAAI 0\%DD01mm | MPY[R] Smem, uns(Cmem), ACx |
| 11010000 AAAAAAAI 0\%DD10mm | MAC[R] Smem, uns(Cmem), ACx |
| 11010000 AAAAAAAI 0\%DD11mm | MAS[R] Smem, uns(Cmem), ACx |
| 11010000 AAAAAAAI U\%DDxxmm | MACM[R]Z [T3 = ]Smem, Cmem, ACx |
| 11010001 AAAAAAAI U\%DD00mm | MPYM[R] [T3 = ]Smem, Cmem, ACx |
| 11010001 AAAAAAAI U\%DD01mm | MACM[R] [T3 = ]Smem, Cmem, ACx |
| 11010001 AAAAAAAI U\%DD10mm | MASM[R] [T3 = ]Smem, Cmem, ACx |
| 11010010 AAAAAAAI U\%DD00SS | MACM[R] [T3 = ]Smem, [ACx,] ACy |
| 11010010 AAAAAAAI U\%DD01SS | MASM[R] [T3 = ]Smem, [ACx,] ACy |
| 11010010 AAAAAAAI U\%DD10SS | SQAM[R] [T3 = ]Smem, [ACx,] ACy |
| 11010010 AAAAAAAI U\%DD11SS | SQSM[R] [T3 = ]Smem, [ACx,] ACy |
| 11010011 AAAAAAAI U\%DD00SS | MPYM[R] [T3 = ]Smem, [ACx,] ACy |
| 11010011 AAAAAAAI U\%DD10xx | SQRM[R] [T3 = ]Smem, ACx |
| 11010011 AAAAAAAI U\%DDu1ss | MPYM[R][U] [T3 = ]Smem, Tx, ACx |
| 11010100 AAAAAAAI U\%DDssSS | MACM[R] [T3 = ]Smem, Tx, [ACx, ] ACy |
| 11010101 AAAAAAAI U\%DDssSS | MASM[R] [T3 = ]Smem, Tx, [ACx, ] ACy |
| 11010110 AAAAAAAI FDDDFSSS | ADD Smem, [src,] dst |
| 11010111 AAAAAAAI FDDDFSSS | SUB Smem, [src,] dst |
| 11011000 AAAAAAAI FDDDFSSS | SUB src, Smem, dst |
| 11011001 AAAAAAAI FDDDFSSS | AND Smem, src, dst |
| 11011010 AAAAAAAI FDDDFSSS | OR Smem, src, dst |
| 11011011 AAAAAAAI FDDDFSSS | XOR Smem, src, dst |
| 11011100 AAAAAAAI kkkkxx00 | BTST k4, Smem, TC1 |
| 11011100 AAAAAAAI kkkkxx01 | BTST k4, Smem, TC2 |
| 11011100 AAAAAAAI 0000xx10 | MOV Smem, DP |
| 11011100 AAAAAAAI 0001xx10 | MOV Smem, CDP |
| 11011100 AAAAAAAI 0010xx10 | MOV Smem, BSA01 |
| 11011100 AAAAAAAI 0011xx10 | MOV Smem, BSA23 |
| 11011100 AAAAAAAI 0100xx10 | MOV Smem, BSA45 |
| 11011100 AAAAAAAI 0101xx10 | MOV Smem, BSA67 |
| 11011100 AAAAAAAI 0110xx10 | MOV Smem, BSAC |
| 11011100 AAAAAAAI 0111xx10 | MOV Smem, SP |
| 11011100 AAAAAAAI 1000xx10 | MOV Smem, SSP |
| 11011100 AAAAAAAI 1001xx10 | MOV Smem, BK03 |
| 11011100 AAAAAAAI 1010xx10 | MOV Smem, BK47 |
| 11011100 AAAAAAAI 1011xx10 | MOV Smem, BKC |
| 11011100 AAAAAAAI 1100xx10 | MOV Smem, DPH |
| 11011100 AAAAAAAI 1111xx10 | MOV Smem, PDP |
| 11011100 AAAAAAAI x000xx11 | MOV Smem, CSR |
| 11011100 AAAAAAAI x001xx11 | MOV Smem, BRC0 |
| 11011100 AAAAAAAI x010xx11 | MOV Smem, BRC1 |
| 11011100 AAAAAAAI x011xx11 | MOV Smem, TRN0 |
| 11011100 AAAAAAAI x100xx11 | MOV Smem, TRN1 |
| 11011101 AAAAAAAI SSDDss00 | ADD Smem << Tx, [ACx,] ACy |
| 11011101 AAAAAAAI SSDDss01 | SUB Smem << Tx, [ACx,] ACy |
| 11011101 AAAAAAAI SSDDss10 | ADDSUB2CC Smem, ACx, Tx, TC1, TC2, ACy |
| 11011101 AAAAAAAI x\%DDss11 | MOV [rnd(]Smem << Tx[)], ACx |
| 11011110 AAAAAAAI SSDD0000 | ADDSUBCC Smem, ACx, TC1, ACy |
| 11011110 AAAAAAAI SSDD0001 | ADDSUBCC Smem, ACx, TC2, ACy |
| 11011110 AAAAAAAI SSDD0010 | ADDSUBCC Smem, ACx, TC1, TC2, ACy |
| 11011110 AAAAAAAI SSDD0011 | SUBC Smem, [ACx,] ACy |
| 11011110 AAAAAAAI SSDD0100 | ADD Smem <<\#16, [ACx, ] ACy |
| 11011110 AAAAAAAI SSDD0101 | SUB Smem <<\#16, [ACx,] ACy |
| 11011110 AAAAAAAI SSDD0110 | SUB ACx, Smem <<\#16, ACy |
| 11011110 AAAAAAAI ssDD1000 | ADDSUB Tx, Smem, ACx |
| 11011110 AAAAAAAI ssDD1001 | SUBADD Tx, Smem, ACx |
| 11011111 AAAAAAAI FDDD000u | MOV [uns(]high_byte(Smem)[)], dst |
| 11011111 AAAAAAAI FDDD001u | MOV [uns(]low_byte(Smem)[)], dst |
| 11011111 AAAAAAAI xxDD010u | MOV [uns(]Smem[)], ACx |
| 11011111 AAAAAAAI SSDD100u | ADD [uns(]Smem[)], CARRY, [ACx,] ACy |
| 11011111 AAAAAAAI SSDD101u | SUB [uns(]Smem[)], BORROW, [ACx,] ACy |
| 11011111 AAAAAAAI SSDD110u | ADD [uns(]Smem[)], [ACx,] ACy |
| 11011111 AAAAAAAI SSDD111u | SUB [uns(]Smem[)], [ACx,] ACy |
| 11100000 AAAAAAAI FSSSxxxt | BTST src, Smem, TCx |
| 11100001 AAAAAAAI DDSHIFTW | MOV low_byte(Smem) << \#SHIFTW, ACx |
| 11100010 AAAAAAAI DDSHIFTW | MOV high_byte(Smem) << \#SHIFTW, ACx |
| 11100011 AAAAAAAI kkkk000x | BTSTSET k4, Smem, TC1 |
| 11100011 AAAAAAAI kkkk001x | BTSTSET k4, Smem, TC2 |
| 11100011 AAAAAAAI kkkk010x | BTSTCLR k4, Smem, TC1 |
| 11100011 AAAAAAAI kkkk011x | BTSTCLR k4, Smem, TC2 |
| 11100011 AAAAAAAI kkkk100x | BTSTNOT k4, Smem, TC1 |
| 11100011 AAAAAAAI kkkk101x | BTSTNOT k4, Smem, TC2 |
| 11100011 AAAAAAAI FSSS1100 | BSET src, Smem |
| 11100011 AAAAAAAI FSSS1101 | BCLR src, Smem |
| 11100011 AAAAAAAI FSSS111x | BNOT src, Smem |
| 11100100 AAAAAAAI FSSSx0xx | PSH src,Smem |
| 11100100 AAAAAAAI FDDDx1xx | POP dst, Smem |
| 11100101 AAAAAAAI FSSS01x0 | MOV src, high_byte(Smem) |
| 11100101 AAAAAAAI FSSS01x1 | MOV src, low_byte(Smem) |
| 11100101 AAAAAAAI 000010xx | MOV DP, Smem |
| 11100101 AAAAAAAI 000110xx | MOV CDP, Smem |
| 11100101 AAAAAAAI 001010xx | MOV BSA01, Smem |
| 11100101 AAAAAAAI 001110xx | MOV BSA23, Smem |
| 11100101 AAAAAAAI 010010xx | MOV BSA45, Smem |
| 11100101 AAAAAAAI 010110xx | MOV BSA67, Smem |
| 11100101 AAAAAAAI 011010xx | MOV BSAC, Smem |
| 11100101 AAAAAAAI 011110xx | MOV SP, Smem |
| 11100101 AAAAAAAI 100010xx | MOV SSP, Smem |
| 11100101 AAAAAAAI 100110xx | MOV BK03, Smem |
| 11100101 AAAAAAAI 101010xx | MOV BK47, Smem |
| 11100101 AAAAAAAI 101110xx | MOV BKC, Smem |
| 11100101 AAAAAAAI 110010xx | MOV DPH, Smem |
| 11100101 AAAAAAAI 111110xx | MOV PDP, Smem |
| 11100101 AAAAAAAI x00011xx | MOV CSR, Smem |
| 11100101 AAAAAAAI x00111xx | MOV BRC0, Smem |
| 11100101 AAAAAAAI x01011xx | MOV BRC1, Smem |
| 11100101 AAAAAAAI x01111xx | MOV TRN0, Smem |
| 11100101 AAAAAAAI x10011xx | MOV TRN1, Smem |
| 11100110 AAAAAAAI KKKKKKKK | MOV K8, Smem |
| 11100111 AAAAAAAI SSss00xx | MOV ACx << Tx, Smem |
| 11100111 AAAAAAAI SSss10x\% | MOV [rnd(]HI(ACx << Tx)[)], Smem |
| 11100111 AAAAAAAI SSss11u\% | MOV [uns(] [rnd(]HI[(saturate](ACx << Tx)[)))], Smem |
| 11101000 AAAAAAAI SSxxx0x\% | MOV [rnd(]HI(ACx)[)], Smem |
| 11101000 AAAAAAAI SSxxx1u\% | MOV [uns(] [rnd(]HI[(saturate](ACx)[)))], Smem |
| 11101001 AAAAAAAI SSSHIFTW | MOV ACx << \#SHIFTW, Smem |
| 11101010 AAAAAAAI SSSHIFTW | MOV HI(ACx << \#SHIFTW), Smem |
| 11101011 AAAAAAAI xxxx01xx | MOV RETA, dbl(Lmem) |
| 11101011 AAAAAAAI xxSS10x0 | MOV ACx, dbl(Lmem) |
| 11101011 AAAAAAAI xxSS10u1 | MOV [uns(]saturate(ACx)[)], dbl(Lmem) |
| 11101011 AAAAAAAI FSSS1100 | MOV pair(TAx), dbl(Lmem) |
| 11101011 AAAAAAAI xxSS1101 | MOV ACx >> \#1, dual(Lmem) |
| 11101100 AAAAAAAI FSSS000x | BSET Baddr, src |
| 11101100 AAAAAAAI FSSS001x | BCLR Baddr, src |
| 11101100 AAAAAAAI FSSS010x | BTSTP Baddr, src |
| 11101100 AAAAAAAI FSSS011x | BNOT Baddr, src |
| 11101100 AAAAAAAI FSSS100t | BTST Baddr, src, TCx |
| 11101100 AAAAAAAI XDDD1110 | AMAR Smem, XAdst |
| 11101101 AAAAAAAI 00DD1010 | MOV dbl(Lmem), pair(HI(ACx)) |
| 11101101 AAAAAAAI 00DD1100 | MOV dbl(Lmem), pair(LO(ACx)) |
| 11101101 AAAAAAAI 00SS1110 | MOV pair(HI(ACx)), dbl(Lmem) |
| 11101101 AAAAAAAI 00SS1111 | MOV pair(LO(ACx)), dbl(Lmem) |
| 11101101 AAAAAAAI SSDD000n | ADD dbl(Lmem), [ACx,] ACy |
| 11101101 AAAAAAAI SSDD001n | SUB dbl(Lmem), [ACx,] ACy |
| 11101101 AAAAAAAI SSDD010x | SUB ACx, dbl(Lmem), ACy |
| 11101101 AAAAAAAI xxxx011x | MOV dbl(Lmem), RETA |
| 11101101 AAAAAAAI xxDD100g | MOV[40] dbl(Lmem), ACx |
| 11101101 AAAAAAAI FDDD111x | MOV dbl(Lmem), pair(TAx) |
| 11101101 AAAAAAAI XDDD1111 | MOV dbl(Lmem), XAdst |
| 11101101 AAAAAAAI XSSS0101 | MOV XAsrc, dbl(Lmem) |
| 11101110 AAAAAAAI SSDD000x | ADD dual(Lmem), [ACx, ] ACy |
| 11101110 AAAAAAAI SSDD001x | SUB dual(Lmem), [ACx,] ACy |
| 11101110 AAAAAAAI SSDD010x | SUB ACx, dual(Lmem), ACy |
| 11101110 AAAAAAAI ssDD011x | SUB dual(Lmem), Tx, ACx |
| 11101110 AAAAAAAI ssDD100x | ADD dual(Lmem), Tx, ACx |
| 11101110 AAAAAAAI ssDD101x | SUB Tx, dual(Lmem), ACx |
| 11101110 AAAAAAAI ssDD110x | ADDSUB Tx, dual(Lmem), ACx |
| 11101110 AAAAAAAI ssDD111x | SUBADD Tx, dual(Lmem), ACx |
| 11101111 AAAAAAAI xxxx00mm | MOV Cmem, Smem |
| 11101111 AAAAAAAI xxxx01mm | MOV Smem, Cmem |
| 11101111 AAAAAAAI xxxx10mm | MOV Cmem, dbl(Lmem) |
| 11101111 AAAAAAAI xxxx11mm | MOV dbl(Lmem), Cmem |
| 11110000 AAAAAAAI KKKKKKKK KKKKKKKK | CMP Smem == K16, TC1 |
| 11110001 AAAAAAAI KKKKKKKK KKKKKKKK | CMP Smem == K16, TC2 |
| 11110010 AAAAAAAI kkkkkkkk kkkkkkkk | BAND Smem, k16, TC1 |
| 11110011 AAAAAAAI kkkkkkkk kkkkkkkk | BAND Smem, k16, TC2 |
| 11110100 AAAAAAAI kkkkkkkk kkkkkkkk | AND k16, Smem |
| 11110101 AAAAAAAI kkkkkkkk kkkkkkkk | OR k16, Smem |
| 11110110 AAAAAAAI kkkkkkkk kkkkkkkk | XOR k16, Smem |
| 11110111 AAAAAAAI KKKKKKKK KKKKKKKK | ADD K16, Smem |
| 11111000 AAAAAAAI KKKKKKKK xxDDx0U\% | MPYMK[R] [T3 = ]Smem, K8, ACx |
| 11111000 AAAAAAAI KKKKKKKK SSDDx1U\% | MACMK[R] [T3 = ]Smem, K8, [ACx, ] ACy |
| 11111001 AAAAAAAI uxSHIFTW SSDD00xx | ADD [uns(]Smem[)] << \#SHIFTW, [ACx,] ACy |
| 11111001 AAAAAAAI uxSHIFTW SSDD01xx | SUB [uns(]Smem[)] << \#SHIFTW, [ACx,] ACy |
| 11111001 AAAAAAAI uxSHIFTW xxDD10xx | MOV [uns(]Smem[)] << \#SHIFTW, ACx |
| 11111010 AAAAAAAI xxSHIFTW SSxxx0x\% | MOV [rnd(]HI(ACx << \#SHIFTW)[)], Smem |
| 11111010 AAAAAAAI uxSHIFTW SSxxx1x\% | MOV [uns(] [rnd(]HI[(saturate](ACx << \#SHIFTW)[)))], Smem |
| 11111011 AAAAAAAI KKKKKKKK KKKKKKKK | MOV K16, Smem |

Table 6-1. Instruction Set Opcodes (Continued)

| Opcode |  |  |  | Mnemonic syntax |
| :---: | :---: | :---: | :---: | :---: |
| 11111100 | AAAAAAAI | LLLLLLLL | LLLLLLLL | BCC L16, ARn_mod != \#0 |
| 11111101 | AAAAAAAI | 000000mm | DDDDuug\% | MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 000001mm | DDDDuug\% | MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 000010mm | DDDDuug\% | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 000011mm | DDDDuug\% | MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 000100mm | DDDDuug\% | MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 000101mm | DDDDuug\% | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 000110mm | DDDDuug\% | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 000111mm | DDDDuug\% | MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 001000mm | DDDDuug\% | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx >> \#16 |
| 11111101 | AAAAAAAI | 001001mm | DDDDuug\% | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 001010mm | DDDDuug\% | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 001011mm | DDDDuug\% | MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx >> \#16 |
| 11111101 | AAAAAAAI | 001100mm | DDDDuug\% | MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 010000mm | DDDDuug\% | MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 010001mm | DDDDuug\% | MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 010010mm | DDDDuug\% | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 010011mm | DDDDuug\% | MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |

Table 6-1. Instruction Set Opcodes (Continued)

| Opcode |  |  | Mnemonic syntax |
| :---: | :---: | :---: | :---: |
| 11111101 | AAAAAAAI | 010100mm DDDDuug\% | MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 010101mm DDDDuug\% | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 010110mm DDDDuug\% | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 010111mm DDDDuug\% | MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 011000mm DDDDuug\% | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 |
| 11111101 | AAAAAAAI | 011001mm DDDDuug\% | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 011010mm DDDDuug\% | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |
| 11111101 | AAAAAAAI | 011011mm DDDDuug\% | MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 |
| 11111101 | AAAAAAAI | 011100mm DDDDuug\% | MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |
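
Each row of Table 6-1 can be read as a bit pattern: the digits `0` and `1` are fixed, `x` is a don't-care bit, and the letter groups are fields. As an illustration only, the Python sketch below checks a value against one pattern string and extracts the named fields. The helper name `match_opcode` and the caller-supplied list of field names are assumptions made for this example; it also assumes the leftmost character of the pattern corresponds to the most significant bit of `word`.

```python
def match_opcode(word, pattern, field_names):
    """Return {field: value} if `word` matches `pattern`, else None (illustrative helper)."""
    bits = pattern.replace(" ", "")
    width = len(bits)

    # Locate each named field, left to right, inside the pattern string.
    spans = {}
    cursor = 0
    for name in field_names:
        start = bits.index(name, cursor)
        spans[name] = (start, len(name))
        cursor = start + len(name)

    field_positions = {
        i for start, n in spans.values() for i in range(start, start + n)
    }

    # Every remaining character must be a fixed 0/1 bit or a don't-care 'x'.
    for i, ch in enumerate(bits):
        if i in field_positions or ch == "x":
            continue
        if ch not in "01":
            raise ValueError(f"unnamed field character {ch!r} in pattern")
        if ((word >> (width - 1 - i)) & 1) != int(ch):
            return None  # a fixed bit differs, so this row does not match

    return {
        name: (word >> (width - start - n)) & ((1 << n) - 1)
        for name, (start, n) in spans.items()
    }


fields = match_opcode(0x225A, "0010001E FSSSFDDD", ["E", "FSSS", "FDDD"])
print(fields)
```

Here `0x225A` matches the `MOV src, dst` row with `FSSS = 0101` and `FDDD = 1010`, while a value whose first seven bits are `0010010` (the `ADD [src,] dst` row) returns `None` for the same pattern.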

### 6.2 Instruction Set Opcode Symbols and Abbreviations

Table 6-2 lists the symbols and abbreviations used in the instruction set opcodes.

Table 6-2. Instruction Set Opcode Symbols and Abbreviations

| Bit Field Name | Bit Field Value | Bit Field Description |
| :---: | :---: | :---: |
| \% | 0 | Rounding is disabled |
|  | 1 | Rounding is enabled |
| AAAA AAAI |  | Smem addressing mode: |
|  | AAAA AAA0 | @dma, direct memory address (dma) direct access |
|  | AAAA AAA1 | Smem indirect memory access: |
|  | 00010001 | ABS16(\#k16) |
|  | 00110001 | *(\#k23) |
|  | 01010001 | port(\#k16) |
|  | 01110001 | *CDP |
|  | 10010001 | *CDP+ |
|  | 10110001 | *CDP- |
|  | 11010001 | *CDP(\#K16) |
|  | 11110001 | *+CDP(\#K16) |
|  | PPP0 0001 | *ARn |
|  | PPP0 0011 | *ARn+ |
|  | PPP0 0101 | *ARn- |
|  | PPP0 0111 | *(ARn + T0), when C54CM = 0 <br> *(ARn + AR0), when C54CM = 1 |
|  | PPP0 1001 | *(ARn - T0), when C54CM = 0 <br> *(ARn - AR0), when C54CM = 1 |
|  | PPP0 1011 | *ARn(T0), when C54CM = 0 <br> *ARn(AR0), when C54CM = 1 |
|  | PPP0 1101 | *ARn(\#K16) |
|  | PPP0 1111 | *+ARn(\#K16) |
|  | PPP1 0011 | *(ARn + T1), when ARMS = 0 <br> *ARn(short(\#1)), when ARMS = 1 |
|  | PPP1 0101 | *(ARn - T1), when ARMS = 0 <br> *ARn(short(\#2)), when ARMS = 1 |
|  | PPP1 0111 | *ARn(T1), when ARMS = 0 <br> *ARn(short(\#3)), when ARMS = 1 |
|  | PPP1 1001 | *+ARn, when ARMS = 0 <br> *ARn(short(\#4)), when ARMS = 1 |
|  | PPP1 1011 | *-ARn, when ARMS = 0 <br> *ARn(short(\#5)), when ARMS = 1 |
|  | PPP1 1101 | *(ARn + T0B), when ARMS = 0 <br> *ARn(short(\#6)), when ARMS = 1 |
|  | PPP1 1111 | *(ARn - T0B), when ARMS = 0 <br> *ARn(short(\#7)), when ARMS = 1 |
|  |  | PPP encodes an auxiliary register (ARn) as for XXX |
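
The `AAAA AAAI` rows above can be read as a two-level lookup: bit `I` selects direct or indirect access, and for indirect access the low five bits select the modifier while the top three bits select either a fixed form (when the low five bits are `1 0001`) or the register `PPP`. The Python sketch below is one way to express that reading. It is illustrative only and rests on several assumptions not stated in the table: that `C54CM = 0` and `ARMS = 0`, that the numeric value of `PPP` is the index n of ARn, and that the upper seven bits of a direct-access byte hold the dma offset. The function name is hypothetical.

```python
# Forms selected by the top three bits when the low five bits are 1 0001.
FIXED_FORMS = [
    "ABS16(#k16)", "*(#k23)", "port(#k16)", "*CDP",
    "*CDP+", "*CDP-", "*CDP(#K16)", "*+CDP(#K16)",
]

# ARn-relative forms, keyed by the low five bits (C54CM = 0, ARMS = 0 assumed).
ARN_FORMS = {
    0b00001: "*AR{n}",
    0b00011: "*AR{n}+",
    0b00101: "*AR{n}-",
    0b00111: "*(AR{n} + T0)",
    0b01001: "*(AR{n} - T0)",
    0b01011: "*AR{n}(T0)",
    0b01101: "*AR{n}(#K16)",
    0b01111: "*+AR{n}(#K16)",
    0b10011: "*(AR{n} + T1)",
    0b10101: "*(AR{n} - T1)",
    0b10111: "*AR{n}(T1)",
    0b11001: "*+AR{n}",
    0b11011: "*-AR{n}",
    0b11101: "*(AR{n} + T0B)",
    0b11111: "*(AR{n} - T0B)",
}


def decode_smem(byte: int) -> str:
    """Describe an AAAA AAAI addressing byte (illustrative, see assumptions above)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    if byte & 1 == 0:
        return f"@dma={byte >> 1}"     # I = 0: direct access
    low5 = byte & 0x1F
    top3 = byte >> 5
    if low5 == 0b10001:
        return FIXED_FORMS[top3]       # absolute, port and CDP forms
    return ARN_FORMS[low5].format(n=top3)


print(decode_smem(0b01100011))
```

With these assumptions, `0111 0001` decodes to `*CDP` and `0110 0011` decodes to `*AR3+`.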

Table 6-2. Instruction Set Opcode Symbols and Abbreviations (Continued)

| Bit Field Name | Bit Field Value | Bit Field Description |
| :---: | :---: | :---: |
|  | 1101000 | TC1 \& TC2 |
|  | 1101001 | TC1 \& !TC2 |
|  | 1101010 | !TC1 \& TC2 |
|  | 1101011 | !TC1 \& !TC2 |
|  | 110 11xx | Reserved |
|  | 111 00SS | !overflow(ACx) (source accumulator overflow status bit (ACOVx) is tested against 0) |
|  | 1110100 | !TC1 (status bit is tested against 0) |
|  | 1110101 | !TC2 (status bit is tested against 0) |
|  | 1110110 | !CARRY (status bit is tested against 0) |
|  | 1110111 | Reserved |
|  | 1111000 | TC1 \| TC2 |
|  | 1111001 | TC1 \| !TC2 |
|  | 1111010 | !TC1 \| TC2 |
|  | 1111011 | !TC1 \| !TC2 |
|  | 1111100 | TC1 ^ TC2 |
|  | 1111101 | TC1 ^ !TC2 |
|  | 1111110 | !TC1 ^ TC2 |
|  | 1111111 | !TC1 ^ !TC2 |
| dd |  | Destination temporary register (Tx, Ty): |
|  | 00 | Temporary register 0 (T0) |
|  | 01 | Temporary register 1 (T1) |
|  | 10 | Temporary register 2 (T2) |
|  | 11 | Temporary register 3 (T3) |
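The TC1/TC2 combination codes listed above follow a regular pattern: the upper five bits select the operator and the two low bits select which status bit is negated. The following Python sketch evaluates those codes under that reading. The function name `eval_tc_cond` is hypothetical, and the sketch covers only the AND, OR, and XOR rows shown in this table, not the other condition encodings.

```python
def eval_tc_cond(code: int, tc1: bool, tc2: bool) -> bool:
    """Evaluate a 7-bit TC1/TC2 combination code from Table 6-2.

    Assumption (read off the table rows): bit 1 set means TC1 is negated,
    bit 0 set means TC2 is negated, and bits 6..2 select the operator.
    """
    a = tc1 != bool(code & 0b10)  # !TC1 when bit 1 is set
    b = tc2 != bool(code & 0b01)  # !TC2 when bit 0 is set
    group = code >> 2
    if group == 0b11010:          # 11010xx: AND
        return a and b
    if group == 0b11110:          # 11110xx: OR
        return a or b
    if group == 0b11111:          # 11111xx: XOR
        return a != b
    raise ValueError("not a TC1/TC2 combination code")
```

For example, `eval_tc_cond(0b1101001, True, False)` evaluates `TC1 & !TC2` with TC1 set and TC2 cleared.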

Table 6-2. Instruction Set Opcode Symbols and Abbreviations (Continued)

| Bit Field Name | Bit Field Value | Bit Field Description |
| :---: | :---: | :---: |
| DD |  | Destination accumulator register (ACw, ACx, ACy, ACz): |
|  | 00 | Accumulator 0 (AC0) |
|  | 01 | Accumulator 1 (AC1) |
|  | 10 | Accumulator 2 (AC2) |
|  | 11 | Accumulator 3 (AC3) |
| DDD...D |  | Data address label coded on n bits (absolute address) |
| E | 0 | Parallel Enable bit is cleared to 0 |
|  | 1 | Parallel Enable bit is set to 1 |
| FDDD <br> FSSS |  | Destination or Source accumulator, auxiliary, or temporary register (dst, src, TAx, TAy): |
|  | 0000 | Accumulator 0 (AC0) |
|  | 0001 | Accumulator 1 (AC1) |
|  | 0010 | Accumulator 2 (AC2) |
|  | 0011 | Accumulator 3 (AC3) |
|  | 0100 | Temporary register 0 (T0) |
|  | 0101 | Temporary register 1 (T1) |
|  | 0110 | Temporary register 2 (T2) |
|  | 0111 | Temporary register 3 (T3) |
|  | 1000 | Auxiliary register 0 (AR0) |
|  | 1001 | Auxiliary register 1 (AR1) |
|  | 1010 | Auxiliary register 2 (AR2) |
|  | 1011 | Auxiliary register 3 (AR3) |
|  | 1100 | Auxiliary register 4 (AR4) |
|  | 1101 | Auxiliary register 5 (AR5) |
|  | 1110 | Auxiliary register 6 (AR6) |
|  | 1111 | Auxiliary register 7 (AR7) |
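The FDDD/FSSS encoding above can be decoded arithmetically, because each register class occupies a contiguous range of values. The sketch below is a minimal illustration in Python; `decode_fddd` is a hypothetical helper name, not part of any TI tool.

```python
def decode_fddd(field: int) -> str:
    """Map a 4-bit FDDD/FSSS value to a register name per Table 6-2."""
    if not 0 <= field <= 0b1111:
        raise ValueError("FDDD/FSSS is a 4-bit field")
    if field < 0b0100:
        return f"AC{field}"        # 0000-0011: accumulators AC0-AC3
    if field < 0b1000:
        return f"T{field - 4}"     # 0100-0111: temporary registers T0-T3
    return f"AR{field - 8}"        # 1000-1111: auxiliary registers AR0-AR7
```

For example, `decode_fddd(0b1001)` yields `"AR1"`, matching the table row for 1001.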

Table 6-2. Instruction Set Opcode Symbols and Abbreviations (Continued)

| Bit Field Name | Bit Field Value | Bit Field Description |
| :---: | :---: | :---: |
| g | 0 | 40 keyword is not applied |
|  | 1 | 40 keyword is applied; M40 is locally set to 1 |
| kk kkkk |  | Swap code for Swap Register Content instruction: |
|  | 000000 | SWAP AC0, AC2 |
|  | 000001 | SWAP AC1, AC3 |
|  | 000100 | SWAP T0, T2 |
|  | 000101 | SWAP T1, T3 |
|  | 001000 | SWAP AR0, AR2 |
|  | 001001 | SWAP AR1, AR3 |
|  | 001100 | SWAP AR4, T0 |
|  | 001101 | SWAP AR5, T1 |
|  | 001110 | SWAP AR6, T2 |
|  | 001111 | SWAP AR7, T3 |
|  | 010000 | SWAPP AC0, AC2 |
|  | 010001 | Reserved |
|  | 010100 | SWAPP T0, T2 |
|  | 010101 | Reserved |
|  | 011000 | SWAPP AR0, AR2 |
|  | 011001 | Reserved |
|  | 011100 | SWAPP AR4, T0 |
|  | 011101 | Reserved |
|  | 011110 | SWAPP AR6, T2 |
|  | 011111 | Reserved |
|  | 101000 | Reserved |
|  | 101100 | SWAP4 AR4, T0 |
|  | 111000 | SWAP AR0, AR1 |
|  | 111100 | Reserved |
|  | 1x 0000 | Reserved |
|  | 1x 0001 | Reserved |

Table 6-2. Instruction Set Opcode Symbols and Abbreviations (Continued)

| Bit Field Name | Bit Field Value | Bit Field Description |
| :---: | :---: | :---: |
|  | 101 | *(ARn - T0), when C54CM = 0 <br> *(ARn - AR0), when C54CM = 1 |
|  | 110 | *(ARn - T1) |
|  | 111 | *ARn(T0), when C54CM = 0 <br> *ARn(AR0), when C54CM = 1 |
| n |  | Reserved bit |
| PPP...P |  | Program or data address label coded on n bits (absolute address) |
| r | 0 | Select TRN0 |
|  | 1 | Select TRN1 |
| SHFT |  | 4-bit immediate shift value, 0 to 15 |
| SHIFTW |  | 6-bit immediate shift value, -32 to +31 |
| ss |  | Source temporary register (Tx, Ty): |
|  | 00 | Temporary register 0 (T0) |
|  | 01 | Temporary register 1 (T1) |
|  | 10 | Temporary register 2 (T2) |
|  | 11 | Temporary register 3 (T3) |
| SS |  | Source accumulator register (ACw, ACx, ACy, ACz): |
|  | 00 | Accumulator 0 (AC0) |
|  | 01 | Accumulator 1 (AC1) |
|  | 10 | Accumulator 2 (AC2) |
|  | 11 | Accumulator 3 (AC3) |
| tt | 00 | Bit 0: destination TCy bit of Compare Register Content instruction |
|  | 01 | Bit 1: source TCx bit of Compare Register Content instruction |

Table 6-2. Instruction Set Opcode Symbols and Abbreviations (Continued)

| Bit Field Name | Bit Field Value | Bit Field Description |
| :---: | :---: | :---: |
|  | 10 | When value = 0: TC1 is selected |
|  | 11 | When value = 1: TC2 is selected |
| u | 0 | U or uns keyword is not applied; operand is considered signed |
|  | 1 | U or uns keyword is applied; operand is considered unsigned |
| U | 0 | No update of T3 with Smem or Xmem content |
|  | 1 | T3 is updated with Smem or Xmem content |
| vv | 00 | Bit 0: shifted-out bit of Rotate instruction |
|  | 01 | Bit 1: shifted-in bit of Rotate instruction |
|  | 10 | When value = 0: CARRY is selected |
|  | 11 | When value = 1: TC2 is selected |
| x |  | Reserved bit |
| XDDD <br> XSSS |  | Destination or Source accumulator or extended register. All 23 bits of stack pointer (XSP), system stack pointer (XSSP), data page pointer (XDP), coefficient data pointer (XCDP), and extended auxiliary register (XARx). |
|  | 0000 | Accumulator 0 (AC0) |
|  | 0001 | Accumulator 1 (AC1) |
|  | 0010 | Accumulator 2 (AC2) |
|  | 0011 | Accumulator 3 (AC3) |
|  | 0100 | Stack pointer (XSP) |
|  | 0101 | System stack pointer (XSSP) |
|  | 0110 | Data page pointer (XDP) |
|  | 0111 | Coefficient data pointer (XCDP) |
|  | 1000 | Auxiliary register 0 (XAR0) |
|  | 1001 | Auxiliary register 1 (XAR1) |
|  | 1010 | Auxiliary register 2 (XAR2) |
|  | 1011 | Auxiliary register 3 (XAR3) |
|  | 1100 | Auxiliary register 4 (XAR4) |
|  | 1101 | Auxiliary register 5 (XAR5) |
|  | 1110 | Auxiliary register 6 (XAR6) |
|  | 1111 | Auxiliary register 7 (XAR7) |
| XXX <br> YYY |  | Auxiliary register designation for Xmem or Ymem addressing mode: |
|  | 000 | Auxiliary register 0 (AR0) |
|  | 001 | Auxiliary register 1 (AR1) |
|  | 010 | Auxiliary register 2 (AR2) |
|  | 011 | Auxiliary register 3 (AR3) |
|  | 100 | Auxiliary register 4 (AR4) |
|  | 101 | Auxiliary register 5 (AR5) |
|  | 110 | Auxiliary register 6 (AR6) |
|  | 111 | Auxiliary register 7 (AR7) |
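Table 6-2 describes SHIFTW as a 6-bit immediate with a range of -32 to +31. A range like that is what a two's complement field of that width produces, so the sketch below sign-extends the raw field on that assumption. The two's complement interpretation and the helper name `decode_shiftw` are assumptions made for illustration; the table itself states only the width and the range.

```python
def decode_shiftw(field: int) -> int:
    """Sign-extend a 6-bit SHIFTW field to a Python int.

    Assumes two's complement encoding, which matches the documented
    range of -32 to +31 for a 6-bit field.
    """
    if not 0 <= field <= 0x3F:
        raise ValueError("SHIFTW is a 6-bit field")
    # Bit 5 is the sign bit; subtract 2**6 when it is set.
    return field - 64 if field & 0x20 else field
```

Under this assumption, a raw field of `0b100000` decodes to the most negative shift count and `0b011111` to the most positive one.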

# Cross-Reference of Mnemonic and Algebraic Instruction Sets 

This chapter provides a cross-reference between the TMS320C55x™ DSP mnemonic instruction set and the algebraic instruction set (Table 7-1). For more information on the algebraic instruction set, see TMS320C55x DSP Algebraic Instruction Set Reference Guide, SWPU068.

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets

| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| AADD: Modify Auxiliary or Temporary Register Content by Addition | Modify Auxiliary or Temporary Register Content by Addition |
| AADD TAx, TAy | mar(TAy + TAx) |
| AADD P8, TAx | mar(TAx + P8) |
| AADD: Modify Data Stack Pointer (SP) | Modify Data Stack Pointer |
| AADD K8, SP | SP = SP + K8 |
| AADD: Modify Extended Auxiliary Register Content by Addition | Modify Extended Auxiliary Register Content by Addition |
| AADD XACsrc, XACdst for DAG_X | mar(XACdst + XACsrc) for DAG_X |
| AADD XACsrc, XACdst for DAG_Y | mar(XACdst + XACsrc) for DAG_Y |
| ABDST: Absolute Distance | Absolute Distance |
| ABDST Xmem, Ymem, ACx, ACy | abdst(Xmem, Ymem, ACx, ACy) |
| ABS: Absolute Value | Absolute Value |
| ABS [src,] dst | dst = \|src\| |
| ADD: Addition | Addition |
| ADD [src,] dst | dst = dst + src |
| ADD k4, dst | dst = dst + k4 |
| ADD K16, [src,] dst | dst = src + K16 |
| ADD Smem, [src,] dst | dst = src + Smem |
| ADD ACx << Tx, ACy | ACy = ACy + (ACx << Tx) |
| ADD ACx << \#SHIFTW, ACy | ACy = ACy + (ACx << \#SHIFTW) |
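Several mnemonic syntaxes in the table above use square brackets for optional operands, as described under Notational Conventions. The following Python sketch expands such a syntax string into the concrete forms it permits. The helper `expand_optional` is hypothetical, and it treats every bracket group independently, so it is suitable for simple cases such as `[src,]` but would also emit unbalanced forms for paired groups such as `[uns(]Smem[)]`.

```python
import itertools
import re


def expand_optional(syntax: str) -> list[str]:
    """Expand each [optional] group into 'present' and 'absent' variants."""
    # re.split with a capture group puts bracketed groups at odd indices.
    pieces = re.split(r"(\[[^\[\]]*\])", syntax)
    optional_idx = list(range(1, len(pieces), 2))
    forms = []
    for keep in itertools.product([True, False], repeat=len(optional_idx)):
        chosen = dict(zip(optional_idx, keep))
        out = []
        for i, piece in enumerate(pieces):
            if i in chosen:
                if chosen[i]:
                    out.append(piece[1:-1])  # drop the brackets themselves
            else:
                out.append(piece)
        # Collapse the double space left behind by an omitted group.
        forms.append(" ".join("".join(out).split()))
    return forms
```

For example, `expand_optional("ADD K16, [src,] dst")` yields the three-operand form followed by the two-operand form.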


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| ADD K16 << \#16, [ACx,] ACy | ACy = ACx + (K16 << \#16) |
| ADD K16 << \#SHFT, [ACx,] ACy | ACy = ACx + (K16 << \#SHFT) |
| ADD Smem << Tx, [ACx,] ACy | ACy = ACx + (Smem << Tx) |
| ADD Smem << \#16, [ACx,] ACy | ACy = ACx + (Smem << \#16) |
| ADD [uns(]Smem[)], CARRY, [ACx,] ACy | ACy = ACx + uns(Smem) + CARRY |
| ADD [uns(]Smem[)], [ACx,] ACy | ACy = ACx + uns(Smem) |
| ADD [uns(]Smem[)] << \#SHIFTW, [ACx,] ACy | ACy = ACx + (uns(Smem) << \#SHIFTW) |
| ADD dbl(Lmem), [ACx,] ACy | ACy = ACx + dbl(Lmem) |
| ADD Xmem, Ymem, ACx | ACx = (Xmem << \#16) + (Ymem << \#16) |
| ADD K16, Smem | Smem = Smem + K16 |
| ADD: Dual 16-Bit Additions | Dual 16-Bit Additions |
| ADD dual(Lmem), [ACx,] ACy | HI(ACy) = HI(Lmem) + HI(ACx), <br> LO(ACy) = LO(Lmem) + LO(ACx) |
| ADD dual(Lmem), Tx, ACx | HI(ACx) = HI(Lmem) + Tx, <br> LO(ACx) = LO(Lmem) + Tx |
| ADD::MOV: Addition with Parallel Store Accumulator Content to Memory | Addition with Parallel Store Accumulator Content to Memory |
| ADD Xmem << \#16, ACx, ACy <br> :: MOV HI(ACy << T2), Ymem | ACy = ACx + (Xmem << \#16), <br> Ymem = HI(ACy << T2) |
| ADDSUB: Dual 16-Bit Addition and Subtraction | Dual 16-Bit Addition and Subtraction |
| ADDSUB Tx, Smem, ACx | HI(ACx) = Smem + Tx, <br> LO(ACx) = Smem - Tx |
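The dual 16-bit forms above operate on the high and low halves independently, with no carry from the low half into the high half. The Python sketch below illustrates only that arithmetic idea on plain 32-bit values. It is a simplified model under stated assumptions: it ignores accumulator guard bits, sign extension, and saturation, all of which the real instruction description covers, and the name `dual_add16` is hypothetical.

```python
def dual_add16(lmem: int, acx: int) -> int:
    """Model HI(ACy) = HI(Lmem) + HI(ACx), LO(ACy) = LO(Lmem) + LO(ACx).

    Simplified: both inputs are treated as 32-bit words and each 16-bit
    half wraps modulo 2**16 (no guard bits, no saturation).
    """
    lmem &= 0xFFFFFFFF
    acx &= 0xFFFFFFFF
    hi = ((lmem >> 16) + (acx >> 16)) & 0xFFFF
    lo = ((lmem & 0xFFFF) + (acx & 0xFFFF)) & 0xFFFF  # carry is discarded
    return (hi << 16) | lo
```

In this model, an overflow in the low half does not propagate into the high half, which is the property that distinguishes the dual form from an ordinary 32-bit addition.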

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| AMAR::MAC: Modify Auxiliary Register Content with Parallel Multiply and Accumulate | Modify Auxiliary Register Content with Parallel Multiply and Accumulate |
| AMAR Xmem <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx | mar(Xmem), <br> ACx = M40(rnd(ACx + (uns(Ymem) * uns(coef(Cmem))))) |
| AMAR Xmem <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx >> \#16 | mar(Xmem), <br> ACx = M40(rnd((ACx >> \#16) + (uns(Ymem) * uns(coef(Cmem))))) |
| AMAR::MAS: Modify Auxiliary Register Content with Parallel Multiply and Subtract | Modify Auxiliary Register Content with Parallel Multiply and Subtract |
| AMAR Xmem <br> :: MAS[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx | mar(Xmem), <br> ACx = M40(rnd(ACx - (uns(Ymem) * uns(coef(Cmem))))) |
| AMAR::MPY: Modify Auxiliary Register Content with Parallel Multiply | Modify Auxiliary Register Content with Parallel Multiply |
| AMAR Xmem <br> :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx | mar(Xmem), <br> ACx = M40(rnd(uns(Ymem) * uns(coef(Cmem)))) |
| AMOV: Load Extended Auxiliary Register with Immediate Value | Load Extended Auxiliary Register with Immediate Value |
| AMOV k23, XAdst | XAdst = k23 |
| AMOV: Modify Auxiliary or Temporary Register Content | Modify Auxiliary or Temporary Register Content |
| AMOV TAx, TAy | mar(TAy = TAx) |
| AMOV P8, TAx | mar(TAx = P8) |
| AMOV D16, TAx | mar(TAx = D16) |

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| AND: Bitwise AND | Bitwise AND |
| AND src, dst | dst = dst \& src |
| AND k8, src, dst | dst = src \& k8 |
| AND k16, src, dst | dst = src \& k16 |
| AND Smem, src, dst | dst = src \& Smem |
| AND ACx << \#SHIFTW[, ACy] | ACy = ACy \& (ACx <<< \#SHIFTW) |
| AND k16 << \#16, [ACx,] ACy | ACy = ACx \& (k16 <<< \#16) |
| AND k16 << \#SHFT, [ACx,] ACy | ACy = ACx \& (k16 <<< \#SHFT) |
| AND k16, Smem | Smem = Smem \& k16 |
| ASUB: Modify Auxiliary or Temporary Register Content by Subtraction | Modify Auxiliary or Temporary Register Content by Subtraction |
| ASUB TAx, TAy | mar(TAy - TAx) |
| ASUB P8, TAx | mar(TAx - P8) |
| ASUB: Modify Extended Auxiliary Register Content by Subtraction | Modify Extended Auxiliary Register Content by Subtraction |
| ASUB XACsrc, XACdst for DAG_X | mar(XACdst - XACsrc) for DAG_X |
| ASUB XACsrc, XACdst for DAG_Y | mar(XACdst - XACsrc) for DAG_Y |
| B: Branch Unconditionally | Branch Unconditionally |
| B ACx | goto ACx |
| B L7 | goto L7 |
| B L16 | goto L16 |
| B P24 | goto P24 |


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| BAND: Bitwise AND Memory with Immediate Value and Compare to Zero | Bitwise AND Memory with Immediate Value and Compare to Zero |
| BAND Smem, k16, TCx | TCx = Smem \& k16 |
| BCC: Branch Conditionally | Branch Conditionally |
| BCC l4, cond | if (cond) goto l4 |
| BCC L8, cond | if (cond) goto L8 |
| BCC L16, cond | if (cond) goto L16 |
| BCC P24, cond | if (cond) goto P24 |
| BCC: Branch on Auxiliary Register Not Zero | Branch on Auxiliary Register Not Zero |
| BCC L16, ARn_mod != \#0 | if (ARn_mod != \#0) goto L16 |
| BCC: Compare and Branch | Compare and Branch |
| BCC[U] L8, src RELOP K8 | compare (uns(src RELOP K8)) goto L8 |
| BCLR: Clear Accumulator, Auxiliary, or Temporary Register Bit | Clear Accumulator, Auxiliary, or Temporary Register Bit |
| BCLR Baddr, src | bit(src, Baddr) = \#0 |
| BCLR: Clear Memory Bit | Clear Memory Bit |
| BCLR src, Smem | bit(Smem, src) = \#0 |
| BCLR: Clear Status Register Bit | Clear Status Register Bit |
| BCLR k4, STx_55 | bit(STx, k4) = \#0 |
| BCLR f-name |  |

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :--- | :--- |
| BCNT: Count Accumulator Bits | Count Accumulator Bits |
| BCNT ACx, ACy, TCx, Tx | Tx = count(ACx, ACy, TCx) |
| BFXPA: Expand Accumulator Bit Field | Expand Accumulator Bit Field |
| BFXPA k16, ACx, dst | dst = field_expand(ACx, k16) |
| BFXTR: Extract Accumulator Bit Field | Extract Accumulator Bit Field |
| BFXTR k16, ACx, dst | dst = field_extract(ACx, k16) |
| BNOT: Complement Accumulator, Auxiliary, or Temporary Register Bit | Complement Accumulator, Auxiliary, or Temporary Register Bit |
| BNOT Baddr, src | cbit(src, Baddr) |
| BNOT: Complement Memory Bit | Complement Memory Bit |
| BNOT src, Smem | cbit(Smem, src) |
| BSET: Set Accumulator, Auxiliary, or Temporary Register Bit | Set Accumulator, Auxiliary, or Temporary Register Bit |
| BSET Baddr, src | bit(src, Baddr) = \#1 |
| BSET: Set Memory Bit | Set Memory Bit |
| BSET src, Smem | bit(Smem, src) = \#1 |
| BSET: Set Status Register Bit | Set Status Register Bit |
| BSET k4, STx_55 | bit(STx, k4) = \#1 |
| BSET f-name |  |


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| BTST: Test Accumulator, Auxiliary, or Temporary Register Bit | Test Accumulator, Auxiliary, or Temporary Register Bit |
| BTST Baddr, src, TCx | TCx = bit(src, Baddr) |
| BTST: Test Memory Bit | Test Memory Bit |
| BTST src, Smem, TCx | TCx = bit(Smem, src) |
| BTST k4, Smem, TCx | TCx = bit(Smem, k4) |
| BTSTCLR: Test and Clear Memory Bit | Test and Clear Memory Bit |
| BTSTCLR k4, Smem, TCx | TCx = bit(Smem, k4), <br> bit(Smem, k4) = \#0 |
| BTSTNOT: Test and Complement Memory Bit | Test and Complement Memory Bit |
| BTSTNOT k4, Smem, TCx | TCx = bit(Smem, k4), <br> cbit(Smem, k4) |
| BTSTP: Test Accumulator, Auxiliary, or Temporary Register Bit Pair | Test Accumulator, Auxiliary, or Temporary Register Bit Pair |
| BTSTP Baddr, src | bit(src, pair(Baddr)) |
| BTSTSET: Test and Set Memory Bit | Test and Set Memory Bit |
| BTSTSET k4, Smem, TCx | TCx = bit(Smem, k4), bit(Smem, k4) = \#1 |
| CALL: Call Unconditionally | Call Unconditionally |
| CALL ACx | call ACx |
| CALL L16 | call L16 |
| CALL P24 | call P24 |
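The algebraic forms of the test-and-modify instructions above (BTSTCLR, BTSTNOT, BTSTSET) each pair a bit test with an update of the same bit. The Python sketch below models that pairing for BTSTCLR on a 16-bit memory word, following `TCx = bit(Smem, k4), bit(Smem, k4) = #0`. The function name `btstclr` and the tuple return convention are assumptions made for illustration only.

```python
def btstclr(smem: int, k4: int) -> tuple[int, int]:
    """Return (TCx, updated Smem) for a test-and-clear of bit k4."""
    if not 0 <= k4 <= 15:
        raise ValueError("k4 selects one of 16 bits")
    smem &= 0xFFFF
    tcx = (smem >> k4) & 1               # TCx = bit(Smem, k4)
    updated = smem & ~(1 << k4) & 0xFFFF # bit(Smem, k4) = #0
    return tcx, updated
```

The test reads the bit before it is cleared, so TCx reflects the original value of the memory bit.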

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :--- | :--- |
| CALLCC: Call Conditionally | Call Conditionally |
| CALLCC L16, cond | if (cond) call L16 |
| CALLCC P24, cond | if (cond) call P24 |
| CMP: Compare Memory with Immediate Value | Compare Memory with Immediate Value |
| CMP Smem == K16, TCx | TCx = (Smem == K16) |
| CMP: Compare Accumulator, Auxiliary, or Temporary Register Content | Compare Accumulator, Auxiliary, or Temporary Register Content |
| CMP[U] src RELOP dst, TCx | TCx = uns(src RELOP dst) |
| CMPAND: Compare Accumulator, Auxiliary, or Temporary Register Content with AND | Compare Accumulator, Auxiliary, or Temporary Register Content with AND |
| CMPAND[U] src RELOP dst, TCy, TCx | TCx = TCy \& uns(src RELOP dst) |
| CMPAND[U] src RELOP dst, !TCy, TCx | TCx = !TCy \& uns(src RELOP dst) |
| CMPOR: Compare Accumulator, Auxiliary, or Temporary Register Content with OR | Compare Accumulator, Auxiliary, or Temporary Register Content with OR |
| CMPOR[U] src RELOP dst, TCy, TCx | TCx = TCy \| uns(src RELOP dst) |
| CMPOR[U] src RELOP dst, !TCy, TCx | TCx = !TCy \| uns(src RELOP dst) |
| .CR: Circular Addressing Qualifier | Circular Addressing Qualifier |
| <instruction>.CR | circular() |
| DELAY: Memory Delay | Memory Delay |
| DELAY Smem | delay(Smem) |


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| EXP: Compute Exponent of Accumulator Content | Compute Exponent of Accumulator Content |
| EXP ACx, Tx | Tx = exp(ACx) |
| FIRSADD: Finite Impulse Response Filter, Symmetrical | Finite Impulse Response Filter, Symmetrical |
| FIRSADD Xmem, Ymem, Cmem, ACx, ACy | firs(Xmem, Ymem, coef(Cmem), ACx, ACy) |
| FIRSSUB: Finite Impulse Response Filter, Antisymmetrical | Finite Impulse Response Filter, Antisymmetrical |
| FIRSSUB Xmem, Ymem, Cmem, ACx, ACy | firsn(Xmem, Ymem, coef(Cmem), ACx, ACy) |
| IDLE: Idle | Idle |
| IDLE | idle |
| INTR: Software Interrupt | Software Interrupt |
| INTR k5 | intr(k5) |
| .LK: Lock Access Qualifier | Lock Access Qualifier |
| .LK | lock() |
| LMS: Least Mean Square | Least Mean Square (LMS) |
| LMS Xmem, Ymem, ACx, ACy | lms(Xmem, Ymem, ACx, ACy) |
| LMSF Xmem, Ymem, ACx, ACy | lmsf(Xmem, Ymem, ACx, ACy) |
| .LR: Linear Addressing Qualifier | Linear Addressing Qualifier |
| <instruction>.LR | linear() |

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MAC: Multiply and Accumulate | Multiply and Accumulate (MAC) |
| MAC[R] ACx, Tx, ACy[, ACy] | ACy = rnd(ACy + (ACx * Tx)) |
| MAC[R] ACy, Tx, ACx, ACy | ACy = rnd((ACy * Tx) + ACx) |
| MAC[R] Smem, uns(Cmem), ACx | ACx = rnd(ACx + (Smem * uns(coef(Cmem)))) |
| MACK[R] Tx, K8, [ACx,] ACy | ACy = rnd(ACx + (Tx * K8)) |
| MACK[R] Tx, K16, [ACx,] ACy | ACy = rnd(ACx + (Tx * K16)) |
| MACM[R] [T3 = ]Smem, Cmem, ACx | ACx = rnd(ACx + (Smem * coef(Cmem)))[, T3 = Smem] |
| MACM[R] [T3 = ]Smem, [ACx,] ACy | ACy = rnd(ACy + (Smem * ACx))[, T3 = Smem] |
| MACM[R] [T3 = ]Smem, Tx, [ACx,] ACy | ACy = rnd(ACx + (Tx * Smem))[, T3 = Smem] |
| MACMK[R] [T3 = ]Smem, K8, [ACx,] ACy | ACy = rnd(ACx + (Smem * K8))[, T3 = Smem] |
| MACM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], [ACx,] ACy | ACy = M40(rnd(ACx + (uns(Xmem) * uns(Ymem))))[, T3 = Xmem] |
| MACM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], ACx >> \#16[, ACy] | ACy = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(Ymem))))[, T3 = Xmem] |
| MACMZ: Multiply and Accumulate with Parallel Delay | Multiply and Accumulate with Parallel Delay |
| MACM[R]Z [T3 = ]Smem, Cmem, ACx | ACx = rnd(ACx + (Smem * coef(Cmem)))[, T3 = Smem], <br> delay(Smem) |
| MAC::MAC: Parallel Multiply and Accumulates | Parallel Multiply and Accumulates |
| MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | ACx = M40(rnd(ACx + (uns(Xmem) * uns(coef(Cmem))))), <br> ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem))))) |
| MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx >> \#16 <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(coef(Cmem))))), <br> ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem))))) |
| MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx >> \#16 <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16 | ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(coef(Cmem))))), <br> ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem))))) |
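The algebraic forms above all share the shape `acc = rnd(acc + (a * b))`, where `rnd()` is applied only when the optional `R` is present in the mnemonic. The Python sketch below illustrates that shape on a 40-bit accumulator. It rests on several assumptions that are not stated in this table: `rnd()` is modeled as adding 2^15 and clearing the low 16 bits (one common rounding scheme), and fractional mode, saturation, and the M40 option are ignored. The helper names are hypothetical.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value: int) -> int:
    """Wrap an integer to a signed 40-bit accumulator value."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def rnd(acc: int) -> int:
    """Assumed rounding: add 2**15, then clear bits 15..0."""
    return to_signed40((acc + 0x8000) & ~0xFFFF)


def mac(acc: int, a: int, b: int, rounding: bool = False) -> int:
    """Model acc = [rnd](acc + (a * b)) with 40-bit wraparound."""
    result = to_signed40(acc + a * b)
    return rnd(result) if rounding else result
```

Calling `mac(acc, a, b)` corresponds to the plain `MAC` spelling, and `mac(acc, a, b, rounding=True)` corresponds to the `MACR` spelling under the rounding assumption above.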


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |  |
| MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx >> \#16 | ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd((ACx >> \#16) + (uns(Smem) * uns(LO(coef(Cmem)))))) |
| MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx >> \#16 | ACy = M40(rnd((ACy >> \#16) + (uns(Smem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd((ACx >> \#16) + (uns(Smem) * uns(LO(coef(Cmem)))))) |
| MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |  |
| MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 | ACy = M40(rnd(ACy + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd((ACx >> \#16) + (uns(LO(Lmem)) * uns(LO(coef(Cmem)))))) |
| MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 |  |
| MAC[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, <br> :: MAC[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(ACy + uns(Ymem) * uns(HI(coef(Cmem))))), <br> ACx = M40(rnd(ACx + uns(Xmem) * uns(LO(coef(Cmem))))) |
| MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy, <br> :: MAC[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 | ACy = M40(rnd(ACy + (uns(Ymem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(LO(coef(Cmem)))))) |
| MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16, <br> :: MAC[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 | ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(LO(coef(Cmem)))))) |

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MAC::MAS: Multiply and Accumulate with Parallel Multiply and Subtract | Multiply and Accumulate with Parallel Multiply and Subtract |
| MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(ACx - (uns(Smem) * uns(LO(coef(Cmem)))))) |
| MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd((ACy >> \#16) + (uns(Smem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(ACx - (uns(Smem) * uns(LO(coef(Cmem)))))) |
| MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |  |
| MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd((ACy >> \#16) + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(ACx - (uns(LO(Lmem)) * uns(LO(coef(Cmem)))))) |
| MAC[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, <br> :: MAS[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(ACy + uns(Ymem) * uns(HI(coef(Cmem))))), <br> ACx = M40(rnd(ACx - uns(Xmem) * uns(LO(coef(Cmem))))) |
| MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16, <br> :: MAS[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(ACx - (uns(Xmem) * uns(LO(coef(Cmem)))))) |
| MAC::MPY: Multiply and Accumulate with Parallel Multiply | Multiply and Accumulate with Parallel Multiply |
| MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | ACx = M40(rnd(ACx + (uns(Xmem) * uns(coef(Cmem))))), <br> ACy = M40(rnd(uns(Ymem) * uns(coef(Cmem)))) |
| MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem))))) |
| MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy >> \#16 <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |  |
| MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |  |


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy>>\#16 :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |  |
| MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16, <br> :: MPY[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(uns(Xmem) * uns(LO(coef(Cmem))))) |
| MACM::MOV: Multiply and Accumulate with Parallel Load Accumulator from Memory | Multiply and Accumulate with Parallel Load Accumulator from Memory |
| MACM[R] [T3 = ]Xmem, Tx, ACx <br> :: MOV Ymem << \#16, ACy | ACx = rnd(ACx + (Tx * Xmem)), <br> ACy = Ymem << \#16[, T3 = Xmem] |
| MACM::MOV: Multiply and Accumulate with Parallel Store Accumulator Content to Memory | Multiply and Accumulate with Parallel Store Accumulator Content to Memory |
| MACM[R] [T3 = ]Xmem, Tx, ACy <br> :: MOV HI(ACx << T2), Ymem | ACy = rnd(ACy + (Tx * Xmem)), <br> Ymem = HI(ACx << T2)[, T3 = Xmem] |
| MANT::NEXP: Compute Mantissa and Exponent of Accumulator Content | Compute Mantissa and Exponent of Accumulator Content |
| MANT ACx, ACy <br> :: NEXP ACx, Tx | ACy = mant(ACx), Tx = -exp(ACx) |
| MAS: Multiply and Subtract | Multiply and Subtract |
| MAS[R] Tx, [ACx,] ACy | ACy = rnd(ACy - (ACx * Tx)) |
| MAS[R] Smem, uns(Cmem), ACx | ACx = rnd(ACx - (Smem * uns(coef(Cmem)))) |
| MASM[R] [T3 = ]Smem, Cmem, ACx | ACx = rnd(ACx - (Smem * coef(Cmem)))[, T3 = Smem] |
| MASM[R] [T3 = ]Smem, [ACx,] ACy | ACy = rnd(ACy - (Smem * ACx))[, T3 = Smem] |
| MASM[R] [T3 = ]Smem, Tx, [ACx,] ACy | ACy = rnd(ACx - (Tx * Smem))[, T3 = Smem] |

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MASM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], [ACx,] ACy | ACy = M40(rnd(ACx - (uns(Xmem) * uns(Ymem))))[, T3 = Xmem] |
| MAS::MAC: Multiply and Subtract with Parallel Multiply and Accumulate | Multiply and Subtract with Parallel Multiply and Accumulate |
| MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), <br> ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem))))) |
| MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16 | ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), <br> ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem))))) |
| MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx |  |
| MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |  |
| MAS::MAS: Parallel Multiply and Subtracts | Parallel Multiply and Subtracts |
| MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAS[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), <br> ACy = M40(rnd(ACy - (uns(Ymem) * uns(coef(Cmem))))) |
| MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(ACy - (uns(Smem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(ACx - (uns(Smem) * uns(LO(coef(Cmem)))))) |
| MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(ACy - (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(ACx - (uns(LO(Lmem)) * uns(LO(coef(Cmem)))))) |
| MAS[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy, :: MAS[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx |  |
| MAS::MPY: Multiply and Subtract with Parallel Multiply | Multiply and Subtract with Parallel Multiply |
| MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy | ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), <br> ACy = M40(rnd(uns(Ymem) * uns(coef(Cmem)))) |
| MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(ACy - (uns(Smem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem))))) |


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |  |
| MASM::MOV: Multiply and Subtract with Parallel Load Accumulator from Memory | Multiply and Subtract with Parallel Load Accumulator from Memory |
| MASM[R] [T3 = ]Xmem, Tx, ACx <br> :: MOV Ymem << \#16, ACy | ACx = rnd(ACx - (Tx * Xmem)), <br> ACy = Ymem << \#16[, T3 = Xmem] |
| MASM::MOV: Multiply and Subtract with Parallel Store Accumulator Content to Memory | Multiply and Subtract with Parallel Store Accumulator Content to Memory |
| MASM[R] [T3 = ]Xmem, Tx, ACy <br> :: MOV HI(ACx << T2), Ymem | ACy = rnd(ACy - (Tx * Xmem)), <br> Ymem = HI(ACx << T2)[, T3 = Xmem] |
| MAX: Compare Accumulator, Auxiliary, or Temporary Register Content Maximum | Compare Accumulator, Auxiliary, or Temporary Register Content Maximum |
| MAX [src,] dst | dst = max(src, dst) |
| MAXDIFF: Compare and Select Accumulator Content Maximum | Compare and Select Accumulator Content Maximum |
| MAXDIFF ACx, ACy, ACz, ACw | max_diff(ACx, ACy, ACz, ACw) |
| DMAXDIFF ACx, ACy, ACz, ACw, TRNx | max_diff_dbl(ACx, ACy, ACz, ACw, TRNx) |
| MIN: Compare Accumulator, Auxiliary, or Temporary Register Content Minimum | Compare Accumulator, Auxiliary, or Temporary Register Content Minimum |
| MIN [src,] dst | dst = min(src, dst) |
| MINDIFF: Compare and Select Accumulator Content Minimum | Compare and Select Accumulator Content Minimum |
| MINDIFF ACx, ACy, ACz, ACw | min_diff(ACx, ACy, ACz, ACw) |
| DMINDIFF ACx, ACy, ACz, ACw, TRNx | min_diff_dbl(ACx, ACy, ACz, ACw, TRNx) |
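
The MAX and MIN rows above reduce to a compare-and-keep on register contents. The sketch below is a hypothetical Python model, not TI code: it assumes 16-bit registers compared as signed two's-complement values, and the helper names `to_signed16`, `max_reg`, and `min_reg` are invented for illustration. Status-bit side effects and the 40-bit accumulator case are not modeled.

```python
def to_signed16(word):
    """Interpret a 16-bit pattern as a signed two's-complement value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def max_reg(src, dst):
    """Model of 'MAX src, dst' (dst = max(src, dst)); returns the new dst pattern."""
    keep_dst = to_signed16(dst) >= to_signed16(src)
    return (dst if keep_dst else src) & 0xFFFF


def min_reg(src, dst):
    """Model of 'MIN src, dst' (dst = min(src, dst)); returns the new dst pattern."""
    keep_dst = to_signed16(dst) <= to_signed16(src)
    return (dst if keep_dst else src) & 0xFFFF
```

Under the signed-comparison assumption, `max_reg(0xFFFF, 0x0001)` keeps `0x0001`, because `0xFFFF` is read as -1.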

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :--- | :--- |
| mmap: Memory-Mapped Register Access Qualifier | Memory-Mapped Register Access Qualifier |
| mmap | mmap() |
| MOV: Load Accumulator from Memory | Load Accumulator from Memory |
| MOV [rnd(]Smem << Tx[)], ACx | ACx = rnd(Smem << Tx) |
| MOV low_byte(Smem) << \#SHIFTW, ACx | ACx = low_byte(Smem) << \#SHIFTW |
| MOV high_byte(Smem) << \#SHIFTW, ACx | ACx = high_byte(Smem) << \#SHIFTW |
| MOV Smem <<\#16, ACx | ACx = Smem << \#16 |
| MOV [uns(]Smem[)], ACx | ACx = uns(Smem) |
| MOV [uns(]Smem[)] << \#SHIFTW, ACx | ACx = uns(Smem) << \#SHIFTW |
| MOV[40] dbl(Lmem), ACx | ACx = M40(dbl(Lmem)) |
| MOV Xmem, Ymem, ACx | LO(ACx) = Xmem, <br> HI(ACx) = Ymem |
| MOV: Load Accumulator Pair from Memory | Load Accumulator Pair from Memory |
| MOV dbl(Lmem), pair(HI(ACx)) | pair(HI(ACx)) = Lmem |
| MOV dbl(Lmem), pair(LO(ACx)) | pair(LO(ACx)) = Lmem |
| MOV: Load Accumulator with Immediate Value | Load Accumulator with Immediate Value |
| MOV K16 <<\#16, ACx | ACx = K16 << \#16 |
| MOV K16 << \#SHFT, ACx | ACx = K16 << \#SHFT |
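
The dual-operand load `MOV Xmem, Ymem, ACx` fills the two 16-bit halves of the accumulator from two memory words. The following is a minimal Python sketch of that packing, assuming only accumulator bits 31-0 are of interest (guard bits 39-32 are ignored); the function names `mov_xmem_ymem`, `hi`, and `lo` are hypothetical helpers, not part of any TI tool.

```python
def mov_xmem_ymem(xmem, ymem):
    """Model of 'MOV Xmem, Ymem, ACx': LO(ACx) = Xmem, HI(ACx) = Ymem."""
    return ((ymem & 0xFFFF) << 16) | (xmem & 0xFFFF)


def hi(acc):
    """Bits 31-16 of the modeled accumulator."""
    return (acc >> 16) & 0xFFFF


def lo(acc):
    """Bits 15-0 of the modeled accumulator."""
    return acc & 0xFFFF
```

For example, loading `Xmem = 0x1234` and `Ymem = 0xABCD` gives a modeled accumulator of `0xABCD1234`.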


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MOV: Load Accumulator, Auxiliary, or Temporary Register from Memory | Load Accumulator, Auxiliary, or Temporary Register from Memory |
| MOV Smem, dst | dst = Smem |
| MOV [uns(]high_byte(Smem)[)], dst | dst = uns(high_byte(Smem)) |
| MOV [uns(]low_byte(Smem)[)], dst | dst = uns(low_byte(Smem)) |
| MOV: Load Accumulator, Auxiliary, or Temporary Register with Immediate Value | Load Accumulator, Auxiliary, or Temporary Register with Immediate Value |
| MOV k4, dst | dst = k4 |
| MOV -k4, dst | dst = -k4 |
| MOV K16, dst | dst = K16 |
| MOV: Load Auxiliary or Temporary Register Pair from Memory | Load Auxiliary or Temporary Register Pair from Memory |
| MOV dbl(Lmem), pair(TAx) | pair(TAx) = Lmem |
| MOV: Load CPU Register from Memory | Load CPU Register from Memory |
| MOV Smem, BK03 | BK03 = Smem |
| MOV Smem, BK47 | BK47 = Smem |
| MOV Smem, BKC | BKC = Smem |
| MOV Smem, BSA01 | BSA01 = Smem |
| MOV Smem, BSA23 | BSA23 = Smem |
| MOV Smem, BSA45 | BSA45 = Smem |
| MOV Smem, BSA67 | BSA67 = Smem |
| MOV Smem, BSAC | BSAC = Smem |
| MOV Smem, BRC0 | BRC0 = Smem |

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MOV Smem, BRC1 | BRC1 = Smem |
| MOV Smem, CDP | CDP = Smem |
| MOV Smem, CSR | CSR = Smem |
| MOV Smem, DP | DP = Smem |
| MOV Smem, DPH | DPH = Smem |
| MOV Smem, PDP | PDP = Smem |
| MOV Smem, SP | SP = Smem |
| MOV Smem, SSP | SSP = Smem |
| MOV Smem, TRN0 | TRN0 = Smem |
| MOV Smem, TRN1 | TRN1 = Smem |
| MOV dbl(Lmem), RETA | RETA = dbl(Lmem) |
| MOV: Load CPU Register with Immediate Value | Load CPU Register with Immediate Value |
| MOV k12, BK03 | BK03 = k12 |
| MOV k12, BK47 | BK47 = k12 |
| MOV k12, BKC | BKC = k12 |
| MOV k12, BRC0 | BRC0 = k12 |
| MOV k12, BRC1 | BRC1 = k12 |
| MOV k12, CSR | CSR = k12 |
| MOV k7, DPH | DPH = k7 |
| MOV k9, PDP | PDP = k9 |
| MOV k16, BSA01 | BSA01 = k16 |
| MOV k16, BSA23 | BSA23 = k16 |


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MOV k16, BSA45 | BSA45 = k16 |
| MOV k16, BSA67 | BSA67 = k16 |
| MOV k16, BSAC | BSAC = k16 |
| MOV k16, CDP | CDP = k16 |
| MOV k16, DP | DP = k16 |
| MOV k16, SP | SP = k16 |
| MOV k16, SSP | SSP = k16 |
| MOV: Load Extended Auxiliary Register from Memory | Load Extended Auxiliary Register from Memory |
| MOV dbl(Lmem), XAdst | XAdst = dbl(Lmem) |
| MOV: Load Memory with Immediate Value | Load Memory with Immediate Value |
| MOV K8, Smem | Smem = K8 |
| MOV K16, Smem | Smem = K16 |
| MOV: Move Accumulator Content to Auxiliary or Temporary Register | Move Accumulator Content to Auxiliary or Temporary Register |
| MOV HI(ACx), TAx | TAx = HI(ACx) |
| MOV: Move Accumulator, Auxiliary, or Temporary Register Content | Move Accumulator, Auxiliary, or Temporary Register Content |
| MOV src, dst | dst = src |
| MOV: Move Auxiliary or Temporary Register Content to Accumulator | Move Auxiliary or Temporary Register Content to Accumulator |
| MOV TAx, HI(ACx) | HI(ACx) = TAx |

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MOV dbl(Lmem), Cmem | dbl(coef(Cmem)) = Lmem |
| MOV dbl(Xmem), dbl(Ymem) | dbl(Ymem) = dbl(Xmem) |
| MOV Xmem, Ymem | Ymem = Xmem |
| MOV: Store Accumulator Content to Memory | Store Accumulator Content to Memory |
| MOV HI(ACx), Smem | Smem = HI(ACx) |
| MOV [rnd(]HI(ACx)[)], Smem | Smem = HI(rnd(ACx)) |
| MOV ACx << Tx, Smem | Smem = LO(ACx << Tx) |
| MOV [rnd(]HI(ACx << Tx)[)], Smem | Smem = HI(rnd(ACx << Tx)) |
| MOV ACx << \#SHIFTW, Smem | Smem = LO(ACx << \#SHIFTW) |
| MOV HI(ACx << \#SHIFTW), Smem | Smem = HI(ACx << \#SHIFTW) |
| MOV [rnd(]HI(ACx << \#SHIFTW)[)], Smem | Smem = HI(rnd(ACx << \#SHIFTW)) |
| MOV [uns(] [rnd(]HI[(saturate](ACx)[)))], Smem | Smem = HI(saturate(uns(rnd(ACx)))) |
| MOV [uns(] [rnd(]HI[(saturate](ACx << Tx)[)))], Smem | Smem = HI(saturate(uns(rnd(ACx << Tx)))) |
| MOV [uns(] [rnd(]HI[(saturate](ACx << \#SHIFTW)[)))], Smem | Smem = HI(saturate(uns(rnd(ACx << \#SHIFTW)))) |
| MOV ACx, dbl(Lmem) | dbl(Lmem) = ACx |
| MOV [uns(]saturate(ACx)[)], dbl(Lmem) | dbl(Lmem) = saturate(uns(ACx)) |
| MOV ACx >> \#1, dual(Lmem) | HI(Lmem) = HI(ACx) >> \#1, <br> LO(Lmem) = LO(ACx) >> \#1 |
| MOV ACx, Xmem, Ymem | Xmem = LO(ACx), <br> Ymem = HI(ACx) |


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MOV: Store Accumulator Pair Content to Memory | Store Accumulator Pair Content to Memory |
| MOV pair(HI(ACx)), dbl(Lmem) | Lmem = pair(HI(ACx)) |
| MOV pair(LO(ACx)), dbl(Lmem) | Lmem = pair(LO(ACx)) |
| MOV: Store Accumulator, Auxiliary, or Temporary Register Content to Memory | Store Accumulator, Auxiliary, or Temporary Register Content to Memory |
| MOV src, Smem | Smem = src |
| MOV src, high_byte(Smem) | high_byte(Smem) = src |
| MOV src, low_byte(Smem) | low_byte(Smem) = src |
| MOV: Store Auxiliary or Temporary Register Pair Content to Memory | Store Auxiliary or Temporary Register Pair Content to Memory |
| MOV pair(TAx), dbl(Lmem) | Lmem = pair(TAx) |
| MOV: Store CPU Register Content to Memory | Store CPU Register Content to Memory |
| MOV BK03, Smem | Smem = BK03 |
| MOV BK47, Smem | Smem = BK47 |
| MOV BKC, Smem | Smem = BKC |
| MOV BSA01, Smem | Smem = BSA01 |
| MOV BSA23, Smem | Smem = BSA23 |
| MOV BSA45, Smem | Smem = BSA45 |
| MOV BSA67, Smem | Smem = BSA67 |
| MOV BSAC, Smem | Smem = BSAC |
| MOV BRC0, Smem | Smem = BRC0 |
| MOV BRC1, Smem | Smem = BRC1 |


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MOV CDP, Smem | Smem = CDP |
| MOV CSR, Smem | Smem = CSR |
| MOV DP, Smem | Smem = DP |
| MOV DPH, Smem | Smem = DPH |
| MOV PDP, Smem | Smem = PDP |
| MOV SP, Smem | Smem = SP |
| MOV SSP, Smem | Smem = SSP |
| MOV TRN0, Smem | Smem = TRN0 |
| MOV TRN1, Smem | Smem = TRN1 |
| MOV RETA, dbl(Lmem) | dbl(Lmem) = RETA |
| MOV: Store Extended Auxiliary Register Content to Memory | Store Extended Auxiliary Register Content to Memory |
| MOV XAsrc, dbl(Lmem) | dbl(Lmem) = XAsrc |
| MOV::MOV: Load Accumulator from Memory with Parallel Store Accumulator Content to Memory | Load Accumulator from Memory with Parallel Store Accumulator Content to Memory |
| MOV Xmem << \#16, ACy <br> :: MOV HI(ACx << T2), Ymem | ACy = Xmem << \#16, <br> Ymem = HI(ACx << T2) |
| MPY: Multiply | Multiply |
| MPY[R] [ACx,] ACy | ACy = rnd(ACy * ACx) |
| MPY[R] Tx, [ACx,] ACy | ACy = rnd(ACx * Tx) |
| MPYK[R] K8, [ACx,] ACy | ACy = rnd(ACx * K8) |
| MPYK[R] K16, [ACx,] ACy | ACy = rnd(ACx * K16) |
| MPY[R] Smem, uns(Cmem), ACx | ACx = rnd(Smem * uns(coef(Cmem))) |

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MPYM[R] [T3 = ]Smem, Cmem, ACx | ACx = rnd(Smem * coef(Cmem))[, T3 = Smem] |
| MPYM[R] [T3 = ]Smem, [ACx,] ACy | ACy = rnd(Smem * ACx)[, T3 = Smem] |
| MPYMK[R] [T3 = ]Smem, K8, ACx | ACx = rnd(Smem * K8)[, T3 = Smem] |
| MPYM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], ACx | ACx = M40(rnd(uns(Xmem) * uns(Ymem)))[, T3 = Xmem] |
| MPYM[R][U] [T3 = ]Smem, Tx, ACx | ACx = rnd(uns(Tx * Smem))[, T3 = Smem] |
| MPY::MAC: Multiply with Parallel Multiply and Accumulate | Multiply with Parallel Multiply and Accumulate |
| MPY[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx <br> :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16 | ACx = M40(rnd(uns(Xmem) * uns(coef(Cmem)))), <br> ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem))))) |
| MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))), <br> ACx = M40(rnd(ACx + (uns(Smem) * uns(LO(coef(Cmem)))))) |
| MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(uns(HI(Lmem)) * uns(HI(coef(Cmem))))), <br> ACx = M40(rnd(ACx + (uns(LO(Lmem)) * uns(LO(coef(Cmem)))))) |
| MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, :: MAC[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx |  |
| MPY::MAS: Multiply with Parallel Multiply and Subtract | Multiply with Parallel Multiply and Subtract |
| MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))), <br> ACx = M40(rnd(ACx - (uns(Smem) * uns(LO(coef(Cmem)))))) |
| MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx |  |
| MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, :: MAS[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx |  |


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| MPY::MPY: Parallel Multiplies | Parallel Multiplies |
| MPY[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy |  |
| MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))), <br> ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem))))) |
| MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy <br> :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(uns(HI(Lmem)) * uns(HI(coef(Cmem))))), <br> ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem))))) |
| MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, <br> :: MPY[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx | ACy = M40(rnd(uns(Ymem) * uns(HI(coef(Cmem))))), <br> ACx = M40(rnd(uns(Xmem) * uns(LO(coef(Cmem))))) |
| MPYM::MOV: Multiply with Parallel Store Accumulator Content to Memory | Multiply with Parallel Store Accumulator Content to Memory |
| MPYM[R] [T3 = ]Xmem, Tx, ACy <br> :: MOV HI(ACx << T2), Ymem | ACy = rnd(Tx * Xmem), <br> Ymem = HI(ACx << T2)[, T3 = Xmem] |
| NEG: Negate Accumulator, Auxiliary, or Temporary Register Content | Negate Accumulator, Auxiliary, or Temporary Register Content |
| NEG [src,] dst | dst = -src |
| NOP: No Operation | No Operation |
| NOP | nop |
| NOP_16 | nop_16 |
| NOT: Complement Accumulator, Auxiliary, or Temporary Register Content | Complement Accumulator, Auxiliary, or Temporary Register Content |
| NOT [src,] dst | dst = ~src |

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| OR: Bitwise OR | Bitwise OR |
| OR src, dst | dst = dst \| src |
| OR k8, src, dst | dst = src \| k8 |
| OR k16, src, dst | dst = src \| k16 |
| OR Smem, src, dst | dst = src \| Smem |
| OR ACx << \#SHIFTW[, ACy] | ACy = ACy \| (ACx <<< \#SHIFTW) |
| OR k16 << \#16, [ACx,] ACy | ACy = ACx \| (k16 <<< \#16) |
| OR k16 << \#SHFT, [ACx,] ACy | ACy = ACx \| (k16 <<< \#SHFT) |
| OR k16, Smem | Smem = Smem \| k16 |
| POP: Pop Top of Stack | Pop Top of Stack |
| POP dst1, dst2 | dst1, dst2 = pop() |
| POP dst | dst = pop() |
| POP dst, Smem | dst, Smem = pop() |
| POP ACx | ACx = dbl(pop()) |
| POP Smem | Smem = pop() |
| POP dbl(Lmem) | dbl(Lmem) = pop() |
| POPBOTH: Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers | Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers |
| POPBOTH xdst | xdst = popboth() |


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| port: Peripheral Port Register Access Qualifiers | Peripheral Port Register Access Qualifiers |
| port(Smem) | readport() |
| port(Smem) | writeport() |
| PSH: Push to Top of Stack | Push to Top of Stack |
| PSH src1, src2 | push(src1, src2) |
| PSH src | push(src) |
| PSH src, Smem | push(src, Smem) |
| PSH ACx | dbl(push(ACx)) |
| PSH Smem | push(Smem) |
| PSH dbl(Lmem) | push(dbl(Lmem)) |
| PSHBOTH: Push Accumulator or Extended Auxiliary Register Content to Stack Pointers | Push Accumulator or Extended Auxiliary Register Content to Stack Pointers |
| PSHBOTH xsrc | pshboth(xsrc) |
| RESET: Software Reset | Software Reset |
| RESET | reset |
| RET: Return Unconditionally | Return Unconditionally |
| RET | return |
| RETCC: Return Conditionally | Return Conditionally |
| RETCC cond | if (cond) return |
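
The PSH and POP rows above pair up as inverse stack operations. The sketch below is a rough Python model for orientation only. It assumes a 16-bit data stack pointer that is decremented before a push and incremented after a pop, with the first operand of a two-operand form at the lower address; these details are assumptions to be checked against the individual instruction descriptions. The class `StackModel` and its method names are hypothetical.

```python
class StackModel:
    """Toy word-addressed data stack; SP handling here is an assumption."""

    def __init__(self, sp):
        self.sp = sp & 0xFFFF
        self.mem = {}  # sparse word memory: address -> 16-bit value

    def psh(self, src):
        """PSH src: make room for one word, then store it at the new top."""
        self.sp = (self.sp - 1) & 0xFFFF
        self.mem[self.sp] = src & 0xFFFF

    def pop(self):
        """POP dst: read the top word, then release it."""
        value = self.mem[self.sp]
        self.sp = (self.sp + 1) & 0xFFFF
        return value

    def psh2(self, src1, src2):
        """PSH src1, src2: src1 lands at the new top, src2 just above it."""
        self.sp = (self.sp - 2) & 0xFFFF
        self.mem[self.sp] = src1 & 0xFFFF
        self.mem[(self.sp + 1) & 0xFFFF] = src2 & 0xFFFF

    def pop2(self):
        """POP dst1, dst2: the inverse of psh2."""
        dst1 = self.mem[self.sp]
        dst2 = self.mem[(self.sp + 1) & 0xFFFF]
        self.sp = (self.sp + 2) & 0xFFFF
        return dst1, dst2
```

In this model a `psh2` followed by a `pop2` returns the operands in their original order and restores the stack pointer.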

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| RETI: Return from Interrupt | Return from Interrupt |
| RETI | return_int |
| ROL: Rotate Left Accumulator, Auxiliary, or Temporary Register Content | Rotate Left Accumulator, Auxiliary, or Temporary Register Content |
| ROL BitOut, src, BitIn, dst | `dst = BitOut \\ src \\ BitIn` |
| ROR: Rotate Right Accumulator, Auxiliary, or Temporary Register Content | Rotate Right Accumulator, Auxiliary, or Temporary Register Content |
| ROR BitIn, src, BitOut, dst | dst = BitIn // src // BitOut |
| ROUND: Round Accumulator Content | Round Accumulator Content |
| ROUND [ACx,] ACy | ACy = rnd(ACx) |
| RPT: Repeat Single Instruction Unconditionally | Repeat Single Instruction Unconditionally |
| RPT k8 | repeat(k8) |
| RPT k16 | repeat(k16) |
| RPT CSR | repeat(CSR) |
| RPTADD: Repeat Single Instruction Unconditionally and Increment CSR | Repeat Single Instruction Unconditionally and Increment CSR |
| RPTADD CSR, TAx | repeat(CSR), CSR += TAx |
| RPTADD CSR, k4 | repeat(CSR), CSR += k4 |
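
The ROL and ROR rows above describe a one-bit rotate through two named bits. As a hedged illustration, the Python sketch below assumes BitIn supplies the bit shifted into the vacated position and BitOut receives the bit shifted out; the register width is a parameter because the operand may be a 16-bit register or an accumulator. The function names `rol` and `ror` are hypothetical.

```python
def rol(src, bit_in, width=16):
    """Rotate left by one: MSB goes to BitOut, BitIn enters at the LSB."""
    mask = (1 << width) - 1
    src &= mask
    bit_out = (src >> (width - 1)) & 1
    dst = ((src << 1) & mask) | (bit_in & 1)
    return dst, bit_out


def ror(src, bit_in, width=16):
    """Rotate right by one: LSB goes to BitOut, BitIn enters at the MSB."""
    mask = (1 << width) - 1
    src &= mask
    bit_out = src & 1
    dst = (src >> 1) | ((bit_in & 1) << (width - 1))
    return dst, bit_out
```

Each function returns the pair `(dst, bit_out)`, so a caller can feed `bit_out` back in as the next `bit_in` to chain rotates across several words.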


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| RPTB: Repeat Block of Instructions Unconditionally | Repeat Block of Instructions Unconditionally |
| RPTBLOCAL pmad | localrepeat \{ \} |
| RPTB pmad | blockrepeat \{ \} |
| RPTCC: Repeat Single Instruction Conditionally | Repeat Single Instruction Conditionally |
| RPTCC k8, cond | while (cond && (RPTC < k8)) repeat |
| RPTSUB: Repeat Single Instruction Unconditionally and Decrement CSR | Repeat Single Instruction Unconditionally and Decrement CSR |
| RPTSUB CSR, k4 | repeat(CSR), CSR -= k4 |
| SAT: Saturate Accumulator Content | Saturate Accumulator Content |
| SAT[R] [ACx,] ACy | ACy = saturate(rnd(ACx)) |
| SFTCC: Shift Accumulator Content Conditionally | Shift Accumulator Content Conditionally |
| SFTCC ACx, TCx | ACx = sftc(ACx, TCx) |
| SFTL: Shift Accumulator Content Logically | Shift Accumulator Content Logically |
| SFTL ACx, Tx[, ACy] | ACy = ACx <<< Tx |
| SFTL ACx, \#SHIFTW[, ACy] | ACy = ACx <<< \#SHIFTW |
| SFTL: Shift Accumulator, Auxiliary, or Temporary Register Content Logically | Shift Accumulator, Auxiliary, or Temporary Register Content Logically |
| SFTL dst, \#1 | dst = dst <<< \#1 |
| SFTL dst, \#-1 | dst = dst >>> \#1 |
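
In the algebraic column, `<<<` and `>>>` mark the logical shifts performed by SFTL. The Python sketch below contrasts a one-bit logical shift with a sign-preserving (arithmetic) right shift on a 16-bit value. It is an illustrative model only: the 16-bit width and the function names are assumptions, and carry or overflow status updates are not modeled.

```python
MASK16 = 0xFFFF


def sftl_left1(dst):
    """Logical shift left by one (dst <<< #1): a zero enters at bit 0."""
    return (dst << 1) & MASK16


def sftl_right1(dst):
    """Logical shift right by one (dst >>> #1): a zero enters at bit 15."""
    return (dst & MASK16) >> 1


def arithmetic_right1(dst):
    """Sign-preserving right shift by one, shown only for contrast."""
    dst &= MASK16
    return (dst >> 1) | (dst & 0x8000)
```

For the input `0x8000`, the logical right shift gives `0x4000` while the sign-preserving shift gives `0xC000`, which is the practical difference between the two shift families.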

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| SFTS: Signed Shift of Accumulator Content | Signed Shift of Accumulator Content |
| SFTS ACx, Tx[, ACy] | ACy = ACx << Tx |
| SFTS ACx, \#SHIFTW[, ACy] | ACy = ACx << \#SHIFTW |
| SFTSC ACx, Tx[, ACy] | ACy = ACx <<C Tx |
| SFTSC ACx, \#SHIFTW[, ACy] | ACy = ACx <<C \#SHIFTW |
| SFTS: Signed Shift of Accumulator, Auxiliary, or Temporary Register Content | Signed Shift of Accumulator, Auxiliary, or Temporary Register Content |
| SFTS dst, \#-1 | dst = dst >> \#1 |
| SFTS dst, \#1 | dst = dst << \#1 |
| SQA: Square and Accumulate | Square and Accumulate |
| SQA[R] [ACx,] ACy | ACy = rnd(ACy + (ACx * ACx)) |
| SQAM[R] [T3 = ]Smem, [ACx,] ACy | ACy = rnd(ACx + (Smem * Smem))[, T3 = Smem] |
| SQDST: Square Distance | Square Distance |
| SQDST Xmem, Ymem, ACx, ACy | sqdst(Xmem, Ymem, ACx, ACy) |
| SQR: Square | Square |
| SQR[R] [ACx,] ACy | ACy = rnd(ACx * ACx) |
| SQRM[R] [T3 = ]Smem, ACx | ACx = rnd(Smem * Smem)[, T3 = Smem] |
| SQS: Square and Subtract | Square and Subtract |
| SQS[R] [ACx,] ACy | ACy = rnd(ACy - (ACx * ACx)) |
| SQSM[R] [T3 = ]Smem, [ACx,] ACy | ACy = rnd(ACx - (Smem * Smem))[, T3 = Smem] |


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| SUB: Dual 16-Bit Subtractions | Dual 16-Bit Subtractions |
| SUB dual(Lmem), [ACx,] ACy | HI(ACy) = HI(ACx) - HI(Lmem), <br> LO(ACy) = LO(ACx) - LO(Lmem) |
| SUB ACx, dual(Lmem), ACy | HI(ACy) = HI(Lmem) - HI(ACx), <br> LO(ACy) = LO(Lmem) - LO(ACx) |
| SUB dual(Lmem), Tx, ACx | HI(ACx) = Tx - HI(Lmem), <br> LO(ACx) = Tx - LO(Lmem) |
| SUB Tx, dual(Lmem), ACx | HI(ACx) = HI(Lmem) - Tx, <br> LO(ACx) = LO(Lmem) - Tx |
| SUB: Subtraction | Subtraction |
| SUB [src,] dst | dst = dst - src |
| SUB k4, dst | dst = dst - k4 |
| SUB K16, [src,] dst | dst = src - K16 |
| SUB Smem, [src,] dst | dst = src - Smem |
| SUB src, Smem, dst | dst = Smem - src |
| SUB ACx << Tx, ACy | ACy = ACy - (ACx << Tx) |
| SUB ACx << \#SHIFTW, ACy | ACy = ACy - (ACx << \#SHIFTW) |
| SUB K16 << \#16, [ACx,] ACy | ACy = ACx - (K16 << \#16) |
| SUB K16 << \#SHFT, [ACx,] ACy | ACy = ACx - (K16 << \#SHFT) |
| SUB Smem << Tx, [ACx,] ACy | ACy = ACx - (Smem << Tx) |
| SUB Smem << \#16, [ACx,] ACy | ACy = ACx - (Smem << \#16) |
| SUB ACx, Smem << \#16, ACy | ACy = (Smem << \#16) - ACx |
| SUB [uns(]Smem[)], BORROW, [ACx,] ACy | ACy = ACx - uns(Smem) - BORROW |
| SUB [uns(]Smem[)], [ACx,] ACy | ACy = ACx - uns(Smem) |
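
The dual 16-bit forms at the top of this table operate on the high and low halves of the accumulator independently. A minimal Python sketch of `SUB dual(Lmem), ACx, ACy` follows, assuming plain wrap-around arithmetic in each half: saturation, guard bits, and status-bit updates are deliberately left out, and the function name `sub_dual` is hypothetical.

```python
def sub_dual(acx, lmem):
    """Model of HI(ACy) = HI(ACx) - HI(Lmem), LO(ACy) = LO(ACx) - LO(Lmem)."""
    hi = ((acx >> 16) - (lmem >> 16)) & 0xFFFF
    lo = ((acx & 0xFFFF) - (lmem & 0xFFFF)) & 0xFFFF
    # No borrow crosses from the low half into the high half.
    return (hi << 16) | lo
```

With `acx = 0x00050003` and `lmem = 0x00010004`, the high half becomes 4 and the low half wraps to `0xFFFF`. A single 32-bit subtraction of the same operands would instead borrow from the high half.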


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| SUB [uns(]Smem[)] << \#SHIFTW, [ACx,] ACy | ACy = ACx - (uns(Smem) << \#SHIFTW) |
| SUB dbl(Lmem), [ACx,] ACy | ACy = ACx - dbl(Lmem) |
| SUB ACx, dbl(Lmem), ACy | ACy = dbl(Lmem) - ACx |
| SUB Xmem, Ymem, ACx | ACx = (Xmem << \#16) - (Ymem << \#16) |
| SUB::MOV: Subtraction with Parallel Store Accumulator Content to Memory | Subtraction with Parallel Store Accumulator Content to Memory |
| SUB Xmem << \#16, ACx, ACy <br> :: MOV HI(ACy << T2), Ymem | ACy = (Xmem << \#16) - ACx, <br> Ymem = HI(ACy << T2) |
| SUBADD: Dual 16-Bit Subtraction and Addition | Dual 16-Bit Subtraction and Addition |
| SUBADD Tx, Smem, ACx | HI(ACx) = Smem - Tx, <br> LO(ACx) = Smem + Tx |
| SUBADD Tx, dual(Lmem), ACx | HI(ACx) = HI(Lmem) - Tx, <br> LO(ACx) = LO(Lmem) + Tx |
| SUBC: Subtract Conditionally | Subtract Conditionally |
| SUBC Smem, [ACx,] ACy | subc(Smem, ACx, ACy) |
| SWAP: Swap Accumulator Content | Swap Accumulator Content |
| SWAP ACx, ACy | swap(ACx, ACy) |
| SWAP: Swap Auxiliary Register Content | Swap Auxiliary Register Content |
| SWAP ARx, ARy | swap(ARx, ARy) |


| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| SWAP: Swap Auxiliary and Temporary Register Content | Swap Auxiliary and Temporary Register Content |
| SWAP ARx, Tx | swap(ARx, Tx) |
| SWAP: Swap Temporary Register Content | Swap Temporary Register Content |
| SWAP Tx, Ty | swap(Tx, Ty) |
| SWAPP: Swap Accumulator Pair Content | Swap Accumulator Pair Content |
| SWAPP AC0, AC2 | swap(pair(AC0), pair(AC2)) |
| SWAPP: Swap Auxiliary Register Pair Content | Swap Auxiliary Register Pair Content |
| SWAPP AR0, AR2 | swap(pair(AR0), pair(AR2)) |
| SWAPP: Swap Auxiliary and Temporary Register Pair Content | Swap Auxiliary and Temporary Register Pair Content |
| SWAPP ARx, Tx | swap(pair(ARx), pair(Tx)) |
| SWAPP: Swap Temporary Register Pair Content | Swap Temporary Register Pair Content |
| SWAPP T0, T2 | swap(pair(T0), pair(T2)) |
| SWAP4: Swap Auxiliary and Temporary Register Pairs Content | Swap Auxiliary and Temporary Register Pairs Content |
| SWAP4 AR4, T0 | swap(block(AR4), block(T0)) |
| TRAP: Software Trap | Software Trap |
| TRAP k5 | trap(k5) |

Table 7-1. Cross-Reference of Mnemonic and Algebraic Instruction Sets (Continued)

| Mnemonic Syntax | Algebraic Syntax |
| :---: | :---: |
| XCC: Execute Conditionally | Execute Conditionally |
| XCC [label, ]cond | if (cond) execute(AD_Unit) |
| XCCPART [label, ]cond | if (cond) execute(D_Unit) |
| XOR: Bitwise Exclusive OR (XOR) | Bitwise Exclusive OR (XOR) |
| XOR src, dst | dst = dst ^ src |
| XOR k8, src, dst | dst = src ^ k8 |
| XOR k16, src, dst | dst = src ^ k16 |
| XOR Smem, src, dst | dst = src ^ Smem |
| XOR ACx << \#SHIFTW[, ACy] | ACy = ACy ^ (ACx <<< \#SHIFTW) |
| XOR k16 << \#16, [ACx,] ACy | ACy = ACx ^ (k16 <<< \#16) |
| XOR k16 << \#SHFT, [ACx,] ACy | ACy = ACx ^ (k16 <<< \#SHFT) |
| XOR k16, Smem | Smem = Smem ^ k16 |

## Index

## A

AADD 5-2, 5-6, 5-7
ABDST 5-9
ABS 5-11
absolute addressing modes 3-3
I/O absolute 3-3
k16 absolute 3-3
k23 absolute 3-3
Absolute Distance (ABDST) 5-9
Absolute Value (ABS) 5-11
ADD 5-14, 5-35
ADD::MOV 5-40
Addition (ADD) 5-14
Addition or Subtraction Conditionally (ADDSUBCC) 5-47
Addition or Subtraction Conditionally with Shift (ADDSUB2CC) 5-51
Addition with Absolute Value (ADDV) 5-54
Addition with Parallel Store Accumulator Content to Memory (ADD::MOV) 5-40
Addition, Subtraction, or Move Accumulator Content Conditionally (ADDSUBCC) 5-49
addressing modes
absolute 3-3
direct 3-4
indirect 3-6
introduction 3-2
ADDSUB 5-42
ADDSUB2CC 5-51
ADDSUBCC 5-47, 5-49
ADDV 5-54
affect of status bits 1-9
algebraic instruction set cross-reference to mnemonic instruction set 7-1
AMAR 5-56, 5-58, 5-59

AMAR::MAC 5-60
AMAR::MAS 5-65
AMAR::MPY 5-67
AMOV 5-69, 5-70, 5-74
AND 5-76
Antisymmetrical Finite Impulse Response Filter (FIRSSUB) 5-160
arithmetic
absolute distance 5-9
absolute value 5-11
addition 5-14
addition or subtraction conditionally 5-47, 5-49
addition or subtraction conditionally with shift 5-51
addition with absolute value 5-54
compare memory with immediate value 5-141
compute exponent of accumulator content 5-157
compute mantissa and exponent of accumulator content 5-272
dual 16-bit addition and subtraction 5-42
dual 16-bit additions 5-35
dual 16-bit subtraction and addition 5-626
dual 16-bit subtractions 5-590
finite impulse response filter, antisymmetrical 5-160
finite impulse response filter, symmetrical 5-158
least mean square 5-167, 5-169
multiply 5-430
multiply and accumulate 5-174
multiply and subtract 5-274
negation 5-483
round accumulator content 5-528
saturate accumulator content 5-555
square 5-584
square and accumulate 5-579
square and subtract 5-587
square distance 5-582

subtract conditionally 5-631
subtraction 5-599
ASUB 5-85, 5-89

## B

B 5-91
BAND 5-95
BCC 5-96, 5-100, 5-103
BCLR 5-106, 5-107, 5-108
BCNT 5-111
BFXPA 5-112
BFXTR 5-113
bit field comparison 5-95
bit field counting 5-111
bit field expand 5-112
bit field extract 5-113
bit manipulation
bitwise AND memory with immediate value and compare to zero 5-95
clear accumulator, auxiliary, or temporary register bit 5-106
clear memory bit 5-107
clear status register bit 5-108
complement accumulator, auxiliary, or temporary register bit 5-114
complement accumulator, auxiliary, or temporary register content 5-486
complement memory bit 5-115
expand accumulator bit field 5-112
extract accumulator bit field 5-113
set accumulator, auxiliary, or temporary register bit 5-116
set memory bit 5-117
set status register bit 5-118
test accumulator, auxiliary, or temporary register bit 5-121
test accumulator, auxiliary, or temporary register bit pair 5-128
test and clear memory bit 5-126
test and complement memory bit 5-127
test and set memory bit 5-130
test memory bit 5-123
Bitwise AND 5-76
Bitwise AND Memory with Immediate Value and Compare to Zero (BAND) 5-95
bitwise complement 5-486

Bitwise Exclusive OR (XOR) 5-655
Bitwise OR 5-487
BNOT 5-114, 5-115
branch
conditionally 5-96
on auxiliary register not zero 5-100
unconditionally 5-91
Branch Conditionally (BCC) 5-96
Branch on Auxiliary Register Not Zero (BCC) 5-100
Branch Unconditionally (B) 5-91
BSET 5-116, 5-117, 5-118
BTST 5-121, 5-123
BTSTCLR 5-126
BTSTNOT 5-127
BTSTP 5-128
BTSTSET 5-130

## C

.CR 5-155
CALL 5-131
call
conditionally 5-135
unconditionally 5-131
Call Conditionally (CALLCC) 5-135
Call Unconditionally (CALL) 5-131
CALLCC 5-135
circular addressing 3-21
Circular Addressing Qualifier (.CR) 5-155
clear
accumulator bit 5-106
auxiliary register bit 5-106
memory bit 5-107
status register bit 5-108
temporary register bit 5-106
Clear Accumulator Bit (BCLR) 5-106
Clear Auxiliary Register Bit (BCLR) 5-106
Clear Memory Bit (BCLR) 5-107
Clear Status Register Bit (BCLR) 5-108
Clear Temporary Register Bit (BCLR) 5-106
CMP 5-141, 5-143
CMPAND 5-145
CMPOR 5-150
compare
accumulator, auxiliary, or temporary register content 5-143
accumulator, auxiliary, or temporary register content maximum 5-322
accumulator, auxiliary, or temporary register content minimum 5-331
accumulator, auxiliary, or temporary register content with AND 5-145
accumulator, auxiliary, or temporary register content with OR 5-150
and branch 5-103
and select accumulator content maximum 5-325
and select accumulator content minimum 5-334
memory with immediate value 5-141
Compare Accumulator Content (CMP) 5-143
Compare Accumulator Content Maximum (MAX) 5-322
Compare Accumulator Content Minimum (MIN) 5-331
Compare Accumulator Content with AND (CMPAND) 5-145
Compare Accumulator Content with OR (CMPOR) 5-150
Compare and Branch (BCC) 5-103
Compare and Select Accumulator Content Maximum (MAXDIFF) 5-325
Compare and Select Accumulator Content Minimum (MINDIFF) 5-334
Compare Auxiliary Register Content (CMP) 5-143
Compare Auxiliary Register Content Maximum (MAX) 5-322
Compare Auxiliary Register Content Minimum (MIN) 5-331
Compare Auxiliary Register Content with AND (CMPAND) 5-145
Compare Auxiliary Register Content with OR (CMPOR) 5-150
compare maximum 5-322
Compare Memory with Immediate Value (CMP) 5-141
compare minimum 5-331
Compare Temporary Register Content (CMP) 5-143
Compare Temporary Register Content Maximum (MAX) 5-322
Compare Temporary Register Content Minimum (MIN) 5-331

Compare Temporary Register Content with AND (CMPAND) 5-145
Compare Temporary Register Content with OR (CMPOR) 5-150
complement
accumulator bit 5-114
accumulator content 5-486
auxiliary register bit 5-114
auxiliary register content 5-486
memory bit 5-115
temporary register bit 5-114
temporary register content 5-486
Complement Accumulator Bit (BNOT) 5-114
Complement Accumulator Content (NOT) 5-486
Complement Auxiliary Register Bit (BNOT) 5-114
Complement Auxiliary Register Content (NOT) 5-486
Complement Memory Bit (BNOT) 5-115
Complement Temporary Register Bit (BNOT) 5-114
Complement Temporary Register Content (NOT) 5-486
Compute Exponent of Accumulator Content (EXP) 5-157
Compute Mantissa and Exponent of Accumulator Content (MANT::NEXP) 5-272
cond field 1-7
conditional
addition or subtraction 5-47
addition or subtraction with shift 5-51
addition, subtraction, or move accumulator content 5-49
branch 5-96
call 5-135
execute 5-648
repeat single instruction 5-550
return 5-520
shift 5-557
subtract 5-631
Count Accumulator Bits (BCNT) 5-111
Cross-Reference to Algebraic and Mnemonic Instruction Sets 7-1

## D

DELAY 5-156
direct addressing modes 3-4
DP direct 3-4

PDP direct 3-5
register-bit direct 3-5
SP direct 3-5
DMAXDIFF 5-325
DMINDIFF 5-334
Dual 16-Bit Addition and Subtraction (ADDSUB) 5-42
Dual 16-Bit Additions (ADD) 5-35
dual 16-bit arithmetic
addition and subtraction 5-42
additions 5-35
subtraction and addition 5-626
subtractions 5-590
Dual 16-Bit Subtraction and Addition (SUBADD) 5-626
Dual 16-Bit Subtractions (SUB) 5-590

## E

Execute Conditionally (XCC) 5-648
EXP 5-157
Expand Accumulator Bit Field (BFXPA) 5-112
extended auxiliary register (XAR)
load from memory 5-373
load with immediate value 5-69
modify content 5-58, 5-74
modify content by addition 5-7
modify content by subtraction 5-89
move content 5-383
pop content from stack pointers 5-503
push content to stack pointers 5-513
store to memory 5-427
Extract Accumulator Bit Field (BFXTR) 5-113

## F

finite impulse response (FIR) filter
antisymmetrical 5-160
symmetrical 5-158
FIRSADD 5-158
FIRSSUB 5-160

## I

IDLE 5-162
indirect addressing modes 3-6
AR indirect 3-6

CDP indirect 3-16
coefficient indirect 3-19
dual AR indirect 3-14
initialize memory 5-374
instruction qualifier
circular addressing 5-155
linear addressing 5-173
memory-mapped register access 5-340
instruction set
abbreviations 1-2
affect of status bits 1-9
conditional fields 1-7
nonrepeatable instructions 1-21
notes 1-14
opcode symbols and abbreviations 6-18
opcodes 6-2
operators 1-6
rules 1-14
symbols 1-2
terms 1-2
instruction set conditional fields 1-7
instruction set notes and rules 1-14
instruction set opcode
abbreviations 6-18
symbols 6-18
instruction set opcodes 6-2
instruction set summary 4-1
instruction set terms, symbols, and abbreviations 1-2
Interrupt (INTR) 5-163
INTR 5-163

## L


Least Mean Square (LMS) 5-167
Least Mean Square (LMSF) 5-169
Linear Addressing Qualifier (.LR) 5-173
List of Mnemonic Instruction Opcodes
(Sequentially) 6-1
LMS 5-167
LMSF 5-169
load
accumulator from memory 5-342
accumulator from memory with parallel store accumulator content to memory 5-428
accumulator pair from memory 5-351
accumulator with immediate value 5-354
accumulator, auxiliary, or temporary register from memory 5-357
accumulator, auxiliary, or temporary register with immediate value 5-363
auxiliary or temporary register pair from memory 5-367
CPU register from memory 5-368
CPU register with immediate value 5-371
extended auxiliary register (XAR) from memory 5-373
extended auxiliary register (XAR) with immediate value 5-69
memory with immediate value 5-374
Load Accumulator from Memory (MOV) 5-342, 5-357
Load Accumulator from Memory with Parallel Store Accumulator Content to Memory (MOV::MOV) 5-428
Load Accumulator Pair from Memory (MOV) 5-351
Load Accumulator with Immediate Value (MOV) 5-354
Load Auxiliary Register from Memory (MOV) 5-357
Load Auxiliary Register Pair from Memory (MOV) 5-367
Load Auxiliary Register with Immediate Value (MOV) 5-363
Load CPU Register from Memory (MOV) 5-368
Load CPU Register with Immediate Value (MOV) 5-371
Load Extended Auxiliary Register from Memory (MOV) 5-373
Load Extended Auxiliary Register with Immediate Value (AMOV) 5-69
Load Memory with Immediate Value (MOV) 5-374
Load Temporary Register from Memory (MOV) 5-357
Load Temporary Register Pair from Memory (MOV) 5-367
Load Temporary Register with Immediate Value (MOV) 5-363
lock, access qualifier 5-165
Lock Access Qualifier (.LK) 5-165
logical
bitwise AND 5-76
bitwise OR 5-487
bitwise XOR 5-655
count accumulator bits 5-111
shift accumulator content logically 5-559
shift accumulator, auxiliary, or temporary register content logically 5-562

## M

MAC 5-174
MAC::MAC 5-193
MAC::MAS 5-228
MAC::MPY 5-248
MACK 5-174
MACM 5-174
MACM::MOV 5-267, 5-269
MACMK 5-174
MACMZ 5-191
MANT::NEXP 5-272
MAS 5-274
MAS::MAC 5-286
MAS::MAS 5-297
MAS::MPY 5-309
MASM 5-274
MASM::MOV 5-318, 5-320
MAX 5-322
MAXDIFF 5-325
memory bit
clear 5-107
complement (not) 5-115
set 5-117
test 5-123
test and clear 5-126
test and complement 5-127
test and set 5-130
Memory Delay (DELAY) 5-156
Memory-Mapped Register Access Qualifier (mmap) 5-340
MIN 5-331
MINDIFF 5-334
mmap 5-340
mnemonic instruction set cross-reference to algebraic instruction set 7-1
modify
auxiliary or temporary register content 5-70
auxiliary or temporary register content by addition 5-2
auxiliary or temporary register content by subtraction 5-85
auxiliary register content 5-56
auxiliary register content with parallel multiply 5-67
auxiliary register content with parallel multiply and accumulate 5-60
auxiliary register content with parallel multiply and subtract 5-65
data stack pointer 5-6
extended auxiliary register (XAR) content 5-58, 5-74
extended auxiliary register (XAR) content by addition 5-7
extended auxiliary register (XAR) content by subtraction 5-89
Modify Auxiliary Register Content (AMOV) 5-70
Modify Auxiliary Register Content (AMAR) 5-56
Modify Auxiliary Register Content by Addition (AADD) 5-2
Modify Auxiliary Register Content by Subtraction (ASUB) 5-85
Modify Auxiliary Register Content with Parallel Multiply (AMAR::MPY) 5-67
Modify Auxiliary Register Content with Parallel Multiply and Accumulate (AMAR::MAC) 5-60
Modify Auxiliary Register Content with Parallel Multiply and Subtract (AMAR::MAS) 5-65
Modify Data Stack Pointer (AADD) 5-6
Modify Extended Auxiliary Register Content (AMAR) 5-58
Modify Extended Auxiliary Register Content (AMOV) 5-74
Modify Extended Auxiliary Register Content by Addition (AADD) 5-7
Modify Extended Auxiliary Register Content by Subtraction (ASUB) 5-89
Modify Temporary Register Content (AMOV) 5-70
Modify Temporary Register Content by Addition (AADD) 5-2
Modify Temporary Register Content by Subtraction (ASUB) 5-85


MOV 5-342, 5-351, 5-354, 5-357, 5-363, 5-367, 5-368, 5-371, 5-373, 5-374, 5-375, 5-376, 5-378, 5-379, 5-381, 5-383, 5-384, 5-391, 5-415, 5-418, 5-422, 5-423, 5-427

MOV::MOV 5-428
move
accumulator content to auxiliary or temporary register 5-375
accumulator, auxiliary, or temporary register content 5-376
auxiliary or temporary register content to accumulator 5-378
auxiliary or temporary register content to CPU register 5-379
CPU register content to auxiliary or temporary register 5-381
extended auxiliary register content 5-383
memory delay 5-156

memory to memory 5-384

pop accumulator or extended auxiliary register content from stack pointers 5-503
pop top of stack 5-496
push accumulator or extended auxiliary register content to stack pointers 5-513
push to top of stack 5-506
swap accumulator content 5-634
swap accumulator pair content 5-639
swap auxiliary and temporary register content 5-636
swap auxiliary and temporary register pair content 5-641
swap auxiliary and temporary register pairs content 5-644
swap auxiliary register content 5-635
swap auxiliary register pair content 5-640
swap temporary register content 5-638
swap temporary register pair content 5-643
Move Accumulator Content (MOV) 5-376
Move Accumulator Content to Auxiliary Register (MOV) 5-375
Move Accumulator Content to Temporary Register (MOV) 5-375
Move Auxiliary Register Content (MOV) 5-376
Move Auxiliary Register Content to Accumulator (MOV) 5-378
Move Auxiliary Register Content to CPU Register (MOV) 5-379
Move CPU Register Content to Auxiliary Register (MOV) 5-381
Move CPU Register Content to Temporary Register (MOV) 5-381
Move Extended Auxiliary Register Content (MOV) 5-383
Move Memory to Memory (MOV) 5-384

Move Temporary Register Content (MOV) 5-376
Move Temporary Register Content to Accumulator (MOV) 5-378
Move Temporary Register Content to CPU Register (MOV) 5-379
MPY 5-430
MPY::MAC 5-446
MPY::MAS 5-458
MPY::MPY 5-468
MPYK 5-430
MPYM 5-430
MPYM::MOV 5-480
MPYMK 5-430
Multiply (MPY) 5-430
Multiply and Accumulate (MAC) 5-174
Multiply and Accumulate with Parallel Delay (MACMZ) 5-191
Multiply and Accumulate with Parallel Load Accumulator from Memory (MACM::MOV) 5-267
Multiply and Accumulate with Parallel Multiply (MAC::MPY) 5-248
Multiply and Accumulate with Parallel Multiply and Subtract (MAC::MAS) 5-228
Multiply and Accumulate with Parallel Store Accumulator Content to Memory (MACM::MOV) 5-269
Multiply and Subtract (MAS) 5-274
Multiply and Subtract with Parallel Load Accumulator from Memory (MASM::MOV) 5-318
Multiply and Subtract with Parallel Multiply (MAS::MPY) 5-309
Multiply and Subtract with Parallel Multiply and Accumulate (MAS::MAC) 5-286
Multiply and Subtract with Parallel Store Accumulator Content to Memory (MASM::MOV) 5-320
Multiply with Parallel Multiply and Accumulate (MPY::MAC) 5-446
Multiply with Parallel Multiply and Subtract (MPY::MAS) 5-458
Multiply with Parallel Store Accumulator Content to Memory (MPYM::MOV) 5-480

## N

NEG 5-483
Negate Accumulator Content (NEG) 5-483
Negate Auxiliary Register Content (NEG) 5-483
Negate Temporary Register Content (NEG) 5-483
negation
accumulator content 5-483
auxiliary register content 5-483
temporary register content 5-483
No Operation (NOP) 5-485
nonrepeatable instructions 1-21
NOP 5-485
NOT 5-486

## O

operand qualifier 5-504
OR 5-487

## P


Parallel Modify Auxiliary Register Contents (AMAR) 5-59
Parallel Multiplies (MPY::MPY) 5-468
Parallel Multiply and Accumulates (MAC::MAC) 5-193
Parallel Multiply and Subtracts (MAS::MAS) 5-297
parallel operations
addition with parallel store accumulator content to memory 5-40
load accumulator from memory with parallel store accumulator content to memory 5-428
modify auxiliary register content with parallel multiply 5-67
modify auxiliary register content with parallel multiply and accumulate 5-60
modify auxiliary register content with parallel multiply and subtract 5-65
modify auxiliary register contents 5-59
multiplies 5-468
multiply and accumulate with parallel delay 5-191
multiply and accumulate with parallel load accumulator from memory 5-267
multiply and accumulate with parallel multiply 5-248
multiply and accumulate with parallel multiply and subtract 5-228
multiply and accumulate with parallel store accumulator content to memory 5-269
multiply and accumulates 5-193
multiply and subtract with parallel load accumulator from memory 5-318
multiply and subtract with parallel multiply 5-309
multiply and subtract with parallel multiply and accumulate 5-286
multiply and subtract with parallel store accumulator content to memory 5-320
multiply and subtracts 5-297
multiply with parallel multiply and accumulate 5-446
multiply with parallel multiply and subtract 5-458
multiply with parallel store accumulator content to memory 5-480
subtraction with parallel store accumulator content to memory 5-624
parallelism basics 2-3
parallelism features 2-2
Peripheral Port Register Access Qualifiers (port) 5-504
POP 5-496
Pop Accumulator Content from Stack Pointers (POPBOTH) 5-503
Pop Extended Auxiliary Register Content from Stack Pointers (POPBOTH) 5-503
Pop Top of Stack (POP) 5-496
POPBOTH 5-503
port 5-504
program control
branch conditionally 5-96
branch on auxiliary register not zero 5-100
branch unconditionally 5-91
call conditionally 5-135
call unconditionally 5-131
compare and branch 5-103
execute conditionally 5-648
idle 5-162
no operation 5-485
repeat block of instructions unconditionally 5-538
repeat single instruction conditionally 5-550
repeat single instruction unconditionally 5-530
repeat single instruction unconditionally and decrement CSR 5-553
repeat single instruction unconditionally and increment CSR 5-535
return conditionally 5-520
return from interrupt 5-522
return unconditionally 5-518
software interrupt 5-163
software reset 5-514
software trap 5-646
PSH 5-506
PSHBOTH 5-513
Push Accumulator Content to Stack Pointers (PSHBOTH) 5-513
Push Extended Auxiliary Register Content to Stack Pointers (PSHBOTH) 5-513
Push to Top of Stack (PSH) 5-506

## R

register bit
clear 5-106
complement (not) 5-114
set 5-116
test 5-121
test bit pair 5-128
Repeat Block of Instructions Unconditionally (RPTB) 5-538
Repeat Single Instruction Conditionally (RPTCC) 5-550
Repeat Single Instruction Unconditionally (RPT) 5-530
Repeat Single Instruction Unconditionally and Decrement CSR (RPTSUB) 5-553
Repeat Single Instruction Unconditionally and Increment CSR (RPTADD) 5-535
RESET 5-514
resource conflicts in a parallel pair 2-4
RET 5-518
RETCC 5-520
RETI 5-522
Return Conditionally (RETCC) 5-520
Return from Interrupt (RETI) 5-522
Return Unconditionally (RET) 5-518
ROL 5-524
ROR 5-526
Rotate Left Accumulator Content (ROL) 5-524
Rotate Left Auxiliary Register Content (ROL) 5-524

Rotate Left Temporary Register Content (ROL) 5-524
Rotate Right Accumulator Content (ROR) 5-526

Rotate Right Auxiliary Register Content (ROR) 5-526
Rotate Right Temporary Register Content (ROR) 5-526
ROUND 5-528
Round Accumulator Content (ROUND) 5-528
RPT 5-530
RPTADD 5-535
RPTB 5-538
RPTBLOCAL 5-538
RPTCC 5-550
RPTSUB 5-553

## S

SAT 5-555
Saturate Accumulator Content (SAT) 5-555
set
accumulator bit 5-116
auxiliary register bit 5-116
memory bit 5-117
status register bit 5-118
temporary register bit 5-116
Set Accumulator Bit (BSET) 5-116
Set Auxiliary Register Bit (BSET) 5-116
Set Memory Bit (BSET) 5-117
Set Status Register Bit (BSET) 5-118
Set Temporary Register Bit (BSET) 5-116
SFTCC 5-557
SFTL 5-559, 5-562
SFTS 5-565, 5-574
SFTSC 5-565
Shift Accumulator Content Conditionally (SFTCC) 5-557
Shift Accumulator Content Logically (SFTL) 5-559, 5-562
Shift Auxiliary Register Content Logically (SFTL) 5-562
shift conditionally 5-557
shift logically 5-559, 5-562
Shift Temporary Register Content Logically (SFTL) 5-562

Signed Shift of Accumulator Content (SFTS) 5-565, 5-574
Signed Shift of Auxiliary Register Content (SFTS) 5-574
Signed Shift of Temporary Register Content (SFTS) 5-574
soft-dual parallelism 2-5
Software Interrupt (INTR) 5-163
Software Reset (RESET) 5-514
Software Trap (TRAP) 5-646
SQA 5-579
SQAM 5-579
SQDST 5-582
SQR 5-584
SQRM 5-584
SQS 5-587
SQSM 5-587
Square (SQR) 5-584
Square and Accumulate (SQA) 5-579
Square and Subtract (SQS) 5-587
Square Distance (SQDST) 5-582
status register bit
clear 5-108
set 5-118
store
accumulator content to memory 5-391
accumulator pair content to memory 5-415
accumulator, auxiliary, or temporary register content to memory 5-418
auxiliary or temporary register pair content to memory 5-422
CPU register content to memory 5-423
extended auxiliary register (XAR) to memory 5-427
Store Accumulator Content to Memory (MOV) 5-391, 5-418
Store Accumulator Pair Content to Memory (MOV) 5-415
Store Auxiliary Register Content to Memory (MOV) 5-418
Store Auxiliary Register Pair Content to Memory (MOV) 5-422
Store CPU Register Content to Memory (MOV) 5-423
Store Extended Auxiliary Register Content to Memory (MOV) 5-427

Store Temporary Register Content to Memory (MOV) 5-418
Store Temporary Register Pair Content to Memory (MOV) 5-422
SUB 5-590, 5-599
SUB::MOV 5-624
SUBADD 5-626
SUBC 5-631
Subtract Conditionally (SUBC) 5-631
Subtraction (SUB) 5-599
Subtraction with Parallel Store Accumulator Content to Memory (SUB::MOV) 5-624
SWAP 5-634, 5-635, 5-636, 5-638
Swap Accumulator Content (SWAP) 5-634
Swap Accumulator Pair Content (SWAPP) 5-639
Swap Auxiliary and Temporary Register Content (SWAP) 5-636
Swap Auxiliary and Temporary Register Pair Content (SWAPP) 5-641
Swap Auxiliary and Temporary Register Pairs Content (SWAP4) 5-644
Swap Auxiliary Register Content (SWAP) 5-635
Swap Auxiliary Register Pair Content (SWAPP) 5-640
Swap Temporary Register Content (SWAP) 5-638
Swap Temporary Register Pair Content (SWAPP) 5-643
SWAP4 5-644
SWAPP 5-639, 5-640, 5-641, 5-643
Symmetrical Finite Impulse Response Filter (FIRSADD) 5-158

## T

test
accumulator bit 5-121
accumulator bit pair 5-128
auxiliary register bit 5-121
auxiliary register bit pair 5-128
memory bit 5-123
temporary register bit 5-121
temporary register bit pair 5-128
Test Accumulator Bit (BTST) 5-121
Test Accumulator Bit Pair (BTSTP) 5-128
Test and Clear Memory Bit (BTSTCLR) 5-126
Test and Complement Memory Bit (BTSTNOT) 5-127
Test and Set Memory Bit (BTSTSET) 5-130
Test Auxiliary Register Bit (BTST) 5-121
Test Auxiliary Register Bit Pair (BTSTP) 5-128
Test Memory Bit (BTST) 5-123
Test Temporary Register Bit (BTST) 5-121
Test Temporary Register Bit Pair (BTSTP) 5-128
TRAP 5-646

## U

unconditional
branch 5-91
call 5-131
repeat block of instructions 5-538
repeat single instruction 5-530
repeat single instruction and decrement CSR 5-553
repeat single instruction and increment CSR 5-535
return 5-518
return from interrupt 5-522

## X

XCC 5-648
XCCPART 5-648
XOR 5-655

## IMPORTANT NOTICE

Texas Instruments Incorporated and its subsidiaries (TI) reserve the right to make corrections, modifications, enhancements, improvements, and other changes to its products and services at any time and to discontinue any product or service without notice. Customers should obtain the latest relevant information before placing orders and should verify that such information is current and complete. All products are sold subject to TI's terms and conditions of sale supplied at the time of order acknowledgment.
TI warrants performance of its hardware products to the specifications applicable at the time of sale in accordance with TI's standard warranty. Testing and other quality control techniques are used to the extent TI deems necessary to support this warranty. Except where mandated by government requirements, testing of all parameters of each product is not necessarily performed.
TI assumes no liability for applications assistance or customer product design. Customers are responsible for their products and applications using TI components. To minimize the risks associated with customer products and applications, customers should provide adequate design and operating safeguards.
TI does not warrant or represent that any license, either express or implied, is granted under any TI patent right, copyright, mask work right, or other TI intellectual property right relating to any combination, machine, or process in which TI products or services are used. Information published by TI regarding third-party products or services does not constitute a license from TI to use such products or services or a warranty or endorsement thereof. Use of such information may require a license from a third party under the patents or other intellectual property of the third party, or a license from TI under the patents or other intellectual property of TI.
Reproduction of TI information in TI data books or data sheets is permissible only if reproduction is without alteration and is accompanied by all associated warranties, conditions, limitations, and notices. Reproduction of this information with alteration is an unfair and deceptive business practice. TI is not responsible or liable for such altered documentation. Information of third parties may be subject to additional restrictions.

Resale of TI products or services with statements different from or beyond the parameters stated by TI for that product or service voids all express and any implied warranties for the associated TI product or service and is an unfair and deceptive business practice. TI is not responsible or liable for any such statements.

TI products are not authorized for use in safety-critical applications (such as life support) where a failure of the TI product would reasonably be expected to cause severe personal injury or death, unless officers of the parties have executed an agreement specifically governing such use. Buyers represent that they have all necessary expertise in the safety and regulatory ramifications of their applications, and acknowledge and agree that they are solely responsible for all legal, regulatory and safety-related requirements concerning their products and any use of TI products in such safety-critical applications, notwithstanding any applications-related information or support that may be provided by TI. Further, Buyers must fully indemnify TI and its representatives against any damages arising out of the use of TI products in such safety-critical applications.
TI products are neither designed nor intended for use in military/aerospace applications or environments unless the TI products are specifically designated by TI as military-grade or "enhanced plastic." Only products designated by TI as military-grade meet military specifications. Buyers acknowledge and agree that any such use of TI products which TI has not designated as military-grade is solely at the Buyer's risk, and that they are solely responsible for compliance with all legal and regulatory requirements in connection with such use.
TI products are neither designed nor intended for use in automotive applications or environments unless the specific TI products are designated by TI as compliant with ISO/TS 16949 requirements. Buyers acknowledge and agree that, if they use any non-designated products in automotive applications, TI will not be responsible for any failure to meet such requirements.
Following are URLs where you can obtain information on other Texas Instruments products and application solutions:

| Products |  |
| :---: | :---: |
| Amplifiers | amplifier.ti.com |
| Data Converters | dataconverter.ti.com |
| DLP® Products | www.dlp.com |
| DSP | dsp.ti.com |
| Clocks and Timers | www.ti.com/clocks |
| Interface | interface.ti.com |
| Logic | logic.ti.com |
| Power Mgmt | power.ti.com |
| Microcontrollers | microcontroller.ti.com |
| RFID | www.ti-rfid.com |
| RF/IF and ZigBee® Solutions | www.ti.com/lprf |

| Applications |  |
| :---: | :---: |
| Audio | www.ti.com/audio |
| Automotive | www.ti.com/automotive |
| Broadband | www.ti.com/broadband |
| Digital Control | www.ti.com/digitalcontrol |
| Medical | www.ti.com/medical |
| Military | www.ti.com/military |
| Optical Networking | www.ti.com/opticalnetwork |
| Security | www.ti.com/security |
| Telephony | www.ti.com/telephony |
| Video & Imaging | www.ti.com/video |
| Wireless | www.ti.com/wireless |

Mailing Address: Texas Instruments, Post Office Box 655303, Dallas, Texas 75265
Copyright © 2009, Texas Instruments Incorporated


[^0]:    [9] MACM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], [ACx,] ACy
    [10] MACM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], ACx >> \#16[, ACy]

[^1]:    Notes: 1) dst-DU, src-AU or dst-DU, src-DU
    2) dst-DU, src-AU or dst-AU, src-DU

[^2]:    Notes: 1) dst-DU, src-AU or dst-DU, src-DU
    2) dst-DU, src-AU or dst-AU, src-DU

[^3]:    Execution
    ACy + M40 (rnd (uns (Smem) [16:0]*uns (HI (Cmem)) [16:0])) -> ACy
    ACx + M40 (rnd (uns (Smem) [16:0]*uns (LO (Cmem)) [16:0])) -> ACx

[^4]:    Execution
    ACy + M40 (rnd (uns (HI (Lmem)) [16:0]*uns (HI (Cmem)) [16:0])) -> ACy
    ACx + M40 (rnd (uns (LO (Lmem)) [16:0]*uns (LO (Cmem)) [16:0])) -> ACx

[^5]:    Execution
    ACy - M40 (rnd (uns (Smem) [16:0]*uns (HI (Cmem)) [16:0])) -> ACy
    ACx + M40 (rnd (uns (Smem) [16:0]*uns (LO (Cmem)) [16:0])) -> ACx

[^6]:    Execution
    ACy - M40 (rnd (uns (Smem) [16:0]*uns (HI (Cmem)) [16:0])) -> ACy
    ACx - M40 (rnd (uns (Smem) [16:0]*uns (LO (Cmem)) [16:0])) -> ACx

[^7]:    Execution
    ACy - M40 (rnd (uns (HI (Lmem)) [16:0]*uns (HI (Cmem)) [16:0])) -> ACy
    ACx - M40 (rnd (uns (LO (Lmem)) [16:0]*uns (LO (Cmem)) [16:0])) -> ACx

[^8]:    Opcode See Table 5-5 (page 5-426).

    Operands
    Lmem, Smem

[^9]:    Execution
    M40 (rnd (uns (Smem) [16:0]*uns (HI (Cmem)) [16:0])) -> ACy
    ACx + M40 (rnd (uns (Smem) [16:0]*uns (LO (Cmem)) [16:0])) -> ACx

[^10]:    Execution
    M40 (rnd (uns (Smem) [16:0]*uns (HI (Cmem)) [16:0])) -> ACy
    ACx - M40 (rnd (uns (Smem) [16:0]*uns (LO (Cmem)) [16:0])) -> ACx

[^11]:    Execution
    M40 (rnd (uns (Smem) [16:0]*uns (HI (Cmem)) [16:0])) -> ACy
    M40 (rnd (uns (Smem) [16:0]*uns (LO (Cmem)) [16:0])) -> ACx

[^12]:    Execution
    M40 (rnd (uns (HI (Lmem)) [16:0]*uns (HI (Cmem)) [16:0])) -> ACy
    M40 (rnd (uns (LO (Lmem)) [16:0]*uns (LO (Cmem)) [16:0])) -> ACx

[^13]:    Execution
    M40 (rnd (uns (Xmem) [16:0] * uns (LO (Cmem)) [16:0])) -> ACx
    M40 (rnd (uns (Ymem) [16:0] * uns (HI (Cmem)) [16:0])) -> ACy

