# C55x v3.x CPU Algebraic Instruction Set Reference Guide 

## Preface

## Read This First

## About This Manual

The C55x™ is a fixed-point digital signal processor (DSP) in the TMS320™ DSP family. It can use either of two forms of the instruction set: a mnemonic form or an algebraic form. This book is a reference for the algebraic form of the instruction set and covers the instructions used for all types of operations. For information on the mnemonic instruction set, see the C55x v3.x CPU Mnemonic Instruction Set Reference Guide (SWPU067).

This release is updated with the 3.0 Revision of the TMS320C55x DSP. Information not affected by the revision remains identical to the previous manual. The main new features of this revision are:

- Relaxed parallelism restrictions:
  - Total size of both instructions may be up to 8 bytes.
  - Constant buses (KAB and KDB) are no longer a source of conflict.
- New instructions:
  - lmsf
  - 36 dual MAC instructions with double coefficient features
  - MPY, MAC, and MAS instructions with unsigned coefficients
  - lock

## Notational Conventions

This book uses the following conventions.

- In syntax descriptions, the instruction is in a bold typeface. Portions of a syntax in bold must be entered as shown. Here is an example of an instruction syntax:

  **lms**(Xmem, Ymem, ACx, ACy)

  lms is the instruction, and it has four operands: Xmem, Ymem, ACx, and ACy. When you use lms, the operands should be actual dual data-memory operand values and accumulator values. The four values must be separated by commas; a space after each comma is optional.
- Square brackets, [ and ], identify an optional parameter. If you use an optional parameter, specify the information within the brackets; do not type the brackets themselves.

## Related Documentation From Texas Instruments

The following books describe the C55x™ devices and related support tools. To obtain a copy of any of these TI documents, call the Texas Instruments Literature Response Center at (800) 477-8924. When ordering, please identify the book by its title and literature number.

TMS320C55x Technical Overview (literature number SPRU393) is an introduction to the TMS320C55x™ digital signal processor (DSP). The TMS320C55x is the latest generation of fixed-point DSPs in the TMS320C5000™ DSP platform. Like the previous generations, this processor is optimized for high performance and low-power operation. This book describes the CPU architecture, low-power enhancements, and embedded emulation features of the TMS320C55x.

C55x CPU Reference Guide (literature number SWPU073) describes the architecture, registers, and operation of the CPU for the TMS320C55x™ digital signal processors (DSPs).

C55x CPU Mnemonic Instruction Set Reference Guide (literature number SWPU067) describes the mnemonic instructions individually. It also includes a summary of the instruction set, a list of the instruction opcodes, and a cross-reference to the algebraic instruction set.

TMS320C55x Programmer's Guide (literature number SPRU376) describes ways to optimize C and assembly code for the TMS320C55x™ DSPs and explains how to write code that uses special features and instructions of the DSP.

TMS320C55x Optimizing C Compiler User's Guide (literature number SPRU281) describes the TMS320C55x™ C Compiler. This C compiler accepts ANSI standard C source code and produces assembly language source code for TMS320C55x devices.

TMS320C55x Assembly Language Tools User's Guide (literature number SPRU280) describes the assembly language tools (assembler, linker, and other tools used to develop assembly language code), assembler directives, macros, common object file format, and symbolic debugging directives for TMS320C55x™ devices.

## Trademarks

TMS320, TMS320C54x, TMS320C55x, C54x, and C55x are trademarks of Texas Instruments.

## Contents

1 Terms, Symbols, and Abbreviations
Lists and defines the terms, symbols, and abbreviations used in the TMS320C55x DSP algebraic instruction set summary and in the individual instruction descriptions.
1.1 Instruction Set Terms, Symbols, and Abbreviations
1.2 Instruction Set Conditional (cond) Fields
1.3 Affect of Status Bits
1.3.1 Accumulator Overflow Status Bit (ACOVx)
1.3.2 C54CM Status Bit
1.3.3 CARRY Status Bit
1.3.4 FRCT Status Bit
1.3.5 INTM Status Bit
1.3.6 M40 Status Bit
1.3.7 RDM Status Bit
1.3.8 SATA Status Bit
1.3.9 SATD Status Bit
1.3.10 SMUL Status Bit
1.3.11 SXMD Status Bit
1.3.12 Test Control Status Bit (TCx)
1.4 Instruction Set Notes and Rules
1.4.1 Notes
1.4.2 Rules
1.5 Nonrepeatable Instructions
2 Parallelism Features and Rules
Describes the parallelism features and rules of the TMS320C55x DSP algebraic instruction set.
2.1 Parallelism Features
2.2 Parallelism Basics
2.3 Resource Conflicts
2.3.1 Operators
2.3.2 Address Generation Units
2.3.3 Buses
2.4 Soft-Dual Parallelism
2.4.1 Soft-Dual Parallelism of MAR Instructions
2.5 Execute Conditionally Instructions
2.6 Other Exceptions
3 Introduction to Addressing Modes
Provides an introduction to the addressing modes of the TMS320C55x DSP.
3.1 Introduction to the Addressing Modes
3.2 Absolute Addressing Modes
3.2.1 k16 Absolute Addressing Mode
3.2.2 k23 Absolute Addressing Mode
3.2.3 I/O Absolute Addressing Mode
3.3 Direct Addressing Modes
3.3.1 DP Direct Addressing Mode
3.3.2 SP Direct Addressing Mode
3.3.3 Register-Bit Direct Addressing Mode
3.3.4 PDP Direct Addressing Mode
3.4 Indirect Addressing Modes
3.4.1 AR Indirect Addressing Mode
3.4.2 Dual AR Indirect Addressing Mode
3.4.3 CDP Indirect Addressing Mode
3.4.4 Coefficient Indirect Addressing Mode
3.5 Circular Addressing
4 Instruction Set Summary
Provides a summary of the TMS320C55x DSP algebraic instruction set.
5 Instruction Set Descriptions
Detailed information on the TMS320C55x DSP algebraic instruction set.
Absolute Distance (abdst)
Absolute Value
Addition
Addition with Absolute Value
Addition with Parallel Store Accumulator Content to Memory
Addition or Subtraction Conditionally (adsc)
Addition or Subtraction Conditionally with Shift (ads2c)
Addition, Subtraction, or Move Accumulator Content Conditionally (adsc)
Bitwise AND
Bitwise AND Memory with Immediate Value and Compare to Zero
Bitwise OR
Bitwise Exclusive OR (XOR)
Branch Conditionally (if goto)
Branch Unconditionally (goto)
Branch on Auxiliary Register Not Zero (if goto)
Call Conditionally (if call)
Call Unconditionally (call)
Circular Addressing Qualifier (circular)
Clear Accumulator, Auxiliary, or Temporary Register Bit
Clear Memory Bit
Clear Status Register Bit
Compare Accumulator, Auxiliary, or Temporary Register Content
Compare Accumulator, Auxiliary, or Temporary Register Content with AND
Compare Accumulator, Auxiliary, or Temporary Register Content with OR
Compare Accumulator, Auxiliary, or Temporary Register Content Maximum (max)
Compare Accumulator, Auxiliary, or Temporary Register Content Minimum (min)
Compare and Branch (compare goto)
Compare and Select Accumulator Content Maximum (max_diff)
Compare and Select Accumulator Content Minimum (min_diff)
Compare Memory with Immediate Value
Complement Accumulator, Auxiliary, or Temporary Register Bit (cbit)
Complement Accumulator, Auxiliary, or Temporary Register Content
Complement Memory Bit (cbit)
Compute Exponent of Accumulator Content (exp)
Compute Mantissa and Exponent of Accumulator Content (mant, exp)
Count Accumulator Bits (count)
Dual 16-Bit Additions
Dual 16-Bit Addition and Subtraction
Dual 16-Bit Subtractions
Dual 16-Bit Subtraction and Addition
Execute Conditionally (if execute)
Expand Accumulator Bit Field (field_expand)
Extract Accumulator Bit Field (field_extract)
Finite Impulse Response Filter, Antisymmetrical (firsn)
Finite Impulse Response Filter, Symmetrical (firs)
Idle
Least Mean Square (lms)
Least Mean Square (lmsf)
Linear Addressing Qualifier (linear)
Load Accumulator from Memory
Load Accumulator from Memory with Parallel Store Accumulator Content to Memory
Load Accumulator Pair from Memory
Load Accumulator with Immediate Value
Load Accumulator, Auxiliary, or Temporary Register from Memory
Load Accumulator, Auxiliary, or Temporary Register with Immediate Value
Load Auxiliary or Temporary Register Pair from Memory
Load CPU Register from Memory
Load CPU Register with Immediate Value
Load Extended Auxiliary Register from Memory
Load Extended Auxiliary Register with Immediate Value
Load Memory with Immediate Value
Lock Access Qualifier
Memory Delay (delay)
Memory-Mapped Register Access Qualifier (mmap)
Modify Auxiliary Register Content (mar)
Modify Auxiliary Register Content with Parallel Multiply
Modify Auxiliary Register Content with Parallel Multiply and Accumulate
Modify Auxiliary Register Content with Parallel Multiply and Subtract
Modify Auxiliary or Temporary Register Content (mar)
Modify Auxiliary or Temporary Register Content by Addition (mar)
Modify Auxiliary or Temporary Register Content by Subtraction (mar)
Modify Data Stack Pointer
Modify Extended Auxiliary Register Content (mar)
Modify Extended Auxiliary Register Content by Addition (mar)
Modify Extended Auxiliary Register Content by Subtraction (mar)
Move Accumulator Content to Auxiliary or Temporary Register
Move Accumulator, Auxiliary, or Temporary Register Content
Move Auxiliary or Temporary Register Content to Accumulator
Move Auxiliary or Temporary Register Content to CPU Register
Move CPU Register Content to Auxiliary or Temporary Register
Move Extended Auxiliary Register Content
Move Memory to Memory
Multiply
Multiply with Parallel Multiply and Accumulate
Multiply with Parallel Multiply and Subtract
Multiply with Parallel Store Accumulator Content to Memory
Multiply and Accumulate (MAC)
Multiply and Accumulate with Parallel Delay
Multiply and Accumulate with Parallel Load Accumulator from Memory
Multiply and Accumulate with Parallel Multiply
Multiply and Accumulate with Parallel Multiply and Subtract
Multiply and Accumulate with Parallel Store Accumulator Content to Memory
Multiply and Subtract
Multiply and Subtract with Parallel Load Accumulator from Memory
Multiply and Subtract with Parallel Multiply
Multiply and Subtract with Parallel Multiply and Accumulate
Multiply and Subtract with Parallel Store Accumulator Content to Memory
Negate Accumulator, Auxiliary, or Temporary Register Content
No Operation (nop)
Parallel Modify Auxiliary Register Contents (mar)
Parallel Multiplies
Parallel Multiply and Accumulates
Parallel Multiply and Subtracts
Peripheral Port Register Access Qualifiers
Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers (popboth)
Pop Top of Stack (pop)
Push Accumulator or Extended Auxiliary Register Content to Stack Pointers (pshboth)
Push to Top of Stack (push)
Repeat Block of Instructions Unconditionally
Repeat Single Instruction Conditionally (while/repeat)
Repeat Single Instruction Unconditionally (repeat)
Repeat Single Instruction Unconditionally and Decrement CSR (repeat)
Repeat Single Instruction Unconditionally and Increment CSR (repeat)
Return Conditionally (if return)
Return Unconditionally (return)
Return from Interrupt (return_int)
Rotate Left Accumulator, Auxiliary, or Temporary Register Content
Rotate Right Accumulator, Auxiliary, or Temporary Register Content
Round Accumulator Content (rnd)
Saturate Accumulator Content (saturate)
Set Accumulator, Auxiliary, or Temporary Register Bit
Set Memory Bit
Set Status Register Bit
Shift Accumulator Content Conditionally (sftc)
Shift Accumulator Content Logically
Shift Accumulator, Auxiliary, or Temporary Register Content Logically
Signed Shift of Accumulator Content
Signed Shift of Accumulator, Auxiliary, or Temporary Register Content
Software Interrupt (intr)
Software Reset (reset)
Software Trap (trap)
Square
Square and Accumulate
Square and Subtract
Square Distance (sqdst)
Store Accumulator Content to Memory
Store Accumulator Pair Content to Memory
Store Accumulator, Auxiliary, or Temporary Register Content to Memory
Store Auxiliary or Temporary Register Pair Content to Memory
Store CPU Register Content to Memory
Store Extended Auxiliary Register Content to Memory
Subtract Conditionally (subc)
Subtraction
Subtraction with Parallel Store Accumulator Content to Memory
Swap Accumulator Content (swap)
Swap Accumulator Pair Content (swap)
Swap Auxiliary Register Content (swap)
Swap Auxiliary Register Pair Content (swap)
Swap Auxiliary and Temporary Register Content (swap)
Swap Auxiliary and Temporary Register Pair Content (swap)
Swap Auxiliary and Temporary Register Pairs Content (swap)
Swap Temporary Register Content (swap)
Swap Temporary Register Pair Content (swap)
Test Accumulator, Auxiliary, or Temporary Register Bit
Test Accumulator, Auxiliary, or Temporary Register Bit Pair
Test Memory Bit
Test and Clear Memory Bit
Test and Complement Memory Bit
Test and Set Memory Bit
6 Instruction Opcodes in Sequential Order
The opcode in sequential order for each TMS320C55x DSP instruction syntax.
6.1 Instruction Set Opcodes
6.2 Instruction Set Opcode Symbols and Abbreviations
7 Cross-Reference of Algebraic and Mnemonic Instruction Sets
Cross-Reference of TMS320C55x DSP Algebraic and Mnemonic Instruction Sets.
8 Index

## Figures

5-1 Status Registers Bit Mapping
5-2 Legal Uses of Repeat Block of Instructions Unconditionally (localrepeat) Instruction
5-3 Status Registers Bit Mapping
5-4 Effects of a Software Reset on Status Registers

## Tables

1-1 Instruction Set Terms, Symbols, and Abbreviations
1-2 Operators Used in Instruction Set
1-3 Instruction Set Conditional (cond) Field
1-4 Nonrepeatable Instructions
3-1 Addressing-Mode Operands
3-2 Absolute Addressing Modes
3-3 Direct Addressing Modes
3-4 Indirect Addressing Modes
3-5 DSP Mode Operands for the AR Indirect Addressing Mode
3-6 Control Mode Operands for the AR Indirect Addressing Mode
3-7 Dual AR Indirect Operands
3-8 CDP Indirect Operands
3-9 Coefficient Indirect Operands
3-10 Circular Addressing Pointers
4-1 Algebraic Instruction Set Summary
5-1 Opcodes for Load CPU Register from Memory Instruction
5-2 Opcodes for Load CPU Register with Immediate Value Instruction
5-3 Opcodes for Move Auxiliary or Temporary Register Content to CPU Register Instruction
5-4 Opcodes for Move CPU Register Content to Auxiliary or Temporary Register Instruction
5-5 Effects of a Software Reset on DSP Registers
5-6 Opcodes for Store CPU Register Content to Memory Instruction
6-1 Instruction Set Opcodes
6-2 Instruction Set Opcode Symbols and Abbreviations
7-1 Cross-Reference of Algebraic and Mnemonic Instruction Sets

# Terms, Symbols, and Abbreviations 

This chapter lists and defines the terms, symbols, and abbreviations used in the TMS320C55x™ DSP algebraic instruction set summary and in the individual instruction descriptions. Also provided are instruction set notes and rules and a list of nonrepeatable instructions.

Topics:
1.1 Instruction Set Terms, Symbols, and Abbreviations
1.2 Instruction Set Conditional (cond) Fields
1.3 Affect of Status Bits
1.4 Instruction Set Notes and Rules
1.5 Nonrepeatable Instructions

### 1.1 Instruction Set Terms, Symbols, and Abbreviations

Table 1-1 lists the terms, symbols, and abbreviations used and Table 1-2 lists the operators used in the instruction set summary and in the individual instruction descriptions.

Table 1-1. Instruction Set Terms, Symbols, and Abbreviations

| Symbol | Meaning |
| :---: | :---: |
| [ ] | Optional operands |
| ACB | Bus that brings D-unit registers to A-unit and P-unit operators |
| ACOVx | Accumulator overflow status bit: ACOV0, ACOV1, ACOV2, ACOV3 |
| ACw, ACx, ACy, ACz | Accumulator: AC0, AC1, AC2, AC3 |
| ARn_mod | Content of selected auxiliary register (ARn) is premodified or postmodified in the address generation unit. |
| ARx, ARy | Auxiliary register: <br> AR0, AR1, AR2, AR3, AR4, AR5, AR6, AR7 |
| AU | A unit |
| Baddr | Register bit address |
| BitIn | Shifted bit in: <br> Test control flag 2 (TC2) or CARRY status bit |
| BitOut | Shifted bit out: <br> Test control flag 2 (TC2) or CARRY status bit |
| BORROW | Logical complement of CARRY status bit |
| C, Cycles | Execution in cycles. For conditional instructions, an x/y field means: <br> x cycles, if the condition is true. <br> y cycles, if the condition is false. |
| CA | Coefficient address generation unit |
| CARRY | Value of CARRY status bit |
| Cmem | Coefficient indirect operand referencing a 16-bit or 32-bit value in data space |
| cond | Condition based on accumulator value (ACx), auxiliary register (ARx) value, temporary register (Tx) value, test control (TCx) flag, or CARRY status bit. See section 1.2. |
| CR | Coefficient Read bus |
| CSR | Computed single-repeat register |
| DA | Data address generation unit |
| DR | Data Read bus |
| dst | Destination accumulator (ACx), lower 16 bits of auxiliary register (ARx), or temporary register (Tx): <br> AC0, AC1, AC2, AC3 <br> AR0, AR1, AR2, AR3, AR4, AR5, AR6, AR7 <br> T0, T1, T2, T3 |
| DU | D unit |
| DW | Data Write bus |
| Dx | Data address label coded on x bits (absolute address) |
| E | Indicates if the instruction contains a parallel enable bit. |
| kx | Unsigned constant coded on x bits |
| Kx | Signed constant coded on x bits |
| Lmem | Long-word single data memory access (32-bit data access). Same legal inputs as Smem. |
| lx | Program address label coded on x bits (unsigned offset relative to program counter register) |
| Lx | Program address label coded on x bits (signed offset relative to program counter register) |
| M40 | If the optional M40 keyword is applied to the instruction, the instruction provides the option to locally set M40 to 1 for the execution of the instruction |
| Operator | Operator(s) used by an instruction. |
| Pipe, Pipeline | Pipeline phase in which the instruction executes: <br> AD: Address <br> D: Decode <br> R: Read <br> X: Execute |
| Px | Program or data address label coded on x bits (absolute address) |
| RELOP | Relational operators: <br> == equal to <br> < less than <br> >= greater than or equal to <br> != not equal to |
| rnd | If the optional rnd keyword is applied to the instruction, rounding is performed in the instruction |
| RPTC | Single-repeat counter register |
| S, Size | Instruction size in bytes. |
| SA | Stack address generation unit |
| saturate | If the optional saturate keyword is applied to the input operand, the 40-bit output of the operation is saturated |
| SHFT | 4-bit immediate shift value, 0 to 15 |
| SHIFTW | 6-bit immediate shift value, -32 to +31 |
| Smem | Word single data memory access (16-bit data access) |
| SP | Data stack pointer |
| src | Source accumulator (ACx), lower 16 bits of auxiliary register (ARx), or temporary register (Tx): <br> AC0, AC1, AC2, AC3 <br> AR0, AR1, AR2, AR3, AR4, AR5, AR6, AR7 <br> T0, T1, T2, T3 |
| SSP | System stack pointer |
| STx | Status register: <br> ST0, ST1, ST2, ST3 |
| TAx, TAy | Auxiliary register (ARx) or temporary register (Tx): <br> AR0, AR1, AR2, AR3, AR4, AR5, AR6, AR7 <br> T0, T1, T2, T3 |
| TCx, TCy | Test control flag: <br> TC1, TC2 |
| TRNx | Transition register: <br> TRN0, TRN1 |
| Tx, Ty | Temporary register (Tx): <br> T0, T1, T2, T3 |
| uns | If the optional uns keyword is applied to the input operand, the operand is zero extended |
| XACdst | Destination extended register: All 23 bits of coefficient data pointer (XCDP), and extended auxiliary register (XARx): <br> XAR0, XAR1, XAR2, XAR3, XAR4, XAR5, XAR6, XAR7 |
| XACsrc | Source extended register: All 23 bits of coefficient data pointer (XCDP), and extended auxiliary register (XARx): <br> XAR0, XAR1, XAR2, XAR3, XAR4, XAR5, XAR6, XAR7 |
| XAdst | Destination extended register: All 23 bits of data stack pointer (XSP), system stack pointer (XSSP), and data page pointer (XDP) |
| XARx | All 23 bits of extended auxiliary register: <br> XAR0, XAR1, XAR2, XAR3, XAR4, XAR5, XAR6, XAR7 |
| XAsrc | Source extended register: All 23 bits of data stack pointer (XSP), system stack pointer (XSSP), and data page pointer (XDP) |
| xdst | Accumulator: <br> AC0, AC1, AC2, AC3 <br> Destination extended register: All 23 bits of data stack pointer (XSP), system stack pointer (XSSP), data page pointer (XDP), coefficient data pointer (XCDP), and extended auxiliary register (XARx): <br> XAR0, XAR1, XAR2, XAR3, XAR4, XAR5, XAR6, XAR7 |
| xsrc | Accumulator: <br> AC0, AC1, AC2, AC3 <br> Source extended register: All 23 bits of data stack pointer (XSP), system stack pointer (XSSP), data page pointer (XDP), coefficient data pointer (XCDP), and extended auxiliary register (XARx): <br> XAR0, XAR1, XAR2, XAR3, XAR4, XAR5, XAR6, XAR7 |
| Xmem, Ymem | Indirect dual data memory access (two data accesses) |

Table 1-2. Operators Used in Instruction Set

| Symbols | Operators | Evaluation |
| :---: | :---: | :---: |
| + - ~ | Unary plus, minus, 1s complement | Right to left |
| * / % | Multiplication, division, modulo | Left to right |
| + - | Addition, subtraction | Left to right |
| << >> | Signed left shift, right shift | Left to right |
| <<< >>> | Logical left shift, logical right shift | Left to right |
| < <= | Less than, less than or equal to | Left to right |
| > >= | Greater than, greater than or equal to | Left to right |
| == != | Equal to, not equal to | Left to right |
| & | Bitwise AND | Left to right |
| \| | Bitwise OR | Left to right |
| ^ | Bitwise exclusive OR (XOR) | Left to right |

Note: Unary +, -, and * have higher precedence than the binary forms.

### 1.2 Instruction Set Conditional (cond) Fields

Table 1-3 lists the testing conditions available in the cond field of the conditional instructions.

Table 1-3. Instruction Set Conditional (cond) Field

| Bit or Register | Condition (cond) Field | For Condition to be True ... |
| :---: | :---: | :---: |
| Accumulator | Tests the accumulator (ACx) content against 0. The comparison against 0 depends on the M40 status bit: <br> If M40 = 0, ACx(31-0) is compared to 0. <br> If M40 = 1, ACx(39-0) is compared to 0. |  |
|  | ACx == #0 | ACx content is equal to 0 |
|  | ACx < #0 | ACx content is less than 0 |
|  | ACx > #0 | ACx content is greater than 0 |
|  | ACx != #0 | ACx content is not equal to 0 |
|  | ACx <= #0 | ACx content is less than or equal to 0 |
|  | ACx >= #0 | ACx content is greater than or equal to 0 |
| Accumulator Overflow Status Bit | Tests the accumulator overflow status bit (ACOVx) against 1; when the optional ! symbol is used before the bit designation, the bit can be tested against 0. When this condition is used, the corresponding ACOVx is cleared to 0. |  |
|  | overflow(ACx) | ACOVx bit is set to 1 |
|  | !overflow(ACx) | ACOVx bit is cleared to 0 |
| Auxiliary Register | Tests the auxiliary register (ARx) content against 0. |  |
|  | ARx == #0 | ARx content is equal to 0 |
|  | ARx < #0 | ARx content is less than 0 |
|  | ARx > #0 | ARx content is greater than 0 |
|  | ARx != #0 | ARx content is not equal to 0 |
|  | ARx <= #0 | ARx content is less than or equal to 0 |
|  | ARx >= #0 | ARx content is greater than or equal to 0 |
| CARRY Status Bit | Tests the CARRY status bit against 1; when the optional ! symbol is used before the bit designation, the bit can be tested against 0. |  |
|  | CARRY | CARRY bit is set to 1 |
|  | !CARRY | CARRY bit is cleared to 0 |
| Temporary Register | Tests the temporary register (Tx) content against 0. |  |
|  | Tx == #0 | Tx content is equal to 0 |
|  | Tx < #0 | Tx content is less than 0 |
|  | Tx > #0 | Tx content is greater than 0 |
|  | Tx != #0 | Tx content is not equal to 0 |
|  | Tx <= #0 | Tx content is less than or equal to 0 |
|  | Tx >= #0 | Tx content is greater than or equal to 0 |
| Test Control Flags | Tests the test control flags (TC1 and TC2) independently against 1; when the optional ! symbol is used before the flag designation, the flag can be tested independently against 0. |  |
|  | TCx | TCx flag is set to 1 |
|  | !TCx | TCx flag is cleared to 0 |
|  | TC1 and TC2 can be combined with AND (&), OR (\|), and XOR (^) logical bit combinations: |  |
|  | TC1 & TC2 | TC1 AND TC2 is equal to 1 |
|  | !TC1 & TC2 | (NOT TC1) AND TC2 is equal to 1 |
|  | TC1 & !TC2 | TC1 AND (NOT TC2) is equal to 1 |
|  | !TC1 & !TC2 | (NOT TC1) AND (NOT TC2) is equal to 1 |
|  | TC1 \| TC2 | TC1 OR TC2 is equal to 1 |
|  | !TC1 \| TC2 | (NOT TC1) OR TC2 is equal to 1 |
|  | TC1 \| !TC2 | TC1 OR (NOT TC2) is equal to 1 |
|  | !TC1 \| !TC2 | (NOT TC1) OR (NOT TC2) is equal to 1 |
|  | TC1 ^ TC2 | TC1 XOR TC2 is equal to 1 |
|  | !TC1 ^ TC2 | (NOT TC1) XOR TC2 is equal to 1 |
|  | TC1 ^ !TC2 | TC1 XOR (NOT TC2) is equal to 1 |
|  | !TC1 ^ !TC2 | (NOT TC1) XOR (NOT TC2) is equal to 1 |
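
As a rough illustration of the accumulator conditions in Table 1-3, the following Python sketch models which bits of ACx take part in the comparison against 0. The function name `acx_compare_value` and the treatment of the selected bit field as a signed two's-complement value are assumptions made for this sketch; they are not definitions from this guide.

```python
def acx_compare_value(acx: int, m40: int) -> int:
    """Signed value of the ACx bit field that is compared against 0.

    Assumption: ACx(31-0) for M40 = 0, or ACx(39-0) for M40 = 1, is
    interpreted as a two's-complement number.
    """
    width = 40 if m40 else 32
    bits = acx & ((1 << width) - 1)      # keep only the compared field
    sign = 1 << (width - 1)
    return bits - (1 << width) if bits & sign else bits


# Only a guard bit is set: looks like 0 when M40 = 0, positive when M40 = 1.
acx = 0x01_0000_0000
print(acx_compare_value(acx, m40=0) == 0)   # ACx == #0 holds
print(acx_compare_value(acx, m40=1) > 0)    # ACx > #0 holds
```

Under these assumptions the same accumulator content can satisfy different conditions depending on M40, which is why the M40 setting matters when guard bits may be nonzero.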

### 1.3 Affect of Status Bits

### 1.3.1 Accumulator Overflow Status Bit (ACOVx)

Detection of accumulator overflow, reported in ACOV0 through ACOV3, depends on M40:

- When M40 = 0, overflow is detected at bit position 31
- When M40 = 1, overflow is detected at bit position 39

If an overflow is detected, the destination accumulator overflow status bit is set to 1.
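
The rule above can be sketched in Python. This is a simplified model, assuming that "overflow detected at bit position 31 (or 39)" means the exact signed result does not fit in a 32-bit (or 40-bit) two's-complement value; the function name `overflow_detected` is hypothetical.

```python
def overflow_detected(result: int, m40: int) -> bool:
    """Return True if an exact signed result overflows the width chosen by M40."""
    width = 40 if m40 else 32            # bit position 39 or bit position 31
    lowest = -(1 << (width - 1))
    highest = (1 << (width - 1)) - 1
    return result < lowest or result > highest


# 7FFF FFFFh + 1 overflows in 32-bit mode but still fits in 40 bits.
acov_m40_0 = overflow_detected(0x7FFFFFFF + 1, m40=0)   # True
acov_m40_1 = overflow_detected(0x7FFFFFFF + 1, m40=1)   # False
```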

### 1.3.2 C54CM Status Bit

- When C54CM = 0, the enhanced mode, the CPU supports code originally developed for a TMS320C55x™ DSP.
- When C54CM = 1, the compatible mode, all the C55x CPU resources remain available; therefore, as you translate code, you can take advantage of the additional features on the C55x DSP to optimize your code. This mode must be set when you are porting code that was originally developed for a TMS320C54x™ DSP.


### 1.3.3 CARRY Status Bit

- When M40 = 0, the carry/borrow is detected at bit position 31
- When M40 = 1, the carry/borrow is detected at bit position 39

When performing a logical shift or signed shift that affects the CARRY status bit and the shift count is zero, the CARRY status bit is cleared to 0 .
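
A minimal Python sketch of the carry position for an addition follows. It assumes the carry is simply the bit carried out of position 31 (M40 = 0) or position 39 (M40 = 1) when the operands are added as unsigned bit patterns; `carry_out` is a hypothetical helper name, and the zero-shift-count rule for shifts is not modeled here.

```python
def carry_out(a: int, b: int, m40: int) -> int:
    """Carry generated out of bit 31 (M40 = 0) or bit 39 (M40 = 1) by a + b."""
    width = 40 if m40 else 32
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask)      # add the bit fields as unsigned values
    return (total >> width) & 1


print(carry_out(0xFFFFFFFF, 1, m40=0))   # carry out of bit position 31
print(carry_out(0xFFFFFFFF, 1, m40=1))   # no carry out of bit position 39
```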

### 1.3.4 FRCT Status Bit

- When FRCT = 0, the fractional mode is OFF and results of multiply operations are not shifted.
- When FRCT = 1, the fractional mode is ON and results of multiply operations are shifted left by 1 bit to eliminate an extra sign bit.
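
The effect of FRCT can be illustrated with a short Python sketch. It assumes 16-bit signed operands in Q15 format, so that the shifted product reads as a Q31 fraction; the function name `multiply` is hypothetical, and overflow or saturation of the product is deliberately left out.

```python
def multiply(a: int, b: int, frct: int) -> int:
    """Product of two signed 16-bit values, shifted left by 1 bit when FRCT = 1."""
    product = a * b
    return product << 1 if frct else product


half_q15 = 0x4000                                 # 0.5 in Q15
integer_product = multiply(half_q15, half_q15, frct=0)
fractional_product = multiply(half_q15, half_q15, frct=1)

print(hex(integer_product))                       # 0x10000000
print(fractional_product / 2**31)                 # 0.25, read as a Q31 fraction
```

Without the shift, the product of two Q15 values carries two sign bits; the 1-bit left shift removes the extra one so the result lines up as Q31.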


### 1.3.5 INTM Status Bit

The INTM bit globally enables or disables the maskable interrupts. This bit has no effect on nonmaskable interrupts (those that cannot be blocked by software).

- When INTM = 0, all unmasked interrupts are enabled.
- When INTM = 1, all maskable interrupts are disabled.


### 1.3.6 M40 Status Bit

- When M40 = 0:
  - overflow is detected at bit position 31
  - the carry/borrow is detected at bit position 31
  - saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)
  - TMS320C54x™ DSP compatibility mode
  - for conditional instructions, the comparison against 0 (zero) is performed on 32 bits, ACx(31-0)
- When M40 = 1:
  - overflow is detected at bit position 39
  - the carry/borrow is detected at bit position 39
  - saturation values are 7F FFFF FFFFh (positive overflow) or 80 0000 0000h (negative overflow)
  - for conditional instructions, the comparison against 0 (zero) is performed on 40 bits, ACx(39-0)
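
The saturation values listed above can be reproduced with a small Python model. This is a sketch under the assumption that saturation clamps an exact signed result to the 32-bit or 40-bit range and stores it as a 40-bit two's-complement pattern; `saturate` and `fmt40` are hypothetical helper names.

```python
MASK40 = (1 << 40) - 1


def saturate(result: int, m40: int) -> int:
    """Clamp a signed result to the range selected by M40; return the 40-bit pattern."""
    width = 40 if m40 else 32
    highest = (1 << (width - 1)) - 1
    lowest = -(1 << (width - 1))
    clamped = max(lowest, min(highest, result))
    return clamped & MASK40              # sign-extended into the guard bits


def fmt40(bits: int) -> str:
    """Format a 40-bit pattern in the 'GG HHHH LLLLh' style used in this section."""
    guard = bits >> 32
    high = (bits >> 16) & 0xFFFF
    low = bits & 0xFFFF
    return f"{guard:02X} {high:04X} {low:04X}h"


print(fmt40(saturate(1 << 35, m40=0)))      # 00 7FFF FFFFh
print(fmt40(saturate(-(1 << 35), m40=0)))   # FF 8000 0000h
print(fmt40(saturate(1 << 45, m40=1)))      # 7F FFFF FFFFh
print(fmt40(saturate(-(1 << 45), m40=1)))   # 80 0000 0000h
```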


### 1.3.6.1 M40 Status Bit When Sign Shifting

In D-unit shifter:

- When shifting to the LSBs:
  - when M40 = 0, the input to the shifter is modified according to SXMD and then the modified input is shifted according to the shift quantity:
    - if SXMD = 0, 0 is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
    - if SXMD = 1, bit 31 of the source operand is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
  - bit 39 is extended according to SXMD
  - the shifted-out bit is extracted at bit position 0
- When shifting to the MSBs:
  - 0 is inserted at bit position 0
  - if M40 = 0, the shifted-out bit is extracted at bit position 31
  - if M40 = 1, the shifted-out bit is extracted at bit position 39
- After shifting, unless otherwise noted, when M40 = 0:
  - overflow is detected at bit position 31 (if an overflow is detected, the destination ACOVx bit is set)
  - the carry/borrow is detected at bit position 31
  - if SATD = 1, when an overflow is detected, ACx saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)
  - TMS320C54x™ DSP compatibility mode
- After shifting, unless otherwise noted, when M40 = 1:
  - overflow is detected at bit position 39 (if an overflow is detected, the destination ACOVx bit is set)
  - the carry/borrow is detected at bit position 39
  - if SATD = 1, when an overflow is detected, ACx saturation values are 7F FFFF FFFFh (positive overflow) or 80 0000 0000h (negative overflow)

In A-unit ALU:

- When shifting to the LSBs, bit 15 is sign extended
- When shifting to the MSBs, 0 is inserted at bit position 0
- After shifting, unless otherwise noted:
  - overflow is detected at bit position 15 (if an overflow is detected, the destination ACOVx bit is set)
  - if SATA = 1, when an overflow is detected, register saturation values are 7FFFh (positive overflow) or 8000h (negative overflow)


### 1.3.6.2 M40 Status Bit When Logically Shifting

In D-unit shifter:

- When shifting to the LSBs:
  - if M40 = 0, 0 is inserted at bit position 31 and the guard bits (39-32) of the destination accumulator are cleared
  - if M40 = 1, 0 is inserted at bit position 39
  - the shifted-out bit is extracted at bit position 0 and stored in the CARRY status bit
- When shifting to the MSBs:
  - 0 is inserted at bit position 0
  - if M40 = 0, the shifted-out bit is extracted at bit position 31 and stored in the CARRY status bit, and the guard bits (39-32) of the destination accumulator are cleared
  - if M40 = 1, the shifted-out bit is extracted at bit position 39 and stored in the CARRY status bit

In A-unit ALU:

- When shifting to the LSBs:
  - 0 is inserted at bit position 15
  - the shifted-out bit is extracted at bit position 0 and stored in the CARRY status bit
- When shifting to the MSBs:
  - 0 is inserted at bit position 0
  - the shifted-out bit is extracted at bit position 15 and stored in the CARRY status bit
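
As an illustration of the D-unit rules above, the following Python sketch models a logical shift by a single bit position. It is a simplification under stated assumptions: the helper names `lsr1` and `lsl1` are hypothetical, only a one-bit shift is modelled, and the accumulator is represented as a 40-bit unsigned integer.

```python
MASK40 = (1 << 40) - 1  # full accumulator, bits 39-0
MASK32 = (1 << 32) - 1  # bits 31-0 only (guard bits cleared)


def lsr1(acc: int, m40: int):
    """Logical shift toward the LSBs by one bit; returns (result, CARRY)."""
    carry = acc & 1                      # shifted-out bit comes from bit 0
    if m40:
        result = (acc & MASK40) >> 1     # 0 enters at bit 39
    else:
        result = (acc & MASK32) >> 1     # 0 enters at bit 31, guard bits cleared
    return result, carry


def lsl1(acc: int, m40: int):
    """Logical shift toward the MSBs by one bit; returns (result, CARRY)."""
    if m40:
        carry = (acc >> 39) & 1          # shifted-out bit comes from bit 39
        result = (acc << 1) & MASK40
    else:
        carry = (acc >> 31) & 1          # shifted-out bit comes from bit 31
        result = (acc << 1) & MASK32     # guard bits cleared
    return result, carry                 # 0 has entered at bit 0 in both cases


if __name__ == "__main__":
    for m40 in (0, 1):
        result, carry = lsr1(0xFF_8000_0001, m40)
        print(f"M40={m40}: result={result:010X} CARRY={carry}")
```

The same input produces different results for the two M40 settings because the guard bits take part in the shift only when M40 = 1.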


### 1.3.7 RDM Status Bit

When the optional rnd or R keyword is applied to an instruction, rounding is performed in the D-unit shifter according to RDM:

- When RDM = 0, biased rounding to infinity is performed: 8000h ($2^{15}$) is added to the 40-bit result of the shift.
- When RDM = 1, unbiased rounding to the nearest is performed: 8000h ($2^{15}$) is added to the 40-bit result of the shift, depending on the value of its 17 LSBs:

```
if( 8000h < bit(15-0) < 10000h)
    add 8000h to the 40-bit result of the shift result.
else if( bit(15-0) == 8000h)
    if( bit(16) == 1)
        add 8000h to the 40-bit result of the shift result.
```

If a rounding has been performed, the 16 lowest bits of the result are cleared to 0 .
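
The two rounding modes can be compared with a short model. The Python sketch below follows the pseudocode above; the function name `round_acc` is hypothetical, and overflow or saturation after the addition is not modelled (the result simply wraps to 40 bits).

```python
MASK40 = (1 << 40) - 1


def round_acc(acc: int, rdm: int) -> int:
    """Round a 40-bit shift result according to RDM and clear the 16 LSBs."""
    acc &= MASK40
    low16 = acc & 0xFFFF
    bit16 = (acc >> 16) & 1
    if rdm == 0:
        # Biased rounding: always add 8000h.
        acc += 0x8000
    elif low16 > 0x8000 or (low16 == 0x8000 and bit16 == 1):
        # Unbiased rounding: add 8000h above the midpoint, or exactly at
        # the midpoint when bit 16 is 1 (the result then has bit 16 = 0).
        acc += 0x8000
    # After rounding, the 16 lowest bits are cleared.
    return acc & MASK40 & ~0xFFFF


if __name__ == "__main__":
    for value in (0x0001_8000, 0x0002_8000):
        print(f"{value:08X}: RDM=0 -> {round_acc(value, 0):08X}, "
              f"RDM=1 -> {round_acc(value, 1):08X}")
```

The two modes differ only when the 16 LSBs equal 8000h exactly: with RDM = 0 the value is always rounded up, while with RDM = 1 it is rounded up only when bit 16 is 1.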

### 1.3.8 SATA Status Bit

This status bit controls operations performed in the A unit.

- When SATA = 0, no saturation is performed.
- When SATA = 1 and an overflow is detected, the destination register is saturated to 7FFFh (positive overflow) or 8000h (negative overflow).


### 1.3.9 SATD Status Bit

This status bit controls operations performed in the D unit.

- When SATD = 0, no saturation is performed.
- When SATD = 1 and an overflow is detected, the destination register is saturated.


### 1.3.10 SMUL Status Bit

- When SMUL = 0, the saturation mode is OFF.
- When SMUL = 1, the saturation mode is ON. When SMUL = 1, FRCT = 1, and SATD = 1, the result of 18000h × 18000h is saturated to 00 7FFF FFFFh (regardless of the value of the M40 bit). This forces the product of the two negative numbers to be a positive number. For multiply-and-accumulate/subtract instructions, the saturation is performed after the multiplication and before the addition/subtraction.
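
The special case described above can be sketched as follows. This Python model is an assumption-laden illustration, not the hardware algorithm: the function name `multiply` is hypothetical, operands are signed 16-bit values (so 18000h corresponds to -8000h), and FRCT = 1 is modelled as a one-bit left shift of the product.

```python
MASK40 = (1 << 40) - 1


def multiply(x: int, y: int, frct: int, smul: int, satd: int) -> int:
    """Return the 40-bit product pattern of two signed 16-bit operands."""
    product = x * y
    if frct:
        product <<= 1                    # assumed fractional-mode shift
    if smul and frct and satd and x == -0x8000 and y == -0x8000:
        # The only case covered by the SMUL rule in the text: the shifted
        # product would be 8000 0000h, which is forced to 00 7FFF FFFFh.
        product = 0x7FFF_FFFF
    return product & MASK40


if __name__ == "__main__":
    print(f"{multiply(-0x8000, -0x8000, 1, 1, 1):010X}")
    print(f"{multiply(-0x8000, -0x8000, 1, 0, 1):010X}")
```

In this model, leaving SMUL at 0 lets the product reach 00 8000 0000h, whose bit 31 is set; the saturated value keeps bits 31-0 positive.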


### 1.3.11 SXMD Status Bit

This status bit controls operations performed in the D unit.

- When SXMD = 0, input operands are zero extended.
- When SXMD = 1, input operands are sign extended.


### 1.3.12 Test Control Status Bit (TCx)

The test/control status bits (TC1 or TC2) hold the result of a test performed by the instruction.

### 1.4 Instruction Set Notes and Rules

### 1.4.1 Notes

- Algebraic syntax keywords and operand modifiers are case insensitive. You can write:
abdst(*AR0, *ar1, AC0, ac1)
or
aBdST(*ar0, *aR1, aC0, Ac1)
- Operands for commutative operations (+, *, \&, |, ^) can be arranged in any order.
- Expression qualifiers can be specified in any order. For example, these two instructions are equivalent:

```
AC0 = m40(rnd(uns(*AR0) * uns(*AR1)))
AC0 = rnd(m40(uns(*AR0) * uns(*AR1)))
```

- Algebraic instructions must use parentheses in the exact form shown in the instruction set. For example, this instruction is legal:

```
AC0 = AC0 + (AC1 << T0)
```

while both of these instructions are illegal:

AC0 = AC0 + ((AC1 << T0))

AC0 = AC0 + AC1 << T0

### 1.4.2 Rules

- Simple instructions are not allowed to span multiple lines. The one exception is a single instruction that uses the "," notation to imply parallelism; such an instruction may be split after the ",".
  The following example shows a single instruction (dual multiply) occupying two lines:

```
ACx = m40(rnd(uns(Xmem) * uns(coef (Cmem)))),
ACy = m40(rnd(uns(Ymem) * uns(coef (Cmem))))
```

- User-defined parallelism instructions (using the || notation) are allowed to span multiple lines. For example, all of the following instructions are legal:

```
AC0 = AC1 || AC2 = AC3

AC0 = AC1 ||
AC2 = AC3

AC0 = AC1
|| AC2 = AC3

AC0 = AC1
||
AC2 = AC3
```

- The block repeat syntax uses braces to delimit the block that is to be repeated:

```
blockrepeat {
    instr
    instr
      :
    instr
}

localrepeat {
    instr
    instr
      :
    instr
}
```

The left opening brace must appear on the same line as the repeat keyword. The right closing brace must appear alone on a line (trailing comments allowed).

Note that a label placed just inside the closing brace of the loop is effectively outside the loop. The following two code sequences are equivalent:

```
localrepeat {
    instr1
    instr2
Label:
}
    instr3
```

and

```
localrepeat {
    instr1
    instr2
}
Label:
    instr3
```

A label is the address of the first construct following the label that gets assembled into code in the object file. A closing brace does not generate any code and so the label marks the address of the first instruction that generates code, that is, instr3.

In this example, "goto Label" exits the loop, which is somewhat unintuitive:

```
localrepeat {
    goto Label
    instr2
    Label:
    }
    instr3
```


### 1.4.2.1 Reserved Words

Register names and algebraic syntax keywords are reserved. They may not be used as names of identifiers, labels, etc.

### 1.4.2.2 Literal and Address Operands

Literals in the algebraic strings are denoted as K or k fields. In the Smem address modes that require an offset, the offset is also a literal (K16 or k3). 8-bit and 16-bit literals are allowed to be link-time relocatable; for other literals, the value must be known at assembly time.

Addresses are the elements of the algebraic strings denoted by P, L, and I. Further, 16-bit and 24-bit absolute address Smem modes are addresses, as is the dma Smem mode, denoted by the '@' syntax. Addresses may be assembly-time constants or symbolic constants or expressions whose values are known at link time.

Both literals and addresses follow syntax rule 1. For addresses only, rules 2 and 3 also apply.

## Rule 1

A valid address or literal is a \# followed by one of the following:

- a number (\#123)
- an identifier (\#FOO)
- a parenthesized expression (\#(FOO + 2))

Note that \# is not used inside the expression.

## Rule 2

When an address is used in a dma, the address does not need to have a leading \#, be it a number, a symbol, or an expression. These are all legal:

- @\#123
- @123
- @\#foo
- @foo
- @\#(foo+2)
- @(foo+2)

## Rule 3

When used in contexts other than dma (such as branch targets or Smem absolute address), addresses generally need a leading \#. As a convenience, the \# may be omitted in front of an identifier. These are all legal:

| Branch | Absolute Address |
| :--- | :--- |
| goto \#123 | *(\#123) |
| goto \#foo | *(\#foo) |
| goto foo | *(foo) |
| goto \#(foo+2) | *(\#(foo+2)) |

These are illegal:

| Branch | Absolute Address |
| :--- | :--- |
| goto 123 | *(123) |
| goto (foo+2) | *((foo+2)) |

### 1.4.2.3 Memory Operands

- Syntax of Smem is the same as that of Lmem or Baddr.
- In the following instruction syntaxes, Smem cannot reference a memory-mapped register (MMR). No instruction can access a byte within a memory-mapped register. If Smem is an MMR in one of the following syntaxes, the DSP sends a hardware bus-error interrupt (BERRINT) request to the CPU.

```
dst = uns(high_byte(Smem))
dst = uns(low_byte(Smem))
ACx = low_byte(Smem) << #SHIFTW
ACx = high_byte(Smem) << #SHIFTW
high_byte(Smem) = src
low_byte(Smem) = src
```

- Syntax of Xmem is the same as that of Ymem.
- Syntax of coefficient operands, Cmem:
  - *CDP
  - *CDP+
  - *CDP-
  - *(CDP + T0), when C54CM = 0
  - *(CDP + AR0), when C54CM = 1

When an instruction uses a Cmem operand with paralleled instructions, the pointer modification of the Cmem operand must be the same for both instructions of the paralleled pair or the assembler generates an error. For example:

```
AC0 = AC0 + (*AR2+ * coef(*CDP+)),
AC1 = AC1 + (*AR3+ * coef(*CDP+))
```

An optional mmr prefix may be specified for indirect memory operands, for example, mmr(*AR0). This is an assertion by you that this is an access to a memory-mapped register. The assembler checks whether such access is legal in the given circumstances.

The mmr prefix is supported for Xmem, Ymem, indirect Smem, indirect Lmem, and Cmem operands. It is not supported for direct memory operands; it is expected that an explicit mmap() parallel instruction is used in conjunction with direct memory operands to indicate MMR access.

Note that the mmr prefix is part of the syntax. It is an implementation restriction that mmr cannot exchange positions with other prefixes around the memory operand, such as dbl or uns. If several prefixes are specified, mmr must be the innermost prefix. Thus, uns(mmr(*AR0)) is legal, but mmr(uns(*AR0)) is not legal.

- The following indirect operands cannot be used for accesses to I/O space. An instruction using one of these operands requires a 2-byte extension for the constant. This extension would prevent the use of the port() qualifier needed to indicate an I/O-space access.

```
*ARn(#K16)
*+ARn(#K16)
*CDP(#K16)
*+CDP(#K16)
```

Also, the following instructions that include the delay operation cannot be used for accesses to I/O space:

```
delay(Smem)
ACx = rnd (ACx + (Smem * coef(Cmem))) [,T3 = Smem],
delay(Smem)
```

Any illegal access to I/O space will generate a hardware bus-error interrupt (BERRINT) to be handled by the CPU.

### 1.4.2.4 Operand Modifiers

Operand modifiers look like function calls on operands. Note that uns is an operand modifier and an instruction modifier meaning unsigned. The operand modifier uns is used when the operand is modified on the way to the rest of the operation (multiply-and-accumulate). The instruction modifier uns is used when the whole operation is affected (multiply, register compare, compare and branch).

| Modifier | Meaning |
| :--- | :--- |
| dbl | Access a true 32-bit memory operand |
| dual | Access a 32-bit memory operand for use as two independent 16-bit halves of the given operation |
| HI | Access upper 16 bits of the accumulator |
| high_byte | Access the high byte of the memory location |
| LO | Access lower 16 bits of the accumulator |
| low_byte | Access the low byte of the memory location |
| pair | Dual register access |
| rnd | Round |
| saturate | Saturate |
| uns | Unsigned operand |

When an instruction uses a Cmem operand with paralleled instructions and the Cmem operand is defined as unsigned (uns), both Cmem operands of the paralleled pair must be defined as unsigned (and vice versa).

When an instruction uses both Xmem and Ymem operands with paralleled instructions and the Xmem operand is defined as unsigned (uns), the Ymem operand must also be defined as unsigned (and vice versa).

### 1.4.2.5 Operator Syntax Rules

Instructions that read and write the same operand can also be written in op-assign form. For example:

```
AC0 = AC0 + *AR4
```

can also be written:

AC0 += *AR4

This form is supported for these operations: +=, -=, \&=, |=, ^=

Note that in certain instances the use of op-assign notation results in ambiguous algebraic assembly. This happens if the op-assign operator is not delimited by white space, for example:

*AR0+=\#4 is ambiguous: is it *AR0 += \#4 or *AR0+ = \#4? The assembler always parses adjacent += as plus-assign; therefore, this instruction is parsed as *AR0 += \#4.

*AR0+=*AR1 is ambiguous: is it *AR0 += *AR1 or *AR0+ = *AR1? Once again, the first form, *AR0 += *AR1, is used. This is not a valid instruction, so an error is printed.

### 1.5 Nonrepeatable Instructions

Table 1-4 lists the instructions that cannot be used in a repeatable instruction.

Table 1-4. Nonrepeatable Instructions

| Instruction Description | Algebraic Syntax That Cannot Be Repeated |
| :--- | :--- |
| Branch Conditionally | if (cond) goto l4 |
|  | if (cond) goto L8 |
|  | if (cond) goto L16 |
|  | if (cond) goto P24 |
| Branch Unconditionally | goto ACx |
|  | goto L7 |
|  | goto L16 |
|  | goto P24 |
| Branch on Auxiliary Register Not Zero | if (ARn_mod != \#0) goto L16 |
| Call Conditionally | if (cond) call L16 |
|  | if (cond) call P24 |
| Call Unconditionally | call ACx |
|  | call L16 |
|  | call P24 |
| Clear Status Register Bit | bit(STx, k4) = \#0 |
| Compare and Branch | compare (uns(src RELOP K8)) goto L8 |
| Execute Conditionally | if (cond) execute(AD_Unit) |
|  | if (cond) execute(D_Unit) |
| Idle | idle |
| Load CPU Register from Memory | DP = Smem |
|  | RETA = dbl(Lmem) |
| Load CPU Register with Immediate Value | DP = k16 |
| Move CPU Register Content to Auxiliary or Temporary Register | TAx = RPTC |
| Repeat Block of Instructions Unconditionally | localrepeat{} |
|  | blockrepeat{} |
| Repeat Single Instruction Conditionally | while (cond \&\& (RPTC < k8)) repeat |
| Repeat Single Instruction Unconditionally | repeat(k8) |
|  | repeat(k16) |
|  | repeat(CSR) |
| Repeat Single Instruction Unconditionally and Decrement CSR | repeat(CSR), CSR -= k4 |
| Repeat Single Instruction Unconditionally and Increment CSR | repeat(CSR), CSR += TAx |
|  | repeat(CSR), CSR += k4 |
| Return Conditionally | if (cond) return |
| Return Unconditionally | return |
| Return from Interrupt | return_int |
| Round Accumulator Content | ACy = rnd(ACx) |
| Set Status Register Bit | bit(STx, k4) = \#1 |
| Software Interrupt | intr(k5) |
| Software Reset | reset |
| Software Trap | trap(k5) |
| Store CPU Register Content to Memory | dbl(Lmem) = RETA |

## Parallelism Features and Rules

This chapter describes the parallelism features and rules of the TMS320C55x™ DSP algebraic instruction set.

Topic

- 2.1 Parallelism Features
- 2.2 Parallelism Basics
- 2.3 Resource Conflicts
- 2.4 Soft-Dual Parallelism
- 2.5 Execute Conditionally Instructions
- 2.6 Other Exceptions

### 2.1 Parallelism Features

The C55x™ DSP architecture enables you to execute two instructions in parallel within the same cycle of execution. The types of parallelism are:

- Built-in parallelism within a single instruction.

  Some instructions perform two different operations in parallel. A comma is used to separate the two operations. This type of parallelism is also called implied parallelism. For example:

| AC0 = *AR0 * coef(*CDP),<br>AC1 = *AR1 * coef(*CDP) | This is a single instruction. The data referenced by AR0 is multiplied by the coefficient referenced by CDP. At the same time, the data referenced by AR1 is multiplied by the same coefficient (CDP). |
| :--- | :--- |

- User-defined parallelism between two instructions.

  Two instructions may be paralleled by you or the C compiler. The parallel bars, ||, are used to separate the two instructions to be executed in parallel. For example:

| AC1 = *AR1- * *AR2+<br>\|\| T1 = T1 ^ *AR2 | The first instruction performs a multiplication in the D-unit. The second instruction performs a logical operation in the A-unit ALU. |
| :--- | :--- |

- Built-in parallelism can be combined with user-defined parallelism. Parentheses can be used to mark the boundaries of the two instructions. For example:

```
(AC2 = *AR3+ * AC1,
T3 = *AR3+)
|| AR1 = #5
```

The first instruction includes implied parallelism. The second instruction is paralleled by you.


### 2.2 Parallelism Basics

In the parallel pair, all of these constraints must be met:
- No resource conflicts as detailed in section 2.3.
- One instruction must have a parallel enable bit or the pair must qualify for soft-dual parallelism as detailed in section 2.4.
- No memory operand may use an addressing mode that requires a constant that is 16 bits or larger:

```
*abs16(#k16)
*(#k23)
*port(#k16)
*ARn(K16)
*+ARn(K16)
*CDP(K16)
*+CDP(K16)
```

- The following instructions cannot be in parallel:

```
if (cond) goto P24
if (cond) call P24
idle
intr(k5)
reset
trap(k5)
```

- Neither instruction in the parallel pair can use any of these instruction or operand modifiers:

```
circular()
linear()
mmap()
readport()
writeport()
```

- A particular register or memory location can only be written once per pipeline phase. Violations of this rule take many forms. Loading the same register twice is a simple case. Other cases include:
  - Conflicting address mode modifications (for example, *AR2+ versus *AR2-)
  - Combining a SWAP instruction (modifies all of its registers) with any other instruction that writes one of the same registers
- Data stack pointer (XSP) or system stack pointer (XSSP) modifications cannot be combined with any of the following instructions:
  - Call Conditionally (if (cond) call instructions)
  - Call Unconditionally (call instructions)
  - Push to Top of Stack (push instructions)
  - Pop from Top of Stack (pop instructions)
  - Return Conditionally (if (cond) return instructions)
  - Return Unconditionally (return instructions)
  - Return from Interrupt (return_int instructions)
  - trap or intr instructions
- When both instructions in a parallel pair modify a status bit, the value of that status bit becomes undefined.


### 2.3 Resource Conflicts

Every instruction uses some set of operators, address generation units, and buses, collectively called resources, while executing. To determine which resources are used by a specific instruction, see Table 4-1. Two instructions in parallel use all the resources of the individual instructions. A resource conflict occurs when two instructions use a combination of resources that is not supported on the C55x device. This section details the resource conflicts.

### 2.3.1 Operators

You may use each of these operators only once:

- D Unit ALU
- D Unit Shift
- D Unit Swap
- A Unit Swap
- A Unit ALU
- P Unit

For an instruction that uses multiple operators, any other instruction that uses one or more of those same operators may not be placed in parallel.

### 2.3.2 Address Generation Units

You may use no more than the indicated number of data address generation units:

- 2 Data Address (DA) Generation Units
- 1 Coefficient Address (CA) Generation Unit
- 1 Stack Address (SA) Generation Unit


### 2.3.3 Buses

You may use no more than the indicated number of buses:

- 2 Data Read (DR) Buses
- 1 Coefficient Read (CR) Bus
- 2 Data Write (DW) Buses
- 1 ACB Bus - brings D-unit registers to A-unit and P-unit operators
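
The limits in sections 2.3.1 through 2.3.3 amount to a per-resource budget for the parallel pair. The Python sketch below shows one way to check a pair against that budget. It is only an illustration: the function name `resource_conflicts`, the short resource keys, and the per-instruction usage dictionaries in the example are hypothetical, and real usage data would have to come from Table 4-1.

```python
from collections import Counter

# Maximum uses per parallel pair, as listed in sections 2.3.1 to 2.3.3.
LIMITS = {
    "D Unit ALU": 1, "D Unit Shift": 1, "D Unit Swap": 1,
    "A Unit Swap": 1, "A Unit ALU": 1, "P Unit": 1,
    "DA": 2, "CA": 1, "SA": 1,            # address generation units
    "DR": 2, "CR": 1, "DW": 2, "ACB": 1,  # buses
}


def resource_conflicts(first: dict, second: dict) -> list:
    """Return the sorted names of resources the pair uses beyond the limit."""
    total = Counter(first) + Counter(second)
    # An unknown resource name raises KeyError rather than passing silently.
    return sorted(name for name, used in total.items() if used > LIMITS[name])


if __name__ == "__main__":
    # Hypothetical usage figures, for illustration only.
    load = {"D Unit ALU": 1, "DA": 1, "DR": 1}
    logic = {"A Unit ALU": 1, "DA": 1, "DR": 1}
    print(resource_conflicts(load, logic))
    print(resource_conflicts(load, load))
```

An empty list means the pair stays within the listed limits; it does not cover the other constraints of section 2.2.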

### 2.4 Soft-Dual Parallelism

Instructions that reference memory operands do not have parallel enable bits. Two such instructions may still be combined with a type of parallelism called soft-dual parallelism. The constraints of soft-dual parallelism are:
- Both memory operands must meet the constraints of the dual AR indirect addressing mode (Xmem and Ymem), as described in section 3.4.2. The operands available for the dual AR indirect addressing mode are:

```
*ARn
*ARn+
*ARn-
*(ARn + AR0)
*(ARn + T0)
*(ARn - AR0)
*(ARn - T0)
*ARn(AR0)
*ARn(T0)
*(ARn + T1)
*(ARn - T1)
```

- Neither instruction can contain any of the following:
  - Instructions embedding high_byte(Smem) and low_byte(Smem):
    - dst = uns(high_byte(Smem))
    - dst = uns(low_byte(Smem))
    - ACx = low_byte(Smem) << \#SHIFTW
    - ACx = high_byte(Smem) << \#SHIFTW
    - high_byte(Smem) = src
    - low_byte(Smem) = src
  - These instructions that read and write the same memory location:

```
cbit(Smem, src)
bit(Smem, src) = #0
bit(Smem, src) = #1
TCx = bit(Smem, k4), bit(Smem, k4) = #1
TCx = bit(Smem, k4), bit(Smem, k4) = #0
TCx = bit(Smem, k4), cbit(Smem, k4)
```

- With regard to soft-dual parallelism, the mar(Smem) and XAdst = mar(Smem) instructions have the same properties as any memory reference instruction.

### 2.4.1 Soft-Dual Parallelism of MAR Instructions

Although the following modify auxiliary register (MAR) instructions do not reference memory and do not have parallel enable bits, they may be combined with each other or with any other memory reference instruction (not limited to Xmem/Ymem) to form soft-dual parallelism.

```
mar(TAy + TAx)
mar(TAx + k8)
mar(TAy = TAx)
mar(TAx = k8)
mar(TAy - TAx)
mar(TAx - k8)
```

Note that this is not the full list of MAR instructions; the instruction mar(TAx = D16) is not included.

### 2.5 Execute Conditionally Instructions

The parallelization of the execute conditionally, if (cond) execute, instructions does not adhere to the descriptions in this chapter. All of the specific instances of legal parallelism are covered in the execute conditionally descriptions in Chapter 5.

### 2.6 Other Exceptions

The following are other exceptions not covered elsewhere in this chapter.
- An instruction that reads the repeat counter register (RPTC) may not be combined with any single-repeat instruction:

- repeat()
- repeat (CSR)
- while (cond) repeat


## Introduction to Addressing Modes

This chapter provides an introduction to the addressing modes of the TMS320C55x™ DSP.

Topic

- 3.1 Introduction to the Addressing Modes
- 3.2 Absolute Addressing Modes
- 3.3 Direct Addressing Modes
- 3.4 Indirect Addressing Modes
- 3.5 Circular Addressing

### 3.1 Introduction to the Addressing Modes

The TMS320C55x DSP supports three types of addressing modes that enable flexible access to data memory, to memory-mapped registers, to register bits, and to I/O space:

- The absolute addressing mode allows you to reference a location by supplying all or part of an address as a constant in an instruction.
- The direct addressing mode allows you to reference a location using an address offset.
- The indirect addressing mode allows you to reference a location using a pointer.

Each addressing mode provides one or more types of operands. An instruction that supports an addressing-mode operand has one of the following syntax elements listed in Table 3-1.

Table 3-1. Addressing-Mode Operands

| Syntax Element(s) | Description |
| :--- | :--- |
| Baddr | When an instruction contains Baddr, that instruction can access one or two bits in an accumulator (AC0-AC3), an auxiliary register (AR0-AR7), or a temporary register (T0-T3). Only the register bit test/set/clear/complement instructions support Baddr. As you write one of these instructions, replace Baddr with a compatible operand. |
| Cmem | When an instruction contains Cmem, that instruction can access a single word (16 bits) of data from data memory. As you write the instruction, replace Cmem with a compatible operand. |
| HI(Cmem)/LO(Cmem) | When an instruction contains HI(Cmem)/LO(Cmem), that instruction can access a long word (32 bits) of data from data memory. As you write the instruction, replace Cmem with a compatible operand. |
| Lmem | When an instruction contains Lmem, that instruction can access a long word (32 bits) of data from data memory or from memory-mapped registers. As you write the instruction, replace Lmem with a compatible operand. |

### 3.2 Absolute Addressing Modes

Table 3-2 lists the absolute addressing modes available.
Table 3-2. Absolute Addressing Modes

| Addressing Mode | Description |
| :--- | :--- |
| k16 absolute | This mode uses the 7-bit register called DPH (high part of the extended data page <br> register) and a 16-bit unsigned constant to form a 23-bit data-space address. This mode <br> is used to access a memory location or a memory-mapped register. |
| k23 absolute | This mode enables you to specify a full address as a 23-bit unsigned constant. This <br> mode is used to access a memory location or a memory-mapped register. |
| I/O absolute | This mode enables you to specify an I/O address as a 16-bit unsigned constant. This <br> mode is used to access a location in I/O space. |

### 3.2.1 k16 Absolute Addressing Mode

The k16 absolute addressing mode uses the operand *abs16(\#k16), where k16 is a 16-bit unsigned constant. DPH (the high part of the extended data page register) and k16 are concatenated to form a 23-bit data-space address. An instruction using this addressing mode encodes the constant as a 2-byte extension to the instruction. Because of the extension, an instruction using this mode cannot be executed in parallel with another instruction.
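
The concatenation described above can be written out as a one-line computation. The following Python sketch is illustrative; the function name `k16_absolute_address` is hypothetical and the range checks are an addition of this example.

```python
def k16_absolute_address(dph: int, k16: int) -> int:
    """Concatenate the 7-bit DPH and a 16-bit constant into a 23-bit address."""
    if not 0 <= dph <= 0x7F:
        raise ValueError("DPH is a 7-bit value")
    if not 0 <= k16 <= 0xFFFF:
        raise ValueError("k16 is a 16-bit unsigned constant")
    return (dph << 16) | k16


if __name__ == "__main__":
    print(f"{k16_absolute_address(0x01, 0x0060):06X}")
```

The largest address this can form is 7F FFFFh, which matches the 23-bit data space.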

### 3.2.2 k23 Absolute Addressing Mode

The k23 absolute addressing mode uses the *(\#k23) operand, where k23 is a 23-bit unsigned constant. An instruction using this addressing mode encodes the constant as a 3-byte extension to the instruction (the most-significant bit of this 3-byte extension is discarded). Because of the extension, an instruction using this mode cannot be executed in parallel with another instruction.

Instructions using the operand *(\#k23) to access the memory operand Smem cannot be used in a repeatable instruction. See Table 1-4 for a list of these instructions.

### 3.2.3 I/O Absolute Addressing Mode

The I/O absolute addressing mode uses the *port(\#k16) operand, where k16 is a 16-bit unsigned constant. An instruction using this addressing mode encodes the constant as a 2-byte extension to the instruction. Because of the extension, an instruction using this mode cannot be executed in parallel with another instruction. The delay() instruction cannot use this mode.

### 3.3 Direct Addressing Modes

Table 3-3 lists the direct addressing modes available.
Table 3-3. Direct Addressing Modes

| Addressing Mode | Description |
| :--- | :--- |
| DP direct | This mode uses the main data page specified by DPH (high part of the extended data <br> page register) in conjunction with the data page register (DP). This mode is used to <br> access a memory location or a memory-mapped register. |
| SP direct | This mode uses the main data page specified by SPH (high part of the extended stack <br> pointers) in conjunction with the data stack pointer (SP). This mode is used to access <br> stack values in data memory. |
| Register-bit direct | This mode uses an offset to specify a bit address. This mode is used to access one <br> register bit or two adjacent register bits. |
| PDP direct | This mode uses the peripheral data page register (PDP) and an offset to specify an I/O <br> address. This mode is used to access a location in I/O space. |

The DP direct and SP direct addressing modes are mutually exclusive. The mode selected depends on the CPL bit in status register ST1_55:

| CPL | Addressing Mode Selected |
| :--- | :--- |
| 0 | DP direct addressing mode |
| 1 | SP direct addressing mode |

The register-bit and PDP direct addressing modes are independent of the CPL bit.

### 3.3.1 DP Direct Addressing Mode

When an instruction uses the DP direct addressing mode, a 23-bit address is formed. The 7 MSBs are taken from DPH, which selects one of the 128 main data pages (0 through 127). The 16 LSBs are the sum of two values:

- The value in the data page register (DP). DP identifies the start address of a 128-word local data page within the main data page. This start address can be any address within the selected main data page.
- A 7-bit offset (Doffset) calculated by the assembler. The calculation depends on whether you are accessing data memory or a memory-mapped register (using the mmap() qualifier).

The concatenation of DPH and DP is called the extended data page register (XDP). You can load DPH and DP individually, or you can use an instruction that loads XDP.
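
The address formation for DP direct addressing can be sketched as follows. This Python model is an illustration under stated assumptions: the function name `dp_direct_address` is hypothetical, Doffset is taken as an already computed input (the assembler's calculation is not modelled), and the sum DP + Doffset is assumed to be truncated to 16 bits.

```python
def dp_direct_address(dph: int, dp: int, doffset: int) -> int:
    """Form a 23-bit address from DPH (7 MSBs) and DP + Doffset (16 LSBs)."""
    if not 0 <= dph <= 0x7F:
        raise ValueError("DPH selects one of 128 main data pages (0-127)")
    if not 0 <= dp <= 0xFFFF:
        raise ValueError("DP is a 16-bit value")
    if not 0 <= doffset <= 0x7F:
        raise ValueError("Doffset is a 7-bit offset")
    low16 = (dp + doffset) & 0xFFFF   # assumption: the sum wraps at 16 bits
    return (dph << 16) | low16


if __name__ == "__main__":
    print(f"{dp_direct_address(0x03, 0x1000, 0x05):06X}")
```

Because DPH supplies the upper bits separately, the access stays on the main data page selected by DPH in this model, whatever the value of DP.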

### 3.3.2 SP Direct Addressing Mode

When an instruction uses the SP direct addressing mode, a 23-bit address is formed. The 7 MSBs are taken from SPH. The 16 LSBs are the sum of the SP value and a 7-bit offset that you specify in the instruction. The offset can be a value from 0 to 127. The concatenation of SPH and SP is called the extended data stack pointer (XSP). You can load SPH and SP individually, or you can use an instruction that loads XSP.

On the first main data page, addresses 00 0000h-00 005Fh are reserved for the memory-mapped registers. If any of your data stack is in main data page 0, make sure it uses only addresses 00 0060h-00 FFFFh on that page.
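
The SP direct address and the reserved-range check can be modelled together. In the Python sketch below, the names `sp_direct_address` and `in_mmr_range` are hypothetical, and the sum SP + offset is assumed to be truncated to 16 bits.

```python
MMR_FIRST = 0x00_0000  # first address reserved for memory-mapped registers
MMR_LAST = 0x00_005F   # last address reserved for memory-mapped registers


def sp_direct_address(sph: int, sp: int, offset: int) -> int:
    """Form a 23-bit address from SPH (7 MSBs) and SP + offset (16 LSBs)."""
    if not 0 <= sph <= 0x7F:
        raise ValueError("SPH is a 7-bit value")
    if not 0 <= sp <= 0xFFFF:
        raise ValueError("SP is a 16-bit value")
    if not 0 <= offset <= 127:
        raise ValueError("the offset can be a value from 0 to 127")
    low16 = (sp + offset) & 0xFFFF    # assumption: the sum wraps at 16 bits
    return (sph << 16) | low16


def in_mmr_range(address: int) -> bool:
    """True if the address falls in the range reserved for MMRs on page 0."""
    return MMR_FIRST <= address <= MMR_LAST


if __name__ == "__main__":
    address = sp_direct_address(0x00, 0x0050, 4)
    print(f"{address:06X} reserved={in_mmr_range(address)}")
```

A check like `in_mmr_range` could be applied to every stack address a program expects to use on main data page 0.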

### 3.3.3 Register-Bit Direct Addressing Mode

In the register-bit direct addressing mode, the offset you supply in the operand, @bitoffset, is an offset from the LSB of the register. For example, if bitoffset is 0, you are addressing the LSB of a register. If bitoffset is 3, you are addressing bit 3 of the register.

Only the register bit test/set/clear/complement instructions support this mode. These instructions enable you to access bits in the following registers only: the accumulators (AC0-AC3), the auxiliary registers (AR0-AR7), and the temporary registers (T0-T3).

### 3.3.4 PDP Direct Addressing Mode

When an instruction uses the PDP direct addressing mode, a 16-bit I/O address is formed. The 9 MSBs are taken from the 9-bit peripheral data page register (PDP), which selects one of the 512 peripheral data pages (0 through 511). Each page has 128 words (0 to 127). You select a particular word by specifying a 7-bit offset (Poffset) in the instruction. For example, to access the first word on a page, use an offset of 0.

You must use a readport() or writeport() instruction qualifier to indicate that you are accessing an I/O-space location rather than a data-memory location. You place the readport() or the writeport() instruction qualifier in parallel with the instruction that performs the I/O-space access.
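
The PDP direct address is a plain concatenation of page and offset. The Python sketch below illustrates it; the function name `pdp_direct_address` is hypothetical, and the readport() and writeport() qualifiers are outside the scope of the model.

```python
def pdp_direct_address(pdp: int, poffset: int) -> int:
    """Form a 16-bit I/O address from the 9-bit PDP and a 7-bit Poffset."""
    if not 0 <= pdp <= 511:
        raise ValueError("PDP selects one of 512 peripheral data pages (0-511)")
    if not 0 <= poffset <= 127:
        raise ValueError("each page has 128 words (0 to 127)")
    return (pdp << 7) | poffset


if __name__ == "__main__":
    print(f"{pdp_direct_address(1, 0):04X}")
```

The 512 pages of 128 words cover the full 16-bit I/O address range.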

### 3.4 Indirect Addressing Modes

Table 3-4 lists the indirect addressing modes available. You may use these modes for linear addressing or circular addressing.

Table 3-4. Indirect Addressing Modes

| Addressing Mode | Description |
| :--- | :--- |
| AR indirect | This mode uses one of eight auxiliary registers (AR0-AR7) to point to data. The way the <br> CPU uses the auxiliary register to generate an address depends on whether you are <br> accessing data space (memory or memory-mapped registers), individual register bits, <br> or I/O space. |
| Dual AR indirect | This mode uses the same address-generation process as the AR indirect addressing <br> mode. This mode is used with instructions that access two or more data-memory <br> locations. |
| CDP indirect | This mode uses the coefficient data pointer (CDP) to point to data. The way the CPU <br> uses CDP to generate an address depends on whether you are accessing data space <br> (memory or memory-mapped registers), individual register bits, or I/O space. |
| Coefficient indirect | This mode uses the same address-generation process as the CDP indirect addressing <br> mode. This mode is available to support instructions that can access a coefficient in data <br> memory at the same time they access two other data-memory values using the dual AR <br> indirect addressing mode. |

### 3.4.1 AR Indirect Addressing Mode

The AR indirect addressing mode uses an auxiliary register ARn (n = 0, 1, 2, 3, 4, 5, 6, or 7) to point to data. The way the CPU uses ARn to generate an address depends on the access type:

| For An Access To ... | ARn Contains ... |
| :--- | :--- |
| Data space (memory or registers) | The 16 least significant bits (LSBs) of a 23-bit address. The 7 most significant bits (MSBs) are supplied by ARnH, which is the high part of extended auxiliary register XARn. For accesses to data space, use an instruction that loads XARn; ARn can be individually loaded, but ARnH cannot be loaded. |
| A register bit (or bit pair) | A bit number. Only the register bit test/set/clear/complement instructions support AR indirect accesses to register bits. These instructions enable you to access bits in the following registers only: the accumulators (AC0-AC3), the auxiliary registers (AR0-AR7), and the temporary registers (T0-T3). |
| I/O space | A 16-bit I/O address. |

The AR indirect addressing-mode operands available depend on the ARMS bit of status register ST2_55:

| ARMS | DSP Mode or Control Mode |
| :--- | :--- |
| 0 | DSP mode. The CPU can use the list of DSP mode operands (Table 3-5), which provide efficient execution of DSP-intensive applications. |
| 1 | Control mode. The CPU can use the list of control mode operands (Table 3-6), which enable optimized code size for control system applications. |

Table 3-5 (page 3-8) introduces the DSP mode operands available for the AR indirect addressing mode. Table 3-6 (page 3-12) introduces the control mode operands. When using the tables, keep in mind that:

- Both pointer modification and address generation are linear or circular according to the pointer configuration in status register ST2_55. The content of the appropriate 16-bit buffer start address register (BSA01, BSA23, BSA45, or BSA67) is added only if circular addressing is activated for the chosen pointer.
- All additions to and subtractions from the pointers are done modulo 8M. ARnH is updated by the hardware when the pointer modification crosses a main data page boundary.
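
The modulo 8M behavior for a linear data-space pointer can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the function names are hypothetical, only the linear case is covered (no circular buffer and no BSAxx addition), and the pointer is post-modified by a signed step.

```python
MODULO_8M = 1 << 23  # data space holds 8M words, addressed with 23 bits


def data_address(arnh: int, arn: int) -> int:
    """Concatenate ARnH (7 MSBs) and ARn (16 LSBs) into a 23-bit address."""
    return ((arnh & 0x7F) << 16) | (arn & 0xFFFF)


def post_modify(arnh: int, arn: int, step: int):
    """Return (address used, new ARnH, new ARn) for a linear post-modification."""
    address = data_address(arnh, arn)
    updated = (address + step) % MODULO_8M  # additions are done modulo 8M
    return address, updated >> 16, updated & 0xFFFF


# A post-increment at the end of main data page 1 carries into ARnH.
print(post_modify(0x01, 0xFFFF, 1))  # (131071, 2, 0)
```

In this model, a step of 1 corresponds to a 16-bit access through *ARn+ and a step of 2 to a 32-bit access.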

Table 3-5. DSP Mode Operands for the AR Indirect Addressing Mode

| Operand | Pointer Modification | Supported Access Types |
| :---: | :---: | :---: |
| *ARn | ARn is not modified. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn+ | ARn is incremented after the address is generated: <br> If 16-bit/1-bit operation: ARn = ARn + 1 <br> If 32-bit/2-bit operation: ARn = ARn + 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn- | ARn is decremented after the address is generated: <br> If 16-bit/1-bit operation: ARn = ARn - 1 <br> If 32-bit/2-bit operation: ARn = ARn - 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *+ARn | ARn is incremented before the address is generated: <br> If 16-bit/1-bit operation: ARn = ARn + 1 <br> If 32-bit/2-bit operation: ARn = ARn + 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *-ARn | ARn is decremented before the address is generated: <br> If 16-bit/1-bit operation: ARn = ARn - 1 <br> If 32-bit/2-bit operation: ARn = ARn - 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + AR0) | The 16-bit signed constant in AR0 is added to ARn after the address is generated: ARn = ARn + AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + T0) | The 16-bit signed constant in T0 is added to ARn after the address is generated: ARn = ARn + T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - AR0) | The 16-bit signed constant in AR0 is subtracted from ARn after the address is generated: ARn = ARn - AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - T0) | The 16-bit signed constant in T0 is subtracted from ARn after the address is generated: ARn = ARn - T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(AR0) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in AR0 is used as an offset from that base pointer. <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(T0) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in T0 is used as an offset from that base pointer. <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(T1) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in T1 is used as an offset from that base pointer. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + T1) | The 16-bit signed constant in T1 is added to ARn after the address is generated: ARn = ARn + T1 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - T1) | The 16-bit signed constant in T1 is subtracted from ARn after the address is generated: ARn = ARn - T1 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + AR0B) | The 16-bit signed constant in AR0 is added to ARn after the address is generated: ARn = ARn + AR0 <br> (The addition is done with reverse carry propagation) <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. <br> Note: When this bit-reverse operand is used, ARn cannot be used as a circular pointer. If ARn is configured in ST2_55 for circular addressing, the corresponding buffer start address register value (BSAxx) is added to ARn, but ARn is not modified so as to remain inside a circular buffer. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + T0B) | The 16-bit signed constant in T0 is added to ARn after the address is generated: ARn = ARn + T0 <br> (The addition is done with reverse carry propagation) <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. <br> Note: When this bit-reverse operand is used, ARn cannot be used as a circular pointer. If ARn is configured in ST2_55 for circular addressing, the corresponding buffer start address register value (BSAxx) is added to ARn, but ARn is not modified so as to remain inside a circular buffer. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - AR0B) | The 16-bit signed constant in AR0 is subtracted from ARn after the address is generated: ARn = ARn - AR0 <br> (The subtraction is done with reverse carry propagation) <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. <br> Note: When this bit-reverse operand is used, ARn cannot be used as a circular pointer. If ARn is configured in ST2_55 for circular addressing, the corresponding buffer start address register value (BSAxx) is added to ARn, but ARn is not modified so as to remain inside a circular buffer. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - T0B) | The 16-bit signed constant in T0 is subtracted from ARn after the address is generated: ARn = ARn - T0 <br> (The subtraction is done with reverse carry propagation) <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. <br> Note: When this bit-reverse operand is used, ARn cannot be used as a circular pointer. If ARn is configured in ST2_55 for circular addressing, the corresponding buffer start address register value (BSAxx) is added to ARn, but ARn is not modified so as to remain inside a circular buffer. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(\#K16) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant (K16) is used as an offset from that base pointer. <br> Note: When an instruction uses this operand, the constant is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using this operand cannot be executed in parallel with another instruction. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) |
| *+ARn(\#K16) | The 16-bit signed constant (K16) is added to ARn before the address is generated: ARn = ARn + K16 <br> Note: When an instruction uses this operand, the constant is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using this operand cannot be executed in parallel with another instruction. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) |
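
The bit-reverse operands in Table 3-5 add or subtract with reverse carry propagation, meaning that the carry travels from the MSB toward the LSB. The Python sketch below models the addition on a 16-bit pointer by mirroring the operands, adding normally, and mirroring the result. The function names are hypothetical, and the choice of a step equal to half the buffer length is a common bit-reversed addressing convention assumed here for illustration, not something stated in this section.

```python
def bit_reverse(value: int, width: int = 16) -> int:
    """Mirror the lowest `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(pointer: int, step: int, width: int = 16) -> int:
    """Add step to pointer with the carry propagating from MSB to LSB."""
    mask = (1 << width) - 1
    total = (bit_reverse(pointer, width) + bit_reverse(step, width)) & mask
    return bit_reverse(total, width)


# Walk an 8-word buffer with a step of 4 (assumed: half the buffer length).
pointer = 0
order = []
for _ in range(8):
    order.append(pointer)
    pointer = reverse_carry_add(pointer, 4)
print(order)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

The resulting index order is the bit-reversed sequence typically needed when reordering FFT data.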

Table 3-6. Control Mode Operands for the AR Indirect Addressing Mode

| Operand | Pointer Modification | Supported Access Types |
| :---: | :---: | :---: |
| *ARn | ARn is not modified. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn+ | ARn is incremented after the address is generated: <br> If 16-bit/1-bit operation: ARn = ARn + 1 <br> If 32-bit/2-bit operation: ARn = ARn + 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn- | ARn is decremented after the address is generated: <br> If 16-bit/1-bit operation: ARn = ARn - 1 <br> If 32-bit/2-bit operation: ARn = ARn - 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + AR0) | The 16-bit signed constant in AR0 is added to ARn after the address is generated: ARn = ARn + AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn + T0) | The 16-bit signed constant in T0 is added to ARn after the address is generated: ARn = ARn + T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - AR0) | The 16-bit signed constant in AR0 is subtracted from ARn after the address is generated: ARn = ARn - AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *(ARn - T0) | The 16-bit signed constant in T0 is subtracted from ARn after the address is generated: ARn = ARn - T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(AR0) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in AR0 is used as an offset from that base pointer. <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(T0) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in T0 is used as an offset from that base pointer. <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) <br> I/O-space (Smem) |
| *ARn(\#K16) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant (K16) is used as an offset from that base pointer. <br> Note: When an instruction uses this operand, the constant is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using this operand cannot be executed in parallel with another instruction. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) |
| *+ARn(\#K16) | The 16-bit signed constant (K16) is added to ARn before the address is generated: ARn = ARn + K16 <br> Note: When an instruction uses this operand, the constant is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using this operand cannot be executed in parallel with another instruction. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register bit (Baddr) |
| *ARn(short(\#k3)) | ARn is not modified. ARn is used as a base pointer. The 3-bit unsigned constant (k3) is used as an offset from that base pointer. k3 is in the range 1 to 7. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) |

### 3.4.2 Dual AR Indirect Addressing Mode

The dual AR indirect addressing mode enables you to make two data-memory accesses through the eight auxiliary registers, AR0-AR7. As with single AR indirect accesses to data space, the CPU uses an extended auxiliary register to create each 23-bit address. You can use linear addressing or circular addressing for each of the two accesses.

You may use the dual AR indirect addressing mode for:

- Executing an instruction that makes two 16-bit data-memory accesses. In this case, the two data-memory operands are designated in the instruction syntax as Xmem and Ymem. For example:
  ACx = (Xmem << #16) + (Ymem << #16)
- Executing two instructions in parallel. In this case, both instructions must each access a single memory value, designated in the instruction syntaxes as Smem or Lmem. For example:

```
dst = Smem
|| dst = src & Smem
```

The operand of the first instruction is treated as an Xmem operand, and the operand of the second instruction is treated as a Ymem operand.

The available dual AR indirect operands are a subset of the AR indirect operands. The ARMS status bit does not affect the set of dual AR indirect operands available.

## Note:

The assembler rejects code in which dual operands use the same auxiliary register with two different auxiliary register modifications. You can use the same ARn for both operands if one of the operands is *ARn or *ARn(T0); neither of these operands modifies ARn.
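
The rule in the note can be expressed as a small check. The Python sketch below is one reading of that rule, not the assembler's actual logic: operands are plain strings, the function names are hypothetical, *ARn(AR0) is treated as non-modifying because Table 3-7 lists it as leaving ARn unmodified, and two identical modifications of the same register are assumed to be accepted because the note only mentions different modifications.

```python
import re

# Operands that leave ARn unmodified: *ARn, *ARn(T0), and *ARn(AR0).
NON_MODIFYING = re.compile(r"^\*AR[0-7](\((T0|AR0)\))?$")


def pointer_of(operand: str) -> str:
    """Return the auxiliary register used as the pointer, for example 'AR2'."""
    match = re.search(r"AR[0-7]", operand)
    if match is None:
        raise ValueError(f"no auxiliary register in operand {operand!r}")
    return match.group()


def dual_operands_allowed(xmem: str, ymem: str) -> bool:
    """Return True if the two dual AR indirect operands can be combined."""
    x = xmem.replace(" ", "")
    y = ymem.replace(" ", "")
    if pointer_of(x) != pointer_of(y):
        return True  # different pointers never conflict under this rule
    if NON_MODIFYING.match(x) or NON_MODIFYING.match(y):
        return True  # one operand leaves the shared pointer unmodified
    return x == y  # assumption: identical modifications are accepted


print(dual_operands_allowed("*AR2+", "*AR2-"))  # False
print(dual_operands_allowed("*AR2", "*AR2+"))   # True
```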

Table 3-7 (page 3-15) introduces the operands available for the dual AR indirect addressing mode. Note that:

- Both pointer modification and address generation are linear or circular according to the pointer configuration in status register ST2_55. The content of the appropriate 16-bit buffer start address register (BSA01, BSA23, BSA45, or BSA67) is added only if circular addressing is activated for the chosen pointer.
- All additions to and subtractions from the pointers are done modulo 8M. ARnH is updated by the hardware when the pointer modification crosses the main data pages' boundary.

Table 3-7. Dual AR Indirect Operands

| Operand | Pointer Modification | Supported Access Types |
| :---: | :---: | :---: |
| *ARn | ARn is not modified. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *ARn+ | ARn is incremented after the address is generated: <br> If 16-bit operation: ARn = ARn + 1 <br> If 32-bit operation: ARn = ARn + 2 | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *ARn- | ARn is decremented after the address is generated: <br> If 16-bit operation: ARn = ARn - 1 <br> If 32-bit operation: ARn = ARn - 2 | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *(ARn + AR0) | The 16-bit signed constant in AR0 is added to ARn after the address is generated: ARn = ARn + AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *(ARn + T0) | The 16-bit signed constant in T0 is added to ARn after the address is generated: ARn = ARn + T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *(ARn - AR0) | The 16-bit signed constant in AR0 is subtracted from ARn after the address is generated: ARn = ARn - AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *(ARn - T0) | The 16-bit signed constant in T0 is subtracted from ARn after the address is generated: ARn = ARn - T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *ARn(AR0) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in AR0 is used as an offset from that base pointer. <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *ARn(T0) | ARn is not modified. ARn is used as a base pointer. The 16-bit signed constant in T0 is used as an offset from that base pointer. <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *(ARn + T1) | The 16-bit signed constant in T1 is added to ARn after the address is generated: ARn = ARn + T1 | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |
| *(ARn - T1) | The 16-bit signed constant in T1 is subtracted from ARn after the address is generated: ARn = ARn - T1 | Data-memory <br> (Smem, Lmem, Xmem, Ymem) |

### 3.4.3 CDP Indirect Addressing Mode

The CDP indirect addressing mode uses the coefficient data pointer (CDP) to point to data. The way the CPU uses CDP to generate an address depends on the access type:

| For An Access To ... | CDP Contains ... |
| :--- | :--- |
| Data space (memory or registers) | The 16 least significant bits (LSBs) of a 23-bit address. The 7 most significant bits (MSBs) are supplied by CDPH, the high part of the extended coefficient data pointer (XCDP). |
| A register bit (or bit pair) | A bit number. Only the register bit test/set/clear/complement instructions support CDP indirect accesses to register bits. These instructions enable you to access bits in the following registers only: the accumulators (AC0-AC3), the auxiliary registers (AR0-AR7), and the temporary registers (T0-T3). |
| I/O space | A 16-bit I/O address. |

Table 3-8 (page 3-17) introduces the operands available for the CDP indirect addressing mode. Note that:

- Both pointer modification and address generation are linear or circular according to the pointer configuration in status register ST2_55. The content of the 16-bit buffer start address register BSAC is added only if circular addressing is activated for CDP.
- All additions to and subtractions from the pointers are done modulo 8M. CDPH is updated by the hardware when the pointer modification crosses a main data page boundary.

Table 3-8. CDP Indirect Operands

| Operand | Pointer Modification | Supported Access Types |
| :---: | :---: | :---: |
| *CDP | CDP is not modified. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register-bit (Baddr) <br> I/O-space (Smem) |
| *CDP+ | CDP is incremented after the address is generated: <br> If 16-bit/1-bit operation: CDP = CDP + 1 <br> If 32-bit/2-bit operation: CDP = CDP + 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register-bit (Baddr) <br> I/O-space (Smem) |
| *CDP- | CDP is decremented after the address is generated: <br> If 16-bit/1-bit operation: CDP = CDP - 1 <br> If 32-bit/2-bit operation: CDP = CDP - 2 | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register-bit (Baddr) <br> I/O-space (Smem) |
| *CDP(\#K16) | CDP is not modified. CDP is used as a base pointer. The 16-bit signed constant (K16) is used as an offset from that base pointer. <br> Note: When an instruction uses this operand, the constant is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using this operand cannot be executed in parallel with another instruction. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register-bit (Baddr) |
| *+CDP(\#K16) | The 16-bit signed constant (K16) is added to CDP before the address is generated: CDP = CDP + K16 <br> Note: When an instruction uses this operand, the constant is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using this operand cannot be executed in parallel with another instruction. | Data-memory (Smem, Lmem) <br> Memory-mapped register (Smem, Lmem) <br> Register-bit (Baddr) |

### 3.4.4 Coefficient Indirect Addressing Mode

The coefficient indirect addressing mode uses the same address-generation process as the CDP indirect addressing mode for data-space accesses. The coefficient indirect addressing mode is supported by select memory-to-memory move and memory initialization instructions and by the following arithmetical instructions:

- Dual multiply (accumulate/subtract)
- Finite impulse response filter
- Multiply
- Multiply and accumulate
- Multiply and subtract

Instructions using the coefficient indirect addressing mode to access data are mainly instructions performing operations with three memory operands per cycle. Two of these operands (Xmem and Ymem) are accessed with the dual AR indirect addressing mode. The third operand (Cmem) is accessed with the coefficient indirect addressing mode. The Cmem operand is carried on the BB bus.

Keep the following facts about the BB bus in mind as you use the coefficient indirect addressing mode:

- The BB bus is not connected to external memory. If a Cmem operand is accessed through the BB bus, the operand must be in internal memory.
- Although the following instructions access Cmem operands, they do not use the BB bus to fetch the 16-bit or 32-bit Cmem operand.

| Instruction Syntax | Description of Cmem Access | Bus Used to Access Cmem |
| :--- | :--- | :--- |
| Smem = Cmem | 16-bit read from Cmem | DB |
| Cmem = Smem | 16-bit write to Cmem | EB |
| Lmem = dbl(Cmem) | 32-bit read from Cmem | CB for most significant word (MSW) <br> DB for least significant word (LSW) |
| dbl(Cmem) = Lmem | 32-bit write to Cmem | FB for MSW <br> EB for LSW |

Consider the following instruction syntax. In one cycle, two multiplications can be performed in parallel. One memory operand (Cmem) is common to both multiplications, while dual AR indirect operands (Xmem and Ymem) are used for the other values in the multiplication.

```
ACx = Xmem * Cmem,
ACy = Ymem * Cmem
```

To access three memory values (as in the above example) in a single cycle, the value referenced by Cmem must be located in a memory bank different from the one containing the Xmem and Ymem values.
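
The data flow of the dual multiply above, in which one coefficient feeds both products, can be illustrated in Python. This is a deliberately reduced model: the names are hypothetical, memory is a plain dictionary, operands are assumed to be signed 16-bit values, and effects such as fractional mode, saturation, accumulator width, and memory banking are not modeled.

```python
def to_signed16(word: int) -> int:
    """Interpret a 16-bit memory word as a signed value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def dual_multiply(memory: dict, xmem: int, ymem: int, cmem: int):
    """Return (ACx, ACy) where both products share the Cmem coefficient."""
    coefficient = to_signed16(memory[cmem])
    acx = to_signed16(memory[xmem]) * coefficient
    acy = to_signed16(memory[ymem]) * coefficient
    return acx, acy


# Hypothetical addresses and contents chosen only for the example.
memory = {0x0100: 3, 0x0200: 0xFFFE, 0x0300: 5}
print(dual_multiply(memory, xmem=0x0100, ymem=0x0200, cmem=0x0300))  # (15, -10)
```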

Table 3-9 introduces the operands available for the coefficient indirect addressing mode. Note that:

- Both pointer modification and address generation are linear or circular according to the pointer configuration in status register ST2_55. The content of the 16-bit buffer start address register BSAC is added only if circular addressing is activated for CDP.
- All additions to and subtractions from the pointers are done modulo 8M. CDPH is updated by the hardware when the pointer modification crosses a main data page boundary.

Table 3-9. Coefficient Indirect Operands

| Operand | Pointer Modification | Supported Access Type |
| :---: | :---: | :---: |
| *CDP | CDP is not modified. | Data-memory |
| *CDP+ | CDP is incremented after the address is generated: <br> If 16-bit operation: CDP = CDP + 1 <br> If 32-bit operation: CDP = CDP + 2 | Data-memory |
| *CDP- | CDP is decremented after the address is generated: <br> If 16-bit operation: CDP = CDP - 1 <br> If 32-bit operation: CDP = CDP - 2 | Data-memory |
| *(CDP + AR0) | The 16-bit signed constant in AR0 is added to CDP after the address is generated: CDP = CDP + AR0 <br> This operand is available when C54CM = 1. This operand is usable when .c54cm_on is active at assembly time. | Data-memory |
| *(CDP + T0) | The 16-bit signed constant in T0 is added to CDP after the address is generated: CDP = CDP + T0 <br> This operand is available when C54CM = 0. This operand is usable when .c54cm_off is active at assembly time. | Data-memory |

### 3.5 Circular Addressing

Circular addressing can be used with any of the indirect addressing modes. Each of the eight auxiliary registers (AR0-AR7) and the coefficient data pointer (CDP) can be independently configured to be linearly or circularly modified as it acts as a pointer to data or to register bits (see Table 3-10). This configuration is done with a bit (ARnLC or CDPLC) in status register ST2_55. To choose circular modification, set the bit.

Table 3-10. Circular Addressing Pointers

| Pointer | Linear/Circular <br> Configuration Bit | Supplier of <br> Main Data Page | Buffer Start Address <br> Register | Buffer Size <br> Register |
| :---: | :---: | :---: | :---: | :---: |
| AR0 | ST2_55(0) = AR0LC | AR0H | BSA01 | BK03 |
| AR1 | ST2_55(1) = AR1LC | AR1H | BSA01 | BK03 |
| AR2 | ST2_55(2) = AR2LC | AR2H | BSA23 | BK03 |
| AR3 | ST2_55(3) = AR3LC | AR3H | BSA23 | BK03 |
| AR4 | ST2_55(4) = AR4LC | AR4H | BSA45 | BK47 |
| AR5 | ST2_55(5) = AR5LC | AR5H | BSA45 | BK47 |
| AR6 | ST2_55(6) = AR6LC | AR6H | BSA67 | BK47 |
| AR7 | ST2_55(7) = AR7LC | AR7H | BSA67 | BK47 |
| CDP | ST2_55(8) = CDPLC | CDPH | BSAC | BKC |

Each auxiliary register ARn has its own linear/circular configuration bit in ST2_55:

| ARnLC | ARn Is Used For ... |
| :--- | :--- |
| 0 | Linear addressing |
| 1 | Circular addressing |

The CDPLC bit in status register ST2_55 configures the DSP to use CDP for linear addressing or circular addressing:

| CDPLC | CDP Is Used For ... |
| :--- | :--- |
| 0 | Linear addressing |
| 1 | Circular addressing |

You can use the circular addressing instruction qualifier, circular(), if you want every pointer used by the instruction to be modified circularly. Just add the circular() qualifier in parallel with the instruction. The circular addressing instruction qualifier overrides the linear/circular configuration in ST2_55.
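
The bit assignments in Table 3-10 and the override behavior of circular() can be summarized in a short Python sketch. The function names are hypothetical, and the sketch only decodes the configuration; it does not model the circular address arithmetic itself.

```python
# Bit n of ST2_55 is ARnLC for n = 0..7, and bit 8 is CDPLC (Table 3-10).
POINTERS = ["AR0", "AR1", "AR2", "AR3", "AR4", "AR5", "AR6", "AR7", "CDP"]


def circular_pointers(st2_55: int) -> set:
    """Return the pointers configured for circular addressing in ST2_55."""
    return {name for bit, name in enumerate(POINTERS) if (st2_55 >> bit) & 1}


def is_circular(pointer: str, st2_55: int, circular_qualifier: bool = False) -> bool:
    """Return True if the pointer is modified circularly for an instruction."""
    # The circular() qualifier overrides the configuration in ST2_55.
    return circular_qualifier or pointer in circular_pointers(st2_55)


print(sorted(circular_pointers(0b1_0000_0101)))  # ['AR0', 'AR2', 'CDP']
```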

## Instruction Set Summary

This chapter provides a summary of the TMS320C55x™ DSP algebraic instruction set (Table 4-1). For each instruction, you will find the availability of a parallel enable bit, the word count (size), the cycle time, the pipeline phase in which the instruction executes, the operator unit in which the instruction executes, how many of each address generation unit are used, and how many of each bus are used.

Table 4-1 does not list all of the resources that may be used by an instruction; it lists only those that may result in a resource conflict and thus prevent two instructions from being placed in parallel. If an instruction lists nothing in a particular column, that particular resource will never be in conflict for that instruction.

The column heads of Table 4-1 are:
- Instruction: In cases where the resource usage of an instruction varies with the kinds of registers, you see the notation <name>-AU for A-unit registers and <name>-DU for D-unit registers. So, dst-AU is a destination that is an A-unit register and src-DU is a source that is a D-unit register. In the few cases where that notation is insufficient, you see the cases listed in the Notes column.
- E: Whether the instruction has a parallel enable bit
- S: The size of the instruction in bytes
- C: Number of cycles required for the instruction
- Pipe: The pipeline phase in which the instruction executes:

| Name | Phase |
| :--- | :--- |
| AD | Address |
| D | Decode |
| R | Read |
| X | Execute |

- Operator: Which operator(s) are used by this instruction. When an instruction uses multiple operators, any other instruction that uses one or more of those same operators may not be placed in parallel.

- Address Generation Unit: How many of each address generation unit is used. The address generation units are:

| Name | Unit |
| :--- | :--- |
| DA | Data Address Generation Unit |
| CA | Coefficient Address Generation Unit |
| SA | Stack Address Generation Unit |

- Buses: How many of each bus is used. The buses are:

| Name | Bus |
| :--- | :--- |
| DR | Data Read |
| CR | Coefficient Read |
| DW | Data Write |
| ACB | Brings D-unit registers to A-unit and P-unit operators |
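The operator rule stated in the column descriptions (two instructions that share an operator may not be placed in parallel) can be sketched as a set intersection. This is a minimal illustration that assumes Operator entries are written as in Table 4-1, for example `DU_ALU + DU_MAC1`. The function names are hypothetical, and the sketch covers only the operator rule, not the other parallelism restrictions such as the parallel enable bit, instruction size, or bus usage.

```python
def parse_operators(field):
    """Split an Operator column entry such as 'DU_ALU + DU_MAC1' into a set of names."""
    return {name.strip() for name in field.split("+") if name.strip()}


def operators_conflict(field_a, field_b):
    """Return True if two Operator entries share at least one operator."""
    return bool(parse_operators(field_a) & parse_operators(field_b))
```

Under this model, an instruction that uses `DU_ALU + DU_MAC1` conflicts with one that uses `DU_ALU`, while `AU_ALU` and `DU_SHIFT` do not conflict on operators.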

Table 4-1. Algebraic Instruction Set Summary

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :--- |
|  | Absolute Distance (page 5-2) |  |  |  |  |  |  |  |  |  |  |  |  |  |
|  | abdst(Xmem, Ymem, ACx, ACy) | N | 4 | 1 | X | DU_ALU + DU_MAC1 | 2 | . | . | 2 |  |  |  |  |
|  | Absolute Value (page 5-4) |  |  |  |  |  |  |  |  |  |  |  |  |  |
|  | dst-AU = \|src-AU\| | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | dst-AU = \|src-DU\| | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 |  |
|  | dst-DU = \|src\| | Y | 2 | 1 | X | DU_ALU | . | . | . | . | . | . | . | See Note 1. |

Addition (page 5-7)


Notes: 1) dst-DU, src-AU or dst-DU, src-DU
2) dst-DU, src-AU or dst-AU, src-DU

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :--- |
| [12] | ACy = ACx + uns(Smem) | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . |  |
| [13] | ACy = ACx + (uns(Smem) << #SHIFTW) | N | 4 | 1 | X | DU_SHIFT | 1 | . | . | 1 | . | . | . |  |
| [14] | ACy = ACx + dbl(Lmem) | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 | . | . | . |  |
| [15] | ACx = (Xmem << #16) + (Ymem << #16) | N | 3 | 1 | X | DU_ALU | 2 | . | . | 2 | . | . | . |  |
| [16] | Smem = Smem + K16 |  | 4 | 1 | X | DU_ALU | 1 | . | . | 1 | . | 1 | . |  |

Addition with Absolute Value (page 5-27)

| Instruction | E | S | C | Pipe | Operator |
| :--- | :---: | :---: | :---: | :---: | :---: |
| ACy = rnd(ACy + \|ACx\|) | Y | 2 | 1 | X | DU_MAC1 |

Addition with Parallel Store Accumulator Content to Memory (page 5-29)
ACy = ACx + (Xmem << #16),
Ymem = HI(ACy << T2)

[1] ACy = adsc(Smem, ACx, TC1)
[2] ACy = adsc(Smem, ACx, TC2)

## Addition or Subtraction Conditionally with Shift (page 5-33)

ACy = ads2c(Smem, ACx, Tx, TC1, TC2)

## Addition, Subtraction, or Move Accumulator Content Conditionally (page 5-36)

ACy = adsc(Smem, ACx, TC1, TC2)

| E | S | C |
| :---: | :---: | :---: |
| N | 3 | 1 |

Bitwise AND (page 5-38)
| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | dst-AU = dst-AU & src-AU | Y | 2 | 1 | X | AU_ALU |
|  | dst-AU = dst-AU & src-DU | Y | 2 | 1 | X | AU_ALU |
|  | dst-DU = dst-DU & src | Y | 2 | 1 | X | DU_ALU |
|  |  | Y | 3 | 1 | X | AU_ALU |
|  |  | Y | 3 | 1 | X | AU_ALU |
|  |  | Y | 3 | 1 | X | DU_ALU |




| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :--- |
| [3] | dst-AU = src-AU & k16 | N | 4 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | dst-AU = src-DU & k16 | N | 4 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 |  |
|  | dst-DU = src & k16 | N | 4 | 1 | X | DU_ALU | . | . | . | . | . | . | . | See Note 1. |
| [4] | dst-AU = src-AU & Smem | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | . | . |  |
|  | dst-AU = src-DU & Smem | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | . | 1 |  |
|  | dst-DU = src & Smem | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . | See Note 1. |
| [5] | ACy = ACy & (ACx <<< #SHIFTW) | Y | 3 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [6] | ACy = ACx & (k16 <<< #16) | N | 4 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |
| [7] | ACy = ACx & (k16 <<< #SHFT) | N | 4 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [8] | Smem = Smem & k16 | N | 4 | 1 | X | AU_ALU | 1 | . | . | 1 | . | 1 | . |  |

## Bitwise AND Memory with Immediate Value and Compare to Zero (page 5-47)

[1] TC1 = Smem \& k16
[2] TC2 = Smem \& k16

## Bitwise OR (page 5-48)




| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :--- |
| [5] | ACy = ACy \| (ACx <<< #SHIFTW) | Y | 3 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [6] | ACy = ACx \| (k16 <<< #16) | N | 4 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |
| [7] | ACy = ACx \| (k16 <<< #SHFT) | N | 4 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [8] | Smem = Smem \| k16 | N | 4 | 1 | X | AU_ALU | 1 | . | . | 1 | . | 1 | . |  |
|  | Bitwise Exclusive OR (XOR) (page 5-57) |  |  |  |  |  |  |  |  |  |  |  |  |  |
| [1] | dst-AU = dst-AU ^ src-AU | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | dst-AU = dst-AU ^ src-DU | Y | 2 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 |  |
|  | dst-DU = dst-DU ^ src | Y | 2 | 1 | X | DU_ALU | . | . | . | . | . | . | . | See Note 1. |
| [2] | dst-AU = src-AU ^ k8 | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | dst-AU = src-DU ^ k8 | Y | 3 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 |  |
|  | dst-DU = src ^ k8 | Y | 3 | 1 | X | DU_ALU | . | . | . | . | . | . | . | See Note 1. |
| [3] | dst-AU = src-AU ^ k16 | N | 4 | 1 | X | AU_ALU | . | . | . | . | . | . | . |  |
|  | dst-AU = src-DU ^ k16 | N | 4 | 1 | X | AU_ALU | . | . | . | . | . | . | 1 |  |
|  | dst-DU = src ^ k16 | N | 4 | 1 | X | DU_ALU | . | . | . | . | . | . | . | See Note 1. |
| [4] | dst-AU = src-AU ^ Smem | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | . | . |  |
|  | dst-AU = src-DU ^ Smem | N | 3 | 1 | X | AU_ALU | 1 | . | . | 1 | . | . | 1 |  |
|  | dst-DU = src ^ Smem | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 | . | . | . | See Note 1. |
| [5] | ACy = ACy ^ (ACx <<< #SHIFTW) | Y | 3 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [6] | ACy = ACx ^ (k16 <<< #16) | N | 4 | 1 | X | DU_ALU | . | . | . | . | . | . | . |  |
| [7] | ACy = ACx ^ (k16 <<< #SHFT) | N | 4 | 1 | X | DU_SHIFT | . | . | . | . | . | . | . |  |
| [8] | Smem = Smem ^ k16 | N | 4 | 1 | X | AU_ALU | 1 | . | . | 1 | . | 1 | . |  |

## Branch Conditionally (page 5-66)

| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | if (cond) goto l4 | N | 2 | 6/5† | R | PU_UNIT |
| [2] | if (cond) goto L8 | Y | 3 | 6/5† |  | PU_UNIT |

† x/y cycles: x cycles = condition true, y cycles = condition false

Branch Unconditionally (page 5-70)

$\dagger$ These instructions execute in 3 cycles if the addressed instruction is in the instruction buffer unit.

## Branch on Auxiliary Register Not Zero (page 5-74)

| Instruction | E | S | C | Pipe | Operator | DA |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| if (ARn_mod != #0) goto L16 | N | 4 | 6/5† | AD | PU_UNIT | 1 |

† x/y cycles: x cycles = condition true, y cycles = condition false

Call Conditionally (page 5-77)

† x/y cycles: x cycles = condition true, y cycles = condition false

Call Unconditionally (page 5-83)

[1] call ACx
[2] call L16
[3] call P24

Circular Addressing Qualifier (page 5-87)

| Instruction | E | S | C | Pipe |
| :--- | :---: | :---: | :---: | :---: |
| circular() | N | 1 | 1 | AD |

Clear Accumulator, Auxiliary, or Temporary Register Bit (page 5-88)




## Compare Accumulator, Auxiliary, or Temporary Register Content (page 5-93)



Compare Accumulator, Auxiliary, or Temporary Register Content with AND (page 5-95)



| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :--- |

## Compare Accumulator, Auxiliary, or Temporary Register Content with OR (page 5-100)

[1] TCx = TCy | uns(src-AU RELOP dst-AU)

TCx = TCy | uns(src RELOP dst)

| E | S | C | Pipe | Operator |
| :---: | :---: | :---: | :---: | :---: |
| Y | 3 | 1 | X | AU_ALU |
| Y | 3 | 1 | X | AU_ALU |
| Y | 3 | 1 | X | DU_ALU |
| Y | 3 | 1 | X | AU_ALU |
| Y | 3 | 1 | X | AU_ALU |
| Y | 3 | 1 | X | DU_ALU |

| . | . | . | . | . |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
| . | . | . | . | 1 | See Note 2. |
| . | . | . | . | . |  |
| . | . | . | . | . |  |
| . | . | . | . | 1 | See Note 2. |

Compare Accumulator, Auxiliary, or Temporary Register Content Maximum (page 5-105)


Compare Accumulator, Auxiliary, or Temporary Register Content Minimum (page 5-108)


## Compare and Branch (page 5-111)

| Instruction | E | S | C | Pipe | Operator |
| :--- | :---: | :---: | :---: | :---: | :---: |
| compare (uns(src-AU RELOP K8)) goto L8 | N | 4 | 7/6† | X | AU_ALU + PU_UNIT |
| compare (uns(src-DU RELOP K8)) goto L8 | N | 4 | 7/6† | X | DU_ALU + PU_UNIT |

† x/y cycles: x cycles = condition true, y cycles = condition false

## Compare and Select Accumulator Content Maximum (page 5-114)

| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | max_diff(ACx, ACy, ACz, ACw) | Y | 3 | 1 | X | DU_ALU |
| [2] | max_diff_dbl(ACx, ACy, ACz, ACw, TRNx) | Y | 3 |  |  |  |


Complement Accumulator, Auxiliary, or Temporary Register Bit (page 5-128)


## Complement Accumulator, Auxiliary, or Temporary Register Content (page 5-129)



Compute Mantissa and Exponent of Accumulator Content (page 5-132)
ACy = mant(ACx), Tx = exp(ACx)

Count Accumulator Bits (page 5-134)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :--- |

## Dual 16-Bit Additions (page 5-135)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | HI(ACy) = HI(Lmem) + HI(ACx), <br> LO(ACy) = LO(Lmem) + LO(ACx) | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 |
| [2] | HI(ACx) = HI(Lmem) + Tx, <br> LO(ACx) = LO(Lmem) + Tx | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 |
|  | Dual 16-Bit Addition and Subtraction (page 5-140) |  |  |  |  |  |  |  |  |  |
| [1] | HI(ACx) = Smem + Tx, <br> LO(ACx) = Smem - Tx | N | 3 | 1 | X | DU_ALU | 1 | . | . | 1 |
| [2] | HI(ACx) = HI(Lmem) + Tx, <br> LO(ACx) = LO(Lmem) - Tx | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 |
|  | Dual 16-Bit Subtractions (page 5-145) |  |  |  |  |  |  |  |  |  |
| [1] | HI(ACy) = HI(ACx) - HI(Lmem), <br> LO(ACy) = LO(ACx) - LO(Lmem) | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 |
| [2] | HI(ACy) = HI(Lmem) - HI(ACx), <br> LO(ACy) = LO(Lmem) - LO(ACx) | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 |
| [3] | HI(ACx) = Tx - HI(Lmem), <br> LO(ACx) = Tx - LO(Lmem) | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 |
| [4] | HI(ACx) = HI(Lmem) - Tx, <br> LO(ACx) = LO(Lmem) - Tx | N | 3 | 1 | X | DU_ALU | 1 | . | . | 2 |
|  | Dual 16-Bit Subtraction and Addition (page 5-154) |  |  |  |  |  |  |  |  |  |
| [1] | HI(ACx) = Smem - Tx, | N | 3 | 1 | X | DU_ALU | 1 | . | . |  |
|  |  | N | 3 | 1 | X | DU_ALU | 1 | . | . |  |

## Execute Conditionally (page 5-159)


Extract Accumulator Bit Field (page 5-167)


Finite Impulse Response Filter, Antisymmetrical (page 5-168)
firsn(Xmem, Ymem, coef(Cmem), ACx, ACy)

Finite Impulse Response Filter, Symmetrical (page 5-170)
firs(Xmem, Ymem, coef(Cmem), ACx, ACy)

Idle (page 5-172)
idle


Linear Addressing Qualifier (page 5-179)


| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :--- |

## Load Accumulator from Memory (page 5-180)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | ACx = rnd(Smem << Tx) | N | 3 | 1 | X | DU_SHIFT | 1 | . | . | 1 |
| [2] | ACx = low_byte(Smem) << #SHIFTW | N | 3 | 1 | X | DU_SHIFT | 1 | . | . | 1 |
|  |  | N | 3 | 1 | X | DU_SHIFT | 1 | . | . | 1 |
|  |  | N | 2 | 1 | X | DU_LOAD | 1 | . | . | 1 |
|  |  | N | 3 | 1 | X | DU_LOAD | 1 | . | . | 1 |
|  |  | N | 4 | 1 | X | DU_SHIFT | 1 | . | . | 1 |
|  |  | N | 3 | 1 | X | DU_LOAD | 1 | . | . | 2 |
|  |  | N | 3 | 1 | X | DU_LOAD | 2 | . | . | 2 |

## Load Accumulator Pair from Memory (page 5-191)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | pair(HI(ACx)) = Lmem | N | 3 | 1 | X | DU_LOAD | 1 | . | . | 2 |
| [2] | pair(LO(ACx)) = Lmem | N | 3 | 1 | X | DU_LOAD | 1 | . | . | 2 |

Load Accumulator with Immediate Value (page 5-196)
| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | ACx = K16 << #16 | N | 4 | 1 | X | DU_LOAD |
| [2] | ACx = K16 << #SHFT | N | 4 | 1 | X | DU_SHIFT |

Load Accumulator from Memory with Parallel Store Accumulator Content to Memory (page 5-189)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  | ACy = Xmem << #16, <br> Ymem = HI(ACx << T2) | N | 4 | 1 | X | DU_LOAD + DU_SHIFT | 2 | . | . | 2 | . | 2 | . |
|  | Load Accumulator, Auxiliary, or Temporary Register from Memory (page 5-199) |  |  |  |  |  |  |  |  |  |  |  |  |
| [1] | dst-AU = Smem | N | 2 | 1 | X | AU_LOAD | 1 | . | . | 1 | . | . | . |
|  | dst-DU = Smem | N | 2 | 1 | X | DU_LOAD | 1 | . | . | 1 | . | . | . |
| [2] | dst-AU = uns(high_byte(Smem)) | N | 3 | 1 | X | AU_LOAD | 1 | . | . | 1 | . | . | . |
|  | dst-DU = uns(high_byte(Smem)) | N | 3 | 1 | X | DU_LOAD | 1 | . | . | 1 | . | . | . |
| [3] | dst-AU = uns(low_byte(Smem)) | N | 3 | 1 | X | AU_LOAD | 1 | . | . | 1 | . | . | . |
|  | dst-DU = uns(low_byte(Smem)) | N | 3 | 1 | X | DU_LOAD | 1 | . | . | 1 | . | . | . |


| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :--- |

Load Accumulator, Auxiliary, or Temporary Register with Immediate Value (page 5-205)

| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | dst-AU = k4 | Y | 2 | 1 | X | AU_LOAD |
|  | dst-DU = | Y | 2 | 1 | X | DU_LOAD |
|  |  | Y | 2 | 1 | X | AU_LOAD |
|  |  | Y | 2 | 1 | X | DU_LOAD |
|  |  | N | 4 | 1 | X | AU_LOAD |
|  |  | N | 4 | 1 | X | DU_LOAD |

Load Auxiliary or Temporary Register Pair from Memory (page 5-209)
pair(TAx) = Lmem

Load CPU Register from Memory (page 5-210)




Modify Auxiliary Register Content with Parallel Multiply and Accumulate (page 5-226)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | DR | CR |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | mar(Xmem), <br> ACx = M40(rnd(ACx + (uns(Ymem) * uns(coef(Cmem))))) | N | 4 | 1 | X | DU_MAC1 | 2 | 1 | 2 | 1 |
| [2] | mar(Xmem), <br> ACx = M40(rnd((ACx >> #16) + (uns(Ymem) * uns(coef(Cmem))))) | N | 4 | 1 | X | DU_MAC1 | 2 | 1 | 2 | 1 |

Modify Auxiliary Register Content with Parallel Multiply and Subtract (page 5-231)




## Modify Auxiliary or Temporary Register Content (page 5-233)

| No. | Instruction | E | S | C | Pipe |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | mar(TAy = TAx) | N | 3 | 1 | AD |
| [2] | mar(TAx = P8) | N | 3 | 1 | AD |
| [3] | mar(TAx = D16) | N | 4 | 1 | AD |

## Modify Auxiliary or Temporary Register Content by Addition (page 5-237)

| No. | Instruction | E | S | C | Pipe |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | mar(TAy + TAx) | N | 3 | 1 | AD |
| [2] | mar(TAx + P8) | N | 3 | 1 | AD |

Modify Auxiliary or Temporary Register Content by Subtraction (page 5-241)

| No. | Instruction | E | S | C | Pipe |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | mar(TAy - TAx) | N | 3 | 1 | AD |
| [2] | mar(TAx - P8) | N | 3 | 1 | AD |

Modify Data Stack Pointer (SP) (page 5-245)

SP = SP + K8



Modify Extended Auxiliary Register Content (page 5-246)

| No. | Instruction | E | S | C | Pipe |
| :---: | :--- | :---: | :---: | :---: | :---: |
| [1] | XAdst = mar(Smem) | N | 3 | 1 | AD |
| [2] | mar(XACdst = XACsrc) | Y | 3 | 1 | AD |

## Modify Extended Auxiliary Register Content by Addition (page 5-249)

mar(XACdst + XACsrc)

Modify Extended Auxiliary Register Content by Subtraction (page 5-251)

| Instruction | E | S | C | Pipe |
| :--- | :---: | :---: | :---: | :---: |
| mar(XACdst - XACsrc) | Y | 3 | 1 | AD |

Move Accumulator Content to Auxiliary or Temporary Register (page 5-253)



Move CPU Register Content to Auxiliary or Temporary Register (page 5-259)

| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | TAx = BRC0 | Y | 2 | 1 | X | AU_ALU |
| [2] | TAx = BRC1 | Y | 2 | 1 | X | AU_ALU |
| [3] | TAx = CDP | Y | 2 | 1 | X | AU_ALU |
| [4] | TAx = RPTC | Y | 2 | 1 | X | AU_ALU |
| [5] | TAx = SP | Y | 2 | 1 | X | AU_ALU |
| [6] | TAx = SSP | Y | 2 | 1 | X | AU_ALU |

Move Extended Auxiliary Register Content (page 5-261)


| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :--- |

## Multiply with Parallel Multiply and Subtract (page 5-295)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | ACy = M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))), <br> ACx = M40(rnd(ACx - (uns(Smem) * uns(LO(coef(Cmem)))))) | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | . | 1 | 2 |
| [2] | ACy = M40(rnd(uns(HI(Lmem)) * uns(HI(coef(Cmem))))), <br> ACx = M40(rnd(ACx - (uns(LO(Lmem)) * uns(LO(coef(Cmem)))))) | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | . | 2 | 2 |
| [3] | ACy = M40(rnd(uns(Ymem) * uns(HI(coef(Cmem))))), <br> ACx = M40(rnd(ACx - (uns(Xmem) * uns(LO(coef(Cmem)))))) | N | 5 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | . | 2 | 2 |

Multiply with Parallel Store Accumulator Content to Memory (page 5-305)
ACy = rnd(Tx * Xmem),
Ymem = HI(ACx << T2)[, T3 = Xmem]


Multiply and Accumulate (MAC) (page 5-308)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | ACy = rnd(ACy + (ACx * Tx)) | Y | 2 | 1 | X | DU_MAC1 | . | . | . | . | . | . | . |
|  |  | Y | 2 | 1 | X | DU_MAC1 | . | . | . | . | . | . | . |
|  |  | Y | 3 | 1 | X | DU_MAC1 | . | . | . | . | . | . | . |
|  |  | N | 4 | 1 | X | DU_MAC1 | . | . | . | . | . | . | . |
|  |  | N | 3 | 1 | X | DU_MAC1 | 1 | 1 | . | 1 | 1 | . | . |
|  |  | N | 3 | 1 | X | DU_MAC1 | 1 | . | . | 1 | . | . | . |
|  |  | N | 3 | 1 | X | DU_MAC1 | 1 | . | . | 1 | . | . | . |
|  |  | N | 4 | 1 | X | DU_MAC1 | 1 | . | . | 1 | . | . | . |
|  |  | N | 4 | 1 | X | DU_MAC1 | 2 | . | . | 2 | . | . | . |
|  |  | N | 4 | 1 | X | DU_MAC1 | 2 | . | . | 2 | . | . | . |
|  |  | N | 3 | 1 | X | DU_MAC1 | 1 | 1 | . | 1 | 1 | . | . |

Multiply and Accumulate with Parallel Delay (page 5-325)

```
ACx = rnd(ACx + (Smem * coef(Cmem)))[, T3 = Smem],
delay(Smem)
```


Multiply and Accumulate with Parallel Load Accumulator from Memory (page 5-327)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :--- |

## Multiply and Accumulate with Parallel Multiply (page 5-329)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | DR | CR |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | ACx = M40(rnd(ACx + (uns(Xmem) * uns(coef(Cmem))))), <br> ACy = M40(rnd(uns(Ymem) * uns(coef(Cmem)))) | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | 2 | 1 |
| [2] | ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem))))) | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 1 | 2 |
| [3] | ACy = M40(rnd((ACy >> #16) + (uns(Smem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem))))) | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 1 | 2 |
| [4] | ACy = M40(rnd(ACy + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 2 | 2 |
| [5] | ACy = M40(rnd((ACy >> #16) + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem))))) | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 2 | 2 |
| [6] | ACy = M40(rnd((ACy >> #16) + (uns(Ymem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(uns(Xmem) * uns(LO(coef(Cmem))))) | N | 5 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | 2 | 2 |

Multiply and Accumulate with Parallel Multiply and Subtract (page 5-347)

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | DR | CR |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [1] | ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))) | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 1 | 2 |
| [2] | ACy = M40(rnd((ACy >> #16) + (uns(Smem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(ACx - (uns(Smem) * uns(LO(coef(Cmem)))))) | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 1 | 2 |
| [3] | ACy = M40(rnd(ACy + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(ACx - (uns(LO(Lmem)) * uns(LO(coef(Cmem)))))) | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 2 | 2 |
| [4] | ACy = M40(rnd((ACy >> #16) + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(ACx - (uns(LO(Lmem)) * uns(LO(coef(Cmem)))))) | N | 4 | 1 | X | DU_MAC1 + DU_MAC2 | 1 | 1 | 2 | 2 |
|  |  | N | 5 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | 2 | 2 |
| [6] | ACy = M40(rnd((ACy >> #16) + (uns(Ymem) * uns(HI(coef(Cmem)))))), <br> ACx = M40(rnd(ACx - (uns(Xmem) * uns(LO(coef(Cmem)))))) | N | 5 | 1 | X | DU_MAC1 + DU_MAC2 | 2 | 1 | 2 | 2 |

## Multiply and Accumulate with Parallel Store Accumulator Content to Memory (page 5-367)

ACy = rnd(ACy + (Tx * Xmem)),
Ymem = HI(ACx << T2)[, T3 = Xmem]
DU_SHIFT


Multiply and Subtract (MAS) (page 5-369)

| No. | Instruction | E | S | C | Pipe | Operator |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| [1] | ACy = rnd(ACy - (ACx * Tx)) | Y | 2 | 1 | X | DU_MAC1 |
|  |  | N | 3 | 1 | X | DU_MAC1 |

| No. | Instruction | E | S | C | Pipe | Operator | DA | CA | SA | DR | CR | DW | ACB | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :--- |
| [3] | ACy = rnd(ACy - (Smem * ACx))[, T3 = Smem] | N | 3 | 1 | X | DU_MAC1 | 1 | . | . | 1 | . | . | . |  |
| [4] | ACy = rnd(ACx - (Tx * Smem))[, T3 = Smem] | N | 3 | 1 | X | DU_MAC1 | 1 | . | . | 1 | . | . | . |  |
| [5] | ACy = M40(rnd(ACx - (uns(Xmem) * uns(Ymem))))[, T3 = Xmem] | N | 4 | 1 | X | DU_MAC1 | 2 | . | . | 2 | . | . | . |  |
| [6] | ACx = rnd(ACx - (Smem * uns(coef(Cmem)))) | N | 3 | 1 | X | DU_MAC1 | 1 | 1 | . | 1 | 1 | . | . |  |

\section*{Multiply and Subtract with Parallel Load Accumulator from Memory (page 5-379)}


Multiply and Subtract with Parallel Multiply (page 5-381)
[1] ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))),
\begin{tabular}{lllll|lll|ll}
N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 2 & 1 & . & 2 & 1 \\
N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 1 & 1 & . & 1 & 2 \\
N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 1 & 1 & . & 2 & 2
\end{tabular}

Multiply and Subtract with Parallel Multiply and Accumulate (page 5-390)
[1] ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))),
ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem)))))
[2] ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))),
\begin{tabular}{|lllll|lll|ll}
N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 2 & 1 & . & 2 & 1 \\
N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 2 & 1 & . & 2 & 1 \\
N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 1 & 1 & . & 1 & 2 \\
N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 1 & 1 & . & 2 & 2
\end{tabular}

[3] ACy = M40(rnd(ACy - (uns(Smem) * uns(HI(coef(Cmem)))))), ACx = M40(rnd(ACx + (uns(Smem) * uns(LO(coef(Cmem))))))
[4] ACy = M40(rnd(ACy - (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), ACx = M40(rnd(ACx + (uns(LO(Lmem)) * uns(LO(coef(Cmem))))))

\section*{Multiply and Subtract with Parallel Store Accumulator Content to Memory (page 5-401)}


Negate Accumulator, Auxiliary, or Temporary Register Content (page 5-403)

Table 4-1. Algebraic Instruction Set Summary (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[b]{2}{*}{No.} & \multirow[b]{2}{*}{Instruction} & \multirow[b]{2}{*}{E} & \multirow[b]{2}{*}{S} & \multirow[b]{2}{*}{C} & \multirow[b]{2}{*}{Pipe} & \multirow[b]{2}{*}{Operator} & \multicolumn{3}{|l|}{Address Generation Unit} & \multicolumn{4}{|c|}{Buses} & \multirow[b]{2}{*}{Notes} \\
\hline & & & & & & & DA & CA & SA & DR & CR & DW & ACB & \\
\hline \multicolumn{2}{|r|}{dst-DU = -src} & Y & 2 & 1 & X & DU_ALU & . & & . & & . & . & . & See Note 1. \\
\hline \multicolumn{15}{|l|}{No Operation (NOP) (page 5-405)} \\
\hline [1] & nop & Y & 1 & 1 & D & & . & . & & . & . & . & & \\
\hline [2] & nop_16 & Y & 2 & 1 & D & & & & & & & & & \\
\hline
\end{tabular}
Parallel Modify Auxiliary Register Contents (page 5-406)
mar(Xmem), mar(Ymem), mar(coef(Cmem))

Parallel Multiplies (page 5-407)
[1] ACx = M40(rnd(uns(Xmem) * uns(coef(Cmem)))), ACy = M40(rnd(uns(Ymem) * uns(coef(Cmem))))
[2] ACy = M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))), ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem)))))
[3] ACy = M40(rnd(uns(HI(Lmem)) * uns(HI(coef(Cmem))))), ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem)))))
ACx = M40(rnd(uns(Xmem) * uns(LO(coef(Cmem)))))
\begin{tabular}{lcccc|ccc|ll}
N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 2 & 1 & . & 2 & 1 \\
N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 1 & 1 & . & 1 & 2 \\
N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 1 & 1 & . & 2 & 2 \\
N & 5 & 1 & X & DU_MAC1 + DU_MAC2 & 2 & 1 & . & 2 & 2
\end{tabular}

\section*{Parallel Multiply and Accumulates (page 5-419)}
[1] ACx = M40(rnd(ACx + (uns(Xmem) * uns(coef(Cmem))))), ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem)))))
[2] ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(coef(Cmem))))), ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem)))))
[3] ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(coef(Cmem))))),
ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem)))))

[5] ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), ACx = M40(rnd((ACx >> \#16) + (uns(Smem) * uns(LO(coef(Cmem))))))
[6] ACy = M40(rnd((ACy >> \#16) + (uns(Smem) * uns(HI(coef(Cmem)))))), ACx = M40(rnd((ACx >> \#16) + (uns(Smem) * uns(LO(coef(Cmem))))))
ACy = M40(rnd(ACy + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))),

\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline & \multirow[b]{2}{*}{Instruction} & \multirow[b]{2}{*}{E} & \multirow[b]{2}{*}{S} & \multirow[b]{2}{*}{C} & \multirow[b]{2}{*}{Pipe} & \multirow[b]{2}{*}{Operator} & \multicolumn{3}{|l|}{Address Generation Unit} & \multicolumn{4}{|c|}{Buses} & \multirow[b]{2}{*}{Notes} \\
\hline No. & & & & & & & DA & CA & SA & DR & CR & DW & ACB & \\
\hline [8] & \begin{tabular}{l}
ACy = M40(rnd(ACy + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd((ACx >> \#16) + (uns(LO(Lmem)) * uns(LO(coef(Cmem))))))
\end{tabular} & N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 1 & 1 & . & 2 & 2 & & & \\
\hline [9] & \begin{tabular}{l}
ACy = M40(rnd((ACy >> \#16) + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd((ACx >> \#16) + (uns(LO(Lmem)) * uns(LO(coef(Cmem))))))
\end{tabular} & N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 1 & 1 & . & 2 & 2 & . & & \\
\hline [10] & \begin{tabular}{l}
ACy = M40(rnd(ACy + (uns(Ymem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx + (uns(Xmem) * uns(LO(coef(Cmem))))))
\end{tabular} & N & 5 & 1 & X & DU_MAC1 + DU_MAC2 & 2 & 1 & . & 2 & 2 & & & \\
\hline [11] & \begin{tabular}{l}
ACy = M40(rnd(ACy + (uns(Ymem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(LO(coef(Cmem))))))
\end{tabular} & N & 5 & 1 & X & DU_MAC1 + DU_MAC2 & 2 & 1 & . & 2 & 2 & & & \\
\hline [12] &  & N & 5 & 1 & X & DU_MAC1 + DU_MAC2 & 2 & 1 & & 2 & 2 & & & \\
\hline \multicolumn{15}{|l|}{Parallel Multiply and Subtracts (page 5-454)} \\
\hline [1] & \begin{tabular}{l}
ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd(ACy - (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 2 & 1 & & 2 & 1 & & & \\
\hline [2] &  & N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 1 & 1 & & 1 & 2 & & & \\
\hline [3] &  & N & 4 & 1 & X & DU_MAC1 + DU_MAC2 & 1 & 1 & & 2 & 2 & & & \\
\hline [4] &  & N & 5 & 1 & X & DU_MAC1 + DU_MAC2 & 2 & 1 & & 2 & 2 & & & \\
\hline \multicolumn{15}{|l|}{Peripheral Port Register Access Qualifiers (page 5-466)} \\
\hline [1] & readport() & N & 1 & 1 & D & & . & & & & & & & \\
\hline [2] & writeport() & N & 1 & 1 & D & & & & & & & & & \\
\hline
\end{tabular}

Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers (page 5-468)




Table 4-1. Algebraic Instruction Set Summary (Continued)


Repeat Single Instruction Unconditionally (page 5-498)
\begin{tabular}{ll|lllll}
[1] & repeat(k8) & Y & 2 & 1 & AD & PU_UNIT \\
[2] & repeat(k16) & Y & 3 & 1 & AD & PU_UNIT \\
[3] & repeat(CSR) & Y & 2 & & D & PU_UNIT
\end{tabular}
Repeat Single Instruction Unconditionally and Decrement CSR (page 5-503)
\begin{tabular}{r|lllll}
repeat(CSR), CSR -= k4 & Y & 2 & 1 & X & AU_ALU + PU_UNIT
\end{tabular}


Repeat Single Instruction Unconditionally and Increment CSR (page 5-505)
\begin{tabular}{ll|lllll}
[1] & repeat(CSR), CSR += TAx & Y & 2 & 1 & X & AU_ALU + PU_UNIT \\
[2] & repeat(CSR), CSR += k4 & Y & 2 & 1 & X & AU_ALU + PU_UNIT
\end{tabular}


Return Conditionally (page 5-508)
if (cond) return

† x/y cycles: x cycles = condition true, y cycles = condition false

Return Unconditionally (page 5-510)
return
\(|\)\begin{tabular}{lllll|lll|l}
Y & 2 & 5 & D & PU_UNIT & 1 &. & 1 & 2 \\
N & 2 & 5 & D & PU_UNIT & 1 &. & 1 & 2
\end{tabular}

Rotate Left Accumulator, Auxiliary, or Temporary Register Content (page 5-514)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline & dst-AU = BitOut \(\backslash\backslash\) src-AU \(\backslash\backslash\) BitIn & Y & 3 & 1 & X & AU_ALU & . & . & . & . & . & . & & \\
\hline & dst-AU = BitOut \(\backslash\backslash\) src-DU \(\backslash\backslash\) BitIn & Y & 3 & 1 & X & AU_ALU & & & & & & & 1 & \\
\hline & dst-DU = BitOut \(\backslash\backslash\) src-AU \(\backslash\backslash\) BitIn & Y & 3 & 1 & X & DU_SHIFT & . & . & . & & & . & . & See Note 1. \\
\hline
\end{tabular}

Table 4-1. Algebraic Instruction Set Summary (Continued)


Rotate Right Accumulator, Auxiliary, or Temporary Register Content (page 5-516)


Round Accumulator Content (page 5-518)
ACy = rnd(ACx)
\begin{tabular}{lllll}
Y & 2 & 1 & X & DU_ALU
\end{tabular}

Saturate Accumulator Content (page 5-520)

ACy = saturate(rnd(ACx))
\begin{tabular}{lllll}
Y & 2 & 1 & X & DU_ALU
\end{tabular}
Set Accumulator, Auxiliary, or Temporary Register Bit (page 5-522)


Set Status Register Bit (page 5-524)

- When this instruction is decoded to modify status bit CAFRZ (15), CAEN (14), or CACLR (13), the CPU pipeline is flushed and the instruction is executed in 5 cycles regardless of the instruction context.

\section*{Shift Accumulator Content Conditionally (page 5-527)}


Table 4-1. Algebraic Instruction Set Summary (Continued)

Shift Accumulator, Auxiliary, or Temporary Register Content Logically (page 5-532)
[1] dst-AU = dst-AU <<< \#1
\begin{tabular}{lllll}
Y & 2 & 1 & X & AU_ALU + DU_SHIFT \\
Y & 2 & 1 & X & DU_SHIFT \\
Y & 2 & 1 & X & AU_ALU + DU_SHIFT \\
Y & 2 & 1 & X & DU_SHIFT
\end{tabular}

\section*{Signed Shift of Accumulator, Auxiliary, or Temporary Register Content (page 5-544)}
\begin{tabular}{|c|l|c|c|c|c|c|}
\hline No. & Instruction & E & S & C & Pipe & Operator \\
\hline [1] & dst-AU = dst-AU >> \#1 & Y & 2 & 1 & X & AU_ALU + DU_SHIFT \\
\hline & dst-DU = dst-DU >> \#1 & Y & 2 & 1 & X & DU_SHIFT \\
\hline [2] & dst-AU = dst-AU << \#1 & Y & 2 & 1 & X & AU_ALU + DU_SHIFT \\
\hline & dst-DU = dst-DU << \#1 & Y & 2 & 1 & X & DU_SHIFT \\
\hline
\end{tabular}

Software Interrupt (page 5-549)
intr(k5)
\begin{tabular}{lllll}
N & 2 & 3 & D & PU_UNIT
\end{tabular}

Software Reset (page 5-551)
reset


Software Trap (page 5-555)
trap(k5)
Table 4-1. Algebraic Instruction Set Summary (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline & & & & & & \multicolumn{3}{|l|}{Address
Generation Unit} & \multicolumn{4}{|c|}{Buses} & \multirow[b]{2}{*}{Notes} \\
\hline No. Instruction & E & S & C & Pipe & Operator & DA & CA & SA & DR & CR & DW & ACB & \\
\hline
\end{tabular}

Square (page 5-557)
\begin{tabular}{ll|lllll|lll|llll}
[1] & ACy = rnd(ACx * ACx) & Y & 2 & 1 & X & DU_MAC1 & . & . & . & . & . & . & . \\
[2] & ACx = rnd(Smem * Smem) [, T3 = Smem] & N & 3 & 1 & X & DU_MAC1 & 1 & . & . & 1 & . & . & .
\end{tabular}
\begin{tabular}{lllll|lll|lll}
Y & 2 & 1 & X & DU_MAC1 & . & . & . & . & . & . \\
N & 3 & 1 & X & DU_MAC1 & 1 & . & . & 1 & . & .
\end{tabular}
[2] ACy = rnd(ACx + (Smem * Smem)) [, T3 = Smem]

\section*{Square and Subtract (page 5-563)}
[1] ACy = rnd(ACy - (ACx * ACx))
[2] ACy = rnd(ACx - (Smem * Smem)) [, T3 = Smem]


Store Accumulator Content to Memory (page 5-568)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline [1] & Smem = HI(ACx) & N & 2 & 1 & X & & 1 & . & & & . & 1 & & \\
\hline [2] & Smem = HI(rnd(ACx)) & N & 3 & 1 & X & DU_SHIFT & 1 & . & . & & . & 1 & . & \\
\hline [3] & Smem = LO(ACx << Tx) & N & 3 & 1 & X & DU_SHIFT & 1 & . & . & & . & 1 & . & \\
\hline [4] & Smem = HI(rnd(ACx << Tx)) & N & 3 & 1 & X & DU_SHIFT & 1 & . & . & & . & 1 & & \\
\hline [5] & Smem = LO(ACx << \#SHIFTW) & N & 3 & 1 & X & DU_SHIFT & 1 & . & . & & . & 1 & . & \\
\hline [6] & Smem = HI(ACx << \#SHIFTW) & N & 3 & 1 & X & DU_SHIFT & 1 & . & . & & . & 1 & . & \\
\hline [7] & Smem = HI(rnd(ACx << \#SHIFTW)) & N & 4 & 1 & X & DU_SHIFT & 1 & . & . & & & 1 & & \\
\hline [8] & Smem = HI(saturate(uns(rnd(ACx)))) & N & 3 & 1 & X & DU_SHIFT & 1 & . & . & & . & 1 & . & \\
\hline [9] & Smem = HI(saturate(uns(rnd(ACx << Tx)))) & N & 3 & 1 & X & DU_SHIFT & 1 & . & . & & . & 1 & . & \\
\hline [10] & Smem = HI(saturate(uns(rnd(ACx << \#SHIFTW)))) & N & 4 & 1 & X & DU_SHIFT & 1 & . & & & & 1 & . & \\
\hline [11] & dbl(Lmem) = ACx & N & 3 & 1 & X & & 1 & . & . & . & . & 2 & . & \\
\hline [12] & dbl(Lmem) = saturate(uns(ACx)) & N & 3 & 1 & X & DU_SHIFT & 1 & . & . & & . & 2 & . & \\
\hline
\end{tabular}


Table 4-1. Algebraic Instruction Set Summary (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[b]{2}{*}{No.} & \multirow[b]{2}{*}{Instruction} & \multirow[b]{2}{*}{E} & \multirow[b]{2}{*}{S} & \multirow[b]{2}{*}{C} & \multirow[b]{2}{*}{Pipe} & \multirow[b]{2}{*}{Operator} & \multicolumn{3}{|l|}{Address Generation Unit} & \multicolumn{4}{|c|}{Buses} & \multirow[b]{2}{*}{Notes} \\
\hline & & & & & & & DA & CA & SA & DR & CR & DW & ACB & \\
\hline [13] & \begin{tabular}{l}
HI(Lmem) = HI(ACx) >> \#1, \\
LO(Lmem) = LO(ACx) >> \#1
\end{tabular} & N & 3 & 1 & X & DU_SHIFT & 1 & . & . & & . & 2 & . & \\
\hline [14] & \begin{tabular}{l}
Xmem = LO(ACx), \\
Ymem = HI(ACx)
\end{tabular} & N & 3 & 1 & X & & 2 & . & . & . & . & 2 & . & \\
\end{tabular}

Store Accumulator Pair Content to Memory (page 5-588)
\begin{tabular}{|c|l|c|c|c|c|c|c|c|c|c|c|}
\hline No. & Instruction & E & S & C & Pipe & DA & CA & SA & DR & CR & DW \\
\hline [1] & Lmem = pair(HI(ACx)) & N & 3 & 1 & X & 1 & . & . & . & . & 2 \\
\hline [2] & Lmem = pair(LO(ACx)) & N & 3 & 1 & X & 1 & . & . & . & . & 2 \\
\hline
\end{tabular}

Store Accumulator, Auxiliary, or Temporary Register Content to Memory (page 5-591)
\begin{tabular}{|c|l|c|c|c|c|c|c|c|c|c|c|c|}
\hline No. & Instruction & E & S & C & Pipe & DA & CA & SA & DR & CR & DW & ACB \\
\hline [1] & Smem = src & N & 2 & 1 & X & 1 & . & . & . & . & 1 & . \\
\hline [2] & high_byte(Smem) = src & N & 3 & 1 & X & 1 & . & . & . & . & 1 & . \\
\hline [3] & low_byte(Smem) = src & N & 3 & 1 & X & 1 & . & . & . & . & 1 & . \\
\hline
\end{tabular}

Store Auxiliary or Temporary Register Pair Content to Memory (page 5-595)
Lmem = pair(TAx)

\section*{Store CPU Register Content to Memory (page 5-596)}



Table 4-1. Algebraic Instruction Set Summary (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline & \multirow[b]{2}{*}{Instruction} & \multirow[b]{2}{*}{E} & \multirow[b]{2}{*}{S} & \multirow[b]{2}{*}{C} & \multirow[b]{2}{*}{Pipe} & \multirow[b]{2}{*}{Operator} & \multicolumn{3}{|l|}{Address Generation Unit} & \multicolumn{4}{|c|}{Buses} & \multirow[b]{2}{*}{Notes} \\
\hline No. & & & & & & & DA & CA & SA & DR & CR & DW & ACB & \\
\hline \multirow[t]{3}{*}{[5]} & dst-AU = Smem - src-AU & N & 3 & 1 & X & AU_ALU & 1 & . & & 1 & & . & & \\
\hline & dst-AU = Smem - src-DU & N & 3 & 1 & X & AU_ALU & 1 & . & . & 1 & . & . & 1 & \\
\hline & dst-DU = Smem - src & N & 3 & 1 & X & DU_ALU & 1 & . & . & 1 & . & . & . & See Note 1. \\
\hline [6] & ACy = ACy - (ACx << Tx) & Y & 2 & 1 & X & DU_SHIFT & . & . & . & . & . & . & . & \\
\hline [7] & ACy = ACy - (ACx << \#SHIFTW) & Y & 3 & 1 & X & DU_SHIFT & . & . & . & . & . & . & . & \\
\hline [8] & ACy = ACx - (K16 << \#16) & N & 4 & 1 & X & DU_ALU & . & . & . & & . & . & & \\
\hline [9] & ACy = ACx - (K16 << \#SHFT) & N & 4 & 1 & X & DU_SHIFT & . & . & . & . & . & . & . & \\
\hline [10] & ACy = ACx - (Smem << Tx) & N & 3 & 1 & X & DU_SHIFT & 1 & . & . & 1 & . & . & . & \\
\hline [11] & ACy = ACx - (Smem << \#16) & N & 3 & 1 & X & DU_ALU & 1 & . & . & 1 & . & . & . & \\
\hline [12] & ACy = (Smem << \#16) - ACx & N & 3 & 1 & X & DU_ALU & 1 & . & . & 1 & . & . & . & \\
\hline [13] & ACy = ACx - uns(Smem) - BORROW & N & 3 & 1 & X & DU_ALU & 1 & . & . & 1 & . & . & . & \\
\hline [14] & ACy = ACx - uns(Smem) & N & 3 & 1 & X & DU_ALU & 1 & . & . & 1 & . & . & . & \\
\hline [15] & ACy = ACx - (uns(Smem) << \#SHIFTW) & N & 4 & 1 & X & DU_SHIFT & 1 & . & & 1 & . & . & & \\
\hline [16] & ACy = ACx - dbl(Lmem) & N & 3 & 1 & X & DU_ALU & 1 & . & & 2 & . & & & \\
\hline [17] & ACy = dbl(Lmem) - ACx & N & 3 & 1 & X & DU_ALU & 1 & . & . & 2 & . & . & & \\
\hline [18] & ACx = (Xmem << \#16) - (Ymem << \#16) & N & 3 & 1 & X & DU_ALU & 2 & . & & 2 & & & & \\
\hline
\end{tabular}

\section*{Subtraction with Parallel Store Accumulator Content to Memory (page 5-627)}
ACy = (Xmem << \#16) - ACx,
Ymem = HI(ACy << T2)


Swap Accumulator Content (page 5-629)
\begin{tabular}{ll|lllll}
[1] & swap(AC0, AC2) & Y & 2 & 1 & X & DU_SWAP \\
[2] & swap(AC1, AC3) & Y & 2 & 1 & X & DU_SWAP
\end{tabular}

Swap Accumulator Pair Content (page 5-630)
swap(pair(AC0), pair(AC2))


Swap Auxiliary Register Content (page 5-631)


Table 4-1. Algebraic Instruction Set Summary (Continued)

\section*{Test Accumulator, Auxiliary, or Temporary Register Bit Pair (page 5-643)}
\begin{tabular}{l|lllll}
bit(src-AU, pair(Baddr)) & N & 3 & 1 & X & AU_ALU \\
bit(src-DU, pair(Baddr)) & N & 3 & 1 & X & DU_BIT
\end{tabular}

Test Memory Bit (page 5-645)
\begin{tabular}{ll|lllll|lll|l}
No. & Instruction & E & S & C & Pipe & Operator & DA & CA & SA & DR \\
[1] & TCx = bit(Smem, src) & N & 3 & 1 & X & AU_ALU & 1 & . & . & 1 \\
[2] & TCx = bit(Smem, k4) & N & 3 & 1 & X & AU_ALU & 1 & . & . & 1
\end{tabular}

Test and Clear Memory Bit (page 5-648)
\begin{tabular}{ll|lllll|lll|llll}
No. & Instruction & E & S & C & Pipe & Operator & DA & CA & SA & DR & CR & DW & ACB \\
[1] & TC1 = bit(Smem, k4), bit(Smem, k4) = \#0 & N & 3 & 1 & X & AU_ALU & 1 & . & . & 1 & . & 1 & . \\
[2] & TC2 = bit(Smem, k4), bit(Smem, k4) = \#0 & N & 3 & 1 & X & AU_ALU & 1 & . & . & 1 & . & 1 & .
\end{tabular}

Test and Complement Memory Bit (page 5-649)
\begin{tabular}{ll|lllll}
No. & Instruction & E & S & C & Pipe & Operator \\
[1] & TC1 = bit(Smem, k4), cbit(Smem, k4) & N & 3 & 1 & X & AU_ALU \\
[2] & TC2 = bit(Smem, k4), cbit(Smem, k4) & N & 3 & 1 & X & AU_ALU
\end{tabular}


Test and Set Memory Bit (page 5-650)
\begin{tabular}{ll|lllll|lll|llll}
No. & Instruction & E & S & C & Pipe & Operator & DA & CA & SA & DR & CR & DW & ACB \\
[1] & TC1 = bit(Smem, k4), bit(Smem, k4) = \#1 & N & 3 & 1 & X & AU_ALU & 1 & . & . & 1 & . & 1 & . \\
[2] & TC2 = bit(Smem, k4), bit(Smem, k4) = \#1 & N & 3 & 1 & X & AU_ALU & 1 & . & . & 1 & . & 1 & .
\end{tabular}


\section*{Instruction Set Descriptions}

This chapter provides detailed information on the TMS320C55x™ DSP algebraic instruction set.

See Section 1.1, Instruction Set Terms, Symbols, and Abbreviations, for definitions of symbols and abbreviations used in the description of each instruction. See Chapter 4 for a summary of the instruction set.

\section*{abdst}

\section*{Absolute Distance}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & abdst(Xmem, Ymem, ACx, ACy) & & & & \\
\hline
\end{tabular}

\section*{Operands}

ACx, ACy, Xmem, Ymem

\section*{Description}

This instruction executes two operations in parallel, one in the D-unit MAC and one in the D-unit ALU:

ACy = ACy + |HI(ACx)|

ACx = (Xmem << \#16) - (Ymem << \#16)

The absolute value of accumulator ACx content is computed and added to accumulator ACy content through the D-unit MAC. When an overflow is detected according to M40:
- The destination accumulator overflow status bit (ACOVy) is set.
- The destination register (ACy) is saturated according to SATD.

The Ymem content shifted left 16 bits is subtracted from the Xmem content shifted left 16 bits in the D-unit ALU.
- Input operands (Xmem and Ymem) are sign extended to 40 bits according to SXMD.
- The CARRY status bit depends on M40. The subtraction borrow is reported in the CARRY status bit as its logical complement.
- When an overflow is detected according to M40:
  - The destination accumulator overflow status bit (ACOVx) is set.
  - The destination register (ACx) is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.
When C54CM = 1, the subtract operation does not perform overflow detection, reporting, or saturation after the shifting operation.

\section*{Status Bits}

Affected by C54CM, FRCT, M40, SATD, SXMD

Affects ACOVx, ACOVy, CARRY

\section*{Repeat}

This instruction can be repeated.
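The two parallel operations of abdst can be sketched as a small Python model. This is only an illustration under stated assumptions, not the device's implementation: the function names `sign_extend` and `abdst` are hypothetical, |HI(ACx)| is assumed to be added to ACy as a plain integer taken from bits 31-16 of ACx, and overflow detection, saturation, CARRY, and FRCT effects are not modeled.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def abdst(xmem, ymem, acx, acy, sxmd=1):
    """Return (new_acx, new_acy) as 40-bit patterns.

    Assumption: |HI(ACx)| is added to ACy unshifted; saturation and
    status bits are intentionally left out of this sketch.
    """
    # D-unit MAC part: ACy = ACy + |HI(ACx)|, using the old ACx value
    hi = sign_extend(acx >> 16, 16)
    new_acy = (sign_extend(acy, 40) + abs(hi)) & MASK40

    # D-unit ALU part: ACx = (Xmem << #16) - (Ymem << #16)
    x = sign_extend(xmem, 16) if sxmd else xmem & 0xFFFF
    y = sign_extend(ymem, 16) if sxmd else ymem & 0xFFFF
    new_acx = ((x << 16) - (y << 16)) & MASK40
    return new_acx, new_acy
```

With the operand values used in this instruction's example (Xmem = 3400h, Ymem = EF00h, AC0 = 0, AC1 = 00 E800 0000h, SXMD = 1), the sketch yields AC0 = 00 4500 0000h and leaves AC1 unchanged.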

\section*{See Also}

See the following other related instructions:
- Square Distance

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline abdst(*AR0+, *AR1, AC0, AC1) & \begin{tabular}{l} 
The absolute value of the content of AC0 is added to the content of \\
AC1 and the result is stored in AC1. The content addressed by AR1 is \\
subtracted from the content addressed by AR0 and the result is stored \\
in AC0. The content of AR0 is incremented by 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & 00 & 0000 & 0000 & AC0 & 00 & 4500 & 0000 \\
\hline AC1 & 00 & E800 & 0000 & AC1 & 00 & E800 & 0000 \\
\hline AR0 & & & 202 & AR0 & & & 203 \\
\hline AR1 & & & 302 & AR1 & & & 302 \\
\hline 202 & & & 3400 & 202 & & & 3400 \\
\hline 302 & & & EF00 & 302 & & & EF00 \\
\hline ACOV0 & & & 0 & ACOV0 & & & 0 \\
\hline ACOV1 & & & 0 & ACOV1 & & & 0 \\
\hline CARRY & & & 0 & CARRY & & & 0 \\
\hline M40 & & & 1 & M40 & & & 1 \\
\hline SXMD & & & 1 & SXMD & & & 1 \\
\hline
\end{tabular}

\section*{ABS}

Absolute Value

\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & \(\mathrm{dst}=|\mathrm{src}|\) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

dst, src

\section*{Description}

This instruction computes the absolute value of the source register (src).
- When the destination register (dst) is an accumulator:
  - The operation is performed on 40 bits in the D-unit ALU.
  - If an auxiliary or temporary register is the source operand of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended to 40 bits according to SXMD.
  - If M40 = 0, the sign of the source register is extracted at bit position 31. If src(31) = 1, the source register content is negated. If src(31) = 0, the source register content is moved to the destination accumulator.
  - If M40 = 1, the sign of the source register is extracted at bit position 39. If src(39) = 1, the source register content is negated. If src(39) = 0, the source register content is moved to the destination accumulator.
  - During the 40-bit move operation, overflow and the CARRY status bit are detected according to M40:
    - The destination accumulator overflow status bit (ACOVx) is set.
    - The destination register is saturated according to SATD.
    - The CARRY status bit is updated as follows: if the result of the operation stored in the destination register is 0, CARRY is set; otherwise, CARRY is cleared.
- When the destination register (dst) is an auxiliary or temporary register:
  - The operation is performed on 16 bits in the A-unit ALU.
  - If an accumulator is the source operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
  - The sign of the source register is extracted at bit position 15. If src(15) = 1, the source register content is negated. If src(15) = 0, the source register content is moved to the destination register. Overflow is detected at bit position 15.
  - The destination register is saturated according to SATA.
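The sign-extraction rules above can be illustrated with a short Python model. It is a sketch under stated assumptions rather than a description of the hardware: the helper names `sign_extend16_to_40`, `abs_acc`, and `abs_reg16` are hypothetical, and overflow detection and saturation (SATD, SATA) are deliberately not modeled.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def sign_extend16_to_40(reg16, sxmd):
    """Extend a 16-bit register value to 40 bits; sign extend only if SXMD = 1."""
    reg16 &= 0xFFFF
    if sxmd and (reg16 & 0x8000):
        return reg16 | (MASK40 ^ 0xFFFF)
    return reg16


def abs_acc(src40, m40):
    """Accumulator destination: return (result, carry) as a 40-bit pattern.

    The sign is read at bit 39 when M40 = 1 and at bit 31 when M40 = 0.
    Saturation and ACOVx are not modeled in this sketch.
    """
    src40 &= MASK40
    sign_bit = 39 if m40 else 31
    if (src40 >> sign_bit) & 1:
        result = (-src40) & MASK40  # two's-complement negation on 40 bits
    else:
        result = src40
    carry = 1 if result == 0 else 0  # CARRY is set when the result is 0
    return result, carry


def abs_reg16(src):
    """Auxiliary/temporary destination: operate on the 16 LSBs, sign at bit 15."""
    low = src & 0xFFFF
    return (-low) & 0xFFFF if low & 0x8000 else low
```

For instance, `abs_acc(0x8200001234, m40=1)` gives `0x7DFFFFEDCC`, and sign extending AR1 = 8700h before calling `abs_acc(..., m40=0)` gives `0x7900`, matching the accumulator values shown in this instruction's examples.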

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, this instruction is executed as if the M40 status bit were locally set to 1. To ensure compatibility with respect to overflow detection and saturation of the destination accumulator, this instruction must be executed with M40 = 0.

\section*{Status Bits}

Affected by C54CM, M40, SATA, SATD, SXMD

Affects ACOVx, CARRY

\section*{Repeat}

\section*{See Also}

See the following other related instructions:
- Addition with Absolute Value

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = |AC0| & The absolute value of the content of AC0 is stored in AC1. \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC1 & 00 & 0000 & 2000 & AC1 & 7D & FFFF & EDCC \\
\hline AC0 & 82 & 0000 & 1234 & AC0 & 82 & 0000 & 1234 \\
\hline M40 & & & 1 & M40 & & & 1 \\
\hline
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = |AR1| & The absolute value of the content of AR1 is stored in AC1. \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC1 & 00 & 0000 & 2000 & AC1 & 00 & 0000 & 0000 \\
\hline AR1 & & & 0000 & AR1 & & & 0000 \\
\hline CARRY & & & 0 & CARRY & & & 1 \\
\hline
\end{tabular}

\section*{Example 3}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = |AR1| & \begin{tabular}{l} 
The absolute value of the content of AR1 is stored in AC1. Since SXMD = 1, AR1 content \\
is sign extended. The resulting 40-bit data is negated since M40 = 0 and AR1(31) = 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC1 & 00 & 0000 & 2000 & AC1 & 00 & 0000 & 7900 \\
\hline AR1 & & & 8700 & AR1 & & & 8700 \\
\hline M40 & & & 0 & M40 & & & 0 \\
\hline SXMD & & & 1 & SXMD & & & 1 \\
\hline
\end{tabular}

\section*{Example 4}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline T1 = |AC0| & \begin{tabular}{l} 
The absolute value of the content of AC0(15-0) is stored in T1. The sign bit is extracted at \\
AC0(15). Since AC0(15) = 0, T1 = AC0(15-0).
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline T1 & & & 2000 & T1 & & & 1234 \\
\hline AC0 & 80 & 0002 & 1234 & AC0 & 80 & 0002 & 1234 \\
\hline
\end{tabular}

\section*{Example 5}


\section*{ADD}

\section*{Addition}

Syntax Characteristics

\section*{See Also}

See the following other related instructions:
- Addition or Subtraction Conditionally
- Addition or Subtraction Conditionally with Shift
- Addition with Absolute Value
- Addition with Parallel Store Accumulator Content to Memory
- Addition, Subtraction, or Move Accumulator Content Conditionally
- Dual 16-Bit Additions
- Dual 16-Bit Addition and Subtraction
- Dual 16-Bit Subtraction and Addition
- Subtraction

\section*{Addition}

Syntax Characteristics
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & dst = dst + src & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Opcode | 0010 010E | FSSS FDDD
Operands dst, src
Description This instruction performs an addition operation between two registers.
- When the destination (dst) operand is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ Input operands are sign extended to 40 bits according to SXMD.
■ If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
■ Overflow detection and the CARRY status bit depend on M40.
■ When an overflow is detected, the accumulator is saturated according to SATD.
- When the destination (dst) operand is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
■ Addition overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.
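
The accumulator case can be sketched in Python as follows. This is a simplified model, not the device's exact behavior: it assumes overflow is judged at bit 39 when M40 = 1 and at bit 31 when M40 = 0 (these bit positions are not stated in this section), it takes already sign-extended signed values as inputs, and it does not model CARRY. The function name `add_acc` is hypothetical.

```python
MASK40 = (1 << 40) - 1

def add_acc(dst: int, src: int, m40: int, satd: int):
    """Model dst = dst + src for signed accumulator values.

    Returns (40-bit result pattern, overflow flag). CARRY is not modeled.
    Assumption: overflow is judged at bit 39 when M40 = 1 and at bit 31
    when M40 = 0.
    """
    bits = 40 if m40 else 32
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    total = dst + src
    overflow = not (lo <= total <= hi)
    if overflow and satd:
        total = hi if total > hi else lo   # clamp to the nearest bound
    return total & MASK40, overflow

print(add_acc(0x7FFF_FFFF, 1, m40=0, satd=1))  # (2147483647, True)
print(add_acc(0x7FFF_FFFF, 1, m40=1, satd=1))  # (2147483648, False)
```

The same operands overflow in 32-bit mode but not in 40-bit mode, which is the practical effect of the M40 dependency described above.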

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SATA, SATD, SXMD
Affects ACOVx, CARRY
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline\(A C 0=A C 0+A C 1\) & The content of AC1 is added to the content of AC0 and the result is stored in AC0. \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
Description This instruction performs an addition operation between a register content and a 16-bit signed constant, K16.
- When the destination (dst) operand is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
■ The 16-bit constant, K16, is sign extended to 40 bits according to SXMD.
■ Overflow detection and the CARRY status bit depend on M40.
■ When an overflow is detected, the accumulator is saturated according to SATD.
- When the destination (dst) operand is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
■ Addition overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SATA, SATD, SXMD
Affects ACOVx, CARRY
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = AC0 + \#2E00h & \begin{tabular}{l} 
The content of AC0 is added to the signed 16-bit value (2E00h) and the result is \\
stored in AC1.
\end{tabular} \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [4] & dst = src + Smem & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode | 1101 0110 | AAAA AAAI | FDDD FSSS
Operands dst, Smem, src

Description This instruction performs an addition operation between a register content and the content of a memory (Smem) location.
- When the destination (dst) operand is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
■ The content of the memory location is sign extended to 40 bits according to SXMD.
■ Overflow detection and the CARRY status bit depend on M40.
■ When an overflow is detected, the accumulator is saturated according to SATD.
- When the destination (dst) operand is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
■ Addition overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.
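
The auxiliary or temporary register case can be modeled as a signed 16-bit addition. The sketch below is an illustration under stated assumptions: the saturation values 7FFFh and 8000h are assumed for SATA = 1, the CARRY status bit is not modeled, and the function name `add16` is hypothetical.

```python
def add16(src: int, mem: int, sata: int):
    """Model dst = src + Smem in the A-unit ALU on signed 16-bit values.

    Returns (16-bit result pattern, overflow flag); overflow is judged at
    bit 15, and SATA = 1 clamps to 7FFFh / 8000h (assumed saturation values).
    """
    def s16(v):  # interpret a 16-bit pattern as signed
        v &= 0xFFFF
        return v - 0x10000 if v & 0x8000 else v

    total = s16(src) + s16(mem)
    overflow = not (-0x8000 <= total <= 0x7FFF)
    if overflow and sata:
        total = 0x7FFF if total > 0 else -0x8000
    return total & 0xFFFF, overflow

print([hex(add16(0x7000, 0x2000, sata=1)[0]), hex(add16(0x7000, 0x2000, sata=0)[0])])
# ['0x7fff', '0x9000']
```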

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SATA, SATD, SXMD
Affects ACOVx, CARRY
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline T1 \(=\) T0 + *AR3+ & \begin{tabular}{l} 
The content of T0 is added to the content addressed by AR3 and the result is \\
stored in T1. AR3 is incremented by 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{l} 
Before \\
AR3
\end{tabular}
302

Addition
Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [5] & ACy = ACy + (ACx << Tx) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Opcode | 0101 101E | DDSS SS00
Operands ACx, ACy, Tx

Description This instruction performs an addition operation between an accumulator content ACy and an accumulator content ACx shifted by the content of Tx.
- The operation is performed on 40 bits in the D-unit shifter.
- Input operands are sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1:
- An intermediary shift operation is performed as if M40 is locally set to 1, and no overflow detection, report, or saturation is done after the shifting operation.
- The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
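
The shift-quantity rule for C54CM = 1 can be illustrated with a small Python sketch. It assumes that the "modulo 16 operation" amounts to adding 16, which is the mapping that takes -32..-17 onto -16..-1. The function name is hypothetical.

```python
def c54cm_shift_quantity(tx: int) -> int:
    """Shift count derived from Tx when C54CM = 1, per the two rules above."""
    q = tx & 0x3F                 # keep the 6 LSBs
    if q & 0x20:                  # interpret as signed: -32..+31
        q -= 64
    if -32 <= q <= -17:           # fold into -16..-1 (assumed: add 16)
        q += 16
    return q

print([c54cm_shift_quantity(t) for t in (0x000C, 0xFFFF, 0xFFE0, 0xFFEF)])
# [12, -1, -16, -1]
```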

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline\(A C 0=A C 0+(A C 1 \ll T 0)\) & \begin{tabular}{l} 
The content of AC1 shifted by the content of T0 is added to the content of AC0 \\
and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [9] & ACy = ACx + (Smem << Tx) & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode | 1101 1101 | AAAA AAAI | SSDD ss00
Operands ACx, ACy, Tx, Smem
Description This instruction performs an addition operation between an accumulator content ACx and the content of a memory (Smem) location shifted by the content of Tx.
- The operation is performed on 40 bits in the D-unit shifter.
- Input operands are sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1:
- An intermediary shift operation is performed as if M40 is locally set to 1, and no overflow detection, report, or saturation is done after the shifting operation.
- The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 + (*AR1 << T0) & \begin{tabular}{l}
The content addressed by AR1 shifted left by the content of T0 is added to the \\
content of AC1 and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & 00 & 0000 & 0000 & AC0 & 00 & 2330 & 0000 \\
\hline AC1 & 00 & 2300 & 0000 & AC1 & 00 & 2300 & 0000 \\
\hline T0 & & & 000C & T0 & & & 000C \\
\hline AR1 & & & 0200 & AR1 & & & 0200 \\
\hline 200 & & & 0300 & 200 & & & 0300 \\
\hline SXMD & & & 0 & SXMD & & & 0 \\
\hline M40 & & & 0 & M40 & & & 0 \\
\hline ACOV0 & & & 0 & ACOV0 & & & 0 \\
\hline CARRY & & & 0 & CARRY & & & 1 \\
\hline
\end{tabular}
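
The accumulator result in this example can be reproduced with plain integer arithmetic. The Python sketch below only checks the AC0 value; it does not attempt to model the ACOV0 or CARRY status bits, and the variable names are chosen for illustration.

```python
MASK40 = (1 << 40) - 1

ac1 = 0x00_2300_0000      # AC1 before
smem = 0x0300             # content at address 200h
t0 = 0x000C               # shift count in T0 (12)

ac0 = (ac1 + (smem << t0)) & MASK40
print(f"{ac0:010X}")      # 0023300000
```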

Addition

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [10] & ACy = ACx + (Smem << \#16) & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode | 1101 1110 | AAAA AAAI | SSDD 0100
Operands ACx, ACy, Smem

Description This instruction performs an addition operation between an accumulator content ACx and the content of a memory (Smem) location shifted left by 16 bits.
- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40. If the result of the addition generates a carry, the CARRY status bit is set; otherwise, the CARRY status bit is not affected.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When C54CM \(=1\), an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 \(=\) AC1 + (*AR3 <<\#16) & \begin{tabular}{l} 
The content addressed by AR3 shifted left by 16 bits is added to the \\
content of AC1 and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Addition}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [11] & ACy = ACx + uns(Smem) + CARRY & & & & \\
\hline
\end{tabular}

Operands ACx, ACy, Smem

Description This instruction performs an addition operation of the accumulator content ACx, the content of a memory (Smem) location, and the value of the CARRY status bit.
- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are extended to 40 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
- Overflow detection and the CARRY status bit depend on M40.
- When an overflow is detected, the accumulator is saturated according to SATD.
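
The effect of the optional uns keyword can be sketched in Python. This is an illustration only: overflow handling and the CARRY update are left out, zero extension is assumed when SXMD = 0, and the helper names are hypothetical.

```python
def extend_smem(word: int, uns: bool, sxmd: int) -> int:
    """Extend a 16-bit memory word as described above.

    uns=True  -> zero extension.
    uns=False -> sign extension when SXMD = 1 (assumed: zero extension
                 when SXMD = 0).
    """
    word &= 0xFFFF
    if not uns and sxmd and (word & 0x8000):
        return word - 0x10000
    return word

def add_with_carry(acx: int, word: int, carry: int, uns: bool, sxmd: int) -> int:
    """ACy = ACx + [uns(]Smem[)] + CARRY, without overflow handling."""
    return acx + extend_smem(word, uns, sxmd) + carry

print(add_with_carry(0x1000, 0xFFFF, 1, uns=True, sxmd=1))   # 69632
print(add_with_carry(0x1000, 0xFFFF, 1, uns=False, sxmd=1))  # 4096
```

The same memory word FFFFh contributes 65535 with uns and -1 without it, which is why the keyword matters for multi-word (carry-propagating) additions.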

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.


\section*{Addition}

\section*{Syntax Characteristics}

- Overflow detection and the CARRY status bit depend on M40.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.


\section*{Addition}

\section*{Syntax Characteristics}
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [13] & ACy = ACx + (uns(Smem) << \#SHIFTW) & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode | 1111 1001 | AAAA AAAI | uxSH IFTW | SSDD 00xx
Operands ACx, ACy, SHIFTW, Smem
Description This instruction performs an addition operation between an accumulator content ACx and the content of a memory (Smem) location shifted by the 6-bit value, SHIFTW.
- The operation is performed on 40 bits in the D-unit shifter.
- Input operands are extended to 40 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When C54CM \(=1\), an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 \(=\) AC1 + (uns(*AR3) << \#31) & \begin{tabular}{l} 
The unsigned content addressed by AR3 shifted left by 31 bits is \\
added to the content of AC1 and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [14] & ACy = ACx + dbl(Lmem) & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode | 1110 1101 | AAAA AAAI | SSDD 000n
Operands ACx, ACy, Lmem

Description This instruction performs an addition operation between an accumulator content ACx and the content of data memory operand dbl(Lmem).
- The data memory operand dbl(Lmem) addresses are aligned:
■ If the Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1.
■ If the Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1.
- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
- Overflow detection and the CARRY status bit depend on M40.
- When an overflow is detected, the accumulator is saturated according to SATD.
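
The word pairing used by dbl(Lmem) can be expressed in a few lines of Python. The function name is hypothetical; the rule itself is the even/odd alignment stated above.

```python
def dbl_word_addresses(lmem: int):
    """Return (MSW address, LSW address) for a dbl(Lmem) access."""
    lsw = lmem + 1 if lmem % 2 == 0 else lmem - 1
    return lmem, lsw

print([tuple(hex(a) for a in dbl_word_addresses(x)) for x in (0x0200, 0x0201)])
# [('0x200', '0x201'), ('0x201', '0x200')]
```

In both cases the two words occupy the same even-aligned pair, so the least significant word address is simply the Lmem address with its lowest bit toggled.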

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 + dbl(*AR3+) & \begin{tabular}{l} 
The content (long word) addressed by AR3 and AR3 + 1 is added to the \\
content of AC1 and the result is stored in AC0. Because this instruction is a \\
long-operand instruction, AR3 is incremented by 2 after the execution.
\end{tabular} \\
\hline
\end{tabular}

\section*{Addition}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [16] & Smem = Smem + K16 & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode | 1111 0111 | AAAA AAAI | KKKK KKKK | KKKK KKKK
Operands K16, Smem

Description This instruction performs an addition operation between a 16-bit signed constant, K16, and the content of a memory (Smem) location.
- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD and shifted by 16 bits to the MSBs before being added.
- Addition overflow is detected at bit position 31. If an overflow is detected, accumulator 0 overflow status bit (ACOV0) is set.
- Addition carry report in CARRY status bit is extracted at bit position 31.
- If SATD is 1 when an overflow is detected, the result is saturated before being stored in memory. Saturation values are 7FFFh or 8000h.
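
The memory-update form can be modeled as below. The Python sketch assumes SXMD = 1 (both operands sign extended), does not model the CARRY status bit, and uses a hypothetical function name.

```python
def add_k16_to_mem(mem: int, k16: int, satd: int):
    """Model Smem = Smem + K16 with SXMD = 1 (both operands sign extended).

    Both 16-bit values are moved to bits 31-16, added, and checked for
    overflow at bit 31. Returns (16-bit word stored back, overflow flag).
    """
    def s16(v):
        v &= 0xFFFF
        return v - 0x10000 if v & 0x8000 else v

    total = (s16(mem) << 16) + (s16(k16) << 16)
    overflow = not (-(1 << 31) <= total < (1 << 31))
    if overflow and satd:
        return (0x7FFF if total > 0 else 0x8000), True
    return (total >> 16) & 0xFFFF, overflow

print([hex(add_k16_to_mem(0x6000, 0x2E00, s)[0]) for s in (1, 0)])
# ['0x7fff', '0x8e00']
```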

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.
Status Bits Affected by SATD, SXMD
Affects ACOV0, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *AR3 \(=\) *AR3 + \#2E00h & \begin{tabular}{l} 
The content addressed by AR3 is added to a signed 16-bit value (2E00h) and the \\
result is stored back into the location addressed by AR3.
\end{tabular} \\
\hline
\end{tabular}

\section*{ADDV}

\section*{Addition with Absolute Value}

\section*{Syntax Characteristics}


See Also See the following other related instructions:
- Absolute Value
- Addition
- Addition or Subtraction Conditionally
- Addition or Subtraction Conditionally with Shift
- Addition, Subtraction, or Move Accumulator Content Conditionally

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 + |AC1| & \begin{tabular}{l}
The absolute value of AC1 is added to the content of AC0 and the result is stored \\
in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{ADD::MOV}

\section*{Addition with Parallel Store Accumulator Content to Memory}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
ACy = ACx + (Xmem << \#16), \\
Ymem = HI(ACy << T2)
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode
Operands ACx, ACy, T2, Xmem, Ymem
Description This instruction performs two operations in parallel: addition and store.
The first operation performs an addition between an accumulator content ACx and the content of data memory operand Xmem shifted left by 16 bits.
- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40. When C54CM = 1, an intermediary shift operation is performed as if M40 is locally set to 1, and no overflow detection, report, or saturation is done after the shifting operation.
- When an overflow is detected, the accumulator is saturated according to SATD.

The second operation shifts the accumulator ACy by the content of T2 and stores ACy(31-16) to data memory operand Ymem. If the 16-bit value in T2 is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.
- The input operand is shifted in the D-unit shifter according to SXMD.
- After the shift, the high part of the accumulator, ACy(31-16), is stored to the memory location.
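
The store half of the instruction can be sketched in Python. The model below assumes that a negative count means an arithmetic right shift (the SXMD = 1 case) and leaves out any saturation on store; the function name is hypothetical.

```python
def store_hi_shifted(acy: int, t2: int) -> int:
    """Model Ymem = HI(ACy << T2) for a signed accumulator value.

    T2 is read as a signed 16-bit count and clamped to -32..+31; a negative
    count is treated as an arithmetic right shift (assumed here for SXMD = 1).
    Returns the 16-bit word taken from bits 31-16 of the shifted value.
    """
    t2 &= 0xFFFF
    count = t2 - 0x10000 if t2 & 0x8000 else t2
    count = max(-32, min(31, count))          # saturate the shift count
    shifted = acy << count if count >= 0 else acy >> -count
    return (shifted >> 16) & 0xFFFF

print(hex(store_hi_shifted(0x0012_3456, 4)))       # 0x123
print(hex(store_hi_shifted(0x1234_0000, 0xFFFC)))  # 0x123
```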

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When this instruction is executed with C54CM = 1, the 6 LSBs of T2 are used to determine the shift quantity. The 6 LSBs of T2 define a shift quantity within -32 to +31. When the 16-bit value in T2 is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.


\section*{ADDSUBCC \\ Addition or Subtraction Conditionally}

\section*{Syntax Characteristics}

\begin{tabular}{cc}
\hline TC1 or TC2 & Operation \\
\hline 0 & ACy = ACx - (Smem << \#16) \\
1 & ACy = ACx + (Smem << \#16) \\
\hline
\end{tabular}

- TCx = 0, then ACy = ACx - (Smem << \#16):

This instruction subtracts the content of a memory (Smem) location shifted left by 16 bits from accumulator ACx and stores the result in accumulator ACy.
■ The operation is performed on 40 bits in the D-unit ALU.
■ Input operands are sign extended to 40 bits according to SXMD.
■ The shift operation is equivalent to the signed shift instruction.
■ Overflow detection and the CARRY status bit depend on M40.
■ When an overflow is detected, the accumulator is saturated according to SATD.
- TCx = 1, then ACy = ACx + (Smem << \#16):

This instruction performs an addition operation between accumulator ACx and the content of a memory (Smem) location shifted left by 16 bits and stores the result in accumulator ACy.
■ The operation is performed on 40 bits in the D-unit ALU.
■ Input operands are sign extended to 40 bits according to SXMD.
■ The shift operation is equivalent to the signed shift instruction.
■ Overflow detection and the CARRY status bit depend on M40.
■ When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When C54CM = 1, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

Status Bits Affected by C54CM, M40, SATD, SXMD, TCx
Affects ACOVy, CARRY
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Addition or Subtraction Conditionally with Shift
- Addition, Subtraction, or Move Accumulator Content Conditionally

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = adsc(*AR3, AC1, TC1) & \begin{tabular}{l}
If TC1 = 1, the content addressed by AR3 shifted left by 16 bits is \\
added to the content of AC1 and the result is stored in AC0. If \\
TC1 = 0, the content addressed by AR3 shifted left by 16 bits is \\
subtracted from the content of AC1 and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Example 2}


\section*{ADDSUB2CC}

\section*{Syntax Characteristics}
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & ACy = ads2c(Smem, ACx, Tx, TC1, TC2) & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode
Operands ACx, ACy, Tx, Smem, TC1, TC2
Description This instruction evaluates the TC1 status bit and, based on the result of the test, performs either an addition or a subtraction. It also evaluates the TC2 status bit and, based on the result of the test, shifts the memory operand either left by 16 bits or by the content of Tx. Evaluation of the condition on the TCx status bits is performed during the Execute phase of the instruction.
\begin{tabular}{ccc}
\hline TC1 & TC2 & Operation \\
\hline 0 & 0 & \(\mathrm{ACy}=\mathrm{ACx}-(\) Smem \(\ll \mathrm{Tx})\) \\
0 & 1 & \(\mathrm{ACy}=\mathrm{ACx}-(\) Smem \(\ll \# 16)\) \\
1 & 0 & \(\mathrm{ACy}=\mathrm{ACx}+(\) Smem \(\ll \mathrm{Tx})\) \\
1 & 1 & \(\mathrm{ACy}=\mathrm{ACx}+(\) Smem \(\ll \# 16)\) \\
\hline
\end{tabular}
- TC1 = 0 and TC2 = 0, then ACy = ACx - (Smem << Tx):

This instruction subtracts the content of a memory (Smem) location shifted left by the content of Tx from an accumulator ACx and stores the result in accumulator ACy.
- TC1 = 0 and TC2 = 1, then ACy = ACx - (Smem << \#16):

This instruction subtracts the content of a memory (Smem) location shifted left by 16 bits from an accumulator ACx and stores the result in accumulator ACy.
■ The operation is performed on 40 bits in the D-unit shifter.
■ Input operands are sign extended to 40 bits according to SXMD.
■ The shift operation is equivalent to the signed shift instruction.
■ Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
- TC1 = 1 and TC2 = 0, then ACy = ACx + (Smem << Tx):

This instruction performs an addition operation between an accumulator ACx and the content of a memory (Smem) location shifted left by the content of Tx and stores the result in accumulator ACy.
- TC1 = 1 and TC2 = 1, then ACy = ACx + (Smem << \#16):

This instruction performs an addition operation between an accumulator ACx and the content of a memory (Smem) location shifted left by 16 bits and stores the result in accumulator ACy.
■ The operation is performed on 40 bits in the D-unit shifter.
■ Input operands are sign extended to 40 bits according to SXMD.
■ The shift operation is equivalent to the signed shift instruction.
■ Overflow detection and the CARRY status bit depend on M40.
■ When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1:
- An intermediary shift operation is performed as if M40 is locally set to 1, and no overflow detection, report, or saturation is done after the shifting operation.
- The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.

Status Bits Affected by C54CM, M40, SATD, SXMD, TC1, TC2
Affects ACOVy, CARRY
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Addition or Subtraction Conditionally
- Addition, Subtraction, or Move Accumulator Content Conditionally

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC2 = ads2c(*AR2, AC0, T1, TC1, TC2) & \begin{tabular}{l}
With TC1 = 1 and TC2 = 0, the content addressed by AR2 \\
shifted left by the content of T1 is added to the content of \\
AC0 and the result is stored in AC2. The result generated \\
an overflow.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 & 00 & EC00 & 0000 & AC0 & 00 & EC00 & 0000 \\
\hline AC2 & 00 & 0000 & 0000 & AC2 & 00 & EC00 & CC00 \\
\hline AR2 & & & 0201 & AR2 & & & 0201 \\
\hline 201 & & & 3300 & 201 & & & 3300 \\
\hline T1 & & & 0002 & T1 & & & 0002 \\
\hline TC1 & & & 1 & TC1 & & & 1 \\
\hline TC2 & & & 0 & TC2 & & & 0 \\
\hline M40 & & & 0 & M40 & & & 0 \\
\hline ACOV2 & & & 0 & ACOV2 & & & 1 \\
\hline CARRY & & & 0 & CARRY & & & 0 \\
\hline
\end{tabular}
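
The selection logic and the numbers in this example can be reproduced with a short Python sketch. It handles only non-negative Tx values, applies no saturation, and flags the overflow by checking whether the positive result exceeds the 32-bit signed range (the assumed meaning of overflow detection at bit 31 when M40 = 0). The function and variable names are hypothetical.

```python
MASK40 = (1 << 40) - 1

def ads2c(acx: int, smem: int, tx: int, tc1: int, tc2: int) -> int:
    """ACy = ads2c(Smem, ACx, Tx, TC1, TC2), 40-bit result pattern.

    TC1 selects add (1) or subtract (0); TC2 selects a shift by 16 (1) or by
    Tx (0). Only non-negative Tx values and no saturation are modeled.
    """
    operand = smem << (16 if tc2 else tx)
    total = acx + operand if tc1 else acx - operand
    return total & MASK40

ac2 = ads2c(0x00_EC00_0000, 0x3300, 2, tc1=1, tc2=0)
overflow_at_bit31 = ac2 > 0x7FFF_FFFF   # positive result outside 32-bit range
print(f"{ac2:010X}", overflow_at_bit31)  # 00EC00CC00 True
```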

\section*{ADDSUBCC \\ Addition, Subtraction, or Move Accumulator Content Conditionally}

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & ACy = adsc(Smem, ACx, TC1, TC2) & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode | 1101 1110 | AAAA AAAI | SSDD
Operands ACx, ACy, Smem, TC1, TC2
\begin{tabular}{ccc}
\hline TC1 & TC2 & Operation \\
\hline 0 & 0 & ACy \(=\mathrm{ACx}-(\) Smem \(\ll \# 16)\) \\
0 & 1 & ACy \(=\mathrm{ACx}\) \\
1 & 0 & ACy \(=\mathrm{ACx}+(\) Smem \(\ll \# 16)\) \\
1 & 1 & ACy \(=\mathrm{ACx}\) \\
\hline
\end{tabular}
- TC2 = 1, then ACy = ACx:

This instruction moves the content of ACx to ACy.
■ The 40-bit move operation is performed in the D-unit ALU.
■ During the 40-bit move operation, an overflow is detected according to M40. When an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set and the destination register (ACy) is saturated according to SATD.
- TC1 = 0 and TC2 = 0, then ACy = ACx - (Smem << \#16):

This instruction subtracts the content of a memory (Smem) location shifted left by 16 bits from accumulator ACx and stores the result in accumulator ACy.
■ The operation is performed on 40 bits in the D-unit ALU.
■ Input operands are sign extended to 40 bits according to SXMD.
■ The shift operation is equivalent to the signed shift instruction.
■ Overflow detection and the CARRY status bit depend on M40.
■ When an overflow is detected, the accumulator is saturated according to SATD.
- TC1 = 1 and TC2 = 0, then ACy = ACx + (Smem << \#16):

This instruction performs an addition operation between accumulator ACx and the content of a memory (Smem) location shifted left by 16 bits and stores the result in accumulator ACy.
■ The operation is performed on 40 bits in the D-unit ALU.
■ Input operands are sign extended to 40 bits according to SXMD.
■ The shift operation is equivalent to the signed shift instruction.
■ Overflow detection and the CARRY status bit depend on M40.
■ When an overflow is detected, the accumulator is saturated according to SATD.
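
The three outcomes of the TC1/TC2 table can be summarized in a small Python sketch. It operates on signed integer values, omits overflow detection and saturation, and uses the hypothetical name `adsc2` to distinguish it from the single-condition form.

```python
def adsc2(acx: int, smem: int, tc1: int, tc2: int) -> int:
    """ACy = adsc(Smem, ACx, TC1, TC2) on signed values, no overflow handling."""
    if tc2:                       # TC2 = 1: plain move, TC1 is ignored
        return acx
    operand = smem << 16
    return acx + operand if tc1 else acx - operand

for tc1, tc2 in ((0, 0), (0, 1), (1, 0), (1, 1)):
    print(tc1, tc2, hex(adsc2(0x0100_0000, 0x0001, tc1, tc2)))
```

Testing TC2 first mirrors the table: both rows with TC2 = 1 reduce to the move, whatever the value of TC1.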

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When C54CM \(=1\), an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.
Status Bits Affected by C54CM, M40, SATD, SXMD, TC1, TC2
Affects ACOVy, CARRY

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = adsc(*AR3, AC1, TC1, TC2) & If TC2 = 1, the content of AC1 is stored in AC0. If TC2 = 0 and TC1 = 1, the content addressed by AR3 shifted left by 16 bits is added to the content of AC1 and the result is stored in AC0. If TC2 = 0 and TC1 = 0, the content addressed by AR3 shifted left by 16 bits is subtracted from the content of AC1 and the result is stored in AC0. \\
\hline
\end{tabular}
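
The three cases selected by TC1 and TC2 can be modeled in a few lines of Python. This is only a sketch under stated assumptions: it assumes SXMD = 1 (the memory word is sign extended), wraps results to 40 bits instead of modeling M40, overflow detection, and SATD saturation, and the function names are hypothetical.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def sign_extend16(value):
    """Interpret a 16-bit memory word as signed (assumes SXMD = 1)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def adsc(acx, smem, tc1, tc2):
    """Return the new ACy value; overflow and saturation are not modeled."""
    if tc2 == 1:
        return acx & MASK40              # TC2 = 1: move ACx to ACy
    operand = sign_extend16(smem) << 16  # Smem << #16
    if tc1 == 1:
        return (acx + operand) & MASK40  # TC2 = 0, TC1 = 1: add
    return (acx - operand) & MASK40      # TC2 = 0, TC1 = 0: subtract
```

For instance, `adsc(0x20000, 0x0001, 1, 0)` adds `0x10000` to the accumulator value, while the same call with `tc1 = 0` subtracts it.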

\section*{AND}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \(\mathrm{dst}=\mathrm{dst}\) \& src & Yes & 2 & 1 & X \\
\hline [2] & \(\mathrm{dst}=\mathrm{src} \& \mathrm{k} 8\) & Yes & 3 & 1 & X \\
\hline [3] & \(\mathrm{dst}=\) src \& k16 & No & 4 & 1 & X \\
\hline [4] & \(\mathrm{dst}=\) src \& Smem & No & 3 & 1 & X \\
\hline [5] & ACy = ACy \& (ACx \(\lll\) \#SHIFTW) & Yes & 3 & 1 & X \\
\hline [6] & ACy \(=\) ACx \& (k16 <<< \#16) & No & 4 & 1 & X \\
\hline [7] & ACy = ACx \& (k16 <<< \#SHFT) & No & 4 & 1 & X \\
\hline [8] & Smem = Smem \& k16 & No & 4 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform a bitwise AND operation:
- In the D-unit, if the destination operand is an accumulator.
- In the A-unit ALU, if the destination operand is an auxiliary or temporary register.
\(\square\) In the A-unit ALU, if the destination operand is the memory.

Status Bits Affected by C54CM
Affects none
See Also See the following other related instructions:
- Bitwise AND Memory with Immediate Value and Compare to Zero
\(\square\) Bitwise OR
\(\square\) Bitwise Exclusive OR (XOR)

\section*{Bitwise AND}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|}
\hline No. Syntax & Parallel Enable Bit Size Cycles Pipeline \\
\hline [2] dst = src \& k8 & \(\begin{array}{llll}\text { Yes } & 3 & 1 & X\end{array}\) \\
\hline Opcode & 0001 100E | kkkk kkkk | FDDD FSSS \\
\hline Operands & dst, k8, src \\
\hline Description & \begin{tabular}{l}
This instruction performs a bitwise AND operation between a source (src) register content and an 8-bit value, k8. \\
\(\square\) When the destination (dst) operand is an accumulator: \\
■ The operation is performed on 40 bits in the D-unit ALU. \\
■ Input operands are zero extended to 40 bits. \\
- If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are zero extended. \\
When the destination (dst) operand is an auxiliary or temporary register: \\
- The operation is performed on 16 bits in the A-unit ALU. \\
- If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
\end{tabular} \\
\hline Status Bits & \begin{tabular}{l}
Affected by none \\
Affects none
\end{tabular} \\
\hline Repeat & This instruction can be repeated. \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline AC0 = AC1 \& \#FFh & The content of AC1 is ANDed with the unsigned 8-bit value (FFh) and the result is stored in AC0. \\
\hline
\end{tabular}

\section*{Bitwise AND}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|}
\hline No. Syntax & Parallel Enable Bit Size Cycles Pipeline \\
\hline [3] dst = src \& k16 & \(\begin{array}{llll}\text { No } & 4 & 1 & X\end{array}\) \\
\hline Opcode &  \\
\hline Operands & dst, k16, src \\
\hline Description & \begin{tabular}{l}
This instruction performs a bitwise AND operation between a source (src) register content and a 16-bit unsigned constant, k16. \\
\(\square\) When the destination (dst) operand is an accumulator: \\
- The operation is performed on 40 bits in the D-unit ALU. \\
- Input operands are zero extended to 40 bits. \\
- If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are zero extended. \\
When the destination (dst) operand is an auxiliary or temporary register: \\
- The operation is performed on 16 bits in the A-unit ALU. \\
- If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
\end{tabular} \\
\hline Status Bits & \begin{tabular}{ll} 
Affected by & none \\
Affects & none
\end{tabular} \\
\hline Repeat & This instruction can be repeated. \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline AC0 = AC1 \& \#FFFFh & The content of AC1 is ANDed with the unsigned 16-bit value (FFFFh) and the result is stored in AC0. \\
\hline
\end{tabular}

\section*{Bitwise AND}

\section*{Syntax Characteristics}
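Description This instruction performs a bitwise AND operation between an accumulator (ACy) content and an accumulator (ACx) content shifted by the 6-bit value, SHIFTW.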
\(\square\) The shift and AND operations are performed in one cycle in the D-unit shifter.
\(\square\) When M40 = 0 and C54CM = 0, input operands ACx(31-0) are zero extended to 40 bits. Otherwise, ACx(39-0) is used as is.
\(\square\) The input operand (ACx) is shifted by a 6-bit immediate value in the D-unit shifter.
\(\square\) The CARRY status bit is not affected by the logical shift operation.

Compatibility with C54x devices (C54CM = 1)
When C54CM = 1, the intermediary logical shift is performed as if M40 is locally set to 1. The 8 upper bits of the 40-bit intermediary result are not cleared.

Status Bits Affected by C54CM, M40
Affects none
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 \& (AC1 <<< \#30) & The content of AC0 is ANDed with the content of AC1 logically shifted left by 30 bits and the result is stored in AC0. \\
\hline
\end{tabular}

Bitwise AND

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [6] & ACy = ACx \& (k16 <<< \#16) & No & 4 & 1 & X \\
\hline
\end{tabular}
Opcode 0111 1010 | kkkk kkkk | kkkk kkkk | SSDD 010x

\section*{Operands}

Description This instruction performs a bitwise AND operation between an accumulator (ACx) content and a 16-bit unsigned constant, k16, shifted left by 16 bits.
\(\square\) The operation is performed on 40 bits in the D-unit ALU.
\(\square\) Input operands are zero extended to 40 bits.
\(\square\) The input operand (k16) is shifted 16 bits to the MSBs.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 \(=\) AC1 \& (\#FFFFh <<< \#16) & \begin{tabular}{l} 
The content of AC1 is ANDed with the unsigned 16-bit value (FFFFh) \\
logically shifted left by 16 bits and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}
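
As a rough illustration of the zero extension and the 16-bit shift described above, the following Python sketch computes the result of syntax [6]. The function name is hypothetical and the accumulator is modeled as a plain 40-bit integer.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def and_k16_shifted(acx, k16):
    """ACy = ACx & (k16 <<< #16), with k16 zero extended to 40 bits."""
    mask = (k16 & 0xFFFF) << 16  # constant moved into bits 31-16
    return acx & mask & MASK40
```

Because the shifted constant only occupies bits 31-16, bits 39-32 and bits 15-0 of the result are always 0. With `k16 = 0xFFFF`, the call simply isolates bits 31-16 of the source accumulator.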

Bitwise AND

\section*{Syntax Characteristics}

\begin{tabular}{llll}
Before & & After & \\
*AR1 & 5678 & *AR1 & 0640
\end{tabular}

\section*{BAND \\ Bitwise AND Memory with Immediate Value and Compare to Zero}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & TC1 \(=\) Smem \& k16 & No & 4 & 1 & X \\
{\([2]\)} & TC2 \(=\) Smem \& k16 & No & 4 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
\hline Opcode & TC1 & 1111 & 0010 & AAAA & AAAI & kkkk & kkkk & kkkk & kkkk \\
\hline & TC2 & 1111 & 0011 & AAAA & AAAI & kkkk & kkkk & kkkk & kkkk \\
\hline
\end{tabular}

\section*{Operands \\ k16, Smem, TCx}

Description This instruction performs a bit field manipulation in the A-unit ALU. The 16-bit field mask, k16, is ANDed with the memory (Smem) operand and the result is compared to 0:
```

if( ((Smem) AND k16 ) == 0)
TCx = 0
else
TCx = 1

```
Status Bits Affected by none
Affects TCx

Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Bitwise AND

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline TC1 = *AR0 \& \#0060h & The unsigned 16-bit value (0060h) is ANDed with the content addressed by AR0. Because the result (0040h) is not 0, TC1 is set to 1. \\
\hline
\end{tabular}
\begin{tabular}{lrlr}
Before & & After & \\
*AR0 & 0040 & *AR0 & 0040 \\
TC1 & 0 & TC1 & 1
\end{tabular}
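
The test above can be sketched in Python as follows. The function name is hypothetical; the memory operand is passed in as a plain 16-bit integer and is not modified, which matches the Before/After values.

```python
def band(smem, k16):
    """Return the new TCx value: 1 if (Smem AND k16) is nonzero, else 0."""
    result = (smem & k16) & 0xFFFF  # 16-bit AND of memory word and mask
    return 0 if result == 0 else 1
```

With the example values, `band(0x0040, 0x0060)` yields 1 because bit 6 is set in both the memory word and the mask.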

\section*{OR \\ Bitwise OR}

Syntax Characteristics
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \(\mathrm{dst}=\mathrm{dst} \mid \mathrm{src}\) & Yes & 2 & 1 & X \\
\hline [2] & \(\mathrm{dst}=\mathrm{src} \mid \mathrm{k} 8\) & Yes & 3 & 1 & X \\
\hline [3] & \(\mathrm{dst}=\mathrm{src} \mid \mathrm{k} 16\) & No & 4 & 1 & X \\
\hline [4] & dst \(=\) src \(\mid\) Smem & No & 3 & 1 & X \\
\hline [5] & ACy = ACy | (ACx <<< \#SHIFTW) & Yes & 3 & 1 & X \\
\hline [6] & ACy \(=\) ACx \(\mid\) (k16 <<< \#16) & No & 4 & 1 & X \\
\hline [7] & ACy \(=\) ACx | (k16 <<< \#SHFT) & No & 4 & 1 & X \\
\hline [8] & Smem \(=\) Smem | k16 & No & 4 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform a bitwise OR operation:
\(\square\) In the D-unit, if the destination operand is an accumulator.
- In the A-unit ALU, if the destination operand is an auxiliary or temporary register.
\(\square\) In the A-unit ALU, if the destination operand is the memory.
Status Bits Affected by C54CM
Affects none
See Also
See the following other related instructions:
- Bitwise AND
- Bitwise Exclusive OR (XOR)

\section*{Bitwise OR}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|}
\hline No. Syntax & Parallel Enable Bit Size Cycles Pipeline \\
\hline [2] dst = src | k8 & \(\begin{array}{llll}\text { Yes } & 3 & 1 & X\end{array}\) \\
\hline Opcode & 0001 101E | kkkk kkkk | FDDD FSSS \\
\hline Operands & dst, k8, src \\
\hline Description & \begin{tabular}{l}
This instruction performs a bitwise OR operation between a source (src) register content and an 8-bit value, k8. \\
\(\square\) When the destination (dst) operand is an accumulator: \\
- The operation is performed on 40 bits in the D-unit ALU. \\
- Input operands are zero extended to 40 bits. \\
- If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are zero extended. \\
When the destination (dst) operand is an auxiliary or temporary register: \\
- The operation is performed on 16 bits in the A-unit ALU. \\
- If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
\end{tabular} \\
\hline Status Bits & \begin{tabular}{l}
Affected by none \\
Affects none
\end{tabular} \\
\hline Repeat & This instruction can be repeated. \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline AC0 = AC1 | \#FFh & The content of AC1 is ORed with the unsigned 8-bit value (FFh) and the result is stored in AC0. \\
\hline
\end{tabular}
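
The way the operation width depends on the destination register can be illustrated with a small Python model. This is a sketch only: the function name and the boolean flags are hypothetical, and registers are modeled as plain integers.

```python
MASK40 = (1 << 40) - 1  # accumulator width
MASK16 = 0xFFFF         # auxiliary or temporary register width


def or_k8(src_value, src_is_accumulator, dst_is_accumulator, k8):
    """Model dst = src | k8 for the two destination classes."""
    k8 &= 0xFF
    if dst_is_accumulator:
        # 40-bit operation in the D-unit ALU; a 16-bit source register
        # contributes its 16 LSBs, zero extended.
        src_mask = MASK40 if src_is_accumulator else MASK16
        return ((src_value & src_mask) | k8) & MASK40
    # 16-bit operation in the A-unit ALU; an accumulator source
    # contributes only its 16 LSBs.
    return ((src_value & MASK16) | k8) & MASK16
```

Calling it with the same accumulator source and the two destination classes shows the difference: the accumulator destination keeps all 40 bits, while the auxiliary or temporary register destination receives only the low 16 bits ORed with the constant.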

\section*{Bitwise OR}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|}
\hline No. Syntax & Parallel Enable Bit Size Cycles Pipeline \\
\hline [3] dst \(=\mathrm{src} \mid \mathrm{k} 16\) & \(\begin{array}{llll}\text { No } & 4 & 1 & X\end{array}\) \\
\hline Opcode &  \\
\hline Operands & dst, k16, src \\
\hline Description & \begin{tabular}{l}
This instruction performs a bitwise OR operation between a source (src) register content and a 16-bit unsigned constant, k16. \\
When the destination (dst) operand is an accumulator: \\
- The operation is performed on 40 bits in the D-unit ALU. \\
- Input operands are zero extended to 40 bits. \\
- If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are zero extended. \\
When the destination (dst) operand is an auxiliary or temporary register: \\
- The operation is performed on 16 bits in the A-unit ALU. \\
- If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
\end{tabular} \\
\hline Status Bits & \begin{tabular}{ll} 
Affected by & none \\
Affects & none
\end{tabular} \\
\hline Repeat & This instruction can be repeated. \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline AC0 = AC1 | \#FFFFh & The content of AC1 is ORed with the unsigned 16-bit value (FFFFh) and the result is stored in AC0. \\
\hline
\end{tabular}

\section*{Bitwise OR}

\section*{Syntax Characteristics}

\begin{tabular}{lllllll}
Before & & & & After & & \\
AC0 & 7E & 2355 & 4FC0 & AC0 & 7E & 2355 \\
AC1 & 0F & E340 & 5678 & AC1 & 0F & F754
\end{tabular}

Bitwise OR

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [6] & ACy = ACx | (k16 <<< \#16) & No & 4 & 1 & X \\
\hline
\end{tabular}
Opcode 0111 1010 | kkkk kkkk | kkkk kkkk | SSDD 011x

\section*{Operands}

Description This instruction performs a bitwise OR operation between an accumulator (ACx) content and a 16-bit unsigned constant, k16, shifted left by 16 bits.
\(\square\) The operation is performed on 40 bits in the D-unit ALU.
- Input operands are zero extended to 40 bits.
- The input operand (k16) is shifted 16 bits to the MSBs.
\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & none
\end{tabular}

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 | (\#FFFFh <<< \#16) & \begin{tabular}{l} 
The content of AC1 is ORed with the unsigned 16-bit value (FFFFh) \\
logically shifted left by 16 bits and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

Bitwise OR

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [8] & Smem = Smem | k16 & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode 1111 0101 | AAAA AAAI | kkkk kkkk | kkkk kkkk

Operands k16, Smem

Description This instruction performs a bitwise OR operation between a memory (Smem) location and a 16-bit unsigned constant, k16.
- The operation is performed on 16 bits in the A-unit ALU.
- The result is stored in memory.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *AR1 = *AR1 | \#0FC0h & The content addressed by AR1 is ORed with the unsigned 16-bit value (FC0h) and the result is stored in the location addressed by AR1. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
*AR1 & 5678 & *AR1 & 5FF8
\end{tabular}
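
The Before/After values can be reproduced with a short Python sketch of the read-modify-write on a 16-bit memory word. The dictionary standing in for data memory, the address 0100h, and the function name are all hypothetical.

```python
def or_memory(memory, address, k16):
    """Smem = Smem | k16 on a 16-bit word; the result is written back."""
    memory[address] = (memory[address] | k16) & 0xFFFF
    return memory[address]


data = {0x0100: 0x5678}          # hypothetical location addressed by AR1
or_memory(data, 0x0100, 0x0FC0)  # models *AR1 = *AR1 | #0FC0h
```

After the call, the word holds 5FF8h: bits 11-6 are forced to 1 and all other bits keep their previous value.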

\section*{XOR}

\section*{Bitwise Exclusive OR (XOR)}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & dst = dst ^ src & Yes & 2 & 1 & X \\
\hline [2] & dst = src ^ k8 & Yes & 3 & 1 & X \\
\hline [3] & dst = src ^ k16 & No & 4 & 1 & X \\
\hline [4] & dst = src ^ Smem & No & 3 & 1 & X \\
\hline [5] & ACy = ACy ^ (ACx <<< \#SHIFTW) & Yes & 3 & 1 & X \\
\hline [6] & ACy = ACx ^ (k16 <<< \#16) & No & 4 & 1 & X \\
\hline [7] & ACy = ACx ^ (k16 <<< \#SHFT) & No & 4 & 1 & X \\
\hline [8] & Smem = Smem ^ k16 & No & 4 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform a bitwise exclusive-OR (XOR) operation:
- In the D-unit, if the destination operand is an accumulator.
- In the A-unit ALU, if the destination operand is an auxiliary or temporary register.
\(\square\) In the A-unit ALU, if the destination operand is the memory.
Status Bits Affected by C54CM
Affects none
See Also See the following other related instructions:
\(\square\) Bitwise AND
\(\square\) Bitwise OR

Bitwise Exclusive OR (XOR)

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{r} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & dst = dst ^ src & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Opcode 0010 110E | FSSS FDDD

Operands dst, src

Description This instruction performs a bitwise exclusive-OR (XOR) operation between two registers.
\(\square\) When the destination (dst) operand is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ Input operands are zero extended to 40 bits.
■ If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are zero extended.
\(\square\) When the destination (dst) operand is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = AC1 ^ AC0 & The content of AC0 is XORed with the content of AC1 and the result is stored in AC1. \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & 7E & 2355 & 4FC0 & AC0 & 7E & 2355 & 4FC0 \\
\hline AC1 & 0F & E340 & 5678 & AC1 & 71 & C015 & 19B8 \\
\hline
\end{tabular}
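
The accumulator values in this example can be checked with a one-line Python model of the 40-bit XOR. The function name is hypothetical and both accumulators are treated as plain 40-bit integers.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def xor_accumulators(dst, src):
    """dst = dst ^ src, performed on 40 bits."""
    return (dst ^ src) & MASK40


ac0 = 0x7E23554FC0
ac1 = xor_accumulators(0x0FE3405678, ac0)  # models AC1 = AC1 ^ AC0
```

The source accumulator AC0 is left unchanged, and applying the same XOR a second time restores the original AC1 value.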

\section*{Bitwise Exclusive OR (XOR)}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|}
\hline No. Syntax & Parallel Enable Bit Size Cycles Pipeline \\
\hline [2] dst = src ^ k8 & \(\begin{array}{llll}\text { Yes } & 3 & 1 & X\end{array}\) \\
\hline Opcode & 0001 110E | kkkk kkkk | FDDD FSSS \\
\hline Operands & dst, k8, src \\
\hline Description & \begin{tabular}{l}
This instruction performs a bitwise exclusive-OR (XOR) operation between a source (src) register content and an 8-bit value, k8. \\
\(\square\) When the destination (dst) operand is an accumulator: \\
■ The operation is performed on 40 bits in the D-unit ALU. \\
■ Input operands are zero extended to 40 bits. \\
■ If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are zero extended. \\
\(\square\) When the destination (dst) operand is an auxiliary or temporary register: \\
■ The operation is performed on 16 bits in the A-unit ALU. \\
■ If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
\end{tabular} \\
\hline Status Bits & \begin{tabular}{ll} 
Affected by & none \\
Affects & none
\end{tabular} \\
\hline Repeat & This instruction can be repeated. \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline AC0 = AC1 ^ \#FFh & The content of AC1 is XORed with the unsigned 8-bit value (FFh) and the result is stored in AC0. \\
\hline
\end{tabular}

Bitwise Exclusive OR (XOR)

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|}
\hline No. Syntax & Parallel Enable Bit Size Cycles Pipeline \\
\hline [3] dst = src ^ k16 & \(\begin{array}{llll}\text { No } & 4 & 1 & X\end{array}\) \\
\hline Opcode & 0111 1111 | kkkk kkkk | kkkk kkkk | FDDD FSSS \\
\hline Operands & dst, k16, src \\
\hline Description & \begin{tabular}{l}
This instruction performs a bitwise exclusive-OR (XOR) operation between a source (src) register content and a 16-bit unsigned constant, k16. \\
When the destination (dst) operand is an accumulator: \\
- The operation is performed on 40 bits in the D-unit ALU. \\
- Input operands are zero extended to 40 bits. \\
- If an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the auxiliary or temporary register are zero extended. \\
When the destination (dst) operand is an auxiliary or temporary register: \\
- The operation is performed on 16 bits in the A-unit ALU. \\
- If an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
\end{tabular} \\
\hline Status Bits & \begin{tabular}{l}
Affected by none \\
Affects none
\end{tabular} \\
\hline Repeat & This instruction can be repeated. \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline AC0 = AC1 ^ \#FFFFh & The content of AC1 is XORed with the unsigned 16-bit value (FFFFh) and the result is stored in AC0. \\
\hline
\end{tabular}

\section*{Bitwise Exclusive OR (XOR)}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [5] & ACy = ACy ^ (ACx <<< \#SHIFTW) & Yes & 3 & 1 & X \\
\hline
\end{tabular}

Opcode 0001 000E | DDSS 0010 | xxSH IFTW

\section*{Operands}

Description This instruction performs a bitwise exclusive-OR (XOR) operation between an accumulator (ACy) content and an accumulator (ACx) content shifted by the 6-bit value, SHIFTW.
\(\square\) The shift and XOR operations are performed in one cycle in the D-unit shifter.
\(\square\) When M40 = 0 and C54CM = 0, input operands ACx(31-0) are zero extended to 40 bits. Otherwise, ACx(39-0) is used as is.
\(\square\) The input operand (ACx) is shifted by a 6-bit immediate value in the D-unit shifter.
\(\square\) The CARRY status bit is not affected by the logical shift operation.

Compatibility with C54x devices (C54CM = 1)
When C54CM = 1, the intermediary logical shift is performed as if M40 is locally set to 1. The 8 upper bits of the 40-bit intermediary result are not cleared.

Status Bits Affected by C54CM, M40
Repeat This instruction can be repeated.
Example

\section*{Bitwise Exclusive OR (XOR)}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [8] & Smem = Smem ^ k16 & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode 1111 0110 | AAAA AAAI | kkkk kkkk | kkkk kkkk

Operands
Description This instruction performs a bitwise exclusive-OR (XOR) operation between a memory (Smem) location and a 16-bit unsigned constant, k16.
- The operation is performed on 16 bits in the A-unit ALU.
- The result is stored in memory.

Status Bits
Affected by none
Affects none
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *AR3 = *AR3 ^ \#FFFFh & \begin{tabular}{l} 
The content addressed by AR3 is XORed with the unsigned 16-bit value (FFFFh) \\
and the result is stored in the location addressed by AR3.
\end{tabular} \\
\hline
\end{tabular}

\section*{BCC}

\section*{Branch Conditionally}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles \(^{\dagger}\) & Pipeline \\
\hline\([1]\) & if (cond) goto l4 & No & 2 & \(6 / 5\) & R \\
{\([2]\)} & if (cond) goto L8 & Yes & 3 & \(6 / 5\) & R \\
{\([3]\)} & if (cond) goto L16 & No & 4 & \(6 / 5\) & R \\
{\([4]\)} & if (cond) goto P24 & No & 5 & \(5 / 5\) & R \\
\hline
\end{tabular}
\(\dagger\) x/y cycles: x cycles = condition true, y cycles = condition false

Description These instructions evaluate a single condition defined by the cond field in the read phase of the pipeline. If the condition is true, a branch occurs to the program address label assembled into l4, Lx, or P24. There is a 1-cycle latency on the condition setting. A single condition can be tested as determined by the cond field of the instruction. See Table 1-3 for a list of conditions.

The instruction selection depends on the branch offset between the current PC value and the program branch address specified by the label.

These instructions cannot be repeated.
Status Bits Affected by ACOVx, CARRY, C54CM, M40, TCx
Affects ACOVx
See Also See the following other related instructions:
- Branch Unconditionally
- Branch on Auxiliary Register Not Zero
- Call Conditionally
- Compare and Branch

\section*{Branch Conditionally}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles\(^{\dagger}\) & Pipeline \\
\hline [1] & if (cond) goto l4 & No & 2 & 6/5 & R \\
\hline
\end{tabular}
\(\dagger \mathrm{x} / \mathrm{y}\) cycles: x cycles \(=\) condition true, y cycles \(=\) condition false

\section*{Opcode}

\section*{Operands}

Description

Status Bits Affected by ACOVx, CARRY, C54CM, M40, TCx
Affects ACOVx
Repeat This instruction cannot be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline if (AC0 != \#0) goto branch & If the content of AC0 is not equal to 0, control is passed to the program address label defined by branch. \\
\hline
\end{tabular}


\section*{Branch Conditionally}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles \(^{\dagger}\) & Pipeline \\
\hline\([2]\) & if (cond) goto L8 & Yes & 3 & \(6 / 5\) & R \\
{\([3]\)} & if (cond) goto L16 & No & 4 & \(6 / 5\) & R \\
\hline
\end{tabular}
\(\dagger \mathrm{x} / \mathrm{y}\) cycles: x cycles \(=\) condition true, y cycles \(=\) condition false


\section*{Compatibility with C54x devices (C54CM = 1)}

When \(\mathrm{C} 54 \mathrm{CM}=1\), the comparison of accumulators to 0 is performed as if M40 was set to 1 .

Status Bits Affected by ACOVx, CARRY, C54CM, M40, TCx
Affects ACOVx
Repeat This instruction cannot be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline if (AC0 != \#0) goto branch & If the content of AC0 is not equal to 0, control is passed to the program address label defined by branch. \\
\hline
\end{tabular}


\section*{Branch Conditionally}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles\(^{\dagger}\) & Pipeline \\
\hline\([4]\) & if (cond) goto P24 & No & 5 & \(5 / 5\) & R \\
\hline
\end{tabular}
\(\dagger \mathrm{x} / \mathrm{y}\) cycles: x cycles \(=\) condition true, y cycles \(=\) condition false


\section*{Operands \\ cond, P24}

Description This instruction evaluates a single condition defined by the cond field in the read phase of the pipeline. If the condition is true, a branch occurs to the program address label assembled into P24. There is a 1-cycle latency on the condition setting. A single condition can be tested as determined by the cond field of the instruction. See Table 1-3 for a list of conditions.

\section*{Compatibility with C54x devices (C54CM = 1)}

When \(\mathrm{C} 54 \mathrm{CM}=1\), the comparison of accumulators to 0 is performed as if M40 was set to 1 .

Status Bits Affected by ACOVx, CARRY, C54CM, M40, TCx
Affects ACOVx
Repeat This instruction cannot be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline if (AC0 != \#0) goto branch & If the content of AC0 is not equal to 0, control is passed to the program address label defined by branch. \\
\hline
\end{tabular}


\section*{B}

\section*{Branch Unconditionally}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & goto ACx & No & 2 & 10 & X \\
{\([2]\)} & goto L7 & Yes & 2 & \(6^{\dagger}\) & AD \\
{\([3]\)} & goto L16 & Yes & 3 & \(6^{\dagger}\) & AD \\
{\([4]\)} & goto P24 & No & 4 & 5 & D \\
\hline
\end{tabular}
\(\dagger\) This instruction executes in 3 cycles if the addressed instruction is in the instruction buffer unit.

Description This instruction branches to a 24-bit program address defined by the content of the 24 lowest bits of an accumulator (ACx), or to a program address defined by the program address label assembled into Lx or P24.

These instructions cannot be repeated.
Status Bits
Affected by none
Affects none
See Also See the following other related instructions:
- Branch Conditionally
- Branch on Auxiliary Register Not Zero
- Call Unconditionally
- Compare and Branch

\section*{Branch Unconditionally}

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & goto ACx & No & 2 & 10 & X \\
\hline
\end{tabular}

Opcode 1001 0001 | xxxx xxSS
Operands
Description This instruction branches to a 24-bit program address defined by the content of the 24 lowest bits of an accumulator (ACx).

Status Bits Affected by none
Affects none
Repeat This instruction cannot be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline goto AC0 & Program control is passed to the program address defined by the content of AC0(23-0). \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
AC0 & 00 0000 403D & AC0 & 00 0000 403D \\
PC & 001F0A & PC & 00403D
\end{tabular}

Branch Unconditionally

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles \(^{\dagger}\) & Pipeline \\
\hline [2] & goto L7 & Yes & 2 & 6 & AD \\
[3] & goto L16 & Yes & 3 & 6 & AD \\
\hline
\end{tabular}
\(\dagger\) Executes in 3 cycles if the addressed instruction is in the instruction buffer unit.
\begin{tabular}{lll}
Opcode & L7 & 0100 101E | 0LLL LLLL \\
& L16 & 0000 011E | LLLL LLLL | LLLL LLLL
\end{tabular}
Operands Lx

Description This instruction branches to a program address defined by a program address label assembled into Lx.

Status Bits Affected by none
Affects none
Repeat This instruction cannot be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline goto branch & Program control is passed to the absolute address defined by branch. \\
\hline
\end{tabular}
\begin{tabular}{|lll|}
\hline & goto branch & \\
& AC0 = \#1 & address: 004044 \\
& \(\ldots \ldots\) & \\
branch: & \(\ldots \ldots\) & \\
& AC0 = \#0 & \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
PC & 004042 & PC & 006047 \\
AC0 & 00 0000 0001 & AC0 & 00 0000 0000
\end{tabular}

\section*{Branch Unconditionally}

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [4] & goto P24 & No & 4 & 5 & D \\
\hline
\end{tabular}

Opcode 01101010 | PPPP PPPP | PPPP PPPP | PPPP PPPP

Operands P24

Description This instruction branches to a program address defined by a program address label assembled into P24.

Status Bits Affected by none
Affects none
Repeat This instruction cannot be repeated.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline goto branch & Program control is passed to the absolute address defined by branch. \\
\hline
\end{tabular}
\begin{tabular}{|lll|}
\hline
 & goto branch & \\
 & AC0 = \#1 & address: 004044 \\
branch: & ...... & \\
 & AC0 = \#0 & \\
\hline
\end{tabular}
\begin{tabular}{lrlr}
Before & & After & \\
PC & 004042 & PC & 006047 \\
AC0 & 00 0000 0001 & AC0 & 00 0000 0000
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles \(^{\dagger}\) & Pipeline \\
\hline\([1]\) & if (ARn_mod != \#0) goto L16 & No & 4 & \(6 / 5\) & AD \\
\hline
\end{tabular}
\(\dagger \mathrm{x} / \mathrm{y}\) cycles: x cycles \(=\) condition true, y cycles \(=\) condition false

\section*{Opcode}

11111100 | AAAA AAAI | LLLL LLLL | LLLL LLLL

\section*{Operands}

ARn_mod, L16
Description This instruction performs a conditional branch (selected auxiliary register content not equal to 0) of the program counter (PC). The program branch address is specified as a 16-bit signed offset, L16, relative to PC. Use this instruction to branch within a 64K-byte window centered on the current PC value.

The possible addressing operands can be grouped into three categories:
\(\square\) ARx not modified (ARx as base pointer), some examples:
*AR1; No modification or offset
*AR1(\#15); Use 16-bit immediate value (15) as offset
*AR1(T0); Use content of T0 as offset
*AR1(short(\#4)); Use 3-bit immediate value (4) as offset
\(\square\) ARx modified before being compared to 0 , some examples:
*-AR1; Decrement by 1 before comparison
*+AR1(\#20); Add 16-bit immediate value (20) before comparison
\(\square\) ARx modified after being compared to 0 , some examples:
*AR1+; Increment by 1 after comparison
*(AR1 - T1); Subtract content of T1 after comparison
1) The content of the selected auxiliary register (ARn) is premodified in the address generation unit.
2) The (premodified) content of ARn is compared to 0 and sets the condition in the address phase of the pipeline.
3) If the compared content is not 0, a branch occurs. If the compared content is 0, the instructions are executed in sequence.
4) The content of ARn is postmodified in the address generation unit.
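
The four steps above can be sketched as a small Python model, with the branch taken when the compared value is not 0, as the `!= #0` in the instruction syntax states. This is a hedged illustration only: the function name `branch_arn_not_zero` and its parameters are hypothetical, ARn is assumed to be a 16-bit value that wraps on overflow, and the offset forms such as *AR1(\#15) are deliberately not modeled.

```python
def branch_arn_not_zero(arn, pc_next, target, pre=0, post=0):
    """Hypothetical model of `if (ARn_mod != #0) goto L16`.

    arn     -- 16-bit content of the selected auxiliary register
    pc_next -- address of the next sequential instruction
    target  -- branch target address
    pre     -- amount added to ARn before the comparison (0 if none)
    post    -- amount added to ARn after the comparison (0 if none)
    Returns (new_arn, new_pc).
    """
    arn = (arn + pre) & 0xFFFF              # step 1: premodification
    taken = arn != 0                        # step 2: compare to 0
    new_pc = target if taken else pc_next   # step 3: branch or fall through
    arn = (arn + post) & 0xFFFF             # step 4: postmodification
    return arn, new_pc


# *AR3- with AR3 = 0000: branch not taken, AR3 wraps to FFFF afterwards
print(branch_arn_not_zero(0x0000, 0x004013, 0x004015, post=-1))
```

Keeping the premodification before the comparison and the postmodification after it is the point of the sketch: *-AR1 tests the decremented value, while *AR1+ tests the original one.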

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1 :
The premodifier *ARn(T0) is not available; *ARn(AR0) is available.
The postmodifiers *(ARn + T0) and *(ARn - T0) are not available; *(ARn + AR0) and *(ARn - AR0) are available.

The legality of the modifier usage is checked by the assembler when using the .c54cm_on and .c54cm_off assembler directives.
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM \\
& Affects & none
\end{tabular}
Repeat This instruction cannot be repeated.

See Also See the following other related instructions:
- Branch Conditionally
- Branch Unconditionally
- Compare and Branch

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline if (*AR1(\#6) != \#0) goto branch & \begin{tabular}{l} 
The content of AR1 is compared to 0. Because the content is not 0, program control \\
is passed to the program address label defined by branch.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|}
\hline & if (*AR1(\#6) != \#0) goto branch & address: & 004004 \\
\hline & ... & ; & 00400A \\
\hline branch & & ; & 00400C \\
\hline
\end{tabular}
\begin{tabular}{lrlr}
Before & & After & \\
AR1 & 0005 & AR1 & 0005 \\
PC & 004004 & PC & 00400C
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline if (*AR3- != \#0) goto branch & \begin{tabular}{l} 
The content of AR3 is compared to 0. Because the content is 0, program control is \\
passed to the next instruction (the branch is not taken). AR3 is decremented \\
by 1 after the comparison.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|}
\hline & if (*AR3- != \#0) goto branch & address: & 00400F \\
\hline & ... & ; & 004013 \\
\hline branch & ...... & ; & 004015 \\
\hline
\end{tabular}
\begin{tabular}{lrlr}
Before & & After & \\
AR3 & 0000 & AR3 & FFFF \\
PC & 00400F & PC & 004013
\end{tabular}

\section*{CALLCC}

\section*{Call Conditionally}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles \(^{\dagger}\) & Pipeline \\
\hline\([1]\) & if (cond) call L16 & No & 4 & \(6 / 5\) & R \\
{\([2]\)} & if (cond) call P24 & No & 5 & \(5 / 5\) & R \\
\hline
\end{tabular}
\(\dagger \mathrm{x} / \mathrm{y}\) cycles: x cycles \(=\) condition true, y cycles \(=\) condition false

Description These instructions evaluate a single condition defined by the cond field in the read phase of the pipeline. If the condition is true, a subroutine call occurs to the program address defined by the program address label assembled into L16 or P24. There is a 1-cycle latency on the condition setting. A single condition can be tested as determined by the cond field of the instruction. See Table 1-3 for a list of conditions.

Before beginning a called subroutine, the CPU automatically saves the value of two internal registers: the program counter (PC) and a loop context register. The CPU can use these values to re-establish the context of the interrupted program sequence when the subroutine is done.

In the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks (in memory). When the CPU returns from a subroutine, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are saved to registers, so that these values can always be restored quickly. These special registers are the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions.

The instruction selection depends on the branch offset between the current PC value and program subroutine address specified by the label.

These instructions cannot be repeated.
\begin{tabular}{lll}
Status Bits & Affected by & ACOVx, CARRY, C54CM, M40, TCx \\
 & Affects & ACOVx
\end{tabular}
See Also See the following other related instructions:
- Branch Conditionally
- Call Unconditionally
- Return Conditionally
- Return Unconditionally

\section*{Call Conditionally}

Syntax Characteristics
\begin{tabular}{llccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles \(^{\dagger}\) & Pipeline \\
\hline\([1]\) & if (cond) call L16 & No & 4 & \(6 / 5\) & \(R\) \\
\hline
\end{tabular}
\(\dagger \mathrm{x} / \mathrm{y}\) cycles: x cycles \(=\) condition true, y cycles \(=\) condition false
Opcode 01101110 | xCCC CCCC | LLLL LLLL | LLLL LLLL

\section*{Operands}

cond, L16

\section*{Description}

This instruction evaluates a single condition defined by the cond field in the read phase of the pipeline. If the condition is true, a subroutine call occurs to the program address defined by the program address label assembled into L16. There is a 1-cycle latency on the condition setting. A single condition can be tested as determined by the cond field of the instruction. See Table 1-3 for a list of conditions.

When a subroutine call occurs in the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).
- The data stack pointer (SP) is decremented by 1 word in the read phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the read phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the subroutine program address. The active control flow execution context flags are cleared.

System Stack (SSP)
\begin{tabular}{rl|l|}
After Save & \(\rightarrow\) SSP = x - 1 & (Loop bits):PC(23-16) \\
Before Save & \(\rightarrow\) SSP = x & Previously saved data \\
\hline
\end{tabular}
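
The slow-return save sequence described above can be modeled roughly as follows. This is a minimal sketch under stated assumptions, not the CPU's actual implementation: the function name `call_slow_return` is hypothetical, memory is represented as a Python dict of 16-bit words, the loop context bits are assumed to occupy the upper 8 bits of the word pushed to the system stack, and the address of the next sequential instruction is used as the return address.

```python
def call_slow_return(cond, return_addr, loop_bits, target, sp, ssp, memory):
    """Hypothetical model of `if (cond) call L16` in the slow-return process.

    memory is a dict mapping word addresses to 16-bit words.
    Returns (new_pc, new_sp, new_ssp).
    """
    if not cond:
        # Condition false: execution continues in sequence, nothing is saved.
        return return_addr, sp, ssp

    sp -= 1
    memory[sp] = return_addr & 0xFFFF                    # PC(15-0) to data stack

    ssp -= 1
    # (Loop bits):PC(23-16); placing the loop bits in the high byte is an assumption
    memory[ssp] = ((loop_bits & 0xFF) << 8) | ((return_addr >> 16) & 0xFF)

    return target & 0xFFFFFF, sp, ssp                    # PC loaded with subroutine address


mem = {}
print(call_slow_return(True, 0x012345, 0xA5, 0x004000, 0x0100, 0x0200, mem))
```

A matching return would read the two words back in the opposite order and increment both stack pointers, which is why the two pushes always occur as a pair.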

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, the comparison of accumulators to 0 is performed as if M40 was set to 1.

Status Bits Affected by ACOVx, CARRY, C54CM, M40, TCx
Affects ACOVx
Repeat This instruction cannot be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline if (AC1 >= \#2000h) call (subroutine) & \begin{tabular}{l}
Because the content of AC1 is equal to or greater than 2000h, control is \\
passed to the program address label, subroutine. The program \\
counter (PC) is loaded with the subroutine program address.
\end{tabular} \\
\hline
\end{tabular}

\section*{Call Conditionally}

Syntax Characteristics
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles \(^{\dagger}\) & Pipeline \\
\hline\([2]\) & if (cond) call P24 & No & 5 & \(5 / 5\) & \(R\) \\
\hline
\end{tabular}
\(\dagger \mathrm{x} / \mathrm{y}\) cycles: x cycles \(=\) condition true, y cycles \(=\) condition false


\section*{Operands}
cond, P24

\section*{Description}

This instruction evaluates a single condition defined by the cond field in the read phase of the pipeline. If the condition is true, a subroutine call occurs to the program address defined by the program address label assembled into P24. There is a 1-cycle latency on the condition setting. A single condition can be tested as determined by the cond field of the instruction. See Table 1-3 for a list of conditions.

When a subroutine call occurs in the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).
- The data stack pointer (SP) is decremented by 1 word in the read phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the read phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the subroutine program address. The active control flow execution context flags are cleared.
\begin{tabular}{|l|l|l|l|l|l|}
\hline & & System Stack (SSP) & & & Data Stack (SP) \\
\hline After Save & \(\rightarrow\) SSP = x - 1 & (Loop bits):PC(23-16) & After Save & \(\rightarrow\) SP = y - 1 & PC(15-0) \\
\hline Before Save & \(\rightarrow\) SSP = x & Previously saved data & Before Save & \(\rightarrow\) SP = y & Previously saved data \\
\hline
\end{tabular}

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, the comparison of accumulators to 0 is performed as if M40 was set to 1.

Status Bits Affected by ACOVx, CARRY, C54CM, M40, TCx
Affects ACOVx
Repeat This instruction cannot be repeated.

\section*{CALL}

\section*{Call Unconditionally}

\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & call ACx & No & 2 & 10 & X \\
{\([2]\)} & call L16 & Yes & 3 & 6 & AD \\
{\([3]\)} & call P24 & No & 4 & 5 & D \\
\hline
\end{tabular}

\section*{Description}

This instruction passes control to a specified subroutine program address defined by the content of the 24 lowest bits of the accumulator, ACx, or a program address label assembled into L16 or P24.

Before beginning a called subroutine, the CPU automatically saves the value of two internal registers: the program counter (PC) and a loop context register. The CPU can use these values to re-establish the context of the interrupted program sequence when the subroutine is done.

In the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks (in memory). When the CPU returns from a subroutine, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are saved to registers, so that these values can always be restored quickly. These special registers are the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions.

These instructions cannot be repeated.

Status Bits Affected by none
Affects none

\section*{See Also}

See the following other related instructions:
- Branch Unconditionally
- Call Conditionally
- Return Conditionally
- Return Unconditionally

\section*{Call Unconditionally}

\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & call ACx & No & 2 & 10 & \(X\) \\
\hline
\end{tabular}

\section*{Opcode}

10010010 xxxx xxSS

\section*{Operands}

ACx

\section*{Description}

This instruction passes control to a specified subroutine program address defined by the content of the 24 lowest bits of the accumulator, ACx.

In the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).
- The data stack pointer (SP) is decremented by 1 word in the address phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the address phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the subroutine program address. The active control flow execution context flags are cleared.
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline \multirow[b]{2}{*}{After Save} & \multirow[b]{2}{*}{\(\rightarrow\)} & \multirow[b]{2}{*}{SSP = \(\mathrm{x}-1\)} & System Stack (SSP) & \multirow[b]{2}{*}{After Save} & \multirow[b]{2}{*}{\(S P=y-1\)} & Data Stack (SP) \\
\hline & & & (Loop bits):PC(23-16) & & & \(\mathrm{PC}(15-0)\) \\
\hline Before Save & \(\rightarrow\) & SSP \(=x\) & Previously saved data & Before Save & \(\rightarrow \quad S P=y\) & Previously saved data \\
\hline
\end{tabular}
\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & none
\end{tabular}

Repeat This instruction cannot be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline call AC0 & Program control is passed to the program address defined by the content of AC0(23-0). \\
\hline
\end{tabular}

\section*{Call Unconditionally}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2]\) & call L16 & Yes & 3 & 6 & AD \\
\hline
\end{tabular}

\section*{Opcode}

0000 100E | LLLL LLLL | LLLL LLLL

\section*{Operands}

L16

\section*{Description}

This instruction passes control to a specified subroutine program address defined by a program address label assembled into L16.

In the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).
- The data stack pointer (SP) is decremented by 1 word in the address phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the address phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the subroutine program address. The active control flow execution context flags are cleared.

System Stack (SSP)
\begin{tabular}{rl|l|}
After Save & \(\rightarrow\) SSP = x - 1 & (Loop bits):PC(23-16) \\
Before Save & \(\rightarrow\) SSP = x & Previously saved data \\
\hline
\end{tabular}
\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & none
\end{tabular}

Repeat This instruction cannot be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline call FOO & \begin{tabular}{l} 
Program control is passed to the program address label (FOO) assembled into the signed \\
16-bit offset value relative to the program counter register.
\end{tabular} \\
\hline
\end{tabular}

\section*{Call Unconditionally}

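\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([3]\) & call P24 & No & 4 & 5 & D \\
\hline
\end{tabular}
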
\section*{Operands}

P24

\section*{Description}

This instruction passes control to a specified subroutine program address defined by a program address label assembled into P24.

In the slow-return process (default), the return address (from the PC) and the loop context bits are stored to the stacks. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).
- The data stack pointer (SP) is decremented by 1 word in the address phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the address phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the subroutine program address. The active control flow execution context flags are cleared.

System Stack (SSP)
\begin{tabular}{rl|l|}
After Save & \(\rightarrow\) SSP = x - 1 & (Loop bits):PC(23-16) \\
Before Save & \(\rightarrow\) SSP = x & Previously saved data \\
\hline
\end{tabular}

Data Stack (SP)
\begin{tabular}{rl|l|}
After Save & \(\rightarrow\) SP = y - 1 & PC(15-0) \\
Before Save & \(\rightarrow\) SP = y & Previously saved data \\
\hline
\end{tabular}
\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & none
\end{tabular}

Repeat This instruction cannot be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline call FOO & \begin{tabular}{l} 
Program control is passed to the program address label (FOO) assembled into an absolute \\
address defined by the 24-bit value.
\end{tabular} \\
\hline
\end{tabular}

\section*{.CR}

Circular Addressing Qualifier

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & circular() & No & 1 & 1 & AD \\
\hline
\end{tabular}

Opcode 10011101

Operands none

Description This instruction is an instruction qualifier that can be paralleled only with an instruction that makes an indirect Smem, Xmem, Ymem, Lmem, Baddr, or Cmem access, or with a mar instruction. It cannot be executed in parallel with any other type of instruction, and it cannot be executed as a stand-alone instruction (the assembler generates an error message).

When this instruction is used in parallel, all modifications of the ARx and CDP pointer registers used in the indirect addressing mode are done circularly (as if ST2_55 register bits 0 to 8 were set to 1).

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, this instruction does not affect the *ARn, *ARn+, *ARn-, and *ARn(DR0) addressing modes of dual memory access (Xmem/Ymem) instructions.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.

\section*{BCLR}

Clear Accumulator, Auxiliary, or Temporary Register Bit

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & bit(src, Baddr) = \#0 & No & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
11101100 | AAAA AAAI | FSSS 001x

Operands Baddr, src

Description This instruction performs a bit manipulation:
- In the D-unit ALU, if the source (src) register operand is an accumulator.
- In the A-unit ALU, if the source (src) register operand is an auxiliary or temporary register.

The instruction clears to 0 a single bit, as defined by the bit addressing mode, Baddr, of the source register.

The generated bit address must be within:
- 0-39 when accessing accumulator bits (only the 6 LSBs of the generated bit address are used to determine the bit position). If the generated bit address is not within \(0-39\), the selected register bit value does not change.
- 0-15 when accessing auxiliary or temporary register bits (only the 4 LSBs of the generated address are used to determine the bit position).
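
The bit-address rules above can be illustrated with a short Python sketch. It is an interpretation of the text rather than a verified model of the hardware: the function name `clear_register_bit` is hypothetical, accumulators are treated as 40-bit integers, and auxiliary or temporary registers as 16-bit integers.

```python
def clear_register_bit(value, baddr, is_accumulator):
    """Hypothetical model of `bit(src, Baddr) = #0` on a register value."""
    if is_accumulator:
        pos = baddr & 0x3F            # only the 6 LSBs select the bit position
        if pos > 39:
            return value              # outside 0-39: register bit unchanged
        return value & ~(1 << pos) & 0xFFFFFFFFFF
    pos = baddr & 0xF                 # only the 4 LSBs select the bit position
    return value & ~(1 << pos) & 0xFFFF


print(f"{clear_register_bit(0xFFFFFFFFFF, 39, True):010X}")
print(f"{clear_register_bit(0xFFFF, 3, False):04X}")
```

The asymmetry is the detail worth noting: an out-of-range accumulator bit address leaves the register alone, while a 16-bit register address simply wraps through its 4 LSBs.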
Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.

See Also See the following other related instructions:
- Clear Memory Bit
- Clear Status Register Bit

\section*{BCLR}

Clear Memory Bit

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. Syntax & & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline \multicolumn{2}{|l|}{bit(Smem, src) = \#0} & No & 3 & 1 & X \\
\hline Opcode & & 0011 | & A & AI FSSS & S 1101 \\
\hline Operands & Smem, src & & & & \\
\hline Description & \multicolumn{5}{|l|}{This instruction performs a bit manipulation in the A-unit ALU. The instruction clears to 0 a single bit, as defined by the content of the source (src) operand, of a memory (Smem) location.} \\
\hline Status Bits & \begin{tabular}{l}
Affected by \\
Affects
\end{tabular} & & & & \\
\hline Repeat & This instructior & & & & \\
\hline See Also & \multicolumn{5}{|l|}{See the following other related instructions: Clear Accumulator, Auxiliary, or Temporary Register Bit; Clear Status Register Bit; Complement Memory Bit; Set Memory Bit} \\
\hline \multicolumn{6}{|l|}{Example} \\
\hline Syntax & \multicolumn{5}{|l|}{Description} \\
\hline bit(*AR3, AC0) = \#0 & \multicolumn{5}{|l|}{The bit at the position defined by AC0(3-0) in the content addressed by AR3 is cleared to 0.} \\
\hline
\end{tabular}
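
As a rough illustration of the example above, the following Python sketch clears one bit of a 16-bit memory word, taking the bit position from the 4 lowest bits of the source register as the example describes for AC0(3-0). The function name `clear_memory_bit` and the dict-based memory are assumptions made for the sketch.

```python
def clear_memory_bit(memory, address, src):
    """Hypothetical model of `bit(Smem, src) = #0`.

    memory  -- dict mapping word addresses to 16-bit words
    address -- address of the Smem operand (for example, the content of AR3)
    src     -- source register content; src(3-0) selects the bit
    """
    pos = src & 0xF
    memory[address] = memory[address] & ~(1 << pos) & 0xFFFF
    return memory[address]


mem = {0x0100: 0xFFFF}
clear_memory_bit(mem, 0x0100, 0x0000000005)   # clears bit 5
print(f"{mem[0x0100]:04X}")
```

Only the selected bit changes; the other 15 bits of the memory word are written back unmodified.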

\section*{BCLR}

Clear Status Register Bit

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & bit(ST0, k4) = \#0 & Yes & 2 & 1 & X \\
{\([2]\)} & bit(ST1, k4) = \#0 & Yes & 2 & 1 & X \\
{\([3]\)} & bit(ST2, k4) = \#0 & Yes & 2 & 1 & X \\
{\([4]\)} & bit(ST3, k4) = \#0 & Yes & 2 & \(1^{\dagger}\) & X \\
\hline
\end{tabular}
\(\dagger\) When this instruction is decoded to modify status bit CAFRZ (15), CAEN (14), or CACLR (13), the CPU pipeline is flushed and the instruction is executed in 5 cycles regardless of the instruction context.
\begin{tabular}{|c|c|c|c|c|c|}
\hline Opcode & ST0 & 0100 & 011E & kkkk & 0000 \\
\hline & ST1 & 0100 & 011E & kkkk & 0010 \\
\hline & ST2 & 0100 & 011E & kkkk & 0100 \\
\hline & ST3 & 0100 & 011E & kkkk & 0110 \\
\hline
\end{tabular}

\section*{Operands}

k4, STx

Description These instructions perform a bit manipulation in the A-unit ALU. They clear to 0 a single bit, as defined by a 4-bit immediate value, k4, in the selected status register (ST0, ST1, ST2, or ST3).

The DP register bits mapped in ST0 cannot be accessed with the bit(ST0, k4) = \#0 instruction. Therefore, for ST0, k4 cannot have a value of 0-8.

The ASM bit field in ST1 cannot be accessed with the bit(ST1, k4) = \#0 instruction. Therefore, for ST1, k4 cannot have a value of 0-4.

\section*{Compatibility with C54x devices (C54CM = 1)}

C55x DSP status registers bit mapping (Figure 5-1, page 5-92) does not correspond to C54x DSP status register bits.

Status Bits Affected by none
Affects Selected status bits
Repeat This instruction cannot be repeated.

\section*{See Also}

See the following other related instructions:
- Clear Accumulator, Auxiliary, or Temporary Register Bit
- Clear Memory Bit
- Set Status Register Bit

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline bit(ST2, \#ST2_AR2LC) = \#0; AR2LC = bit 2 & \begin{tabular}{l} 
The ST2 bit position defined by the label (ST2_AR2LC, bit 2) \\
is cleared to 0.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrlr}
Before & & After & \\
ST2_55 & 0006 & ST2_55 & 0002
\end{tabular}
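
The k4 restrictions stated in the description, together with the ST2 example, can be sketched as a small checker in Python. The names `FORBIDDEN_K4` and `clear_status_bit` are hypothetical, and raising `ValueError` merely stands in for whatever diagnostic the assembler actually produces.

```python
# Bit positions that bit(STx, k4) = #0 may not address, per the description:
# ST0 bits 0-8 hold DP, ST1 bits 0-4 hold ASM.
FORBIDDEN_K4 = {"ST0": range(0, 9), "ST1": range(0, 5)}


def clear_status_bit(name, value, k4):
    """Hypothetical model of `bit(STx, k4) = #0` on a 16-bit status register."""
    if not 0 <= k4 <= 15:
        raise ValueError("k4 is a 4-bit immediate value")
    if k4 in FORBIDDEN_K4.get(name, ()):
        raise ValueError(f"bit {k4} of {name} cannot be accessed this way")
    return value & ~(1 << k4) & 0xFFFF


# ST2 = 0006h with AR2LC (bit 2) cleared
print(f"{clear_status_bit('ST2', 0x0006, 2):04X}")
```

Clearing bit 2 of 0006h leaves 0002h, so AR1LC (bit 1) stays set while AR2LC is cleared.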

Figure 5-1. Status Registers Bit Mapping
ST0_55
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline 15 & 14 & 13 & 12 & 11 & 10 & 9 \\
\hline ACOV2\(^{\dagger}\) & ACOV3\(^{\dagger}\) & TC1\(^{\dagger}\) & TC2 & CARRY & ACOV0 & ACOV1 \\
\hline R/W-0 & R/W-0 & R/W-1 & R/W-1 & R/W-1 & R/W-0 & R/W-0 \\
\hline
\end{tabular}

ST1_55
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline 15 & 14 & 13 & 12 & 11 & 10 & 9 & 8 \\
\hline BRAF & CPL & XF & HM & INTM & M40\(^{\dagger}\) & SATD & SXMD \\
\hline R/W-0 & R/W-0 & R/W-1 & R/W-0 & R/W-1 & R/W-0 & R/W-0 & R/W-1 \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|}
\hline 7 & 6 & 5 & 4-0 \\
\hline C16 & FRCT & C54CM\(^{\dagger}\) & ASM \\
\hline R/W-0 & R/W-0 & R/W-1 & R/W-0 \\
\hline
\end{tabular}

ST2_55
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline 15 & 14-13 & 12 & 11 & 10 & 9 & 8 \\
\hline ARMS & Reserved & DBGM & EALLOW & RDM & Reserved & CDPLC \\
\hline R/W-0 & & R/W-1 & R/W-0 & R/W-0 & & R/W-0 \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline 7 & 6 & 5 & 4 & 3 & 2 & 1 & 0 \\
\hline AR7LC & AR6LC & AR5LC & AR4LC & AR3LC & AR2LC & AR1LC & AR0LC \\
\hline R/W-0 & R/W-0 & R/W-0 & R/W-0 & R/W-0 & R/W-0 & R/W-0 & R/W-0 \\
\hline
\end{tabular}

ST3_55
\begin{tabular}{|c|c|c|c|c|}
\hline 15 & 14 & 13 & 12 & 11-8 \\
\hline CAFRZ\(^{\dagger}\) & CAEN\(^{\dagger}\) & CACLR\(^{\dagger}\) & HINT\(^{\ddagger}\) & Reserved (always write 1100b) \\
\hline R/W-0 & R/W-0 & R/W-0 & R/W-1 & \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline 7 & 6 & 5 & 4-3 & 2 & 1 & 0 \\
\hline CBERR\(^{\dagger}\) & MPNMC§ & SATA\(^{\dagger}\) & Reserved & CLKOFF & SMUL & SST \\
\hline R/W-0 & R/W-pins & R/W-0 & & R/W-0 & R/W-0 & R/W-0 \\
\hline
\end{tabular}

Legend: \(\mathrm{R}=\) Read; \(\mathrm{W}=\) Write; \(-n=\) Value after reset
\(\dagger\) Highlighted bit: If you write to the protected address of the status register, a write to this bit has no effect, and the bit always appears as a 0 during read operations.
\(\ddagger\) The HINT bit is not used for all C55x host port interfaces (HPIs). Consult the documentation for the specific C55x DSP.
§ The reset value of MPNMC may be dependent on the state of predefined pins at reset. To check this for a particular C55x DSP, see the boot loader section of its data sheet.

\section*{CMP}

Compare Accumulator, Auxiliary, or Temporary Register Content

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|}
\hline No. & Syntax & Size & Cycles & Pipeline \\
\hline [1] & TC1 = uns(src RELOP dst) & 3 & 1 & X \\
\hline [2] & TC2 = uns(src RELOP dst) & 3 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{lll}
Opcode & TC1 & 0001 001E | FSSS CC00 | FDDD xux0 \\
 & TC2 & 0001 001E | FSSS CC00 | FDDD xux1
\end{tabular}

Operands dst, RELOP, src, TCx

Description This instruction performs a comparison in the D-unit ALU or in the A-unit ALU. The contents of two accumulators, auxiliary registers, or temporary registers are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0.

The comparison type depends on the optional uns keyword and, for accumulator-to-accumulator comparisons, on M40:
\begin{tabular}{cccl}
uns & src & dst & Comparison Type \\
no & TAx & TAy & 16-bit signed comparison in A-unit ALU \\
no & TAx & ACy & 16-bit signed comparison in A-unit ALU \\
no & ACx & TAy & 16-bit signed comparison in A-unit ALU \\
no & ACx & ACy & \begin{tabular}{l}
if M40 = 0, 32-bit signed comparison in D-unit ALU \\
if M40 = 1, 40-bit signed comparison in D-unit ALU
\end{tabular} \\
yes & TAx & TAy & 16-bit unsigned comparison in A-unit ALU \\
yes & TAx & ACy & 16-bit unsigned comparison in A-unit ALU \\
yes & ACx & TAy & 16-bit unsigned comparison in A-unit ALU \\
yes & ACx & ACy & \begin{tabular}{l}
if M40 = 0, 32-bit unsigned comparison in D-unit ALU \\
if M40 = 1, 40-bit unsigned comparison in D-unit ALU
\end{tabular}
\end{tabular}
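
The comparison-type table can be read as a width-and-signedness selection, which the following Python sketch tries to capture. It is a hedged illustration only: the names `RELOPS`, `to_signed`, and `compare` are hypothetical, the set of relational operators shown is an assumption (this section does not list the RELOP values), and registers are passed in as plain integers.

```python
import operator

# Assumed operator set; the RELOP values are not listed in this section.
RELOPS = {"==": operator.eq, "!=": operator.ne,
          "<": operator.lt, ">=": operator.ge}


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def compare(src, dst, relop, src_is_acc, dst_is_acc, uns=False, m40=0):
    """Hypothetical model of TCx = uns(src RELOP dst); returns the new TCx bit."""
    if src_is_acc and dst_is_acc:
        bits = 40 if m40 else 32      # D-unit ALU: width selected by M40
    else:
        bits = 16                     # A-unit ALU: 16 lowest bits of any ACx
    if uns:
        mask = (1 << bits) - 1
        a, b = src & mask, dst & mask
    else:
        a, b = to_signed(src, bits), to_signed(dst, bits)
    return int(RELOPS[relop](a, b))


# TC1 = AC1 == T1 with AC1 = 00 0028 0400 and T1 = 0400
print(compare(0x0000280400, 0x0400, "==", True, False))
```

Because any comparison involving a TAx register uses only 16 bits, the upper accumulator bits (0028 in the usage line) do not affect the result.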

\section*{Compatibility with C54x devices (C54CM = 1)}

Contrary to the corresponding C54x instruction, the C55x register comparison instruction is performed in the execute phase of the pipeline.

When C54CM = 1, the conditions testing the accumulator contents are all performed as if M40 was set to 1.

Status Bits Affected by C54CM, M40
Affects TCx
Repeat This instruction can be repeated.

See Also See the following other related instructions:
- Compare Accumulator, Auxiliary, or Temporary Register Content with AND
- Compare Accumulator, Auxiliary, or Temporary Register Content with OR
- Compare Accumulator, Auxiliary, or Temporary Register Content Maximum
- Compare Memory with Immediate Value

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \(\mathrm{TC} 1=\mathrm{AC} 1==\mathrm{T} 1\) & \begin{tabular}{l} 
The signed content of AC1(15-0) is compared to the content of T1 and because \\
they are equal, TC1 is set to 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrrrlrrr} 
\multicolumn{4}{l}{Before} & \multicolumn{4}{l}{After} \\
AC1 & 00 & 0028 & 0400 & AC1 & 00 & 0028 & 0400 \\
T1 & & & 0400 & T1 & & & 0400 \\
TC1 & & & 0 & TC1 & & & 1
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \(\mathrm{TC} 1=\mathrm{T} 1>=\mathrm{AC} 1\) & \begin{tabular}{l} 
The content of T1 is compared to the signed content of AC1(15-0). Because the content of \\
T1 is greater than the content of AC1(15-0), TC1 is set to 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline T1 & & & 0500 & T1 & & & 0500 \\
\hline AC1 & 80 & 0000 & 0400 & AC1 & 80 & 0000 & 0400 \\
\hline TC1 & & & 0 & TC1 & & & 1 \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & TCx \(=\) TCy \& uns(src RELOP dst) & Yes & 3 & 1 & \(X\) \\
{\([2]\)} & TCx \(=\) !TCy \& uns(src RELOP dst) & Yes & 3 & 1 & \(X\) \\
\hline
\end{tabular}
\begin{tabular}{ll} 
Description & \begin{tabular}{l} 
These instructions perform a comparison in the D-unit ALU or in the A-unit \\
ALU. The contents of two registers (accumulators, auxiliary registers, or \\
temporary registers) are compared. When an accumulator ACx is compared with \\
an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared \\
with TAx in the A-unit ALU.
\end{tabular} \\
Status Bits & \begin{tabular}{l} 
Affected by \(\quad\) C54CM, M40, TCy \\
Affects \(\quad\) TCx
\end{tabular} \\
See Also & \begin{tabular}{l} 
See the following other related instructions: \\
Compare Accumulator, Auxiliary, or Temporary Register Content
\end{tabular}
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{llcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline & TCx \(=\) TCy \& uns(src RELOP dst) & & & & \\
{\([1 \mathrm{a}]\)} & TC1 \(=\) TC2 \& uns(src RELOP dst) & Yes & 3 & 1 & X \\
{\([1 \mathrm{b}]\)} & TC2 \(=\) TC1 \& uns(src RELOP dst) & Yes & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

0001 001E \(\mid\) FSSS cc01 \(\mid\) FDDD 0utt

Operands dst, RELOP, src, TC1, TC2

Description This instruction performs a comparison in the D-unit ALU or in the A-unit ALU. The contents of two registers (accumulators, auxiliary registers, or temporary registers) are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0. The result of the comparison is ANDed with TCy; TCx is updated with the result of this operation.

The comparison depends on the optional uns keyword and on M40 for accumulator comparisons. As the following table shows, the uns keyword specifies an unsigned comparison and M40 defines the comparison bit width for accumulator comparisons.
\begin{tabular}{cccl} 
uns & src & dst & Comparison Type \\
no & TAx & TAy & 16-bit signed comparison in A-unit ALU \\
no & TAx & ACy & 16-bit signed comparison in A-unit ALU \\
no & ACx & TAy & 16-bit signed comparison in A-unit ALU \\
no & ACx & ACy & \begin{tabular}{l} 
if M40 \(=0\), 32-bit signed comparison in D-unit ALU \\
if M40 \(=1\), 40-bit signed comparison in D-unit ALU
\end{tabular} \\
yes & TAx & TAy & 16-bit unsigned comparison in A-unit ALU \\
yes & TAx & ACy & 16-bit unsigned comparison in A-unit ALU \\
yes & ACx & TAy & 16-bit unsigned comparison in A-unit ALU \\
yes & ACx & ACy & \begin{tabular}{l} 
if M40 \(=0\), 32-bit unsigned comparison in D-unit ALU \\
if M40 \(=1\), 40-bit unsigned comparison in D-unit ALU
\end{tabular}
\end{tabular}

\section*{Compatibility with C54x devices (C54CM = 1)}

Contrary to the corresponding C54x instruction, the C55x register comparison instruction is performed in the execute phase of the pipeline.

When C54CM \(=1\), the conditions testing the accumulator contents are all performed as if M40 was set to 1.
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40, TCy \\
& Affects & TCx
\end{tabular}

Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline TC2 = TC1 \& AC1 == AC2 & \begin{tabular}{l} 
The content of AC1(31-0) is compared to the content of AC2(31-0). \\
The contents are equal (true), TC2 \(=\mathrm{TC} 1 \& 1\).
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrrlllll} 
Before & & \multicolumn{4}{c}{ After } \\
AC1 & 80 & 0028 & 0400 & AC1 & 80 & 0028 & 0400 \\
AC2 & 00 & 0028 & 0400 & AC2 & 00 & 0028 & 0400 \\
M40 & & & 0 & M40 & & & 0 \\
TC1 & & & 1 & TC1 & & & 1 \\
TC2 & & & 0 & TC2 & & & 1
\end{tabular}
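
The way the comparison result is folded into TCx can be sketched in a few lines of Python. This is an illustrative model only, not TI code; the function name `cmpand` and the `complement_tcy` flag are assumptions. The flag selects the form that ANDs the comparison with the complement of TCy (written `!TCy` in the syntax table).

```python
def cmpand(tcy, comparison_true, complement_tcy=False):
    """Value written to TCx by TCx = TCy & cmp, or by TCx = !TCy & cmp."""
    operand = (tcy ^ 1) if complement_tcy else tcy
    return operand & int(bool(comparison_true))


MASK32 = 0xFFFF_FFFF

# AC1 = 80 0028 0400 and AC2 = 00 0028 0400 differ only in the guard bits.
ac1, ac2 = 0x80_0028_0400, 0x00_0028_0400
equal_32 = (ac1 & MASK32) == (ac2 & MASK32)  # M40 = 0: only bits 31-0 are compared

tc1 = 1
print(cmpand(tc1, equal_32))                       # TC2 = TC1 & AC1 == AC2
print(cmpand(tc1, equal_32, complement_tcy=True))  # TC2 = !TC1 & AC1 == AC2
```

With M40 = 0 the guard bits are ignored, so the comparison is true and the outcome depends only on TC1.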

\section*{Syntax Characteristics}
\begin{tabular}{lllccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline & TCx \(=\) !TCy \& uns(src RELOP dst) & & & & \\
{\([2 \mathrm{a}]\)} & TC1 \(=\) !TC2 \& uns(src RELOP dst) & Yes & 3 & 1 & X \\
{\([2 \mathrm{b}]\)} & TC2 \(=\) !TC1 \& uns(src RELOP dst) & Yes & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

0001 001E \(\mid\) FSSS cc01 \(\mid\) FDDD 1utt

Operands dst, RELOP, src, TC1, TC2

Description This instruction performs a comparison in the D-unit ALU or in the A-unit ALU. The contents of two registers (accumulators, auxiliary registers, or temporary registers) are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0. The result of the comparison is ANDed with the complement of TCy; TCx is updated with the result of this operation.

The comparison depends on the optional uns keyword and on M40 for accumulator comparisons. As the following table shows, the uns keyword specifies an unsigned comparison and M40 defines the comparison bit width for accumulator comparisons.
\begin{tabular}{cccl} 
uns & src & dst & Comparison Type \\
no & TAx & TAy & 16-bit signed comparison in A-unit ALU \\
no & TAx & ACy & 16-bit signed comparison in A-unit ALU \\
no & ACx & TAy & 16-bit signed comparison in A-unit ALU \\
no & ACx & ACy & \begin{tabular}{l} 
if M40 \(=0\), 32-bit signed comparison in D-unit ALU \\
if M40 \(=1\), 40-bit signed comparison in D-unit ALU
\end{tabular} \\
yes & TAx & TAy & 16-bit unsigned comparison in A-unit ALU \\
yes & TAx & ACy & 16-bit unsigned comparison in A-unit ALU \\
yes & ACx & TAy & 16-bit unsigned comparison in A-unit ALU \\
yes & ACx & ACy & \begin{tabular}{l} 
if M40 \(=0\), 32-bit unsigned comparison in D-unit ALU \\
if M40 \(=1\), 40-bit unsigned comparison in D-unit ALU
\end{tabular}
\end{tabular}

\section*{Compatibility with C54x devices (C54CM = 1)}

Contrary to the corresponding C54x instruction, the C55x register comparison instruction is performed in the execute phase of the pipeline.

When C54CM \(=1\), the conditions testing the accumulator contents are all performed as if M40 was set to 1.
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40, TCy \\
& Affects & TCx
\end{tabular}

Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline TC2 \(=\) !TC1 \& AC1 == AC2 & \begin{tabular}{l} 
The content of AC1(31-0) is compared to the content of AC2(31-0). \\
The contents are equal (true), TC2 \(=\) !TC1 \& 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC1 & 80 & 0028 & 0400 & AC1 & 80 & 0028 & 0400 \\
\hline AC2 & 00 & 0028 & 0400 & AC2 & 00 & 0028 & 0400 \\
\hline M40 & & & 0 & M40 & & & 0 \\
\hline TC1 & & & 1 & TC1 & & & 1 \\
\hline TC2 & & & 0 & TC2 & & & 0 \\
\hline
\end{tabular}

\section*{CMPOR}

Compare Accumulator, Auxiliary, or Temporary Register Content with OR

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & TCx \(=\) TCy | uns(src RELOP dst) & Yes & 3 & 1 & X \\
{\([2]\)} & TCx \(=!\) TCy \(\mid\) uns(src RELOP dst) & Yes & 3 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform a comparison in the D-unit ALU or in the A-unit ALU. The contents of two registers (accumulators, auxiliary registers, or temporary registers) are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU.
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40, TCy \\
& Affects & TCx
\end{tabular}

See Also See the following other related instructions:
- Compare Accumulator, Auxiliary, or Temporary Register Content
- Compare Accumulator, Auxiliary, or Temporary Register Content with AND
- Compare Accumulator, Auxiliary, or Temporary Register Content Maximum
- Compare Accumulator, Auxiliary, or Temporary Register Content Minimum
- Compare Memory with Immediate Value

\section*{Syntax Characteristics}
\begin{tabular}{lllccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline & TCx \(=\) TCy | uns(src RELOP dst) & & & & \\
{\([1 \mathrm{a}]\)} & TC1 \(=\) TC2 \(\mid\) uns(src RELOP dst) & Yes & 3 & 1 & X \\
{\([1 \mathrm{b}]\)} & TC2 \(=\) TC1 \(\mid\) uns(src RELOP dst) & Yes & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

0001 001E \(\mid\) FSSS cc10 \(\mid\) FDDD 0utt

Operands dst, RELOP, src, TC1, TC2

Description This instruction performs a comparison in the D-unit ALU or in the A-unit ALU. The contents of two registers (accumulators, auxiliary registers, or temporary registers) are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0. The result of the comparison is ORed with TCy; TCx is updated with the result of this operation.

The comparison depends on the optional uns keyword and on M40 for accumulator comparisons. As the following table shows, the uns keyword specifies an unsigned comparison and M40 defines the comparison bit width for accumulator comparisons.
\begin{tabular}{cccl} 
uns & src & dst & Comparison Type \\
no & TAx & TAy & 16-bit signed comparison in A-unit ALU \\
no & TAx & ACy & 16-bit signed comparison in A-unit ALU \\
no & ACx & TAy & 16-bit signed comparison in A-unit ALU \\
no & ACx & ACy & \begin{tabular}{l} 
if M40 \(=0\), 32-bit signed comparison in D-unit ALU \\
if M40 \(=1\), 40-bit signed comparison in D-unit ALU
\end{tabular} \\
yes & TAx & TAy & 16-bit unsigned comparison in A-unit ALU \\
yes & TAx & ACy & 16-bit unsigned comparison in A-unit ALU \\
yes & ACx & TAy & 16-bit unsigned comparison in A-unit ALU \\
yes & ACx & ACy & \begin{tabular}{l} 
if M40 \(=0\), 32-bit unsigned comparison in D-unit ALU \\
if M40 \(=1\), 40-bit unsigned comparison in D-unit ALU
\end{tabular}
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{lllccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline & TCx \(=\) !TCy \(\mid\) uns(src RELOP dst) & & & & \\
{\([2 \mathrm{a}]\)} & TC1 \(=\) !TC2 \(\mid\) uns(src RELOP dst) & Yes & 3 & 1 & X \\
{\([2 \mathrm{b}]\)} & TC2 \(=\) !TC1 \(\mid\) uns(src RELOP dst) & Yes & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

0001 001E \(\mid\) FSSS cc10 \(\mid\) FDDD 1utt

Operands dst, RELOP, src, TC1, TC2

Description This instruction performs a comparison in the D-unit ALU or in the A-unit ALU. The contents of two registers (accumulators, auxiliary registers, or temporary registers) are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0. The result of the comparison is ORed with the complement of TCy; TCx is updated with the result of this operation.

The comparison depends on the optional uns keyword and on M40 for accumulator comparisons. As the following table shows, the uns keyword specifies an unsigned comparison and M40 defines the comparison bit width for accumulator comparisons.
\begin{tabular}{cccl} 
uns & src & dst & Comparison Type \\
no & TAx & TAy & 16-bit signed comparison in A-unit ALU \\
no & TAx & ACy & 16-bit signed comparison in A-unit ALU \\
no & ACx & TAy & 16-bit signed comparison in A-unit ALU \\
no & ACx & ACy & \begin{tabular}{l} 
if M40 \(=0\), 32-bit signed comparison in D-unit ALU \\
if M40 \(=1\), 40-bit signed comparison in D-unit ALU
\end{tabular} \\
yes & TAx & TAy & 16-bit unsigned comparison in A-unit ALU \\
yes & TAx & ACy & 16-bit unsigned comparison in A-unit ALU \\
yes & ACx & TAy & 16-bit unsigned comparison in A-unit ALU \\
yes & ACx & ACy & \begin{tabular}{l} 
if M40 \(=0\), 32-bit unsigned comparison in D-unit ALU \\
if M40 \(=1\), 40-bit unsigned comparison in D-unit ALU
\end{tabular}
\end{tabular}
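
One plausible use of the OR forms is to accumulate several conditions into a single test bit. The Python sketch below models the two OR forms and chains a plain comparison with an ORed comparison to flag a value that lies outside a window. It is a hypothetical illustration, not TI code: `cmpor`, `outside_window`, the choice of T0 as the tested register, and the window bounds are all assumptions.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def cmpor(tcy, comparison_true, complement_tcy=False):
    """Value written to TCx by TCx = TCy | cmp, or by TCx = !TCy | cmp."""
    operand = (tcy ^ 1) if complement_tcy else tcy
    return operand | int(bool(comparison_true))


def outside_window(t0, low, high):
    """Model of: TC1 = T0 < low, followed by TC2 = TC1 | T0 >= high."""
    tc1 = int(to_signed16(t0) < to_signed16(low))
    tc2 = cmpor(tc1, to_signed16(t0) >= to_signed16(high))
    return tc2


# Truth table of both OR forms.
for tcy in (0, 1):
    for cmp_result in (0, 1):
        print(tcy, cmp_result, cmpor(tcy, cmp_result), cmpor(tcy, cmp_result, True))
```

The destination and source test bits differ in every form listed in the syntax table (TC1 from TC2, or TC2 from TC1), which is why the sketch writes the final flag to TC2 rather than back to TC1.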


\section*{MAX}

Compare Accumulator, Auxiliary, or Temporary Register Content Maximum

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & dst \(=\) \textbf{max}(src, dst) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

0010 111E FSSS FDDD

Operands dst, src

Description This instruction performs a maximum comparison in the D-unit ALU or in the A-unit ALU. The contents of two registers (accumulators, auxiliary registers, or temporary registers) are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0.
- When the destination operand (dst) is an accumulator:

■ If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended to 40 bits according to SXMD.
- The operation is performed on 40 bits in the D-unit ALU:

If \(\mathrm{M} 40=0, \operatorname{src}(31-0)\) content is compared to \(\mathrm{dst}(31-0)\) content. The extremum value is stored in dst. If the extremum value is the src content, the CARRY status bit is cleared to 0 ; otherwise, it is set to 1 .
```

step1:if (src(31-0) > dst(31-0))
step2:{ CARRY = 0; dst(39-0) = src(39-0) }
else
step3: CARRY = 1

```

If M40 \(=1, \operatorname{src}(39-0)\) content is compared to dst(39-0) content. The extremum value is stored in dst. If the extremum value is the src content, the CARRY status bit is cleared to 0 ; otherwise, it is set to 1 .
```

step1:if (src(39-0) > dst(39-0))
step2:{ CARRY = 0; dst(39-0) = src(39-0) }
else
step3: CARRY = 1

```
- There is no overflow detection, overflow report, or saturation.
- When the destination operand (dst) is an auxiliary or temporary register:
- If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
- The operation is performed on 16 bits in the A-unit ALU:

The \(\operatorname{src}(15-0)\) content is compared to the \(\operatorname{dst}(15-0)\) content. The extremum value is stored in dst.
```

step1:if (src(15-0) > dst(15-0))
step2:dst = src

```
- There is no overflow detection or saturation.
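
The accumulator case described above can be sketched in Python for the 40-bit path (M40 = 1). This is a hedged model, not TI code: `extend16` and `max_acc40` are hypothetical names, zero extension is assumed when SXMD = 0, and the 32-bit path (M40 = 0) is deliberately left out.

```python
MASK40 = (1 << 40) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def extend16(value, sxmd):
    """Extend a 16-bit register value to 40 bits.

    Sign extension when SXMD = 1; zero extension is assumed when SXMD = 0.
    """
    value &= 0xFFFF
    if sxmd and value & 0x8000:
        value |= MASK40 & ~0xFFFF
    return value


def max_acc40(src40, dst40):
    """Model dst = max(src, dst) on 40 bits; returns (new dst, CARRY)."""
    if to_signed(src40, 40) > to_signed(dst40, 40):
        return src40 & MASK40, 0  # src is the extremum: CARRY cleared
    return dst40 & MASK40, 1      # dst is kept: CARRY set


# AC1 = max(AR1, AC1) with AR1 = 8020h, AC1 = 00 0000 0040h, SXMD assumed to be 1
new_ac1, carry = max_acc40(extend16(0x8020, sxmd=1), 0x00_0000_0040)
print(f"{new_ac1:010X} {carry}")
```

With sign extension, 8020h becomes a negative 40-bit value, so the accumulator keeps its content and CARRY is set.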

\section*{Compatibility with C54x devices (C54CM = 1)}
When \(\mathrm{C} 54 \mathrm{CM}=1\), this instruction is executed as if M40 status bit was locally set to 1 . When the destination operand (dst) is an auxiliary or temporary register, the instruction execution is not impacted by the C54CM status bit. When the destination operand (dst) is an accumulator, this instruction always compares the source operand (src) with AC1 as follows:
\(\square\) If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended to 40 bits according to SXMD
- The operation is performed on 40 bits in the D-unit ALU:

The \(\operatorname{src}(39-0)\) content is compared to \(\mathrm{AC} 1(39-0)\) content. The extremum value is stored in dst. If the extremum value is the src content, the CARRY status bit is cleared to 0 ; otherwise, it is set to 1 .
```

step1:if (src(39-0) > AC1(39-0))
step2:{ CARRY = 0; dst(39-0) = src(39-0) }
else
step3:{ CARRY = 1; dst(39-0) = AC1(39-0) }

```

There is no overflow detection, overflow report, or saturation.

\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40, SXMD \\
& Affects & CARRY
\end{tabular}

Repeat This instruction can be repeated.

See Also See the following other related instructions:
- Compare Accumulator, Auxiliary, or Temporary Register Content
- Compare Accumulator, Auxiliary, or Temporary Register Content with AND
- Compare Accumulator, Auxiliary, or Temporary Register Content with OR
- Compare Accumulator, Auxiliary, or Temporary Register Content Minimum
- Compare and Select Accumulator Content Maximum
- Compare Memory with Immediate Value

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = max(AC2, AC1) & \begin{tabular}{l} 
Because the content of AC2 is less than the content of AC1, the content of AC1 remains \\
the same and the CARRY status bit is set to 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC2 & 00 & 0000 & 0000 & AC2 & 00 & 0000 & 0000 \\
\hline AC1 & 00 & 8500 & 0000 & AC1 & 00 & 8500 & 0000 \\
\hline SXMD & & & 1 & SXMD & & & 1 \\
\hline M40 & & & 0 & M40 & & & 0 \\
\hline CARRY & & & 0 & CARRY & & & 1 \\
\hline
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = max(AR1, AC1) & \begin{tabular}{l} 
Because the content of AR1 is less than the content of AC1, the content of AC1 remains \\
the same and the CARRY status bit is set to 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AR1 & & & 8020 & AR1 & & & 8020 \\
\hline AC1 & 00 & 0000 & 0040 & AC1 & 00 & 0000 & 0040 \\
\hline CARRY & & & 0 & CARRY & & & 1 \\
\hline
\end{tabular}

Example 3


\section*{MIN}

Compare Accumulator, Auxiliary, or Temporary Register Content Minimum

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & dst \(=\) \textbf{min}(src, dst) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
0011 000E \(\mid\) FSSS FDDD

Operands dst, src

Description This instruction performs a minimum comparison in the D-unit ALU or in the A-unit ALU. The contents of two registers (accumulators, auxiliary registers, or temporary registers) are compared. When an accumulator ACx is compared with an auxiliary or temporary register TAx, the 16 lowest bits of ACx are compared with TAx in the A-unit ALU. If the comparison is true, the TCx status bit is set to 1; otherwise, it is cleared to 0.
\(\square\) When the destination operand (dst) is an accumulator:
- If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended to 40 bits according to SXMD.
■ The operation is performed on 40 bits in the D-unit ALU:
If M40 \(=0, \operatorname{src}(31-0)\) content is compared to \(\mathrm{dst}(31-0)\) content. The extremum value is stored in dst. If the extremum value is the src content, the CARRY status bit is cleared to 0 ; otherwise, it is set to 1 .
```

step1:if (src(31-0) < dst(31-0))
step2:{ CARRY = 0; dst(39-0) = src(39-0) }
else
step3: CARRY = 1

```

If \(\mathrm{M} 40=1, \operatorname{src}(39-0)\) content is compared to \(\mathrm{dst}(39-0)\) content. The extremum value is stored in dst. If the extremum value is the src content, the CARRY status bit is cleared to 0 ; otherwise, it is set to 1 .
```

step1:if (src(39-0) < dst(39-0))
step2:{ CARRY = 0; dst(39-0) = src(39-0) }
else
step3: CARRY = 1

```

- There is no overflow detection, overflow report, or saturation.

\(\square\) When the destination operand (dst) is an auxiliary or temporary register:
- If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
- The operation is performed on 16 bits in the A-unit ALU:

The \(\operatorname{src}(15-0)\) content is compared to the \(\operatorname{dst}(15-0)\) content. The extremum value is stored in dst.
```

step1:if (src(15-0) < dst(15-0))
step2:dst = src

```
- There is no overflow detection or saturation.

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM \(=1\), this instruction is executed as if the M40 status bit was locally set to 1. When the destination operand (dst) is an auxiliary or temporary register, the instruction execution is not impacted by the C54CM status bit. When the destination operand (dst) is an accumulator, this instruction always compares the source operand (src) with AC1 as follows:
\(\square\) If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended to 40 bits according to SXMD
\(\square\) The operation is performed on 40 bits in the D-unit ALU:
The \(\operatorname{src}(39-0)\) content is compared to \(\mathrm{AC} 1(39-0)\) content. The extremum value is stored in dst. If the extremum value is the src content, the CARRY status bit is cleared to 0 ; otherwise, it is set to 1 .
```

step1:if (src(39-0) < AC1(39-0))
step2:{ CARRY = 0; dst(39-0) = src(39-0) }
else
step3:{ CARRY = 1; dst(39-0) = AC1(39-0) }

```

There is no overflow detection, overflow report, or saturation.
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40, SXMD \\
& Affects & CARRY
\end{tabular}

See Also See the following other related instructions:
- Compare Accumulator, Auxiliary, or Temporary Register Content
- Compare Accumulator, Auxiliary, or Temporary Register Content with AND
- Compare Accumulator, Auxiliary, or Temporary Register Content with OR
- Compare Accumulator, Auxiliary, or Temporary Register Content Maximum
- Compare and Select Accumulator Content Minimum
- Compare Memory with Immediate Value

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline T1 = min(AC1, T1) & \begin{tabular}{l} 
Because the content of AC1(15-0) is greater than the content of T1, the content of T1 \\
remains the same and the CARRY status bit is set to 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrrrlrrr} 
\multicolumn{4}{l}{Before} & \multicolumn{4}{l}{After} \\
AC1 & 00 & 8000 & 0000 & AC1 & 00 & 8000 & 0000 \\
T1 & & & 8020 & T1 & & & 8020 \\
CARRY & & & 0 & CARRY & & & 1
\end{tabular}
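
The 16-bit A-unit case used in this example can be modeled as a signed comparison on the low 16 bits of each operand. The Python sketch below is illustrative only, not TI code; `min_reg16` is a hypothetical name, and the CARRY update shown in the example table is not modeled because the description above defines CARRY only for an accumulator destination.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def min_reg16(src, dst):
    """Model dst = min(src, dst) for an auxiliary or temporary destination."""
    src16 = src & 0xFFFF  # only the 16 LSBs of an accumulator source are used
    dst16 = dst & 0xFFFF
    if to_signed16(src16) < to_signed16(dst16):
        return src16
    return dst16


# T1 = min(AC1, T1) with AC1 = 00 8000 0000h and T1 = 8020h
print(f"{min_reg16(0x00_8000_0000, 0x8020):04X}")  # prints 8020
```

AC1(15-0) is 0000h, while 8020h is negative as a signed 16-bit value, so T1 already holds the minimum and is left unchanged.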

\section*{BCC}

Compare and Branch

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & compare (uns(src RELOP K8)) goto L8 & No & 4 & \(7 / 6^{\dagger}\) & X \\
\hline
\end{tabular}
\(\dagger \mathrm{x} / \mathrm{y}\) cycles: x cycles \(=\) condition true, y cycles \(=\) condition false

\section*{Opcode}

0110 1111 \(\mid\) FSSS ccxu \(\mid\) KKKK KKKK \(\mid\) LLLL LLLL

Operands K8, L8, RELOP, src

Description This instruction compares the content of a source (src) register with an 8-bit signed value, K8. The comparison is performed in the D-unit ALU or in the A-unit ALU, in the execute phase of the pipeline. If the result of the comparison is true, a branch occurs.

The program branch address is specified as an 8-bit signed offset, L8, relative to the program counter (PC). Use this instruction to branch within a 256-byte window centered on the current PC value.

The comparison depends on the optional uns keyword and, for accumulator comparisons, on M40.
- In the case of an unsigned comparison, the 8-bit constant, K8, is zero extended to:
- 16 bits, if the source (src) operand is an auxiliary or temporary register.
- 40 bits, if the source (src) operand is an accumulator.
\(\square\) In the case of a signed comparison, the 8 -bit constant, K8, is sign extended to:
- 16 bits, if the source (src) operand is an auxiliary or temporary register.
- 40 bits, if the source (src) operand is an accumulator.

As the following table shows, the uns keyword specifies an unsigned comparison; M40 defines the comparison bit width of the accumulator.
\begin{tabular}{ccl} 
uns & src & Comparison Type \\
no & TAx & 16-bit signed comparison in A-unit ALU \\
no & ACx & \begin{tabular}{l} 
if M40 \(=0\), 32-bit signed comparison in D-unit ALU \\
if M40 \(=1\), 40-bit signed comparison in D-unit ALU
\end{tabular}
\end{tabular}
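
The extension of K8 and the resulting branch decision can be sketched as follows. This Python model is an illustration under stated assumptions, not TI code: the function names, the `src_is_accumulator` flag, and the set of four RELOP operators are hypothetical. The branch target computation from L8 is not modeled.

```python
import operator

# Relational operators assumed for RELOP in this sketch.
RELOPS = {"==": operator.eq, "<": operator.lt, ">=": operator.ge, "!=": operator.ne}


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def extend_k8(k8, bits, uns):
    """Zero extend K8 for an unsigned comparison, sign extend it otherwise."""
    k8 &= 0xFF
    if not uns and k8 & 0x80:
        k8 |= ((1 << bits) - 1) & ~0xFF
    return k8


def branch_taken(src, k8, relop, src_is_accumulator, uns=False, m40=0, c54cm=0):
    """Return True when the comparison is true, that is, when the branch occurs."""
    ext_bits = 40 if src_is_accumulator else 16
    if src_is_accumulator:
        cmp_bits = 40 if (m40 or c54cm) else 32
    else:
        cmp_bits = 16
    mask = (1 << cmp_bits) - 1
    a = src & mask
    b = extend_k8(k8, ext_bits, uns) & mask
    if not uns:
        a, b = to_signed(a, cmp_bits), to_signed(b, cmp_bits)
    return RELOPS[relop](a, b)


# compare (T1 != #1) goto branch, with T1 = 0000h
print(branch_taken(0x0000, 1, "!=", src_is_accumulator=False))
# K8 = FFh: equal to FFFFh when sign extended, different when zero extended
print(branch_taken(0xFFFF, 0xFF, "==", src_is_accumulator=False))
print(branch_taken(0xFFFF, 0xFF, "==", src_is_accumulator=False, uns=True))
```

The last two calls show why the uns keyword matters for constants with bit 7 set: the same K8 encoding compares as -1 in the signed case and as 255 in the unsigned case.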

\section*{Compatibility with C54x devices (C54CM = 1)}

When \(\mathrm{C} 54 \mathrm{CM}=1\), the conditions testing the accumulator contents are all performed as if M40 was set to 1 .

\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40 \\
& Affects & none
\end{tabular}

Repeat This instruction cannot be repeated.

See Also See the following other related instructions:
- Branch Conditionally
- Branch Unconditionally
- Branch on Auxiliary Register Not Zero

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline compare (AC0 >= \#12) goto branch & \begin{tabular}{l} 
The signed content of AC0 is compared to the sign-extended 8-bit \\
value (12). Because the content of AC0 is greater than or equal to 12, \\
program control is passed to the program address label defined by \\
branch (004078h).
\end{tabular} \\
\hline
\end{tabular}


\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline compare (T1 != \#1) goto branch & \begin{tabular}{l} 
Because the content of T1 (0000h) is not equal to 1, the condition is true and program control \\
is passed to the program address label defined by branch (00407Dh).
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lll}
 & compare (T1 != \#1) goto branch & \\
 & \(\ldots \ldots\) & \\
 & \(\ldots \ldots\) & \\
branch & \(\ldots \ldots\) & address: 00407D
\end{tabular}
\begin{tabular}{llll} 
\multicolumn{2}{l}{Before} & \multicolumn{2}{l}{After} \\
T1 & 0000 & T1 & 0000 \\
PC & 4079 & PC & 407D
\end{tabular}

\section*{MAXDIFF}

Compare and Select Accumulator Content Maximum

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & max_diff(ACx, ACy, ACz, ACw) & Yes & 3 & 1 & \(X\) \\
{\([2]\)} & max_diff_dbl(ACx, ACy, ACz, ACw, TRNx) & Yes & 3 & 1 & \(X\) \\
\hline
\end{tabular}
\begin{tabular}{ll} 
Description & \begin{tabular}{l} 
Instruction [1] performs two parallel 16-bit extremum selections in the D-unit \\
ALU. Instruction [2] performs a single 40-bit extremum selection in the D-unit \\
ALU.
\end{tabular} \\
Status Bits & Affected by \(\quad\) C54CM, M40, SATD \\
 & Affects \\
See Also & See the following other related instructions: \\
 & \(\square\) Compare Accumulator, Auxiliary, or Temporary Register Content \\
 & \(\square\) Compare Accumulator, Auxiliary, or Temporary Register Content Maximum
\end{tabular}

\section*{Syntax Characteristics}


For each datapath (high and low):
\(\square\) ACx and ACy are the source accumulators.
\(\square\) The differences are stored in accumulator ACw.
- The subtraction computation is equivalent to the dual 16-bit subtractions instruction.
- For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit ( ACOV ) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
- For the operations performed in the ALU high part, overflow is detected at bit position 31.
- For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
- Independently on each data path, if SATD \(=1\) when an overflow is detected on the data path, a saturation is performed:
- For the operations performed in the ALU low part, saturation values are 7FFFh (positive) and 8000h (negative).
- For the operations performed in the ALU high part, saturation values are 00 7FFFh (positive) and FF 8000h (negative).
- The extremum is stored in accumulator ACz.
- The extremum is searched considering the selected bit width of the accumulators:

■ for the lower 16-bit data path, the sign bit is extracted at bit position 15
■ for the higher 24-bit data path, the sign bit is extracted at bit position 31
- According to the extremum found, a decision bit is shifted in TRNx from the MSBs to the LSBs:
- TRN0 tracks the decision for the high part data path
- TRN1 tracks the decision for the low part data path

If the extremum value is the ACx high or low part, the decision bit is cleared to 0 ; otherwise, it is set to 1 :
```

TRN0 = TRN0 >> #1
TRN1 = TRN1 >> #1
ACw(39-16) = ACy(39-16) - ACx(39-16)
ACw(15-0) = ACy(15-0) - ACx(15-0)
if (ACx(31-16) > ACy(31-16))
{ bit(TRN0, 15) = #0 ; ACz(39-16) = ACx(39-16) }
else
{ bit(TRN0, 15) = #1 ; ACz(39-16) = ACy(39-16) }
if (ACx(15-0) > ACy(15-0))
{ bit(TRN1, 15) = #0 ; ACz(15-0) = ACx(15-0) }
else
{ bit(TRN1, 15) = #1 ; ACz(15-0) = ACy(15-0) }

```
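
The selection and decision-bit steps of the pseudocode above can be modeled in Python as follows. This is a simplified sketch, not TI code: the function name and return order are assumptions, and overflow detection, saturation, and the CARRY update are deliberately left out, so the two halves of ACw are plain wrapped differences.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def max_diff(acx, acy, trn0, trn1):
    """Return (ACz, ACw, TRN0, TRN1) for max_diff(ACx, ACy, ACz, ACw).

    ACw holds unsaturated differences; ACOVw and CARRY are not modeled.
    """
    x_hi, x_lo = (acx >> 16) & 0xFFFFFF, acx & 0xFFFF
    y_hi, y_lo = (acy >> 16) & 0xFFFFFF, acy & 0xFFFF

    # Shift both transition registers right; bit 15 becomes 0.
    trn0 = (trn0 & 0xFFFF) >> 1
    trn1 = (trn1 & 0xFFFF) >> 1

    w_hi = (y_hi - x_hi) & 0xFFFFFF  # ACw(39-16), wrapped to 24 bits
    w_lo = (y_lo - x_lo) & 0xFFFF    # ACw(15-0), wrapped to 16 bits

    # High data path: bits 31-16 are compared as signed 16-bit values.
    if to_signed16(x_hi) > to_signed16(y_hi):
        z_hi = x_hi                  # decision bit stays 0
    else:
        z_hi = y_hi
        trn0 |= 0x8000               # decision bit set to 1

    # Low data path: bits 15-0 are compared as signed 16-bit values.
    if to_signed16(x_lo) > to_signed16(y_lo):
        z_lo = x_lo
    else:
        z_lo = y_lo
        trn1 |= 0x8000

    return (z_hi << 16) | z_lo, (w_hi << 16) | w_lo, trn0, trn1


acz, acw, trn0, trn1 = max_diff(0x10_2400_2222, 0x90_0000_0000, 0x1000, 0x0100)
print(f"ACz={acz:010X} ACw(15-0)={acw & 0xFFFF:04X} TRN0={trn0:04X} TRN1={trn1:04X}")
```

Because saturation is omitted, the high half of ACw returned by this sketch is the raw 24-bit difference and will differ from the hardware result whenever an overflow is detected with SATD = 1.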

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM \(=1\), this instruction is executed as if SATD is locally cleared to 0. Overflow is only detected and reported for the computation performed in the higher 24-bit data path (overflow is detected at bit position 31).

\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, SATD \\
& Affects & ACOVw, CARRY
\end{tabular}

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline max_diff(AC0, AC1, AC2, AC1) & The difference is stored in AC1. The content of AC0(39-16) is subtracted from the content of AC1(39-16) and the result is stored in AC1(39-16). Since SATD \(=1\) and an overflow is detected, AC1(39-16) = FF 8000h (saturation). The content of AC0(15-0) is subtracted from the content of AC1(15-0) and the result is stored in AC1(15-0). The maximum is stored in AC2. The content of TRN0 and TRN1 is shifted right 1 bit. AC0(31-16) is greater than AC1(31-16), so AC0(39-16) is stored in AC2(39-16) and TRN0(15) is cleared to 0. AC0(15-0) is greater than AC1(15-0), so AC0(15-0) is stored in AC2(15-0) and TRN1(15) is cleared to 0. \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 & 10 & 2400 & 2222 & AC0 & 10 & 2400 & 2222 \\
\hline AC1 & 90 & 0000 & 0000 & AC1 & FF & 8000 & DDDE \\
\hline AC2 & 00 & 0000 & 0000 & AC2 & 10 & 2400 & 2222 \\
\hline SATD & & & 1 & SATD & & & 1 \\
\hline TRN0 & & & 1000 & TRN0 & & & 0800 \\
\hline TRN1 & & & 0100 & TRN1 & & & 0080 \\
\hline ACOV1 & & & 0 & ACOV1 & & & 1 \\
\hline CARRY & & & 1 & CARRY & & & 0 \\
\hline
\end{tabular}
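
The dual-path behavior described above can be sketched in Python. This is a minimal model rather than a reference implementation: it assumes C54CM = 0, does not model the CARRY status bit, and the function names `sext`, `sub_sat16`, and `max_diff` are illustrative only.

```python
def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sub_sat16(y, x, satd):
    """Return (y - x, overflow), checked against the signed 16-bit range."""
    diff = y - x
    overflow = not -0x8000 <= diff <= 0x7FFF
    if overflow and satd:
        diff = 0x7FFF if diff > 0 else -0x8000
    return diff, overflow


def max_diff(acx, acy, trn0, trn1, satd=1):
    """Model max_diff(ACx, ACy, ACz, ACw); returns (ACw, ACz, TRN0, TRN1, ACOVw)."""
    x_hi, y_hi = (acx >> 16) & 0xFFFFFF, (acy >> 16) & 0xFFFFFF
    x_lo, y_lo = acx & 0xFFFF, acy & 0xFFFF

    # Differences: the high path carries the 8 guard bits, the low path is 16 bits.
    d_hi, ovf_hi = sub_sat16(sext(y_hi, 24), sext(x_hi, 24), satd)
    d_lo, ovf_lo = sub_sat16(sext(y_lo, 16), sext(x_lo, 16), satd)
    acw = ((d_hi & 0xFFFFFF) << 16) | (d_lo & 0xFFFF)

    # Decision bits enter at bit 15 after a right shift by 1.
    trn0 = (trn0 & 0xFFFF) >> 1
    trn1 = (trn1 & 0xFFFF) >> 1
    if sext(x_hi, 16) > sext(y_hi, 16):      # sign taken at bit 31
        z_hi = x_hi
    else:
        z_hi, trn0 = y_hi, trn0 | 0x8000
    if sext(x_lo, 16) > sext(y_lo, 16):      # sign taken at bit 15
        z_lo = x_lo
    else:
        z_lo, trn1 = y_lo, trn1 | 0x8000
    return acw, (z_hi << 16) | z_lo, trn0, trn1, ovf_hi or ovf_lo
```

Called with the register values from the example above, `max_diff(0x1024002222, 0x9000000000, 0x1000, 0x0100)` reproduces the After column for AC1, AC2, TRN0, TRN1, and ACOV1.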

MAXDIFF Compare and Select Accumulator Content Maximum (max_diff_dbl)


\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2 a]\) & max_diff_dbl(ACx, ACy, ACz, ACw, TRN0) & Yes & 3 & 1 & X \\
{\([2 b]\)} & max_diff_dbl(ACx, ACy, ACz, ACw, TRN1) & Yes & 3 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Opcode & TRN0 & 0001 & 000E & DDSS & 1101 & SSDD & xxx0 \\
\hline & TRN1 & 0001 & 000E & DDSS & 1101 & SSDD & xxx1 \\
\hline Operands & \multicolumn{7}{|l|}{ACw, ACx, ACy, ACz, TRNx} \\
\hline Description & \multicolumn{7}{|l|}{This instruction performs a single 40-bit extremum selection in the D-unit ALU. This instruction performs a maximum search.} \\
\hline
\end{tabular}
- ACx and ACy are the two source accumulators.
- The difference between the source accumulators is stored in accumulator ACw.
- The subtraction computation is equivalent to the subtraction instruction.
- Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
- When an overflow is detected, the accumulator is saturated according to SATD.
- The extremum between the source accumulators is stored in accumulator ACz.
- The extremum computation is similar to the compare register content maximum instruction. However, the CARRY status bit is not updated by the extremum search but by the subtraction instruction.
- According to the extremum found, a decision bit is shifted in TRNx from the MSBs to the LSBs. If the extremum value is ACx, the decision bit is cleared to 0; otherwise, it is set to 1.
```

If M40 = 0:
TRNx = TRNx >> #1
ACw(39-0) = ACy(39-0) - ACx(39-0)
if (ACx(31-0) > ACy(31-0))
{ bit(TRNx, 15) = #0 ; ACz(39-0) = ACx(39-0) }
else
{ bit(TRNx, 15) = #1 ; ACz(39-0) = ACy(39-0) }
If M40 = 1:
TRNx = TRNx >> #1
ACw(39-0) = ACy(39-0) - ACx(39-0)
if (ACx(39-0) > ACy(39-0))
{ bit(TRNx, 15) = #0 ; ACz(39-0) = ACx(39-0) }
else
{ bit(TRNx, 15) = #1 ; ACz(39-0) = ACy(39-0) }

```

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, this instruction is executed as if the M40 status bit were locally set to 1. However, to ensure compatibility with respect to overflow detection and saturation of the destination accumulator, this instruction must be executed with M40 = 0.

Status Bits
Affected by C54CM, M40, SATD
Affects ACOVw, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline max_diff_dbl(AC0, AC1, AC2, AC3, TRN1) & The difference is stored in AC3. The content of AC0 is subtracted from the content of AC1 and the result is stored in AC3. The maximum is stored in AC2. The content of TRN1 is shifted right 1 bit. Since AC0 is greater than AC1, AC0 is stored in AC2 and TRN1(15) is cleared to 0. \\
\hline
\end{tabular}
\begin{tabular}{lrrrlrrr} 
\multicolumn{4}{l}{Before} & \multicolumn{4}{l}{After} \\
AC0 & 10 & 2400 & 2222 & AC0 & 10 & 2400 & 2222 \\
AC1 & 00 & 8000 & DDDE & AC1 & 00 & 8000 & DDDE \\
AC2 & 00 & 0000 & 0000 & AC2 & 10 & 2400 & 2222 \\
AC3 & 00 & 0000 & 0000 & AC3 & F0 & 5C00 & BBBC \\
M40 & & & 1 & M40 & & & 1 \\
SATD & & & 1 & SATD & & & 1 \\
TRN1 & & & 0080 & TRN1 & & & 0040 \\
ACOV3 & & & 0 & ACOV3 & & & 0 \\
CARRY & & & 0 & CARRY & & & 0
\end{tabular}
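
The single 40-bit selection can be modeled the same way. The sketch below assumes C54CM = 0 and does not model CARRY. The M40 = 0 branch additionally assumes that overflow is detected at bit 31 and that saturation produces 00 7FFF FFFFh or FF 8000 0000h, which this section does not spell out. The names `sext` and `max_diff_dbl` are illustrative.

```python
MASK40 = (1 << 40) - 1


def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def max_diff_dbl(acx, acy, trn, m40=1, satd=1):
    """Model max_diff_dbl(ACx, ACy, ACz, ACw, TRNx); returns (ACw, ACz, TRNx, ACOVw)."""
    width = 40 if m40 else 32                # compare width and overflow position
    low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1

    diff = sext(acy, 40) - sext(acx, 40)     # ACw = ACy - ACx on 40 bits
    overflow = not low <= diff <= high
    if overflow and satd:
        diff = high if diff > 0 else low
    acw = diff & MASK40

    trn = (trn & 0xFFFF) >> 1                # decision bit enters at bit 15
    if sext(acx, width) > sext(acy, width):
        acz = acx & MASK40                   # ACx is the maximum: bit stays 0
    else:
        acz = acy & MASK40
        trn |= 0x8000
    return acw, acz, trn, overflow
```

With the values of the example above (M40 = 1), `max_diff_dbl(0x1024002222, 0x008000DDDE, 0x0080)` yields AC3 = F0 5C00 BBBCh, AC2 = 10 2400 2222h, and TRN1 = 0040h.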

MINDIFF Compare and Select Accumulator Content Minimum

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & min_diff(ACx, ACy, ACz, ACw) & Yes & 3 & 1 & X \\
{\([2]\)} & min_diff_dbl(ACx, ACy, ACz, ACw, TRNx) & Yes & 3 & 1 & \(X\) \\
\hline
\end{tabular}

Description Instruction [1] performs two paralleled 16-bit extremum selections in the D-unit ALU. Instruction [2] performs a single 40-bit extremum selection in the D-unit ALU.

Status Bits
Affected by C54CM, M40, SATD
Affects ACOVw, CARRY
See Also See the following other related instructions:
- Compare Accumulator, Auxiliary, or Temporary Register Content
- Compare Accumulator, Auxiliary, or Temporary Register Content Minimum
- Compare and Select Accumulator Content Maximum

\section*{Syntax Characteristics}


For each datapath (high and low):
- ACx and ACy are the source accumulators.
- The differences are stored in accumulator ACw.
- The subtraction computation is equivalent to the dual 16-bit subtractions instruction.
- For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVw) is set.
  - For the operations performed in the ALU low part, overflow is detected at bit position 15.
  - For the operations performed in the ALU high part, overflow is detected at bit position 31.
- For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
- Independently on each data path, if SATD = 1 when an overflow is detected on the data path, a saturation is performed:
  - For the operations performed in the ALU low part, saturation values are 7FFFh (positive) and 8000h (negative).
  - For the operations performed in the ALU high part, saturation values are 00 7FFFh (positive) and FF 8000h (negative).
- The extremum is stored in accumulator ACz.
- The extremum is searched considering the selected bit width of the accumulators:
  - for the lower 16-bit data path, the sign bit is extracted at bit position 15
  - for the higher 24-bit data path, the sign bit is extracted at bit position 31
- According to the extremum found, a decision bit is shifted in TRNx from the MSBs to the LSBs:
  - TRN0 tracks the decision for the high part data path
  - TRN1 tracks the decision for the low part data path

If the extremum value is the ACx high or low part, the decision bit is cleared to 0; otherwise, it is set to 1:
```

TRN0 = TRN0 >> #1
TRN1 = TRN1 >> #1
ACw(39-16) = ACy(39-16) - ACx(39-16)
ACw(15-0) = ACy(15-0) - ACx(15-0)
if (ACx(31-16) < ACy(31-16))
{ bit(TRN0, 15) = #0 ; ACz(39-16) = ACx(39-16) }
else
{ bit(TRN0, 15) = #1 ; ACz(39-16) = ACy(39-16) }
if (ACx(15-0) < ACy(15-0))
{ bit(TRN1, 15) = #0 ; ACz(15-0) = ACx(15-0) }
else
{ bit(TRN1, 15) = #1 ; ACz(15-0) = ACy(15-0) }

```

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, this instruction is executed as if SATD is locally cleared to 0. Overflow is only detected and reported for the computation performed in the higher 24-bit data path (overflow is detected at bit position 31).

\section*{Status Bits}

Affected by C54CM, SATD
Affects ACOVw, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline min_diff(AC0, AC1, AC2, AC1) & The difference is stored in AC1. The content of AC0(39-16) is subtracted from the content of AC1(39-16) and the result is stored in AC1(39-16). Since SATD = 1 and an overflow is detected, AC1(39-16) = FF 8000h (saturation). The content of AC0(15-0) is subtracted from the content of AC1(15-0) and the result is stored in AC1(15-0). The minimum is stored in AC2 (sign bit extracted at bits 31 and 15). The content of TRN0 and TRN1 is shifted right 1 bit. Since AC0(31-16) is greater than or equal to AC1(31-16), AC1(39-16) is stored in AC2(39-16) and TRN0(15) is set to 1. Since AC0(15-0) is greater than or equal to AC1(15-0), AC1(15-0) is stored in AC2(15-0) and TRN1(15) is set to 1. \\
\hline
\end{tabular}
\begin{tabular}{lrrrlrrr} 
\multicolumn{4}{l}{Before} & \multicolumn{4}{l}{After} \\
AC0 & 10 & 2400 & 2222 & AC0 & 10 & 2400 & 2222 \\
AC1 & 00 & 8000 & DDDE & AC1 & FF & 8000 & BBBC \\
AC2 & 10 & 2400 & 2222 & AC2 & 00 & 8000 & DDDE \\
SATD & & & 1 & SATD & & & 1 \\
TRN0 & & & 0800 & TRN0 & & & 8400 \\
TRN1 & & & 0040 & TRN1 & & & 8020 \\
ACOV1 & & & 0 & ACOV1 & & & 1 \\
CARRY & & & 0 & CARRY & & & 1
\end{tabular}
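
The minimum search differs from the maximum search only in the comparison. A compact per-path sketch follows, again assuming C54CM = 0 and leaving CARRY unmodeled; `sext`, `lane`, and `min_diff` are illustrative names.

```python
def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def lane(x, y, in_bits, trn, satd):
    """One data path of min_diff; x and y are raw fields that are in_bits wide."""
    diff = sext(y, in_bits) - sext(x, in_bits)
    overflow = not -0x8000 <= diff <= 0x7FFF
    if overflow and satd:
        diff = 0x7FFF if diff > 0 else -0x8000
    trn = (trn & 0xFFFF) >> 1
    if sext(x, 16) < sext(y, 16):            # compare on the 16 data bits only
        selected = x                          # ACx part is the minimum: bit stays 0
    else:
        selected, trn = y, trn | 0x8000       # ACy part is the minimum: bit set to 1
    return diff & ((1 << in_bits) - 1), selected, trn, overflow


def min_diff(acx, acy, trn0, trn1, satd=1):
    """Model min_diff(ACx, ACy, ACz, ACw); returns (ACw, ACz, TRN0, TRN1, ACOVw)."""
    d_hi, z_hi, trn0, o_hi = lane((acx >> 16) & 0xFFFFFF, (acy >> 16) & 0xFFFFFF, 24, trn0, satd)
    d_lo, z_lo, trn1, o_lo = lane(acx & 0xFFFF, acy & 0xFFFF, 16, trn1, satd)
    return (d_hi << 16) | d_lo, (z_hi << 16) | z_lo, trn0, trn1, o_hi or o_lo
```

Applied to the Before values of the example above, `min_diff(0x1024002222, 0x008000DDDE, 0x0800, 0x0040)` returns AC1 = FF 8000 BBBCh, AC2 = 00 8000 DDDEh, TRN0 = 8400h, and TRN1 = 8020h.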

Compare and Select Accumulator Content Minimum

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2 a]\) & min_diff_dbl(ACx, ACy, ACz, ACw, TRN0) & Yes & 3 & 1 & X \\
{\([2 b]\)} & min_diff_dbl(ACx, ACy, ACz, ACw, TRN1) & Yes & 3 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Opcode & TRN0 & 0001 & 000E & DDSS & 1111 & SSDD & xxx0 \\
\hline & TRN1 & 0001 & 000E & DDSS & 1111 & SSDD & xxx1 \\
\hline Operands & \multicolumn{7}{|l|}{ACw, ACx, ACy, ACz, TRNx} \\
\hline Description & \multicolumn{7}{|l|}{This instruction performs a single 40-bit extremum selection in the D-unit ALU. This instruction performs a minimum search.} \\
\hline
\end{tabular}
- ACx and ACy are the two source accumulators.
- The difference between the source accumulators is stored in accumulator ACw.
- The subtraction computation is equivalent to the subtraction instruction.
- Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
- When an overflow is detected, the accumulator is saturated according to SATD.
- The extremum between the source accumulators is stored in accumulator ACz.
- The extremum computation is similar to the compare register content minimum instruction. However, the CARRY status bit is not updated by the extremum search but by the subtraction instruction.
- According to the extremum found, a decision bit is shifted in TRNx from the MSBs to the LSBs. If the extremum value is ACx, the decision bit is cleared to 0; otherwise, it is set to 1.
```

If M40 = 0:
TRNx = TRNx >> #1
ACw(39-0) = ACy(39-0) - ACx(39-0)
if (ACx(31-0) < ACy(31-0))
{ bit(TRNx, 15) = #0 ; ACz(39-0) = ACx(39-0) }
else
{ bit(TRNx, 15) = #1 ; ACz(39-0) = ACy(39-0) }
If M40 = 1:
TRNx = TRNx >> #1
ACw(39-0) = ACy(39-0) - ACx(39-0)
if (ACx(39-0) < ACy(39-0))
{ bit(TRNx, 15) = #0 ; ACz(39-0) = ACx(39-0) }
else
{ bit(TRNx, 15) = #1 ; ACz(39-0) = ACy(39-0) }

```

\section*{Compatibility with C54x devices (C54CM =1)}

When C54CM = 1, this instruction is executed as if the M40 status bit were locally set to 1. However, to ensure compatibility with respect to overflow detection and saturation of the destination accumulator, this instruction must be executed with M40 = 0.
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40, SATD \\
& Affects & ACOVw, CARRY
\end{tabular}

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline min_diff_dbl(AC0, AC1, AC2, AC3, TRN0) & \begin{tabular}{l} 
The difference is stored in AC3. The content of AC0 is \\
subtracted from the content of AC1 and the result is stored in \\
AC3. The minimum is stored in AC2. The content of TRN0 is \\
shifted right 1 bit. If AC0 is less than AC1, AC0 is stored in AC2 \\
and TRN0(15) is cleared to 0; otherwise, AC1 is stored in AC2 \\
and TRN0(15) is set to 1.
\end{tabular} \\
\hline
\end{tabular}

\section*{CMP}

\section*{Syntax Characteristics}


\section*{Example 2}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline TC2 = (*AR1 == \#400h) & The content addressed by AR1 is compared to the signed 16-bit value (400h). Because they are not equal, TC2 is cleared to 0 . \\
\hline Before & After \\
\hline AR1 0285 & AR1 0285 \\
\hline 0285 0000 & 0285 0000 \\
\hline TC2 0 & TC2 0 \\
\hline
\end{tabular}

\section*{BNOT}

Complement Accumulator, Auxiliary, or Temporary Register Bit
Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & cbit(src, Baddr) & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode
| 1110 1100 | AAAA AAAI | FSSS 011x

Operands
Description

Status Bits
Affected by none
Affects none

Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Clear Accumulator, Auxiliary, or Temporary Register Bit
- Complement Accumulator, Auxiliary, or Temporary Register Content
- Complement Memory Bit
- Set Accumulator, Auxiliary, or Temporary Register Bit

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline cbit(T0, AR1) & The bit at the position defined by the content of AR1(3-0) in T0 is complemented. \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
T0 & E000 & T0 & F000 \\
AR1 & 000C & AR1 & 000C
\end{tabular}
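
The bit complement in the example above amounts to an exclusive OR with a one-bit mask. A minimal sketch for a 16-bit register such as T0, where only bits 3-0 of the bit address select the position; the helper name `cbit16` is illustrative.

```python
def cbit16(value, bit_address):
    """Complement the bit of a 16-bit value selected by bits 3-0 of bit_address."""
    position = bit_address & 0xF             # only the 4 LSBs pick the bit
    return (value ^ (1 << position)) & 0xFFFF
```

With T0 = E000h and AR1 = 000Ch, `cbit16(0xE000, 0x000C)` returns F000h, matching the table above.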

\section*{NOT}

Complement Accumulator, Auxiliary, or Temporary Register Content

\section*{Syntax Characteristics}


BNOT
Complement Memory Bit
Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & cbit(Smem, src) & No & 3 & 1 & \\
\hline
\end{tabular}
\begin{tabular}{|l|l|}
\hline Opcode & 1110 0011 | AAAA AAAI | FSSS 111x \\
\hline Operands & Smem, src \\
\hline Description & \begin{tabular}{l}
This instruction performs a bit manipulation in the A-unit ALU. The instruction complements a single bit, as defined by the content of the source (src) operand, of a memory (Smem) location. \\
The generated bit address must be within 0-15 (only the 4 LSBs of the register are used to determine the bit position).
\end{tabular} \\
\hline Status Bits & \begin{tabular}{ll} 
Affected by & none \\
Affects & none
\end{tabular} \\
\hline Repeat & This instruction can be repeated. \\
\hline See Also & \begin{tabular}{l}
See the following other related instructions: \\
Clear Memory Bit \\
Complement Accumulator, Auxiliary, or Temporary Register Bit \\
Complement Accumulator, Auxiliary, or Temporary Register Content \\
Set Memory Bit
\end{tabular} \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline cbit(*AR3, AC0) & The bit at the position defined by AC0(3-0) in the content addressed by AR3 is complemented. \\
\hline
\end{tabular}

\section*{EXP \\ Compute Exponent of Accumulator Content}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & Tx = exp(ACx) & Yes & 3 & 1 & X \\
\hline
\end{tabular}
Operands ACx, Tx

Description This instruction computes the exponent of the source accumulator ACx in the D-unit shifter. The result of the operation is stored in the temporary register Tx. The A-unit ALU is used to make the move operation.

This exponent is a signed 2s-complement value in the -8 to 31 range. The exponent is computed by calculating the number of leading bits in ACx and subtracting 8 from this value. The number of leading bits is the number of shifts to the MSBs needed to align the accumulator content on a signed 40-bit representation.

ACx is not modified after the execution of this instruction. If ACx is equal to 0, Tx is loaded with 0.

This instruction produces in Tx the opposite of the result computed by the Compute Mantissa and Exponent of Accumulator Content instruction (page 5-132).
\begin{tabular}{lll} 
Status Bits & Affected by none \\
& Affects & none
\end{tabular}

Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Compute Mantissa and Exponent of Accumulator Content

\section*{Example}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Syntax & \multicolumn{7}{|l|}{Description} \\
\hline T1 = exp(AC0) & \multicolumn{7}{|l|}{The exponent is computed by subtracting 8 from the number of leading bits in the content of AC0. The exponent value is a signed 2s-complement value in the -8 to 31 range and is stored in T1.} \\
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 & FF & FFFF & FFCB & AC0 & FF & FFFF & FFCB \\
\hline T1 & & & 0000 & T1 & & & 0019 \\
\hline
\end{tabular}
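
The exponent rule can be checked with a short Python sketch. It counts the left shifts needed before the accumulator occupies the full signed 40-bit range (the redundant sign bits), then subtracts 8. The function name `acc_exp` is illustrative, and the result is returned as a plain signed integer rather than as a 16-bit register image.

```python
def acc_exp(acx):
    """Return the value that Tx = exp(ACx) loads, as a signed integer."""
    acx &= (1 << 40) - 1
    if acx == 0:
        return 0                              # special case stated in the description
    sign = acx >> 39
    shifts = 0
    for bit in range(38, -1, -1):             # bits below bit 39 that repeat the sign
        if (acx >> bit) & 1 != sign:
            break
        shifts += 1
    return shifts - 8
```

For AC0 = FF FFFF FFCBh, 33 bits below bit 39 repeat the sign, so `acc_exp(0xFFFFFFFFCB)` returns 25 (0019h), as in the example above.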

\section*{MANT::NEXP \\ Compute Mantissa and Exponent of Accumulator Content}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & ACy = mant(ACx), Tx = -exp(ACx) & Yes & 3 & 1 & X2 \\
\hline
\end{tabular}

Opcode

Operands ACx, ACy, Tx

Description This instruction computes the exponent and mantissa of the source accumulator ACx. The computation of the exponent and the mantissa is executed in the D-unit shifter. The exponent is computed and stored in the temporary register Tx. The A-unit is used to make the move operation. The mantissa is stored in the accumulator ACy.

The exponent is a signed 2s-complement value in the -31 to 8 range. The exponent is computed by calculating the number of leading bits in ACx and subtracting this value from 8. The number of leading bits is the number of shifts to the MSBs needed to align the accumulator content on a signed 40-bit representation.

The mantissa is obtained by aligning the ACx content on a signed 32-bit representation. The mantissa is computed and stored in ACy.
- The shift operation is performed on 40 bits.
  - When shifting to the LSBs, bit 39 of ACx is extended to bit 31.
  - When shifting to the MSBs, 0 is inserted at bit position 0.
- If ACx is equal to 0, Tx is loaded with 8000h.

This instruction produces in Tx the opposite of the result computed by the Compute Exponent of Accumulator Content instruction (page 5-131).

Status Bits
Affected by none
Affects none
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Compute Exponent of Accumulator Content

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = mant(AC0), T1 = -exp(AC0) & The exponent is computed by subtracting the number of leading bits in the content of AC0 from 8. The exponent value is a signed 2s-complement value in the -31 to 8 range and is stored in T1. The mantissa is computed by aligning the content of AC0 on a signed 32-bit representation. The mantissa value is stored in AC1. \\
\hline
\end{tabular}
\begin{tabular}{lrrrlrrr}
\multicolumn{4}{l}{Before} & \multicolumn{4}{l}{After} \\
AC0 & 21 & 0A0A & 0A0A & AC0 & 21 & 0A0A & 0A0A \\
AC1 & FF & FFFF & F001 & AC1 & 00 & 4214 & 1414 \\
T1 & & & 0000 & T1 & & & 0007
\end{tabular}

\section*{Example 2}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline AC1 = mant(AC0), T1 = -exp(AC0) & The exponent is computed by subtracting the number of leading bits in the content of AC0 from 8. The exponent value is a signed 2s-complement value in the -31 to 8 range and is stored in T1. The mantissa is computed by aligning the content of AC0 on a signed 32-bit representation. The mantissa value is stored in AC1. \\
\hline Before & After \\
\hline AC0 00 E804 0000 & AC0 00 E804 0000 \\
\hline AC1 FF FFFF F001 & AC1 00 7402 0000 \\
\hline T1 0000 & T1 0001 \\
\hline
\end{tabular}
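
Both examples above can be reproduced with a short Python sketch of the mantissa and exponent computation. It is a model under assumptions, not the device behavior in every corner: the mantissa returned for ACx = 0 is assumed to be 0, since the text only specifies Tx = 8000h for that case. The names `sext` and `mant_nexp` are illustrative.

```python
MASK40 = (1 << 40) - 1


def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mant_nexp(acx):
    """Model ACy = mant(ACx), Tx = -exp(ACx); returns (ACy, Tx) as register images."""
    acx &= MASK40
    if acx == 0:
        return 0, 0x8000                      # Tx = 8000h; ACy = 0 is an assumption
    sign = acx >> 39
    leading = 0
    for bit in range(38, -1, -1):             # shifts needed for a signed 40-bit alignment
        if (acx >> bit) & 1 != sign:
            break
        leading += 1
    nexp = 8 - leading                        # value stored in Tx, range -31 to 8
    value = sext(acx, 40)
    # Positive nexp: shift to the LSBs with sign extension; negative: shift to the MSBs.
    mant = value >> nexp if nexp >= 0 else value << -nexp
    return mant & MASK40, nexp & 0xFFFF
```

`mant_nexp(0x210A0A0A0A)` returns (00 4214 1414h, 0007h) and `mant_nexp(0x00E8040000)` returns (00 7402 0000h, 0001h), matching Example 1 and Example 2.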

\section*{BCNT}

Count Accumulator Bits

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & Tx = count(ACx, ACy, TC1) & Yes & 3 & 1 & X \\
{\([2]\)} & Tx = count(ACx, ACy, TC2) & Yes & 3 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Opcode & TC1 & 0001 & 000E & xxSS & 1010 & SSdd & xxx0 \\
\hline & TC2 & 0001 & 000E & xxSS & 1010 & SSdd & xxx1 \\
\hline
\end{tabular}
Operands ACx, ACy, Tx, TCx

Description This instruction performs bit field manipulation in the D-unit shifter. The result is stored in the selected temporary register (Tx). The A-unit ALU is used to make the move operation.

Accumulator ACx is ANDed with accumulator ACy. The number of bits set to 1 in the intermediary result is evaluated and stored in the selected temporary register (Tx). If the number of bits is even, the selected TCx status bit is cleared to 0. If the number of bits is odd, the selected TCx status bit is set to 1.
\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & TCx
\end{tabular}

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline T1 = count(AC1, AC2, TC1) & \begin{tabular}{l} 
The content of AC1 is ANDed with the content of AC2, the number of bits \\
set to 1 in the result is evaluated and stored in T1. The number of bits set \\
to 1 is odd, TC1 is set to 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC1 & 7E & 2355 & 4FC0 & AC1 & 7E & 2355 & 4FC0 \\
\hline AC2 & 0F & E340 & 5678 & AC2 & 0F & E340 & 5678 \\
\hline T1 & & & 0000 & T1 & & & 000B \\
\hline TC1 & & & 0 & TC1 & & & 1 \\
\hline
\end{tabular}
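
The count operation is a population count of the ANDed accumulators, with TCx holding the parity. A minimal Python sketch, where the function name `count_bits` is illustrative:

```python
def count_bits(acx, acy):
    """Model Tx = count(ACx, ACy, TCx); returns (Tx, TCx)."""
    masked = acx & acy & ((1 << 40) - 1)      # intermediary 40-bit AND result
    ones = bin(masked).count("1")             # number of bits set to 1
    return ones, ones & 1                     # TCx is 1 when the count is odd
```

For the example above, `count_bits(0x7E23554FC0, 0x0FE3405678)` returns (11, 1), that is T1 = 000Bh and TC1 = 1.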

\section*{ADD}

Dual 16-Bit Additions

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & \begin{tabular}{l} 
HI(ACy) = HI(Lmem) + HI(ACx), \\
LO(ACy) = LO(Lmem) + LO(ACx)
\end{tabular} & No & 3 & 1 & X \\
{\([2]\)} & \begin{tabular}{l} 
HI(ACx) = HI(Lmem) + Tx, \\
LO(ACx) = LO(Lmem) + Tx
\end{tabular} & No & 3 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline \multirow[t]{2}{*}{Description} & These instructions perform two paralleled addition operations in one cycle. \\
\hline & The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath). \\
\hline \multirow[t]{2}{*}{Status Bits} & Affected by C54CM, SATD, SXMD \\
\hline & Affects ACOVx, ACOVy, CARRY \\
\hline \multirow[t]{8}{*}{See Also} & See the following other related instructions: \\
\hline & \(\square\) Addition \\
\hline & - Addition or Subtraction Conditionally \\
\hline & \(\square\) Addition or Subtraction Conditionally with Shift \\
\hline & - Addition with Parallel Store Accumulator Content to Memory \\
\hline & \(\square\) Addition, Subtraction, or Move Accumulator Content Conditionally \\
\hline & \(\square\) Dual 16-Bit Addition and Subtraction \\
\hline & \(\square\) Dual 16-Bit Subtraction and Addition \\
\hline
\end{tabular}

Dual 16-Bit Additions

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & \begin{tabular}{l} 
HI(ACy) = HI(Lmem) + HI(ACx), \\
LO(ACy) = LO(Lmem) + LO(ACx)
\end{tabular} & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode

Operands ACx, ACy, Lmem

Description This instruction performs two paralleled addition operations in one cycle. The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).
- The data memory operand dbl(Lmem) is divided into two 16-bit parts:
  - the lower part is used as one of the 16-bit operands of the ALU low part
  - the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
- The data memory operand dbl(Lmem) addresses are aligned:
  - if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
  - if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
- For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVy) is set.
  - For the operations performed in the ALU low part, overflow is detected at bit position 15.
  - For the operations performed in the ALU high part, overflow is detected at bit position 31.
- For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
\begin{tabular}{lll}
Status Bits & Affected by & C16, C54CM, SATD, SXMD \\
& Affects & ACOVy, CARRY
\end{tabular}

Repeat This instruction can be repeated.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
HI(AC0) = HI(*AR3) + HI(AC1), \\
LO(AC0) = LO(*AR3) + LO(AC1)
\end{tabular} & Both instructions are performed in parallel. When the Lmem address is even (AR3 = even): The content of AC1(39-16) is added to the content addressed by AR3 and the result is stored in AC0(39-16). The content of AC1(15-0) is added to the content addressed by AR3 + 1 and the result is stored in AC0(15-0). \\
\hline
\end{tabular}
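
The dbl(Lmem) alignment rule stated in the description can be written as a small helper. This is only a sketch of the address selection; the name `dbl_word_addresses` is illustrative.

```python
def dbl_word_addresses(lmem):
    """Return (most significant word address, least significant word address)."""
    if lmem % 2 == 0:
        return lmem, lmem + 1                 # even address: LSW follows the MSW
    return lmem, lmem - 1                     # odd address: LSW precedes the MSW
```

In both cases the least significant word sits at the Lmem address with bit 0 toggled, so `dbl_word_addresses(0x0200)` gives (0200h, 0201h) and `dbl_word_addresses(0x0201)` gives (0201h, 0200h).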

Dual 16-Bit Additions
Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2]\) & \begin{tabular}{l} 
HI \((A C x)=\) HI(Lmem \()+\) Tx, \\
LO(ACx \()=\) LO (Lmem \()+\) Tx
\end{tabular} & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode
| 1110 1110 | AAAA AAAI | ssDD 100x

Operands ACx, Lmem, Tx

Description This instruction performs two paralleled addition operations in one cycle. The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).
- The temporary register Tx:
  - is used as one of the 16-bit operands of the ALU low part
  - is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
- The data memory operand dbl(Lmem) is divided into two 16-bit parts:
  - the lower part is used as one of the 16-bit operands of the ALU low part
  - the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
- The data memory operand dbl(Lmem) addresses are aligned:
  - if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
  - if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
- For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.
  - For the operations performed in the ALU low part, overflow is detected at bit position 15.
  - For the operations performed in the ALU high part, overflow is detected at bit position 31.
- For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
- Independently on each data path, if SATD = 1 when an overflow is detected on the data path, a saturation is performed:
  - For the operations performed in the ALU low part, saturation values are 7FFFh and 8000h.
  - For the operations performed in the ALU high part, saturation values are 00 7FFFh and FF 8000h.

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, this instruction is executed as if SATD is locally cleared to 0 . Overflow is only detected and reported for the computation performed in the higher 24-bit datapath (overflow is detected at bit position 31).
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, SATD, SXMD \\
& Affects & ACOVx, CARRY
\end{tabular}

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
HI(AC0) = HI(*AR3) + T0, \\
LO(AC0) = LO(*AR3) + T0
\end{tabular} & Both instructions are performed in parallel. When the Lmem address is even (AR3 = even): The content of T0 is added to the content addressed by AR3 and the result is stored in AC0(39-16). The duplicated content of T0 is added to the content addressed by AR3 + 1 and the result is stored in AC0(15-0). \\
\hline
\end{tabular}
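
The dual addition with Tx can be sketched in Python as follows. The model assumes C54CM = 0 and SXMD = 1 (operands sign extended), does not model CARRY, and takes the two memory words as already fetched. The operand values used below are illustrative, not taken from the manual, and `sext`, `add_sat16`, and `dual_add_tx` are hypothetical names.

```python
def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def add_sat16(a, b, satd):
    """Return (a + b, overflow), checked against the signed 16-bit range."""
    total = a + b
    overflow = not -0x8000 <= total <= 0x7FFF
    if overflow and satd:
        total = 0x7FFF if total > 0 else -0x8000
    return total, overflow


def dual_add_tx(mem_hi, mem_lo, tx, satd=1):
    """Model HI(ACx) = HI(Lmem) + Tx, LO(ACx) = LO(Lmem) + Tx; returns (ACx, ACOVx)."""
    t = sext(tx, 16)                          # Tx is duplicated for both data paths
    hi, ovf_hi = add_sat16(sext(mem_hi, 16), t, satd)
    lo, ovf_lo = add_sat16(sext(mem_lo, 16), t, satd)
    # The high result is held on 24 bits (16 data bits plus 8 guard bits).
    return ((hi & 0xFFFFFF) << 16) | (lo & 0xFFFF), ovf_hi or ovf_lo
```

For instance, `dual_add_tx(0x7000, 0x0001, 0x2000)` overflows in the high path only and, with SATD = 1, returns 00 7FFF 2001h with the overflow flag set.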

\section*{ADDSUB \\ Dual 16-Bit Addition and Subtraction}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
HI(ACx) = Smem + Tx, \\
LO(ACx) = Smem - Tx
\end{tabular} & No & 3 & 1 & X \\
\hline [2] & \begin{tabular}{l}
HI(ACx) = HI(Lmem) + Tx, \\
LO(ACx) = LO(Lmem) - Tx
\end{tabular} & No & 3 & 1 & X \\
\hline
\end{tabular}
Description These instructions perform two paralleled addition and subtraction operations in one cycle.

The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

Status Bits
Affected by C54CM, SATD, SXMD
Affects ACOVx, ACOVy, CARRY
See Also See the following other related instructions:
- Addition
- Dual 16-Bit Additions
- Dual 16-Bit Subtractions
- Dual 16-Bit Subtraction and Addition
- Subtraction

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & \begin{tabular}{l} 
HI(ACx) = Smem + Tx, \\
LO(ACx) = Smem - Tx
\end{tabular} & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode

Operands ACx, Smem, Tx

Description This instruction performs two paralleled arithmetical operations in one cycle: an addition and subtraction. The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).
- The data memory operand Smem:
  - is used as one of the 16-bit operands of the ALU low part
  - is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
- The temporary register Tx:
  - is used as one of the 16-bit operands of the ALU low part
  - is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
- For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.
  - For the operations performed in the ALU low part, overflow is detected at bit position 15.
  - For the operations performed in the ALU high part, overflow is detected at bit position 31.
- For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
- Independently on each data path, if SATD = 1 when an overflow is detected on the data path, a saturation is performed:
  - For the operations performed in the ALU low part, saturation values are 7FFFh and 8000h.
  - For the operations performed in the ALU high part, saturation values are 00 7FFFh and FF 8000h.
\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, this instruction is executed as if SATD is locally cleared to 0. Overflow is only detected and reported for the computation performed in the higher 24-bit datapath (overflow is detected at bit position 31).
\begin{tabular}{|c|c|c|}
\hline \multirow[t]{2}{*}{Status Bits} & Affected by & C54CM, SATD, SXMD \\
\hline & Affects & ACOVx, CARRY \\
\hline Repeat & \multicolumn{2}{|l|}{This instruction can be repeated.} \\
\hline \multicolumn{3}{|l|}{Example} \\
\hline Syntax & \multicolumn{2}{|l|}{Description} \\
\hline \[
\begin{aligned}
& \mathrm{HI}(\mathrm{AC} 1)={ }^{*} \mathrm{AR} 1+\mathrm{T} 1, \\
& \mathrm{LO}(\mathrm{AC} 1)={ }^{*} \mathrm{AR} 1-\mathrm{T} 1
\end{aligned}
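\] & \multicolumn{2}{|l|}{Both instructions are performed in parallel. The content addressed by AR1 is added to the content of T1 and the result is stored in AC1(39-16). The duplicated content of T1 is subtracted from the duplicated content addressed by AR1 and the result is stored in AC1(15-0).} \\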
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC1 & 00 & 2300 & 0000 & AC1 & 00 & 2300 & A300 \\
\hline T1 & & & 4000 & T1 & & & 4000 \\
\hline AR1 & & & 0201 & AR1 & & & 0201 \\
\hline 201 & & & E300 & 201 & & & E300 \\
\hline SXMD & & & 1 & SXMD & & & 1 \\
\hline M40 & & & 1 & M40 & & & 1 \\
\hline ACOV0 & & & 0 & ACOV0 & & & 0 \\
\hline CARRY & & & 0 & CARRY & & & 1 \\
\hline
\end{tabular}
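
The Before/After values above can be reproduced with a small behavioral model. The following Python sketch is not TI code: it assumes SXMD = 1, models CARRY as the carry out of the unsigned 16-bit addition in the high half, and uses hypothetical helper names (`sext16`, `saturate16`, `addsub_smem`).

```python
def sext16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def saturate16(value, satd):
    """Return (value, overflow); clamp to 7FFFh/8000h only when SATD = 1."""
    overflow = not -0x8000 <= value <= 0x7FFF
    if overflow and satd:
        value = 0x7FFF if value > 0 else -0x8000
    return value, overflow


def addsub_smem(smem, tx, satd=0):
    """Model of HI(ACx) = Smem + Tx, LO(ACx) = Smem - Tx (SXMD = 1 assumed)."""
    hi, ov_hi = saturate16(sext16(smem) + sext16(tx), satd)
    lo, ov_lo = saturate16(sext16(smem) - sext16(tx), satd)
    # Assumption: CARRY is the carry out of bit 31, i.e. of the 16-bit high add.
    carry = ((smem & 0xFFFF) + (tx & 0xFFFF)) >> 16
    # The high path is 24 bits wide (8 guard bits + 16 bits), the low path 16.
    acx = ((hi & 0xFFFFFF) << 16) | (lo & 0xFFFF)
    return acx, int(ov_hi or ov_lo), carry


acx, acov, carry = addsub_smem(0xE300, 0x4000)
print(f"AC1={acx:010X} ACOV={acov} CARRY={carry}")
```

With the operands from the example (E300h in memory, 4000h in T1), the model yields 00 2300 A300 in the accumulator with CARRY = 1 and no overflow, matching the After column.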

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2]\) & \begin{tabular}{l} 
HI \((A C x)=\) HI (Lmem \()+\) Tx \\
LO \((A C x)=\) LO (Lmem \()-\) Tx
\end{tabular} & No & 3 & 1 & X \\
& & & & & \\
\hline
\end{tabular}

Opcode
| 1110 1110 | AAAA AAAI | ssDD 110x |

Operands
ACx, Lmem, Tx

Description
This instruction performs two paralleled arithmetical operations in one cycle: an addition and subtraction. The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).
\(\square\) The temporary register Tx:
■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
\(\square\) The data memory operand dbl(Lmem) is divided into two 16-bit parts:
■ the lower part is used as one of the 16-bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
\(\square\) The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
■ if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
\(\square\) For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.
\begin{tabular}{lll} 
Status Bits & Affected by & C16, C54CM, SATD, SXMD \\
& Affects & ACOVx, CARRY
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l} 
\(\mathrm{HI}(\mathrm{AC0})=\mathrm{HI}\left({ }^{*} \mathrm{AR3}\right)+\mathrm{T0}\), \\
\(\mathrm{LO}(\mathrm{AC0})=\mathrm{LO}\left({ }^{*} \mathrm{AR3}\right)-\mathrm{T0}\)
\end{tabular} & \begin{tabular}{l} 
Both instructions are performed in parallel. When the Lmem address is even \\
(AR3 = even): The content of T0 is added to the content addressed by AR3 and \\
the result is stored in AC0(39-16). The duplicated content of T0 is subtracted from \\
the content addressed by AR3 + 1 and the result is stored in AC0(15-0).
\end{tabular} \\
\hline
\end{tabular}

\section*{SUB}

Dual 16-Bit Subtractions

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable bit & Size & Cycles & Pipeline \\
\hline [1] & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACy})=\mathrm{HI}(\mathrm{ACx})-\mathrm{HI}(\text { Lmem }), \\
& \mathrm{LO}(A C y)=\mathrm{LO}(A C x)-\mathrm{LO}(\text { Lmem })
\end{aligned}
\] & No & 3 & 1 & X \\
\hline [2] & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACy})=\mathrm{HI}(\text { Lmem })-\mathrm{HI}(\mathrm{ACx}), \\
& \mathrm{LO}(\mathrm{ACy})=\mathrm{LO}(\text { Lmem })-\mathrm{LO}(\mathrm{ACx})
\end{aligned}
\] & No & 3 & 1 & X \\
\hline [3] & \[
\begin{aligned}
& \text { HI(ACx) }=\mathrm{Tx}-\mathrm{HI}(\text { Lmem }), \\
& \text { LO(ACx) }=\text { Tx }- \text { LO(Lmem) }
\end{aligned}
\] & No & 3 & 1 & X \\
\hline [4] & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACx})=\mathrm{HI}(\text { Lmem })-\mathrm{Tx}, \\
& \mathrm{LO}(\mathrm{ACx})=\mathrm{LO}(\text { Lmem })-\mathrm{Tx}
\end{aligned}
\] & No & 3 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{ll} 
Description & \begin{tabular}{l} 
These instructions perform two paralleled subtraction operations in one cycle. \\
The operations are executed on 40 bits in the D-unit ALU that is configured \\
locally in dual 16-bit mode. The 16 lower bits of both the ALU and the \\
accumulator are separated from their higher 24 bits (the 8 guard bits are \\
attached to the higher 16-bit datapath).
\end{tabular} \\
Status Bits & Affected by C54CM, SATD, SXMD \\
 & Affects ACOVx, ACOVy, CARRY \\
See Also & See the following other related instructions: \\
 & Addition or Subtraction Conditionally
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & \begin{tabular}{l} 
HI \((A C y)=\operatorname{HI}(A C x)-\operatorname{HI}(\) Lmem \()\), \\
LO \((A C y)=\) LO(ACx \()-\) LO \((\) Lmem \()\)
\end{tabular} & No & 3 & 1 & \(X\) \\
\hline
\end{tabular}

\section*{Opcode}
| 1110 1110 | AAAA AAAI | SSDD 001x |

\section*{Operands}

ACx, ACy, Lmem

\section*{Description}

This instruction performs two paralleled subtraction operations in one cycle. The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).
\(\square\) The data memory operand dbl(Lmem) is divided into two 16-bit parts:
■ the lower part is used as one of the 16-bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
\(\square\) The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
■ if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
\(\square\) For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVy) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.
\(\square\) For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
\begin{tabular}{lll} 
Status Bits & Affected by & C16, C54CM, SATD, SXMD \\
& Affects & ACOVy, CARRY
\end{tabular}
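
The dbl(Lmem) alignment rule stated above can be written out as a short sketch. This is an illustrative Python model only; the function names and the dictionary used as data memory are assumptions, not part of any TI tool.

```python
def dbl_word_addresses(lmem_addr):
    """Return (msw_addr, lsw_addr) for a dbl(Lmem) access.

    Even address: MSW at Lmem, LSW at Lmem + 1.
    Odd address:  MSW at Lmem, LSW at Lmem - 1.
    """
    if lmem_addr % 2 == 0:
        return lmem_addr, lmem_addr + 1
    return lmem_addr, lmem_addr - 1


def read_dbl(memory, lmem_addr):
    """Fetch the (high, low) 16-bit halves used by the ALU high and low parts."""
    msw_addr, lsw_addr = dbl_word_addresses(lmem_addr)
    return memory[msw_addr] & 0xFFFF, memory[lsw_addr] & 0xFFFF


# Hypothetical memory contents, chosen only for illustration.
memory = {0x0300: 0x1234, 0x0301: 0x5678}
print([hex(w) for w in read_dbl(memory, 0x0300)])
print([hex(w) for w in read_dbl(memory, 0x0301)])
```

Note that under this rule the same word pair is read for either address, but the roles of the two words swap when the address is odd.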

\section*{Example}

\section*{Syntax}
\(\mathrm{HI}(\mathrm{AC} 0)=\mathrm{HI}(\mathrm{AC} 1)-\mathrm{HI}\left({ }^{*}\right.\) AR3 \()\), \(\mathrm{LO}(\mathrm{AC} 0)=\mathrm{LO}(\mathrm{AC} 1)-\mathrm{LO}\left({ }^{*} \mathrm{AR} 3\right)\)

\section*{Description}

Both instructions are performed in parallel. When the Lmem address is even (AR3 = even): The content addressed by AR3 (sign extended to 24 bits) is subtracted from the content of AC1(39-16) and the result is stored in AC0(39-16). The content addressed by AR3 + 1 is subtracted from the content of AC1(15-0) and the result is stored in AC0(15-0).

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2]\) & \begin{tabular}{l} 
HI \((A C y)=\operatorname{HI}(\operatorname{Lmem})-\operatorname{HI}(A C x)\), \\
LO \((A C y)=\) LO \((L m e m)-\operatorname{LO}(A C x)\)
\end{tabular} & No & 3 & 1 & \(X\) \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

ACx, ACy, Lmem

\section*{Description}

This instruction performs two paralleled subtraction operations in one cycle. The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).
\(\square\) The data memory operand dbl(Lmem) is divided into two 16-bit parts:
■ the lower part is used as one of the 16-bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
\(\square\) The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
■ if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
\(\square\) For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVy) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.
\(\square\) For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
\begin{tabular}{|c|c|}
\hline Status Bits & Affected by C16, C54CM, SATD, SXMD \\
\hline & Affects ACOVy, CARRY \\
\hline Repeat & This instruction can be repeated. \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline \[
\begin{aligned}
& \mathrm{HI}(\mathrm{AC0})=\mathrm{HI}\left({ }^{*} \mathrm{AR3}\right)-\mathrm{HI}(\mathrm{AC1}), \\
& \mathrm{LO}(\mathrm{AC0})=\mathrm{LO}\left({ }^{*} \mathrm{AR3}\right)-\mathrm{LO}(\mathrm{AC1})
\end{aligned}
\] & Both instructions are performed in parallel. When the Lmem address is even (AR3 = even): The content of AC1(39-16) is subtracted from the content addressed by AR3 and the result is stored in AC0(39-16). The content of AC1(15-0) is subtracted from the content addressed by AR3 + 1 and the result is stored in AC0(15-0). \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([3]\) & \begin{tabular}{l} 
HI \((A C x)=T x-H I(L m e m)\), \\
LO(ACx \()=\) Tx - LO(Lmem \()\)
\end{tabular} & No & 3 & 1 & X \\
\hline
\end{tabular}
\section*{Opcode}
| 1110 1110 | AAAA AAAI | ssDD 011x |

\section*{Operands}

ACx, Lmem, Tx

\section*{Description}

This instruction performs two paralleled subtraction operations in one cycle. The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).
\(\square\) The temporary register Tx:
■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
\(\square\) The data memory operand dbl(Lmem) is divided into two 16-bit parts:
■ the lower part is used as one of the 16-bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
\(\square\) The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
■ if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
\(\square\) For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.
\(\square\) For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
\(\square\) Independently on each data path, if SATD = 1 when an overflow is detected on the data path, a saturation is performed:
■ For the operations performed in the ALU low part, saturation values are 7FFFh and 8000h.
■ For the operations performed in the ALU high part, saturation values are 00 7FFFh and FF 8000h.

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, this instruction is executed as if SATD is locally cleared to 0. Overflow is only detected and reported for the computation performed in the higher 24-bit datapath (overflow is detected at bit position 31).
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, SATD, SXMD \\
 & Affects & ACOVx, CARRY
\end{tabular}
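
The interaction of SATD and C54CM described above can be illustrated with a behavioral sketch of instruction [3]. This is a hedged Python model, not a definitive implementation: it assumes SXMD = 1, ignores CARRY and C16, and the function names are hypothetical.

```python
def sext16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sub_tx_lmem(tx, lmem_hi, lmem_lo, satd=0, c54cm=0):
    """Model of HI(ACx) = Tx - HI(Lmem), LO(ACx) = Tx - LO(Lmem)."""
    if c54cm:
        satd = 0  # executed as if SATD were locally cleared to 0
    hi = sext16(tx) - sext16(lmem_hi)
    lo = sext16(tx) - sext16(lmem_lo)
    ov_hi = not -0x8000 <= hi <= 0x7FFF   # overflow at bit position 31
    ov_lo = not -0x8000 <= lo <= 0x7FFF   # overflow at bit position 15
    if satd:
        # Each data path saturates independently.
        hi = max(-0x8000, min(0x7FFF, hi))  # 00 7FFFh / FF 8000h
        lo = max(-0x8000, min(0x7FFF, lo))  # 7FFFh / 8000h
    # With C54CM = 1 only the high data path reports overflow.
    acov = int(ov_hi) if c54cm else int(ov_hi or ov_lo)
    acx = ((hi & 0xFFFFFF) << 16) | (lo & 0xFFFF)
    return acx, acov


for satd, c54cm in ((0, 0), (1, 0), (1, 1)):
    acx, acov = sub_tx_lmem(0x7FFF, 0x8000, 0x0001, satd, c54cm)
    print(f"SATD={satd} C54CM={c54cm}: ACx={acx:010X} ACOVx={acov}")
```

In this model, 7FFFh minus 8000h overflows the high path: with SATD = 1 the high part saturates to 00 7FFFh, whereas with C54CM = 1 the unsaturated result 00 FFFFh is kept even though SATD is set.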

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l} 
\(\mathrm{HI}(\mathrm{AC0})=\mathrm{T0}-\mathrm{HI}\left({ }^{*} \mathrm{AR3}\right)\), \\
\(\mathrm{LO}(\mathrm{AC0})=\mathrm{T0}-\mathrm{LO}\left({ }^{*} \mathrm{AR3}\right)\)
\end{tabular} & \begin{tabular}{l} 
Both instructions are performed in parallel. When the Lmem address is even \\
(AR3 = even): The content addressed by AR3 is subtracted from the \\
content of T0 and the result is stored in AC0(39-16). The content addressed \\
by AR3 + 1 is subtracted from the duplicated content of T0 and the result \\
is stored in AC0(15-0).
\end{tabular} \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([4]\) & \begin{tabular}{l} 
HI \((A C x)=\) HI (Lmem \()-T x\), \\
\(L O(A C x)=\) LO (Lmem \()-T x\)
\end{tabular} & No & 3 & 1 & X \\
& & & & & \\
\hline
\end{tabular}

Opcode
| 1110 1110 | AAAA AAAI | ssDD 101x |

Operands
ACx, Tx, Lmem

Description
This instruction performs two paralleled subtraction operations in one cycle. The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).
\(\square\) The temporary register Tx:
■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
\(\square\) The data memory operand dbl(Lmem) is divided into two 16-bit parts:
■ the lower part is used as one of the 16-bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
\(\square\) The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
■ if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
\(\square\) For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.

\(\square\) For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.
\(\square\) Independently on each data path, if SATD = 1 when an overflow is detected on the data path, a saturation is performed:
■ For the operations performed in the ALU low part, saturation values are 7FFFh and 8000h.
■ For the operations performed in the ALU high part, saturation values are 00 7FFFh and FF 8000h.

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, this instruction is executed as if SATD is locally cleared to 0. Overflow is only detected and reported for the computation performed in the higher 24-bit datapath (overflow is detected at bit position 31).

\(\square\) When C54CM = 1 and C16 = 1, the instruction behaves like a dual 16-bit instruction and the carry is not propagated at bit 15 in the D-unit ALU.
\(\square\) When C54CM = 1 and C16 = 0, the instruction behaves like a single arithmetic instruction and the carry is propagated at bit 15 in the D-unit ALU.

Status Bits

Repeat
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l} 
\(\mathrm{HI}(\mathrm{AC0})=\mathrm{HI}\left({ }^{*} \mathrm{AR3}\right)-\mathrm{T0}\), \\
\(\mathrm{LO}(\mathrm{AC0})=\mathrm{LO}\left({ }^{*} \mathrm{AR3}\right)-\mathrm{T0}\)
\end{tabular} & \begin{tabular}{l} 
Both instructions are performed in parallel. When the Lmem address is even \\
(AR3 = even): The content of T0 is subtracted from the content addressed \\
by AR3 and the result is stored in AC0(39-16). The duplicated content of \\
T0 is subtracted from the content addressed by AR3 + 1 and the result is \\
stored in AC0(15-0).
\end{tabular} \\
\hline
\end{tabular}

\section*{SUBADD \\ Dual 16-Bit Subtraction and Addition}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & \begin{tabular}{l} 
HI(ACx \()=\) Smem - Tx, \\
LO(ACx \()=\) Smem + Tx
\end{tabular} & No & 3 & 1 & X \\
{\([2]\)} & \begin{tabular}{l} 
HI(ACx) \(=\) HI(Lmem \()-\) Tx, \\
LO(ACx \()=\) LO(Lmem \()+\) Tx
\end{tabular} & No & 3 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform two paralleled subtraction and addition operations in one cycle.

The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).

Status Bits
Affected by C54CM, SATD, SXMD
Affects ACOVx, ACOVy, CARRY
See Also See the following other related instructions:
\(\square\) Addition
\(\square\) Dual 16-Bit Additions
\(\square\) Dual 16-Bit Addition and Subtraction
\(\square\) Dual 16-Bit Subtractions
\(\square\) Subtraction

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & \begin{tabular}{l} 
HI \((A C x)=\) Smem - Tx \\
LO(ACx \()=\) Smem + Tx
\end{tabular} & No & 3 & 1 & X \\
& & & & & \\
\hline
\end{tabular}

Opcode
| 1101 1110 | AAAA AAAI | ssDD 1001 |

Operands
ACx, Smem, Tx

Description
This instruction performs two paralleled arithmetical operations in one cycle: a subtraction and addition. The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).
\(\square\) The data memory operand Smem:
■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
\(\square\) The temporary register Tx:
■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
\(\square\) For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.
\(\square\) For all instructions, the carry of the operation performed in the ALU high part is reported in the CARRY status bit. The CARRY status bit is always extracted at bit position 31.


\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2]\) & \begin{tabular}{l} 
HI \((A C x)=\) HI (Lmem \()-T x\), \\
LO \((A C x)=\) LO \((\) Lmem \()+\) Tx
\end{tabular} & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode

Operands
ACx, Lmem, Tx

Description
This instruction performs two paralleled arithmetical operations in one cycle: a subtraction and addition. The operations are executed on 40 bits in the D-unit ALU that is configured locally in dual 16-bit mode. The 16 lower bits of both the ALU and the accumulator are separated from their higher 24 bits (the 8 guard bits are attached to the higher 16-bit datapath).
\(\square\) The temporary register Tx:
■ is used as one of the 16-bit operands of the ALU low part
■ is duplicated and, according to SXMD, sign extended to 24 bits to be used in the ALU high part
\(\square\) The data memory operand dbl(Lmem) is divided into two 16-bit parts:
■ the lower part is used as one of the 16-bit operands of the ALU low part
■ the higher part is sign extended to 24 bits according to SXMD and is used in the ALU high part
\(\square\) The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word = Lmem + 1
■ if Lmem address is odd: most significant word = Lmem, least significant word = Lmem - 1
\(\square\) For each of the two computations performed in the ALU, an overflow detection is made. If an overflow is detected on any of the data paths, the destination accumulator overflow status bit (ACOVx) is set.
■ For the operations performed in the ALU low part, overflow is detected at bit position 15.
■ For the operations performed in the ALU high part, overflow is detected at bit position 31.
\begin{tabular}{lll} 
Status Bits & Affected by & C16, C54CM, SATD, SXMD \\
& Affects & ACOVx, CARRY
\end{tabular}
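
One way to read "overflow is detected at bit position 15" and "at bit position 31" is as a sign-extension check on each data path. The Python sketch below follows that interpretation, which is an assumption rather than something the text states: the low path keeps one extra bit (bit 16) and compares it with bit 15, and the 24-bit high path checks that accumulator bits 39 to 31 are all equal. The function names are hypothetical.

```python
def overflow_at_bit15(a, b, subtract=False):
    """Low path: a and b are signed 16-bit values; the result is kept on 17 bits."""
    result = (a - b if subtract else a + b) & 0x1FFFF
    return ((result >> 16) & 1) != ((result >> 15) & 1)


def overflow_at_bit31(a, b, subtract=False):
    """High path: 24-bit result that occupies accumulator bits 39-16."""
    result = (a - b if subtract else a + b) & 0xFFFFFF
    top = result >> 15  # path bits 23..15, i.e. accumulator bits 39..31
    return top not in (0, 0x1FF)


# HI(ACx) = HI(Lmem) - Tx, LO(ACx) = LO(Lmem) + Tx with illustrative operands.
hi_mem, lo_mem, tx = -0x8000, 0x7FFF, 0x0001
acov = overflow_at_bit31(hi_mem, tx, subtract=True) or overflow_at_bit15(lo_mem, tx)
print("ACOVx =", int(acov))
```

For signed 16-bit operands both checks agree with a plain range test against -8000h to 7FFFh, which is the form used when only the numeric result matters.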

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l} 
\(\mathrm{HI}(\mathrm{AC0})=\mathrm{HI}\left({ }^{*} \mathrm{AR3}\right)-\mathrm{T0}\), \\
\(\mathrm{LO}(\mathrm{AC0})=\mathrm{LO}\left({ }^{*} \mathrm{AR3}\right)+\mathrm{T0}\)
\end{tabular} & \begin{tabular}{l} 
Both instructions are performed in parallel. When the Lmem address is even \\
(AR3 = even): The content of T0 is subtracted from the content addressed \\
by AR3 and the result is stored in AC0(39-16). The duplicated content of \\
T0 is added to the content addressed by AR3 + 1 and the result is stored \\
in AC0(15-0).
\end{tabular} \\
\hline
\end{tabular}

\section*{Execute Conditionally}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & if (cond) execute(AD_Unit) & No & 2 & 1 & AD \\
{\([2]\)} & if (cond) execute(D_Unit) & No & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Description}

These instructions evaluate a single condition defined by the cond field and allow you to control execution of all operations implied by the instruction or part of the instruction. See Table 1-3 for a list of conditions.

Instruction [1] allows you to control the entire execution flow from the address phase to the execute phase of the pipeline. Instruction [2] allows you to control the execution flow only from the execute phase of the pipeline. The use of a label, where control of the execute conditionally instruction ends, is optional.
\(\square\) These instructions may be executed alone.
\(\square\) These instructions may be executed with two paralleled instructions.
\(\square\) These instructions may be executed with the instruction with which they are paralleled.
\(\square\) These instructions may be executed with the previous instruction.
\(\square\) These instructions may be executed with the previous instruction and two paralleled instructions.
\(\square\) These instructions cannot be repeated.
\(\square\) These instructions cannot be used as the last instruction in a repeat loop structure.
\(\square\) These instructions cannot control the execution of the following program control instructions:
\begin{tabular}{|l|l|l|l|l|}
\hline goto & (cond) goto & intr & blockrepeat & return_int \\
\hline call & (cond) call & idle & (cond) execute(AD_unit) & \\
\hline return & (cond) return & reset & (cond) execute(D_unit) & \\
\hline trap & localrepeat & repeat & while (cond) repeat & \\
\hline
\end{tabular}

\section*{Status Bits}

Affected by ACOVx, CARRY, C54CM, M40, TCx
Affects ACOVx

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & if (cond) execute(AD_Unit) & No & 2 & 1 & AD \\
\hline
\end{tabular}

\section*{Opcode}
\begin{tabular}{|l|l|}
\hline 1001 0110 & 0CCC CCCC \\
\hline 1001 1110 & 0CCC CCCC \\
\hline 1001 1111 & 0CCC CCCC \\
\hline
\end{tabular}

The assembler selects the opcode depending on the instruction position in a paralleled pair.

\section*{Operands}

cond

\section*{Description}

This instruction evaluates a single condition defined by the cond field and allows you to control the execution flow of an instruction, or instructions, from the address phase to the execute phase of the pipeline. See Table 1-3 for a list of conditions.

When this instruction moves into the address phase of the pipeline, the condition specified in the cond field is evaluated. If the tested condition is true, the conditional instruction(s) is read and executed; if the tested condition is false, the conditional instruction(s) is not read and program control is passed to the instruction following the conditional instruction(s) or to the program address defined by label. There is a 3-cycle latency for the condition testing.
\(\square\) This instruction may be executed alone:
```
if(cond) execute(AD_unit)
instruction_executes_conditionally
label:
```
\(\square\) This instruction may be executed with two paralleled instructions:
```
if(cond) execute(AD_unit)
instruction_1_executes_conditionally
|| instruction_2_executes_conditionally
label:
```
\(\square\) This instruction may be executed with the instruction with which it is paralleled:
```
if(cond) execute(AD_unit)
|| instruction_executes_conditionally
label:
```
\(\square\) This instruction may be executed with a previous instruction:
```
previous_instruction
|| if(cond) execute(AD_unit)
instruction_executes_conditionally
label:
```
\(\square\) This instruction may be executed with a previous instruction and two paralleled instructions:
```
previous_instruction
|| if(cond) execute(AD_unit)
instruction_1_executes_conditionally
|| instruction_2_executes_conditionally
label:
```

This instruction cannot be used as the last instruction in a repeat loop structure.

This instruction cannot control the execution of the following program control instructions:
\begin{tabular}{|l|l|l|l|l|}
\hline goto & (cond) goto & intr & blockrepeat & return_int \\
\hline call & (cond) call & idle & (cond) execute(AD_unit) & \\
\hline return & (cond) return & reset & (cond) execute(D_unit) & \\
\hline trap & localrepeat & repeat & while (cond) repeat & \\
\hline
\end{tabular}

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, the comparison of accumulators to 0 is performed as if M40 was set to 1.

Status Bits
Affected by ACOVx, CARRY, C54CM, M40, TCx
Affects ACOVx
Repeat
This instruction cannot be repeated.
Example 1
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l} 
if (TC1) execute(AD_unit) \\
mar(*AR1+) \\
AC1 = AC1 + *AR1
\end{tabular} & \begin{tabular}{l} 
TC1 is equal to 1, the next instruction is executed (AR1 is incremented by 1). \\
The content of AC1 is added to the content addressed by AR1 + 1 (2021h) and \\
the result is stored in AC1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC1 & 00 & 0000 & 4300 & AC1 & 00 & 0000 & 6321 \\
\hline TC1 & & & 1 & TC1 & & & 1 \\
\hline CARRY & & & 1 & CARRY & & & 0 \\
\hline AR1 & & & 0200 & AR1 & & & 0201 \\
\hline 200 & & & 2020 & 200 & & & 2020 \\
\hline 201 & & & 2021 & 201 & & & 2021 \\
\hline
\end{tabular}

\section*{Example 2}



\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2]\) & if (cond) execute(D_Unit) & No & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
\begin{tabular}{|l|l|}
\hline 1001 0110 & 1CCC CCCC \\
\hline 1001 1110 & 1CCC CCCC \\
\hline 1001 1111 & 1CCC CCCC \\
\hline
\end{tabular}

The assembler selects the opcode depending on the instruction position in a paralleled pair.

\section*{Operands}

cond

\section*{Description}
This instruction evaluates a single condition defined by the cond field and allows you to control the execution flow of an instruction, or instructions, from the execute phase of the pipeline. This instruction differs from instruction [1] because in this instruction operations performed in the address phase are always executed. See Table 1-3 for a list of conditions.
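
The opcodes listed for instructions [1] and [2] differ only in the most significant bit of the second byte (0 for AD_unit, 1 for D_unit), followed by a 7-bit condition field. A minimal decoding sketch in Python is given below; the function name is hypothetical, the condition field is returned as an opaque number because its encoding (Table 1-3) is not reproduced here, and the mapping from first byte to position in a paralleled pair is deliberately left out since the text does not state it.

```python
# First opcode bytes listed for the execute conditionally instructions:
# 1001 0110, 1001 1110 and 1001 1111.
XCC_FIRST_BYTES = {0x96, 0x9E, 0x9F}


def decode_execute_conditionally(word):
    """Return (unit, cond_field) for a 16-bit opcode, or None if it does not match."""
    first, second = (word >> 8) & 0xFF, word & 0xFF
    if first not in XCC_FIRST_BYTES:
        return None
    unit = "D_unit" if second & 0x80 else "AD_unit"
    return unit, second & 0x7F  # 7-bit CCC CCCC field, left uninterpreted


print(decode_execute_conditionally(0x9605))
print(decode_execute_conditionally(0x9F85))
```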

When this instruction moves into the execute phase of the pipeline, the condition specified in the cond field is evaluated. If the tested condition is true, the conditional instruction(s) is read and executed; if the tested condition is false, the conditional instruction(s) is not read and program control is passed to the instruction following the conditional instruction(s) or to the program address defined by label. There is a 0-cycle latency for the condition testing.
\(\square\) This instruction may be executed alone:
```
if(cond) execute(D_unit)
instruction_executes_conditionally
label:
```
\(\square\) This instruction may be executed with two paralleled instructions:
```
if(cond) execute(D_unit)
instruction_1_executes_conditionally
|| instruction_2_executes_conditionally
label:
```
\(\square\) This instruction may be executed with the instruction with which it is paralleled. When this instruction syntax is used and the instruction to be executed conditionally is a store-to-memory instruction, there is a 1-cycle latency for the condition setting.
```
if(cond) execute(D_unit)
|| instruction_executes_conditionally
label:
```
\(\square\) This instruction may be executed with a previous instruction:
```
previous_instruction
|| if(cond) execute(D_unit)
instruction_executes_conditionally
label:
```
\(\square\) This instruction may be executed with a previous instruction and two paralleled instructions:
```
previous_instruction
|| if(cond) execute(D_unit)
instruction_1_executes_conditionally
|| instruction_2_executes_conditionally
label:
```

This instruction cannot be used as the last instruction in a repeat loop structure.

When the instruction to be executed conditionally is an instruction to read data from memory, the data read operation is performed regardless of the condition and the read data is discarded at the execute phase if the condition is false.

This instruction cannot control the execution of the following program control instructions:
\begin{tabular}{|l|l|l|l|l|}
\hline goto & (cond) goto & intr & blockrepeat & return_int \\
\hline call & (cond) call & idle & (cond) execute(AD_unit) & \\
\hline return & (cond) return & reset & (cond) execute(D_unit) & \\
\hline trap & localrepeat & repeat & while (cond) repeat & \\
\hline
\end{tabular}
and an instruction to read data from I/O space.
Compatibility with C54x devices (C54CM = 1)
When C54CM = 1, the comparison of accumulators to 0 is performed as if M40 was set to 1.

Status Bits
Affected by ACOVx, CARRY, C54CM, M40, TCx
Affects ACOVx

Repeat
This instruction cannot be repeated.

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
if (TC1) execute(D_unit) \\
mar(*AR1+) \\
AC1 = AC1 + *AR1
\end{tabular} & \begin{tabular}{l}
TC1 is equal to 1, the next instruction is executed (AR1 is incremented by 1). \\
The content of AC1 is added to the content addressed by AR1 + 1 (2021h) and \\
the result is stored in AC1.
\end{tabular} \\
\hline
\end{tabular}


Example 2
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
if (TC1) execute(D_unit) \\
mar(*AR1+) \\
AC1 = AC1 + *AR1
\end{tabular} & \begin{tabular}{l}
TC1 is not equal to 1, the next instruction would not be executed; however, \\
since the next instruction is a pointer modification, AR1 is incremented by 1 \\
in the address phase. The content of AC1 is added to the content addressed
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrrlrr} 
Before & & After \\
AC1 & 00 & 0000 & 4300 & AC1 & 00000 \\
TC1 & & 0 & TC1 & & 0321 \\
CARRY & & 1 & CARRY & & 0 \\
AR1 & 0200 & AR1 & & 0201 \\
200 & 2020 & 200 & 2020 \\
201 & & 2021 & 201 & 2021
\end{tabular}
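
The two examples can be traced with a toy model. This is a sketch only: `run_pair` and `gate_address_phase` are hypothetical names, the pipeline is reduced to "address-phase effects" versus "execute-phase effects", and the register and memory values are those of the Before column. Passing `gate_address_phase=True` stands in for a form of conditional execution that also gates the address phase; treating instruction [1] that way is an assumption.

```python
def run_pair(tc1, gate_address_phase):
    """Model `if (TC1) execute(...)`, `mar(*AR1+)`, then `AC1 = AC1 + *AR1`."""
    mem = {0x200: 0x2020, 0x201: 0x2021}
    state = {"AR1": 0x200, "AC1": 0x4300}

    # Conditional instruction mar(*AR1+): its only effect is an
    # address-phase pointer update, so it runs unless that phase is gated.
    if tc1 or not gate_address_phase:
        state["AR1"] += 1

    # The following instruction is unconditional: AC1 = AC1 + *AR1.
    state["AC1"] = (state["AC1"] + mem[state["AR1"]]) & 0xFFFFFFFFFF
    return state


if __name__ == "__main__":
    for tc1 in (1, 0):
        result = run_pair(tc1, gate_address_phase=False)
        print(tc1, hex(result["AR1"]), hex(result["AC1"]))
```

In this model the D-unit form gives the same AR1 and AC1 for TC1 = 1 and TC1 = 0, because the pointer update is never gated.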

\section*{BFXPA}

Expand Accumulator Bit Field
Syntax Characteristics

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{8}{|l|}{Execution} \\
\hline \#k16 (8024h) & 1000 & 0000 & 0010 & 0100 & & & \\
\hline AC0 (15-0) & 0010 & 1011 & 0110 & 0101 & & & \\
\hline T2 & 1000 & 0000 & 0000 & 0100 & & & \\
\hline Before & & & After & & & & \\
\hline AC0 00 2300 & 2B65 & & AC0 & & 00 & 2300 & 2B65 \\
\hline T2 & 0000 & & T2 & & & & 8004 \\
\hline
\end{tabular}
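
The Execution table is consistent with reading the operation as a bit scatter: successive low-order bits of AC0(15-0) are placed, from the least significant end, into the positions where the 16-bit mask holds a 1. The sketch below is a minimal illustration of that reading; the function name `field_expand` and its argument names are chosen for illustration and are not taken from the manual.

```python
def field_expand(acc_low16, mask16):
    """Scatter the low bits of acc_low16 into the 1-positions of mask16."""
    result = 0
    src_bit = 0  # next accumulator bit to consume, starting at bit 0
    for pos in range(16):
        if (mask16 >> pos) & 1:
            if (acc_low16 >> src_bit) & 1:
                result |= 1 << pos
            src_bit += 1
    return result


if __name__ == "__main__":
    # Values from the Execution table: mask 8024h, AC0(15-0) = 2B65h.
    print(hex(field_expand(0x2B65, 0x8024)))
```

With the table's values, the three low bits of 2B65h (1, 0, 1) land in mask positions 2, 5, and 15, which gives the 8004h shown for T2.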

\section*{BFXTR}

\section*{Extract Accumulator Bit Field}

\section*{Syntax Characteristics}

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{8}{|l|}{Execution} \\
\hline \#k16 (8024h) & 1000 & 0000 & 0010 & 0100 & & & \\
\hline AC0 (15-0) & 0101 & 0101 & 1010 & 1010 & & & \\
\hline T2 & 0000 & 0000 & 0000 & 0010 & & & \\
\hline Before & & & After & & & & \\
\hline AC0 00 2300 & 55AA & & AC0 & & 00 & 2300 & 55AA \\
\hline T2 & 0000 & & T2 & & & & 0002 \\
\hline
\end{tabular}
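
The Execution table is consistent with reading the operation as a bit gather: the bits of AC0(15-0) found at the positions where the mask holds a 1 are packed together, starting at bit 0 of the result. The sketch below illustrates that reading only; `field_extract` and its argument names are illustrative and not taken from the manual.

```python
def field_extract(acc_low16, mask16):
    """Pack the bits of acc_low16 selected by mask16 into the low bits."""
    result = 0
    dst_bit = 0  # next result bit to fill, starting at bit 0
    for pos in range(16):
        if (mask16 >> pos) & 1:
            if (acc_low16 >> pos) & 1:
                result |= 1 << dst_bit
            dst_bit += 1
    return result


if __name__ == "__main__":
    # Values from the Execution table: mask 8024h, AC0(15-0) = 55AAh.
    print(hex(field_extract(0x55AA, 0x8024)))
```

For 55AAh, the bits at positions 2, 5, and 15 are 0, 1, and 0, so the packed result is 0002h, matching the T2 value in the table.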

\section*{FIRSSUB}

Finite Impulse Response Filter, Antisymmetrical

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline No. & \multicolumn{2}{|l|}{Syntax} & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline & \multicolumn{2}{|l|}{firsn(Xmem, Ymem, coef(Cmem), ACx, ACy)} & No & 4 & 1 & X \\
\hline \multicolumn{7}{|l|}{Opcode \quad 1000 0101 \(\mid\) XXXM} \\
\hline \multicolumn{7}{|l|}{Operands \quad ACx, ACy, Cmem, Xmem, Ymem} \\
\hline \multicolumn{2}{|l|}{\multirow[t]{2}{*}{Description}} & \multicolumn{5}{|l|}{This instruction performs two parallel operations: multiply and accumulate (MAC), and subtraction. The firsn() operation is executed:} \\
\hline & & \multicolumn{5}{|l|}{\[
\begin{aligned}
& \text{ACy} = \text{ACy} + (\text{ACx} * \text{Cmem}), \\
& \text{ACx} = (\text{Xmem} \ll \#16) - (\text{Ymem} \ll \#16)
\end{aligned}
\]} \\
\hline
\end{tabular}

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of ACx(32-16) and the content of a data memory operand Cmem, addressed using the coefficient addressing mode, sign extended to 17 bits.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

The second operation subtracts the content of data memory operand Ymem, shifted left 16 bits, from the content of data memory operand Xmem, shifted left 16 bits.
- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, no overflow detection, report, or saturation is done after the shifting operation.

Status Bits
Affected by C54CM, FRCT, M40, SATD, SMUL, SXMD
Affects ACOVx, ACOVy, CARRY

Repeat
This instruction can be repeated.

See Also
See the following other related instructions:
- Finite Impulse Response Filter, Symmetrical

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline firsn(*AR0, *AR1, coef(*CDP), AC0, AC1) & \begin{tabular}{l} 
The content of AC0(32-16) multiplied by the content addressed \\
by the coefficient data pointer register (CDP) is added to the \\
content of AC1 and the result is stored in AC1. The content \\
addressed by AR1 shifted left by 16 bits is subtracted from the \\
content addressed by AR0 shifted left by 16 bits and the result is \\
stored in AC0.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 & 00 & 6900 & 0000 & AC0 & 00 & 4500 & 0000 \\
\hline AC1 & 00 & 0023 & 0000 & AC1 & FF & D8ED & 3F00 \\
\hline *AR0 & & & 3400 & *AR0 & & & 3400 \\
\hline *AR1 & & & EF00 & *AR1 & & & EF00 \\
\hline *CDP & & & A067 & *CDP & & & A067 \\
\hline ACOV0 & & & 0 & ACOV0 & & & 0 \\
\hline ACOV1 & & & 0 & ACOV1 & & & 0 \\
\hline CARRY & & & 0 & CARRY & & & 0 \\
\hline FRCT & & & 0 & FRCT & & & 0 \\
\hline SXMD & & & 0 & SXMD & & & 0 \\
\hline
\end{tabular}
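
The Before and After values can be reproduced with a short behavioral sketch. It is not a model of the hardware: overflow detection and saturation are omitted, the helper names `sext` and `firsn` are illustrative, and the handling of the ALU result assumes M40 = 0 with bits 31-0 kept and bit 31 copied into the guard bits. That last point is an assumption that happens to match the table.

```python
MASK40 = (1 << 40) - 1


def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def firsn(acx, acy, xmem, ymem, cmem, frct=0, sxmd=0):
    # MAC: ACx(32-16) times Cmem, both taken as signed values.
    product = sext(acx >> 16, 17) * sext(cmem, 16)
    if frct:
        product <<= 1
    new_acy = (sext(acy, 40) + product) & MASK40

    # ALU: (Xmem << 16) - (Ymem << 16); SXMD selects sign or zero extension.
    x = sext(xmem, 16) if sxmd else xmem & 0xFFFF
    y = sext(ymem, 16) if sxmd else ymem & 0xFFFF
    diff = (x << 16) - (y << 16)
    # Assumption (M40 = 0): keep bits 31-0, copy bit 31 into the guard bits.
    new_acx = sext(diff, 32) & MASK40
    return new_acx, new_acy


if __name__ == "__main__":
    acx, acy = firsn(0x0069000000, 0x0000230000, 0x3400, 0xEF00, 0xA067)
    print(f"AC0 = {acx:010X}, AC1 = {acy:010X}")
```

The MAC part works out as 6900h times A067h (a negative coefficient), giving a product of -2735C100h; adding it to 00 0023 0000h yields FF D8ED 3F00h.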

\section*{FIRSADD}

Finite Impulse Response Filter, Symmetrical

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline No. & \multicolumn{2}{|l|}{Syntax} & & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \multicolumn{2}{|l|}{firs(Xmem, Ymem, coef(Cmem), ACx, ACy)} & & No & 4 & 1 & X \\
\hline Opcode & & 10000101 & XXXM & MMYY \(\mid\) YM & M 11 & mm \({ }^{\text {D }}\) & D DU\% \\
\hline Operands & & \multicolumn{6}{|l|}{ACx, ACy, Cmem, Xmem, Ymem} \\
\hline \multicolumn{2}{|l|}{Description} & \multicolumn{6}{|l|}{This instruction performs two parallel operations: multiply and accumulate (MAC), and addition. The firs() operation is executed:} \\
\hline & & \multicolumn{6}{|l|}{\[
\begin{aligned}
& \text{ACy} = \text{ACy} + (\text{ACx} * \text{Cmem}), \\
& \text{ACx} = (\text{Xmem} \ll \#16) + (\text{Ymem} \ll \#16)
\end{aligned}
\]} \\
\hline
\end{tabular}

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of ACx(32-16) and the content of a data memory operand Cmem, addressed using the coefficient addressing mode, sign extended to 17 bits.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

The second operation adds the content of data memory operand Xmem, shifted left 16 bits, to the content of data memory operand Ymem, shifted left 16 bits.
- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, no overflow detection, report, or saturation is done after the shifting operation.
\begin{tabular}{lll}
Status Bits & Affected by & C54CM, FRCT, M40, SATD, SMUL, SXMD \\
 & Affects & ACOVx, ACOVy, CARRY
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline firs(*AR0, *AR1, coef(*CDP), AC0, AC1) & \begin{tabular}{l}
The content of AC0(32-16) multiplied by the content addressed by \\
the coefficient data pointer register (CDP) is added to the content of \\
AC1 and the result is stored in AC1. The content addressed by AR0 \\
shifted left by 16 bits is added to the content addressed by AR1 \\
shifted left by 16 bits and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 & 00 & 6900 & 0000 & AC0 & 00 & 2300 & 0000 \\
\hline AC1 & 00 & 0023 & 0000 & AC1 & FF & D8ED & 3F00 \\
\hline *AR0 & & & 3400 & *AR0 & & & 3400 \\
\hline *AR1 & & & EF00 & *AR1 & & & EF00 \\
\hline *CDP & & & A067 & *CDP & & & A067 \\
\hline ACOV0 & & & 0 & ACOV0 & & & 0 \\
\hline ACOV1 & & & 0 & ACOV1 & & & 0 \\
\hline CARRY & & & 0 & CARRY & & & 1 \\
\hline FRCT & & & 0 & FRCT & & & 0 \\
\hline SXMD & & & 0 & SXMD & & & 0 \\
\hline
\end{tabular}
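
The AC0 and CARRY values in the After column follow from the addition alone. The sketch below covers only that second operation and rests on an assumption: with M40 = 0 the carry is taken out of bit 31, bits 31-0 are kept, and bit 31 is copied into the guard bits. The function name `firs_add` is illustrative, and overflow detection and saturation are not modeled.

```python
def firs_add(xmem, ymem, sxmd=0):
    """Return (40-bit ACx, CARRY) for (Xmem << 16) + (Ymem << 16)."""
    def extend(word):
        word &= 0xFFFF
        # SXMD = 1 sign extends the 16-bit operand, SXMD = 0 zero extends it.
        return word - 0x10000 if sxmd and word & 0x8000 else word

    x = (extend(xmem) << 16) & 0xFFFFFFFF
    y = (extend(ymem) << 16) & 0xFFFFFFFF
    total = x + y
    carry = (total >> 32) & 1          # assumed: carry out of bit 31 (M40 = 0)
    low32 = total & 0xFFFFFFFF
    guard = 0xFF if low32 & 0x80000000 else 0x00
    return (guard << 32) | low32, carry


if __name__ == "__main__":
    acx, carry = firs_add(0x3400, 0xEF00)
    print(f"AC0 = {acx:010X}, CARRY = {carry}")
```

With SXMD = 0 the operands are 3400 0000h and EF00 0000h; their sum is 1 2300 0000h, so bits 31-0 hold 2300 0000h and the carry out of bit 31 is 1.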

\section*{IDLE}

Syntax Characteristics


\section*{LMS}

\section*{Least Mean Square}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & lms(Xmem, Ymem, ACx, ACy) & No & 4 & 1 & X \\
\hline Opcode & \multicolumn{5}{l}{1000 0110 \(\mid\) XXXM MMYY \(\mid\) YMMM DDDD} \\
\hline
\end{tabular}

\section*{Operands}

ACx, ACy, Xmem, Ymem

Description

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC), and addition. The instruction is executed:
```

ACy = ACy + (Xmem * Ymem),
ACx = rnd(ACx + (Xmem << #16))

```

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, sign extended to 17 bits, and the content of data memory operand Ymem, sign extended to 17 bits.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

The second operation performs an addition between an accumulator content and the content of data memory operand Xmem shifted left by 16 bits.
- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40.
- Rounding is performed according to RDM.


\section*{Syntax Characteristics}

```

ACx = T3 * (Ymem)
ACy = ACy + (Xmem) * (Ymem)
Xmem = HI(rnd(ACx + (Xmem) << #16))

```

The first operation performs a multiplication in D-unit MAC1. The input operands of the multiplier are the content of data register T3 and the content of data memory operand Ymem. The implied T3 operand is sign extended to 17 bits in the MAC1. The data memory operand Ymem is addressed by DAGEN path Y by using Ymem addressing mode, driven on the CDB bus, and sign extended to 17 bits in the MAC1.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

The second operation performs a multiplication and an addition in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Xmem and the content of data memory operand Ymem. The data memory operand Xmem is addressed by DAGEN path X by using Xmem addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2. The other data memory operand Ymem is addressed by DAGEN path Y by using the Ymem addressing mode, driven on data bus CDB, and sign extended to 17 bits in the MAC2.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

The third operation performs an addition between an accumulator content and the content of data memory operand Xmem in the D-unit ALU. The data memory operand Xmem is driven on the DDB bus as described in the second operation above, sign extended to 40 bits according to SXMD, shifted to the left by 16 bits, and supplied to the D-unit ALU.
- The shift operation is identical to the arithmetic shift instruction. Therefore, overflow detection, report, and saturation are done after the shifting operation.
- Overflow and CARRY detection operate as if M40 were locally set to 0.
- Addition overflow is always detected at bit position 31.
- Addition carry report in the CARRY status bit is always extracted at bit position 31.
- Rounding is always performed on the result of the addition. The rounding operation depends on the RDM status value.
- When RDM is 0, biased rounding toward infinity is performed: \(2^{15}\) is added to the 40-bit result of the accumulation.
- When RDM is 1, unbiased rounding to the nearest is performed. According to the value of the 17 LSBs of the 40-bit result of the accumulation, \(2^{15}\) is added as described by the following pseudo code.
```
if (2^15 < bit(15-0) < 2^16)
    add 2^15 to the 40-bit result of the accumulation
else if (bit(15-0) == 2^15)
    if (bit(16) == 1)
        add 2^15 to the 40-bit result of the accumulation
```
- When an overflow is detected on the result of the rounding, the accumulator is saturated according to SATD. Note that no overflow detection is performed on the intermediate result after the addition but before the rounding.
- If an overflow resulting from the shift, or the addition/rounding, is detected, then the accumulator 0 overflow status bit (ACOV0) is set. (In the exceptional case, even if the result of the addition overflows, the rounding operation may suppress the overflow report.)
- When an overflow is detected, the result is saturated according to SATD before being stored in memory. Saturation values are 7FFFh or 8000h.
- The result of the third operation, the high part of ACx, is stored into the data memory location addressed by Xmem via the Ebus.
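
The two rounding modes can be compared with a small sketch that follows the pseudo code for RDM = 1 and the "add \(2^{15}\)" rule for RDM = 0. The name `hi_rnd` is illustrative; the function returns only bits 31-16 of the rounded value, which is the part stored by HI(), and it does not model overflow detection or saturation.

```python
MASK40 = (1 << 40) - 1


def hi_rnd(acc40, rdm):
    """Round a 40-bit value at bit 16 and return bits 31-16 of the result."""
    low16 = acc40 & 0xFFFF
    if rdm == 0:
        acc40 += 1 << 15                      # biased: always add 2^15
    elif low16 > 0x8000:
        acc40 += 1 << 15                      # above halfway: round up
    elif low16 == 0x8000 and (acc40 >> 16) & 1:
        acc40 += 1 << 15                      # exact tie: round up if bit 16 is 1
    return ((acc40 & MASK40) >> 16) & 0xFFFF


if __name__ == "__main__":
    for value in (0x003EFE8000, 0x003EFF8000):
        print(f"{value:010X}: RDM=0 -> {hi_rnd(value, 0):04X}, "
              f"RDM=1 -> {hi_rnd(value, 1):04X}")
```

The two modes differ only on exact ties (low 16 bits equal to 8000h) where bit 16 is 0: RDM = 0 rounds up, RDM = 1 leaves the high part unchanged.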

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with M40 = 0 and C54CM = 1, compatibility is ensured because the instruction follows the implementation of the lms instruction:
- The rounding is performed without clearing the 16 lowest bits of ACx.
- The addition operation has no overflow detection, report, or saturation after the shifting operation.

Status Bits
Affected by C54CM, FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy, ACOV0, CARRY

Repeat
This instruction can be repeated.
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
lmsf(*AR2-, *AR3+, AC0, AC1); \\
SXM=1, FRCT=1; \\
assuming 4KW bank DARAM
\end{tabular} & \begin{tabular}{l}
The product of the content addressed by AR2 and the content addressed by \\
AR3 is added to the content of AC1 and the result is stored in AC1. The \\
content addressed by AR2, shifted to the left by 16 bits, is added to the content \\
of AC0. The result is rounded and stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

Execution
(T3[16:0] * (Ymem)[16:0]) -> ACx[39:0]
ACy[39:0] + ((Xmem)[16:0] * (Ymem)[16:0]) -> ACy[39:0]
HI(rnd(ACx[39:0] + ((Xmem) << #16))) -> Xmem
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & 00 & 3FFF & 8000 & AC0 & 00 & 0200 & 0000 \\
\hline AC1 & 00 & 0000 & 8000 & AC1 & 00 & 0004 & 8000 \\
\hline T3 & & & 8000 & T3 & & & 8000 \\
\hline XAR2 & & 00 & 30FF & XAR2 & & 00 & 30FE \\
\hline XAR3 & & 00 & 2000 & XAR3 & & 00 & 2001 \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 2000h & & & FE00 & 2000h & & & FE00 \\
\hline 30FFh & & & FF00 & 30FFh & & & 3F00 \\
\hline
\end{tabular}
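
The three results in the table can be checked with a behavioral sketch of the Execution lines. It is illustrative only: `lmsf` and `sext` are hypothetical helper names, overflow detection and saturation are left out, and the rounding step follows the RDM rules described earlier in this entry. The value stored to memory is computed from the ACx content before the instruction, as the Before and After columns imply.

```python
MASK40 = (1 << 40) - 1


def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def lmsf(acx, acy, t3, xmem, ymem, frct=1, sxmd=1, rdm=0):
    """Return (new ACx, new ACy, word stored back to Xmem)."""
    p1 = sext(t3, 16) * sext(ymem, 16)        # MAC1: T3 * Ymem
    p2 = sext(xmem, 16) * sext(ymem, 16)      # MAC2: Xmem * Ymem
    if frct:
        p1, p2 = p1 << 1, p2 << 1
    new_acx = p1 & MASK40
    new_acy = (sext(acy, 40) + p2) & MASK40

    # ALU: old ACx + (Xmem << 16), then rounding at bit 16.
    x = sext(xmem, 16) if sxmd else xmem & 0xFFFF
    total = (sext(acx, 40) + (x << 16)) & MASK40
    low16 = total & 0xFFFF
    tie_up = low16 == 0x8000 and (total >> 16) & 1
    if rdm == 0 or low16 > 0x8000 or tie_up:
        total += 1 << 15
    stored = (total >> 16) & 0xFFFF           # HI(): bits 31-16
    return new_acx, new_acy, stored


if __name__ == "__main__":
    # Xmem = *AR2 (30FFh holds FF00h), Ymem = *AR3 (2000h holds FE00h).
    acx, acy, stored = lmsf(0x003FFF8000, 0x0000008000, 0x8000, 0xFF00, 0xFE00)
    print(f"AC0 = {acx:010X}, AC1 = {acy:010X}, [30FFh] = {stored:04X}")
```

For these inputs the stored word is the same under RDM = 0 and RDM = 1, because the pre-rounding sum 00 3EFF 8000h is an exact tie with bit 16 set.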

\section*{.LR}

Linear Addressing Qualifier

\section*{Syntax Characteristics}


\section*{MOV}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & ACx = rnd(Smem << Tx) & No & 3 & 1 & X \\
\hline [2] & ACx = low_byte(Smem) << \#SHIFTW & No & 3 & 1 & X \\
\hline [3] & ACx = high_byte(Smem) << \#SHIFTW & No & 3 & 1 & X \\
\hline [4] & ACx = Smem << \#16 & No & 2 & 1 & X \\
\hline [5] & ACx = uns(Smem) & No & 3 & 1 & X \\
\hline [6] & ACx = uns(Smem) << \#SHIFTW & No & 4 & 1 & X \\
\hline [7] & ACx = M40(dbl(Lmem)) & No & 3 & 1 & X \\
\hline [8] & \begin{tabular}{l}
LO(ACx) = Xmem, \\
HI(ACx) = Ymem
\end{tabular} & No & 3 & 1 & X \\
\hline
\end{tabular}

Description

These instructions load a 16-bit signed constant, K16, the content of a memory (Smem) location, the content of a data memory operand (Lmem), or the content of dual data memory operands (Xmem and Ymem) to a selected accumulator (ACx).

Status Bits

Affected by C54CM, M40, RDM, SATD, SXMD
Affects ACOVx

See Also

See the following other related instructions:
- Load Accumulator from Memory with Parallel Store Accumulator Content to Memory
- Load Accumulator Pair from Memory
- Load Accumulator with Immediate Value
- Load Accumulator, Auxiliary, or Temporary Register from Memory
- Load Accumulator, Auxiliary, or Temporary Register with Immediate Value
- Load Auxiliary or Temporary Register Pair from Memory
- Multiply and Accumulate with Parallel Load Accumulator from Memory
- Multiply and Subtract with Parallel Load Accumulator from Memory

Load Accumulator from Memory

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & ACx = rnd(Smem << Tx) & No & 3 & 1 & X \\
\hline Opcode & \multicolumn{5}{|r|}{} \\
\hline Operands & \multicolumn{5}{|l|}{ACx, Smem, Tx} \\
\hline Description & \multicolumn{5}{|l|}{This instruction loads the content of a memory (Smem) location shifted by the content of Tx to the accumulator (ACx):} \\
\hline & \multicolumn{5}{|l|}{\begin{tabular}{l}
\(\square\) The input operand is sign extended to 40 bits according to SXMD. \\
\(\square\) The input operand is shifted by the 4-bit value in the D-unit shifter. The shift operation is equivalent to the signed shift instruction.
\end{tabular}} \\
\hline & \multicolumn{5}{|l|}{\(\square\) Rounding is performed in the D-unit shifter according to RDM, if the optional rnd keyword is applied to the input operand.} \\
\hline & \multicolumn{5}{|l|}{Compatibility with C54x devices (C54CM = 1)} \\
\hline & \multicolumn{5}{|l|}{When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, no overflow detection, report, or saturation is done after the shifting operation. The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.} \\
\hline \multirow[t]{2}{*}{Status Bits} & \multicolumn{5}{|l|}{Affected by C54CM, M40, RDM, SATD, SXMD} \\
\hline & \multicolumn{5}{|l|}{Affects ACOVx} \\
\hline Repeat & \multicolumn{5}{|l|}{This instruction can be repeated.} \\
\hline \multicolumn{6}{|l|}{Example} \\
\hline Syntax & \multicolumn{5}{|l|}{Description} \\
\hline AC0 = *AR3 << T0 & \multicolumn{5}{|l|}{AC0 is loaded with the content addressed by AR3 shifted by the content of T0.} \\
\hline
\end{tabular}
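
The C54CM = 1 rule for deriving the shift quantity from Tx can be written out as a small function. This is one reading of the text, not a statement of the hardware behavior: the 6 LSBs are taken as a signed value in -32 to +31, and the "modulo 16 operation" is interpreted as adding 16 to values in -32 to -17, which is the mapping that lands in the stated range of -16 to -1. The name `shift_quantity_c54cm` is illustrative.

```python
def shift_quantity_c54cm(tx):
    """Shift quantity used when C54CM = 1, derived from the 6 LSBs of Tx."""
    six = tx & 0x3F
    shift = six - 64 if six & 0x20 else six   # signed 6-bit value: -32..+31
    if -32 <= shift <= -17:
        shift += 16                           # assumed folding into -16..-1
    return shift


if __name__ == "__main__":
    for tx in (0x0005, 0x001F, 0x0020, 0x002F, 0x0030, 0xFFFF):
        print(f"Tx = {tx:04X}h -> shift {shift_quantity_c54cm(tx)}")
```

A positive result is a left shift and a negative result a right shift, following the signed shift convention the description refers to.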

Load Accumulator from Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & ACx = low_byte(Smem) << \#SHIFTW & No & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1110 0001 \(\mid\) AAAA AAAI \(\mid\) DDSH IFTW

\section*{Operands}

ACx, SHIFTW, Smem

Description

This instruction loads the low-byte content of a memory (Smem) location shifted by the 6-bit value, SHIFTW, to the accumulator (ACx):
- The content of the memory location is sign extended to 40 bits according to SXMD.
- The input operand is shifted by the 6-bit value in the D-unit shifter. The shift operation is equivalent to the signed shift instruction.
- In this instruction, Smem cannot reference a memory-mapped register (MMR). This instruction cannot access a byte within an MMR. If Smem is an MMR, the DSP sends a hardware bus-error interrupt (BERRINT) request to the CPU.

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, no overflow detection, report, or saturation is done after the shifting operation.

Status Bits
Affected by C54CM, M40, SATD, SXMD
Affects ACOVx

Repeat
This instruction can be repeated.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = low_byte(*AR3) << \#31 & \begin{tabular}{l}
The low-byte content addressed by AR3 is shifted left by 31 bits and \\
loaded into AC0.
\end{tabular} \\
\hline
\end{tabular}

Load Accumulator from Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [3] & ACx = high_byte(Smem) << \#SHIFTW & No & 3 & 1 & X \\
\hline Opcode & \multicolumn{5}{l}{1110 0010 \(\mid\) AAAA AAAI \(\mid\) DDSH IFTW} \\
\hline
\end{tabular}

Operands ACx, SHIFTW, Smem

Description This instruction loads the high-byte content of a memory (Smem) location shifted by the 6-bit value, SHIFTW, to the accumulator (ACx):
- The content of the memory location is sign extended to 40 bits according to SXMD.
- The input operand is shifted by the 6-bit value in the D-unit shifter. The shift operation is equivalent to the signed shift instruction.
- In this instruction, Smem cannot reference a memory-mapped register (MMR). This instruction cannot access a byte within an MMR. If Smem is an MMR, the DSP sends a hardware bus-error interrupt (BERRINT) request to the CPU.

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, no overflow detection, report, or saturation is done after the shifting operation.
\begin{tabular}{lll}
Status Bits & Affected by & C54CM, M40, SATD, SXMD \\
 & Affects & ACOVx
\end{tabular}

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = high_byte(*AR3) << \#31 & \begin{tabular}{l}
The high-byte content addressed by AR3 is shifted left by 31 bits and \\
loaded into AC0.
\end{tabular} \\
\hline
\end{tabular}
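
The low-byte and high-byte loads differ only in which half of the 16-bit memory word is selected before extension and shifting. The sketch below shows that data path in its simplest form. It is a simplification under stated assumptions: `load_byte` is a hypothetical name, SHIFTW is passed as an already-decoded signed shift count (negative meaning a right shift), and overflow detection and saturation are not modeled, so results that overflow the accumulator are simply truncated to 40 bits.

```python
MASK40 = (1 << 40) - 1


def load_byte(smem, high, shift, sxmd):
    """Load the low or high byte of a 16-bit word into a 40-bit accumulator."""
    byte = (smem >> 8) & 0xFF if high else smem & 0xFF
    # SXMD = 1 sign extends the byte, SXMD = 0 zero extends it.
    value = byte - 0x100 if sxmd and byte & 0x80 else byte
    value = value << shift if shift >= 0 else value >> -shift
    return value & MASK40


if __name__ == "__main__":
    word = 0x12F0
    print(f"{load_byte(word, False, 4, 1):010X}")   # low byte F0h, sign extended
    print(f"{load_byte(word, False, 4, 0):010X}")   # low byte F0h, zero extended
    print(f"{load_byte(word, True, 4, 1):010X}")    # high byte 12h
```

The first two calls show the effect of SXMD on a byte whose top bit is set; the third selects the other half of the same word.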

Load Accumulator from Memory

\section*{Syntax Characteristics}


Load Accumulator from Memory

\section*{Syntax Characteristics}


Load Accumulator from Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [6] & ACx = uns(Smem) << \#SHIFTW & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1111 1001 \(\mid\) AAAA AAAI \(\mid\) uxSH IFTW \(\mid\) xxDD 10xx

\section*{Operands}

ACx, SHIFTW, Smem

Description This instruction loads the content of a memory (Smem) location, shifted by the 6-bit value, SHIFTW, to the accumulator (ACx):
- The memory operand is extended to 40 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
- The input operand is shifted by the 6-bit value in the D-unit shifter. The shift operation is equivalent to the signed shift instruction.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, no overflow detection, report, or saturation is done after the shifting operation.
\begin{tabular}{lll}
Status Bits & Affected by & C54CM, M40, SATD, SXMD \\
 & Affects & ACOVx
\end{tabular}

Load Accumulator from Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [7] & ACx = M40(dbl(Lmem)) & No & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

ACx, Lmem

Description This instruction loads the content of data memory operand (Lmem) to the accumulator (ACx):
- The input operand is sign extended to 40 bits according to SXMD.
- The load operation in the accumulator uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
- Status bit M40 is locally set to 1, if the optional M40 keyword is applied to the input operand.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.
\begin{tabular}{lll}
Status Bits & Affected by & M40, SATD, SXMD \\
 & Affects & ACOVx
\end{tabular}

Repeat This instruction can be repeated.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline & \begin{tabular}{l}
The content (long word) addressed by AR3 and AR3 + 1 is loaded into AC0. \\
Because this instruction is a long-operand instruction, AR3 is decremented by 2 \\
after the execution.
\end{tabular} \\
\hline
\end{tabular}

Load Accumulator from Memory
Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [8] & \begin{tabular}{l}
LO(ACx) = Xmem, \\
HI(ACx) = Ymem
\end{tabular} & No & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1000 0001 \(\mid\) XXXM MMYY \(\mid\) YMMM 10DD

\section*{Operands}

ACx, Xmem, Ymem

Description This instruction performs a dual 16-bit load of accumulator high and low parts. The operation is executed in dual 16-bit mode; however, it is independent of the 40-bit D-unit ALU. The 16 lower bits of the accumulator are separated from the higher 24 bits and the 8 guard bits are attached to the higher 16-bit datapath.
- The data memory operand Xmem is loaded as a 16-bit operand to the destination accumulator (ACx) low part. According to SXMD, the data memory operand Ymem is sign extended to 24 bits and is loaded to the destination accumulator (ACx) high part.
- For the load operations in higher accumulator bits, overflow detection is performed at bit position 31. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- If SATD is 1 when an overflow is detected on the higher data path, a saturation is performed with saturation value of 00 7FFFh.

Compatibility with C54x devices (C54CM = 1)
When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, this instruction is executed as if SATD was locally cleared to 0.

\section*{Status Bits}

Repeat

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
LO(AC0) = *AR3, \\
HI(AC0) = *AR4
\end{tabular} & \begin{tabular}{l}
The content at the location addressed by AR4, sign extended to 24 bits, is loaded \\
into AC0(39-16) and the content at the location addressed by AR3 is loaded into \\
AC0(15-0).
\end{tabular} \\
\hline
\end{tabular}
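
The packing described in this example can be sketched as follows. The function name `dual_load` is illustrative. The sketch covers only the placement and extension of the two operands (Xmem into bits 15-0, Ymem extended to 24 bits into bits 39-16) and leaves out the overflow detection at bit 31 and the SATD saturation.

```python
def dual_load(xmem, ymem, sxmd):
    """Build the 40-bit accumulator from LO(ACx) = Xmem, HI(ACx) = Ymem."""
    low = xmem & 0xFFFF
    high = ymem & 0xFFFF
    if sxmd and high & 0x8000:
        high |= 0xFF0000            # sign extend Ymem to 24 bits
    return (high << 16) | low


if __name__ == "__main__":
    print(f"{dual_load(0x1234, 0x8001, 1):010X}")   # Ymem sign extended
    print(f"{dual_load(0x1234, 0x8001, 0):010X}")   # Ymem zero extended
```

The guard bits (39-32) follow the sign of Ymem only when SXMD = 1; with SXMD = 0 they are left at zero in this sketch.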

\section*{MOV::MOV}

Load Accumulator from Memory with Parallel Store Accumulator Content to Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
ACy = Xmem << \#16, \\
Ymem = HI(ACx << T2)
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1000 0111 \(\mid\) XXXM MMYY \(\mid\) YMMM SSDD \(\mid\) 110x xxxx

Operands

ACx, ACy, T2, Xmem, Ymem

Description

This instruction performs two operations in parallel: load and store.

The first operation loads the content of data memory operand Xmem shifted left by 16 bits to the accumulator ACy.
- The input operand is sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- The input operand is shifted left by 16 bits according to M40.

The second operation shifts the accumulator ACx by the content of T2 and stores ACx(31-16) to data memory operand Ymem. If the 16-bit value in T2 is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.
- The input operand is shifted in the D-unit shifter according to SXMD.
- After the shift, the high part of the accumulator, ACx(31-16), is stored to the memory location.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When this instruction is executed with C54CM = 1, the 6 LSBs of T2 are used to determine the shift quantity. The 6 LSBs of T2 define a shift quantity within -32 to +31. When the 16-bit value in T2 is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
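The shift-quantity rules for the store operation can be sketched in Python. This is a simplified model under stated assumptions, not the device definition: the names `shift_quantity` and `store_high` are hypothetical, T2 is passed as a signed integer, the accumulator is a 40-bit pattern treated as signed, and rounding (RDM), saturation (SATD, SST), and SXMD = 0 behavior are not modeled. The C54CM = 1 branch reads the "modulo 16 operation" as adding 16 to quantities between -32 and -17, which is an interpretation of the text.

```python
MASK40 = (1 << 40) - 1

def shift_quantity(t2, c54cm=0):
    """Return the effective shift count derived from the signed 16-bit value in T2."""
    if c54cm:
        q = t2 & 0x3F                 # keep only the 6 LSBs of T2
        if q >= 32:                   # reinterpret the 6 bits as signed: -32..+31
            q -= 64
        if -32 <= q <= -17:           # assumed reading of the "modulo 16" step
            q += 16                   # -32..-17 becomes -16..-1
        return q
    return max(-32, min(31, t2))      # C54CM = 0: saturate to -32..+31

def store_high(acx, t2, c54cm=0):
    """Shift a 40-bit accumulator by T2 and return bits 31-16, the value stored to Ymem."""
    value = acx & MASK40
    if value & (1 << 39):             # treat the accumulator as a signed 40-bit value
        value -= 1 << 40
    q = shift_quantity(t2, c54cm)
    shifted = value << q if q >= 0 else value >> -q   # arithmetic right shift when q < 0
    return ((shifted & MASK40) >> 16) & 0xFFFF
```

For instance, under these assumptions a T2 value of 100 is clamped to a shift of 31, while with C54CM = 1 a T2 value of -20 yields a shift of -4.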
- If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:
\[
\begin{aligned}
& \text { ACy = Xmem << \#16, } \\
& \text { Ymem = HI(saturate(uns(ACx << T2))) }
\end{aligned}
\]
- If the SST bit = 1 and the SXMD bit = 1, then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:
\[
\begin{aligned}
& \text { ACy = Xmem << \#16, } \\
& \text { Ymem = HI(saturate(ACx << T2)) }
\end{aligned}
\]

Status Bits Affected by C54CM, M40, RDM, SATD, SST, SXMD
Affects ACOVy

Repeat This instruction can be repeated.

\section*{See Also}

See the following other related instructions:
- Load Accumulator from Memory
- Load Accumulator Pair from Memory
- Load Accumulator with Immediate Value
- Load Accumulator, Auxiliary, or Temporary Register from Memory
- Load Accumulator, Auxiliary, or Temporary Register with Immediate Value

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC0 = *AR3 << \#16, \\
*AR4 = HI(AC1 << T2)
\end{tabular} & Both instructions are performed in parallel. The content addressed by AR3 shifted left by 16 bits is stored in AC0. The content of AC1 is shifted by the content of T2, and AC1(31-16) is stored at the address of AR4. \\
\hline
\end{tabular}

\section*{MOV}

\section*{Load Accumulator Pair from Memory}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & pair(HI(ACx)) = Lmem & No & 3 & 1 & X \\
{[2]} & pair(LO(ACx)) = Lmem & No & 3 & 1 & X \\
\hline
\end{tabular}

Description These instructions load the content of a data memory operand (Lmem) to the selected accumulator pair, ACx and AC(x+1).

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVx, ACOV(x+1)

\section*{See Also}

See the following other related instructions:
- Load Accumulator from Memory
- Load Accumulator from Memory with Parallel Store Accumulator Content to Memory
- Load Accumulator with Immediate Value
- Load Accumulator, Auxiliary, or Temporary Register from Memory
- Load Accumulator, Auxiliary, or Temporary Register with Immediate Value
- Load Auxiliary or Temporary Register Pair from Memory
- Multiply and Accumulate with Parallel Load Accumulator from Memory
- Multiply and Subtract with Parallel Load Accumulator from Memory

Load Accumulator Pair from Memory

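\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & pair(HI(ACx)) = Lmem & No & 3 & 1 & X \\
\hline
\end{tabular}

Operands ACx, Lmem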

Description This instruction loads the 16 MSBs of data memory operand (Lmem) to the 24 MSBs of the destination accumulator (ACx) and loads the 16 LSBs of the data memory operand (Lmem) to the 24 MSBs of the destination accumulator AC(x+1).
- The 16 MSBs and 16 LSBs of the source memory operand (Lmem) are sign extended to 24 bits and loaded into the 24 MSBs of the destination accumulators ACx and AC(x+1) according to SXMD.
- For the load operation in higher accumulator bits, overflow detection is performed at bit position 31. If an overflow is detected, the destination accumulator overflow status bit is set.
- The valid accumulator pairs are AC0/AC1 and AC2/AC3.
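The extension step described above can be sketched in Python. This is a minimal model, assuming that SXMD = 1 selects sign extension and SXMD = 0 selects zero extension; the function names `extend16_to_24` and `load_pair_hi` are hypothetical, and overflow detection and saturation are not modeled. Only the 24-bit fields written to bits 39-16 of each accumulator are returned.

```python
def extend16_to_24(word, sxmd=1):
    """Extend a 16-bit word to 24 bits (sign extension when SXMD = 1, assumed)."""
    word &= 0xFFFF
    if sxmd and (word & 0x8000):
        return word | 0xFF0000        # replicate the sign bit into bits 23-16
    return word

def load_pair_hi(lmem, sxmd=1):
    """Return the 24-bit fields loaded into ACx(39-16) and AC(x+1)(39-16)."""
    hi = (lmem >> 16) & 0xFFFF        # 16 MSBs of Lmem go to ACx
    lo = lmem & 0xFFFF                # 16 LSBs of Lmem go to AC(x+1)
    return extend16_to_24(hi, sxmd), extend16_to_24(lo, sxmd)
```

With the memory words 1234h and ABCDh used in this section's execution example, the sketch yields the fields 00 1234 and FF ABCD when SXMD = 1.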

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When C54CM \(=1\), overflow detection, report, and saturation are done after the operation.

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVx, ACOV(x+1)

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline pair(HI(AC2)) = *AR3+ & \begin{tabular}{l}
The 16 highest bits of the content at the location addressed by AR3 are loaded \\
into AC2(31-16). The 16 lowest bits of the content at the location addressed by \\
AR3 + 1 when the value in AR3 is even or AR3 - 1 when AR3 is odd are loaded \\
into AC3(31-16). AR3 is incremented by 2.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{8}{|l|}{Execution} \\
\hline \multicolumn{8}{|l|}{(Lmem[31:16]) -> ACx[39:16],} \\
\hline \multicolumn{8}{|l|}{(Lmem[15:0]) -> AC (x+1) [39:16]} \\
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC1 & & FFFF & 8000 & AC1 & 00 & 1234 & 8000 \\
\hline AC2 & 00 & 1234 & 1234 & AC2 & FF & ABCD & 8000 \\
\hline XAR3 & & 00 & 2000 & XAR3 & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 2000h & & & 1234 & 2000h & & & 1234 \\
\hline 2001h & & & ABCD & 2001h & & & ABCD \\
\hline
\end{tabular}

Load Accumulator Pair from Memory
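\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [2] & pair(LO(ACx)) = Lmem & No & 3 & 1 & X \\
\hline
\end{tabular}

Operands ACx, Lmem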

Description This instruction loads the 16 MSBs of data memory operand (Lmem) to the 16 LSBs of the destination accumulator (ACx) and loads the 16 LSBs of the data memory operand (Lmem) to the 16 LSBs of the destination accumulator AC(x+1).
- The 16 LSBs of the source accumulator ACx are sign extended to 24 bits and loaded into the 24 MSBs of the destination accumulator ACy according to SXMD.
- For the load operation in higher accumulator bits, overflow detection is performed at bit position 31. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected on the higher data path with SXMD = 0 and ACx[15] = 1, a saturation is performed with a saturation value of 00 7FFFh.
- The valid accumulator pairs are AC0/AC1 and AC2/AC3.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.

\section*{Status Bits}

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline pair(LO(AC0)) = *AR3 & \begin{tabular}{l}
The 16 highest bits of the content at the location addressed by AR3 are loaded \\
into AC0(15-0). The 16 lowest bits of the content at the location addressed by \\
AR3 + 1 when the value in AR3 is even or AR3 - 1 when AR3 is odd are loaded \\
into AC1(15-0).
\end{tabular} \\
\hline
\end{tabular}

\section*{Execution}

(Lmem[31:16]) -> ACx[15:0],
(Lmem[15:0]) -> AC(x+1)[15:0]


\section*{MOV}

Load Accumulator with Immediate Value

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & ACx = K16 << \#16 & No & 4 & 1 & X \\
{[2]} & ACx = K16 << \#SHFT & No & 4 & 1 & X \\
\hline
\end{tabular}
Description These instructions load a 16-bit signed constant, K16, to a selected accumulator (ACx).

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVx

See Also See the following other related instructions:
- Load Accumulator from Memory
- Load Accumulator from Memory with Parallel Store Accumulator Content to Memory
- Load Accumulator Pair from Memory
- Load Accumulator, Auxiliary, or Temporary Register from Memory
- Load Accumulator, Auxiliary, or Temporary Register with Immediate Value
- Load Auxiliary or Temporary Register Pair from Memory
- Multiply and Accumulate with Parallel Load Accumulator from Memory
- Multiply and Subtract with Parallel Load Accumulator from Memory

Load Accumulator with Immediate Value

\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & ACx = K16 << \#16 & No & 4 & 1 & X \\
\hline
\end{tabular}

Operands ACx, K16

Description This instruction loads the 16-bit signed constant, K16, shifted left by 16 bits to the accumulator (ACx):
- The 16-bit constant, K16, is sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- The input operand is shifted left by 16 bits according to M40.
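The extend-then-shift sequence above can be sketched as follows. This is an illustrative model only: the function name `load_k16_shifted` is hypothetical, SXMD = 0 is assumed to mean zero extension, and the M40-dependent overflow detection and saturation are not modeled. The result is the 40-bit accumulator bit pattern.

```python
MASK40 = (1 << 40) - 1

def load_k16_shifted(k16, shift=16, sxmd=1):
    """Return the 40-bit pattern for ACx = K16 << #shift (overflow handling omitted)."""
    value = k16 & 0xFFFF
    if sxmd and (value & 0x8000):     # sign extend the 16-bit constant
        value -= 1 << 16
    return (value << shift) & MASK40  # keep the 40 accumulator bits
```

Under these assumptions, loading the constant -2 with a 16-bit shift gives FF FFFE 0000h, and the same constant with a 15-bit shift gives FF FFFF 0000h.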

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, overflow detection, report, and saturation are done after the shifting operation.
Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVx

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = \#-2 << \#16 & AC0 is loaded with the signed 16-bit value (-2) shifted left by 16 bits. \\
\hline
\end{tabular}

Load Accumulator with Immediate Value

\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [2] & ACx = K16 << \#SHFT & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode | 0111 0101 | KKKK KKKK | KKKK KKKK | xxDD SHFT

Operands ACx, K16, SHFT

Description This instruction loads the 16-bit signed constant, K16, shifted left by the 4-bit value, SHFT, to the accumulator (ACx):
- The 16-bit constant, K16, is sign extended to 40 bits according to SXMD.
- The input operand is shifted by the 4-bit value in the D-unit shifter. The shift operation is equivalent to the signed shift instruction.
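As an illustration of the field layout in this instruction's Opcode entry (an 8-bit opcode, a 16-bit K field, then xxDD and SHFT), the following Python sketch packs the four bytes. Several points are assumptions rather than statements from this manual: the function name `encode_load_k16_shft` is hypothetical, DD is taken to be the accumulator number 0 to 3, the don't-care x bits are written as 0, and the bytes are emitted in the left-to-right order shown in the opcode.

```python
def encode_load_k16_shft(acc, k16, shft):
    """Pack ACx = K16 << #SHFT as 0111 0101 | K16 high | K16 low | xxDD SHFT."""
    if not 0 <= acc <= 3:
        raise ValueError("accumulator index must be 0..3 (assumed DD encoding)")
    if not 0 <= shft <= 15:
        raise ValueError("SHFT is a 4-bit value")
    if not -32768 <= k16 <= 32767:
        raise ValueError("K16 is a 16-bit signed constant")
    k = k16 & 0xFFFF                           # two's complement image of K16
    return bytes([0x75, k >> 8, k & 0xFF, (acc << 4) | shft])
```

Under these assumptions, `encode_load_k16_shft(0, -2, 15)` corresponds to the example AC0 = #-2 << #15 and produces the bytes 75 FF FE 0F.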

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, no overflow detection, report, or saturation is done after the shifting operation.
Status Bits Affected by C54CM, M40, SXMD
Affects none

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = \#-2 << \#15 & AC0 is loaded with the signed 16-bit value (-2) shifted left by 15 bits. \\
\hline
\end{tabular}

\section*{MOV}

Load Accumulator, Auxiliary, or Temporary Register from Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & dst = Smem & No & 2 & 1 & X \\
{[2]} & dst = uns(high_byte(Smem)) & No & 3 & 1 & X \\
{[3]} & dst = uns(low_byte(Smem)) & No & 3 & 1 & X \\
\hline
\end{tabular}

Description These instructions load the content of a memory (Smem) location to a selected destination (dst) register.

Status Bits Affected by M40, SXMD
Affects none

\section*{See Also}

See the following other related instructions:
- Load Accumulator from Memory
- Load Accumulator from Memory with Parallel Store Accumulator Content to Memory
- Load Accumulator Pair from Memory
- Load Accumulator with Immediate Value
- Load Accumulator, Auxiliary, or Temporary Register with Immediate Value
- Load Auxiliary or Temporary Register Pair from Memory
- Multiply and Accumulate with Parallel Load Accumulator from Memory
- Multiply and Subtract with Parallel Load Accumulator from Memory
- Store Accumulator, Auxiliary, or Temporary Register Content to Memory

Load Accumulator, Auxiliary, or Temporary Register from Memory
Syntax Characteristics
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & \(\mathrm{dst}=\) Smem & No & 2 & 1 & X \\
\hline
\end{tabular}

Opcode | 1010 FDDD | AAAA AAAI

Operands dst, Smem

Description This instruction loads the content of a memory (Smem) location to the destination (dst) register.
- When the destination register is an accumulator:
  ■ The content of the memory location is sign extended to 40 bits according to SXMD.
  ■ The load operation in the destination register uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
- When the destination register is an auxiliary or temporary register:
  ■ The content of the memory location is sign extended to 16 bits.
  ■ The load operation in the destination register uses a dedicated path independent of the A-unit ALU.
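The two destination cases can be sketched in Python. This is a minimal model with a hypothetical function name, `load_smem`; it assumes that SXMD = 0 selects zero extension for accumulator destinations and that a 16-bit auxiliary or temporary register simply receives the 16-bit memory word.

```python
def load_smem(word, dst_is_accumulator, sxmd=1):
    """Return the register image for dst = Smem (40 bits for an accumulator, else 16)."""
    word &= 0xFFFF
    if not dst_is_accumulator:
        return word                   # ARx or Tx: the 16-bit word is loaded as is
    if sxmd and (word & 0x8000):
        return word | 0xFFFFFF0000    # sign extend into accumulator bits 39-16
    return word                       # zero extension assumed when SXMD = 0
```

For example, a memory word of 3400h loaded into an auxiliary register gives 3400h, while 8000h loaded into an accumulator with SXMD = 1 gives FF FFFF 8000h.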

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SXMD
Affects none

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AR1 = *AR3+ & AR1 is loaded with the content addressed by AR3. AR3 is incremented by 1. \\
\hline
\end{tabular}
\begin{tabular}{llll}
\multicolumn{2}{l}{Before} & \multicolumn{2}{l}{After} \\
AR1 & FC00 & AR1 & 3400 \\
AR3 & 0200 & AR3 & 0201 \\
200 & 3400 & 200 & 3400
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [2] & dst = uns(high_byte(Smem)) & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode

Operands dst, Smem

Description This instruction loads the high-byte content of a memory (Smem) location to the destination (dst) register.
- When the destination register is an accumulator:
  ■ The memory operand is extended to 40 bits according to uns.
    - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
    - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
  ■ The load operation in the destination register uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
- When the destination register is an auxiliary or temporary register:
  ■ The memory operand is extended to 16 bits according to uns.
    - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 16 bits.
    - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 16 bits regardless of SXMD.
  ■ The load operation in the destination register uses a dedicated path independent of the A-unit ALU.
- In this instruction, Smem cannot reference a memory-mapped register (MMR). This instruction cannot access a byte within an MMR. If Smem is an MMR, the DSP sends a hardware bus-error interrupt (BERRINT) request to the CPU.
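The byte selection and extension rules above can be sketched as follows. The function name `load_byte` and its parameters are hypothetical, the sketch assumes SXMD = 0 means zero extension for accumulator destinations, and it does not model the MMR restriction or the bus-error interrupt.

```python
def load_byte(word, high, width, uns=True, sxmd=1):
    """Return the register image for dst = [uns(]high_byte/low_byte(Smem)[)].

    width is 40 for an accumulator destination and 16 for ARx or Tx.
    """
    byte = (word >> 8) & 0xFF if high else word & 0xFF
    if uns:
        return byte                               # zero extension
    sign_extend = sxmd if width == 40 else 1      # 16-bit destinations ignore SXMD
    if sign_extend and (byte & 0x80):
        return byte | (((1 << width) - 1) & ~0xFF)  # replicate bit 7 upward
    return byte
```

With a memory word of 80FFh, the uns form of the high-byte load into an accumulator returns 80h, while the non-uns form with SXMD = 1 returns FF FFFF FF80h.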

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SXMD
Affects none

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = uns(high_byte(*AR3)) & \begin{tabular}{l} 
The high-byte content addressed by AR3 is zero extended to 40 bits and \\
loaded into AC0.
\end{tabular} \\
\hline
\end{tabular}

Load Accumulator, Auxiliary, or Temporary Register from Memory

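\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [3] & dst = uns(low_byte(Smem)) & No & 3 & 1 & X \\
\hline
\end{tabular}

Operands dst, Smem

Description This instruction loads the low-byte content of a memory (Smem) location to the destination (dst) register.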

- When the destination register is an accumulator:
  ■ The memory operand is extended to 40 bits according to uns.
    - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
    - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
  ■ The load operation in the destination register uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
- When the destination register is an auxiliary or temporary register:
  ■ The memory operand is extended to 16 bits according to uns.
    - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 16 bits.
    - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 16 bits regardless of SXMD.
  ■ The load operation in the destination register uses a dedicated path independent of the A-unit ALU.
- In this instruction, Smem cannot reference a memory-mapped register (MMR). This instruction cannot access a byte within an MMR. If Smem is an MMR, the DSP sends a hardware bus-error interrupt (BERRINT) request to the CPU.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SXMD
Affects none

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = uns(low_byte(*AR3)) & \begin{tabular}{l}
The low-byte content addressed by AR3 is zero extended to 40 bits and \\
loaded into AC0.
\end{tabular} \\
\hline
\end{tabular}

Load Accumulator, Auxiliary, or Temporary Register with Immediate Value

\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & dst = k4 & Yes & 2 & 1 & X \\
{[2]} & dst = -k4 & Yes & 2 & 1 & X \\
{[3]} & dst = K16 & No & 4 & 1 & X \\
\hline
\end{tabular}

Description These instructions load a 4-bit unsigned constant, k4; the 2s complement representation of the 4-bit unsigned constant; or a 16-bit signed constant, K16, to a selected destination (dst) register.

Status Bits Affected by M40, SXMD
Affects none

See Also See the following other related instructions:
- Load Accumulator from Memory
- Load Accumulator from Memory with Parallel Store Accumulator Content to Memory
- Load Accumulator Pair from Memory
- Load Accumulator with Immediate Value
- Load Accumulator, Auxiliary, or Temporary Register from Memory
- Load Auxiliary or Temporary Register Pair from Memory
- Multiply and Accumulate with Parallel Load Accumulator from Memory
- Multiply and Subtract with Parallel Load Accumulator from Memory

Load Accumulator, Auxiliary, or Temporary Register with Immediate Value

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & dst = k4 & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Opcode | 0011 110E | kkkk FDDD

Operands dst, k4

Description This instruction loads the 4-bit unsigned constant, k4, to the destination (dst) register.
- When the destination register is an accumulator:
  ■ The 4-bit constant, k4, is zero extended to 40 bits.
  ■ The load operation in the destination register uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
- When the destination register is an auxiliary or temporary register:
  ■ The 4-bit constant, k4, is zero extended to 16 bits.
  ■ The load operation in the destination register uses a dedicated path independent of the A-unit ALU.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.

Status Bits Affected by M40
Affects none

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = \#2 & AC0 is loaded with the unsigned 4-bit value (2). \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2]\) & \(\mathrm{dst}=-\mathrm{k} 4\) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Opcode | 0011 111E | kkkk FDDD

Operands dst, k4

Description This instruction loads the 2s complement representation of the 4-bit unsigned constant, k4, to the destination (dst) register.
- When the destination register is an accumulator:
  ■ The 4-bit constant, k4, is negated in the I-unit, loaded into the accumulator, and sign extended to 40 bits before being processed by the D-unit as a signed constant.
  ■ The load operation in the destination register uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
- When the destination register is an auxiliary or temporary register:
  ■ The 4-bit constant, k4, is zero extended to 16 bits and negated in the I-unit before being processed by the A-unit as a signed K16 constant.
  ■ The load operation in the destination register uses a dedicated path independent of the A-unit ALU.
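The k4 and -k4 forms can be compared with a short Python sketch. The function name `load_k4` is hypothetical; the sketch only computes the final register bit pattern (16 bits for an auxiliary or temporary register, 40 bits for an accumulator) and does not model where in the pipeline the negation happens.

```python
def load_k4(k4, width, negate=False):
    """Return the register image for dst = k4 or dst = -k4.

    width is 40 for an accumulator destination and 16 for ARx or Tx.
    """
    if not 0 <= k4 <= 15:
        raise ValueError("k4 is a 4-bit unsigned constant")
    if width not in (16, 40):
        raise ValueError("width must be 16 or 40")
    value = -k4 if negate else k4
    return value & ((1 << width) - 1)   # two's complement image at the register width
```

For k4 = 2, the accumulator image is 00 0000 0002h for the k4 form and FF FFFF FFFEh for the -k4 form; a 16-bit register receives FFFEh for the -k4 form.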

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40
Affects none

Repeat This instruction can be repeated.

Load Accumulator, Auxiliary, or Temporary Register with Immediate Value

\section*{Syntax Characteristics}


\section*{MOV}

Load Auxiliary or Temporary Register Pair from Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & pair \((\mathrm{TAx})=\) Lmem & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode | 1110 1101 | AAAA AAAI | FDDD 111x

Operands Lmem, TAx

Description This instruction loads the 16 highest bits of data memory operand (Lmem) to the temporary or auxiliary register (TAx) and loads the 16 lowest bits of data memory operand (Lmem) to temporary or auxiliary register TA(x+1):
- The load operation in the temporary or auxiliary register uses a dedicated path independent of the A-unit ALU.
- Valid auxiliary registers are AR0, AR2, AR4, and AR6.
- Valid temporary registers are T0 and T2.
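The register pairing described above can be sketched in Python. The function name `load_register_pair` and the dictionary return value are illustrative choices, not part of the instruction set; the list of valid base registers is taken from the bullets above.

```python
VALID_PAIR_BASES = ("AR0", "AR2", "AR4", "AR6", "T0", "T2")

def load_register_pair(base, lmem):
    """Return the register values written by pair(TAx) = Lmem."""
    if base not in VALID_PAIR_BASES:
        raise ValueError(f"{base} is not a valid pair base register")
    prefix, index = base[:-1], int(base[-1])
    return {
        base: (lmem >> 16) & 0xFFFF,            # 16 highest bits go to TAx
        f"{prefix}{index + 1}": lmem & 0xFFFF,  # 16 lowest bits go to TA(x+1)
    }
```

For example, `load_register_pair("T0", 0x12345678)` places 1234h in T0 and 5678h in T1, while an odd base such as AR1 is rejected.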

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.

Status Bits Affected by M40
Affects none

Repeat This instruction can be repeated.

See Also See the following other related instructions:
- Load Accumulator, Auxiliary, or Temporary Register from Memory
- Load Accumulator, Auxiliary, or Temporary Register with Immediate Value
- Modify Auxiliary or Temporary Register Content

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline pair(T0) = *AR2 & \begin{tabular}{l}
The 16 highest bits of the content at the location addressed by AR2 are loaded \\
into T0 and the 16 lowest bits of the content at the location addressed by AR2 + 1 \\
are loaded into T1.
\end{tabular} \\
\hline
\end{tabular}

\section*{MOV}

Load CPU Register from Memory

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & BK03 = Smem & No & 3 & 1 & X \\
\hline [2] & BK47 = Smem & No & 3 & 1 & X \\
\hline [3] & BKC = Smem & No & 3 & 1 & X \\
\hline [4] & BSA01 = Smem & No & 3 & 1 & X \\
\hline [5] & BSA23 = Smem & No & 3 & 1 & X \\
\hline [6] & BSA45 = Smem & No & 3 & 1 & X \\
\hline [7] & BSA67 = Smem & No & 3 & 1 & X \\
\hline [8] & BSAC = Smem & No & 3 & 1 & \(X\) \\
\hline [9] & BRC0 \(=\) Smem & No & 3 & 1 & \(X\) \\
\hline [10] & BRC1 \(=\) Smem & No & 3 & 1 & X \\
\hline [11] & CDP = Smem & No & 3 & 1 & X \\
\hline [12] & CSR = Smem & No & 3 & 1 & X \\
\hline [13] & DP = Smem & No & 3 & 1 & X \\
\hline [14] & DPH = Smem & No & 3 & 1 & X \\
\hline [15] & PDP = Smem & No & 3 & 1 & X \\
\hline [16] & SP = Smem & No & 3 & 1 & X \\
\hline [17] & SSP = Smem & No & 3 & 1 & X \\
\hline [18] & TRN0 = Smem & No & 3 & 1 & X \\
\hline [19] & TRN1 = Smem & No & 3 & 1 & X \\
\hline [20] & RETA \(=\) dbl(Lmem) & No & 3 & 5 & X \\
\hline
\end{tabular}
Opcode See Table 5-1 (page 5-212).

Operands Lmem, Smem

Description Instructions [1] through [19] load the content of a memory (Smem) location to the destination CPU register. This instruction uses a dedicated datapath independent of the A-unit ALU and the D-unit operators to perform the operation. The content of the memory location is zero extended to the bitwidth of the destination CPU register.

The operation is performed in the execute phase of the pipeline. There is a 3-cycle latency between PDP, DP, SP, SSP, CDP, BSAx, BKx, BRCx, and CSR loads and their use in the address phase by the A-unit address generator units or by the P-unit loop control management.

For instruction [10], when BRC1 is loaded, the block repeat save register (BRS1) is also loaded with the same value.

Instruction [20] loads the content of data memory operand (Lmem) to the 24-bit RETA register (the return address of the calling subroutine) and to the 8-bit CFCT register (active control flow execution context flags of the calling subroutine):
- The 16 highest bits of Lmem are loaded into the CFCT register and into the 8 highest bits of the RETA register.
- The 16 lowest bits of Lmem are loaded into the 16 lowest bits of the RETA register.

When instruction [20] is decoded, the CPU pipeline is flushed and the instruction is executed in 5 cycles, regardless of the instruction context.
Table 5-1. Opcodes for Load CPU Register from Memory Instruction


\section*{MOV}

Load CPU Register with Immediate Value

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & BK03 = k12 & Yes & 3 & 1 & AD \\
\hline [2] & BK47 = k12 & Yes & 3 & 1 & AD \\
\hline [3] & BKC = k12 & Yes & 3 & 1 & AD \\
\hline [4] & BRC0 = k12 & Yes & 3 & 1 & AD \\
\hline [5] & BRC1 = k12 & Yes & 3 & 1 & AD \\
\hline [6] & CSR = k12 & Yes & 3 & 1 & AD \\
\hline [7] & DPH = k7 & Yes & 3 & 1 & AD \\
\hline [8] & PDP = k9 & Yes & 3 & 1 & AD \\
\hline [9] & BSA01 = k16 & No & 4 & 1 & AD \\
\hline [10] & BSA23 = k16 & No & 4 & 1 & AD \\
\hline [11] & BSA45 = k16 & No & 4 & 1 & AD \\
\hline [12] & BSA67 = k16 & No & 4 & 1 & AD \\
\hline [13] & BSAC = k16 & No & 4 & 1 & AD \\
\hline [14] & CDP = k16 & No & 4 & 1 & AD \\
\hline [15] & DP = k16 & No & 4 & 1 & AD \\
\hline [16] & SP = k16 & No & 4 & 1 & AD \\
\hline [17] & SSP = k16 & No & 4 & 1 & AD \\
\hline
\end{tabular}

Opcode See Table 5-2 (page 5-214).
Operands kx
Description These instructions load the unsigned constant, kx, to the destination CPU register. This instruction uses a dedicated datapath independent of the A-unit ALU and the D-unit operators to perform the operation. The constant is zero extended to the bitwidth of the destination CPU register.

For instruction [5], when BRC1 is loaded, the block repeat save register (BRS1) is also loaded with the same value.

The operation is performed in the address phase of the pipeline.
Status Bits Affected by none
Affects none

Repeat Instruction [15] cannot be repeated; all other instructions can be repeated.

See Also See the following other related instructions:
- Load CPU Register from Memory

Table 5-2. Opcodes for Load CPU Register with Immediate Value Instruction
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
\hline No. & Syntax & \multicolumn{8}{|c|}{Opcode} \\
\hline [1] & BK03 = k12 & & 0001 & 011E & kkkk & kkkk & kkkk & 0100 & \\
\hline [2] & BK47 = k12 & & 0001 & 011E & kkkk & kkkk & kkkk & 0101 & \\
\hline [3] & BKC = k12 & & 0001 & 011E & kkkk & kkkk & kkkk & 0110 & \\
\hline [4] & BRC0 = k12 & & 0001 & 011E & kkkk & kkkk & kkkk & 1001 & \\
\hline [5] & BRC1 = k12 & & 0001 & 011E & kkkk & kkkk & kkkk & 1010 & \\
\hline [6] & CSR = k12 & & 0001 & 011E & kkkk & kkkk & kkkk & 1000 & \\
\hline [7] & DPH = k7 & & 0001 & 011E & xxxx & xkkk & kkkk & 0000 & \\
\hline [8] & PDP = k9 & & 0001 & 011E & xxxk & kkkk & kkkk & 0011 & \\
\hline [9] & BSA01 = k16 & 0111 & 1000 & kkkk & kkkk & kkkk & kkkk & xxx0 & 011x \\
\hline [10] & BSA23 = k16 & 0111 & 1000 & kkkk & kkkk & kkkk & kkkk & xxx0 & 100x \\
\hline [11] & BSA45 = k16 & 0111 & 1000 & kkkk & kkkk & kkkk & kkkk & xxx0 & 101x \\
\hline [12] & BSA67 = k16 & 0111 & 1000 & kkkk & kkkk & kkkk & kkkk & xxx0 & 110x \\
\hline [13] & BSAC = k16 & 0111 & 1000 & kkkk & kkkk & kkkk & kkkk & xxx0 & 111x \\
\hline [14] & CDP = k16 & 0111 & 1000 & kkkk & kkkk & kkkk & kkkk & xxx0 & 010x \\
\hline [15] & DP = k16 & 0111 & 1000 & kkkk & kkkk & kkkk & kkkk & xxx0 & 000x \\
\hline [16] & SP = k16 & 0111 & 1000 & kkkk & kkkk & kkkk & kkkk & xxx1 & 000x \\
\hline [17] & SSP = k16 & 0111 & 1000 & kkkk & kkkk & kkkk & kkkk & xxx0 & 001x \\
\hline
\end{tabular}
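The k12 rows of Table 5-2 share one 24-bit layout (0001 011E, twelve k bits, then a 4-bit register selector), which the following Python sketch packs into bytes. The register selector values are copied from the table; the function name `encode_load_k12`, the treatment of E as a caller-supplied parallel enable bit, and the big-endian byte order are assumptions made for illustration.

```python
# 4-bit selector codes for the k12 forms, taken from Table 5-2.
K12_REGISTER_CODES = {
    "BK03": 0b0100, "BK47": 0b0101, "BKC": 0b0110,
    "BRC0": 0b1001, "BRC1": 0b1010, "CSR": 0b1000,
}

def encode_load_k12(register, k12, parallel_enable=0):
    """Pack <register> = k12 as 0001 011E | kkkk kkkk | kkkk <code>."""
    code = K12_REGISTER_CODES[register]
    if not 0 <= k12 <= 0xFFF:
        raise ValueError("k12 is a 12-bit unsigned constant")
    word = ((0x16 | (parallel_enable & 1)) << 16) | (k12 << 4) | code
    return word.to_bytes(3, "big")
```

Under these assumptions, BK03 = #123h with E = 0 encodes as the bytes 16 12 34.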

\section*{MOV}

Load Extended Auxiliary Register from Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & XAdst = dbl(Lmem) & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode | 1110 1101 | AAAA AAAI | XDDD 1111

Operands Lmem, XAdst

Description This instruction loads the lower 23 bits of the data addressed by data memory operand (Lmem) to the 23-bit destination register (XARx, XSP, XSSP, XDP, or XCDP).

Status Bits Affected by none
Affects none

Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Load Extended Auxiliary Register with Immediate Value
- Modify Extended Auxiliary Register Content
- Move Extended Auxiliary Register Content
- Store Extended Auxiliary Register Content to Memory

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline XAR1 = dbl(*AR3) & \begin{tabular}{l}
The 7 lowest bits of the content at the location addressed by AR3 and the 16 bits of \\
the content at the location addressed by AR3 + 1 are loaded into XAR1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll}
\multicolumn{2}{l}{Before} & \multicolumn{2}{l}{After} \\
XAR1 & 00 0000 & XAR1 & 12 0FD3 \\
AR3 & 0200 & AR3 & 0200 \\
200 & 3492 & 200 & 3492 \\
201 & 0FD3 & 201 & 0FD3
\end{tabular}
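The 23-bit truncation in this example can be checked with a short Python sketch. The function name `load_extended_register` and the dictionary used as data memory are hypothetical; the sketch assumes an even address, so that the word at the address supplies the upper half of Lmem and the word at the next address supplies the lower half.

```python
def load_extended_register(memory, address):
    """Return the 23-bit value loaded by XAdst = dbl(Lmem) from an even address."""
    lmem = (memory[address] << 16) | memory[address + 1]   # assemble the 32-bit operand
    return lmem & 0x7FFFFF                                  # keep the lower 23 bits
```

With 3492h at address 200h and 0FD3h at address 201h, the 7 lowest bits of 3492h are 12h, so the sketch returns 12 0FD3h.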

\section*{AMOV}

Load Extended Auxiliary Register with Immediate Value

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|}
\hline No. Syntax & Parallel Enable Bit \quad Size \quad Cycles \quad Pipeline \\
\hline [1] XAdst \(=\) k23 & \(\begin{array}{llll}\text { No } & 6 & 1 & \text { AD }\end{array}\) \\
\hline Opcode & \(11101100 \mid\) AAAA AAAI \({ }^{\text {a }} 10 \mathrm{DDD} 1110\) \\
\hline Operands & k23, XAdst \\
\hline Description & \begin{tabular}{l}
This instruction loads a 23-bit unsigned constant (k23) into the 23-bit destination register (XARx, XSP, XSSP, XDP, or XCDP). This operation is completed in the address phase of the pipeline by the A-unit address generator. Data memory is not accessed. \\
The premodification or postmodification of the auxiliary register (ARx), the use of *port(\#K), and the use of the readport() or writeport() qualifier is not supported for this instruction. The use of auxiliary register offset operations is supported. If the corresponding bit (ARnLC) in status register ST2_55 is set to 1, the circular buffer management also controls the result stored in XAdst.
\end{tabular} \\
\hline Status Bits & \begin{tabular}{ll} 
Affected by & ST2_55 \\
Affects & none
\end{tabular} \\
\hline Repeat & This instruction can be repeated. \\
\hline See Also & See the following other related instructions:
Load Extended Auxiliary Register from Memory
Modify Extended Auxiliary Register Content
Move Extended Auxiliary Register Content
Store Extended Auxiliary Register Content to Memory \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline XAR0 = \#7FFFFFh & The 23-bit value (7FFFFFh) is loaded into XAR0. \\
\hline
\end{tabular}

\section*{MOV}

\section*{Load Memory with Immediate Value}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & Smem \(=\) K8 & No & 3 & 1 & X \\
{\([2]\)} & Smem \(=\) K16 & No & 4 & 1 & \(X\) \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
\hline Opcode & K8 & & & 1110 & 0110 & AAAA & AAAI & KKKK & KKKK \\
\hline & K16 & 1111 & 1011 & AAAA & AAAI & KKKK & KKKK & KKKK & KKKK \\
\hline Operands & \multicolumn{9}{|l|}{Kx, Smem} \\
\hline Description & \multicolumn{9}{|l|}{These instructions initialize a data memory location. They store an 8-bit signed constant, K8, or a 16-bit signed constant, K16, to a memory (Smem) location. They use a dedicated datapath to perform the operation.} \\
\hline
\end{tabular}

For instruction [1], the immediate value is always sign extended to 16 bits before being stored in memory.
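
As an illustration of the sign extension performed by instruction [1], the following Python sketch stores an 8-bit constant as a 16-bit word. The function name `store_k8`, the dictionary used as data memory, and the addresses are hypothetical; only the sign-extension rule comes from the text above.

```python
def store_k8(mem, addr, k8):
    """Model Smem = K8: sign extend the 8-bit constant to 16 bits, then store it."""
    k8 &= 0xFF
    value = k8 - 0x100 if k8 & 0x80 else k8   # interpret bit 7 as the sign bit
    mem[addr] = value & 0xFFFF                # 16-bit two's complement word


mem = {}
store_k8(mem, 0x0100, 0x12)   # positive constant: upper byte is filled with 0s
store_k8(mem, 0x0101, 0xF8)   # negative constant (-8): upper byte is filled with 1s
print(f"{mem[0x0100]:04X} {mem[0x0101]:04X}")
```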

Status Bits Affected by none
Affects none
Repeat Both instructions [1] and [2] can be repeated.
See Also See the following other related instructions:
- Move Memory to Memory

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline\({ }^{\star}(\# 0501 \mathrm{~h})=\# 248\) & The signed 16-bit value (248) is loaded to address 501h. \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
0501 & FC00 & 0501 & F800
\end{tabular}

Syntax Characteristics
\begin{tabular}{lllllll}
\hline No. & Syntax & & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & lock() & & No & 2 & 1 & D \\
\hline Opcode & & & 0100 & 0101 & 1111 & 0010 \\
\hline
\end{tabular}

Operands none

Description This is an operand qualifier that can be paralleled with any of 13 instructions (listed below) that execute a read-modify-write operation on a specific memory operand. If the lock() qualifier is applied to any of these 13 instructions, the lock signal is activated in the same cycle as the read request, and the corresponding write request follows this read request. Because stalls are generated, no memory request issued by another instruction can be placed between the locked read request and the locked write request. This also provides a suitable interface with the OCP.

This operand qualifier cannot be executed:
\(\square\) Alone
- In parallel with any instruction other than the 13 lock instructions

An instruction that uses the lock() qualifier cannot be combined with any other user-defined parallelism instruction.

The 13 lock instructions which can be paralleled with the lock() qualifier are listed in the table below.
\begin{tabular}{|c|l|l|}
\hline Number & Algebraic & Mnemonic \\
\hline 1 & TC1 \(=\) bit(Smem, k4), bit(Smem, k4) = \#1 & BTSTSET k4, Smem, TC1 \\
\hline 2 & TC2 = bit(Smem, k4), bit(Smem, k4) = \#1 & BTSTSET k4, Smem, TC2 \\
\hline 3 & TC1 = bit(Smem, k4), bit(Smem, k4) = \#0 & BTSTCLR k4, Smem, TC1 \\
\hline 4 & TC2 \(=\) bit(Smem, k4), bit(Smem, k4) = \#0 & BTSTCLR k4, Smem, TC2 \\
\hline 5 & TC1 = bit(Smem, k4), cbit(Smem, k4) & BTSTNOT k4, Smem, TC1 \\
\hline 6 & TC2 = bit(Smem, k4), cbit(Smem, k4) & BTSTNOT k4, Smem, TC2 \\
\hline 7 & bit(Smem, src) = \#1 & BSET src, Smem \\
\hline 8 & bit(Smem, src) = \#0 & BCLR src, Smem \\
\hline 9 & cbit(Smem, src) & BNOT src, Smem \\
\hline 10 & Smem = Smem \& k16 & AND k16, Smem \\
\hline 11 & Smem = Smem | k16 & OR k16, Smem \\
\hline 12 & Smem = Smem \(\wedge\) k16 & XOR k16, Smem \\
\hline 13 & Smem = Smem + k16 & ADD k16, Smem \\
\hline
\end{tabular}
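
To show what a locked read-modify-write looks like from the data side, here is a minimal Python sketch of the first instruction in the table, TC1 = bit(Smem, k4), bit(Smem, k4) = \#1. The function name `btstset` and the dictionary used as memory are hypothetical. The model is single threaded, so it only shows the read and write halves that the lock() qualifier keeps together; it does not model the lock signal itself.

```python
def btstset(mem, addr, k4):
    """Return the old value of bit k4 at addr (the TCx result) and set that bit."""
    old_bit = (mem[addr] >> k4) & 1                 # read half: copy the bit to TCx
    mem[addr] = (mem[addr] | (1 << k4)) & 0xFFFF    # write half: set the bit to 1
    return old_bit


# Used as a simple semaphore: a result of 0 means the flag was free and is now taken.
mem = {0x0300: 0x0000}
first = btstset(mem, 0x0300, 2)
second = btstset(mem, 0x0300, 2)
print(first, second, f"{mem[0x0300]:04X}")
```

Without the lock, another requester could in principle write the same word between the two halves, which is the situation the qualifier is described as preventing.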

None of the 13 instructions may use the lock() qualifier in a conditional execution context set up by the "if(cond) execute(D_unit)" instruction, because of OCP compliance. The cases below are illegal and are rejected by the code-generation tools:
```
; Illegal case 1
if(cond) execute(D_unit)
TC1=bit(*ar2+, #2), bit(*ar2+, #2)=#1 || lock()

; Illegal case 2
instruction || if(cond) execute(D_unit)
TC1=bit(*ar2+, #2), bit(*ar2+, #2)=#1 || lock()
```

Compatibility with C54x devices (C54CM = 1)
None.


\section*{DELAY}
Memory Delay

\section*{Syntax Characteristics}

mmap
Memory-Mapped Register Access Qualifier

\section*{Syntax Characteristics}
\begin{tabular}{lllcccc}
\hline No. & Syntax & & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & mmap() & & No & 1 & 1 & D \\
\hline Opcode & & & 1001 & 1000 & & \\
\hline
\end{tabular}

Operands none

Description This is an operand qualifier that can be paralleled with any instruction making an Smem or Lmem direct memory access (dma). This operand qualifier allows you to locally prevent the dma access from being relative to the data stack pointer (SP) or the local data page register (DP). It forces the dma access to be relative to the memory-mapped register (MMR) data page start address, 00 0000h.

This operand qualifier cannot be executed:
\(\square\) as a stand-alone instruction (assembler generates an error message)
\(\square\) in parallel with instructions not embedding an Smem or Lmem data memory operand
\(\square\) in parallel with instructions loading or storing a byte to a register (see Load Accumulator, Auxiliary, or Temporary Register from Memory instructions [2] and [3]; Load Accumulator from Memory instructions [2] and [3]; and Store Accumulator, Auxiliary, or Temporary Register Content to Memory instructions [2] and [3])

The MMRs are mapped as 16-bit data entities between addresses 0h and 5Fh. The scratch-pad memory that is mapped between addresses 60h and 7Fh of each main data page of 64K words cannot be accessed through this mechanism.
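
The address rule above can be sketched as follows. This is only an illustrative model: the function name `mmap_address` is hypothetical, the 7-bit width of the direct-access offset is an assumption, and raising an error for the scratch-pad range is a modeling choice, since the text only states that this range cannot be accessed through mmap().

```python
MMR_PAGE_START = 0x000000   # MMR data page start address, 00 0000h
MMR_LAST = 0x5F             # MMRs occupy addresses 0h to 5Fh


def mmap_address(offset):
    """Return the address used by a dma access qualified with mmap()."""
    if not 0 <= offset <= 0x7F:          # assumed 7-bit direct-access offset
        raise ValueError("offset does not fit in 7 bits")
    if offset > MMR_LAST:                # 60h-7Fh is scratch-pad memory
        raise ValueError("scratch-pad memory is not reachable through mmap()")
    return MMR_PAGE_START + offset       # SP and DP are ignored


print(f"{mmap_address(0x08):06X}")
```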

Any instruction using the mmap() modifier cannot be combined with any other user-defined parallelism instruction.
Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
T2 = @(AC0_L) \\
\(\|\) mmap()
\end{tabular} & AC0_L is a keyword representing AC0(15-0). The content of AC0(15-0) is copied into T2. \\
\hline
\end{tabular}

\section*{AMAR}

Modify Auxiliary Register Content

\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & mar(Smem) & No & 2 & 1 & AD \\
\hline
\end{tabular}

\section*{Opcode}
\(10110100 \mid\) AAAA AAAI

\section*{Operands}

Description This instruction performs, in the A-unit address generation units, the auxiliary register modification specified by Smem as if a word single data memory operand access was made. The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

If the destination register is an auxiliary register and the corresponding bit (ARnLC) in status register ST2_55 is set to 1, the circular buffer management controls the result stored in the destination register.
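
As a rough illustration of the difference between linear and circular modification, the Python sketch below models a post-increment such as mar(*AR3+). It is a simplification under stated assumptions: the auxiliary register is treated as an index that wraps inside a buffer of `bk` words, and the buffer start address registers and the C54CM behavior are not modeled. The names `mar_post_increment`, `circular`, and `bk` are hypothetical.

```python
def mar_post_increment(ar, step=1, circular=False, bk=0):
    """Return the new 16-bit ARx value after a post-modification by step."""
    new_value = ar + step
    if circular and bk > 0:
        # Assumed simplification: the index wraps inside a buffer of bk words.
        new_value %= bk
    return new_value & 0xFFFF   # auxiliary registers are 16 bits wide


print(hex(mar_post_increment(0x0200)))                     # linear (ARnLC = 0)
print(hex(mar_post_increment(7, circular=True, bk=8)))     # circular (ARnLC = 1)
```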

\section*{Compatibility with C54x devices (C54CM =1)}

In the translated code section, the mar() instruction must be executed with C54CM set to 1.

When circular modification is selected for the destination auxiliary register, this instruction modifies the selected destination auxiliary register by using BK03 as the circular buffer size register; BK47 is not used.

\section*{Status Bits \\ Affected by ST2_55}

Affects none
Repeat This instruction can be repeated.

\section*{See Also}

See the following other related instructions:
- Modify Auxiliary or Temporary Register Content
- Modify Auxiliary or Temporary Register Content by Addition
- Modify Auxiliary or Temporary Register Content by Subtraction
- Modify Auxiliary Register Content with Parallel Multiply
- Modify Auxiliary Register Content with Parallel Multiply and Accumulate
- Modify Auxiliary Register Content with Parallel Multiply and Subtract
- Modify Extended Auxiliary Register Content
- Modify Extended Auxiliary Register Content by Addition
- Modify Extended Auxiliary Register Content by Subtraction
- Parallel Modify Auxiliary Register Contents

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(*AR3+) & The content of AR3 is incremented by 1. \\
\hline
\end{tabular}

\section*{AMAR::MPY \\ Modify Auxiliary Register Content with Parallel Multiply}

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & mar(Xmem), \\
& ACx = M40(rnd(uns(Ymem) * uns(coef(Cmem)))) & No & 4 & 1 & X \\
\hline
\end{tabular}
Opcode \quad 1000 0010 \(\mid\) XXXM MMYY \(\mid\) YMMM 11mm \(\mid\) uuxx DDg\%

\section*{Operands}

ACx, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: modify auxiliary register (MAR) and multiply. The operations are executed in the two D-unit MACs.

The first operation performs an auxiliary register modification. The auxiliary register modification is specified by the content of data memory operand Xmem.

The second operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode, extended to 17 bits.
- Input operands are extended to 17 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
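
The operand extension and FRCT steps listed above can be sketched in Python as follows. This is a hedged model, not a description of the hardware: the function names `extend17` and `mpy` are hypothetical, treating SXMD = 0 as zero extension is an assumption, and rounding (RDM), SMUL overflow handling, and saturation (SATD) are left out.

```python
MASK40 = (1 << 40) - 1


def extend17(word, uns, sxmd=1):
    """Extend a 16-bit memory word to a 17-bit multiplier input."""
    word &= 0xFFFF
    if uns or not sxmd:          # zero extension (SXMD = 0 case is an assumption)
        return word
    return word - 0x10000 if word & 0x8000 else word   # sign extension


def mpy(ymem, cmem, uns_y=False, uns_c=False, frct=0, sxmd=1):
    """Return the 40-bit accumulator pattern for ACx = Ymem * coef(Cmem)."""
    product = extend17(ymem, uns_y, sxmd) * extend17(cmem, uns_c, sxmd)
    if frct:
        product <<= 1            # FRCT = 1: multiplier output shifted left by 1 bit
    return product & MASK40      # sign extended result as a 40-bit pattern


print(f"{mpy(0xFFFF, 0x0002):010X}")               # signed: -1 * 2
print(f"{mpy(0xFFFF, 0x0002, uns_y=True):010X}")   # unsigned Ymem: 65535 * 2
print(f"{mpy(0x4000, 0x4000, frct=1):010X}")       # fractional mode
```

The first two calls differ only in the uns keyword on Ymem, which is enough to change the sign of the result.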
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Modify Auxiliary Register Content
- Modify Auxiliary Register Content with Parallel Multiply and Accumulate
- Modify Auxiliary Register Content with Parallel Multiply and Subtract
- Multiply

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
mar(*AR3+), \\
AC0 = uns(*AR4) * uns(coef(*CDP))
\end{tabular} & Both instructions are performed in parallel. AR3 is incremented by 1. The unsigned content addressed by AR4 is multiplied by the unsigned content addressed by the coefficient data pointer register (CDP) and the result is stored in AC0. \\
\hline
\end{tabular}

\section*{AMAR::MAC \\ Modify Auxiliary Register Content with Parallel Multiply and Accumulate}

\section*{Syntax Characteristics}
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
mar(Xmem), \\
ACx = M40(rnd(ACx + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline [2] & \begin{tabular}{l}
mar(Xmem), \\
ACx = M40(rnd((ACx >> \#16) + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform two parallel operations in one cycle: modify auxiliary register (MAR), and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx
See Also See the following other related instructions:
- Modify Auxiliary Register Content
- Modify Auxiliary Register Content with Parallel Multiply
- Modify Auxiliary Register Content with Parallel Multiply and Subtract


\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
mar(Xmem), \\
ACx = M40(rnd(ACx + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}
Opcode \quad 1000 0011 \(\mid\) XXXM MMYY \(\mid\) YMMM 11mm \(\mid\) uuxx DDg\%

Operands ACx, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: modify auxiliary register (MAR), and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

The first operation performs an auxiliary register modification. The auxiliary register modification is specified by the content of data memory operand Xmem.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode, extended to 17 bits.
- Input operands are extended to 17 bits according to uns.

■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.

■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
\(\square\) This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.
\(\square\) For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:
- mar(Xmem)
- mar(Ymem)
- mar(Cmem)

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
mar(*AR3+), \\
AC0 = AC0 + (uns(*AR4) * uns(coef(*CDP)))
\end{tabular} & Both instructions are performed in parallel. AR3 is incremented by 1. The unsigned content addressed by AR4 multiplied by the unsigned content addressed by the coefficient data pointer register (CDP) is added to the content of AC0 and the result is stored in AC0. \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & \begin{tabular}{l}
mar(Xmem), \\
ACx = M40(rnd((ACx >> \#16) + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode \quad 1000 0100 \(\mid\) XXXM MMYY \(\mid\) YMMM 01mm \(\mid\) uuxx DDg\%

Operands ACx, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: modify auxiliary register (MAR), and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

The first operation performs an auxiliary register modification. The auxiliary register modification is specified by the content of data memory operand Xmem.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode, extended to 17 bits.
- Input operands are extended to 17 bits according to uns.
- If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx shifted right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACx(39).
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- mar(Xmem)
- mar(Ymem)
- mar(Cmem)

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
mar(*AR2+), \\
AC0 = ((AC0 >> \#16) + (uns(*AR1) * uns(coef(*CDP))))
\end{tabular} & Both instructions are performed in parallel. AR2 is incremented by 1. The unsigned content addressed by AR1 multiplied by the unsigned content addressed by the coefficient data pointer register (CDP) is added to the content of AC0 shifted right by 16 bits and the result is stored in AC0. An overflow is detected in AC0. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
AC0 & 00 6900 0000 & AC0 & 00 95C0 9200 \\
AC1 & 00 0023 0000 & AC1 & 00 0023 0000 \\
*AR1 & EF00 & *AR1 & EF00 \\
AR2 & 0201 & AR2 & 0202 \\
*CDP & A067 & *CDP & A067 \\
ACOV0 & 0 & ACOV0 & 1 \\
ACOV1 & 0 & ACOV1 & 0 \\
CARRY & 0 & CARRY & 0 \\
M40 & 0 & M40 & 0 \\
FRCT & 0 & FRCT & 0 \\
SATD & 0 & SATD & 0
\end{tabular}
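
The arithmetic of this example can be checked with a short Python sketch. It is a simplified model under stated assumptions: both operands use the uns keyword, FRCT = 0, no rounding is applied, and saturation is not modeled (SATD = 0 in the example). The function name `mac_shift16` is hypothetical.

```python
MASK40 = (1 << 40) - 1


def mac_shift16(acx, ymem, cmem, m40=0):
    """Model ACx = (ACx >> #16) + (uns(Ymem) * uns(coef(Cmem))) without saturation."""
    acx &= MASK40
    if acx & (1 << 39):                 # interpret the accumulator as 40-bit signed
        acx -= 1 << 40
    shifted = acx >> 16                 # arithmetic shift: sign of ACx(39) is kept
    result = shifted + (ymem & 0xFFFF) * (cmem & 0xFFFF)
    width = 40 if m40 else 32           # overflow is checked on 40 or 32 bits
    overflow = not (-(1 << (width - 1)) <= result < (1 << (width - 1)))
    return result & MASK40, overflow


# Values from the example: AC0 = 00 6900 0000h, *AR1 = EF00h, *CDP = A067h, M40 = 0
ac0, acov0 = mac_shift16(0x0069000000, 0xEF00, 0xA067)
print(f"{ac0:010X}", int(acov0))
```

The product EF00h * A067h plus 6900h exceeds the largest positive 32-bit value, which is why ACOV0 is set while the unsaturated result is kept in AC0.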

\section*{AMAR::MAS \\ Modify Auxiliary Register Content with Parallel Multiply and Subtract}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
mar(Xmem), \\
ACx = M40(rnd(ACx - (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}


Operands ACx, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: modify auxiliary register (MAR), and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

The first operation performs an auxiliary register modification. The auxiliary register modification is specified by the content of data memory operand Xmem.

The second operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode, extended to 17 bits.
- Input operands are extended to 17 bits according to uns.

■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.

■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.
\(\square\) For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:
- mar(Xmem)
- mar(Ymem)
- mar(Cmem)

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Modify Auxiliary Register Content
- Modify Auxiliary Register Content with Parallel Multiply
- Modify Auxiliary Register Content with Parallel Multiply and Accumulate
- Multiply and Subtract

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
mar(*AR3+), \\
AC0 = AC0 - (uns(*AR4) * uns(coef(*CDP)))
\end{tabular} & Both instructions are performed in parallel. AR3 is incremented by 1. The unsigned content addressed by AR4 multiplied by the unsigned content addressed by the coefficient data pointer register (CDP) is subtracted from the content of AC0 and the result is stored in AC0. \\
\hline
\end{tabular}

\section*{AMOV}

Modify Auxiliary or Temporary Register Content

\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & mar(TAy = TAx) & No & 3 & 1 & AD \\
{[2]} & mar(TAx = P8) & No & 3 & 1 & AD \\
{[3]} & mar(TAx = D16) & No & 4 & 1 & AD \\
\hline
\end{tabular}
Description These instructions perform, in the A-unit address generation units:
- a move from auxiliary or temporary register TAx to auxiliary or temporary register TAy


\section*{Syntax Characteristics}
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & mar(TAy = TAx) & No & 3 & 1 & AD \\
\hline
\end{tabular}
\begin{tabular}{llll}
Opcode & 0001 010E & FSSS xxxx & FDDD 0001 \\
 & 0001 010E & FSSS xxxx & FDDD 1001
\end{tabular}

The assembler selects the opcode depending on the instruction position in a paralleled pair.

\section*{Operands \\ TAx, TAy}

Description This instruction performs, in the A-unit address generation units, a move from the auxiliary or temporary register TAx to auxiliary or temporary register TAy. The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(AR0 = AR1) & The content of AR1 is copied to AR0. \\
\hline
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(T0 = T1) & The content of T1 is copied to T0. \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & mar(TAx = P8) & No & 3 & 1 & AD \\
\hline
\end{tabular}
\begin{tabular}{llll}
Opcode & 0001 010E & PPPP PPPP & FDDD 0101 \\
 & 0001 010E & PPPP PPPP & FDDD 1101
\end{tabular}

The assembler selects the opcode depending on the instruction position in a paralleled pair.

\section*{Operands \\ TAx, P8}

Description This instruction performs, in the A-unit address generation units, a load in the auxiliary or temporary registers TAx of a program address defined by a program address label assembled into P8. The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.
\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(AR0 = \#255) & The unsigned 8-bit value (255) is copied to AR0. \\
\hline
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(T0 = \#255) & The unsigned 8-bit value (255) is copied to T0. \\
\hline
\end{tabular}

\section*{Syntax Characteristics}


\section*{AADD}

\section*{Modify Auxiliary or Temporary Register Content by Addition}

\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & mar(TAy + TAx) & No & 3 & 1 & AD \\
{[2]} & mar(TAx + P8) & No & 3 & 1 & AD \\
\hline
\end{tabular}
Description These instructions perform, in the A-unit address generation units:
- an addition between two auxiliary or temporary registers, TAx and TAy, storing the result in TAy
- an addition between the auxiliary or temporary register TAx and a program address defined by a program address label assembled into unsigned P8, storing the result in TAx

The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

If the destination register is an auxiliary register and the corresponding bit (ARnLC) in status register ST2_55 is set to 1, or the circular addressing qualifier is paralleled with the instruction, the circular buffer management controls the result stored in the destination register.

\section*{Status Bits Affected by ST2_55}

Affects none
See Also See the following other related instructions:
- Modify Auxiliary Register Content
- Modify Auxiliary or Temporary Register Content
- Modify Auxiliary or Temporary Register Content by Subtraction
- Modify Extended Auxiliary Register Content
- Modify Extended Auxiliary Register Content by Addition
- Modify Extended Auxiliary Register Content by Subtraction


\section*{Syntax Characteristics}
\begin{tabular}{|c|c|}
\hline No. Syntax & Parallel Enable Bit \quad Size \quad Cycles \quad Pipeline \\
\hline [1] mar(TAy + TAx) & \(\begin{array}{llll}\text { No } & 3 & 1 & \text { AD }\end{array}\) \\
\hline \multirow[t]{3}{*}{Opcode} &  \\
\hline &  \\
\hline & The assembler selects the opcode depending on the instruction position in a paralleled pair. \\
\hline Operands & TAx, TAy \\
\hline Description & \begin{tabular}{l}
This instruction performs, in the A-unit address generation units, an addition between two auxiliary or temporary registers, TAy and TAx, and stores the result in TAy. The content of TAx is considered signed. The operation is performed in the address phase of the pipeline; however, data memory is not accessed. \\
If the destination register is an auxiliary register and the corresponding bit (ARnLC) in status register ST2_55 is set to 1 or the circular addressing qualifier is in paralleled, the circular buffer management controls the result stored in the destination register. \\
Compatibility with C54x devices (C54CM =1) \\
In the translated code section, the mar() instruction must be executed with C54CM set to 1 . \\
When circular modification is selected for the destination auxiliary register, this instruction modifies the selected destination auxiliary register by using BK03 as the circular buffer size register; BK47 is not used.
\end{tabular} \\
\hline Status Bits & \begin{tabular}{ll} 
Affected by & ST2_55 \\
Affects & none
\end{tabular} \\
\hline Repeat & This instruction can be repeated. \\
\hline
\end{tabular}

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(AR0 + T0) & The content of AR0 is added to the signed content of T0 and the result is stored in AR0. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
XAR0 & 010000 & XAR0 & \\
T0 & 8000 & T0 &
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(T0 + T1) & The content of T0 is added to the content of T1 and the result is stored in T0. \\
\hline
\end{tabular}
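
The register arithmetic described above can be modeled in a few lines of Python. This is only an illustrative sketch, not TI code: the function name `mar_add` is hypothetical, linear (non-circular) addressing is assumed, and it is assumed that the addition wraps within the 16-bit ARn while the upper 7 bits of XARn stay unchanged.

```python
def mar_add(xar: int, tax: int) -> int:
    """Model mar(ARn + TAx) on a 23-bit XARn value (linear addressing assumed)."""
    high = xar & 0x7F0000                      # upper 7 bits, assumed untouched
    low = xar & 0xFFFF                         # 16-bit ARn part
    # TAx is considered signed: reinterpret the 16-bit pattern as two's complement
    signed = tax - 0x10000 if tax & 0x8000 else tax
    return high | ((low + signed) & 0xFFFF)    # wrap the sum to 16 bits


# Illustrative values only (not taken from the manual's example tables)
print(f"{mar_add(0x010010, 0xFFFF):06X}")     # adding -1 steps the pointer back
```

Because the sum is reduced modulo 2^16, treating TAx as signed or unsigned gives the same bit pattern in linear mode; the distinction matters once circular buffer management is involved, which this sketch does not model.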

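Modify Auxiliary or Temporary Register Content by Addition

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [2] & mar(TAx + P8) & No & 3 & 1 & AD \\
\hline
\end{tabular}

Opcode The assembler selects the opcode depending on the instruction position in a paralleled pair.

Operands TAx, P8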

Description This instruction performs, in the A-unit address generation units, an addition between the auxiliary or temporary register TAx and a program address defined by a program address label assembled into unsigned P8, and stores the result in TAx. The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

If the destination register is an auxiliary register and either the corresponding bit (ARnLC) in status register ST2_55 is set to 1 or the circular addressing qualifier is paralleled with the instruction, the circular buffer management controls the result stored in the destination register.

\section*{Compatibility with C54x devices (C54CM = 1)}

In the translated code section, the mar() instruction must be executed with C54CM set to 1.

When circular modification is selected for the destination auxiliary register, this instruction modifies the selected destination auxiliary register by using BK03 as the circular buffer size register; BK47 is not used.
\begin{tabular}{lll}
Status Bits & Affected by & ST2_55 \\
 & Affects & none
\end{tabular}

\section*{ASUB}

Modify Auxiliary or Temporary Register Content by Subtraction

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & mar(TAy - TAx) & No & 3 & 1 & AD \\
{[2]} & mar(TAx - P8) & No & 3 & 1 & AD \\
\hline
\end{tabular}

\section*{Description}

These instructions perform, in the A-unit address generation units:
- a subtraction between two auxiliary or temporary registers, TAy and TAx, storing the result in TAy
- a subtraction between the auxiliary or temporary register TAx and a program address defined by a program address label assembled into unsigned P8, storing the result in TAx

The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

If the destination register is an auxiliary register and either the corresponding bit (ARnLC) in status register ST2_55 is set to 1 or the circular addressing qualifier is paralleled with the instruction, the circular buffer management controls the result stored in the destination register.

Status Bits Affected by ST2_55
Affects none
See Also See the following other related instructions:
- Modify Auxiliary Register Content
- Modify Auxiliary or Temporary Register Content
- Modify Auxiliary or Temporary Register Content by Addition
- Modify Extended Auxiliary Register Content
- Modify Extended Auxiliary Register Content by Addition
- Modify Extended Auxiliary Register Content by Subtraction

Modify Auxiliary or Temporary Register Content by Subtraction

\section*{Syntax Characteristics}


\section*{Example 1}
\begin{tabular}{|c|c|c|c|}
\hline Syntax & \multicolumn{3}{|l|}{Description} \\
\hline mar(AR0 - T0) & \multicolumn{3}{|l|}{The signed content of T0 is subtracted from the content of AR0 and the result is stored in AR0.} \\
\hline Before & & After & \\
\hline XAR0 & 018000 & XAR0 & 010000 \\
\hline T0 & 8000 & T0 & 8000 \\
\hline
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(T0 - T1) & The content of T1 is subtracted from the content of T0 and the result is stored in T0. \\
\hline
\end{tabular}
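
As a cross-check of Example 1, the subtraction can be sketched in Python. The helper name `mar_sub` is hypothetical, and the model assumes linear addressing with a 16-bit wrap that leaves the upper 7 bits of XARn unchanged; under those assumptions it reproduces the XAR0 value shown in the table.

```python
def mar_sub(xar: int, tax: int) -> int:
    """Model mar(ARn - TAx) on a 23-bit XARn value (linear addressing assumed)."""
    high = xar & 0x7F0000                      # upper 7 bits, assumed untouched
    low = xar & 0xFFFF                         # 16-bit ARn part
    # TAx is considered signed: 0x8000 is read as -32768
    signed = tax - 0x10000 if tax & 0x8000 else tax
    return high | ((low - signed) & 0xFFFF)    # wrap the difference to 16 bits


# Example 1: XAR0 = 018000, T0 = 8000
print(f"{mar_sub(0x018000, 0x8000):06X}")
```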

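Modify Auxiliary or Temporary Register Content by Subtraction

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [2] & mar(TAx - P8) & No & 3 & 1 & AD \\
\hline
\end{tabular}

Opcode The assembler selects the opcode depending on the instruction position in a paralleled pair.

Operands TAx, P8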

Description This instruction performs, in the A-unit address generation units, a subtraction between the auxiliary or temporary register TAx and a program address defined by a program address label assembled into unsigned P8, and stores the result in TAx. The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

If the destination register is an auxiliary register and either the corresponding bit (ARnLC) in status register ST2_55 is set to 1 or the circular addressing qualifier is paralleled with the instruction, the circular buffer management controls the result stored in the destination register.

\section*{Compatibility with C54x devices (C54CM = 1)}

In the translated code section, the mar() instruction must be executed with C54CM set to 1.

When circular modification is selected for the destination auxiliary register, this instruction modifies the selected destination auxiliary register by using BK03 as the circular buffer size register; BK47 is not used.
\begin{tabular}{lll}
Status Bits & Affected by & ST2_55 \\
 & Affects & none
\end{tabular}

\section*{AADD}

\section*{Modify Data Stack Pointer}

\section*{Syntax Characteristics}


\section*{AMAR}

Modify Extended Auxiliary Register Content

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & XAdst = mar(Smem) & No & 3 & 1 & AD \\
{[2]} & mar(XACdst = XACsrc) & Yes & 3 & 1 & AD \\
\hline
\end{tabular}

Description These instructions perform, in the A-unit address generation units:
- Computation of the effective address specified by the Smem operand field and modification of the 23-bit destination register (XARx, XSP, XSSP, XDP, or XCDP). Data memory is not accessed.
- A full 23-bit move from one addressing register to another, from XACsrc to XACdst. The operation is performed in the address phase of the pipeline; however, data memory is not accessed.

\section*{Status Bits}

Affected by ST2_55
Affects none
See Also See the following other related instructions:
- Load Extended Auxiliary Register from Memory
- Load Extended Auxiliary Register with Immediate Value
- Modify Auxiliary Register Content
- Move Extended Auxiliary Register Content
- Store Extended Auxiliary Register Content to Memory
- Modify Extended Auxiliary Register Content by Addition
- Modify Extended Auxiliary Register Content by Subtraction

Syntax Characteristics


Modify Extended Auxiliary Register Content
Syntax Characteristics


\section*{Compatibility with C54x devices (C54CM = 1)}

None.

\section*{Status Bits Affected by}

Affects
Repeat This instruction can be repeated.
Example 1
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(XAR1 = XAR0) & The content of XAR0 is copied to XAR1. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
XAR0 & 12 3456 & XAR0 & 12 3456 \\
XAR1 & 43 5634 & XAR1 & 12 3456
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(XCDP = XAR7) & The content of XAR7 is copied to XCDP. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
XCDP & 008000 & XCDP & 014000 \\
XAR7 & 014000 & XAR7 & 014000
\end{tabular}

Execution
(XACsrc) -> XACdst

\section*{AADD}

\section*{Modify Extended Auxiliary Register Content by Addition}

\section*{Syntax Characteristics}


\section*{Compatibility with C54x devices (C54CM = 1)}

None.
Status Bits Affected by
Affects
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Modify Auxiliary or Temporary Register Content
- Modify Auxiliary or Temporary Register Content by Addition
- Modify Auxiliary or Temporary Register Content by Subtraction
- Modify Auxiliary Register Content with Parallel Multiply
- Modify Auxiliary Register Content with Parallel Multiply and Accumulate
- Modify Auxiliary Register Content with Parallel Multiply and Subtract
- Modify Extended Auxiliary Register Content
- Modify Extended Auxiliary Register Content by Subtraction
- Parallel Modify Auxiliary Register Contents

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(XAR1 + XAR0) & The content of XAR0 is added to XAR1 and stored in XAR1. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
XAR0 & 12 3456 & XAR0 & 12 3456 \\
XAR1 & 43 5634 & XAR1 & 55 8A8A
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(XCDP + XAR7) & The content of XAR7 is added to XCDP and stored in XCDP. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
XCDP & 008000 & XCDP & 010080 \\
XAR7 & 008080 & XAR7 & 008080
\end{tabular}

Execution
(XACdst) + (XACsrc) -> XACdst
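
The execution line above can be checked against both examples with a short Python sketch. The function name `xadd` is hypothetical, and wrapping the result to 23 bits is an assumption of this model (neither example overflows 23 bits).

```python
MASK23 = 0x7FFFFF  # extended addressing registers are 23 bits wide


def xadd(xacdst: int, xacsrc: int) -> int:
    """Model (XACdst) + (XACsrc) -> XACdst, wrapped to 23 bits (assumed)."""
    return (xacdst + xacsrc) & MASK23


# Example 1: XAR1 = 43 5634, XAR0 = 12 3456
print(f"{xadd(0x435634, 0x123456):06X}")
# Example 2: XCDP = 00 8000, XAR7 = 00 8080
print(f"{xadd(0x008000, 0x008080):06X}")
```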

\section*{ASUB}

\section*{Modify Extended Auxiliary Register Content by Subtraction}

\section*{Syntax Characteristics}


\section*{Compatibility with C54x devices (C54CM = 1)}

None.
Status Bits Affected by
Affects
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Modify Auxiliary or Temporary Register Content
- Modify Auxiliary or Temporary Register Content by Addition
- Modify Auxiliary or Temporary Register Content by Subtraction
- Modify Auxiliary Register Content with Parallel Multiply
- Modify Auxiliary Register Content with Parallel Multiply and Accumulate
- Modify Auxiliary Register Content with Parallel Multiply and Subtract
- Modify Extended Auxiliary Register Content
- Modify Extended Auxiliary Register Content by Addition
- Parallel Modify Auxiliary Register Contents

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(XAR1 - XAR0) & The content of XAR0 is subtracted from XAR1 and stored in XAR1. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
XAR0 & 12 3456 & XAR0 & 12 3456 \\
XAR1 & 43 5634 & XAR1 & 31 21DE
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline mar(XCDP - XAR7) & The content of XAR7 is subtracted from XCDP and stored in XCDP. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
XCDP & 008000 & XCDP & 007000 \\
XAR7 & 001000 & XAR7 & 001000
\end{tabular}

Execution
(XACdst) - (XACsrc) -> XACdst
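
A minimal Python sketch of the execution line above, useful for verifying the example values by hand. The name `xsub` is hypothetical, and the 23-bit wrap on borrow is an assumption of this model.

```python
MASK23 = 0x7FFFFF  # extended addressing registers are 23 bits wide


def xsub(xacdst: int, xacsrc: int) -> int:
    """Model (XACdst) - (XACsrc) -> XACdst, wrapped to 23 bits (assumed)."""
    return (xacdst - xacsrc) & MASK23


# Example 1: XAR1 = 43 5634, XAR0 = 12 3456
print(f"{xsub(0x435634, 0x123456):06X}")
# Example 2: XCDP = 00 8000, XAR7 = 00 1000
print(f"{xsub(0x008000, 0x001000):06X}")
```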

\section*{MOV}

Move Accumulator Content to Auxiliary or Temporary Register

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & TAx = HI(ACx) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
Operands ACx, TAx

Description This instruction moves the high part of the accumulator, ACx(31-16), to the destination auxiliary or temporary register (TAx). The 16-bit move operation is performed in the A-unit ALU.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.
Status Bits Affected by M40
Affects none
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Move Accumulator, Auxiliary, or Temporary Register Content
- Move Auxiliary or Temporary Register Content to Accumulator

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AR2 = HI(AC0) & The content of AC0(31-16) is copied to AR2. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
AC0 & 01 E500 0030 & AC0 & 01 E500 0030 \\
AR2 & 0200 & AR2 & E500
\end{tabular}
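
The bit selection in this example can be expressed as a shift and mask. The Python sketch below is illustrative only; `hi_of_acc` is a hypothetical helper name, and the accumulator is represented as a plain 40-bit integer.

```python
def hi_of_acc(acx: int) -> int:
    """Return ACx(31-16) as a 16-bit value; guard bits 39-32 are ignored."""
    return (acx >> 16) & 0xFFFF


ac0 = 0x01E5000030          # AC0 = 01 E500 0030 from the example
ar2 = hi_of_acc(ac0)        # value moved by AR2 = HI(AC0)
print(f"{ar2:04X}")
```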

\section*{MOV}

Move Accumulator, Auxiliary, or Temporary Register Content

\section*{Syntax Characteristics}


\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = AC0 & The content of AC0 is copied to AC1. Because an overflow occurred, ACOV1 is set to 1. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
AC0 & 01 E500 0030 & AC0 & 01 E500 0030 \\
AC1 & 00 2800 0200 & AC1 & 01 E500 0030 \\
M40 & 0 & M40 & 0 \\
SATD & 0 & SATD & 0 \\
ACOV1 & 0 & ACOV1 & 1
\end{tabular}
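
The overflow in this example arises because, with M40 = 0, the 40-bit source value does not fit in a signed 32-bit range. The Python sketch below illustrates that check under stated assumptions: the function names are hypothetical, overflow is taken to mean "outside the signed 32-bit range when M40 = 0, outside the signed 40-bit range when M40 = 1", and the saturation values used when SATD = 1 are the range limits. It is a simplified model, not a description of the D-unit ALU.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value: int) -> int:
    """Interpret a 40-bit pattern as a two's complement integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def move_acc(src: int, m40: int = 0, satd: int = 0):
    """Model an accumulator-to-accumulator move; returns (dst, acov)."""
    value = to_signed40(src)
    bits = 40 if m40 else 32                     # overflow position depends on M40
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    overflow = not (lo <= value <= hi)
    if overflow and satd:                        # saturate only when SATD = 1
        value = hi if value > hi else lo
    return value & MASK40, int(overflow)


dst, acov1 = move_acc(0x01E5000030, m40=0, satd=0)
print(f"AC1 = {dst:010X}, ACOV1 = {acov1}")
```

With SATD = 0 the destination receives the unmodified source value and only the overflow flag records the event, which matches the After column of the example.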

\section*{MOV}

Move Auxiliary or Temporary Register Content to Accumulator

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & HI(ACx) = TAx & Yes & & & \\
\hline
\end{tabular}

Opcode 0101 001E FSSS 00DD

Operands ACx, TAx

Description This instruction moves the content of the auxiliary or temporary register (TAx) to the high part of the accumulator, ACx(31-16):
- The 16-bit move operation is performed in the D-unit ALU.
- During the 16-bit move operation, an overflow is detected according to M40:
  - the destination accumulator overflow status bit (ACOVx) is set.
  - the destination register (ACx) is saturated according to SATD.
- If the source (src) register is an auxiliary or temporary register, the 16 LSBs of the source register are sign extended to 40 bits according to SXMD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.

Status Bits Affected by M40, SATD, SXMD
Affects ACOVx
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Move Accumulator Content to Auxiliary or Temporary Register
- Move Accumulator, Auxiliary, or Temporary Register Content
- Move Auxiliary or Temporary Register Content to CPU Register
- Move Extended Auxiliary Register Content

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline HI(AC0) = T0 & The content of T0 is copied to AC0(31-16). \\
\hline
\end{tabular}

MOV
Move Auxiliary or Temporary Register Content to CPU Register

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & BRC0 = TAx & Yes & 2 & 1 & X \\
{[2]} & BRC1 = TAx & Yes & 2 & 1 & X \\
{[3]} & CDP = TAx & Yes & 2 & 1 & X \\
{[4]} & CSR = TAx & Yes & 2 & 1 & X \\
{[5]} & SP = TAx & Yes & 2 & 1 & X \\
{[6]} & SSP = TAx & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Opcode See Table 5-3 (page 5-258).

Operands TAx

Description This instruction moves the content of the auxiliary or temporary register (TAx) to the selected CPU register. All the move operations are performed in the execute phase of the pipeline and the A-unit ALU is used to transfer the content of the registers.

There is a 3-cycle latency between SP, SSP, CDP, TAx, CSR, and BRCx update and their use in the address phase by the A-unit address generator units or by the P-unit loop control management.

For instruction [2], when BRC1 is loaded with the content of TAx, the block repeat save register (BRS1) is also loaded with the same value.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Move Accumulator Content to Auxiliary or Temporary Register
- Move Accumulator, Auxiliary, or Temporary Register Content
- Move Auxiliary or Temporary Register Content to Accumulator
- Move CPU Register Content to Auxiliary or Temporary Register
- Move Extended Auxiliary Register Content

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline BRC1 = T1 & The content of T1 is copied to the block repeat register (BRC1) and to the block repeat save register (BRS1). \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
T1 & 0034 & T1 & 0034 \\
BRC1 & 00EA & BRC1 & 0034 \\
BRS1 & 00EA & BRS1 & 0034
\end{tabular}

Table 5-3. Opcodes for Move Auxiliary or Temporary Register Content to CPU Register Instruction
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & \multicolumn{4}{|c|}{Opcode} \\
\hline [1] & BRC0 = TAx & 0101 & 001E & FSSS & 1110 \\
\hline [2] & BRC1 = TAx & 0101 & 001E & FSSS & 1101 \\
\hline [3] & CDP = TAx & 0101 & 001E & FSSS & 1010 \\
\hline [4] & CSR = TAx & 0101 & 001E & FSSS & 1100 \\
\hline [5] & SP = TAx & 0101 & 001E & FSSS & 1000 \\
\hline [6] & SSP = TAx & 0101 & 001E & FSSS & 1001 \\
\hline
\end{tabular}
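
To make the field layout of Table 5-3 concrete, the sketch below packs the four nibbles into a 16-bit word in Python. It is an illustration only: the function name `encode_move_to_cpu` is hypothetical, E is assumed to be the parallel enable bit, and the 4-bit FSSS register code is passed in as a raw number because the register-to-code mapping is not given in this section.

```python
# Low nibble per destination register, copied from Table 5-3
DEST_NIBBLE = {
    "BRC0": 0b1110, "BRC1": 0b1101, "CDP": 0b1010,
    "CSR": 0b1100, "SP": 0b1000, "SSP": 0b1001,
}


def encode_move_to_cpu(dest: str, fsss: int, e: int = 0) -> int:
    """Pack 0101 001E FSSS nnnn into a 16-bit opcode word."""
    if e not in (0, 1) or not 0 <= fsss <= 0xF:
        raise ValueError("E must be 0 or 1 and FSSS must fit in 4 bits")
    high_byte = 0b0101_0010 | e                  # 0101 001E
    low_byte = (fsss << 4) | DEST_NIBBLE[dest]   # FSSS followed by the register nibble
    return (high_byte << 8) | low_byte


# FSSS = 0101 is an arbitrary illustrative source code, not a documented register
print(f"{encode_move_to_cpu('BRC1', 0b0101):04X}")
```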

MOV
Move CPU Register Content to Auxiliary or Temporary Register

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & TAx = BRC0 & Yes & 2 & 1 & X \\
{[2]} & TAx = BRC1 & Yes & 2 & 1 & X \\
{[3]} & TAx = CDP & Yes & 2 & 1 & X \\
{[4]} & TAx = SP & Yes & 2 & 1 & X \\
{[5]} & TAx = SSP & Yes & 2 & 1 & X \\
{[6]} & TAx = RPTC & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Opcode See Table 5-4 (page 5-260).

Operands TAx

Description This instruction moves the content of the selected CPU register to the auxiliary or temporary register (TAx). All the move operations are performed in the execute phase of the pipeline and the A-unit ALU is used to transfer the content of the registers.

For instructions [1] and [2], BRCx is decremented in the address phase of the last instruction of a loop. These instructions have a 3-cycle latency requirement versus the last instruction of a loop.

For instructions [3], [4], and [5], there is a 3-cycle latency between SP, SSP, CDP, and TAx update and their use in the address phase by the A-unit address generator units or by the P-unit loop control management.

Status Bits Affected by none
Affects none
Repeat Instruction [6] cannot be repeated; all other instructions can be repeated.
See Also See the following other related instructions:
- Move Accumulator Content to Auxiliary or Temporary Register
- Move Auxiliary or Temporary Register Content to CPU Register
- Store CPU Register Content to Memory

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline T1 = BRC1 & The content of block repeat register (BRC1) is copied to T1. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
T1 & 0034 & T1 & 00EA \\
BRC1 & 00EA & BRC1 & 00EA
\end{tabular}

Table 5-4. Opcodes for Move CPU Register Content to Auxiliary or Temporary Register Instruction
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & \multicolumn{4}{|c|}{Opcode} \\
\hline [1] & TAx = BRC0 & 0100 & 010E & 1100 & FDDD \\
\hline [2] & TAx = BRC1 & 0100 & 010E & 1101 & FDDD \\
\hline [3] & TAx = CDP & 0100 & 010E & 1010 & FDDD \\
\hline [4] & TAx = SP & 0100 & 010E & 1000 & FDDD \\
\hline [5] & TAx = SSP & 0100 & 010E & 1001 & FDDD \\
\hline [6] & TAx = RPTC & 0100 & 010E & 1110 & FDDD \\
\hline
\end{tabular}
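
Reading Table 5-4 in the other direction, a 16-bit word can be split back into its fields. The Python sketch below is a hypothetical decoder (`decode_move_from_cpu` is not a TI tool); it assumes E is the parallel enable bit and returns the FDDD destination code as a raw 4-bit number, since the register-to-code mapping is not listed in this section.

```python
# Source register selected by the third nibble, copied from Table 5-4
SRC_BY_NIBBLE = {
    0b1100: "BRC0", 0b1101: "BRC1", 0b1010: "CDP",
    0b1000: "SP", 0b1001: "SSP", 0b1110: "RPTC",
}


def decode_move_from_cpu(word: int):
    """Split 0100 010E ssss FDDD into (source register, E bit, FDDD code)."""
    if (word >> 9) != 0b0100010:                 # fixed upper seven bits
        raise ValueError("not a 'TAx = CPU register' opcode")
    e = (word >> 8) & 1
    src = SRC_BY_NIBBLE.get((word >> 4) & 0xF)
    if src is None:
        raise ValueError("unknown source register nibble")
    return src, e, word & 0xF


# Illustrative word: E = 0, source nibble 1101, FDDD = 0101
print(decode_move_from_cpu(0x44D5))
```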

MOV
Move Extended Auxiliary Register Content

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & xdst = xsrc & No & 2 & 1 & X \\
\hline
\end{tabular}

Opcode 1001 0000 XSSS XDDD

Operands xdst, xsrc
Description This instruction moves the content of the source register (xsrc) to the destination register (xdst):
- When the destination register (xdst) is an accumulator (ACx) and the source register (xsrc) is a 23-bit register (XARx, XSP, XSSP, XDP, or XCDP):
  - The 23-bit move operation is performed in the D-unit ALU.
  - The upper bits of ACx are filled with 0.
- When the source register (xsrc) is an accumulator (ACx) and the destination register (xdst) is a 23-bit register (XARx, XSP, XSSP, XDP, or XCDP):
  - The 23-bit move operation is performed in the A-unit ALU.
  - The lower 23 bits of ACx are loaded into xdst.
- When both the source register (xsrc) and the destination register (xdst) are accumulators, the Move Accumulator Content instruction (dst = src) is assembled.
\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & none
\end{tabular}

Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Load Extended Auxiliary Register from Memory
- Load Extended Auxiliary Register with Immediate Value
- Modify Extended Auxiliary Register Content
- Store Extended Auxiliary Register Content to Memory

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline XAR1 = AC0 & The lower 23 bits of AC0 are loaded into XAR1. \\
\hline
\end{tabular}
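
The two mixed-width cases in the description reduce to a 23-bit mask. The following Python sketch shows both directions; the function names are hypothetical and the register values are illustrative, not taken from the manual.

```python
MASK23 = 0x7FFFFF  # width of XARx, XSP, XSSP, XDP, and XCDP


def acc_to_xreg(acx: int) -> int:
    """xdst = ACx: the lower 23 bits of the accumulator are loaded into xdst."""
    return acx & MASK23


def xreg_to_acc(xsrc: int) -> int:
    """ACx = xsrc: the 23-bit value is moved and the upper bits of ACx are 0."""
    return xsrc & MASK23


print(f"{acc_to_xreg(0xFFFFFFFFFF):06X}")   # all-ones accumulator, illustrative
print(f"{xreg_to_acc(0x123456):010X}")      # 40-bit accumulator view
```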

\section*{MOV}

Move Memory to Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & Smem = coef(Cmem) & No & 3 & 1 & X \\
{[2]} & coef(Cmem) = Smem & No & 3 & 1 & X \\
{[3]} & Lmem = dbl(coef(Cmem)) & No & 3 & 1 & X \\
{[4]} & dbl(coef(Cmem)) = Lmem & No & 3 & 1 & X \\
{[5]} & dbl(Ymem) = dbl(Xmem) & No & 3 & 1 & X \\
{[6]} & Ymem = Xmem & No & 3 & 1 & X \\
\hline
\end{tabular}
Description These instructions store the content of a memory location to a memory location. They use a dedicated datapath to perform the operation.
Status Bits Affected by none
Affects none
See Also See the following other related instructions:
- Store Accumulator Content to Memory
- Store Accumulator, Auxiliary, or Temporary Register Content to Memory
- Store Auxiliary or Temporary Register Pair Content to Memory
- Store CPU Register Content to Memory
- Store Extended Auxiliary Register Content to Memory

Move Memory to Memory

\section*{Syntax Characteristics}

\begin{tabular}{llll}
Before & & After & \\
*CDP & 3400 & *CDP & 3400 \\
500 & 0000 & 500 & 3400
\end{tabular}

Move Memory to Memory

\section*{Syntax Characteristics}



Move Memory to Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [5] & dbl(Ymem) = dbl(Xmem) & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode 1000 0000 XXXM MMYY YMMM 00xx

Operands Xmem, Ymem

Description This instruction stores the content of two consecutive data memory (Xmem) locations, addressed using the dual addressing mode, to two consecutive data memory (Ymem) locations.
Status Bits Affected by none
Affects none

Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline dbl(*AR1) = dbl(*AR0) & The content addressed by AR0 is copied in the location addressed by AR1 and the content addressed by AR0 + 1 is copied in the location addressed by AR1 + 1. \\
\hline
\end{tabular}
\begin{tabular}{llll}
Before & & After & \\
AR0 & 0300 & AR0 & 0300 \\
AR1 & 0400 & AR1 & 0400 \\
300 & 3400 & 300 & 3400 \\
301 & 0FD3 & 301 & 0FD3 \\
400 & 0000 & 400 & 3400 \\
401 & 0000 & 401 & 0FD3
\end{tabular}
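
The example can be replayed with a dictionary standing in for data memory. This Python sketch is illustrative only: `dbl_move` is a hypothetical helper, addresses are word addresses, and no alignment or bus behavior is modeled.

```python
def dbl_move(mem: dict, src: int, dst: int) -> None:
    """Copy two consecutive 16-bit words from src, src + 1 to dst, dst + 1."""
    # Both source words are read before either destination word is written
    mem[dst], mem[dst + 1] = mem[src], mem[src + 1]


# Memory and pointer values from the example (AR0 = 0300, AR1 = 0400)
memory = {0x300: 0x3400, 0x301: 0x0FD3, 0x400: 0x0000, 0x401: 0x0000}
ar0, ar1 = 0x0300, 0x0400
dbl_move(memory, ar0, ar1)
print({f"{addr:03X}": f"{word:04X}" for addr, word in memory.items()})
```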

Move Memory to Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [6] & Ymem = Xmem & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode 1000 0000 XXXM MMYY YMMM 01xx

Operands Xmem, Ymem

Description This instruction stores the content of a data memory (Xmem) location, addressed using the dual addressing mode, to a data memory (Ymem) location.
Status Bits Affected by none
Affects none

Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *AR3 = *AR5 & The content addressed by AR5 is copied in the location addressed by AR3. \\
\hline
\end{tabular}

\section*{MPY}

\section*{Multiply}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & \begin{tabular}{l}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & ACy = rnd(ACy * ACx) & Yes & 2 & 1 & X \\
\hline [2] & ACy = rnd(ACx * Tx) & Yes & 2 & 1 & X \\
\hline [3] & ACy = rnd(ACx * K8) & Yes & 3 & 1 & X \\
\hline [4] & ACy = rnd(ACx * K16) & No & 4 & 1 & X \\
\hline [5] & ACx = rnd(Smem * coef(Cmem))[, T3 = Smem] & No & 3 & 1 & X \\
\hline [6] & ACy = rnd(Smem * ACx)[, T3 = Smem] & No & 3 & 1 & X \\
\hline [7] & ACx = rnd(Smem * K8)[, T3 = Smem] & No & 4 & 1 & X \\
\hline [8] &  & No & 4 & 1 & X \\
\hline [9] & ACx = rnd(uns(Tx * Smem))[, T3 = Smem] & No & 3 & 1 & X \\
\hline [10] & ACx = rnd(Smem * uns(coef(Cmem))) & No & 3 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform a multiplication in the D-unit MAC.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy
See Also See the following other related instructions:
- Modify Auxiliary Register Content with Parallel Multiply
- Multiply and Accumulate
- Multiply and Accumulate with Parallel Multiply
- Multiply and Subtract
- Multiply and Subtract with Parallel Multiply
- Multiply with Parallel Multiply and Accumulate
- Multiply with Parallel Store Accumulator Content to Memory
- Parallel Multiplies
- Square

Multiply
Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & ACy = rnd(ACy * ACx) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Opcode 0101 010E DDSS 011\%

Operands ACx, ACy

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are ACx(32-16) and ACy(32-16).
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

Compatibility with C54x devices (C54CM = 1)
When this instruction is executed with M40 = 0, compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVy
\begin{tabular}{llll}
Before & & After & \\
AC0 & 02 6000 3400 & AC0 & 02 6000 3400 \\
AC1 & 00 C000 0000 & AC1 & 00 4800 0000 \\
M40 & 1 & M40 & 1 \\
FRCT & 0 & FRCT & 0 \\
ACOV1 & 0 & ACOV1 & 0
\end{tabular}
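
The register values in the table above can be reproduced with a small Python model of the operand selection. This is a simplified sketch under stated assumptions: the function names are hypothetical, bits 32-16 of each accumulator are read as a signed 17-bit operand, only the FRCT shift is modeled, and rounding, SMUL handling, overflow detection, and saturation are deliberately left out.

```python
MASK40 = (1 << 40) - 1


def operand17(acc: int) -> int:
    """Extract ACx(32-16) and interpret it as a signed 17-bit value."""
    value = (acc >> 16) & 0x1FFFF
    return value - (1 << 17) if value & (1 << 16) else value


def mpy(acy: int, acx: int, frct: int = 0) -> int:
    """Model ACy = ACy * ACx without rnd, SMUL, overflow, or saturation."""
    product = operand17(acy) * operand17(acx)
    if frct:
        product <<= 1                 # FRCT = 1: shift the multiplier output left by 1
    return product & MASK40           # keep a 40-bit two's complement pattern


ac0 = 0x0260003400                    # AC0 = 02 6000 3400
ac1 = 0x00C0000000                    # AC1 = 00 C000 0000
print(f"{mpy(ac1, ac0, frct=0):010X}")
```

Note that the guard bits of AC0 above bit 32 do not take part in the product: only 6000h from AC0 and C000h from AC1 reach the multiplier in this case.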

\section*{Multiply}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [2] & ACy = rnd(ACx * Tx) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Opcode 0101 100E DDSS ss0\%

Operands ACx, ACy, Tx

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are ACx(32-16) and the content of Tx, sign extended to 17 bits.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

Compatibility with C54x devices (C54CM = 1)
When this instruction is executed with M40 = 0, compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVy
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 * T0 & The product of the content of AC1 and the content of T0 is stored in AC0. \\
\hline
\end{tabular}
Multiply

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [3] & ACy = rnd(ACx * K8) & Yes & 3 & 1 & X \\
\hline
\end{tabular}

Opcode 0001 111E KKKK KKKK SSDD xx0\%

Operands ACx, ACy, K8

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are ACx(32-16) and the 8-bit signed constant, K8, sign extended to 17 bits.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.

Compatibility with C54x devices (C54CM = 1)
When this instruction is executed with M40 = 0, compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM
Affects
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 * \#-2 & The product of the content of AC1 and a signed 8-bit value (-2) is stored in AC0. \\
\hline
\end{tabular}

\section*{Multiply}

\section*{Syntax Characteristics}


- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.
\begin{tabular}{|l|l|l|}
\hline Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
\hline & Affects & ACOVx \\
\hline Repeat & \multicolumn{2}{|l|}{This instruction can be repeated.} \\
\hline \multicolumn{3}{|l|}{Example} \\
\hline Syntax & \multicolumn{2}{|l|}{Description} \\
\hline AC0 = *AR3 * coef(*CDP) & \multicolumn{2}{|l|}{The product of the content addressed by AR3 and the content addressed by the coefficient data pointer register (CDP) is stored in AC0.} \\
\hline
\end{tabular}

Multiply

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [6] & ACy = rnd(Smem * ACx)[, T3 = Smem] & No & 3 & 1 & X \\
\hline Opcode & \multicolumn{5}{l}{1101 0011 | AAAA AAAI | U\%DD 00SS} \\
\hline
\end{tabular}

\section*{Operands}

ACx, ACy, Smem

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are ACx(32-16) and the content of a memory location (Smem), sign extended to 17 bits.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
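
The interaction of M40 and SATD in the last two items can be sketched as follows. The overflow bit positions and saturation values used here (bit 31 with 00 7FFF FFFFh / FF 8000 0000h when M40 = 0, bit 39 with 7F FFFF FFFFh / 80 0000 0000h when M40 = 1) come from the general C55x status-bit behavior, not from this section, so treat them as assumptions. The function name `detect_and_saturate` is hypothetical.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def detect_and_saturate(result, m40=0, satd=0):
    """Return (accumulator_image, overflow_flag) for a signed result.

    Assumption: overflow is judged on a 32-bit range when M40 = 0 and on a
    40-bit range when M40 = 1; saturation is applied only when SATD = 1.
    """
    width = 40 if m40 else 32
    max_pos = (1 << (width - 1)) - 1
    min_neg = -(1 << (width - 1))
    overflow = result > max_pos or result < min_neg
    if overflow and satd:
        result = max_pos if result > max_pos else min_neg
    return result & MASK40, overflow


image, acov = detect_and_saturate(0x80000000, m40=0, satd=1)
print(f"{image:010X} ACOV={int(acov)}")
```

The returned flag plays the role of the ACOVy bit. In this model the flag is set whether or not SATD is enabled, and only the stored value changes.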

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.
\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
 & Affects & ACOVy \\
Repeat & \multicolumn{2}{l}{This instruction can be repeated.} \\
\end{tabular}

Example
\begin{tabular}{|ll|}
\hline Syntax & Description \\
\hline AC0 = *AR3 * AC1 & The product of the content addressed by AR3 and the content of AC1 is stored in AC0. \\
\hline
\end{tabular}

\section*{Multiply}

\section*{Syntax Characteristics}



\section*{Operands}

ACx, Xmem, Ymem

Description
This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand Ymem, extended to 17 bits.
- Input operands are extended to 17 bits according to uns.

■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
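
The effect of the uns keyword on the 17-bit operand extension can be illustrated with a short sketch. The helper name `extend17` is hypothetical, and SXMD is assumed to be set so that sign extension is active for operands without uns.

```python
def extend17(word16, uns=False):
    """Extend a 16-bit memory word to the 17-bit multiplier input.

    uns=True  -> zero extension (value 0 .. 65535)
    uns=False -> sign extension (value -32768 .. 32767), assuming SXMD = 1
    """
    word16 &= 0xFFFF
    if uns or word16 < 0x8000:
        return word16
    return word16 - 0x10000


x, y = 0xFE00, 0xFF00
print(hex(extend17(x, uns=True) * extend17(y, uns=True)))  # unsigned product
print(hex(extend17(x) * extend17(y)))                      # signed product
```

The same two memory words give very different products depending on the keyword, which is why the 17th bit matters: it lets one multiplier handle both signed and unsigned 16-bit inputs.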

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

Compatibility with C54x devices (C54CM = 1)
When this instruction is executed with M40 = 0, compatibility is ensured.
Status Bits \begin{tabular}{l} 
Affected by FRCT, M40, RDM, SATD, SMUL, SXMD \\
Affects ACOVx
\end{tabular}
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = uns(*AR3) * uns(*AR4) & \begin{tabular}{l} 
The product of the unsigned content addressed by AR3 and the unsigned \\
content addressed by AR4 is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

Multiply
Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [9] & ACx = rnd(uns(Tx * Smem))[, T3 = Smem] & No & 3 & 1 & X \\
\hline Opcode & \multicolumn{5}{l}{1101 0011 | AAAA AAAI | U\%DD u1ss} \\
\hline
\end{tabular}

\section*{Operands}

Description

\section*{Status Bits}

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = uns(T0 * *AR3) & The unsigned product of the content addressed by AR3 and the content of T0 is stored in AC0. \\
\hline
\end{tabular}

\section*{Multiply}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [10] & ACx = rnd(Smem * uns(coef(Cmem))) & No & 3 & 1 & X \\
\hline Opcode & \multicolumn{5}{l}{1101 0000 | AAAA AAAI | 0\%DD 01mm} \\
\hline
\end{tabular}

Operands ACx, Cmem, Smem

Description This instruction performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of a data memory location (Smem) and the content of a data memory operand (Cmem).

\section*{Note:}

The uns keyword is mandatory for this instruction.

The data memory operand Smem is addressed by DAGEN path X by using the Smem addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand Cmem is addressed by DAGEN path C by using the coefficient addressing mode, driven on data bus BDB, and extended to 17 bits by filling with zeros (zero extension) in the MAC1.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
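
The rounding step named in the list above can be sketched as follows. This section does not define the two RDM modes, so the behavior modelled here is an assumption based on the general C55x status-bit description: RDM = 0 adds 8000h and clears bits 15-0, while RDM = 1 rounds to the nearest value and adds 8000h on an exact tie only when bits 31-16 are odd. The function name `round_acc` is hypothetical.

```python
def round_acc(value, rdm=0):
    """Round a signed accumulator value at bit 16 (assumed RDM semantics)."""
    low = value & 0xFFFF                      # bits 15-0 decide the direction
    if rdm == 0:
        value += 0x8000                       # assumed: always add 2**15
    elif low > 0x8000 or (low == 0x8000 and (value >> 16) & 1):
        value += 0x8000                       # assumed: ties go up only from an odd upper part
    return value & ~0xFFFF                    # bits 15-0 are cleared in both modes


for rdm in (0, 1):
    print(rdm, hex(round_acc(0x12348000, rdm)))
```

The two modes differ only for the exact tie value 8000h in bits 15-0. In every other case they produce the same result.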

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

This instruction can be used to compute the intermediate product of a double-precision multiplication while leaving one DAGEN operator (DAGEN path Y) free for a store instruction executed in parallel.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx
\end{tabular}

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = *AR3- * uns(coef(*CDP+)) & The product of the content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is stored in AC0. AR3 is decremented by 1 and CDP is incremented by 1. \\
\hline
\end{tabular}


\section*{MPY::MAC}

\section*{Multiply with Parallel Multiply and Accumulate}

\section*{Syntax Characteristics}
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
ACx = M40(rnd(uns(Xmem) * uns(coef(Cmem)))), \\
ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline [2] & \begin{tabular}{l}
ACy = M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(ACx + (uns(Smem) * uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\
\hline [3] & \begin{tabular}{l}
ACy = M40(rnd(uns(HI(Lmem)) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(ACx + (uns(LO(Lmem)) * uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\
\hline [4] & \begin{tabular}{l}
ACy = M40(rnd(uns(Ymem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(ACx + uns(Xmem) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 5 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Description & These instructions perform two parallel operations in one cycle: multiply, and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs. \\
\hline Status Bits & Affected by FRCT, M40, RDM, SATD, SMUL, SXMD \\
\hline & Affects ACOVx, ACOVy \\
\hline See Also & See the following other related instructions: \\
\hline & - Multiply \\
\hline & - Multiply and Accumulate \\
\hline & - Parallel Multiply and Accumulates \\
\hline
\end{tabular}

Multiply With Parallel Multiply and Accumulate
Syntax Characteristics
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
ACx = M40(rnd(uns(Xmem) * uns(coef(Cmem)))), \\
ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{|l|l|}
\hline Opcode & 1000 0100 | XXXM MMYY | YMMM 10mm | uuDD DDg\% \\
\hline
\end{tabular}

\section*{Operands}

ACx, ACy, Cmem, Xmem, Ymem

\section*{Description}

This instruction performs two parallel operations in one cycle: multiply, and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.
- Input operands are extended to 17 bits according to uns.

■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
\(\square\) If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACy(39).
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
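
The second operation of this syntax, in which ACy is shifted right by 16 bits with sign extension from ACy(39) before the product is added, can be sketched as below. The sketch assumes both memory operands carry the uns keyword and ignores rounding, overflow detection, and saturation; the helper names `to_signed40` and `mac_shifted` are hypothetical.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def to_signed40(image):
    """Convert a 40-bit accumulator image to a signed Python integer."""
    image &= MASK40
    return image - (1 << 40) if image & (1 << 39) else image


def mac_shifted(acy, ymem, coef, frct=0):
    """Model ACy = (ACy >> #16) + (uns(Ymem) * uns(coef(Cmem)))."""
    shifted = to_signed40(acy) >> 16             # arithmetic shift copies ACy(39)
    product = (ymem & 0xFFFF) * (coef & 0xFFFF)  # both operands zero extended
    if frct:
        product <<= 1                            # FRCT = 1: shift product left by 1
    return (shifted + product) & MASK40


print(f"{mac_shifted(0xFF80000000, 0x0002, 0x0003):010X}")
```

Python's `>>` on a negative integer is an arithmetic shift, which is what makes the one-line model of the sign-extending shift valid.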

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:
- mar(Xmem)
- mar(Ymem)
- mar(Cmem)
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects & ACOVx, ACOVy \\
Repeat & \multicolumn{2}{l}{This instruction can be repeated.}
\end{tabular}

Example
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC0 = uns(*AR3) * uns(coef(*CDP)), \\
AC1 = (AC1 >> \#16) + (uns(*AR4) * uns(coef(*CDP)))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. \\
\hline
\end{tabular}

Multiply with Parallel Multiply and Accumulate
Syntax Characteristics
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & \begin{tabular}{l}
ACy = M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(ACx + (uns(Smem) * uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode

Operands ACx, ACy, Cmem, Smem

Description This instruction performs two parallel operations in one cycle: multiply, and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
\(\square\) If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
\(\square\) For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
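
The address pairing used for HI(coef(Cmem)) and LO(coef(Cmem)) in the description above can be written as a small sketch. The function name `coef_pair_addresses` is hypothetical and the model covers only the address rule stated in the text (EA for the HI part, EA+1 when EA is even, EA-1 when EA is odd for the LO part).

```python
def coef_pair_addresses(ea):
    """Return (hi_address, lo_address) for a coefficient effective address."""
    lo = ea + 1 if ea % 2 == 0 else ea - 1  # the other word of the aligned pair
    return ea, lo


for ea in (0x2000, 0x2001):
    hi, lo = coef_pair_addresses(ea)
    print(f"EA={ea:04X}h -> HI at {hi:04X}h, LO at {lo:04X}h")
```

Both cases select the two words of the same even-aligned long word, so the LO address is simply EA with its least significant bit inverted.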

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Compatibility with C54x devices (C54CM = 1)
None.
\begin{tabular}{|l|l|}
\hline \multicolumn{2}{|l|}{Status Bits Affected by FRCT, M40, RDM, SATD, SMUL} \\
\hline \multicolumn{2}{|l|}{Affects ACOVx, ACOVy} \\
\hline \multicolumn{2}{|l|}{Repeat This instruction can be repeated.} \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = uns(*AR3-) * uns(HI(coef(*CDP+))), \\
AC0 = AC0 + (uns(*AR3-) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of the CDP is added to the content of AC0. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

Execution
M40(rnd(uns(Smem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
ACx + M40(rnd(uns(Smem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx
\begin{tabular}{|l|l|l|l|}
\hline \multicolumn{2}{|l|}{Before} & \multicolumn{2}{l|}{After} \\
\hline AC0 & 00 0000 8000 & AC0 & 00 3F80 8000 \\
\hline XAR3 & 00 10FF & XAR3 & 00 10FE \\
\hline \multicolumn{4}{|l|}{Data memory} \\
\hline 10FFh & FE00 & 10FFh & FE00 \\
\hline XCDP & 00 2000 & XCDP & 00 2002 \\
\hline \multicolumn{4}{|l|}{Coeff memory} \\
\hline 2001h & 4000 & 2001h & 4000 \\
\hline AC1 & FF 8000 0000 & AC1 & 00 7F00 0000 \\
\hline \multicolumn{4}{|l|}{Coeff memory} \\
\hline 2000h & 8000 & 2000h & 8000 \\
\hline
\end{tabular}
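
The before/after values in this example can be checked with a minimal sketch. It assumes FRCT = 0, no rnd keyword, uns on every operand, and no overflow, none of which the example states explicitly, although the listed results are consistent with them. The function name `mpy_mac_hi_lo` is hypothetical.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def mpy_mac_hi_lo(smem, coef_hi, coef_lo, acx):
    """Model syntax [2]: returns (ACy, ACx) as 40-bit images.

    ACy = uns(Smem) * uns(HI(coef(Cmem)))
    ACx = ACx + uns(Smem) * uns(LO(coef(Cmem)))
    """
    s = smem & 0xFFFF                    # the shared Smem operand, zero extended
    acy = s * (coef_hi & 0xFFFF)         # MAC2: plain multiply
    acx = acx + s * (coef_lo & 0xFFFF)   # MAC1: multiply and accumulate
    return acy & MASK40, acx & MASK40


# Smem = FE00h at 10FFh, HI coefficient = 8000h at 2000h, LO = 4000h at 2001h
ac1, ac0 = mpy_mac_hi_lo(0xFE00, 0x8000, 0x4000, 0x0000008000)
print(f"AC1 = {ac1:010X}, AC0 = {ac0:010X}")
```

The previous content of AC1 (FF 8000 0000h) does not enter the calculation because the first operation overwrites its destination.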

Syntax Characteristics
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [3] & \begin{tabular}{l}
ACy = M40(rnd(uns(HI(Lmem)) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(ACx + (uns(LO(Lmem)) * uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode 1111 1101 | AAAA AAAI | 0100 01mm | DDDD uug\%

Operands ACx, ACy, Cmem, Lmem

Description This instruction performs two parallel operations in one cycle: multiply, and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(coef(Cmem)). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(coef(Cmem)). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.

\section*{Status Bits}

Repeat

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = uns(HI(*AR3-)) * uns(HI(coef(*CDP+))), \\
AC0 = AC0 + (uns(LO(*AR3-)) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of the CDP is added to the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [4] & \begin{tabular}{l}
ACy = M40(rnd(uns(Ymem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(ACx + uns(Xmem) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 5 (*) & 1 & X \\
\hline
\end{tabular}
(*) 1 LSB is allocated to instruction slot \#2.

\section*{Opcode}

1001 0010 | XXXM

Operands ACx, ACy, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: multiply, and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of data memory operand HI(coef(Cmem)), which is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(coef(Cmem)), which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The input operands are extended to 17 bits according to uns.

■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.

■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
\(\square\) If FRCT \(=1\), the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
\(\square\) The Xmem operand can access the MMRs, but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{|l|l|}
\hline \multicolumn{2}{|l|}{Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD} \\
\hline \multicolumn{2}{|l|}{Affects ACOVx, ACOVy} \\
\hline \multicolumn{2}{|l|}{Repeat This instruction can be repeated.} \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = uns(*AR3-) * uns(HI(coef(*CDP+))), \\
AC0 = AC0 + (uns(*AR2-) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is added to the content of AC0. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}
Execution
M40(rnd(ACx + uns(Xmem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx
M40(rnd(uns(Ymem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
\begin{tabular}{|l|l|l|l|}
\hline \multicolumn{2}{|l|}{Before} & \multicolumn{2}{l|}{After} \\
\hline AC0 & 00 0000 8000 & AC0 & 00 3F80 8000 \\
\hline XAR2 & 00 10FE & XAR2 & 00 10FD \\
\hline XAR3 & 00 20FE & XAR3 & 00 20FD \\
\hline \multicolumn{4}{|l|}{Data memory} \\
\hline 10FEh & FE00 & 10FEh & FE00 \\
\hline XCDP & 00 2000 & XCDP & 00 2002 \\
\hline \multicolumn{4}{|l|}{Coeff memory} \\
\hline 2001h & 4000 & 2001h & 4000 \\
\hline AC1 & FF 8000 0000 & AC1 & 00 7F80 0000 \\
\hline \multicolumn{4}{|l|}{Data memory} \\
\hline 20FEh & FF00 & 20FEh & FF00 \\
\hline \multicolumn{4}{|l|}{Coeff memory} \\
\hline 2000h & 8000 & 2000h & 8000 \\
\hline
\end{tabular}
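The Before and After columns above can be checked by hand. The following Python sketch redoes the arithmetic under the assumptions that FRCT = 0, that rounding is not applied, and that no overflow occurs; the helper names `uns16`, `to40`, and `fmt40` are illustrative and are not part of the instruction set.

```python
def uns16(word):
    """Zero-extend a 16-bit memory word (models the uns keyword)."""
    return word & 0xFFFF

def to40(value):
    """Wrap an integer to a 40-bit accumulator bit pattern."""
    return value & 0xFFFFFFFFFF

def fmt40(value):
    """Format a 40-bit pattern as guard byte, high word, low word."""
    v = to40(value)
    return f"{v >> 32:02X} {(v >> 16) & 0xFFFF:04X} {v & 0xFFFF:04X}"

# Values taken from the Before column of the example.
ac0 = 0x0000008000
xmem = 0xFE00     # data memory at 10FEh, addressed by AR2
ymem = 0xFF00     # data memory at 20FEh, addressed by AR3
coef_hi = 0x8000  # coefficient memory at 2000h, HI(coef(*CDP+))
coef_lo = 0x4000  # coefficient memory at 2001h, LO(coef(*CDP+))

# AC1 = uns(*AR3-) * uns(HI(coef(*CDP+)))
ac1_after = to40(uns16(ymem) * uns16(coef_hi))
# AC0 = AC0 + (uns(*AR2-) * uns(LO(coef(*CDP+))))
ac0_after = to40(ac0 + uns16(xmem) * uns16(coef_lo))

print(fmt40(ac0_after))
print(fmt40(ac1_after))
```

The previous content of AC1 does not enter the computation, because the multiply result overwrites the destination accumulator.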

\section*{MPY::MAS}

\section*{Multiply With Parallel Multiply and Subtract}


\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] &  & No & 4 & 1 & X \\
\hline
\end{tabular}
Opcode \(\quad\) | 1111 1101 | AAAA AAAI | 0000 11mm | DDDD uug% |

Operands ACx, ACy, Cmem, Smem
Description This instruction performs two parallel operations in one cycle: multiply, and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
\(\square\) The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
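The operand extension, the FRCT shift, and the two accumulator updates described above can be sketched in Python. This is a simplified model, not the hardware behavior: it ignores SMUL, rounding, overflow detection, and saturation, and the function names (`extend17`, `multiply`, `mpy_mas`) are hypothetical.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide

def extend17(word, uns=False):
    """Extend a 16-bit word to a 17-bit multiplier input.

    uns=True models zero extension (the uns keyword);
    uns=False models sign extension.
    """
    word &= 0xFFFF
    if uns or word < 0x8000:
        return word
    return word - 0x10000

def multiply(a, b, frct=0):
    """Multiply two extended operands; shift left by 1 bit when FRCT = 1."""
    product = a * b
    return product << 1 if frct else product

def mpy_mas(acx, smem, coef_hi, coef_lo, frct=0, uns=False):
    """Return (ACx, ACy) as 40-bit patterns after the dual operation."""
    s = extend17(smem, uns)
    # First operation (MAC2): ACy = Smem * HI(coef(Cmem))
    acy = multiply(s, extend17(coef_hi, uns), frct)
    # Second operation (MAC1): ACx = ACx - Smem * LO(coef(Cmem))
    acx = acx - multiply(s, extend17(coef_lo, uns), frct)
    return acx & MASK40, acy & MASK40

acx, acy = mpy_mas(0x8000, 0xFE00, 0x8000, 0x4000, uns=True)
print(f"{acx:010X} {acy:010X}")
```

Python integers are unbounded, so keeping the signed product and masking to 40 bits at the end stands in for the sign extension of the 32-bit product to 40 bits.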

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Compatibility with C54x devices (C54CM = 1)
None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & \multicolumn{2}{l}{This instruction can be repeated.}
\end{tabular}

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = uns(*AR3-) * uns(HI(coef(*CDP+))), \\
AC0 = AC0 - (uns(*AR3-) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of the CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

Execution
M40(rnd(uns(Smem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
ACx - M40(rnd(uns(Smem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx



\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & \begin{tabular}{l}
 \\
ACx = M40(rnd(ACx - (uns(LO(Lmem)) * \\
uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode \(\quad\) | 1111 1101 | AAAA AAAI | 0100 11mm | DDDD uug% |

Operands ACx, ACy, Cmem, Lmem
Description This instruction performs two parallel operations in one cycle: multiply, and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(coef(Cmem)). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(coef(Cmem)). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
\(\square\) The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
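The rule for the LO half of a long word access ("EA+1 when EA is even, EA-1 when EA is odd") amounts to flipping the least significant address bit. A minimal Python sketch, with the hypothetical helper name `hi_lo_addresses`:

```python
def hi_lo_addresses(ea):
    """Return (HI address, LO address) for a long word access at EA.

    The HI half is read at EA itself; the LO half is read at the
    neighbouring word: EA+1 when EA is even, EA-1 when EA is odd.
    """
    lo = ea + 1 if ea % 2 == 0 else ea - 1
    return ea, lo

for ea in (0x2000, 0x2001):
    hi, lo = hi_lo_addresses(ea)
    print(f"EA={ea:04X}h  HI at {hi:04X}h  LO at {lo:04X}h")
```

For every non-negative EA the LO address equals `ea ^ 1`, which is why both halves always fall inside the same even-aligned word pair.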
Compatibility with C54x devices (C54CM = 1)
None.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVx, ACOVy
Repeat This instruction can be repeated.

Example
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline  & Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of the CDP is subtracted from the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [3] & \[
\begin{aligned}
& \text { ACy = M40(rnd(uns(Ymem) * uns(HI(coef(Cmem))))), } \\
& \text { ACx = M40(rnd(ACx - uns(Xmem) * uns(LO(coef(Cmem))))) }
\end{aligned}
\] & No & 5 (*) & 1 & X \\
\hline
\end{tabular}
(*) 1 LSB is allocated to instruction slot \#2.

\section*{Opcode}

| 1001 0010 | XXXM MMYY | YMMM 10mm | UuDD DDg% |

Operands ACx, ACy, Cmem, Xmem, Ymem
Description This instruction performs two parallel operations in one cycle: multiply, and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of data memory operand HI(coef(Cmem)), which is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(coef(Cmem)), which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
\(\square\) The input operands are extended to 17 bits according to uns.

■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.

■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
\(\square\) Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
\(\square\) The Xmem operand can access the MMRs but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices \((C 54 C M=1)\)}

None.
\begin{tabular}{|c|c|c|}
\hline \multirow[t]{2}{*}{Status Bits} & \multicolumn{2}{|l|}{Affected by FRCT, M40, RDM, SATD, SMUL, SXMD} \\
\hline & \multicolumn{2}{|l|}{Affects ACOVx, ACOVy} \\
\hline Repeat & \multicolumn{2}{|l|}{This instruction can be repeated.} \\
\hline \multicolumn{3}{|l|}{Example} \\
\hline \multicolumn{2}{|l|}{Syntax} & Description \\
\hline \multicolumn{2}{|l|}{\(\begin{aligned} & \text { AC1 = uns(*AR3-) * uns(HI(coef(*CDP+))), } \\ & \text { AC0 = AC0 - (uns(*AR2-) * uns(LO(coef(*CDP+)))) } \end{aligned}\)} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}
```

Execution
M40(rnd(ACx - uns(Xmem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx
M40(rnd(uns(Ymem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy

```
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline Before & & & After & & & \\
\hline AC0 00 & 0000 & 8000 & AC0 & FF & C080 & 8000 \\
\hline XAR2 & 00 & 10FE & XAR2 & & 00 & 10FD \\
\hline XAR3 & 00 & 20FE & XAR3 & & 00 & 20FD \\
\hline \multicolumn{7}{|l|}{Data memory} \\
\hline 10FEh & & FE00 & 10FEh & & & FE00 \\
\hline XCDP & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{7}{|l|}{Coeff memory} \\
\hline 2001h & & 4000 & 2001h & & & 4000 \\
\hline AC1 FF & 8000 & 0000 & AC1 & 00 & 7F80 & 0000 \\
\hline \multicolumn{7}{|l|}{Data memory} \\
\hline 20FEh & & FF00 & 20FEh & & & FF00 \\
\hline \multicolumn{7}{|l|}{Coeff memory} \\
\hline 2000h & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & ACy = rnd(Tx * Xmem), \\
& Ymem = HI(ACx << T2) [, T3 = Xmem] & No & 4 & 1 & X \\
\hline
\end{tabular}
Opcode \(\quad\) | 1000 0111 | XXXM MMYY | YMMM SSDD | 000x ssU% |

\section*{Operands}

ACx, ACy, Tx, Xmem, Ymem

Description This instruction performs two operations in parallel: multiply and store.

The first operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of data memory operand Xmem, sign extended to 17 bits.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
\(\square\) This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

The second operation shifts the accumulator ACx by the content of T2 and stores ACx(31-16) to data memory operand Ymem. If the 16-bit value in T2 is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.
\(\square\) The input operand is shifted in the D-unit shifter according to SXMD.
\(\square\) After the shift, the high part of the accumulator, ACx(31-16), is stored to the memory location.
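The shift-and-store step can be sketched in Python as follows. The sketch assumes that a negative T2 value means a right shift, that SXMD = 1 (the accumulator is treated as a signed 40-bit value), and that no store saturation is applied; `sat_shift_count` and `store_hi_shifted` are hypothetical names.

```python
MASK40 = (1 << 40) - 1

def sat_shift_count(t2):
    """Read T2 as a signed 16-bit value and saturate it to -32..+31."""
    t2 &= 0xFFFF
    if t2 >= 0x8000:
        t2 -= 0x10000
    return max(-32, min(31, t2))

def store_hi_shifted(acx, t2):
    """Return the 16-bit word ACx(31-16) after shifting ACx by T2."""
    n = sat_shift_count(t2)
    acx &= MASK40
    if acx & (1 << 39):       # assumed SXMD = 1: treat ACx as signed
        acx -= 1 << 40
    shifted = acx << n if n >= 0 else acx >> -n
    return (shifted >> 16) & 0xFFFF

# ACx = FF 8421 1234 shifted left by 4, high word stored to memory
print(f"{store_hi_shifted(0xFF84211234, 0x0004):04X}")
```

With ACx = FF 8421 1234 and T2 = 0004h the stored word is 4211h: the shifted pattern is F8 4211 2340, and bits 31-16 of that pattern are 4211h.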

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When this instruction is executed with C54CM = 1, the 6 LSBs of T2 determine the shift quantity. The 6 LSBs of T2 define a shift quantity within -32 to +31. When the 16-bit value in T2 is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
\(\square\) If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:
\[
\begin{aligned}
& \text { ACy = rnd(Tx * Xmem), } \\
& \text { Ymem = HI(saturate(uns(ACx << T2))) [, T3 = Xmem] }
\end{aligned}
\]
\(\square\) If the SST bit = 1 and the SXMD bit = 1, then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:
\[
\begin{aligned}
& \text { ACy = rnd(Tx * Xmem), } \\
& \text { Ymem = HI(saturate(ACx << T2)) [, T3 = Xmem] }
\end{aligned}
\]
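The C54CM = 1 shift quantity described in the compatibility note above can be read as follows. This Python sketch is one interpretation of that text, not a verified description of the hardware: it takes the 6 LSBs of T2 as a signed 6-bit value and maps the range -32 to -17 onto -16 to -1 by adding 16. The function name `c54cm_shift_count` is hypothetical.

```python
def c54cm_shift_count(t2):
    """Shift quantity derived from T2 when C54CM = 1 (assumed reading)."""
    q = t2 & 0x3F              # only the 6 LSBs of T2 are used
    if q >= 0x20:              # interpret as signed 6-bit: -32..+31
        q -= 0x40
    if -32 <= q <= -17:        # "modulo 16" step: -32..-17 -> -16..-1
        q += 16
    return q

for t2 in (0x0004, 0xFFF0, 0xFFEF, 0xFFE0):
    print(f"T2={t2:04X}h -> shift {c54cm_shift_count(t2)}")
```

Under this reading the effective shift quantity always lies within -16 to +31.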
Status Bits Affected by C54CM, FRCT, M40, RDM, SATD, SMUL, SST, SXMD
Affects ACOVy
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Addition with Parallel Store Accumulator Content to Memory
- Multiply
\(\square\) Multiply and Accumulate with Parallel Store Accumulator Content to Memory
- Multiply and Subtract with Parallel Store Accumulator Content to Memory
- Store Accumulator Content to Memory
- Subtraction with Parallel Store Accumulator Content to Memory

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = rnd(T0 * *AR0+), \\
*AR1+ = HI(AC0 << T2)
\end{tabular} & \begin{tabular}{l}
Both instructions are performed in parallel. The content addressed by AR0 is \\
multiplied by the content of T0. Since FRCT = 1, the result is multiplied by 2, \\
rounded, and stored in AC1. The content of AC0 is shifted by the content of T2, \\
and AC0(31-16) is stored at the address of AR1. AR0 and AR1 are both \\
incremented by 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 & FF & 8421 & 1234 & AC0 & FF & 8421 & 1234 \\
\hline AC1 & 00 & 0000 & 0000 & AC1 & 00 & 2000 & 0000 \\
\hline AR0 & & & 0200 & AR0 & & & 0201 \\
\hline AR1 & & & 0300 & AR1 & & & 0301 \\
\hline T0 & & & 4000 & T0 & & & 4000 \\
\hline T2 & & & 0004 & T2 & & & 0004 \\
\hline 200 & & & 4000 & 200 & & & 4000 \\
\hline 300 & & & 1111 & 300 & & & 4211 \\
\hline FRCT & & & 1 & FRCT & & & 1 \\
\hline ACOV1 & & & 0 & ACOV1 & & & 0 \\
\hline CARRY & & & 0 & CARRY & & & 0 \\
\hline
\hline
\end{tabular}
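The AC1 value in the After column can be reproduced with a short Python sketch of the multiply path. The rounding step here is an assumption (add 8000h, then clear the 16 LSBs); the manual only states that rounding follows RDM. For this example the low word of the product is zero, so the choice of rounding mode does not affect the result. `sext16` and `mpy_rnd` are hypothetical names.

```python
MASK40 = (1 << 40) - 1

def sext16(word):
    """Interpret a 16-bit word as a signed value (sign extension)."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def mpy_rnd(tx, xmem, frct):
    """Model ACy = rnd(Tx * Xmem) and return a 40-bit pattern."""
    product = sext16(tx) * sext16(xmem)
    if frct:
        product <<= 1                        # FRCT = 1: shift left by 1 bit
    product = (product + 0x8000) & ~0xFFFF   # assumed rounding behaviour
    return product & MASK40

ac1 = mpy_rnd(0x4000, 0x4000, frct=1)
print(f"{ac1 >> 32:02X} {(ac1 >> 16) & 0xFFFF:04X} {ac1 & 0xFFFF:04X}")
```

Here 4000h * 4000h = 1000 0000h, which the FRCT shift doubles to 2000 0000h, matching AC1 = 00 2000 0000 in the table.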

\section*{MAC}

\section*{Multiply and Accumulate (MAC)}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & ACy = rnd(ACy + (ACx * Tx)) & Yes & 2 & 1 & X \\
\hline [2] & ACy = rnd((ACy * Tx) + ACx) & Yes & 2 & 1 & X \\
\hline [3] & ACy = rnd(ACx + (Tx * K8)) & Yes & 3 & 1 & X \\
\hline [4] & ACy = rnd(ACx + (Tx * K16)) & No & 4 & 1 & X \\
\hline [5] & ACx = rnd(ACx + (Smem * coef(Cmem)))[, T3 = Smem] & No & 3 & 1 & X \\
\hline [6] & ACy = rnd(ACy + (Smem * ACx))[, T3 = Smem] & No & 3 & 1 & X \\
\hline [7] & ACy = rnd(ACx + (Tx * Smem))[, T3 = Smem] & No & 3 & 1 & X \\
\hline [8] & ACy = rnd(ACx + (Smem * K8))[, T3 = Smem] & No & 4 & 1 & X \\
\hline [9] & \begin{tabular}{l}
ACy = M40(rnd(ACx + (uns(Xmem) * uns(Ymem)))) \\
[, T3 = Xmem]
\end{tabular} & No & 4 & 1 & X \\
\hline [10] & \begin{tabular}{l}
ACy = M40(rnd((ACx \(\gg\) \#16) + (uns(Xmem) * uns(Ymem)))) \\
[, T3 = Xmem]
\end{tabular} & No & 4 & 1 & X \\
\hline [11] & ACx = rnd(ACx + (Smem * uns(coef(Cmem)))) & No & 3 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform a multiplication and an accumulation in the D-unit MAC.

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy

See Also See the following other related instructions:
\(\square\) Modify Auxiliary Register Content with Parallel Multiply and Accumulate
\(\square\) Multiply and Accumulate with Parallel Delay
\(\square\) Multiply and Accumulate with Parallel Load Accumulator from Memory
\(\square\) Multiply and Accumulate with Parallel Multiply
\(\square\) Multiply and Accumulate with Parallel Store Accumulator Content to Memory
\(\square\) Multiply and Subtract
\(\square\) Multiply and Subtract with Parallel Multiply and Accumulate
\(\square\) Multiply with Parallel Multiply and Accumulate
\(\square\) Parallel Multiply and Accumulates


\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & ACy = rnd(ACy + (ACx * Tx)) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

| 0101 011E | DDSS ss0% |

\section*{Operands}

ACx, ACy, Tx

Description This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are ACx(32-16) and the content of Tx, sign extended to 17 bits.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
\(\square\) When an addition overflow is detected, the accumulator is saturated according to SATD.
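The ACx(32-16) multiplier input is a 17-bit field, so bit 32 of the accumulator acts as its sign bit. The following Python sketch extracts that field and performs the accumulate step; it leaves out SMUL, rounding, overflow detection, and saturation, and the names `acc_bits_32_16`, `sext16`, and `mac_acx_tx` are hypothetical.

```python
MASK40 = (1 << 40) - 1

def acc_bits_32_16(acx):
    """Return ACx(32-16) as a signed 17-bit integer."""
    field = (acx >> 16) & 0x1FFFF
    return field - 0x20000 if field & 0x10000 else field

def sext16(word):
    """Interpret a 16-bit register value as signed."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def mac_acx_tx(acy, acx, tx, frct=0):
    """Model ACy = ACy + (ACx * Tx) and return a 40-bit pattern."""
    product = acc_bits_32_16(acx) * sext16(tx)
    if frct:
        product <<= 1
    return (acy + product) & MASK40

print(acc_bits_32_16(0x00FFFF0000))   # bit 32 clear: positive operand
print(acc_bits_32_16(0x01FFFF0000))   # bit 32 set: negative operand
print(hex(mac_acx_tx(0x10, 0x0000020000, 0x0003)))
```

Because the field is 17 bits wide, an accumulator high word of FFFFh with bit 32 clear is a positive multiplier operand, unlike a 16-bit register value of FFFFh.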

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVy
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 + (AC1 * T0) & \begin{tabular}{l} 
The product of the content of AC1 and the content of T0 is added to the content of \\
AC0. The result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}


Syntax Characteristics
\begin{tabular}{rlcccc}
\hline No. & Syntax & \begin{tabular}{r} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [2] & ACy = rnd((ACy * Tx) + ACx) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
| 0101 100E | DDSS ss1% |

\section*{Operands}

ACx, ACy, Tx

Description This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are ACy(32-16) and the content of Tx, sign extended to 17 bits.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
\(\square\) When an addition overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVy
Repeat This instruction can be repeated.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = rnd((AC1 * T1) + AC0) & \begin{tabular}{l} 
The product of the content of AC1 and the content of T1 is added to the content \\
of AC0. The result is rounded and stored in AC1.
\end{tabular} \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [3] & ACy = rnd(ACx + (Tx * K8)) & Yes & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

| 0001 111E | KKKK KKKK | SSDD ss1% |

\section*{Operands}

Description
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD \\
& Affects & ACOVy \\
Repeat & \multicolumn{2}{l}{This instruction can be repeated.}
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 + (T0 * K8) & \begin{tabular}{l} 
The product of the content of T0 and a signed 8-bit value is added to the content of \\
AC1. The result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}


Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [4] & ACy = rnd(ACx + (Tx * K16)) & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode \(\quad\) | 0111 1001 | KKKK KKKK | KKKK KKKK | SSDD ss1% |

Operands ACx, ACy, K16, Tx
Description This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the 16-bit signed constant, K16, sign extended to 17 bits.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
\(\square\) When an addition overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVy
Repeat This instruction can be repeated.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 + (T0 * \#FFFFh) & \begin{tabular}{l} 
The product of the content of T0 and a signed 16-bit value (FFFFh) is added \\
to the content of AC1. The result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}
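Because K16 is a signed constant, FFFFh is sign extended to -1, so this example effectively subtracts T0 from AC1 when FRCT = 0. The Python sketch below illustrates this with made-up register values (the manual gives no Before/After table for this example); `sext16` and `mac_k16` are hypothetical names, and rounding, overflow detection, and saturation are left out.

```python
MASK40 = (1 << 40) - 1

def sext16(word):
    """Interpret a 16-bit value as signed."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def mac_k16(acx, tx, k16, frct=0):
    """Model ACy = ACx + (Tx * K16) and return a 40-bit pattern."""
    product = sext16(tx) * sext16(k16)
    if frct:
        product <<= 1
    return (acx + product) & MASK40

# Illustrative values only: AC1 = 00 0000 1000, T0 = 0010h
print(sext16(0xFFFF))
print(hex(mac_k16(0x0000001000, 0x0010, 0xFFFF)))
```

With these illustrative values the result is 1000h - 10h = 0FF0h.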


\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [5] & ACx = rnd(ACx + (Smem * coef(Cmem)))[, T3 = Smem] & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode \(\quad\) | 1101 0001 | AAAA AAAI | U%DD 01mm |
Operands ACx, Cmem, Smem

\section*{Description}

This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of a memory location (Smem), sign extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and sign extended to 17 bits.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
\(\square\) When an addition overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Compatibility with C54x devices (C54CM = 1)
When this instruction is executed with M40 = 0, compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVx
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC2 = rnd(AC2 + (*AR1 * coef(*CDP))) & \begin{tabular}{l} 
The product of the content addressed by AR1 and the content \\
addressed by the coefficient data pointer register (CDP) is added to \\
the content of AC2. The result is rounded and stored in AC2. The \\
result generated an overflow.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrrlrr} 
Before & & & After & & \\
AC2 & 00 EC00 & 0000 & AC2 & 00 EC00 & 0000 \\
AR1 & & 0302 & AR1 & & 0302 \\
CDP & & 0202 & CDP & & 0202 \\
302 & & FE00 & 302 & & FE00 \\
202 & & 0040 & 202 & & 0040 \\
ACOV2 & & 0 & ACOV2 & & 1
\end{tabular}
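The example values can be traced with a small Python model. It rests on several assumptions that the example does not state: FRCT = 0, M40 = 0 (overflow is detected at bit 31), SATD = 0 (no saturation), and rounding implemented as adding 8000h and clearing the 16 LSBs. The function names are hypothetical.

```python
MASK40 = (1 << 40) - 1

def sext16(word):
    """Interpret a 16-bit memory word as signed."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def signed40(value):
    """Interpret a 40-bit accumulator pattern as a signed integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value

def mac_smem_coef(acx, smem, coef, frct=0, m40=0):
    """Model ACx = rnd(ACx + (Smem * coef(Cmem))); return (ACx, ACOVx)."""
    product = sext16(smem) * sext16(coef)
    if frct:
        product <<= 1
    total = signed40(acx) + product
    total = (total + 0x8000) & ~0xFFFF       # assumed rounding behaviour
    limit = 1 << (39 if m40 else 31)         # overflow position from M40
    overflow = not (-limit <= total < limit)
    return total & MASK40, int(overflow)

acc, acov = mac_smem_coef(0x00EC000000, 0xFE00, 0x0040)
print(f"{acc:010X} ACOV2={acov}")
```

The product FE00h * 0040h is -8000h, so the sum is 00 EBFF 8000, which rounds back to 00 EC00 0000. That value does not fit in a signed 32-bit range, which is consistent with ACOV2 = 1 in the After column.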

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [6] & ACy = rnd(ACy + (Smem * ACx))[, T3 = Smem] & No & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
| 1101 0010 | AAAA AAAI | U%DD 00SS |

\section*{Operands}

ACx, ACy, Smem

Description This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are ACx(32-16) and the content of a memory location (Smem), sign extended to 17 bits.
- If FRCT \(=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
\(\square\) When an addition overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.
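
As an illustration of the operand selection described above, the sketch below takes bits 32 through 16 of ACx as a 17-bit signed value and multiplies it by the sign-extended Smem word. It is a simplified Python model with hypothetical names; rounding, overflow detection, and saturation are left out on purpose.

```python
MASK40 = (1 << 40) - 1

def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def mac_smem_acx(acy, acx, smem, frct=0):
    """Model ACy = ACy + (Smem * ACx) without rounding or saturation."""
    a = sext(acx >> 16, 17)      # ACx(32-16): bit 32 acts as the sign bit
    b = sext(smem, 16)           # 16-bit memory operand, sign extended
    product = a * b
    if frct:
        product <<= 1
    return (sext(acy, 40) + product) & MASK40

print(f"{mac_smem_acx(0x0000000010, 0x0000020000, 0x0003):010X}")
```

Note that bit 32 of ACx, a guard bit, contributes the sign: with ACx = 01 0000 0000h the multiplier input is -65536 rather than 0.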

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVy
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = AC1 + (*AR3 * AC0) & \begin{tabular}{l} 
The product of the content addressed by AR3 and the content of AC0 is added \\
to the content of AC1. The result is stored in AC1.
\end{tabular} \\
\hline
\end{tabular}

Multiply and Accumulate (MAC)
Syntax Characteristics


\section*{Operands}

ACx, ACy, Smem, Tx

Description This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of a memory location (Smem), sign extended to 17 bits.
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.
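
The rounding step controlled by RDM can be sketched as follows. This Python model is an assumption based on the usual description of the C55x rounding modes (RDM = 0 adds 8000h; RDM = 1 rounds to nearest and resolves the exact tie using bit 16); this section does not define the two modes, so treat the details as illustrative. The function name is hypothetical.

```python
def round_acc(value, rdm):
    """Round a signed accumulator value at bit 16 and clear bits 15-0."""
    low = value & 0xFFFF
    if rdm == 0:
        value += 0x8000                    # assumed: rounding to infinity (biased)
    elif low > 0x8000 or (low == 0x8000 and (value >> 16) & 1):
        value += 0x8000                    # assumed: round to nearest, tie goes to even bit 16
    return value & ~0xFFFF

print(hex(round_acc(0x00028000, 0)), hex(round_acc(0x00028000, 1)))
```

The two modes differ only when bits 15-0 equal exactly 8000h and bit 16 is 0, as in the printed case.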

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVy
This instruction can be repeated.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 + (T0 * *AR3) & \begin{tabular}{l} 
The product of the content addressed by AR3 and the content of T0 is added \\
to the content of AC1. The result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

Multiply and Accumulate (MAC)
Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [8] & ACy = rnd(ACx + (Smem * K8))[, T3 = Smem] & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

\section*{Description}


Multiply and Accumulate (MAC)

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [9] & \begin{tabular}{l} 
ACy = M40(rnd(ACx + (uns(Xmem) * uns(Ymem)))) \\
{[, T3 = Xmem]}
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
1000 0110 | XXXM MMYY | YMMM SSDD | 001g uuU\%

\section*{Operands}

ACx, ACy, Xmem, Ymem

Description This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand Ymem, extended to 17 bits.
\(\square\) Input operands are extended to 17 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
\(\square\) If FRCT \(=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
\(\square\) When an addition overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.
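
The effect of the uns keyword on the multiplier inputs can be sketched as below. This is a hedged Python model with hypothetical names; it assumes SXMD = 1 for the signed case and omits rounding, overflow detection, and saturation.

```python
MASK40 = (1 << 40) - 1

def extend17(word, uns):
    """Zero extend (uns applied) or sign extend (uns not applied) a 16-bit word."""
    word &= 0xFFFF
    if uns or not (word & 0x8000):
        return word
    return word - 0x10000

def mac_xy(acx, xmem, ymem, uns_x=False, uns_y=False, frct=0):
    """Model ACy = ACx + (Xmem * Ymem) with optional uns on each operand."""
    product = extend17(xmem, uns_x) * extend17(ymem, uns_y)
    if frct:
        product <<= 1
    acc = acx - (1 << 40) if acx & (1 << 39) else acx   # 40-bit signed accumulator
    return (acc + product) & MASK40

print(f"{mac_xy(0, 0xFE00, 0x4000, True, True):010X}")   # both operands unsigned
print(f"{mac_xy(0, 0xFE00, 0x4000):010X}")               # both operands signed
```

The same memory word FE00h contributes 65024 when uns is applied and -512 when it is not, which is why the two printed accumulators differ.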

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVy
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC3 = rnd(AC3 + (uns(*AR2+) * uns(*AR3+))) & The product of the unsigned content addressed by AR2 and the unsigned content addressed by AR3 is added to the content of AC3. The result is rounded and stored in AC3. The result generated an overflow. AR2 and AR3 are both incremented by 1. \\
\hline
\end{tabular}
\begin{tabular}{lrrrlr} 
Before & & & & After & \\
AC3 & 00 & 2300 & EC00 & AC3 & 00092210000
\end{tabular}

Multiply and Accumulate (MAC)
Syntax Characteristics
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [10] & \begin{tabular}{l} 
ACy = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(Ymem)))) \\
{[, T3 = Xmem]}
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
1000 0110 | XXXM MMYY | YMMM SSDD | 010g uuU\%

\section*{Operands}

ACx, ACy, Xmem, Ymem

Description This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand Ymem, extended to 17 bits.
- Input operands are extended to 17 bits according to uns.
- If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACx(39).
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.
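
The shifted accumulation can be sketched as follows. This Python model, with hypothetical names, shows the right shift by 16 bits with sign extension from bit 39 and then adds an unsigned product; it applies uns to both operands and omits rounding, overflow detection, and saturation.

```python
MASK40 = (1 << 40) - 1

def sext40(value):
    """Interpret a 40-bit pattern as a two's-complement integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value

def shift_right_16(acc):
    """ACx >> #16 with sign extension of ACx(39)."""
    return (sext40(acc) >> 16) & MASK40     # Python's >> on a negative int is arithmetic

def mac_shifted(acx, xmem, ymem):
    """Model ACy = (ACx >> #16) + (uns(Xmem) * uns(Ymem))."""
    product = (xmem & 0xFFFF) * (ymem & 0xFFFF)
    return (sext40(shift_right_16(acx)) + product) & MASK40

print(f"{shift_right_16(0xFF80000000):010X}")
print(f"{mac_shifted(0x0008000000, 0xFE00, 0x8000):010X}")
```

This form is typically what lets a partial product from a previous step be carried down by one 16-bit word before the next partial product is added.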

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVy
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = (AC1 >> \#16) + (uns(*AR3) * uns(*AR4)) & \begin{tabular}{l} 
The product of the unsigned content addressed by AR3 and \\
the unsigned content addressed by AR4 is added to the \\
content of AC1, which has been shifted to the right by \\
16 bits. The result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

Multiply and Accumulate (MAC)

\section*{Syntax Characteristics}


\section*{Note:}

The uns keyword is mandatory for this instruction.

The data memory operand Smem is addressed by DAGEN path X by using the Smem addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand Cmem is addressed by DAGEN path C by using the coefficient addressing mode, driven on data bus BDB, and extended to 17 bits by filling with zeros (zero extended) in the MAC1.
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
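
The overflow and saturation bullets above can be sketched as follows. This Python model is an assumption, not text from this section: it takes the overflow boundary to be the 32-bit signed range when M40 = 0 and the 40-bit signed range when M40 = 1, and it saturates to the corresponding extreme value when SATD = 1. The function name is hypothetical.

```python
MASK40 = (1 << 40) - 1

def detect_and_saturate(result, m40, satd):
    """Return (40-bit accumulator pattern, overflow flag) for a signed result."""
    bits = 40 if m40 else 32
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    overflow = not (lowest <= result <= highest)
    if overflow and satd:
        result = highest if result > highest else lowest   # clamp to the range limit
    return result & MASK40, int(overflow)

for m40, satd in ((0, 0), (0, 1), (1, 1)):
    value, flag = detect_and_saturate(0x80000000, m40, satd)
    print(f"M40={m40} SATD={satd}: {value:010X} ACOV={flag}")
```

With SATD = 0 the flag is still set, but the accumulator keeps the unsaturated 40-bit value.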

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

This instruction can be used to compute an intermediate product of a double-precision multiplication and accumulate it with another partial result. Because its operands use only DAGEN paths X and C, DAGEN path Y remains free for a store instruction executed in parallel.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{|l|l|l|}
\hline Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
\hline & Affects & ACOVx \\
\hline Repeat & \multicolumn{2}{l|}{This instruction can be repeated.} \\
\hline
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 + (*AR3- * uns(coef(*CDP+))) & \begin{tabular}{l} 
The product of the content addressed by AR3 and the unsigned content \\
addressed by the coefficient data pointer register (CDP) is added \\
to the content of AC0. The result is stored in AC0. AR3 is \\
decremented by 1 and CDP is incremented by 1.
\end{tabular} \\
\hline
\end{tabular}


\section*{MACMZ}

Multiply and Accumulate with Parallel Delay

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l} 
ACx = rnd(ACx + (Smem * coef(Cmem)))[, T3 = Smem], \\
delay(Smem)
\end{tabular} & No & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

ACx, Cmem, Smem

Description This instruction performs a multiplication and an accumulation in the D-unit MAC in parallel with the delay memory instruction. The input operands of the multiplier are the content of a memory location (Smem), sign extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and sign extended to 17 bits.
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
\(\square\) When an addition overflow is detected, the accumulator is saturated according to SATD.
This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.
For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
The soft dual memory addressing mode mechanism cannot be applied to this instruction. This instruction cannot use the *port(\#k16) addressing mode or be paralleled with the readport() or writeport() operand qualifier.

This instruction cannot be used for accesses to I/O space. Any illegal access to I/O space generates a hardware bus-error interrupt (BERRINT) to be handled by the CPU.
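
A typical use of a multiply-accumulate combined with a delay is a FIR filter whose sample history shifts by one position per output. The sketch below is a Python illustration with hypothetical names. It assumes that delay(Smem) copies the content of Smem to the next higher address, which is how the delay memory instruction is described elsewhere and is not restated in this section; rounding, FRCT, and saturation are omitted.

```python
def sext16(word):
    """Sign extend a 16-bit memory word."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def macmz(memory, addr, coeff, acc):
    """One step: acc += Smem * coef, then delay(Smem) (assumed copy to addr + 1)."""
    acc += sext16(memory[addr]) * sext16(coeff)
    memory[addr + 1] = memory[addr]
    return acc

def fir_step(memory, base, coeffs, new_sample):
    """Insert a sample at `base` and compute one output, oldest tap first."""
    memory[base] = new_sample & 0xFFFF
    acc = 0
    for k in range(len(coeffs) - 1, -1, -1):
        acc = macmz(memory, base + k, coeffs[k], acc)
    return acc

line = [0, 0, 0, 0]                  # needs len(coeffs) + 1 words
print(fir_step(line, 0, [2, 3, 4], 1), fir_step(line, 0, [2, 3, 4], 5), line)
```

Walking from the oldest tap toward the newest ensures each sample is copied upward before the location below it is overwritten.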

MACM::MOV Multiply and Accumulate with Parallel Load Accumulator from Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & ACx = rnd(ACx + (Tx * Xmem)), & No & 4 & 1 & X \\
& ACy = Ymem << \#16 [, T3 = Xmem] & & & & \\
\hline
\end{tabular}
Opcode 1000 0110 | XXXM MMYY | YMMM DDDD | 101x SsU\%

\section*{Operands}

ACx, ACy, Tx, Xmem, Ymem

Description This instruction performs two operations in parallel: multiply and accumulate (MAC) and load.

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of data memory operand Xmem, sign extended to 17 bits.
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

The second operation loads the content of data memory operand Ymem, which has been shifted to the left by 16 bits, into accumulator ACy.
\(\square\) The input operand is sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
\(\square\) The input operand is shifted to the left by 16 bits according to M40.
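
The load half of this instruction can be sketched as below. This Python model with a hypothetical function name extends the 16-bit operand according to SXMD and shifts it left by 16 bits into a 40-bit accumulator; the M40-dependent overflow handling of the shift is not modeled.

```python
MASK40 = (1 << 40) - 1

def load_shifted(ymem, sxmd=1):
    """Model ACy = Ymem << #16 for a 16-bit memory word."""
    word = ymem & 0xFFFF
    if sxmd and word & 0x8000:
        word -= 0x10000              # SXMD = 1: sign extend the operand
    return (word << 16) & MASK40

print(f"{load_shifted(0x8000):010X}", f"{load_shifted(0x8000, sxmd=0):010X}")
```

The two printed values differ only in the guard bits 39-32, which reflect the sign extension.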

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy
Repeat This instruction can be repeated.
See Also See the following other related instructions:
\(\square\) Modify Auxiliary Register Content with Parallel Multiply and Accumulate
\(\square\) Multiply and Accumulate
\(\square\) Multiply and Accumulate with Parallel Delay
\(\square\) Multiply and Accumulate with Parallel Multiply
\(\square\) Multiply and Accumulate with Parallel Store Accumulator Content to Memory
\(\square\) Multiply and Subtract with Parallel Load Accumulator from Memory
\(\square\) Multiply with Parallel Multiply and Accumulate
\(\square\) Parallel Multiply and Accumulates

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l} 
AC0 = AC0 + (T0 * *AR3), \\
AC1 = *AR4 << \#16
\end{tabular} & \begin{tabular}{l} 
Both instructions are performed in parallel. The product of the content addressed \\
by AR3 and the content of T0 is added to the content of AC0. The result is stored \\
in AC0. The content addressed by AR4, which has been shifted to the left by \\
16 bits, is stored in AC1.
\end{tabular} \\
\hline
\end{tabular}

\section*{MAC::MPY \\ Multiply and Accumulate with Parallel Multiply}

\section*{Syntax Characteristics}


\section*{Multiply and Accumulate With Parallel Multiply}

\section*{Syntax Characteristics}


\section*{Operands}

ACx, ACy, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.
\(\square\) Input operands are extended to 17 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
\(\square\) If FRCT \(=1\), the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
\(\square\) For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:
- mar(Xmem)
- mar(Ymem)
- mar(Cmem)
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects & ACOVx, ACOVy \\
Repeat & \multicolumn{2}{l}{This instruction can be repeated.}
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l} 
AC0 = AC0 + (uns(*AR3) * uns(coef(*CDP))), \\
AC1 = uns(*AR4) * uns(coef(*CDP))
\end{tabular} & \begin{tabular}{l} 
Both instructions are performed in parallel. The product of the \\
unsigned content addressed by AR3 and the unsigned \\
content addressed by the coefficient data pointer register \\
(CDP) is added to the content of AC0. The result is stored in \\
AC0. The product of the unsigned content addressed by AR4 \\
and the unsigned content addressed by CDP is stored in AC1.
\end{tabular} \\
\hline
\end{tabular}

\section*{Multiply and Accumulate With Parallel Multiply}

\section*{Syntax Characteristics}
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & \begin{tabular}{l} 
ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

ACx, ACy, Cmem, Smem

Description This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
\(\square\) If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
\(\square\) For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
\(\square\) For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy
\end{tabular}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l} 
AC1 = AC1 + (uns(*AR3-) * uns(HI(coef(*CDP+)))), \\
AC0 = uns(*AR3-) * uns(LO(coef(*CDP+)))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

Execution
ACy + M40(rnd(uns(Smem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
M40(rnd(uns(Smem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & FF & 8000 & 0000 & AC0 & 00 & 3F80 & 0000 \\
\hline XAR3 & & 00 & 10FF & XAR3 & & 00 & 10FE \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & 00 & 0000 & 8000 & AC1 & 00 & 7F00 & 8000 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}
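
The Before/After values above can be reproduced with a short model of the HI/LO coefficient access. The following Python sketch uses hypothetical names and assumes FRCT = 0, no rounding, and no overflow, none of which the example states. HI(coef(Cmem)) is read at the effective address and LO(coef(Cmem)) at the neighboring address (EA+1 when EA is even, EA-1 when EA is odd), which is the same as flipping bit 0 of EA.

```python
MASK40 = (1 << 40) - 1

def sext40(value):
    """Interpret a 40-bit pattern as a two's-complement integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value

def dual_mac_hi_lo(acy, smem, coeff_mem, cdp):
    """Model ACy = ACy + uns(Smem)*uns(HI(coef)), ACx = uns(Smem)*uns(LO(coef)) with *CDP+."""
    hi = coeff_mem[cdp] & 0xFFFF          # word at EA
    lo = coeff_mem[cdp ^ 1] & 0xFFFF      # word at EA+1 (even EA) or EA-1 (odd EA)
    x = smem & 0xFFFF                     # uns: zero extended operand
    new_acy = (sext40(acy) + x * hi) & MASK40
    new_acx = (x * lo) & MASK40
    return new_acy, new_acx, cdp + 2      # CDP+ with HI/LO advances by 2

coeff = {0x2000: 0x8000, 0x2001: 0x4000}
ac1, ac0, cdp = dual_mac_hi_lo(0x0000008000, 0xFE00, coeff, 0x2000)
print(f"AC1 = {ac1:010X}, AC0 = {ac0:010X}, CDP = {cdp:04X}")
```

Under these assumptions the sketch yields AC1 = 00 7F00 8000, AC0 = 00 3F80 0000, and CDP = 2002, in agreement with the After column.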

Multiply and Accumulate With Parallel Multiply

\section*{Syntax Characteristics}
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [3] & \begin{tabular}{l} 
ACy = M40(rnd((ACy >> \#16) + (uns(Smem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

ACx, ACy, Cmem, Smem

Description This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
\(\square\) If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACy(39).
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & \multicolumn{2}{l}{This instruction can be repeated.}
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l} 
AC1 = (AC1 >> \#16) + (uns(*AR3-) * uns(HI(coef(*CDP+)))), \\
AC0 = uns(*AR3-) * uns(LO(coef(*CDP+)))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}
Execution
(ACy >> \#16) + M40(rnd(uns(Smem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
M40(rnd(uns(Smem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & FF & 8000 & 0000 & AC0 & 00 & 3F80 & 0000 \\
\hline XAR3 & & 00 & 10FF & XAR3 & & 00 & 10FE \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & 00 & 0800 & 0000 & AC1 & 00 & 7F00 & 0800 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}

Multiply and Accumulate With Parallel Multiply

\section*{Syntax Characteristics}
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [4] & \begin{tabular}{l} 
ACy = M40(rnd(ACy + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

Description This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(coef(Cmem)). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(coef(Cmem)). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT \(=1\), the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- For the second operation, the 32 -bit result of the multiplication is sign extended to 40 bits.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
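
The HI/LO address pairing described above can be sketched in a few lines of Python. This is only an illustrative model, not TI code: the function name `long_word_pair` is hypothetical, and the memory contents are taken from the data-memory rows of the examples in this section.

```python
def long_word_pair(ea):
    """Return (hi_addr, lo_addr) for a long-word access at effective address ea.

    HI() uses EA itself; LO() uses the "next address of EA":
    EA+1 when EA is even, EA-1 when EA is odd. Flipping bit 0 does both.
    """
    hi_addr = ea
    lo_addr = ea ^ 1
    return hi_addr, lo_addr


# Hypothetical data memory, values borrowed from the example tables.
data_memory = {0x10FE: 0xFF00, 0x10FF: 0xFE00}

hi_addr, lo_addr = long_word_pair(0x10FE)
hi_word = data_memory[hi_addr]  # operand routed to MAC2
lo_word = data_memory[lo_addr]  # operand routed to MAC1
```

The same pairing applies to the coefficient pointer, so HI(coef(Cmem)) and LO(coef(Cmem)) always come from the same aligned long word.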

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = AC1 + (uns(HI(*AR3-)) * uns(HI(coef(*CDP+)))), \\
AC0 = uns(LO(*AR3-)) * uns(LO(coef(*CDP+)))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

MAC::MPY Multiply and Accumulate with Parallel Multiply


Multiply and Accumulate With Parallel Multiply

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [5] & ```
ACy = M40(rnd((ACy>>#16) + (uns(HI(Lmem)) *
uns(HI(coef(Cmem)))))),
ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem)))))
``` & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1111 1101 \(\mid\) AAAA AAAI \(\mid\) 0110 10mm \(\mid\) DDDD uug\%

\section*{Operands}

ACx, ACy, Cmem, Lmem

\section*{Description}

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(coef(Cmem)). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(coef(Cmem)). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACy(39).
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
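
The operand extension and FRCT steps listed above can be modeled as follows. This is a minimal sketch under stated assumptions: the helper names `extend17` and `multiply` are hypothetical, SMUL-dependent multiplication overflow handling is not modeled, and the sign-extension branch ignores any SXMD dependence.

```python
def extend17(word16, uns):
    """Extend a 16-bit memory word to a 17-bit signed multiplier input.

    With uns, the word is zero extended (always non-negative).
    Without uns, it is sign extended from bit 15.
    """
    word16 &= 0xFFFF
    if uns or word16 < 0x8000:
        return word16
    return word16 - 0x10000


def multiply(x16, y16, uns_x, uns_y, frct):
    """Multiply two extended operands; shift left by 1 bit when FRCT = 1."""
    product = extend17(x16, uns_x) * extend17(y16, uns_y)
    if frct:
        product <<= 1
    return product
```

For instance, `multiply(0xFF00, 0x8000, True, True, 0)` treats both words as unsigned, while the same call with both `uns` flags cleared multiplies -256 by -32768.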

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = (AC1 >> \#16) + (uns(HI(*AR3-)) * uns(HI(coef(*CDP+)))), \\
AC0 = uns(LO(*AR3-)) * uns(LO(coef(*CDP+)))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|c|}
\hline \multicolumn{9}{|l|}{Execution} \\
\hline \multicolumn{9}{|l|}{(ACy >> \#16) + M40(rnd(uns(HI(Lmem))[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy} \\
\hline \multicolumn{9}{|l|}{M40(rnd(uns(LO(Lmem))[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx} \\
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{5}{|l|}{After} \\
\hline AC0 & FF & 8000 & 0000 & AC0 & 00 & 3F80 & 0000 & \\
\hline XAR3 & & 00 & 10FE & XAR3 & & 00 & 10FC & \\
\hline \multicolumn{9}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 & \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 & \\
\hline \multicolumn{9}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 & \\
\hline AC1 & 00 & 0800 & 0000 & AC1 & 00 & 7F80 & 0800 & \\
\hline \multicolumn{9}{|l|}{Data memory} \\
\hline 10FEh & & & FF00 & 10FEh & & & FF00 & \\
\hline \multicolumn{9}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 & \\
\hline
\end{tabular}
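
The Before/After values in the table above can be reproduced with a short Python model. This is a sketch, not a definitive implementation: the function names are hypothetical, and it assumes FRCT = 0, unsigned operands, no rnd, and no overflow, none of which the table states explicitly.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit accumulator pattern as a signed integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def mac_mpy_shifted(acy, hi_x, hi_c, lo_x, lo_c):
    """Model ACy = (ACy >> #16) + HI*HI and ACx = LO*LO for unsigned words."""
    new_acy = (to_signed40(acy) >> 16) + hi_x * hi_c
    new_acx = lo_x * lo_c
    return new_acy & MASK40, new_acx & MASK40


def fmt40(value):
    """Format a 40-bit value as guard bits, high word, low word."""
    return "{:02X} {:04X} {:04X}".format(
        (value >> 32) & 0xFF, (value >> 16) & 0xFFFF, value & 0xFFFF
    )


# Memory contents and AC1 taken from the "Before" column of the table.
data = {0x10FE: 0xFF00, 0x10FF: 0xFE00}
coef = {0x2000: 0x8000, 0x2001: 0x4000}
ac1, ac0 = mac_mpy_shifted(
    0x0008000000, data[0x10FE], coef[0x2000], data[0x10FF], coef[0x2001]
)
print(fmt40(ac1))  # 00 7F80 0800
print(fmt40(ac0))  # 00 3F80 0000
```

The initial content of AC0 does not enter the calculation, because the second operation is a plain multiply that overwrites ACx.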

Multiply and Accumulate With Parallel Multiply
\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [6] & \begin{tabular}{l}
ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(uns(Xmem) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 5 (*) & 1 & X \\
\hline
\end{tabular}
(*) 1 LSB is allocated to instruction slot \#2.

\section*{Opcode}

1001 0100 \(\mid\) XXXM MMYY \(\mid\) YMMM 10mm \(\mid\) UuDD DDg\%

\section*{Operands}

ACx, ACy, Cmem, Xmem, Ymem

\section*{Description}

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the content of data memory operand \(\mathrm{HI}(\operatorname{coef}(\mathrm{Cmem}))\) which is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(coef(Cmem)) which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and extended to 17 bits in the MAC1.

- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs, but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
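
The overflow and saturation bullets above can be illustrated with a small Python sketch. The exact limits are an assumption not stated in this section: the sketch supposes that overflow is checked against a 32-bit signed range when M40 = 0 and against a 40-bit signed range when M40 = 1. The function name `saturate` is hypothetical.

```python
def saturate(result, m40, satd):
    """Return (value, overflow_flag) for a signed accumulator result.

    Assumption: the overflow boundary is 32 bits wide when M40 = 0
    and 40 bits wide when M40 = 1. When SATD = 0 the value is returned
    unchanged and the caller is expected to wrap it to 40 bits.
    """
    bits = 40 if m40 else 32
    upper = (1 << (bits - 1)) - 1
    lower = -(1 << (bits - 1))
    overflow = result > upper or result < lower
    if overflow and satd:
        result = upper if result > upper else lower
    return result, overflow
```

Under this assumption, the optional M40 keyword widens the range for one instruction only, so a result such as 8000 0000h overflows with M40 = 0 but not with M40 = 1.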

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{|c|c|c|}
\hline \multirow[t]{2}{*}{Status Bits} & \multicolumn{2}{|l|}{Affected by FRCT, M40, RDM, SATD, SMUL, SXMD} \\
\hline & \multicolumn{2}{|l|}{Affects ACOVx, ACOVy} \\
\hline Repeat & \multicolumn{2}{|l|}{This instruction can be repeated.} \\
\hline \multicolumn{3}{|l|}{Example} \\
\hline Syntax & & Description \\
\hline \multicolumn{2}{|l|}{\begin{tabular}{l}
AC1 = (AC1 >> \#16) + (uns(*AR3-) * uns(HI(coef(*CDP+)))), \\
AC0 = uns(*AR2-) * uns(LO(coef(*CDP+)))
\end{tabular}} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}


\section*{MAC::MAS}

\section*{Multiply and Accumulate With Parallel Multiply and Subtract}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
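\hline [1] & \begin{tabular}{l}
ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx - (uns(Smem) * uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\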
\hline [2] & ```
ACy = M40(rnd((ACy >> #16) + (uns(Smem) *
uns(HI(coef(Cmem))))),
ACx = M40(rnd(ACx - (uns(Smem) *
uns(LO(coef(Cmem)))))
``` & No & 4 & 1 & X \\
\hline [3] & ```
ACy = M40(rnd(ACy + (uns(HI(Lmem)) *
uns(HI(coef(Cmem))))),
ACx = M40(rnd(ACx - (uns(LO(Lmem)) *
uns(LO(coef(Cmem)))))
``` & No & 4 & 1 & X \\
\hline [4] & ```
ACy = M40(rnd((ACy >> #16) + (uns(HI(Lmem)) *
uns(HI(coef(Cmem)))))),
ACx = M40(rnd(ACx - (uns(LO(Lmem)) *
uns(LO(coef(Cmem)))))
``` & No & 4 & 1 & X \\
\hline [5] & ```
ACy = M40(rnd(ACy + uns(Ymem) *
uns(HI(coef(Cmem))))),
ACx = M40(rnd(ACx - uns(Xmem) *
uns(LO(coef(Cmem)))))
``` & No & 5 & 1 & X \\
\hline [6] & ```
ACy = M40(rnd((ACy >> #16) + (uns(Ymem) *
uns(HI(coef(Cmem))))),
ACx = M40(rnd(ACx - (uns(Xmem) *
uns(LO(coef(Cmem)))))
``` & No & 5 & 1 & X \\
\hline
\end{tabular}

\section*{Description}

These instructions perform two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy
\end{tabular}

\section*{See Also}

See the following other related instructions:

- Modify Auxiliary Register Content with Parallel Multiply and Subtract
- Multiply and Subtract
- Multiply and Subtract with Parallel Load Accumulator from Memory
- Multiply and Subtract with Parallel Store Accumulator Content to Memory
- Multiply and Subtract with Parallel Multiply
- Parallel Multiply and Subtracts

Multiply and Accumulate With Parallel Multiply and Subtract

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx - (uns(Smem) * uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

ACx, ACy, Cmem, Smem

\section*{Description}

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared to MAC1 and MAC2). The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.
For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = AC1 + (uns(*AR3-) * uns(HI(coef(*CDP+)))), \\
AC0 = AC0 - (uns(*AR3-) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

\section*{Execution}

ACy + M40(rnd(uns(Smem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy

ACx - M40(rnd(uns(Smem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 & 00 & 0000 & 8000 & AC0 & FF & C080 & 8000 \\
\hline XAR3 & & 00 & 10FF & XAR3 & & 00 & 10FE \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & 00 & 0000 & 8000 & AC1 & 00 & 7F00 & 8000 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}
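
As a cross-check of the table above, the following Python sketch models the parallel MAC and MAS on a shared Smem operand. It is an illustration only: the function names are hypothetical, and it assumes FRCT = 0, unsigned operands, no rnd, and no saturation.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit accumulator pattern as a signed integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def mac_mas(acy, acx, smem, hi_c, lo_c):
    """Model ACy = ACy + Smem*HI(coef) and ACx = ACx - Smem*LO(coef)."""
    new_acy = to_signed40(acy) + smem * hi_c
    new_acx = to_signed40(acx) - smem * lo_c
    return new_acy & MASK40, new_acx & MASK40


def fmt40(value):
    """Format a 40-bit value as guard bits, high word, low word."""
    return "{:02X} {:04X} {:04X}".format(
        (value >> 32) & 0xFF, (value >> 16) & 0xFFFF, value & 0xFFFF
    )


# "Before" values from the table: Smem at 10FFh, coefficients at 2000h/2001h.
smem = 0xFE00
coef_hi, coef_lo = 0x8000, 0x4000
ac1, ac0 = mac_mas(0x0000008000, 0x0000008000, smem, coef_hi, coef_lo)
print(fmt40(ac1))  # 00 7F00 8000
print(fmt40(ac0))  # FF C080 8000
```

The single Smem word feeds both multipliers, which is why only one data-memory row appears in the table.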

Multiply and Accumulate With Parallel Multiply and Subtract

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & ```
ACy = M40(rnd((ACy >> #16) + (uns(Smem) *
uns(HI(coef(Cmem))))),
ACx = M40(rnd(ACx - (uns(Smem) *
uns(LO(coef(Cmem)))))
``` & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
1111 1101 \(\mid\) AAAA AAAI \(\mid\) 0010 01mm \(\mid\) DDDD uug\%

\section*{Operands}

ACx, ACy, Cmem, Smem
\section*{Description}

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared to MAC1 and MAC2). The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACy(39).
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
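
The 16-bit right shift with sign extension of ACy(39) mentioned above can be sketched as a 40-bit arithmetic shift. The function name `asr16_40` is hypothetical, and the sketch models only the shift, not the accumulation that follows it.

```python
MASK40 = (1 << 40) - 1


def asr16_40(acc):
    """Arithmetic right shift of a 40-bit accumulator by 16 bits.

    Bit 39 is treated as the sign bit and copied into the vacated
    upper bits; the result is returned as a 40-bit pattern.
    """
    acc &= MASK40
    if acc & (1 << 39):
        acc -= 1 << 40  # reinterpret as a negative Python integer
    return (acc >> 16) & MASK40
```

With a positive accumulator such as 00 0800 0000h the shift yields 00 0000 0800h, while a negative one such as FF 8000 0000h yields FF FFFF 8000h.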

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = (AC1 >> \#16) + (uns(*AR3-) * uns(HI(coef(*CDP+)))), \\
AC0 = AC0 - (uns(*AR3-) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

\section*{Execution}

(ACy >> \#16) + M40(rnd(uns(Smem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy

ACx - M40(rnd(uns(Smem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 & 00 & 0000 & 8000 & AC0 & FF & C080 & 8000 \\
\hline XAR3 & & 00 & 10FF & XAR3 & & 00 & 10FE \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & 00 & 0800 & 0000 & AC1 & 00 & 7F00 & 0800 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [3] & ```
ACy = M40(rnd(ACy + (uns(HI(Lmem)) *
uns(HI(coef(Cmem))))),
ACx = M40(rnd(ACx - (uns(LO(Lmem)) *
uns(LO(coef(Cmem)))))
``` & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

ACx, ACy, Cmem, Lmem

\section*{Description}

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(coef(Cmem)). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(coef(Cmem)). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = AC1 + (uns(HI(*AR3-)) * uns(HI(coef(*CDP+)))), \\
AC0 = AC0 - (uns(LO(*AR3-)) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is subtracted from the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}


Multiply and Accumulate With Parallel Multiply and Subtract
\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [4] & ```
ACy = M40(rnd((ACy >> #16) + (uns(HI(Lmem)) *
uns(HI(coef(Cmem)))))),
ACx = M40(rnd(ACx - (uns(LO(Lmem)) *
uns(LO(coef(Cmem))))))
``` & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}
ACx, ACy, Cmem, Lmem

\section*{Description}

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand \(\mathrm{HI}(\) Lmem ) and the content of data memory operand \(\mathrm{HI}(\) coef(Cmem)). The data memory operand \(\mathrm{HI}(\mathrm{Lmem})\) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand \(\mathrm{HI}(\operatorname{coef}(\mathrm{Cmem}))\) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand \(\mathrm{LO}(\operatorname{coef}(\mathrm{Cmem}))\). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA ( \(\mathrm{EA}+1\) when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand \(\mathrm{LO}(\operatorname{coef}(\mathrm{Cmem})\) ) is addressed by DAGEN path \(C\) with the next address of \(E A\) ( \(E A+1\) when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
\(\square\) The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
\(\square\) If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACy(39).
\(\square\) For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
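
The two operations described above can be sketched as a small Python model. This is an illustrative reading of the description, not a reference implementation: the names `signed40` and `dual_mac_mas` and the dictionary-based memories are hypothetical, both operands of each multiplication are assumed to carry the uns keyword, and rounding, SMUL, M40 overflow detection, and SATD saturation are left out. The input values are the Before values of the example that follows in this manual.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def signed40(value):
    """Interpret a 40-bit pattern as a signed Python integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def dual_mac_mas(acy, acx, data_mem, coef_mem, ea_data, ea_coef, frct=0):
    """Return (new ACy, new ACx) as 40-bit patterns."""
    # HI operands sit at EA; LO operands at the "next" address
    # (EA + 1 when EA is even, EA - 1 when EA is odd), i.e. EA ^ 1.
    hi_data = data_mem[ea_data] & 0xFFFF       # uns: zero extended
    lo_data = data_mem[ea_data ^ 1] & 0xFFFF
    hi_coef = coef_mem[ea_coef] & 0xFFFF
    lo_coef = coef_mem[ea_coef ^ 1] & 0xFFFF

    product_hi = (hi_data * hi_coef) << frct   # FRCT = 1 shifts left by 1 bit
    product_lo = (lo_data * lo_coef) << frct

    # Python's >> on a negative int is arithmetic, which matches the sign
    # extension of ACy(39) during the 16-bit right shift.
    new_acy = (signed40(acy) >> 16) + product_hi   # MAC
    new_acx = signed40(acx) - product_lo           # MAS
    return new_acy & MASK40, new_acx & MASK40


data = {0x10FE: 0xFF00, 0x10FF: 0xFE00}
coef = {0x2000: 0x8000, 0x2001: 0x4000}
ac1, ac0 = dual_mac_mas(0x0008000000, 0x0000008000, data, coef, 0x10FE, 0x2000)
print(f"AC1 = {ac1:010X}, AC0 = {ac0:010X}")
```

With these inputs the sketch prints `AC1 = 007F800800, AC0 = FFC0808000`.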

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy
\end{tabular}

Example
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = (AC1 >> \#16) + (uns(HI(*AR3-)) * uns(HI(coef(*CDP+)))), \\
AC0 = AC0 - (uns(LO(*AR3-)) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is subtracted from the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

\section*{Execution}
```

(ACy >> #16) + M40(rnd(uns(HI(Lmem))[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
ACx - M40(rnd(uns(LO(Lmem))[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx

```
\begin{tabular}{|l|l|l|l|}
\hline \multicolumn{2}{|l|}{Before} & \multicolumn{2}{|l|}{After} \\
\hline AC0 & 00 0000 8000 & AC0 & FF C080 8000 \\
\hline AC1 & 00 0800 0000 & AC1 & 00 7F80 0800 \\
\hline XAR3 & 00 10FE & XAR3 & 00 10FC \\
\hline XCDP & 00 2000 & XCDP & 00 2002 \\
\hline \multicolumn{4}{|l|}{Data memory} \\
\hline 10FEh & FF00 & 10FEh & FF00 \\
\hline 10FFh & FE00 & 10FFh & FE00 \\
\hline \multicolumn{4}{|l|}{Coeff memory} \\
\hline 2000h & 8000 & 2000h & 8000 \\
\hline 2001h & 4000 & 2001h & 4000 \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [5] & \begin{tabular}{l}
ACy = M40(rnd(ACy + uns(Ymem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(ACx - uns(Xmem) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 5 (*) & 1 & X \\
\hline
\end{tabular}

(*) 1 LSB is allocated to instruction slot \#2.

\section*{Opcode}

\section*{Operands}
ACx, ACy, Cmem, Xmem, Ymem

\section*{Description}

This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the content of data memory operand \(\mathrm{HI}(\operatorname{coef}(\mathrm{Cmem}))\) which is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(coef(Cmem)) which is addressed by DAGEN path C with the next address of EA (EA +1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
\(\square\) The input operands are extended to 17 bits according to uns.
- If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT \(=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, this can not be executed in parallel with other instructions.
\(\square\) The Xmem operand can access the MMRs but the Ymem operand can not.
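
The 17-bit operand extension rule in the list above can be illustrated with a short Python sketch. The helper name `extend17` is hypothetical, and the handling of SXMD = 0 (treated here as zero extension) is an assumption about how "sign extended according to SXMD" behaves; the list itself only states that SXMD governs the sign extension.

```python
def extend17(word, uns=False, sxmd=1):
    """Extend a 16-bit memory word to the 17-bit multiplier input (as a Python int)."""
    word &= 0xFFFF
    if uns:
        return word  # zero extended: bit 16 is 0
    # Assumption: SXMD = 0 disables sign extension, so the word is zero extended.
    if sxmd and word & 0x8000:
        return word - 0x10000  # sign extended: bit 16 copies bit 15
    return word


print(extend17(0xFE00, uns=True))  # 65024
print(extend17(0xFE00))            # -512
```

The same memory word FE00h therefore enters the multiplier as 65024 with uns and as -512 without it, which is why the keyword changes the product.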

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = AC1 + (uns(*AR3-) * uns(HI(coef(*CDP+)))), \\
AC0 = AC0 - (uns(*AR2-) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}



\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [6] & ```
ACy = M40(rnd((ACy >> #16) + (uns(Ymem) *
uns(HI(coef(Cmem)))))),
ACx = M40(rnd(ACx - (uns(Xmem) *
uns(LO(coef(Cmem))))))
``` & No & 5 (*) & 1 & X \\
\hline
\end{tabular}

(*) 1 LSB is allocated to instruction slot \#2.

\section*{Opcode}

\section*{Operands}
ACx, ACy, Cmem, Xmem, Ymem

\section*{Description}
This instruction performs two parallel operations in one cycle: multiply and accumulate (MAC) and multiply and subtract (MAS). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the \(D\)-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the content of data memory operand \(\mathrm{HI}(\operatorname{coef}(\mathrm{Cmem}))\) which is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D -unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(coef(Cmem)) which is addressed by DAGEN path C with the next address of \(E A\) ( \(E A+1\) when \(E A\) is even, \(E A-1\) when \(E A\) is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The input operands are extended to 17 bits according to uns.

■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT \(=1\), the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
\(\square\) Because this instruction occupies both instruction slots \#1 and \#2, this can not be executed in parallel with other instructions.

The Xmem operand can access the MMRs but the Ymem operand can not.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{|c|c|c|}
\hline \multirow[t]{2}{*}{Status Bits} & \multicolumn{2}{|l|}{Affected by FRCT, M40, RDM, SATD, SMUL, SXMD} \\
\hline & \multicolumn{2}{|l|}{Affects ACOVx, ACOVy} \\
\hline Repeat & \multicolumn{2}{|l|}{This instruction can be repeated.} \\
\hline \multicolumn{3}{|l|}{Example} \\
\hline Syntax & & Description \\
\hline \multicolumn{2}{|l|}{\begin{tabular}{l}
AC1 = (AC1 >> \#16) + (uns(*AR3-) * uns(HI(coef(*CDP+)))), \\
AC0 = AC0 - (uns(*AR2-) * uns(LO(coef(*CDP+))))
\end{tabular}} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}


\section*{MACM::MOV \\ Multiply and Accumulate with Parallel Store Accumulator Content to Memory}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
ACy = rnd(ACy + (Tx * Xmem)), \\
Ymem = HI(ACx << T2) [, T3 = Xmem]
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}
ACx, ACy, Tx, Xmem, Ymem

\section*{Description}
This instruction performs two operations in parallel: multiply and accumulate (MAC) and store.

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of data memory operand Xmem, sign extended to 17 bits.
\(\square\) If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
\(\square\) When an addition overflow is detected, the accumulator is saturated according to SATD.
\(\square\) This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

The second operation shifts the accumulator ACx by the content of T2 and stores ACx(31-16) to data memory operand Ymem. If the 16-bit value in T2 is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.
\(\square\) The input operand is shifted in the D-unit shifter according to SXMD.
\(\square\) After the shift, the high part of the accumulator, \(\operatorname{ACx}(31-16)\), is stored to the memory location.
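
The shift-and-store operation can be sketched in Python as follows. The function name `store_high_shifted` is hypothetical, and the sketch assumes a sign-preserving shift of the 40-bit accumulator; the SXMD-dependent behaviour of the D-unit shifter and any saturation of the stored value are not modeled.

```python
MASK40 = (1 << 40) - 1


def store_high_shifted(acx, t2):
    """Return the 16-bit word written to Ymem by Ymem = HI(ACx << T2)."""
    # Read T2 as a signed 16-bit value, then saturate the count to -32..+31.
    t2 &= 0xFFFF
    if t2 & 0x8000:
        t2 -= 0x10000
    shift = max(-32, min(31, t2))

    # Assumption: sign-preserving shift of the 40-bit accumulator.
    value = acx & MASK40
    if value & (1 << 39):
        value -= 1 << 40
    shifted = value << shift if shift >= 0 else value >> -shift

    return (shifted >> 16) & 0xFFFF  # ACx(31-16) after the shift


print(hex(store_high_shifted(0x0012345678, 4)))       # 0x2345
print(hex(store_high_shifted(0x0012345678, 0xFFFC)))  # 0x123 (T2 = -4)
```

A T2 value outside the range, such as 100, gives the same result as a shift count of +31.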

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When this instruction is executed with C54CM = 1, the 6 LSBs of T2 are used to determine the shift quantity. The 6 LSBs of T2 define a shift quantity within -32 to +31. When the value defined by T2 is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
\(\square\) If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:
\[
\begin{aligned}
& \mathrm{ACy}=\operatorname{rnd}(\mathrm{ACy}+(\mathrm{Tx} * \mathrm{Xmem})), \\
& \mathrm{Ymem}=\mathrm{HI}(\operatorname{saturate}(\operatorname{uns}(\mathrm{ACx} \ll \mathrm{T2})))[, \mathrm{T3}=\mathrm{Xmem}]
\end{aligned}
\]
\(\square\) If the SST bit = 1 and the SXMD bit = 1, then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:
\[
\begin{aligned}
& \mathrm{ACy}=\operatorname{rnd}(\mathrm{ACy}+(\mathrm{Tx} * \mathrm{Xmem})), \\
& \mathrm{Ymem}=\mathrm{HI}(\operatorname{saturate}(\mathrm{ACx} \ll \mathrm{T2}))[, \mathrm{T3}=\mathrm{Xmem}]
\end{aligned}
\]

\section*{Status Bits}
Affected by C54CM, FRCT, M40, RDM, SATD, SMUL, SST, SXMD

Affects ACOVy

\section*{Repeat}
This instruction can be repeated.

\section*{See Also}
See the following other related instructions:
\(\square\) Modify Auxiliary Register Content with Parallel Multiply and Accumulate
\(\square\) Multiply and Accumulate
\(\square\) Multiply and Accumulate with Parallel Delay
\(\square\) Multiply and Accumulate with Parallel Load Accumulator from Memory
\(\square\) Multiply and Accumulate with Parallel Multiply
\(\square\) Multiply and Subtract with Parallel Store Accumulator Content to Memory
\(\square\) Multiply with Parallel Multiply and Accumulate
\(\square\) Parallel Multiply and Accumulates
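
The shift-quantity rule given under Compatibility with C54x devices for this instruction can be sketched as below. This is one reading of that text, offered as an assumption: the 6 LSBs of T2 are taken as a signed 6-bit value, and the "modulo 16 operation" is modeled by adding 16 to quantities from -32 to -17. The function name `c54cm_shift_quantity` is hypothetical.

```python
def c54cm_shift_quantity(t2):
    """Shift quantity used when C54CM = 1 (illustrative reading of the rule)."""
    six_lsbs = t2 & 0x3F
    # Interpret the 6 LSBs as a signed value in -32..+31.
    shift = six_lsbs - 64 if six_lsbs & 0x20 else six_lsbs
    # Assumed modulo-16 step: -32..-17 maps onto -16..-1.
    if -32 <= shift <= -17:
        shift += 16
    return shift


for t2 in (0x0005, 0x0045, 0xFFF0, 0xFFEF, 0xFFE0):
    print(f"T2 = {t2:04X}h -> shift {c54cm_shift_quantity(t2)}")
```

Under this reading, T2 = 0045h yields the same shift as T2 = 0005h, because only the 6 LSBs are examined.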

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC0 = AC0 + (T0 * *AR3), \\
*AR4 = HI(AC1 << T2)
\end{tabular} & \begin{tabular}{l}
Both instructions are performed in parallel. The product of the content \\
addressed by AR3 and the content of T0 is added to the content of AC0. The \\
result is stored in AC0. The content of AC1 is shifted by the content of T2, and \\
AC1(31-16) is stored at the address of AR4.
\end{tabular} \\
\hline
\end{tabular}

\section*{MAS}

\section*{Multiply and Subtract}

\section*{Syntax Characteristics}


 extended to 17 bits.
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32 -bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.

\section*{Status Bits}
Affected by FRCT, M40, RDM, SATD, SMUL

Affects ACOVy

\section*{Repeat}
This instruction can be repeated.

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline AC1 = rnd(AC1 - (AC0 * T1)) & The product of the content of AC0 and the content of T1 is subtracted from the content of AC1. The result is rounded and stored in AC1. \\
\hline Before & After \\
\hline AC0 00 EC00 0000 & AC0 00 EC00 0000 \\
\hline AC1 00 3400 0000 & AC1 00 1680 0000 \\
\hline T1 2000 & T1 2000 \\
\hline M40 0 & M40 0 \\
\hline ACOV1 0 & ACOV1 0 \\
\hline FRCT 0 & FRCT 0 \\
\hline
\end{tabular}


Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [2] & ACx = rnd(ACx - (Smem * coef(Cmem))) [, T3 = Smem] & No & 3 & 1 & X \\
\hline
\end{tabular}
\section*{Opcode}
1101 0001 | AAAA AAAI | U\%DD 10mm

\section*{Operands}
ACx, Cmem, Smem

\section*{Description}
This instruction performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of a memory location (Smem), sign extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and sign extended to 17 bits.
\(\square\) If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.

\begin{tabular}{|c|c|c|c|c|c|}
\hline \multicolumn{3}{|l|}{Before} & \multicolumn{3}{|l|}{After} \\
\hline AC2 & 00 EC00 & 0000 & AC2 & 00 EC01 & 0000 \\
\hline AR1 & & 0302 & AR1 & & 0302 \\
\hline CDP & & 0202 & CDP & & 0202 \\
\hline 302 & & FE00 & 302 & & FE00 \\
\hline 202 & & 0040 & 202 & & 0040 \\
\hline ACOV2 & & 0 & ACOV2 & & 1 \\
\hline SATD & & 0 & SATD & & 0 \\
\hline RDM & & 0 & RDM & & 0 \\
\hline FRCT & & 0 & FRCT & & 0 \\
\hline
\end{tabular}
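
The Before and After values in the table above can be checked with a small Python model. Several points are assumptions, since they are not stated in this excerpt: the example instruction is taken to be AC2 = rnd(AC2 - (*AR1 * coef(*CDP))), M40 is taken to be 0, rounding with RDM = 0 is modeled as adding 8000h and clearing the 16 LSBs, and overflow with M40 = 0 is modeled as the result not fitting in 32 signed bits. The function names are hypothetical.

```python
MASK40 = (1 << 40) - 1


def sext16(word):
    """Sign extend a 16-bit word to a Python integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def mas_smem_coef(acx, smem, cmem, rnd=True, frct=0):
    """Model ACx = rnd(ACx - (Smem * coef(Cmem))) with M40 = 0 and SATD = 0.

    Returns (new ACx as a 40-bit pattern, ACOVx flag).
    """
    acc = acx & MASK40
    if acc & (1 << 39):
        acc -= 1 << 40
    product = (sext16(smem) * sext16(cmem)) << frct
    result = acc - product
    if rnd:
        # Assumed RDM = 0 rounding: add 8000h, then clear the 16 LSBs.
        result = (result + 0x8000) & ~0xFFFF
    # Assumed M40 = 0 overflow rule: result must fit in 32 signed bits.
    overflow = not (-(1 << 31) <= result < (1 << 31))
    return result & MASK40, int(overflow)


ac2, acov2 = mas_smem_coef(0x00EC000000, 0xFE00, 0x0040)
print(f"AC2 = {ac2:010X}, ACOV2 = {acov2}")
```

Under these assumptions the sketch prints `AC2 = 00EC010000, ACOV2 = 1`; because SATD = 0, the accumulator is left unsaturated.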

\section*{Multiply and Subtract}

\section*{Syntax Characteristics}


This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
\section*{Status Bits}
Affected by FRCT, M40, RDM, SATD, SMUL

Affects ACOVy

\section*{Repeat}
This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 - (*AR3 * AC1) & \begin{tabular}{l} 
The product of the content addressed by AR3 and the content of AC1 is \\
subtracted from the content of AC0. The result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [4] & ACy = rnd(ACx - (Tx * Smem)) [, T3 = Smem] & No & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
1101 0101 | AAAA AAAI | U\%DD ssSS

\section*{Operands}
ACx, ACy, Smem, Tx

\section*{Description}
This instruction performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of a memory location (Smem), sign extended to 17 bits.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.

\section*{Status Bits}
Affected by FRCT, M40, RDM, SATD, SMUL

Affects ACOVy

\section*{Repeat}
This instruction can be repeated.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 - (T0 * *AR3) & \begin{tabular}{l} 
The product of the content addressed by AR3 and the content of T0 is \\
subtracted from the content of AC1. The result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [5] & ACy = M40(rnd(ACx - (uns(Xmem) * uns(Ymem)))) [, T3 = Xmem] & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
1000 0110 | XXXM MMYY | YMMM SSDD | 011g uuU\%

\section*{Operands}
ACx, ACy, Xmem, Ymem

\section*{Description}
This instruction performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand Ymem, extended to 17 bits.
- Input operands are extended to 17 bits according to uns.
- If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.

If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32 -bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.

\begin{tabular}{|c|c|c|c|c|c|}
\hline \multicolumn{3}{|l|}{Before} & \multicolumn{3}{|l|}{After} \\
\hline AC3 & 00 2300 & EC00 & AC3 & FF B3E0 & EC00 \\
\hline AR2 & & 302 & AR2 & & 303 \\
\hline AR3 & & 202 & AR3 & & 203 \\
\hline ACOV3 & & 0 & ACOV3 & & 0 \\
\hline 302 & & FE00 & 302 & & FE00 \\
\hline 202 & & 7000 & 202 & & 7000 \\
\hline FRCT & & 0 & FRCT & & 0 \\
\hline
\end{tabular}

\section*{Multiply and Subtract}

\section*{Syntax Characteristics}


\section*{Note:}

The uns keyword is mandatory for this instruction.

The data memory operand Smem is addressed by DAGEN path X by using the Smem addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand Cmem is addressed by DAGEN path C by using the coefficient addressing mode, driven on data bus BDB, and zero extended to 17 bits in the MAC1.
\(\square\) If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

This instruction can be used in double-precision arithmetic to compute an intermediate product and subtract it from another partial result. It also leaves one DAGEN operator (DAGEN path Y) free, so that a store instruction can be executed in parallel.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 - (*AR3- * uns(coef(*CDP+))) & The product of the content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is subtracted from the content of AC0. The result is stored in AC0. AR3 is decremented by 1 and CDP is incremented by 1. \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline \multicolumn{7}{|l|}{Execution} \\
\hline \multicolumn{7}{|l|}{rnd(ACx - (Smem)[16:0] * uns(Cmem)[16:0]) -> ACx} \\
\hline \multicolumn{3}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 00 & 0000 & 8000 & AC0 & 00 & 0100 & 8000 \\
\hline XAR3 & 00 & 1001 & XAR3 & & 00 & 1000 \\
\hline \multicolumn{7}{|l|}{Data memory} \\
\hline 1001h & & FE00 & 1001h & & & FE00 \\
\hline XCDP & 00 & 2000 & XCDP & & 00 & 2001 \\
\hline \multicolumn{7}{|l|}{Coeff memory} \\
\hline 2000h & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}
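
The effect of the mandatory uns keyword on the coefficient can be seen by replaying the example values in a short Python sketch. The function name `mas_uns_coef` and its `coef_unsigned` switch are hypothetical; rounding, FRCT, overflow detection, and saturation are not modeled. The signed-coefficient case is included only for contrast and does not correspond to this instruction.

```python
MASK40 = (1 << 40) - 1


def sext16(word):
    """Sign extend a 16-bit word to a Python integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def mas_uns_coef(acx, smem, cmem, coef_unsigned=True):
    """Model ACx = ACx - (Smem * uns(coef(Cmem))) as a 40-bit pattern."""
    acc = acx & MASK40
    if acc & (1 << 39):
        acc -= 1 << 40
    # Smem is sign extended; the coefficient is zero extended when uns applies.
    coef = (cmem & 0xFFFF) if coef_unsigned else sext16(cmem)
    return (acc - sext16(smem) * coef) & MASK40


print(f"{mas_uns_coef(0x0000008000, 0xFE00, 0x8000):010X}")         # 0001008000
print(f"{mas_uns_coef(0x0000008000, 0xFE00, 0x8000, False):010X}")  # FFFF008000
```

The first result matches the After value of AC0 in the example: FE00h is read as -512 and 8000h as +32768, so subtracting the negative product increases AC0.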

\section*{Syntax Characteristics}
\begin{tabular}{llcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
ACx = rnd(ACx - (Tx * Xmem)), \\
ACy = Ymem << \#16 [, T3 = Xmem]
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}
ACx, ACy, Tx, Xmem, Ymem

\section*{Description}
This instruction performs two operations in parallel: multiply and subtract (MAS), and load.

The first operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of data memory operand Xmem, sign extended to 17 bits.
\(\square\) If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
\(\square\) This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

The second operation loads the content of data memory operand Ymem, which has been shifted to the left by 16 bits, into accumulator ACy.
\(\square\) The input operand is sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
\(\square\) The input operand is shifted to the left by 16 bits according to M40.
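
As a rough illustration of the two data flows described above, the following Python sketch models the multiply-and-subtract and the parallel load. It is not TI's implementation: the function names are hypothetical, "according to SXMD" is assumed to mean sign extension when SXMD = 1 and zero extension when SXMD = 0, and rounding, overflow detection, and saturation are omitted.

```python
MASK40 = (1 << 40) - 1


def sign16(word):
    """Interpret a 16-bit word as a signed value."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def to_signed40(value):
    """Interpret a 40-bit accumulator pattern as a signed value."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def mas_with_parallel_load(acx, tx, xmem, ymem, frct=0, sxmd=1):
    """Model of: ACx = ACx - (Tx * Xmem), ACy = Ymem << #16 [, T3 = Xmem].

    Returns (new ACx, new ACy, value that the optional T3 = Xmem would store).
    """
    # First operation: signed 16 x 16 multiply, then subtract from ACx.
    product = sign16(tx) * sign16(xmem)
    if frct:
        product <<= 1
    new_acx = (to_signed40(acx) - product) & MASK40

    # Second operation: extend Ymem (assumed SXMD behaviour), shift left by 16.
    operand = sign16(ymem) if sxmd else (ymem & 0xFFFF)
    new_acy = (operand << 16) & MASK40

    return new_acx, new_acy, xmem & 0xFFFF


acx, acy, t3 = mas_with_parallel_load(acx=0, tx=2, xmem=3, ymem=0x8000)
print(f"{acx:010X} {acy:010X} {t3:04X}")  # FFFFFFFFFA FF80000000 0003
```

The input values here are made up for the illustration; they do not come from the manual.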
\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.
\begin{tabular}{ll}
Status Bits & Affected by FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects ACOVx, ACOVy \\
Repeat & This instruction can be repeated. \\
See Also & See the following other related instructions: \\
& \(\square\) Modify Auxiliary Register Content with Parallel Multiply and Subtract \\
& \(\square\) Multiply and Accumulate with Parallel Load Accumulator from Memory \\
& \(\square\) Multiply and Subtract \\
& \(\square\) Multiply and Subtract with Parallel Multiply \\
& \(\square\) Multiply and Subtract with Parallel Multiply and Accumulate \\
& \(\square\) Multiply and Subtract with Parallel Store Accumulator Content to Memory \\
& \(\square\) Parallel Multiply and Subtracts
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 - (T0 * *AR3), & Both instructions are performed in parallel. The product of the content addressed by AR3 and the content of T0 is subtracted from the content of AC0. The result is stored in AC0. The content addressed by AR4, which has been shifted to the left by 16 bits, is stored in AC1. \\
AC1 = *AR4 << \#16 & \\
\hline
\end{tabular}

\section*{MAS::MPY \\ Multiply and Subtract with Parallel Multiply}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), & & & & \\
& ACy = M40(rnd(uns(Ymem) * uns(coef(Cmem)))) & No & 4 & 1 & X \\
\hline [2] & ACy = M40(rnd(ACy - (uns(Smem) * uns(HI(coef(Cmem)))))), & & & & \\
& ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem))))) & No & 4 & 1 & X \\
\hline [3] & ACy = M40(rnd(ACy - (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), & & & & \\
& ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem))))) & No & 4 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform two parallel operations in one cycle: multiply and subtract (MAS) and multiply. The operations are executed in the two D-unit MACs.

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy

See Also See the following other related instructions:
\(\square\) Modify Auxiliary Register Content with Parallel Multiply and Subtract
\(\square\) Multiply and Accumulate with Parallel Multiply
\(\square\) Multiply and Subtract
\(\square\) Multiply and Subtract with Parallel Load Accumulator from Memory
\(\square\) Multiply and Subtract with Parallel Multiply and Accumulate
\(\square\) Multiply and Subtract with Parallel Store Accumulator Content to Memory
\(\square\) Parallel Multiply and Subtracts

Syntax Characteristics


\section*{Operands}

ACx, ACy, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: multiply and subtract (MAS) and multiply. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.
- Input operands are extended to 17 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
\(\square\) For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
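
The effect of the optional uns keyword on a multiplier input can be sketched as follows. This is a simplified Python model with hypothetical helper names; the signed case is modelled as plain sign extension, which glosses over the SXMD dependence mentioned above.

```python
def extend17(word, uns):
    """Return the numeric value of a 16-bit word after extension to 17 bits.

    uns=True  -> zero extension (bit 16 = 0), value in 0 .. 65535
    uns=False -> sign extension (bit 16 = bit 15), value in -32768 .. 32767
    """
    word &= 0xFFFF
    if uns:
        return word
    return word - 0x10000 if word & 0x8000 else word


def as_17_bits(value):
    """Show the 17-bit pattern that represents a (possibly negative) value."""
    return value & 0x1FFFF


raw = 0x8000
print(f"{as_17_bits(extend17(raw, uns=True)):05X}")   # 08000
print(f"{as_17_bits(extend17(raw, uns=False)):05X}")  # 18000

# The same memory word gives products of opposite sign:
print(extend17(raw, uns=True) * 2)    # 65536
print(extend17(raw, uns=False) * 2)   # -65536
```

The 17th bit is what lets a single signed multiplier handle both signed and unsigned 16-bit inputs.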

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:
- mar(Xmem)
- mar(Ymem)
- mar(Cmem)
\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects & ACOVx, ACOVy \\
Repeat & \multicolumn{2}{l}{This instruction can be repeated.}
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 - (uns(*AR3) * uns(coef(*CDP))), & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is subtracted from the content of AC0. The result is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is stored in AC1. \\
AC1 = uns(*AR4) * uns(coef(*CDP)) & \\
\hline
\end{tabular}

Syntax Characteristics
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
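\hline [2] & ACy = M40(rnd(ACy - (uns(Smem) * uns(HI(coef(Cmem)))))), ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem))))) & No & 4 & 1 & X \\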
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

ACx, ACy, Cmem, Smem

Description This instruction performs two parallel operations in one cycle: multiply and subtract (MAS) and multiply. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand \(\mathrm{HI}(\operatorname{coef}(\mathrm{Cmem}))\). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared to MAC1 and MAC2). The other data memory operand \(\mathrm{HI}(\operatorname{coef}(\mathrm{Cmem}))\) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand \(\mathrm{LO}(\operatorname{coef}(\mathrm{Cmem}))\). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand \(\mathrm{LO}(\operatorname{coef}(\mathrm{Cmem})\) ) is addressed by DAGEN path C with the next address of EA (EA +1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
\(\square\) For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
\(\square\) For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
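
The HI/LO coefficient addressing rule described above (EA for the higher part; EA+1 when EA is even, EA-1 when EA is odd, for the lower part) can be sketched in Python. The function name and the dictionary standing in for coefficient memory are hypothetical.

```python
def coef_pair_addresses(ea):
    """Return (address of HI(coef(Cmem)), address of LO(coef(Cmem)))."""
    hi_addr = ea
    lo_addr = ea + 1 if ea % 2 == 0 else ea - 1
    return hi_addr, lo_addr


# Toy coefficient memory holding one long word at 2000h/2001h.
coeff_memory = {0x2000: 0x8000, 0x2001: 0x4000}

hi_addr, lo_addr = coef_pair_addresses(0x2000)
print(f"HI @ {hi_addr:04X}h = {coeff_memory[hi_addr]:04X}")  # HI @ 2000h = 8000
print(f"LO @ {lo_addr:04X}h = {coeff_memory[lo_addr]:04X}")  # LO @ 2001h = 4000

# With an odd EA the roles swap: HI is read at EA, LO at EA - 1.
print(coef_pair_addresses(0x2001) == (0x2001, 0x2000))  # True
```

In both cases the two addresses differ only in their least significant bit, so the pair always lies within one aligned long word.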

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & \multicolumn{2}{l}{This instruction can be repeated.}
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = AC1 - (uns(*AR3-) * uns(HI(coef(*CDP+)))), & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
AC0 = uns(*AR3-) * uns(LO(coef(*CDP+))) & \\
\hline
\end{tabular}

Execution
ACy - M40(rnd(uns(Smem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
M40(rnd(uns(Smem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & FF & 8000 & 0000 & AC0 & 00 & 3F80 & 0000 \\
\hline XAR3 & & 00 & 10FF & XAR3 & & 00 & 10FE \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & 00 & 0000 & 8000 & AC1 & FF & 8100 & 8000 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}
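
The accumulator values in the table above can be reproduced with a short arithmetic model. The Python sketch below is illustrative only: the function name is hypothetical, FRCT = 0 is assumed, and rounding, overflow detection, saturation, and the AR3/CDP updates are not modelled.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit accumulator pattern as a signed value."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def mas_mpy_smem(acy, smem, coef_hi, coef_lo, frct=0):
    """Model of syntax [2] with uns applied to every operand:

        ACy = ACy - (uns(Smem) * uns(HI(coef(Cmem)))),
        ACx = uns(Smem) * uns(LO(coef(Cmem)))

    Returns (new ACy, new ACx) as 40-bit patterns.
    """
    data = smem & 0xFFFF                       # shared by both multipliers
    product_hi = data * (coef_hi & 0xFFFF)     # multiply-and-subtract flow
    product_lo = data * (coef_lo & 0xFFFF)     # multiply-only flow
    if frct:
        product_hi <<= 1
        product_lo <<= 1
    new_acy = (to_signed40(acy) - product_hi) & MASK40
    new_acx = product_lo & MASK40
    return new_acy, new_acx


# Table values: AC1 = 00 0000 8000, *AR3 = FE00, HI coef (2000h) = 8000,
# LO coef (2001h) = 4000. The previous content of AC0 is not used.
ac1, ac0 = mas_mpy_smem(0x0000008000, 0xFE00, 0x8000, 0x4000)
print(f"AC1 = {ac1:010X}")  # AC1 = FF81008000
print(f"AC0 = {ac0:010X}")  # AC0 = 003F800000
```

AC0 is overwritten rather than accumulated, which is why its Before value (FF 8000 0000) has no influence on the result.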


\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [3] & ACy = M40(rnd(ACy - (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), & & & & \\
& ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem))))) & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode

Operands ACx, ACy, Cmem, Lmem

Description This instruction performs two parallel operations in one cycle: multiply and subtract (MAS) and multiply. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand \(\mathrm{HI}(\) Lmem ) and the content of data memory operand \(\mathrm{HI}(\operatorname{coef}(\mathrm{Cmem})\) ). The data memory operand \(\mathrm{HI}(\mathrm{Lmem})\) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand \(\mathrm{HI}(\) coef(Cmem)) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand \(\mathrm{LO}(\operatorname{coef}(\mathrm{Cmem}))\). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA ( \(\mathrm{EA}+1\) when EA is even, \(\mathrm{EA}-1\) when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of \(E A\) ( \(E A+1\) when \(E A\) is even, \(E A-1\) when \(E A\) is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.
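
How the M40 setting changes overflow detection and saturation can be sketched as follows. This section does not list the saturation values, so the limits used here (32-bit range when M40 = 0, 40-bit range when M40 = 1) are an assumption based on the accumulator widths; the function names are hypothetical.

```python
MASK40 = (1 << 40) - 1


def effective_m40(status_m40, m40_keyword):
    """The optional M40 keyword forces M40 = 1 for this instruction only."""
    return 1 if m40_keyword else status_m40


def detect_and_saturate(result, m40, satd):
    """Check an unbounded signed result against the M40-selected range.

    Returns (40-bit accumulator pattern, overflow flag). The limits are an
    assumption: 32-bit signed range for M40 = 0, 40-bit signed for M40 = 1.
    """
    bits = 40 if m40 else 32
    upper = (1 << (bits - 1)) - 1
    lower = -(1 << (bits - 1))
    overflow = not (lower <= result <= upper)
    if overflow and satd:
        result = upper if result > upper else lower
    return result & MASK40, overflow


value = 0x80000000  # one past the largest positive 32-bit value

acc, ovf = detect_and_saturate(value, effective_m40(0, False), satd=1)
print(f"{acc:010X} {ovf}")  # 007FFFFFFF True

acc, ovf = detect_and_saturate(value, effective_m40(0, True), satd=1)
print(f"{acc:010X} {ovf}")  # 0080000000 False
```

In this model the same result overflows with the status-register M40 = 0 but fits once the M40 keyword widens the check to 40 bits.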

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVx, ACOVy
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = AC1 - (uns(HI(*AR3-)) * uns(HI(coef(*CDP+)))), & Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
AC0 = uns(LO(*AR3-)) * uns(LO(coef(*CDP+))) & \\
\hline
\end{tabular}


\section*{MAS::MAC}

\section*{Multiply and Subtract with Parallel Multiply and Accumulate}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), & & & & \\
& ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem))))) & No & 4 & 1 & X \\
\hline [2] & ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), & & & & \\
& ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem))))) & No & 4 & 1 & X \\
\hline [3] & ACy = M40(rnd(ACy - (uns(Smem) * uns(HI(coef(Cmem)))))), & & & & \\
& ACx = M40(rnd(ACx + (uns(Smem) * uns(LO(coef(Cmem)))))) & No & 4 & 1 & X \\
\hline [4] & ACy = M40(rnd(ACy - (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), & & & & \\
& ACx = M40(rnd(ACx + (uns(LO(Lmem)) * uns(LO(coef(Cmem)))))) & No & 4 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform two parallel operations in one cycle: multiply and subtract (MAS) and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy
See Also See the following other related instructions:
\(\square\) Modify Auxiliary Register Content with Parallel Multiply and Subtract
- Multiply and Subtract
- Multiply and Subtract with Parallel Load Accumulator from Memory
- Multiply and Subtract with Parallel Multiply
- Multiply and Subtract with Parallel Store Accumulator Content to Memory
- Parallel Multiply and Subtracts

Syntax Characteristics
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), & & & & \\
& ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem))))) & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

ACx, ACy, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: multiply and subtract (MAS) and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.
\(\square\) Input operands are extended to 17 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
\(\square\) For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
\(\square\) For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:
- mar(Xmem)
- mar(Ymem)
- mar(Cmem)
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}


Syntax Characteristics
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), & & & & \\
& ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem))))) & No & 4 & 1 & X \\
\hline \multicolumn{2}{|l|}{Opcode} & MMYY \({ }^{\text {Y }}\) & YMMM 0 & 00 mm | u & DDg\% \\
\hline
\end{tabular}

Operands ACx, ACy, Cmem, Xmem, Ymem

Description This instruction performs two parallel operations in one cycle: multiply and subtract (MAS) and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.
- Input operands are extended to 17 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACy(39).
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
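
The second data flow of this syntax adds the product to ACy after ACy has been shifted right by 16 bits with sign extension from bit 39. A minimal Python sketch of that step follows; the function name and input values are hypothetical, uns is assumed on both multiplier inputs, and rounding, overflow detection, and saturation are omitted.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit accumulator pattern as a signed value."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def mac_with_shifted_acc(acy, ymem, cmem, frct=0):
    """Model of ACy = (ACy >> #16) + (uns(Ymem) * uns(coef(Cmem)))."""
    # Python's >> on a negative int is an arithmetic shift, so the sign
    # of ACy(39) is propagated into the vacated upper bits.
    shifted = to_signed40(acy) >> 16
    product = (ymem & 0xFFFF) * (cmem & 0xFFFF)
    if frct:
        product <<= 1
    return (shifted + product) & MASK40


# Negative accumulator: FF 8000 0000 >> 16 gives FF FFFF 8000, then + 6.
print(f"{mac_with_shifted_acc(0xFF80000000, 2, 3):010X}")  # FFFFFF8006

# Positive accumulator: 00 0003 0000 >> 16 gives 3, then + 6.
print(f"{mac_with_shifted_acc(0x0000030000, 2, 3):010X}")  # 0000000009
```

This form is typically useful in multi-precision arithmetic, where the upper part of a previous partial result is carried into the next accumulation.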

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:
- mar(Xmem)
- mar(Ymem)
- mar(Cmem)

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 - (uns(*AR3) * uns(coef(*CDP))), & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is subtracted from the content of AC0. The result is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. \\
AC1 = (AC1 >> \#16) + (uns(*AR4) * uns(coef(*CDP))) & \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [3] & ACy = M40(rnd(ACy - (uns(Smem) * uns(HI(coef(Cmem)))))), & & & & \\
& ACx = M40(rnd(ACx + (uns(Smem) * uns(LO(coef(Cmem)))))) & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode 1111 1101 | AAAA AAAI | 0001 11mm | DDDD uug\%

Operands ACx, ACy, Cmem, Smem

Description This instruction performs two parallel operations in one cycle: multiply and subtract (MAS) and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand \(\mathrm{HI}(\operatorname{coef}(\mathrm{Cmem})\) ). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared to MAC1 and MAC2). The other data memory operand \(\mathrm{HI}(\) coef(Cmem)) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA ( \(E A+1\) when \(E A\) is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.

- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = AC1 - (uns(*AR3-) * uns(HI(coef(*CDP+)))), & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
AC0 = AC0 + (uns(*AR3-) * uns(LO(coef(*CDP+)))) & \\
\hline
\end{tabular}

Execution
ACy - M40(rnd(uns(Smem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
ACx + M40(rnd(uns(Smem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & 00 & 0000 & 8000 & AC0 & 00 & 3F80 & 8000 \\
\hline XAR3 & & 00 & 10FF & XAR3 & & 00 & 10FE \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & 00 & 0000 & 8000 & AC1 & FF & 8100 & 8000 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}
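The Before/After values in this example can be reproduced with a few lines of arithmetic. The following Python sketch is only an illustration, not TI tooling: the function name `dual_mas_mac` is hypothetical, and it assumes FRCT = 0, no rnd keyword, and no overflow or saturation, so the result is simply wrapped to 40 bits.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def dual_mas_mac(acy, acx, smem, coef_hi, coef_lo):
    """Model the two parallel operations on unsigned 16-bit operands.

    Assumptions (not taken from the manual): FRCT = 0, no rounding,
    no saturation. Returns (ACy, ACx) as 40-bit patterns.
    """
    acy_new = (acy - smem * coef_hi) & MASK40  # multiply and subtract
    acx_new = (acx + smem * coef_lo) & MASK40  # multiply and accumulate
    return acy_new, acx_new


# Values from the example table: *AR3 = FE00h, HI coefficient (2000h) = 8000h,
# LO coefficient (2001h) = 4000h, both accumulators start at 00 0000 8000.
ac1, ac0 = dual_mas_mac(0x0000008000, 0x0000008000, 0xFE00, 0x8000, 0x4000)
print(f"AC1 = {ac1:010X}, AC0 = {ac0:010X}")
# prints: AC1 = FF81008000, AC0 = 003F808000
```

The printed patterns match the After column (AC1 = FF 8100 8000, AC0 = 00 3F80 8000).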

Syntax Characteristics
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & \begin{tabular}{l}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [4] & \begin{tabular}{l}
ACy = M40(rnd(ACy - (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx + (uns(LO(Lmem)) * uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode
\begin{tabular}{|ll|l|l|l|l|}
& 1111 & AAAA AAAI & 0101 & 11mm & DDDD uug\%
\end{tabular}

Operands
ACx, ACy, Cmem, Lmem

Description

This instruction performs two parallel operations in one cycle: multiply and subtract (MAS) and multiply and accumulate (MAC). The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(coef(Cmem)). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(coef(Cmem)). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.
For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
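The HI/LO address pairing described above (HI at the EA, LO at EA+1 when the EA is even and EA-1 when it is odd) can be sketched in Python as follows. The helper names and the dictionary standing in for data memory are hypothetical and exist only for illustration.

```python
def hi_lo_addresses(ea):
    """Return (HI address, LO address) for a long-word operand at EA.

    The HI part is read at EA itself; the LO part is read at the paired
    word address: EA + 1 when EA is even, EA - 1 when EA is odd.
    """
    lo = ea + 1 if ea % 2 == 0 else ea - 1
    return ea, lo


def read_long_operand(memory, ea):
    """Fetch the (HI, LO) 16-bit words from a dict that models memory."""
    hi_addr, lo_addr = hi_lo_addresses(ea)
    return memory[hi_addr], memory[lo_addr]


# Illustrative memory image (addresses and contents are assumptions).
data = {0x10FE: 0xFF00, 0x10FF: 0xFE00}
hi, lo = read_long_operand(data, 0x10FE)
print(f"HI = {hi:04X}, LO = {lo:04X}")  # prints: HI = FF00, LO = FE00
```

Because the pairing only toggles the least significant address bit, the LO address is equivalent to `ea ^ 1` in this model.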

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & \multicolumn{2}{l}{This instruction can be repeated.}
\end{tabular}

Example
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = AC1 - (uns(HI(*AR3-)) * uns(HI(coef(*CDP+)))), \\
AC0 = AC0 + (uns(LO(*AR3-)) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

MAS::MAC Multiply and Subtract with Parallel Multiply and Accumulate
Execution
ACy - M40(rnd(uns(HI(Lmem))[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
ACx + M40(rnd(uns(LO(Lmem))[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & 00 & 0000 & 8000 & AC0 & 00 & 3F80 & 8000 \\
\hline XAR3 & & 00 & 10FE & XAR3 & & 00 & 10FC \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & 00 & 0000 & 8000 & AC1 & FF & 8080 & 8000 \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FEh & & & FF00 & 10FEh & & & FF00 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}

Multiply and Subtract with Parallel Store Accumulator Content to Memory

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
ACy = rnd(ACy - (Tx * Xmem)), \\
Ymem = HI(ACx << T2) [,T3 = Xmem]
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode

Operands
ACx, ACy, Tx, Xmem, Ymem

Description
This instruction performs two operations in parallel: multiply and subtract (MAS) and store.

The first operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of Tx, sign extended to 17 bits, and the content of data memory operand Xmem, sign extended to 17 bits.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- This instruction provides the option to store the 16-bit data memory operand Xmem in temporary register T3.

The second operation shifts the accumulator ACx by the content of T2 and stores ACx(31-16) to data memory operand Ymem. If the 16-bit value in T2 is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.
- The input operand is shifted in the D-unit shifter according to SXMD.
- After the shift, the high part of the accumulator, ACx(31-16), is stored to the memory location.
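The shift-count saturation and the store of ACx(31-16) described above can be sketched as follows. This is a simplified Python model under stated assumptions, not the exact hardware behavior: the function names are hypothetical, the accumulator is passed as a signed Python integer, the shift is treated as a plain arithmetic shift (as if SXMD = 1), and no rounding or result saturation is applied.

```python
def saturate_shift_count(t2):
    """Interpret T2 as a signed 16-bit value and clamp it to -32..+31."""
    t2 &= 0xFFFF
    if t2 & 0x8000:          # negative in 16-bit two's complement
        t2 -= 0x10000
    return max(-32, min(31, t2))


def store_high(acx, t2):
    """Return the 16-bit word written to Ymem by Ymem = HI(ACx << T2).

    Assumptions: acx is a signed integer holding the accumulator value,
    negative counts shift right arithmetically, nothing is saturated.
    """
    n = saturate_shift_count(t2)
    shifted = acx << n if n >= 0 else acx >> -n
    return (shifted >> 16) & 0xFFFF   # bits 31-16 of the shifted value


print(f"{store_high(0x12345678, 4):04X}")       # prints: 2345
print(f"{store_high(0x12345678, 0xFFFC):04X}")  # T2 = -4, prints: 0123
```

A count outside the range is clamped first, so `saturate_shift_count(100)` gives 31 and `saturate_shift_count(0xFF00)` gives -32 in this model.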

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When this instruction is executed with C54CM = 1, the 6 LSBs of T2 determine the shift quantity. The 6 LSBs of T2 define a shift quantity within -32 to +31. When the 16-bit value in T2 is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
- If the SST bit = 1 and the SXMD bit = 0, then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

ACy = rnd(ACy - (Tx * Xmem)),
Ymem = HI(saturate(uns(ACx << T2))) [,T3 = Xmem]
- If the SST bit = 1 and the SXMD bit = 1, then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

ACy = rnd(ACy - (Tx * Xmem)),
Ymem = HI(saturate(ACx << T2)) [,T3 = Xmem]

\begin{tabular}{lll}
Status Bits & Affected by & C54CM, FRCT, M40, RDM, SATD, SMUL, SST, SXMD \\
& Affects & ACOVy
\end{tabular}

Repeat

See Also
See the following other related instructions:
- Modify Auxiliary Register Content with Parallel Multiply and Subtract
- Multiply and Accumulate with Parallel Store Accumulator Content to Memory
- Multiply and Subtract
- Multiply and Subtract with Parallel Load Accumulator from Memory
- Multiply and Subtract with Parallel Multiply
- Multiply and Subtract with Parallel Multiply and Accumulate
- Parallel Multiply and Subtracts

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l} 
AC0 = AC0 - (T0 * *AR3), \\
*AR4 = HI(AC1 << T2)
\end{tabular} & \begin{tabular}{l} 
Both instructions are performed in parallel. The product of the content \\
addressed by AR3 and the content of T0 is subtracted from the content of AC0. \\
The result is stored in AC0. The content of AC1 is shifted by the content of T2, \\
and AC1 (31-16) is stored at the address of AR4.
\end{tabular} \\
\hline
\end{tabular}

\section*{NEG}

\section*{Negate Accumulator, Auxiliary, or Temporary Register Content}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & dst = -src & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Opcode
0011 010E | FSSS FDDD

Operands
dst, src

Description
This instruction computes the 2s complement of the content of the source register (src). This instruction clears the CARRY status bit to 0 for all nonzero values of src. If src equals 0, the CARRY status bit is set to 1.
\(\square\) When the destination operand (dst) is an accumulator:
- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
- If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
- Overflow detection and the CARRY status bit depend on M40.
- When an overflow is detected, the accumulator is saturated according to SATD.
\(\square\) When the destination operand (dst) is an auxiliary or temporary register:
- The operation is performed on 16 bits in the A-unit ALU.
- If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
- Overflow detection is done at bit position 15.
- When an overflow is detected, the destination register is saturated according to SATA.
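For the 16-bit case (auxiliary or temporary register destination), the behavior described above can be modeled in a few lines. The Python sketch below is illustrative only: `neg16` is a hypothetical helper, and the saturation value 7FFFh used when SATA = 1 is an assumption about how "saturated according to SATA" resolves for the single overflowing input, 8000h.

```python
def neg16(src, sata=False):
    """Model dst = -src on a 16-bit register.

    Returns (dst, carry, overflow). CARRY follows the description:
    1 when src is 0, 0 for every nonzero src. Overflow is detected at
    bit position 15, which only happens for src = 8000h (-32768).
    """
    src &= 0xFFFF
    signed = src - 0x10000 if src & 0x8000 else src
    result = -signed
    overflow = result > 0x7FFF
    if overflow and sata:
        result = 0x7FFF          # assumed positive saturation value
    carry = 1 if src == 0 else 0
    return result & 0xFFFF, carry, overflow


print(neg16(0x0001))              # prints: (65535, 0, False)
print(neg16(0x0000))              # prints: (0, 1, False)
print(neg16(0x8000, sata=True))   # prints: (32767, 0, True)
```

Without saturation, negating 8000h wraps back to 8000h in this model, which is why the overflow case is flagged separately.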

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.
\begin{tabular}{|c|c|c|}
\hline \multirow[t]{2}{*}{Status Bits} & Affected by & M40, SATA, SATD, SXMD \\
\hline & Affects & ACOVx, CARRY \\
\hline Repeat & \multicolumn{2}{|l|}{This instruction can be repeated.} \\
\hline \multirow[t]{3}{*}{See Also} & \multicolumn{2}{|l|}{See the following other related instructions:} \\
\hline & \multicolumn{2}{|l|}{\(\square\) Complement Accumulator, Auxiliary, or Temporary Register Bit} \\
\hline & \multicolumn{2}{|l|}{\(\square\) Complement Accumulator, Auxiliary, or Temporary Register Content} \\
\hline \multicolumn{3}{|l|}{Example} \\
\hline Syntax & Description & \\
\hline AC0 = -AC1 & \multicolumn{2}{|l|}{The 2s complement of the content of AC1 is stored in AC0.} \\
\hline
\end{tabular}

\section*{NOP}

No Operation (nop)

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|}
\hline No. Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] nop & Yes & 1 & 1 & D \\
\hline [2] nop_16 & Yes & 2 & 1 & D \\
\hline Opcode & \multicolumn{4}{|l|}{0010 000E} \\
\hline Operands & none & & & \\
\hline Description & \multicolumn{4}{|l|}{Instruction [1] increments the program counter register (PC) by 1 byte. Instruction [2] increments the PC by 2 bytes.} \\
\hline Status Bits & Affected by none & & & \\
\hline & Affects none & & & \\
\hline Repeat & \multicolumn{4}{|l|}{This instruction can be repeated.} \\
\hline \multicolumn{5}{|l|}{Example} \\
\hline Syntax & \multicolumn{4}{|l|}{Description} \\
\hline nop & \multicolumn{4}{|l|}{The program counter (PC) is incremented by 1 byte.} \\
\hline
\end{tabular}

\section*{AMAR}

Parallel Modify Auxiliary Register Contents

\section*{Syntax Characteristics}


\section*{MPY::MPY \\ Parallel Multiplies}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
ACx = M40(rnd(uns(Xmem) * uns(coef(Cmem)))), \\
ACy = M40(rnd(uns(Ymem) * uns(coef(Cmem))))
\end{tabular} & No & 4 & 1 & X \\
\hline [2] & \begin{tabular}{l}
ACy = M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline [3] & \begin{tabular}{l}
ACy = M40(rnd(uns(HI(Lmem)) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline [4] & \begin{tabular}{l}
ACy = M40(rnd(uns(Ymem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(uns(Xmem) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 5 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Description & These instructions perform two parallel multiply operations in one cycle. The operations are executed in the two D-unit MACs. \\
\hline Status Bits & Affected by FRCT, M40, RDM, SATD, SMUL, SXMD \\
\hline & Affects ACOVx, ACOVy \\
\hline See Also & See the following other related instructions: \\
\hline & - Modify Auxiliary Register Content with Parallel Multiply \\
\hline & - Multiply \\
\hline & - Multiply and Accumulate with Parallel Multiply \\
\hline & - Multiply and Subtract with Parallel Multiply \\
\hline & - Parallel Multiply and Accumulates \\
\hline & - Parallel Multiply and Subtracts \\
\hline
\end{tabular}

Syntax Characteristics
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
ACx = M40(rnd(uns(Xmem) * uns(coef(Cmem)))), \\
ACy = M40(rnd(uns(Ymem) * uns(coef(Cmem))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}


\section*{Operands}

ACx, ACy, Cmem, Xmem, Ymem

Description
This instruction performs two parallel multiply operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.
\(\square\) Input operands are extended to 17 bits according to uns.
- If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
- If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
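The operand extension and the FRCT shift in the list above can be illustrated with a small model. The Python sketch below makes several simplifying assumptions that are not from the manual: the helper names are hypothetical, the non-uns path always sign extends (as if SXMD = 1), and SMUL, rounding, overflow detection, and saturation are ignored.

```python
MASK40 = (1 << 40) - 1


def extend17(word, uns):
    """Return the integer value of a 16-bit word after extension to 17 bits.

    With uns the word is zero extended; otherwise it is sign extended
    (assumed behavior for SXMD = 1).
    """
    word &= 0xFFFF
    if uns or not (word & 0x8000):
        return word
    return word - 0x10000


def multiply(x, c, uns_x=False, uns_c=False, frct=0):
    """Multiply two operands and return the 40-bit accumulator pattern."""
    product = extend17(x, uns_x) * extend17(c, uns_c)
    if frct:
        product <<= 1            # FRCT = 1: shift multiplier output left by 1
    return product & MASK40      # sign extension to 40 bits, as a bit pattern


print(f"{multiply(0xFE00, 0x4000, True, True):010X}")          # prints: 003F800000
print(f"{multiply(0xFE00, 0x4000):010X}")                      # prints: FFFF800000
print(f"{multiply(0xFE00, 0x4000, True, True, frct=1):010X}")  # prints: 007F000000
```

The first two calls use the same memory words and differ only in the uns keyword, which shows why FE00h contributes 65024 in one case and -512 in the other.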

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:
- mar(Xmem)
- mar(Ymem)
- mar(Cmem)
\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects & ACOVx, ACOVy \\
Repeat & \multicolumn{2}{l}{This instruction can be repeated.}
\end{tabular}

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC0 = uns(*AR3) * uns(coef(*CDP)), \\
AC1 = uns(*AR4) * uns(coef(*CDP))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is stored in AC1. \\
\hline
\end{tabular}

Syntax Characteristics
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & \begin{tabular}{l}
ACy \(=\) M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))), \\
ACx \(=\) M40(rnd(uns(Smem) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
1111 1101 | AAAA AAAI | 0000 00mm | DDDD uug\%

\section*{Operands}

ACx, ACy, Cmem, Smem

Description
This instruction performs two parallel multiply operations in one cycle. The operations are executed in the D-unit MACs.

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared to MAC1 and MAC2). The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = uns(*AR3-) * uns(HI(coef(*CDP+))), \\
AC0 = uns(*AR3-) * uns(LO(coef(*CDP+)))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

Execution
M40(rnd(uns(Smem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
M40(rnd(uns(Smem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & FF & 8000 & 0000 & AC0 & 00 & 3F80 & 0000 \\
\hline XAR3 & & 00 & 10FF & XAR3 & & 00 & 10FE \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & FF & 8000 & 0000 & AC1 & 00 & 7F00 & 0000 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
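\hline [3] & \begin{tabular}{l}
ACy = M40(rnd(uns(HI(Lmem)) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\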
\hline
\end{tabular}

Opcode
1111 1101 | AAAA AAAI | 0100 00mm | DDDD uug\%

\section*{Operands}

ACx, ACy, Cmem, Lmem

\section*{Description}
This instruction performs two parallel multiply operations in one cycle. The operations are executed in the D-unit MACs.

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(coef(Cmem)). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2 (this data is shared to MAC1 and MAC2). The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(coef(Cmem)). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = uns(HI(*AR3-)) * uns(HI(coef(*CDP+))), \\
AC0 = uns(LO(*AR3-)) * uns(LO(coef(*CDP+)))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

Execution
M40(rnd(uns(HI(Lmem))[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
M40(rnd(uns(LO(Lmem))[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & FF & 8000 & 0000 & AC0 & 00 & 3F80 & 0000 \\
\hline XAR3 & & 00 & 10FE & XAR3 & & 00 & 10FC \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & FF & 8000 & 0000 & AC1 & 00 & 7F80 & 0000 \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FEh & & & FF00 & 10FEh & & & FF00 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [4] & \begin{tabular}{l}
ACy = M40(rnd(uns(Ymem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(uns(Xmem) * uns(LO(coef(Cmem)))))
\end{tabular} & No & 5 (*) & 1 & X \\
\hline
\end{tabular}
(*) 1 LSB is allocated to instruction slot \#2.

Opcode
1001 0010 | XXXM MMYY | YMMM 00mm | UuDD DDg\%

\section*{Operands}

ACx, ACy, Cmem, Xmem, Ymem

\section*{Description}

This instruction performs two parallel multiply operations in one cycle. The operations are executed in the D-unit MACs.

The first operation performs a multiplication in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of data memory operand HI(coef(Cmem)), which is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(coef(Cmem)), which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
\(\square\) The input operands are extended to 17 bits according to uns.
- If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
- If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
\(\square\) If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
\(\square\) The 32-bit result of the multiplication is sign extended to 40 bits.
\(\square\) Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
\(\square\) Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
\(\square\) Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
\(\square\) The Xmem operand can access the MMRs, but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.


Execution
M40(rnd(uns(Xmem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx
M40(rnd(uns(Ymem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & FF & 8000 & 0000 & AC0 & 00 & 3F80 & 0000 \\
\hline XAR2 & & 00 & 10FE & XAR2 & & 00 & 10FD \\
\hline XAR3 & & 00 & 20FE & XAR3 & & 00 & 20FD \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FEh & & & FE00 & 10FEh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & FF & 8000 & 0000 & AC1 & 00 & 7F80 & 0000 \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 20FEh & & & FF00 & 20FEh & & & FF00 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}
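
The Before and After values in the table above can be checked with a small behavioral model. The following Python sketch is not TI code: the helper names `extend17` and `dual_multiply` are hypothetical, signed operands are assumed to be sign extended (SXMD = 1), and rnd, SMUL, and saturation are left out.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def extend17(word16, uns):
    """Extend a 16-bit memory word to a 17-bit signed multiplier input.

    Assumption: without uns the word is sign extended (SXMD = 1).
    """
    word16 &= 0xFFFF
    if uns or word16 < 0x8000:
        return word16          # zero extension, or a non-negative value
    return word16 - 0x10000    # sign extension of a negative value


def dual_multiply(xmem, ymem, coef_hi, coef_lo, uns=True, frct=0):
    """Return (ACx, ACy) for one dual multiply, as 40-bit patterns."""
    def product(data, coef):
        p = extend17(data, uns) * extend17(coef, uns)
        if frct:
            p <<= 1            # FRCT = 1 shifts the product left by 1 bit
        return p & MASK40      # keep the 40-bit two's-complement pattern

    # MAC1 pairs Xmem with LO(coef(Cmem)); MAC2 pairs Ymem with HI(coef(Cmem)).
    return product(xmem, coef_lo), product(ymem, coef_hi)


acx, acy = dual_multiply(xmem=0xFE00, ymem=0xFF00, coef_hi=0x8000, coef_lo=0x4000)
print(f"{acx:010X} {acy:010X}")
```

With the operand values from the table, the sketch prints `003F800000 007F800000`, which matches the After values of AC0 and AC1.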

\section*{MAC::MAC}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & ACx = M40(rnd(ACx + (uns(Xmem) * uns(coef(Cmem))))), ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem))))) & No & 4 & 1 & X \\
\hline [2] & ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(coef(Cmem))))), ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem))))) & No & 4 & 1 & X \\
\hline [3] & ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(coef(Cmem))))), ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem))))) & No & 4 & 1 & X \\
\hline [4] & ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), ACx = M40(rnd(ACx + (uns(Smem) * uns(LO(coef(Cmem)))))) & No & 4 & 1 & X \\
\hline [5] & ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), ACx = M40(rnd((ACx >> \#16) + (uns(Smem) * uns(LO(coef(Cmem)))))) & No & 4 & 1 & X \\
\hline [6] & ACy = M40(rnd((ACy >> \#16) + (uns(Smem) * uns(HI(coef(Cmem)))))), ACx = M40(rnd((ACx >> \#16) + (uns(Smem) * uns(LO(coef(Cmem)))))) & No & 4 & 1 & X \\
\hline [7] & ACy = M40(rnd(ACy + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), ACx = M40(rnd(ACx + (uns(LO(Lmem)) * uns(LO(coef(Cmem)))))) & No & 4 & 1 & X \\
\hline [8] & ACy = M40(rnd(ACy + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), ACx = M40(rnd((ACx >> \#16) + (uns(LO(Lmem)) * uns(LO(coef(Cmem)))))) & No & 4 & 1 & X \\
\hline [9] & ACy = M40(rnd((ACy >> \#16) + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), ACx = M40(rnd((ACx >> \#16) + (uns(LO(Lmem)) * uns(LO(coef(Cmem)))))) & No & 4 & 1 & X \\
\hline [10] & ACy = M40(rnd(ACy + uns(Ymem) * uns(HI(coef(Cmem))))), ACx = M40(rnd(ACx + uns(Xmem) * uns(LO(coef(Cmem))))) & No & 5 & 1 & X \\
\hline
\end{tabular}


\section*{Parallel Multiply and Accumulates}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & ACx = M40(rnd(ACx + (uns(Xmem) * uns(coef(Cmem))))), & No & 4 & 1 & X \\
& ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem))))) & & & & \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

ACx, ACy, Cmem, Xmem, Ymem

\section*{Description}

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.
- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
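
The accumulation, overflow, and saturation steps listed above can be sketched in Python. This is an illustrative model only: the function names are hypothetical, and the overflow boundary (signed 32-bit range when M40 = 0, signed 40-bit range when M40 = 1) and the saturation values are assumptions that this section does not spell out.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit pattern as a signed integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def accumulate(acc40, product, m40=0, satd=0):
    """Add a signed product to a 40-bit accumulator.

    Returns (new 40-bit pattern, overflow flag for ACOVx/ACOVy).
    """
    total = to_signed40(acc40) + product
    bits = 40 if m40 else 32            # assumed overflow boundary
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    overflow = not lowest <= total <= highest
    if overflow and satd:
        # Assumed saturation: clamp to the nearest representable extreme.
        total = highest if total > highest else lowest
    return total & MASK40, overflow


print(accumulate(0x007FFFFFFF, 1, m40=0, satd=1))
print(accumulate(0x007FFFFFFF, 1, m40=1, satd=1))
```

Under these assumptions, the first call reports an overflow and clamps the result, while the second call, which models the optional M40 keyword, uses the guard bits and reports no overflow.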

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:
- mar(Xmem)
- mar(Ymem)
- mar(Cmem)
\begin{tabular}{|c|c|c|}
\hline Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
\hline & Affects & ACOVx, ACOVy \\
\hline Repeat & \multicolumn{2}{|l|}{This instruction can be repeated.} \\
\hline
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC0 = AC0 + (uns(*AR3) * uns(coef(*CDP))), \\
AC1 = AC1 + (uns(*AR4) * uns(coef(*CDP)))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is added to the content of AC0. The result is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is added to the content of AC1. The result is stored in AC1. \\
\hline
\end{tabular}

\section*{Parallel Multiply and Accumulates}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & \begin{tabular}{l}
ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1000 0011 | XXXM MMYY | YMMM 10mm | uuDD DDg\%

\section*{Operands}

ACx, ACy, Cmem, Xmem, Ymem

\section*{Description}

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.
- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACx(39).
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
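
The shifted accumulation described above, where ACx is shifted right by 16 bits with sign extension from bit 39 before the product is added, can be sketched as follows. The helper names are hypothetical, both multiplier inputs are taken as unsigned (the uns case), and rounding, overflow detection, and saturation are omitted.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit pattern as a signed integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def shift_right_16(acc40):
    """Arithmetic right shift by 16: bit 39 is copied into the vacated bits."""
    return (to_signed40(acc40) >> 16) & MASK40


def mac_shifted(acc40, data, coef):
    """(ACx >> #16) + data * coef, with data and coef taken as unsigned."""
    shifted = to_signed40(shift_right_16(acc40))
    return (shifted + (data & 0xFFFF) * (coef & 0xFFFF)) & MASK40


print(f"{shift_right_16(0xFF80000000):010X}")
print(f"{mac_shifted(0x0012340000, 2, 3):010X}")
```

The first call shows a negative accumulator keeping its sign after the shift; the second shows the high word of the old accumulator landing in the low word before the product is added.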

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:

- mar(Xmem)
- mar(Ymem)
- mar(Cmem)

\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC0 = (AC0 >> \#16) + (uns(*AR3) * uns(coef(*CDP))), \\
AC1 = AC1 + (uns(*AR4) * uns(coef(*CDP)))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is added to the content of AC1. The result is stored in AC1. \\
\hline
\end{tabular}

\section*{Parallel Multiply and Accumulates}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [3] & \begin{tabular}{l}
ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1000 0100 | XXXM MMYY | YMMM 11mm | uuDD DDg\%

\section*{Operands}

ACx, ACy, Cmem, Xmem, Ymem

\section*{Description}

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.
- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator bit 39.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Parallel Multiply and Accumulates}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [4] & \begin{tabular}{l}
ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx + (uns(Smem) * uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1111 1101 | AAAA AAAI | 0001 01mm | DDDD uug\%

\section*{Operands}

ACx, ACy, Cmem, Smem

\section*{Description}

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM =1)}

None.

\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = AC1 + (uns(*AR3-) * uns(HI(coef(*CDP+)))), \\
AC0 = AC0 + (uns(*AR3-) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

Execution
ACy + M40(rnd(uns(Smem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
ACx + M40(rnd(uns(Smem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 & 00 & 0000 & 8000 & AC0 & 00 & 3F80 & 8000 \\
\hline XAR3 & & 00 & 10FF & XAR3 & & 00 & 10FE \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & 00 & 0000 & 8000 & AC1 & 00 & 7F00 & 8000 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}
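
The table above can be reproduced with a short model of the HI/LO coefficient addressing, in which HI(coef(Cmem)) is read at EA and LO(coef(Cmem)) at the neighbouring address. The Python sketch below is illustrative only: the function names and the dictionary used as coefficient memory are hypothetical, all operands are treated as unsigned, and rounding, overflow, and pointer updates are not modeled.

```python
MASK40 = (1 << 40) - 1


def coef_pair(memory, ea):
    """Return (HI(coef), LO(coef)) for coefficient effective address ea."""
    hi = memory[ea]        # HI(coef(Cmem)) is read at EA
    lo = memory[ea ^ 1]    # EA+1 when EA is even, EA-1 when EA is odd
    return hi, lo


def dual_mac_shared(ac_y, ac_x, smem, memory, ea):
    """Both MACs share Smem; MAC2 uses the HI word, MAC1 the LO word."""
    hi, lo = coef_pair(memory, ea)
    new_y = (ac_y + smem * hi) & MASK40
    new_x = (ac_x + smem * lo) & MASK40
    return new_y, new_x


coeff_memory = {0x2000: 0x8000, 0x2001: 0x4000}
ac1, ac0 = dual_mac_shared(0x8000, 0x8000, 0xFE00, coeff_memory, 0x2000)
print(f"AC1={ac1:010X} AC0={ac0:010X}")
```

With the Before values from the table, the sketch prints `AC1=007F008000 AC0=003F808000`, in agreement with the After column. Flipping the least significant address bit with `ea ^ 1` is one compact way to express the even/odd rule.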

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [5] & \begin{tabular}{l}
ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd((ACx >> \#16) + (uns(Smem) * uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1111 1101 | AAAA AAAI | 0010 00mm | DDDD uug\%

\section*{Operands}

ACx, ACy, Cmem, Smem

\section*{Description}

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACx(39).
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.

\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = AC1 + (uns(*AR3-) * uns(HI(coef(*CDP+)))), \\
AC0 = (AC0 >> \#16) + (uns(*AR3-) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}


\section*{Parallel Multiply and Accumulates}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [6] & \begin{tabular}{l}
ACy = M40(rnd((ACy >> \#16) + (uns(Smem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd((ACx >> \#16) + (uns(Smem) * uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1111 1101 | AAAA AAAI | 0010 11mm | DDDD uug\%

\section*{Operands}

ACx, ACy, Cmem, Smem

\section*{Description}

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator bit 39.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
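
The rounding step selected by the rnd keyword can be sketched as below. This section only states that rounding follows RDM; the two behaviors modeled here (RDM = 0 adds 8000h, RDM = 1 rounds to nearest with ties resolved by bit 16, and the low 16 bits are cleared afterwards) are assumptions taken from the general C55x rounding description and should be checked against the CPU reference guide. The function name `round40` is hypothetical, and overflow caused by the rounding addition is ignored.

```python
MASK40 = (1 << 40) - 1


def round40(value, rdm=0):
    """Round a 40-bit pattern to its upper 24 bits (assumed RDM behavior)."""
    value &= MASK40
    low = value & 0xFFFF
    if rdm == 0:
        value += 0x8000                       # biased: always add 2**15
    elif low > 0x8000 or (low == 0x8000 and value & 0x10000):
        value += 0x8000                       # unbiased: ties only when bit 16 is set
    return value & MASK40 & ~0xFFFF           # clear bits 15-0 after rounding


for rdm in (0, 1):
    print(rdm, f"{round40(0x0000008000, rdm):010X}", f"{round40(0x0000018000, rdm):010X}")
```

The two modes differ only on exact ties: in this model the value 00 0000 8000h rounds up when RDM = 0 and down when RDM = 1.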

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = (AC1 >> \#16) + (uns(*AR3-) * uns(HI(coef(*CDP+)))), \\
AC0 = (AC0 >> \#16) + (uns(*AR3-) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. AR3 is decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

Execution
(ACy >> \#16) + M40(rnd(uns(Smem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
(ACx >> \#16) + M40(rnd(uns(Smem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx


\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [7] & \begin{tabular}{l}
ACy = M40(rnd(ACy + (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx + (uns(LO(Lmem)) * uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1111 1101 | AAAA AAAI | 0101 01mm | DDDD uug\%

\section*{Operands}

ACx, ACy, Cmem, Lmem

\section*{Description}

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(coef(Cmem)). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB, and sign extended to 17 bits in the MAC2. The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(coef(Cmem)). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
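
The HI/LO operand fetch described above can be sketched in Python. This is a minimal model under stated assumptions, not TI code: `memory` is a hypothetical dictionary from word address to 16-bit value, and the function names are illustrative only. It shows the address pairing (EA for the HI part, EA with bit 0 toggled for the LO part) and the extension of each 16-bit word to a 17-bit multiplier input.

```python
def sign_extend_17(word: int) -> int:
    """Interpret a 16-bit word as signed; the value fits in 17 bits."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def zero_extend_17(word: int) -> int:
    """Interpret a 16-bit word as unsigned (uns keyword applied)."""
    return word & 0xFFFF


def fetch_hi_lo(memory: dict, ea: int, uns: bool) -> tuple:
    """Return the (HI, LO) multiplier inputs for a long-word access at ea."""
    extend = zero_extend_17 if uns else sign_extend_17
    hi = extend(memory[ea])       # HI part is read at EA
    lo = extend(memory[ea ^ 1])   # LO part: EA+1 if EA is even, EA-1 if odd
    return hi, lo


# Hypothetical memory image: two words forming one long word.
memory = {0x10FE: 0xFF00, 0x10FF: 0xFE00}
print(fetch_hi_lo(memory, 0x10FE, uns=True))   # (65280, 65024)
print(fetch_hi_lo(memory, 0x10FE, uns=False))  # (-256, -512)
```

Toggling bit 0 of the address is one compact way to express "EA+1 when EA is even, EA-1 when EA is odd"; the bus routing (CDB, DDB, BDB, B2DB) is not modeled.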

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy
\end{tabular}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \[
\begin{aligned}
& \text { AC1 = AC1 + (uns(HI(*AR3-)) * uns(HI(coef(*CDP+)))), } \\
& \text { AC0 = AC0 + (uns(LO(*AR3-)) * } \\
& \text { uns(LO(coef(*CDP+)))) }
\end{aligned}
\] & Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}

Execution
ACy + M40(rnd(uns(HI(Lmem))[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
ACx + M40(rnd(uns(LO(Lmem))[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx

MAC::MAC Parallel Multiply and Accumulates
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 & 00 & 0000 & 8000 & AC0 & 00 & 3F80 & 8000 \\
\hline XAR3 & & 00 & 10FE & XAR3 & & 00 & 10FC \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & 00 & 0000 & 8000 & AC1 & 00 & 7F80 & 8000 \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FEh & & & FF00 & 10FEh & & & FF00 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}
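
The Before/After values above can be checked with a few lines of Python. This is an illustrative sketch, not part of the instruction set definition: the helper `mac` is hypothetical, it assumes FRCT = 0 (the example does not state FRCT, but the listed results are consistent with that setting), and it ignores rounding and saturation because neither rnd nor an overflow is involved here.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide


def mac(acc: int, a: int, b: int, frct: int = 0) -> int:
    """Return acc + a*b on a 40-bit accumulator (no rounding, no saturation)."""
    product = a * b
    if frct:
        product <<= 1  # FRCT = 1 shifts the multiplier output left by 1 bit
    return (acc + product) & MASK40


# LO pair: data 10FFh = FE00h, coefficient 2001h = 4000h, both unsigned.
ac0 = mac(0x0000008000, 0xFE00, 0x4000)
# HI pair: data 10FEh = FF00h, coefficient 2000h = 8000h, both unsigned.
ac1 = mac(0x0000008000, 0xFF00, 0x8000)

print(f"AC0 = {ac0:010X}")  # AC0 = 003F808000
print(f"AC1 = {ac1:010X}")  # AC1 = 007F808000
```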

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [8] & ```
ACy = M40(rnd(ACy + (uns(HI(Lmem)) *
uns(HI(coef(Cmem))))),
ACx = M40(rnd((ACx >> #16) + (uns(LO(Lmem)) *
uns(LO(coef(Cmem)))))
``` & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode

Operands
ACx, ACy, Cmem, Lmem

Description
This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand \(\mathrm{HI}(\) Lmem ) and the content of data memory operand \(\mathrm{HI}(\) coef(Cmem)). The data memory operand \(\mathrm{HI}(\) Lmem \()\) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB, and sign extended to 17 bits in the MAC2. The other data memory operand \(\mathrm{HI}(\operatorname{coef}(\mathrm{Cmem})\) ) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand \(\mathrm{LO}(\operatorname{coef}(\mathrm{Cmem})\) ). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA +1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path \(C\) with the next address of \(E A\) ( \(E A+1\) when EA is even, \(E A-1\) when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- For the first operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
- For the second operation, the 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACx, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator ACx(39).
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
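
The shifted-accumulator form of the second operation, (ACx >> #16) plus the product, can be sketched as follows. This is an illustrative model only: the helper names are hypothetical, FRCT is assumed to be 0, and rounding and saturation are left out. It shows that the 16-bit right shift propagates bit 39 of the source accumulator.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value: int) -> int:
    """Interpret a 40-bit pattern as a signed integer (bit 39 is the sign)."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def mac_shifted(acc: int, a: int, b: int) -> int:
    """Return (acc >> 16) + a*b as a 40-bit pattern."""
    shifted = to_signed40(acc) >> 16  # >> on a negative Python int is arithmetic
    return (shifted + a * b) & MASK40


# A negative accumulator keeps its sign after the shift.
print(f"{mac_shifted(0xFF80000000, 1, 1):010X}")  # FFFFFF8001
```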

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \[
\begin{aligned}
& \text { AC1 = AC1 + (uns(HI(*AR3-)) * } \\
& \text { uns(HI(coef(*CDP+)))), } \\
& \text { AC0 = (AC0 >> \#16) }+\left(\text { uns(LO (*AR3-)) }{ }^{*}\right. \\
& \text { uns(LO(coef(*CDP+)))) }
\end{aligned}
\] & Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [9] & \[
\begin{aligned}
& \text { ACy = M40(rnd((ACy >> \#16) + (uns(HI(Lmem)) * } \\
& \text { uns(HI(coef(Cmem)))))), } \\
& \text { ACx = M40(rnd((ACx >> \#16) + (uns(LO(Lmem)) * } \\
& \text { uns(LO(coef(Cmem)))))) }
\end{aligned}
\] & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
1111 1101 | AAAA AAAI | 0110 11mm | DDDD uug\%

\section*{Operands}
ACx, ACy, Cmem, Lmem

\section*{Description}

This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand \(\mathrm{HI}(\) Lmem \()\) and the content of data memory operand \(\mathrm{HI}(\operatorname{coef}(\mathrm{Cmem}))\). The data memory operand \(\mathrm{HI}(\) Lmem \()\) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand \(\mathrm{HI}(\operatorname{coef}(\mathrm{Cmem}))\) is addressed by DAGEN path \(C\) with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand \(\mathrm{LO}(\operatorname{coef}(\mathrm{Cmem}))\). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA ( \(\mathrm{EA}+1\) when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path \(C\) with the next address of \(E A\) ( \(E A+1\) when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator, which has been shifted to the right by 16 bits. The shifting operation is performed with a sign extension of source accumulator bit 39.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
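
The overflow and saturation bullets above can be illustrated with a short Python sketch. The details here are assumptions that this section does not spell out: overflow is taken to be detected at bit position 31 when M40 = 0 and at bit position 39 when M40 = 1, and the saturation values are the largest and smallest signed values of that width. The function name and return shape are hypothetical.

```python
def saturate(result: int, m40: int, satd: int) -> tuple:
    """Return (value, overflow_flag) for a signed accumulator result.

    Assumes a 32-bit overflow boundary when m40 == 0 and a 40-bit
    boundary when m40 == 1; saturation is applied only when satd == 1.
    """
    bits = 40 if m40 else 32
    hi = (1 << (bits - 1)) - 1   # largest representable positive value
    lo = -(1 << (bits - 1))      # smallest representable negative value
    overflow = result > hi or result < lo
    if overflow and satd:
        result = hi if result > hi else lo
    return result, overflow


print(saturate(0x80000000, m40=0, satd=1))  # (2147483647, True)
print(saturate(0x80000000, m40=1, satd=1))  # (2147483648, False)
```

The overflow flag in this sketch stands in for the destination accumulator overflow status bit (ACOVx or ACOVy).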

\section*{Compatibility with C54x devices (C54CM = 1)}

None.

\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \[
\begin{aligned}
& \text { AC1 = (AC1 >> \#16) + (uns(HI(*AR3-)) * } \\
& \text { uns(HI(coef(*CDP+)))), } \\
& \text { AC0 = (AC0 >> \#16) + (uns(LO(*AR3-)) * } \\
& \text { uns(LO(coef(*CDP+)))) }
\end{aligned}
\] & Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1, which has been shifted to the right by 16 bits. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [10] & \[
\begin{aligned}
& \text { ACy = M40(rnd(ACy + (uns(Ymem) * } \\
& \text { uns(HI(coef(Cmem)))))), } \\
& \text { ACx = M40(rnd(ACx + (uns(Xmem) * } \\
& \text { uns(LO(coef(Cmem)))))) }
\end{aligned}
\] & No & 5 (*) & 1 & X \\
\hline
\end{tabular}

(*) 1 LSB is allocated to instruction slot \#2.

Opcode
1001 0011 | XXXM MMYY | YMMM 00mm | uuDD DDg\%

Operands
ACx, ACy, Cmem, Xmem, Ymem

Description
This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the content of data memory operand HI(coef(Cmem)) which is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand \(\mathrm{LO}(\) coef(Cmem)) which is addressed by DAGEN path C with the next address of EA (EA +1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs, but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \[
\begin{aligned}
& \text { AC1 = AC1 + (uns(*AR3-) * uns(HI(coef(*CDP+)))), } \\
& \text { AC0 = AC0 + (uns(*AR2-) * uns(LO(coef(*CDP+)))) }
\end{aligned}
\] & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is added to the content of AC0. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}
Execution
M40(rnd(ACx + uns(Xmem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx
M40(rnd(ACy + uns(Ymem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 & 00 & 0000 & 8000 & AC0 & 00 & 3F80 & 8000 \\
\hline XAR2 & & 00 & 10FE & XAR2 & & 00 & 10FD \\
\hline XAR3 & & 00 & 20FE & XAR3 & & 00 & 20FD \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FEh & & & FE00 & 10FEh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & 00 & 0000 & 8000 & AC1 & 00 & 7F80 & 8000 \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 20FEh & & & FF00 & 20FEh & & & FF00 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}
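
The register and memory values in this example can be reproduced with a small Python simulation. This is a sketch under stated assumptions rather than a definitive model: `regs` and `mem` are hypothetical dictionaries, FRCT is assumed to be 0, all operands carry the uns keyword, and rounding and saturation are not modeled. It also shows the pointer post-modification: AR3 and AR2 step by 1, while CDP steps by 2 because it addresses a HI/LO coefficient pair.

```python
MASK40 = (1 << 40) - 1


def dual_mac_xy(regs: dict, mem: dict) -> None:
    """AC1 += uns(*AR3-) * uns(HI(coef(*CDP+))),
    AC0 += uns(*AR2-) * uns(LO(coef(*CDP+)))."""
    cdp = regs["CDP"]
    hi_coef = mem[cdp]        # HI part is read at EA
    lo_coef = mem[cdp ^ 1]    # LO part is read at the paired address
    regs["AC1"] = (regs["AC1"] + mem[regs["AR3"]] * hi_coef) & MASK40
    regs["AC0"] = (regs["AC0"] + mem[regs["AR2"]] * lo_coef) & MASK40
    regs["AR3"] -= 1          # *AR3- : post-decrement by 1
    regs["AR2"] -= 1          # *AR2- : post-decrement by 1
    regs["CDP"] += 2          # *CDP+ with HI/LO: post-increment by 2


regs = {"AC0": 0x8000, "AC1": 0x8000, "AR2": 0x10FE, "AR3": 0x20FE, "CDP": 0x2000}
mem = {0x10FE: 0xFE00, 0x20FE: 0xFF00, 0x2000: 0x8000, 0x2001: 0x4000}
dual_mac_xy(regs, mem)
print({name: hex(value) for name, value in regs.items()})
```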

Syntax Characteristics
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [11] & ```
ACy = M40(rnd(ACy + (uns(Ymem) *
uns(HI(coef(Cmem)))))),
ACx = M40(rnd((ACx >> #16) + (uns(Xmem) *
uns(LO(coef(Cmem)))))
``` & No & 5 (*) & 1 & X \\
\hline
\end{tabular}
(*) 1 LSB is allocated to instruction slot \#2.

\section*{Opcode}
1001 0011 | XXXM MMYY | YMMM 10mm | uuDD DDg\%

Operands
ACx, ACy, Cmem, Xmem, Ymem

\section*{Description}
This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the contents of data memory operand \(\mathrm{HI}(\operatorname{coef}(\mathrm{Cmem}))\) which is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an accumulation in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the contents of data memory operand LO(coef(Cmem)) which is addressed by DAGEN path C with the next address of \(E A\) ( \(E A+1\) when \(E A\) is even, \(E A-1\) when \(E A\) is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs, but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.
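
The rnd keyword mentioned in the list above can be sketched in Python. The rounding rules used here are assumptions that this section does not state: RDM = 0 is taken as biased rounding (8000h is always added) and RDM = 1 as unbiased rounding to the nearest (a tie at exactly 8000h rounds up only when bit 16 is 1), and in both cases the 16 LSBs of the result are cleared. The function name is hypothetical.

```python
MASK40 = (1 << 40) - 1


def rnd(acc: int, rdm: int) -> int:
    """Round a 40-bit accumulator pattern to its upper 24 bits."""
    low16 = acc & 0xFFFF
    bit16 = (acc >> 16) & 1
    if rdm == 0:
        add = True  # biased rounding: always add 8000h
    else:
        # Unbiased rounding: a tie rounds up only when bit 16 is set.
        add = low16 > 0x8000 or (low16 == 0x8000 and bit16 == 1)
    if add:
        acc += 0x8000
    return acc & MASK40 & ~0xFFFF  # clear the 16 LSBs


print(hex(rnd(0x00008000, rdm=0)))  # 0x10000
print(hex(rnd(0x00008000, rdm=1)))  # 0x0
```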

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \[
\begin{aligned}
& \text { AC1 = AC1 + (uns(*AR3-) * uns(HI(coef(*CDP+)))), } \\
& \text { AC0 = (AC0 >> \#16) + (uns(*AR2-) * } \\
& \text { uns(LO(coef(*CDP+)))) }
\end{aligned}
\] & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [12] & ```
ACy = M40(rnd((ACy >> #16) + (uns(Ymem) *
uns(HI(coef(Cmem)))))),
ACx = M40(rnd((ACx >> #16) + (uns(Xmem) *
uns(LO(coef(Cmem)))))
``` & No & 5 (*) & 1 & X \\
\hline
\end{tabular}

(*) 1 LSB is allocated to instruction slot \#2.

Opcode
1001 0011 | XXXM MMYY | YMMM 11mm | uuDD DDg\%

Operands
ACx, ACy, Cmem, Xmem, Ymem

Description
This instruction performs two parallel multiply and accumulate (MAC) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and an accumulation in the D-unit MAC2. The input operands of the multiplier are the contents of data memory operand Ymem, extended to 17 bits, and the content of data memory operand HI(coef(Cmem)) which is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and an addition in the D-unit MAC1. The input operands of the multiplier are the contents of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(coef(Cmem)) which is addressed by DAGEN path C with the next address of \(E A\) ( \(E A+1\) when \(E A\) is even, \(E A-1\) when \(E A\) is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs, but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55xx-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects & ACOVx, ACOVy
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline  & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is added to the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is added to the content of AC0, which has been shifted to the right by 16 bits. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}


\section*{MAS::MAS}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \begin{tabular}{l}
ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd(ACy - (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & No & 4 & 1 & X \\
\hline [2] & ```
ACy = M40(rnd(ACy - (uns(Smem) *
uns(HI(coef(Cmem))))),
ACx = M40(rnd(ACx - (uns(Smem) *
uns(LO(coef(Cmem)))))
``` & No & 4 & 1 & X \\
\hline [3] & ```
ACy = M40(rnd(ACy - (uns(HI(Lmem)) *
uns(HI(coef(Cmem))))),
ACx = M40(rnd(ACx - (uns(LO(Lmem)) *
uns(LO(coef(Cmem)))))
``` & No & 4 & 1 & X \\
\hline [4] & ```
ACy = M40(rnd(ACy - (uns(Ymem) *
uns(HI(coef(Cmem))))),
ACx = M40(rnd(ACx - (uns(Xmem) *
uns(LO(coef(Cmem)))))
``` & No & 5 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform two parallel multiply and subtract (MAS) operations in one cycle. The operations are executed in the two D-unit MACs.

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL, SXMD
Affects ACOVx, ACOVy
See Also See the following other related instructions:
- Modify Auxiliary Register Content with Parallel Multiply and Subtract
- Multiply and Subtract
- Multiply and Subtract with Parallel Load Accumulator from Memory
- Multiply and Subtract with Parallel Multiply
- Multiply and Subtract with Parallel Multiply and Accumulate
- Multiply and Subtract with Parallel Store Accumulator Content to Memory
- Parallel Multiplies
- Parallel Multiply and Accumulates

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & \[
\begin{aligned}
& \text { ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), } \\
& \text { ACy = M40(rnd(ACy - (uns(Ymem) * uns(coef(Cmem))))) }
\end{aligned}
\] & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode

\section*{Operands}
ACx, ACy, Cmem, Xmem, Ymem

Description

This instruction performs two parallel multiply and subtract (MAS) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.

The second operation performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of a data memory operand Cmem, addressed using the coefficient addressing mode and extended to 17 bits.
- Input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.
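
The dual multiply-and-subtract data flow described above can be sketched in Python. This is a minimal illustration, not TI code: the function `dual_mas` and its argument names are hypothetical, the operands are assumed to be already extended to their numeric values, and rounding, overflow detection, and saturation are omitted.

```python
MASK40 = (1 << 40) - 1


def dual_mas(acx, acy, xmem, ymem, cmem, frct=0):
    """Return (ACx, ACy) after ACx -= Xmem*Cmem and ACy -= Ymem*Cmem."""
    shift = 1 if frct else 0  # FRCT = 1 shifts each product left by 1 bit
    new_acx = (acx - ((xmem * cmem) << shift)) & MASK40
    new_acy = (acy - ((ymem * cmem) << shift)) & MASK40
    return new_acx, new_acy


# Both operations share the same coefficient operand.
print([f"{value:010X}" for value in dual_mas(0x10000, 0x10000, 2, 3, 0x100)])
# ['000000FE00', '000000FD00']
```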

For this instruction, the Cmem operand is accessed through the BB bus; on some C55x-based devices, the BB bus is only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Each data flow can also disable the usage of the corresponding MAC unit, while allowing the modification of auxiliary registers in the three address generation units through the following instructions:
- mar(Xmem)
- mar(Ymem)
- mar(Cmem)
\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline  & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the coefficient data pointer register (CDP) is subtracted from the content of AC0. The result is stored in AC0. The product of the unsigned content addressed by AR4 and the unsigned content addressed by CDP is subtracted from the content of AC1. The result is stored in AC1. \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & \begin{tabular}{l}
ACy = M40(rnd(ACy - (uns(Smem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx - (uns(Smem) * uns(LO(coef(Cmem))))))
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode

\section*{Operands}

ACx, ACy, Cmem, Smem

\section*{Description}

This instruction performs two parallel multiply and subtract (MAS) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand HI(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC2 (this data is shared by MAC1 and MAC2). The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA (effective address); the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Smem and the content of data memory operand LO(coef(Cmem)). The data memory operand Smem is addressed by DAGEN path X with the corresponding addressing mode, driven on data bus DDB, and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = AC1 - (uns(*AR3-) * uns(HI(coef(*CDP+)))), \\
AC0 = AC0 - (uns(*AR3-) * uns(LO(coef(*CDP+))))
\end{tabular} & \begin{tabular}{l}
Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the lower part of CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 is decremented by 1. \\
When CDP+ is used with HI/LO, CDP is incremented by 2.
\end{tabular} \\
\hline
\end{tabular}

Execution

ACy - M40(rnd(uns(Smem)[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy

ACx - M40(rnd(uns(Smem)[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & 00 & 0000 & 8000 & AC0 & FF & C080 & 8000 \\
\hline XAR3 & & 00 & 10FF & XAR3 & & 00 & 10FE \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & 00 & 0000 & 8000 & AC1 & FF & 8100 & 8000 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}
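
The arithmetic in the table above can be checked with a small Python sketch of one MAS data path. It is a simplified model under stated assumptions: FRCT defaults to 0, and rounding (RDM), multiplication overflow (SMUL), accumulator overflow (M40), and saturation (SATD) are not modeled. The function names `extend17` and `mas_path` are illustrative only and do not correspond to any TI tool.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide

def extend17(word16, uns):
    """Extend a 16-bit memory word to a 17-bit signed multiplier input."""
    word16 &= 0xFFFF
    if uns or word16 < 0x8000:
        return word16          # zero extension (uns) or non-negative value
    return word16 - 0x10000    # sign extension of a negative value

def mas_path(acc40, data16, coef16, uns=True, frct=0):
    """Return (acc - data * coef) wrapped to 40 bits; simplified model."""
    product = extend17(data16, uns) * extend17(coef16, uns)
    if frct:
        product <<= 1          # FRCT = 1 shifts the multiplier output left by 1
    return (acc40 - product) & MASK40

# Values from the example: *AR3 = FE00h, HI coefficient at 2000h, LO at 2001h.
ac1_after = mas_path(0x0000008000, 0xFE00, 0x8000)  # HI(coef) path, MAC2
ac0_after = mas_path(0x0000008000, 0xFE00, 0x4000)  # LO(coef) path, MAC1
print(f"AC1 = {ac1_after:010X}, AC0 = {ac0_after:010X}")
```

Under these assumptions the model reproduces the After values shown for AC1 and AC0.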

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [3] & ```
ACy = M40(rnd(ACy - (uns(HI(Lmem)) *
uns(HI(coef(Cmem)))))),
ACx = M40(rnd(ACx - (uns(LO(Lmem)) *
uns(LO(coef(Cmem))))))
``` & No & 4 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1111 1101 | AAAA AAAI | 0111 00mm | DDDD uug\%

\section*{Operands}

ACx, ACy, Cmem, Lmem

\section*{Description}

This instruction performs two parallel multiply and subtract (MAS) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand HI(Lmem) and the content of data memory operand HI(coef(Cmem)). The data memory operand HI(Lmem) is addressed by DAGEN path X with the EA (effective address); the data, which can be assumed to be the higher part of long word memory data, is driven on data bus CDB and sign extended to 17 bits in the MAC2. The other data memory operand HI(coef(Cmem)) is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand LO(Lmem) and the content of data memory operand LO(coef(Cmem)). The data memory operand LO(Lmem) is addressed by DAGEN path X with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word memory data, is driven on data bus DDB and sign extended to 17 bits in the MAC1. The other data memory operand LO(coef(Cmem)) is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The content of the memory location is zero extended to 17 bits, if the optional uns keyword is applied to the input operand.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

\section*{Compatibility with C54x devices (C54CM = 1)}

None.
\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
AC1 = AC1 - (uns(HI(*AR3-)) * uns(HI(coef(*CDP+)))), \\
AC0 = AC0 - (uns(LO(*AR3-)) * uns(LO(coef(*CDP+))))
\end{tabular} & Both instructions are performed in parallel. The product of the unsigned content addressed by the higher part of AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by the lower part of AR3 and the unsigned content addressed by the lower part of CDP is subtracted from the content of AC0. The result is stored in AC0. When AR3- is used with HI/LO, AR3 is decremented by 2. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}
Execution

```
ACy - M40(rnd(uns(HI(Lmem))[16:0] * uns(HI(coef(Cmem)))[16:0])) -> ACy
ACx - M40(rnd(uns(LO(Lmem))[16:0] * uns(LO(coef(Cmem)))[16:0])) -> ACx
```

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline Before & & & & After & & & \\
\hline AC0 & 00 & 0000 & 8000 & AC0 & FF & C080 & 8000 \\
\hline XAR3 & & 00 & 10FE & XAR3 & & 00 & 10FC \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FFh & & & FE00 & 10FFh & & & FE00 \\
\hline XCDP & & 00 & 2000 & XCDP & & 00 & 2002 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2001h & & & 4000 & 2001h & & & 4000 \\
\hline AC1 & 00 & 0000 & 8000 & AC1 & FF & 8080 & 8000 \\
\hline \multicolumn{8}{|l|}{Data memory} \\
\hline 10FEh & & & FF00 & 10FEh & & & FF00 \\
\hline \multicolumn{8}{|l|}{Coeff memory} \\
\hline 2000h & & & 8000 & 2000h & & & 8000 \\
\hline
\end{tabular}
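
The HI/LO pairing rule (EA+1 when EA is even, EA-1 when EA is odd) amounts to toggling the least significant address bit. The Python sketch below applies that rule to the Lmem form with both operands unsigned and checks it against the table above. It is an illustration only: `partner` and `dual_mas_lmem` are hypothetical names, memory is modeled as a plain dictionary, and FRCT, rounding, overflow, and saturation are ignored.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide

def partner(ea):
    """Address paired with EA: EA+1 when EA is even, EA-1 when EA is odd."""
    return ea ^ 1

def dual_mas_lmem(acx, acy, mem, lmem_ea, coef_ea):
    """Model both MAS paths with unsigned 16-bit operands (simplified)."""
    hi_product = mem[lmem_ea] * mem[coef_ea]                      # MAC2 path
    lo_product = mem[partner(lmem_ea)] * mem[partner(coef_ea)]    # MAC1 path
    return (acx - lo_product) & MASK40, (acy - hi_product) & MASK40

# Memory content and pointers taken from the Before column of the example.
memory = {0x10FE: 0xFF00, 0x10FF: 0xFE00, 0x2000: 0x8000, 0x2001: 0x4000}
ac0, ac1 = dual_mas_lmem(0x0000008000, 0x0000008000, memory, 0x10FE, 0x2000)
print(f"AC0 = {ac0:010X}, AC1 = {ac1:010X}")
```

With these inputs the model yields the After values listed for AC0 and AC1.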

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [4] & ```
ACy = M40(rnd(ACy - (uns(Ymem) *
uns(HI(coef(Cmem)))))),
ACx = M40(rnd(ACx - (uns(Xmem) *
uns(LO(coef(Cmem))))))
``` & No & 5 (*) & 1 & X \\
\hline
\end{tabular}

(*) 1 LSB is allocated to instruction slot \#2.

Opcode

1001 0101 | XXXM MMYY | YMMM 01mm | UuDD DDg\%

Operands

ACx, ACy, Cmem, Xmem, Ymem

Description

This instruction performs two parallel multiply and subtract (MAS) operations in one cycle. The operations are executed in the two D-unit MACs.

The first operation performs a multiplication and a subtraction in the D-unit MAC2. The input operands of the multiplier are the content of data memory operand Ymem, extended to 17 bits, and the content of data memory operand HI(coef(Cmem)), which is addressed by DAGEN path C with the EA; the data, which can be assumed to be the higher part of long word coefficient data, is driven on data bus B2DB and sign extended to 17 bits in the MAC2.

The second operation performs a multiplication and a subtraction in the D-unit MAC1. The input operands of the multiplier are the content of data memory operand Xmem, extended to 17 bits, and the content of data memory operand LO(coef(Cmem)), which is addressed by DAGEN path C with the next address of EA (EA+1 when EA is even, EA-1 when EA is odd); the data, which can be assumed to be the lower part of long word coefficient data, is driven on data bus BDB and sign extended to 17 bits in the MAC1.
- The input operands are extended to 17 bits according to uns.
  - If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 17 bits.
  - If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 17 bits according to SXMD.
- If FRCT = 1, the output of the multiplier is shifted to the left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit is set.
- When an overflow is detected, the accumulator is saturated according to SATD.
- Because this instruction occupies both instruction slots \#1 and \#2, it cannot be executed in parallel with other instructions.
- The Xmem operand can access the MMRs but the Ymem operand cannot.

This instruction provides the option to locally set M40 to 1 for the execution of the instruction, if the optional M40 keyword is applied to the instruction.

For this instruction, the Cmem operand is accessed through the BAB, BDB, and B2DB buses; on some C55x-based devices, the BAB, BDB, and B2DB buses are only connected to internal memory and not to external memory. To prevent the generation of a bus error, the Cmem operand must not be mapped on external memory.

Compatibility with C54x devices (C54CM = 1)
None.
\begin{tabular}{lll} 
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL, SXMD \\
& Affects & ACOVx, ACOVy \\
Repeat & This instruction can be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|c|c|}
\hline Syntax & Description \\
\hline  & Both instructions are performed in parallel. The product of the unsigned content addressed by AR3 and the unsigned content addressed by the higher part of the coefficient data pointer register (CDP) is subtracted from the content of AC1. The result is stored in AC1. The product of the unsigned content addressed by AR2 and the unsigned content addressed by the lower part of the CDP is subtracted from the content of AC0. The result is stored in AC0. AR3 and AR2 are decremented by 1. When CDP+ is used with HI/LO, CDP is incremented by 2. \\
\hline
\end{tabular}


\section*{port \\ Peripheral Port Register Access Qualifiers}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & readport() & No & 1 & 1 & D \\
{[2]} & writeport() & No & 1 & 1 & D \\
\hline
\end{tabular}
Opcode
\begin{tabular}{l|l}
readport & 1001 1001 \\
writeport & 1001 1010
\end{tabular}

Operands

none

Description

These operand qualifiers allow you to locally disable access toward the data memory and enable access to the 64K-word I/O space. The I/O data location is specified by the Smem, Xmem, or Ymem fields.
- A readport() operand qualifier may be included in any instruction making a word single data memory access Smem or Xmem that is used in a read operation, except instructions using delay().
- A readport() operand qualifier cannot be used in any instruction making a dual memory access Xmem or Ymem that is used in a read operation. There is an exception for the instructions making a dual read/write memory access of the type Ymem = Xmem, or Smem = coeff, where the readport() qualifier can be used.
- A writeport() operand qualifier may be included in any instruction making a word single data memory access Smem or Ymem that is used in a write operation, except instructions using delay().
- A writeport() operand qualifier cannot be used in any instruction making a dual memory access Xmem or Ymem that is used in a write operation. There is an exception for the instructions making a dual read/write memory access of the type Ymem = Xmem, or coeff = Smem, where the writeport() qualifier can be used.
- A readport() or writeport() operand qualifier cannot be used as a stand-alone instruction (the assembler generates an error message).

Any instruction making a word single data memory access Smem (except those listed above) can use the *port(\#k16) addressing mode to access the 64K-word I/O space with an immediate address. When an instruction uses *port(\#k16), the 16-bit unsigned constant, k16, is encoded in a 2-byte extension to the instruction. Because of the extension, an instruction using *port(\#k16) cannot be executed in parallel with another instruction.

The following indirect operands cannot be used for accesses to I/O space. An instruction using one of these operands requires a 2-byte extension to the instruction. Because of the extension, an instruction using one of the following indirect operands cannot be executed with these operand qualifiers.
- *ARn(\#K16)
- *+ARn(\#K16)
- *CDP(\#K16)
- *+CDP(\#K16)

Status Bits Affected by none

Repeat An instruction using this operand qualifier can be repeated.
Example 1
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
T2 = *AR3 \\
\(\|\) readport()
\end{tabular} & The content of the location addressed by AR3 (I/O address) is loaded into T2. \\
\hline
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l}
*AR3 = T2 \\
\(\|\) writeport()
\end{tabular} & The content of T2 is written to the location addressed by AR3 (I/O address). \\
\hline
\end{tabular}
\section*{POPBOTH}

Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [1] & xdst = popboth() & Yes & 2 & & \\
\hline
\end{tabular}

Opcode

0101 000E | XDDD 0100

Operands

xdst

Description This instruction moves the content of two 16-bit data memory locations addressed by the data stack pointer (SP) and system stack pointer (SSP) to accumulator ACx or to the 23-bit destination register (XARx, XSP, XSSP, XDP, or XCDP).

The content of xdst(15-0) is loaded from the location addressed by SP and the content of xdst(31-16) is loaded from the location addressed by SSP. The return address register (RETA) and the control-flow context register (CFCT) are not accessed by this instruction even in the fast-return process.

When xdst is a 23-bit register, the upper 9 bits of the data memory addressed by SSP are discarded and only the 7 lower bits of the data memory are loaded into the high part of xdst(22-16).

When xdst is an accumulator, the guard bits, ACx(39-32), are reloaded (unchanged) with the current value and are not modified by this instruction.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Pop Top of Stack
- Push to Top of Stack
- Push Accumulator or Extended Auxiliary Register Content to Stack Pointers
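
The bit-field merging described for popboth() can be sketched in Python as follows. This is only an illustrative model of how the two popped words are combined into xdst; stack pointer updates are deliberately left out, and the function names `popboth_acc` and `popboth_reg23` and the test values are hypothetical.

```python
def popboth_acc(acx40, sp_word, ssp_word):
    """Accumulator case: bits 15-0 come from SP, bits 31-16 from SSP."""
    guard = acx40 & (0xFF << 32)           # ACx(39-32) keeps its current value
    return guard | ((ssp_word & 0xFFFF) << 16) | (sp_word & 0xFFFF)

def popboth_reg23(sp_word, ssp_word):
    """23-bit register case: only the 7 lower bits of the SSP word are kept."""
    return ((ssp_word & 0x7F) << 16) | (sp_word & 0xFFFF)

# Hypothetical values chosen to make each field visible.
print(f"{popboth_acc(0xAB12345678, 0x1111, 0x2222):010X}")  # guard AB preserved
print(f"{popboth_reg23(0x3456, 0xFF81):06X}")               # upper 9 bits dropped
```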

\section*{POP}

\section*{Pop Top of Stack}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & dst1, dst2 = pop() & Yes & 2 & 1 & X \\
\hline [2] & dst = pop() & Yes & 2 & 1 & X \\
\hline [3] & dst, Smem = pop() & No & 3 & 1 & X \\
\hline [4] & ACx = dbl(pop()) & Yes & 2 & 1 & X \\
\hline [5] & Smem = pop() & No & 2 & 1 & X \\
\hline [6] & dbl(Lmem) = pop() & No & 2 & 1 & X \\
\hline
\end{tabular}

Description These instructions move the content of the data memory location addressed by the data stack pointer (SP) to:
- an accumulator, auxiliary, or temporary register
- a data memory location

The return address register (RETA) and the control-flow context register (CFCT) are not accessed by this instruction even in the fast-return process.

When the destination register is an accumulator, the guard bits and the 16 higher bits of the accumulator, ACx(39-16), are reloaded (unchanged) with the current value and are not modified by these instructions.

The increment operation performed on SP is done by the A-unit address generator dedicated to the stack addressing management.

Status Bits
Affected by none
Affects none

See Also

See the following other related instructions:
- Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers
- Push to Top of Stack
- Push Accumulator or Extended Auxiliary Register Content to Stack Pointers

Pop Top of Stack

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [2] & dst = pop() & Yes & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

0101 000E | FDDD x010

\section*{Operands}

Description This instruction moves the content of the 16-bit data memory location pointed to by SP to destination register dst.

When the destination register, dst, is an accumulator, the content of the 16-bit data memory operand is moved to the destination accumulator low part, ACx(15-0). The guard bits and the 16 higher bits of the accumulator, ACx(39-16), are reloaded (unchanged) with the current value and are not modified by this instruction. SP is incremented by 1.

Status Bits
Affected by none
Affects none
Repeat
This instruction cannot be repeated with a single conditional or unconditional repeat instruction. It can be repeated in other repeat instructions.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = pop() & \begin{tabular}{l}
The content of the memory location pointed to by the data stack pointer (SP) is copied \\
to AC0(15-0). Bits 39-16 of AC0 are unchanged. The SP is incremented by 1.
\end{tabular} \\
\hline
\end{tabular}
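
A minimal Python sketch of the accumulator case of dst = pop() is given below. The stack is modeled as a dictionary keyed by word address, the function name `pop_to_acc` and the sample values are hypothetical, and SP wraparound is not modeled.

```python
MASK40 = (1 << 40) - 1  # accumulators are 40 bits wide

def pop_to_acc(acx40, stack, sp):
    """Load ACx(15-0) from the word at SP; ACx(39-16) is left unchanged."""
    word = stack[sp] & 0xFFFF
    new_acc = (acx40 & MASK40 & ~0xFFFF) | word
    return new_acc, sp + 1      # SP is incremented by 1

# Hypothetical stack content and accumulator value.
acc, sp = pop_to_acc(0x033800FC00, {0x0300: 0x5644}, 0x0300)
print(f"AC0 = {acc:010X}, SP = {sp:04X}")
```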

Pop Top of Stack

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [4] & ACx = dbl(pop()) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

0101 000E | xxDD x011

\section*{Operands}

ACx

Description This instruction moves the content of the 16-bit data memory location pointed to by SP to the accumulator high part ACx(31-16) and moves the content of the 16-bit data memory location pointed to by SP + 1 to the accumulator low part ACx(15-0).

The guard bits of the accumulator, ACx(39-32), are reloaded (unchanged) with the current value and are not modified by this instruction. SP is incremented by 2.
\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & none
\end{tabular}

Repeat
This instruction cannot be repeated with a single conditional or unconditional repeat instruction. It can be repeated in other repeat instructions.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = dbl(pop()) & \begin{tabular}{l}
The content of the memory location pointed to by the data stack pointer (SP) is copied \\
to AC1(31-16) and the content of the memory location pointed to by SP + 1 is copied \\
to AC1(15-0). Bits 39-32 of AC1 are unchanged. The SP is incremented by 2.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llllllll}
Before & & & & After & & & \\
AC1 & 03 & 3800 & FC00 & AC1 & 03 & 5644 & F800 \\
SP & & & 0304 & SP & & & 0306 \\
304 & & & 5644 & 304 & & & 5644 \\
305 & & & F800 & 305 & & & F800
\end{tabular}
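
The double-word pop can be modeled the same way. The sketch below uses the stack content from the example (5644h at 304h, F800h at 305h) and a guard value of 03h; the function name `dbl_pop_to_acc` is hypothetical, the stack is a plain dictionary, and the lower 32 bits of the initial accumulator value do not influence the result because they are overwritten.

```python
def dbl_pop_to_acc(acx40, stack, sp):
    """Load ACx(31-16) from SP and ACx(15-0) from SP+1; keep ACx(39-32)."""
    guard = acx40 & (0xFF << 32)
    high = stack[sp] & 0xFFFF
    low = stack[sp + 1] & 0xFFFF
    return guard | (high << 16) | low, sp + 2   # SP is incremented by 2

stack = {0x0304: 0x5644, 0x0305: 0xF800}
ac1, sp = dbl_pop_to_acc(0x033800FC00, stack, 0x0304)
print(f"AC1 = {ac1:010X}, SP = {sp:04X}")
```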

Pop Top of Stack

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [6] & dbl(Lmem) = pop() & No & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

1011 1000 | AAAA AAAI

\section*{Operands}

Lmem

Description This instruction moves the content of the 16-bit data memory location pointed to by SP to the 16 highest bits of data memory location Lmem and moves the content of the 16-bit data memory location pointed to by SP + 1 to the 16 lowest bits of data memory location Lmem.

When Lmem is at an even address, the two 16-bit values popped from the stack are stored at memory location Lmem in the same order. When Lmem is at an odd address, the two 16-bit values popped from the stack are stored at memory location Lmem in the reverse order.

SP is incremented by 2.

Status Bits
Affected by none
Affects none

Repeat
This instruction can be repeated.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline dbl(*AR3-) = pop() & The content of the memory location pointed to by the data stack pointer (SP) is copied to the 16 highest bits of the location addressed by AR3 and the content of the memory location pointed to by SP + 1 is copied to the 16 lowest bits of the location addressed by AR3. Because this instruction is a long-operand instruction, AR3 is decremented by 2 after the execution. The SP is incremented by 2. \\
\hline
\end{tabular}
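
The even/odd ordering rule can be illustrated with a short Python sketch. It assumes, by analogy with the HI(Lmem)/LO(Lmem) addressing described for the MAS instructions, that the 16 highest bits of Lmem live at the effective address and the 16 lowest bits at the paired address (EA+1 when EA is even, EA-1 when EA is odd). That placement, the function name `pop_to_lmem`, and the sample values are assumptions for illustration.

```python
def pop_to_lmem(mem, stack, sp, ea):
    """Store two popped words into the long word at EA (assumed layout)."""
    mem[ea] = stack[sp] & 0xFFFF          # 16 highest bits of Lmem
    mem[ea ^ 1] = stack[sp + 1] & 0xFFFF  # 16 lowest bits of Lmem
    return sp + 2                         # SP is incremented by 2

stack = {0x0304: 0x5644, 0x0305: 0xF800}

even_mem = {}
pop_to_lmem(even_mem, stack, 0x0304, 0x0200)   # even address: same order
odd_mem = {}
pop_to_lmem(odd_mem, stack, 0x0304, 0x0201)    # odd address: reverse order

print([hex(even_mem[a]) for a in sorted(even_mem)])
print([hex(odd_mem[a]) for a in sorted(odd_mem)])
```

Reading each result in ascending address order shows the popped words in stack order for the even case and swapped for the odd case, which is consistent with the description above.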

\section*{PSHBOTH}

Push Accumulator or Extended Auxiliary Register Content to Stack Pointers

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & pshboth(xsrc) & Yes & 2 & 1 & X \\
\hline
\end{tabular}
Opcode

0101 000E | XSSS 0101

Operands

Description This instruction moves the lower 32 bits of ACx or the content of the 23-bit source register (XARx, XSP, XSSP, XDP, or XCDP) to the two 16-bit memory locations addressed by the data stack pointer (SP) and system stack pointer (SSP). The return address register (RETA) and the control-flow context register (CFCT) are not accessed by this instruction even in the fast-return process.

The content of xsrc(15-0) is moved to the location addressed by SP and the content of xsrc(31-16) is moved to the location addressed by SSP.

When xsrc is a 23-bit register, the upper 9 bits of the location addressed by SSP are filled with 0.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers
- Pop Top of Stack
- Push to Top of Stack
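
As an illustration of how pshboth() splits its source, the Python sketch below returns the two 16-bit words written through SP and SSP. Stack pointer updates are not modeled, and the function name `pshboth_words`, its `is_23bit` flag, and the sample values are hypothetical.

```python
def pshboth_words(xsrc, is_23bit):
    """Return (word stored via SP, word stored via SSP) for a pshboth()."""
    sp_word = xsrc & 0xFFFF                  # xsrc(15-0)
    if is_23bit:
        ssp_word = (xsrc >> 16) & 0x7F       # xsrc(22-16); upper 9 bits are 0
    else:
        ssp_word = (xsrc >> 16) & 0xFFFF     # ACx(31-16); guard bits not pushed
    return sp_word, ssp_word

print([hex(w) for w in pshboth_words(0xAB22221111, is_23bit=False)])
print([hex(w) for w in pshboth_words(0x7F3456, is_23bit=True)])
```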

\section*{PSH}

\section*{Push to Top of Stack}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & push(src1, src2) & Yes & 2 & 1 & X \\
{\([2]\)} & push(src) & Yes & 2 & 1 & X \\
{\([3]\)} & push(src, Smem) & No & 3 & 1 & X \\
{\([4]\)} & dbl(push(ACx)) & Yes & 2 & 1 & X \\
{\([5]\)} & push(Smem) & No & 2 & 1 & X \\
{\([6]\)} & push(dbl(Lmem)) & No & 2 & 1 & X \\
\hline
\end{tabular}

Description These instructions move one or two operands to the data memory location addressed by the data stack pointer (SP). The return address register (RETA) and the control-flow context register (CFCT) are not accessed by this instruction even in the fast-return process. The operands may be:
- an accumulator, auxiliary, or temporary register
- a data memory location

The decrement operation performed on SP is done by the A-unit address generator dedicated to the stack addressing management.

Status Bits
Affected by none
Affects none
See Also
See the following other related instructions:
- Pop Top of Stack
- Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers
- Push Accumulator or Extended Auxiliary Register Content to Stack Pointers

Push to Top of Stack

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [2] & push(src) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}

Description This instruction decrements SP by 1, then moves the content of the source register (src) to the 16-bit data memory location pointed to by SP. When the source register is an accumulator, the source accumulator low part, ACx(15-0), is moved to the 16-bit data memory operand.

Status Bits Affected by none
Affects none
Repeat This instruction cannot be repeated with a single conditional or unconditional repeat instruction. It can be repeated in other repeat instructions.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline push(AC0) & \begin{tabular}{l}
The data stack pointer (SP) is decremented by 1. The content of AC0(15-0) is copied \\
to the memory location pointed to by SP.
\end{tabular} \\
\hline
\end{tabular}

Push to Top of Stack

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [4] & dbl(push(ACx)) & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Opcode

0101 000E | xxSS x111

Operands

Description This instruction decrements SP by 2, then moves the content of the accumulator high part ACx(31-16) to the 16-bit data memory location pointed to by SP and moves the content of the accumulator low part ACx(15-0) to the 16-bit data memory location pointed to by SP + 1.

Status Bits Affected by none
Affects none
Repeat This instruction cannot be repeated with a single conditional or unconditional repeat instruction. It can be repeated in other repeat instructions.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline dbl(push(AC0)) & \begin{tabular}{l} 
The data stack pointer (SP) is decremented by 2. The content of AC0(31-16) is \\
copied to the memory location pointed to by SP and the content of AC0(15-0) is copied \\
to the memory location pointed to by SP + 1.
\end{tabular} \\
\hline
\end{tabular}
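
As a rough illustration of the dbl(push(ACx)) description above, the following Python sketch stores the two accumulator halves at SP and SP + 1. The helper name `push_acc_dbl`, the dictionary standing in for data memory, and the sample values are assumptions made for this example only.

```python
def push_acc_dbl(memory, sp, acx):
    """Model dbl(push(ACx)): SP -= 2, ACx(31-16) -> [SP], ACx(15-0) -> [SP + 1]."""
    sp = (sp - 2) & 0xFFFF                       # assumption: SP modeled as 16 bits
    memory[sp] = (acx >> 16) & 0xFFFF            # accumulator high part, ACx(31-16)
    memory[(sp + 1) & 0xFFFF] = acx & 0xFFFF     # accumulator low part, ACx(15-0)
    return sp


memory = {}                                      # hypothetical data memory
sp = push_acc_dbl(memory, 0x0300, 0x0012345678)  # hypothetical SP and AC0 content
```

Here the high word 1234h lands at the lower address (02FEh) and the low word 5678h at the next address (02FFh); the guard bits ACx(39-32) are not pushed.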

\section*{Push to Top of Stack}

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline & push(dbl(Lmem)) & No & 2 & 1 & X \\
\hline
\end{tabular}

Opcode
1011 0111 | AAAA AAAI

Operands Lmem

Description This instruction decrements SP by 2, then moves the 16 highest bits of data memory location Lmem to the 16-bit data memory location pointed to by SP and moves the 16 lowest bits of data memory location Lmem to the 16-bit data memory location pointed to by SP + 1.

When Lmem is at an even address, the two 16-bit values pushed onto the stack are stored at memory location Lmem in the same order. When Lmem is at an odd address, the two 16-bit values pushed onto the stack are stored at memory location Lmem in the reverse order.

Status Bits Affected by none
Affects none
Repeat This instruction can be repeated.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline push(dbl(*AR3-)) & \begin{tabular}{l}
The data stack pointer (SP) is decremented by 2. The 16 highest bits of the content at the \\
location addressed by AR3 are copied to the memory location pointed to by SP and the 16 lowest \\
bits of the content at the location addressed by AR3 are copied to the memory location pointed \\
to by SP + 1. Because this instruction is a long-operand instruction, AR3 is decremented by 2 \\
after the execution.
\end{tabular} \\
\hline
\end{tabular}

\section*{RPTB}

Repeat Block of Instructions Unconditionally

\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & localrepeat\{\} & Yes & 2 & 1 & AD \\
{\([2]\)} & blockrepeat\{\} & Yes & 3 & 1 & AD \\
\hline
\end{tabular}

Description These instructions repeat a block of instructions the number of times specified by:
- The content of BRC0 + 1, if no loop has already been detected.
- The content of BRS1 + 1, if one level of the loop has already been detected.

Loop structures defined by these instructions must have the following characteristics:
- The minimum number of instructions executed within one loop iteration is 2.
- The minimum number of cycles executed within one loop iteration is 2.
- Since the result of updating BRCx (and BRAF in C54CM = 1) within 3 instruction cycles from the end of the loop is uncertain (effective in the same iteration or the next iteration depending on the pipeline state), this operation is prohibited.
- The block-repeat operation can only be cleared by branching to a destination address outside the active block-repeat loop.
- C54CM bit in ST1_55 cannot be modified within a block-repeat loop.
These instructions cannot be repeated.
See section 1.5 for a list of instructions that cannot be used in a repeat block mechanism.
\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & none
\end{tabular}

See Also See the following other related instructions:
- Repeat Single Instruction Conditionally
- Repeat Single Instruction Unconditionally
- Repeat Single Instruction Unconditionally and Decrement CSR
- Repeat Single Instruction Unconditionally and Increment CSR

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & localrepeat\{\} & Yes & 2 & 1 & AD \\
\hline
\end{tabular}
Opcode 0100 101E | 1111 llll

\section*{Operands}

none

Description This instruction repeats a block of instructions the number of times specified by:
- The content of BRC0 + 1, if no loop has already been detected. In this case:
  - In the address phase of the pipeline, RSA0 is loaded with the program address of the first instruction of the loop.
  - The program address of the last instruction of the loop (that may be two parallel instructions) is computed in the address phase of the pipeline and stored in REA0.
  - BRC0 is decremented at the decode phase of the last instruction of the loop when its content is not equal to 0.
  - BRC0 contains 0 after the block-repeat operation has ended.
- The content of BRS1 + 1, if one level of the loop has already been detected. In this case:
  - BRC1 is loaded with the content of BRS1 in the address phase of the repeat block instruction.
  - In the address phase of the pipeline, RSA1 is loaded with the program address of the first instruction of the loop.
  - The program address of the last instruction of the loop (that may be two parallel instructions) is computed in the address phase of the pipeline and stored in REA1.
  - BRC1 is decremented at the decode phase of the last instruction of the loop when its content is not equal to 0.
  - BRC1 contains 0 after the block-repeat operation has ended.
  - BRS1 content is not impacted by the block-repeat operation.

Loop structures defined by this instruction must have the following characteristics:
- The minimum number of instructions executed within one loop iteration is 2.
- The minimum number of cycles executed within one loop iteration is 2.
- The maximum loop size is 128 bytes.
- Since the result of updating BRCx (and BRAF in C54CM = 1) within 3 instruction cycles from the end of the loop is uncertain (effective in the same iteration or the next iteration depending on the pipeline state), this operation is prohibited.
- C54CM bit in ST1_55 cannot be modified within a block-repeat loop.
- The following instructions cannot be used as the last instruction in the loop structure:
\begin{tabular}{|l|l|l|}
\hline while (cond \&\& (RPTC < k8)) repeat & repeat(k8) & repeat(CSR), CSR += k4 \\
\hline if (cond) execute(AD_Unit) & repeat(k16) & repeat(CSR), CSR += TAx \\
\hline if (cond) execute(D_Unit) & repeat(CSR) & repeat(CSR), CSR -= k4 \\
\hline
\end{tabular}

\section*{Note:}

The instructions if (cond) execute(AD_Unit) and if (cond) execute(D_Unit) can be used as the last instruction in the loop structure when they are paralleled with the instruction that they execute conditionally (if (cond) execute(AD_Unit) || instruction_executes conditionally).

A local loop is a loop in which all the code is repeatedly executed from within the instruction buffer queue (IBQ):
- All the code of the local loop must fit within the 128-byte, 4-byte-aligned IBQ; therefore, local repeat blocks are limited to 128 bytes minus the 0 to 3 bytes of first-instruction misalignment. The 128th byte of the IBQ can only occur in a paralleled instruction. See Figure 5-2 for legal uses of the localrepeat instruction.
- The following instructions cannot be used in any form in a local loop code:
\begin{tabular}{lll} 
blockrepeat & call & goto \\
idle & intr & reset \\
return & trap &
\end{tabular}
- Nested local repeat block instructions are allowed.
- The only branch instructions allowed in a localrepeat structure are the branch conditionally instructions (if (cond) goto) with a target branch address pointing to an instruction included within the loop code and being at a higher address than the branching instruction. In this case, the branch conditionally instruction is executed in 3 cycles and the condition is evaluated in the read phase of the pipeline (there is a 1-cycle latency on the condition setting).

\section*{Compatibility with C54x devices (C54CM = 1)}

\section*{When C54CM = 1:}
\(\square\) This instruction only uses block-repeat level 0; block-repeat level 1 is disabled.
\(\square\) The block-repeat active flag (BRAF) is set to 1. BRAF is cleared to 0 at the end of the block-repeat operation when BRC0 contains 0.
\(\square\) You can stop an active block-repeat operation by clearing BRAF to 0.
\(\square\) Block-repeat control registers for level 1 are not used. Nested block-repeat operations are supported using the C54x convention with context save/restore and BRAF. When an interrupt is acknowledged, unlike on C54x devices, BRAF is captured into the CFCT register and saved to the stack. You can therefore use a block/local loop instruction in an interrupt without preserving BRAF (while still preserving BRC0, RSA0, and REA0).
\(\square\) BRAF is automatically cleared to 0 when a far branch (FB) or far call (FCALL) instruction is executed.
\begin{tabular}{ll} 
Status Bits & Affected by none \\
& Affects none \\
Repeat & This instruction cannot be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline localrepeat & A block of instructions is repeated as defined by the content of BRC0 + 1. \\
\hline
\end{tabular}


Figure 5-2. Legal Uses of Repeat Block of Instructions Unconditionally (localrepeat) Instruction
(a) 128-Byte Unaligned Loop-Legal Use


The entire localrepeat block and the next instruction reside in the IBQ; therefore, this code is accepted by the assembler.

(b) 129-Byte Unaligned Loop with Single Instruction at End of Loop-Illegal Use


The localrepeat instruction is not aligned; the next instruction may not be fetched in the IBQ. Because the last instruction of the localrepeat block is a nonparalleled (single) instruction, the CPU must confirm that the next instruction does not have a parallel enable bit; therefore, this code is rejected by the assembler.
(c) 129-Byte Unaligned Loop with Paralleled Instruction at End of Loop-Legal Use
```

    ...                         ; no alignment directive
    localrepeat {
        1st instruction
        Last instruction (paralleled)
    }
    next instruction

```

The localrepeat instruction is not aligned; the next instruction may not be fetched in the IBQ. Because the last instruction of the localrepeat block is a paralleled instruction, the CPU does not need to confirm that the next instruction does not have a parallel enable bit; therefore, this code is accepted by the assembler.

(d) 129-Byte Aligned Loop with Single Instruction at End of Loop-Legal Use
```

    align 4                     ; alignment directive
    localrepeat {               ; 129-byte loop body
        1st instruction
        Last instruction        ; nonparalleled (single)
    }
    next instruction

```

The localrepeat instruction is aligned, so the entire localrepeat block and the next instruction reside in the IBQ. Because the next instruction is in the IBQ, the CPU can confirm that the next instruction does not have a parallel enable bit; therefore, this code is accepted by the assembler.
(e) 130-Byte Unaligned Loop-Illegal Use
```

    ...                         ; no alignment directive
    localrepeat {               ; 130-byte loop body
        1st instruction
        ...
        Last instruction
    }
    next instruction

```

The localrepeat instruction is not aligned; the entire localrepeat block may not reside in the IBQ. Because the last instruction of the localrepeat block may not reside in the IBQ, this code is rejected by the assembler.

(f) 130-Byte Aligned Loop with Single Instruction at End of Loop-Legal Use
```

    align 4                     ; alignment directive
    nop_16||nop                 ; 3-byte instruction
    localrepeat {               ; 130-byte loop body
        1st instruction
        ...
        Last instruction        ; nonparalleled (single)
    }
    next instruction

```

The nop instructions are aligned so the localrepeat instruction, the entire localrepeat block, and the next instruction reside in the IBQ. Because the next instruction is in the IBQ, the CPU can confirm that the next instruction does not have a parallel enable bit; therefore, this code is accepted by the assembler.
(g) 132-Byte Aligned Loop with Paralleled Instruction at End of Loop—Legal Use
\begin{tabular}{lll}
align 4 & & ; alignment directive \\
nop\_16 & & ; 2-byte instruction \\
localrepeat \{ & & \\
 & 1st instruction & \\
 & \(\ldots\) & \} 132-byte loop body \\
 & Last instruction (paralleled) & \\
\} & & \\
next instruction & &
\end{tabular}

The nop instruction is aligned, so the localrepeat instruction and the entire localrepeat block reside in the IBQ; the next instruction is not fetched in the IBQ. Because the last instruction of the localrepeat block is a paralleled instruction, the CPU does not need to confirm that the next instruction does not have a parallel enable bit; therefore, this code is accepted by the assembler.

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline {[2]} & blockrepeat\{\} & Yes & 3 & 1 & AD \\
\hline
\end{tabular}

Opcode

Operands none

Description This instruction repeats a block of instructions the number of times specified by:
- The content of BRC0 + 1, if no loop has already been detected. In this case:
  - In the address phase of the pipeline, RSA0 is loaded with the program address of the first instruction of the loop.
  - The program address of the last instruction of the loop (that may be two parallel instructions) is computed in the address phase of the pipeline and stored in REA0.
  - BRC0 is decremented at the decode phase of the last instruction of the loop when its content is not equal to 0.
  - BRC0 contains 0 after the block-repeat operation has ended.
- The content of BRS1 + 1, if one level of the loop has already been detected. In this case:
  - BRC1 is loaded with the content of BRS1 in the address phase of the repeat block instruction.
  - In the address phase of the pipeline, RSA1 is loaded with the program address of the first instruction of the loop.
  - The program address of the last instruction of the loop (that may be two parallel instructions) is computed in the address phase of the pipeline and stored in REA1.
  - BRC1 is decremented at the decode phase of the last instruction of the loop when its content is not equal to 0.
  - BRC1 contains 0 after the block-repeat operation has ended.
  - BRS1 content is not impacted by the block-repeat operation.

Loop structures defined by these instructions must have the following characteristics:
- The minimum number of instructions executed within one loop iteration is 2.
- The minimum number of cycles executed within one loop iteration is 2.
- The maximum loop size is 64K bytes.
- The block-repeat operation can only be cleared by branching to a destination address outside the active block-repeat loop.
- Since the result of updating BRCx (and BRAF in C54CM = 1) within 3 instruction cycles from the end of the loop is uncertain (effective in the same iteration or the next iteration depending on the pipeline state), this operation is prohibited.
- C54CM bit in ST1_55 cannot be modified within a block-repeat loop.
- The following instructions cannot be used as the last instruction in the loop structure:
\begin{tabular}{|l|l|l|}
\hline while (cond \&\& (RPTC < k8)) repeat & repeat(k8) & repeat(CSR), CSR += k4 \\
\hline if (cond) execute(AD_Unit) & repeat(k16) & repeat(CSR), CSR += TAx \\
\hline if (cond) execute(D_Unit) & repeat(CSR) & repeat(CSR), CSR -= k4 \\
\hline
\end{tabular}

See section 1.5 for a list of instructions that cannot be used in the block-repeat loop code.

\section*{Compatibility with C54x devices (C54CM = 1)}

\section*{When C54CM = 1:}
\(\square\) This instruction only uses block-repeat level 0; block-repeat level 1 is disabled.
\(\square\) The block-repeat active flag (BRAF) is set to 1. BRAF is cleared to 0 at the end of the block-repeat operation when BRC0 contains 0.
\(\square\) You can stop an active block-repeat operation by clearing BRAF to 0.
\(\square\) Block-repeat control registers for level 1 are not used. Nested block-repeat operations are supported using the C54x convention with context save/restore and BRAF. The control-flow context register (CFCT) values are not used.
\(\square\) BRAF is automatically cleared to 0 when a far branch (FB) or far call (FCALL) instruction is executed.
\begin{tabular}{ll}
Status Bits & Affected by none \\
 & Affects none \\
Repeat & This instruction cannot be repeated.
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline blockrepeat & \begin{tabular}{l} 
A block of instructions is repeated as defined by the content of BRC0 + 1. A second \\
loop of instructions is repeated as defined by the content of BRS1 + 1 (BRC1 is \\
loaded with the content of BRS1).
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline & Address & BRC0 & RSA0 & REA0 & BRS1 & BRC1 & RSA1 & REA1 \\
\hline BRC0 = \#3 & & 0003 & 0000 & 0000 & 0000 & 0000 & 0000 & 0000 \\
\hline BRC1 = \#1 & & ?* & ? & ? & 0001 & 0001 & ? & ? \\
\hline blockrepeat \{ & 004006 & ? & 4009 & 4017 & ? & ? & ? & ? \\
\hline ..... & 004009 & ? & ? & ? & ? & ? & ? & ? \\
\hline localrepeat \{ & 00400B & ? & ? & ? & ? & (BRS1) & 400D & 4015 \\
\hline & 00400D & ? & ? & ? & ? & ? & ? & ? \\
\hline ..... & 004015 & ? & ? & ? & ? & DTZ** & ? & ? \\
\hline \multicolumn{9}{|l|}{\}} \\
\hline ..... & 004017 & DTZ** & ? & ? & ? & ? & ? & ? \\
\hline & & 0000 & 4009 & 4017 & 0001 & 0000 & 400D & 4015 \\
\hline
\end{tabular}

*?: Unchanged

**DTZ: Decrease till zero
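
The counter behavior in this example can be checked with a small Python model. It is a simplified sketch, assuming one inner loop nested directly inside one outer loop and ignoring pipeline phases, RSAx/REAx addresses, and C54CM mode; the function name `run_nested_block_repeat` is hypothetical.

```python
def run_nested_block_repeat(brc0, brs1):
    """Count loop passes for a level-0 block repeat containing one level-1 repeat."""
    outer_passes = 0
    inner_passes = 0
    brc1 = 0
    while True:                      # level-0 loop body
        outer_passes += 1
        brc1 = brs1                  # BRC1 is loaded from BRS1 when the inner repeat starts
        while True:                  # level-1 loop body
            inner_passes += 1
            if brc1 == 0:            # the counter is only decremented while nonzero
                break
            brc1 -= 1
        if brc0 == 0:
            break
        brc0 -= 1
    return outer_passes, inner_passes, brc0, brc1


# Values taken from the example above: BRC0 = 3, BRS1 = 1
outer, inner, brc0_end, brc1_end = run_nested_block_repeat(brc0=3, brs1=1)
```

With BRC0 = 3 and BRS1 = 1, the outer body runs BRC0 + 1 = 4 times and the inner body runs (BRS1 + 1) = 2 times per outer pass, 8 times in total. Both counters end at 0, which agrees with the final row of the table, and BRS1 is never modified.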

RPTCC
Repeat Single Instruction Conditionally

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline {[1]} & while (cond \&\& (RPTC < k8)) repeat & Yes & 3 & & \\
\hline
\end{tabular}

\section*{Operands}

cond, k8

Description This instruction evaluates a single condition defined by the cond field and, as long as the condition is true, the next instruction or the next two paralleled instructions is repeated the number of times specified by an 8-bit immediate value, k8 + 1. The maximum number of executions of a given instruction or paralleled instructions is \(2^{8}-1\) (255). See Table 1-3 for a list of conditions.

The 8 LSBs of the repeat counter register (RPTC):
- Are loaded with the immediate value at the address phase of the pipeline.
- Are decremented by 1 in the decode phase of the repeated instruction.
- Must not be accessed while they are being decremented in the repeat single mechanism or in parallel with the repeat instruction itself.

The 8 MSBs of RPTC:
- Are loaded with the cond code at the address phase of the pipeline.
- Are untouched during the while/repeat structure execution.

At each step of the iteration, the condition defined by the cond field is tested in the execute phase of the pipeline. When the condition becomes false, the instruction repetition stops.
- If the condition becomes false at any execution of the repeated instruction, the 8 LSBs of RPTC are corrected to indicate exactly how many iterations were not performed.
- Since the condition is evaluated in the execute phase of the repeated instruction, when the condition is tested false, some of the succeeding iterations of that repeated instruction may have gone through the address, access, and read phases of the pipeline. Therefore, they may have modified the pointer registers used in the DAGEN units to generate data memory operand addresses in the address phase.

When the while/repeat structure is exited, reading the computed single-repeat register (CSR) content enables you to determine how many instructions have gone through the address phase of the pipeline. You may then use the Repeat Single Instruction Unconditionally instruction [3] to rewind the pointer registers. Note that this must only be performed when a false condition has been met inside the while/repeat structure.

The following table provides the 8 LSBs of RPTC and CSR once the while/repeat structure is exited.
\begin{tabular}{lcc}
If the condition is not met & RPTC[7:0] content after exiting loop & CSR content after exiting loop \\
At 1st iteration & RPTCinit + 1 & 4 \\
At 2nd iteration & RPTCinit & 4 \\
At 3rd iteration & RPTCinit - 1 & 4 \\
\(\ldots\) & \(\ldots\) & \(\ldots\) \\
At RPTCinit - 2 iteration & 4 & 3 \\
At RPTCinit - 1 iteration & 3 & 2 \\
At RPTCinit iteration & 2 & 1 \\
At RPTCinit + 1 iteration & 1 & 0 \\
Never & 0 & 0
\end{tabular}

RPTCinit is the number of requested iterations minus 1.
The repeat single mechanism triggered by this instruction is interruptible. Saving and restoring the RPTC content in ISRs enables you to preserve the while/repeat structure context.

Instead of programming a number of iterations (minus 1) equal to 0, it is recommended that you use the conditional execute() structure.

This instruction cannot be used as the last or the second to last instruction in a repeat loop structure.

See section 1.5 for a list of instructions that cannot be used in a repeat single mechanism.

In addition, any store-to-memory instruction including push instructions cannot be used in a conditional repeat single mechanism.

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, the comparison of accumulators to 0 is performed as if M40 was set to 1.
\begin{tabular}{ll} 
Status Bits & Affected by ACOVx, CARRY, C54CM, M40, TCx \\
 & Affects ACOVx \\
Repeat & This instruction cannot be repeated. \\
See Also & See the following other related instructions: \\
& \(\square\) Repeat Block of Instructions Unconditionally \\
& \(\square\) Repeat Single Instruction Unconditionally \\
& \(\square\) Repeat Single Instruction Unconditionally and Decrement CSR \\
& \(\square\) Repeat Single Instruction Unconditionally and Increment CSR
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline while (AC1 > \#0 \&\& (RPTC < \#7)) repeat & \begin{tabular}{l} 
As long as the content of AC1 is greater than 0 and the repeat \\
counter is not equal to 0, the next single instruction is repeated as \\
defined by the unsigned 8-bit value (7) + 1. At the address phase \\
of the pipeline, RPTC is automatically initialized to 4107h and then \\
is immediately decreased to 4106h.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|lll|}
\hline while (AC1 > \#0 \&\& (RPTC < \#7)) repeat & address: & 004004 \\
AC1 = AC1 - (T0 * *AR1) & & 004008 \\
\(\ldots\) & & 00400B \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AC1 & 00 2359 0340 & AC1 & 00 1FC2 7B40 \\
T0 & 0340 & T0 & 0340 \\
*AR1 & 2354 & *AR1 & 2354 \\
RPTC & 4106\(^{\dagger}\) & RPTC & 0000
\end{tabular}
\(\dagger\) At the address phase of the pipeline, RPTC is automatically initialized to 4107h and then is immediately decreased to 4106h.
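
The arithmetic of this example can be reproduced with a short Python sketch. It is a simplified model under stated assumptions: plain signed integer arithmetic with no saturation or fractional mode, no pipeline effects, and the helper names `to_signed16` and `conditional_repeat` are hypothetical. The condition code 41h is read off the 4107h value quoted in the example.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a two's-complement signed value."""
    return word - 0x10000 if word & 0x8000 else word


def conditional_repeat(ac1, t0, mem_word, k8):
    """Model while (AC1 > #0 && (RPTC < k8)) repeat of AC1 = AC1 - (T0 * *AR1)."""
    executions = 0
    while executions < k8 + 1 and ac1 > 0:      # at most k8 + 1 executions
        ac1 -= to_signed16(t0) * to_signed16(mem_word)
        executions += 1
    return ac1, executions


# Register and memory values taken from the example's "Before" column
ac1_after, executions = conditional_repeat(0x23590340, 0x0340, 0x2354, 7)

# RPTC layout described above: cond code in the 8 MSBs, k8 in the 8 LSBs
rptc_initial = (0x41 << 8) | 7
```

AC1 stays positive throughout, so the instruction executes all 8 times and AC1 ends at 1FC27B40h, which matches the 00 1FC2 high part shown in the "After" column.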

\section*{RPT}

Repeat Single Instruction Unconditionally

\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & repeat(k8) & Yes & 2 & 1 & AD \\
{\([2]\)} & repeat(k16) & Yes & 3 & 1 & AD \\
{\([3]\)} & repeat(CSR) & Yes & 2 & 1 & AD \\
\hline
\end{tabular}

Description This instruction repeats the next instruction or the next two paralleled instructions the number of times specified by the content of the computed single repeat register (CSR) + 1 or an immediate value, kx + 1. This value is loaded into the repeat counter register (RPTC). The maximum number of executions of a given instruction or paralleled instructions is \(2^{16}-1\) (65535).

The repeat single mechanism triggered by these instructions is interruptible.
These instructions cannot be repeated.
These instructions cannot be used as the last instruction in a repeat loop structure.

Two paralleled instructions can be repeated when following the parallelism general rules.

See section 1.5 for a list of instructions that cannot be used in a repeat single mechanism.
Status Bits Affected by none

See Also
See the following other related instructions:
- Repeat Block of Instructions Unconditionally
- Repeat Single Instruction Conditionally
- Repeat Single Instruction Unconditionally and Decrement CSR
- Repeat Single Instruction Unconditionally and Increment CSR

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline {[1]} & repeat(k8) & Yes & 2 & 1 & AD \\
{[2]} & repeat(k16) & Yes & 3 & 1 & AD \\
\hline
\end{tabular}

Opcode
\begin{tabular}{ll}
k8 & 0100 110E | kkkk kkkk \\
k16 &
\end{tabular}

Operands kx

Description This instruction repeats the next instruction or the next two paralleled instructions the number of times specified by an immediate value, kx + 1. The repeat counter register (RPTC):
- Is loaded with the immediate value in the address phase of the pipeline.
- Is decremented by 1 in the decode phase of the repeated instruction.
- Contains 0 at the end of the repeat single mechanism.
- Must not be accessed when it is being decremented in the repeat single mechanism or in parallel with the repeat instruction itself.

The repeat single mechanism triggered by this instruction is interruptible.

Two paralleled instructions can be repeated when following the parallelism general rules.

This instruction cannot be used as the last instruction in a repeat loop structure.

See section 1.5 for a list of instructions that cannot be used in a repeat single mechanism.

Status Bits Affected by none
Affects none
Repeat This instruction cannot be repeated.

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline repeat(\#3) & The single instruction following the repeat instruction is repeated four times. \\
AC1 = AC1 + *AR3+ * *AR4+ & \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AC1 & 00 0000 0000 & AC1 & 00 3376 AD10 \\
AR3 & 0200 & AR3 & 0204 \\
AR4 & 0400 & AR4 & 0404 \\
200 & AC03 & 200 & AC03 \\
201 & 3468 & 201 & 3468 \\
202 & FE00 & 202 & FE00 \\
203 & 23DC & 203 & 23DC \\
400 & D768 & 400 & D768 \\
401 & 6987 & 401 & 6987 \\
402 & 3400 & 402 & 3400 \\
403 & 7900 & 403 & 7900
\end{tabular}
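
The before/after values of Example 1 can be verified with the following Python sketch. It assumes signed 16-bit operands, plain integer accumulation with no saturation or fractional mode, and linear (non-circular) post-increment addressing; the helper names `to_signed16` and `repeat_mac` are hypothetical.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a two's-complement signed value."""
    return word - 0x10000 if word & 0x8000 else word


def repeat_mac(memory, ac1, ar3, ar4, k):
    """Model repeat(k) followed by AC1 = AC1 + *AR3+ * *AR4+ (k + 1 executions)."""
    for _ in range(k + 1):
        ac1 += to_signed16(memory[ar3]) * to_signed16(memory[ar4])
        ar3 += 1                      # post-increment of AR3
        ar4 += 1                      # post-increment of AR4
    return ac1, ar3, ar4


# Data memory contents taken from the example table
memory = {
    0x200: 0xAC03, 0x201: 0x3468, 0x202: 0xFE00, 0x203: 0x23DC,
    0x400: 0xD768, 0x401: 0x6987, 0x402: 0x3400, 0x403: 0x7900,
}
ac1, ar3, ar4 = repeat_mac(memory, 0, 0x200, 0x400, 3)
```

Four multiply-accumulate steps produce AC1 = 3376AD10h with AR3 = 0204h and AR4 = 0404h, in agreement with the "After" column.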

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline repeat(\#513) & A single instruction is repeated as defined by the unsigned 16-bit value + 1 (513 + 1). \\
\hline
\end{tabular}

\section*{Syntax Characteristics}


Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline repeat(CSR) & \begin{tabular}{l} 
The single instruction following the repeat instruction is repeated as defined \\
by the content of CSR +1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AC1 & 00 0000 0000 & AC1 & 00 3376 AD10 \\
CSR & 0003 & CSR & 0003 \\
AR3 & 0200 & AR3 & 0204 \\
AR4 & 0400 & AR4 & 0404 \\
200 & AC03 & 200 & AC03 \\
201 & 3468 & 201 & 3468 \\
202 & FE00 & 202 & FE00 \\
203 & 23DC & 203 & 23DC \\
400 & D768 & 400 & D768 \\
401 & 6987 & 401 & 6987 \\
402 & 3400 & 402 & 3400 \\
403 & 7900 & 403 & 7900
\end{tabular}

RPTSUB
Repeat Single Instruction Unconditionally and Decrement CSR

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & repeat(CSR), CSR -= k4 & Yes & 2 & 1 & X \\
\hline
\end{tabular}
Opcode 0100 100E | kkkk x011

\section*{Operands}

k4

Description This instruction repeats the next instruction or the next two paralleled instructions the number of times specified by the content of the computed single repeat register (CSR) + 1. The repeat counter register (RPTC):
- Is loaded with CSR content in the address phase of the pipeline.
- Is decremented by 1 in the decode phase of the repeated instruction.
- Contains 0 at the end of the repeat single mechanism.
- Must not be accessed when it is being decremented in the repeat single mechanism or in parallel with the repeat instruction itself.

With the A-unit ALU, this instruction allows the content of CSR to be decremented by k4. The CSR modification is performed in the execute phase of the pipeline; there is a 3-cycle latency between the CSR modification and its usage in the address phase.

The repeat single mechanism triggered by this instruction is interruptible.
Two paralleled instructions can be repeated when following the parallelism general rules.

This instruction cannot be used as the last instruction in a repeat loop structure.

See section 1.5 for a list of instructions that cannot be used in a repeat single mechanism.
\begin{tabular}{lll} 
Status Bits & Affected by none \\
& Affects none \\
Repeat & This instruction cannot be repeated.
\end{tabular}

See Also See the following other related instructions:
- Repeat Block of Instructions Unconditionally
- Repeat Single Instruction Conditionally
- Repeat Single Instruction Unconditionally
- Repeat Single Instruction Unconditionally and Increment CSR

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline repeat(CSR), CSR -= \#2 & \begin{tabular}{l} 
A single instruction is repeated as defined by the content of CSR + 1. The \\
content of CSR is decremented by the unsigned 4-bit value (2).
\end{tabular} \\
\hline
\end{tabular}
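
A minimal Python sketch of how successive uses of this instruction give shrinking trip counts is shown here. It ignores the 3-cycle latency between the CSR modification and its use, and assumes CSR wraps as a 16-bit register; the function name `repeat_csr_decrement` and the starting value of CSR are hypothetical.

```python
def repeat_csr_decrement(csr, k4):
    """Model repeat(CSR), CSR -= k4: return (executions, CSR after the instruction)."""
    if not 0 <= k4 <= 15:
        raise ValueError("k4 is an unsigned 4-bit value")
    executions = csr + 1               # RPTC is loaded with CSR before CSR is modified
    csr_after = (csr - k4) & 0xFFFF    # assumption: CSR wraps as a 16-bit register
    return executions, csr_after


csr = 6                                # hypothetical starting CSR content
trip_counts = []
for _ in range(3):                     # three successive repeat(CSR), CSR -= #2
    executions, csr = repeat_csr_decrement(csr, 2)
    trip_counts.append(executions)
```

Starting from CSR = 6, the repeated instruction runs 7, 5, and then 3 times, and CSR ends at 0.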

RPTADD
Repeat Single Instruction Unconditionally and Increment CSR

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & repeat(CSR), CSR += TAx & Yes & 2 & 1 & \(X\) \\
{\([2]\)} & repeat(CSR), CSR += k4 & Yes & 2 & 1 & \(X\) \\
\hline
\end{tabular}

Description These instructions repeat the next instruction or the next two paralleled instructions the number of times specified by the content of the computed single repeat register (CSR) + 1. This value is loaded into the repeat counter register (RPTC). The maximum number of executions of a given instruction or paralleled instructions is \(2^{16}-1\) (65535).

With the A-unit ALU, these instructions allow the content of CSR to be incremented. The CSR modification is performed in the execute phase of the pipeline; there is a 3-cycle latency between the CSR modification and its usage in the address phase.

The repeat single mechanism triggered by these instructions is interruptible.
Two paralleled instructions can be repeated when following the parallelism general rules.

These instructions cannot be repeated.
These instructions cannot be used as the last instruction in a repeat loop structure.

See section 1.5 for a list of instructions that cannot be used in a repeat single mechanism.
Status Bits Affected by none

\section*{See Also}

See the following other related instructions:
- Repeat Block of Instructions Unconditionally
- Repeat Single Instruction Conditionally
- Repeat Single Instruction Unconditionally
- Repeat Single Instruction Unconditionally and Decrement CSR


\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline repeat(CSR), CSR \(+=\) T1 & \begin{tabular}{l} 
A single instruction is repeated as defined by the content of CSR +1. The \\
content of CSR is incremented by the content of temporary register T1.
\end{tabular} \\
\hline
\end{tabular}

\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline repeat(CSR), CSR \(+=\) \#2 & \begin{tabular}{l} 
A single instruction is repeated as defined by the content of CSR +1. The \\
content of CSR is incremented by the unsigned 4-bit value (2).
\end{tabular} \\
\hline
\end{tabular}

RETCC
Return Conditionally

\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles \(^{\dagger}\) & Pipeline \\
\hline\([1]\) & if (cond) return & Yes & 3 & \(5 / 5\) & \(R\) \\
\hline
\end{tabular}
\({ }^{\dagger} \mathrm{x} / \mathrm{y}\) cycles: x cycles \(=\) condition true, y cycles \(=\) condition false

\section*{Opcode}

\section*{Operands}
cond

Description This instruction evaluates a single condition defined by the cond field in the read phase of the pipeline. If the condition is true, a return occurs to the return address of the calling subroutine. There is a 1-cycle latency on the condition setting. See Table 1-3 for a list of conditions.

After returning from a called subroutine, the CPU restores the value of two internal registers: the program counter \((\mathrm{PC})\) and a loop context register. The CPU uses these values to re-establish the context of the program sequence.

In the slow-return process (default), the return address (from the PC) and the loop context bits are restored from the stacks (in memory). When the CPU returns from a subroutine, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are restored from the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

When a return from a subroutine occurs:
- The loop context bits concatenated with the 8 MSBs of the return address are popped from the top of the system stack pointer (SSP). The SSP is incremented by 1 word in the read phase of the pipeline.
- The 16 LSBs of the return address are popped from the top of the data stack pointer (SP). The SP is incremented by 1 word in the read phase of the pipeline.
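The two pops listed above can be illustrated with a short Python sketch that rebuilds the 24-bit return address from the two 16-bit words. The function name is hypothetical, and the exact layout of the system-stack word (loop context bits in the upper byte, the 8 address MSBs in the lower byte) is an assumption taken from the order of the wording "loop context bits concatenated with the 8 MSBs".

```python
def pop_return_context(ssp_word, sp_word):
    """Rebuild (loop_context, return_address) from the two popped words.

    ssp_word: 16-bit word popped from the system stack (SSP)
    sp_word:  16-bit word popped from the data stack (SP)
    """
    loop_context = (ssp_word >> 8) & 0xFF   # assumption: upper byte
    pc_high = ssp_word & 0xFF               # 8 MSBs of the return address
    return_address = (pc_high << 16) | (sp_word & 0xFFFF)
    return loop_context, return_address


loop_bits, address = pop_return_context(0x4012, 0x3456)
print(hex(loop_bits), hex(address))
```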


\section*{RET}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & return & Yes & 2 & 5 & D \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|}
\hline Opcode & 0100 100E & xxxx x100 \\
\hline
\end{tabular}

Operands
Description This instruction passes control back to the calling subroutine.

After returning from a called subroutine, the CPU restores the value of two internal registers: the program counter (PC) and a loop context register. The CPU uses these values to re-establish the context of the program sequence.

In the slow-return process (default), the return address (from the PC) and the loop context bits are restored from the stacks (in memory). When the CPU returns from a subroutine, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are restored from the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions. For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

When a return from a subroutine occurs:
- The loop context bits concatenated with the 8 MSBs of the return address are popped from the top of the system stack pointer (SSP). The SSP is incremented by 1 word in the address phase of the pipeline.
- The 16 LSBs of the return address are popped from the top of the data stack pointer (SP). The SP is incremented by 1 word in the address phase of the pipeline.

\begin{tabular}{ll} 
Status Bits & Affected by none \\
& Affects none \\
Repeat & This instruction cannot be repeated. \\
See Also & See the following other related instructions: \\
& - Call Conditionally \\
& - Call Unconditionally \\
& - Return Conditionally \\
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline return & The program counter is loaded with the return address of the calling subroutine. \\
\hline
\end{tabular}

RETI
Return from Interrupt

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & return_int & No & 2 & 5 & D \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|}
\hline Opcode & 0100 100E & xxxx x101 \\
\hline
\end{tabular}

Operands none

Description This instruction passes control back to the interrupted task.

After returning from an interrupt service routine (ISR), the CPU automatically restores the value of some CPU registers and two internal registers: the program counter (PC) and a loop context register. The CPU uses these values to re-establish the context of the program sequence.

In the slow-return process (default), the return address (from the PC), the loop context bits, and some CPU registers are restored from the stacks (in memory). When the CPU returns from an ISR, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are restored from the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions. Some CPU registers are restored from the stacks (in memory). For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).
- The loop context bits concatenated with the 8 MSBs of the return address are popped from the top of the system stack pointer (SSP). The SSP is incremented by 1 word in the address phase of the pipeline.
- The 16 LSBs of the return address are popped from the top of the data stack pointer (SP). The SP is incremented by 1 word in the address phase of the pipeline.
- The debug status register (DBSTAT) content is popped from the top of SSP. The SSP is incremented by 1 word in the access phase of the pipeline.
- The status register 1 (ST1_55) content is popped from the top of SP. The SP is incremented by 1 word in the access phase of the pipeline.
- The 7 higher bits of status register 0 (ST0_55) concatenated with 9 zeroes are popped from the top of SSP. The SSP is incremented by 1 word in the read phase of the pipeline.
- The status register 2 (ST2_55) content is popped from the top of SP. The SP is incremented by 1 word in the read phase of the pipeline.


\section*{ROL \\ Rotate Left Accumulator, Auxiliary, or Temporary Register Content}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline \multicolumn{6}{|c|}{dst = BitOut \(\backslash\backslash\) src \(\backslash\backslash\) BitIn} \\
\hline [1] & dst = TC2 \(\backslash\backslash\) src \(\backslash\backslash\) TC2 & Yes & 3 & 1 & X \\
\hline [2] & dst = TC2 \(\backslash\backslash\) src \(\backslash\backslash\) CARRY & Yes & 3 & 1 & X \\
\hline [3] & dst = CARRY \(\backslash\backslash\) src \(\backslash\backslash\) TC2 & Yes & 3 & 1 & X \\
\hline [4] & dst = CARRY \(\backslash\backslash\) src \(\backslash\backslash\) CARRY & Yes & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}
dst, src

\section*{Description}

This instruction performs a bitwise rotation to the MSBs. Both TC2 and CARRY can be used to shift in one bit (BitIn) or to store the shifted-out bit (BitOut). The one bit in BitIn is shifted into the source (src) operand and the shifted-out bit is stored to BitOut.
- When the destination (dst) operand is an accumulator:
  - if an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the register are zero extended to 40 bits
  - the operation is performed on 40 bits in the D-unit shifter
  - BitIn is inserted at bit position 0
  - BitOut is extracted at a bit position according to M40
- When the destination (dst) operand is an auxiliary or temporary register:
  - if an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation
  - the operation is performed on 16 bits in the A-unit ALU
  - BitIn is inserted at bit position 0
  - BitOut is extracted at bit position 15

Compatibility with C54x devices (C54CM = 1)
When this instruction is executed with M40 = 0, compatibility is ensured.

Status Bits Affected by CARRY, M40, TC2
Affects CARRY, TC2

Repeat This instruction can be repeated.

See Also See the following other related instructions:
- Rotate Right Accumulator, Auxiliary, or Temporary Register Content

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = CARRY \(\backslash\backslash\) AC1 \(\backslash\backslash\) TC2 & \begin{tabular}{l} 
The value of TC2 (1) before the execution of the instruction is shifted into \\
the LSB of AC1 and bit 31 shifted out from AC1 is stored in the CARRY \\
status bit. The rotated value is stored in AC1. Because M40 = 0, the \\
guard bits (39-32) are cleared.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrlr}
Before & & After & \\
AC1 & 0F E340 5678 & AC1 & 00 C680 ACF1 \\
TC2 & 1 & TC2 & 1 \\
CARRY & 1 & CARRY & 1 \\
M40 & 0 & M40 & 0
\end{tabular}
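The rotate-left example can be reproduced with a small Python model. This is a sketch under stated assumptions: the function names are hypothetical, the M40 = 0 behavior (bit 31 shifted out, guard bits cleared) is taken from the example, and the M40 = 1 behavior (bit 39 shifted out) is an assumption about what "according to M40" means.

```python
def rol_accumulator(src, bit_in, m40):
    """Rotate a 40-bit accumulator value left by one bit.

    Returns (dst, bit_out), where bit_out goes to TC2 or CARRY.
    """
    width = 40 if m40 else 32          # assumption for the M40 = 1 case
    mask = (1 << width) - 1
    value = src & mask                 # M40 = 0: guard bits end up cleared
    bit_out = (value >> (width - 1)) & 1
    dst = ((value << 1) | (bit_in & 1)) & mask
    return dst, bit_out


def rol_register(src, bit_in):
    """Rotate a 16-bit auxiliary or temporary register left by one bit."""
    value = src & 0xFFFF
    bit_out = (value >> 15) & 1        # bit 15 is shifted out
    return ((value << 1) | (bit_in & 1)) & 0xFFFF, bit_out


# AC1 = CARRY \\ AC1 \\ TC2 with AC1 = 0F E340 5678, TC2 = 1, M40 = 0
dst, carry = rol_accumulator(0x0FE3405678, 1, 0)
print(f"{dst:010X}", carry)
```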

\section*{ROR Rotate Right Accumulator, Auxiliary, or Temporary Register Content}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline & dst = BitIn // src // BitOut & & & & \\
{\([1]\)} & dst = TC2 // src // TC2 & Yes & 3 & 1 & X \\
{\([2]\)} & dst = TC2 // src // CARRY & Yes & 3 & 1 & X \\
{\([3]\)} & dst = CARRY // src // TC2 & Yes & 3 & 1 & X \\
{\([4]\)} & dst = CARRY // src // CARRY & Yes & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands}
dst, src

\section*{Description}

This instruction performs a bitwise rotation to the LSBs. Both TC2 and CARRY can be used to shift in one bit (BitIn) or to store the shifted-out bit (BitOut). The one bit in BitIn is shifted into the source (src) operand and the shifted-out bit is stored to BitOut.
- When the destination (dst) operand is an accumulator:
  - if an auxiliary or temporary register is the source (src) operand of the instruction, the 16 LSBs of the register are zero extended to 40 bits
  - the operation is performed on 40 bits in the D-unit shifter
  - BitIn is inserted at a bit position according to M40
  - BitOut is extracted at bit position 0
- When the destination (dst) operand is an auxiliary or temporary register:
  - if an accumulator is the source (src) operand of the instruction, the 16 LSBs of the accumulator are used to perform the operation
  - the operation is performed on 16 bits in the A-unit ALU
  - BitIn is inserted at bit position 15
  - BitOut is extracted at bit position 0

Compatibility with C54x devices (C54CM = 1)
When this instruction is executed with M40 = 0, compatibility is ensured.

Status Bits Affected by CARRY, M40, TC2
Affects CARRY, TC2

Repeat This instruction can be repeated.

See Also See the following other related instructions:
- Rotate Left Accumulator, Auxiliary, or Temporary Register Content

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = TC2 // AC0 // TC2 & \begin{tabular}{l} 
The value of TC2 (1) before the execution of the instruction is shifted into \\
bit 31 of AC0 and the LSB shifted out from AC0 is stored in TC2. The \\
rotated value is stored in AC1. Because M40 = 0, the guard bits (39-32) are \\
cleared.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrlr}
Before & & After & \\
AC0 & 5F B000 1234 & AC0 & 5F B000 1234 \\
AC1 & 00 C680 ACF1 & AC1 & 00 D800 091A \\
TC2 & 1 & TC2 & 0 \\
M40 & 0 & M40 & 0
\end{tabular}
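The rotate-right example can be checked the same way. The sketch below uses hypothetical function names; inserting the incoming bit at bit 31 for M40 = 0 follows the example, while using bit 39 for M40 = 1 is an assumption.

```python
def ror_accumulator(src, bit_in, m40):
    """Rotate a 40-bit accumulator value right by one bit.

    Returns (dst, bit_out), where bit_out goes to TC2 or CARRY.
    """
    width = 40 if m40 else 32              # assumption for the M40 = 1 case
    value = src & ((1 << width) - 1)       # M40 = 0: guard bits end up cleared
    bit_out = value & 1                    # bit 0 is shifted out
    dst = (value >> 1) | ((bit_in & 1) << (width - 1))
    return dst, bit_out


def ror_register(src, bit_in):
    """Rotate a 16-bit auxiliary or temporary register right by one bit."""
    value = src & 0xFFFF
    return (value >> 1) | ((bit_in & 1) << 15), value & 1


# AC1 = TC2 // AC0 // TC2 with AC0 = 5F B000 1234, TC2 = 1, M40 = 0
dst, tc2 = ror_accumulator(0x5FB0001234, 1, 0)
print(f"{dst:010X}", tc2)
```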

ROUND
Syntax Characteristics
 D-unit ALU.
- The rounding operation depends on RDM:
  - When RDM = 0, biased rounding to infinity is performed: 8000h (\(2^{15}\)) is added to the 40-bit source accumulator ACx.
  - When RDM = 1, unbiased rounding to the nearest is performed. According to the value of the 17 LSBs of the 40-bit source accumulator ACx, 8000h (\(2^{15}\)) is added:

```
if( 8000h < bit(15-0) < 10000h)
    add 8000h to the 40-bit source accumulator ACx
else if( bit(15-0) == 8000h)
    if( bit(16) == 1)
        add 8000h to the 40-bit source accumulator ACx
```

  If a rounding has been performed, the 16 lowest bits of the result are cleared to 0.
- Addition overflow detection depends on M40.
- No addition carry report is stored in the CARRY status bit.
- If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices \((C 54 C M=1)\)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When \(\mathrm{C} 54 \mathrm{CM}=1\), the rounding is performed without clearing the LSBs of accumulator ACx.

Status Bits Affected by C54CM, M40, RDM, SATD
Affects ACOVy

Repeat This instruction cannot be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = rnd(AC0) & \begin{tabular}{l} 
The content of AC0 is added to 8000h, the 16 LSBs are cleared to 0, and the result \\
is stored in AC1. M40 is cleared to 0, so overflow is detected at bit 31; SATD is cleared \\
to 0, so AC1 is not saturated.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrlr}
Before & & After & \\
AC0 & EF 0FF0 8023 & AC0 & EF 0FF0 8023 \\
AC1 & 00 0000 0000 & AC1 & EF 0FF1 0000 \\
RDM & 1 & RDM & 1 \\
M40 & 0 & M40 & 0 \\
SATD & 0 & SATD & 0 \\
ACOV1 & 0 & ACOV1 & 1
\end{tabular}
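The rounding rules and the example above can be sketched in Python. The function names are hypothetical. The sketch always clears the 16 LSBs (an interpretation of "if a rounding has been performed"), treats overflow as "the result does not fit the signed 32-bit or 40-bit frame selected by M40", and does not model saturation (SATD = 0, as in the example) or C54CM.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit pattern as a signed integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def round_accumulator(acx, rdm, m40):
    """Return (result_40bit, overflow) for ACy = rnd(ACx) with SATD = 0."""
    value = to_signed40(acx)
    low16 = value & 0xFFFF
    bit16 = (value >> 16) & 1
    if rdm == 0:
        add = True                                   # biased rounding
    else:                                            # unbiased rounding
        add = low16 > 0x8000 or (low16 == 0x8000 and bit16 == 1)
    if add:
        value += 0x8000
    value &= ~0xFFFF                                 # clear the 16 LSBs
    bits = 40 if m40 else 32
    overflow = not (-(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1)
    return value & MASK40, overflow


# AC1 = rnd(AC0) with AC0 = EF 0FF0 8023, RDM = 1, M40 = 0
result, acov = round_accumulator(0xEF0FF08023, rdm=1, m40=0)
print(f"{result:010X}", acov)
```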

\section*{SAT}

Saturate Accumulator Content

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & ACy = saturate(rnd(ACx)) & Yes & 2 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|}
\hline Opcode & 0101 010E & DDSS 110\% \\
\hline
\end{tabular}

\section*{Operands}
ACx, ACy

Description This instruction performs a saturation of the source accumulator ACx to the 32-bit width frame in the D-unit ALU.
- A rounding is performed if the optional rnd keyword is applied to the instruction. The rounding operation depends on RDM:
  - When RDM = 0, biased rounding to infinity is performed: 8000h (\(2^{15}\)) is added to the 40-bit source accumulator ACx.
  - When RDM = 1, unbiased rounding to the nearest is performed. According to the value of the 17 LSBs of the 40-bit source accumulator ACx, 8000h (\(2^{15}\)) is added:

```
if( 8000h < bit(15-0) < 10000h)
    add 8000h to the 40-bit source accumulator ACx
else if( bit(15-0) == 8000h)
    if( bit(16) == 1)
        add 8000h to the 40-bit source accumulator ACx
```

  If a rounding has been performed, the 16 lowest bits of the result are cleared to 0.
- An overflow is detected at bit position 31.
- No addition carry report is stored in the CARRY status bit.
- If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the destination register is saturated. Saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow).

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When C54CM \(=1\), the rounding is performed without clearing the LSBs of accumulator ACx.

Status Bits Affected by C54CM, RDM
Affects ACOVy

Repeat This instruction can be repeated.

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = saturate(AC0) & \begin{tabular}{l} 
The 32-bit width content of AC0 is saturated and the saturated value, FF 8000 0000, \\
is stored in AC1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multicolumn{4}{|l|}{Before} & \multicolumn{4}{|l|}{After} \\
\hline AC0 & EF & 0FF0 & 8023 & AC0 & EF & 0FF0 & 8023 \\
\hline AC1 & 00 & 0000 & 0000 & AC1 & FF & 8000 & 0000 \\
\hline ACOV1 & & & 0 & ACOV1 & & & 1 \\
\hline
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = saturate(rnd(AC0)) & \begin{tabular}{l} 
The 32-bit width content of AC0 is saturated. The saturated value, 00 7FFF FFFF, \\
is rounded, 16 LSBs are cleared, and stored in AC1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrlr}
Before & & After & \\
AC0 & 00 7FFF 8000 & AC0 & 00 7FFF 8000 \\
AC1 & 00 0000 0000 & AC1 & 00 7FFF 0000 \\
RDM & 0 & RDM & 0 \\
ACOV1 & 0 & ACOV1 & 1
\end{tabular}
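Both saturation examples can be reproduced with a short Python model. The function names are hypothetical, and the ordering of the steps (add the rounding constant, detect overflow at bit 31 and saturate, then clear the 16 LSBs when rnd is used) is an assumption inferred from Example 2 rather than a statement of how the D-unit ALU is built. C54CM = 1 is not modeled.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret a 40-bit pattern as a signed integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def saturate32(acx, rnd=False, rdm=0):
    """Return (result_40bit, overflow) for ACy = saturate([rnd](ACx))."""
    value = to_signed40(acx)
    if rnd:
        low16 = value & 0xFFFF
        bit16 = (value >> 16) & 1
        if rdm == 0 or low16 > 0x8000 or (low16 == 0x8000 and bit16 == 1):
            value += 0x8000
    overflow = not (-(1 << 31) <= value <= (1 << 31) - 1)
    if overflow:
        # 00 7FFF FFFFh on positive overflow, FF 8000 0000h on negative
        value = (1 << 31) - 1 if value > 0 else -(1 << 31)
    if rnd:
        value &= ~0xFFFF            # assumption: cleared after saturation
    return value & MASK40, overflow


print([f"{r:010X}" for r, _ in (saturate32(0xEF0FF08023),
                                saturate32(0x007FFF8000, rnd=True, rdm=0))])
```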

\section*{BSET}

Set Accumulator, Auxiliary, or Temporary Register Bit

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & bit(src, Baddr) \(=\# 1\) & No & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
1110 1100 | AAAA AAAI | FSSS 000x
\begin{tabular}{ll} 
Operands & Baddr, src \\
Description & This instruction performs a bit manipulation:
\end{tabular}
- In the D-unit ALU, if the source (src) register operand is an accumulator.
- In the A-unit ALU, if the source (src) register operand is an auxiliary or temporary register.

The instruction sets to 1 a single bit, as defined by the bit addressing mode, Baddr, of the source register.

The generated bit address must be within:
- 0-39 when accessing accumulator bits (only the 6 LSBs of the generated bit address are used to determine the bit position). If the generated bit address is not within \(0-39\), the selected register bit value does not change.
- 0-15 when accessing auxiliary or temporary register bits (only the 4 LSBs of the generated address are used to determine the bit position).
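The bit-address rules above can be sketched as follows. The function names are hypothetical; the model covers only the selection of the bit position from the generated bit address, not the Baddr addressing modes themselves.

```python
MASK40 = (1 << 40) - 1


def set_accumulator_bit(acc, bit_address):
    """Set one bit of a 40-bit accumulator; only the 6 LSBs select the bit."""
    position = bit_address & 0x3F
    if position > 39:
        return acc & MASK40          # outside 0-39: value does not change
    return (acc | (1 << position)) & MASK40


def set_register_bit(reg, bit_address):
    """Set one bit of a 16-bit register; only the 4 LSBs select the bit."""
    position = bit_address & 0xF
    return (reg | (1 << position)) & 0xFFFF


print(hex(set_accumulator_bit(0, 39)), hex(set_accumulator_bit(0, 40)),
      hex(set_register_bit(0, 17)))
```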
\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & none
\end{tabular}

Repeat This instruction can be repeated.
See Also See the following other related instructions:
- Clear Accumulator, Auxiliary, or Temporary Register Bit
- Complement Accumulator, Auxiliary, or Temporary Register Bit
- Set Memory Bit
- Set Status Register Bit

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline bit(AC0, AR3) = \#1 & The bit at the position defined by the content of AR3(4-0) in AC0 is set to 1. \\
\hline
\end{tabular}

\section*{BSET}

Set Memory Bit

\section*{Syntax Characteristics}


\section*{BSET}

Set Status Register Bit

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & bit(ST0, k4) = \#1 & Yes & 2 & 1 & X \\
{\([2]\)} & bit(ST1, k4) = \#1 & Yes & 2 & 1 & X \\
{\([3]\)} & bit(ST2, k4) = \#1 & Yes & 2 & 1 & X \\
{\([4]\)} & bit(ST3, k4) = \#1 & Yes & 2 & \(1^{\dagger}\) & X \\
\hline
\end{tabular}
\(\dagger\) When this instruction is decoded to modify status bit CAFRZ (15), CAEN (14), or CACLR (13), the CPU pipeline is flushed and the instruction is executed in 5 cycles regardless of the instruction context.


\section*{Compatibility with C54x devices (C54CM = 1)}

C55x DSP status registers bit mapping (Figure 5-3, page 5-526) does not correspond to C54x DSP status register bits.

Status Bits Affected by none
Affects Selected status bits
Repeat This instruction cannot be repeated.
See Also See the following other related instructions:
- Clear Status Register Bit
- Set Accumulator, Auxiliary, or Temporary Register Bit
- Set Memory Bit

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline bit(ST0, ST0_CARRY) = \#1; ST0_CARRY = bit 11 & \begin{tabular}{l} 
The ST0 bit position defined by the label (ST0_CARRY, \\
bit 11) is set to 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lll} 
Before & & After \\
ST0 & 0000 & ST0
\end{tabular}

Figure 5-3. Status Registers Bit Mapping

\section*{ST0_55}
\begin{tabular}{|c|c|c|c|c|c|c|}
15 & 14 & 13 & 12 & 11 & 10 & 9 \\
\hline ACOV2 \(^{\dagger}\) & ACOV3 \(^{\dagger}\) & TC1 \(^{\dagger}\) & TC2 & CARRY & ACOV0 & ACOV1 \(^{2}\) \\
\hline R/W-0 & R/W-0 & R/W-1 & R/W-1 & R/W-1 & R/W-0 & R/W-0 \\
\hline
\end{tabular}


ST1_55
\begin{tabular}{|c|c|c|c|c|c|c|c|}
15 & 14 & 13 & 12 & 11 & 10 & 9 & 8 \\
\hline BRAF & CPL & XF & HM & INTM & M40 \(^{\dagger}\) & SATD & SXMD \\
\hline R/W-0 & R/W-0 & R/W-1 & R/W-0 & R/W-1 & R/W-0 & R/W-0 & R/W-1 \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|}
\hline 7 & 6 & 5 & 4-0 \\
\hline C16 & FRCT & C54CM \(^{\dagger}\) & ASM \\
\hline R/W-0 & R/W-0 & R/W-1 & R/W-0 \\
\hline
\end{tabular}

ST2_55
\begin{tabular}{|c|c|c|c|c|c|c|c|}
15 & 14-13 & 12 & 11 & 10 & 9 & 8 \\
\hline ARMS & Reserved & DBGM & EALLOW & RDM & Reserved & CDPLC \\
\hline R/W-0 & & R/W-1 & R/W-0 & R/W-0 & & R/W-0 \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline 7 & 6 & 5 & 4 & 3 & 2 & 1 & 0 \\
\hline AR7LC & AR6LC & AR5LC & AR4LC & AR3LC & AR2LC & AR1LC & AR0LC \\
\hline R/W-0 & R/W-0 & R/W-0 & R/W-0 & R/W-0 & R/W-0 & R/W-0 & R/W-0 \\
\hline
\end{tabular}

\section*{ST3_55}
\begin{tabular}{|c|c|c|c|c|}
15 & 14 & 13 & 12 & 11-8 \\
\hline CAFRZ \(^{\dagger}\) & CAEN \(^{\dagger}\) & CACLR \(^{\dagger}\) & HINT \(^{\ddagger}\) & Reserved (always write 1100b) \\
\hline R/W-0 & R/W-0 & R/W-0 & R/W-1 & \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline 7 & 6 & 5 & 4 & 3 & 2 & 1 & 0 \\
\hline CBERR \({ }^{\dagger}\) & MPNMC§ & SATA \({ }^{\dagger}\) & Reserved & & CLKOFF & SMUL & SST \\
\hline R/W-0 & R/W-pins & R/W-0 & & & R/W-0 & R/W-0 & R/W-0 \\
\hline
\end{tabular}

Legend: \(\mathrm{R}=\) Read; \(\mathrm{W}=\) Write; \(-n=\) Value after reset
\(\dagger\) Highlighted bit: If you write to the protected address of the status register, a write to this bit has no effect, and the bit always appears as a 0 during read operations.
\(\ddagger\) The HINT bit is not used for all C55x host port interfaces (HPIs). Consult the documentation for the specific C55x DSP.
§ The reset value of MPNMC may be dependent on the state of predefined pins at reset. To check this for a particular C55x DSP, see the boot loader section of its data sheet.

\section*{SFTCC}

\section*{Shift Accumulator Content Conditionally}

\section*{Syntax Characteristics}


\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = sftc(AC0, TC1) & \begin{tabular}{l} 
Because AC0(31) XORed with AC0(30) equals 1, the content of AC0 is not shifted \\
left and TC1 is set to 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrrlrrr} 
Before & & After \\
AC0 & FF & 8765 & 0055 & AC0 & FF & 8765 \\
TC1 & & 0 & TC1 & & & 1
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = sftc(AC0, TC2) & \begin{tabular}{l} 
Because AC0(31) XORed with AC0(30) equals 0, the content of AC0 is shifted left \\
by 1 bit and TC2 is cleared to 0.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrrrrrr} 
Before & & \multicolumn{5}{c}{ After } \\
AC0 & 00 & 1234 & 0000 & AC0 & 00 & 2468 \\
TC2 & & & 0 & TC2 & &
\end{tabular}
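The decision illustrated by the two examples can be sketched in Python. The sketch covers only the two cases shown in the examples; the function name is hypothetical, and shifting the full 40-bit value (guard bits included) is an assumption, since the examples only show guard bits of 00.

```python
MASK40 = (1 << 40) - 1


def sftc(acx):
    """Return (new_acx, tcx) for ACx = sftc(ACx, TCx), per the two examples."""
    value = acx & MASK40
    bit31 = (value >> 31) & 1
    bit30 = (value >> 30) & 1
    if bit31 ^ bit30:
        return value, 1                    # not shifted, TCx set to 1
    return (value << 1) & MASK40, 0        # shifted left by 1, TCx cleared


print([(f"{v:010X}", t) for v, t in (sftc(0xFF87650055), sftc(0x0012340000))])
```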

\section*{SFTL}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & ACy = ACx <<< Tx & Yes & 2 & 1 & X \\
{\([2]\)} & ACy = ACx <<< SHIFTW & Yes & 3 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform an unsigned shift by an immediate value, SHIFTW, or the content of a temporary register (Tx) in the D-unit shifter.

Status Bits Affected by C54CM, M40 Affects CARRY

See Also
See the following other related instructions:
- Shift Accumulator Content Conditionally
- Shift Accumulator, Auxiliary, or Temporary Register Content Logically
- Signed Shift of Accumulator Content
- Signed Shift of Accumulator, Auxiliary, or Temporary Register Content

Shift Accumulator Content Logically

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & ACy = ACx <<< Tx & Yes & 2 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|}
\hline Opcode & 0101 110E & DDSS ss00 \\
\hline
\end{tabular}

\section*{Operands}

Description This instruction shifts the accumulator (ACx) content by the content of the temporary register (Tx) and stores the shifted-out bit in the CARRY status bit. If the 16-bit value contained in Tx is out of the -32 to +31 range, the shift is saturated to -32 or +31 and the shift operation is performed with this value. However, no overflow is reported when such saturation occurs.
- The operation is performed on 40 bits in the D-unit shifter.
- The shift operation is performed according to M40.

The CARRY status bit contains the shifted-out bit. When the shift count is zero (Tx = 0), the CARRY status bit is cleared to 0.
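A Python sketch of the logical shift by Tx follows. The function name is hypothetical. Treating M40 = 0 as a 32-bit operation with cleared guard bits follows the example for this instruction; using a 40-bit width for M40 = 1, and taking the last bit shifted out of that width as CARRY, are assumptions. C54CM = 1 is not modeled.

```python
def sftl(acx, tx, m40):
    """Return (result_40bit, carry) for ACy = ACx <<< Tx."""
    shift = tx & 0xFFFF
    if shift & 0x8000:
        shift -= 0x10000                   # Tx is a signed 16-bit value
    shift = max(-32, min(31, shift))       # saturate to -32 .. +31
    width = 40 if m40 else 32              # assumption for the M40 = 1 case
    mask = (1 << width) - 1
    value = acx & mask
    if shift == 0:
        return value, 0                    # CARRY is cleared for a zero shift
    if shift > 0:                          # positive count: shift left
        carry = (value >> (width - shift)) & 1
        return (value << shift) & mask, carry
    count = -shift                         # negative count: shift right
    carry = (value >> (count - 1)) & 1
    return value >> count, carry


# AC1 = AC0 <<< T0 with AC0 = 5F B000 1234, T0 = FFFA (-6), M40 = 0
result, carry = sftl(0x5FB0001234, 0xFFFA, 0)
print(f"{result:010X}", carry)
```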

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured. When C54CM = 1, the 6 LSBs of Tx define the shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
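The C54CM = 1 shift-quantity rule can be written out as a short Python sketch. The function name is hypothetical, and the sketch assumes the 6 LSBs of Tx are read as a signed two's-complement field, which is what gives the stated -32 to +31 range.

```python
def c54cm_shift_quantity(tx):
    """Shift quantity derived from the 6 LSBs of Tx when C54CM = 1."""
    quantity = tx & 0x3F
    if quantity & 0x20:
        quantity -= 64                     # sign-extend the 6-bit field
    if -32 <= quantity <= -17:
        quantity = (quantity % 16) - 16    # modulo 16: maps into -16 .. -1
    return quantity


print([c54cm_shift_quantity(t) for t in (0xFFFA, 0x0020, 0x002F, 0x001F)])
```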
Status Bits Affected by C54CM, M40
Affects CARRY

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = AC0 <<< T0 & The content of AC0 is logically shifted right by the content of T0 and the result is stored in AC1. There is a right shift because the content of T0 is negative (-6). Because M40 = 0, the guard bits (39-32) are cleared. \\
\hline
\end{tabular}
\begin{tabular}{lrlr}
Before & & After & \\
AC0 & 5F B000 1234 & AC0 & 5F B000 1234 \\
AC1 & 00 C680 ACF0 & AC1 & 00 02C0 0048 \\
T0 & FFFA & T0 & FFFA \\
M40 & 0 & M40 & 0
\end{tabular}


\section*{SFTL Shift Accumulator, Auxiliary, or Temporary Register Content Logically}

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & dst = dst <<< \#1 & Yes & 2 & 1 & X \\
{\([2]\)} & dst = dst >>> \#1 & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform an unsigned shift by 1 bit:
- In the D-unit shifter, if the destination operand is an accumulator (ACx).
- In the A-unit ALU, if the destination operand is an auxiliary or temporary register (TAx).

Status Bits Affected by C54CM, M40
Affects CARRY
See Also See the following other related instructions:
- Shift Accumulator Content Conditionally
- Shift Accumulator Content Logically
- Signed Shift of Accumulator Content
- Signed Shift of Accumulator, Auxiliary, or Temporary Register Content


SFTS
Signed Shift of Accumulator Content

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & ACy = ACx << Tx & Yes & 2 & 1 & X \\
{\([2]\)} & ACy = ACx <<C Tx & Yes & 2 & 1 & X \\
{\([3]\)} & ACy = ACx << \#SHIFTW & Yes & 3 & 1 & X \\
{\([4]\)} & ACy = ACx <<C \#SHIFTW & Yes & 3 & 1 & X \\
\hline
\end{tabular}

Description These instructions perform a signed shift by an immediate value, SHIFTW, or by the content of a temporary register (Tx) in the D-unit shifter.

Status Bits Affected by C54CM, M40, SATA, SATD, SXMD
Affects ACOVx, ACOVy, CARRY
See Also See the following other related instructions:
- Shift Accumulator Content Conditionally
- Shift Accumulator Content Logically
- Shift Accumulator, Auxiliary, or Temporary Register Content Logically
- Signed Shift of Accumulator, Auxiliary, or Temporary Register Content

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & ACy = ACx << Tx & Yes & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}

0101 110E \(\mid\) DDSS ss01

\section*{Operands}

Description This instruction shifts the accumulator (ACx) content by the content of the temporary register (Tx). If the 16-bit value contained in Tx is outside the -32 to +31 range, the shift quantity is saturated to -32 or +31 and the shift operation is performed with this value; a destination accumulator overflow is reported when such saturation occurs.
- The operation is performed on 40 bits in the D-unit shifter.
- When M40 = 0, the input to the shifter is modified according to SXMD and then the modified input is shifted by the Tx content:
  - if SXMD = 0, 0 is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
  - if SXMD = 1, bit 31 of the source operand is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
- The sign position of the source operand is compared to the shift quantity. This comparison depends on M40:
  - if M40 = 0, comparison is performed versus bit 31
  - if M40 = 1, comparison is performed versus bit 39
- 0 is inserted at bit position 0.
- The shifted-out bit is extracted according to M40.
- After shifting, unless otherwise noted, when M40 = 0:
  - overflow is detected at bit position 31 (if an overflow is detected, the destination ACOVy bit is set)
  - if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)
- After shifting, unless otherwise noted, when M40 = 1:
  - overflow is detected at bit position 39 (if an overflow is detected, the destination ACOVy bit is set)
  - if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 7F FFFF FFFFh (positive overflow) or 80 0000 0000h (negative overflow)
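The shift-count saturation and the shifter-input rule for M40 = 0 described above can be sketched in Python. This is only an illustrative model, not TI code: the function names `saturate_shift_count` and `shifter_input` are hypothetical, and the sketch assumes that Tx is read as a 16-bit two's-complement value.

```python
MASK40 = (1 << 40) - 1


def saturate_shift_count(tx):
    """Read a 16-bit Tx value as signed and clamp it to -32..+31.

    Returns (count, saturated). saturated is True in the case where the
    text says a destination accumulator overflow is reported.
    """
    tx &= 0xFFFF
    signed = tx - 0x10000 if tx & 0x8000 else tx
    count = max(-32, min(31, signed))
    return count, count != signed


def shifter_input(acx, m40, sxmd):
    """Build the 40-bit pattern that enters the D-unit shifter."""
    acx &= MASK40
    if m40 == 1:
        return acx                      # all 40 bits are used unchanged
    low32 = acx & 0xFFFFFFFF
    if sxmd == 1 and (low32 & 0x80000000):
        return 0xFF00000000 | low32     # guard bits copy bit 31
    return low32                        # guard bits are forced to 0


if __name__ == "__main__":
    print(saturate_shift_count(0x0040))
    print(hex(shifter_input(0x80AA001234, m40=0, sxmd=1)))
```

The two helpers are kept separate because the count saturation applies only to the Tx forms, while the guard-bit substitution applies to every form of the instruction.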

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1:
- These instructions are executed as if the M40 status bit was locally set to 1.
- There is no overflow detection, overflow report, or saturation performed by the D-unit shifter.
- The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40, SATD, SXMD \\
& Affects & ACOVy
\end{tabular}
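The C54CM = 1 rule for deriving the shift quantity from Tx can be illustrated with a short Python sketch. The function name `c54cm_shift_count` is hypothetical, and the sketch assumes that the "modulo 16 operation" amounts to adding 16 to values in the -32 to -17 range, which is the simplest mapping onto -16 to -1.

```python
def c54cm_shift_count(tx):
    """Shift quantity used when C54CM = 1 (illustrative model).

    Only the 6 LSBs of Tx are considered; they are read as a signed
    6-bit value in the range -32..+31.
    """
    six = tx & 0x3F
    count = six - 64 if six & 0x20 else six
    if -32 <= count <= -17:
        count += 16          # assumed meaning of the modulo 16 fold
    return count


if __name__ == "__main__":
    for tx in (0x0005, 0x003F, 0x0020, 0x002F):
        print(hex(tx), c54cm_shift_count(tx))
```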

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 \(=\mathrm{AC} 1 \ll\) T0 & The content of AC1 is shifted by the content of T0 and the result is stored in AC0. \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2]\) & \(A C y=A C x \ll C ~ T x\) & Yes & 2 & 1 & \(X\) \\
\hline
\end{tabular}

\section*{Opcode}

0101 110E \(\mid\) DDSS ss10

\section*{Operands}

ACx, ACy, Tx

Description This instruction shifts the accumulator (ACx) content by the content of the temporary register (Tx) and stores the shifted-out bit in the CARRY status bit. If the 16-bit value contained in Tx is outside the -32 to +31 range, the shift quantity is saturated to -32 or +31 and the shift operation is performed with this value; a destination accumulator overflow is reported when such saturation occurs.
- The operation is performed on 40 bits in the D-unit shifter.
- When M40 = 0, the input to the shifter is modified according to SXMD and then the modified input is shifted by the Tx content:
  - if SXMD = 0, 0 is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
  - if SXMD = 1, bit 31 of the source operand is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
- The sign position of the source operand is compared to the shift quantity. This comparison depends on M40:
  - if M40 = 0, comparison is performed versus bit 31
  - if M40 = 1, comparison is performed versus bit 39
- 0 is inserted at bit position 0.
- The shifted-out bit is extracted according to M40 and stored in the CARRY status bit. When the shift count is zero (Tx = 0), the CARRY status bit is cleared to 0.
- After shifting, unless otherwise noted, when M40 = 0:
  - overflow is detected at bit position 31 (if an overflow is detected, the destination ACOVy bit is set)
  - if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)
- After shifting, unless otherwise noted, when M40 = 1:
  - overflow is detected at bit position 39 (if an overflow is detected, the destination ACOVy bit is set)
  - if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 7F FFFF FFFFh (positive overflow) or 80 0000 0000h (negative overflow)

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1:
- These instructions are executed as if the M40 status bit was locally set to 1.
- There is no overflow detection, overflow report, or saturation performed by the D-unit shifter.
- The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40, SATD, SXMD \\
& Affects & ACOVy, CARRY
\end{tabular}

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \(\mathrm{AC2}=\mathrm{AC2} \ll\mathrm{C}\ \mathrm{T1}\) & The content of AC2 is shifted left by the content of T1 and the saturated result is stored in AC2. The shifted-out bit is stored in the CARRY status bit. Since SATD = 1 and M40 = 0, AC2 = FF 8000 0000h (saturation). \\
\hline
\end{tabular}
\begin{tabular}{lrlr} 
Before & & After & \\
AC2 & 80 AA00 1234 & AC2 & FF 8000 0000 \\
T1 & 0005 & T1 & 0005 \\
CARRY & 0 & CARRY & 1 \\
M40 & 0 & M40 & 0 \\
ACOV2 & 0 & ACOV2 & 1 \\
SXMD & 1 & SXMD & 1 \\
SATD & 1 & SATD & 1
\end{tabular}
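The example above can be reproduced with a small Python model of a left shift with carry. This is a sketch under stated assumptions rather than a description of the hardware: the name `shift_left_with_carry` is hypothetical, overflow is assumed to mean that the exact shifted value does not fit the signed 32-bit (M40 = 0) or 40-bit (M40 = 1) range, and the carry is assumed to be the last bit shifted out past bit 31 or bit 39.

```python
MASK40 = (1 << 40) - 1


def signed40(value):
    """Read a 40-bit pattern as a two's-complement integer."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def shift_left_with_carry(acx, count, m40, sxmd, satd):
    """Model ACy = ACx <<C count for 0 <= count <= 31.

    Returns (acy, acov, carry); acy is a 40-bit unsigned pattern.
    """
    acx &= MASK40
    width = 40 if m40 else 32
    if not m40:
        low32 = acx & 0xFFFFFFFF
        if sxmd and low32 & 0x80000000:
            acx = 0xFF00000000 | low32      # guard bits copy bit 31
        else:
            acx = low32                     # guard bits forced to 0
    # Assumed carry rule: last bit shifted out past bit (width - 1).
    carry = (acx >> (width - count)) & 1 if count else 0
    exact = signed40(acx) << count          # unbounded Python integer
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    acov = not (lo <= exact <= hi)
    if acov and satd:
        exact = hi if exact > hi else lo
    return exact & MASK40, int(acov), carry


if __name__ == "__main__":
    acy, acov, carry = shift_left_with_carry(0x80AA001234, 5, 0, 1, 1)
    print(f"{acy:010X} ACOV={acov} CARRY={carry}")
```

With AC2 = 80 AA00 1234h, T1 = 5, M40 = 0, SXMD = 1, and SATD = 1, the model returns FF 8000 0000h with the overflow and carry flags set, which matches the values listed in the example.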


\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([3]\) & \(A C y=A C x \ll\) \#SHIFTW & Yes & 3 & 1 & \(X\) \\
\hline
\end{tabular}

\section*{Opcode}

0001 000E \(\mid\) DDSS 0101 \(\mid\) xxSH IFTW

\section*{Operands}

ACx, ACy, SHIFTW

Description This instruction shifts the accumulator (ACx) content by a 6-bit value, SHIFTW.
- The operation is performed on 40 bits in the D-unit shifter.
- When M40 = 0, the input to the shifter is modified according to SXMD and then the modified input is shifted by the 6-bit value, SHIFTW:
  - if SXMD = 0, 0 is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
  - if SXMD = 1, bit 31 of the source operand is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
- The sign position of the source operand is compared to the shift quantity. This comparison depends on M40:
  - if M40 = 0, comparison is performed versus bit 31
  - if M40 = 1, comparison is performed versus bit 39
- 0 is inserted at bit position 0.
- The shifted-out bit is extracted according to M40.
- After shifting, unless otherwise noted, when M40 = 0:
  - overflow is detected at bit position 31 (if an overflow is detected, the destination ACOVy bit is set)
  - if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)
- After shifting, unless otherwise noted, when M40 = 1:
  - overflow is detected at bit position 39 (if an overflow is detected, the destination ACOVy bit is set)
  - if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 7F FFFF FFFFh (positive overflow) or 80 0000 0000h (negative overflow)

\section*{Compatibility with C54x devices (C54CM = 1)}

When \(\mathrm{C} 54 \mathrm{CM}=1\), these instructions are executed as if M40 status bit was locally set to 1 . There is no overflow detection, overflow report, and saturation performed by the D-unit shifter.
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40, SATD, SXMD \\
& Affects & ACOVy
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline\(A C 0=A C 1 \ll \#-32\) & The content of AC1 is shifted right by 32 bits and the result is stored in AC0. \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{clccccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([4]\) & \(A C y=A C x \ll C\) \#SHIFTW & Yes & 3 & 1 & \(X\) \\
\hline
\end{tabular}
\section*{Opcode}

0001 000E \(\mid\) DDSS 0110 \(\mid\) xxSH IFTW

\section*{Operands}

ACx, ACy, SHIFTW

Description This instruction shifts the accumulator (ACx) content by a 6-bit value, SHIFTW, and stores the shifted-out bit in the CARRY status bit.
- The operation is performed on 40 bits in the D-unit shifter.
- When M40 = 0, the input to the shifter is modified according to SXMD and then the modified input is shifted by the 6-bit value, SHIFTW:
  - if SXMD = 0, 0 is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
  - if SXMD = 1, bit 31 of the source operand is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
- The sign position of the source operand is compared to the shift quantity. This comparison depends on M40:
  - if M40 = 0, comparison is performed versus bit 31
  - if M40 = 1, comparison is performed versus bit 39
- 0 is inserted at bit position 0.
- The shifted-out bit is extracted according to M40 and stored in the CARRY status bit. When the shift count is zero (SHIFTW = 0), the CARRY status bit is cleared to 0.
- After shifting, unless otherwise noted, when M40 = 0:
  - overflow is detected at bit position 31 (if an overflow is detected, the destination ACOVy bit is set)
  - if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)


\section*{SFTS \\ Signed Shift of Accumulator, Auxiliary, or Temporary Register Content}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & \(d s t=d s t ~ \gg \# 1\) & Yes & 2 & 1 & \(X\) \\
{\([2]\)} & \(d s t=d s t ~ \ll \# 1\) & Yes & 2 & 1 & \(X\) \\
\hline
\end{tabular}
Description These instructions perform a shift of 1 bit:
- In the D-unit shifter, if the destination operand is an accumulator (ACx).
- In the A-unit ALU, if the destination operand is an auxiliary or temporary register (TAx).

\begin{tabular}{lll}
Status Bits & Affected by & C54CM, M40, SATA, SATD, SXMD \\
& Affects & ACOVx, ACOVy, CARRY
\end{tabular}

\section*{Syntax Characteristics}


If the destination operand (dst) is an auxiliary or temporary register:
- The operation is performed on 16 bits in the A-unit ALU.
- Bit 15 is sign extended.

\section*{Compatibility with C54x devices (C54CM = 1)}

When \(\mathrm{C} 54 \mathrm{CM}=1\), these instructions are executed as if M40 status bit was locally set to 1 . There is no overflow detection, overflow report, and saturation performed by the D-unit shifter.
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40, SXMD \\
& Affects & none
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 \(=A C 0\) >> \#1 & The content of AC0 is shifted right by 1 bit and the result is stored in AC0. \\
\hline
\end{tabular}


\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2]\) & \(d s t=d s t \ll \# 1\) & Yes & 2 & 1 & \(X\) \\
\hline
\end{tabular}

\section*{Opcode}

0100 010E \(\mid\) 01x1 FDDD

\section*{Operands}

dst

Description This instruction shifts the content of the destination register (dst) left by 1 bit.

If the destination operand (dst) is an accumulator:
- The operation is performed on 40 bits in the D-unit shifter.
- When M40 = 0, the input to the shifter is modified according to SXMD and then the modified input is shifted left by 1 bit:
  - if SXMD = 0, 0 is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
  - if SXMD = 1, bit 31 of the source operand is substituted for the guard bits (39-32) as the input, instead of ACx(39-32), to the shifter
- The sign position of the source operand is compared to the shift quantity. This comparison depends on M40:
  - if M40 = 0, comparison is performed versus bit 31
  - if M40 = 1, comparison is performed versus bit 39
- 0 is inserted at bit position 0.
- The shifted-out bit is extracted according to M40.
- After shifting, unless otherwise noted, when M40 = 0:
  - overflow is detected at bit position 31 (if an overflow is detected, the destination ACOVx bit is set)
  - if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow)
- After shifting, unless otherwise noted, when M40 = 1:
  - overflow is detected at bit position 39 (if an overflow is detected, the destination ACOVx bit is set)
  - if SATD = 1, when an overflow is detected, the destination accumulator saturation values are 7F FFFF FFFFh (positive overflow) or 80 0000 0000h (negative overflow)

If the destination operand (dst) is an auxiliary or temporary register:
- The operation is performed on 16 bits in the A-unit ALU.
- 0 is inserted at bit position 0.
- After shifting, unless otherwise noted:
  - overflow is detected at bit position 15 (if an overflow is detected, the destination ACOVx bit is set)
  - if SATA = 1, when an overflow is detected, the destination register saturation values are 7FFFh (positive overflow) or 8000h (negative overflow)

\section*{Compatibility with C54x devices (C54CM = 1)}

When C54CM = 1, these instructions are executed as if the M40 status bit was locally set to 1. There is no overflow detection, overflow report, or saturation performed by the D-unit shifter.
\begin{tabular}{lll}
Status Bits & Affected by & C54CM, M40, SATA, SATD, SXMD \\
& Affects & ACOVx
\end{tabular}

Repeat This instruction can be repeated.

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \(\mathrm{T} 2=\mathrm{T} 2 \ll \# 1\) & The content of T2 is shifted left by 1 bit and the result is stored in T2. \\
\hline
\end{tabular}
\begin{tabular}{lrlr} 
Before & & After \\
T2 & EF27 & T2 & DE4E \\
SATA & 1 & SATA & 1
\end{tabular}
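The 16-bit case of this instruction can be checked with a short Python sketch. The function name `shl1_16` is hypothetical, and the sketch assumes that overflow at bit position 15 means the doubled value no longer fits a signed 16-bit register.

```python
def shl1_16(value, sata):
    """Model dst = dst << #1 for a 16-bit auxiliary or temporary register.

    Returns (result, overflow); result is a 16-bit unsigned pattern.
    """
    value &= 0xFFFF
    signed = value - 0x10000 if value & 0x8000 else value
    exact = signed * 2                      # 0 enters at bit position 0
    overflow = not (-0x8000 <= exact <= 0x7FFF)
    if overflow and sata:
        exact = 0x7FFF if exact > 0 else -0x8000
    return exact & 0xFFFF, overflow


if __name__ == "__main__":
    result, overflow = shl1_16(0xEF27, sata=1)
    print(f"{result:04X}", overflow)
```

For T2 = EF27h the sign bit is unchanged by the shift, so no saturation occurs and the result is DE4Eh, as in the example.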

\section*{INTR}

\section*{Software Interrupt}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & intr(k5) & No & 2 & 3 & D \\
\hline
\end{tabular}

\section*{Opcode}

1001 0101 \(\mid\) 0xxk kkkk

\section*{Operands}

k5

Description This instruction passes control to a specified interrupt service routine (ISR) and globally disables interrupts (the INTM bit is set to 1 after the ST1_55 content is pushed onto the data stack). The ISR address is stored at the interrupt vector address defined by the content of an interrupt vector pointer (IVPD or IVPH) combined with the 5-bit constant, k5. This instruction is executed regardless of the value of the INTM bit.

\section*{Note:}

DBSTAT (the debug status register) holds debug context information used during emulation. Make sure the ISR does not modify the value that will be returned to DBSTAT.

Before beginning an ISR, the CPU automatically saves the value of some CPU registers and two internal registers: the program counter (PC) and a loop context register. The CPU can use these values to re-establish the context of the interrupted program sequence when the ISR is done.
In the slow-return process (default), the return address (from the PC), the loop context bits, and some CPU registers are stored to the stacks (in memory). When the CPU returns from an ISR, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are saved to registers, so that these values can always be restored quickly. These special registers are the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions. Some CPU registers are saved to the stacks (in memory). For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

When control is passed to the ISR:
- The data stack pointer (SP) is decremented by 1 word in the address phase of the pipeline. The status register 2 (ST2_55) content is pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the address phase of the pipeline. The 7 higher bits of status register 0 (ST0_55) concatenated with 9 zeroes are pushed to the top of SSP.
- The SP is decremented by 1 word in the access phase of the pipeline. The status register 1 (ST1_55) content is pushed to the top of SP.
- The SSP is decremented by 1 word in the access phase of the pipeline. The debug status register (DBSTAT) content is pushed to the top of SSP.
- The SP is decremented by 1 word in the read phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The SSP is decremented by 1 word in the read phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the ISR program address. The active control flow execution context flags are cleared.
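The push sequence listed above can be traced with a small Python model of the two stacks. Everything here is an illustrative assumption rather than a statement about the hardware: the function name `push_interrupt_context`, the use of a dictionary as word-addressed memory, a 24-bit return address, and 8 loop context bits placed in the upper byte of the last SSP word.

```python
def push_interrupt_context(mem, sp, ssp, st0, st1, st2, dbstat,
                           return_pc, loop_ctx):
    """Push the context words in the order given in the list above.

    mem is a dict mapping word addresses to 16-bit values.
    Returns the updated (sp, ssp).
    """
    def push(pointer, word):
        pointer = (pointer - 1) & 0xFFFF    # decrement by 1 word
        mem[pointer] = word & 0xFFFF
        return pointer

    sp = push(sp, st2)                      # address phase
    ssp = push(ssp, st0 & 0xFE00)           # 7 high bits, then 9 zeroes
    sp = push(sp, st1)                      # access phase
    ssp = push(ssp, dbstat)
    sp = push(sp, return_pc & 0xFFFF)       # read phase: 16 LSBs of PC
    ssp = push(ssp, ((loop_ctx & 0xFF) << 8) | ((return_pc >> 16) & 0xFF))
    return sp, ssp


if __name__ == "__main__":
    memory = {}
    print(push_interrupt_context(memory, 0x1000, 0x2000, 0xFFFF, 0x1234,
                                 0x5678, 0x00AB, 0x123456, 0x9C))
    print({hex(a): hex(w) for a, w in sorted(memory.items())})
```

Each stack grows by three words, so a matching return sequence has to pop three words from SP and three from SSP.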

When the software interrupt is acknowledged, the corresponding bits in IFR0 and IFR1 are cleared.

\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & INTM, IFR0, IFR1
\end{tabular}

Repeat This instruction cannot be repeated.
See Also See the following other related instructions:
- Return from Interrupt
- Software Trap

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline intr(\#3) & \begin{tabular}{l} 
Program control is passed to the specified interrupt service routine. The interrupt vector address is \\
defined by the content of an interrupt vector pointer (IVPD) combined with the unsigned 5-bit value (3).
\end{tabular} \\
\hline
\end{tabular}
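How the vector pointer and k5 might be combined can be sketched as follows. The layout used here is an assumption that this section does not state: the 16-bit pointer is taken as bits 23-8 of the vector address, k5 as bits 7-3, and IVPH is assumed to serve vectors 16 to 23 while IVPD serves the others. Check the TMS320C55x DSP CPU Reference Guide (SPRU371) before relying on it. The function name `vector_address` is hypothetical.

```python
def vector_address(k5, ivpd, ivph):
    """Return the assumed 24-bit interrupt vector address for intr(k5)."""
    if not 0 <= k5 <= 31:
        raise ValueError("k5 must be a 5-bit unsigned value")
    # Assumption: vectors 16..23 use IVPH, all others use IVPD.
    pointer = ivph if 16 <= k5 <= 23 else ivpd
    # Assumption: each vector occupies 8 bytes, hence the shift by 3.
    return ((pointer & 0xFFFF) << 8) | (k5 << 3)


if __name__ == "__main__":
    print(hex(vector_address(3, ivpd=0xFFFF, ivph=0xFFFF)))
```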

RESET
Software Reset

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & reset & No & 2 & ? & D \\
\hline
\end{tabular}

\section*{Opcode}

1001 0100 \(\mid\) xxxx xxxx

\section*{Operands}

none

Description This instruction performs a nonmaskable software reset that can be used any time to put the device in a known state.

The reset instruction affects ST0_55, ST1_55, ST2_55, IFR0, IFR1, and T2 (Table 5-5 and Figure 5-4); status register ST3_55 and the interrupt vector pointer registers (IVPD and IVPH) are not affected. When the reset instruction is acknowledged, the INTM bit is set to 1 to disable maskable interrupts. All pending interrupts in IFR0 and IFR1 are cleared. The initialization of the system control register, the interrupt vector pointers, and the peripheral registers is different from the initialization performed by a hardware reset.

\begin{tabular}{lll}
Status Bits & Affected by & none \\
& Affects & IFR0, IFR1, ST0_55, ST1_55, ST2_55
\end{tabular}

Repeat This instruction cannot be repeated.

Table 5-5. Effects of a Software Reset on DSP Registers
\begin{tabular}{|c|c|c|c|}
\hline Register & Bit & \begin{tabular}{l}
Reset \\
Value
\end{tabular} & Comment \\
\hline T2 & All & 0 & All bits are cleared. To ensure TMS320C54x DSP compatibility, instructions affected by ASM bit will use a shift count of 0 (no shift). \\
\hline IFR0 & All & 0 & All pending interrupt flags are cleared. \\
\hline IFR1 & All & 0 & All pending interrupt flags are cleared. \\
\hline \multirow[t]{8}{*}{ST0_55} & ACOV2 & 0 & AC2 overflow flag is cleared. \\
\hline & ACOV3 & 0 & AC3 overflow flag is cleared. \\
\hline & TC1 & 1 & Test control flag 1 is set. \\
\hline & TC2 & 1 & Test control flag 2 is set. \\
\hline & CARRY & 1 & CARRY bit is set. \\
\hline & ACOV0 & 0 & AC0 overflow flag is cleared. \\
\hline & ACOV1 & 0 & AC1 overflow flag is cleared. \\
\hline & DP & 0 & All bits are cleared, data page 0 is selected. \\
\hline \multirow[t]{12}{*}{ST1_55} & BRAF & 0 & This flag is cleared. \\
\hline & CPL & 0 & The DP (rather than SP) direct addressing mode is selected. Direct accesses to data space are made relative to the data page register (DP). \\
\hline & XF & 1 & External flag is set. \\
\hline & HM & 0 & When an active HOLD signal forces the DSP to place its external interface in the high-impedance state, the DSP continues executing code from internal memory. \\
\hline & INTM & 1 & Maskable interrupts are globally disabled. \\
\hline & M40 & 0 & 32-bit (rather than 40-bit) computation mode is selected for the D unit. \\
\hline & SATD & 0 & CPU will not saturate overflow results in the D unit. \\
\hline & SXMD & 1 & Sign-extension mode is on. \\
\hline & C16 & 0 & Dual 16-bit mode is off. For an instruction that is affected by C16, the D-unit ALU performs one 32-bit operation rather than two parallel 16-bit operations. \\
\hline & FRCT & 0 & Results of multiply operations are not shifted. \\
\hline & C54CM & 1 & TMS320C54x-compatibility mode is on. \\
\hline & ASM & 0 & Instructions affected by ASM will use a shift count of 0 (no shift). \\
\hline
\end{tabular}

Table 5-5. Effects of a Software Reset on DSP Registers (Continued)
\begin{tabular}{|c|c|c|c|}
\hline Register & Bit & \begin{tabular}{l}
Reset \\
Value
\end{tabular} & Comment \\
\hline \multirow[t]{13}{*}{ST2_55} & ARMS & 0 & When you use the AR indirect addressing mode, the DSP mode (rather than control mode) operands are available. \\
\hline & DBGM & 1 & Debug events are disabled. \\
\hline & EALLOW & 0 & A program cannot write to the non-CPU emulation registers. \\
\hline & RDM & 0 & When an instruction specifies that an operand should be rounded, the CPU uses rounding to the infinite (rather than rounding to the nearest). \\
\hline & CDPLC & 0 & CDP is used for linear addressing (rather than circular addressing). \\
\hline & AR7LC & 0 & AR7 is used for linear addressing. \\
\hline & AR6LC & 0 & AR6 is used for linear addressing. \\
\hline & AR5LC & 0 & AR5 is used for linear addressing. \\
\hline & AR4LC & 0 & AR4 is used for linear addressing. \\
\hline & AR3LC & 0 & AR3 is used for linear addressing. \\
\hline & AR2LC & 0 & AR2 is used for linear addressing. \\
\hline & AR1LC & 0 & AR1 is used for linear addressing. \\
\hline & AR0LC & 0 & AR0 is used for linear addressing. \\
\hline
\end{tabular}


Figure 5-4. Effects of a Software Reset on Status Registers

\section*{ST0_55}


ST1_55


ST2_55
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline 15 & 14-13 & 12 & 11 & 10 & 9 & 8 \\
\hline ARMS & Reserved & DBGM & EALLOW & RDM & Reserved & CDPLC \\
\hline 0 & & 1 & 0 & 0 & & 0 \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline 7 & 6 & 5 & 4 & 3 & 2 & 1 & 0 \\
\hline AR7LC & AR6LC & AR5LC & AR4LC & AR3LC & AR2LC & AR1LC & AR0LC \\
\hline 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\hline
\end{tabular}
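The ST2_55 reset values from Table 5-5 can be packed into a 16-bit word with a short Python sketch. The bit positions in `ST2_55_BITS` are an assumption (ARMS at bit 15, DBGM at 12, EALLOW at 11, RDM at 10, CDPLC at 8, ARnLC at bit n), reserved bits are left at 0 because their reset state is not given here, and the names `ST2_55_BITS`, `ST2_55_RESET`, and `pack_fields` are hypothetical.

```python
# Assumed bit positions of the named ST2_55 fields.
ST2_55_BITS = {"ARMS": 15, "DBGM": 12, "EALLOW": 11, "RDM": 10, "CDPLC": 8}
ST2_55_BITS.update({f"AR{n}LC": n for n in range(8)})

# Reset values from Table 5-5: every named field is 0 except DBGM.
ST2_55_RESET = {name: 0 for name in ST2_55_BITS}
ST2_55_RESET["DBGM"] = 1


def pack_fields(fields, layout):
    """Combine single-bit fields into one 16-bit word."""
    word = 0
    for name, value in fields.items():
        word |= (value & 1) << layout[name]
    return word


if __name__ == "__main__":
    print(f"{pack_fields(ST2_55_RESET, ST2_55_BITS):04X}")
```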

\section*{TRAP}

Software Trap

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & trap(k5) & No & 2 & ? & D \\
\hline
\end{tabular}

\section*{Opcode}

1001 0101 \(\mid\) 1xxk kkkk

\section*{Operands}

k5

Description This instruction passes control to a specified interrupt service routine (ISR). It does not affect the INTM bit in ST1_55 or the DBGM bit in ST2_55. The ISR address is stored at the interrupt vector address defined by the content of an interrupt vector pointer (IVPD or IVPH) combined with the 5-bit constant, k5. This instruction is executed regardless of the value of the INTM bit. This instruction is not maskable.
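The trap(k5) opcode above and the intr(k5) opcode given earlier differ only in bit 7 of the second byte, which a short Python sketch makes visible. The function names `encode_intr` and `encode_trap` are hypothetical, and the don't-care bits (x) are assumed to be encoded as 0.

```python
def _check_k5(k5):
    if not 0 <= k5 <= 31:
        raise ValueError("k5 must be a 5-bit unsigned value")
    return k5


def encode_intr(k5):
    """Encode intr(k5) as 1001 0101 | 0xxk kkkk with x = 0."""
    return bytes([0x95, _check_k5(k5)])


def encode_trap(k5):
    """Encode trap(k5) as 1001 0101 | 1xxk kkkk with x = 0."""
    return bytes([0x95, 0x80 | _check_k5(k5)])


if __name__ == "__main__":
    print(encode_intr(3).hex(), encode_trap(3).hex())
```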

\section*{Note:}

DBSTAT (the debug status register) holds debug context information used during emulation. Make sure the ISR does not modify the value that will be returned to DBSTAT.

Before beginning an ISR, the CPU automatically saves the value of some CPU registers and two internal registers: the program counter (PC) and a loop context register. The CPU can use these values to re-establish the context of the interrupted program sequence when the ISR is done.

In the slow-return process (default), the return address (from the PC), the loop context bits, and some CPU registers are stored to the stacks (in memory). When the CPU returns from an ISR, the speed at which these values are restored is dependent on the speed of the memory accesses.

In the fast-return process, the return address (from the PC) and the loop context bits are saved to registers, so that these values can always be restored quickly. These special registers are the return address register (RETA) and the control-flow context register (CFCT). You can read from or write to RETA and CFCT as a pair with dedicated, 32-bit load and store instructions. Some CPU registers are saved to the stacks (in memory). For fast-return mode operation, see the TMS320C55x DSP CPU Reference Guide (SPRU371).

When control is passed to the ISR:
- The data stack pointer (SP) is decremented by 1 word in the address phase of the pipeline. The status register 2 (ST2_55) content is pushed to the top of SP.
- The system stack pointer (SSP) is decremented by 1 word in the address phase of the pipeline. The 7 higher bits of status register 0 (ST0_55) concatenated with 9 zeroes are pushed to the top of SSP.
- The SP is decremented by 1 word in the access phase of the pipeline. The status register 1 (ST1_55) content is pushed to the top of SP.
- The SSP is decremented by 1 word in the access phase of the pipeline. The debug status register (DBSTAT) content is pushed to the top of SSP.
- The SP is decremented by 1 word in the read phase of the pipeline. The 16 LSBs of the return address, from the program counter (PC), of the called subroutine are pushed to the top of SP.
- The SSP is decremented by 1 word in the read phase of the pipeline. The loop context bits concatenated with the 8 MSBs of the return address are pushed to the top of SSP.
- The PC is loaded with the ISR program address. The active control flow execution context flags are cleared.


\section*{SQR}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & \(\mathrm{ACy}=\operatorname{rnd}(\mathrm{ACx} * \mathrm{ACx})\) & Yes & 2 & 1 & \(X\) \\
{\([2]\)} & \(\mathrm{ACx}=\operatorname{rnd}(\mathrm{Smem} * \mathrm{Smem})[, \mathrm{T3}=\mathrm{Smem}]\) & No & 3 & 1 & \(X\) \\
\hline
\end{tabular}
Description This instruction performs a multiplication in the D-unit MAC. The operands of the multiplier are:
- ACx(32-16)
- the content of a memory (Smem) location, sign extended to 17 bits

Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVx, ACOVy
See Also See the following other related instructions:
- Multiply
- Square and Accumulate
- Square and Subtract
- Square Distance

\section*{Square}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & \(\mathrm{ACy}=\operatorname{rnd}(\mathrm{ACx} * \mathrm{ACx})\) & Yes & 2 & 1 & \(X\) \\
\hline
\end{tabular}

\section*{Opcode}

0101 010E \(\mid\) DDSS 100\%

\section*{Operands}

ACx, ACy

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are ACx(32-16).
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 = 0, compatibility is ensured.
\begin{tabular}{lll}
Status Bits & Affected by & FRCT, M40, RDM, SATD, SMUL \\
& Affects & ACOVy
\end{tabular}

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 \(=\) AC1 * AC1 & The content of AC1 is squared and the result is stored in AC0. \\
\hline
\end{tabular}
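The data path of the accumulator form of Square can be approximated with a short Python sketch. It is a simplified model under stated assumptions: the name `square_acx` is hypothetical, rounding (rnd, RDM) and the SMUL-dependent multiplication overflow check are left out, and overflow is assumed to mean that the exact product exceeds the signed 32-bit (M40 = 0) or 40-bit (M40 = 1) range.

```python
MASK40 = (1 << 40) - 1


def square_acx(acx, frct=0, m40=0, satd=0):
    """Model ACy = ACx * ACx without rounding.

    Returns (acy, acov); acy is a 40-bit unsigned pattern.
    """
    operand = (acx >> 16) & 0x1FFFF         # ACx(32-16), 17 bits
    if operand & 0x10000:
        operand -= 0x20000                  # read as signed 17-bit value
    product = operand * operand
    if frct:
        product <<= 1                       # fractional mode shift
    width = 40 if m40 else 32
    hi = (1 << (width - 1)) - 1
    acov = product > hi                     # a square is never negative
    if acov and satd:
        product = hi
    return product & MASK40, int(acov)


if __name__ == "__main__":
    print(square_acx(0x0000030000))
    print(square_acx(0x0000030000, frct=1))
```

Because ACx(32-16) is a 17-bit operand, a source accumulator with bit 32 set can produce a product of 2^32, which overflows when M40 = 0 but fits when M40 = 1.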


\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2]\) & \(\mathrm{ACx}=\operatorname{rnd}(\mathrm{Smem} * \mathrm{Smem})[, \mathrm{T3}=\mathrm{Smem}]\) & No & 3 & 1 & \(X\) \\
\hline
\end{tabular}

\section*{Opcode}

1101 0011 \(\mid\) AAAA AAAI \(\mid\) U\%DD 10xx

\section*{Operands \\ ACx, Smem}

Description This instruction performs a multiplication in the D-unit MAC. The input operands of the multiplier are the content of a memory (Smem) location, sign extended to 17 bits.
- If FRCT = 1, the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVx) is set.
- When an overflow is detected, the accumulator is saturated according to SATD.

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
\begin{tabular}{lll|} 
Status Bits & Affected by \(\quad\) FRCT, M40, RDM, SATD, SMUL \\
& Affects \(\quad\) ACOVx
\end{tabular}

\section*{SQA}

Square and Accumulate
Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & ACy = rnd(ACy + (ACx * ACx)) & Yes & 2 & 1 & \(X\) \\
{\([2]\)} & ACy = rnd(ACx + (Smem * Smem))[, T3 = Smem] & No & 3 & 1 & \(X\) \\
\hline
\end{tabular}

Description This instruction performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are:
- \(\mathrm{ACx}(32-16)\)
\(\square\) the content of a memory (Smem) location, sign extended to 17 bits
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVx, ACOVy
See Also See the following other related instructions:
- Multiply and Accumulate
- Square
- Square Distance
- Square and Subtract

\section*{Square and Accumulate}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2]\) & ACy = rnd(ACx + (Smem * Smem))[, T3 = Smem] & No & 3 & 1 & \(X\) \\
\hline
\end{tabular}

\section*{Opcode}
\(11010010 \mid\) AAAA AAAI \(\mid\) U\%DD 10SS

\section*{Operands}

\section*{Description}

Status Bits
\begin{tabular}{ll} 
Affected by & FRCT, M40, RDM, SATD, SMUL \\
Affects & ACOVy
\end{tabular}

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 + (*AR3 * *AR3) & \begin{tabular}{l} 
The content addressed by AR3 squared is added to the content of AC1 and \\
the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{SQS}

\section*{Square and Subtract}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & ACy = rnd(ACy - (ACx * ACx)) & Yes & 2 & 1 & \(X\) \\
{\([2]\)} & ACy = rnd(ACx - (Smem * Smem))[, T3 = Smem] & No & 3 & 1 & \(X\) \\
\hline
\end{tabular}
\begin{tabular}{ll} 
Description & This instruction performs a multiplication and a subtraction in the D-unit MAC. \\
& The input operands of the multiplier are ACx(32-16) or the content of a memory (Smem) location, sign extended to 17 bits. \\
Status Bits & Affected by FRCT, M40, RDM, SATD, SMUL \\
& Affects ACOVx, ACOVy \\
See Also & See the following other related instructions: \\
& Multiply and Subtract \\
& Square \\
& Square and Accumulate \\
& Square Distance
\end{tabular}

Square and Subtract

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{r} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & ACy = rnd(ACy - (ACx * ACx)) & Yes & 2 & 1 & \(X\) \\
\hline
\end{tabular}

\section*{Opcode}

0101 010E \(\mid\) DDSS 010\%

\section*{Operands}

Description This instruction performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are ACx(32-16).
\(\square\) If FRCT \(=1\), the output of the multiplier is shifted left by 1 bit.
- Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACy.
- Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.
- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.
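
The list above can be sketched as a small Python model. It is an illustration under assumptions rather than a definitive description of the hardware: the name `square_and_subtract` is hypothetical, M40 = 0 is taken to mean a 32-bit signed overflow check (M40 = 1 a 40-bit check), and rounding (RDM) and SMUL effects are not modeled.

```python
MASK40 = (1 << 40) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def square_and_subtract(acy, acx, frct=0, m40=0, satd=0):
    """Model ACy = ACy - (ACx * ACx); returns (40-bit ACy pattern, ACOVy flag).

    acy and acx are 40-bit accumulator patterns. Assumption: overflow is
    checked on 32 bits when m40 == 0 and on 40 bits when m40 == 1.
    """
    operand = sign_extend(acx >> 16, 17)      # multiplier input is ACx(32-16)
    product = operand * operand
    if frct:
        product <<= 1                         # FRCT = 1: shift left by 1 bit
    result = sign_extend(acy, 40) - product   # subtract from source accumulator
    width = 40 if m40 else 32
    lowest, highest = -(1 << (width - 1)), (1 << (width - 1)) - 1
    overflow = not (lowest <= result <= highest)
    if overflow and satd:
        result = highest if result > highest else lowest
    return result & MASK40, int(overflow)
```

For the documented syntax `AC1 = AC1 - (AC0 * AC0)`, the call would be `square_and_subtract(ac1, ac0)` with both arguments given as 40-bit patterns.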

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVy
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = AC1 - (AC0 * AC0) & \begin{tabular}{l} 
The content of AC0 squared is subtracted from the content of AC1 and the \\
result is stored in AC1.
\end{tabular} \\
\hline
\end{tabular}

\section*{Square and Subtract}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline No. & \multicolumn{2}{|l|}{Syntax} & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & \multicolumn{2}{|l|}{ACy \(=\operatorname{rnd}(\) ACx \(-(\) Smem * Smem) \()\) [, T3 = Smem \(]\)} & No & 3 & 1 & X \\
\hline \multicolumn{2}{|l|}{\multirow[t]{9}{*}{\begin{tabular}{l}
Opcode \\
Operands \\
Description
\end{tabular}}} & & 0010 | AA & A A & I \| U\%D & 11SS \\
\hline & & \multicolumn{5}{|l|}{ACx, ACy, Smem} \\
\hline & & \multicolumn{5}{|l|}{This instruction performs a multiplication and a subtraction in the D-unit MAC. The input operands of the multiplier are the content of a memory (Smem) location, sign extended to 17 bits.} \\
\hline & & \multicolumn{5}{|l|}{\multirow[t]{3}{*}{\begin{tabular}{l}
- If FRCT \(=1\), the output of the multiplier is shifted left by 1 bit. \\
- Multiplication overflow detection depends on SMUL. \\
- The 32-bit result of the multiplication is sign extended to 40 bits and subtracted from the source accumulator ACx.
\end{tabular}}} \\
\hline & & & & & & \\
\hline & & & & & & \\
\hline & & \multicolumn{5}{|l|}{Rounding is performed according to RDM, if the optional rnd keyword is applied to the instruction.} \\
\hline & & \multicolumn{5}{|l|}{- Overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.} \\
\hline & & \multicolumn{5}{|l|}{- When an overflow is detected, the accumulator is saturated according to SATD.} \\
\hline
\end{tabular}

This instruction provides the option to store the 16-bit data memory operand Smem in temporary register T3.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by FRCT, M40, RDM, SATD, SMUL
Affects ACOVy

\section*{Square Distance}

\section*{Syntax Characteristics}


The first operation performs a multiplication and an accumulation in the D-unit MAC. The input operands of the multiplier are ACx(32-16).
- If \(\operatorname{FRCT}=1\), the output of the multiplier is shifted left by 1 bit.
\(\square\) Multiplication overflow detection depends on SMUL.
- The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy.
\(\square\) Addition overflow detection depends on M40. If an overflow is detected, the destination accumulator overflow status bit (ACOVy) is set.
- When an addition overflow is detected, the accumulator is saturated according to SATD.

The second operation subtracts the content of data memory operand Ymem, shifted left 16 bits, from the content of data memory operand Xmem, shifted left 16 bits.
- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
\(\square\) The shift operation is equivalent to the signed shift instruction.
- Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.

When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When \(\mathrm{C} 54 \mathrm{CM}=1\), during the subtraction an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.
\begin{tabular}{ll} 
Status Bits & Affected by C54CM, FRCT, M40, SATD, SMUL, SXMD \\
& Affects ACOVx, ACOVy, CARRY \\
Repeat & This instruction can be repeated. \\
See Also & See the following other related instructions: \\
& \(\square\) Absolute Distance \\
& \(\square\) Square \\
& \(\square\) Square and Accumulate \\
& \(\square\) Square and Subtract
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline sqdst(*AR0, *AR1, AC0, AC1) & \begin{tabular}{l} 
The content of AC0 squared is added to the content of AC1 and the result \\
is stored in AC1. The content addressed by AR1 shifted left by 16 bits is \\
subtracted from the content addressed by AR0 shifted left by 16 bits and \\
the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AC0 & FF ABCD 0000 & AC0 & FF FFAB 0000 \\
AC1 & 00 0000 0000 & AC1 & 00 1BB1 8229 \\
*AR0 & 0055 & *AR0 & 0055 \\
*AR1 & 00AA & *AR1 & 00AA \\
ACOV0 & 0 & ACOV0 & 0 \\
ACOV1 & 0 & ACOV1 & 0 \\
CARRY & 0 & CARRY & 0 \\
FRCT & 0 & FRCT & 0
\end{tabular}
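
The Before and After values of this example can be reproduced with a short Python model of the two parallel operations. This is a minimal sketch, assuming SXMD = 1 and ignoring overflow detection, saturation, and the CARRY bit; the function name `sqdst_model` is hypothetical.

```python
MASK40 = (1 << 40) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def sqdst_model(xmem, ymem, acx, acy, frct=0):
    """Model sqdst(Xmem, Ymem, ACx, ACy); returns (new ACx, new ACy).

    First operation:  ACy = ACy + ACx(32-16) squared.
    Second operation: ACx = (Xmem << 16) - (Ymem << 16).
    Assumptions: SXMD = 1; no overflow, saturation, or CARRY modeling.
    """
    operand = sign_extend(acx >> 16, 17)          # multiplier input is ACx(32-16)
    product = (operand * operand) << (1 if frct else 0)
    new_acy = (sign_extend(acy, 40) + product) & MASK40
    difference = (sign_extend(xmem, 16) << 16) - (sign_extend(ymem, 16) << 16)
    new_acx = difference & MASK40
    return new_acx, new_acy


# Values taken from the example: AC0 = FF ABCD 0000, AC1 = 0, *AR0 = 0055, *AR1 = 00AA.
ac0, ac1 = sqdst_model(0x0055, 0x00AA, 0xFFABCD0000, 0x0000000000)
print(f"AC0 = {ac0:010X}, AC1 = {ac1:010X}")
```

The high part of AC0 (ABCDh with the sign taken from bit 32) is -21555, whose square is 464618025, or 1BB18229h, which matches the After value of AC1; 0055h - 00AAh is -0055h, which shifted left by 16 bits gives FF FFAB 0000 for AC0.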

\section*{MOV}

Store Accumulator Content to Memory

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & Smem = HI(ACx) & No & 2 & 1 & X \\
\hline [2] & Smem = HI(rnd(ACx)) & No & 3 & 1 & X \\
\hline [3] & Smem = LO(ACx \(\ll\) Tx) & No & 3 & 1 & X \\
\hline [4] & Smem = HI(rnd(ACx \(\ll\) Tx)) & No & 3 & 1 & X \\
\hline [5] & Smem = LO(ACx \(\ll\) \#SHIFTW) & No & 3 & 1 & X \\
\hline [6] & Smem = HI(ACx \(\ll\) \#SHIFTW) & No & 3 & 1 & X \\
\hline [7] & Smem = HI(rnd(ACx \(\ll\) \#SHIFTW)) & No & 4 & 1 & X \\
\hline [8] & Smem = HI(saturate(uns(rnd(ACx)))) & No & 3 & 1 & X \\
\hline [9] & Smem = HI(saturate(uns(rnd(ACx \(\ll\) Tx)))) & No & 3 & 1 & X \\
\hline [10] & Smem = HI(saturate(uns(rnd(ACx \(\ll\) \#SHIFTW)))) & No & 4 & 1 & X \\
\hline [11] & dbl(Lmem) = ACx & No & 3 & 1 & X \\
\hline [12] & dbl(Lmem) = saturate(uns(ACx)) & No & 3 & 1 & X \\
\hline [13] & \[
\begin{aligned}
& \mathrm{HI}(\text { Lmem })=\mathrm{HI}(\mathrm{ACx}) \gg \# 1, \\
& \mathrm{LO}(\text { Lmem })=\mathrm{LO}(A C x) \gg \# 1
\end{aligned}
\] & No & 3 & 1 & X \\
\hline [14] & \[
\begin{aligned}
& \text { Xmem }=\text { LO(ACx) }, \\
& \text { Ymem }=\text { HI(ACx) }
\end{aligned}
\] & No & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Description}

This instruction stores the content of the selected accumulator (ACx) to a memory (Smem) location, to a data memory operand (Lmem), or to dual data memory operands (Xmem and Ymem).

Status Bits Affected by C54CM, RDM, SXMD
Affects none
See Also See the following other related instructions:
- Addition with Parallel Store Accumulator Content to Memory
\(\square\) Load Accumulator from Memory with Parallel Store Accumulator Content to Memory
- Load Accumulator, Auxiliary, or Temporary Register from Memory
- Multiply and Accumulate with Parallel Store Accumulator Content to Memory
- Multiply and Subtract with Parallel Store Accumulator Content to Memory
- Multiply with Parallel Store Accumulator Content to Memory
\(\square\) Store Accumulator Pair Content to Memory
- Store Accumulator, Auxiliary, or Temporary Register Content to Memory
- Store Auxiliary or Temporary Register Pair Content to Memory
- Subtraction with Parallel Store Accumulator Content to Memory

\section*{Store Accumulator Content to Memory}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([2]\) & Smem = HI(rnd(ACx)) & No & 3 & 1 & \(X\) \\
\hline
\end{tabular}

\section*{Opcode}
1110 1000 \(\mid\) AAAA AAAI \(\mid\) SSxx x0x\%

\section*{Operands ACx, Smem}

Description This instruction stores the high part of the accumulator, \(\operatorname{ACx}(31-16)\), to the memory (Smem) location. Rounding is performed in the D-unit shifter according to RDM, if the optional rnd keyword is applied to the input operand.
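
The rounding and high-part extraction described above can be sketched in Python. The two RDM behaviors used here are an assumption that is not stated in this section: RDM = 0 is modeled as adding 8000h to the 40-bit value, and RDM = 1 as rounding to the nearest with ties going to an even bit 16. The function names `round_accumulator` and `store_hi_rnd` are hypothetical, and overflow detection is not modeled.

```python
MASK40 = (1 << 40) - 1


def round_accumulator(acc, rdm=0):
    """Round a 40-bit accumulator pattern at bit 16 and clear bits 15-0.

    Assumed behavior (not taken from this section):
      rdm == 0: add 8000h (rounding to the infinite).
      rdm == 1: add 8000h only when bits 15-0 exceed 8000h, or equal
                8000h while bit 16 is 1 (rounding to the nearest).
    """
    value = acc & MASK40
    low = value & 0xFFFF
    if rdm == 0:
        value += 0x8000
    elif low > 0x8000 or (low == 0x8000 and value & 0x10000):
        value += 0x8000
    return value & MASK40 & ~0xFFFF


def store_hi_rnd(acc, rdm=0):
    """Model Smem = HI(rnd(ACx)): return the 16-bit word ACx(31-16) after rounding."""
    return (round_accumulator(acc, rdm) >> 16) & 0xFFFF
```

The two modes differ only when bits 15-0 are exactly 8000h, so a value such as 00 1234 8000h is a useful input for comparing them.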

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with \(\mathrm{C} 54 \mathrm{CM}=1\), overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40-bit result of the shift and round operation:
- If the SST bit = 1 and the SXMD bit \(=0\), then the saturate, rnd, and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem \(=\mathrm{HI}(\) saturate \((\) uns \((\operatorname{rnd}(\) ACx \()))\) )
- If the SST bit = 1 and the SXMD bit = 1 , then only the saturate and rnd keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem \(=\mathrm{HI}(\) saturate \((\mathrm{rnd}(\mathrm{ACx}))\) )
- If the optional uns keyword is applied to the input operand, then bits 39-32 of the result are compared to 0 .
\(\square\) If the optional uns keyword is not applied to the input operand, then bits 39-31 of the result are compared to bit 39 of the input operand and SXMD.
\begin{tabular}{ll} 
Status Bits & Affected by C54CM, RDM, SST, SXMD \\
& Affects none \\
Repeat & This instruction can be repeated. \\
Example & \\
\hline Syntax & Description \\
\hline *AR3 = HI(rnd(AC0)) & The content of AC0(31-16) is rounded and stored at the location addressed by AR3. \\
\hline
\end{tabular}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [3] & Smem = LO(ACx << Tx) & No & 3 & 1 & X \\
\hline \multicolumn{2}{|l|}{Opcode} & \multicolumn{4}{|l|}{11100111 | AAAA AAAI | SSss 00xx} \\
\hline \multicolumn{2}{|l|}{Operands} & \multicolumn{4}{|l|}{ACx, Smem, Tx} \\
\hline \multicolumn{2}{|l|}{Description} & \multicolumn{4}{|l|}{This instruction shifts the accumulator, ACx, by the content of Tx and stores the low part of the accumulator, \(\operatorname{ACx}(15-0)\), to the memory (Smem) location. If the 16-bit value in Tx is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value. The input operand is shifted in the D-unit shifter according to SXMD.} \\
\hline
\end{tabular}

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with C54CM = 1, the 6 LSBs of Tx determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the 16-bit value in Tx is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
\(\square\) If the SST bit = 1 and the SXMD bit \(=0\), then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem = LO(saturate(uns(ACx << Tx)))
\(\square\) If the SST bit = 1 and the SXMD bit \(=1\), then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem = LO(saturate (ACx << Tx))
\begin{tabular}{lll|}
\hline Status Bits & Affected by C54CM, RDM, SST, SXMD \\
& Affects none
\end{tabular}
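
The two ways of deriving the shift quantity from Tx described for this syntax can be sketched as follows. This is an illustration only: the function name `shift_quantity` is hypothetical, and reading the "modulo 16 operation" as adding 16 to values in the range -32 to -17 is an assumption based on the stated result range of -16 to -1.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def shift_quantity(tx, c54cm=0):
    """Return the effective shift count for a 16-bit Tx register value.

    c54cm == 0: the signed 16-bit value is saturated to the range -32..+31.
    c54cm == 1: only the 6 LSBs are used; values from -32 to -17 are mapped
                to -16..-1 (assumed here to mean adding 16).
    """
    if c54cm:
        quantity = sign_extend(tx, 6)
        if -32 <= quantity <= -17:
            quantity += 16
        return quantity
    return max(-32, min(31, sign_extend(tx, 16)))
```

A positive result is a left shift and a negative result a right shift; for example, a Tx value of 0040h is saturated to +31 when C54CM = 0 but yields 0 when C54CM = 1, because its 6 LSBs are all zero.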

\section*{Store Accumulator Content to Memory}

\section*{Syntax Characteristics}


\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with C54CM = 1, the 6 LSBs of Tx determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the 16-bit value in Tx is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
- If the SST bit = 1 and the SXMD bit \(=0\), then the saturate, rnd, and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem \(=\mathrm{HI}(\) saturate \((\) uns \((\operatorname{rnd}(\mathrm{ACx} \ll \mathrm{Tx})))\) )
- If the SST bit = 1 and the SXMD bit = 1 , then only the saturate and rnd keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem \(=\mathrm{HI}(\) saturate \((\operatorname{rnd}(A C x \ll T x)))\)
Status Bits Affected by C54CM, RDM, SST, SXMD
Affects none

Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *AR3 = HI(rnd(AC0 \(\ll\) T0)) & \begin{tabular}{l} 
The content of AC0 is shifted by the content of T0, is rounded, and \\
AC0(31-16) is stored at the location addressed by AR3.
\end{tabular} \\
\hline
\end{tabular}

\section*{Store Accumulator Content to Memory}

\section*{Syntax Characteristics}


When this instruction is executed with \(\mathrm{C} 54 \mathrm{CM}=1\), overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40 -bit result of the shift and round operation:
- If the SST bit = 1 and the SXMD bit \(=0\), then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem = LO(saturate(uns(ACx << \#SHIFTW)))
\(\square\) If the SST bit = 1 and the SXMD bit \(=1\), then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:
Smem = LO(saturate(ACx << \#SHIFTW))
\(\square\) If the optional uns keyword is applied to the input operand, then bits 39-32 of the result are compared to 0 .
- If the optional uns keyword is not applied to the input operand, then bits 39-31 of the result are compared to bit 39 of the input operand and SXMD.
\begin{tabular}{|l|l|}
\hline Status Bits & Affected by C54CM, RDM, SST, SXMD \\
\hline & Affects none \\
\hline Repeat & This instruction can be repeated. \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline *AR3 = LO(AC0 << \#31) & The content of AC0 is shifted left by 31 bits and AC0(15-0) is stored at the location addressed by AR3. \\
\hline
\end{tabular}

\section*{Store Accumulator Content to Memory}

Syntax Characteristics


When this instruction is executed with \(\mathrm{C} 54 \mathrm{CM}=1\), overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40-bit result of the shift and round operation:
- If the SST bit = 1 and the SXMD bit \(=0\), then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem = HI(saturate(uns(ACx << \#SHIFTW)))
\(\square\) If the SST bit \(=1\) and the \(\operatorname{SXMD}\) bit \(=1\), then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem = HI(saturate(ACx <<\#SHIFTW))
- If the optional uns keyword is applied to the input operand, then bits 39-32 of the result are compared to 0 .
- If the optional uns keyword is not applied to the input operand, then bits 39-31 of the result are compared to bit 39 of the input operand and SXMD.
Status Bits Affected by C54CM, RDM, SST, SXMD
Affects none
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *AR3 = HI(AC0 \(\ll\) \#31) & \begin{tabular}{l} 
The content of AC0 is shifted left by 31 bits and AC0(31-16) is stored at the \\
location addressed by AR3.
\end{tabular} \\
\hline
\end{tabular}

Store Accumulator Content to Memory
Syntax Characteristics


\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{C} 54 \mathrm{CM}=1\), overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40 -bit result of the shift and round operation:
- If the SST bit \(=1\) and the \(\operatorname{SXMD}\) bit \(=0\), then the saturate, rnd, and uns keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem = HI(saturate(uns(rnd(ACx << \#SHIFTW))))
\(\square\) If the SST bit = 1 and the SXMD bit = 1 , then only the saturate and rnd keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem = HI(saturate(rnd(ACx << \#SHIFTW)))
- If the optional uns keyword is applied to the input operand, then bits 39-32 of the result are compared to 0 .
- If the optional uns keyword is not applied to the input operand, then bits 39-31 of the result are compared to bit 39 of the input operand and SXMD.

\section*{Status Bits}

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *AR3 = HI(rnd(AC0 \(\ll\) \#31)) & \begin{tabular}{l} 
The content of AC0 is shifted left by 31 bits, is rounded, and AC0(31-16) is \\
stored at the location addressed by AR3.
\end{tabular} \\
\hline
\end{tabular}

Store Accumulator Content to Memory
Syntax Characteristics
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & \begin{tabular}{l}
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline [8] & Smem = HI(saturate(uns(rnd(ACx)))) & No & 3 & 1 & X \\
\hline \multicolumn{2}{|l|}{Opcode} & 1000 | AAA & A AA & I SSXx & \(x\) x1u\% \\
\hline Operands & \multicolumn{5}{|l|}{ACx, Smem} \\
\hline Description & \multicolumn{5}{|l|}{This instruction stores the high part of the accumulator, \(\operatorname{ACx}(31-16)\), to the memory (Smem) location.} \\
\hline
\end{tabular}
- When the C54CM bit \(=0\) or the SST bit \(=0\), the saturate and uns keywords are optional and can be applied or not.
\(\square\) Input operands are considered signed or unsigned according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is considered unsigned.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is considered signed.

If the optional rnd keyword is applied to the input operand, rounding is performed in the D-unit shifter according to RDM.
\(\square\) When a rounding overflow is detected and if the optional saturate keyword is applied to the input operand, the 40 -bit output of the operation is saturated:

■ If the optional uns keyword is applied to the input operand, saturation value is 00 FFFF FFFFh.

■ If the optional uns keyword is not applied, saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow).
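
The saturation values listed above can be illustrated with a short Python sketch. The overflow test used here is an assumption drawn from the compatibility notes in this chapter (bits 39-32 compared to 0 for uns, bits 39-31 required to all equal the sign otherwise); the function name `saturate_for_store` is hypothetical.

```python
MASK40 = (1 << 40) - 1


def saturate_for_store(value40, uns=False):
    """Saturate a 40-bit pattern before its high or low part is stored.

    Returns (40-bit result, overflow flag).
    Assumed overflow test:
      uns:    any of bits 39-32 is nonzero.
      signed: bits 39-31 are not all identical.
    """
    value40 &= MASK40
    if uns:
        overflow = (value40 >> 32) != 0
        return (0x00FFFFFFFF if overflow else value40), overflow
    top_bits = value40 >> 31                      # bits 39-31, nine bits
    overflow = top_bits not in (0x000, 0x1FF)
    if not overflow:
        return value40, False
    negative = (value40 >> 39) & 1
    return (0xFF80000000 if negative else 0x007FFFFFFF), True
```

The word written to Smem would then be bits 31-16 of the returned pattern, for example `(saturate_for_store(x)[0] >> 16) & 0xFFFF`.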

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{C} 54 \mathrm{CM}=1\), overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40 -bit result of the round operation:
- If the SST bit = 1 and the SXMD bit \(=0\), then the saturate, rnd, and uns keywords are applied to the instruction regardless of the optional keywords selected by the user.
- If the SST bit = 1 and the SXMD bit \(=1\), then only the saturate and rnd keywords are applied to the instruction regardless of the optional keywords selected by the user, with the following syntax:

Smem = HI(saturate(rnd(ACx)))
- If the optional uns keyword is applied to the input operand, then bits 39-32 of the result are compared to 0.

\section*{Store Accumulator Content to Memory}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([9]\) & Smem = HI(saturate(uns(rnd(ACx \(\ll\) Tx)))) & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode 1110 0111 \(\mid\) AAAA AAAI \(\mid\) SSss 11u\%

Operands ACx, Smem, Tx
Description This instruction shifts the accumulator, ACx, by the content of Tx and stores the high part of the accumulator, \(\operatorname{ACx}(31-16)\), to the memory (Smem) location. If the 16-bit value in Tx is not within -32 to +31, the shift is saturated to -32 or +31 and the shift is performed with this value.
- When the C54CM bit \(=0\) or the SST bit \(=0\), the saturate and uns keywords are optional and can be applied or not.
- Input operands are considered signed or unsigned according to uns.

■ If the optional uns keyword is applied to the input operand, the content of the memory location is considered unsigned.
- If the optional uns keyword is not applied to the input operand, the content of the memory location is considered signed.
- The input operand is shifted in the D-unit shifter according to SXMD.
- When shifting, the sign position of the input operand is compared to the shift quantity.
■ If the optional uns keyword is applied to the input operand, this comparison is performed against bit 32 of the shifted operand.
■ If the optional uns keyword is not applied, this comparison is performed against bit 31 of the shifted operand that is considered signed (the sign is defined by bit 39 of the input operand and SXMD).
- An overflow is generated accordingly.
- If the optional rnd keyword is applied to the input operand, rounding is performed in the D-unit shifter according to RDM.
\(\square\) When a shift or rounding overflow is detected and if the optional saturate keyword is applied to the input operand, the 40 -bit output of the operation is saturated:
■ If the optional uns keyword is applied to the input operand, saturation value is 00 FFFF FFFFh.
■ If the optional uns keyword is not applied, saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow).

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with C54CM = 1:
- If the SST bit = 1 and the SXMD bit = 0, then the saturate, rnd, and uns keywords are applied to the instruction regardless of the optional keywords selected by the user.

\section*{Store Accumulator Content to Memory}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [10] & Smem = HI(saturate(uns(rnd(ACx << \#SHIFTW)))) & No & 4 & 1 & X \\
\hline \multicolumn{2}{|l|}{Opcode} & \multicolumn{4}{|l|}{11111010 | AAAA AAAI | uxSH IFTW | SSxx x1x\%} \\
\hline \multicolumn{2}{|l|}{Operands} & \multicolumn{4}{|l|}{ACx, SHIFTW, Smem} \\
\hline \multicolumn{2}{|l|}{Description} & \multicolumn{4}{|l|}{This instruction shifts the accumulator, ACx, by the 6-bit value, SHIFTW, and stores the high part of the accumulator, \(\operatorname{ACx}(31-16)\), to the memory (Smem) location.} \\
\hline
\end{tabular}
\(\square\) When the C54CM bit \(=0\) or the SST bit \(=0\), the saturate and uns keywords are optional and can be applied or not.
\(\square\) Input operands are considered signed or unsigned according to uns.
- If the optional uns keyword is applied to the input operand, the content of the memory location is considered unsigned.
- If the optional uns keyword is not applied to the input operand, the content of the memory location is considered signed.
\(\square\) The input operand is shifted by the 6-bit value in the D-unit shifter according to SXMD.
\(\square\) When shifting, the sign position of the input operand is compared to the shift quantity.
- If the optional uns keyword is applied to the input operand, this comparison is performed against bit 32 of the shifted operand.
- If the optional uns keyword is not applied, this comparison is performed against bit 31 of the shifted operand that is considered signed (the sign is defined by bit 39 of the input operand and SXMD).
- An overflow is generated accordingly.
\(\square\) If the optional rnd keyword is applied to the input operand, rounding is performed in the D-unit shifter according to RDM.
\(\square\) When a shift or rounding overflow is detected and if the optional saturate keyword is applied to the input operand, the 40-bit output of the operation is saturated:
- If the optional uns keyword is applied to the input operand, saturation value is 00 FFFF FFFFh.
\begin{tabular}{|l|l|}
\hline Status Bits & Affected by C54CM, RDM, SST, SXMD \\
\hline & Affects none \\
\hline Repeat & This instruction can be repeated. \\
\hline \multicolumn{2}{|l|}{Example} \\
\hline Syntax & Description \\
\hline *AR3 = HI(saturate(uns(rnd(AC0 << \#31)))) & The unsigned content of AC0 is shifted left by 31 bits, is rounded, is saturated, and AC0(31-16) is stored at the location addressed by AR3. \\
\hline
\end{tabular}

Store Accumulator Content to Memory

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([12]\) & dbl(Lmem) = saturate(uns(ACx)) & No & 3 & 1 & X \\
\hline
\end{tabular}

Opcode 1110 1011 \(\mid\) AAAA AAAI \(\mid\) xxSS 10u1

Operands ACx, Lmem
Description This instruction stores the content of the accumulator, \(\operatorname{ACx}(31-0)\), to the data memory operand (Lmem).
- When the C54CM bit \(=0\) or the SST bit \(=0\), the saturate and uns keywords are optional and can be applied or not.
\(\square\) Input operands are considered signed or unsigned according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is considered unsigned.
- If the optional uns keyword is not applied to the input operand, the content of the memory location is considered signed.
- The 40-bit output of the operation is saturated:
- If the optional uns keyword is applied to the input operand, saturation value is 00 FFFF FFFFh.
- If the optional uns keyword is not applied, saturation values are 00 7FFF FFFFh (positive overflow) or FF 8000 0000h (negative overflow).
- The store operation to the memory location uses the D-unit shifter.

Compatibility with C54x devices (C54CM =1)
When this instruction is executed with \(\mathrm{C} 54 \mathrm{CM}=1\), overflow detection at the output of the shifter consists of checking if the sign of the input operand is identical to the most-significant bits of the 40 -bit result of the shift and round operation.
\(\square\) If the SST bit = 1 and the SXMD bit \(=0\), then the saturate and uns keywords are applied to the instruction regardless of the optional keywords selected by the user.
\(\square\) If the SST bit \(=1\) and the SXMD bit \(=1\), then only the saturate keyword is applied to the instruction regardless of the optional keywords selected by the user.
\(\square\) If the optional uns keyword is applied to the input operand, then bits 39-32 of the result are compared to 0 .
\(\square\) If the optional uns keyword is not applied to the input operand, then bits 39-31 of the result are compared to bit 39 of the input operand and SXMD.

Status Bits Affected by C54CM, RDM, SST, SXMD
Affects none
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline dbl(*AR3) = saturate(uns(AC0)) & \begin{tabular}{l} 
The unsigned content of AC0 is saturated and stored at the locations \\
addressed by AR3 and AR3 + 1.
\end{tabular} \\
\hline
\end{tabular}

\section*{Store Accumulator Content to Memory}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [14] & \begin{tabular}{l} 
Xmem = LO(ACx), \\
Ymem = HI(ACx)
\end{tabular} & No & 3 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
| 1000 0000 | XXXM MMYY | YMMM 10SS |

Operands ACx, Xmem, Ymem
Description This instruction performs two store operations in parallel:
- The 16 lowest bits of the accumulator, \(\operatorname{ACx}(15-0)\), are stored to data memory operand Xmem.
\(\square\) The 16 highest bits, \(\operatorname{ACx}(31-16)\), are stored to data memory operand Ymem.
Status Bits Affected by none

Affects none
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l} 
*AR1 = LO(AC0), \\
*AR2 = HI(AC0)
\end{tabular} & \begin{tabular}{l} 
The content of AC0(15-0) is stored at the location addressed by AR1 and the \\
content of AC0(31-16) is stored at the location addressed by AR2.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AC0 & 01 4500 0030 & AC0 & 01 4500 0030 \\
AR1 & 0200 & AR1 & 0200 \\
AR2 & 0201 & AR2 & 0201 \\
200 & 3400 & 200 & 0030 \\
201 & 0FD3 & 201 & 4500
\end{tabular}
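The two parallel stores can be modeled with a short Python sketch. The helper name `store_lo_hi` and the dictionary standing in for data memory are illustrative assumptions; the values are those of the example above.

```python
def store_lo_hi(acx: int) -> tuple[int, int]:
    """Split an accumulator into the words written to Xmem and Ymem (model only)."""
    lo = acx & 0xFFFF          # ACx(15-0)  goes to Xmem
    hi = (acx >> 16) & 0xFFFF  # ACx(31-16) goes to Ymem
    return lo, hi


# Data memory before the instruction, as in the example: AR1 = 0200h, AR2 = 0201h.
memory = {0x0200: 0x3400, 0x0201: 0x0FD3}
memory[0x0200], memory[0x0201] = store_lo_hi(0x01_4500_0030)
```

The guard bits, ACx(39-32), take no part in either store.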
MOV Store Accumulator Pair Content to Memory
Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & Lmem = pair(HI(ACx)) & No & 3 & 1 & X \\
{\([2]\)} & Lmem = pair(LO(ACx)) & No & 3 & 1 & X \\
\hline
\end{tabular}

Description This instruction stores the content of the selected accumulator pair, ACx and AC(x + 1), to a data memory operand (Lmem).

Status Bits Affected by none
Affects none
See Also See the following other related instructions:
\(\square\) Addition with Parallel Store Accumulator Content to Memory
\(\square\) Load Accumulator from Memory with Parallel Store Accumulator Content to Memory
\(\square\) Load Accumulator, Auxiliary, or Temporary Register from Memory
\(\square\) Multiply and Accumulate with Parallel Store Accumulator Content to Memory
\(\square\) Multiply and Subtract with Parallel Store Accumulator Content to Memory
\(\square\) Multiply with Parallel Store Accumulator Content to Memory
\(\square\) Store Accumulator Content to Memory
\(\square\) Store Accumulator, Auxiliary, or Temporary Register Content to Memory
\(\square\) Store Auxiliary or Temporary Register Pair Content to Memory
\(\square\) Subtraction with Parallel Store Accumulator Content to Memory

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & Smem = src & No & 2 & 1 & X \\
\hline
\end{tabular}

Opcode | 1100 FSSS | AAAA AAAI |

\section*{Operands Smem, src}

Description This instruction stores the content of the source (src) register to a memory (Smem) location.
\(\square\) When the source register is an accumulator:
■ The low part of the accumulator, ACx(15-0), is stored to the memory location.
■ The store operation to the memory location uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
\(\square\) When the source register is an auxiliary or temporary register:
■ The content of the auxiliary or temporary register is stored to the memory location.
■ The store operation to the memory location uses a dedicated path independent of the A-unit ALU.
\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & none
\end{tabular}

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *(\#0E10h) = AC0 & The content of AC0(15-0) is stored at location E10h. \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AC0 & 23 0400 6500 & AC0 & 23 0400 6500 \\
0E10 & 0000 & 0E10 & 6500
\end{tabular}

\section*{Store Accumulator, Auxiliary, or Temporary Register Content to Memory}

\section*{Syntax Characteristics}



\section*{Operands Smem, src}

Description This instruction stores the low byte (bits 7-0) of the source (src) register to the low byte (bits 7-0) of the memory (Smem) location. The high byte (bits 15-8) of Smem is unchanged.
\(\square\) When the source register is an accumulator:
■ The low part of the accumulator, ACx(7-0), is stored to the low byte of the memory location.
■ The store operation to the memory location uses a dedicated path independent of the D-unit ALU, the D-unit shifter, and the D-unit MACs.
\(\square\) When the source register is an auxiliary or temporary register:
■ The low part (bits 7-0) of the auxiliary or temporary register is stored to the low byte of the memory location.
■ The store operation to the memory location uses a dedicated path independent of the A-unit ALU.
\(\square\) In this instruction, Smem cannot reference a memory-mapped register (MMR); the instruction cannot access a byte within an MMR. If Smem is an MMR, the DSP sends a hardware bus-error interrupt (BERRINT) request to the CPU.

\section*{Status Bits}

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline low_byte(*AR3) = AC0 & \begin{tabular}{l} 
The content of AC0(7-0) is stored in the low byte (bits 7-0) at the location \\
addressed by AR3.
\end{tabular} \\
\hline
\end{tabular}
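The byte merge described above can be sketched in Python. The function name `store_low_byte` is a hypothetical helper, and the memory word and register are passed as plain integers.

```python
def store_low_byte(mem_word: int, src: int) -> int:
    """Return the new 16-bit memory word after low_byte(Smem) = src (model only)."""
    # Bits 15-8 of the memory word are preserved; bits 7-0 come from the source register.
    return (mem_word & 0xFF00) | (src & 0x00FF)
```

With a memory word of ABCDh and a source whose low byte is 12h, this model yields AB12h.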

\section*{MOV}

Store Auxiliary or Temporary Register Pair Content to Memory

\section*{Syntax Characteristics}


\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *AR2 = pair(T0) & \begin{tabular}{l} 
The content of T0 is stored at the location addressed by AR2 and the content of \\
T1 is stored at the location addressed by AR2 + 1.
\end{tabular} \\
\hline
\end{tabular}

\section*{MOV}

\section*{Syntax Characteristics}
\begin{tabular}{|c|c|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & Smem \(=\) BK03 & No & 3 & 1 & X \\
\hline [2] & Smem \(=\) BK47 & No & 3 & 1 & X \\
\hline [3] & Smem = BKC & No & 3 & 1 & X \\
\hline [4] & Smem = BSA01 & No & 3 & 1 & X \\
\hline [5] & Smem = BSA23 & No & 3 & 1 & X \\
\hline [6] & Smem = BSA45 & No & 3 & 1 & X \\
\hline [7] & Smem = BSA67 & No & 3 & 1 & X \\
\hline [8] & Smem = BSAC & No & 3 & 1 & X \\
\hline [9] & Smem \(=\) BRC0 & No & 3 & 1 & X \\
\hline [10] & Smem = BRC1 & No & 3 & 1 & X \\
\hline [11] & Smem = CDP & No & 3 & 1 & X \\
\hline [12] & Smem = CSR & No & 3 & 1 & X \\
\hline [13] & Smem = DP & No & 3 & 1 & X \\
\hline [14] & Smem = DPH & No & 3 & 1 & X \\
\hline [15] & Smem = PDP & No & 3 & 1 & X \\
\hline [16] & Smem = SP & No & 3 & 1 & X \\
\hline [17] & Smem = SSP & No & 3 & 1 & X \\
\hline [18] & Smem \(=\) TRN0 & No & 3 & 1 & X \\
\hline [19] & Smem \(=\) TRN1 & No & 3 & 1 & X \\
\hline [20] & dbl(Lmem) = RETA & No & 3 & 5 & X \\
\hline
\end{tabular}

Opcode See Table 5-6 (page 5-599).

Operands Lmem, Smem

Description These instructions store the content of the selected source CPU register to a memory (Smem) location or a data memory operand (Lmem).

For instructions [9] and [10], the block repeat register (BRCx) is decremented in the address phase of the last instruction of the loop. These instructions have a 3-cycle latency requirement versus the last instruction of the loop.

For instruction [20], the content of the 24-bit RETA register (the return address of the calling subroutine) and the 8-bit CFCT register (active control flow execution context flags of the calling subroutine) are stored to the data memory operand (Lmem):
\(\square\) The content of the CFCT register and the 8 highest bits of the RETA register are stored in the 16 highest bits of Lmem.
\(\square\) The 16 lowest bits of the RETA register are stored in the 16 lowest bits of Lmem.

When instruction [20] is decoded, the CPU pipeline is flushed and the instruction is executed in 5 cycles, regardless of the instruction context.

Status Bits Affected by none
Affects none
Repeat Instruction [20] cannot be repeated; all other instructions can be repeated.
See Also See the following other related instructions:
\(\square\) Load CPU Register from Memory
\(\square\) Load CPU Register with Immediate Value
\(\square\) Move CPU Register Content to Auxiliary or Temporary Register
\(\square\) Store Accumulator Content to Memory
\(\square\) Store Accumulator Pair Content to Memory
\(\square\) Store Accumulator, Auxiliary, or Temporary Register Content to Memory
\(\square\) Store Auxiliary or Temporary Register Pair Content to Memory

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *AR1+ = SP & \begin{tabular}{l} 
The content of the data stack pointer (SP) is stored in the location addressed by AR1. \\
AR1 is incremented by 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AR1 & 0200 & AR1 & 0201 \\
SP & 0200 & SP & 0200 \\
200 & 0000 & 200 & 0200
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *AR1+ = SSP & \begin{tabular}{l} 
The content of the system stack pointer (SSP) is stored in the location addressed by AR1. \\
AR1 is incremented by 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AR1 & 0201 & AR1 & 0202 \\
SSP & 0000 & SSP & 0000
\end{tabular}

\section*{Example 3}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *AR1+ = TRN0 & \begin{tabular}{l} 
The content of the transition register (TRN0) is stored in the location addressed by AR1. \\
AR1 is incremented by 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AR1 & 0202 & AR1 & 0203 \\
TRN0 & 3490 & TRN0 & 3490 \\
202 & 0000 & 202 & 3490
\end{tabular}

\section*{Example 4}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline *AR1+ = TRN1 & \begin{tabular}{l} 
The content of the transition register (TRN1) is stored in the location addressed by AR1. \\
AR1 is incremented by 1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & \multicolumn{3}{c}{ After } \\
AR1 & 0203 & AR1 & 0204 \\
TRN1 & 0020 & TRN1 & 0020 \\
203 & 0000 & 203 & 0020
\end{tabular}

\section*{Example 5}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline dbl(*AR3) = RETA & \begin{tabular}{l} 
The contents of the RETA and CFCT are stored in the location addressed by AR3 \\
and AR3 + 1.
\end{tabular} \\
\hline
\end{tabular}
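The layout of the 32-bit value written by instruction [20] can be sketched in Python. This is a model under stated assumptions: the name `pack_reta` is hypothetical, and placing CFCT in the upper byte of the high word (above RETA(23-16)) is an assumption, since the description only says that both share the 16 highest bits of Lmem.

```python
def pack_reta(reta: int, cfct: int) -> tuple[int, int]:
    """Return (high word, low word) of Lmem for dbl(Lmem) = RETA (model only)."""
    reta &= 0xFF_FFFF  # RETA is a 24-bit register
    cfct &= 0xFF       # CFCT is an 8-bit register
    # Assumed byte order: CFCT in bits 15-8 of the high word, RETA(23-16) in bits 7-0.
    high_word = (cfct << 8) | (reta >> 16)
    low_word = reta & 0xFFFF  # RETA(15-0)
    return high_word, low_word
```

With RETA = 012345h and CFCT = ABh, this model produces the words AB01h and 2345h.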

Table 5-6. Opcodes for Store CPU Register Content to Memory Instruction
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline No. & Syntax & \multicolumn{6}{|c|}{Opcode} \\
\hline [1] & Smem = BK03 & 1110 & 0101 & AAAA & AAAI & 1001 & 10xx \\
\hline [2] & Smem \(=\) BK47 & 1110 & 0101 & AAAA & AAAI & 1010 & 10xx \\
\hline [3] & Smem = BKC & 1110 & 0101 & AAAA & AAAI & 1011 & 10xx \\
\hline [4] & Smem = BSA01 & 1110 & 0101 & AAAA & AAAI & 0010 & 10xx \\
\hline [5] & Smem = BSA23 & 1110 & 0101 & AAAA & AAAI & 0011 & 10xx \\
\hline [6] & Smem = BSA45 & 1110 & 0101 & AAAA & AAAI & 0100 & 10xx \\
\hline [7] & Smem = BSA67 & 1110 & 0101 & AAAA & AAAI & 0101 & 10xx \\
\hline [8] & Smem = BSAC & 1110 & 0101 & AAAA & AAAI & 0110 & 10xx \\
\hline [9] & Smem = BRC0 & 1110 & 0101 & AAAA & AAAI & x001 & 11xx \\
\hline [10] & Smem \(=\) BRC1 & 1110 & 0101 & AAAA & AAAI & x010 & 11xx \\
\hline [11] & Smem = CDP & 1110 & 0101 & AAAA & AAAI & 0001 & 10xx \\
\hline [12] & Smem = CSR & 1110 & 0101 & AAAA & AAAI & x000 & 11xx \\
\hline [13] & Smem = DP & 1110 & 0101 & AAAA & AAAI & 0000 & 10xx \\
\hline [14] & Smem \(=\) DPH & 1110 & 0101 & AAAA & AAAI & 1100 & 10xx \\
\hline [15] & Smem = PDP & 1110 & 0101 & AAAA & AAAI & 1111 & 10xx \\
\hline [16] & Smem = SP & 1110 & 0101 & AAAA & AAAI & 0111 & 10xx \\
\hline [17] & Smem = SSP & 1110 & 0101 & AAAA & AAAI & 1000 & 10xx \\
\hline [18] & Smem \(=\) TRN0 & 1110 & 0101 & AAAA & AAAI & x011 & 11xx \\
\hline [19] & Smem \(=\) TRN1 & 1110 & 0101 & AAAA & AAAI & x100 & 11xx \\
\hline [20] & dbl(Lmem) = RETA & 1110 & 1011 & AAAA & AAAI & xxxx & 01xx \\
\hline
\end{tabular}

\section*{MOV}

Store Extended Auxiliary Register Content to Memory

\section*{Syntax Characteristics}


\section*{SUBC}

\section*{Subtract Conditionally}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & subc(Smem, ACx, ACy) & & & & \\
\hline
\end{tabular}

Operands ACx, ACy, Smem
Description This instruction performs a conditional subtraction in the D-unit ALU. The D-unit shifter is not used to perform the memory operand shift.
\(\square\) The 16-bit data memory operand Smem is sign extended to 40 bits according to SXMD, shifted left by 15 bits, and subtracted from the content of the source accumulator ACx.
■ The shift operation is equivalent to the signed shift instruction.
■ Overflow and carry are always detected at bit position 31. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ If an overflow is detected, it is reported in accumulator overflow bit ACOVy; no saturation is performed on the result of the operation.
\(\square\) If the result of the subtraction is greater than or equal to 0 (bit \(39=0\)), the result is shifted left by 1 bit, added to 1, and stored in the destination accumulator ACy.
\(\square\) If the result of the subtraction is less than 0 (bit \(39=1\)), the source accumulator ACx is shifted left by 1 bit and stored in the destination accumulator ACy.
```
if ((ACx - (Smem << #15)) >= 0)
    ACy = ((ACx - (Smem << #15)) << #1) + 1
else
    ACy = ACx << #1
```

This instruction is used to perform a 16-step, 16-bit by 16-bit division. The divisor and the dividend are both assumed to be positive in this instruction. SXMD affects this operation:
\(\square\) If SXMD \(=1\), the divisor must have a 0 value in the most significant bit.
\(\square\) If SXMD \(=0\), any 16-bit divisor value produces the expected result.
The dividend, which is in the source accumulator ACx, must be positive (bit \(31=0\)) during the computation.
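The conditional subtraction and the 16-step division built from it can be modeled in Python. This is a sketch under stated assumptions, not TI code: the helper names (`to_signed40`, `subc`, `divide16`) are hypothetical, status bits (ACOVy, CARRY) are not modeled, and reading the quotient from bits 15-0 and the remainder from bits 31-16 after 16 steps is how this model behaves rather than something the text above states.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value: int) -> int:
    """Interpret a 40-bit pattern as a signed integer (bit 39 is the sign)."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def subc(acx: int, smem: int, sxmd: int = 0) -> int:
    """One conditional-subtract step; returns the new 40-bit ACy pattern."""
    operand = smem & 0xFFFF
    if sxmd and operand & 0x8000:
        operand -= 0x10000  # sign extension of Smem when SXMD = 1
    source = to_signed40(acx)
    diff = to_signed40(source - (operand << 15))
    if diff >= 0:
        result = (diff << 1) + 1  # keep the difference and shift in a 1
    else:
        result = source << 1      # restore: shift the source accumulator only
    return result & MASK40


def divide16(dividend: int, divisor: int) -> tuple[int, int]:
    """Apply subc 16 times; returns (quotient, remainder) for positive operands."""
    acc = dividend & 0xFFFF
    for _ in range(16):
        acc = subc(acc, divisor)
    return acc & 0xFFFF, (acc >> 16) & 0xFFFF
```

For example, `divide16(100, 7)` returns `(14, 2)` in this model, which matches 100 = 7 x 14 + 2.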
\begin{tabular}{ll} 
Status Bits & Affected by SXMD \\
& Affects ACOVy, CARRY \\
Repeat & This instruction can be repeated. \\
See Also & See the following other related instructions: \\
& \(\square\) Addition or Subtraction Conditionally \\
& \(\square\) Addition or Subtraction Conditionally with Shift \\
& \(\square\) Addition, Subtraction, or Move Accumulator Content Conditionally \\
& \(\square\) Dual 16-Bit Subtraction and Addition \\
& \(\square\) Subtraction \\
& \(\square\) Subtraction with Parallel Store Accumulator Content to Memory
\end{tabular}

\section*{Example 1}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline subc(*AR1, AC0, AC1) & The content addressed by AR1 shifted left by 15 bits is subtracted from the content of AC0. The result is greater than 0; therefore, the result is shifted left by 1 bit, added to 1, and the new result stored in AC1. The result generated an overflow and a carry. \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AC0 & 23 4300 0000 & AC0 & 23 4300 0000 \\
AC1 & 00 0000 0000 & AC1 & 46 8400 0001 \\
AR1 & 300 & AR1 & 300 \\
300 & 200 & 300 & 200 \\
SXMD & 0 & SXMD & 0 \\
ACOV1 & 0 & ACOV1 & 1 \\
CARRY & 0 & CARRY & 1
\end{tabular}

\section*{Example 2}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline \begin{tabular}{l} 
repeat(CSR) \\
subc(*AR1, AC1, AC1)
\end{tabular} & The content addressed by AR1 shifted left by 15 bits is subtracted from the content of AC1. The result is greater than 0; therefore, the result is shifted left by 1 bit, added to 1, and the new result stored in AC1. The content addressed by AR1 shifted left by 15 bits is subtracted from the content of AC1. The result is greater than 0; therefore, the result is shifted left by 1 bit, added to 1, and the new result stored in AC1. The result generated a carry. \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AC1 & 00 0746 0000 & AC1 & 00 1A18 0007 \\
AR1 & 200 & AR1 & 200 \\
200 & 0100 & 200 & 0100 \\
CSR & 1 & CSR & 0 \\
ACOV1 & 0 & ACOV1 & 0 \\
CARRY & 0 & CARRY & 1
\end{tabular}

\section*{SUB}

\section*{Subtraction}

Syntax Characteristics
\begin{tabular}{|c|l|c|c|c|c|}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & dst = dst - src & Yes & 2 & 1 & X \\
\hline [2] & dst = dst - k4 & Yes & 2 & 1 & X \\
\hline [3] & dst = src - K16 & No & 4 & 1 & X \\
\hline [4] & dst = src - Smem & No & 3 & 1 & X \\
\hline [5] & dst = Smem - src & No & 3 & 1 & X \\
\hline [6] & ACy = ACy - (ACx \(\ll\) Tx) & Yes & 2 & 1 & X \\
\hline [7] & ACy = ACy - (ACx \(\ll\) \#SHIFTW) & Yes & 3 & 1 & X \\
\hline [8] & ACy = ACx - (K16 \(\ll\) \#16) & No & 4 & 1 & X \\
\hline [9] & ACy = ACx - (K16 \(\ll\) \#SHFT) & No & 4 & 1 & X \\
\hline [10] & ACy = ACx - (Smem \(\ll\) Tx) & No & 3 & 1 & X \\
\hline [11] & ACy = ACx - (Smem \(\ll\) \#16) & No & 3 & 1 & X \\
\hline [12] & ACy = (Smem \(\ll\) \#16) - ACx & No & 3 & 1 & X \\
\hline [13] & ACy = ACx - uns(Smem) - BORROW & No & 3 & 1 & X \\
\hline [14] & ACy = ACx - uns(Smem) & No & 3 & 1 & X \\
\hline [15] & ACy = ACx - (uns(Smem) \(\ll\) \#SHIFTW) & No & 4 & 1 & X \\
\hline [16] & ACy = ACx - dbl(Lmem) & No & 3 & 1 & X \\
\hline [17] & ACy = dbl(Lmem) - ACx & No & 3 & 1 & X \\
\hline [18] & ACx = (Xmem \(\ll\) \#16) - (Ymem \(\ll\) \#16) & No & 3 & 1 & X \\
\hline
\end{tabular}
\begin{tabular}{lll} 
Description & These instructions perform a subtraction operation. \\
Status Bits & Affected by & CARRY, C54CM, M40, SATA, SATD, SXMD \\
& Affects & ACOVx, ACOVy, CARRY
\end{tabular}
See Also See the following other related instructions:
- Addition
\(\square\) Addition or Subtraction Conditionally
- Addition or Subtraction Conditionally with Shift
- Addition, Subtraction, or Move Accumulator Content Conditionally
\(\square\) Dual 16-Bit Addition and Subtraction
\(\square\) Dual 16-Bit Subtractions
- Dual 16-Bit Subtraction and Addition
\(\square\) Subtract Conditionally
- Subtraction with Parallel Store Accumulator Content to Memory

\section*{Subtraction}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [1] & dst = dst - src & Yes & 2 & 1 & X \\
\hline
\end{tabular}

Opcode | 0010 011E | FSSS FDDD |
Operands dst, src
Description This instruction performs a subtraction operation between two registers.
\(\square\) When the destination operand (dst) is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ Input operands are sign extended to 40 bits according to SXMD.
■ If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
■ Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
\(\square\) When the destination operand (dst) is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
■ Overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SATA, SATD, SXMD
Affects ACOVx, CARRY
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 - AC1 & The content of AC1 is subtracted from the content of AC0 and the result is stored in AC0. \\
\hline
\end{tabular}
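The A-unit case (destination is an auxiliary or temporary register) can be sketched in Python. This is an illustrative model only: the names `to_signed16` and `sub16` are hypothetical, and the saturation values 7FFFh and 8000h used when SATA = 1 are an assumption, since the description above says only that the register is saturated according to SATA.

```python
def to_signed16(value: int) -> int:
    """Interpret the 16 LSBs of a value as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sub16(dst: int, src: int, sata: int) -> tuple[int, bool]:
    """Model dst = dst - src in the A-unit ALU; returns (new dst, overflow flag)."""
    # Only the 16 LSBs of each operand take part (this also covers an accumulator source).
    diff = to_signed16(dst) - to_signed16(src)
    overflow = not (-0x8000 <= diff <= 0x7FFF)  # overflow detected at bit position 15
    if overflow and sata:
        # Assumed saturation values: 7FFFh (positive) or 8000h (negative).
        diff = 0x7FFF if diff > 0 else -0x8000
    return diff & 0xFFFF, overflow
```

Subtracting 1 from 8000h overflows in this model: the result is 8000h when SATA = 1 and wraps to 7FFFh when SATA = 0.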

\section*{Subtraction}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [2] & dst = dst - k4 & Yes & 2 & 1 & X \\
\hline
\end{tabular}

\section*{Opcode}
| 0100 011E | kkkk FDDD |

Operands dst, k4
Description This instruction subtracts a 4-bit unsigned constant, k4, from a register.
\(\square\) When the destination operand (dst) is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
\(\square\) When the destination operand (dst) is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ Overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 - \#15 & \begin{tabular}{l} 
An unsigned 4-bit value (15) is subtracted from the content of AC0 and the result is \\
stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Subtraction}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [3] & dst = src - K16 & No & 4 & 1 & X \\
\hline
\end{tabular}

Operands dst, K16, src
Description This instruction subtracts a 16-bit signed constant, K16, from a register.
\(\square\) When the destination operand (dst) is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
■ The 16-bit constant, K16, is sign extended to 40 bits according to SXMD.
■ Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
\(\square\) When the destination operand (dst) is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
■ Overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.

Compatibility with C54x devices (C54CM =1)
When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SATA, SATD, SXMD
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 - FFFFh & \begin{tabular}{l} 
A signed 16-bit value (FFFFh) is subtracted from the content of AC1 and the result \\
is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}
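The effect of SXMD on the constant in this example can be sketched in Python. The helper names `extend_k16` and `sub_k16` are hypothetical; the model treats SXMD = 0 as zero extension (an assumption about what "according to SXMD" means when sign extension is off) and ignores overflow detection and saturation.

```python
MASK40 = (1 << 40) - 1


def extend_k16(k16: int, sxmd: int) -> int:
    """Extend a 16-bit constant: signed when SXMD = 1, zero extended otherwise."""
    k16 &= 0xFFFF
    if sxmd and k16 & 0x8000:
        return k16 - 0x10000  # FFFFh becomes -1
    return k16                # FFFFh stays 65535


def sub_k16(src40: int, k16: int, sxmd: int) -> int:
    """Model dst = src - K16 on 40 bits, without overflow handling."""
    return (src40 - extend_k16(k16, sxmd)) & MASK40
```

In this model, subtracting FFFFh with SXMD = 1 adds 1 to the source, whereas with SXMD = 0 it subtracts 65535.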

\section*{Subtraction}

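Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [4] & dst = src - Smem & No & 3 & 1 & X \\
\hline
\end{tabular}

Operands dst, Smem, src
Description This instruction subtracts the content of a memory (Smem) location from a register content.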
\(\square\) When the destination operand (dst) is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
■ The content of the memory location is sign extended to 40 bits according to SXMD.
■ Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
\(\square\) When the destination operand (dst) is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
■ Overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SATA, SATD, SXMD
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 - *AR3 & \begin{tabular}{l} 
The content addressed by AR3 is subtracted from the content of AC1 and the result \\
is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Subtraction}

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [5] & dst = Smem - src & No & 3 & 1 & X \\
\hline
\end{tabular}
Opcode | 1101 1000 | AAAA AAAI | FDDD FSSS |
Operands dst, Smem, src

Description This instruction subtracts a register content from the content of a memory (Smem) location.
\(\square\) When the destination operand (dst) is an accumulator:
■ The operation is performed on 40 bits in the D-unit ALU.
■ If an auxiliary or temporary register is the source operand (src) of the instruction, the 16 LSBs of the auxiliary or temporary register are sign extended according to SXMD.
■ The content of the memory location is sign extended to 40 bits according to SXMD.
■ Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
■ When an overflow is detected, the accumulator is saturated according to SATD.
\(\square\) When the destination operand (dst) is an auxiliary or temporary register:
■ The operation is performed on 16 bits in the A-unit ALU.
■ If an accumulator is the source operand (src) of the instruction, the 16 LSBs of the accumulator are used to perform the operation.
■ Overflow detection is done at bit position 15.
■ When an overflow is detected, the destination register is saturated according to SATA.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SATA, SATD, SXMD
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = *AR3 - AC1 & \begin{tabular}{l} 
The content of AC1 is subtracted from the content addressed by AR3 and the result \\
is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Subtraction}

\section*{Syntax Characteristics}


\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 \(=0\), compatibility is ensured. When C54CM \(=1\):
\(\square\) An intermediary shift operation is performed as if M40 is locally set to 1, and no overflow detection, report, or saturation is done after the shifting operation.
\(\square\) The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31. When the value is between -32 and -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.
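The shift-quantity rule for C54CM \(=1\) can be sketched in Python. The function name `c54cm_shift_quantity` is hypothetical, and Tx is passed as a plain integer; only the mapping described above is modeled, not the shift itself.

```python
def c54cm_shift_quantity(tx: int) -> int:
    """Derive the effective shift quantity from Tx when C54CM = 1 (model only)."""
    quantity = tx & 0x3F   # only the 6 LSBs of Tx are used
    if quantity & 0x20:
        quantity -= 0x40   # read the 6 bits as a signed value in -32..+31
    if -32 <= quantity <= -17:
        quantity += 16     # modulo 16 step: -32..-17 becomes -16..-1
    return quantity
```

Values from -16 to +31 pass through unchanged; for example, a 6-bit field of 100000b (-32) yields -16 in this model.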

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 - (AC1 \(\ll\) T0) & \begin{tabular}{l} 
The content of AC1 shifted by the content of T0 is subtracted from the content of \\
AC0 and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Subtraction}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [7] & ACy = ACy - (ACx \(\ll\) \#SHIFTW) & Yes & 3 & 1 & X \\
\hline
\end{tabular}
Opcode | 0001 000E | DDSS 0100 | xxSH IFTW |

\section*{Operands ACx, ACy, SHIFTW}

Description This instruction subtracts an accumulator content ACx shifted by the 6-bit value, SHIFTW, from an accumulator content ACy.

\(\square\) The operation is performed on 40 bits in the D-unit shifter.
\(\square\) Input operands are sign extended to 40 bits according to SXMD.
\(\square\) The shift operation is equivalent to the signed shift instruction.
\(\square\) Overflow detection and the CARRY status bit depend on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When C54CM \(=1\), an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC0 - (AC1 \(\ll\) \#31) & \begin{tabular}{l} 
The content of AC1 shifted left by 31 bits is subtracted from the content of AC0 \\
and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Subtraction}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline [8] & ACy = ACx - (K16 \(\ll\) \#16) & No & 4 & 1 & X \\
\hline
\end{tabular}
Opcode | 0111 1010 | KKKK KKKK | KKKK KKKK | SSDD 001x |

\section*{Operands}

Description This instruction subtracts the 16-bit signed constant, K16, shifted left by 16 bits from an accumulator content ACx.

The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
\(\square\) The shift operation is equivalent to the signed shift instruction.
\(\square\) Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.

When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices \((C 54 C M=1)\)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When C54CM \(=1\), an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40, SATD, SXMD \\
& Affects & ACOVy, CARRY
\end{tabular}

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 - (FFFFh <<\#16) & \begin{tabular}{l} 
A signed 16-bit value (FFFFh) shifted left by 16 bits is subtracted from the \\
content of AC1 and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}
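The effect of treating K16 as a signed constant can be illustrated with a short Python sketch. It assumes SXMD = 1 (sign extension active) and M40 = 1, and it does not model overflow or saturation; the helper names `sext16` and `sub_k16_shift16` are hypothetical.

```python
MASK40 = (1 << 40) - 1


def sext16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sub_k16_shift16(acx, k16):
    """Sketch of ACy = ACx - (K16 << #16), assuming SXMD = 1 and M40 = 1."""
    operand = sext16(k16) * (1 << 16)  # signed constant shifted left by 16
    return (acx - operand) & MASK40    # wrap to 40 bits, no saturation modeled
```

Under these assumptions, FFFFh is the signed value -1, so subtracting it shifted left by 16 bits adds 1 0000h to the accumulator content.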

\section*{Subtraction}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & Parallel Enable Bit & Size & Cycles & Pipeline \\
\hline\([9]\) & \(A C y=A C x-(\mathrm{K} 16 \ll \# S H F T)\) & No & 4 & 1 & X \\
\hline
\end{tabular}

Opcode \(\quad|01110001|\) KKKK KKKK \(\mid\) KKKK KKKK \(\mid\) SSDD SHFT

\section*{Operands ACx, ACy, K16, SHFT}

Description This instruction subtracts the 16-bit signed constant, K16, shifted left by the 4-bit value, SHFT, from an accumulator content ACx.
\(\square\) The operation is performed on 40 bits in the D-unit shifter.
- Input operands are sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with M40 \(=0\), compatibility is ensured. When C54CM \(=1\), an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

Status Bits Affected by M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 \(=\) AC0 \(-(\# 9800 \mathrm{~h} \ll \# 5)\) & \begin{tabular}{l} 
A signed 16-bit value \((9800 \mathrm{~h})\) shifted left by 5 bits is subtracted from the \\
content of AC0 and the result is stored in AC1.
\end{tabular} \\
\hline
\end{tabular}

\section*{Subtraction}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([10]\) & \(A C y=A C x-(\) Smem \(\ll T x)\) & No & 3 & 1 & X \\
\hline
\end{tabular}
Opcode \(\quad|11011101|\) AAAA AAAI \(\mid\) SSDD ss01

\section*{Operands ACx, ACy, Smem, Tx}

Description This instruction subtracts the content of a memory (Smem) location shifted by the content of Tx from an accumulator content ACx.
- The operation is performed on 40 bits in the D-unit shifter.
- Input operands are sign extended to 40 bits according to SXMD.
- The shift operation is equivalent to the signed shift instruction.
- Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When C54CM = 1:
\(\square\) An intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.
\(\square\) The 6 LSBs of Tx are used to determine the shift quantity. The 6 LSBs of Tx define a shift quantity within -32 to +31 . When the value is between -32 to -17 , a modulo 16 operation transforms the shift quantity to within -16 to -1 .
Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY

Repeat This instruction can be repeated.
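The C54CM = 1 shift-quantity rule above can be sketched as follows. The reading of "modulo 16" as adding 16 to values in the range -32 to -17 is an assumption chosen because it maps that range onto -16 to -1 as the text states; the function name is hypothetical.

```python
def c54cm_shift_quantity(tx):
    """Sketch of how the shift quantity is derived from Tx when C54CM = 1."""
    raw = tx & 0x3F                            # keep only the 6 LSBs of Tx
    shift = raw - 64 if raw & 0x20 else raw    # signed 6-bit value: -32..+31
    if -32 <= shift <= -17:
        shift += 16                            # assumed "modulo 16" step: -> -16..-1
    return shift
```

With this reading, a Tx value of FFE0h (6 LSBs = -32) yields a shift quantity of -16, while values already within -16 to +31 pass through unchanged.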

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 \(=\) AC1 \(-\left({ }^{*} A R 3 \ll T 0\right)\) & \begin{tabular}{l} 
The content addressed by AR3 shifted by the content of T0 is subtracted from \\
the content of AC1 and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Subtraction}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([11]\) & \(A C y=A C x-(\) Smem \(\ll \# 16)\) & No & 3 & 1 & X \\
\hline
\end{tabular}
Opcode \(\quad|11011110|\) AAAA AAAI \(\mid\) SSDD 0101

\section*{Operands ACx, ACy, Smem}

Description This instruction subtracts the content of a memory (Smem) location shifted left by 16 bits from an accumulator content ACx.
\(\square\) The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
\(\square\) The shift operation is equivalent to the signed shift instruction.
\(\square\) Overflow detection and CARRY status bit depends on M40. If the result of the subtraction generates a borrow, the CARRY status bit is cleared; otherwise, the CARRY status bit is not affected.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When C54CM \(=1\), an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.

Status Bits \(\quad\) Affected by \(\quad\) C54CM, M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 - (*AR3 <<\#16) & \begin{tabular}{l} 
The content addressed by AR3 shifted left by 16 bits is subtracted from the \\
content of AC1 and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Subtraction}

Syntax Characteristics
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([12]\) & \(A C y=(\) Smem <<\#16 \()-A C x\) & No & 3 & 1 & \(X\) \\
\hline
\end{tabular}
Opcode \(\quad \mid 1101\) 1110| AAAA AAAI \(\mid\) SSDD 0110

\section*{Operands}

Description This instruction subtracts an accumulator content ACx from the content of a memory (Smem) location shifted left by 16 bits.
\(\square\) The operation is performed on 40 bits in the D-unit ALU.
\(\square\) Input operands are sign extended to 40 bits according to SXMD.
\(\square\) The shift operation is equivalent to the signed shift instruction.
\(\square\) Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.

When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices \((C 54 C M=1)\)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When C54CM \(=1\), an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.
\begin{tabular}{lll} 
Status Bits & Affected by & C54CM, M40, SATD, SXMD \\
& Affects & ACOVy, CARRY
\end{tabular}

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 \(=\left({ }^{*} A R 3 \ll \# 16\right)-\) AC1 & \begin{tabular}{l} 
The content of AC1 is subtracted from the content addressed by AR3 shifted \\
left by 16 bits and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

Subtraction

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([13]\) & \(A C y=A C x-\) uns(Smem) \(-\) BORROW & No & 3 & 1 & \(X\) \\
\hline
\end{tabular}

\section*{Opcode}

\section*{Operands ACx, ACy, Smem}

Description This instruction subtracts the logical complement of the CARRY status bit (borrow) and the content of a memory (Smem) location from an accumulator content ACx.
\(\square\) The operation is performed on 40 bits in the D-unit ALU.
- Input operands are extended to 40 bits according to uns.
- If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.

■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
- Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

Compatibility with C54x devices (C54CM =1)
When this instruction is executed with M40 \(=0\), compatibility is ensured.
\begin{tabular}{lll} 
Status Bits & Affected by & CARRY, M40, SATD, SXMD \\
& Affects & ACOVy, CARRY
\end{tabular}

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC1 = AC0 - uns(*AR1) - BORROW & \begin{tabular}{l} 
The complement of the CARRY bit (1) and the unsigned content \\
addressed by AR1 (F000h) are subtracted from the content of AC0 and \\
the result is stored in AC1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AC0 & 00 EC00 0000 & AC0 & 00 EC00 0000 \\
AC1 & 00 0000 0000 & AC1 & 00 EBFF 0FFF \\
AR1 & 0302 & AR1 & 0302 \\
302 & F000 & 302 & F000 \\
CARRY & 0 & CARRY & 1
\end{tabular}
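The arithmetic of this example can be checked with a small Python sketch. It assumes M40 = 1, models only the uns() form of the operand, and omits overflow and saturation; the function name `sub_uns_borrow` is hypothetical.

```python
MASK40 = (1 << 40) - 1


def sub_uns_borrow(acx, smem, carry):
    """Sketch of ACy = ACx - uns(Smem) - BORROW, assuming M40 = 1.

    BORROW is the logical complement of the incoming CARRY bit.
    Returns (result, carry_out).
    """
    borrow_in = 1 - carry
    subtrahend = (smem & 0xFFFF) + borrow_in  # uns(): zero extended operand
    minuend = acx & MASK40
    result = (minuend - subtrahend) & MASK40
    carry_out = 0 if minuend < subtrahend else 1  # CARRY = not borrow
    return result, carry_out


result, carry = sub_uns_borrow(0x00EC000000, 0xF000, 0)
print(f"{result:010X} {carry}")  # prints "00EBFF0FFF 1"
```

With CARRY = 0 on entry the borrow is 1, so the operation computes EC00 0000h - F000h - 1 = EBFF 0FFFh, and no borrow is generated, which sets CARRY to 1.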

\section*{Subtraction}

\section*{Syntax Characteristics}

- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are extended to 40 bits according to uns.

■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
- Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = AC1 - uns(*AR3) & \begin{tabular}{l} 
The unsigned content addressed by AR3 is subtracted from the content of AC1 and \\
the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Subtraction}

\section*{Syntax Characteristics}
 the 6-bit value, SHIFTW, from an accumulator content ACx.
\(\square\) The operation is performed on 40 bits in the D-unit shifter.
\(\square\) Input operands are extended to 40 bits according to uns.
■ If the optional uns keyword is applied to the input operand, the content of the memory location is zero extended to 40 bits.
■ If the optional uns keyword is not applied to the input operand, the content of the memory location is sign extended to 40 bits according to SXMD.
\(\square\) The shift operation is equivalent to the signed shift instruction.
- Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM =1)}

When this instruction is executed with M40 \(=0\), compatibility is ensured. When C54CM \(=1\), an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.
Status Bits Affected by C54CM, M40, SATD, SXMD

Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 \(=\) AC1 \(-(\) uns(*AR3 \() \ll \# 31)\) & \begin{tabular}{l} 
The unsigned content addressed by AR3 shifted left by 31 bits is \\
subtracted from the content of AC1 and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{Subtraction}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([16]\) & \(A C y=A C x-\mathrm{dbl}(\) Lmem \()\) & No & 3 & 1 & \(X\) \\
\hline
\end{tabular}
Opcode \(\quad|11101101|\) AAAA AAAI \(\mid\) SSDD 001n

\section*{Operands}

Description This instruction subtracts the content of data memory operand dbl(Lmem) from an accumulator content ACx.
\(\square\) The data memory operand dbl(Lmem) addresses are aligned:
■ if Lmem address is even: most significant word = Lmem, least significant word \(=\) Lmem +1
- if Lmem address is odd: most significant word = Lmem, least significant word \(=\) Lmem - 1
\(\square\) The operation is performed on 40 bits in the D-unit ALU.
\(\square\) Input operands are sign extended to 40 bits according to SXMD.
\(\square\) Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.

Status Bits Affected by M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline\(A C 0=A C 1-\mathrm{dbl}\left({ }^{*} A R 3+\right)\) & \begin{tabular}{l} 
The content (long word) addressed by AR3 and AR3 + 1 is subtracted from the \\
content of AC1 and the result is stored in AC0. Because this instruction is a \\
long-operand instruction, AR3 is incremented by 2 after the execution.
\end{tabular} \\
\hline
\end{tabular}
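The dbl(Lmem) alignment rule described above (most significant word at Lmem, least significant word at Lmem + 1 for an even address or Lmem - 1 for an odd address) can be sketched in Python. The dictionary-based memory and the function name `read_dbl` are illustrative assumptions, not part of the instruction set.

```python
def read_dbl(memory, lmem):
    """Sketch of how dbl(Lmem) assembles a 32-bit long word.

    memory maps word addresses to 16-bit values (a stand-in for data memory).
    """
    msw = memory[lmem] & 0xFFFF       # most significant word is always at Lmem
    lsw = memory[lmem ^ 1] & 0xFFFF   # even: Lmem + 1, odd: Lmem - 1
    return (msw << 16) | lsw
```

Toggling the least significant address bit (`lmem ^ 1`) covers both cases of the rule: it adds 1 to an even address and subtracts 1 from an odd one.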

\section*{Subtraction}

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([17]\) & ACy \(=\mathrm{dbl}(\) Lmem \()-\mathrm{ACx}\) & No & 3 & 1 & X \\
\hline
\end{tabular}
Opcode \(\quad|11101101|\) AAAA AAAI \(\mid\) SSDD

\section*{Operands ACx, ACy, Lmem}

Description This instruction subtracts an accumulator content ACx from the content of data memory operand dbl(Lmem).
\(\square\) The data memory operand dbl(Lmem) addresses are aligned:
- if Lmem address is even: most significant word = Lmem, least significant word \(=\) Lmem +1
- if Lmem address is odd: most significant word = Lmem, least significant word \(=\) Lmem - 1
\(\square\) The operation is performed on 40 bits in the D-unit ALU.
\(\square\) Input operands are sign extended to 40 bits according to SXMD.
\(\square\) Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
\(\square\) When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured.
Status Bits Affected by M40, SATD, SXMD
Affects ACOVy, CARRY
Repeat This instruction can be repeated.
Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 \(=\mathrm{dbl}\left({ }^{*} A R 3\right)-\) AC1 & \begin{tabular}{l} 
The content of AC1 is subtracted from the content (long word) addressed by AR3 and \\
AR3 +1 and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

Subtraction

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([18]\) & \(\mathrm{ACx}=(\) Xmem \(\ll \# 16)-(\) Ymem \(\ll \# 16)\) & No & 3 & 1 & X \\
\hline
\end{tabular}
Opcode \(\quad|10000001|\) XXXM MMYY \(\mid\) YMMM 01DD

\section*{Operands ACx, Xmem, Ymem}

Description This instruction subtracts the content of data memory operand Ymem, shifted left 16 bits, from the content of data memory operand Xmem, shifted left 16 bits.
- The operation is performed on 40 bits in the D-unit ALU.
\(\square\) Input operands are sign extended to 40 bits according to SXMD.
\(\square\) The shift operation is equivalent to the signed shift instruction.
- Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit.
- When an overflow is detected, the accumulator is saturated according to SATD.

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When C54CM = 1, an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.
Status Bits Affected by C54CM, M40, SATD, SXMD
Affects ACOVx, CARRY
Repeat This instruction can be repeated.

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline AC0 = (*AR3 <<\#16) - (*AR4 <<\#16) & \begin{tabular}{l} 
The content addressed by AR4 shifted left by 16 bits is subtracted from the \\
content addressed by AR3 shifted left by 16 bits and the result is stored in AC0.
\end{tabular} \\
\hline
\end{tabular}

\section*{SUB::MOV}

Subtraction with Parallel Store Accumulator Content to Memory

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & \begin{tabular}{l} 
ACy = (Xmem <<\#16) - ACx, \\
Ymem = HI(ACy << T2)
\end{tabular} & No & 4 & 1 & X \\
\hline
\end{tabular}
Opcode \(\quad|10000111|\) XXXM MMYY \(\mid\) YMMM SSDD \(\mid\) 101x xxxx

\section*{Operands ACx, ACy, T2, Xmem, Ymem}

Description This instruction performs two operations in parallel: subtraction and store.

The first operation subtracts an accumulator content from the content of data memory operand Xmem shifted left by 16 bits.
- The operation is performed on 40 bits in the D-unit ALU.
- Input operands are sign extended to 40 bits according to SXMD.
\(\square\) The shift operation is equivalent to the signed shift instruction.
\(\square\) Overflow detection and CARRY status bit depends on M40. The subtraction borrow bit is reported in the CARRY status bit; the borrow bit is the logical complement of the CARRY status bit. When C54CM = 1 , an intermediary shift operation is performed as if M40 is locally set to 1 and no overflow detection, report, and saturation is done after the shifting operation.
- When an overflow is detected, the accumulator is saturated according to SATD.

The second operation shifts the accumulator ACy by the content of T2 and stores \(\mathrm{ACy}(31-16)\) to data memory operand Ymem. If the 16-bit value in T2 is not within -32 to +31 , the shift is saturated to -32 or +31 and the shift is performed with this value.
- The input operand is shifted in the D-unit shifter according to SXMD.
- After the shift, the high part of the accumulator, \(\mathrm{ACy}(31-16)\), is stored to the memory location.
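The store half of this instruction can be sketched in Python. The sketch assumes C54CM = 0 and SXMD = 1 (so right shifts are arithmetic), and it does not model saturation of the shifted value; the names `clamp_shift` and `store_hi_shifted` are hypothetical.

```python
def clamp_shift(t2):
    """Saturate the signed 16-bit content of T2 to the range -32..+31."""
    t2 &= 0xFFFF
    signed = t2 - 0x10000 if t2 & 0x8000 else t2
    return max(-32, min(31, signed))


def store_hi_shifted(acy, t2):
    """Sketch of Ymem = HI(ACy << T2): returns the 16-bit word stored to Ymem."""
    value = acy & ((1 << 40) - 1)
    if value & (1 << 39):
        value -= 1 << 40              # interpret the accumulator as signed 40-bit
    shift = clamp_shift(t2)
    shifted = value << shift if shift >= 0 else value >> -shift
    return (shifted >> 16) & 0xFFFF   # bits 31-16 of the shifted accumulator
```

A negative T2 content therefore acts as a right shift, and any T2 content outside -32 to +31 is replaced by the nearest bound before the shift is applied.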

\section*{Compatibility with C54x devices (C54CM = 1)}

When this instruction is executed with \(\mathrm{M} 40=0\), compatibility is ensured. When this instruction is executed with C54CM \(=1\), the 6 LSBs of T2 are used to determine the shift quantity. The 6 LSBs of T2 define a shift quantity within -32 to +31. When the 16-bit value in T2 is between -32 to -17, a modulo 16 operation transforms the shift quantity to within -16 to -1.


\section*{SWAP}

\section*{Swap Accumulator Content}

\section*{Syntax Characteristics}


\section*{SWAP}

Swap Accumulator Pair Content

\section*{Syntax Characteristics}


\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline swap(pair(AC0), pair(AC2)) & \begin{tabular}{l} 
The following two swap instructions are performed in parallel: the content of \\
AC0 is moved to AC2 and the content of AC2 is moved to AC0, and the content \\
of AC1 is moved to AC3 and the content of AC3 is moved to AC1.
\end{tabular} \\
\hline
\end{tabular}

\section*{SWAP}

Swap Auxiliary Register Content

\section*{Syntax Characteristics}


\section*{SWAPP}

Swap Auxiliary Register Pair Content

\section*{Syntax Characteristics}


\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline swap(pair(AR0), pair(AR2)) & \begin{tabular}{l} 
The following two swap instructions are performed in parallel: the content of \\
AR0 is moved to AR2 and the content of AR2 is moved to AR0, and the content \\
of AR1 is moved to AR3 and the content of AR3 is moved to AR1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & \multicolumn{3}{c}{ After } \\
AR0 & 0200 & AR0 & 6788 \\
AR1 & 0300 & AR1 & 0200 \\
AR2 & 6788 & AR2 & 0200 \\
AR3 & 0200 & AR3 & 0300
\end{tabular}

\section*{SWAP}

Swap Auxiliary and Temporary Register Content

\section*{Syntax Characteristics}

Example
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline swap(AR4, T0) & The content of AR4 is moved to T0 and the content of T0 is moved to AR4. \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & \multicolumn{3}{c}{ After } \\
T0 & 6500 & T0 & 0300 \\
AR4 & 0300 & AR4 & 6500
\end{tabular}

\section*{SWAPP}

Swap Auxiliary and Temporary Register Pair Content

\section*{Syntax Characteristics}


\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline swap(pair(AR4), pair(T0)) & \begin{tabular}{l} 
The following two swap instructions are performed in parallel: the content of \\
AR4 is moved to T0 and the content of T0 is moved to AR4, and the content \\
of AR5 is moved to T1 and the content of T1 is moved to AR5.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & \multicolumn{3}{c}{ After } \\
AR4 & 0200 & AR4 & 6788 \\
AR5 & 0300 & AR5 & 0200 \\
T0 & 6788 & T0 & 0200 \\
T1 & 0200 & T1 & 0300
\end{tabular}

\section*{SWAP4}

Swap Auxiliary and Temporary Register Pairs Content

\section*{Syntax Characteristics}


\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline swap(block(AR4), block(T0)) & \begin{tabular}{l} 
The following four swap instructions are performed in parallel: the content of \\
AR4 is moved to T0 and the content of T0 is moved to AR4, the content of AR5 \\
is moved to T1 and the content of T1 is moved to AR5, the content of AR6 is \\
moved to T2 and the content of T2 is moved to AR6, and the content of AR7 \\
is moved to T3 and the content of T3 is moved to AR7.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & \multicolumn{3}{c}{ After } \\
AR4 & 0200 & AR4 & 0030 \\
AR5 & 0300 & AR5 & 0200 \\
AR6 & 0240 & AR6 & 3400 \\
AR7 & 0400 & AR7 & 0FD3 \\
T0 & 0030 & T0 & 0200 \\
T1 & 0200 & T1 & 0300 \\
T2 & 3400 & T2 & 0240 \\
T3 & 0FD3 & T3 & 0400
\end{tabular}
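The block swap can be modeled as a set of independent register exchanges. The following Python sketch uses a dictionary as a stand-in for the register file (an illustrative assumption) and reproduces the Before and After values of the example.

```python
def swap_registers(regs, left, right):
    """Exchange each register in left with the matching register in right."""
    for a, b in zip(left, right):
        regs[a], regs[b] = regs[b], regs[a]
    return regs


regs = {
    "AR4": 0x0200, "AR5": 0x0300, "AR6": 0x0240, "AR7": 0x0400,
    "T0": 0x0030, "T1": 0x0200, "T2": 0x3400, "T3": 0x0FD3,
}
# swap(block(AR4), block(T0)) exchanges AR4..AR7 with T0..T3 pairwise.
swap_registers(regs, ["AR4", "AR5", "AR6", "AR7"], ["T0", "T1", "T2", "T3"])
```

Because each exchange touches a distinct pair of registers, performing them sequentially in the sketch gives the same result as the parallel hardware operation.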

\section*{SWAP}

Swap Temporary Register Content

\section*{Syntax Characteristics}


\section*{SWAPP}

Swap Temporary Register Pair Content

\section*{Syntax Characteristics}


\section*{Syntax Characteristics}


\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline TC1 = bit(T0, @\#12) & \begin{tabular}{l} 
The bit at the position defined by the register bit address (12) in T0 is tested and the \\
tested bit is copied into TC1.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{lrlr} 
Before & & After \\
T0 & FE00 & T0 & FE00 \\
TC1 & 0 & TC1 & 1
\end{tabular}

\section*{BTSTP}

Test Accumulator, Auxiliary, or Temporary Register Bit Pair

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & bit(src, pair(Baddr)) & No & 3 & 1 & X \\
\hline
\end{tabular}
Opcode \(\quad|11101100|\) AAAA AAAI \(\mid\) FSSS 010x

\section*{Operands Baddr, src}

Description This instruction performs a bit manipulation:
- In the D-unit ALU, if the source (src) register operand is an accumulator.
\(\square\) In the A-unit ALU, if the source (src) register operand is an auxiliary or temporary register.

The instruction tests two consecutive bits of the source register location as defined by the bit addressing mode, Baddr and Baddr +1 . The tested bits are copied into status bits TC1 and TC2:
- TC1 tests the bit that is defined by Baddr
- TC2 tests the bit defined by Baddr + 1

The generated bit address must be within:
- 0-38 when accessing accumulator bits (only the 6 LSBs of the generated bit address are used to determine the bit position). If the generated bit address is not within 0-38:
■ If the generated bit address is 39, bit 39 of the register is stored into TC1 and 0 is stored into TC2.
■ In all other cases, 0 is stored into TC1 and TC2.
- 0-14 when accessing auxiliary or temporary register bits (only the 4 LSBs of the generated address are used to determine the bit position). If the generated bit address is not within 0-14:
■ If the generated bit address is 15, bit 15 of the register is stored into TC1 and 0 is stored into TC2.
■ In all other cases, 0 is stored into TC1 and TC2.
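The bit-address rules above can be summarized in a short Python sketch. It is a behavioural model only; the function name `bit_pair` and the `width` parameter (40 for an accumulator, 16 for an auxiliary or temporary register) are illustrative assumptions.

```python
def bit_pair(value, bit_address, width):
    """Sketch of the bit-pair test: returns (TC1, TC2).

    width is 40 for accumulators (6 address LSBs used) or 16 for
    auxiliary/temporary registers (4 address LSBs used).
    """
    lsbs = 6 if width == 40 else 4
    addr = bit_address & ((1 << lsbs) - 1)  # only the LSBs select the bit
    top = width - 1                          # 39 or 15
    if addr < top:
        return (value >> addr) & 1, (value >> (addr + 1)) & 1
    if addr == top:
        return (value >> top) & 1, 0         # no bit above the top bit
    return 0, 0                              # reachable only for accumulators (40..63)
```

For a 16-bit register the masked address never exceeds 15, so the final branch applies only to accumulator accesses with a masked address of 40 to 63.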
\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & TC1, TC2
\end{tabular}

Repeat This instruction can be repeated.

See Also See the following other related instructions:
\(\square\) Clear Accumulator, Auxiliary, or Temporary Register Bit
\(\square\) Complement Accumulator, Auxiliary, or Temporary Register Bit
\(\square\) Set Accumulator, Auxiliary, or Temporary Register Bit
\(\square\) Test Accumulator, Auxiliary, or Temporary Register Bit
\(\square\) Test Memory Bit

\section*{Example}
\begin{tabular}{|l|l|}
\hline Syntax & Description \\
\hline bit(AC0, pair(AR1(T0))) & \begin{tabular}{l} 
The bit at the position defined by the content of AR1(T0) in AC0 is tested and the \\
tested bit is copied into TC1. The bit at the position defined by the content of \\
AR1(T0) +1 in AC0 is tested and the tested bit is copied into TC2.
\end{tabular} \\
\hline
\end{tabular}
\begin{tabular}{llll} 
Before & & After & \\
AC0 & E0 1234 0000 & AC0 & E0 1234 0000 \\
AR1 & 0026 & AR1 & 0026 \\
T0 & 0001 & T0 & 0001 \\
TC1 & 0 & TC1 & 1 \\
TC2 & 0 & TC2 & 0
\end{tabular}

\section*{BTST}

Test Memory Bit

\section*{Syntax Characteristics}
\begin{tabular}{clcccc}
\hline No. & Syntax & \begin{tabular}{c} 
Parallel \\
Enable Bit
\end{tabular} & Size & Cycles & Pipeline \\
\hline\([1]\) & TCx \(=\) bit(Smem, src) & No & 3 & 1 & X \\
{\([2]\)} & TCx \(=\) bit(Smem, k4) & No & 3 & 1 & X \\
\hline
\end{tabular}
Description These instructions perform a bit manipulation in the A-unit ALU. These instructions test a single bit of a memory (Smem) location. The bit tested is defined by either the content of the source (src) operand or a 4-bit immediate value, k4. The tested bit is copied into the selected TCx status bit.

For instruction [1], the generated bit address must be within 0-15 (only the 4 LSBs of the register are used to determine the bit position).
\begin{tabular}{lll} 
Status Bits & Affected by & none \\
& Affects & TCx
\end{tabular}

See Also See the following other related instructions:
\(\square\) Clear Memory Bit
\(\square\) Complement Memory Bit
\(\square\) Set Memory Bit
\(\square\) Test Accumulator, Auxiliary, or Temporary Register Bit
\(\square\) Test Accumulator, Auxiliary, or Temporary Register Bit Pair
\(\square\) Test and Clear Memory Bit
\(\square\) Test and Complement Memory Bit
\(\square\) Test and Set Memory Bit

\section*{Test Memory Bit}

\section*{Syntax Characteristics}


Test Memory Bit

\section*{Syntax Characteristics}


\section*{BTSTCLR Test and Clear Memory Bit}

\section*{Syntax Characteristics}


\section*{BTSTNOT Test and Complement Memory Bit}

\section*{Syntax Characteristics}


\section*{BTSTSET}

Test and Set Memory Bit

\section*{Syntax Characteristics}


\title{
Instruction Opcodes in Sequential Order
}

This chapter provides the opcode in sequential order for each TMS320C55x \({ }^{\text {TM }}\) DSP instruction syntax.

Topic
6.1 Instruction Set Opcodes
6.2 Instruction Set Opcode Symbols and Abbreviations

\subsection*{6.1 Instruction Set Opcodes}

Table 6-1 lists the opcodes of the instruction set. See Table 6-2 (page 6-19) for a list of the symbols and abbreviations used in the instruction set opcode. See Table 1-1 (page 1-2) and Table 1-2 (page 1-6) for a list of the terms, symbols, and abbreviations used in the algebraic syntax.
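As an illustration of how an opcode pattern in Table 6-1 maps onto bytes, the following Python sketch packs the fields of the pattern `0001000E DDSS0100 xxSHIFTW`, which the table lists for `ACy = ACy - (ACx << #SHIFTW)`. Several details are assumptions made for the sketch rather than statements from this chapter: DD and SS are taken to be the accumulator numbers (0 to 3), SHIFTW is stored as a 6-bit two's-complement value, and the `x` bits are written as 0. The function name is hypothetical.

```python
def encode_sub_shiftw(dst, src, shiftw, parallel_enable=0):
    """Pack the pattern 0001000E DDSS0100 xxSHIFTW into three bytes.

    Assumptions: dst/src are accumulator numbers 0..3, shiftw is in
    -32..+31 and stored in 6-bit two's complement, x bits are 0.
    """
    if not -32 <= shiftw <= 31:
        raise ValueError("SHIFTW must fit in 6 signed bits")
    byte0 = 0b00010000 | (parallel_enable & 1)            # 0001000E
    byte1 = ((dst & 3) << 6) | ((src & 3) << 4) | 0b0100  # DDSS0100
    byte2 = shiftw & 0x3F                                 # xxSHIFTW
    return bytes([byte0, byte1, byte2])
```

Under these assumptions, the instruction AC0 = AC0 - (AC1 << #31) would be packed as the three bytes 10h, 14h, 1Fh.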

Table 6-1. Instruction Set Opcodes
\begin{tabular}{|c|c|}
\hline Opcode & Algebraic syntax \\
\hline 0000000E xCCCCCCC kkkkkkkk & while (cond \&\& (RPTC < k8)) repeat \\
\hline 0000001E xCCCCCCC xxxxxxxx & if (cond) return \\
\hline 0000010E xCCCCCCC LLLLLLLL & if (cond) goto L8 \\
\hline 0000011E LLLLLLLL LLLLLLLL & goto L16 \\
\hline 0000100E LLLLLLLL LLLLLLLL & call L16 \\
\hline 0000110E kkkkkkkk kkkkkkkk & repeat(k16) \\
\hline 0000111E llllllll llllllll & blockrepeat\{\} \\
\hline 0001000E DDSS0000 xxSHIFTW & ACy = ACy \& (ACx \(\lll\) \#SHIFTW) \\
\hline 0001000E DDSS0001 xxSHIFTW & ACy = ACy | (ACx \(\lll\) \#SHIFTW) \\
\hline 0001000E DDSS0010 xxSHIFTW & ACy = ACy \({ }^{\wedge}\) (ACx \(\lll\) \#SHIFTW) \\
\hline 0001000E DDSS0011 xxSHIFTW & ACy = ACy + (ACx \(\ll\) \#SHIFTW) \\
\hline 0001000E DDSS0100 xxSHIFTW & ACy = ACy - (ACx \(\ll\) \#SHIFTW) \\
\hline 0001000E DDSS0101 xxSHIFTW & ACy = ACx \(\ll\) \#SHIFTW \\
\hline 0001000E DDSS0110 xxSHIFTW & ACy = ACx \(\ll\)C \#SHIFTW \\
\hline 0001000E DDSS0111 xxSHIFTW & ACy = ACx \(\lll\) \#SHIFTW \\
\hline 0001000E xxSS1000 xxddxxxx & Tx = exp(ACx) \\
\hline 0001000E DDSS1001 xxddxxxx & ACy = mant(ACx), Tx = -exp(ACx) \\
\hline 0001000E xxSS1010 SSddxxxt & Tx = count(ACx, ACy, TCx) \\
\hline 0001000E DDSS1100 SSDDnnnn & max\_diff(ACx,ACy,ACz,ACw) \\
\hline 0001000E DDSS1101 SSDDxxxr & max\_diff\_dbl(ACx,ACy,ACz,ACw,TRNx) \\
\hline 0001000E DDSS1110 SSDDxxxx & min\_diff(ACx,ACy,ACz,ACw) \\
\hline 0001000E DDSS1111 SSDDxxxr & min\_diff\_dbl(ACx,ACy,ACz,ACw,TRNx) \\
\hline 0001001E FSSScc00 FDDDxuxt & TCx = uns(src RELOP dst) \\
\hline 0001001E FSSScc01 FDDD0utt & TCx = TCy \& uns(src RELOP dst) \\
\hline 0001001E FSSScc01 FDDD1utt & TCx = !TCy \& uns(src RELOP dst) \\
\hline 0001001E FSSScc10 FDDD0utt & TCx = TCy | uns(src RELOP dst) \\
\hline 0001001E FSSScc10 FDDD1utt & TCx = !TCy | uns(src RELOP dst) \\
\hline 0001001E FSSSxx11 FDDD0xvv & dst = BitOut \textbackslash\textbackslash{} src \textbackslash\textbackslash{} BitIn \\
\hline
\end{tabular}
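Each opcode entry in Table 6-1 mixes fixed bits (0 and 1) with letter fields whose meaning is given in Table 6-2. As a rough illustration of how such a pattern can be matched against an instruction word, the following Python sketch turns a pattern string into a mask and value, then extracts each run of identical letters as one field. The helper names `compile_pattern` and `decode` are hypothetical and are not part of any TI tool. The sketch assumes that a field is a contiguous run of one repeated letter, so spelled-out fields such as SHIFTW or SHFT would need special handling, and it does not interpret what the letters mean.

```python
def compile_pattern(pattern):
    """Convert a Table 6-1 style pattern into (mask, value, runs).

    runs is a list of [letter, msb_position, width], one entry per
    contiguous run of the same letter (assumption: one run = one field).
    """
    bits = pattern.replace(" ", "")
    nbits = len(bits)
    mask = value = 0
    runs = []
    prev = None
    for i, ch in enumerate(bits):
        pos = nbits - 1 - i          # bit position, MSB first
        if ch in "01":               # fixed opcode bit
            mask |= 1 << pos
            value |= int(ch) << pos
            prev = None
        else:                        # letter field bit
            if ch == prev:
                runs[-1][2] += 1
            else:
                runs.append([ch, pos, 1])
            prev = ch
    return mask, value, runs


def decode(pattern, word):
    """Return [(letter, field_value), ...] if word matches, else None."""
    mask, value, runs = compile_pattern(pattern)
    if (word & mask) != value:
        return None
    fields = []
    for letter, msb, width in runs:
        lsb = msb - width + 1
        fields.append((letter, (word >> lsb) & ((1 << width) - 1)))
    return fields


# repeat(k16) row: fixed bits 0000110, then E, then a 16-bit k field.
print(decode("0000110E kkkkkkkk kkkkkkkk", 0x0C0010))  # [('E', 0), ('k', 16)]
```

Because the table is listed in sequential opcode order, a decoder built this way would typically try the patterns in turn and stop at the first match; that lookup strategy is an assumption of this sketch, not something the table prescribes.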

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|}
\hline Opcode & Algebraic syntax \\
\hline 0001001E FSSSxx11 FDDD1xvv & dst = BitIn // src // BitOut \\
\hline 0001010 E FSSSxxxx FDDD0000 & \(\operatorname{mar}(\mathrm{TAy}+\mathrm{TAx})\) \\
\hline 0001010E FSSSxxxx FDDD0001 & \(\operatorname{mar}(\mathrm{TAy}=\mathrm{TAx})\) \\
\hline 0001010E FSSSxxxx FDDD0010 & \(\operatorname{mar}(\mathrm{TAy}-\mathrm{TAx})\) \\
\hline 0001010E PPPPPPPP FDDD0100 & \(\operatorname{mar}(\mathrm{TAx}+\mathrm{P} 8)\) \\
\hline 0001010E PPPPPPPP FDDD0101 & \(\operatorname{mar}(\mathrm{TAx}=\mathrm{P} 8)\) \\
\hline 0001010 E PPPPPPPP FDDD0110 & \(\operatorname{mar}(\mathrm{TAx}-\mathrm{P} 8)\) \\
\hline 0001010E FSSSxxxx FDDD1000 & \(\operatorname{mar}(T A y+T A x)\) \\
\hline 0001010 E FSSSxxxx FDDD1001 & \(\operatorname{mar}(\mathrm{TAy}=\mathrm{TAx})\) \\
\hline 0001010 E FSSSxxxx FDDD1010 & \(\operatorname{mar}(\mathrm{TAy}-\mathrm{TAx})\) \\
\hline 0001010E PPPPPPPP FDDD1100 & \(\operatorname{mar}(\mathrm{TAx}+\mathrm{P} 8)\) \\
\hline 0001010 E PPPPPPPP FDDD1101 & \(\operatorname{mar}(\mathrm{TAx}=\mathrm{P} 8)\) \\
\hline 0001010 E PPPPPPPP FDDD1110 & \(\operatorname{mar}(\mathrm{TAx}-\mathrm{P} 8\) ) \\
\hline \begin{tabular}{l}
0001010E XACS0001 XACD0000 \\
(Note: for DAG_X)
\end{tabular} & mar(XACdst + XACsrc) \\
\hline \begin{tabular}{l}
0001010E XACS0001 XACD0001 \\
(Note: for DAG_X)
\end{tabular} & mar(XACdst \(=\) XACsrc \()\) \\
\hline \begin{tabular}{l}
0001010 E XACS0001 XACD0010 \\
(Note: for DAG_X)
\end{tabular} & mar(XACdst - XACsrc) \\
\hline \begin{tabular}{l}
0001010E XACS0001 XACD1000 \\
(Note: for DAG_Y)
\end{tabular} & mar(XACdst + XACsrc) \\
\hline \begin{tabular}{l}
0001010 E XACS0001 XACD1001 \\
(Note: for DAG_Y)
\end{tabular} & mar(XACdst \(=\) XACsrc \()\) \\
\hline \begin{tabular}{l}
0001010E XACS0001 XACD1010 \\
(Note: for DAG_Y)
\end{tabular} & mar(XACdst - XACsrc) \\
\hline 0001011 E xxxxxkkk kkkk0000 & DPH \(=\mathrm{k} 7\) \\
\hline 0001011 E xxxkkkkk kkkk0011 & PDP \(=\mathrm{k} 9\) \\
\hline 0001011 E kkkkkkkk kkkk0100 & BK03 \(=\mathrm{k} 12\) \\
\hline 0001011 E kkkkkkkk kkkk0101 & BK47 \(=\mathrm{k} 12\) \\
\hline 0001011 E kkkkkkkk kkkk0110 & \(B K C=k 12\) \\
\hline 0001011 E kkkkkkkk kkkk1000 & CSR \(=\mathrm{k} 12\) \\
\hline 0001011 E kkkkkkkk kkkk1001 & BRC0 \(=\mathrm{k} 12\) \\
\hline 0001011 E kkkkkkkk kkkk1010 & BRC1 \(=\mathrm{k} 12\) \\
\hline 0001100 E kkkkkkkk FDDDFSSS & \(\mathrm{dst}=\mathrm{src} \& \mathrm{k} 8\) \\
\hline 0001101 E kkkkkkkk FDDDFSSS & \(\mathrm{dst}=\mathrm{src} \mid \mathrm{k} 8\) \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|}
\hline Opcode & Algebraic syntax \\
\hline 0001110E kkkkkkkk FDDDFSSS & dst \(=\operatorname{src}^{\wedge} \mathrm{k} 8\) \\
\hline 0001111 E KKKKKKKK SSDDxx0\% & \(\mathrm{ACy}=\operatorname{rnd}(\mathrm{ACx} * \mathrm{~K} 8)\) \\
\hline 0001111 E KKKKKKKK SSDDss1\% & \(A C y=r n d(A C x+(T x * K 8))\) \\
\hline 0010000E & nop \\
\hline 0010001E FSSSFDDD & dst \(=\) src \\
\hline 0010010E FSSSFDDD & \(d s t=d s t+s r c\) \\
\hline 0010011 E FSSSFDDD & \(\mathrm{dst}=\mathrm{dst}-\mathrm{src}\) \\
\hline 0010100E FSSSFDDD & \(\mathrm{dst}=\mathrm{dst}\) \& src \\
\hline 0010101E FSSSFDDD & \(\mathrm{dst}=\mathrm{dst} \mid\) src \\
\hline 0010110 E FSSSFDDD & \(\mathrm{dst}=\mathrm{dst}{ }^{\wedge} \mathrm{src}\) \\
\hline 0010111 E FSSSFDDD & \(\mathrm{dst}=\max (\mathrm{src}, \mathrm{dst})\) \\
\hline 0011000E FSSSFDDD & \(d s t=\min (\mathrm{src}, \mathrm{dst})\) \\
\hline 0011001 E FSSSFDDD & \(d s t=|s r c|\) \\
\hline 0011010 E FSSSFDDD & \(\mathrm{dst}=-\mathrm{src}\) \\
\hline 0011011 E FSSSFDDD & dst \(=\sim\) src \\
\hline \begin{tabular}{l}
0011100E FSSSFDDD \\
(Note: FSSS = src1, FDDD = src2)
\end{tabular} & push(src1, src2) \\
\hline \begin{tabular}{l}
0011101E FSSSFDDD \\
(Note: FSSS = dst1, FDDD = dst2)
\end{tabular} & dst1, dst2 = pop() \\
\hline 0011110E kkkkFDDD & \(\mathrm{dst}=\mathrm{k} 4\) \\
\hline 0011111 E kkkkFDDD & \(d s t=-\mathrm{k} 4\) \\
\hline 0100000E kkkkFDDD & \(\mathrm{dst}=\mathrm{dst}+\mathrm{k} 4\) \\
\hline 0100001 E kkkkFDDD & \(\mathrm{dst}=\mathrm{dst}-\mathrm{k} 4\) \\
\hline 01000101 11110010 & lock() \\
\hline 0100010E 00SSFDDD & \(\mathrm{TAx}=\mathrm{HI}(\mathrm{ACx})\) \\
\hline 0100010E 01x0FDDD & dst = dst >> \#1 \\
\hline 0100010E 01x1FDDD & dst \(=\) dst \(\ll \# 1\) \\
\hline 0100010E 1000FDDD & TAx = SP \\
\hline 0100010E 1001FDDD & TAx = SSP \\
\hline 0100010E 1010FDDD & TAx = CDP \\
\hline 0100010E 1100FDDD & TAx \(=\) BRC0 \\
\hline 0100010 E 1101FDDD & TAx \(=\) BRC1 \\
\hline 0100010E 1110FDDD & TAx = RPTC \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|}
\hline Opcode & Algebraic syntax \\
\hline 0100011E kkkk0000 & \(\operatorname{bit}(\) ST0, k4) = \#0 \\
\hline 0100011 E kkkk0001 & \(\operatorname{bit}(\) ST0, k4) = \#1 \\
\hline 0100011 E kkkk0010 & \(\operatorname{bit}(\mathrm{ST} 1, \mathrm{k} 4)=\) \#0 \\
\hline 0100011 E kkkk0011 & \(\operatorname{bit}(\mathrm{ST} 1, \mathrm{k} 4)=\) \#1 \\
\hline 0100011 E kkkk0100 & bit(ST2, k4) = \#0 \\
\hline 0100011 E kkkk0101 & \(\operatorname{bit}(\mathrm{ST} 2, \mathrm{k} 4)=\) \#1 \\
\hline 0100011 E kkkk0110 & \(\operatorname{bit}(\mathrm{ST} 3, \mathrm{k} 4)=\) \#0 \\
\hline 0100011 E kkkk0111 & \(\operatorname{bit}(\mathrm{ST3}, \mathrm{k} 4)=\) \#1 \\
\hline 0100100E xxxxx000 & repeat(CSR) \\
\hline 0100100E FSSSx001 & repeat(CSR), CSR += TAx \\
\hline 0100100E kkkkx010 & repeat(CSR), CSR \(+=\mathrm{k} 4\) \\
\hline 0100100E kkkkx011 & repeat(CSR), CSR -= k4 \\
\hline 0100100E xxxxx100 & return \\
\hline 0100100E xxxxx101 & return_int \\
\hline 0100101E 0LLLLLLL & goto L7 \\
\hline 0100101E 1lllllll & localrepeat \{\} \\
\hline 0100110E kkkkkkkk & repeat(k8) \\
\hline 0100111 E KKKKKKKK & \(\mathrm{SP}=\mathrm{SP}+\mathrm{K} 8\) \\
\hline 0101000E FDDDx000 & dst = dst <<< \#1 \\
\hline 0101000E FDDDx001 & dst = dst >>> \#1 \\
\hline 0101000E FDDDx010 & \(\mathrm{dst}=\mathrm{pop}()\) \\
\hline 0101000E xxDDx011 & \(A C x=d b l(p o p())\) \\
\hline 0101000E FSSSx110 & push(src) \\
\hline 0101000E xxSSx111 & dbl(push(ACx)) \\
\hline 0101000E XDDD0100 & xdst \(=\) popboth() \\
\hline 0101000E XSSS0101 & pshboth(xsrc) \\
\hline 0101001E FSSS00DD & HI(ACx) = TAx \\
\hline 0101001 E FSSS1000 & \(S P=T A x\) \\
\hline 0101001 E FSSS1001 & SSP = TAx \\
\hline 0101001E FSSS1010 & CDP \(=\) TAx \\
\hline 0101001 E FSSS1100 & CSR = TAx \\
\hline 0101001 E FSSS1101 & \(B R C 1=T A x\) \\
\hline 0101001 E FSSS1110 & \(B R C 0=T A x\) \\
\hline
\end{tabular}
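The same bit patterns can be read in the other direction, to build an instruction word from field values. The following Python sketch fills the letter fields of a pattern from left to right with caller-supplied integers. The function name `encode` is hypothetical, the caller is assumed to know what each field means (see Table 6-2), and a field is again assumed to be a contiguous run of one repeated letter.

```python
from itertools import groupby


def encode(pattern, values):
    """Build (word, bit_length) from a Table 6-1 style pattern.

    values supplies one integer per letter run, in left-to-right order.
    """
    bits = pattern.replace(" ", "")
    pending = list(values)
    word = 0
    for ch, group in groupby(bits):
        width = len(list(group))
        if ch in "01":
            # Run of identical fixed bits, copied as-is.
            field = int(ch * width, 2)
        else:
            if not pending:
                raise ValueError(f"no value supplied for field '{ch}'")
            field = pending.pop(0)
            if not 0 <= field < (1 << width):
                raise ValueError(
                    f"value {field} does not fit in {width}-bit field '{ch}'"
                )
        word = (word << width) | field
    if pending:
        raise ValueError("more values supplied than letter fields")
    return word, len(bits)


# bit(ST1, k4) = #1 row, with E = 0 and k4 = 11 chosen for illustration.
word, nbits = encode("0100011E kkkk0011", [0, 11])
print(f"{word:0{nbits}b}")  # 0100011010110011
```

Range checking each value against its field width catches constants that the listed encoding cannot represent, for example a k4 value above 15.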

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|}
\hline Opcode & Algebraic syntax \\
\hline 0101010E DDSS000\% & ACy \(=\operatorname{rnd}(\mathrm{ACy}+|\mathrm{ACx}|)\) \\
\hline 0101010E DDSS001\% & \(A C y=\operatorname{rnd}(A C y+(A C x * A C x))\) \\
\hline 0101010E DDSS010\% & \(A C y=\operatorname{rnd}(A C y-(A C x * A C x))\) \\
\hline 0101010E DDSS011\% &  \\
\hline 0101010E DDSS100\% & \(A C y=r n d(A C x * A C x)\) \\
\hline 0101010E DDSS101\% & \(A C y=\operatorname{rnd}(A C x)\) \\
\hline 0101010E DDSS110\% & ACy = saturate \((\mathrm{rnd}(\mathrm{ACx})\) ) \\
\hline 0101011E DDSSss0\% & \(A C y=\operatorname{rnd}(A C y+(A C x * T x))\) \\
\hline 0101011E DDSSss1\% & \(A C y=\operatorname{rnd}(A C y-(A C x * T x))\) \\
\hline 0101100E DDSSss0\% & \(A C y=r n d(A C x * T x)\) \\
\hline 0101100E DDSSss1\% & \(A C y=r n d((A C y * T x)+A C x)\) \\
\hline 0101101E DDSSss00 & \(A C y=A C y+(A C x \ll T x)\) \\
\hline 0101101E DDSSss01 & \(A C y=A C y-(A C x \ll T x)\) \\
\hline 0101101E DDxxxx1t & ACx \(=\operatorname{sftc}(\mathrm{ACx}, \mathrm{TCx})\) \\
\hline 0101110E DDSSss00 & \(A C y=A C x \lll T x\) \\
\hline 0101110E DDSSss01 & \(A C y=A C x \ll T x\) \\
\hline 0101110E DDSSss10 & ACy = ACx <<C Tx \\
\hline 0101111E 00kkkkkk & swap() \\
\hline 01100lll lCCCCCCC & if (cond) goto l4 \\
\hline 01101000 xCCCCCCC PPPPPPPP PPPPPPPP PPPPPPPP & if (cond) goto P24 \\
\hline 01101001 xCCCCCCC PPPPPPPP PPPPPPPP PPPPPPPP & if (cond) call P24 \\
\hline 01101010 PPPPPPPP PPPPPPPP PPPPPPPP & goto P24 \\
\hline 01101100 PPPPPPPP PPPPPPPP PPPPPPPP & call P24 \\
\hline 01101101 xCCCCCCC LLLLLLLL LLLLLLLL & if (cond) goto L16 \\
\hline 01101110 xCCCCCCC LLLLLLLL LLLLLLLL & if (cond) call L16 \\
\hline 01101111 FSSSccxu KKKKKKKK LLLLLLLL & compare (uns(src RELOP K8)) goto L8 \\
\hline 01110000 KKKKKKKK KKKKKKKK SSDDSHFT & ACy \(=\) ACx \(+(\mathrm{K} 16 \ll \# S H F T)\) \\
\hline 01110001 KKKKKKKK KKKKKKKK SSDDSHFT & ACy \(=\) ACx \(-(\mathrm{K} 16 \ll \# S H F T)\) \\
\hline 01110010 kkkkkkkk kkkkkkkk SSDDSHFT & ACy \(=\) ACx \& (k16 \(\lll\) \#SHFT \()\) \\
\hline 01110011 kkkkkkkk kkkkkkkk SSDDSHFT & ACy \(=\) ACx | (k16 <<< \#SHFT) \\
\hline 01110100 kkkkkkkk kkkkkkkk SSDDSHFT & \(A C y=A C x \wedge(k 16 \lll \# S H F T)\) \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|c|}
\hline & Opcode & Algebraic syntax \\
\hline 01110101 & KKKKKKKK KKKKKKKK xxDDSHFT & ACx \(=\) K16 \(\ll\) \#SHFT \\
\hline 01110110 & kkkkkkkk kkkkkkkk FDDD00SS & dst = field_extract(ACx,k16) \\
\hline 01110110 & kkkkkkkk kkkkkkkk FDDD01SS & dst \(=\) field_expand(ACx,k16) \\
\hline 01110110 & KKKKKKKK KKKKKKKK FDDD10xx & \(d s t=\mathrm{K} 16\) \\
\hline 01110111 &  & \(\operatorname{mar}(\mathrm{TAx}=\mathrm{D} 16)\) \\
\hline 01111000 & kkkkkkkk kkkkkkkk xxx0000x & \(D P=k 16\) \\
\hline 01111000 & kkkkkkkk kkkkkkkk xxx0001x & SSP = k16 \\
\hline 01111000 & kkkkkkkk kkkkkkkk xxx0010x & \(C D P=k 16\) \\
\hline 01111000 & kkkkkkkk kkkkkkkk xxx0011x & BSA01 = k16 \\
\hline 01111000 & kkkkkkkk kkkkkkkk xxx0100x & BSA23 \(=\mathrm{k} 16\) \\
\hline 01111000 & kkkkkkkk kkkkkkkk xxx0101x & \(B S A 45=k 16\) \\
\hline 01111000 & kkkkkkkk kkkkkkkk xxx0110x & BSA67 \(=k 16\) \\
\hline 01111000 & kkkkkkkk kkkkkkkk xxx0111x & \(B S A C=k 16\) \\
\hline 01111000 & kkkkkkkk kkkkkkkk xxx1000x & \(\mathrm{SP}=\mathrm{k} 16\) \\
\hline 01111001 & KKKKKKKK KKKKKKKK SSDDxx0\% & \(A C y=r n d(A C x * K 16)\) \\
\hline 01111001 & KKKKKKKK KKKKKKKK SSDDss1\% & \(A C y=r n d(A C x+(T x * K 16))\) \\
\hline 01111010 & KKKKKKKK KKKKKKKK SSDD000x & \(A C y=A C x+(K 16 \ll \# 16)\) \\
\hline 01111010 & KKKKKKKK KKKKKKKK SSDD001x & ACy \(=\) ACx - (K16 <<\#16) \\
\hline 01111010 & kkkkkkkk kkkkkkkk SSDD010x & \(A C y=A C x \&(k 16 \lll \# 16)\) \\
\hline 01111010 & kkkkkkkk kkkkkkkk SSDD011x & \(A C y=A C x \mid(k 16 \lll \# 16)\) \\
\hline 01111010 & kkkkkkkk kkkkkkkk SSDD100x & \(A C y=A C x \wedge(k 16 \lll \# 16)\) \\
\hline 01111010 & KKKKKKKK KKKKKKKK xxDD101x & \(A C x=K 16 \ll \# 16\) \\
\hline 01111010 & xxxxxxxx xxxxxxxx xxxx110x & idle \\
\hline 01111011 & KKKKKKKK KKKKKKKK FDDDFSSS & \(\mathrm{dst}=\mathrm{src}+\mathrm{K} 16\) \\
\hline 01111100 & KKKKKKKK KKKKKKKK FDDDFSSS & dst \(=\) src -K 16 \\
\hline 01111101 & kkkkkkkk kkkkkkkk FDDDFSSS & \(\mathrm{dst}=\) src \& k16 \\
\hline 01111110 & kkkkkkkk kkkkkkkk FDDDFSSS & \(\mathrm{dst}=\mathrm{src} \mid \mathrm{k} 16\) \\
\hline 01111111 & kkkkkkkk kkkkkkkk FDDDFSSS & \(\mathrm{dst}=\mathrm{src}^{\wedge} \mathrm{k} 16\) \\
\hline 10000000 & XXXMMMYY YMMM00xx & \(\mathrm{dbl}(\) Ymem \()=\mathrm{dbl}(\) Xmem \()\) \\
\hline 10000000 & XXXMMMYY YMMM01xx & Ymem = Xmem \\
\hline 10000000 & XXXMMMYY YMMM10SS & \[
\begin{aligned}
& \text { Xmem }=\mathrm{LO}(\mathrm{ACx}), \\
& \text { Ymem }=\mathrm{HI}(\mathrm{ACx})
\end{aligned}
\] \\
\hline 10000001 & XXXMMMYY YMMM00DD & \(A C x=(\) Xmem \(\ll \# 16)+(\) Ymem \(\ll \# 16)\) \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|c|}
\hline & Opcode & Algebraic syntax \\
\hline 10000001 & XXXMMMYY YMMM01DD & ACx \(=(\) Xmem \(\ll \# 16)-(\) Ymem \(\ll \# 16)\) \\
\hline 10000001 & XXXMMMYY YMMM10DD & \[
\begin{aligned}
& \mathrm{LO}(\mathrm{ACx})=\text { Xmem }, \\
& \mathrm{HI}(\mathrm{ACx})=\text { Ymem }
\end{aligned}
\] \\
\hline 10000010 & XXXMMMYY YMMM00mm uuDDDDg\% & \begin{tabular}{l}
ACx = M40(rnd(uns(Xmem) * uns(coef(Cmem)))), \\
ACy = M40(rnd(uns(Ymem) * uns(coef(Cmem))))
\end{tabular} \\
\hline 10000010 & XXXMMMYY YMMM01mm uuDDDDg\% & \begin{tabular}{l}
ACx = M40(rnd(ACx + (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd(uns(Ymem) * uns(coef(Cmem))))
\end{tabular} \\
\hline 10000010 & XXXMMMYY YMMM10mm uuDDDDg\% & \begin{tabular}{l}
ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd(uns(Ymem) * uns(coef(Cmem))))
\end{tabular} \\
\hline 10000010 & XXXMMMYY YMMM11mm uuxxDDg\% & \begin{tabular}{l}
mar(Xmem), \\
ACx = M40(rnd(uns(Ymem) * uns(coef(Cmem))))
\end{tabular} \\
\hline 10000011 & XXXMMMYY YMMM00mm uuDDDDg\% & \begin{tabular}{l}
ACx = M40(rnd(ACx + (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} \\
\hline 10000011 & XXXMMMYY YMMM01mm uuDDDDg\% & \begin{tabular}{l}
ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} \\
\hline 10000011 & XXXMMMYY YMMM10mm uuDDDDg\% & \begin{tabular}{l}
ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} \\
\hline 10000011 & XXXMMMYY YMMM11mm uuxxDDg\% & \begin{tabular}{l}
mar(Xmem), \\
ACx = M40(rnd(ACx + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} \\
\hline 10000100 & XXXMMMYY YMMM00mm uuDDDDg\% & \begin{tabular}{l}
ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} \\
\hline 10000100 & XXXMMMYY YMMM01mm uuxxDDg\% & \begin{tabular}{l}
mar(Xmem), \\
ACx = M40(rnd((ACx >> \#16) + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} \\
\hline 10000100 & XXXMMMYY YMMM10mm uuDDDDg\% & \begin{tabular}{l}
ACx = M40(rnd(uns(Xmem) * uns(coef(Cmem)))), \\
ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} \\
\hline 10000100 & XXXMMMYY YMMM11mm uuDDDDg\% & \begin{tabular}{l}
ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} \\
\hline 10000101 & XXXMMMYY YMMM00mm uuxxDDg\% & \begin{tabular}{l}
mar(Xmem), \\
ACx = M40(rnd(ACx - (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} \\
\hline 10000101 & XXXMMMYY YMMM01mm uuDDDDg\% & \begin{tabular}{l}
ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd(ACy - (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} \\
\hline 10000101 & XXXMMMYY YMMM10mm xxxxxxxx & mar(Xmem), mar(Ymem), mar(coef(Cmem)) \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|c|c|c|}
\hline \multicolumn{4}{|c|}{Opcode} & Algebraic syntax \\
\hline 10000101 & XXXMMMYY & YMMM11mm & DDx0DDU\% & firs(Xmem, Ymem, coef(Cmem), ACx, ACy) \\
\hline 10000101 & XXXMMMYY & YMMM11mm & DDx1DDU\% & firsn(Xmem, Ymem, coef(Cmem), ACx, ACy) \\
\hline 10000110 & XXXMMMYY & YMMMxxDD & 000guuU\% & \[
\begin{aligned}
& \mathrm{ACx}=\mathrm{M} 40(\text { rnd }(\text { uns }(\text { Xmem }) * \text { uns }(\text { Ymem }))) \\
& {[, \mathrm{T} 3=\text { Xmem }]}
\end{aligned}
\] \\
\hline 10000110 & XXXMMMYY & YMMMSSDD & 001guuU\% & \[
\begin{aligned}
& \text { ACy }=\text { M40 (rnd }(\text { ACx }+(\text { uns }(\text { Xmem }) * \text { uns }(\text { Ymem })))) \\
& {[, T 3=\text { Xmem }]}
\end{aligned}
\] \\
\hline 10000110 & XXXMMMYY & YMMMSSDD & 010guuU\% & \[
\begin{aligned}
& A C y=M 40(\operatorname{rnd}((\text { ACx } \gg \# 16)+(\text { uns }(\text { Xmem }) * \text { uns }(\text { Ymem })))) \\
& {[, T 3=\text { Xmem }]}
\end{aligned}
\] \\
\hline 10000110 & XXXMMMYY & YMMMSSDD & 011guuU\% & \[
\begin{aligned}
& \text { ACy } \left.=\text { M40 (rnd }\left(\text { ACx }-\left(\text { uns }(\text { Xmem })^{*} \text { uns }(\text { Ymem })\right)\right)\right) \\
& {[, \text { T3 }=\text { Xmem }]}
\end{aligned}
\] \\
\hline 10000110 & XXXMMMYY & YMMMDDDD & 100xssU\% & \[
\begin{aligned}
& A C x=\operatorname{rnd}\left(A C x-\left(\text { Tx }{ }^{*} \text { Xmem }\right)\right), \\
& A C y=\text { Ymem } \ll \# 16[, T 3=\text { Xmem }]
\end{aligned}
\] \\
\hline 10000110 & XXXMMMYY & YMMMDDDD & 101xssU\% & \[
\begin{aligned}
& A C x=\operatorname{rnd}(A C x+(\text { Tx * Xmem })), \\
& A C y=\text { Ymem } \ll \# 16[, T 3=\text { Xmem }]
\end{aligned}
\] \\
\hline 10000110 & XXXMMMYY & YMMMDDDD & 110xxxx\% & lms(Xmem, Ymem, ACx, ACy) \\
\hline 10000110 & XXXMMMYY & YMMMDDDD & 1110xxn\% & sqdst(Xmem, Ymem, ACx, ACy) \\
\hline 10000110 & XXXMMMYY & YMMMDDDD & 1111xxn\% & abdst(Xmem, Ymem, ACx, ACy) \\
\hline 10000111 & XXXMMMYY & YMMMSSDD & 000xssU\% & \[
\begin{aligned}
& \mathrm{ACy}=\operatorname{rnd}(\mathrm{Tx} * \text { Xmem }), \\
& \text { Ymem }=\mathrm{HI}(\mathrm{ACx} \ll \mathrm{~T} 2)[, \mathrm{T} 3=\text { Xmem }]
\end{aligned}
\] \\
\hline 10000111 & XXXMMMYY & YMMMSSDD & 001xssU\% & \[
\begin{aligned}
& \mathrm{ACy}=\operatorname{rnd}(\mathrm{ACy}+(\mathrm{Tx} * \text { Xmem })), \\
& \text { Ymem }=\mathrm{HI}(\mathrm{ACx} \ll \mathrm{~T} 2)[, \mathrm{T} 3=\text { Xmem }]
\end{aligned}
\] \\
\hline 10000111 & XXXMMMYY & YMMMSSDD & 010xssU\% & \[
\begin{aligned}
& \mathrm{ACy}=\operatorname{rnd}(\mathrm{ACy}-(\mathrm{Tx} * \text { Xmem })), \\
& \text { Ymem }=\mathrm{HI}(\mathrm{ACx} \ll \mathrm{~T} 2)[, \mathrm{T} 3=\text { Xmem }]
\end{aligned}
\] \\
\hline 10000111 & XXXMMMYY & YMMMSSDD & 01100001 & lmsf(Xmem, Ymem, ACx, ACy) \\
\hline 10000111 & XXXMMMYY & YMMMSSDD & 100xxxxx & \[
\begin{aligned}
& \text { ACy }=\mathrm{ACx}+(\text { Xmem } \ll \# 16), \\
& \text { Ymem }=\mathrm{HI}(\mathrm{ACy} \ll \text { T2 })
\end{aligned}
\] \\
\hline 10000111 & XXXMMMYY & YMMMSSDD & 101xxxxx & \[
\begin{aligned}
& \text { ACy }=(\text { Xmem } \ll \# 16)-A C x, \\
& \text { Ymem }=\mathrm{HI}(A C y ~ \ll ~ T 2)
\end{aligned}
\] \\
\hline 10000111 & XXXMMMYY & YMMMSSDD & 110xxxxx & \[
\begin{aligned}
& \text { ACy }=\text { Xmem } \ll \# 16, \\
& \text { Ymem }=\mathrm{HI}(\mathrm{ACx} \ll \text { T2) }
\end{aligned}
\] \\
\hline 10010000 & XSSSXDDD & & & xdst \(=\) xsrc \\
\hline 10010001 & xxxxxxSS & & & goto ACx \\
\hline 10010010 & XXXMMMYY & YMMM00mm & uuDDDDg\% & \[
\begin{aligned}
& \text { ACy }=\mathrm{M} 40\left(\operatorname{rnd}\left(\text { uns }(\text { Ymem })^{*} \text { uns }(\mathrm{HI}(\operatorname{coef}(\text { Cmem })))\right)\right), \\
& \text { ACx }=M 40\left(\operatorname{rnd}\left(\text { uns }(\text { Xmem })^{*} \text { uns }(\text { LO }(\operatorname{coef}(\text { Cmem })))\right)\right)
\end{aligned}
\] \\
\hline 10010010 & XXXMMMYY & YMMM01mm & uuDDDDg\% & \[
\begin{aligned}
& \text { ACy }=\text { M } 40\left(\text { rnd }\left(\text { uns }(\text { Ymem })^{*} \text { uns }(\text { HI }(\operatorname{coef}(\text { Cmem })))\right)\right), \\
& \text { ACx }=\text { M40(rnd }\left(\text { ACx }+ \text { uns }(\text { Xmem })^{*}\right. \\
& \text { uns }(\text { LO }(\operatorname{coef}(\text { Cmem })))))
\end{aligned}
\] \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|c|}
\hline & Opcode & Algebraic syntax \\
\hline 10010010 & XXXMMMYY YMMM10mm uuDDDDg\% & \begin{tabular}{l}
ACy = M40(rnd(uns(Ymem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(ACx - uns(Xmem) * uns(LO(coef(Cmem)))))
\end{tabular} \\
\hline 10010010 & xxxxxxSS & call ACx \\
\hline 10010011 & XXXMMMYY YMMM00mm uuDDDDg\% & \begin{tabular}{l}
ACy = M40(rnd(ACy + uns(Ymem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(ACx + uns(Xmem) * uns(LO(coef(Cmem)))))
\end{tabular} \\
\hline 10010011 & XXXMMMYY YMMM01mm uuDDDDg\% & \begin{tabular}{l}
ACy = M40(rnd(ACy + uns(Ymem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(ACx - uns(Xmem) * uns(LO(coef(Cmem)))))
\end{tabular} \\
\hline 10010011 & XXXMMMYY YMMM10mm uuDDDDg\% & \begin{tabular}{l}
ACy = M40(rnd(ACy + (uns(Ymem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(LO(coef(Cmem))))))
\end{tabular} \\
\hline 10010011 & XXXMMMYY YMMM11mm uuDDDDg\% & \begin{tabular}{l}
ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(LO(coef(Cmem))))))
\end{tabular} \\
\hline 10010100 & XXXMMMYY YMMM00mm uuDDDDg\% & \begin{tabular}{l}
ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx - (uns(Xmem) * uns(LO(coef(Cmem))))))
\end{tabular} \\
\hline 10010100 & XXXMMMYY YMMM10mm uuDDDDg\% & \begin{tabular}{l}
ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(uns(Xmem) * uns(LO(coef(Cmem)))))
\end{tabular} \\
\hline 10010100 & xxxxxxxx & reset \\
\hline 10010101 & 0xxkkkkk & intr(k5) \\
\hline 10010101 & 1xxkkkkk & trap(k5) \\
\hline 10010101 & XXXMMMYY YMMM01mm uuDDDDg\% & \begin{tabular}{l}
ACy = M40(rnd(ACy - (uns(Ymem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx - (uns(Xmem) * uns(LO(coef(Cmem))))))
\end{tabular} \\
\hline 10010110 & 0CCCCCCC & if (cond) execute(AD_unit) \\
\hline 10010110 & 1CCCCCCC & if (cond) execute(D_unit) \\
\hline 10011000 & & mmap() \\
\hline 10011001 & & readport() \\
\hline 10011010 & & writeport() \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|}
\hline Opcode & Algebraic syntax \\
\hline 10011100 & linear() \\
\hline 10011101 & circular() \\
\hline 10011110 0CCCCCCC & if (cond) execute(AD_unit) \\
\hline 10011110 1CCCCCCC & if (cond) execute(D_unit) \\
\hline 10011111 0CCCCCCC & if (cond) execute(AD_unit) \\
\hline 10011111 1CCCCCCC & if (cond) execute(D_unit) \\
\hline 1010FDDD AAAAAAAI & dst \(=\) Smem \\
\hline 101100DD AAAAAAAI & ACx = Smem <<\#16 \\
\hline 10110100 AAAAAAAI & mar(Smem) \\
\hline 10110101 AAAAAAAI & push(Smem) \\
\hline 10110110 AAAAAAAI & delay(Smem) \\
\hline 10110111 AAAAAAAI & push(dbl(Lmem)) \\
\hline 10111000 AAAAAAAI & \(\mathrm{dbl}(\) Lmem \()=\operatorname{pop}()\) \\
\hline 10111011 AAAAAAAI & Smem = pop() \\
\hline 101111 SS AAAAAAAI & Smem \(=\mathrm{HI}(\mathrm{ACx})\) \\
\hline 1100FSSS AAAAAAAI & Smem = src \\
\hline 11010000 AAAAAAAI 0\%DD01mm & ACx = rnd(Smem * uns(coef(Cmem))) \\
\hline 11010000 AAAAAAAI 0\%DD10mm & ACx = rnd(ACx + (Smem * uns(coef(Cmem)))) \\
\hline 11010000 AAAAAAAI 0\%DD11mm & ACx = rnd(ACx - (Smem * uns(coef(Cmem)))) \\
\hline 11010000 AAAAAAAI U\%DDxxmm & ACx = rnd(ACx + (Smem * coef(Cmem)))[,T3 = Smem], delay(Smem) \\
\hline 11010001 AAAAAAAI U\%DD00mm & ACx = rnd(Smem * coef(Cmem))[,T3 = Smem] \\
\hline 11010001 AAAAAAAI U\%DD01mm & ACx = rnd(ACx + (Smem * coef(Cmem)))[,T3 = Smem] \\
\hline 11010001 AAAAAAAI U\%DD10mm & ACx = rnd(ACx - (Smem * coef(Cmem)))[,T3 = Smem] \\
\hline 11010010 AAAAAAAI U\%DD00SS & ACy = rnd(ACy + (Smem * ACx))[,T3 = Smem] \\
\hline 11010010 AAAAAAAI U\%DD01SS & ACy = rnd(ACy - (Smem * ACx))[,T3 = Smem] \\
\hline 11010010 AAAAAAAI U\%DD10SS & ACy = rnd(ACx + (Smem * Smem))[,T3 = Smem] \\
\hline 11010010 AAAAAAAI U\%DD11SS & ACy = rnd(ACx - (Smem * Smem))[,T3 = Smem] \\
\hline 11010011 AAAAAAAI U\%DD00SS & ACy = rnd(Smem * ACx)[,T3 = Smem] \\
\hline 11010011 AAAAAAAI U\%DD10xx & ACx = rnd(Smem * Smem)[,T3 = Smem] \\
\hline 11010011 AAAAAAAI U\%DDu1ss & ACx = rnd(uns(Tx * Smem))[,T3 = Smem] \\
\hline 11010100 AAAAAAAI U\%DDssSS & ACy = rnd(ACx + (Tx * Smem))[,T3 = Smem] \\
\hline 11010101 AAAAAAAI U\%DDssSS & ACy = rnd(ACx - (Tx * Smem))[,T3 = Smem] \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|}
\hline Opcode & Algebraic syntax \\
\hline 11010110 AAAAAAAI FDDDFSSS & \(\mathrm{dst}=\mathrm{src}+\) Smem \\
\hline 11010111 AAAAAAAI FDDDFSSS & \(\mathrm{dst}=\mathrm{src}-\mathrm{Smem}\) \\
\hline 11011000 AAAAAAAI FDDDFSSS & dst \(=\) Smem - src \\
\hline 11011001 AAAAAAAI FDDDFSSS & dst \(=\) src \& Smem \\
\hline 11011010 AAAAAAAI FDDDFSSS & dst \(=\) src \(\mid\) Smem \\
\hline 11011011 AAAAAAAI FDDDFSSS & dst \(=\operatorname{src}^{\wedge}\) Smem \\
\hline 11011100 AAAAAAAI kkkkxx00 & TC1 = bit(Smem, k4) \\
\hline 11011100 AAAAAAAI kkkkxx01 & TC2 \(=\) bit(Smem, k4) \\
\hline 11011100 AAAAAAAI 0000xx10 & DP = Smem \\
\hline 11011100 AAAAAAAI 0001xx10 & CDP = Smem \\
\hline 11011100 AAAAAAAI 0010xx10 & BSA01 = Smem \\
\hline 11011100 AAAAAAAI 0011xx10 & BSA23 = Smem \\
\hline 11011100 AAAAAAAI 0100xx10 & BSA45 = Smem \\
\hline 11011100 AAAAAAAI 0101xx10 & BSA67 = Smem \\
\hline 11011100 AAAAAAAI 0110xx10 & BSAC = Smem \\
\hline 11011100 AAAAAAAI 0111xx10 & SP = Smem \\
\hline 11011100 AAAAAAAI 1000xx10 & SSP = Smem \\
\hline 11011100 AAAAAAAI 1001xx10 & BK03 = Smem \\
\hline 11011100 AAAAAAAI 1010xx10 & BK47 = Smem \\
\hline 11011100 AAAAAAAI 1011xx10 & BKC = Smem \\
\hline 11011100 AAAAAAAI 1100xx10 & DPH = Smem \\
\hline 11011100 AAAAAAAI 1111xx10 & PDP = Smem \\
\hline 11011100 AAAAAAAI x000xx11 & CSR = Smem \\
\hline 11011100 AAAAAAAI x001xx11 & BRC0 \(=\) Smem \\
\hline 11011100 AAAAAAAI x010xx11 & BRC1 \(=\) Smem \\
\hline 11011100 AAAAAAAI x011xx11 & TRN0 = Smem \\
\hline 11011100 AAAAAAAI x100xx11 & TRN1 = Smem \\
\hline 11011101 AAAAAAAI SSDDss00 & \(A C y=A C x+(\) Smem \(\ll\) Tx \()\) \\
\hline 11011101 AAAAAAAI SSDDss01 & ACy \(=\) ACx \(-(\) Smem \(\ll\) Tx) \\
\hline 11011101 AAAAAAAI SSDDss10 & ACy \(=\) ads2c(Smem, ACx, Tx, TC1, TC2) \\
\hline 11011101 AAAAAAAI x\%DDss11 & ACx \(=\) rnd (Smem \(\ll\) Tx) \\
\hline 11011110 AAAAAAAI SSDD0000 & ACy \(=\operatorname{adsc}\) (Smem, ACx, TC1) \\
\hline 11011110 AAAAAAAI SSDD0001 & ACy \(=\operatorname{adsc}\) (Smem, ACx, TC2) \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|c|}
\hline & Opcode & Algebraic syntax \\
\hline 11011110 & AAAAAAAI SSDD0010 & ACy \(=\operatorname{adsc}\) (Smem, ACx, TC1, TC2) \\
\hline 11011110 & AAAAAAAI SSDD0011 & subc(Smem, ACx, ACy) \\
\hline 11011110 & AAAAAAAI SSDD0100 & ACy \(=\) ACx + (Smem \(\ll \# 16)\) \\
\hline 11011110 & AAAAAAAI SSDD0101 & ACy \(=\) ACx - (Smem \(\ll \# 16)\) \\
\hline 11011110 & AAAAAAAI SSDD0110 & ACy \(=(\) Smem \(\ll \# 16)-\) ACx \\
\hline 11011110 & AAAAAAAI ssDD1000 & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACx})=\text { Smem }+\mathrm{Tx}, \\
& \mathrm{LO}(\mathrm{ACx})=\text { Smem }-\mathrm{Tx}
\end{aligned}
\] \\
\hline 11011110 & AAAAAAAI ssDD1001 & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACx})=\text { Smem }-\mathrm{Tx}, \\
& \mathrm{LO}(\mathrm{ACx})=\text { Smem }+\mathrm{Tx}
\end{aligned}
\] \\
\hline 11011111 & AAAAAAAI FDDD000u & \(\mathrm{dst}=\mathrm{uns}(\) high_byte(Smem) \()\) \\
\hline 11011111 & AAAAAAAI FDDD001u & dst = uns(low_byte(Smem)) \\
\hline 11011111 & AAAAAAAI xxDD010u & ACx \(=\) uns(Smem) \\
\hline 11011111 & AAAAAAAI SSDD100u & ACy \(=\) ACx + uns(Smem) + CARRY \\
\hline 11011111 & AAAAAAAI SSDD101u & ACy \(=\) ACx - uns(Smem) - BORROW \\
\hline 11011111 & AAAAAAAI SSDD110u & ACy = ACx + uns(Smem) \\
\hline 11011111 & AAAAAAAI SSDD111u & ACy \(=\) ACx - uns(Smem) \\
\hline 11100000 & AAAAAAAI FSSSxxxt & TCx = bit(Smem, src) \\
\hline 11100001 & AAAAAAAI DDSHIFTW & ACx = low_byte(Smem) << \#SHIFTW \\
\hline 11100010 & AAAAAAAI DDSHIFTW & ACx = high_byte(Smem) << \#SHIFTW \\
\hline 11100011 & AAAAAAAI kkkk000x & TC1 = bit(Smem, k4), bit(Smem, k4) = \#1 \\
\hline 11100011 & AAAAAAAI kkkk001x & TC2 = bit(Smem, k4), bit(Smem, k4) = \#1 \\
\hline 11100011 & AAAAAAAI kkkk010x & TC1 = bit(Smem, k4), bit(Smem, k4) = \#0 \\
\hline 11100011 & AAAAAAAI kkkk011x & TC2 = bit(Smem, k4), bit(Smem, k4) = \#0 \\
\hline 11100011 & AAAAAAAI kkkk100x & TC1 = bit(Smem, k4), cbit(Smem, k4) \\
\hline 11100011 & AAAAAAAI kkkk101x & TC2 = bit(Smem, k4), cbit(Smem, k4) \\
\hline 11100011 & AAAAAAAI FSSS1100 & \(\operatorname{bit}(\) Smem, src) \(=\) \#1 \\
\hline 11100011 & AAAAAAAI FSSS1101 & bit(Smem, src) = \#0 \\
\hline 11100011 & AAAAAAAI FSSS111x & cbit(Smem, src) \\
\hline 11100100 & AAAAAAAI FSSSx0xx & push(src, Smem) \\
\hline 11100100 & AAAAAAAI FDDDx1xx & dst, Smem = pop() \\
\hline 11100101 & AAAAAAAI FSSS01x0 & high_byte(Smem) = src \\
\hline 11100101 & AAAAAAAI FSSS01x1 & low_byte(Smem) = src \\
\hline 11100101 & AAAAAAAI 000010xx & Smem = DP \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|}
\hline Opcode & Algebraic syntax \\
\hline 11100101 AAAAAAAI 000110xx & Smem = CDP \\
\hline 11100101 AAAAAAAI 001010xx & Smem = BSA01 \\
\hline 11100101 AAAAAAAI 001110xx & Smem = BSA23 \\
\hline 11100101 AAAAAAAI 010010xx & Smem \(=\) BSA45 \\
\hline 11100101 AAAAAAAI 010110xx & Smem = BSA67 \\
\hline 11100101 AAAAAAAI 011010xx & Smem = BSAC \\
\hline 11100101 AAAAAAAI 011110xx & Smem = SP \\
\hline 11100101 AAAAAAAI 100010xx & Smem = SSP \\
\hline 11100101 AAAAAAAI 100110xx & Smem \(=\) BK03 \\
\hline 11100101 AAAAAAAI 101010xx & Smem \(=\) BK47 \\
\hline 11100101 AAAAAAAI 101110xx & Smem = BKC \\
\hline 11100101 AAAAAAAI 110010xx & Smem = DPH \\
\hline 11100101 AAAAAAAI 111110xx & Smem = PDP \\
\hline 11100101 AAAAAAAI x00011xx & Smem = CSR \\
\hline 11100101 AAAAAAAI x00111xx & Smem = BRC0 \\
\hline 11100101 AAAAAAAI x01011xx & Smem = BRC1 \\
\hline 11100101 AAAAAAAI x01111xx & Smem = TRN0 \\
\hline 11100101 AAAAAAAI x10011xx & Smem = TRN1 \\
\hline 11100110 AAAAAAAI KKKKKKKK & Smem \(=\) K8 \\
\hline 11100111 AAAAAAAI SSss00xx & Smem \(=\) LO(ACx \(\ll\) Tx) \\
\hline 11100111 AAAAAAAI SSss10x\% & Smem \(=\mathrm{HI}(\mathrm{rnd}(\) ACx \(\ll\) Tx \()\) ) \\
\hline 11100111 AAAAAAAI SSss11u\% & Smem = HI(saturate(uns(rnd(ACx << Tx)))) \\
\hline 11101000 AAAAAAAI SSxxx0x\% & Smem = HI(rnd(ACx)) \\
\hline 11101000 AAAAAAAI SSxxx1u\% & Smem = HI(saturate(uns(rnd(ACx)))) \\
\hline 11101001 AAAAAAAI SSSHIFTW & Smem \(=\) LO(ACx \(\ll\) \#SHIFTW) \\
\hline 11101010 AAAAAAAI SSSHIFTW & Smem \(=\) HI(ACx \(\ll\) \#SHIFTW) \\
\hline 11101011 AAAAAAAI xxxx01xx & \(\mathrm{dbl}(\) Lmem \()=\) RETA \\
\hline 11101011 AAAAAAAI xxSS10x0 & \(\mathrm{dbl}(\) Lmem \()=\mathrm{ACx}\) \\
\hline 11101011 AAAAAAAI xxSS10u1 & \(\mathrm{dbl}(\) Lmem \()=\) saturate(uns(ACx) \()\) \\
\hline 11101011 AAAAAAAI FSSS1100 & Lmem = pair(TAx) \\
\hline 11101011 AAAAAAAI xxSS1101 & \[
\begin{aligned}
& \mathrm{HI}(\text { Lmem })=\mathrm{HI}(\mathrm{ACx}) \gg \# 1, \\
& \mathrm{LO}(\text { Lmem })=\mathrm{LO}(\mathrm{ACx}) \gg \# 1
\end{aligned}
\] \\
\hline 11101011 AAAAAAAI xxSS1110 & Lmem = pair(HI(ACx)) \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|}
\hline Opcode & Algebraic syntax \\
\hline 11101011 AAAAAAAI xxSS1111 & Lmem = pair(LO(ACx)) \\
\hline 11101100 AAAAAAAI FSSS000x & bit(src, Baddr) = \#1 \\
\hline 11101100 AAAAAAAI FSSS001x & bit(src, Baddr) = \#0 \\
\hline 11101100 AAAAAAAI FSSS010x & bit(src, pair(Baddr)) \\
\hline 11101100 AAAAAAAI FSSS011x & cbit(src, Baddr) \\
\hline 11101100 AAAAAAAI FSSS100t & TCx = bit(src, Baddr) \\
\hline 11101100 AAAAAAAI XDDD1110 & XAdst = mar(Smem) \\
\hline 11101101 AAAAAAAI 00DD1010 & \(\operatorname{pair}(\mathrm{HI}(\mathrm{ACx}))=\mathrm{Lmem}\) \\
\hline 11101101 AAAAAAAI 00DD1100 & \(\operatorname{pair}(\mathrm{LO}(\mathrm{ACx}))=\mathrm{Lmem}\) \\
\hline 11101101 AAAAAAAI 00SS1110 & Lmem \(=\operatorname{pair}(\mathrm{HI}(\mathrm{ACx})\) ) \\
\hline 11101101 AAAAAAAI 00SS1111 & Lmem = pair(LO(ACx)) \\
\hline 11101101 AAAAAAAI SSDD000n & ACy = ACx + dbl(Lmem) \\
\hline 11101101 AAAAAAAI SSDD001n & ACy = ACx - dbl(Lmem) \\
\hline 11101101 AAAAAAAI SSDD010x & ACy = dbl(Lmem) - ACx \\
\hline 11101101 AAAAAAAI xxxx011x & RETA = dbl(Lmem) \\
\hline 11101101 AAAAAAAI xxDD100g & ACx = M40(dbl(Lmem)) \\
\hline 11101101 AAAAAAAI xxDD101x & \(\operatorname{pair}(\mathrm{HI}(\mathrm{ACx}))=\mathrm{Lmem}\) \\
\hline 11101101 AAAAAAAI xxDD110x & \(\operatorname{pair}(\mathrm{LO}(\mathrm{ACx}) \mathrm{)}=\mathrm{Lmem}\) \\
\hline 11101101 AAAAAAAI FDDD111x & \(\operatorname{pair}(\mathrm{TAx})=\) Lmem \\
\hline 11101101 AAAAAAAI XDDD1111 & XAdst \(=\) dbl(Lmem) \\
\hline 11101101 AAAAAAAI XSSS0101 & \(\mathrm{dbl}(\) Lmem \()=\) XAsrc \\
\hline 11101110 AAAAAAAI SSDD000x & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACy})=\mathrm{HI}(\text { Lmem })+\mathrm{HI}(\mathrm{ACx}), \\
& \mathrm{LO}(\mathrm{ACy})=\mathrm{LO}(\text { Lmem })+\mathrm{LO}(\mathrm{ACx})
\end{aligned}
\] \\
\hline 11101110 AAAAAAAI SSDD001x & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACy})=\mathrm{HI}(\mathrm{ACx})-\mathrm{HI}(\text { Lmem }), \\
& \mathrm{LO}(\mathrm{ACy})=\mathrm{LO}(\mathrm{ACx})-\mathrm{LO}(\text { Lmem })
\end{aligned}
\] \\
\hline 11101110 AAAAAAAI SSDD010x & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACy})=\mathrm{HI}(\text { Lmem })-\mathrm{HI}(\mathrm{ACx}), \\
& \mathrm{LO}(\mathrm{ACy})=\mathrm{LO}(\text { Lmem })-\mathrm{LO}(\mathrm{ACx})
\end{aligned}
\] \\
\hline 11101110 AAAAAAAI ssDD011x & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACx})=\mathrm{Tx}-\mathrm{HI}(\mathrm{Lmem}), \\
& \mathrm{LO}(\mathrm{ACx})=\mathrm{Tx}-\mathrm{LO}(\text { Lmem })
\end{aligned}
\] \\
\hline 11101110 AAAAAAAI ssDD100x & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACx})=\mathrm{HI}(\text { Lmem })+\mathrm{Tx}, \\
& \mathrm{LO}(\mathrm{ACx})=\mathrm{LO}(\text { Lmem })+\mathrm{Tx}
\end{aligned}
\] \\
\hline 11101110 AAAAAAAI ssDD101x & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACx})=\mathrm{HI}(\text { Lmem })-\mathrm{Tx}, \\
& \mathrm{LO}(\mathrm{ACx})=\mathrm{LO}(\text { Lmem })-\mathrm{Tx}
\end{aligned}
\] \\
\hline 11101110 AAAAAAAI ssDD110x & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACx})=\mathrm{HI}(\text { Lmem })+\mathrm{Tx}, \\
& \mathrm{LO}(\mathrm{ACx})=\mathrm{LO}(\text { Lmem })-\mathrm{Tx}
\end{aligned}
\] \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|}
\hline Opcode & Algebraic syntax \\
\hline 11101110 AAAAAAAI ssDD111x & \[
\begin{aligned}
& \mathrm{HI}(\mathrm{ACx})=\mathrm{HI}(\text { Lmem })-\mathrm{Tx}, \\
& \mathrm{LO}(\mathrm{ACx})=\mathrm{LO}(\text { Lmem })+\mathrm{Tx}
\end{aligned}
\] \\
\hline 11101111 AAAAAAAI xxxx00mm & Smem \(=\operatorname{coef}(\mathrm{Cmem})\) \\
\hline 11101111 AAAAAAAI xxxx01mm & \(\operatorname{coef}(\) Cmem \()=\) Smem \\
\hline 11101111 AAAAAAAI xxxx10mm & Lmem \(=\) dbl(coef(Cmem) \()\) \\
\hline 11101111 AAAAAAAI \(\mathrm{xxxx11mm}\) & \(\mathrm{dbl}(\operatorname{coef}(\mathrm{Cmem}))=\) Lmem \\
\hline 11110000 AAAAAAAI KKKKKKKK KKKKKKKK & TC1 \(=(\) Smem \(==\) K16) \\
\hline 11110001 AAAAAAAI KKKKKKKK KKKKKKKK & TC2 \(=(\) Smem \(==\) K16 \()\) \\
\hline 11110010 AAAAAAAI kkkkkkkk kkkkkkkk & TC1 = Smem \& k16 \\
\hline 11110011 AAAAAAAI kkkkkkkk kkkkkkkk & TC2 = Smem \& k16 \\
\hline 11110100 AAAAAAAI kkkkkkkk kkkkkkkk & Smem \(=\) Smem \& k16 \\
\hline 11110101 AAAAAAAI kkkkkkkk kkkkkkkk & Smem = Smem | k16 \\
\hline 11110110 AAAAAAAI kkkkkkkk kkkkkkkk & Smem = Smem ^ k16 \\
\hline 11110111 AAAAAAAI KKKKKKKK KKKKKKKK & Smem = Smem + K16 \\
\hline 11111000 AAAAAAAI KKKKKKKK xxDDx0U\% & ACx = rnd(Smem * K8) [, T3 = Smem] \\
\hline 11111000 AAAAAAAI KKKKKKKK SSDDx1U\% & ACy = rnd(ACx + (Smem * K8)) [, T3 = Smem] \\
\hline 11111001 AAAAAAAI uxSHIFTW SSDD00xx & ACy = ACx + (uns(Smem) << \#SHIFTW) \\
\hline 11111001 AAAAAAAI uxSHIFTW SSDD01xx & ACy = ACx - (uns(Smem) << \#SHIFTW) \\
\hline 11111001 AAAAAAAI uxSHIFTW xxDD10xx & ACx = uns(Smem) << \#SHIFTW \\
\hline 11111010 AAAAAAAI xxSHIFTW SSxxx0x\% & Smem = HI(rnd(ACx << \#SHIFTW)) \\
\hline 11111010 AAAAAAAI uxSHIFTW SSxxx1x\% & Smem = HI(saturate(uns(rnd(ACx << \#SHIFTW)))) \\
\hline 11111011 AAAAAAAI KKKKKKKK KKKKKKKK & Smem = K16 \\
\hline 11111100 AAAAAAAI LLLLLLLL LLLLLLLL & if (ARn_mod != \#0) goto L16 \\
\hline 11111101 AAAAAAAI 000000 mm DDDDuug\% & \[
\begin{aligned}
& \text{ACy = M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))),} \\
& \text{ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem)))))}
\end{aligned}
\] \\
\hline 11111101 AAAAAAAI 000001 mm DDDDuug\% & ```
ACy = M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))),
ACx = M40(rnd(ACx + (uns(Smem) *
uns(LO(coef(Cmem))))))
``` \\
\hline 11111101 AAAAAAAI 000010 mm DDDDuug\% & ```
ACy = M40(rnd(ACy + (uns(Smem) *
uns(HI(coef(Cmem)))))),
ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem)))))
``` \\
\hline 11111101 AAAAAAAI 000011 mm DDDDuug\% & \[
\begin{aligned}
& \text{ACy = M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))),} \\
& \text{ACx = M40(rnd(ACx - (uns(Smem) * uns(LO(coef(Cmem))))))}
\end{aligned}
\] \\
\hline
\end{tabular}

Table 6-1. Instruction Set Opcodes (Continued)
\begin{tabular}{|c|c|}
\hline Opcode & Algebraic syntax \\
\hline 11111101 AAAAAAAI 010010 mm DDDDuug\% & ```
ACy = M40(rnd(ACy + (uns(HI(Lmem)) *
uns(HI(coef(Cmem)))))),
ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem)))))
``` \\
\hline 11111101 AAAAAAAI 010011 mm DDDDuug\% & \[
\begin{aligned}
& \text{ACy = M40(rnd(uns(HI(Lmem)) * uns(HI(coef(Cmem))))),} \\
& \text{ACx = M40(rnd(ACx - (uns(LO(Lmem)) *} \\
& \text{uns(LO(coef(Cmem))))))}
\end{aligned}
\] \\
\hline 11111101 AAAAAAAI 010100 mm DDDDuug\% & \[
\begin{aligned}
& \text{ACy = M40(rnd(ACy - (uns(HI(Lmem)) *} \\
& \text{uns(HI(coef(Cmem)))))),} \\
& \text{ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem)))))}
\end{aligned}
\] \\
\hline 11111101 AAAAAAAI 010101 mm DDDDuug\% &  \\
\hline 11111101 AAAAAAAI 010110 mm DDDDuug\% & ```
ACy = M40(rnd(ACy + (uns(HI(Lmem)) *
uns(HI(coef(Cmem)))))),
ACx = M40(rnd(ACx - (uns(LO(Lmem)) *
uns(LO(coef(Cmem))))))
``` \\
\hline 11111101 AAAAAAAI 010111 mm DDDDuug\% &  \\
\hline 11111101 AAAAAAAI 011000 mm DDDDuug\% & \[
\begin{aligned}
& \text{ACy = M40(rnd(ACy + (uns(HI(Lmem)) *} \\
& \text{uns(HI(coef(Cmem)))))),} \\
& \text{ACx = M40(rnd((ACx >> \#16) + (uns(LO(Lmem)) *} \\
& \text{uns(LO(coef(Cmem))))))}
\end{aligned}
\] \\
\hline 11111101 AAAAAAAI 011001 mm DDDDuug\% & \[
\begin{aligned}
& \text{ACy = M40(rnd((ACy >> \#16) + (uns(HI(Lmem)) *} \\
& \text{uns(HI(coef(Cmem)))))),} \\
& \text{ACx = M40(rnd(ACx - (uns(LO(Lmem)) *} \\
& \text{uns(LO(coef(Cmem))))))}
\end{aligned}
\] \\
\hline 11111101 AAAAAAAI 011010 mm DDDDuug\% & \[
\begin{aligned}
& \text{ACy = M40(rnd((ACy >> \#16) + (uns(HI(Lmem)) *} \\
& \text{uns(HI(coef(Cmem)))))),} \\
& \text{ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem)))))}
\end{aligned}
\] \\
\hline 11111101 AAAAAAAI 011011 mm DDDDuug\% &  \\
\hline 11111101 AAAAAAAI 011100 mm DDDDuug\% & \[
\begin{aligned}
& \text{ACy = M40(rnd(ACy - (uns(HI(Lmem)) *} \\
& \text{uns(HI(coef(Cmem)))))),} \\
& \text{ACx = M40(rnd(ACx - (uns(LO(Lmem)) *} \\
& \text{uns(LO(coef(Cmem))))))}
\end{aligned}
\] \\
\hline
\end{tabular}
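
The opcode patterns in Table 6-1 can be read mechanically: `0` and `1` are fixed bits, `x` is a reserved bit, and each run of letters is a field. The following is a minimal Python sketch of that reading, not a TI tool. The function name `match_opcode` and the sample instruction word are hypothetical, and the sketch treats every letter as its own one-character field, so multi-letter field names such as `SHIFTW` would need extra handling.

```python
def match_opcode(pattern: str, word: str):
    """Match a bit string against a Table 6-1 style pattern.

    Returns a dict mapping each field letter to its extracted bits,
    or None when a fixed 0/1 position disagrees.
    """
    pattern = pattern.replace(" ", "")
    word = word.replace(" ", "")
    if len(pattern) != len(word):
        return None
    fields = {}
    for p, w in zip(pattern, word):
        if p in "01":
            if p != w:          # fixed opcode bit does not match
                return None
        elif p != "x":          # 'x' marks a reserved bit; skip it
            fields[p] = fields.get(p, "") + w
    return fields


# Pattern copied from the table row for ACy = ACx + (Smem << #16).
PATTERN = "11011110 AAAAAAAI SSDD0100"
fields = match_opcode(PATTERN, "11011110 01100001 01100100")
print(fields)
```

Field letters are case-sensitive in this sketch, which keeps `SS` (accumulator source) apart from `ss` (temporary-register source).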

\subsection*{6.2 Instruction Set Opcode Symbols and Abbreviations}

Table 6-2 lists the symbols and abbreviations used in the instruction set opcodes.

Table 6-2. Instruction Set Opcode Symbols and Abbreviations
\begin{tabular}{|c|c|c|}
\hline Bit Field Name & Bit Field Value & Bit Field Description \\
\hline \multirow[t]{2}{*}{\%} & 0 & Rounding is disabled \\
\hline & 1 & Rounding is enabled \\
\hline \multirow[t]{21}{*}{AAAA AAAI} & & Smem addressing mode: \\
\hline & AAAA AAA0 & @dma, direct memory address (dma) direct access \\
\hline & AAAA AAA1 & Smem indirect memory access: \\
\hline & 00010001 & ABS16(\#k16) \\
\hline & 00110001 & *(\#k23) \\
\hline & 01010001 & *port(\#k16) \\
\hline & 01110001 & *CDP \\
\hline & 10010001 & *CDP+ \\
\hline & 10110001 & *CDP- \\
\hline & 11010001 & *CDP(\#K16) \\
\hline & 11110001 & *+CDP(\#K16) \\
\hline & PPP0 0001 & *ARn \\
\hline & PPP0 0011 & *ARn+ \\
\hline & PPP0 0101 & *ARn- \\
\hline & PPP0 0111 & \begin{tabular}{l}
*(ARn + T0), when C54CM = 0 \\
*(ARn + AR0), when C54CM = 1
\end{tabular} \\
\hline & PPP0 1001 & \begin{tabular}{l}
*(ARn - T0), when C54CM = 0 \\
*(ARn - AR0), when C54CM = 1
\end{tabular} \\
\hline & PPP0 1011 & \begin{tabular}{l}
*ARn(T0), when C54CM = 0 \\
*ARn(AR0), when C54CM = 1
\end{tabular} \\
\hline & PPP0 1101 & *ARn(\#K16) \\
\hline & PPP0 1111 & *+ARn(\#K16) \\
\hline & PPP1 0011 & \begin{tabular}{l}
*(ARn + T1), when ARMS = 0 \\
*ARn(short(\#1)), when ARMS = 1
\end{tabular} \\
\hline & PPP1 0101 & \begin{tabular}{l}
*(ARn - T1), when ARMS = 0 \\
*ARn(short(\#2)), when ARMS = 1
\end{tabular} \\
\hline
\end{tabular}
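
As an illustration of how the AAAA AAAI values above might be decoded, here is a short Python sketch. It covers only the encodings listed in this part of Table 6-2, and it makes two assumptions that the table does not state outright: the PPP bits are taken to be the auxiliary register number n, and for I = 0 the upper seven bits are read as the dma offset. The names `FIXED_MODES`, `ARN_MODES`, and `decode_smem` are hypothetical.

```python
# Fixed (non-ARn) indirect encodings listed in Table 6-2.
FIXED_MODES = {
    "00010001": "ABS16(#k16)",
    "00110001": "*(#k23)",
    "01010001": "*port(#k16)",
    "01110001": "*CDP",
    "10010001": "*CDP+",
    "10110001": "*CDP-",
    "11010001": "*CDP(#K16)",
    "11110001": "*+CDP(#K16)",
}

# Low five bits that follow the PPP field in the ARn encodings.
ARN_MODES = {
    "00001": "*ARn",
    "00011": "*ARn+",
    "00101": "*ARn-",
    "00111": "*(ARn + T0)",   # form listed for C54CM = 0
    "01001": "*(ARn - T0)",   # form listed for C54CM = 0
    "01011": "*ARn(T0)",      # form listed for C54CM = 0
    "01101": "*ARn(#K16)",
    "01111": "*+ARn(#K16)",
    "10011": "*(ARn + T1)",   # form listed for ARMS = 0
    "10101": "*(ARn - T1)",   # form listed for ARMS = 0
}


def decode_smem(byte: str) -> str:
    """Decode an 8-bit AAAA AAAI field given as a string of bits."""
    bits = byte.replace(" ", "")
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected 8 binary digits")
    if bits[7] == "0":
        # I = 0: direct access; upper 7 bits read as the dma offset (assumption).
        return f"@dma (offset {int(bits[:7], 2)})"
    if bits in FIXED_MODES:
        return FIXED_MODES[bits]
    template = ARN_MODES.get(bits[3:])
    if template is None:
        return "not listed"
    # Assumption: PPP is the auxiliary register number n.
    return template.replace("ARn", f"AR{int(bits[:3], 2)}")


print(decode_smem("0110 0001"))
```

The fixed encodings are checked first because they share the `I = 1` bit with the ARn forms but use a low-five-bit pattern (`10001`) that no ARn form uses.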

Table 6-2. Instruction Set Opcode Symbols and Abbreviations (Continued)
\begin{tabular}{|c|c|c|}
\hline Bit Field Name & Bit Field Value & Bit Field Description \\
\hline & 1101000 & TC1 \& TC2 \\
\hline & 1101001 & TC1 \& !TC2 \\
\hline & 1101010 & !TC1 \& TC2 \\
\hline & 1101011 & !TC1 \& !TC2 \\
\hline & 110 11xx & Reserved \\
\hline & 111 00SS & !overflow(ACx) (source accumulator overflow status bit (ACOVx) is tested against 0) \\
\hline & 1110100 & !TC1 (status bit is tested against 0) \\
\hline & 1110101 & !TC2 (status bit is tested against 0) \\
\hline & 1110110 & !CARRY (status bit is tested against 0) \\
\hline & 1110111 & Reserved \\
\hline & 1111000 & TC1 | TC2 \\
\hline & 1111001 & TC1 | !TC2 \\
\hline & 1111010 & !TC1 | TC2 \\
\hline & 1111011 & !TC1 | !TC2 \\
\hline & 1111100 & TC1 ^ TC2 \\
\hline & 1111101 & TC1 ^ !TC2 \\
\hline & 1111110 & !TC1 ^ TC2 \\
\hline & 1111111 & !TC1 ^ !TC2 \\
\hline \multirow[t]{5}{*}{dd} & & Destination temporary register (Tx, Ty): \\
\hline & 00 & Temporary register 0 (T0) \\
\hline & 01 & Temporary register 1 (T1) \\
\hline & 10 & Temporary register 2 (T2) \\
\hline & 11 & Temporary register 3 (T3) \\
\hline
\end{tabular}
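
The two-flag condition codes listed above follow a regular pattern: the upper five bits select the operator (`11010` for AND, `11110` for OR, `11111` for XOR), the next bit negates TC1, and the last bit negates TC2. The Python sketch below evaluates such a code under that reading of the rows. The function name `eval_tc_condition` is hypothetical, and single-flag and reserved codes are deliberately rejected.

```python
import operator

# Upper five bits of the 7-bit code select the operator combining TC1 and TC2.
_OPS = {
    "11010": operator.and_,
    "11110": operator.or_,
    "11111": operator.xor,
}


def eval_tc_condition(code: str, tc1: bool, tc2: bool) -> bool:
    """Evaluate a two-flag TC1/TC2 condition code from the table."""
    bits = code.replace(" ", "")
    if len(bits) != 7 or bits[:5] not in _OPS:
        raise ValueError("not a two-flag TC1/TC2 condition code")
    a = (not tc1) if bits[5] == "1" else tc1   # bit 1 set: use !TC1
    b = (not tc2) if bits[6] == "1" else tc2   # bit 0 set: use !TC2
    return _OPS[bits[:5]](a, b)


# 1101001 is listed as TC1 & !TC2.
print(eval_tc_condition("1101001", tc1=True, tc2=False))
```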

Table 6-2. Instruction Set Opcode Symbols and Abbreviations (Continued)
\begin{tabular}{|c|c|c|}
\hline Bit Field Name & Bit Field Value & Bit Field Description \\
\hline \multirow[t]{5}{*}{DD} & & Destination accumulator register (ACw, ACx, ACy, ACz): \\
\hline & 00 & Accumulator 0 (AC0) \\
\hline & 01 & Accumulator 1 (AC1) \\
\hline & 10 & Accumulator 2 (AC2) \\
\hline & 11 & Accumulator 3 (AC3) \\
\hline DDD. & & Data address label coded on n bits (absolute address) \\
\hline \multirow[t]{2}{*}{E} & 0 & Parallel Enable bit is cleared to 0 \\
\hline & 1 & Parallel Enable bit is set to 1 \\
\hline \multirow[t]{17}{*}{\[
\begin{aligned}
& \text { FDDD } \\
& \text { FSSS }
\end{aligned}
\]} & & Destination or Source accumulator, auxiliary, or temporary register (dst, src, TAx, TAy): \\
\hline & 0000 & Accumulator 0 (AC0) \\
\hline & 0001 & Accumulator 1 (AC1) \\
\hline & 0010 & Accumulator 2 (AC2) \\
\hline & 0011 & Accumulator 3 (AC3) \\
\hline & 0100 & Temporary register 0 (T0) \\
\hline & 0101 & Temporary register 1 (T1) \\
\hline & 0110 & Temporary register 2 (T2) \\
\hline & 0111 & Temporary register 3 (T3) \\
\hline & 1000 & Auxiliary register 0 (AR0) \\
\hline & 1001 & Auxiliary register 1 (AR1) \\
\hline & 1010 & Auxiliary register 2 (AR2) \\
\hline & 1011 & Auxiliary register 3 (AR3) \\
\hline & 1100 & Auxiliary register 4 (AR4) \\
\hline & 1101 & Auxiliary register 5 (AR5) \\
\hline & 1110 & Auxiliary register 6 (AR6) \\
\hline & 1111 & Auxiliary register 7 (AR7) \\
\hline
\end{tabular}
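
The FDDD/FSSS values above split cleanly into three groups: 0000 to 0011 are accumulators, 0100 to 0111 are temporary registers, and 1000 to 1111 are auxiliary registers. A minimal Python sketch of that mapping follows; the function name `decode_reg` is hypothetical.

```python
def decode_reg(field: str) -> str:
    """Map a 4-bit FDDD/FSSS field (as a bit string) to a register name."""
    if len(field) != 4 or set(field) - {"0", "1"}:
        raise ValueError("expected 4 binary digits")
    value = int(field, 2)
    if value < 4:
        return f"AC{value}"        # 0000-0011: accumulators
    if value < 8:
        return f"T{value - 4}"     # 0100-0111: temporary registers
    return f"AR{value - 8}"        # 1000-1111: auxiliary registers


print(decode_reg("0110"))
```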

Table 6-2. Instruction Set Opcode Symbols and Abbreviations (Continued)
\begin{tabular}{|c|c|c|}
\hline Bit Field Name & Bit Field Value & Bit Field Description \\
\hline \multirow[t]{2}{*}{g} & 0 & 40 keyword is not applied \\
\hline & 1 & 40 keyword is applied; M40 is locally set to 1 \\
\hline \multirow[t]{27}{*}{kk kkkk} & & Swap code for Swap Register Content instruction: \\
\hline & 000000 & swap(AC0, AC2) \\
\hline & 000001 & swap(AC1, AC3) \\
\hline & 000100 & swap(T0, T2) \\
\hline & 000101 & swap(T1, T3) \\
\hline & 001000 & swap(AR0, AR2) \\
\hline & 001001 & swap(AR1, AR3) \\
\hline & 001100 & swap(AR4, T0) \\
\hline & 001101 & swap(AR5, T1) \\
\hline & 001110 & swap(AR6, T2) \\
\hline & 001111 & swap(AR7, T3) \\
\hline & 010000 & swap(pair(AC0), pair(AC2)) \\
\hline & 010001 & Reserved \\
\hline & 010100 & swap(pair(T0), pair(T2)) \\
\hline & 010101 & Reserved \\
\hline & 011000 & swap(pair(AR0), pair(AR2)) \\
\hline & 011001 & Reserved \\
\hline & 011100 & swap(pair(AR4), pair(T0)) \\
\hline & 011101 & Reserved \\
\hline & 011110 & swap(pair(AR6), pair(T2)) \\
\hline & 011111 & Reserved \\
\hline & 101000 & Reserved \\
\hline & 101100 & swap(block(AR4), block(T0)) \\
\hline & 111000 & swap(AR0, AR1) \\
\hline & 111100 & Reserved \\
\hline & 1x 0000 & Reserved \\
\hline & 1x 0001 & Reserved \\
\hline
\end{tabular}

Table 6-2. Instruction Set Opcode Symbols and Abbreviations (Continued)
\begin{tabular}{|c|c|c|}
\hline Bit Field Name & Bit Field Value & Bit Field Description \\
\hline \multirow[t]{4}{*}{tt} & 00 & Bit 0: destination TCy bit of Compare Register Content instruction \\
\hline & 01 & Bit 1: source TCx bit of Compare Register Content instruction \\
\hline & 10 & When value \(=0\) : TC1 is selected \\
\hline & 11 & When value = 1: TC2 is selected \\
\hline \multirow[t]{2}{*}{u} & 0 & uns keyword is not applied; operand is considered signed \\
\hline & 1 & uns keyword is applied; operand is considered unsigned \\
\hline \multirow[t]{2}{*}{U} & 0 & No update of T3 with Smem or Xmem content \\
\hline & 1 & T3 is updated with Smem or Xmem content \\
\hline \multirow[t]{4}{*}{vv} & 00 & Bit 0: shifted-out bit of Rotate instruction \\
\hline & 01 & Bit 1: shifted-in bit of Rotate instruction \\
\hline & 10 & When value \(=0:\) CARRY is selected \\
\hline & 11 & When value = 1: TC2 is selected \\
\hline x & & Reserved bit \\
\hline \multirow[t]{10}{*}{\[
\begin{aligned}
& \text { XDDD } \\
& \text { XSSS }
\end{aligned}
\]} & & Destination or Source accumulator or extended register. All 23 bits of stack pointer (XSP), system stack pointer (XSSP), data page pointer (XDP), coefficient data pointer (XCDP), and extended auxiliary register (XARx). \\
\hline & 0000 & Accumulator 0 (AC0) \\
\hline & 0001 & Accumulator 1 (AC1) \\
\hline & 0010 & Accumulator 2 (AC2) \\
\hline & 0011 & Accumulator 3 (AC3) \\
\hline & 0100 & Stack pointer (XSP) \\
\hline & 0101 & System stack pointer (XSSP) \\
\hline & 0110 & Data page pointer (XDP) \\
\hline & 0111 & Coefficient data pointer (XCDP) \\
\hline & 1000 & Auxiliary register 0 (XAR0) \\
\hline
\end{tabular}

Table 6-2. Instruction Set Opcode Symbols and Abbreviations (Continued)
\begin{tabular}{lll}
\hline \begin{tabular}{l} 
Bit Field \\
Name
\end{tabular} & \begin{tabular}{l} 
Bit Field \\
Value
\end{tabular} & Bit Field Description \\
\hline & 1001 & Auxiliary register 1 (XAR1) \\
& 1010 & Auxiliary register 2 (XAR2) \\
& 1011 & Auxiliary register 3 (XAR3) \\
& 1100 & Auxiliary register 4 (XAR4) \\
& 1101 & Auxiliary register 5 (XAR5) \\
& 1110 & Auxiliary register 6 (XAR6) \\
& 1111 & Auxiliary register 7 (XAR7) \\
& & \\
& & Auxiliary register designation for Xmem or Ymem addressing mode: \\
& 000 & Auxiliary register 0 (AR0) \\
& 001 & Auxiliary register 1 (AR1) \\
& 010 & Auxiliary register 2 (AR2) \\
& 011 & Auxiliary register 3 (AR3) \\
& 100 & Auxiliary register 4 (AR4) \\
& 101 & Auxiliary register 5 (AR5) \\
& 110 & Auxiliary register 6 (AR6) \\
& 111 & Auxiliary register 7 (AR7) \\
& &
\end{tabular}

\title{
Cross-Reference of Algebraic and Mnemonic Instruction Sets
}

This chapter provides a cross-reference between the TMS320C55x™ DSP algebraic instruction set and the mnemonic instruction set (Table 7-1). For more information on the mnemonic instruction set, see C55x CPU Mnemonic Instruction Set Reference Guide, SWPU067.

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Absolute Distance & ABDST: Absolute Distance \\
\hline abdst(Xmem, Ymem, ACx, ACy) & ABDST Xmem, Ymem, ACx, ACy \\
\hline Absolute Value & ABS: Absolute Value \\
\hline dst \(=|\mathrm{src}|\) & ABS [src,] dst \\
\hline Addition & ADD: Addition \\
\hline dst = dst + src & ADD [src,] dst \\
\hline dst = dst + k4 & ADD k4, dst \\
\hline dst = src + K16 & ADD K16, [src,] dst \\
\hline dst = src + Smem & ADD Smem, [src,] dst \\
\hline ACy = ACy + (ACx << Tx) & ADD ACx << Tx, ACy \\
\hline ACy = ACy + (ACx << \#SHIFTW) & ADD ACx << \#SHIFTW, ACy \\
\hline ACy = ACx + (K16 << \#16) & ADD K16 << \#16, [ACx,] ACy \\
\hline ACy = ACx + (K16 << \#SHFT) & ADD K16 << \#SHFT, [ACx,] ACy \\
\hline ACy = ACx + (Smem << Tx) & ADD Smem << Tx, [ACx,] ACy \\
\hline ACy = ACx + (Smem << \#16) & ADD Smem << \#16, [ACx,] ACy \\
\hline ACy = ACx + uns(Smem) + CARRY & ADD [uns(]Smem[)], CARRY, [ACx,] ACy \\
\hline ACy = ACx + uns(Smem) & ADD [uns(]Smem[)], [ACx,] ACy \\
\hline ACy = ACx + (uns(Smem) << \#SHIFTW) & ADD [uns(]Smem[)] << \#SHIFTW, [ACx,] ACy \\
\hline ACy = ACx + dbl(Lmem) & ADD dbl(Lmem), [ACx,] ACy \\
\hline ACx = (Xmem << \#16) + (Ymem << \#16) & ADD Xmem, Ymem, ACx \\
\hline Smem \(=\) Smem + K16 & ADD K16, Smem \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Addition with Absolute Value & ADDV: Addition with Absolute Value \\
\hline ACy = rnd(ACy + |ACx|) & ADD[R]V [ACx,] ACy \\
\hline Addition with Parallel Store Accumulator Content to Memory & ADD::MOV: Addition with Parallel Store Accumulator Content to Memory \\
\hline \[
\begin{aligned}
& \text { ACy }=\text { ACx }+(\text { Xmem } \ll \# 16), \\
& \text { Ymem }=\text { HI(ACy } \ll \text { T2 })
\end{aligned}
\] & \begin{tabular}{l}
ADD Xmem <<\#16, ACx, ACy \\
:: MOV HI(ACy << T2), Ymem
\end{tabular} \\
\hline Addition or Subtraction Conditionally & ADDSUBCC: Addition or Subtraction Conditionally \\
\hline ACy \(=\operatorname{adsc}\) (Smem, ACx, TCx) & ADDSUBCC Smem, ACx, TCx, ACy \\
\hline Addition or Subtraction Conditionally with Shift & ADDSUB2CC: Addition or Subtraction Conditionally with Shift \\
\hline ACy \(=\operatorname{ads2c}\) (Smem, ACx, Tx, TC1, TC2) & ADDSUB2CC Smem, ACx, Tx, TC1, TC2, ACy \\
\hline Addition, Subtraction, or Move Accumulator Content Conditionally & ADDSUBCC: Addition, Subtraction, or Move Accumulator Content Conditionally \\
\hline ACy \(=\operatorname{adsc}\) (Smem, ACx, TC1, TC2) & ADDSUBCC Smem, ACx, TC1, TC2, ACy \\
\hline Bitwise AND & AND: Bitwise AND \\
\hline dst = dst \& src & AND src, dst \\
\hline dst = src \& k8 & AND k8, src, dst \\
\hline dst = src \& k16 & AND k16, src, dst \\
\hline dst = src \& Smem & AND Smem, src, dst \\
\hline ACy = ACy \& (ACx <<< \#SHIFTW) & AND ACx << \#SHIFTW[, ACy] \\
\hline ACy = ACx \& (k16 <<< \#16) & AND k16 << \#16, [ACx,] ACy \\
\hline ACy = ACx \& (k16 <<< \#SHFT) & AND k16 << \#SHFT, [ACx,] ACy \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Smem \(=\) Smem \& k16 & AND k16, Smem \\
\hline Bitwise AND Memory with Immediate Value and Compare to Zero & BAND: Bitwise AND Memory with Immediate Value and Compare to Zero \\
\hline TCx = Smem \& k16 & BAND Smem, k16, TCx \\
\hline Bitwise OR & OR: Bitwise OR \\
\hline dst \(=\) dst \(\mid\) src & OR src, dst \\
\hline dst \(=\) src \(\mid \mathrm{k} 8\) & OR k8, src, dst \\
\hline dst \(=\) src \(\mid \mathrm{k} 16\) & OR k16, src, dst \\
\hline dst \(=\) src \(\mid\) Smem & OR Smem, src, dst \\
\hline ACy \(=\) ACy | (ACx \(\lll\) \#SHIFTW) & OR ACx << \#SHIFTW[, ACy] \\
\hline ACy = ACx | (k16 <<< \#16) & OR k16 << \#16, [ACx,] ACy \\
\hline ACy = ACx | (k16 <<< \#SHFT) & OR k16 << \#SHFT, [ACx,] ACy \\
\hline Smem \(=\) Smem \(\mid \mathrm{k} 16\) & OR k16, Smem \\
\hline Bitwise Exclusive OR (XOR) & XOR: Bitwise Exclusive OR (XOR) \\
\hline dst = dst ^ src & XOR src, dst \\
\hline dst = src ^ k8 & XOR k8, src, dst \\
\hline dst = src ^ k16 & XOR k16, src, dst \\
\hline dst = src ^ Smem & XOR Smem, src, dst \\
\hline ACy = ACy ^ (ACx <<< \#SHIFTW) & XOR ACx << \#SHIFTW[, ACy] \\
\hline ACy = ACx ^ (k16 <<< \#16) & XOR k16 << \#16, [ACx,] ACy \\
\hline ACy = ACx ^ (k16 <<< \#SHFT) & XOR k16 << \#SHFT, [ACx,] ACy \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Smem = Smem ^ k16 & XOR k16, Smem \\
\hline Branch Conditionally & BCC: Branch Conditionally \\
\hline if (cond) goto l4 & BCC l4, cond \\
\hline if (cond) goto L8 & BCC L8, cond \\
\hline if (cond) goto L16 & BCC L16, cond \\
\hline if (cond) goto P24 & BCC P24, cond \\
\hline Branch Unconditionally & B: Branch Unconditionally \\
\hline goto ACx & B ACx \\
\hline goto L7 & B L7 \\
\hline goto L16 & B L16 \\
\hline goto P24 & B P24 \\
\hline Branch on Auxiliary Register Not Zero & BCC: Branch on Auxiliary Register Not Zero \\
\hline if (ARn_mod != \#0) goto L16 & BCC L16, ARn_mod != \#0 \\
\hline Call Conditionally & CALLCC: Call Conditionally \\
\hline if (cond) call L16 & CALLCC L16, cond \\
\hline if (cond) call P24 & CALLCC P24, cond \\
\hline Call Unconditionally & CALL: Call Unconditionally \\
\hline call ACx & CALL ACx \\
\hline call L16 & CALL L16 \\
\hline call P24 & CALL P24 \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Circular Addressing Qualifier & .CR: Circular Addressing Qualifier \\
\hline circular() & <instruction>.CR \\
\hline Clear Accumulator, Auxiliary, or Temporary Register Bit & BCLR: Clear Accumulator, Auxiliary, or Temporary Register Bit \\
\hline bit(src, Baddr) = \#0 & BCLR Baddr, src \\
\hline Clear Memory Bit & BCLR: Clear Memory Bit \\
\hline bit(Smem, src) = \#0 & BCLR src, Smem \\
\hline Clear Status Register Bit & BCLR: Clear Status Register Bit \\
\hline bit(STx, k4) = \#0 & \begin{tabular}{l}
BCLR k4, STx_55 \\
BCLR f-name
\end{tabular} \\
\hline Compare Accumulator, Auxiliary, or Temporary Register Content & CMP: Compare Accumulator, Auxiliary, or Temporary Register Content \\
\hline TCx \(=\) uns(src RELOP dst) & CMP[U] src RELOP dst, TCx \\
\hline Compare Accumulator, Auxiliary, or Temporary Register Content with AND & CMPAND: Compare Accumulator, Auxiliary, or Temporary Register Content with AND \\
\hline TCx = TCy \& uns(src RELOP dst) & CMPAND[U] src RELOP dst, TCy, TCx \\
\hline TCx = !TCy \& uns(src RELOP dst) & CMPAND[U] src RELOP dst, !TCy, TCx \\
\hline Compare Accumulator, Auxiliary, or Temporary Register Content with OR & CMPOR: Compare Accumulator, Auxiliary, or Temporary Register Content with OR \\
\hline TCx = TCy | uns(src RELOP dst) & CMPOR[U] src RELOP dst, TCy, TCx \\
\hline TCx \(=\) !TCy | uns(src RELOP dst) & CMPOR[U] src RELOP dst, !TCy, TCx \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Compare Accumulator, Auxiliary, or Temporary Register Content Maximum & MAX: Compare Accumulator, Auxiliary, or Temporary Register Content Maximum \\
\hline dst \(=\max (\mathrm{src}, \mathrm{dst})\) & MAX [src, ] dst \\
\hline Compare Accumulator, Auxiliary, or Temporary Register Content Minimum & MIN: Compare Accumulator, Auxiliary, or Temporary Register Content Minimum \\
\hline \(\mathrm{dst}=\min (\mathrm{src}, \mathrm{dst})\) & MIN [src,] dst \\
\hline Compare and Branch & BCC: Compare and Branch \\
\hline compare (uns(src RELOP K8)) goto L8 & BCC[U] L8, src RELOP K8 \\
\hline Compare and Select Accumulator Content Maximum & MAXDIFF: Compare and Select Accumulator Content Maximum \\
\hline max_diff(ACx, ACy, ACz, ACw) & MAXDIFF ACx, ACy, ACz, ACw \\
\hline max_diff_dbl(ACx, ACy, ACz, ACw, TRNx) & DMAXDIFF ACx, ACy, ACz, ACw, TRNx \\
\hline Compare and Select Accumulator Content Minimum & MINDIFF: Compare and Select Accumulator Content Minimum \\
\hline min_diff(ACx, ACy, ACz, ACw) & MINDIFF ACx, ACy, ACz, ACw \\
\hline min_diff_dbl(ACx, ACy, ACz, ACw, TRNx) & DMINDIFF ACx, ACy, ACz, ACw, TRNx \\
\hline Compare Memory with Immediate Value & CMP: Compare Memory with Immediate Value \\
\hline TCx \(=(\) Smem \(==\mathrm{K} 16\) ) & CMP Smem == K16, TCx \\
\hline Complement Accumulator, Auxiliary, or Temporary Register Bit & BNOT: Complement Accumulator, Auxiliary, or Temporary Register Bit \\
\hline cbit(src, Baddr) & BNOT Baddr, src \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Complement Accumulator, Auxiliary, or Temporary Register Content & NOT: Complement Accumulator, Auxiliary, or Temporary Register Content \\
\hline dst = \(\sim\)src & NOT [src,] dst \\
\hline Complement Memory Bit & BNOT: Complement Memory Bit \\
\hline cbit(Smem, src) & BNOT src, Smem \\
\hline Compute Exponent of Accumulator Content & EXP: Compute Exponent of Accumulator Content \\
\hline Tx = exp(ACx) & EXP ACx, Tx \\
\hline Compute Mantissa and Exponent of Accumulator Content & MANT::NEXP: Compute Mantissa and Exponent of Accumulator Content \\
\hline ACy = mant(ACx), Tx = -exp(ACx) & \begin{tabular}{l}
MANT ACx, ACy \\
:: NEXP ACx, Tx
\end{tabular} \\
\hline Count Accumulator Bits & BCNT: Count Accumulator Bits \\
\hline Tx = count(ACx, ACy, TCx) & BCNT ACx, ACy, TCx, Tx \\
\hline Dual 16-Bit Additions & ADD: Dual 16-Bit Additions \\
\hline \begin{tabular}{l}
HI(ACy) = HI(Lmem) + HI(ACx), \\
LO(ACy) = LO(Lmem) + LO(ACx)
\end{tabular} & ADD dual(Lmem), [ACx,] ACy \\
\hline \begin{tabular}{l}
HI(ACx) = HI(Lmem) + Tx, \\
LO(ACx) = LO(Lmem) + Tx
\end{tabular} & ADD dual(Lmem), Tx, ACx \\
\hline
\end{tabular}
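
The dual 16-bit addition entries in the table above describe two independent 16-bit sums held in the high and low halves of one register. The following Python sketch is only a simplified arithmetic model of those two table formulas, not a description of the hardware. The function names `dual_add` and `dual_add_tx` are hypothetical, and the model assumes 32-bit operands, wraps each half modulo 2^16, and ignores accumulator guard bits, saturation, and status flags.

```python
MASK16 = 0xFFFF  # one 16-bit half


def dual_add(lmem: int, acx: int) -> int:
    """Model HI(ACy) = HI(Lmem) + HI(ACx), LO(ACy) = LO(Lmem) + LO(ACx).

    Assumption: both inputs are treated as 32-bit values, and each half
    wraps modulo 2**16 with no carry passed between the halves.
    """
    hi = (((lmem >> 16) & MASK16) + ((acx >> 16) & MASK16)) & MASK16
    lo = ((lmem & MASK16) + (acx & MASK16)) & MASK16
    return (hi << 16) | lo


def dual_add_tx(lmem: int, tx: int) -> int:
    """Model HI(ACx) = HI(Lmem) + Tx, LO(ACx) = LO(Lmem) + Tx.

    Assumption: Tx is a 16-bit value added to both halves independently.
    """
    tx &= MASK16
    hi = (((lmem >> 16) & MASK16) + tx) & MASK16
    lo = ((lmem & MASK16) + tx) & MASK16
    return (hi << 16) | lo
```

In this model an overflow of the low half is discarded rather than carried into the high half, which is the property that distinguishes the dual form from a single 32-bit addition.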


Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Expand Accumulator Bit Field & BFXPA: Expand Accumulator Bit Field \\
\hline dst = field_expand(ACx, k16) & BFXPA k16, ACx, dst \\
\hline Extract Accumulator Bit Field & BFXTR: Extract Accumulator Bit Field \\
\hline dst = field_extract(ACx, k16) & BFXTR k16, ACx, dst \\
\hline Finite Impulse Response Filter, Antisymmetrical & FIRSSUB: Finite Impulse Response Filter, Antisymmetrical \\
\hline firsn(Xmem, Ymem, coef(Cmem), ACx, ACy) & FIRSSUB Xmem, Ymem, Cmem, ACx, ACy \\
\hline Finite Impulse Response Filter, Symmetrical & FIRSADD: Finite Impulse Response Filter, Symmetrical \\
\hline firs(Xmem, Ymem, coef(Cmem), ACx, ACy) & FIRSADD Xmem, Ymem, Cmem, ACx, ACy \\
\hline Idle & IDLE \\
\hline idle & IDLE \\
\hline Least Mean Square (LMS) & LMS: Least Mean Square \\
\hline lms(Xmem, Ymem, ACx, ACy) & LMS Xmem, Ymem, ACx, ACy \\
\hline lmsf(Xmem, Ymem, ACx, ACy) & LMSF Xmem, Ymem, ACx, ACy \\
\hline Linear Addressing Qualifier & .LR: Linear Addressing Qualifier \\
\hline linear() & <instruction>.LR \\
\hline
\end{tabular}
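
The table above lists only the syntax of `field_extract` and `field_expand`. As a rough illustration, the Python sketch below models them as mask-driven bit gather and scatter operations on the 16 low-order bits. This behavior is an assumption made for illustration (the mask is scanned from LSB to MSB, extract packs the selected bits toward the LSBs, and expand spreads the low-order source bits into the mask positions). It should be checked against the individual instruction descriptions before being relied on.

```python
def field_extract(acx: int, k16: int) -> int:
    """Gather the bits of acx selected by mask k16 and pack them toward bit 0.

    Assumed model: only bits 15..0 of acx and k16 take part.
    """
    out = 0
    pos = 0  # next free position in the packed result
    for bit in range(16):
        if (k16 >> bit) & 1:
            out |= ((acx >> bit) & 1) << pos
            pos += 1
    return out


def field_expand(acx: int, k16: int) -> int:
    """Scatter the low-order bits of acx into the positions set in mask k16.

    Assumed model: positions where k16 is 0 are left as 0 in the result.
    """
    out = 0
    pos = 0  # next source bit to consume from acx
    for bit in range(16):
        if (k16 >> bit) & 1:
            out |= ((acx >> pos) & 1) << bit
            pos += 1
    return out
```

Under these assumptions the two functions are inverses on the masked field: expanding a value with a mask and then extracting with the same mask returns the original low-order bits.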


Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Load Accumulator, Auxiliary, or Temporary Register from Memory & MOV: Load Accumulator, Auxiliary, or Temporary Register from Memory \\
\hline dst \(=\) Smem & MOV Smem, dst \\
\hline dst = uns(high_byte(Smem)) & MOV [uns(]high_byte(Smem)[)], dst \\
\hline dst = uns(low_byte(Smem)) & MOV [uns(]low_byte(Smem)[)], dst \\
\hline Load Accumulator, Auxiliary, or Temporary Register with Immediate Value & MOV: Load Accumulator, Auxiliary, or Temporary Register with Immediate Value \\
\hline \(\mathrm{dst}=\mathrm{k} 4\) & MOV k4, dst \\
\hline \(\mathrm{dst}=-\mathrm{k} 4\) & MOV -k4, dst \\
\hline \(\mathrm{dst}=\mathrm{K} 16\) & MOV K16, dst \\
\hline Load Auxiliary or Temporary Register Pair from Memory & MOV: Load Auxiliary or Temporary Register Pair from Memory \\
\hline pair(TAx) = Lmem & MOV dbl(Lmem), pair(TAx) \\
\hline Load CPU Register from Memory & MOV: Load CPU Register from Memory \\
\hline BK03 \(=\) Smem & MOV Smem, BK03 \\
\hline BK47 = Smem & MOV Smem, BK47 \\
\hline \(B K C=\) Smem & MOV Smem, BKC \\
\hline BSA01 = Smem & MOV Smem, BSA01 \\
\hline BSA23 = Smem & MOV Smem, BSA23 \\
\hline BSA45 = Smem & MOV Smem, BSA45 \\
\hline BSA67 = Smem & MOV Smem, BSA67 \\
\hline BSAC \(=\) Smem & MOV Smem, BSAC \\
\hline \(B R C 0=\) Smem & MOV Smem, BRC0 \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline BRC1 = Smem & MOV Smem, BRC1 \\
\hline CDP = Smem & MOV Smem, CDP \\
\hline CSR = Smem & MOV Smem, CSR \\
\hline DP = Smem & MOV Smem, DP \\
\hline DPH \(=\) Smem & MOV Smem, DPH \\
\hline PDP = Smem & MOV Smem, PDP \\
\hline SP = Smem & MOV Smem, SP \\
\hline SSP = Smem & MOV Smem, SSP \\
\hline TRN0 = Smem & MOV Smem, TRN0 \\
\hline TRN1 = Smem & MOV Smem, TRN1 \\
\hline RETA \(=\) dbl(Lmem) & MOV dbl(Lmem), RETA \\
\hline Load CPU Register with Immediate Value & MOV: Load CPU Register with Immediate Value \\
\hline BK03 \(=\mathrm{k} 12\) & MOV k12, BK03 \\
\hline BK47 \(=\mathrm{k} 12\) & MOV k12, BK47 \\
\hline BKC \(=\mathrm{k} 12\) & MOV k12, BKC \\
\hline \(B R C 0=k 12\) & MOV k12, BRC0 \\
\hline BRC1 \(=\mathrm{k} 12\) & MOV k12, BRC1 \\
\hline CSR \(=\mathrm{k} 12\) & MOV k12, CSR \\
\hline DPH \(=\mathrm{k} 7\) & MOV k7, DPH \\
\hline \(\mathrm{PDP}=\mathrm{k} 9\) & MOV k9, PDP \\
\hline BSA01 \(=k 16\) & MOV k16, BSA01 \\
\hline BSA23 \(=k 16\) & MOV k16, BSA23 \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline BSA45 \(=k 16\) & MOV k16, BSA45 \\
\hline BSA67 \(=k 16\) & MOV k16, BSA67 \\
\hline BSAC \(=\mathrm{k} 16\) & MOV k16, BSAC \\
\hline \(C D P=k 16\) & MOV k16, CDP \\
\hline DP \(=\mathrm{k} 16\) & MOV k16, DP \\
\hline \(\mathrm{SP}=\mathrm{k} 16\) & MOV k16, SP \\
\hline SSP \(=k 16\) & MOV k16, SSP \\
\hline Load Extended Auxiliary Register from Memory & MOV: Load Extended Auxiliary Register from Memory \\
\hline XAdst \(=\) dbl(Lmem) & MOV dbl(Lmem), XAdst \\
\hline Load Extended Auxiliary Register with Immediate Value & AMOV: Load Extended Auxiliary Register with Immediate Value \\
\hline XAdst \(=\mathrm{k} 23\) & AMOV k23, XAdst \\
\hline Load Memory with Immediate Value & MOV: Load Memory with Immediate Value \\
\hline Smem \(=\) K8 & MOV K8, Smem \\
\hline Smem \(=\) K16 & MOV K16, Smem \\
\hline Lock Access Qualifier & .LK: Lock Access Qualifier \\
\hline lock() & .LK \\
\hline Memory Delay & DELAY: Memory Delay \\
\hline delay(Smem) & DELAY Smem \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Memory-Mapped Register Access Qualifier & mmap: Memory-Mapped Register Access Qualifier \\
\hline mmap() & mmap \\
\hline Modify Auxiliary Register Content & AMAR: Modify Auxiliary Register Content \\
\hline mar(Smem) & AMAR Smem \\
\hline Modify Auxiliary Register Content with Parallel Multiply & AMAR::MPY: Modify Auxiliary Register Content with Parallel Multiply \\
\hline \[
\begin{aligned}
& \operatorname{mar}(\text { Xmem }), \\
& \text{ACx} = \text{M40}(\text{rnd}(\text{uns}(\text{Ymem}) * \text{uns}(\text{coef}(\text{Cmem}))))
\end{aligned}
\] & \begin{tabular}{l}
AMAR Xmem \\
:: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx
\end{tabular} \\
\hline Modify Auxiliary Register Content with Parallel Multiply and Accumulate & AMAR::MAC: Modify Auxiliary Register Content with Parallel Multiply and Accumulate \\
\hline \[
\begin{aligned}
& \operatorname{mar}(\text { Xmem }), \\
& \text { ACx }=\text { M } 40\left(\text { rnd }\left(A C x+\left(\text { uns }(\text { Ymem })^{*} \text { uns }(\operatorname{coef}(\text { Cmem }))\right)\right)\right)
\end{aligned}
\] & \begin{tabular}{l}
AMAR Xmem \\
:: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx
\end{tabular} \\
\hline \[
\begin{aligned}
& \operatorname{mar}(\text { Xmem }), \\
& \text{ACx} = \text{M40}(\text{rnd}((\text{ACx} \gg \#16) + (\text{uns}(\text{Ymem}) * \text{uns}(\text{coef}(\text{Cmem})))))
\end{aligned}
\] & \begin{tabular}{l}
AMAR Xmem \\
:: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx >> \#16
\end{tabular} \\
\hline Modify Auxiliary Register Content with Parallel Multiply and Subtract & AMAR::MAS: Modify Auxiliary Register Content with Parallel Multiply and Subtract \\
\hline \[
\begin{aligned}
& \operatorname{mar}(\text { Xmem }), \\
& \text { ACx }=\text { M } 40\left(\text { rnd }\left(\text { ACx }-\left(\text { uns }(\text { Ymem })^{*} \text { uns }(\operatorname{coef}(\text { Cmem }))\right)\right)\right)
\end{aligned}
\] & AMAR Xmem :: MAS[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACx \\
\hline Modify Auxiliary or Temporary Register Content & AMOV: Modify Auxiliary or Temporary Register Content \\
\hline \(\operatorname{mar}(\mathrm{TAy}=\mathrm{TAx}\) ) & AMOV TAx, TAy \\
\hline \(\operatorname{mar}(\mathrm{TAx}=\mathrm{P} 8)\) & AMOV P8, TAx \\
\hline \(\operatorname{mar}(\mathrm{TAx}=\mathrm{D} 16)\) & AMOV D16, TAx \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Modify Auxiliary or Temporary Register Content by Addition & AADD: Modify Auxiliary or Temporary Register Content by Addition \\
\hline \(\operatorname{mar}(\mathrm{TAy}+\mathrm{TAx})\) & AADD TAx, TAy \\
\hline \(\operatorname{mar}(\mathrm{TAx}+\mathrm{P} 8)\) & AADD P8, TAx \\
\hline Modify Auxiliary or Temporary Register Content by Subtraction & ASUB: Modify Auxiliary or Temporary Register Content by Subtraction \\
\hline \(\operatorname{mar}(\mathrm{TAy}-\mathrm{TAx})\) & ASUB TAx, TAy \\
\hline \(\operatorname{mar}(\mathrm{TAx}-\mathrm{P} 8)\) & ASUB P8, TAx \\
\hline Modify Data Stack Pointer & AADD: Modify Data Stack Pointer (SP) \\
\hline \(\mathrm{SP}=\mathrm{SP}+\mathrm{K} 8\) & AADD K8, SP \\
\hline Modify Extended Auxiliary Register Content & AMAR: Modify Extended Auxiliary Register Content \\
\hline XAdst \(=\) mar (Smem) & AMAR Smem, XAdst \\
\hline mar(XACdst = XACsrc) for DAG_X & AMOV XACsrc, XACdst for DAG_X \\
\hline \(\operatorname{mar}(\) XACdst \(=\) XACsrc \()\) for DAG_Y & AMOV XACsrc, XACdst for DAG_Y \\
\hline Modify Extended Auxiliary Register Content by Addition & AADD: Modify Extended Auxiliary Register Content by Addition \\
\hline mar(XACdst + XACsrc) for DAG_X & AADD XACsrc, XACdst for DAG_X \\
\hline mar(XACdst + XACsrc) for DAG_Y & AADD XACsrc, XACdst for DAG_Y \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Modify Extended Auxiliary Register Content by Subtraction & ASUB: Modify Extended Auxiliary Register Content by Subtraction \\
\hline mar(XACdst - XACsrc) for DAG_X & ASUB XACsrc, XACdst for DAG_X \\
\hline mar(XACdst - XACsrc) for DAG_Y & ASUB XACsrc, XACdst for DAG_Y \\
\hline Move Accumulator Content to Auxiliary or Temporary Register & MOV: Move Accumulator Content to Auxiliary or Temporary Register \\
\hline \(T A x=H I(A C x)\) & MOV HI(ACx), TAx \\
\hline Move Accumulator, Auxiliary, or Temporary Register Content & MOV: Move Accumulator, Auxiliary, or Temporary Register Content \\
\hline \(\mathrm{dst}=\mathrm{src}\) & MOV src, dst \\
\hline Move Auxiliary or Temporary Register Content to Accumulator & MOV: Move Auxiliary or Temporary Register Content to Accumulator \\
\hline \(H I(A C x)=T A x\) & MOV TAx, HI(ACx) \\
\hline Move Auxiliary or Temporary Register Content to CPU Register & MOV: Move Auxiliary or Temporary Register Content to CPU Register \\
\hline \(B R C 0=T A x\) & MOV TAx, BRC0 \\
\hline \(B R C 1=T A x\) & MOV TAx, BRC1 \\
\hline CDP = TAx & MOV TAx, CDP \\
\hline CSR \(=\) TAx & MOV TAx, CSR \\
\hline SP = TAx & MOV TAx, SP \\
\hline SSP = TAx & MOV TAx, SSP \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Move CPU Register Content to Auxiliary or Temporary Register & MOV: Move CPU Register Content to Auxiliary or Temporary Register \\
\hline TAx = BRC0 & MOV BRC0, TAx \\
\hline \(\mathrm{TAx}=\mathrm{BRC} 1\) & MOV BRC1, TAx \\
\hline TAx = CDP & MOV CDP, TAx \\
\hline TAx = RPTC & MOV RPTC, TAx \\
\hline TAx = SP & MOV SP, TAx \\
\hline TAx = SSP & MOV SSP, TAx \\
\hline Move Extended Auxiliary Register Content & MOV: Move Extended Auxiliary Register Content \\
\hline xdst = xsrc & MOV xsrc, xdst \\
\hline Move Memory to Memory & MOV: Move Memory to Memory \\
\hline Smem \(=\operatorname{coef}(\mathrm{Cmem})\) & MOV Cmem, Smem \\
\hline \(\operatorname{coef}(\mathrm{Cmem})=\) Smem & MOV Smem, Cmem \\
\hline Lmem \(=\mathrm{dbl}(\operatorname{coef}(\) Cmem \()\) ) & MOV Cmem, dbl(Lmem) \\
\hline \(\mathrm{dbl}(\operatorname{coef}(\) Cmem \())=\) Lmem & MOV dbl(Lmem), Cmem \\
\hline \(\mathrm{dbl}(\) Ymem \()=\mathrm{dbl}(\) Xmem \()\) & MOV dbl(Xmem), dbl(Ymem) \\
\hline Ymem = Xmem & MOV Xmem, Ymem \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Multiply & MPY: Multiply \\
\hline ACy = rnd(ACy * ACx) & MPY[R] [ACx,] ACy \\
\hline ACy = rnd(ACx * Tx) & MPY[R] Tx, [ACx,] ACy \\
\hline ACy = rnd(ACx * K8) & MPYK[R] K8, [ACx,] ACy \\
\hline ACy = rnd(ACx * K16) & MPYK[R] K16, [ACx,] ACy \\
\hline ACx = rnd(Smem * uns(coef(Cmem))) & MPY[R] Smem, uns(Cmem), ACx \\
\hline ACx = rnd(Smem * coef(Cmem))[, T3 = Smem] & MPYM[R] [T3 = ]Smem, Cmem, ACx \\
\hline ACy = rnd(Smem * ACx)[, T3 = Smem] & MPYM[R] [T3 = ]Smem, [ACx,] ACy \\
\hline ACx = rnd(Smem * K8)[, T3 = Smem] & MPYMK[R] [T3 = ]Smem, K8, ACx \\
\hline ACx = M40(rnd(uns(Xmem) * uns(Ymem)))[, T3 = Xmem] & MPYM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], ACx \\
\hline ACx = rnd(uns(Tx * Smem))[, T3 = Smem] & MPYM[R][U] [T3 = ]Smem, Tx, ACx \\
\hline Multiply with Parallel Multiply and Accumulate & MPY::MAC: Multiply with Parallel Multiply and Accumulate \\
\hline \[
\begin{aligned}
& A C x=M 40\left(\text { rnd }\left(\text { uns }(\text { Xmem })^{*} \text { uns }(\operatorname{coef}(\text { Cmem }))\right)\right), \\
& A C y=M 40\left(\text { rnd }\left((\text { ACy } \gg \# 16)+\left(\text { uns }(\text { Ymem })^{*} \text { uns }(\text { coef(Cmem) })\right)\right)\right)
\end{aligned}
\] & \begin{tabular}{l}
MPY[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx \\
:: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16
\end{tabular} \\
\hline \[
\begin{aligned}
& \text{ACy} = \text{M40}(\text{rnd}(\text{uns}(\text{Smem}) * \text{uns}(\text{HI}(\text{coef}(\text{Cmem}))))), \\
& \text{ACx} = \text{M40}(\text{rnd}(\text{ACx} + (\text{uns}(\text{Smem}) * \text{uns}(\text{LO}(\text{coef}(\text{Cmem}))))))
\end{aligned}
\] & MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx \\
\hline  & \begin{tabular}{l}
MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy \\
:: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx
\end{tabular} \\
\hline \[
\begin{aligned}
& \text { ACy }=\mathrm{M} 40(\text { rnd }(\text { uns }(\text { Ymem }) * \text { uns }(\mathrm{HI}(\operatorname{coef}(\text { Cmem }))))), \\
& \text{ACx} = \text{M40}(\text{rnd}(\text{ACx} + \text{uns}(\text{Xmem}) * \text{uns}(\text{LO}(\text{coef}(\text{Cmem})))))
\end{aligned}
\] & MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, :: MAC[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Multiply with Parallel Multiply and Subtract & MPY::MAS: Multiply with Parallel Multiply and Subtract \\
\hline \[
\begin{aligned}
& A C y=M 40(\operatorname{rnd}(\text { uns }(\text { Smem }) * \text { uns }(\mathrm{HI}(\operatorname{coef}(\text { Cmem }))))), \\
& \text { ACx }=M 40(\operatorname{rnd}(\text { ACx }-(\text { uns }(\text { Smem }) * \text { uns }(\text { LO }(\operatorname{coef}(\text { Cmem }))))))
\end{aligned}
\] & \begin{tabular}{l}
MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy \\
:: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx
\end{tabular} \\
\hline \[
\begin{aligned}
& \text{ACy} = \text{M40}(\text{rnd}(\text{uns}(\text{HI}(\text{Lmem})) * \text{uns}(\text{HI}(\text{coef}(\text{Cmem}))))), \\
& \text{ACx} = \text{M40}(\text{rnd}(\text{ACx} - (\text{uns}(\text{LO}(\text{Lmem})) * \text{uns}(\text{LO}(\text{coef}(\text{Cmem}))))))
\end{aligned}
\] & MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx \\
\hline  & \begin{tabular}{l}
MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, \\
:: MAS[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx
\end{tabular} \\
\hline Multiply with Parallel Store Accumulator Content to Memory & MPYM::MOV: Multiply with Parallel Store Accumulator Content to Memory \\
\hline \[
\begin{aligned}
& \text { ACy }=\operatorname{rnd}(\text { Tx } * \text { Xmem }), \\
& \text { Ymem }=\mathrm{HI}(\text { ACx } \ll \text { T2) }[, \mathrm{T} 3=\text { Xmem }]
\end{aligned}
\] & MPYM[R] [T3 = ]Xmem, Tx, ACy :: MOV HI(ACx << T2), Ymem \\
\hline Multiply and Accumulate (MAC) & MAC: Multiply and Accumulate \\
\hline ACy = rnd(ACy + (ACx * Tx)) & MAC[R] ACx, Tx, ACy[, ACy] \\
\hline ACy = rnd((ACy * Tx) + ACx) & MAC[R] ACy, Tx, ACx, ACy \\
\hline ACx = rnd(ACx + (Smem * uns(coef(Cmem)))) & MAC[R] Smem, uns(Cmem), ACx \\
\hline ACy = rnd(ACx + (Tx * K8)) & MACK[R] Tx, K8, [ACx,] ACy \\
\hline ACy = rnd(ACx + (Tx * K16)) & MACK[R] Tx, K16, [ACx,] ACy \\
\hline ACx = rnd(ACx + (Smem * coef(Cmem)))[, T3 = Smem] & MACM[R] [T3 = ]Smem, Cmem, ACx \\
\hline ACy = rnd(ACy + (Smem * ACx))[, T3 = Smem] & MACM[R] [T3 = ]Smem, [ACx,] ACy \\
\hline ACy = rnd(ACx + (Tx * Smem))[, T3 = Smem] & MACM[R] [T3 = ]Smem, Tx, [ACx,] ACy \\
\hline ACy = rnd(ACx + (Smem * K8))[, T3 = Smem] & MACMK[R] [T3 = ]Smem, K8, [ACx,] ACy \\
\hline ACy = M40(rnd(ACx + (uns(Xmem) * uns(Ymem))))[, T3 = Xmem] & MACM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], [ACx,] ACy \\
\hline ACy = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(Ymem))))[, T3 = Xmem] & MACM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], ACx >> \#16[, ACy] \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Multiply and Accumulate with Parallel Delay & MACMZ: Multiply and Accumulate with Parallel Delay \\
\hline \begin{tabular}{l}
ACx = rnd(ACx + (Smem * coef(Cmem)))[, T3 = Smem], \\
delay(Smem)
\end{tabular} & MACM[R]Z [T3 = ]Smem, Cmem, ACx \\
\hline Multiply and Accumulate with Parallel Load Accumulator from Memory & MACM::MOV: Multiply and Accumulate with Parallel Load Accumulator from Memory \\
\hline \[
\begin{aligned}
& \mathrm{ACx}=\operatorname{rnd}(\mathrm{ACx}+(\mathrm{Tx} * \text { Xmem })), \\
& \mathrm{ACy}=\text { Ymem } \ll \# 16[, \mathrm{~T} 3=\text { Xmem }]
\end{aligned}
\] & MACM[R] [T3 = ]Xmem, Tx, ACx :: MOV Ymem << \#16, ACy \\
\hline Multiply and Accumulate with Parallel Multiply & MAC::MPY: Multiply and Accumulate with Parallel Multiply \\
\hline \[
\begin{aligned}
& \text{ACx} = \text{M40}(\text{rnd}(\text{ACx} + (\text{uns}(\text{Xmem}) * \text{uns}(\text{coef}(\text{Cmem}))))), \\
& \text{ACy} = \text{M40}(\text{rnd}(\text{uns}(\text{Ymem}) * \text{uns}(\text{coef}(\text{Cmem}))))
\end{aligned}
\] & MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy \\
\hline \[
\begin{aligned}
& \text{ACy} = \text{M40}(\text{rnd}(\text{ACy} + (\text{uns}(\text{Smem}) * \text{uns}(\text{HI}(\text{coef}(\text{Cmem})))))), \\
& \text{ACx} = \text{M40}(\text{rnd}(\text{uns}(\text{Smem}) * \text{uns}(\text{LO}(\text{coef}(\text{Cmem})))))
\end{aligned}
\] & MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx \\
\hline  & MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy>>\#16 :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx \\
\hline \[
\begin{aligned}
& \text{ACy} = \text{M40}(\text{rnd}(\text{ACy} + (\text{uns}(\text{HI}(\text{Lmem})) * \text{uns}(\text{HI}(\text{coef}(\text{Cmem})))))), \\
& \text{ACx} = \text{M40}(\text{rnd}(\text{uns}(\text{LO}(\text{Lmem})) * \text{uns}(\text{LO}(\text{coef}(\text{Cmem})))))
\end{aligned}
\] & MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx \\
\hline  & MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy>>\#16 :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx \\
\hline  & MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16, :: MPY[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Multiply and Accumulate with Parallel Multiply and Subtract & MAC::MAS: Multiply and Accumulate with Parallel Multiply and Subtract \\
\hline \[
\begin{aligned}
& \text{ACy} = \text{M40}(\text{rnd}(\text{ACy} + (\text{uns}(\text{Smem}) * \text{uns}(\text{HI}(\text{coef}(\text{Cmem})))))), \\
& \text{ACx} = \text{M40}(\text{rnd}(\text{ACx} - (\text{uns}(\text{Smem}) * \text{uns}(\text{LO}(\text{coef}(\text{Cmem}))))))
\end{aligned}
\] & \begin{tabular}{l}
MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy \\
:: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx
\end{tabular} \\
\hline  & MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy>>\#16 :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx \\
\hline  & MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx \\
\hline  & MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy>>\#16 :: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx \\
\hline  & MAC[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, :: MAS[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx \\
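\hline \[
\begin{aligned}
& \text{ACy} = \text{M40}(\text{rnd}((\text{ACy} \gg \#16) + (\text{uns}(\text{Ymem}) * \text{uns}(\text{HI}(\text{coef}(\text{Cmem})))))), \\
& \text{ACx} = \text{M40}(\text{rnd}(\text{ACx} - (\text{uns}(\text{Xmem}) * \text{uns}(\text{LO}(\text{coef}(\text{Cmem}))))))
\end{aligned}
\] & MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16, :: MAS[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx \\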
\hline Multiply and Accumulate with Parallel Store Accumulator Content to Memory & MACM::MOV: Multiply and Accumulate with Parallel Store Accumulator Content to Memory \\
\hline \[
\begin{aligned}
& \text { ACy }=\operatorname{rnd}\left(\text { ACy }+\left(\text { Tx }{ }^{*} \text { Xmem }\right)\right), \\
& \text { Ymem }=\operatorname{HI}(\text { ACx } \ll \text { T2) }[, T 3=\text { Xmem }]
\end{aligned}
\] & \begin{tabular}{l}
MACM[R] [T3 = ]Xmem, Tx, ACy \\
:: MOV HI(ACx << T2), Ymem
\end{tabular} \\
\hline Multiply and Subtract & MAS: Multiply and Subtract \\
\hline ACy = rnd(ACy - (ACx * Tx)) & MAS[R] Tx, [ACx,] ACy \\
\hline ACx = rnd(ACx - (Smem * uns(coef(Cmem)))) & MAS[R] Smem, uns(Cmem), ACx \\
\hline ACx = rnd(ACx - (Smem * coef(Cmem)))[, T3 = Smem] & MASM[R] [T3 = ]Smem, Cmem, ACx \\
\hline ACy = rnd(ACy - (Smem * ACx))[, T3 = Smem] & MASM[R] [T3 = ]Smem, [ACx,] ACy \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline ACy = rnd(ACx - (Tx * Smem))[, T3 = Smem] & MASM[R] [T3 = ]Smem, Tx, [ACx,] ACy \\
\hline ACy = M40(rnd(ACx - (uns(Xmem) * uns(Ymem))))[, T3 = Xmem] & MASM[R][40] [T3 = ][uns(]Xmem[)], [uns(]Ymem[)], [ACx,] ACy \\
\hline Multiply and Subtract with Parallel Load Accumulator from Memory & MASM::MOV: Multiply and Subtract with Parallel Load Accumulator from Memory \\
\hline \[
\begin{aligned}
& \mathrm{ACx}=\operatorname{rnd}(\mathrm{ACx}-(\mathrm{Tx} * \mathrm{Xmem})), \\
& \mathrm{ACy}=\text { Ymem } \ll \# 16[, \mathrm{~T} 3=\text { Xmem }]
\end{aligned}
\] & MASM[R] [T3 = ]Xmem, Tx, ACx :: MOV Ymem << \#16, ACy \\
\hline Multiply and Subtract with Parallel Multiply & MAS::MPY: Multiply and Subtract with Parallel Multiply \\
\hline \[
\begin{aligned}
& A C x=M 40\left(\operatorname{rnd}\left(\operatorname{ACx}-\left(\text { uns }(\text { Xmem })^{*} \operatorname{uns}(\operatorname{coef}(\text { Cmem }))\right)\right)\right), \\
& A C y=M 40\left(\operatorname{rnd}\left(\text { uns }(\text { Ymem })^{*} \text { uns }(\operatorname{coef}(\text { Cmem }))\right)\right)
\end{aligned}
\] & MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy \\
\hline \[
\begin{aligned}
& \text{ACy} = \text{M40}(\text{rnd}(\text{ACy} - (\text{uns}(\text{Smem}) * \text{uns}(\text{HI}(\text{coef}(\text{Cmem})))))), \\
& \text{ACx} = \text{M40}(\text{rnd}(\text{uns}(\text{Smem}) * \text{uns}(\text{LO}(\text{coef}(\text{Cmem})))))
\end{aligned}
\] & MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx \\
\hline \begin{tabular}{l}
ACy \(=\mathrm{M} 40\left(\right.\) rnd \(\left(\right.\) ACy \(-\left(\mathrm{uns}(\mathrm{HI}(\mathrm{Lmem})){ }^{*}\right.\) uns(HI(coef(Cmem)))))), \\
\(A C x=M 40(\) rnd \((\) uns \((\mathrm{LO}(\) Lmem \())\) * uns \((\mathrm{LO}(\operatorname{coef}(\) Cmem \()))))\)
\end{tabular} & MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx \\
\hline Multiply and Subtract with Parallel Multiply and Accumulate & MAS::MAC: Multiply and Subtract with Parallel Multiply and Accumulate \\
\hline \[
\begin{aligned}
& \text{ACx} = \text{M40}(\text{rnd}(\text{ACx} - (\text{uns}(\text{Xmem}) * \text{uns}(\text{coef}(\text{Cmem}))))), \\
& \text{ACy} = \text{M40}(\text{rnd}(\text{ACy} + (\text{uns}(\text{Ymem}) * \text{uns}(\text{coef}(\text{Cmem})))))
\end{aligned}
\] & MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy \\
\hline  & \begin{tabular}{l}
MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx \\
:: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16
\end{tabular} \\
\hline \begin{tabular}{l}
ACy = M40(rnd(ACy - (uns(Smem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx + (uns(Smem) * uns(LO(coef(Cmem))))))
\end{tabular} & MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx \\
\hline  & \begin{tabular}{l}
MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy \\
:: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx
\end{tabular} \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Multiply and Subtract with Parallel Store Accumulator Content to Memory & MASM::MOV: Multiply and Subtract with Parallel Store Accumulator Content to Memory \\
\hline \begin{tabular}{l}
ACy = rnd(ACy - (Tx * Xmem)), \\
Ymem = HI(ACx << T2)[, T3 = Xmem]
\end{tabular} & \begin{tabular}{l}
MASM[R] [T3 = ]Xmem, Tx, ACy \\
:: MOV HI(ACx << T2), Ymem
\end{tabular} \\
\hline Negate Accumulator, Auxiliary, or Temporary Register Content & NEG: Negate Accumulator, Auxiliary, or Temporary Register Content \\
\hline  & NEG [src,] dst \\
\hline  & NOP_16 \\
\hline Parallel Modify Auxiliary Register Contents & AMAR: Parallel Modify Auxiliary Register Contents \\
\hline mar(Xmem), mar(Ymem), mar(coef(Cmem)) & AMAR Xmem, Ymem, Cmem \\
\hline Parallel Multiplies & MPY::MPY: Parallel Multiplies \\
\hline \begin{tabular}{l}
ACx = M40(rnd(uns(Xmem) * uns(coef(Cmem)))), \\
ACy = M40(rnd(uns(Ymem) * uns(coef(Cmem))))
\end{tabular} & \begin{tabular}{l}
MPY[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx \\
:: MPY[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy
\end{tabular} \\
\hline \begin{tabular}{l}
ACy = M40(rnd(uns(Smem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(uns(Smem) * uns(LO(coef(Cmem)))))
\end{tabular} & \begin{tabular}{l}
MPY[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy \\
:: MPY[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx
\end{tabular} \\
\hline \begin{tabular}{l}
ACy = M40(rnd(uns(HI(Lmem)) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(uns(LO(Lmem)) * uns(LO(coef(Cmem)))))
\end{tabular} & \begin{tabular}{l}
MPY[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy \\
:: MPY[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx
\end{tabular} \\
\hline \begin{tabular}{l}
ACy = M40(rnd(uns(Ymem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(uns(Xmem) * uns(LO(coef(Cmem)))))
\end{tabular} & \begin{tabular}{l}
MPY[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, \\
:: MPY[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx
\end{tabular} \\
\hline
\end{tabular}


Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Parallel Multiply and Accumulates & MAC::MAC: Parallel Multiply and Accumulates \\
\hline \begin{tabular}{l}
ACx = M40(rnd(ACx + (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd(ACy + (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy \\
\hline \[
\begin{aligned}
& \text{ACx} = \text{M40}(\text{rnd}((\text{ACx} \gg \#16) + (\text{uns}(\text{Xmem}) * \text{uns}(\text{coef}(\text{Cmem}))))), \\
& \text{ACy} = \text{M40}(\text{rnd}(\text{ACy} + (\text{uns}(\text{Ymem}) * \text{uns}(\text{coef}(\text{Cmem})))))
\end{aligned}
\] & MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx >> \#16 :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy \\
\hline \[
\begin{aligned}
& A C x=M 40(\operatorname{rnd}((\operatorname{ACx} \gg \# 16)+(\text { uns }(\text { Xmem }) * \text { uns }(\operatorname{coef}(\text { Cmem }))))), \\
& A C y=M 40(\text { rnd }((A C y \gg \# 16)+(\text { uns }(\text { Ymem }) * \text { uns }(\operatorname{coef}(\text { Cmem })))))
\end{aligned}
\] & MAC[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx >> \#16 :: MAC[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy >> \#16 \\
\hline  & MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx \\
\hline \begin{tabular}{l}
ACy = M40(rnd(ACy + (uns(Smem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd((ACx >> \#16) + (uns(Smem) * uns(LO(coef(Cmem))))))
\end{tabular} & \begin{tabular}{l}
MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy \\
:: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx >> \#16
\end{tabular} \\
\hline  & MAC[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy>>\#16 :: MAC[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx>>\#16 \\
\hline  & MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx \\
\hline  & MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx>>\#16 \\
\hline  & MAC[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy>>\#16 :: MAC[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx>>\#16 \\
\hline \begin{tabular}{l}
ACy = M40(rnd(ACy + uns(Ymem) * uns(HI(coef(Cmem))))), \\
ACx = M40(rnd(ACx + uns(Xmem) * uns(LO(coef(Cmem)))))
\end{tabular} & MAC[R][40] [uns(]Ymem[)], [uns(]HI(Cmem)[)], ACy, :: MAC[R][40] [uns(]Xmem[)], [uns(]LO(Cmem)[)], ACx \\
\hline
\end{tabular}
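
The parallel multiply-and-accumulate rows above pair two MACs that share one coefficient operand. As a rough illustration only, the following Python sketch models the first three rows of that group (the forms that take a single Cmem coefficient). The names `operand` and `dual_mac` and their flag parameters are hypothetical, and the sketch assumes that uns() selects an unsigned reading of a 16-bit memory word. It deliberately leaves out M40, rnd, fractional mode, saturation, and the 40-bit accumulator width, so it is not a bit-exact model of the CPU.

```python
MASK16 = 0xFFFF


def operand(word, unsigned=False):
    """Read a 16-bit word as unsigned (assumed uns() behavior) or as two's complement."""
    word &= MASK16
    if unsigned or word < 0x8000:
        return word
    return word - 0x10000


def dual_mac(acx, acy, xmem, ymem, cmem,
             uns_x=False, uns_y=False, uns_c=False,
             shift_x=False, shift_y=False):
    """Sketch of ACx = [ACx >> #16] + Xmem * coef(Cmem) in parallel with
    ACy = [ACy >> #16] + Ymem * coef(Cmem). Both MACs use the same coefficient.
    Rounding, saturation, and 40-bit wraparound are intentionally not modeled."""
    coef = operand(cmem, uns_c)
    if shift_x:
        acx >>= 16  # optional ACx >> #16 form
    if shift_y:
        acy >>= 16  # optional ACy >> #16 form
    acx += operand(xmem, uns_x) * coef
    acy += operand(ymem, uns_y) * coef
    return acx, acy
```

For example, `dual_mac(0, 0, 2, 3, 4)` accumulates 2 * 4 into the first result and 3 * 4 into the second, because both products read the same coefficient word.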

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline \begin{tabular}{l}
ACy = M40(rnd(ACy + (uns(Ymem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(LO(coef(Cmem))))))
\end{tabular} & \begin{tabular}{l}
MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy, \\
:: MAC[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16
\end{tabular} \\
\hline \begin{tabular}{l}
ACy = M40(rnd((ACy >> \#16) + (uns(Ymem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd((ACx >> \#16) + (uns(Xmem) * uns(LO(coef(Cmem))))))
\end{tabular} & MAC[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy >> \#16, :: MAC[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx >> \#16 \\
\hline Parallel Multiply and Subtracts & MAS::MAS: Parallel Multiply and Subtracts \\
\hline \begin{tabular}{l}
ACx = M40(rnd(ACx - (uns(Xmem) * uns(coef(Cmem))))), \\
ACy = M40(rnd(ACy - (uns(Ymem) * uns(coef(Cmem)))))
\end{tabular} & MAS[R][40] [uns(]Xmem[)], [uns(]Cmem[)], ACx :: MAS[R][40] [uns(]Ymem[)], [uns(]Cmem[)], ACy \\
\hline \begin{tabular}{l}
ACy = M40(rnd(ACy - (uns(Smem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx - (uns(Smem) * uns(LO(coef(Cmem))))))
\end{tabular} & MAS[R][40] [uns(]Smem[)], [uns(]HI(Cmem)[)], ACy :: MAS[R][40] [uns(]Smem[)], [uns(]LO(Cmem)[)], ACx \\
\hline \begin{tabular}{l}
ACy = M40(rnd(ACy - (uns(HI(Lmem)) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx - (uns(LO(Lmem)) * uns(LO(coef(Cmem))))))
\end{tabular} & \begin{tabular}{l}
MAS[R][40] [uns(]HI(Lmem)[)], [uns(]HI(Cmem)[)], ACy \\
:: MAS[R][40] [uns(]LO(Lmem)[)], [uns(]LO(Cmem)[)], ACx
\end{tabular} \\
\hline \begin{tabular}{l}
ACy = M40(rnd(ACy - (uns(Ymem) * uns(HI(coef(Cmem)))))), \\
ACx = M40(rnd(ACx - (uns(Xmem) * uns(LO(coef(Cmem))))))
\end{tabular} & MAS[R][40] [uns(]HI(Ymem)[)], [uns(]HI(Cmem)[)], ACy, :: MAS[R][40] [uns(]LO(Xmem)[)], [uns(]LO(Cmem)[)], ACx \\
\hline Peripheral Port Register Access Qualifiers & port: Peripheral Port Register Access Qualifiers \\
\hline readport() & port(Smem) \\
\hline writeport() & port(Smem) \\
\hline Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers & POPBOTH: Pop Accumulator or Extended Auxiliary Register Content from Stack Pointers \\
\hline xdst = popboth() & POPBOTH xdst \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Pop Top of Stack & POP: Pop Top of Stack \\
\hline dst1, dst2 = pop() & POP dst1, dst2 \\
\hline dst \(=\operatorname{pop}()\) & POP dst \\
\hline dst, Smem = pop() & POP dst, Smem \\
\hline ACx \(=\mathrm{dbl}(\mathrm{pop}())\) & POP ACx \\
\hline Smem \(=\) pop() & POP Smem \\
\hline \(\mathrm{dbl}(\) Lmem \()=\operatorname{pop}()\) & POP dbl(Lmem) \\
\hline Push Accumulator or Extended Auxiliary Register Content to Stack Pointers & PSHBOTH: Push Accumulator or Extended Auxiliary Register Content to Stack Pointers \\
\hline pshboth(xsrc) & PSHBOTH xsrc \\
\hline Push to Top of Stack & PSH: Push to Top of Stack \\
\hline push(src1, src2) & PSH src1, src2 \\
\hline push(src) & PSH src \\
\hline push(src, Smem) & PSH src, Smem \\
\hline dbl(push(ACx)) & PSH ACx \\
\hline push(Smem) & PSH Smem \\
\hline push(dbl(Lmem)) & PSH dbl(Lmem) \\
\hline Repeat Block of Instructions Unconditionally & RPTB: Repeat Block of Instructions Unconditionally \\
\hline localrepeat \{ \} & RPTBLOCAL pmad \\
\hline blockrepeat \{ \} & RPTB pmad \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Repeat Single Instruction Conditionally & RPTCC: Repeat Single Instruction Conditionally \\
\hline while (cond \&\& (RPTC < k8)) repeat & RPTCC k8, cond \\
\hline Repeat Single Instruction Unconditionally & RPT: Repeat Single Instruction Unconditionally \\
\hline repeat(k8) & RPT k8 \\
\hline repeat(k16) & RPT k16 \\
\hline repeat(CSR) & RPT CSR \\
\hline Repeat Single Instruction Unconditionally and Decrement CSR & RPTSUB: Repeat Single Instruction Unconditionally and Decrement CSR \\
\hline repeat(CSR), CSR -= k4 & RPTSUB CSR, k4 \\
\hline Repeat Single Instruction Unconditionally and Increment CSR & RPTADD: Repeat Single Instruction Unconditionally and Increment CSR \\
\hline repeat(CSR), CSR += TAx & RPTADD CSR, TAx \\
\hline repeat(CSR), CSR \(+=\mathrm{k} 4\) & RPTADD CSR, k4 \\
\hline Return Conditionally & RETCC: Return Conditionally \\
\hline if (cond) return & RETCC cond \\
\hline Return Unconditionally & RET: Return Unconditionally \\
\hline return & RET \\
\hline Return from Interrupt & RETI: Return from Interrupt \\
\hline return_int & RETI \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Rotate Left Accumulator, Auxiliary, or Temporary Register Content & ROL: Rotate Left Accumulator, Auxiliary, or Temporary Register Content \\
\hline dst = BitOut \textbackslash\textbackslash{} src \textbackslash\textbackslash{} BitIn & ROL BitOut, src, BitIn, dst \\
\hline Rotate Right Accumulator, Auxiliary, or Temporary Register Content & ROR: Rotate Right Accumulator, Auxiliary, or Temporary Register Content \\
\hline dst = BitIn // src // BitOut & ROR BitIn, src, BitOut, dst \\
\hline Round Accumulator Content & ROUND: Round Accumulator Content \\
\hline ACy = rnd(ACx) & ROUND [ACx,] ACy \\
\hline Saturate Accumulator Content & SAT: Saturate Accumulator Content \\
\hline ACy = saturate(rnd(ACx)) & SAT[R] [ACx,] ACy \\
\hline Set Accumulator, Auxiliary, or Temporary Register Bit & BSET: Set Accumulator, Auxiliary, or Temporary Register Bit \\
\hline bit(src, Baddr) = \#1 & BSET Baddr, src \\
\hline Set Memory Bit & BSET: Set Memory Bit \\
\hline bit(Smem, src) = \#1 & BSET src, Smem \\
\hline Set Status Register Bit & BSET: Set Status Register Bit \\
\hline bit(STx, k4) = \#1 & BSET k4, STx_55 \\
\hline & BSET f-name \\
\hline Shift Accumulator Content Conditionally & SFTCC: Shift Accumulator Content Conditionally \\
\hline ACx = sftc(ACx, TCx) & SFTCC ACx, TCx \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)

\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Software Trap & TRAP: Software Trap \\
\hline trap(k5) & TRAP k5 \\
\hline Square & SQR: Square \\
\hline ACy = rnd(ACx * ACx) & SQR[R] [ACx,] ACy \\
\hline ACx = rnd(Smem * Smem)[, T3 = Smem] & SQRM[R] [T3 = ]Smem, ACx \\
\hline Square and Accumulate & SQA: Square and Accumulate \\
\hline ACy = rnd(ACy + (ACx * ACx)) & SQA[R] [ACx,] ACy \\
\hline ACy = rnd(ACx + (Smem * Smem))[, T3 = Smem] & SQAM[R] [T3 = ]Smem, [ACx,] ACy \\
\hline Square and Subtract & SQS: Square and Subtract \\
\hline ACy = rnd(ACy - (ACx * ACx)) & SQS[R] [ACx,] ACy \\
\hline ACy = rnd(ACx - (Smem * Smem))[, T3 = Smem] & SQSM[R] [T3 = ]Smem, [ACx,] ACy \\
\hline Square Distance & SQDST: Square Distance \\
\hline sqdst(Xmem, Ymem, ACx, ACy) & SQDST Xmem, Ymem, ACx, ACy \\
\hline Store Accumulator Content to Memory & MOV: Store Accumulator Content to Memory \\
\hline Smem = HI(ACx) & MOV HI(ACx), Smem \\
\hline Smem = HI(rnd(ACx)) & MOV [rnd(]HI(ACx)[)], Smem \\
\hline Smem = LO(ACx << Tx) & MOV ACx << Tx, Smem \\
\hline Smem = HI(rnd(ACx << Tx)) & MOV [rnd(]HI(ACx << Tx)[)], Smem \\
\hline Smem = LO(ACx << \#SHIFTW) & MOV ACx << \#SHIFTW, Smem \\
\hline Smem = HI(ACx << \#SHIFTW) & MOV HI(ACx << \#SHIFTW), Smem \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Smem = HI(rnd(ACx << \#SHIFTW)) & MOV [rnd(]HI(ACx << \#SHIFTW)[)], Smem \\
\hline Smem = HI(saturate(uns(rnd(ACx)))) & MOV [uns(] [rnd(]HI[(saturate](ACx)[)))], Smem \\
\hline Smem = HI(saturate(uns(rnd(ACx << Tx)))) & MOV [uns(] [rnd(]HI[(saturate](ACx << Tx)[)))], Smem \\
\hline Smem = HI(saturate(uns(rnd(ACx << \#SHIFTW)))) & MOV [uns(] [rnd(]HI[(saturate](ACx << \#SHIFTW)[)))], Smem \\
\hline dbl(Lmem) = ACx & MOV ACx, dbl(Lmem) \\
\hline dbl(Lmem) = saturate(uns(ACx)) & MOV [uns(]saturate(ACx)[)], dbl(Lmem) \\
\hline \begin{tabular}{l}
HI(Lmem) = HI(ACx) >> \#1, \\
LO(Lmem) = LO(ACx) >> \#1
\end{tabular} & MOV ACx >> \#1, dual(Lmem) \\
\hline \begin{tabular}{l}
Xmem = LO(ACx), \\
Ymem = HI(ACx)
\end{tabular} & MOV ACx, Xmem, Ymem \\
\hline Store Accumulator Pair Content to Memory & MOV: Store Accumulator Pair Content to Memory \\
\hline Lmem = pair(HI(ACx)) & MOV pair(HI(ACx)), dbl(Lmem) \\
\hline Lmem = pair(LO(ACx)) & MOV pair(LO(ACx)), dbl(Lmem) \\
\hline Store Accumulator, Auxiliary, or Temporary Register Content to Memory & MOV: Store Accumulator, Auxiliary, or Temporary Register Content to Memory \\
\hline Smem = src & MOV src, Smem \\
\hline high_byte(Smem) = src & MOV src, high_byte(Smem) \\
\hline low_byte(Smem) = src & MOV src, low_byte(Smem) \\
\hline Store Auxiliary or Temporary Register Pair Content to Memory & MOV: Store Auxiliary or Temporary Register Pair Content to Memory \\
\hline Lmem = pair(TAx) & MOV pair(TAx), dbl(Lmem) \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Store CPU Register Content to Memory & MOV: Store CPU Register Content to Memory \\
\hline Smem \(=\) BK03 & MOV BK03, Smem \\
\hline Smem \(=\) BK47 & MOV BK47, Smem \\
\hline Smem = BKC & MOV BKC, Smem \\
\hline Smem = BSA01 & MOV BSA01, Smem \\
\hline Smem = BSA23 & MOV BSA23, Smem \\
\hline Smem = BSA45 & MOV BSA45, Smem \\
\hline Smem = BSA67 & MOV BSA67, Smem \\
\hline Smem = BSAC & MOV BSAC, Smem \\
\hline Smem \(=\) BRC0 & MOV BRC0, Smem \\
\hline Smem \(=\) BRC1 & MOV BRC1, Smem \\
\hline Smem = CDP & MOV CDP, Smem \\
\hline Smem = CSR & MOV CSR, Smem \\
\hline Smem = DP & MOV DP, Smem \\
\hline Smem \(=\) DPH & MOV DPH, Smem \\
\hline Smem = PDP & MOV PDP, Smem \\
\hline Smem = SP & MOV SP, Smem \\
\hline Smem = SSP & MOV SSP, Smem \\
\hline Smem \(=\) TRN0 & MOV TRN0, Smem \\
\hline Smem = TRN1 & MOV TRN1, Smem \\
\hline \(\mathrm{dbl}(\) Lmem \()=\) RETA & MOV RETA, dbl(Lmem) \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Store Extended Auxiliary Register Content to Memory & MOV: Store Extended Auxiliary Register Content to Memory \\
\hline dbl(Lmem) = XAsrc & MOV XAsrc, dbl(Lmem) \\
\hline Subtract Conditionally & SUBC: Subtract Conditionally \\
\hline subc(Smem, ACx, ACy) & SUBC Smem, [ACx,] ACy \\
\hline Subtraction & SUB: Subtraction \\
\hline dst = dst - src & SUB [src,] dst \\
\hline dst = dst - k4 & SUB k4, dst \\
\hline dst = src - K16 & SUB K16, [src,] dst \\
\hline dst = src - Smem & SUB Smem, [src,] dst \\
\hline dst = Smem - src & SUB src, Smem, dst \\
\hline ACy = ACy - (ACx << Tx) & SUB ACx << Tx, ACy \\
\hline ACy = ACy - (ACx << \#SHIFTW) & SUB ACx << \#SHIFTW, ACy \\
\hline ACy = ACx - (K16 << \#16) & SUB K16 << \#16, [ACx,] ACy \\
\hline ACy = ACx - (K16 << \#SHFT) & SUB K16 << \#SHFT, [ACx,] ACy \\
\hline ACy = ACx - (Smem << Tx) & SUB Smem << Tx, [ACx,] ACy \\
\hline ACy = ACx - (Smem << \#16) & SUB Smem << \#16, [ACx,] ACy \\
\hline ACy = (Smem << \#16) - ACx & SUB ACx, Smem << \#16, ACy \\
\hline ACy = ACx - uns(Smem) - BORROW & SUB [uns(]Smem[)], BORROW, [ACx,] ACy \\
\hline ACy = ACx - uns(Smem) & SUB [uns(]Smem[)], [ACx,] ACy \\
\hline ACy = ACx - (uns(Smem) << \#SHIFTW) & SUB [uns(]Smem[)] << \#SHIFTW, [ACx,] ACy \\
\hline ACy = ACx - dbl(Lmem) & SUB dbl(Lmem), [ACx,] ACy \\
\hline ACy = dbl(Lmem) - ACx & SUB ACx, dbl(Lmem), ACy \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline ACx = (Xmem << \#16) - (Ymem << \#16) & SUB Xmem, Ymem, ACx \\
\hline Subtraction with Parallel Store Accumulator Content to Memory & SUB::MOV: Subtraction with Parallel Store Accumulator Content to Memory \\
\hline \begin{tabular}{l}
ACy = (Xmem << \#16) - ACx, \\
Ymem = HI(ACy << T2)
\end{tabular} & \begin{tabular}{l}
SUB Xmem << \#16, ACx, ACy \\
:: MOV HI(ACy << T2), Ymem
\end{tabular} \\
\hline Swap Accumulator Content & SWAP: Swap Accumulator Content \\
\hline swap(ACx, ACy) & SWAP ACx, ACy \\
\hline Swap Accumulator Pair Content & SWAPP: Swap Accumulator Pair Content \\
\hline swap(pair(AC0), pair(AC2)) & SWAPP AC0, AC2 \\
\hline Swap Auxiliary Register Content & SWAP: Swap Auxiliary Register Content \\
\hline swap(ARx, ARy) & SWAP ARx, ARy \\
\hline Swap Auxiliary Register Pair Content & SWAPP: Swap Auxiliary Register Pair Content \\
\hline swap(pair(AR0), pair(AR2)) & SWAPP AR0, AR2 \\
\hline Swap Auxiliary and Temporary Register Content & SWAP: Swap Auxiliary and Temporary Register Content \\
\hline swap(ARx, Tx) & SWAP ARx, Tx \\
\hline Swap Auxiliary and Temporary Register Pair Content & SWAPP: Swap Auxiliary and Temporary Register Pair Content \\
\hline swap(pair(ARx), pair(Tx)) & SWAPP ARx, Tx \\
\hline Swap Auxiliary and Temporary Register Pairs Content & SWAP4: Swap Auxiliary and Temporary Register Pairs Content \\
\hline swap(block(AR4), block(T0)) & SWAP4 AR4, T0 \\
\hline
\end{tabular}

Table 7-1. Cross-Reference of Algebraic and Mnemonic Instruction Sets (Continued)
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Swap Temporary Register Content & SWAP: Swap Temporary Register Content \\
\hline swap(Tx, Ty) & SWAP Tx, Ty \\
\hline Swap Temporary Register Pair Content & SWAPP: Swap Temporary Register Pair Content \\
\hline swap(pair(T0), pair(T2)) & SWAPP T0, T2 \\
\hline Test Accumulator, Auxiliary, or Temporary Register Bit & BTST: Test Accumulator, Auxiliary, or Temporary Register Bit \\
\hline TCx = bit(src, Baddr) & BTST Baddr, src, TCx \\
\hline Test Accumulator, Auxiliary, or Temporary Register Bit Pair & BTSTP: Test Accumulator, Auxiliary, or Temporary Register Bit Pair \\
\hline bit(src, pair(Baddr)) & BTSTP Baddr, src \\
\hline Test Memory Bit & BTST: Test Memory Bit \\
\hline TCx = bit(Smem, src) & BTST src, Smem, TCx \\
\hline TCx = bit(Smem, k4) & BTST k4, Smem, TCx \\
\hline Test and Clear Memory Bit & BTSTCLR: Test and Clear Memory Bit \\
\hline TCx = bit(Smem, k4), bit(Smem, k4) = \#0 & BTSTCLR k4, Smem, TCx \\
\hline Test and Complement Memory Bit & BTSTNOT: Test and Complement Memory Bit \\
\hline TCx = bit(Smem, k4), cbit(Smem, k4) & BTSTNOT k4, Smem, TCx \\
\hline
\end{tabular}
\begin{tabular}{|c|c|}
\hline Algebraic Syntax & Mnemonic Syntax \\
\hline Test and Set Memory Bit & BTSTSET: Test and Set Memory Bit \\
\hline TCx = bit(Smem, k4), bit(Smem, k4) = \#1 & BTSTSET k4, Smem, TCx \\
\hline
\end{tabular}
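
One way to use the cross-reference programmatically is a plain lookup table. The sketch below copies a handful of rows from Table 7-1 verbatim into a Python dictionary. The names `CROSS_REFERENCE` and `to_mnemonic` are hypothetical, and the lookup only matches the generic operand placeholders exactly as the table prints them; it does not parse real operands.

```python
# A few rows copied from Table 7-1 (algebraic syntax -> mnemonic syntax).
CROSS_REFERENCE = {
    "xdst = popboth()": "POPBOTH xdst",
    "pshboth(xsrc)": "PSHBOTH xsrc",
    "repeat(CSR)": "RPT CSR",
    "return": "RET",
    "return_int": "RETI",
    "trap(k5)": "TRAP k5",
    "swap(ACx, ACy)": "SWAP ACx, ACy",
}


def to_mnemonic(algebraic):
    """Return the mnemonic form for a table entry, or None if it is not listed.

    Runs of whitespace are collapsed so that spacing differences do not
    prevent a match; no other normalization is attempted.
    """
    key = " ".join(algebraic.split())
    return CROSS_REFERENCE.get(key)
```

A call such as `to_mnemonic("return_int")` yields the mnemonic column of the matching row, and an entry that was not copied into the dictionary yields `None`.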

\section*{Index}

\section*{A}
abdst 5-2
absolute addressing modes 3-3
I/O absolute 3-3
k16 absolute 3-3
k23 absolute 3-3
Absolute Distance (abdst) 5-2
Absolute Value 5-4
Addition 5-7
Addition or Subtraction Conditionally (adsc) 5-31
Addition or Subtraction Conditionally with Shift (ads2c) 5-33
Addition with Absolute Value 5-27
Addition with Parallel Store Accumulator Content to Memory 5-29
Addition, Subtraction, or Move Accumulator Content Conditionally (adsc) 5-36
addressing modes
absolute 3-3
direct 3-4
indirect 3-6
introduction 3-2
ads2c 5-33
adsc 5-31, 5-36
affect of status bits 1-9
algebraic instruction set cross-reference to
mnemonic instruction set 7-1
AND 5-38
Antisymmetrical Finite Impulse Response Filter
(firsn) 5-168
arithmetic
absolute distance 5-2
absolute value 5-4
addition 5-7
addition or subtraction conditionally 5-31, 5-36
addition or subtraction conditionally with shift 5-33
addition with absolute value 5-27
compare memory with immediate value 5-126
compute exponent of accumulator content 5-131
compute mantissa and exponent of accumulator content 5-132
dual 16-bit addition and subtraction 5-140
dual 16-bit additions 5-135
dual 16-bit subtraction and addition 5-154
dual 16-bit subtractions 5-145
finite impulse response filter,
antisymmetrical 5-168
finite impulse response filter, symmetrical 5-170
least mean square 5-173, 5-175
multiply 5-269
multiply and accumulate 5-308
multiply and subtract 5-369
negation 5-403
round accumulator content 5-518
saturate accumulator content 5-520
square 5-557
square and accumulate 5-560
square and subtract 5-563
square distance 5-566
subtract conditionally 5-601
subtraction 5-603

\section*{B}
bit field comparison 5-47
bit field counting 5-134
bit field expand 5-166
bit field extract 5-167
bit manipulation
bitwise AND memory with immediate value and compare to zero 5-47
clear accumulator, auxiliary, or temporary register bit 5-88
clear memory bit 5-89
clear status register bit 5-90
complement accumulator, auxiliary, or temporary register bit 5-128
complement accumulator, auxiliary, or temporary register content 5-129
complement memory bit 5-130
expand accumulator bit field 5-166
extract accumulator bit field 5-167
set accumulator, auxiliary, or temporary register bit 5-522
set memory bit 5-523
set status register bit 5-524
test accumulator, auxiliary, or temporary register bit 5-641
test accumulator, auxiliary, or temporary register bit pair 5-643
test and clear memory bit 5-648
test and complement memory bit 5-649
test and set memory bit 5-650
test memory bit 5-645
Bitwise AND 5-38
Bitwise AND Memory with Immediate Value and
Compare to Zero 5-47
bitwise complement 5-129
Bitwise Exclusive OR (XOR) 5-57
Bitwise OR 5-48
blockrepeat 5-484
branch
conditionally 5-66
on auxiliary register not zero 5-74
unconditionally 5-70
Branch Conditionally (if goto) 5-66
Branch on Auxiliary Register Not Zero (if goto) 5-74
Branch Unconditionally (goto) 5-70

\section*{C}
call 5-83
conditionally 5-77
unconditionally 5-83
Call Conditionally (if call) 5-77
Call Unconditionally (call) 5-83
cbit 5-128, 5-130
circular 5-87
circular addressing 3-21
Circular Addressing Qualifier (circular) 5-87
clear
accumulator bit 5-88
auxiliary register bit 5-88
memory bit 5-89
status register bit 5-90
temporary register bit 5-88
Clear Accumulator Bit 5-88
Clear Auxiliary Register Bit 5-88
Clear Memory Bit 5-89
Clear Status Register Bit 5-90
Clear Temporary Register Bit 5-88
compare
accumulator, auxiliary, or temporary register content 5-93
accumulator, auxiliary, or temporary register content maximum 5-105
accumulator, auxiliary, or temporary register content minimum 5-108
accumulator, auxiliary, or temporary register content with AND 5-95
accumulator, auxiliary, or temporary register content with OR 5-100
and branch 5-111
and select accumulator content maximum 5-114
and select accumulator content minimum 5-120
memory with immediate value 5-126
Compare Accumulator Content 5-93
Compare Accumulator Content Maximum (max) 5-105
Compare Accumulator Content Minimum (min) 5-108
Compare Accumulator Content with AND 5-95
Compare Accumulator Content with OR 5-100
Compare and Branch 5-111
compare and goto 5-111
Compare and Select Accumulator Content
Maximum (max_diff) 5-114
Compare and Select Accumulator Content Minimum (min_diff) 5-120
Compare Auxiliary Register Content 5-93
Compare Auxiliary Register Content Maximum (max) 5-105
Compare Auxiliary Register Content Minimum (min) 5-108

Compare Auxiliary Register Content with AND 5-95
Compare Auxiliary Register Content with OR 5-100
compare maximum 5-105
Compare Memory with Immediate Value 5-126
compare minimum 5-108
Compare Temporary Register Content 5-93
Compare Temporary Register Content Maximum (max) 5-105
Compare Temporary Register Content Minimum (min) 5-108
Compare Temporary Register Content with AND 5-95
Compare Temporary Register Content with OR 5-100
complement
accumulator bit 5-128
accumulator content 5-129
auxiliary register bit 5-128
auxiliary register content 5-129
memory bit 5-130
temporary register bit 5-128
temporary register content 5-129
Complement Accumulator Bit (cbit) 5-128
Complement Accumulator Content 5-129
Complement Auxiliary Register Bit (cbit) 5-128
Complement Auxiliary Register Content 5-129
Complement Memory Bit (cbit) 5-130
Complement Temporary Register Bit (cbit) 5-128
Complement Temporary Register Content 5-129
Compute Exponent of Accumulator Content (exp) 5-131
Compute Mantissa and Exponent of Accumulator Content 5-132
cond field 1-7
conditional
addition or subtraction 5-31
addition or subtraction with shift 5-33
addition, subtraction, or move accumulator content 5-36
branch 5-66
call 5-77
execute 5-159
repeat single instruction 5-495
return 5-508
shift 5-527
subtract 5-601
count 5-134
Count Accumulator Bits (count) 5-134
Cross-Reference to Algebraic and Mnemonic Instruction Sets 7-1

\section*{D}
delay 5-220
direct addressing modes 3-4
DP direct 3-4
PDP direct 3-5
register-bit direct 3-5
SP direct 3-5
Dual 16-Bit Addition and Subtraction 5-140
Dual 16-Bit Additions 5-135
dual 16-bit arithmetic
addition and subtraction 5-140
additions 5-135
subtraction and addition 5-154
subtractions 5-145
Dual 16-Bit Subtraction and Addition 5-154
Dual 16-Bit Subtractions 5-145

\section*{E}

Execute Conditionally (if execute) 5-159
exp 5-131, 5-132
Expand Accumulator Bit Field (field_expand) 5-166
extended auxiliary register (XAR)
load from memory 5-215
load with immediate value 5-216
modify content 5-246
modify content by addition 5-249
modify content by subtraction 5-251
move content 5-261
pop content from stack pointers 5-468
push content to stack pointers 5-476
store to memory 5-600
Extract Accumulator Bit Field (field_extract) 5-167

\section*{F}
field_expand 5-166
field_extract 5-167
finite impulse response (FIR) filter
antisymmetrical 5-168
symmetrical 5-170
firs 5-170
firsn 5-168

\section*{G}
goto 5-70

\section*{I}
idle 5-172
if call 5-77
if execute 5-159
if goto 5-66, 5-74
if return 5-508
indirect addressing modes 3-6
AR indirect 3-6
CDP indirect 3-16
coefficient indirect 3-19
dual AR indirect 3-14
initialize memory 5-217
instruction qualifier
circular addressing 5-87
linear addressing 5-179
memory-mapped register access 5-221
instruction set
abbreviations 1-2
affect of status bits 1-9
conditional fields 1-7
nonrepeatable instructions 1-20
notes 1-14
opcode symbols and abbreviations 6-19
opcodes 6-2
operators 1-6
rules 1-14
symbols 1-2
terms 1-2
instruction set conditional fields 1-7
instruction set notes and rules 1-14
instruction set opcode
abbreviations 6-19
symbols 6-19
instruction set opcodes 6-2
instruction set summary 4-1
instruction set terms, symbols, and abbreviations 1-2
interrupt 5-549
intr 5-549

\section*{L}
Least Mean Square (lms) 5-173
Least Mean Square (lmsf) 5-175
linear 5-179
Linear Addressing Qualifier (linear) 5-179
List of Algebraic Instruction Opcodes 6-1
lms 5-173
lmsf 5-175
load
accumulator from memory 5-180
accumulator from memory with parallel store accumulator content to memory 5-189
accumulator pair from memory 5-191
accumulator with immediate value 5-196
accumulator, auxiliary, or temporary register from memory 5-199
accumulator, auxiliary, or temporary register with immediate value 5-205
auxiliary or temporary register pair from memory 5-209
CPU register from memory 5-210
CPU register with immediate value 5-213
extended auxiliary register (XAR) from memory 5-215
extended auxiliary register (XAR) with immediate value 5-216
memory with immediate value 5-217
Load Accumulator from Memory 5-180, 5-199
Load Accumulator from Memory with Parallel Store
Accumulator Content to Memory 5-189
Load Accumulator Pair from Memory 5-191
Load Accumulator with Immediate Value 5-196,
5-205
Load Auxiliary Register from Memory 5-199
Load Auxiliary Register Pair from Memory 5-209
Load Auxiliary Register with Immediate Value 5-205
Load CPU Register from Memory 5-210
Load CPU Register with Immediate Value 5-213
Load Extended Auxiliary Register (XAR) from
Memory 5-215

Load Extended Auxiliary Register (XAR) with Immediate Value 5-216
Load Memory with Immediate Value 5-217
Load Temporary Register from Memory 5-199
Load Temporary Register Pair from Memory 5-209
Load Temporary Register with Immediate
Value 5-205
localrepeat 5-484
lock, access qualifier 5-218
Lock Access Qualifier 5-218
logical
bitwise AND 5-38
bitwise OR 5-48
bitwise XOR 5-57
count accumulator bits 5-134
shift accumulator content logically 5-529
shift accumulator, auxiliary, or temporary register content logically 5-532

\section*{M}
mant 5-132
mar 5-222, 5-224, 5-226, 5-231, 5-233, 5-237, 5-241, 5-246, 5-249, 5-251, 5-406
max 5-105
max_diff 5-114
max_diff_dbl 5-114
memory bit
clear 5-89
complement (not) 5-130
set 5-523
test 5-645
test and clear 5-648
test and complement 5-649
test and set 5-650
Memory Delay (delay) 5-220
Memory-Mapped Register Access Qualifier
(mmap) 5-221
min 5-108
min_diff 5-120
min_diff_dbl 5-120
mmap 5-221
mnemonic instruction set cross-reference to algebraic instruction set 7-1
modify
auxiliary or temporary register content 5-233
auxiliary or temporary register content by addition 5-237
auxiliary or temporary register content by subtraction 5-241
auxiliary register content 5-222
auxiliary register content with parallel multiply 5-224
auxiliary register content with parallel multiply and accumulate 5-226
auxiliary register content with parallel multiply and subtract 5-231
data stack pointer 5-245
extended auxiliary register (XAR) content 5-246
extended auxiliary register (XAR) content by addition 5-249
extended auxiliary register (XAR) content by subtraction 5-251
Modify Auxiliary Register Content (mar) 5-222, 5-233
Modify Auxiliary Register Content by Addition (mar) 5-237
Modify Auxiliary Register Content by Subtraction (mar) 5-241
Modify Auxiliary Register Content with Parallel Multiply (mar) 5-224
Modify Auxiliary Register Content with Parallel Multiply and Accumulate (mar) 5-226
Modify Auxiliary Register Content with Parallel Multiply and Subtract (mar) 5-231
Modify Data Stack Pointer 5-245
Modify Extended Auxiliary Register Content (mar) 5-246
Modify Extended Auxiliary Register Content by Addition (mar) 5-249
Modify Extended Auxiliary Register Content by Subtraction (mar) 5-251
Modify Temporary Register Content (mar) 5-233
Modify Temporary Register Content by Addition (mar) 5-237
Modify Temporary Register Content by Subtraction (mar) 5-241
move
accumulator content to auxiliary or temporary register 5-253
accumulator, auxiliary, or temporary register content 5-254
auxiliary or temporary register content to accumulator 5-256
auxiliary or temporary register content to CPU register 5-257
CPU register content to auxiliary or temporary register 5-259
extended auxiliary register content 5-261
memory delay 5-220
memory to memory 5-262
pop accumulator or extended auxiliary register content from stack pointers 5-468
pop top of stack 5-469
push accumulator or extended auxiliary register content to stack pointers 5-476
push to top of stack 5-477
swap accumulator content 5-629
swap accumulator pair content 5-630
swap auxiliary and temporary register content 5-633
swap auxiliary and temporary register pair content 5-635
swap auxiliary and temporary register pairs content 5-637
swap auxiliary register content 5-631
swap auxiliary register pair content 5-632
swap temporary register content 5-639
swap temporary register pair content 5-640
Move Accumulator Content 5-254
Move Accumulator Content to Auxiliary
Register 5-253
Move Accumulator Content to Temporary Register 5-253
Move Auxiliary Register Content 5-254
Move Auxiliary Register Content to
Accumulator 5-256
Move Auxiliary Register Content to CPU
Register 5-257
Move CPU Register Content to Auxiliary
Register 5-259
Move CPU Register Content to Temporary
Register 5-259
Move Extended Auxiliary Register (XAR)
Content 5-261
Move Memory to Memory 5-262
Move Temporary Register Content 5-254
Move Temporary Register Content to
Accumulator 5-256
Move Temporary Register Content to CPU
Register 5-257
Multiply 5-269

Multiply and Accumulate (MAC) 5-308
Multiply and Accumulate with Parallel Delay 5-325
Multiply and Accumulate with Parallel Load
Accumulator from Memory 5-327
Multiply and Accumulate with Parallel Multiply 5-329
Multiply and Accumulate with Parallel Multiply and Subtract 5-347
Multiply and Accumulate with Parallel Store Accumulator Content to Memory 5-367
Multiply and Subtract 5-369
Multiply and Subtract with Parallel Load Accumulator from Memory 5-379
Multiply and Subtract with Parallel Multiply 5-381
Multiply and Subtract with Parallel Multiply and Accumulate 5-390
Multiply and Subtract with Parallel Store
Accumulator Content to Memory 5-401
Multiply with Parallel Multiply and
Accumulate 5-283
Multiply with Parallel Multiply and Subtract 5-295
Multiply with Parallel Store Accumulator Content to Memory 5-305

\section*{N}

Negate Accumulator Content 5-403
Negate Auxiliary Register Content 5-403
Negate Temporary Register Content 5-403
negation
accumulator content 5-403
auxiliary register content 5-403
temporary register content 5-403
No Operation (nop) 5-405
nonrepeatable instructions 1-20
nop 5-405

\section*{O}
operand qualifier 5-466
OR 5-48

\section*{P}

Parallel Modify Auxiliary Register Contents (mar) 5-406
Parallel Multiplies 5-407
Parallel Multiply and Accumulates 5-419
Parallel Multiply and Subtracts 5-454
parallel operations
    addition with parallel store accumulator content
        to memory 5-29
    load accumulator from memory with parallel store
        accumulator content to memory 5-189
    modify auxiliary register content with parallel
        multiply 5-224
    modify auxiliary register content with parallel
        multiply and accumulate 5-226
    modify auxiliary register content with parallel
        multiply and subtract 5-231
    modify auxiliary register contents 5-406
    multiplies 5-407
    multiply and accumulate with parallel
        delay 5-325
    multiply and accumulate with parallel load
        accumulator from memory 5-327
    multiply and accumulate with parallel
        multiply 5-329
    multiply and accumulate with parallel multiply and
        subtract 5-347
    multiply and accumulate with parallel store
        accumulator content to memory 5-367
    multiply and accumulates 5-419
    multiply and subtract with parallel load
        accumulator from memory 5-379
    multiply and subtract with parallel
        multiply 5-381
    multiply and subtract with parallel multiply and
        accumulate 5-390
    multiply and subtract with parallel store
        accumulator content to memory 5-401
    multiply and subtracts 5-454
    multiply with parallel multiply and
        accumulate 5-283
    multiply with parallel multiply and
        subtract 5-295
    multiply with parallel store accumulator content to
        memory 5-305
    subtraction with parallel store accumulator
        content to memory 5-627
parallelism basics 2-3
parallelism features 2-2
Peripheral Port Register Access Qualifiers 5-466
pop 5-469

Pop Accumulator Content from Stack Pointers (popboth) 5-468
Pop Extended Auxiliary Register (XAR) Content from Stack Pointers (popboth) 5-468
Pop Top of Stack (pop) 5-469
popboth 5-468
program control
branch conditionally 5-66
branch on auxiliary register not zero 5-74
branch unconditionally 5-70
call conditionally 5-77
call unconditionally 5-83
compare and branch 5-111
execute conditionally 5-159
idle 5-172
no operation 5-405
repeat block of instructions unconditionally 5-484
repeat single instruction conditionally 5-495
repeat single instruction unconditionally 5-498
repeat single instruction unconditionally and decrement CSR 5-503
repeat single instruction unconditionally and increment CSR 5-505
return conditionally 5-508
return from interrupt 5-512
return unconditionally 5-510
software interrupt 5-549
software reset 5-551
software trap 5-555
pshboth 5-476
push 5-477
Push Accumulator Content to Stack Pointers (pshboth) 5-476
Push Extended Auxiliary Register (XAR) Content to Stack Pointers (pshboth) 5-476
Push to Top of Stack (push) 5-477

\section*{R}

readport 5-466
register bit
clear 5-88
complement (not) 5-128
set 5-522
test 5-641
test bit pair 5-643
repeat 5-498, 5-503, 5-505
Repeat Block of Instructions Unconditionally 5-484

Repeat Single Instruction Conditionally (while repeat) 5-495
Repeat Single Instruction Unconditionally (repeat) 5-498
Repeat Single Instruction Unconditionally and Decrement CSR (repeat) 5-503
Repeat Single Instruction Unconditionally and Increment CSR (repeat) 5-505
reset 5-551
resource conflicts in a parallel pair 2-4
return 5-510
Return Conditionally (if return) 5-508
Return from Interrupt (return_int) 5-512
Return Unconditionally (return) 5-510
return_int 5-512
rnd 5-518
Rotate Left Accumulator Content 5-514
Rotate Left Auxiliary Register Content 5-514
Rotate Left Temporary Register Content 5-514
Rotate Right Accumulator Content 5-516
Rotate Right Auxiliary Register Content 5-516
Rotate Right Temporary Register Content 5-516
Round Accumulator Content (rnd) 5-518
rounding 5-518

\section*{S}

saturate 5-520
Saturate Accumulator Content (saturate) 5-520
set
accumulator bit 5-522
auxiliary register bit 5-522
memory bit 5-523
status register bit 5-524
temporary register bit 5-522
Set Accumulator Bit 5-522
Set Auxiliary Register Bit 5-522
Set Memory Bit 5-523
Set Status Register Bit 5-524
Set Temporary Register Bit 5-522
sftc 5-527
Shift Accumulator Content Conditionally (sftc) 5-527
Shift Accumulator Content Logically 5-529, 5-532
Shift Auxiliary Register Content Logically 5-532
shift conditionally 5-527
shift logically 5-529, 5-532
Shift Temporary Register Content Logically 5-532
Signed Shift of Accumulator Content 5-535, 5-544
Signed Shift of Auxiliary Register Content 5-544
Signed Shift of Temporary Register Content 5-544
soft-dual parallelism 2-5
Software Interrupt (intr) 5-549
Software Reset (reset) 5-551
Software Trap (trap) 5-555
sqdst 5-566
Square 5-557
Square and Accumulate 5-560
Square and Subtract 5-563
Square Distance (sqdst) 5-566
status register bit
clear 5-90
set 5-524
store
accumulator content to memory 5-568
accumulator pair content to memory 5-588
accumulator, auxiliary, or temporary register content to memory 5-591
auxiliary or temporary register pair content to memory 5-595
CPU register content to memory 5-596
extended auxiliary register (XAR) to memory 5-600
Store Accumulator Content to Memory 5-568, 5-591
Store Accumulator Pair Content to Memory 5-588
Store Auxiliary Register Content to Memory 5-591
Store Auxiliary Register Pair Content to Memory 5-595
Store CPU Register Content to Memory 5-596
Store Extended Auxiliary Register (XAR) to Memory 5-600
Store Temporary Register Content to Memory 5-591
Store Temporary Register Pair Content to Memory 5-595
subc 5-601
Subtract Conditionally 5-601
Subtraction 5-603
Subtraction with Parallel Store Accumulator Content to Memory 5-627
swap 5-629, 5-630, 5-631, 5-632, 5-633, 5-635, 5-637, 5-639, 5-640



\section*{IMPORTANT NOTICE}

Texas Instruments Incorporated and its subsidiaries (TI) reserve the right to make corrections, modifications, enhancements, improvements, and other changes to its products and services at any time and to discontinue any product or service without notice. Customers should obtain the latest relevant information before placing orders and should verify that such information is current and complete. All products are sold subject to TI's terms and conditions of sale supplied at the time of order acknowledgment.
TI warrants performance of its hardware products to the specifications applicable at the time of sale in accordance with TI's standard warranty. Testing and other quality control techniques are used to the extent TI deems necessary to support this warranty. Except where mandated by government requirements, testing of all parameters of each product is not necessarily performed.
TI assumes no liability for applications assistance or customer product design. Customers are responsible for their products and applications using TI components. To minimize the risks associated with customer products and applications, customers should provide adequate design and operating safeguards.
TI does not warrant or represent that any license, either express or implied, is granted under any TI patent right, copyright, mask work right, or other TI intellectual property right relating to any combination, machine, or process in which TI products or services are used. Information published by TI regarding third-party products or services does not constitute a license from TI to use such products or services or a warranty or endorsement thereof. Use of such information may require a license from a third party under the patents or other intellectual property of the third party, or a license from TI under the patents or other intellectual property of TI.
Reproduction of TI information in TI data books or data sheets is permissible only if reproduction is without alteration and is accompanied by all associated warranties, conditions, limitations, and notices. Reproduction of this information with alteration is an unfair and deceptive business practice. TI is not responsible or liable for such altered documentation. Information of third parties may be subject to additional restrictions.

Resale of TI products or services with statements different from or beyond the parameters stated by TI for that product or service voids all express and any implied warranties for the associated TI product or service and is an unfair and deceptive business practice. TI is not responsible or liable for any such statements.

TI products are not authorized for use in safety-critical applications (such as life support) where a failure of the TI product would reasonably be expected to cause severe personal injury or death, unless officers of the parties have executed an agreement specifically governing such use. Buyers represent that they have all necessary expertise in the safety and regulatory ramifications of their applications, and acknowledge and agree that they are solely responsible for all legal, regulatory and safety-related requirements concerning their products and any use of TI products in such safety-critical applications, notwithstanding any applications-related information or support that may be provided by TI. Further, Buyers must fully indemnify TI and its representatives against any damages arising out of the use of TI products in such safety-critical applications.
TI products are neither designed nor intended for use in military/aerospace applications or environments unless the TI products are specifically designated by TI as military-grade or "enhanced plastic." Only products designated by TI as military-grade meet military specifications. Buyers acknowledge and agree that any such use of TI products which TI has not designated as military-grade is solely at the Buyer's risk, and that they are solely responsible for compliance with all legal and regulatory requirements in connection with such use.
TI products are neither designed nor intended for use in automotive applications or environments unless the specific TI products are designated by TI as compliant with ISO/TS 16949 requirements. Buyers acknowledge and agree that, if they use any non-designated products in automotive applications, TI will not be responsible for any failure to meet such requirements.
Following are URLs where you can obtain information on other Texas Instruments products and application solutions:

\section*{Products}

\begin{tabular}{|c|c|}
\hline Amplifiers & amplifier.ti.com \\
\hline Data Converters & dataconverter.ti.com \\
\hline DLP® Products & www.dlp.com \\
\hline DSP & dsp.ti.com \\
\hline Clocks and Timers & www.ti.com/clocks \\
\hline Interface & interface.ti.com \\
\hline Logic & logic.ti.com \\
\hline Power Mgmt & power.ti.com \\
\hline Microcontrollers & microcontroller.ti.com \\
\hline RFID & www.ti-rfid.com \\
\hline RF/IF and ZigBee® Solutions & www.ti.com/lprf \\
\hline
\end{tabular}

\begin{tabular}{|c|c|}
\hline Applications & \\
\hline Audio & www.ti.com/audio \\
\hline Automotive & www.ti.com/automotive \\
\hline Broadband & www.ti.com/broadband \\
\hline Digital Control & www.ti.com/digitalcontrol \\
\hline Medical & www.ti.com/medical \\
\hline Military & www.ti.com/military \\
\hline Optical Networking & www.ti.com/opticalnetwork \\
\hline Security & www.ti.com/security \\
\hline Telephony & www.ti.com/telephony \\
\hline Video \& Imaging & www.ti.com/video \\
\hline Wireless & www.ti.com/wireless \\
\hline
\end{tabular}

Mailing Address: Texas Instruments, Post Office Box 655303, Dallas, Texas 75265
Copyright © 2009, Texas Instruments Incorporated```


[^0]:    2) dst-DU, src-AU or dst-AU, src-DU
