TMS320C4x
User’s Guide

SPRU063C
May 1999
IMPORTANT NOTICE

Texas Instruments and its subsidiaries (Ti) reserve the right to make changes to their products or to discontinue any product or service without notice, and advise customers to obtain the latest version of relevant information to verify, before placing orders, that information being relied on is current and complete. All products are sold subject to the terms and conditions of sale supplied at the time of order acknowledgement, including those pertaining to warranty, patent infringement, and limitation of liability.

Ti warrants performance of its semiconductor products to the specifications applicable at the time of sale in accordance with Ti’s standard warranty. Testing and other quality control techniques are utilized to the extent Ti deems necessary to support this warranty. Specific testing of all parameters of each device is not necessarily performed, except those mandated by government requirements.

CERTAIN APPLICATIONS USING SEMICONDUCTOR PRODUCTS MAY INVOLVE POTENTIAL RISKS OF DEATH, PERSONAL INJURY, OR SEVERE PROPERTY OR ENVIRONMENTAL DAMAGE (“CRITICAL APPLICATIONS”). Ti SEMICONDUCTOR PRODUCTS ARE NOT DESIGNED, AUTHORIZED, OR WARRANTED TO BE SUITABLE FOR USE IN LIFE-SUPPORT DEVICES OR SYSTEMS OR OTHER CRITICAL APPLICATIONS. INCLUSION OF Ti PRODUCTS IN SUCH APPLICATIONS IS UNDERSTOOD TO BE FULLY AT THE CUSTOMER’S RISK.

In order to minimize risks associated with the customer’s applications, adequate design and operating safeguards must be provided by the customer to minimize inherent or procedural hazards.

Ti assumes no liability for applications assistance or customer product design. Ti does not warrant or represent that any license, either express or implied, is granted under any patent right, copyright, mask work right, or other intellectual property right of Ti covering or relating to any combination, machine, or process in which such semiconductor products or services might be or are used. Ti’s publication of information regarding any third party’s products or services does not constitute Ti’s approval, warranty or endorsement thereof.

Copyright © 1999, Texas Instruments Incorporated
Preface

Read This First

About This Manual

This user’s guide serves as a reference book for the TMS320C40 and TMS320C44 digital signal processors. Throughout the book, all references to the TMS320C4x apply to both devices, except when otherwise noted.

How to Use This Manual

The following table summarizes the information contained in this user’s guide:

<table>
<thead>
<tr>
<th>If you are looking for information about:</th>
<th>Turn to these chapters:</th>
</tr>
</thead>
<tbody>
<tr>
<td>Addressing modes</td>
<td>Chapter 6, <em>Addressing Modes</em></td>
</tr>
<tr>
<td>ARAUs</td>
<td>Chapter 2, <em>Architectural Overview</em></td>
</tr>
<tr>
<td>Bootloader</td>
<td>Chapter 10, <em>The Bootloader</em></td>
</tr>
<tr>
<td>Bus Structure</td>
<td>Chapter 2, <em>Architectural Overview</em></td>
</tr>
<tr>
<td></td>
<td>Chapter 9, <em>External Bus Operation</em></td>
</tr>
<tr>
<td>Cache</td>
<td>Chapter 4, <em>Memory and the Instruction Cache</em></td>
</tr>
<tr>
<td>Communication Ports</td>
<td>Chapter 12, <em>Communication Ports</em></td>
</tr>
<tr>
<td>CPU Architecture</td>
<td>Chapter 2, <em>Architectural Overview</em></td>
</tr>
<tr>
<td></td>
<td>Chapter 3, <em>CPU Registers</em></td>
</tr>
<tr>
<td>DMA</td>
<td>Chapter 11, <em>The DMA Coprocessor</em></td>
</tr>
<tr>
<td>Data Formats</td>
<td>Chapter 5, <em>Data Formats and Floating-Point Operation</em></td>
</tr>
<tr>
<td>Delayed Branches</td>
<td>Chapter 7, <em>Program Flow Control</em></td>
</tr>
<tr>
<td>Instruction set</td>
<td>Chapter 14, <em>Assembly Language Instructions</em></td>
</tr>
</tbody>
</table>
## Style and Symbol Conventions

This document uses the following conventions:

- Program listings, program examples, file names, and symbol names are shown in a special font. Examples use a bold version of the special font for emphasis. Here is a sample program listing segment:

```c
* LOOP1 RPTB MAX
   CMFF *AR0,R0 ;Compare number to the maximum
   MAX LDFLT *AR0,R0 ;If greater, this is a new max
   B NEXT
 LOOP2 RPTB MIN
   CMFF *AR0++(1),R0 ;Compare number to the minimum
   MIN LDFLT *-AR0(1),R0 ;If smaller, this is new minimum
 NEXT .
```

---

<table>
<thead>
<tr>
<th>If you are looking for information about:</th>
<th>Turn to these chapters:</th>
</tr>
</thead>
<tbody>
<tr>
<td>Interrupts</td>
<td>Chapter 7, <em>Program Flow Control</em></td>
</tr>
<tr>
<td>Memory</td>
<td>Chapter 2, <em>Architectural Overview</em></td>
</tr>
<tr>
<td></td>
<td>Chapter 4, <em>Memory and the Instruction Cache</em></td>
</tr>
<tr>
<td>Peripherals</td>
<td>Chapter 12, <em>Communication Ports</em></td>
</tr>
<tr>
<td></td>
<td>Chapter 11, <em>The DMA Coprocessor</em></td>
</tr>
<tr>
<td></td>
<td>Chapter 13, <em>Timers</em></td>
</tr>
<tr>
<td>Overview of the <code>C4x</code></td>
<td>Chapter 1, <em>Introduction</em></td>
</tr>
<tr>
<td>Program control</td>
<td>Chapter 7, <em>Program Flow Control</em></td>
</tr>
<tr>
<td>Pipeline</td>
<td>Chapter 8, <em>Pipeline Operation</em></td>
</tr>
<tr>
<td>Registers</td>
<td>Chapter 3, <em>CPU Registers</em></td>
</tr>
<tr>
<td></td>
<td>Chapter 12, <em>Communication Ports</em></td>
</tr>
<tr>
<td></td>
<td>Chapter 11, <em>The DMA Coprocessor</em></td>
</tr>
<tr>
<td></td>
<td>Chapter 13, <em>Timers</em></td>
</tr>
<tr>
<td>Repeat Mode</td>
<td>Chapter 7, <em>Program Flow Control</em></td>
</tr>
<tr>
<td>Reset</td>
<td>Chapter 7, <em>Program Flow Control</em></td>
</tr>
<tr>
<td>Timers</td>
<td>Chapter 13, <em>Timers</em></td>
</tr>
<tr>
<td>Traps</td>
<td>Chapter 7, <em>Program Flow Control</em></td>
</tr>
</tbody>
</table>
In syntax descriptions, the instruction is in **bold face** and the parameters are in *italic face*. Portions of a syntax that are in **bold face** should be entered as shown; portions of a syntax that are in *italic face* describe the type of information that should be entered. Here is an example of an instruction:

**CMPF3 src2,src1**

Notice that although the instruction mnemonic (CMPF3 in this example) is in capital letters, the 'C4x assembler is **not case sensitive** — it can assemble mnemonics entered in either upper or lower case.

**CMPF3** is the instruction mnemonic. This instruction has two parameters, indicated by src2 and src1.

Square brackets ([ and ]) identify an optional parameter. If you use an optional parameter, you must specify the information within the brackets; however, you don't enter the brackets themselves. Here's an example of an instruction that has an optional parameter:

[**label**] **LDP src [,DP]**

The **LDP** instruction is shown with two parameters; one is optional. The first parameter, *src*, is required. The second parameter, *DP*, and the label, are optional. As this syntax shows, if you use the optional second parameter, you must precede it with a comma.

Throughout this book MSB indicates the most significant bit and LSB indicates the least significant bit. MS indicates the most significant byte and LS indicates the least significant byte.
This book may contain cautions and warnings.

This is an example of a caution statement.
A caution statement describes a situation that could potentially damage your software or equipment.

This is an example of a warning statement.
A warning statement describes a situation that could potentially cause harm to you.

The information in a caution or a warning is provided for your protection. Please read each caution and warning carefully.
The following books describe the TMS320 floating-point devices and related support tools. To obtain a copy of any of these TI documents, call the Texas Instruments Literature Response Center at (800) 477–8924. When ordering, please identify the book by its title and literature number.

**TMS320C4x General-Purpose Applications User’s Guide** (literature number SPRU159) describes software and hardware applications for the 'C4x processor. Also includes development support information, parts lists, and XDS510 emulator design considerations.

**TMS320C4x Parallel Processing Development System Technical Reference** (literature number SPRU075) describes the TMS320C4x parallel processing system, a system with four C4xs with shared and distributed memory.

**Parallel Processing with the TMS320C4x** (literature number SPRA031) describes parallel processing and how the 'C4x can be used in parallel processing. Also provides sample parallel processing applications.

**TMS320 Floating-Point DSP Assembly Language Tools User’s Guide** (literature number SPRU035) describes the assembly language tools (assembler, linker, and other tools used to develop assembly language code), assembler directives, macros, common object file format, and symbolic debugging directives for the 'C3x and 'C4x generations of devices.

**TMS320 Floating-Point DSP Optimizing C Compiler User’s Guide** (literature number SPRU034) describes the TMS320 floating-point C compiler. This C compiler accepts ANSI standard C source code and produces TMS320 assembly language source code for the 'C3x and 'C4x generations of devices.

**TMS320C4x C Source Debugger User’s Guide** (literature number SPRU054) tells you how to invoke the 'C4x emulator and simulator versions of the C source debugger interface. This book discusses various aspects of the debugger interface, including window management, command entry, code execution, data management, and breakpoints. It also includes a tutorial that introduces basic debugger functionality.

**TMS320C4x Technical Brief** (literature number SPRU076) gives a condensed overview of the 'C4x DSP and its development tools. It also lists TMS320C4x third parties.
**TMS320 Family Development Support Reference Guide** (literature number SPRU011) describes the '320 family of digital signal processors and the various products that support it. This includes code-generation tools (compilers, assemblers, linkers, etc.) and system integration and debug tools (simulators, emulators, evaluation modules, etc.). This book also lists related documentation, outlines seminars and the university program, and gives factory repair and exchange information.

**TMS320 Third-Party Support Reference Guide** (literature number SPRU052) alphabetically lists over 100 third parties that supply various products that serve the family of '320 digital signal processors—software and hardware development tools, speech recognition, image processing, noise cancellation, modems, etc.

**TMS320 DSP Designer’s Notebook: Volume 1 (SPRT125).** Presents solutions to common design problems using 'C2x, 'C3x, 'C4x, 'C5x, and other TI DSPs.

### Related Articles and Books

A wide variety of related documentation is available on digital signal processing. These references fall into one of the following application categories:

- General-Purpose DSP
- Graphics/Imagery
- Speech/Voice
- Control
- Multimedia
- Military
- Telecommunications
- Automotive
- Consumer
- Medical
- Development Support

In the following list, references appear in alphabetical order according to author. The documents contain beneficial information regarding designs, operations, and applications for signal-processing systems; all of the documents provide additional references. Texas Instruments strongly suggests that you refer to these publications.

**General-Purpose DSP:**

Related Articles and Books


Related Articles and Books


**Graphics/Imagery:**


**Speech/Voice:**


Control:


Multimedia:


Military:


Telecommunications:


Automotive:


Consumer:


Medical:


**Development Support:**


If You Need Assistance. . .

<table>
<thead>
<tr>
<th>If you want to . . .</th>
<th>Do this . . .</th>
</tr>
</thead>
<tbody>
<tr>
<td>Request more information about Texas Instruments Digital Signal Processing (DSP) products</td>
<td>Write to: Texas Instruments Incorporated Market Communications Manager MS 736 P.O. Box 1443 Houston, Texas 77251–1443</td>
</tr>
<tr>
<td>Order Texas Instruments documentation</td>
<td>Call the TI Literature Response Center: (800) 477–8924</td>
</tr>
<tr>
<td>Ask questions about product operation or report suspected problems</td>
<td>Contact the DSP hotline: Phone: (713) 274–2320 FAX: (713) 274–2324 Electronic Mail: <a href="mailto:4389750@mcimail.com">4389750@mcimail.com</a>.</td>
</tr>
<tr>
<td>Obtain the source code in this user’s guide.</td>
<td>Call the TI BBS: (713) 274–2323 Ftp from: ftp.ti.com log in as user ftp cd to /mirrors/tms320bbs</td>
</tr>
<tr>
<td>Visit TI online, including TI&amp;ME ™, your own customized web page.</td>
<td>Point your browser at: <a href="http://www.ti.com">http://www.ti.com</a></td>
</tr>
<tr>
<td>Report mistakes or make comments about this or any other TI documentation.</td>
<td>Send electronic mail to: <a href="mailto:comments@books.sc.ti.com">comments@books.sc.ti.com</a> Send printed comments to: Texas Instruments Incorporated Technical Publications Mgr., MS 702 P.O. Box 1443 Houston, Texas 77251–1443</td>
</tr>
</tbody>
</table>

Trademarks

MS is a registered trademark of Microsoft Corp.
MS-Windows is a registered trademark of Microsoft Corp.
MS-DOS is a registered trademark of Microsoft Corp.
OS/2 is a trademark of International Business Machines Corp.
Sun and SPARC are trademarks of Sun Microsystems, Inc.
VAX and VMS are trademarks of Digital Equipment Corp.
## Contents

1 Introduction 1-1

*Introduces the TMS320 family and the TMS320C4x*

1.1 TMS320C4x Devices 1-2

1.1.1 The TMS320C40 1-2

1.1.2 The TMS320C44 1-2

1.2 Key Features of the TMS320C4x 1-3

1.3 TMS320C40 and TMS320C44 Device Comparison 1-4

2 Architectural Overview 2-1

Briefly describes the architecture of the CPU, buses, interrupts, and peripherals of the 'C4x

2.1 Central Processing Unit (CPU) 2-4

2.1.1 Floating-Point/Integer Multiplier 2-4

2.1.2 Arithmetic Logic Unit (ALU) and Internal Buses 2-4

2.1.3 Auxiliary Register Arithmetic Units (ARAUs) 2-6

2.1.4 CPU Primary Register File 2-6

2.1.5 CPU Expansion Register File 2-10

2.2 Memory Organization 2-11

2.2.1 RAM, ROM, and Cache 2-11

2.2.2 Memory Maps 2-13

2.2.3 Memory Aliasing (’C44 only) 2-17

2.2.4 Memory Addressing Modes 2-18

2.3 Internal Bus Operation 2-19

2.4 External Bus Operation 2-20

2.5 Interrupts 2-21

2.6 Peripherals 2-22

2.6.1 Communication Ports 2-23

2.6.2 Direct Memory Access (DMA) Coprocessor 2-23

2.6.3 Timers 2-24
3 CPU Registers ................................................................. 3-1
Lists and describes the contents of the CPU primary register file and the CPU expansion register file

3.1 CPU Primary Register File ............................................. 3-2
  3.1.1 Extended-Precision Registers (R0–R11) ....................... 3-3
  3.1.2 Auxiliary Registers (AR0–AR7) ................................. 3-4
  3.1.3 Data-Page Pointer (DP) ........................................ 3-4
  3.1.4 Index Registers (IR0, IR1) ....................................... 3-4
  3.1.5 Block-Size Register (BK) ....................................... 3-5
  3.1.6 System Stack Pointer (SP) ..................................... 3-5
  3.1.7 Status Register (ST) ........................................... 3-5
  3.1.8 DMA Coprocessor Interrupt Enable Register (DIE) ....... 3-8
  3.1.9 CPU Internal Interrupt Enable Register (IIE) .............. 3-11
  3.1.10 IIF Flag Register (IIF) ........................................ 3-13
  3.1.11 Block-Repeat (RS, RE) and Repeat-Count (RC) Registers 3-16
  3.1.12 Program Counter (PC) ......................................... 3-16
  3.1.13 Reserved Bits and Compatibility ............................. 3-16

3.2 CPU Expansion Register File .......................................... 3-17

4 Memory and the Instruction Cache ...................................... 4-1
Describes the structure of the memory map and the architecture of the instruction cache

4.1 Memory Map ............................................................ 4-2
4.2 Peripheral Bus Memory Map ............................................ 4-5
  4.2.1 Local and Global Memory Interface Control Registers .... 4-6
  4.2.2 Analysis Module Registers .................................... 4-6
  4.2.3 Timer Registers ................................................ 4-7
  4.2.4 Communication Port Memory Map ............................. 4-8
  4.2.5 DMA Coprocessor Registers ................................. 4-9

4.3 Instruction Cache ....................................................... 4-10
  4.3.1 Instruction Cache Architecture ............................... 4-10
  4.3.2 Cache Control Bits ............................................ 4-12
  4.3.3 Using the Cache .............................................. 4-13
  4.3.4 The LRU Cache Algorithm .................................... 4-14

5 Data Formats and Floating-Point Operation ............................ 5-1
Describes integer and floating-point data formats and discusses how some mathematical operations are performed on floating-point numbers

5.1 Signed-Integer Formats ................................................ 5-2
  5.1.1 Short Integer Format .......................................... 5-2
  5.1.2 Single-Precision Integer Format .............................. 5-2

5.2 Unsigned-Integer Formats ............................................ 5-3
  5.2.1 Short Unsigned-Integer Format ............................... 5-3
  5.2.2 Single-Precision Unsigned-Integer Format ................. 5-3
5.3 Floating-Point Formats .................................................. 5-4
  5.3.1 Short Floating-Point Format ........................................ 5-5
  5.3.2 Single-Precision Floating-Point Format ............................ 5-6
  5.3.3 Extended-Precision Floating-Point Format ......................... 5-7
  5.3.4 Determining the Decimal Equivalent of a Floating-Point Number ........................................................................... 5-8
  5.3.5 Conversion Between Floating-Point Formats ..................... 5-11
5.4 Floating-Point Conversion (IEEE Std. 754) ......................... 5-13
  5.4.1 Converting IEEE Format to Twos-Complement 'C4x Floating-Point Format .................................................. 5-14
  5.4.2 Converting Twos-Complement 'C4x Floating-Point Format to IEEE Format .................................................. 5-17
5.5 Floating-Point Multiplication .............................................. 5-19
5.6 Floating-Point Addition and Subtraction ............................... 5-23
5.7 Normalization (NORM Instruction) ....................................... 5-27
5.8 Rounding (RND Instruction) ................................................ 5-29
5.9 Floating-Point-to-Integer Conversion (FIX Instruction) .......... 5-31
5.10 Integer-to-Floating-Point Conversion (FLOAT Instruction) ...... 5-33
5.11 Reciprocal (RCPF Instruction) ............................................ 5-34
  5.11.1 Reciprocal Algorithm .................................................. 5-35
5.12 Reciprocal Square Root (RSQRF Instruction) ....................... 5-36
  Newton-Raphson Algorithm .................................................. 5-37

6 Addressing Modes ................................................................ 6-1

Describes addressing modes, using address registers, and stack managements in the 'C4x

6.1 Addressing Types ......................................................... 6-2
6.2 Register Addressing ....................................................... 6-3
6.3 Direct Addressing .......................................................... 6-5
6.4 Indirect Addressing ......................................................... 6-6
6.5 Immediate Addressing ...................................................... 6-18
6.6 PC-Relative Addressing .................................................... 6-19
6.7 Encoding of Addressing Modes .......................................... 6-21
  6.7.1 General Addressing Modes ........................................... 6-21
  6.7.2 Three-Operand Addressing Modes .................................. 6-22
  6.7.3 Parallel Addressing Modes ............................................ 6-24
  6.7.4 Conditional-Branch Addressing Modes ........................... 6-25
6.8 Circular Addressing .......................................................... 6-27
6.9 Bit-Reversed Addressing .................................................... 6-32

7 Program Flow Control ....................................................... 7-1

Describes software and hardware features that control how a program flows

7.1 Repeat Mode .................................................................. 7-2
  7.1.1 Control Bits ............................................................... 7-3
  7.1.2 Repeat-Mode Operation ............................................... 7-3
  7.1.3 RPTB and RPTBD Instructions ...................................... 7-4
  7.1.4 RPTS Instruction ........................................................ 7-5
  7.1.5 Repeat Mode Restriction Rules ...................................... 7-6
  7.1.6 RC Register Value After Repeat Mode Completes ............ 7-7
  7.1.7 Nesting Block Repeats .................................................. 7-8
## 7.2 Delayed Branches
- **7.2.1 Delayed Branches Without Annulling**
- **7.2.2 Delayed Branches With Annulling**

## 7.3 Calls, Traps, Branches, Jumps, and Returns

## 7.4 Interrupts
- **7.4.1 Interrupt Vector Table and Prioritization**
- **7.4.2 CPU Interrupt Control Bits**
- **7.4.3 Interrupt Processing**
- **7.4.4 CPU Interrupt Latency**
- **7.4.5 External Interrupts**

## 7.5 Traps
- **7.5.1 Initialization of Traps and Interrupts**
- **7.5.2 Operation of Traps**
- **7.5.3 Overlapping the Trap and Interrupt Vector Tables**

## 7.6 DMA Interrupts
- **7.6.1 DMA Interrupt Control Bits**
- **7.6.2 DMA Interrupt Processing**
- **7.6.3 CPU/DMA Interrupt Interaction**

## 7.7 Reset
- **7.7.1 Reset’s Effects on Pin States**
- **7.7.2 Reset Vector Location**
- **7.7.3 Additional Reset Operations**

## 8 Pipeline Operation
*Describes and explains the operation of the four pipeline stages in the 'C4x CPU*

### 8.1 Pipeline Structure

### 8.2 Pipeline Conflicts
- **8.2.1 Branch Conflicts**
- **8.2.2 Register Conflicts**
- **8.2.3 Memory Conflicts**

### 8.3 Memory Accesses for Maximum Performance

### 8.4 Clocking of Memory Accesses
- **8.4.1 Program Fetches**
- **8.4.2 Data Loads and Stores**

## 9 External Bus Operation
*Describes the features and functions of the two 'C4x external buses*

### 9.1 Overview

### 9.2 Memory Interface Signals

### 9.3 Memory-Interface Control Registers
- **9.3.1 Mapping Addresses to Strobes**
- **9.3.2 Page Size Operation**

### 9.4 Programmable Wait States

### 9.5 Memory Interface Timing
## Contents

### 9 The Bootloader

Describes 'C4x bootloader operation and also lists the bootloader code

- **9.1 Bootloader Description** .................................................. 9-2
- **9.2 Mode Selection** ......................................................... 9-3
- **9.3 Bootloading Sequence** .................................................. 9-5
- **9.4 Bootloading from External Memory (Examples)** ....................... 9-10
- **9.5 Bootloading from a Communication Port (Examples)** ................. 9-16
- **9.6 Modifying the IIOFx Pins After Bootloading** ......................... 9-19
- **9.7 The Bootloader Program** .............................................. 9-20

### 10 The Bootloader

Describes 'C4x bootloader operation and also lists the bootloader code

- **10.1 Bootloader Description** .................................................. 10-2
- **10.2 Mode Selection** ......................................................... 10-3
- **10.3 Bootloading Sequence** .................................................. 10-5
- **10.4 Bootloading from External Memory (Examples)** ....................... 10-10
- **10.5 Bootloading from a Communication Port (Examples)** ................. 10-16
- **10.6 Modifying the IIOFx Pins After Bootloading** ......................... 10-19
- **10.7 The Bootloader Program** .............................................. 10-20

### 11 The DMA Coprocessor

Describes and discusses operation of the 'C4x DMA coprocessor

- **11.1 Introduction** .......................................................... 11-2
- **11.2 DMA Functional Description** ......................................... 11-3
  - **11.2.1 DMA Basic Operation** ............................................ 11-5
- **11.3 DMA Registers** ........................................................ 11-7
  - **11.3.1 Control Register** ................................................. 11-7
  - **11.3.2 Address and Index Registers** ................................... 11-7
  - **11.3.3 Transfer Counter and Auxiliary Transfer Counter Registers** .. 11-15
  - **11.3.4 Link Pointer and Auxiliary Link-Pointer Registers** .......... 11-17
- **11.4 DMA Unified Mode** .................................................... 11-19
- **11.5 DMA Split Mode** ...................................................... 11-20
- **11.6 DMA Internal Priority Schemes** ..................................... 11-22
  - **11.6.1 Fixed Priority Scheme** .......................................... 11-22
  - **11.6.2 Rotating Priority Scheme** ...................................... 11-22
  - **11.6.3 Split Mode and DMA Channel Arbitration** ....................... 11-24
- **11.7 CPU and DMA Coprocessor Arbitration** ................................ 11-27
- **11.8 Data Transfer Modes** ................................................ 11-28
  - **11.8.1 Running in TRANSFER MODE = 002** .............................. 11-28
  - **11.8.2 Running in TRANSFER MODE = 012** .............................. 11-29
  - **11.8.3 Running in TRANSFER MODE = 102 (Autoinitialization 1)** ..... 11-29
  - **11.8.4 Running in TRANSFER MODE = 112 (Autoinitialization 2)** ..... 11-31
Contents

11.9  Autoinitialization ............................................................... 11-34
     11.9.1  Unified Mode ......................................................... 11-35
     11.9.2  Split Mode ........................................................... 11-35
     11.9.3  Incrementing the Link Pointer .................................... 11-36
     11.9.4  Synchronization ...................................................... 11-37
     11.9.5  Effect on DMA Control Register Bits ............................. 11-38
     11.9.6  Consecutive Autoinitializations .................................. 11-40
11.10 DMA and Interrupts .......................................................... 11-42
     11.10.1 Interrupts and Synchronization of DMA Channels .................. 11-43
     11.10.2 Synchronization Mode Bits ........................................ 11-46
11.11 DMA Memory Transfer Timing .............................................. 11-51
     11.11.1 Single DMA Memory Transfer Timing ............................... 11-51
     11.11.2 DMA Transfer Rate in Synchronization Mode ...................... 11-55
12  Communication Ports ........................................................ 12-1
    Describes and provides tips for using the communication ports
     12.1  Features ................................................................. 12-2
     12.2  Operational Overview ................................................ 12-3
           12.2.1  Token Transfer Operation ................................... 12-5
           12.2.2  Data Transfer Operation ..................................... 12-6
     12.3  Memory Map and Registers ........................................... 12-7
           12.3.1  Communication-Port Control Register (CPCR) ............. 12-8
           12.3.2  Input-Port Register .......................................... 12-9
           12.3.3  Output-Port Register ......................................... 12-9
           12.3.4  Communication-Port Software Reset Register ................ 12-10
     12.4  Port Arbitration Units (PAUs) ...................................... 12-11
     12.5  Halting of Input and Output FIFOs .................................. 12-14
           12.5.1  Input FIFO Halt Operation ................................... 12-15
           12.5.2  Output FIFO Halt Operation ................................ 12-15
     12.6  Coordinating Communication Ports With the CPU and DMA Coprocessor .... 12-17
     12.7  Token Transfer Operation ............................................ 12-19
     12.8  Word Transfer Operation ............................................. 12-22
           CSTRB Width Restrictions ........................................... 12-25
     12.9  Synchronizers ........................................................... 12-26
     12.10 Module Reset ........................................................... 12-29
     12.11 Tips for Using Communication Ports ................................. 12-32
13  Timers ........................................................................ 13-1
    Describes and discusses operation of the two 'C4x on-chip timers
     13.1  Overview of the Timers ............................................... 13-2
     13.2  Timer Pins ............................................................. 13-4
## Contents

13.3 Timer Control Registers .......................................................... 13-5  
  13.3.1 Timer Control Register ....................................................... 13-6  
  13.3.2 Timer Period Register ....................................................... 13-7  
  13.3.3 Timer Counter Register ..................................................... 13-8  
  13.3.4 Boundary Conditions in the Control Registers ......................... 13-8  
13.4 Timer Pulse Generation ......................................................... 13-9  
13.5 Timer Interrupts ................................................................. 13-11  
  13.5.1 Timer Interrupts and Their Vectors ..................................... 13-11  
  13.5.2 Timer Interrupt Operation ................................................. 13-11  
  13.5.3 Considerations When Using a Timer Interrupt ......................... 13-12  
13.6 Selecting CLKSRC and FUNC Values .......................................... 13-13  
  13.6.1 CLKSRC = 1 and FUNC = 0 ................................................. 13-13  
  13.6.2 CLKSRC=1 and FUNC=1 .................................................... 13-13  
  13.6.3 CLKSRC = 0 and FUNC = 0 ................................................. 13-14  
  13.6.4 CLKSRC = 0 and FUNC = 1 ................................................. 13-14  
13.7 Using TCLKx as General-Purpose I/O Pins .................................. 13-15  
13.8 Configuring a Timer ............................................................ 13-16  

14 Assembly Language Instructions ................................................. 14-1  
*Lists the entire instruction set for the 'C4x*

  14.1 Instruction Set ........................................................................ 14-2  
  14.1.1 Load-and-Store Instructions ............................................... 14-2  
  14.1.2 Two-Operand Instructions ................................................. 14-4  
  14.1.3 Three-Operand Instructions ................................................. 14-6  
  14.1.4 Program Control Instructions .............................................. 14-7  
  14.1.5 Interlocked Operations Instructions ..................................... 14-8  
  14.1.6 Parallel Operations Instructions ......................................... 14-9  
  14.1.7 Illegal Instructions ........................................................... 14-11  
  14.2 Condition Codes and Flags ..................................................... 14-12  
  14.3 Individual Instruction Descriptions .......................................... 14-16  
  14.3.1 Symbols and Abbreviations ................................................ 14-16  
  14.3.2 Optional Assembler Syntaxes .............................................. 14-18  
  14.3.3 Individual Instruction Descriptions ..................................... 14-20  

A Glossary ....................................................................................... A-1
Figures

2–1 TMS320C4x Block Diagram ......................................................... 2-2
2–2 Central Processing Unit (CPU) .................................................... 2-5
2–3 Memory Organization ................................................................. 2-12
2–4 'C40 Memory Map ................................................................. 2-14
2–5 'C44 Memory Map ................................................................. 2-15
2–6 Peripheral Memory Map ......................................................... 2-16
2–7 Memory Aliasing ('C44 only) .................................................. 2-17
2–8 Peripheral Modules ................................................................. 2-22
3–1 Extended-Precision Register Floating-Point Format ......................... 3-4
3–2 Extended-Precision Register Integer Format .................................. 3-4
3–3 Status Register (ST) ................................................................. 3-5
3–4 DMA Interrupt Enable Register Bit Functions for DMA Unified Mode ......................................................... 3-8
3–5 DMA Interrupt Enable Register Bit Functions for DMA Split Mode 3-10
3–6 Internal Interrupt Enable Register (IIE) ........................................ 3-12
3–7 Interrupt Flag Register (IIF) ..................................................... 3-14
4–1 'C40 Memory Map ................................................................. 4-3
4–2 'C44 Memory Map ................................................................. 4-4
4–3 Peripheral Memory Map ......................................................... 4-5
4–4 Memory Interface Control Registers ........................................... 4-6
4–5 Timer Registers ................................................................. 4-7
4–6 Communication Port Memory Map ........................................... 4-8
4–7 DMA Coprocessor Memory Map ............................................. 4-9
4–8 Address Partitioning for Cache Control Algorithm ........................... 4-10
4–9 Instruction Cache Architecture .................................................. 4-11
5–1 Short-Integer Format and Sign Extension of Short Integer ................. 5-2
5–2 Single-Precision Integer Format ............................................. 5-2
5–3 Short Unsigned-Integer Format and Zero Fill .................................. 5-3
5–4 Single-Precision Unsigned-Integer Format .................................. 5-3
5–5 General Floating-Point Format .................................................. 5-4
5–6 Short Floating-Point Format ..................................................... 5-5
5–7 Single-Precision Floating-Point Format ..................................... 5-6
5–8 Extended-Precision Floating-Point Format ................................... 5-7
5–9 Short Floating-Point Format Conversion to Single-Precision Floating-Point Format ............................................. 5-11
5–10 Short Floating-Point Format Conversion to Extended-Precision Floating-Point Format ............................................. 5-11
5–11 Single-Precision Floating-Point Format Conversion to Extended-Precision Floating-Point Format ............................................. 5-12
<table>
<thead>
<tr>
<th>Figure</th>
<th>Title</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>11–10</td>
<td>Example of a Priority Wheel</td>
<td>11-24</td>
</tr>
<tr>
<td>11–11</td>
<td>Example of a Channel Priority Scheme in Split Mode</td>
<td>11-25</td>
</tr>
<tr>
<td>11–12</td>
<td>Service Sequence for Split Mode Priority Example</td>
<td>11-26</td>
</tr>
<tr>
<td>11–13</td>
<td>DMA Channel Running in Transfer Mode 102 (Autoinitialization Method 1a)</td>
<td>11-29</td>
</tr>
<tr>
<td>11–14</td>
<td>DMA Channel Running in Transfer Mode 102 (Autoinitialization Method 1b)</td>
<td>11-30</td>
</tr>
<tr>
<td>11–15</td>
<td>DMA Channel Running in Transfer Mode 112 (Autoinitialization Method 2a)</td>
<td>11-31</td>
</tr>
<tr>
<td>11–16</td>
<td>DMA Channel Running in Transfer Mode 112 (Autoinitialization Method 2b)</td>
<td>11-33</td>
</tr>
<tr>
<td>11–17</td>
<td>Store New Values of DMA Channel Registers in Memory (SPLIT MODE = 0)</td>
<td>11-35</td>
</tr>
<tr>
<td>11–18</td>
<td>Store New Values of DMA Channel Registers in Memory (SPLIT MODE = 1 and Transfer Counter = 0)</td>
<td>11-36</td>
</tr>
<tr>
<td>11–19</td>
<td>Store New Values of DMA Channel Registers in Memory (SPLIT MODE = 1 and Auxiliary Transfer Counter = 0)</td>
<td>11-36</td>
</tr>
<tr>
<td>11–20</td>
<td>DMA Channel Control Register Bits Modifiable by Autoinitialization in Unified Mode</td>
<td>11-39</td>
</tr>
<tr>
<td>11–21</td>
<td>DMA Channel Control Register Bit Modifiable by Autoinitialization of the Primary Channel in Split Mode</td>
<td>11-40</td>
</tr>
<tr>
<td>11–22</td>
<td>DMA Channel Control Register Bits That Can Be Modified by Autoinitialization of the Auxiliary Channel in Split Mode</td>
<td>11-40</td>
</tr>
<tr>
<td>11–23</td>
<td>Self-Referential Link Pointer</td>
<td>11-41</td>
</tr>
<tr>
<td>11–24</td>
<td>Referring to a New Link Pointer</td>
<td>11-41</td>
</tr>
<tr>
<td>11–25</td>
<td>DIE Register Bit Functions for DMA Unified Mode</td>
<td>11-44</td>
</tr>
<tr>
<td>11–26</td>
<td>DIE Register Bit Functions for DMA Split Mode</td>
<td>11-45</td>
</tr>
<tr>
<td>11–27</td>
<td>No DMA Synchronization</td>
<td>11-47</td>
</tr>
<tr>
<td>11–28</td>
<td>DMA Source Synchronization</td>
<td>11-48</td>
</tr>
<tr>
<td>11–29</td>
<td>DMA Destination Synchronization</td>
<td>11-49</td>
</tr>
<tr>
<td>11–30</td>
<td>Unified Mode DMA Source and Destination Synchronization</td>
<td>11-50</td>
</tr>
<tr>
<td>11–31</td>
<td>Timing and Number of Cycles for DMA Transfers to On-Chip Destination</td>
<td>11-52</td>
</tr>
<tr>
<td>11–32</td>
<td>Timing and Number of Cycles for DMA Transfers to a Local-Bus Destination</td>
<td>11-53</td>
</tr>
<tr>
<td>11–33</td>
<td>Timing and Number of Cycles for DMA Transfers to a Global-Bus Destination</td>
<td>11-54</td>
</tr>
<tr>
<td>11–34</td>
<td>Unified-Mode DMA Timing for Different Synchronizations</td>
<td>11-55</td>
</tr>
<tr>
<td>11–35</td>
<td>Split-Mode DMA Timing for Different Synchronizations</td>
<td>11-56</td>
</tr>
<tr>
<td>12–1</td>
<td>Communication Port Block Diagram</td>
<td>12-4</td>
</tr>
<tr>
<td>12–2</td>
<td>’C4x Communication-Port Interface-Connection Example</td>
<td>12-5</td>
</tr>
<tr>
<td>12–3</td>
<td>Communication-Port Memory Map</td>
<td>12-7</td>
</tr>
<tr>
<td>12–4</td>
<td>Communication-Port Control Register (CPCR)</td>
<td>12-8</td>
</tr>
<tr>
<td>12–5</td>
<td>Communication-Port Arbitration-Unit State Diagram</td>
<td>12-12</td>
</tr>
<tr>
<td>12–6</td>
<td>Token Transfer Operation</td>
<td>12-20</td>
</tr>
<tr>
<td>12–7</td>
<td>Word Transfer Operation</td>
<td>12-23</td>
</tr>
<tr>
<td>12–8</td>
<td>Type-One Synchronizer Minimum Delay</td>
<td>12-26</td>
</tr>
<tr>
<td>12–9</td>
<td>Type-One Synchronizer Maximum Delay</td>
<td>12-26</td>
</tr>
<tr>
<td>12–10</td>
<td>Type-Two Synchronizer Minimum Delay</td>
<td>12-27</td>
</tr>
<tr>
<td>12–11</td>
<td>Type-Two Synchronizer Maximum Delay</td>
<td>12-27</td>
</tr>
<tr>
<td>12–12</td>
<td>Type-Three Synchronizer Minimum Delay</td>
<td>12-27</td>
</tr>
<tr>
<td>12–13</td>
<td>Type-Three Synchronizer Maximum Delay</td>
<td>12-28</td>
</tr>
<tr>
<td>12–14</td>
<td>Post-Reset State for an Output Port</td>
<td>12-30</td>
</tr>
</tbody>
</table>
Figures

<table>
<thead>
<tr>
<th>Number</th>
<th>Description</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>12–15</td>
<td>Post-Reset State for an Input Port</td>
<td>12-31</td>
</tr>
<tr>
<td>13–1</td>
<td>Timer Block Diagram</td>
<td>13-3</td>
</tr>
<tr>
<td>13–2</td>
<td>Memory-Mapped Timer Locations</td>
<td>13-5</td>
</tr>
<tr>
<td>13–3</td>
<td>Timer Control Register</td>
<td>13-6</td>
</tr>
<tr>
<td>13–4</td>
<td>Timer Pulse Mode and Clock Mode Timing</td>
<td>13-9</td>
</tr>
<tr>
<td>13–5</td>
<td>Timer Output Generation Examples</td>
<td>13-10</td>
</tr>
<tr>
<td>13–6</td>
<td>Timer Configuration With CLKSRC=1 and FUNC=0</td>
<td>13-13</td>
</tr>
<tr>
<td>13–7</td>
<td>Timer Configuration With CLKSRC = 1 and FUNC = 1</td>
<td>13-13</td>
</tr>
<tr>
<td>13–8</td>
<td>Timer Configuration With CLKSRC = 0 and FUNC = 0</td>
<td>13-14</td>
</tr>
<tr>
<td>13–9</td>
<td>Timer Configuration With CLKSRC = 0 and FUNC = 1</td>
<td>13-14</td>
</tr>
<tr>
<td>13–10</td>
<td>TCLK as an Input (I/O = 0)</td>
<td>13-15</td>
</tr>
<tr>
<td>13–11</td>
<td>TCLK as an Output (I/O = 1)</td>
<td>13-15</td>
</tr>
<tr>
<td>14–1</td>
<td>Status Register</td>
<td>14-13</td>
</tr>
<tr>
<td>Table</td>
<td>Description</td>
<td></td>
</tr>
<tr>
<td>---</td>
<td>---</td>
<td></td>
</tr>
<tr>
<td>1–1</td>
<td>Comparison of 'C40 and 'C44 Features</td>
<td></td>
</tr>
<tr>
<td>2–1</td>
<td>CPU Primary Registers</td>
<td></td>
</tr>
<tr>
<td>3–1</td>
<td>CPU Primary Register File</td>
<td></td>
</tr>
<tr>
<td>3–2</td>
<td>Summary of the CE and CF Bits</td>
<td></td>
</tr>
<tr>
<td>3–3</td>
<td>DMA Channels 0 and 1 (DMA0 and DMA1) Unified Mode Synchronization Interrupts</td>
<td></td>
</tr>
<tr>
<td>3–4</td>
<td>DMA Channels 2 to 5 (DMA2 to DMA5) Unified Mode Synchronization Interrupts</td>
<td></td>
</tr>
<tr>
<td>3–5</td>
<td>DMA Channels 0 and 1 (DMA0 and DMA1) Split-Mode Synchronization Interrupts</td>
<td></td>
</tr>
<tr>
<td>3–6</td>
<td>DMA Channels 2 to 5 (DMA2 to DMA5) Split-Mode Synchronization Interrupts</td>
<td></td>
</tr>
<tr>
<td>3–7</td>
<td>CPU Expansion Registers</td>
<td></td>
</tr>
<tr>
<td>4–1</td>
<td>Combined Effect of the CE and CF Bits</td>
<td></td>
</tr>
<tr>
<td>5–1</td>
<td>Converting IEEE Format to Twos-Complement Floating-Point Format</td>
<td></td>
</tr>
<tr>
<td>5–2</td>
<td>Converting Twos-Complement Floating-Point Format to IEEE Format</td>
<td></td>
</tr>
<tr>
<td>6–1</td>
<td>CPU Register/Assembler Syntax and Function</td>
<td></td>
</tr>
<tr>
<td>6–2</td>
<td>Indirect Addressing</td>
<td></td>
</tr>
<tr>
<td>6–3</td>
<td>Three-Operand Instruction Addressing Modes</td>
<td></td>
</tr>
<tr>
<td>6–4</td>
<td>Index Steps and Bit-Reversed Addressing</td>
<td></td>
</tr>
<tr>
<td>7–1</td>
<td>Repeat-Mode Registers</td>
<td></td>
</tr>
<tr>
<td>7–2</td>
<td>Interrupt Latency</td>
<td></td>
</tr>
<tr>
<td>7–3</td>
<td>Pin States At System Reset</td>
<td></td>
</tr>
<tr>
<td>7–4</td>
<td>RESET Vector Locations</td>
<td></td>
</tr>
<tr>
<td>8–1</td>
<td>One Program Fetch and One Data Access for Maximum Performance</td>
<td></td>
</tr>
<tr>
<td>8–2</td>
<td>One Program Fetch and Two Data Accesses for Maximum Performance</td>
<td></td>
</tr>
<tr>
<td>9–1</td>
<td>Global Memory Interface Signals</td>
<td></td>
</tr>
<tr>
<td>9–2</td>
<td>Global Memory Port Status for STRB0 and STRB1 Accesses</td>
<td></td>
</tr>
<tr>
<td>9–3</td>
<td>Page Size as Defined by STRB0/PAGESIZE Bits</td>
<td></td>
</tr>
<tr>
<td>9–4</td>
<td>Address Ranges Specified by STRB ACTIVE Bits</td>
<td></td>
</tr>
<tr>
<td>9–5</td>
<td>Address Ranges Specified by LSTRB ACTIVE Bits</td>
<td></td>
</tr>
<tr>
<td>9–6</td>
<td>Wait-State Generation for Each Value of SWW</td>
<td></td>
</tr>
<tr>
<td>9–7</td>
<td>Interlocked Operations</td>
<td></td>
</tr>
<tr>
<td>10–1</td>
<td>Bootloader Mode Selection Using Pins IIOF(3–0)</td>
<td></td>
</tr>
<tr>
<td>10–2</td>
<td>Structure of Source Program Data Stream</td>
<td></td>
</tr>
<tr>
<td>10–3</td>
<td>Byte-Wide Configured Memory</td>
<td></td>
</tr>
<tr>
<td>10–4</td>
<td>16-Bit Wide Configured Memory</td>
<td></td>
</tr>
<tr>
<td>10–5</td>
<td>32-Bit Wide Configured Memory</td>
<td></td>
</tr>
<tr>
<td>11–1</td>
<td>DMA PRI Bits and CPU/DMA Arbitration Rules</td>
<td></td>
</tr>
<tr>
<td>11–2</td>
<td>TRANSFER MODE (AUX TRANSFER MODE) Field Descriptions</td>
<td></td>
</tr>
<tr>
<td>Pages</td>
<td>Description</td>
<td>Section</td>
</tr>
<tr>
<td>-------</td>
<td>-----------------------------------------------------------------------------</td>
<td>---------</td>
</tr>
<tr>
<td>11–3</td>
<td>SYNC MODE Field Descriptions in Unified Mode</td>
<td>11-13</td>
</tr>
<tr>
<td>11–4</td>
<td>SYNC MODE Field Descriptions in Split Mode</td>
<td>11-13</td>
</tr>
<tr>
<td>11–5</td>
<td>START (AUX START) Field Descriptions</td>
<td>11-14</td>
</tr>
<tr>
<td>11–6</td>
<td>STATUS (AUX STATUS) Field Descriptions</td>
<td>11-14</td>
</tr>
<tr>
<td>11–7</td>
<td>DMA PRI Bits and CPU/DMA Arbitration Rules</td>
<td>11-27</td>
</tr>
<tr>
<td>11–8</td>
<td>TRANSFER MODE (AUX TRANSFER MODE) Field Descriptions</td>
<td>11-28</td>
</tr>
<tr>
<td>11–9</td>
<td>Effect of SYNC MODE and AUTOINIT MODE Bits in Autoinitization</td>
<td>11-38</td>
</tr>
<tr>
<td>11–10</td>
<td>DMA Channels 0 and 1 (DMA0 and DMA1) Unified-Mode Synchronization Interrupts</td>
<td>11-44</td>
</tr>
<tr>
<td>11–11</td>
<td>DMA Channels 2 to 5 (DMA2 to DMA5) Unified-Mode Synchronization Interrupts</td>
<td>11-45</td>
</tr>
<tr>
<td>11–12</td>
<td>DMA Channels 0 and 1 (DMA0 and DMA1) Split-Mode Synchronization Interrupts</td>
<td>11-46</td>
</tr>
<tr>
<td>11–13</td>
<td>DMA Channels 2 to 5 (DMA2 to DMA5) Split-Mode Synchronization Interrupts</td>
<td>11-46</td>
</tr>
<tr>
<td>12–1</td>
<td>Communication-Port Software Reset Address ('C44 and 'C40 &gt;= 5.0)</td>
<td>12-10</td>
</tr>
<tr>
<td>12–2</td>
<td>PAU State Definitions</td>
<td>12-11</td>
</tr>
<tr>
<td>12–3</td>
<td>Summary of Input and Output FIFO Halting</td>
<td>12-14</td>
</tr>
<tr>
<td>12–4</td>
<td>Token Transfer Sequence</td>
<td>12-21</td>
</tr>
<tr>
<td>12–5</td>
<td>Word Transfer Sequence</td>
<td>12-24</td>
</tr>
<tr>
<td>12–6</td>
<td>Communication-Port Signals and Synchronizer Delays</td>
<td>12-28</td>
</tr>
<tr>
<td>14–1</td>
<td>Load-and-Store Instructions</td>
<td>14-3</td>
</tr>
<tr>
<td>14–2</td>
<td>Two-Operand Instructions</td>
<td>14-4</td>
</tr>
<tr>
<td>14–3</td>
<td>Three-Operand Instructions</td>
<td>14-6</td>
</tr>
<tr>
<td>14–4</td>
<td>Program Control Instructions</td>
<td>14-7</td>
</tr>
<tr>
<td>14–5</td>
<td>Interlocked Operations Instructions</td>
<td>14-8</td>
</tr>
<tr>
<td>14–6</td>
<td>Parallel Instructions</td>
<td>14-9</td>
</tr>
<tr>
<td>14–7</td>
<td>Output Value Formats</td>
<td>14-12</td>
</tr>
<tr>
<td>14–8</td>
<td>Condition Codes and Flags</td>
<td>14-14</td>
</tr>
<tr>
<td>14–9</td>
<td>Instruction Symbols</td>
<td>14-17</td>
</tr>
<tr>
<td>14–10</td>
<td>CPU Register Symbols</td>
<td>14-21</td>
</tr>
</tbody>
</table>
Examples

4–1 Enabling the Cache ................................................................. 4-13
5–1 Positive Number ................................................................. 5-9
5–2 Negative Number ............................................................... 5-10
5–3 Fractional Number ............................................................. 5-10
5–4 IEEE to 'C4x Conversion Within Block Memory Transfer .......... 5-16
5–5 'C4x to IEEE Conversion Within Block Memory Transfer .......... 5-18
5–6 Floating-Point Multiply (Both Mantissas = –2.0) ..................... 5-21
5–7 Floating-Point Multiply (Both Mantissas = 1.5) ....................... 5-21
5–8 Floating-Point Multiply (Both Mantissas = 1.0) ....................... 5-22
5–9 Floating-Point Multiply Between Positive and Negative Numbers ................................................................. 5-22
5–10 Floating-Point Addition ....................................................... 5-25
5–11 Floating-Point Subtraction ................................................... 5-25
5–12 Floating-Point Addition With a 32-Bit Shift ......................... 5-26
5–13 Floating-Point Addition/Subtraction and Zero ...................... 5-26
5–14 NORM Instruction ............................................................. 5-28
5–15 Newton-Raphson Algorithm for Computing the Reciprocal .... 5-35
5–16 Newton-Raphson Algorithm for Computing the Reciprocal Square Root ................................................................. 5-38
6–1 Direct Addressing ............................................................... 6-5
6–2 Auxiliary Register Indirect ................................................... 6-9
6–3 Indirect With Predisplacement Add ...................................... 6-9
6–4 Indirect With Predisplacement Subtract ................................. 6-10
6–5 Indirect With Predisplacement Add and Modify ..................... 6-10
6–6 Indirect With Predisplacement Subtract and Modify ............... 6-11
6–7 Indirect With Postdisplacement Add and Modify ................... 6-11
6–8 Indirect With Postdisplacement Subtract and Modify ............... 6-12
6–9 Indirect With Postdisplacement Add and Circular Modify ......... 6-12
6–10 Indirect With Postdisplacement Subtract and Circular Modify ... 6-13
6–11 Indirect With Preindex Add ................................................ 6-13
6–12 Indirect With Preindex Subtract ......................................... 6-14
6–13 Indirect With Preindex Add and Modify ............................... 6-14
6–14 Indirect With Preindex Subtract and Modify ....................... 6-15
6–15 Indirect With Postindex Add and Modify ............................. 6-15
6–16 Indirect With Postindex Subtract and Modify ..................... 6-16
6–17 Indirect With Postindex Add and Circular Modify ................. 6-17
6–18 Indirect With Postindex Subtract and Circular Modify .......... 6-17
6–19 Indirect With Postindex Add and Bit-Reversed Modify .......... 6-17

Contents xxxi
The TMS320C4x devices are 32-bit floating-point digital signal processors optimized for parallel processing. The 'C4x family combines a high performance CPU and DMA controller with up to six communication ports to meet the needs of multiprocessor and I/O-intensive applications. All 'C4x devices are compatible with TI’s multi-chip development environment. Each device contains an on-chip analysis module, which supports hardware breakpoints for parallel-processing development and debugging. The 'C4x family is source-code compatible with the TMS320C3x family of floating-point DSPs.
1.1 TMS320C4x Devices

The TMS320C4x family is made up of three different members: the TMS320C40 and the TMS320C44.

1.1.1 The TMS320C40

The TMS320C40 is the original member of the 'C4x family. It features a CPU that can deliver up to 30 MIPS/60 MFLOPS with a maximum I/O bandwidth of 384M bytes/s. The 'C40 has 2K words of on-chip RAM, 128 words of program cache and a bootloader. Two external buses provide an address reach of 4 gigawords of unified memory space. The 'C40 is available in a 325-pin CPGA package.

1.1.2 The TMS320C44

The TMS320C44 is a lower cost version of the 'C40, for parallel processing applications that are more price sensitive. The 'C44 features four communication ports and has an external address reach of 32M words over two external buses. To further reduce cost, the 'C44 comes in a 304-pin PQFP package. The TMS320C44 can deliver up to 30 MIPS/60 MFLOPS performance with a maximum I/O bandwidth of 384M bytes/s. The 'C44 is source-code compatible with the 'C40.
1.2 Key Features of the TMS320C4x

The TMS320C4x has several key features:

- Up to 40 MIPS/80 MFLOPS performance with 488-Mbytes/s I/O capability
  - IEEE floating-point conversion for ease of use
  - Register-based CPU
  - Single-cycle byte and half-word manipulation capabilities
  - Divide and square root support for improved performance

- On-chip memory includes 2K words of SRAM, 128 words of program cache, and bootloader

- Two external buses providing an address reach of up to 4 gigawords

- Two memory-mapped 32-bit timers

- 6 and 12 channel DMA

- Up to six communication ports for multiprocessor communication

- Idle mode for reduced power consumption
1.3 TMS320C40 and TMS320C44 Device Comparison

Table 1–1 shows the major differences in features of the 'C40 and 'C44.

**Table 1–1. Comparison of 'C40 and 'C44 Features**

<table>
<thead>
<tr>
<th>Feature</th>
<th>'C40</th>
<th>'C44</th>
</tr>
</thead>
<tbody>
<tr>
<td>External local address bus</td>
<td>31 pins</td>
<td>24 pins</td>
</tr>
<tr>
<td>External global address bus</td>
<td>31 pins</td>
<td>24 pins</td>
</tr>
<tr>
<td>Address reach</td>
<td>4G × 32</td>
<td>32M × 32</td>
</tr>
<tr>
<td>Number of comm ports</td>
<td>6</td>
<td>4</td>
</tr>
<tr>
<td>Commport direction pin</td>
<td>no</td>
<td>yes</td>
</tr>
<tr>
<td>NMI with bus grant feature</td>
<td>yes (for revisions ≥ 5.0)</td>
<td>yes</td>
</tr>
<tr>
<td>Individual comm port reset</td>
<td>yes (for revisions ≥ 5.0)</td>
<td>yes</td>
</tr>
<tr>
<td>Package</td>
<td>325-pin CPGA</td>
<td>304-pin PQFP</td>
</tr>
</tbody>
</table>
The “C4x”’s high performance is achieved through the precision and wide dynamic range of the floating-point units, on-chip memory, a high degree of parallelism, communication ports, and the DMA coprocessor.

This chapter gives an architectural overview of the “C4x” processor. Figure 2–1 is a block diagram of the “C4x.”

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>2.1 Central Processing Unit (CPU)</td>
<td>2-4</td>
</tr>
<tr>
<td>2.2 Memory Organization</td>
<td>2-11</td>
</tr>
<tr>
<td>2.3 Internal Bus Operation</td>
<td>2-19</td>
</tr>
<tr>
<td>2.4 External Bus Operation</td>
<td>2-20</td>
</tr>
<tr>
<td>2.5 Interrupts</td>
<td>2-21</td>
</tr>
<tr>
<td>2.6 Peripherals</td>
<td>2-22</td>
</tr>
</tbody>
</table>
Figure 2–1. TMS320C4x Block Diagram

Continued on next page
Figure 2–1. TMS320C4x Block Diagram (Continued)
2.1 Central Processing Unit (CPU)

The 'C4x’s CPU has a register-based architecture. The CPU consists of the several components:

- Floating-point/integer multiplier
- Arithmetic Logic Unit (ALU)
- 32-bit barrel shifter
- Internal buses (CPU1/CPU2 and REG1/REG2)
- Auxiliary register arithmetic units (ARAU5)
- CPU register file

Figure 2–2 shows the CPU’s components.

2.1.1 Floating-Point/Integer Multiplier

The multiplier performs single-cycle multiplications on 32-bit integer and 40-bit floating-point values. The 'C4x implementation of floating-point arithmetic allows for floating-point operations at fixed-point speeds via a 25-ns instruction cycle and a high degree of parallelism. To gain even higher throughput, you can use parallel instructions to perform a multiply and ALU operation in a single cycle.

When the multiplier performs floating-point multiplication, the inputs are 40-bit floating-point numbers, and the result is a 40-bit floating-point number. When the multiplier performs integer multiplication, the input data is 32 bits and yields either the 32 most-significant bits or the 32 least-significant bits of the resulting 64-bit product. See Chapter 5, Data Formats and Floating-Point Operation, for detailed information on data formats and floating-point operation.

2.1.2 Arithmetic Logic Unit (ALU) and Internal Buses

The ALU performs single-cycle operations on 32-bit integer, 32-bit logical, and 40-bit floating-point data, including single-cycle integer and floating-point conversions. Results of the ALU are always maintained in 32-bit integer or 40-bit floating-point formats. The barrel shifter is used to shift up to 32 bits left or right in a single cycle.

Four internal buses, CPU1, CPU2, REG1, and REG2, carry two operands from memory and two operands from the register file, thus allowing parallel multiplies and adds/subtracts on four integer or floating-point operands in a single cycle.
Central Processing Unit (CPU)

Figure 2–2. Central Processing Unit (CPU)
2.1.3 Auxiliary Register Arithmetic Units (ARAU0s)

The two auxiliary register arithmetic units (ARAU0 and ARAU1) can generate two addresses in a single cycle. The ARAUs operate in parallel with the multiplier and ALU. They support addressing with displacements, index registers (IR0 and IR1), and circular and bit-reversed addressing. See Chapter 6, Addressing Modes, for a description of addressing modes.

2.1.4 CPU Primary Register File

The 'C4x primary register file provides 32 registers in a multiport register file that is tightly coupled to the CPU. Table 2–1 lists register names and functions, followed by the section number and page of each description.

All of the primary register file registers can be operated upon by the multiplier and ALU and can be used as general-purpose registers. However, the registers also have some special functions. For example, the 12 extended-precision registers are especially suited for maintaining floating-point results. The eight auxiliary registers support a variety of indirect addressing modes and can be used as general-purpose 32-bit integer and logical registers. The remaining registers provide system functions such as addressing, stack management, processor status, interrupts, and block repeat. See Chapter 3, CPU Registers, for detailed information about CPU registers. See Chapter 6, Addressing Modes, for information about register usage in addressing.

The extended-precision registers (R0–R11) are capable of storing and supporting operations on 32-bit integer and 40-bit floating-point numbers. Any instruction that assumes that the operands are floating-point numbers uses bits 39–0. If the operands are either signed or unsigned integers, only bits 31–0 are used, and bits 39–32 remain unchanged. This is true for all shift operations. See Chapter 5, Data Formats and Floating-Point Operation, for extended-precision register formats of floating-point and integer numbers.

The 32-bit auxiliary registers (AR0–AR7) can be accessed by the CPU and modified by the two auxiliary register arithmetic units (ARAU0s). The primary function of the auxiliary registers is the generation of 32-bit addresses. They can also be used as loop counters or as 32-bit general-purpose registers that can be modified by the multiplier and ALU. See Chapter 6, Addressing Modes, for detailed information and examples of the use of auxiliary registers in addressing.
Table 2–1. CPU Primary Registers

<table>
<thead>
<tr>
<th>Assembler Syntax</th>
<th>Assigned Function Name</th>
<th>Subsection</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>R0</td>
<td>Extended-precision register 0</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R1</td>
<td>Extended-precision register 1</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R2</td>
<td>Extended-precision register 2</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R3</td>
<td>Extended-precision register 3</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R4</td>
<td>Extended-precision register 4</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R5</td>
<td>Extended-precision register 5</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R6</td>
<td>Extended-precision register 6</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R7</td>
<td>Extended-precision register 7</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R8</td>
<td>Extended-precision register 8</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R9</td>
<td>Extended-precision register 9</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R10</td>
<td>Extended-precision register 10</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R11</td>
<td>Extended-precision register 11</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>AR0</td>
<td>Auxiliary register 0</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR1</td>
<td>Auxiliary register 1</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR2</td>
<td>Auxiliary register 2</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR3</td>
<td>Auxiliary register 3</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR4</td>
<td>Auxiliary register 4</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR5</td>
<td>Auxiliary register 5</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR6</td>
<td>Auxiliary register 6</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR7</td>
<td>Auxiliary register 7</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>DP</td>
<td>Data-page pointer</td>
<td>3.1.3</td>
<td>3-4</td>
</tr>
<tr>
<td>IR0</td>
<td>Index register 0</td>
<td>3.1.4</td>
<td>3-4</td>
</tr>
<tr>
<td>IR1</td>
<td>Index register 1</td>
<td>3.1.4</td>
<td>3-4</td>
</tr>
<tr>
<td>BK</td>
<td>Block-size register</td>
<td>3.1.5</td>
<td>3-5</td>
</tr>
<tr>
<td>SP</td>
<td>System stack pointer</td>
<td>3.1.6</td>
<td>3-5</td>
</tr>
</tbody>
</table>
Table 2–1. CPU Primary Registers (Continued)

<table>
<thead>
<tr>
<th>Assembler Syntax</th>
<th>Assigned Function Name</th>
<th>Subsection</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>ST</td>
<td>Status register</td>
<td>3.1.7</td>
<td>3-5</td>
</tr>
<tr>
<td>DIE</td>
<td>DMA Coprocessor interrupt enable</td>
<td>3.1.8</td>
<td>3-8</td>
</tr>
<tr>
<td>IIE</td>
<td>Internal-interrupt enable register</td>
<td>3.1.9</td>
<td>3-11</td>
</tr>
<tr>
<td>IIF</td>
<td>ITOF flag register</td>
<td>3.1.10</td>
<td>3-13</td>
</tr>
<tr>
<td>RS</td>
<td>Repeat start address</td>
<td>3.1.11</td>
<td>3-16</td>
</tr>
<tr>
<td>RE</td>
<td>Repeat end address</td>
<td>3.1.11</td>
<td>3-16</td>
</tr>
<tr>
<td>RC</td>
<td>Repeat counter</td>
<td>3.1.11</td>
<td>3-16</td>
</tr>
</tbody>
</table>

The **data page pointer (DP)** is a 32-bit register. The 16 LSBs of the data page pointer are used by the direct addressing mode as a pointer to the page of data being addressed. The 'C4x can address up to 64K pages, each page containing 64K words. Use of the data page pointer is described in subsection 6.3, *Direct Addressing*, on page 6-5.

The 32-bit **index registers** contain the value used by the auxiliary register arithmetic unit (ARAU) to compute an indexed address. See Section 6.4, *Indirect Addressing*, on page 6-6, and Section 6.9, *Bit-Reversed Addressing*, on page 6-32, for more information about the ARAU.

The ARAU uses the 32-bit **block size register (BK)** in circular addressing to specify the data block size. Circular addressing is described in Section 6.8, *Circular Addressing*, on page 6-27.

The **system stack pointer (SP)** is a 32-bit register that contains the address of the top of the system stack. The SP always points to the last element pushed onto the stack. A push performs a preincrement, and a pop performs a post-decrement of the system stack pointer. The SP is manipulated by interrupts, traps, calls, returns, and the PUSH/PUSHF and POP/POPF instructions. See Section 1.4, *System and User Stack Management*, in the *TMS320C4x General-Purpose Applications User’s Guide* for information about managing the stacks.
The **status register (ST)** contains global information related to the state of the CPU. Typically, operations set the condition flags of the status register according to whether the result is zero, negative, etc. This includes register load and store operations as well as arithmetic and logical functions. When the status register is loaded, however, a bit-for-bit replacement is performed with the contents of the source operand, regardless of the state of any bits in the source operand. Therefore, following a load, the contents of the status register are identically equal to the contents of the source operand. This allows the status register to be easily saved and restored. See subsection 3.1.7, *Status Register (ST)*, on page 3-5, for definitions of the status register bits.

The **DMA coprocessor interrupt enable register (DIE)** is a 32-bit register containing 2- and 3-bit fields to designate the interrupt synchronization scheme for each of the six DMA channels. It allows each DMA channel to service a corresponding input communication port and output communication port. Also, each DMA channel can be synchronized with external interrupts or the on-chip timers. This register is described in subsection 3.1.8, *DMA Coprocessor Interrupt Enable Register (DIE)*, on page 3-8.

The **CPU internal interrupt enable register (IIE)** is a 32-bit register that enables/disables interrupts for the six communication ports, both timers, and the six DMA coprocessor channels. The IIE is described in subsection 3.1.9, *CPU Interrupt Enable Register (IIE)*, on page 3-11.

The **IIOF flag register (IIF)** controls the function (general-purpose I/O or interrupt) of the four external pins (IIOF0 to IIOF3). It also contains timer/DMA interrupt flags. Subsection 3.1.10, *IIOF Flag Register (IIF)*, on page 3-13, provides further description of this register.

The 32-bit **repeat counter (RC)** register specifies the number of times a block of code is to be repeated when a block repeat is performed. When the processor is operating in the repeat mode, the 32-bit **repeat start address register (RS)** contains the starting address of the block of program memory to be repeated, and the 32-bit **repeat end address register (RE)** contains the ending address of the block to be repeated. Further information about these registers is in subsection 3.1.11, *Block Repear (RS,RE) and Repeat Count (RC) Registers*, on page 3-16.

The **program counter (PC)** is a 32-bit register containing the address of the next instruction to be fetched. Although the PC is not part of the CPU register file, it is a register that can be modified by instructions that modify the program flow.
2.1.5 CPU Expansion Register File

Besides the CPU primary register file, the expansion register file contains two special registers that act as pointers:

- The IVTP register points to the interrupt-vector table (IVT), which defines vectors for all interrupts.
- The TVTP register points to the trap vector table (TVT), which defines vectors for 512 traps.

These two registers are fully described in Section 3.2, CPU Expansion Register File on page 3-17.
2.2 Memory Organization

The total memory reach of the 'C4x is 4G 32-bit words. Program memory (on-chip RAM or ROM and external memory) as well as registers affecting timers, communication ports, and DMA channels are contained within this space. This allows tables, coefficients, program code, and data to be stored in either RAM or ROM. Thus, memory usage is maximized, and memory space allocated as desired.

By manipulating one external pin (ROMEN), you can configure the first one-megaword area of memory (0000 0000h to 000F FFFFh) to address the local address bus or to address the on-chip ROM when you use the bootloader (with remaining space reserved). This capability is further discussed in Section 4.1, Memory Map, on page 4-2.

2.2.1 RAM, ROM, and Cache

Figure 2–3 shows how the memory is organized on the 'C4x. RAM blocks 0 and 1 are 4K bytes (1K × 32 bits) each. The ROM block is reserved and contains a bootloader. Each RAM and ROM block is capable of supporting two accesses in a single cycle. The separate program buses, data buses, and DMA buses allow for parallel program fetches, data reads and writes, and DMA operations. For example: the CPU can access two data values in one RAM block and perform an external program fetch in parallel with the DMA coprocessor loading another RAM block, all within a single cycle.

The reserved ROM block (upper right in Figure 2–3) contains a bootloader. This loader supports loading of program and data at reset time. Loading is from 8-, 16-, or 32-bit wide memories or any one of the six communication ports. Chapter 10, The Bootloader, explains the bootloader in detail.

A 128 × 32-bit instruction cache is provided to store often-repeated sections of code, thus greatly reducing the number of needed off-chip accesses. This allows for code to be stored off-chip in slower, lower-cost memories. By using the cache to execute your program, the external buses are freed for use by the DMA controller or CPU.

For further information about memory and the instruction cache, see Section 4.1, Memory Organization, and Section 4.3, Cache Memory.
Figure 2–3. Memory Organization

- Cache (128 x 32) (512 bytes)
- RAM block 0 (1K x 32) (4K bytes)
- RAM block 1 (1K x 32) (4K bytes)
- ROM block (bootloader) (reserved)

- DMA ADDR bus
- DMADATA bus
- DADDR2 bus
- DADDR1 bus
- DADDR bus
- PADDR bus
- DDATA bus
- PDATA bus
- Program counter/instruction register
- CPU
- DMA coprocessor
- MUX

- D(31–0)
- 'C40: A(30–0)
- 'C44: A(23–0)
- DE
- AE
- STAT(3–0)
- LOCK
- STRB
- R/W
- PAGE
- RDY
- CE
- MUX

- LD(31–0)
- 'C40: LA(30–0)
- 'C44: LA(23–0)
- DE
- AE
- LSTAT(3–0)
- LOCK
- LSTRB
- LR/R
- LPAGE
- LRDY
- LCE
- MUX
2.2.2 Memory Maps

The memory map for each processor is shown in Figure 2–4 ('C40) and Figure 2–5 ('C44); for each processor, the level at the external pin ROMEN determines whether or not the first megaword of memory addresses the internal ROM or external memory. The maps illustrate the entire address space of the 'C40 and 'C44.

The value of ROMEN affects only the first megaword of memory:

- A 1 at external pin ROMEN causes internal ROM to be enabled at 0000h with the one-megaword space reserved (0000 0000h – 000F FFFFh). This is shown in the right side of the figure.
- A 0 at ROMEN causes addresses 0000 0000h – 000F FFFFh to be accessible on the local bus. This is shown in the left side of the figure.

The rest of the memory map is the same for either level of ROMEN:

- The second megaword of memory is devoted to peripherals (as shown in Figure 2–6).
- The third megaword of memory contains the two 1K-word (4K-byte) blocks of RAM (BLK0 and BLK1 as shown at 002F F800h – 002F FFFFh).
- The rest of the first 2 gigawords (0030 0000h – 7FFF FFFFh) is on the local bus (external).
- The second 2 gigawords (8000 0000h – FFFF FFFFh) are on the global bus (external).

Section 4.1, Memory Map, on page 4-2 describes the memory maps in greater detail. Section 9.2, Memory Interface Signals on page 9-3, and Section 9.3, Memory Interface Control Registers on page 9-6, discuss the local and global interfaces to memory. The peripheral bus map and the vector locations for reset, interrupts, and traps are also explained in those sections.

Caution

Any access to a reserved area in the address space produces unpredictable results. Do not attempt to access reserved areas.
Figure 2–4. 'C40 Memory Map

(a) Internal ROM disabled
   (ROMEN = 0)

Microprocessor mode

(b) Internal ROM enabled
   (ROMEN = 1)

Microcomputer mode
Figure 2–5. 'C44 Memory Map

Structure identical

(a) Internal ROM disabled (ROMEN = 0)

Microprocessor mode

(b) Internal ROM enabled (ROMEN = 1)

Microcomputer mode
### Figure 2–6. Peripheral Memory Map

<table>
<thead>
<tr>
<th>Address</th>
<th>Peripheral</th>
<th>Described in</th>
</tr>
</thead>
<tbody>
<tr>
<td>0010 0000h</td>
<td>Local and global port control (16 words)</td>
<td>Subsection 4.2.1, Figure 4–4, page 4–6</td>
</tr>
<tr>
<td>0010 000Fh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0010h</td>
<td>Analysis block registers (16 words)</td>
<td>Subsection 4.2.2.</td>
</tr>
<tr>
<td>0010 001Fh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0020h</td>
<td>Timer 0 registers (16 words)</td>
<td>Subsection 4.2.3, Figure 4–5, page 4–7</td>
</tr>
<tr>
<td>0010 002Fh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0030h</td>
<td>Timer 1 registers (16 words)</td>
<td></td>
</tr>
<tr>
<td>0010 003Fh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0040h</td>
<td>Communication port 0 (16 words)</td>
<td>Subsection 4.2.4, Figure 4–6, page 4–8</td>
</tr>
<tr>
<td>0010 004Fh</td>
<td>('C40 only)</td>
<td></td>
</tr>
<tr>
<td>0010 0050h</td>
<td>Communication port 1 (16 words)</td>
<td></td>
</tr>
<tr>
<td>0010 005Fh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0060h</td>
<td>Communication port 2 (16 words)</td>
<td></td>
</tr>
<tr>
<td>0010 006Fh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0070h</td>
<td>Communication port 3 (16 words)</td>
<td></td>
</tr>
<tr>
<td>0010 007Fh</td>
<td>('C40 only)</td>
<td></td>
</tr>
<tr>
<td>0010 0080h</td>
<td>Communication port 4 (16 words)</td>
<td></td>
</tr>
<tr>
<td>0010 008Fh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0090h</td>
<td>Communication port 5 (16 words)</td>
<td></td>
</tr>
<tr>
<td>0010 009Fh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00A0h</td>
<td>DMA coprocessor channel 0 (16 words)</td>
<td>Subsection 4.2.5, Figure 4–7, page 4–9</td>
</tr>
<tr>
<td>0010 00AFh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00B0h</td>
<td>DMA coprocessor channel 1 (16 words)</td>
<td></td>
</tr>
<tr>
<td>0010 00BFh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00C0h</td>
<td>DMA coprocessor channel 2 (16 words)</td>
<td></td>
</tr>
<tr>
<td>0010 00CFh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00D0h</td>
<td>DMA coprocessor channel 3 (16 words)</td>
<td></td>
</tr>
<tr>
<td>0010 00DFh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00E0h</td>
<td>DMA coprocessor channel 4 (16 words)</td>
<td></td>
</tr>
<tr>
<td>0010 00EFh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00F0h</td>
<td>DMA coprocessor channel 5 (16 words)</td>
<td></td>
</tr>
<tr>
<td>0010 00FFh</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
2.2.3 Memory Aliasing (’C44 only)

Memory aliasing occurs in the ’C44, since both the global and local ports on that device have 24 pins, instead of the 31 pins on each port in the ’C40. Memory aliasing causes the first 16 M of each address space to be repeated in the memory map. Memory on the local bus occupies, and is aliased, in the first 2 G of address space, and memory on the global bus occupies, and is aliased, in the second 2 G of address space. Figure 2–7 shows the alias regions on the local and global buses.

Figure 2–7. Memory Aliasing (’C44 only)
2.2.4 Memory Addressing Modes

The 'C4x supports a base set of general-purpose instructions as well as arithmetic-intensive instructions that are particularly suited for digital signal processing and other numeric-intensive applications. Refer to Chapter 6, Addressing Modes, for detailed information on addressing.

Four groups of addressing modes are provided on the 'C4x. Each group uses two or more of several different addressing types. The following list shows the addressing modes with their addressing types.

- **General addressing modes:**
  - **Register.** The operand is a CPU register.
  - **Immediate.** The operand is a 16-bit immediate value.
  - **Direct.** The operand is the contents of a 32-bit address (concatenation of 16 bits of the data page pointer and a 16-bit operand).
  - **Indirect.** A 32-bit auxiliary register indicates the address of the operand.

- **Three-operand addressing modes:**
  - **Register.** (same as for general addressing mode).
  - **Indirect.** (same as for general addressing mode).
  - **Immediate.** The operand is an 8-bit immediate value.

- **Parallel addressing modes:**
  - **Register.** The operand is an extended-precision register.
  - **Indirect.** (same as for general addressing mode).

- **Branch addressing modes:**
  - **Register.** (same as for general addressing mode).
  - **PC-relative.** A signed 16-bit displacement or a 24-bit displacement is added to the PC.
2.3 Internal Bus Operation

A large portion of the ‘C4x’s high performance is due to internal busing and parallelism. Separate buses allow for parallel program fetches, data accesses, and DMA accesses:

- **Program buses** PADDR and PDATA
- **Data buses** DADDR1, DADDR2, and DDATA
- **DMA buses** DMAADDR and DMADATA

These buses connect all of the physical spaces (on-chip memory, off-chip memory, and on-chip peripherals) supported by the ‘C4x. Figure 2–3 shows these internal buses and their connections to on-chip and off-chip memory blocks.

The program counter (PC) is connected to the 32-bit program address bus (PADDR). The instruction register (IR) is connected to the 32-bit program data bus (PDATA). In this configuration, the buses can fetch a single instruction word every machine cycle.

The 32-bit data address buses (DADDR1 and DADDR2) and the 32-bit data data bus (DDATA) support two data memory accesses every machine cycle. The DDATA bus carries data to the CPU over the CPU1 and CPU2 buses. The CPU1 and CPU2 buses can carry two data memory operands to the multiplier, ALU, and register file every machine cycle. Also internal to the CPU are register buses REG1 and REG2, which can carry two data values from the register file to the multiplier and ALU every machine cycle. Figure 2–2 shows the buses that are internal to the CPU section of the processor.

The DMA controller is supported with a 32-bit address bus (DMAADDR) and a 32-bit data bus (DMADATA). These buses allow the DMA to perform memory accesses in parallel with the memory accesses occurring from the data and program buses.
2.4 External Bus Operation

The 'C4x provides two identical external interfaces: the global memory interface and the local memory interface. Each consists of a 32-bit data bus, a 31-bit ('C40) or 24-bit ('C44) address bus, and two sets of control signals. Both buses can be used to address external program/data memory or I/O space. The buses also have external RDY signals for wait-state generation with wait states inserted under software control. Chapter 9, External Bus Operation, covers external bus operation.

For multiple processors to access global memory and share data in a coherent manner, arbitration is necessary. This arbitration (handshaking) is the purpose of the 'C4x’s interlocked operations, handled through interlocked instructions. For more information about interlocked instructions, see Section 9.7 on page 9-39, Interlocked Operations.
2.5 Interrupts

The 'C4x supports four external interrupts (IIOF3–0), a number of internal interrupts, a nonmaskable external NMI interrupt, and a nonmaskable external RESET signal, which sets the processor to a known state. The DMA and communication ports have their own internal interrupts. When the CPU responds to the interrupt, the IACK pin can be used to signal an external interrupt acknowledge. Section 7.4, on page 7-15, *Interrupts*, covers RESET and interrupt processing.
2.6 Peripherals

All 'C4x on-chip peripherals are controlled through memory-mapped registers on a dedicated peripheral bus. This peripheral bus is composed of a 32-bit data bus and a 32-bit address bus. This peripheral bus permits straightforward communication to the peripherals. The 'C4x peripherals include two timers and six ('C40) or four ('C44) communication ports. Figure 2–8 shows the peripherals with associated buses and signals.

Figure 2–8. Peripheral Modules
2.6.1 Communication Ports

Six ('C40) or four ('C44) high-speed communication ports provide rapid processor-to-processor communication through each port’s dedicated communication interfaces. Coupled with the 'C4x’s two memory interfaces (global and local), this allows you to construct a parallel processor system that attains optimum system performance by distributing tasks among several processors. Each 'C4x can pass the results of its work to another 'C4x through a communication port, enabling each 'C4x to continue working. Chapter 12, Communication Ports, explains communication port operation in detail.

The communication ports offer several features:

- 160-megabits/s (20-Mbytes or 5-Mwords per second) bidirectional data transfer operations (at 40-ns cycle time)
- Simple processor-to-processor communication via eight data lines and four control lines
- Buffering of all data transfers, both input and output
- Automatic arbitration to ensure communication synchronization
- Synchronization between the CPU or the direct-memory access (DMA) coprocessor and the six communication ports via internal interrupts and internal ready signals.
- Port direction pin (CDIR) to ease interfacing ('C44 only)

2.6.2 Direct Memory Access (DMA) Coprocessor

The six channels of the on-chip DMA coprocessor can read from or write to any location in the memory map without interfering with the operation of the CPU. This allows interfacing to slow external memories and peripherals without reducing throughput to the CPU. The DMA coprocessor contains its own address generators, source and destination registers, and transfer counter. Dedicated DMA address and data buses allow for minimization of conflicts between the CPU and the DMA coprocessor. A DMA operation consists of a block or single-word transfer to or from memory. A key feature of the DMA coprocessor is its ability to automatically reinitialize each channel following a data transfer. See Chapter 11, The DMA Coprocessor, for detailed information on the DMA coprocessor.
2.6.3 Timers

The two timer modules are general-purpose 32-bit timer/event counters with two signaling modes and internal or external clocking. They can signal internally to the 'C4x or externally to the outside world at specified intervals, or they can count external events. Each timer has an I/O pin that can be used as an input clock to the timer, as an output signal driven by the timer, or as a general-purpose I/O pin. The timers are described in detail in Chapter 13, *The Timers*. 
The CPU primary register file contains 32 registers that can be used as operands by the multiplier and ALU (arithmetic logic unit). The register file includes the auxiliary registers, extended-precision registers, and index registers. These registers support addressing, floating-point/integer operations, stack management, processor status, block repeats, branching, and interrupts.

The CPU expansion register file contains two registers — the interrupt vector table pointer (IVTP) and the trap vector table pointer (TVTP).

This chapter describes each of the CPU registers.

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>3.1 CPU Primary Register File</td>
<td>3-2</td>
</tr>
<tr>
<td>3.2 CPU Expansion Register File</td>
<td>3-17</td>
</tr>
</tbody>
</table>
3.1 CPU Primary Register File

The 'C4x provides 32 registers in a multiport register file that is tightly coupled to the CPU. The PC (program counter) is not included in the register file. The contents of the register file are listed in Table 3–1.

Table 3–1. CPU Primary Register File

<table>
<thead>
<tr>
<th>Register Symbol</th>
<th>Register Machine Value (hex)</th>
<th>Assigned Function Name</th>
<th>Subsection</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>R0</td>
<td>00</td>
<td>Extended-precision register 0</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R1</td>
<td>01</td>
<td>Extended-precision register 1</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R2</td>
<td>02</td>
<td>Extended-precision register 2</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R3</td>
<td>03</td>
<td>Extended-precision register 3</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R4</td>
<td>04</td>
<td>Extended-precision register 4</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R5</td>
<td>05</td>
<td>Extended-precision register 5</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R6</td>
<td>06</td>
<td>Extended-precision register 6</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R7</td>
<td>07</td>
<td>Extended-precision register 7</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R8</td>
<td>1C</td>
<td>Extended-precision register 8</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R9</td>
<td>1D</td>
<td>Extended-precision register 9</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R10</td>
<td>1E</td>
<td>Extended-precision register 10</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R11</td>
<td>1F</td>
<td>Extended-precision register 11</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>AR0</td>
<td>08</td>
<td>Auxiliary register 0</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR1</td>
<td>09</td>
<td>Auxiliary register 1</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR2</td>
<td>0A</td>
<td>Auxiliary register 2</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR3</td>
<td>0B</td>
<td>Auxiliary register 3</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR4</td>
<td>0C</td>
<td>Auxiliary register 4</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR5</td>
<td>0D</td>
<td>Auxiliary register 5</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR6</td>
<td>0E</td>
<td>Auxiliary register 6</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR7</td>
<td>0F</td>
<td>Auxiliary register 7</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>DP</td>
<td>10</td>
<td>Data-page pointer</td>
<td>3.1.3</td>
<td>3-4</td>
</tr>
<tr>
<td>IR0</td>
<td>11</td>
<td>Index register 0</td>
<td>3.1.4</td>
<td>3-4</td>
</tr>
<tr>
<td>IR1</td>
<td>12</td>
<td>Index register 1</td>
<td>3.1.4</td>
<td>3-4</td>
</tr>
<tr>
<td>BK</td>
<td>13</td>
<td>Block-size register</td>
<td>3.1.5</td>
<td>3-5</td>
</tr>
<tr>
<td>SP</td>
<td>14</td>
<td>System stack pointer</td>
<td>3.1.6</td>
<td>3-5</td>
</tr>
</tbody>
</table>
Table 3–1. CPU Primary Register File (Continued)

<table>
<thead>
<tr>
<th>Register Symbol</th>
<th>Register Machine Value (hex)</th>
<th>Assigned Function Name</th>
<th>See Subsection</th>
<th>On Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>ST</td>
<td>15</td>
<td>Status register</td>
<td>3.1.7</td>
<td>3-5</td>
</tr>
<tr>
<td>DIE</td>
<td>16</td>
<td>DMA coprocessor interrupt enable</td>
<td>3.1.8</td>
<td>3-8</td>
</tr>
<tr>
<td>IIE</td>
<td>17</td>
<td>Internal-interrupt enable register</td>
<td>3.1.9</td>
<td>3-11</td>
</tr>
<tr>
<td>IIF</td>
<td>18</td>
<td>IIOF flag register (IIOF3–0 pins, timers, DMA)</td>
<td>3.1.10</td>
<td>3-13</td>
</tr>
<tr>
<td>RS</td>
<td>19</td>
<td>Repeat start address</td>
<td>3.1.11</td>
<td>3-16</td>
</tr>
<tr>
<td>RE</td>
<td>1A</td>
<td>Repeat end address</td>
<td>3.1.11</td>
<td>3-16</td>
</tr>
<tr>
<td>RC</td>
<td>1B</td>
<td>Repeat counter</td>
<td>3.1.11</td>
<td>3-16</td>
</tr>
</tbody>
</table>

All of these registers can be used both as operands by the multiplier and ALU, and as general-purpose 32-bit registers. However, the registers also perform some special functions. For example, the 12 extended-precision registers maintain extended-precision floating-point results. The eight auxiliary registers support a variety of indirect addressing modes and can be used as general-purpose 32-bit integer and logical registers. The remaining registers provide system functions such as addressing, stack management, processor status, interrupts, and block repeat. Refer to Chapter 6, Addressing Modes, for detailed information and examples of how CPU registers are used in addressing.

3.1.1 Extended-Precision Registers (R0–R11)

The 12 extended-precision registers (R0–R11) can store and support operations on 32-bit integer and 40-bit floating-point numbers.

For floating-point numbers, these registers consist of two separate and distinct fields:

- Bits 39–32: store the exponent (e) of a floating-point number.
- Bits 31–0: store the mantissa of a floating-point number:
  - Bit 31: sign bit (s),
  - Bits 30–0: the fraction (f).

Any instruction that assumes that the operands are floating-point numbers uses bits 39–0. Figure 3–1 illustrates the storage of 40-bit floating-point numbers in the extended-precision registers.
3.1.2 Auxiliary Registers (AR0–AR7)

The eight 32-bit auxiliary registers (AR0–AR7) can be accessed by the CPU and modified by the two auxiliary register arithmetic units (ARAU)s. The primary function of the auxiliary registers is the generation of 32-bit addresses. However, they can also operate as loop counters in indirect addressing or as 32-bit general-purpose registers that can be modified by the multiplier and ALU. See Chapter 6, Addressing Modes, for detailed information and examples of the use of auxiliary registers in addressing.

3.1.3 Data-Page Pointer (DP)

The data-page pointer (DP) is a 32-bit register whose 16 LSBs are used by the direct addressing mode as a pointer to the page of data being addressed. Data pages are 64K words long with a total of 64K (65,536) pages. Bits 31–16 are reserved; they are always read as zeros and should not be modified by writing to the register. The DP can be loaded by using the LDP pseudoinstruction or the LDI instruction. Figure 6–1, on page 6-5, describes this register’s functions.

3.1.4 Index Registers (IR0, IR1)

The 32-bit index registers (IR0 and IR1) are used by the auxiliary register arithmetic unit (ARAU) for indexing the address. IR0 is also used for bit-reversed addressing. See Chapter 6, Addressing Modes, for detailed information and examples of the use of index registers in addressing. Section 6.4, Indirect Addressing, on page 6-6, discusses and provides examples of using IRn in indirect addressing. Section 6.9, Bit-Reversed Addressing, on page 6-32, describes using IRn with bit-reversed addressing.
3.1.5 Block-Size Register (BK)

The 32-bit block-size register (BK) is used by the ARAU in circular addressing to specify the data block size (see Section 6.8, Circular Addressing, on page 6-27, for more information about the use of the BK register).

3.1.6 System Stack Pointer (SP)

The system stack pointer (SP) is a 32-bit register that contains the address of the top of the system stack. The SP always points to the last element pushed onto the stack. The SP is manipulated by interrupts, traps, calls, returns, and the PUSH, PUSHF, POP, and POPF instructions. Pushes and pops of the stack perform preincrement and postdecrement, respectively, on all 32 bits of the SP.

3.1.7 Status Register (ST)

The status register (ST) contains global information about the CPU’s state. Typically, load, store, arithmetic, and logical operations affect the ST’s condition flags. When the ST is loaded, the contents of the load instruction’s source operand replace the ST’s current contents, regardless of the state of any bit(s) in the source operand. Therefore, following an ST load, the contents of the ST are identical to the contents of the source operand. This allows the status register to be saved easily and restored. At system reset, 0 is written to the ST; after reset, the CF bit is set to 1. The format of the ST is shown in Figure 3–3. The text following the figure describes each field in the ST.

**Figure 3–3. Status Register (ST)**

<table>
<thead>
<tr>
<th>31</th>
<th>30</th>
<th>29</th>
<th>28</th>
<th>27</th>
<th>26</th>
<th>25</th>
<th>24</th>
<th>23</th>
<th>22</th>
<th>21</th>
<th>20</th>
<th>19</th>
<th>18</th>
<th>17</th>
<th>16</th>
</tr>
</thead>
<tbody>
<tr>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>NMI bus grant</td>
<td>xx</td>
</tr>
<tr>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R</td>
</tr>
<tr>
<td>15</td>
<td>14</td>
<td>13</td>
<td>12</td>
<td>11</td>
<td>10</td>
<td>9</td>
<td>8</td>
<td>7</td>
<td>6</td>
<td>5</td>
<td>4</td>
<td>3</td>
<td>2</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>SC</td>
<td>PGIE</td>
<td>GIE</td>
<td>CC</td>
<td>CE</td>
<td>CF</td>
<td>PCF</td>
<td>RM</td>
<td>OVM</td>
<td>LUF</td>
<td>LV</td>
<td>UF</td>
<td>N</td>
<td>Z</td>
<td>V</td>
<td>C</td>
</tr>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
</tr>
</tbody>
</table>

**NOTE:** xx = reserved bit. R = read, W = write.

- **C** Carry-condition flag.
- **V** Overflow condition flag.
- **Z** Zero condition flag.
- **N** Negative condition flag.
- **UF** Floating-point underflow condition flag.
- **LV** Latched overflow condition flag.
- **LUF** Latched floating-point underflow condition flag.
**OVM**  
**Overflow mode (OVM) flag.** This flag affects only integer operations.

If OVM = 0, the overflow mode is turned off.

If OVM = 1, integer results overflowing in the positive direction are set to the most positive 32-bit twos-complement number (7FFFF FFFh), and integer results overflowing in the negative direction are set to the most negative 32-bit twos-complement number (8000 0000h).

Note that the functions of bits V and LV are independent of the setting of OVM.

**RM**  
**Repeat mode (RM) flag.** If RM = 1, the PC is modified in either the repeat-block or repeat-single mode.

**PCF**  
**Previous state of bit CF.** When a trap executes or an interrupt is taken, the CF bit is set to 1 and the PCF bit is set to the CF bit’s previous value.

The RETI and RETID instructions, explained in chapter 14, *Assembly Language Instructions*, copy PCF to the CF bit.

**CF**  
**Cache freeze (CF).** Enables or disables updating of the cache.

Set CF = 1 to freeze the cache. If CF = 1 and CE = 1, fetches from the cache and cache clearing (CC = 1) are allowed, but modification of the cache contents is not allowed. At reset, this bit is cleared to zero; it is set to 1 after reset.

When CF = 0, the cache is automatically updated by instruction fetches from external memory and cache clearing (CC = 1) is allowed. Traps and interrupts set CF. The RETI and RETID instructions copy the PCF bit to the CF bit.

Table 3–2 summarizes the CE and CF bits.

**CE**  
**Cache enable (CE).** CE enables or disables the instruction cache.

Set CE = 1 to enable the cache, allowing the cache to be used according to the LRU (least recently used) cache algorithm.

Set CE = 0 to disable the cache, preventing cache modifications and fetches. Cache clearing (CC = 1) is allowed when CE = 0. At reset, 0 is written to CE.

**CC**  
**Cache clear.** CC = 1 invalidates all entries in the cache (contents not guaranteed). This bit is always cleared after it is written to and thus always read as 0. At reset, 0 is written to this bit. All cache P flags = 0 when cache is cleared.
Table 3–2. Summary of the CE and CF Bits

<table>
<thead>
<tr>
<th>CE</th>
<th>CF</th>
<th>Effect</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>Cache not enabled</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>Cache not enabled</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>Cache enabled and not frozen</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>Cache enabled but frozen (cache read only)</td>
</tr>
</tbody>
</table>

**GIE**  
*Global interrupt enable.* Enables or disables all maskable interrupts.  
If GIE = 1, the CPU responds to any enabled interrupts.  
If GIE = 0, the CPU does not respond to any enabled interrupts. This bit does not affect NMIs. The IDLE, LAT, RETI, RETID, and TRAP instructions affect this bit's value. GIE is cleared to 0 when a trap is executed or an interrupt is taken.

**PGIE**  
*Previous state of bit GIE.* When a trap executes or an interrupt is taken, bit GIE is cleared to 0. When this occurs, the PGIE bit is set to the GIE bit's value before the trap or interrupt. Note that the RETIcond and RETIcondD instructions copy PGIE to the GIE bit. At reset, this bit is cleared to 0.

**SET COND**  
*SC*  
This bit determines how condition flags (ST bits 0–6) are set.  
If SET COND = 0, condition flags are set if the operation's target is any extended-precision register (R0–R11). This setting makes the ’C4x similar to the ’C3x, regarding condition flag settings. This bit is cleared to 0 at reset.  
If SET COND = 1, condition flags are set if the target of the operation is any register in the primary register files except the status register. Condition flags are always set when a CMPF, CMPI, CMPF3, CMPI3, TSTB, or TSTB3 instruction is executed, regardless of the value of SET COND.

**ANALYSIS**  
This read-only bit is used in analysis mode to provide state information for emulation.

**NMI bus grant**  
(’C44 and ’C40 revision ≥5.0 only)  
The NMI bus-grant feature is useful in correcting communication-port errors when used with the communication-port software reset feature. If bit 19 = 1 and bit 18 = 0, an internal peripheral bus-grant signal is forced on the falling edge of NMI. If NMI is asserted when the peripheral bus is in a stall condition, the NMI breaks the pending cycle and then jumps to the NMI service routine. A stall condition may occur when writing to a full output FIFO, or when reading from an empty input FIFO.

**xx**  
Reserved. Value undefined. These bits are read-only.
3.1.8 DMA Coprocessor Interrupt Enable Register (DIE)

The 32-bit DMA interrupt enable register (DIE) is broken into six subfields that determine which interrupts can be used to control the synchronization for each of the six DMA coprocessor channels. Synchronization controls when a DMA channel reads or writes. At reset, zeros are written to all register bits.

Each DMA channel looks not only at the DMA synchronous interrupts selected but also at the synchronization mode that the channel is currently using (see Table 11–3). The synchronization mode is specified by the SYNC MODE field in the DMA channel control registers located in the DMA coprocessor.

By using interrupt synchronization, each DMA channel can (for example) service a corresponding communication port. Note that DMAi can be synchronized only to signals coming from communication port i (where 0 ≤ i ≤ 5). Also, each DMA channel can be synchronized to external interrupts and to the on-chip timers.

3.1.8.1 Unified Mode

Figure 3–4 shows the DMA interrupt enable register for unified mode. Table 3–3 summarizes the interrupt activity for each of the four possible combinations of DMA0 and DMA1 for unified mode. Table 3–4 summarizes the interrupts enabled by three-bit values in DMA2 through DMA5 for unified mode.

Figure 3–4. DMA Interrupt Enable Register Bit Functions for DMA Unified Mode

<table>
<thead>
<tr>
<th>31</th>
<th>30</th>
<th>29</th>
<th>28</th>
<th>27</th>
<th>26</th>
<th>25</th>
<th>24</th>
<th>23</th>
<th>22</th>
<th>21</th>
<th>20</th>
</tr>
</thead>
<tbody>
<tr>
<td>DMA5 Write</td>
<td>DMA5 Read</td>
<td>DMA4 Write</td>
<td>DMA4 Read</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td></td>
<td></td>
</tr>
<tr>
<td>19</td>
<td>18</td>
<td>17</td>
<td>16</td>
<td>15</td>
<td>14</td>
<td>13</td>
<td>12</td>
<td>11</td>
<td>10</td>
<td>9</td>
<td>8</td>
</tr>
<tr>
<td>DMA3 Write</td>
<td>DMA3 Read</td>
<td>DMA2 Write</td>
<td>DMA2 Read</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td></td>
<td></td>
</tr>
<tr>
<td>7</td>
<td>6</td>
<td>5</td>
<td>4</td>
<td>3</td>
<td>2</td>
<td>1</td>
<td>0</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>DMA1 Write</td>
<td>DMA1 Read</td>
<td>DMA0 Write</td>
<td>DMA0 Read</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

R = Read W = Write
Table 3–3. DMA Channels 0 and 1 (DMA0 and DMA1) Unified Mode Synchronization Interrupts

<table>
<thead>
<tr>
<th>Bit Value (in DMA0 or DMA1)</th>
<th>DMA0 Read</th>
<th>DMA0 Write</th>
<th>DMA1 Read</th>
<th>DMA1 Write</th>
<th>Interrupt Source for DMA Synchronization</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0†</td>
<td>None</td>
<td>None</td>
<td>None</td>
<td>None</td>
<td>--</td>
</tr>
<tr>
<td>0 1†</td>
<td>ICRDY0</td>
<td>OCRDY0</td>
<td>ICRDY1</td>
<td>OCRDY1</td>
<td>From communication port</td>
</tr>
<tr>
<td>1 0</td>
<td>IIOF0</td>
<td>IIOF1</td>
<td>IIOF2</td>
<td>IIOF3</td>
<td>From external pins IIOF0–IIOF3</td>
</tr>
<tr>
<td>1 1</td>
<td>TIM0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>From timer TIM0</td>
</tr>
</tbody>
</table>

† DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used.
‡ This option is not available for DMA0 and DMA3 in the 'C44.

Table 3–4. DMA Channels 2 to 5 (DMA2 to DMA5) Unified Mode Synchronization Interrupts

<table>
<thead>
<tr>
<th>Bit Value (in DMA2 to DMA5)</th>
<th>Interrupt Enabled at DMA2–DMA5†</th>
<th>Interrupt Source for DMA Synchronization</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0‡</td>
<td>None</td>
<td>--</td>
</tr>
<tr>
<td>0 0 1§</td>
<td>ICRDYx†</td>
<td>From communication port</td>
</tr>
<tr>
<td>0 1 0</td>
<td>IIOF0</td>
<td>From external pins IIOF0–IIOF3</td>
</tr>
<tr>
<td>0 1 1</td>
<td>IIOF1</td>
<td></td>
</tr>
<tr>
<td>1 0 0</td>
<td>IIOF2</td>
<td></td>
</tr>
<tr>
<td>1 0 1</td>
<td>IIOF3</td>
<td></td>
</tr>
<tr>
<td>1 1 0</td>
<td>TIM0</td>
<td>From timers TIM0 and TIM1</td>
</tr>
<tr>
<td>1 1 1</td>
<td>TIM1</td>
<td></td>
</tr>
</tbody>
</table>

† The x in DMAx is the DMA channel number, which is also the number for the corresponding ICRDYx and OCRDYx interrupts. For example, an 0012 in both DMA2 READ and DMA5 WRITE would enable interrupts ICRDY2 and OCRDY5, respectively. All other viable bit values (0102 to 1112) are the same (as shown in the table) for DMA2 through DMA5.
‡ DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used.
§ This option is not available for DMA0 and DMA3 in the 'C44.

Note: DMA Coprocessor Uses Signals to Synchronize

The interrupts in Table 3–3 and Table 3–4 (ICRDYx, OCRDYx, TIM0, etc.) are not vectored. The DMA coprocessor uses these as signals to synchronize DMA coprocessor transfers. This process is explained in Section 11.10.
3.1.8.2 Split Mode

Figure 3–5 shows the DMA interrupt enable register for split mode. Table 3–5 summarizes the interrupt activity for each of the four possible combinations of DMA0 and DMA1 for split mode. Table 3–6 summarizes the interrupts enabled by three-bit values in DMA2 through DMA5 for split mode.

Figure 3–5. DMA Interrupt Enable Register Bit Functions for DMA Split Mode

<table>
<thead>
<tr>
<th>Bit Value</th>
<th>DMA0 Auxiliary Read</th>
<th>DMA0 Primary Write</th>
<th>DMA1 Auxiliary Read</th>
<th>DMA1 Primary Write</th>
<th>Interrupt Source for DMA Synchronization</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 †</td>
<td>None</td>
<td>None</td>
<td>None</td>
<td>None</td>
<td>– –</td>
</tr>
<tr>
<td>0 1 ‡</td>
<td>ICRDY0</td>
<td>OCRDY0</td>
<td>ICRDY1</td>
<td>OCRDY1</td>
<td>From communication port</td>
</tr>
<tr>
<td>1 0</td>
<td>IIOF0</td>
<td>IIOF1</td>
<td>IIOF2</td>
<td>IIOF3</td>
<td>From external pins IIOF0–IIOF3</td>
</tr>
<tr>
<td>1 1</td>
<td>TIM0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>From timer TIM0</td>
</tr>
</tbody>
</table>

† DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used.
‡ This option is not available for DMA0 and DMA3 in the ‘C44.

Table 3–5. DMA Channels 0 and 1 (DMA0 and DMA1) Split-Mode Synchronization Interrupts
Table 3–6. DMA Channels 2 to 5 (DMA2 to DMA5) Split-Mode Synchronization Interrupts

<table>
<thead>
<tr>
<th>Bit Value (in DMA2 to DMA5)</th>
<th>DMAx Auxiliary Read†</th>
<th>DMAx Primary Write†</th>
<th>Interrupt Source for DMA Synchronization</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0‡</td>
<td>None</td>
<td>None</td>
<td>-- --</td>
</tr>
<tr>
<td>0 0 1§</td>
<td>ICRDYx†</td>
<td>OCRDYx†</td>
<td>From communication port</td>
</tr>
<tr>
<td>0 1 0</td>
<td>ITOF0</td>
<td>ITOF0</td>
<td>From external pins ITOF0–ITOF3</td>
</tr>
<tr>
<td>0 1 1</td>
<td>ITOF1</td>
<td>ITOF1</td>
<td></td>
</tr>
<tr>
<td>1 0 0</td>
<td>ITOF2</td>
<td>ITOF2</td>
<td></td>
</tr>
<tr>
<td>1 0 1</td>
<td>ITOF3</td>
<td>ITOF3</td>
<td></td>
</tr>
<tr>
<td>1 1 0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>From timers TIM0 and TIM1</td>
</tr>
<tr>
<td>1 1 1</td>
<td>TIM1</td>
<td>TIM1</td>
<td></td>
</tr>
</tbody>
</table>

† The x in DMAx is the DMA channel number, which is also the number for the corresponding ICRDYx and OCRDYx interrupts. For example, an 0012 in both DMA2 READ and DMA5 WRITE would enable interrupts ICRDY2 and OCRDY5, respectively. All other viable bit values (0102 to 1112) are the same (as shown in the table) for DMA2 through DMA5.

‡ DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used.

§ This option is not available for DMA0 and DMA3 in the ‘C44.

### 3.1.9 CPU Internal Interrupt Enable Register (IIE)

The 32-bit internal interrupt enable register, shown in Figure 3–6, enables/disables the following interrupts for the CPU:

- Timers 0 and 1
- For communication ports 0–5:
  - Input-buffer full
  - Input-buffer ready
  - Output-buffer ready
  - Output-buffer empty
- DMA coprocessor channels 0–5

Figure 3–6 shows the IIE register bits. A 1 means the corresponding interrupt is enabled; a 0 indicates disabled. At reset, zeros are written to all register bits.
**Figure 3–6. Internal Interrupt Enable Register (IIE)**

<table>
<thead>
<tr>
<th>31</th>
<th>30</th>
<th>29</th>
<th>28</th>
<th>27</th>
<th>26</th>
<th>25</th>
<th>24</th>
<th>23</th>
<th>22</th>
<th>21</th>
</tr>
</thead>
<tbody>
<tr>
<td>ETINT1</td>
<td>EDMA</td>
<td>INT5</td>
<td>EDMA</td>
<td>INT4</td>
<td>EDMA</td>
<td>INT3</td>
<td>EDMA</td>
<td>INT2</td>
<td>EDMA</td>
<td>INT1</td>
</tr>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
</tr>
<tr>
<td>20</td>
<td>19</td>
<td>18</td>
<td>17</td>
<td>16</td>
<td>15</td>
<td>14</td>
<td>13</td>
<td>12</td>
<td>11</td>
<td>10</td>
</tr>
<tr>
<td>EOC</td>
<td>EMPTY4</td>
<td>EOC</td>
<td>RDY4</td>
<td>EIC</td>
<td>FULL4</td>
<td>EOC</td>
<td>EMPTY3</td>
<td>EOC</td>
<td>RDY3</td>
<td>EIC</td>
</tr>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
</tr>
<tr>
<td>8</td>
<td>7</td>
<td>6</td>
<td>5</td>
<td>4</td>
<td>3</td>
<td>2</td>
<td>1</td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>EOC</td>
<td>EMPTY1</td>
<td>EOC</td>
<td>RDY1</td>
<td>EIC</td>
<td>FULL1</td>
<td>EOC</td>
<td>EMPTY0</td>
<td>EOC</td>
<td>RDY0</td>
<td>EIC</td>
</tr>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
</tr>
</tbody>
</table>

R = Read, W = Write, R/W = Read/Write

**Notes:**

1) In the figure, the shaded boxes are reserved bits in the ‘C44. Zero should be written to each of these bits.

2) The fields corresponding to each unit are separated by double lines.

The following are definitions for each of the bits in the IIE.

- **EICFULLx** Comm. port x input-buffer full interrupt
- **EICRDYx** Comm. port x input-buffer ready interrupt
- **EOCRDYx** Comm. port x output-buffer ready interrupt
- **EOCEMPTYx** Comm. port x output-buffer empty interrupt
- **EDMAINTx** DMA coprocessor channel x interrupt
- **ETINT0** Timer 0 interrupt
- **ETINT1** Timer 1 interrupt

In each field label, the x represents a communication port number (0 – 5) or a DMA coprocessor channel number (0–5). For example, a 1 in bit 5 causes interrupts to be generated when communication port number 1’s input buffer becomes full. Or, a 1 in bit 26 enables channel 1 of the DMA coprocessor to respond to interrupts. A 1 enables each interrupt; a 0 disables it.
3.1.10 IIOF Flag Register (IIF)

The IIF register controls the external interrupt pins IIOF(3–0). Use it to specify:

- Which IIOF pins are used for general-purpose I/O and which are used for interrupts
- Whether a general-purpose pin is input (read only) or output (read/write)
- Whether an interrupt pin is for edge-triggered or level-triggered interrupts,
- Whether an external interrupt is enabled or disabled

The IIF register also contains timer, DMA and NMI interrupt flags. Figure 3–7 shows the IIF register’s bits. The text following the figure explains these bits in detail.

The IIF register bits can be read from or written to under software control. This provides access to the IIOFx pins, which can be treated as general-purpose I/O or as interrupt pins. For example, if at the IIF register, FUNCx = 0 (I/O pin) and TYPEx = 1 (output pin), then by writing into the FLAGx bit, you can also write to the external pin IIOFx. If FUNCx = 1 (interrupt pin), writing a 1 to the IIF register FLAGx bit has the same effect as an incoming interrupt received on the corresponding pin. Consequently, all interrupts can be triggered and/or cleared through software. Since the interrupt bits also can be read from, the interrupt pins can be polled in software when an interrupt-driven interface is not required.

Internal interrupts operate in a similar manner. In the IIF register, the bit corresponding to an internal interrupt (e.g., TINT0, TINT1) can be read from and written to through software. Writing a 1 sets the interrupt latch, and writing a 0 clears it. All internal interrupts are one H1/H3 cycle in length. Modify the IIF by using logic operations (AND, OR, etc.) as shown:

\[
\begin{array}{ll}
\text{correct} & \text{incorrect} \\
LDI @MASK, R0 & LDI IIF, R1 \\
AND R0, IIF & AND @MASK, R1 \\
& LDI R1, IIF
\end{array}
\]

Traps and interrupts are described briefly in Section 3.2, CPU Expansion Register File, on page 3-17, and in detail in Section 7.4, Interrupts, on page 7-15, and Section 7.5, Traps, on page 7-24.
Figure 3–7. Interrupt Flag Register (IIF)

<table>
<thead>
<tr>
<th>31</th>
<th>30</th>
<th>29</th>
<th>28</th>
<th>27</th>
<th>26</th>
<th>25</th>
<th>24</th>
</tr>
</thead>
<tbody>
<tr>
<td>TINT1</td>
<td>DMAINT5</td>
<td>DMAINT4</td>
<td>DMAINT3</td>
<td>DMAINT2</td>
<td>DMAINT1</td>
<td>DMAINT0</td>
<td>TINT0</td>
</tr>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
</tr>
<tr>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>NMI</td>
</tr>
<tr>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
</tr>
<tr>
<td>15</td>
<td>14</td>
<td>13</td>
<td>12</td>
<td>11</td>
<td>10</td>
<td>9</td>
<td>8</td>
</tr>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
</tr>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
</tr>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
</tr>
</tbody>
</table>

R = Read, W = Write, R/W = Read/Write

**FUNCx**
Mode of pin IIOFx. If FUNCx = 0, pin IIOFx is a general-purpose I/O (R/W) pin. If FUNCx = 1, pin IIOFx is an interrupt pin.

**TYPEx**
Type of function for pin IIOFx.
If pin IIOFx is a general-purpose I/O pin (FUNCx = 0):
- TYPEx = 0 makes IIOFx an input pin.
- TYPEx = 1 makes IIOFx an output pin
If pin IIOFx is an interrupt pin (FUNCx = 1):
- TYPEx = 0 makes IIOFx an edge-triggered latched interrupt,
- TYPEx = 1 makes IIOFx a level-triggered unlatched interrupt.

**FLAGx**
Flag for pin IIOFx.
If pin IIOFx is a general-purpose input pin (FUNCx = 0, TYPEx = 0),
- FLAGx = the value of pin IIOFx and is read only.
If pin IIOFx is a general-purpose output pin (FUNCx = 0, TYPEx = 1),
- FLAGx = the value on pin IIOFx and is R/W.
If pin IIOFx is an interrupt pin (FUNCx = 1):
- FLAGx = 0 if interrupt is not asserted.
- FLAGx = 1 if interrupt is asserted.
If 0 (zero) is written to FLAGx, the corresponding interrupt is cleared unless an interrupt is on the same pin; in that case, the interrupt will remain set.

**EIIOFx**
Disable/enable external interrupt.
EIIOFx = 0 disables external interrupts at pin IIOFx.
EIIOFx = 1 enables external interrupts at pin IIOFx.
NMI  **Nonmaskable Interrupt flag (NMI).** The NMI interrupt (on the external NMI pin) behaves like other interrupts, except that it cannot be masked (disabled) by the GIE bit (ST bit 13) or by writing to the NMI bit. It is temporarily masked during delayed branches and multicycle CPU operations. At reset, this bit is cleared. An asserted interrupt is cleared only by servicing the interrupt. NMI is a negative-going, edge-triggered, latched interrupt. It is read-only.

Reading NMI as 0 indicates that the interrupt is not asserted.

Reading NMI as 1 indicates that the interrupt is asserted.

Reserved **Reserved; read as zeros.**

TINT0  **Timer interrupt flags 0 and 1.**

TINT1  Reading TINTx as 0 indicates that the timer interrupt is not asserted. Reading TINTx as 1 indicates that the timer interrupt is asserted.

A zero written to this bit clears the interrupt unless the interrupt is asserted at the same time; in that case, the interrupt will be shown as asserted.

DMAINTx  Interrupt flag for DMA coprocessor channels 0 to 5. Reading DMAINTx as 0 indicates that the channel interrupt is not asserted. Reading DMAINTx as 1 indicates that the channel interrupt is asserted. A zero written to this bit clears the interrupt unless the interrupt is asserted at the same time; in that case, the interrupt is shown as asserted.

**Notes:**

1) Shaded IIF bits 0, 1, 2, 3 apply to pin IIOF0; shaded IIF bits 4, 5, 6, 7 apply to IIOFT, etc.

2) The x represents the corresponding IIOF interrupt pin (IIOF0–3)
3.1.11 Block-Repeat (RS, RE) and Repeat-Count (RC) Registers

The 32-bit repeat start address register (RS) contains the starting address of the block of program memory to be repeated when the CPU is operating in the repeat mode.

The 32-bit repeat end address register (RE) contains the ending address of the block of program memory to be repeated when the CPU is operating in the repeat mode.

**Note:**

If RE < RS, the block of program memory is not repeated, and the code does not loop backwards. However, the ST(RM) bit remains set to 1.

The repeat-count register (RC) is a 32-bit register that specifies the number of times a block of code is to be repeated when a block repeat is performed. If RC contains the number \( n \), the loop is executed \( n + 1 \) times.

3.1.12 Program Counter (PC)

The program counter (PC) is a 32-bit register containing the address of the next instruction to fetch. While the program counter is not part of the CPU register file, it can be modified by the same instructions that modify the program flow.

3.1.13 Reserved Bits and Compatibility

To retain compatibility with future members of the 'C4x family of microprocessors, reserved bits that are read as zero must be written as zero. Reserved bits that have an undefined value **must not** have their current value modified. In other cases, maintain the reserved bits as specified.
3.2 CPU Expansion Register File

This expansion register file contains two special control registers:

- Interrupt-vector table pointer (IVTP)
- Trap-vector table pointer (TVTP)

Table 3–7. CPU Expansion Registers

<table>
<thead>
<tr>
<th>Assembler Syntax</th>
<th>Register Value (Hex)</th>
<th>Function Name</th>
</tr>
</thead>
<tbody>
<tr>
<td>IVTP</td>
<td>00</td>
<td>Interrupt-vector table pointer. Points to start of the interrupt-vector table.</td>
</tr>
<tr>
<td>TVTP</td>
<td>01</td>
<td>Trap-vector table pointer. Points to start of the trap-vector table.</td>
</tr>
</tbody>
</table>

Use the **LDEP instruction** to load (copy) an expansion register to a primary register (e.g., to any of the auxiliary registers AR0–AR7; see Table 3–1 on page 3-2). For example:

LDEP IVTP,AR5 ; IVTP contents to AR5

Likewise, use the **LDPE instruction** to load (copy) a primary register to an expansion register. Neither of these instructions affects the status register condition flags.

LDPE AR5,IVTP ; AR5 contents to IVTP

Note that both the interrupt-vector table and the trap-vector table are required to lie on a 512-word boundary; thus, the nine least significant bits of these pointers are zeros (i.e., $100000000_2 = 512 = 200h$). Write only zeros to these bits (though the register forces these to zeros).

The 32-bit **IVTP register** points to (is essentially the base address for) the interrupt-vector table (IVT) in memory.

The 32-bit **TVTP register** is essentially the base address for the trap-vector table (TVT) in memory. This table contains the vectors for the TRAP instruction’s 512-trap addresses (TRAP0–TRAP511).

The interrupt and trap vector tables can share the same 512-byte space in memory. In this configuration, you can place trap vectors where there are no interrupt vectors. For example, since interrupt vector 02Ch is unused, you could place a trap vector at IVTP + 02Ch (which is also TVTP + 02Ch if the tables overlap) and then call that trap by specifying 02Ch in the TRAP instruction.

At reset, IVTP and TVTP are both set to zero.
Memory and the Instruction Cache

The 'C40 accesses a total memory space of 4G 32-bit words (16G bytes) of program, data, and I/O space; the 'C44 accesses a total memory space of 32M 32-bit words (128M bytes).

Two internal RAM blocks of 1K × 32 bits each (4K bytes) and an internal ROM block containing a bootloader permit two accesses per block in a single cycle.

A 128 × 32-bit instruction cache allows code to be stored off-chip in slower, lower-cost memories without degrading performance. The cache also speeds data fetches to the same physical space as the program because it does not burden the bus with program instruction fetches.

This chapter describes the memory maps and the instruction cache.

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>4.1 Memory Map</td>
<td>4-2</td>
</tr>
<tr>
<td>4.2 Peripheral Bus Memory Map</td>
<td>4-5</td>
</tr>
<tr>
<td>4.3 Instruction Cache</td>
<td>4-10</td>
</tr>
</tbody>
</table>
4.1 Memory Map

The 'C4x memory space of 4 gigawords (4G × 32 bits where 1G = 2^30) is shown in the memory maps in Figure 4–1 and Figure 4–2. The contents of the first segment of address space, at 0000 0000h to 000F FFFFh, is selected by the value of the ROM enable (ROMEN) pin:

- ROMEN = 1. Addresses 0000 0000h–0000 0FFFh are an on-chip ROM block (reserved for bootloader operations), and addresses 0000 1000h–000F FFFFh are reserved.
- ROMEN = 0. The on-chip (reserved) ROM is disabled, and addresses 0000 0000h–000F FFFFh are mapped to the local bus.

Memory starting at 0010 0000h is not affected by ROMEN. The following is a general summary of address ranges:

- 0000 0000h–000F FFFFh: Can be local bus or on-chip (reserved) ROM, depending on the value of ROMEN. If ROMEN=0, these addresses are mapped to the local bus. If ROMEN=1, these addresses are mapped to the on-chip ROM.
- 0010 0000h–0010 00FFh: Internal peripherals (DMA coprocessor, communications ports, timers, etc.).
- 0010 0100h–002F F7FFh: Reserved.
- 002F F800h–002F FBFFh: 1K RAM Block 0.
- 002F FC00h–002F FFFFh: 1K RAM Block 1.
- 0030 0000h–7FFF FFFFh: Local bus. These addresses are mapped to the local bus.
- 8000 0000h–0FFF FFFFh: Global bus. These addresses are mapped to the global bus.

CPU data accesses and DMA accesses can be made from any unreserved part of the 'C4x memory map. Instruction fetches can take place from any unreserved area of the 'C4x memory map, except from the peripheral space (addresses 0010 0000h–0010 00FFh).

Note:
The 'C4x internal ROM is generally reserved for TI internal use only. However, for high-volume applications, you can request that TI install your code in the internal ROM.
Figure 4–1. 'C40 Memory Map

(a) Internal ROM disabled (ROMEN = 0)

Microprocessor Mode

(b) Internal ROM enabled (ROMEN = 1)

Microcomputer Mode
Figure 4–2. 'C44 Memory Map

(a) Internal ROM disabled (ROMEN = 0)
   Microprocessor Mode

  Structure identical
  2G–16M
  16M
  2G

  Global bus (alias region)

  Local bus (alias region)

  Global bus (external)

  Local bus (external)

  1K RAM BLK 1 (internal)

  1K RAM BLK 0 (internal)

  Reserved

  Peripherals (internal) (see Figure 2–6)

  Accessible local bus (external)

  1M
  Structure depends upon ROMEN bit
  1M
  1M

(b) Internal ROM enabled (ROMEN = 1)
   Microcomputer Mode

  Bootloader ROM (internal)

  Reserved

  Peripherals (internal) (see Figure 2–6)

  00000 0000h
  00000 0FFFh
  00000 1000h
  00000 1000h
  0000F FFFFh
  00010 00FFh
  00010 0100h
  0001F FFFFh
  00020 0000h
  0002F F7FFh
  0002F F800h
  0002F FC00h
  0002F FFFFh
  00030 0000h
  0000F FFFFh
  00010 0000h
  00010 00FFh
  00010 0100h
  0001F FFFFh
  0002F F7FFh
  0002F F800h
  0002F FC00h
  0002F FFFFh
  00030 0000h
  07FFF FFFFh
  08000 0000h
  0FFFF FFFFh

Microprocessor Mode

(b) Internal ROM enabled (ROMEN = 1)
   Microcomputer Mode
4.2 Peripheral Bus Memory Map

The peripheral bus memory map resides in addresses 0010 0000h–0010 00FFh. Each peripheral requires a 16-word area. Figure 4–3 shows the locations of registers for each peripheral in the memory map.

**Figure 4–3. Peripheral Memory Map**

<table>
<thead>
<tr>
<th>Address</th>
<th>Description</th>
<th>Subsection Referenced</th>
<th>Location in Diagram</th>
</tr>
</thead>
<tbody>
<tr>
<td>0010 0000h</td>
<td>Local and Global Port Control (16 words)</td>
<td>See subsection 4.2.1</td>
<td></td>
</tr>
<tr>
<td>0010 000Fh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0010h</td>
<td>Analysis Module Block Registers (16 words)</td>
<td>See subsection 4.2.2</td>
<td></td>
</tr>
<tr>
<td>0010 001Fh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0020h</td>
<td>Timer 0 Registers (16 words)</td>
<td>See subsection 4.2.3</td>
<td></td>
</tr>
<tr>
<td>0010 002Fh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0030h</td>
<td>Timer 1 Registers (16 words)</td>
<td>See subsection 4.2.3</td>
<td></td>
</tr>
<tr>
<td>0010 003Fh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0040h</td>
<td>Communication Port 0 (16 words) (C40 only)</td>
<td>See subsection 4.2.4</td>
<td></td>
</tr>
<tr>
<td>0010 004Fh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0050h</td>
<td>Communication Port 1 (16 words)</td>
<td>See subsection 4.2.4</td>
<td></td>
</tr>
<tr>
<td>0010 005Fh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0060h</td>
<td>Communication Port 2 (16 words)</td>
<td>See subsection 4.2.4</td>
<td></td>
</tr>
<tr>
<td>0010 006Fh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0070h</td>
<td>Communication Port 3 (16 words) (C40 only)</td>
<td>See subsection 4.2.4</td>
<td></td>
</tr>
<tr>
<td>0010 007Fh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0080h</td>
<td>Communication Port 4 (16 words)</td>
<td>See subsection 4.2.4</td>
<td></td>
</tr>
<tr>
<td>0010 008Fh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 0090h</td>
<td>Communication Port 5 (16 words)</td>
<td>See subsection 4.2.4</td>
<td></td>
</tr>
<tr>
<td>0010 009Fh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00A0h</td>
<td>DMA Coprocessor Channel 0 (16 words)</td>
<td>See subsection 4.2.5</td>
<td></td>
</tr>
<tr>
<td>0010 00AFh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00B0h</td>
<td>DMA Coprocessor Channel 1 (16 words)</td>
<td>See subsection 4.2.5</td>
<td></td>
</tr>
<tr>
<td>0010 00BFh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00C0h</td>
<td>DMA Coprocessor Channel 2 (16 words)</td>
<td>See subsection 4.2.5</td>
<td></td>
</tr>
<tr>
<td>0010 00CFh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00D0h</td>
<td>DMA Coprocessor Channel 3 (16 words)</td>
<td>See subsection 4.2.5</td>
<td></td>
</tr>
<tr>
<td>0010 00DFh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00E0h</td>
<td>DMA Coprocessor Channel 4 (16 words)</td>
<td>See subsection 4.2.5</td>
<td></td>
</tr>
<tr>
<td>0010 00EFh</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00F0h</td>
<td>DMA Coprocessor Channel 5 (16 words)</td>
<td>See subsection 4.2.5</td>
<td></td>
</tr>
<tr>
<td>0010 00FFh</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
4.2.1 Local and Global Memory Interface Control Registers

These registers control the local and global memory interfaces. They occupy the first 16-word block of the peripheral bus memory map, shown in Figure 4–3. The registers themselves are shown in Figure 4–4. Chapter 9, External Bus Operation, covers the operation of these registers.

These registers define several settings:

- The page sizes used for the two strobes of each port
- Address ranges over which the strobes are active
- Wait states
- Other similar operations that compose the memory interfaces

Figure 4–4. Memory Interface Control Registers

<table>
<thead>
<tr>
<th>Memory Interface Control Register</th>
<th>Address</th>
</tr>
</thead>
<tbody>
<tr>
<td>Global Memory Interface Control</td>
<td>0010 0000h</td>
</tr>
<tr>
<td>Reserved</td>
<td>0010 0001h</td>
</tr>
<tr>
<td>Local Memory Interface Control</td>
<td>0010 0003h</td>
</tr>
<tr>
<td>Reserved</td>
<td>0010 0004h</td>
</tr>
<tr>
<td>Reserved</td>
<td>0010 0005h</td>
</tr>
<tr>
<td>Reserved</td>
<td>0010 000Fh</td>
</tr>
</tbody>
</table>

4.2.2 Analysis Module Registers

The second lowest 16-word block in the peripheral bus memory map, as shown in Figure 4–3, contains part of the analysis module registers. These registers are reserved for emulation functions. The TMS320C4x C Source Debugger User’s Guide (literature number SPRU054) describes the analysis module user interface provided by the ’C4x debugger.
4.2.3 Timer Registers

This group of registers occupies the 0010 0020h–0010 003Fh range in the peripheral bus memory map shown in Figure 4–3, on page 4-5. Timers and their registers are covered in detail in Chapter 13, Timers.

Figure 4–5. Timer Registers

<table>
<thead>
<tr>
<th>Address</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>0010 0020h</td>
<td>Timer 0 control register</td>
</tr>
<tr>
<td>0010 0021h</td>
<td>Reserved</td>
</tr>
<tr>
<td>0010 0022h</td>
<td>Timer 0 counter register</td>
</tr>
<tr>
<td>0010 0023h</td>
<td>Reserved</td>
</tr>
<tr>
<td>0010 0024h</td>
<td>Timer 0 period register</td>
</tr>
<tr>
<td>0010 0025h</td>
<td>Reserved</td>
</tr>
<tr>
<td>0010 0026h</td>
<td>Timer 0 control register</td>
</tr>
<tr>
<td>0010 0027h</td>
<td>Reserved</td>
</tr>
<tr>
<td>0010 0028h</td>
<td>Timer 0 period register</td>
</tr>
<tr>
<td>0010 0030h</td>
<td>Timer 1 control register</td>
</tr>
<tr>
<td>0010 0031h</td>
<td>Reserved</td>
</tr>
<tr>
<td>0010 0032h</td>
<td>Timer 1 counter register</td>
</tr>
<tr>
<td>0010 0033h</td>
<td>Reserved</td>
</tr>
<tr>
<td>0010 0034h</td>
<td>Timer 1 period register</td>
</tr>
<tr>
<td>0010 0035h</td>
<td>Reserved</td>
</tr>
<tr>
<td>0010 0036h</td>
<td>Timer 1 control register</td>
</tr>
<tr>
<td>0010 0037h</td>
<td>Reserved</td>
</tr>
<tr>
<td>0010 0038h</td>
<td>Timer 1 period register</td>
</tr>
<tr>
<td>0010 0039h</td>
<td>Reserved</td>
</tr>
<tr>
<td>0010 003Fh</td>
<td>Timer 0 counter register</td>
</tr>
</tbody>
</table>
### 4.2.4 Communication Port Memory Map

Figure 4–6 illustrates the communication-port control registers (CPCR) and input and output FIFO buffers. This is the central group of registers in the peripheral bus memory map shown in Figure 4–4, on page 4-6. These registers are described in more detail in Chapter 12, *Communication Ports*.

#### Figure 4–6. Communication Port Memory Map

<table>
<thead>
<tr>
<th>Offset</th>
<th>Function</th>
</tr>
</thead>
<tbody>
<tr>
<td>0010 0040h</td>
<td>CPCR 0 (C40 only)</td>
</tr>
<tr>
<td>0010 0041h</td>
<td>input port 0, FIFO position 0</td>
</tr>
<tr>
<td>0010 0042h</td>
<td>output port 0, FIFO position 7</td>
</tr>
<tr>
<td>0010 0043h</td>
<td>Port 0 software reset</td>
</tr>
<tr>
<td>0010 0050h</td>
<td>CPCR 1</td>
</tr>
<tr>
<td>0010 0051h</td>
<td>input port 1, FIFO position 0</td>
</tr>
<tr>
<td>0010 0052h</td>
<td>output port 1, FIFO position 7</td>
</tr>
<tr>
<td>0010 0053h</td>
<td>Port 1 software reset</td>
</tr>
<tr>
<td>0010 0060h</td>
<td>CPCR 2</td>
</tr>
<tr>
<td>0010 0061h</td>
<td>input port 2, FIFO position 0</td>
</tr>
<tr>
<td>0010 0062h</td>
<td>output port 2, FIFO position 7</td>
</tr>
<tr>
<td>0010 0063h</td>
<td>Port 2 software reset</td>
</tr>
<tr>
<td>0010 0070h</td>
<td>CPCR 3 (C40 only)</td>
</tr>
<tr>
<td>0010 0071h</td>
<td>input port 3, FIFO position 0</td>
</tr>
<tr>
<td>0010 0072h</td>
<td>output port 3, FIFO position 7</td>
</tr>
<tr>
<td>0010 0073h</td>
<td>Port 3 software reset</td>
</tr>
<tr>
<td>0010 0080h</td>
<td>CPCR 4</td>
</tr>
<tr>
<td>0010 0081h</td>
<td>input port 4, FIFO position 0</td>
</tr>
<tr>
<td>0010 0082h</td>
<td>output port 4, FIFO position 7</td>
</tr>
<tr>
<td>0010 0083h</td>
<td>Port 4 software reset</td>
</tr>
<tr>
<td>0010 0090h</td>
<td>CPCR 5</td>
</tr>
<tr>
<td>0010 0091h</td>
<td>input port 5, FIFO position 0</td>
</tr>
<tr>
<td>0010 0092h</td>
<td>output port 5, FIFO position 7</td>
</tr>
<tr>
<td>0010 0093h</td>
<td>Port 5 software reset</td>
</tr>
</tbody>
</table>
4.2.5 DMA Coprocessor Registers

The DMA registers (shown in Figure 4–7) are the bottom block of registers in the peripheral bus memory map (Figure 4–3 on page 4-5). These registers are described in Chapter 11, *The DMA Coprocessor*.

**Figure 4–7. DMA Coprocessor Memory Map**

<table>
<thead>
<tr>
<th>Channel</th>
<th>Registers (see exploded view)</th>
<th>DMA Ch 0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0010 00A0h</td>
<td>Reserved</td>
<td></td>
</tr>
<tr>
<td>0010 00A8h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00A9h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00AFh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00B0h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00B8h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00B9h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00BFh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00C0h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00C8h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00C9h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00CFh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00D0h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00D8h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00D9h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00DFh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00E0h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00E8h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00E9h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00F0h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00F8h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00F9h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0010 00FFh</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

**Exploded View of Each Channel Register**

<table>
<thead>
<tr>
<th>Channel</th>
<th>Registers (see exploded view)</th>
<th>DMA Ch 1</th>
</tr>
</thead>
<tbody>
<tr>
<td>010 00z0h</td>
<td>Control register x</td>
<td></td>
</tr>
<tr>
<td>010 00z1h</td>
<td>Source address x</td>
<td></td>
</tr>
<tr>
<td>010 00z2h</td>
<td>Source address index x</td>
<td></td>
</tr>
<tr>
<td>010 00z3h</td>
<td>Transfer counter x</td>
<td></td>
</tr>
<tr>
<td>010 00z4h</td>
<td>Destination address x</td>
<td></td>
</tr>
<tr>
<td>010 00z5h</td>
<td>Destination address index x</td>
<td></td>
</tr>
<tr>
<td>010 00z6h</td>
<td>Link pointer x</td>
<td></td>
</tr>
<tr>
<td>010 00z7h</td>
<td>Auxiliary transfer counter x</td>
<td></td>
</tr>
<tr>
<td>010 00z8h</td>
<td>Auxiliary link pointer x</td>
<td></td>
</tr>
</tbody>
</table>

x = channel number (e.g., x = 1 for channel 1, x = 2 for channel 2, etc.)
z = corresponding hexadecimal digit for channel address (e.g., substitute an A for DMA channel 0, B for DMA channel 1, etc.)
4.3 Instruction Cache

The 128 × 32-bit instruction cache speeds instruction fetches and lowers system cost. The instruction cache allows the use of slow external memories while still achieving single-cycle access performance. The cache also frees the external buses from program fetches, thus, allowing the use of these buses for DMA or other system needs. The cache can operate in a completely automatic fashion without the need for external intervention. It uses a form of the LRU (least recently used) cache update algorithm.

4.3.1 Instruction Cache Architecture

The instruction cache (see Figure 4–9 on page 4-11 ) contains 128 32-bit words of RAM, enough to hold 128 words of program memory. It is divided into four 32-word segments. Associated with each segment is a 27-bit segment start address (SSA) register. For each word in the cache, there is a corresponding single-bit present (P) flag.

When the CPU requests an instruction word, a check is made to determine whether the word is already in the instruction cache. The partitioning of an instruction address as used by the cache control algorithm is shown in Figure 4–8. The 27 most significant bits (MSBs) of the instruction address select the segment, and the five least significant bits (LSBs) define the address of the instruction word within the pertinent segment. The 27 MSBs of the instruction address are compared with the four SSA registers. If a match is found, the relevant P flag is checked. The P flag indicates whether the word within a particular segment is already present in cache memory:

- P = 1: the word is already present in cache memory.
- P = 0: location in cache is invalid (e.g., contains garbage).

Figure 4–8. Address Partitioning for Cache Control Algorithm

If there is no match, one of the segments must be replaced by the new data. The segment replaced in this circumstance is determined by the LRU (least recently used) algorithm. The LRU stack (see the upper-right portion of Figure 4–9) is maintained for this purpose.
Figure 4–9. Instruction Cache Architecture
The LRU stack keeps track of which segment (0–3) qualifies as the least recently used after each access to the cache. Each time a segment is accessed, its segment number is removed from the LRU stack and pushed onto the top of the LRU stack. Therefore, the number at the top of the stack is the most recently used segment number, and the number at the bottom of the stack is the least recently used segment number.

At reset, the following occur in the instruction cache:

- Cache is disabled (ST(CE) = 0). After reset cache is frozen (ST(CF) = 1).
  
  See section 3.1.7, Status Register (ST), on page 3-5, for details.

- All P flags are set to zero.

- The LRU stack is initialized with segment 0 at the top, followed by segments 1, 2, and 3 at the bottom. If any two SSA registers are equal (due to reset conditions) and a cache hit occurs, the instruction word is fetched from the most recently used segment.

When a replacement is necessary, the least recently used segment is selected for replacement. Also, the 32 P flags for the segment to be replaced are set to 0, and the segment’s SSA register is replaced with the 27 MSBs of the new instruction’s address.

4.3.2 Cache Control Bits

Four cache control bits are located in the CPU status register (ST): the cache clear bit (CC), the cache enable bit (CE), the cache freeze bit (CF), and the previous cache freeze bit (PCF). The status register is shown in Figure 3–3.

**Cache Clear Bit (CC).** Set CC = 1 to invalidate all entries in the cache. This bit is always cleared after it is written to; thus, it is always read as 0. At reset, 0 is written to this bit. The cache P flag = 0 when the cache is cleared.

**Cache Enable Bit (CE).** Set CE = 1 to enable the cache, allowing the cache to be used according to the LRU (least recently used) cache algorithm. Set CE = 0 to disable the cache; this prevents cache updates or modifications (thus, no cache fetches can be made). At reset, 0 is written to this bit. Cache clearing (CC = 1) is allowed when CE = 0.

**Cache Freeze Bit (CF).** Set CF = 1 to freeze the cache including freezing of LRU (least recently used) stack manipulation. If the cache is enabled (CE = 1) and the cache is frozen (CF = 1), fetches from the cache are allowed, but modification of the cache contents is not allowed. Cache clearing (CC = 1) is allowed when CF = 1. At reset, this bit is cleared to 0 and after reset it is set to 1. When CF = 0, cache clearing (CC=1) is allowed. CF is set to one when a trap or interrupt is taken. Also, the RETI and RETID instructions copy PCF to the CF bit.
Table 4–1 summarizes the effects of the CE and CF bits.

**Table 4–1. Combined Effect of the CE and CF Bits**

<table>
<thead>
<tr>
<th>CE</th>
<th>CF</th>
<th>Effect</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>Cache not enabled</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>Cache not enabled</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>Cache enabled and not frozen</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>Cache enabled and frozen</td>
</tr>
</tbody>
</table>

**Previous Cache Freeze Bit (PCF).** When an interrupt or trap vector is taken, the CF value is copied to the PCF bit, and the CF bit is set to 1. This protects the cache during interrupt processing and is particularly useful when code loops are interrupted. The interrupt service routine may optionally use the cache under software control. Interrupts may also be nested, providing that the status register is saved before the interrupts are enabled. When the instructions RETI<sub>cond</sub> and RETI<sub>condD</sub> are executed to complete interrupt processing, the contents of the PCF bit are copied to the CF bit.

### 4.3.3 Using the Cache

Only instructions may be fetched from the program cache. All reads and writes of data to and from memory, bypass the cache. Program fetches from internal memory do not modify the cache and do not generate cache hits or misses. The program cache is a single-access memory block. Dummy program fetches (i.e., following a branch) can generate cache misses and cache updates. Example 4–1 shows a typical way to clear and enable the cache.

**Example 4–1. Enabling the Cache**

```assembly
    OR 1800h,ST
    ...```

To use the cache more efficiently, take two precautions:

**Avoid using self-modifying code.** If an instruction resides in the cache and the corresponding location in primary memory is modified, the copy in the instruction in the cache is not modified.

**Align program code.** Use the .align directive when coding assembly language to align code on 32-word address boundaries.
4.3.4 The LRU Cache Algorithm

When the 'C4x requests an instruction word from external memory, the two possible actions are a cache hit or a cache miss:

- **Cache Hit.** The cache contains the requested instruction, and the following actions occur:
  - The instruction word is read from the cache.
  - The number of the segment containing the word is removed from the LRU stack and pushed to the top of the LRU stack (if it is not already at the top), thus moving the other segment numbers toward the bottom of the stack.

- **Cache Miss.** The cache does not contain the instruction. There are two types of cache misses:
  - **Subsegment miss.** The segment address register matches the instruction address, but the relevant P flag is not set. The following actions occur:
    - The instruction word is read from memory and copied into the cache.
    - The number of the segment containing the word is removed from the LRU stack and pushed to the top of the LRU stack (if it is not already at the top), thus moving the other segment numbers toward the bottom of the stack.
    - The relevant P flag is set.
  - **Segment miss.** None of the segment addresses matches the instruction address. The following actions occur:
    - The least recently used segment is selected for replacement and the P flags for all 32 words are cleared.
    - The SSA register for the selected segment is loaded with the 27 MSBs of the address of the requested instruction word.
    - The instruction word is fetched and copied into the cache. It goes into the appropriate word of the least recently used segment. The P flag for that word is set to 1.
    - The number of the segment containing the instruction word is removed from the LRU stack and pushed to the top of the LRU stack, thus moving the other segment numbers toward the bottom of the stack.
In the ’C4x architecture, data is organized into three fundamental types: integer, unsigned-integer, and floating-point. Note that the terms, integer and signed-integer, are considered to be equivalent. The ’C4x supports short and single-precision formats for signed and unsigned integers. It also supports short, single-precision and extended-precision formats for floating-point data.

Floating-point operations make fast, trouble-free, accurate, and precise computations. Specifically, the ’C4x implementation of floating-point arithmetic facilitates floating-point operations at integer speeds while preventing problems with overflow, operand alignment, and other burdensome tasks common in integer operations.

This chapter discusses in detail the data formats and floating-point operations supported on the ’C4x.

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>5.1 Signed-Integer Formats</td>
<td>5-2</td>
</tr>
<tr>
<td>5.2 Unsigned-Integer Formats</td>
<td>5-3</td>
</tr>
<tr>
<td>5.3 Floating-Point Formats</td>
<td>5-4</td>
</tr>
<tr>
<td>5.4 Floating-Point Conversion (IEEE Std. 754)</td>
<td>5-13</td>
</tr>
<tr>
<td>5.5 Floating-Point Multiplication</td>
<td>5-19</td>
</tr>
<tr>
<td>5.6 Floating-Point Addition and Subtraction</td>
<td>5-23</td>
</tr>
<tr>
<td>5.7 Normalization (NORM Instruction)</td>
<td>5-27</td>
</tr>
<tr>
<td>5.8 Rounding (RND Instruction)</td>
<td>5-29</td>
</tr>
<tr>
<td>5.9 Floating-Point to-Integer Conversion (FIX Instruction)</td>
<td>5-31</td>
</tr>
<tr>
<td>5.10 Integer-to-Floating-Point Conversion (FLOAT Instruction)</td>
<td>5-33</td>
</tr>
<tr>
<td>5.11 Reciprocal (RCPF Instruction)</td>
<td>5-34</td>
</tr>
<tr>
<td>5.12 Reciprocal Square Root (RSQRF Instruction)</td>
<td>5-36</td>
</tr>
</tbody>
</table>
5.1 Signed-Integer Formats

The 'C4x supports two signed-integer formats: a 16-bit short format and a 32-bit single-precision format. The term integer is used throughout this chapter to refer to a signed integer.

Note:
When extended-precision registers are used as integer operands, only bits 31–0 are used; bits 39–32 remain unchanged and unused.

5.1.1 Short Integer Format

The 16-bit twos-complement short integer format is used for immediate integer operands. For those instructions that assume integer operands, this format is sign extended to 32 bits (see Figure 5–1). The range of an integer \( s_i \), represented in the short integer format, is:

\[ -2^{15} \leq s_i \leq 2^{15} - 1 \]

In Figure 5–1 and other figures in this chapter, \( s \) = sign bit.

Figure 5–1. Short-Integer Format and Sign Extension of Short Integer

5.1.2 Single-Precision Integer Format

In the single-precision integer format, the integer is represented in twos-complement notation. The range of an integer \( s_p \), represented in the single-precision integer format, is \(-2^{31} \leq s_p \leq 2^{31} - 1\). Figure 5–2 shows the single-precision integer format.

Figure 5–2. Single-Precision Integer Format
5.2 Unsigned-Integer Formats

Two unsigned-integer formats are supported on the 'C4x: a 16-bit short format and a 32-bit single-precision format. In this chapter, the term unsigned integer is used to refer to an unsigned integer.

Note:
When extended-precision registers are used, the unsigned-integer operands use only bits 31–0; bits 39–32 remain unchanged.

5.2.1 Short Unsigned-Integer Format

Figure 5–3 shows the 16-bit short unsigned-integer format used in immediate unsigned-integer operands. For instructions that use unsigned-integer operands, the format is filled with zeros to 32 bits. The range of a short unsigned integer is $0 \leq si \leq 2^{16}$.

Figure 5–3. Short Unsigned-Integer Format and Zero Fill

5.2.2 Single-Precision Unsigned-Integer Format

In the single-precision unsigned-integer format, the number is represented as a 32-bit value, as shown in Figure 5–4. The range of a single-precision unsigned-integer is $0 \leq sp \leq 2^{32}$.

Figure 5–4. Single-Precision Unsigned-Integer Format
5.3 Floating-Point Formats

The 'C4x supports three floating-point formats:

- A short floating-point format (for immediate floating-point operands) consisting of a 4-bit exponent, one sign bit, and an 11-bit fraction
- A single-precision format consisting of an 8-bit exponent, one sign bit, and a 23-bit fraction
- An extended-precision format consisting of an 8-bit exponent, one sign bit, and a 31-bit fraction

All 'C4x floating-point formats consist of three fields: an exponent field (e), a single-bit sign field (s), and a fraction field (f). The sign field and fraction field may be considered as one unit and referred to as the mantissa field (man). Each format is divided into these fields as shown in Figure 5–5.

Figure 5–5. General Floating-Point Format

<table>
<thead>
<tr>
<th>Exponent</th>
<th>Sign</th>
<th>Fraction</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Mantissa</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

The general equation for calculating the value in a floating point number is given by Equation 5–1. In the equation, s is the value of the sign bit, \( \bar{s} \) is the inverse of the value of the sign bit, f is the binary value of the fraction field, and e is the decimal equivalent of the exponent field.

Equation 5–1. Value in a Floating Point Number

\[
x = \bar{s}.f \times 2^e
\]

The mantissa represents a normalized twos-complement number. In a normalized representation, a most significant nonsign bit is implied, thus providing an additional bit of precision. The implied sign bit is used as follows:

- If \( s = 0 \), then the leading two bits of the mantissa are 01.
- If \( s = 1 \), then the leading two bits of the mantissa are 10.

If the sign bit, s, is equal to 0, the mantissa becomes 01.f, where f is the binary representation of the fraction field. If s is 1, the mantissa becomes 10.f, where f is the binary representation of the fraction field.

For example, if \( f = 00000000001_2 \) and s = 0, the value of the mantissa (man) would be \( 01.0000000000001_2 \). If s = 1 for the same value of f, the value of man would be \( 10.0000000000001_2 \).
The exponent field is a twos-complement number that determines the factor of two by which the number is multiplied. Essentially, the exponent field shifts the binary point in the mantissa. If the exponent is positive, then the binary point is shifted to the right. If the exponent is negative, then the binary point is shifted to the left.

For example, if \( \text{man}=01.00000000001_2 \) and the \( e=11_{10} \), then the binary point is shifted eleven places to the right, producing the number: \( 0100000000001_2 \), which is equal to 2049 decimal.

### 5.3.1 Short Floating-Point Format

In the short floating-point format, floating-point numbers are represented by a twos-complement 4-bit exponent field \( e \) and a twos-complement 12-bit mantissa field \( \text{man} \) with an implied most significant nonsign bit.

**Figure 5–6. Short Floating-Point Format**

![Short Floating-Point Format Diagram]

You must use the following reserved values to represent zero in the single-precision floating-point format:

- \( e = -8 \)
- \( s = 0 \)
- \( f = 0 \)

Operations are performed with an implied binary point between bits 11 and 10. The floating-point twos-complement number \( x \) in the short floating-point format is given by:

\[
\begin{align*}
    x &= 01.f_2 \times 2^e & \text{if } s = 0 \\
    x &= 10.f_2 \times 2^e & \text{if } s = 1 \\
    x &= 0 & \text{if } e = -8, s = 0, f = 0
\end{align*}
\]

The following examples illustrate the range and precision of the short floating-point format:

- **Most Positive:** \( x = (2 - 2^{-11}) \times 2^7 = 2.5594 \times 10^2 \)
- **Least Positive:** \( x = 1 \times 2^{-7} = 7.8125 \times 10^{-3} \)
- **Least Negative:** \( x = (-1 - 2^{-11}) \times 2^{-7} = -7.8163 \times 10^{-3} \)
- **Most Negative:** \( x = -2 \times 2^7 = -2.5600 \times 10^2 \)
5.3.2 Single-Precision Floating-Point Format

In the single-precision format, the floating-point number is represented by an 8-bit exponent field \( e \) and a twos-complement 24-bit mantissa field \( m_{an} \) with an implied most significant nonsign bit.

Operations are performed with an implied binary point between bits 23 and 22. When the implied most significant nonsign bit is made explicit, it is located to the immediate left of the binary point. The floating-point number \( x \) is given by

\[
x = \begin{cases} 
0.1f \times 2^e & \text{if } s = 0 \\
1.0f \times 2^e & \text{if } s = 1 \\
0 & \text{if } e = -128, s = 0, f = 0
\end{cases}
\]

Figure 5–7. Single-Precision Floating-Point Format

![Diagram of single-precision floating-point format]

You must use the following reserved values to represent zero in the single-precision floating-point format:

\[ e = -128 \]
\[ s = 0 \]
\[ f = 0 \]

The following examples illustrate the range and precision of the single-precision floating-point format.

Most Positive: \( x = (2 - 2^{-23}) \times 2^{127} = 3.4028234 \times 10^{38} \)

Least Positive: \( x = 1 \times 2^{-127} = 5.8774717 \times 10^{-39} \)

Least Negative: \( x = (-1 - 2^{-23}) \times 2^{-127} = -5.8774724 \times 10^{-39} \)

Most Negative: \( x = -2 \times 2^{127} = -3.4028236 \times 10^{38} \)
5.3.3 Extended-Precision Floating-Point Format

In the extended-precision format, the floating-point number is represented by an 8-bit exponent field \( e \) and a 32-bit mantissa field \( \text{man} \) with an implied most significant nonsign bit.

Operations are performed with an implied binary point between bits 31 and 30. When the implied most significant nonsign bit is made explicit, it is located to the immediate left of the binary point. The floating-point number \( x \) is given by:

\[
x = \begin{cases} 
01.f \times 2^e & \text{if } s = 0 \\
10.f \times 2^e & \text{if } s = 1 \\
0 & \text{if } e = -128, s = 0, f = 0
\end{cases}
\]

Figure 5–8. Extended-Precision Floating-Point Format

<table>
<thead>
<tr>
<th>Exponent</th>
<th>Sign</th>
<th>Fraction</th>
</tr>
</thead>
<tbody>
<tr>
<td>39 32</td>
<td>31</td>
<td>30 0</td>
</tr>
</tbody>
</table>

You must use the following reserved values to represent zero in the extended-precision floating-point format:

\( e = -128 \)
\( s = 0 \)
\( f = 0 \)

The following examples illustrate the range and precision of the extended-precision floating-point format:

Most Positive: \( x = (2 - 2^{-31}) \times 2^{127} = 3.4028236683 \times 10^{38} \)
Least Positive: \( x = 1 \times 2^{-127} = 5.8774717541 \times 10^{-39} \)
Least Negative: \( x = (-1 - 2^{-31}) \times 2^{-127} = -5.8774717569 \times 10^{-39} \)
Most Negative: \( x = -2 \times 2^{127} = -3.4028236691 \times 10^{38} \)
5.3.4 Determining the Decimal Equivalent of a Floating-Point Number

There are two basic steps in determining the value stored in floating point format:

1) Determine the values of the exponent and mantissa.

2) Shift the binary point in the mantissa according to the value of the exponent field and then convert the number to decimal.

5.3.4.1 Step 1: Determine the Values of the Exponent and Mantissa

The exponent field is a twos-complement number whose range depends on the type of floating-point number you are converting. Record the decimal equivalent of this value as $e$.

For example, if you are converting a single-precision floating-point number and the binary value of the exponent field is 00000100, then the decimal value of the exponent would be 4 since a 1 in the third bit from the right corresponds to 4.

If, on the other hand, the binary value of the exponent field is $11111100_2$, then the decimal value of the exponent would be $-4$. Since the first bit on the left is 1, you know that the number is negative. You calculate the value of the number by taking the one's complement of $11111100_2$, which is $00000011_2$ and then by adding 1 to that result.

Note:

If the value of the exponent matches the value reserved for zero, then the floating point number is equal to zero. The reserved value for each floating point type is given with the type descriptions in Section 5.3.

The mantissa is a binary number with an implied binary point between the sign bit and the fraction field. Form the mantissa in one of two ways:

- If $s = 0$, form the mantissa by writing 01. and appending the bits in the fraction field after the binary point.

  For example, if $f = 10100000000_2$, then $man = 01.10100000000_2$.

  Rewrite the mantissa as:

  \[
  \begin{array}{cccccccccccc}
  \text{Mantissa} \\
  0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0
  \end{array}
  \]

- If $s = 1$, form the mantissa by taking the one's complement of the fraction field and then appending the sign bit.

  For example, if $f = 01000000000_2$, then $man = 01.10100000000_2$.

  Rewrite the mantissa as:

  \[
  \begin{array}{cccccccccccc}
  \text{Mantissa} \\
  0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0
  \end{array}
  \]
If \( s = 1 \), form the mantissa by writing \( 1.0 \) and appending the bits in the fraction field after the binary point.

For example, if \( f = 10100000000_2 \), then \( \text{man} = 10100000000_2 \).

\[
\begin{array}{c|cccccccccccc}
S & \text{Fraction} \\
--- & --- & --- & --- & --- & --- & --- & --- & --- & --- & --- & --- \\
1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\end{array}
\]

Rewrite the mantissa as:

\[
\begin{array}{c|cccccccccccc}
\text{Mantissa} & --- & --- & --- & --- & --- & --- & --- & --- & --- & --- & --- & --- \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\end{array}
\]

### 5.3.4.2 Step 2: Shift the Decimal Point in the Mantissa and Convert to Decimal

If the exponent (\( e \)) has a positive value, then you shift the binary point \( e \) places to the right.

If the exponent (\( e \)) has a negative value, then you shift the binary point \( e \) places to the left.

For example, if \( e = 2_{10} \) and the \( \text{man} = 01.11000000000_2 \), then the shifted mantissa becomes \( 0111.000000000_2 \), which is equivalent to 7 in decimal.

If, on the other hand, \( e = -2_{10} \) and \( \text{man} = 01.10000000000_2 \), then the shifted mantissa becomes \( .0110000000000_2 \), which is equivalent to \( 3/8 \) in decimal.

The following examples illustrate how you can obtain the equivalent floating-point value of a number in ‘C4x floating-point format. Each of the examples uses the single-precision floating point format.

**Example 5–1. Positive Number**

\[
\begin{array}{cccccccccccc}
0 & 2 & 4 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0000 & 0010 & 0100 & 0000 & 0000 & 0000 & 0000 & 0000 & 0000 & 0000 & 0000 & 0000 \\
\end{array}
\]

Hex value

Binary value

Exponent = 0000 0010₂ = 2

Sign = 0

Fraction = 0.10000₂

Value = \( 01.1₂ \times 2^2 = 011₀₂ = 6 \)
**Floating-Point Formats**

**Example 5–2. Negative Number**

|       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |
|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| 0     | 1     | C     | 0     | 0     | 0     | 0     | 0     |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |
| 0000  | 0001  | 1100  | 0000  | 0000  | 0000  | 0000  | 0000  |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |       |

Exponent = 0000 0001₂ = 1
Sign = 1
Fraction = .10000₂
Value = \(10.1₂ \times 2^1 = 101₂ = -3\)

**Example 5–3. Fractional Number**

<table>
<thead>
<tr>
<th>F</th>
<th>B</th>
<th>4</th>
<th>0</th>
<th>0</th>
<th>0</th>
<th>0</th>
<th>0</th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>1111</td>
<td>1011</td>
<td>0100</td>
<td>0000</td>
<td>0000</td>
<td>0000</td>
<td>0000</td>
<td>0000</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Exponent = 1111 1011₂ = -5
Sign = 0
Fraction = .10000₂
Value = \(01.1₂ \times 2^{-5} = \cdot000011₂ = 3/64\)
5.3.5 Conversion Between Floating-Point Formats

Floating-point operations assume several different formats for inputs and outputs. These formats often require conversion from one floating-point format to another (for example, from short floating-point format to extended-precision floating-point format). Format conversions occur automatically in hardware, with no overhead, as a part of floating-point operations. Examples of the four conversions are shown in Figure 5–9 through Figure 5–12 ($s =$ sign bit of the exponent). When a floating-point format zero is converted to a different format, it is always converted to a valid representation of zero in that format.

**Figure 5–9. Short Floating-Point Format Conversion to Single-Precision Floating-Point Format**

In converting from short format to single-precision format, the exponent field is sign extended and the rightmost 12 bits of the fraction field are filled with zeros.

**Figure 5–10. Short Floating-Point Format Conversion to Extended-Precision Floating-Point Format**

In converting from short format to extended-precision format, the exponent field is sign extended and the rightmost 20 bits of the fraction field are filled with zeros.
In converting from single-precision format to extended-precision format, the rightmost eight bits of the fraction field are filled with zeros.

In converting from extended-precision format to single-precision format, the eight rightmost bits of the fraction field are truncated.
5.4 Floating-Point Conversion (IEEE Std. 754)

The 'C4x floating-point format is not compatible with the IEEE standard 754 format. However, the 'C4x has instructions to directly convert to and from IEEE format (TOIEEE and FRIEEE, respectively). The conversion process is explained in subsections 5.4.1 and 5.4.2. Figure 5–13 shows the IEEE floating-point format, and Figure 5–14 shows the floating-point 'C4x format.

Figure 5–13. IEEE Single-Precision Std. 754 Floating-Point Format

The following five cases define the value $v$ of a number expressed in the IEEE format:

1) If $e = 255$ and $f \neq 0$, then $v = \text{NaN}$
2) If $e = 255$ and $f = 0$, then $v = (-1)^s \text{ infinite}$
3) If $0 < e < 255$, then $v = (-1)^s \times 2^{e-127}(1.f)$
4) If $e = 0$ and $f \neq 0$, then $v = (-1)^s \times 2^{-126}(0.f)$
5) If $e = 0$ and $f = 0$, then $v = (-1)^s \times 0 \text{ (zero)}$.

where $s = \text{sign bit; e} = \text{the exponent field; f} = \text{the fraction field; NaN = Not a number}$

For the above five representations, $e$ is treated as an unsigned integer. Case 1 generates NaN (not an number) and is primarily used for software signaling. Case 4 represents a denormalized number. Case 5 represents positive and negative zero.

Figure 5–14. 'C4x Single-Precision Twos-Complement Floating-Point Format‡

In comparison, Figure 5–14 shows the 'C4x twos-complement floating-point format. In this format, two cases can be used to define value $v$ of a number:

1) If $e = -128$ and $f \neq 0$, then $v = 0$
2) If $e \neq -128$ then $v = s\overline{s}.f_2 \times 2^e$

where $s = \text{sign bit; e} = \text{the exponent field; f} = \text{the fraction field}$.
For this representation, $e$ is treated as a two's-complement integer. The fraction and sign bit form a normalized two's-complement mantissa.

**Note: Differentiating Symbols for IEEE and 'C4x Formats**

To differentiate between the symbols that define these two formats, all IEEE fields are subscripted with an IEEE (e.g., $e_{\text{IEEE}}$, $s_{\text{IEEE}}$, etc.). Similarly, all two's-complement fields are subscripted with two (i.e., $e_{\text{two}}$, $s_{\text{two}}$, $f_{\text{two}}$).

### 5.4.1 Converting IEEE Format to Twos-Complement 'C4x Floating-Point Format

The most common conversion is the IEEE-to-twos-complement format. This conversion is done according to rules in the following table:

**Table 5–1. Converting IEEE Format to Twos-Complement Floating-Point Format**

<table>
<thead>
<tr>
<th>Case</th>
<th>$e_{\text{IEEE}}$</th>
<th>$s_{\text{IEEE}}$</th>
<th>$f_{\text{IEEE}}$</th>
<th>$e_{\text{two}}$</th>
<th>$s_{\text{two}}$</th>
<th>$f_{\text{two}}$</th>
<th>$s_{\text{IEEE}}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>255</td>
<td>1</td>
<td></td>
<td>7Fh</td>
<td>1</td>
<td>00 0000h</td>
<td></td>
</tr>
<tr>
<td>2</td>
<td>255</td>
<td>0</td>
<td></td>
<td>7Fh</td>
<td>0</td>
<td>7F FFFFh</td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>0 &lt; $e_{\text{IEEE}}$ &lt; 255</td>
<td>0</td>
<td></td>
<td>$e_{\text{IEEE}}$–7Fh</td>
<td>$f_{\text{IEEE}}$</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>4</td>
<td>0 &lt; $e_{\text{IEEE}}$ &lt; 255</td>
<td>1, $\neq 0$</td>
<td></td>
<td>$e_{\text{IEEE}}$–7Fh</td>
<td>$f_{\text{IEEE}}$+1†</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>5</td>
<td>0 &lt; $e_{\text{IEEE}}$ &lt; 255</td>
<td>1</td>
<td>0</td>
<td>$e_{\text{IEEE}}$–80h</td>
<td>0</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>6</td>
<td>0</td>
<td></td>
<td></td>
<td>80h</td>
<td>0</td>
<td>00 0000h</td>
<td></td>
</tr>
</tbody>
</table>

† $f_{\text{IEEE}}$ = ones complement of $f_{\text{IEEE}}$. 


Case 1 maps the IEEE positive NaNs and positive infinity to the single-precision twos-complement most positive number. Overflow is also signaled to allow you to check for these special cases.

Case 2 maps the IEEE negative NaNs and negative infinity to the single-precision twos-complement most negative number. Overflow is also signaled to allow you to check for these special cases.

Case 3 maps the IEEE positive normalized numbers to the identical value in the twos-complement positive number.

Case 4 maps the IEEE negative normalized numbers with a nonzero fraction to the identical value in the twos-complement negative number.

Case 5 maps the IEEE negative normalized numbers with a zero fraction to the identical value in the twos-complement negative number.

Case 6 maps the IEEE positive and negative denormalized numbers and positive and negative zeros to a twos-complement zero.

The 'C4x assumes that an IEEE number is stored as an integer in memory or in a register. When the 'C4x converts an IEEE number, it places the number in an extended-precision register by using the exponent and fraction fields of the register. The eight LSBs of the extended-precision register are set to zero. Any arithmetic operations that are performed on the fraction field of the IEEE number should be performed only on the IEEE fraction field. In the case of a block memory transfer, a no-penalty data format conversion can be executed by using parallel instructions with STF. Example 5–4 illustrates how this can be accomplished.
Example 5–4.IEEE to ‘C4x Conversion Within Block Memory Transfer

* TITLE IEEE TO ’C4x CONVERSION WITHIN BLOCK MEMORY
* TRANSFER
* 
* PROGRAM ASSUMES THAT INPUT FIFO OF COMMUNICATION PORT 0
* IS FULL OF IEEE FORMAT DATA. EIGHT DATA WORDS ARE
* TRANSFERRED FROM COMMUNICATION PORT 0 TO INTERNAL RAM
* BLOCK 0 AND THE DATA FORMAT IS CONVERTED FROM IEEE FORMAT
* TO ’C4x FLOATING-POINT FORMAT.
*
.
.
.
LDI @CP0_IN,AR0 ;Load comm port0 input FIFO address
LDI @RAM0,AR1 ;Load internal RAM block 0 address
FRIEEE*AR0,R0 ;Convert first data
RPTS 6
FRIEEE*AR0,R0 ;Convert next data
|| STF R0,*AR1++(1) ;Store previous data
STF R0,*AR1++(1) ;Store last data
.
.
.

Floating-Point Conversion (IEEE Std. 754)
5.4.2 Converting Twos-Complement 'C4x Floating-Point Format to IEEE Format

This conversion is performed according to the following table:

**Table 5–2. Converting Twos-Complement Floating-Point Format to IEEE Format**

<table>
<thead>
<tr>
<th>Case</th>
<th>$e_{\text{two}}$</th>
<th>$s_{\text{two}}$</th>
<th>$f_{\text{two}}$</th>
<th>$e_{\text{IEEE}}$</th>
<th>$s_{\text{IEEE}}$</th>
<th>$f_{\text{IEEE}}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>-128</td>
<td>0</td>
<td>0</td>
<td>00h</td>
<td>0</td>
<td>00 0000h</td>
</tr>
<tr>
<td>2</td>
<td>-127</td>
<td>0</td>
<td>0</td>
<td>00h</td>
<td>0</td>
<td>00 0000h</td>
</tr>
<tr>
<td>3</td>
<td>$-126 \leq e_{\text{two}} \leq 127$</td>
<td>0</td>
<td>$f_{\text{two}}$</td>
<td>$e_{\text{two}} + 7Fh$</td>
<td>0</td>
<td>$f_{\text{two}}$</td>
</tr>
<tr>
<td>4</td>
<td>$-126 \leq e_{\text{two}} \leq 127$</td>
<td>1</td>
<td>$\neq 0$</td>
<td>$e_{\text{two}} + 7Fh$</td>
<td>0</td>
<td>$f_{\text{two}} + 1$</td>
</tr>
<tr>
<td>5</td>
<td>$-126 \leq e_{\text{two}} \leq 127$</td>
<td>1</td>
<td>0</td>
<td>$e_{\text{two}} + 80h$</td>
<td>1</td>
<td>00 0000h</td>
</tr>
<tr>
<td>6</td>
<td>127</td>
<td>1</td>
<td>0</td>
<td>FFh</td>
<td>1</td>
<td>00 0000h</td>
</tr>
</tbody>
</table>

† $f_{\text{two}} = $ ones complement of $f_{\text{two}}$.

**Case 1** maps a twos-complement zero to a positive IEEE zero.

**Case 2** maps the twos-complement numbers that are too small to be represented as normalized IEEE numbers to a positive IEEE zero.

**Case 3** maps the positive twos-complement numbers that are not covered by case 2 into the identically valued IEEE number.

**Case 4** maps the negative twos-complement numbers with a nonzero fraction that are not covered in case 2 into the identically valued IEEE number.

**Case 5** maps all the negative twos-complement numbers with a zero fraction, except for the most negative twos-complement number and those that are not covered in case 2, into the identically valued IEEE number.

**Case 6** maps the most negative twos-complement number to the IEEE negative infinity.

The 'C4x assumes that the twos-complement numbers are in memory or are in an extended-precision register in the exponent and fraction field of the register (shown in Figure 5–14 on page 5-13). If the value is in an extended-precision register, then only the 24 MSBs of the fraction field are manipulated as the fraction field and for detection of the special cases. The result of the conversion goes into the 32 MSBs of an extended-precision register. In the case of a block memory transfer, a no-penalty data format conversion can be executed by using parallel instructions with STF. Example 5–5 illustrates how this can be accomplished.
Example 5–5. 'C4x to IEEE Conversion Within Block Memory Transfer

* TITLE 'C4x TO IEEE CONVERSION WITHIN BLOCK MEMORY
* TRANSFER
* *
* PROGRAM ASSUMES THAT OUTPUT FIFO OF COMMUNICATION PORT 0
* IS EMPTY. EIGHT DATA WORDS ARE TRANSFERRED FROM
* INTERNAL RAM BLOCK 0 TO COMMUNICATION PORT 0 AND THE
* DATA FORMAT IS CONVERTED FROM 'C4x FLOATING-POINT FORMAT
* TO IEEE FORMAT.
* *
.  .  .
LDI @CP0_OUT,AR0 ;Load comm port0 output FIFO
  ; address
LDI @RAM0,AR1  ;Load internal RAM block 0
  ; address
TOIEEE*AR1++(1),R0 ;Convert first data
RPTS 6
TOIEEE*AR1++(1),R0 ;Convert next data
|| STF R0,*AR0  ;Store previous data
STF R0,*AR0  ;Store last data
.  .  .
5.5 Floating-Point Multiplication

A floating-point number $\alpha$ can be written in floating-point format as in the following formula, where $\alpha_{(\text{man})}$ is the mantissa and $\alpha_{(\text{exp})}$ is the exponent:

$$\alpha = \alpha_{(\text{man})} \times 2^{\alpha_{(\text{exp})}}$$

The product of $\alpha$ and $b$ is $c$, defined as:

$$c = \alpha \times b = \alpha_{(\text{man})} \times b_{(\text{man})} \times 2^{(\alpha_{(\text{exp})}+b_{(\text{exp})})}$$

Thus:

$$c_{(\text{man})} = \alpha_{(\text{man})} \times b_{(\text{man})}$$

$$c_{(\text{exp})} = \alpha_{(\text{exp})} + b_{(\text{exp})}$$

During floating-point multiplication, the source operands are always in the extended-precision floating-point format. If the source operands are in short or single-precision format, they are converted to extended-precision format. These conversions occur automatically in hardware with no overhead. All results of floating-point multiplications are returned in the extended-precision format.

A multiplication occurs in a single cycle.

Figure 5–15 is a flowchart showing the steps involved in a floating-point multiplication. Each step is labelled with a number in parentheses.

- In step 1, the 32-bit source mantissas, $\alpha_{(\text{man})}$ and $b_{(\text{man})}$, are multiplied, producing a 64-bit result, $c_{(\text{man})}$. (Note that input and output data are always represented as normalized numbers.)
- In step 2, the exponents, $\alpha_{(\text{exp})}$ and $b_{(\text{exp})}$, are added, yielding $c_{(\text{exp})}$.
- Step 3 checks whether or not $c_{(\text{man})}$ is equal to zero. If $c_{(\text{man})}$ is zero, step 7 sets $c_{(\text{exp})}$ to $-128$, thus yielding the representation for zero.
- Steps 4 and 5 normalize the result.
- If a right shift of one is necessary, then in step 8, $c_{(\text{man})}$ is right-shifted one bit, and 1 is added to $c_{(\text{exp})}$.
- If a right shift of two is necessary, then in step 9, $c_{(\text{man})}$ is right-shifted two bits, and 2 is added to $c_{(\text{exp})}$. step 6 occurs when the result is normalized.
- In step 10, $c_{(\text{man})}$ is set in the extended-precision floating-point format.
- Steps 11 through 16 check for special cases of $c_{(\text{exp})}$.
- In step 14, if $c_{(\text{exp})}$ has overflowed (detected in step 11) in the positive direction, then $c_{(\text{exp})}$ is set to the most positive extended-precision format value. If $c_{(\text{exp})}$ has overflowed in the negative direction, then $c_{(\text{exp})}$ is set to the most negative extended-precision format value.
If \( c(\text{exp}) \) has underflowed (detected in step 12), then \( c \) is set to zero in step 15; i.e., \( c(\text{man}) = 0 \) and \( c(\text{exp}) = -128 \).

**Figure 5–15. Flowchart for Floating-Point Multiplication**

1. Multiply mantissas
   \[ c(\text{man}) = \alpha(\text{man}) \times b(\text{man}) \] (50-bit result)
2. Add exponents
   \[ c(\text{exp}) = \alpha(\text{exp}) + b(\text{exp}) \]
3. Test for special cases of \( c(\text{man}) \)
   - \( c(\text{man}) = 0 \)
   - \( c(\text{man}) > > 1 \), \( c(\text{exp}) = c(\text{exp}) + 1 \)
   - \( c(\text{man}) > > 2 \), \( c(\text{exp}) = c(\text{exp}) + 2 \)
4. Right-shift 1 to normalize
5. Right-shift 2 to normalize
6. No shift to normalize
7. \( c(\text{exp}) = -128 \)
8. \( c(\text{man}) > > 1 \), \( c(\text{exp}) = c(\text{exp}) + 1 \)
9. \( c(\text{man}) > > 2 \), \( c(\text{exp}) = c(\text{exp}) + 2 \)
10. Dispose of extra bits
    - Put \( c(\text{man}) \) in extended-precision floating-point format
11. Test for special cases of \( c(\text{exp}) \)
    - \( c(\text{exp}) \) overflow
    - \( c(\text{exp}) \) underflow
    - \( c(\text{exp}) \) in range
12. If \( c(\text{man}) > 0 \), set \( c \) to most positive value.
    - \( c(\text{man}) < 0 \), set \( c \) to most negative value.
13. \( c(\text{exp}) = -128 \)
14. \( c(\text{man}) = 0 \)
15. Set \( c \) to final result
    \[ c = \alpha \times b \]
Example 5–6 through Example 5–9 illustrate how floating-point multiplication is performed on the C4x. For these examples, the implied most significant nonsign bit is made explicit.

**Example 5–6. Floating-Point Multiply (Both Mantissas = –2.0)**

Let

\[
\alpha = -2.0 \times 2^{\alpha(\exp)} = 10.00000000000000000000000 \times 2^{\alpha(\exp)}
\]

\[
b = -2.0 \times 2^{b(\exp)} = 10.00000000000000000000000 \times 2^{b(\exp)}
\]

where \(a\) and \(b\) are both represented in binary form according to the normalized single-precision floating-point format.

To place this number in the proper normalized format, it is necessary to shift the mantissa two places to the right and add 2 to the exponent. This yields

\[
01.00000000000000000000000 \times 2^{(\alpha(\exp) + b(\exp) + 2)}
\]

In floating-point multiplication, the exponent of the result may overflow when the exponents are initially added or when the exponent is modified during normalization.

**Example 5–7. Floating-Point Multiply (Both Mantissas = 1.5)**

Let

\[
\alpha = 1.5 \times 2^{\alpha(\exp)} = 01.10000000000000000000000 \times 2^{\alpha(\exp)}
\]

\[
b = 1.5 \times 2^{b(\exp)} = 01.10000000000000000000000 \times 2^{b(\exp)}
\]

where \(\alpha\) and \(b\) are both represented in binary form according to the single-precision floating-point format. Then

\[
01.10000000000000000000000 \times 2^{\alpha(\exp)}
\]

\[
\times 01.10000000000000000000000 \times 2^{b(\exp)}
\]

\[
0010.010000000000000000000000000000000000000000000000 \times 2^{(\alpha(\exp) + b(\exp))}
\]

To place this number in the proper normalized format, it is necessary to shift the mantissa one place to the right and add 1 to the exponent. This yields

\[
01.10000000000000000000000 \times 2^{\alpha(\exp)}
\]

\[
\times 01.10000000000000000000000 \times 2^{b(\exp)}
\]

\[
01.001000000000000000000000000000000000000000000000 \times 2^{(\alpha(\exp) + b(\exp) + 1)}
\]
Example 5–8. Floating-Point Multiply (Both Mantissas = 1.0)

Let

\[ \alpha = 1.0 \times 2^{\alpha(\exp)} = 0.00000000000000000000000 \times 2^{\alpha(\exp)} \]
\[ b = 1.0 \times 2^{b(\exp)} = 0.00000000000000000000000 \times 2^{b(\exp)} \]

where a and b are both represented in binary form according to the single-precision floating-point format. Then

\[
\begin{align*}
01.00000000000000000000000 & \times 2^{\alpha(\exp)} \\
\times 01.00000000000000000000000 & \times 2^{b(\exp)}
\end{align*}
\]

\[0001.0000000000000000000000000000000000000000000000 \times 2^{(\alpha(\exp) + b(\exp))} \]

This number is in the proper normalized format. Therefore, no shift of the mantissa or modification of the exponent is necessary.

The previous three examples show cases in which the product of two normalized numbers can be normalized with a shift of zero, one, or two. The floating-point format of the 'C4x makes this possible.

Example 5–9. Floating-Point Multiply Between Positive and Negative Numbers

Let

\[ \alpha = 1.0 \times 2^{\alpha(\exp)} = 0.00000000000000000000000 \times 2^{\alpha(\exp)} \]
\[ b = -2.0 \times 2^{b(\exp)} = 10.00000000000000000000000 \times 2^{b(\exp)} \]

Then

\[
\begin{align*}
01.00000000000000000000000 & \times 2^{\alpha(\exp)} \\
\times 10.00000000000000000000000 & \times 2^{b(\exp)}
\end{align*}
\]

\[1110.0000000000000000000000000000000000000000000000 \times 2^{(\alpha(\exp) + b(\exp))} \]

The result is

\[ c = -2.0 \times 2^{(\alpha(\exp) + b(\exp))} \]

Floating-Point Multiply by Zero

All multiplications by a floating-point zero yield a result of zero (\( f = 0, s = 0, \) and \( \exp = -128 \)).
5.6 Floating-Point Addition and Subtraction

In floating-point addition and subtraction, two floating-point numbers $\alpha$ and $b$ can be defined as

\[ \alpha = \alpha(\text{man}) \times 2^{\alpha(\text{exp})} \]
\[ b = b(\text{man}) \times 2^{b(\text{exp})} \]

The sum (or difference) of $\alpha$ and $b$ can be defined as

\[ c = \alpha \pm b \]
\[ = (\alpha(\text{man}) \pm (b(\text{man}) \times 2^{-(\alpha(\text{exp})-b(\text{exp}))}) \times 2^{\alpha(\text{exp})}, \]
if $\alpha(\text{exp}) \geq b(\text{exp})$
\[ = ((\alpha(\text{man}) \times 2^{-(b(\text{exp})-\alpha(\text{exp}))} \pm b(\text{man})) \times 2^{b(\text{exp})}, \]
if $\alpha(\text{exp}) < b(\text{exp})$

Figure 5–16 is the flowchart for floating-point addition. Because this flowchart assumes signed data, it is also appropriate for floating-point subtraction. In this figure, it is assumed that $\alpha(\text{exp}) \leq b(\text{exp})$. Steps are shown as numbers in parentheses in the figure.

- In step 1, the source exponents are compared, and $c(\text{exp})$ is set equal to the largest of the two source exponents.
- In step 2, $d$ is set to the difference of the two exponents.
- In step 3, the mantissa with the smallest exponent, in this case $\alpha(\text{man})$, is right-shifted $d$ bits in order to align the mantissas.
- In step 4, after the mantissas have been aligned, they are added.
- In steps 5 through 7 check for a special case of $c(\text{man})$. If $c(\text{man})$ is zero (step 5), then $c(\text{exp})$ is set to its most negative value (step 8) to yield the correct representation of zero. If $c(\text{man})$ has overflowed $c$ (step 6), then in step 9, $c(\text{man})$ is right-shifted one bit, and 1 is added to $c(\text{exp})$. In step 10, the result is normalized.
- In steps 11 and 12, special cases of $c(\text{exp})$ are tested. If $c(\text{exp})$ has overflowed, then $c$ is set to the most positive extended-precision value if it is positive; if it is negative, it is set to the most negative extended-precision value.
The following examples describe the floating-point addition and subtraction operations. It is assumed that the data is in the extended-precision floating-point format.
Example 5–10. Floating-Point Addition

Let
\[ \alpha = 1.5 = 01.1000000000000000000000000000000 \times 2^0 \]
\[ b = 0.5 = 01.0000000000000000000000000000000 \times 2^{-1} \]
It is necessary to shift b to the right by one so that \( \alpha \) and b have the same exponent. This yields
\[ b = 0.5 = 00.1000000000000000000000000000000 \times 2^0 \]
Then
\[
\begin{array}{c}
01.1000000000000000000000000000000 \times 2^0 \\
+ 00.1000000000000000000000000000000 \times 2^0 \\
\hline
01.0000000000000000000000000000000 \times 2^1 \\
\end{array}
\]

As in the case of multiplication, it is necessary to shift the binary point one place to the left and to add 1 to the exponent. This yields
\[
\begin{array}{c}
01.1000000000000000000000000000000 \times 2^0 \\
+ 00.1000000000000000000000000000000 \times 2^0 \\
\hline
01.0000000000000000000000000000000 \times 2^1 \\
\end{array}
\]

Example 5–11. Floating-Point Subtraction

Let
\[ \alpha = 01.0000000000000000000000000000001 \times 2^0 \]
\[ b = 01.0000000000000000000000000000000 \times 2^0 \]
The operation to be performed is \( \alpha - b \). The mantissas are already aligned because the two numbers have the same exponent. The result is a large cancellation of the upper bits, as shown below.
\[
\begin{array}{c}
01.0000000000000000000000000000001 \times 2^0 \\
- 01.0000000000000000000000000000000 \times 2^0 \\
\hline
00.0000000000000000000000000000000 \times 2^0 \\
\end{array}
\]
Floating-Point Addition and Subtraction

The result must be normalized. In this case, a left shift of 31 is required. The exponent of the result is modified accordingly. The result is

\[
0.00000000000000000000000000000001 \times 2^0 \\
- 0.00000000000000000000000000000000 \times 2^0
\]

\[
0.00000000000000000000000000000000 \times 2^{-31}
\]

Example 5–12. Floating-Point Addition With a 32-Bit Shift

This example illustrates a situation in which a full 32-bit shift is necessary to normalize the result. Let

\[
\alpha = 01.1111111111111111111111111111111 \times 2^{127} \\
b = 10.0000000000000000000000000000000 \times 2^{127}
\]

The operation to be performed is \( \alpha + b \).

\[
01.1111111111111111111111111111111 \\
+ 10.0000000000000000000000000000000 \\
\quad \times 2^{127}
\]

Normalizing the result requires a left shift of 32 and a subtraction of 32 from the exponent. The result is

\[
01.1111111111111111111111111111111 \\
+ 10.0000000000000000000000000000000 \\
\quad \times 2^{127}
\]

Example 5–13. Floating-Point Addition/Subtraction and Zero

When floating-point addition and subtraction are performed with a floating-point 0, the following identities are satisfied:

\[
\alpha \pm 0 = \alpha \text{ (assuming that } \alpha \neq 0) \\
0 \pm 0 = 0 \\
0 - \alpha = -\alpha \text{ (assuming that } \alpha \neq 0)
\]
5.7 Normalization (NORM Instruction)

The NORM instruction normalizes an extended-precision floating-point number that is assumed to be unnormalized. Since the number is assumed to be unnormalized, no implied most significant nonsign bit is assumed. The NORM instruction executes three steps:

1) Locates the most significant nonsign bit of the floating-point number
2) Left shifts to normalize the number
3) Adjusts the exponent

Given the extended-precision floating-point value \( \alpha \) to be normalized, the normalization is performed as shown in Figure 5–17.

Figure 5–17. Flowchart for NORM Instruction Operation
Example 5–14. *NORM Instruction*

Assume that an extended-precision register contains the value:

\[ \text{man} = 000000000000000000010000000001, \ exp = 0 \]

When the normalization is performed on a number assumed to be unnormalized, the binary point is assumed to be:

\[ \text{man} = 0.000000000000000000100000000001, \ exp = 0 \]

This number is then sign extended one bit so that the mantissa contains 33 bits:

\[ \text{man} = 00.000000000000000000100000000000001, \ exp = 0 \]

Here is the intermediate result after the most significant nonsign bit is located and the shift is performed:

\[ \text{man} = 01.0000000000010000000000000000000000, \ exp = -19 \]

The final 32-bit value output after removing the redundant bit is:

\[ \text{man} = 00000000000100000000000000000000000000, \ exp = -19 \]

The NORM instruction is useful for counting the number of leading zeros or leading ones in a 32-bit field. If the exponent is initially zero, the absolute value of the final value of the exponent is the number of leading ones or zeros. This instruction is also useful for manipulating unnormalized floating-point numbers.
5.8 Rounding (RND Instruction)

The RND instruction rounds a number from the extended-precision floating-point format to the single-precision floating-point format in a single cycle. Rounding (rnd) is similar to floating-point addition. Given the number $\alpha$ to be rounded, the following operation is performed first.

$$c = \alpha (\text{man}) \times 2^{\alpha (\text{exp})} + (1 \times 2^{\alpha (\text{exp}) - 24})$$

Next, a conversion from extended-precision floating-point to single-precision floating-point format is performed. Given the extended-precision floating-point value, rounding is performed as shown in Figure 5–18.

**Note:**

RND src, dst — where (src) = 0 — does not set the zero condition flag (bit 2 in the status register). Instead, it sets the underflow condition flag (bit 4 in the status register). When required, check for the underflow condition instead of the zero condition.
Figure 5–18. Flowchart for Floating-Point Rounding by the RND Instruction

Add $\alpha(\text{man})$ and 1/2 an LSB
\[ c(\text{man}) = \alpha(\text{man}) + 2^{-24} \]

Test for special cases of $c(\text{man})$
- $c(\text{man}) = 0$
- Overflow of $c(\text{man})$
- No special case

$c(\exp) = -128$
- $c(\text{man}) = c(\text{man}) < < 1$
- $c(\exp) = \alpha(\exp) + 1$

Test for special cases of $c(\exp)$
- $c(\exp)$ overflow
- $c(\exp)$ in range

If $c(\text{man}) > 0$, set $c$ to most positive single-precision value.
If $c(\text{man}) < 0$, set $c$ to most negative single-precision value.

Set 8 LSBs of $c(\text{man})$ to zero
\[ c = \text{rnd}(\alpha) \]
5.9 Floating-Point-to-Integer Conversion (FIX Instruction)

Using the FIX instruction, you can convert an extended-precision floating-point number to a single-precision integer in a single cycle. The floating-point to integer conversion of the value \( x \) is referred to here as \( \text{fix}(x) \). The conversion does not overflow if \( \alpha \), the number to be converted, is in the range:

\[-2^{31} \leq \alpha \leq 2^{31} - 1\]

First, you must be certain that:

\( \alpha(\text{exp}) \leq 30 \)

If these bounds are not met, an overflow occurs. If an overflow occurs in the positive direction, the output is the most positive integer. If an overflow occurs in the negative direction, the output is the most negative integer. If \( \alpha(\text{exp}) \) is within the valid range, then \( \alpha(\text{man}) \), with implied bit included, is sign-extended and right-shifted (rs) by the amount:

\[ rs = 31 - \alpha(\text{exp}) \]

This right shift (rs) shifts out those bits corresponding to the fractional part of the mantissa. For example:

- If \( 0 \leq x < 1 \), then \( \text{fix}(x) = 0 \).
- If \( -1 \leq x < 0 \), then \( \text{fix}(x) = -1 \).

The flowchart for the floating-point-to-integer conversion is shown in Figure 5–19.
Floating-Point to Integer Conversion (FIX Instruction)

Figure 5–19. Flowchart for Floating-Point-to-Integer Conversion by FIX Instruction

- Test for special cases of $\alpha(\text{exp})$
  - $\alpha(\text{exp}) > 30$
  - $\alpha(\text{exp})$ in range
    - $rs = 31 - \alpha(\text{exp})$

- Overflow
  - If $\alpha(\text{man}) > 0$,
    - $c =$ most positive integer
  - If $\alpha(\text{man}) < 0$,
    - $c =$ most negative integer

- Shift
  - $c = \alpha(\text{man}) >> rs$

- Set $c$ to final result

$c = \text{fix}(\alpha)$
5.10 Integer-to-Floating-Point Conversion (FLOAT Instruction)

Integer-to-floating-point conversion performed by the FLOAT instruction allows a single-precision integer to be converted to an extended-precision floating-point number in a single cycle. The flowchart for this conversion is shown in Figure 5–20.

Figure 5–20. Flowchart for Integer-to-Floating-Point Conversion by FLOAT Instructions
5.11 Reciprocal (RCPF Instruction)

The RCPF instruction generates a satisfactory estimate of the reciprocal of a floating-point number in a single cycle. The estimate has the correct exponent, and the mantissa is accurate to the eighth binary position (mantissa error is thus < $2^{-8}$) giving a 16-bit representation of the result (8-bit exponent plus 8-bit mantissa). Also, this estimate can be used as a seed for an algorithm to compute the reciprocal to even greater accuracy. (The Newton-Raphson algorithm, described in this section, is one such case.)

Figure 5–21 below depicts the algorithm used by instruction RCPF.

- The input is assumed to be $v = \text{vman} \times 2^{\text{vexp}}$.
- The output is assumed to be $x = \text{xman} \times 2^{\text{xexp}}$.
- $\text{vexp}$ is negated.
- If $\text{vexp} = -128$, the result is saturated to the most positive number, and the overflow flag is set. The N condition flag is set to the same sign as $\text{vsign}$.

Figure 5–21. RCPF Instruction Algorithm
The look-up table is read by forming a nine-bit address consisting of \( v\text{sign} \) and bits 22–15 of \( v\text{frac} \). The eight-bit output of the look-up table forms bits 22–15 of \( x\text{frac} \). Bits 14–0 of \( x\text{frac} \) are cleared to zero. \( x\text{sign} \) is set to \( v\text{sign} \).

The look-up table values are generated from simulation results.

### 5.11.1 Reciprocal Algorithm

The RCPF instruction provides the reciprocal of a number. The estimate has the correct exponent and a mantissa accurate to the eighth binary place (i.e., the error of the mantissa is \(< 2^{-8})\). The Newton-Raphson algorithm (shown below) can be used to further extend the mantissa’s precision:

\[
x[n+1] = x[n](2 - vx[n])
\]

where \( v \) = the number whose reciprocal is to be found.

\( x[0] \), the seed for the algorithm, is given by RCPF. For each iteration of the algorithm, the number of accurate bits in the mantissa doubles. Using RCPF, you can start with an estimate accurate to eight bits. With one iteration, accuracy is 16 bits in the mantissa, and with a second iteration, accuracy is 32 bits.

The ‘C4x program to implement this algorithm is shown in Example 5–15. Each step of the algorithm is labeled along with the corresponding accuracy achieved at the end of the step. The algorithm takes only seven machine cycles.

**Example 5–15. Newton-Raphson Algorithm for Computing the Reciprocal**

```assembly
RCPF R0,R1 ; R0 = v, R1 = x[0]
;
MPYF R1,R0,R2
SUBRF 2.0,R2
MPYF R2,R1 ; end of first iteration (16-bit accuracy)
;
MPYF R1,R0,R2
SUBRF 2.0,R2
MPYF R2,R1 ; end of second iteration (32-bit accuracy)
;
; R1 = 1/v
```
5.12 Reciprocal Square Root (RSQRF Instruction)

In many applications, normalization of data values is necessary. Often, the normalizing factor is the square root of another quantity. For example, when one vector is given, you can find the unit vector in the same direction by dividing the original vector by its own length. This involves division by a square root. The RSQRF instruction provides a simple way to directly determine this quantity instead of going through a two-step approach of finding the square root and then finding the reciprocal of the square root.

Given the result of this algorithm, the square root is found by a simple multiplication:

\[ v = vx[n] \]

where \( x[n] \) is the estimate of \( \sqrt{\frac{1}{v}} \) as determined by the Newton-Raphson algorithm or some other algorithm.

The RSQRF instruction generates an estimate reciprocal of the square root of a floating-point number in a single cycle. It parallels some of the operational characteristics of the RCPF instruction in these ways:

- RSQRF generates an estimate (in this case, the reciprocal of the square root of a floating-point number).
- The mantissa is accurate to the eighth binary place (mantissa error is < \( 2^{-8} \)).
- Often, this is a satisfactory estimate of the reciprocal of a number’s square root; in other cases, it may be used as a seed for an algorithm that computes the reciprocal square root to an even greater accuracy.

Figure 5–22 depicts the RSQRF algorithm. In the algorithm:

- The input is assumed to be \( v = vman \times 2^{vexp} \).
- The output is assumed to be \( x = xman \times 2^{xexp} \).
- \( vexp + 1 \) is negated and shifted right one bit with sign extension.
- If \( vexp = -128 \), the result is saturated to the most positive number, and the overflow flag is set.
Reciprocal Square Root (RSQRF Instruction)

Figure 5–22. RSQRF Instruction Algorithm

The look-up table is read by forming a nine-bit address consisting of the least significant bit of \( vexp \) and bits 22–15 of \( vfrac \). The eight-bit output of the look-up table forms bits 22–15 of \( xfrac \). Bits 14–0 of \( xfrac \) are cleared to zero. \( xsign \) is set to 0. There is no provision for negative values of \( v \).

The look-up table values are generated from simulation results.

Given the result of this algorithm, division is performed by a simple multiplication:

\[
y/v = yx[n]
\]

In the equation, \( x[n] \) is the estimate of \( 1/v \) as determined by the Newton-Raphson algorithm or another algorithm.

**Newton-Raphson Algorithm**

The RSQRF instruction provides the reciprocal of the square root of a number. The estimate has the correct exponent and a mantissa accurate to the eighth binary place (i.e., the error of the mantissa is \( <2^{-8} \)). The Newton-Raphson algorithm (shown below) can be used to further extend the mantissa’s precision:

\[
x[n+1] = x[n](1.5-(v/2)x[n])
\]

where \( v \) = the number whose reciprocal is to be found.
The seed for the algorithm, \( x[0] \), is given by RSQRF. For each iteration of the algorithm, the number of accurate bits in the mantissa doubles. Using RSQRF, you can start with an estimate accurate to eight bits. With one iteration, accuracy is 16 bits in the mantissa, and with a second iteration, accuracy is 32 bits.

The 'C4x program to implement this algorithm is shown in Example 5–16. Each step of the algorithm is labeled, and the corresponding accuracy achieved is noted at the end of the step. The algorithm takes only ten machine cycles (compared to 30 cycles on the 'C3x without a look-up table).


\[
\begin{align*}
\text{RSQRF} & \ R0, R1 \ ; \ R0 = v, \ R1 = x[0] \\
\text{MPYF} & \ 0.5, R0 ; \ R0 = v/2 \\
\text{MPYF} & \ R1, R1, R2 \\
\text{MPYF} & \ R0, R2 \\
\text{SUBRF} & \ 1.5, R2 \\
\text{MPYF} & \ R2, R1 ; \text{end of first iteration (16-bit accuracy)} \\
\text{MPYF} & \ R1, R1, R2 \\
\text{MPYF} & \ R0, R2 \\
\text{SUBRF} & \ 1.5, R2 \\
\text{MPYF} & \ R2, R1 ; \text{end of second iteration (32-bit accuracy)} \\
\text{} & \ R1 = 1/(v^{0.5})
\end{align*}
\]
The 'C4x supports five types of addressing to access data from memory, registers, and the instruction word. This chapter details the operation, encoding, and implementation of the addressing modes.

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>6.1 Addressing Types</td>
<td>6-2</td>
</tr>
<tr>
<td>6.2 Register Addressing</td>
<td>6-3</td>
</tr>
<tr>
<td>6.3 Direct Addressing</td>
<td>6-5</td>
</tr>
<tr>
<td>6.4 Indirect Addressing</td>
<td>6-6</td>
</tr>
<tr>
<td>6.5 Immediate Addressing</td>
<td>6-18</td>
</tr>
<tr>
<td>6.6 PC-Relative Addressing</td>
<td>6-19</td>
</tr>
<tr>
<td>6.7 Encoding of Addressing Modes</td>
<td>6-21</td>
</tr>
<tr>
<td>6.8 Circular Addressing</td>
<td>6-27</td>
</tr>
<tr>
<td>6.9 Bit-Reversed Addressing</td>
<td>6-32</td>
</tr>
</tbody>
</table>
6.1 Addressing Types

You can access data from memory, registers, and the instruction word by using five types of addressing:

- Register addressing
- Direct addressing
- Indirect addressing
- Immediate addressing
- PC-relative addressing

Not all addressing types are appropriate for all instructions. Addressing types are classified into four groups, depending upon the encoding method used:

- General addressing modes (G)
- Three-operand addressing modes (T)
- Parallel addressing modes (P)
- Conditional-branch addressing modes (B)

For use in filters and FFTs, there are two specialized modes:

- Circular addressing
- Bit-reversed addressing
6.2 Register Addressing

In register addressing, a CPU register contains the operand, as shown in this example:

\[
\text{ABSF } R1 ; \ R1 = |R1| \]

The machine address for the CPU registers, the assembler syntax (register name), and the assigned function for those registers are listed in Table 6–1.

Table 6–1. CPU Register/Assembler Syntax and Function

<table>
<thead>
<tr>
<th>Register Name</th>
<th>Machine Address</th>
<th>Assigned Function</th>
</tr>
</thead>
<tbody>
<tr>
<td>R0</td>
<td>00h</td>
<td>Extended-precision register 0</td>
</tr>
<tr>
<td>R1</td>
<td>01h</td>
<td>Extended-precision register 1</td>
</tr>
<tr>
<td>R2</td>
<td>02h</td>
<td>Extended-precision register 2</td>
</tr>
<tr>
<td>R3</td>
<td>03h</td>
<td>Extended-precision register 3</td>
</tr>
<tr>
<td>R4</td>
<td>04h</td>
<td>Extended-precision register 4</td>
</tr>
<tr>
<td>R5</td>
<td>05h</td>
<td>Extended-precision register 5</td>
</tr>
<tr>
<td>R6</td>
<td>06h</td>
<td>Extended-precision register 6</td>
</tr>
<tr>
<td>R7</td>
<td>07h</td>
<td>Extended-precision register 7</td>
</tr>
<tr>
<td>R8</td>
<td>1Ch</td>
<td>Extended-precision register 8</td>
</tr>
<tr>
<td>R9</td>
<td>1Dh</td>
<td>Extended-precision register 9</td>
</tr>
<tr>
<td>R10</td>
<td>1Eh</td>
<td>Extended-precision register 10</td>
</tr>
<tr>
<td>R11</td>
<td>1Fh</td>
<td>Extended-precision register 11</td>
</tr>
<tr>
<td>A0</td>
<td>08h</td>
<td>Auxiliary register 0</td>
</tr>
<tr>
<td>A1</td>
<td>09h</td>
<td>Auxiliary register 1</td>
</tr>
<tr>
<td>A2</td>
<td>0Ah</td>
<td>Auxiliary register 2</td>
</tr>
<tr>
<td>A3</td>
<td>0Bh</td>
<td>Auxiliary register 3</td>
</tr>
<tr>
<td>A4</td>
<td>0Ch</td>
<td>Auxiliary register 4</td>
</tr>
<tr>
<td>A5</td>
<td>0Dh</td>
<td>Auxiliary register 5</td>
</tr>
<tr>
<td>A6</td>
<td>0Eh</td>
<td>Auxiliary register 6</td>
</tr>
</tbody>
</table>
Table 6–1. CPU Register/Assembler Syntax and Function (Continued)

<table>
<thead>
<tr>
<th>Register Name</th>
<th>Machine Address</th>
<th>Assigned Function</th>
</tr>
</thead>
<tbody>
<tr>
<td>A7</td>
<td>0Fh</td>
<td>Auxiliary register 7</td>
</tr>
<tr>
<td>DP</td>
<td>10h</td>
<td>Data-page pointer</td>
</tr>
<tr>
<td>IR0</td>
<td>11h</td>
<td>Index register 0</td>
</tr>
<tr>
<td>IR1</td>
<td>12h</td>
<td>Index register 1</td>
</tr>
<tr>
<td>BK</td>
<td>13h</td>
<td>Block-size register</td>
</tr>
<tr>
<td>SP</td>
<td>14h</td>
<td>Active stack pointer</td>
</tr>
<tr>
<td>ST</td>
<td>15h</td>
<td>Status register</td>
</tr>
<tr>
<td>DIE</td>
<td>16h</td>
<td>DMA coprocessor interrupt enable</td>
</tr>
<tr>
<td>IIE</td>
<td>17h</td>
<td>Internal interrupt enable register</td>
</tr>
<tr>
<td>IIF</td>
<td>18h</td>
<td>IIOF pins and interrupt flag register</td>
</tr>
<tr>
<td>RS</td>
<td>19h</td>
<td>Repeat start address register</td>
</tr>
<tr>
<td>RE</td>
<td>1Ah</td>
<td>Repeat end address register</td>
</tr>
<tr>
<td>RC</td>
<td>1Bh</td>
<td>Repeat counter register</td>
</tr>
</tbody>
</table>

(b) CPU Expansion Registers

<table>
<thead>
<tr>
<th>Register Name</th>
<th>Machine Address</th>
<th>Assigned Function</th>
</tr>
</thead>
<tbody>
<tr>
<td>IVTP</td>
<td>00h</td>
<td>Interrupt-vector table pointer</td>
</tr>
<tr>
<td>TVTP</td>
<td>01h</td>
<td>Trap-vector table pointer</td>
</tr>
</tbody>
</table>
6.3 Direct Addressing

In direct addressing, the data address is formed by the concatenation of the 16 least significant bits of the data page pointer (DP) with the 16 least significant bits of the instruction word (expr). The use of 16 bits for the DP results in 65536 pages (64K words per page), allowing you to access a large address space without changing the value of the DP. The syntax and operation for direct addressing are listed below.

**Syntax:** \[@expr\]

**Operation:** address = DP concatenated with expr

Figure 6–1 shows the formation of the data address. Example 6–1 gives an instruction example with data before and after instruction execution.

**Example 6–1. Direct Addressing**

```
ADDI @0BCDEh, R7
```

<table>
<thead>
<tr>
<th>Before Instruction:</th>
<th>After Instruction:</th>
</tr>
</thead>
<tbody>
<tr>
<td>DP = 108Ah</td>
<td>DP = 108Ah</td>
</tr>
<tr>
<td>R7 = 11h</td>
<td>R7 = 1234 5689h</td>
</tr>
<tr>
<td>Data at 108A BCDEh = 1234 5678h</td>
<td>Data at 108A BCDEh = 1234 5678h</td>
</tr>
</tbody>
</table>
6.4 Indirect Addressing

Indirect addressing specifies the address of an operand in memory through the contents of an auxiliary register, optional displacements, and index registers. The auxiliary register arithmetic units (ARAU's) perform this unsigned arithmetic. (All 32 bits of the auxiliary and index registers are used in indirect addressing.)

The flexibility of indirect addressing is possible because the ARAUs on the 'C4x modify auxiliary registers in parallel with operations within the main CPU. Indirect addressing is specified by a five-bit field in the instruction word, referred to as the mod field (shown on the left side of Table 6–2 on as well as in the examples that follow). A displacement is either an explicit unsigned 5-bit or 8-bit integer contained in the instruction word or an implicit displacement of one. Two index registers, IR0 and IR1, can also be used in indirect addressing, enabling the use of 32-bit indirect displacements (IR0 and IR1 are treated as signed integers). In some cases, an addressing scheme using circular or bit-reversed addressing is optional. Generating addresses for circular addressing is discussed in Section 6.8, and for bit-reversed addressing in Section 6.9.

Table 6–2 lists the various kinds of indirect addressing, along with the value of the modification (mod) field, assembler syntax, operation, and function for each. Figure 6–2 shows the format of the indirect addressing operand in the instruction encoding. The disp field does not exist for some instructions.

Figure 6–2. Indirect Addressing Operand Encoding

<table>
<thead>
<tr>
<th>MSB</th>
<th>mod</th>
<th>ARn</th>
<th>disp</th>
</tr>
</thead>
<tbody>
<tr>
<td>5 bits</td>
<td>3 bits</td>
<td>0, 5, or 8 bits</td>
<td></td>
</tr>
</tbody>
</table>

Note:
The auxiliary register (ARn) to be used is encoded in the instruction word according to its binary representation, n (i.e., AR3 is encoded as 112), not its register machine address (as shown in Table 6–1).
Table 6–2. Indirect Addressing
(a) Indirect Addressing With Displacement

<table>
<thead>
<tr>
<th>Mod Field</th>
<th>Syntax</th>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>00000</td>
<td>*+ARn(disp)</td>
<td>addr = ARn + disp</td>
<td>With predisplacement add</td>
</tr>
<tr>
<td>00001</td>
<td>*–ARn(disp)</td>
<td>addr = ARn – disp</td>
<td>With predisplacement subtract</td>
</tr>
<tr>
<td>00010</td>
<td>*++ARn(disp)</td>
<td>addr = ARn + disp, ARn = ARn + disp</td>
<td>With predisplacement add and modify</td>
</tr>
<tr>
<td>00011</td>
<td>*––ARn(disp)</td>
<td>addr = ARn – disp, ARn = ARn – disp</td>
<td>With predisplacement subtract and modify</td>
</tr>
<tr>
<td>00100</td>
<td>*ARn++(disp)</td>
<td>addr = ARn, ARn = ARn + disp</td>
<td>With postdisplacement add and modify</td>
</tr>
<tr>
<td>00101</td>
<td>*ARn––(disp)</td>
<td>addr = ARn, ARn = ARn – disp</td>
<td>With postdisplacement subtract and modify</td>
</tr>
<tr>
<td>00110</td>
<td>*ARn++(disp)%</td>
<td>addr = ARn, ARn = circ(ARn + disp)</td>
<td>With postdisplacement add and circular modify</td>
</tr>
<tr>
<td>00111</td>
<td>*ARn––(disp)%</td>
<td>addr = ARn, ARn = circ(ARn – disp)</td>
<td>With postdisplacement subtract and circular modify</td>
</tr>
</tbody>
</table>

(b) Indirect Addressing With Index Register IRO

<table>
<thead>
<tr>
<th>Mod Field</th>
<th>Syntax</th>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>01000</td>
<td>*+ARn(IR0)</td>
<td>addr = ARn + IR0</td>
<td>With preindex (IR0) add</td>
</tr>
<tr>
<td>01001</td>
<td>*–ARn(IR0)</td>
<td>addr = ARn – IR0</td>
<td>With preindex (IR0) subtract</td>
</tr>
<tr>
<td>01010</td>
<td>*++ARn(IR0)</td>
<td>addr = ARn + IR0, ARn = ARn + IR0</td>
<td>With preindex (IR0) add and modify</td>
</tr>
<tr>
<td>01011</td>
<td>*––ARn(IR0)</td>
<td>addr = ARn – IR0, ARn = ARn – IR0</td>
<td>With preindex (IR0) subtract and modify</td>
</tr>
<tr>
<td>01100</td>
<td>*ARn++(IR0)</td>
<td>addr = ARn, ARn = ARn + IR0</td>
<td>With postindex (IR0) add and modify</td>
</tr>
<tr>
<td>01101</td>
<td>*ARn––(IR0)</td>
<td>addr = ARn, ARn = ARn – IR0</td>
<td>With postindex (IR0) subtract and modify</td>
</tr>
<tr>
<td>01110</td>
<td>*ARn++(IR0)%</td>
<td>addr = ARn, ARn = circ(ARn + IR0)</td>
<td>With postindex (IR0) add and circular modify</td>
</tr>
<tr>
<td>01111</td>
<td>*ARn––(IR0)%</td>
<td>addr = ARn, ARn = circ(ARn – IR0)</td>
<td>With postindex (IR0) subtract and circular modify</td>
</tr>
</tbody>
</table>

LEGEND:
- addr = memory address
- ARn = auxiliary register AR0 – AR7
- IRn = index register IR0 or IR1
- disp = displacement
- ++ = add and modify
- --- = subtract and modify
- circ() = address in circular addressing
- % = where circular addressing is performed
- B = where bit-reversed addressing is performed
### Indirect Addressing

#### Table 6–2. Indirect Addressing (Continued)

**c) Indirect Addressing With Index Register IR1**

<table>
<thead>
<tr>
<th>Mod Field</th>
<th>Syntax</th>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>10000</td>
<td>*+ ARn(IR1)</td>
<td>addr = ARn + IR1</td>
<td>With preindex (IR1) add</td>
</tr>
<tr>
<td>10001</td>
<td>*− ARn(IR1)</td>
<td>addr = ARn − IR1</td>
<td>With preindex (IR1) subtract</td>
</tr>
<tr>
<td>10010</td>
<td>*++ ARn(IR1)</td>
<td>addr = ARn + IR1</td>
<td>With preindex (IR1) add and modify</td>
</tr>
<tr>
<td></td>
<td>ARn = ARn + IR1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10011</td>
<td>*−− ARn(IR1)</td>
<td>addr = ARn − IR1</td>
<td>With preindex (IR1) subtract and modify</td>
</tr>
<tr>
<td></td>
<td>ARn = ARn − IR1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10100</td>
<td>*ARn++(IR1)</td>
<td>addr = ARn</td>
<td>With postindex (IR1) add and modify</td>
</tr>
<tr>
<td></td>
<td>ARn = ARn + IR1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10101</td>
<td>*ARn−−(IR1)</td>
<td>addr = ARn</td>
<td>With postindex (IR1) subtract and modify</td>
</tr>
<tr>
<td></td>
<td>ARn = ARn − IR1</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10110</td>
<td>*ARn++(IR1)%</td>
<td>addr = ARn</td>
<td>With postindex (IR1) add and circular modify</td>
</tr>
<tr>
<td></td>
<td>ARn = circ(ARn + IR1)</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10111</td>
<td>*ARn−−(IR1)%</td>
<td>addr = ARn</td>
<td>With postindex (IR1) subtract and circular</td>
</tr>
<tr>
<td></td>
<td>ARn = circ(ARn − IR1)</td>
<td></td>
<td>modify</td>
</tr>
</tbody>
</table>

**d) Indirect Addressing (Special Cases)**

<table>
<thead>
<tr>
<th>Mod Field</th>
<th>Syntax</th>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>11000</td>
<td>*ARn</td>
<td>addr = ARn</td>
<td>Indirect</td>
</tr>
<tr>
<td>11001</td>
<td>*ARn++(IR0)B</td>
<td>addr = ARn</td>
<td>With postindex (IR0) add and bit-reversed modify</td>
</tr>
<tr>
<td></td>
<td>ARn = B(ARn + IR0)</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

**LEGEND:**
- addr = memory address
- ARn = auxiliary register AR0 – AR7
- IRn = index register IR0 or IR1
- disp = displacement
- ++ = add and modify
- − = subtract and modify
- circ() = address in circular addressing
- % = where circular addressing is performed
- B = where bit-reversed addressing is performed
Example 6–2 through Example 6–19 show the operation for each type of indirect addressing.

Example 6–2. Auxiliary Register Indirect

An auxiliary register (ARn) contains the address of the operand to be fetched.  
**Operation:** operand address = ARn  
**Assembler Syntax:** *ARn  
**Modification Field:** 11000

Example 6–3. Indirect With Predisplacement Add

The address of the operand to be fetched is the sum of an auxiliary register (ARn) and the displacement (disp). The displacement is either a 5-bit or 8-bit unsigned integer contained in the instruction word or an implied value of 1.  
**Operation:** operand address = ARn + disp  
**Assembler Syntax:** *ARn+ disp  
**Modification Field:** 00000
Example 6–4. Indirect With Predisplacement Subtract

The address of the operand to be fetched is the contents of an auxiliary register (ARn) minus the displacement (disp). The displacement is either an 8-bit unsigned integer contained in the instruction word or an implied value of 1.

Operation: \( \text{operand address} = \text{ARn} - \text{disp} \)

Assembler Syntax: \(^*\text{ARn}(\text{disp})\)

Modification Field: 

```
00001
```

Example 6–5. Indirect With Predisplacement Add and Modify

The address of the operand to be fetched is the sum of an auxiliary register (ARn) and the displacement (disp). The displacement is either an 8-bit unsigned integer contained in the instruction word or an implied value of 1. After the data is fetched, the auxiliary register is updated with the generated address.

Operation: \( \text{operand address} = \text{ARn} + \text{disp} \)
\( \text{ARn} = \text{ARn} + \text{disp} \)

Assembler Syntax: \(^*++\text{ARn}(\text{disp})\)

Modification Field: 

```
00010
```
Example 6–6. *Indirect With Predisplacement Subtract and Modify*

The address of the operand to be fetched is the contents of an auxiliary register (ARn) minus the displacement (disp). The displacement is either an 8-bit unsigned integer contained in the instruction word or an implied value of 1. After the data is fetched, the auxiliary register is updated with the generated address.

**Operation:**

\[
\text{operand address} = \text{ARn} - \text{disp} \\
\text{ARn} = \text{AR } n - \text{disp}
\]

**Assembler Syntax:**

\[
* - \text{ARn} (\text{disp})
\]

**Modification Field:**

\[
\text{00011}
\]

---

Example 6–7. *Indirect With Postdisplacement Add and Modify*

The address of the operand to be fetched is the contents of an auxiliary register (ARn). After the operand is fetched, the displacement (disp) is added to the auxiliary register. The displacement is either an 8-bit unsigned integer contained in the instruction word or an implied value of 1.

**Operation:**

\[
\text{operand address} = \text{ARn} \\
\text{ARn} = \text{ARn + disp}
\]

**Assembler Syntax:**

\[
* \text{ARn} + \text{disp}
\]

**Modification Field:**

\[
\text{00100}
\]
Example 6–8. _Indirect With Postdisplacement Subtract and Modify_

The address of the operand to be fetched is the contents of an auxiliary register (AR\(_n\)). After the operand is fetched, the displacement (\(disp\)) is subtracted from the auxiliary register. The displacement is either an 8-bit unsigned integer contained in the instruction word or an implied value of 1.

**Operation:**

\[
\begin{align*}
\text{operand address} &= \text{AR}_n \\
\text{AR}_n &= \text{AR}_n - \text{disp}
\end{align*}
\]

**Assembler Syntax:**

`*ARn -- disp`

**Modification Field:**

\[00101\]

---

Example 6–9. _Indirect With Postdisplacement Add and Circular Modify_

The address of the operand to be fetched is the contents of an auxiliary register (AR\(_n\)). After the operand is fetched, the displacement (\(disp\)) is added to the contents of the auxiliary register through circular addressing. This result is used to update the auxiliary register. The displacement is either an 8-bit unsigned integer contained in the instruction word or an implied value of 1.

**Operation:**

\[
\begin{align*}
\text{operand address} &= \text{AR}_n \\
\text{AR}_n &= \text{circ} (\text{AR}_n + \text{disp})
\end{align*}
\]

**Assembler Syntax:**

`*ARn ++ (disp)%`

**Modification Field:**

\[00110\]
Example 6–10. Indirect With Postdisplacement Subtract and Circular Modify

The address of the operand to be fetched is the contents of an auxiliary register (ARn). After the operand is fetched, the displacement (disp) is subtracted from the contents of the auxiliary register through circular addressing. This result is used to update the auxiliary register. The displacement is either an 8-bit unsigned integer contained in the instruction word or an implied value of 1.

**Operation:**

\[
\text{operand address} = \text{ARn} \\
\text{ARn} = \text{circ(ARn} - \text{disp)}
\]

**Assembler Syntax:**

\*ARn ––(disp)%

**Modification Field:**

00111

Example 6–11. Indirect With Preindex Add

The address of the operand to be fetched is the sum of an auxiliary register (ARn) and an index register (IR0 or IR1).

**Operation:**

\[
\text{operand address} = \text{ARn} + \text{IR}_m
\]

**Assembler Syntax:**

\*ARn(IRm)

**Modification Field:**

01000 if \( m=0 \)  
10000 if \( m=1 \)
**Example 6–12. Indirect With Preindex Subtract**

The address of the operand to be fetched is the difference between an auxiliary register (ARn) and an index register (IR0 or IR1).

**Operation:** \[ \text{operand address} = \text{ARn} - \text{IRm} \]

**Assembler Syntax:** \*(-ARn(IRm))

**Modification Field:**
- 01001 if \( m = 0 \)
- 10001 if \( m = 1 \)

---

**Example 6–13. Indirect With Preindex Add and Modify**

The address of the operand to be fetched is the sum of an auxiliary register (ARn) and an index register (IR0 or IR1). After the data is fetched, the auxiliary register is updated with the generated address.

**Operation:**
- \[ \text{operand address} = \text{ARn} + \text{IRm} \]
- \[ \text{ARn} = \text{ARn} + \text{IRm} \]

**Assembler syntax:** \(*++\text{ARn}(\text{IRm})\)

**Modification Field:**
- 01010 if \( m = 0 \)
- 10010 if \( m = 1 \)
Example 6–14. Indirect With Preindex Subtract and Modify

The address of the operand to be fetched is the difference between an auxiliary register (ARn) and an index register (IR0 or IR1). The resulting address becomes the new contents of the auxiliary register.

**Operation:**
- operand address = ARn – IRm
- ARn = ARn – IRm

**Assembler Syntax:**
*– – ARn(IRm)

**Modification Field:**
- 01011 if m = 0
- 10011 if m = 1

Example 6–15. Indirect With Postindex Add and Modify

The address of the operand to be fetched is the contents of an auxiliary register (ARn). After the operand is fetched, an index register (IR0 or IR1) is added to the auxiliary register.

**Operation:**
- operand address = ARn
- ARn = ARn + IRm

**Assembler Syntax:**
*ARn++ (IRm)

**Modification Field:**
- 01100 if m = 0
- 10100 if m = 1
Example 6–16. Indirect With Postindex Subtract and Modify

The address of the operand to be fetched is the contents of an auxiliary register (ARn). After the operand is fetched, the index register (IR0 or IR1) is subtracted from the auxiliary register.

**Operation:**  
operand address = ARn  
ARn = ARn – IRm

**Assembler Syntax:**  
*ARn –– (IRm)

**Modification Field:**  
01101 if m = 0  
10101 if m = 1

Example 6–17. Indirect With Postindex Add and Circular Modify

The address of the operand to be fetched is the contents of an auxiliary register (ARn). After the operand is fetched, the index register (IR0 or IR1) is added to the auxiliary register. This value is evaluated through circular addressing and replaces the contents of the auxiliary register.

**Operation:**  
operand address = ARn  
ARn = circ(ARn + IRm)

**Assembler Syntax:**  
*ARn ++ (IRm)%

**Modification Field:**  
01110 if m = 0  
10110 if m = 1
Example 6–18. Indirect With Postindex Subtract and Circular Modify

The address of the operand to be fetched is the contents of an auxiliary register (ARn). After the operand is fetched, the index register (IR0 or IR1) is subtracted from the auxiliary register. The result is evaluated through circular addressing and replaces the contents of the auxiliary register.

**Operation:**

- operand address = ARn
- ARn = circ(ARn – IRm)

**Assembler Syntax:**
- *ARn – –(IRm)%

**Modification Field:**
- 01111 if m = 0
- 10111 if m = 1

---

Example 6–19. Indirect With Postindex Add and Bit-Reversed Modify

The address of the operand to be fetched is the contents of an auxiliary register (ARn). After the operand is fetched, the index register (IR0) is added to the auxiliary register. This addition is performed with a reverse-carry propagation and can be used to yield a bit-reversed (B) address. This value replaces the contents of the auxiliary register.

**Operation:**

- operand address = ARn
- ARn = B(ARn+IR0)

**Assembler Syntax:**
- *ARn ++(IR0)B

**Modification Field:**
- 11001
6.5 Immediate Addressing

In immediate addressing, the operand is an 8- or 16-bit immediate value contained in the 8 or 16 least significant bits of the instruction word (expr). Depending on the data types assumed for the instruction, the immediate operand may be a two’s-complement integer, an unsigned integer, a signed integer, or a floating-point number. The syntax for this mode is as follows:

Syntax: expr

Example 6–20 gives an instruction example with data from before and after the instruction is executed. Notice that AND and AND3 produce different results.

Example 6–20. Immediate Addressing

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Before</th>
<th>After</th>
</tr>
</thead>
<tbody>
<tr>
<td>SUBI 1,R0</td>
<td>R0=00</td>
<td>FFFF FFFFh</td>
</tr>
<tr>
<td>LDI 0FFFFh,R0</td>
<td>R0=00</td>
<td>FFFF FFFFh</td>
</tr>
<tr>
<td>LDF 5.0,R0</td>
<td>R0=00</td>
<td>2000 0000h</td>
</tr>
<tr>
<td>OR 0FFFFh,R0</td>
<td>R0=00</td>
<td>0000 FFFFh</td>
</tr>
<tr>
<td>AND3 80h,R0,R0</td>
<td>R0=00</td>
<td>FFFF FFFFh</td>
</tr>
<tr>
<td>AND 80h,R0</td>
<td>R0=00</td>
<td>0000 0000h</td>
</tr>
</tbody>
</table>
6.6  PC-Relative Addressing

PC-relative addressing is used for branching. It adds the contents of the 16 or
24 least significant bits of the instruction word to the PC register. The assem-
bler takes the src (a label or address) specified by the user and generates a
displacement. If the branch is a standard branch, this displacement is equal
to \([label - (instruction\ address +1)]\). If the branch is a delayed branch, this dis-
placement is equal to \([label - (instruction\ address+3)]\).

The displacement is stored as a 16-bit or 24-bit signed integer in the least sig-
nificant bits of the instruction word. The displacement is added to the PC during
the pipeline decode phase. Notice that because the PC is incremented by one
in the fetch phase, the displacement is added to this incremented PC value.

**Syntax:**    expr (label or address)

Example 6–21 gives an instruction example with before- and after-instruction
data.

**Example 6–21. PC-Relative Addressing**

| BU       | NEWPC | ; address of BU instruction=1,        |
|          |      | ; NEWPC label =5, displacement = 3    |
| ...      |      | ; displacement = 5 - (1 + 1)          |
| NEWPC    | ...  | ; displacement = 5 - (1 + 1)          |

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>Decode Phase</td>
<td>Execution Phase:</td>
</tr>
<tr>
<td>PC = 2h</td>
<td>PC = 5h</td>
</tr>
</tbody>
</table>

The 24-bit addressing mode is used to encode the program control instruc-
tions (e.g., BR, BRD, CALL, RPTB, RPTBD, LAJ). Depending on the instruc-
tion, the new PC value is derived by adding a 24-bit signed value in the instruc-
tion word with the present PC value. Bit 24 determines the type of branch (D
= 0 for a standard branch or D = 1 for a delayed branch). Some of these instruc-
tions are encoded in Figure 6–3.
Figure 6–3. Encoding for 24-Bit PC-Relative Addressing Mode

(a) BR, BRD: unconditional branches (delayed and not delayed)

<table>
<thead>
<tr>
<th>31</th>
<th>25</th>
<th>23</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>

(b) CALL: unconditional subroutine call

<table>
<thead>
<tr>
<th>31</th>
<th>23</th>
<th>23</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>

(c) RPTB, RPTBD: repeat block (not delayed and delayed)

<table>
<thead>
<tr>
<th>31</th>
<th>23</th>
<th>23</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
</tbody>
</table>

(d) LAJ: link and jump (return address in extended-precision register R11)

<table>
<thead>
<tr>
<th>31</th>
<th>23</th>
<th>23</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>
6.7 Encoding of Addressing Modes

The five addressing types form four groups of addressing modes:

- General addressing modes (G) (subsection 6.7.1)
- Three-operand addressing modes (T) (subsection 6.7.2)
- Parallel addressing modes (P) (subsection 6.7.3)
- Conditional-branch addressing modes (B) (subsection 6.7.4)

6.7.1 General Addressing Modes

Instructions that use the general addressing modes are general-purpose instructions, such as ADDI, MPYF, and LSH. Such instructions usually have the following syntax:

```
dst operation src → dst
```

In the syntax, the destination operand is signified by dst and the source operand by src; operation defines an operation to be performed with the general addressing modes to specify certain operands. Bits 31–29 are zero, indicating general addressing mode instructions. Bits 22 and 21 specify the general addressing mode (G) field, which defines how bits 15 through 0 are to be interpreted for addressing the src operand.

Options for bits 22 and 21 (G field) are as follows:

<table>
<thead>
<tr>
<th>G</th>
<th>Mode</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (all CPU registers unless specified otherwise)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

If the src and dst fields contain register specifications, the value in these fields contains the CPU register addresses as defined by Table 6–1. For the general addressing modes, the following values of ARn are valid for indirect addressing:

```
ARn, 0 ≤ n ≤ 7
```

Figure 6–4 shows the encoding for the general addressing modes. The notation mod n indicates the modification field that goes with the ARn field. Refer to Table 6–2 for further information.
Figure 6–4. Encoding for General Addressing Modes

<table>
<thead>
<tr>
<th>G</th>
<th>Destination</th>
<th>Source Operands</th>
</tr>
</thead>
<tbody>
<tr>
<td>31 29 28 23 22 21 20 16 15 11 10 8 7 5 4 0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0 0 0</td>
<td>operation</td>
<td>0</td>
</tr>
<tr>
<td>0 0 0</td>
<td>operation</td>
<td>0</td>
</tr>
<tr>
<td>0 0 0</td>
<td>operation</td>
<td>1</td>
</tr>
<tr>
<td>0 0 0</td>
<td>operation</td>
<td>1</td>
</tr>
</tbody>
</table>

6.7.2 Three-Operand Addressing Modes

The 19 three-operand instructions on the 'C4x use the eight addressing modes listed in Table 6–3:

Table 6–3. Three-Operand Instruction Addressing Modes

**Type 1†**

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
<th>dst ‡</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>Register mode (any CPU register)</td>
<td>Register mode (any CPU register)</td>
<td>Rx</td>
</tr>
<tr>
<td>01</td>
<td>Indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>Register mode (any CPU register)</td>
<td>Rx</td>
</tr>
<tr>
<td>10</td>
<td>Register mode (any CPU register)</td>
<td>Indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>Rx</td>
</tr>
<tr>
<td>11</td>
<td>Indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>Indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>Rx</td>
</tr>
</tbody>
</table>

† The 'C4x recognizes either type 1 or type 2 modes; the 'C3x recognizes only type 1.
‡ Rx = any register in the CPU (primary) register file for the respective processor.

**Type 2†**

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
<th>dst ‡</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>Register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
<td>Rx</td>
</tr>
<tr>
<td>01</td>
<td>Register mode (any CPU register)</td>
<td>Indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>Rx</td>
</tr>
<tr>
<td>10</td>
<td>Indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
<td>Rx</td>
</tr>
<tr>
<td>11</td>
<td>Indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>Indirect mode *+ARn2(5-bit unsigned displacement)</td>
<td>Rx</td>
</tr>
</tbody>
</table>

† The 'C4x recognizes either type 1 or type 2 modes; the 'C3x recognizes only type 1.
‡ Rx = any register in the CPU (primary) register file for the respective processor.
The object values differ for three-operand instructions, depending on the assembler used:

- The 'C3x assembler recognizes only type 1 modes and sets bits 31–28 to 0010₂.
- The 'C4x assembler recognizes both types and sets bits 31–28 to 0010₂ for type 1 and to 0011₂ for type 2.

The three-operand instructions MPYSHI3 and MPYUHI3 are unique to the 'C4x.

All instructions except four can use all of the type 2 address modes shown in Table 6–3. The exceptions, which can use only the second and fourth address modes in type 2, are the floating-point instructions ADDF3, CMPF3, MPYF3, and SUBF3.

The remaining 15 three-operand instructions are ADDC3, ADDI3, AND3, ANDN3, ASH3, CMPI3, LSH3, MPYI3, MPYSHI3, MPYUHI3, OR3, SUBB3, SUBI3, TSTB3, and XOR3.

**Note:**
The suffix 3 can be omitted from a three-operand instruction mnemonic.

Bits 22 and 21 specify the three-operand addressing mode (T) field, which defines how to interpret bits 15–0 for addressing the src operands. Bits 15–8 define the src1 address, and bits 7–0 define the src2 address.

Figure 6–5 and Figure 6–6 show the encoding for 'C4x three-operand addressing (the 'C3x recognizes only the format in Figure 6–5). The notation modm or modn indicates the modification field that goes with the ARm or ARn (auxiliary register) field, respectively. Refer to Table 6–2 for further information.

The 8-bit signed immediate value supports left shifts, right shifts, and memory increment and decrement operations. The immediate value is not available for floating-point operations.

These instructions greatly help reduce code size, both assembled and compiled. They also improve performance notably in DSP and other computationally intensive applications and general-purpose code.
6.7.3 Parallel Addressing Modes

Instructions that use parallel addressing, indicated by || (two vertical bars), allow for the greatest amount of parallelism possible. The destination operands are indicated as d1 and d2, signifying dst1 and dst2, respectively (see Figure 6–4). The source operands, signified by src1 and src2, use the extended-precision registers. The parallel operation to be performed is called operation.

Figure 6–7. Encoding for Parallel Multiply With ADD/SUB

The parallel addressing mode (P) field specifies how to use the operands, i.e., whether they are source or destination. The specific relationship between the P field and the operands is detailed in the description of the individual parallel instructions (see Chapter 14 for more information). However, the operands are always encoded in the same way. Bits 31 and 30 are set to the value of 10, indicating parallel addressing mode instructions. Bits 25 and 24 specify the parallel addressing mode (P) field, which defines how bits 21–0 are to be interpreted for addressing the src operands. Bits 21–19 define the src1 address, bits 18–16 define the src2 address, bits 15–8 the src3 address, and bits 7–0 the src4 address. The notations modn and modm indicate the modification.
field that goes with the AR\(n\) or AR\(m\) (auxiliary register) field, respectively. The parallel addressing operands are listed below.

\[
\begin{align*}
src1 &= Rn \\
src2 &= Rn \\
d1 &= \text{If 0, } dst1 \text{ is R0. If 1, } dst1 \text{ is R1.} \\
d2 &= \text{If 0, } dst2 \text{ is R2. If 1, } dst2 \text{ is R3.} \\
P &= 0 \leq P \leq 3 \\
src3 &= \text{indirect } (disp = 0, 1, IR0, IR1) \\
src4 &= \text{indirect } (disp = 0, 1, IR0, IR1)
\end{align*}
\]

Note:

Only registers R0–R7 are used in parallel instructions. R8–R11 are not used in parallel instructions.

As in the three-operand addressing mode, indirect addressing in the parallel addressing mode allows for displacements of 0 or 1 and the use of the index registers (IR0 and IR1). The displacement of 1 is implied and is not explicitly coded in the instruction word.

In the encoding shown for this mode in Figure 6–7, if the src3 and src4 fields use the same auxiliary register, both addresses are correctly generated, but only the value created by the src3 field is saved in the specified auxiliary register. The assembler issues a warning if you specify the same auxiliary register src3 and src4.

### 6.7.4 Conditional-Branch Addressing Modes

Instructions using the conditional-branch addressing modes (B\(cond\), B\(cond\)D, CALL\(cond\), DB\(cond\), and DB\(cond\)D) can perform a variety of conditional operations. Bits 31–27 are set to the value of 01101, indicating conditional-branch addressing mode instructions. Bit 26 is set to 0 or 1; 0 selects DB\(cond\) and 1 selects B\(cond\). Bit 25 determines the conditional-branch addressing mode (B). If B = 0, register addressing is used; if B = 1, PC-relative addressing is used. Bit 21 sets the type of branch: D = 0 for a standard branch, and D = 1 for a delayed branch. The condition field(\(cond\)) specifies the condition checked to determine what action to take — for example, whether or not to branch (see Table 14–8 on page 14-14 for a list of condition codes). Figure 6–6 shows the encoding for conditional-branch addressing.
Encoding of Addressing Modes

Figure 6–8. Encoding for Conditional-Branch Addressing Modes

DBcond (D):

<table>
<thead>
<tr>
<th>3</th>
<th>26</th>
<th>25</th>
<th>24</th>
<th>22</th>
<th>21</th>
<th>20</th>
<th>16</th>
<th>15</th>
<th>5</th>
<th>4</th>
<th>0</th>
</tr>
</thead>
</table>
| 0 | 11011 | B | ARn | D | cond | 0 | 0 | 0 | 0 | 0 | 0 | 0 | src reg
| 0 | 11011 | B | ARn | D | cond | Immediate (PC relative) |

Bcond (D):

<table>
<thead>
<tr>
<th>31</th>
<th>26</th>
<th>25</th>
<th>24</th>
<th>22</th>
<th>21</th>
<th>20</th>
<th>16</th>
<th>15</th>
<th>5</th>
<th>4</th>
<th>0</th>
</tr>
</thead>
</table>
| 0 | 11010 | B | 0 | 0 | 0 | D | cond | 0 | 0 | 0 | 0 | 0 | src reg
| 0 | 11010 | B | 0 | 0 | 0 | D | cond | Immediate (PC relative) |

CALLcond:

<table>
<thead>
<tr>
<th>31</th>
<th>26</th>
<th>25</th>
<th>24</th>
<th>22</th>
<th>21</th>
<th>20</th>
<th>16</th>
<th>15</th>
<th>5</th>
<th>4</th>
<th>0</th>
</tr>
</thead>
</table>
| 0 | 11100 | B | 0 | 0 | 0 | D | cond | 0 | 0 | 0 | 0 | 0 | src reg
| 0 | 11100 | B | 0 | 0 | 0 | D | cond | Immediate (PC relative) |
6.8 Circular Addressing

Many DSP algorithms require a circular buffer in memory. In convolution and correlation, a circular buffer acts as a sliding window that contains the most recent data to be processed. As new data is brought in, the new data overwrites the oldest data. The key to using a circular buffer is the implementation of a circular addressing mode. This section describes the circular addressing mode of the 'C4x.

The block-size register (BK) specifies the size of the circular buffer. If the most significant bit equal to 1 in the BK register is labeled bit N, with \( N \leq 15 \), the address immediately following the bottom of the circular buffer can be found by concatenating bits 31 through \( N+1 \) of a user-selected register (ARn) with bits \( N \) through 0 of the BK register. The address of the top of the buffer is referred to as the effective base (EB) and can be found by concatenating bits 31 through \( N+1 \) of ARn. Bits \( N \) through 0 of EB are zero.

Figure 6–9 illustrates the relationships among the block-size register (BK), the auxiliary registers (ARn), the bottom of the circular buffer, the top of the circular buffer, and the index into the circular buffer.

A circular buffer of size \( R \) must start on a K-bit boundary (that is, the K LSBs of the starting address of the circular buffer must be zeros), where \( K \) is an integer such that \( 2^K > R \). Since the value \( R \) must be loaded into the BK register, \( K \geq N+1 \). For example, a 31-word circular buffer must start at an address whose five LSBs are 0 (that is, xxx...x00000), and the value must be loaded into the BK register.

Note:

If the BK register has a value of 0, circular addressing is not performed. The effect will be the generation of a conventional linear address.
Circular Addressing

**Figure 6-9. Register Relationships in Circular Addressing**

**LEGEND:**
- ARn = auxiliary register n
- BK = block-size register
- EB = effective base
- H = high-order bits
- L = low-order bits
- L’ = new low-order bits
- LSB = least significant bit
- N = location of the MSB equal to 1 in the BK register

First 1 at Location N

Top of Buffer + 1 (Low Address)

Bottom of Buffer + 1 (High Address)

Circular Addressing

Algorithm Logic
Circular Addressing

In circular addressing, index refers to the N LSBs of the auxiliary register selected, and step is the quantity being added to or subtracted from the auxiliary register. When you use circular addressing, follow two basic rules:

- The step used must be less than or equal to the block size and is treated as an unsigned integer.
- The first time the circular queue is addressed, the auxiliary register must be pointing to an element in the circular queue.

The algorithm for circular addressing is as follows:

\[
\begin{align*}
&\text{If } 0 \leq \text{index} + \text{step} < \text{BK}: \\
&\quad \text{index} = \text{index} + \text{step}. \\
&\text{Else if } \text{index} + \text{step} \geq \text{BK}: \\
&\quad \text{index} = \text{index} + \text{step} - \text{BK}. \\
&\text{Else if } \text{index} + \text{step} < 0: \\
&\quad \text{index} = \text{index} + \text{step} + \text{BK}.
\end{align*}
\]

Figure 6–10 shows how the circular buffer is implemented. It illustrates the relationship of the generated quantities and the elements in the circular buffer.

**Figure 6–10. Circular Buffer Implementation**

<table>
<thead>
<tr>
<th>Address</th>
<th>Data</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Effective Base (EB)</strong></td>
<td>Top of Circular Buffer</td>
</tr>
<tr>
<td>31 N + 1 N 0</td>
<td>Element 0</td>
</tr>
<tr>
<td>H...H 0...0</td>
<td>Element 1</td>
</tr>
<tr>
<td><strong>Auxiliary Register (ARn)</strong></td>
<td>Element (N LSBs of ARn)</td>
</tr>
<tr>
<td>31 N + 1 N 0</td>
<td>Last Element</td>
</tr>
<tr>
<td>H...H L...L</td>
<td>Last Element + 1</td>
</tr>
<tr>
<td>31 N + 1 N 0</td>
<td>LSBS BK</td>
</tr>
</tbody>
</table>
Circular Addressing

Figure 6–11 gives an example of the operation of circular addressing. Assuming that all registers are four bits, let BK = 0110₂ (block size of 6) and AR0 = 0000₂ (at least the 3 LSBs of AR0 should be 0). This example shows a sequence of modifications and the resulting value of AR0. It also shows how the pointer steps through the circular queue with a variety of step sizes (both incrementally and decrementally).

Figure 6–11. Circular Addressing Example

*AR0++(5)%; AR0 = 0 (0th value)
*AR0++(2)%; AR0 = 5 (1st value)
*AR0--(3)%; AR0 = 1 (2nd value)
*AR0++(6)%; AR0 = 4 (3rd value)
*AR0--; AR0 = 4 (4th value)
*AR0--; AR0 = 3 (5th value)

<table>
<thead>
<tr>
<th>Value</th>
<th>Data</th>
<th>Address</th>
</tr>
</thead>
<tbody>
<tr>
<td>0th</td>
<td>Element 0</td>
<td>0</td>
</tr>
<tr>
<td>1st</td>
<td>Element 1</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td>Element 2</td>
<td>2</td>
</tr>
<tr>
<td>5th</td>
<td>Element 3</td>
<td>3</td>
</tr>
<tr>
<td>4th, 3rd</td>
<td>Element 4</td>
<td>4</td>
</tr>
<tr>
<td></td>
<td>Element 5 (Last Element)</td>
<td>5</td>
</tr>
<tr>
<td></td>
<td>Last Element + 1</td>
<td>6</td>
</tr>
</tbody>
</table>
Circular addressing is especially useful for the implementation of FIR filters. Figure 6–12 shows one possible data structure for FIR filters. Note that the initial value of AR0 points to h(N−1), and the initial value of AR1 points to x(0). Circular addressing is used in the 'C4x code for the FIR filter shown in Example 6–22.

**Example 6–22. FIR Filter Code Using Circular Addressing**

```
* Initialization
*  LDI N,BK ; Load block size.
  LDI H,AR0 ; Load pointer to impulse response.
  LDI X,AR1 ; Load pointer to bottom of input
            ; sample buffer.
*  TOP LDF IN, R3 ; Read input sample.
  STF R3,*AR1++% ; Store with other samples.
            ; and point to top of buffer.
  LDF 0,R0 ; Initialize R0.
  LDF 0,R2 ; Initialize R2.
*  Filter
*  RPTS N-1 ; Repeat next instruction.
  MPYF3 *AR0++%,*AR1++% R0
  | | ADDF3 R0,R2,R2 ; Multiply and accumulate.
  ADDF R0,R2 ; Last product accumulated.
*  STF R2,Y ; Save result.
  B TOP ; Repeat.
```
6.9 Bit-Reversed Addressing

The 'C4x can implement fast Fourier transforms (FFT) with bit-reversed addressing. If the data to be transformed is in the correct order, the final result of the FFT is in bit-reversed order. To recover the frequency-domain data in the correct order, certain memory locations must be swapped. The bit-reversed addressing mode makes swapping unnecessary. The next time data must be accessed, it is accessed in a bit-reversed manner rather than sequentially. In the 'C4x, this bit-reversed addressing can be implemented with both the CPU and DMA.

For correct CPU (or DMA) bit-reverse operation, the base address of bit-reversed addressing must be located on a boundary of the size of the FFT table. The CPU bit-reverse operation can be illustrated by assuming an FFT table of size \(N = 2^n\). When real and imaginary data are stored in separate arrays, the \(n\) LSBs of the base address must be zero, and \(IR_0\) must be equal to \(2^{n-1}\) (half of the FFT size). When real and imaginary data are stored in consecutive memory locations (\(Re-Im-Re-Im\)), the \(n+1\) LSBs of the base address must be zero, and \(IR_0\) must be equal to \(2^n\) (FFT size).

For CPU bit-reversing, one auxiliary register (\(AR2\) in this case) points to the physical location of a data value. When you add \(IR_0\) to this auxiliary register by using bit-reversed addressing, addresses are generated in a bit-reversed fashion (reverse carry propagation). The largest index for bit-reversed addressing is \(0008\ 0000\)h; this index is treated as an unsigned integer.

To illustrate bit reversed addressing, assume 8-bit auxiliary registers. Let \(AR2\) contain the value \(0110\ 0000\) (96\(_{10}\)). This is the base address of the data in memory. Let \(IR_0\) contain the value \(0000\ 1000\) (8\(_{10}\)). Example 6–23 shows a sequence of modifications of \(AR2\) and the resulting values of \(AR2\).

**Example 6–23. Bit-Reversed Addressing Example**

<table>
<thead>
<tr>
<th>(AR2++(IR0)B)</th>
<th>(AR2)</th>
</tr>
</thead>
<tbody>
<tr>
<td>(AR2++(IR0)B)</td>
<td>(AR2 = 0110\ 0000) (0th value)</td>
</tr>
<tr>
<td>(AR2++(IR0)B)</td>
<td>(AR2 = 0110\ 1000) (1st value)</td>
</tr>
<tr>
<td>(AR2++(IR0)B)</td>
<td>(AR2 = 0110\ 0100) (2nd value)</td>
</tr>
<tr>
<td>(AR2++(IR0)B)</td>
<td>(AR2 = 0110\ 1100) (3rd value)</td>
</tr>
<tr>
<td>(AR2++(IR0)B)</td>
<td>(AR2 = 0110\ 0010) (4th value)</td>
</tr>
<tr>
<td>(AR2++(IR0)B)</td>
<td>(AR2 = 0110\ 1010) (5th value)</td>
</tr>
<tr>
<td>(AR2++(IR0)B)</td>
<td>(AR2 = 0110\ 0110) (6th value)</td>
</tr>
<tr>
<td>(AR2)</td>
<td>(AR2 = 0110\ 1110) (7th value)</td>
</tr>
</tbody>
</table>
Table 6–4 shows the relationship of the index steps and the four LSBs of AR2. You can find the four LSBs by reversing the bit pattern of the steps.

**Table 6–4. Index Steps and Bit-Reversed Addressing**

<table>
<thead>
<tr>
<th>Step</th>
<th>Bit Pattern</th>
<th>Bit-Reversed Pattern</th>
<th>Bit-Reversed Step</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0000</td>
<td>0000</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>0001</td>
<td>1000</td>
<td>8</td>
</tr>
<tr>
<td>2</td>
<td>0010</td>
<td>0100</td>
<td>4</td>
</tr>
<tr>
<td>3</td>
<td>0011</td>
<td>1100</td>
<td>12</td>
</tr>
<tr>
<td>4</td>
<td>0100</td>
<td>0010</td>
<td>2</td>
</tr>
<tr>
<td>5</td>
<td>0101</td>
<td>1010</td>
<td>10</td>
</tr>
<tr>
<td>6</td>
<td>0110</td>
<td>0110</td>
<td>6</td>
</tr>
<tr>
<td>7</td>
<td>0111</td>
<td>1110</td>
<td>14</td>
</tr>
<tr>
<td>8</td>
<td>1000</td>
<td>0001</td>
<td>1</td>
</tr>
<tr>
<td>9</td>
<td>1001</td>
<td>1001</td>
<td>9</td>
</tr>
<tr>
<td>10</td>
<td>1010</td>
<td>0101</td>
<td>5</td>
</tr>
<tr>
<td>11</td>
<td>1011</td>
<td>1101</td>
<td>13</td>
</tr>
<tr>
<td>12</td>
<td>1100</td>
<td>0011</td>
<td>3</td>
</tr>
<tr>
<td>13</td>
<td>1101</td>
<td>1011</td>
<td>11</td>
</tr>
<tr>
<td>14</td>
<td>1110</td>
<td>0111</td>
<td>7</td>
</tr>
<tr>
<td>15</td>
<td>1111</td>
<td>1111</td>
<td>15</td>
</tr>
</tbody>
</table>

**Note:**

Bit-reverse operation of the DMA coprocessor is covered in Chapter 11 of this user’s guide and in the *TMS320C4x General-Purpose Applications User’s Guide.*
The 'C4x provides a complete set of constructs that allow software and hardware control of the program flow. Software control includes repeats, branches, calls, traps, and returns. Hardware control includes interrupts. You can select the constructs best suited for your particular application.

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>7.1 Repeat Mode</td>
<td>7-2</td>
</tr>
<tr>
<td>7.2 Delayed Branches</td>
<td>7-8</td>
</tr>
<tr>
<td>7.3 Calls, Traps, Branches, Jumps, and Returns</td>
<td>7-12</td>
</tr>
<tr>
<td>7.4 Interrupts</td>
<td>7-15</td>
</tr>
<tr>
<td>7.5 Traps</td>
<td>7-24</td>
</tr>
<tr>
<td>7.6 DMA Interrupts</td>
<td>7-26</td>
</tr>
<tr>
<td>7.7 Reset</td>
<td>7-29</td>
</tr>
</tbody>
</table>
7.1 Repeat Mode

The repeat mode of the 'C4x can implement zero-overhead looping. For many algorithms, most execution time is spent in an inner kernel of code. Using the repeat modes allows these time-critical sections of code to be executed in the shortest possible time.

The 'C4x provides three instructions to support zero-overhead looping: RPTB (repeat a block of code), RPTBD (repeat a block of code delayed) and RPTS (repeat a single instruction):

- RPTB and RPTBD cause a block of code to be repeated a specified number of times.
- RPTS causes a single instruction to be repeated a number of times and reduces bus traffic by fetching the instruction only once.

RPTB and RPTS are four-cycle instructions; these four cycles of overhead are incurred only on the first pass through the loop. All subsequent passes through the loop are accomplished with zero cycles of loop overhead. RPTBD is a one-cycle instruction.

Three registers (RS, RE, and RC) control the updating of the program counter when it is updated in a repeat mode, as described in Table 7–1 below.

### Table 7–1. Repeat-Mode Registers

<table>
<thead>
<tr>
<th>Register</th>
<th>Function</th>
</tr>
</thead>
<tbody>
<tr>
<td>RS</td>
<td>Repeat start address register. Holds the address of the first instruction of the code block to be repeated.</td>
</tr>
<tr>
<td>RE</td>
<td>Repeat end address register. Holds the address of the last instruction of the code block to be repeated. RE should be greater than or equal to RS (see subsection 7.1.2).</td>
</tr>
<tr>
<td>RC</td>
<td>Repeat-count register. Contains one less than the number of times remaining for the code block to be repeated.</td>
</tr>
</tbody>
</table>

Correct operation of the repeat modes requires that all of the above registers and status register fields be initialized correctly. RPTB, RPTBD, and RPTS perform this initialization in slightly different ways (see subsection 7.1.3 and subsection 7.1.4 for more information).
7.1.1 Control Bits

Two bits are important to the operation of RPTB, RPTBD and RPTS:

- The **RM** (repeat-mode flag) bit in the status register specifies whether or not the processor fetches instructions during the repeat mode.
  - If RM = 0, fetches are not made in repeat mode.
  - If RM = 1, fetches are made in repeat mode.

- The **S** bit is internal to the processor and cannot be programmed, but this bit is necessary to fully describe the operation of RPTB, RPTBD, and RPTS.
  - If RM = 1 and S = 0, RPTB or RPTBD is executing. Program fetches occur from memory.
  - If RM = 1 and S = 1, RPTS is executing. After the first fetch (from memory), program fetches occur from the instruction register (IR).

7.1.2 Repeat-Mode Operation

Information in the repeat-mode registers and associated control bits is used to control the modification of the PC when instruction fetches are being made in repeat mode. The repeat modes compare the contents of the RE register (repeat end address register) with the program counter (PC) after the execution of each instruction. If they match and the repeat counter is nonnegative, the repeat counter is decremented, the PC is loaded with the repeat start address, and processing continues. The fetches and appropriate status bits are modified as necessary. Note that the repeat counter (RC) is never modified when the repeat-mode flag (RM) is 0.

The repeat counter should be loaded with a value one less than the number of times to execute the block; for example, an RC value of 4 would execute the block five times. The detailed algorithm for the update of the PC is shown in Example 7–1.

**Notes:**

1) The maximum number of repeats occurs when RC = 8000 0000h. This results in 8000 0001h repetitions. The minimum number of repeats occurs when RC = 0. This results in one repetition.

2) RE should be greater than or equal to RS (RE ≥ RS). Otherwise, the code will not repeat even though the RM bit remains set to 1.

3) By writing a 0 into the repeat counter or writing 0 into the RM bit of the status register, you can stop the loop before it completes.
Example 7–1. Repeat-Mode Control Algorithm

```
if RM == 1  ;If in repeat mode (RPTB or RPTS)
  if S == 1  ;If RPTS
    if first time through  ;If this is the first fetch
      fetch instruction from memory  ;Fetch instruction from memory
    else
      fetch instruction from IR  ;Fetch instruction from IR
    RC - 1 \rightarrow RC  ;Decrement RC
    if RC < 0  ;If RC is negative
      ;Repeat single mode completed
      0 \rightarrow ST(RM)  ;Turn off repeat mode bit
      0 \rightarrow S  ;Clear S
      PC + 1 \rightarrow PC  ;Increment PC
  else if S == 0  ;If RPTB
    fetch instruction from memory  ;Fetch instruction from memory
    if PC == RE  ;If this is the end of the block
      RC - 1 \rightarrow RC  ;Decrement RC
    if RC \geq 0  ;If RC is not negative
      RS \rightarrow PC  ;Set PC to start of block
    else if RC < 0  ;If RC is negative
      0 \rightarrow ST(RM)  ;Turn off repeat mode bits
      0 \rightarrow S  ;Clear S
      PC + 1 \rightarrow PC  ;Increment PC
else
  fetch instruction from memory  ;Fetch instruction from memory
```

7.1.3 RPTB and RPTBD Instructions

The RPTB and RPTBD instructions repeat a block of code a specified number of times. RPTBD is a delayed form of the RPTB instruction that allows placing three instructions after it. These three instructions are not part of the block that is repeated, but they execute before the block repeat is started. This way, the pipeline remains full, and the RPTBD instruction can execute in one cycle.

The number of times to repeat the block is the RC (repeat count) register value plus one. Because the execution of RPTB and RPTBD does not load the RC, you must load this register yourself. The RC register must be loaded before the RPTB/RPTBD instruction is executed. The RC register should not be loaded in the 3 instructions after RPTBD. Example 7–2 shows a typical setup of the block repeat operation.

Example 7–2. RPTB Operation

```
LD 15,RC ; Load repeat counter with 15
RPTB ENDLOP ; Execute the block of code
STLOOP

; from STLOOP to ENDLOP 16 times
ENDLOP
```
All block repeats initiated by RPTB or RPTBD can be interrupted. However, interrupts are disabled during the execution of the three instructions following an RPTBD. None of the three instructions after the RPTBD instruction should modify the PC register or program flow. This restriction also applies to delayed branches, as explained in Section 7.2.

When RPTB src or RPTBD src execute, they perform a sequence of four operations:

1) Load the start address of the block into RS (repeat start address register).
   - For RPTB, this is the next address following the instruction: 
     PC of RPTB + 1 → RS
   - For RPTBD, this is the fourth address following the instruction: 
     PC of RPTBD + 4 → RS

2) Load the end address of the block into RE (repeat end address register).
   - For RPTB, in PC-relative mode, the 24-bit src operand plus RS is the end address:
     src + PC of RPTB + 1 → RE
   - For RPTBD, in PC-relative mode, the 24-bit source operand plus RS is the end address:
     src + PC of RPTBD + 3 → RE

3) In register mode, the contents of the src register is the end address: 
   contents of src register → RE

4) Set the status register to indicate the repeat mode of operation. 
   1 → RM status register bit (repeat mode flag)

5) Indicate that this is the repeat block mode of operation. 
   0 → S bit (bit is internal to the processor and not programmable)

### 7.1.4 RPTS Instruction

A RPTS src instruction repeats the instruction following the RPTS (src +1) times. Repeats of a single instruction initiated by RPTS are not interruptible since RPTS fetches the instruction word only once and then keeps it in the instruction register for reuse. An interrupt in this situation would cause the instruction word to be lost. Refetching the instruction word from the instruction register reduces memory accesses and, in effect, acts as a one-word program cache. If you need a single instruction that is repeatable and interruptible, you can use the RPTB/RPTBD instruction.
When RPTS src is executed, a sequence of five operations occurs:

1) PC + 1 → RS
2) PC + 1 → RE
3) 1 → RM (status register bit)
4) 1 → S
5) src → RC (repeat count register)

The RPTS instruction loads all registers and mode bits necessary for the operation of the single instruction repeat mode. Step 1 loads the start address of the block into RS. Step 2 loads the end address into the RE (end address of the block). Since this is a repeat of a single instruction, the start address and the end address are the same. Step 3 sets the status register to indicate the repeat mode of operation. Step 4 indicates that this is the repeat single-instruction mode of operation. Step 5 loads src into RC.

7.1.5 Repeat Mode Restriction Rules

Because the block repeat modes modify the program counter, other instructions cannot modify the program counter at the same time. Two rules apply:

Rule 1: The last instruction in the block (or the only instruction in a block of size one) cannot be a Bcond, DBcond, CALL, CALLcond, TRAPcond, RET1cond, RETScond, IDLE, RPTB, or RPTS. Example 7–3 shows an incorrectly placed standard branch.

Rule 2: None of the last four instructions from the bottom of the block (nor the only instruction in a block of size one) can be a BcondD, BRD, or DBcondD, RPTBD, LAJ, LAJcond, LATcond, BcondAF, BcondAT, or RET1condD. Example 7–4 shows an incorrectly placed delayed branch.

If either of these rules is violated, the PC will be undefined.

Example 7–3. Incorrectly Placed Standard Branch

```
LDI 15,RC ; Load repeat counter with 15
RPTB ENDLOP ; Execute block of code
STLOOP ; from STLOOP to ENDLOP 16 times
.
.
.
ENDLOP BR OOPS ; This branch violates rule 1
```
Example 7–4. Incorrectly Placed Delayed Branch

```
LDI 15,RC ; Load repeat counter with 15
RPTB ENDP ; Execute block of code
STLOOP ; from STLOOP to ENDP 16 times
  
BRD OOPS ; This branch violates rule 2
ADDF
MIFYF
ENDDP SUBF
```

7.1.6 RC Register Value After Repeat Mode Completes

For the RPTB/RPTBD instruction, the RC register normally decrements to 0000 0000h, unless the block size is 1; in that case, it decrements to FFFF FFFFh. However, if the RPTB/RPTBD instruction with a block size of 1 has a pipeline conflict in the instruction being executed, the RC register decrements to 0000 0000h. Example 7–5 illustrates a pipeline conflict. Refer to Chapter 8 for pipeline information.

RPTS normally decrements the RC register to FFFF FFFFh. However, if the RPTS has a pipeline conflict on the last cycle, the RC register decrements to 0000 0000h.

In any case, the number of repetitions is always RC + 1, regardless of the final value of RC.

Example 7–5. Pipeline Conflict in a RPTB Instruction

```
EDC .word 40000000h ; Program is located in 4000000Fh
LDP EDC
LDI @EDC,AR0
LDI 15,RC ; Load repeat counter with 15
RPTB ENDP ; Execute block of code
ENDDP LDI *AR0,R0 ; The *AR0 read conflicts with
       ; the instruction fetching.
       ; Then RC decrements to 0. If
       ; cache is enabled, RC decrements
       ; to FFFF FFFFh
```
7.1.7 Nesting Block Repeats

Block repeats (RPTB and RPTBD) are nestable. Because all of the control of a block repeat is defined by the RS, RE, RC, and ST registers, these registers must be saved and stored to nest block repeats. For example, if you write an interrupt service routine that requires the use of RPTB or RPTBD, it is possible that the interrupt associated with the routine may occur during another block repeat. The interrupt service routine can check the RM bit to determine whether the block repeat mode is active. If RM is set, the interrupt routine should save ST, RS, RE and RC, in this order. The interrupt routine can then perform a block repeat. Before returning from the interrupted routine, the interrupt routine should restore RC, RE, RS, and ST, in this order. If the RM bit is not set, you do not need to save and restore these registers.

The RPTS instruction can also be used in a block repeat loop if the proper registers are saved.

Because the program counter is modified at the end of the loop according to the contents of registers RS, RE, and RC, no operation should attempt to modify the repeat counter or the program counter to a different value at the end of the loop.

It takes four cycles of overhead to save and restore these registers. Hence, sometimes, it may be more economical to implement a nested loop by the more traditional method of using a register as a counter and then using a delayed branch, rather than by using the nested repeat block approach. Often, implementing the outer loop as a counter and the inner loop as a RPTB/RPTBD instruction produces the fastest execution.

Note:

The order in which the registers are saved/restored is important to guarantee correct operation. The ST register should be restored last, after the RC, RE, and RS registers. ST should be restored after restoring RC, because the RM bit cannot be set to one if the RC register is 0 or –1. For this reason, if you execute a POP ST instruction (with ST(RM) = 1) while RC = 0, the POP instruction recovers all of the ST register bits except the RM bit, which stays at 0 (repeat mode disabled). Also, RS and RE should be correctly set before you activate the repeat mode.

Section 1.7, Repeat Modes, in the TMS320C4x General-Purpose Applications User’s Guide contains examples of how to use repeat-mode instructions.
7.2 Delayed Branches

The 'C4x offers two main types of branches: standard and delayed.

*Standard branches* empty the pipeline before performing the branch; this guarantees correct management of the program counter and results in a 'C4x branch taking four cycles. Included in this class are standard branches (Bcond), repeats, calls, returns, and traps.

*Delayed branches* do not empty the pipeline but guarantee that the next three instructions will execute before the program counter is modified by the branch. The result is a branch that requires only a single cycle, thus making the speed of the delayed branch very close to speed of the optimal block repeat modes of the 'C4x. However, unlike block repeat modes, delayed branches can be used in situations other than looping. Every delayed branch has a standard branch counterpart that is used when a delayed branch cannot be used.

*Conditional delayed branches* use the conditions, reflected in the status register, that existed at the end of the instruction preceding the branch. They do not depend upon the instructions following the delayed branch. The execution time of a conditional delayed branch instruction is the same regardless of whether or not the branch is taken.

When a delayed branch is fetched, it remains pending until the three instructions that follow are executed. None of the three instructions immediately after a delayed branch can be any of the following:

- Bcond
- DBcond
- LAJ
- RETScond
- BcondD
- DBcondD
- LAJcond
- RPTB
- BcondAF
- CALL
- LATcond
- RPTBD
- BcondAT
- CALLcond
- RETIcond
- RPTS
- BR
- IDLE
- RETIcondD
- TRAPcond
- BRD

This restriction also applies to the RPTBD instruction, covered in subsection 7.1.3.

Delayed branches disable interrupts until the three instructions following the delayed branch are completed. This is independent of whether or not the branch is taken.

**Incorrectly used delayed branches can leave the PC undefined.** Example 7–6 illustrates an incorrectly-placed delayed branch.
Delayed Branches

Example 7–6. Incorrectly Placed Delayed Branches

```
B1:BD L1
   NOP
   NOP
B2:B  L2 ; This branch is incorrectly placed
   NOP
   NOP
   ...
```

Sometimes, a branch is necessary for the program flow when fewer than three instructions can be placed after a delayed branch. For faster execution, it is still advantageous to use a delayed branch. This is shown in Example 7–7, with a NOP taking the place of the third unused instruction. The tradeoff is more instruction words for less execution time.

Example 7–7. Delayed Branch Execution

```
* TITLE DELAYED BRANCH EXECUTION
   ...
   ...
   ...
   ...
   LDF*+AR1(5),R2 ; Load contents of memory to R2
   BGED SKIP ; If loaded number >=0, branch (delayed)
   LDFN R2,R1 ; If loaded number <0, load it to R1
   SUBF 3.0,R1 ; Subtract 3 from R1
   NOP ; Dummy operation to complete delayed branch
   MPYF 1.5,R1 ; Continue here if loaded number < 0
   ...
   ...
   SKIP LDF R1,R3 ; Continue here if loaded number >=0
```

There are two types of delayed branches: branches without annulling and branches with annulling.

7.2.1 Delayed Branches Without Annulling

Delayed branches without annulling do not empty the pipeline but guarantee that the next three instructions execute before the program counter is modified by the branch. The delayed branches without annulling are BcondD, BRD, and DBcondD.
7.2.2 Delayed Branches With Annulling

Delayed branches with annulling may conditionally annul the next three instructions. The delayed branches with annulling are BcondAT and BcondAF:

- **BcondAF**
  
  If the condition is true, the BcondAF instruction executes the three instructions following the branch and then branches. If the condition is false, the processor does not take the branch and it annuls the effects of the execute phase of the first following instruction and the effects of the read and execute phases of the second and third following instructions.

- **BcondAT**
  
  If the condition is true, the BcondAT instruction causes a branch and annuls the effects of the execute phase of the first following instruction and the effects of the read and execute phases of the second and third following instructions. If the condition is false, the instruction causes the execution of the three instructions following the branch and does not cause a branch.
7.3 Calls, Traps, Branches, Jumps, and Returns

Calls and traps can execute a subroutine or function while providing a return to the calling routine.

The CALL, CALLcond, and TRAPcond instructions store the value of the PC on the stack before changing the PC’s contents. The RETScond or RETIcond (standard or delayed) instructions use the value on the stack to return execution from traps and calls.

CALL is a four-cycle instruction, while CALLcond and TRAPcond are five-cycle instructions. 'C4x delayed instructions LAJ, LAJcond, and LATcond provide equivalent functionality, respectively, but in a single cycle.

- **CALL** places the next PC value on the stack and places the src (source) operand into the PC. The src is a 24-bit PC-relative or register value. Figure 7–1 shows CALL response timing.

- **CALLcond** is similar to the CALL instruction (above) except for two differences:
  - It executes only if a specific condition is true (the 20 conditions — including unconditional — are listed in Section 14.2 on page 14-12).
  - The src is either a 24-bit PC-relative displacement or in register addressing mode.

- **TRAPcond** executes only if a specific condition is true (same conditions as for the CALLcond instruction). When it executes, a four-step sequence occurs:
  1) The values of the GIE and CF status register bits are saved into the PGIE and PCF status register bits.
  2) Interrupts are disabled (GIE = 0) and the cache is frozen (CF bit = 0).
  3) The next PC value is stored on the stack.
  4) The specified vector is retrieved from the trap vector table and is loaded into the PC. The vector address corresponds to a trap number in the instruction.

Using RETIcond or RETIcondD to return re-enables interrupts if the status register’s GIE bit was set previously and recovers the previous CF bit.

- **RETScond** returns execution from any of the above three instructions by popping the top of the stack to the PC. For RETScond to execute, the specified condition must be true. The conditions are the same for RETScond as for the CALLcond instruction.
RET1cond returns from traps or calls in the same way that RETScond does with the addition that RET1cond also copies the PGIE and PCF bit values into the GIE and CF bits of the status register. The conditions for RET1cond are the same as for the CALLcond instruction.

RET1condD returns from traps or calls in the same way that RET1cond does with the addition that RET1condD first executes the three instructions immediately following RET1condD. The conditions for RET1condD are the same as for the CALLcond instruction.

Link and jump (LAJ), link and jump conditional (LAJcond), and link and trap conditional (LATcond) each provide a return address in extended-precision register R11.

- After it executes the three instructions that follow it, LAJ jumps to an address derived by a 24-bit PC-relative addressing mode (see subsection 6.6 for more information).

- The LAJcond destination address is either PC-relative (a displacement) or the contents of a specified register. If the condition is true, LAJcond first executes the three instructions following the LAJcond before making the jump. If the condition is not true, execution continues immediately after the LAJcond instruction.

- After it executes the three instructions that follow it, LATcond calls one of the 512 available trap vectors pointed to by the trap vector table pointer (see Section 3.2, on page 3-17, for more information about the TVTP).

Functionally, calls and traps accomplish the same task—a subfunction is called and executed, and control is then returned to the calling function. Traps offer two advantages over calls:

- Interrupts are automatically disabled when a trap is executed. This allows critical code to execute without risk of being interrupted. Thus, traps are usually terminated with a RET1cond or RET1condD instruction to re-enable interrupts if the status register GIE bit was set previously.

- You can use traps to indirectly call functions. This is particularly beneficial when a kernel of code contains the basic subfunctions to be used by applications. In this case, you can modify the functions in the kernel and relocate them without recompiling each application.
Figure 7–1. CALL Response Timing

Fetch CALL  |  Decode CALL  |  Read CALL  |  Execute CALL (store PC on stack)  |  Fetch first subroutine instruction

H3
H1
ADDR  |  Vector address  |  First instruction address
Data  |  PC  |  Inst 1
7.4 Interrupts

The 'C4x supports multiple internal and external interrupts, which can be used for a variety of applications. Internal interrupts are generated by the DMA controller, the timers, and the communication ports. The five external interrupt pins include four external maskable interrupt pins (IIOF0–IIOF3) and one non-maskable interrupt (NMI) pin. Interrupts can be sent to both the CPU and the DMA controller.

Interrupts on the 'C4x are automatically prioritized. This allows interrupts that occur simultaneously to be serviced in a predefined order.

This section discusses the operation of these interrupts. Additional information regarding internal interrupts can be found in Section 12.6, Coordinating Communication Ports with the CPU and DMA Processor, on page 12-17, Section 11.10, DMA and Interrupts, on page 11-42, and Chapter 13, Timers. See Section 7.6, DMA Interrupts, on page 7-26, for more information about interrupts to the DMA controller.

7.4.1 Interrupt Vector Table and Prioritization

The interrupt vector table (IVT) shown in Figure 7–2 contains the interrupt vectors. An interrupt vector is an address of an interrupt service routine that should start executing when an interrupt is received. The IVT table must be placed on a 512-word memory boundary. The table location is determined by the value that is stored in the IVTP register (see Section 3.2, CPU Expansion Register File, on page 3-17).

Prioritization means that an interrupt in a higher position in the interrupt vector table (Figure 7–2) is serviced before one in a lower position when both are received in the same clock cycle or when two previously received interrupts are waiting to be serviced. It does not mean, for example, that IIOF3 must wait until service routines for IIOF2, IIOF1, and IIOF0 are completed (when ST(GIE) = 1).

The priority of interrupts is handled by the CPU according to the interrupt vector table. Priority is set according to position in the table — those with displacements closest to the IVTP base address are higher in priority (i.e., NMI is higher than TINT0, which is higher than IIOF0, etc.). Note that interrupt TINT0 is located at IVTP + 2, while the TINT1 vector is located at IVTP + 2Bh after the communication port and DMA coprocessor interrupts.
**Figure 7–2. Interrupt-Vector Table (IVT)**

<table>
<thead>
<tr>
<th>IVTP+</th>
<th>IVTP+</th>
</tr>
</thead>
<tbody>
<tr>
<td>000h</td>
<td>Reserved</td>
</tr>
<tr>
<td>001h</td>
<td>NMI</td>
</tr>
<tr>
<td>002h</td>
<td>TINT0</td>
</tr>
<tr>
<td>003h</td>
<td>ITOF0</td>
</tr>
<tr>
<td>004h</td>
<td>ITOF1</td>
</tr>
<tr>
<td>005h</td>
<td>ITOF2</td>
</tr>
<tr>
<td>006h</td>
<td>ITOF3</td>
</tr>
<tr>
<td>007h</td>
<td>Unused</td>
</tr>
<tr>
<td>00Ch</td>
<td>ICFULL0</td>
</tr>
<tr>
<td>00Dh</td>
<td>OCFULL4</td>
</tr>
<tr>
<td>01Dh</td>
<td>ICFULL5</td>
</tr>
<tr>
<td>020h</td>
<td>OCEMPTY4</td>
</tr>
<tr>
<td>021h</td>
<td>ICFULL6</td>
</tr>
<tr>
<td>022h</td>
<td>OCRDY5</td>
</tr>
<tr>
<td>023h</td>
<td>OCEMPTY5</td>
</tr>
<tr>
<td>024h</td>
<td>OCEMPTY6</td>
</tr>
<tr>
<td>025h</td>
<td>DMA INT0</td>
</tr>
<tr>
<td>026h</td>
<td>DMA INT1</td>
</tr>
<tr>
<td>027h</td>
<td>DMA INT2</td>
</tr>
<tr>
<td>028h</td>
<td>DMA INT3</td>
</tr>
<tr>
<td>029h</td>
<td>DMA INT4</td>
</tr>
<tr>
<td>02Ah</td>
<td>DMA INT5</td>
</tr>
<tr>
<td>02Bh</td>
<td>TINT1</td>
</tr>
<tr>
<td>02Ch</td>
<td>Unused</td>
</tr>
<tr>
<td>03Ch</td>
<td>OCEMPTY3</td>
</tr>
<tr>
<td>03Dh</td>
<td>Reserved</td>
</tr>
</tbody>
</table>

**Notes:**

1) Reserved for the reset vector. See Table 7–4.
2) NMI (the nonmaskable interrupt) is discussed in subsection 7.4.5.
3) Timer interrupts TINT0 and TINT1 are enabled by the IIE register (subsection 3.1.9, page 3-11) and monitored at the IIF register (subsection 3.1.10, page 3-13).
4) External pins ITOF0—ITOF3 are programmed in the IIF register (subsection 3.1.10, page 3-13).
5) The communication port I/O buffers full/empty/ready interrupts are enabled by the IIE register and are also described in Figure 12–4, on page 12-8, (OUTPUT LEVEL and INPUT LEVEL bits).
6) Interrupts from the DMA are enabled at the IIE register and DMA channel control register at bits TCC and AUX TCC (see Figure 11–2, on page 11-8, for bit descriptions).
7) In the ‘C44, the interrupts for communication ports 0 and 3 are active. If you enable them with the IE bit, the ISR will be executed.
Interrupts

7.4.2 CPU Interrupt Control Bits

Three CPU registers contain bits used to control CPU interrupt operation:

- **The CPU status register (ST).** The CPU global interrupt enable bit (GIE), located in the ST, controls all maskable CPU interrupts. When this bit is set to 1, CPU interrupts are globally enabled. When this bit is cleared to 0, all CPU interrupts are disabled (except NMI, the nonmaskable interrupt). Refer to subsection 3.1.7, Status Register (ST), on page 3-5.

- **Internal interrupt enable register (IIE).** The IIE is used to enable CPU internally-generated interrupts (from timers, communication ports, and DMA channels). See subsection 3.1.9, CPU Internal Interrupt Enable Register (IIE), on page 3-11, for more information.

- **IIOF flag register (IIF).** The IIF contains interrupt flag bits and bits to determine the function of the external-interrupt pins (IIF0 – IIF3).

**The IIF Register**

When an external interrupt or most of the internal interrupts are received, a corresponding bit in the IIF register is set to 1. The only internally generated interrupts that do not have a flag bit in the IIF register are the communication port interrupts.

When the CPU services an interrupt that has an interrupt flag bit in the IIF register, or when the DMA controller latches this type of interrupt into a DMA internal signal, this flag bit is cleared by the internal interrupt acknowledge signal. However, for level-triggered interrupts, if IIFn is still low when the interrupt acknowledge signal occurs, the interrupt flag bit is cleared for only one cycle and then set to 1 again. For this reason, it is theoretically possible that, depending on when the IIF register is read, the interrupt flag bit may be zero, even though IIFn is low. After reset, zero is written to the interrupt flag register, thereby clearing all pending interrupts.

The IIF register bits can be read or written under software control. This provides access to the IIOFx pins, which can be treated as general-purpose I/O or as interrupt pins. For example, if at the IIF register, FUNCx = 0 (I/O pin) and TYPEx = 1 (output pin), then by writing into the FLAGx bit, you can also write to the external pin IIOFx. If FUNCx = 1 (interrupt pin), writing a 1 to the IIF register FLAGx bit has the same effect as an incoming interrupt received on the corresponding pin. In this way, all interrupts can be triggered and/or cleared through software. Since the interrupt bits also can be read, the interrupt pins can be polled in software when an interrupt-driven interface is not required.
Internal interrupts operate in a similar manner. In the IIF register, the bit corresponding to an internal interrupt (e.g., TINT0, TINT1) can be read and written to through software. Writing a 1 sets the interrupt latch, and writing a 0 clears it. All internal interrupts are one H1/H3 cycle in length. If any previous bit values of the IIF register need to be preserved, a modification to IIF should be performed with logic operations (AND, OR, etc), directly to the IIF register.

**Figure 7–3. IIF Register Modification**

<table>
<thead>
<tr>
<th>correct</th>
<th>incorrect</th>
</tr>
</thead>
<tbody>
<tr>
<td>LDI @MASK, R0</td>
<td>LDI IIF, R1</td>
</tr>
<tr>
<td>AND R0, IIF</td>
<td>AND @MASK, R1</td>
</tr>
<tr>
<td></td>
<td>LDI R1, IIF</td>
</tr>
</tbody>
</table>

### 7.4.3 Interrupt Processing

For an interrupt to occur, at least two conditions must be met:

- All interrupts must be enabled globally by setting the GIE bit to 0 in the CPU status register (ST).
- The interrupt must be enabled by setting the corresponding bit in the IIE register.

The CPU interrupt processing cycle (shown in Figure 7–4) involves several events. The corresponding interrupt flag in the IIF register is cleared, the values of the GIE and CF status register bits are preserved, the cache is frozen (CF = 1), interrupts are globally disabled (GIE = 0), and the CPU completes all fetched instructions. Then, the interrupt vector is fetched and loaded into the PC, and the CPU continues execution of the first instruction in the interrupt service routine (ISR). When you use RETI cond or RETI cond D to return from the interrupt service routine, the previous GIE and CF bit values are recovered.

If you wish to make the interrupt service routine interruptible, you can set the GIE bit to 1 after entering the ISR. In addition, you can enable the cache. Be aware that because the PGIE and PCF status register bits are one deep, they preserve only the previous GIE and CF bits.

**Note:**

The GIE, and CF are preserved and loaded with new values after the completion of the last instruction that was fetched before the interrupt was flushed. This guarantees later restoration of correct flag values.
Figure 7–4. CPU Interrupt Processing

<table>
<thead>
<tr>
<th>Step</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Is an enabled interrupt set?</td>
<td>If enabled in the IIE or IIF registers, the interrupt is a CPU Interrupt</td>
</tr>
<tr>
<td>Disable interrupts temporarily</td>
<td></td>
</tr>
<tr>
<td>Clear interrupt flag (CPU)</td>
<td></td>
</tr>
<tr>
<td>Complete all fetched instructions</td>
<td></td>
</tr>
<tr>
<td>*(++SP) -&gt; PC</td>
<td></td>
</tr>
<tr>
<td>PGIE -&gt; GIE</td>
<td></td>
</tr>
<tr>
<td>PCF -&gt; CF</td>
<td></td>
</tr>
<tr>
<td>GIE -&gt; 0</td>
<td></td>
</tr>
<tr>
<td>CF -&gt; 1</td>
<td></td>
</tr>
<tr>
<td>PC ← Interrupt vector</td>
<td></td>
</tr>
<tr>
<td>CPU starts executing ISR routine</td>
<td></td>
</tr>
<tr>
<td>Return executed</td>
<td></td>
</tr>
<tr>
<td>(RETI/RETIcond)</td>
<td></td>
</tr>
<tr>
<td>PGIE -&gt; GIE</td>
<td></td>
</tr>
<tr>
<td>PCF -&gt; CF</td>
<td></td>
</tr>
</tbody>
</table>
Interrupts

CPU interrupts (including NMI) are only acknowledged (responded to by the CPU) on instruction fetch boundaries. If instruction fetches are halted because of pipeline conflicts or when an RPTS loop is executing, CPU interrupts are not acknowledged until the next instruction fetch.

The interrupt acknowledge (IACK) instruction can be used to signal externally that an interrupt has been serviced. If external memory is specified in the operand, IACK drives the IACK pin and performs a dummy read. The read is performed from the address specified by the IACK instruction operand. IACK is typically placed in the early portion of an interrupt service routine. However, depending on your application, it may be better suited at the end of the interrupt service routine or at another location. You are not required to use the IACK instruction in interrupt service routines.

Note the following situations:

- Interrupts are disabled during a RPTS and during a delayed branch (until the 3 instructions following a delayed branch are completed). Interrupts are held until after the branch.

- When an interrupt occurs, instructions currently in the decode and read phases continue regular execution. This is not the case for an instruction in the fetch phase:
  - If the interrupt occurs in the first cycle of the fetch of an instruction, the fetched instruction is discarded (not executed), and the address of that instruction is pushed to the top of the system stack.
  - If the interrupt occurs after the first cycle of the fetch (in the case of a multicycle fetch due to wait states), that instruction is executed, and the address of the next instruction to be fetched is pushed to the top of the system stack.
  - If no program fetch is occurring, then no new fetch is performed.

7.4.4 CPU Interrupt Latency

CPU interrupt latency, defined as the time from the acknowledgement of the interrupt to the execution of the first instruction of the interrupt service routine (ISR), is at least 8 cycles. This is explained in Table 7–2 where the interrupt is treated as an instruction, assuming that all the instructions are single-cycle instructions.
Interrupts

Table 7–2. Interrupt Latency

<table>
<thead>
<tr>
<th>Cycle</th>
<th>Description</th>
<th>Fetch</th>
<th>Decode</th>
<th>Read</th>
<th>Execute</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>Recognize interrupt in single-cycle fetched (prog a+1) instruction.</td>
<td>prog a+1</td>
<td>prog a</td>
<td>prog a−1</td>
<td>prog a−2</td>
</tr>
<tr>
<td>2</td>
<td>Temporarily disable interrupt until GIE is cleared. Clear the corresponding IIF flag (if applicable).</td>
<td>—</td>
<td>interrupt</td>
<td>prog a</td>
<td>prog a−1</td>
</tr>
<tr>
<td>3</td>
<td>Read the interrupt vector table.</td>
<td>—</td>
<td>—</td>
<td>interrupt</td>
<td>prog a</td>
</tr>
<tr>
<td>4</td>
<td>Store return address to stack; save the GIE bit into PGIE and CF into PCF. Then, clear the GIE bit and set the CF bit to 1.</td>
<td>—</td>
<td>—</td>
<td>—</td>
<td>interrupt</td>
</tr>
<tr>
<td>5</td>
<td>Pipeline begins to fill with ISR instruction.</td>
<td>isr1</td>
<td>—</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>6</td>
<td></td>
<td>isr2</td>
<td>isr1</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>7</td>
<td></td>
<td>isr3</td>
<td>isr2</td>
<td>isr1</td>
<td>—</td>
</tr>
<tr>
<td>8</td>
<td>Execute first instruction of interrupt service routine.</td>
<td>isr4</td>
<td>isr3</td>
<td>isr2</td>
<td>isr1</td>
</tr>
</tbody>
</table>

7.4.5 External Interrupts

The five external interrupt pins include four external maskable interrupt pins (IIOF0–IIOF3) and one nonmaskable interrupt (NMI) pin.

The four external maskable interrupts (IIOF0–IIOF3) are enabled at the IIF register (subsection 3.1.10 page 3-13) and are synchronized internally. They are sampled on the falling edge of H1 and passed through a series of H1/H3 delays internally. Once synchronized, the interrupt input will set the corresponding interrupt flag register (IIF) bit if the interrupt is active. The list below shows the external interrupts and their corresponding interrupt vectors:

<table>
<thead>
<tr>
<th>IIOF Pin and Interrupt</th>
<th>Interrupt Vector Location</th>
</tr>
</thead>
<tbody>
<tr>
<td>IIOF0</td>
<td>IVTP + 003h</td>
</tr>
<tr>
<td>IIOF1</td>
<td>IVTP + 004h</td>
</tr>
<tr>
<td>IIOF2</td>
<td>IVTP + 005h</td>
</tr>
<tr>
<td>IIOF3</td>
<td>IVTP + 006h</td>
</tr>
</tbody>
</table>

Program Flow Control 7-21
These interrupts are prioritized by the selection of one over the other if both come on the same clock cycle (II0F0 the highest, II0F1 next, etc.). When an interrupt is taken, the status register ST(GIE) bit is reset to 0, disabling any other incoming interrupt (except NMI). This prevents any other interrupt (II0F0–II0F3) from assuming program control until the ST(GIE) bit is set back to 1. In addition, the ST(GIE) bit is saved into ST(PGIE) and the ST(CF) bit into ST(PCF). On a return from an interrupt routine, the RETI and RETIcond instructions place the value that is in the ST(PGIE) bit into the ST(GIE) bit and ST(PCF) bit into the ST(CF) bit, returning them to their previous values.

External interrupts can be either edge- or level-triggered, depending on how the TYPE fields are set in the IIF register (see subsection 3.1.10, IIOF Flag Register (IIF), on page 3-13, for more information about the IIF).

For an edge-triggered interrupt to be detected by the 'C4x, the external pin must transition from 1 to 0. And then, it needs to be held low for at least one H1/H3 cycle (but it could be held low longer).

For a level-triggered interrupt to be detected by the 'C4x, the external pin needs to be held low for between one and two cycles (1 \leq \text{low-pulse width} \leq 2). If the interrupt is held low for more than two cycles, more than one interrupt might be recognized. There is no need to provide an edge in this case.

**Note:**

Level-triggered interrupts are unlatched. The 'C4x will only detect them if the low-level is present during a fetch-to-decode pipeline transition. This means that during a pipeline halt, the level-triggered interrupts might be missed even if they are held low between one and two cycles. This is not the case for an edge-triggered interrupt because they are latched (they will get recognized regardless if the pipeline is halted).

**NMI**

The nonmaskable interrupt, NMI (an incoming low on pin AJ5, signal NMI), is not masked by the ST(GIE) bit. Even though the NMI is nonmaskable, its processing is temporarily postponed during delayed branches and multicycle CPU operations. NMI is a negative-going, edge-triggered, latched interrupt.

Take special care when using an NMI as a second level interrupt. When the 'C4x services an interrupt, interrupts are disabled except for the NMI. This creates a problem because the ST register may end up with the wrong value if the NMI is executed before the first level ISR that preserves the ST register’s value.

The TMS320C44 and the TMS320C40 (revision 5.0 and greater) has a software-configurable feature that allows the forcing ready of the internal peripher-
Interrupts

Interrupts

al bus when the NMI signal is asserted. This NMI bus-grant feature is enabled when bits 18 and 19 in the status register (ST) are set to 10₂. When enabled, a peripheral bus-grant signal is generated on the falling edge of NMI. If NMI is asserted and this feature is not enabled, the CPU stalls on an access to the peripheral bus if the bus is not ready. A stall condition occurs when writing to a full output FIFO or reading an empty input FIFO. This feature is useful in correcting communication-port errors when used in conjunction with the communication-port software-reset feature.
Traps

7.5 Traps

A trap is the equivalent of a software-triggered interrupt. In the 'C4x, traps and interrupts are treated identically, except in the way in which they are initialized.

7.5.1 Initialization of Traps and Interrupts

Traps and interrupts are initialized differently in the 'C4x.

Traps are always triggered by a software mechanism, by the TRAPcond (conditional trap) and LATcond (link and trap conditionally delayed) instructions.

Interrupts are always triggered by hardware events (for example, by external interrupts, DMA interrupts, or communication channel interrupts).

These GIE bit in the ST register and the mask bits in the IIE do not apply to traps.

7.5.2 Operation of Traps

Figure 7–5 shows the general flow of traps (and also of interrupts).

Figure 7–5. Flow of Traps

![Diagram of Trap Flow]

The RETIcond and RETIcondD instructions manipulate the status flags as shown in block (3) in Figure 7–5. RETIcond/RETIcondD provides a return/delayed return from a trap or interrupt.
In general, you should not directly modify the PGIE or PCF status register bits except when putting the status register on a stack for recursive interrupts or traps.

The 'C4x supports 512 different traps. When a TRAPcond n or LATcond n instruction is executed, the 'C4x jumps to the address stored in the memory location pointed to by TVTP + n, where TVTP is the Trap Vector Table Pointer register. The 32-bit TVTP register is essentially the base address for the trap-vector table (TVT) in memory. This table, shown in Figure 7–6, contains the addresses of the trap service routines that are executed by the CPU.

Figure 7–6. Trap Vector Table (TVT)

<table>
<thead>
<tr>
<th>Address</th>
<th>Trap Service Routine</th>
</tr>
</thead>
<tbody>
<tr>
<td>TVTP + 000h</td>
<td>TRAP0</td>
</tr>
<tr>
<td>TVTP + 001h</td>
<td>TRAP1</td>
</tr>
<tr>
<td>TVTP + 1FEh</td>
<td>TRAP510</td>
</tr>
<tr>
<td>TVTP + 1FFh</td>
<td>TRAP511</td>
</tr>
</tbody>
</table>

As with the interrupt vector table (IVT), the trap vector table (TVT) must begin on a 512-word memory boundary. The TVT pointer register (TVTP) points to the beginning of the TVT. See Section 3.2, CPU Expansion Register File, on page 3-17, for more information about the TVTP.

The TRAP or LATcond instructions can be used to generate a trap and manipulate the status flags as shown in block (1) in Figure 7–5. LATcond (link and trap conditionally) provides a single-cycle trap that is very useful for error detection and correction.

Note:
Because LATcond is a delayed instruction, the three instructions following LATcond should not modify the GIE or CF status register bits (this could result in storing incorrect values of these two bits).

7.5.3 Overlapping the Trap and Interrupt Vector Tables

The interrupt and trap vector tables can share the same 512-byte space in memory. In this configuration, you can place trap vectors where there are no interrupt vectors. For example, since interrupt vector 02Ch is unused, you could place a trap vector at IVTP+02Ch (which is also TVTP+02Ch if the tables overlap) and then call that trap by specifying 02Ch in the TRAP instruction.
7.6 DMA Interrupts

Interrupts can trigger DMA read and write operations. This is called DMA synchronization. The DMA interrupt processing cycle is similar to that of the CPU. After the pertinent interrupt flag is cleared, the DMA coprocessor proceeds according to the status of the SYNC bits in the DMA coprocessor global control register.

If the interrupt in the DMA Interrupt Enable (DIE) register is enabled, the interrupt controller automatically latches the interrupt and saves it for future DMA use. In the case of the flag interrupts (timer, external interrupt), the IIF flags are cleared when the interrupt controller latches the interrupt, not when the DMA responds to it. Even if the DMA has not been started, the interrupt latch occurs, except when the start bits in the DMA control register have the reset value 00\textsubscript{2} in START (AUX START) bits. DMA reset clears the interrupt internal latch.

7.6.1 DMA Interrupt Control Bits

Two registers contain bits used to control DMA interrupt operation:

- DMA interrupt enable register (DIE). All DMA interrupts are controlled by bits in the DIE and by the SYNC bits of the DMA channel control registers (described in Figure 11–2). The DMA interrupts are not dependent upon ST(GIE) and are local to the DMA.

- The DMA channel control register. Each DMA coprocessor channel uses a channel control register to determine its mode of operation. This register is shown in Figure 11–2.

The DIE is broken into six subfields that determine which interrupts can be used to control the synchronization for each of the six DMA channels. For example, the bits in these each of these fields allow you to select whether a DMA channel is synchronized to a communication port, a timer, or an external interrupt pin.

See subsection 3.1.8, *DMA Coprocessor Interrupt Enable Register (DIE)*, on page 3-8, for a description of the DIE.
7.6.2 DMA Interrupt Processing

Figure 7–7 shows the general flow of interrupt processing by the DMA coprocessor.

*Figure 7–7. DMA Interrupt Processing*

For more information about DMA interrupts, see Section 11.10, *DMA and Interrupts.*
7.6.3 CPU/DMA Interrupt Interaction

The 'C4x DMA coprocessor is not affected by the processing of CPU interrupts, even when the DMA is using interrupts for synchronization of transfers. In addition, the DMA is be affected, even when pipeline fetches are halted.

The 'C4x allows the CPU and DMA controller to respond to and process interrupts in parallel. Figure 7–8 shows the sequence of events in interrupt processing for both the CPU and the DMA controller; for the exact sequence of events, see Table 7–2.

It is therefore possible to interrupt the CPU and DMA coprocessors simultaneously with the same or different interrupts and, in effect, synchronize their activities. However, because the DMA coprocessor and CPU share the same set of interrupt flags, in some instances the DMA coprocessor can clear an interrupt flag before the CPU can respond to it. For example, if CPU interrupts are disabled or if instruction fetches have been halted, the DMA can latch the interrupt and thus clear the associated interrupt flag. If the interrupt is enabled in the DIE register, the CPU will never be able to “steal” a DMA interrupt, because the DMA responds to an interrupt as fast as or faster than the CPU.

*Figure 7–8. Parallel CPU and DMA Interrupt Processing*

- Does GIE=1 and is the interrupt enabled in the IIE or IIF register?
  - No
  - Yes
    - Process the CPU interrupt as shown in Figure 7–4.

- Is the interrupt enabled in the DIE register?
  - No
  - Yes
    - Process the DMA interrupt as shown in Figure 7–7.
7.7 Reset

The 'C4x supports a nonmaskable external reset signal (RESET), which is used to perform system reset. This section discusses the reset operation.

After powerup, the state of the 'C4x processor is undefined. You can use the RESET signal to put the processor in a known state. This signal must be asserted low for 10 or more H1 clock cycles to guarantee a system reset (See Chapter 1, Processor Initialization, in the TMS320C4x General-Purpose Applications User’s Guide for the recommended reset circuit). H1 is an output clock signal generated by the 'C4x.

Reset affects several aspects of 'C4x operation:

- Some device pins
- Some device registers
- Program execution

7.7.1 Reset’s Effects on Pin States

Reset affects the other pins on the device in either a synchronous or an asynchronous manner. The synchronous reset is gated by the 'C4x's internal clocks. The asynchronous reset directly affects the pins and is faster than the synchronous reset. Reset timing details are included in the 'C4x data sheets.

Table 7–3 shows the state of the 'C4x’s pins during RESET = 0 and after RESET goes back to 1. Each pin is described according to whether the pin is reset synchronously or asynchronously.

### Table 7–3. Pin States At System Reset

(a) Clock (4 pins)

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pins</th>
<th>I/O</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>H1</td>
<td>1</td>
<td>O</td>
<td>S</td>
<td>Begins clocking when RESET makes a 1-to-0 transition</td>
</tr>
<tr>
<td>H3</td>
<td>1</td>
<td>O</td>
<td>S</td>
<td>Begins clocking when RESET makes a 1-to-0 transition</td>
</tr>
<tr>
<td>X1</td>
<td>1</td>
<td>O</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>X2/CLKIN</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
</tbody>
</table>

† A = Asynchronous, S = Synchronous
‡ Recommended decoupling capacitors are one multiple 0.1 μF and 4.7 μF around the device.
Number depends on specific board noise conditions.
§ I=Input, O=Output, Z=High-impedance state.
Table 7–3. Pin States After System Reset (Continued)

**Communication Port 0 Interface (12 pins)**

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pins</th>
<th>I/O</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>C0D(7–0)</td>
<td>8</td>
<td>I/O</td>
<td>S</td>
<td>Set to undefined value</td>
</tr>
<tr>
<td>CACK0</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set high-impedance when reset goes low and then set to one when reset goes high</td>
</tr>
<tr>
<td>CRDY0</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>CREQ0</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>CSTRB0</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set high-impedance when reset goes low and then set to one when reset goes high</td>
</tr>
</tbody>
</table>

**Communication Port 1 Interface (12 pins)**

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pins</th>
<th>I/O</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>C1D(7–0)</td>
<td>8</td>
<td>I/O</td>
<td>S</td>
<td>Set to undefined value</td>
</tr>
<tr>
<td>CACK1</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set high-impedance when reset goes low and then set to one when reset goes high</td>
</tr>
<tr>
<td>CRDY1</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>CREQ1</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>CSTRB1</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set high-impedance when reset goes low and then set to one when reset goes high</td>
</tr>
</tbody>
</table>

**Communication Port 2 Interface (12 pins)**

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pins</th>
<th>I/O</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>C2D(7–0)</td>
<td>8</td>
<td>I/O</td>
<td>S</td>
<td>Set to undefined value</td>
</tr>
<tr>
<td>CACK2</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set high-impedance when reset goes low and then set to one when reset goes high</td>
</tr>
<tr>
<td>CRDY2</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>CREQ2</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>CSTRB2</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set high-impedance when reset goes low and then set to one when reset goes high</td>
</tr>
</tbody>
</table>

† A = Asynchronous, S = Synchronous
‡ Recommended decoupling capacitors are one multiple 0.1 µF and 4.7 µF around the device.
  Number depends on specific board noise conditions.
§ I=Input, O=Output, Z=High-impedance state.
### Table 7–3. Pin States After System Reset (Continued)

#### (e) Communication Port 3 Interface (12 pins)

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pins</th>
<th>I/O</th>
<th>Type†</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>C3D(7–0)</td>
<td>8</td>
<td>I/O</td>
<td>S</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>CACK3</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>CRDY3</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set high-impedance when reset goes low and then set to one when reset goes high</td>
</tr>
<tr>
<td>CREQ3</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set high-impedance when reset goes low and then set to one when reset goes high</td>
</tr>
<tr>
<td>CSTRB3</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
</tbody>
</table>

#### (f) Communication Port 4 Interface (12 pins)

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pins</th>
<th>I/O</th>
<th>Type†</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>C4D(7–0)</td>
<td>8</td>
<td>I/O</td>
<td>S</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>CACK4</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>CRDY4</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set high-impedance when reset goes low and then set to one when reset goes high</td>
</tr>
<tr>
<td>CREQ4</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set high-impedance when reset goes low and then set to one when reset goes high</td>
</tr>
<tr>
<td>CSTRB4</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
</tbody>
</table>

#### (g) Communication Port 5 Interface (12 pins)

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pins</th>
<th>I/O</th>
<th>Type†</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>C5D(7–0)</td>
<td>8</td>
<td>I/O</td>
<td>S</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>CACK5</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>CRDY5</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set high-impedance when reset goes low and then set to one when reset goes high</td>
</tr>
<tr>
<td>CREQ5</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set high-impedance when reset goes low and then set to one when reset goes high</td>
</tr>
<tr>
<td>CSTRB5</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
</tbody>
</table>

† A = Asynchronous, S = Synchronous
‡ Recommended decoupling capacitors are one multiple 0.1 μF and 4.7 μF around the device. Number depends on specific board noise conditions.
§ I=Input, O=Output, Z=High-impedance state.
Table 7–3. Pin States After System Reset (Continued)

(h) Emulation (7 pins)

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pins</th>
<th>I/O§</th>
<th>Type†</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>EMU0</td>
<td>1</td>
<td>I/O</td>
<td>–</td>
<td>Undefined</td>
</tr>
<tr>
<td>EMU1</td>
<td>1</td>
<td>I/O</td>
<td>–</td>
<td>Undefined</td>
</tr>
<tr>
<td>TCK</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>TDI</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>TDO</td>
<td>1</td>
<td>O</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>TMS</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>TRST</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
</tbody>
</table>

(i) Global Bus External Interface (80 pins)

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pins</th>
<th>I/O§</th>
<th>Type†</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>A(30–0)</td>
<td>31</td>
<td>O/Z</td>
<td>S</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>AE</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>CE0</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>CE1</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>D(31–0)</td>
<td>32</td>
<td>I/O/Z</td>
<td>S</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>DE</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>LOCK</td>
<td>1</td>
<td>O</td>
<td>S</td>
<td>Set to one</td>
</tr>
<tr>
<td>PAGE0</td>
<td>1</td>
<td>O/Z</td>
<td>S</td>
<td>Set to zero</td>
</tr>
<tr>
<td>PAGE1</td>
<td>1</td>
<td>O/Z</td>
<td>S</td>
<td>Set to zero</td>
</tr>
<tr>
<td>RDY0</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>RDY1</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>R/W0</td>
<td>1</td>
<td>O/Z</td>
<td>S</td>
<td>Set to one</td>
</tr>
<tr>
<td>R/W1</td>
<td>1</td>
<td>O/Z</td>
<td>S</td>
<td>Set to one</td>
</tr>
<tr>
<td>STAT(3–0)</td>
<td>4</td>
<td>O</td>
<td>S</td>
<td>Set to all ones</td>
</tr>
<tr>
<td>STRB0</td>
<td>1</td>
<td>O/Z</td>
<td>S</td>
<td>Set to one</td>
</tr>
<tr>
<td>STRB1</td>
<td>1</td>
<td>O/Z</td>
<td>S</td>
<td>Set to one</td>
</tr>
</tbody>
</table>

† A = Asynchronous, S = Synchronous
‡ Recommended decoupling capacitors are multiple 0.1 µF and 4.7 µF around the device. Number depends on specific board noise conditions.
§ I=Input, O=Output, Z=High-impedance state.
Table 7–3. Pin States After System Reset (Continued)

(j) Local Bus External Interface (80 pins)

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pins</th>
<th>I/O</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LA(30–0)</td>
<td>31</td>
<td>O/Z</td>
<td>S</td>
<td>Placed in high-impedance state</td>
</tr>
<tr>
<td>LAE</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>Reset has no effect</td>
</tr>
<tr>
<td>LCE0</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>Reset has no effect</td>
</tr>
<tr>
<td>LCE1</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>Reset has no effect</td>
</tr>
<tr>
<td>LDE</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>Reset has no effect</td>
</tr>
<tr>
<td>LLOCK</td>
<td>1</td>
<td>O</td>
<td>S</td>
<td>Set to one</td>
</tr>
<tr>
<td>LPAGE0</td>
<td>1</td>
<td>O/Z</td>
<td>S</td>
<td>Set to zero</td>
</tr>
<tr>
<td>LPAGE1</td>
<td>1</td>
<td>O/Z</td>
<td>S</td>
<td>Set to zero</td>
</tr>
<tr>
<td>LRDY0</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>Reset has no effect</td>
</tr>
<tr>
<td>LRDY1</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>Reset has no effect</td>
</tr>
<tr>
<td>LR/W1</td>
<td>1</td>
<td>O/Z</td>
<td>S</td>
<td>Set to one</td>
</tr>
<tr>
<td>LSTAT(3–0)</td>
<td>4</td>
<td>O</td>
<td>S</td>
<td>Set to all ones</td>
</tr>
<tr>
<td>LSTRB0</td>
<td>1</td>
<td>O/Z</td>
<td>S</td>
<td>Set to one</td>
</tr>
<tr>
<td>LSTRB1</td>
<td>1</td>
<td>O/Z</td>
<td>S</td>
<td>Set to one</td>
</tr>
</tbody>
</table>

(k) Interrupts, I/O Flags, Reset, Timer (12 pins)

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pins</th>
<th>I/O</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>IACK</td>
<td>1</td>
<td>I</td>
<td>S</td>
<td>Set to one</td>
</tr>
<tr>
<td>TOF(0–3)</td>
<td>4</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>NMI</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>RESET</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>RESET input pin</td>
</tr>
<tr>
<td>RESETLOC(1,0)</td>
<td>2</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>ROMEN</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>No effect</td>
</tr>
<tr>
<td>LD(31–0)</td>
<td>32</td>
<td>I/O/Z</td>
<td>S</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>TCLK0</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
<tr>
<td>TCLK1</td>
<td>1</td>
<td>I/O</td>
<td>A</td>
<td>Set to high-impedance</td>
</tr>
</tbody>
</table>

† A = Asynchronous, S = Synchronous
‡ Recommended decoupling capacitors are one multiple 0.1 µF and 4.7 µF around the device.
§ Number depends on specific board noise conditions.
$\$$ I=Input, O=Output, Z=High-impedance state.
Table 7–3. Pin States After System Reset (Continued)

(l) Power (70 pins)

<table>
<thead>
<tr>
<th>Signal</th>
<th>Pins</th>
<th>I/O</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>SUBS</td>
<td>1</td>
<td>I</td>
<td>–</td>
<td>Substrate pin (tie to ground). Set to high-impedance.</td>
</tr>
<tr>
<td>VSSL</td>
<td>4</td>
<td>I</td>
<td>–</td>
<td>Ground pins. Set to high-impedance.</td>
</tr>
<tr>
<td>CVSS</td>
<td>15</td>
<td>I</td>
<td>–</td>
<td>Ground pins. Set to high-impedance.</td>
</tr>
<tr>
<td>DVSS</td>
<td>15</td>
<td>I</td>
<td>–</td>
<td>Ground pins. Set to high-impedance.</td>
</tr>
<tr>
<td>IVSS</td>
<td>6</td>
<td>I</td>
<td>–</td>
<td>Ground pins. Set to high-impedance.</td>
</tr>
<tr>
<td>DVDD</td>
<td>13</td>
<td>I</td>
<td>–</td>
<td>+$V_{DC}$ supply pins. Set to high-impedance.‡</td>
</tr>
<tr>
<td>GADVDD</td>
<td>3</td>
<td>I</td>
<td>–</td>
<td>+$V_{DC}$ supply pins. Set to high-impedance.‡</td>
</tr>
<tr>
<td>GDDVDD</td>
<td>3</td>
<td>I</td>
<td>–</td>
<td>+$V_{DC}$ supply pins. Set to high-impedance.‡</td>
</tr>
<tr>
<td>LADVDD</td>
<td>3</td>
<td>I</td>
<td>–</td>
<td>+$V_{DC}$ supply pins. Set to high-impedance.‡</td>
</tr>
<tr>
<td>LDDVDD</td>
<td>3</td>
<td>I</td>
<td>–</td>
<td>+$V_{DC}$ supply pins. Set to high-impedance.‡</td>
</tr>
<tr>
<td>VDDL</td>
<td>4</td>
<td>I</td>
<td>–</td>
<td>+$V_{DC}$ supply pins. Set to high-impedance.‡</td>
</tr>
</tbody>
</table>

† A = Asynchronous, S = Synchronous
‡ Recommended decoupling capacitors are one multiple 0.1 µF and 4.7 µF around the device. Number depends on specific board noise conditions.
§ I=Input, O=Output, Z=High-impedance state.
7.7.2 Reset Vector Location

When RESET is released, the 'C4x begins executing the application program. The initial address of the program is stored in the reset vector. The 'C4x permits selection of any one of four reset vector locations. Selection of the reset vector location that is used is determined by the levels on the RESETLOC1 and RESETLOC0 pins at reset. Table 7–4 shows the possible configurations of these pins.

Table 7–4. RESET Vector Locations

<table>
<thead>
<tr>
<th>Value at RESETLOCx Pin</th>
<th>Get Reset Vector From Hex Memory Address</th>
<th>Comment</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>00000 0000</td>
<td>Local Bus</td>
</tr>
<tr>
<td></td>
<td>07FFF FFFF†</td>
<td>Local Bus</td>
</tr>
<tr>
<td></td>
<td>08000 0000†</td>
<td>Global Bus</td>
</tr>
<tr>
<td></td>
<td>0FFFFFF FFFF†</td>
<td>Global Bus</td>
</tr>
</tbody>
</table>

† This corresponds to the 32-bit address that the processor accesses. However, in the 'C44 only the 24-LSBs of the reset address will be driven on pins A0–A23 or pins LA0–LA23. The corresponding LSTRBx pins will also be activated.

7.7.3 Additional Reset Operations

After system reset (after RESET goes back from 0 to 1), the following additional operations are performed:

- Timer registers are set.
  - The timer global control register is set to 0, except that bit DATIN is set to the value on pin TCLK.
  - The timer counter and timer period registers set to zeros.

- Control registers for communication ports 0–2 (subsection 12.3.1 on page 12-8) are set to zeros (output operation), and control registers for communication ports 3–5 are set to 04h (input operation).

- External memory interface control registers (Section 9.3 on page 9-6) are set to 3E39 FFF0h. (7 wait states)

- DMA channel control register, DMA transfer counter, and DMA auxiliary transfer counter (subsection 11.3.1 on page 11-7) are set to zeros.
The following CPU registers are loaded with zeros (each described in Chapter 3):

- IIE (CPU internal interrupt enable register)
- IIF (interrupt flag register)
- DIE (DMA internal enable register)
- IVTP (interrupt-vector table pointer)
- TVTP (trap-vector table pointer)

The CPU status register (ST) is set to 0400h, which puts the on-chip cache in cache freeze mode.

The reset vector is read from its location and loaded into the PC.

If ROMEN=1 (Internal ROM enabled), the RESETLOC(1,0) pins are low, and the II0F0 pin is high, the 'C4x will start execution of the bootloader code. Otherwise, the 'C4x will start execution of the routine which is pointed to by the reset vector corresponding to the RESETLOC(1,0) pins.

Multiple 'C4xs driven by the same system clock can be reset and synchronized. See Reset Signal Generation in the TMS320C4x General-Purpose User’s Guide for information about resetting multiple 'C4xs.
Two characteristics of the 'C4x that contribute to its high performance are pipelining and concurrent I/O and CPU operation.

Four functional units control 'C4x pipeline operation: fetch, decode, read, and execute. Pipelining is the overlapping or parallel operations of the fetch, decode, read, and execute levels of a basic instruction.

The DMA coprocessor decreases pipeline interference and enhances the CPU's computational throughput by performing input/output operations.

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>8.1 Pipeline Structure</td>
<td>8-2</td>
</tr>
<tr>
<td>8.2 Pipeline Conflicts</td>
<td>8-4</td>
</tr>
<tr>
<td>8.3 Memory Accesses for Maximum Performance</td>
<td>8-17</td>
</tr>
<tr>
<td>8.4 Clocking of Memory Accesses</td>
<td>8-19</td>
</tr>
</tbody>
</table>
8.1 Pipeline Structure

The four major units of the 'C4x pipeline structure and their functions are as follows:

- **Fetch Unit (F)**: Fetches the instruction words from memory and updates the program counter (PC).
- **Decode Unit (D)**: Decodes the instruction word and performs address generation. Also, controls modification of the ARn registers in the indirect addressing mode, and of the stack pointer when PUSH to/POP from the stack occurs.
- **Read Unit (R)**: If required, reads the operands from memory.
- **Execute Unit (E)**: If required, reads the operands from the register file, performs the necessary operation, and writes results to the register file. If required, results of previous operations are written to memory.

A basic instruction has four levels: fetch, decode, read, and execute. Figure 8–1 illustrates these four levels of the pipeline structure. The levels are indexed according to instruction and execution cycle. In the figure, perfect overlap in the pipeline, where all four units operate in parallel, occurs at cycle \( m \). Levels about to be executed are at \( m +1 \), and those just executed are at \( m–1 \). The 'C4x pipeline controller supports a high-speed processing rate of one execution per cycle. It also manages pipeline conflicts so that they are transparent to the user. You do not need to take any special precautions to guarantee correct operation.
**Figure 8–1. Pipeline Structure**

<table>
<thead>
<tr>
<th>CYCLE</th>
<th>Fetch</th>
<th>Decode</th>
<th>Read</th>
<th>Execute</th>
</tr>
</thead>
<tbody>
<tr>
<td>m–3</td>
<td>w</td>
<td>–</td>
<td>–</td>
<td>–</td>
</tr>
<tr>
<td>m–2</td>
<td>x</td>
<td>w</td>
<td>–</td>
<td>–</td>
</tr>
<tr>
<td>m–1</td>
<td>y</td>
<td>x</td>
<td>w</td>
<td>–</td>
</tr>
<tr>
<td>m</td>
<td>z</td>
<td>y</td>
<td>x</td>
<td>w</td>
</tr>
<tr>
<td>m+1</td>
<td>–</td>
<td>z</td>
<td>y</td>
<td>x</td>
</tr>
<tr>
<td>m+2</td>
<td>–</td>
<td>–</td>
<td>z</td>
<td>y</td>
</tr>
<tr>
<td>m+3</td>
<td>–</td>
<td>–</td>
<td>–</td>
<td>z</td>
</tr>
</tbody>
</table>

Notes:  
1) W, X, Y, and Z represent instructions.  
2) F, D, R, E = fetch, decode, read, and execute, respectively.

Priorities from highest to lowest have been assigned to each of the functional units of the pipeline and to the DMA controller as follows:

- DMA (if configured as highest priority)
- Execute
- Read
- Decode
- Fetch
- DMA (if configured as lowest priority).

When the processing of an instruction is ready to pass to the next higher pipeline level and that level is not ready to accept a new input, a pipeline conflict occurs. In this case, the lower priority unit waits until the higher priority unit completes its currently executing function.
8.2 Pipeline Conflicts

Pipeline conflicts in the 'C4x can be grouped into the following three main categories:

**Branch Conflicts**  
Involve most of those instructions or operations that read and/or modify the PC.

**Register Conflicts**  
Involve delays that can occur when reading from or writing to registers that are used for address generation, such as: AR0–AR7, IR0, IR1, BK, DP and SP.

**Memory Conflicts**  
Occur when the internal units of the 'C4x compete for memory resources.

Each of these three types is discussed in the following subsections. Examples are included. Note in these examples, when data is refetched or an operation is repeated, the symbol representing the stage of the pipeline is appended with a number. For example, if a fetch is performed again, the instruction mnemonic is repeated. The symbol RDY is used to indicate that a unit is not ready and the symbol RDY is used to indicate that a unit is ready.

8.2.1 Branch Conflicts

Branch conflicts involve most of the instructions or operations that read and/or modify the PC.

8.2.1.1 Standard Branches

Pipeline conflicts occur with standard (nondelayed) branches, i.e., BR, Bcond, DBcond, CALL, IDLE, RPTB, RPTS, RET1cond, RETScond, interrupts, and reset, because their execution is all the pipeline can handle. Other information fetched into the pipeline is discarded or refetched, or the pipeline becomes inactive; this is referred to as flushing the pipeline. Flushing the pipeline is necessary in these cases to prevent partial execution of succeeding instructions. The branches discussed here are loads; TRAP cond and CALLcond are treated as conditional stores and are shown in Example 8–13.

Example 8–1 shows the code and pipeline operation for a standard branch. Note that one dummy fetch is performed (MPYF instruction), and then after the branch address is available, a new fetch (OR instruction) is performed. This dummy fetch introduces the MPYF instruction into the cache.
Example 8–1. Standard Branch

```
BR THREE ; Unconditional branch
MPYF    ; Not executed
ADD     ; Not executed
SUBF    ; Not executed
AND     ; Not executed
.
.
.
THREE  OR ; Fetched after BR is taken
STI
.
.
```

PIPELINE OPERATION

<table>
<thead>
<tr>
<th>PC</th>
<th>Fetch</th>
<th>Decode</th>
<th>Read</th>
<th>Execute</th>
</tr>
</thead>
<tbody>
<tr>
<td>n</td>
<td>BR</td>
<td>—</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>n+1</td>
<td>MPYF</td>
<td>BR</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>n+1</td>
<td>(nop)</td>
<td>(nop)</td>
<td>BR</td>
<td>—</td>
</tr>
<tr>
<td>n+1</td>
<td>(nop)</td>
<td>(nop)</td>
<td>(nop)</td>
<td>BR</td>
</tr>
<tr>
<td>THREE</td>
<td>OR</td>
<td>(nop)</td>
<td>(nop)</td>
<td>(nop)</td>
</tr>
<tr>
<td></td>
<td>STI</td>
<td>OR</td>
<td>(nop)</td>
<td>(nop)</td>
</tr>
</tbody>
</table>

Note:

Both RPTS and RPTB flush the pipeline, allowing the RS, RE, and RC registers to be loaded at the proper time. If these registers are loaded without the use of RPTS or RPTB, no flushing of the pipeline occurs. Thus, RS, RE, and RC can be used as general-purpose 32-bit registers without pipeline conflicts. When RPTB is nested because of nested interrupts, it may be necessary to load and store these registers directly while using the repeat modes. Since up to four instructions can be fetched before the repeat mode is entered, loads should be followed by a branch to flush the pipeline. If the RC is changing when an instruction is loading it, the direct load takes priority over the modification made by the repeat mode logic.
8.2.1.2 Delayed Branches Without Annul Option

Delayed branches are implemented to assure that the next three instructions are fetched and executed. The delayed branches without annul option include BRD, BcondD, and DBcondD. Example 8–2 shows the code and pipeline operation for a delayed branch.

Example 8–2. Delayed Branch Without Annul Option

```
BRDTHREE ; Unconditional delayed branch
MPYF ; Executed
ADD ; Executed
SUBF ; Executed
AND ; Not executed
.
.
.
THREE MPYF ; Fetched after SUBF is fetched
.
.
.
```

**PIPELINE OPERATION**

<table>
<thead>
<tr>
<th>PC</th>
<th>Fetch</th>
<th>Decode</th>
<th>Read</th>
<th>Execute</th>
</tr>
</thead>
<tbody>
<tr>
<td>n</td>
<td>BRD</td>
<td>–</td>
<td>–</td>
<td>–</td>
</tr>
<tr>
<td>n+1</td>
<td>MPYF</td>
<td>BRD</td>
<td>–</td>
<td>–</td>
</tr>
<tr>
<td>n+2</td>
<td>ADDF</td>
<td>MPYF</td>
<td>BRD</td>
<td>–</td>
</tr>
<tr>
<td>n+3</td>
<td>SUBF</td>
<td>ADDF</td>
<td>MPYF</td>
<td>BRD</td>
</tr>
<tr>
<td>THREE</td>
<td>MPYF</td>
<td>SUBF</td>
<td>ADDF</td>
<td>MPYF</td>
</tr>
</tbody>
</table>

Pipeline Conflicts
8.2.1.3 Delayed Branches With Annul Option

The 'C4x supports delayed branches with an annulling option: BcondAT (branch conditional, annul if true) and BcondAF (branch conditional, annul if false). The true or false status of the condition controls whether or not a branch is performed (as in a delayed branch). The annulling operation cancels the effect of the execute phase of the first instruction and of the read and execute phases of the second and third instructions following the BcondAT or BcondAF.

- If the condition is true, BcondAT performs a branch, and the annulling operation takes place. Otherwise, the branch is not taken and the annulling operation does not take place.

- If the condition is false, BcondAF does not perform a branch, and the annulling operation takes place. Otherwise, the branch is taken and the annulling operation does not take place.

See subsection 7.2.2 for more information about delayed branches with annulling. Example 8–3 uses both BcondAT and BcondAF.

Example 8–3. Using BcondAF and BcondAT Instructions

```
LDI  *AR1,R0  ; If negative, branch and
BNAT bottom  ; annul the execute phase
ADDI *++AR2,R3  ; of ADDI, MPYF, and NOT.
MPYF      ; Otherwise, don’t annul and
NOT       ; continue with SUBF.
top: SUBF  ;
.
.
SUBI 1,R0  ; If not negative, branch and
BNNAF top  ; do not annul the execute
ADDI *++AR2,R3  ; phase of ADDI, MPYF, and
MPYF      ; NOT. Otherwise, annul ADDI, NOT. Otherwise, annul ADDI,
bottom:XOR  ; MPYF, and NOT, and continue
            ; with XOR.
```

At the start of Example 8–3, if the result of the load is negative (a true condition), the BcondAT instruction causes a branch and also annuls the execute phase of the three instructions that follow it. As a result, the execute phase of the ADDI instruction does not occur, and register R3 is not updated by addition. However, AR2 is incremented, and data at the corresponding address is read because these operations are in the decode and read phases of the pipeline, respectively, and thus cannot be annulled.
Two types of operations can be annulled:

- All writes to the register file that occur in the execute phase (ADDs, LDs, etc., but not LDA, LDPK, etc.)
- All stores to memory

### 8.2.2 Register Conflicts

A register conflict occurs if you read from or write to a register used for addressing purposes (AR0–AR7, IR0, IR1, BK, DP, and SP) when the register is not ready to be used. For example, if an instruction writes to one of these registers, the decode unit cannot use that same register until the write is complete (which occurs in the execute stage).

In Example 8–4, an auxiliary register is loaded, and the same auxiliary register is used on the next instruction. Since the decode stage needs the result of the write to the auxiliary register, the decode of this second instruction is delayed two cycles. Every time the decode is delayed, a refetch of the program word is performed; i.e., ADDF is fetched three times. Because these are actual refetches, they can cause not only conflicts with the DMA controller, but also cache hits and misses. If the AR register used in the MPYF instruction were different from the one used in the LDI instruction, no delay would occur.

**Example 8–4. Write to an AR Followed by an AR for Address Generation**

<table>
<thead>
<tr>
<th></th>
<th>LDI 7, AR2</th>
<th>; 7 → AR2</th>
</tr>
</thead>
<tbody>
<tr>
<td>NEXT</td>
<td>MPYF *AR2, R0</td>
<td>; Decode delayed 2 cycles</td>
</tr>
<tr>
<td></td>
<td>ADDF</td>
<td>FLOAT</td>
</tr>
</tbody>
</table>

#### PIPELINE OPERATION

<table>
<thead>
<tr>
<th>PC</th>
<th>Fetch</th>
<th>Decode</th>
<th>Read</th>
<th>Execute</th>
</tr>
</thead>
<tbody>
<tr>
<td>n</td>
<td>LDI</td>
<td>–</td>
<td>–</td>
<td>–</td>
</tr>
<tr>
<td>n+1</td>
<td>MPYF</td>
<td>LDI</td>
<td>–</td>
<td>–</td>
</tr>
<tr>
<td>n+2</td>
<td>ADDF</td>
<td>MPYF</td>
<td>LDI</td>
<td>–</td>
</tr>
<tr>
<td>n+2</td>
<td>ADDF</td>
<td>MPYF</td>
<td>(nop)</td>
<td>LDI 7, AR2</td>
</tr>
<tr>
<td>n+2</td>
<td>ADDF</td>
<td>MPYF</td>
<td>(nop)</td>
<td>(nop)</td>
</tr>
<tr>
<td>n+2</td>
<td>FLOAT</td>
<td>ADDF</td>
<td>MPYF</td>
<td>(nop)</td>
</tr>
</tbody>
</table>
Conflicts involving reads are similar to those involving writes. If an instruction must read registers AR0–AR7 or SP, the use of those particular registers by the decode stage for the following instruction is delayed until the read is complete. The registers are read at the start of the execute cycle and therefore require only a one-cycle delay of the following decode. For four registers (IR0, IR1, BK, or DP), no delay is incurred upon a read.

In Example 8–5, two auxiliary registers are added together with the result going to an extended-precision register. The next instruction uses one of the same auxiliary registers as an address register. If the MPYF instruction used an AR register other than AR0 or AR2, no delay would occur.

**Example 8–5. A Read of ARs Followed by ARs for Address Generation**

<table>
<thead>
<tr>
<th></th>
<th>ADDI</th>
<th>AR0,AR2,R1 ; AR0 + AR2 → R1</th>
</tr>
</thead>
<tbody>
<tr>
<td>NEXT</td>
<td>MPYF</td>
<td>***AR2,R0 ; Decode delayed 1 cycle</td>
</tr>
<tr>
<td></td>
<td>ADDF</td>
<td></td>
</tr>
<tr>
<td></td>
<td>FLOAT</td>
<td></td>
</tr>
</tbody>
</table>

![Pipeline Operation Diagram]

---

**Note:**

The DBR (decrement and branch) instruction’s use of auxiliary registers for loop counters is treated the same as if the use were for addressing. Therefore, the operation shown in the two previous examples can also occur for this instruction.
8.2.3 Memory Conflicts

Memory conflicts can occur when the memory bandwidth of a physical memory space is exceeded. RAM blocks 0 and 1 and the ROM block can support only two accesses per cycle. The external interface can support only one access per cycle. Some conditions under which memory conflicts can be avoided are discussed in Section 8.3, on page 8-17.

Memory pipeline conflicts consist of the following four types:

- **Program Wait**: A program fetch is prevented from beginning.
- **Program Fetch Incomplete**: A program fetch has begun but is not yet complete.
- **Execute Only**: An instruction sequence requires three CPU data accesses in a single cycle.
- **Hold Everything**: A global or local bus operation must complete before another one can proceed.

These four types of memory conflicts are illustrated in examples and discussed in the paragraphs that follow.

8.2.3.1 Program Wait

Two conditions can delay an instruction fetch:

- Too many accesses to the same memory at the start of a CPU data access can occur in two cases:
  - Two CPU data accesses are made to an internal RAM or ROM block, and a program fetch from the same block is necessary.
  - One of the external ports is starting a CPU data access, and a program fetch from the same port is necessary.

- A multicycle CPU data access or DMA data access over the external bus is needed.
Example 8–6 illustrates a program wait until a CPU data access completes. In this case, *AR0 and *AR1 are both pointing to data in RAM block 0, and the MPYF instruction will be fetched from RAM block 0. This results in the conflict shown. Since no more than two accesses can be made to RAM block 0 in a single cycle, the program fetch cannot begin and must wait until the CPU data accesses are complete.

**Example 8–6. Program Wait Until CPU Data Access Completes**

```
ADDF3 *AR0,*AR1,R0
FIX
MPYF
ADDF3
NEGB
```

![Pipeline Operation Diagram]

Example 8–7 shows a program wait due to a multicycle data-data access or a multicycle DMA access. The ADDF, MPYF, and SUBF are fetched from some portion in memory other than the external port the DMA requires. The DMA begins a multicycle access. The program fetch corresponding to the CALL is made to the same external port that the DMA is using.

Even if the DMA is configured as the lowest priority, a multicycle access cannot be aborted. The program fetch must therefore wait until the DMA access completes.
Pipeline Conflicts

Example 8–7. Program Wait Due to Multicycle Access

<table>
<thead>
<tr>
<th>PC</th>
<th>Fetch</th>
<th>Decode</th>
<th>Read</th>
<th>Execute</th>
</tr>
</thead>
<tbody>
<tr>
<td>n</td>
<td>ADDF</td>
<td>–</td>
<td>–</td>
<td>–</td>
</tr>
<tr>
<td>n+1</td>
<td>MPYF</td>
<td>ADDF</td>
<td>–</td>
<td>–</td>
</tr>
<tr>
<td>n+2</td>
<td>SUBF</td>
<td>MPYF</td>
<td>ADDF</td>
<td>–</td>
</tr>
<tr>
<td></td>
<td>(wait)</td>
<td>SUBF</td>
<td>MPYF</td>
<td>ADDF</td>
</tr>
<tr>
<td>n+3</td>
<td>CALL</td>
<td>(nop)</td>
<td>SUBF</td>
<td>MPYF</td>
</tr>
<tr>
<td>n+4</td>
<td>–</td>
<td>CALL</td>
<td>(nop)</td>
<td>SUBF</td>
</tr>
</tbody>
</table>

8.2.3.2 Program Fetch Incomplete

A program fetch incomplete occurs when an instruction fetch takes more than one cycle to complete because of wait states. In Example 8–8, the MPYF and ADDF are fetched from memory that supports single-cycle accesses. The SUBF is fetched from memory requiring one wait state. One example that demonstrates this conflict is a fetch across a bank boundary on the external port.

Example 8–8. Multicycle Program Memory Fetches

<table>
<thead>
<tr>
<th>PC</th>
<th>Fetch</th>
<th>Decode</th>
<th>Read</th>
<th>Execute</th>
</tr>
</thead>
<tbody>
<tr>
<td>n</td>
<td>MPYF</td>
<td>–</td>
<td>–</td>
<td>–</td>
</tr>
<tr>
<td>n+1</td>
<td>ADDF</td>
<td>MPYF</td>
<td>–</td>
<td>–</td>
</tr>
<tr>
<td>n+2</td>
<td>SUBF</td>
<td>ADDF</td>
<td>MPYF</td>
<td>–</td>
</tr>
<tr>
<td></td>
<td>(nop)</td>
<td>ADDF</td>
<td>MPYF</td>
<td>–</td>
</tr>
<tr>
<td>n+3</td>
<td>ADDI</td>
<td>SUBF</td>
<td>(nop)</td>
<td>ADDF</td>
</tr>
</tbody>
</table>
8.2.3.3 Execute Only

The Execute Only type of memory pipeline conflict occurs when a sequence of instructions requires three CPU data accesses in a single cycle. There are two cases in which this occurs:

- An instruction performs a store and is followed by an instruction that performs two memory reads.
- An instruction performs two stores and is followed by an instruction that performs at least one memory read.

The first case is shown in Example 8–9. Since this sequence requires three data memory accesses and only two are available, only the execute phase of the pipeline is allowed to proceed. The dual reads required by the LDF || LDF are delayed one cycle. Note that in this case a refetch of the next instruction can occur, which could cause an additional access to memory.

Example 8–9. Single Store Followed by Two Reads

```
STFR0,*AR1 ; R0 → *AR1
LDF*AR2,R1 ; *AR2 → R1 in parallel with
|| LDF*AR3,R2 ; *AR3 → R2
```

**PIPELINE OPERATION**

<table>
<thead>
<tr>
<th>PC</th>
<th>Fetch</th>
<th>Decode</th>
<th>Read</th>
<th>Execute</th>
</tr>
</thead>
<tbody>
<tr>
<td>n</td>
<td>STF</td>
<td>–</td>
<td>–</td>
<td>–</td>
</tr>
<tr>
<td>n+1</td>
<td>LDF</td>
<td></td>
<td>LDF</td>
<td>STF</td>
</tr>
<tr>
<td>n+2</td>
<td>W</td>
<td>LDF</td>
<td></td>
<td>LDF</td>
</tr>
<tr>
<td>n+3</td>
<td>X</td>
<td>W</td>
<td>LDF</td>
<td></td>
</tr>
<tr>
<td>n+4</td>
<td>X</td>
<td>W</td>
<td>LDF</td>
<td></td>
</tr>
<tr>
<td>n+4</td>
<td>Y</td>
<td>X</td>
<td>W</td>
<td>LDF</td>
</tr>
</tbody>
</table>
Pipeline Conflicts

Example 8–10 shows a parallel store followed by a single load or read. Since two parallel stores are required, the next CPU data memory read must wait one cycle before beginning. One program memory refetch may occur.

Example 8–10. Parallel Store Followed by Single Read

<table>
<thead>
<tr>
<th></th>
<th>STF R0,*AR0 ; R0 → *AR0 in parallel with</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>ADDF @SUM,R1 ; R1 + @SUM → R1</td>
</tr>
<tr>
<td></td>
<td>IACK</td>
</tr>
<tr>
<td></td>
<td>ASH</td>
</tr>
</tbody>
</table>

PIPELINE OPERATION

<table>
<thead>
<tr>
<th>PC</th>
<th>Fetch</th>
<th>Decode</th>
<th>Read</th>
<th>Execute</th>
</tr>
</thead>
<tbody>
<tr>
<td>n</td>
<td>STF</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>n+1</td>
<td>ADDF</td>
<td>STF</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>n+2</td>
<td>IACK</td>
<td>ADDF</td>
<td>STF</td>
<td></td>
</tr>
<tr>
<td>n+3</td>
<td>ASH</td>
<td>IACK</td>
<td>ADDF</td>
<td>STF</td>
</tr>
<tr>
<td>n+4</td>
<td>ASH</td>
<td>IACK</td>
<td>ADDF</td>
<td>(nop)</td>
</tr>
<tr>
<td>n+4</td>
<td>-</td>
<td>ASH</td>
<td>IACK</td>
<td>ADDF</td>
</tr>
</tbody>
</table>
8.2.3.4 **Hold Everything**

Three types of conditions cause Hold Everything memory pipeline conflicts:

- A CPU data load or store cannot be performed because an external port is busy.
- An external load takes more than one cycle.
- The execution of conditional calls and traps, which take one more cycle than conditional branches.

The first type of Hold Everything conflict occurs when one of the external ports is busy because of an access that has started but is not complete. In Example 8–11, the first store is a two-cycle store. The CPU writes the data to an external port. The port control then takes two cycles to complete the data-write. The LDF is a read over the same external port. Since the store is not complete, the CPU continues to attempt processing the LDF until the port is available.

**Example 8–11. Busy External Port**

```
STF R0, @DMA1
LDF @DMA2, R0
```

<table>
<thead>
<tr>
<th>PIPELINE OPERATION</th>
</tr>
</thead>
<tbody>
<tr>
<td>PC</td>
</tr>
<tr>
<td>n</td>
</tr>
<tr>
<td>n+1</td>
</tr>
<tr>
<td>n+2</td>
</tr>
<tr>
<td>n+2</td>
</tr>
<tr>
<td>n+2</td>
</tr>
<tr>
<td>n+3</td>
</tr>
<tr>
<td>n+4</td>
</tr>
</tbody>
</table>

The second type of Hold Everything conflict involves multicycle data reads. In this case, the read has begun and continues until completed. In Example 8–12, the LDF is performed from an external memory that requires several cycles to access.
Example 8–12. Multicycle Data Reads

<table>
<thead>
<tr>
<th>PC</th>
<th>Fetch</th>
<th>Decode</th>
<th>Read</th>
<th>Execute</th>
</tr>
</thead>
<tbody>
<tr>
<td>n</td>
<td>LDF</td>
<td>—</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>n+1</td>
<td>I</td>
<td>LDF</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>n+2</td>
<td>J</td>
<td>I</td>
<td>LDF</td>
<td>—</td>
</tr>
<tr>
<td>n+3</td>
<td>K\text{ (dummy)}</td>
<td>I</td>
<td>LDF</td>
<td>—</td>
</tr>
<tr>
<td>n+3</td>
<td>K_2</td>
<td>J</td>
<td>I</td>
<td>LDF</td>
</tr>
</tbody>
</table>

The final type of Hold Everything conflict deals with conditional calls (CALL\text{cond}) and traps (TRAP\text{cond}), which are different from other branch instructions. Whereas other branch instructions are conditional loads, the conditional calls and traps are conditional stores, which take one more cycle to complete than conditional branches (see Example 8–13). The added cycle pushes the return address after the call condition is evaluated.

Example 8–13. Conditional Calls and Traps

<table>
<thead>
<tr>
<th>PC</th>
<th>Fetch</th>
<th>Decode</th>
<th>Read</th>
<th>Execute</th>
</tr>
</thead>
<tbody>
<tr>
<td>n</td>
<td>CALL\text{cond}</td>
<td>—</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>n+1</td>
<td>I</td>
<td>CALL\text{cond}</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>n+1</td>
<td>(nop)</td>
<td>(nop)</td>
<td>CALL\text{cond}</td>
<td>—</td>
</tr>
<tr>
<td>n+1</td>
<td>(nop)</td>
<td>(nop)</td>
<td>(nop)</td>
<td>CALL\text{cond}</td>
</tr>
<tr>
<td>n+1</td>
<td>(nop)</td>
<td>(nop)</td>
<td>(nop)</td>
<td>CALL\text{cond}</td>
</tr>
<tr>
<td>n+2/CALL\text{addr}</td>
<td>I</td>
<td>(ndp)</td>
<td>(nop)</td>
<td>(nop)</td>
</tr>
</tbody>
</table>
8.3 Memory Accesses for Maximum Performance

If program fetches and data accesses are performed in such a manner that the resources being used cannot provide the necessary bandwidth, the pipeline is stalled until the accesses are complete. Certain configurations of program fetch and data accesses yield conditions under which the 'C4x can achieve maximum throughput.

Table 8–1 shows how many accesses can be performed from the different memory spaces when it is necessary to do a program fetch and a single data access, and still achieve maximum performance (one cycle). Four cases achieve one-cycle maximization.

<table>
<thead>
<tr>
<th>Case No.</th>
<th>Global Bus Accesses</th>
<th>Accesses From Dual-Access Internal Memory</th>
<th>Local Bus Or Peripheral Accesses</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>—</td>
</tr>
<tr>
<td>2</td>
<td>1</td>
<td>—</td>
<td>1</td>
</tr>
<tr>
<td>3</td>
<td>—</td>
<td>2 from any combination of internal memory</td>
<td>—</td>
</tr>
<tr>
<td>4</td>
<td>—</td>
<td>1</td>
<td>1</td>
</tr>
</tbody>
</table>
Table 8–2 shows how many accesses can be performed from the different memory spaces when it is necessary to do a program fetch and two data accesses, still achieving maximum performance (one cycle). Six cases achieve this maximization.

Table 8–2. One Program Fetch and Two Data Accesses for Maximum Performance

<table>
<thead>
<tr>
<th>Case No.</th>
<th>Global Bus Accesses</th>
<th>Accesses From Dual-Access Internal Memory</th>
<th>Local Or Peripheral Bus Accesses</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>2 from any combination of internal memory</td>
<td>—</td>
</tr>
<tr>
<td>2†</td>
<td>1 program</td>
<td>1 data</td>
<td>1 data</td>
</tr>
<tr>
<td>3†</td>
<td>1 data</td>
<td>1 data</td>
<td>1 program</td>
</tr>
<tr>
<td>4</td>
<td>1 data</td>
<td>1 program, 1 data</td>
<td>1 DMA</td>
</tr>
<tr>
<td>5</td>
<td>—</td>
<td>2 from same internal memory block and 1 from a different internal memory block</td>
<td>—</td>
</tr>
<tr>
<td>6</td>
<td>—</td>
<td>3 from different internal memory blocks</td>
<td>1 DMA</td>
</tr>
<tr>
<td>7</td>
<td>—</td>
<td>2 from any combination of internal memory</td>
<td>1</td>
</tr>
<tr>
<td>8</td>
<td>1 program</td>
<td>2 data</td>
<td>1 DMA</td>
</tr>
<tr>
<td>9</td>
<td>1 DMA</td>
<td>2 data</td>
<td>1 program</td>
</tr>
</tbody>
</table>

† For Cases 2 and 3, see Three-Operand Instruction Memory Reads on page 8-20.
### 8.4 Clocking of Memory Accesses

This section discusses the role of internal clock phases (H1 and H3) in the way the 'C4x handles multiple memory accesses. Whereas the previous section discussed the interaction between sequences of instructions, this section discusses the flow of data on an individual instruction basis.

Each major clock period of 40 ns is composed of two minor clock periods of 20 ns, labeled H3 and H1 (these times assume a 50-MHz 'C40). The active clock period for H3 and H1 is the time when that signal is high.

![Diagram of clock phases H1 and H3](image)

The precise operation of memory reads and writes can be defined according to these minor clock periods. The types of memory operations that can occur are program fetches, data loads and stores, and DMA accesses. Internal DMA data accesses always start during the H3 cycle.

#### 8.4.1 Program Fetches

Internal program fetches are always performed during H3 unless a single data store must occur at the same time because of another instruction in the pipeline. In that case, the program fetch occurs during H1 and the data store occurs during H3.

External program fetches always start at the beginning of H3 with the address being presented on the external bus. At the end of H1, the fetches are completed with the latching of the instruction word.
8.4.2 Data Loads and Stores

Four types of instructions perform loads, memory reads, and stores: two-operand instructions, three-operand instructions, multiplier/ALU operation with store instructions, and parallel multiply and add instructions. See Chapter 6 for detailed information on addressing modes.

As discussed in Chapter 9, the number of bus cycles for external memory accesses differs in some cases from the number of CPU execution cycles. For external reads, the number of bus cycles and CPU execution cycles is identical. For external writes, there are always at least two bus cycles, but unless there is a port access conflict, there is only one CPU execution cycle. In the following examples, any difference in the number of bus cycles and CPU cycles is noted.

8.4.2.1 Two-Operand Instruction Memory Accesses

Figure 8–2. Two-Operand Instruction Word

Two-operand instructions include all those instructions with bits 31–29 being 000 or 010 (see Figure 8–2). In the case of a data read, bits 15–0 represent the src operand. Internal data reads are always performed during H1. External data reads always start at the beginning of H3 with the address presented on the external bus, and they complete with the latching of the data word at the end of H1.

In the case of a data store, bits 15–0 represent the dst operand. Internal data stores are performed during H3. External data stores always start at the beginning of H3 with the address and data presented on the external bus.

8.4.2.2 Three-Operand Instruction Memory Reads

Figure 8–3. Three-Operand Instruction Word

Three-operand instructions include all instructions with bits 31–29 being 001 (see Figure 8–3). The source operands, src1 and src2, come from either registers or memory. When one or more of the source operands are from memory, these instructions are always memory reads.

If only one of the source operands is from memory (either src1 or src2) and is located in internal memory, the data is read during H1. If the single memory
source operand is in external memory, the read starts at the beginning of H3, with the address presented on the external bus, and completes with the latching of the data word at the end of H1.

If both source operands are to be fetched from memory, then memory reads can occur in several ways:

- If both operands are located in internal memory, the src1 read is performed during H3 and the src2 read during H1, thus completing two memory reads in a single cycle.
- If src1 is in internal memory and src2 is in external memory, the src2 access begins at the start of H3 and latches at the end of H1. At the same time, the src1 access to internal memory is performed during H3. Again, two memory reads are completed in a single cycle.
- If src1 is in external memory and src2 is in internal memory, two cycles are necessary to complete the two reads. In the first cycle, the internal src2 access is performed. The src1 is also performed, but not latched until the next H3.
- If src1 and src2 are both from external memory, two cycles are required to complete the two reads. In the first cycle, the src1 access is performed and loaded on the next H3; in the second cycle, the src2 access is performed and loaded on that cycle’s H1.

### 8.4.2.3 Operations with Parallel Stores

**Figure 8–4. Multiply or CPU Operation With a Parallel Store**

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>dst1</td>
<td>src1</td>
<td>src3</td>
<td>dst2</td>
<td></td>
<td>src2</td>
</tr>
</tbody>
</table>

The next class of instructions includes all instructions that have a store in parallel with another instruction. Bits 31 and 30 for these instructions are equal to \(11_2\).

For operations that perform a multiply or ALU operation in parallel with a store, the instruction word format is shown in Figure 8–4. If the store operation to dst2 is external or internal, it is performed during H3. Two bus cycles are required for external stores, but only one CPU cycle is necessary to complete the write.

If the memory read operation is external, it starts at the beginning of H3 and latches at the end of H1. If the memory read operation is internal, it is performed during H1. Note that memory reads are performed by the CPU during the read (R) phase of the pipeline, and stores are performed during the execute (E) phase.
Clocking of Memory Accesses

The instruction word format for instructions that have parallel stores to memory is shown in Figure 8–5. If both destination operands, dst1 and dst2, are located in internal memory, dst1 is stored during H3 and dst2 during H1, thus completing two memory stores in a single cycle.

Figure 8–5. Two Parallel Stores

|   | 1 1 | ST||ST | src2 | 0 0 | src1 | dst1 | dst2 |
|---|-----|-----|-----|-----|-----|-----|------|------|

If dst1 is in external memory and dst2 is in internal memory, the dst1 store begins at the start of H3. The dst2 store to internal memory is performed during H1. Two bus cycles are required for the external store, but only one CPU cycle is necessary to complete the write. Again, two memory stores are completed in a single cycle.

If dst1 is in internal memory and dst2 is in external memory, an additional bus cycle is necessary to complete the dst2 store. Only one CPU cycle is necessary to complete the write, but the port access requires three bus cycles. In the first cycle, the internal dst1 store is performed during H3, and dst2 is written to the port during H1. During the next cycle, the dst2 store is performed on the external bus, beginning in H3, and executes as normal through the following cycle.

If dst1 and dst2 are both written to external memory, a single CPU cycle is still all that is necessary to complete the stores. In this case, four bus cycles are required.

1) In the first cycle, both dst1 and dst2 are written to the port, and the external bus access for dst1 begins.

2) The store for dst1 is completed on the second cycle.

3) The store for dst2 begins on the third external bus cycle.

4) Finally, the store for dst2 is completed on the fourth external bus cycle.
8.4.2.4 Parallel Multiplies and Adds

Memory addressing for parallel multiplies and adds is similar to that for three-operand instructions. The parallel multiplies and adds include all instructions with bits 31–30 equal to 10₂ (see Figure 8–6).

For these operations, src3 and src4 are both located in memory. If both operands are located in internal memory, src3 is performed during H3, and src4 is performed during H1, thus completing two memory reads in a single cycle.

If src3 is in internal memory and src4 is in external memory, the src4 access begins at the start of H3 and latches at the end of H1. At the same time, the src3 access to internal memory is performed during H3. Again, two memory reads are completed in a single cycle.

If src3 is in external memory and src4 is in internal memory, two cycles are necessary to complete the two reads. In the first cycle, the internal src4 access is performed. During the H3 of the next cycle, the src3 access is performed.

If src3 and src4 are both from external memory, two cycles are necessary to complete the two reads. In the first cycle, the src3 access is performed; in the second cycle, the src4 access is performed.
The 'C4x has two identical external bus interfaces. One bus is called the global memory interface and the other bus is called the local memory interface. These buses are designed to allow higher throughput by permitting simultaneous loads and stores to different external memories.

The information in this chapter applies to both the global memory interface and the local memory interface; however, in some sections, only the global memory interface is shown. Examples of memory interfacing are provided in the TMS320C4x General-Purpose Applications User's Guide.

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>9.1 Overview</td>
<td>9-2</td>
</tr>
<tr>
<td>9.2 Memory Interface Signals</td>
<td>9-3</td>
</tr>
<tr>
<td>9.3 Memory Interface Control Registers</td>
<td>9-6</td>
</tr>
<tr>
<td>9.4 Programmable Wait States</td>
<td>9-14</td>
</tr>
<tr>
<td>9.5 Memory Interface Timing</td>
<td>9-16</td>
</tr>
<tr>
<td>9.6 Using Enable Signals to Control Signal Groups</td>
<td>9-38</td>
</tr>
<tr>
<td>9.7 Interlocked Operations</td>
<td>9-39</td>
</tr>
<tr>
<td>9.8 IACK Timing</td>
<td>9-49</td>
</tr>
</tbody>
</table>
9.1 Overview

The ’C4x has two identical parallel external interfaces: the global memory interface and the local memory interface. Each interface has the following features:

- Separate configurations, each with its own 32-bit data bus and 31-bit address bus (24 pin address bus in the ’C44)
- Single-cycle reads and pipelined writes
- Independent enable signals for data, address, and control lines
- Bus-request and bus-lock signaling for shared memory parallel processing
- User-controlled mapping of addresses to either of two sets of independent strobes for different speed memories
- Look-ahead bus status signals for defining current and requested bus operations for parallel processing arbitration
- Selectable wait states (both software- and hardware-controlled)
- Signals that indicate when memory-page boundaries are crossed.

Note:
The global-memory interface is identical in every way to the local memory interface except that (1) they have different positions in the memory map, and (2) the control signals for the local memory interface are labeled an additional “L” prefix (as described in Figure 9–1 on page 9-3).

Throughout this chapter, no distinction is made between global and local interface signals and between STRB0 and STRB1, except for clarity.

The signals that indicate when memory-page boundaries are crossed support three main types of memory:

- page-mode and static-column decode DRAMs
- high-speed SRAM banks
- slow speed memory banks and I/O devices
9.2 Memory Interface Signals

As shown in Figure 9–1, the global-memory interface has two sets of control signals, STRB0 and STRB1. The global-memory port control-registers (Section 9.3 on page 9-6) define which set of registers is active.

Figure 9–1. Global and Local Memory Interface Control Signals

![Diagram of control signals]

Note: The signals used in this figure are for the global-memory interface. The local-memory interface signals have the same configuration and an additional “L” prefix is added for each signal (for example, STRB0 becomes LSTRB0, etc.).
**Memory Interface Signals**

**Table 9–1. Global Memory Interface Signals**

| Signal† | Type $|$ Description                                                                 | Value After Reset | Idle Status||
|---------|------------------------------------|-------------------|-------------|
| AE $|$ I | Address bus enable signal for global-memory interface. When high (set to 1), places address lines A30–0 in the high-impedance state. | N.A.# ignored    |
| CE(0,1) $|$ I | Control signal enable for R/Wx, STRBx, and PAGEx signals. When high (set to 1), it places the corresponding R/Wx, STRBx, and PAGEx signals in high-impedance state (x = 0 for CE0 and x = 1 for CE1). | N.A. ignored |
| DE $|$ I | Data bus enable signal for global memory interface. When high (set to 1), places data lines D31–0 in the high-impedance state. Reads can still occur but writes cannot. | N.A. ignored |
| LOCK $|$ O | Lock signal for global bus interface. Indicates whether an interlocked access is underway (0 = access underway; 1 = access not underway). LOCK is changed only by the interlocked instructions. | 1 1 |
| PAGE(0,1) O/Z | Memory-page enable signal for STRB(0,1) accesses | 0 0 |
| RDY(0,1) I | Indicates external memory is ready to be accessed | N.A. ignored |
| R/W(0,1) O/Z | Specifies memory read (active high) or write (active low) mode | 1 1 |
| STAT(3–0) $|$ O | Four lines that define the status or function of the memory port as shown in Table 9–2 (next page). | all 1s all 1s |
| STRB(0,1) O/Z | Interface access strobe | 1 1 |
| A(30–0) O/Z | Address bus. The address lines are always driven. They keep the address of the last access. | Hi–Z address of last access |
| D(31–0) I/O/Z | Data bus. These signals go to high impedance between write accesses. | Hi–Z Hi–Z |

† The numbers in parentheses mean that either a 0 (zero) or a 1 can follow the prefix shown to the left of the parenthesis. A zero indicates STRB0 control signals (shown in Figure 9–1), and a one indicates STRB1 control signals.
‡ STAT(3–0) and LOCK cannot be controlled by an external control signal.
§ O=output; I=input; Z=high-impedance state.
¶ This signal can be used in a shared bus configuration to hold the ‘C4x off the shared bus while another ‘C4x accesses the shared memory and peripherals.
# N.A. means not affected.
|| Idle status = no external memory access

Table 9–2 shows how pins STAT3 to STAT0 define the current status of the global-memory port. For bus accesses, these signals provide information about the access that is about to begin. The code for a SIGI instruction read is useful for distinguishing between a SIGI read and a LDII or LDFI read.
The bus idle status code is $1111_2$ (given at the bottom of Table 9–2). This simplifies modular shared-bus multiprocessor interfaces because pull-up resistors can be used to signal the idle condition when processor cards are not attached to the shared bus.

### Table 9–2. Global Memory Port Status for $\text{STRB}_0$ and $\text{STRB}_1$ Accesses

<table>
<thead>
<tr>
<th>Value at Pins $^\dagger$</th>
<th>Status</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 0</td>
<td>$\text{STRB}_0$ access, program read</td>
</tr>
<tr>
<td>0 0 0 1</td>
<td>$\text{STRB}_0$ access, data read</td>
</tr>
<tr>
<td>0 0 1 0</td>
<td>$\text{STRB}_0$ access, DMA read</td>
</tr>
<tr>
<td>0 0 1 1</td>
<td>$\text{STRB}_0$ access, SIGI (instruction) read</td>
</tr>
<tr>
<td>0 1 0 0</td>
<td>Reserved</td>
</tr>
<tr>
<td>0 1 0 1</td>
<td>$\text{STRB}_0$ access, data write</td>
</tr>
<tr>
<td>0 1 1 0</td>
<td>$\text{STRB}_0$ access, DMA write</td>
</tr>
<tr>
<td>0 1 1 1</td>
<td>Reserved</td>
</tr>
<tr>
<td>1 0 0 0</td>
<td>$\text{STRB}_1$ access, program read</td>
</tr>
<tr>
<td>1 0 0 1</td>
<td>$\text{STRB}_1$ access, data read</td>
</tr>
<tr>
<td>1 0 1 0</td>
<td>$\text{STRB}_1$ access, DMA read</td>
</tr>
<tr>
<td>1 0 1 1</td>
<td>$\text{STRB}_1$ access, SIGI (instruction) read</td>
</tr>
<tr>
<td>1 1 0 0</td>
<td>Reserved</td>
</tr>
<tr>
<td>1 1 0 1</td>
<td>$\text{STRB}_1$ access, data write</td>
</tr>
<tr>
<td>1 1 1 0</td>
<td>$\text{STRB}_1$ access, DMA write</td>
</tr>
<tr>
<td>1 1 1 1</td>
<td>Idle</td>
</tr>
</tbody>
</table>

$^\dagger$ This table applies to both the global-memory interface and local-memory interface (for local memory interface signals, add an L prefix to form LSTAT3, LSTAT2, etc.).
9.3 Memory-Interface Control Registers

Figure 9–2 shows the memory map for both the global- and local-memory interface-control registers. Figure 9–3 shows the fields in each register. Each register can be programmed to control its respective memory interface by defining the:

- Page size used for the two strobes at each port
- Address ranges over which the strobes are active
- Wait states
- Other operations that control the memory interface

Figure 9–3 lists the fields in these registers.

At reset, the binary values shown above each bit in Figure 9–2 are written to the global memory interface control register. Values in bits 3 – 0 are the values at these bits' respective pins (AE, DE, CE1, and CE0). Reset has the following effects (for both the local bus and the global bus):

- The PAGESIZE fields for STRB0 (bits 18–14) and STRB1 (bits 23–19) are set to 00111₂, which corresponds to 256 words.
- The WTCNT fields for STRB0 (bits 10–8) and STRB1 (bits 13–11) are set to 111₂, which corresponds to seven wait states.
- The ACTIVE field for STRB0 (bits 28–24) is set for all addresses over the global (or local for LSTRB0) memory interface.
- The STRB SWITCH field (bit 29) is set to 1 to insert a cycle between back-to-back reads that switch from STRB0 to STRB1 (or STRB1 to STRB0).
- The SWW fields for STRB0 (bits 5–4) and STRB1 (bits 7–6) are both set to 11₂ to set the internal ready signal to be the logical AND of the external READY signal (RDY) and the ready signal generated by the on-chip wait-state counter (RDY_wtcnt).
Figure 9–2. Location of the Memory-Interface Control Registers

<table>
<thead>
<tr>
<th>Address</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>00010 0000h</td>
<td>Global memory interface control register</td>
</tr>
<tr>
<td>00010 0001h</td>
<td>Reserved</td>
</tr>
<tr>
<td>00010 0003h</td>
<td>Local memory interface control register</td>
</tr>
</tbody>
</table>

Figure 9–3. Fields in the Memory-Interface Control Registers

<table>
<thead>
<tr>
<th>Bit</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>31–29</td>
<td>STRB SWITCH (Table 9–4, Table 9–5)</td>
</tr>
<tr>
<td>28</td>
<td>STRB ACTIVE</td>
</tr>
<tr>
<td>24–23</td>
<td>STRB1 PAGESIZE (Table 9–3)</td>
</tr>
<tr>
<td></td>
<td>STRB0 PAGESIZE (Table 9–3)</td>
</tr>
<tr>
<td>18–11</td>
<td>STRB0 WTCNT</td>
</tr>
<tr>
<td>10–8</td>
<td>STRB1 WTCNT</td>
</tr>
<tr>
<td>7–6</td>
<td>STRB1 SWW (Table 9–6)</td>
</tr>
<tr>
<td>5–4</td>
<td>STRB0 SWW (Table 9–6)</td>
</tr>
<tr>
<td>3–0</td>
<td>AE, DE, CE1, CE0</td>
</tr>
</tbody>
</table>

Notes:
1) The register cell figure contains global-memory interface-control register mnemonics. For local-memory interface-control register mnemonics, add an L prefix to each mnemonic in the figure (e.g., LSTRB SWW, LCE0, etc.).
2) The 1s and 0s below each bit are the binary values written to the register at reset. The values at bits 3–0 are defined by the values of their respective external pins (AE, DE, CE1, and CE0).
3) These registers are shown in the overall memory map in Figure 4–1 and Figure 4–3.
4) RW=read/write; R=read.

Note:
Mnemonics used are for the global memory interface control register. For the local-memory interface-control register, add the prefix L to each mnemonic (e.g., LCEO, LCE1, LSTRB1, etc. The description remains the same for the local-memory interface-control register.

CE0  Value of external pin CE0 (after it passes through an internal synchronizer). The value is not latched.
CE1  Value of external pin CE1 (after it passes through an internal synchronizer). The value is not latched.
DE   Value of external pin DE (after it passes through an internal synchronizer). The value is not latched.
AE   Value of external pin AE (after it passes through an internal synchronizer). The value is not latched.
<table>
<thead>
<tr>
<th>Register</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>STRB0 SWW</strong></td>
<td>Software wait states for STRB0 access. In conjunction with STRB0 WTCNT, this field defines the mode of wait-state generation. Actual wait states are explained in Section 9.4 and in Table 9–6.</td>
</tr>
<tr>
<td><strong>STRB1 SWW</strong></td>
<td>Software wait states for STRB1 access. In conjunction with STRB1 WTCNT, this field defines the mode of wait-state generation. Actual wait states are explained in Section 9.4 and in Table 9–6.</td>
</tr>
<tr>
<td><strong>STRB0 WTCNT</strong></td>
<td>Software wait-state count for STRB0 accesses. Specifies the number of cycles to use when software wait states are active. Three-bit range is from 000₂ (zero) to 111₂ (seven).</td>
</tr>
<tr>
<td><strong>STRB1 WTCNT</strong></td>
<td>Software wait-state count for STRB1 accesses. Specifies the number of cycles to use when software wait states are active. Three-bit range is from 000₂ (zero) to 111₂ (seven).</td>
</tr>
<tr>
<td><strong>STRB0 PAGESIZE</strong></td>
<td>Page size for STRB0 accesses. Specifies the number of MSBs of the address to use to define the bank size for STRB0 accesses. See ranges in Table 9–3 and subsection 9.3.2.</td>
</tr>
<tr>
<td><strong>STRB1 PAGESIZE</strong></td>
<td>Page size for STRB1 accesses. Specifies the number of MSBs of the address to use to define the bank size for STRB1 accesses. See ranges in Table 9–3 and subsection 9.3.2.</td>
</tr>
<tr>
<td><strong>STRB ACTIVE</strong></td>
<td>Specifies address ranges over which STRB₀ and STRB₁ are active. See ranges in Table 9–4 on for STRB ACTIVE and Table 9–5 for LSTRB ACTIVE.</td>
</tr>
<tr>
<td><strong>STRB SWITCH</strong></td>
<td>Inserts a single cycle between back-to-back reads that switch from STRB0 to STRB1 (or vice versa). When a 1, insert cycle. When a 0, don’t insert cycle.</td>
</tr>
<tr>
<td><strong>Reserved</strong></td>
<td>Read as zeros.</td>
</tr>
</tbody>
</table>
### Table 9–3. Page Size as Defined by $\text{STRB}0/1$ PAGESIZE Bits$^\dagger$

<table>
<thead>
<tr>
<th>STRBx PAGESIZE (Bits 14–18, 19–23)$^\ddagger$</th>
<th>External Address Bus Bits Defining the Current Page</th>
<th>External Address Bus Bits Defining Address on a Page</th>
<th>Page Size (32-Bit Wds)</th>
</tr>
</thead>
<tbody>
<tr>
<td>00000–00110</td>
<td>Reserved</td>
<td>Reserved</td>
<td>Reserved</td>
</tr>
<tr>
<td>00111$^\S$</td>
<td>30–8</td>
<td>7–0</td>
<td>$2^{6}=256$</td>
</tr>
<tr>
<td>01000</td>
<td>30–9</td>
<td>8–0</td>
<td>$2^{9}=512$</td>
</tr>
<tr>
<td>01001</td>
<td>30–10</td>
<td>9–0</td>
<td>$2^{10}=1K$</td>
</tr>
<tr>
<td>01010</td>
<td>30–11</td>
<td>10–0</td>
<td>$2^{11}=2K$</td>
</tr>
<tr>
<td>01011</td>
<td>30–12</td>
<td>11–0</td>
<td>$2^{12}=4K$</td>
</tr>
<tr>
<td>01100</td>
<td>30–13</td>
<td>12–0</td>
<td>$2^{13}=8K$</td>
</tr>
<tr>
<td>01101</td>
<td>30–14</td>
<td>13–0</td>
<td>$2^{14}=16K$</td>
</tr>
<tr>
<td>01110</td>
<td>30–15</td>
<td>14–0</td>
<td>$2^{15}=32K$</td>
</tr>
<tr>
<td>01111</td>
<td>30–16</td>
<td>15–0</td>
<td>$2^{16}=64K$</td>
</tr>
<tr>
<td>10000</td>
<td>30–17</td>
<td>16–0</td>
<td>$2^{17}=128K$</td>
</tr>
<tr>
<td>10001</td>
<td>30–18</td>
<td>17–0</td>
<td>$2^{18}=256K$</td>
</tr>
<tr>
<td>10010</td>
<td>30–19</td>
<td>18–0</td>
<td>$2^{19}=512K$</td>
</tr>
<tr>
<td>10011</td>
<td>30–20</td>
<td>19–0</td>
<td>$2^{20}=1M$</td>
</tr>
<tr>
<td>10100</td>
<td>30–21</td>
<td>20–0</td>
<td>$2^{21}=2M$</td>
</tr>
<tr>
<td>10101</td>
<td>30–22</td>
<td>21–0</td>
<td>$2^{22}=4M$</td>
</tr>
<tr>
<td>10110§</td>
<td>30–23</td>
<td>22–0</td>
<td>$2^{23}=8M$</td>
</tr>
<tr>
<td>10111</td>
<td>30–24</td>
<td>23–0</td>
<td>$2^{24}=16M$</td>
</tr>
<tr>
<td>11000</td>
<td>30–25</td>
<td>24–0</td>
<td>$2^{25}=32M$</td>
</tr>
<tr>
<td>11001</td>
<td>30–26</td>
<td>25–0</td>
<td>$2^{26}=64M$</td>
</tr>
<tr>
<td>11010</td>
<td>30–27</td>
<td>26–0</td>
<td>$2^{27}=128M$</td>
</tr>
<tr>
<td>11011</td>
<td>30–28</td>
<td>27–0</td>
<td>$2^{28}=256M$</td>
</tr>
<tr>
<td>11100</td>
<td>30–29</td>
<td>28–0</td>
<td>$2^{29}=512M$</td>
</tr>
<tr>
<td>11101</td>
<td>30</td>
<td>29–0</td>
<td>$2^{30}=1G$</td>
</tr>
<tr>
<td>11110</td>
<td>None</td>
<td>30–0</td>
<td>$2^{31}=2G$</td>
</tr>
<tr>
<td>11111</td>
<td>Reserved</td>
<td>Reserved</td>
<td>Reserved</td>
</tr>
</tbody>
</table>

$^\dagger$ Mnemonics used are for the global-memory interface-control register. For the local-memory interface-control register, add the prefix L to the beginning of each mnemonic (e.g., LSTRB0 PAGESIZE, LSTRB1 PAGESIZE, etc.). The description is the same for the local-memory interface-control register.

$^\ddagger$ The x in STRBx means that the data in the columns are for STRB0 or STRB1.

$^\S$ A STRBx PAGESIZE field of 10110$^2$ is depicted in Figure 9–5 on page 9-13.

$^\S$ Value at reset.
### Table 9-4. Address Ranges Specified by STRB ACTIVE Bits†

<table>
<thead>
<tr>
<th>STRBx ACTIVE Field (Bits 24–28)</th>
<th>STRB0 ACTIVE Address Range</th>
<th>Size of STRB0 ACTIVE Address Range</th>
<th>STRB1 ACTIVE Address Range</th>
</tr>
</thead>
<tbody>
<tr>
<td>00000–01110</td>
<td>Reserved</td>
<td>Reserved</td>
<td>Reserved</td>
</tr>
<tr>
<td>01111</td>
<td>8000 0000–8000 FFFF</td>
<td>$2^{16}=64K$</td>
<td>8001 0000–FFFF FFFF</td>
</tr>
<tr>
<td>10000</td>
<td>8000 0000–8001 FFFF</td>
<td>$2^{17}=128K$</td>
<td>8002 0000–FFFF FFFF</td>
</tr>
<tr>
<td>10001</td>
<td>8000 0000–8003 FFFF</td>
<td>$2^{18}=256K$</td>
<td>8004 0000–FFFF FFFF</td>
</tr>
<tr>
<td>10010</td>
<td>8000 0000–8007 FFFF</td>
<td>$2^{19}=512K$</td>
<td>8008 0000–FFFF FFFF</td>
</tr>
<tr>
<td>10011</td>
<td>8000 0000–800F FFFF</td>
<td>$2^{20}=1M$</td>
<td>8010 0000–FFFF FFFF</td>
</tr>
<tr>
<td>10100</td>
<td>8000 0000–801F FFFF</td>
<td>$2^{21}=2M$</td>
<td>8020 0000–FFFF FFFF</td>
</tr>
<tr>
<td>10101</td>
<td>8000 0000–803F FFFF</td>
<td>$2^{22}=4M$</td>
<td>8040 0000–FFFF FFFF</td>
</tr>
<tr>
<td>10110</td>
<td>8000 0000–807F FFFF</td>
<td>$2^{23}=8M$</td>
<td>8080 0000–FFFF FFFF</td>
</tr>
<tr>
<td>10111</td>
<td>8000 0000–80FF FFFF</td>
<td>$2^{24}=16M$</td>
<td>8100 0000–FFFF FFFF</td>
</tr>
<tr>
<td>11000</td>
<td>8000 0000–81FF FFFF</td>
<td>$2^{25}=32M$</td>
<td>8200 0000–FFFF FFFF</td>
</tr>
<tr>
<td>11001</td>
<td>8000 0000–83FF FFFF</td>
<td>$2^{26}=64M$</td>
<td>8400 0000–FFFF FFFF</td>
</tr>
<tr>
<td>11010</td>
<td>8000 0000–87FF FFFF</td>
<td>$2^{27}=128M$</td>
<td>8800 0000–FFFF FFFF</td>
</tr>
<tr>
<td>11011</td>
<td>8000 0000–8FFF FFFF</td>
<td>$2^{28}=256M$</td>
<td>9000 0000–FFFF FFFF</td>
</tr>
<tr>
<td>11100</td>
<td>8000 0000–9FFF FFFF</td>
<td>$2^{29}=512M$</td>
<td>A000 0000–FFFF FFFF</td>
</tr>
<tr>
<td>11101</td>
<td>8000 0000–BFFF FFFF</td>
<td>$2^{30}=1G$</td>
<td>C000 0000–FFFF FFFF</td>
</tr>
<tr>
<td>11110‡</td>
<td>8000 0000–FFFF FFFF</td>
<td>$2^{31}=2G$</td>
<td>None</td>
</tr>
<tr>
<td>11111</td>
<td>Reserved</td>
<td>Reserved</td>
<td>Reserved</td>
</tr>
</tbody>
</table>

† Address ranges specified by the LSTRB ACTIVE bits are listed in Table 9–5.
‡ Value at reset.
### Table 9–5. Address Ranges Specified by LSTRB ACTIVE Bits†

<table>
<thead>
<tr>
<th>LSTRBx ACTIVE Field (Bits 24–28)</th>
<th>LSTRB0 ACTIVE</th>
<th>Size of LSTRB0 ACTIVE Address Range</th>
<th>LSTRB1 ACTIVE</th>
<th>Size of LSTRB1 ACTIVE Address Range</th>
</tr>
</thead>
<tbody>
<tr>
<td>00000–01110 Reserved</td>
<td>Reserved</td>
<td>Reserved</td>
<td>Reserved</td>
<td>Reserved</td>
</tr>
<tr>
<td>01111</td>
<td>0000 0000–0000 FFFF</td>
<td>2^16 = 64K</td>
<td>0001 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>10000</td>
<td>0000 0000–0001 FFFF</td>
<td>2^17 = 128K</td>
<td>0002 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>10001</td>
<td>0000 0000–0003 FFFF</td>
<td>2^18 = 256K</td>
<td>0004 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>10010</td>
<td>0000 0000–0007 FFFF</td>
<td>2^19 = 512K</td>
<td>0008 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>10011</td>
<td>0000 0000–000F FFFF</td>
<td>2^20 = 1M</td>
<td>0010 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>10100</td>
<td>0000 0000–001F FFFF</td>
<td>2^21 = 2M</td>
<td>0020 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>10101</td>
<td>0000 0000–003F FFFF</td>
<td>2^22 = 4M</td>
<td>0040 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>10110</td>
<td>0000 0000–007F FFFF</td>
<td>2^23 = 8M</td>
<td>0080 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>10111</td>
<td>0000 0000–00FF FFFF</td>
<td>2^24 = 16M</td>
<td>0100 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>11000</td>
<td>0000 0000–01FF FFFF</td>
<td>2^25 = 32M</td>
<td>0200 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>11001</td>
<td>0000 0000–03FF FFFF</td>
<td>2^26 = 64M</td>
<td>0400 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>11010</td>
<td>0000 0000–07FF FFFF</td>
<td>2^27 = 128M</td>
<td>0800 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>11111</td>
<td>0000 0000–0FF FFFF</td>
<td>2^28 = 256M</td>
<td>1000 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>11100</td>
<td>0000 0000–1FFF FFFF</td>
<td>2^29 = 512M</td>
<td>2000 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>11101</td>
<td>0000 0000–3FFF FFFF</td>
<td>2^30 = 1G</td>
<td>4000 0000–7FFF FFFF</td>
<td></td>
</tr>
<tr>
<td>1110#</td>
<td>0000 0000–7FFF FFFF</td>
<td>2^31 = 2G</td>
<td>None</td>
<td></td>
</tr>
<tr>
<td>11111</td>
<td>Reserved</td>
<td>Reserved</td>
<td>Reserved</td>
<td></td>
</tr>
</tbody>
</table>

† Address ranges below 0030 0000h are valid only in microprocessor mode (ROMEN=0). Access to reserved, peripheral, and on-chip memory areas does not activate LSTRB signals.
‡ Value at reset.
9.3.1 Mapping Addresses to Strobes

Figure 9–4 demonstrates the relationship between the STRB ACTIVE bits (see Figure 9–3 on page 9-7 for more information) and the address ranges over which the signals, STRB0 and STRB1, are active. Note that the address ranges of STRBx and LSTRBx also govern the ranges of associated signals—RDYx, LRDYx, RWx, LR/Wx, PAGEx, LPAGEx, etc. (where x=1 or 0).

Example (a) of Figure 9–4 shows the reset condition (STRB ACTIVE=11110₂). In this case, signal STRB0 is active over the entire address range of the global memory bus (see Table 9–4 for fields and address ranges of STRB ACTIVE).

Example (b) of Figure 9–4 shows the global memory bus memory map when STRB ACTIVE=10101₂. In this case, STRB0 is active from addresses 8000 0000h–803F FFFFh, and STRB1 is active from addresses 8040 0000h–FFF FFFFh (as shown in Table 9–4 for a STRB ACTIVE of 10101₂).

NOTE: Shown here are two examples for the global memory map. The entire 'C40 memory map (local and global) is shown in Figure 4–1 on page 4-3. Note that the highest address for LSTRB1 (local bus) is 7FFF FFFFh.
9.3.2 Page Size Operation

Within the memory range selected by any of the four strobe lines, the 'C4x external interface allows you to further divide the range into pages of selected length. This capability gives you great flexibility in the design of high-speed, high-density memory systems combined with slower peripheral devices; each time a page boundary is crossed, a cycle is inserted to allow external logic to reconfigure itself.

Each PAGESIZE field in the memory interface control register (shown in Figure 9–2 on page 9-7) works in the same manner to specify the page size for its corresponding strobe. Table 9–3 on page 9-9 illustrates the relationship between the PAGESIZE field and the bits of the address used to define the current page and the resulting page size. Page size begins at 256 words (with external address-bus bits 7–0 defining the address on a page, and ranges of up to 2G words ('C40) with external address bus bits 30–0 ('C40) defining the location on a page. The example in Figure 9–5 shows how a pagesize field value of 10110_2 is translated into bits 30–23 defining the current page and bits 22–0 defining an address on a page.

Figure 9–5. STRBx PAGESIZE Fields Example

![External address bus bits defining the current page](image)

Note: This figure represents a STRBx PAGESIZE field value of 10110_2 (as shown in Table 9–3).

Changing from one page to another causes a cycle to be inserted in the external access sequence, allowing external logic to reconfigure itself appropriately. For example, the extra cycle allows time for slower devices to get off the bus, thereby eliminating bus contention. The memory interface control logic keeps track of the address used for the last access for each STRB. When an access begins, the PAGE signal corresponding to the active STRB goes inactive (high) if the access is to a new page. The PAGE0 and PAGE1 signals are independent of one another, each having its own page-size logic.

At reset, the page-control logic is initialized so that the extra cycle is inserted for the first access to the two strobe interfaces.

The control registers for the local memory interface function in the same way as the control registers for the global memory interface.
9.4 Programmable Wait States

The 'C4x has its own internal software-configurable ready-generation capability for each strobe. This software wait-state generator is controlled by configuring two fields in the global or local interface control register. Use the STRBx WTCNT field (bits 8–10 and 11–13) to specify the number of software wait states to generate, and use the STRBx SWW field (bits 6–7, and 4–5) to select one of the following four modes of wait-state generation:

- External \( RDY \) (SWW = 0). Wait states are generated solely by the external \( RDY \) line (software wait-states ignored).
- WTCNT-generated \( RDY_{\text{wtcnt}} \) (SWW = 01\(_2\)). Wait states are generated solely by the software wait-state generator (external \( RDY \) ignored).
- Logical-OR of \( RDY \) and \( RDY_{\text{wtcnt}} \) (SWW = 10\(_2\)). Wait states are generated with a logical OR of internal and external ready signals. Either signal can generate ready.
- Logical-AND of \( RDY \) and \( RDY_{\text{wtcnt}} \) (SWW = 11\(_2\)). Wait states are generated with a logical AND of internal and external ready signals. Both signals must occur.

The four modes are used to generate the internal ready signal, \( RDY_{\text{int}} \), that controls accesses. As long as \( RDY_{\text{int}} = 1 \), the current external access is extended. When \( RDY_{\text{int}} = 0 \), the current access completes. Since the use of programmable wait states for both external interfaces is identical, only the global-bus interface is described in this section.

\( RDY_{\text{wtcnt}} \) is an internally-generated ready signal. When an external access is begun, the value in WTCNT is loaded into a counter. WTCNT can be any value from 0 through 7. The counter is decremented every H1/H3 clock cycle until it becomes 0. Once the counter is cleared to 0, it remains cleared to 0 until the next access. When the counter is nonzero, \( RDY_{\text{wtcnt}}=1 \). When the counter is 0, \( RDY_{\text{wtcnt}}=0 \).

Table 9–6 is the truth table for each value of SWW, showing the different values at \( RDY \), \( RDY_{\text{wtcnt}} \), and \( RDY_{\text{int}} \).

<table>
<thead>
<tr>
<th>SWW</th>
<th>RDY ( \text{wtcnt} )</th>
<th>RDY ( \text{int} )</th>
</tr>
</thead>
<tbody>
<tr>
<td>00_2</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>01_2</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>10_2</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>11_2</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>

Note:

At reset, the 'C4x inserts seven wait states for each access to external memory. These wait states are inserted to ensure that the system can function with slow memories. To increase system performance when using fast external memories, you will need to decrease the number of wait states.
<table>
<thead>
<tr>
<th>SWW Value</th>
<th>RDY</th>
<th>RDY_wtcnt</th>
<th>RDY_int</th>
<th>RDY_int</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>00</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>00</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>00</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td></td>
</tr>
</tbody>
</table>

RDY\_int is dependent only upon RDY. RDY\_wtcnt is ignored.

RDY\_int is dependent only upon RDY\_wtcnt. RDY is ignored.

RDY\_int is the logical-OR (electrical AND because these signals are low true) of RDY and RDY\_wtcnt.

RDY\_int is the logical-AND (electrical OR because these signals are low true) of RDY and RDY\_wtcnt.
9.5 Memory Interface Timing

Except for some cases that are covered in detail later in this chapter, the 'C4x offers a one-cycle external read and a pipeline external write. A write is considered a two-step operation: one cycle writes the data into the external memory port buffer and then another cycle moves the data from there to external memory.

**Note:**

From the perspective of the DMA or CPU, the write operation finishes in one cycle, and the DMA or CPU can proceed. However, if the next DMA or CPU access is to the same external bus, the DMA or CPU must wait, and the write is considered a two-cycle operation.

---

**Figure 9–6. STRB and RDY Timing**

![Timing Diagram](image)

**Note:** The dotted lines emphasize the relationships between the signals.

As shown in Figure 9–6, STRB changes on the falling edge of H1, and RDY is sampled on the falling edge of H1. Throughout the other timing diagrams in this section, the following general rules apply to the logical timing of the parallel external interfaces:

- Changes of R/W are always framed by STRB.
- A page boundary crossing for a particular STRB results in the corresponding PAGE signal going high for one cycle.
- R/W transitions always occur on the rising edge of H1.
- STRB transitions always occur on the falling edge H1.
- RDY is always sampled on the falling edge H1.
- Data is always sampled during a read on the falling edge of H1.
- Data is always driven out during a write on the falling edge of H1.
Data is always stopped from being driven during a write on the rising edge of H1.

The status and PAGE signals, following a read, change on the falling edge of H1. The address also changes on H1’s falling edge.

The status and PAGE signals, following a write, change on the falling edge of H1; the address changes on the rising edge of H1.

The fetch of an interrupt vector over an external interface is identified by the status signals for that interface (STAT or LSTAT) as a data read.

The interlocked operation status signals (LOCK and LLOCK) have the same timing as the STAT and LSTAT status signals, respectively.

Any time PAGE goes high, STRB goes high.

Note:
When no external port is accessing memory (idle status), the control lines are inactive (RDY is ignored, STRB is high, and the STATx lines become high), the address lines keep the last value used in the pins, and the data lines become high-impedance. This can be seen in Figure 9–16.

Figure 9–7 illustrates a read, read, write sequence. This figure assumes that all three accesses are to the same page and that they are STRB1 accesses. This timing diagram illustrates that:

- Back-to-back reads to the same page are single-cycle accesses.
- STRB stays low during back-to-back reads.
- When the transition from a read to a write is done, STRB goes high for one cycle to frame the R/W signal changing.
Figure 9–7. Read Same Page, Read Same Page, Write Same Page Sequence

Note: Strobe and Ready Further Defined

Strobe and ready are discussed from the application viewpoint in TMS320C4x General-Purpose Applications User’s Guide.
Figure 9–8 shows that:

- To prevent unwanted writes, \( \text{STRB} \) goes high between back-to-back writes to disable the memory while the address changes.

- As in Figure 9–7, \( \text{STRB} \) goes high between a write and a read, and it frames the \( \text{R/W} \) transition.

- A read following a write on the same bus takes two cycles. This happens regardless of whether or not the read is on the same strobe and/or page.

- Consecutive writes take two cycles.

*Figure 9–8. Write Same Page, Write Same Page, Read Same Page Sequence*
Figure 9–9 shows that going from one page to another on back-to-back reads causes:

- An extra cycle to be inserted to allow the next memory to be selected
- The transition to be signaled by PAGE going high for one cycle
- STRB1 to go high for one cycle

Figure 9–9. Read Same Page, Read Different Page, Read Same Page Sequence
Figure 9–10 shows that on back-to-back writes, when a page switch occurs:

- PAGE1 signals this occurrence by going high for one cycle.
- No extra cycle is inserted, because write cycles exhibit an inherent one-half H1 cycle setup of address information before STRB goes low.

Figure 9–10. Write Same Page, Write Different Page, Write Same Page Sequence
Figure 9–11. Write Same Page, Read Different Page, Write Different Page Sequence
Figure 9–12. Read Different Page, Read Different Page, Write Same Page Sequence

Memory Interface Timing
Figure 9–13. Write Different Page, Write Different Page, Read Same Page Sequence
Figure 9–14. Read Same Page, Write Different Page, Read Different Page Sequence
Figure 9–15 through Figure 9–19 illustrate idle bus cycles. Idle bus cycle timing is similar to read cycle timing. The primary differences are that no data is read, STRB is held high, and RDY is ignored.

**Figure 9–15. Read Same Page, Idle One Cycle, Read Same Page Sequence**
Figure 9–16. Write Same Page, Idle One Cycle, Write Different Page Sequence
Figure 9–17. Idle, Read Different Page, Idle Sequence
Figure 9–18. Idle, Write Same Page, Idle Sequence
Figure 9–19. Write Different or Same Page, Idle, Idle Sequence
Figure 9–20 illustrates a STRB1 read followed by a STRB0 read when STRB SWITCH=0. This mode allows the reads to be back-to-back, with no cycles inserted between them when they are activating different strobes.

*Figure 9–20. Read Same Page on STRB1, STRB0, and on STRB1 Sequence When STRB SWITCH = 0*
Figure 9–21 is similar to Figure 9–20 except that the second \texttt{STRB1} read is from a different page than the first.

\textit{Figure 9–21. Read Same Page on \texttt{STRB1}, \!\texttt{STRB0}, Read Different Page on \texttt{STRB1} Sequence When \texttt{STRB SWITCH} = 0}
Figure 9–22 illustrates a STRB1 read followed by a STRB0 read when STRB SWITCH=1. In this mode, a cycle is inserted between back-to-back reads that activate different strobes. Some memory configurations require this cycle between strobe transitions to prevent bus conflicts during back-to-back reads on different strobes.

**Figure 9–22. Read Same Page on STRB1, STRB0, and on STRB1 Sequence When STRB SWITCH = 1**

![Diagram showing STRB1, STRB0, PAGE1, PAGE0, R/W, and D31–D0 signals with timing details such as (STRB1 read), (STRB0 read), and (STRB1 read).]
Figure 9–23 is similar to Figure 9–22 except that the second \( \text{STRB1} \) read is from a different page than the first.

**Figure 9–23. Read Same Page on \( \text{STRB1} \), \( \text{STRB0} \), Read Different Page on \( \text{STRB1} \) **

Sequence When \( \text{STRB SWITCH} = 1 \)
Figure 9–24. Write Same Page on STRB1, STRB0, Read Same Page on STRB1 Sequence
Figure 9–25 and Figure 9–26 show one wait-state read and write operations, respectively.

**Figure 9–25. Read With One Wait State**
Figure 9–26. Write With One Wait State

[Diagram showing waveforms for various signals including H1, R/W0, STRB0, RDY0, PAGE0, R/W1, STRB1, RDY1, PAGE1, D31–D0, A30–A0, and STAT3, STAT0, LOCK, with label (STRB1 write) and an extra cycle note.]
9.6 Using Enable Signals to Control Signal Groups

Figure 9–27. Using Enable Signals to Put Signal Groups in a High-Impedance State

Figure 9–27 shows an enable signal controlling the corresponding signal group. For example, signal DE controls the global external-interface data signals. The enable signals are unsynchronized inputs that turn off the corresponding output buffers. After the enable signal goes high plus timing (1) in Figure 9–27, the corresponding signal group goes into high-impedance. Then, after the enable signal goes low plus timing (2) in Figure 9–27, the signal group comes out of high-impedance. If the signal group is already in a high-impedance state before the enable signal goes high, the group will come out of the high-impedance state (when the enable signal goes low) only if the signal group is in a state requiring it to do so. For example, a data bus that was not being driven will be driven after being enabled, if an access is pending for the data bus.

Note:

If you intend to use internally generated wait states, be certain that no data is read from or written to the bus when it is disabled. This is because it is possible for a bus to be in the high-impedance state with internally generated wait states. In this case, data that is written will not be seen externally, and data that is read will be whatever value is sampled on the high-impedance bus.
9.7 Interlocked Operations

One of the most common parallel processing configurations is the sharing of global memory by multiple processors. For multiple processors to access this global memory and share data in a coherent manner, some sort of arbitration or handshaking is necessary. 'C4x interlocked operations meet this requirement for arbitration. More details are given in Section 9.7.5 on page 9-44.

Five 'C4x instructions are referred to as interlocked operations. Through the use of external signals, these instructions provide powerful synchronization mechanisms. They also guarantee integrity of communication and result in a high-speed operation. The interlocked-operation instruction group is listed in Table 9–7.

Table 9–7. Interlocked Operations

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Description</th>
<th>Operation</th>
</tr>
</thead>
<tbody>
<tr>
<td>LDFI</td>
<td>Load floating-point value from memory into a register; interlocked when <em>external</em> memory accessed</td>
<td>Signal interlocked src → dst</td>
</tr>
<tr>
<td>LDII</td>
<td>Load integer from memory into a register; interlocked when <em>external</em> memory accessed</td>
<td>Signal interlocked src → dst</td>
</tr>
<tr>
<td>SIGI</td>
<td>Load floating-point value from memory into a register; interlocked when <em>external</em> memory accessed</td>
<td>Signal interlocked Clear interlock</td>
</tr>
<tr>
<td>STFI</td>
<td>Store floating-point value from a register to memory; interlocked when <em>external</em> memory accessed</td>
<td>src → dst Clear interlock</td>
</tr>
<tr>
<td>STII</td>
<td>Store integer from a register to memory; interlocked when <em>external</em> memory accessed</td>
<td>src → dst Clear interlock</td>
</tr>
</tbody>
</table>

The interlocked operations use the global- and local-bus pins, LOCK and LLOCK, to reflect a currently executing interlocked operation. This signal is active (low) when any of the interlocked instructions in Table 9–7 are executing.
The external timing for interlocked loads and stores is the same as for standard loads and stores. You can extend interlocked loads and stores like standard accesses by using the appropriate ready signal (RDYx or LRDYx).

### 9.7.1 LDFI and LDII

The **LDFI** and **LDII** instructions perform the following actions:

1. Pull (L)LOCK low.
2. Execute an LDF or LDI instruction.
3. Extend the read cycle until the appropriate ready signal is received. Complete the instruction.
4. Leave (L)LOCK active low until changed by an STFI, STII, or SIGI.

The read/write operation is identical to any other read/write cycle except for the special use of (L)LOCK. The src operand for LDFI and LDII is always a direct or indirect memory address. (L)LOCK is set to 0 only if the src is located off-chip (i.e., STRB or LSTRB is active). If on-chip memory is accessed, then (L)LOCK is not asserted, and the operation is as an LDF or LDI from internal memory.

### 9.7.2 STFI and STII

The **STFI** and **STII** instructions perform the following operations:

1. Begin a write cycle. The state of (L)LOCK does not change. If it is low, an interlocked operation occurs. If high, the operation is as if an STF or STI is performed (not interlocked).
2. Execute an STF or STI instruction and extend the write cycle until the appropriate ready is signaled.
3. After the write cycle, bring (L)LOCK inactive (high).

As in the case for LDFI and LDII, the dst of STFI and STII affects (L)LOCK. If dst is located off-chip (STRB(0,1) or LSTRB(0,1) is active), (L)LOCK is set to a 1. If on-chip memory is accessed, then (L)LOCK is not asserted, and the operations are as a STF or STI to internal memory.
9.7.3 SIGI

The SIGI instruction can be used in a variety of ways. In some applications, you may wish to modify semaphores externally, perhaps with special-purpose logic. If so, SIGI can be used to perform a single-cycle interlocked access of the semaphore. The SIGI instruction can also be used simply to perform an external read and to signal that a particular point in your code has been reached.

The SIGI instruction functions as follows:

1) Pulls (L)OCK low
2) Executes an LDI instruction
3) Extends the read cycle until the appropriate ready signal is received. Completes the instruction
4) Brings (L)OCK back inactive high

Interlocked operations can be used to implement a busy-waiting loop, to manipulate a multiprocessor counter, to implement a simple semaphore mechanism, or to perform synchronization between two 'C4xs. The following examples illustrate the usefulness of the interlocked operations instructions.

9.7.4 Interlocked Examples

Examples in this section show you how interlocked operations can be used to implement:

- A busy-waiting loop to synchronize processors at the software level (Example 9–1, page 9-42)
- A counter shared between cooperative processors that defines the number of times a task should be done by the processors (Example 9–2 on page 9-42)
- Semaphores to ease the programming of critical sections (Example 9–3 and Example 9–4 on page 9-43)

Example 9–1 shows the implementation of a busy-waiting loop. The 'C4x stays in this loop until another processor writes a 0 in @LOCK. If location LOCK is the interlock for a critical section of code, and a nonzero means the lock is busy, the algorithm for a busy-waiting loop can be used as shown.
Interlocked Operations

Example 9–1. Busy-Waiting Loop

```plaintext
LDI 1,R0 ;Put 1 in R0
L1: LDII @LOCK,R1 ;Load lock value into R1
     STII R0,@LOCK ;Set lock value to 1
     BNZ L1 ;If R1 (previous lock value) is not
             ;0, read it again
```

Example 9–2 shows how a location COUNT may contain a count of the number of times a particular operation must be performed. This operation may be performed by any processor in the system. If the count is zero, the processor waits until it is nonzero before beginning processing. The example also shows the algorithm for modifying COUNT correctly.

Example 9–2. Task Counter Manipulation

```plaintext
LDI 0,R0
WAIT LDII @COUNT,R1 ;Read current value of counter
BZD WAIT ;If COUNT = 0, try again
LDNZ 1,R0 ;If COUNT not zero, decrement it
SUBI R0,R1
STII R1,@COUNT ;Update COUNT
```

Figure 9–28 illustrates multiple 'C4xs sharing global memory and using interlocked instructions as shown in Example 9–3 and Example 9–4.

Figure 9–28. Multiple 'C4x Devices Sharing Global Memory
Sometimes it may be necessary for several processors to access some shared data or other common resources. The portion of code that must access the shared data is called a critical section.

To ease the programming of critical sections, semaphores may be used. Semaphores are variables that can take only nonnegative integer values. Two primitive, indivisible operations are defined on semaphores (with S being a semaphore):

\[ V(S) : S + 1 \to S \]
\[ P(S) : \begin{cases} P : & \text{if } (S == 0), \text{ go to } P \\ & \text{else } S - 1 \to S \end{cases} \]

Indivisibility of V(S) and P(S) means that when these processes access and modify the semaphore S, they are the only processes doing so.

To enter a critical section, a P operation is performed on a common semaphore, for example, on S (S is initialized to 1). The first processor performing P(S) will be able to enter its critical section. All other processors are blocked because S has become 0. After leaving its critical section, the processor performs a V(S), thus allowing another processor to execute P(S) successfully.

The 'C4x code for V(S) is shown in Example 9–3, and code for P(S) is shown in Example 9–4. Compare the code in Example 9–4 to the code in Example 9–2, which does not use semaphores.
9.7.5 Bus-Lock Pins and Bus Timing

The timing of the LOCK and LLOCK pins is the same as the timing of the STAT(3–0) and LSTAT(3–0) pins. The LDII, LDFI, STII, STFI, and SIGI instructions manipulate the bus-lock signals only when an external memory access is made.

LDII, LDFI, and SIGI all clear LOCK or LLOCK to zero at the beginning of the read cycle with H1 falling. STII, STFI, and SIGI all set LOCK or LLOCK to one at the end of the access cycle on the falling edge of H1. Interlocked instructions are explained in Section 9.7.

Figure 9–29 through Figure 9–32 show bus timing characteristics for several external accesses using STII, LDII, STFI, LDFI, and SIGI.
Figure 9–29 is an example of an LDII or LDFI external access.

**Figure 9–29. LDII or LDFI External Access**
Figure 9–30 is an example of STII or STFI external access following the previous interlocked load (shown in Figure 9–29) and an idle cycle. This is the timing for an interlocked load/interlocked store sequence.

**Figure 9–30. LDII or LDFI and STII or STFI External Access**
Figure 9–31 is an example of a SIGI external access.

Figure 9–31. SIGI External Access Timing
Figure 9–32 illustrates timing for SIGI if the LOCK signal is already low. This could occur when a SIGI follows an LDII instruction. Since LOCK is already low, the only effect SIGI has on LOCK is to bring it high.

Figure 9–32. SIGI When LOCK Is Already Low
9.8 IACK Timing

The IACK pin is affected by the IACK (interrupt acknowledge) instruction. The timing of the pin is similar to that of the LOCK pin when used by the SIGI instruction. In all respects (timing, extension with wait states, etc.) the IACK behaves like a LOCK or STAT signal. The only difference is that there is only one IACK pin.

The timing for the IACK pin is shown in Figure 9–33. Like the interlocked instructions, the IACK instruction affects IACK only for an external access.
Figure 9–33. IACK Timing

![IACK Timing Diagram]

- H1
- R/W0
- STRB0
- RDY0
- PAGE0
- R/W1
- STRB1
- RDY1
- PAGE1
- D31 – D0: Read
- A30 – A0
- STAT3 – STAT0: (STRB1 read) (STRB1 IACK read)
- IACK
- IACK external access
The bootloader provided in the on-chip ROM of the ‘C4x can load and execute source programs that are received from a host processor, an EPROM, or a standard memory device. The ‘C4x bootloader functions primarily as either a memory bootloader or as a communication port bootloader.

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>10.1 Bootloader Description</td>
<td>10-2</td>
</tr>
<tr>
<td>10.2 Mode Selection</td>
<td>10-3</td>
</tr>
<tr>
<td>10.3 Bootloading Sequence</td>
<td>10-5</td>
</tr>
<tr>
<td>10.4 Bootloading from External Memory (Examples)</td>
<td>10-10</td>
</tr>
<tr>
<td>10.5 Bootloading from a Communication Port (Examples)</td>
<td>10-16</td>
</tr>
<tr>
<td>10.6 Modifying the II0Fx Pins After Bootloading</td>
<td>10-19</td>
</tr>
<tr>
<td>10.7 The Bootloader Program</td>
<td>10-20</td>
</tr>
</tbody>
</table>
10.1 Bootloader Description

The bootloader code starts at location 0x11bc in the on-chip ROM in both the ’C40 and ’C44. For ’C44 device revisions ≤ 1.0, the ’C44 bootloader code is identical to the ’C40 bootloader code. For ’C44 device revisions > 1.0, the ’C44 bootloader code differs in three memory locations from the ’C40 bootloader. These three locations are noted in the code. The bootloader program is listed in Section 10.7, The Bootloader Program.
10.2 Mode Selection

The 'C4x bootloader functions primarily as either a memory bootloader or a communication port bootloader. Bootloader mode selection is determined by the IIOF(3–0) pins, as described in Table 10–1 and shown in Figure 10–1.

- The memory bootloader supports user-definable byte, half-word, and full-word data formats, which allow the flexibility to load a source program from memories having widths of 8 bits, 16 bits, or 32 bits. The source programs to be loaded must reside in one of six predefined memory locations, as listed in Table 10–1. STRB0 (LSTRB0) should be used because they are the active strobes after reset. Figure 10–2 shows the flow for the memory bootloader.

- The communication port bootloader waits for the first data input from one of the six ('C40) or four ('C44) communication port channels and uses that channel to perform the bootload. The format of the incoming data stream is similar to that for a memory data stream, except that the source memory width is excluded (the format is described in Table 10–2). Figure 10–3 shows the flow of the communication port bootloader.

Table 10–1. Bootloader Mode Selection Using Pins IIOF(3–0)

<table>
<thead>
<tr>
<th>External Pin</th>
<th>Source Program Location</th>
</tr>
</thead>
<tbody>
<tr>
<td>IIOF3 IIOF2 IIOF1 IIOF0</td>
<td>'C40</td>
</tr>
<tr>
<td>1 1 0 1</td>
<td>0030 0000h</td>
</tr>
<tr>
<td>1 0 1 1</td>
<td>4000 0000h</td>
</tr>
<tr>
<td>1 0 0 1</td>
<td>6000 0000h</td>
</tr>
<tr>
<td>0 1 1 1</td>
<td>8000 0000h</td>
</tr>
<tr>
<td>0 1 0 1</td>
<td>A000 0000h</td>
</tr>
<tr>
<td>0 0 1 1</td>
<td>C000 0000h</td>
</tr>
<tr>
<td>0 0 0 1</td>
<td>Reserved (the bootloader terminates)</td>
</tr>
<tr>
<td>1 1 1 1</td>
<td>Communication port</td>
</tr>
</tbody>
</table>

† The 'C44 external-address buses each have only the low 24 bits of the internal address bus. Thus, the internal address 4000000h maps to 0h on the local bus. Any address at or above 80000000h is mapped to the global bus; 80800000h, for example, maps to address 00800000h on the global bus.
Figure 10–1. Mode Selection Flow

1. **Begin**
2. Reset with ROMEN=1, IIOF0 = 1 and RESETLOC(0,1) = 00₂
3. **Check comm. port 0**
4. **Memory load from 4000 0000h**
5. **Register IIF(3–1) = 101₂**
6. **Memory load from 4000 0000h**
7. **No**
8. **Register IIF(3–1) = 110₂**
9. **Memory load from 0030 0000h**
10. **Register IIF(3–1) = 110₂**
11. **Yes**
12. **Memory load from 0030 0000h**
13. **No**
14. **Check next comm. port**
15. **All comm. ports checked?**
16. **No**
17. **Data in comm. port IFIFO?**
18. **No**
19. **Comm. port load**
20. **Check comm. port 0**
21. **Yes**
22. **Exit bootloader code**
23. **Yes**
24. **No**
25. **Memory load from 8000 0000h**
26. **Register IIF(3–1) = 011₂**
27. **Memory load from 8000 0000h**
28. **No**
29. **Register IIF(3–1) = 100₂**
30. **Memory load from 6000 0000h (C40) from 0080 0000h (C44)**
31. **No**
32. **Register IIF(3–1) = 100₂**
33. **Memory load from 6000 0000h (C40) from 0080 0000h (C44)**
34. **No**
35. **Register IIF(3–1) = 011₂**
36. **Memory load from 8000 0000h**
37. **No**
38. **Register IIF(3–1) = 010₂**
39. **Memory load from A000 0000h (C40) 8040 0000h (C44)**
40. **No**
41. **Register IIF(3–1) = 010₂**
42. **Memory load from A000 0000h (C40) 8040 0000h (C44)**
43. **No**
44. **Register IIF(3–1) = 001₂**
45. **Memory load from C000 0000h (C40) 8080 0000h (C44)**
46. **No**
47. **Register IIF(3–1) = 001₂**
48. **Memory load from C000 0000h (C40) 8080 0000h (C44)**
49. **No**
50. **Register IIF(3–1) = 000₂**
51. **Exit bootloader code**
52. **No
10.3 Bootloading Sequence

Here is the general sequence of events in bootloading a source program:

1) Select the bootloader by resetting the 'C4x while driving the
   RESETLOC(1,0) pins low, the on-chip ROM enable pin (ROMEN) high,
   and the IIF0 pin high. The ROMEN pin must be high during bootloader
   execution, but it can be changed anytime after bootloading.

2) The status of external pins IIF(3–1) indicates where to find the source
   program to be loaded (memory or communication port). These options are
   listed in Table 10–1. Pins IIF(3–1) are read as the IIF flags in the CPU
   IIF register. The bootloader takes the following steps to determine the
   source program's location, as is shown in Figure 10–1.
   a) If an IIF(3–1) value of from 1102 to 0012 (6 to 1) is found, the source
      program is loaded from the corresponding memory address shown in
      the top six lines of Table 10–1. See Figure 10–2 for details on boot-
      loader memory flow.
   b) The IIF(3–1) value of 0002 (0) is reserved. You should not use this
      mode.
   c) If none of the combinations 0002–1102 are found, the bootloader
      program assumes that loading will be via a communication port, and
      it starts checking communication port input channels (in the order port
      0 through port 5). If it finds no inputs from a communication port, the
      program returns to checking the status of the IIF(3–1) pins again.
      See Figure 10–3 for details on bootloader communication port flow.

3) When the source program's data stream is found, the program is loaded
   at the address found in the fifth word of the data stream (the format is
   shown in Table 10–2), using the bus width specified in the first word (8, 16,
   or 32 bits wide). The bootloader cannot load the source program to any
   location below 0000 1000h, unless the address decode logic is remapped.
   The first five words of the source program specify its loading and execution
   criteria. Remaining words are the source program(s) and vector table
   pointers as shown in Table 10–2.

4) An IACK instruction is executed, indicating the completion of the bootload
   sequence. This indication can then be used to switch from microcomputer
   (ROMEN = 1) to microprocessor mode (ROMEN = 0). You do not need to
   reset the 'C4x to change the ROMEN pin. However, ensure that the 'C4x
   will not access addresses 0000 0000h to 0000 0FFFh during the change.

5) The source program is executed (entry point is the first word of the first
   loaded program).
Bootloading Sequence

Figure 10–2. Memory Load Flow

1. Memory load
2. Branch to source program address
3. Determine mode 8, 16, or 32 bits?
4. Set global bus control register
5. Set local bus control register
6. Load block size
7. Block size = 0?
   - Yes: Set IVTP register
   - No: Load destination address
8. Transfer 32-Bit data from source to destination
9. End of block?
   - Yes: Begin program execution
   - No: Go back to step 4.
10. Set TVTP register
11. Execute IACK
12. Branch to destination address of first block loaded
13. Begin program execution
Figure 10–3. Communication-Port Load Mode Flow

1. Comm. port load
2. Wait for comm. port input
3. Set global bus control register
4. Wait for comm. port input
5. Set local bus control register
6. Wait for comm. port input
7. Load block size
8. Block size=0?
   - Yes: Set IVTP register, set TVTP register, execute IACK, branch to destination address of first block loaded
   - No: Wait for comm. port input
9. Load destination address
10. Transfer one word from comm. port to destination address
11. End of block?
    - Yes: Begin program execution
    - No: Wait for comm. port input
The data stream with its source program(s) should be in the format shown in Table 10–2. The contents of words 4 through $n$ vary for the different source programs loaded throughout the entire data stream. The first three words and the last three words are nonvariables that affect each of the source-program blocks. The eight least significant bits (LSBs) of the first word specify the memory width. If byte or half-word wide is selected, the loading sequence is from LSBs to MSBs.

<table>
<thead>
<tr>
<th>Word</th>
<th>Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td>0x0 1</td>
<td>Memory width where source program resides (8, 16, or 32 bits wide)</td>
</tr>
<tr>
<td>0x0 2</td>
<td>Value to set in the global memory interface control register (shown in Figure 9–2).</td>
</tr>
<tr>
<td>0x0 3</td>
<td>Value to set in the local memory interface control register (shown in Figure 9–2).</td>
</tr>
<tr>
<td>0x0 4</td>
<td>Block size in 32-bit words of the first program block to be loaded (after the number of words is loaded, the next word should be all zeros; if not, another block is assumed to follow).</td>
</tr>
<tr>
<td>0x0 5</td>
<td>Address where the source program is to be loaded.</td>
</tr>
<tr>
<td>$n$ 6</td>
<td>First word of source program.</td>
</tr>
<tr>
<td>$n$ 7</td>
<td>Last word of source program (the program organized as words 4 through $n$ — these shaded words).</td>
</tr>
<tr>
<td>$n$+1 8</td>
<td>Word of all zeros. (Note that if several source-program blocks were sent, word $n$ above would be the last word of the last source-program block. Each source-program block would have the format shown in words 4 through $n$. This word of all zeros follows the last source program block).</td>
</tr>
<tr>
<td>$n$+2 9</td>
<td>IVTP value (interrupt vector table pointer, see Section 3.2).</td>
</tr>
<tr>
<td>$n$+3 10</td>
<td>TVTP value (trap vector table pointer, see Section 3.2).</td>
</tr>
<tr>
<td>$n$+4 11</td>
<td>Memory location for IACK instruction (see IACK instruction in Chapter 14).</td>
</tr>
</tbody>
</table>

Note: The shaded area identifies the source program block.

Each source program in a multiple block program transfer can be loaded to different specified destinations. Each program block specifies its program's size and destination address at the beginning of the block. End the entire block program loader function by following the last block with an all-zero word (0000 0000h).
The second and third last words of the source memory define the interrupt vector table pointer (IVTP) and the trap vector table pointer (TVTP). The last word of the source memory defines the memory location for the IACK instruction. The IACK instruction brings the IACK signal low as data is read, if the memory location specified in the IACK instruction is in external memory that is available in the system. Finally, the processor begins execution of the first code block.

**CAUTION**

It is assumed that at least one block of source will be loaded when the bootloader is invoked. Initial loader invocation with a block size of 0000 0000h produces unpredictable results.
10.4 Bootloading from External Memory (Examples)

When the 'C4x's ROMEN input pin is high and RESETLOC(1,0)=00₂ during reset, the memory bootloader can load programs stored in off-chip memory (typically 8-, 16-, or 32-bit ROMs) at an address determined by the IIOF pins to any valid external or internal memory in the 'C4x's memory map.

Because address zero (0) is reserved for the bootloader, address zero should not be used for the reset vector when a user-defined, internal ROM-code mask is used.

The 8 LSBs of the first word of data read stream specify the memory width (8, 16, or 32 bits) as shown in Table 10–3, Table 10–4, and Table 10–5.

- 8-bit memories: 08h
- 16-bit memories: 0010h
- 32-bit memories: 0000 0020h

If 8- or 16-bit external memories are used, the loading sequence is from LSBs to MSBs. The bootloader reads the contents of 16-bit wide memories (least significant half word first) and packs each pair of 16-bit half words to make a 32-bit word before loading each word to memory. Accordingly, the bootloader reads the contents of byte-wide memories (least significant byte first) and packs each group of four bytes into a 32-bit word before loading each word to memory. Because the bootloader packs bytes before loading, no external hardware is needed to pack the loaded bytes into a 32-bit word. For 32-bit wide external memories, no byte packing is necessary, because the memory data width matches that of the 'C4x.

For 16-bit memories, the data read is expected to be in bit positions 0–15. Thus, the half-word memory’s data lines should be interfaced to 'C4x data lines (L)D15–0. For byte-wide memories, the data read is expected to be in bit positions 0–7. Hence, the byte-wide memory’s data lines should be interfaced to 'C4x data lines (L)D7–0. Even though the 'C4x does not require that unused data lines be pulled up to VCC, it is recommended that each unused data line be pulled up through separate 22 KΩ resistors to 5 volts for minimum power dissipation.

Table 10–3, Table 10–4, and Table 10–5 show example data streams for 8-bit, 16-bit, and 32-bit wide configured memories, respectively.
These examples assume that:

- The status of the IIOF (0–3) pins is $110_2$ after reset is deasserted (memory load from 0030 0000h — see Table 10–1).

- The source program resides at memory location 0030 0000h and defines the following:
  - Memory width for bootloader: 8, 16, or 32 bits
  - Global bus memory with one software wait state, external RDY (SWW = 11), page size = 64K words for both STRB0 and STRB1, and an active address range = 1G words for both STRB0 and STRB1.
  - Local memory bus that requires two software wait states (SWW = 01), page size = 32K words, and active address range = 1G words for both STRB0 and STRB1.
  - First block program of 294 words in length and whose destination address is at 002F F840h.
  - Second block program of 64 words in length and whose destination address is at 002F F800h.
  - IVTP and TVTP, which are overlapped and point to the beginning of the on-chip RAM.
  - Memory location of 0030 0000h for IACK instruction.

### Table 10–3. Byte-Wide Configured Memory

<table>
<thead>
<tr>
<th>Word</th>
<th>Address</th>
<th>Value</th>
<th>Comments</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0030 0000h</td>
<td>08h</td>
<td>Memory width = 8 bits</td>
</tr>
<tr>
<td></td>
<td>0030 0001h</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 0002h</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 0003h</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td>2</td>
<td>0030 0004h</td>
<td>F0h</td>
<td>Global memory bus control word = 1D7B C9F0h</td>
</tr>
<tr>
<td></td>
<td>0030 0005h</td>
<td>C9h</td>
<td>(Described in Figure 9–2 on page 9-7)</td>
</tr>
<tr>
<td></td>
<td>0030 0006h</td>
<td>7Bh</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 0007h</td>
<td>1Dh</td>
<td></td>
</tr>
</tbody>
</table>
Table 10–3. Byte-Wide Configured Memory (Continued)

<table>
<thead>
<tr>
<th>Word</th>
<th>Address</th>
<th>Value</th>
<th>Comments</th>
</tr>
</thead>
<tbody>
<tr>
<td>3</td>
<td>0030 0008h</td>
<td>50h</td>
<td>Local memory bus control word = 1D73 9250h</td>
</tr>
<tr>
<td></td>
<td>0030 0009h</td>
<td>92h</td>
<td>(Described in Figure 9–2 on page 9-7)</td>
</tr>
<tr>
<td></td>
<td>0030 000Ah</td>
<td>73h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 000Bh</td>
<td>1Dh</td>
<td></td>
</tr>
<tr>
<td>4</td>
<td>0030 000Ch</td>
<td>26h</td>
<td>1st source program block size = 126h</td>
</tr>
<tr>
<td></td>
<td>0030 000Dh</td>
<td>01h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 000Eh</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td>5</td>
<td>0030 0010h</td>
<td>40h</td>
<td>1st source program block starting addr = 002F F840h</td>
</tr>
<tr>
<td></td>
<td>0030 0011h</td>
<td>F8h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 0012h</td>
<td>2Fh</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 0013h</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td>6</td>
<td>0030 0014h</td>
<td>1st source program block starts here (first word)</td>
<td></td>
</tr>
<tr>
<td>299</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 04ABh</td>
<td>1st source program block ends here (last word)</td>
<td></td>
</tr>
<tr>
<td>300</td>
<td>0030 04ACh</td>
<td>40h</td>
<td>2nd source program block size = 40</td>
</tr>
<tr>
<td></td>
<td>0030 04ADh</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 04AEh</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 04AFh</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td>301</td>
<td>0030 04B0h</td>
<td>00h</td>
<td>2nd source program block starting addr = 002F F800h</td>
</tr>
<tr>
<td></td>
<td>0030 04B1h</td>
<td>F8h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 04B2h</td>
<td>2Fh</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 04B3h</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td>302</td>
<td>0030 04B4h</td>
<td>2nd source program block starts here (first word)</td>
<td></td>
</tr>
<tr>
<td>365</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 05B3h</td>
<td>2nd source program block ends here (last word)</td>
<td></td>
</tr>
</tbody>
</table>

**Note:** The shaded area identifies the source program block.
### Table 10-3. Byte-Wide Configured Memory (Continued)

<table>
<thead>
<tr>
<th>Word</th>
<th>Address</th>
<th>Value</th>
<th>Comments</th>
</tr>
</thead>
<tbody>
<tr>
<td>366</td>
<td>0030 05B4h</td>
<td>00h</td>
<td>Value 0 to terminate the program block load</td>
</tr>
<tr>
<td></td>
<td>0030 05B5h</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 05B6h</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 05B7h</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td>367</td>
<td>0030 05B8h</td>
<td>00h</td>
<td>IVTP = 002F F800h</td>
</tr>
<tr>
<td></td>
<td>0030 05B9h</td>
<td>F8h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 05BAh</td>
<td>2Fh</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 05BBh</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td>368</td>
<td>0030 05BCh</td>
<td>00h</td>
<td>TVTP = 002F F800h</td>
</tr>
<tr>
<td></td>
<td>0030 05BDh</td>
<td>F8h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 05BEh</td>
<td>2Fh</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 05BFh</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td>369</td>
<td>0030 05C0h</td>
<td>00h</td>
<td>Memory location for IACK instruction =0030 0000h</td>
</tr>
<tr>
<td></td>
<td>0030 05C1h</td>
<td>00h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 05C2h</td>
<td>30h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0030 05C3h</td>
<td>00h</td>
<td>(This is the final word in the data stream.)</td>
</tr>
</tbody>
</table>

**Note:** The shaded area identifies the source program block.
Table 10–4. 16-Bit Wide Configured Memory

<table>
<thead>
<tr>
<th>Word</th>
<th>Address</th>
<th>Value</th>
<th>Comments</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0030 0000h</td>
<td>0010h</td>
<td>Memory width = 16 bits</td>
</tr>
<tr>
<td></td>
<td>0030 0001h</td>
<td>0000h</td>
<td></td>
</tr>
<tr>
<td>2</td>
<td>0030 0002h</td>
<td>C9F0h</td>
<td>Global memory bus control word = 1D7B C9F0h</td>
</tr>
<tr>
<td></td>
<td>0030 0003h</td>
<td>1D7Bh</td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>0030 0004h</td>
<td>9250h</td>
<td>Local memory bus control word = 1D73 9250h</td>
</tr>
<tr>
<td></td>
<td>0030 0005h</td>
<td>1D73h</td>
<td></td>
</tr>
<tr>
<td>4</td>
<td>0030 0006h</td>
<td>0126h</td>
<td>1st program block size = 126h</td>
</tr>
<tr>
<td></td>
<td>0030 0007h</td>
<td>0000h</td>
<td></td>
</tr>
<tr>
<td>5</td>
<td>0030 0008h</td>
<td>F840h</td>
<td>1st program block starting addr. = 002F F840h</td>
</tr>
<tr>
<td></td>
<td>0030 0009h</td>
<td>002Fh</td>
<td></td>
</tr>
<tr>
<td>6 to 299</td>
<td>0030 00Ah</td>
<td>1st program block starts here (first word)</td>
<td></td>
</tr>
<tr>
<td>300</td>
<td>0030 0256h</td>
<td>0040h</td>
<td>2nd program block size = 40h</td>
</tr>
<tr>
<td></td>
<td>0030 0257h</td>
<td>0000h</td>
<td></td>
</tr>
<tr>
<td>301</td>
<td>0030 0258h</td>
<td>F800h</td>
<td>2nd program block starting addr. = 002F F800h</td>
</tr>
<tr>
<td></td>
<td>0030 0259h</td>
<td>002Fh</td>
<td></td>
</tr>
<tr>
<td>302 to 365</td>
<td>0030 025Ah</td>
<td>2nd program block starts here (first word)</td>
<td></td>
</tr>
<tr>
<td>366</td>
<td>0030 02DAh</td>
<td>0000h</td>
<td>Value 0 to terminate the program block load</td>
</tr>
<tr>
<td></td>
<td>0030 02DBh</td>
<td>0000h</td>
<td></td>
</tr>
<tr>
<td>367</td>
<td>0030 02DCh</td>
<td>F800h</td>
<td>IVTP = 002F F800h</td>
</tr>
<tr>
<td></td>
<td>0030 02DDh</td>
<td>002Fh</td>
<td></td>
</tr>
<tr>
<td>368</td>
<td>0030 02DEh</td>
<td>F800h</td>
<td>TVTP = 002F F800h</td>
</tr>
<tr>
<td></td>
<td>0030 02DFh</td>
<td>002Fh</td>
<td></td>
</tr>
</tbody>
</table>

Note: The shaded area identifies the source program block.
**Table 10–4. 16-Bit Wide Configured Memory (Continued)**

<table>
<thead>
<tr>
<th>Word</th>
<th>Address</th>
<th>Value</th>
<th>Comments</th>
</tr>
</thead>
<tbody>
<tr>
<td>369</td>
<td>0030 02E0h</td>
<td>0000h</td>
<td>Memory location for IACK instruction = 0030 0000h</td>
</tr>
<tr>
<td></td>
<td>0030 02E1h</td>
<td>0030h</td>
<td>(This is the final word in the data stream.)</td>
</tr>
</tbody>
</table>

**Note:** The shaded areas identify the source program blocks.

**Table 10–5. 32-Bit Wide Configured Memory**

<table>
<thead>
<tr>
<th>Word</th>
<th>Address</th>
<th>Value</th>
<th>Comments</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0030 0000h</td>
<td>0000 0020h</td>
<td>Memory width = 32 bits</td>
</tr>
<tr>
<td>2</td>
<td>0030 0001h</td>
<td>1D7B C9F0h</td>
<td>Global memory bus control word = 01D7B C9F0h</td>
</tr>
<tr>
<td>3</td>
<td>0030 0002h</td>
<td>1D73 9250h</td>
<td>Local memory bus control word = 01D73 9250h</td>
</tr>
<tr>
<td>4</td>
<td>0030 0003h</td>
<td>0000 0126h</td>
<td>1st program block size = 126h</td>
</tr>
<tr>
<td>5</td>
<td>0030 0004h</td>
<td>002F F840h</td>
<td>1st program block starting addr = 002F F840h</td>
</tr>
<tr>
<td>6</td>
<td>0030 0005h</td>
<td>002F F800h</td>
<td>2nd program block starting addr = 002F F800h</td>
</tr>
<tr>
<td></td>
<td>299</td>
<td></td>
<td>2nd program block starts here (first word)</td>
</tr>
<tr>
<td></td>
<td>300</td>
<td></td>
<td>2nd program block ends here (last word)</td>
</tr>
</tbody>
</table>

**Note:** The shaded areas identify the source program blocks.
10.5 Bootloading from a Communication Port (Examples)

A value of all 1s on \texttt{IIOF\{0–3\}} signals that the source program is being transmitted via a communication port. Bringing all four of the \texttt{IIOF\{0–3\}} pins high also allows the pins to be used as interrupt lines without any external decode logic. With pins \texttt{IIOF\{0–3\}} all high at reset, the \textquoteleft C4x polls the input level of each port to determine which channel contains the program. The input data sequence of the communication bootloader is the same as that of the memory bootloader except that it lacks the source memory width definition (because the memory width of the communication port bootloader is fixed).

Example 10–1 is a program listing for booting a multiprocessor system.

After a 32-bit boot from external memory, the master \textquoteleft C4x boots — via a communication port—another \textquoteleft C4x (slave processor) connected to communication port 0 of the master processor. Both processors stay in an infinite loop after booting. The code should be loaded in the master \textquoteleft C4x EPROM in the correct memory location according to the \texttt{IIOF} settings of the master \textquoteleft C4x. All \texttt{IIOF} pins of the slave processor should be set to 1. The ROMEN pin is enabled (ROMEN=1) and the RESETLOC\{1,0\} pins are low in both processors. For a description of how to convert an executable COFF file into an EPROM programmer format, see the hex conversion utility in the \textit{TMS320 Floating Point Assembly Language Tools User’s Guide} (literature number SPRU035).
Example 10–1. Booting a `C4x Multiprocessor System

*---------------------------------------------------------------
*        MASTER PROCESSOR BOOT TABLE
*---------------------------------------------------------------
.text
.word  32 ; memory width
.word  3003c000h ; MASTER global control register
                 ; (system specific !!)
.word  3d79c210h ; master local control register
                 ; (system specific !!)

*---------------------------------------------------------------
*        MASTER PROCESSOR PROGRAM BLOCK
*---------------------------------------------------------------
.word  10 ; block size
.word  2ff800h ; block dest addr

* Code for master processor: this code sends boot table to slave processor
ldi  8,rc ; loop 9 times: size of slave processor
          ; boot table
rptbd  endb1
ldp  src
ldi @src,ar0
ldi @dst,arl
ldi *ar0++(1),r0 ; block start
endb1: sti r0,*arl
bu $ ; master processor loops forever
src .word BOOT_TABLE2 ; address of boot table of slave
                 ; processor
dst .word 100042h ; address of OFIFO connected to slave
                 ; processor

*---------------------------------------------------------------
*        END OF ALL BLOCKS
*---------------------------------------------------------------
.word  0 ; master end of bootload sequence
.word  2ffd00h ; master IVTP value
.word  2ffd00h ; master TVTP value
.word  40000000h ; master address for iack

*---------------------------------------------------------------
*        END OF MASTER PROCESSOR BOOT TABLE : size = 9 words
*---------------------------------------------------------------
Example 10–1. Booting a `C4x Multiprocessor System (Continued)

```
*----------------------------------------------
* SLAVE PROCESSOR BOOT TABLE
*----------------------------------------------
BOOT_TABLE2:        ; slave BOOT TABLE
   .word   3003c000h ; slave global control register
   ; (system specific !!!)
   .word   3d79c210h ; slave local control register
   ; (system specific !!!)
   .word   1           ; block size
   .word   2ff800h    ; dst load address
   bu $               ; slave processor loops forever
   .word   0           ; slave end of bootload sequence
   .word   2ffd00h    ; slave IVTP value
   .word   2ffd00h    ; slave TVTP value
   .word   40000000h  ; slave address for iack
*----------------------------------------------
* END OF EPROM CODE
*----------------------------------------------
```
10.6 Modifying the IIOFx Pins After Bootloading

The load options are based upon the status of IIOF(3–0) as general-purpose input pins. Therefore, to select the correct bootloader mode, pins IIOF(3–0) must be kept at a constant valid status value (see Table 10–1 for a list of values).

After the bootload is complete, the IACK signal is brought low until the read phase in the pipeline finishes. Figure 10–4 shows an example circuit that generates the IIOF(3–0) signals for bootload selection and, after bootload operation, allows incoming external interrupts. In this example, after reset, the IIOF pins stay low until the IACK signal is received.

Figure 10–4. Circuit for Generation of a Low IIOF Signal for Bootloader Selection
10.7 The Bootloader Program

* *******************************************************
* * C40BOOT - TMS320C40 BOOTLOADER PROGRAM
* * (C) COPYRIGHT TEXAS INSTRUMENTS INC., 1990
* *
* NOTE 1. AFTER THE DEVICE IS RESET, THE PROGRAM IS CHECKING THE INPUT STATUS
* OF IIOF0–3 PINS AND COMMUNICATION PORT INPUT FLAGS TO CONFIGURE ITSELF WHEN
* THE ON-CHIP ROM IS ENABLED (ROMEN=1). THE IIOF0 PIN IS ASSUMED TO BE HIGH.
* *
* NOTE 2. THE FUNCTION SELECTION OF IIOF0–3 IS LISTED AS:
* *
<table>
<thead>
<tr>
<th>IIOF</th>
<th>IIOF</th>
<th>IIOF</th>
<th>IIOF</th>
<th>FUNCTION</th>
</tr>
</thead>
<tbody>
<tr>
<td>3</td>
<td>2</td>
<td>1</td>
<td>0</td>
<td>Memory bootloader from 00300000H</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>Memory bootloader from 40000000H</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>Memory bootloader from 60000000H</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>Memory bootloader from 80000000H</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>Memory bootloader from C0000000H</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>Reserved</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>Communication port bootloader</td>
</tr>
</tbody>
</table>
*
*
* THE PROGRAM ASSUMES THE COMMUNICATION PORT BOOTLOADER IS THE DEFAULT
* FUNCTION. IF NO OTHER FUNCTION IS SELECTED, THE PROGRAM STARTS CHECKING
* THE COMMUNICATION PORT INPUT CHANNELS. IF THERE IS NO INPUT FROM A
* COMMUNICATION PORT, THE PROGRAM RECHECKS THE IIOF(3–0) STATUS AGAIN.
*
* NOTE 3. MEMORY BOOTLOADER LOADS WORD, HALF-WORD, OR BYTE WIDE PROGRAM TO
* DIFFERENT SPECIFIED LOCATIONS. THE 8 LSBs OF THE FIRST MEMORY SPECIFIES THE
* MEMORY WIDTH. IF THE HALF-WORD OR BYTE WIDE PROGRAM IS SELECTED, THE LSBs
* ARE LOADED FIRST AND THEN THE MSBs. THE NEXT 2 WORDS CONTAIN THE CONTROL
* WORD FOR THE GLOBAL AND LOCAL MEMORY INTERFACE CONTROL REGISTERS. NEXT COME
* THE PROGRAM BLOCKS. THE FIRST TWO WORDS OF EACH PROGRAM BLOCK CONTAIN THE
* BLOCK SIZE AND DESTINATION ADDRESS WHERE THE PROGRAM IS TO BE LOADED. WHEN
* THE ZERO BLOCK SIZE IS READ, THE PROGRAM BLOCK LOADING IS TERMINATED. THE
* NEXT TWO WORDS ARE THE INITIAL VALUES FOR THE IVTP AND TVTP REGISTERS.
* AFTER THE BOOTLOADING IS COMPLETED, THE IACK SIGNAL IS SENT OUT ACCORDING
* TO THE LAST WORD OF THE SOURCE MEMORY, AND THE PROGRAM COUNTER WILL BRANCH
* TO THE STARTING ADDRESS OF THE FIRST PROGRAM BLOCK.
*
* NOTE 4. IF IIOF(3–0) ARE SET FOR COMMUNICATION PORT BOOTLOADER, THE PROCESSOR
* WAITS FOR THE FIRST INPUT FROM AN INPUT COMMUNICATION CHANNEL AND USE THAT
* CHANNEL TO PERFORM THE DOWNLOAD. THE BEGINNING TWO WORDS SHOULD CONTAIN THE
* GLOBAL AND LOCAL BUS CONTROL WORDS. SIMILAR TO THE MEMORY LOADER, THE
* PROGRAM CAN BE LOADED INTO DIFFERENT MEMORY BLOCKS. THE FIRST TWO WORDS OF
* EACH PROGRAM BLOCK CONTAINS THE BLOCK SIZE AND MEMORY ADDRESS TO BE LOADED
* INTO. WHEN THE ZERO BLOCK SIZE IS READ, THE PROGRAM BLOCK LOADING IS
* TERMINATED. IN OTHER WORDS, TO TERMINATE THE PROGRAM BLOCK LOADING, A
* ZERO HAS TO BE ADDED AT THE END OF PROGRAM BLOCK. THE FOLLOWING TWO WORDS
* ARE THE INITIAL VALUES FOR THE IVTP AND TVTP REGISTERS. AFTER THE BOOT-
* LOADING IS COMPLETED, THE IACK SIGNAL IS SENT OUT ACCORDING TO THE LAST
* WORD OF THE SOURCE MEMORY AND THE PROGRAM COUNTER BRANCHES TO THE STARTING
* ADDRESS OF THE FIRST PROGRAM BLOCK.
The Bootloader Program

```
.sect "boot"
 ************************************************************
*              TMS320C4x PROCESSOR BOOTLOADER               *
************************************************************
BOOT:      LDI    COM_LOAD,R10    ; Comm. port load subroutine address -> R10
         LDHI   0010H,AR0       ; Load peripheral mem. map start addr 100000H
*
*    CHECK THE IIOF1–3 FOR THE BOOTLOADER
*
CHECK:    LDHI   0030H,AR1       ; Load memory address = 00300000H
         CMPI   04404H,IIF      ; Test function 110 condition
         BEQ    MEMORY          ; If true, execute memory bootloader
         LDHI   04000H,AR1      ; Load memory address = 40000000H
         CMPI   04044H,IIF      ; Test function 101 condition
         BEQ    MEMORY          ; If true, execute memory bootloader
|       LDHI   06000H,AR1      ; Load memory address = 60000000H
|   ’C44: LDHI 00080h,AR1    ; replace previous line with this line ('C44)
|       CMPI   04004H,IIF      ; Test function 100 condition
|       BEQ    MEMORY          ; If true, execute memory bootloader
|       LDHI   08000H,AR1      ; Load memory address = 80000000H
|       CMPI   00444H,IIF      ; Test function 011 condition
|       BEQ    MEMORY          ; If true, execute memory bootloader
|       LDHI   0A000H,AR1      ; Load memory address = A0000000H
|   ’C44: LDHI 08040,AR1      ; replace previous line with this line ('C44)
|       CMPI   00404H,IIF      ; Test function 010 condition
|       BEQ    MEMORY          ; If true, execute memory bootloader
|       LDHI   0C000H,AR1      ; Load memory address = C0000000H
|   ’C44: LDHI 08080H         ; replace previous line with this line ('C44)
|       CMPI   00044H,IIF      ; Test function 001 condition
|       BEQ    MEMORY          ; If true, execute memory bootloader
|       CMPI   00004H,IIF      ; Test function 000 condition
|       BEQ    STATRAM         ; If true, branch to STATIC RAM TEST
*––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
*                     COMMUNICATION PORT BOOTLOADER
*––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
*    CHECK COMMUNICATION PORT INPUT CHANNEL
*
ADDI   040H,AR0,AR3    ; Point to comm. port 0 control register addr
LDI    5,AR1           ; Set loop counter for CHECK_CH loop
CHECK_CH:  LSH3   -9,*AR3,R1      ; Check comm port input
         BNZ    LOAD0          ; If input exist, start comm port loader
         ADDI   010H,AR3      ; Point to next comm. port channel addr
         DBU    AR1,CHECK_CH   ; Check next comm. port channel input
         B      CHECK         ; Recheck the input flags
```
The Bootloader Program

*---------------------------------------------------------------
*                          MEMORY BOOTLOADER
*---------------------------------------------------------------
*    TEST MEMORY WORD WIDTH
*---------------------------------------------------------------
MEMORY:    LDI *AR1++(1),R1 ; Load the memory word width
LDI W_WIDE,R10 ; Full-word size subroutine address -> R10
LSH 26,R1 ; Test bit 5 of mem. width word
BN LOAD0 ; If '1' start PGM loading (32 bits width)
NOP *AR1++(1) ; Jump last half word from mem. word
LDI H_WIDE,R10 ; Half-word size subroutine address -> R10
LSH 1,R1 ; Test bit 4 of mem. width word
BN LOAD0 ; If '1' start PGM loading (16 bits width)
NOP *AR1++(1) ; Jump last 1 bytes from mem. word
LDI B_WIDE,R10 ; Byte size subroutine address -> R10
NOP *AR1++(1) ; Jump last 1 bytes from mem. word

*    START PROGRAM LOADING
*---------------------------------------------------------------
LOAD0: LAJU R10 ; Load new word according to mem. width
LDHI 0010H,AR0 ; Load peripheral mem. map start addr 100000H
LDI 1,R0 ; Set start address flag off
NOP
LAJU R10 ; Load new word according to mem. width
STI AR2,*AR0 ; Set global bus control register
NOP
STI AR2,*+AR0(4) ; Set local bus control register

LOAD2: LAJU R10 ; Load new word according to mem. width
ADDI 1,R0 ; Set start address flag off
NOP
NOP
CMPI 0,AR2 ; If 0 block size start PGM
BEQ IVTP_LOAD

LAJU R10 ; Load new word according to mem. width
SUBI3 1,AR2,RC ; Set block size for repeat loop
NOP
SUBI 1,R10 ; Sub address with loop
LDI R0,R0 ; Test start address loaded flag
LDIP AR2,R9 ; Load start address if flag off
LAJU R10 ; Load block words according to mem. width
LDI AR2,AR0 ; Set destination address
LDI -1,R0 ; Set start & dest. address flag on
ADDI 1,R10 ; Sub address without loop
B LOAD2 ; Jump to load a new block when loop completed
* INITIALIZE IVTP AND TVTP REGISTERS *

IVTP_LOAD: LAJU R10 ; Load new word according to mem. width
NOP
NOP
NOP

TVTP_LOAD: LAJU R10 ; Load new word according to mem. width
LDPE AR2,IVTP ; Load the IVTP pointer
NOP
NOP
LAJU R10 ; Load new word according to mem. width
LDPE AR2,TVTP ; Load the TVTP pointer
NOP
NOP
IACK *AR2 ; Send out IACK signal out
BU R9 ; Branch to the start of the program

;––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––;
; BYTE–WIDE MEMORY BOOTLOADER SUBROUTINE           ;
;––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––;
LOOP_B: RPTB LOAD_B ; PGM load loop
B_WIDE: LWL0 *AR1++(1),AR2 ; Load byte 0 (LSB)
NOP ; Nop for STRB to go high
LWL1 *AR1++(1),AR2 ; Join byte 1 with byte 0
NOP ; Nop for STRB to go high
LWL2 *AR1++(1),AR2 ; Join byte 2 with byte 0 & 1
NOP ; Nop for STRB to go high
LWL3 *AR1++(1),AR2 ; Join byte 3 with byte 0, 1, & 2
LDI R0,R0 ; Test load address flag
BNN H_END
LOAD_B: STI AR2,*AR0++(1) ; Store new word to dest. address
B_END: BU R11 ; Return from subroutine

;––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––;
; HALF–WORD WIDE MEMORY BOOTLOADER SUBROUTINE         ;
;––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––;
LOOP_H: RPTB LOAD_H ; PGM load loop
H_WIDE: LWL0 *AR1++(1),AR2 ; Load LSB half-word
NOP ; Nop for STRB to go high
LWL2 *AR1++(1),AR2 ; Join MSB half-word with LSB half-word
LDI R0,R0 ; Test load address flag
BNN H_END
LOAD_H STI AR2,*AR0++(1) ; Store new word to dest. address
H_END BU R11 ; Return from subroutine

;––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––;
; FULL–WORD WIDE MEMORY BOOTLOADER SUBROUTINE        ;
;––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––;
LOOP_W RPTB LOAD_W ; PGM load loop
W_WIDE LDI *AR1++(1),AR2 ; Read a new 32 bits word
LDI R0,R0 ; Test load address flag
BNN W_END
LOAD_W STI AR2,*AR0++(1) ; Store new word to dest. address
W_END BU R11 ; Return from subroutine
The Bootloader Program

;-----------------------------------------------;
; COMMUNICATION PORT BOOTLOADER SUBROUTINE    
;-----------------------------------------------;
LOOP_C   RPTB   LOAD_C      ; PGM load loop
COM_LOAD LSH3   -9,*AR3,R1  ; Check comm port input
               B2      COM_LOAD    ; Wait for comm port input
               LDI   *+AR3(1),AR2 ; Read a new 32 bits word
               LDI   R0,R0       ; Test load address flag
               BNN   C_END      
LOAD_C   STI    AR2,*AR0++(1) ; Store new word to dest. address
C_END    BU     R11           ; Return from subroutine
.end
The DMA Coprocessor

The direct memory access (DMA) coprocessor is a programmable on-chip device that allows simultaneous memory transfer and CPU operation with minimum CPU overhead. This chapter describes the DMA coprocessor and also offers suggestions for programming the device.

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>11.1 Introduction</td>
<td>11-2</td>
</tr>
<tr>
<td>11.2 DMA Functional Description</td>
<td>11-3</td>
</tr>
<tr>
<td>11.3 DMA Registers</td>
<td>11-7</td>
</tr>
<tr>
<td>11.4 DMA Unified Mode</td>
<td>11-19</td>
</tr>
<tr>
<td>11.5 DMA Split Mode</td>
<td>11-20</td>
</tr>
<tr>
<td>11.6 DMA Internal Priority Schemes</td>
<td>11-22</td>
</tr>
<tr>
<td>11.7 CPU and DMA Coprocessor Arbitration</td>
<td>11-27</td>
</tr>
<tr>
<td>11.8 Data Transfer Modes</td>
<td>11-28</td>
</tr>
<tr>
<td>11.9 Autoinitization</td>
<td>11-34</td>
</tr>
<tr>
<td>11.10 DMA and Interrupts</td>
<td>11-42</td>
</tr>
<tr>
<td>11.11 DMA Memory Transfer Timing</td>
<td>11-51</td>
</tr>
</tbody>
</table>
The DMA coprocessor is a self-programmable peripheral that transfers blocks of data by maximizing sustained CPU performance and by alleviating the CPU of burdensome I/O duties.

- Transfers to and from anywhere in the processor’s memory map. For example, transfers can be made to and from on-chip memory, off-chip memory, and any of the six on-chip communication ports.

- Six DMA channels for memory-to-memory transfers in unified mode; a special split mode supports 12 DMA channels for communication port to/from memory transfers.

- Automatic initialization of registers via linked lists stored in memory, allowing the DMA to run continuously without intervention by the CPU.

- Concurrent CPU and DMA coprocessor operation with DMA transfers at the same rate as the CPU (supported by separate internal DMA address and data buses)

- Source and destination address registers with variable indices, making it possible to step through matrices by row or column

- Bit-reversed addressing for FFTs

- Synchronization of data transfers via external and internal interrupts
11.2 DMA Functional Description

The DMA coprocessor supports six DMA channels that perform transfers to and from anywhere in the 'C4x memory map.

Each DMA channel is controlled by nine registers that are mapped in the 'C4x peripheral address space, as shown in Figure 11–1. The major DMA registers are described in Section 11.3.

The DMA coprocessor has dedicated on-chip address and data buses (see Figure 2–8 for a block diagram of the peripherals of the 'C4x). All accesses made by the six DMA channels are arbitrated in the DMA coprocessor and take place over these dedicated buses. The six DMA channels transfer data in a sequential time-slice fashion, rather than simultaneously, because they share common buses.

The DMA channels can run constantly or can be triggered by external (I0F3–0) or internal (on-chip timers and communication ports) interrupts.

The DMA coprocessor can transfer data in a bit-reversed fashion (for FFT applications) or in a linear fashion; it can also transfer matrix data in a row or column fashion.

The DMA coprocessor has two basic operational modes:

- **Unified Mode**: Used for memory-to-memory transfers. The unified mode is described in Section 11.4, *DMA Unified Mode*. The unified block transfer sequence is presented in subsection 11.2.1, *Block Transfer Sequence*.

- **Split Mode**: Used for two-way, memory-to-communication port transfers. The split mode is described in Section 11.5, *DMA Split Mode*.
Figure 11–1. DMA Coprocessor Memory Map

DMA Ch. 0
Registers (See exploded view)

DMA Ch. 1
Registers (See exploded view)

DMA Ch. 2
Registers (See exploded view)

DMA Ch. 3
Registers (See exploded view)

DMA Ch. 4
Registers (See exploded view)

DMA Ch. 5
Registers (See exploded view)

Reserved

Control Registers x
Source Address x
Source Address Index x
Transfer Counter x
Destination Address x
Destination Address Index x
Link Pointer x
Auxiliary Transfer Counter x
Auxiliary Link Pointer x

x = channel number (e.g., a 1 for all registers in channel 1, a 2 for all registers in channel 2, etc.).

z = corresponding hexadecimal digit for channel address (e.g., substitute an “A” for DMA channel 0; “B” for DMA channel 1, etc.).
11.2.1 DMA Basic Operation

If a block of data is to be transferred from one region in memory to another region in memory (unified mode), the following sequence is performed:

DMA Registers Initialization

1) The source address register of a DMA channel is loaded with the address of the memory location to read from.
2) The destination address register of the same DMA channel is loaded with the address of the memory location to write to.
3) The transfer counter is loaded with the number of words to be transferred.
4) The source/destination index register is loaded with the step size of source/destination register update. If sequential memory accesses are required, the source address index register and the destination address index register must be set to 1.
5) The DMA channel control register is loaded with the appropriate modes to synchronize the DMA coprocessor reads and writes with interrupts. The DIE register determines which interrupt to use for synchronous transfer.

DMA Start

6) The DMA coprocessor is started via the DMA START field in the DMA channel control register.

Word Transfers

7) The DMA channel reads a word from the source address register and writes it to a temporary register within the DMA channel.
8) After a read by the DMA channel, the source-index register is added to the source address register.
9) After the read operation completes, the DMA channel writes the temporary register value to the destination address pointed to by the destination address register.
10) After the destination address has been fetched, the transfer counter register is decremented and the destination-index register is added to the destination-address register.

Note:

Both of the index registers (source and destination) contain signed values. This allows for variable step sizes or continuous reads from and/or writes to memory. When an index register equals zero, the DMA coprocessor transfers data to or from a fixed location.
11) During every data write, the transfer counter is decremented. The block transfer terminates when the transfer counter reaches zero and the write of the last transfer is completed. The DMA channel sets the transfer counter interrupt (TCINT) flag in the DMA channel control register.

After the completion of a block transfer, the DMA coprocessor can be programmed to do several things:

- Stop until reprogrammed (TRANSFER MODE bits = 012)
- Continue transferring data (TRANSFER MODE bits = 002)
- Generate an interrupt to signal the CPU that the block transfer is complete (TCC bit = 12)
- Autoinitialize itself to start the next block transfer (TRANSFER MODE bits = 102 or 112).

Each DMA channel reads new DMA register values from memory, loads these values into its register file, and, according to the values loaded, begins another block transfer. Whether or not the CPU must initialize transfers is determined by the value of the transfer mode bits:

- Autoinitilaization under transfer mode bits = 102 is done without any intervention by the CPU.
- Autoinitilaization under transfer mode bits =112 requires the CPU to start the DMA.
11.3 DMA Registers

Each DMA channel has nine registers designated as follows:

- **Control register**: contains the status and mode information about the associated DMA channel.

- **Source address register**: contains the memory address of data to be read.

- **Source address-index register**: contains the step size (a signed 32-bit number) used to increment or decrement the source address register.

- **Destination address register**: contains the memory address where data is written.

- **Destination address-index register**: contains the step size (a signed 32-bit number) used to increment or decrement the destination address register.

- **Transfer counter register**: contains the block size to move in unified mode or in split mode (primary channel).

- **Auxiliary transfer-counter register**: contains the block size to move in split mode (auxiliary channel).

- **Link pointer register**: contains the memory address of data to autoinitialize the DMA channel registers. Used for unified mode or primary channel in split mode.

- **Auxiliary link-pointer register**: contains the memory address of data to autoinitialize the DMA channel registers. Used for auxiliary channel in split mode.

After reset, the control register, the transfer counter, and the auxiliary transfer counter registers are set to zeros and the other registers are undefined.

11.3.1 Control Register

The format of the DMA-channel control register is shown in Figure 11–2. The text following the figure describes the functions of each field in the register.

At reset, each DMA-channel control register is set to zero. This makes the DMA channels lower-priority than the CPU, sets up the source address and destination address to be calculated via linear addressing, and configures the DMA channel in the unified mode.
**DMA Registers**

Figure 11–2. DMA Channel Control Register

<table>
<thead>
<tr>
<th>31</th>
<th>30</th>
<th>29</th>
<th>28</th>
<th>27</th>
<th>26</th>
<th>25</th>
<th>24</th>
<th>23</th>
<th>22</th>
<th>21</th>
<th>20</th>
<th>19</th>
<th>18</th>
</tr>
</thead>
<tbody>
<tr>
<td>xx</td>
<td>PRIORITY MODE</td>
<td>AUX START</td>
<td>STATUS</td>
<td>AUX START</td>
<td>START</td>
<td>AUX TCC</td>
<td>TCINT FLAG</td>
<td>TCINT FLAG</td>
<td>AUX TCC</td>
<td>TCINT FLAG</td>
<td>AUX TCC</td>
<td>TCINT FLAG</td>
<td>AUX TCC</td>
</tr>
<tr>
<td>COM PORT</td>
<td>SPLIT MODE</td>
<td>WRITE BIT REV</td>
<td>READ BIT REV</td>
<td>AUTOINIT STATIC</td>
<td>AUTOINIT STATIC</td>
<td>SYNC MODE</td>
<td>AUX TRANSFER MODE</td>
<td>TRANSFER MODE</td>
<td>DMA PRI</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

- **R** – Bit may be read.
- **W** – Bit may be written.
- **S** – Bit is shadowed during autoinitialization (no changes take place until autoinitialization is complete.)
- **A** – Bit is auxiliary for autoinitialization.
- **xx** – Reserved.

DMA Channel 0 only

**DMA PRI**
Sets DMA coprocessor priority. Defines the arbitration rules to be used when a DMA channel and the CPU are requesting the same resource. Affects all DMA coprocessor modes. The rules are listed in Table 11–1.

**TRANSFER MODE**
Defines the transfer mode used by the DMA channel. Affects unified mode and the primary channel in split mode. The bits are defined in Table 11–2.

**AUX TRANSFER MODE**
Defines the transfer mode used by the DMA channel. Affects the auxiliary channel in split mode only. The bits are defined in Table 11–2.

**SYNC MODE**
Determines the mode of synchronization for performing data transfers. These bits work differently in unified and split modes. See Table 11–3 and Table 11–4 for bit descriptions for unified and split modes.

**Note:** If a DMA channel is interrupt driven for both reads and writes, and the interrupt for the write comes before the interrupt for the read, the interrupt for the write is latched by the DMA channel. After the read is complete, the write can be executed.

**AUTOINIT STATIC**
This bit affects unified mode and the primary channel in split mode. It keeps the auxiliary link pointer constant during autoinitialization from the on-chip communication ports or other stream-oriented devices (such as first-in first-out (FIFO) memory buffers). If bit=0, the link pointer is incremented during autoinitialization. If bit=1, the link pointer is not incremented (it is static) during autoinitialization.
DMA Registers

AUX AUTOINIT STATIC
Acts like the AUTOINIT STATIC bit above, except that it affects the auxiliary channel in split mode only.

AUTOINIT SYNC
This bit has an effect only in the DMA coprocessor sync mode (bits 6–7 above). It affects the interrupt that is enabled by the DMA interrupt enable register (shown in Figure 11–25) used for DMA reads: If bit = 0, the interrupt is ignored, and the autoinitialization reads are not synchronized with any interrupt signals. If bit = 1, then the interrupt is recognized and is also used to synchronize the autoinitialization reads. This affects the unified mode and the primary channel in split mode (see the SPLIT MODE bit). The effect of this bit and the SYNC MODE bit in autoinitialization is summarized in Table 11–9.

AUX AUTOINIT SYNC
Acts the same as the AUTOINIT SYNC bit above, except that it affects the auxiliary channel in split mode. The effect of this bit and the SYNC MODE bits in autoinitialization is summarized in Table 11–9.

READ BIT REV
Selects type of addressing for modifying the source address. If bit = 0, the source address is modified using 32-bit linear addressing. If bit = 1, the source address is modified using 24-bit bit-reversed addressing. The bit affects unified mode and primary channel reads (source) in split mode.

WRITE BIT REV
Selects the type of addressing for modifying the destination address. If bit = 0, the destination address is modified using 32-bit linear addressing. If bit = 1, the destination address is modified using 24-bit bit-reversed addressing. The bit affects unified mode and auxiliary channel writes (destination) in split mode.

SPLIT MODE
This bit controls the DMA coprocessor mode of operation. If bit = 0, DMA transfers are from memory to memory. This is referred to as unified mode. If bit = 1, split mode is entered with each DMA channel split into two channels, allowing a single DMA channel to perform memory-to-communication-port and communication-port-to-memory transfers. The split mode can be modified by autoinitialization in unified mode or by autoinitialization by the auxiliary channel in split mode. Split mode is further described in Section 11.4, DMA Split Mode.

COM PORT
These bits define a communication port (000₂ to 101₂) to be used for DMA transfers. If SPLIT MODE = 0, COM PORT has no affect on the operation of the DMA channel. If SPLIT MODE = 1, COM PORT defines which of the six communication ports to use with the DMA channel. The COM PORT may be modified by autoinitialization in unified mode or by autoinitialization by the auxiliary channel in split mode.
DMA Registers

**TCC**  Transfer counter interrupt control. If $TCC = 1$, a DMA channel interrupt pulse is sent to the CPU after the transfer counter makes a transition to zero and the write of the last transfer is complete.

If enabled, the corresponding DMA interrupt (DMA INT0–INT5) occurs at the vector shown in Figure 7–2. If $TCC = 0$, a DMA channel interrupt pulse is not sent to the CPU when the transfer counter transitions to zero. This bit affects unified mode and the primary channel in split mode.

**AUX TCC**  Auxiliary transfer counter interrupt control. If bit = 1, a DMA channel interrupt pulse is sent to the CPU after the auxiliary transfer counter makes a transition to zero and the write of the last transfer is complete. If enabled, the corresponding DMA interrupt (DMA INT0–INT5) occurs as shown in Figure 7–2. If bit = 0, a DMA channel interrupt pulse is not sent to the CPU when the auxiliary transfer counter transitions to zero. This bit affects the auxiliary channel in split mode only.

**TCINT FLAG**  Transfer counter interrupt flag. This flag is set to 1 whenever the transfer counter makes a transition to zero and the write of the last transfer is completed. Whenever the DMA channel control register is read, this flag is cleared, unless the flag is being set by the DMA in the same cycle as the read. The TCINT FLAG is affected by the unified mode and the primary channel in split mode.

**AUX TCINT FLAG**  Auxiliary transfer counter interrupt flag. This flag is set to 1 whenever the auxiliary transfer counter makes a transition to zero and the write of the last transfer is completed. Whenever the DMA control register is read, this flag is cleared, unless the flag is being set by the DMA coprocessor in the same cycle as the read. The AUX TCINT FLAG is affected by the auxiliary channel in split mode. Since only one interrupt is available for a DMA channel, you can determine what event had set the interrupt by examining the TCINT FLAG and the AUX TCINT FLAG.

**START**  Starts and stops the DMA channel in several different ways (as are listed in Table 11–5). START affects the unified mode and the primary channel in split mode. If they is used to hold a channel in the middle of an autoinit sequence, the START and AUX START bits will hold the autoinit sequence. If the START or AUX START bits are being modified by the DMA channel (for example, to force a halt code of 102 on a transfer-counter terminated block transfer) and a write is being performed by an external source to the DMA channel control register, internal modification of the START or AUX START bits by the DMA channel has priority. See TRANSFER MODE bits value of 012 in Table 11–2 for more information.

**AUX START**  Starts and stops the DMA channel in several different ways (as are listed in Table 11–5). AUX START affects the auxiliary channel in split mode only.
DMA Registers

**STATUS**
Indicates the status of the DMA channel as listed in Table 11–6. STATUS is updated in the unified mode and by the primary channel in the split mode. Updates are performed every cycle. The STATUS and AUX STATUS bits also determine if the DMA channel has halted or has been reset after writing to the START or AUX START bits.

**AUX STATUS**
Indicates the status of the DMA channel as listed in Table 11–6. STATUS is updated by the auxiliary channel in split mode only. Updates are performed every cycle.

**PRIORITY MODE**
Priority mode of DMA channel access: If bit = 0, priority rotates as shown in Section 11.6. If bit = 1 priority is fixed as shown in Section 11.6. This bit is available only at DMA channel zero.
### DMA PRI Bits and CPU/DMA Arbitration Rules

<table>
<thead>
<tr>
<th>DMA PRI Bit Nos: 1 – 0</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0</td>
<td>DMA coprocessor access is lower priority than CPU. If the DMA channel and the CPU are requesting the same resource, then the CPU will proceed. These bits are set this way at reset.</td>
</tr>
<tr>
<td>0 1</td>
<td>This setting selects <em>rotating arbitration</em>, which sets priorities between the CPU and DMA channel by alternating their accesses, but not exactly equally. Priority rotates between CPU and DMA accesses when they conflict during consecutive instruction cycles. The first time the DMA channel and the CPU request the same resource, the CPU has priority. If, in the following instruction cycle, the DMA coprocessor and the CPU again request the same resource, the DMA has priority. Alternate access continues as long as the CPU and DMA requests conflict in consecutive instruction cycles. When there is no conflict in a previous instruction cycle, the CPU has priority.</td>
</tr>
<tr>
<td>1 0</td>
<td>Reserved.</td>
</tr>
<tr>
<td>1 1</td>
<td>DMA coprocessor access is higher priority than CPU access. If the DMA channel and the CPU are requesting the same resource, then the DMA will proceed.</td>
</tr>
</tbody>
</table>

### TRANSFER MODE (AUX TRANSFER MODE) Field Descriptions

<table>
<thead>
<tr>
<th>TRANSFER MODE Bit Nos: 3 – 2 / (5 – 4)</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0</td>
<td>Transfers are <em>not</em> terminated by the transfer counter, and no autoinitialization is performed. TCINT (transfer counter interrupt) and AUX TCINT can still be used to cause an interrupt when the transfer counter makes a transition to zero. The DMA channel continues to run. Note that the address continues to increment while the transfer count rolls over to its maximum value of 0FFFF FFFFh.</td>
</tr>
<tr>
<td>0 1</td>
<td>Transfers are terminated by the transfer counter. No autoinitialization is performed. A halt code of 10 2 is placed in the START (or AUX START) field when transfers are completed.</td>
</tr>
<tr>
<td>1 0</td>
<td>Autoinitialization is performed when the transfer counter goes to zero without waiting for CPU intervention.</td>
</tr>
<tr>
<td>1 1</td>
<td>The DMA channel is autoinitialized when the CPU restarts the DMA coprocessor by using the DMA register in the CPU. When the transfer counter goes to zero, operation is halted until the CPU starts the DMA coprocessor by using the START (AUX START) field in the DMA channel control register (bits 22–23 and 24–25, Table 11–5). A halt code of 10 2 is placed in the START (or AUX START) field by the DMA coprocessor.</td>
</tr>
</tbody>
</table>
Table 11–3. SYNC MODE Field Descriptions in Unified Mode

<table>
<thead>
<tr>
<th>SYNC MODE</th>
<th>Description</th>
<th>Bit Nos: 7 – 6</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0</td>
<td>No synchronization. Interrupts are ignored, see Figure 11–27.</td>
<td></td>
</tr>
<tr>
<td>0 1</td>
<td>Source synchronization. A read is not performed until an enabled interrupt occurs (see Figure 11–28a). The interrupt is specified by the DMAx READ field of the DMA interrupt enable (DIE) register (see subsection 11.10.1, Interrupts and Synchronization of DMA Channels, for more information).</td>
<td></td>
</tr>
<tr>
<td>1 0</td>
<td>Destination synchronization. A write is not performed until an enabled interrupt occurs (see Figure 11–29a). The interrupt is specified by the DMAx WRITE field of the DMA interrupt enable (DIE) register (subsection 11.10.1, Interrupts and Synchronization of DMA Channels, for more information).</td>
<td></td>
</tr>
<tr>
<td>1 1</td>
<td>Source and destination synchronization. A read is performed when an enabled interrupt (specified by the DMAx READ field) occurs. Then, a write is performed when an enabled interrupt (specified by the DMAx WRITE field) occurs (as shown in Figure 11–30). These fields are part of the DMA interrupt enable (DIE) register (see subsection 11.10.1, Interrupts and Synchronization of DMA Channels, for more information).</td>
<td></td>
</tr>
</tbody>
</table>

Table 11–4. SYNC MODE Field Descriptions in Split Mode

<table>
<thead>
<tr>
<th>SYNC MODE</th>
<th>Description</th>
<th>Bit Nos: 7 – 6</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0</td>
<td>No synchronization. Interrupts are ignored see Figure 11–27.</td>
<td></td>
</tr>
<tr>
<td>0 1</td>
<td>Destination synchronization. A primary channel write to the communication-port output FIFO is not performed until an enabled interrupt occurs (see Figure 11–29b). The interrupt is specified by the DMAx PRIMARY WRITE field of the DMA interrupt enable (DIE) register (see subsection 11.10.1, Interrupts and Synchronization of DMA Channels, for more information).</td>
<td></td>
</tr>
<tr>
<td>1 0</td>
<td>Source synchronization. An auxiliary-channel read from the communication-port input FIFO is not performed until an enabled interrupt occurs (see Figure 11–28b). The interrupt is specified by the DMAx AUXILIARY READ field of the DMA interrupt enable (DIE) register (see subsection 11.10.1, Interrupts and Synchronization of DMA Channels, for more information).</td>
<td></td>
</tr>
<tr>
<td>1 1</td>
<td>Source and destination synchronization. A read from the communication-port input FIFO is performed when an enabled interrupt (specified by the DMAx AUXILIARY READ field) occurs. A write to the communication port output FIFO is performed when an enabled interrupt (specified by the DMAx PRIMARY WRITE field) occurs. These fields are part of the DMA interrupt enable (DIE) register (see subsection 11.10.1, Interrupts and Synchronization of DMA Channels, for more information).</td>
<td></td>
</tr>
</tbody>
</table>
## DMA Registers

### Table 11–5. START (AUX START) Field Descriptions

<table>
<thead>
<tr>
<th>START (AUX START)</th>
<th>Bit Nos:</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>23 – 22</td>
<td></td>
</tr>
<tr>
<td></td>
<td>(25 – 24)</td>
<td></td>
</tr>
<tr>
<td>0 0</td>
<td></td>
<td>DMA channel reset. DMA-channel read or write cycles in progress are completed (not aborted); any data read is ignored. Any pending (not started) read or write is canceled. The auxiliary (AUX START = 002) and primary (START = 002) transfer counters are set to zero. The DMA channel is reset so that when it starts, a new transaction begins; that is, a read is performed. In this mode, stopping is immediate with no other registers loaded.</td>
</tr>
<tr>
<td>0 1</td>
<td></td>
<td>DMA halt on read or write boundary. Halts the DMA channel on the first available read or write boundary. If a read or write has begun, the read or write is completed before stopping. If a read or write has not begun, no read or write is started. In this mode, stopping is immediate with no other registers loaded.</td>
</tr>
<tr>
<td>1 0</td>
<td></td>
<td>DMA halt on transfer boundary. Halts the DMA channel on the first available transfer boundary. If a DMA transfer has begun, the entire transfer is completed, including both cycles (both read and write operations), before stopping. If a transfer has not begun, none is started. In this mode, stopping is immediate with no other registers loaded. This is also the value after a DMA transfer completes.</td>
</tr>
<tr>
<td>1 1</td>
<td></td>
<td>DMA start. Writing 112 to this field starts the DMA process using the values in the channel’s DMA channel registers (Figure 11–1). If the DMA is in autoinitialization, all DMA registers are loaded before starting the operation. The DMA coprocessor starts from reset if previously reset (START or AUX START bits = 002) or restarts from the previous state if previously halted (START or AUX START bits = 012 or 102).</td>
</tr>
</tbody>
</table>

### Table 11–6. STATUS (AUX STATUS) Field Descriptions

<table>
<thead>
<tr>
<th>STATUS (AUX STATUS)</th>
<th>Bit Nos:</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>27 – 26</td>
<td></td>
</tr>
<tr>
<td>(29 – 28)</td>
<td></td>
<td></td>
</tr>
<tr>
<td>0 0</td>
<td></td>
<td>The DMA channel is held on the boundary of the DMA transfer (the write is complete, and the read has not begun). This is the value at RESET after a halt on a transfer boundary or after a block transfer.</td>
</tr>
<tr>
<td>0 1</td>
<td></td>
<td>The DMA channel is being held in the middle of a DMA transfer; (the read is complete, and the write has not begun). This occurs only if the START (or AUX START) field = 012.</td>
</tr>
<tr>
<td>1 0</td>
<td></td>
<td>Reserved.</td>
</tr>
<tr>
<td>1 1</td>
<td></td>
<td>The DMA channel is not being held or reset.</td>
</tr>
</tbody>
</table>
11.3.2 Address and Index Registers

As shown in Figure 11–3, both the DMA coprocessor source-address and destination-address registers have an associated index register. After each DMA-channel read (source address) or write (destination address), the corresponding (source or destination) address generator adds the index register to the address register \textit{and places the result in the address register}. In this way, the address register acts as an accumulator because it retains its own sum and the sum of its index register, as is shown by the following equation:

\[ \text{Address Register} + \text{Index Register} \rightarrow \text{Address Register} \]

The values in these registers are undefined at reset.

Depending upon bits 12 and 13 (READ BIT REV and WRITE BIT REV) of the DMA channel control register, the addition may be either:

- **Linear** (normal addition): READ BIT REV = 0 or WRITE BIT REV = 0, or
- **Bit reversed** (reverse carry propagation): READ BIT REV = 1 or WRITE BIT REV = 1.

Both index values (source or destination) are \textit{signed} values.
11.3.3 Transfer Counter and Auxiliary Transfer Counter Registers

These registers contain the number of words to be transmitted.

Figure 11–4 shows the six transfer counters and the six auxiliary transfer counters. A DMA channel in split mode (described in Section 11.4, DMA Split Mode) uses the auxiliary transfer counter for the auxiliary channel and the primary transfer counter for the primary channel. The values in these registers are set to zero at reset.

The counters are decremented after completing the address fetch for the write portion of a transfer. The TCINT FLAG and AUX TCINT FLAG (bits 20 and 21
of the DMA channel control register, as shown in Figure 11–2) are not set until the counter is decremented and the write of the last transfer is completed. Correspondingly, the interrupt will not be seen by the CPU interrupt controller until the transfer counter is decremented and the write of the last transfer is completed.

The decrementer checks whether the transfer counter equals zero after the decrement is performed. As a result, if the counter register has a value of 1, then the DMA channel can be halted after only one transfer is performed. Thus, by setting the transfer counter to 1, the DMA channel transfers the minimum possible number of words (1 time). The count is treated as an unsigned integer. Transfers can be halted when a zero count is detected after a decrement. If the DMA coprocessor channel is not halted after the transfer reaches zero, the counter will continue decrementing below zero. Thus, by setting the transfer counter to zero, the DMA channel transfers the maximum possible number of words (10000 0000h times).

Figure 11–4. Transfer Counter Registers

11.3.4 Link Pointer and Auxiliary Link-Pointer Registers

The link pointers specify the address from which to load the new DMA channel register values when autoinitialization is performed. When a channel has exhausted its counter (transfer counter = 0), it will (if appropriately configured) use the link pointer to reload itself. Figure 11–5 illustrates the DMA coprocessor link address registers. The values in these registers are undefined at reset.
For example, under autoinitialization, the steps to load the channel registers for DMA channel 0 (as shown in Figure 11–1) are:

1) Get the link pointer for the next DMA operation. The pointer is the memory address containing the contents of the first DMA channel 0 register (the channel control register as shown in Figure 11–1).

2) Bring in the contents pointed to by the pointer and write to address 0010 00A0h (first word of DMA channel 0 registers as shown in Figure 11–1).

3) Increment the link pointer. (Skip this step if the AUTOINIT STATIC bit = 1.)

4) Bring in the next word and write to address 0010 00A1h.

5) Repeat until the entire block of registers is loaded for DMA channel 0 (7 registers in unified mode; 5 registers in split mode).

Figure 11–5. Link Pointer Registers
11.4 DMA Unified Mode

Unified mode is the default DMA operational mode. It is used for memory-to-memory transfers. To select unified mode, clear the SPLIT MODE bit (bit 14 of the DMA channel control register, which is shown in Figure 11–2). Thus, write a zero to this bit (zero is the reset value of this bit).

The block transfer sequence under unified mode is covered in subsection 11.2.1. DMA channel arbitration in unified mode is described in Section 11.6. DMA synchronization with interrupts is covered in Section 11.10, *DMA and Interrupts*. Autoinitiation in unified mode is covered in subsection 11.9.1, *Unified Mode*.

A unified DMA word transfer consists of two steps, as shown in Figure 11–6:

1) The DMA channel reads the source data value from the address pointed to by the source address register and stores it in a temporary register. 
2) The DMA channel reads the temporary register value and writes it to the address pointed to by the destination address register.

You can use unified mode to perform communication port transfers, especially unidirectional transfers. Using split mode is more advantageous in bidirectional transfers.

*Figure 11–6. Typical Unified-Mode DMA Channel Configuration*
11.5 DMA Split Mode

The DMA split mode (see Figure 11–7) allows one DMA channel to be used for both reading and writing data to a communications port. Split mode essentially transforms one DMA channel into two DMA channels:

- **Primary Channel**: dedicated to reading data from a location in the memory map (external/internal) and writing it to a communication port output FIFO.

- **Auxiliary Channel**: dedicated to receiving data from a communication port input FIFO and writing it to a location in the memory map.

To select split mode, set the SPLIT MODE bit (bit 14 of the DMA channel control register, Figure 11–2) to one.

All six DMA channels support this split mode to accommodate all of the communication ports. The COM PORT field (bits 15–17 as shown in Figure 11–2) of the DMA channel control register defines which communication port is used (port 0–5). A DMA channel in split mode can be used with any communication port; however, read/write synchronization is restricted to signals from the communication port with the same number as the DMA channel being used; in other words, DMA can synchronize only with signals coming from communication port \( i \) (see Section 11.10, *DMA and Interrupts*, for more information). Figure 11–7 shows typical split mode operation with one communication port.

A split mode word transfer is similar to that of the unified mode except for the following differences:

- The primary channel reads a word from the address pointed to by the source address register and writes it to a temporary register within the DMA coprocessor. It then writes the temporary register value to the output FIFO on the communication port specified in the COM PORT field. The registers that control the primary channel are the DMA channel control register, source address register, source index register (added to source address register), transfer-counter register, and link pointer register.

- The auxiliary channel reads a word from the input FIFO on the communication port specified in the COM PORT field and writes it to a temporary register within the DMA coprocessor. It then writes the temporary register value in the address pointed to by the destination address register. The registers that control the auxiliary channel are the DMA channel control register, destination address register, destination index register (added to the destination address register), auxiliary transfer-counter register, and auxiliary link pointer register.

**Figure 11–7. Typical Split-Mode DMA Configuration**

Notice that there is only one temporary register in each DMA channel. Therefore, a primary channel operation must complete before an auxiliary channel operation can begin, and vice versa.

Primary and auxiliary channels share some of the DMA channel control registers and exclusively use others:

- PRIORITY MODE, COM PORT, SPLIT MODE, and DMA PRI are fields that both primary and auxiliary channels use.
- AUX STATUS, AUX START, AUX TCINT flag, AUX TCC, WRITE BIT REV, SYNC MODE (bit 7), and AUX TRANSFER MODE are used exclusively by the auxiliary channel.
- STATUS, START, TCINT flag, TCC, READ BIT REV, SYNC MODE (bit 6), and TRANSFER MODE are used exclusively by the primary channel.
11.6 DMA Internal Priority Schemes

Because all accesses made by the six DMA channels take place over one common internal DMA data and address bus, a priority scheme for bus arbitration is required. Within the DMA coprocessor, two priority schemes are used to designate which channel is serviced next:

- A fixed priority scheme with channel 0 always having the highest priority and channel 5 the lowest.
- A rotating priority scheme that places the most recently serviced channel at the bottom of the priority list (default setup after reset).

11.6.1 Fixed Priority Scheme

This scheme provides a fixed (unchanging) priority for each channel as follows:

<table>
<thead>
<tr>
<th>Highest priority</th>
<th>0</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>Lowest priority</th>
</tr>
</thead>
</table>

To select fixed priority, set the PRIORITY MODE bit (bit 30) of channel 0’s DMA-channel control register to 1 (one).

11.6.2 Rotating Priority Scheme

In a rotating priority scheme, the last channel serviced becomes the lowest priority channel. The other channels sequentially rotate through the priority list with the lowest channel next to the last-serviced channel becoming the highest priority on the following request. The priority rotates every time the channel most recently granted priority completes its access. Figure 11–8 and Figure 11–10 illustrate the rotation of priority across several DMA coprocessor accesses. At system reset, the channels are ordered from highest to lowest priority (0, 1, 2, 3, 4, 5).

To select this scheme, set the PRIORITY MODE bit (bit 30) of channel 0’s DMA control register to 0 (zero).
**Figure 11–8. Rotating Priority Mode Example of the DMA Coprocessor**

At the start of the example in Figure 11–8, channels 2, 4, and 5 are requesting service. Because channel 2 has the highest priority, it is serviced first. It then becomes the lowest priority channel. The highest priority channel then becomes channel three. On the following services, channels 4 and 5 are taken care of in a similar fashion. Figure 11–9 shows the entire read and write sequence.

**Note:**

Each service means one read access or one write access. The DMA coprocessor handles channel arbitration on an access-by-access basis; that is, a DMA channel must contend for both the read and the write access in both unified and split modes.

**Figure 11–9. Rotating Priority DMA Read and Write Sequence Example (Unified Mode)**

Figure 11–10 shows the same results in a different way as in Figure 11–8 in a rotating priority scheme. Priority decreases from highest to lowest in a clockwise direction. The priority rotates in a counter clockwise direction with the most recently serviced channel becoming the lowest in priority.
 DMA Internal Priority Schemes

Figure 11–10. Example of a Priority Wheel

With the rotating priority scheme, any DMA channel requesting service is guaranteed to be recognized after a number of higher priority requests have been serviced. The maximum number of requests are:

- Five in unified mode
- Eleven in split mode

This provides a way of preventing a channel from monopolizing the system.

DMA channels that are running and are not synchronized via interrupts are always requesting service.

11.6.3 Split Mode and DMA Channel Arbitration

When a DMA channel is running in split mode, arbitration between channels is similar to rotating priorities. A split-mode DMA channel has the same priority as a unified DMA channel. The only issue is how to arbitrate between the primary split channel and the auxiliary split channel. The split channels alternate priorities via a rotating priority scheme.

When a DMA channel is in split mode and both paths are simultaneously started via the START and AUX START bits, the output (primary) channel has priority over the input (auxiliary) channel. Both the START and AUX START bits must be written at the same time in order to achieve this reset condition.
The priority scheme for split mode channels is slightly different from the scheme for unified mode channels:

- For unified channels, the priority changes after a read or a write.
- For the primary and auxiliary channels within a split channel, priority changes after a complete read and write. This is because there is only one temporary register for both DMA channels (primary and auxiliary) to store the read value.

Figure 11–11 shows two channels contending for the DMA bus: channel 2 (a split channel) and channel 4.

**Figure 11–11. Example of a Channel Priority Scheme in Split Mode**

```
Highest priority channel          0
                                 1
† [2pri
  2aux]
                                 3
† 4
Lowest priority channel          5
```

† DMA channel requesting an access
† Split channels requesting access
2pri = the primary split channel of channel 2
2aux = the auxiliary split channel of channel 2

The channel priority scheme in Figure 11–11 is shown sequentially in Figure 11–12. In words, the scheme follows eight steps:

1) The first service is a request by the primary split channel of channel 2 (2pri). 2pri reads, and then channel 2 is moved to the lowest priority level, but 2pri remains the higher priority channel of channel 2.

2) On the second service, channel 4, now a higher priority than channel 2, reads its source address and becomes the lowest priority.

3) On the third service, the value read by 2pri is written to its destination address, and channel 2 is moved to the lowest priority level. Also, 2pri is moved to a lower priority than 2aux. Note that the split channel that just completed a read retains a higher priority than the other split channel until the data is written to the destination address.

4) On the fourth service, the value read by channel 4 in service 2 is now written to its destination address and the channel becomes the lowest priority.
Figure 11–12. Service Sequence for Split Mode Priority Example

5) In the fifth service, 2aux is read and channel 2 becomes the lowest priority.

6) On the sixth service, channel 4 is read again, and it becomes the lowest priority.

7) On the seventh and eighth services, the 2aux and channel 4 values that were read in services 5 and 6 are now written to their destination address- es. After the channel is written, it assumes the lowest priority.

8) In the ninth service, 2pri is read again as in the first service, and the read/ write cycle continues as begun in the first service.
11.7 CPU and DMA Coprocessor Arbitration

The DMA coprocessor transfers data on its own internal buses. Arbitration is necessary only when a resource conflict exists between the DMA coprocessor and the CPU. The arbitration causes no delay. When there is no conflict, the CPU and DMA coprocessor accesses proceed in parallel.

All arbitration between the CPU and the DMA coprocessor is on an access basis; that is, the DMA coprocessor must contend for read and write accesses in both unified and split modes. DMA coprocessor internal memory access starts during H3 (See Section 8.4, Clocking of Memory Accesses, on page 8-19, for more information).

When the CPU and DMA coprocessor request the same resource, the DMA channel’s DMA PRI bits (bits 0 and 1 of the channel control register) define the arbitration rules (as shown in Table 11–7). The CPU has higher priority than the DMA when DMA PRI = 00; it has lower priority than the DMA when DMA PRI = 11. They rotate priority when DMA PRI = 01.

Table 11–7. DMA PRI Bits and CPU/DMA Arbitration Rules

<table>
<thead>
<tr>
<th>DMA PRI (Bits 1–0)</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0</td>
<td>DMA access is lower priority than the CPU access. If the DMA channel and the CPU are requesting the same resource, then the CPU will proceed. (DMA PRI bits are set to 00 at reset.)</td>
</tr>
<tr>
<td>0 1</td>
<td>This setting selects rotating arbitration, which sets priorities between the CPU and DMA channel by alternating their accesses, but not exactly equally. Priority rotates between CPU and DMA accesses when they conflict during consecutive instruction cycles. The first time the DMA channel and the CPU request the same resource, the CPU has priority. If, in the following instruction cycle, the DMA coprocessor and the CPU again request the same resource, the DMA has priority. Alternate access continues as long as the CPU and DMA requests conflict in consecutive instruction cycles. When there is no conflict in a previous instruction cycle, the CPU has priority.</td>
</tr>
<tr>
<td>1 0</td>
<td>Reserved</td>
</tr>
<tr>
<td>1 1</td>
<td>DMA access is higher priority than the CPU access. If the DMA channel and the CPU are requesting the same resource, the DMA will proceed.</td>
</tr>
</tbody>
</table>
11.8 Data Transfer Modes

Each DMA channel can operate in four types of data transfer modes. These modes differ in:

- Whether or not they use autoinitialization
- How they operate if autoinitialization is in effect or not

Table 11–8 and the following paragraphs describe these data transfers.

<table>
<thead>
<tr>
<th>TRANSFER MODE (AUX TRANSFER MODE)</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Bits 3–2 (5–4)</td>
<td></td>
</tr>
<tr>
<td>0 0₂</td>
<td>Transfers are not terminated by the transfer counter. No autoinitialization is performed. The TCINT (transfer count interrupt) bits can still be used to cause an interrupt when the transfer counter makes a transition to zero. The DMA channel continues to run.</td>
</tr>
<tr>
<td>0 1₂</td>
<td>Transfers are terminated by the transfer counter. No autoinitialization is performed. A halt code of 10₂ is placed in the START or AUX START field (bits 22–23 or bits 24–25 of the DMA channel control register) when transfers are complete.</td>
</tr>
<tr>
<td>1 0₂</td>
<td>Autoinitialization 1. Autoinitialization is performed when the transfer counter goes to zero without waiting for CPU intervention.</td>
</tr>
<tr>
<td>1 1₂</td>
<td>Autoinitialization 2. The DMA channel is autoinitialized when the CPU restarts the DMA coprocessor by using the DMA channel control register in the CPU. When the transfer counter goes to zero, operation is halted until the CPU starts the DMA coprocessor by using the START (or AUX START) field in the DMA channel control register. A halt code of 10₂ is placed in the START (or AUX START) field by the DMA.</td>
</tr>
</tbody>
</table>

11.8.1 Running in TRANSFER MODE = 00₂

When TRANSFER MODE = 00₂, transfers are not terminated when the transfer counter goes to zero, and no autoinitialization is performed. Even though the transfer counter does not halt transfers, an interrupt can be generated on the transfer counter transition to zero, setting the TCINT FLAG bit to 1. If the DMA coprocessor channel is not halted after the transfer reaches zero, the counter will continue decrementing below zero.
11.8.2 Running in TRANSFER MODE = 01\textsubscript{2}

When TRANSFER MODE = 01\textsubscript{2}, transfers are terminated when the transfer counter goes to zero, and no autoinitialization is performed. When the transfer counter goes to zero, the DMA channel is halted by forcing 10\textsubscript{2} into the START or AUX START field.

11.8.3 Running in TRANSFER MODE = 10\textsubscript{2} (Autoinitialization 1)

This transfer mode allows the DMA channel to run continuously, change pointers and synchronization by the autoinitialization procedure, and turn itself off. Two different autoinitialization methods are supported:

**Autoinitialization method 1a** always starts after a system reset, after a DMA channel is reset (00\textsubscript{2} written to the START or AUX START bits), or after a DMA channel halts (01\textsubscript{2} or 10\textsubscript{2} written to START or AUX START bits). To select transfer mode 10\textsubscript{2} (autoinitialization method 1a), follow the steps listed here and shown in Figure 11–13.

1) Initialize the DMA control register to transfer mode 10\textsubscript{2}, and reset or halt the DMA channel to be autoinitialized.

2) Initialize the transfer counter to 0 (resetting the DMA channel does this).

3) Initialize the DMA channel link pointer with the address where the autoinitialization values reside. No initialization of the other DMA channel registers is required, because they are automatically set up during the autoinitialization process.

4) Start the DMA channel by writing 11\textsubscript{2} to the START (or AUX START) bits.

5) The DMA channel performs the sequence, *autoinitialize and block transfer*.

*Figure 11–13: DMA Channel Running in Transfer Mode 10\textsubscript{2} (Autoinitialization Method 1a)*
**Data Transfer Modes**

**Autoinitialization method 1b** starts when the transfer counter is not zero. The DMA starts a regular DMA transfer and autoinitializes after this transfer completes (when the transfer counter becomes zero). To select transfer mode 102 (autoinitialization method 1b), follow the steps listed here and shown in Figure 11–14.

1) Initialize the DMA control register to transfer mode 102, and reset or halt the DMA channel for the first transfer operation.

2) Initialize all the other DMA channel registers (source address, destination address, transfer counter, etc.) according to the transfer operation desired. Note that the transfer counter now reflects the number of words to be transferred (normally a nonzero value) before the autoinitialization process.

3) Initialize the DMA channel link pointer with the address where the autoinitialization values for subsequent transfer operations reside.

4) Start the DMA channel by writing 112 to the START (or AUX START) bits.

5) The DMA channel performs this sequence: block transfer and autoinitialize (reverse order of method 1a).

Note that if a DMA channel is programmed to perform n block transfers, autoinitialization method 1a requires n DMA autoinitialization values. Autoinitialization method 1b requires only n–1 autoinitialization values because the first transfer can be accomplished during the initial DMA transfer. This represents some memory saving, but successive identical DMA operations require extra CPU cycles to set the initial DMA registers values again.

*Figure 11–14. DMA Channel Running in Transfer Mode 102 (Autoinitialization Method 1b)*
11.8.4 Running in TRANSFER MODE = 11₂ (Autoinitialization 2)

This transfer mode, besides having all of the advantages of autoinitialization, allows the CPU to coordinate its operation very easily with the operation of the DMA channels. Two different autoinitialization methods are supported:

**Autoinitialization method 2a** always starts after a system reset, after a DMA channel reset (00₂ written to the START or AUX START bits), or after a channel halts (01₂ or 10₂ written to the START or AUX START bits). To select transfer mode 11₂ and use autoinitialization method 2a, follow the steps listed here and shown in Figure 11–15.

1) Initialize the DMA control register to transfer mode 11₂ and reset or halt the DMA channel to be autoinitialized.

2) Initialize the transfer counter to 0 (resetting the DMA channel does this).

3) Initialize the DMA channel link pointer with the address where the autoinitialization values reside. No initialization of the other DMA channel registers is required, because they are automatically set up during the autoinitialization process.

4) Start the DMA channel by writing 11₂ to the START or AUX START bits.

5) The DMA channel autoinitializes itself and performs a block transfer.

6) When the transfer counter goes to zero, the DMA waits for the CPU to write a 11₂ to the START (or AUX START) field of the DMA channel control register and autoinitialize.

7) Repeat the sequence autoinitialize, transfer, and wait.

8) When the transfer counter goes to zero, you can halt the DMA channel by forcing 10₂ into the START (or AUX START) field.

![DMA Channel Running in Transfer Mode 11₂ (Autoinitialization Method 2a)](image-url)
**Autoinitialization method 2b** starts when the transfer counter is not zero. The DMA starts with a regular DMA transfer and autoinitializes after this transfer completes (when the transfer counter becomes zero). To select transfer mode 11₂ and use autoinitialization mode 2b, follow the steps listed here and shown in Figure 11–16.

1) Initialize the DMA control register to transfer mode 11₂ and reset or halt the DMA channel for the first transfer operation.

2) Initialize the other DMA channel registers (source address, destination address, transfer counter, etc.) accordingly. Note that the transfer counter now reflects the number of words to be transferred (normally a nonzero value) before the autoinitialization process.

3) Initialize the DMA channel link pointer with the address where the autoinitialization values for subsequent transfer operations reside.

4) Start the DMA channel by writing 11₂ to the START (or AUX START) bits.

5) The DMA channel performs the initial block transfer. When the transfer counter goes to zero, the DMA waits for the CPU to write a 11₂ to the START or AUX START field of the DMA channel control register and autoinitialize.

6) Repeat the sequence transfer, wait, and autoinitialize.

Note that if a DMA channel is programmed to perform *n* block transfers, using autoinitialization method 2a requires *n* DMA autoinitialization values. Autoinitialization method 2b requires only *n*–1 autoinitialization values because the first transfer can be accomplished during the initial DMA transfer. This represents some memory saving, but successive identical DMA operations require extra CPU cycles to set the initial DMA register values again.
DMA Channel Running in Transfer Mode 112 (Autoinitialization Method 2b)

1. DMA is reset or halted
2. CPU initializes DMA registers and DMA link pointer
3. Start DMA channel
4. DMA channel performs block transfers. Reads and writes can be synchronized
5. DMA channel waits for CPU to start it
6. DMA coprocessor autoinitializes
Autoinitialization

11.9 Autoinitialization

Autoinitialization is a method for reloading a DMA channel register file when the transfer counter goes to zero. When the DMA channel is operating in autoinitialization mode, the link pointer register and auxiliary link pointer register are used to initialize the registers that control the operation of the DMA channel. These pointers are memory address locations for blocks of data that are to be loaded into the DMA register file, shown in Figure 11–1. Link pointers are covered in subsection 11.3.4, *Link Pointer and Auxiliary Link–Pointer Registers*.

Autoinitialization is a regular DMA block transfer operation in which the destination is the DMA coprocessor’s register file. The DMA reads the value pointed to by the link pointer and writes it to the DMA register over the peripheral bus on the next available cycle. Consequently, autoinitialization read/write accesses are also subject to any normal CPU/DMA access conflict.

Autoinitialization can happen:

- Without CPU intervention when the TRANSFER MODE bits = 10₂ (autoinitialization 1). Refer to subsection 11.8.3, *Running in TRANSFER MODE = 10₂ (Autoinitialization 1)*.
- With CPU intervention when the TRANSFER MODE bits = 11₂ (autoinitialization 2). In this case, the CPU should restart the DMA channel before the autoinitialization proceeds. Refer to subsection 11.8.4, *Running in TRANSFER MODE = 11₂ (Autoinitialization 2)*.
- Before any block transfer (autoinitialization method a). The DMA starts with the transfer counter at zero, then autoinitializes and performs a block transfer.
- After a block transfer (autoinitialization method b). The DMA starts with a regular block transfer, and, when the transfer counter register goes to zero, it autoinitializes.

Autoinitialization 1 or 2 can use methods a or b.

Autoinitialization depends on the DMA channel’s current mode: split or unified mode. The mode of operation is controlled by the SPLIT MODE bit (bit 14 in Figure 11–2). When autoinitializing the DMA coprocessor, *do not change* the SPLIT MODE bit. This bit should be changed only when the DMA coprocessor has been reset and halted (see DMA START bit description in Table 11–5 for more information).
11.9.1 Unified Mode

If the DMA channel is running in unified mode (SPLIT MODE = 0), the link pointer is used and the DMA-channel registers are loaded in the following order:

1) DMA-channel control register
2) Source-address register
3) Source-address index register
4) Transfer-counter register
5) Destination-address register
6) Destination-address index register
7) Link-pointer register

The storage of new values for these registers in memory is illustrated in Figure 11–17.

Figure 11–17. Store New Values of DMA Channel Registers in Memory (SPLIT MODE = 0)

11.9.2 Split Mode

If the DMA channel is running in split mode (SPLIT MODE = 1), then the autoinitialize sequence depends upon which counter has terminated.

If the transfer counter register has gone to zero with SPLIT MODE = 1, then the link-pointer register is used for autoinitialization. In this case, the DMA channel registers are loaded in the following order:

1) DMA-channel control register
2) Source-address register
3) Source-address index register
4) Transfer-counter register
5) Link-pointer register

The storage of the new values for these registers in memory is illustrated in Figure 11–18.
Autointialization

Figure 11–18. Store New Values of DMA Channel Registers in Memory (SPLIT MODE = 1 and Transfer Counter = 0)

If the auxiliary transfer counter register has gone to zero with SPLIT MODE=1, then the auxiliary link pointer register is used for autointialization. In this case, the DMA channel registers are loaded in the following order:

1) DMA channel control register
2) Destination address register
3) Destination address index register
4) Auxiliary transfer count register
5) Auxiliary link pointer register

The storage of the new values of these registers in memory is illustrated in Figure 11–19.

Figure 11–19. Store New Values of DMA Channel Registers in Memory (SPLIT MODE = 1 and Auxiliary Transfer Counter = 0)

11.9.3 Incrementing the Link Pointer

During autointialization, the link pointer can be incremented or held constant:

- When the link pointer is incremented, the autointialization values are stored in sequential memory locations, and the link pointer or auxiliary link pointer is incremented in order to access each of these locations.
Autoinitialization

- When you autoinitialize the DMA channel from a stream-oriented device, such as the on-chip communication ports or external FIFOs, you should hold the link pointer constant.

This can be controlled by the AUTOINIT STATIC and the AUX AUTOINIT STATIC bits of the DMA control register as follows:

- In unified mode, the AUTOINIT STATIC bit controls the link pointer.
- In split mode, the AUTOINIT STATIC bit controls the link pointer (primary channel), and the AUX AUTOINIT STATIC controls the auxiliary linker pointer.

When the AUTOINIT STATIC (AUX AUTOINIT STATIC) bit is zero, the link pointer is incremented. When it is one, the link pointer is held constant.

11.9.4 Synchronization

Usually, autoinitialization data is stored in memory, and synchronization is not necessary. In some cases, you may wish to transfer autoinitialization data in the same way as in the synchronized data reads and writes.

Autoinitialization synchronization is a function of the:

- SYNC MODE bits (DMA channel control register bits 6 and 7) that control synchronization of data transfers, and
- AUTOINIT SYNC bits (DMA channel control register bits 10 and 11) that affect only autoinitialization synchronization.

If the SYNC MODE bits are not set to synchronize data transfers (i.e., if the preceding data transfer is not synchronized on interrupts), then the DMA channel autoinitialization sequence is not synchronized either. If the SYNC MODE bits are set to transfer data synchronously (if the preceding data transfer is synchronized), then the upcoming data channel autoinitialization sequence can be synchronized on reads or writes or both (depending on whether the DMA coprocessor is in unified or split mode) as shown in Table 11–9. Note that when both modes show "no sync" for a bit setting in the table, the DMA channel autoinitialization sequence is not synchronized on interrupts.

In unified mode, there is no write synchronization for autoinitialization operation, because the destination is the DMA register, which is always ready.

In split mode, bit 6 of the DMA control register controls the autoinitialization synchronization of the DMA primary channel, and bit 7 controls the autoinitialization synchronization of the DMA auxiliary channel.
If primary channel autoinitialization synchronization is used, the DMA read of autoinitialization values from memory does not proceed until the interrupt specified in the DMAx primary write field in the DIE register is received.

If auxiliary channel autoinitialization synchronization is used, the DMA read of autoinitialization values from memory does not proceed until the interrupt specified in the DMAx auxiliary read field in the DIE register is received.

Table 11–9. Effect of SYNC MODE and AUTOINIT MODE Bits in Autoinitialization

<table>
<thead>
<tr>
<th>Sync Mode</th>
<th>Autoinit Sync</th>
<th>Unified Mode</th>
<th>Split Mode</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0</td>
<td>0 0</td>
<td>No synchronization</td>
<td>No synchronization</td>
</tr>
<tr>
<td>0 0</td>
<td>0 1</td>
<td>No synchronization</td>
<td>No synchronization</td>
</tr>
<tr>
<td>0 0</td>
<td>1 0</td>
<td>No synchronization</td>
<td>No synchronization</td>
</tr>
<tr>
<td>0 0</td>
<td>1 1</td>
<td>No synchronization</td>
<td>No synchronization</td>
</tr>
<tr>
<td>0 1</td>
<td>0 0</td>
<td>No synchronization</td>
<td>No synchronization</td>
</tr>
<tr>
<td>0 1</td>
<td>0 1</td>
<td>Read</td>
<td>Primary channel</td>
</tr>
<tr>
<td>0 1</td>
<td>1 0</td>
<td>No synchronization</td>
<td>No synchronization</td>
</tr>
<tr>
<td>0 1</td>
<td>1 1</td>
<td>Read</td>
<td>Primary channel</td>
</tr>
<tr>
<td>1 0</td>
<td>0 0</td>
<td>No synchronization</td>
<td>No synchronization</td>
</tr>
<tr>
<td>1 0</td>
<td>0 1</td>
<td>No synchronization</td>
<td>No synchronization</td>
</tr>
<tr>
<td>1 0</td>
<td>1 0</td>
<td>No synchronization</td>
<td>Auxiliary channel</td>
</tr>
<tr>
<td>1 0</td>
<td>1 1</td>
<td>No synchronization</td>
<td>Auxiliary channel</td>
</tr>
<tr>
<td>1 1</td>
<td>0 0</td>
<td>No synchronization</td>
<td>No synchronization</td>
</tr>
<tr>
<td>1 1</td>
<td>0 1</td>
<td>Read</td>
<td>Primary channel</td>
</tr>
<tr>
<td>1 1</td>
<td>1 0</td>
<td>No synchronization</td>
<td>Auxiliary channel</td>
</tr>
<tr>
<td>1 1</td>
<td>1 1</td>
<td>Read</td>
<td>Auxiliary and primary channels</td>
</tr>
</tbody>
</table>

11.9.5 Effect on DMA Control Register Bits

In unified mode, all of the writable control register bits are affected by autoinitialization. These bits are labeled in Figure 11–20.
In split mode during autoinitialization of the primary DMA channel, the writable, nonauxiliary bits can be modified, but auxiliary bits are protected (as shown in Figure 11–21). In other words, only nonauxiliary bits are allowed to be modified by the CPU or DMA coprocessor. Also, if the auxiliary DMA channel is autoinitialized, the writable auxiliary bits can be modified, but nonauxiliary bits are protected. These bits are labeled in Figure 11–22.

Even though the shadowed bits (designated by s in Figure 11–20) are modified during autoinitialization, they do not have an effect until autoinitialization is complete. Unshadowed bits take effect immediately, affecting the autoinitialization sequence. In other words, at autoinitialization, new shadowed bit values are entered last after all registers are loaded (as specified by the link pointer).

Regardless of whether the DMA channel is running in unified mode or split mode, if the CPU or another external source writes to the DMA channel control register, this affects all writable bits, including the shadow bits.

**Note:**

If the CPU writes to the DMA control register during DMA autoinitialization, the CPU write takes effect after the autoinitialization sequence completes. Even though the autoinitialization operation on the DMA registers is not affected, the subsequent data transfer may be affected.

---

**Figure 11–20. DMA Channel Control Register Bits Modifiable by Autoinitialization in Unified Mode**

<table>
<thead>
<tr>
<th>31</th>
<th>30</th>
<th>29</th>
<th>28</th>
<th>27</th>
<th>26</th>
<th>25</th>
<th>24</th>
<th>23</th>
<th>22</th>
<th>21</th>
<th>20</th>
<th>19</th>
<th>18</th>
</tr>
</thead>
<tbody>
<tr>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>AUX START</td>
<td>START</td>
<td>xx</td>
<td>xx</td>
<td>AUX TCC</td>
<td>TCC</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

R/W R/W R/W R/W s s

<table>
<thead>
<tr>
<th>17</th>
<th>16</th>
<th>15</th>
<th>14</th>
<th>13</th>
<th>12</th>
<th>11</th>
<th>10</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>COM PORTS</td>
<td>SPLIT MODE</td>
<td>WRITE BIT REV</td>
</tr>
</tbody>
</table>

R/W R/W R/W S S S S S S

<table>
<thead>
<tr>
<th>9</th>
<th>8</th>
<th>7</th>
<th>6</th>
<th>5</th>
<th>4</th>
<th>3</th>
<th>2</th>
<th>1</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>AUX AUTO STATIC</td>
<td>AUTO INIT STATIC</td>
<td>SYNC MODE</td>
<td>AUX TRANSFER MODE</td>
<td>TRANSFER MODE</td>
<td>DMA PRI</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

R S S S S S S S S S

s — These shadowed bits do not take effect until autoinitialization is complete.
xx — Write protected during autoinitialization.
Autoinitialization

Figure 11–21. DMA Channel Control Register Bit Modifiable by Autoinitialization of the Primary Channel in Split Mode

![DMA Channel Control Register Bit Modifiable by Autoinitialization of the Primary Channel in Split Mode](image)

s — These shadowed bits do not take effect until autoinitialization is complete.
xx — Write protected during primary channel autoinitialization.

Figure 11–22. DMA Channel Control Register Bits That Can Be Modified by Autoinitialization of the Auxiliary Channel in Split Mode

![DMA Channel Control Register Bits That Can Be Modified by Autoinitialization of the Auxiliary Channel in Split Mode](image)

s — These shadowed bits do not take effect until autoinitialization is complete.
xx — Write protected during auxiliary channel autoinitialization.

11.9.6 Consecutive Autoinitializations

For many applications, it is sufficient to autoinitialize the DMA channel with the same data each time. In this case, the new link-pointer value points to the start of the same block of data containing the new link pointer, as illustrated in Figure 11–23. This particular example assumes that the DMA channel is not running in split mode.

11-40
If you want, you can make the new link pointer point to a new set of register values, as illustrated in Figure 11–24. This can be continued to any level.

Figure 11–23. Self-Referential Link Pointer

Figure 11–24. Referring to a New Link Pointer
11.10 DMA and Interrupts

The DMA coprocessor uses interrupts in the following way:

- It can send interrupts to the CPU when a block transfer finishes. See the TCC and AUX TCC bits in Figure 11–2.
- It can receive interrupts from the external interrupt pins (IIOF3–0), the timers, or the communication port (ICRDY, OCRDY).

This section explains how the DMA receives interrupts. This process is called synchronization.

All of the interrupts that the DMA coprocessor can see are first received by the CPU interrupt controller. Edge-triggered interrupts are latched by the CPU in the appropriate interrupt flag register; level-triggered interrupts are not.

When an external interrupt (IIOF3–0) is used for DMA coprocessor transfer synchronization, the CPU is responsible for configuring external interrupts as edge- or level-triggered interrupts (as set in the FUNCx and TYPEx bits of the interrupt flag register (discussed in subsection 3.1.10, IIOF Flag Register (IIF)), on page 3-13.

Edge-triggered interrupts are timer interrupts, DMA interrupts, and external interrupts that are configured as edge-triggered interrupts. Detailed information on interrupts is provided in Section 7.4, Interrupts, on page 7-15, and Section 7.6, DMA Interrupts, on page 7-26. When the interrupt controller determines that an edge-triggered interrupt that a DMA channel is waiting on (DIE registers bits set) has been latched into the interrupt flag, the CPU clears the interrupt flag and sends an interrupt pulse to the DMA channel. The DMA channel latches the interrupt locally until it can service the interrupt. At that time, the latched interrupt is cleared by the DMA coprocessor for two cycles.

Level-triggered interrupts generated by communication ports and external interrupts that are configured as level-triggered interrupts are handled differently by the CPU interrupt controller. When the interrupt controller determines that a level-triggered interrupt that a DMA channel is waiting for (DIE register bits set) has been received, the CPU sends an interrupt pulse to the DMA channel. The DMA channel latches the interrupt locally until it can service the interrupt. At that time, the locally latched interrupt is cleared by the DMA coprocessor for two cycles.

The interrupt reset signal generated by the DMA coprocessor after a DMA interrupt is serviced has priority over the interrupt set signal. Thus, the interrupt signal will not be continuously set, even if the CPU is continuously sending the interrupt set signal. Therefore, when the DMA-set priority scheme is used and
a higher priority DMA channel is driven by continuous interrupt signals, the lower priority DMA channel can be serviced in between the higher priority DMA services.

Unlike the 'C3x, the 'C4x DMA processor is not affected by processing the CPU interrupts, even when pipeline fetches are being halted. When interrupts are enabled in the DIE register, the interrupt is latched automatically by the CPU interrupt controller and saved for future DMA use. When a flag interrupt (timer, external interrupt) is latched, the IIF flag is cleared. Note that IIF flags are cleared when the CPU interrupt controller latches the interrupt, not when the DMA responds to it. Even if the DMA has not been started, the interrupt latch occurs, except when the start bits in the DMA control register have the reset value (00 in the START or AUX START bits). DMA reset clears the interrupt internal latch. To avoid losing previously received interrupts, it is recommended that you initialize DIE register after starting the DMA, when the DMA start bits have the value 11. Note that when the DMA completes a transfer, the start (AUXSTART) bits are set to 10. For this reason, the DMA will not miss any interrupt between transfers.

The DMA and the CPU can respond to the same interrupt if the CPU is not involved in any pipeline conflict or in any instruction that halts instruction fetching. Refer to subsection 7.4.1, Interrupt Vector Table and Prioritization, on page 7-15 for more details. It is also possible for different DMA channels (including auxiliary and primary channels) to respond to the same interrupt. If the same interrupt is selected for source and destination synchronization, both read and write cycles are enabled with a single incoming interrupt.

The internal circuitry of the 'C4x guarantees proper operation between a communication port that generates level-triggered interrupts and the DMA channel that is synchronizing with those level-triggered interrupts.

**Note:**

When you synchronize the DMA channels with external interrupts, it is better to configure the interrupt lines as edge-triggered interrupts to ensure that only one interrupt is recognized.

### 11.10.1 Intermits and Synchronization of DMA Channels

You can use interrupts to synchronize DMA channel transfers. To set up the DMA for a synchronous data transfer mode requires two steps:

1) Set the DMA SYNC MODE bits (bits 6,7) in the DMA channel control register to the value for the source, destination, or source and destination synchronization desired. See subsection 11.10.2, Synchronization Mode Bits, for more information.
2) Set the DIE register to enable the corresponding interrupt for the DMA transfer synchronization desired. Figure 11–25 and Figure 11–26 show the DIE register for the split and unified modes, respectively. Table 11–10 and Table 11–11 lists the different synchronization interrupts for unified mode, and Table 11–12 and Table 11–13 list them for split mode.

It is recommended that you initialize the DIE register after starting the DMA, when the start bits have the value $11_2$. This prevents losing previously received interrupts, which may occur if you enable the DIE register when the start bits are $00_2$ (reset value).

**Figure 11–25. DIE Register Bit Functions for DMA Unified Mode**

<table>
<thead>
<tr>
<th>Bit Number</th>
<th>DMA5 Write</th>
<th>DMA5 Read</th>
<th>DMA4 Write</th>
<th>DMA4 Read</th>
<th>DMA3 Write</th>
<th>DMA3 Read</th>
<th>DMA2 Write</th>
<th>DMA2 Read</th>
<th>DMA1 Write</th>
<th>DMA1 Read</th>
<th>DMA0 Write</th>
<th>DMA0 Read</th>
</tr>
</thead>
<tbody>
<tr>
<td>31</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
</tr>
<tr>
<td>30</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
</tr>
<tr>
<td>29</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>28</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>27</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>26</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>25</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>24</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>23</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>22</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>21</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>20</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

R = Read  W = Write

**Table 11–10. DMA Channels 0 and 1 (DMA0 and DMA1) Unified-Mode Synchronization Interrupts**

<table>
<thead>
<tr>
<th>Bit Value (in DMA0 or DMA1)</th>
<th>DMA0 Read</th>
<th>DMA0 Write</th>
<th>DMA1 Read</th>
<th>DMA1 Write</th>
<th>Interrupt Enabled at DMA0 or DMA1</th>
<th>Interrupt Source for DMA Synchronization</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0†</td>
<td>None</td>
<td>None</td>
<td>None</td>
<td>None</td>
<td>None</td>
<td>None</td>
</tr>
<tr>
<td>0 1</td>
<td>ICRDY0</td>
<td>OCRDY0</td>
<td>ICRDY1</td>
<td>OCRDY1</td>
<td>From communication port</td>
<td>From communication port</td>
</tr>
<tr>
<td>1 0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>From external pins TIM0–TIM3</td>
<td>From external pins TIM0–TIM3</td>
</tr>
<tr>
<td>1 1</td>
<td>TIM0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>From timer TIM0</td>
<td>From timer TIM0</td>
</tr>
</tbody>
</table>

† DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used.
Table 11–11. DMA Channels 2 to 5 (DMA2 to DMA5) Unified-Mode Synchronization Interrupts

<table>
<thead>
<tr>
<th>Bit Value (in DMA2 to DMA5)</th>
<th>Interrupt Enabled at DMA2–DMA5†</th>
<th>Interrupt Source for DMA Synchronization</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0†</td>
<td>None</td>
<td>– –</td>
</tr>
<tr>
<td>0 0 1</td>
<td>ICRDY x†</td>
<td>From communication port</td>
</tr>
<tr>
<td>0 1 0</td>
<td>IIOF0</td>
<td>From external pins IIOF0–IIOF3</td>
</tr>
<tr>
<td>0 1 1</td>
<td>IIOF1</td>
<td>From external pins IIOF0–IIOF3</td>
</tr>
<tr>
<td>1 0 0</td>
<td>IIOF2</td>
<td>From external pins IIOF0–IIOF3</td>
</tr>
<tr>
<td>1 0 1</td>
<td>IIOF3</td>
<td>From external pins IIOF0–IIOF3</td>
</tr>
<tr>
<td>1 1 0</td>
<td>TIM0</td>
<td>From timers TIM0 and TIM1</td>
</tr>
<tr>
<td>1 1 1</td>
<td>TIM1</td>
<td>– –</td>
</tr>
</tbody>
</table>

† The x in DMAx represents the DMA channel number and also the number for the corresponding ICRDYx and OCRDYx interrupts. For example, an 001 in both DMA2 READ and DMA5 WRITE would enable interrupts ICRDY2 and OCRDY5, respectively. All other viable bit values (010 to 111) are the same (as shown in the table) for DMA2 through DMA5.
†† DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used.

Figure 11–26. DIE Register Bit Functions for DMA Split Mode

<table>
<thead>
<tr>
<th>DMA5 Primary Write</th>
<th>DMA5 Auxiliary Read</th>
<th>DMA4 Primary Write</th>
<th>DMA4 Auxiliary Read</th>
</tr>
</thead>
<tbody>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
</tr>
<tr>
<td>19 18 17 16 15 14</td>
<td>13 12 11 10 9 8</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>DMA3 Primary Write</th>
<th>DMA3 Auxiliary Read</th>
<th>DMA2 Primary Write</th>
<th>DMA2 Auxiliary Read</th>
</tr>
</thead>
<tbody>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
</tr>
<tr>
<td>7 6 5 4 3 2 1 0</td>
<td>1</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>DMA1 Primary Write</th>
<th>DMA1 Auxiliary Read</th>
<th>DMA0 Primary Write</th>
<th>DMA0 Auxiliary Read</th>
</tr>
</thead>
<tbody>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
</tr>
</tbody>
</table>

R = Read W = Write
### DMA and Interrupts

**Table 11–12. DMA Channels 0 and 1 (DMA0 and DMA1) Split-Mode Synchronization Interrupts**

<table>
<thead>
<tr>
<th>Bit Value (in DMA0 or DMA1)</th>
<th>DMA0 Auxiliary Read</th>
<th>DMA0 Primary Write</th>
<th>DMA1 Auxiliary Read</th>
<th>DMA1 Primary Write</th>
<th>Interrupt Source for DMA Synchronization</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0†</td>
<td>None</td>
<td>None</td>
<td>None</td>
<td>None</td>
<td>-- --</td>
</tr>
<tr>
<td>0 1</td>
<td>ICRDY0</td>
<td>OCRDY0</td>
<td>ICRDY1</td>
<td>OCRDY1</td>
<td>From communication port</td>
</tr>
<tr>
<td>1 0</td>
<td>II0F0</td>
<td>II0F1</td>
<td>II0F2</td>
<td>II0F3</td>
<td>From external pins II0F0–II0F3</td>
</tr>
<tr>
<td>1 1</td>
<td>TIM0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>From timer TIM0</td>
</tr>
</tbody>
</table>

† DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used.

**Table 11–13. DMA Channels 2 to 5 (DMA2 to DMA5) Split-Mode Synchronization Interrupts**

<table>
<thead>
<tr>
<th>Bit Value (in DMA2 to DMA5)</th>
<th>DMAx Auxiliary Read†</th>
<th>†DMAx Primary Write†</th>
<th>Interrupt Source for DMA Synchronization</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0†</td>
<td>None</td>
<td>None</td>
<td>-- --</td>
</tr>
<tr>
<td>0 0 1</td>
<td>ICRDYx†</td>
<td>OCRDYx†</td>
<td>From communication port</td>
</tr>
<tr>
<td>0 1 0</td>
<td>II0F0</td>
<td>II0F0</td>
<td>From external pins II0F0–II0F3</td>
</tr>
<tr>
<td>0 1 1</td>
<td>II0F1</td>
<td>II0F1</td>
<td>From external pins II0F0–II0F3</td>
</tr>
<tr>
<td>1 0 0</td>
<td>II0F2</td>
<td>II0F2</td>
<td>From external pins II0F0–II0F3</td>
</tr>
<tr>
<td>1 0 1</td>
<td>II0F3</td>
<td>II0F3</td>
<td>From external pins II0F0–II0F3</td>
</tr>
<tr>
<td>1 1 0</td>
<td>TIM0</td>
<td>TIM0</td>
<td>From timer TIM0 and TIM1</td>
</tr>
<tr>
<td>1 1 1</td>
<td>TIM1</td>
<td>TIM1</td>
<td>From timer TIM0 and TIM1</td>
</tr>
</tbody>
</table>

† The x in DMAx represents the DMA channel number and also the number for the corresponding ICRDYx and OCRDYx interrupts. For example, an 0012 in both DMA2 READ and DMA5 WRITE would enable interrupts ICRDY2 and OCRDY5, respectively. All other viable bit values (0102 to 1112) are the same (as shown in the table) for DMA2 through DMA5.

‡ DMA channel halts (no read or write operation proceeds) if DMA synchronous transfer is used.

#### 11.10.2 Synchronization Mode Bits

Table 11–3 and Table 11–4 describe how the bit values of the SYNC MODE field of the DMA channel control register determine synchronization in unified and split mode, respectively.
DMA and Interrupts

- No synchronization (SYNC MODE = 00₂)
- Source synchronization
  - for unified mode (SYNC MODE = 01₂)
  - for split mode (SYNC MODE = 10₂)
- Destination synchronization
  - for unified mode (SYNC MODE = 10₂)
  - for split mode (SYNC MODE = 01₂)
- Source and destination synchronization (SYNC MODE = 11₂)

When the 'C4x DMA is in split mode, the primary channel supports write (or destination) synchronization transfers only, and the auxiliary channel supports read (or source) synchronization transfers only. In split mode, bits 6 and 7 of the DMA channel control register (as shown in Table 11–3) are used to control channel synchronization:

- **Bit 6** controls primary write channel synchronization (destination synchronization).
- **Bit 7** controls auxiliary read channel synchronization (source synchronization).

DMA transfer rate in synchronization mode is explained in subsection 11.11.2, *DMA Transfer Rate in Synchronization Mode*, on page 11-55.

**No Synchronization**

When SYNC MODE = 00₂, no synchronization is performed. The DMA performs reads and writes whenever it has the priority to use the DMA bus. All interrupts are ignored. Note the difference between this mode and having the zero value in the DIE read or write fields. Having zeros in the DIE register read/write fields results in a total DMA halt if synchronization is used, whereas SYNC MODE = 00₂ leaves the DMA channel running freely. Figure 11–27 shows the mechanism used when SYNC MODE = 00₂.

*Figure 11–27. No DMA Synchronization*

```
Start

DMA channel performs a read

DMA channel performs a write

Go to start

(SYNC MODE = 00₂)
```
Source Synchronization

When SYNC MODE = 01₂ (for unified mode) or when SYNC MODE = 10₂ (for auxiliary channel in split mode), the DMA coprocessor is synchronized to the source (see Figure 11–28). A read will not be performed until an interrupt is received by the DMA channel. Then, all DMA interrupts are disabled globally. However, no bits in the DMA interrupt enable register are changed.

**Figure 11–28.DMA Source Synchronization**

<table>
<thead>
<tr>
<th>a) DMA Channel in Unified Mode</th>
<th>b) Auxiliary Channel in Split Mode</th>
</tr>
</thead>
<tbody>
<tr>
<td>(SYNC MODE = 01₂)</td>
<td>Sync on Read</td>
</tr>
</tbody>
</table>

- Start
- Idle until enabled interrupt is received
- Disable DMA interrupts globally
- DMA channel performs a read
- Enable DMA interrupts globally
- DMA channel performs a write
- Go to start

Destination Synchronization

When SYNC MODE = 10₂ (for unified mode) or when SYNC MODE = 01₂ (for primary channel in split mode), the DMA channel is synchronized to the destination. A write is not performed until an interrupt is received by the DMA channel. Figure 11–29 shows the synchronization mechanism.

In unified mode, the read is performed without waiting for the interrupt. However, in split mode, the read occurs only when the interrupt enabling the write is received. This avoids a lock situation that could happen if the primary channel...
reads but never writes out of the temporary register, because it does not receive the write interrupt. In this case, the auxiliary channel could not proceed, because the DMA internal temporary register is busy.

**Figure 11–29. DMA Destination Synchronization**

*a)* DMA Channel in Unified Mode

(SYNC MODE = 10₂)

- Start
- DMA channel performs a read
- Idle until enabled interrupt is received
- Disable DMA interrupts globally
- DMA channel performs a write
- Enable DMA interrupts globally
- Go to start

*b)* Primary Channel in Split Mode

(SYNC MODE = 01₂)

- Start
- Idle until enabled interrupt is received
- Disable interrupts globally
- DMA channel performs a read
- Write data to communication port output FIFO
- Enable DMA interrupts globally
- Go to start

**Source and Destination Synchronization**

When SYNC MODE = 1₁₂, a read is performed when a read interrupt is received, and a write is performed on the write interrupt. If a write interrupt is received before a read interrupt, the write interrupt is latched, and the DMA data write is not executed until the read is completed. Unified mode source and destination synchronization (SYNC MODE = 1₁₂) is shown in Figure 11–30.

If DMA split mode is selected, it reacts as two independent synchronizations for the primary (write synchronization) and auxiliary (read synchronization) channels. Figure 11–28b and Figure 11–29b show this.

When the same interrupt is selected for read and write synchronization (in either split or unified mode), one single interrupt will enable both read and write operations.
Figure 11–30. Unified Mode DMA Source and Destination Synchronization

(SYNC MODE = 112)

1. Start
2. Idle until enabled interrupt is received
3. Disable DMA interrupts globally
4. DMA channel performs a read
5. Enable DMA interrupts globally
6. Idle until enabled interrupt is received
7. Disable DMA interrupts globally
8. DMA channel performs a write
9. Enable DMA interrupts globally
10. Go to start
11.11 DMA Memory Transfer Timing

The 'C4x provides six DMA channels (twelve DMA channels if they are all in split mode) with a fixed/rotating priority arbitration scheme and configurable CPU/DMA priority scheme (for detailed information, see Section 11.6 for DMA internal priority schemes and Section 11.7, CPU and DMA Coprocessor Arbitration, for CPU and DMA priority arbitration).

The maximum data transfer rate that the 'C4x DMA sustains is one word every two cycles. The six DMA channels transfer data in a sequential time-slice fashion, rather than simultaneously, because they share common buses.

DMA memory transfer timing can be very complicated, especially if bus resource conflicts occur. However, some rules help you calculate the transfer timing for certain DMA setups. For simplification, the following subsection focuses on a single-channel DMA memory transfer timing with no conflict with the CPU or other DMA channels. You can obtain the actual DMA transfer timing by combining the calculations for single-channel DMA transfer timing with those for bus resource conflict situations.

11.11.1 Single DMA Memory Transfer Timing

When the DMA memory transfer has no conflict with the CPU or any other DMA channels, the number of cycles of a DMA transfer depends on whether the source and destination location are designated as on-chip memory, peripheral, or external ports. When the external port is used, the DMA transfer speed is affected by two factors: the external bus wait state and the read/write conflict (for example, if a write is followed by a read, the read takes two cycles). Figure 11–31 through Figure 11–33 show the number of cycles a DMA transfer requires from different sources to different destinations. Entries in the table represents the number of cycles required to do the $T$ transfers, assuming that there are no pipeline conflicts. A timing diagram for the DMA transfers accompanies each figure.
### DMA Memory Transfer Timing

#### Figure 11–31. Timing and Number of Cycles for DMA Transfers to On-Chip Destination

<table>
<thead>
<tr>
<th>Cycles</th>
<th>T</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
<th>9</th>
<th>10</th>
<th>11</th>
<th>12</th>
<th>13</th>
<th>14</th>
<th>15</th>
<th>16</th>
<th>17</th>
<th>18</th>
</tr>
</thead>
<tbody>
<tr>
<td>Source on-chip</td>
<td>9</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
</tr>
<tr>
<td>Source local bus</td>
<td>4</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
</tr>
<tr>
<td>Source global bus</td>
<td>4</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Source</th>
<th>Destination: On-chip</th>
</tr>
</thead>
<tbody>
<tr>
<td>On-chip</td>
<td>(1+1)T</td>
</tr>
<tr>
<td>Local bus</td>
<td>[(1+Cr)+1]T</td>
</tr>
<tr>
<td>Global bus</td>
<td>[(1+Cr)+1]T</td>
</tr>
</tbody>
</table>

**Legend:**
- **T** = Number of transfers
- **Cr** = Source-read wait states
- **R** = Single-cycle reads
- **W** = Single-cycle writes
- **R R** = Multicycle reads
Figure 11–32. Timing and Number of Cycles for DMA Transfers to a Local-Bus Destination

<table>
<thead>
<tr>
<th>Cycles</th>
<th>T 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18</th>
</tr>
</thead>
<tbody>
<tr>
<td>Source on-chip</td>
<td>Destination local bus</td>
</tr>
<tr>
<td>4</td>
<td>R</td>
</tr>
<tr>
<td>Source local bus</td>
<td>Destination local bus</td>
</tr>
<tr>
<td>2</td>
<td>R</td>
</tr>
<tr>
<td>Source global bus</td>
<td>Destination local bus</td>
</tr>
<tr>
<td>3</td>
<td>R</td>
</tr>
</tbody>
</table>

Source: Destination: Local Bus

<table>
<thead>
<tr>
<th>Source</th>
<th>Destination: Local Bus</th>
</tr>
</thead>
<tbody>
<tr>
<td>On-chip</td>
<td>1+(2+Cw)T</td>
</tr>
<tr>
<td>Local bus</td>
<td>[(2+Cr)+(2+Cw)]T–1</td>
</tr>
<tr>
<td>Global bus</td>
<td>[(1+Cr)+(2+Cw)]+<a href="T%E2%80%931">2+max(Cr,Cw)</a></td>
</tr>
</tbody>
</table>

Legend:
- T = Number of transfers
- Cr = Source-read wait states
- Cw = Destination-write wait states
- R = Single-cycle reads
- R R = Multicycle reads
- W W = Multicycle writes
**DMA Memory Transfer Timing**

*Figure 11–33. Timing and Number of Cycles for DMA Transfers to a Global-Bus Destination*

<table>
<thead>
<tr>
<th>Cycles</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
<th>9</th>
<th>10</th>
<th>11</th>
<th>12</th>
<th>13</th>
<th>14</th>
<th>15</th>
<th>16</th>
<th>17</th>
<th>18</th>
</tr>
</thead>
<tbody>
<tr>
<td>Source on-chip</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
</tr>
<tr>
<td>Destination global bus</td>
<td>4</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
</tr>
<tr>
<td>Source local bus</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
</tr>
<tr>
<td>Destination global bus</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
<td>3</td>
</tr>
<tr>
<td>Source global bus</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
</tr>
<tr>
<td>Destination global bus</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
<td>2</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Source</th>
<th>Destination: Global Bus</th>
</tr>
</thead>
<tbody>
<tr>
<td>On-chip</td>
<td>1 + (2 + Cw)T</td>
</tr>
<tr>
<td>Local bus</td>
<td>[(1 + Cr) + (2 + Cw)] + [2 + max(Cr, Cw)](T – 1)</td>
</tr>
<tr>
<td>Global bus</td>
<td>[(2 + Cr) + (2 + Cw)]T – 1</td>
</tr>
</tbody>
</table>

Legend:
- **T** = Number of transfers
- **Cr** = Source-read wait states
- **Cw** = Destination-write wait states
- **R** = Single-cycle reads
- **R R** = Multicycle reads
- **W W** = Multicycle writes

Externally, on the global and local buses, writes take at least two cycles. However, internally, the CPU/DMA requires one cycle to perform the write to external memory. Therefore, the DMA/CPU can transfer data on the next cycle if it is not to the same external bus. For example, the DMA transfers 1024 words from internal memory RAM block 1 to a 1-wait-state memory on the global bus while the CPU runs from memory on the local bus and fetches operands from RAM block 0. The DMA transfer time is calculated from Figure 11–32 as $1 + (2 + 1)1024 = 1 + 3072 = 3073$ cycles.
11.11.2 DMA Transfer Rate in Synchronization Mode

The synchronization mode used for transfers also affects the DMA data transfer rate. The DMA data transfer rate is slower if synchronization is used because it takes two cycles to reset the request from the interrupt. However, these two extra cycles can be absorbed if multiple DMAs are running at the same time.

In unified mode, the maximum transfer rate is one word every three cycles, using synchronization. Figure 11–34 shows the number of cycles a DMA transfer requires under unified mode with different types of synchronization. For simplification, a single-channel DMA memory transfer timing with no conflict with CPU or other DMA channels, no memory wait states, and interrupts always active, is considered.

Figure 11–34. Unified-Mode DMA Timing for Different Synchronizations

<table>
<thead>
<tr>
<th>Cycles</th>
<th>T</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
<th>9</th>
<th>10</th>
<th>11</th>
<th>12</th>
<th>13</th>
<th>14</th>
<th>15</th>
<th>16</th>
<th>17</th>
<th>18</th>
</tr>
</thead>
<tbody>
<tr>
<td>No synchronization</td>
<td>9</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
</tr>
<tr>
<td>Read synchronization</td>
<td>6</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
</tr>
<tr>
<td>Write synchronization</td>
<td>5</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
</tr>
<tr>
<td>Read and write synchronization</td>
<td>5</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
</tr>
</tbody>
</table>

**Synchronization** | **Timing**
---|---
No synchronization | 2T
Read synchronization | 3T
Write synchronization | 1 + 3T
Read and write synchronization | 1 + 3T

Legend:
T = Number of transfers
R = Single-cycle reads
W = Single-cycle writes
Rr = Read flag-reset (2 cycles)
Wr = Write-flag reset (2 cycles)
In split mode, the maximum transfer rate for either the primary or auxiliary channel is one word every four cycles, using synchronization. When auxiliary and primary channels are running at the same time, the two-cycle overhead for interrupt reset is absorbed, and the maximum transfer rate can be one word every two cycles. Figure 11–35 shows the number of cycles a DMA transfer requires in split mode with different types of synchronization. For simplification, a single-channel DMA memory transfer timing with no conflict with CPU or other DMA channels, no wait states, and interrupts always active, is considered.

**Figure 11–35. Split-Mode DMA Timing for Different Synchronizations**

<table>
<thead>
<tr>
<th>Cycles</th>
<th>T</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
<th>9</th>
<th>10</th>
<th>11</th>
<th>12</th>
<th>13</th>
<th>14</th>
<th>15</th>
<th>16</th>
<th>17</th>
<th>18</th>
</tr>
</thead>
<tbody>
<tr>
<td>No synchronization (both channels running)</td>
<td>8</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>R'</td>
<td>R'</td>
<td>R'</td>
<td>R'</td>
<td>W'</td>
<td>W'</td>
<td>W'</td>
<td>W'</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Primary channel synchronization (auxiliary channel not running)</td>
<td>4</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>Pr</td>
<td>Pr</td>
<td>Pr</td>
<td>Pr</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Auxiliary channel synchronization (primary channel not running)</td>
<td>4</td>
<td>R'</td>
<td>R'</td>
<td>R'</td>
<td>R'</td>
<td>W'</td>
<td>W'</td>
<td>W'</td>
<td>W'</td>
<td>Ar</td>
<td>Ar</td>
<td>Ar</td>
<td>Ar</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Primary and auxiliary channel synchronization (both channels running)</td>
<td>8</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>W</td>
<td>Pr</td>
<td>Pr</td>
<td>Pr</td>
<td>Pr</td>
<td>R'</td>
<td>R'</td>
<td>R'</td>
<td>R'</td>
<td>W'</td>
<td>W'</td>
</tr>
</tbody>
</table>

**Synchronization**

<table>
<thead>
<tr>
<th></th>
<th>Timing</th>
</tr>
</thead>
<tbody>
<tr>
<td>No synchronization (both channels running)</td>
<td>2T</td>
</tr>
<tr>
<td>Primary channel synchronization (auxiliary channel not running)</td>
<td>4T</td>
</tr>
<tr>
<td>Auxiliary channel synchronization (primary channel not running)</td>
<td>4T</td>
</tr>
<tr>
<td>Primary and auxiliary channel synchronization (both channels running)</td>
<td>2T + 2</td>
</tr>
</tbody>
</table>

**Legend:**

- **T** = Number of transfers
- **R** = Single-cycle reads primary channel
- **R’** = Single-cycle reads auxiliary channel
- **W** = Single-cycle writes primary channel
- **W’** = Single-cycle writes auxiliary channel
- **Pr** = Primary channel flag reset (2 cycles)
- **Ar** = Auxiliary channel flag reset (2 cycles)
The 'C4x offers six ('C40) or four ('C44) on-chip communication ports for interfacing with other 'C4xs and peripherals. One important feature of the ports is that they can work with the DMA coprocessor to transfer data without CPU intervention, allowing the CPU to perform other tasks.

This chapter describes the key features, memory map and registers, and operations of the communication ports of the 'C4x digital signal processor.

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>12.1 Features</td>
<td>12-2</td>
</tr>
<tr>
<td>12.2 Operational Overview</td>
<td>12-3</td>
</tr>
<tr>
<td>12.3 Memory Map and Registers</td>
<td>12-7</td>
</tr>
<tr>
<td>12.4 Port Arbitration Units (PAUs)</td>
<td>12-11</td>
</tr>
<tr>
<td>12.5 Halting of Input and Output FIFOs</td>
<td>12-14</td>
</tr>
<tr>
<td>12.6 Coordinating Communication Ports With the CPU and DMA Coprocessor</td>
<td>12-17</td>
</tr>
<tr>
<td>12.7 Token Transfer Operation</td>
<td>12-19</td>
</tr>
<tr>
<td>12.8 Word Transfer Operation</td>
<td>12-22</td>
</tr>
<tr>
<td>12.9 Synchronizers</td>
<td>12-26</td>
</tr>
<tr>
<td>12.10 Module Reset</td>
<td>12-29</td>
</tr>
<tr>
<td>12.11 Tips for Using Communication Ports</td>
<td>12-32</td>
</tr>
</tbody>
</table>
12.1 Features

Each 'C4x communication port has several key features:

- 160-MB per second bidirectional peak data transfer rates (at 40-ns cycle time)
- Simple processor-to-processor communication via eight data lines and four control lines
- FIFO buffering of all data transfers
- Automatic arbitration and handshaking to ensure communication synchronization
- Synchronization between the CPU or direct-memory access (DMA) coprocessor and the six communication ports via internal interrupts and internal ready signals
- Support of a wide variety of multiprocessor architectures, including rings, trees, hypercubes, bidirectional pipelines, two-dimensional Euclidean grids, hexagonal grids, and three-dimensional grids
- Communication-port software reset ('C40 revisions ≥ 5.0 and 'C44 only)
12.2 Operational Overview

The 'C4x contains six ('C40) or four ('C44) identical high-speed communication ports, each of which provides a bidirectional communication interface to one other 'C4x or external peripheral. Figure 12–1 shows the internal architecture of a single communication port. Each port contains the following components:

- **Input FIFO channel** — provides an 8-level, 32-bit wide first-in-first-out (FIFO) input buffer that isolates the 'C4x from the port communication data bus and buffers data received from an external device via the bus.

- **Output FIFO channel** — provides an 8-level, 32-bit wide FIFO output buffer that isolates the 'C4x from the port communication data bus and buffers data to be sent to an external device via the bus.

- **Port arbitration unit (PAU)** — handles the arbitration tasks associated with the movement of data between a 'C4x and an external device via the port communication data bus. The PAU is described in detail in Section 12.4, Port Arbitration Units (PAUs), on page 12-11.

- **Communication port control register (CPCR)** — allows you to control the communication port functions and data transfer operations between a 'C4x and an external device via the communication port data bus.

- **Communication-port software reset register ('C44 and 'C40 rev ≥ 5.0)** — allows you to flush the input FIFO and output FIFO levels of a communication port. This is explained in subsection 12.3.4, Communication Port Software Reset Register, on page 12-10.

A communication port transmits each of the 32-bit words stored in its output FIFO on a byte-to-byte basis. Because the control and data lines are bidirectional, each 'C4x must have ownership of the communication port data bus before starting a word transfer. A simulated *token* is used to designate bus ownership: the communication port that has the token owns the communication port data bus and can transmit data.
Figure 12–1. Communication Port Block Diagram

- **Peripheral data bus**
- **Input FIFO channel**
  - FIFO control
  - FIFO positions (8 levels)
  - Input channel full flag (ICFULL)
  - Input channel ready flag (ICRDY)
- **Output FIFO Channel**
  - Output channel empty flag (OCEMPTY)
  - Output channel ready flag (OCRDY)
- **Communication port control registers**
  - Input channel halt
  - Output channel halt
  - Port direction
  - Input channel level
  - Output channel level
- **Port arbitration unit (PAU)**
  - Buffer register for word being input or output
  - Input/output data shifter and multiplexer
- **Interface**
  - CREGx
  - CACKx
  - CSTRBx
  - CRDYx
  - CxD(7–0)
Figure 12–2 is an example of two 'C4x DSPs connected via their communication ports. This simple communication interface consists of the following bidirectional control and data lines:

- **CREQ** — communication-port token request. A 'C4x activates this signal to request the use of the communication-port data bus.
- **CACK** — communication-port token acknowledge. A 'C4x activates this signal to relinquish ownership of the communication-port data bus upon receiving a CREQ from another 'C4x.
- **CSTRB** — communication-port strobe. A sending 'C4x activates this signal to indicate that it has placed a valid data byte on the communication port data bus.
- **CRDY** — communication-port ready. A receiving 'C4x activates this signal to indicate that it has received a data byte via the communication port data bus.
- **CxD(7–0)** — communication-port data bus. This bus carries data bidirectionally, one byte at a time, between two 'C4xs or between a 'C4x and some other device.

### 12.2.1 Token Transfer Operation

To transfer a token, the PAUs in the two 'C4xs cooperate to generate the signals and control sequences necessary to ensure orderly data transfers at the highest possible rate. To avoid conflicts on the bus, the PAUs arbitrate bus ownership, allowing only one DSP to transmit at any given time. The PAU that owns the token can relinquish bus ownership when the other 'C4x has data to send.
12-6

The signals CREQx and CACKx handle the handshaking arbitration between the two DSPs in two steps:

1) The PAU that does not own the data bus (CxD(7–0)) activates CREQx to request bus ownership.
2) The PAU owning the bus then activates CACKx to acknowledge the request and relinquish bus ownership to the requesting PAU.

In this manner, these signals transfer a token (or priority) from one PAU to another, and the PAU receiving the token gains ownership of the bus. See Section 12.7, Token Transfer Operation, for a detailed description of token transfer.

12.2.2 Data Transfer Operation

A data transfer operation takes four basic steps to complete:

1) The CPU or DMA coprocessor of the sending DSP writes a 32-bit data word to the output FIFO (of a communication port) via a memory-mapped address (listed in Figure 12–3).
2) The communication port then places the 32-bit data word on CxD(7–0) on a byte-to-byte basis (LS byte first), activating CSTRBx to signal the receiving communication port that the bus contains a valid data byte.
3) Upon receiving each data byte, the receiving communication port activates CRDYx to indicate that it has received the data byte.
4) After receiving the 4 bytes of a 32-bit word, the CPU or DMA coprocessor of the receiving DSP can then read the data from the input FIFO via a memory-mapped address (listed in Figure 12–3).

Each of the input and output FIFOs can buffer a maximum of eight 32-bit words.

Buffering provided by the input and output FIFOs is essential. This buffering allows for a high degree of decoupling of computation and communication overhead. When 'C4xs A and B are connected via their communication ports, the effective length of the FIFOs becomes 16 levels. This occurs because the output path from A to B is the concatenation of the eight levels of the output FIFO of A with the eight levels of the input FIFO of B. This also applies for the output path from B to A.
12.3 Memory Map and Registers

Figure 12–3 shows the memory map for the 'C4x communication-port control registers (CPCRs) and their associated input FIFOs and output FIFOs. The lowest three addresses of each port’s 16-address block are mapped to a corresponding CPCR, and its associated input and output FIFOs. Fields (bits) within a CPCR are shown in Figure 12–4.

Figure 12–3. Communication-Port Memory Map

† This feature is only available on the 'C44 and on the 'C40 (revision 5.0 and above).
### 12.3.1 Communication-Port Control Register (CPCR)

Figure 12–4 shows the format of a 'C4x CPCR, which contains control and status bits for its associated communication port. The text following the figure lists the CPCR bits and fields and describes their functions.

**Figure 12–4. Communication-Port Control Register (CPCR)**

<table>
<thead>
<tr>
<th>31</th>
<th>30</th>
<th>29</th>
<th>28</th>
<th>27</th>
<th>26</th>
<th>25</th>
<th>24</th>
<th>23</th>
<th>22</th>
<th>21</th>
<th>20</th>
<th>19</th>
<th>18</th>
<th>17</th>
<th>16</th>
<th>15</th>
<th>14</th>
<th>13</th>
</tr>
</thead>
<tbody>
<tr>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>12</th>
<th>11</th>
<th>10</th>
<th>9</th>
<th>8</th>
<th>7</th>
<th>6</th>
<th>5</th>
<th>4</th>
<th>3</th>
<th>2</th>
<th>1</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>INPUT LEVEL</td>
<td>OUTPUT LEVEL</td>
<td>OCH</td>
<td>ICH</td>
<td>PORT DIR</td>
<td>xx</td>
<td>xx</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R/W</td>
<td>R/W</td>
<td>R</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

**Notes:**
1) $xx =$ reserved bit (read/write as zero).
2) $R =$ read, $W =$ write.

- **Reserved**
  - Undefined

- **PORT DIR**
  - **Port Direction.** This bit determines the direction of data transfer operations for the communication port.
    - PORT DIR = 0: port is in the output mode.
    - PORT DIR = 1: port is in the input mode.
    - This is a read-only bit. It is not possible to change the port direction under software control.

- **ICH**
  - **Input Channel Halt.**
    - Write a 1 to ICH to halt the input channel.
    - Clear ICH to 0 when the input channel is to be unhalted.
    - The input channel cannot signal externally when it is ready to receive.

- **OCH**
  - **Output Channel Halt.**
    - Write a 1 to this bit to halt the output channel immediately.
    - However, the communication port is still able to accept a token request from the input channel.
    - Clear this bit to 0 to allow the output channel to transfer data.
### Output FIFO Level

Contents of this 4-bit field:

- **0000_2 (0)**: indicates an empty output FIFO.
- **0001_2 (1) through 0111_2 (7)**: indicates the number of full positions in the output FIFO.
- **1111_2 (15)**: indicates a full output FIFO.

An empty output buffer (OUTPUT LEVEL = 0000_2) sends an unlatched, positive level-triggered interrupt (OCEMPTY = 1) to the CPU. When the CPU or DMA coprocessor writes to the empty output FIFO, OCEMPTY is cleared to 0 and remains in that state until the buffer is again empty. An output FIFO with one or more empty levels also sends an unlatched, positive level-triggered interrupt (OCRDY = 1) to the CPU and the DMA coprocessor. This condition causes a READY/NOT READY signal to be generated when the CPU or DMA coprocessor attempts to write to the output FIFO. See Section 12.6, Coordinating Communication Ports With the CPU and DMA Coprocessor, on page 12-17, for details.

### Input FIFO Level

Contents of this 4-bit field:

- **0000_2 (0)**: indicates an empty input FIFO.
- **0001_2 (1) through 0111_2 (7)**: indicates the number of full positions in the input FIFO.
- **1111_2 (15)**: indicates a full input FIFO.

A full input FIFO (INPUT LEVEL = 1111_2) sends an unlatched, positive level-triggered interrupt (ICFULL = 1) to the CPU. When the CPU or DMA coprocessor reads from the full input FIFO, ICFULL is cleared to 0 and remains in that state until the FIFO is again full. An input FIFO with one or more full levels also sends an unlatched, positive level-triggered interrupt (ICRDY = 1) to the CPU and the DMA coprocessor. This condition causes a READY/NOT READY signal to be generated when the CPU or DMA coprocessor attempts to read from the input FIFO.

### Reserved

Undefined.

#### 12.3.2 Input-Port Register

This read-only register contains the contents of position 0, the oldest value of the input FIFO. If this register is written to, its contents remain unchanged. Reading from an empty input FIFO causes the CPU or DMA operation to stall and to halt the peripheral bus.

#### 12.3.3 Output-Port Register

This write-only register interfaces to position 7 (the newest value) of the output FIFO. If this register is read, its contents remain unchanged, and the value read is undefined.
If an output FIFO that is full is written to, the peripheral-bus interface latches the word, and returns a *not ready* signal. This condition disappears when an empty position appears in the output FIFO and the data on the bus is transferred to the FIFO.

### 12.3.4 Communication-Port Software Reset Register

The input and output FIFO levels for a communication port can be flushed by writing at least two back-to-back values to its communication-port-software reset address as specified in Table 12–1. The communication port reset feature does not affect the status of the external pins.

*Table 12–1. Communication-Port Software Reset Address (*'C44 and 'C40 ≥ 5.0)*

<table>
<thead>
<tr>
<th>COMMUNICATION PORT</th>
<th>SOFTWARE RESET ADDRESS</th>
</tr>
</thead>
<tbody>
<tr>
<td>0†</td>
<td>0x0100043</td>
</tr>
<tr>
<td>1</td>
<td>0x0100053</td>
</tr>
<tr>
<td>2</td>
<td>0x0100063</td>
</tr>
<tr>
<td>3†</td>
<td>0x0100073</td>
</tr>
<tr>
<td>4</td>
<td>0x0100083</td>
</tr>
<tr>
<td>5</td>
<td>0x0100093</td>
</tr>
</tbody>
</table>

† These ports are available only in the 'C40.

Example 12–1 shows a method for resetting a communication port.

*Example 12–2. Communication Port Reset*

```asm
; RESET1: Flushes FIFOs data for communication port 1;  
RESET1 push AR0 ; Save registers  
push R0 ;  
push RC ;  
ldhi 010h,AR0 ; Set AR0 to base address of COM 1  
or 050h,AR0 ;  
FLUSH: rpts 1 ; Flush FIFO data with back-to-back write  
sti R0, +AR0(3) ;  
rpts 10 ; Wait  
nop ;  
ldi +AR0(0),R0 ; Check for new data from other port  
and 01FE0h,R0 ;  
bnz FLUSH ;  
pop RC ; Restore registers  
pop R0 ;  
pop AR0 ;  
rets ; Return
```
12.4 Port Arbitration Units (PAUs)

The PAU arbitrates between two devices to determine which device has possession of the communication port data bus at any given time. This arbitration uses CREQ and CACK signals to pass the bus ownership token back and forth between two devices connected via their communication ports. Token transfer operation is covered in detail in Section 12.7, Token Transfer Operation.

After system reset, half of the communication channels associated with a particular 'C4x have token ownership (communication ports 0, 1, 2), and the other half (communication ports 3, 4, 5) do not.

The PAU is a synchronous state machine with four states, as shown in Table 12–2. These states are not software-accessible by the CPU or the DMA coprocessor.

Table 12–2. PAU State Definitions

<table>
<thead>
<tr>
<th>PAU State</th>
<th>Summary</th>
<th>PAU Status</th>
</tr>
</thead>
</table>
| State 0: Idle with| 1. PAU has token (PORT DIR = 0).  
               token                                   2. Channel not in use.                                                   | The PAU currently has possession of the bus ownership token, and its associated communication channel is not in use. Under this condition, the PORT DIR bit of the associated CPCR is 0 (output). This is the state of communication ports 0, 1, and 2 after system reset. |
| State 1: Idle without| 1. PAU does not have token (PORT DIR = 1).  
                   token                                   2. Token not requested by PAU (OUTPUT LEVEL = 0).                         | The PAU currently does not have possession of the bus ownership token and has not requested the token. Under this condition, the PORT DIR bit equals 1 (input), and the OUTPUT LEVEL field equals 0 (empty output FIFO). This is the state of communication ports 3, 4, and 5 after system reset. |
| State 2: Active    | 1. PAU has token (PORT DIR = 0).  
               2. Channel is in use (OUTPUT LEVEL ≠ 0).                             | The PAU currently has possession of the bus ownership token, and its associated communication channel is in use. Under this condition, the PORT DIR bit equals 0 (output), and the OUTPUT LEVEL field does not equal 0. |
| State 3: Waiting for| 1. PAU does not have token (PORT DIR = 1).  
                   token                                   2. Token requested by PAU (OUTPUT LEVEL ≠ 0).                             | The PAU currently does not have the bus ownership token but has requested it. Under this condition, the PORT DIR bit equals 1 (input), and the OUTPUT LEVEL field does not equal 0. |
Port Arbitration Units (PAUs)

Figure 12–5 shows the state diagram and controlling equations for state transitions.

To place data on the communication port data bus, the PAU must arbitrate between two types of requests:

- On-chip requests to output data in the output FIFO (shown as BUSRQ = 1 in Figure 12–5), and
- External requests received via the CREQ line (shown as TOKRQ = 1 in Figure 12–5).

**Figure 12–5. Communication-Port Arbitration-Unit State Diagram**

To further examine the port arbitration scheme represented in Figure 12–5, consider a data transfer operation from 'C4x A to 'C4x B. The transfer begins with PAU A in state 0 (idle with token) and PAU B in state 1 (idle without token). If PAU A receives a request (BUSRQ = 1) from its output buffer to use the communication-port data bus, it allows the output buffer to transmit one word immediately and enter state 2 (active). After the output buffer transmits one word, it removes the bus request (BUSRQ = 0), and PAU A returns to state 0 (idle with token).
If PAU B receives a request from its output buffer to use the bus, it activates CREQ to request the token from PAU A. PAU A detects this request via the state variable TOKRQ = 1 and then activates the CACK line to transfer the bus ownership token to PAU B. PAU B then generates an internal bus acknowledge (BUSAK = 1) to indicate that it has gained bus ownership. As a result of this token transfer operation, PAU A enters state 1 (idle without token), and PAU B starts the word transfer and enters state 2 (active).

To prevent any communication port from monopolizing the communication-port bus, the PAU always returns to state 0 (idle with token) and checks for a token request (CREQ active) from the external device after each word transfer. If the token request is active, the token is passed to the requesting device so that it can transmit a word. As long as ‘C4x A and ‘C4x B have information to send in their output FIFOs, they alternate use of the data bus to provide a bi-directional data path.

If a token request is received at the end of a word transfer and the sender ‘C4x has another word in the output FIFO to send, two situations can occur:

- If the CREQ going low signal is received before CRDY low is received for the last byte, the sender ‘C4x releases the token at the end of the current word transfer.

- If the CREQ going low signal is received after or at the same time as CRDY goes low from the last byte, the sender ‘C4x continues owning the token; only after transferring the next word, will it release token ownership.

In summary, token transfer occurs only on word boundaries. The ‘C4x will not release the token until the transfer of the four bytes completes.
12.5 Halting of Input and Output FIFOs

The 'C4x can halt the input FIFO, or the output FIFO, or both at word boundaries.

To halt an input FIFO, write a 1 to bit 3 (ICH) of the communication port control register (CPCR). This bit also can be read to determine if the port is halted or is able to receive. Write a 0 to the ICH bit to unhalt the input FIFO.

To halt an output FIFO, write a 1 to bit 4 (OCH) of the communication port control register (CPCR). This bit also can be read to determine whether the port is halted or is able to transmit. Write a 0 to the OCH bit to unhalt the output FIFO. The halt/unhalt operations are discussed in the following subsections. A summary is provided in Table 12–3.

<table>
<thead>
<tr>
<th>Halted/Unhalted</th>
<th>If the Port Has Token</th>
<th>If the Port Does Not Have Token</th>
</tr>
</thead>
<tbody>
<tr>
<td>Input halted</td>
<td>a. Will not release token</td>
<td>a. If the halt signal is present when the input FIFO finishes receiving a word, the port will not signal ready when the first byte of a new word is received (transfer frozen). If the halt signal is received with no word reception in progress, the port receives one word and then halts.</td>
</tr>
<tr>
<td>Output unhalted</td>
<td>b. Will transmit data</td>
<td>b. If halted after the first byte is received, the port receives the rest of the word and then halts the input.</td>
</tr>
<tr>
<td>Input unhalted</td>
<td>a. Will not transmit data</td>
<td>a. Will receive data</td>
</tr>
<tr>
<td>Output halted</td>
<td>b. If halted after the first byte is sent, it completes the word transfer and then halts the output.</td>
<td>b. Will not request token</td>
</tr>
<tr>
<td>c. Will release token</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Input halted</td>
<td>a. Will not release token</td>
<td>a. If the halt signal is present when the input FIFO finishes receiving a word, the port will not signal ready when the first byte of a new word is received (transfer frozen). If the halt signal is received with no word reception in progress, the port receives one word and then halts.</td>
</tr>
<tr>
<td>Output halted</td>
<td>b. Will not transmit data</td>
<td>b. If halted after the first byte is sent, it completes the word transfer and then halts the output.</td>
</tr>
<tr>
<td>c. If halted after the first byte is sent, it completes the word transfer and then halts the output</td>
<td>c. Will not request token</td>
<td></td>
</tr>
</tbody>
</table>
12.5.1 Input FIFO Halt Operation

The goal of input FIFO halting is to halt the input FIFO as soon as possible, without losing the data being input.

A communication port with an input FIFO that is either halted or is full does not respond to CSTRB low with CRDY low or acknowledge a token request with CACK low when CREQ low is received. This assures that the communication port’s output channel remains open.

The communication-port logic checks whether an input FIFO halt signal has been written to the CPCR register only after finishing receiving a word. This implies:

- If the communication port receives an input halt signal when there is no word reception in progress, the input FIFO does not halt immediately; it waits to receive one word and then halts. This is the case of an input FIFO halt after reset.

- If the halt signal is written to the CPCR register while a word is being received, the input FIFO receives the rest of the current word and then halts the input. At this point, the data transfer is frozen until the input FIFO is unhalted or a system reset occurs. If the input FIFO is unhalted later, the transfer continues without any loss of data.

Notice that even when an input FIFO is halted, you can still read the words previously stored in the input FIFO.

If a communication port’s input FIFO is halted during a token request from the communication port to which it is connected, then the token request is acknowledged before the input FIFO halts.

12.5.2 Output FIFO Halt Operation

Output FIFO halting is analogous to input FIFO halting and occurs also at word boundaries. Assume that ‘C4x A’s output FIFO has OCH = 1. Then the output FIFO will be halted on the basis of its current state.

If communication port A does not have the token:

- The output FIFO is halted immediately, and no request is made for the token.

- If the communication port requesting the token is halted after sending the CREQ signal low, the communication port still accepts the token and halts immediately after that.
Halting of Input and Output FIFOs

If communication port A has the token:

- If it is currently transmitting a word, then after the current word is transmitted, the output FIFO is halted and no new transfers occur.
- If it is not currently transmitting a word, then the output FIFO halts immediately and no transfer occurs.
- If the input FIFO is not halted and the output FIFO is halted, then communication port A transfers the token when requested by communication port B.
- If the input FIFO is halted and the output FIFO is halted, then communication port A does not transfer the token when requested by communication port B.

If the communication port still has the token when it comes out of the halted state, it can transmit data if necessary. If it needs the token, it will arbitrate for the token as usual.

In summary, a halted output FIFO does not transmit but releases the token if the input FIFO is not halted.
12.6 Coordinating Communication Ports With the CPU and DMA Coprocessor

The communication ports support synchronization with two types of signals:

- A ready/not ready signal that can halt CPU and DMA accesses to a communication port
- Interrupts that can be used to signal the CPU and DMA

The simplest form of synchronization is based on a ready/not-ready signal. If the DMA or CPU attempt to read an empty input FIFO or write to a full output FIFO, a not-ready signal is returned, and the DMA or CPU continues to read or write (halting the peripheral bus) until a ready signal is received. The ready signal for the output channel is OCRDY (output channel ready), which is also an interrupt signal. The ready signal for the input channel is ICRDY (input channel ready), which is also an interrupt signal.

In the interrupt form of synchronization, each communication port generates four different interrupt signals, as listed below (interrupt vector locations for these are shown in Figure 7–2):

- ICFULL (input channel full): indicates that the input FIFO has eight words.
- ICRDY (input channel ready): indicates that at least one word is in the input FIFO.
- OCRDY (output channel ready): indicates that at least one word space is available in the output FIFO.
- OCEMPTY (output channel empty): Indicates that the output FIFO is empty.

The CPU can respond to all four of these interrupt signals. The DMA coprocessor can respond only to the ICRDY and OCRDY interrupt signals. Each DMA channel can respond only to the ICRDY and OCRDY signals coming from its own communication port; that is, DMA channel \( i \) can synchronize only with ICRDY\(_i\) and OCRDY\(_i\).

Notice that none of the four communication-port interrupt signals has flags in the IIF register. These four communication-port status signals (ICFULL, ICRDY, OCRDY, and OCEMPTY) can be obtained by checking the input and output levels in the communication port control register (CPCR) with logical instructions. For example, to poll for an ICFULL condition, bit 12 can be tested for a bit value equal to 1. See subsection 12.3.1, Communication-Port Control Register (CPCR), on page 12-8, for more information about checking for communication-port conditions.
Maximum Communication Port Sustained Transfer Rate. The maximum data transfer rate of any single communication port in a 50-MHz 'C4x is 20 M Bytes/s. This rate can be easily achieved under CPU or DMA coprocessor control, as long as data is sent to the output FIFO at least at this rate. However, when multiple communication ports are transmitting simultaneously, this may not be the case. For example, the DMA memory-to-memory maximum transfer rate is 50 M bytes/s (one read-write sequence every two cycles). The DMA can handle up to two communication ports transmitting at their full speed. For more than two communication ports, the DMA becomes the bottleneck, regardless of how many DMA channels are used. The CPU can perform two reads and two writes in two cycles by using parallel instructions, achieving a 100-M Bytes/s transfer rate. For more than five communication ports, the CPU becomes the bottleneck.
12.7 Token Transfer Operation

Token transfer operation requires handshaking of signals through pins CREQ and CACK. This is illustrated in Figure 12–6. For clarity, a suffix identifies the signals at each processor end. For example, CREQb denotes the CREQ signal at the processor B end. Table 12–4 lists the handshaking events. Steps in the table are shown by numbers in Figure 12–6.

Notice that an overlap feature is built into CREQ, CSTRB, and CRDY when a token is transferred between two C4x communication ports. This overlap will cause these signals to drive high (at both ends), ensuring that neither end is susceptible to floating or low-noise signals. For example, in Figure 14–23, CSTRB is an output before CREQ goes high, and in Figure 14–24, CSTRB becomes an input only after CREQ goes high. Both C4xs drive communication port lines for a period of 0.5 H1/H3, but this is not a problem, because they are both driving high; as a result, there is no current from one device to the other.

For this reason, the clocks of two C4xs connected together must be within a 2:1 ratio (at most, one C4x can be twice as fast as the other). If this guideline is not followed, the overlap will last too long, and the C4x with the faster clock may start driving low before the current bus master has relinquished that line. This will cause signal contention that could damage communication port drivers.

There is no limit on the time period between CREQ and CACK. The C4x can perform token transfer with a slow non-C4x device, as long as correct handshaking of CREQ and CACK is maintained and there is no signal contention.

To avoid bus contention problems, you should understand which event triggers the switch of the direction (input-to-output or output-to-input) of each of the communication port bidirectional lines. This is especially important when you attempt to build a communication port interface to a non-C4x device or when you work with very long C4x links. For example, the data lines and CSTRB should not be driven after CACK goes low. If they are, this could cause a bus conflict.

An implementation of a hardware token forcer can be found in the Communication Ports chapter of the TMS320C4x General-Purpose Applications User’s Guide.
Figure 12–6. Token Transfer Operation

PROCESSOR A (initial token owner)

- CREQa
- CACKa
- CSTRBa
- CDa

PROCESSOR B (token requester)

- CREQb
- CACKb
- CSTRBb
- CRDYb
- CDab

= When signal is an input (clear = when signal is an output).

Note: For an explanation of Type 1 delay, see Section 12.9, Synchronizers.
### Table 12–4. Token Transfer Sequence

<table>
<thead>
<tr>
<th>Event No.†</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>Initially, A has the token and is idle.</td>
</tr>
<tr>
<td>1</td>
<td>B wants to send data and requests the token by bringing CREQb low.</td>
</tr>
<tr>
<td>2</td>
<td>After a transmission line time delay, A sees the token request when CREQa goes low.</td>
</tr>
<tr>
<td>3</td>
<td>After a type 1 delay from CREQa falling, A releases token ownership and acknowledges the request by bringing CACKa low.</td>
</tr>
<tr>
<td>4</td>
<td>After a transmission line time delay, B sees the acknowledgement from A when CACKb goes low.</td>
</tr>
<tr>
<td>5</td>
<td>A switches CRDYa from high impedance to high after CACKa falling.</td>
</tr>
<tr>
<td>6</td>
<td>A puts CDa(7–0) in high impedance after CACKa falling.</td>
</tr>
<tr>
<td>7</td>
<td>B switches CSTRBb from high impedance to high after CACKb falling.</td>
</tr>
<tr>
<td>8</td>
<td>B brings CREQb high after a type 1 delay from CACKb falling.</td>
</tr>
<tr>
<td>9</td>
<td>After a transmission line time delay, A sees CREQa go high.</td>
</tr>
<tr>
<td>10</td>
<td>A switches CREQa from high impedance to high after receiving a high on CREQa.</td>
</tr>
<tr>
<td>11</td>
<td>A brings CACKa high after CREQa goes high.</td>
</tr>
<tr>
<td>12</td>
<td>A puts CACKa in high impedance after CREQa goes high and after CACKa goes high.</td>
</tr>
<tr>
<td>13</td>
<td>A puts CSTRBa in high impedance after CREQa goes high.</td>
</tr>
<tr>
<td>14</td>
<td>B puts CREQb in high impedance after CREQb goes high.</td>
</tr>
<tr>
<td>15</td>
<td>B switches CACKb from high impedance to high after CREQb goes high.</td>
</tr>
<tr>
<td>16</td>
<td>B puts CRDYb in high impedance after CREQb goes high.</td>
</tr>
<tr>
<td>17</td>
<td>B switches CDa from input to output after CREQb goes high and starts driving an undefined value.</td>
</tr>
<tr>
<td>18</td>
<td>B drives the first byte onto CDa(7–0) on H1 rising (plus analog delay) after CREQb goes high.</td>
</tr>
<tr>
<td>19</td>
<td>B brings CSTRBb low on the second H1 rising (plus analog delay) after CREQb rising.</td>
</tr>
<tr>
<td>20</td>
<td>After a transmission time delay, A sees the first byte on CDa(7–0).</td>
</tr>
<tr>
<td>21</td>
<td>After a transmission time delay, A sees CSTRBa go low, signaling valid data.</td>
</tr>
<tr>
<td>22</td>
<td>A reads the data and then brings CRDYa low.</td>
</tr>
</tbody>
</table>

† Event numbers correspond to numbers in Figure 12–6.
12.8 Word Transfer Operation

The C4x communication ports transfer words on a byte-to-byte basis (LS byte is transmitted first). Byte transfer operation requires handshaking of signals through pins CSTRB and CRDY. This is illustrated in Figure 12–7. For clarity, a suffix identifies the signals at each processor end. For example, CSTRB denotes the CSTRB signal at the processor B end. Table 12–5 lists the handshaking events. Steps in the table are shown by numbers in Figure 12–7.

Byte transmission is totally asynchronous, and the communication-port transfer rate can be higher than one byte per cycle. The exception is for the first byte. Notice that on the first byte, the data lines are set up in relation to an H1 synchronization (output FIFO advance). The first byte appears on a different H1 edge, depending on the transmit mode used. If the communication port is in continuous transmit mode (no token exchanged), the first data byte appears synchronous to the H1 falling edge before CSTRB going low. That is, the data appears one half of one H1 cycle before CSTRB falls. If a token transfer occurs, the first byte appears synchronous to the rising edge of H1 before CSTRB going low. That is, data appears one H1 cycle before CSTRB falls.

Subsequent bytes and CSTRB high become valid from the falling edge of CRDY. Because both of these signals are caused by the same event but have different internal paths, their delay values are not exactly the same but are very close.

During back-to-back write cycles, a type 2 synchronizer is used between CRDY low to CSTRB low before byte 0 (first byte) of the next word is transmitted. Communication port synchronizers are explained in Section 12.9, Synchronizers, on page 12-26.

Even if the availability of data is granted, do not tie CSTRB or CRDY to ground. For each byte transfer, there must be a CSTRB and CRDY handshake. The C4x must see the transitions in the CSTRB and CRDY signals to advance its internal byte counter.

If an input buffer becomes full, it will not activate CRDY at the beginning of the transmission of the first byte that would overflow the buffer. This condition prevents data transfer operations until the situation is resolved. When the receiver reads the full input buffer, CRDY falls, and the next FIFO position is made available.

Notice in Figure 12–7 that after CRDYb goes low (byte 3 has been received), B drives an undefined value temporarily on CDb (7–0) (event 12 in Figure 12–7) before driving byte 0 of the new word.
Figure 12–7. Word Transfer Operation

PROCESSOR B (sender)

H1
CREQb
CACKb
CSTRBb
CRDYb
CDb
undefined →

PROCESSOR A (receiver)

CREQA
CACKa
CSTRBa
CRDYa
CDa
undefined →

= When signal is an input (clear = when signal is an output).

Note: B0' = byte 0 of a new word.
# Word Transfer Operation

## Table 12–5. Word Transfer Sequence

<table>
<thead>
<tr>
<th>Event No.†</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>B owns the token and has data to transmit.</td>
</tr>
<tr>
<td>1</td>
<td>B drives the first byte onto CDb(7–0) on H1 falling (plus analog delay)‡.</td>
</tr>
<tr>
<td>2</td>
<td>B brings CSTRBb low on H1 rising (plus analog delay)§.</td>
</tr>
<tr>
<td>3</td>
<td>After a transmission line time delay, A sees CSTRBa go low, signaling valid data.</td>
</tr>
<tr>
<td>4</td>
<td>A reads the data and then brings CRDYa low</td>
</tr>
<tr>
<td>5</td>
<td>After a transmission line time delay B sees CRDYb go low, signaling data has been read.</td>
</tr>
<tr>
<td>6</td>
<td>B brings CSTRBb high after CRDYb goes low.</td>
</tr>
<tr>
<td>7</td>
<td>B drives the next byte on CDb(7–0) after CRDYb goes low.</td>
</tr>
<tr>
<td>8</td>
<td>After a transmission line time delay, A sees CSTRBa go high.</td>
</tr>
<tr>
<td>9</td>
<td>A brings CRDYa high after CSTRBa goes high.</td>
</tr>
<tr>
<td>10</td>
<td>After a transmission line time delay, B sees CRDYb go high.</td>
</tr>
<tr>
<td>11</td>
<td>B brings CSTRBb low after CRDYb goes high.</td>
</tr>
<tr>
<td>12</td>
<td>B drives an undefined byte on CDb(7–0) after CRDY goes low.</td>
</tr>
</tbody>
</table>

† Event numbers correspond to numbers in Figure 12–7.
‡ If this is the first word the token is received, this transition occurs after CREQb goes high (See event 18 in Table 12–4).
§ If this is the first word after the token is received, this transition occurs on the second H1 rising after CREQb goes high (See event 19 in Table 12–4).
CSTRB Width Restrictions

In 'C4x device revisions lower than 3.0, the width of the CSTRB low pulse between word boundaries should not exceed 1.0 H1/H3 at the receiving 'C4x end. If it does, the receiver 'C4x byte counter that has looped back to byte 0 between word boundaries will see this low and recognize CSTRB as the next valid byte, effectively slipping a byte. This is not a problem unless you are working with very long distances or with external devices. If you are, use flip-flops to locally shorten the CSTRB at the receiver end while returning a valid CRDY width to the sender. Wide widths at the sender are not a problem. Chapter 7, Interfacing Communication Ports, the TMS320C4x General-Purpose Applications User’s Guide shows a circuit to shorten the CSTRB low pulse. In 'C4x device revisions 3.0 or higher, no CSTRB width restriction exists.

Note:

See Chapter 7, Interfacing Communication Ports, in the TMS320C4x General-Purpose Applications User’s Guide for a detailed description of the word transfer operation when interfacing a 'C4x communication port with a non-'C4x device.
12.9 Synchronizers

H1/H3 synchronization is required during word transfer boundaries and during token transfers. Three types of synchronizers are used in the port arbitration unit:

- **Type-one synchronizers** cause delays that vary from 1 to 2 machine clock from the receiving of an input on a pin until the response on output pin (ignoring analog delays). An input is recognized when H1 is high; then it is passed through an \textit{H3-high/H1-high} series of delays. The response occurs at the start of the following time H3 is high.

The minimum type-one synchronizer delay of 1 machine clock will occur when the input changes just before H1 goes low. This delay is shown in Figure 12–8.

The maximum type-one synchronizer delay of 2 machine clocks will occur when the input changes just after H1 goes low. This delay is shown in Figure 12–9.

![Figure 12–8. Type-One Synchronizer Minimum Delay](image)

![Figure 12–9. Type-One Synchronizer Maximum Delay](image)

- **Type-two synchronizers** cause delays that vary from 1.5 to 2.5 machine clock from the receiving of an input on a pin until the response on an output pin (ignoring analog delays). An input is recognized when H1 is high; then it is passed through an \textit{H3-high/H1-high/H3-high} series of delays. The response occurs at the start of the following time H1 is high.

The minimum type-two synchronizer delay of 1.5 machine clocks occurs when the input changes just before H1 goes low. This delay is shown in Figure 12–10.
The maximum type-two synchronizer delay of 2.5 machine clocks occurs when the input changes just after H1 goes low. This delay is shown in Figure 12–11.

**Figure 12–10. Type-Two Synchronizer Minimum Delay**

![Type-Two Synchronizer Minimum Delay Diagram](image)

**Figure 12–11. Type-Two Synchronizer Maximum Delay**

![Type-Two Synchronizer Maximum Delay Diagram](image)

**Type-three synchronizers** cause delays that vary from 0.5 to 1.5 machine clocks from the receiving of an input on a pin until the response on output pin (ignoring analog delays). An input is recognized when H1 is high; then it is passed through an H3-high delay. The response occurs at the following time H1 is high.

The minimum type-three synchronizer delay of 0.5 machine clock cycles will occur when the input changes just before H1 goes low. This delay is shown in Figure 12–12.

The maximum type-three synchronizer delay of 1.5 machine clocks will occur when the input changes just after H1 goes low. This delay is shown in Figure 12–13.

**Figure 12–12. Type-Three Synchronizer Minimum Delay**

![Type-Three Synchronizer Minimum Delay Diagram](image)
Figure 12–13. Type-Three Synchronizer Maximum Delay

Table 12–6 shows the types of synchronizer delays for communication port signals.

Table 12–6. Communication-Port Signals and Synchronizer Delays

<table>
<thead>
<tr>
<th>Input Signal to Output Signal</th>
<th>Delay Type</th>
<th>Min. Delay (clock cycles)</th>
<th>Max. Delay (clock cycles)</th>
</tr>
</thead>
<tbody>
<tr>
<td>CREQ↓ to CACK↓</td>
<td>One</td>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>CACK↓ to CREQ↑</td>
<td>One</td>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>CRDY↓ to CD valid between back-to-back word transfers</td>
<td>One</td>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>CRDY↓ to CSTRB↓ between back-to-back word transfers</td>
<td>Two</td>
<td>1.5</td>
<td>2.5</td>
</tr>
<tr>
<td>CACK↓ to CSTRB switch from input to an output high.</td>
<td>Three</td>
<td>0.5</td>
<td>1.5</td>
</tr>
</tbody>
</table>
12.10 Module Reset

This section explains the status of the 'C4x communication ports after power-up and during and after system reset.

The recommended reset sequence in a multiprocessing system is described in Chapter 1, Processor Initialization and Program Control, in the TMS320C4x General-Purpose Applications User’s Guide.

After powerup, the status depends on the RESET pin:

If RESET is low, the 'C4x is in reset immediately, and the description under "at reset" (below) applies.

If RESET is not low, the 'C4x device is in an unknown stage. The communication port signals can be in a combination of states.

At reset (while RESET = 0), the communication port pins are all put in the high-impedance state. The input and output channels both assume an empty state, causing all values in the input and output buffers to be lost. Pullup resistors should be used on all control lines to ensure that they are logic high if reset is not applied at the same time in interconnected 'C4xs.

After reset (after the rising edge of RESET), communication ports 0, 1, and 2 are configured as output ports and assume the following states:

- The PAU is reset to state 0: The PAU has the bus ownership token and is idle.
- The pin status (see Figure 12–14) is set as follows:
  - The CxD(7–0) signals start driving an undefined value.
  - The CACK and CSTRB signals go to 1 (inactive). CREQ and CRDY continue to be high-impedance.

Note:

The individual communication port software reset feature only flushes the FIFOs, but does not have any effect on the communication port external pins.
Module Reset

- The communication port control register gets a 0h value:
  - PORT DIR = 0: the communication port is configured for an output operation.
  - INPUT LEVEL = 0: The input FIFO is empty.
  - OUTPUT LEVEL = 0: The output FIFO is empty.
  - ICH = 0: The input FIFO is not in its halted state.
  - OCH = 0: The output FIFO is not in its halted state.

- ICRDY = 0: The input FIFO is empty and is not ready to be read from.

- OCRDY = 0: The input FIFO is not full and is ready to be written to.

Figure 12–14. Post-Reset State for an Output Port

After reset (after rising edge of RESET), communication ports 3, 4, and 5 are configured as input ports and assume the following states:

- PAU is reset to state 1: The PAU does not have the bus ownership token, and the token is not requested.

- The pin status (see Figure 12–15) is set as follows:
  - CxD(7–0) continue to be high-impedance.
  - CREQ and CRDY signals go to 1 (inactive). CACK and CSTRB continue in high impedance.

- The communication port control register gets a value of 04h.
PORT DIR = 1: the communication port is configured for an input operation:
- INPUT LEVEL = 0: The input FIFO is empty.
- OUTPUT LEVEL = 0: The output FIFO is empty.
- ICH = 0: The input FIFO is not in its halted state.
- OCH = 0: The output FIFO is not in its halted state.

ICRDY = 0: The input FIFO is empty and is not ready to be read from.

OCRDY = 0: The input FIFO is not full and is ready to be written to.

Figure 12–15. Post-Reset State for an Input Port

At reset, ports 0, 1, and 2 are configured as output ports (PORT DIR = 0), and ports 3, 4, and 5 are configured as input ports (PORT DIR = 1). When you interconnect the ports of two 'C4x devices, connect the port of one 'C4x to a port of the other 'C4x that would be in the opposite direction at reset in other words, connect any one of port 0, 1, or 2 connected to any one of port 3, 4, or 5.

If your system configuration requires connection of input-to-input communication ports or output-to-output communication ports, refer to Chapter 7, Interfacing Communication Ports, in the TMS320C4x General-Purpose Applications User’s Guide for an implementation of a token forcer.
12.11 Tips for Using Communication Ports

When you design systems that use the communication ports, there are considerations to keep in mind:

- At reset, ports 0–2 are set for transmit and ports 3–5 are set for receive. When connecting communication ports between 'C4x devices, make sure you connect transmit ports to receive ports and receive ports to transmit ports. Otherwise, unpredictable results may occur.

- Signal quality is very important. Make sure you design your board to minimize noise from other components. See the section entitled Signal Considerations in the communications port chapter of the TMS320C4x General-Purpose Applications User’s Guide for more information.

- Do not read from an empty input FIFO. This will cause the CPU or DMA operation to stall and to halt the peripheral bus.

- Do not write to an unconnected communication port. If a port’s transmit FIFO is full and the port can’t transmit, an additional write to the port’s FIFO will halt the peripheral bus.

- The clocks of two 'C4xs connected together must be within a 2:1 ratio (at most, one 'C4x can be twice as fast as the other). If this guideline is not followed, the 'C4x with the faster clock may start driving low before the current bus master has relinquished that line. This will cause signal contention that could damage communication port drivers. This restriction does not apply when connecting to a non-`C4x device.

- When you design an interface to a non-'C4x device, the non-'C4x device should mimic the asynchronous handshaking operation of a 'C4x communication port. See Word Transfer Considerations in the TMS320C4x General-Purpose Applications User’s Guide for more information on interfacing to non-`C4x devices.

Note:

See Section 7.4, Design Tips, in the TMS320C4x General-Purpose Applications User’s Guide for more tips for using communication ports.
The 'C4x has two general-purpose timer modules that time events, generate pulses, and interrupt the CPU or DMA coprocessor.

This chapter provides you with information about:

- The components of the timers
- The control registers of the timers
- The operation of the timers
- The interrupts generated by the timers

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>13.1 Overview of the Timers</td>
<td>13-2</td>
</tr>
<tr>
<td>13.2 Timer Pins</td>
<td>13-4</td>
</tr>
<tr>
<td>13.3 Timer Control Registers</td>
<td>13-5</td>
</tr>
<tr>
<td>13.4 Timer Pulse Generation</td>
<td>13-9</td>
</tr>
<tr>
<td>13.5 Timer Interrupts</td>
<td>13-11</td>
</tr>
<tr>
<td>13.6 Selecting CLKSRC and FUNC Values</td>
<td>13-13</td>
</tr>
<tr>
<td>13.7 Using TCLKx as General-Purpose I/O Pins</td>
<td>13-15</td>
</tr>
<tr>
<td>13.8 Configuring a Timer</td>
<td>13-16</td>
</tr>
</tbody>
</table>
13.1 Overview of the Timers

The 'C4x has two 32-bit general-purpose timer modules. Each timer has two signaling modes and can be clocked by an internal or an external source. The timer modules can be used to send periodic signals to the 'C4x or to devices in the external world; or they can be used to count external events. Each timer has an I/O pin (TCLK) that functions as an input clock, as an output clock, or as a general-purpose I/O pin.

With an internal clock, for example, the timer can signal an external A/D converter to start a conversion, or it can interrupt the 'C4x DMA controller to begin a data transfer.

With an external clock, for example, the timer can count external events and interrupt the CPU after a specified number of events.

Each timer consists of a 32-bit counter, a comparator, an input clock selector, a pulse generator, and supporting hardware.

A timer in the 'C4x counts the cycles of a timer input clock. When that count (counter register) equals the value stored in the timer period register, it rolls over the counter to zero and produces a transition in the timer output signal.

The timer input clock can be either the H1/2 internal clock frequency of the 'C4x or an external clock on the TCLKx pin. This is determined by the CLKSRC bit in the timer control register. If an external clock is used, the timer can counter either 0-to-1 or 1-to-0 transitions depending on the value of the INV bit.

The timer output signal depends on the signalling mode selected by the C/P bit (clock or pulse mode). See Section 13.4, Timer Pulse Generation, on page 13-9, for more information about this bit.

The timer output can be routed to the TCLKx pin that can also be used as a general-purpose I/O pin.

Figure 13–1 shows the block diagram of a 'C4x timer module.
Overview of the Timers

Figure 13–1. Timer Block Diagram

† If CLKSRC = 1 and FUNC = 1, this signal goes into the TCLK pin.
‡ Selector controlled by the CLKSRC bit.
§ Maximum frequency = f(H1) / 2.6
13.2 Timer Pins

Each timer has one pin associated with the timer clock signal (TCLK) pin.

- **TCLK.** This pin is used as a general-purpose I/O signal, as a timer output, or as an input for an external clock for a timer. Each timer has a TCLK pin: TCLK0 is connected to timer 0 and TCLK1 is connected to timer 1.
13.3 Timer Control Registers

The timers are controlled through three registers, as shown in Figure 13–2, that are mapped into the peripheral address space:

- **Control register.** This register determines the operating mode of the timer, monitors the timer status, and controls the function of the I/O pin (TCLK) of the timer.

- **Period register.** This register contains the number of timer input clock cycles to count. This number controls the timer output signal frequency.

- **Counter register.** Contains the current value of the incrementing counter. The 32-bit counter counts timer input clock cycles.

*Figure 13–2. Memory-Mapped Timer Locations*

<table>
<thead>
<tr>
<th>Register</th>
<th>Peripheral Address</th>
</tr>
</thead>
<tbody>
<tr>
<td>Timer Control</td>
<td>100020h 100030h</td>
</tr>
<tr>
<td>Reserved</td>
<td>100021h 100031h</td>
</tr>
<tr>
<td>Reserved</td>
<td>100022h 100032h</td>
</tr>
<tr>
<td>Reserved</td>
<td>100023h 100033h</td>
</tr>
<tr>
<td>Timer Counter</td>
<td>100024h 100034h</td>
</tr>
<tr>
<td>Reserved</td>
<td>100025h 100035h</td>
</tr>
<tr>
<td>Reserved</td>
<td>100026h 100036h</td>
</tr>
<tr>
<td>Reserved</td>
<td>100027h 100037h</td>
</tr>
<tr>
<td>Timer Period</td>
<td>100028h 100038h</td>
</tr>
<tr>
<td>Reserved</td>
<td>100029h 100039h</td>
</tr>
<tr>
<td>Reserved</td>
<td>10002Ah 10003Ah</td>
</tr>
<tr>
<td>Reserved</td>
<td>10002Bh 10003Bh</td>
</tr>
<tr>
<td>Reserved</td>
<td>10002Ch 10003Ch</td>
</tr>
<tr>
<td>Reserved</td>
<td>10002Dh 10003Dh</td>
</tr>
<tr>
<td>Reserved</td>
<td>10002Eh 10003Eh</td>
</tr>
<tr>
<td>Reserved</td>
<td>10002Fh 10003Fh</td>
</tr>
</tbody>
</table>
13.3.1 Timer Control Register

The timer control register is located at 100020h for timer 0 and at 100030h for timer 1.

The 32-bit timer global control register contains two sets of bits:

- The timer global control bits (bits 11–6) control timer mode and monitor timer status (TSTAT).
- The TCLK pin control bits (bits 3–0) control the function of the TCLK pin, which can be used as a timer pin or as a general-purpose I/O pin.

Figure 13–3 shows the 32-bit timer global control register. Note that at reset all bits are set to 0, except for DATIN, which is set to the value read on TCLK.

**Figure 13–3. Timer Control Register**

<table>
<thead>
<tr>
<th>32</th>
<th>12</th>
<th>11</th>
<th>10</th>
<th>9</th>
<th>8</th>
<th>7</th>
<th>6</th>
<th>5</th>
<th>4</th>
<th>3</th>
<th>2</th>
<th>1</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>xx</td>
<td>TSTAT</td>
<td>INV</td>
<td>CLKSRC</td>
<td>C/P</td>
<td>HLD</td>
<td>GO</td>
<td>xx</td>
<td>DATIN</td>
<td>DATOUT</td>
<td>I/O</td>
<td>FUNC</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

**Note:**  
R=Read, W=Write

**FUNC**  
Function bit. The FUNC bit controls the function of the TCLK pin. If FUNC = 0, TCLK is configured as a general-purpose digital I/O pin. If FUNC = 1, TCLK is configured as a timer pin.

**I/O**  
Input/output bit. If I/O = 1 and FUNC = 0, then TCLK is configured as an input pin. If I/O = 0 and FUNC = 0, then TCLK is configured as an output pin.

**DATOUT**  
Data output bit. DATOUT drives TCLK when the 'C4x is in I/O port mode. DATOUT can also be used as an input to the timer.

**DATIN**  
Data input bit. Reads data from TCLK or DATOUT. A write to this bit has no effect.

**GO**  
GO bit. Resets and starts the timer counter. When GO = 1 and the timer is not held, the counter is zeroed and begins incrementing on the next rising edge of the timer input clock. The GO bit is cleared on the same rising edge. When GO = 0, the timer is not affected. Section 13.8, Configuring A Timer, further defines this bit.

**HLD**  
Counter hold bit. When HLD = 0, the counter is disabled and held in its current state. If the timer is driving TCLK, the state of TCLK is also held. The internal divide-by-two counter is also held so that the counter can continue where it left off when HLD is set to 1. The timer registers can be read and modified while the timer is being held. RESET has priority over HLD. Section 13.8, Configuring A Timer, shows the effect of writing to GO and HLD, and shows the result of a write using specified values of the GO and HLD bits in the timer global control register.
### Timer Control Registers

<table>
<thead>
<tr>
<th>GO (Bit 6)</th>
<th>HLD (Bit 7)</th>
<th>Result</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>All timer operations are held. No reset is performed.</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>Timer proceeds from state before write.</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>All timer operations are held, including zeroing of the counter. The GO bit is not cleared until the timer is taken out of hold.</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>Timer resets and starts.</td>
</tr>
</tbody>
</table>

**C/P** Clock/pulse mode control. When C/P = 1, clock mode is chosen, and the signaling of the TSTAT status bit and TCLK pin will have a 50 percent duty cycle. When C/P = 0, the TSTAT status bit and TCLK pin will be active for one H1 cycle during each timer period (see Figure 13–4).

**CLKSRC** Timer input clock source select bit. Specifies the source of the timer input clock. When CLKSRC = 1, an internal clock with frequency equal to one-half the H1 frequency is used as the timer input clock, and the INV bit has no effect. When CLKSRC = 0, an external signal from the TCLK pin is used as the timer input clock. The external clock is synchronized internally, thus allowing external asynchronous clock sources that do not exceed the specified maximum allowable external clock frequency of f(H1)/2.6.

**INV** Inverter control bit. If an external clock is used as the timer input clock and INV = 1, the external clock is inverted as it goes into the counter. If the output of the pulse generator (TSTAT) is routed to TCLK and INV = 1, the output is inverted before it goes to TCLK. If INV = 0, no inversion is performed on the input or output of the timer. The INV bit has no effect, regardless of its value, when TCLK is used in I/O port mode.

**TSTAT** Timer status bit. This bit tracks the output of the timer and sets a CPU interrupt on a transition from 0 to 1. A write has no effect.

### 13.3.2 Timer Period Register

The timer period register is located at 100028h for timer 0 and at 100038h for timer 1.

The 32-bit timer period register contains the number of timer input clock cycles to count. This number controls the frequency of the timer output signal.
The frequency of timer signaling is determined by the frequency of the timer input clock and the period register. The following equations are valid with either an internal or an external timer clock:

\[
\begin{align*}
    f(\text{pulse mode}) &= \frac{f(\text{timer clock})}{\text{period register}} \\
    f(\text{clock mode}) &= \frac{f(\text{timer clock})}{2 \times \text{period register}}
\end{align*}
\]

This register is cleared to 0 at reset.

### 13.3.3 Timer Counter Register

The timer period register is located at 100024h for timer 0 and at 100034h for timer 1.

The 32-bit timer counter register increments with each cycle of the timer input clock. The timer counter can be incremented on the rising edge (INV = 0) or on the falling edge (INV = 1) of an externally generated timer input clock (CLKSRC = 0). With an internally generated timer input clock (CLKSRC = 1), the timer counter increments on the rising edge only. The timer counter is zeroed whenever its value equals that of the period register.

This register is cleared to 0 at reset.

### 13.3.4 Boundary Conditions in the Control Registers

Certain boundary conditions, such as a zero in the period register and an overflow of the counter, affect timer operation. These conditions are listed as follows:

- When the period and counter registers are zero, the operation of the timer depends on the C/P mode selected. In pulse mode (C/P = 0), TSTAT is set and remains set. In clock mode (C/P = 1), the width of a cycle is \(2/f(H1)\), and external clocks are ignored.

- When the counter register is not 0 and the period register = 0, the counter will count until it reaches its maximum 32-bit value (0FFFF FFFFh), roll over to 0, and then function as described in the preceding bullet.

- When the counter register is set to a value greater than the value of the period register, the counter reaches its maximum 32-bit value (0FFFF FFFFh), rolls over to 0, and continues.

**Note:**

Writes from the peripheral bus override register updates from the counter and new status updates to the control register.
13.4 Timer Pulse Generation

The timer pulse generator (see Figure 13–1) can generate several different TSTAT signals. These signals can be inverted (set by the INV bit) into the timer output signal. The two basic pulse generation modes are pulse mode and clock mode, as shown in Figure 13–4. You can select the mode with the C/P bit of the timer global control register. In both modes, an internal clock source has a frequency of \( f(H1) / 2 \), and an external clock source has a maximum frequency of \( f(H1) / 2.6 \). In pulse mode (C/P = 0), the width of the pulse is \( 1/f(H1) \). In clock mode (C/P = 1), the width of the pulse is the period register divided by the frequency of the timer input clock.

**Figure 13–4. Timer Pulse Mode and Clock Mode Timing**

![Diagram](image)

(a) TSTAT and timer output (INV = 0) when C/P = 0 (pulse mode)

(b) TSTAT and timer output (INV = 0) when C/P = 1 (clock mode)

**Note:** TINT is the timer interrupt signal generated whenever TSTAT transitions from 0 to 1.
Timer Pulse Generation

The rate of the timer output (TSTAT) is determined by the frequency of the timer input clock and the period register. The following equations are valid with either an internal or an external timer clock:

In pulse mode: \( f(TSTAT) = \frac{f(\text{timer input clock})}{\text{period register}} \)

In clock mode: \( f(TSTAT) = \frac{f(\text{timer input clock})}{(2 \times \text{period register})} \)

If the period register equals zero, refer to subsection 13.3.2, Timer Period Register.

Figure 13–5 provides some examples of TSTAT and timer output (INV = 0) when the period register is set to various values and clock or pulse mode is selected. Timer input clock is generated internally \( (f(H1 ÷ 2)) \).

Figure 13–5. Timer Output Generation Examples

(a) Pulse mode with timer period = 1 or
Clock mode with timer period = 0

(b) Pulse mode with timer period = 2

(c) Pulse mode with timer period = 3

(d) Clock mode with timer period = 1

(e) Clock mode with timer period = 2

(f) Clock mode with timer period = 3
13.5 Timer Interrupts

Each timer can send an interrupt to the CPU when the TSTAT signal transitions from 0 to 1. Timer 0 sends TINT0 and timer 1 sends TINT1.

**TINT0.** This interrupt uses the interrupt vector at IVTP + 002h. It has a priority level of two, which is second only to NMI and RESET.

**TINT1.** This interrupt uses the interrupt vector at IVTP + 02Bh. It has the lowest priority level of all interrupts.

13.5.1 Timer Interrupts and Their Vectors

TINT0 corresponds to timer 0. This interrupt uses the interrupt vector at IVTP + 002h. It has a priority level of two, which is second only to NMI and RESET.

TINT1 corresponds to timer 1. This interrupt uses the interrupt vector at IVTP + 02Bh. It has the lowest priority level of all interrupts.

13.5.2 Timer Interrupt Operation

A timer interrupt is generated whenever TSTAT transitions from a zero to a one. The frequency of timer interrupts depends on whether the timer is set up in pulse mode or clock mode.

In pulse mode, the interrupt frequency is:

\[
f(\text{interrupt}) = \frac{f(\text{input timer clock})}{\text{period register}}
\]

In clock mode, the interrupt frequency is:

\[
f(\text{interrupt}) = \frac{f(\text{input timer clock})}{(2 \times \text{period register})}
\]

If the period register equals zero, see subsection 13.3.4, Boundary Conditions in the Control Registers, on page 13-8, for more information.

The timer interrupt can be used to interrupt either the CPU or the DMA coprocessor.

The timer interrupt enable bits for the CPU are found in the IIE register. Bit 0 in the IIE corresponds to TINT0, and bit 1 corresponds to TINT1. For more information about the IIE register, see subsection 3.1.9, CPU Internal Interrupt Enable Register (IIE), on page 3-11.

The timer interrupt enable bits for the DMA control register are found in the DIE register. Several bits in this register control how each DMA channel responds.
to the timers. For more information about the DIE register, see subsection 13.3.4, *Boundary Conditions in the Control Registers.*

### 13.5.3 Considerations When Using a Timer Interrupt

The main consideration when using a timer to interrupt the CPU is the priority needed for the operation. If the timer operation has a low priority compared to other devices, then use timer 1, since that timer's interrupt has the lowest priority of all interrupts. If, on the other hand, the timer operation has a high priority compared to other devices, then use timer 0, since that timer's interrupt is second in priority only to an NMI.
13.6 Selecting CLKSRC and FUNC Values

The timer can receive its input and send its output in several different modes, depending on the setting of CLKSRC, FUNC, and I/O. The four timer modes of operation are defined by the values of CLKSRC and FUNC in the global control register.

13.6.1 CLKSRC = 1 and FUNC = 0.

If CLKSRC = 1 and FUNC = 0 (see Figure 13–6), the timer input comes from the internal clock. Interrupts can still be generated during the transition of TSTAT from 0 to 1. The internal clock is not affected by the INV bit in the global control register. In this mode, TCLK is connected to the I/O port control and can be used as a general-purpose I/O pin. If I/O = 0, TCLK is configured as a general-purpose input pin whose state can be read in DATIN. DATOUT has no effect on TCLK or DATIN. If I/O = 1, TCLK is configured as a general-purpose output pin. DATOUT is placed on TCLK and can be read in DATIN.

Figure 13–6. Timer Configuration With CLKSRC = 1 and FUNC = 0

13.6.2 CLKSRC = 1 and FUNC = 1.

If CLKSRC = 1 and FUNC = 1 (see Figure 13–7), the timer input comes from the internal clock, and the timer output goes to TCLK. You can invert the value on TCLK by setting INV to 1. Also, the value of TCLK can be read in DATIN.

Figure 13–7. Timer Configuration With CLKSRC = 1 and FUNC = 1
Selecting CLKSRC and FUNC Values

13.6.3 CLKSRC = 0 and FUNC = 0

If CLKSRC = 0 and FUNC = 0 (see Figure 13–8), the timer can still generate interrupt signals and is driven according to the status of the I/O bit:

- If I/O = 0, the timer input comes from TCLK. You can invert the value read from TCLK by setting INV to 1, and the value of TCLK can be read through DATIN.

- If I/O = 1, TCLK is an output pin; both TCLK and the timer are driven by DATOUT. All 0-to-1 transitions of DATOUT increment the counter. INV has no effect on DATOUT. The value of DATOUT can be read through DATIN.

Figure 13–8. Timer Configuration With CLKSRC = 0 and FUNC = 0

13.6.4 CLKSRC = 0 and FUNC = 1

If CLKSRC = 0 and FUNC = 1 (see Figure 13–9), TCLK drives the timer. If INV = 0, all 0-to-1 transitions of TCLK increment the counter. If INV = 1, all 1-to-0 transitions of TCLK increment the counter. The value of TCLK can be read through DATIN.

Figure 13–9. Timer Configuration With CLKSRC = 0 and FUNC = 1
13.7 Using TCLKx as General-Purpose I/O Pins

When FUNC = 0, TCLKx can be used as an I/O pin. Figure 13–10 and Figure 13–11 show how the TCLKx is connected when it is configured as a general-purpose I/O pin. In Figure 13–10, the I/O bit equals 0 and TCLK is configured as an input pin whose value can be read in the DATIN bit. In Figure 13–11, the I/O bit equals 1 and TCLK is configured as an output pin that outputs the value you wrote in the DATOUT bit.

Figure 13–10. TCLK as an Input (I/O = 0)

Figure 13–11. TCLK as an Output (I/O = 1)
Configuring a Timer

13.8 Configuring a Timer

Configuring a timer requires three basic steps:

1) Halt the timer by clearing to 0 the GO and HLD bits of the timer global-control register. To do this, write a 0 to the timer global-control register. Note that the timers are halted on RESET.

2) Configure the timer via the timer global-control register (with GO=HLD=0), the timer counter register, and timer period register, if necessary.

3) Start the timer by setting the GO and HLD bits of the timer global-control register to 1.

Example 13–1 shows how to set up the 'C4x timer to generate the maximum frequency clock through the TCLKx pin.

Example 13–1. Maximum Frequency Timer Clock Setup

```assembly
* TITLE MAXIMUM FREQUENCY TIMER CLOCK SETUP
* THIS EXAMPLE SHOWS HOW TO SET UP TIMER TO GENERATE MAXIMUM FREQUENCY TIMER CLOCK USING INTERNAL CLOCK. WHERE "TIMER_REGISTER" SECTION IS LOCATED FROM 100020h.

TIM0_CTL_REG .usect "TIMER_REGISTER",4
TIM0_CNT_REG .usect "TIMER_REGISTER",4
TIM0_PRD_REG .usect "TIMER_REGISTER",8
.text
    .
    .
    LDI 0,R0
    STI R0,@TIM0_PRD_REG
    LDI 3C1H,R0
    STI R0,@TIM0_CTL_REG
    .
    .
.end
```
Assembly Language Instructions

The 'C4x assembly language instruction set supports numeric-intensive, signal processing, and general-purpose applications. The instructions are organized into these major groups: load-and-store, two- or three-operand arithmetic/logical, parallel, program control, and interlocked operations instructions. The addressing modes used with the instructions are described in Chapter 6.

The 'C4x instruction set can also use one of 20 condition codes with any of the 10 conditional instructions, such as LDF\textit{cond}. This chapter defines the condition codes and flags.

The assembler allows optional syntax forms to simplify the assembly language for special-case instructions. These optional forms are listed and explained.

Each of the individual instructions is described and listed in alphabetical order. An example instruction (on pages 14-23 through 14-25) demonstrates the special format used and explains its content.

This chapter discusses these topics:

<table>
<thead>
<tr>
<th>Topic</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>14.1 Instruction Set</td>
<td>14-2</td>
</tr>
<tr>
<td>14.2 Condition Codes and Flags</td>
<td>14-12</td>
</tr>
<tr>
<td>14.3 Individual Instruction Descriptions</td>
<td>14-16</td>
</tr>
</tbody>
</table>
14.1 Instruction Set

The 'C4x instruction set is exceptionally well-suited to digital signal processing and other numeric-intensive applications. All instructions are a single machine word long, and most instructions take a single cycle to execute. In addition to multiply and accumulate instructions, the 'C4x possesses a full complement of general-purpose instructions.

The instruction set contains 145 instructions organized into the following functional groups:

- Load-and-store
- Two-operand arithmetic/logical
- Three-operand arithmetic/logical
- Program control
- Interlocked operations
- Parallel operations

Each of these groups is discussed in the succeeding subsections.

14.1.1 Load-and-Store Instructions

The 'C4x supports 24 load-and-store instructions (see Table 14–1). These instructions can:

- Load a word from memory into a register
- Store a word from a register into memory
- Manipulate data on the system stack
- Transfer data between primary register and expansion register

Two of these instructions can load data conditionally. This is useful for locating the maximum or minimum value in a data set. See Section 14.2 for detailed information on condition codes.
### Table 14-1. Load-and-Store Instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Description</th>
<th>Instruction</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LBb†</td>
<td>Load byte (signed)</td>
<td>LDPK†</td>
<td>Load DP register immediate</td>
</tr>
<tr>
<td>LBUb†</td>
<td>Load byte (unsigned)</td>
<td>LHw†</td>
<td>Load half-word signed</td>
</tr>
<tr>
<td>LDA†</td>
<td>Load address register</td>
<td>LHUw†</td>
<td>Load half-word unsigned</td>
</tr>
<tr>
<td>LDE</td>
<td>Load floating-point exponent</td>
<td>LWLcf†</td>
<td>Load word left-shifted</td>
</tr>
<tr>
<td>LDEP†</td>
<td>Load integer, expansion-file register to primary register</td>
<td>LWRcf†</td>
<td>Load word right-shifted</td>
</tr>
<tr>
<td>LDF</td>
<td>Load floating-point value</td>
<td>POP</td>
<td>Pop integer from stack</td>
</tr>
<tr>
<td>LDFcond</td>
<td>Load floating-point value conditionally</td>
<td>POPF</td>
<td>Pop floating-point value from stack</td>
</tr>
<tr>
<td>LDHI†</td>
<td>Load 16-bit unsigned immediate into 16 MSBs</td>
<td>PUSH</td>
<td>Push integer on stack</td>
</tr>
<tr>
<td>LDI</td>
<td>Load integer</td>
<td>PUSHF</td>
<td>Push floating-point value on stack</td>
</tr>
<tr>
<td>LDIcond</td>
<td>Load integer conditionally</td>
<td>STF</td>
<td>Store floating-point value</td>
</tr>
<tr>
<td>LDM</td>
<td>Load floating-point mantissa</td>
<td>STI</td>
<td>Store integer</td>
</tr>
<tr>
<td>LDPE†</td>
<td>Load integer, primary register to expansion file register</td>
<td>STIK†</td>
<td>Store integer immediate</td>
</tr>
</tbody>
</table>

† The 'C4x instruction set is a superset of the 'C3x instruction set. The instructions marked are 'C4x-specific.
14.1.2 Two-Operand Instructions

The 'C4x supports a complete set of 43 two-operand arithmetic and logical instructions. The two operands are the source and destination. The source operand can be a memory word, a register, or a constant. The destination operand is always a register.

These instructions provide integer, floating-point, or logical operations, and multiprecision arithmetic. Table 14–2 lists these instructions.

Table 14–2. Two-Operand Instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Description</th>
<th>Instruction</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ABSF</td>
<td>Absolute value of a floating-point number</td>
<td>MPYFT</td>
<td>Multiply floating-point values</td>
</tr>
<tr>
<td>ABSI</td>
<td>Absolute value of an integer</td>
<td>MPYI†</td>
<td>Multiply integers</td>
</tr>
<tr>
<td>ADDCT†</td>
<td>Add integers with carry</td>
<td>MPYSHI‡†‡</td>
<td>Multiply signed integer, 32-MSB product</td>
</tr>
<tr>
<td>ADDFT†</td>
<td>Add floating-point values</td>
<td>MPYUHI‡‡</td>
<td>Multiply unsigned integer, 32-MSB product</td>
</tr>
<tr>
<td>ADDIT†</td>
<td>Add integers</td>
<td>NEGB</td>
<td>Negate integer with borrow</td>
</tr>
<tr>
<td>AND†</td>
<td>Bitwise logical-AND</td>
<td>NEGF</td>
<td>Negate floating-point value</td>
</tr>
<tr>
<td>ANDNT†</td>
<td>Bitwise logical-AND with complement</td>
<td>NEGI</td>
<td>Negate integer</td>
</tr>
<tr>
<td>ASHT†</td>
<td>Arithmetic shift</td>
<td>NORM</td>
<td>Normalize floating-point value</td>
</tr>
<tr>
<td>CMPFT†</td>
<td>Compare floating-point values</td>
<td>NOT</td>
<td>Bitwise logical-complement</td>
</tr>
<tr>
<td>CMPI†</td>
<td>Compare integers</td>
<td>OR†</td>
<td>Bitwise logical-OR</td>
</tr>
<tr>
<td>FIX</td>
<td>Convert floating-point value to integer</td>
<td>RCPF‡</td>
<td>Reciprocal floating point</td>
</tr>
<tr>
<td>FLOAT</td>
<td>Convert integer to floating-point value</td>
<td>RND</td>
<td>Round floating-point value</td>
</tr>
<tr>
<td>FRIEEE‡</td>
<td>Convert IEEE floating-point format to 2s-complement floating-point format</td>
<td>ROL</td>
<td>Rotate left</td>
</tr>
<tr>
<td>LSH†</td>
<td>Logical shift</td>
<td>ROLC</td>
<td>Rotate left through carry</td>
</tr>
<tr>
<td>MBctr‡</td>
<td>Merge byte, left shifted</td>
<td>ROR</td>
<td>Rotate right</td>
</tr>
<tr>
<td>MHctr‡</td>
<td>Merge half-word, left shifted</td>
<td>RORC</td>
<td>Rotate right through carry</td>
</tr>
</tbody>
</table>

† Two- and three-operand versions
‡ The ‘C4x instruction set is a superset of the ‘C3x instruction set. The instructions marked are ‘C4x-specific.
### Table 14–2. Two-Operand Instructions (Continued)

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Description</th>
<th>Instruction</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>RSQRF†</td>
<td>Reciprocal of square root, floating-point value</td>
<td>SUBRF</td>
<td>Subtract reverse floating-point value</td>
</tr>
<tr>
<td>SUBB†</td>
<td>Subtract integers with borrow</td>
<td>SUBRI</td>
<td>Subtract reverse integer</td>
</tr>
<tr>
<td>SUBC</td>
<td>Subtract integers conditionally</td>
<td>TOIEEE‡</td>
<td>Convert 2s complement to IEEE format</td>
</tr>
<tr>
<td>SUBF†</td>
<td>Subtract floating-point values</td>
<td>TSTB†</td>
<td>Test bit fields</td>
</tr>
<tr>
<td>SUBI†</td>
<td>Subtract integer</td>
<td>XOR†</td>
<td>Bitwise exclusive-OR</td>
</tr>
<tr>
<td>SUBRB</td>
<td>Subtract reverse integer with borrow</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

† Two- and three-operand versions.  
‡ The ‘C4x instruction set is a superset of the ‘C3x instruction set. The instructions marked are ‘C4x-specific.
14.1.3 Three-Operand Instructions

Most instructions contain two or three operands. The 19 three-operand instructions allow the 'C4x to read two operands from memory or the CPU register file in a single cycle and store the results in a register. The following differentiates the two- and three-operand instructions:

- Two-operand instructions have one source operand (or shift count) and a destination operand.
- Three-operand instructions may have two source operands (or one source operand and a count operand) and a destination operand. A source operand can be a memory word, a register or a constant. The destination of a three-operand instruction is always a register.

Table 14–3 lists the instructions that have three-operand versions. Note that the 3 in the mnemonic can be omitted from three-operand instructions (see subsection 14.3.2).

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Description</th>
<th>Instruction</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ADDC3</td>
<td>Add with carry</td>
<td>MPYI3</td>
<td>Multiply integers</td>
</tr>
<tr>
<td>ADDF3</td>
<td>Add floating-point values</td>
<td>MPYSHI3†</td>
<td>Multiply signed integer, 32-MSB product</td>
</tr>
<tr>
<td>ADDI3</td>
<td>Add integers</td>
<td>MPYUHI3†</td>
<td>Multiply unsigned integer, 32-MSB product</td>
</tr>
<tr>
<td>AND3</td>
<td>Bitwise logical-AND</td>
<td>OR3</td>
<td>Bitwise logical-OR</td>
</tr>
<tr>
<td>ANDN3</td>
<td>Bitwise logical-AND with complement</td>
<td>SUBB3</td>
<td>Subtract integers with borrow</td>
</tr>
<tr>
<td>ASH3</td>
<td>Arithmetic shift</td>
<td>SUBF3</td>
<td>Subtract floating-point values</td>
</tr>
<tr>
<td>CMPF3</td>
<td>Compare floating-point values</td>
<td>SUBI3</td>
<td>Subtract integers</td>
</tr>
<tr>
<td>CMPI3</td>
<td>Compare integers</td>
<td>TSTB3</td>
<td>Test bit fields</td>
</tr>
<tr>
<td>LSH3</td>
<td>Logical shift</td>
<td>XOR3</td>
<td>Bitwise exclusive-OR</td>
</tr>
<tr>
<td>MPYF3</td>
<td>Multiply floating-point values</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

† The 'C4x instruction set is a superset of the 'C3x instruction set. The instructions marked are 'C4x-specific.
14.1.4 Program Control Instructions

The program-control instruction group consists of all of those instructions (24) that affect program flow. The repeat mode allows repetition of a block of code (RPTB and RPTBD) or of a single line of code (RPTS). Both standard and delayed (single-cycle) branching are supported. Several of the program control instructions are capable of conditional operations (see Section 14.2 for detailed information on condition codes). Table 14–4 lists the program control instructions.

Table 14–4. Program Control Instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Description</th>
<th>Instruction</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Bcond</td>
<td>Branch conditionally (standard)</td>
<td>LAJ†</td>
<td>Link and jump</td>
</tr>
<tr>
<td>BcondAF†</td>
<td>Branch conditionally delayed and annul if false</td>
<td>LAJcond†</td>
<td>Link and jump conditional</td>
</tr>
<tr>
<td>BcondAT†</td>
<td>Branch conditionally delayed and annul if true</td>
<td>LATcond†</td>
<td>Link and trap conditional</td>
</tr>
<tr>
<td>BcondD</td>
<td>Branch conditionally (delayed)</td>
<td>NOP</td>
<td>No operation</td>
</tr>
<tr>
<td>BR‡</td>
<td>Branch unconditionally (standard)</td>
<td>RETIcond</td>
<td>Return from interrupt conditionally</td>
</tr>
<tr>
<td>BRD‡</td>
<td>Branch unconditionally (delayed)</td>
<td>RETIcondD†</td>
<td>Return from trap or interrupt, delayed</td>
</tr>
<tr>
<td>CALL‡</td>
<td>Call subroutine</td>
<td>RETScond</td>
<td>Return from subroutine conditionally</td>
</tr>
<tr>
<td>CALLcond</td>
<td>Call subroutine conditionally</td>
<td>RPTB‡</td>
<td>Repeat block of instructions</td>
</tr>
<tr>
<td>DBcond</td>
<td>Decrement and branch conditionally (standard)</td>
<td>RPTBD</td>
<td>Repeat block, delayed</td>
</tr>
<tr>
<td>DBcondD</td>
<td>Decrement and branch conditionally (delayed)</td>
<td>RPTS</td>
<td>Repeat single instruction</td>
</tr>
<tr>
<td>IACK</td>
<td>Interrupt acknowledge</td>
<td>SWI</td>
<td>Software interrupt</td>
</tr>
<tr>
<td>IDLE</td>
<td>Idle until interrupt</td>
<td>TRAPcond</td>
<td>Trap conditionally</td>
</tr>
</tbody>
</table>

† The ‘C4x instruction set is a superset of the ‘C3x instruction set. The instructions marked are ‘C4x-specific.
‡ Operand addressing mode is incompatible with ‘C3x.
14.1.5 Interlocked Operations Instructions

The interlocked operations instructions support multiprocessor communication and the use of external signals to allow for powerful synchronization mechanisms. They also guarantee the integrity of the communication and result in a high-speed operation. Refer to Chapter 7 for examples of the use of interlocked instructions.

Table 14–5. Interlocked Operations Instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Description</th>
<th>Instruction</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LDFI</td>
<td>Load floating-point value, interlocked</td>
<td>STFI</td>
<td>Store floating-point value, interlocked</td>
</tr>
<tr>
<td>LDII</td>
<td>Load integer, interlocked</td>
<td>STII</td>
<td>Store integer, interlocked</td>
</tr>
<tr>
<td>SIGI</td>
<td>Signal, interlocked</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
14.1.6 Parallel Operations Instructions

The parallel-operations instructions group makes a high degree of parallelism possible. Some of the 'C4x instructions can occur in pairs that are executed in parallel. These instructions offer the following features:

- Parallel loading of registers
- Parallel store
- Parallel arithmetic operations
- Arithmetic/logical instructions used in parallel with a store instruction.

Each instruction in a pair is entered as a separate source statement. The second instruction in the pair must be preceded by two vertical bars (||).

Table 14–6 lists the valid instruction pairs.

Table 14–6. Parallel Instructions
(a) Parallel Arithmetic With Store Instructions

<table>
<thead>
<tr>
<th>Mnemonic</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ABSF</td>
<td></td>
</tr>
<tr>
<td>ABSI</td>
<td></td>
</tr>
<tr>
<td>ADDF3</td>
<td></td>
</tr>
<tr>
<td>ADDI3</td>
<td></td>
</tr>
<tr>
<td>AND3</td>
<td></td>
</tr>
<tr>
<td>ASH3</td>
<td></td>
</tr>
<tr>
<td>FIX</td>
<td></td>
</tr>
<tr>
<td>FLOAT</td>
<td></td>
</tr>
<tr>
<td>FRIEEE</td>
<td></td>
</tr>
<tr>
<td>LDF</td>
<td></td>
</tr>
<tr>
<td>LDI</td>
<td></td>
</tr>
</tbody>
</table>

† The 'C4x instruction set is a superset of the 'C3x instruction set. The instructions marked are 'C4x-specific.
**Instruction Set**

Table 14–6. Parallel Instructions (Concluded)

(a) Parallel Arithmetic With Store Instructions (Continued)

<table>
<thead>
<tr>
<th>Mnemonic</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LSH3</td>
<td></td>
</tr>
<tr>
<td>MPYF3</td>
<td></td>
</tr>
<tr>
<td>MPYI3</td>
<td></td>
</tr>
<tr>
<td>NEGF</td>
<td></td>
</tr>
<tr>
<td>NEGI</td>
<td></td>
</tr>
<tr>
<td>NOT</td>
<td></td>
</tr>
<tr>
<td>OR3</td>
<td></td>
</tr>
<tr>
<td>STF</td>
<td></td>
</tr>
<tr>
<td>STI</td>
<td></td>
</tr>
<tr>
<td>SUBF3</td>
<td></td>
</tr>
<tr>
<td>TOIEEE</td>
<td></td>
</tr>
<tr>
<td>SUBI3</td>
<td></td>
</tr>
<tr>
<td>XOR3</td>
<td></td>
</tr>
</tbody>
</table>

(b) Parallel Load Instructions

<table>
<thead>
<tr>
<th>Mnemonic</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LDF</td>
<td></td>
</tr>
<tr>
<td>LDI</td>
<td></td>
</tr>
</tbody>
</table>

(c) Parallel Multiply and Add/Subtract Instructions

<table>
<thead>
<tr>
<th>Mnemonic</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>MPYF3</td>
<td></td>
</tr>
<tr>
<td>MPYF3</td>
<td></td>
</tr>
<tr>
<td>MPYI3</td>
<td></td>
</tr>
<tr>
<td>MPYI3</td>
<td></td>
</tr>
</tbody>
</table>

† The 'C4x instruction set is a superset of the 'C3x instruction set. The instructions marked are 'C4x-specific.
14.1.7 Illegal Instructions

The 'C4x has no illegal instruction detection mechanism. Fetching an illegal (undefined) code may result in the execution of an undefined operation. If TI TMS320 floating-point software tools are used, no illegal opcodes can be generated. An illegal opcode can only be generated by the misuse of the tools, by an error in the ROM code, or by a defective RAM.
14.2 Condition Codes and Flags

The 'C4x provides 20 condition codes (00000–10100, excluding 01011) that can be used with any of the conditional instructions, such as RETScond or LDFcond. The conditions include signed and unsigned comparisons, comparisons to zero, and comparisons based on the status of individual condition flags. Note that all conditional instructions can also accept the suffix U to indicate unconditional operation.

Seven condition flags provide information about properties of the result of arithmetic and logical instructions. The condition flags are stored in the status register (ST); the effect of an instruction on a condition flag depends on the value of the SET COND field (bit 15 of the status register). The value of SET COND (0 or 1) does not affect the nature of the compare instructions (CMPF, CMPF3, CMPI, CMPI3, TSTB, or TSTB3).

- If SET COND = 0, the ST condition flags are set if the operation’s target is any extended-precision register (R0–R11).

- If SET COND = 1, the ST condition flags are also set if the operation’s target is any register in the primary register file except the status register.

The condition flags can be modified by most instructions when either of the preceding conditions is established and either of the following two cases occurs:

- A result is generated when the specified operation is performed to infinite precision. This is appropriate for compare-and-test instructions that do not store results in a register. It is also appropriate for arithmetic instructions that produce underflow or overflow.

- The output is written to the destination register as shown in Table 14–7. This is appropriate for other instructions that modify the condition flags.

*Table 14–7. Output Value Formats*

<table>
<thead>
<tr>
<th>Type of Operation</th>
<th>Output Format</th>
</tr>
</thead>
<tbody>
<tr>
<td>Floating-point</td>
<td>8-bit exponent, 1 sign bit, 31-bit fraction</td>
</tr>
<tr>
<td>Integer</td>
<td>32-bit integer</td>
</tr>
<tr>
<td>Logical</td>
<td>32-bit unsigned integer</td>
</tr>
</tbody>
</table>
Figure 14–1 shows the condition flags in the low-order bits of the status register. Following the figure is a list of status register condition flags and descriptions on how the flags are set by most instructions. For specific details of the effect of a particular instruction on the condition flags, see the description of that instruction in subsection 14.3.3.

**Figure 14–1. Status Register**

<table>
<thead>
<tr>
<th>31</th>
<th>30</th>
<th>29</th>
<th>28</th>
<th>27</th>
<th>26</th>
<th>25</th>
<th>24</th>
<th>23</th>
<th>22</th>
<th>21</th>
<th>20</th>
<th>19</th>
<th>18</th>
<th>17</th>
<th>16</th>
</tr>
</thead>
<tbody>
<tr>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
<td>xx</td>
</tr>
<tr>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
<td>R</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>15</th>
<th>14</th>
<th>13</th>
<th>12</th>
<th>11</th>
<th>10</th>
<th>9</th>
<th>8</th>
<th>7</th>
<th>6</th>
<th>5</th>
<th>4</th>
<th>3</th>
<th>2</th>
<th>1</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>SET</td>
<td>COND</td>
<td>PGIE</td>
<td>GIE</td>
<td>CC</td>
<td>CE</td>
<td>CF</td>
<td>PCF</td>
<td>RM</td>
<td>OVM</td>
<td>LUF</td>
<td>LV</td>
<td>UF</td>
<td>N</td>
<td>Z</td>
<td>V</td>
</tr>
<tr>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
<td>R/W</td>
</tr>
</tbody>
</table>

**NOTE:** xx = reserved bit.

**LUF** Latched Underflow Condition Flag. LUF is set whenever UF (floating-point underflow flag) is set. LUF can be cleared only by a processor reset or by modifying it in the status register (ST).

**LV** Latched Overflow Condition Flag. LV is set whenever V (overflow condition flag) is set. Otherwise, it is unchanged. LV can be cleared only by a processor reset or by modifying it in the status register (ST).

**UF** Floating-Point Underflow Condition Flag. A floating-point underflow occurs whenever the exponent of the result is less than or equal to \(-128\). If a floating-point underflow occurs, UF is set, and the output value is set to 0. UF is cleared if a floating-point underflow does not occur.

**N** Negative Condition Flag. Logical operations assign N (the state of the MSB of the output value). For integer and floating-point operations, N is set if the result is negative, and cleared otherwise. Zero is positive.

**Z** Zero Condition Flag. For logical, integer, and floating-point operations, Z is set if the output is 0, and cleared otherwise.

**V** Overflow Condition Flag. For integer operations, V is set if the result does not fit into the format specified for the destination (i.e., \(-2^{32} \leq \text{result} \leq 2^{32} - 1\)). Otherwise, V is cleared. For floating-point operations, V is set if the exponent of the result is greater than 127; otherwise, V is cleared. Logical operations always clear V.
Condition Codes and Flags

C  **Carry Flag.** When an integer addition is performed, C is set if a carry occurs out of the bit corresponding to the MSB of the output. When an integer subtraction is performed, C is set if a borrow occurs into the bit corresponding to the MSB of the output. Otherwise, for integer operations, C is cleared. The carry flag is unaffected by floating-point and logical operations. For shift instructions, this flag is set to the final value shifted out; for a zero shift count, this is set to zero.

Table 14–8 lists the condition mnemonic, code, description, and flag for each of the 19 condition codes.

**Table 14–8. Condition Codes and Flags**

(a) Unconditional Compares

<table>
<thead>
<tr>
<th>Condition</th>
<th>Code</th>
<th>Description</th>
<th>Flag †</th>
</tr>
</thead>
<tbody>
<tr>
<td>U</td>
<td>00000</td>
<td>Unconditional</td>
<td>Don’t care</td>
</tr>
</tbody>
</table>

(b) Unsigned Compares

<table>
<thead>
<tr>
<th>Condition</th>
<th>Code</th>
<th>Description</th>
<th>Flag †</th>
</tr>
</thead>
<tbody>
<tr>
<td>LO</td>
<td>00001</td>
<td>Lower than</td>
<td>C</td>
</tr>
<tr>
<td>LS</td>
<td>00010</td>
<td>Lower than or same as</td>
<td>C OR Z</td>
</tr>
<tr>
<td>HI</td>
<td>00111</td>
<td>Higher than</td>
<td>¬C AND ¬Z</td>
</tr>
<tr>
<td>HS</td>
<td>00100</td>
<td>Higher than or same as</td>
<td>¬C</td>
</tr>
<tr>
<td>EQ</td>
<td>00101</td>
<td>Equal to</td>
<td>Z</td>
</tr>
<tr>
<td>NE</td>
<td>00110</td>
<td>Not Equal to</td>
<td>¬Z</td>
</tr>
</tbody>
</table>

(c) Signed Compares

<table>
<thead>
<tr>
<th>Condition</th>
<th>Code</th>
<th>Description</th>
<th>Flag †</th>
</tr>
</thead>
<tbody>
<tr>
<td>LT</td>
<td>00111</td>
<td>Less than</td>
<td>N</td>
</tr>
<tr>
<td>LE</td>
<td>01000</td>
<td>Less than or equal to</td>
<td>N OR Z</td>
</tr>
<tr>
<td>GT</td>
<td>01001</td>
<td>Greater than</td>
<td>¬N AND ¬Z</td>
</tr>
<tr>
<td>GE</td>
<td>01010</td>
<td>Greater than or equal to</td>
<td>¬N</td>
</tr>
<tr>
<td>EQ</td>
<td>00101</td>
<td>Equal to</td>
<td>Z</td>
</tr>
<tr>
<td>NE</td>
<td>00110</td>
<td>Not equal to</td>
<td>¬Z</td>
</tr>
</tbody>
</table>

(d) Compare to Zero

<table>
<thead>
<tr>
<th>Condition</th>
<th>Code</th>
<th>Description</th>
<th>Flag †</th>
</tr>
</thead>
<tbody>
<tr>
<td>Z</td>
<td>00101</td>
<td>Zero</td>
<td>Z</td>
</tr>
<tr>
<td>NZ</td>
<td>00110</td>
<td>Not zero</td>
<td>¬Z</td>
</tr>
<tr>
<td>P</td>
<td>01001</td>
<td>Positive</td>
<td>¬N AND ¬Z</td>
</tr>
<tr>
<td>N</td>
<td>00111</td>
<td>Negative</td>
<td>N</td>
</tr>
<tr>
<td>NN</td>
<td>01010</td>
<td>Nonnegative</td>
<td>¬N</td>
</tr>
</tbody>
</table>

† The ¬ means logical complement (“not true” condition).
### Compare to Condition Flags

<table>
<thead>
<tr>
<th>Condition</th>
<th>Code</th>
<th>Description</th>
<th>Flag†</th>
</tr>
</thead>
<tbody>
<tr>
<td>NN</td>
<td>01010</td>
<td>Nonnegative</td>
<td>–N</td>
</tr>
<tr>
<td>N</td>
<td>00111</td>
<td>Negative</td>
<td>N</td>
</tr>
<tr>
<td>NZ</td>
<td>00110</td>
<td>Nonzero</td>
<td>–Z</td>
</tr>
<tr>
<td>Z</td>
<td>00101</td>
<td>Zero</td>
<td>Z</td>
</tr>
<tr>
<td>NV</td>
<td>01100</td>
<td>No overflow</td>
<td>–V</td>
</tr>
<tr>
<td>V</td>
<td>01101</td>
<td>Overflow</td>
<td>V</td>
</tr>
<tr>
<td>NUF</td>
<td>01110</td>
<td>No underflow</td>
<td>–UF</td>
</tr>
<tr>
<td>UF</td>
<td>01111</td>
<td>Underflow</td>
<td>UF</td>
</tr>
<tr>
<td>NC</td>
<td>00100</td>
<td>No carry</td>
<td>–C</td>
</tr>
<tr>
<td>C</td>
<td>00001</td>
<td>Carry</td>
<td>C</td>
</tr>
<tr>
<td>NLV</td>
<td>10000</td>
<td>No latched overflow</td>
<td>–LV</td>
</tr>
<tr>
<td>LV</td>
<td>10001</td>
<td>Latched overflow</td>
<td>LV</td>
</tr>
<tr>
<td>NLUF</td>
<td>10010</td>
<td>No latched floating-point underflow</td>
<td>–LUF</td>
</tr>
<tr>
<td>LUF</td>
<td>10011</td>
<td>Latched floating-point underflow</td>
<td>LUF</td>
</tr>
<tr>
<td>ZUF</td>
<td>10100</td>
<td>Zero or floating-point underflow</td>
<td>Z OR UF</td>
</tr>
</tbody>
</table>

† The – means logical complement (“not true” condition).
14.3 Individual Instruction Descriptions

This section contains the individual assembly language instructions for the 'C4x. The instructions are listed in alphabetical order. Information for each instruction includes assembler syntax, operation, operands, encoding, description, cycles, status bits, mode bit, and examples.

Definitions of the symbols and abbreviations, as well as optional syntax forms allowed by the assembler, precede the individual instruction description section. Also, an example instruction shows the special format used and explains its content.

You can find a functional grouping of the instructions, as well as a complete instruction set summary in Section 14.1. See Chapter 7, Addressing and Stack Management, for information on memory addressing.

14.3.1 Symbols and Abbreviations

Table 14–9 lists the symbols and abbreviations used in the individual instruction descriptions.
### Table 14–9. Instruction Symbols

<table>
<thead>
<tr>
<th>Symbol</th>
<th>Meaning</th>
</tr>
</thead>
<tbody>
<tr>
<td>src</td>
<td>Source operand</td>
</tr>
<tr>
<td>src1</td>
<td>Source operand 1</td>
</tr>
<tr>
<td>src2</td>
<td>Source operand 2</td>
</tr>
<tr>
<td>src3</td>
<td>Source operand 3</td>
</tr>
<tr>
<td>src4</td>
<td>Source operand 4</td>
</tr>
<tr>
<td>dst</td>
<td>Destination operand</td>
</tr>
<tr>
<td>dst1</td>
<td>Destination operand 1</td>
</tr>
<tr>
<td>dst2</td>
<td>Destination operand 2</td>
</tr>
<tr>
<td>disp</td>
<td>Displacement</td>
</tr>
<tr>
<td>cond</td>
<td>Condition</td>
</tr>
<tr>
<td>count</td>
<td>Shift count</td>
</tr>
<tr>
<td>G</td>
<td>General addressing modes</td>
</tr>
<tr>
<td>T</td>
<td>Three-operand addressing modes</td>
</tr>
<tr>
<td>P</td>
<td>Parallel addressing modes</td>
</tr>
<tr>
<td>B</td>
<td>Conditional-branch addressing modes</td>
</tr>
<tr>
<td>ARn</td>
<td>Auxiliary register n</td>
</tr>
<tr>
<td>IRn</td>
<td>Index register n</td>
</tr>
<tr>
<td>Rn</td>
<td>Extended-precision register address n</td>
</tr>
<tr>
<td>RC</td>
<td>Repeat count register</td>
</tr>
<tr>
<td>RE</td>
<td>Repeat end address register</td>
</tr>
<tr>
<td>RS</td>
<td>Repeat start address register</td>
</tr>
<tr>
<td>ST</td>
<td>Status register</td>
</tr>
<tr>
<td>C</td>
<td>Carry bit of status register</td>
</tr>
<tr>
<td>GIE</td>
<td>Global interrupt enable bit of status register</td>
</tr>
<tr>
<td>N</td>
<td>Trap vector</td>
</tr>
<tr>
<td>PC</td>
<td>Program counter</td>
</tr>
<tr>
<td>RM</td>
<td>Repeat mode flag</td>
</tr>
<tr>
<td>SP</td>
<td>System stack pointer</td>
</tr>
<tr>
<td></td>
<td>x</td>
</tr>
<tr>
<td></td>
<td>x</td>
</tr>
<tr>
<td>x(man)</td>
<td>Mantissa field (sign + fraction) of x</td>
</tr>
<tr>
<td>x(exp)</td>
<td>Exponent field of x</td>
</tr>
<tr>
<td>op1</td>
<td></td>
</tr>
<tr>
<td>x AND y</td>
<td>Bitwise-logical AND of x and y</td>
</tr>
<tr>
<td>x OR y</td>
<td>Bitwise-logical OR of x and y</td>
</tr>
<tr>
<td>x XOR y</td>
<td>Bitwise-logical XOR of x and y</td>
</tr>
<tr>
<td>~x</td>
<td>Bitwise-logical complement of x</td>
</tr>
<tr>
<td>x &lt;&lt; y</td>
<td>Shift x to the left y bits</td>
</tr>
<tr>
<td>x &gt;&gt; y</td>
<td>Shift x to the right y bits</td>
</tr>
<tr>
<td>*++SP</td>
<td>Increment SP and use incremented SP as address</td>
</tr>
<tr>
<td>*SP--</td>
<td>Use SP as address and decrement SP</td>
</tr>
</tbody>
</table>
14.3.2 Optional Assembler Syntaxes

The assembler allows a relaxed syntax form for some instructions. These optional forms simplify the assembly language so that special-case syntax can be ignored. The following is a list of these optional syntax forms.

- The destination register can be omitted on unary arithmetic and logical operations when the same register is used as a source. For example,
  
  ```
  ABSI R0,R0 can be written as ABSI R0
  ```

  Instructions affected: ABSI, ABSF, FIX, FLOAT, NEGB, NEGF, NEGI, NORM, NOT, RND.

- All three-operand instructions can be written without the 3. For example,
  
  ```
  ADDI3 R0,R1,R2 can be written as ADDI R0,R1,R2
  ```


  This also applies to all the pertinent parallel instructions.

- All three-operand comparison instructions can be written without the 3. For example,
  
  ```
  CMPI3 R0,*AR0 can be written as CMPI R0,*AR0
  ```

  Instructions affected: CMPI3, CMPF3, TSTB3.

- Indirect operands with an explicit 0 displacement are allowed. In three-operand or parallel instructions, operands with 0 displacement are automatically converted to no-displacement mode. For example:
  
  ```
  LDI *+AR0(0),R1 is legal
  ```

  Also

  ```
  ADDI3 *+AR0(0),R1,R2 is equivalent to ADDI3 *AR0,R1,R2
  ```

- Indirect operands can be written with no displacement; in which case, a displacement of 1 is assumed. For example,
  
  ```
  LDI *AR0++,R0 can be written as LDI *AR0+,R0
  ```

- All conditional instructions accept the suffix U to indicate unconditional operation. Also, the U can be omitted from unconditional short branch instructions. For example:
  
  ```
  BU label can be written as B label
  ```

- Labels can be written with or without a trailing colon. For example:
  
  ```
  label0: NOP
  label1  NOP
  label2:    (label assembles to next source line)
  ```
<table>
<thead>
<tr>
<th>Individual Instruction Descriptions</th>
</tr>
</thead>
</table>

- Empty expressions are not allowed for the displacement in indirect mode:
  
  \[
  \text{LDI } \star AR0(), R0 \quad \text{is not legal}
  \]

- Immediate-mode destination operands of BR and CALL can be written with an at (@) sign:
  
  \[
  \text{BR label \quad can be written as \quad BR @label}
  \]

- The LDP pseudo-op can be used to load a register (DP by default) with the 16 MSBs of a relocatable address as follows:
  
  \[
  \text{LDP addr, REG \quad or \quad LDP @addr, REG \quad or \quad LDP addr}
  \]

  The at (@) sign is optional.

  LDP generates an LDIU instruction. An immediate operand with a special relocation type is used.

- Parallel instructions can be written in either order. For example:
  
  \[
  \text{ADDI} \\
  || \quad \text{STI}
  \]

  \[
  \text{can be written as}
  \]

  \[
  \text{STI} \\
  || \quad \text{ADDI}
  \]

- The parallel bars indicating part two of a parallel instruction can be written anywhere on the line from column 0 to the mnemonic. For example:
  
  \[
  \text{ADDI} \\
  || \quad \text{STI}
  \]

  \[
  \text{can be written as}
  \]

  \[
  \text{ADDI} \\
  || \quad \text{STI}
  \]

- If the second operand of a parallel instruction is the same as the third (destination register) operand, the third operand can be omitted. This allows the writing of three-operand parallel instructions that look like normal two-operand instructions. For example,
  
  \[
  \text{ADDI } \star AR0, R2, R2 \\
  || \quad \text{MPYI } \star AR1, R0, R0
  \]

  \[
  \text{can be written as}
  \]

  \[
  \text{ADDI } \star AR0, R2 \\
  || \quad \text{MPYI } \star AR1, R0
  \]

Instructions affected (applies to all parallel instructions that have a register as the second operand): ADDI, ADDF, AND, MPYI, MPYF, OR, SUBI, SUBF, XOR.
All commutative operations in parallel instructions can be written in either order. For example, the ADDI part of a parallel instruction can be written in either of two ways:

```
ADDI *AR0,R1,R2 or ADDI R1,*AR0,R2
```

The instructions affected are parallel instructions containing any of the following: ADDI, ADDF, MPYI, MPYF, AND, OR, XOR.

Use the syntax in Table 14–10 to designate CPU registers in operands.

### 14.3.3 Individual Instruction Descriptions

Each assembly language instruction for the 'C4x is described in this section in alphabetical order. The description includes the assembler syntax, operation, operands, encoding, description, cycles, status bits, mode bit, and examples. Table 14–10 shows the CPU register symbols.
### Table 14–10. CPU Register Symbols

<table>
<thead>
<tr>
<th>Register Symbol</th>
<th>Machine Value (hex)</th>
<th>Assigned Function Name</th>
<th>Subsection</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>R0</td>
<td>00</td>
<td>Extended-precision register 0</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R1</td>
<td>01</td>
<td>Extended-precision register 1</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R2</td>
<td>02</td>
<td>Extended-precision register 2</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R3</td>
<td>03</td>
<td>Extended-precision register 3</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R4</td>
<td>04</td>
<td>Extended-precision register 4</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R5</td>
<td>05</td>
<td>Extended-precision register 5</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R6</td>
<td>06</td>
<td>Extended-precision register 6</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R7</td>
<td>07</td>
<td>Extended-precision register 7</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R8</td>
<td>1C</td>
<td>Extended-precision register 8</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R9</td>
<td>1D</td>
<td>Extended-precision register 9</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R10</td>
<td>1E</td>
<td>Extended-precision register 10</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>R11</td>
<td>1F</td>
<td>Extended-precision register 11</td>
<td>3.1.1</td>
<td>3-3</td>
</tr>
<tr>
<td>AR0</td>
<td>08</td>
<td>Auxiliary register 0</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR1</td>
<td>09</td>
<td>Auxiliary register 1</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR2</td>
<td>0A</td>
<td>Auxiliary register 2</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR3</td>
<td>0B</td>
<td>Auxiliary register 3</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR4</td>
<td>0C</td>
<td>Auxiliary register 4</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR5</td>
<td>0D</td>
<td>Auxiliary register 5</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR6</td>
<td>0E</td>
<td>Auxiliary register 6</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>AR7</td>
<td>0F</td>
<td>Auxiliary register 7</td>
<td>3.1.2</td>
<td>3-4</td>
</tr>
<tr>
<td>DP</td>
<td>10</td>
<td>Data-page pointer</td>
<td>3.1.3</td>
<td>3-4</td>
</tr>
<tr>
<td>IR0</td>
<td>11</td>
<td>Index register 0</td>
<td>3.1.4</td>
<td>3-4</td>
</tr>
<tr>
<td>IR1</td>
<td>12</td>
<td>Index register 1</td>
<td>3.1.4</td>
<td>3-4</td>
</tr>
<tr>
<td>BK</td>
<td>13</td>
<td>Block-size register</td>
<td>3.1.5</td>
<td>3-5</td>
</tr>
<tr>
<td>SP</td>
<td>14</td>
<td>System stack pointer</td>
<td>3.1.6</td>
<td>3-5</td>
</tr>
</tbody>
</table>
**Table 14–10. CPU Register Symbols (Continued)**

<table>
<thead>
<tr>
<th>Register Symbol</th>
<th>Register Value (hex)</th>
<th>Assigned Function Name</th>
<th>Subsection</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>ST</td>
<td>15</td>
<td>Status register</td>
<td>3.1.7</td>
<td>3-5</td>
</tr>
<tr>
<td>DIE</td>
<td>16</td>
<td>DMA coprocessor interrupt enable</td>
<td>3.1.8</td>
<td>3-8</td>
</tr>
<tr>
<td>IIE</td>
<td>17</td>
<td>Internal-interrupt enable register</td>
<td>3.1.9</td>
<td>3-11</td>
</tr>
<tr>
<td>IIF</td>
<td>18</td>
<td>IIOF pins and interrupt flag register</td>
<td>3.1.10</td>
<td>3-13</td>
</tr>
<tr>
<td>RS</td>
<td>19</td>
<td>Repeat start address</td>
<td>3.1.11</td>
<td>3-16</td>
</tr>
<tr>
<td>RE</td>
<td>1A</td>
<td>Repeat end address</td>
<td>3.1.11</td>
<td>3-16</td>
</tr>
<tr>
<td>RC</td>
<td>1B</td>
<td>Repeat counter</td>
<td>3.1.11</td>
<td>3-16</td>
</tr>
<tr>
<td>IVTP</td>
<td>00</td>
<td>Interrupt-vector table pointer</td>
<td>3.2</td>
<td>3-17</td>
</tr>
<tr>
<td>TVTP</td>
<td>01</td>
<td>Trap-vector table pointer</td>
<td>3.2</td>
<td>3-17</td>
</tr>
</tbody>
</table>
Syntax

\[ \text{INST } \text{src, dst} \]

or

\[ \text{INST1 } \text{src2, dst1} \]
\[ \text{|| INST2 } \text{src3, dst2} \]

Each instruction begins with an assembler syntax expression. Labels may be placed either before the command (instruction mnemonic) on the same line or on the preceding line in the first column. The optional comment field that concludes the syntax is not included in the syntax expression. A space is required between fields (label, command, operand, and comment fields).

The syntax examples illustrate the common one-line syntax and the two-line syntax used in parallel addressing. Note that the two vertical bars || that indicate a parallel addressing pair can be placed anywhere before the mnemonic on the second line. The first instruction in the pair can have a label, but the second instruction cannot have a label.

Operands

src general-addressing modes (G):

dst register (R0 – R11)

The operands segment lists the types of operands that the instruction uses.

Opcode

\[
\begin{array}{cccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 0 & 0 & \text{INST} & \text{G} & \text{dst} & & \text{src} \\
\end{array}
\]

\[
\begin{array}{cccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
1 & 1 & \text{INST1} & | & \text{INST2} & \text{dst1} & 0 & 0 & 0 & \text{src3} & \text{dst2} & \text{src2} \\
\end{array}
\]

Encoding examples are shown for general addressing and parallel addressing. The instruction pair for the parallel addressing example consists of INST1 and INST2. Note that two separate opcodes are listed in this case; each instruction is 32-bits in length in the 'C4x.

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (R0 – R11)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>
The word fields segment describes the addressing mode that corresponds to each value of a word field in the opcode. The word field listed in the table corresponds to the field listed under operands.

### Operation

<table>
<thead>
<tr>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>src (\rightarrow) dst</td>
<td></td>
</tr>
<tr>
<td>or</td>
<td></td>
</tr>
<tr>
<td>src2 (\rightarrow) dst1</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
</tbody>
</table>

The instruction operation sequence describes the processing that takes place when the instruction is executed. For parallel instructions, the operation sequence is performed in parallel. Conditional effects of status register specified modes are listed for conditional instructions such as Bcond.

- **dst** register (any register in CPU primary-register file)
- or
- src2 indirect (disp = 0, 1, IR0, IR1)
- dst1 register (R0–R7)
- src3 register (R0–R7)
- dst2 indirect (disp = 0, 1, IR0, IR1)

Operands are defined according to the addressing mode and/or the type of addressing used. Note that indirect addressing uses displacements and the index registers. See Chapter 6, Addressing, for detailed information on addressing.

### Description

Instruction execution and its effect on the rest of the processor or memory contents are described in this segment. Any constraints on the operands imposed by the processor or the assembler are discussed. The description parallels and supplements the information given by the operation block.

### Status Bits

- **LUF** Latched Floating-Point Underflow Condition Flag. 1 if a floating-point underflow occurs, unchanged otherwise.
- **LV** Latched Overflow Condition Flag. 1 if an integer or floating-point overflow occurs, unchanged otherwise.
- **UF** Floating-Point Underflow Condition Flag. 1 if a floating-point underflow occurs, 0 otherwise.
- **N** Negative Condition Flag. 1 if a negative result is generated, 0 otherwise. In some instructions, this flag is the MSB of the output.
- **Z** Zero Condition Flag. 1 if a zero result is generated, 0 otherwise. For logical and shift instructions, 1 if a zero output is generated, 0 otherwise.
V  **Overflow Condition Flag.** 1 if an integer or floating-point overflow occurs, 0 otherwise.

C  **Carry Flag.** 1 if a carry or borrow occurs, 0 otherwise. For shift instructions, this flag is set to the value of the last bit shifted out; 0 for a shift count of 0.

The seven condition flags are stored in the status register (ST). They provide information about the properties of the result or output of arithmetic or logical operations.

**Mode Bit**

**OVM Overflow Mode Flag.** In general, integer operations are affected by the OVM bit value.

**Cycles**

1

The digit specifies the number of cycles required to execute the instruction.

**Example**

```
INST @98AEh, R5
```

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>DP 80h</td>
<td>DP 80h</td>
</tr>
<tr>
<td>R5 07 6690 0000h</td>
<td>R5 00 6690 0000h</td>
</tr>
<tr>
<td>Memory at 0080 98AEh</td>
<td>Memory at 80 98AEh</td>
</tr>
<tr>
<td>5CDFh 1.00001107e + 00</td>
<td>5CDFh 1.00001107e + 00</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>

The sample code presented in the above format shows the effect of the code on system pointers (e.g., DP or SP), registers (e.g., R1 or R5), memory at specific locations, and the seven status bits. The values given for the registers include the leading zeros to show the exponent in floating-point operations. Decimal conversions are provided for all register and memory locations. The seven status bits are listed in the order in which they appear in the assembler and simulator (see Section 14.2, *Condition Codes and Flags*, and Table 14–8 on page 14-14 for further information on these seven status bits).
ABSF  Absolute Value of Floating-Point Number

Syntax  
**ABSF** `src, dst`

Operands  
*src*: general-addressing modes  
*dst*: register (R0 – R11)

Opcode  
```
31 21 23 16 0
0 0 0 0 0 src0 dstG00
```

Word Fields  

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (R0 – R11)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation  
|src| → |dst|

Description  
The absolute value of the *src* operand is loaded into the *dst* register. The *src* and *dst* operands are assumed to be floating-point numbers.

An overflow occurs if *src* (man) = 8000 0000h and *src* (exp) = 7Fh. The result is *dst* (man) = 7FFF FFFFh and *dst* (exp) = 7Fh.

Status Bits  

<table>
<thead>
<tr>
<th>Bit</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>1 if a floating-point overflow occurs, unchanged otherwise</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>1 if a floating-point overflow occurs, 0 otherwise</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit  

<table>
<thead>
<tr>
<th>Bit</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>OVM</td>
<td>Operation is affected by the OVM bit’s value.</td>
</tr>
</tbody>
</table>

Cycles  
1

Example  
**ABSF** `R4, R7`

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R4 05C8000F971h</td>
<td>R4 05C8000F971h</td>
</tr>
<tr>
<td>R7 07D251100AEh</td>
<td>R7 05C7FFFF068Fh</td>
</tr>
<tr>
<td>LV</td>
<td>LV</td>
</tr>
<tr>
<td>Z</td>
<td>Z</td>
</tr>
<tr>
<td>V</td>
<td>V</td>
</tr>
</tbody>
</table>

14-26
### Parallel ABSF and STF

**ABSF||STF**

#### Syntax

```plaintext
ABSF src2, dst1
|| STF src3, dst2
```

#### Operands

- **src2**: indirect (disp = 0, 1, IR0, IR1)
- **dst1**: register (R0 – R7)
- **src3**: register (R0 – R7)
- **dst2**: indirect (disp = 0, 1, IR0, IR1)

#### Opcode

```
<table>
<thead>
<tr>
<th>31  29 24 23 16 15 8 7 0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
</tr>
</tbody>
</table>
```

#### Word Fields

None.

#### Operation

- `|src2| → dst1`
- `|| src3 → dst2`

#### Description

A floating-point absolute value and a floating-point store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STF) reads from a register and the operation being performed in parallel (ABSF) writes to the same register, then STF accepts as input the contents of the register before it is modified by the ABSF.

If `src2` and `dst2` point to the same location, `src2` is read before the write to `dst2`. If `src3` and `dst1` point to the same register, `src3` is read before the write to `dst1`.

An overflow occurs if `src` (man) = 8000 0000h and `src` (exp) = 7Fh. The result is `dst` (man) = 7FFF FFFFh and `dst` (exp) = 7Fh.

#### Status Bits

- **LUF** Unaffected
- **LV** 1 if a floating-point overflow occurs, unchanged otherwise
- **UF** 0
- **N** 0
- **Z** 1 if a zero result is generated, 0 otherwise
- **V** 1 if a floating-point overflow occurs, 0 otherwise
- **C** Unaffected

#### Mode Bit

**OVM** operation is not affected by OVM bit value.

#### Cycles

1
### Example

**ABSF** `+++AR3(IR1), R4 || STF R4, −AR7(1)`

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>AR3</strong> 80 9800h</td>
<td><strong>AR3</strong> 80 98AFh</td>
</tr>
<tr>
<td><strong>IR1</strong> 0AFh</td>
<td><strong>IR1</strong> 0AFh</td>
</tr>
<tr>
<td><strong>R4</strong> 733C0 0000h</td>
<td><strong>R4</strong> 733C0 0000h</td>
</tr>
<tr>
<td>{{1.79750e + 02}}</td>
<td>6.118750e + 01</td>
</tr>
<tr>
<td><strong>AR7</strong> 80 98C5h</td>
<td><strong>AR7</strong> 80 98C5h</td>
</tr>
<tr>
<td><strong>LUF</strong> 0</td>
<td><strong>LUF</strong> 0</td>
</tr>
<tr>
<td><strong>LV</strong> 0</td>
<td><strong>LV</strong> 0</td>
</tr>
<tr>
<td><strong>UF</strong> 0</td>
<td><strong>UF</strong> 0</td>
</tr>
<tr>
<td><strong>N</strong> 0</td>
<td><strong>N</strong> 0</td>
</tr>
<tr>
<td><strong>Z</strong> 0</td>
<td><strong>Z</strong> 0</td>
</tr>
<tr>
<td><strong>V</strong> 0</td>
<td><strong>V</strong> 0</td>
</tr>
<tr>
<td><strong>C</strong> 0</td>
<td><strong>C</strong> 0</td>
</tr>
</tbody>
</table>

- **Data at 80 98AFh**
  - 58B 4000h: −6.118750e + 01
  - 58B 4000h: −6.118750e + 01

- **Data at 80 98C4h**
  - 0h
  - 733 C000h: 1.79750e + 02
**Syntax**

`ABSI src, dst`

**Operands**

- `src`: general-addressing modes
- `dst`: register (any register in CPU primary-register file)

**Opcode**

```
<table>
<thead>
<tr>
<th>31 24 23 16 15 8 7 0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 0 0 0 0 1</td>
</tr>
<tr>
<td>G dst src</td>
</tr>
</tbody>
</table>
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

`|src| → dst`

**Description**

The absolute value of the `src` operand is loaded into the `dst` register. The `src` and `dst` operands are assumed to be signed integers.

An overflow occurs if `src = 8000 0000h`. If ST(OVM) = 1, the result is `dst = 7FFF FFFFh`. If ST(OVM) = 0, the result is `dst = 8000 0000h`.

**Status Bits**

- **LUF**: Unaffected
- **LV**: 1 if an integer overflow occurs, unchanged otherwise
- **UF**: 0
- **N**: 0
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an integer overflow occurs, 0 otherwise
- **C**: Unaffected

**Mode Bit**

- OVM operation is affected by OVM bit value.

**Cycles**

1
**ABSI**  
*Absolute Value of Integer*

**Example 1**  
ABSI R0,R0  
or ABSI R0

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R0</td>
<td>0FFFF FFCBh -53</td>
</tr>
<tr>
<td>R0</td>
<td>03Fh 53</td>
</tr>
</tbody>
</table>

**Example 2**  
ABSI *AR1,R3*

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR1</td>
<td>20h</td>
</tr>
<tr>
<td>R3</td>
<td>0h</td>
</tr>
<tr>
<td>Data at 20h</td>
<td>0FFFF FFCBh -53</td>
</tr>
<tr>
<td>AR1</td>
<td>20h</td>
</tr>
<tr>
<td>R3</td>
<td>35h 53</td>
</tr>
<tr>
<td>Data at 20h</td>
<td>0FFFF FFCBh -53</td>
</tr>
</tbody>
</table>
### Parallel ABSI and STI

<table>
<thead>
<tr>
<th>Syntax</th>
<th>ABSI</th>
<th>src2, dst1</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>STI</td>
<td>src3, dst2</td>
</tr>
</tbody>
</table>

#### Operands
- **src2**: indirect (disp = 0, 1, IR0, IR1)
- **dst1**: register (R0 – R7)
- **src3**: register (R0 – R7)
- **dst2**: indirect (disp = 0, 1, IR0, IR1)

#### Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>14</th>
<th>13</th>
<th>12</th>
<th>11</th>
<th>10</th>
<th>9</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>dst1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>src3</td>
<td>dst2</td>
<td>src2</td>
<td></td>
</tr>
</tbody>
</table>

#### Word Fields
None

#### Operation
- |src2| → dst1
- || src3 → dst2

#### Description
An integer absolute value and an integer store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STI) reads from a register and the operation being performed in parallel (ABSI) writes to the same register, then STI accepts as input the contents of the register before it is modified by the ABSI.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

An overflow occurs if src = 8000 0000h. If ST(OVM) = 1, the result is dst = 7FFF FFFFh. If ST(OVM) = 0, the result is dst = 8000 0000h.

#### Status Bits
- **LUF**  Unaffected
- **LV**  1 if an integer overflow occurs, unchanged otherwise
- **UF**  0
- **N**  0
- **Z**  1 if a zero result is generated, 0 otherwise
- **V**  1 if an integer overflow occurs, 0 otherwise
- **C**  Unaffected

#### Mode Bit
**OVM** operation is affected by OVM bit value.

#### Cycles
1
Example

\[
\text{ABSI } *\text{AR5(1),R5} \quad || \quad \text{STI } R1, \text{*AR2=-(IR1)}
\]

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR5 80 99E2h</td>
<td>AR5 80 99E2h</td>
</tr>
<tr>
<td>R5 0h</td>
<td>R5 35h 53</td>
</tr>
<tr>
<td>R1 42h 66</td>
<td>R1 42h 66</td>
</tr>
<tr>
<td>AR2 80 98FFh</td>
<td>AR2 80 98F0h</td>
</tr>
<tr>
<td>IR1 0Fh</td>
<td>IR1 0Fh</td>
</tr>
<tr>
<td>Data at 80 99E1h</td>
<td>Data at 80 99E1h</td>
</tr>
<tr>
<td>0FFFFFFFCCBh</td>
<td>–53 0FFFFFFFCCBh</td>
</tr>
<tr>
<td>Data at 80 98FFh</td>
<td>Data at 80 98FFh</td>
</tr>
<tr>
<td>2h</td>
<td>42h 66</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
Add Integer With Carry  

Add Integer With Carry  

Syntax  

```
ADD C src, dst
```

Operands  

- **src**: general-addressing modes  
- **dst**: register (any register in CPU primary-register file)

Opcode  

```
<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0</td>
<td>0 0 0 0 0 1 0</td>
<td>G</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>
```

Word Fields  

```
G src addressing modes
00 register (any register in CPU primary-register file)
01 direct
10 indirect
11 immediate
```

Operation  

```
dst + src + C → dst
```

Description  

The sum of the dst and src operands and the C (carry) flag is loaded into the dst register. The dst and src operands are assumed to be signed integers.

Status Bits  

If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF**: Unaffected  
- **LV**: 1 if an integer overflow occurs, unchanged otherwise  
- **UF**: 0  
- **N**: 1 if a negative result is generated, 0 otherwise  
- **Z**: 1 if a zero result is generated, 0 otherwise  
- **V**: 1 if an integer overflow occurs, 0 otherwise  
- **C**: 1 if a carry occurs, 0 otherwise

Mode Bit  

**OVM** operation is affected by OVM bit value.

Cycles  

1
### ADDC  Add Integer With Carry

**Example**  

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
</table>
| ADDC R1, R5 | R1 00FFFF 5C25h  
  R5 00FFFF 019Eh  
  LV 0  
  UF 0  
  N 0  
  Z 0  
  V 0  
  C 0 | R1 00FFFF 5C25h  
  R5 00FFFFFF DC4h  
  LV 0  
  UF 0  
  N 0  
  Z 0  
  V 0  
  C 0 |
### Syntax

**ADDC3**  
*src2, src1, dst*

### Operands

*src1, src2*: type 1 or type 2 three-operand addressing modes  
*dst*: register mode (any register in CPU primary-register file)

### Opcode

#### Type 1

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 0 0 0 0 0</td>
<td>T</td>
<td>dst</td>
<td>src1</td>
<td>src2</td>
</tr>
</tbody>
</table>

#### Type 2

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 1 0 0 0 0</td>
<td>T</td>
<td>dst</td>
<td>src1</td>
<td>src2</td>
</tr>
</tbody>
</table>

### Word Fields

#### Type 1

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

#### Type 2

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

### Operation

*src1 + src2 + C → dst*
ADDC3  Add Integer With Carry, 3 Operands

Description  The sum of the src1 and src2 operands and value of the C (carry) flag is loaded into the dst register. The src1, src2, and dst operands are assumed to be signed integers.

Status Bits  If ST (SET COND) = 0, the condition flags are modified if the destination register is R0 – R11. If ST (SET COND) = 1, they are modified for all destination registers.

LUF  Unaffected
LV  1 if an integer overflow occurs, unchanged otherwise
U  0
N  1 if a negative result is generated, 0 otherwise
Z  1 if a zero result is generated, 0 otherwise
V  1 if an integer overflow occurs, 0 otherwise
C  1 if a carry occurs, 0 otherwise

Mode Bit  OVM operation is affected by OVM bit value.

Cycles  1

Example  None
Add Floating-Point Values

**ADD F**

### Assembly Language Instructions

**Syntax**

`ADD F src, dst`

**Operands**

- `src`: general-addressing modes
- `dst`: register (R0 – R11)

**Opcode**

```
0 0 0 0 0 0 1 1 | 0 0 0 0 23 24 31 15 8 7 0
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (R0-R11)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

`dst + src → dst`

**Description**

The sum of the `dst` and `src` operands is loaded into the `dst` register. The `dst` and `src` operands are assumed to be floating-point numbers.

**Status Bits**

- **LUF**: 1 if a floating-point underflow occurs, unchanged otherwise
- **LV**: 1 if a floating-point overflow occurs, unchanged otherwise
- **UF**: 1 if a floating-point underflow occurs, 0 otherwise
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an floating-point overflow occurs, 0 otherwise
- **C**: Unaffected

**Mode Bit**

- **OVM**: operation is not affected by OVM bit value.

**Cycles**

1
### Example

ADD.F *AR4++(IR1),R5

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR4 80 9800h</td>
<td>AR4 80 992Bhh</td>
</tr>
<tr>
<td>IR1 12Bh</td>
<td>IR1 12Bh</td>
</tr>
<tr>
<td>R5 057980 0000h</td>
<td>R5 09052C 0000h</td>
</tr>
<tr>
<td>6.23750e + 01</td>
<td>5.3268750e + 02</td>
</tr>
</tbody>
</table>

Data at 80 9800h

<p>| | |</p>
<table>
<thead>
<tr>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>86B 2800h</td>
<td>86B 2800h</td>
</tr>
<tr>
<td>4.7031250e + 02</td>
<td>4.7031250e + 02</td>
</tr>
</tbody>
</table>

LUF 0
LV 0
UF 0
N 0
Z 0
V 0
C 0
Add Floating-Point Values, 3 Operands  ADDF3

Syntax  ADDF3  \( src2, src1, dst \)

Operands  \( src1, src2 \): type 1 or type 2 three-operand addressing modes  
\( dst \): register mode (R0 – R11)

Opcode

Type 1

\[
\begin{array}{cccccc}
31 & 24 & 23 & 16 & 15 & 8 \ \ 7 \ \ 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & T & dst & src1 & src2
\end{array}
\]

Type 2

\[
\begin{array}{cccccc}
31 & 24 & 23 & 16 & 15 & 8 \ \ 7 \ \ 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & T & dst & src1 & src2
\end{array}
\]

Word Fields

Type 1

\( T \)  \( src1 \) addressing modes  \( src2 \) addressing modes

| 00 | register mode (R0 – R11) | register mode (R0 – R11) |
| 01 | indirect mode (disp = 0, 1, IR0, IR1) | register mode (R0 – R11) |
| 10 | register mode (R0 – R11) | indirect mode (disp = 0, 1, IR0, IR1) |
| 11 | indirect mode (disp = 0, 1, IR0, IR1) | indirect mode (disp = 0, 1, IR0, IR1) |

Type 2

\( T \)  \( src1 \) addressing modes  \( src2 \) addressing modes

| 01 | register mode (any CPU register) | indirect mode *+ARn(5-bit unsigned displacement) |
| 11 | indirect mode *+ARn1(5-bit unsigned displacement) | indirect mode *+ARn2(5-bit unsigned displacement) |

Operation  \( src1 + src2 \rightarrow dst \)
ADDF3  Add Floating-Point Values, 3 Operands

Description
The sum of the src1 and src2 operands is loaded into the dst register. The src1, src2, and dst operands are assumed to be floating-point numbers.

Status Bits
- **LUF**: 1 if a floating-point underflow occurs, unchanged otherwise
- **LV**: 1 if a floating-point overflow occurs, unchanged otherwise
- **UF**: 1 if a floating-point underflow occurs, 0 otherwise
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an floating-point overflow occurs, 0 otherwise
- **C**: Unaffected

Mode Bit
**OVM** operation is not affected by OVM bit value.

Cycles
1

Example
`ADDF3 *+AR1(2),*+AR1(8),R4`

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR1 2FF820h</td>
<td>AR1 2FF820h</td>
</tr>
<tr>
<td>R4 0h</td>
<td>R4 070DB20000h</td>
</tr>
<tr>
<td>Data at 22F F822h</td>
<td>Data at 22F F828h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
Parallel ADDF3 and STF  ADDF3||STF

Syntax

\[
\text{ADDF3} \quad \text{src2, src1, dst1} \\
\| \quad \text{STF} \quad \text{src3, dst2}
\]

Operands

- \textit{src1}: register (R0 – R7)
- \textit{src2}: indirect (disp = 0, 1, IR0, IR1)
- \textit{dst1}: register (R0 – R7)
- \textit{src3}: register (R0 – R7)
- \textit{dst2}: indirect (disp = 0, 1, IR0, IR1)

Opcode

\[
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
1 & 1 & 0 & 0 & 1 & 1 & 0 & \text{dst1} \quad \text{src1} \quad \text{dst2} \quad \text{src2}
\end{array}
\]

Word Fields

None

Operation

\[
\text{src1} + \text{src2} \rightarrow \text{dst1} \\
\| \quad \text{src3} \rightarrow \text{dst2}
\]

Description

A floating-point addition and a floating-point store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STF) reads from a register and the operation being performed in parallel (ADDF3) writes to the same register, then STF accepts as input the contents of the register before it is modified by the ADDF3.

If \textit{src2} and \textit{dst2} point to the same location, \textit{src2} is read before the write to \textit{dst2}.

Status Bits

- \textsc{LUF}: 1 if a floating-point underflow occurs, unchanged otherwise
- \textsc{LV}: 1 if a floating-point overflow occurs, unchanged otherwise
- \textsc{UF}: 1 if a floating-point underflow occurs, 0 otherwise
- \textsc{N}: 1 if a negative result is generated, 0 otherwise
- \textsc{Z}: 1 if a zero result is generated, 0 otherwise
- \textsc{V}: 1 if an floating-point overflow occurs, 0 otherwise
- \textsc{C}: Unaffected

Mode Bit

\textsc{OVM} operation is not affected by OVM bit value.

Cycles

1
**Example**

`ADDF3  *+AR3(IR1),R2,R5
|| STF  R4,*AR2`

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>AR3</strong> 80 9800h</td>
<td><strong>AR3</strong> 80 9800h</td>
</tr>
<tr>
<td><strong>IR1</strong>  0A5h</td>
<td><strong>IR1</strong>  0A5h</td>
</tr>
<tr>
<td><strong>R2</strong> 070C80 0000h</td>
<td><strong>R2</strong> 070C80 0000h</td>
</tr>
<tr>
<td><strong>R5</strong>  0h</td>
<td><strong>R5</strong>  082020 0000h</td>
</tr>
<tr>
<td><strong>R4</strong> 057B40 0000h</td>
<td><strong>R4</strong> 057B40 0000h</td>
</tr>
<tr>
<td><strong>AR2</strong> 80 98F3h</td>
<td><strong>AR2</strong> 80 98F3h</td>
</tr>
<tr>
<td><strong>Data at 80 98A5h</strong> 733 C000h</td>
<td><strong>Data at 80 98F3h</strong> 57B4000h</td>
</tr>
<tr>
<td><strong>Data at 80 98F3h</strong></td>
<td><strong>Data at 80 98F3h</strong></td>
</tr>
<tr>
<td><strong>LUF</strong> 0</td>
<td><strong>LUF</strong> 0</td>
</tr>
<tr>
<td><strong>LV</strong> 0</td>
<td><strong>LV</strong> 0</td>
</tr>
<tr>
<td><strong>UF</strong> 0</td>
<td><strong>UF</strong> 0</td>
</tr>
<tr>
<td><strong>N</strong> 0</td>
<td><strong>N</strong> 0</td>
</tr>
<tr>
<td><strong>Z</strong> 0</td>
<td><strong>Z</strong> 0</td>
</tr>
<tr>
<td><strong>V</strong> 0</td>
<td><strong>V</strong> 0</td>
</tr>
<tr>
<td><strong>C</strong> 0</td>
<td><strong>C</strong> 0</td>
</tr>
</tbody>
</table>
Add Integer ADDI

**Syntax**

ADDI src, dst

**Operands**

src: general-addressing modes
dst: register (any register in CPU primary-register file)

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 0 0 0 1 0 0</td>
<td>G</td>
<td>dst</td>
<td>src</td>
<td></td>
</tr>
</tbody>
</table>

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

dst + src → dst

**Description**

The sum of the dst and src operands is loaded into the the dst register. The dst and src operands are assumed to be signed integers.

**Status Bits**

If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF**: Unaffected
- **LV**: 1 if an integer overflow occurs, unchanged otherwise
- **UF**: 0
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an integer overflow occurs, 0 otherwise
- **C**: 1 if a carry occurs, 0 otherwise

**Mode Bit**

OVM operation is affected by OVM bit value.

**Cycles**

None
### Example

**ADDI**  \( R3, R7 \)

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R3</td>
<td>0FFFF FFCBh</td>
</tr>
<tr>
<td>R7</td>
<td>35h</td>
</tr>
<tr>
<td>LUF</td>
<td>53</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
<tr>
<td>R3</td>
<td>0FFFF FFCBh</td>
</tr>
<tr>
<td>R7</td>
<td>0h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>1</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
Add Integer, 3 Operands

**ADDI3**

**Syntax**

ADDI3 src2, src1, dst

**Operands**

src1, src2: type 1 or type 2 three-operand addressing modes  
dst: register mode (any register in CPU primary-register file)

**Opcode**

**Type 1**

<table>
<thead>
<tr>
<th>Type</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>Type</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

**Operation**

src1 + src2 → dst
ADDI3  Add Integer, 3 Operands

Description  The sum of the src1 and src2 operands is loaded into the dst register. The src1, src2, and dst operands are assumed to be signed integers.

Status Bits  If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

<table>
<thead>
<tr>
<th>Flag</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>1 if an integer overflow occurs, unchanged otherwise</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>1 if a negative result is generated, 0 otherwise</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>1 if an integer overflow occurs, 0 otherwise</td>
</tr>
<tr>
<td>C</td>
<td>1 if a carry occurs, 0 otherwise</td>
</tr>
</tbody>
</table>

Mode Bit  OVM operation is affected by OVM bit value.

Cycles  1

Example  None
Parallel ADDI3 and STI

ADDI3||STI

Syntax

ADDI3  src2, src1, dst1
|| STI  src3, dst2

Operands

src1: register (R0 – R7)
src2: indirect (disp = 0, 1, IR0, IR1)
dst1: register (R0 – R7)
src3: register (R0 – R7)
dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>dst1</td>
<td>src1</td>
<td>src3</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>dst2</td>
<td></td>
<td>src2</td>
</tr>
</tbody>
</table>

Word Fields

None

Operation

src1 + src2 → dst1
|| src3 → dst2

Description

An integer addition and an integer store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STI) reads from a register and the operation being performed in parallel (ADDI3) writes to the same register, then STI accepts as input the contents of the register before it is modified by the ADDI3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Status Bits

LUF  Unaffected
LV   1 if an integer overflow occurs, unchanged otherwise
UF   0
N    1 if a negative result is generated, 0 otherwise
Z    1 if a zero result is generated, 0 otherwise
V    1 if an integer overflow occurs, 0 otherwise
C    1 if a carry occurs, 0 otherwise

Mode Bit

OVM operation is affected by OVM bit value.

Cycles

1

Assembly Language Instructions  14-47
### Example


<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>AR0</strong></td>
<td>80 992Ch</td>
</tr>
<tr>
<td><strong>IR0</strong></td>
<td>0Ch</td>
</tr>
<tr>
<td><strong>R5</strong></td>
<td>0DCh</td>
</tr>
<tr>
<td><strong>R0</strong></td>
<td>0h</td>
</tr>
<tr>
<td><strong>R3</strong></td>
<td>35h</td>
</tr>
<tr>
<td><strong>AR7</strong></td>
<td>80 983Bh</td>
</tr>
<tr>
<td>Data at 80 992Ch</td>
<td>12Ch 300</td>
</tr>
<tr>
<td>Data at 80 983Bh</td>
<td>0h</td>
</tr>
<tr>
<td><strong>LUF</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>LV</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>UF</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>N</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>Z</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>V</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>C</strong></td>
<td>0</td>
</tr>
</tbody>
</table>

**ADDI3** *AR0=-(IR0),R5,R0*  
**STI**  R3,*AR7**
Bitwise Logical-AND

**Syntax**

\[
\text{AND } src, dst
\]

**Operands**

- **src**: general-addressing modes
- **dst**: register (any register in CPU primary-register file)

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td>0</td>
<td>1</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

\[ dst \text{ AND } src \rightarrow dst \]

**Description**

The bitwise-logical AND between the \( dst \) and \( src \) operands is loaded into the \( dst \) register. The \( dst \) and \( src \) operands are assumed to be unsigned integers.

**Status Bits**

If \( ST \text{ (SET COND)} = 0 \) and the destination register is \( R0 \rightarrow R11 \), the condition flags are modified. If \( ST \text{ (SET COND)} = 1 \), they are modified for all destination registers.

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: MSB of the output
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 0
- **C**: Unaffected

**Mode Bit**

\( OVM \) operation is not affected by \( OVM \) bit value.

**Cycles**

1
### AND Bitwise Logical-AND

**Example**

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R1 80h</td>
<td>R1 80h</td>
</tr>
<tr>
<td>R2 0AFFh</td>
<td>R2 80h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 1</td>
<td>C 1</td>
</tr>
</tbody>
</table>

**Before Instruction**
- R1: 80h
- R2: 0AFFh
- LUF: 0
- LV: 0
- UF: 0
- N: 0
- Z: 0
- V: 0
- C: 1

**After Instruction**
- R1: 80h
- R2: 80h
- LUF: 0
- LV: 0
- UF: 0
- N: 0
- Z: 0
- V: 0
- C: 1
Syntax

\[ \text{AND3} \ src2, \ src1, \ dst \]

Operands

- \textit{src1}, \textit{src2}: type 1 or type 2 three-operand addressing modes
- \textit{dst}: register mode (any register in CPU primary-register file)

Opcode

Type 1

\[
\begin{array}{cccccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & T & dst & src1 & src2
\end{array}
\]

Type 2

\[
\begin{array}{cccccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & T & dst & src1 & src2
\end{array}
\]

Word Fields

Type 1

\[
\begin{array}{ccc}
T & src1 \ addressing \ modes & src2 \ addressing \ modes \\
00 & \text{register mode (any CPU register)} & \text{register mode (any CPU register)} \\
01 & \text{indirect mode (disp = 0, 1, IR0, IR1)} & \text{register mode (any CPU register)} \\
10 & \text{register mode (any CPU register)} & \text{indirect mode (disp = 0, 1, IR0, IR1)} \\
11 & \text{indirect mode (disp = 0, 1, IR0, IR1)} & \text{indirect mode (disp = 0, 1, IR0, IR1)}
\end{array}
\]

Type 2

\[
\begin{array}{ccc}
T & src1 \ addressing \ modes & src2 \ addressing \ modes \\
00 & \text{register mode (any CPU register)} & \text{8-bit signed immediate} \\
01 & \text{register mode (any CPU register)} & \text{indirect mode } *+\text{ARn}(5\text{-bit unsigned displacement}) \\
10 & \text{indirect mode } *+\text{ARn}(5\text{-bit unsigned displacement}) & \text{8-bit signed immediate} \\
11 & \text{indirect mode } *+\text{ARn1}(5\text{-bit unsigned displacement}) & \text{indirect mode } *+\text{ARn2}(5\text{-bit unsigned displacement)}
\end{array}
\]

Operation

\[ src1 \ & \ src2 \rightarrow \ dst \]
AND3  Bitwise Logical-AND, 3 Operands

Description
The bitwise logical-AND between the src1 and src2 operands is loaded into the dst register. The src1, src2, and dst operands are assumed to be unsigned integers. The immediate src2 addressing mode is sign-extended.

Status Bits
If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

<table>
<thead>
<tr>
<th>Flag</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>MSB of the output</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit
OVM operation is not affected by OVM bit value.

Cycles
1

Example
Notice the difference between AND and AND3, in this example:

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Destination</th>
<th>Source 1</th>
<th>Source 2</th>
<th>Result</th>
</tr>
</thead>
<tbody>
<tr>
<td>AND3 80h, R0, R0</td>
<td>R0=FFFF FFFFh</td>
<td>R0=FFFF FF80h</td>
<td></td>
<td></td>
</tr>
<tr>
<td>AND 80h, R0</td>
<td>R0=FFFF FFFFh</td>
<td>R0=0000 0080h</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
Parallel AND3 and STI

Syntax

\[ \text{AND3 } src2, src1, dst1 \]
\[ || \]
\[ \text{STI } src3, dst2 \]

Operands

- src1: register (R0 – R7)
- src2: indirect (disp = 0, 1, IR0, IR1)
- dst1: register (R0 – R7)
- src3: register (R0 – R7)
- dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>dst1</td>
<td>src1</td>
<td>src3</td>
</tr>
</tbody>
</table>

Word Fields

None

Operation

\[ src1 \text{ AND } src2 \rightarrow dst1 \]
\[ || \]
\[ src3 \rightarrow dst2 \]

Description

A bitwise-logical AND and an integer store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STI) reads from a register and the operation being performed in parallel (AND3) writes to the same register, then STI accepts as input the contents of the register before it is modified by the AND3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Status Bits

- LUF: Unaffected
- LV: Unaffected
- UF: 0
- N: MSB of the output
- Z: 1 if a zero result is generated, 0 otherwise
- V: 0
- C: Unaffected

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1
### Example

\[
\text{AND3} \quad *+\text{AR1(IR0), R4, R7} \\
\text{|| STI R3, *AR2}
\]

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR1</td>
<td>80 99F1h</td>
</tr>
<tr>
<td>IR0</td>
<td>8h</td>
</tr>
<tr>
<td>R4</td>
<td>0A323h</td>
</tr>
<tr>
<td>R7</td>
<td>0h</td>
</tr>
<tr>
<td>R3</td>
<td>35h</td>
</tr>
<tr>
<td>AR2</td>
<td>80 983Fh</td>
</tr>
<tr>
<td>Data at 80 99F9h</td>
<td>5C53h</td>
</tr>
<tr>
<td>Data at 80 983Fh</td>
<td>0h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
Bitwise Logical-AND With Complement  \( \text{ANDN} \)

Syntax

\[ \text{ANDN} \ src, \ dst \]

Operands

- **src**: general-addressing modes
- **dst**: register (any register in CPU primary-register file)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16</th>
<th>15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 0 0 1 1 0</td>
<td>dst</td>
<td>src</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation

\[ dst \ \text{AND} \ \sim src \rightarrow dst \]

Description

The bitwise-logical AND between the **dst** operand and the bitwise-logical complement (\( \sim \)) of the **src** operand is loaded into the **dst** register. The **dst** and **src** operands are assumed to be unsigned integers.

Status Bits

If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: MSB of the output
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 0
- **C**: Unaffected

Mode Bit

**OVM** operation is not affected by OVM bit value.

Cycles

1
**Example**

**ANDN @980Ch, R2**

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>DP 80h</td>
<td>DP 80h</td>
</tr>
<tr>
<td>R2 0C2Fh</td>
<td>R2 042Dh</td>
</tr>
<tr>
<td>Data at 80 980Ch</td>
<td>Data at 80 980Ch</td>
</tr>
<tr>
<td>0A02h</td>
<td>0A02h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
Syntax

**ANDN3 src2, src1, dst**

Operands

*src1, src2:* type 1 or type 2 three-operand addressing modes  
*dst:* register mode (any register in CPU primary-register file)

Opcode

**Type 1**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0 1</td>
<td>0 0 0</td>
<td>1 0 0</td>
<td>T</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0 1</td>
<td>0 0 0</td>
<td>1 0 0</td>
<td>T</td>
</tr>
</tbody>
</table>

Word Fields

**Type 1**

<table>
<thead>
<tr>
<th></th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th></th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

Operation

\[ \text{src1 AND} \sim \text{src2} \rightarrow \text{dst} \]
Description
The bitwise-logical AND between the src1 operand and the bitwise-logical complement (¬) of the src2 operand is loaded into the dst register. The src1, src2, and dst operands are assumed to be unsigned integers. The immediate src2 addressing mode is sign-extended.

Status Bits
- LUF: Unaffected
- LV: Unaffected
- UF: 0
- N: MSB of the output
- Z: 1 if a zero result is generated, 0 otherwise
- V: 0
- C: Unaffected

Mode Bit
OVM operation is not affected by OVM bit value.

Cycles
1

Example
None
Syntax

`ASH src_count, dst`

Operands

`src_count`: general-addressing modes  
`dst`: register (any register in CPU primary-register file)

Opcode

```
0 0 0 0 0 0 1 1 1 G dst src_count
```

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation

`count = 7` LSBs of `src_count`

If (`count ≥ 0`):

\[ dst \leftarrow count \rightarrow dst \]

Else:

\[ dst \rightarrow |count| \rightarrow dst \]

Description

The seven least-significant bits of the `src_count` operand constitute the 2s-complement shift `count` of up to 32 bits.

If `count` is greater than 0, the `dst` operand is left-shifted by the value of `count`. Low-order bits shifted in are zero-filled, and high-order bits are shifted out through the C (carry) bit.

Arithmetic left-shift:

![Arithmetic left-shift diagram]

If `count` is less than 0, the `dst` operand is right-shifted by the absolute value of `count`. The high-order bits of the `dst` operand are sign-extended as it is right-shifted. Low-order bits are shifted out through the C (carry) bit.

Arithmetic right-shift:

![Arithmetic right-shift diagram]
ASH  Arithmetic Shift

If count is 0, no shift is performed, and the C (carry) bit is set to 0. The src_count and dst operands are assumed to be signed integers.

Status Bits

If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

LUF  Unaffected
LV  1 if an integer overflow occurs, unchanged otherwise
UF  0
N  MSB of the output
Z  1 if a zero result is generated, 0 otherwise
V  1 if an integer overflow occurs, 0 otherwise
C  Set to the value of the last bit shifted out. 0 for a shift count of 0

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1

Example 1

ASH R1,R3

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R1 10h</td>
<td>R1 10h</td>
</tr>
<tr>
<td>R3 0A E000h</td>
<td>R3 0E000 0000h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 1</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 1</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 1</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>

Example 2

ASH @98C3h,R5

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>DP 80h</td>
<td>DP 80h</td>
</tr>
<tr>
<td>R5 0AEC0 0001h</td>
<td>R5 0FFFF FFAEh</td>
</tr>
<tr>
<td>Data at 80 98C3h</td>
<td>Data at 80 98C3h</td>
</tr>
<tr>
<td>0FFE8 –24</td>
<td>0FFE8 –24</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 1</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 1</td>
</tr>
</tbody>
</table>

14-60
Syntax

ASH3 src_count, src, dst

Operands

src, src_count  type 1 or type 2 three-operand addressing modes
dst  register mode (any register in CPU primary register file)

Opcode

Type 1

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 0 0 0 1 0 1</td>
<td>T</td>
<td>dst</td>
<td>src</td>
<td>src_count</td>
</tr>
</tbody>
</table>

Type 2

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 1 0 0 1 0 1</td>
<td>T</td>
<td>dst</td>
<td>src</td>
<td>src_count</td>
</tr>
</tbody>
</table>

Word Fields

Type 1

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

Type 2

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

Operation

count = 7 LSBs of src_count
if (count ≥ 0)
src << count → dst

Else:
src >> | count| → dst
### Description

The seven least-significant bits of the `src_count` operand constitute the 2s-complement shift `count`.

If `count` is greater than 0, the `src` operand is left-shifted by the value of `count`. Low-order bits shifted in are zero-filled, and high-order bits are shifted out through the status register’s C (carry) bit.

Arithmetic left-shift:

![Arithmetic Left-Shift Diagram]

If `count` is less than 0, the `src` operand is right-shifted by the absolute value of `count` (e.g. –4 = right-shift 4). The high-order bits of the `src` operand are sign-extended as they are right-shifted. Low-order bits are shifted out through the C (carry) bit.

Arithmetic right-shift:

![Arithmetic Right-Shift Diagram]

If `count` is 0, no shift is performed, and the C (carry) bit is set to 0. The `src_count`, `src`, and `dst` operands are assumed to be signed integers.

### Status Bits

<table>
<thead>
<tr>
<th>Bit</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>1 if an integer overflow occurs, unchanged otherwise</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>MSB of the output</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>1 if an integer overflow occurs, 0 otherwise</td>
</tr>
<tr>
<td>C</td>
<td>Set to the value of the last bit shifted out. 0 for a shift <code>count</code> of 0</td>
</tr>
</tbody>
</table>

### Mode Bit

OVM operation is not affected by OVM bit value.

### Cycles

1

### Example

None
Parallel ASH₃ and STI

Syntax

ASH₃  \( \text{src\_count}, \text{src2}, \text{dst1} \)  
||  
STI  \( \text{src3}, \text{dst2} \)

\( \text{src2} \ll \text{count} \rightarrow \text{dst1} \)
Else:
\( \text{src2} \gg |\text{count}| \rightarrow \text{dst1} \)
||  \( \text{src3} \rightarrow \text{dst2} \)

Operands

\( \text{src\_count} \) register (R0 – R7)
\( \text{src2} \): indirect (disp = 0, 1, IR0, IR1)
\( \text{dst1} \): register (R0 – R7)
\( \text{src3} \): register (R0 – R7)
\( \text{dst2} \): indirect (disp = 0, 1, IR0, IR1)

Opcode

\[
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
1 & 1 & 0 & 1 & 0 & 1 & |\text{dst1}| & |\text{src\_count}| & |\text{src3}| & |\text{dst2}| & |\text{src2}| \\
\end{array}
\]

Word Fields

None

Operation

\( \text{count} = 7 \text{ LSBs of src\_count} \)

If \( \text{count} \geq 0 \):

Description

The seven least-significant bits of the src\_count operand register constitute the 2s-complement shift count of up to 32 bits.

If \( \text{count} \) is greater than 0, the dst operand is left-shifted by the value of count. Low-order bits shifted in are zero-filled, and high-order bits are shifted out through the C (carry) bit.

Arithmetic left-shift:

\[
\begin{array}{c}
\text{C} \\
\text{src2} \\
\downarrow \\
0
\end{array}
\]

If \( \text{count} \) is less than 0, the dst operand is right-shifted by the absolute value of count. The high-order bits of the dst operand are sign-extended as it is right-shifted. Low-order bits are shifted out through the C (carry) bit.

Arithmetic right-shift:

\[
\begin{array}{c}
\text{sign of src2} \\
\text{src2} \\
\downarrow \\
\text{C}
\end{array}
\]

If \( \text{count} \) is 0, no shift is performed, and the C (carry) bit is set to 0. The src\_count and dst operands are assumed to be signed integers.
ASH3||STI  Parallel ASH3 and STI

All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STI) reads from a register and the operation being performed in parallel (ASH3) writes to the same register, then STI accepts as input the contents of the register before it is modified by the ASH3. If src2 and dst2 point to the same location, src2 is read before the write to dst2.

**Status Bits**

<table>
<thead>
<tr>
<th>Bit</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>1 if an integer overflow occurs, unchanged otherwise</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>MSB of the output</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>1 if an integer overflow occurs, 0 otherwise</td>
</tr>
<tr>
<td>C</td>
<td>Set to the value of the last bit shifted out. 0 for a shift count of 0</td>
</tr>
</tbody>
</table>

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

ASH3 R1,*AR6++(IR1),R0 || STI R5,*AR2

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR6 80 9900h</td>
<td>AR6 80 998Ch</td>
</tr>
<tr>
<td>IR1 8Ch</td>
<td>IR1 8Ch</td>
</tr>
<tr>
<td>R1 0FFE8h</td>
<td>R1 0FFE8h</td>
</tr>
<tr>
<td>R0 0h</td>
<td>R0 0FFFF FFAEh</td>
</tr>
<tr>
<td>R5 35h</td>
<td>R5 35h</td>
</tr>
<tr>
<td>AR2 80 98A2h</td>
<td>AR2 80 98A2h</td>
</tr>
<tr>
<td>Data at 80 9900h</td>
<td>Data at 80 9900h</td>
</tr>
<tr>
<td>0AE00 0000h</td>
<td>0AE00 0000h</td>
</tr>
<tr>
<td>Data at 80 98A2h</td>
<td>Data at 80 98A2h</td>
</tr>
<tr>
<td>0h</td>
<td>0h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 1</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
Syntax

\texttt{B cond src}

Operands

\textit{src}: conditional-branch-addressing modes (B)

_opcode_

\begin{center}
\begin{tabular}{ccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0
\hline
0 & 1 & 1 & 0 & 1 & 0 & B & 0 & 0 & 0 & 0 & \textit{cond} & register or displacement
\end{tabular}
\end{center}

Word Fields

<table>
<thead>
<tr>
<th>B</th>
<th>\textit{src} addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>register</td>
</tr>
<tr>
<td>1</td>
<td>PC relative</td>
</tr>
</tbody>
</table>

Operation

If \textit{cond} is true:

If \textit{src} is in register-addressing mode (any register in CPU primary-register file),

\textit{src} \rightarrow \text{PC}.

If \textit{src} is in PC-relative mode (label or address), \textit{displacement} + \text{PC} + 1 \rightarrow \text{PC}.

Else, continue.

Description

\texttt{B cond} signifies a standard branch that executes in four cycles. A branch is performed if the condition is true (since a pipeline flush also occurs on a true condition; see Section 8.2 on page 8-4). If the \textit{src} operand is expressed in register-addressing mode, the contents of the specified register are loaded into the PC. If the \textit{src} operand is expressed in PC-relative mode, the assembler generates a displacement: \textit{displacement} = \text{label} – (PC of branch instruction + 1). This displacement is stored as a 16-bit signed integer in the 16 least-significant bits of the branch instruction word. This displacement is added to the PC of the branch instruction plus 1 to generate the new PC.

The 'C4x provides 20 condition codes that can be used with this instruction (see Section 14.2 for a list of condition mnemonics, encoding, and flags).

Status Bits

<table>
<thead>
<tr>
<th>LUF</th>
<th>Unaffected</th>
</tr>
</thead>
<tbody>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit

\texttt{OVM} operation is not affected by OVM bit value.

Cycles

4 (Regardless of whether or not the branch is taken)
### Bcond  Branch Conditionally (Standard)

#### Example

<table>
<thead>
<tr>
<th>BZ R0</th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>PC</td>
<td>2B00h</td>
<td>PC 3FF00h</td>
</tr>
<tr>
<td>R0</td>
<td>0003 FF00h</td>
<td>R0 0003 FF00h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z</td>
<td>1</td>
<td>Z 1</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
<td>V 0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
<td>C 0</td>
</tr>
</tbody>
</table>

**Note:**

If a BZ instruction is executed immediately following a RND instruction with a zero operand, the branch is not performed, because the zero flag is not set. To circumvent this problem, execute a BZUF instead of a BZ instruction.
Branch Conditionally Delayed and Annul If False  **BcondAF**

**Syntax**

\[ B\text{condAF} \ src \]

**Operands**

\( src: \) conditional-branch-addressing modes

**Opcode**

\[
\begin{array}{cccccc}
31 & 24 & 23 & 16 & 8 & 7 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 & B \\
0 & 1 & 0 & 1 & & & \text{cond} \\
& & & & & \text{register or displacement}
\end{array}
\]

**Word Fields**

<table>
<thead>
<tr>
<th>B</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>register</td>
</tr>
<tr>
<td>1</td>
<td>PC relative</td>
</tr>
</tbody>
</table>

**Operation**

If \((\text{cond} \text{ is true})\)

- If \((\text{src} \text{ is a register})\)
  \(\text{src} \rightarrow \text{PC}\)
- If \((\text{src} \text{ is in PC-relative mode})\)
  \(\text{displacement} + \text{PC of branch} + 3 \rightarrow \text{PC}\)

Else:

If \((\text{cond} \text{ is false})\)

annul the effect of the execute phase of the first following instruction and the effect of the read and execute phases of the second and third following instructions and continue.

**Description**

If the condition is true, a branch and the three instructions following the branch instruction are executed. If the condition is false, no branch is performed, and the effect of the execute phase of the first following instruction and of the read and execute phases of the second and third following instructions is annulled. The three instructions following BcondAF do not affect the \(\text{cond}\). If the \(\text{src}\) operand is in register mode, then the contents of the specified register are loaded into the PC. If the \(\text{src}\) operand is in PC-relative mode, then the sum of the PC of the branch instruction + 3 and the \(\text{displacement}\) is loaded into the PC. In PC-relative mode the \(\text{displacement}\) field is interpreted as a 16-bit signed integer.

None of the three instructions following the BcondAF can be an instruction that modifies the program flow. Interrupts are disabled for the duration of the BcondAF instruction.

BcondAF especially is useful for controlling the exit at the bottom of a loop. Use caution when using instructions such as PUSH/POP, LDPK, or LDA that can modify registers like AR\(n\), SP, and DP in the decode and/or read phase. This also applies when using instructions to perform indirect addressing with AR\(n\) modification.
BcondAF  Branch Conditionally Delayed and Annul If False

Status Bits

<table>
<thead>
<tr>
<th>Bit</th>
<th>Status</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1

Example

None
Branch Conditionally Delayed and Annul If True  \textbf{BcondAT}

**Syntax**

\textbf{BcondAT} \textit{src}

**Operands**

\textit{src} conditional-branch-addressing modes

**Opcode**

\begin{center}
\begin{tabular}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 1 & 1 & 0 & 0 & B & 0 & 0 & 1 & 1 & \textit{cond} & \textit{register or displacement}
\end{tabular}
\end{center}

**Word Fields**

\begin{center}
\begin{tabular}{cc}
\textit{B} & \textit{src} addressing modes \\
0 & register \\
1 & PC relative
\end{tabular}
\end{center}

**Operation**

If (\textit{cond} is true)
If (\textit{src} is a register)
\[ \textit{src} \rightarrow \text{PC} \]
annul the effect of the execute phase of the first following instruction and of the read and execute phases of second and third following instructions.
If (\textit{src} is in PC-relative mode)
\[ \text{displacement} + \text{PC of branch} + 3 \rightarrow \text{PC} \]
annul the effect of the execute phase of the first following instruction and of the read and execute phases of second and third following instructions.
Else, continue.

**Description**

If the condition is true, a branch is performed, and the effect of the execute phase of the first following instruction and of the read and execute phases of second and third following instructions is annulled. The three instructions following \textbf{BcondAT} do not affect the \textit{cond}. If the \textit{src} operand is expressed in register mode, then the contents of the specified register are loaded into the PC. If the \textit{src} operand is in PC-relative mode, then the sum of the PC of the branch instruction + 3 and the \textit{displacement} are loaded into the PC. In PC-relative mode, the \textit{displacement} field is interpreted as a 16-bit signed integer.

None of the three instructions following \textbf{BcondAT} can be an instruction that modifies the program flow. Interrupts are disabled for the duration of \textbf{BcondAT}.

The \textbf{BcondAT} instruction does not annul the status signals at the external interfaces. Be especially careful when using instructions such as \textbf{PUSH/POP}, \textbf{LDPK}, or \textbf{LDA} that can modify registers like ARn, SP, and DP in the decode and/or read phase. This also applies when you use instructions to perform indirect addressing with ARn modification. \textbf{BcondAT} particularly is useful for controlling the entry at the top of the loop.
**BcondAT**  *Branch Conditionally Delayed and Annul If True*

<table>
<thead>
<tr>
<th>Status Bits</th>
<th>LUF</th>
<th>Unaffected</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

**Mode Bit**  
OVM operation is not affected by OVM bit value.

**Cycles**  
1

**Example**  
None
Branch Conditionally (Delayed)  BcondD

Syntax  
BcondD src

Operands  
src: conditional-branch-addressing modes (B)

Opcode  

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>cond</td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>B</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>register</td>
</tr>
<tr>
<td>1</td>
<td>PC relative</td>
</tr>
</tbody>
</table>

Operation  
If cond is true:
If src is in register-addressing mode (any register in CPU primary-register file)  
src → PC.
If src in PC-relative mode (label or address), displacement + PC + 3 → PC. Else, continue.

Description  
BcondD signifies a delayed branch, allowing the three instructions after the delayed branch to be performed before the PC is modified. The effect is a single-cycle branch, and the three instructions following BcondD do not affect the cond.

None of the three instructions following BcondD should be an instruction that modifies program flow. Interrupts are disabled for the duration of BcondD.

A branch is performed if the condition is true. If the src operand is expressed in register-addressing mode, the contents of the specified register are loaded into the PC. If the src operand is expressed in PC-relative mode, the assembler generates a displacement: displacement = src – (PC of branch instruction + 3). This displacement is stored as a 16-bit signed integer in the 16 least-significant bits of the branch instruction. This displacement is added to the PC of the branch instruction plus 3 to generate the new PC. The 'C4x provides 20 condition codes that can be used with this instruction (see Section 14.2 for a list of condition mnemonics, encoding, and flags).

Status Bits  

<table>
<thead>
<tr>
<th>Bit</th>
<th>Status</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>
### BcondD  *Branch Conditionally (Delayed)*

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

BNZD 36 (36 = 24h)

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>PC 50h</td>
<td>PC 77h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 1</td>
<td>Z 1</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
Syntax  
\[\text{BR} \ src\]

Operands  
\(src\): in PC-relative mode

Opcode  
\[
\begin{array}{cccccc}
31 & 24 & 23 & 16 & 15 & 8 \ 7 \ 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & \text{displacement}
\end{array}
\]

Word Fields  
None

Operation  
\(\text{PC} + 1 + \text{displacement} \rightarrow \text{PC}\)

Description  
Performs an unconditional branch. The assembler generates a displacement: \(\text{displacement} = src - (\text{PC of branch instruction} + 1)\). This displacement is stored as a 24-bit signed integer in the 24 least-significant bits of the branch instruction. This displacement is added to the PC of the branch instruction plus 1 to generate the new PC.

Status Bits  
- \(\text{LUF}\): Unaffected
- \(\text{LV}\): Unaffected
- \(\text{UF}\): Unaffected
- \(\text{N}\): Unaffected
- \(\text{Z}\): Unaffected
- \(\text{V}\): Unaffected
- \(\text{C}\): Unaffected

Mode Bit  
\(\text{OVM}\) operation is not affected by OVM bit value.

Cycles  
4

Example  
None
BRD  Branch Unconditionally (Delayed)

Syntax  BRD src

Operands  src: in PC-relative mode

Opcode

```
  31 24 23 16 15 8 7 0
  0 1 1 0 0 0 0 1
```

Word Fields  None

Operation  PC + 3 + displacement → PC

Description  Performs an unconditional delayed branch. The assembler generates a displacement: $\text{displacement} = \text{src} - (\text{PC of branch instruction} + 3)$. This displacement is stored as a 24-bit signed integer in the 24 least significant bits of the branch instruction. This displacement is added to the PC of the branch instruction plus 3 to generate the new PC.Interrupts are disabled during the BRD instruction.

The three instructions following the BRD instruction are fetched and executed. None of these three instructions should modify the program flow (e.g., affect the PC value).

Status Bits  

<table>
<thead>
<tr>
<th>Bit</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit  OVM operation is not affected by OVM bit value.

Cycles  1

Example  None
**Syntax**

CALL src

**Operands**

src: in PC-relative mode

** Opcode**

```
<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 1 1 0 0 0 1</td>
<td>0</td>
<td>displacement</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
```

**Word Fields**

None

**Operation**

Next PC → *(++SP)
PC + 1 + displacement → PC

**Description**

Performs a call. The next PC value is pushed onto the system stack. The assembler generates a displacement: displacement = src – (PC of branch instruction + 1). This displacement is stored as a 24-bit signed integer in the 24 least significant bits of the branch instruction. This displacement is added to the PC of the branch instruction plus 1 to generate the new PC.

**Status Bits**

LUF  Unaffected
LV   Unaffected
UF   Unaffected
N    Unaffected
Z    Unaffected
V    Unaffected
C    Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

4

**Example**

None
**CALLcond**  
*Call Subroutine Conditionally*

**Syntax**  
\[
\text{CALLcond src}
\]

**Operands**  
\(src\): conditional-branch-addressing modes (B)

**Opcode**

|   |   |   |   |   | B | 0 0 0 0 | cond | register or displacement |
|---|---|---|---|---|---|-------|--------------------------|
| 31 | 24 | 23 | 16 | 15 | 8 | 7 | 0 |

**Word Fields**

<table>
<thead>
<tr>
<th>B</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>register</td>
</tr>
<tr>
<td>1</td>
<td>PC relative</td>
</tr>
</tbody>
</table>

**Operation**

If \(cond\) is true:
- Next PC \(\rightarrow *++\text{SP}\)
- If \(src\) is in register-addressing mode (any register in CPU primary-register file), \(src \rightarrow \text{PC}\).
- If \(src\) in PC-relative mode (label or address), \(\text{displacement} + \text{PC} + 1 \rightarrow \text{PC}\).

Else, continue.

**Description**

A call is performed if the condition is true. If the condition is true, the next PC value is pushed onto the system stack. If the \(src\) operand is expressed in register-addressing mode, the contents of the specified register are loaded into the PC. If the \(src\) operand is expressed in PC-relative mode, the assembler generates a displacement: \(\text{displacement} = \text{label} - (\text{PC of call instruction} + 1)\). This displacement is stored as a 16-bit signed integer in the 16 least-significant bits of the call instruction word. This displacement is added to the PC of the call instruction plus 1 to generate the new PC.

The ’C4x provides 20 condition codes that can be used with this instruction (see Section 14.2 for a list of condition mnemonics, encoding, and flags).

**Status Bits**

- LUF: Unaffected
- LV: Unaffected
- UF: Unaffected
- N: Unaffected
- Z: Unaffected
- V: Unaffected
- C: Unaffected

**Mode Bit**

- OVM operation is not affected by OVM bit value.

**Cycles**

5 (Regardless of whether the condition is true or not)
**Example**

CALLNZ R5

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>PC</strong></td>
<td>123h</td>
</tr>
<tr>
<td><strong>SP</strong></td>
<td>80 9835h</td>
</tr>
<tr>
<td><strong>R5</strong></td>
<td>789h</td>
</tr>
<tr>
<td><strong>PC</strong></td>
<td>789h</td>
</tr>
<tr>
<td><strong>SP</strong></td>
<td>80 9836h</td>
</tr>
<tr>
<td><strong>R5</strong></td>
<td>789h</td>
</tr>
</tbody>
</table>

Data at 9836h: 124h

LUF: 0
LV: 0
UF: 0
N: 0
Z: 0
V: 0
C: 0
CMPF  Compare Floating-Point Values

Syntax  
\textbf{CMPF} \textit{src, dst}

Operands  
\textit{src}: general-addressing modes (G):  
\textit{dst}: register (R0 – R11)

Opcode  
\begin{tabular}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
\hline
\end{tabular}

G  src addressing modes

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (R0 – R11)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation  
\textit{dst} – \textit{src}

Description  
The \textit{src} operand is subtracted from the \textit{dst} operand. The result is not loaded into any register; this allows for nondestructive compares. The \textit{dst} and \textit{src} operands are assumed to be floating-point numbers.

Status Bits

- \textbf{LUF} 1 if a floating-point underflow occurs, unchanged otherwise
- \textbf{LV} 1 if a floating-point overflow occurs, unchanged otherwise
- \textbf{UF} 1 if a floating-point underflow occurs, 0 otherwise
- \textbf{N} 1 if a negative result is generated, 0 otherwise
- \textbf{Z} 1 if a zero result is generated, 0 otherwise
- \textbf{V} 1 if a floating-point overflow occurs, 0 otherwise
- \textbf{C} Unaffected

Mode Bit

\textbf{OVM} operation is not affected by OVM bit value.

Cycles  
1
### Example

CMPF *+AR4, R6

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR4 80 98F2h</td>
<td>AR4 80 98F2h</td>
</tr>
<tr>
<td>R6 070C80 0000h</td>
<td>R6 070C80 0000h</td>
</tr>
<tr>
<td>Data at 80 98F3h</td>
<td>Data at 80 98F3h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 1</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
CMPF3 Compare Floating-Point Values, 3 Operands

Syntax

CMPF3 src2, src1

Operands

src1 – src2 type 1 or type 2 three-operand addressing modes

Opcode

Type 1

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 0 0 0 1 1 0</td>
<td>T</td>
<td>0 0 0 0 0</td>
<td>src1</td>
<td>src2</td>
</tr>
</tbody>
</table>

Type 2

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 1 0 0 1 1 0</td>
<td>T</td>
<td>0 0 0 0 0</td>
<td>src1</td>
<td>src2</td>
</tr>
</tbody>
</table>

Word Fields

Type 1

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (R0 — R11)</td>
<td>register mode (R0 — R11)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (R0 — R11)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (R0 — R11)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

Type 2

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>01</td>
<td>register mode (R11—R0)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

Operation

src1 – src2

Description

The src2 operand is subtracted from the src1 operand. The result is not loaded into any register. This allows for nondestructive compares. The src1 and src2 operands are assumed to be floating-point numbers.

Status Bits

| LUF | 1 if a floating-point underflow occurs, unchanged otherwise |
| LV  | 1 if a floating-point overflow occurs, unchanged otherwise |
| UF  | 1 if a floating-point underflow occurs, 0 otherwise |
| N   | 1 if a negative result is generated, 0 otherwise |
| Z   | 1 if a zero result is generated, 0 otherwise |
| V   | 1 if a floating-point overflow occurs, 0 otherwise |
| C   | Unaffected |

Mode Bit

OVM operation is not affected by OVM bit value.
### Compare Floating-Point Values, 3 Operands

**CMPF3**

<table>
<thead>
<tr>
<th>Cycles</th>
<th>1</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Example</strong></td>
<td>CMPF3 *AR2, *AR3–(1)</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR2 809831h</td>
<td>AR2 809831h</td>
</tr>
<tr>
<td>AR3 809852h</td>
<td>AR3 809851h (decrement)</td>
</tr>
<tr>
<td>Data at 809831h 77A7000h 2.5044e + 02</td>
<td>Data at 809831h 77A7000h 2.5044e + 02</td>
</tr>
<tr>
<td>Data at 809852h 57A2000h 6.253125e + 01</td>
<td>Data at 809852h 57A2000h 6.253125e + 01</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 1</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
**CMPI  Compare Integer**

**Syntax**  
**CMPI** src, dst

**Operands**  
src: general-addressing modes  
dst: register (any register in CPU primary-register file)

**Opcode**  
<table>
<thead>
<tr>
<th>31</th>
<th>24 23 16 15</th>
<th>8 7 0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 0</td>
<td>1 0 0 1</td>
<td>G</td>
</tr>
</tbody>
</table>

**Word Fields**

<table>
<thead>
<tr>
<th>G src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
</tr>
<tr>
<td>01</td>
</tr>
<tr>
<td>10</td>
</tr>
<tr>
<td>11</td>
</tr>
</tbody>
</table>

**Operation**  
dst – src

**Description**  
The src operand is subtracted from the dst operand. The result is not loaded into any register; this allows for nondestructive compares. The dst and src operands are assumed to be signed integers.

**Status Bits**

<table>
<thead>
<tr>
<th>LUF</th>
<th>Unaffected</th>
</tr>
</thead>
<tbody>
<tr>
<td>LV</td>
<td>1 if an integer overflow occurs, unchanged otherwise</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>1 if a negative result is generated, 0 otherwise</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>1 if an integer overflow occurs, 0 otherwise</td>
</tr>
<tr>
<td>C</td>
<td>1 if a borrow occurs, 0 otherwise</td>
</tr>
</tbody>
</table>

**Mode Bit**  
OVM operation is not affected by OVM bit value.

**Cycles**  
1
**Example**

CMPI R3, R7

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R3 898h 2200</td>
<td>R3 898h 2200</td>
</tr>
<tr>
<td>R7 3E8h 1000</td>
<td>R7 3E8h 1000</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 1</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
CMPI3  Compare Integer, 3 Operands

Syntax
CMPI3 src2, src1

Operands
src1 – src2 type 1 or type 2 three-operand addressing modes

Opcode

Type 1

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0 0</td>
<td>0 0</td>
<td>0 1</td>
<td>1 1</td>
<td>T</td>
</tr>
</tbody>
</table>

src1 | src2

Type 2

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0 1</td>
<td>0 1</td>
<td>0 1</td>
<td>1 1</td>
<td>T</td>
</tr>
</tbody>
</table>

src1 | src2

Word Fields

Type 1

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

Type 2

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

Operation
src1 – src2

Description
The src2 operand is subtracted from the src1 operand. The result is not loaded into any register. This allows for nondestructive compares. The src1 and src2 operands are assumed to be signed integers. Although this instruction has only two operands, it is designated as a three-operand instruction because operands are specified in the three-operand format.
### Compare Integer, 3 Operands \textbf{CMPI3}

<table>
<thead>
<tr>
<th>Status Bits</th>
<th>LUF</th>
<th>Unaffected</th>
</tr>
</thead>
<tbody>
<tr>
<td>LV</td>
<td>1</td>
<td>if an integer overflow occurs, unchanged otherwise</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>N</td>
<td>1</td>
<td>if a negative result is generated, 0 otherwise</td>
</tr>
<tr>
<td>Z</td>
<td>1</td>
<td>if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>1</td>
<td>if an integer overflow occurs, 0 otherwise</td>
</tr>
<tr>
<td>C</td>
<td>1</td>
<td>if a borrow occurs, 0 otherwise</td>
</tr>
</tbody>
</table>

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
DBcond  
Decrement and Branch Conditionally (Standard)

Syntax  
\text{DBcond} \ ARn, \ src

Operands  
\text{src}: \ conditional-branch-addressing \ modes \ (B)  
\text{ARn}: \ auxiliary \ register

Opcode  
\begin{array}{cccccc}
31 & 24 & 23 & 16 & 15 & 8 \ 0 \\
0 & 1 & 1 & 0 & 1 & 1 & B \ ARn & 0 & \text{cond} \end{array}

register \ or \ displacement

Word Fields  
\begin{array}{ll}
\text{B} & \text{src addressing modes} \\
0 & \text{register} \\
1 & \text{PC relative}
\end{array}

Operation  
\text{ARn} - 1 \rightarrow \text{ARn}

If \text{cond} \ is \ true \ and \ \text{ARn} \geq 0:

\begin{itemize}
  \item If \text{src} \ is \ in \ register-addressing \ mode \ (any \ register \ in \ CPU \ primary-register \ file), \ \text{src} \rightarrow \text{PC}.
  \item If \text{src} \ in \ PC-relative \ mode \ (label \ or \ address), \ \text{displacement} + \text{PC} + 1 \rightarrow \text{PC}.
\end{itemize}
Else, continue.

Description  
DBcond \ signifies \ a \ standard \ branch \ that \ executes \ in \ four \ cycles \ because \ the \ pipeline \ must \ be \ flushed \ if \ \text{cond} \ is \ true. \ If \ the \ condition \ is \ true \ and \ the \ specified \ auxiliary \ register \ is \ greater \ than \ or \ equal \ to \ 0, \ the \ specified \ auxiliary \ register \ is \ decremented \ and \ a \ branch \ is \ performed.

The auxiliary register is treated as a 32-bit signed integer. Note that the branch condition does not depend on the auxiliary register decrement.

If the \text{src} \ operand \ is \ expressed \ in \ register-addressing \ mode, \ the \ contents \ of \ the \ specified \ register \ are \ loaded \ into \ the \ PC. \ If \ the \ \text{src} \ operand \ is \ expressed \ in \ PC-relative \ addressing \ mode, \ the \ assembler \ generates \ a \ displacement: \ \text{displacement} = \text{label} - (\text{PC of branch instruction} + 1). \ This \ integer \ is \ stored \ as \ a 16-bit \ signed \ integer \ in \ the 16 \ least-significant \ bits \ of \ the \ branch \ instruction \ word. \ This \ displacement \ is \ added \ to \ the \ PC \ of \ the \ branch \ instruction \ plus \ 1 \ to \ generate \ the \ new \ PC.

The ‘C4x provides 20 condition codes that can be used with this instruction (see Section 11.2 for a list of condition mnemonics, encoding, and flags).
Decrement and Branch Conditionally (Standard)  DBcond

**Status Bits**
- LUF  Unaffected
- LV  Unaffected
- UF  Unaffected
- N  Unaffected
- Z  Unaffected
- V  Unaffected
- C  Unaffected

**Mode Bit**
- OVM operation is not affected by OVM bit value.

**Cycles**
- 4

**Example**
- DBLT AR3,R2

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>PC 5Fh</td>
<td>PC 9Fh</td>
</tr>
<tr>
<td>AR3 12h</td>
<td>AR3 11h</td>
</tr>
<tr>
<td>R2 9Fh</td>
<td>R2 9Fh</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 1</td>
<td>N 1</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
DBcondD  
Decrement and Branch Conditionally (Delayed)

Syntax

\[ \text{DBcondD} \ ARn, \ src \]

Operands

- **src**: conditional-branch-addressing modes (B)
- **ARn**: auxiliary register

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>B</td>
<td>ARn</td>
<td>1</td>
</tr>
</tbody>
</table>

\[ \text{cond} \]

- `B`: register or displacement

Word Fields

<table>
<thead>
<tr>
<th>B</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>register</td>
</tr>
<tr>
<td>1</td>
<td>PC relative</td>
</tr>
</tbody>
</table>

Operation

\[ ARn - 1 \rightarrow ARn \]

If `cond` is true and \( ARn \geq 0 \):

- If `src` is in register addressing mode (any register in CPU primary-register file),
  \[ src \rightarrow PC \]
- If `src` is in PC-relative mode (label or address) \[ \text{displacement} + \text{PC} + 3 \rightarrow \text{PC}. \]

Else, continue.

Description

DBcondD signifies a delayed branch that allows the three instructions after the delayed branch to be fetched before the PC is modified. The effect is a single-cycle branch. If the condition is true and the specified auxiliary register is greater than or equal to zero, the specified auxiliary register is decremented and a branch is performed. (The three instructions following the DBcondD must not affect the `cond`).

The auxiliary register is treated as a 32-bit signed integer. None of the three instructions following DBcondD should modify the program flow. Interrupts are disabled for the duration of the DBcondD instruction. Note that the branch condition does not depend on the auxiliary register decrement.

If the `src` operand is expressed in register addressing mode, the contents of the specified register are loaded into the PC. If the `src` is expressed in PC-relative addressing, the assembler generates a displacement: \[ \text{displacement} = \text{label} - (\text{PC of branch instruction} + 3). \] This displacement is added to the PC of the branch instruction plus 3 to generate the new PC. Note that bit 21 = 1 for a delayed branch.

The 'C4x provides 20 condition codes that can be used with this instruction (see Section 14.2 for a list of condition mnemonics, encoding, and flags).
Decrement and Branch Conditionally (Delayed)  DBcondD

Status Bits
- LUF  Unaffected
- LV   Unaffected
- UF   Unaffected
- N    Unaffected
- Z    Unaffected
- V    Unaffected
- C    Unaffected

Mode Bit  OVM operation is not affected by OVM bit value.

Cycles  1

Example  DBZD AR5, $+110h

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>PC</td>
<td>0h</td>
</tr>
<tr>
<td>AR5</td>
<td>67h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>1</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
**FIX Floating-Point to Integer Conversion**

**Syntax**

`FIX src, dst`

**Operands**

`src`: general-addressing modes (G)

`dst`: register (any register in CPU primary-register file)

**Opcode**

```
  31 24 23 16 15 8 7 0
  0 0 0 0 1 1 0 1
  dst src
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (R0 – R11)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

`fix(src) → dst`

**Description**

The floating-point operand `src` is converted to the nearest integer less than or equal to it in value, and the result is loaded into the `dst` register. The `src` operand is assumed to be a floating-point number and the `dst` operand is assumed to be a signed integer.

The exponent field of the result register (if it has one) is not modified.

Integer overflow occurs when the floating-point number is too large to be represented as a 32-bit 2s-complement integer. In the case of integer overflow, the result is saturated in the direction of overflow.

**Status Bits**

If ST (SET COND) = 0 and the destination register is R0–R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- LUF: Unaffected
- LV: 1 if an integer overflow occurs, unchanged otherwise
- UF: 0
- N: 1 if a negative result is generated, 0 otherwise
- Z: 1 if a zero result is generated, 0 otherwise
- V: 1 if an integer overflow occurs, 0 otherwise
- C: Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1
**Example**

```plaintext
FIX R1, R2  

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R1</td>
<td>0A2820 0000h</td>
</tr>
<tr>
<td></td>
<td>1.3454e + 3</td>
</tr>
<tr>
<td>R2</td>
<td>0h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
```

**Floating-Point to Integer Conversion**

**FIX**
**FIX||STI**  *Parallel FIX and STI*

### Syntax

```
FIX  src2, dst1
|| STI src3, dst2
```

### Operands

- **src2**: indirect (disp = 0, 1, IR0, IR1)
- **dst1**: register (R0 – R7)
- **src3**: register (R0 – R7)
- **dst2**: indirect (disp = 0, 1, IR0, IR1)

### Opcode

```
   31  24  23  16  15  8  7  0
   1  1  0  1  0  1  0     dst1     0  0  0     src3     dst2     src2
```

### Word Fields

None

### Operation

- `fix(src2) → dst1`
- `|| src3 → dst2`

### Description

A floating-point-to-integer conversion is performed. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STI) reads from a register and the operation being performed in parallel (FIX) writes to the same register, then STI accepts as input the contents of the register before it is modified by FIX.

If **src2** and **dst2** point to the same location, **src2** is read before the write to **dst2**.

Integer overflow occurs when the floating-point number is too large to be represented as a 32-bit 2’s-complement integer. In the case of integer overflow, the result is saturated in the direction of overflow.

### Status Bits

- **LUF**: Unaffected
- **LV**: 1 if an integer overflow occurs, unchanged otherwise
- **UF**: 0
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an integer overflow occurs, 0 otherwise
- **C**: Unaffected

### Mode Bit

OVM operation is not affected by OVM bit value.

### Cycles

1

---

14-92
### Example

```
FIX  ***AR4(1),R1
||
STI  R0,*AR2
```

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR4  80 98A2h</td>
<td>AR4  80 98A3h</td>
</tr>
<tr>
<td>R1    0h 66</td>
<td>R1    0B3h 179</td>
</tr>
<tr>
<td>R0    0DCh 220</td>
<td>R0    0DCh 220</td>
</tr>
<tr>
<td>AR2   80 983Ch</td>
<td>AR2   80 983Ch</td>
</tr>
<tr>
<td>Data at 80 98A3h</td>
<td>Data at 80 98A3h</td>
</tr>
<tr>
<td>733 C000h 1.79750e + 02</td>
<td>733 C000h 1.79750e + 02</td>
</tr>
<tr>
<td>Data at 80 983Ch</td>
<td>Data at 80 983Ch</td>
</tr>
<tr>
<td>0h 2 0DCh 220</td>
<td>LUF   0</td>
</tr>
<tr>
<td>LV    0</td>
<td>LV    0</td>
</tr>
<tr>
<td>UF    0</td>
<td>UF    0</td>
</tr>
<tr>
<td>N     0</td>
<td>N     0</td>
</tr>
<tr>
<td>Z     0</td>
<td>Z     0</td>
</tr>
<tr>
<td>V     0</td>
<td>V     0</td>
</tr>
<tr>
<td>C     0</td>
<td>C     0</td>
</tr>
</tbody>
</table>
FLOAT  Integer to Floating-Point Conversion

Syntax  
FLOAT src, dst

Operands  
src:  general-addressing modes (G)  
dst:  register (R0 – R11)

Opcode  

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0  0</td>
<td>1  1</td>
<td>G dst src</td>
</tr>
</tbody>
</table>

Word Fields  

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation  
float (src) → dst

Description  
The integer operand src is converted to the floating-point value equal to it, and the result loaded into the dst register. The src operand is assumed to be a signed integer, and the dst operand is assumed to be a floating-point number.

Status Bits  
LUF  Unaffected  
LV   Unaffected  
UF   0  
N    1 if a negative result is generated, 0 otherwise  
Z    1 if a zero result is generated, 0 otherwise  
V    0  
C    Unaffected

Mode Bit  
OVM operation is not affected by OVM bit value.

Cycles  
1
### Example

FLOAT *++AR2 (2), R5

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR2 80 9800h</td>
<td>AR2 80 9802h</td>
</tr>
<tr>
<td>R5 034C 2000h</td>
<td>R5 072E0 0000h</td>
</tr>
<tr>
<td>Data at 80 9802h</td>
<td>Data at 80 9802h</td>
</tr>
<tr>
<td></td>
<td>0AEh 174</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
FLOAT||STF  Parallel FLOAT and STF

Syntax  
FLOAT src2, dst1  
||  STF src3, dst2

Operands  
src2: indirect (disp = 0, 1, IR0, IR1)  
dst1: register (R0 – R7)  
src3: register (R0 – R7)  
dst2: register (disp = 0, 1, IR0, IR1)

Opcode  

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>dst1</td>
<td>0</td>
</tr>
</tbody>
</table>

Operation  
float(src2) → dst1  
||  src3 → dst2

Description  
An integer-to-floating-point conversion is performed. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STF) reads from a register and the operation being performed in parallel (FLOAT) writes to the same register, then STF accepts as input the contents of the register before it is modified by FLOAT.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Status Bits  
LUF  Unaffected  
LV  Unaffected  
UF  0  
N  1 if a negative result is generated, 0 otherwise  
Z  1 if a zero result is generated, 0 otherwise  
V  0  
C  Unaffected

Mode Bit  
OVM operation is affected by OVM bit value.

Cycles  
1
### Example

**Before Instruction**

<table>
<thead>
<tr>
<th>Register</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR2</td>
<td>80 98C5h</td>
</tr>
<tr>
<td>IR0</td>
<td>8h</td>
</tr>
<tr>
<td>R6</td>
<td>0h</td>
</tr>
<tr>
<td>R7</td>
<td>034C20 0000h 1.27578125e + 01</td>
</tr>
<tr>
<td>AR1</td>
<td>80 9933h</td>
</tr>
<tr>
<td>Data at 80 98CDh</td>
<td>0AEh</td>
</tr>
<tr>
<td>Data at 80 9933h</td>
<td>0h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>

**After Instruction**

<table>
<thead>
<tr>
<th>Register</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR2</td>
<td>80 98C5h</td>
</tr>
<tr>
<td>IR0</td>
<td>8h</td>
</tr>
<tr>
<td>R6</td>
<td>072E00 0000h 1.740e + 02</td>
</tr>
<tr>
<td>R7</td>
<td>034C20 0000h 1.27578125e + 01</td>
</tr>
<tr>
<td>AR1</td>
<td>80 9933h</td>
</tr>
<tr>
<td>Data at 80 98CDh</td>
<td>0AEh</td>
</tr>
<tr>
<td>Data at 80 9933h</td>
<td>0AEh 174</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
Syntax: \texttt{FRIEEE \textit{src}, \textit{dst}}

Operands:
- \textit{src}: direct- or indirect-addressing modes
- \textit{dst}: extended-precision register (R0 – R11)

Opcode:

\begin{center}
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 \\
\end{array}
\end{center}

- G \hspace{1cm} src \\
- dst \hspace{1cm}

Word Fields:

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
</tbody>
</table>

Operation: convert \textit{src} from IEEE format \rightarrow \textit{dst}

Description:
The \textit{src} operand is converted from the IEEE floating-point format to the 2s-complement floating-point format.

The \textit{src} operand comes from memory. The converted result goes into an extended precision register as a single-precision floating-point number.

Status Bits:
- LUF: Unaffected
- LV: Set if overflow, otherwise unchanged
- UF: 0
- N: Sign of the result
- Z: 1 if result is 0, 0 otherwise
- V: 1 if overflow, 0 otherwise
- C: Unaffected

Mode Bit:
- OVM: operation is not affected by OVM bit value.

Cycles: 1

Example: None
**Syntax**

FRIEEE  src2, dst1
|| STF  src3, dst2

**Operands**

- src2: indirect mode (disp = 0, 1, IR0, IR1)
- dst1: register mode (R0 – R7)
- src3: register mode (R0 – R7)
- dst2: indirect mode (disp = 0, 1, IR0, IR1)

**Opcode**

<table>
<thead>
<tr>
<th></th>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>1 1 1 1 0 0 1</td>
<td>dst1 0 0 0</td>
<td>src3</td>
<td>dst2</td>
<td>src2</td>
</tr>
</tbody>
</table>

**Operation**

convert src2 from IEEE format → dst1 in parallel with src3 → dst2

**Description**

The src2 operand is converted from the IEEE floating-point format to the 2s-complement format. The converted result goes into an extended-precision register dst1 as a single-precision floating-point number.

A floating-point store is done in parallel.

If src2 and dst2 point to the same location, then src2 is read before the write to dst2.

**Status Bits**

- **LUF**: Unaffected
- **LV**: Set if overflow, otherwise unchanged
- **UF**: 0
- **N**: Sign of the result
- **Z**: 1 if result is 0, 0 otherwise
- **V**: 1 if overflow, 0 otherwise
- **C**: Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
**IACK**  *Interrupt Acknowledge*

**Syntax**

\[ \text{IACK } \text{src} \]

**Operation**

\( \text{src: general-addressing modes (G)} \)

**Opcode**

```
 31 24 23 16 15 8 7 0
0 0 0 1 1 1 0 G 0 0 0 0 0 src
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
</tbody>
</table>

**Operation**

Perform a dummy-read operation with IACK = 0.
At end of dummy read, set IACK to 1.

**Description**

A dummy-read operation at address pointed by src is performed with IACK = 0.
At the end of the dummy read, IACK is set to 1 if off-chip memory is specified. This instruction can be used to generate an external-interrupt acknowledge. The IACK signal and the address can then be used to signal interrupt acknowledge to external devices. The data read by the processor is unused. Note that the IACK signal is extended with multicycle reads.

**Status Bits**

<table>
<thead>
<tr>
<th>Bit</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1
Example

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>IACK *AR5</td>
<td></td>
<td></td>
</tr>
<tr>
<td>IACK</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>PC</td>
<td>300h</td>
<td>301h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>
**IDLE**  *Idle Until Interrupt*

**Syntax**
IDLE

**Operands**
None

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0</td>
<td>0 0</td>
<td>1 1</td>
<td>0 0</td>
<td>0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0</td>
</tr>
</tbody>
</table>

**Word Fields**
None

**Operation**
1 → ST(GIE)
Next PC → PC
Idle until interrupt

**Description**
The global-interrupt enable bit is set, the next PC value is loaded into the PC, and the CPU idles until an unmasked interrupt is received. When the interrupt is received, the contents of the PC are pushed onto the active system stack, and the processor jumps to execute the interrupt service routine.

**Status Bits**

- **LUF**  Unaffected
- **LV**   Unaffected
- **UF**   Unaffected
- **N**    Unaffected
- **Z**    Unaffected
- **V**    Unaffected
- **C**    Unaffected

**Mode Bit**
**OVM** operation is not affected by OVM bit value.

**Cycles**
1

**Example**
None
Idle Until Interrupt 2  

**Syntax**  

IDLE2  

('C40 revision ≥ 5.0 and 'C44 only)

**Operands**  

None

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>

**Word Fields**  

None

**Operation**  

1 → ST(GIE)  

Next PC → PC  

Idle until interrupt

**Description**  

The IDLE2 instruction performs the same function as IDLE, except that it removes the functional clock input from the internal device. This allows for an extremely low-power mode. The PC is incremented once, and the device remains in an idle state until one of the external interrupts (NMI or IIOFx) is asserted.

In IDLE2 mode, the 'C4x behaves as follows:

- The CPU, peripherals, and memory retain their previous states.
- When the device is in the functional (nonemulation) mode, the clocks stop with H1 high and H3 low.
- The 'C4x remains in IDLE2 until one of the external interrupts (NMI or IIOFx) is asserted for at least two H1 clock cycles. Then, the clocks start after a delay of one H1 cycle. The clocks can start up in the phase opposite that in which they were stopped (that is, H1 might start high when H3 was high before stopping, and H3 might start high when H1 was high before stopping). However, the H1 and H3 clocks remain 180° out of phase with each other.
- During IDLE2 operation, for one of the external interrupts to be recognized and serviced by the CPU, it must be asserted for at least two H1 cycles. For the processor to recognize only one interrupt when it restarts operation, the interrupt pin must be configured for edge-triggered mode or asserted for less than three cycles in level-triggered mode.
- When the 'C4x is in the emulation mode, the H1 and H3 clocks continue to run normally, and the CPU operates as if an IDLE instruction had been executed. The clocks continue to run for correct operation of the emulator.
- Any external interrupt pin can wake up the device from IDLE2; but for the CPU to recognize that interrupt, the interrupt must also be enabled. If an interrupt is recognized and executed by the CPU, the instruction following the IDLE2 instruction is not executed until after a return is executed.
## IDLE2  Idle Until Interrupt 2

<table>
<thead>
<tr>
<th>Status Bits</th>
<th>LUF</th>
<th>Unaffected</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
Syntax  
**LAJ src**

Operands  
*src*: in PC-relative mode

Opcode  

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td></td>
</tr>
</tbody>
</table>

*displacement*

Word Fields  
None

Operation  
PC of LAJ + 4 \(\rightarrow\) extended-precision register R11  
*displacement* + 3 + PC of LAJ \(\rightarrow\) PC

Description  
LAJ performs a single-cycle delayed subroutine call that allows the three instructions following the LAJ instruction to be performed before branching. The return address (address of the LAJ instruction + 4) is placed in extended-precision register R11. The assembler generates a displacement: 
*displacement* = *src* – (PC of branch instruction + 1). This displacement is stored as a 24-bit signed integer in the 24 least significant bits of the branch instruction. This displacement is added to the PC of the branch instruction plus 1 to generate the new PC. See Section 6.6 on page 6-19 for details.

None of the three instructions following the LAJ instruction should modify R11 or the program flow. Interrupts are disabled for the duration of the LAJ instruction.

Status Bits  
- **LUF**: Unaffected  
- **LV**: Unaffected  
- **UF**: Unaffected  
- **N**: Unaffected  
- **Z**: Unaffected  
- **V**: Unaffected  
- **C**: Unaffected

Mode Bit  
**OVM** operation is not affected by OVM bit value.

Cycles  
1

Example  
None
LAJcond  
*Link and Jump Conditionally*

**Syntax**  
\[ \text{LAJ} \text{cond} \ src \]

**Operands**  
*src*: conditional-branch addressing modes

**Opcode**

```
+---+---+---+---+---+---+---+---+
| 31| 24| 23| 16| 15| 8 | 7 | 0 |
+---+---+---+---+---+---+---+---+
| 0 | 1 | 1 | 0 | 0 | B | 0 | 0 | 0 | 1 | cond | register or displacement |
+---+---+---+---+---+---+---+---+
```

**Word Fields**

<table>
<thead>
<tr>
<th>B</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>register</td>
</tr>
<tr>
<td>1</td>
<td>PC relative</td>
</tr>
</tbody>
</table>

**Operation**

If (\textit{cond} is true)  
If (\textit{src} is a register)  
\[ \text{PC of LAJ}\text{cond} + 4 \rightarrow \text{extended-precision register R11} \]
\[ \text{src} \rightarrow \text{PC} \]

If (\textit{src} is in PC-relative mode)  
\[ \text{PC of LAJ}\text{cond} + 4 \rightarrow \text{extended-precision register R11} \]
\[ \text{displacement} = \text{src} - (\text{PC of LAJ} + 3) \]
\[ \text{displacement} + \text{PC of the LAJ} + 3 \rightarrow \text{PC} \]

Else, continue.

**Description**

LAJ\textit{cond} performs a conditional single-cycle delayed subroutine call that allows the three instructions following the LAJ\textit{cond} instruction to be performed before branching, without affecting the \textit{cond}. The return address (address of the LAJ instruction + 4) is placed in extended-precision register R11. The address branched to is formed by either register mode or PC-relative mode.

None of the three instructions following the LAJ\textit{cond} instruction should modify R11 or the program flow. Interrupts are disabled for the duration of the LAJ\textit{cond} instruction.

**Status Bits**

- \textit{LUF}: Unaffected
- \textit{LV}: Unaffected
- \textit{UF}: Unaffected
- \textit{N}: Unaffected
- \textit{Z}: Unaffected
- \textit{V}: Unaffected
- \textit{C}: Unaffected

**Mode Bit**

- \textit{OVM}: operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
Syntax \[ \text{LAT}\text{cond} N \]

Operands N immediate mode – trap number \((0 \leq N \leq 511)\)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 1 1 1 0 1 0 0 1 0 0</td>
<td>(\text{cond})</td>
<td>0 0 0 0 0 0 0</td>
<td>(N)</td>
<td></td>
</tr>
</tbody>
</table>

Word Fields None

Operation If \((\text{cond} \text{ is true})\)

- \(\text{ST(GIE)} \rightarrow \text{ST(PGIE)}\)
- \(\text{ST(CF)} \rightarrow \text{ST(PCF)}\)
- \(0 \rightarrow \text{ST(GIE)}\)
- \(1 \rightarrow \text{ST(CF)}\)
- \(\text{PC of LAT}\text{cond} + 4 \rightarrow \text{extended-precision register R11}\)
- \(\text{trap vector N} \rightarrow \text{PC}\)

Else, continue.

Description The \(\text{LAT}\text{cond}\) instruction performs a conditional delayed single-cycle trap. If the condition is true, ST bits GIE and CF are saved in PGIE and PCF in the status register. Then all interrupts are disabled \((0 \rightarrow \text{GIE})\), and the cache is frozen \((1 \rightarrow \text{CF})\). The contents of the PC of the \(\text{LAT}\text{cond} + 4\) are placed in R11, and the PC is loaded with the contents of the specified trap vector \((N)\). If the condition is not true, then continue normal operation. If traps are to be nested, you may need to save the status register before executing \(\text{LAT}\text{cond}\).

The three instructions following \(\text{LAT}\text{cond}\) are fetched and executed, but they do not affect the \(\text{cond}\). They should not modify the program flow or directly modify the status register. Interrupts are disabled for the duration of the \(\text{LAT}\text{cond} N\) instruction.

Status Bits

<table>
<thead>
<tr>
<th>LUF</th>
<th>LV</th>
<th>UF</th>
<th>N</th>
<th>Z</th>
<th>V</th>
<th>C</th>
</tr>
</thead>
<tbody>
<tr>
<td>Unaffected</td>
<td>Unaffected</td>
<td>Unaffected</td>
<td>Unaffected</td>
<td>Unaffected</td>
<td>Unaffected</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit \(\text{OVM}\) operation is not affected by OVM bit value.

Cycles 1

Example None
**LBb  Load Byte**

### Syntax

```
LBb src, dst
```

### Operands

- **src**: register, direct, 16-bit immediate, or indirect-addressing modes
- **dst**: register mode (any register in CPU primary-register file)

### Opcode

```
<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0</td>
<td>1 0</td>
<td>G</td>
<td>dst</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>B</td>
<td>src</td>
</tr>
</tbody>
</table>
```

### Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
<th>B</th>
<th>src byte</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode</td>
<td>00</td>
<td>byte 0 LS byte</td>
</tr>
<tr>
<td>01</td>
<td>direct mode</td>
<td>01</td>
<td>byte 1</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode</td>
<td>10</td>
<td>byte 2</td>
</tr>
<tr>
<td>11</td>
<td>immediate mode (16 bits)</td>
<td>11</td>
<td>byte 3 MS byte</td>
</tr>
</tbody>
</table>

### Operation

Sign-extended byte (3, 2, 1, 0) of src → dst

```
b = byte to load (3, 2, 1, 0)
```

```
3 2 1 0 = b (byte designator 3 – 0)
```

### Description

The specified byte of the src operand is sign-extended and right-shifted into the eight LSBs of the dst register. The src byte is signed. When immediate mode is specified and byte 2 (B = 10) or byte 3 (B = 10) is selected, the LBb instruction performs sign extension of the 16-bit value. Consequently, the value of 00h or FFh is stored into the eight LSBs of the dst register.

### Status Bits

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 0
- **C**: Unaffected

### Mode Bit

**OVM** operation is not affected by OVM bit value.

### Cycles

1
Load Byte \texttt{LBb}

**Example**

\texttt{LB2 R1, R2 ; sign extended byte 2 of \textit{R1} \rightarrow \textit{R2}}

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>\textit{R1} 00AB 0000h</td>
<td>\textit{R1} 00AB 0000h</td>
</tr>
<tr>
<td>\textit{R2} 0000 0000h</td>
<td>\textit{R2} FFFF FFABh</td>
</tr>
</tbody>
</table>
**Load Byte Unsigned (LBUb)**

**Syntax**

LBUb src, dst

**Operands**

src: register, direct, 16-bit immediate, or indirect-addressing modes  
dst: register mode (any register in CPU primary-register file)

**Opcode**

```
1 0 1 1 0 0 1 B G dst src
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
<th>B</th>
<th>src byte</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode</td>
<td>00</td>
<td>byte 0 LS byte</td>
</tr>
<tr>
<td>01</td>
<td>direct mode</td>
<td>01</td>
<td>byte 1</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode</td>
<td>10</td>
<td>byte 2</td>
</tr>
<tr>
<td>11</td>
<td>immediate mode (16 bits)</td>
<td>11</td>
<td>byte 3 MS byte</td>
</tr>
</tbody>
</table>

**Operation**

Byte (3, 2, 1, 0) of src → dst

b = byte to load (3, 2, 1, 0)

```
3 2 1 0 = b (byte designator 3 – 0)
```

**Description**

The specified byte of the src operand is right-shifted, without sign extension, into the eight LSBs of the dst register. The src byte is unsigned. When immediate mode is specified and byte 2 (B=10) or byte 3 (B=10) is selected, the LBUb instruction performs sign extension of the 16-bit value. Consequently, the value of 00h or FFh is stored into the eight LSBs of the dst register.

**Status Bits**

If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF**: Unaffected  
- **LV**: Unaffected  
- **UF**: 0  
- **N**: 0  
- **Z**: 1 if a zero result is generated, 0 otherwise  
- **V**: 0  
- **C**: Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

LBU2 R1, R2

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td>R1</td>
<td>00AB 0000h</td>
</tr>
<tr>
<td>R2</td>
<td>0000 0000h</td>
</tr>
<tr>
<td></td>
<td>0000 00ABh</td>
</tr>
</tbody>
</table>
LDA src, dst

Operands
src: general-addressing modes
dst: register mode (address registers only)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 1 1 1 0 1</td>
<td>G</td>
<td>dst</td>
<td>src</td>
<td></td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation
src → dst

Description
The src operand is loaded into the dst register. The dst register can be any of the address registers: AR0 – AR7, IR0, IR1, DP, BK, or SP. The load is complete by the end of the read phase of the pipeline. As a result, LDA is one cycle faster than LDI for loading these registers. (All operands are treated as signed integers.)

The src and dst operands cannot be the same register.

Status Bits
LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit
OVM operation is not affected by OVM bit value.

Cycles
1
Example  None
Load Floating-Point Exponent  \textbf{LDE}

**Syntax**  
\textbf{LDE} \textit{src, dst}

**Operands**  
\textit{src}: general-addressing modes (G)  
\textit{dst}: register (R0 – R11)

**Opcode**  

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>0 0 1 0 1</td>
<td>G</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (R0 – R11)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**  
\textit{src}(exp) \rightarrow \textit{dst}(exp)

**Description**
The exponent field of the \textit{src} operand is loaded into the exponent field of the \textit{dst} register. No modification of the \textit{dst} register mantissa field is made unless the value of the exponent loaded is the reserved value of the exponent for zero as determined by the precision of the \textit{src} operand. Then, the mantissa field of the \textit{dst} register is set to 0. The \textit{src} and \textit{dst} operands are assumed to be floating-point numbers. Immediate values are evaluated in the short floating-point format.

**Status Bits**
- LUF: Unaffected
- LV: Unaffected
- UF: Unaffected
- N: Unaffected
- Z: Unaffected
- V: Unaffected
- C: Unaffected

**Mode Bit**
\textbf{OVM} operation is not affected by OVM bit value.

**Cycles**
1
**Example**

LDE R0, R5

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R0 020005 6F30h</td>
<td>R0 020005 6F30h</td>
</tr>
<tr>
<td>R5 0A056F E332h</td>
<td>R5 02056F E332h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>

Before Instruction: 4.0006637e + 00

After Instruction: 4.0006637e + 00

Before Instruction: 1.0679648e + 03

After Instruction: 4.16990814e + 00
**Syntax**

LDEP src, dst

**Operands**

src: expansion register file register (IVTP or TVTP)
dst: register mode (any register in CPU primary register file)

**Opcode**

<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
<th>31 24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td></td>
<td>0 1 1 1 0 1 1 0 0 0</td>
<td>0 0</td>
<td>dst 0 0 0 0 0 0 0 0 0</td>
<td>src 0 0 0 0 0 0 0 0 0</td>
</tr>
</tbody>
</table>

**Word Fields**

None

**Operation**

src → dst

**Description**

The LDEP instruction loads a CPU register with the contents of the IVTP register (interrupt-trap table pointer) or the TVTP register. These registers are described in Section 3.2.

The src operand register from the expansion-register file is loaded into the dst register in the primary register file. The dst register content is assumed to be an integer.

**Status Bits**

LUF  Unaffected
LV   Unaffected
UF   Unaffected
N    Unaffected
Z    Unaffected
V    Unaffected
C    Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
**LDF  Load Floating-Point Value**

**Syntax**

\[ \text{LDF} \, \text{src, dst} \]

**Operands**

- **src**: general-addressing modes (G)
- **dst**: register (R0 – R11)

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 0 1 1 1 0</td>
<td>G</td>
<td>dst</td>
<td>src</td>
<td></td>
</tr>
</tbody>
</table>

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (R0 – R11)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

\[ \text{src} \rightarrow \text{dst} \]

**Description**

The **src** operand is loaded into the **dst** register. The **dst** and **src** operands are assumed to be floating-point numbers.

**Status Bits**

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: 1 if a negative result is loaded, 0 otherwise
- **Z**: 1 if a zero result is loaded, 0 otherwise
- **V**: 0
- **C**: Unaffected

**Mode Bit**

**OVM** operation is not affected by OVM bit value.

**Cycles**

1
**Example**

LDF @9800h, R2

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>DP 80h</td>
<td>DP 80h</td>
</tr>
<tr>
<td>R2 0h</td>
<td>R2 010C52 A000h</td>
</tr>
<tr>
<td>Data at 80 9800h</td>
<td>Data at 80 9800h</td>
</tr>
<tr>
<td>10C5 2A00h</td>
<td>10C5 2A00h</td>
</tr>
<tr>
<td>2.19254303e + 00</td>
<td>2.19254303e + 00</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
LDFcond  Load Floating-Point Value Conditionally

Syntax

LDFcond  src, dst

Operands

src:  general-addressing modes (G)
dst:  register (R0 – R11)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 1 0 0 cond G dst</td>
<td>src</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (R0 – R11)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation

If cond is true:

src → dst.

Else:

dst is unchanged.

Description

If the condition is true, the src operand is loaded into the dst register. Otherwise, the dst register is unchanged. The dst and src operands are assumed to be floating-point numbers.

The 'C4x provides 20 condition codes that can be used with this instruction (see Section 14.2 on page 14-12 for a list of condition mnemonics, encoding, and flags). Note that an LDFU (load floating-point unconditionally) instruction is useful for loading R0 – R11 without affecting condition flags.

Status Bits

<p>| | |</p>
<table>
<thead>
<tr>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1
Load Floating-Point Value Conditionally  \textbf{LDFcond}

**Example**

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>LDFZ R3, R5</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>R3</th>
<th>Before Value</th>
<th>After Value</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>2CFF2C D500h</td>
<td>1.77055560e +13</td>
</tr>
<tr>
<td>R5</td>
<td>5F0000 003Eh</td>
<td>3.96140824e +28</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Flags</th>
<th>Value</th>
<th>Flags</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>0</td>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>1</td>
<td>Z</td>
<td>1</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
LDFI  Load Floating-Point Value, Interlocked

Syntax  
LDFI src, dst

Operands  
src: general-addressing modes (G)  
dst: register (R0 – R11)

Opcode  

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>G</td>
</tr>
<tr>
<td>dst</td>
<td>src</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Word Fields  

<table>
<thead>
<tr>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>G</td>
</tr>
<tr>
<td>01</td>
</tr>
<tr>
<td>10</td>
</tr>
</tbody>
</table>

Operation  
Signal interlocked operation.  
src → dst

Description  
The src operand is loaded into the dst register. An interlocked operation is signaled over LOCK or LLOCK. The src and dst operands are assumed to be floating-point numbers. Only direct and indirect modes are allowed. Refer to Section 9.7 (page 9-39) for a detailed description.

Status Bits  
LUF  Unaffected  
LV   Unaffected  
UF   0  
N    1 if a negative result is generated, 0 otherwise  
Z    1 if a zero result is generated, 0 otherwise  
V    0  
C    Unaffected

Mode Bit  
OVM operation is not affected by OVM bit value.

Cycles  
1
**Example**

LDFI *+AR2,R7

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR2 8098F1h</td>
<td>AR2 8098F1h</td>
</tr>
<tr>
<td>R7 0h</td>
<td>R7 0584C0 0000h</td>
</tr>
<tr>
<td>Data at 80 98F2h</td>
<td>Data at 80 98F2h</td>
</tr>
<tr>
<td>584 C000h</td>
<td>584 C000h</td>
</tr>
<tr>
<td>-6.28125e + 01</td>
<td>-6.28125e + 01</td>
</tr>
</tbody>
</table>

LUF 0  
LV 0  
UF 0  
N 0  
Z 0  
V 0  
C 0  

Load Floating-Point Value, Interlocked  
LDFI
LDF||LDF  Parallel LDF and LDF

Syntax  
\[
\text{LDF } \text{src2, dst2} \\
|| \text{LDF } \text{src1, dst1}
\]

Operands  
- **src1**: indirect (disp = 0, 1, IR0, IR1)
- **dst1**: register (R0 — R7)
- **src2**: indirect (disp = 0, 1, IR0, IR1)
- **dst2**: register (R0 — R7)

Opcode  

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>dst2</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>dst1</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>src1</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>src2</td>
</tr>
</tbody>
</table>

Word Fields  
None

Operation  
\[
\text{src2 } \rightarrow \text{dst2} \\
|| \text{src1 } \rightarrow \text{dst1}
\]

Description  
Two floating-point loads are performed in parallel. If the LDFs load the same register, the assembler issues a warning. The result is that of LDF src2, dst2.

Status Bits  
- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: Unaffected
- **N**: Unaffected
- **Z**: Unaffected
- **V**: Unaffected
- **C**: Unaffected

Mode Bit  
**OVM** operation is not affected by OVM bit value.

Cycles  
1
Example

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR1 80 985Fh</td>
<td>AR1 80 9857h</td>
</tr>
<tr>
<td>IR0 8h</td>
<td>IR0 8h</td>
</tr>
<tr>
<td>R7 0h</td>
<td>R7 070C80 0000h</td>
</tr>
<tr>
<td>AR7 80 988Ah</td>
<td>AR7 80 988Bh</td>
</tr>
<tr>
<td>R3 0h</td>
<td>R3 057B40 0000h</td>
</tr>
<tr>
<td>Data at 80 9857h</td>
<td>Data at 80 9857h</td>
</tr>
<tr>
<td>70C 8000h 1.4050e + 02</td>
<td>70C 8000h 1.4050e + 02</td>
</tr>
<tr>
<td>Data at 80 988Ah</td>
<td>Data at 80 988Ah</td>
</tr>
<tr>
<td>57B 4000h 6.281250e + 01</td>
<td>57B 4000h 6.281250e + 01</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>

Assembly Language Instructions 14-123
LDF||STF  Parallel LDF and STF

Syntax

\[
\text{LDF } src2, dst1 \\
\text{|| STF } src3, dst2
\]

Operands

- \( src2 \): indirect (disp = 0, 1, IR0, IR1)
- \( dst1 \): register (R0 – R7)
- \( src3 \): register (R0 – R7)
- \( dst2 \): indirect (disp = 0, 1, IR0, IR1)

Opcode

<table>
<thead>
<tr>
<th>31 24 23 16 15 8 7 0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1 1 0 1 1 0 0 dst1 0 0 0 src3 dst2 src2</td>
</tr>
</tbody>
</table>

Word Fields

None

Operation

\[
\text{src2 } \rightarrow \text{ dst1} \\
\text{|| src3 } \rightarrow \text{ dst2}
\]

Description

A floating-point load and a floating-point store are performed in parallel.

If \( src2 \) and \( dst2 \) point to the same location, \( src2 \) is read before the write to \( dst2 \).

Status Bits

- LUF: Unaffected
- LV: Unaffected
- UF: Unaffected
- N: Unaffected
- Z: Unaffected
- V: Unaffected
- C: Unaffected

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1
## Example

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>LDF *AR2−−(1),R1</td>
<td>LDF *AR2−−(1),R1</td>
</tr>
<tr>
<td>STF R3,*AR4++(IR1)</td>
<td>STF R3,*AR4++(IR1)</td>
</tr>
</tbody>
</table>

### Before Instruction

- **AR2**: 80 98E6h
- **R1**: 0h
- **R3**: 057B40 0000h
- **AR4**: 80 9900h
- **IR1**: 10h

- **Data at 80 98E7h**: 70C 8000h
- **Data at 80 9900h**: 0h

### After Instruction

- **AR2**: 80 98E6h
- **R1**: 070C80 0000h
- **R3**: 057B40 0000h
- **AR4**: 80 9910h
- **IR1**: 10h

- **Data at 80 98E7h**: 70C 8000h
- **Data at 80 9900h**: 57B 4000h

### Flags

- **LUF**: 0
- **LV**: 0
- **UF**: 0
- **N**: 0
- **Z**: 0
- **V**: 0
- **C**: 0
LDHI  Load 16 MSBs With 16-Bit Immediate

Syntax  
LDHI src, dst

Operands  
src:  16-bit unsigned immediate  
dst:  register mode

Opcode  

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>1</td>
<td>1</td>
<td>dst</td>
<td>src (immediate value)</td>
<td></td>
</tr>
</tbody>
</table>

Word Fields  
None

Operation  
src → 16 MSBs of dst

Description  
The 16-bit unsigned src immediate value is loaded into the 16 MSBs of the dst register, and 0 is loaded into the 16 LSBs of the dst register. The dst register is assumed to be an integer.

Status Bits  
LUF  Unaffected
LV  Unaffected
UF  Unaffected
N  Unaffected
Z  Unaffected
V  Unaffected
C  Unaffected

Mode Bit  
OVM operation is not affected by OVM bit value.

Cycles  
1

Example  
LDHI 44h, R2

Before Instruction  
| R2 | ABCD EF12h |

After Instruction  
| R2 | 0044 0000h |
Load Integer  LDI

Syntax  
LDI src, dst

Operands  
src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

Opcode  

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0 1</td>
<td>0 0</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>

Word Fields  
None

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation  
src → dst

Description  
The src operand is loaded into the dst register. The dst and src operands are assumed to be signed integers.

Status Bits  
LUF  Unaffected
LV   Unaffected
UF   0
N    1 if a negative result is generated, 0 otherwise
Z    1 if a zero result is generated, 0 otherwise
V    0
C    Unaffected

Mode Bit  
OVM operation is not affected by OVM bit value.

Cycles  
1
**Example**

LDI *-AR1(IR0),R5

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR1</td>
<td>2Ch</td>
</tr>
<tr>
<td>IR0</td>
<td>5h</td>
</tr>
<tr>
<td>R5</td>
<td>3C5h</td>
</tr>
<tr>
<td>Data at 27h</td>
<td>26h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
Load Integer Conditionally  \textbf{LDI}cond

**Syntax**

\texttt{LDI}cond \textit{src}, \textit{dst}

**Operands**

- \textit{src}: general addressing modes (G)
- \textit{dst}: register (any register in CPU primary register file)

**Opcode**

\[
\begin{array}{ccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
\hline
0 & 1 & 0 & 1 & \text{cond} & G & \text{dst} & \text{src} \\
\end{array}
\]

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

If \textit{cond} is true:
\[ \text{src} \rightarrow \text{dst}, \]
Else: \textit{dst} is unchanged.

**Description**

If the condition is true, the \textit{src} operand is loaded into the \textit{dst} register. Otherwise, the \textit{dst} register is unchanged. The \textit{dst} and \textit{src} operands are assumed to be signed integers.

LDP (an alternate form of LDIU) loads the data-page pointer register (DP) or any other register with the 16 MSBs of a relocatable address.

The 'C4x provides 20 condition codes that can be used with this instruction (see Section 14.2 for a list of condition mnemonics, encoding, and flags). Note that a load integer unconditionally (LDIU) instruction is useful for loading a selected CPU register without affecting the condition flags that the LDI instruction affects.

**Status Bits**

- \textbf{LUF} Unaffected
- \textbf{LV} Unaffected
- \textbf{UF} Unaffected
- \textbf{N} Unaffected
- \textbf{Z} Unaffected
- \textbf{V} Unaffected
- \textbf{C} Unaffected

**Mode Bit**

\textbf{OVM} operation is not affected by OVM bit value.

**Cycles**

1
### LDIcond
*Load Integer Conditionally*

**Example**

```
LDIZ R4, R6
```

<table>
<thead>
<tr>
<th></th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R4</td>
<td>027Ch</td>
<td>027Ch</td>
</tr>
<tr>
<td>R6</td>
<td>0FE2h</td>
<td>0FE2h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>
Load Integer, Interlocked

**LDII** *src, dst*

**Syntax**

```
LDII src, dst
```

**Operands**

- **src**: general-addressing modes (G)
- **dst**: register (any register in CPU primary-register file)

**Opcode**

```
0 0 0 0 1 0 0 1 G dst src
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
</tbody>
</table>

**Operation**

Signal interlocked operation.

```
src → dst
```

**Description**

The *src* operand is loaded into the *dst* register. An interlocked operation is signaled over LOCK or LLOCK. The *src* and *dst* operands are assumed to be signed integers. Note that only the direct and indirect modes are allowed. Refer to Section 9.7 on page 9-39 for a detailed description.

**Status Bits**

If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 0
- **C**: Unaffected

**Mode Bit**

- **OVM**: operation is not affected by OVM bit value.

**Cycles**

1
**Example**

LDII @985Fh, R3

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>DP 80</td>
<td>DP 80</td>
</tr>
<tr>
<td>R3 0h</td>
<td>R3 0DCh</td>
</tr>
<tr>
<td>Data at 80 985Fh</td>
<td>Data at 80 98F5h</td>
</tr>
<tr>
<td>0DCh</td>
<td>0DCh</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
Parallel LDI and LDI

**Syntax**

\[
\text{LDI } src2, \ dst2 \\
|\ |
\text{LDI } src1, \ dst1
\]

**Operands**

- \( src1 \): indirect (disp = 0, 1, IR0, IR1)
- \( dst1 \): register (R0 – R7)
- \( src2 \): indirect (disp = 0, 1, IR0, IR1)
- \( dst2 \): register (R0 – R7)

**Opcode**

\[
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & \text{dst2} \\
& & \text{dst1} & 0 & 0 & 0 & \text{src1} & \text{src2}
\end{array}
\]

**Word Fields**

None

**Operation**

- \( src2 \rightarrow dst2 \)
- || \( src1 \rightarrow dst1 \)

**Description**

Two integer loads are performed in parallel. A warning is issued by the assembler if the LDIs load the same register. The result is that of LDI \( src2, dst2 \).

**Status Bits**

- \( \text{LUF} \): Unaffected
- \( \text{LV} \): Unaffected
- \( \text{UF} \): Unaffected
- \( \text{N} \): Unaffected
- \( \text{Z} \): Unaffected
- \( \text{V} \): Unaffected
- \( \text{C} \): Unaffected

**Mode Bit**

\( \text{OVM} \): operation is not affected by OVM bit value.

**Cycles**

1
### Example

LDI *–AR1(1),R7  
\\  LDI *AR7++(IR0),R1

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR1</td>
<td>80 9826h</td>
</tr>
<tr>
<td>R7</td>
<td>0h</td>
</tr>
<tr>
<td>AR7</td>
<td>80 98C8h</td>
</tr>
<tr>
<td>IR0</td>
<td>10h</td>
</tr>
<tr>
<td>R1</td>
<td>0h</td>
</tr>
<tr>
<td>Data at 80 9825h</td>
<td>0FAh</td>
</tr>
<tr>
<td>Data at 80 98C8h</td>
<td>2EEh</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
Parallel LDI and STI

LDI src2, dst1
|| STI src3, dst2

Operands

src2: indirect (disp = 0, 1, IR0, IR1)
dst1: register (R0 – R7)
src3: register (R0 – R7)
dst2: indirect (disp = 0, 1, IR0, IR1)

Opcode

\[
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
1 & 1 & 0 & 1 & 1 & 0 & 1 & dst1 \\
0 & 0 & 0 & src3 & dst2 & src2 \\
\end{array}
\]

Word Fields
None

Operation

src2 → dst1
|| src3 → dst2

Description
An integer load and an integer store are performed in parallel. If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Status Bits

LUF Unaffected
LV Unaffected
UF Unaffected
N Unaffected
Z Unaffected
V Unaffected
C Unaffected

Mode Bit
OVM operation is not affected by OVM bit value.

Cycles
1
Example

\[
\text{LDI } *{\text{AR1(1)}}{, R2}
\]
\[
||
\text{STI } R7{, *AR5++(IR0)}
\]

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR1 80 98E7h</td>
<td>AR1 80 98E7h</td>
</tr>
<tr>
<td>R2 0h</td>
<td>R2 0DCh</td>
</tr>
<tr>
<td>R7 35h</td>
<td>R7 35h</td>
</tr>
<tr>
<td>AR5 80 982Ch</td>
<td>AR5 80 9834h</td>
</tr>
<tr>
<td>IR0 8h</td>
<td>IR0 8h</td>
</tr>
<tr>
<td>Data at 80 98E6h</td>
<td>Data at 80 98E6h</td>
</tr>
<tr>
<td>0DCh</td>
<td>0DCh</td>
</tr>
<tr>
<td>Data at 80 982Ch</td>
<td>Data at 80 982Ch</td>
</tr>
<tr>
<td>0h</td>
<td>0h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
<tr>
<td>220</td>
<td>220</td>
</tr>
<tr>
<td>53</td>
<td>53</td>
</tr>
</tbody>
</table>
Syntax

LDM src, dst

Operands

src: general-addressing modes (G)
dst: register (R0 – R11)

Opcode

```
<table>
<thead>
<tr>
<th>31 24 23 16 15 8 7 0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 1 0 0 1 0</td>
</tr>
<tr>
<td>G</td>
</tr>
</tbody>
</table>
```

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (R0 – R11)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation

src (man) → dst (man)

Description

The mantissa field of the src operand is loaded into the mantissa field of the dst register. The dst exponent field is not modified. The src and dst operands are assumed to be floating-point numbers. If immediate addressing mode is used, bits 15 –12 of the instruction word are forced to 0 by the assembler. If the source is in the memory, the 32-bit data are loaded into the mantissa field.

Status Bits

<table>
<thead>
<tr>
<th>LUF</th>
<th>Unaffected</th>
</tr>
</thead>
<tbody>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1
**Example**  

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>LDM 156.75, R2</td>
<td>R2 0h</td>
<td>R2 1CC0 0000h 1.22460938e + 00</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
<td>V 0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
<td>C 0</td>
</tr>
</tbody>
</table>

(156.75 = 07 1CC0 0000h)
Load Data Page Pointer  LDP

Syntax

LDP src,[DP]

Operands

crc: 16 MSBs of the absolute 32-bit source address (src).
dst: optional (data-page pointer understood if “,DP” left out of operand)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>

Word Fields

None

Operation

src → Data-page pointer

Description

This pseudo-op is an alternate form of the LDIU instruction, except that LDP is always in the immediate addressing mode (bits 22 – 21 = 11_2). The 16 MSBs of the src absolute 32-bit value (note that an src less than 32 bits is zero filled to make the 32 bits) are loaded into the 16 LSBs of the data-page pointer.

The 16 LSBs of the pointer are used in direct addressing as a pointer to the page of data being addressed. There is a total of 64K pages, each page 64K words long. Bits 31 – 16 of the pointer are reserved and should be kept to zero.

Status Bits

LUF  Unaffected
LV   Unaffected
UF   Unaffected
N    Unaffected
Z    Unaffected
V    Unaffected
C    Unaffected

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1

Example

LDP @809900h, DP
or
LDP @809900h

Before Instruction

<p>| | |</p>
<table>
<thead>
<tr>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>DP</td>
<td>6465h</td>
</tr>
</tbody>
</table>

After Instruction

<p>| | |</p>
<table>
<thead>
<tr>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>DP</td>
<td>0080h</td>
</tr>
</tbody>
</table>

16MSBs of 32-bit src, zeros extended
LDPE  Load Integer From Primary Register File to Expansion Register File

Syntax  
LDPE  \( src, dst \)

Operands  
\( src \): register mode (any register in CPU primary-register file)
\( dst \): expansion-register file register (IVTP or TVTP)

Opcode  
\[
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & dst & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & src
\end{array}
\]

Word Fields  
None

Operation  
\( src \rightarrow dst \)

Description  
This is a means to load the interrupt vector table pointer (IVTP) register or trap-vector table pointer (TVTP) register. These registers are described in Section 3.2 on page 3-17.

The \( src \) operand register from the primary-register file is loaded into the \( dst \) register in the expansion register file. The \( dst \) operand is assumed to be an integer.

Status Bits  
LUF  Unaffected
LV  Unaffected
UF  Unaffected
N  Unaffected
Z  Unaffected
V  Unaffected
C  Unaffected

Mode Bit  
OVM operation is not affected by OVM bit value.

Cycles  
1

Example  
LDPE  AR0, TVTP  ; set trap-vector pointer
**Syntax**

\[ \text{LDPK} \ src \]

**Operands**

\( src: \) 16-bit unsigned immediate

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>00 0</td>
<td>1 1 1 1 0 1 1 1 0 0 0 0</td>
<td>( src )</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

**Word Fields**

None

**Operation**

\( src \rightarrow \text{DP} \)

**Description**

The 16-bit unsigned immediate value is loaded into the DP register. This operation is completed by the end of the decode phase of the LDPK instruction; thus, the value loaded is ready for the next instruction for immediate addressing. Use caution when using the DP register in the instruction that precedes the LDPK. For example:

```
PUSH  \text{DP}
PUSH  \text{new_value}
```

pushes the DP new value into the stack instead of saving the old DP value.

**Status Bits**

- **LUF** Unaffected
- **LV** Unaffected
- **UF** Unaffected
- **N** Unaffected
- **Z** Unaffected
- **V** Unaffected
- **C** Unaffected

**Mode Bit**

**OVM** operation is not affected by OVM bit value.

**Cycles**

1
**LHw  Load Half-Word**

**Syntax**

\[ \text{LHw} \ src, \ dst \]

**Operands**

- **src**: register, direct, 16-bit immediate, or indirect-addressing modes
- **dst**: register mode (any register in CPU primary-register file)

**Opcode**

```
<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td></td>
</tr>
</tbody>
</table>
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>H</th>
<th>src half-word</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>half-word 0 (LS half-word)</td>
</tr>
<tr>
<td>1</td>
<td>half-word 1 (MS half-word)</td>
</tr>
</tbody>
</table>

**Operation**

Sign-extended half-word (0, 1) of **src** → **dst**

\[ w = \text{half-word to load} \ (0, \ 1) \]

\[ \begin{array}{c}
0 \\
1 \\
\end{array} = w \text{ designator} \]

**Description**

The specified half-word of the **src** operand is sign-extended and right-shifted into the 16 LSBs of the **dst** register. The **src** half-word is signed. When immediate mode is specified and a half-word 1 (H = 1) is selected, the LHw instruction performs sign extension of the 16-bit value into a 32-bit value. Consequently, the corresponding half-word value (0000h or FFFFh) is stored into the 16 LSBs of the **dst** register.
Load Half-Word  LHw

Status Bits
If ST (SET COND) = 0 and the destination register is R0–R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF**  Unaffected
- **LV**  Unaffected
- **UF**  0
- **N**  1 if a negative result is generated, 0 otherwise
- **Z**  1 if a zero result is generated, 0 otherwise
- **V**  0
- **C**  Unaffected

Mode Bit
**OVM** operation is not affected by OVM bit value.

Cycles
1

Example
```
LH0 R1, R2
```

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R1</td>
<td>R1</td>
</tr>
<tr>
<td>ABCD EF12h</td>
<td>ABCD EF12h</td>
</tr>
<tr>
<td>R2</td>
<td>R2</td>
</tr>
<tr>
<td>1234 5678h</td>
<td>FFFF EF12h</td>
</tr>
</tbody>
</table>
**LHUw**  Load Half-Word Unsigned

**Syntax**  
\[ \text{LHUw} \; \text{src, dst} \]

**Operands**  
*src*: register, direct, 16-bit immediate-, or indirect- addressing modes  
*dst*: register mode (any register in CPU primary-register file)

**Opcode**

\[
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 & 1 & H \\
G & dst & src \\
\end{array}
\]

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>H</th>
<th>src half-word</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>half-word 0 (LS half-word)</td>
</tr>
<tr>
<td>1</td>
<td>half-word 1 (MS half-word)</td>
</tr>
</tbody>
</table>

**Operation**  
Unsigned half-word \((0, 1)\) of *src* \(\rightarrow\) *dst*

\[
\begin{array}{c}
w = \text{half-word to load} (0, 1) \\
\begin{array}{c}
1 \\
0 \\
\end{array} = \text{w designator}
\end{array}
\]

**Description**  
The specified half-word of the *src* operand is unsigned and right-shifted into the 16 LSBs of the *dst* register. The *src* half-word is unsigned. When immediate mode is specified and a half-word \(1 (H = 1)\) is selected, the LHw instruction performs sign extension of the 16-bit value into a 32-bit value. Consequently, the corresponding half-word value (0000h or FFFFh) is stored into the 16 LSBs of the *dst* register.
Status Bits
If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

LUF  Unaffected
LV   Unaffected
UF   0
N    0
Z    1 if a zero result is generated, 0 otherwise
V    0
C    Unaffected

Cycles
1

Mode Bit
OVM operation is not affected by OVM bit value.

Example
LHU0  R1, R2

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R1  ABCD EF12h</td>
<td>R1  ABCD EF12h</td>
</tr>
<tr>
<td>R2  1234 5678h</td>
<td>R2  0000 EF12h</td>
</tr>
</tbody>
</table>
LSH  Logical Shift

Syntax  LSH  src_count, dst  

Operands  
src_count: general-addressing modes (G)  
dst: register (any register in CPU primary-register file)  

Opcode

<table>
<thead>
<tr>
<th>31 24 23 16 15 8 7 0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 1 0 0 1 1 G dst</td>
</tr>
</tbody>
</table>

src_count

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation

count = 7 LSBs of src_count  
If count ≥ 0:  
dst << count → dst  
Else:  
dst >> |count| → dst

Description

The seven LSBs of the src_count operand constitute the 2s-complement-shift count. If count is greater than 0, the dst operand is left-shifted by the value of count. Low-order bits shifted in are zero-filled, and high-order bits are shifted out through the C (carry) bit.

Logical left-shift:

C ← dst ← 0  

If count is less than 0, the dst is right-shifted by the absolute value of the count operand. The high-order bits of the dst operand are zero-filled as they are shifted to the right. Low-order bits are shifted out through the C (carry) bit.

Logical right-shift:

0 → dst → C 

If count is 0, no shift is performed, and the C (carry) bit is cleared to 0.

If count is greater than 32, the C (carry) bit gets the LSB. If count is less than −32, the C bit is cleared to 0.
The `src_count` operand is assumed to be a signed integer, and the `dst` operand is assumed to be an unsigned integer.

**Status Bits**

If ST (SET COND) = 0 and the destination register is R0–R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF** Unaffected
- **LV** Unaffected
- **UF** 0
- **N** MSB of the output
- **Z** 1 if a zero output is generated, 0 otherwise
- **V** 0
- **C** Set to the value of the last bit shifted out. 0 for a shift `count` of 0

**Mode Bit**

- **OVM** operation is not affected by OVM bit value.

**Cycles**

1

**Example 1**

LSH R4, R7

**Before Instruction**

<table>
<thead>
<tr>
<th>R4</th>
<th>R7</th>
<th>LUF</th>
<th>LV</th>
<th>UF</th>
<th>N</th>
<th>Z</th>
<th>V</th>
<th>C</th>
</tr>
</thead>
<tbody>
<tr>
<td>018h</td>
<td>02Ah</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>

**After Instruction**

<table>
<thead>
<tr>
<th>R4</th>
<th>R7</th>
<th>LUF</th>
<th>LV</th>
<th>UF</th>
<th>N</th>
<th>Z</th>
<th>V</th>
<th>C</th>
</tr>
</thead>
<tbody>
<tr>
<td>018h</td>
<td>0AC00 0000h</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>

**Example 2**

LSH *–AR5(IR0),R5

**Before Instruction**

<table>
<thead>
<tr>
<th>AR5</th>
<th>IR0</th>
<th>R5</th>
<th>Data at 80 9904h</th>
</tr>
</thead>
<tbody>
<tr>
<td>80 9908h</td>
<td>4h</td>
<td>00 12C0 0000h</td>
<td>0FFF FFFF4h</td>
</tr>
</tbody>
</table>

**After Instruction**

<table>
<thead>
<tr>
<th>AR5</th>
<th>IR0</th>
<th>R5</th>
<th>Data at 80 9904h</th>
</tr>
</thead>
<tbody>
<tr>
<td>80 9908h</td>
<td>4h</td>
<td>00 0001 2C00h</td>
<td>0FFF FFFF4h</td>
</tr>
</tbody>
</table>

Assembly Language Instructions 14-147
LSH3  Logical Shift, 3 Operands

Syntax  

LSH3 src_count, src, dst

Operands  

src, src_count: both type 1 or type 2 three-operand addressing modes  

dst: register mode (any register in CPU primary-register file)

Opcode  

Type 1

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0 1 0 0 1 0 0 0</td>
<td>T</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>

Type 2

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0 1 1 0 1 0 0 0</td>
<td>T</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>

Word Fields  

Type 1

<table>
<thead>
<tr>
<th>T</th>
<th>src addressing modes</th>
<th>src_count addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

Type 2

<table>
<thead>
<tr>
<th>T</th>
<th>src addressing modes</th>
<th>src_count addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

Operation

count = 7 LSBs of src_count

If count ≥ 0:

src << count → dst

Else:

src >> |count| → dst
**Description**

The seven LSBs of the `src_count` operand constitute the 2s-complement shift count.

If `count` is greater than 0, a copy of the `src` operand is left-shifted by the value of `count`, and the result is written to the `dst` (the `src` is not changed). Low-order bits shifted in are zero-filled, and high-order bits are shifted out through the C (carry) bit.

Logical left-shift:

\[ C \leftarrow src \leftarrow 0 \]

If `count` is less than 0, the `src` operand is right-shifted by the absolute value of `count`. The high-order bits of the `dst` operand are zero-filled as shifted to the right. Low-order bits are shifted out through the C (carry) bit.

Logical right-shift:

\[ 0 \rightarrow src \rightarrow C \]

If `count` is 0, no shift is performed and the C (carry) bit is set to 0.

If `count` is greater than 32, the carry (C) bit is set to the LSB. If `count` is less than 32, the carry bit is cleared to 0. This also applies to LSH.

The `src_count` operand is assumed to be a signed integer. The `src` and `dst` operands are assumed to be unsigned integers.

**Status Bits**

If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

<table>
<thead>
<tr>
<th>Code</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>MSB of the output</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero output is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>Set to the value of the last bit shifted out. 0 for a shift <code>count</code> of 0</td>
</tr>
</tbody>
</table>

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
LSH3||STI  Parallel LSH3 and STI

Syntax

LSH3  src_count, src2, dst1

||

STI  src3, dst2

Operands

src_count:  register (R0 – R7)
src2:  indirect (disp = 0, 1, IR0, IR1)
dst1:  register (R0 – R7)
src3:  register (R0 – R7)
dst2:  indirect (disp = 0, 1, IR0, IR1)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1 1 0 1 1 0</td>
<td>dst1</td>
<td>src_count</td>
<td>src3</td>
<td>dst2</td>
</tr>
</tbody>
</table>

Word Fields

None

Operation

count = 7 LSBs of src_count
If count ≥ 0:
   src2 << count → dst1
Else:
   src2 >> |count| → dst1
   || src3 → dst2

Description

The seven LSBs of the src_count operand constitute the 2s-complement shift count.

If count is greater than 0, a copy of the src2 operand is left-shifted by the value of count and the result is written to dst1 (src2 is not changed). Low-order bits shifted in are zero-filled, and high-order bits are shifted out through the C (carry) bit.

Logical left-shift:

C ← src2 ← 0

If count is less than 0, a copy of the src2 operand is right-shifted by the absolute value of count. The high-order bits of the dst operand are zero-filled as shifted to the right. Low-order bits are shifted out through the C (carry bit).

Logical right-shift:

0 → src2 → C

If count is 0, no shift is performed and the carry bit is set to 0.

The src_count operand is assumed to be a signed integer, and the src2 and dst1 operands are assumed to be unsigned integers. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STI) reads from a register and the operation be-
ing performed in parallel (LSH3) writes to the same register, then STI accepts as input the contents of the register before it is modified by the LSH3.

If \( src2 \) and \( dst2 \) point to the same location, \( src2 \) is read before the write to \( dst2 \).

**Status Bits**

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: MSB of the output
- **Z**: 1 if a zero output is generated, 0 otherwise
- **V**: 0
- **C**: Set to the value of the last bit shifted out. 0 for a shift count of 0

**Mode Bit**

- **OVM**: Operation is affected by OVM bit value.

**Cycles**

- 1

**Example 1**

\[
\begin{array}{c}
\text{LSH3} \quad R2,*++AR3(1),R0 \\
\text{||} \\
\text{STI} \quad R4,*-AR5
\end{array}
\]

<table>
<thead>
<tr>
<th></th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>R2</strong></td>
<td>18h</td>
<td>18h</td>
</tr>
<tr>
<td><strong>AR3</strong></td>
<td>8098C2h</td>
<td>8098C3h</td>
</tr>
<tr>
<td><strong>R0</strong></td>
<td>0h</td>
<td>0AC00 0000h</td>
</tr>
<tr>
<td><strong>R4</strong></td>
<td>0DCh</td>
<td>0DCh</td>
</tr>
<tr>
<td><strong>AR5</strong></td>
<td>80 98A3h</td>
<td>80 98A3h</td>
</tr>
<tr>
<td>Data at 80 98C3h</td>
<td></td>
<td>Data at 80 98C3h</td>
</tr>
<tr>
<td></td>
<td>0ACh</td>
<td>0ACh</td>
</tr>
<tr>
<td>Data at 80 98A2h</td>
<td></td>
<td>Data at 80 98A2h</td>
</tr>
<tr>
<td></td>
<td>0h</td>
<td>0DCh</td>
</tr>
<tr>
<td><strong>LUF</strong></td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td><strong>LV</strong></td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td><strong>UF</strong></td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td><strong>N</strong></td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td><strong>Z</strong></td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td><strong>V</strong></td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td><strong>C</strong></td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>
### Example 2

**Example 2**

LSH3 R7,*AR2--(1),R2

‖

STI R0,*+AR0(1)

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R7</td>
<td>0FFFFF FF4h</td>
</tr>
<tr>
<td>AR2</td>
<td>80 9863h</td>
</tr>
<tr>
<td>R2</td>
<td>0h</td>
</tr>
<tr>
<td>R0</td>
<td>12Ch</td>
</tr>
<tr>
<td>AR0</td>
<td>80 98B7h</td>
</tr>
<tr>
<td>Data at 80 9863h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>2C00 0000h</td>
</tr>
<tr>
<td>Data at 80 98B9h</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
Load Word Left-Shifted LWLct

Syntax  
LWLct \textit{src, dst}

Operands  
\begin{itemize}
  \item \textit{ct}: the count of bytes \{0, 1, 2, or 3\} to shift left (\(ct \times 8 = \) shift in bits)
  \item \textit{src}: register, direct, 16-bit immediate-, or indirect- addressing modes
  \item \textit{dst}: register mode (any register in CPU primary-register file)
\end{itemize}

Opcode  

\begin{tabular}{cccccc}
31 & 24 & 23 & 16 & 15 & 8 \\
1 & 0 & 1 & 1 & 0 & 0
\end{tabular}

\begin{tabular}{c|c}
G & src addressing modes \\
\hline
00 & register (any register in CPU primary register file) \\
01 & direct \\
10 & indirect \\
11 & immediate \\
\end{tabular}

\begin{tabular}{c|c}
B & src byte \\
\hline
00 & no shift \\
01 & shift left 1-byte space \\
10 & shift left 2-byte spaces \\
11 & shift left 3-byte spaces \\
\end{tabular}

Operation  
\textit{src} \ll \{0, 1, 2, or 3\} bytes and merged with \textit{dst} \rightarrow \textit{dst}

Description  
The \textit{src} operand is left-shifted the specified number of bytes and merged with the bytes of the \textit{dst} register that are below the left-shifted LSB of the \textit{src} operand. When immediate mode is selected, this instruction performs a sign extension of the 16-bit immediate value into a 32-bit value; then, this 32-bit value is shifted and merged.
Status Bits

If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: MSB of the output
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 0
- **C**: Unaffected

Mode Bit

**OVM** operation is not affected by OVM bit value.

Cycles

1

Example

**LWL2 R1, R2**

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R1</td>
<td>ABCD EF12h</td>
</tr>
<tr>
<td>R2</td>
<td>1234 5678h</td>
</tr>
</tbody>
</table>
Load Word Right-Shifted \( \text{LWRct} \)

Syntax

\[ \text{LWRct } src, dst \]

Operands

- \( ct \): the count of bytes (0, 1, 2, or 3) to shift right (\( ct \times 8 \) = shift in bits)
- \( src \): register, direct, 16-bit immediate-, or indirect-addressing modes
- \( dst \): register mode (any register in CPU primary-register file)

Opcode

\[
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 & 1 & B & G & dst & src
\end{array}
\]

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>B</th>
<th>src byte</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>no shift</td>
</tr>
<tr>
<td>01</td>
<td>shift left 1 byte space</td>
</tr>
<tr>
<td>10</td>
<td>shift left 2-byte spaces</td>
</tr>
<tr>
<td>11</td>
<td>shift left 3-byte spaces</td>
</tr>
</tbody>
</table>

Operation

\( src \gg \{0, 1, 2, \text{ or } 3\} \) bytes and merged with \( dst \rightarrow dst \)

Description

The \( src \) operand is right-shifted the specified number of bytes and merged with the bytes of the \( dst \) register that are above the right-shifted MSB of the \( src \) operand. Sign is not extended. When immediate mode is selected, this instruction performs a sign extension of the 16-bit immediate value into a 32-bit value; then, this 32-bit value is shifted and merged.
LWRct  Load Word Right-Shifted

**Status Bits**

If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF**  Unaffected
- **LV**  Unaffected
- **UF**  0
- **N**  MSB of the output
- **Z**  1 if a zero result is generated, 0 otherwise
- **V**  0
- **C**  Unaffected

**Mode Bit**

**OVM** operation is not affected by OVM bit value.

**Cycles**

1

**Example**

LWRct AR1, R2

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR1 ABCD EF12hEF</td>
<td>AR1 ABCD EF12hEF</td>
</tr>
<tr>
<td>R2 1234 5678h</td>
<td>R2 12AB CDEFh</td>
</tr>
</tbody>
</table>
Syntax

\[ \text{MBct} \ src, \ dst \]

Operands

- \( ct \): the count of bytes \( \{0, 1, 2, 3\} \) to shift left \( (ct \times 8 = \text{shift in bits}) \)
- \( src \): register-, direct-, or indirect-addressing modes
- \( dst \): register mode (any register in CPU primary-register file)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>B</th>
<th>src byte</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>no shift</td>
</tr>
<tr>
<td>01</td>
<td>shift left 1 byte space</td>
</tr>
<tr>
<td>10</td>
<td>shift left 2-byte spaces</td>
</tr>
<tr>
<td>11</td>
<td>shift left 3-byte spaces</td>
</tr>
</tbody>
</table>

Operation

8 LSBs of \( src \) \( << \{0, 1, 2, \text{or} \ 3\} \) bytes and merged with \( dst \rightarrow dst \)

Description

The eight LSBs of the \( src \) operand are left shifted 0, 1, 2, or 3 bytes and merged with the bits of the \( dst \) register that are below the left-shifted LSB of the \( src \) operand. When immediate mode is selected, this instruction performs a sign extension of the 16-bit immediate value into a 32-bit value; then, this 32-bit value is shifted and merged.
**MBct**  *Merge Byte, Left-Shifted*

**Status Bits**
If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF**  Unaffected
- **LV**  Unaffected
- **UF**  0
- **N**  MSB of the output
- **Z**  1 if a zero result is generated, 0 otherwise
- **V**  0
- **C**  Unaffected

**Mode Bit**
OVM operation is not affected by OVM bit value.

**Cycles**
1

**Example**

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR1</td>
<td>ABCD EF12h (0012 0000h)</td>
</tr>
<tr>
<td>AR2</td>
<td>1234 5678h</td>
</tr>
</tbody>
</table>
**Merge Half-Word, Left-Shifted**  \( \text{MHct} \)

### Syntax

\( \text{MHct} \ src, \ dst \)

### Operands

- \( ct \): the count of half-word (16-bit) shifts
- \( src \): register-, direct-, 16-bit immediate-, or indirect-addressing modes
- \( dst \): register mode (any register in CPU primary-register file)

### Opcode

```
31 24 23 16 15 8 7 0
 1 0 0 1 1 1 0 0
```

**G**

<table>
<thead>
<tr>
<th>( G )</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**H**

<table>
<thead>
<tr>
<th>( H )</th>
<th>src half-word</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>half-word 0 (LS half-word)</td>
</tr>
<tr>
<td>1</td>
<td>half-word 1 (MS half-word)</td>
</tr>
</tbody>
</table>

### Operation

16 LSBs of \( src \) \( << \) \( \{0, 1\} \) half-words and merged with \( dst \rightarrow dst \)

### Description

The 16 LSBs of the \( src \) operand are left shifted 0 or 1 half-words and merged with the bits of the \( dst \) register that are below the left-shifted LSB of the \( src \) operand. When immediate mode is selected, this instruction performs a sign extension of the 16-bit immediate value into a 32-bit value; then, this 32-bit value is shifted and merged.

### Status Bits

If \( \text{ST (SET COND)} = 0 \) and the destination register is \( \text{R0–R11} \), the condition flags are modified. If \( \text{ST (SET COND)} = 1 \), they are modified for all destination registers.

- \( \text{LUF} \): Unaffected
- \( \text{LV} \): Unaffected
- \( \text{UF} \): 0
- \( \text{N} \): MSB of the output
- \( \text{Z} \): 1 if a zero result is generated, 0 otherwise
- \( \text{V} \): 0
- \( \text{C} \): Unaffected

---

*Assembly Language Instructions*  14-159
**MHct**  *Merge Half-Word, Left-Shifted*

<table>
<thead>
<tr>
<th>Mode Bit</th>
<th>OVM operation is not affected by OVM bit value.</th>
</tr>
</thead>
<tbody>
<tr>
<td>Cycles</td>
<td>1</td>
</tr>
<tr>
<td>Example</td>
<td><strong>MH1 AR1, AR2</strong></td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th></th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR1</td>
<td>ABCD EF12h</td>
<td>(EF12 0000h)</td>
</tr>
<tr>
<td>AR2</td>
<td>1234 5678h</td>
<td>AR1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>ABCD EF12h</td>
</tr>
<tr>
<td></td>
<td></td>
<td>AR2</td>
</tr>
<tr>
<td></td>
<td></td>
<td>EF12 5678h</td>
</tr>
</tbody>
</table>
Multiply Floating-Point Values

Syntax

MPYF src, dst

Operands

src: general-addressing modes (G)
dst: register (R0 – R11)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 1 0 1 0 0</td>
<td>G</td>
<td>dst</td>
<td>src</td>
<td></td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00 register (R0–R11)</td>
</tr>
<tr>
<td>01 direct</td>
</tr>
<tr>
<td>10 indirect</td>
</tr>
<tr>
<td>11 immediate</td>
</tr>
</tbody>
</table>

Operation

dst × src → dst

Description

The product of the dst and src operands is loaded into the dst register. The values at src (if in register mode (R0–R11)) and dst are treated as extended-precision floating-point numbers. For nonregister mode, src is treated as single-precision floating-point number.

Status Bits

LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV  1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N   1 if a negative result is generated, 0 otherwise
Z   1 if a zero result is generated, 0 otherwise
V   1 if a floating-point is overflow occurs, 0 otherwise
C   Unaffected.

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1
## MPYF Multiply Floating-Point Value

### Example

**MPYF R0, R2**

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>R0</strong> 07 0C80 0000h</td>
<td>1.4050e + 02</td>
</tr>
<tr>
<td><strong>R2</strong> 03 4C20 0000h</td>
<td>1.27578125e + 01</td>
</tr>
<tr>
<td><strong>LUF</strong> 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td><strong>LV</strong> 0</td>
<td>LV 0</td>
</tr>
<tr>
<td><strong>UF</strong> 0</td>
<td>UF 0</td>
</tr>
<tr>
<td><strong>N</strong> 0</td>
<td>N 0</td>
</tr>
<tr>
<td><strong>Z</strong> 0</td>
<td>Z 0</td>
</tr>
<tr>
<td><strong>V</strong> 0</td>
<td>V 0</td>
</tr>
<tr>
<td><strong>C</strong> 0</td>
<td>C 0</td>
</tr>
<tr>
<td><strong>R0</strong> 07 0C80 0000h</td>
<td>1.4050e + 02</td>
</tr>
<tr>
<td><strong>R2</strong> 0A 600F 2000h</td>
<td>1.79247266e + 03</td>
</tr>
</tbody>
</table>
Multiply Floating-Point Values, 3 Operands

**MPYF3**

**Syntax**

MPYF3 src2, src1, dst

**Operands**

src1, src2: type 1 or type 2 three-operand addressing modes

dst: register mode (R0 – R11)

**Opcode**

**Type 1**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 0 0 1 0 0 1</td>
<td>T</td>
<td>dst</td>
<td>src1</td>
<td>src2</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 1 0 0 1 0 1</td>
<td>T</td>
<td>dst</td>
<td>src1</td>
<td>src2</td>
</tr>
</tbody>
</table>

**Word Fields**

**Type 1**

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (R0 — R11)</td>
<td>register mode (R0 — R11)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (R0 — R11)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>01</td>
<td>register mode (R0 — R11)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

**Operation**

src1 × src2 → dst

**Description**

The product of src1 and src2 is loaded into the dst register. The values at src1, src2 (if src1 and src2 are in register mode (R0–R11)), and dst are treated as extended-precision floating-point numbers. If src1 and src2 are in nonregister mode, they are assumed to be single-precision floating-point numbers.

**Status Bits**

LUF 1 if a floating-point underflow occurs, unchanged otherwise

LV 1 if a floating-point overflow occurs, unchanged otherwise

UF 1 if a floating-point underflow occurs, 0 otherwise

N 1 if a negative result is generated, 0 otherwise

Z 1 if a zero result is generated, 0 otherwise

V 1 if a floating-point is overflow occurs, 0 otherwise

C Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
Parallel MPYF3 and ADDF3

Syntax

\[
\text{MPYF3} \ srcA, srcB, dst1 \\
\| \ 
\text{ADDF3} \ srcC, srcD, dst2
\]

Operands

\[
\begin{align*}
\text{srcA} & : \text{any two must be register (R0 – R7)} \\
\text{srcB} & : \text{any two must be indirect (disp = 0, 1, IR0, IR1)} \\
\text{srcC} & : \text{any two must be register (R0 – R7)} \\
\text{srcD} & : \text{any two must be indirect (disp = 0, 1, IR0, IR1)}
\end{align*}
\]

\[
\begin{align*}
\text{dst1} & : \text{register (d1)} \\
& \begin{cases} 
0 = \text{R0} \\
1 = \text{R1}
\end{cases}
\end{align*}
\]

\[
\begin{align*}
\text{dst2} & : \text{register (d2)} \\
& \begin{cases} 
0 = \text{R2} \\
1 = \text{R3}
\end{cases}
\end{align*}
\]

\[
\begin{align*}
\text{src1} & : \text{register} \ (R0 – R7) \\
\text{src2} & : \text{register} \ (R0 – R7) \\
\text{src3} & : \text{indirect} \ (\text{disp = 0, 1, IR0, IR1}) \ \\
\text{src4} & : \text{indirect} \ (\text{disp = 0, 1, IR0, IR1})
\end{align*}
\]

\[
\begin{align*}
P & : \text{parallel-addressing modes} \ (0 \leq P \leq 3)
\end{align*}
\]

Operation (P Field)

\[
\begin{align*}
00 & : \text{src3} \times \text{src4}, \text{src1} + \text{src2} \\
01 & : \text{src3} \times \text{src1}, \text{src4} + \text{src2} \\
10 & : \text{src1} \times \text{src2}, \text{src3} + \text{src4} \\
11 & : \text{src3} \times \text{src1}, \text{src2} + \text{src4}
\end{align*}
\]

Opcode

\[
\begin{array}{cccccccc}
31 & & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
\hline
1 & 0 & 0 & 0 & 0 & P & d1 & d2 & src1 & src2 & src3 & src4
\end{array}
\]

Word Fields

None

Operation

\[
\begin{align*}
\text{srcA} \times \text{srcB} \rightarrow \text{dst1} \\
\| \ 
\text{srcC} + \text{srcD} \rightarrow \text{dst2}
\end{align*}
\]

Description

A floating-point multiplication and a floating-point addition are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (MPYF3) reads from a register and the operation being performed in parallel (ADDF3) writes to the same register, then MPYF3 accepts as input the contents of the register before it is modified by the ADDF3.
You can code any combination of addressing modes for the four possible source operands as long as you code two as indirect and two as register. The assignment of the source operands srcA – srcD to the src1 – src4 fields varies, depending on the combination of addressing modes used; the P field is encoded accordingly. The assembler may, when not significant, change the order of operands in commutative operations to simplify processing.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

### Status Bits
- **LUF**: 1 if a floating-point underflow occurs, unchanged otherwise
- **LV**: 1 if a floating-point overflow occurs, unchanged otherwise
- **UF**: 1 if a floating-point underflow occurs, 0 otherwise
- **N**: 0
- **Z**: 0
- **V**: 1 if a floating-point overflow occurs, 0 otherwise
- **C**: Unaffected

### Mode Bit
**OVM** operation is not affected by OVM bit value.

### Cycles
1

### Example
```
MPYF3 *AR5++(1),*– – AR1(IR0),R0
||
ADDF3 R5,R7,R3
```

#### Before Instruction | After Instruction
| AR5       | 80 98C5h   | AR5       | 80 98C6h   |
| AR1       | 80 98A8h   | AR1       | 80 98A4h   |
| IR0       | 4h         | IR0       | 4h         |
| R0        | 0h         | R0        | 04 6718 000h 2.88867188e + 01 |
| R5        | 07 33C0 0000h | 1.79750e + 02 | R5        | 07 33C0 0000h | 1.79750e + 02 |
| R7        | 07 0C80 0000h | 1.4050e + 02 | R7        | 07 0C80 0000h | 1.4050e + 02 |
| R3        | 0h         | R3        | 08 2020 0000h 3.20250e + 02 |
| Data at 80 98C5h | 34C 0000h | 1.2750e + 01 | Data at 80 98C5h | 34C 0000h | 1.2750e + 01 |
| Data at 80 98A4h | 111 0000h | 2.265625e + 0 | Data at 80 98A4h | 111 0000h | 2.265625e + 0 |
| LUF       | 0          | LUF       | 0          |
| LV        | 0          | LV        | 0          |
| UF        | 0          | UF        | 0          |
| N         | 0          | N         | 0          |
| Z         | 0          | Z         | 0          |
| V         | 0          | V         | 0          |
| C         | 0          | C         | 0          |
**Syntax**

\[
\text{MPYF3} \text{ src2, src1, dst1}
||
\text{STF} \text{ src3, dst2}
\]

**Operands**

- **src1**: register (R0 – R7)
- **src2**: indirect (disp = 0, 1, IR0, IR1)
- **dst1**: register (R0 – R7)
- **src3**: register (R0 – R7)
- **dst2**: indirect (disp = 0, 1, IR0, IR1)

**Opcode**

\[
\begin{array}{ccccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
1 & 1 & 0 & 1 & 1 & 1 & dst1 & src1 & src3 & dst2 & src2
\end{array}
\]

**Word Fields**

None

**Operation**

\[
\text{src1} \times \text{src2} \to \text{dst1}
||
\text{src3} \to \text{dst2}
\]

**Description**

A floating-point multiplication and a floating-point store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (MPYF3) writes to a register and the operation being performed in parallel (STF) reads from the same register, then the STF accepts as input the contents of the register before it is modified by the MPYF3.

If **src2** and **dst2** point to the same location, then **src2** is read before the write to **dst2**.

**Status Bits**

- **LUF**: 1 if a floating-point underflow occurs, 0 unchanged otherwise
- **LV**: 1 if a floating-point overflow occurs, unchanged otherwise
- **UF**: 1 if a floating-point underflow occurs, 0 otherwise
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if a floating-point overflow occurs, 0 otherwise
- **C**: Unaffected

**Mode Bit**

**OVM** operation is not affected by OVM bit value.

**Cycles**

1
Example

\[
\begin{align*}
\text{MPYF3} & \quad \text{R7, R0} \\
\text{||} & \\
\text{STF} & \quad \text{R3, } \text{AR0} \quad \text{–} \quad \text{IR0}
\end{align*}
\]

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR2</td>
<td>80 982Bh</td>
</tr>
<tr>
<td>R7</td>
<td>05 7B40 0000h</td>
</tr>
<tr>
<td>R0</td>
<td>0h</td>
</tr>
<tr>
<td>R3</td>
<td>08 6B28 0000h</td>
</tr>
<tr>
<td>AR0</td>
<td>80 9860h</td>
</tr>
<tr>
<td>IR0</td>
<td>8h</td>
</tr>
<tr>
<td>Data at 80 982Ah</td>
<td>70C 8000h</td>
</tr>
<tr>
<td></td>
<td>1.4050e + 02</td>
</tr>
<tr>
<td></td>
<td></td>
</tr>
<tr>
<td>Data at 80 9860h</td>
<td>87B28 0000h</td>
</tr>
<tr>
<td></td>
<td>4.7031250e + 02</td>
</tr>
</tbody>
</table>

| LUF | 0 |
| LV  | 0 |
| UF  | 0 |
| N   | 0 |
| Z   | 0 |
| V   | 0 |
| C   | 0 |
**MPYF3||SUBF3**  Parallel MPYF3 and SUBF3

**Syntax**

<table>
<thead>
<tr>
<th>MPYF3</th>
<th>SUBF3</th>
</tr>
</thead>
<tbody>
<tr>
<td>srcA, srcB, dst1</td>
<td>srcC, srcD, dst2</td>
</tr>
</tbody>
</table>

- srcA
- srcB
- srcC
- srcD

Any two must be indirect (disp = 0, 1, IR0, IR1), and any two must be register (R0 – R7).

**Operands**

None

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>P</td>
<td>d1</td>
<td>d2</td>
<td>src1</td>
<td>src2</td>
</tr>
<tr>
<td>src3</td>
<td>src4</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

**Word Fields**

None

**Operation**

- srcA × srcB → dst1
- srcD – srcC → dst2

**dst1** register (d1):

- 0 = R0
- 1 = R1

**dst2** register (d2):

- 0 = R2
- 1 = R3

**src1** register (R0 – R7)

**src2** register (R0 – R7)

**src3** indirect (disp = 0, 1, IR0, IR1)

**src4** indirect (disp = 0, 1, IR0, IR1)

**P** parallel-addressing modes (0 ≤ P ≤ 3)

**Operation (P Field)**

- 00  src3 × src4, src1 – src2
- 01  src3 × src1, src4 – src2
- 10  src1 × src2, src3 – src4
- 11  src3 × src1, src2 – src4

**Description**

A floating-point multiplication and a floating-point subtraction are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (MPYF3) reads from a register, and the operation being performed in parallel (SUBF3) writes to the same register, then MPYF3 accepts as input the contents of the register before it is modified by the SUBF3.
You can code any combination of addressing modes for the four possible source operands as long as you code two as indirect and two as register. The assignment of the source operands srcA – srcD to the src1 – src4 fields varies, depending on the combination of addressing modes used; the P field is encoded accordingly. The assembler may, when not significant, change the order of operands in commutative operations to simplify processing.

<table>
<thead>
<tr>
<th>Status Bits</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>1 if a floating-point underflow occurs, unchanged otherwise</td>
</tr>
<tr>
<td>LV</td>
<td>1 if a floating-point overflow occurs, unchanged otherwise</td>
</tr>
<tr>
<td>UF</td>
<td>1 if a floating-point underflow occurs, 0 otherwise</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>1 if a floating-point overflow occurs, 0 otherwise</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Mode Bit</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>OVM</td>
<td>Operation is not affected by OVM bit value.</td>
</tr>
</tbody>
</table>

| Cycles | 1 |

**Example**

```
MPYF3 R5,*++AR7(IR1),R0
||
SUBF3 R7,*AR3--(1),R2
```

```
MPYF3 *++AR7(IR1), R5,R0
||
SUBF3 R7,*AR3--(1),R2
```

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R5</td>
<td>03 4C00 0000h</td>
</tr>
<tr>
<td>AR7</td>
<td>80 9904h</td>
</tr>
<tr>
<td>IR1</td>
<td>08h</td>
</tr>
<tr>
<td>R0</td>
<td>00h</td>
</tr>
<tr>
<td>R7</td>
<td>07 33C0 0000h</td>
</tr>
<tr>
<td>AR3</td>
<td>80 98B2h</td>
</tr>
<tr>
<td>R2</td>
<td>00h</td>
</tr>
<tr>
<td>Data at 80 990Ch</td>
<td>111 0000h</td>
</tr>
<tr>
<td>Data at 80 98B2h</td>
<td>70C 8000h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
MPYI  Multiply Integer

Syntax

MPYI  src, dst

Operands

src:  general-addressing modes (G)
dst:  register (any register in CPU primary-register file)

Opcode

+---+---+---+---+---+---+---+---+
| 31 | 24 | 23 | 16 | 15 | 8  | 7  | 0  |
+---+---+---+---+---+---+---+---+
| 0 0 0 0 1 0 1 0 1 | G | dst | src |
+---+---+---+---+---+---+---+---+

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation

dst × src → dst

Description

The product of the dst and src operands is loaded into the dst register. The src and dst operands, when read, are assumed to be 32-bit signed integers. The result is assumed to be a 64-bit signed integer. The output to the dst register is the 32 LSBs of the result.

Integer overflow occurs when any of the 32 MSBs of the 64-bit result differs from the MSB of the 32-bit output value.

Status Bits

If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

<table>
<thead>
<tr>
<th>LUF</th>
<th>LV</th>
<th>UF</th>
<th>N</th>
<th>Z</th>
<th>V</th>
<th>C</th>
</tr>
</thead>
<tbody>
<tr>
<td>Unchanged</td>
<td>1 if an integer overflow occurs, unchanged otherwise</td>
<td>0</td>
<td>1 if a negative result is generated, 0 otherwise</td>
<td>1 if a zero result is generated, 0 otherwise</td>
<td>1 if an integer overflow occurs, 0 otherwise</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit

OVM operation is affected by OVM bit value.

Cycles

1
### Example

**MPYI R1,R5**

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R1 00 0033 C251h</td>
<td>3 392 081</td>
</tr>
<tr>
<td>R5 00 0078 B600h</td>
<td>7 910 912</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 1</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 1</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 1</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>

The result overflows and R5 contains the 32 LSBs of the result. To obtain the 32 MSBs, use the MPYSHI3 or the MPYUHI3 instructions.
**MPYI3  Multiply Integer, 3 Operands**

**Syntax**

MPYI3  \( \text{src2, src1, dst} \)

**Operands**

- \( \text{src1, src2: type 1 or type 2 three-operand addressing modes} \)
- \( \text{dst: register mode (any register in CPU primary-register file)} \)

**Opcode**

**Type 1**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0 0</td>
<td>1 0</td>
<td>1 0</td>
<td>1</td>
</tr>
</tbody>
</table>

\( T \)  dst  src1  src2

**Type 2**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0 1</td>
<td>1 0</td>
<td>1 0</td>
<td>1</td>
</tr>
</tbody>
</table>

\( T \)  dst  src1  src2

**Word Fields**

**Type 1**

<table>
<thead>
<tr>
<th>( T )</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>( T )</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (+\text{ARn}(5\text{-bit unsigned displacement}))</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode (+\text{ARn}(5\text{-bit unsigned displacement}))</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (+\text{ARn}(5\text{-bit unsigned displacement}))</td>
<td>indirect mode (+\text{ARn}(5\text{-bit unsigned displacement}))</td>
</tr>
</tbody>
</table>

**Operation**

\( \text{src1} \times \text{src2} \rightarrow \text{dst} \)

**Description**

The product of the numbers at \( \text{src1} \) and \( \text{src2} \) is loaded into the \( \text{dst} \) register. The multiplied numbers are assumed to be 32-bit signed integers. The result is assumed to be a signed 64-bit integer. The output to the \( \text{dst} \) register is the 32 least-significant bits of the result.

Integer overflow occurs when any of the 32 MSBs of the 64-bit result differs from the MSB of the 32-bit \( \text{dst} \) value.
### Status Bits

If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF**  Unchanged.
- **LV** 1 if an integer overflow occurs, unchanged otherwise
- **UF** 0
- **N** 1 if a negative result is generated, 0 otherwise
- **Z** 1 if a zero result is generated, 0 otherwise
- **V** 1 if an integer overflow occurs, 0 otherwise
- **C** Unaffected

### Mode Bit

**OVM** operation is affected by OVM bit value.

### Cycles

1

### Example

None
**Parallel MPYI3 and ADDI3**

**Syntax**
```
MPYI3 srcA, srcB, dst1
||
ADDI3 srcC, srcD, dst2
```

**Operands**

- `srcA`: Any two must be indirect (disp = 0, 1, IR0, IR1), and any two must be register (R0 – R7).
- `srcB`:
- `srcC`:
- `srcD`:
- `dst1`: register (d1):
  - 0 = R0
  - 1 = R1
- `dst2`: register (d2):
  - 0 = R2
  - 1 = R3
- `src1`: register (R0 – R7)
- `src2`: register (R0 – R7)
- `src3`: indirect (disp = 0, 1, IR0, IR1)
- `src4`: indirect (disp = 0, 1, IR0, IR1)
- `P`: parallel-addressing modes (0 ≤ P ≤ 3)

**Operation (P Field)**

- 00: `src3 × src4, src1 + src2`
- 01: `src3 × src1, src4 + src2`
- 10: `src1 × src2, src3 + src4`
- 11: `src3 × src1, src2 + src4`

**Opcode**

```
31 24 23 16 15 8 7 0
1 0 0 0 1 0 P d1 d2 src1 src2 src3 src4
```

**Word Fields**

- None

**Operation**

```
srcA × srcB → dst1
||
srcD + srcC → dst2
```

**Description**

An integer multiplication and an integer addition are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (MPYI3) reads from a register and the operation being performed in parallel (ADDI3) writes to the same register, then MPYI3 accepts as input the contents of the register before it is modified by the ADDI3.
You can code any combination of addressing modes for the four possible source operands as long as you code two as indirect and two as register. The assignment of the source operands srcA – srcD to the src1 – src4 fields varies, depending on the combination of addressing modes used; the P field is encoded accordingly. The assembler may, when not significant, change the order of operands in commutative operations to simplify processing.

**Status Bits**
- **LUF**: Unchanged
- **LV**: 1 if an integer overflow occurs, unchanged otherwise
- **UF**: 0
- **N**: 0
- **Z**: 0
- **V**: 1 if an integer overflow occurs, 0 otherwise
- **C**: Unaffected

**Mode Bit**
- **OVM**: operation is affected by OVM bit value.

**Cycles**
- 1

**Example**

```
MPYI3 R7,R4,R0
|| ADDI3 *–AR3,*AR5– –(1),R3
```

**Before Instruction**

<table>
<thead>
<tr>
<th>R7</th>
<th>14h</th>
<th>20</th>
<th>R7</th>
<th>14h</th>
<th>20</th>
</tr>
</thead>
<tbody>
<tr>
<td>R4</td>
<td>64h</td>
<td>100</td>
<td>R4</td>
<td>64h</td>
<td>100</td>
</tr>
<tr>
<td>R0</td>
<td>0h</td>
<td>07D0h</td>
<td>R0</td>
<td>07D0h</td>
<td>2000</td>
</tr>
<tr>
<td>AR3</td>
<td>80 981Fh</td>
<td>AR3</td>
<td>80 981Fh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>AR5</td>
<td>80 996Fh</td>
<td>AR5</td>
<td>80 996Fh</td>
<td></td>
<td></td>
</tr>
<tr>
<td>R3</td>
<td>0h</td>
<td>0h</td>
<td>R3</td>
<td>0h</td>
<td></td>
</tr>
</tbody>
</table>

Data at 80 981Eh

```
0FFFF FFCBh -53
```

Data at 80 996Eh

```
35h 53
```

<table>
<thead>
<tr>
<th>LUF</th>
<th>0</th>
<th>LUF</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>LV</td>
<td>0</td>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
MPYI3||STI  Parallel MPYI3 and STI3

Syntax

\[
\text{MPYI3} \quad \text{src2, src1, dst1} \\
\text{STI} \quad \text{src3, dst2}
\]

Operands

- **src1**: register (R0 – R7)
- **src2**: indirect (disp = 0, 1, IR0, IR1)
- **dst1**: register (R0 – R7)
- **src3**: register (R0 – R7)
- **dst2**: indirect (disp = 0, 1, IR0, IR1)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>dst1</td>
<td>src1</td>
<td>src3</td>
</tr>
</tbody>
</table>

Word Fields

None

Operation

\[
\text{src1} \times \text{src2} \rightarrow \text{dst1} \\
\text{src3} \rightarrow \text{dst2}
\]

Description

An integer multiplication and an integer store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STI) reads from a register and the operation being performed in parallel (MPYI3) writes to the same register, then STI accepts as input the contents of the register before it is modified by the MPYI3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

Integer overflow occurs when any of the 32 MSBs of the 64-bit result differs from the most significant bit of the 32-bit dst1 value.

Status Bits

- **LUF**: Unchanged
- **LV**: 1 if an integer overflow occurs, unchanged otherwise
- **UF**: 0
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an integer overflow occurs, 0 otherwise
- **C**: Unaffected

Mode Bit

**OVM** operation is affected by OVM bit value.

Cycles

1
### Example

```
MPYI3 *++AR0(1),R5,R7
STI R2,*–AR3(1)
```

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR0 80 995Ah</td>
<td>AR0 80 995Bh</td>
</tr>
<tr>
<td>R5 32h</td>
<td>R5 32h</td>
</tr>
<tr>
<td>R7 0h</td>
<td>R7 2710h</td>
</tr>
<tr>
<td>R2 0DCh</td>
<td>R2 0DCh</td>
</tr>
<tr>
<td>AR3 80 982Fh</td>
<td>AR3 80 982Fh</td>
</tr>
<tr>
<td>Data at 80 995Bh</td>
<td>Data at 80 995Bh</td>
</tr>
<tr>
<td>0C8h 200</td>
<td>0C8h 200</td>
</tr>
<tr>
<td>Data at 80 982Eh</td>
<td>Data at 80 982Eh</td>
</tr>
<tr>
<td>0h</td>
<td>0DCh 220</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
Parallel MPYI3 and SUBI3

Syntax

\[
\text{MPYI3} \quad \text{srcA, srcB, dst1} \\
|| \quad \text{SUBI3} \quad \text{srcC, srcD, dst2}
\]

Operands

- srcA
- srcB
- srcC
- srcD

Any two must be indirect (disp = 0, 1, IR0, IR1), and any two must be register (R0 – R7).

- dst1 register (d1):
  - 0 = R0
  - 1 = R1

- dst2 register (d2):
  - 0 = R2
  - 1 = R3

- src1 register (R0 – R7)
- src2 register (R0 – R7)
- src3 indirect (disp = 0, 1, IR0, IR1)
- src4 indirect (disp = 0, 1, IR0, IR1)

- P parallel-addressing modes (0 ≤ P ≤ 3)

Operation (P Field)

- 00 \(\text{src3} \times \text{src4}, \text{src1} – \text{src2}\)
- 01 \(\text{src3} \times \text{src1}, \text{src4} – \text{src2}\)
- 10 \(\text{src1} \times \text{src2}, \text{src3} – \text{src4}\)
- 11 \(\text{src3} \times \text{src1}, \text{src2} – \text{src4}\)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>P</td>
<td>d1</td>
<td>d2</td>
</tr>
</tbody>
</table>

| src1 | src2 | src3 | src4 |

Word Fields

None

Operation

\[
\text{srcA} \times \text{srcB} \to \text{dst1} \\
|| \quad \text{srcD} – \text{srcC} \to \text{dst2}
\]

Description

An integer multiplication and an integer subtraction are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (MPYI3) reads from a register and the operation being performed in parallel (SUBI3) writes to the same register, then MPYI3 accepts as input the contents of the register before it is modified by the SUBI3.

You can code any combination of addressing modes for the four possible source operands as long as you code two as indirect and two as register. The
Parallel MPYI3 and SUBI3

Assignment of the source operands srcA–srcD to the src1–src4 fields varies, depending on the combination of addressing modes used; the P field is encoded accordingly. The assembler may, when not significant, change the order of operands in commutative operations in order to simplify processing.

Integer overflow occurs when any of the 32 MSBs of the 64-bit result differs from the MSB of the 32-bit output value.

| Status Bits | LUF | Unchanged |
| LV | 1 if an integer overflow occurs, unchanged otherwise |
| UF | 1 if an integer underflow occurs, 0 otherwise |
| N | 0 |
| Z | 0 |
| V | 1 if an integer overflow occurs, 0 otherwise |
| C | Unaffected |

Mode Bit
OVM operation is affected by OVM bit value.

Cycles
1

Example
MPYI3 R2, *++AR0(1), R0
|| SUBI3 *AR5--(IR1), R4, R2
or
MPYI3 *++AR0(1), R2, R0
|| SUBI3 *AR5--(IR1), R4, R2

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R2</td>
<td>32h</td>
</tr>
<tr>
<td>AR0</td>
<td>80 98E3h</td>
</tr>
<tr>
<td>R0</td>
<td>0h</td>
</tr>
<tr>
<td>AR5</td>
<td>80 99FCCh</td>
</tr>
<tr>
<td>IR1</td>
<td>0Ch</td>
</tr>
<tr>
<td>R4</td>
<td>07D0h</td>
</tr>
<tr>
<td>Data at 80 98E4h</td>
<td>Data at 80 98E4h</td>
</tr>
<tr>
<td>62h</td>
<td>98</td>
</tr>
<tr>
<td>Data at 80 99FCh</td>
<td>4B0h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
### MPYSHI

**Multiply Signed Integer and Produce 32 MSBs**

#### Syntax

MPYSHI \( src, \ dst \)

#### Operands

- **src**: general-addressing modes
- **dst**: register mode (any register in CPU primary-register file)

#### Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>

- **G**: src addressing modes
  - 00: register (any register in CPU primary-register file)
  - 01: direct
  - 10: indirect
  - 11: immediate

#### Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

#### Operation

\( \text{dst} \times \text{src} \rightarrow \text{dst} \)

#### Description

The 32 MSBs of the product of the numbers at \( \text{dst} \) and \( \text{src} \) are loaded into the \( \text{dst} \) register. These numbers, when read, are assumed to be signed 32-bit integers. The result is assumed to be a signed 64-bit integer. The output to the \( \text{dst} \) register is the 32 MSBs of the result. The MPYI instruction provides the 32 LSBs of the result.

#### Status Bits

- **LUF**: Unchanged
- **LV**: Unchanged
- **UF**: 0
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if all 64 bits of the product are 0, 0 otherwise
- **V**: 0
- **C**: Unaffected

#### Mode Bit

**OVM** operation is not affected by OVM bit value.

#### Cycles

1

#### Example

None
Multiply Signed Integer Producing 32 MSBs, 3 Operands

**MPYSHI3**

**Syntax**

MPYSHI3  src2, src1, dst

**Operands**

- src1: type 1 or type 2 three-operand addressing modes
- src2: type 1 or type 2 three-operand addressing modes
- dst: register mode (any register in CPU primary-register file)

**Opcode**

**Type 1**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0 1</td>
<td>0 1</td>
<td>0 0</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>T</td>
<td>dst</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0</td>
<td>src1</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>1</td>
<td>src2</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>1 1</td>
<td>0 1</td>
<td>0 0</td>
<td>1</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>T</td>
<td>dst</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0</td>
<td>src1</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>1</td>
<td>src2</td>
</tr>
</tbody>
</table>

**Word Fields**

**Type 1**

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

**Operation**

src1 \( \times \) src2 \( \rightarrow \) dst

**Description**

The product of the numbers at the src1 and src2 operands is loaded into the dst register. The numbers at the src1 and src2 operands are assumed to be 32-bit signed integers. The result is assumed to be a signed 64-bit integer. The output to the dst register is the 32 MSBs of the result. The MPYI3 instruction provides the 32 LSBs of the result.
**Status Bits**

If ST (SET COND) = 0 and the destination register is R0–R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF**  Unchanged
- **LV**   1 if an integer overflow occurs, unchanged otherwise
- **UF**   0
- **N**    1 if a negative result is generated, 0 otherwise
- **Z**    1 if a zero result is generated, 0 otherwise
- **V**    1 if an integer overflow occurs, 0 otherwise
- **C**    Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
Multiply Unsigned Integer and Produce 32 MSBs

**MPYUHI**

**Syntax**

`MPYUHI src, dst`

**Operands**

- `src`: general-addressing modes
- `dst`: register mode (any register in CPU primary-register file)

**Opcode**

```
<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 1 1 1 1 0 0</td>
<td>G</td>
<td>dst</td>
<td>src</td>
<td></td>
</tr>
</tbody>
</table>
```

**Word Fields**

```
<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>
```

**Operation**

`dst \times src \rightarrow dst`

**Description**

The 32 MSBs of the product of the numbers at `dst` and `src` operands are loaded into the `dst` register. These numbers, when read, are assumed to be unsigned 32-bit integers. The result is assumed to be an unsigned 64-bit integer. The output to the `dst` register is the 32 MSBs of the result. The MPYI instruction provides the 32 LSBs of the result.

**Status Bits**

If `ST` (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If `ST` (SET COND) = 1, they are modified for all destination registers.

- **LUF**: Unchanged
- **LV**: Unchanged
- **UF**: 0
- **N**: 0
- **Z**: 1 if all 64 bits of the product are 0, 0 otherwise
- **V**: 0
- **C**: Unaffected

**Mode Bit**

**OVM** operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
**MPYUHI3** Multiply Unsigned Integer Producing 32 MSBs, 3 Operands

**Syntax**

MPYUHI3 \( src2, src1, dst \)

**Operands**

- \( src1, src2 \): both type 1 or type 2 three-operand addressing modes
- \( dst \): register mode (any register in CPU primary-register file)

**Opcode**

**Type 1**

\[
\begin{array}{ccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & T \quad \text{dst} \quad \text{src1} \quad \text{src2} \\
\end{array}
\]

**Type 2**

\[
\begin{array}{ccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 1 & T \quad \text{dst} \quad \text{src1} \quad \text{src2} \\
\end{array}
\]

**Word Fields**

**Type 1**

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

**Operation**

\( src1 \times src2 \to dst \)

**Description**

The product of the numbers at the \( src1 \) and \( src2 \) operands is loaded into the \( dst \) register. The numbers at the \( src1 \) and \( src2 \) operands are assumed to be 32-bit signed integers. The result is assumed to be an unsigned 64-bit integer. The output to the \( dst \) register is the 32 MSBs of the result. The MPYI3 instruction provides the 32 LSBs of the result.
**Status Bits**  
If ST (SET COND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SET COND) = 1, they are modified for all destination registers.

- **LUF**  
  Unchanged

- **LV**  
  Unchanged

- **UF**  
  0

- **N**  
  0

- **Z**  
  1 if all 64 bits of the product are 0, 0 otherwise

- **V**  
  0

- **C**  
  Unaffected

**Mode Bit**  
OVM operation is not affected by OVM bit value.

**Cycles**  
1

**Example**  
None
NEGB  Negate Integer With Borrow

Syntax  \textbf{NEGB} \textit{src, dst}

Operands  \textit{src}: general-addressing modes (G)
\textit{dst}: register (any register in CPU primary-register file)

Opcode

\[
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 \\
 \multicolumn{1}{|c|}{G} & \multicolumn{1}{|c|}{dst} & \multicolumn{1}{|c|}{src} \\
\end{array}
\]

Word Fields

\begin{center}
\begin{tabular}{|c|c|}
\hline
\textbf{G} & \textbf{src addressing modes} \\
\hline
00 & register (any register in CPU primary-register file) \\
01 & direct \\
10 & indirect \\
11 & immediate \\
\hline
\end{tabular}
\end{center}

Operation  \(0 - \textit{src} - C \rightarrow \textit{dst}\)

Description  The difference of the 0, \textit{src}, and C operands, calculated as shown, is loaded into the \textit{dst} register. The \textit{dst} and \textit{src} are assumed to be signed integers.

Status Bits  If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

\begin{itemize}
\item \textbf{LUF}  Unaffected
\item \textbf{LV}  1 if an integer overflow occurs, unchanged otherwise
\item \textbf{UF}  0
\item \textbf{N}  1 if a negative result is generated, 0 otherwise
\item \textbf{Z}  1 if a zero result is generated, 0 otherwise
\item \textbf{V}  1 if an integer overflow occurs, 0 otherwise
\item \textbf{C}  1 if a borrow occurs, 0 otherwise
\end{itemize}

Mode Bit  \textbf{OVM} operation is affected by OVM bit value.

Cycles  1
### Example

**NEGB R5, R7**

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R5</td>
<td>0FFFF FFCBh</td>
</tr>
<tr>
<td>R7</td>
<td>0h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>1</td>
</tr>
</tbody>
</table>
NEGF  *Negate Floating-Point Value*

**Syntax**

`NEGF src, dst`

**Operands**

- `src`: general-addressing modes (G)
- `dst`: register (R0 – R11)

**Opcode**

```
31 24 23 16 15 8 7 0
0 0 0 0 1 1 1 G dst src
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary–register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

`0 – src → dst`

**Description**

The difference of the 0 and `src` operands is loaded into the `dst` register. The `dst` and `src` operands are assumed to be floating-point numbers.

**Status Bits**

- `LUF`: 1 if a floating-point underflow occurs, unchanged otherwise
- `LV`: 1 if a floating-point overflow occurs, unchanged otherwise
- `UF`: 1 if a floating-point underflow occurs, 0 otherwise
- `N`: 1 if a negative result is generated, 0 otherwise
- `Z`: 1 if a zero result is generated, 0 otherwise
- `V`: 1 if a floating-point overflow occurs, 0 otherwise
- `C`: Unaffected

**Mode Bit**

`OVM` operation is not affected by `OVM` bit value.

**Cycles**

1
### Example

**NEGF** *+++AR3(2), R1*

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR3 80 9800h</td>
<td>AR3 80 9802h</td>
</tr>
<tr>
<td>R1 05 7B40 0025h</td>
<td>R1 07 F380 0000h</td>
</tr>
<tr>
<td>Data at 80 9802h</td>
<td>Data at 80 9802h</td>
</tr>
<tr>
<td>70C 8000h</td>
<td>70C 8000h</td>
</tr>
<tr>
<td>1.4050e + 02</td>
<td>1.4050e + 02</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
NEGF||STF  Parallel NEGF and STF

Syntax

\[
\text{NEGF} \quad \text{src2, dst1} \\
\| \quad \text{STF} \quad \text{src3, dst2}
\]

Operands

- \text{src2}: indirect (disp = 0, 1, IR0, IR1)
- \text{dst1}: register (R0 – R7)
- \text{src3}: register (R0 – R7)
- \text{dst2}: indirect (disp = 0, 1, IR0, IR1)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1 1 1 0 0 1</td>
<td>dst1</td>
<td>0 0 0</td>
<td>src3</td>
<td>dst2</td>
</tr>
</tbody>
</table>

Word Fields

None

Operation

\[
0 - \text{src2} \rightarrow \text{dst1} \\
\| \quad \text{src3} \rightarrow \text{dst2}
\]

Description

A floating-point negation and a floating-point store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STF) reads from a register and the operation being performed in parallel (NEGF) writes to the same register, then STF accepts as input the contents of the register before it is modified by the NEGF.

If \text{src2} and \text{dst2} point to the same location, \text{src2} is read before the write to \text{dst2}.

Status Bits

- LUF: 1 if a floating-point underflow occurs, 0 unchanged otherwise
- LV: 1 if a floating-point overflow occurs, unchanged otherwise
- UF: 1 if a floating-point underflow occurs, 0 otherwise
- N: 1 if a negative result is generated, 0 otherwise
- Z: 1 if a zero result is generated, 0 otherwise
- V: 1 if a floating-point overflow occurs, 0 otherwise
- C: Unaffected

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1
### Example

```plaintext
NEGF *AR4--(1), R7
|| STF R2, *++AR5(1)
```

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR4 80 98E1h</td>
<td>AR4 80 98E0h</td>
</tr>
<tr>
<td>R7 0h</td>
<td>R7 05 B40 0000h -6.281250e + 01</td>
</tr>
<tr>
<td>R2 07 33C0 0000h</td>
<td>R2 07 33C0 0000h 1.79750e + 02</td>
</tr>
<tr>
<td>AR5 80 9803h</td>
<td>AR5 80 9804h</td>
</tr>
<tr>
<td>Data at 80 98E1h</td>
<td>Data at 80 98E1h 6.281250e + 01</td>
</tr>
<tr>
<td>57 B40 0000h</td>
<td>57 B40 0000h 6.281250e + 01</td>
</tr>
<tr>
<td>Data at 80 9804h</td>
<td>Data at 80 9804h 1.79750e + 02</td>
</tr>
<tr>
<td>0h</td>
<td>0h 733 C000h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
NEGI  Negate Integer

Syntax  **NEGI**  *src, dst*

Operands  
- **src**: general-addressing modes (G)
- **dst**: register (any register in CPU primary-register file)

Opcode  

<table>
<thead>
<tr>
<th>31 24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 0 1 1 0 0</td>
<td>G</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation  
0 – *src* → *dst*

Description  
The difference of the 0 and *src* operands is loaded into the *dst* register. The *dst* and *src* operands are assumed to be signed integers.

Status Bits  
If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

- **LUF**: Unaffected
- **LV**: 1 if an integer overflow occurs, unchanged otherwise
- **UF**: 0
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an integer overflow occurs, 0 otherwise
- **C**: 1 if a borrow occurs, 0 otherwise

Mode Bit  
**OVM** operation is affected by OVM bit value.

Cycles  
1
Example

NEGI 174, R5 (174 = 0AEh)

<table>
<thead>
<tr>
<th></th>
<th>Before Instruction</th>
<th>After Instruction</th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>R5</td>
<td>0DCh</td>
<td>0FFFFFF52</td>
<td>-174</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>N</td>
<td>0</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>V</td>
<td>0</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>C</td>
<td>0</td>
<td>1</td>
<td></td>
</tr>
</tbody>
</table>
**NEGI||STI**  
Parallel NEGI and STI

**Syntax**  
\[
\text{NEGI } src2, \text{ dst1} \\
|| \text{STI } src3, \text{ dst2}
\]

**Operands**  
- **src2**: indirect (disp = 0, 1, IR0, IR1)  
- **dst1**: register (R0 – R7)  
- **src3**: register (R0 – R7)  
- **dst2**: indirect (disp = 0, 1, IR0, IR1)

**Opcode**  

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>dst1</td>
<td>0</td>
</tr>
</tbody>
</table>

**Word Fields**  
None

**Operation**  
0 – src2 → dst1  
|| src3 → dst2

**Description**  
An integer negation and an integer store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STI) reads from a register and the operation being performed in parallel (NEGI) writes to the same register, then STI accepts as input the contents of the register before it is modified by the NEGI.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

**Status Bits**  
- **LUF**: Unaffected  
- **LV**: 1 if an integer overflow occurs, unchanged otherwise  
- **UF**: 0  
- **N**: 1 if a negative result is generated, 0 otherwise  
- **Z**: 1 if a zero result is generated, 0 otherwise  
- **V**: 1 if an integer overflow occurs, 0 otherwise  
- **C**: 1 if a borrow occurs, 0 otherwise

**Mode Bit**  
**OVM** operation is affected by OVM bit value.

**Cycles**  
1
### Example

**NEGI** *–AR3, R2

|| **STI** R2, *AR1++

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR3</td>
<td>80 982Fh</td>
</tr>
<tr>
<td>R2</td>
<td>19h</td>
</tr>
<tr>
<td>25</td>
<td>R2</td>
</tr>
<tr>
<td>0FFFF FF24h</td>
<td>–220</td>
</tr>
<tr>
<td>AR1</td>
<td>80 98A6h</td>
</tr>
<tr>
<td>Data at 80 982Eh</td>
<td>Data at 80 982Eh</td>
</tr>
<tr>
<td>19h</td>
<td>0DCh</td>
</tr>
<tr>
<td>220</td>
<td>220</td>
</tr>
<tr>
<td>Data at 80 98A5h</td>
<td>Data at 80 98A5h</td>
</tr>
<tr>
<td>19h</td>
<td>0DCh</td>
</tr>
<tr>
<td>25</td>
<td>25</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
NOP  No Operation

Syntax  NOP src
Operands  src: general-addressing modes (G)
Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>dst</td>
<td>src</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (no operation)</td>
</tr>
<tr>
<td>10</td>
<td>indirect (modify ARn, 0 ≤ n ≤ 7)</td>
</tr>
</tbody>
</table>

Operation  No ALU or multiplier operations.
ARn is modified if src is specified in indirect mode.

Description  If the src operand is specified in the indirect mode, the specified addressing operation is performed, and a dummy memory read occurs. If the src operand is omitted, no operation is performed.

Status Bits

<table>
<thead>
<tr>
<th>LUF</th>
<th>Unaffected</th>
</tr>
</thead>
<tbody>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit  OVM operation is not affected by OVM bit value.

Cycles  1

Example 1  NOP

Before Instruction  After Instruction
PC 3Ah  PC 3Bh

Example 2  NOP *AR3--(1)

Before Instruction  After Instruction
PC 5h  PC 6h
AR3 80 9900h  AR3 80 98FFh
**Syntax**

NORM src, dst

**Operands**

src: general-addressing modes (G)
dst: register (R0 – R11)

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16</th>
<th>15 8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0</td>
<td>0 1 0 1 0</td>
<td>G</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

norm (src) → dst

**Description**

The src operand is assumed to be an unnormalized floating-point number; for example, the implied bit is set equal to the sign bit. The dst is set equal to the normalized src operand with the implied bit removed. The dst operand exponent is set to the src operand exponent minus the size of the left-shift necessary to normalize the src. The dst operand is assumed to be a normalized floating-point number.

For values of src:

- If src (exp) = −128 and src (man) = 0, then dst = 0, Z = 1, and UF = 0.
- If src (exp) = −128 and src (man) ≠ 0, then dst = 0, Z = 0, and UF = 1.
- For all other cases of the src, if a floating-point underflow occurs, then dst (man) is forced to 0 and dst (exp) = −128. If src (man) = 0, then dst (man) = 0 and dst (exp) = −128. Refer to Section 5.7 on page 5-27.

**Status Bits**

- LUF: 1 if a floating-point underflow occurs, unchanged otherwise
- LV: Unaffected
- UF: 1 if a floating-point underflow occurs, 0 otherwise
- N: 1 if a negative result is generated, 0 otherwise
- Z: 1 if a zero result is generated, 0 otherwise
- V: 0
- C: Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1
**Example**

NORM \( R_1, R_2 \)

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>( R_1 ) 04 0000 3AF5h</td>
<td>( R_1 ) 04 0000 3AF5h</td>
</tr>
<tr>
<td>( R_2 ) 07 0C80 0000h</td>
<td>( R_2 ) F2 6BD4 0000h 1.12451613e – 04</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
Syntax

\[ \text{NOT } src, dst \]

Operands

\begin{align*}
src: & \quad \text{general-addressing modes (G)} \\
dst: & \quad \text{register (any register in CPU primary-register file)}
\end{align*}

Opcode

\[
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 \\
G & dst & src
\end{array}
\]

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation

\[ \neg src \rightarrow dst \]

Description

The bitwise-logical complement of the \( src \) operand is loaded into the \( dst \) register. The complement is formed by a logical NOT of each bit of the \( src \) operand. The \( dst \) and \( src \) operands are assumed to be unsigned integers.

Status Bits

If \( ST \) (SETCOND) = 0 and the destination register is \( R0 \) – \( R11 \), the condition flags are modified. If \( ST \) (SETCOND) = 1, they are modified for all destination registers.

\begin{align*}
\text{LUF} & \quad \text{Unaffected} \\
\text{LV} & \quad \text{Unaffected} \\
\text{UF} & \quad 0 \\
\text{N} & \quad \text{MSB of the output} \\
\text{Z} & \quad 1 \text{ if a zero result is generated, } 0 \text{ otherwise} \\
\text{V} & \quad 0 \\
\text{C} & \quad \text{Unaffected}
\end{align*}

Mode Bit

\( OVM \) operation is affected by OVM bit value.

Cycles

1
**NOT  Bitwise Logical Complement**

### Example

`NOT @982Ch, R4`

<table>
<thead>
<tr>
<th></th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>DP</strong></td>
<td>80h</td>
<td>80h</td>
</tr>
<tr>
<td><strong>R4</strong></td>
<td>0h</td>
<td>0FFFF A1D0h</td>
</tr>
<tr>
<td><strong>Data at 80 982Ch</strong></td>
<td>5E2Fh</td>
<td>5E2Fh</td>
</tr>
<tr>
<td><strong>LUF</strong></td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td><strong>LV</strong></td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td><strong>UF</strong></td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td><strong>N</strong></td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td><strong>Z</strong></td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td><strong>V</strong></td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td><strong>C</strong></td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>
Parallel NOT and STI

Syntax

\[ \text{NOT} \quad src2, \ dst1 \]
\[ \text{|| STI} \quad src3, \ dst2 \]

Operands

- **src2**: indirect (disp = 0, 1, IR0, IR1)
- **dst1**: register (R0 – R7)
- **src3**: register (R0 – R7)
- **dst2**: indirect (disp = 0, 1, IR0, IR1)

Opcode

```
31 23 24 16 0
111 00 src211
0x0 dst1
15 8 7
src3
0x0 dst2
0 0 0
```

Word Fields

None

Operation

\[ \sim src2 \rightarrow dst1 \]
\[ \text{|| src3} \rightarrow dst2 \]

Description

A bitwise-logical NOT and an integer store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STI) reads from a register and the operation being performed in parallel (NOT) writes to the same register, then STI accepts as input the contents of the register before it is modified by the NOT.

If **src2** and **dst2** point to the same location, **src2** is read before the write to **dst2**.

Status Bits

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: MSB of the output
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 0
- **C**: Unaffected

Mode Bit

**OVM** operation is not affected by OVM bit value.

Cycles

1
### Example

**NOT**  $\text{++AR2,R3}$  
**STI**  $\text{R7,--- AR4 (IR1)}$

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR2 80 99CBh</td>
<td>AR2 80 99CBh</td>
</tr>
<tr>
<td>R3 0h</td>
<td>R3 0FFFF F3D0h</td>
</tr>
<tr>
<td>R7 0DCh</td>
<td>R7 0DCh</td>
</tr>
<tr>
<td>AR4 80 9850h</td>
<td>AR4 80 9840h</td>
</tr>
<tr>
<td>IR1 10h</td>
<td>IR1 10h</td>
</tr>
<tr>
<td>Data at 80 99CCh</td>
<td>Data at 80 99CCh</td>
</tr>
<tr>
<td>0C2Fh</td>
<td>0C2Fh</td>
</tr>
<tr>
<td>Data at 80 9840h</td>
<td>Data at 80 9840h</td>
</tr>
<tr>
<td>0h</td>
<td>0DCh</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 1</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
Syntax  
\[ \text{OR} \ src, \ dst \]

Operands  
- **src**: general-addressing modes (G)
- **dst**: register (any register in CPU primary-register file)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 1</td>
<td>0 0 0 0</td>
<td>G</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation  
\[ dst \ OR \ src \rightarrow dst \]

Description  
The bitwise-logical OR between the **src** and **dst** operands is loaded into the **dst** register. The **dst** and **src** operands are assumed to be unsigned integers.

Status Bits  
If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: MSB of the output
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 0
- **C**: Unaffected

Mode Bit  
**OVM** operation is not affected by OVM bit value.

Cycles  
1
### Example

**OR** *+++AR1(IR1),R2*

<table>
<thead>
<tr>
<th></th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR1</td>
<td>80 9800h</td>
<td>80 9800h</td>
</tr>
<tr>
<td>IR1</td>
<td>4h</td>
<td>4h</td>
</tr>
<tr>
<td>R2</td>
<td>01256 0000h</td>
<td>01256 2BCDh</td>
</tr>
<tr>
<td>Data at 80 9804h</td>
<td>2BCDh</td>
<td>2BCDh</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>
Syntax

OR3 src2, src1, dst

Operands

src1, src2: type 1 or type 2 three-operand addressing modes
dst: register mode (any register in CPU primary-register file)

Opcode

Type 1

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 0 1 0 1 1 T</td>
<td>dst</td>
<td>src1</td>
<td>src2</td>
<td></td>
</tr>
</tbody>
</table>

Type 2

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 1 0 1 0 1 1 T</td>
<td>dst</td>
<td>src1</td>
<td>src2</td>
<td></td>
</tr>
</tbody>
</table>

Word Fields

Type 1

<table>
<thead>
<tr>
<th>T src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00 register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01 indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10 register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11 indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

Type 2

<table>
<thead>
<tr>
<th>T src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00 register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01 register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10 indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11 indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

Operation

src1 OR src2 → dst
OR3  Bitwise Logical OR, 3 Operands

Description
The bitwise-logical OR between the numbers at the src1 and src2 operands is loaded into the dst register. The numbers at the src1, src2, and dst operands are assumed to be unsigned integers. The src2 immediate-addressing mode is sign extended.

Status Bits
If ST (SETCOND) = 0, the condition flags are modified if the destination register is R0 — R11. If ST (SETCOND) = 1, they are modified for all destination registers.

<table>
<thead>
<tr>
<th>Status Bit</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>MSB of the output</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit
OVM operation is not affected by OVM bit value.

Cycles
1

Example
None
Parallel OR3 and STI

**Syntax**

\[ \text{OR3: src2, src1, dst1} \]
\[ \parallel \text{STI: src3, dst2} \]

**Operands**

- **src1**: register (R0 – R7)
- **src2**: indirect (disp = 0, 1, IR0, IR1)
- **dst1**: register (R0 – R7)
- **src3**: register (R0 – R7)
- **dst2**: indirect (disp = 0, 1, IR0, IR1)

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>dst1</td>
<td>src1</td>
<td>src3</td>
</tr>
</tbody>
</table>

**Word Fields**

None

**Operation**

\[ \text{src1 OR src2} \rightarrow \text{dst1} \]
\[ \parallel \text{src3} \rightarrow \text{dst2} \]

**Description**

A bitwise-logical OR and an integer store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STI) reads from a register and the operation being performed in parallel (OR3) writes to the same register, then STI accepts as input the contents of the register before it is modified by the OR3.

If src2 and dst2 point to the same location, src2 is read before the write to dst2.

**Status Bits**

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: MSB of the output
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 0
- **C**: Unaffected

**Mode Bit**

**OVM** operation is not affected by OVM bit value.

**Cycles**

1
**Example**

```
OR3 *+++AR2,R5,R2
||
STI  R6,*AR1--
```

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR2</td>
<td>80 9830h</td>
</tr>
<tr>
<td>R5</td>
<td>80 0000h</td>
</tr>
<tr>
<td>R2</td>
<td>0h</td>
</tr>
<tr>
<td>R6</td>
<td>0DCh</td>
</tr>
<tr>
<td>AR1</td>
<td>80 9833h</td>
</tr>
<tr>
<td>Data at 80 9831h</td>
<td>Data at 80 9831h</td>
</tr>
<tr>
<td>9800h</td>
<td>9800h</td>
</tr>
</tbody>
</table>

| Data at 80 9883h | Data at 80 9883h |
| 0h | 0DCh | 220 |
| LUF | 0 | LUF | 0 |
| LV  | 0 | LV  | 0 |
| UF  | 0 | UF  | 0 |
| N   | 0 | N   | 0 |
| Z   | 0 | Z   | 0 |
| V   | 0 | V   | 0 |
| C   | 0 | C   | 0 |
POP Integer

**Syntax**

POP dst

**Operands**

dst: register (any register in CPU primary-register file)

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0</td>
<td>1 1 0 0 0 1</td>
<td>dst</td>
<td>0 0 0 0 0 0 0 0 0 0 0 0 0 0</td>
<td></td>
</tr>
</tbody>
</table>

**Word Fields**

None

**Operation**

*SP–* → dst

**Description**

The top of the current system stack is popped and loaded into the 32 LSBs of the dst register. The top of the stack is assumed to be a signed integer. The POP is performed with a post decrement of the stack pointer. The eight MSBs (exponent) of an extended-precision dst register (R11–R0) are left unmodified. If required they can be recovered with a POPF instruction.

**Status Bits**

If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

- **LUF** Unaffected
- **LV** Unaffected
- **UF** 0
- **N** 1 if a negative result is generated, 0 otherwise
- **Z** 1 if a zero result is generated, 0 otherwise
- **V** 0
- **C** Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

POP R3

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>SP</td>
<td>80 9856h</td>
</tr>
<tr>
<td>R3</td>
<td>012DAh</td>
</tr>
<tr>
<td>Data at 80 9856h</td>
<td>4 826</td>
</tr>
<tr>
<td>0FFF 0DA4h</td>
<td>–62 044</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>

*Assembly Language Instructions* 14-209
POPF  POP Floating-Point Value

Syntax  POPF dst

Operands  dst: register (R0 – R11)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 0 1 1 1 0 1 0 1</td>
<td>dst</td>
<td>0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Word Fields  None

Operation  *SP–– → dst

Description  The top of the current system stack is popped and loaded into the dst register (32 MSBs). The eight LSBs of the dst register mantissa are set to 0. For this reason, POPF must be executed before the POP instruction when you are preserving the entire 40 register bits. The top of the stack is assumed to be a floating-point number. The POP is performed with a postdecrement of the stack pointer.

Status Bits

<table>
<thead>
<tr>
<th>LUF</th>
<th>UF</th>
<th>LV</th>
<th>N</th>
<th>Z</th>
<th>V</th>
<th>C</th>
</tr>
</thead>
<tbody>
<tr>
<td>Unaffected</td>
<td>0</td>
<td>Unaffected</td>
<td>1 if a negative result is generated, 0 otherwise</td>
<td>1 if a zero result is generated, 0 otherwise</td>
<td>0</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit  OVM operation is not affected by OVM bit value.

Cycles  1

Example  POPF R4

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>SP 80 984Ah</td>
<td>SP 80 9849h</td>
</tr>
<tr>
<td>R4 02 5D2E 0123h 6.91186578e + 00</td>
<td>R4 5F 2C13 0200h 5.32544007e + 28</td>
</tr>
<tr>
<td>Data at 80 984Ah 5F2C 1302h 5.32544007e + 28</td>
<td>Data at 80 984Ah 5F2C 1302h 5.32544007e + 28</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
**Syntax**

\[ \text{PUSH } src \]

**Operands**

*src*: register (any register in CPU primary-register file)

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>src</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>

**Word Fields**

None

**Operation**

\[ src \rightarrow ^{++}SP \]

**Description**

The contents of the *src* register (32 LSBs) are pushed onto the current system stack. The integer or mantissa portion of an extended-precision register (R0–R11) is saved with this instruction. The 8 MSBs (exponent) can be pushed with the PUSHF instruction. The *src* is assumed to be a signed integer. The PUSH is performed with a preincrement of the stack pointer.

**Status Bits**

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: Unaffected
- **N**: Unaffected
- **Z**: Unaffected
- **V**: Unaffected
- **C**: Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

**Before Instruction**

<table>
<thead>
<tr>
<th>SP</th>
<th>80 98AEh</th>
</tr>
</thead>
<tbody>
<tr>
<td>R6</td>
<td>815Bh</td>
</tr>
<tr>
<td>Data at 80 98AEh</td>
<td>0h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>

**After Instruction**

<table>
<thead>
<tr>
<th>SP</th>
<th>80 98AFh</th>
</tr>
</thead>
<tbody>
<tr>
<td>R6</td>
<td>815Bh</td>
</tr>
<tr>
<td>Data at 80 98AFh</td>
<td>815Bh</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>

---

*Assembly Language Instructions* 14-211
**PUSHF**  *PUSH Floating-Point Value*

**Syntax**

```
PUSHF src
```

**Operands**

```
src:  register (R0 – R11)
```

**Opcode**

```

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0 0</td>
<td>1 1</td>
<td>1 1</td>
<td>0 1</td>
</tr>
<tr>
<td></td>
<td>src</td>
<td>0 0</td>
<td>0 0</td>
<td>0 0</td>
</tr>
<tr>
<td></td>
<td>0 0</td>
<td>0 0</td>
<td>0 0</td>
<td>0 0</td>
</tr>
<tr>
<td></td>
<td>0 0</td>
<td>0 0</td>
<td>0 0</td>
<td>0 0</td>
</tr>
<tr>
<td></td>
<td>0 0</td>
<td>0 0</td>
<td>0 0</td>
<td>0 0</td>
</tr>
<tr>
<td></td>
<td>0 0</td>
<td>0 0</td>
<td>0 0</td>
<td>0 0</td>
</tr>
<tr>
<td></td>
<td>0 0</td>
<td>0 0</td>
<td>0 0</td>
<td>0 0</td>
</tr>
</tbody>
</table>
```

**Word Fields**

None

**Operation**

```
src → *++SP
```

**Description**

The contents of the *src* register (32 MSBs) are pushed onto the current system stack. The *src* is assumed to be a floating-point number. The PUSH is performed with a preincrement of the stack pointer. The eight LSBs of the mantissa are not saved (notice the difference in R2 and the value on the stack in the example below), but they can be saved with the PUSH instruction. PUSHF should be executed after the PUSH instruction.

**Status Bits**

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: Unaffected
- **N**: Unaffected
- **Z**: Unaffected
- **V**: Unaffected
- **C**: Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

```
PUSHF R2
```

### Before Instruction

<table>
<thead>
<tr>
<th>SP</th>
<th>R2</th>
<th>Data at 80 9802h</th>
</tr>
</thead>
<tbody>
<tr>
<td>80 9801h</td>
<td>02 5C12 8081h 6.87725854e + 00</td>
<td></td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>LUF</th>
<th>LV</th>
<th>UF</th>
<th>N</th>
<th>Z</th>
<th>V</th>
<th>C</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>

### After Instruction

<table>
<thead>
<tr>
<th>SP</th>
<th>R2</th>
<th>Data at 80 9802h</th>
</tr>
</thead>
<tbody>
<tr>
<td>80 9802h</td>
<td>02 5C12 8081h 6.87725854e + 00</td>
<td></td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>LUF</th>
<th>LV</th>
<th>UF</th>
<th>N</th>
<th>Z</th>
<th>V</th>
<th>C</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>

14-212
Reciprocal of Floating-Point Value  RCPF

Syntax
RCPF src, dst

Operands
src: extended-precision register-, direct- and indirect-addressing modes
dst: R0 – R11

Opcode

<table>
<thead>
<tr>
<th>31 24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 1 1 1 0 0</td>
<td>G</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00 register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01 direct</td>
</tr>
<tr>
<td>10 indirect</td>
</tr>
<tr>
<td>11 immediate</td>
</tr>
</tbody>
</table>

Operation
16-bit reciprocal of src → dst

Description
The 16-bit approximation of the reciprocal of the src operand is loaded into the dst register. The dst and src operands are assumed to be floating-point numbers.

Status Bits
LUF 1 if a floating-point underflow occurs, unchanged otherwise
LV 1 if a floating-point overflow occurs, unchanged otherwise
UF 1 if a floating-point underflow occurs, 0 otherwise
N 1 if a negative result is generated, 0 otherwise
Z 1 is a zero result, 0 otherwise
V 1 if a floating-point overflow occurs, 0 otherwise
C Unaffected

Mode Bit
OVM operation is not affected by OVM bit value.

Cycles
1

Example
None
**Syntax**

```
RETI cond
```

**Operands**

None

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 1 1 1 0 0 0 0 0</td>
<td>cond</td>
<td>0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

**Word Fields**

None

**Operation**

If (cond is true)
- *(SP) → PC
- ST(PGIE) → ST(GIE)
- ST(PCF) → ST(CF)
Else, continue

**Description**

If the condition is true, then the top of the stack is popped to the PC, PGIE is copied to GIE, and PCF is copied to CF. If the condition is not true, then continue normal operation (see Section 14.2 on page 14-12 for a list of condition mnemonics, encoding, and flags).

**Status Bits**

- **LUF** Unaffected
- **LV** Unaffected
- **UF** Unaffected
- **N** Unaffected
- **Z** Unaffected
- **V** Unaffected
- **C** Unaffected

**Mode Bit**

**OVM** operation is not affected by OVM bit value.

**Cycles**

4

**Example**

None
Return From Interrupt or Trap Conditionally Delayed  RETIcondD

Syntax  RETIcondD
Operands  None
Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8  7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>cond</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>

Word Fields  None
Operation  If (cond is true)
*(SP) → PC
ST(PGIE) → ST(GIE)
ST(PCF) → ST(CF)
Else, continue

Description  Performs a delayed return from an interrupt or trap.

Because this is a delayed return, the three instructions following the RETIcondD are fetched and executed. These three instructions should not modify the program flow, load the status register, or modify the stack pointer (SP) register. See Section 14.2 for a list of condition mnemonics, encoding, and flags.

Interrupts are disabled for the duration of the RETIcondD.

Status Bits

<table>
<thead>
<tr>
<th>Bit</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit  OVM operation is not affected by OVM bit value.

Cycles  1
Example  None
**RETcond**  
*Return From Subroutine Conditionally*

**Syntax**

```
RETcond
```

**Operands**

None

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 1 1 1</td>
<td>0 0 0 1</td>
<td>0 0</td>
<td>cond</td>
<td>0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0</td>
</tr>
</tbody>
</table>

**Word Fields**

None

**Operation**

If `cond` is true:

```
*SP--- → PC.
```

Else, continue.

**Description**

A conditional return is performed. If the condition is true, the top of the stack is popped to the PC.

The 'C4x provides 20 condition codes that can be used with this instruction (see Section 14.2 for a list of condition mnemonics, encoding, and flags).

**Status Bits**

- **LUF** Unaffected
- **LV** Unaffected
- **UF** Unaffected
- **N** Unaffected
- **Z** Unaffected
- **V** Unaffected
- **C** Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

4

**Example**

```
Before Instruction                  After Instruction
PC     123h                          PC     456h
SP     80 983Ch                      SP     80 983Bh
Data at 80 983Ch                    Data at 80 983Ch
      456h                          456h
LUF    0                            LUF    0
LV     0                            LV     0
UF     0                            UF     0
N      0                            N      0
Z      0                            Z      0
V      0                            V      0
C      0                            C      0
```
### Syntax

**RND** *src, dst*

### Operands

- **src**: general-addressing modes (G)
- **dst**: register (R0 – R11)

### Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0</td>
<td>0 0 0</td>
<td>1 0 0 0 1 0</td>
<td>G</td>
<td><strong>dst</strong></td>
</tr>
</tbody>
</table>

### Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

### Operation

`rnd(src) → dst`

### Description

The result of rounding the *src* operand is loaded into the *dst* register. The *src* operand is rounded to the nearest single-precision floating-point value. If the *src* operand is exactly halfway between two single-precision values, it is rounded to the most positive of those values. Notice that the rounding of 0 does not set the zero (z) status bit but the underflow bit.

### Status Bits

- **LUF**: 1 if a floating-point underflow occurs, unchanged otherwise
- **LV**: 1 if a floating-point overflow occurs, unchanged otherwise
- **UF**: 1 if a floating-point underflow occurs or the *src* operand is zero, 0 otherwise
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: Unaffected
- **V**: 1 if a floating-point overflow occurs, 0 otherwise
- **C**: Unaffected

### Mode Bit

**OVM** operation is affected by OVM bit value.

### Cycles

1
### RND Round Floating-Point Value

**Example**

<table>
<thead>
<tr>
<th></th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R5</td>
<td>07 33C1 6EEFh</td>
<td>1.79755599e + 02</td>
</tr>
<tr>
<td>R2</td>
<td>0h</td>
<td>07 33C1 6EEFh</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>
Syntax

```
ROL dst
```

Operands

```
dst: register (any register in CPU primary-register file)
```

Opcode

```
<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0</td>
<td>1 0 0</td>
<td>0 1 1</td>
<td>dst</td>
<td>0 0 0 0 0 0 0 0 0 1 1</td>
</tr>
</tbody>
</table>
```

Word Fields

None

Operation

```
dst left-rotated 1 bit → dst
```

Description

The contents of the `dst` operand are left-rotated one bit and loaded into the `dst` register. This is a circular rotate with the MSB transferred into the LSB.

Rotate left:

```
C → dst
```

Status Bits

If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: MSB of the output
- **Z**: 1 if a zero output is generated, 0 otherwise
- **V**: 0
- **C**: Set to the value of the bit rotated out of the high-order bit

Mode Bit

**OVM** operation is not affected by OVM bit value.

Cycles

1

Example

```
ROL R3
```

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R3: 8002 5CD4h</td>
<td>R3: 0004 B9A9h</td>
</tr>
<tr>
<td>LUF</td>
<td>LUF</td>
</tr>
<tr>
<td>LV</td>
<td>LV</td>
</tr>
<tr>
<td>UF</td>
<td>UF</td>
</tr>
<tr>
<td>N</td>
<td>N</td>
</tr>
<tr>
<td>Z</td>
<td>Z</td>
</tr>
<tr>
<td>V</td>
<td>V</td>
</tr>
<tr>
<td>C</td>
<td>C</td>
</tr>
</tbody>
</table>
**ROLC  Rotate Left Through Carry**

**Syntax**

ROLC dst

**Operands**

dst: register (any register in CPU primary-register file)

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>1 0</td>
<td>0 1</td>
<td>0 1</td>
</tr>
</tbody>
</table>

**Word Fields**

None

**Operation**

dst left-rotated 1 bit through carry bit → dst

**Description**

The contents of the dst operand are left-rotated one bit through the carry bit and loaded into the dst register. The MSB is rotated to the carry bit, at the same time the carry bit is transferred to the LSB.

Rotate left through carry bit:

![Diagram of rotate left through carry]

**Status Bits**

If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: MSB of the output
- **Z**: 1 if a zero output is generated, 0 otherwise
- **V**: 0
- **C**: Set to the value of the bit rotated out of the high-order bit

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example 1**

ROLC R3

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R3 0000 0420h</td>
<td>R3 0000 0841h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 1</td>
<td>C 0</td>
</tr>
</tbody>
</table>

14-220
Example 2  

<table>
<thead>
<tr>
<th>ROLC R3</th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R3</td>
<td>8000 4281h</td>
<td>0000 8502h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>
ROR  Rotate Right

Syntax
ROR dst

Operands
dst: register (any register in CPU primary-register file)

Opcode

<table>
<thead>
<tr>
<th></th>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>0 0</td>
<td>1 0 1</td>
<td>1 1</td>
<td>dst</td>
<td>1 1 1 1 1 1 1 1 1 1 1 1 1</td>
</tr>
</tbody>
</table>

Word Fields
None

Operation
dst right-rotated 1 bit → dst

Description
The contents of the dst operand are right-rotated one bit and loaded into the dst register. The LSB is rotated into the carry bit and also transferred into the MSB.

Rotate right:

Status Bits
If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

- LUF  Unaffected
- LV   Unaffected
- UF   0
- N    MSB of the output
- Z    1 if a zero output is generated, 0 otherwise
- V    0
- C    Set to the value of the bit rotated out of the high-order bit

Mode Bit
OVM operation is not affected by OVM bit value.

Cycles
1

Example
ROR R7

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R7 00000421h</td>
<td>R7 80000210h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 1</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 1</td>
</tr>
</tbody>
</table>
**Syntax**

RORC dst

**Operands**

dst: register (any register in CPU primary-register file)

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0</td>
<td>1 0 0 1 1 0</td>
<td>dst</td>
<td>1 1 1 1 1 1 1 1 1 1 1 1</td>
<td></td>
</tr>
</tbody>
</table>

**Word Fields**

None

**Operation**

dst right-rotated 1 bit through carry bit → dst

**Description**

The contents of the dst operand are right-rotated one bit through the status register’s carry bit. This could be viewed as a 33-bit shift. The carry bit value is rotated into the MSB of the dst; at the same time, the dst LSB is rotated into the carry bit.

Rotate right through carry bit:

![Diagram of RORC operation]

**Status Bits**

If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

- **LUF** Unaffected
- **LV** Unaffected
- **UF** 0
- **N** MSB of the output
- **Z** 1 if a zero output is generated, 0 otherwise
- **V** 0
- **C** Set to the value of the bit rotated out of the high-order bit

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

RORC R4

<table>
<thead>
<tr>
<th>Register</th>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R4</td>
<td>8000.0081h</td>
<td>4000.0040h</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>

*Assembly Language Instructions* 14-223
RPTB  Repeat Block

Syntax
RPTB src

Operands
src: 24-bit signed immediate displacement or register mode

Opcode

For 24-bit signed immediate or register mode:

```
 31 24 23 16 15 8 7 0
0 1 1 0 0 1 0 0
```

src (displacement)

For register mode:

```
 31 24 23 16 15 8 7 0
0 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
```

Word Fields
None

Operation
src + PC +1 → RE
1 → ST (RM)
Next PC → RS

Description
RPTB allows a block of instructions to be repeated a number of times without any penalty for looping.

It activates the block repeat mode of updating the PC. The src operand can be a 32-bit register value or a 24-bit signed immediate value (displacement). The resulting src address is the end address of the block to be repeated. This address is loaded into the repeat end address (RE) register. A 1 is written into the repeat mode bit of status register [ST(RM)] to indicate that the PC is to be updated in the repeat mode. The address of the next instruction is loaded into the repeat-start address (RS) register.

RE should be greater than or equal to RS (RE ≥ RS). Otherwise, the code does not repeat, even though the RM bit remains set to 1.

Status Bits
LUF  Unaffected
LV   Unaffected
UF   Unaffected
N    Unaffected
Z    Unaffected
V    Unaffected
C    Unaffected

Mode Bit
OVM operation is not affected by OVM bit value.

Cycles
4

Example
None
RPTBD src

Operands
src: 24-bit signed immediate displacement or register mode

Opcode

For 24-bit signed immediate or register mode:

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 1 1 0 0 1 0 1</td>
<td>src (displacement)</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

For register mode:

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 1 1 1 0 0 1 1</td>
<td>0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Word Fields
None

Operation
if src is an immediate value (displacement)
src + PC + 3 → RE
Else:
src → RE
1 → ST(RM)
PC of RPTBD + 4 → RS

Description
RPTBD allows a block of instructions to be repeated a number of times without any penalty for looping and with single-cycle execution of the RPTBD instruction. It activates the block repeat mode of updating the PC. The src operand can be a 32-bit register value or a 24-bit signed immediate value (displacement). The resulting src address is loaded into the repeat-end address (RE) register (block-end address). A 1 is written to the status-register repeat mode bit [ST(RM)], indicating the PC is to be updated in the repeat mode. The address of the next instruction +3 is loaded into the repeat-start address (RS) register.

RE should be greater than or equal to RS (RE ≥ RS). Otherwise, the code will not repeat, even though the RM bit remains set to 1.

RPTBD does not flush the pipeline. The three instructions following RPTBD are executed and should not modify the program flow. These three instructions are not part of the block that is repeated. The RC register must be loaded before the RPTBD instruction executes. It should not be loaded in the three instructions after RPTBD.

Interrupts are disabled during the next three instructions after RPTBD.
**RPTBD  Repeat Block Delayed**

<table>
<thead>
<tr>
<th>Status Bits</th>
<th>LUF</th>
<th>Unaffected</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

**Mode Bit**  

*OVM* operation is not affected by OVM bit value.

**Cycles**  

1

**Example**  

None
**Syntax**

RPTS src  

**Operands**

src: general-addressing modes (G)

**Opcode**

```
0 0 0 1 0 0 1 1  G 1 1 0 1 1  src
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

src → RC  
1 → ST (RM)  
1 → S  
Next PC → RS  
Next PC → RE

**Description**

The RPTS instruction allows a single instruction to be repeated a number of times without any penalty for looping. Fetches also can be made from the instruction register (IR), thus avoiding repeated memory access.

The src operand is loaded into the repeat counter (RC). A 1 is written into the repeat mode (RM) bit of the status register (ST). A 1 also is written into the repeat single bit (S). This indicates that the program fetches are to be performed only from the instruction register. The next PC is loaded into the repeat-end address (RE) register and the repeat-start address (RS) register.

For the immediate mode, the src operand is assumed to be an unsigned integer and is not sign-extended.

**Status Bits**

- **LUF** Unaffected
- **LV** Unaffected
- **UF** Unaffected
- **N** Unaffected
- **Z** Unaffected
- **V** Unaffected
- **C** Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

4
Example

<table>
<thead>
<tr>
<th>RPTS AR5</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Before Instruction</strong></td>
</tr>
<tr>
<td>PC</td>
</tr>
<tr>
<td>ST</td>
</tr>
<tr>
<td>RS</td>
</tr>
<tr>
<td>RE</td>
</tr>
<tr>
<td>RC</td>
</tr>
<tr>
<td>AR5</td>
</tr>
<tr>
<td>LUF</td>
</tr>
<tr>
<td>LV</td>
</tr>
<tr>
<td>UF</td>
</tr>
<tr>
<td>N</td>
</tr>
<tr>
<td>Z</td>
</tr>
<tr>
<td>V</td>
</tr>
<tr>
<td>C</td>
</tr>
</tbody>
</table>

The RPTS instruction is not interruptable. Interrupts are held pending until the RPTS instruction is finished executing. In timing-critical applications, this could cause timings to be inaccurate; thus, in timing-critical applications, use caution when using the RPTS instruction.
Reciprocal of Square Root Floating-Point Value  \textit{RSQRF}

**Syntax**  
\texttt{RSQRF \textit{src}, \textit{dst}}

**Operands**  
\textit{src}: extended-precision register, direct-, and indirect addressing modes  
\textit{dst}: extended-precision register

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 1 1 1 0 0 1</td>
<td>G</td>
<td>\textit{dst}</td>
<td>\textit{src}</td>
<td></td>
</tr>
</tbody>
</table>

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>extended-precision register</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>16-bit immediate</td>
</tr>
</tbody>
</table>

**Operation**

16-bit reciprocal of the square root of \textit{src} \rightarrow \textit{dst}

**Description**
The 16-bit approximation of the reciprocal of the square root of the number at the \textit{src} operand is loaded into the \textit{dst} register. The number at the \textit{src} operand is assumed to be positive. The operation for negative inputs is undefined.

The value at the \textit{dst} and \textit{src} operands are assumed to be floating-point numbers.

**Status Bits**

- \texttt{LUF}: Unchanged
- \texttt{LV}: 1 if input is zero, unchanged otherwise
- \texttt{UF}: 0
- \texttt{N}: 0
- \texttt{Z}: 0
- \texttt{V}: 1 if input is zero, 0 otherwise
- \texttt{C}: Unaffected

**Mode Bit**

\texttt{OVM} operation is not affected by \texttt{OVM} bit value.

**Cycles**

1

**Example**

None
**SIGI**  *Signal, Interlocked*

**Syntax**

\[
\text{SIGI} \ src, \ dst
\]

**Operands**

- **src**: direct- and indirect-addressing modes (assumed to be signed integer)
- **dst**: register mode (assumed to be signed integer)

**Opcode**

<table>
<thead>
<tr>
<th>31 24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 1 1 0 0</td>
<td><strong>G</strong></td>
<td><strong>dst</strong></td>
<td><strong>src</strong></td>
</tr>
</tbody>
</table>

**Word Fields**

<table>
<thead>
<tr>
<th><strong>G</strong></th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
</tbody>
</table>

**Operation**

- LOCK (or LLOCK) pin brought low
- \( src \rightarrow dst \)
- LOCK (or LLOCK) pin brought high

**Description**

An interlocking operation is signaled by the appropriate bus-lock signal (LOCK or LLOCK) if, and only if, an external-memory access is performed. The **src** and **dst** operands are assumed to be signed integers. After the read is performed, the bus-lock signal is deasserted. If an internal-memory access is performed, SIGI performs the read but does not assert a bus-lock signal. Refer to Section 9.7 on page 9-39 for a detailed description.

The numbers at the **src** and **dst** operands are treated as signed integers.

**Status Bits**

- If \( \text{ST} (\text{SETHOOK}) = 0 \) and the destination register is R0 – R11, the condition flags are modified. If \( \text{ST} (\text{SETHOOK}) = 1 \), they are modified for all destination registers.

<table>
<thead>
<tr>
<th><strong>LUF</strong></th>
<th>Unaffected</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>LV</strong></td>
<td>Unaffected</td>
</tr>
<tr>
<td><strong>UF</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>N</strong></td>
<td>1 if a negative result is generated, 0 otherwise</td>
</tr>
<tr>
<td><strong>Z</strong></td>
<td>1 if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td><strong>V</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>C</strong></td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

**Mode Bit**

- **OVM** operation is not affected by OVM bit value.

**Cycles**

- 1

**Example**

None
Syntax  
\text{STF} \ src, \ dst 

Operands  
src: register (R0 – R11)  
dst: general-addressing modes (G)  

Opcode  
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
\hline
0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & G & \text{src} & dst \\
\end{array}  

Word Fields  
\begin{array}{|c|}
\hline
G & \text{src addressing modes} \\
01 & \text{direct} \\
10 & \text{indirect} \\
\hline
\end{array}  

Operation  
src \rightarrow \ dst  

Description  
The src register is loaded into the dst memory location. The src and dst operands are assumed to be floating-point numbers.  

Status Bits  
LUF: Unaffected  
LV: Unaffected  
UF: Unaffected  
N: Unaffected  
Z: Unaffected  
V: Unaffected  
C: Unaffected  

Mode Bit  
OVM operation is not affected by OVM bit value.  

Cycles  
1
### Example

**STF R2, @98A1h**

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>DP</strong></td>
<td>80h</td>
</tr>
<tr>
<td><strong>R2</strong></td>
<td>052 C501 900h 4.30782204e + 01</td>
</tr>
<tr>
<td>Data at 80 98A1h</td>
<td></td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
<tr>
<td>DP</td>
<td>80h</td>
</tr>
<tr>
<td>R2</td>
<td>052 C501 900h 4.30782204e + 01</td>
</tr>
<tr>
<td>Data at 80 98A1h</td>
<td></td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
Store Floating-Point Value, Interlocked  STFI

Syntax  
STFI src, dst

Operands  
src: register (R0 – R11)  
dst: general-addressing modes (G)

Opcode  

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0</td>
<td>1 0 1 0 0 1</td>
<td>G</td>
<td>src</td>
<td>dst</td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
</tbody>
</table>

Operation  
src → dst  
Signal end of interlocked operation.

Description  
The src register is loaded into the dst memory location. An interlocked operation is signaled over LOCK or LLOCK. The src and dst operands are assumed to be floating-point numbers. Refer to Section 9.7 on page 9-39 for detailed information.

Status Bits  
LUF Unaffected  
LV Unaffected  
UF Unaffected  
N Unaffected  
Z Unaffected  
V Unaffected  
C Unaffected

Mode Bit  
OVM operation is not affected by OVM bit value.

Cycles  
1
Example

\[ \text{STFI } R3, \star \text{-AR4} \]

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R3 ( 07 \text{33C0 0000h} ) ( 1.79750e + 02 )</td>
<td>R3 ( 07 \text{33C0 0000h} ) ( 1.79750e + 02 )</td>
</tr>
<tr>
<td>AR4 ( 80 \text{ 993Ch} )</td>
<td>AR4 ( 80 \text{ 993Ch} )</td>
</tr>
<tr>
<td>Data at ( 80 \text{ 993Bh} )</td>
<td>Data at ( 80 \text{ 993Bh} )</td>
</tr>
<tr>
<td>LUF ( 0 )</td>
<td>LUF ( 0 )</td>
</tr>
<tr>
<td>LV ( 0 )</td>
<td>LV ( 0 )</td>
</tr>
<tr>
<td>UF ( 0 )</td>
<td>UF ( 0 )</td>
</tr>
<tr>
<td>N ( 0 )</td>
<td>N ( 0 )</td>
</tr>
<tr>
<td>Z ( 0 )</td>
<td>Z ( 0 )</td>
</tr>
<tr>
<td>V ( 0 )</td>
<td>V ( 0 )</td>
</tr>
<tr>
<td>C ( 0 )</td>
<td>C ( 0 )</td>
</tr>
</tbody>
</table>
Parallel Store Floating-Point Value  STF||STF

Syntax

```
STF src2, dst2
|| STF src1, dst1
```

Operands

- `src1`: register (Rn1, 0 ≤ n1 ≤ 7)
- `dst1`: indirect (disp = 0, 1, IR0, IR1)
- `src2`: register (Rn2, 0 ≤ n2 ≤ 7)
- `dst2`: indirect (disp = 0, 1, IR0, IR1)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>src2</td>
<td>0</td>
</tr>
</tbody>
</table>

Word Fields

None

Operation

- `src2 → dst2`
- `|| src1 → dst1`

Description

Two STF instructions are executed in parallel. Both `src1` and `src2` are assumed to be floating-point numbers.

Status Bits

- LUF: Unaffected
- LV: Unaffected
- UF: Unaffected
- N: Unaffected
- Z: Unaffected
- V: Unaffected
- C: Unaffected

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1

Example

```
STF R4, *AR3--
|| STF R3, *++AR5
```
<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R4 07 0C80 0000h</td>
<td>R4 07 0C80 0000h</td>
</tr>
<tr>
<td>AR3 80 9835h</td>
<td>AR3 80 9834h</td>
</tr>
<tr>
<td>R3 07 33C0 0000h</td>
<td>R3 07 33C0 0000h</td>
</tr>
<tr>
<td>AR5 80 99D2h</td>
<td>AR5 80 99D3h</td>
</tr>
<tr>
<td>Data at 80 9835h</td>
<td>Data at 80 9835h</td>
</tr>
<tr>
<td></td>
<td>070C 8000h</td>
</tr>
<tr>
<td>Data at 80 99D3h</td>
<td>Data at 80 99D3h</td>
</tr>
<tr>
<td></td>
<td>0733 C00000h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
Syntax: STI src, dst

Operands:
- src: register (any register in CPU primary-register file)
- dst: general-addressing modes (G)

Opcode:

```
31 24 23 16 15 8 7 0
0 0 0 1 0 1 0 1 0 G src dst
```

Word Fields:

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
</tbody>
</table>

Operation: src → dst

Description: The src register is loaded into the dst memory location. The src and dst operands are assumed to be signed integers.

Status Bits:
- LUF: Unaffected
- LV: Unaffected
- UF: Unaffected
- N: Unaffected
- Z: Unaffected
- V: Unaffected
- C: Unaffected

Mode Bit: OVM operation is not affected by OVM bit value.

Cycles: 1

Example: STI R4, @982Bh

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>DP 80h</td>
<td>DP 80h</td>
</tr>
<tr>
<td>R4 42BD7h</td>
<td>R4 42BD7h</td>
</tr>
<tr>
<td>Data at 80 982Bh</td>
<td>Data at 80 982Bh</td>
</tr>
<tr>
<td>0E5FCh 58 876</td>
<td>42BD7h 273 367</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
STII  Store Integer, Interlocked

Syntax  
\texttt{STII \textit{src}, \textit{dst}}

Operands  
\textit{src}: register (any register in CPU primary-register file)  
\textit{dst}: general-addressing modes (G)

Opcode  
\begin{verbatim}
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & G & src & dst \\
\end{array}
\end{verbatim}

Word Fields  
\begin{verbatim}
\begin{tabular}{c c}
G & src addressing modes \\
01 & direct \\
10 & indirect \\
\end{tabular}
\end{verbatim}

Operation  
\textit{src} → \textit{dst}  
Signal end of interlocked operation.

Description  
The \textit{src} register is loaded into the \textit{dst} memory location. An interlocked operation is signaled over \texttt{LOCK} or \texttt{LLOCK}. The \textit{src} and \textit{dst} operands are assumed to be signed integers. Refer to Section 9.7 on page 9-39 for detailed information.

Status Bits  
\begin{verbatim}
\begin{tabular}{c c}
LUF & Unaffected \\
LV & Unaffected \\
UF & Unaffected \\
N & Unaffected \\
Z & Unaffected \\
V & Unaffected \\
C & Unaffected \\
\end{tabular}
\end{verbatim}

Mode Bit  
OVM operation is not affected by OVM bit value.

Cycles  
1

Example  
\texttt{STII R1, @98AEh}

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>DP 80h</td>
<td>DP 80h</td>
</tr>
<tr>
<td>R1 78Dh</td>
<td>R1 78Dh</td>
</tr>
<tr>
<td>Data at 80 98AEh</td>
<td>Data at 80 98AEh</td>
</tr>
<tr>
<td>25Ch</td>
<td>78Dh</td>
</tr>
</tbody>
</table>

14-238
Parallel STI and STI  STI||STI

Syntax

```
STI  src2, dst2  
||  STI  src1, dst1
```

Operands

- **src1**: register (Rn1, 0 ≤ n1 ≤ 7)
- **dst1**: indirect (disp = 0, 1, IR0, IR1)
- **src2**: register (Rn2, 0 ≤ n2 ≤ 7)
- **dst2**: indirect (disp = 0, 1, IR0, IR1)

Opcode

```
<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>11 0 0 0 0 1</td>
<td>src2</td>
<td>0 0 0</td>
<td>src1</td>
<td>dst1</td>
<td>dst2</td>
</tr>
</tbody>
</table>
```

Word Fields

None

Operation

```
src2 → dst2  
||  src1 → dst1
```

Description

Two integer stores are performed in parallel. If both stores are executed to the same address, the value written is that of STI src2, dst2.

Status Bits

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: Unaffected
- **N**: Unaffected
- **Z**: Unaffected
- **V**: Unaffected
- **C**: Unaffected

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1
### Example

STI R0, *++AR2 (IR0)  
| | STI R5, *AR0

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R0</td>
<td>0DCh</td>
</tr>
<tr>
<td>AR2</td>
<td>80 9830h</td>
</tr>
<tr>
<td>IR0</td>
<td>8h</td>
</tr>
<tr>
<td>R5</td>
<td>35h</td>
</tr>
<tr>
<td>AR0</td>
<td>80 98D3h</td>
</tr>
</tbody>
</table>

Data at 80 9838h  
Data at 80 98D3h  

LUF | 0 | LUF | 0 |
LV | 0 | LV | 0 |
UF | 0 | UF | 0 |
N | 0 | N | 0 |
Z | 0 | Z | 0 |
V | 0 | V | 0 |
C | 0 | C | 0 |
Store Integer Immediate Value  

**STIK**

---

**Syntax**

`STIK src, dst`

**Operands**

`src:`  5-bit signed integer  
`dst:`  direct and indirect mode

**Opcode**

```
  31 24 23 16 15 8 7 0
  0 0 0 1 0 1 0 0
  src
  dst
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>direct</td>
</tr>
<tr>
<td>11</td>
<td>indirect</td>
</tr>
</tbody>
</table>

**Operation**

`src` → `dst`

**Description**

The 5-bit signed integer `src` value is loaded into the `dst` memory location. The `src` and `dst` operands are assumed to be signed integers.

**Status Bits**

- LUF  Unaffected
- LV   Unaffected
- UF   Unaffected
- N    Unaffected
- Z    Unaffected
- V    Unaffected
- C    Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
SUBB  Subtract Integer With Borrow

Syntax  
**SUBB** *src, dst*

Operands  
*src:* general-addressing modes (G)
*dst:* register (any register in CPU primary-register file)

Opcode  

<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><strong>G</strong></td>
<td><strong>dst</strong></td>
<td><strong>src</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation  
dst – src – C → dst

Description  
The difference of the dst, src, and C operands, as calculated above, is loaded into the dst register. The dst and src operands are assumed to be signed integers.

Status Bits  
If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

- **LUF**: Unaffected
- **LV**: 1 if an integer overflow occurs, unchanged otherwise
- **UF**: 0
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an integer overflow occurs, 0 otherwise
- **C**: 1 if a borrow occurs, 0 otherwise

Mode Bit  
OVM operation is affected by OVM bit value.

Cycles  
1
**Example**

```
SUBB *AR5++(4), R5
```

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR5 80 9800h</td>
<td>AR5 80 9804h</td>
</tr>
<tr>
<td>R5 0FAh</td>
<td>R5 032h</td>
</tr>
<tr>
<td>Data at 80 9800h</td>
<td>Data at 80 9800h</td>
</tr>
<tr>
<td>0C7h</td>
<td>0C7h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 1</td>
<td>C 0</td>
</tr>
</tbody>
</table>
SUBB3  Subtract Integer With Borrow, 3 Operands

Syntax  
\textbf{SUBB3} \textit{src2, src1, dst}

Operands  
\textit{src1, src2}: type 1 or type 2 three-operand addressing modes  
\textit{dst}: register mode (any register in CPU primary-register file)

Opcode  

<table>
<thead>
<tr>
<th>Type 1</th>
<th>31 24 23 16 15 8 7 0</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>0 0 1 0 0 1 1 0 0</td>
</tr>
<tr>
<td></td>
<td>T dst src1 src2</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Type 2</th>
<th>31 24 23 16 15 8 7 0</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>0 0 1 1 0 1 1 0 0</td>
</tr>
<tr>
<td></td>
<td>T dst src1 src2</td>
</tr>
</tbody>
</table>

Word Fields

Type 1

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

Type 2

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

Operation  
\( src1 - src2 - C \rightarrow dst \)
### Description
The difference of the src1 and src2 operands and the C (carry) flag is loaded into the dst register. The src1, src2, and dst operands are assumed to be signed integers.

### Status Bits
If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

<table>
<thead>
<tr>
<th>Flag</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>1 if an integer overflow occurs, unchanged otherwise</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>1 if a negative result is generated, 0 otherwise</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>1 if an integer overflow occurs, 0 otherwise</td>
</tr>
<tr>
<td>C</td>
<td>1 if a borrow is generated, 0 otherwise</td>
</tr>
</tbody>
</table>

### Mode Bit
OVM operation is affected by OVM bit value.

### Cycles
1

### Example
None
**SUBC**  Subtract Integer Conditionally

**Syntax**

SUBC src, dst

**Operands**

src: general-addressing modes (G)
dst: register (any register in CPU primary-register file)

**Opcode**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 1 0 1 1 0</td>
<td>G</td>
<td>dst</td>
<td>src</td>
<td></td>
</tr>
</tbody>
</table>

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

If (dst – src ≥ 0):

(dst – src << 1) OR 1 → dst

Else:

dst << 1 → dst

**Description**

The src operand is subtracted from the dst operand. The dst operand is loaded with a value that depends upon the result of the subtraction. If (dst – src) is greater than or equal to zero, then (dst – src) is left-shifted one bit, the least-significant bit is set to 1, and the result is loaded into the dst register. If (dst – src) is less than zero, dst is left-shifted one bit and loaded into the dst register. The dst and src operands are assumed to be unsigned integers.

SUBC can be used to perform a single step of a multibit-integer division.

**Status Bits**

<table>
<thead>
<tr>
<th>LUF</th>
<th>Unaffected</th>
</tr>
</thead>
<tbody>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>N</td>
<td>Unaffected</td>
</tr>
<tr>
<td>Z</td>
<td>Unaffected</td>
</tr>
<tr>
<td>V</td>
<td>Unaffected</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1
### Example 1

**Instruction:** `SUBC @98C5h, R1`

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>DP</td>
<td>DP 80h</td>
</tr>
<tr>
<td>R1</td>
<td>R1 0C9h</td>
</tr>
<tr>
<td>Data at 80 98C5h</td>
<td>Data at 80 98C5h</td>
</tr>
<tr>
<td></td>
<td>492h 1170</td>
</tr>
<tr>
<td>LUF</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF</td>
<td>UF 0</td>
</tr>
<tr>
<td>N</td>
<td>N 0</td>
</tr>
<tr>
<td>Z</td>
<td>Z 0</td>
</tr>
<tr>
<td>V</td>
<td>V 0</td>
</tr>
<tr>
<td>C</td>
<td>C 0</td>
</tr>
</tbody>
</table>

### Example 2

**Instruction:** `SUBC 3000, R0` (3000 = 0BB8h)

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R0</td>
<td>R0 0F0Ah</td>
</tr>
<tr>
<td>LUF</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF</td>
<td>UF 0</td>
</tr>
<tr>
<td>N</td>
<td>N 0</td>
</tr>
<tr>
<td>Z</td>
<td>Z 0</td>
</tr>
<tr>
<td>V</td>
<td>V 0</td>
</tr>
<tr>
<td>C</td>
<td>C 0</td>
</tr>
</tbody>
</table>
SUBF  Subtract Floating-Point Value

Syntax

SUBF src, dst

Operands

src:  general-addressing modes (G)
dst:  register (R0 – R11)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0</td>
<td>1 0 1 1 1</td>
<td>G</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (R0-R11)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation

dst – src → dst

Description

The result of the dst operand minus the src operand is loaded into the dst register. The dst and src operands are assumed to be floating-point numbers.

Status Bits

| LUF | 1 if a floating-point underflow occurs, unchanged otherwise |
| LV  | 1 if a floating-point overflow occurs, unchanged otherwise |
| UF  | 1 if a floating-point underflow occurs, 0 otherwise |
| N   | 1 if a negative result is generated, 0 otherwise |
| Z   | 1 if a zero result is generated, 0 otherwise |
| V   | 1 if an floating-point overflow occurs, 0 otherwise |
| C   | Unaffected |

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1
### Example

**SUBF** *

\*AR0\*−−\*(IR0)\*, R5

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>AR0</strong></td>
<td>80 9888h</td>
</tr>
<tr>
<td><strong>IR0</strong></td>
<td>80h</td>
</tr>
<tr>
<td><strong>R5</strong></td>
<td>07 33C0 0000h</td>
</tr>
<tr>
<td>Data at 80 9888h</td>
<td>70C 8000h</td>
</tr>
<tr>
<td><strong>LUF</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>LV</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>UF</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>N</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>Z</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>V</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>C</strong></td>
<td>0</td>
</tr>
</tbody>
</table>
**SUBF3**  *Subtract Floating-Point Value, 3 Operands*

**Syntax**

**SUBF3** *src2, src1, dst*

**Operands**

*src1, src2*: type 1 or type 2 three-operand addressing modes  
*dst*: register mode (R0 – R11)

**Opcode**

**Type 1**

```
<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>
```

**Type 2**

```
<table>
<thead>
<tr>
<th>31</th>
<th>24</th>
<th>23</th>
<th>16</th>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>
```

**Word Fields**

**Type 1**

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (R0–R11)</td>
<td>register mode (R0–R11)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (R0–R11)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (R0–R11)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

**Operation**

*src1 – src2 → dst*
### Description
The difference of the src1 and src2 operands is loaded into the dst register. The src1, src2, and dst operands are assumed to be floating-point numbers.

### Status Bits
- **LUF**: 1 if a floating-point underflow occurs, unchanged otherwise
- **LV**: 1 if a floating-point overflow occurs, unchanged otherwise
- **UF**: 1 if a floating-point underflow occurs, 0 otherwise
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an floating-point overflow occurs, 0 otherwise
- **C**: Unaffected

### Mode Bit
OVM operation is not affected by OVM bit value.

### Cycles
1

### Example
None
**SUBF3||STF**  Parallel SUBF3 and STF

**Syntax**

```
| SUBF3 | src1, src2, dst1 |
|       | || STF | src3, dst2 |
```

**Operands**

- **src1**: register (R0 – R7)
- **src2**: indirect (disp = 0, 1, IR0, IR1)
- **dst1**: register (R0 – R7)
- **src3**: register (R0 – R7)
- **dst2**: indirect (disp = 0, 1, IR0, IR1)

**Opcode**

```
<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>0 1</td>
<td>0 1</td>
<td>dst1 src1 src3 dst2 src2</td>
</tr>
</tbody>
</table>
```

**Word Fields**

None

**Operation**

```
src2  – src1 → dst1
|| src3 → dst2
```

**Description**

A floating-point subtraction and a floating-point store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STF) reads from a register and the operation being performed in parallel (SUBF3) writes to the same register, then STF accepts as input the contents of the register before it is modified by the SUBF3.

If **src3** and **dst1** point to the same location, **src3** is read before the write to **dst1**.

**Status Bits**

- **LUF**: 1 if a floating-point underflow occurs, unchanged otherwise
- **LV**: 1 if an floating-point overflow occurs, unchanged otherwise
- **UF**: 1 if a floating-point underflow occurs, 0 otherwise
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an floating-point overflow occurs, 0 otherwise
- **C**: Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1
Example

```
SUBF3 R1,*-AR4(IR1),R0
||
STF R7,*+AR5(IR0)
```

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R1 05 7B40 0000h</td>
<td>6.28125e + 01</td>
</tr>
<tr>
<td>AR4 80 98B8h</td>
<td>80 98B8h</td>
</tr>
<tr>
<td>IR1 8h</td>
<td>8h</td>
</tr>
<tr>
<td>R0 0h</td>
<td>06 1B60 0000h</td>
</tr>
<tr>
<td>R7 07 33C0 0000h</td>
<td>1.79750e + 02</td>
</tr>
<tr>
<td>AR5 80 9850h</td>
<td>80 9850h</td>
</tr>
<tr>
<td>IR0 10h</td>
<td>10h</td>
</tr>
<tr>
<td>Data at 80 98B0h</td>
<td>70C 8000h</td>
</tr>
<tr>
<td>70C 8000h</td>
<td>1.4050e + 02</td>
</tr>
<tr>
<td>Data at 80 9860h</td>
<td>733 C000h</td>
</tr>
<tr>
<td>733 C000h</td>
<td>1.79750e + 02</td>
</tr>
<tr>
<td>LUF 0</td>
<td>0</td>
</tr>
<tr>
<td>LV 0</td>
<td>0</td>
</tr>
<tr>
<td>UF 0</td>
<td>0</td>
</tr>
<tr>
<td>N 0</td>
<td>0</td>
</tr>
<tr>
<td>Z 0</td>
<td>0</td>
</tr>
<tr>
<td>V 0</td>
<td>0</td>
</tr>
<tr>
<td>C 0</td>
<td>0</td>
</tr>
</tbody>
</table>
**Syntax**

**SUBI** \( src, \ dst \)

**Operands**

- **src**: general-addressing modes (G)
- **dst**: register (any register in CPU primary-register file)

**Opcode**

```
  31 24 23 16 15 8 7 0
0 0 0 1 1 0 0 0 G dst src
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

\( \text{dst} - \text{src} \rightarrow \text{dst} \)

**Description**

The result of the **dst** operand minus the **src** operand is loaded into the **dst** register. The **dst** and **src** operands are assumed to be signed integers.

**Status Bits**

If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

- **LUF**: Unaffected
- **LV**: 1 if an integer overflow occurs, unchanged otherwise
- **UF**: 0
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an integer overflow occurs, 0 otherwise
- **C**: 1 if a borrow occurs, 0 otherwise

**Mode Bit**

OVM operation is affected by OVM bit value.

**Cycles**

1
### Example

**SUBI 220, R7**

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R7: 226h, 550</td>
<td>R7: 14Ah, 330</td>
</tr>
<tr>
<td>LUF: 0</td>
<td>LUF: 0</td>
</tr>
<tr>
<td>LV: 0</td>
<td>LV: 0</td>
</tr>
<tr>
<td>UF: 0</td>
<td>UF: 0</td>
</tr>
<tr>
<td>N: 0</td>
<td>N: 0</td>
</tr>
<tr>
<td>Z: 0</td>
<td>Z: 0</td>
</tr>
<tr>
<td>V: 0</td>
<td>V: 0</td>
</tr>
<tr>
<td>C: 0</td>
<td>C: 0</td>
</tr>
</tbody>
</table>
**SUBI3** Subtract Integer, 3 Operands

**Syntax**

`SUBI3 src2, src1, dst`

**Operands**

src1, src2: type 1 or type 2 three-operand addressing modes
dst: register mode (any register in CPU primary-register file)

**Opcode**

**Type 1**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 0 0 1 1 1 0</td>
<td>T</td>
<td>dst</td>
<td>src1</td>
<td>src2</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 1 0 1 1 1 0</td>
<td>T</td>
<td>dst</td>
<td>src1</td>
<td>src2</td>
</tr>
</tbody>
</table>

**Word Fields**

**Type 1**

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

**Operation**

`src1 – src2 → dst`
### Description
The result of the src1 operand minus the src2 operand is loaded into the dst register. The src1, src2, and dst operands are assumed to be signed integers.

### Status Bits
If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

<table>
<thead>
<tr>
<th>Flag</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>1 if an integer overflow occurs, unchanged otherwise</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>1 if a negative result is generated, 0 otherwise</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>1 if an integer overflow occurs, 0 otherwise</td>
</tr>
<tr>
<td>C</td>
<td>1 if a borrow is generated, 0 otherwise</td>
</tr>
</tbody>
</table>

### Mode Bit
OVM operation is affected by OVM bit value.

### Cycles
1

### Example
None
SUBI3||STI  Parallel SUBI3 and STI

Syntax

\[
\begin{align*}
\text{SUBI3} & \quad \text{src1, src2, dst1} \\
\| & \quad \text{STI} \quad \text{src3, dst2}
\end{align*}
\]

Operands

- **src1**: register (R0 – R7)
- **src2**: indirect (disp = 0, 1, IR0, IR1)
- **dst1**: register (R0 – R7)
- **src3**: register (R0 – R7)
- **dst2**: indirect (disp = 0, 1, IR0, IR1)

Opcode

\[
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
1 & 1 & 1 & 0 & 1 & 1 & 0 & \text{dst1} \quad \text{src1} \quad \text{src3} \quad \text{dst2} \quad \text{src2}
\end{array}
\]

Word Fields

None

Operation

\[
\begin{align*}
\text{src2} - \text{src1} & \rightarrow \text{dst1} \\
\| & \quad \text{src3} \rightarrow \text{dst2}
\end{align*}
\]

Description

An integer subtraction and an integer store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STI) reads from a register and the operation being performed in parallel (SUBI3) writes to the same register, then STI accepts as input the contents of the register before it is modified by the SUBI3.

If **src3** and **dst1** point to the same location, **src3** is read before the write to **dst1**.

Status Bits

<table>
<thead>
<tr>
<th>Bit</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>1 if an integer overflow occurs, unchanged otherwise</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>1 if a negative result is generated, 0 otherwise</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>1 if an integer overflow occurs, 0 otherwise</td>
</tr>
<tr>
<td>C</td>
<td>1 if a borrow occurs, 0 otherwise</td>
</tr>
</tbody>
</table>

Mode Bit

OVM operation is affected by OVM bit value.

Cycles

1
Example

SUBI3 R7, *+AR2(IR0), R1
STI R3, *++AR7

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R7 14h 20</td>
<td>R7 14h 20</td>
</tr>
<tr>
<td>AR2 80 982Fh</td>
<td>AR2 80 982Fh</td>
</tr>
<tr>
<td>IR0 10h</td>
<td>IR0 10h</td>
</tr>
<tr>
<td>R1 0h</td>
<td>R1 0C8h 200</td>
</tr>
<tr>
<td>R3 35h 53</td>
<td>R3 35h 53</td>
</tr>
<tr>
<td>AR7 80 983Bh</td>
<td>AR7 80 983Ch</td>
</tr>
<tr>
<td>Data at 80 983Fh</td>
<td>Data at 80 983Fh</td>
</tr>
<tr>
<td>0DCh 220</td>
<td>0DCh 220</td>
</tr>
<tr>
<td>Data at 80 983Ch</td>
<td>Data at 80 983Ch</td>
</tr>
<tr>
<td>0h</td>
<td>35h 53</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
SUBRB  Subtract Reverse Integer With Borrow

Syntax  
SUBRB src, dst

Operands  
src: general-addressing modes (G)  
dst: register (any register in CPU primary-register file)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 1 1 0 0 0 1</td>
<td>G</td>
<td>dst</td>
<td>src</td>
<td></td>
</tr>
</tbody>
</table>

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation  
src – dst – C → dst

Description  
The difference of the src, dst, and C operands, as calculated above, is loaded into the dst register. The dst and src operands are assumed to be signed integers.

Status Bits  
If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

<table>
<thead>
<tr>
<th>Bit</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>1 if an integer overflow occurs, unchanged otherwise</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>1 if a negative result is generated, 0 otherwise</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero result is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>1 if an integer overflow occurs, 0 otherwise</td>
</tr>
<tr>
<td>C</td>
<td>1 if a borrow occurs, 0 otherwise</td>
</tr>
</tbody>
</table>

Mode Bit  
OVM operation is affected by OVM bit value.

Cycles  
1

14-260
**Example**  

**SUBRB R4, R6**

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R4</td>
<td>03CBh 971</td>
</tr>
<tr>
<td>R6</td>
<td>0258h 600</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>1</td>
</tr>
<tr>
<td>R4</td>
<td>03CBh 971</td>
</tr>
<tr>
<td>R6</td>
<td>0172h 370</td>
</tr>
<tr>
<td>LUF</td>
<td>0</td>
</tr>
<tr>
<td>LV</td>
<td>0</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>0</td>
</tr>
<tr>
<td>Z</td>
<td>0</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>0</td>
</tr>
</tbody>
</table>
**SUBRF**  *Subtract Reverse Floating-Point Values*

**Syntax**  
```
SUBRF src, dst
```

**Operands**  
- **src**: general-addressing modes (G)  
- **dst**: register (R0 – R11)

**Opcode**

```
<table>
<thead>
<tr>
<th>31 24 23 16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0 1 1 0 0 1</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (R0-R11)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**  
```
src – dst → dst
```

**Description**  
The result of the `src` operand minus the `dst` operand is loaded into the `dst` register. The `dst` and `src` operands are assumed to be floating-point numbers.

**Status Bits**
- **LUF** 1 if a floating-point underflow occurs, unchanged otherwise  
- **LV** 1 if a floating-point overflow occurs, unchanged otherwise  
- **UF** 1 if a floating-point underflow occurs, 0 otherwise  
- **N** 1 if a negative result is generated, 0 otherwise  
- **Z** 1 if a zero result is generated, 0 otherwise  
- **V** 1 if a floating-point overflow occurs, 0 otherwise  
- **C** Unaffected

**Mode Bit**  
OVM operation is not affected by OVM bit value.

**Cycles**  
1
Example

<table>
<thead>
<tr>
<th>SUBRF @9905h, R5</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Before Instruction</strong></td>
</tr>
<tr>
<td>DP</td>
</tr>
<tr>
<td>R5</td>
</tr>
<tr>
<td>Data at 80 9905h</td>
</tr>
<tr>
<td></td>
</tr>
<tr>
<td>LUF</td>
</tr>
<tr>
<td>LV</td>
</tr>
<tr>
<td>UF</td>
</tr>
<tr>
<td>N</td>
</tr>
<tr>
<td>Z</td>
</tr>
<tr>
<td>V</td>
</tr>
<tr>
<td>C</td>
</tr>
</tbody>
</table>
**SUBRI** *Subtract Reverse Integer*

**Syntax**

```
SUBRI src, dst
```

**Operands**

- **src**: general-addressing modes (G)
- **dst**: register (any register in CPU primary-register file)

**Opcode**

```
<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0</td>
<td>0 0 1 1 0 0 1 1</td>
<td>G</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>
```

**Word Fields**

<table>
<thead>
<tr>
<th></th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

```
src – dst → dst
```

**Description**

The result of the src operand minus the dst operand is loaded into the dst register. The dst and src operands are assumed to be signed integers.

**Status Bits**

- **LUF**: Unaffected
- **LV**: 1 if an integer overflow occurs, unchanged otherwise
- **UF**: 0
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an integer overflow occurs, 0 otherwise
- **C**: 1 if a borrow occurs, 0 otherwise

**Mode Bit**

OVM operation is affected by OVM bit value.

**Cycles**

1
### Example: `SUBRI *AR5++, (IR0), R3`

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>AR5</strong></td>
<td>80 9900h</td>
</tr>
<tr>
<td><strong>IR0</strong></td>
<td>8h</td>
</tr>
<tr>
<td><strong>R3</strong></td>
<td>0DCh 220</td>
</tr>
<tr>
<td><strong>Data at 80 9900h</strong></td>
<td>226h 550</td>
</tr>
<tr>
<td><strong>LUF</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>LV</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>UF</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>N</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>Z</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>V</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>C</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>AR5</strong></td>
<td>80 9908h</td>
</tr>
<tr>
<td><strong>IR0</strong></td>
<td>8h</td>
</tr>
<tr>
<td><strong>R3</strong></td>
<td>014Ah 330</td>
</tr>
<tr>
<td><strong>Data at 80 9900h</strong></td>
<td>226h 550</td>
</tr>
<tr>
<td><strong>LUF</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>LV</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>UF</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>N</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>Z</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>V</strong></td>
<td>0</td>
</tr>
<tr>
<td><strong>C</strong></td>
<td>0</td>
</tr>
</tbody>
</table>
SWI  Software Interrupt

Syntax  SWI
Operands  None
Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
</tbody>
</table>

Word Fields  None
Operation  Performs an emulation interrupt
Description  The SWI instruction performs an emulator interrupt. This is a reserved instruction and should not be used in normal programming.

Status Bits  LUF  Unaffected
LV  Unaffected
UF  Unaffected
N  Unaffected
Z  Unaffected
V  Unaffected
C  Unaffected

Mode Bit  OVM operation is not affected by OVM bit value.

Cycles  4
Example  None
**Syntax**

\[
\text{TOIEEE } \text{src, dst}
\]

**Operands**

\[
\begin{align*}
\text{src:} & \quad \text{extended-precision register (R0 – R11),} \\
& \quad \text{direct- and indirect-addressing modes} \\
\text{dst:} & \quad \text{extended-precision register}
\end{align*}
\]

**Opcode**

\[
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 & 1
\end{array}
\]

\[
G \quad \text{dst} \quad \text{src}
\]

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register [extended-precision register (R0-R11)]</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**

convert \text{src} to IEEE format \rightarrow \text{dst}

**Description**

The \text{src} operand is converted from a 2s-complement floating-point format to the IEEE floating-point format.

The \text{src} operand is assumed to be a single-precision floating-point number, except for the immediate mode that is considered a short 16-bit floating point format. The converted result goes into the 32 MSBs of the \text{dst} register. STF can be used to store the result to memory.

**Status Bits**

- **LUF** Unaffected
- **LV** 1 if an overflow occurs, unchanged otherwise
- **UF** 0
- **N** 1 if a negative result is generated, 0 otherwise
- **Z** 1 if a zero result is generated, 0 otherwise
- **V** 1 if an overflow occurs, 0 otherwise
- **C** Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
**TOIEEE||STF**  Parallel TOIEEE and STF

**Syntax**

```
TOIEEE  src2, dst1
||  STF  src3, dst2
```

**Operands**

- **src2**: indirect mode (disp = 0, 1, IR0, IR1)
- **dst1**: register mode (Rn1, 0 ≤ n1 ≤ 7)
- **src3**: register mode (Rn1, 0 ≤ n1 ≤ 7)
- **dst2**: indirect mode (disp = 0, 1, IR0, IR1)

**Opcode**

```
\begin{array}{cccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
\hline
1 & 1 & 1 & 1 & 0 & 0 & \text{dst1} & 0 & 0 & \text{src3} & \text{dst2} & \text{src2}
\end{array}
```

**Word Fields**

None

**Operation**

- convert *src2* to IEEE format → *dst1*
- in parallel with *src3* → *dst2*

**Description**

The *src2* operand is converted from a 2s-complement floating-point format to the IEEE floating-point format.

The *src2* operand is assumed to be a single-precision floating-point number. The converted result goes into the 32 MSBs of the *dst1* register. A floating-point store is done in parallel.

If *src2* and *dst2* point to the same location, then *src2* is read before the write to *dst2*.

**Status Bits**

- **LUF**: Unaffected
- **LV**: 1 if an overflow occurs, unchanged otherwise
- **UF**: 0
- **N**: 1 if a negative result is generated, 0 otherwise
- **Z**: 1 if a zero result is generated, 0 otherwise
- **V**: 1 if an overflow occurs, 0 otherwise
- **C**: Unaffected

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
## Trap Conditionally  TRAPcond

<table>
<thead>
<tr>
<th>Syntax</th>
<th>TRAPcond N</th>
</tr>
</thead>
<tbody>
<tr>
<td>Operands</td>
<td>N: immediate mode (0 ≤ N ≤ 511)</td>
</tr>
<tr>
<td>Opcode</td>
<td></td>
</tr>
<tr>
<td>Word Fields</td>
<td>None</td>
</tr>
<tr>
<td>Operation</td>
<td>If (cond is true)</td>
</tr>
<tr>
<td>Description</td>
<td>If the condition is true, then GIE and CF are saved in PGIE and PCF in the status register, all interrups are disabled (0 → GIE), and the cache is frozen (1 → CF). Then, the contents of the PC are pushed onto the system stack, and the PC is loaded with the contents of the specified trap vector (N). If the condition is not true, then continue normal operation.</td>
</tr>
<tr>
<td>Status Bits</td>
<td>GIE Set to 0 if TRAP executes</td>
</tr>
<tr>
<td>Mode Bit</td>
<td>OVM operation is not affected by OVM bit value.</td>
</tr>
<tr>
<td>Cycles</td>
<td>5</td>
</tr>
<tr>
<td>Example</td>
<td>None</td>
</tr>
</tbody>
</table>

### Syntax

```
Syntax TRAPcond N
```

### Operands

```
Operands N: immediate mode (0 ≤ N ≤ 511)
```

### Opcode

```
<table>
<thead>
<tr>
<th>Opcode</th>
</tr>
</thead>
<tbody>
<tr>
<td>31 23 24 16 0</td>
</tr>
<tr>
<td>0 1 1 1 0 1 0 0 0 0</td>
</tr>
</tbody>
</table>
```

### Word Fields

None

### Operation

If (\textit{cond} is true)

- \text{ST}(GIE) → \text{ST}(PGIE)
- \text{ST}(CF) → \text{ST}(PCF)
- 0 → \text{ST}(GIE)
- 1 → \text{ST}(CF)
- next PC → *(*(++SP))
- trap vector N → PC

Else, continue.

### Description

If the condition is true, then GIE and CF are saved in PGIE and PCF in the status register, all interrups are disabled (0 → GIE), and the cache is frozen (1 → CF). Then, the contents of the PC are pushed onto the system stack, and the PC is loaded with the contents of the specified trap vector (N). If the condition is not true, then continue normal operation.

If traps are to be nested, you may need to save the status register before executing TRAPcond.

### Status Bits

- **GIE**: Set to 0 if TRAP executes
- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: Unaffected
- **N**: Unaffected
- **Z**: Unaffected
- **V**: Unaffected
- **C**: Unaffected

### Mode Bit

OVM operation is not affected by OVM bit value.

### Cycles

5

### Example

None
TSTB  Test Bit Fields

Syntax  
\[ \text{TSTB} \ src, \ dst \]

Operands  
\( src \): general-addressing modes (G)  
\( dst \): register (any register in CPU primary-register file)

Opcode  
\[
\begin{array}{cccccccc}
31 & 24 & 23 & 16 & 15 & 8 & 7 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & G & dst & src
\end{array}
\]

Word Fields

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

Operation  
\( dst \ AND \ src \)

Description  
The bitwise-logical AND of the \( dst \) and \( src \) operands is formed, but the result is not loaded in any register. This allows for nondestructive compares. The \( dst \) and \( src \) operands are assumed to be unsigned integers.

Status Bits  
These condition flags are modified for all destination registers.

<table>
<thead>
<tr>
<th>Bit</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>LUF</td>
<td>Unaffected</td>
</tr>
<tr>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td>N</td>
<td>MSB of the output</td>
</tr>
<tr>
<td>Z</td>
<td>1 if a zero output is generated, 0 otherwise</td>
</tr>
<tr>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

Mode Bit  
OVM operation is not affected by OVM bit value.

Cycles  
1
**Example**

TSTB *–AR4 (1), R5

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>AR4 80 99C5h</td>
<td>AR4 80 99C5h</td>
</tr>
<tr>
<td>R5 898h</td>
<td>R5 898h</td>
</tr>
<tr>
<td>Data at 80 99C4h</td>
<td>Data at 80 99C4h</td>
</tr>
<tr>
<td>767h</td>
<td>1895</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 1</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
TSTB3  Test Bit Fields, 3 Operands

Syntax  
TSTB3 src2, src1

Operands  
src1, src2:  type 1 or type 2 three-operand addressing modes

Opcode

Type 1

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 0 0 1 1 1 1</td>
<td>T</td>
<td>0 0 0 0 0</td>
<td>src1</td>
<td>src2</td>
</tr>
</tbody>
</table>

Type 2

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 1 0 1 1 1 1</td>
<td>T</td>
<td>0 0 0 0 0</td>
<td>src1</td>
<td>src2</td>
</tr>
</tbody>
</table>

Word Fields

Type 1

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

Type 2

<table>
<thead>
<tr>
<th>T</th>
<th>src1 addressing modes</th>
<th>src2 addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode *+ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode *+ARn1(5-bit unsigned displacement)</td>
<td>indirect mode *+ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

Operation  
src1 AND src2  

Description

The bitwise-logical AND between the src1 and src2 operands is performed but is not loaded into any register. This allows for nondestructive compares. The src1 and src2 operands are assumed to be unsigned integers. The src2 immediate-addressing mode is sign-extended.

Although this instruction has only two operands, it is designated as a three-operand instruction because operands are specified in the three-operand format.
### Test Bit Fields, 3 Operands (TSTB3)

<table>
<thead>
<tr>
<th>Status Bits</th>
<th>LUF</th>
<th>Unaffected</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>LV</td>
<td>Unaffected</td>
</tr>
<tr>
<td></td>
<td>UF</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td>N</td>
<td>MSB of the output</td>
</tr>
<tr>
<td></td>
<td>Z</td>
<td>1 if a zero output is generated, 0 otherwise</td>
</tr>
<tr>
<td></td>
<td>V</td>
<td>0</td>
</tr>
<tr>
<td></td>
<td>C</td>
<td>Unaffected</td>
</tr>
</tbody>
</table>

**Mode Bit**

OVM operation is not affected by OVM bit value.

**Cycles**

1

**Example**

None
**XOR**  
**Bitwise Exclusive OR**

**Syntax**  
\[
\text{XOR} \ src, \ dst
\]

**Operands**  
- \( src \): general-addressing modes (G)  
- \( dst \): register (any register in CPU primary-register file)

**Opcode**

```
<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16</th>
<th>15 8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 0</td>
<td>1 1 0 0 1</td>
<td>G</td>
<td>dst</td>
<td>src</td>
</tr>
</tbody>
</table>
```

**Word Fields**

<table>
<thead>
<tr>
<th>G</th>
<th>src addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register (any register in CPU primary-register file)</td>
</tr>
<tr>
<td>01</td>
<td>direct</td>
</tr>
<tr>
<td>10</td>
<td>indirect</td>
</tr>
<tr>
<td>11</td>
<td>immediate</td>
</tr>
</tbody>
</table>

**Operation**  
\( dst \ XOR \ src \rightarrow dst \)

**Description**  
The bitwise-exclusive OR of the \( src \) and \( dst \) operands is loaded into the \( dst \) register. The \( dst \) and \( src \) operands are assumed to be unsigned integers.

**Status Bits**  
If \( \text{ST (SETCOND)} = 0 \) and the destination register is \( R0 - R11 \), the condition flags are modified. If \( \text{ST (SETCOND)} = 1 \), they are modified for all destination registers.

- \( \text{LUF} \): Unaffected  
- \( \text{LV} \): Unaffected  
- \( \text{UF} \): 0  
- \( \text{N} \): MSB of the output  
- \( \text{Z} \): 1 if a zero output is generated, 0 otherwise  
- \( \text{V} \): 0  
- \( \text{C} \): Unaffected

**Mode Bit**  
OVM operation is not affected by OVM bit value.

**Cycles**  
1
### Bitwise Exclusive OR (XOR)

#### Example

Example:

**XOR R1, R2**

<table>
<thead>
<tr>
<th>Before Instruction</th>
<th>After Instruction</th>
</tr>
</thead>
<tbody>
<tr>
<td>R1 0F FA32h</td>
<td>R1 0F F412h</td>
</tr>
<tr>
<td>R2 0F F5C1h</td>
<td>R2 00 0FF3h</td>
</tr>
<tr>
<td>LUF 0</td>
<td>LUF 0</td>
</tr>
<tr>
<td>LV 0</td>
<td>LV 0</td>
</tr>
<tr>
<td>UF 0</td>
<td>UF 0</td>
</tr>
<tr>
<td>N 0</td>
<td>N 0</td>
</tr>
<tr>
<td>Z 0</td>
<td>Z 0</td>
</tr>
<tr>
<td>V 0</td>
<td>V 0</td>
</tr>
<tr>
<td>C 0</td>
<td>C 0</td>
</tr>
</tbody>
</table>
**XOR3**  
*Bitwise Exclusive OR, 3 Operands*

**Syntax**  
\[\text{XOR3 } src2, src1, dst\]

**Operands**  
- \(src1, src2\): type 1 or type 2 three-operand addressing modes  
- \(dst\): register mode (any register in CPU primary-register file)

**Opcode**  

**Type 1**

<table>
<thead>
<tr>
<th>31 24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 0 1 0 0 0 0</td>
<td>(dst)</td>
<td>(src1)</td>
<td>(src2)</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>31 24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0 1 1 0 0 0 0</td>
<td>(dst)</td>
<td>(src1)</td>
<td>(src2)</td>
</tr>
</tbody>
</table>

**Word Fields**

**Type 1**

<table>
<thead>
<tr>
<th>(T)</th>
<th>(src1) addressing modes</th>
<th>(src2) addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>01</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>register mode (any CPU register)</td>
</tr>
<tr>
<td>10</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
<td>indirect mode (disp = 0, 1, IR0, IR1)</td>
</tr>
</tbody>
</table>

**Type 2**

<table>
<thead>
<tr>
<th>(T)</th>
<th>(src1) addressing modes</th>
<th>(src2) addressing modes</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>register mode (any CPU register)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>01</td>
<td>register mode (any CPU register)</td>
<td>indirect mode (#)ARn(5-bit unsigned displacement)</td>
</tr>
<tr>
<td>10</td>
<td>indirect mode (#)ARn(5-bit unsigned displacement)</td>
<td>8-bit signed immediate</td>
</tr>
<tr>
<td>11</td>
<td>indirect mode (#)ARn(5-bit unsigned displacement)</td>
<td>indirect mode (#)ARn2(5-bit unsigned displacement)</td>
</tr>
</tbody>
</table>

**Operation**  
\(src1 \text{ XOR } src2 \rightarrow dst\)
**Description**
The bitwise-exclusive OR between the src1 and src2 operands is loaded into the dst register. The src1, src2, and dst operands are assumed to be unsigned integers. The src2 immediate-addressing mode is sign-extended.

**Status Bits**
If ST (SETCOND) = 0 and the destination register is R0 – R11, the condition flags are modified. If ST (SETCOND) = 1, they are modified for all destination registers.

- **LUF**  Unaffected
- **LV**   Unaffected
- **UF**   0
- **N**    MSB of the output
- **Z**    1 if a zero output is generated, 0 otherwise
- **V**    0
- **C**    Unaffected

**Mode Bit**
OVM operation is not affected by OVM bit value.

**Cycles**
1

**Example**
None
XOR3||STI  Parallel XOR3 and STI

Syntax

\[
\text{XOR3} \quad \text{src2, src1, dst1} \\
|| \quad \text{STI} \quad \text{src3, dst2}
\]

Operands

- **src1**: register (R0 – R7)
- **src2**: indirect (disp = 0, 1, IR0, IR1)
- **dst1**: register (R0 – R7)
- **src3**: register (R0 – R7)
- **dst2**: indirect (disp = 0, 1, IR0, IR1)

Opcode

<table>
<thead>
<tr>
<th>31</th>
<th>24 23</th>
<th>16 15</th>
<th>8 7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>dst</td>
<td>src1</td>
<td>src3</td>
<td>dst2</td>
<td>src2</td>
</tr>
</tbody>
</table>

Word Fields

None

Operation

\[
\text{src1 XOR src2} \to \text{dst1} \\
\| \quad \text{src3} \to \text{dst2}
\]

Description

A bitwise-exclusive XOR and an integer store are performed in parallel. All registers are read at the beginning and loaded at the end of the execute cycle. This means that if one of the parallel operations (STI) reads from a register and the operation being performed in parallel (XOR3) writes to the same register, then STI accepts as input the contents of the register before it is modified by the XOR3.

If **src2** and **dst2** point to the same location, **src2** is read before the write to **dst2**.

Status Bits

- **LUF**: Unaffected
- **LV**: Unaffected
- **UF**: 0
- **N**: MSB of the output
- **Z**: 1 if a zero output is generated, 0 otherwise
- **V**: 0
- **C**: Unaffected

Mode Bit

OVM operation is not affected by OVM bit value.

Cycles

1
### Example

```assembly
XOR3  *AR1++, R3, R3
|| STI  R6, *AR2(IR0)
```

#### Before Instruction

<table>
<thead>
<tr>
<th>AR1</th>
<th>80 987Eh</th>
</tr>
</thead>
<tbody>
<tr>
<td>R3</td>
<td>85h</td>
</tr>
<tr>
<td>R6</td>
<td>0DCh</td>
</tr>
<tr>
<td>AR2</td>
<td>80 98B4h</td>
</tr>
<tr>
<td>IR0</td>
<td>8h</td>
</tr>
<tr>
<td>Data at 80 987Eh</td>
<td>85h</td>
</tr>
<tr>
<td>Data at 80 98ACCh</td>
<td>0h</td>
</tr>
</tbody>
</table>

#### After Instruction

<table>
<thead>
<tr>
<th>AR1</th>
<th>80 987Fh</th>
</tr>
</thead>
<tbody>
<tr>
<td>R3</td>
<td>0h</td>
</tr>
<tr>
<td>R6</td>
<td>0DCh</td>
</tr>
<tr>
<td>AR2</td>
<td>80 98B4h</td>
</tr>
<tr>
<td>IR0</td>
<td>8h</td>
</tr>
<tr>
<td>Data at 80 987Eh</td>
<td>85h</td>
</tr>
<tr>
<td>Data at 80 98ACCh</td>
<td>0DCh</td>
</tr>
</tbody>
</table>

#### Additional Flags

- **LUF**: 0
- **LV**: 0
- **UF**: 0
- **N**: 0
- **Z**: 0
- **V**: 0
- **C**: 0
A0–A30: External address pins for data/program memory or I/O devices. These pins are on the global bus. See also LA0–LA30.

address: The location of program code or data stored in memory.

addressing mode: The method by which an instruction interprets its operands to acquire the data it needs.

ALU: See Arithmetic logic unit.

analog-to-digital (A/D) converter: A successive-approximation converter with internal sample-and-hold circuitry used to translate an analog signal to a digital signal.

ARAU: See auxiliary-register arithmetic unit.

arithmetic logic unit (ALU): The part of the CPU that performs arithmetic and logic operations.

auxiliary registers (ARn): A set of registers used primarily in address generation.

auxiliary-register arithmetic unit (ARAU): Auxiliary-register arithmetic unit. A 32-bit arithmetic logic unit (ALU) used to calculate indirect addresses using the auxiliary registers as inputs and outputs.

bit-reversed addressing: Addressing in which several bits of an address are reversed in order to speed processing of algorithms, such as Fourier transforms.

BK: See block-size register.
**bootloader:** An on-chip code that transfers code from an external memory or from a communication port to RAM at power-up.

**carry bit:** A bit in status register ST used by the ALU for extended arithmetic operations and accumulator shifts and rotates. The carry bit can be tested by conditional instructions.

**circular addressing:** An addressing mode in which an auxiliary register is used to cycle through a range of addresses to create a circular buffer in memory.

**context save/restore:** A save/restore of system status (status registers, accumulator, product register, temporary register, hardware stack, and auxiliary registers, etc.) when the device enters/exits a subroutine such as an interrupt service routine.

**CPU:** Central processing unit. The unit that coordinates the functions of a processor.

**CPU cycle:** The time it takes the CPU to go through one logic phase (during which internal values are changed) and one latch phase (during which the values are held constant).

**cycle:** See CPU cycle.

**D0–D31:** External data-bus pins that transfer data between the processor and external data/program memory or I/O devices. See also LD0–LD31.

**data-address generation logic:** Logic circuitry that generates the addresses for data-memory reads and writes. This circuitry can generate one address per machine cycle. See also program address generation logic.

**data-page pointer:** A 32-bit register used as the 16 MSBs in addresses generated using direct addressing.

**decode phase:** The phase of the pipeline in which the instruction is decoded (identified).

**DIE:** See DMA interrupt enable register.

**DMA coprocessor:** A peripheral that transfers the contents of memory locations independently of the processor (except for initialization).
DMA controller: See DMA coprocessor.

DMA interrupt enable register (DIE): A register (in the CPU register file) that controls which interrupts the DMA coprocessor responds to.

DP: See data-page pointer.

dual-access RAM: Memory that can be accessed twice in a single clock cycle. For example, your code can read from and write to a dual-access RAM in one clock cycle.

external interrupt: A hardware interrupt triggered by a pin.

extended-precision floating-point format: A 40-bit representation of a floating-point number with a 32-bit mantissa and an 8-bit exponent.

extended-precision register: A 40-bit register used primarily for extended-precision floating-point calculations. Floating-point operations use bits 39–0 of an extended-precision register. Integer operations, however, use only bits 31–0.

FIFO buffer: First-in, first-out buffer. A portion of memory in which data is stored and then retrieved in the same order in which it was stored. Thus, the first word stored in this buffer is retrieved first. The ‘C4x’s communication ports each have two FIFOs: one for transmit operations and one for receive operations.

hardware interrupt: An interrupt triggered through physical connections with on-chip peripherals or external devices.

hit: A condition in which, when the processor fetches an instruction, the instruction is available in the cache.

IACK: Interrupt acknowledge signal. An output signal that indicates that an interrupt has been received and that the program counter is fetching the interrupt vector that will force the processor into an interrupt service routine.
IIE: See internal interrupt enable register.

IIF: See IIOF flag register.

**IIOF flag register (IIF):** Controls the function (general-purpose I/O or interrupt) of the four external pins (IIOF0 to IIOF3). It also contains timer/DMA interrupt flags.

index registers: Two registers (IR0 and IR1) that are used by the ARAU for indexing an address.

internal interrupt: A hardware interrupt caused by an on-chip peripheral.

**internal interrupt enable register:** A register (in the CPU register file) that determines whether or not the CPU will respond to interrupts from the communication ports, the timers, and the DMA coprocessor.

interrupt: A signal sent to the CPU that (when not masked) forces the CPU into a subroutine called an interrupt service routine. This signal can be triggered by an external device, an on-chip peripheral, or an instruction (TRAP, for example).

**interrupt acknowledge (IACK):** A signal that indicates that an interrupt has been received, and that the program counter is fetching the interrupt vector location.

**interrupt vector table (IVT):** An ordered list of addresses which each correspond to an interrupt; when an interrupt occurs and is enabled, the processor executes a branch to the address stored in the corresponding location in the interrupt vector table.

interrupt vector table pointer (IVTP): A register (in the CPU expansion register file) that contains the address of the beginning of the interrupt vector table.

**ISR:** Interrupt service routine. A module of code that is executed in response to a hardware or software interrupt.

IVTP: See interrupt vector table pointer.

**LA0–LA30:** External address pins for data/program memory or I/O devices. These pins are on the local bus. See also A0–A30.

**LD0–LD31:** External data bus pins that transfer data between the processor and external data/program memory or I/O devices. See also D0–D31.
LSB:  *Least significant bit.* The lowest order bit in a word.

**machine cycle:** See *CPU cycle.*

**mantissa:** A component of a floating-point number consisting of a fraction and a sign bit. The mantissa represents a normalized fraction whose binary point is shifted by the exponent.

**maskable interrupt:** A hardware interrupt that can be enabled or disabled through software.

**memory-mapped register:** One of the on-chip registers mapped to addresses in memory. Some memory-mapped registers are mapped to data memory, and some are mapped to input/output memory.

**MFLOPS:** *Millions of floating point operations per second.* A measure of floating-point processor speed that counts the number of floating-point operations made per second.

**microcomputer mode:** A mode in which the on-chip ROM (bootloader) is enabled. This mode is selected via the MP/MC pin. See also *MP/MC pin; microprocessor mode.*

**microprocessor mode:** A mode in which the on-chip ROM is disabled. This mode is selected via the MP/MC pin. See also *MP/MC pin; microcomputer mode.*

**MIPS:** Million instructions-per-second.

**miss:** A condition in which, when the processor fetches an instruction, it is not available in the cache.

**MSB:** *Most significant bit.* The highest order bit in a word.

**multiplier:** A device that generates the product of two numbers.

**NMI:** See *Nonmaskable interrupt.*

**nonmaskable interrupt (NMI):** A hardware interrupt that uses the same logic as the maskable interrupts but cannot be masked.
overflow flag (OV) bit:  A status bit that indicates whether or not an arithmetic operation has exceeded the capacity of the corresponding register.

PC:  See program counter.

peripheral bus:  A bus that is used by the CPU to communicate the DMA co-processor, communication ports, and timers.


program counter:  A register that contains the address of the next instruction to be fetched.

RC:  See repeat counter register.

read/write (R/W) pin:  This memory-control signal indicates the direction of transfer when communicating to an external device.

register file:  A bank of registers.

repeat counter register:  A register (in the CPU register file) that specifies the number of times minus one that a block of code is to be repeated when a block repeat is performed.

repeat mode:  A zero-overhead method for repeating the execution of a block of code.

reset:  A means to bring the central processing unit (CPU) to a known state by setting the registers and control bits to predetermined values and signaling execution to fetch the reset vector.

reset pin:  This pin causes the device to reset.

ROMEN:  ROM enable. An external pin that determines whether or not the on-chip ROM is enabled.

R/W:  See read/write pin.
short floating-point format: A 16-bit representation of a floating point number with a 12-bit mantissa and a 4-bit exponent.

short integer format: A two's-complement, 16-bit format for integer data.

short unsigned-integer format: A 16-bit unsigned format for integer data.

sign-extend: Fill the high order bits of a number with the sign bit.

single-precision floating-point format: A 32-bit representation of a floating-point number with a 24-bit mantissa and an 8-bit exponent.

single-precision integer format: A two's-complement 32-bit format for integer data.

single-precision unsigned-integer format: A 32-bit unsigned format for integer data.

software interrupt: An interrupt caused by the execution of a TRAP instruction.

split mode: A mode of operation of the DMA coprocessor. This mode allows one DMA channel to service both the receive and transmit portions of a communication port.

ST: See status register.

stack: A block of memory reserved for storing and retrieving data on a first-in last-out basis. It is usually used for storing return addresses and for preserving register values.

status register: A register in the CPU register file that contains global information related to the CPU.

Timer: A programmable peripheral that can generate pulses or time events.

Timer-Period Register: Timer-period register. A 32-bit memory-mapped register that specifies the period for the on-chip timer.

trap vector table (TVT): An ordered list of addresses which each correspond to an interrupt; when a trap is executed, the processor executes a branch to the address stored in the corresponding location in the trap vector table.

trap vector table pointer (TVTP): A register in the CPU expansion register file that contains the address of the beginning of the trap vector table.
TVTP: See *trap vector table pointer*.

**unified mode:** A mode of operation of the DMA coprocessor. The mode is used mainly for memory-to-memory transfers. This is the default mode of operation for a DMA channel. See also *split mode*.

**wait state:** A period of time that the CPU must wait for external program, data, or I/O memory to respond when reading from or writing to that external memory. The CPU waits one extra cycle for every wait state.

**wait-state generator:** A program that can be modified to generate a limited number of wait states for a given off-chip memory space (lower program, upper program, data, or I/O).

**zero fill:** Fill the low or high order bits with zeros when loading a number into a larger field.
Index

Nu
16-bit wide configured memory
  table 10-14
32-bit wide configured memory
  table 10-15

A
A/D converter
  definition A-1
A0-A30
  definition A-1
  abbreviations 14-16
ABS||STI instruction 14-31
ABSF instruction 14-26
ABSF||STF instruction 14-27
ABSI instruction 14-29
ADDC instruction 14-33
ADDC3 instruction 14-35
ADDF instruction 14-37
ADDF3 instruction 14-39
ADDF3||STF instruction 14-41
ADDI instruction 14-43
ADDI3 instruction 14-45
ADDI3||STI instruction 14-47
  addition
    floating-point 5-23
  address
    definition A-1
  address buses
    external 2-20
  address partitioning
    figure 4-10
  address pins
external A-4
address range
LSTRB0 9-11
STRB0 9-10
address space caution 2-13
addressing modes
bit-reversed addressing 6-32
circular 6-27
conditional branch definition A-1
encoding 6-21
conditional branch 6-25
general 6-21
parallel 6-24
three-operand 6-23
general 2-18
groups 6-21
parallel 2-18
three operand 2-18 6-22
addressing types 6-2
direct addressing 6-5
immediate 6-18
indirect addressing 6-6 to 6-21
PC relative 6-19
register 6-3
AE bit 9-7
aliasing 2-17
ALU. See arithmetic logic unit; arithmetic logic unit (ALU)
analysis bit 3-7
analysis module registers 4-6
AND instruction 14-49
AND||STI instruction 14-53
AND3 instruction 14-51
ANDN instruction 14-55
ANDN3 instruction 14-57
application(s)
automotive viii xiv
consumer viii xiv
control viii xii
development support viii xii
general-purpose viii xii
graphics/imagery viii xi
medical viii xiv
military viii xiii
multimedia viii xiii
speech/voice viii xi
telecommunications viii xiii
ARAUs. See auxiliary register arithmetic units (ARAUs)

architectural overview
  introduction 2-1
architecture
  peripheral bus 2-22
arithmetic logic unit (ALU) 2-4
definition A-1
ASH instruction 14-59
ASH3 instruction 14-61
ASH3||STI instruction 14-63
assembly language instructions 14-2 to 14-11
  condition codes flags 14-12
example instruction 14-23
illegal instructions 14-11
interlocked operation 14-8
load and store 14-2
parallel operation 14-9
program control 14-7
register symbols 14-21 to 14-22
symbols 14-17 to 14-22
syntax options 14-18 to 14-22
three-operand 14-6
two-operand 14-4
autoinitialization 11-34
consecutive 11-40
situations 11-34
synchronization 11-37
automotive applications viii, xiv
auxiliary link-pointer register 11-7
auxiliary register
  indirect 6-9
auxiliary register arithmetic units (ARAUs) 2-6
auxiliary registers (AR0–7) 2-6, 3-4
auxiliary registers (ARn)
definition A-1
auxiliary transfer-counter register 11-7
auxiliary-register arithmetic unit (ARAU)
definition A-1

B
Bcond instruction 14-65
BcondAF instruction 7-11, 8-7, 14-67
text example 8-7
BcondAT instruction 7-11, 8-7, 14-69
text example 8-7
BcondD instruction 14-71
bit-reversed addressing 6-32
definition A-1
example 6-32
index steps 5-33
block diagram
'C4x 2-2
communication ports 12-4
peripheral modules 2-22
timers 13-3
block repeat
nesting 7-8
registers (RC, RE, RS) 3-16, 7-2
block size (BK) register 3-5
block transfer completion 11-6
block transfer sequence 11-5
bootloader
definition A-2
description 10-2
from communication port 10-3
from memory 10-3
introduction 10-1
mode selection 10-3
operation 10-5
setting the IIOF pins 10-19
source code 10-20 to 10-25
source structure 10-8
bootloader mode selection
table 10-3
bootloading
from a comm port 10-16
from memory 10-10
sequence 10-5
BR instruction 14-73
branch conflicts 8-4
branch execution
delayed 7-10
branches 7-9, 7-12
BRD instruction 14-74
bus operation
external 2-20, 9-1 to 9-50
internal 2-19
busy-waiting example 9-42
byte-wide configured memory
table 10-11 to 10-14

C
C flag 3-5

A-4
'C40 memory map
figure 2-14, 4-3
'C40/'C44 features
table 1-4
'C44 memory aliasing
figure 2-17
'C44 memory map
figure 2-15
'C4x multiprocessor system
booting example 10-17
'C4x to IEEE conversion
table 5-18
'C4x-specific instructions 14-3 to 14-8
cache 4-1
cache clear (CC) bit 4-12
cache enable (CE) bit 4-12
cache freeze (CF) bit 4-12
cache memory
architecture 2-11, 4-13
control bits 2-11, 4-10
enabling 4-13
hit 4-14
instruction cache 4-10
LRU algorithm 4-14
miss 4-14
segment miss 4-14
subsegment miss 4-14
CALL instruction 7-12, 14-75
CALL response timing
figure 7-14
CALLcond instruction 7-12, 14-76
calls 7-12
carry bit
definition A-2
CC bit 3-6, 4-12
CE and CF bits
combined effect
table 4-13
table 3-7
CE bit 3-6, 4-12
CE0 bit 9-7
CE1 bit 9-7
central processing unit. See CPU
CF bit 3-6, 4-12
channel control register. See DMA channel control register
channel priority scheme
split mode 11-25
circular addressing
  definition A-2
  example 6-30
  FIR filters 5-31
  register relationships
    figure 6-28

circular addressing mode 6-27

circular buffer
  implementation 6-29
  CLKSRC = 0 and FUNC = 0 13-14
  CLKSRC = 0 and FUNC = 1 13-14
  CLKSRC = 1 and FUNC = 0 13-13
  CMPF instruction 14-78
  CMPF3 instruction 14-80
  CMPI instruction 14-82
  CMPI3 instruction 14-84
  communication port load mode
    flow chart 10-7
  communication port memory map
    figure 12-7
  communication port reset
    example 12-10
  communication port software register 12-3
  communication ports
    arbitration unit 12-3 12-11
    block diagram 12-4
    control register 12-3
    coordination with CPU/DMA 12-17
    CSTRB width restrictions 12-25
    features 2-23 12-2
    H1/H3 synchronization 12-26
    input FIFO halt 12-15
  communication ports (continued)
    input port post-reset state 12-31
    input port register 12-9
    interconnection 12-5
    introduction 12-1
    memory map 4-8 12-7
      figure 4-8
    output FIFO halt 12-15
    output port post-reset state 12-30
    output port register 12-9
    reset 12-29
    tips 12-32
    token transfer 12-19
  communication-port control register (CPCR) 12-8
    field descriptions 12-8
    figure 12-8
    ICH 12-8
INPUT LEVEL 12-9
OCH 12-8
OUTPUT LEVEL 12-9
PORT DIR 12-8
communication-port software
reset address
   table 12-10
reset register 12-10
condition codes
   flags 14-14
conditional-branch addressing modes 2-18 6-25
encoding 6-26
consecutive autoinitializations 11-40
consumer applications viii xiv
context save/restore
   definition A-2
control applications viii xii
control bits
   repeat mode 7-3
control registers 7-35 11-7 13-5
conversion of format
   'C4x floating-point to integer 5-31
   extended-precision floating-point to single-precision floating-point 5-12
   FRIEEE instruction 14-98
   IEEE single precision std. 754 5-13
   IEEE to 'C4x floating-point 5-14
   integer to floating-point 5-33
   short floating-point to extended-precision floating-point 5-11
   short floating-point to single-precision floating-point 5-11
   single-precision 'C4x floating-point 5-13
   single-precision floating-point to extended-precision floating-point 5-12
   TOIEEE instruction 14-265
converting IEEE format
   table 5-14
   converting twos complement
   table 5-17
counter register 13-5
CPU 2-4
   arbitration 11-27
   block diagram 2-5
   buses 2-19
   components 2-4
   communication ports coordination 12-17
   definition A-2
   internal interrupt enable register (IIE) 2-9 3-11
   primary register file 2-6
CPU cycle
   definition A-2
CPU expansion register file
definition 3-1
CPU primary register file definition 3-1
CPU registers 2-7 3-8 7-36
auxiliary (AR0–AR7) 2-8 3-4
block repeat (RC, RE, RS) 3-16
block size (BK) 2-8 3-5
data page pointer (DP) 2-8 3-4 6-5
DMA interrupt enable (DIE) 2-9 3-8 11-44
expansion register file 2-10 3-17
extended precision (R0–R11) 2-6 3-3
IIE 3-11
IIOF flag register (IIF) 2-9 3-13
index (IR1, IR0) 2-8 3-4
internal interrupt enable (IIE) 2-9 3-11, 3-12
Introduction 3-1
program counter (PC) 2-9 2-19 3-16
repeat count (RC) 2-9 3-16 7-2
repeat end address (RE) 3-16 7-2
See also repeat block (RC, RE, RS)
repeat start address (RS) 3-16 7-2
See also repeat block (RC, RE, RS)
stack pointer (SP) 2-8 3-5
status register (ST) 2-9 3-5 14-13
table 3-2 6-3
timer 4-7

CSTRB width restrictions 12-25

D

D0-D31 definition A-2
data buses external 2-20
data formats introduction 5-1
data page pointer (DP) 2-8 3-4 6-5
data structure FIR filters 6-31
data transfer modes 11-28
data transfer operation 12-6
data-address generation logic definition A-2
data-page pointer definition A-2
DBcond instruction 14-86
DBcondD instruction 14-88
DBR instruction 8-9
DE bit 9-7

A-8
decode phase
definition A-2
delayed branches 7-9
conditional 7-9
disabled interrupts 7-9
example 7-10
incorrectly placed 7-10
example 7-7
with annul option 8-7
with annulling 7-11
without annul option 8-6
without annulling 7-10

destination address register 11-7
destination address-index register 11-7
development support applications viii, xv
DIE register bit functions
DMA split mode 11-45
direct addressing 6-5
example 6-6
figure 6-5
direct memory access. See DMA coprocessor
displacement
indirect addressing
table 5-7
displacements 6-6 to 6-21
DMA channel control register 7-35 11-7
bit definitions 11-8
data transfer modes 11-28
field descriptions 11-8
modifiable by autoinitialization in split mode 11-40
modifiable by autoinitialization in unified mode 11-39
modifiable by autoinitialization of auxiliary channel 11-40
DMA channel control register bit definitions
AUTOINIT STATIC 11-8
AUTOINIT SYNC 11-9
AUX AUTOINIT STATIC 11-9
AUX AUTOINIT SYNC 11-9
AUX START 11-10
AUX STATUS 11-11
AUX TCC 11-10
AUX TCINT FLAG 11-10
AUX TRANSFER MODE 11-8
COM PORT 11-9
DMA PRI 11-8
PRIORITY MODE 11-11
READ BIT REV 11-9
SPLIT MODE 11-9
START 11-10
<table>
<thead>
<tr>
<th>Attribute</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>STATUS</td>
<td>11-11</td>
</tr>
<tr>
<td>SYNC MODE</td>
<td>11-8</td>
</tr>
<tr>
<td>TCC</td>
<td>11-10</td>
</tr>
<tr>
<td>TCINT FLAC</td>
<td>11-10</td>
</tr>
<tr>
<td>TRANSFER MODE</td>
<td>11-8</td>
</tr>
<tr>
<td>WRITE BIT REV</td>
<td>11-9</td>
</tr>
</tbody>
</table>

DMA channel registers
(SPLIT MODE=1, auxiliary transfer counter = 0)
- **figure**: 11-36
- storage in memory (SPLIT MODE=0)
- **figure**: 11-35
- storage in memory (SPLIT MODE=1)
- **figure**: 11-36

DMA channel running
- transfer mode 102
- **figure**: 11-29, 11-30
- transfer mode 112
- **figure**: 11-31, 11-33

DMA control register bits
- effect 11-38

DMA controller 2-19
DMA coprocessor
  address generation
    figure 11-16
  address registers
  arbitration
  autoinitialization
  auxiliary channel
  block transfer sequence
  buses
  channel arbitration
  channel configuration
    figure 11-19
  channel register map
  channel synchronization
  communication ports coordination
  definition A-2
  destination synchronization
  features
  functional description
  index registers
  interrupts
  introduction
  link-pointer register
  memory map
  operational modes
  primary channel
  priorities
  priority wheel
  registers
  six channels
  source and destination synchronization
  source synchronization
  split mode
  timing
  transfer count register
  transfer modes
  unified mode
DMA coprocessor memory map
  figure 4-9
DMA destination synchronization
  figure 11-49
DMA interrupt enable register (DIE)
  bit functions
  definition A-3
DMA interrupts
  control bits
  CPU interaction
  processing
DMA memory transfer timing
  single
DMA PRI and CPU/DMA arbitration rules 11-27
DMA registers initialization 11-5
DMA source and destination sync
unified mode 11-50
DMA source synchronization figure 11-48
DMA start 11-5
DMA timing for different synchronizations
split mode figure 11-56
unified mode figure 11-55
DMA transfers
timing and number of cycles to global bus figure 11-54
timing and number of cycles to local bus figure 11-53
timing and number of cycles to on-chip figure 11-52
DMAINTx flag 3-15
documentation vii
dual-access RAM definition A-3

E
edge-triggered interrupts 7-15
EDMAINTx bit 3-12
EICFULLx bit 3-12
EICRDYx bit 3-12
EIIFOx bit 3-14
EOCEMPTYx bit 3-12
EOCRDYx bit 3-12
ETINT0 bit 3-12
ETINT1 bit 3-12
execute only 8-10
parallel store 8-14
single store 8-13
expansion register file 2-10
interrupt vector table (IVT) 7-16
exponent field
definition 5-8
extended precision registers 3-3
extended-precision floating-point format
definition A-3

A-12
extended-precision register
  definition A-3
  floating-point format 3-4
  integer format 3-4
external bus
  control registers. See memory interface control registers
  interface signals 3-3
  external bus operation 9-1 to 9-50
  overview 9-2
  external buses (global, local), wait states 9-14
external interrupts 2-21, 7-21
  definition A-3
external memory interface registers 7-35

features comparison 1-4
FIFO buffer
  definition A-3
FIFOS
  halting 12-14
FIR filters
  circular addressing 6-31
  data structure 6-31
  FIX instruction 5-31, 14-90
  FIX|STI instruction 14-92
  fixed priority 11-22
  FLAGx bit 3-14
  FLOAT instruction 5-33, 14-94
  FLOAT|STF instruction 14-96
floating point
  addition 5-23
  conversion to integer 5-31
  extended-precision format 5-7
  format conversion 5-11
  formats 5-4
  normalization 5-23, 5-27
  reciprocal 5-34
  register format 3-4
  rounding value 5-29
  single-precision format 5-6
floating point (continued)
  subtraction 5-23
  underflow 5-24
floating-point
  determining decimal equivalent 5-8
  extended-precision format 5-7
  general format 5-4
multiplication 5-19
short format 5-5
figure 5-5
single-precision format figure 5-6
floating-point addition
32-bit shift 5-26
example 5-25
floating-point addition/subtraction
example 5-26
floating-point formats
IEEE Std. 754 5-13
supported types 5-4
floating-point multiplication chart 5-20
floating-point multiply
mantissa = 1.0 5-22
mantissa = 1.5 5-21
mantissa = 2.0 5-21
positive and negative numbers 5-22
floating-point operation introduction 5-1
floating-point rounding flowchart 5-30
floating-point subtraction
example 5-25
floating-point to integer conversion
flowchart 5-32
floating-point values
fractional 5-10
negative 5-10
positive 5-9
floating-point/integer multiplier 2-4
format
conversion
'C4x to IEEE 5-17
conversions
IEEE std. 754 5-13
formats
conversion
floating-point 5-1
See also conversion of formats
formats (continued)
signed integer 5-2
unsigned integer 5-3
FRIEEE instruction 14-98
FRIEEE||STF instruction 14-99
FUNCx bit 3-14
G

general addressing modes 2-18, 6-21
encoding 6-22

general-purpose applications viii
GIE bit 3-7

global and local memory
interface control signals 9-3

global memory 9-39, 9-41, 9-43
interface 2-20

   table 9-4

   global memory port status
   STRB0 and STRB1 accesses 9-5

graphics/imagery applications viii, xi

H

halting of FIFOs 12-14
hardware interrupt
definition A-3
hit
   cache 4-14
   definition A-3
hold everything 8-10, 8-15
   busy external port 8-15
   conditional calls and traps 8-16
   multicycle data reads 8-16

I

IACK
definition A-3
IACK instruction 9-49, 14-100
IACK pin 9-49
timing 9-49
ICFULL interrupt
description 12-17
enabling 3-12
ICRDY flag
interrupt use 11-46
ICRDY interrupt
description 12-17
enabling 3-12
   interrupt use 3-3, 3-10, 11-44, 11-45
IDLE instruction 14-102, 14-103
IEEE std. 754 (conversions) 5-13
IEEE to ‘C4x conversion
example 5-16
\texttt{IIE} register 3-11
\texttt{IIF} register 7-17
\texttt{IIF} register modification 3-13
figure 7-18
\texttt{IIOF} flag register (\texttt{IIF}) 2-9 3-13
definition A-4
\texttt{IIOF} pins
boot loader use 10-5
modification 10-19
immediate addressing example 6-18
index registers (\texttt{IR0, IR1}) 2-8 3-4 11-15
definition A-4
indirect addressing 6-6
displacement 5-7
flexibility 6-6
index register \texttt{IR1} 5-8
index register \texttt{IRO} 5-7
operand coding
\textit{figure} 5-6
postdisplacement add and circular modify 6-12
postdisplacement add and modify 6-11
postdisplacement subtract and circular modify 6-13
postdisplacement subtract and modify 6-12
postindex add and bit-reversed modify 6-17
postindex add and circular modify 6-16
postindex add and modify 6-15
postindex subtract and circular modify 6-17
postindex subtract and modify 6-16
predisplacement add 6-9
predisplacement add and modify 6-10
predisplacement subtract 6-10
predisplacement subtract and modify 6-11
preindex add 6-13
preindex add and modify 6-14
preindex subtract 6-14
preindex subtract and modify 6-15
special cases 6-8
individual instruction descriptions 14-20
input and output FIFO halting summary 12-14
input FIFO channel 12-3
instruction cache 4-10
architecture 4-10
\textit{figure} 4-11
reset 4-12
instruction register (\texttt{IR}) 2-19
instruction set summary \[14-2 \text{ to } 14-11\]
  functional groups \[14-2\]
instructions
  See also assembly language
  interlocked \[9-44\]
integer
  short format \[5-2\]
  short unsigned format \[5-3\]
  signed formats \[5-2\]
  single-precision unsigned format \[5-3\]
  single-precision format \[5-2\]
  unsigned formats \[5-3\]
integer formats
  short integer \[5-2\]
  signed \[5-2\]
interlocked instructions \[2-20, 9-39, 9-44\]
interlocked operations \[9-39\]
interlocked operations instructions
  table \[14-8\]
internal buses \[2-4, 2-19\]
internal interrupt
  definition \[A-4\]
internal interrupt enable register (IIE) \[2-9, 3-11\]
  definition \[A-4\]
internal interrupts \[7-18\]
interrupt
  definition \[A-4\]
  interrupt acknowledge (IACK) \[A-4\]
  definition \[A-4\]
  interrupt acknowledge (IACK) instruction \[7-20\]
interrupt flag register (IIF)
  figure \[3-14\]
interrupt latency
  table \[7-21\]
interrupt service routine (ISR)
  definition \[A-4\]
interrupt vector table (IVT) \[7-15\]
  boot loader use \[10-8\]
  definition \[A-4\]
interrupt vector table pointer (IVTP)
  definition \[A-4\]
interrufts \[2-21\]
  control bits \[7-17\]
  DMA \[7-28, 11-42\]
  DMA interaction \[7-28\]
  edge triggered \[11-42\]
  edge-triggered \[11-42\]
  external \[2-21, 7-21\]
initialization 7-24
initiation condition 7-15
latency 7-20
level triggered 11-42
level-triggered 11-42
NMI 7-22
overlapping the IVT and TVT 7-25
prioritization 7-15
processing 7-18, 7-19, 7-27, 7-28
timer 13-12
vector table 7-16
vectors 7-28, 13-11
ISR. See interrupt service routine (ISR)
IVTP. See interrupt vector table (IVT)
IVTP register 2-10, 3-17

J
jumps 7-12

L
LA0-LA30
definition A-4
LAJ instruction 7-13, 14-105
LAIcond instruction 7-13, 14-106
LATcond instruction 7-13, 7-25, 14-107
LBb instruction 14-108
LBUb instruction 14-110
LD0-LD31
definition A-4
LDA instruction 14-111
LDE instruction 14-112
LDEP instruction 14-114
LDF instruction 14-115
LDF||LDF instruction 14-121
LDF||STF instruction 14-123
LDFcond instruction 14-117
LDFI instruction 9-39, 9-45, 14-119
LDHI instruction 14-125
LDI instruction 14-126
LDI||LDI instruction 14-132
LDI||STI instruction 14-134
LDIcond instruction 14-128
LDII instruction 9-39, 9-45, 14-130
LDM instruction 14-136
<table>
<thead>
<tr>
<th>Term</th>
<th>Page(s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>LDP instruction</td>
<td>14-138</td>
</tr>
<tr>
<td>LDPE instruction</td>
<td>14-139</td>
</tr>
<tr>
<td>LDPK instruction</td>
<td>14-140</td>
</tr>
<tr>
<td>level-triggered</td>
<td></td>
</tr>
<tr>
<td>interrupts</td>
<td>7-15</td>
</tr>
<tr>
<td>LHUw instruction</td>
<td>14-143</td>
</tr>
<tr>
<td>LHw instruction</td>
<td>14-141</td>
</tr>
<tr>
<td>link pointer</td>
<td></td>
</tr>
<tr>
<td>incrementing</td>
<td>11-36</td>
</tr>
<tr>
<td>reference to</td>
<td>11-41</td>
</tr>
<tr>
<td>self referential</td>
<td>11-41</td>
</tr>
<tr>
<td>link pointer registers</td>
<td></td>
</tr>
<tr>
<td>figure</td>
<td>11-18</td>
</tr>
<tr>
<td>literature</td>
<td>vii</td>
</tr>
<tr>
<td>LLOCK signal</td>
<td>9-44</td>
</tr>
<tr>
<td>load and store</td>
<td></td>
</tr>
<tr>
<td>instructions</td>
<td></td>
</tr>
<tr>
<td>table</td>
<td>14-3</td>
</tr>
<tr>
<td>loading sequence</td>
<td></td>
</tr>
<tr>
<td>bootloader</td>
<td>10-10</td>
</tr>
<tr>
<td>local memory</td>
<td></td>
</tr>
<tr>
<td>interface</td>
<td>2-20</td>
</tr>
<tr>
<td>LOCK signal</td>
<td>9-44</td>
</tr>
<tr>
<td>low IIOF signal</td>
<td></td>
</tr>
<tr>
<td>circuit diagram</td>
<td>10-19</td>
</tr>
<tr>
<td>LRU algorithm</td>
<td>4-14</td>
</tr>
<tr>
<td>LRU stack</td>
<td>4-12</td>
</tr>
<tr>
<td>LSB</td>
<td></td>
</tr>
<tr>
<td>definition</td>
<td>A-5</td>
</tr>
<tr>
<td>LSH instruction</td>
<td>14-145</td>
</tr>
<tr>
<td>LSH3 instruction</td>
<td>14-147</td>
</tr>
<tr>
<td>LSH3</td>
<td></td>
</tr>
<tr>
<td>LUF flag</td>
<td>3-5</td>
</tr>
<tr>
<td>LV flag</td>
<td>3-5</td>
</tr>
<tr>
<td>LWLct instruction</td>
<td>14-152</td>
</tr>
<tr>
<td>LWRct instruction</td>
<td>14-154</td>
</tr>
<tr>
<td>machine values</td>
<td>14-21</td>
</tr>
<tr>
<td>mantissa</td>
<td></td>
</tr>
<tr>
<td>definition</td>
<td>5-8</td>
</tr>
<tr>
<td>A-5</td>
<td></td>
</tr>
<tr>
<td>mapping addresses</td>
<td></td>
</tr>
<tr>
<td>to strobes</td>
<td>9-12</td>
</tr>
<tr>
<td>maskable interrupt</td>
<td></td>
</tr>
<tr>
<td>definition</td>
<td>A-5</td>
</tr>
<tr>
<td>MBct instruction</td>
<td>14-156</td>
</tr>
<tr>
<td>medical applications</td>
<td></td>
</tr>
<tr>
<td>viii</td>
<td></td>
</tr>
<tr>
<td>xiv</td>
<td></td>
</tr>
<tr>
<td>memory</td>
<td>2-11</td>
</tr>
<tr>
<td>8-10</td>
<td></td>
</tr>
</tbody>
</table>

See also memory interface
accesses
  pipeline 8-13
  timing 8-15
addressing modes 2-18
aliasing 2-17
block diagram 2-12
cache 4-10, 4-13
communication ports memory map 12-7
cache 4-10, 4-13
control registers. See memory interface
global 9-39, 9-41, 9-43
introduction 4-1
maps 2-13, 4-2
maximizing pipeline performance 8-17
memory maps
  communication ports 4-8
  DMA 11-4
timer registers 4-7, 13-3
organization 2-11, 4-2
parallel multiplies and adds 8-23
parallel stores 8-21
pipeline conflicts 8-10
ranges 9-10
registers. See memory interface control registers
ROMEN pin 4-2
sharing 9-42
signal-group control 9-38
space 2-11
three-operand accesses 8-20
timing 8-19, 9-16
two-operand accesses 8-20
memory accesses
  data access 8-17
  data loads and stores 8-20
  external program fetches 8-19
  internal clock 8-19
  internal program fetches 8-19
  program fetch 8-17
  two data accesses 8-18
memory cache
  rules for efficient usage 4-13
memory conflicts 3-4
memory interface
  address ranges 9-11
  control registers 9-6
  control signals 9-3
  page size 3-3
  PAGESIZE field. See memory interface control registers
memory interface (local, global)
  features 9-2
  ready generation 9-14, 9-16
MPYF3||SUBF3 instruction 14-167
MPYI instruction 14-169
MPYI3 instruction 14-171
MPYI3||ADDI3 instruction 14-173
MPYI3||STI instruction 14-175
MPYI3||SUBI3 instruction 14-177
MPYSHI instruction 14-179
MPYSHI3 instruction, 14-180
MPYUHI instruction 14-182
MPYUHI3 instruction 14-183
MSB
definition A-5
multimedia applications viii, xiii
multiplication
floating-point 5-19
multiplier
definition A-2
multiply or CPU operation
parallel store 8-21

N
N flag 3-5
NEGB instruction 14-185
NEGFI instruction 14-187
NEGFI||STF instruction 14-189
NEGI instruction 14-191
NEGI||STI instruction 14-193
Newton-Raphson algorithm
example 5-35
reciprocal square root 5-38
NMI 7-22
NMI bus grant field 3-7
NMI flag 3-15
no DMA synchronization
figure 11-47
nonmaskable interrupt (NMI)
definition A-5
NOP instruction 14-195
NORM instruction 5-27 14-196
execution 5-28
flowchart 5-27
normalization
floating point value 5-23 5-27
NOT instruction 14-198
NOT||STI instruction 14-200
O

object values
  three-operand instructions 6-23
OCEMPTY interrupt
  description 12-17
  enabling 3-12
OCRDY flag
  interrupt use 11-46
OCRDY interrupt
  description 12-17
  enabling 3-12
  interrupt use 3-3 3-10 11-44 11-45
operational overview
  communication ports 12-3
OR instruction 14-202
OR3 instruction 14-204
OR3||STI instruction 14-206
output FIFO channel 12-3
output value formats 14-12
overflow 5-24 5-31
overflow flag (OV) bit
  definition A-6
OVM flag 3-6

P

P flag (cache) 4-10
page size 9-9
page size operation 9-13
parallel addressing modes 2-18 6-24
parallel instructions
  table 14-9
parallel multiplies and adds
  figure 8-23
parallel multiply with ADD/SUB
  encoding 6-24
PAU. See port arbitration unit
PAU state definitions 12-11
PCF bit 3-6
PC-relative addressing
  encoding 6-20
  example 3-19
period register 13-5
peripheral
  memory map 4-5
peripheral modules
figure 2-22
peripheral bus 2-22
definition A-6
general architecture 2-22
memory map 4-5
peripheral memory map 2-16
peripherals
communication port 2-23
PGIE bit 3-7
pin states
table 7-29 to 7-35
pipeline
conflicts 8-4
branch 8-4
memory 8-10
register 8-8
resolving (memory) 8-17
declare unit 8-2
definition A-6
execute unit 8-2
fetch unit 8-2
four major units 8-2
introduction 8-1
memory accesses 8-19
read unit 8-2
structure 8-2
pipeline structure
figure 8-3
POP instruction 14-208
POPF instruction 14-209
port arbitration unit 12-3, 12-11
previous cache freeze (PCF) bit 4-13
primary register file (CPU) 2-6, 3-2
prioritization 7-15
priority wheel (DMA) 11-24
program
buses 2-19
program control instructions
table 14-7
program counter (PC) 2-9 2-19 3-16
definition A-6
program fetch
multicycle program memory fetches 8-12
program fetch incomplete 8-10 8-12
program wait 8-10
due to multicycle access 8-12
wait until CPU data access completes 8-11
PUSH instruction 14-210
PUSHF instruction 14-211

R
RAM 2-11
RC register 7-4
RPFCF instruction 5-34, 5-35, 14-212
RPFCF instruction algorithm
figure 5-34
read of AR 8-9
read/write (R/W) pin
definition A-6
ready
generation 9-14
timing 9-16
reciprocal (RPFCF instruction) 5-34
reciprocal algorithm 5-35
reciprocal square root (RSQRF instruction) 5-36
register addressing 3-3
register bit functions
DMA unified mode
figure 3-3
register buses 2-19
register conflicts 8-4
register file
definition A-6
registers 2-6, 2-7
auxiliary (AR0–AR7) 2-6, 3-4
block repeat (RC, RE, RS) 3-16
block size (BK) 2-8, 3-5
data page pointer (DP) 2-8, 3-4, 6-5
DMA interrupt enable (DIE) 2-9, 3-8
expansion register file 2-10
extended precision 3-3
extended precision (R0–R11) 2-6
IIF flag register (IIF) 2-9, 3-13, 7-17, 7-26
index (IR1, IR0) 3-4
input port 12-9
internal interrupt enable (IIE) 2-9, 3-11, 3-12
output port 12-9
pipeline conflicts 8-8
program counter (PC) 2-8, 2-19, 3-16
repeat count (RC) 2-9, 3-16, 7-2
repeat end address (RE) 3-16, 7-2
See also repeat block (RC, RE, RS)
repeat mode 7-2
repeat start address (RS) 3-16, 7-2
See also repeat block (RC, RE, RS)
stack pointer (SP) 2-3 3-5
status register (ST) 2-9 3-5 14-13
timer counter 13-8
timer global control 13-6
repeat count register (RC) 2-9 3-16 7-2
definition A-6
repeat end address register (RE) 7-2
repeat mode 7-2
  block (RPTB) 7-2
  block delayed (RPTBD) 7-2
  control bits 7-3
definition A-6
  nesting 7-8
  operation 7-3
RC value after completion 7-7
restriction rules 7-6
RPTB instruction 7-4
RPTBD instruction 7-4
RPTS instruction 7-5
repeat mode flag 3-6
repeat mode registers 7-2
repeat start address register (RS) 7-2
repeat-mode control algorithm 7-4
example 7-4
reserved bits 3-16
reset 7-29
  additional operations 7-35
communication ports 12-29
definition A-6
reset (continued)
  memory interface control registers 9-6
  pin states 7-29
  vector location 7-35
  vectors 7-28
RESET pin 12-29
reset pin
definition A-6
RESETLOC pins 10-10
RETcond instruction 7-13 7-24 14-213
RETcondD instruction 7-13 7-24 14-214
RETScond instruction 7-12 14-215
returns 7-12
RM bit 7-3
RM flag 3-6
RND instruction 5-29 14-216
ROL instruction 14-218
ROLC instruction 14-219
ROM 2-11 2-13
ROMEN 2-13
definition A-6
ROMEN pin 2-11 4-2
ROR instruction 14-221
RORC instruction 14-222
rotating priority 11-22
rotating priority DMA
read and write sequence 11-23
rotating priority mode
figure 11-23
rounding of floating-point value 5-29
RPTB instruction 7-2 7-8 8-8 14-223
pipeline conflict 7-7
RPTB operation
oxample 7-4
RPTBD instruction 7-2 7-8 14-224
RPTS execution
teps 7-6
RPTS instruction 7-2 8-5 14-226
RSQRF instruction
algorithm
figure 5-37
S
S bit 7-3
semaphores 9-43
service sequence
split mode priority 11-26
SET COND bit 3-7
short floating-point format
definition A-7
short integer format
definition A-7
short unsigned-integer format
definition A-7
SIGI instruction 9-39, 9-47, 14-229
signed-integer formats 5-2
sign-extend
definition A-7
single-precision floating-point format
definition A-7
single-precision integer format
definition A-7
single-precision unsigned-integer format
definition A-7
software interrupt
definition A-7
source address register 11-7
source address-index register 11-7
source and destination synchronization 11-49
speech/voice applications viii, xi
split mode 3-10
definition A-7
split mode (DMA) 11-20, 11-35
split-mode DMA configuration
figure 11-21
split-mode synchronization interrupts
table 3-11
stack
definition A-7
stack pointer (SP) 2-5, 3-5
standard (nondelayed) branches 8-4
standard branch 7-9
example 8-5
START field descriptions
table 11-14
state diagram
port arbitration unit 12-12
STATUS field descriptions
table 11-14
status register (ST)
definition A-7
figure 3-5
STF instruction 14-230
STF||STF instruction 14-234
STFI instruction 9-39 9-46 14-232
STI instruction 14-235
STII instruction 14-237
STIK instruction 14-239
STRB ACTIVE 9-8
STRB SWITCH 9-8
STRB0 PAGESIZE 9-8
STRB0 SWW 9-8
STRB0 WTCNT 9-8
STRB1 PAGESIZE 9-8
STRB1 SWW 9-8
STRB1 WTCNT 9-8
STRBx PAGESIZE fields
figure 9-13
 strobe settings 9-7
 strobes 9-12
 timing 9-16
 style (manual) iv
SUBB instruction 14-240
SUBB3 instruction 14-242
SUBC instruction 14-244
SUBF instruction 14-246
SUBF3 instruction 14-248
SUBF3||STF instruction 14-250
SUBI instruction 14-252
SUBI3 14-254
SUBI3||STI instruction 14-256
SUBRB instruction 14-258
SUBRF instruction 14-260
SUBRI instruction 14-262
 subtraction
 floating-point 5-23 5-25
 SWI instruction 14-264
 symbols 14-16
 symbols (used in manual) iv
 sync mode
 transfer rate 11-55
 SYNC MODE and AUTOINIT MODE bits
 autoinitialization
 table 11-38
 SYNC MODE bits 11-46
SYNC MODE field descriptions
- split mode
  - table 11-13
- unified mode
  - table 11-13
- synchronization 11-37
  - destination 11-48
  - DMA channels 11-43
  - source 11-48
  - source and destination 11-49
- synchronization interrupts
  - DMA channels 3-9
- synchronizer delays 12-28
- synchronizers 12-26

T

- task counter example 9-42
- TCLK 13-4
- TCLK as an input
  - figure 13-15
- TCLK as an output
  - figure 13-15
- technical assistance xvi
- telecommunications applications viii, xiii
- three-operand addressing modes 2-18, 6-22
  - encoding for type 1 6-24
- three-operand instruction word 8-20
- three-operand instructions
  - table 14-6
- timer
  - definition A-7
  - interrupts considerations 13-12
- timer clock setup
  - maximum setup 13-16
- timer configuration
  - CLKSRC = 0 and FUNC = 0 13-14
  - CLKSRC = 0 and FUNC = 1 13-14
  - CLKSRC = 1 and FUNC = 0 13-13
  - CLKSRC = 1 and FUNC = 1 13-13
- timer control register bit summary
  - C/P 13-7
  - CLKSRC 13-7
  - DATIN 13-6
- timer control register bit summary (continued)
  - DATOUT 13-6
  - FUNC 13-8
TMS320C4x devices 1-2
TOIEEE instruction 14-265
TOIEEE||STF instruction 14-266
token 12-3
token transfer
  figure 12-20
  operation 12-5 12-19
token transfer sequence
  table 12-21
transfer counter registers
  figure 11-17
TRANSFER MODE = 002
  running 11-28
TRANSFER MODE = 012
  running 11-29
TRANSFER MODE = 102
  running 11-29
TRANSFER MODE = 112
  running 11-31
TRANSFER MODE field descriptions 11-28
  table 11-12
transfer rate
  sync mode 11-55
trap flow
  figure 7-24
TRAP instruction 7-25
trap vector table (TVT)
  boot loader use
    definition A-7
  to 10-9 to 10-10
trap vector table pointer (TVTP)
  definition A-7
TRAPcond instruction 7-12 14-267
traps 7-12 7-24
  initialization 7-24
  operation 7-24
  overlapping the TVT and IVT vector table 7-25
TSTB instruction 14-268
TSTB3 instruction 14-270
TVTP 3-17

See also trap vector table (TVT)
TVTP register 2-10

two parallel stores
  figure 8-22

two-operand instruction word 8-20

two-operand instructions
  table 14-4

type one synchronizer
  maximum delay 12-26
  minimum delay 12-26

type three synchronizer
  maximum delay 12-28
  minimum delay 12-27

type two synchronizer
  maximum delay 12-27
  minimum delay 12-27

TYPEX bit 3-14

UF flag 3-5

underflow 5-23

unified mode
  definition A-8

unified mode (DMA) 11-19, 11-35

unsigned-integer formats 5-3

V flag 3-5

value in a floating-point number
  equation 5-4

vector locations
  table 7-35

vectors (reset, interrupts) 7-28

wait state
  definition A-8

wait states 9-14, 9-36, 9-37

bus disabled 9-38

wait-state generator
  definition A-8

word transfer
  operation 12-22, 12-23

word transfer sequence
  table 12-24
word transfers \(11-5\)
write to AR \(8-8\)

\(\textbf{X}\)
XOR instruction \(14-272\)
XOR3 instruction \(14-274\)
XOR3|STI instruction \(14-276\)

\(\textbf{Z}\)
Z flag \(3-5\)
zero fill
definition \(A-8\)