SPRAB89A September 2011 – March 2014

 

  1. Introduction
    1. 1.1  ABIs for the C6000
    2. 1.2  Scope
    3. 1.3  ABI Variants
    4. 1.4  Toolchains and Interoperability
    5. 1.5  Libraries
    6. 1.6  Types of Object Files
    7. 1.7  Segments
    8. 1.8  C6000 Architecture Overview
    9. 1.9  Reference Documents
    10. 1.10 Code Fragment Notation
  2. Data Representation
    1. 2.1 Basic Types
    2. 2.2 Data in Registers
    3. 2.3 Data in Memory
    4. 2.4 Complex Types
    5. 2.5 Structures and Unions
    6. 2.6 Arrays
    7. 2.7 Bit Fields
      1. 2.7.1 Volatile Bit Fields
    8. 2.8 Enumeration Types
  3. Calling Conventions
    1. 3.1 Call and Return
      1. 3.1.1 Return Address Computation
      2. 3.1.2 Call Instructions
      3. 3.1.3 Return Instruction
      4. 3.1.4 Pipeline Conventions
      5. 3.1.5 Weak Functions
    2. 3.2 Register Conventions
    3. 3.3 Argument Passing
    4. 3.4 Return Values
    5. 3.5 Structures and Unions Passed and Returned by Reference
    6. 3.6 Conventions for Compiler Helper Functions
    7. 3.7 Scratch Registers for Inter-Section Calls
    8. 3.8 Setting Up DP
  4. Data Allocation and Addressing
    1. 4.1 Data Sections and Segments
    2. 4.2 Allocation and Addressing of Static Data
      1. 4.2.1 Addressing Methods for Static Data
        1. 4.2.1.1 Near DP-Relative Addressing
        2. 4.2.1.2 Far DP-Relative Addressing
        3. 4.2.1.3 Absolute Addressing
        4. 4.2.1.4 GOT-Indirect Addressing
        5. 4.2.1.5 PC-Relative Addressing
      2. 4.2.2 Placement Conventions for Static Data
        1. 4.2.2.1 Abstract Conventions for Placement
        2. 4.2.2.2 Abstract Conventions for Addressing
        3. 4.2.2.3 Linker Requirements
      3. 4.2.3 Initialization of Static Data
    3. 4.3 Automatic Variables
    4. 4.4 Frame Layout
      1. 4.4.1 Stack Alignment
      2. 4.4.2 Register Save Order
        1. 4.4.2.1 Big-Endian Pair Swapping
        2. 4.4.2.2 Examples
      3. 4.4.3 DATA_MEM_BANK
      4. 4.4.4 C64x+ Specific Stack Layouts
        1. 4.4.4.1 __C6000_push_rts Layout
        2. 4.4.4.2 Compact Frame Layout
    5. 4.5 Heap-Allocated Objects
  5. Code Allocation and Addressing
    1. 5.1 Computing the Address of a Code Label
      1. 5.1.1 Absolute Addressing for Code
      2. 5.1.2 PC-Relative Addressing
      3. 5.1.3 PC-Relative Addressing Within the Same Section
      4. 5.1.4 Short-Offset PC-Relative Addressing (C64x)
      5. 5.1.5 GOT-Based Addressing for Code
    2. 5.2 Branching
    3. 5.3 Calls
      1. 5.3.1 Direct PC-Relative Call
      2. 5.3.2 Far Call Trampoline
      3. 5.3.3 Indirect Calls
    4. 5.4 Addressing Compact Instructions
  6. Addressing Model for Dynamic Linking
    1. 6.1 Terms and Concepts
    2. 6.2 Overview of Dynamic Linking Mechanisms
    3. 6.3 DSOs and DLLs
    4. 6.4 Preemption
    5. 6.5 PLT Entries
      1. 6.5.1 Direct Calls to Imported Functions
      2. 6.5.2 PLT Entry Via Absolute Address
      3. 6.5.3 PLT Entry Via GOT
    6. 6.6 The Global Offset Table
      1. 6.6.1 GOT-Based Reference Using Near DP-Relative Addressing
      2. 6.6.2 GOT-Based Reference Using Far DP-Relative Addressing
    7. 6.7 The DSBT Model
      1. 6.7.1 Entry/Exit Sequence for Exported Functions
      2. 6.7.2 Avoiding DP Loads for Internal Functions
      3. 6.7.3 Function Pointers
      4. 6.7.4 Interrupts
      5. 6.7.5 Compatibility With Non-DSBT Code
    8. 6.8 Performance Implications of Dynamic Linking
  7. Thread-Local Storage Allocation and Addressing
    1. 7.1 About Multi-Threading and Thread-Local Storage
    2. 7.2 Terms and Concepts
    3. 7.3 User Interface
    4. 7.4 ELF Object File Representation
    5. 7.5 TLS Access Models
      1. 7.5.1 C6x Linux TLS Models
        1. 7.5.1.1 General Dynamic TLS Access Model
        2. 7.5.1.2 Local Dynamic TLS Access Model
        3. 7.5.1.3 Initial Exec TLS Access Model
          1. 7.5.1.3.1 Thread Pointer
          2. 7.5.1.3.2 Initial Exec TLS Addressing
        4. 7.5.1.4 Local Exec TLS Access Model
      2. 7.5.2 Static Executable TLS Model
        1. 7.5.2.1 Static Executable Addressing
        2. 7.5.2.2 Static Executable TLS Runtime Architecture
        3. 7.5.2.3 Static Executable TLS Allocation
          1. 7.5.2.3.1 TLS Initialization Image Allocation
          2. 7.5.2.3.2 Main Thread’s TLS Allocation
          3. 7.5.2.3.3 Thread Library’s TLS Region Allocation
        4. 7.5.2.4 Static Executable TLS Initialization
          1. 7.5.2.4.1 Main Thread’s TLS Initialization
          2. 7.5.2.4.2 TLS Initialization by Thread Library
        5. 7.5.2.5 Thread Pointer
      3. 7.5.3 Bare-Metal Dynamic Linking TLS Model
        1. 7.5.3.1 Default TLS Addressing for Bare-Metal Dynamic Linking
        2. 7.5.3.2 TLS Block Creation
    6. 7.6 Thread-Local Symbol Resolution and Weak References
      1. 7.6.1 General and Local Dynamic TLS Weak Reference Addressing
      2. 7.6.2 Initial and Local Executable TLS Weak Reference Addressing
      3. 7.6.3 Static Exec and Bare Metal Dynamic TLS Model Weak References
  8. Helper Function API
    1. 8.1 Floating-Point Behavior
    2. 8.2 C Helper Function API
    3. 8.3 Special Register Conventions for Helper Functions
    4. 8.4 Helper Functions for Complex Types
    5. 8.5 Floating-Point Helper Functions for C99
  9. Standard C Library API
    1. 9.1  Reserved Symbols
    2. 9.2  <assert.h> Implementation
    3. 9.3  <complex.h> Implementation
    4. 9.4  <ctype.h> Implementation
    5. 9.5  <errno.h> Implementation
    6. 9.6  <float.h> Implementation
    7. 9.7  <inttypes.h> Implementation
    8. 9.8  <iso646.h> Implementation
    9. 9.9  <limits.h> Implementation
    10. 9.10 <locale.h> Implementation
    11. 9.11 <math.h> Implementation
    12. 9.12 <setjmp.h> Implementation
    13. 9.13 <signal.h> Implementation
    14. 9.14 <stdarg.h> Implementation
    15. 9.15 <stdbool.h> Implementation
    16. 9.16 <stddef.h> Implementation
    17. 9.17 <stdint.h> Implementation
    18. 9.18 <stdio.h> Implementation
    19. 9.19 <stdlib.h> Implementation
    20. 9.20 <string.h> Implementation
    21. 9.21 <tgmath.h> Implementation
    22. 9.22 <time.h> Implementation
    23. 9.23 <wchar.h> Implementation
    24. 9.24 <wctype.h> Implementation
  10. C++ ABI
    1. 10.1  Limits (GC++ABI 1.2)
    2. 10.2  Export Template (GC++ABI 1.4.2)
    3. 10.3  Data Layout (GC++ABI Chapter 2)
    4. 10.4  Initialization Guard Variables (GC++ABI 2.8)
    5. 10.5  Constructor Return Value (GC++ABI 3.1.5)
    6. 10.6  One-Time Construction API (GC++ABI 3.3.2)
    7. 10.7  Controlling Object Construction Order (GC++ ABI 3.3.4)
    8. 10.8  Demangler API (GC++ABI 3.4)
    9. 10.9  Static Data (GC++ ABI 5.2.2)
    10. 10.10 Virtual Tables and the Key function (GC++ABI 5.2.3)
    11. 10.11 Unwind Table Location (GC++ABI 5.3)
  11. Exception Handling
    1. 11.1  Overview
    2. 11.2  PREL31 Encoding
    3. 11.3  The Exception Index Table (EXIDX)
      1. 11.3.1 Pointer to Out-of-Line EXTAB Entry
      2. 11.3.2 EXIDX_CANTUNWIND
      3. 11.3.3 Inlined EXTAB Entry
    4. 11.4  The Exception Handling Instruction Table (EXTAB)
      1. 11.4.1 EXTAB Generic Model
      2. 11.4.2 EXTAB Compact Model
      3. 11.4.3 Personality Routines
    5. 11.5  Unwinding Instructions
      1. 11.5.1 Common Sequence
      2. 11.5.2 Byte-Encoded Unwinding Instructions
      3. 11.5.3 24-Bit Unwinding Encoding
    6. 11.6  Descriptors
      1. 11.6.1 Encoding of Type Identifiers
      2. 11.6.2 Scope
      3. 11.6.3 Cleanup Descriptor
      4. 11.6.4 Catch Descriptor
      5. 11.6.5 Function Exception Specification (FESPEC) Descriptor
    7. 11.7  Special Sections
    8. 11.8  Interaction With Non-C++ Code
      1. 11.8.1 Automatic EXIDX Entry Generation
      2. 11.8.2 Hand-Coded Assembly Functions
    9. 11.9  Interaction With System Features
      1. 11.9.1 Shared Libraries
      2. 11.9.2 Overlays
      3. 11.9.3 Interrupts
    10. 11.10 Assembly Language Operators in the TI Toolchain
  12. DWARF
    1. 12.1 DWARF Register Names
    2. 12.2 Call Frame Information
    3. 12.3 Vendor Names
    4. 12.4 Vendor Extensions
  13. ELF Object Files (Processor Supplement)
    1. 13.1 Registered Vendor Names
    2. 13.2 ELF Header
    3. 13.3 Sections
      1. 13.3.1 Section Indexes
      2. 13.3.2 Section Types
      3. 13.3.3 Extended Section Header Attributes
      4. 13.3.4 Subsections
      5. 13.3.5 Special Sections
      6. 13.3.6 Section Alignment
    4. 13.4 Symbol Table
      1. 13.4.1 Symbol Types
      2. 13.4.2 Common Block Symbols
      3. 13.4.3 Symbol Names
      4. 13.4.4 Reserved Symbol Names
      5. 13.4.5 Mapping Symbols
    5. 13.5 Relocation
      1. 13.5.1 Relocation Types
      2. 13.5.2 Relocation Operations
      3. 13.5.3 Relocation of Unresolved Weak References
  14. ELF Program Loading and Dynamic Linking (Processor Supplement)
    1. 14.1 Program Header
      1. 14.1.1 Base Address
      2. 14.1.2 Segment Contents
      3. 14.1.3 Bound and Read-Only Segments
      4. 14.1.4 Thread-Local Storage
    2. 14.2 Program Loading
    3. 14.3 Dynamic Linking
      1. 14.3.1 Program Interpreter
      2. 14.3.2 Dynamic Section
      3. 14.3.3 Shared Object Dependencies
      4. 14.3.4 Global Offset Table
      5. 14.3.5 Procedure Linkage Table
      6. 14.3.6 Preemption
      7. 14.3.7 Initialization and Termination
    4. 14.4 Bare-Metal Dynamic Linking Model
      1. 14.4.1 File Types
      2. 14.4.2 ELF Identification
      3. 14.4.3 Visibility and Binding
      4. 14.4.4 Data Addressing
      5. 14.4.5 Code Addressing
      6. 14.4.6 Dynamic Information
  15. Linux ABI
    1. 15.1  File Types
    2. 15.2  ELF Identification
    3. 15.3  Program Headers and Segments
    4. 15.4  Data Addressing
      1. 15.4.1 Data Segment Base Table (DSBT)
      2. 15.4.2 Global Offset Table (GOT)
    5. 15.5  Code Addressing
    6. 15.6  Lazy Binding
    7. 15.7  Visibility
    8. 15.8  Preemption
    9. 15.9  Import-as-Own Preemption
    10. 15.10 Program Loading
    11. 15.11 Dynamic Information
    12. 15.12 Initialization and Termination Functions
    13. 15.13 Summary of the Linux Model
  16. Symbol Versioning
    1. 16.1 ELF Symbol Versioning Overview
    2. 16.2 Version Section Identification
  17. Build Attributes
    1. 17.1 C6000 ABI Build Attribute Subsection
    2. 17.2 C6000 Build Attribute Tags
  18. Copy Tables and Variable Initialization
    1. 18.1 Copy Table Format
    2. 18.2 Compressed Data Formats
      1. 18.2.1 RLE
      2. 18.2.2 LZSS Format
    3. 18.3 Variable Initialization
  19. Extended Program Header Attributes
    1. 19.1 Encoding
    2. 19.2 Attribute Tag Definitions
    3. 19.3 Extended Program Header Attributes Section Format
  20. Revision History

General Dynamic TLS Access Model

This is the most generic TLS access model. Objects using this access model can be used to build any Linux module: executables, initially loaded modules, and dlopened modules. The generated code for this model cannot assume the module-id or the offset is known during static linking.

With this access model, a dynamic module can be loaded at run time. To allow for this possibility, the thread library’s thread management architecture must provide a way for TLS blocks to be added and removed as dynamic modules are loaded and unloaded at run time.
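The requirement above can be pictured with a minimal host-side sketch. It is not part of this ABI or of any particular thread library. It assumes a per-thread table of TLS block pointers indexed by module-id (the dtv used later in this section), a fixed hypothetical capacity MODEL_MAX_MODULES, and the hypothetical helper names model_dtv_add and model_dtv_remove. A real implementation would also copy the module's TLS initialization image into the new block; the sketch only zero-fills it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_MODULES 8   /* hypothetical fixed capacity for the sketch */

/* Hypothetical per-thread table: one TLS block pointer per module-id. */
struct model_dtv {
    void *block[MODEL_MAX_MODULES];
};

/* Called when a module with TLS is loaded: allocate a zeroed block for
   this thread. Returns 0 on success, -1 on a bad or already-used id or
   on allocation failure. */
static int model_dtv_add(struct model_dtv *dtv, unsigned module_id, size_t size)
{
    if (module_id >= MODEL_MAX_MODULES || dtv->block[module_id] != NULL)
        return -1;
    dtv->block[module_id] = calloc(1, size);
    return dtv->block[module_id] != NULL ? 0 : -1;
}

/* Called when the module is unloaded: release the block and clear the slot. */
static void model_dtv_remove(struct model_dtv *dtv, unsigned module_id)
{
    if (module_id < MODEL_MAX_MODULES) {
        free(dtv->block[module_id]);
        dtv->block[module_id] = NULL;
    }
}
```

In this model a loader would call model_dtv_add for each thread when a module is loaded and model_dtv_remove when it is unloaded.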

The compiler generates a call to __tls_get_addr() to get the address of the thread-local variable. The module-id and the thread-local variable’s offset in the module’s TLS block are passed as parameters. The code obtains the module-id and offset from Global Offset Table (GOT) entries to preserve position independence (PIC) and to allow symbol preemption.

The simplest way to pass the module-id and offset to the __tls_get_addr() function is as follows:

    void * __tls_get_addr(unsigned int module_id, ptrdiff_t offset); 

Note that both are 32-bit arguments, and the GOT entries are also 32-bit entries. As an optimization, we can load these two GOT entries as a 64-bit double word if the ISA supports this. The two GOT entries must be allocated consecutively and aligned to a 64-bit boundary. This GOT entity can be thought of as the following struct:

    struct TLS_descriptor
    {
        unsigned int module_id;
        ptrdiff_t offset;
    } __attribute__ ((aligned (8)));

Then the __tls_get_addr() interface becomes:

    void * __tls_get_addr(struct TLS_descriptor); 

In this EABI, a struct of size 64 bits or less is passed by value, resulting in passing the TLS descriptor in the A5:A4 register pair. In little-endian mode, the module-id is passed in A4 and the offset is in A5. In big-endian mode, the registers are swapped as per the C6x EABI calling conventions. The examples in this section use little-endian mode.
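The layout and register assignment just described can be checked with a small host-side sketch. It is illustrative only and makes several assumptions: a GCC-compatible compiler (for the aligned attribute), fixed-width types standing in for the 32-bit unsigned int and ptrdiff_t of C6x, and the hypothetical names reg_pair and descriptor_to_regs, which are not defined by this ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model of the descriptor. Fixed-width fields keep the layout
   identical on a 64-bit host, where ptrdiff_t would be 64 bits wide. */
struct TLS_descriptor {
    uint32_t module_id;   /* first GOT entry, GOT[n]    */
    int32_t  offset;      /* second GOT entry, GOT[n+1] */
} __attribute__((aligned(8)));

/* A single 64-bit load can fetch both fields only if they are adjacent
   and the pair starts on an 8-byte boundary. */
_Static_assert(sizeof(struct TLS_descriptor) == 8, "descriptor is 64 bits");
_Static_assert(offsetof(struct TLS_descriptor, offset) == 4, "fields adjacent");
_Static_assert(_Alignof(struct TLS_descriptor) == 8, "8-byte aligned");

/* Hypothetical model of the A5:A4 register pair. */
struct reg_pair {
    uint32_t a4;
    uint32_t a5;
};

/* Map the descriptor to A5:A4 as the text describes: module-id in A4 and
   offset in A5 for little-endian, swapped for big-endian. */
static struct reg_pair descriptor_to_regs(struct TLS_descriptor d, int big_endian)
{
    struct reg_pair r;
    if (big_endian) {
        r.a5 = d.module_id;
        r.a4 = (uint32_t)d.offset;
    } else {
        r.a4 = d.module_id;
        r.a5 = (uint32_t)d.offset;
    }
    return r;
}
```

The compile-time checks state the two properties the 64-bit load relies on: adjacency of the fields and 8-byte alignment of the pair.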

Using this interface, the thread-local access becomes the following (for C64 and above):

        LDDW   *+DP($GOT_TLS(X)), A5:A4  ;reloc R_C6000_SBR_GOT_U15_D_TLS
     || CALLP  __tls_get_addr,B3          ; A4 has the address of X at return
        LDW    *A4, A4                   ; A4 has the value of X

The relocation R_C6000_SBR_GOT_U15_D_TLS causes the linker to create GOT entries for the module-id and offset of X as follows:

64-bit aligned address:
        GOT[n]            ;reloc R_C6000_TLSMOD  (symbol X)
        GOT[n+1]          ;reloc R_C6000_TBR_U32 (symbol X)

The linker then resolves the R_C6000_SBR_GOT_U15_D_TLS relocation with the DP-relative offset of the GOT entity. The dynamic loader resolves R_C6000_TLSMOD to the module-id of the module where X is defined. It resolves R_C6000_TBR_U32 to the offset of X in the module’s TLS block.

The C6x ISA does not currently have an instruction to load the 64-bit TLS descriptor directly. However, we define the __tls_get_addr() interface using the 64-bit descriptor in anticipation of a future ISA having such support.

    void * __tls_get_addr(struct TLS_descriptor);

The linker is required to allocate the GOT entries of a thread-local variable’s module-id and offset consecutively and align the first entry to a 64-bit boundary when the R_C6000_SBR_GOT_U15_D_TLS relocation is found.

Lacking support for a DP-relative 64-bit load, the following sequence can be used on current ISAs:

        LDW    *+DP($GOT_TLSMOD(X)), A5   ;reloc R_C6000_SBR_GOT_U15_W_TLSMOD
        LDW    *+DP($GOT_TBR(X)), A4      ;reloc R_C6000_SBR_GOT_U15_W_TBR
     || CALLP  __tls_get_addr,B3          ; A4 has the address of X at return
        LDW    *A4, A4                    ; A4 has the value of X

The relocations R_C6000_SBR_GOT_U15_W_TLSMOD and R_C6000_SBR_GOT_U15_W_TBR cause the linker to create GOT entries for the module-id and offset of X, respectively. This access sequence does not require these GOT entries to be consecutive or 64-bit aligned. If the linker does not also see a DW_TLS relocation (R_C6000_SBR_GOT_U15_D_TLS) for the same symbol, it is free to allocate the module-id and offset GOT entries separately, without 64-bit alignment. However, if it sees a DW_TLS relocation in addition to the TLSMOD/TBR relocations for the same symbol, it must allocate consecutive, 64-bit aligned GOT entries and reuse them for the TLSMOD/TBR relocations.
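The allocation rule can be sketched as a small piece of linker bookkeeping. This is a hedged illustration, not the behavior of any actual linker: the names tls_got_slots and alloc_tls_got are hypothetical, GOT slots are modeled as 32-bit word indexes, and the GOT base is assumed to be 8-byte aligned so that an even index is a 64-bit boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping: GOT slot indexes for one thread-local symbol.
   Slot n is the 32-bit word at byte offset 4*n from the GOT base. */
struct tls_got_slots {
    size_t tlsmod;   /* slot relocated with R_C6000_TLSMOD  */
    size_t tbr;      /* slot relocated with R_C6000_TBR_U32 */
};

/* Allocate the two slots for a symbol. *next_free is the next unused slot
   index and is advanced. needs_pair is true when a
   R_C6000_SBR_GOT_U15_D_TLS relocation was seen for the symbol. */
static struct tls_got_slots alloc_tls_got(size_t *next_free, bool needs_pair)
{
    struct tls_got_slots s;

    /* A 64-bit pair must start on an even slot (8-byte boundary). */
    if (needs_pair && (*next_free % 2) != 0)
        (*next_free)++;

    /* Without the pair requirement the entries need not be adjacent or
       aligned; this sketch simply takes the next two free slots. */
    s.tlsmod = (*next_free)++;
    s.tbr    = (*next_free)++;
    return s;
}
```

When both relocation forms refer to the same symbol, the word-sized TLSMOD/TBR references would resolve to the same two slots returned for the aligned pair, rather than to a second allocation.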

If the GOT must be addressed using far-DP addressing, then the general dynamic addressing becomes:

        MVKL $DPR_GOT_TLSMOD(X),  A5      ;reloc R_C6000_SBR_GOT_L16_W_TLSMOD
        MVKH $DPR_GOT_TLSMOD(X),  A5      ;reloc R_C6000_SBR_GOT_H16_W_TLSMOD
        ADD  DP, A5, A5
        LDW  *A5, A5
        MVKL $DPR_GOT_TBR(X),  A4         ;reloc R_C6000_SBR_GOT_L16_W_TBR
        MVKH $DPR_GOT_TBR(X),  A4         ;reloc R_C6000_SBR_GOT_H16_W_TBR
        ADD  DP, A4, A4
        LDW  *A4, A4
     || CALLP  __tls_get_addr,B3          ; A4 has the address of X at return
        LDW    *A4, A4                    ; A4 has the value of X

__tls_get_addr() can calculate the thread-local address as follows:

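    void * __tls_get_addr(struct TLS_descriptor desc)
    {
        /* Thread pointer of the current thread. */
        void *TP  = __c6xabi_get_tp();
        /* The first word at TP holds the address of the dynamic thread vector (DTV). */
        int  *dtv = (int*)(((int*) TP)[0]);
        /* The DTV entry indexed by module-id is the base of that module's TLS block. */
        char *tls = (char *)dtv[desc.module_id];
        /* Add the variable's offset within the block. */
        return tls + desc.offset;
    }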
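
The listing above relies on 32-bit pointers and on __c6xabi_get_tp(), so it cannot be exercised directly on a host machine. The following is a minimal host-side model of the same lookup, offered as a sketch only. It assumes the thread pointer is passed in explicitly, and the names model_descriptor, model_tcb, and model_tls_get_addr are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host model of the descriptor: module-id and offset, both 32 bits. */
struct model_descriptor {
    uint32_t module_id;
    int32_t  offset;
};

/* Host model of the memory the thread pointer refers to: its first word
   holds the address of the dtv, as in the listing above. */
struct model_tcb {
    char **dtv;   /* dtv[module_id] is the base of that module's TLS block */
};

/* Same steps as __tls_get_addr(): find the dtv through the thread pointer,
   select the module's TLS block, then add the variable's offset. */
static void *model_tls_get_addr(const struct model_tcb *tp,
                                struct model_descriptor desc)
{
    char *tls = tp->dtv[desc.module_id];
    return tls + desc.offset;
}
```

With a dtv whose entry 2 points at a 32-byte block, a descriptor of module-id 2 and offset 8 yields the address 8 bytes into that block.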