SPRUIV4C May 2020 – December 2023

 


If Conversion

A loop can be software pipelined (and thus run faster) only if the sole branch in the loop is the branch back to the top of the loop. Branches generated for if-then and if-then-else statements, or for other control-flow constructs, prevent software pipelining.

To get around this limitation, the compiler performs if-conversion. If-conversion attempts to remove the branches associated with if-then and if-then-else statements by predicating instructions, so that each instruction executes conditionally depending on the test in the "if" statement. If-conversion usually succeeds as long as the if-then or if-then-else statements do not have too many nesting levels, too many condition terms, or too many instructions.
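To illustrate why nesting levels and condition terms matter, the following is a minimal host-side C++ sketch rather than compiler output. The function names classify_branching and classify_predicated and the predicate names are hypothetical and chosen only for this illustration. The ternary selects merely model predicated execution at the source level. The point is that each nested "if" adds a term to the predicate guarding an assignment, so deeper nesting requires more predicate computation.

```cpp
#include <cstdint>

// Branching form: two nesting levels, three possible outcomes.
int32_t classify_branching(int32_t a, int32_t b)
{
    int32_t result;
    if (a < b) {
        if (a < 0)
            result = b - a;
        else
            result = a + b;
    } else {
        result = 0;
    }
    return result;
}

// Source-level model of the if-converted form (an illustration only).
// Control flow is replaced by predicates; each assignment is guarded
// by the conjunction of the conditions on the path that reaches it.
int32_t classify_predicated(int32_t a, int32_t b)
{
    const bool p_outer = (a < b);   // outer "if" test
    const bool p_inner = (a < 0);   // inner "if" test

    const bool p_diff = p_outer && p_inner;    // inner "then" path
    const bool p_sum  = p_outer && !p_inner;   // inner "else" path
    const bool p_zero = !p_outer;              // outer "else" path

    int32_t result = 0;
    result = p_diff ? (b - a) : result;   // models a guarded subtract
    result = p_sum  ? (a + b) : result;   // models a guarded add
    result = p_zero ? 0       : result;   // models a guarded move of 0
    return result;
}
```

Exactly one of the three predicates is true for any input, so both functions return the same value. The predicated form has no branch for the loop scheduler to work around, at the cost of computing and combining the predicates.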

The following example demonstrates if-conversion. In order to software pipeline the "for" loop in this C++ code, if-conversion must be performed. The pragmas are used to prevent the compiler from vectorizing and generating additional code that is not important for this example.

// if_conversion.cpp
// Compile with "cl7x -mv7100 --opt_level=3 --debug_software_pipeline
// --src_interlist --symdebug:none if_conversion.cpp"

void function_1(int * restrict a, int *restrict b, int *restrict out, int n)
{
    #pragma UNROLL(1)
    #pragma MUST_ITERATE(1024, ,32)
    for (int i = 0; i < n; i++)
    {
        int result;
        if (a[i] < b[i])
            result = a[i] + b[i];
        else
            result = 0;

        out[i] = result;
    }
}

After compilation, the single scheduled iteration of the loop, shown in the software pipeline information comment block, looks like the following:

;*----------------------------------------------------------------------------*
;*        SINGLE SCHEDULED ITERATION
;*
;*        ||$C$C65||:
;*   0              TICK                               ; [A_U] 
;*   1              SLDW    .D1     *D2++(4),A1       ; [A_D1] |17|  ^ 
;*     ||           SLDW    .D2     *D1++(4),A2       ; [A_D2] |17|  ^ 
;*   2              NOP     0x5     ; [A_B] 
;*   7              CMPGEW  .L1     A2,A1,A0          ; [A_L1] |17|  ^ 
;*   8      [!A0]   ADDW    .D2     A1,A2,D3          ; [A_D2] |17|  ^
;*   9      [ A0]   MVKU32  .S1     0,D3              ; [A_S1] |17|
;*  10              STW     .D1     D3,*D0++(4)       ; [A_D1] |17| 
;*     ||           BNL     .B1     ||$C$C65||        ; [A_B] |9| 
;*  11              ; BRANCHCC OCCURS {||$C$C65||}    ; [] |9| 
;*----------------------------------------------------------------------------

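The instruction [!A0] ADDW .D2 A1,A2,D3 represents the "then" part of the if statement. The instruction [ A0] MVKU32 .S1 0,D3 represents the "else" part of the if statement. The CMPGEW instruction evaluates the condition and puts the result into predicate register A0, which is used to conditionally execute the ADDW and MVKU32 instructions. Because CMPGEW is a greater-than-or-equal comparison, A0 holds the inverse of the source test a[i] < b[i], which is why the "then" instruction is predicated on [!A0] and the "else" instruction on [ A0].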
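The predicated sequence can be modeled on a host machine as follows. This is a sketch under stated assumptions, not generated code: the function names reference_loop and predicated_loop are hypothetical, the mapping of a[i] to A2 and b[i] to A1 is assumed (the listing does not say which pointer is held in D1 or D2), and the CMPGEW operand order is assumed to be "first source >= second source". Local variables are named after the registers in the listing.

```cpp
#include <cstdint>

// Reference: the loop body as written in function_1, with a branch.
void reference_loop(const int32_t* a, const int32_t* b, int32_t* out, int n)
{
    for (int i = 0; i < n; i++) {
        int32_t result;
        if (a[i] < b[i])
            result = a[i] + b[i];
        else
            result = 0;
        out[i] = result;
    }
}

// Model of the single scheduled iteration: one compare produces a
// predicate, and two guarded instructions write the same destination.
void predicated_loop(const int32_t* a, const int32_t* b, int32_t* out, int n)
{
    for (int i = 0; i < n; i++) {
        const int32_t a2 = a[i];      // SLDW (assumed to load a[i])
        const int32_t a1 = b[i];      // SLDW (assumed to load b[i])
        const bool a0 = (a2 >= a1);   // CMPGEW A2,A1,A0 (assumed operand order)

        int32_t d3 = 0;               // initialized only to keep the model well defined
        if (!a0) d3 = a1 + a2;        // [!A0] ADDW   A1,A2,D3  ("then" part)
        if (a0)  d3 = 0;              // [ A0] MVKU32 0,D3      ("else" part)

        out[i] = d3;                  // STW D3,*D0++(4)
    }
}
```

The two "if" statements in the model are not branches in the generated code. Each stands for a single instruction that is issued on every iteration and takes effect only when its predicate is true, which is what leaves the loop with only its backward branch.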