SPRZ491E December 2020 – December 2024 DRA821U, DRA821U-Q1

 


i2163


UDMAP: UDMA transfers with ICNTs and/or src/dst addr NOT aligned to 64B fail when used in "event trigger" mode

Details:

Note: The following description uses a C7x DSP core as an example, but it applies to any other processing core that can program the UDMA.

For DSP algorithm processing on C6x/C7x, software often uses the UDMA in NavSS or the DRU in MSMC. In many cases, UDMA is used instead of DRU because the DRU channels are reserved in many use cases for C7x/MMA deep learning operations. In typical DSP algorithm processing, data is DMA'ed block by block into L2 memory, and the DSP operates on the data in L2 memory instead of operating on it from DDR (through the cache). The typical DMA setup and event trigger for this operation is shown below; it is referred to as "2D trigger and wait" in the following example.

For each "frame":

  1. Set up a TR, typically a 3- or 4-dimensional TR.
    1. Set TYPE = 4D_BLOCK_MOVE_REPACKING_INDIRECTION
    2. Set EVENT_SIZE = ICNT2_DEC
    3. Set TRIGGER0 = GLOBAL0
    4. Set TRIGGER0_TYPE = ICNT2_DEC
    5. Set TRIGGER1 = NONE
    6. ICNT0 x ICNT1 is block width x block height
    7. ICNT2 = number of blocks
    8. ICNT3 = 1
    9. src addr = DDR
    10. dst addr = C6x L2 memory
  2. Submit this TR
    1. This TR starts a transfer on GLOBAL TRIGGER0 and transfers ICNT0xICNT1 bytes, then raises an event
  3. For each block do the following:
    1. Trigger DMA by setting GLOBAL TRIGGER0
    2. Wait for the event that indicates that the block is transferred
    3. Do DSP processing
This is a simplified sequence; in the actual algorithm, there can be multiple channels doing DDR-to-L2 or L2-to-DDR transfers in a "ping-pong" manner, so that DSP processing and DMA run in parallel. The event itself is programmed appropriately in the channel OES registers, and the event status check is done using a free bit in the IA for UDMA.
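The simplified per-frame sequence above can be modeled on a host as follows. This is only a sketch of the control flow, not driver code: `ExTr`, `ex_trigger_and_wait`, `ex_process`, and `ex_run_frame` are hypothetical names, the "DMA" is a plain `memcpy`, source blocks are assumed to be contiguous, and a single destination buffer is reused for every block (no ping-pong).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side model of a "2D trigger and wait" TR.
 * This is NOT a CSL or driver type. */
typedef struct {
    uint16_t icnt0;        /* block width in bytes                  */
    uint16_t icnt1;        /* block height in rows                  */
    uint16_t icnt2;        /* number of blocks in the frame         */
    uint16_t icnt3;        /* 1                                     */
    const uint8_t *src;    /* stands in for the DDR source          */
    uint8_t *dst;          /* stands in for the L2 destination      */
    uint16_t blocks_done;  /* model state: blocks transferred so far */
} ExTr;

/* Models steps 3.1 and 3.2: one trigger moves one ICNT0 x ICNT1 block,
 * and the return value 1 stands in for the completion event.
 * Returns 0 when the frame has no blocks left. */
static int ex_trigger_and_wait(ExTr *tr)
{
    size_t bytes = (size_t)tr->icnt0 * tr->icnt1;

    if (tr->blocks_done >= tr->icnt2) {
        return 0;
    }
    memcpy(tr->dst, tr->src + (size_t)tr->blocks_done * bytes, bytes);
    tr->blocks_done++;
    return 1;
}

/* Placeholder for step 3.3 (DSP processing): sums the block's bytes. */
static uint32_t ex_process(const uint8_t *blk, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += blk[i];
    }
    return sum;
}

/* Runs one frame: "submit" the TR, then trigger, wait, and process per block. */
static uint32_t ex_run_frame(ExTr *tr)
{
    size_t bytes = (size_t)tr->icnt0 * tr->icnt1;
    uint32_t acc = 0;

    tr->blocks_done = 0;                    /* step 2: submit the TR */
    for (uint16_t b = 0; b < tr->icnt2; b++) {
        if (!ex_trigger_and_wait(tr)) {     /* steps 3.1 and 3.2 */
            break;
        }
        acc += ex_process(tr->dst, bytes);  /* step 3.3 */
    }
    return acc;
}
```

In this model every trigger always produces its event; on affected silicon, the wait in step 3.2 is where the software would stall if the first event is never raised.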

When the following conditions occur, the event in step 3.2 is not received for the first trigger:

  • Condition 1: ICNT0xICNT1 is NOT a multiple of 64.
  • Condition 2: src or dst address is NOT a multiple of 64.
  • Condition 3: ICNT0xICNT1 is NOT a multiple of 64 and src/dst address is NOT a multiple of 64.
An ICNT0xICNT1 or src/dst address that is a multiple of only 16B or 32B has the same issue: the event is not received. Only 64B alignment works.

Conditions in which it works:

  • If ICNT0xICNT1 is made a multiple of 64 and the src/dst address is a multiple of 64, the test case passes.
  • If DRU is used instead of UDMA, the test passes. The TR must be submitted to the DRU through the UDMA DRU external channel. With DRU, and with ICNTs and src/dst addr unaligned, the user can trigger and get events as expected when the TR is programmed such that the number of events and the number of triggers in a frame is 1, i.e., ICNT2 = 1 in the above case, or EVENT_SIZE = COMPLETION with the trigger set to NONE. The completion event then occurs as expected. However, this is not feasible for the use cases in question.
The above is an example for "2D trigger and wait"; the same constraint applies to "1D trigger and wait" and "3D trigger and wait":

  • For "1D trigger and wait", ICNT0 MUST be a multiple of 64
  • For "3D trigger and wait", ICNT0xICNT1xICNT2 MUST be a multiple of 64
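The conditions above can be collected into a small check that software might run before submitting a TR. This is a minimal sketch under stated assumptions: `ExWaitMode`, `ex_bytes_per_trigger`, and `ex_hits_i2163` are hypothetical names, not CSL APIs, and only the base src/dst addresses are checked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical selector for the "trigger and wait" dimension. */
typedef enum {
    EX_WAIT_1D = 1,  /* one trigger moves ICNT0 bytes                 */
    EX_WAIT_2D = 2,  /* one trigger moves ICNT0 x ICNT1 bytes         */
    EX_WAIT_3D = 3   /* one trigger moves ICNT0 x ICNT1 x ICNT2 bytes */
} ExWaitMode;

/* Bytes transferred per trigger for the given mode. */
static uint64_t ex_bytes_per_trigger(ExWaitMode mode, uint32_t icnt0,
                                     uint32_t icnt1, uint32_t icnt2)
{
    uint64_t bytes = icnt0;

    if (mode >= EX_WAIT_2D) {
        bytes *= icnt1;
    }
    if (mode >= EX_WAIT_3D) {
        bytes *= icnt2;
    }
    return bytes;
}

/* Returns true when the configuration matches the failing conditions:
 * the per-trigger byte count or either base address is not a multiple
 * of 64. Multiples of 16 or 32 still count as failing. */
static bool ex_hits_i2163(ExWaitMode mode, uint32_t icnt0, uint32_t icnt1,
                          uint32_t icnt2, uint64_t src, uint64_t dst)
{
    bool size_unaligned =
        (ex_bytes_per_trigger(mode, icnt0, icnt1, icnt2) % 64u) != 0u;
    bool addr_unaligned = ((src % 64u) != 0u) || ((dst % 64u) != 0u);

    return size_unaligned || addr_unaligned;
}
```

For example, a 2D transfer with ICNT0 = 48 and ICNT1 = 3 moves 144 bytes per trigger, which is a multiple of 16 but not of 64, so the check reports it as affected even when both addresses are 64B aligned.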

Workaround(s):

Set the EOL flag in the TR for UDMAP as shown in the following examples:

  • 1D trigger and wait
    • TR.FLAGS |= CSL_FMK(UDMAP_TR_FLAGS_EOL, CSL_UDMAP_TR_FLAGS_EOL_ICNT0);
  • 2D trigger and wait
    • TR.FLAGS |= CSL_FMK(UDMAP_TR_FLAGS_EOL, CSL_UDMAP_TR_FLAGS_EOL_ICNT0_ICNT1);
  • 3D trigger and wait
    • TR.FLAGS |= CSL_FMK(UDMAP_TR_FLAGS_EOL, CSL_UDMAP_TR_FLAGS_EOL_ICNT0_ICNT1_ICNT2);

There is no performance impact due to this workaround.
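The mapping from trigger-and-wait dimension to EOL setting can be wrapped in one helper, sketched below. The `EX_` constants are illustrative stand-ins so that the block compiles on its own; the bit position and encodings are assumptions, and real code should use `CSL_FMK` with the `CSL_UDMAP_TR_FLAGS_EOL_*` definitions shown above. `ex_set_eol` is a hypothetical name.

```c
#include <stdint.h>

/* Illustrative stand-ins only. The shift, width, and values below are
 * ASSUMPTIONS for this sketch; take the real ones from the CSL header. */
#define EX_TR_FLAGS_EOL_SHIFT (28u)
#define EX_TR_FLAGS_EOL_MASK  (0x7u << EX_TR_FLAGS_EOL_SHIFT)

enum {
    EX_EOL_ICNT0             = 1, /* 1D trigger and wait */
    EX_EOL_ICNT0_ICNT1       = 2, /* 2D trigger and wait */
    EX_EOL_ICNT0_ICNT1_ICNT2 = 3  /* 3D trigger and wait */
};

/* Returns `flags` with the EOL field set for a 1D, 2D, or 3D
 * trigger-and-wait TR. Any other `wait_dims` leaves `flags` unchanged. */
static uint32_t ex_set_eol(uint32_t flags, unsigned wait_dims)
{
    uint32_t eol;

    switch (wait_dims) {
    case 1u: eol = EX_EOL_ICNT0;             break;
    case 2u: eol = EX_EOL_ICNT0_ICNT1;       break;
    case 3u: eol = EX_EOL_ICNT0_ICNT1_ICNT2; break;
    default: return flags;
    }
    /* Clear any previous EOL value before writing the new one. */
    return (flags & ~EX_TR_FLAGS_EOL_MASK) |
           ((eol << EX_TR_FLAGS_EOL_SHIFT) & EX_TR_FLAGS_EOL_MASK);
}
```

Unlike the `|=` form, this helper clears the field before writing, so it also behaves correctly if the TR flags already carry a different EOL value.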