SWRZ150 Errata

SWRZ150A December 2024 – October 2025 AWR2544


MSS#71

Single bit ECC (error correction) mechanism can cause an incorrect memory update

Revision(s) Affected:

AWR2544

Description:

Note: The issue was uncovered during the debug of an incorrect memory access sequence in simulations. To date, no such issue has been reported by any customer in field or deployment scenarios.

In the uncommon event of a single bit upset in one of the memory ranges tabulated below, a specific memory access sequence can cause the single bit error correction mechanism to perform an incorrect memory update.

The RAM memories on AWR2544 are ECC protected with a Single bit Error Correction, Double bit Error Detection (SECDED) mechanism. When a specific sequence of events occurs, the single bit error correction mechanism can cause an incorrect memory update.

For the issue to impact the application, all of the following conditions must be satisfied:

1. A random hardware fault, due to environmental conditions or other factors, causes a single bit upset event, AND
2. The single bit upset event affects one of the impacted memory ranges, AND
3. A read or partial-write access to the memory location with the single bit error occurs (causing the single bit error correction mechanism to kick in), AND
4. A specific memory access sequence occurs after the single bit error correction happens, AND
5. The incorrect memory update by the error correction mechanism is critical enough to impact the application program flow and is undetected by other safety mechanisms.

The following access combinations (Conditions 3 and 4 above) to the impacted memory range, occurring after the single bit error correction, can cause the issue.

- Read or partial write access (from/to the location A with SEC) → followed by full write (to one or more memory locations in the same memory range) → followed by partial write (to any other location in the same memory range): leads to an incorrect update to the last full-write location.
- Partial write access (to the location A with SEC) → followed by partial write (to any other location in the same memory range): leads to an incorrect update to location A.

Note: The issue does not occur for any other memory access sequence.
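The two sequences above can be expressed as a small trace checker, which may be useful when reviewing access patterns offline. This is only a sketch under stated assumptions: the type and function names are hypothetical, "any other location" is read as any location other than A, and any access not listed in the sequences is assumed to break the sequence. None of this models the hardware itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { ACC_READ, ACC_PARTIAL_WRITE, ACC_FULL_WRITE } acc_type_t;

typedef struct {
    acc_type_t type;
    uint32_t   addr; /* location inside one impacted memory range */
    bool       sec;  /* true if this access triggered a single bit correction */
} access_t;

typedef enum { HAZARD_NONE, HAZARD_LAST_FULL_WRITE, HAZARD_SEC_LOCATION } hazard_t;
typedef enum { ST_IDLE, ST_AFTER_SEC, ST_AFTER_FULL } state_t;

/* Scans a trace of accesses to a single memory range and returns the first
 * hazardous sequence found. On a hit, *victim holds the location that would
 * be incorrectly updated. Hypothetical helper, not a TI API. */
hazard_t find_hazard(const access_t *trace, size_t n, uint32_t *victim)
{
    state_t  st = ST_IDLE;
    uint32_t sec_addr = 0, last_full = 0;
    bool     sec_partial = false;

    for (size_t i = 0; i < n; i++) {
        const access_t *a = &trace[i];
        /* Partial write to a location other than A (assumed reading). */
        bool other_partial = (a->type == ACC_PARTIAL_WRITE) && (a->addr != sec_addr);

        /* Sequence 1: SEC access, full write(s), then partial write. */
        if (st == ST_AFTER_FULL && other_partial) {
            *victim = last_full;
            return HAZARD_LAST_FULL_WRITE;
        }
        /* Sequence 2: partial write with SEC, then partial write. */
        if (st == ST_AFTER_SEC && sec_partial && other_partial) {
            *victim = sec_addr;
            return HAZARD_SEC_LOCATION;
        }

        if (st != ST_IDLE && a->type == ACC_FULL_WRITE) {
            st = ST_AFTER_FULL;          /* remember the latest full write */
            last_full = a->addr;
        } else if (a->sec && a->type != ACC_FULL_WRITE) {
            st = ST_AFTER_SEC;           /* read or partial write with SEC */
            sec_addr = a->addr;
            sec_partial = (a->type == ACC_PARTIAL_WRITE);
        } else {
            st = ST_IDLE;                /* assumed: anything else breaks the sequence */
        }
    }
    return HAZARD_NONE;
}
```

For example, a trace of a read with SEC followed by two full writes and a partial write reports the second full-write location as the victim, while a read with SEC followed directly by a partial write reports no hazard.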

Workaround(s):

Single bit upset events are uncommon, with a low probability of occurrence.

The issue applies to single bit errors only. Double bit errors are detected but not corrected; on a double bit error, depending on the criticality, the device is taken to a safe state.

Partial write memory accesses (needed to cause the issue) are limited for the following reasons:

Cached memory ranges do not lead to partial write accesses, as cache line writes are always full writes.
- Ex., MSS L2 memories
Code sections are read only (hence accesses to the entire code section do not satisfy the conditions to cause the issue).
- Ex., MSS L2 memories
Impacted memories with partial write accesses can have other safety mechanisms that can detect or avoid such random errors.
- Higher level processing algorithms of the Radar data cube have built-in outlier rejection capabilities due to tracking functions (temporal and logical monitoring).
  - Ex., DSS L3
- Information redundancy techniques may be used on impacted memories like the Mailbox to detect errors.
  - Ex., Mailbox memories
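One possible information redundancy technique is sketched below: each payload word is stored together with its bitwise complement and the pair is checked on every read. The layout and the names `guarded_word_t`, `guarded_store` and `guarded_load` are hypothetical and are not part of any TI driver; whether a 32-bit store counts as a full or a partial write depends on the ECC word width, which is not stated here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: a payload word stored next to its bitwise complement. */
typedef struct {
    uint32_t value;
    uint32_t inverse; /* equals ~value while the pair is intact */
} guarded_word_t;

/* Writes the payload and its complement. */
static inline void guarded_store(volatile guarded_word_t *slot, uint32_t v)
{
    slot->value = v;
    slot->inverse = ~v;
}

/* Reads the pair back; returns false if the two copies no longer agree,
 * which indicates that one of them was modified unexpectedly. */
static inline bool guarded_load(const volatile guarded_word_t *slot, uint32_t *out)
{
    uint32_t v = slot->value;
    uint32_t inv = slot->inverse;

    if ((v ^ inv) != 0xFFFFFFFFu) {
        return false;
    }
    *out = v;
    return true;
}
```

A failed `guarded_load` can then be handled like any other detected memory fault, for example by discarding the message or requesting a retransmission.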

In the impacted memory ranges, identify whether partial memory write accesses are possible. Decide, based on criticality, whether action is needed on the memory ranges identified as having partial writes. The possible courses of action are as follows:

No Action:

If single bit upset events are unlikely in the operating environment.
If there are other safety mechanisms that can detect or avoid such spurious random errors.

Action: One or more of the following options can be considered
- Avoid the partial write access pattern to those memory ranges.
- Re-initialise the impacted memory bank on single bit memory correction event.
- Treat the single bit memory correction event as an un-correctable error and enter safe state.
  - This does not impact the Functional safety detectability claims, but it may impact availability if such a single bit upset occurs.

Refer to the table below for each memory range and its corresponding ESM line and ECC aggregator status bit if the action of treating the single bit memory correction event as an uncorrectable error (2-b-ii) needs to be taken.

This table lists only the impacted memories and their corresponding details.

Memory Name	Start Address	End Address	ESM Line	ECC Aggregator Status Bit
DSS L3 Bank0	0x88000000	0x880BFFFF	DSS_ESM::GROUP1 Line No-92	DSS_ECC_AGG::SEC_STATUS_REG0::DSS_L3RAM0_PEND
DSS L3 Bank1	0x880C0000	0x8817FFFF	DSS_ESM::GROUP1 Line No-92	DSS_ECC_AGG::SEC_STATUS_REG0::DSS_L3RAM1_PEND
DSS L3 Bank2	0x88180000	0x881FFFFF	DSS_ESM::GROUP1 Line No-92	DSS_ECC_AGG::SEC_STATUS_REG0::DSS_L3RAM2_PEND
DSS L3 Bank3	0x88200000	0x8827FFFF	DSS_ESM::GROUP1 Line No-92	DSS_ECC_AGG::SEC_STATUS_REG0::DSS_L3RAM3_PEND
MSS L2 Bank0	0xC0200000	0xC027FFFF	MSS_ESM::GROUP1 Line No-18	MSS_ECC_AGG_MSS::SEC_STATUS_REG0::MSS_L2SLV0_PEND
MSS L2 Bank1	0xC0280000	0xC02EFFFF	MSS_ESM::GROUP1 Line No-18	MSS_ECC_AGG_MSS::SEC_STATUS_REG0::MSS_L2SLV1_PEND
MSS Mailbox	0xC5000000	0xC5001FFF	MSS_ESM::GROUP1 Line No-18	MSS_ECC_AGG_MSS::SEC_STATUS_REG0::MSS_MBOX_PEND
MSS_RETRAM	0xC5010000	0xC50107FF	MSS_ESM::GROUP1 Line No-18	MSS_ECC_AGG_MSS::SEC_STATUS_REG0::MSS_RETRAM_PEND
DSS Mailbox	0x83100000	0x83100FFF	DSS_ESM::GROUP1 Line No-92	DSS_ECC_AGG::SEC_STATUS_REG0::DSS_MAILBOX_PEND

Note: The MSS_L2 addresses captured above are from the DSS and EDMA addressing view. From the MSS-R5 view, MSS_L2_BANK0 and MSS_L2_BANK1 are at 0x10200000-0x1027FFFF and 0x10280000-0x102EFFFF respectively.
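When a single bit correction event is reported, software may need to map the faulting address to its table entry. The sketch below encodes the table above as a constant array, using the DSS and EDMA addressing view. The type and function names are hypothetical, and the DSS L3 Bank1 start address is taken as 0x880C0000 on the assumption that it is contiguous with the end of Bank0.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the impacted memory table (DSS and EDMA addressing view). */
typedef struct {
    const char *name;
    uint32_t    start;
    uint32_t    end;       /* inclusive */
    const char *esm;       /* ESM instance, GROUP1 in all rows */
    unsigned    esm_line;  /* ESM line number within GROUP1 */
    const char *pend_bit;  /* field in SEC_STATUS_REG0 of the ECC aggregator */
} impacted_mem_t;

static const impacted_mem_t k_impacted[] = {
    { "DSS L3 Bank0", 0x88000000u, 0x880BFFFFu, "DSS_ESM", 92u, "DSS_L3RAM0_PEND" },
    /* Start assumed to be 0x880C0000, contiguous with Bank0. */
    { "DSS L3 Bank1", 0x880C0000u, 0x8817FFFFu, "DSS_ESM", 92u, "DSS_L3RAM1_PEND" },
    { "DSS L3 Bank2", 0x88180000u, 0x881FFFFFu, "DSS_ESM", 92u, "DSS_L3RAM2_PEND" },
    { "DSS L3 Bank3", 0x88200000u, 0x8827FFFFu, "DSS_ESM", 92u, "DSS_L3RAM3_PEND" },
    { "MSS L2 Bank0", 0xC0200000u, 0xC027FFFFu, "MSS_ESM", 18u, "MSS_L2SLV0_PEND" },
    { "MSS L2 Bank1", 0xC0280000u, 0xC02EFFFFu, "MSS_ESM", 18u, "MSS_L2SLV1_PEND" },
    { "MSS Mailbox",  0xC5000000u, 0xC5001FFFu, "MSS_ESM", 18u, "MSS_MBOX_PEND" },
    { "MSS_RETRAM",   0xC5010000u, 0xC50107FFu, "MSS_ESM", 18u, "MSS_RETRAM_PEND" },
    { "DSS Mailbox",  0x83100000u, 0x83100FFFu, "DSS_ESM", 92u, "DSS_MAILBOX_PEND" },
};

/* Returns the table row containing addr, or NULL if addr is not in an
 * impacted range. Hypothetical helper, not a TI API. */
const impacted_mem_t *find_impacted_mem(uint32_t addr)
{
    for (size_t i = 0; i < sizeof(k_impacted) / sizeof(k_impacted[0]); i++) {
        if (addr >= k_impacted[i].start && addr <= k_impacted[i].end) {
            return &k_impacted[i];
        }
    }
    return NULL;
}
```

Addresses seen from the MSS-R5 view would first need to be translated to the view used in the table before the lookup.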

Other memories that are not utilized by the application but are used by the BSS, such as BSS_Mailbox and BSS_Static_RAM, are also affected by this errata.

The BSS mailbox is primarily used for communication between the BSS and the MSS/DSS using mmWaveLink, following a message protocol that incorporates CRC for data integrity. Using CRC during message exchanges over the BSS mailbox reduces the risk associated with this memory.
When a fault occurs (in this case, an ECC SEC), the BSS sends an ESM Fault asynchronous event message to the MSS/DSS as a notification. The application must read b20:ECC_AGG_SEC_ERROR from the AWR_AE_RF_ADV_ESMFAULT_STATUS_SB async-event sent by the BSS, treat this single bit memory correction event as an uncorrectable error, and enter a safe state.
- This workaround is only valid if the application uses the BSS Patch from DFP version 2.4.14 or earlier.
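The check described above can be sketched as follows. It assumes that the ESM fault status from the async-event is available to the application as a 32-bit word and that b20 denotes bit 20 of that word; the function and enum names are hypothetical, and the actual mmWaveLink structure and field names should be taken from the DFP documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position of ECC_AGG_SEC_ERROR, assuming "b20" means bit 20. */
#define ECC_AGG_SEC_ERROR_BIT 20u

typedef enum { BSS_FAULT_IGNORE, BSS_FAULT_ENTER_SAFE_STATE } bss_fault_action_t;

/* True if the BSS reported a single bit correction in the status word. */
static inline bool bss_sec_error_reported(uint32_t esm_fault_status)
{
    return ((esm_fault_status >> ECC_AGG_SEC_ERROR_BIT) & 1u) != 0u;
}

/* Policy from the workaround: a reported SEC is treated as uncorrectable.
 * Other fault bits are outside the scope of this sketch. */
static inline bss_fault_action_t bss_fault_action(uint32_t esm_fault_status)
{
    return bss_sec_error_reported(esm_fault_status)
               ? BSS_FAULT_ENTER_SAFE_STATE
               : BSS_FAULT_IGNORE;
}
```

The application's async-event handler would call `bss_fault_action` with the received status and invoke its own safe state routine when `BSS_FAULT_ENTER_SAFE_STATE` is returned.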