Integrated diagnostics apply system reliability features at the ADC device level

Bryan Lizon
Product Marketing Engineer
Precision Delta-Sigma ADCs
Texas Instruments

As system complexity continues to grow, ADC IC designers are integrating device-level reliability features to help support system-level productivity and dependability.

Mitigating risk and uncertainty is an overarching goal of all engineers. As a result, reliability is one of the bedrock principles on which all engineering disciplines are based. In the 21st century and beyond, this concept will become even more critical as we rely more heavily on technology to improve our lives and our world. From self-driving vehicles to smart energy delivery to factory automation, the need for integrated reliability features in electronic systems will only increase.

To help meet that need, a growing trend within the field of analog integrated circuit (IC) design is to take system-level reliability features and apply those at the device level. In this way, engineers provide a new level of information that, when used properly, can help decrease the probability of device failure, allowing for a more reliable overall system.

**Motivation for reliability**

As next-generation systems grow in complexity and are adopted more widely – for example, an entire city whose utilities, communications and traffic are all managed for efficiency – so too will the likelihood that part(s) of that system will fail. Given that some level of failure is inevitable and anticipated while engineering complex systems, reliability must become a guiding design concern. Unfortunately, in most engineering and reliability discussions, the conversation often turns to a “lessons-learned” analysis of what went wrong during a given event.

A textbook example of this is the Challenger shuttle explosion (1986) (Figure 1), where unusually cold pre-launch temperatures caused an O-ring in the solid rocket booster joint to seat improperly. This released pressurized burning gas that ultimately led to the destruction of the shuttle and the loss of all seven crew members. Disasters like this exemplify the consequences of unreliable systems and help define what reliable systems shouldn’t do.

Figure 1: The Challenger shuttle explosion (Source: NASA)

Fortunately, the vast majority of the time, complex, effectively engineered systems work as they should – or fall back on the fail-safe provisions designed into them – due to the careful and meticulous planning of engineers. In fact, despite widespread coverage of events like the Mississippi River bridge collapse in Minnesota (2007) and the Deepwater Horizon oil rig explosion (2010), there are over 600,000 bridges and 3,500 oil rigs in the United States, the vast majority of which continue to operate safely and reliably with proper maintenance.

Therefore, the reliability discussion should really be less about avoiding disaster and more about providing thoughtfully designed, quality products that offer predictable functionality given some set of recommended operating conditions. However, as the future brings technological integration to never-before-seen levels and system complexity increases to a city- or even region-wide scale, the typical approach of defining reliability through static device limitations may not be enough. Engineers will need to reevaluate how they approach design.

Recognizing that an end-equipment’s reliability is limited by its least reliable component, IC designers are creating new, intelligent devices that provide feedback about their overall health and the status of their various functions.

Additionally, they are incorporating more active solutions that can both detect data errors and correct them.

To help ensure performance in harsher and more demanding industrial environments, integrated monitoring features are providing new ways to confirm that the system inputs – from the sampled signal to external connections to the overall temperature – are within tolerable, anticipated limits and in working order. This enhances device reliability and provides valuable information to the governing system.

**Making systems more reliable**

Reliability-minded designers model and test the failure rates of their devices, determining anticipated or presumed conditions that lead to unreliable operation. They also identify how long it takes to reach this state. To accomplish this, ICs are put through a whole host of quality and reliability tests.

When determining the minimum and maximum values for electrical characteristics, for example, several methods can be used depending on the specification. Many parameters, such as offset and gain errors, common-mode rejection, and power-supply rejection, are tested in production, where each parameter is measured on every device. Devices that fail to meet the specified limits are rejected.

Other device specifications are determined through characterization of a random sampling of devices, generally 30 or more. Once this data is collected and analyzed, the standard deviation (or multiple thereof) is used to help define an acceptable tolerance (or margin) that the user can expect.
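As a rough illustration of the characterization approach described above, the sketch below derives limits as the sample mean plus or minus a multiple of the sample standard deviation. The multiplier `k = 3`, the function name `characterization_limits`, and the sample readings are assumptions made for this example only; they are not taken from any datasheet or from the actual characterization flow.

```python
from statistics import mean, stdev


def characterization_limits(samples, k=3.0):
    """Return (low, high) as mean -/+ k sample standard deviations.

    k is a hypothetical multiplier; the margin actually applied
    depends on the parameter and the manufacturer's own rules.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate a deviation")
    mu = mean(samples)
    sigma = stdev(samples)  # sample (n - 1) standard deviation
    return mu - k * sigma, mu + k * sigma


# Made-up drift readings from 30 sampled devices (arbitrary units).
readings = [1.0 + 0.1 * ((i % 5) - 2) for i in range(30)]
low, high = characterization_limits(readings, k=3.0)
print(f"characterized limits: {low:.3f} to {high:.3f}")
```

A wider multiplier gives the user more margin at the cost of a looser published specification.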

**Figure 2** shows some parameters typically characterized in this way (though this is not always the case).

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Test conditions</th>
</tr>
</thead>
<tbody>
<tr>
<td>Offset drift</td>
<td>PGA disabled, gain = 1 to 4</td>
</tr>
<tr>
<td></td>
<td>Gain = 1 to 128, $T_A = -40^\circ C$ to $+85^\circ C$ (2)</td>
</tr>
<tr>
<td></td>
<td>Gain = 1 to 128</td>
</tr>
<tr>
<td>Gain drift</td>
<td>PGA disabled, gain = 1 to 4</td>
</tr>
<tr>
<td></td>
<td>Gain = 1 to 128 (2)</td>
</tr>
<tr>
<td>Reference drift (2)</td>
<td>PGA disabled, gain = 1 to 4</td>
</tr>
<tr>
<td></td>
<td>Gain = 1 to 128</td>
</tr>
</tbody>
</table>

(2) Minimum and maximum values are ensured by design and characterization data.

**Figure 2: An example of parameters typically defined by characterization data**

That being said, quality and reliability do not simply define the minimum and maximum values for each electrical characteristic. They also examine how device reliability is affected by certain environmental factors, such as high temperature, electrostatic discharge (ESD), moisture sensitivity, and thermal impedance, as well as the limits of those stressors.

Additionally, characterization and quality data provide useful information, such as early life failure rate (ELFR) and mean time between failures (MTBF), affording users a statistically determined product lifetime during which they may expect reliable device operation.

In the semiconductor industry, the distinction between reliable and unreliable operation is easily illustrated by the recommended operating conditions section of any datasheet (Figure 3), where limits are set for each parameter given some tolerance for error. When the analog-to-digital converter (ADC) inputs are kept within these limits, the user can expect predictable operation throughout the product’s lifetime. This differs from the absolute maximum ratings, which define the stress limits beyond which the device may be permanently damaged. While these maximum ratings are wider than the recommended operating conditions, they offer no practical expectation of reliable ADC performance and, if the device is operated near them for a prolonged period of time, irreversible damage can result.

### Recommended Operating Conditions

<table>
<thead>
<tr>
<th>POWER SUPPLY</th>
<th>MIN</th>
<th>NOM</th>
<th>MAX</th>
<th>UNIT</th>
</tr>
</thead>
<tbody>
<tr>
<td>Unipolar analog power supply: AVDD to AVSS</td>
<td>2.3</td>
<td></td>
<td>5.5</td>
<td>V</td>
</tr>
<tr>
<td>AVSS to DGND</td>
<td>−0.1</td>
<td>0</td>
<td>0.1</td>
<td>V</td>
</tr>
</tbody>
</table>

### Absolute Maximum Ratings

<table>
<thead>
<tr>
<th>POWER SUPPLY</th>
<th>MIN</th>
<th>MAX</th>
<th>UNIT</th>
</tr>
</thead>
<tbody>
<tr>
<td>AVDD to AVSS</td>
<td>−0.3</td>
<td>7</td>
<td>V</td>
</tr>
<tr>
<td>DVDD to DGND</td>
<td>−0.3</td>
<td>7</td>
<td>V</td>
</tr>
</tbody>
</table>

Figure 3: Difference between recommended operating conditions and absolute maximum ratings

However, functioning within the recommended operating conditions is simply about maintaining reliability and does not necessarily define any method to improve it.

One way to increase reliability is through redundancy. This is the inclusion of backup systems that help allow processes to continue operation even in the event of a defined failure. A real-world example of redundancy that many of us can relate to is keeping a spare key to your home in the event you misplace the original. While this idea is simple, it is often extended to incredibly complex systems.

Theoretically, redundancy can be modeled as shown in Figure 4, where n is equal to the number of redundant nodes in the system (n = 0 implies no redundancy). This figure depicts what is commonly referred to as *cold standby redundancy* where one system remains on while the redundant system(s) is off. This method can help eliminate unnecessary stress on backup systems being powered but not used, as well as reduce energy costs for powering those unused systems. Additional redundancy methods include *hot standby* or *modular redundancy*. These methods keep all nodes powered, sacrificing energy budget for switchover speed or optimal output.

A major benefit of redundancy is that it can increase the overall reliability of a system beyond the reliability of each component. If we assume that the reliability of each component is independent, even a modest level of redundancy can have a positive effect.

Unfortunately, true independence can be difficult to achieve. Even with careful planning and designing, unforeseen system relationships can cause seemingly independent redundant systems to fail simultaneously, a phenomenon known as common-mode failure. In the *Challenger* disaster example, there were two O-rings in the rocket booster joint (Figure 5), with the second included in case the first failed. Catastrophically, the cold temperatures affected both O-rings equally, causing both to fail simultaneously.
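To make the effect of redundancy concrete, here is a minimal sketch that assumes every node has the same reliability `r`, that failures are fully independent, and that the system works as long as at least one of the `n + 1` nodes works. This is the simplest textbook parallel model, not a model of cold standby switching. The function name `parallel_reliability` is chosen for this example.

```python
def parallel_reliability(r, n):
    """Reliability of one primary node plus n redundant nodes.

    Assumes identical, statistically independent nodes and that a
    single working node is enough (n = 0 means no redundancy).
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be a probability between 0 and 1")
    if n < 0:
        raise ValueError("n must be zero or greater")
    return 1.0 - (1.0 - r) ** (n + 1)


for n in range(4):
    print(n, round(parallel_reliability(0.9, n), 6))
```

Under these assumptions, a single backup for a node with reliability 0.9 raises the combined figure to about 0.99. A common-mode failure breaks the independence assumption, so the real benefit can be far smaller than this model suggests.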
Figure 5: Segmented view of the Challenger booster rocket identifying the O-rings that failed (Source: NASA)

While redundancy can be included in a system to make it more reliable, it should be supplemented with other methods that can also help improve reliability. Examples include radiation hardening, higher isolation barriers and better power supply rejection (PSR). However, these tactics do not have to be static, nor do they have to be utilized only at a system level. In fact, as the need for more widespread and complex systems continues to grow, so too will the need for more intelligent components that actively contribute to the health of the systems to which they belong.

Introducing dynamic reliability features at a more granular level better ensures that each part of a complex system performs as expected throughout its lifespan. With this in mind, IC designers have engineered ADCs that cater to those needs, as is the case with the 32-bit ADS1262 and ADS1263. These devices are some of the first industrial ADCs to contain a host of monitoring and protection features.

**Integrated diagnostics**

The ADS1262/3 (Figure 6) include both analog and digital monitoring methods that provide an extra level of diagnostic ability. In the analog domain, the integrated programmable gain amplifier (PGA) includes both out-of-range detection and rail detection. The former detects whether the differential output voltage exceeds ±105 percent of the full-scale voltage range (VREF), while the latter sets a flag if either PGA output voltage is within 100 mV of the supply (AVDD or AVSS). Additionally, both ADCs incorporate reference fault detection, where the differential reference voltage (VREFP − VREFN) is continuously compared against 0.4 V. The ADCs update the conversion status byte each conversion cycle, indicating whether the reference voltage has dropped below this threshold.
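The three analog monitors can be pictured as simple threshold comparisons. The sketch below is a host-side model of those comparisons, using the thresholds quoted above (105 percent of VREF, 100 mV from the rails, 0.4 V reference minimum). It is not the device's internal implementation or its register interface, and the names `MonitorFlags` and `evaluate_monitors` are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class MonitorFlags:
    pga_out_of_range: bool
    pga_rail: bool
    ref_fault: bool


def evaluate_monitors(pga_outp, pga_outn, vrefp, vrefn, avdd, avss,
                      range_limit=1.05, rail_margin=0.1, ref_min=0.4):
    """Model the PGA and reference monitors as threshold checks (volts)."""
    vref = vrefp - vrefn
    differential = pga_outp - pga_outn
    # Out-of-range: differential PGA output beyond +/-105 % of VREF.
    out_of_range = abs(differential) > range_limit * vref
    # Rail detection: either PGA output within 100 mV of AVDD or AVSS.
    rail = any(v > avdd - rail_margin or v < avss + rail_margin
               for v in (pga_outp, pga_outn))
    # Reference fault: differential reference below 0.4 V.
    ref_fault = vref < ref_min
    return MonitorFlags(out_of_range, rail, ref_fault)


print(evaluate_monitors(3.0, 2.0, 2.5, 0.0, 5.0, 0.0))
```

In a real system the host would read these flags from the conversion status byte rather than compute them; the model is only meant to show what each flag represents.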

Figure 6: System block diagram of the high-resolution, low-noise ADCs

In industrial environments, noise is often introduced through strong radio frequency (RF) signals, transients sourced from motors or switchgear, or even through noise-generating maintenance operations, such as welding a broken machine. To minimize the effects of this noise on sensitive digital circuitry, the ADS1262/3 provide a cyclic redundancy check (CRC), as well as a simple checksum, both of which help detect single-bit and multi-bit errors.

Each of these detection schemes works by calculating a check value from the converted result, which is then compared to the value the host controller calculates from the data it received. The CRC is the remainder from dividing the data bytes by the CRC-8-ATM polynomial $x^8 + x^2 + x + 1$, while the simple checksum sums all four data bytes, along with a constant (0x9B). If the ADC and host results differ, an error has occurred. In this case, the data may be read again in the same cycle to potentially recover the actual sampled value.
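A host-side version of both checks might look like the sketch below. The polynomial and the 0x9B constant come from the text above. The zero initial value, MSB-first bit order, and truncation of the checksum to eight bits are assumptions made for this example and should be confirmed against the device datasheet before use. The function names are hypothetical.

```python
CRC8_ATM_POLY = 0x07     # x^8 + x^2 + x + 1 (the x^8 term is implicit)
CHECKSUM_OFFSET = 0x9B   # constant added in simple checksum mode


def crc8_atm(data, init=0x00):
    """Bitwise CRC-8, MSB first; init=0 is an assumption."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ CRC8_ATM_POLY) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def simple_checksum(data):
    """Sum of the data bytes plus the offset, kept to 8 bits (assumed)."""
    return (sum(data) + CHECKSUM_OFFSET) & 0xFF


def data_is_valid(data, received_check, mode="crc"):
    """Compare the host's calculation against the byte sent by the ADC."""
    expected = crc8_atm(data) if mode == "crc" else simple_checksum(data)
    return expected == received_check


sample = bytes([0x12, 0x34, 0x56, 0x78])   # four made-up data bytes
print(data_is_valid(sample, crc8_atm(sample)))
```

When `data_is_valid` returns false, the host can read the conversion again, as described above, before deciding that the sample is lost.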

Occasionally, output data may seem continuously invalid, implying a potentially more serious problem than a noisy environment. If the health of the ADC is questionable, one of the most useful features integrated into the ADS1262/3 is a test digital-to-analog converter (DAC) (Figure 7).

To assess the state of the ADC, the DAC generates a known single-ended, differential or common-mode voltage that is also compatible with any setting of the integrated PGA. Given this known input, the ADC should generate an expected output. If not, one of the ADC's components may not be working properly, resulting in a need for further diagnostics.

Additionally, the test DAC signals can be routed externally to analyze any suspected issues with the signal-conditioning circuit. This powerful diagnostic tool can be monitored on demand or, for more critical applications, utilized after each converted ADC sample to check for erroneous data, possibly due to a dysfunctional ADC.

Another method to test the health of the main ADC involves taking a redundant measurement with the auxiliary 24-bit ADC of the ADS1263 to see if they agree. If they do not agree, the controlling system can check the conversion status byte to see if any of the monitoring flags have been set. Alternatively, since the test DAC can be applied to either ADC, it can send the same known signal through both to verify that they output the same value. If not, one may be defective.
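The agreement check between the two converters can be sketched as follows. The example assumes a bipolar two's-complement transfer function for both ADCs (a 32-bit main code and a 24-bit auxiliary code) and a tolerance of 1 mV. Both the transfer function and the tolerance are illustrative assumptions, and the function names are hypothetical.

```python
def code_to_volts(code, bits, vref, gain=1.0):
    """Convert a signed output code to input-referred volts (assumed format)."""
    full_scale = 1 << (bits - 1)
    return code * vref / (gain * full_scale)


def readings_agree(main_code, aux_code, vref,
                   main_gain=1.0, aux_gain=1.0, tol_volts=1e-3):
    """True if the 32-bit main and 24-bit auxiliary readings match."""
    v_main = code_to_volts(main_code, 32, vref, main_gain)
    v_aux = code_to_volts(aux_code, 24, vref, aux_gain)
    return abs(v_main - v_aux) <= tol_volts


# Both codes represent half of full scale with a 2.5 V reference.
print(readings_agree(1 << 30, 1 << 22, vref=2.5))
```

If the readings disagree, the next step described above is to inspect the monitoring flags in the conversion status byte, or to apply the same test DAC signal to both converters.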

These monitoring features provide multiple ways to check the ADC's overall health, as well as the operation of its individual parts, and can alert the controlling system when they detect anomalous behavior. From this, the host can make quicker, better-informed decisions, such as forcing malfunctioning processes into a desired safe state or calling for an entire plant shutdown. Ultimately, this allows for a safer working environment.

**Monitoring system inputs**

The aforementioned features are generally used to understand the operating status of several of the ADC's individual parts, as well as to test its overall health. In a closed system this might be adequate. However, since ADCs do not operate in a vacuum, additional elements are necessary to help monitor the inputs to the ADC from the system.

For example, the ADS1262/3 incorporate a temperature sensor that can be used to monitor the temperature of the die. In the event that this sensor reports increasing temperatures, the speed of external cooling fans can be adjusted automatically. If this functionality does not exist in the overall system, the system can be shut down to prevent permanent damage to the device and allow time for troubleshooting.

As was previously discussed, the ADS1263 integrates an auxiliary 24-bit ADC, which includes its own input multiplexer (MUX), PGA, and reference inputs (Figure 8). While it has various uses, including cold junction compensation (CJC) in thermocouple applications and confirming the output of the main ADC, the auxiliary ADC can be used in a variety of other ways to help monitor system inputs.

Expanding on the redundant measurement method described earlier, the auxiliary ADC can also be used to take redundant measurements of the main ADC, but with a different PGA gain. This configuration allows the user to view a small signal, like a bridge measurement, from a wider perspective. It also enables event detection of anomalies, such as clipping or transients, that might otherwise go unnoticed and be passed along as valid data. To do this, the user would configure the main ADC with a large gain, for example 32, while the auxiliary ADC would have a gain of 1 (Figure 9).

Figure 8: Signal chain block diagram for the main and auxiliary ADCs

With this setup, the host controller is better prepared to make quick, effective decisions in the event of a problem. If a transient is detected, the ADC can run the test DAC and PGA/reference monitors to determine if either signal chain has been damaged. Or, if the signal is clipping, the gain can be reprogrammed in real-time to continue accurately capturing data.
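One way a host might use the two gain settings is sketched below. It assumes both readings have already been converted to input-referred volts, that the high-gain path can represent inputs up to VREF divided by its gain, and that a 5 mV mismatch is worth flagging. The function name `classify_sample` and all three assumptions are for illustration only.

```python
def classify_sample(main_volts, aux_volts, vref, main_gain=32, tol_volts=5e-3):
    """Classify one pair of input-referred readings.

    main_volts comes from the high-gain main ADC, aux_volts from the
    gain-of-1 auxiliary ADC. Thresholds here are hypothetical.
    """
    main_limit = vref / main_gain  # largest input the high-gain path can hold
    if abs(aux_volts) >= main_limit:
        return "clipping"          # main ADC is saturated; lower its gain
    if abs(main_volts - aux_volts) > tol_volts:
        return "mismatch"          # possible transient or damaged signal chain
    return "ok"


print(classify_sample(0.0100, 0.0101, vref=2.5))
```

A "clipping" result would prompt the host to reprogram the gain, while a "mismatch" result would prompt the test DAC and monitor checks described above.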

Another important feature of the auxiliary ADC is its ability to use the sensor bias block without having to interrupt the main ADC. The sensor bias feature consists of configurable resistors or current sources that force a positive or negative full-scale reading in the event that a sensor becomes disconnected.

If multiple sensors are used, such as thermocouples and RTDs, the main ADC can sample the output of the first sensor while the auxiliary ADC monitors the sensor bias reading of the second sensor at the input. This helps to ensure that when the main ADC is ready to sample the second sensor’s output, it will still be connected. This idea can then be translated to sensors three, four, five and so on.
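The look-ahead scheme can be sketched in two small pieces: a test for a reading forced to full scale, and a schedule that pairs the sensor being sampled with the next one being checked. The 24-bit signed code format, the 99 percent threshold, and both function names are assumptions made for this example.

```python
def sensor_disconnected(code, bits=24, threshold=0.99):
    """True if a code sits at or near positive or negative full scale."""
    full_scale = (1 << (bits - 1)) - 1
    return abs(code) >= threshold * full_scale


def monitoring_plan(sensor_count):
    """Yield (sensor sampled by main ADC, sensor checked by auxiliary ADC)."""
    for index in range(sensor_count):
        yield index, (index + 1) % sensor_count


for sampled, checked in monitoring_plan(3):
    print(f"main ADC samples sensor {sampled}, auxiliary ADC checks sensor {checked}")
```

With this pairing, each sensor is verified as connected just before the main ADC moves on to it.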

**Two final points**

First, while these features provide either internal or external reliability information, they are not limited to one or the other. For example, the differential reference voltage may be provided by the system (not internally). The result is that the REF monitoring capabilities provide information about a system input rather than an integrated feature. Furthermore, certain features need not be used to enhance the reliability of the device, as is the case with the ADS1263’s auxiliary ADC performing CJC.

Second, certain circumstances, such as a lightning strike, could damage these monitoring capabilities. Therefore, they should always work in tandem with system reliability features and should never take the place of routine maintenance and external testing by the governing system.

**Meeting the demands of the future**

As technology rises to meet the challenges of tomorrow, the need to integrate already-complex systems into one coherent entity will only increase. These challenges will continue to drive engineers of all disciplines to consider how the products they create contribute both passively and actively to the reliable operation of the systems they support.

To meet the demands of the future, IC designers at Texas Instruments are integrating reliability features into ADCs, allowing these devices to play an active role in the overall health and dependability of the systems in which they are incorporated.

In the case of the ADS1262/3, these features help to ensure more reliable data while providing a wide range of targeted diagnostic information. Both features and diagnostics can help contribute to a system’s reliability when used properly in the original equipment manufacturers’ (OEM) end application.

Ultimately, incorporating reliability features at the device level can help OEMs reduce the rate of failure of the complex systems they develop, bringing the incredible visions of the future ever closer to becoming the reality of the present.

**References**

1. History Channel, Challenger Disaster
3. Reliability Data: Reliability Estimator, Texas Instruments
4. ADS1262 datasheet
5. ADS1263 datasheet

Mailing Address: Texas Instruments, Post Office Box 655303, Dallas, Texas 75265
Copyright © 2015, Texas Instruments Incorporated