Guide to Reliability of Electrical/Electronic Equipment and Products--Robust Design Practices (part 2)




With the dependence of today's products on high-performance integrated circuits, disk drives, and power supplies, it is important that thorough and accurate processes are in place to ensure that the right technology and functional components are selected from the right suppliers. This means that components with the required performance characteristics and reliability have been selected for the design and that appropriate steps have been taken to assure that these same components are procured and delivered for use in production.

The selection process begins with a dialog between component engineering and design engineering to complete the technology assessment questionnaire discussed earlier, in conjunction with technology and functional road maps. Obsolete or soon-to-be-obsolete components and technologies need to be identified and avoided. The use of single- or sole-sourced components also needs to be identified, the risks assessed, and a proactive contingency plan developed during the detailed design phase to ensure a continuous source of supply during manufacturing. Depending on customer requirements and corporate culture, the actions involved can include supplier design and manufacturing process audits and verifications; verification of supplier-conducted reliability tests; analysis of the use of single- or sole-sourced components; exhaustive component qualification and new package evaluation and testing; accelerated stress testing during design to determine operating margins and device sensitivities; device characterization testing focusing on four-corner testing and testing of unspecified parameters; in-application (i.e., product) testing; and supplier performance monitoring.

Section 4 discusses component and supplier selection and qualification in detail.


Reliability prediction is the process of developing a model or assigning/calculating reliability values or failure rates for each component and subsystem. MIL-HDBK-217 and Bellcore Procedure TR-332 are two documents that list component failure rate data. Although their accuracy is questionable, these models provide a starting point from which to proceed based on practical experience with various components and subassemblies. Reliability prediction is used and iteratively refined throughout the design cycle for comparing designs, estimating warranty costs, and predicting reliability. However, predictions are most useful when comparing two or more design alternatives. Although prediction metrics are notoriously inaccurate, the factors that cause the inaccuracies (such as manufacturing, quality, end-use environments, operator problems, and mishandling) are usually the same for each competing design. Therefore, although the absolute values may be incorrect, the relative values and rankings tend to be valid. A detailed discussion and illustrative examples can be found in section 1.
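The relative-ranking argument can be illustrated with a minimal parts-count style sketch. The failure rates below are made-up placeholder values (failures per million hours), not data taken from MIL-HDBK-217 or TR-332, and the series-system assumption (any part failure fails the product) is a simplification introduced here.

```python
# Parts-count style comparison of two hypothetical designs.
# All quantities and failure rates are illustrative placeholders
# (failures per 1e6 hours), not handbook data.

def total_failure_rate(parts):
    """Sum quantity * failure rate over all part types (series system)."""
    return sum(qty * rate for qty, rate in parts.values())

def mtbf_hours(rate_per_1e6_hours):
    """Mean time between failures for a constant failure rate."""
    return 1e6 / rate_per_1e6_hours

design_a = {"ic": (20, 0.05), "capacitor": (60, 0.01), "connector": (4, 0.20)}
design_b = {"ic": (12, 0.05), "capacitor": (45, 0.01), "connector": (2, 0.20)}

rate_a = total_failure_rate(design_a)
rate_b = total_failure_rate(design_b)

# A common multiplier (environment, handling, and so on) changes the
# absolute values but not which design ranks better.
common_factor = 3.0  # hypothetical
ranking_unchanged = (rate_a > rate_b) == (
    rate_a * common_factor > rate_b * common_factor
)

print(f"Design A: {rate_a:.2f} per 1e6 h, MTBF {mtbf_hours(rate_a):,.0f} h")
print(f"Design B: {rate_b:.2f} per 1e6 h, MTBF {mtbf_hours(rate_b):,.0f} h")
```

Whatever the true scaling factor turns out to be, the design with fewer and lower-rate parts keeps its advantage, which is why the comparison is more trustworthy than either absolute number.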


FIGURE 4 Support cost model.


TABLE 7 Power Supply Life Cycle Cost Comparison

Item                                                  Supplier A   Supplier B

Input data
  Cost per unit                                       $1000        $1300
  Expected service calls in 5 years                   0.6          0.1
  Cost per service call (OEM cost only)               $850         $1100
  Cost per service call (OEM cost + customer cost)    $2550        $3600

Cost calculations
  Development cost                                    Minimal      Minimal
  Production cost per unit                            $1000        $1300
  Support cost per unit (OEM cost only)               $510         $110
  Support cost per unit (OEM cost + customer cost)    $1530        $360
  Other costs                                         Minimal      Minimal

Life cycle cost
  Total cost (OEM cost only)                          $1510        $1410
  Total cost (OEM cost + customer cost)               $2530        $1660



A support cost model is used to assess cost effectiveness of tradeoffs, such as determining whether the additional cost of a higher reliability unit will pay for itself in terms of reduced service calls.

A complete product life cycle cost model includes development, production, and support costs. Experience has shown that the financial focus of most projects is on development and production costs. Support costs are often overlooked because they are harder to quantify and do not directly relate to financial metrics such as percent R&D expenditure or gross margins. They do, however, relate to profit, and a product reliability problem can affect the bottom line for many years.

Figure 4 shows the structure of a typical support cost model. This model helps evaluate total product cost and ensure that reliability is appropriately considered in design tradeoffs. Support cost is a function of many factors, such as reliability, support strategies, warranty policies, stocking locations, repair depots, and restoration times. Support costs include the number of service calls multiplied by the cost of a service call, highlighting the dependence of support cost on reliability. Other important factors, such as months of spares inventory, have also been included in the cost of a service call. The support cost model is implemented in a spreadsheet to make it easy for anyone to use. Costs are calculated for the expected product life and discounted back to present value. The model can be used for both developed and purchased products and is flexible enough to use with standard reliability metrics. For example, inventory, repair, and shipping costs are removed for service calls that do not result in a part replacement, and material repair cost is removed from returned units that do not fail in test.
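The core relationship (service calls multiplied by cost per call, discounted to present value) can be sketched in a few lines. This is only an illustration: the function name, the even spread of calls over the product life, end-of-year discounting, and the 10% discount rate are assumptions made here, not details of the model in Figure 4. The call count and cost per call are the Supplier A, OEM-only figures from Table 7.

```python
def discounted_support_cost(calls_over_life, cost_per_call, years, discount_rate):
    """Present value of support cost.

    Assumes service calls are spread evenly over the product life and
    that each year's cost is discounted from the end of that year.
    """
    calls_per_year = calls_over_life / years
    return sum(
        calls_per_year * cost_per_call / (1.0 + discount_rate) ** year
        for year in range(1, years + 1)
    )

# 0.6 expected calls in 5 years at $850 per call (Table 7, Supplier A).
undiscounted = discounted_support_cost(0.6, 850.0, 5, 0.0)
present_value = discounted_support_cost(0.6, 850.0, 5, 0.10)  # 10% is hypothetical

print(f"Undiscounted support cost: ${undiscounted:.2f}")
print(f"Present value at 10%:      ${present_value:.2f}")
```

With a zero discount rate the function returns the simple product of calls and cost per call; any positive rate lowers the present value of costs that occur later in the product life.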

The life cycle cost model is often used in the supplier selection process.

An example of its use in power supply supplier selection is presented. Supplier A offered a low-cost, low-reliability power supply, while Supplier B offered a higher-cost, high-reliability power supply. Purchasing wanted to use Supplier A and find another low-cost second source. The life cycle cost analysis shown in Table 7 was performed. Development costs were minimal since development was only involved in a support role (acting as a consultant) in the specification and qualification of the power supplies.

The cost per service call is actually higher for the higher-reliability power supply because the shipping and repair costs are higher. However, the number of service calls is significantly lower. The life cycle cost analysis convinced Purchasing to keep Supplier B as a second source. Supplier B's power supplies have performed so reliably since product introduction that Supplier B quickly became the preferred power supply supplier and has been contracted for a majority of the end product.
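The totals in Table 7 can be reproduced with a short script. The class and field names are illustrative, the "Minimal" development and other costs are treated as zero, and no discounting is applied, which is an assumption that happens to match the table's totals.

```python
from dataclasses import dataclass

@dataclass
class Supplier:
    unit_cost: float                    # production cost per unit, $
    calls_in_5_years: float             # expected service calls per unit
    call_cost_oem: float                # cost per call, OEM cost only, $
    call_cost_oem_plus_customer: float  # cost per call incl. customer cost, $

    def support_cost(self, include_customer_cost=False):
        per_call = (self.call_cost_oem_plus_customer
                    if include_customer_cost else self.call_cost_oem)
        return self.calls_in_5_years * per_call

    def total_cost(self, include_customer_cost=False):
        # Development and other costs are "Minimal" in Table 7; treated as 0.
        return self.unit_cost + self.support_cost(include_customer_cost)

supplier_a = Supplier(1000, 0.6, 850, 2550)
supplier_b = Supplier(1300, 0.1, 1100, 3600)

for name, s in (("A", supplier_a), ("B", supplier_b)):
    print(name,
          round(s.total_cost()),
          round(s.total_cost(include_customer_cost=True)))
# prints "A 1510 2530" and then "B 1410 1660"
```

Supplier B is $300 more expensive to buy but $100 cheaper over the life cycle on OEM cost alone, and $870 cheaper once customer cost is included.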


Stress analysis consists of calculating or measuring the critical stresses applied to a component (such as voltage applied to a capacitor or power dissipation in a resistor) and comparing the applied stress to some defined criteria. Traditional stress/strength theory indicates that a margin should exist between the applied stresses on a component and the rated capabilities of that component. When sufficient margin exists between the applied stress and the component strength, the probability of failure is minimal. When the safety margin is missing, a finite probability exists that the stresses will exceed the component strength, resulting in failure. The real question that must be answered is how much margin is enough.

Often the answer depends on the circuit function, the application, and the product operating environment.
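One common way to put a number on the margin question is a stress/strength interference calculation. The sketch below assumes that stress and strength are independent and normally distributed, which is an assumption of this illustration rather than something stated above, and the capacitor voltages are made-up values.

```python
import math

def failure_probability(mean_strength, sd_strength, mean_stress, sd_stress):
    """P(stress > strength) for independent, normally distributed
    stress and strength (an assumption of this sketch)."""
    margin = mean_strength - mean_stress
    spread = math.sqrt(sd_strength ** 2 + sd_stress ** 2)
    z = margin / spread
    # Standard normal CDF evaluated at -z, via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical capacitor with a 50 V mean breakdown strength (sd 3 V),
# used at a mean applied voltage of 30 V or 45 V (sd 4 V).
p_wide_margin = failure_probability(50.0, 3.0, 30.0, 4.0)
p_narrow_margin = failure_probability(50.0, 3.0, 45.0, 4.0)

print(f"Wide margin:   {p_wide_margin:.2e}")
print(f"Narrow margin: {p_narrow_margin:.2e}")
```

Under these assumptions, shrinking the margin from four combined standard deviations to one raises the failure probability by several orders of magnitude, which is the quantitative reason derating matters.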

The result of a stress analysis is a priority list requiring design action to eliminate the failure modes by using more robust components with greater safety margins, to minimize the effects of the failures, to provide for routine preventive maintenance or replacement, and/or to assure that repairs can be accomplished easily.

Part derating is essential to achieve or maintain the designed-in reliability of the equipment. Derating assures the margin of safety between the operating stress level and the actual rated level for the part and also provides added protection from system anomalies that are unforeseen during system design. These anomalies may occur as power surges, printed circuit board hot spots, unforeseen environmental stresses, part degradation with time, and the like.

Derating levels are not absolute values, and engineering judgment is required for resolving critical issues. Derating is simply a tradeoff between factors such as size, weight, cost, and failure rate. It is important to note that excessive derating can result in unnecessary increases in part count, part size, printed circuit board size, and cost. This can also increase the overall predicted failure rate; therefore, engineering judgment is necessary to choose the most effective level of derating. The derating guidelines should be exceeded only after evaluating all of the possible tradeoffs and using sound engineering principles and judgment. Appendix A, at the end of this guide, reproduces one company's parts derating guidelines document.

The rules for derating a part are logical and applied in a specific order. The following is a typical example of such rules:

1. Use the part type number to determine the important electrical and environmental characteristics which are reliability sensitive, such as voltage, current, power, time, temperature, frequency of operation, duty cycle, and others.

2. Determine the worst case operating temperature.

3. Develop derating curves or plots for the part type.

4. Derate the part in accordance with the appropriate derating plots. This becomes the operational parameter derating.

5. Use a derating guideline such as that in Appendix A of section 3 or military derating guideline documents [such as those provided by the U.S. Air Force (AFSCP Pamphlet 800-27) and the Army Missile Command] to obtain the derating percentage. Multiply the operational derating (the value obtained from Step 4) by this derating percentage. This becomes the reliability derating.

6. Divide the operational stress by the reliability derating. This provides the parametric stress ratio and establishes a theoretical value to determine if the part is overstressed. A stress ratio of 1.0 is considered to be critical, and for a value greater than 1.0 the part is considered to be overstressed.

7. If the part is theoretically overstressed, then an analysis is required and an engineering judgment and business decision must be made whether it is necessary to change the part, do a redesign, or continue using the current part.
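Steps 5 through 7 reduce to one multiplication, one division, and a comparison. The following is a minimal sketch with hypothetical function names and example values; it does not replace the engineering judgment called for in Step 7.

```python
def stress_ratio(operating_stress, operational_derated_value, guideline_fraction):
    """Steps 5 and 6: apply the guideline percentage to the operational
    derating, then divide the operating stress by the result."""
    reliability_derated = operational_derated_value * guideline_fraction
    return operating_stress / reliability_derated

def assess(ratio):
    """Step 7 trigger: classify the part by its stress ratio."""
    if ratio > 1.0:
        return "overstressed"
    if ratio == 1.0:
        return "critical"
    return "within derating"

# Hypothetical part: 1.0 W after operational derating, 50% guideline.
for operating_w in (0.3, 0.5, 0.8):
    r = stress_ratio(operating_w, 1.0, 0.5)
    print(f"{operating_w} W -> ratio {r:.2f}: {assess(r)}")
```

In practice a ratio close to 1.0 deserves the same scrutiny as one that exceeds it, since the inputs are themselves estimates.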

11.1 Derating Examples

FIGURE 5 Typical power resistor derating curve.

Several examples are presented to demonstrate the derating process.

Example 1: Power Resistor

This example involves derating an established reliability fixed wire-wound (power type) MIL-R-39007/5C resistor. This family of resistors has a fixed resistance value, is rated for 2 W power dissipation at 25°C, derates to 0 W power dissipation at 275°C, and is designed for use in electrical, electronic communication, and associated equipment. In this example the designer assumed that resistor style PWR 71 meets the design criteria. An evaluation is required to assure that the resistor is not being overstressed. The known facts are that the resistor is being used at a worst case system temperature of 105°C and during operation will dissipate 1.2 W at this temperature. The problem solution follows the previously stated rules:

1. The power (rated wattage) versus temperature derating plot is shown in Figure 5. For a worst case temperature of 105°C, the power derates to approximately 1.36 W.

2. The resistor derating information is found in the MIL-R-39007 standard. The two important stress characteristics are power and temperature. The recommended power derating is 50%, while the maximum usage temperature is 275°C − 25°C, or 250°C.

3. The resistor is power derated to 1.36 W. It must now be derated an additional 50% (considered a reliability derating), or 1.36 W × 0.50 = 0.68 W.

4. The operating power dissipation or operating stress is 1.2 W. This divided by the reliability derating gives the safety factor stress ratio. In this case, 1.2/0.68 ≈ 1.76. The stress ratio exceeds 1.0 by about 0.76 and requires an engineering evaluation.

5. The engineering evaluation would consider the best case, which is the operational power dissipation of 1.2 W divided by the temperature-derated power of 1.36 W, or a ratio of 0.882. This indicates that the best available case, with no reliability derating applied, is a ratio of 0.882 versus a maximum acceptable ratio of 1.0. The largest reliability derating that could then be applied would be approximately 11.8%, versus the recommended 50% derating guideline. This would derate the resistor power dissipation by 1.36 W × 0.118 ≈ 160 mW. As a check: 1360 mW − 160 mW = 1200 mW, or 1.2 W. The safety factor stress ratio would then be equal to 1.0. Reliability engineering should then perform a design reevaluation prior to allowing the use of this resistor in the circuit design. If it were used, it would be a reliability critical item and should appear on the reliability critical items parts list.

6. The typical engineering evaluation must ensure that there are no hot spot temperatures on the PWA that impact the resistor power dissipation, that there are no localized hot spots on the resistor, and that the surrounding air temperature is less than the temperature of the resistor body. These items must be confirmed by actually making the necessary temperature measurements and constructing a temperature profile.
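The arithmetic of this example can be checked with a short script. The linear derating function is a sketch of the curve in Figure 5 built from the stated end points (2 W at 25°C, 0 W at 275°C); the function and variable names are illustrative.

```python
def linear_derated_power(rated_w, t_rated_c, t_zero_c, temp_c):
    """Rated power up to t_rated_c, falling linearly to 0 W at t_zero_c."""
    if temp_c <= t_rated_c:
        return rated_w
    if temp_c >= t_zero_c:
        return 0.0
    return rated_w * (t_zero_c - temp_c) / (t_zero_c - t_rated_c)

operating_w = 1.2
temp_derated_w = linear_derated_power(2.0, 25.0, 275.0, 105.0)  # 2 * 170/250
reliability_derated_w = temp_derated_w * 0.50                   # 50% guideline
ratio_with_guideline = operating_w / reliability_derated_w      # 1.2 / 0.68
best_case_ratio = operating_w / temp_derated_w                  # 1.2 / 1.36
max_derating_fraction = 1.0 - best_case_ratio                   # largest usable derating

print(f"Temperature-derated power: {temp_derated_w:.2f} W")
print(f"Reliability-derated power: {reliability_derated_w:.2f} W")
print(f"Stress ratio at 50% guideline: {ratio_with_guideline:.3f}")
print(f"Best-case ratio: {best_case_ratio:.3f}")
print(f"Largest derating that keeps ratio <= 1.0: {max_derating_fraction:.1%}")
```

The last figure shows how little headroom the part has: only about 12% of reliability derating is available against a 50% guideline.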

Misconception of Part Versus System Operating Temperatures

A common misconception exists concerning the meaning of temperature versus derating. This misconception causes reliability problems. For example, the system operating temperature may be stated as being 75°C. To most people not familiar with derating concepts, this means that all parts in the system are operating at or below 75°C. In fact, the 75°C applies only to the worst case external ambient temperature that the system will encounter; it does not mean that 75°C is the worst case temperature of the parts operating inside the system. It is typical for the stress ratio and derating to be calculated using a worst case temperature of 105°C. Again, this is a guideline temperature that provides an adequate operating reliability safety margin for the vast majority of parts operating in the system. For good reason, an analysis of each part in the system is still required to highlight parts that exceed a temperature of 105°C. Parts that exceed a stress ratio of 1 are considered to be high-risk reliability items and should be placed on a critical items list, resulting in an engineering analysis and judgment being made regarding appropriate corrective action for any potential problem area.

Example 2: Derating a Simple Digital Integrated Circuit

Integrated circuit description:

Triple 3-input NOR gate, commercial part number CD4025

Package: fourteen lead DIP

Supply voltage range: −0.5 to 18 V

Input current (for each input): ±10 mA

Maximum power dissipation (PD): 200 mW

Maximum junction temperature: 175°C

Recommended operating conditions: 4.5 to 15 V

Ambient operating temperature range: −55 to 125°C

Load capacitance: 50 pF maximum

Power derating begins at 25°C and derates to 0 at 175°C

Circuit design conditions:

Power supply voltage: 15 V

Operating worst case temperature: 105°C

Output current within derating guidelines

Output power dissipation: 55 mW

Junction temperature (Tj): 110°C

Often we want to find the junction to case thermal resistance. If it is not specified on the IC manufacturer's data sheet, we can calculate it as follows, given that the case temperature (Tc) is 104.4°C:

$$\theta_{jc} = \frac{T_j - T_c}{P_d} = \frac{110 - 104.4}{200\ \text{mW}} = 0.028\ ^\circ\text{C/mW} = 28\ ^\circ\text{C/W}$$

The junction temperature is calculated as follows:

$$T_j = T_c + P_d\,\theta_{jc} = 104.4 + (0.200\ \text{W})(28\ ^\circ\text{C/W}) = 104.4 + 5.6 = 110\ ^\circ\text{C}$$

A note of caution: the 28°C/W can vary depending upon the bonding coverage area between the die and the substrate.
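The two thermal relationships are inverses of each other, which a few lines of code make explicit. The function names are illustrative, and the values are those given in this example, with power converted from milliwatts to watts.

```python
def theta_jc(tj_c, tc_c, power_w):
    """Junction-to-case thermal resistance in °C/W."""
    return (tj_c - tc_c) / power_w

def junction_temp(tc_c, power_w, theta_c_per_w):
    """Junction temperature in °C from case temperature and dissipation."""
    return tc_c + power_w * theta_c_per_w

theta = theta_jc(110.0, 104.4, 0.200)    # 200 mW expressed in watts
tj = junction_temp(104.4, 0.200, theta)  # recovers the 110 °C starting point

print(f"theta_jc = {theta:.1f} °C/W")
print(f"Tj = {tj:.1f} °C")
```

Keeping the power in watts avoids the unit slip that is easy to make when a data sheet quotes dissipation in milliwatts and thermal resistance in °C/W.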

Power Derating. There are several methods that can be used to determine the derated power. One is by using the formula

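$$\text{derating slope} = \frac{P_{d(\max)}}{T_{j(\max)} - 25\ ^\circ\text{C}}$$

In the present case, the slope is 200 mW/150°C = 1.33 mW/°C; that is, the allowable power dissipation falls by 1.33 mW for every degree above 25°C.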

An alternative method is to construct a power derating curve and solve the problem graphically. From the derating curve of Figure 6, the device dissipates 200 mW from −55 to 25°C and then rolls off linearly to 0 W at 175°C.

From the IC manufacturer maximum ratings and from the intended application, it is determined that the maximum junction temperature is 110°C. Going to Figure 6, we proceed as follows.

1. Locate 110°C on the power derating curve and proceed up the 110°C line until it intersects the derating curve.

2. From this point, proceed on a line parallel to the temperature axis to the point where the line intersects the power axis.

3. It is seen that at 110°C junction temperature, the maximum power dissipation is 85 mW.

4. The power has been derated to 42.5% of the maximum rated value. It was stated that the operating temperature is 105°C. At this temperature the power dissipation is 92 mW.

The next step requires calculating the power dissipation stress ratio. Since power dissipation is directly proportional to the voltage multiplied by the current, the power has to be derated to 90% (due to the intended application) of the derated value of 92 mW: 92 × 0.9 = 82.8 mW. The stress ratio is calculated by dividing the actual power dissipation by the derated power: 55/92 = 0.5978. Measured against the stress-derated value of 82.8 mW, the ratio is 55/82.8 ≈ 0.66. Either value is less than 1.0; thus the IC used in this design situation meets the built-in reliability design criteria.
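The formula method and the graphical method should agree. The sketch below evaluates the straight-line derating formula directly; it gives slightly higher values (about 86.7 mW at 110°C and 93.3 mW at 105°C) than the 85 mW and 92 mW read from Figure 6, which is the kind of difference to expect from reading a graph. Function and variable names are illustrative.

```python
def derated_power_mw(p_max_mw, tj_max_c, temp_c, t_knee_c=25.0):
    """Flat at p_max_mw up to t_knee_c, then linear down to 0 at tj_max_c."""
    slope = p_max_mw / (tj_max_c - t_knee_c)  # 200/150 = 1.33 mW/°C here
    if temp_c <= t_knee_c:
        return p_max_mw
    return max(0.0, p_max_mw - slope * (temp_c - t_knee_c))

actual_mw = 55.0
p_at_110 = derated_power_mw(200.0, 175.0, 110.0)
p_at_105 = derated_power_mw(200.0, 175.0, 105.0)
stress_derated = 0.9 * p_at_105                    # 90% application derating

ratio_vs_derated = actual_mw / p_at_105
ratio_vs_stress_derated = actual_mw / stress_derated

print(f"Derated power at 110 °C: {p_at_110:.1f} mW")
print(f"Derated power at 105 °C: {p_at_105:.1f} mW")
print(f"Stress ratio vs derated power:        {ratio_vs_derated:.3f}")
print(f"Stress ratio vs stress-derated power: {ratio_vs_stress_derated:.3f}")
```

Both ratios stay well below 1.0, so the conclusion does not depend on whether the graph reading or the formula is used.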

FIGURE 6 Power versus temperature derating curve for CD 4025 CMOS gate.

There are several other conditions that must be satisfied. The case temperature must be measured and the junction temperature calculated to ensure that it is 105°C or less. The worst case voltage, current, and junction temperature must be determined to ensure that the worst case power dissipation never exceeds the stress ratio power dissipation of 92 mW.

It is important to understand the derating curve. Limiting the junction temperature to 110°C appears wasteful when the device is rated at 175°C. In this case the power dissipation has already been decreased to 50% of its 200-mW rating at approximately 100°C. Several of the concerns that must be evaluated deal with PWA thermal issues: hot spots and heat generating components and their effect on nearest neighbor components; voltage and current transients and their duration and period; the surrounding environment temperature; PWA workmanship; and soldering quality. There are other factors that should be considered, but those listed here provide some insight as to why it is necessary to derate a part and then also apply additional safety derating to protect against worst case and unforeseen conditions. In reliability terms this is called designing in a safety margin.

Example 3: Power Derating a Bipolar IC

Absolute maximum ratings:

Power supply voltage range: −0.5 to 7.0 V

Maximum power dissipation per gate (Pd): 50 mW

Maximum junction temperature (Tj): 175°C

Thermal resistance, junction to case (θjc): 28°C/W

Recommended operating conditions:

Power supply voltage range: −0.5 to 7.0 V

Case operating temperature range: −55 to 125°C

Circuit design conditions:

Power supply voltage: 5.5 V

Operating junction temperature: 105°C

Output current within derating guidelines

Output power dissipation: 20 mW/gate

Maximum junction temperature for application: 110°C

Since the maximum junction temperature allowed for the application is 110°C and the estimated operating junction temperature is less than this (105°C), the operating junction temperature is satisfactory.

FIGURE 7 Bipolar IC power derating curve.

The next step is to draw the derating curve for junction temperature versus power dissipation. This will be a straight-line linear derating curve, similar to that of Figure 6. The maximum power dissipation is 50 mW/gate. Plotting this curve in Figure 7 and utilizing the standard derating method for ICs, we see that the derating curve is flat from −55 to 25°C and then rolls off linearly from 25 to 175°C. At 175°C the power dissipation is 0 W.

Using the derating curve of Figure 7 we proceed in the following manner:

1. Find the 105°C temperature point on the horizontal temperature axis. From this point, draw a vertical line to the point where the 105°C temperature line intersects the derating line.

2. From the point of intersection, draw a line parallel to the temperature axis until it intersects the power dissipation axis.

3. The point of intersection defines the maximum power dissipation at 105°C. From Figure 7, the maximum power dissipation is 23.5 mW.

4. The output power dissipation was given as 20 mW/gate; therefore, the actual power dissipation is less than the derated value.

The stress ratio is calculated as follows:

1. For the intended application environment the power is derated to 90% (given condition) of the derated value. This becomes the stress derating.

2. The derated power at 105°C is 23.5 mW and is stress derated to 90% of this value: 0.90 × 23.5 = 21.15 mW.

3. The stress ratio is then calculated by dividing the actual power dissipated (20 mW) by the stress derated power (21.15 mW): 20/21.15 = 0.9456.

4. Since the stress ratio of 0.9456 is less than 1.0, the design use of this part is marginally satisfactory. Since the junction temperature was estimated to be 105°C, the actual value should be determined by calculation or measurement since a temperature of 107°C (which is very close to the estimated 105°C) will yield a stress ratio of 1.0.
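How close this part sits to the limit can be checked by solving the straight-line derating formula for the temperature at which the stress ratio reaches 1.0. This is a sketch using the formula rather than values read from Figure 7, so it gives 23.3 mW instead of 23.5 mW at 105°C and a limit near 108°C, in line with the estimate above. Names are illustrative.

```python
P_MAX_MW = 50.0      # maximum power dissipation per gate
TJ_MAX_C = 175.0     # maximum junction temperature
T_KNEE_C = 25.0      # derating starts here
STRESS_FACTOR = 0.9  # 90% application derating (given condition)
ACTUAL_MW = 20.0     # output power dissipation per gate

def derated_power_mw(temp_c):
    """Linear derating between T_KNEE_C and TJ_MAX_C."""
    if temp_c <= T_KNEE_C:
        return P_MAX_MW
    return max(0.0, P_MAX_MW * (TJ_MAX_C - temp_c) / (TJ_MAX_C - T_KNEE_C))

def stress_ratio(temp_c):
    return ACTUAL_MW / (STRESS_FACTOR * derated_power_mw(temp_c))

# Solve STRESS_FACTOR * derated_power_mw(T) = ACTUAL_MW for T.
t_limit_c = TJ_MAX_C - ACTUAL_MW * (TJ_MAX_C - T_KNEE_C) / (STRESS_FACTOR * P_MAX_MW)

print(f"Stress ratio at 105 °C: {stress_ratio(105.0):.3f}")
print(f"Junction temperature at which ratio reaches 1.0: {t_limit_c:.1f} °C")
```

A margin of only a few degrees between the estimated junction temperature and the limit is why the text calls for the actual junction temperature to be measured or calculated.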
