Maintenance Strategies, Dielectric Theory, Insulating Materials, Failure Modes, and Maintenance Impact on Arc-Flash Hazards (part 1-3)


1 Introduction

2 Why Maintain and Test

3 Overview of Electrical Maintenance and Testing Strategies

4 Planning an EPM Program

5 Overview of Testing and Test Methods

6 Review of Dielectric Theory and Practice

7 Insulating Materials for Electrical Power Equipment

8 Causes of Insulation Degradation and Failure Modes of Electrical Equipment

9 Maintenance of Protective Devices and their Impact on Arc-Flash Hazard Analysis

__1 Introduction

The deterioration of electrical equipment is normal, and this process begins as soon as the equipment is installed. If deterioration is not checked, it can cause electrical failures and malfunctions. In addition, load changes or circuit alterations may be made without overall design coordination, which can result in improper selection of equipment, incorrect settings of protective devices, or the wrong trip devices being installed in the circuits. The purpose of an electrical preventive maintenance (EPM) and testing program should be to recognize these factors and provide means for correcting them. With an EPM and testing program, potential hazards that can cause failure of equipment or interruption of electrical service can be discovered and corrected. Also, the EPM program will minimize the hazards to life and equipment that can result from failure of equipment when it is not properly maintained. Properly maintained equipment reduces downtime by minimizing catastrophic failures. To carry out the successful operation of electrical equipment and apparatus, it is essential to set up an effective maintenance and testing program.

This program can be implemented by setting up a maintenance department or by contracting the work to a private company engaged in this practice.

The EPM program should consist of conducting routine inspections, tests, repairs, and service of electrical power system apparatus such as transformers, cables, circuit breakers, switchgear assemblies, and the like, along with associated equipment comprised of control wiring, protective devices and relays, supervisory equipment, and indicating and metering instruments.

__2 Why Maintain and Test

A well-organized and implemented program minimizes accidents, reduces unplanned shutdowns, and lengthens the mean time between failures (MTBF) of electrical equipment. Benefits of EPM can be categorized as direct and indirect. Direct benefits are derived from reduced cost of repairs, reduced downtime of equipment, and improved safety of personnel and property.

Indirect benefits can be related to improved morale of employees, better workmanship, increased productivity, and the discovery of deficiencies in the system that were either designed into the original system or caused by later changes made in the system.

__3 Overview of Electrical Maintenance and Testing Strategies

Much of the essence of effective electrical equipment preventive maintenance (PM) can be summarized by four rules:

  • Keep it dry.
  • Keep it clean.
  • Keep it cool.
  • Keep it tight.

More specifically, most electrical power and control equipment is susceptible to a relatively small number of mechanisms of degradation, and the purpose of most EPM activities is to prevent them, retard them, or mitigate their effects. There are a number of traditional philosophical approaches to electrical maintenance, such as run-to-failure (RTF), maintain as necessary, perform maintenance on fixed time schedules, and predictive maintenance, which are briefly summarized in the following sections. The reliability-centered maintenance (RCM) program is gaining favor because it combines the strengths of reactive, preventive, predictive, and proactive maintenance strategies. The RCM approach to electrical equipment is discussed in greater detail than other maintenance strategies because it is becoming a maintenance program of choice. However, most power utilities, manufacturing firms, and owners of plant facilities utilize a combination of these programs. The decision as to which approach to adopt depends largely on the scope of the system and equipment, and on how management views the costs and benefits of maintenance.


Run-to-failure

In this approach, EPM per se is not performed at all. Degraded equipment is only repaired or replaced when the effect of degradation on process output becomes unacceptable. (For most types of electric power equipment, this coincides with catastrophic failure.) No explicit attempt is made to monitor performance or to avert failure, and the risks associated with ultimate failure are accepted. Because of the generally high reliability of electric power equipment installed in a benign environment, the RTF approach often provides satisfactory power reliability and availability in noncritical applications.

Small organizations which lack dedicated maintenance staffs often utilize this approach by default, and larger and more sophisticated organizations in the manufacturing sector also frequently apply it to noncritical equipment and systems. This maintenance strategy is also referred to as reactive maintenance.

Inspect and service as necessary

This approach is an advance beyond RTF wherein plant operating or maintenance personnel inspect electrical equipment on a more or less regular schedule (often during regular rounds of the plant). Under this approach, incipient failures are usually corrected before they become catastrophic, especially if the impact of a failure is considered unacceptable, and there is often some informal monitoring of performance to predict future failures. Many industrial manufacturing plants use this approach and find it satisfactory.

Time-based maintenance

The time-based maintenance (TBM) strategy is also known as scheduled PM. In this approach, established EPM activities are performed at fixed intervals of calendar time, operating hours, or operating cycles. Both procedures and schedules are usually based on manufacturers' recommendations or industry standards. While the scheduled EPM approach ensures that equipment gets periodic attention, it does not necessarily prioritize EPM according to safety or productivity significance, nor does it optimize the application of limited EPM resources or take advantage of lessons learned from plant and industry experience. Scheduled EPM currently is the predominant approach among relatively sophisticated operators of plants where productivity and safety are serious concerns.

Condition-based maintenance

The condition-based maintenance (CBM) strategy is also called predictive maintenance. It is an extension of the TBM strategy and uses nonintrusive testing techniques to assess equipment condition. It uses planned maintenance tasks that are based on the equipment's previous operating history and on trending of the maintenance data. It is most effective when combined with a PM program because it prioritizes EPM based on criticality of equipment, productivity, resources, or lessons learned from experience.


Reliability-centered maintenance

RCM is a maintenance strategy in which equipment condition, criticality, failure history, and life cycle cost are integrated to develop logically the most effective maintenance methods for each system, subsystem, and component.

RCM capitalizes on the respective strengths of reactive, preventive, predictive, and proactive maintenance methods to maximize equipment reliability and availability. It is an ongoing process that continuously refines and redefines each maintenance activity.

The RCM process reduces the uncertainty inherently associated with the operational reliability of equipment by managing the risk through the periodic assessment of equipment condition. With the proper instrumentation, the current equipment condition, changes from the baseline, and margin to failure limits are readily determined. This allows the maintenance and operations staff to quantify the risk associated with continued operation or maintenance deferment, and to identify the most probable cause of the problem down to the component level. In the majority of cases, condition testing is nonintrusive, allowing equipment condition assessments to be performed with the equipment operating under normal, loaded conditions.
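The three quantities named above (current condition, change from baseline, and margin to a failure limit) can be illustrated with a small Python sketch. The `ConditionReading` structure, the function names, and the bearing temperature values are hypothetical and chosen only for illustration; real limits come from the manufacturer or from plant experience.

```python
from dataclasses import dataclass


@dataclass
class ConditionReading:
    """One trended parameter for one piece of equipment (hypothetical structure)."""
    baseline: float  # value recorded when the equipment was known to be healthy
    current: float   # latest measured value
    limit: float     # alarm or failure limit for this parameter


def change_from_baseline(r: ConditionReading) -> float:
    """Fractional change of the current value relative to the baseline."""
    return (r.current - r.baseline) / r.baseline


def margin_to_limit(r: ConditionReading) -> float:
    """Fraction of the baseline-to-limit span that remains unused.

    1.0 means the reading is still at baseline; 0.0 means it has reached the limit.
    The same expression works for parameters that fall toward a limit
    (for example, insulation resistance) as well as those that rise toward one.
    """
    return (r.limit - r.current) / (r.limit - r.baseline)


# Hypothetical bearing temperature readings in degrees C.
bearing = ConditionReading(baseline=60.0, current=75.0, limit=100.0)
print(f"change from baseline: {change_from_baseline(bearing):.0%}")
print(f"margin to limit: {margin_to_limit(bearing):.1%}")
```

Trending these two numbers over successive surveys is one simple way to express how quickly the margin is being consumed.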

The concept of RCM has evolved considerably over time when one applies it to facility maintenance. Historically, there was an intuitive belief that because mechanical parts wear out over time, equipment reliability is directly related to operating age. The belief was that the more frequently that equipment was overhauled, the better protected it would be against failure.

Industry increased PM to include nearly everything.

In the 1970s, the airline industry found that many types of failure could not be prevented regardless of the intensity of maintenance. Actuarial analysis of failure data suggested that PM was ineffective by itself in controlling failure rates. And for many items, failure rates did not increase with increased operational use. In the 1980s, early forms of condition monitoring devices came on the market and coincided with microprocessors and a new computer literacy. RCM theory was refined and adopted by the US Navy's submarine fleet. It was shown that in many cases, scheduled overhaul increases the overall failure rate by introducing new infant mortality probability into an otherwise stable system.

What has evolved is a pair of complementary programs, rigorous and streamlined, each with its most appropriate applications based on the consequences of failure, the probability of failure, historical data, and the amount of risk the organization is willing to tolerate.

Rigorous RCM in its original concept involves a heavy reliance on detailed failure modes and effects analyses; math-calculated probabilities of failure; model development and accumulation of historical data. It provides the most detailed knowledge on a specific system and component and provides the most detailed documentation. Because of the detail involved, it is highly labor intensive, time-consuming, and comparatively expensive. The most appropriate applications of RCM are when the consequences of failure would result in a catastrophic risk to personal safety and health, to the environment, or could result in complete economic failure of an organization.

Plant managers adopted a streamlined RCM approach recognizing its benefits while realizing that few building mechanical and electrical systems carry the catastrophic risk addressed in the rigorous RCM process.

Lower intensity, more in line with the scale of a facility's infrastructure, also meant lower costs. Streamlined RCM targets systems and components in order of criticality. It relies heavily on condition-based tasks and eliminates low-value maintenance tasks altogether based on maintenance and operations staff input and historical data. It minimizes extensive analysis in favor of finding the most obvious, costly problems early on, capitalizes on the early successes, and then expands outward in a continuous fashion.


FIG. 1 Common applications of maintenance strategies for RCM program. (From St. Germain, E. and Pride, A., NASA Facilities RCM Guide, 1996, p. 1-1)

The figure shows reliability-centered maintenance divided into four strategies, each with its typical applications:

  • Reactive maintenance (run-to-failure, RTF): small items; noncritical; inconsequential; unlikely to fail; redundancy
  • Preventive maintenance (time based): subject to wear; consumable replacement; known failure pattern
  • Predictive maintenance (condition based): random failure patterns; not subject to wear; PM-induced failures
  • Proactive maintenance (improvement): RCFA; A&E design; age exploration; FMEA


Streamlined RCM requires a thorough understanding of condition monitoring technologies as well as analytical techniques, including root cause failure analysis (RCFA), trend analysis, and failure modes and effects analysis (FMEA). With some exceptions, streamlined RCM is the philosophy of choice in plant maintenance programs.

Failure: RCM defines failure as any unsatisfactory condition. It may be a loss of function, where a system or component stops running altogether, or it may be a loss of acceptable quality, where operation continues, but at a substandard or inadequate quality. A failure may be catastrophic or merely out of tolerance.

As stated, RCM seeks the optimum mix of four maintenance strategies: reactive (RTF), preventive (time-directed), predictive (condition-directed), and proactive (failure-finding). Most common elements of each maintenance strategy are illustrated in FIG. 1. The application of the various elements of the four maintenance strategies for an automobile RCM program is shown in FIG. 2.

The mix: Maintenance activity at facilities typically runs about 80%-85% reactive (service requests, trouble calls, repairs), 15% preventive, 1% predictive, and 1% proactive. Goals for effective maintenance programs should be in the range of 30%-35% for reactive maintenance, 30%-35% for PM, 25% for predictive maintenance, and 10% for proactive maintenance.

In addition to improving reliability, this maintenance mix will have a sizable impact on the cost of maintenance: breakdowns and repairs typically cost about $17-18 per installed horsepower (hp)/year, preventive maintenance costs about $11-13 per installed hp/year, and predictive maintenance costs about $7-9 per installed hp/year.
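To make the cost figures concrete, the following Python sketch multiplies the quoted per-horsepower ranges by an installed capacity. The 2000 hp plant size and the function name are assumptions for illustration; the text gives no cost figure for proactive maintenance, so it is left out.

```python
# Cost ranges in dollars per installed horsepower per year, as quoted in the text.
COST_PER_HP_YEAR = {
    "reactive": (17.0, 18.0),
    "preventive": (11.0, 13.0),
    "predictive": (7.0, 9.0),
}


def annual_cost_range(strategy: str, installed_hp: float) -> tuple[float, float]:
    """Return the (low, high) annual cost in dollars for one strategy."""
    low, high = COST_PER_HP_YEAR[strategy]
    return low * installed_hp, high * installed_hp


installed_hp = 2000.0  # hypothetical plant size
for strategy in COST_PER_HP_YEAR:
    low, high = annual_cost_range(strategy, installed_hp)
    print(f"{strategy:<10} ${low:,.0f} to ${high:,.0f} per year")
```

For this hypothetical plant, the gap between the reactive and predictive ranges is on the order of $16,000 to $22,000 per year, which is the kind of difference the text refers to as a sizable impact.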


FIG. 2 An example of proactive maintenance applied to an automobile. (From St. Germain, E. and Pride, A., NASA Facilities RCM Guide, 1998, p. 1-7)

The figure labels parts of an automobile with the strategy applied to each:

  • RTF: wipers; lamps
  • Time based: oil change; inspect belts; inspect fluids; inspect tread; inspect exhaust
  • Condition based: replace pads, turn rotor; replace tire; wash and wax
  • Proactive: lessons learned incorporated into next year's model; failure research; user feedback and trends


FIG. 3 A decision logic tree for maintenance strategy. (From St. Germain, E. and Pride, A., NASA Facilities RCM Guide, 1996, p. 2-3)

Decision questions in the tree:

  • Will failure have a direct and adverse impact on environment, health, security, and safety?
  • Will failure have a direct and adverse impact on mission?
  • Will failure result in high economic loss?
  • Is there an effective CM approach?
  • Is there an effective interval-based task?

Outcomes in the tree:

  • Run to fail
  • Develop and schedule CM task to monitor condition; perform condition-based task
  • Develop and perform interval-based task
  • Redesign system; accept failure or risk; install redundancy


A decision logic tree shown in FIG. 3 may be used as a starting point to determine the appropriate maintenance strategy for a given system or component. Various maintenance strategies are discussed in the following sections.
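A decision logic tree of this kind can be sketched as a short Python function. The branch order below is an assumption: the questions and outcomes come from FIG. 3, but the flattened figure text does not preserve how they connect, so the function follows the arrangement commonly reproduced from the NASA guide (consequence questions first, then a condition monitoring task if one is effective, then an interval-based task, then redesign). The function and parameter names are hypothetical.

```python
def select_strategy(
    affects_safety: bool,           # environment, health, security, or safety impact
    affects_mission: bool,          # direct and adverse impact on mission
    high_economic_loss: bool,       # failure results in high economic loss
    effective_cm_task: bool,        # an effective condition monitoring approach exists
    effective_interval_task: bool,  # an effective interval-based task exists
) -> str:
    """Return a starting-point maintenance strategy for one system or component."""
    # No significant consequence of failure: candidate for run to fail.
    if not (affects_safety or affects_mission or high_economic_loss):
        return "run to fail"
    # Failure matters: prefer monitoring the condition if that is effective.
    if effective_cm_task:
        return "condition-based task"
    # Otherwise fall back to a scheduled, interval-based task.
    if effective_interval_task:
        return "interval-based task"
    # No effective task exists: change the design or consciously accept the risk.
    return "redesign, accept risk, or install redundancy"


# Hypothetical example: a critical feeder breaker with no effective CM approach.
print(select_strategy(True, True, False, False, True))
```

The result is only a starting point, as the text notes; criticality rankings and plant experience would refine it.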

Reactive maintenance: It involves repair or replacement only when deterioration of the condition causes a functional failure. The unit breaks down.

Reactive maintenance assumes that failure is equally likely to occur in any part, component, or system. If an item fails and parts are not available, delay will occur. Management has no influence on when the failure will occur (usually at the most inopportune time) and a premium will be paid for urgent attention. When this is the sole type of maintenance practiced, there is typically a high percentage of unplanned maintenance, a large replacement parts inventory must be maintained, and it is an inefficient use of the workforce.

An appropriate application of reactive maintenance is when a failure of the system or component poses little risk to operations, is inconsequential, and the costs of maintenance outweigh the item's replacement cost. Examples include the replacement of failed fuses and incandescent lamps, and the repair of equipment when it breaks down. Reactive maintenance strategy is similar to the RTF strategy discussed earlier.

PM: It consists of the regularly scheduled inspection, adjustments, cleaning, lubrication, parts replacement, and repair of components. It is performed on an arbitrary time basis without regard for equipment condition. Maintenance intervals are normally predefined by the manufacturer (who may have a protective self-interest at stake and a lesser regard for costs to the plant). Regularly scheduled PM can result in unnecessary, even damaging, maintenance. Maintenance-induced failures and high maintenance costs typify this strategy. An example is overhauling a properly functioning motor generator set based on a manufacturer recommended timetable. PM strategy is the same as scheduled PM discussed earlier.

Predictive maintenance or condition monitoring: It uses nonintrusive testing techniques, visual inspection, performance data, and data analysis to assess equipment condition. It replaces arbitrarily scheduled maintenance tasks with maintenance tasks that are driven by the item's condition. Trending analysis of data is used for planning and to establish schedules. Since the technology is not applicable to all types of equipment or possible failure modes, it should not be the sole maintenance strategy employed. It is most effective when used in conjunction with a preventive program. Examples are detection of high-resistance electrical connections by infrared thermography, bearing deficiencies by vibration analysis, and motor winding problems by motor signature analysis.

Vibration monitoring: It is perhaps the most familiar and most beneficial of the mainline techniques for rotating apparatus such as motors. It should be applied to all large (> 7.5 hp), high-cost, and critical rotating equipment to monitor wear, imbalance, misalignment, mechanical looseness, bearing damage, belt flaws, sheave and pulley flaws, gear damage, flow turbulence, cavitation, structural resonance, mounting deficiencies, and fatigue. It can provide several weeks or months of warning before failure occurs, thereby allowing the remedial task to be planned for a convenient time and logistically prepared. It has an accuracy rate as high as 92% when applied correctly and a false alarm rate of about 8%. The vibration analysis can be performed in-house by technicians who have a good understanding of vibration theory and adequate equipment, or it can be outsourced.

Infrared thermography: It has numerous applications in checking electrical systems (connections, unbalanced loads, and overheating), mechanical systems (blocked flow, binding, bearings, fluid levels, and thermal efficiency), and structural systems (roof leaks, building envelope integrity, and insulation). Equipment varies from contact devices to imaging infrared cameras, coupled with appropriate analysis software. Analysis can be a challenge, owing in part to environmental factors that influence the data, so a technician with Level I or II thermography certification should be employed to perform this survey. These services can be outsourced. Thermographic findings are invaluable from a safety perspective and typically result in a cost recovery within 1 year.

Passive airborne ultrasonic: It is a low-cost tool for detecting pressure and vacuum leaks in piping, steam traps, pressure vessels, and valves; bearing, lubrication, and mechanical rubbing problems in mechanical systems; and arcing and corona in electrical systems. Ultrasonic devices are becoming increasingly popular with technicians performing lubrication tasks to determine appropriate lubrication levels. Operators require little training or prior experience, and scanners cost as little as $1000.

Lubrication oil analysis: It is often performed on large or critical machines to determine mechanical wear, the condition of the lubricant, whether the lubricant has become contaminated, and the condition and appropriateness of the lubricant additives. Lube oil packages include checking for visual condition and odor, viscosity, water content, acidity, alkalinity, and metallic and nonmetallic contamination. Precise procedures must be followed in obtaining clean, representative samples; however, analysis is performed in a laboratory at reasonable costs ($10-$100 per test). A single failure detected could pay for the program for several years.

Electrical condition monitoring techniques: These should be applied to electrical distribution cabling, panels, and connections; switchgear and controllers; transformers; electric motors; and generators. It is estimated that 95% of all electrical problems are due to connections (loose, corroded, undersized, and overtightened), unbalanced load, inductive heating, spiral heating in multistrand wires, slip rings, commutators, and brush riggings.

Condition monitoring detects abnormal temperature, voltage, current, resistance, complex impedance, insulation integrity, phase imbalance, mechanical binding, and the presence of arcing. The most common predictive tests are:

  • Infrared thermography: Detects temperature differences and the overheating of circuits (see Sect. 8 for more detail).
  • Insulation power factor (PF): Measures power loss through insulation to ground (see Sect. 3 for more detail).
  • Insulation oil analysis: Detects transformer, switch, and breaker insulation oil condition and contamination (see Sect. 4 for more detail).
  • Dissolved gas analysis: Trends the amount of nine gases in transformer oil formed by transformer age and stress: carbon monoxide (CO) and carbon dioxide (CO2) to detect overheating of windings; CO, CO2, and methane (CH4) to detect hot spots in insulation; hydrogen (H2), ethane (C2H6), ethylene (C2H4), and methane (CH4) to detect overheating of oil and/or corona discharge; and acetylene (C2H2) to detect internal arcing (see Sect. 4 for more detail).
  • Megohmmeter testing: Measures insulation resistance phase to phase or phase to ground (see Sects. 2 and 3 for more detail).
  • High-potential (hi-pot) testing: Go/no-go test of the insulation.
  • Airborne ultrasonic noise: Detects electrical arcing and corona discharges.
  • Battery impedance: Checks impedance between terminals; compares each battery against its previous readings (should be within 5%) and against the others in the bank (within 10%); and detects an internal short (impedance approaching 0), an open circuit (impedance approaching infinity), and premature aging due to heat or discharges (fast rise in capacity loss) (see Sect. 8 for more detail).
  • Surge testing: Go/no-go test of winding insulation.
  • Motor circuit analysis (MCA): Measures motor circuit resistance, capacitance, imbalance, and rotor influence (see Sect. 10 for details).
  • Motor current signature analysis (MCSA): Provides signatures of motor current variations (see Sect. 10 for details).
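The battery impedance comparison described among the tests above lends itself to a short worked example. The Python sketch below applies the two tolerances quoted in the text: 5% against the same battery's previous reading and 10% against the rest of the bank. Comparing against the bank median is an assumption (the text says only "others in the bank"), and the function name, cell names, and impedance values in milliohms are hypothetical.

```python
from statistics import median


def check_battery_bank(
    current: dict[str, float],
    previous: dict[str, float],
    own_tol: float = 0.05,   # 5% versus the same battery's previous reading
    bank_tol: float = 0.10,  # 10% versus the rest of the bank
) -> dict[str, list[str]]:
    """Return findings for each battery whose impedance is out of tolerance."""
    bank_ref = median(current.values())  # assumed reference for the bank comparison
    findings: dict[str, list[str]] = {}
    for cell, z in current.items():
        notes = []
        z_prev = previous[cell]
        if abs(z - z_prev) / z_prev > own_tol:
            notes.append("changed more than 5% from previous reading")
        if abs(z - bank_ref) / bank_ref > bank_tol:
            notes.append("differs more than 10% from bank median")
        if notes:
            findings[cell] = notes
    return findings


# Hypothetical readings in milliohms.
previous = {"cell1": 1.00, "cell2": 1.00, "cell3": 1.05, "cell4": 0.98}
current = {"cell1": 1.00, "cell2": 1.02, "cell3": 1.30, "cell4": 0.99}
for cell, notes in check_battery_bank(current, previous).items():
    print(cell, notes)
```

In this made-up data set only cell3 is flagged, and it fails both comparisons, which is the pattern that would prompt closer inspection of that battery.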

Electric motor phase voltage unbalances affect the phase current unbalances, cause motors to run hotter, and reduce the motor's ability to produce torque.
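The text does not define how unbalance is measured. A minimal Python sketch, assuming the commonly used NEMA MG 1 definition (maximum deviation from the average line voltage, divided by the average), is shown below; the function name and the example voltages are hypothetical.

```python
def percent_voltage_unbalance(v_ab: float, v_bc: float, v_ca: float) -> float:
    """Percent unbalance of three line-to-line voltages.

    Assumed definition: 100 * (maximum deviation from average) / average.
    """
    voltages = (v_ab, v_bc, v_ca)
    average = sum(voltages) / 3.0
    max_deviation = max(abs(v - average) for v in voltages)
    return 100.0 * max_deviation / average


# Hypothetical readings on a nominal 460 V motor supply.
unbalance = percent_voltage_unbalance(460.0, 467.0, 450.0)
print(f"voltage unbalance: {unbalance:.2f}%")
```

Here the average is 459 V and the largest deviation is 9 V, giving an unbalance of roughly 2%.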

For every 10°F increase in operating temperature, it is estimated that the life of the equipment is reduced by half (H.W. Penrose, White Paper, Test methods for determining the impact of motor condition on motor efficiency and reliability).
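The halving rule can be written as a one-line formula: relative life equals 0.5 raised to the temperature rise divided by the halving interval. The Python sketch below uses the 10°F interval quoted above as its default; the interval is left as a parameter because similar rules of thumb are often quoted per 10°C. The function name is hypothetical, and the result is an estimate, not a guaranteed life.

```python
def relative_life(temp_rise_f: float, halving_interval_f: float = 10.0) -> float:
    """Estimated remaining fraction of equipment life after a temperature rise.

    Applies the halving rule of thumb quoted in the text:
    life is halved for every `halving_interval_f` degrees of rise.
    """
    return 0.5 ** (temp_rise_f / halving_interval_f)


for rise in (0.0, 10.0, 20.0, 30.0):
    print(f"{rise:4.0f} F rise -> {relative_life(rise):.3f} of rated life")
```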

Some of these electrical tests require the circuits to be energized, and others do not. Some tests require specific initial conditions, such as normal operating temperature. High loads tend to amplify problems, whereas low loads may allow them to go undetected.

Electricians, technicians, and electrical engineers trained in electrical predictive techniques can perform the testing. A comprehensive testing program toolbox would include an infrared camera, ultrasonic detector, multimeter/volt-ohmmeter, clamp-on current transformer, an insulation and PF test set, battery impedance test set, MCSA test set, and MCA tester.

Proactive maintenance: It improves equipment condition and rate of degradation through better design, installation procedures, failure analysis, workmanship, and scheduling. Its procedures and technologies are used during forensic evaluations to determine the cause of failure. Proactive maintenance uses feedback to ensure that changes from lessons learned and best practices are incorporated in future designs and procedures. It employs a life-cycle view of maintenance, ensures that nothing affecting maintenance is done in isolation, and integrates maintenance support functions into maintenance planning. It uses RCFA and predictive technologies to maximize maintenance effectiveness. Common proactive techniques are:

RCM specifications: Specifications that incorporate RCM philosophy and techniques are prepared for new and rebuilt equipment. These specifications include vibration, alignment, and balance standards; electrical testing criteria; lube oil testing requirements; and commissioning and acceptance testing requirements. Operator and maintenance feedback and RCM analysis documentation provide designers with justification for equipment upgrades and modernization. New and replacement units' design should reflect lessons learned and best practices for improvements on operability, maintainability, and reliability.

Failed part analysis: Involves visually inspecting failed parts to identify the root cause of the failure. It looks at forensic scoring, color, and pitting, particularly of bearings, which are generally the equipment's weakest components and achieve only 10%-20% of their design life.

RCFA: Maintenance technicians usually repair symptoms, although recurring problems are symptomatic of more severe problems. The end result is high cost, questionable mission reliability, strained user goodwill, and safety hazards. RCFA seeks to find the cause, not just the effect, quickly, efficiently, and economically. Predictive maintenance techniques detect and correct problems before failure, but do not act on the root cause. RCFA provides the information to eliminate the recurrence and instill the mentality of "fix forever."

FMEA: Similar to RCFA, but performed prior to failure. Its goal is to identify potential failures and failure modes so that action can be taken to prevent the failure, detect the failure earlier, and reduce the consequences of failure. For each affected piece of equipment, it describes the function, identifies failure modes and the effects of failure, assesses the probability and criticality of failure, and suggests a maintenance approach.

Reliability engineering: It involves the redesign, modification, and replacement of components with superior components, such as sealed bearings, upgraded metal, and lubricant additives.

Age exploration: Determines the optimal maintenance frequency. It starts with the manufacturer's recommendations, then adjusts the frequency based on equipment histories and on observations and condition assessments during PMs and "open and inspects."

Recurrence control: A repetitive failure is the recurring inability of a system, subsystem, structure, or component to perform the required function. The process analyzes the repeated failure of the same component, repeated failure of various components of the same system, and the repeated failure of the same component of various systems. Historical maintenance and trend data would be monitored to determine if recurring component problems might be symptomatic of possible generic problems and/or procedures of system aging, corrosion, wear, design, operations, the work environment, or maintenance application (or misapplication).

Program implementation: The planning of a maintenance program should include considerations for proper test equipment, tools, trained personnel to carry out the maintenance tasks, and time required to perform inspections, tests, and maintenance routines. Also, consideration should be given to record-keeping systems that range from computerized maintenance management systems (CMMSs) to manual file systems. There are a number of companies that offer computerized maintenance management programs, either as stand-alone programs or for incorporation into the facility operational programs. The reader is encouraged to look into these programs since they are not fully covered in this guide.

The following are the steps in implementing an effective maintenance program:

1. Determine the objectives and long-range goals of the maintenance program.

2. Survey and consolidate data on equipment breakdowns.

3. Determine equipment criticalities.

4. Determine the risk and the amount of risk that you are willing to tolerate.

5. Establish metrics and key performance indicators (KPIs) to track and trend performance.

6. Establish the best maintenance techniques within your resources to mitigate the risk. Determine the maintenance procedures and frequencies.

7. Schedule and implement the program, starting with the most critical systems and those with the fastest, most beneficial paybacks first.

8. Publicize successes; provide trends, metrics, and KPIs to top management to gain management support.

9. Repeat the cycle.

Maintenance analyst: The quality of the maintenance program is reflective of the skill of the maintenance technicians, their workmanship, quality of the supporting documentation, procedures, and the technologies used.

A position for maintenance analyst should be included in an RCM program. This person should be able to detect the equipment condition, must have the skill to analyze the condition, must be able to diagnose the machine or system operation and develop a course of action, and must take the action needed to prevent failure (or allow RTF). The analyst would be responsible for monitoring and analyzing data for the mechanical systems. He or she would receive all work orders, trouble calls, KPIs, and test results and would provide continuous oversight and analysis.

Plant databases: CMMSs, building management systems (BMSs), and energy management systems (EMSs) provide invaluable historical data to the maintenance analyst. Historical data from these provide information on age-reliability relationships, data to trend and forecast impending failure, test results, performance data, and feedback to improve performance and to document condition.

RCM involves specifying and scheduling EPM activities in accordance with the statistical failure rate and/or life expectancy of the equipment being maintained and its criticality and productivity, and continually updating EPM procedures and schedules to reflect actual maintenance experience in the plant. RCM is the most cost-effective of the alternative approaches because it improves plant safety, reliability, and availability while reducing maintenance costs by concentrating limited maintenance resources on items which are the most important and/or troublesome, and reducing or eliminating unnecessary maintenance on items which are of little significance and/or highly reliable. A comprehensive RCM program also incorporates structured provisions for failure root cause investigation and correction and for performance monitoring to predict failures. RCM is used extensively in the military and is gaining acceptance among both nuclear utilities and manufacturing plant operators as its advantages are increasingly recognized.

__3.1 Key Factors in EPM Optimization Decisions

The optimum EPM approach for any specific plant, system, and/or piece of equipment depends on a variety of factors, including the following:

  • Safety impact of equipment failure
  • Productivity and profitability impact of equipment failure (including costs of lost production as well as failed equipment repair or replacement)
  • Cost of PM
  • Failure rate and/or anticipated life of equipment
  • Predictability of failure (either from accumulated operating time or cycles or from discernible clues to impending failure)
  • Likelihood of inducing equipment damage or system problems during maintenance and testing
  • Technical sophistication of the plant maintenance staff
  • Availability of equipment reliability data to support RCM

__3.2 General Criteria for an Effective EPM and Testing Program

Effective electrical equipment and subsystem PM and testing programs should satisfy the criteria listed below.

First and most fundamental, a structured EPM program should actually exist. That is, EPM should be performed as follows:

  • Under formal management control
  • In accordance with defined practices and schedules
  • By clearly designated persons

Specifically:

Management should assign a high priority to EPM. As a corollary, adequate resources (personnel, facilities, tools, test equipment, training, engineering, and administrative support) should be devoted to EPM. Adequate support from design engineering and operations is especially important.

EPM activities should be prioritized according to the criticality of the systems and equipment involved, with the highest resource intensity and scheduling priority assigned to equipment, subsystems, and systems important to safety.

EPM should be performed according to unambiguous written procedures based on specific consideration of equipment, application, and environmental characteristics.

EPM procedures and schedules should be maintained and reviewed in order to ensure engineering review of procedural changes and the incorporation of plant modifications.

The EPM program should have provisions to take effective advantage of actual experience accumulated both in the plant and elsewhere (e.g., through professional society and industry association publications, and informal communications with other interested organizations).

The EPM program should incorporate effective provisions for failure root cause analysis, correction, and recurrence control.

Information systems should be in place to record and update the plant maintenance, testing, and operating history, and to facilitate trending of test data, in support of the previous two criteria.

EPM should be performed only by appropriately qualified personnel. (See Section 3.3.)

Management should continually monitor and reevaluate the effectiveness of the EPM program, and make appropriate changes in response to identified programmatic problems and advances in maintenance technology.

By clear implication, the "RTF" and "inspect and service as necessary" philosophies described earlier fail to provide enough structure, direction, and monitoring to satisfy the criteria for a sound EPM approach. These philosophies are not acceptable for important equipment and systems. At a minimum, a scheduled EPM program is clearly necessary.

__3.3 Qualifications of EPM Personnel

The minimum acceptable qualifications for personnel assigned to perform EPM depend on the type of maintenance and the type of equipment to be maintained. It is normally acceptable for non-specialist personnel to perform superficial inspections and other undemanding EPM tasks when guided by defined procedures and acceptance criteria. However, effective administrative controls should be in place to ensure that critical PM tasks on important equipment and systems are performed only by, or at least under the immediate and active supervision of, appropriately trained and experienced maintenance technicians. Such tasks typically include internal inspection, testing, calibration, and refurbishment.

Training for critical EPM work on important equipment and systems should include at least the following:

  • The fundamentals of electrical power technology
  • General electrical maintenance techniques
  • Electrical safety methods and practices
  • The design and operation of the equipment and system to be maintained
  • The maintenance and testing procedures applicable to the equipment

For critical tasks, technicians' experience should include similar work on the same or closely comparable equipment, preferably in an operational environment, although experience acquired in a training environment under direct supervision of experienced instructors is acceptable.

With regard to electrical safety methods and practices, the National Fire Protection Association (NFPA) and the Occupational Safety and Health Administration (OSHA) have promulgated new guidelines and requirements to protect workers from shock and flash hazards. NFPA 70E, Article 110.8(B)(1), requires safety-related work practices to be used to protect employees who might be exposed to the electrical hazards involved when working on live parts operating at 50 V or more. Appropriate safety-related work practices shall be determined, using both shock hazard and flash hazard analyses, before any person approaches exposed live parts within the limited approach boundary. Similarly, OSHA 1910.335(a)(1)(i) requires employees working in areas where there are potential electrical hazards to be provided with, and to use, electrical protective equipment that is appropriate for the specific parts of the body to be protected and for the work to be performed. Also, in accordance with OSHA 1910.132(d), the employer is required to assess the workplace hazards to determine the personal protective equipment (PPE) required to protect the worker from shock and flash hazards. The NFPA 70E and OSHA requirements for shock and arc-flash hazards, and guidelines for performing such analyses, are covered in more detail in Sect. 13, Sections 13.2 and 13.3. The maintenance of protective devices and its impact on arc-flash hazard are covered in Section 9 of Sect. 1.

__3.4 Optimization of PM Intervals

Experience in a variety of industries demonstrates that performing PM on an absolutely fixed schedule rarely results in the optimum balance among the costs of preventive and corrective maintenance and the safety and productivity benefits of equipment reliability and availability. Given an adequate historical failure and maintenance database, reasonably straightforward methods can be used to optimize the PM cycle.

Also, several industry standards, such as National Fire Protection Association (NFPA) 70B, InterNational Electrical Testing Association (NETA) maintenance testing specifications, and others, including manufacturers' recommendations, provide guidelines on the frequency of maintenance of electrical equipment which could be used to establish the EPM cycle.
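One way to see how failure history can inform the PM cycle is the classical age-replacement cost model, sketched below. This is an illustrative technique chosen for this example, not a method prescribed by the text or by the standards named above. It assumes failures follow a Weibull distribution whose shape and scale have already been fitted from plant records; the function names, the cost figures, and the parameter values are all hypothetical.

```python
import math


def weibull_reliability(t, shape, scale):
    """Probability that the equipment survives to age t (Weibull model)."""
    return math.exp(-((t / scale) ** shape))


def cost_rate(interval, shape, scale, cost_pm, cost_failure, steps=1000):
    """Expected cost per unit time if PM is performed at age `interval`."""
    # Expected cycle length = integral of R(t) from 0 to interval (trapezoid rule).
    dt = interval / steps
    area = 0.0
    for i in range(steps):
        r0 = weibull_reliability(i * dt, shape, scale)
        r1 = weibull_reliability((i + 1) * dt, shape, scale)
        area += 0.5 * (r0 + r1) * dt
    r_end = weibull_reliability(interval, shape, scale)
    # A cycle ends either with planned PM (survived) or with a failure.
    expected_cost = cost_pm * r_end + cost_failure * (1.0 - r_end)
    return expected_cost / area


def best_interval(candidates, shape, scale, cost_pm, cost_failure):
    """Pick the candidate interval with the lowest expected cost rate."""
    return min(
        candidates,
        key=lambda t: cost_rate(t, shape, scale, cost_pm, cost_failure),
    )


# Hypothetical inputs: intervals in months, costs in arbitrary units.
candidates = list(range(6, 121, 6))
wear_out = best_interval(candidates, shape=2.5, scale=60.0, cost_pm=1.0, cost_failure=10.0)
random_failures = best_interval(candidates, shape=1.0, scale=60.0, cost_pm=1.0, cost_failure=10.0)
print(wear_out, random_failures)
```

With a wear-out failure pattern (shape greater than 1), the lowest cost falls at an intermediate interval. With purely random failures (shape equal to 1), the model always favors the longest candidate, because time-based PM does not reduce the failure rate in that case.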

__3.5 Trending of Test Results

Systematic trending of EPM test results is a key element of a high-quality electrical maintenance program. This is true because the magnitudes (pass or fail value) of many of the parameters measured during EPM tests on equipment are poor predictors of future failures, unless they are so far out of the normal range that they indicate imminent and probably irretrievable failure. Examples include insulation resistance, leakage current, capacitance, PF, and dissipation factor (DF); bearing temperature and vibration; and winding temperature. However, a degrading trend in these parameters strongly indicates impending trouble, especially if the trend is accelerating. A sound trending program can often alert the maintenance and operations staff of the plant in time to arrest the degradation and avert the failure, or at least to minimize the effect of the failure on safety and productivity.
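A minimal sketch of such trending is given below, assuming periodic readings of a single parameter taken under comparable conditions. It fits least-squares slopes to the whole history and to its earlier and later portions, then reports whether the parameter is degrading and whether the degradation is accelerating. The function names, the split of the history into two halves, and the sample readings are assumptions for illustration only.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def assess_trend(times, values, lower_is_worse=True):
    """Classify a test history as degrading and/or accelerating."""
    if len(times) != len(values) or len(times) < 3:
        raise ValueError("need at least three paired readings")
    # Express every slope as a rate of worsening (positive = getting worse).
    sign = -1.0 if lower_is_worse else 1.0
    mid = len(times) // 2
    overall = sign * least_squares_slope(times, values)
    early = sign * least_squares_slope(times[: mid + 1], values[: mid + 1])
    late = sign * least_squares_slope(times[mid:], values[mid:])
    degrading = overall > 0
    return {"degrading": degrading, "accelerating": degrading and late > early}


# Made-up insulation resistance history (years, megohms).
years = [0, 1, 2, 3, 4, 5]
megohms = [2000, 1950, 1880, 1750, 1500, 1100]
result = assess_trend(years, megohms)
print(result)
```

For parameters where a rising value indicates trouble, such as leakage current or bearing temperature, pass `lower_is_worse=False`.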

To provide meaningful information, the trending program must be structured to screen out the effects of external factors which affect the measured results but which are irrelevant to the actual health and reliability of the equipment. Test procedures should mandate precautions to ensure that the external conditions which can affect the test results remain the same from test to test, or to correct the results when this is impractical. (For example, insulation resistance readings taken at varying temperatures are corrected to a common base temperature.) Typical irrelevant external conditions that affect electrical test results include temperature, humidity, and load.
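The temperature correction mentioned above can be sketched as follows. The example assumes the common rule of thumb that insulation resistance roughly halves for every 10 °C rise in temperature; the actual correction factor depends on the insulation type and should be taken from the applicable standard or manufacturer's data. The function name, the 20 °C base temperature, and the halving interval are assumptions for illustration.

```python
def correct_to_base(r_measured_megohm, temp_c, base_temp_c=20.0, halving_interval_c=10.0):
    """Correct an insulation resistance reading to a common base temperature.

    Assumes resistance halves for each `halving_interval_c` rise in temperature
    (a rule of thumb, not a value taken from any specific standard).
    """
    exponent = (temp_c - base_temp_c) / halving_interval_c
    return r_measured_megohm * 2.0 ** exponent


# A reading of 500 megohms taken at 30 degrees C, corrected to the 20 degree C base.
corrected = correct_to_base(500.0, 30.0)
print(corrected)  # 1000.0
```

Readings corrected in this way can be compared from test to test even when the equipment temperature differed at the time of measurement.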

__3.6 Systematic Failure Analysis Approach

Failure analysis and root cause investigation should be an integral part of any EPM program. The steps to be taken after a failure is observed are as follows:

1. Use a failure cause analysis to determine the proximate cause of the failure. The proximate cause is expressed in terms of the piece-part level failure, e.g., relay XX failed to transfer due to corroded contacts.

2. Compare the proximate cause to past failures or conditions on the same and similar equipment to determine if the problem has a systematic root cause, e.g., a chemically active environment in the example cited above.

3. If there appears to be no systematic root cause, correct the failure, resume operation, and continue performance monitoring. If there is a discernible root cause, initiate a structured root cause investigation.

4. If the problem is generic, contact other affected plants and manufacturers of the equipment to determine if they have taken any effective corrective actions. If so, adapt these actions to the specific circumstances of the affected equipment; if not, proceed to the next step.

5. If the problem is plant-specific, or if it is generic but no effective solution has been developed elsewhere, determine if it is attributable to a unique system design, to application or environmental factors, or to operational factors such as maintenance, testing, and operations practices.

6. If the problem is determined to be related to system design, equipment application, or environment, determine the specific deficiency (through special tests, performance monitoring, environmental monitoring, etc.), and make appropriate corrections.

7. If the problem is related to faulty operations, identify and correct the specific procedures involved.

8. Determine whether the root cause of the problem is a programmatic deficiency, e.g., in procedures writing, training, supervision, or adequacy of resources, and make appropriate corrections.

9. Perform the necessary post-correction testing and monitoring to close out the problem and ensure that it is corrected.
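The decision flow in the steps above can be sketched as a small function that turns investigation findings into an ordered action list. This is one reading of the steps, not a definitive procedure; the `FailureFindings` fields, the category names, and the action wording are hypothetical and would need to be adapted to plant practice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FailureFindings:
    proximate_cause: str                   # step 1: piece-part level failure
    systematic: bool                       # step 2: seen before on same or similar equipment
    generic: bool = False                  # step 4: affects other plants too
    external_fix_available: bool = False   # step 4: effective fix found elsewhere
    category: Optional[str] = None         # steps 5-7: "design", "application",
                                           # "environment", or "operations"
    programmatic_deficiency: bool = False  # step 8


def plan_actions(f: FailureFindings) -> List[str]:
    """Return the follow-up actions implied by the findings, in order."""
    actions = [f"Record proximate cause: {f.proximate_cause}"]
    if not f.systematic:
        # Step 3, no systematic root cause.
        actions.append("Correct failure, resume operation, continue performance monitoring")
        return actions
    actions.append("Initiate structured root cause investigation")
    if f.generic and f.external_fix_available:
        actions.append("Adapt corrective actions proven at other plants or by the manufacturer")
    elif f.category in ("design", "application", "environment"):
        actions.append("Determine the specific deficiency by tests and monitoring, then correct it")
    elif f.category == "operations":
        actions.append("Identify and correct the specific procedures involved")
    if f.programmatic_deficiency:
        actions.append("Correct the programmatic deficiency (procedures, training, supervision, resources)")
    actions.append("Perform post-correction testing and monitoring to close out the problem")
    return actions


# The relay example from step 1, with a chemically active environment as root cause.
relay = FailureFindings(
    proximate_cause="relay XX failed to transfer due to corroded contacts",
    systematic=True,
    category="environment",
)
relay_actions = plan_actions(relay)
for action in relay_actions:
    print(action)
```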

__3.6.1 Post-maintenance Testing

Post-maintenance testing provides the best assurance that maintenance actions were accomplished correctly and that the system or component was returned to functional condition. Post-maintenance testing is heavily emphasized in the better-performing plants. In these organizations, post-maintenance tests are performed following any action that potentially affects the operability of a component/subsystem/system and the scope of the testing is broad enough to confirm all of the potentially affected functions. Associated systems, subsystems, or components are tested along with the systems, subsystems, or components which initiated the process if an engineering analysis indicates that the maintenance action could have a significant impact on these associated items.

__3.6.2 Engineering Support

Engineering support is intended to ensure that the PM program properly addresses the engineering and logistical aspects of maintenance. In view of this broad objective, engineering support of maintenance encompasses much of the engineering and management activity that takes place in a plant. This includes at least the following functions:

• Maintenance engineering

• System engineering

• Design engineering

• Training

• Spare parts and materials management

• Quality assurance

• Quality control

There are, of course, many other areas of maintenance involvement with engineering support groups. The intent here is to show areas which stand out in the better-performing plants and which tend to be missing or underdeveloped in other organizations.

Maintenance engineering is the engineering support activity most directly involved with PM. This function is present in all of the better-performing plants, although its name and where it fits into the organization vary widely from plant to plant. Its purpose is to optimize the maintenance program through planning, feedback, continual evaluation, and periodic updating of policies and procedures. The functions of a maintenance engineering group typically include

• Maintenance procedure development and control

• Periodic review and updating of maintenance practices and procedures

• Maintenance recordkeeping

• In-service inspection and testing (ISI/IST) program development

• Providing guidance to the training staff on maintenance training

• Collecting and trending equipment failure, reliability, availability, and maintainability data

• Tracking and trending the corrective- to preventive-maintenance ratio

• Failure root cause analysis

• Tracking, trending, and analysis of non-conformances

• Identifying and monitoring maintenance-related equipment performance parameters, especially failure precursors

• Identifying and monitoring maintenance performance indicators

__3.6.3 Summary

The foregoing has been a brief look at the features of the EPM program. There are many ways to effect improvements in an organization, but probably the dominant cause of failing to improve is resistance to change. In the plants that have outstanding maintenance organizations, upper management has overcome this resistance by direct, long-term involvement in establishing and implementing policies leading to improved maintenance. Perceptible improvements in reliability, availability, and thermal efficiency have generally resulted; the indirect results have been both greater safety and higher profits. The changes in these organizations were not easy and required both time and dedication to implement. Effective management appears to be the key to an effective overall maintenance organization, not the number of programs management has in place.
