Disaster by Design/Safety by Intent #3
Disaster by Design
The primary purpose of commercial nuclear power plants in the U.S. is to generate electricity. When not fulfilling that role, nuclear power plants that are shut down require electricity to run the equipment needed to keep the irradiated fuel in the reactor core and spent fuel pool from being damaged by overheating. The March 2011 accident at Fukushima Daiichi in Japan graphically illustrated what can happen when nuclear plants do not get the electricity they require.
U.S. nuclear power plants are designed with three sources of electricity: (1) the offsite power grid, (2) the backup power supply, and (3) the direct current power from batteries. (The responses to the 9/11 and Fukushima tragedies added a fourth source in the form of portable generators, but their reliability is significantly lower because this equipment is not purchased, tested, and maintained to anything close to the high standards applied to the other sources.)
When electricity from the offsite power grid is available, a nuclear plant has the largest inventory of cooling equipment available. When the offsite power grid is not available, the backup power supply has sufficient capacity for the emergency equipment, but not for the normal cooling equipment. And when the offsite power grid and the backup power supply are both unavailable—plunging the plant into what is called a station blackout—the batteries have sufficient capacity for a single cooling system for a handful of hours.
Offsite Power Grid Problems
The NRC examined times when U.S. nuclear power plants were disconnected from their offsite power grids. The NRC’s term for such events is Loss of Offsite Power (LOOP). The NRC reported roughly the same number of LOOP events when plants were operating (55) as when plants were shut down (58), even though nuclear plants tend to spend more time operating than shut down. The NRC identified four causes for LOOP events: (1) plant-centered, (2) switchyard-centered, (3) grid-related, and (4) weather-related. The relative number of events attributed to each cause is shown in Fig. 1.
An event on March 25, 2003, at the Palisades nuclear plant in Michigan is indicative of a plant-centered LOOP; although like snowflakes, no two are alike. The plant was shut down at the time for refueling. Workers installing a post for a sign in the plant’s parking lot penetrated through an underground conduit containing electrical cables. (The reports didn’t say what the sign read. Hopefully, the sign did not say “CAUTION: Important Cables Buried Below” or “No Digging.”) The control circuits for both of the offsite power transmission lines were damaged, causing a LOOP. The emergency diesel generators—the backup power supply—automatically started. But the low pressure safety injection pump that had been running to cool the reactor core was not automatically connected to the buses supplied by the emergency diesel generators. It took the operators 20 minutes to restore reactor core cooling.
An event on March 20, 1990, at the Vogtle nuclear plant in Georgia illustrates a switchyard-centered LOOP. The unit was shut down at the time for refueling. A worker drove a fuel truck into the switchyard to refill the tank of a welding machine. Trying to turn around and exit the switchyard, the worker backed the truck into a pole supporting the 230,000 volt overhead transmission line. The impact caused an electrical fault that de-energized the in-service transformer between the grid and the unit, triggering a LOOP. One of the two emergency diesel generators was out of service for maintenance. The remaining emergency diesel generator automatically started. But a sensor on its cooling system malfunctioned and stopped the emergency diesel generator. The sensor had malfunctioned 69 times since 1985, or roughly once a month, but had never been fixed or replaced. The unit was in a station blackout. The temperature of the reactor cooling water rose from 90°F to 136°F in the 36 minutes it took for workers to restart the emergency diesel generator in emergency mode (bypassing the cooling water sensor problem) and restore reactor core cooling.
An event on June 14, 2004, at the Palo Verde nuclear plant in Arizona illustrates a grid-related LOOP. All three reactors were operating when an electrical fault occurred on a 230,000 volt transmission line about 47 miles from the plant. A circuit intended to isolate the electric disturbance failed, allowing a ripple effect across the power grid. All three reactors at Palo Verde automatically shut down due to the fluctuating conditions on the power grid and six non-nuclear generating units on the grid also shut down. All the emergency diesel generators at Palo Verde automatically started, except for one of two emergency diesel generators for Unit 2. A diode in a control circuit failed, disabling the emergency diesel generator. The plant’s response to the triple shut down was complicated by:
- the emergency diesel generator for the Technical Support Center not working due to a mis-positioned switch,
- lack of understanding about a temporary modification on Unit 1 that allowed the letdown system to cause excessively high temperature in a downstream system that ignited paint on the overheated piping,
- a leaking check valve in the Unit 3 safety injection system that forced the operators to manually depressurize the low pressure safety injection system three times to protect its piping from becoming over-pressurized,
- and failure of two electrical circuit breakers to operate that delayed workers in restoring power to plant equipment.
Despite these, and other, problems, the operators were able to safely shut down all three reactors.
Another grid-related event on August 14, 2003, in the northeastern U.S. and parts of Canada caused eight operating U.S. reactors (Fermi Unit 2 in Michigan; Perry in Ohio; and FitzPatrick, Indian Point Units 2 and 3, Nine Mile Point Units 1 and 2, and Ginna in New York) to experience LOOPs. Offsite power was restored to Ginna in 49 minutes. It took 6 hours and 24 minutes to restore offsite power at Nine Mile Point Unit 2. While some plants experienced equipment malfunctions that complicated the response to the LOOP, all endured it successfully and restarted shortly afterward.
Appendix A to a report released by the NRC in December 2003 summarizes 83 grid events between 1994 and 2001 that affected U.S. nuclear power plants. The compilation included the December 14, 1994, event where a transmission line fault in Idaho rippled across the western U.S., affecting Diablo Canyon Units 1 and 2 and San Onofre Unit 2 in California; Palo Verde Units 1 and 2 in Arizona; and Columbia Generating Station in Washington.
The damage inflicted on August 24, 1992, as Hurricane Andrew hit south Florida, including the Turkey Point nuclear plant, is an example of a weather-related LOOP. Both reactors had been shut down as a precautionary measure before the hurricane’s arrival. The hurricane downed transmission lines, causing a LOOP at Turkey Point that lasted nearly five days. The high winds also damaged onsite antennas and offsite repeating stations. The plant was without telephone or radio communications for four hours, except for one hand-held radio. The fire protection system was impaired when high winds blew a tower onto the 500,000 gallon storage tank. Both reactors endured the challenge and were restarted days later.
The NRC examined four hurricanes that visited the southeastern U.S. during 2004 for the consequences at Brunswick Units 1 and 2 in North Carolina and St. Lucie Units 1 and 2 and Crystal River 3 in Florida.
- Hurricane Charley caused an offsite transmission line fault that triggered the automatic shut down of Brunswick Unit 1. The power outage disabled 25 of the 36 emergency sirens within the emergency planning zone.
- Operators began manually shutting down both reactors at St. Lucie on September 3, 2004, as Hurricane Frances approached. During the storm, the Emergency Response Data Acquisition Display System link to NRC headquarters and the Emergency Notification System direct connection between the plant and NRC headquarters were both lost for hours.
- Operators began shutting down both reactors at St. Lucie again on September 25, 2004, as Hurricane Jeanne approached. This time, the Emergency Response Data System connection between the plant and the NRC’s Incident Response Center was lost. After the storm passed, workers discovered that the exterior doors on the east side of the Unit 2 reactor auxiliary building were wide open. The plant’s safety studies assumed these doors were closed during reactor operation and severe weather to act as missile shields, protecting vital equipment inside from debris picked up and propelled by high winds. The doors had been left open during Hurricane Jeanne due to “lack of procedural guidance.”
- Crystal River 3 automatically shut down on September 6, 2004, when Hurricane Frances’ high winds caused a phase-to-ground fault in the 230,000 volt switchyard. The fault was attributed to “diameter loss and subsequent mechanical failure of a carbon steel pin in a vertical slice of insulators” with the diameter loss “caused by possible leakage current, which led to spark erosion and severe electrochemical corrosion of the carbon steel pin”—Nukespeak for the thing done getting fried by high voltage current.
Backup Power Supply Problems
LOOPs mean the normal, preferred source of electricity to a nuclear power plant is unavailable. Emergency diesel generators, with the sole exception of hydroelectric generating units for the Oconee nuclear plant in South Carolina, are the backup power supply for U.S. nuclear power plants. Essentially locomotive diesel engines without the wheels and whistle, emergency diesel generators can supply power to emergency equipment designed to mitigate transients (like LOOPs) and accidents (like loss of coolant accidents) and protect workers and the public.
Emergency diesel generators are highly reliable, but far from infallible. A report issued for the NRC in 2011 on emergency diesel generator failures a decade earlier noted 137 emergency diesel generator failures during the three-year period 1999-2001 across the fleet of nearly 100 U.S. nuclear power reactors (Fig. 2). The failures included times when an emergency diesel generator failed to successfully start, times when it started but failed to connect to its electrical distribution bus (also termed failing to supply electricity to the equipment loaded on the bus), and times when it started and supplied its loads only to later fail while running. The apparent high number of failures is tempered by the large number of tests: there were 75 times among 13,772 demands (combination of tests and responses to actual events) when an emergency diesel generator failed to start, 42 times among 11,843 demands when an emergency diesel generator failed to supply its loads, and 20 times during 26,170 hours of run-time when an emergency diesel generator stopped running unintentionally.
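To put those counts in perspective, the failure rates they imply can be worked out with a few lines of Python. This is only an illustrative sketch: the counts are the ones quoted above, while the names `FAILURE_MODES` and `failure_rate` are made up for this example and do not come from the NRC report.

```python
# Counts for 1999-2001 as quoted in the text: (failures, exposure).
# The dictionary and function names are hypothetical, chosen for this sketch.
FAILURE_MODES = {
    "failed to start": (75, 13_772),         # exposure = demands
    "failed to supply loads": (42, 11_843),  # exposure = demands
    "failed while running": (20, 26_170),    # exposure = hours of run-time
}


def failure_rate(failures: int, exposure: int) -> float:
    """Failures per demand (or per run-hour for the running case)."""
    return failures / exposure


# The three categories should add up to the 137 failures in the report.
total_failures = sum(failures for failures, _ in FAILURE_MODES.values())
print(f"total failures: {total_failures}")

for mode, (failures, exposure) in FAILURE_MODES.items():
    print(f"{mode}: {failure_rate(failures, exposure):.3%}")
```

The first two rates are per demand and the third is per hour of running, so they should not be added together or compared directly.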
Some of the more recent emergency diesel generator failures include:
- Arkansas Nuclear One (Arkansas): The Unit 2 emergency diesel generator 4A caught on fire about one minute after being started for a monthly test run on August 3, 2007. Workers determined that a warped panel used to cover an inspection port allowed oil to leak onto the exhaust header.
- Calvert Cliffs (Maryland): The Unit 1 emergency diesel generator B caught on fire 1 hour and 20 minutes into a monthly test run on August 12, 2007. Lubricating oil leaked from several loose bolts connecting the engine top cover to the exhaust manifold and ignited. The ensuing investigation found that 15 of the 122 bolts were at less than the 40 to 55 foot-pounds torque value specified by the vendor to ensure proper bolt tightness. The procedure used at the plant did not specify a torque value for the engine top cover bolts.
- Fermi Unit 2 (Michigan): Emergency diesel generator 11 caught on fire during a post-maintenance test run on January 31, 2003. Fuel oil spilled from the clean fuel drain header vent onto the injector deck where it flowed onto the exhaust manifold and ignited. Two weeks earlier, workers installed temporary plastic sleeves on the drain lines from the clean fuel drain header without following approved modification procedures. The plastic sleeves restricted flow through the drain lines, allowing fuel oil to back up and overflow from the vent header.
- North Anna (Virginia): The Unit 2 emergency diesel generator H caught on fire during a test run in September 2006. Workers determined that lubricating oil had leaked past bolts onto the exhaust manifold. The bolts had been replaced during maintenance in the spring of 2006 and the replacement bolts were longer than the original bolts, creating a pathway for oil leakage.
- Palo Verde (Arizona): The Unit 2 emergency diesel generator A unexpectedly stopped running during a monthly test on November 12, 2008. Troubleshooting identified damage to the excitation control system for the generator. An offsite laboratory examined the damaged parts and determined that misalignment of parts during assembly at the manufacturer created a sharp edge. When the emergency diesel generator ran, its vibrations allowed the sharp edge to slowly cut through the insulation on control wires, allowing an electrical fault.
- Peach Bottom (Pennsylvania): The Unit 2 emergency diesel generator E2 caught on fire during a test run on April 19, 2003. Loose bolts holding the engine top cover in place allowed lubricating oil to leak onto the exhaust manifold. Maintenance procedures did not specify the torque value recommended by the vendor to ensure proper tightness, but instead directed workers to tighten the bolts until they were “wrench-tight.”
- San Onofre Unit 3 (California): Emergency diesel generator A failed to start during a test on December 12, 2009. Workers found that a capacitor failure in the power supply for the local alarm panel allowed an electrical transient that affected the speed switch circuit and prevented the emergency diesel generator from being started.
The causes of these failures include manufacturing problems, inadequate maintenance practices, and improper modifications.
Direct Current from Batteries Problems
Should a nuclear plant become deprived of both the electricity from the offsite power grid and from the backup power supply, it experiences a station blackout where the only remaining source of electricity is direct current from onsite batteries. The batteries are normally kept fully charged from the alternating current systems through inverters and chargers. The station batteries are designed to supply sufficient electricity to a minimal subset of emergency equipment needed to cool the reactor core for 4 to 8 hours, by which time it is assumed that either the connection to the offsite power grid will be restored or at least one of the backup power supplies will be repaired and returned to service.
Some recent problems involving station batteries include:
- Davis-Besse (Ohio): On July 26, 2001, NRC inspectors identified that the electrical cables and associated relays for non-essential loads on the station batteries were not qualified for the post-accident environment they would experience. Specifically, the direct current supplies to the backup oil lift pump motors for the four reactor coolant pumps and to the containment lighting panel could fail following an accident. Their failure could shorten the life of the station batteries to less than assumed in the plant’s safety studies.
- Indian Point Unit 3 (New York): During a weekly surveillance test, workers discovered a crack in the casing for cell 14 of station battery 33 on October 9, 2013. The crack extended below the high fluid level of the cell. The damaged cell was replaced. Workers attributed the crack to corrosion on the positive battery post which caused the post to expand and put excessive stress on the casing.
- San Onofre Unit 2 (California): Workers conducting a weekly surveillance test identified an apparent low-voltage condition on one of the four banks of station batteries on March 25, 2008. The problem was attributed to loose bolts on the connection for the charging cable dating back to a maintenance task performed on March 17, 2004.
- Waterford (Louisiana): Workers found that the capacity of station battery B was 86.25% of the manufacturer’s rating during a test on May 16, 2008. Although this result satisfied the regulatory requirement of at least 80% rated capacity, it was significantly below the average value of 103.7% recorded during prior tests. Workers conducted a follow-up test on May 22, 2008, and found the capacity had dropped to 71.67%. The batteries were replaced. Because the battery cells were thrown away, no root cause of the problem could be established. The batteries had a rated service life of 20 years but failed after 15.6 years.
- Waterford (Louisiana): In the midst of Hurricane Gustav, the operators declared station battery 3B-D inoperable due to low voltage on September 3, 2008. Workers determined the problem to be loose bolts on the connection between battery cells 57 and 58. The bolts had been loosened on May 29, 2008, when cell 56 was replaced and apparently had not been properly retightened.
The causes of these failures include design problems, inadequate maintenance practices, and aging degradation.
Safety by Intent
Nuclear power plants are designed with three sources of electricity for emergency systems needed to protect reactor cores from overheating damage. Because each source is highly reliable, it’s unlikely that all three will fail when needed. But all three failed at Fukushima, and all three could fail again.
Look at the math. For illustration, assume that each source is 95% reliable, meaning each source has a 5% (expressed as 0.05) chance of failure. The chance that all three sources fail is therefore 0.05 times 0.05 times 0.05, or 0.000125, which is 0.0125% or 1.25 triple failures in 10,000 trials.
What happens when design errors, inadequate maintenance practices, and/or aging degradation reduce the reliability of the sources?
Suppose that the reliability of each source drops to 90%. Something succeeding 9 times out of 10 seems pretty safe, especially considering that three 90% reliable sources must concurrently fail to cause real harm. The chance that all three 90% reliable sources fail is 0.1 times 0.1 times 0.1, or 0.001, which is 0.1% or 1 triple failure in 1,000 trials.
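The same arithmetic can be sketched in a few lines of Python for readers who want to try other reliability values. The function name `chance_all_fail` is invented for this illustration. Like the multiplication above, it assumes the three sources fail independently of one another, which is a simplification.

```python
def chance_all_fail(reliability: float, sources: int = 3) -> float:
    """Chance that every source fails at once.

    Assumes each source has the same reliability and that failures are
    independent of one another (an assumption made for illustration).
    """
    return (1.0 - reliability) ** sources


for reliability in (0.95, 0.90):
    p = chance_all_fail(reliability)
    print(
        f"{reliability:.0%} reliable sources: "
        f"{p:.6f} chance of triple failure, "
        f"or {p * 10_000:.2f} per 10,000 trials"
    )
```

Dropping each source from 95% to 90% reliable sounds like a small change, but it makes a triple failure eight times more likely (0.001 versus 0.000125).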
Safety is enhanced when impairments are flushed out and fixed because the reliability of protective barriers increases.
Safety is degraded when impairments remain hidden or remain uncorrected because the reliability of protective barriers decreases.
UCS’s Disaster by Design/Safety by Intent series of blog posts is intended to help readers understand how a seemingly unrelated assortment of minor problems can coalesce to cause disaster and how addressing pre-existing problems can lead to more effective defense-in-depth protection.