Disaster by Design: Safety by Intent #5
Disaster by Design
Thirty years ago, the average annual capacity factor of U.S. nuclear power reactors—the fraction of electricity generated compared to what could be generated by operating at 100% power over the entire year—was between 55 and 60%.
Over the past decade, the average annual capacity factor has held steady at around 90%.
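For readers who like to see the arithmetic, here is a minimal sketch of the capacity factor calculation as defined above. The 1,000 megawatt rating and the generation totals are hypothetical round numbers chosen for illustration, not figures for any actual reactor, and `capacity_factor` is simply a name made up for this example.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours (non-leap year)


def capacity_factor(generated_mwh, rated_mw, hours=HOURS_PER_YEAR):
    """Electricity actually generated, divided by what would have been
    generated by running at 100% power for every hour of the period."""
    return generated_mwh / (rated_mw * hours)


# Hypothetical 1,000 MW reactor in two different years (illustrative values).
rated_mw = 1000
then = capacity_factor(5_037_000, rated_mw)  # a year with long outages
now = capacity_factor(7_884_000, rated_mw)   # a year with short outages

print(f"then: {then:.1%}  now: {now:.1%}")
```

With these made-up totals, the first year works out to 57.5% and the second to 90%, which brackets the range described above.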
One of several factors that contributed to this significantly improved performance is something called online maintenance. As implied by the name, online maintenance involves conducting maintenance on equipment while the reactor is operating rather than performing this work when the reactor is shut down. The maintenance in this case isn’t reactive work—fixing something that breaks. It is preventative maintenance performed to replace or repair components that are wearing out before they break.
The Maintenance Rule adopted by the NRC in 1996 provided the foundation for expanded use of online maintenance. Fig. 1 shows that capacity factors climbed from the 80 to the 90 percent level shortly after this foundation was laid.
A Safety Trade-off
Online maintenance reduces the availability of safety equipment during reactor operation. Maintenance that had been performed during reactor outages is now performed while the reactor operates, leaving the disassembled safety component unavailable should an accident happen. In other words, workers now routinely take apart (break, if you will) safety equipment showing absolutely no signs of trouble, with the goal of obtaining more reliable equipment once they put all the pieces back together and restore it to service.
What online maintenance offers in exchange is increased reliability. A lot of safety equipment is in standby mode during reactor operation, like coiled springs ready to pounce into action should an accident occur. But the idleness that comes with being in standby could mask broken equipment that would remain on the bench instead of taking the field to mitigate an accident. Online maintenance can provide greater assurance that standby safety equipment answers the bell when an accident sounds it.
The Maintenance Rule seeks to balance the potential reliability gains that can be derived from online maintenance with the unavoidable loss of availability of a system when the maintenance is being performed. Some of the challenges in obtaining the proper balance are illustrated in the following summaries of recent events.
On January 28, 2015, operators at Wolf Creek closed valves and de-energized equipment to allow workers to perform maintenance on two valves. The work plan recognized that taking this equipment out of service disabled one of the two emergency systems used to cool the reactor core in the event of an accident. Because the reactor was operating at 100 percent power, the plant’s technical specifications only allowed this emergency system to be taken out of service for a handful of hours. But operators made a mistake: the valves they removed from service disabled BOTH emergency systems. The technical specifications did not permit both emergency systems to be removed from service. Nor did the technical specifications prevent both emergency systems from being removed from service.
Workers conducted a test of emergency diesel generator 12 at Monticello on December 28, 2014. A part of the test directed workers to adjust the speed controller for the engine from full speed to idle and then start the emergency diesel generator. A worker performed this step, but on the wrong engine. The controller for emergency diesel generator 11 was adjusted instead. This resulted in both emergency diesel generators being unable to properly respond to an accident or a loss of offsite power condition, the only reasons they were purchased and installed at the plant.
On May 25, 2014, the Unit 2 and 3 reactors at the Millstone nuclear plant in Connecticut were operating at full power with one of the four transmission lines connecting the plant to its offsite power grid removed from service for scheduled maintenance. A ground fault (short circuit) caused electrical breakers to automatically open on one of the three in-service transmission lines, removing it from service. The electrical disturbance from this abrupt change caused electrical breakers to automatically open on another transmission line, removing it from service. The sole transmission line remaining in service could not handle the combined electrical output from both reactors, causing electrical breakers to automatically open and remove it from service, too. Both reactors automatically shut down due to the loss of offsite power. The transmission line removed from service for maintenance was the first domino to fall, and the other three dominos soon toppled (as dominos are prone to do).
A bolt of lightning struck the switchyard at the Joseph M. Farley nuclear plant on October 14, 2014, causing a fault on one of the transmission lines. The electrical disturbance forced the protective devices on Startup Auxiliary Transformer 2B to automatically open circuit breakers to prevent the problem from rippling throughout the plant. Because emergency diesel generator 2B was out of service for maintenance at the time, the partial loss of offsite power and onsite power disabled the cooling water flow to the reactor coolant pump motors. Per procedure, the operators manually tripped the reactor.
Workers turned on the B emergency filtration train at the Monticello nuclear plant in Minnesota on August 5, 2014, to supply fresh, filtered air to the control room with the A emergency filtration train removed from service to replace its charcoal filters. Twelve minutes later, the B emergency filtration system failed. In less than a quarter of an hour, the plant went from two fully redundant emergency filtration systems to none. Fortunate, indeed, that the plant didn’t experience an event where emergency filtration was needed to protect control room operators from harm.
The operators manually tripped the Unit 1 and 2 reactors at the DC Cook nuclear plant in Michigan on November 1, 2014, when debris in the lake clogged the screens at the intake structure. One of the traveling screens used to remove debris and unblock the incoming water pathway was out of service at the time. Following the reactor trips, the turbine-driven auxiliary feedwater pump on Unit 1 automatically started per design, but failed three minutes later. The main generator on Unit 2 was supposed to automatically trip, but failed to do so and the operators had to manually trip it.
On April 27, 2011, a tornado knocked down the offsite transmission lines for the Browns Ferry nuclear plant in Alabama causing all three reactors to automatically shut down. Seven of the eight onsite emergency diesel generators automatically started and provided power to emergency equipment. The eighth emergency diesel generator was out of service for maintenance at the time.
Safety by Intent
The technical specifications issued by the NRC with each reactor operating license define the terms and conditions under which safety equipment can be out of service for maintenance. Under certain conditions, the technical specifications allow a safety widget to be out of service for up to N days. But those conditions assume that the events the widget is designed to mitigate have the same chance of occurring at any time during the year; in other words, that they are as likely to happen in February as in December.
But consider the Browns Ferry event. The plant is in northern Alabama near Decatur and Athens—in tornado alley. Tornados are common visitors to this region in springtime but rarely make appearances in fall and winter. The emergency diesel generators are intended to step in when the offsite power grid becomes unavailable and tornados have demonstrated a knack for making offsite power grids unavailable. Consequently, deliberately disabling emergency diesel generators at nuclear plants in tornado alley during tornado season seems imprudent. Yet the technical specifications apply uniform risk management measures across non-uniform hazards.
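The gap between a uniform assumption and a seasonal hazard can be put in rough numbers. The sketch below is only an illustration: the monthly tornado frequencies, the 7-day outage, and the function name are all hypothetical, and treating events as a Poisson process is a simplifying assumption of this example, not the method behind any plant's technical specifications.

```python
import math

# Hypothetical frequencies (events per month) of a tornado taking down the
# offsite grid. Illustrative numbers only, not data for any real plant.
SEASONAL_RATE = {
    "Jan": 0.004, "Feb": 0.006, "Mar": 0.016, "Apr": 0.030,
    "May": 0.026, "Jun": 0.012, "Jul": 0.006, "Aug": 0.004,
    "Sep": 0.004, "Oct": 0.004, "Nov": 0.006, "Dec": 0.002,
}

# The uniform view spreads the same annual total evenly over twelve months.
UNIFORM_RATE = sum(SEASONAL_RATE.values()) / 12


def chance_of_event_during_outage(monthly_rate, outage_days, days_per_month=30):
    """Probability of at least one event while the equipment is out of
    service, assuming a Poisson process with a constant rate over the outage."""
    expected_events = monthly_rate * outage_days / days_per_month
    return 1.0 - math.exp(-expected_events)


outage_days = 7  # hypothetical outage duration
uniform = chance_of_event_during_outage(UNIFORM_RATE, outage_days)
for month in ("Apr", "Oct"):
    seasonal = chance_of_event_during_outage(SEASONAL_RATE[month], outage_days)
    print(f"{month}: seasonal {seasonal:.4%} vs. uniform {uniform:.4%}")
```

Under these made-up frequencies, the same outage carries several times more risk in April than in October, while the uniform figure understates one and overstates the other.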
The Millstone event, like the 1927 heavyweight boxing match between Gene Tunney and Jack Dempsey, shows the importance of getting the count right. With one of four transmission lines taken out of service for maintenance, it appeared as if three transmission lines remained (yes, Virginia, 4 minus 1 equals 3). But the combined output from both operating reactors was more than a single transmission line could carry, so it really took two transmission lines to carry the load. And the system became unstable when one transmission line was out of service, such that the “three” in-service transmission lines would sink or swim together. At the first ripple, all three sank to put Millstone in a loss of offsite power condition (yes, Virginia, 4 minus 4 equals 0).
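The counting problem can be sketched in a few lines. The megawatt figures below are hypothetical round numbers, not Millstone's actual output or line ratings, and the function names are invented for this example. The sketch also covers only the capacity half of the story; it does not model the instability that took the remaining lines down together.

```python
import math


def lines_required(total_output_mw, line_capacity_mw):
    """Smallest number of lines able to carry the plant's combined output."""
    return math.ceil(total_output_mw / line_capacity_mw)


def real_margin(lines_in_service, total_output_mw, line_capacity_mw):
    """Number of in-service lines that can be lost before the rest overload."""
    return lines_in_service - lines_required(total_output_mw, line_capacity_mw)


# Hypothetical figures for illustration only.
total_lines = 4
out_for_maintenance = 1
combined_output_mw = 2000  # both reactors at full power
line_capacity_mw = 1500    # what a single line can carry

in_service = total_lines - out_for_maintenance
needed = lines_required(combined_output_mw, line_capacity_mw)
margin = real_margin(in_service, combined_output_mw, line_capacity_mw)

print(f"in service: {in_service}, needed: {needed}, margin: {margin}")
```

With these assumed numbers, three lines in service and two needed leave a margin of one, not three. Any coupling that lets one fault take out a second line, as happened at Millstone, erases even that.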
After this event, considerable time and money were spent understanding why an apparent 3-count was actually a 1-count. While the cost of reaching this understanding before the transmission line was taken out of service might have been the same as doing so afterwards, the awareness would have had far greater value: a stitch in time saves nine and all.
UCS’s Disaster by Design/Safety by Intent series of blog posts is intended to help readers understand how a seemingly unrelated assortment of minor problems can coalesce to cause disaster and how addressing pre-existing problems can lead to more effective defense-in-depth protection.