Disaster by Design/Safety by Intent #21
Disaster by Design
Growing up, I remember seeing a test pattern of the Emergency Broadcast System appear on our black and white television set accompanied by a really annoying high-pitched constant sound, followed by a voice telling us that “If this had been an actual emergency, you would have been instructed where to tune in your area for news and official information.”
The United States discontinued the Emergency Broadcast System in 1997, but workers at the Indian Point nuclear plant in New York needed a voice a couple of years later telling them “Because this is an actual emergency, follow the instructions to make it stop.” With that voice silenced, many of them headed home and let their “secret” emergency worsen.
The Indian Point Unit 2 pressurized water reactor was operating at 99% power on Tuesday, August 31, 1999. Instrument and control technicians were performing maintenance on Channel 3 of the reactor protection system when a spurious voltage spike occurred on Channel 4. Signals from both channels triggered the automatic shutdown of the reactor at 2:31 pm.
The reactor’s shutdown caused the automatic shutdown of the turbine and generator by design. The generator had been producing electricity that passed through the switchyard and out via power transmission lines to the electrical grid. Some of the electricity produced by the generator went to powering the unit’s equipment. The generator’s shutdown deprived in-plant equipment of its normal source of power when the unit was operating. The design called for power supplies to automatically shift from the generator to the offsite electrical grid. Electrical relays made these transfers. But four minutes later, the electrical breakers re-opened, stopping the flow of power from the offsite grid to the four key electrical circuits.
Sensors detected low voltage on these four key electrical circuits, triggering the automatic startup of the three emergency diesel generators. The emergency diesel generators started and connected to the four electrical circuits to re-power equipment supplied from these circuits. Fourteen seconds later, the breaker connecting emergency diesel generator 23 to electrical circuit 6A re-opened, causing that one electrical circuit to lose alternating current (ac) power again. The breaker’s re-opening affected the other three electrical circuits in another way: the loss of ac power to electrical circuit 6A prevented the other three electrical circuits from being re-connected to the offsite electrical grid.
Battery 24 was another backup power supply to electrical circuit 6A. Thus, some of the vital equipment on electrical circuit 6A continued to receive direct current (dc) power from the battery. Other vital equipment (including a motor-driven auxiliary feedwater pump, a charging water pump, a component cooling water pump, and an auxiliary service water pump) had no power supplies available.
The Emergency Not Called
The unit’s emergency response procedures dictated that an Unusual Event, the least serious of the NRC’s four emergency classification levels, be declared when offsite power is unavailable to the four key electrical circuits for longer than 15 minutes. Even though that condition existed by 2:50 pm, the operators misinterpreted the procedure and did not declare an emergency at that time.
The Non-Emergency Response
At 4:00 pm, management convened a meeting to discuss tasks that needed to be performed before the reactor could be restarted. Restoring ac power to electrical circuit 6A was deemed the most important of the steps to restart, but a lot of other housekeeping chores were added to the to-do list.
At 4:30 pm, the Station Nuclear Safety Committee (SNSC) met to review a procedure for work scheduled to be performed before restart. The NRC’s report on the event stated “that the SNSC meeting, which covered a topic unrelated to the trip and recovery, distracted some plant personnel from efforts to evaluate Bus 6A and recover from the event.”
Some workers simply went home.
I attended the public meeting conducted in the NRC’s Region I offices in King of Prussia, Pennsylvania with representatives from Indian Point Unit 2 regarding this event. An NRC manager told me about the agency’s frustration during the event in getting the plant’s management to take it seriously. When I asked what this meant, the manager said that many workers headed for home at the normal end of the work day, despite the unit being unable to get power from the offsite grid and some vital equipment only getting powered from batteries that would soon become depleted. The manager told me that some of the recovery efforts were delayed because staff needed for them had left. For example, the Plant Manager and Vice President-Nuclear for Indian Point Unit 2 headed home before 6:00 pm.
The Watch Engineer performed an online risk assessment at 6:40 pm and concluded that the condition was Red due to a Daily Risk Factor of 196. The typical Daily Risk Factor was less than 1.0 (zero being lowest risk) and a Red condition had never been calculated for Unit 2 before. The risk was estimated to be 1.8×10⁻³, or approximately a 2-in-1,000 chance of reactor core damage, nearly 200 times greater than the risk associated with normal reactor operation. And things went downhill from there.
The Emergency Belatedly Called
At 9:55 pm, the voltage from Battery 24, normally at 118 volts, dropped below 105 volts, causing it to stop powering safety equipment. The operators declared an Unusual Event, the least serious of the NRC’s four emergency classification levels, because the depleted battery disabled about 75% of the alarms in the control room.
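The gap between when the procedure called for an Unusual Event and when one was actually declared can be worked out from the times given in this account. The short Python sketch below does that arithmetic. The variable names are mine, for illustration only, and the sketch treats the 15-minute condition as met at exactly the 15-minute mark, consistent with the 2:50 pm time cited earlier.

```python
from datetime import datetime, timedelta

# Times taken from the narrative (August 31, 1999).
reactor_trip = datetime(1999, 8, 31, 14, 31)            # automatic shutdown at 2:31 pm
offsite_power_lost = reactor_trip + timedelta(minutes=4)  # breakers re-opened four minutes later

# Procedure threshold from the text: offsite power unavailable for 15 minutes.
UNUSUAL_EVENT_THRESHOLD = timedelta(minutes=15)

# When the procedure called for the declaration versus when it was made.
declaration_due = offsite_power_lost + UNUSUAL_EVENT_THRESHOLD
actual_declaration = datetime(1999, 8, 31, 21, 55)       # declared at 9:55 pm

delay = actual_declaration - declaration_due

print("Declaration due:", declaration_due.strftime("%I:%M %p"))
print("Declaration made:", actual_declaration.strftime("%I:%M %p"))
print("Delay:", delay)
```

By this reckoning, the declaration came 7 hours and 5 minutes after the procedure called for it, and it was made because of the lost control room alarms rather than because of the loss of offsite power.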
The operators notified all state and local agencies by 10:09 pm. The operators notified the NRC about the emergency at 10:39 pm.
The Overdue End of the Emergency
The operators reconnected electrical circuit 5A to the offsite power grid at 2:24 am on September 1, 1999 and shut down emergency diesel generator 21 eighteen minutes later.
The operators reconnected electrical circuits 2A and 3A to the offsite power grid at 2:50 am and shut down emergency diesel generator 22 six minutes later.
At 3:30 am, the operators terminated the Unusual Event.
The Well-Earned Emergency
The spurious voltage spike on reactor protection system Channel 4 had occurred several times prior to this event, most recently on August 26, 1999—just five days earlier. Workers initiated a condition report that day to troubleshoot and repair the problem. But the condition report was closed out on the morning of August 31 without any troubleshooting or repairing.
The recurring but uncorrected problem struck again, this time with worse consequences.
After the generator tripped, the supply of power to in-plant equipment automatically transferred to the offsite power grid per design. But the electrical breakers for four key electrical circuits re-opened four minutes later. Workers had modified the control logic for these breakers in 1995. The modification specified voltage conditions for closing and opening these breakers. But those voltage values had not been revised in calibration and maintenance procedures. Thus, when workers calibrated the control logic in June 1997, they used the wrong values. During the August 31, 1999, event, these breakers performed as calibrated rather than as designed.
Emergency diesel generator 23 automatically started and connected to electrical circuit 6A per design. But its output breaker re-opened 14 seconds later. In 1997, workers modified the overcurrent protection relays for the emergency diesel generator output breakers, reducing the setpoint for opening the breakers from 7,500 amps to 6,000 amps. Workers tested the relays and verified that they opened the breakers at 6,000 amps. But the test they used was deficient. When workers re-tested the relays after the August 31 event using a proper procedure, the breakers opened instead at 3,000 amps. When emergency diesel generator 23 started and connected to electrical circuit 6A, other logic circuits restarted equipment every few seconds to avoid overloading the emergency diesel generator with too much demand (i.e., electrical current needs) at the same time. The 6,000 amp design limit accommodated all the staggered demands; the 3,000 amp actual limit did not.
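To see why the wrong setpoint mattered, here is a minimal Python sketch of staggered load pickup against an overcurrent trip setpoint. The load values are hypothetical numbers I made up for illustration, not plant data, and the model is deliberately crude: it simply adds steady currents and ignores motor starting surges. Only the 6,000 amp and 3,000 amp setpoints come from the event.

```python
# Hypothetical steady currents (amps) for equipment restarted one step at a time.
# These values are invented for illustration; they are NOT Indian Point data.
HYPOTHETICAL_LOAD_STEPS = [1200, 900, 800, 700, 600, 500]

def first_trip_step(load_steps, trip_setpoint_amps):
    """Return the 1-based step at which total current first exceeds the
    setpoint, or None if every load is picked up without a trip."""
    total_amps = 0
    for step, amps in enumerate(load_steps, start=1):
        total_amps += amps
        if total_amps > trip_setpoint_amps:
            return step
    return None

# Setpoints from the event: 6,000 amps as designed, 3,000 amps as actually set.
as_designed = first_trip_step(HYPOTHETICAL_LOAD_STEPS, 6000)
as_found = first_trip_step(HYPOTHETICAL_LOAD_STEPS, 3000)

print("Trip step with 6,000 amp setpoint:", as_designed)
print("Trip step with 3,000 amp setpoint:", as_found)
```

With these made-up loads, the breaker never opens at the 6,000 amp setpoint but opens on the fourth load step at 3,000 amps, well before the sequence finishes. The real sequence and currents differed, but the mechanism is the same.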
The Well-Earned Fine
On February 25, 2000, the NRC imposed an $88,000 fine for several violations of safety regulations. Events were happening at Indian Point faster than the NRC could write tickets—the operators manually tripped the reactor on February 15, 2000, due to a broken tube inside a steam generator. Workers had inspected that tube in 1997 and got indications it was damaged more than allowed by safety regulations, but had mis-diagnosed those indications without fixing the damaged tube.
Safety by Intent
An event happening a year earlier at the Davis-Besse nuclear plant in Ohio demonstrated how things should have gone at Indian Point.
The Davis-Besse pressurized water reactor was operating at 99% power on Wednesday, June 24, 1998. At 8:44 pm, a tornado touched down onsite. The control room operators tried to manually start both emergency diesel generators upon receiving reports of the tornado. One emergency diesel generator failed to start, so operators went to the diesel generator room and manually started it using the local panel. One minute later, at 8:47 pm, the tornado damaged the switchyard, disconnecting the plant from its offsite power grid and causing the automatic shutdown of the reactor, turbine, and generator.
The operators declared an Alert, the second least serious of the NRC’s four emergency classification levels, at 9:18 pm. Despite the tornado’s damage disabling two of the three telephone systems at the plant, workers completed all the emergency notifications by 9:36 pm.
The loss of power from the offsite grid, even with both emergency diesel generators running, meant that none of the reactor coolant pumps could be operated to force cooling water through the reactor core. The emergency response procedures directed the operators to cool down the reactor water at a rate of less than 10°F per hour to avoid forming a steam bubble in the upper dome space of the reactor vessel.
During the slow and steady cool-down, the temperature inside one of the rooms containing a running emergency diesel generator warmed to over its 120°F maximum design limit due to a faulty damper in its ventilation system. Workers installed portable cooling fans and monitored the emergency diesel generator’s parameters for indications of diminished performance. They continued running the emergency diesel generator, but declared it to be inoperable because of the high room temperature.
The plant’s technical specifications directed that the reactor water be cooled to less than 280°F within seven hours of one emergency diesel generator being declared inoperable. The lead operator decided that continuing the slow and steady cool-down was safer than accelerating it to drop below 280°F within the specified timeframe, so he invoked a clause in the NRC’s regulations that permitted a requirement to be intentionally violated if plant conditions warranted it.
The power company’s workers repaired damage to the offsite transmission lines and towers, restoring offsite power to the plant. The operators downgraded the emergency classification to an Unusual Event at 2:00 am on June 26 and terminated it later that day.
Probabilistic Risk Assessment
These Indian Point and Davis-Besse events illustrate how probabilistic risk analyses are conducted for nuclear power plants in general and in response to specific conditions.
An event tree, in this case (Fig. 2) for a small-break loss of coolant accident (LOCA), looks at the array of systems installed to mitigate the accident. At each decision point, when the system in question functions successfully, the event tree moves upward and onward to the next decision point. When that system fails, the event tree moves downward to the next decision point. The probabilities of success and failure are derived from past experience.
As the event tree illustrates, a failure and sometimes even multiple failures can be tolerated without leading to meltdown as long as some of the mitigation measures succeed. Called defense-in-depth, safety is best served when there are as many paths to Okay as possible and the odds of wandering down a path to Meltdown are as small as achievable.
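The arithmetic behind an event tree can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the three system names, their failure probabilities, the rule that a meltdown requires every system to fail, and the treatment of failures as independent. None of it is taken from Fig. 2 or from any plant's actual risk assessment.

```python
from itertools import product
from math import prod

# Hypothetical failure probabilities for three mitigation systems (invented values).
FAILURE_PROB = {"system_a": 0.01, "system_b": 0.02, "system_c": 0.05}

def path_probability(failure_prob, outcomes):
    """Probability of one path through the tree, assuming independent systems.
    `outcomes` maps system name -> True (success) or False (failure)."""
    return prod(
        (1 - failure_prob[name]) if ok else failure_prob[name]
        for name, ok in outcomes.items()
    )

def meltdown_probability(failure_prob, is_meltdown):
    """Walk every success/failure path and sum those that end in meltdown."""
    total = 0.0
    for combo in product([True, False], repeat=len(failure_prob)):
        outcomes = dict(zip(failure_prob, combo))
        if is_meltdown(outcomes):
            total += path_probability(failure_prob, outcomes)
    return total

def all_systems_fail(outcomes):
    """Assumed rule: meltdown only when no mitigation system succeeds."""
    return not any(outcomes.values())

# General case: every system available with its usual reliability.
baseline = meltdown_probability(FAILURE_PROB, all_systems_fail)

# Specific condition: system_c is known to be out of service (failure certain).
degraded = meltdown_probability({**FAILURE_PROB, "system_c": 1.0}, all_systems_fail)

print(f"baseline: {baseline:.1e}, degraded: {degraded:.1e}")
```

With these invented numbers, taking one system out of service raises the meltdown probability twentyfold, from about 1 in 100,000 to about 2 in 10,000. The same kind of recalculation, done with real plant data, is what lies behind a condition-specific figure such as a daily risk factor.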
Too many of the crossroads faced during the Indian Point event moved along the negative path: the connections to offsite power were lost due to a flawed modification, one of the onsite backup power sources was lost due to another flawed modification, the operators failed to recognize and declare an emergency, and so on. While the event did not result in a meltdown, too many of the mitigation measures failed, pushing the reactor closer to meltdown than necessary.
The majority of the crossroads faced during the Davis-Besse event moved along the positive path—the failure to start an emergency diesel generator from the control room was remedied within two minutes, the loss of two telephone systems complicated but did not impair timely emergency notifications, the overheating of an emergency diesel generator was readily detected and appropriate compensatory measures put in place, and so on. The event did not result in a meltdown and the many successful mitigation measures provided a comfortable margin to a meltdown.
Nuclear plants are not meltdown-proof. Dodging a meltdown, as Indian Point did in August 1999, is great. Extracting lessons from near misses and effectively implementing solutions that make nuclear plants more meltdown-resistant is better.
UCS’s Disaster by Design/Safety by Intent series of blog posts is intended to help readers understand how a seemingly unrelated assortment of minor problems can coalesce to cause disaster and how effective defense-in-depth can lessen both the number of pre-existing problems and the chances they team up.