Did the air traffic control center really have a "Microsoft server crash"?
Submitted by doc on Wed, 09/22/2004 - 19:02.
On Tuesday, September 14, something went wrong at the FAA's regional center that controls high-altitude air traffic over Southern California and much of the southwest U.S. Two days later, this Associated Press story (carried here on MSNBC) summarized the problem in its opening sentence: "Failure to perform a routine maintenance check caused the shutdown of an air traffic communications system serving a large swath of the West, resulting in several close calls in the skies, the FAA and a union official said Wednesday." That same day, the Los Angeles Times ran a story titled "Human Factors Silenced Airports". Then, on September 21, TechWorld ran a story titled "Microsoft server crash nearly causes 800-plane pile-up: Failure to restart system caused data overload". It begins, "A major breakdown in Southern California's air traffic control system last week was partly due to a 'design anomaly' in the way Microsoft Windows servers were integrated into the system, according to a report in the Los Angeles Times." Here's what the Times story said:
Officials from Professional Airways Systems Specialists, the union that represents FAA technicians, acknowledged Wednesday that an improperly trained employee failed to reset the Palmdale radio system.
But they said the quirk in the system, known as Voice Switching and Control System, is a "design anomaly" that should have been corrected after it was discovered last year in Atlanta.
As originally designed, the VSCS system used computers that ran on an operating system known as Unix, said Ray Baggett, vice president for the union's western region.
The VSCS system was built for the FAA by Harris Corp. of Melbourne, Fla., at a cost of more than $1.5 billion.
When the system was upgraded about a year ago, the original computers were replaced by Dell computers using Microsoft software. Baggett said the Microsoft software contained an internal clock designed to shut the system down after 49.7 days to prevent it from becoming overloaded with data.
Software analysts say a shutdown mechanism is preferable to allowing an overloaded system to keep running and potentially give controllers wrong information about flights.
Richard Riggs, an advisor to the technicians union, said the FAA had been planning to fix the program for some time. "They should have done it before they fielded the system," he said.
To prevent a reoccurrence of the problem before the software glitch is fixed, Laura Brown, an FAA spokeswoman, said the agency plans to install a system that would issue a warning well before shutdown.
Martin, the chief FAA spokesman in Washington, said the failure was not an indication of the reliability of the radio communications system itself, which he described as "nearly perfect."
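
A note on the 49.7-day figure in the Times excerpt: it happens to match the time a 32-bit counter of milliseconds takes to wrap back to zero. The Times story does not say that this is the mechanism, so treat the connection as an assumption. The sketch below only checks the arithmetic; the function names are made up for illustration and do not come from the VSCS software or from Windows.

```python
# Arithmetic check only. Assumption: the "internal clock" is a 32-bit
# counter of elapsed milliseconds. The Times story does not confirm this.

MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 milliseconds in one day


def rollover_days(bits: int = 32) -> float:
    """Days until an unsigned counter of `bits` bits, counting ms, wraps."""
    return (2 ** bits) / MS_PER_DAY


def ticks_after(elapsed_ms: int, bits: int = 32) -> int:
    """Value such a counter would show after `elapsed_ms` milliseconds."""
    return elapsed_ms % (2 ** bits)


if __name__ == "__main__":
    print(round(rollover_days(), 1))  # 49.7
    print(ticks_after(2 ** 32 + 5))   # 5: the counter has wrapped past zero
```

If that assumption holds, the routine maintenance the AP story mentions would amount to restarting the machines before the counter reaches its limit.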