Exclusive: Air traffic system failure caused by computer memory shortage
NEW YORK (Reuters) - A common design problem in the U.S. air traffic control system made it possible for a U-2 spy plane to spark a computer glitch that recently grounded or delayed hundreds of Los Angeles area flights, according to an inside account and security experts.
In theory, the same vulnerability could have been used by an attacker in a deliberate shutdown, the experts said, though two people familiar with the incident said it would be difficult to replicate the exact conditions.
The error blanked out a broad swath of the southwestern United States, from the West Coast to western Arizona and from southern Nevada to the Mexico border.
As aircraft flew through the region, the $2.4 billion system, made by Lockheed Martin Corp, cycled off and on as it tried to fix the error, which was triggered by a lack of altitude information in the U-2's flight plan, according to the sources, who were not authorized to speak publicly about the incident.
No accidents or injuries were reported from the April 30 failure, though numerous flights were delayed or canceled.
Lockheed Martin said it conducts "robust testing" on all its systems and referred further questions about the En Route Automation Modernization (ERAM) system to the Federal Aviation Administration.
FAA spokeswoman Laura Brown said the computer had to examine a large number of air routes to "de-conflict the aircraft with lower-altitude flights."
She said that process "used a large amount of available memory and interrupted the computer's other flight-processing functions."
The FAA later set the system to require altitudes for every flight plan and added memory to the system, which should prevent such problems in the future, Brown said.
COMPLEX FLIGHT PLAN
When the system went out, air traffic controllers working in the regional center switched to a back-up system so they could see the planes on their screens, according to one of the sources.
Paper slips and telephones were used to relay information about planes to other control centers.
The ERAM system failed because it limits how much data each plane can send it, according to the sources. Most planes have simple flight plans, so they do not exceed that limit.
But a U-2 operating at high altitude that day had a complex flight plan that put it close to the system's limit, the sources said.
The plan showed the plane going in and out of the Los Angeles control area multiple times, not a simple point-to-point route like most flights, they said.
The flight plan did not contain an altitude for the flight, one of the sources said. While a controller entered the usual altitude for a U-2 plane - about 60,000 feet - the system began to consider all altitudes between ground level and infinity.
The conflict generated error messages and caused the system to begin cycling through restarts.
"The system is only designed to take so much data per airplane," one of the sources said. "It keeps failing itself because it's exceeded the limit of what it can do."
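The effect the sources describe, a missing altitude widening the search from one altitude band to every possible altitude, can be illustrated with a rough sketch. Everything in it is a hypothetical assumption made for illustration: the 1,000-foot level spacing, the 100,000-foot ceiling standing in for "infinity", the 2,000-foot band, and the function names. None of it is drawn from ERAM itself.

```python
from typing import List, Optional

# All constants are illustrative assumptions, not ERAM parameters.
LEVEL_STEP_FT = 1_000     # hypothetical spacing between altitude levels
CEILING_FT = 100_000      # hypothetical cap standing in for "infinity"
BAND_FT = 2_000           # hypothetical band checked around a known altitude


def levels_to_check(filed_altitude_ft: Optional[int]) -> List[int]:
    """Return the altitude levels a conflict search would have to examine."""
    if filed_altitude_ft is None:
        # No altitude in the plan: every level from the ground up is a candidate.
        return list(range(0, CEILING_FT + 1, LEVEL_STEP_FT))
    low = max(0, filed_altitude_ft - BAND_FT)
    high = min(CEILING_FT, filed_altitude_ft + BAND_FT)
    return list(range(low, high + 1, LEVEL_STEP_FT))


def conflict_checks(filed_altitude_ft: Optional[int],
                    route_segments: int,
                    other_flights: int) -> int:
    """Count pairwise checks: levels x route segments x other flights."""
    return len(levels_to_check(filed_altitude_ft)) * route_segments * other_flights


# A plan with a filed altitude versus the same plan with none.
with_altitude = conflict_checks(60_000, route_segments=10, other_flights=200)
without_altitude = conflict_checks(None, route_segments=10, other_flights=200)
print(with_altitude, without_altitude)
```

Under these assumed numbers the plan with no altitude needs about 20 times as many checks as the same plan filed at 60,000 feet, and a route that re-enters the control area several times multiplies the work further.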
CYBER ATTACK CONCERN
The sources said the circumstances would be difficult for an attacker to mimic, since they involved a complex flight plan, an altitude discrepancy and an input from the controller that added to the flight plan data.
Former military and commercial pilots said flight plans are generally carefully checked and manually entered into the air traffic control computers, which are owned by the FAA.
"It would be hard to replicate by a hostile government, but it shows a very basic limitation of the system," said a former military and commercial pilot.
Cyber-attacks on aviation have been an area of increased concern for intelligence officials, who said earlier this year they will set up a new center in Maryland for sharing information on detected and possible threats.
Security experts said that, judging from the description by insiders, the failure appeared to have been made possible by the sort of routine programming mistake that should have been identified in testing before the system was deployed.
"That's when you put in values anywhere that a human could put in a number, like minus one feet, or a million feet, to see what that would do," said Jeff Moss, founder of the Black Hat and Def Con security conferences and an advisor to the Department of Homeland Security.
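The kind of test Moss describes is often called boundary-value testing. A minimal sketch, assuming a hypothetical `validate_altitude` check and an invented valid range of 0 to 100,000 feet (neither comes from the FAA or Lockheed Martin), might look like this:

```python
import math
from typing import List, Optional, Tuple

# Hypothetical limits chosen for illustration only.
MIN_ALTITUDE_FT = 0
MAX_ALTITUDE_FT = 100_000


def validate_altitude(altitude_ft: Optional[float]) -> bool:
    """Accept only a present, finite altitude inside the assumed range."""
    if altitude_ft is None:
        return False                      # a missing altitude is rejected outright
    if not math.isfinite(altitude_ft):
        return False                      # rejects infinity and NaN
    return MIN_ALTITUDE_FT <= altitude_ft <= MAX_ALTITUDE_FT


# Values "anywhere that a human could put in a number", with the expected verdict.
BOUNDARY_CASES: List[Tuple[Optional[float], bool]] = [
    (-1, False),              # minus one feet
    (0, True),                # lowest allowed value
    (60_000, True),           # typical U-2 altitude cited in the article
    (100_000, True),          # highest allowed value
    (100_001, False),         # just past the limit
    (1_000_000, False),       # a million feet
    (None, False),            # no altitude filed
    (float("inf"), False),
    (float("nan"), False),
]


def run_boundary_cases() -> List[Optional[float]]:
    """Return the inputs whose verdict differs from the expected one."""
    return [value for value, expected in BOUNDARY_CASES
            if validate_altitude(value) != expected]


print(run_boundary_cases())
```

An empty list from `run_boundary_cases()` means every extreme input was handled as expected. The point of such a test is that the system gives a defined answer for each value rather than failing.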
While it might be logical to limit the amount of data associated with one flight plan, anything exceeding that amount should not be able to render the system useless, they said.
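One common way to enforce such a limit without taking the whole system down is to reject the oversized plan and keep processing the rest. The sketch below is illustrative only: the 50-waypoint limit, the `FlightPlanTooLarge` exception and the function names are assumptions, not details of ERAM.

```python
from typing import List, Sequence, Tuple

MAX_WAYPOINTS = 50  # hypothetical per-plan limit


class FlightPlanTooLarge(Exception):
    """Raised when a single plan exceeds the assumed per-plan limit."""


def process_plan(call_sign: str, waypoints: Sequence[str]) -> str:
    """Accept a plan within the limit; raise for one that exceeds it."""
    if len(waypoints) > MAX_WAYPOINTS:
        raise FlightPlanTooLarge(
            f"{call_sign}: {len(waypoints)} waypoints exceeds limit of {MAX_WAYPOINTS}"
        )
    return f"{call_sign}: accepted with {len(waypoints)} waypoints"


def process_all(
    plans: Sequence[Tuple[str, Sequence[str]]]
) -> Tuple[List[str], List[str]]:
    """Process every plan; an oversized one is set aside, not fatal."""
    accepted: List[str] = []
    rejected: List[str] = []
    for call_sign, waypoints in plans:
        try:
            accepted.append(process_plan(call_sign, waypoints))
        except FlightPlanTooLarge as exc:
            # The failure is contained to this one plan, which could then
            # be handed to a human controller for manual handling.
            rejected.append(str(exc))
    return accepted, rejected


plans = [
    ("AAA1", ["WPT"] * 3),
    ("BBB2", ["WPT"] * 51),   # one over the assumed limit
    ("CCC3", ["WPT"] * 50),   # exactly at the limit
]
accepted, rejected = process_all(plans)
print(accepted)
print(rejected)
```

In this sketch the over-limit plan lands in `rejected` while the other two are still accepted, so one bad input does not interrupt processing for every other flight.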
Though they welcomed the FAA's assurance that a fix was being rolled out, they said the incident suggested that similar failures could be found.
"If it's now understood that there are flight plans that cause the automated system to fail, then the flight plan is an 'attack surface,'" said Dan Kaminsky, co-founder of the White Ops security firm and an expert in attacks based on over-filling areas of computer memory.
"It's certainly possible that there are other forms of flight plans that could cause similar or even worse effects," Kaminsky said. "This is part of the downside of automation."
Moss said many hackers have been studying aspects of NextGen, a new $40 billion air traffic control system that encompasses ERAM, including its reliance on Global Positioning System data that could be faked.
At least two talks at this summer's Def Con will look at potential weaknesses in the system.
"It's very over-budget and behind schedule, so it doesn't surprise me that it's got some bugs - it's the way it presented itself" that's alarming, Moss said.
But air traffic controllers and pilots said ERAM is a vast improvement over past systems and that it is needed to fit growing plane traffic into the airspace safely.
Nate Pair, president of the Los Angeles Center for the National Air Traffic Controllers Association, said it was remarkable that ERAM was restored less than an hour after the outage, limiting the effect on travelers.
"We were completely shut down and 46 minutes later we were back up and running," Pair said.
"That could have easily been several hours and then we would have been into flight delays for days because of the ripple effects."
(Reporting by Alwyn Scott and Joseph Menn; Editing by John Pickering and Sophie Hares)