Contending with the trinity of troubles



Gang Tan

In August 2003, the largest electric power blackout in history affected 50 million people in eight U.S. states and Ontario, Canada. Power in some areas was out for a week, and the two countries’ economies lost at least $5 billion.
In September 2004, air traffic controllers at Los Angeles International Airport lost voice contact with 400 airplanes flying over five states. Before communication was restored, five pairs of planes had narrowly avoided crashing into each other.
Both the power outage and the near-disaster in the air occurred in part because of errors in software programs, says Gang Tan, assistant professor of computer science and engineering.
A programming error caused alarm failures that contributed to the scope of the blackout, and a software bug in a voice switching and control system caused the air-traffic controllers to lose contact with pilots.
Tan chronicles these and similar events on his website, which he calls “A Collection of Well-Known Software Failures.”
“Software is an essential part of daily life,” says Tan, who specializes in software security, especially vulnerabilities in large systems. “Think of e-voting or e-commerce. The safety of software can affect the results of elections or your experience of buying online. So it is critical to get software right.”
To assure the safety of software, Tan contends with what security specialists dub the “trinity of troubles.” These are:
• Complexity, or the rapidly increasing number of source lines of code (SLOC) necessary to implement a program. Microsoft’s Windows 3.1 contained 5 million SLOC in 1993. Windows Vista (2006) required 50 million SLOC. Even rigorously tested code contains between 0.5 and 3 errors per 1,000 SLOC, says Tan. A single mistake can disrupt a program, hence the need for robust security.
• Connectivity. Before the Internet, PCs existed in isolation; today virtually every computer is online. Hackers anywhere in the world, says Tan, can access your data if your computer is not secure.
• Extensibility. Not too long ago, users purchased software directly from developers or vendors. Now, plug-ins and other extensions offered by third-party developers make it easy to download new programs. But extensions written by strangers could have malicious intent.
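To put the complexity figures in perspective, the error rate Tan cites can be applied to the two code sizes mentioned in the list. This is only back-of-the-envelope arithmetic: it assumes the rate for rigorously tested code carries over to these systems, which the article does not claim, and the function name `estimated_defects` is made up for illustration.

```python
def estimated_defects(sloc, low_per_kloc=0.5, high_per_kloc=3.0):
    """Return a (low, high) estimate of residual errors for a code base.

    The default rates are the 0.5 to 3 errors per 1,000 lines quoted in
    the article; applying them to any particular product is an assumption.
    """
    kloc = sloc / 1000
    return kloc * low_per_kloc, kloc * high_per_kloc


# Code sizes as given in the article.
for name, sloc in [("Windows 3.1", 5_000_000), ("Windows Vista", 50_000_000)]:
    low, high = estimated_defects(sloc)
    print(f"{name}: roughly {low:,.0f} to {high:,.0f} residual errors")
```

Under these assumptions, a tenfold increase in code size means a tenfold increase in the expected number of latent errors, even if testing quality stays the same.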
These challenges compel software users and security experts to spend more time fending off attacks by what Tan calls an invisible economy of hackers.
“It seems there are always new patches [software repairs] being made available by Microsoft and other developers. You need to download and install them to keep your computer out of the hands of remote hackers,” Tan says.
In his research, Tan focuses on small errors in software systems. A recent example was the freeze-up last December of Microsoft’s Zune media players. The phenomenon was dubbed a “mass suicide” when customers worldwide reported that the 30-gigabyte system was failing to boot up.
“One error in one line of source code was causing the problem,” says Tan. “When it was fixed, the problem went away.”
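The article does not say what the error was. Public reports at the time widely attributed the freeze to leap-year handling in code that converts a running day count into a year. The sketch below is a Python illustration of that kind of bug, not the device's actual code: the function names, the 1980 start year and the `max_steps` guard (which stands in for a device that never finishes booting) are all assumptions.

```python
from datetime import date


def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def year_from_days_buggy(days, start_year=1980, max_steps=10_000):
    """Convert a 1-based day count since Jan. 1 of start_year into a year.

    Returns None if the loop stops making progress; on a real device
    this would be an endless loop rather than a return value.
    """
    year = start_year
    steps = 0
    while days > 365:
        steps += 1
        if steps > max_steps:
            return None  # guard added for this sketch only
        if is_leap_year(year):
            if days > 366:
                days -= 366
                year += 1
            # Bug: when days == 366 nothing changes, so the loop never ends.
        else:
            days -= 365
            year += 1
    return year


def year_from_days_fixed(days, start_year=1980):
    """Same conversion, comparing against the actual length of each year."""
    year = start_year
    while True:
        year_length = 366 if is_leap_year(year) else 365
        if days <= year_length:
            return year
        days -= year_length
        year += 1


# The last day of a leap year is the only input that triggers the hang.
dec_31_2008 = (date(2008, 12, 31) - date(1980, 1, 1)).days + 1
print(year_from_days_buggy(dec_31_2008))  # None: the loop made no progress
print(year_from_days_fixed(dec_31_2008))  # 2008
```

The two versions agree on every other day of the year, which is why a bug like this can pass ordinary testing and surface only on Dec. 31 of a leap year.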
The complexity of software code makes it difficult to model the behavior of a software system mathematically and almost impossible to understand every aspect of a program.
Tan develops automated techniques to scan for errors in large software systems. His goal is to locate areas of vulnerability so developers can patch errors before they distribute software commercially.
“Our techniques seek to understand the semantics of a program, that is, what the software should do,” says Tan. “A voting machine’s software, for example, should record votes inside a database or send them to a remote server. If it deviates from this, our analyzer issues a warning.
“We look at the supposed behavior, or specification, of a system. To verify this behavior, we do a static analysis to try to understand the behavior of a system without running it.
“One of the projects in my group is to try to understand the interaction between multiple programming languages. Just as there are many human languages, there are many programming languages. Each has its own syntax and semantics.”
Large software systems are often written in multiple programming languages, each with its own specialty. But small errors often result during language “interoperation.”
“We try to understand the kind of errors that can arise, and then to find and fix these errors,” says Tan.
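The idea of static analysis that Tan describes, checking a program against its specification without running it, can be illustrated with a toy example. The sketch below is not Tan's analyzer, and real tools reason about far more than function names. It uses Python's standard `ast` module to parse a hypothetical vote-handling function and warns about any call outside an assumed specification; `record_vote`, `send_to_server`, `email_copy` and `cast_vote` are invented names.

```python
import ast

# Hypothetical specification: the only functions the vote handler may call.
ALLOWED_CALLS = {"record_vote", "send_to_server"}


def find_violations(source, allowed=ALLOWED_CALLS):
    """Parse source text without executing it and list disallowed calls.

    Returns a sorted list of (line number, function name) pairs.
    """
    tree = ast.parse(source)
    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):         # plain call: f(x)
                name = func.id
            elif isinstance(func, ast.Attribute):  # method call: obj.f(x)
                name = func.attr
            else:                                  # anything computed at run time
                name = "<dynamic call>"
            if name not in allowed:
                violations.append((node.lineno, name))
    return sorted(violations)


# Made-up program under analysis; it is parsed as text and never run.
SAMPLE = '''
def cast_vote(ballot):
    record_vote(ballot)
    email_copy(ballot, "someone@example.com")
'''

for lineno, name in find_violations(SAMPLE):
    print(f"warning: line {lineno}: unexpected call to {name}()")
```

Because the sample is only parsed, the undefined function `email_copy` never executes; the warning comes from the program's structure alone.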
Tan has conducted a static analysis of a software system containing 2 million lines of code written in the programming language Java and 0.8 million lines written in C.
“We have found more than 100 errors, and we have only covered a small part of the code,” says Tan. “We are exploring the possibility of parallelizing our program so it can be run on multiple processors.”
Tan has spent a decade studying software security. His research has been funded by DARPA, NSF and the National Security Agency.
--Kurt Pfitzer