Tuesday, March 31, 2009

Academics: Part 3 of 5: "Program Analysis"

“Had Darth Vader employed formal methods to the design of his Death Star, perhaps it would not have been vulnerable to the starfighter attack that led to its destruction.”

- Jose Meseguer and Grigore Rosu, professors of computer science at UIUC, in their research description

-----------------------------------------------------------

For my third post in this series I will talk about...

Program Analysis and Understanding
with Jeff Foster

This is a graduate level course that is a lot different than any of the other computer science courses I have taken. We do very little programming in the class; instead, the focus is using mathematical logic to prove properties of computer programs.

Some of the topics we have learned about so far include:

Dataflow Analysis - this involves tracking different facts about the data manipulated by a program (like "variable X is always zero here" or "variable Y will always be overwritten by another value before it is used) through the execution of a program. The algorithm for doing this is actually reasonably straightfoward - basically, you first generate a "control flow graph" shoing all the different possible ways that control can proceed through the program (for example, an if statement would be a branch in the control flow graph, since control could go two one of two different places depending on the result of the if statement) and then propagate the facts along the edges of the graph. One area of active research now is dataflow analysis for multithreaded programs - where there are two or more separate "threads" of execution, and it is not known which thread will be executed first and when the operating system will decide to switch the control between threads. This of course makes dataflow analysis a lot harder because you don't actually know what the "control flow graph" looks like.

Automated Proof Systems - This involves repesenting logical statements (like "for all integers A and B, A+B = B+A") and proofs of those statements in a form that a computer can manipulate. For example, in the example above, you would first have to define "integer" - the way you do that is by saying "0 is an integer, and for every integer X, there exists a successor S(X)." From this definition it is possible to define operations like addition and multiplication, and then prove statements like the one above. The way you prove statements in Coq (the proof system we are using) is by giving a series of "tactics" that specify how to get from the hypotheses (the things you know are true) to the goal (the thing you are trying to prove.) Examples of tactics are:

Rewrite - If you know that A=B, then you can replace all the A's in an expression with B's, or vice versa.

Destruct - This "breaks apart" a term into all the possibilities for what it is, so you can do analysis on a case-by-case basis. For example if X is a boolean value (true or false), then "destruct X" would create two branches of the proof - one where X is true and one where X is false. This allows you to prove things like DeMorgan's Laws (by enumerating all the cases).

Apply - If you know that "if A then B", and your goal is B, this changes the goal to be A. This makes sense because if you know A and if you know "if A then B" then you know B.

(One good thing about automated proof systems from a teacher's perspective is that it makes it really easy to grade the homeworks. If the proof goes through and doesn't generate an error, then it must be correct.)

Automated proof systems are also useful in proving things about computer programs. If you define the syntax of a programming language in Coq, then you can represent statements like "the expression X+Y evaluates to 7 given the variable values X=3 and Y=4" and prove things like that the final result of an expression is the same no matter what order you evaluate its terms in (this is true in the simplified language we are studying, but not true in many real programming languages.)

Also, just yesterday my professor was quoted in a Diamondback article. (See the second page of the article for his quote.)

-----------------------------

P.S. The quote at the top of this post is inaccurate. Darth Vader had nothing to do with the design or construction of the Death Star.

3 comments:

Nanette Goodman said...

Once you get to UIUC you can straighten out the professors'knowledge of the Death Start.

This sounds like an interesting field. Do you think it is something you will pursue in graduate school.

Dan Mont said...

Hi Alex,

This "meta-computer science" stuff sounds really interesting to me.

In your discussions about generating algorithms to prove logical statements, have you talked about Godel's Theorem at all?

Alexander Mont said...

This is a field I might be considering to pursue in graduate school

We have not talked about Godel's Theorem yet.