COMS W4117
Compilers and Translators:
Software Verification Tools
Lecture 13: Applications of Interprocedural Analysis
October 18, 2007
Lecture Outline
- Review
- Context-sensitive vs. context-insensitive analysis
- Call strings
- Cloning-based context-sensitive analysis
- Summary-based context-sensitive analysis
- Pointer alias analysis
- SQL injections
- Buffer overflows
1. Review
- Intraprocedural vs. interprocedural analysis
- Call graphs
2. Context-sensitive vs. Context-insensitive Analysis
- Context-insensitive interprocedural analysis treats each call and return
statement as goto operations ignoring the order of the calls.
- We can model this kind of analysis with a super control-flow graph (like the PDG)
which is a control-flow graph with additional edges connecting each call
site to the beginning of the procedure it calls and each return statement to
the instruction following the call site.
- A context-sensitive analysis keeps track of the context in which each
procedure was called.
for (i = 0; i < n; i++) {
c1: t1 = f(0);
c2: t2 = f(243);
c3: t3 = f(243);
X[i] = t1 + t2 + t3;
}
int f(int v) {
return (v+1);
}
A context-insensitive analysis yields that X[i] is either 3, 246, 489, or 732.
A context-sensitive analysis yields that X[i] is 489.
3. Call Strings
- A calling context is defined by the contents of the entire call stack.
We refer to the string of call sites on the stack as the call string.
- Consider a modification of the above program where the calls to f are now
calls to g, which then calls f:
for (i = 0; i < n; i++) {
c1: t1 = g(0);
c2: t2 = g(243);
c3: t3 = g(243);
X[i] = t1 + t2 + t3;
}
int g(int v) {
c4: return f(v);
}
int f(int v) {
return (v+1);
}
There are three calls strings to f: (c1, c4), (c2, c4), (c3, c4).
4. Cloning-based Context-sensitive Analysis
- In cloning-based context-sensitive analysis we create a conceptual clone of
the procedure for each context of interest and then apply a context-insensitive
analysis to the cloned call graph.
- In practice we do not clone the procedure; we use an internal data structure
to keep track of the analysis results on each clone.
- A cloned version of the above program is shown here:
for (i = 0; i < n; i++) {
c1: t1 = g(0);
c2: t2 = g(243);
c3: t3 = g(243);
X[i] = t1 + t2 + t3;
}
int g(int v) {
c4.1: return f1(v);
}
int g(int v) {
c4.2: return f2(v);
}
int g(int v) {
c4.3: return f3(v);
}
int f1(int v) {
return (v+1);
}
int f2(int v) {
return (v+1);
}
int f3(int v) {
return (v+1);
}
5. Summary-based Context-sensitive Analysis
- In summary-based context-sensitive analysis we create a concise description ("summary") of
the observable behavior of each procedure.
- The purpose of the summary is to avoid reanalyzing the procedure body at each
invoking call site.
- Each procedure is represented by a region with a single entry point.
- A procedure region can be nested within several different outer regions.
- The analysis has two parts:
- A bottom-up phase that computes a transfer function to summarize the
effect of a procedure.
- A top-down phase that propagates caller information to compute callee results.
- The following program shows the result of propagating all possible
constant arguments to the function f in the program in section (4) above:
for (i = 0; i < n; i++) {
c1: t1 = f0(0);
c2: t2 = f243(243);
c3: t3 = f243(243);
X[i] = t1 + t2 + t3;
}
int f0(int v) {
return (1);
}
int f243(int v) {
return (244);
}
6. Pointer Alias Analysis
- We can improve the accuracy of program analyses if we can determine
whether two pointers can be aliases of one another.
- If we can determine that p and q cannot point to the same location
in the following code fragment, then we can conclude that x is 1
at the end of the third statement.
*p = 1;
*q = 2;
x = *p;
7. SQL Injection
- Suppose a banking application uses a relation:
AcctData(name, password, balance)
Suppose a user enters a name n and a password p into a Web form
and the following SQL is then executed:
SELECT balance FROM AcctData
WHERE name = ':n' and password = ':p'
A hacker enters:
n = Charles Dickens' --
p = who cares
The SQL query becomes:
SELECT balance FROM AcctData
WHERE name = 'Charles Dickens' --' and password = 'who cares'
Since -- introduces a comment, the SQL query becomes:
SELECT balance FROM AcctData
WHERE name = 'Charles Dickens'
and the hacker get Charles Dickens' balance.
A full interprocedural analysis of the program is necessary to detect such errors.
8. Reading
aho@cs.columbia.edu