没有合适的资源?快使用搜索试试~ 我知道了~
Lecture Notes on Static Analysis.pdf
需积分: 10 14 下载量 54 浏览量
2008-06-21
21:29:17
上传
评论
收藏 347KB PDF 举报
温馨提示
试读
54页
Lecture Notes on Static Analysis.pdf
资源推荐
资源详情
资源评论
Lecture Notes on Static Analysis
Michael I. Schwartzbach
BRICS, Department of Computer Science
University of Aarhus, Denmark
mis@brics.dk
Abstract
These notes present principles and applications of static analysis of
programs. We cover type analysis, lattice theory, control flow graphs,
dataflow analysis, fixed-point algorithms, narrowing and widening, inter-
procedural analysis, control flow analysis, and pointer analysis. A tiny
imperative programming language with heap pointers and function point-
ers is subjected to numerous different static analyses illustrating the tech-
niques that are presented.
The style of presentation is intended to be precise but not overly for-
mal. The readers are assumed to be familiar with advanced programming
language concepts and the basics of compiler construction.
1
Contents
1 Introduction 3
2 A Tiny Example Language 4
Example Programs 6
3 Type Analysis 7
Types 7
Type Constraints 8
Solving Constraints 9
Slack and Limitations 10
4 Lattice Theory 11
Lattices 11
Fixed-Points 12
Closure Properties 13
Equations and Inequations 15
5 Control Flow Graphs 15
Control Flow Graphs for Statements 16
6 Dataflow Analysis 17
Fixed-Point Algorithms 18
Example: Liveness 19
Example: Available Expressions 22
Example: Very Busy Expressions 25
Example: Reaching Definitions 26
Forwards, Backwards, May, and Must 27
Example: Initialized Variables 28
Example: Sign Analysis 28
Example: Constant Propagation 31
7 Widening and Narrowing 32
8 Interprocedural Analysis 35
Flow Graphs for Programs 35
Polyvariance 37
Example: Tree Shaking 39
9 Control Flow Analysis 39
Control Flow Analysis for the λ-Calculus 39
The Cubic Algorithm 40
Control Flow Graphs for Function Pointers 42
Class Hierarchy Analysis 44
10 Pointer Analysis 45
Points-To Analysis 44
Andersen’s Algorithm 45
Steensgaard’s Algorithm 47
Interprocedural Points-To Analysis 48
Example: Null Pointer Analysis 49
Example: Shape Analysis 51
Example: Escape Analysis 53
11 Conclusion 54
2
1 Introduction
There are many interesting questions that can be asked about a given program:
• does the program terminate?
• how large can the heap become during execution?
• what is the possible output?
Other questions concern individual program points in the source code:
• does the variable x always have the same value?
• will the value of x be read in the future?
• can the pointer p be null?
• which variables can p point to?
• is the variable x initialized before it is read?
• is the value of the integer variable x always positive?
• what is a lower and upper bound on the value of the integer variable x?
• at which program points could x be assigned its current value?
• do p and q point to disjoint structures in the heap?
Rice’s theorem is a general result from 1953 that informally can be paraphrased
as stating that all interesting questions about the behavior of programs are
undecidable. This is easily seen for any special case. Assume for example the
existence of an analyzer that decides if a variable in a program has a constant
value. We could exploit this analyzer to also decide the halting problem by
using as input the program:
x = 17; if (TM(j)) x = 18;
Here x has a constant value if and only if the j’th Turing machine halts on
empty input.
This seems like a discouraging result. However, our real focus is not to decide
such properties but rather to solve practical problems like making the program
run faster or use less space, or finding bugs in the program. The solution is
to settle for approximative answers that are still precise enough to fuel our
applications.
Most often, such approximations are conservative, meaning that all errors
lean to the same side, which is determined by our intended application.
Consider again the problem of determining if a variable has a constant value.
If our intended application is to perform constant propagation, then the analysis
may only answer yes if the variable really is a constant and must answer no if
the variable may or may not be a constant. The trivial solution is of course to
answer no all the time, so we are facing the engineering challenge of answering
yes as often as possible while obtaining a reasonable performance.
3
A different example is the question: to which variables may the pointer p
point? If our intended application is to replace *p with x in order to save a
dereference operation, then the analysis may only answer “&x” if p certainly
must point to x and must answer “?” if this is false or the answer cannot be
determined. If our intended application is instead to determine the maximal size
of *p, then the analysis must reply with a possibly too large set {&x,&y,&z,...}
that is guaranteed to contain all targets.
In general, all optimization applications need conservative approximations.
If we are given false information, then the optimization is unsound and changes
the semantics of the program. Conversely, if we are given trivial information,
then the optimization fails to do anything.
Approximative answers may also be useful for finding bugs in programs,
which may be viewed as a weak form of program verification. As a case in
point, consider programming with pointers in the C language. This is fraught
with dangers such as null dereferences, dangling pointers, leaking memory, and
unintended aliases. The standard compiler technology, based on type checking,
offers little protection from pointer errors. Consider the following small program
which performs every kind of error:
int main() {
char *p,*q;
p = NULL;
printf("%s",p);
q = (char *)malloc(100);
p = q;
free(q);
*p = ’x’;
free(p);
p = (char *)malloc(100);
p = (char *)malloc(100);
q = p;
strcat(p,q);
}
The standard tools such as gcc -Wall and lint detect no errors. If we had
even approximative answers to questions about null values and pointer targets,
then many of the above errors could be caught.
Exercise 1.1: Describe all the errors in the above program.
2 A Tiny Example Language
We use a tiny imperative programming language, called TIP, throughout the
following sections. It is designed to have a minimal syntax and yet to contain
all the constructions that make static analyses interesting and challenging.
4
Expressions
The basic expressions all denote integer values:
E → intconst
→ id
→ E + E | E - E | E * E | E / E | E > E | E == E
→ ( E )
→ input
The input expression reads an integer from the input stream. The comparison
operators yield 0 for false and 1 for true. Pointer expressions will be added later.
Statements
The simple statements are familiar:
S → id = E ;
→ output E;
→ S S
→ if (E ) { S }
→ if (E ) { S } else { S }
→ while (E ) { S }
→ var id
1
,. . . ,,id
n
;
In the conditions we interpret 0 as false and all other values as true. The output
statement writes an integer value to the output stream. The var statement
declares a collection of uninitialized variables.
Functions
Functions take any number of arguments and return a single value:
F → id ( id ,. . . ,id ) { var id ,. . . ,id; S return E ; }
Function calls are an extra kind of expression:
E → id ( E ,. . . ,E )
Pointers
Finally, to allow dynamic memory, we introduce pointers into a heap:
E → &id
→ malloc
→ *E
→ null
5
剩余53页未读,继续阅读
资源评论
liuxizhiyi
- 粉丝: 3
- 资源: 13
上传资源 快速赚钱
- 我的内容管理 展开
- 我的资源 快来上传第一个资源
- 我的收益 登录查看自己的收益
- 我的积分 登录查看自己的积分
- 我的C币 登录后查看C币余额
- 我的收藏
- 我的下载
- 下载帮助
最新资源
- 基于Python的PCA人脸识别算法的原理及实现代码详解+源码+详细代码解析+开发文档+数据(毕业设计&课程设计&项目开发)
- Decision tree20240105(1).ipynb
- zuoyezuoyezuoye
- zuoyezuoyezuoye
- 机械设计电机转子装配设备sw22非常好的设计图纸100%好用.zip
- 作业作业作业作业作业作业
- xdotool.c
- RLMD鲁棒性局部均值分解信号分量可视化(Matlab完整源码和数据)
- Screenshot_2024-04-26-17-17-26-36_9d26c6446fd7bb8e41d99b6262b17def.jpg
- 6.0版本超广角文件+教程使用MT管理器打-7.zip
资源上传下载、课程学习等过程中有任何疑问或建议,欢迎提出宝贵意见哦~我们会及时处理!
点击此处反馈
安全验证
文档复制为VIP权益,开通VIP直接复制
信息提交成功