|
ABSTRACT
The earliest data flow analysis research dealt with concrete
problems (such as detection of available expressions) and with low
level representations of control flow (with one large graph, each
of whose nodes represents a basic block). Several recent papers
have introduced an abstract approach, dealing with any problem
expressible in terms of a semilattice L and a monoid M of isotone
maps from L to L, under various algebraic constraints. Examples
include [CC77; GW76; KU76; Ki73; Ta75; Ta76; We75]. Several other
recent papers have introduced a high level representation with many
small graphs, each of which represents a small portion of the
control flow information in a program. The hierarchy of small
graphs is explicit in [Ro77a; Ro77b] and implicit in papers that
deal with syntax directed analysis of programs written within the
confines of classical structured programming [DDH72, Sec. 1.7].
Examples include [TK76; ZB74]. The abstract papers have retained
the low level representations while the high level papers have
retained the concrete problems of the earliest work. This paper
studies abstract conditions on L and M that lead to rapid data flow
analysis, with emphasis on high level representations. Unlike some
analysis methods oriented toward structured programming [TK76;
Wu75; ZB74], our method retains the ability to cope with arbitrary
escape and jump statements while it exploits the control flow
information implicit in the parse tree.
The general algebraic framework for data flow analysis with
semilattices is presented in Section 2, along with some preliminary
lemmas. Our "rapid" monoids properly include the "fast" monoids of
[GW76]. Section 3 relates data flow problems to the hierarchies of
small graphs introduced in [Ro77a; Ro77b]. High level analysis
begins with local information expressed by mapping the arcs of a
large graph into the monoid M, much as in low level analysis. But
each arc in our small graphs represents a set (often an infinite
set) of paths in the underlying large graph. Appropriate members of
M are associated with these arcs. This "globalized" local
information is used to solve global flow problems in Section 4. The
fundamental theorem of Section 4 is applied to programs with the
control structures of classical structured programming in Section
5. For a given rapid monoid M, the time required to solve any
global data flow problem is linear in the number of statements in
the program. (For varying M, the time is linear in the product of
this number by t@, where t@ is a parameter of
M introduced in the definition of rapidity.) For reasons sketched
at the beginning of Section 6, we feel obliged to cope with source
level escape and jump statements as well as with classical
structured programming. Section 6 shows how to apply the
fundamental theorem of Section 4 to programs with arbitrary escapes
and jumps. The explicit time bound is only derived for programs
without jumps. A comparison between the results obtained by our
method and those obtained by [GW76] is in Section 7, which also
contains examples of rapid monoids in the full paper. Finally,
Section 8 lists conclusions and open problems. Proofs of lemmas are
omitted to save space. The full paper will resubmitted to a
journal.
We proceed from the general to the particular, except in some
places where bending the rule a little makes a significant
improvement in the expository flow. Common mathematical notation is
used. To avoid excessive parentheses, the value of a function f at
an argument x is fx rather than f(x). If fx is itself a function
then (fx)y is the result of applying fx to y. The usual
¡Ü and ¡Ý symbols are used for arbitrary
partial orders as well as for the usual order among integers. A
function from a partially ordered set (poset) to a poset is
isotone iff x ¡Ü y implies fx ¡Ü fy.
(Isotone maps are sometimes called "monotonic" in the literature.)
A meet semilattice is a poset with a binary operation
¡Ä such that x ¡Ä y is the greatest lower
bound of the set {x, y}. A meet semilattice wherein every subset
has a greatest lower bound is complete. In particular, the
empty subset has a greatest lower bound T, so a complete meet
semilattice has a maximum element. A monoid is a set
together with an associative binary operation ∘ that has
a unit element 1 : 1 ∘ m = m ∘
1 = m for all m. In all our examples the monoid M will be a
monoid of functions: every member of M is a function (from a
set into itself), the operation ∘ is the usual
composition (f ∘ g)x = f(gx), and the unit 1 is
the identity function with 1X = x for all x. Two
considerations governed the notational choices. First, we speak in
ways that are common in mathematics and are convenient here.
Second, we try to facilitate comparisons with [GW76; KU76; Ro77b],
to the extent that the disparities among these works permit. One
disparity is between the meet semilattices of [GW76; KU76; Ki73]
and the join semilattices of [Ro77b; Ta75; We75], where
least upper bounds are considered instead of greatest lower bounds.
To speak of meets is more natural in applications that are
intuitively stated in terms of "what must happen on all paths" in
some class of paths in a program, while to speak of joins is more
natural in applications that are intuitively stated in terms of
"what can happen on some paths." By checking whether there are any
paths in the relevant class and by using the rule that 3 is
equivalent to ⌍V⌍, join oriented applications
can be reduced to meet oriented ones (and vice versa). A general
theory should speak in one way or the other, and we have chosen
meets. For us, strong assertions about a program's data flow are
high in the semilattice.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
 |
2
|
|
 |
3
|
|
| |
4
|
|
 |
5
|
|
 |
6
|
|
| |
7
|
Ha77b. Harrison, W. H. Compiler analysis of the value ranges for variables. IEEE Trans. on Software Engineering3 ( 1977), 243-250.
|
| |
8
|
KU77. Kam, J. B., and Ullman J. D. Monotone data flow analysis framework. Acta Informatica7 (1977), 305-317.
|
 |
9
|
|
 |
10
|
|
 |
11
|
|
| |
12
|
La77. Lamport, L. Proving the correctness of multiprocess programs. IEEE Trans. on Software Engineering3 (1977), 125-143.
|
 |
13
|
|
 |
14
|
|
| |
15
|
Ro76. Rosen, B. K. Correctness of parallel programs: the Church-Rosser approach. Theoretical Computer Science2 (1976), 183-207.
|
 |
16
|
|
 |
17
|
|
| |
18
|
TK76. Taniguchi, K., and Kasami, T. An O(n) algorithm for computing the set of available expressions of D-charts. Acta Informatica6 (1976), 361-364.
|
| |
19
|
|
| |
20
|
|
| |
21
|
Ul73. Ullman, J. D. Fast algorithms for the elimination of common subexpressions. Acta Informatica2 (1973), 191-213.
|
| |
22
|
We75. Wegbreit, B. Property extraction in well founded property sets. IEEE Trans. on Software Engineering1 (1975), 270-285.
|
| |
23
|
|
| |
24
|
|
| |
25
|
ZB74. Zelkowitz, M. V., and Bail, W. G. Optimization of structured programs. Software Practice and Experience4 (1974), 51-57.
|
|