.lf 3 aggie4.t 0
.nr PS 11
.nr VS 16p
.nr DD 0p
.EQ
delim %%
define in '^ size -2 \(mo ~'
define notin '^ size -2 \(nm ^'
define ge '^ size -2 >= ^'
define le '^ size -2 <= ^'
define gt '^ size -2 > ^'
define X0p '{X sub {p,0}}'
define Xip '{X sub {p,i}}'
define Rip '{R sub {p,i,a}}'
define Ript '{R sub {pt,i,a}}'
define Xjp '{X sub {p,j}}'
define Y0pt '{Y sub {pt,0}}'
define Yipt '{Y sub {pt,i}}'
define Xptp '{X sub {pt \{mu p}}'
define Rjp '{R sub {p,j,a}}'
define Rjpt '{R sub {pt,j,a}}'
define iss '^ size -2 \(ib ^'
define and '^ size -2 \(AN ^'
define mul '^ size -2 \(mu ^'
.EN
.TL
Modular Attribute Grammars
.AU
Gerald D. P. Dueck*
Gordon V. Cormack\(dg
.FS
* On leave from Department of Mathematics and Computer Science,
Brandon University, Brandon, Manitoba, Canada, R7A 6A9.
Bitnet address: gdueck at water; gery at uofmcc.
.br
\(dg Internet address: gvcormack@waterloo.edu. Bitnet or UUCP:
gvcormac at water.
.AI
Department of Computer Science
University of Waterloo
Waterloo, Ontario, Canada N2L 3G1
.AB
.LP
Attribute grammars provide a formal declarative notation for describing
the semantics and translation of programming languages.
Describing any real programming language is a significant
software engineering challenge.
From a software engineering viewpoint, current
notations for attribute grammars have two flaws:
tedious repetition of essentially the same attribute computations
is inevitable, and the various components of the description cannot
be decomposed into modules \(em they must be merged (and hence closely
coupled) with the syntax specification. This paper describes a tool
that generates attribute grammars from pattern-oriented specifications.
These specifications can be grouped according to the separation of
concerns arising from individual aspects of the compilation process.
Implementation and use of the attribute grammar generator MAGGIE is
described.
.AE
.SH
Introduction
.LP
Attribute grammars (Knuth, 1968) provide a formalism for describing the
syntax, semantics, and translation of programming languages using a
declarative specification. One advantage of such a specification is that
it provides a formal definition of the programming language being
described. Another advantage is that the specification can be converted
automatically into a compiler. The formalism itself is simple, yet quite
powerful. However, close inspection of even a small attribute grammar
will reveal certain drawbacks, namely repetition, overwhelming detail,
and the interleaving of many activities.
The size and complexity of a specification written as an
attribute grammar is such that the notion of correctness, originally an
impetus for the formalism, is compromised;
the grammar is difficult to read
and particularly difficult to debug.
.PP
Conventional attribute grammars have no facility to support
abstraction; we cannot easily extract from an attribute
grammar the nature of individual sets of related computations,
or modules.
To address this problem, we have developed a mechanism to generate
attribute grammars from
rules which are grouped together as
modular attribute grammars (MAGs).
Because a MAG rule describes a class of attribute computations,
modular attribute grammars
can be significantly less complex than equivalent attribute grammars,
and can be structured according to appropriate criteria for modular
decomposition (Parnas, 1972).
.PP
To introduce MAGS, we first require some terminology from
attribute grammars.
An attribute grammar consists of a context-free grammar with each
production augmented by attribute computations.
\fIAttributes\fP are values associated with nodes in the derivation
trees corresponding to strings in the language
generated by the context-free grammar.
An \fIattribute computation\fP defines an attribute in terms
of other attributes in the same or adjacent nodes.
Each production may have a number of attribute computations associated
with it; because of this,
the structure of the context-free grammar, rather than the
relationships among the attribute computations, dominates an attribute
grammar.
We feel it is more appropriate to structure the attribute grammar
according to the computation of attributes.
.PP
A single MAG is a set of patterns and associated templates.
The patterns are applied to a context-free grammar; for those
that match, an attribute computation is generated from the associated
template.
Pattern matching and selection of generated attribute computations
are constrained;
pattern matching uses
\fIdefinability\fP and
generated computations are selected by \fIneed\fP,
two ideas
which are introduced in this paper.
.PP
The attribute computations generated by a MAG collectively define
one or more \fIoutput\fP attributes from zero or more \fIinput\fP
attributes.
These sets of input and output attributes constitute the interface to
the MAG.
Separate MAGs can be defined to address separate concerns in the
language and compiler specification and can be combined according to
their interfaces.
.PP
We have built a prototype and gained some experience with modular
attribute grammars. The basic tool
translates a context-free grammar
and several MAGs into a monolithic attribute grammar
and produces tables used by another tool to parse program source,
build a form of \fIcompound dependency graph\fP (Jalili, 1983),
and evaluate the graph.
Attribute computations are written as compound statements in the C
programming language, and may
reference user-defined functions.
We have used the prototype to perform semantic analysis for declarations
in Pascal
and to develop techniques for using MAGs.
.KF
.lf 3 magpic1.t 0
.PS
boxwid=.75
boxht=.5
disp = .375
[
T1: box "MAG" "translator"
arrow down from T1.s
AG: box "AG" "source"
arrow right from AG.e
T2: box "AG" "translator"
arrow right 2 * disp
P: box "parser +" "evaluator"
line <- up from P.n
box "program" "source" with .s at last line.n
arrow down from P.s
box "trans-" "lation"
box invis "MAGGIE" with .w at T1.e + linewid,0
CFG: box "CFG" with .s at T1.n+-boxht,boxwid
MAG: box "MAG" "source" with .s at T1.n+boxht,boxwid
line at last box.nw+.05,0 up .05 then right boxwid then down boxht then left .05
line at last box.nw+.1,.05 up .05 then right boxwid then down boxht then left .05
line at last box.nw+.15,.1 up .05 then right boxwid then down boxht then left .05
line at last box.nw+.2,.15 up .05 then right boxwid then down boxht then left .05
arrow from CFG.s to 1/3
arrow from MAG.s to 2/3
box dotted with .nw at T1.nw + -disp,disp \
ht 2 * boxht + linewid + 2 * disp \
wid 2 * boxwid + linewid + 2 * disp
]
box invis "Figure 1: Translation of modular attribute grammars" \
with .n at last[].s
.PE
.lf 3 aggie4.t 143
.KE
.SH
Related Work
.LP
In several recent papers on attribute grammars, we find research
motivated by the need to reduce the complexity of compiler descriptions
written as attribute grammars.
In particular,
Koskimies, R\o'a\(..'ih\o'a\(..', and Sarjakoski (1982)
note that attribute grammars are hard
to read and understand, being far from self-documenting;
Ganzinger and Giegerich (1984)
quote further references in claiming that the few attribute
rules which bear semantic
significance are often buried in a large number of trivial rules;
and R\o'a\(..'ih\o'a\(..' and Tarhio (1986)
mention the difficulty of comprehending
the global use pattern of an attribute in the presence of superfluous
information.
We agree with these concerns.
.PP
Koskimies \fIet al.\fP (1982) and R\o'a\(..'ih\o'a\(..' (1984)
propose that, in designing an attribute grammar, consideration
should be given to the objects represented by the non-terminals,
not by the productions.
This approach, also pursued in the HLP84 system,
(Koskimies, Nurmi, Paaka, and Sippu, 1988),
tends to emphasize single-pass algorithmic computation, and
mandates that attribute computations be encapsulated within blocks
defining non-terminals.
In contrast, our approach emphasizes attributes rather than
non-terminals, declarative rather than algorithmic computation,
and modular structure independent from the context-free grammar.
.PP
GAG (Kastens, Zimmerman, and Hutt, 1982)
is a compiler generator that uses monolithic attribute
grammars and is necessarily rule based. GAG addresses
complexity by providing attribute transfer
and remote attribute access facilities. Attribute transfer abbreviates
a set of simple-copy attribute computations into one statement.
Remote attribute access abbreviates the transfer of attributes over
long distances in the derivation tree.
In effect, transfer and remote access are specific examples of simple
abstractions that the designers have built into GAG.
Modular attribute grammars facilitate general user abstractions that
subsume both of these facilities.
Jullig and DeRemer (1984) and Koskimies \fIet al.\fP (1982)
also address the problem of automatic
propagation of attribute values through the derivation tree.
.PP
Attribute coupled grammars (Ganzinger and Giegerich, 1984) and
tree transformation grammars
(Keller, Perkins, Payton, and Mardinly, 1984)
are used to specify compilation phases.
We define a phase as a data structure and an algorithm to map the
(input) data structure into another (output) data structure.
A formalism to specify a phase-structured compiler necessarily
requires a way to specify data structures and inter-phase interfaces
and we suggest this requirement
increases the complexity of the formalism.
Modular attribute grammars may be used to specify phases using
a single, simpler formalism.
However, we do not necessarily agree that phases are the most
appropriate approach to module decomposition.
.PP
The MUG2 system (Ganzinger, Giegerich, M\o'o\(..'ncke, and Wilhelm)
also provides a mechanism to specify phases of translation.
Each phase resembles an attribute grammar, and context-free
grammar rules are repeated in each phase. These phases, while
resembling our MAGs, differ in two significant ways:
(1) MAGs are not phase oriented; no ordering is implied,
and MAGS may be mutually dependent and
(2) MAGs do not include context-free grammar rules and are
much less closely coupled to the concrete syntax.
.PP
Regular expressions improve the conciseness of the context-free
portion of an attribute grammar (cf. Kastens \fIet al.\fP (1982)
and Jullig and DeRemer (1984)).
However, they introduce the additional notions of alternation and
repetition into attribute computations.
They provide no fundamental alternative to the monolithic
structure of conventional attribute grammars.
.PP
Extended attribute grammars (EAGs) are another notation based on
attribute grammars (Watt and Madsen (1982), Watt (1986)).
In EAGs, the notation for expressing relationships between
attributes is embedded within the notation for expressing
the context-free grammar.
The benefit of this approach is to allow attribute relationships
to be stated implicitly.
Our approach, in contrast, is based on decoupling attribute
computation and context-free productions.
.PP
Work on attribute evaluators that are efficient
in terms of execution time
(left to right evaluators (Bochmann, 1976),
ordered attribute grammars (Kastens, 1980),
translation for direct execution (Katayama, 1985), one-pass evaluators
(Koskimies, 1984)) and storage management
(Jazayeri and Pozefsky, 1981) has progressed so
that they can be realistically engineered
for production compilers (Kastens \fIet al.\fP, 1982).
Our work complements this
technology; the monolithic attribute grammar produced in an
intermediate stage of our prototype
may serve as input for other systems.
.SH
Modular Attribute Grammars
.LP
A conventional attribute grammar consists of a number of context-free
grammar rules; textually associated with each grammar rule are
a number of
attribute computations. In our system, the attribute computations
are specified separately from the CFG in a number of MAGs
(see Figure 1).
A MAG resembles an attribute grammar: it consists of a number of
.I patterns ;
with each pattern is associated one or more templates. A
pattern matches a set of rules in the CFG,
and a template specifies the attribute computations to be generated
for each matching CFG rule.
Whereas a CFG rule contains only vocabulary symbols, a MAG pattern
contains: variable symbols, which match
.I any
vocabulary symbol; quoted symbols, which match one vocabulary
symbol; and ellipses, which match zero or more vocabulary symbols.
Attribute references within the template are of the form P.x, where
P is the name of a variable or quoted symbol, and x is an attribute
name. (If P is repeated in the pattern, successive occurrences
are denoted P, P\s-2\1\s0, P\s-2\2\s0, etc.
For example, the MAG rule
P \(ra P %'%B ... Q
P.w = P\s-2\1\s0.x + B.y + Q.z
.br
matches the CFG rule
A \(ra A B C D E
.br
and generates the attribute computation
A.w = A\s-2\1\s0.x + B.y + E.z
.PP
The attribute computation is meaningful only if the attributes
A.x, B.y, and E.z are defined (by some other computation).
The computation defines A.w and is useful only if
A.w is used elsewhere in an attribute computation. The matching
of MAG rules to CFG rules is constrained to generate only useful
and meaningful attribute computations.
The generated attribute computations must comprise a well-formed
attribute grammar (Knuth, 1968), but this is not necessary
to the operation of the MAG translator; the translator does,
however, perform some \fIpost hoc\fP consistency checks.
.SH
An Example
.LP
In this section we use an example from Knuth (1968).
The problem is to create modules that generate an attribute
grammar to recognize and evaluate expressions on binary numbers.
At the end of this section, we discuss how changes to the problem
affect the modules.
.PP
The syntax is described by the following context-free grammar.
.DS
1 goal \(-> expr
2 expr \(-> term
3 expr \(-> expr addop term
4 term \(-> factor
5 term \(-> term mulop factor
6 factor \(-> int
7 factor \(-> ( expr )
8 int \(-> digit
9 int \(-> int digit
10 digit \(-> 0
11 digit \(-> 1
12 addop \(-> +
13 mulop \(-> *
.DE
.PP
The value of a binary number is determined using
%sum from p=1 to n (d sub p ) sup p-1%
where \fIp\fP denotes the position of a binary digit
\fId\fP with respect to the right boundary of a string of
\fIn\fP digits.
.PP
The following MAG rules describe the synthesized attribute \fIval\fP,
used to compose the value of a binary number or expression.
.DS
\fBmodule\fP \fIval\fP
1 %'%digit \(-> %'%0
aDigit.val = 0;
2 %'%digit \(-> %'%1
aDigit.val = 2 ^ aDigit.scale;
3 binexpr \(-> Lopnd op Ropnd
binexpr.val = callop(op.operator, Lopnd.val, Ropnd.val);
4 compose \(-> valA valB
compose.val = valA.val + valB.val;
5 A \(-> %...% B %...%
A.val = B.val;
.DE
Considering only textual pattern matching,
the first two MAG rules match productions 10 and 11;
rule 3 matches productions 3, 5 and 7;
rule 4 matches production 9; and rule 5 matches productions
1 \- 13 in 20 different ways.
.PP
Rule 5 textually
matches any production with one or more right-part symbols.
An attribute computation will only be generated from this rule if,
for a production that matches,
a) the right-part production symbol matching B
is able to synthesize the attribute
\fIval\fP and b) an occurrence of the left-part production symbol
matching A appears
in some other production on the right-hand side and
\fIneeds\fP the attribute \fIval\fP in that context.
For example, although the variable symbol B in rule 5
could match the symbol \(oq(\(cq in production 7,
there is no opportunity for
\(oq(\(cq to have synthesized this attribute.
.PP
The template in rule
3 invokes the function
.I callop
with two values and an operation indicator bound to the attribute
.I operator .
.DS
\fBmodule\fP \fIoperator\fP
6 op \(-> %'%+
op.operator = add;
7 op \(-> %'%*
op.operator = mul;
.DE
.PP
The inherited attribute \fIscale\fP (required by rule 2)
is initially zero for the right-most digit of a binary number
and incremented to the left.
.DS
\fBmodule\fP \fIscale\fP
8 binary \(-> left right
right.scale = binary.scale;
9 binary \(-> left right
left.scale = binary.scale + 1;
10 A \(-> %'%int
%'%int.scale = 0;
11 A \(-> B
B.scale = A.scale;
.DE
Although rules 8 and 9 have the same pattern, their respective
templates generate different computations. We have chosen not
to condense these two into one MAG rule with two templates
because of the implication that two attribute computations would
then be generated simultaneously.
.PP
The result of applying the generator to the context-free grammar and
the MAG rules illustrated above is the following attribute grammar.
.DS
1 goal \(-> expr
goal.val = expr.val ;
.DE
.DS
2 expr \(-> term
expr.val = term.val ;
.DE
.DS
3 expr \(-> expr addop term
expr.val = callop ( addop.operator , expr\s-2\1\s0.val , term.val ) ;
.DE
.DS
4 term \(-> factor
term.val = factor.val ;
.DE
.DS
5 term \(-> term mulop factor
term.val = callop ( mulop.operator , term\s-2\1\s0.val , factor.val ) ;
.DE
.DS
6 factor \(-> int
int.scale = 0 ;
factor.val = int.val ;
.DE
.DS
7 factor \(-> ( expr )
factor.val = expr.val ;
.DE
.DS
8 int \(-> digit
digit.scale = int.scale ;
int.val = digit.val ;
.DE
.DS
9 int \(-> int digit
int\s-2\1\s0.scale = int.scale + 1 ;
digit.scale = int.scale ;
int.val = int\s-2\1\s0.val + digit.val ;
.DE
.DS
10 digit \(-> 0
digit.val = 0 ;
.DE
.DS
11 digit \(-> 1
digit.val = 2 ^ digit.scale ;
.DE
.DS
12 addop \(-> +
addop.op = add ;
.DE
.DS
13 mulop \(-> *
mulop.op = mul ;
.DE
.LP
\fBDiscussion of modularity.\fP
To measure the degree of abstraction achieved by the three modules
above, let us propose some syntactic and semantic changes
to the example language and discuss how these affect the modular
specification.
.PP
Module \fIval\fP is immune to all changes except those that directly
affect the computation of \fIval\fP.
For example, MAG rules 1 and 2 each handle unique digits.
If the number-base changes
from binary to some other base, the number of digits will increase
and each will require a MAG rule similar to 1 and 2.
This will change module \fIval\fP directly in proportion to the
number base change.
On the other hand, if more operators or more operator priority levels
are added,
rule 3 will remain unchanged.
Rule 5 is a copy rule that simply propagates \fIval\fP up the tree.
The language could also be changed so that \fIval\fP is required in
more places; rule 5 guarantees that it is always available.
.PP
Module \fIoperator\fP abstracts the individual operators.
It is immune to
number base and priority changes but is clearly affected by
the addition of more operators.
The grammar could be modified by incorporating the unit productions
for \fIaddop\fP and \fImulop\fP into
productions where they are referenced.
In this case, the rules in module \fIoperator\fP could be modified to
recognize the operators \fI+\fP and \fI*\fP \fIin situ\fP, using
patterns such as A \(ra ... %'%+ ... or the more restrictive
A \(ra A %'%+ C, with
no further changes to the modules required.
.PP
Module \fIscale\fP is unaffected by any of the proposed changes.
.SH
Using Modular Attribute Grammars
.LP
In this section we discuss different approaches to modularity
in compiler construction and show how MAGs
may be used to achieve module decomposition.
.LP
\fBBucket Brigade.\fP
In a compiler, it is often the case that the name space or environment
is propagated down the derivation tree and an environment augmented
with new definitions is propagated up the tree
(Koskimies \fIet al.\fP, 1982) (see Figure 2).
.KF
.lf 3 magpic2.t 0
.PS
circlerad = .0625
x = .15
y = .15
[
Root: circle radius .0625
P1: circle same at Root + -.4,-.4
P2: circle same at P1 + -.4,-.4
P3: circle same at P2 + -.4,-.4
P4: circle same at P1 + .4,-.4
P5: circle same at P2 + .4,-.4
P6: circle same at Root + .4,-.4
P7: circle same at P6 + .4,-.4
P8: circle same at P7 + .4,-.4
P9: circle same at P7 + -.4,-.4
line from Root to P1 chop
line from P1 to P2 chop
line from P2 to P3 chop
line from P1 to P4 chop
line from P2 to P5 chop
line from Root to P6 chop
line from P6 to P7 chop
line from P7 to P8 chop
line from P7 to P9 chop
spline from Root + -x,0 \
to P1 + -x,0 \
then to P2 + -x,0 \
then to P3 + 0,y \
then to P3 + -x,0 \
then to P3 + 0,-y \
then to P3 + x,0 \
then to P5 + -x,0 \
then to P5 + 0,-y \
then to P5 + x,0 \
then to P5 + 0,y \
then to P2 + x/2,0 \
then to P2 + x,y \
then to P4 + -x,0 \
then to P4 + 0,-y \
then to P4 +x,0 \
then to P4 + 0,y \
then to P1 + x/2,0 \
then to P1 + x,y \
then to P6 + -x,0 \
then to P6 + 0,-y \
then to P7 + 2*-x,y \
then to P7 + -x,0 \
then to P7 + 2*-x,-y \
then to P9 + 1.5*-x,.0 \
then to P9 + 0,-y \
then to P9 + x,0 \
then to P8 + -x,0 \
then to P8 + 0,-y \
then to P8 + x,0 \
then to P8 + 0,y \
then to P7 + x,0 \
then to P6 + x,0 \
then to Root + x,0
]
box invis "Figure 2: Left-to-right tree traversal" \
with .n at last[].s ht .4
.PE
.lf 3 aggie4.t 502
.KE
The \fIbucket brigade\fP operator of
regular-right part attribute grammars (Jullig and DeRemer, 1984)
was introduced
for this purpose.
The following sets of MAG rules accomplish the same
effect. We have subdivided them into two modules: the first produces
the output attribute \fIenv\fP intended to represent the environment
propagated down the derivation tree and the second module produces the
attribute \fIdef\fP intended to represent the (modified) environment
passed up the tree.
We assume the start symbol in the context-free grammar is \fIgoal\fP.
In terms of inputs and outputs,
the two modules are dependent both on themselves and on each other.
The example illustrates a valid modular decomposition
that that cannot be expressed as a phase oriented decomposition.
.DS
\fBmodule\fP \fIenv\fP
(1) %'%goal %->% A ...
A.env = 0;
(2) A %->% B ...
B.env = A.env;
(3) A %->% ... B C ...
C.env = B.def;
.DE
.DS
\fBmodule\fP \fIdef\fP
(4) A %->% ... B ...
B.def = B.env;
(5) A %->% ... B
A.def = B.def;
(6) A %->%
A.def = A.env;
.DE
.PP
In module \fIenv\fP, (2) the left symbol of a right part inherits
\fIenv\fP from the left part and (3) other symbols in the right part
inherit \fIenv\fP in terms of \fIdef\fP
synthesized by their left neighbours.
.PP
In module \fIdef\fP, an empty production (6) sends \fIenv\fP back to
its parent as \fIdef\fP.
In a non-empty production, (5) \fIdef\fP is
synthesized from the right-most
symbol of the right part.
In the case (4) that a right-part
symbol is not a non-terminal and therefore
\fIdef\fP cannot by synthesized by (5) or (6) when the
symbol appears as a left part, \fIenv\fP is passed along as
\fIdef\fP.
.PP
The modules shown above will generate a top-down left-to-right
traversal for any derivation tree of any context-free grammar.
The modules demonstrate that a constant number of rules
can be used to describe an abstraction \- namely, bucket brigade \-
that applies to any size attribute grammar;
we feel this is a significant reduction in
complexity.
In a compiler application, the \fIdef\fP module will be augmented by
rules that recognize defining occurrences of identifiers and
generate attribute definitions that modify the environment.
.PP
To illustrate the flexibility of modular attribute grammars, we
show how to perform a top-down \fIright-to-left\fP
traversal of the derivation
tree with the following modules.
.DS
\fBmodule\fP \fIenv\fP
%'%goal %->% ... A
A.env = 0;
A %->% ... B
B.env = A.env;
A %->% ... B C ...
B.env = C.def;
.DE
.DS
\fBmodule\fP \fIdef\fP
A %->% ... B ...
B.def = B.env;
A %->% B ...
A.def = B.def;
A %->%
A.def = A.env;
.DE
.LP
.B "Abstract MAGs."
A common technique in compiler construction is to transform the
parse tree into an
.I "abstract syntax tree" ,
a representation of the parse with less irrelevant detail. Some
notation is necessary to specify the relationship of the parse tree
to the abstract syntax tree. A MAG
could be used to specify the translation from concrete to abstract
syntax tree, whose shape is specified by a second context-free grammar
(the abstract grammar). A second MAG, based on the abstract grammar,
could specify the evaluation of attributes in the abstract syntax tree.
.PP
In a number of instances, the same abstraction can be achieved without
using the two-phase approach described above. Patterns can be used
in place of two common mappings from concrete to abstract syntax:
.I grouping
and
.I elision.
Grouping involves building nodes of a common type for subtrees
derived from a number of nonterminal symbols or rules
with similar meanings.
In the binary-numbers example given above,
the nonterminals %term%, %factor%, and %expr%
all represent expressions; the nonterminals %addop% and %mulop%
both represent operators. The rules
.I "expr \(-> expr addop term"
and
.I "term \(-> term mulop factor"
both represent
binary-expression tree nodes. MAG rules conveniently
express this abstract grouping: attributes are assigned to concrete
syntax elements using MAG rules containing quoted
symbols; abstract attribute computations are expressed in terms
of variable symbols in MAG rules \- the binding of these
computations to the concrete syntax is controlled by the availability
of attribute values. That is, the set of definable
attributes is used to label nodes according to the abstract syntax.
Elision, the removal of irrelevant detail, is accomplished in two
ways: patterns containing %...% are used to match strings of symbols
that are semantically irrelevant, and unit rules can be handled by
the general
.I "copy rule"
paradigm presented above.
.PP
Several layers of abstraction
may be built using attribute-controlled MAGs \- each successive
layer is built from attribute values generated by the previous one.
Also, several different abstract views may be imposed on the parse
tree at the same time: each abstract view needs to deal with only the
sorts of information of interest to it. In our evolving methodology
for the use of MAGs, we are attempting to write reusable modules
such as symbol table routines, expression evaluators, overload
resolution algorithms, and code generators.
One of our first motivating
examples was to express using reusable MAGs an
operator selection algorithm akin to that of Ada; this algorithm
has been described informally in terms of attributes by Cormack
and Wright (1987).
The general approach
is that each such module would apply to any parse tree decorated
by some specific set of attributes. The user of the module would
be responsible to ensure (possibly by writing an interface MAG)
that the input attributes decorate the parse tree in the appropriate
manner. These input attributes will be often computed
from the outputs of other modules. For example the
expression evaluator in the example could be applied to
a variety of concrete grammars, provided the operator nodes
and value-generating nodes were assigned the attributes %op%
and %val% respectively.
.SH
A Formal Specification
.LP
.B "Attribute grammars."
An attribute grammar (AG) is composed of a context-free grammar %G% and
a set of attribute computations %R%. The attribute rules describe how an
%attributed% %parse% %tree% is derived from any string in the language
%L% generated by %G%.
.PP
More formally, an AG is a sextuple
%(N,T,S,P,A,R)% where %N% is the set of non-terminal symbols, %T% is
the set of terminal symbols, %V = N \(cu T%, %S \(mo~ N% is the start
symbol, %P% is a set of productions, %A% is a set of attributes,
and %R% is a set of attribute computations.
Each production %p \(mo~ P% is a sequence
%X sub 0 , X sub 1 , ... X sub n%, where %X sub 0 \(mo~ N% and
%X sub i \(mo~ V% %(1 <= i <= n)%. %A% is the set of symbols used to
denote attribute values within the attributed parse tree. %R% is
a set of attribute computations of the form
%(p,D,U,f)% where %p \(mo~ P%, %D% is an attribute reference
of the form %(i,a)% (%0 <= i < | p |%, %a \(mo~ A%),
%U% is a set of attribute references
of the same form as %D%, and %f% is a function
that defines the attribute referenced by %D% in terms of those
referenced by %U%.
.PP
In this exposition, we denote members of %N% and %A%
by words or letters, and members of %T% by words, letters, or single
symbols. Membership in %N%, %T%, or %A% may be deduced from context.
A production %p% is denoted %X sub 0 -> X sub 1 X sub 2 ... X sub n%.
Attribute computations pertaining to production %p% are written adjacent
to %p%.
Within an attribute computation %(p,D,U,f)%, each attribute reference
%(i,a)% is denoted %X sub i n . a% where %n% selects from duplicate
occurrences of the symbol %X sub i% within %p%:
%n = %|{% X sub j : X sub i = X sub j ^and^ j < i%}|. The special case
of %n = 0% is abbreviated %X sub i . a%.
The function %f% is a sequence of statements in the C programming
language that computes %D% in terms of the elements of %U%.
.PP
The derivation of any string %s% from %G% may be described as a parse
tree, with each leaf labelled by a terminal symbol such that, when
concatenated from left to right, these labels form %s%. Each
interior node represents the expansion of some production %p \(mo~ P% %=%
%X sub 0 -> X sub 1 X sub 2 ... X sub n%; the node is labelled %X sub 0%
and its children are labelled %X sub 1 , X sub 2 ... X sub n% in order.
Each node also has a set of attributes and attribute values associated
with it; the computation of these attributes is specified by the
attribute computations. The attribute rule %(p,D,U,f)% where D is of the
form %(0,a)% specifies the computation of the
.I synthesized
attribute %a% for
each node %n% representing an expansion of %p%. This value is
obtained by applying %f% to the values denoted by %U%; each
%(i,b) \(mo~ U% denotes the value of attribute %b% of %n% if %i = 0%,
otherwise of the %i%th child
of %n%. If %D% is of the form %(i,a)% for %i != 0%, it specifies the
computation of the
.I inherited
attribute %a% for each node which is the %i%th
child of a node %n% representing an expansion of %p%. This value is
obtained by applying %f% to the values denoted by %U%, as defined
above.
To ensure that the attribute computation is well defined
for all parse trees, we may apply the consistency constraints of Knuth
(1968).
(However, the following description of modular attribute grammars
neither depends on nor enforces these constraints.)
.PP
For example, consider the following attribute grammar.
.DS
A %->% u B z
A.a = B.a
A.b = B.b
B.c = 0
.DE
.DS
B %->% C
B.a = C.a
B.b = C.b
C.c = B.c
.DE
.DS
C %->% D E
C.a = D.a
C.b = E.b
D.c = C.c
E.c = C.c
.DE
.DS
D %->% v w
D.a = %f sub 1%(D.c)
E %->% x y
E.b = %f sub 2%(E.c)
.DE
.LP
Using the formalism to represent this grammar, we have
%N=%{A, B, C, D, E},
%T=%{u, v, w, x, y, z},
%S=%A,
%A=%{a, b, c},
%P=% {A %->% u B z, B %->% C, C %->% D E, D %->% v w, E %->% x y };
the set %R=%{%(p,D,U,f)%} is given in Table 1.
.KF
.ps 10
.vs 12p
.TS
center;
|lccccl|
|llllll|.
_
%p% %D% %U% %f%
_
{ (A %->% u B z (0,a) {(2,a)} "A.a = B.a;"),
(A %->% u B z (0,b) {(2,b)} "A.b = B.b;"),
(A %->% u B z (2,c) {} "B.c = 0;"),
(B %->% C (0,a) {(1,a)} "B.a = C.a;"),
(B %->% C (0,b) {(1,b)} "B.b = C.b;"),
(B %->% C (1,c) {(0,c)} "C.c = B.c;),
(C %->% D E (0,a) {(1,a)} "C.a = D.a;"),
(C %->% D E (0,b) {(2,b)} "C.b = E.b;"),
(C %->% D E (1,c) {(0,c)} "D.c = C.c;"),
(C %->% D E (2,c) {(0,c)} "E.c = C.c;"),
(D %->% v w (0,a) {(0,c)} "D.a = %f sub 1%D.c;"),
(E %->% x y (0,b) {(0,c)} "E.b = %f sub 2%E.c;") }
_
.TE
.ps 11
.vs 16p
.CD
Table 1: The set of attribute computations %R%
.sp
.DE
.KE
.LP
Figure 3 shows the derivation tree and compound dependency graph
produced by this attribute grammar for the input string
\(oqu v w x y z\(cq.
.KF
.lf 3 magpic0.t 0
.PS
boxht=.15
boxwid=.15
circlerad = .1
disp=.20
x=.6
x1 = .2
y = .75
w = .25
[
PA: circle invis "A"
Pb: circle invis "u" at PA + -x,-y
PB: circle invis "B" at PA + 0,-y
Pc: circle invis "z" at PA + x+x1,-y
PC: circle invis "C" at PB + 0,-y
PD: circle invis "D" at PC + -x,-y
Pe: circle invis "v" at PD + -x,-y
Pf: circle invis "w" at PD + x-x1,-y
PE: circle invis "E" at PC + x,-y
Pg: circle invis "x" at PE + -x+x1,-y
PH: circle invis "y" at PE + x-x1,-y
line dashed from PA to Pb chop
line dashed from PA to PB chop
line dashed from PA to Pc chop
line dashed from PB to PC chop
line dashed from PC to PD chop
line dashed from PC to PE chop
line dashed from PD to Pe chop
line dashed from PD to Pf chop
line dashed from PE to Pg chop
line dashed from PE to PH chop
BAa: box at PA + disp,0 "a"
BAb: box at last box + boxwid,0 "b"
BBa: box at PB + disp,0 "a"
BBb: box at last box + boxwid,0 "b"
BBc: box at last box + boxwid,0 "c"
BCa: box at PC + disp,0 "a"
BCb: box at last box + boxwid,0 "b"
BCc: box at last box + boxwid,0 "c"
BDa: box at PD + disp,0 "a"
BDc: box at last box + boxwid,0 "c"
BEb: box at PE + disp,0 "b"
BEc: box at last box + boxwid,0 "c"
B0: box at BBc + 3*boxwid,3*boxht invis "0"
Bf1: box at BDa + 2*boxwid,3*-boxht wid 1.5*boxwid "%space 0 f sub 1%(" invis
Bf1p: box at Bf1 + boxwid,0 invis
box at last box + boxwid/2,0 ")" invis
Bf2: box at BEb + 2*boxwid,3*-boxht wid 1.5*boxwid "%space 0 f sub 2%(" invis
Bf2p: box at Bf2 + boxwid,0 invis
box at last box + boxwid/2,0 ")" invis
line <- from BAa.s to BBa.n
line <- from BAb.s to BBb.n
line <- from BBa.s to BCa.n
line <- from BBb.s to BCb.n
spline <- from BCa.s down w then to BDa + 0,w then to BDa.n
spline <- from BDa.s down w then to Bf1.w
spline <- from Bf1p up w then to BDc + 0,-w then to BDc.s
spline <- from BDc.n up w/2 then to BCc + 0,-w then to BCc.s
spline <- from BCb.s down w then to BEb + 0,w then to BEb.n
spline <- from BEb.s down w then to Bf2.w
spline <- from Bf2p up w then to BEc + 0,-w then to BEc.s
spline <- from BEc.n up w then to BCc + 0,-w then to BCc.s
line <- from BCc.n to BBc.s
spline <- from BBc.n up w then to B0.w
]
box invis "Figure 3: Attribute relations" with .n at last[].s
.PE
.lf 3 aggie4.t 796
.KE
.LP
.B "Modular attribute grammars."
A modular attribute grammar (MAG) consists of an ordered
set of patterns %PT%
and a set of templates %TM%.
The patterns are the analogues of productions
in attribute grammars, and the templates are the analogues of
attribute computations. One or more MAGs is applied to a context-free
grammar %G% to create an attribute grammar as defined above.
That is, the MAG %(PT, TM)% specifies the construction of %R% in
terms of %N%, %T%, %S%, %P%, and %A%.
.PP
Each %pt \(mo~ PT% has the form %Y sub 0 , Y sub 1 , ... Y sub n%, where
%Y sub 0 \(mo~'N \(cu W% and %Y sub i \(mo~ 'V \(cu W \(cu "{" ... "}"%.
%'N% (the set of %quoted% nonterminals) is {%'x^ : ^x \(mo~ N%}, and
%'V% (the set of %quoted% vocabulary symbols) is
{%'x^ : ^x \(mo~ V%}. %W% is the set of all variable symbols used in
MAG rules.
Each %tm \(mo~ TM%
is a quadruple %(pt, D, U, f)%, where %D%
is a pattern attribute reference
%(i,a)% where %a \(mo~ A% and %0 <= i < |pt|%. %U% is a set of pattern
attribute references of the same form as %D%. The denotation used
for %PT% and %TM% is identical to that used for %P% and %R% in
the attribute grammar.
.PP
From %PT% and %P% we compute
%MT%, the set of textual matches between patterns and productions.
Each element %mt~\(mo~MT% has the
form %(p,pt,m)%, where %p \(mo~ P%, %pt \(mo~ PT%, and
%m% provides a mapping from symbols of %pt% onto symbols of %p%:
%m = M sub 0 , M sub 1 , ... M sub |pt|-1% %(0 <= M sub i <= |p|)%.
%mt~\(mo~MT% if and only if the following constraints hold:
%M sub i <= M sub i+1% %(0 <= i < |pt|-1)%;
%M sub i + 1 = M sub {i + 1}% if %Y sub i != ...% %(0 <= i < |pt|-1)%;
%Y sub i = 'X sub {M sub i}% if %Y sub i \(mo~'V% %(0 <= i < |pt|)%.
Finally, we impose a partial ordering on %MT% with the following
relations:
%(p,pt,m) < (p,pt ' ,m ' )% if %pt% appears before %pt ' % in %PT%;
%(p,pt,m) < (p,pt,m ' )% if %M sub i = M ' sub i% %(0 <= i < j)%
and %M sub j < M ' sub j% for some %j%.
.PP
Each element %mt \(mo~ MT% generates a set of attribute computations,
denoted %gen(mt)%; the union of all %gen(mt \(mo~ MT)%
is denoted %RM%.
For
%mt = (p,pt,M sub 0 M sub 1 ... M sub |pt|-1 )~\(mo~MT%,
each corresponding template
%(pt,D,U,f)%
generates an attribute computation %r% %=% %(p,D ' ,U ' ,f ' )%,
where %D '%, %U '%,
and %f '% are computed from %D%, %U%, and %f% by uniform replacement
of pattern references of the form %(i,a)% by attribute references
%(M sub i , a)%. %gen(mt)% is the set of all such %r%. %RM% is
ordered according to the relation: %r < r '% if %r% is generated from
%s \(mo~ MT%
and %r '% is generated from %s ' \(mo~ MT% and %s < s '%.
.PP
If the entire set %RM% were used as %R% as described above,
%R% would contain many meaningless, useless, and ambiguous attribute
rules.
A rule %r = (p,D,U,f)% is meaningless if any of the attribute values
denoted by %U% does not exist; %r% is useless if the attribute value
denoted by %D% is not used as an input to some
other rule; %r% and %r '% are ambiguous if they both define the
same attribute for some node in the parse tree.
.PP
The set of definable attributes %TD \(ib V \(mu A% is the
set of pairs %(w,a)% (denoted w.a) for which a meaningful rule
%r = (p,(i,a),U,f) \(mo~ RM% exists, where %X sub i = w%. %TD%
is the set defined by the recursive rule
%(w,a) \(mo~ TD%
if %\(te sub {r = (p,(i,a),U,f) \(mo~ RM}% %X sub i = w%
%\(AN%
%\(fa sub {(j,b) \(mo~ U}% %(X sub j ,b) \(mo~ TD%.
%TD% may be computed assuming %TD = "{}"% initially, and repeatedly
testing %(w,a) \(mo~ V \(mu A% for membership in %TD%, iterating to
convergence.
.PP
The set of needed attributes %TN \(ib V \(mu A% is computed in
a similar fashion: it is the set that satisfies the two
constraints:
%TN \(ip "{" (S,a) : (S,a) \(mo~ TD "}"%;
%(w,a) \(mo~ TN% if
%\(te sub {r = (p,(i,b),U,f) \(mo~ RM}%
%(X sub i, b) \(mo~ TN%
%\(AN%
%\(te sub { (j,a) \(mo~ U } X sub j = w%.
.PP
The (possibly ambiguous) set of rules %RA% is generated by
constraining %RM% using %TD% and %TN%:
%RA = "{" r = (p,(i,a),U,f) : (X sub i , a) \(mo~ TN
\(fa sub {(j,b) \(mo~ U} (X sub j , b) \(mo~ TD%}.
Two rules %r = (p,D,U,f)%
and %r ' = (p ' , D ' , U ' , f ' )% are ambiguous if %p = p '% and
%D = D '%. In this case, %r% is selected if %r < r '%.
This arbitrary choice ensures that each attribute of a symbol
within a production rule is defined by only one attribute computation,
and allows the user to specify MAG rules in order of importance.
The resulting attribute grammar
must still be checked for consistency; in particular, this construction
may yield an incomplete or circular attribute grammar.
The final
set of rules of the attribute grammar is
%R = "{" (p,D,U,f) \(mo~ RA :%\o'/\(te'%"" sub {(p,D,U ',f ') \(mo~ RA}
(p,D,U ', f ') < (p,D,U,f)%}.
.PP
For example, the attribute grammar given in the previous section
is specified by the following MAG.
.EQ
define P0 'roman P sup 0'
define Q0 'roman Q sup 0'
define P1 'roman P sup 1'
define Q1 'roman Q sup 1'
define P2 'roman P sup 2'
define P3 'roman P sup 3'
define Q3 'roman Q sup 3'
.EN
.DS
%'%D %-> ...%
D.a = %f sub 1%(D.c);
.DE
.DS
P %-> ...% Q %...%
Q.a;
.DE
.DS
%'%E %-> ...%
E.b = %f sub 2%(E.c);
.DE
.DS
P %-> ...% Q %...%
P.b = Q.b;
.DE
.DS
P %-> ...% %'%B %...%
B.c = 0;
.DE
.DS
P %-> ...% Q %...%
Q.c = P.c;
.DE
.LP
When this MAG is applied to the context free grammar from the
previous example, we have the set of patterns %PT =%
{%'%D %-> ...%,
%P0 -> ... Q0 ...% , %'%E %-> ...%,
%P1 -> ... Q1 ...%,
%P2 -> ...% %'%B %...%,
%P3 -> ... Q3 ...%},
%'N=%{%'%A, %'%B, %'%C, %'%D, %'%E},
%'V='N^\(cu^%{%'%b, %'%c, %'%e, %'%f, %'%g, %'%h},
and %W=%{P, Q}.
The set of templates %TM = %{%(pt, D, U, f)%} is shown in Table 2.
.KF
.ps 10
.vs 12p
.sp
.TS
center;
|lccccl|
|llllll|.
_
%p% %D% %U% %f%
_
{ (%'%D %-> ...% (0,a) {(0,c)} "D.a = %f sub 1%(D.c);"),
(%P0 -> ... Q0 ...% (0,a) {(2,a)} "P.a = Q.a;"),
(%'%E %-> ...% (0,b) {(0,c)} "E.b = %f sub 2%(E.c);"),
(%P1 -> ... Q1 ...% (0,b) {(2,b)} "P.b = Q.b;"),
(%P2 -> ...% %'%B %...% (2,c) {} "B.c = 0;"),
(%P3 -> ... Q3 ...% (2,c) {(0,c)} "Q.c = P.c;") }
_
.TE
.ps 11
.vs 16p
.CD
Table 2: The set of templates %TM%
.sp
.DE
.KE
.PP
The set of textual matches between patterns and productions is
%MT=%{%(p, pt, m)%} where %m% is a tuple of elements that
correspond positionally to elements of a pattern %pt%
and that map elements of %pt% onto elements of
a production %p%; %MT% is shown in Table 3.
.KF
.ps 10
.vs 12p
.TS
center;
|lcccl|
|lllll|.
_
%p% %pt% %m%
_
{ (A %->% u B z %P0 -> ... Q0 ...% (0, 1, 1, 2)),
(A %->% u B z %P0 -> ... Q0 ...% (0, 1, 2, 3)),
(A %->% u B z %P0 -> ... Q0 ...% (0, 1, 3, 4)),
(A %->% u B z %P1 -> ... Q1 ...% (0, 1, 1, 2)),
(A %->% u B z %P1 -> ... Q1 ...% (0, 1, 2, 3)),
(A %->% u B z %P1 -> ... Q1 ...% (0, 1, 3, 4)),
(A %->% u B z %P2 -> ...% %'%B %...% (0, 1, 2, 3)),
(A %->% u B z %P3 -> ... Q3 ...% (0, 1, 1, 2)),
(A %->% u B z %P3 -> ... Q3 ...% (0, 1, 2, 3)),
(A %->% u B z %P3 -> ... Q3 ...% (0, 1, 3, 4)),
(B %->% C %P0 -> ... Q0 ...% (0, 1, 1, 2)),
(B %->% C %P1 -> ... Q1 ...% (0, 1, 1, 2)),
(B %->% C %P3 -> ... Q3 ...% (0, 1, 1, 2)),
(C %->% D E %P0 -> ... Q0 ...% (0, 1, 1, 2)),
(C %->% D E %P0 -> ... Q0 ...% (0, 1, 2, 3)),
(C %->% D E %P1 -> ... Q1 ...% (0, 1, 1, 2)),
(C %->% D E %P1 -> ... Q1 ...% (0, 1, 2, 3)),
(C %->% D E %P3 -> ... Q3 ...% (0, 1, 1, 2)),
(C %->% D E %P3 -> ... Q3 ...% (0, 1, 2, 3)),
(D %->% v w %'%D %-> ...% (0, 1)),
(D %->% v w %P0 -> ... Q0 ...% (0, 1, 1, 2)),
(D %->% v w %P0 -> ... Q0 ...% (0, 1, 2, 3)),
(D %->% v w %P1 -> ... Q1 ...% (0, 1, 1, 2)),
(D %->% v w %P1 -> ... Q1 ...% (0, 1, 2, 3)),
(D %->% v w %P3 -> ... Q3 ...% (0, 1, 1, 2)),
(D %->% v w %P3 -> ... Q3 ...% (0, 1, 2, 3)),
(E %->% x y %'%E %-> ...% (0, 1)),
(E %->% x y %P0 -> ... Q0 ...% (0, 1, 1, 2)),
(E %->% x y %P0 -> ... Q0 ...% (0, 1, 2, 3)),
(E %->% x y %P1 -> ... Q1 ...% (0, 1, 1, 2)),
(E %->% x y %P1 -> ... Q1 ...% (0, 1, 2, 3)),
(E %->% x y %P3 -> ... Q3 ...% (0, 1, 1, 2)),
(E %->% x y %P3 -> ... Q3 ...% (0, 1, 2, 3)) }
_
.TE
.ps 11
.vs 16p
.RT
.CD
Table 3: The set of textual matches %MT%
.sp
.DE
.KE
.PP
The textual matches %MT% and templates %TM% generate
%RM=%{%(p,D ' ,U ' ,f ')%},
a set of attribute computations.
For example,
(%P0 -> ... Q0 ...%, (0,a), {(2,a)}, "P.a = Q.a;") %\(mo~ TM% and
(A %->% u B z, %P0 -> ... Q0 ...%, (0, 1, 1, 2)) %\(mo~ MT%
generate
(A %->% u B z, (0,a), {(1,a)}, "A.a = u.a;") %\(mo~ RM%
because %P0% corresponds to 0 in %m%
which corresponds to A in %p%,
and %Q0% corresponds to the second 1 in %m% which corresponds to
u in %p%.
%RM% is shown in Table 4.
For the purpose of this exposition attribute definitions
and references of
the form (%i, a%), as required by the formalism, are
rewritten in the form %X sub i n . a% in Table 4.
.KF
.ps 10
.vs 12p
.TS
center;
|lccccl|
|llllll|.
_
%p% %D% %U% %f%
_
{ (A %->% u B z A.a {u.a} "A.a = u.a;"),
(A %->% u B z A.a {B.a} "A.a = B.a;"),
(A %->% u B z A.a {z.a} "A.a = z.a;"),
(A %->% u B z A.b {u.b} "A.b = u.b;"),
(A %->% u B z A.b {B.b} "A.b = B.b;"),
(A %->% u B z A.b {z.b} "A.b = z.b;"),
(A %->% u B z B.c {} "B.c = 0;"),
(A %->% u B z u.c {A.c} "u.c = A.c;"),
(A %->% u B z B.c {A.c} "B.c = A.c;"),
(A %->% u B z z.c {A.c} "z.c = A.c;"),
(B %->% C B.a {C.a} "B.a = C.a;"),
(B %->% C B.b {C.b} "B.b = C.b;"),
(B %->% C C.c {B.c} "C.c = B.c;"),
(C %->% D E C.a {D.a} "C.a = D.a;"),
(C %->% D E C.a {E.a} "C.a = E.a;"),
(C %->% D E C.b {D.b} "C.b = D.b;"),
(C %->% D E C.b {E.b} "C.b = E.b;"),
(C %->% D E D.c {C.c} "D.c = C.c;"),
(C %->% D E E.c {C.c} "E.c = C.c;"),
(D %->% v w D.a {D.c} "D.a = %f sub 1%(D.c);"),
(D %->% v w D.a {v.a} "D.a = v.a;"),
(D %->% v w D.a {w.a} "D.a = w.a;"),
(D %->% v w D.b {v.b} "D.b = v.b;"),
(D %->% v w D.b {w.b} "D.b = w.b;"),
(D %->% v w v.c {D.c} "v.c = D.c;"),
(D %->% v w w.c {D.c} "w.c = D.c;"),
(E %->% x y E.a {x.a} "E.a = x.a;"),
(E %->% x y E.a {y.a} "E.a = y.a;"),
(E %->% x y E.b {E.c} "E.b = %f sub 2%(E.c);"),
(E %->% x y E.b {x.b} "E.b = x.b;"),
(E %->% x y E.b {y.b} "E.b = y.b;"),
(E %->% x y x.c {E.c} "x.c = E.c;"),
(E %->% x y y.c {E.c} "y.c = E.c;") }
_
.TE
.ps 11
.vs 16p
.CD
Table 4: The set of generated attribute computations %RM%
.sp
.DE
.KE
.PP
%RM% is used to generate %TD%, the set of definable
attributes.
%TD% is initially
empty;
B.c can be added to %TD% because
%\(te sub "(p,D,U,f) \(mo~ RM" D = % B.c and %U = % {}.
Now the set {%D : \(fa sub "(p,D,U,f)" U \(ib %{B.c}} %=% {C.c} can
be added to %TD%.
The set %TD% finally becomes {B.c, C.c, D.c, D.a, E.c, E.b, E.c, e.c,
f.c, g.c, h.c, C.b, C.a, B.b, B.a, A.b, A.a}.
.PP
The set %TN% of needed attributes is initially composed
of those attributes definable on the goal
symbol. %TN% is initially {A.a, A.b}.
By repeatedly adding to %TN% the set
{%U : \(te sub "(p,D,U,f) \(mo~ RM" D \(mo~ TN% and %U \(mo~ TD%},
the final set %TN =% {A.a, A.b, B.a, C.a, D.a, D.c, C.c, B.c,
B.b, C.b, E.c, E.c} is created.
.PP
The set of attribute computations %R%, specified in the attribute grammar
of the previous section, may now be derived from %RM%, %TD%, and
%TN% by selecting those computations in %RM% (Table 4)
that define needed
attributes (members of %RN%) in terms of definable attributes
(members of %TD%).
.SH
Implementation
.LP
Figure 1 shows the components of the implementation.
The following algorithm may be used to generate an attribute
grammar \fIAG\fP from a context free grammar %G% and a set of modular
attribute grammars \fIMAG\fP.
.IP 1. 4
Generate the set of all attributes %X.a% where %X% is a symbol
in the vocabulary of %G% and %a% is any attribute name used in
\fIMAG\fP.
Initially, all attributes are marked as not needed and not defined.
.IP 2.
For each attribute computation %c% generated by matching some pattern
\fIpt\fP to some production %p%,
if all inputs of %c% are definable
mark the output of %c% as definable.
Perform this procedure until no more attributes can be marked.
.IP 3.
Mark all definable
attributes %S.a% as needed, where %S% is the goal symbol of G.
.IP 4.
For each attribute computation %c% generated by matching some pattern
\fIpt\fP to
some production %p%, if the output X.a of %c% is needed and
no attribute computation with the output X.a has yet
been associated with %p%,
associate %c% with %p% and mark all the inputs of %c% as needed.
Repeat this procedure until no more attribute computations can be
associated with productions.
.IP 5.
Check to ensure that
.RS
.IP (a)
no attribute is both synthesized and
inherited,
.IP (b)
every synthesized attribute X.a is defined in all
attribute computations associated with each production that
has X as a left-part symbol
.IP (c)
every inherited attribute Y.b is defined in all
attribute computations associated with each production that
has Y as a right-part symbol.
.RE
.PP
In the conventional view of an attribute grammar evaluator,
the derivation tree is decorated with attributes connected
to form a graph.
The graph, traditionally called a compound dependency
graph, reflects both the structure of the derivation tree
and relationships specified in the attribute grammar.
The graph represents an expression that can be evaluated
and, if the attribute grammar specifies a translation,
the result of the evaluation is the desired translation.
The size of the graph
and the time to traverse it
is linear with respect to
the size of the input and the size of the attribute grammar.
For our research, a straightforward graph evaluator is sufficient
to provide adequate performance.
Circularity in the compound dependency graph is detected
by the evaluator.
Should performance become an issue, we have noted previously that
tools for generating efficient evaluators exist and
that the MAG translator has been designed so that such evaluators can
be interfaced with the monolithic attribute grammar produced
by our system.
.SH
Experience
.LP
We have written MAGs to perform
semantic analysis for Pascal declarations.
Our implementation uses a 158 line context-free grammar and
12 modules comprising 101 MAG rules. The
generated attribute grammar contains 525 attribute definitions
that require 50 unique attribute computations.
.PP
From this experience we note that module usage may be partitioned
into \fIcreation\fP, \fIcombination\fP, and \fIdistribution\fP.
Modules that are not partitioned strictly according to these categories
generally contain some aspect of each.
Creation modules use rules to recognise syntactic
constructs that uniquely identify semantic constructs \-
syntactic recognition is enhanced by availability of
definable attributes.
Combining modules recognise situations where
multiple threads of similar information are combined
into a single thread.
For example, a list is composed by appending elements to
another list or to an empty list.
Combining situations can be recognised solely through attribute
availability but may be triggered by cues in the concrete syntax.
Distribution modules use bucket-brigade rules to distribute
information throughout a derivation tree. These modules may also use
syntactic cues to terminate distribution.
.PP
From the Pascal implementation we also note that using MAGs is
not easy as we would like. Perhaps due to unfamiliarity with
the pattern matching process, we observe
that the user occasionally must resort to
inspecting the generated AG to determine
the effect of pattern matching. This is analogous to
object-level debugging of a program written in a high level language;
we hope to find methods and tools to allow us to work exclusively
at the MAG source level.
This aim is partly addressed with the use of a strict
methodology like that outlined above.
.SH
Conclusions and Future Research
.LP
The complexity of attribute grammars is due to the dominance
of the structure of the context-free grammar over the structure
of information flow through attribute computations.
This paper has suggested that the specification of the
computation and flow of information
through attributes
can be decoupled from concrete syntax and rearranged
as an appropriate module decomposition.
A tool to generate attribute grammars
from modular specifications has been used to
investigate how decomposition should apply
to attribute grammars.
We have gained enough experience to conclude that modular
attribute grammars represent an improvement over monolithic
attribute grammars in reducing the complexity of attribute
specifications;
we are now in a position
to make suggestions for further improvements.
.PP
The textual patterns introduced in this paper are intentionally
designed for simplicity. While more sophisticated patterns
are possible, we are not yet convinced that individual improvements
in this area will significantly reduce complexity.
In contrast, the contribution of \fIdefinability\fP and of
\fIneed\fP
in constraining textual pattern matching cannot
be overstressed.
If a better technique for MAG translation is to be
found, we believe it will be coupled with an increase in the power of
attribute-constrained
pattern matching.
.PP
Our experience with modular attribute grammars shows that
some form of bucket-brigade rules is incorporated into
most modules.
We hesitate to incorporate automatic generation of copy rules
into the MAG translator because this is a particular solution
for a particular problem that may be addressed by a more general
solution.
We find many modules, not necessarily
concerned solely with attribute propagation by copy rule,
that share a similar overall appearance.
A possible solution is to provide generic modules,
parameterized by attribute and
production symbols, that can be used to produce module instances.
This would make it possible to incorporate an instance of a generic
general-purpose bucket-brigade module as a sub-module of
any module, or, indeed, to compose modules entirely of instances
of generic modules.
.PP
MAGGIE was designed to address shortcomings in attribute grammars
that became apparent because of considerable experience (our own
and others) in
using attribute grammars to specify programming languages
and compilers.
Before making any changes to MAGGIE, we need
a larger body of expertise in creating reusable MAGs using the
existing tool. Only with this expertise can we properly evaluate
possible enhancements.
.SH
References
.na
.nr SJ \n(.j
.XP
Bochmann, G. V.,
Semantic evaluation from left to right,
\fIComm. ACM \fP\fB19\fP, 2 (1976), 55\-62.
.XP
Cormack, G. V. and Wright, A. K.,
Polymorphism in the Compiled Language ForceOne,
\fIProc. 20th Annual Hawaii International Conference on System
Sciences\fP,
1987, 284\-292.
.XP
Farrow, R.,
LINGUIST\-86: yet another translator writing system based on attribute
grammars,
\fIProc. SIGPLAN 82 Symposium on Compiler Construction,
SIGPLAN Notices, \fP\fB17\fP, 6 (1982), 160\-171.
.XP
Ganzinger, H., Giegerich, R., M\o'o\(..'ncke, U., and Wilhelm, R.,
A truly generative semantics-directed compiler generator,
\fIProc SIGPLAN '82 Symposium on Compiler Construction,
SIGPLAN Notices, \fP\fB17\fP, 6 (1982),
173\-184.
.XP
Ganzinger, H., and Giegerich, R.,
Attribute Coupled Grammars,
\fIProc. SIGPLAN '84 Symposium on Compiler Construction,
SIGPLAN Notices, \fP\fB19\fP, 6 (1984),
157\-170.
.XP
Jalili, F.,
A general linear-time evaluator fo attribute grammars,
\fISIGPLAN Notices\fP, \fB18\fP, 9, 35\-44.
.XP
Jazayeri, M. and Pozefsky, D.,
Space-efficient storage management in an attribute grammar evaluator,
\IACM Trans. Prog. Lang. Syst, \fP\fB3\fP, 4 (1981), 388\-404.
.XP
Jullig, R. K. and DeRemer, F.,
Regular right-part grammars,
\fIProc. SIGPLAN '84 Symposium on Compiler Construction
SIGPLAN Notices, \fP\fB19\fP, 6 (1984),
171\-178.
.XP
Kastens, U.,
Ordered attribute grammars,
\fIActa. Inf., \fP\fB13\fP (1980), 229\-256.
.XP
Kastens, U., Zimmerman, E., and Hutt, B.,
\fIGAG \- a Practical Compiler Generator,\fP
Lecture Notes in Computer Science 141, Berlin, Springer (1982).
.XP
Katayama, T.,
Translation of attribute grammars into procedures,
\fIACM Trans. Prog. Lang. Syst., \fP\fB6\fP, 3 (1984), 345\-369.
.XP
Keller, S. E., Perkins, J. A., Payton, T. F., and Mardinly, S. P.,
Tree transformation techniques and experiences,
\fIProc. SIGPLAN '84 Symposium on Compiler Construction,
SIGPLAN Notices, \fP\fB19\fP, 6 (1984),
190\-201.
.XP
Knuth, D. E.,
Semantics of context\-free languages,
\fIMath. Syst. Theory, \fP\fB2\fP,
2 (1968), 127\-145;
correction in
\fIMath. Syst. Theory, \fP\fB5\fP, 1 (1971), 95\-96.
.XP
Koskimies, K., R\o'a\(..'ih\o'a\(..', K-J., and Sarjakoski, M.,
Compiler construction using attribute grammars,
\fIProc. SIGPLAN '82 Symposium on Compiler Construction,
SIGPLAN Notices, \fP\fB17\fP, 6 (1982),
53\-159.
.XP
Koskimies, K.,
A note on one-pass evaluation of attribute grammars,
\fIBIT, \fP\fB25\fP, (1985), 439\-450.
.XP
Koskimies, K., Nurmi, O., Paakki, J., and Sippu, S.,
The design of a language processor generator,
\fISoft. Pract. Exp., \fP\fB18\fP, 2 (1988), 107\-135.
.XP
Parnas, D. L.,
On the criteria to be used in decomposing systems into modules,
\fIComm. ACM \fP\fB15\fP, 10 (1972), 1053\-1058.
.XP
R\o'a\(..'ih\o'a\(..', K-J.,
Saarinen, M., Soisalon\-Soininen, E., and Tienari, M.,
The compiler writing system HLP,
report A-1978-2, Department of Computer Science,
University of Helsinki (1978).
.XP
R\o'a\(..'ih\o'a\(..', K-J.,
Attribute Grammar design using the compiler writing system HLP,
\fIMethods and Tools for Compiler Construction\fP,
Cambridge UP (1984), 183\-206.
.XP
R\o'a\(..'ih\o'a\(..', K-J. and Tarhio, J.,
A globalizing transformation for attribute grammars,
\fIProc. SIGPLAN '86 Symposium on Compiler Construction,
SIGPLAN Notices, \fP\fB21\fP, 7 (1986) 74\-83.
.XP
Watt, D. A. and Madsen, O. L.,
Extended attribute grammars,
\fIComp. Journal \fP\fB26\fP, 2 (1982), 142\-153.
.XP
Watt, D. A.,
Executable Semantic Descriptions,
\fISoft. Pract. Exp., \fP\fB16\fP, 1 (1986), 13\-43.