Syntax-Directed Translation

Slides:

Advertisements

Similar presentations

Exercise 1 EECS, Peking University Exercise in Query Processing.

Advertisements

Chapter 2 Combinatorial Analysis 主講人 : 虞台文. Content Basic Procedure for Probability Calculation Counting – Ordered Samples with Replacement – Ordered.

第七课：电脑和网络. 生词上网 vs. 网上我上网看天气预报。今天早上看了网上的天气预报。正式 zhèngshì （报告，会议，纪录）他被这所学校正式录取大桥已经落成，日内就可以正式通车落伍 luòw ǔ 迟到 chídào 他怕迟到，六点就起床了.

Course 1 演算法: 效率、分析與量級 Algorithms: Efficiency, Analysis, and Order

a simplified C to Java Compiler

-CHINESE TIME (中文时间): Free Response idea: 你周末做了什么？

第五章语法制导的翻译赵建华南京大学计算机系 2010年3月.

编译原理第二次习题课王静.

第六章中间代码生成赵建华南京大学计算机系.

-Artificial Neural Network- Hopfield Neural Network(HNN) 朝陽科技大學資訊管理系李麗華教授.

The discipline of algorithms

Minimum Spanning Trees

Operators and Expressions

Euler’s method of construction of the Exponential function

编译原理与技术中间代码生成 2018/9/17 《编译原理与技术》讲义.

Population proportion and sample proportion

！温故知新上下文无关文法最左推导最右推导自上而下自下而上句柄归约移进-归约冲突移进-归约分析递归下降预测分析

编译原理(H) 第一次习题课.

關聯式資料庫.

7 Intermediate Representation

Chapter 4 歸納(Induction)與遞迴(Recursion)

2.2 语法分析器生成器YACC 分析器的构造步骤：产生式→识别活前缀的DFA→分析表（＋驱动器） YACC概述

樹狀結構陳怡芬 2018/11/16 北一女中資訊專題研究.

CH6.属性文法语法制导翻译《程序设计语言编译原理》陈火旺等编著 2000年第3版

编译原理与技术第7章中间代码生成 3学时.

第五讲数据的分组、合并与转换.

Chapter 6 Graph Chang Chi-Chung

助教：胡光能，解定宝编译原理讲师：戴新宇

属性文法和语法制导翻译授课：胡静.

Last Lecture Revisited

Logistics 物流昭安國際物流園區總經理曾玉勤.

第4章网络互联与广域网 4.1 网络互联概述 4.2 网络互联设备 4.3 广域网 4.4 ISDN 4.5 DDN

Course 9 NP Theory序論 An Introduction to the Theory of NP

语言及其文法.

C 語言簡介 - 2.

The expression and applications of topology on spatial data

第2次课上下文无关文法

第九單元 Classes and data abstraction I

Interval Estimation區間估計

SpringerLink 新平台介绍.

重點資料結構之選定會影響演算法選擇對的資料結構讓您上天堂程式.

7.4 布尔表达式和控制流语句布尔表达式有两个基本目的计算逻辑值 c = (a > b) && a > 1

客户服务询盘惯例.

第4章(1) 空间数据库 —数据库理论基础北京建筑工程学院王文宇.

中间代码生成.

樹 2 Michael Tsai 2013/3/26.

第十五课：在医院看病.

Chapter 2 Basic Concepts in Graph Theory

Version Control System Based DSNs

在Microsoft Access 下建立資料庫

软件设计任务从工程管理的角度来看，软件设计分两步完成。概要设计，将软件需求转化为数据结构和软件的系统结构。

中国科学技术大学计算机系陈香兰 2013Fall 第七讲存储器管理中国科学技术大学计算机系陈香兰 2013Fall.

Definition of Trace Function

中央社新聞— ＜LTTC：台灣學生英語聽說提升讀寫相對下降＞

第六章语法制导的翻译本章内容介绍一种语义描述方法：语法制导的翻译，介绍语法制导的翻译的实现方法。

关联词 Writing.

從 ER 到 Logical Schema ──兼談Schema Integration

SpringerLink 新平台介绍.

习题课编译原理与技术计算机科学与技术学院李诚.

M; Well, let me check again with Jane

赵才荣同济大学，电子与信息工程学院，智信馆410室

進階 WWW 程式設計 -- PHP Array 靜宜大學資訊管理學系蔡奇偉副教授

反覆迴圈、陣列、副程式靜宜大學資管系楊子青

反覆迴圈、陣列、副程式靜宜大學資管系楊子青

5. Combinational Logic Analysis

ABAP Basic Concept (2) 運算子控制式與迴圈 Subroutines Event Block

第8章语法制导翻译与中间代码生成.

Gaussian Process Ruohua Shi Meeting

ABAP Basic Concept (2) 運算子控制式與迴圈 Subroutines Event Block

Presentation transcript:

Syntax-Directed Translation 語法導引之意譯 Syntax-Directed Translation 此章介紹如何將語法依附以語意屬性值; 即 SDT. We associate information with a programming language construct by attaching attributes to the grammar symbols representing the construct. Values for attributes are computed by “semantic rules” associated with the grammar productions. There are two notations for associating semantic rules with productions, syntax-directed definitions and translation schemes. Syntax-directed definitions are high-level specifications for translations. They hide many implementation details and free the user from having to specify explicitly the order in which translation takes place. Translation schemes indicate the order in which semantic rules are to be evaluated, so they allow some implementation details to be shown. 2019/1/16

The translation of the token stream is the result obtained by evaluating the semantic rules(conceptual view of SDT) Input string Parse tree Evaluation order For semantic rules Dependency graph To translate a programming language construct, a compiler may need to keep track of many quantities besides the code generated for the construct. For example, the compiler may need to know the type of the construct, or the location of the first instruction in the target code, or the number of instructions generated. We therefore talk abstractly about attributes associated with constructs. An attribute may represent any quantity, e.g., a type, a string, a memory location, or whatever. A syntax-directed definition uses a context-free grammar to specify the syntactic structure of the input. With each grammar symbol, it associates a set of attributes, and with each production, a set of semantic rules for computing values of the attributes associated with the symbols appearing in that production. The grammar and the set of semantic rules constitute the syntax directed definition. 2019/1/16

Attribute Grammars Def: An attribute grammar is a cfg G = (S, N, T, P) with the following additions: For each grammar symbol x there is a set A(x) of attribute values Each rule has a set of functions that define certain attributes of the nonterminals in the rule Each rule has a (possibly empty) set of predicates to check for attribute consistency Attribute grammars are grammars to which have been added attributes, attribute computation functions, and predicate functions. 2019/1/16

Attribute Grammars Let X0  X1 ... Xn be a rule Functions of the form S(X0) = f(A(X1), ... , A(Xn)) define synthesized attributes Functions of the form I(Xj) = f(A(X0), ... , A(Xj-1)), for 1 <= j <= n, define inherited attributes Initially, there are intrinsic attributes on the leaves ... 2019/1/16

Attribute Grammars Example: expressions of the form id + id BNF: id's can be either int_type or real_type types of the two id's must be the same type of the expression must match it's expected type(from top down) BNF: <expr>  <var> + <var> <var>  id Attributes: actual_type - synthesized for <var> and <expr> expected_type - inherited for <expr> 2019/1/16

<assign>  <var> = <expr> <expr>  <var>1 + <var>2 <expr>  <var> <var>  A | B Syntax rule: <assign>  <var> = <expr> Semantic rule: <expr>.expected_type  <var>.actual_type Rule 1 Syntax rule: <expr>  <var>1 + <var>2 Semantic rule: <expr>.actual_type  if (<var>1 .actual_type = int) and (<var>2 .actual_type = int) then int else real end if Predicate: <expr>.actual_type = <expr>.expected_type Rule 2 2019/1/16

Syntax rule: <expr>  <var> Semantic rule: <expr>.actual_type  <var>.actual_type Predicate: <expr>.actual_type = <expr>.expected_type Rule 3 Syntax rule: <var>  A | B Semantic rule: <var>.actual_type  look-up(<var>.string) Rule 4 To compute the attribute values, the following is a possible sequence: <var>.actual_type  look-up(A) (Rule 4) <expr>.expected_type  <var>.actual_type (Rule 1) <var>1 .actual_type  look-up(A) (Rule 4) <var>2 .actual_type  look-up(A) (Rule 4) <expr>.actual_type  either int or real (Rule 2) <expr>.expected_type = <expr>.actual_type is either True or False (Rule 2) 2019/1/16

<assign> T or F ? ⑥ <expr> ⑤ ⑤ ② <var> <var>1 Expected_type Actual_type <expr> ⑤ ⑤ ② Actual_type Actual_type Actual_type <var> <var>1 <var>2 ① ③ ④ A = A + B Now you may take a close look at the Fig.3.8 of a “Fully attributed” parse tree. 2019/1/16

Attribute Grammars How are attribute values computed? If all attributes were inherited, the tree could be decorated in top-down order. If all attributes were synthesized, the tree could be decorated in bottom-up order. In many cases, both kinds of attributes are used, and it is some combination of top-down and bottom-up that must be used. 2019/1/16

Assignment statements Now let us take a look on forward branching problem. 此相為一語意處理相(semantic processing),將有中間碼(intermediate code)產生. 今以例示之. Assignment statements E.code: sequence of three-address code to evaluate E. E.place: place where E’s value resides at runtime. E id {E.code := null, E.place := id.place} E E(1) * E(2) {E.place := Newtemp( ), E.code := E(1) .code // E(2).code // gen( E.place “:=“ E(1) .place “*” E(2).place) } E -E(1) {E.place := Newtemp( ), E.code := E(1).code // gen( E.place “:=“ ‘uminus’ E(1). Place) } A id:=E {A.code := E.code // gen(id.place “:=“ E.place) } 其語意之處理過程以 Bubble 圖示如次頁: 2019/1/16

.place id a “ “ := id * -id$ id a “ “ id b “ “ := E b “ “ id a “ “ := Bubble Input a := b * -c $ id := id * -id$ .place .CODE id a “ “ := id * -id$ id a “ “ id b “ “ := * -id $ E b “ “ id a “ “ := id $ * - id c “ “ id a “ “ E b “ “ := $ * - E b “ “ E c “ “ id a “ “ := * - $ E t1 t1:= -c E b “ “ id a “ “ := * $ 2019/1/16 

id a “ “ E t2 t2 := b*t1 := $  S a a := t2 $  t1 := -c t2 := b*t1 a := t2 semantic processing 但是有些不妙的問題卻發生了:(1) forward branching, (2)inserting GOTO instruction. 2019/1/16

forward branching <label> ::= id: { if id.place.quad is defined then error(“Duplicate label defined”); else id.place.quad = NextQuad; backpatch( id.place.fwdref, NextQuad); } <stmt> ::= goto id { if id.place.quad is defined then gen(‘goto’ id.place.quad) : append( id.place.fwdref, NextQuad) ; gen( ‘goto’ ?); } Backpatch 是一旦找到id 之location 值, 則回填任何與此 id相關之等待goto?處 append 是將此未定義之id.place.fwdref位置linking 到list 中,以做為未來backpatch之用. 2019/1/16

inserting "goto?" S ::= if E then S1 <else> S2 fi {backpatch(E.true, S1.1stquad); backpatch(E.false, S2.1stquad); S.next := merge(<else>.quad,S1.next,S2.next)} <else> ::= else {<else>.quad := Nextquad; gen(‘goto’ ?) } S ::= while E do S1 done {backpatch(E.false, S1.next ); backpatch( E.true, S1.1stquad); S.next = E.false; S.1stquad = E.1stquad; gen( ‘goto’ E.1stquad) } 也也人將文法改成 S ::= if E then S1 M else S2 fi M ::=  2019/1/16

語法導引之轉譯技術(今) 前言 1. 在靜態文法結構上,附加一些屬性值,而使語言的建構增添一些額外資訊( 這些屬性值由語意規則計算而來;而文法規則與語意規則是相互對應的) 2. 自文法產生語意需經過兩種步驟: (a) Syntax Directed Definitions; 它僅定義比較抽象的規格,不涉及實作細節順序. (b) Translation Schemes; 它表達實作細節與順序. 此二者觀念上均為了下述工作而來: 語法基本單元語法樹相依圖語意規則計算此過程正是 Syntax Directed Translation 之概念; 為了效率考量,實際的Translation可能不必明顯構作解析樹或相依圖,而在one-pass方式下執行語意規則(此即所謂的 L-Attributed definitions). 2019/1/16

Syntax Directed Definition (SDD) SDD 是 Context-Free Grammar 之擴充,它對每一個文法符號( T & N) 賦予一組屬性( 即定義 Bubble). 這組屬性又可分 Synthesized attributes 及 Inherited attributes 兩種. 每一個屬性即就是我們在 programming languages 中所說的 object之 an attribute; ; 它可以是 a name, a value, a string , a date type, a location or anything else. 在語法解析樹中,所有non-terminals之屬性值均是由此節點相對應的語意規則(semantic rules)之計算而得. 其中合成屬性值來自其decendants之屬性的合成,而繼承屬性值則來自其ancestors或siblings(同輩)屬性之傳承而得. Observations 若Non-terminals之屬性值完全被定義,則不必synthesized及inherited之處理.但是一般而言,一棵語法樹(parse tree)是需要添加(annotating)屬性才能繼續往前編譯; 而屬性之添加需要透過相依圖( dependency graph)而導出一組計算語意規則之順序,然後藉著計算這些規則而建構各節點之屬性值. 2019/1/16

SDD型式在SDD中,每一條語法規則 A ---->  對應一組型式為 b ::= f(c1 , . . . ,ck) 之語意規則. 此處 f 是一個函數,它滿足下述兩者之一: (1) b 是 A 的合成函數值,且 ci 是文法規則右邊之文法符號之屬性, 或 (2) b 是  中某 nonterminal之inherited屬性,且 ci 是 A 或文法規則右邊任一文法符號之屬性 . 不管哪一種情況, b 屬性與 ci屬性均相依 . 有時 SDD 會產生side effect . 若有此情況,則可將此 f 函數看成 A 定義之虛擬合成屬性值. (PS: 所謂 side effect 即程式語言中所指之事) 2019/1/16

圖5.2 A desk calculator's SDD production Semantic Rules L En print (E.val) E E1 + T E.val := E1.val + T.val E T E.val := T.val T T1 * F T.val := T1.val * F.val T F T.val := F.val F (E) F.val := E.val F digit F.val := digit.lexval 圖5.2 A desk calculator's SDD 2019/1/16

說明: (1) token中之digit 有一合成屬性 lexval , 其值是 lexical analyzer所供給( 終端符號只含合成屬性) . (2) L En 之語意規則是把文法右邊算術式E之值印出,這點可以想像成 L 的虛擬屬性. (3) 本SDD 整個看來均屬 “合成屬性”(bottom-up 表現) . (4) 若SDD只用到合成屬性,則謂之 S屬性定義（S-Attributed Definition). 其計算順序必可由下而上方式處理之. 一棵語法解析樹若標上了S屬性　(Synthesized attribute), 則成為　annotated parse tree. 使用　LR parser 可建構 S 屬性定義. 2019/1/16

圖5.3 L n 用 3*5+4n 這個輸入字串與上上一頁之SDD,可得annotated parse tree E.val = 19 n E.val = 15 + T.val=4 T.val=15 F.val=4 T.val=3 * digit.lexcal=4 F.val=5 F.val=3 digit.lexcal=5 digit.lexcal=3 圖5.3 2019/1/16 3*5+4n 之標上屬性的語法解析樹

圖5.4 用L.in 繼承式屬性之SDD S-attributes 已看過,現在轉為研究 Inherited Attributes. 眾所周知,繼承式屬性即表示解析樹中節點之屬性值來自此節點之祖先或同輩各節點之屬性值. 何以要有此 Inherited Attributes ? Ans: 因為有些屬性來自parse tree之由上而下順序情況時之上層節點. 舉例來說: 像 data type, upper bound 或 lower bound 等皆然. 用例示說明此觀點:(例 5.3) 先定義宣告部之文法及其對應之語意規則 production Semantic Rules D TL L.in := T.type (Inherited) T int T.type := integer (Synthesized) T real T.type := real (Synthesized) L L1,id L1.in := L.in (Inherited) addtype(id.entry, L.in) (Inherited) L id addtype(id.entry, L.in) (Inherited) 圖5.4 用L.in 繼承式屬性之SDD 2019/1/16

Parse tree with inherited attributes in at each node 說明資料型態屬性由上而下傳遞過去,因此需要繼承式SDD. 本例中用 .in 屬性表示 inherited attributes. 上例中也有兩處使用 synthesized attributes. Addtype 乃一 procedure, 作用是將 type (即 L.in) 加到 id 所在之table中. 先前已說過,SDD不管“計算時之順序”, 因此圖5.4及圖5.2可說是“比較高層次的說明”. 下圖之箭頭方向代表“繼承式”屬性之傳遞方向.而且在每一個L節點中,Semantic Rules叫用 procedure - addtype 以便將此節點後代或右邊後代之 id 的 type (實數或整數) 加入符號表之中. D T.type = real L.in = real id3 L.in = real , real Fig. 5.5 Parse tree with inherited attributes in at each node labeled L. id2 L.in = real , 2019/1/16 id1

相依圖(Dependency Graph) 既然屬性由繼承式與合成式兩者而來,因此以語法解析樹這種無向圖形為基礎,可以構作出一有向圖形(即相依圖),用來表示所有節點之繼承屬性與合成屬性之相依關系. 先看例示: 假設文法 A XY 所對應的 Semantic Rules 是 A.a := f(X.x, Y.y), 則此 A.a屬性必是合成式;即由下而上方式合成之,且依賴 X.x 與 Y.y 兩屬性. 如果在語法解析樹中用到此語法,則在相依圖中就會有三個節點: A.a, X.x, 及 Y.y, 而且分別自 X. x 及 Y.y 到 A.a 會有兩條邊線用來表示“合成”理念. 用特定例子說明: Production Semantic Rules E E1 + E2 E.val := E1.val + E2.val 則相依圖如下: val E Fig. 5.6 E.val is synthesized from E1.val and E2.val. E2 + E1 val 2019/1/16 val

同理,若文法 A XY 之Semantic Rules 中有一條語意規則是 X.i := g(A.a, Y.y), 則自 A.a 與 Y.y 分別各有一調有向邊線指向 X.i 節點. 而此屬性由於是繼承式,故箭頭若不是水平走向,就必然是向下走向. 將此兩種走向屬性值綜合之 ,可定義圖5.5 之相依圖如下: D L T 4 type in 5 6 , id3 in 7 L 3 entry 8 real , id2 in 9 L 10 2 entry id1 1 entry 2019/1/16 Fig. 5.7 Dependency graph for parse tree of Fig. 5.5.

說明 (1) 相依圖上之節點都標示了數字,用於表示 topological sort, 亦即計算語意規則的順序. (2) 文法 D TL 對應之 Semantic Rules L.in := T.type , 因此也一條有向邊線自4號節點 T.type 連到 5 號節點 L.in(當然屬於 inherited attribute). 同理, 觀察 L L1, id 之 L1.in := L.in , 只不過對應此 L 之 Semantic Rules 有 addtype 程序會產生一個虛擬屬性(dummy attribute) ---- 即 id.entry 之屬性由 L.in 而來. 至此,我們可詳細地思量“如之何構作出一個語法解析樹之相依圖”. 其演算法(algorithm)之大意如次頁所示: 2019/1/16

FOR 對應於 n 之文法的每一條語意規則 b:=f(c1, . . ,ck) DO FOR i := 1 TO k DO FOR 語法解析樹上每一個節點 n DO FOR 節點 n 上每一個屬性 a DO 為 a 在相依圖上構作一個節點; FOR 語法解析樹上每一個節點 n DO FOR 對應於 n 之文法的每一條語意規則 b:=f(c1, . . ,ck) DO FOR i := 1 TO k DO 從節點 ci 到 b 節點劃一條有向邊線; PS: 若節點之屬性有 a1, a2, . . . , am , 則實線有向圖有 m 種矣 ! 2019/1/16

計算語意規則之順序對於一個有向非循環圖形(Directed Acyclic Graph), 其拓樸排序如資料結構一書所示. 若將相依圖執行 topological sorting, 則這些節點之排序結果正是計算 Semantic Rules 之順序. [再觀察 Fig. 5.7 可知此理] In summary, Semantic Processings 包括 (1) 定義 SDD 及 (2) 設計translation scheme. SDD 已如前述針對各 production rule 定義 semantic rules. 而 translation scheme 即 (a) 用文法建構一個語法解析樹, (b) 用上一頁的algorithm建構相依圖, (c ) 對相依圖執行拓樸排序, 最后 (d) 求得各語意規則之計算順序,按順序執行各semantic rules,而得轉譯碼. 2019/1/16

實例: 根據上述,經過topological sorting 之后,圖5.7 可得下述程式. [ 若用 an 表示在相依圖編號為 n 之節點的對應屬性 ] a4 := real; a5 := a4; addtype( id3.entry, a5); a7 := a5; addtype( id2.entry, a7); a9 := a7; addtype( id1.entry, a9); 執行此程式,則 real 這個型別資訊可存入符號表中各 id 之 type 欄位內. 現在,若相依圖中有 cycle 存在,則只好回到舊版書所提之方法 ----- Rule - Based Methods. 還好第三種方法是不管語意規則,而使用“某種固定的計算順序”. 由於後兩者不必構作相依圖,因此 Time 與 Space 相對地有效率. 2019/1/16