The Definitive ANTLR 4 Reference: Study Notes
Uploaded: 2017-12-24 21:08:51
[3]The Definitive ANTLR 4 Reference
CHAPTER 2 Big Picture
Conceptions
CHAPTER 3-4 Demo
CHAPTER 5 Designing Grammars
Recognizing Common Language Patterns with ANTLR Grammars
Dealing with Precedence, Left Recursion, and Associativity
Recognizing Common Lexical Structures
【Tips】
【1】fragment
【2】Strings
【3】Escapes \" and \\
【4】-> skip: discarding input
【5】Logical negation
CHAPTER 6 Exploring Some Real Grammars
JSON
CHAPTER 7 Decoupling Grammars from Application-Specific Code
Evolving from Embedded Actions to Listeners
Implementing Applications with Parse-Tree Listeners
Implementing Applications with Visitors
Labeling Rule Alternatives for Precise Event Methods
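How Listener and Visitor implementations differ in returning results
Examples of the three implementation approaches
Comparing and choosing among the three approaches
CHAPTER 8 Application Examples (to study when time allows)
CHAPTER 9 Error Handling (not yet read)
CHAPTER 10 Attributes and Actions (advanced topic)
CHAPTER 11 Altering the Parse with Semantic Predicates (advanced topic; to catch up on over the weekend!)
CHAPTER 12 Wielding Lexical Black Magic (advanced topic; intervenes at the lexical stage for optimization, rarely needed)
CHAPTER 13 Runtime API
CHAPTER 15 Grammar Reference (usable as a syntax dictionary for lookup)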
CHAPTER 2 Big Picture
Conceptions
Language A language is a set of valid sentences; sentences are composed of phrases, which are composed of subphrases, and
so on.
Grammar A grammar formally defines the syntax rules of a language. Each rule in a grammar expresses the structure of a
subphrase.
Syntax tree or parse tree This represents the structure of the sentence where each subtree root gives an abstract name to
the elements beneath it. The subtree roots correspond to grammar rule names. The leaves of the tree are symbols or tokens
of the sentence.
Token A token is a vocabulary symbol in a language; these can represent a category of symbols such as “identifier” or
can represent a single operator or keyword.
Lexer or tokenizer This breaks up an input character stream into tokens. A lexer performs lexical analysis.
Parser A parser checks sentences for membership in a specific language by checking the sentence’s structure against the
rules of a grammar. The best analogy for parsing is traversing a maze, comparing words of a sentence to words written
along the floor to go from entrance to exit. ANTLR generates top-down parsers called ALL(*) that can use all remaining
input symbols to make decisions. Top-down parsers are goal-oriented and start matching at the rule associated with the
coarsest construct, such as program or inputFile.
Recursive-descent parser This is a specific kind of top-down parser implemented with a function for each rule in the
grammar.
Lookahead Parsers use lookahead to make decisions by comparing the symbols that begin each alternative.
Lexers process characters (short integers),
parsers process token types (integers).
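The concepts above (lexer, parser, recursive descent, lookahead) can be sketched by hand in a few lines of Python. This is not ANTLR output and not code from the book: the toy grammar, the token names, and the helper names are all assumptions made for illustration, and it uses a single token of lookahead rather than ALL(*).

```python
import re

# Hypothetical toy grammar (an assumption, not from the book):
#   stat  : ID '=' value ;
#   value : INT | ID ;
TOKEN_SPEC = [("INT", r"[0-9]+"), ("ID", r"[a-zA-Z]+"), ("EQ", r"="), ("WS", r"\s+")]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Lexer: turn a character stream into (type, text) tokens, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    tokens.append(("EOF", ""))
    return tokens


class Parser:
    """Recursive-descent parser: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def lookahead(self):
        # Type of the next unconsumed token.
        return self.tokens[self.i][0]

    def match(self, ttype):
        # Consume the next token if it has the expected type.
        if self.lookahead() != ttype:
            raise SyntaxError(f"expected {ttype}, found {self.lookahead()}")
        tok = self.tokens[self.i]
        self.i += 1
        return tok

    def stat(self):
        name = self.match("ID")
        self.match("EQ")
        val = self.value()
        self.match("EOF")
        return ("stat", name, val)

    def value(self):
        # Lookahead picks the alternative: INT or ID.
        if self.lookahead() == "INT":
            return ("value", self.match("INT"))
        return ("value", self.match("ID"))


def parse(text):
    return Parser(tokenize(text)).stat()
```

Calling parse("x = 42") returns a small tuple-based parse tree whose subtree roots are the rule names and whose leaves are tokens; an input such as "x = = 1" is rejected with a SyntaxError.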
CHAPTER 3-4 Demo
CHAPTER 5 Designing Grammars
Recognizing Common Language Patterns with ANTLR Grammars
Pattern: Sequence
Sequence: a series of elements in a row, for example
retr : 'RETR' INT '\n' ; // match keyword integer newline sequence
Elements can be keywords, string literals such as 'RETR', or punctuation.
For one or more elements, use the + subrule operator: (INT)+ describes an arbitrarily long sequence of integers.
For zero or more elements, use the * operator: INT*
A zero-or-one (optional) element is specified with the ? operator.
Pattern: Choice (Alternatives)
Use | as the “or” operator.
For example, a choice of integers or strings:
field : INT | STRING ;
Pattern: Token Dependency
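Express token dependencies directly with paired symbols such as (...), {...}, and [...]: seeing one symbol means its counterpart must appear elsewhere in the phrase.
functionDecl
    : type ID '(' formalParameters? ')' block // "void f(int x) {...}"
    ;
formalParameters
    : formalParameter (',' formalParameter)*
    ;
formalParameter
    : type ID
    ;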
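To see what a rule like formalParameter (',' formalParameter)* amounts to operationally, here is a hand-written Python sketch of the same shape. It is an illustration under assumptions, not generated ANTLR code: the function name is hypothetical, and both type and ID are treated as plain names.

```python
import re


def parse_formal_parameters(text):
    """Sketch of: formalParameters : formalParameter (',' formalParameter)* ;"""
    # Simplified tokenizer: names and commas only; everything else is ignored.
    tokens = re.findall(r"[A-Za-z_]\w*|,", text)
    pos = 0

    def formal_parameter():
        # formalParameter : type ID ;  (both are plain names in this sketch)
        nonlocal pos
        pair = tokens[pos:pos + 2]
        if len(pair) < 2 or "," in pair:
            raise SyntaxError(f"expected 'type ID' at token {pos}")
        pos += 2
        return tuple(pair)

    params = [formal_parameter()]                     # the first, mandatory element
    while pos < len(tokens) and tokens[pos] == ",":   # the (',' formalParameter)* loop
        pos += 1
        params.append(formal_parameter())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return params
```

The mandatory first element followed by a loop that is entered only on a comma is the usual way a sequence-with-separator pattern turns into code.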
Pattern: Nested Phrase
Nesting: a rule refers to itself, directly or indirectly.
stat: 'while' '(' expr ')' stat // match WHILE statement
| '{' stat* '}' // match block of statements in curlies
... // and other kinds of statements
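Dealing with Precedence, Left Recursion, and Associativity
Precedence: always define the higher-precedence alternatives first, because ANTLR resolves ambiguities in favor of the alternative given first.
Example: the precedence of pointer operators in C.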
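The effect of precedence and left associativity can be seen in a small hand-written parser. The sketch below uses precedence climbing, which is a different technique from what ANTLR does internally; the operator table and function names are assumptions chosen for illustration.

```python
import re

# Binding power per operator: a larger number binds tighter.
# All four operators are treated as left-associative.
PRECEDENCE = {"*": 2, "/": 2, "+": 1, "-": 1}


def tokenize(text):
    # Simplified: integers, operators, parentheses; other characters are ignored.
    return re.findall(r"\d+|[-+*/()]", text)


def parse_atom(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        tree, pos = parse_expr(tokens, pos + 1, 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return tree, pos + 1
    if tok.isdigit():
        return int(tok), pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")


def parse_expr(tokens, pos=0, min_prec=1):
    """Precedence climbing; returns (tree, next_pos)."""
    left, pos = parse_atom(tokens, pos)
    while (pos < len(tokens) and tokens[pos] in PRECEDENCE
           and PRECEDENCE[tokens[pos]] >= min_prec):
        op = tokens[pos]
        # Left associativity: the right operand may only absorb tighter operators.
        right, pos = parse_expr(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, pos


def parse(text):
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

With this table, parse("1+2*3") groups the multiplication first, and parse("8-3-2") groups to the left, which is the behavior a grammar author gets by listing the multiplicative alternative before the additive one.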
Recognizing Common Lexical Structures
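Lexer rule names begin with an uppercase letter.
Parser rule names begin with a lowercase letter.
Keywords, operators, and punctuation need no special rule; reference them directly in single quotes, such as 'while' and '*'.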
Matching Identifiers
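Matching identifiers: an identifier is a string of uppercase and lowercase letters.
Lexical rules are applied before parser rules (the input is tokenized first). Among lexer rules, the longest match wins; when several rules match the same text, the rule defined first wins.
For example, an ID rule also matches the text of a FOR keyword rule, but because FOR is defined first, the input for is tokenized as FOR.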
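The two rules just described (longest match, then first-defined rule on a tie) can be sketched as a small Python lexer. The rule list is hypothetical and only loosely mirrors the FOR and ID rules mentioned in the notes; it is not how the ANTLR runtime is implemented.

```python
import re

# Hypothetical lexer rules in definition order; FOR is deliberately listed before ID.
RULES = [
    ("FOR", r"for"),
    ("ID", r"[a-zA-Z]+"),
    ("INT", r"[0-9]+"),
    ("WS", r"[ \t\r\n]+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]


def next_token(text, pos):
    """Return (rule name, lexeme, end position) for the best rule at pos."""
    best = None
    for name, rx in COMPILED:
        m = rx.match(text, pos)
        # Only a strictly longer match replaces the current best,
        # so on a tie the earlier rule is kept.
        if m and m.end() > pos and (best is None or m.end() > best[2]):
            best = (name, m.group(), m.end())
    if best is None:
        raise SyntaxError(f"no rule matches at position {pos}")
    return best


def tokenize(text):
    pos, out = 0, []
    while pos < len(text):
        name, lexeme, pos = next_token(text, pos)
        if name != "WS":  # whitespace is matched but discarded
            out.append((name, lexeme))
    return out
```

Here the input for becomes a FOR token because both rules match three characters and FOR comes first, while format becomes an ID token because ID matches more characters.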
【Tips】
【1】fragment
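Marks a rule that does not produce a token on its own and is only used from other lexer rules.
【2】Strings
Enclosed in double quotes,
as in this string rule:
STRING : '"' .*? '"' ; // match anything in "..."
【3】Escapes \" and \\
ESC : '\\"' | '\\\\' ; // 2-char sequences \" and \\
【4】-> skip: discard the match
For example, skip whitespace and newlines: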
WS : [ \t\r\n]+ -> skip ; // match 1-or-more whitespace but discard
【5】Logical negation
The ~x operator matches anything but x.
STUFF : ~'\n'+ -> skip ; // match and discard anything but a '\n'
NL : '\n' ;
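As a rough analogy for tips 1 to 5, the same ideas can be expressed with Python regular expressions. This is a sketch under assumptions, not a translation of ANTLR semantics: the names ESC, STRING, and WS mirror the rules above, the string body uses a negated character set instead of the non-greedy .*? loop, and tokenize is a hypothetical helper.

```python
import re

# Like a fragment: ESC is reused inside STRING and never yields a token itself.
ESC = r'\\["\\]'                                # the 2-char sequences \" and \\
# A string is a quote, then escapes or any character except quote and backslash
# (a negated set, similar in spirit to the ~ operator), then a closing quote.
STRING = re.compile(rf'"(?:{ESC}|[^"\\])*"')
WS = re.compile(r"[ \t\r\n]+")                  # matched, then discarded ("skip")


def tokenize(text):
    """Return the string literals in text, skipping whitespace between them."""
    pos, out = 0, []
    while pos < len(text):
        ws = WS.match(text, pos)
        if ws:
            pos = ws.end()                      # skip: consume without emitting
            continue
        m = STRING.match(text, pos)
        if m is None:
            raise SyntaxError(f"expected a string literal at position {pos}")
        out.append(m.group())
        pos = m.end()
    return out
```

Because ESC is tried before the negated set, an escaped quote inside a literal does not end the string.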
CHAPTER 6 Exploring Some Real Grammars
The chapter gives five parsing examples that serve as references when learning to write grammars.
CSV
JSON
Cymbol
R
JSON
The JSON example is as follows:
Uploaded by: hjw199089