Python library | nmfu-0.4.0a1.tar.gz resource - CSDN文库

Uploaded 2022-04-12 05:04:07 · 68 KB · GZ archive


nmfu-0.4.0a1.tar.gz (12 files)

- nmfu-0.4.0a1
  - PKG-INFO (24KB)
  - LICENSE (34KB)
  - nmfu.py (222KB)
  - setup.cfg (41B)
  - setup.py (2KB)
  - README.md (19KB)
  - nmfu.egg-info
    - PKG-INFO (24KB)
    - requires.txt (94B)
    - SOURCES.txt (204B)
    - entry_points.txt (36B)
    - top_level.txt (5B)
    - dependency_links.txt (1B)

<img src="https://user-images.githubusercontent.com/5255209/117226360-7e69a900-ade2-11eb-9127-4a146a443199.png" alt="nmfu logo banner" width="100%"/>

# nmfu

---

_the "no memory for you" "parser" generator_

---

![PyPI - License](https://img.shields.io/pypi/l/nmfu) [![PyPI](https://img.shields.io/pypi/v/nmfu)](https://pypi.org/project/nmfu) ![PyPI - Python Version](https://img.shields.io/pypi/pyversions/nmfu) [![Jenkins](https://img.shields.io/jenkins/build?jobUrl=https%3A%2F%2Fjenkins.mm12.xyz%2Fjenkins%2Fjob%2Fnmfu%2Fjob%2Fmaster)](https://jenkins.mm12.xyz/job/nmfu) [![Jenkins tests](https://img.shields.io/jenkins/tests?compact_message&jobUrl=https%3A%2F%2Fjenkins.mm12.xyz%2Fjenkins%2Fjob%2Fnmfu%2Fjob%2Fmaster)](https://jenkins.mm12.xyz/jenkins/job/nmfu/job/master/lastCompletedBuild/testReport/) [![Jenkins Coverage](https://img.shields.io/jenkins/coverage/api?jobUrl=https%3A%2F%2Fjenkins.mm12.xyz%2Fjenkins%2Fjob%2Fnmfu%2Fjob%2Fmaster%2F)](https://jenkins.mm12.xyz/jenkins/job/nmfu/job/master/lastCompletedBuild/coverage/cobertura__coverage_xml/project/_/nmfu_py/) [![nmfu](https://snapcraft.io//nmfu/badge.svg)](https://snapcraft.io/nmfu)

`nmfu` attempts to turn a parser specified as a procedural matching thing into a state machine. It's much more obvious what it does if you read some of the examples.

It takes in a "program" containing various match expressions, control structures and actions and converts it into a DFA with actions on the transitions. This allows simple protocols (for example HTTP) to be parsed using an extremely small memory footprint and without requiring a separate task, since it can be done character by character.

You can also define various output variables which can be manipulated inside the parser program, which can then be examined after parsing. See `example/http.nmfu` for a good example of using this functionality.

The rest of this README is a guide to using NMFU.
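To make the "DFA with actions on the transitions" idea concrete, here is a small hand-written Python sketch. It is not NMFU output and does not use any NMFU API: the class `TinyParser`, the helper `build_table`, the status names and the toy "method, space, URL, space" input format are all assumptions chosen for illustration. It only shows how input can be consumed one character at a time while the only retained state is the current DFA state plus the output variables.

```python
# Hypothetical status codes for this sketch (not NMFU's real ones).
OK, DONE, FAIL = "OK", "DONE", "FAIL"


def build_table():
    """Build a transition table: (state, char) -> (next_state, action or None)."""
    table = {}

    def add_word(word, action):
        state = "start"
        for i, ch in enumerate(word):
            last = i == len(word) - 1
            nxt = "url" if last else f"{word}:{i + 1}"
            # The action fires only on the transition that completes the word.
            table[(state, ch)] = (nxt, action if last else None)
            state = nxt

    add_word("GET ", lambda out: out.update(method="GET"))
    add_word("POST ", lambda out: out.update(method="POST"))
    return table


class TinyParser:
    def __init__(self):
        self.table = build_table()
        self.state = "start"
        self.out = {"method": None, "url": ""}  # the "output variables"

    def feed(self, ch):
        """Consume exactly one character and return a status code."""
        if self.state == "url":
            if ch == " ":
                self.state = "done"
                return DONE
            self.out["url"] += ch  # action: append the character to an output
            return OK
        step = self.table.get((self.state, ch))
        if step is None:
            return FAIL  # no transition defined for this character
        self.state, action = step
        if action is not None:
            action(self.out)
        return OK


parser = TinyParser()
statuses = [parser.feed(ch) for ch in "POST /index.html "]
print(statuses[-1], parser.out)
```

Because `feed` takes a single character, a caller can invoke it straight from a receive loop or interrupt handler without buffering the whole request, which is the property the paragraph above describes.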
## Parser Specification

NMFU source files support C++-style line comments (text after `//` on a line is ignored until the end of the line).

### Top-Level Constructs

At the top-level, all NMFU parsers consist of a set of output-variables, macro-definitions, hook-definitions and the parser code itself.

The output-variables are specified with the _output-declaration_:

```lark
out_decl: "out" out_type IDENTIFIER ";"
        | "out" out_type IDENTIFIER "=" atom ";"

out_type: "bool" -> bool_type
        | "int" -> int_type
        | "enum" "{" IDENTIFIER ("," IDENTIFIER)+ "}" -> enum_type
        | "str" "[" NUMBER "]" -> str_type
```

For example:

```
out int content_length = 32;
out bool supports_gzip = false;

// note you can't set default values for strings and enums
out str[32] url;
out enum{GET,POST} method;
```

All strings have a defined maximum size, which includes the null-terminator.

Macros in NMFU are simple parse-tree level replacements. They look like:

```lark
macro_decl: "macro" IDENTIFIER macro_args "{" statement* "}"

macro_args: "(" macro_arg ("," macro_arg)* ")"
          | "(" ")" -> macro_arg_empty

macro_arg: "macro" IDENTIFIER -> macro_macro_arg
         | "out" IDENTIFIER -> macro_out_arg
         | "match" IDENTIFIER -> macro_match_expr_arg
         | "expr" IDENTIFIER -> macro_int_expr_arg
         | "hook" IDENTIFIER -> macro_hook_arg
         | "loop" IDENTIFIER -> macro_breaktgt_arg
```

For example:

```
macro ows() { // optional white space
    optional {
        " ";
    }
}
```

When macros are "called", or instantiated, all NMFU does is copy the contents of the parse tree from the macro declaration to the call-site. Note that although macros can call other macros, they cannot recurse.

Macros can also take arguments, which are similarly treated as parse-tree level replacements, with the added restriction that their types _are_ checked.
For example:

```
macro read_number(out target, match delimit) {
    target = 0;
    foreach {
        /\d+/;
    } do {
        target = [target * 10 + ($last - '0')];
    }
    delimit;
}
```

There are 6 types of arguments:

- `macro`: a reference to another macro
- `hook`: a reference to a hook
- `out`: a reference to an _output-variable_
- `match`: an arbitrary _match-expression_
- `expr`: an arbitrary _integer-expression_
- `loop`: an arbitrary named _loop-statement_, for use in _break-statements_.

Hooks (which are callbacks to user code which the parser can call at certain points) are defined with a _hook-declaration_:

```lark
hook_decl: "hook" IDENTIFIER ";"
```

For example:

```
hook got_header;
```

### Parser Declaration

The parser proper is declared with the _parser-declaration_,

```lark
parser_decl: "parser" "{" statement+ "}"
```

and contains a set of statements which are "executed" in order to parse the input.

### Basic Statements

Basic statements are statements which do not have an associated code block, and which end with a semicolon.

```lark
simple_stmt: expr -> match_stmt
           | IDENTIFIER "=" expr -> assign_stmt
           | IDENTIFIER "+=" expr -> append_stmt
           | IDENTIFIER "(" (expr ("," expr)*)? ")" -> call_stmt
           | "break" IDENTIFIER? -> break_stmt
           | "finish" -> finish_stmt
           | "wait" expr -> wait_stmt
```

The most basic form of statement in NMFU is a _match-statement_, which matches any _match-expression_ (explained in the next section).

The next two statements are the _assign-statement_ and _append-statement_. The _assign-statement_ parses an _integer-expression_ (which is not limited to just integers, again explained in the next section) and assigns its result into the named _output-variable_.

The _append-statement_ instead appends whatever is matched by the _match-expression_ into the named _output-variable_, which must be a string type.
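As an aside on the `read_number` macro shown earlier, the following Python sketch mirrors what its `foreach ... do` body computes: each matched digit is folded into the running value as `target * 10 + (digit - '0')`, one character at a time, until the delimiter is seen. The Python function and its single-character `delimiter` parameter are assumptions made for illustration; this is not how NMFU itself implements the macro.

```python
def read_number(stream, delimiter):
    """Accumulate decimal digits from an iterable of characters.

    Mirrors `target = [target * 10 + ($last - '0')]` from the macro;
    `delimiter` is a single character here (an assumption of this sketch).
    """
    target = 0
    seen_digit = False  # /\d+/ requires at least one digit
    for ch in stream:
        if "0" <= ch <= "9":
            target = target * 10 + (ord(ch) - ord("0"))
            seen_digit = True
        elif ch == delimiter and seen_digit:
            return target
        else:
            raise ValueError(f"unexpected character {ch!r}")
    raise ValueError("input ended before the delimiter")


print(read_number("1234;", ";"))
```

Only the running total is kept between characters, so no buffer of the digits is needed.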
Additionally, if the argument to an _append-statement_ is a _math-expression_, then the result of evaluating the expression will be treated as a character code and appended to the string.

The _call-stmt_ instantiates a macro or calls a hook. Note that there is currently no valid way to pass parameters to a hook, and as such the expressions provided in that case will be ignored. Macro arguments are always parsed as generic expressions and then interpreted according to the type given to them at declaration.

If a hook and macro have the same name, the macro will take priority. Priority is undefined if a macro argument and global hook or macro share a name.

The _break-statement_ is explained along with loops in a later section.

The _finish-statement_ causes the parser to immediately stop and return a `DONE` status code, which should be interpreted by the calling application as a termination condition.

The _wait-statement_ spins and consumes input until the _match-expression_ provided matches successfully. Importantly, no event (including end of input!) can stop the wait statement, which makes it useful primarily in error handlers.

It is also important to note that this is _not_ the same as using a regex like `/.*someterminator/`, as the wait statement does _not_ "try" different starting positions for a string when its match fails. More concretely, something like `wait abcdabce` would _not_ match `abcdabcdabce`, as the statement would bail and restart matching from the beginning at the second `d`.

### Expressions

There are three types of expressions in NMFU: _match-expressions_, _integer-expressions_ and _math-expressions_.
A _match-expression_ is anything that can consume input to the parser and check it:

```lark
?expr: atom // string match
     | regex // not an atom to simplify things
     | "end" -> end_expr
     | "(" expr+ ")" -> concat_expr

atom: STRING "i" -> string_case_const
    | STRING -> string_const
```

The simplest form of _match-expression_ is the _direct-match_, which matches a literal string. It can optionally match with case insensitivity by suffixing the literal string with an "i".

The _end-match-expression_ is a match expression which only mat
