specify repetition, so that a{1,5} looks for 1 to 5 occurrences of a, and A(B{1,4})C matches
ABC, ABBC, ABBBC, and ABBBBC (notice the use of parentheses, (), as grouping
symbols).
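The repetition and grouping operators can be tried out in a minimal sketch. It uses Python's re module as a stand-in for lex, on the assumption that {m,n} and () behave the same way for this pattern; the names pattern, candidates, and matched are illustrative only.

```python
import re

# A, then one to four B's (grouped with parentheses), then C.
pattern = re.compile(r"A(B{1,4})C")

candidates = ["AC", "ABC", "ABBC", "ABBBC", "ABBBBC", "ABBBBBC"]

# fullmatch() requires the whole string to match, so zero B's and
# five B's are both rejected.
matched = [s for s in candidates if pattern.fullmatch(s)]
print(matched)  # ['ABC', 'ABBC', 'ABBBC', 'ABBBBC']
```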
Brackets, [], indicate any one character from the string of characters specified between the
brackets. Thus, [dgka] matches a single d, g, k, or a.
Note that the characters between brackets must be adjacent, without spaces or punctuation.
The ^ operator, when it appears as the first character after the left bracket, negates the
class: the expression then matches any single character in the standard set except those
specified between the brackets. (Note that |, {}, and ^ may serve other purposes as well.)
Ranges within a standard alphabetic or numeric order (A through Z, a through z, 0 through 9)
are specified with a hyphen. [a-z], for instance, indicates any lowercase letter.
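As a rough illustration of bracket expressions, negation, and ranges, the sketch below again borrows Python's re module in place of lex (an assumption; the bracket syntax shown is common to both). The variable names are hypothetical.

```python
import re

one_of = re.compile(r"[dgka]")      # any one of d, g, k, a
none_of = re.compile(r"[^dgka]")    # any one character except d, g, k, a
lowercase = re.compile(r"[a-z]")    # any one lowercase letter

checks = {
    "[dgka] vs 'k'": bool(one_of.fullmatch("k")),
    "[dgka] vs 'x'": bool(one_of.fullmatch("x")),
    "[^dgka] vs 'x'": bool(none_of.fullmatch("x")),
    "[^dgka] vs 'k'": bool(none_of.fullmatch("k")),
    "[a-z] vs 'q'": bool(lowercase.fullmatch("q")),
    "[a-z] vs 'Q'": bool(lowercase.fullmatch("Q")),
}

for label, ok in checks.items():
    print(label, "->", ok)
```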
[A-Za-z0-9*&#]
This is a regular expression that matches any letter (whether upper or lowercase), any digit, an
asterisk, an ampersand, or a #.
Given the following input text, the lexical analyzer with the previous specification in one of
its rules will recognize *, &, r, and #, perform on each recognition whatever action the rule
specifies (we have not indicated an action here), and print the rest of the text as it stands:
$$$$?? ????!!!*$$ $$$$$$&+====r~~# ((
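The behavior just described can be approximated outside lex. The sketch below is not a lex specification: it assumes Python's re module as a substitute and imitates the default "copy unmatched input to output" behavior by deleting the recognized characters. The names CLASS, text, recognized, and echoed are illustrative.

```python
import re

# The character class from the rule above.
CLASS = re.compile(r"[A-Za-z0-9*&#]")

text = "$$$$?? ????!!!*$$ $$$$$$&+====r~~# (("

# Characters the rule would recognize, in input order.
recognized = CLASS.findall(text)

# Everything else is passed through unchanged, as the lexical analyzer
# would print it when no rule matches.
echoed = CLASS.sub("", text)

print(recognized)  # ['*', '&', 'r', '#']
print(echoed)
```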
To include the hyphen character in the class, have it appear as the first or last character in the
brackets: [-A-Z] or [A-Z-].
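A quick check of the two hyphen placements, sketched with Python's re module (assumed here to treat a leading or trailing hyphen as a literal, just as the text describes for lex). The names leading and trailing are hypothetical.

```python
import re

leading = re.compile(r"[-A-Z]")    # hyphen first: literal hyphen or A through Z
trailing = re.compile(r"[A-Z-]")   # hyphen last: same class, written the other way

hyphen_results = []
for cls in (leading, trailing):
    hyphen_results.append((
        bool(cls.fullmatch("-")),  # the hyphen itself is in the class
        bool(cls.fullmatch("M")),  # the range A-Z still works
        bool(cls.fullmatch("m")),  # lowercase letters are not included
    ))

print(hyphen_results)  # [(True, True, False), (True, True, False)]
```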
The operators become especially powerful in combination. For example, the regular
expression to recognize an identifier in many programming languages is:
[a-zA-Z][0-9a-zA-Z]*
An identifier in these languages is defined to be a letter followed by zero or more letters or
digits, and that is just what the regular expression says. The first pair of brackets matches any
letter. The second, if it were not followed by a *, would match any digit or letter.
The two pairs of brackets with their enclosed characters would then match any letter followed
by a digit or a letter. But with the *, the example matches any letter followed by any number
of letters or digits. In particular, it would recognize the following as identifiers:
e
not
idenTIFIER
pH
EngineNo99
R2D2
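The identifier expression can be exercised against the examples above. This is a minimal sketch using Python's re module rather than lex itself; the rejected strings are additional assumptions chosen to show what the pattern excludes, and the names IDENT, samples, and rejected are illustrative.

```python
import re

# A letter followed by zero or more letters or digits.
IDENT = re.compile(r"[a-zA-Z][0-9a-zA-Z]*")

samples = ["e", "not", "idenTIFIER", "pH", "EngineNo99", "R2D2"]
rejected = ["9lives", "has_underscore", ""]

# fullmatch() insists that the entire string is an identifier.
accepted = [s for s in samples if IDENT.fullmatch(s)]
refused = [s for s in rejected if IDENT.fullmatch(s) is None]

print(accepted)  # all six samples
print(refused)   # all three counterexamples
```

A string such as 9lives fails because the first bracket expression admits only letters; has_underscore fails because the underscore belongs to neither class.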