Python Regular Expressions Explained (English)

2013-07-25 17:24:51 · 65 KB · PDF

The best and most detailed introduction to Python regular expressions I have found so far, and it is very well organized!
    if matchObj:
        print("matchObj.group() : ", matchObj.group())
        print("matchObj.group(1) : ", matchObj.group(1))
        print("matchObj.group(2) : ", matchObj.group(2))
    else:
        print("No match!!")

When the above code is executed, it produces the following result:

    matchObj.group() :  Cats are [...]
    matchObj.group(1) :  Cats
    matchObj.group(2) :  [...]

Matching vs Searching:

Python offers two different primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (this is what Perl does by default).

Example:

    #!/usr/bin/python
    import re

    line = "Cats are smarter than dogs"

    # match() succeeds only if the pattern matches at the start of the string
    matchObj = re.match(r'dogs', line, re.M | re.I)
    if matchObj:
        print("match --> matchObj.group() : ", matchObj.group())
    else:
        print("No match!!")

    # search() scans the whole string for the first match
    matchObj = re.search(r'dogs', line, re.M | re.I)
    if matchObj:
        print("search --> matchObj.group() : ", matchObj.group())
    else:
        print("No match!!")

When the above code is executed, it produces the following result:

    No match!!
    search --> matchObj.group() :  dogs

Search and replace

One of the most important re methods that use regular expressions is sub.

Syntax:

    re.sub(pattern, repl, string, count=0)

This method replaces occurrences of the RE pattern in string with repl, substituting all occurrences unless count is provided. It returns the modified string.

Example:

    #!/usr/bin/python
    import re

    phone = "2004-959-559 # This is Phone Number"

    # Delete Python-style comments
    num = re.sub(r'#.*$', "", phone)
    print("Phone Num : ", num)

    # Remove anything other than digits
    num = re.sub(r'\D', "", phone)
    print("Phone Num : ", num)

When the above code is executed, it produces the following result:

    Phone Num :  2004-959-559
    Phone Num :  2004959559

Regular-expression Modifiers - Option Flags

Regular expression literals may include an optional modifier to control various aspects of matching. The modifiers are specified as an optional flag.
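As a minimal sketch of passing a flag as the optional last argument, the snippet below reuses the sample sentence from the earlier example. The helper name find_word is made up for illustration and is not part of the re module.

```python
import re

line = "Cats are smarter than dogs"

def find_word(word, text, flags=0):
    """Hypothetical helper: return the matched text, or None if nothing matches."""
    m = re.search(word, text, flags)
    return m.group() if m else None

# Without a flag the search is case-sensitive, so "cats" does not match "Cats".
case_sensitive = find_word(r'cats', line)
# re.I makes the same pattern match regardless of case.
case_insensitive = find_word(r'cats', line, re.I)

print(case_sensitive)     # None
print(case_insensitive)   # Cats
```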
You can provide multiple modifiers using bitwise OR (|), as shown previously. Each may be represented by one of these:

Modifier    Description
re.I        Performs case-insensitive matching.
re.L        Interprets words according to the current locale. This interpretation affects the alphabetic group (\w and \W), as well as word boundary behavior (\b and \B).
re.M        Makes $ match the end of a line (not just the end of the string) and makes ^ match the start of any line (not just the start of the string).
re.S        Makes a period (dot) match any character, including a newline.
re.U        Interprets letters according to the Unicode character set. This flag affects the behavior of \w, \W, \b, \B.
re.X        Permits "cuter" regular expression syntax. It ignores whitespace (except inside a set [] or when escaped by a backslash) and treats an unescaped # as a comment marker.

Regular-expression patterns:

Except for the control characters (+ ? . * ^ $ ( ) [ ] { } | \), all characters match themselves. You can escape a control character by preceding it with a backslash.

The following table lists the regular expression syntax that is available in Python.

Pattern     Description
^           Matches beginning of line.
$           Matches end of line.
.           Matches any single character except newline. Using the re.S option allows it to match newline as well.
[...]       Matches any single character in brackets.
[^...]      Matches any single character not in brackets.
re*         Matches 0 or more occurrences of preceding expression.
re+         Matches 1 or more occurrences of preceding expression.
re?         Matches 0 or 1 occurrence of preceding expression.
re{n}       Matches exactly n occurrences of preceding expression.
re{n,}      Matches n or more occurrences of preceding expression.
re{n,m}     Matches at least n and at most m occurrences of preceding expression.
a|b         Matches either a or b.
(re)        Groups regular expressions and remembers matched text.
(?imx)      Turns on the i, m, or x options within a regular expression. In Python's re module these inline flags apply to the whole pattern and should appear at its start; use (?imx:re) to limit them to a group.
(?-imx)     Temporarily toggles off i, m, or x options within a regular expression.
            If in parentheses, only that area is affected. Python's re module accepts this only in the scoped form (?-imx:re).
(?:re)      Groups regular expressions without remembering matched text.
(?imx:re)   Temporarily toggles on i, m, or x options within parentheses.
(?-imx:re)  Temporarily toggles off i, m, or x options within parentheses.
(?#...)     Comment.
(?=re)      Specifies position using a pattern. Doesn't have a range.
(?!re)      Specifies position using pattern negation. Doesn't have a range.
(?>re)      Matches independent pattern without backtracking (supported by Python's re module from version 3.11).
\w          Matches word characters.
\W          Matches nonword characters.
\s          Matches whitespace. Equivalent to [ \t\n\r\f\v] for ASCII text.
\S          Matches nonwhitespace.
\d          Matches digits. Equivalent to [0-9] for ASCII text.
\D          Matches nondigits.
\A          Matches beginning of string.
\Z          Matches end of string. In Python it matches only at the very end; $ is the anchor that also matches just before a trailing newline.
\z          Matches end of string (Perl/Ruby spelling; in Python use \Z).
\G          Matches point where last match finished (Perl/Ruby syntax; Python's re module does not support \G).
\b          Matches word boundaries when outside brackets. Matches backspace (0x08) when inside brackets.
\B          Matches nonword boundaries.
\n, \t, etc.  Matches newlines, carriage returns, tabs, etc.
\1...\9     Matches nth grouped subexpression.
\10         Matches nth grouped subexpression if it matched already.
            Otherwise refers to the octal representation of a character code.

REGULAR-EXPRESSION EXAMPLES

Literal characters:

Example             Description
python              Match "python".

Character classes:

Example             Description
[Pp]ython           Match "Python" or "python".
rub[ye]             Match "ruby" or "rube".
[aeiou]             Match any one lowercase vowel.
[0-9]               Match any digit; same as [0123456789].
[a-z]               Match any lowercase ASCII letter.
[A-Z]               Match any uppercase ASCII letter.
[a-zA-Z0-9]         Match any of the above.
[^aeiou]            Match anything other than a lowercase vowel.
[^0-9]              Match anything other than a digit.

Special Character Classes:

Example             Description
.                   Match any character except newline.
\d                  Match a digit: [0-9]
\D                  Match a nondigit: [^0-9]
\s                  Match a whitespace character: [ \t\r\n\f]
\S                  Match nonwhitespace: [^ \t\r\n\f]
\w                  Match a single word character: [A-Za-z0-9_]
\W                  Match a nonword character: [^A-Za-z0-9_]

Repetition Cases:

Example             Description
ruby?               Match "rub" or "ruby": the y is optional.
ruby*               Match "rub" plus 0 or more ys.
ruby+               Match "rub" plus 1 or more ys.
\d{3}               Match exactly 3 digits.
\d{3,}              Match 3 or more digits.
\d{3,5}             Match 3, 4, or 5 digits.

Nongreedy repetition:

This matches the smallest number of repetitions.

Example             Description
<.*>                Greedy repetition: matches "<python>perl>".
<.*?>               Nongreedy: matches "<python>" in "<python>perl>".

Grouping with parentheses:

Example             Description
\D\d+               No group: + repeats \d.
(\D\d)+             Grouped: + repeats the \D\d pair.
([Pp]ython(, )?)+   Match "Python", "Python, python, python", etc.

Backreferences:

This matches a previously matched group again.

Example             Description
([Pp])ython&\1ails  Match "python&pails" or "Python&Pails".
(['"])[^\1]*\1      Single or double-quoted string. \1 matches whatever the 1st group matched. \2 matches whatever the 2nd group matched, etc. Inside brackets Python treats \1 as a character escape rather than a backreference, so [^\1] does not exclude the opening quote.

Alternatives:

Example             Description
python|perl         Match "python" or "perl".
rub(y|le)           Match "ruby" or "ruble".
Python(!+|\?)
                    "Python" followed by one or more ! or one ?

Anchors:

These specify the match position.

Example             Description
^Python             Match "Python" at the start of a string or internal line.
Python$             Match "Python" at the end of a string or line.
\APython            Match "Python" at the start of a string.
Python\Z            Match "Python" at the end of a string.
\bPython\b          Match "Python" at a word boundary.
\brub\B             \B is nonword boundary: match "rub" in "rube" and "ruby" but not alone.
Python(?=!)         Match "Python", if followed by an exclamation point.
Python(?!!)         Match "Python", if not followed by an exclamation point.

Special syntax with parentheses:

Example             Description
R(?#comment)        Matches "R". All the rest is a comment.
R(?i)uby            Case-insensitive while matching "uby". Python 3.11 and later reject an inline flag that is not at the start of the pattern, so prefer the scoped form on the next line.
R(?i:uby)           Same as above.
rub(?:y|le)         Group only, without creating a \1 backreference.
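The anchor, lookahead, and non-capturing-group entries above can be checked with a short script. This is only a sketch: the sample strings and variable names are invented for illustration.

```python
import re

text = "Python! Python? Pythonic"

# Lookahead: "Python" only where the next character is "!".
followed = [m.start() for m in re.finditer(r'Python(?=!)', text)]
# Negative lookahead: "Python" where the next character is not "!".
not_followed = [m.start() for m in re.finditer(r'Python(?!!)', text)]
# Word boundaries: "Python" as a whole word, so "Pythonic" is skipped.
whole_words = re.findall(r'\bPython\b', text)

# Non-capturing group: the alternation is grouped but no \1 is created.
m = re.search(r'rub(?:y|le)', "a ruble")

print(followed)        # [0]
print(not_followed)    # [8, 16]
print(whole_words)     # ['Python', 'Python']
print(m.group())       # ruble
print(m.groups())      # ()
```

Lookaheads consume no characters, which is why the reported positions are the start of "Python" and the matched text never includes the "!".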

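The greedy and nongreedy patterns, the backreference example, and the replacement limit of re.sub described in the text can be tried out as follows. This is a sketch under the assumption of a current Python 3 interpreter, where the limit argument of re.sub is named count; the variable names are illustrative only.

```python
import re

s = "<python>perl>"
# Greedy: .* grabs as much as possible before the final ">".
greedy = re.search(r'<.*>', s).group()
# Nongreedy: .*? stops at the first ">" it can.
lazy = re.search(r'<.*?>', s).group()

# Backreference: \1 must repeat exactly what group 1 captured.
backref = re.compile(r'([Pp])ython&\1ails')
same_case = bool(backref.search("Python&Pails"))
mixed_case = bool(backref.search("Python&pails"))

# count limits how many replacements are made (0 means replace all).
first_only = re.sub(r'-', '', "2004-959-559", count=1)

print(greedy)       # <python>perl>
print(lazy)         # <python>
print(same_case)    # True
print(mixed_case)   # False
print(first_only)   # 2004959-559
```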