PERLREQUICK - PERL REGULAR EXPRESSIONS QUICK START
THE GUIDE
Simple word matching
The simplest regex is simply a word, or more generally, a string of characters. A
regex consisting of a word matches any string that contains that word:
"Hello World" =~ /World/; # matches
In this statement, World is a regex and the // enclosing /World/ tells Perl to
search a string for a match. The operator =~ associates the string with the
regex match and produces a true value if the regex matched, or false if the
regex did not match. In our case, World matches the second word in "Hello
World", so the expression is true. This idea has several variations.
Expressions like this are useful in conditionals:
print "It matches\n" if "Hello World" =~ /World/;
The sense of the match can be reversed by using the !~ operator:
print "It doesn't match\n" if "Hello World" !~ /World/;
The literal string in the regex can be replaced by a variable:
$greeting = "World";
print "It matches\n" if "Hello World" =~ /$greeting/;
If you're matching against $_, the $_ =~ part can be omitted:
$_ = "Hello World";
print "It matches\n" if /World/;
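The variations above can be combined in ordinary code. The following is a minimal sketch, assuming a hypothetical list of sample strings (@lines) and a pattern held in a variable ($pattern); neither name comes from the original text.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original text.
my @lines   = ("Hello World", "Goodbye", "World peace");
my $pattern = "World";

# Keep the lines that contain the pattern, and those that do not.
my @hits   = grep { $_ =~ /$pattern/ } @lines;
my @misses = grep { $_ !~ /$pattern/ } @lines;

# Inside grep each element is in $_, so the "$_ =~" part can be omitted.
my @hits_short = grep { /$pattern/ } @lines;

print "hits: @hits\n";
print "misses: @misses\n";
```

Both forms of the grep block select the same lines; the shorter one simply relies on the default match against $_.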
Finally, the // default delimiters for a match can be changed to arbitrary
delimiters by putting an 'm' out front:
"Hello World" =~ m!World!; # matches, delimited by '!'
"Hello World" =~ m{World}; # matches, note the matching '{}'
"/usr/bin/perl" =~ m"/perl"; # matches after '/usr/bin',
# '/' becomes an ordinary char
Regexes must match a part of the string exactly in order for the statement to be
true:
"Hello World" =~ /world/; # doesn't match, case sensitive
"Hello World" =~ /o W/; # matches, ' ' is an ordinary char
"Hello World" =~ /World /; # doesn't match, no ' ' at end
Perl always matches at the earliest possible point in the string:
"Hello World" =~ /o/; # matches 'o' in 'Hello'
"That hat is red" =~ /hat/; # matches 'hat' in 'That'
Not all characters can be used 'as is' in a match. Some characters, called
metacharacters, are reserved for use in regex notation. The metacharacters
are
{}[]()^$.|*+?\
A metacharacter can be matched by putting a backslash before it:
"2+2=4" =~ /2+2/; # doesn't match, + is a metacharacter
"2+2=4" =~ /2\+2/; # matches, \+ is treated like an ordinary +
'C:\WIN32' =~ /C:\\WIN/; # matches
"/usr/bin/perl" =~ /\/usr\/bin\/perl/; # matches
"/usr/bin/perl" =~ /\/usr\/bin\/perl/; # matches
In the last regex, the forward slash '/' is also backslashed, because it is used
to delimit the regex.
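When the text to be matched literally is held in a variable, the backslashes cannot be typed by hand. Perl provides the \Q...\E escape and the quotemeta function for this case; they are not part of the text above, and the variable names below are hypothetical. A minimal sketch:

```perl
use strict;
use warnings;

my $expr = "2+2";   # contains the metacharacter '+'

# Interpolated as is, '+' acts as a quantifier, so this does not match.
my $raw = ("2+2=4" =~ /$expr/) ? 1 : 0;

# \Q...\E backslashes the metacharacters in the interpolated value.
my $quoted = ("2+2=4" =~ /\Q$expr\E/) ? 1 : 0;

# quotemeta does the same thing as a function call.
my $escaped = quotemeta($expr);

print "raw: $raw, quoted: $quoted, escaped: $escaped\n";
```

Here $escaped holds the same backslashed form, 2\+2, that was written by hand in the examples above.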
Non-printable ASCII characters are represented by escape sequences.
Common examples are \t for a tab, \n for a newline, and \r for a carriage
return. Arbitrary bytes are represented by octal escape sequences, e.g., \033,
or hexadecimal escape sequences, e.g., \x1B:
"1000\t2000" =~ m(0\t2); # matches
"cat" =~ /\143\x61\x74/; # matches, but a weird way to spell cat
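The octal and hexadecimal forms are interchangeable ways of naming the same byte. As a small check, assuming the ESC character (decimal 27) as the test value, the following sketch matches it with both spellings; the variable names are hypothetical.

```perl
use strict;
use warnings;

# The ESC character (decimal 27), matched with both escape styles.
my $esc      = chr(27);
my $octal_ok = ($esc =~ /\033/) ? 1 : 0;
my $hex_ok   = ($esc =~ /\x1B/) ? 1 : 0;

# A tab-separated record matched with \t in the regex.
my $tab_ok = ("1000\t2000" =~ /0\t2/) ? 1 : 0;

print "octal: $octal_ok, hex: $hex_ok, tab: $tab_ok\n";
```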
Regexes are treated mostly as double-quoted strings, so variable substitution
works:
$foo = 'house';
'cathouse' =~ /cat$foo/; # matches
'housecat' =~ /${foo}cat/; # matches
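The braces in ${foo} matter because, without them, Perl would read /$foocat/ as a reference to a variable named $foocat. The sketch below shows the braced form and, for comparison, the same pattern built first as an ordinary double-quoted string; $pattern is a hypothetical name introduced for illustration.

```perl
use strict;
use warnings;

my $foo = 'house';

# ${foo} marks where the variable name ends, so 'cat' is literal text.
my $braced = ('housecat' =~ /${foo}cat/) ? 1 : 0;

# The same regex assembled as an ordinary double-quoted string first.
my $pattern    = "${foo}cat";
my $via_string = ('housecat' =~ /$pattern/) ? 1 : 0;

print "braced: $braced, via string: $via_string, pattern: $pattern\n";
```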
With all of the regexes above, if the regex matched anywhere in the string, it
was considered a match. To specify where it should match, we would use the
anchor metacharacters ^ and $. The anchor ^ means match at the beginning of
the string and the anchor $ means match at the end of the string, or before a
newline at the end of the string. Some examples:
"housekeeper" =~ /keeper/; # matches
"housekeeper" =~ /^keeper/; # doesn't match
"housekeeper" =~ /keeper$/; # matches
"housekeeper\n" =~ /keeper$/; # matches
"housekeeper" =~ /^housekeeper$/; # matches
Using character classes
A character class allows a set of possible characters, rather than just a single
character, to match at a particular point in a regex. Character classes are
denoted by brackets [...], with the set of characters that may be matched
inside. Here are some examples:
/cat/; # matches 'cat'
/[bcr]at/; # matches 'bat', 'cat', or 'rat'
"abc" =~ /[cab]/; # matches 'a'
In the last statement, even though 'c' is the first character in the class, the
earliest point at which the regex can match is 'a'.
/[yY][eE][sS]/; # match 'yes' in a case-insensitive way
# 'yes', 'Yes', 'YES', etc.
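The character classes above can be tried against a few strings at once. This is a minimal sketch using hypothetical word lists (@words and @answers), filtered with grep.

```perl
use strict;
use warnings;

# Hypothetical word lists, chosen only for illustration.
my @words   = ("bat", "cat", "rat", "hat", "at");
my @answers = ("yes", "Yes", "YES", "no", "yEs please");

# Words containing 'bat', 'cat', or 'rat'.
my @rhymes = grep { /[bcr]at/ } @words;

# Answers containing 'yes' in any mixture of upper and lower case.
my @accepted = grep { /[yY][eE][sS]/ } @answers;

print "rhymes: @rhymes\n";
print "accepted: ", scalar(@accepted), " of ", scalar(@answers), "\n";
```

Note that 'yEs please' is accepted as well, since the regex is not anchored and only needs to match somewhere in the string.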