PCREPATTERN(3) PCREPATTERN(3)
NAME
PCRE - Perl-compatible regular expressions
PCRE REGULAR EXPRESSION DETAILS
The syntax and semantics of the regular expressions sup-
ported by PCRE are described below. Regular expressions
are also described in the Perl documentation and in a
number of books, some of which have copious examples.
Jeffrey Friedl's "Mastering Regular Expressions", pub-
lished by O'Reilly, covers regular expressions in great
detail. This description of PCRE's regular expressions
is intended as reference material.
PCRE originally operated on strings of one-byte
characters. However, it now also supports UTF-8
character strings. To use this support, you must build
PCRE with UTF-8 support and then call pcre_compile()
with the PCRE_UTF8 option. How this affects pattern
matching is mentioned in several places below. There is
also a summary of UTF-8 features in the section on UTF-8
support in the main pcre page.
The remainder of this document discusses the patterns
that are supported by PCRE when its main matching func-
tion, pcre_exec(), is used. From release 6.0, PCRE
offers a second matching function, pcre_dfa_exec(),
which matches using a different algorithm that is not
Perl-compatible. The advantages and disadvantages of the
alternative function, and how it differs from the normal
function, are discussed in the pcrematching page.
CHARACTERS AND METACHARACTERS
A regular expression is a pattern that is matched
against a subject string from left to right. Most char-
acters stand for themselves in a pattern, and match the
corresponding characters in the subject. As a trivial
example, the pattern
    The quick brown fox
matches a portion of a subject string that is identical
to itself. When caseless matching is specified (the
PCRE_CASELESS option), letters are matched independently
of case. In UTF-8 mode, PCRE always understands the con-
cept of case for characters whose values are less than
128, so caseless matching is always possible. For char-
acters with higher values, the concept of case is sup-
ported if PCRE is compiled with Unicode property sup-
port, but not otherwise. If you want to use caseless
matching for characters 128 and above, you must ensure
that PCRE is compiled with Unicode property support as
well as with UTF-8 support.
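The effect of caseless matching can be sketched as
follows. This is only an illustration, and it assumes
Python's re module as a stand-in for PCRE: re.IGNORECASE
plays the role of PCRE_CASELESS here, and none of the
calls below belong to the PCRE API.

```python
import re

SUBJECT = "I saw THE QUICK BROWN FOX jump"

# Assumption: re.IGNORECASE stands in for PCRE_CASELESS.
caseless = re.compile(r"The quick brown fox", re.IGNORECASE)
caseful = re.compile(r"The quick brown fox")

# With caseless matching, letters match independently of case.
caseless_hit = caseless.search(SUBJECT) is not None
# Without it, the upper-case subject text does not match.
caseful_hit = caseful.search(SUBJECT) is not None

print(caseless_hit, caseful_hit)
```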
The power of regular expressions comes from the ability
to include alternatives and repetitions in the pattern.
These are encoded in the pattern by the use of metachar-
acters, which do not stand for themselves but instead
are interpreted in some special way.
There are two different sets of metacharacters: those
that are recognized anywhere in the pattern except
within square brackets, and those that are recognized
within square brackets. Outside square brackets, the
metacharacters are as follows:
  \      general escape character with several uses
  ^      assert start of string (or line, in multiline mode)
  $      assert end of string (or line, in multiline mode)
  .      match any character except newline (by default)
  [      start character class definition
  |      start of alternative branch
  (      start subpattern
  )      end subpattern
  ?      extends the meaning of (
         also 0 or 1 quantifier
         also quantifier minimizer
  *      0 or more quantifier
  +      1 or more quantifier
         also "possessive quantifier"
  {      start min/max quantifier
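A few of the metacharacters listed above can be seen
working together in a short sketch. It assumes Python's
re module as a stand-in for PCRE; for the constructs used
here the two behave alike, but the pattern and the sample
subjects are made up for illustration.

```python
import re

# ^ and $ anchor the match, ( | ) groups two alternatives,
# and ? makes the preceding "s" optional (0 or 1 quantifier).
animal = re.compile(r"^(cat|dog)s?$")

subjects = ["cat", "dogs", "catss", "bird"]
results = {s: animal.search(s) is not None for s in subjects}

for subject, matched in results.items():
    print(subject, matched)
```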
Part of a pattern that is in square brackets is called a
"character class". In a character class the only
metacharacters are:
  \      general escape character
  ^      negate the class, but only if the first character
  -      indicates character range
  [      POSIX character class (only if followed by POSIX
         syntax)
  ]      terminates the character class
The following sections describe the use of each of the
metacharacters.
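As a brief illustration of the character-class
metacharacters, the sketch below uses Python's re module
as a stand-in for PCRE (an assumption; POSIX class syntax
is left out because Python's re does not support it).

```python
import re

# - indicates a range: any one of a, b, c, d, e, f.
hex_letter = re.compile(r"[a-f]")

# ^ negates the class, but only as the first character.
not_hex_letter = re.compile(r"[^a-f]")

# Anywhere else in the class, ^ is an ordinary character.
a_or_caret = re.compile(r"[a^]")

print(hex_letter.fullmatch("c") is not None)
print(not_hex_letter.fullmatch("g") is not None)
print(a_or_caret.fullmatch("^") is not None)
```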
BACKSLASH
The backslash character has several uses. Firstly, if it
is followed by a non-alphanumeric character, it takes
away any special meaning that character may have. This
use of backslash as an escape character applies both
inside and outside character classes.
For example, if you want to match a * character, you
write \* in the pattern. This escaping action applies
whether or not the following character would otherwise
be interpreted as a metacharacter, so it is always safe
to precede a non-alphanumeric with backslash to specify
that it stands for itself. In particular, if you want to
match a backslash, you write \\.
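The escaping rule can be tried out with a small sketch.
It assumes Python's re module as a stand-in for PCRE;
the sample subjects are made up for illustration.

```python
import re

# \* matches a literal asterisk; \\ matches a literal backslash.
star = re.compile(r"2\*3")
backslash = re.compile(r"C:\\temp")

star_match = star.search("2*3=6")
backslash_match = backslash.search(r"cd C:\temp")

# Without the backslash, * is a quantifier applied to "2",
# so the pattern means: zero or more "2", then "3".
unescaped_match = re.search(r"2*3", "2*3=6")

print(star_match.group())
print(backslash_match.group())
print(unescaped_match.group())
```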
If a pattern is compiled with the PCRE_EXTENDED option,
whitespace in the pattern (other than in a character
class) and characters between a # outside a character
class and the next newline are ignored. An escaping
backslash can be used to include a whitespace or # char-
acter as part of the pattern.
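The following sketch shows the same idea in Python,
whose re.VERBOSE flag is assumed here to be a close
analogue of PCRE_EXTENDED. It is not PCRE itself, and the
pattern is made up for illustration.

```python
import re

# Unescaped whitespace and #-to-newline comments are ignored.
verbose = re.compile(
    r"""
    a \  b     # a, an escaped space, then b
    \#         # an escaped hash is a literal character
    """,
    re.VERBOSE,
)

with_space = verbose.fullmatch("a b#") is not None
without_space = verbose.fullmatch("ab#") is not None

print(with_space, without_space)
```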
If you want to remove the special meaning from a
sequence of characters, you can do so by putting them
between \Q and \E. This is different from Perl in that $
and @ are handled as literals in \Q...\E sequences in
PCRE, whereas in Perl, $ and @ cause variable interpola-
tion. Note the following examples:
  Pattern            PCRE matches   Perl matches

  \Qabc$xyz\E        abc$xyz        abc followed by the
                                    contents of $xyz
  \Qabc\$xyz\E       abc\$xyz       abc\$xyz
  \Qabc\E\$\Qxyz\E   abc$xyz        abc$xyz
The \Q...\E sequence is recognized both inside and out-
side character classes.
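Python's re module has no \Q...\E, so the PCRE column of
the table can only be imitated there. The sketch below
does so with a hypothetical helper, expand_quoted(),
which is not part of PCRE or Python. It assumes that an
unterminated \Q quotes to the end of the pattern, and it
does not handle a \Q that is itself preceded by an
escaping backslash.

```python
import re


def expand_quoted(pattern: str) -> str:
    """Hypothetical helper: turn quoted spans into escaped literals."""
    out = []
    pos = 0
    while True:
        start = pattern.find(r"\Q", pos)
        if start == -1:
            out.append(pattern[pos:])  # no more quoted spans
            break
        out.append(pattern[pos:start])
        end = pattern.find(r"\E", start + 2)
        if end == -1:
            # Assumption: without \E, quoting runs to the end.
            out.append(re.escape(pattern[start + 2:]))
            break
        out.append(re.escape(pattern[start + 2:end]))
        pos = end + 2
    return "".join(out)


def pcre_style_match(pattern: str, subject: str) -> bool:
    """Match subject in full against the expanded pattern."""
    return re.fullmatch(expand_quoted(pattern), subject) is not None


print(pcre_style_match(r"\Qabc$xyz\E", "abc$xyz"))
print(pcre_style_match(r"\Qabc\$xyz\E", r"abc\$xyz"))
print(pcre_style_match(r"\Qabc\E\$\Qxyz\E", "abc$xyz"))
```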
Non-printing characters
A second use of backslash provides a way of encoding
non-printing characters in patterns in a visible manner.
There is no restriction on the appearance of non-print-
ing characters, apart from the binary zero that termi-
nates a pattern, but when a pattern is being prepared by
text editing, it is usually easier to use one of the
following escape sequences than the binary character it
represents:
  \a         alarm, that is, the BEL character (hex 07)
  \cx        "control-x", where x is any character
  \e         escape (hex 1B)
  \f         formfeed (hex 0C)
  \n         newline (hex 0A)
  \r         carriage return (hex 0D)
  \t         tab (hex 09)
  \ddd       character with octal code ddd, or backreference
  \xhh       character with hex code hh
  \x{hhh..}  character with hex code hhh..
The precise effect of \cx is as follows: if x is a lower
case letter, it is converted to upper case. Then bit 6
of the character (hex 40) is inverted. Thus \cz becomes
hex 1A, but \c{ becomes hex 3B, while \c; becomes hex
7B.
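The rule for \cx is plain arithmetic, so it can be
checked with a short sketch. The function name
control_escape() is hypothetical and is not part of
PCRE; it only reproduces the two steps described above
for a single one-byte character.

```python
def control_escape(x: str) -> int:
    r"""Return the character code that \cx denotes (sketch only)."""
    if len(x) != 1:
        raise ValueError("expected exactly one character")
    code = ord(x)
    # Step 1: a lower case letter is converted to upper case.
    if "a" <= x <= "z":
        code = ord(x.upper())
    # Step 2: bit 6 of the character (hex 40) is inverted.
    return code ^ 0x40


for ch in ("z", "{", ";"):
    print(ch, format(control_escape(ch), "02X"))
```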
After \x, from zero to two hexadecimal digits are read
(letters can be in upper or lower case). Any number of
hexadecimal digits may appear between \x{ and }, but the
value of the character code must be less than 256 in
non-UTF-8 mode, and less than 2**31 in UTF-8 mode.