没有合适的资源?快使用搜索试试~ 我知道了~
AC自动机-Set Matching and Aho-Corasick Algorithm
需积分: 10 19 下载量 31 浏览量
2014-03-07
10:58:02
上传
评论 2
收藏 325KB PDF 举报
温馨提示
试读
24页
AC自动机-Set Matching and Aho-Corasick Algorithm
资源推荐
资源详情
资源评论
Exact Set Matching Problem
In the exact set matching problem we locate occurrences
of any pattern of a set P = {P
1
, . . . , P
k
}, in target T [1 . . . m]
Let n =
P
k
i=1
|P
i
|. Exact set matching can be solved in time
O(|P
1
| + m + · · · + |P
k
| + m) = O(n + km)
by applying any linear-time exact matching k times
Aho-Corasick algorithm (AC) is a classic solution to
exact set matching. It works in time O(n + m + z), where z
is number of pattern occurrences in T
(Main reference here [Aho and Corasick, 1975])
AC is based on a refinement of a keyword tree
BSA Lecture 4: Aho-Corasick matching – p.2/24
Keyword Trees
A keyword tree (or a trie) for a set of patterns P is a
rooted tree K such that
1. each edge of K is labeled by a character
2. any two edges out of a node have different labels
Define the label of a node v as the concatenation of edge
labels on the path from the root to v, and denote it by L(v)
3. for each P ∈ P there’s a node v with L(v) = P , and
4. the label L(v) of any leaf v equals some P ∈ P
BSA Lecture 4: Aho-Corasick matching – p.3/24
Example of a Keyword Tree
A keyword tree for P = {he, she, his, hers}:
-
-
-
-
-
-
-
Z
Z
Z
Z~
J
J
J
J
J
J^
h
e
s
r
s
s
h
e
i
2
4
1
3
A keyword tree is an efficient implementation of a
dictionary of strings
BSA Lecture 4: Aho-Corasick matching – p.4/24
Keyword Tree: Construction
Construction for P = {P
1
, . . . , P
k
}:
Begin with a root node only;
Insert each pattern P
i
, one after the other, as follows:
Starting at the root, follow the path labeled by chars of P
i
;
If the path ends before P
i
, continue it by adding new
edges and nodes for the remaining characters of P
i
Store identifier i of P
i
at the terminal node of the path
This takes clearly O(|P
1
| + · · · + |P
k
|) = O(n) time
BSA Lecture 4: Aho-Corasick matching – p.5/24
剩余23页未读,继续阅读
资源评论
Raise
- 粉丝: 329
- 资源: 69
上传资源 快速赚钱
- 我的内容管理 展开
- 我的资源 快来上传第一个资源
- 我的收益 登录查看自己的收益
- 我的积分 登录查看自己的积分
- 我的C币 登录后查看C币余额
- 我的收藏
- 我的下载
- 下载帮助
安全验证
文档复制为VIP权益,开通VIP直接复制
信息提交成功