Fast Algorithms for Large-State-Space HMMs with
Applications to Web Usage Analysis

Pedro F. Felzenszwalb¹, Daniel P. Huttenlocher², Jon M. Kleinberg²

¹AI Lab, MIT, Cambridge MA 02139
²Computer Science Dept., Cornell University, Ithaca NY 14853
Abstract
In applying Hidden Markov Models to the analysis of massive data
streams, it is often necessary to use an artificially reduced set of
states; this is due in large part to the fact that the basic HMM
estimation algorithms have a quadratic dependence on the size of
the state set. We present algorithms that reduce this computational
bottleneck to linear or near-linear time, when the states can be
embedded in an underlying parameter space. This type of state
representation arises in many domains; in particular, we show an
application to traffic analysis at a high-volume Web site.
1 Introduction
Hidden Markov Models (HMMs) are used in a wide variety of applications where
a sequence of observable events is correlated with or caused by a sequence of
unobservable underlying states (e.g., [8]). Despite their broad applicability, HMMs
are in practice limited to problems where the number of hidden states is relatively
small. The most natural such problems are those where some abstract categorization
provides a small set of discrete states, such as phonemes in the case of speech
recognition or coding and structure in the case of genomics. Recently, however,
issues arising in massive data streams, such as the analysis of usage logs at
high-traffic Web sites, have led to problems that call naturally for HMMs with large state
sets over very long input sequences.
A major obstacle in scaling HMMs up to larger state spaces is the computational
cost of implementing the basic primitives associated with them: given an n-state
HMM and a sequence of T observations, determining the probability of the
observations, or the state sequence of maximum probability, takes time O(Tn²) using the
forward-backward and Viterbi algorithms. The quadratic dependence on the number
of states is a long-standing bottleneck that necessitates a small (often artificially
coarsened) state set, particularly when the length T of the input is large.
In this paper, we present algorithms that overcome this obstacle for a broad class
of HMMs. We improve the running times of the basic estimation and inference
primitives to have a linear or near-linear dependence on the number of states, for a
family of models in which the states are embedded as discrete points in an
underlying parameter space, and the state transition costs (the negative logs of the state