AIR-BAGEL: An Interactive Root cause-Based Anomaly Generator for Event Logs
Abstract—We describe AIR-BAGEL, a tool to generate pseudo-real trace-level
anomalies in event logs. Anomalies to be injected are defined by their root cause, i.e.,
resource behaviour or system malfunctioning. For each root cause, several anomaly
types can be specified, e.g., deleting, replacing or moving events in a trace. Root causes
and anomalies have been modelled based on existing literature on event log cleaning
and data quality analysis. AIR-BAGEL addresses the issue of unavailability of labelled
real world event logs for developing and evaluating event log cleaning and
reconstruction techniques and it represents a step forward compared to current
approaches in the literature that simply inject different types of anomalies randomly in
event logs.
Index Terms—event log, data quality, anomaly, process mining, event log cleaning.
I. INTRODUCTION
Event logs captured in the real world are prone to errors [1]. These can stem from a
variety of root causes, such as system malfunctioning or human mistakes, resulting in
different types of errors, such as abnormal activity labels, or missing, duplicated, and
swapped events [1]–[8]. Errors in event logs hamper the possibility of extracting useful
insights from event log analysis. For instance, they clearly influence the models
obtained through process discovery and the fitness computation in conformance
checking. Therefore, one should aim at removing these errors before running event log
analytics.
If a process model is available or can be discovered from clean traces, then errors can
be detected using traditional conformance checking techniques. However, in many
cases a process model is not available. Event log anomaly detection or cleaning has
emerged recently as a new field in process mining developing techniques to identify
anomalies in event logs without assuming the existence of a process model [1].
To be evaluated, event log cleaning techniques require event logs in which traces and
events are labelled as either normal or anomalous. When such techniques exploit
machine learning algorithms, labelled logs are also necessary for training/testing during
model development. Unfortunately, labelled real world event logs are not available
among the ones usually considered within the process mining community. Therefore,
researchers in this field have relied on artificially injecting anomalies into real or
simulated event logs.
Anomalies of different types are normally injected randomly into event logs at different
ratios, i.e., until a target ratio of existing traces and/or events has been modified to
become anomalous [1]. As far as existing tools are concerned, event log generation
tools, e.g., PLG2 [9] and PTandLogGenerator [10], allow perturbing an event log
obtained through simulation of a process model by randomly changing the order of
events in a trace or randomly deleting/adding events in a trace, but they do not allow
injecting anomalies generated by more complex patterns or into already existing event
logs.
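The random injection strategy described above can be illustrated with a minimal Python sketch. It is not the code of PLG2, PTandLogGenerator, or AIR-BAGEL. The function name `inject_random_anomalies`, the dictionary-based log representation, and the three perturbation kinds are assumptions made only for illustration.

```python
import random


def inject_random_anomalies(log, target_ratio, seed=0):
    """Randomly perturb traces until `target_ratio` of them are anomalous.

    `log` maps a case id to a list of activity labels (an assumed,
    simplified representation of an event log).
    """
    rng = random.Random(seed)
    perturbed = {case: list(trace) for case, trace in log.items()}
    labels = {case: 0 for case in log}
    # Only traces with at least two events support all three perturbations.
    candidates = [case for case, trace in log.items() if len(trace) >= 2]
    n_target = min(len(candidates), round(target_ratio * len(log)))
    for case in rng.sample(candidates, n_target):
        trace = perturbed[case]
        kind = rng.choice(["delete", "add", "swap"])
        i = rng.randrange(len(trace) - 1)
        if kind == "swap" and trace[i] == trace[i + 1]:
            kind = "delete"  # swapping identical neighbours changes nothing
        if kind == "delete":
            del trace[i]
        elif kind == "add":
            trace.insert(i, trace[i])  # duplicate an existing event
        else:
            trace[i], trace[i + 1] = trace[i + 1], trace[i]
        labels[case] = 1  # case-level label: 1 = anomalous
    return perturbed, labels
```

Note that the choice of which trace to modify, and how, is independent of who executed the events or which system recorded them. That independence is the limitation that a root cause-based generator is meant to address.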
In the real world, anomalies are generated by complex organisational situations.
Anomalies in an event log, in fact, are normally linked to specific root causes [2], [3].
These can be classified, at least at a first high level of analysis, into two categories:
resource and system. The former refers to human resources involved in a process being
the source of anomalies. Resource A, for instance, may be sloppier at their job than
resource B, and therefore cases in which A is involved may have more events skipped
or wrongly recorded than the ones involving B. The latter refers to anomalies associated
with malfunctioning of systems supporting the execution of processes. System A, for
instance, may have malfunctioned for a specific amount of time on a specific day; as a
consequence, depending on the type of malfunctioning, the events happening in that
specific time window may be missing from the log or have been erroneously recorded,
e.g., moved to a different case.
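The two root cause categories can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the tool's implementation: the field names (`case`, `resource`, `system`, `timestamp`), the per-resource skip probability, and the single outage window are all hypothetical choices.

```python
import random
from collections import defaultdict
from datetime import datetime


def inject_by_root_cause(events, skip_prob, outage, seed=0):
    """Drop events depending on who executed them and when they were recorded.

    events    : list of dicts with keys "case", "activity", "resource",
                "system" and "timestamp" (assumed field names)
    skip_prob : resource name -> probability that the resource fails to
                record an event (a sloppier resource has a higher value)
    outage    : (system, start, end); events recorded by `system` within
                [start, end) are lost
    """
    rng = random.Random(seed)
    system, start, end = outage
    kept = []
    causes = defaultdict(set)  # case id -> root causes that affected it
    for ev in events:
        if ev["system"] == system and start <= ev["timestamp"] < end:
            causes[ev["case"]].add("system")
        elif rng.random() < skip_prob.get(ev["resource"], 0.0):
            causes[ev["case"]].add("resource")
        else:
            kept.append(ev)
    return kept, dict(causes)


# Illustrative data: resource A always skips, system S1 is down 09:30-11:00.
example_events = [
    {"case": "c1", "activity": "a", "resource": "A", "system": "S2",
     "timestamp": datetime(2023, 1, 1, 9, 0)},
    {"case": "c2", "activity": "a", "resource": "B", "system": "S1",
     "timestamp": datetime(2023, 1, 1, 10, 0)},
    {"case": "c3", "activity": "a", "resource": "B", "system": "S2",
     "timestamp": datetime(2023, 1, 1, 10, 0)},
]
example_outage = ("S1", datetime(2023, 1, 1, 9, 30), datetime(2023, 1, 1, 11, 0))
kept, causes = inject_by_root_cause(
    example_events, {"A": 1.0, "B": 0.0}, example_outage
)
```

In this sketch only event deletion is modelled. Erroneous recording, such as moving an event to a different case, would need an additional branch.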
This paper describes AIR-BAGEL, an interactive tool for anomaly injection in event
logs that aims at simulating pseudo-real anomalies. The idea behind AIR-BAGEL is to
associate the injected anomalies with specific root causes, i.e., the behaviour of
resources or system malfunctioning. AIR-BAGEL injects anomalies at the level of
order and occurrence of activities in a case, e.g., skipping or replacing events in a trace
in an event log. The tool outputs event logs with injected anomalies augmented with
case-level anomaly binary labels and other attributes required to reconstruct the type of
anomaly that was injected. The main objective of AIR-BAGEL is to provide the process
mining research community with a simple tool to generate pseudo-real anomalies in
existing event logs, so as to enable the development and evaluation of event log
cleaning approaches using event log anomalies that better resemble the ones that could
be encountered in the real world.
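One possible shape for such a labelled output is sketched below in Python. The column names `is_anomalous`, `root_cause` and `anomaly_type` are assumptions chosen for illustration; the text above only states that a case-level binary label and the attributes needed to reconstruct the anomaly type are included.

```python
import csv
import io


def write_labelled_log(events, injected, out):
    """Write events as CSV, adding case-level label columns.

    events   : list of dicts with keys "case", "activity", "timestamp"
    injected : case id -> (root_cause, anomaly_type) for modified cases
    out      : a writable text file object
    """
    fields = ["case", "activity", "timestamp",
              "is_anomalous", "root_cause", "anomaly_type"]
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for ev in events:
        # Cases that were not modified get empty reconstruction attributes.
        root_cause, anomaly_type = injected.get(ev["case"], ("", ""))
        writer.writerow({
            "case": ev["case"],
            "activity": ev["activity"],
            "timestamp": ev["timestamp"],
            "is_anomalous": int(ev["case"] in injected),  # binary label
            "root_cause": root_cause,
            "anomaly_type": anomaly_type,
        })


# Illustrative usage with an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
write_labelled_log(
    [{"case": "c1", "activity": "a", "timestamp": "2023-01-01T09:00:00"},
     {"case": "c2", "activity": "b", "timestamp": "2023-01-01T09:05:00"}],
    {"c2": ("resource", "skip")},
    buffer,
)
```

Because the label is stored per case, every event of a modified case carries the same values, which keeps the log usable as training or test data for cleaning techniques.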
The next section gives an overview of the tool, while Sections III and IV describe in
more detail the features and the maturity of AIR-BAGEL, respectively. How to access and
install the tool is described in Section V, while conclusions are drawn in Section VI.
II. OVERVIEW
AIR-BAGEL generates pseudo-real anomalies applying two types of root causes, i.e.,
resource and system. It is assumed that each event in a log has two attributes, one
identifying the (human) resource in charge of executing the task to which an event
refers and one identifying the system used for recording the task. Since this information,
particularly the system attribute, is often not available in an event log, AIR-BAGEL
can also generate these two attributes. The user can specify the existence of a
given number of resources/systems and associate them with specific classes of events
in the process.
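A minimal Python sketch of this attribute generation step follows. It assumes that an event class corresponds to an activity label and that the user supplies a pool of resources and a single system per activity. The function name `assign_attributes` and both mappings are hypothetical and do not reflect the tool's actual interface.

```python
import random


def assign_attributes(events, resource_pool, system_of, seed=0):
    """Return copies of `events` enriched with "resource" and "system".

    events        : list of dicts with at least an "activity" key
    resource_pool : activity -> list of resources allowed to execute it
    system_of     : activity -> system assumed to record it
    """
    rng = random.Random(seed)
    enriched = []
    for ev in events:
        ev = dict(ev)  # leave the input event untouched
        ev["resource"] = rng.choice(resource_pool[ev["activity"]])
        ev["system"] = system_of[ev["activity"]]
        enriched.append(ev)
    return enriched


# Illustrative usage: two resources share "register", one handles "approve".
raw_events = [
    {"case": "c1", "activity": "register"},
    {"case": "c1", "activity": "approve"},
]
enriched_events = assign_attributes(
    raw_events,
    resource_pool={"register": ["R1", "R2"], "approve": ["R3"]},
    system_of={"register": "S1", "approve": "S2"},
)
```

Once every event carries both attributes, anomalies can be tied to a specific resource or system rather than being scattered uniformly over the log.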
As far as generating anomalies is concerned, when resource is the root, since each
resource may have different job competency, experience, and behaviour, e.g., new