Eur. Phys. J. C (2015) 75:439
DOI 10.1140/epjc/s10052-015-3659-3
Regular Article - Experimental Physics
The Pandora software development kit for pattern recognition
J. S. Marshall^a, M. A. Thomson
Cavendish Laboratory, University of Cambridge, Cambridge, UK
Received: 18 June 2015 / Accepted: 4 September 2015 / Published online: 21 September 2015
© The Author(s) 2015. This article is published with open access at Springerlink.com
Abstract The development of automated solutions to pattern recognition problems is important in many areas of scientific research and human endeavour. This paper describes the implementation of the Pandora software development kit, which aids the process of designing, implementing and running pattern recognition algorithms. The Pandora Application Programming Interfaces ensure simple specification of the building-blocks defining a pattern recognition problem. The logic required to solve the problem is implemented in algorithms. The algorithms request operations to create or modify data structures and the operations are performed by the Pandora framework. This design promotes an approach using many decoupled algorithms, each addressing specific topologies. Details of algorithms addressing two pattern recognition problems in High Energy Physics are presented: reconstruction of events at a high-energy e+e− linear collider and reconstruction of cosmic ray or neutrino events in a liquid argon time projection chamber.
1 Introduction
Pattern recognition is the identification of structures or regularities in data. Problems requiring a pattern recognition solution occur in all areas of scientific research and our everyday lives. This document describes the implementation of the Pandora software development kit (SDK), which aims to ease the process of designing, implementing and running pattern recognition algorithms. The Pandora SDK was created to address the problem of identifying energy deposits from individual particles in fine granularity detectors in High Energy Physics (HEP). The ideas described in this document are, however, quite generic, covering a wide array of problems where the aim is to sort points in time or space into higher-level structures.
^a e-mail: marshall@hep.phy.cam.ac.uk
Figure 1 illustrates two typical pattern recognition problems in HEP. Figure 1a shows the simulated detector response to the production and hadronic decay of Higgs and Z bosons following high energy e+e− collisions at the Compact Linear Collider (CLIC). In order to extract measurements of the Higgs boson properties, such as its coupling strengths, it is vital to reconstruct and classify the individual particles in large samples of events. Figure 1b shows the simulated response of a Liquid Argon Time Projection Chamber (LAr TPC) to a charged current electron neutrino interaction. In order to understand neutrino mixing and CP-violation in the neutrino sector, it is crucial to identify and characterise each particle in this challenging topology.
The idea underpinning the Pandora SDK is that the interfaces for pattern recognition problems are well defined, as are the operations that must be performed by pattern recognition algorithms. Whoever poses the pattern recognition problem must specify the building-blocks, or space-points, that define the problem. They must also be able to extract the output structures, such as clusters, that represent the solution. The algorithms that address the problem must be able to build clusters of space-points and should be able to manipulate clusters by splitting them up or merging them together. What differs between pattern recognition problems is the precise logic controlling the algorithm operations.
2 Historical context
The Pandora project began in 2007 to provide the first particle flow calorimetry implementation for the proposed International Linear Collider (ILC). A particle flow algorithm was developed, exploiting the fine granularity detectors in order to reconstruct the paths of individual visible particles. Successful identification of the trajectories allows particle four-momenta to be extracted from the subdetector system in which they are best-measured, delivering unprecedented jet energy resolution. The Pandora algorithm was used to perform the first systematic study of the potential of this approach to calorimetry at a high energy lepton collider [1].
Fig. 1 Typical pattern recognition problems in HEP. (a) Simulated detector response to a Higgsstrahlung event at CLIC. (b) Simulated detector response to a charged current ν_e interaction in a LAr TPC
The original Pandora algorithm demonstrated sophisticated pattern recognition ideas, but, in software-engineering terms, was only a proof-of-principle implementation. It was decided to develop a fully-featured software framework for pattern recognition algorithms and to reimplement the ILC particle flow approach in this new framework. This significant software-engineering project took place in 2009–2010 and resulted in the first versions of the Pandora SDK and Pandora Linear Collider content library.
New algorithms were subsequently added to extend the pattern recognition functionality to higher energies, such as those relevant to the multi-TeV lepton collider, CLIC. Pandora was then used to provide the event reconstruction for the physics analyses described in the ILC Technical Design Report [2,3] and the CLIC Conceptual Design Report [4,5]. The performance of particle flow calorimetry at CLIC was characterised in detail [6].
The Pandora SDK was designed to be applicable to multiple pattern recognition problems. Most recently, in 2013–2015, a new library of Pandora algorithms was developed to address the problem of particle reconstruction in the challenging event topologies seen in LAr TPCs [7]. This problem is very different to that originally tackled for the ILC, but the functionality required from the pattern recognition software framework remains exactly the same.
3 Overview of the Pandora SDK
The Pandora SDK aims to provide a robust, reliable and easy-to-use environment for developing and running pattern recognition algorithms. Its Application Programming Interfaces (APIs) are designed to create an environment in which:
– It is easy for users to provide the building-blocks defining a pattern recognition problem.
– The logic required to solve pattern recognition problems is cleanly implemented in algorithms.
– All operations to access or modify building-blocks, or to create new structures, are requested by algorithms and performed by the Pandora framework.
This design strategy is well-suited to an approach using large numbers of decoupled algorithms, each of which carefully addresses specific event topologies, typically controlling the merging or splitting of clusters.
The Pandora SDK consists of a dependency-free C++ library and carefully-designed APIs. It provides a comprehensive Event Data Model (EDM) for managing pattern recognition problems. Instances of objects in the EDM are owned by Pandora Manager classes. The instances are stored in named lists and the managers are able to create new objects, delete objects, create and save new lists and move objects between lists. They provide a complete set of low-level operations that allow high-level operations requested by pattern recognition algorithms to be satisfied.
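The ownership model described above can be illustrated with a minimal sketch. Every name in it (Hit, HitManager, Create, Move, GetList) is a hypothetical stand-in chosen for illustration and is not taken from the Pandora SDK. It assumes only what the text states: a manager owns the instances, files them in named lists, and moves them between lists on request.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an EDM object; not a Pandora SDK class.
struct Hit
{
    int id;
    double energy;
};

// Hypothetical manager: owns every object and exposes named lists of
// non-owning, read-only pointers.
class HitManager
{
public:
    // Create an object, take ownership of it and file it in the named list.
    const Hit *Create(const std::string &listName, int id, double energy)
    {
        m_storage.push_back(std::make_unique<Hit>(Hit{id, energy}));
        const Hit *pHit = m_storage.back().get();
        m_lists[listName].push_back(pHit);
        return pHit;
    }

    // Move an object between named lists; throws if it is not in the source list.
    void Move(const Hit *pHit, const std::string &from, const std::string &to)
    {
        auto listIter = m_lists.find(from);
        if (listIter == m_lists.end())
            throw std::runtime_error("unknown list: " + from);

        auto &source = listIter->second;
        auto hitIter = std::find(source.begin(), source.end(), pHit);
        if (hitIter == source.end())
            throw std::runtime_error("object not in list: " + from);

        source.erase(hitIter);
        m_lists[to].push_back(pHit); // creates the target list if needed
    }

    // Read-only access to a named list; throws std::out_of_range if unknown.
    const std::vector<const Hit *> &GetList(const std::string &listName) const
    {
        return m_lists.at(listName);
    }

private:
    std::vector<std::unique_ptr<Hit>> m_storage;             // sole owner of all objects
    std::map<std::string, std::vector<const Hit *>> m_lists; // named, non-owning views
};
```

Because callers only ever receive const pointers, any change to an object or a list has to go through the manager, which is the property the paragraph above relies on.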
To use the Pandora SDK, a user must create a Pandora client application. This provides the input building-blocks to describe the pattern recognition problem and receives the final output. The pattern-recognition logic is implemented by Pandora algorithms, which ask the Pandora SDK to provide services in order to create new objects or make any changes to existing instances. Sophisticated visualisation and tree-writing monitoring functionality is available for use by algorithms. Figure 2 illustrates the typical setup for addressing pattern recognition problems with the Pandora SDK. With this setup in mind, this document will describe the key aspects of the Pandora SDK in detail.
4 Pandora event data model
Fig. 2 The software setup for addressing pattern recognition problems using the Pandora SDK (components shown: Client Application, Pandora SDK, Pandora Algorithms)
The Pandora EDM provides a mechanism for managing data describing pattern recognition problems and their possible solutions. It consists of a set of classes representing the input building-blocks for a problem and the structures that can be created using these building-blocks. A successful EDM provides a well-defined development environment for pattern recognition algorithms. It also allows for independence of the algorithms, which can only communicate via the EDM. Algorithms are then successfully encapsulated and can be developed and maintained independently. An algorithm can be implemented to merge together clusters in close proximity, for instance, without needing to know anything concerning the construction of the clusters.
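The proximity-merging example can be sketched as follows. This is an illustration under stated assumptions, not the algorithm used in any Pandora library: the types Point and ProtoCluster, the centroid-distance criterion and the parameter maxDistance are all hypothetical. The function sees only the clusters and their hits, and needs no knowledge of how the clusters were built. It assumes every cluster holds at least one hit.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not Pandora SDK classes.
struct Point
{
    double x, y, z;
};

struct ProtoCluster
{
    std::vector<Point> hits; // assumed non-empty
};

// Mean position of the hits in a cluster.
Point Centroid(const ProtoCluster &cluster)
{
    Point sum{0., 0., 0.};
    for (const Point &hit : cluster.hits)
    {
        sum.x += hit.x;
        sum.y += hit.y;
        sum.z += hit.z;
    }
    const double n = static_cast<double>(cluster.hits.size());
    return Point{sum.x / n, sum.y / n, sum.z / n};
}

double Distance(const Point &a, const Point &b)
{
    return std::sqrt((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y) + (a.z - b.z) * (a.z - b.z));
}

// Repeatedly merge the first pair of clusters whose centroids lie within
// maxDistance of each other, restarting the search after every merge because
// the centroid of the enlarged cluster has moved.
void MergeCloseClusters(std::vector<ProtoCluster> &clusters, double maxDistance)
{
    bool merged = true;
    while (merged)
    {
        merged = false;
        for (std::size_t i = 0; i < clusters.size() && !merged; ++i)
        {
            for (std::size_t j = i + 1; j < clusters.size() && !merged; ++j)
            {
                if (Distance(Centroid(clusters[i]), Centroid(clusters[j])) > maxDistance)
                    continue;

                clusters[i].hits.insert(clusters[i].hits.end(), clusters[j].hits.begin(), clusters[j].hits.end());
                clusters.erase(clusters.begin() + static_cast<std::ptrdiff_t>(j));
                merged = true;
            }
        }
    }
}
```

In this simplified sketch the function edits the container directly. In the design described in the text, the merge would instead be requested from the framework, which would carry it out and update its own bookkeeping.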
The Pandora EDM aims to be self-describing, which is to say that each object provides all the information required to allow investigation and processing by pattern recognition algorithms. This enables Pandora to be a reusable software solution, completely isolating the pattern recognition algorithms from the details of the software framework and I/O mechanism used to create or read the input building-blocks.
The building-blocks for pattern recognition in the Pandora SDK are as described below. These are Pandora “Input Objects” and are typically all created by the Pandora client application before the pattern recognition algorithms are called (see Sect. 5). These objects are completely defined when they are created and their properties cannot be changed by the algorithms. The objects are instead used to build new constructs, termed “Algorithm Objects”. The Pandora SDK monitors the usage of all the Input Objects to ensure that no double-counting can occur, with no Input Object being used to create multiple Algorithm Objects.
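One way to picture this usage monitoring is an availability record that grants a request only if every requested Input Object is still unused. The sketch below is an assumption about how such bookkeeping could look. The class UsageMonitor, its methods and the use of integer identifiers are hypothetical and do not reproduce the Pandora SDK implementation.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical bookkeeping class; not part of the Pandora SDK.
class UsageMonitor
{
public:
    // True if the input object has not yet been used in any algorithm object.
    bool IsAvailable(int inputId) const
    {
        return m_used.count(inputId) == 0;
    }

    // Claim a set of input objects for one new algorithm object.
    // All-or-nothing: if any id is already used, or appears twice in the
    // request, nothing is claimed and false is returned.
    bool Claim(const std::vector<int> &inputIds)
    {
        std::unordered_set<int> requested;
        for (const int id : inputIds)
        {
            if (!IsAvailable(id) || !requested.insert(id).second)
                return false;
        }
        m_used.insert(requested.begin(), requested.end());
        return true;
    }

    // Release input objects, for example when an algorithm object is deleted.
    void Release(const std::vector<int> &inputIds)
    {
        for (const int id : inputIds)
            m_used.erase(id);
    }

private:
    std::unordered_set<int> m_used; // ids currently held by algorithm objects
};
```

The all-or-nothing behaviour keeps the record consistent: a rejected request leaves every input object exactly as available as it was before.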
– CaloHit: The primary building-block for pattern recognition problems, a CaloHit defines a position and extent in space and time, together with an associated intensity or energy measurement. Whilst CaloHits can represent points in free space, they can also provide information regarding their location in a particle detector. This includes details of the subdetector system in which energy was deposited and information about the calorimeter readout and geometry. The CaloHits hold estimators for the electromagnetic energy or hadronic energy associated with the space-point. It is for algorithms to select the appropriate energy estimator.
– Track: Continuous trajectories of well-defined space-points are represented by Track objects. These are helix parameterisations of the space-points, providing details of particle positions and momenta (Track States) along the trajectory. These objects were originally designed to represent Tracks reconstructed in fine granularity, low material-budget tracking systems in particle detectors. As such, key information provided by Pandora Track objects includes impact parameter details and a projected Track State at the surface of the detector calorimeters. Tracks can have parent–daughter and sibling relationships in order to fully describe particle interactions and decays that can occur within a tracking detector.
– MCParticle: Primarily for development purposes, MCParticles can be provided for access by the pattern recognition algorithms. These provide full details of the true pattern recognition solution for simulated events. MCParticle instances can have parent–daughter links and can fully describe particle decay cascades in simulated interactions. The MCParticles can store details of their association (in terms of e.g. true energy deposited) with each CaloHit and Track. Using MCParticles, it is possible for algorithms to cheat some, or all, aspects of the pattern recognition, allowing a wealth of development and debugging functionality.
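As an illustration of what "cheating" the pattern recognition can mean, the sketch below groups simulated hits by the true particle recorded for each of them, which yields the ideal clustering directly. The struct SimHit, its fields and the function CheatClustering are hypothetical names, and the sketch assumes each hit stores a single main true-particle identifier, which is a simplification of the energy-weighted associations mentioned above.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical simulated hit; not a Pandora SDK class.
struct SimHit
{
    int hitId;            // identifier of the hit
    int mainMCParticleId; // assumed: id of the true particle contributing most energy
    double energy;        // measured energy of the hit
};

// Group hit ids by their true particle, giving one "perfect" cluster per particle.
std::map<int, std::vector<int>> CheatClustering(const std::vector<SimHit> &hits)
{
    std::map<int, std::vector<int>> hitIdsByParticle;
    for (const SimHit &hit : hits)
        hitIdsByParticle[hit.mainMCParticleId].push_back(hit.hitId);
    return hitIdsByParticle;
}
```

Replacing one real algorithm with a cheated stage of this kind is a common way to isolate how much each remaining stage contributes to the overall performance.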
The Pandora Algorithm Objects represent the higher-level structures created in order to solve pattern recognition problems. The Pandora SDK carefully manages the allocation and manipulation of these objects and all non-const operations can only be requested by algorithms via the Pandora APIs. The Pandora SDK is then able to perform the memory-management for the objects.
– Cluster: The main workhorse for pattern recognition algorithms, a Cluster is a collection of CaloHits. It also provides derived information describing the combined properties of the CaloHit collection, such as energy estimators and the results of linear fits to the CaloHit spatial positions. The most typical tasks for Pandora algorithms will be to create new Clusters from lists of input CaloHits or to read lists of input Clusters and selectively split or merge some Clusters.
– Vertex: The identification and classification of a specific point in space, vertices are typically used to flag positions of particle creation or decay.
– ParticleFlowObject: A container of Clusters, Tracks and Vertices, together with metadata describing the particle type and four-momentum. The Particle Flow Object (PFO) is the ultimate output of the pattern recognition, grouping the input objects into structures that completely define the solution. PFOs can have parent–daughter links in order to describe particle decay hierarchies.
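The derived Cluster properties mentioned in the list above, a summed energy estimator and a linear fit to the hit positions, can be sketched in a few lines. The names Hit2D, SketchCluster, Energy, FitLine and Absorb are hypothetical. The fit is an ordinary unweighted least-squares line in two dimensions, chosen for brevity, and is not claimed to be the fit used by the Pandora SDK.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical two-dimensional hit; not a Pandora SDK class.
struct Hit2D
{
    double x;
    double y;
    double energy;
};

// Hypothetical cluster: a collection of hits plus derived properties.
class SketchCluster
{
public:
    void AddHit(const Hit2D &hit) { m_hits.push_back(hit); }

    // Merge another (distinct) cluster into this one by taking over its hits.
    void Absorb(const SketchCluster &other)
    {
        m_hits.insert(m_hits.end(), other.m_hits.begin(), other.m_hits.end());
    }

    // Simple energy estimator: the sum of the hit energies.
    double Energy() const
    {
        double total = 0.;
        for (const Hit2D &hit : m_hits)
            total += hit.energy;
        return total;
    }

    // Unweighted least-squares fit of y = slope * x + intercept.
    // Returns false if there are fewer than two hits or all hits share one x.
    bool FitLine(double &slope, double &intercept) const
    {
        double sumX = 0., sumY = 0., sumXY = 0., sumXX = 0.;
        for (const Hit2D &hit : m_hits)
        {
            sumX += hit.x;
            sumY += hit.y;
            sumXY += hit.x * hit.y;
            sumXX += hit.x * hit.x;
        }
        const double n = static_cast<double>(m_hits.size());
        const double denominator = n * sumXX - sumX * sumX;
        if (m_hits.size() < 2 || std::abs(denominator) < 1e-12)
            return false;

        slope = (n * sumXY - sumX * sumY) / denominator;
        intercept = (sumY - slope * sumX) / n;
        return true;
    }

private:
    std::vector<Hit2D> m_hits;
};
```

Computing the derived quantities from the current hit collection means they stay consistent after a merge, at the cost of recomputation on every call. Caching them and invalidating the cache on each change would be the usual refinement.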
Instantiation of objects in the Pandora EDM follows a
design pattern that provides a clean and simple interface.