2832 Page 2 of 15 Eur. Phys. J. C (2014) 74:2832
tonically decreasing with increasing distance from u to X
i
.A
common choice of weights is the Gaussian density function
w
i
(u) =
Q
i
(2π)
3/2
h
3
exp
−
1
2h
2
(X
i
− u)
T
(X
i
− u)
, (2)
where Q
i
is the energy deposit for hit i and h is a constant
bandwidth parameter that steers the size of the local neigh-
bourhood. The weights play the role of “kernel” functions
and can, if desired, be replaced by other functions such as a
triangular-shaped or truncated probability density. In our sce-
nario, where the co-ordinates are all measured on the same
scale, we keep the Gaussian form and use the same band-
width parameter for all three directions. The value of h can
be selected through a coverage measure [3], though for our
purposes there is not much reason for this, as roughly the
same bandwidth, h ∼ 0.05 after scaling, will be usable in a
wide range of liquid argon detectors. Note that the normal-
isation denominator (2π)
3/2
h
3
can be left out of the kernel
function since it is a constant common factor for all hit points
and has no effect on the properties of the principal curve.
From Eq. 1, we define the mean shift as
s(u) = m(u) − u =
N
i=1
w
i
(u)(X
i
− u)
N
i=1
w
i
(u)
. (3)
This quantity has many interesting properties [4], one
of which being that s(u) ∝∇
ˆ
f (u)/
ˆ
f (u), where
ˆ
f (u) =
1
N
N
i=1
w
i
(u) is a density estimate of f at u. This implies
that the mean shift is a vector pointing into a denser direc-
tion of the data space. When carried out iteratively, starting
at u = m
0
, one can show [4] that the series of local means
m
+1
= m
+ s(m
), ≥ 0, (4)
converges to a local mode u
m
of
ˆ
f (u) where s(u
m
) = 0. This
has the attractive property of being a clustering technique; a
trajectory can be formed by running the mean shift proce-
dure on each data point X
i
iteratively until convergence is
achieved.
Though the convergence towards a local mode of the den-
sity is an appealing property, it has the negative side effect of
getting trapped at the local modes and will not move beyond
them. Therefore, some modification of Eq. 4 is needed which
ensures that particle trajectories are pursued beyond local
modes. The simple idea is to alternate the mean shift with
a local principal component step [3]. More specifically, let
γ(u) be the normalised eigenvector corresponding to the
largest eigenvalue of the local symmetric 3 × 3 covariance
matrix
(u) =
1
N
i=1
w
i
(u)
N
i=1
w
i
(u)(X
i
− u)(X
i
− u)
T
. (5)
Starting from a given point u = m
0
,weset = 0 and iterate
between
1. computing the local centre of mass:
m(u
) ≡ u
+ s(u
); (6)
2. finding the next local neighbourhood location:
u
+1
= m(u
) + t × γ
, (7)
where t is a given step size (of the same order as h) and
γ
≡ γ(u
). The local principal curve is then defined as the
series of local centres of mass m(u
). In our case, the starting
point u = m
0
is chosen to be the position of the nearest hit to
the energy-weighted centroid of all of the hits. Alternatively,
m
0
can be set either at random from the X
i
points, set by
hand, or be chosen to be a local density mode using an initial
mean shift procedure as outlined in Refs. [3,5]. The above
iteration is repeated until either the required number of lpc
points (N
p
) is obtained, or the path length along the local
curve is no longer increasing (convergence).
As we will see later, the angle φ between the normalised
eigenvector γ
and the preceeding eigenvector γ
−1
can be
used to infer the presence of feature points, corresponding
to drops in the angle profile along the principal curve, which
provide evidence for particle decays or possible interactions
between particles.
In order to provide inertia for reducing the chance of the
local principal curve deviating too much from the general
direction of neighbouring points, γ
is multiplied by an angle
penalisation term a =|cosφ|
α
, where α is usually set to 2,
when the next local neighbourhood location is found using
Eq. 7:
γ
:= aγ
+ (1 − a)γ
−1
. (8)
A technicality to be mentioned is that, for a given (u),the
first eigenvector may equally well be −γ(u) as well as γ(u),
so for each ≥ 1, one needs to check whether cosφ>0 and
set γ
:= −γ
otherwise [3].
Based on asymptotic considerations [5], it can be shown
that the sequence of lpc points m(u
) converges to a point
u
b
close to the boundary of the cloud of data points with
the property f (u
b
) = h|∇
ˆ
f (u
b
)|. In practical terms, conver-
gence is reached when the cumulative path length difference
between neighbouring local curve points, divided by their
sum, is below a chosen threshold typically set at 10
−6
:
R =
λ
− λ
−1
λ
+ λ
−1
< 10
−6
, (9)
where λ
= λ
−1
+|m(u
) − m(u
−1
)| and λ
0
= 0. In
order to pick up features that may be present in the tails
123