concentrate on motor imagery based BCI systems, which are a type
of ERD/S-based BCI.
Motor imagery based BCIs were developed to help people with
disabilities communicate with the outside world. EEG has
proven effective in motor imagery based BCI owing to its
lightweight equipment, low cost and high temporal resolution
Curran and Stokes (2003). However, EEG-based BCI faces several
challenges: task-relevant features are difficult to extract
because of low specificity, and the signals are vulnerable to
volume conduction effects, non-stationary, and prone to noise
Wolpaw et al. (2000). Another problem posed by EEG signals is
that they vary from person to person and session to session
Krauledat et al. (2007). Spatial filtering has been introduced
to discriminate between motor imagery signals using multichannel
EEG. The objective of this filtering is to transform
the multichannel EEG signals into a small set of channels that
are useful for distinguishing between the different brain activities
Vidaurre et al. (2011); Tangwiriyasakul et al. (2013).
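The transformation described above can be illustrated with a minimal sketch. It assumes a trial stored as a channels-by-samples array and a filter matrix whose columns are spatial filters. The function names, the toy data and the normalised log-variance feature are illustrative assumptions, not taken from the cited works.

```python
import numpy as np

def apply_spatial_filters(trial, filters):
    """Project one trial (channels x samples) onto a few virtual channels.

    filters has shape (channels, n_filters); each column is one spatial filter.
    """
    return filters.T @ trial

def log_variance_features(projected):
    """Normalised log-variance of each virtual channel (a band-power style feature)."""
    var = projected.var(axis=1)
    return np.log(var / var.sum())

# Toy data (assumption): 8 channels, 500 samples, 2 hypothetical random filters.
rng = np.random.default_rng(0)
trial = rng.standard_normal((8, 500))
filters = rng.standard_normal((8, 2))

projected = apply_spatial_filters(trial, filters)
features = log_variance_features(projected)
print(projected.shape, features.shape)  # (2, 500) (2,)
```

In practice the filters would be learned from labelled training trials rather than drawn at random; the random matrix here only demonstrates the reduction from eight channels to two.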
2. Related Work
Common Spatial Pattern (CSP) has proven to be very efficient
in the context of motor imagery based BCI Tangermann et al.
(2012); Ramoser et al. (2000); Blankertz et al. (2008, 2004,
2006). Ideally, CSP computes the filters that maximize the
ratio of variance between the brain activities, details of which
are explained in subsection 3.1. However, CSP is vulnerable
to noise and prone to overfitting Reuderink and Poel (2008);
Grosse-Wentrup et al. (2009). To overcome these shortcomings,
many methods have been proposed in the literature Blankertz
et al. (2008); Lu et al. (2009); Kang et al. (2009); Lotte and
Guan (2010). Lotte and Guan (2011) present a detailed comparison
of these regularization techniques. Dornhege et al.
(2004a) in their pioneering work extended CSP to the multiclass
case using methods such as one-versus-rest, pair-wise and
simultaneous diagonalization. A major drawback in extending CSP to
the multiclass case is that the regularization methods proposed for
two classes are not effective, so multiclass accuracies are
substantially lower. To extend the original CSP algorithm to more
than two classes, Naeem et al. (2006), Brunner et al. (2007),
Grosse-Wentrup and Buss (2008), Gouy-Pailler et al. (2008),
Gouy-Pailler et al. (2010) and Wei et al. proposed different
methods using Joint Approximate Diagonalization (JAD). What these
approaches have in common is the assumption that, for the two-class
problem, CSP is equivalent to finding independent common components.
Naeem et al. (2006) and Brunner et al. (2007) investigated different
Independent Component Analysis (ICA) algorithms for
finding features and components, including Infomax Bell and
Sejnowski (1995), FastICA Hyvarinen (1999) and SOBI Belouchrani
et al. (1997). They compared the performance of ICA-based
methods with that of CSP methods that are variants of
One-versus-the-Rest and Pair-Wise
Dornhege et al. (2004b). They concluded that, overall,
CSP methods perform better than ICA-based methods. They
also stated that, among the ICA-based methods,
Infomax achieved better performance than the others.
Grosse-Wentrup and Buss (2008) proved mathematically that
CSP by JAD is equivalent to ICA and also proposed an
information-theoretic approach for feature extraction.
Gouy-Pailler et al., in their recent work
Gouy-Pailler et al. (2008), Gouy-Pailler et al. (2010), proposed
a maximum likelihood method for finding spatial filters, which
is an extension of the JAD method. They claimed that their method
is a neurophysiologically adapted version of JAD. They validated
their approach on Dataset 2a of the BCI Competition IV,
which has nine subjects (they used only eight of the
nine) and four motor-imagery tasks, and reported that their newly
proposed JAD method achieved better classification accuracy
than the CSP method. They also claimed that CSP with JAD is
not significantly better than the CSP methods that are variants
of One-versus-the-Rest and Pair-Wise, as reported in the findings
of Grosse-Wentrup and Buss (2008). In a separate work,
Wei et al. used quadratic optimization to find common spatial
patterns for multi-class BCI problems. However, they
conducted experiments on their own dataset instead
of on more widely used datasets, so it is difficult to compare their
results with previously reported results. Apart from these
two methods, there is some interesting work using a subspace
method by Ramoser et al. (2000), who propose
Union-based common principal components (UCPC): a subspace is
created for each class of data from its covariance matrix, and
the union of all the subspaces is used as the common principal
components. However, a drawback of this method is that the chosen
principal components may not contribute to some data classes
at all.
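For orientation, the variance-ratio criterion behind two-class CSP, mentioned at the start of this section, can be sketched as follows. This is a minimal sketch of the standard whitening-plus-eigendecomposition formulation, not the implementation of any cited work. The function names, the trace normalisation and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def normalized_covariance(trial):
    """Trace-normalised spatial covariance of one trial (channels x samples)."""
    cov = trial @ trial.T
    return cov / np.trace(cov)

def csp_filters(trials_a, trials_b, n_pairs=1):
    """Return CSP filters as columns (channels x 2*n_pairs) and their eigenvalues.

    trials_a, trials_b: arrays of shape (n_trials, channels, samples).
    """
    cov_a = np.mean([normalized_covariance(t) for t in trials_a], axis=0)
    cov_b = np.mean([normalized_covariance(t) for t in trials_b], axis=0)

    # Whiten the composite covariance cov_a + cov_b.
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whitening = evecs @ np.diag(evals ** -0.5) @ evecs.T

    # In the whitened space the two class covariances sum to the identity,
    # so the eigenvectors of one class diagonalise both.
    s_a = whitening @ cov_a @ whitening.T
    lam, u = np.linalg.eigh(s_a)      # eigenvalues in ascending order
    w = whitening.T @ u               # columns are spatial filters

    # Keep the filters at both ends of the spectrum: smallest eigenvalues
    # favour class B variance, largest favour class A variance.
    idx = np.r_[np.arange(n_pairs), np.arange(len(lam) - n_pairs, len(lam))]
    return w[:, idx], lam[idx]

# Synthetic data (assumption): class A is strong on channel 0, class B on channel 1.
rng = np.random.default_rng(0)
scale_a = np.array([3.0, 1.0, 1.0, 1.0])[:, None]
scale_b = np.array([1.0, 3.0, 1.0, 1.0])[:, None]
trials_a = rng.standard_normal((20, 4, 200)) * scale_a
trials_b = rng.standard_normal((20, 4, 200)) * scale_b

filters, eigvals = csp_filters(trials_a, trials_b)
print(filters.shape)  # (4, 2)
```

Each eigenvalue is the share of the whitened variance attributed to class A, so values near 0 or 1 indicate the most discriminative filters. The regularized variants discussed above modify the covariance estimates or the objective; none of that is shown here.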
In summary, multi-class BCI systems can be broken down
into two main groups. The first extends the two-class
methods for which the original CSP was designed, while
the second relies on JAD. The key idea of the two-class
CSP-based methods is to break the multi-class
problem down into several two-class classification problems. The
two popular methods are One-versus-the-Rest and a combination
of pairs of two-class classification problems (Pair-Wise).
Each of these two methods has its own weakness. The first
method assumes that the covariance matrices of all the
other classes are almost identical. However, this assumption
rarely holds for real-world data. In the latter, there is no guarantee
of obtaining CSPs that are optimal for the different pairs of classes.
This is similar to grouping common principal components of
pairs of classes and then finding CSPs.
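The two decompositions summarised above can be sketched as follows. The sketch assumes each class is represented by a single covariance matrix and that the "rest" in One-versus-the-Rest is formed by averaging the covariances of the other classes, which is one common choice rather than the only one. The class labels and helper names are hypothetical.

```python
import numpy as np
from itertools import combinations

def one_versus_rest_problems(class_covs):
    """Pair each class covariance with the mean covariance of all other classes."""
    problems = {}
    for label, cov in class_covs.items():
        rest = [c for other, c in class_covs.items() if other != label]
        problems[label] = (cov, np.mean(rest, axis=0))
    return problems

def pairwise_problems(class_covs):
    """One two-class problem per unordered pair of classes."""
    return {(a, b): (class_covs[a], class_covs[b])
            for a, b in combinations(class_covs, 2)}

# Toy covariances for four hypothetical motor-imagery classes (assumption).
labels = ["left", "right", "feet", "tongue"]
class_covs = {name: np.eye(3) * (i + 1) for i, name in enumerate(labels)}

ovr = one_versus_rest_problems(class_covs)
pw = pairwise_problems(class_covs)
print(len(ovr), len(pw))  # 4 6
```

Averaging the remaining classes into a single "rest" covariance is where the near-identical-covariance assumption enters: if those classes differ strongly, the average represents none of them well. The Pair-Wise route avoids that, but the number of two-class problems grows quadratically with the number of classes.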
The non-stationary nature of EEG data presents a major problem,
which is due to intra- and inter-session variation in
the same task. This results in very low accuracies in the
classification phase, due to poor modeling of the data. Recently, fuzzy
inference systems have been used with EEG and have proved to be
effective for classification. As shown in Guler and Ubeyli
(2005); Coyle et al. (2009); Fabien et al. (2007); Subasi (2007);
Yang et al. (2014), type-1 inference systems have been employed
for classification of EEG signals. However, type-1 fuzzy
sets are inefficient in handling the non-stationarity of EEG signals,
as pointed out by Herman et al. (2006). Type-2 fuzzy sets, which
were proposed by Zadeh (1975), along with an uncertainty framework
can be used to handle non-stationarity, as shown in Mendel
and John (2002). Type-2 sets are computationally very expen-