Parallel-computing-based implementation of
fast algorithms for discrete Gabor transform
ISSN 1751-9675
Received on 18th July 2014
Revised on 28th January 2015
Accepted on 30th March 2015
doi: 10.1049/iet-spr.2014.0300
www.ietdl.org
Chen Lin
1
✉
, Liang Tao
1
, Hon Keung Kwan
2
1
MOE Key Laboratory of Intelligent Computing and Signal Processing, School of Computer Science and Technology, Anhui University,
Hefei, Anhui, 230039, People’s Republic of China
2
Department of Electrical and Computer Engineering, University of Windsor, 401 Sunset Avenue, Windsor, Ontario, Canada, N9B 3P4
✉ E-mail: saltedlin@sina.com
Abstract: Parallel-computing-based implementation of the two recent fast para llel algorithms for the discrete Gabor
transform (DGT) is presented in this paper. First of all, the first existing block time-recursive DGT algorithm with
parallel lattice structure is analysed, and then an improved implementation method under a parallel computing
environment is presented. Each parallel channel (i.e. process in parallel computing) in the improved method is
independent, thereby reducing the interprocess communication by 99.2% on average ov er the original algorithm.
Second, the second existing fast parallel DGT algorithm based on multirate fi ltering i s analysed. Through the use of
parallel computing, the communication overhead of the multirate filtering-based parallel DGT algorithm is optimised
and its time efficiency is raised from 31.26 times to 54.52 times faster than the serial fast DGT algorithm in process ing
of long sequences. Fin ally, the experimental results are compared and analysed, which indicate that the proposed fas t
DGT implementation methods are attractive for real-time signal processing.
1 Introduction
Gabor transform [1] is a useful time–frequency analysis method.
However, real-time applications of this method are limited because
of computational complexity of the discrete Gabor transform
(DGT) and expansion. Thus, various approaches [2–10] such as
the biorthogonal analysis [2, 3], the quasi-orthogonal analysis [4],
the framework theory-based method [5], the fast Fourier transform-
based method [6] and others [7–10] have been proposed. However,
the efficiency of these approaches remains limited. In 1997, Lu
developed a parallel algorithm [11] called the parallel lattice
structure of block time-recursive DGT, which was later improved
by Tao and Kwan for the case of complex DGT [12] and for the
case of real-valued DGT [13]. In 2012, the fast parallel algorithms
for discrete Gabor expansion and transform based on multirate
filtering [14–16] were proposed. Compared with the
one-dimensional (1D) parallel lattice structure-based parallel DGT
algorithms in [11, 12], the 1D multirate filtering-based parallel
algorithms have increased parallelism and higher computational
efficiency [14], which remains the case for 1D DCT-based
real-valued DGT [15] and two-dimensional DHT-based real-valued
DGT [16]. So far, [12–16] are based on theoretical and simulative
analyses, and their experimental analyses under parallel-computing
environment have not yet been conducted.
In this paper, we shall implement and analyse fast parallel
algorithms for the DGT under parallel-computing environment.
The parallel lattice structure of block time-recursive DGT
algorithms advanced in [11, 12] ensures efficient computation in a
single channel (i.e. process in parallel computing). However, these
algorithms consume extra time in real parallel – computing
environments because of their recursive processes and a large
number of interprocess communications that have not been
considered in the existing parallel DGT algorithms [11, 12]. Thus,
an improved implementation method for the parallel lattice
structure of the block time-recursive DGT algorithm based on [12]
is presented in this paper. The recursive form of the original block
time-recursive DGT algorithm will be changed into an iterative
form, which makes all the parallel processes independent of one
another. Thus, the DGT coefficients can be obtained without
requiring interprocess communication. In this way, although the
time complexity of a single process is identical to that in the
original block time-recursive DGT algorithms [11, 12], its overall
efficiency will be improved. In addition, compared with the block
time-recursive DGT algorithm with the parallel lattice structure,
the multirate-based fast parallel DGT algorithms [14–16] all have
lower communication overhead and higher efficiency. Moreover,
the interprocess communication of the multirate-based parallel
DGT algorithm for the complex DGT of [14] can be further
optimised to improve computing efficiency through the use of
hybrid MPI and OpenMP model [17]. The above two
implementation methods proposed in this paper have been
programmed using C programming language and simulated using
a parallel computer with cluster-based hybrid architecture [18]
which will be described in Section 5.
The rest of the paper is organised as follows: In Section 2, the
DGT is reviewed. In Section 3, the parallel lattice structure of
block time-recursive DGT is introduced and its improved parallel
implementation method is presented. In Section 4, the fast parallel
DGT algorithm based on multirate filtering is briefly reviewed. Its
parallel implementations are analysed and then redesigned through
communication optimisation. In Section 5, the parallel computing
experiments and results are given. Finally, a conclusion is given in
Section 6.
2 Review of DGT
Let x[k] denote a real discrete-time signal with a period L, the
discrete Gabor expansion is de fined by
x[k] =
M−1
m=0
N−1
n=0
a[m, n]h[k − mN] exp ( j
0
2
p
nk/N) (1)
where j
0
=
−1
√
, and the coefficients a[m, n] can be obtained by
a[m, n] =
L−1
k=0
x[k]
g
[k − mN ] exp (−j
0
2
p
nk/N) (2)
IET Signal Processing
Research Article
IET Signal Process., pp. 1–7
1
&
The Institution of Engineering and Technology 2015