rate is increased, more distortion will be introduced so
that the image quality deteriorates. As the image content
becomes less identifiable after a large amount of data are
hidden, it is desirable to improve the quality of the
watermarked image. In this study, a new reversible data
hiding algorithm is proposed by considering the efficiency
of modifying a pair of histogram bins and using it to guide
the embedding process. To increase the embedding rate,
multiple pairs of histogram bins can be selected in
sequence and the pre-process is performed to prevent
the possible overflow and underflow of pixel values.
Embedding with the prediction errors is further investi-
gated with a new prediction scheme. In each of the four
prediction modes, a large amount of prediction errors can
be produced from the host image. Moreover, all options of
choosing a number of histogram bins with the four
possible modes are enumerated to obtain the best per-
formance. To extract the embedded data and blindly
recover the original image, a pre-computed location
map and other overhead information are saved into the
watermarked image as well. Promising experimental
results on a variety of test images are obtained. Compared
with the existing algorithms, the image content is better
preserved by the proposed one, especially for high pay-
load data hiding.
The rest of this paper is organized as follows. Section 2
presents a new data embedding algorithm by taking the
efficiency of modifying a pair of histogram bins into
consideration. In Section 3, a new prediction scheme is
designed to produce a large number of prediction errors
from the host image, and embedding with the prediction
errors is investigated. The experimental results are given
and analyzed in Section 4. Finally, some concluding
remarks are drawn in Section 5.
2. A new histogram modification algorithm for data
embedding
2.1. Considering the efficiency of histogram modification
Given a histogram h(x), which represents the number
of occurrence as the variable X assumes value x, it can be
modified to embed a number of bit values in a lossless
way. Here we assume that X only takes integer values. For
a 8-bit graylevel image I, it can be a pixel value or a
prediction error. Among the nonempty histogram bins, a
pair of bins can be selected with the corresponding values
x
l
and x
r
(x
l
o x
r
), as shown in Fig. 1(a) for instance. We
use the histogram of pixel values to illustrate the embed-
ding operations. For each pixel value x in the original
image, the embedding operation can be performed by
x
0
¼
x1ifxo x
l
,
x
l
b if x ¼ x
l
,
x if x
l
o xo x
r
,
x
r
þb if x ¼ x
r
,
xþ1ifx4 x
r
,
8
>
>
>
>
>
>
<
>
>
>
>
>
>
:
ð1Þ
where x
0
is the modified pixel value and b is the bit value to
be embedded. Here we assume that no overflow or under-
flow occurs, i.e. all the modified pixel values are within the
range of [0,255]. Normally, the number of histogram bins is
increased by 2 after the embedding, as shown in Fig. 1(b)
for instance. To extract the embedded data and restore the
original pixel values, the selected bin values x
l
and x
r
need
to be provided. Given a modified value x
0
, the recovery
operation is performed by
x ¼
x
0
þ1ifx
0
o x
l
,
x
0
if x
l
r xr x
r
,
x
0
1ifx
0
4 x
r
:
8
>
<
>
:
ð2Þ
Meanwhile, the extraction operation is performed by
b
0
¼
1ifx
0
¼ x
l
1orx
0
¼ x
r
þ1,
0ifx
0
¼ x
l
or x
0
¼ x
r
,
(
ð3Þ
where b
0
is the extracted bit value.
In the case as shown in Fig. 1, the two highest bins in the
histogram are chosen so that the capacity of one pair
embedding is maximized. However, the maximized capacity
of one pair embedding does not ensure the best perfor-
mance. As shown in Fig. 2, it may be more efficient to use
the histogram bin x
e
for data embedding instead of x
r
.
Although one more bit can be embedded with the bin x
r
,
however, more pixel values will be modified than using the
bin x
e
. When the number of modified pixel values is
increased, the quality of host image will be affected.
Fig. 1. An example of histogram modification for data embedding:
(a) original histogram h(x) and (b) modified histogram h
0
ðxÞ after data
embedding.
H.-T. Wu, J. Huang / Signal Processing 92 (2012) 3000–3009 3001