This is the paper that first proposed SSIM, a very important evaluation method in the image field.
Image Quality Assessment: From Error Visibility to Structural Similarity
Article in IEEE Transactions on Image Processing · May 2004
DOI: 10.1109/TIP.2003.819861 · Source: IEEE Xplore
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 13, NO. 4, APRIL 2004 1
Image Quality Assessment: From Error Visibility to Structural Similarity
Zhou Wang, Member, IEEE, Alan C. Bovik, Fellow, IEEE, Hamid R. Sheikh, Student Member, IEEE, and Eero P. Simoncelli, Senior Member, IEEE
Abstract— Objective methods for assessing perceptual image quality have traditionally attempted to quantify the visibility of errors between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a Structural Similarity Index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.¹
Keywords— Error sensitivity, human visual system (HVS), image coding, image quality assessment, JPEG, JPEG2000, perceptual quality, structural information, structural similarity (SSIM).
I. Introduction
Digital images are subject to a wide variety of distortions during acquisition, processing, compression, storage, transmission and reproduction, any of which may result in a degradation of visual quality. For applications in which images are ultimately to be viewed by human beings, the only “correct” method of quantifying visual image quality is through subjective evaluation. In practice, however, subjective evaluation is usually too inconvenient, time-consuming and expensive. The goal of research in objective image quality assessment is to develop quantitative measures that can automatically predict perceived image quality.
An objective image quality metric can play a variety of roles in image processing applications. First, it can be used to dynamically monitor and adjust image quality. For example, a network digital video server can examine the quality of video being transmitted in order to control and allocate streaming resources. Second, it can be used to optimize algorithms and parameter settings of image processing systems. For instance, in a visual communication system, a quality metric can assist in the optimal design of prefiltering and bit assignment algorithms at the encoder and of optimal reconstruction, error concealment and postfiltering algorithms at the decoder. Third, it can be used to benchmark image processing systems and algorithms.
The work of Z. Wang and E. P. Simoncelli was supported by the Howard Hughes Medical Institute. The work of A. C. Bovik and H. R. Sheikh was supported by the National Science Foundation and the Texas Advanced Research Program. Z. Wang and E. P. Simoncelli are with the Howard Hughes Medical Institute, the Center for Neural Science and the Courant Institute for Mathematical Sciences, New York University, New York, NY 10012 USA (email: zhouwang@ieee.org; eero.simoncelli@nyu.edu). A. C. Bovik and H. R. Sheikh are with the Laboratory for Image and Video Engineering (LIVE), Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712 USA (email: bovik@ece.utexas.edu; hamid.sheikh@ieee.org).
¹ A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
Objective image quality metrics can be classified according to the availability of an original (distortion-free) image, with which the distorted image is to be compared. Most existing approaches are known as full-reference, meaning that a complete reference image is assumed to be known. In many practical applications, however, the reference image is not available, and a no-reference or “blind” quality assessment approach is desirable. In a third type of method, the reference image is only partially available, in the form of a set of extracted features made available as side information to help evaluate the quality of the distorted image. This is referred to as reduced-reference quality assessment. This paper focuses on full-reference image quality assessment.
The simplest and most widely used full-reference quality metric is the mean squared error (MSE), computed by averaging the squared intensity differences of distorted and reference image pixels, along with the related quantity of peak signal-to-noise ratio (PSNR). These are appealing because they are simple to calculate, have clear physical meanings, and are mathematically convenient in the context of optimization. But they are not very well matched to perceived visual quality (e.g., [1]–[9]). In the last three decades, a great deal of effort has gone into the development of quality assessment methods that take advantage of known characteristics of the human visual system (HVS). The majority of the proposed perceptual quality assessment models have followed a strategy of modifying the MSE measure so that errors are penalized in accordance with their visibility. Section II summarizes this type of error-sensitivity approach and discusses its difficulties and limitations. In Section III, we describe a new paradigm for quality assessment, based on the hypothesis that the HVS is highly adapted for extracting structural information. As a specific example, we develop a measure of structural similarity that compares local patterns of pixel intensities that have been normalized for luminance and contrast. In Section IV, we compare the test results of different quality assessment models against a large set of subjective ratings gathered for a database of 344 images compressed with JPEG and JPEG2000.
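As a concrete illustration of the MSE and PSNR definitions given above, the following is a minimal sketch in plain Python. It assumes 8-bit grayscale images stored as lists of rows; the function names `mse` and `psnr`, the `peak` parameter, and the sample pixel values are illustrative choices, not taken from the paper.

```python
import math

def mse(reference, distorted):
    """Average of the squared intensity differences over all pixels."""
    total = 0.0
    count = 0
    for ref_row, dis_row in zip(reference, distorted):
        for r, d in zip(ref_row, dis_row):
            total += (r - d) ** 2
            count += 1
    return total / count

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit pixels."""
    err = mse(reference, distorted)
    if err == 0:
        return math.inf  # identical images: no error power at all
    return 10.0 * math.log10(peak ** 2 / err)

# Hypothetical 3x3 image patches, used only to exercise the functions
reference = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
distorted = [[54, 55, 60], [59, 75, 61], [76, 65, 59]]

print("MSE :", mse(reference, distorted))
print("PSNR:", psnr(reference, distorted), "dB")
```

Both quantities depend only on pixel-wise differences, which is what makes them simple to compute and convenient for optimization.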
[Figure 1: block diagram. The reference signal and the distorted signal pass through Pre-processing, CSF Filtering, Channel Decomposition (into multiple channels), Error Normalization and Error Pooling to produce a Quality/Distortion Measure.]
Fig. 1. A prototypical quality assessment system based on error sensitivity. Note that the CSF feature can be implemented either as a separate stage (as shown) or within “Error Normalization”.
II. Image Quality Assessment Based on Error Sensitivity
An image signal whose quality is being evaluated can be thought of as a sum of an undistorted reference signal and an error signal. A widely adopted assumption is that the loss of perceptual quality is directly related to the visibility of the error signal. The simplest implementation of this concept is the MSE, which objectively quantifies the strength of the error signal. But two distorted images with the same MSE may have very different types of errors, some of which are much more visible than others. Most perceptual image quality assessment approaches proposed in the literature attempt to weight different aspects of the error signal according to their visibility, as determined by psychophysical measurements in humans or physiological measurements in animals. This approach was pioneered by Mannos and Sakrison [10], and has been extended by many other researchers over the years. Reviews on image and video quality assessment algorithms can be found in [4], [11]–[13].
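The observation that equal MSE does not imply equal error type can be checked with a small numerical sketch. The signal below is a hypothetical one-dimensional row of pixels, and the two distortions (a constant shift and an alternating-sign pattern) are illustrative choices made here, not examples from the paper.

```python
def mse(reference, distorted):
    """Mean squared error between two equal-length pixel rows."""
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)

# Hypothetical smooth ramp of pixel intensities
reference = [100, 102, 104, 106, 108, 110, 112, 114]

# Distortion A: every pixel is raised by 4 (a uniform brightness shift)
shifted = [p + 4 for p in reference]

# Distortion B: the error alternates between +4 and -4 from pixel to pixel
alternating = [p + (4 if i % 2 == 0 else -4) for i, p in enumerate(reference)]

print(mse(reference, shifted))      # 16.0
print(mse(reference, alternating))  # 16.0
print(shifted)
print(alternating)
```

The two error signals have the same energy, yet one leaves the ramp intact while the other replaces it with a high-frequency pattern. MSE alone cannot tell them apart.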
A. Framework
Fig. 1 illustrates a generic image quality assessment framework based on error sensitivity. Most perceptual quality assessment models can be described with a similar diagram, although they differ in detail. The stages of the diagram are as follows:
Pre-processing. This stage typically performs a variety of basic operations to eliminate known distortions from the images being compared. First, the distorted and reference signals are properly scaled and aligned. Second, the signal might be transformed into a color space (e.g., [14]) that is more appropriate for the HVS. Third, quality assessment metrics may need to convert the digital pixel values stored in the computer memory into luminance values of pixels on the display device through pointwise nonlinear transformations. Fourth, a low-pass filter simulating the point spread function of the eye optics may be applied. Finally, the reference and the distorted images may be modified using a nonlinear point operation to simulate light adaptation.
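The third pre-processing step, a pointwise nonlinear mapping from stored pixel values to display luminance, can be sketched as follows. The paper does not specify the mapping; this sketch assumes a simple power-law (gamma) display model, and the exponent 2.2, the peak luminance of 100 cd/m², and the function names are all hypothetical values chosen for illustration.

```python
def pixel_to_luminance(value, gamma=2.2, max_luminance=100.0, peak=255.0):
    """Map one stored pixel value to display luminance (assumed power-law model)."""
    return max_luminance * (value / peak) ** gamma

def image_to_luminance(image, **display_params):
    """Apply the pointwise mapping independently to every pixel of an image."""
    return [[pixel_to_luminance(v, **display_params) for v in row] for row in image]

# Hypothetical 2x2 patch of 8-bit pixel values
patch = [[0, 64], [128, 255]]
for row in image_to_luminance(patch):
    print([round(v, 2) for v in row])
```

Because the mapping is applied to each pixel on its own, the same function would be applied to the reference and the distorted image before any later stage compares them.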
CSF Filtering. The contrast sensitivity function (CSF) describes the sensitivity of the HVS to different spatial and temporal frequencies that are present in the visual stimulus. Some image quality metrics include a stage that weights the signal according to this function (typically implemented using a linear filter that approximates the frequency response of the CSF). However, many recent metrics choose to implement CSF as a base-sensitivity normalization factor after channel decomposition.
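To make the idea of frequency-dependent weighting concrete, the sketch below evaluates one closed-form CSF approximation that is widely attributed to Mannos and Sakrison. The paper itself does not give a formula, so the specific constants, the function names, and the sample frequencies are assumptions made here for illustration; mapping image frequencies to cycles per degree would additionally depend on viewing conditions, which are not modeled.

```python
import math

def csf_weight(f):
    """Relative contrast sensitivity at spatial frequency f (cycles/degree).

    Assumed form: 2.6 * (0.0192 + 0.114 f) * exp(-(0.114 f)^1.1).
    """
    return 2.6 * (0.0192 + 0.114 * f) * math.exp(-((0.114 * f) ** 1.1))

def apply_csf(components):
    """Weight (frequency, amplitude) pairs by the CSF; a stand-in for a linear filter."""
    return [(f, amplitude * csf_weight(f)) for f, amplitude in components]

# Hypothetical frequency components, all with unit amplitude
components = [(1.0, 1.0), (4.0, 1.0), (8.0, 1.0), (16.0, 1.0), (30.0, 1.0)]
for f, weighted in apply_csf(components):
    print(f, round(weighted, 3))
```

Under this approximation the weights are band-pass: mid-range frequencies receive larger weights than either very low or very high ones.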
Channel Decomposition. The images are typically separated into subbands (commonly called “channels” in the psychophysics literature) that are selective for spatial and temporal frequency as well as orientation. While some quality assessment methods implement sophisticated channel decompositions that are believed to be closely related to the neural responses in the primary visual cortex [2], [15]–[19], many metrics use simpler transforms such as the discrete cosine transform (DCT) [20], [21] or separable wavelet transforms [22]–[24]. Channel decompositions tuned to various temporal frequencies have also been reported for video quality assessment [5], [25].
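A separable wavelet decomposition can be sketched with the Haar wavelet, chosen here only because it is the simplest case; it is not claimed to be the transform used in the cited metrics. The sketch performs one level of an unnormalized (averaging and differencing) Haar transform on an image with even dimensions. The function names and the subband labels (first letter: filter along rows, second letter: filter along columns) are naming conventions assumed for this example.

```python
def haar_1d(signal):
    """One Haar level on an even-length list: pairwise averages and differences."""
    low = [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]
    return low, high

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def filter_columns(half):
    """Apply haar_1d down every column and return (low, high) as row-major images."""
    lows, highs = [], []
    for col in transpose(half):
        lo, hi = haar_1d(col)
        lows.append(lo)
        highs.append(hi)
    return transpose(lows), transpose(highs)

def haar_2d(image):
    """Separable one-level decomposition: rows first, then columns."""
    row_low, row_high = [], []
    for row in image:
        lo, hi = haar_1d(row)
        row_low.append(lo)
        row_high.append(hi)
    ll, lh = filter_columns(row_low)
    hl, hh = filter_columns(row_high)
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}

# Hypothetical 4x4 image made of four flat 2x2 blocks
image = [[10, 10, 20, 20],
         [10, 10, 20, 20],
         [30, 30, 40, 40],
         [30, 30, 40, 40]]
for name, band in haar_2d(image).items():
    print(name, band)
```

For this piecewise-flat input, all detail subbands are zero and the `LL` band holds the block averages, which shows how each channel isolates a different frequency and orientation range.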
Error Normalization. The error (difference) between the decomposed reference and distorted signals in each channel is calculated and normalized according to a certain masking model, which takes into account the fact that the presence of one image component will decrease the visibility of another image component that is proximate in spatial or temporal location, spatial frequency, or orientation. The normalization mechanism weights the error signal in a channel by a space-varying visibility threshold [26]. The visibility threshold at each point is calculated based on the energy of the reference and/or distorted coefficients in a neighborhood (which may include coefficients from within a spatial neighborhood of the same channel as well as other channels) and the base-sensitivity for that channel. The normalization process is intended to convert the error into units of just noticeable difference (JND). Some methods also consider the effect of contrast response saturation (e.g., [2]).
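The normalization step can be sketched for a single one-dimensional channel. The text above describes the ingredients (neighborhood energy and a base sensitivity) but not a specific formula, so the threshold rule used here, `base_threshold + masking_gain * sqrt(local energy)`, is a hypothetical stand-in, as are all parameter names and values. Only the reference coefficients within the same channel are used for the energy estimate.

```python
import math

def normalize_errors(ref, dis, base_threshold=1.0, masking_gain=0.5, radius=1):
    """Divide each coefficient error by a space-varying visibility threshold.

    The threshold grows with the local energy of the reference coefficients,
    so the same raw error counts for less in a busy neighborhood (assumed model).
    """
    normalized = []
    for k in range(len(ref)):
        lo = max(0, k - radius)
        hi = min(len(ref), k + radius + 1)
        neighborhood = ref[lo:hi]
        energy = sum(c * c for c in neighborhood) / len(neighborhood)
        threshold = base_threshold + masking_gain * math.sqrt(energy)
        normalized.append((ref[k] - dis[k]) / threshold)
    return normalized

# Hypothetical coefficients: the raw error is 1 everywhere in both cases
flat_ref, flat_dis = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
busy_ref, busy_dis = [10.0, -10.0, 10.0], [11.0, -9.0, 11.0]

print(normalize_errors(flat_ref, flat_dis))
print(normalize_errors(busy_ref, busy_dis))
```

In this toy model the identical raw error yields a smaller normalized error in the high-energy neighborhood, which is the masking behavior the paragraph describes.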
Error Pooling. The final stage of all quality metrics must combine the normalized error signals over the spatial extent of the image, and across the different channels, into a single value. For most quality assessment methods, pooling takes the form of a Minkowski norm:
$$E(\{e_{l,k}\}) = \left( \sum_{l} \sum_{k} |e_{l,k}|^{\beta} \right)^{1/\beta} \qquad (1)$$
where $e_{l,k}$ is the normalized error of the $k$-th coefficient in the $l$-th channel, and $\beta$ is a constant exponent typically chosen to lie between 1 and 4. Minkowski pooling may be performed over space (index $k$) and then over frequency (index $l$), or vice-versa, with some non-linearity between them, or possibly with different exponents $\beta$. A spatial
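The Minkowski pooling of Eq. (1) can be written out directly. This is a minimal sketch that pools jointly over all channels and coefficients with a single exponent; the nested-list layout `errors[l][k]`, the function name, and the sample values are assumptions made for illustration.

```python
def minkowski_pool(errors, beta=2.0):
    """Pool normalized errors e[l][k] into one value, following Eq. (1)."""
    total = sum(abs(e) ** beta for channel in errors for e in channel)
    return total ** (1.0 / beta)

# Hypothetical normalized errors: two channels with two coefficients each
errors = [[3.0, 0.0], [0.0, -4.0]]

print(minkowski_pool(errors, beta=1.0))  # 7.0, the sum of absolute errors
print(minkowski_pool(errors, beta=2.0))  # 5.0, the Euclidean norm
print(minkowski_pool(errors, beta=4.0))
```

As the exponent grows, the pooled value is pulled toward the largest individual error, so the choice of exponent controls how strongly the worst errors dominate the final score.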