Reorganized DCT-based image representation for reduced reference stereoscopic image quality assessment


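This paper, published in the journal Neurocomputing, is titled "Reorganized DCT-based image representation for reduced reference stereoscopic image quality assessment". It proposes a novel reduced reference stereoscopic image quality assessment (RR SIQA) method that characterizes the statistical properties of a stereoscopic image in the reorganized discrete cosine transform (RDCT) domain. Stereoscopic image quality assessment (SIQA) is an important and challenging problem in stereoscopic image processing and evaluation. SIQA methods fall into three classes: no-reference, full-reference, and reduced-reference. No-reference methods evaluate quality without the original image, full-reference methods require the original image as a reference, and reduced-reference methods use only partial information about the original image. The reduced-reference approach is more practical in real applications because it greatly reduces the amount of information that must be transmitted and stored.

The main contribution is an RDCT-based method. The left view, the right view, and the difference image between them are each decomposed by a block-based discrete cosine transform (DCT), and the DCT coefficients are then reorganized into a three-level coefficient tree, forming ten RDCT subbands. The statistical property of the coefficient distribution in each RDCT subband is modeled by the generalized Gaussian density (GGD) function. Mutual information (MI) and the energy distribution ratio (EDR) describe the statistical properties across different RDCT subbands, and EDR can further model the mutual masking property of the human visual system (HVS). By considering the GGD model within each RDCT subband and combining MI with EDR across subbands, the statistical properties of the stereoscopic image (left view, right view, and difference image) are fully exploited. Experimental results show that the proposed method is effective for RR SIQA. The keywords also include "Stereoscopic Image Quality Assessment", "Reduced Reference", and "Human Visual System".

The authors, Lin Ma, Xu Wang, Qiong Liu, and King Ngi Ngan, are affiliated with Huawei Noah's Ark Lab, Shenzhen University, Nanjing University of Information Science & Technology, Huazhong University of Science & Technology, and The Chinese University of Hong Kong.

The paper outlines the importance and research background of SIQA, then details the proposed method and its theoretical basis: the concepts of DCT and RDCT and their role in image processing, the use of the GGD function to describe coefficient statistics, and the incorporation of HVS properties into quality assessment. It also describes the experimental design, the evaluation criteria, and the analysis of the results. On the technical side, the work shows how mathematical models can simulate and predict human perception of stereoscopic image quality. On the practical side, it offers a more efficient way to assess stereoscopic image quality, particularly where data transmission and storage must be reduced, which is relevant to applications such as 3D television, virtual reality, and augmented reality.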


Reorganized DCT-based image representation for reduced reference stereoscopic image quality assessment

Lin Ma, Xu Wang, Qiong Liu, King Ngi Ngan

Huawei Noah's Ark Lab, Hong Kong

College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China

Nanjing University of Information Science & Technology, Nanjing 210044, China

Department of Electronics & Information Engineering, Huazhong University of Science & Technology, Wuhan 430074, China

Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong

article info

Article history:

Received 28 January 2015

Received in revised form 17 June 2015

Accepted 20 June 2015

Available online 10 June 2016

Keywords:

Stereoscopic image quality assessment (SIQA)

Reduced reference (RR)

Reorganized discrete cosine transform (RDCT)

Human visual system (HVS)

abstract

In this paper, a novel reduced reference (RR) stereoscopic image quality assessment (SIQA) method is proposed by characterizing the statistical properties of the stereoscopic image in the reorganized discrete cosine transform (RDCT) domain. Firstly, the difference image between the left and right view images is computed. Afterwards, the left and right view images, as well as the difference image, are decomposed by block-based discrete cosine transform (DCT). The DCT coefficients are further reorganized into a three-level coefficient tree, resulting in ten RDCT subbands. For each RDCT subband, the statistical property of the coefficient distribution is modeled by the generalized Gaussian density (GGD) function. The mutual information (MI) and energy distribution ratio (EDR) are employed to depict the statistical properties across different RDCT subbands. Moreover, EDR can further model the mutual masking property of the human visual system (HVS). By considering the GGD modeling behavior within each RDCT subband, and MI together with EDR characterizing the behavior across RDCT subbands, the statistical properties of the stereoscopic image are fully exploited, including the left view, right view, and the difference image. Experimental results demonstrate that the statistical properties of the difference image can well represent the perceptual quality of the stereoscopic image, outperforming the representative RR quality metrics for stereoscopic images and even some full reference (FR) quality metrics. By considering the left view, right view, and difference image together, the performance of the proposed RR SIQA can be further improved, presenting a closer relationship between the quality metric output and human visual perception.
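The decomposition steps in the abstract can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes 8×8 blocks, a plain pixel-wise difference (left minus right) as the difference image, and the subband layout commonly used when an 8×8 DCT block is read as a three-level coefficient tree (one DC subband, three 1×1 groups, three 2×2 groups, three 4×4 groups). The names `difference_image`, `block_dct`, `reorganize`, and `SUBBAND_SLICES` are hypothetical.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def difference_image(left, right):
    """Assumed definition: pixel-wise difference of the two views."""
    return np.asarray(left, dtype=np.float64) - np.asarray(right, dtype=np.float64)

def block_dct(img, n=8):
    """Block-based 2D DCT; returns coefficients of shape (H/n, W/n, n, n)."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    h, w = h - h % n, w - w % n       # crop to a multiple of the block size
    blocks = img[:h, :w].reshape(h // n, n, w // n, n).swapaxes(1, 2)
    c = dct_matrix(n)
    return c @ blocks @ c.T           # matmul broadcasts over the block grid

# Assumed (row, column) position of each of the ten subbands in an 8x8 block.
SUBBAND_SLICES = [
    (slice(0, 1), slice(0, 1)),   # S0: DC coefficient
    (slice(0, 1), slice(1, 2)),   # S1
    (slice(1, 2), slice(0, 1)),   # S2
    (slice(1, 2), slice(1, 2)),   # S3
    (slice(0, 2), slice(2, 4)),   # S4
    (slice(2, 4), slice(0, 2)),   # S5
    (slice(2, 4), slice(2, 4)),   # S6
    (slice(0, 4), slice(4, 8)),   # S7
    (slice(4, 8), slice(0, 4)),   # S8
    (slice(4, 8), slice(4, 8)),   # S9
]

def reorganize(coeffs):
    """Collect same-position coefficients from all blocks into ten subband images."""
    bh, bw = coeffs.shape[:2]
    subbands = []
    for rows, cols in SUBBAND_SLICES:
        part = coeffs[:, :, rows, cols]              # shape (bh, bw, r, c)
        r, c = part.shape[2:]
        subbands.append(part.swapaxes(1, 2).reshape(bh * r, bw * c))
    return subbands

# Usage with random stand-ins for the left and right views.
rng = np.random.default_rng(0)
left, right = rng.random((64, 96)), rng.random((64, 96))
for name, view in [("left", left), ("right", right),
                   ("difference", difference_image(left, right))]:
    shapes = [s.shape for s in reorganize(block_dct(view))]
    print(name, shapes)
```

With this layout the ten subbands together hold exactly as many coefficients as the cropped image, so the reorganization only rearranges the block DCT output and discards nothing.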

1. Introduction

Image perceptual quality assessment plays an essential role in image processing and communication [1,2], such as image capturing, compression, storage, transmission, displaying, printing, sharing, and so on. Therefore, many research works aim to develop image quality metrics for guiding the performance optimization during each step of image processing [3,4] and communication. Human eyes are the ultimate receivers of the images, so the subjective test process is regarded as the most reliable way to evaluate the perceptual quality of an image. However, the subjective test process is time-consuming, which makes it impractical for the optimization of online image processing. Therefore, objective quality metrics that can automatically evaluate the image perceptual quality and guide image processing applications are in demand.

Nowadays, with the rapid development of content generation and display technology, three-dimensional (3D) applications and services are becoming more and more popular for the visual quality of experience (QoE) of human viewers. The 3D contents displayed on 3D devices have brought new entertainment and more vivid experiences to consumers, attracting more and more attention from both researchers and industry. For these applications, the quality of the 3D content is the most critical part of providing the visual QoE. However, in the 3D processing chain including capturing, processing [5–7], coding [8,9], transmitting, reconstruction, retrieving, etc., artifacts are inevitably introduced due to the resource shortage in processing [10]. Therefore, how to automatically evaluate the perceptual quality of 3D content [11] becomes a challenging issue in 3D visual signal processing. Moreover, it is claimed that the artifacts of 3D content affect the human visual system (HVS) more [12,13] than those of conventional 2D contents. Therefore, the realization of the HVS properties on 3D content is researched to help more accurately evaluate the perceptual quality of the 3D contents.

Contents lists available at ScienceDirect

Neurocomputing

journal homepage: www.elsevier.com/locate/neucom

http://dx.doi.org/10.1016/j.neucom.2015.06.116

Neurocomputing 215 (2016) 21–31

Corresponding author at: College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China.

E-mail addresses: forest.linma@gmail.com (L. Ma), wangxu.cise@gmail.com (X. Wang), q.liu@hust.edu.cn (Q. Liu), knngan@ee.cuhk.edu.hk (K.N. Ngan).

According to the availability of the reference image, the conventional 2D image quality assessment (IQA) methods can be divided into three categories: full reference (FR) [14–16], no-reference (NR) [17–22], and reduced reference (RR) [22–28].

FR metrics require full access to the original image to evaluate the perceptual quality of the distorted image. The original image is assumed to be artifact free and of perfect quality. Such metrics can be employed to guide the perceptual quality optimization during image/video compression, watermarking, and so on. The most appealing quality metrics are the mean square error (MSE) and its related peak signal-to-noise ratio (PSNR), because of their simplicity, clear physical meaning, and easy optimization. However, MSE and PSNR do not correlate well with HVS properties. Therefore, many FR metrics are developed to incorporate the HVS properties and image signal properties. Wang et al. developed the most popular image quality metric, structural similarity (SSIM) [14], which captures the structure information loss to depict the perceptual quality of the distorted image. A wavelet-based visual signal-to-noise ratio (VSNR) is developed to capture the distortions in the wavelet domain [15]. A simple quality metric considering texture masking and the contrast sensitivity function is developed for perceptual image coding [29]. In [16], Ma et al. proposed to incorporate the horizontal effect of HVS into SSIM for a better image quality metric.
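For reference, the MSE and PSNR mentioned above can be computed as in the following minimal Python sketch. It assumes 8-bit images (peak value 255) held as NumPy arrays of equal shape; the function names `mse` and `psnr` are illustrative.

```python
import numpy as np

def mse(ref, dist):
    """Mean square error between a reference and a distorted image."""
    ref = np.asarray(ref, dtype=np.float64)
    dist = np.asarray(dist, dtype=np.float64)
    return float(np.mean((ref - dist) ** 2))

def psnr(ref, dist, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(ref, dist)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))

# Usage: add uniform noise to a random 8-bit test image.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(32, 32))
dist = np.clip(ref + rng.integers(-5, 6, size=ref.shape), 0, 255)
print(mse(ref, dist), psnr(ref, dist))
```

Both quantities depend only on the pixel-wise error, which is why they are easy to optimize but carry no model of how the HVS perceives that error.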

However, in most real-world applications we cannot access the original image for quality analysis, and only the distorted image is available. The NR quality metrics are thus employed. Many researchers employ the behaviors of specific distortions for NR quality assessment, such as the blocking artifact of JPEG coded images, the ringing artifact of JPEG 2000 coded images, and so on. As JPEG 2000 employs the wavelet transform to compress the image, the wavelet statistical model is utilized to capture the compression distortion [18]. Liang et al. [19] combined the sharpness, blurring, and ringing measurements together to depict the perceptual quality of the JPEG 2000 coded image. The distribution of the DCT coefficients after quantization is modeled in [20] to predict the PSNR value of the JPEG coded image. Furthermore, Ferzli et al. [21] conducted a psychophysical experiment to test the blurring tolerance ability of the HVS, based on which the just-noticeable blur (JNB) model is developed. These methods employ the behaviors of specific distortions to predict the degradation level. Therefore, if a new distortion is introduced, these methods can hardly evaluate the perceptual quality of the distorted image.

In order to compromise between the FR and NR IQAs, RR IQAs are developed. It is expected that the RR methods can effectively evaluate the image perceptual quality based on a limited number of RR features extracted from the reference image. Only a small number of bits are required for representing the extracted features, which can be efficiently encoded and transmitted for the quality analysis. Consequently, RR methods are very useful for quality monitoring during image transmission and communication.

For RR quality metrics, only partial information of the original image is available for quality analysis. These metrics can be further categorized into the following three classes.



Distortion-based RR metrics: The behaviors of the distortions are modeled for RR quality metric design. Wolf et al. [30,31] proposed to extract a set of spatial and temporal features for measuring the distortions introduced in the standard video compression and communication environment. The features that are associated with the blurring, blocking, and frame differences are extracted in [32] to depict the compression artifacts introduced by MPEG-2. These RR quality metrics are designed for some specific distortions and cannot be effectively applied to images with other distortions. Therefore, a general RR IQA for evaluating the image perceptual quality of different distortions is required.



HVS-based RR metrics: HVS properties should be considered for quality assessment, as the human eyes are the ultimate receivers. Le Callet et al. [33] employed a neural network to train and evaluate the perceptual qualities of video sequences based on the perceptual-related features extracted from the video frames. In [34,35], the perceptual features motivated by the computational models of low level vision are extracted as the reduced descriptors to represent the image perceptual quality. The merits of the contourlet transform, the contrast sensitivity function, and Weber's law of just noticeable difference are incorporated to derive an RR IQA [36], which is employed for evaluating the perceptual qualities of the JPEG and JPEG 2000 coded images. Recently, an RR IQA [37] for wireless imaging is developed by considering different structural information that is observed in the distortion model of the wireless link.



Statistics-based RR metrics: The statistical modeling of the image signal has been investigated for the image perceptual quality assessment for RR IQAs. In [38,23], Wang et al. proposed a wavelet-domain natural image statistic metric (WNISM), which models the marginal probability distribution of the wavelet coefficients of a natural image by the generalized Gaussian density (GGD) function. The Kullback–Leibler distance (KLD) is used to depict the distribution difference. To improve the performance and reduce the number of features, the probability distribution was represented by the Gaussian scale mixture (GSM) model in the wavelet domain [39]. In [24–26], the RR features are extracted in the reorganized domain. In [24], the DCT coefficients are first reorganized into several representative subbands, whose distributions are modeled by the GGD. The city-block distance (CBD) is utilized to capture the image perceptual quality. In [40], the statistics of the image gradient magnitude are modeled by the Weibull distribution to develop an RR image quality metric. The statistics of the edge [41] are also utilized for developing the RR IQA. In [42], the authors measure the differences between the entropies of the wavelet coefficients of the reference and distorted images to quantify the image information change, which can indicate the image perceptual quality.
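The statistics-based idea above (fit a GGD to the reference coefficients, then compare distributions with KLD or CBD) can be sketched in Python as follows. This is an illustrative sketch only. It assumes the zero-mean GGD form p(x) = β / (2αΓ(1/β)) · exp(−(|x|/α)^β) and estimates (α, β) by moment matching over a grid of β values, which is one common estimator and not necessarily the one used in the cited works. The bin layout and the synthetic data in the usage part are hypothetical.

```python
import math
import numpy as np

def ggd_ratio(beta):
    """E[x^2] / (E|x|)^2 of a zero-mean GGD with shape parameter beta."""
    return math.exp(math.lgamma(1.0 / beta) + math.lgamma(3.0 / beta)
                    - 2.0 * math.lgamma(2.0 / beta))

def fit_ggd(x):
    """Moment-matching estimate of the GGD scale alpha and shape beta."""
    x = np.asarray(x, dtype=np.float64).ravel()
    m1 = np.mean(np.abs(x))
    m2 = np.mean(x ** 2)
    target = m2 / (m1 ** 2)
    betas = np.arange(0.2, 5.0, 0.001)            # search grid for beta
    ratios = np.array([ggd_ratio(b) for b in betas])
    beta = float(betas[np.argmin(np.abs(ratios - target))])
    alpha = math.sqrt(m2 * math.exp(math.lgamma(1.0 / beta)
                                    - math.lgamma(3.0 / beta)))
    return alpha, beta

def ggd_pdf(x, alpha, beta):
    """GGD density evaluated at x."""
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * np.exp(-(np.abs(x) / alpha) ** beta)

def kld(p, q, eps=1e-12):
    """Kullback-Leibler distance between two histograms (normalized here)."""
    p = np.asarray(p, dtype=np.float64) + eps
    q = np.asarray(q, dtype=np.float64) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def cbd(p, q):
    """City-block (L1) distance between two histograms."""
    return float(np.sum(np.abs(np.asarray(p, dtype=np.float64)
                               - np.asarray(q, dtype=np.float64))))

# Usage with synthetic stand-ins for one subband of coefficients.
rng = np.random.default_rng(0)
ref = rng.laplace(0.0, 1.0, 50_000)               # "reference" coefficients
dist = ref + rng.normal(0.0, 0.5, ref.size)       # "distorted" coefficients
alpha, beta = fit_ggd(ref)                        # two RR features per subband
edges = np.linspace(-8.0, 8.0, 65)
centers = 0.5 * (edges[:-1] + edges[1:])
model = ggd_pdf(centers, alpha, beta)             # model rebuilt from (alpha, beta)
hist, _ = np.histogram(dist, bins=edges)
print(alpha, beta)
print(kld(model, hist), cbd(model / model.sum(), hist / hist.sum()))
```

Only the two GGD parameters per subband have to be sent as side information in this sketch, which illustrates why such features suit the low bit budget of RR metrics.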

In this paper, we propose a novel RR SIQA based on the statistical modeling of the stereoscopic image. The difference image is computed from the left view and right view images. After performing the block-based DCT and reorganization process, the coefficients of the images (left view, right view, and the difference image) are reorganized into different RDCT subbands. The statistical property within each RDCT subband is exploited by the GGD modeling of the coefficient distribution. The statistical property across RDCT subbands is modeled by the energy distribution ratio (EDR), which can be further employed for modeling the HVS mutual masking property. The main contributions of our proposed method are listed as follows.



The statistical property of the stereoscopic image is studied for perceptual quality analysis. The statistical properties of the obtained difference image are first investigated in the RDCT domain. By considering the difference image, the left and right view images are considered together for perceptual quality analysis, which matches the HVS perception of the stereoscopic image.



The statistical properties of the difference image are characterized from two perspectives, specifically the within and across RDCT subband statistical properties. The statistical properties depicted within and across the RDCT
