2015 8th International Congress on Image and Signal Processing (CISP 2015)
Robustness Comparison of Clustering-Based vs.
Non-Clustering Multi-label Classifications for Image
and Video Annotations
Gulisong Nasierding
Yong Li
School of Computer Science and Technology
Xinjiang Normal University
No. 102 Xin Yi Rd, Urumqi, China 830001
gulnas9@gmail.com, liyong@live.com
Atul Sajjanhar
School of Information Technology
Deakin University
221 Burwood HWY, Burwood, VIC 3125, Australia
atul.sajjanhar@deakin.edu.au
Abstract—This paper reports a robustness comparison of clustering-based multi-label classification methods against their non-clustering counterparts for multi-concept image and video annotation. In the experimental setting of this paper, we adopted six popular multi-label classification algorithms, two different base classifiers for the problem-transformation-based multi-label classifications, and three different clustering algorithms for pre-clustering the training data. We conducted the experimental evaluation on two multi-label benchmark datasets: the scene image data and the Mediamill video data. We also employed two multi-label classification evaluation metrics, namely micro F1-measure and Hamming loss, to report the predictive performance of the classifications. The results reveal that different base classifiers and clustering methods contribute differently to the performance of the multi-label classifications. Overall, the pre-clustering methods improve the effectiveness of multi-label classification in certain experimental settings. This provides vital information to users when deciding which multi-label classification method to choose for multi-concept image and video annotation.
Keywords - multi-concept; image and video annotation; clustering-based; multi-label classification; robustness comparison.
I. INTRODUCTION
Automatic image or video annotation refers to the automatic assignment of a set of semantic keywords to unlabeled images or video clips, where the keywords convey the meaning of the image or video content [1-10]. Multi-concept image and video annotation has become popular in line with the growing number of people who rely on online resources for autonomous learning and education. During the learning process, learners have to search and retrieve multimedia information, such as images and videos, from massive digital libraries. Because representing query images or video clips with abstract features is challenging, retrieval based on annotation keywords becomes a more practical option [1, 8]. However, such retrieval requires effective methods for automatically annotating unlabeled images and videos in the training phase, and for using the annotation keywords to search for and retrieve the expected images or videos. This paper therefore pursues fundamental research on effective methods for automatic image and video annotation.
Automatically annotating an image or video is a challenging task when a single image or video clip is associated with multiple semantic concepts. Such problems are tackled by various methods, including multi-label classification (MLC) [7-10]. Research findings show that the clustering-based multi-label classification (CBMLC) framework [10] is effective for a variety of multi-label classification problems.
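To make the idea concrete, the sketch below illustrates the general CBMLC strategy of pre-clustering the training data, training one multi-label classifier per cluster, and routing each test instance to the model of its nearest cluster. The use of k-means and a Binary Relevance (one-vs-rest) learner here is an illustrative assumption, not the specific configuration evaluated in this paper.

```python
# Minimal sketch of the CBMLC idea: pre-cluster the training data, train one
# multi-label classifier per cluster, and predict with the model of the
# nearest cluster.  KMeans and Binary Relevance (one-vs-rest logistic
# regression) are stand-ins chosen for illustration only.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier


class CBMLCSketch:
    def __init__(self, n_clusters=3):
        self.clusterer = KMeans(n_clusters=n_clusters, random_state=0)
        self.models = {}

    def fit(self, X, Y):
        # X: (n_samples, n_features) array, Y: binary label-indicator matrix.
        cluster_ids = self.clusterer.fit_predict(X)
        for c in np.unique(cluster_ids):
            mask = cluster_ids == c
            # One multi-label classifier (Binary Relevance) per cluster.
            model = OneVsRestClassifier(LogisticRegression(max_iter=1000))
            self.models[c] = model.fit(X[mask], Y[mask])
        return self

    def predict(self, X):
        # Route each test instance to the model of its nearest cluster.
        cluster_ids = self.clusterer.predict(X)
        return np.vstack([self.models[c].predict(x.reshape(1, -1))
                          for c, x in zip(cluster_ids, X)])
```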
The rest of the paper is organized as follows: Section II provides an overview of automatic image and video annotation approaches and of relevant MLC approaches. Section III introduces the experimental setting, including the experimental setup, the evaluation datasets, and the MLC evaluation metrics. Section IV presents the experimental results and discussion. Section V concludes the paper.
II. OVERVIEW
A. Image Annotation Approaches
The process of image annotation involves a number of stages, including pre-processing, annotation, and post-processing [1]. In a typical automatic image annotation (AIA) system, the pre-processing stage usually involves image segmentation and feature extraction. Images are first segmented into sub-structural segments (regions or blobs), and useful features are then extracted from each region. However, the segmentation step can be skipped when global features are used to represent images. The major task of the annotation stage is to predict semantic concepts for the visual content of an image. Annotation methods are based on statistical models, on classification, or on a combination of both approaches [1-9].
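As a simplified illustration of such a classification-based AIA pipeline, the sketch below skips segmentation, represents each image by a global color histogram, and trains a multi-label classifier over the extracted features. The feature choice and the k-nearest-neighbour base learner are assumptions made only for this example, not the features or classifiers used in this paper.

```python
# Illustrative AIA pipeline that skips segmentation and uses a global color
# histogram as the image representation; feature and classifier choices are
# assumptions for the example only.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.multioutput import MultiOutputClassifier


def global_color_histogram(image, bins=8):
    """image: H x W x 3 uint8 array -> concatenated per-channel histogram."""
    feats = [np.histogram(image[..., ch], bins=bins, range=(0, 255),
                          density=True)[0] for ch in range(3)]
    return np.concatenate(feats)


def train_annotator(train_images, Y_train):
    # Y_train: binary concept-indicator matrix (one column per concept).
    X = np.vstack([global_color_histogram(im) for im in train_images])
    clf = MultiOutputClassifier(KNeighborsClassifier(n_neighbors=5))
    return clf.fit(X, Y_train)


def annotate(model, image, concept_names):
    # Predict the concept-indicator vector and return the active keywords.
    y = model.predict(global_color_histogram(image).reshape(1, -1))[0]
    return [name for name, flag in zip(concept_names, y) if flag]
```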
This work was supported by the National Natural Science Foundation of China (NSFC, Project No. 61262065).