openEAR - Introducing the Munich Open-Source
Emotion and Affect Recognition Toolkit
Florian Eyben, Martin Wöllmer, and Björn Schuller
Technische Universität München, Institute for Human-Machine Communication
Theresienstrasse 90, 80333 München
{eyben|woellmer|schuller}@tum.de
Abstract
Various open-source toolkits exist for speech recognition and speech processing. These toolkits have been of great benefit to the research community, e.g. by speeding up research. Yet, no such freely available toolkit exists for automatic affect recognition from speech. We herein introduce a novel open-source affect and emotion recognition engine that integrates all necessary components in one highly efficient software package. The components include audio recording and audio file reading, state-of-the-art paralinguistic feature extraction, and pluggable classification modules. In this paper we introduce the engine and extensive baseline results. Pre-trained models for four affect recognition tasks are included in the openEAR distribution. The engine is tailored for multi-threaded, incremental on-line processing of live input in real-time; however, it can also be used for batch processing of databases.
1. Introduction
Affective Computing has become a popular area of research in recent times [17]. Many achievements have been made towards making machines detect and understand human affective states, such as emotion, interest, or dialogue role. Yet, in contrast to the field of speech recognition, only very few software toolkits exist that are tailored specifically for affect recognition from audio or video. In this paper, we introduce and describe the Munich open Affect Recognition Toolkit (openEAR), the first such tool, which runs on multiple platforms and is publicly available at http://sourceforge.net/projects/openear.
OpenEAR in its initial version is introduced as an affect and emotion recognition toolkit for audio and speech. However, openEAR's architecture is modular and in principle modality-independent. Thus, vision features such as facial points or optical flow measures can also be added and fused with audio features. Moreover, physiological features such as heart rate, ECG, or EEG signals from devices such as the Neural Impulse Actuator (NIA) can be analysed using the same methods and algorithms as for speech signals and thus can also be processed using openEAR, provided suitable capture interfaces and databases are available.
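As a minimal sketch of what such feature-level fusion can look like (this is not openEAR's actual interface; the function and variable names here are purely illustrative), descriptors from different modalities can simply be concatenated into one vector before being handed to any classifier that operates on fixed-length feature vectors:

#include <vector>

// Hypothetical feature-level fusion: descriptors from two modalities
// (e.g. prosodic audio features and heart-rate features) are
// concatenated into a single vector for subsequent classification.
std::vector<float> fuseFeatures(const std::vector<float>& audioFeatures,
                                const std::vector<float>& physioFeatures) {
    std::vector<float> fused;
    fused.reserve(audioFeatures.size() + physioFeatures.size());
    fused.insert(fused.end(), audioFeatures.begin(), audioFeatures.end());
    fused.insert(fused.end(), physioFeatures.begin(), physioFeatures.end());
    return fused;  // one combined vector per analysis segment
}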
2. Existing work
A few free toolkits exist that provide various components usable for emotion recognition. Most toolkits that include feature extraction algorithms are targeted at speech recognition and speech processing, such as the Hidden Markov Model Toolkit (HTK) [16], the PRAAT software [1], the Speech Filling System (SFS) from UCL, and the SNACK package for the Tcl scripting language. These can all be used to extract state-of-the-art features for emotion recognition. However, only PRAAT and HTK include certain classifiers. For further classifiers, WEKA and RapidMiner, for example, can be used. Moreover, only a few of the listed toolkits are available under a permissive open-source license, e.g. WEKA, PRAAT, and RapidMiner.
The most complete and task-specific framework for emotion recognition currently is EmoVoice [13]. However, its main design objective is to provide an emotion recognition system for the non-expert; thus it is a great framework for demonstrator applications and for making emotion recognition broadly accessible. openEAR, in contrast, aims at being a stable and efficient set of tools for researchers and those developing emotion-aware applications, providing the elementary functionality for emotion recognition, i.e. a Swiss Army knife for research and development of affect-aware applications. openEAR combines everything from audio recording, feature extraction, and classification to evaluation of results and pre-trained models, while being very fast and highly efficient. All feature extractor components are written in C++ and can be used as a library, facilitating integration into custom applications.
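To give an impression of what embedding such a feature extraction library in a custom C++ application might look like, the following sketch uses a hypothetical wrapper class; the class, method, and file names are purely illustrative and do not correspond to openEAR's documented interface.

// Purely illustrative sketch of using a feature extractor as a C++
// library inside a custom application; all names are hypothetical.
#include <iostream>
#include <string>
#include <vector>

class FeatureExtractor {                       // hypothetical wrapper class
public:
    explicit FeatureExtractor(const std::string& configFile)
        : config_(configFile) {}
    // Stub: a real implementation would read the audio, run the
    // configured feature extraction chain, and return one vector of
    // descriptors per analysis segment.
    std::vector<std::vector<float>> processFile(const std::string& wavFile) {
        (void)wavFile;
        return {};
    }
private:
    std::string config_;
};

int main() {
    FeatureExtractor extractor("emo_features.conf");        // hypothetical config file
    auto features = extractor.processFile("utterance.wav"); // one vector per segment
    std::cout << "extracted " << features.size() << " feature vectors\n";
    // ... pass 'features' on to a classifier or write them to disk ...
    return 0;
}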