open-Source Media Interpretation by Large feature-space Extraction
Version 2.3, November 2016
Main authors: Florian Eyben, Felix Weninger, Martin Wöllmer, Björn Schuller
E-mails: fe, fw, mw, bs at audeering.com
Copyright (C) 2013-2016 by
audEERING GmbH
Copyright (C) 2008-2013 by
TU München, MMK
audEERING GmbH
D-82205 Gilching, Germany
http://www.audeering.com/
The official openSMILE homepage can be found at: http://opensmile.audeering.com/
This documentation was created by Florian Eyben. Contributions for Android from Gerhard
Hagerer.
Contents
1 About openSMILE
1.1 What is openSMILE?
1.2 Who needs openSMILE?
1.3 Licensing
1.4 History
1.5 Capabilities - Overview
2 Using openSMILE
2.1 Obtaining and Installing openSMILE
2.2 Compiling the openSMILE source code
2.2.1 Build instructions for the impatient
2.2.2 Compiling on Linux/Mac
2.2.3 Compiling on Linux/Mac with PortAudio
2.2.4 Compiling on Linux with openCV and portaudio support.
2.2.5 Compiling on Windows
2.2.6 Compiling on Windows with PortAudio
2.2.7 Compiling on Windows with openCV support.
2.2.8 Compiling for Android and creating the example Android app
2.3 Extracting your first features
2.4 What is going on inside of openSMILE
2.4.1 Incremental processing
2.4.2 Smile messages
2.4.3 openSMILE terminology
2.5 Default feature sets
2.5.1 Common options for all standard configuration files
2.5.2 Chroma features
2.5.3 MFCC features
2.5.4 PLP features
2.5.5 Prosodic features
2.5.6 Extracting features for emotion recognition
2.6 Using Portaudio for live recording/playback
2.7 Extracting features with openCv
2.8 Visualising data with Gnuplot
3 Description of algorithms
4 Reference section
4.1 General usage - SMILExtract
4.2 Understanding configuration files
4.2.1 Enabling components
4.2.2 Configuring components
4.2.3 Including other configuration files
4.2.4 Linking to command-line options
4.2.5 Defining variables
4.2.6 Comments
4.3 Component description and on-line help
4.4 Feature names
5 Developer’s Documentation
6 Additional Support
7 Acknowledgement
Chapter 1
About openSMILE
We introduce openSMILE by addressing two questions that matter to new users: What is
openSMILE? and Who needs openSMILE? If you want to start using openSMILE right away,
begin with Section 2, or with Section 2.3 if you have already installed openSMILE.
1.1 What is openSMILE?
The Munich open-Source Media Interpretation by Large feature-space Extraction (openSMILE)
toolkit is a modular and flexible feature extractor for signal processing and machine learning
applications. Its primary focus is on audio-signal features. However, due to their high degree
of abstraction, openSMILE components can also be used to analyse signals from other
modalities, such as physiological signals, visual signals, and other physical sensors, given
suitable input components. It is written purely in C++, has a fast, efficient, and flexible
architecture, and runs on various mainstream platforms such as Linux, Windows, and MacOS.
openSMILE is designed for real-time on-line processing, but can also be used off-line in batch
mode for processing large data sets. This combination is rarely found in related feature
extraction software: most related projects are designed for off-line extraction and require the
whole input to be present, whereas openSMILE can extract features incrementally as new data
arrives. By using the PortAudio [1] library, openSMILE offers platform-independent live audio
input and live audio playback, which enables the extraction of audio features in real time.
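The difference between off-line and incremental extraction can be illustrated with a minimal
sketch. It does not use the openSMILE API; the function names, the frame and hop sizes, and
the choice of RMS energy as the feature are assumptions made for illustration only. The
incremental version emits a feature value as soon as a full frame of samples has arrived,
while the off-line version needs the complete signal up front.

```python
import math
from typing import Iterable, Iterator, List, Sequence


def _rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def frame_rms_offline(signal: Sequence[float], frame_size: int, hop_size: int) -> List[float]:
    """Off-line variant: requires the whole signal to be present."""
    last_start = len(signal) - frame_size
    return [_rms(signal[i:i + frame_size]) for i in range(0, last_start + 1, hop_size)]


def frame_rms_incremental(
    chunks: Iterable[Sequence[float]], frame_size: int, hop_size: int
) -> Iterator[float]:
    """Incremental variant: yields one value per frame as data arrives in chunks."""
    if not 1 <= hop_size <= frame_size:
        raise ValueError("this sketch assumes 1 <= hop_size <= frame_size")
    buffer: List[float] = []
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit every frame that is complete so far, then discard the
        # samples that no later frame will need.
        while len(buffer) >= frame_size:
            yield _rms(buffer[:frame_size])
            del buffer[:hop_size]


if __name__ == "__main__":
    # Hypothetical input: 16 samples delivered in chunks of 3.
    signal = [0.0, 1.0, 0.0, -1.0] * 4
    chunks = [signal[i:i + 3] for i in range(0, len(signal), 3)]
    for value in frame_rms_incremental(chunks, frame_size=4, hop_size=2):
        print(value)
```

Both variants produce the same values for the same signal; the incremental one only needs
memory for one frame and can start producing output before the input has ended.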
To facilitate interoperability, openSMILE supports reading and writing various data formats
commonly used in the field of data mining and machine learning. These formats include PCM
WAVE for audio files, CSV (comma-separated values, a spreadsheet format) and ARFF (the Weka
data mining format) for text-based data files, HTK (Hidden Markov Model Toolkit) parameter
files, and a simple binary float matrix format for binary feature data.
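To make the two text-based formats concrete, the following sketch serialises a small feature
matrix to CSV and to ARFF. It is an illustration of the general file formats, not of the exact
layout openSMILE writes; the feature names `energy` and `zcr`, the relation name, and the
helper functions are hypothetical.

```python
import csv
import io
from typing import Sequence


def to_csv(names: Sequence[str], rows: Sequence[Sequence[float]]) -> str:
    """Serialise features as CSV: one header line, then one line per frame."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(names)
    writer.writerows(rows)
    return out.getvalue()


def to_arff(relation: str, names: Sequence[str], rows: Sequence[Sequence[float]]) -> str:
    """Serialise features as ARFF: attribute declarations, then a @data section."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in names]
    lines += ["", "@data"]
    lines += [",".join(str(float(v)) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical feature names and values, for illustration only.
    names = ["energy", "zcr"]
    rows = [[0.5, 0.25], [0.75, 0.125]]
    print(to_csv(names, rows))
    print(to_arff("demo_features", names, rows))
```

The main practical difference is that ARFF declares a name and a type for every column in its
header, which lets tools such as Weka load the file without further configuration, whereas
CSV carries at most a line of column names.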
Extracted features that have been dumped to files can be visualised with the open-source
software gnuplot [2]. A strength of openSMILE, due to its highly modular architecture, is that
almost all intermediate data generated during the feature extraction process (such as windowed
audio data, spectra, etc.) can be accessed and saved to files for visualisation or further
processing.
[1] http://www.portaudio.com
[2] http://www.gnuplot.info/