DSTL: Solution to Limitation of Small Corpus in Speech Emotion Recognition

Ying Chen; Zhongzhe Xiao; Xiaojun Zhang; Zhi Tao

doi:10.1613/jair.1.11729

Published: Oct 7, 2019

Keywords: machine learning, speech processing, data mining

Ying Chen

Soochow University

Zhongzhe Xiao

Soochow University

https://orcid.org/0000-0003-3950-3070

Xiaojun Zhang

Soochow University

Zhi Tao

Soochow University

Abstract

Traditional machine learning methods share a common assumption: the training and test data lie in the same feature space and follow the same distribution. In practice, however, labeled target data may be scarce, so the target domain may not share the feature space or distribution of the available training set (the source domain). To address this domain mismatch, we propose a Dual-Subspace Transfer Learning (DSTL) framework that considers both the common and the domain-specific information of the two domains. In DSTL, a latent common subspace is first learned to preserve the data properties and reduce the discrepancy between the domains. Then, we propose a mapping strategy to transfer the source-specific information to the target subspace. The integration of the domain-common and domain-specific information constitutes the proposed DSTL framework. Compared with state-of-the-art works, the main contribution of our work is that the DSTL framework not only considers the commonalities but also exploits the domain-specific information. Experiments on three emotional speech corpora verify the effectiveness of our approach. The results show that methods that include both domain-common and domain-specific information perform better than baseline methods that exploit only the domain commonalities.

Issue

Vol. 66 (2019)

Section

Articles
