Speaker Recognition Using Constrained Convolutional Neural Networks in Emotional Speech

Simić, Nikola; Suzić, Siniša; Nosek, Tijana; Vujović, Mia; Perić, Zoran; Savić, Milan; Delić, Vlado

dc.contributor.author	Simić, Nikola
dc.contributor.author	Suzić, Siniša
dc.contributor.author	Nosek, Tijana
dc.contributor.author	Vujović, Mia
dc.contributor.author	Perić, Zoran
dc.contributor.author	Savić, Milan
dc.contributor.author	Delić, Vlado
dc.date.accessioned	2023-04-10T12:13:45Z
dc.date.available	2023-04-10T12:13:45Z
dc.date.issued	2022-03-16
dc.identifier.citation	III44006	en_US
dc.identifier.uri	https://platon.pr.ac.rs/handle/123456789/1181
dc.description.abstract	Speaker recognition is an important classification task that can be solved using several approaches. Building a speaker recognition model on a closed set of speakers under neutral speaking conditions is a well-researched task, and existing solutions provide excellent performance. However, the classification accuracy of such models decreases significantly when they are applied to emotional speech or in the presence of interference. Furthermore, deep models may require a large number of parameters, so constrained solutions are desirable for implementation on edge devices in Internet of Things systems for real-time detection. The aim of this paper is to propose a simple, constrained convolutional neural network for speaker recognition tasks and to examine its robustness in emotional speech conditions. We examine three quantization methods for developing a constrained network: the eight-bit floating-point (FP8) format, ternary scalar quantization, and binary scalar quantization. The results are demonstrated on the recently recorded SEAC dataset.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Molecular Diversity Preservation International	en_US
dc.title	Speaker Recognition Using Constrained Convolutional Neural Networks in Emotional Speech	en_US
dc.title.alternative	Entropy	en_US
dc.type	journal article	en_US
dc.description.version	publishedVersion	en_US
dc.identifier.doi	https://doi.org/10.3390/e24030414
dc.identifier.issn	1099-4300
dc.citation.volume	24
dc.citation.issue	3
dc.subject.keywords	speaker recognition	en_US
dc.subject.keywords	convolutional neural network	en_US
dc.subject.keywords	quantization	en_US
dc.subject.keywords	emotional speech	en_US
dc.type.mCategory	M22	en_US
dc.type.mCategory	openAccess	en_US

Files

Name: entropy-24-00414.pdf
Size: 154.6 KB
Format: PDF

This item appears in the following collections

Главна колекција / Main Collection