Blind room parameter estimation using multiple multichannel speech recordings

Prerak Srivastava; Antoine Deleforge; Emmanuel Vincent

Conference paper. Year: 2021

Prerak Srivastava

Role: Author
PersonId: 1106538

Speech Modeling for Facilitating Oral-Based Communication

Antoine Deleforge

Role: Author
PersonId: 10056
IdHAL: antoine-deleforge
ORCID: 0000-0003-0339-7472
IdRef: 184451205

Speech Modeling for Facilitating Oral-Based Communication

Emmanuel Vincent

Role: Author
PersonId: 1256
IdHAL: emmanuelv
ORCID: 0000-0002-0183-7289
IdRef: 089360176

Speech Modeling for Facilitating Oral-Based Communication

Abstract

Knowing the geometrical and acoustical parameters of a room may benefit applications such as audio augmented reality, speech dereverberation, or audio forensics. In this paper, we study the problem of blindly and jointly estimating the total surface area, the volume, and the frequency-dependent reverberation time and mean surface absorption of a room from two-channel noisy speech recordings made at multiple, unknown source-receiver positions. A novel convolutional neural network architecture leveraging both single- and inter-channel cues is proposed and trained on a large, realistic simulated dataset. Results on both simulated and real data show that using multiple observations in one room significantly reduces estimation errors and variances on all target quantities, and that using two channels helps the estimation of surface and volume. The proposed model outperforms a recently proposed blind volume estimation method on the considered datasets.
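The abstract does not say how the four target quantities relate to one another. As a rough illustration only, the sketch below uses Sabine's textbook approximation, RT60 ≈ 0.161 V / (ᾱ S), which is an assumption brought in here and not a formula taken from the paper. The room dimensions, the octave bands, and the absorption values are all hypothetical.

```python
def sabine_rt60(volume_m3, surface_m2, mean_absorption):
    """Sabine's approximation of the reverberation time, in seconds.

    Assumption: a diffuse sound field, so RT60 ~= 0.161 * V / (alpha * S).
    This is a textbook relation, not the estimator described in the paper.
    """
    if not 0.0 < mean_absorption <= 1.0:
        raise ValueError("mean_absorption must be in (0, 1]")
    return 0.161 * volume_m3 / (surface_m2 * mean_absorption)


# Hypothetical shoebox room (metres).
length, width, height = 6.0, 4.0, 3.0
volume = length * width * height
surface = 2.0 * (length * width + length * height + width * height)

# Hypothetical mean surface absorption per octave band (Hz -> coefficient).
absorption_per_band = {125: 0.10, 500: 0.20, 2000: 0.30}

# Frequency-dependent reverberation time implied by Sabine's relation.
rt60_per_band = {
    band: sabine_rt60(volume, surface, alpha)
    for band, alpha in absorption_per_band.items()
}

for band, rt60 in rt60_per_band.items():
    print(f"{band} Hz: RT60 ~ {rt60:.2f} s")
```

Under this approximation, knowing any three of volume, surface area, mean absorption, and reverberation time in a band determines the fourth, which gives some intuition for why the quantities are natural targets for joint estimation.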

Domains

Sound [cs.SD]; Machine Learning [cs.LG]; Signal and Image Processing [eess.SP]; Artificial Intelligence [cs.AI]

Main file

Waspaa_2021_CRV.pdf (403.46 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-03304656

Submitted on: Wednesday, July 28, 2021, 15:34:59

Last modified on: Thursday, February 1, 2024, 10:04:59

Long-term archiving on: Friday, October 29, 2021, 18:57:24

Dates and versions

hal-03304656, version 1 (28-07-2021)

Identifiers

HAL Id: hal-03304656, version 1

Cite

Prerak Srivastava, Antoine Deleforge, Emmanuel Vincent. Blind room parameter estimation using multiple multichannel speech recordings. WASPAA 2021 - IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Oct 2021, New Paltz, NY, United States. ⟨hal-03304656⟩

Collections

UNIV-RENNES1 CNRS INRIA IRISA GRID5000 UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES SILECS ANR UR1-MATH-NUM
