Assessing the effect of visual servoing on the performance of linear microphone arrays in moving human-robot interaction scenarios
Indexed in
WoS WOS:000573827700018
Scopus SCOPUS_ID:85089286171
DOI 10.1016/J.CSL.2020.101136
Year 2021
Type research article


Abstract



Social robotics is becoming a reality, and voice-based human-robot interaction (HRI) is essential for a successful human-robot collaborative symbiosis. The main objective of this paper is to assess the effect of visual servoing on the performance of a linear microphone array for distant automatic speech recognition (ASR) in a mobile, dynamic and non-stationary robotic testbed that can be representative of real HRI scenarios. Visual servoing and image target tracking are different tasks, and this paper focuses on an effect that is rarely addressed in the literature: the dependence of the beamforming directivity on look direction. The datasets required to carry out the study reported here did not exist and had to be generated. A state-of-the-art mobile robotic testbed had to be set up with target speech and noise sources. A linear microphone array was chosen as a case study and its response was measured. Standard beamforming methods were evaluated with respect to visual servoing: delay-and-sum combined with image tracking; weighted delay-and-sum; and minimum variance distortionless response (MVDR), also combined with image tracking. The results presented here show that the performance of beamforming methods is dramatically degraded in moving and non-stationary conditions. In this context, visual servoing in HRI can significantly improve the performance of a linear microphone array regarding ASR accuracy. The average reduction in word error rate (WER) achieved when the robot head was steered toward the target speech source was as high as 28.2%. Finally, it is worth highlighting that the methodology adopted here is applicable to any microphone array, linear or not. (C) 2020 Elsevier Ltd. All rights reserved.
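The dependence of beamforming directivity on look direction mentioned in the abstract can be illustrated with a minimal sketch. It assumes an idealized far-field, narrowband delay-and-sum beamformer on a uniform linear array. The array geometry (8 microphones, 5 cm spacing), the 2 kHz test frequency, and the function names are hypothetical choices made for illustration; they are not taken from the paper and do not reproduce its measurements.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def das_response(theta_deg, steer_deg, n_mics=8, spacing_m=0.05, freq_hz=2000.0):
    """Magnitude response of a delay-and-sum beamformer on a uniform linear array.

    Angles are measured from broadside (0 degrees = perpendicular to the array).
    All geometry defaults are hypothetical, chosen only for illustration.
    """
    wavenumber = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    # Phase difference between adjacent microphones after steering delays.
    phase_step = wavenumber * spacing_m * (
        math.sin(math.radians(theta_deg)) - math.sin(math.radians(steer_deg))
    )
    total = sum(cmath.exp(1j * n * phase_step) for n in range(n_mics))
    return abs(total) / n_mics


def half_power_beamwidth(steer_deg, step_deg=0.1, **array):
    """Width (degrees) of the main lobe where power stays at or above one half.

    The search is limited to the physical range [-90, 90] degrees.
    """
    def above_half_power(theta):
        return das_response(theta, steer_deg, **array) ** 2 >= 0.5

    low = steer_deg
    while low - step_deg >= -90.0 and above_half_power(low - step_deg):
        low -= step_deg
    high = steer_deg
    while high + step_deg <= 90.0 and above_half_power(high + step_deg):
        high += step_deg
    return high - low


# Compare main-lobe width for several look directions.
for steer in (0.0, 30.0, 60.0):
    print(f"look direction {steer:5.1f} deg: "
          f"half-power beamwidth {half_power_beamwidth(steer):.1f} deg")
```

In this idealized model the main lobe widens as the look direction moves from broadside toward endfire, so spatial selectivity is best when the source is in front of the array. This is consistent with the motivation for steering the robot head toward the speaker rather than relying on electronic steering alone.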


Research Disciplines

WOS
Computer Science, Artificial Intelligence
Scopus
No disciplines
SciELO
No disciplines


Publications: WoS (editions: ISSHP, ISTP, AHCI, SSCI, SCI), Scopus, SciELO Chile.



Authors - Affiliation

No.  Author                           Gender  Institution - Country
1    Diaz, Alejandro                  Male    Universidad de Chile - Chile
2    Mahu-Sinclair, Rodrigo Manuel    Male    Universidad de Chile - Chile
3    Novoa, José                      Male    Universidad de Chile - Chile
4    Sepulveda, Jorge                 Male    Universidad de Chile - Chile
5    Datta, Jayanta                   -       Universidad de Chile - Chile
6    Yoma, Nestor Becerra             Male    Universidad de Chile - Chile

Affiliation and gender (automatically detected) of the publication's co-authors.

Funding

Source
Conicyt-Fondecyt
ONRG

Funding sources as declared in the publication.

Acknowledgments
The research reported here was funded by Grants Conicyt-Fondecyt 1151306 and ONRG N62909-17-1-2002. The authors would like to thank Prof. Juan Monetta, INACAP, Chile, for having provided the anechoic chamber employed here, and Prof. Israel Cohen, Technion, Israel, for the discussion on circular arrays.
