
Department of Knowledge Management, Monitoring and Foresight
Questions or comments: productividad@anid.cl



M-DETR: Multi-scale DETR for Optical Music Recognition
Indexed in
WoS: WOS:001221608500001
Scopus: SCOPUS_ID:85187959362
DOI: 10.1016/J.ESWA.2024.123664
Year: 2024
Type: research article


Abstract



Optical Music Recognition (OMR) is an important way to digitize score images and has broad application prospects in fields such as the storage of music documents, music education and digital creation. As a new paradigm for object detection, DETR (detection transformer) has the ability to associate contextual information, which can be exploited to resolve the OMR task. However, the original DETR does not fit OMR well due to its high computational complexity and numerous parameters. To address the DETR defects and improve the recognition accuracy of OMR, we propose a novel multi-scale DETR (M-DETR) with a multi-scale feature fusion mechanism and improved attention mechanisms. First, a new multi-scale feature fusion mechanism is designed to let the backbone network of M-DETR get rich multi-scale information. Then, a key-region attention mechanism is incorporated based on the character that the key information is concentrated on a score image. Finally, the pre-context attention mechanism is introduced to make better use of the contextual association between recognition notes in music scores. Experiment results show that M-DETR achieves recognition accuracy of 90.6% for 7 typical small-sized notes, which is better than Faster R-CNN and YOLO v5, and the improvement rate is 10.02% compared to the original DETR algorithm. The results indicate that M-DETR is an effective way for the OMR task, which also provides a new solution for the detection of small-sized objects with contextual association.
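
The abstract does not say whether the 10.02% improvement over the original DETR is relative or an absolute gain in percentage points. The following minimal sketch, which is not taken from the paper, computes the DETR baseline accuracy that each reading would imply. The function name `implied_baseline` and both interpretations are assumptions made for illustration only.

```python
def implied_baseline(final_acc: float, improvement: float, relative: bool) -> float:
    """Return the baseline accuracy (in %) implied by a reported improvement.

    relative=True  : improvement is a relative gain, final = base * (1 + imp/100)
    relative=False : improvement is an absolute gain in percentage points
    """
    if relative:
        return final_acc / (1.0 + improvement / 100.0)
    return final_acc - improvement


# Figures quoted in the abstract.
m_detr_acc = 90.6     # M-DETR accuracy on 7 small-sized note classes (%)
improvement = 10.02   # reported improvement over the original DETR (%)

# Hypothetical baselines under the two readings (neither is confirmed by the source).
baseline_if_relative = implied_baseline(m_detr_acc, improvement, relative=True)
baseline_if_absolute = implied_baseline(m_detr_acc, improvement, relative=False)

print(f"relative reading: baseline ~ {baseline_if_relative:.2f}%")
print(f"absolute reading: baseline ~ {baseline_if_absolute:.2f}%")
```

The two readings differ by a little under two percentage points, so the full paper should be consulted before quoting a DETR baseline figure.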


Research Disciplines



WOS
Computer Science, Artificial Intelligence
Engineering, Electrical & Electronic
Operations Research & Management Science
Scopus
Computer Science Applications
Artificial Intelligence
Engineering (All)
SciELO
No disciplines

Shows the distribution of disciplines for this publication.

Publications from WoS (editions: ISSHP, ISTP, AHCI, SSCI, SCI), Scopus, and SciELO Chile.



Authors - Affiliation



No. | Author | Gender | Institution - Country
1 | Luo, Fei | - | East China University of Science and Technology - China; Shanghai Key Laboratory of Computer Software Testing and Evaluating - China
2 | Dai, Yifan | - | East China University of Science and Technology - China
3 | Fuentes, Joel | Male | Universidad del Bío Bío - Chile
4 | Ding, Weichao | - | East China University of Science and Technology - China
5 | Zhang, Xueqin | - | East China University of Science and Technology - China; Shanghai Key Laboratory of Computer Software Testing and Evaluating - China

Shows the affiliation and (detected) gender of the publication's co-authors.

Funding



Source
National Natural Science Foundation of China
Natural Science Foundation of Shanghai Municipality
Shanghai Municipal Natural Science Foundation

Shows the funding source declared in the publication.

Acknowledgments



Acknowledgment
This work was sponsored by National Natural Science Foundation of China (No. 62276097) and Shanghai Municipal Natural Science Foundation (No. 22ZR1416500). Any opinions, findings, and conclusions are those of the authors, and do not necessarily reflect the views of the above agencies.
