Automatic Extraction of Nested Entities in Clinical Referrals in Spanish
Indexed in
WoS WOS:001381981400003
Scopus SCOPUS_ID:85137346329
DOI 10.1145/3498324
Year 2022
Type research article

Abstract



Here we describe a new clinical corpus rich in nested entities and a series of neural models to identify them. The corpus comprises de-identified referrals from the waiting list in Chilean public hospitals. A subset of 5,000 referrals (58.6% medical and 41.4% dental) was manually annotated with 10 types of entities, six attributes, and pairs of relations with clinical relevance. In total, there are 110,771 annotated tokens. A trained medical doctor or dentist annotated these referrals, and then, together with three other researchers, consolidated each of the annotations. The annotated corpus has 48.17% of entities embedded in other entities or containing another one. We use this corpus to build models for Named Entity Recognition (NER). The best results were achieved using a Multiple Single-entity architecture with clinical word embeddings stacked with character and Flair contextual embeddings. The entity with the best performance is abbreviation, and the hardest to recognize is finding. NER models applied to this corpus can leverage statistics of diseases and pending procedures. This work constitutes the first annotated corpus using clinical narratives from Chile and one of the few in Spanish. The annotated corpus, clinical word embeddings, annotation guidelines, and neural models are freely released to the community.

Research Disciplines

WoS
No disciplines
Scopus
No disciplines
SciELO
No disciplines

Shows the distribution of disciplines for this publication.

Publications from WoS (editions: ISSHP, ISTP, AHCI, SSCI, SCI), Scopus, and SciELO Chile.

Authors - Affiliation

No. | Author | Gender | Institution - Country
1 | Baez, Pablo | Male | Universidad de Chile - Chile
2 | Bravo-Marquez, Felipe | Male | Universidad de Chile - Chile; Instituto Milenio Fundamentos de los Datos - Chile
3 | Dunstan-Escudero, Jocelyn Mariel | Male | Universidad de Chile - Chile
4 | Rojas-Valenzuela, Matias Ismael | Male | Universidad de Chile - Chile
5 | Villena, Fabian | Male | Universidad de Chile - Chile

Shows the affiliation and (automatically detected) gender of the publication's co-authors.

Funding

Source
FONDECYT
ICM
Fondo Nacional de Desarrollo Científico y Tecnológico
NLHPC
Centro de Modelamiento Matemático (CMM)
U-INICIA VID
Agencia Nacional de Investigación y Desarrollo
ANID-Chile
ANID-Millennium Science Initiative
BASAL funds for center of excellence from ANID-Chile
CIMT-CORFO

Shows the funding sources declared in the publication.

Acknowledgments

Acknowledgment
This work was funded by Centro de Modelamiento Matemático (CMM), ACE210010, AFB170001, and FB21005 Basal Funds for Center of Excellence from ANID-Chile. In addition, we got funding from U-INICIA VID UI-004/19 and UI-004/20, FONDECYT 11201250 and 11200290, CIMT-CORFO cost center 570111, ICM P09-015F, Postdoctoral FONDECYT 3210395, and ANID - Millennium Science Initiative Program - Code ICN17_002. We thank Maicol Fernández, Manuel Durán, and Esteban Galindo for performing annotations and consolidations on this corpus, and Ren Cerro for English proofreading. This research was partially supported by the supercomputing infrastructure of the NLHPC (ECM-02).