| Field | Value |
|---|---|
| Indexed | |
| DOI | 10.1145/3498324 |
| Year | 2022 |
| Type | Research article |
Here we describe a new clinical corpus rich in nested entities and a series of neural models to identify them. The corpus comprises de-identified referrals from the waiting list in Chilean public hospitals. A subset of 5,000 referrals (58.6% medical and 41.4% dental) was manually annotated with 10 types of entities, six attributes, and pairs of relations with clinical relevance. In total, there are 110,771 annotated tokens. A trained medical doctor or dentist annotated these referrals, and then, together with three other researchers, consolidated each of the annotations. The annotated corpus has 48.17% of entities embedded in other entities or containing another one. We use this corpus to build models for Named Entity Recognition (NER). The best results were achieved using a Multiple Single-entity architecture with clinical word embeddings stacked with character and Flair contextual embeddings. The entity with the best performance is abbreviation, and the hardest to recognize is finding. NER models applied to this corpus can leverage statistics of diseases and pending procedures. This work constitutes the first annotated corpus using clinical narratives from Chile and one of the few in Spanish. The annotated corpus, clinical word embeddings, annotation guidelines, and neural models are freely released to the community.
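
The abstract reports that 48.17% of entities are embedded in another entity or contain one. As a rough illustration of how such a figure could be computed from span annotations, here is a minimal sketch. It assumes entities are given as token-offset spans, and it assumes "nested" means one span fully covers another with a different token range; the corpus's own file format and its exact counting rule are not described here, so both are assumptions. The helper names, the `body_part` label, and the toy spans are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (start_token, end_token_exclusive, label).
Span = Tuple[int, int, str]


def contains(outer: Span, inner: Span) -> bool:
    """Return True if `outer` fully covers `inner` and the token ranges differ."""
    same_range = (outer[0], outer[1]) == (inner[0], inner[1])
    return outer[0] <= inner[0] and inner[1] <= outer[1] and not same_range


def nested_fraction(spans: List[Span]) -> float:
    """Fraction of spans that contain, or are contained in, another span."""
    if not spans:
        return 0.0
    involved = set()
    for i, a in enumerate(spans):
        for j, b in enumerate(spans):
            if i != j and contains(a, b):
                # Both the outer and the inner entity count as nested.
                involved.add(i)
                involved.add(j)
    return len(involved) / len(spans)


# Toy annotations for one referral (invented for illustration only).
toy_spans = [
    (0, 4, "finding"),       # outer entity covering tokens 0..3
    (2, 4, "body_part"),     # inner entity, inside the finding above
    (5, 6, "abbreviation"),  # standalone entity
    (7, 9, "finding"),       # standalone entity
]

print(f"{nested_fraction(toy_spans):.2%} of toy entities are nested")
```

The pairwise comparison is quadratic in the number of entities per document, which is acceptable for short referrals; a corpus-level figure would aggregate the counts across documents rather than average the per-document fractions.
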
| No. | Author | Gender | Institution - Country |
|---|---|---|---|
| 1 | Baez, Pablo | Male | Universidad de Chile - Chile |
| 2 | Bravo-Marquez, Felipe | Male | Universidad de Chile - Chile; Instituto Milenio Fundamentos de los Datos - Chile |
| 3 | Dunstan-Escudero, Jocelyn Mariel | Male | Universidad de Chile - Chile |
| 4 | Rojas-Valenzuela, Matias Ismael | Male | Universidad de Chile - Chile |
| 5 | Villena, Fabian | Male | Universidad de Chile - Chile |
| Funding source |
|---|
| FONDECYT |
| ICM |
| Fondo Nacional de Desarrollo Científico y Tecnológico |
| NLHPC |
| Centro de Modelamiento Matematico (CMM) |
| Centro de Modelamiento Matematico |
| U-INICIA VID |
| Agencia Nacional de Investigación y Desarrollo |
| ANID-Chile |
| ANID-Millennium Science Initiative |
| BASAL funds for center of excellence from ANID-Chile |
| CIMT-CORFO |
| Acknowledgment |
|---|
| This work was funded by Centro de Modelamiento Matemático (CMM), ACE210010, AFB170001, and FB21005 Basal Funds for Center of Excellence from ANID-Chile. In addition, we got funding from U-INICIA VID UI-004/19 and UI-004/20, FONDECYT 11201250 and 11200290, CIMT-CORFO cost center 570111, ICM P09-015F, Postdoctoral FONDECYT 3210395, and ANID - Millennium Science Initiative Program - Code ICN17_002. We thank Maicol Fernández, Manuel Durán, and Esteban Galindo for performing annotations and consolidations on this corpus, and Ren Cerro for English proofreading. This research was partially supported by the supercomputing infrastructure of the NLHPC (ECM-02). |