| Field | Value |
|---|---|
| Indexed | |
| DOI | 10.1145/3331184.3331262 |
| Year | 2019 |
| Type | proceedings paper |
Hate speech is an important problem that is seriously affecting the dynamics and usefulness of online social communities. Large scale social platforms are currently investing important resources into automatically detecting and classifying hateful content, without much success. On the other hand, the results reported by state-of-the-art systems indicate that supervised approaches achieve almost perfect performance but only within specific datasets. In this work, we analyze this apparent contradiction between existing literature and actual applications. We closely study the experimental methodology used in prior work and its generalizability to other datasets. Our findings reveal methodological issues, as well as an important dataset bias. As a consequence, performance claims of the current state-of-the-art have become significantly overestimated. The problems that we have found are mostly related to data overfitting and sampling issues. We discuss the implications for current research and re-conduct experiments to give a more accurate picture of the current state-of-the-art methods.
| Ord. | Author | Gender | Institution - Country |
|---|---|---|---|
| 1 | Arango, Aymé | - | Universidad de Chile - Chile |
| 2 | PEREZ-ROJAS, JORGE ADRIAN | Male | Universidad de Chile - Chile |
| 3 | POBLETE-LABRA, BARBARA JEANNETTE | Female | Universidad de Chile - Chile |
| 4 | ACM | Corporation | |
| Funding source |
|---|
| FONDECYT |
| Fondo Nacional de Desarrollo Científico y Tecnológico |
| Millennium Institute for Foundational Research on Data (IMFD) |
| IMFD |
| Acknowledgment |
|---|
| We thank Thomas Davidson for providing all the information concerning the dataset described in Davidson et al. [8]. This work was supported by the Millennium Institute for Foundational Research on Data (IMFD). Poblete was also funded by Fondecyt grant 1191604. |