SciELO Chile Collection

Department of Knowledge Management, Monitoring and Foresight
Questions or comments: productividad@anid.cl



Hate speech detection is not as easy as you may think: A closer look at model validation (extended version)
Indexed in
WoS WOS:000740349400008
Scopus SCOPUS_ID:85087929876
DOI 10.1016/J.IS.2020.101584
Year 2022
Type research article

Abstract



Hate speech is an important problem that is seriously affecting the dynamics and usefulness of online social communities. Large-scale social platforms are currently investing important resources into automatically detecting and classifying hateful content, without much success. On the other hand, the results reported by state-of-the-art systems indicate that supervised approaches achieve almost perfect performance, but only within specific datasets, most of them in English. In this work, we analyze this apparent contradiction between existing literature and actual applications. We closely study the experimental methodology used in prior work and its generalizability to other datasets. Our findings evidence methodological issues, as well as an important dataset bias. As a consequence, performance claims of the current state-of-the-art have become significantly overestimated. The problems that we have found are mostly related to data overfitting and sampling issues. We discuss the implications for current research and re-conduct experiments to give a more accurate picture of the current state-of-the-art methods. Moreover, we design some baseline approaches to perform cross-lingual experiments, using English and Spanish datasets.

Journal

Journal ISSN
Information Systems 0306-4379

Research Disciplines

WOS
Computer Science, Information Systems
Scopus
Information Systems
Software
Hardware And Architecture
SciELO
No disciplines

Shows the distribution of disciplines for this publication.

Publications from WoS (editions: ISSHP, ISTP, AHCI, SSCI, SCI), Scopus, and SciELO Chile.

Authors - Affiliation

Order Author Gender Institution - Country
1 Arango, Aymé - Universidad de Chile - Chile
Instituto Milenio Fundamentos de los Datos - Chile
2 PEREZ-ROJAS, JORGE ADRIAN Male Universidad de Chile - Chile
Instituto Milenio Fundamentos de los Datos - Chile
3 POBLETE-LABRA, BARBARA JEANNETTE Female Universidad de Chile - Chile
Instituto Milenio Fundamentos de los Datos - Chile

Shows the affiliation and (detected) gender of the publication's co-authors.

Funding

Source
Fondo Nacional de Desarrollo Científico y Tecnológico
Fondecyt, Chile
Fondo Nacional de Desarrollo Científico y Tecnológico
Millennium Institute for Foundational Research on Data, Chile (IMFD)

Shows the funding source declared in the publication.

Acknowledgements

Acknowledgement
We thank Thomas Davidson for providing all the information concerning the dataset described in Davidson et al. [17]. This work was supported by the Millennium Institute for Foundational Research on Data, Chile (IMFD). Poblete was also funded by Fondecyt, Chile grant 1191604, and Pérez by Fondecyt, Chile grant 1200967.