| Field | Value |
|---|---|
| Indexed | |
| DOI | 10.1177/07356331231191174 |
| Year | 2024 |
| Type | Research article |
Written answers to open-ended questions can have a greater long-term effect on learning than multiple-choice questions. However, it is critical that teachers review the answers immediately and ask students to redo those that are incoherent. This task can be difficult and time-consuming for teachers. A possible solution is to automate the detection of incoherent answers, for example with Large Language Models (LLMs), whose powerful discursive ability can also be used to explain decisions. In this paper, we analyze the responses of fourth graders in mathematics using three LLMs: GPT-3, BLOOM, and YOU. We used them with zero, one, two, three, and four shots, and compared their performance with that of various classifiers trained with Machine Learning (ML). We found that the LLMs perform worse than the ML classifiers in detecting incoherent answers. The difficulty seems to reside in recursive questions that contain both questions and answers, and in responses with typical fourth-grader misspellings. Upon closer examination, we found that the ChatGPT model faces the same challenges.
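To make the "zero to four shots" setup concrete, the following is a minimal sketch of how a k-shot prompt for coherence detection might be assembled. It is an illustration only: the function name `build_k_shot_prompt`, the instruction wording, and the example question/answer pairs are all hypothetical and are not taken from the paper or its data.

```python
from typing import List, Tuple

# Hypothetical labeled examples (question, student answer, label).
# These are invented for illustration and do not come from the paper's dataset.
EXAMPLES: List[Tuple[str, str, str]] = [
    ("What is 3 + 4? Explain.", "7 because 3 plus 4 is 7", "coherent"),
    ("What is 3 + 4? Explain.", "i like pizza", "incoherent"),
    ("How many sides does a square have?", "4 sids", "coherent"),
    ("How many sides does a square have?", "asdfgh", "incoherent"),
]


def build_k_shot_prompt(
    question: str,
    answer: str,
    examples: List[Tuple[str, str, str]],
    k: int,
) -> str:
    """Build a prompt with k labeled demonstrations followed by the unlabeled query."""
    if not 0 <= k <= len(examples):
        raise ValueError("k must be between 0 and the number of available examples")

    # Assumed instruction text; the actual wording used in the study is not given here.
    lines = [
        "Decide whether the student's answer is coherent or incoherent "
        "with respect to the question.",
        "",
    ]
    # Add the first k demonstrations (k = 0 yields a zero-shot prompt).
    for q, a, label in examples[:k]:
        lines += [f"Question: {q}", f"Answer: {a}", f"Label: {label}", ""]
    # The query ends with an open "Label:" for the model to complete.
    lines += [f"Question: {question}", f"Answer: {answer}", "Label:"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_k_shot_prompt("What is 5 x 2?", "ten becuase 5 twice", EXAMPLES, k=2))
```

The returned string would then be sent to whichever LLM is being evaluated, and the completion after the final `Label:` compared against the human label.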
| Ord. | Author | Gender | Institution - Country |
|---|---|---|---|
| 1 | Urrutia, Felipe | Male | Universidad de Chile - Chile |
| 2 | ARAYA-SCHULZ, ROBERTO | Male | Universidad de Chile - Chile; Inst Educ - Chile |
| Funding source |
|---|
| ANID/PIA/Basal Funds for Centers of Excellence |
| Agencia Nacional de Investigación y Desarrollo |
| Support from ANID/PIA/Basal Funds for Centers of Excellence FB0003 is gratefully acknowledged. |
| Acknowledgment |
|---|
| The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Chilean National Agency for Research and Development (ANID), grant number ANID/PIA/Basal Funds for Centers of Excellence FB0003. |