AI apology: interactive multi-objective reinforcement learning for human-aligned AI
Indexed in
WoS WOS:000973380900004
Scopus SCOPUS_ID:85153283220
DOI 10.1007/S00521-023-08586-X
Year 2023
Type research article


Abstract



For an Artificially Intelligent (AI) system to maintain alignment between human desires and its behaviour, it is important that the AI account for human preferences. This paper proposes and empirically evaluates the first approach to aligning agent behaviour to human preference via an apologetic framework. In practice, an apology may consist of an acknowledgement, an explanation and an intention for the improvement of future behaviour. We propose that such an apology, provided in response to recognition of undesirable behaviour, is one way in which an AI agent may be both transparent and trustworthy to a human user. We further propose that behavioural adaptation as part of apology is a viable approach to correcting undesirable behaviours. The Act-Assess-Apologise framework could potentially address both the practical and social needs of a human user by recognising and making reparations for prior undesirable behaviour and adjusting for the future. Applied to a dual-auxiliary impact minimisation problem, the apologetic agent achieved near-perfect determination and apology-provision accuracy in several non-trivial configurations. The agent subsequently demonstrated behaviour alignment, with success ranging up to complete avoidance of the impacts described by these objectives in some scenarios.

Research Disciplines

WOS
Computer Science, Artificial Intelligence
Scopus
No disciplines
SciELO
No disciplines


Publications from WoS (editions: ISSHP, ISTP, AHCI, SSCI, SCI), Scopus and SciELO Chile.

Authors - Affiliation

No. | Author | Gender | Institution - Country
1 | Harland, Hadassah | - | Deakin Univ - Australia
    Deakin University - Australia
2 | Dazeley, Richard | Male | Deakin Univ - Australia
    Deakin University - Australia
3 | Nakisa, Bahareh | - | Deakin Univ - Australia
    Deakin University - Australia
4 | CRUZ-OLIVOS, FRANCISCO ANTONIO | Male | Univ New South Wales - Australia
    Universidad Central de Chile - Chile
    UNSW Sydney - Australia
5 | Vamplew, Peter | Male | Federat Univ - Australia
    Federation University Australia - Australia

Shows the affiliation and (detected) gender of the publication's co-authors.

Funding

Source
Deakin University
CAUL and its Member Institutions
School of Life and Environmental Sciences, Deakin University
Institute for Frontier Materials, Deakin University

Shows the funding sources declared in the publication.

Acknowledgements

Acknowledgement
Open Access funding enabled and organized by CAUL and its Member Institutions.
