Dataciencia

SciELO Chile Collection

Accelerating range minimum queries with ray tracing cores

Indexed In

WoS	WOS:001217458100001
Scopus	SCOPUS_ID:85189012812

DOI

10.1016/J.FUTURE.2024.03.040

Year

2024

Type

Research article

Abstract

Over the past decade, GPU technology has undergone a notable transformation, evolving from pure general-purpose computation to the integration of application-specific integrated circuits (ASICs), including Tensor Cores and Ray Tracing (RT) cores. While these specialized GPU cores were initially developed to enhance specific domains like AI and real-time rendering, recent research has successfully harnessed their capabilities to expedite other tasks traditionally reliant on conventional GPU computing. One GPU task that is still yet to find its way into RT cores is the processing of range minimum queries (RMQs) in parallel, which is fundamental in fields such as information retrieval or pattern matching, among others. In this context, accelerating RMQs with RT cores would impact many of the applications that heavily rely on this task. In this work we present RTXRMQ, a new approach that can compute RMQs with RT cores. The main contribution is the proposal of a geometric solution for RMQ, where elements become triangles that are placed and shaped according to the element's value and position in the array, respectively, such that the closest hit of a ray launched from a point given by the query parameters corresponds to the result of that query. Experimental results show that RTXRMQ is currently best suited for small query ranges relative to the input size, achieving up to 5x and 2.3x of speedup over parallel state-of-the-art CPU and GPU approaches, respectively. For medium and large query ranges RTXRMQ is still slower than the state-of-the-art GPU approach, but still competitive by being 2.5x and 4x faster than a state-of-the-art CPU method running in parallel as well. Furthermore, performance scaling experiments across the latest RTX GPU architectures show that if the current RT core scaling trend continues, then RTXRMQ's performance would scale at a higher rate than the other compared approaches, making it an attractive tool for future high performance applications that employ many batches of RMQs.
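
For readers unfamiliar with the problem itself, the following is a minimal Python sketch of what a range minimum query computes. It is not the RTXRMQ method and does not model the triangle geometry or RT cores described in the abstract; it only illustrates the query semantics with a linear scan and a classical sparse-table baseline. All function names (`rmq_naive`, `build_sparse_table`, `rmq_sparse`) and the sample array are hypothetical, and ties are assumed to resolve to the leftmost index.

```python
def rmq_naive(values, left, right):
    """Index of the minimum of values[left..right] (inclusive), leftmost on ties."""
    best = left
    for i in range(left + 1, right + 1):
        if values[i] < values[best]:
            best = i
    return best


def build_sparse_table(values):
    """table[k][i] holds the index of the minimum of values[i .. i + 2**k - 1]."""
    n = len(values)
    table = [list(range(n))]  # level 0: every window has length 1
    span = 1
    while 2 * span <= n:
        prev = table[-1]
        row = []
        for i in range(n - 2 * span + 1):
            a, b = prev[i], prev[i + span]
            # Prefer the left half on ties so the leftmost minimum is kept.
            row.append(a if values[a] <= values[b] else b)
        table.append(row)
        span *= 2
    return table


def rmq_sparse(values, table, left, right):
    """Answer one query by combining two overlapping power-of-two windows."""
    length = right - left + 1
    k = length.bit_length() - 1  # largest k with 2**k <= length
    a = table[k][left]
    b = table[k][right - (1 << k) + 1]
    return a if values[a] <= values[b] else b


# Hypothetical sample data, used only for illustration.
values = [5, 2, 7, 2, 9, 1, 4, 3]
table = build_sparse_table(values)
```

A batch of queries, as mentioned in the abstract, would simply call `rmq_sparse(values, table, left, right)` once per `(left, right)` pair; each call is independent, which is what makes the problem amenable to parallel processing.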

Journal

Journal	ISSN
Future Generation Computer Systems: The International Journal of Grid Computing and eScience	0167-739X

External Metrics

PlumX	Altmetric	Dimensions

Shows external impact metrics associated with the publication. For more detail:

Plumx: https://plumanalytics.com/learn/about-metrics/
Altmetric: https://www.altmetric.com/about-altmetrics/what-are-altmetrics/
Dimensions: https://www.dimensions.ai/why-dimensions/

Research Disciplines

WOS
Computer Science, Theory & Methods

Scopus
No disciplines

SciELO
No disciplines

Shows the distribution of disciplines for this publication.

WoS publications (Editions: ISSHP, ISTP, AHCI, SSCI, SCI), Scopus, SciELO Chile.

Institutional Collaboration

Shows the distribution of collaboration, both national and foreign, generated by this publication.

Authors - Affiliation

Order	Author	Gender	Institution - Country
1	Meneses, Enzo	Male	Universidad Austral de Chile - Chile
2	NAVARRO-GUERRERO, CRISTOBAL ALEJANDRO	Male	Universidad Austral de Chile - Chile
3	FERRADA-ESCOBAR, HECTOR RICARDO	Male	Universidad Austral de Chile - Chile
4	Quezada, Felipe A.	Male	Universidad Austral de Chile - Chile

Shows the affiliation and (detected) gender of the publication's co-authors.

Funding

Source
FONDEQUIP
Universidad Austral de Chile
ANID Fondecyt
ANID FONDECYT, Chile
Patagon supercomputer of Universidad Austral de Chile
Temporal research group

Shows the funding source declared in the publication.

Acknowledgments

Acknowledgment
This work was supported by the ANID FONDECYT, Chile grant #1221357, the Temporal research group and the Patagon supercomputer of Universidad Austral de Chile (FONDEQUIP EQM180042). Special thanks to AMAX Engineering (https://www.amax.com/) for supporting with the CPU and GPU hardware used for the experimentation.