SciELO Chile Collection

Department of Knowledge Management, Monitoring and Foresight
Questions or comments: productividad@anid.cl



Accelerating range minimum queries with ray tracing cores
Indexed in
WoS: WOS:001217458100001
Scopus: SCOPUS_ID:85189012812
DOI: 10.1016/J.FUTURE.2024.03.040
Year: 2024
Type: research article


Abstract



Over the past decade, GPU technology has undergone a notable transformation, evolving from pure general-purpose computation to the integration of application-specific integrated circuits (ASICs), including Tensor Cores and Ray Tracing (RT) cores. While these specialized GPU cores were initially developed to enhance specific domains such as AI and real-time rendering, recent research has successfully harnessed their capabilities to expedite other tasks traditionally reliant on conventional GPU computing. One GPU task that has yet to find its way into RT cores is the parallel processing of range minimum queries (RMQs), which is fundamental in fields such as information retrieval and pattern matching, among others. In this context, accelerating RMQs with RT cores would impact many of the applications that rely heavily on this task. In this work we present RTXRMQ, a new approach that can compute RMQs with RT cores. The main contribution is the proposal of a geometric solution for RMQ, where elements become triangles that are placed and shaped according to the element's value and position in the array, respectively, such that the closest hit of a ray launched from a point given by the query parameters corresponds to the result of that query. Experimental results show that RTXRMQ is currently best suited for small query ranges relative to the input size, achieving speedups of up to 5x and 2.3x over parallel state-of-the-art CPU and GPU approaches, respectively. For medium and large query ranges, RTXRMQ is slower than the state-of-the-art GPU approach but remains competitive, being 2.5x and 4x faster than a state-of-the-art CPU method that also runs in parallel. Furthermore, performance scaling experiments across the latest RTX GPU architectures show that if the current RT core scaling trend continues, RTXRMQ's performance would scale at a higher rate than the other compared approaches, making it an attractive tool for future high-performance applications that employ many batches of RMQs.
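The geometric idea in the abstract can be illustrated with a small CPU-only sketch. It is not the RTXRMQ implementation and does not use RT cores. It assumes one possible reading of the abstract: each element i has a depth equal to its value and covers the region of the (l, r) query plane where l <= i <= r, so that the nearest covered element seen from the query point (l, r) is the range minimum. The function names and the exact footprint shape are hypothetical.

```python
from typing import Sequence


def covers(i: int, l: int, r: int) -> bool:
    """Assumed footprint of element i in the (l, r) query plane: l <= i <= r."""
    return l <= i <= r


def closest_hit_rmq(values: Sequence[float], l: int, r: int) -> int:
    """Return the index of the minimum of values[l..r] (inclusive).

    Software stand-in for a closest-hit search: every element whose assumed
    footprint contains the query point (l, r) counts as a hit, and the hit
    with the smallest depth (the element's value) wins. Ties go to the
    leftmost index.
    """
    best = None
    for i, depth in enumerate(values):
        if covers(i, l, r) and (best is None or depth < values[best]):
            best = i
    if best is None:
        raise ValueError("empty query range: no element covers (l, r)")
    return best


def brute_force_rmq(values: Sequence[float], l: int, r: int) -> int:
    """Reference answer: scan values[l..r] directly (leftmost minimum)."""
    return min(range(l, r + 1), key=lambda i: values[i])


values = [5, 2, 7, 1, 9, 3]
print(closest_hit_rmq(values, 0, 2))  # index of the minimum of [5, 2, 7]
print(closest_hit_rmq(values, 2, 5))  # index of the minimum of [7, 1, 9, 3]
```

This sketch visits every element for each query, so it shows only why a nearest-hit search answers an RMQ. On real hardware the traversal would presumably be delegated to the ray tracing pipeline and its acceleration structure rather than a linear loop.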


Research Disciplines

WoS: Computer Science, Theory & Methods
Scopus: No disciplines
SciELO: No disciplines

Shows the distribution of disciplines for this publication.

Publications from WoS (editions: ISSHP, ISTP, AHCI, SSCI, SCI), Scopus, and SciELO Chile.



Authors - Affiliation

No. | Author | Gender | Institution - Country
1 | Meneses, Enzo | Male | Universidad Austral de Chile - Chile
2 | Navarro-Guerrero, Cristobal Alejandro | Male | Universidad Austral de Chile - Chile
3 | Ferrada-Escobar, Hector Ricardo | Male | Universidad Austral de Chile - Chile
4 | Quezada, Felipe A. | Male | Universidad Austral de Chile - Chile

Shows the affiliation and (detected) gender of the publication's co-authors.

Funding

Source
FONDEQUIP
Universidad Austral de Chile
ANID Fondecyt
ANID FONDECYT, Chile
Patagon supercomputer of Universidad Austral de Chile
Temporal research group

Shows the funding sources declared in the publication.

Acknowledgements

This work was supported by the ANID FONDECYT, Chile grant #1221357, the Temporal research group and the Patagon supercomputer of Universidad Austral de Chile (FONDEQUIP EQM180042). Special thanks to AMAX Engineering (https://www.amax.com/) for supporting with the CPU and GPU hardware used for the experimentation.
