GPU Based Hash Segmentation Index for Fast T-overlap Query

L Jia, Y Zhang, M Li, J Ding, J You - … 22–24, 2017, Proceedings, Part I, 2017 - Springer
Data Science: Third International Conference of Pioneering Computer Scientists …, 2017, Springer
Abstract
T-overlap query is the basis of set similarity query and has been applied in many important fields. Most existing approaches employ a pruning-and-verification framework and therefore suffer from low efficiency. Modern GPUs offer much higher parallelism and memory bandwidth than CPUs and can be used to accelerate T-overlap queries. In this paper, we use hash segmentation to divide inverted lists into segments and then design an efficient GPU inverted index, called GHSII, based on this segmentation. Based on GHSII, a new segmentation-based parallel T-overlap algorithm, GSPS, is proposed. GSPS scans the index one segment at a time and uses shared memory to reduce the number of accesses to device memory. Furthermore, an optimized algorithm called GSPS-TLLO, which uses a heuristic query order, is proposed to address load imbalance. Experiments on two real datasets show that GSPS-TLLO outperforms the state-of-the-art GPU parallel T-overlap algorithms.
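
To make the idea concrete, the following is a minimal CPU-side sketch in Python of a T-overlap query (find every record that shares at least T tokens with the query) over a hash-segmented inverted index. It is not the GHSII layout or the GSPS kernel, whose details the abstract does not give. The function names, the choice of `rid % num_segments` as the hash, and the per-segment counter dictionary are all assumptions made for illustration.

```python
from collections import defaultdict


def build_segmented_index(records, num_segments):
    """Map each token to num_segments buckets of record ids.

    Assumption: record id r is placed in segment r % num_segments.
    The hash function and memory layout used by GHSII are not stated
    in the abstract.
    """
    index = defaultdict(lambda: [[] for _ in range(num_segments)])
    for rid, tokens in enumerate(records):
        for tok in set(tokens):  # count each token once per record
            index[tok][rid % num_segments].append(rid)
    return index


def t_overlap_query(index, query, t, num_segments):
    """Return the sorted ids of records sharing at least t tokens with query."""
    query_tokens = set(query)
    results = []
    for seg in range(num_segments):
        # Segment at a time: the counters cover only the ids hashed to
        # this segment, so the working set stays small. On a GPU such a
        # counter array is the kind of data that could sit in shared memory.
        counts = defaultdict(int)
        for tok in query_tokens:
            if tok in index:
                for rid in index[tok][seg]:
                    counts[rid] += 1
        results.extend(rid for rid, c in counts.items() if c >= t)
    return sorted(results)
```

For example, with the records `["a","b","c"]`, `["b","c","d"]`, `["a","d","e"]`, `["c","d","e"]`, two segments, the query `["b","c","d"]` and T = 2, the call returns `[0, 1, 3]`: the third record shares only `d` with the query. Because every record id belongs to exactly one segment, the segments can be processed independently, which is the property a parallel implementation would rely on.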