Abstract
In this paper we present an efficient and general sorting-based approach for the neighbor search on GPUs. Finding neighbors of a particle is a common task in particle methods and has a significant impact on the overall computational effort–especially in dynamics simulations. We extend a space-filling curve algorithm presented in Connor and Kumar (IEEE Trans Vis Comput Graph, 2009) for its usage on GPUs with the parallel computing model Compute Unified Device Architecture (CUDA). To evaluate our implementation, we consider the respective execution time of our GPU search algorithm, for the most common assemblies of particles: a regular grid, uniformly distributed random points and cluster points in 2 and 3 dimensions. The measured computational time is compared with the theoretical time complexity of the extended algorithm and the computational time of its reference single-core implementation. The presented results show a speed up of factor of 4 comparing the GPU and CPU run times.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Storing the graded nodes of the two dimensional regular grid in row-major order in a thrust vector leads to some issues in the costs of the Merge sort algorithm. To avoid this the nodes need to be stored randomized.
References
S. Aluru, F.E. Sevilgen, Parallel domain decomposition and load balancing using space-filling curves, in Proceedings of the 4th IEEE Conference on High Performance Computing, Bangalore, 1997, pp. 230–235
S. Arya, D.M. Mount, N.S. Netanyahu, R. Silverman, A.Y. Wu, An optimal algortihm for approximate nearest neighbor searching in fixed dimensions, in Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, Arlington, 1994, vol. 5, pp. 573–582
M. Bader, Space-Filling Curves – An Introduction with Applications in Scientific Computing (Springer, Berlin/Heidelberg, 2013)
C. Böhm, S. Berchtold, A.D. Keim, Searching in high-dimensional spaces: index strucutres for improving the performance of multimedia databases. ACM Comput. Surv. 33, 322–373 (2001)
T.M. Chan, A minimalist’s implementation of an approximate nearest neighbor algorithm in fixed dimensions, https://cs.uwaterloo.ca/~tmchan/sss.ps, May 2006
M. Connor, P. Kumar, Stann, https://sites.google.com/a/compgeom.com/stann/
M. Connor, P. Kumar, Fast construction of k-nearest neighbor graphs for point clouds. IEEE Trans. Vis. Comput. Graph. 14(4), 599–608 (2009)
A. Dashti, I. Komarov, R.M. D’Souza, Efficient computation of k-nearest neighbour graphs for large high-dimensional data sets on GPU clusters. PLoS ONE 8, e74113 (2013), plosone.org
V. Garcia, E. Debreuve, M. Barlaud, kNN CUDA, http://vincentfpgarcia.github.io/kNN-CUDA/
R.A. Gingold, J.J. Monaghan, Smoothed particle hydrodynamics: theory and application to non-spherical stars. Mon. Not. R. Astron. Soc. 181, 375–389 (1977)
M. Griebel, S. Knapek, G. Zumbusch, Numerical Simulation in Molecular Dynamics (Springer, Berlin/Heidelberg, 2007)
P. Leite, J.M. Teixeira, T. Farias, B. Reis, V. Teichrieb, J. Kelner, Nearest neighbor searches on the gpu. Int. J. Parallel Program. 40(3), 313–330 (2012) (English)
J. Mellor-Crummey, D. Whalley, K. Kennedy, Improving memory hierarchy performance fir irregular applications using data and computation reorderings. Int. J. Parallel Program. 29, 217–247 (2001)
D.M. Mount, S. Arya, ANN: a library for approximate nearest neighbor searching, http://www.cs.umd.edu/~mount/ANN/
S.A. Nene, S.K Nayar, A simple algorithm for nearest neighbor search in high dimensions. IEEE Trans. Pattern Anal. Mach. Intell. 19, 989–1003 (1997)
M.L. Parks, R.B. Lehoucq, S.J. Plimpton, S.A. Silling, Implementing peridynamics within a molecular dynamics code. Comput. Phys. Commun. (EL, ed.) 179, 777–783 (2008)
S. Plimpton, Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117, 1–19 (1995)
N. Satish, M. Harris, M. Garland, Designing efficient sorting algorithms for manycore GPUs, in IEEE International Symposium in Parallel & Distributed Processing, Rome, 2009, pp. 1–10
M.A. Schweitzer, A Parallel Multilevel Partition of Unity Method for Elliptic Partial Differential Equations. Lecture Notes in Computational Science and Engineering, vol. 29 (Springer, New York, 2003)
Y.D. Sergeyev, R.G. Strongin, D. Lera, Introduction to Global Optimization Exploiting Space-Filling Curves (Springer, New York/Heidelberg, 2013)
S.A. Silling, Reformulation of elasticity theory for discontinuties and long-range forces. Sandia report SAND98-2176, Sandia National Laboratories, 1998
S.A. Silling, E. Askari, A meshfree method based on the peridynamic model of solid mechanics. Comput. Struct. 83, 1526–1535 (2005)
E. Sintorn, U. Assarsson, Fast parallel GPU-sorting using a hybrid algorithm. J. Parallel Distrib. Comput. 68, 1381–1388 (2008)
H. Tropf, H. Herzog, Multidimensional range search in dynamically balanced trees. Angew. Inform. (Appl. Inform.) 2, 71–77 (1981). Vieweg Verlag
M.S. Warren, J.K. Salmon, A parallel hashed oct-tree n-body algorithm, in Proceedings of the 1993 ACM/IEEE Conference on Supercomputing (Supercomputing’93), Portland (ACM, New York, 1993), pp. 12–21
W. Wen-mei, GPU Computing Gems Emerald Edition Applications of GPU Computing Series, 1st edn. (Morgan Kaufmann, Burlington, Massachusetts 2011)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Diehl, P., Schweitzer, M.A. (2015). Efficient Neighbor Search for Particle Methods on GPUs. In: Griebel, M., Schweitzer, M. (eds) Meshfree Methods for Partial Differential Equations VII. Lecture Notes in Computational Science and Engineering, vol 100. Springer, Cham. https://doi.org/10.1007/978-3-319-06898-5_5
Download citation
DOI: https://doi.org/10.1007/978-3-319-06898-5_5
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06897-8
Online ISBN: 978-3-319-06898-5
eBook Packages: Mathematics and StatisticsMathematics and Statistics (R0)