Fast GPU-based Locality Sensitive Hashing for K-Nearest Neighbor
Computation
to perform approximate k-nearest neighbor computation in high-dimensional spaces. We use the Bi-level LSH algorithm, which can compute k-nearest neighbors with higher accuracy and is amenable to parallelization. During the first level, we use the parallel RP-tree algorithm to partition datasets into several groups so that items similar to each other are clustered together. The second level involves computing the Bi-Level LSH code for each item and constructing a hierarchical hash table. The hash table is based on parallel cuckoo hashing and Morton curves. In the query step, we use GPU-based work queues to accelerate short-list search, which is one of the main bottlenecks in LSH-based algorithms. We demonstrate the results on large image datasets with 200,000 images which are represented as 512 dimensional vectors. In practice, our GPU implementation can obtain more than 40x acceleration over a single-core CPU based LSH implementation.
introducing Bi-level Algorithm, International Conference on Data
Engineering (ICDE), 2012 Fast GPU-based
Locality Sensitive Hashing for K-Nearest Neighbor Computation (PDF) discussing the GPU Implementation, International Conference on
Advances in Geographic Information Systems (ACM SIGSPATIAL GIS), 2011 Software
|