Locality-sensitive hashing

In computer science, locality-sensitive hashing (LSH) is a fuzzy hashing technique that hashes similar input items into the same "buckets" with high probability.[1] (The number of buckets is much smaller than the universe of possible input items.)[1] Since similar items end up in the same buckets, this technique can be used for data clustering and nearest neighbor search. It differs from conventional hashing techniques in that hash collisions are maximized, not minimized. Alternatively, the technique can be seen as a way to reduce the dimensionality of high-dimensional data; high-dimensional input items can be reduced to low-dimensional versions while preserving relative distances between items.

Hashing-based approximate nearest-neighbor search algorithms generally use one of two main categories of hashing methods: either data-independent methods, such as locality-sensitive hashing (LSH); or data-dependent methods, such as locality-preserving hashing (LPH).[2][3]

Locality-preserving hashing was initially devised as a way to facilitate data pipelining in implementations of massively parallel algorithms that use randomized routing and universal hashing to reduce memory contention and network congestion.[4][5]

  1. ^ a b Rajaraman, A.; Ullman, J. (2010). "Mining of Massive Datasets, Ch. 3".
  2. ^ Zhao, Kang; Lu, Hongtao; Mei, Jincheng (2014). Locality Preserving Hashing. AAAI Conference on Artificial Intelligence. Vol. 28. pp. 2874–2880.
  3. ^ Tsai, Yi-Hsuan; Yang, Ming-Hsuan (October 2014). "Locality preserving hashing". 2014 IEEE International Conference on Image Processing (ICIP). pp. 2988–2992. doi:10.1109/ICIP.2014.7025604. ISBN 978-1-4799-5751-4. ISSN 1522-4880. S2CID 8024458.
  4. ^ Chin, Andrew (1991). Complexity Issues in General Purpose Parallel Computing (DPhil). University of Oxford. pp. 87–95.
  5. ^ Chin, Andrew (1994). "Locality-Preserving Hash Functions for General Purpose Parallel Computation" (PDF). Algorithmica. 12 (2–3): 170–181. doi:10.1007/BF01185209. S2CID 18108051.

© MMXXIII Rich X Search. We shall prevail. All rights reserved. Rich X Search