Section: New Results
Randomized Embeddings with Slack and High-Dimensional Approximate Nearest Neighbor
Participants : Evangelos Anagnostopoulos, Ioannis Emiris, Ioannis Psarros.
In [1], we study the approximate nearest neighbor problem (ε-ANN) in high-dimensional Euclidean space with methods beyond Locality Sensitive Hashing (LSH), which has polynomial dependence on the dimension and sublinear query time, but requires subquadratic space. In particular, we introduce a new definition of “low-quality” embeddings for metric spaces. It requires that, for some query point q, there exists an approximate nearest neighbor among the pre-images of the k approximate nearest neighbors in the target space. Focusing on Euclidean spaces, we employ random projections to reduce the original problem to one in a space of dimension inversely proportional to k. The k approximate nearest neighbors can then be retrieved efficiently by a data structure such as BBD-trees. The same approach is applied to the problem of computing an approximate near neighbor, where we obtain a data structure requiring linear space, and query time in