K-Nearest Points to Origin¶

Finding the k closest points to a reference location is the geometric core of k-nearest neighbors, one of the most intuitive ML algorithms. This problem strips it down to Euclidean distance and a single reference point (the origin).

Problem¶

Given a list of 2D points, return the k points closest to the origin (0, 0) using Euclidean distance.

Input: a list of (x, y) coordinates and integer k. Output: the k closest points.

Approach¶

Sort-based, O(n log n). Compute each point's distance, sort, slice the first k. Simple, but you're paying to fully order all n points when you only need the top k.

Max-heap, O(n log k). Maintain a max-heap of size k. For each point, push it onto the heap; if the heap exceeds k, pop the farthest point. Since k is typically much smaller than n, this is significantly faster on large datasets, exactly the scenario you'd face when doing nearest-neighbor lookups over embeddings.

Implementation¶

Sort-based¶

def k_closest_sorted(points, k):
    return sorted(points, key=lambda p: p[0] ** 2 + p[1] ** 2)[:k]

points = [[1, 3], [-2, 2], [4, 1], [0, 1], [-1, 3]]
print(k_closest_sorted(points, 3))
# [[0, 1], [-2, 2], [1, 3]]

Max-heap¶

import heapq

def k_closest_heap(points, k):
    max_heap = []
    for x, y in points:
        dist = x ** 2 + y ** 2
        heapq.heappush(max_heap, (-dist, [x, y]))
        if len(max_heap) > k:
            heapq.heappop(max_heap)
    return [p for _, p in sorted(max_heap, key=lambda t: t[0], reverse=True)]

points = [[1, 3], [-2, 2], [4, 1], [0, 1], [-1, 3]]
print(k_closest_heap(points, 3))
# [[0, 1], [-2, 2], [1, 3]]

Complexity¶

Approach	Time	Space
Sort	O(n log n)	O(1) in-place sort
Max-heap	O(n log k)	O(k)

Takeaway¶

The heap approach is the right default when k << n. Libraries like FAISS go further with approximate methods, but the heap-based mental model is the foundation for any top-k retrieval.

Back to Algorithms & Data Structures