05 – Matching, Indexing & Search

SIFT keypoints, feature matching, bag of visual words, homography, and scale-space analysis

1. SIFT Keypoint Detection
SIFT (Scale-Invariant Feature Transform) detects keypoints at multiple scales by finding extrema in a Difference-of-Gaussians (DoG) pyramid. Below we approximate interest-point detection with a Harris corner detector implemented in JavaScript, which computes image gradients, builds the structure tensor M around each pixel, and evaluates the Harris response R = det(M) − k·trace(M)². A large positive R indicates a corner, a negative R an edge, and R near zero a flat region.
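The Harris response can be sketched in a few lines. This is a minimal illustration, not the code behind the demo: the function name `harrisResponse`, the central-difference gradients, the box window of radius `r`, and the default k = 0.04 (a commonly used value) are all assumptions made for the example.

```javascript
// Harris corner response for a grayscale image stored row-major in a flat array.
// Assumptions (illustrative only): central-difference gradients, a (2r+1)x(2r+1)
// box window for the structure tensor, and k = 0.04.
function harrisResponse(gray, width, height, k = 0.04, r = 1) {
  const n = width * height;
  const ixx = new Float64Array(n);
  const iyy = new Float64Array(n);
  const ixy = new Float64Array(n);

  // Gradient products from central differences (border pixels stay at 0).
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const i = y * width + x;
      const gx = (gray[i + 1] - gray[i - 1]) / 2;
      const gy = (gray[i + width] - gray[i - width]) / 2;
      ixx[i] = gx * gx;
      iyy[i] = gy * gy;
      ixy[i] = gx * gy;
    }
  }

  // Sum the products over the window to form M = [[a, b], [b, c]],
  // then evaluate R = det(M) - k * trace(M)^2.
  const response = new Float64Array(n);
  for (let y = r; y < height - r; y++) {
    for (let x = r; x < width - r; x++) {
      let a = 0, b = 0, c = 0;
      for (let dy = -r; dy <= r; dy++) {
        for (let dx = -r; dx <= r; dx++) {
          const j = (y + dy) * width + (x + dx);
          a += ixx[j];
          b += ixy[j];
          c += iyy[j];
        }
      }
      response[y * width + x] = a * c - b * b - k * (a + c) * (a + c);
    }
  }
  return response;
}

// Toy input: a bright square filling the lower-right part of a 9x9 image,
// so its top-left corner sits at pixel (4, 4).
const imgW = 9, imgH = 9;
const toyImage = new Float64Array(imgW * imgH);
for (let y = 4; y < imgH; y++) {
  for (let x = 4; x < imgW; x++) toyImage[y * imgW + x] = 1;
}
const harris = harrisResponse(toyImage, imgW, imgH);
const cornerR = harris[4 * imgW + 4]; // at the corner of the square
const edgeR = harris[6 * imgW + 4];   // on the vertical edge below the corner
const flatR = harris[2 * imgW + 2];   // in the dark, featureless region
```

On this toy image `cornerR` is positive, `edgeR` is negative, and `flatR` is zero, which is the sign pattern a threshold on R relies on. A practical detector would typically use a Gaussian window and non-maximum suppression, both omitted here for brevity.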
2. Feature Matching
Feature matching pairs keypoints between images using descriptor similarity. Here we extract small grayscale patches around detected corners and match them via normalised cross-correlation (NCC). Lines connect the best matches; green = strong match, yellow = weaker.
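As a rough sketch of the matching step, the snippet below computes NCC between flattened grayscale patches and keeps the best-scoring partner for each patch. The names `ncc` and `matchPatches` and the `minScore` threshold of 0.8 are hypothetical choices for illustration, not values taken from the demo.

```javascript
// Normalised cross-correlation between two equal-length grayscale patches.
// The result lies in [-1, 1]; constant patches (zero variance) score 0 here.
function ncc(a, b) {
  if (a.length !== b.length) throw new Error("patches must have the same size");
  const n = a.length;
  let meanA = 0, meanB = 0;
  for (let i = 0; i < n; i++) {
    meanA += a[i];
    meanB += b[i];
  }
  meanA /= n;
  meanB /= n;

  let num = 0, varA = 0, varB = 0;
  for (let i = 0; i < n; i++) {
    const da = a[i] - meanA;
    const db = b[i] - meanB;
    num += da * db;
    varA += da * da;
    varB += db * db;
  }
  const denom = Math.sqrt(varA * varB);
  return denom === 0 ? 0 : num / denom;
}

// For each patch in A, pick the best-scoring patch in B and keep the pair
// only if its score reaches minScore (an assumed threshold).
function matchPatches(patchesA, patchesB, minScore = 0.8) {
  const matches = [];
  patchesA.forEach((pa, i) => {
    let best = -1;
    let bestScore = -Infinity;
    patchesB.forEach((pb, j) => {
      const s = ncc(pa, pb);
      if (s > bestScore) {
        bestScore = s;
        best = j;
      }
    });
    if (best >= 0 && bestScore >= minScore) {
      matches.push({ i, j: best, score: bestScore });
    }
  });
  return matches;
}

// Tiny 2x2 patches, flattened. Each patch in B is a brightness-scaled copy
// of one patch in A, listed in the opposite order.
const patchesA = [[1, 2, 3, 4], [4, 1, 4, 1]];
const patchesB = [[8, 2, 8, 2], [10, 20, 30, 40]];
const matches = matchPatches(patchesA, patchesB);
```

Because NCC subtracts the mean and divides by the standard deviation of each patch, the scaled copies still match: patch 0 in A pairs with patch 1 in B, and patch 1 in A with patch 0 in B. This invariance to brightness and contrast changes is the main reason to prefer NCC over a raw sum of squared differences.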
3. Bag of Visual Words
Bag of Visual Words represents each image as a histogram over a vocabulary of "visual words" (cluster centres from k-means on local descriptors). Similar images produce similar histograms. Here we use colour-patch features as a proxy for SIFT descriptors, then cluster with k-means.
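The pipeline can be sketched as follows, using mean RGB colours of patches as stand-in descriptors. Everything here is illustrative: the helper names (`kmeans`, `bowHistogram`, `nearest`), the toy data, and in particular the initialisation from the first k descriptors, which is chosen only to keep the example deterministic (real implementations usually initialise randomly or with k-means++).

```javascript
// Squared Euclidean distance between two descriptors.
function sqDist(a, b) {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += (a[i] - b[i]) ** 2;
  return s;
}

// Index of the centre closest to descriptor p.
function nearest(centres, p) {
  let best = 0;
  for (let c = 1; c < centres.length; c++) {
    if (sqDist(centres[c], p) < sqDist(centres[best], p)) best = c;
  }
  return best;
}

// Lloyd's k-means. Centres start at the first k points (an assumption made
// so that the sketch gives the same result on every run).
function kmeans(points, k, iterations = 20) {
  let centres = points.slice(0, k).map(p => p.slice());
  for (let it = 0; it < iterations; it++) {
    const sums = centres.map(c => new Array(c.length).fill(0));
    const counts = new Array(k).fill(0);
    for (const p of points) {
      const c = nearest(centres, p);
      counts[c]++;
      p.forEach((v, d) => { sums[c][d] += v; });
    }
    // Move each centre to the mean of its members; keep empty clusters in place.
    centres = centres.map((c, i) =>
      counts[i] === 0 ? c : sums[i].map(v => v / counts[i]));
  }
  return centres;
}

// Normalised histogram of visual-word assignments for one image.
function bowHistogram(descriptors, vocabulary) {
  const hist = new Array(vocabulary.length).fill(0);
  for (const d of descriptors) hist[nearest(vocabulary, d)]++;
  return hist.map(v => v / descriptors.length);
}

// Toy descriptors: mean RGB colour of four patches per image.
const imageA = [[250, 10, 10], [240, 20, 15], [245, 5, 20], [10, 10, 250]]; // mostly red
const imageB = [[15, 20, 240], [5, 15, 245], [20, 5, 250], [250, 15, 10]]; // mostly blue

const vocabulary = kmeans(imageA.concat(imageB), 2);
const histA = bowHistogram(imageA, vocabulary);
const histB = bowHistogram(imageB, vocabulary);
```

With two visual words the clustering settles on one reddish and one bluish centre, so the mostly red image and the mostly blue image get mirrored histograms. Comparing such histograms (for example by histogram intersection or cosine similarity) is what turns the representation into an image-similarity measure.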
4. Homography Warping
A homography is a 3×3 projective transformation mapping points from one plane to another. The matrix is defined only up to scale, so it has 8 degrees of freedom; each point correspondence contributes 2 equations, so 4 correspondences (with no three points collinear) determine it. Click 4 points on each image (in matching order) to define the mapping, or use presets.
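One way to solve for the matrix from 4 correspondences is sketched below. It fixes the bottom-right entry to 1 and solves the resulting 8×8 linear system by Gaussian elimination. The function names (`solveLinear`, `homographyFrom4`, `applyHomography`) and the sample coordinates are hypothetical, and fixing that entry to 1 is a simplifying assumption that breaks down in the rare case where the true entry is 0.

```javascript
// Solve A x = b by Gaussian elimination with partial pivoting.
function solveLinear(A, b) {
  const n = b.length;
  const M = A.map((row, i) => [...row, b[i]]); // augmented matrix
  for (let col = 0; col < n; col++) {
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[pivot][col])) pivot = r;
    }
    if (Math.abs(M[pivot][col]) < 1e-12) {
      throw new Error("degenerate point configuration");
    }
    [M[col], M[pivot]] = [M[pivot], M[col]];
    for (let r = col + 1; r < n; r++) {
      const f = M[r][col] / M[col][col];
      for (let c = col; c <= n; c++) M[r][c] -= f * M[col][c];
    }
  }
  // Back substitution.
  const x = new Array(n).fill(0);
  for (let r = n - 1; r >= 0; r--) {
    let s = M[r][n];
    for (let c = r + 1; c < n; c++) s -= M[r][c] * x[c];
    x[r] = s / M[r][r];
  }
  return x;
}

// Each correspondence (x, y) -> (u, v) gives two linear equations in the
// eight unknowns h11..h32, with h33 fixed to 1 (assumption).
function homographyFrom4(src, dst) {
  const A = [];
  const b = [];
  for (let i = 0; i < 4; i++) {
    const [x, y] = src[i];
    const [u, v] = dst[i];
    A.push([x, y, 1, 0, 0, 0, -u * x, -u * y]);
    b.push(u);
    A.push([0, 0, 0, x, y, 1, -v * x, -v * y]);
    b.push(v);
  }
  const h = solveLinear(A, b);
  return [[h[0], h[1], h[2]], [h[3], h[4], h[5]], [h[6], h[7], 1]];
}

// Map a point through the homography, dividing by the homogeneous coordinate.
function applyHomography(Hm, [x, y]) {
  const w = Hm[2][0] * x + Hm[2][1] * y + Hm[2][2];
  return [
    (Hm[0][0] * x + Hm[0][1] * y + Hm[0][2]) / w,
    (Hm[1][0] * x + Hm[1][1] * y + Hm[1][2]) / w,
  ];
}

// Unit square mapped to an arbitrary convex quadrilateral (made-up coordinates).
const srcPts = [[0, 0], [1, 0], [1, 1], [0, 1]];
const dstPts = [[10, 20], [110, 30], [100, 140], [5, 120]];
const homography = homographyFrom4(srcPts, dstPts);
const mappedPts = srcPts.map(p => applyHomography(homography, p));
```

Applying the recovered matrix to the four source points reproduces the destination points up to floating-point error. To warp an image, one would typically invert the matrix and, for every destination pixel, sample the source image at the back-projected location.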
Panels: source image, destination image, warped result.
5. Laplacian of Gaussian (LoG)
The Laplacian of Gaussian is used for blob detection and edge detection in scale space. For a 1D signal, LoG highlights zero-crossings that correspond to edges.
LoG(x) = −1/(√(2π)·σ³) · (1 − x²/σ²) · exp(−x² / (2σ²))
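A minimal sketch of 1D LoG filtering on a step edge follows. The helper names (`logKernel`, `convolve`, `zeroCrossings`), the kernel radius of 4σ, and the edge-replicating border handling are assumptions for illustration. The kernel is scaled by −1/(√(2π)·σ³), the normalisation of the second derivative of a 1D Gaussian; the positions of the zero-crossings do not depend on this constant factor.

```javascript
// Sampled 1D LoG kernel: the second derivative of a Gaussian with
// standard deviation sigma, truncated at +/- radius samples.
function logKernel(sigma, radius = Math.ceil(4 * sigma)) {
  const kernel = [];
  const norm = -1 / (Math.sqrt(2 * Math.PI) * sigma ** 3);
  for (let x = -radius; x <= radius; x++) {
    const t = (x * x) / (sigma * sigma);
    kernel.push(norm * (1 - t) * Math.exp(-t / 2));
  }
  return kernel;
}

// Same-length convolution; samples outside the signal repeat the border value.
function convolve(signal, kernel) {
  const r = (kernel.length - 1) / 2;
  return signal.map((_, i) => {
    let s = 0;
    for (let k = -r; k <= r; k++) {
      const j = Math.min(signal.length - 1, Math.max(0, i - k));
      s += signal[j] * kernel[k + r];
    }
    return s;
  });
}

// Indices i where the response changes sign between samples i and i + 1.
// eps ignores numerically insignificant values.
function zeroCrossings(response, eps = 1e-6) {
  const out = [];
  for (let i = 0; i + 1 < response.length; i++) {
    const a = response[i];
    const b = response[i + 1];
    if ((a > eps && b < -eps) || (a < -eps && b > eps)) out.push(i);
  }
  return out;
}

// Step edge: 0 for samples 0..31, 1 for samples 32..63.
const stepSignal = Array.from({ length: 64 }, (_, i) => (i < 32 ? 0 : 1));
const logResponse = convolve(stepSignal, logKernel(2));
const crossings = zeroCrossings(logResponse);
```

For this step signal the response is positive just before the edge and negative just after it, so the only sign change lies between samples 31 and 32, exactly where the step occurs. Repeating the convolution with several σ values gives a multi-scale view: small σ localises sharp edges, while larger σ responds to coarser structure.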
Plots: input signal, LoG kernel, convolution result, and multi-scale LoG responses (8 σ values).

Made with ❤️ by Mark Žnidar