SIFT keypoints, feature matching, bag of visual words, homography, and scale-space analysis
1. SIFT Keypoint Detection: interest points
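SIFT (Scale-Invariant Feature Transform) detects keypoints at multiple scales by finding
local extrema in a Difference-of-Gaussians (DoG) pyramid. The demo below approximates interest-point detection with a
Harris corner detector implemented in JavaScript. Harris works at a single scale, so it is a stand-in for SIFT rather than SIFT itself. It computes image gradients, the structure
tensor M, and the Harris response R = det(M) − k·trace(M)², where k is a small empirical constant (typically 0.04 to 0.06). R is large and positive at corners, negative along edges, and near zero in flat regions.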
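The Harris response described above can be sketched as follows. This is a minimal illustration, not the demo's actual code: the function name `harrisResponse`, the flat row-major grayscale array, the central-difference gradients, and the box window of radius `r` (in place of a Gaussian window) are all assumptions made for brevity.

```javascript
// Harris response for a grayscale image stored row-major in a flat array.
// Assumptions: central-difference gradients and a (2r+1)x(2r+1) box window
// instead of a Gaussian window; pixels too close to the border stay at 0.
function harrisResponse(gray, width, height, k = 0.04, r = 1) {
  const n = width * height;
  const ixx = new Float64Array(n);
  const iyy = new Float64Array(n);
  const ixy = new Float64Array(n);

  // Per-pixel gradient products Ix^2, Iy^2, Ix*Iy.
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const i = y * width + x;
      const gx = (gray[i + 1] - gray[i - 1]) / 2;
      const gy = (gray[i + width] - gray[i - width]) / 2;
      ixx[i] = gx * gx;
      iyy[i] = gy * gy;
      ixy[i] = gx * gy;
    }
  }

  // Sum the products over the window to get the structure tensor
  // M = [[a, b], [b, c]], then evaluate R = det(M) - k * trace(M)^2.
  const R = new Float64Array(n);
  for (let y = r; y < height - r; y++) {
    for (let x = r; x < width - r; x++) {
      let a = 0, b = 0, c = 0;
      for (let dy = -r; dy <= r; dy++) {
        for (let dx = -r; dx <= r; dx++) {
          const j = (y + dy) * width + (x + dx);
          a += ixx[j];
          b += ixy[j];
          c += iyy[j];
        }
      }
      const det = a * c - b * b;
      const trace = a + c;
      R[y * width + x] = det - k * trace * trace;
    }
  }
  return R;
}
```

Keypoints would then be the local maxima of `R` above some threshold; that selection step is omitted here.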
2. Feature Matching: patch descriptors
Feature matching pairs keypoints between images using descriptor similarity. Here we extract small
grayscale patches around detected corners and match them via normalised cross-correlation (NCC).
Lines connect the best matches; green = strong match, yellow = weaker.
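As a rough sketch of the matching step, the snippet below computes NCC between two equally sized patches and picks the best candidate. The names `ncc` and `bestMatch` are hypothetical, and patches are assumed to be flat arrays of grayscale values of the same length.

```javascript
// Normalised cross-correlation between two equally sized patches.
// Returns a value in [-1, 1]; 1 means identical up to brightness and contrast.
function ncc(a, b) {
  const n = a.length;
  let meanA = 0, meanB = 0;
  for (let i = 0; i < n; i++) {
    meanA += a[i];
    meanB += b[i];
  }
  meanA /= n;
  meanB /= n;

  let num = 0, varA = 0, varB = 0;
  for (let i = 0; i < n; i++) {
    const da = a[i] - meanA;
    const db = b[i] - meanB;
    num += da * db;
    varA += da * da;
    varB += db * db;
  }
  const denom = Math.sqrt(varA * varB);
  // A constant patch has zero variance; treat it as uncorrelated (assumption).
  return denom === 0 ? 0 : num / denom;
}

// Index and score of the candidate patch with the highest NCC.
function bestMatch(patch, candidates) {
  let index = -1;
  let score = -Infinity;
  candidates.forEach((candidate, i) => {
    const s = ncc(patch, candidate);
    if (s > score) {
      score = s;
      index = i;
    }
  });
  return { index, score };
}
```

Because the mean is subtracted and the result is divided by the standard deviations, the score is unaffected by a uniform brightness offset or contrast scaling between the two images.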
3. Bag of Visual Words: image retrieval
Bag of Visual Words represents each image as a histogram over a vocabulary of "visual words"
(cluster centres from k-means on local descriptors). Similar images produce similar histograms.
Here we use colour-patch features as a proxy for SIFT descriptors, then cluster with k-means.
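The histogram step can be sketched as below, assuming the vocabulary (the k-means cluster centres) has already been computed. The function names are hypothetical, and histogram intersection is used as one possible similarity measure; the demo may compare histograms differently.

```javascript
// Index of the vocabulary entry (cluster centre) closest to a descriptor,
// using squared Euclidean distance.
function nearestWord(descriptor, vocabulary) {
  let best = 0;
  let bestDist = Infinity;
  vocabulary.forEach((centre, w) => {
    let d = 0;
    for (let i = 0; i < centre.length; i++) {
      const diff = descriptor[i] - centre[i];
      d += diff * diff;
    }
    if (d < bestDist) {
      bestDist = d;
      best = w;
    }
  });
  return best;
}

// Normalised histogram of visual-word occurrences for one image.
function bowHistogram(descriptors, vocabulary) {
  const hist = new Array(vocabulary.length).fill(0);
  for (const d of descriptors) {
    hist[nearestWord(d, vocabulary)] += 1;
  }
  const total = descriptors.length || 1; // avoid dividing by zero
  return hist.map((count) => count / total);
}

// Similarity of two normalised histograms; 1 means identical.
function histogramIntersection(h1, h2) {
  return h1.reduce((sum, v, i) => sum + Math.min(v, h2[i]), 0);
}
```

Retrieval then amounts to computing the histogram of a query image and ranking the other images by their similarity to it.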
4. Homography Warping: projective transform
A homography is a 3×3 projective transformation mapping points from one plane to another.
The matrix is defined only up to scale, so it has 8 degrees of freedom. Each point correspondence gives 2 equations, so 4 correspondences (no three points collinear) are enough to solve for it. Click 4 points on each image
(in matching order) to define the mapping, or use presets.
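One way to solve for the mapping from 4 correspondences is sketched below. It fixes the bottom-right entry of the matrix to 1 and solves the resulting 8×8 linear system by Gaussian elimination. The function names are hypothetical, points are assumed to be `[x, y]` pairs, and fixing that entry to 1 is an assumption that fails for mappings that send the origin to infinity.

```javascript
// Solve A x = b by Gaussian elimination with partial pivoting.
function solveLinear(A, b) {
  const n = b.length;
  const M = A.map((row, i) => [...row, b[i]]); // augmented matrix
  for (let col = 0; col < n; col++) {
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[pivot][col])) pivot = r;
    }
    if (Math.abs(M[pivot][col]) < 1e-12) {
      throw new Error("degenerate point configuration");
    }
    const tmp = M[col];
    M[col] = M[pivot];
    M[pivot] = tmp;
    for (let r = col + 1; r < n; r++) {
      const f = M[r][col] / M[col][col];
      for (let c = col; c <= n; c++) M[r][c] -= f * M[col][c];
    }
  }
  // Back substitution.
  const x = new Array(n).fill(0);
  for (let r = n - 1; r >= 0; r--) {
    let s = M[r][n];
    for (let c = r + 1; c < n; c++) s -= M[r][c] * x[c];
    x[r] = s / M[r][r];
  }
  return x;
}

// 3x3 homography mapping src[i] -> dst[i] for 4 point pairs.
function homographyFrom4(src, dst) {
  const A = [];
  const b = [];
  for (let i = 0; i < 4; i++) {
    const [x, y] = src[i];
    const [u, v] = dst[i];
    // u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and similarly for v.
    A.push([x, y, 1, 0, 0, 0, -u * x, -u * y]);
    b.push(u);
    A.push([0, 0, 0, x, y, 1, -v * x, -v * y]);
    b.push(v);
  }
  const h = solveLinear(A, b);
  return [[h[0], h[1], h[2]], [h[3], h[4], h[5]], [h[6], h[7], 1]];
}

// Apply H to a point, including the projective division.
function applyHomography(H, [x, y]) {
  const w = H[2][0] * x + H[2][1] * y + H[2][2];
  return [
    (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
    (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
  ];
}
```

To warp an image, it is usual to iterate over destination pixels and apply the inverse homography to find the source pixel to sample, which avoids holes in the output.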
Panels: source image (click 4 points), destination image (click 4 points), warped result.
5. Laplacian of Gaussian (LoG): scale space
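The Laplacian of Gaussian (LoG) smooths a signal with a Gaussian of width σ and then takes its second derivative. It is used for blob detection and edge detection in scale space.
For a 1D signal, the LoG response crosses zero at edges, so zero-crossings mark edge locations.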
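A minimal 1D sketch of this idea follows. The function names, the kernel radius of 4σ, the clamped borders, and the `minSwing` threshold used to ignore numerical noise are all assumptions chosen for illustration.

```javascript
// Sampled 1D LoG kernel: the second derivative of a Gaussian,
// g''(x) = ((x^2 - sigma^2) / sigma^4) * g(x).
// The mean is subtracted so that constant regions give a zero response.
function logKernel1D(sigma) {
  const radius = Math.ceil(4 * sigma);
  const kernel = [];
  let sum = 0;
  for (let x = -radius; x <= radius; x++) {
    const g = Math.exp(-(x * x) / (2 * sigma * sigma)) /
      (Math.sqrt(2 * Math.PI) * sigma);
    const v = ((x * x - sigma * sigma) / Math.pow(sigma, 4)) * g;
    kernel.push(v);
    sum += v;
  }
  const mean = sum / kernel.length;
  return kernel.map((v) => v - mean);
}

// Filter a signal with the kernel, clamping indices at the borders.
function convolve1D(signal, kernel) {
  const radius = (kernel.length - 1) / 2;
  const n = signal.length;
  return signal.map((_, i) => {
    let acc = 0;
    for (let k = -radius; k <= radius; k++) {
      const j = Math.min(Math.max(i + k, 0), n - 1);
      acc += kernel[k + radius] * signal[j];
    }
    return acc;
  });
}

// Indices i where the response changes sign between i and i + 1.
// Sign changes with a swing below minSwing are treated as noise.
function zeroCrossings(response, minSwing = 1e-3) {
  const crossings = [];
  for (let i = 0; i + 1 < response.length; i++) {
    const a = response[i];
    const b = response[i + 1];
    if (a * b < 0 && Math.abs(a - b) > minSwing) crossings.push(i);
  }
  return crossings;
}
```

For a step edge, the response is positive on the dark side and negative on the bright side, so the single zero-crossing sits at the step. Repeating the filter with several values of σ gives the scale-space view: larger σ suppresses fine detail and keeps only coarse edges.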