CNN Visualization Techniques

Understanding what neural networks learn through filter visualization, embeddings, saliency maps, and input maximization.

Visualization: First-Layer Weights – ResNet18 (Typical Patterns)

First-layer convolution filters in CNNs learn basic visual features: edges at various orientations, color blobs, and Gabor-like texture detectors. Below are 64 synthetic 7×7 filters replicating well-known first-layer patterns.
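As a rough illustration of what "Gabor-like" means here, the sketch below builds a small bank of 7×7 Gabor kernels at several orientations with NumPy. It does not reproduce the 64 filters shown on this page or the weights of any trained ResNet18. The function names and the wavelength and sigma values are assumptions chosen for illustration.

```python
import numpy as np

def gabor_kernel(size=7, theta=0.0, wavelength=4.0, sigma=2.0, phase=0.0):
    """Return a size x size Gabor kernel: a cosine wave under a Gaussian envelope.

    All parameter defaults are illustrative assumptions, not values taken
    from a trained network.
    """
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the wave varies along direction theta.
    x_rot = xs * np.cos(theta) + ys * np.sin(theta)
    y_rot = -xs * np.sin(theta) + ys * np.cos(theta)
    envelope = np.exp(-(x_rot**2 + y_rot**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength + phase)
    kernel = envelope * carrier
    # Subtract the mean so the filter ignores constant brightness.
    return kernel - kernel.mean()

def gabor_bank(n_orientations=8, size=7):
    """Stack kernels at evenly spaced orientations in [0, pi)."""
    thetas = np.linspace(0.0, np.pi, n_orientations, endpoint=False)
    return np.stack([gabor_kernel(size=size, theta=t) for t in thetas])

bank = gabor_bank()
print(bank.shape)
```

Each kernel responds most strongly to edges perpendicular to its wave direction, which is the behavior commonly observed in learned first-layer filters.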

Visualization: Second-Layer Filters (3×3, grouped by channel)

Second-layer filters combine first-layer features into detectors for corners, curves, and oriented textures. Each group shows the 3×3 filters for one input channel.

Interactive PCA Embedding of ImageNet Class Vectors

2D PCA projection of class embedding vectors for 30 ImageNet classes. Semantically similar classes cluster together. Scroll to zoom, drag to pan.
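A 2D PCA projection of this kind can be sketched with a singular value decomposition. The example below is a minimal version that assumes the class embeddings are available as rows of a NumPy array. The random `embeddings` array and the 64-dimensional size are placeholders, not the vectors used for the plot.

```python
import numpy as np

def pca_project(vectors, n_components=2):
    """Project row vectors onto their top principal components."""
    centered = vectors - vectors.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return coords, explained[:n_components]

# Placeholder data: 30 classes with hypothetical 64-dimensional embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(30, 64))

coords, explained = pca_project(embeddings)
print(coords.shape, explained)
```

The `explained` values give the fraction of total variance captured by each plotted axis, which is a useful sanity check on how much structure a 2D view can show.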

Interactive PCA & t-SNE Embeddings (CIFAR-10)

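Compare PCA (linear) and t-SNE (non-linear) embeddings. PCA preserves global structure because it projects onto the directions of greatest variance; t-SNE preserves local neighborhoods and so separates clusters more clearly, though distances between t-SNE clusters are not directly meaningful. Adjust perplexity, roughly the effective number of neighbors each point considers, to see its effect on t-SNE clustering.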
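To make the perplexity setting more concrete, the sketch below shows the calibration step that t-SNE performs for each point: it searches for a Gaussian bandwidth whose neighbor distribution has the requested perplexity. This is only that one step, not a full t-SNE implementation. The function names, search bounds, and random data are assumptions.

```python
import numpy as np

def conditional_probs(sq_dists, sigma):
    """Gaussian neighbor probabilities p(j|i) from squared distances to the other points."""
    logits = -sq_dists / (2.0 * sigma**2)
    logits -= logits.max()  # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def perplexity(p):
    """Perplexity = 2 ** (Shannon entropy in bits)."""
    nonzero = p[p > 0]
    entropy_bits = -np.sum(nonzero * np.log2(nonzero))
    return 2.0 ** entropy_bits

def sigma_for_perplexity(sq_dists, target, lo=1e-3, hi=1e3, n_iter=60):
    """Bisection in log space; perplexity grows monotonically with sigma."""
    for _ in range(n_iter):
        mid = np.sqrt(lo * hi)
        if perplexity(conditional_probs(sq_dists, mid)) > target:
            hi = mid
        else:
            lo = mid
    return np.sqrt(lo * hi)

# Placeholder data: 200 random 2D points; calibrate the bandwidth for point 0.
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))
sq_dists = np.sum((points[1:] - points[0]) ** 2, axis=1)

sigma = sigma_for_perplexity(sq_dists, target=30.0)
achieved = perplexity(conditional_probs(sq_dists, sigma))
print(sigma, achieved)
```

A larger perplexity yields a wider bandwidth, so each point weighs more neighbors and the embedding tends to emphasize broader structure over tight local clusters.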

[Interactive figure: side-by-side "PCA Embedding" and "t-SNE Embedding" panels with a perplexity slider (default 30).]

Interactive Image Classification – Top-5 Predictions

ResNet18 predictions on the input image. The model confidently predicts "Golden Retriever".
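A top-5 list like this is typically obtained by applying softmax to the network's output logits and keeping the five largest probabilities. The sketch below assumes the logits are already available. The label list and logit values are made up for illustration and are not outputs of ResNet18.

```python
import numpy as np

def top_k_predictions(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs under softmax."""
    shifted = logits - np.max(logits)  # avoid overflow in exp
    exps = np.exp(shifted)
    probs = exps / np.sum(exps)
    order = np.argsort(probs)[::-1][:k]
    return [(labels[i], float(probs[i])) for i in order]

# Hypothetical labels and logits, for illustration only.
labels = ["golden retriever", "Labrador retriever", "cocker spaniel",
          "tennis ball", "kuvasz", "tabby cat"]
logits = np.array([9.1, 6.3, 4.0, 2.2, 3.5, 0.4])

top5 = top_k_predictions(logits, labels)
for name, prob in top5:
    print(f"{name}: {prob:.3f}")
```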

[Figure: "Input Image (224×224)" and "Top-5 Predictions" panels.]

Interactive Occlusion Sensitivity

Slide a black patch across the image and observe how occlusion of different regions affects the model's prediction confidence. The heatmap shows which regions are most important for classification. Drag the patch or adjust its size.
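The procedure described above can be sketched as a sliding-window loop. In this minimal version, `score_fn` is a hypothetical stand-in for the model's confidence in the predicted class, and `toy_score` is a made-up function that only looks at the image center. A real use would call the classifier instead.

```python
import numpy as np

def occlusion_sensitivity(image, score_fn, patch=30, stride=15, fill=0.0):
    """Average drop in score when each region is covered by a patch.

    score_fn is an assumed callable mapping an image to a scalar score.
    """
    height, width = image.shape[:2]
    baseline = score_fn(image)
    heat = np.zeros((height, width))
    counts = np.zeros((height, width))
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = fill  # black patch
            drop = baseline - score_fn(occluded)
            # Credit the score drop to every pixel the patch covered.
            heat[top:top + patch, left:left + patch] += drop
            counts[top:top + patch, left:left + patch] += 1
    return heat / np.maximum(counts, 1)

def toy_score(image):
    """Toy stand-in for a classifier: mean brightness of the central region."""
    return float(image[20:40, 20:40].mean())

image = np.ones((60, 60))
heat = occlusion_sensitivity(image, toy_score, patch=20, stride=10)
print(heat[30, 30], heat[0, 0])
```

Smaller patches and strides give a finer heatmap at the cost of more forward passes, since each patch position requires one model evaluation.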

[Interactive figure: "Image with Occlusion Patch" and "Sensitivity Heatmap" panels, with a patch-size slider (default 30) and a heatmap toggle.]

Concept: Gradient / Saliency Map

A saliency map highlights the pixels most relevant to the predicted class. It is computed as the absolute gradient of the class score with respect to each input pixel. Bright regions indicate where small pixel changes would most affect the classification.

How it works: Forward-pass the image → compute class score → backpropagate → take |∂score/∂pixel| → normalize to [0,1] → overlay as heatmap.
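The steps above can be sketched in NumPy with a tiny stand-in model whose gradient is written out by hand. The two-layer `class_score` function and its random weights are assumptions that replace the CNN. With a real network, an automatic differentiation framework would supply the gradient. Taking the maximum over color channels is one common convention and is also an assumption here.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, C, HIDDEN = 8, 8, 3, 16

# Toy stand-in for a CNN: score = w2 . tanh(W1 @ pixels). Weights are random.
W1 = rng.normal(scale=0.1, size=(HIDDEN, H * W * C))
w2 = rng.normal(size=HIDDEN)

def class_score(image):
    """Forward pass: scalar class score for one image."""
    return float(w2 @ np.tanh(W1 @ image.ravel()))

def score_gradient(image):
    """Backward pass: d(score)/d(pixel), derived by hand for the toy model."""
    hidden = np.tanh(W1 @ image.ravel())
    grad_flat = W1.T @ (w2 * (1.0 - hidden**2))
    return grad_flat.reshape(image.shape)

def saliency_map(image):
    """Absolute gradient, max over channels, normalized to [0, 1]."""
    grad = np.abs(score_gradient(image))
    per_pixel = grad.max(axis=-1)
    span = per_pixel.max() - per_pixel.min()
    return (per_pixel - per_pixel.min()) / (span + 1e-12)

image = rng.uniform(size=(H, W, C))
sal = saliency_map(image)
print(sal.shape, sal.min(), sal.max())
```

The resulting array can be blended over the original image as a heatmap at any chosen opacity.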

[Interactive figure: "Original Image" and "Saliency Overlay" panels, with an overlay-opacity slider (default 60%).]

Concept: Input Maximization (Feature Visualization)

Input maximization starts from random noise and iteratively adjusts pixels via gradient ascent to maximize a target neuron's activation. A total-variation (TV) loss regularizer penalizes high-frequency noise to produce interpretable patterns.

Optimization: x ← x + lr · ∂activation/∂x − λ · ∇TV(x)
The animation shows synthetic pattern emergence simulating this process.
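A minimal sketch of this update is shown below, using a toy linear "neuron" in place of a network unit so that the gradient is simply its weight template. Several choices here are assumptions: the squared-difference form of the TV penalty, the step size multiplying both terms, clipping pixels to [0, 1], and the values of `lr` and `lam`.

```python
import numpy as np

def tv_loss(x):
    """Squared-difference total variation (one common TV variant)."""
    dy = x[1:, :] - x[:-1, :]
    dx = x[:, 1:] - x[:, :-1]
    return float(np.sum(dy**2) + np.sum(dx**2))

def tv_grad(x):
    """Analytic gradient of tv_loss with respect to every pixel."""
    grad = np.zeros_like(x)
    dy = x[1:, :] - x[:-1, :]
    dx = x[:, 1:] - x[:, :-1]
    grad[1:, :] += 2.0 * dy
    grad[:-1, :] -= 2.0 * dy
    grad[:, 1:] += 2.0 * dx
    grad[:, :-1] -= 2.0 * dx
    return grad

def maximize_input(template, x0, lr=0.05, lam=0.5, steps=500):
    """Gradient ascent on activation - lam * TV for a toy linear neuron.

    activation(x) = sum(template * x), so d(activation)/dx = template.
    """
    x = x0.copy()
    for _ in range(steps):
        x = x + lr * (template - lam * tv_grad(x))
        x = np.clip(x, 0.0, 1.0)  # keep pixels in a valid range
    return x

size = 16
cols = np.arange(size)
# Hypothetical neuron that prefers vertical stripes with period 8.
template = np.tile(np.sin(2.0 * np.pi * cols / 8.0), (size, 1))

rng = np.random.default_rng(0)
noise = rng.uniform(size=(size, size))  # start from random noise

smooth = maximize_input(template, noise, lam=0.5)
rough = maximize_input(template, noise, lam=0.0)
print(np.sum(template * noise), np.sum(template * smooth))
print(tv_loss(smooth), tv_loss(rough))
```

Comparing the two runs shows the role of the regularizer: both raise the activation, but the run with `lam=0.0` keeps leftover noise that the TV term would otherwise smooth away.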

[Interactive figure: "Optimized Image" and "Loss Curves" panels plotting Class Score and TV Loss against Step, with a speed control.]