
CNN
1. Motivation: Why CNNs? - Limitations of traditional MLPs for images - Translation invariance & parameter efficiency
2. CNN Architecture Components - Feature Extractors & Classifiers - VGG-16 example
3. The Convolutional Block - Convolutional Layers: Filters, Kernels, Operations - Pooling Layers: Downsampling & regularization
4. The Fully Connected Classifier - Transforming features to probabilities
5. Advanced Topics & Summary - Dropout, Batch Normalization, 3D Convolutions
Traditional Fully Connected Networks (MLPs) struggle with image data.
Caution
MLPs treat each pixel as an independent feature. Spatial relationships are lost if features move.
Tip
Large number of parameters makes training difficult and increases overfitting risk.


CNNs are designed to efficiently process image data.
Important
CNNs leverage local spatial coherence in images.


Most CNNs follow a two-part structure:
Feature Extractor and Classifier
Extracting features using learnable filters.
(e.g., 3x3) matrix of weights.
Note
Filter weights are learned during training. Unlike fixed filters (e.g., Sobel), CNN filters adapt for optimal feature detection.

Controlling the output size and feature detection.
Tip
Adjusting stride allows control over spatial dimension reduction. A stride of 2 halves the spatial dimensions.


(https://learnopencv.com/wp-content/uploads/2024/06/padding-stride-2.png)
](https://learnopencv.com/wp-content/uploads/2024/06/no_padding_no_strides.gif)
](https://learnopencv.com/wp-content/uploads/2024/06/same_padding_no_strides.gif)
The output size (O) of a 2D convolution is calculated by:
\[ O = \left\lfloor \frac{n - f + 2p}{s} \right\rfloor + 1 \]
Where: - n: Input size (height or width) - f: Kernel size - p: Padding - s: Stride
Try different values to see how they affect output size.
viewof input_n = Inputs.range([16, 256], {value: 32, step: 1, label: "Input Size (n)"});
viewof kernel_f = Inputs.range([1, 7], {value: 3, step: 1, label: "Kernel Size (f)"});
viewof padding_p = Inputs.range([0, 3], {value: 1, step: 1, label: "Padding (p)"});
viewof stride_s = Inputs.range([1, 4], {value: 1, step: 1, label: "Stride (s)"});A basic visual representation of a 2D convolution.

Convolutional Operation
A concrete example of a fixed filter’s operation.
Note
In CNNs, kernel weights are learned, allowing detection of diverse, complex features, not just predefined edges.


Understanding channels, filters, and trainable parameters.
fxfr): Typically 3x3 or 5x5.Important
A filter is a container for kernels. If input depth is C, a filter has C kernels.
(kernel_width * kernel_height * input_channels + 1) * num_filters (The +1 is for the bias term per filter.)Let’s illustrate:
224x224x3 (RGB Image)3x3 spatial size32Parameters = (3 * 3 * 3 + 1) * 32 Parameters = (27 + 1) * 32 Parameters = 28 * 32 = 896



From simple edges to complex object parts.

Tip
This hierarchical learning is why CNNs are so powerful. They build up complex understanding from simple visual primitives.

Summarizing features and reducing computations.
2x2).Input 4x4 Activation Map, 2x2 Filter, Stride 2

Note
Pooling layers summarily represent features in a smaller space. Think of it as feature aggregation.

Combining convolution and pooling.
Tip
VGG-16 uses 2-3 convolutional layers before each max pooling layer. Number of filters typically doubles with depth (e.g., 64 \(\rightarrow\) 128 \(\rightarrow\) 256).
Convolutional Block Detail
Mapping extracted features to class probabilities.
7x7x512) is reshaped into a 1D vector (e.g., 25088 features).
[0,1] range, sums to 1).
Note
Flattening doesn’t lose spatial information inherently; it just reorganizes it for the dense layer’s input.


Connecting learned features to actionable predictions.
Tip
Minimizing the loss function tunes the weights to effectively map features to class probabilities.


Enhancing training and performance.

Note
Crucial for dynamic signals and volumetric data in ECE applications.
Consolidating our understanding of CNNs.
Important
CNNs are the backbone of modern computer vision, driving innovation in diverse ECE applications.