5. Deep Learning: CNN, RNN, NLP
In the 1960s, research by David Hubel and Torsten Wiesel revealed how the visual cortex processes visual information.

Our visual system integrates information from these small receptive fields.
This process forms the complete images we perceive.
This biological mechanism provided a key inspiration for Convolutional Neural Networks (CNNs).

Inspired by the visual cortex, CNNs emerged in the 1980s.
CNNs are specialized neural networks for processing structured grid data, like images.
They incorporate unique layer types:

The power of CNNs lies in their flexible architecture.
Different combinations and ordering of layers lead to varied results.
This allows for optimization based on the specific task.

The simplest building block of a typical neural network.
A basic computational unit that processes inputs to produce an output.
Sensitivity to Input Changes:

CNNs introduce specialized layers to address the limitations of plain ANNs for image processing.
The processed output then feeds into a fully connected neural network.
A fundamental operation in CNNs.
It uses a filter (also called a kernel, mask, or convolution matrix) to analyze the influence of nearby pixels.
Consider a simple image: a rectangle with two shaded halves.
Pixel intensities are represented numerically.

We apply a 3x3 filter designed to create a blurring effect.
Each element in the filter has a specific weight.

The filter is applied by sliding it over the image.
At each position, it computes a new pixel value.
To calculate the new value for a pixel:

The new pixel value is an average of its neighbors.
This averaging effect reduces sharp intensity changes, leading to blurring.
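The averaging step can be sketched in a few lines of NumPy. This is a minimal illustration, not the slide's exact example: the image values (0 and 90) and the equal 1/9 weights are assumptions, and `convolve_valid` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical 4x6 grayscale image: dark left half (0), bright right half (90).
image = np.array([[0, 0, 0, 90, 90, 90]] * 4, dtype=float)

# Assumed 3x3 box-blur kernel: every weight is 1/9, so each output
# value is the mean of the 3x3 neighborhood under the kernel.
kernel = np.full((3, 3), 1.0 / 9.0)

def convolve_valid(img, k):
    """Slide k over img without padding and return the weighted sums.

    The kernel is not flipped; for a symmetric kernel like this one
    the result is the same either way.
    """
    kh, kw = k.shape
    out_h = img.shape[0] - kh + 1
    out_w = img.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(img[r:r + kh, c:c + kw] * k)
    return out

blurred = convolve_valid(image, kernel)
print(blurred)  # each row is approximately [0, 30, 60, 90]
```

The hard jump from 0 to 90 becomes a ramp through 30 and 60, which is the blurring effect.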

When the filter reaches the image edge, padding is often used.
Commonly, the original image is padded with zeros around its borders.

The padding ensures the filter can be centered on edge pixels.
Only the part of the filter overlapping the original image contributes to the sum.
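A small sketch of zero padding, assuming a hypothetical 3x3 image and an averaging kernel with weights of 1/9. It uses `np.pad` to add the border and evaluates the kernel centered on the top-left pixel.

```python
import numpy as np

# Hypothetical 3x3 image, chosen only for illustration.
image = np.array([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]], dtype=float)

# Add a one-pixel border of zeros on every side: 3x3 becomes 5x5.
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

# Assumed 3x3 averaging kernel (each weight is 1/9).
kernel = np.full((3, 3), 1.0 / 9.0)

# Center the kernel on the original top-left pixel (value 10).
# Five of the nine positions fall on the zero border and add nothing.
window = padded[0:3, 0:3]
top_left = np.sum(window * kernel)

print(padded.shape)  # (5, 5)
print(top_left)      # (10 + 20 + 40 + 50) / 9
```

With this border, the output can have the same height and width as the original image.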
These kernels are designed to detect sharp intensity changes, indicating lines or edges.

An image with a vertical line where shading changes.
We will apply the \(G_x\) kernel to detect this line.

Applying \(G_x\) to the first 3x3 block yields 0.
No intensity change within this block.

Shifting the \(G_x\) kernel one pixel to the right.
The kernel now straddles the intensity change.

The calculation results in a non-zero value (\(200/9\)).
This indicates the presence of a vertical edge.

Shifting the \(G_x\) kernel one more pixel to the right.
The kernel is now fully on the right side of the edge.

The calculation results in \(300/9\).
This continues to highlight the edge region.

Shifting the \(G_x\) kernel one final pixel to the right.
The kernel is now past the intensity change.

The calculation returns 0 again.
This demonstrates that \(G_x\) effectively detects vertical lines by identifying sharp horizontal intensity transitions.
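The same walk across an edge can be reproduced numerically. This sketch assumes the standard unscaled Sobel \(G_x\) kernel and a hypothetical image with intensities 0 and 100, so the magnitudes differ from the slide's figures; only the pattern (zero, non-zero, non-zero, zero) is the point.

```python
import numpy as np

# Standard Sobel kernel for horizontal intensity change (assumed, unscaled).
g_x = np.array([[-1, 0, 1],
                [-2, 0, 2],
                [-1, 0, 1]], dtype=float)

# Hypothetical 3x6 image: dark left half, bright right half,
# with a vertical edge between columns 2 and 3.
image = np.array([[0, 0, 0, 100, 100, 100]] * 3, dtype=float)

# Slide the kernel one pixel at a time from left to right.
responses = [
    float(np.sum(image[:, c:c + 3] * g_x))
    for c in range(image.shape[1] - 2)
]
print(responses)  # [0.0, 400.0, 400.0, 0.0]
```

The response is zero where the 3x3 block is uniform and non-zero where the block straddles the edge.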

In CNNs, the kernel values are learned during training.
The network automatically identifies important features.
We don’t explicitly tell the model what to look for (e.g., “vertical lines”).

A downsampling technique often applied after convolution.
Objective: Reduce data size without losing critical information.
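One common form of pooling is max pooling, which keeps only the largest value in each window. The sketch below assumes a 2x2 window with stride 2 and a hypothetical 4x4 feature map; `max_pool2d` is an illustrative helper, not a library function.

```python
import numpy as np

def max_pool2d(x, window=2, stride=2):
    """Return the maximum of each window as it steps across x."""
    out_h = (x.shape[0] - window) // stride + 1
    out_w = (x.shape[1] - window) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for r in range(out_h):
        for c in range(out_w):
            r0, c0 = r * stride, c * stride
            out[r, c] = x[r0:r0 + window, c0:c0 + window].max()
    return out

# Hypothetical 4x4 feature map produced by a convolution layer.
feature_map = np.array([[1, 3, 2, 0],
                        [4, 6, 1, 1],
                        [0, 2, 9, 5],
                        [1, 1, 3, 7]])

pooled = max_pool2d(feature_map)
print(pooled)  # [[6 2]
               #  [2 9]]
```

The 4x4 map shrinks to 2x2 (a quarter of the values) while the strongest activation in each region survives.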
While CNNs learn many parameters, users define several key hyperparameters:
Convolution:
- Number of filters (features)
- Size of filters
Pooling:
- Window size
- Stride
Fully Connected Layers:
- Number of nodes
The number and order of layers of each type are also chosen by the user.
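To see how these choices interact, the sketch below traces the output shape through a hypothetical layer stack. The 28x28 input, the filter counts, and the helper names are assumptions for illustration; the size formula is the usual floor((n + 2p - k) / s) + 1.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (n + 2 * padding - kernel) // stride + 1

# Hypothetical stack: each entry is (layer type, hyperparameters).
layers = [
    ("conv", {"kernel": 3, "filters": 8}),
    ("pool", {"kernel": 2, "stride": 2}),
    ("conv", {"kernel": 3, "filters": 16}),
    ("pool", {"kernel": 2, "stride": 2}),
]

def trace_shapes(size, channels, layers):
    """Return (height, width, channels) after each layer."""
    shapes = []
    for kind, p in layers:
        size = conv_output_size(
            size, p["kernel"], p.get("stride", 1), p.get("padding", 0)
        )
        if kind == "conv":
            channels = p["filters"]  # one output channel per filter
        shapes.append((size, size, channels))
    return shapes

shapes = trace_shapes(28, 1, layers)
print(shapes)  # [(26, 26, 8), (13, 13, 8), (11, 11, 16), (5, 5, 16)]

# Number of values handed to the first fully connected layer.
h, w, c = shapes[-1]
flattened = h * w * c
print(flattened)  # 400
```

Changing the filter size, the stride, or the layer order changes every shape downstream, including the input size of the fully connected layers.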
In the lab, you will build and experiment with a Convolutional Neural Network.
You’ll apply these concepts to a practical image classification task.
The networks covered so far were feedforward: information flows in one direction, from input to output.
Recurrent Neural Networks (RNNs) add a feedback loop, so the output at one step depends on earlier inputs.
Note: RNNs excel in tasks where the order of data points is crucial, such as time series or natural language.
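A minimal sketch of why order matters to an RNN, assuming the common "vanilla" update \(h_t = \tanh(W_x x_t + W_h h_{t-1} + b)\). The weights here are random and untrained, and all names (`W_x`, `W_h`, `rnn_forward`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 1, 4

# Untrained, randomly initialized weights (illustration only).
W_x = rng.normal(scale=0.5, size=(hidden_size, input_size))
W_h = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)

def rnn_forward(sequence):
    """Run the recurrence over a sequence and return every hidden state."""
    h = np.zeros(hidden_size)  # initial state: no memory yet
    states = []
    for x_t in sequence:
        # The new state mixes the current input with the previous state.
        h = np.tanh(W_x @ np.atleast_1d(x_t) + W_h @ h + b)
        states.append(h)
    return np.array(states)

forward = rnn_forward([1.0, 0.0, 0.0])
reverse = rnn_forward([0.0, 0.0, 1.0])

# Same values, different order: the final hidden states differ.
print(forward[-1])
print(reverse[-1])
```

A feedforward layer that only saw the set of values could not tell these two sequences apart; the recurrent state can.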

A feedforward neuron with a crucial addition:

Visualizing the data flow:

Addresses the “short memory” problem of standard RNNs.

RNNs excel in tasks involving sequential data:
… and many more!
RNNs are particularly strong in sequence prediction.
Unlike earlier models, RNNs inherently consider the temporal dependency of data.
Previous models often assumed time-independent data.
Important: For example, predicting future sensor readings from past measurements in an ECE system.
Time series data is an ordered set of data points indexed by time.
The inherent ordering makes it ideal for RNNs.

Sequence prediction aims to forecast future values based on historical data.
Example: Predicting the next quarter’s performance from a year of data.
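Sequence prediction is often framed as supervised learning by slicing the series into (history, next value) pairs. The sketch below assumes a hypothetical series of six quarterly figures and a window of four; `make_windows` is an illustrative helper.

```python
import numpy as np

def make_windows(series, window):
    """Split a series into inputs of length `window` and the value that follows."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

# Hypothetical quarterly performance figures.
quarterly = [10, 12, 11, 13, 15, 14]

X, y = make_windows(quarterly, window=4)
print(X)  # [[10. 12. 11. 13.]
          #  [12. 11. 13. 15.]]
print(y)  # [15. 14.]
```

Each row of `X` is one year of history and the matching entry of `y` is the quarter the model should learn to predict.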

Predicting future stock prices based on historical market data.
A complex task due to market volatility, but RNNs can capture trends.

Forecasting weather patterns based on past meteorological data.
While complex, RNNs can identify subtle temporal dependencies.

Predicting daily traveler numbers at a train station.
RNNs can learn seasonality (weekdays vs. weekends, holidays).
Requires sufficient historical data to capture these patterns.

In the lab, you will use an RNN to predict a sequence of vibration readings from an engine.
You’ll apply TensorFlow with Keras to build, test, and tune your model.
The interaction between computers and human (natural) language.
Enables computers to understand, interpret, and generate human language.

… and so much more!
Character-Level Models:
- Process text one character at a time.
- Can handle out-of-vocabulary words and typos.

Word-Level Models:
- Process text one word at a time.
- More common for English and similar languages.
- Often train faster and perform well.
Which is better? Depends on the language and use case.
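The trade-off can be seen with a tiny, hypothetical corpus. A word-level vocabulary has no entry for an unseen word, while a character-level vocabulary can still represent it.

```python
# Hypothetical training text.
corpus = "the cat sat"

word_tokens = corpus.split()   # ['the', 'cat', 'sat']
char_tokens = list(corpus)     # ['t', 'h', 'e', ' ', 'c', ...]

word_vocab = set(word_tokens)
char_vocab = set(char_tokens)

# A word that never appeared in the corpus.
new_word = "cats"

word_known = new_word in word_vocab
chars_known = all(ch in char_vocab for ch in new_word)

print(len(word_tokens), len(char_tokens))  # 3 11
print(word_known)   # False: out of vocabulary at word level
print(chars_known)  # True: every character was seen before
```

The cost is sequence length: the same text is 3 word tokens but 11 character tokens.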

A powerful tool for pattern matching in strings.
Used for extracting, validating, or modifying text.
| regex | matches |
|---|---|
| `[wW]ood` | wood, Wood |
| `beg.n` | begin, begun, beg3n |
| `o+h` | oh, oooooh |
| `[^a-zA-Z]` | a single non-alpha character |
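These patterns can be checked with Python's built-in `re` module. The candidate strings below are assumptions added for illustration; `re.fullmatch` requires the whole string to match the pattern.

```python
import re

# Pattern -> candidate strings to test (candidates are illustrative).
examples = {
    r"[wW]ood": ["wood", "Wood", "good"],
    r"beg.n": ["begin", "begun", "beg3n", "began", "begn"],
    r"o+h": ["oh", "oooooh", "h"],
    r"[^a-zA-Z]": ["7", "!", "a"],
}

# Keep only the candidates that match their pattern in full.
results = {
    pattern: [s for s in candidates if re.fullmatch(pattern, s)]
    for pattern, candidates in examples.items()
}

for pattern, matched in results.items():
    print(pattern, "->", matched)
# [wW]ood -> ['wood', 'Wood']
# beg.n -> ['begin', 'begun', 'beg3n', 'began']
# o+h -> ['oh', 'oooooh']
# [^a-zA-Z] -> ['7', '!']
```

Note that `.` matches any single character, so `beg.n` also accepts "began" but rejects "begn".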
Also known as Levenshtein distance.
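A compact sketch of the distance using dynamic programming. It assumes unit cost for insertion, deletion, and substitution; some textbooks charge 2 for a substitution, which would change the numbers.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits that turn a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep if equal)
            ))
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("wood", "Wood"))       # 1
print(levenshtein("", "abc"))            # 3
```

Only two rows of the table are kept at a time, so memory grows with the length of `b` rather than with the product of both lengths.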
Transforming raw text into informative numerical features.
N-grams: sequences of n consecutive words.
Tip: N-grams help capture local word order, while TF-IDF helps identify unique and important terms.
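Both ideas fit in a short pure-Python sketch. The three documents are hypothetical, and the TF-IDF variant shown (term frequency times log(N / document frequency), with no smoothing) is one simple choice; libraries typically use smoothed variants that give different numbers.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tf_idf(docs):
    """Score each term per document: (count / doc length) * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores

# Hypothetical tokenized documents.
docs = [
    ["the", "motor", "hums"],
    ["the", "motor", "stalls"],
    ["the", "sensor", "drifts"],
]

bigrams = ngrams(docs[0], 2)
print(bigrams)  # [('the', 'motor'), ('motor', 'hums')]

scores = tf_idf(docs)
print(scores[0])  # 'the' scores 0.0; 'hums' scores higher than 'motor'
```

A term that appears in every document ("the") gets a score of zero, while a term unique to one document ("hums") scores highest.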
Simplest language modeling approach.
Treats a sentence as an unordered collection of words.
Example: “I love love loved it!” and “I HATED it :-(”
- Meaning can often be inferred without word order.
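A bag of words is just a table of word counts. The sketch below applies a hypothetical tokenizer (lowercase, letters and apostrophes only) to the two example sentences; note that this choice discards the emoticon.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Count lowercase word tokens, ignoring order and punctuation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

positive = bag_of_words("I love love loved it!")
negative = bag_of_words("I HATED it :-(")

print(dict(positive))  # {'i': 1, 'love': 2, 'loved': 1, 'it': 1}
print(dict(negative))  # {'i': 1, 'hated': 1, 'it': 1}

# Shuffling the words gives exactly the same bag.
shuffled = bag_of_words("it loved love love I")
print(shuffled == positive)  # True
```

The counts alone ("love" twice, "hated" once) already separate the two reviews, even though all ordering information is gone.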

While bag-of-words is effective for some tasks (e.g., spam filtering), word order is crucial for complex NLP.
Sequential approaches preserve word order.
This is where Recurrent Neural Networks (RNNs) become indispensable.

Understanding the typical flow for NLP tasks:
graph TD
A["Raw Text"] --> B{"Feature Extraction / Embeddings"};
B --> C["Machine Learning Model"];
C --> D["Supervised Task (e.g., Classification, Generation)"];
In the lab, you will perform sentiment analysis on reviews.
You will then build a classifier to determine authorship (e.g., Jane Austen vs. Charles Dickens).
We’ve explored Convolutional Neural Networks for image processing, Recurrent Neural Networks for sequential data, and key concepts in Natural Language Processing.
These deep learning techniques are fundamental to many modern ECE applications.
What questions do you have about these powerful tools and their applications in engineering?