Introduction to image classification models

These are notes from lesson 1 of fast.ai's Practical Deep Learning for Coders course.

Homework Task

Train an image classifier: see car classification notebook

1. Notes on Learning

fast.ai also offers a companion course focusing on AI ethics.

Thoughts on education: play the whole game first to get an idea of the big picture. Don’t become afflicted with “elementitis”, that is, getting bogged down in details too early.

Coloured cups - green, amber, red - for each student to indicate how well they are following
Meta Learning by Radek Osmulski - a book based on the author’s experience of learning about deep learning (via the fastai course)
Mathematician’s Lament by Paul Lockhart - a book about the state of mathematics education
Making Learning Whole by David Perkins - a book about approaches to holistic learning

RISE is a Jupyter notebook extension that turns notebooks into slides. Jeremy uses notebooks for source code, books, blogging and CI/CD.

2. Background on Deep Learning and Image Classification

Before deep learning, the approach to machine learning was to enlist many domain experts to handcraft features and feed these into a constrained linear model (e.g. ridge regression). This is time-consuming and expensive.
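As a rough illustration of that older workflow, the sketch below handcrafts a single feature (mean brightness) and fits a one-feature ridge regression in closed form. The images, labels, choice of feature and the `lam` value are all made up for this example; real pipelines relied on many expert-designed features.

```python
def mean_brightness(image):
    """Handcrafted feature: average pixel value of a grayscale image (list of rows)."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)

def fit_ridge_1d(xs, ys, lam=0.1):
    """Closed-form ridge regression for one feature and no intercept:
    w = sum(x*y) / (sum(x*x) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

# Tiny made-up 2x2 "images": label +1 for bright (day), -1 for dark (night).
images = [
    [[0.9, 0.8], [1.0, 0.9]],
    [[0.8, 0.7], [0.9, 0.8]],
    [[0.1, 0.2], [0.0, 0.1]],
    [[0.2, 0.1], [0.1, 0.2]],
]
labels = [1, 1, -1, -1]

# Centre the feature around mid-grey so that a no-intercept model suffices.
features = [mean_brightness(img) - 0.5 for img in images]
w = fit_ridge_1d(features, labels)

# Classify by the sign of the linear model's output.
predictions = [1 if w * f > 0 else -1 for f in features]
print(w, predictions)
```

The point is that a person had to decide brightness was the right thing to measure; the model only learns one weight on top of that decision.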

Neural networks learn these features. Looking inside a CNN, for example, shows that these learned features match interpretable features that an expert might handcraft. An illustration of the features learned is given in this paper by Zeiler and Fergus.

For image classifiers, you don’t need particularly large images as inputs. GPUs are so quick now that if you use large images, most of the time is spent opening the files rather than on computation. So we often resize images down to 400x400 pixels or smaller.
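To get a feel for how much smaller the inputs become, the arithmetic can be sketched as below. The 4000x3000 source size is an assumed example of a camera photo, and `fit_within` is a hypothetical helper written for these notes, not a fastai function.

```python
def fit_within(width, height, max_side=400):
    """Scale (width, height) so the longer side is at most max_side,
    preserving aspect ratio. Images already small enough are unchanged."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))

orig_w, orig_h = 4000, 3000  # assumed size of a camera photo
new_w, new_h = fit_within(orig_w, orig_h)

# How many times fewer pixels the resized image contains.
reduction = (orig_w * orig_h) / (new_w * new_h)
print(new_w, new_h, reduction)
```

Under these assumptions the resized image has 100 times fewer pixels, which is why resizing once up front can save a lot of file-handling time over many training epochs.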

For most use cases, there are pre-trained models and sensible default values that we can use. In practice, most of our effort goes into the input layer and output layer, since the middle layers are identical for most models.

3. Overview of the fastai learner

Data blocks structure the input to learners. An overview of the DataBlock class:

blocks determines the input and output type as a tuple. For multi-target classification this tuple can be arbitrary length.
get_items is a function that returns a list of all the inputs.
splitter defines how to split the training/validation set.
get_y is a function that returns the label of a given input image.
item_tfms defines what transforms to apply to the inputs before training, e.g. resize.
dataloaders is a method that parallelises loading the data.
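The roles of these pieces can be sketched in plain Python. This is an illustrative stand-in rather than fastai's implementation: the file names, the `random_split` helper and the 20% validation fraction are assumptions made for the example. Only the idea that `get_items`, `splitter` and `get_y` are ordinary functions comes from the notes above.

```python
import random
from pathlib import PurePosixPath

# Hypothetical file listing; in practice get_items would scan a folder.
FILES = [
    "cars/sedan/001.jpg", "cars/sedan/002.jpg", "cars/sedan/003.jpg",
    "cars/suv/001.jpg", "cars/suv/002.jpg", "cars/suv/003.jpg",
    "cars/truck/001.jpg", "cars/truck/002.jpg",
    "cars/truck/003.jpg", "cars/truck/004.jpg",
]

def get_items(source):
    """Return a list of all inputs (here, image paths)."""
    return [PurePosixPath(f) for f in source]

def random_split(items, valid_pct=0.2, seed=42):
    """Return (train_indices, valid_indices) for a reproducible random split."""
    idxs = list(range(len(items)))
    random.Random(seed).shuffle(idxs)
    n_valid = int(len(items) * valid_pct)
    return idxs[n_valid:], idxs[:n_valid]

def get_y(item):
    """Label an image by the name of its parent folder."""
    return item.parent.name

items = get_items(FILES)
train_idx, valid_idx = random_split(items)

# Pair every input with its label, separately for each split.
train = [(items[i], get_y(items[i])) for i in train_idx]
valid = [(items[i], get_y(items[i])) for i in valid_idx]
print(len(train), len(valid))
```

In fastai itself, helpers such as `get_image_files`, `RandomSplitter` and `parent_label` play these roles, and the DataBlock wires them together before `dataloaders` batches the results.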

A learner combines the model (e.g. a resnet or something from the timm library) and the data to run that model on (the dataloaders from the DataBlock).

The fine_tune method starts with a pretrained model's weights rather than randomised weights, and only needs to learn the differences between your data and the data the original model was trained on.
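The idea can be sketched with a toy two-weight model. As I understand it, fine_tune by default first trains the newly added head while the pretrained body is frozen, then unfreezes everything and trains at a lower learning rate. The code below imitates only that two-phase pattern; the data, starting weights, step counts and learning rates are all assumed values, and nothing here calls fastai.

```python
# Toy data: the new task is y = 6 * x.
DATA = [(x, 6.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]

def loss_and_grads(w_body, w_head):
    """Mean squared error of y_hat = w_head * (w_body * x), plus its gradients."""
    n = len(DATA)
    loss = g_body = g_head = 0.0
    for x, y in DATA:
        err = w_head * w_body * x - y
        loss += err * err / n
        g_body += 2 * err * w_head * x / n
        g_head += 2 * err * w_body * x / n
    return loss, g_body, g_head

def train(w_body, w_head, steps, lr, train_body):
    """Plain gradient descent; the body is only updated when train_body is True."""
    for _ in range(steps):
        _, g_body, g_head = loss_and_grads(w_body, w_head)
        if train_body:
            w_body -= lr * g_body
        w_head -= lr * g_head
    return w_body, w_head

# Body weight stands in for "pretrained"; head weight is newly initialised.
w_body, w_head = 2.0, 0.5
start_loss, _, _ = loss_and_grads(w_body, w_head)

# Phase 1: body frozen, only the new head is trained.
w_body, w_head = train(w_body, w_head, steps=20, lr=0.1, train_body=False)
frozen_loss, _, _ = loss_and_grads(w_body, w_head)

# Phase 2: everything unfrozen, trained with a smaller learning rate.
w_body, w_head = train(w_body, w_head, steps=20, lr=0.01, train_body=True)
final_loss, _, _ = loss_and_grads(w_body, w_head)
print(start_loss, frozen_loss, final_loss)
```

Because the body already holds a useful weight, training the head alone removes almost all of the error, and the second phase only needs small adjustments.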

Image problems that can utilise deep learning include:

Image classification
Image segmentation

Other problem types, such as tabular data and collaborative filtering, use the same process: only the DataBlock block types change and the rest stays the same.

4. Deep Learning vs Traditional Computer Programs

Traditional computer programs are essentially:

Deep learning models are:
