This is a basic computer vision example using PyTorch.
What you’ll find here:
- What is computer vision
- A simple classification example using a deep learning computer vision model
Without further ado, let’s begin. (Before jumping right into the code, please take a moment to understand what computer vision is.)
What is Computer Vision
Computer vision is a field of computer science that focuses on building intelligent systems that can process, analyse, and derive information from visual data such as images and videos. Lately, people have been achieving marvellous things with 3D point-cloud data as well.
FASHION MNIST CLASSIFICATION
Link to the dataset: Fashion MNIST
You don’t need to download the dataset manually; it is included as part of torchvision.
It’s better to use a Jupyter notebook, as the code in this blog is a step-by-step process with data visualisation in between for better understanding.
```python
import torch
import torchvision
import numpy as np
import matplotlib.pyplot as plt
import torch.nn as nn
import torch.nn.functional as F
from torchvision.datasets import FashionMNIST
from torchvision.transforms import ToTensor
from torchvision.utils import make_grid
from torch.utils.data import DataLoader, random_split
```
Let’s download our train and test datasets. The data used for training is stored in dataset, and the test data is stored in test_dataset.
```python
dataset = FashionMNIST(root='data/', download=True, transform=ToTensor())
test_dataset = FashionMNIST(root='data/', train=False, transform=ToTensor())
```
The training process usually involves holding out a small subset of the training data for validation (this subset is not used for training itself). Here we set aside 10,000 data points to validate our model.
```python
val_size = 10000
train_size = len(dataset) - val_size

train_ds, val_ds = random_split(dataset, [train_size, val_size])
print("Training dataset size : ", len(train_ds))
print("Validation dataset size : ", len(val_ds))
```
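If you want the train/validation split to be identical across runs, random_split also accepts a seeded generator (an optional tweak, not part of the original code):

```python
# Optional: seed the split so the train/validation sets are reproducible
train_ds, val_ds = random_split(
    dataset, [train_size, val_size],
    generator=torch.Generator().manual_seed(42)
)
```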
Let’s define our batch size, which specifies how many data points are used in each iteration of the training process.
```python
batch_size = 128
```
DataLoader is a utility provided by PyTorch that creates an iterable over our dataset. The shuffle argument shuffles the data each epoch, num_workers sets how many subprocesses load data in parallel, and pin_memory places batches in page-locked (pinned) memory, which makes CPU-to-GPU transfers faster.
```python
train_loader = DataLoader(train_ds, batch_size, shuffle=True, num_workers=4, pin_memory=True)
val_loader = DataLoader(val_ds, batch_size*2, num_workers=4, pin_memory=True)
test_loader = DataLoader(test_dataset, batch_size*2, num_workers=4, pin_memory=True)
```
Let’s visualise our training dataset. In a Jupyter notebook, the snippet below displays a grid of 128 (our batch_size) images from the training subset.
```python
for images, _ in train_loader:
    print('images.shape:', images.shape)
    plt.figure(figsize=(16, 8))
    plt.axis('off')
    plt.imshow(make_grid(images, nrow=16).permute((1, 2, 0)))
    break
```
The accuracy function compares the model’s output to the desired output and returns an accuracy score; this helps us monitor the training process.
```python
def accuracy(outputs, labels):
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))
```
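A quick sanity check on a toy batch (numbers are illustrative only) shows how it behaves — the highest logit in each row decides the prediction:

```python
# 3 samples, 4 classes; predictions are the argmax of each row
outputs = torch.tensor([[0.1, 0.9, 0.0, 0.0],
                        [0.8, 0.1, 0.1, 0.0],
                        [0.2, 0.2, 0.5, 0.1]])
labels = torch.tensor([1, 0, 3])
print(accuracy(outputs, labels))  # tensor(0.6667) -- 2 of 3 correct
```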
And finally, let’s define our deep learning model:
- __init__ : the constructor, where the neural network layers are defined. Here we use two hidden layers and one output layer, all of them linear (fully connected) layers.
- forward : where the model architecture is defined. We first flatten the input tensor, since it has to pass through linear layers.
- training_step : computes the loss for one training batch. Calling self(images) invokes forward (via nn.Module’s __call__) and returns the predictions as tensors.
- validation_step : very similar to training_step, except it is used for validation; this is where we use the accuracy function to evaluate the current state of the model.
- validation_epoch_end and epoch_end : aggregate and display the state of our model at the end of each epoch.
```python
class MnistModel(nn.Module):
    """Feedforward neural network with 2 hidden layers"""
    def __init__(self, in_size, out_size):
        super().__init__()
        # hidden layer 1
        self.linear1 = nn.Linear(in_size, 16)
        # hidden layer 2
        self.linear2 = nn.Linear(16, 32)
        # output layer
        self.linear3 = nn.Linear(32, out_size)

    def forward(self, xb):
        # Flatten the image tensors
        out = xb.view(xb.size(0), -1)
        # Hidden layer 1 + activation
        out = self.linear1(out)
        out = F.relu(out)
        # Hidden layer 2 + activation
        out = self.linear2(out)
        out = F.relu(out)
        # Output layer (raw logits)
        out = self.linear3(out)
        return out

    def training_step(self, batch):
        images, labels = batch
        out = self(images)                    # Generate predictions
        loss = F.cross_entropy(out, labels)   # Calculate loss
        return loss

    def validation_step(self, batch):
        images, labels = batch
        out = self(images)                    # Generate predictions
        loss = F.cross_entropy(out, labels)   # Calculate loss
        acc = accuracy(out, labels)           # Calculate accuracy
        return {'val_loss': loss, 'val_acc': acc}

    def validation_epoch_end(self, outputs):
        batch_losses = [x['val_loss'] for x in outputs]
        epoch_loss = torch.stack(batch_losses).mean()   # Combine losses
        batch_accs = [x['val_acc'] for x in outputs]
        epoch_acc = torch.stack(batch_accs).mean()      # Combine accuracies
        return {'val_loss': epoch_loss.item(), 'val_acc': epoch_acc.item()}

    def epoch_end(self, epoch, result):
        print("Epoch [{}], val_loss: {:.4f}, val_acc: {:.4f}".format(
            epoch, result['val_loss'], result['val_acc']))
```
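Before wiring this into training, a quick shape check with a random batch (my own sanity check, not part of the original walkthrough) confirms the forward pass returns one logit per class:

```python
# A fake batch of 2 single-channel 28x28 images should yield shape (2, 10)
tmp_model = MnistModel(in_size=28*28, out_size=10)
dummy = torch.randn(2, 1, 28, 28)
print(tmp_model(dummy).shape)  # torch.Size([2, 10])
```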
CPU or GPU, you can use either to train the model; this function automatically detects the available device and assigns it for training.
```python
def get_default_device():
    """Pick GPU if available, else CPU"""
    if torch.cuda.is_available():
        return torch.device('cuda')
    else:
        return torch.device('cpu')

device = get_default_device()
```
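You can confirm which device was picked:

```python
print(device)  # prints 'cuda' on a GPU machine, 'cpu' otherwise
```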
Now let’s move the data to the chosen device. The complete dataset may not fit into the available memory (RAM or GPU memory), so we transfer it batch by batch at training time, which is a memory-efficient way to train.
```python
def to_device(data, device):
    """Move tensor(s) to chosen device"""
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)
```
Now let’s define a class to handle all the data movement:
```python
class DeviceDataLoader():
    """Wrap a dataloader to move data to a device"""
    def __init__(self, dl, device):
        self.dl = dl
        self.device = device

    def __iter__(self):
        """Yield a batch of data after moving it to device"""
        for b in self.dl:
            yield to_device(b, self.device)

    def __len__(self):
        """Number of batches"""
        return len(self.dl)
```
```python
train_loader = DeviceDataLoader(train_loader, device)
val_loader = DeviceDataLoader(val_loader, device)
test_loader = DeviceDataLoader(test_loader, device)
```
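To verify the wrapper works, you can peek at one batch (an optional check):

```python
# Every batch yielded by the wrapped loader should now live on `device`
for xb, yb in train_loader:
    print(xb.device, yb.device)  # e.g. cuda:0 cuda:0, or cpu cpu
    break
```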
Now we define two functions to handle the training and validation of the model:
```python
def evaluate(model, val_loader):
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        # Training Phase
        for batch in train_loader:
            loss = model.training_step(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        # Validation phase
        result = evaluate(model, val_loader)
        model.epoch_end(epoch, result)
        history.append(result)
    return history
```
Each of our data points has dimension 28 × 28 = 784, and there are 10 distinct labels (classes) to classify.
```python
input_size = 784
num_classes = 10
```
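The 784 comes straight from the tensor shape; you can verify it on a sample if you like:

```python
img, _ = dataset[0]
print(img.shape)        # torch.Size([1, 28, 28])
print(img.numel())      # 784 = 1 * 28 * 28
print(dataset.classes)  # the 10 Fashion-MNIST class names
```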
Now let’s initialise our model and move it to GPU/CPU memory:
```python
model = MnistModel(input_size, out_size=num_classes)
to_device(model, device)
```
First, evaluate the model without any training; at this point it has random weights and biases.
```python
history = [evaluate(model, val_loader)]
print("Before Training : ", history)
```
Now let’s train it for 10 epochs with a learning rate of 0.5:
```python
history += fit(10, 0.5, model, train_loader, val_loader)
```
And our training is done!
To visualise the loss and accuracy, the code below is very helpful; run it in a Jupyter notebook.
```python
losses = [x['val_loss'] for x in history]
plt.plot(losses, '-x')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.title('Loss vs. No. of epochs');
```
```python
accuracies = [x['val_acc'] for x in history]
plt.plot(accuracies, '-x')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.title('Accuracy vs. No. of epochs');
```
Let’s do some predictions/classifications with our trained model.
```python
def predict_image(img, model):
    xb = to_device(img.unsqueeze(0), device)
    yb = model(xb)
    _, preds = torch.max(yb, dim=1)
    return preds[0].item()
```
We take one image from test_dataset to test our model:
```python
img, label = test_dataset[0]
plt.imshow(img[0], cmap='gray')
print('Label:', dataset.classes[label],
      ', Predicted:', dataset.classes[predict_image(img, model)])
```
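You can repeat this for a few more samples to eyeball the model (the indices here are chosen arbitrarily):

```python
for idx in [1, 10, 100]:
    img, label = test_dataset[idx]
    print('Label:', dataset.classes[label],
          ', Predicted:', dataset.classes[predict_image(img, model)])
```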
Let’s look at the fruit of our work:
```python
evaluate(model, test_loader)
```
This is what I got: a dictionary with the final loss and accuracy on the test set (your exact numbers will vary).
Hope this blog helped you. By the way, this is my first blog, so do provide valuable feedback. Thanks!