PyTorch is a high-level framework for efficiently creating and training deep learning architectures such as feed-forward neural networks (FFNNs), RNNs, and CNNs. It is an incredibly useful tool because it lets you perform nifty natural language processing (NLP) and computer vision (CV) tasks. You can use PyTorch to create models for NLP tasks such as sentiment analysis, translation, summarization, and even text generation (think chatbots). Some CV tasks you can perform with PyTorch are object classification/detection, semantic segmentation, and real-time image processing. Of course, PyTorch can also be used for other applications, including audio data, medical data, and time-series forecasting.
In this tutorial, we explain the building block of PyTorch operations: tensors. Tensors are essentially PyTorch's implementation of arrays. Since machine learning is mostly matrix manipulation, you will need to be familiar with tensor operations to be a great PyTorch user. Tensors are similar to Numpy arrays, so if you have previous experience with Numpy, you will have an easy time working with tensors right away.
Let's start by importing PyTorch and Numpy.
import torch
import numpy as np
Next, let's create a 2x3 random tensor to experiment with.
tens = torch.rand(2,3) #2 is the number of rows, 3 is the number of columns
tens
Now that we have a tensor, let's check out some of its important attributes. The two attributes you will check most often are its shape and its data type.
print(f"This is the shape of our tensor: {tens.shape}")
print(f"This is the data type of our tensor: {tens.dtype}")
You will often check the shape of tensors after performing operations to make sure the end result is as expected. There are many data types for numbers in a tensor. You can find the full list here: https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.dtype
You need to pay attention to data types because most utilities in PyTorch require a certain data type. For instance, when working with CV utilities, your data should be in float.
You can easily change the data type of a tensor using the .to() method as follows:
int_tens = tens.to(torch.uint8)
int_tens.dtype
int_tens
You can see that the tensor now holds integer data, and the values have been truncated down to zero (our random values were all between 0 and 1).
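To see the truncation more explicitly, here is a quick illustrative check with hand-picked values:

torch.tensor([0.4, 0.9, 1.7]).to(torch.uint8) #tensor([0, 0, 1], dtype=torch.uint8)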
Notice that I created the tensor using torch.rand, but there are other ways to create tensors:
Create a tensor filled with zeros.
torch.zeros(2,3)
#A tensor filled with ones
torch.ones(2,3)
Create a tensor from a Python list.
torch.tensor([[1, 2, 3], [4, 5, 6]])
If your data is in Numpy, you can also convert it into a tensor:
arr = np.array([[1, 2, 3], [4, 5, 6]])
tens = torch.from_numpy(arr)
tens
You can also convert tensors back to Numpy arrays:
tens.numpy()
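Note that torch.from_numpy shares the underlying memory with the Numpy array, so modifying one modifies the other:

arr[0, 0] = 100 #modify the Numpy array
tens #cell 0, 0 of the tensor is now 100 as well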
Note that you can set the dtype of a tensor while creating it:
torch.zeros(2,3, dtype=torch.double)
So far, so good! Now let's explore what kind of tensor manipulations we need to be familiar with.
There are many tensor operations in PyTorch, but I like to group them into 2 categories: slicing and math.
- Slicing operations allow you to extract or write to any section of a tensor, such as a row, column, or submatrix. They come up constantly when preparing and inspecting data.
- Math operations allow you to change the values of the tensor mathematically.
Let's create a tensor in order to experiment.
tens = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
tens
#To access a single value in the tensor (Keep in mind that Python indexing starts at 0):
print(f"Value in cell 1, 0: {tens[1,0]}")
print(f"Value in cell 2, 2: {tens[2,2]}")
#To access a row in the tensor:
print(f"Row 0: {tens[0]}")
print(f"Row 2: {tens[2]}")
#To access a column in the tensor:
print(f"Column 0: {tens[:, 0]}")
print(f"Column 1: {tens[:, 1]}")
#To access a subtensor in the tensor:
tens[1:, 1:2]
tens[:2, 1:3]
Please analyze the final two examples carefully to understand how the slicing system works for subtensors. Essentially, you are selecting the cutpoints for each axis. In the first example, axis 0 (rows) is 1:, which means start at row 1 and select all the rows after it. Then, axis 1 (columns) is 1:2, which means start at column 1 and stop at column 2 (exclusive). Thus, the resulting tensor is [[5],[8]].
Note that if a cutpoint is left empty before the colon (:), it means start from the beginning, and if left empty after the colon, it means continue until the end.
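Slicing also works for writing, not just reading. Here is a small sketch (the name tens_w is just an illustrative copy, so the original tensor stays intact):

tens_w = tens.clone() #work on a copy to keep the original unchanged
tens_w[0] = torch.tensor([10, 20, 30]) #overwrite row 0
tens_w[1:, 1:] = 0 #zero out the bottom-right 2x2 block
tens_w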
Next, we will explore the commonly used math operations. For the full list, see: https://pytorch.org/docs/stable/torch.html#math-operations
Let's create 2 tensors from the original using .clone():
tens1 = tens.clone()
tens2 = tens.clone()
For basic arithmetic operations, you can use math symbols or torch functions:
Tensor Addition
tens1 + tens2
#Addition
torch.add(tens1, tens2)
Tensor Subtraction
tens1 - tens2
#Subtraction
torch.sub(tens1, tens2)
Tensor Multiplication (element-wise)
tens1 * tens2
#Multiplication
torch.mul(tens1, tens2)
Tensor Division
tens1 / tens2
#Division
torch.div(tens1, tens2)
For true matrix multiplication, use torch.matmul()
#Matrix Multiplication
torch.matmul(tens1, tens2)
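To see the difference between element-wise and matrix multiplication, compare the two on the same inputs (a and b here are just illustrative examples):

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6], [7, 8]])
a * b #element-wise: tensor([[ 5, 12], [21, 32]])
torch.matmul(a, b) #matrix product: tensor([[19, 22], [43, 50]])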
When concatenating 2 tensors, you specify the dimension along which the concatenation should happen: dim=0 means along rows, dim=1 means along columns, and so on.
Matrix Concatenation
torch.cat([tens1, tens2], dim=1)
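For comparison, here is how the choice of dimension changes the result shape:

torch.cat([tens1, tens2], dim=0).shape #torch.Size([6, 3]): stacked vertically
torch.cat([tens1, tens2], dim=1).shape #torch.Size([3, 6]): stacked horizontally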
Taking the transpose is a common operation when dealing with data. It can be done in 2 ways (note that .t() only works on tensors with at most 2 dimensions):
tens1.T
tens1.t()
Mean accepts only floating-point dtypes, so we must first convert our integer tensor to float.
flt_tens = tens.to(torch.float32)
torch.mean(flt_tens)
As shown above, the mean output is a single-element tensor. We can extract its value using .item():
torch.mean(flt_tens).item()
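You can also compute the mean along a single dimension by passing the dim argument:

torch.mean(flt_tens, dim=0) #mean of each column: tensor([4., 5., 6.])
torch.mean(flt_tens, dim=1) #mean of each row: tensor([2., 5., 8.])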
Tensor Min value
torch.min(tens).item()
Tensor Max value
torch.max(tens).item()
#Argmin
torch.argmin(tens).item()
#Argmax
torch.argmax(tens).item()
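Note that argmin and argmax return the index into the flattened tensor. One simple way to recover row/column coordinates (a sketch using plain Python; the names flat_idx, row, and col are illustrative):

flat_idx = torch.argmax(tens).item() #8 for our 3x3 tensor
row, col = divmod(flat_idx, tens.shape[1]) #(2, 2): the cell holding the value 9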
Sigmoid and tanh are common activation functions in neural networks. There are more advanced ways to use these 2 activation functions in PyTorch, but the following is the simplest way:
#Sigmoid
torch.sigmoid(tens)
#Tanh
torch.tanh(tens)
#In-place sigmoid
torch.sigmoid_(tens.to(torch.float32))
Here, since we are applying the transformation in-place, we must first convert the input to a floating-point dtype so it matches the dtype of the output.
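In-place variants exist for many operations and are marked by a trailing underscore; they modify the tensor directly instead of returning a new one. A quick illustration (inplace_tens is just a throwaway copy):

inplace_tens = flt_tens.clone()
inplace_tens.add_(1) #adds 1 to every element, modifying inplace_tens itself
inplace_tens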
The final function we explore is .view(), which allows us to reshape a tensor. This will be used a lot when working with data.
tens.view(9, 1)
tens.view(1, 9)
Another way to reshape a tensor into a 1xN vector is to use the (1, -1) shape. The -1 means that this dimension should be inferred from the others. Since the first dimension is 1 and our tensor has 9 elements, the second dimension must be 9. This is a dynamic way of reshaping tensors.
tens.view(1, -1)
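You can also flatten the tensor completely by passing -1 on its own:

tens.view(-1) #tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])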
When training large models in PyTorch, you will need to use GPUs. A GPU can speed up training by 49 times or more (according to this repo: https://github.com/jcjohnson/cnn-benchmarks). So, it is important to make sure the GPU is being used during training.
To do so, we must first set the device:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device
This line dynamically sets the device depending on whether or not a GPU is available. Next, we must send the model we are working with to the device.
I will create a simple neural network to demonstrate GPU usage.
import torch.nn as nn
import torch.nn.functional as F
class NeuralNet(nn.Module):
    def __init__(self):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(30, 120) #input layer: 30 features in, 120 out
        self.fc2 = nn.Linear(120, 64) #hidden layer
        self.fc3 = nn.Linear(64, 5) #output layer: 5 classes

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x) #no activation on the final layer; the loss handles it
        return x
Now that we have written the model, we can initialize it as follows:
model = NeuralNet()
print(model)
After initialization, we send the model to the device, whether that is the CPU or the GPU:
model = model.to(device)
Please note that when working with a GPU, it is not enough to send the model to the GPU. The data must also be sent there. Since the GPU has limited memory, we typically create batches of data (for example, a batch of 16 images) for training.
You can send the data to the device using the same .to() operation:
tens = tens.to(device)
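Putting it together, here is a minimal sketch of a forward pass on the device. The random batch x and its size of 16 are illustrative assumptions; the 30 input features match the model's first layer:

x = torch.rand(16, 30).to(device) #a dummy batch of 16 samples with 30 features
out = model(x) #runs on the same device as the model
out.shape #torch.Size([16, 5])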