Building the Perfect AI – Part 1: Introduction to Neural Networks and Basic Concepts


Overview

Creating the perfect AI is not just about getting the code right but also about understanding the core principles that drive its learning, much like our own neural synapses. This series will guide you through building an AI from scratch, using PyTorch as the foundation. Each tutorial takes bite-sized steps, balancing theory with hands-on experience: you’ll run code that delivers immediate feedback in your Python console, so you can see and feel the AI’s progress as you go.


Step 1: Setting Up the Basics with PyTorch

What Is a Neural Network?

In a simplified sense, a neural network mimics how the human brain processes information: neurons receive input, transform it, and pass it along, with connections between neurons strengthening or weakening as the network learns. In PyTorch, this can be represented and trained using tensors (multi-dimensional arrays) and a variety of layer types.
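If tensors are new to you, here is a minimal, standalone sketch (not part of the tutorial code) of what they look like in practice:

   import torch

   # A tensor is a multi-dimensional array; this one is a 2x3 matrix of floats
   x = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
   print(x.shape)  # torch.Size([2, 3])

   # requires_grad=True tells PyTorch to track operations for backpropagation
   w = torch.ones(3, requires_grad=True)
   loss = (x @ w).sum()  # matrix-vector product, summed to a single number
   loss.backward()       # compute d(loss)/d(w)
   print(w.grad)         # column sums of x: tensor([5., 7., 9.])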

We’ll start by setting up a very simple neural network for classifying the MNIST dataset, where the network will learn to recognize digits (0-9) based on pixel data. Before diving into the code, let’s install the necessary tools and libraries.


Setting Up Your Environment

  1. Install Python: You should have Python installed on your machine. If not, download it from python.org.
  2. Install PyTorch and Supporting Libraries: You can install PyTorch and the other key libraries using pip:
   pip install torch torchvision matplotlib
  3. IDE Recommendation: I personally like to use Spyder. If you don’t have it, you can install it via conda:
   conda install spyder

Step 2: Creating and Running a Simple Neural Network

We’ll start by building a basic neural network to classify the MNIST dataset of handwritten digits.

Code Walkthrough: Simple Neural Network in PyTorch

  1. Import Libraries:
   import torch
   import torch.nn as nn
   import torch.optim as optim
   import torchvision
   import torchvision.transforms as transforms
   from torch.utils.data import DataLoader
   import matplotlib.pyplot as plt

Here, we import PyTorch and the relevant modules. We also import torchvision to easily load the MNIST dataset and matplotlib to visualize results.

  2. Download and Prepare the MNIST Dataset:
   # Convert images to tensors, then normalize with mean 0.5 and std 0.5
   transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])

   # Download MNIST (60,000 training and 10,000 test images) into ./data
   train_set = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
   test_set = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)

   # DataLoaders feed the model batches of 64 images; only the training data is shuffled
   train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
   test_loader = DataLoader(test_set, batch_size=64, shuffle=False)

The dataset is downloaded, transformed into tensors, and normalized so that pixel values range from -1 to 1. We then load the data using PyTorch’s DataLoader, which helps in efficiently feeding batches of data into the model.
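If you want to check that the transforms behaved as expected, you can peek at a single batch (an optional sanity check, not required for training):

   # Optional sanity check: grab one batch and inspect shapes and value range
   images, labels = next(iter(train_loader))
   print(images.shape)                # torch.Size([64, 1, 28, 28])
   print(images.min(), images.max())  # roughly -1.0 and 1.0 after normalization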

  3. Define the Neural Network:
   class SimpleNN(nn.Module):
       def __init__(self):
           super(SimpleNN, self).__init__()
           self.fc1 = nn.Linear(28 * 28, 128)
           self.fc2 = nn.Linear(128, 10)

       def forward(self, x):
           x = x.view(-1, 28 * 28)  # Flatten the image into a vector
           x = torch.relu(self.fc1(x))  # Apply ReLU activation
           x = self.fc2(x)  # Output layer (no activation because we’ll use cross-entropy loss)
           return x

   model = SimpleNN()

Explanation:

  • We create a neural network with two layers: the first (hidden) layer has 128 neurons and uses a ReLU activation function, and the second (output) layer has 10 neurons, one for each digit (0-9).
  • The forward function defines how data flows through the network: each 28×28 image is first flattened into a 784-element vector, then passed through the two layers.
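To make the model’s size concrete, you can count its learnable parameters; this optional check just restates the layer shapes above:

   # Optional: count the learnable parameters per layer
   for name, p in model.named_parameters():
       print(name, tuple(p.shape), p.numel())
   print(sum(p.numel() for p in model.parameters()))  # 784*128 + 128 + 128*10 + 10 = 101,770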
  4. Define the Loss Function and Optimizer:
   criterion = nn.CrossEntropyLoss()
   optimizer = optim.SGD(model.parameters(), lr=0.01)

We use cross-entropy loss, which is a common choice for classification tasks, and stochastic gradient descent (SGD) for optimization, updating the model’s weights as it learns.
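To build intuition for what cross-entropy measures, here is a tiny standalone illustration (the logit values are arbitrary, chosen only to contrast a right and a wrong answer):

   # Standalone illustration: cross-entropy is small when the highest logit
   # matches the true class, and large when it doesn't
   target = torch.tensor([0])  # the true class is digit 0
   confident_right = torch.tensor([[4.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]])
   confident_wrong = torch.tensor([[0.1, 4.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]])

   print(criterion(confident_right, target).item())  # about 0.17
   print(criterion(confident_wrong, target).item())  # about 4.07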

  5. Train the Network:
   num_epochs = 5
   for epoch in range(num_epochs):
       running_loss = 0.0
       for inputs, labels in train_loader:
           optimizer.zero_grad()              # reset gradients from the previous batch
           outputs = model(inputs)            # forward pass
           loss = criterion(outputs, labels)  # how wrong were the predictions?
           loss.backward()                    # backpropagation: compute gradients
           optimizer.step()                   # update the weights
           running_loss += loss.item()

       print(f'Epoch {epoch+1}, Loss: {running_loss/len(train_loader):.4f}')

   print('Finished Training')

Explanation:

  • In each epoch (a complete pass through the training dataset), we feed data into the network, compute the loss, and use backpropagation to adjust the weights based on the errors.
  • optimizer.zero_grad() resets the gradients before each step. PyTorch accumulates gradients by default, so without this call each batch’s gradients would pile on top of the previous ones.
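If you want to literally watch backpropagation at work, you can temporarily add two print statements inside the loop, right after loss.backward() (purely illustrative; remove them once you’ve had a look):

   # Inside the training loop, after loss.backward(): every learnable tensor
   # now holds a .grad with the same shape as the tensor itself
   print(model.fc1.weight.grad.shape)         # torch.Size([128, 784])
   print(model.fc1.weight.grad.abs().mean())  # average gradient magnitude this batch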
  6. Evaluate the Network:
    Now that we’ve trained the model, let’s evaluate its performance on the test data.
   correct = 0
   total = 0
   with torch.no_grad():  # we don’t need gradients for testing
       for inputs, labels in test_loader:
           outputs = model(inputs)
           _, predicted = torch.max(outputs, 1)  # the highest-scoring class is the prediction
           total += labels.size(0)
           correct += (predicted == labels).sum().item()

   print(f'Accuracy on the test set: {100 * correct / total}%')

Tangible Output: See the Results in Spyder

Once you run this code in Spyder, you’ll see the following:

  1. Loss during training: After each epoch, you’ll see the loss decrease, indicating that the network is learning.
  2. Final accuracy: The accuracy score on the test set, letting you know how well the network generalizes to unseen data.

For example:

Epoch 1, Loss: 0.3505
Epoch 2, Loss: 0.3120
...
Accuracy on the test set: 92.0%
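We imported matplotlib at the start but haven’t used it yet. Here is an optional snippet, run after training, that plots a few test digits next to the model’s predictions:

   # Optional: visualize a few test images alongside the predicted labels
   images, labels = next(iter(test_loader))
   with torch.no_grad():
       _, predicted = torch.max(model(images), 1)

   fig, axes = plt.subplots(1, 6, figsize=(10, 2))
   for i, ax in enumerate(axes):
       ax.imshow(images[i].squeeze(), cmap='gray')
       ax.set_title(f'pred: {predicted[i].item()}')
       ax.axis('off')
   plt.show()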

Conceptual Understanding: Synaptic Learning in Action

This simple network is a starting point. It mimics how our own brains strengthen synapses between neurons to recognize patterns, like handwriting. As you adjust the model’s parameters (e.g., number of neurons, learning rate), you begin to feel how these changes affect the network’s ability to learn, just as we fine-tune our own learning strategies.
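For example, here are two one-line experiments worth trying; the values are just suggestions, not tuned recommendations:

   # Experiment 1: a wider hidden layer -- in SimpleNN.__init__, replace 128 with 256
   self.fc1 = nn.Linear(28 * 28, 256)
   self.fc2 = nn.Linear(256, 10)

   # Experiment 2: a 10x larger learning rate -- does training speed up or destabilize?
   optimizer = optim.SGD(model.parameters(), lr=0.1)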

In the next tutorial, we’ll explore more advanced architectures, such as adding more layers (deep learning), and dive deeper into what exactly happens during backpropagation. You’ll not only see the outputs but also feel how the model grows, learns, and evolves over time.


Now on to Part 2, where we’ll deepen our understanding of learning dynamics!
