DEV Community

Ransika Silva

Step-by-Step Guide: Building Your First Image Classification Project with Machine Learning

Introduction

Image classification is a core task in computer vision and a good entry point into machine learning. In this article, we will build an image classifier from scratch with Python and Keras. By the end, you will have a working model that classifies images with reasonable accuracy. So, let us begin!

Selecting a Dataset

The first step in any machine learning project is to find a suitable dataset. For a first project, pick one that is well documented, balanced across classes, and neither too large nor too complex. Popular choices for image classification include:

  • MNIST: Handwritten digits (10 classes)
  • CIFAR-10: Small color images (10 classes)
  • Fashion MNIST: Fashion article images (10 classes)

For this guide, we will work with the CIFAR-10 dataset. It contains 60,000 32x32 color images split evenly into 10 classes, with 6,000 images per class; 50,000 images are for training and 10,000 for testing. The classes are airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.

The CIFAR-10 dataset can be loaded with the following code, which downloads it the first time it runs:

from tensorflow.keras.datasets import cifar10

(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()  
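The labels returned by load_data() are integers from 0 to 9 rather than class names. As a small sketch, the lookup below maps them back to names. The class_names list follows the order given in the CIFAR-10 documentation (where the second class is called automobile), and sample_labels is a made-up stand-in for a slice of train_labels, which stores each label in its own one-element row.

```python
# Class names in CIFAR-10 label order (index 0 to 9).
class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

# Hypothetical stand-in for train_labels[:3]; the real array has shape (50000, 1).
sample_labels = [[6], [9], [0]]


def label_to_name(label_row):
    """Turn one label row such as [6] into its class name."""
    return class_names[int(label_row[0])]


names = [label_to_name(row) for row in sample_labels]
print(names)
```

This lookup applies to the raw integer labels, so use it before any one-hot conversion.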

Setting Up Your Environment

Before diving into the code, make sure you have the following software installed:

  • Python version 3.x
  • TensorFlow 2.x (which includes Keras)
  • NumPy
  • Matplotlib (for visualization)

They can be installed with pip:

pip install tensorflow numpy matplotlib
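To confirm the installation worked, one option is a quick check like the sketch below. It only asks Python whether each package can be found, without importing it; the helper name missing_packages is made up for this example.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the package names that Python cannot find in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["tensorflow", "numpy", "matplotlib"]
print("Python 3 or newer:", sys.version_info >= (3,))
print("Missing packages:", missing_packages(required))
```

An empty list means all three packages are available; anything listed can be installed with the pip command above.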

Prepare the Data

With the dataset loaded and the environment set up, we prepare the data for training:

  1. Scale the pixel values from the 0 to 255 range to the 0 to 1 range
  2. Convert the integer class labels to one-hot vectors
  3. Keep the training and test sets separate (cifar10.load_data() already returns them split)

The following code does this:

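from tensorflow.keras.utils import to_categorical

# Scale pixel values from [0, 255] to [0, 1]
train_images = train_images / 255.0
test_images = test_images / 255.0

# Convert integer labels (0-9) to one-hot vectors of length 10
train_labels = to_categorical(train_labels, num_classes=10)
test_labels = to_categorical(test_labels, num_classes=10)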
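To see what these two steps do to the arrays, here is a minimal NumPy-only sketch on synthetic data. The random images and the four labels are made up for illustration, and the identity-matrix trick is just one way to reproduce the effect of one-hot encoding; it is not how Keras implements to_categorical internally.

```python
import numpy as np

# Synthetic stand-ins for CIFAR-10 arrays: 4 images of 32x32x3 uint8 pixels
# and 4 integer labels stored as a column, like the arrays load_data() returns.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
labels = np.array([[3], [0], [9], [3]])

# Step 1: scale pixels to the range [0, 1] (the result is a float array).
scaled = images / 255.0

# Step 2: one-hot encode by picking rows of a 10x10 identity matrix.
one_hot = np.eye(10)[labels.ravel()]

print(scaled.min() >= 0.0, scaled.max() <= 1.0)
print(one_hot.shape)
print(one_hot[0])
```

Each row of one_hot has a single 1 at the index of its class, which is the format that the categorical cross-entropy loss expects.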

Building the Model

Now for the exciting part: building the neural network! We will use a convolutional neural network (CNN), an architecture well suited to image data. Our simple CNN consists of the following layers:

  • A Conv2D layer with 32 filters, a 3x3 kernel, and ReLU activation
  • A MaxPooling2D layer with a 2x2 pool size
  • A Conv2D layer with 64 filters, a 3x3 kernel, and ReLU activation
  • A MaxPooling2D layer with a 2x2 pool size
  • A Conv2D layer with 64 filters, a 3x3 kernel, and ReLU activation
  • A Flatten layer that reshapes the 3D feature maps into a 1D vector
  • A Dense layer with 64 units and ReLU activation
  • A Dense output layer with 10 units and softmax activation

In code, it looks like this:

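from tensorflow.keras import models
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = models.Sequential([
    # Feature extraction: convolution and pooling layers
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    # Classification head
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])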
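It can help to trace how the image shrinks as it passes through the code above, and how many trainable parameters each layer adds. The sketch below does that arithmetic in plain Python. It assumes the Keras defaults the code relies on (no padding and stride 1 for Conv2D, pooling stride equal to the pool size); the helper names are made up for this example.

```python
def conv_out(size, kernel=3):
    """Output width/height of a convolution with no padding and stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output width/height of max pooling with stride equal to the pool size."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units


size, channels, total = 32, 3, 0
# (filters, followed_by_pooling) for the three Conv2D layers in the model code
for filters, pooled in [(32, True), (64, True), (64, False)]:
    total += conv_params(channels, filters)
    size, channels = conv_out(size), filters
    if pooled:
        size = pool_out(size)

flat = size * size * channels  # length of the vector produced by Flatten
total += dense_params(flat, 64) + dense_params(64, 10)
print(size, flat, total)
```

Under those assumptions the final feature maps are 4x4 with 64 channels, Flatten produces 1,024 values, and the model has 122,570 trainable parameters, which should match what model.summary() reports.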

Training and Evaluation

With the architecture defined, it is time to train the model on our data. We first compile the model, specifying the optimizer, loss function, and metrics we want to track:

model.compile(optimizer='adam',
              loss='categorical_crossentropy', 
              metrics=['accuracy'])
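To make the loss less of a black box, here is a hand-rolled sketch of what softmax and categorical cross-entropy compute for a single image. It is a simplified illustration in plain Python, not the Keras implementation (Keras also clips probabilities to avoid taking the log of zero).

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs):
    """Negative log of the probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)


# An untrained model that gives every class the same score is as unsure as it can be.
uniform_probs = softmax([0.0] * 10)
true_label = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
loss = categorical_crossentropy(true_label, uniform_probs)
print(round(loss, 4))
```

A uniform guess over 10 classes gives a loss of ln(10), about 2.30, so a loss near that value early in training means the model has not learned much yet.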

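Then, we train the model using fit(), passing the test set as validation data so we can monitor accuracy on unseen images after each epoch: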

history = model.fit(train_images, train_labels, epochs=10, 
                    validation_data=(test_images, test_labels))
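While training runs, Keras prints a step counter for each epoch. The arithmetic below sketches where that number comes from, assuming the default batch size of 32 that fit() uses when none is passed.

```python
import math

train_samples = 50_000  # CIFAR-10 training set size
batch_size = 32         # Keras default when fit() gets no batch_size
epochs = 10             # as in the fit() call above

# The last batch of each epoch is smaller, so round up.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)
```

With these values each epoch takes 1,563 steps, for 15,630 weight updates over the whole run.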

After training, we can evaluate the model's performance on the test set:

test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print('Test accuracy:', test_acc)
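With one-hot labels and softmax outputs, the accuracy metric is the fraction of images whose highest-probability class matches the true class. The plain-Python sketch below shows that calculation on a made-up three-class example; the labels and probabilities are invented for illustration.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(one_hot_labels, predicted_probs):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(argmax(t) == argmax(p)
                  for t, p in zip(one_hot_labels, predicted_probs))
    return correct / len(one_hot_labels)


# Made-up labels and predictions for four samples and three classes.
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
probs = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.5, 0.3, 0.2], [0.2, 0.5, 0.3]]
acc = accuracy(labels, probs)
print(acc)
```

For context, random guessing on 10 balanced classes would score about 10%, so that is the baseline to compare a test accuracy against.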

We can also plot the training and validation accuracy over time:

import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
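The same history dictionary can also be read programmatically. The sketch below finds the epoch with the best validation accuracy and the gap between training and validation accuracy, where a growing gap is a common sign of overfitting. The numbers in example_history are made up for illustration and are not results from this model; with a real run you would pass history.history instead.

```python
# Made-up numbers for illustration only; not results from this article's model.
example_history = {
    "accuracy":     [0.45, 0.60, 0.68, 0.74, 0.79],
    "val_accuracy": [0.52, 0.62, 0.67, 0.69, 0.68],
}


def best_epoch(history_dict):
    """Return (1-based epoch number, value) of the highest validation accuracy."""
    val = history_dict["val_accuracy"]
    best = max(range(len(val)), key=lambda i: val[i])
    return best + 1, val[best]


def train_val_gap(history_dict):
    """Training accuracy minus validation accuracy for each epoch."""
    return [round(t - v, 4)
            for t, v in zip(history_dict["accuracy"], history_dict["val_accuracy"])]


print(best_epoch(example_history))
print(train_val_gap(example_history))
```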

Conclusion

Congratulations, you have built your first image classification model! With only a small amount of code, we trained a CNN that classifies the test images with roughly 70% accuracy. There is still plenty of room to improve; techniques such as data augmentation or transfer learning can raise performance further.

I hope this guide has shown you some of the potential of machine learning and computer vision. Keep learning, and happy coding!
