Introduction
Image classification is a pillar of computer vision and a great entry point into machine learning. In this article, we will build an image classifier from scratch using Python and Keras. By the end, you will have a working model that classifies images with a respectable degree of accuracy. So, let us begin!
Selecting a Dataset
The first step in any machine learning project is finding a suitable dataset. Ideally you want one that is well documented and well balanced, and that is neither too big nor too complex. Some of the most popular image classification datasets to start with are:
- MNIST: Handwritten digits (10 classes)
- CIFAR-10: Small color images (10 classes)
- Fashion MNIST: Fashion article images (10 classes)
For this guide, we will work with the CIFAR-10 dataset. It contains 60,000 32x32 color images split into 10 classes, with 6,000 images per class. The classes are airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.
You can load the CIFAR-10 dataset with the following code:
from tensorflow.keras.datasets import cifar10
(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()
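If you want to confirm that the download worked, a quick sanity check on the array shapes does the trick (optional, but a useful habit):

# Quick sanity check on the loaded arrays
print(train_images.shape)  # (50000, 32, 32, 3)
print(test_images.shape)   # (10000, 32, 32, 3)
print(train_labels.shape)  # (50000, 1) - integer class labels before one-hot encoding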
Setting Up Your Environment
Before diving into the code, make sure you have the following installed:
- Python version 3.x
- TensorFlow 2.x (which includes Keras)
- NumPy
- Matplotlib (for visualization)
They can be installed with pip:
pip install tensorflow numpy matplotlib
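You can verify the installation with a quick import; the exact version number will depend on when you install:

import tensorflow as tf
print(tf.__version__)  # should print a 2.x version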
Preparing the Data
With the dataset downloaded and the environment set up, we take the following steps to prepare the data for training:
- Normalize the pixel values to the range 0 to 1
- Convert the class labels to one-hot vectors
- Keep the split between training and test sets (cifar10.load_data() already provides it)
Here is the code that does this:
from tensorflow.keras.utils import to_categorical

train_images = train_images / 255.0
test_images = test_images / 255.0
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
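Before moving on, it can help to plot a few training images to confirm everything looks right. The class_names list below is simply the CIFAR-10 label order, and the np.argmax call undoes the one-hot encoding for display:

import numpy as np
import matplotlib.pyplot as plt

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Show the first 9 training images with their labels
plt.figure(figsize=(6, 6))
for i in range(9):
    plt.subplot(3, 3, i + 1)
    plt.imshow(train_images[i])
    plt.title(class_names[np.argmax(train_labels[i])])
    plt.axis('off')
plt.show()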
Building the Model
Now we get to the exciting part: building the neural network! We will use a convolutional neural network (CNN), an architecture that is particularly well suited to image data. Our simple CNN consists of the following layers:
- A Conv2D layer with 32 filters, a 3x3 kernel, and ReLU activation
- A MaxPooling2D layer with a 2x2 pool size
- A Conv2D layer with 64 filters, a 3x3 kernel, and ReLU activation
- A MaxPooling2D layer with a 2x2 pool size
- Another Conv2D layer with 64 filters, a 3x3 kernel, and ReLU activation
- A Flatten layer to reshape the 2D feature maps into a 1D vector
- A Dense layer with 64 units and ReLU activation
- A Dense output layer with 10 units and softmax activation
Here is what that looks like in code:
from tensorflow.keras import models
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = models.Sequential([
Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
MaxPooling2D((2, 2)),
Conv2D(64, (3, 3), activation='relu'),
MaxPooling2D((2, 2)),
Conv2D(64, (3, 3), activation='relu'),
Flatten(),
Dense(64, activation='relu'),
Dense(10, activation='softmax')
])
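Calling model.summary() prints a layer-by-layer breakdown, which is handy for checking the output shapes and parameter counts before training:

# Inspect the architecture
model.summary()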
Training and Evaluation
With the architecture defined, it is time to train the model on our data. First we compile the model, specifying the optimizer, the loss function, and the metrics we want to track:
model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
Then, we train the model using fit():
history = model.fit(train_images, train_labels, epochs=10,
validation_data=(test_images, test_labels))
After training, we can evaluate the model's performance on the test set:
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print('Test accuracy:', test_acc)
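To classify a single image, you can call model.predict() and take the argmax of the softmax output. This is a minimal sketch, reusing the class_names list defined earlier:

import numpy as np

# Predict the class of the first test image
probs = model.predict(test_images[:1])   # shape (1, 10)
predicted_class = np.argmax(probs[0])
print('Predicted:', class_names[predicted_class])
print('Actual:   ', class_names[np.argmax(test_labels[0])])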
We can also plot the training and validation accuracy over time:
import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
Conclusion
Congratulations, you have just built your first image classification model! With only a small amount of code, we trained a CNN that reaches roughly 70% accuracy on the test set. Of course, there is still plenty of room for improvement; techniques like data augmentation (sketched below) or transfer learning can push performance much further.
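As a taste of what data augmentation might look like, here is a minimal sketch using the Keras preprocessing layers available in recent TensorFlow 2.x releases; you would place this block at the front of the model and retrain:

import tensorflow as tf
from tensorflow.keras import layers

# A small augmentation pipeline: random horizontal flips and slight rotations
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip('horizontal', input_shape=(32, 32, 3)),
    layers.RandomRotation(0.1),
])

# Example usage: model = tf.keras.Sequential([data_augmentation, Conv2D(32, (3, 3), ...), ...])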
I hope this guide has given you a taste of what machine learning and computer vision can do. Keep learning, and happy coding!
References and Resources
- CIFAR-10 dataset: https://www.cs.toronto.edu/~kriz/cifar.html
- Keras documentation: https://keras.io/
- TensorFlow tutorials: https://www.tensorflow.org/tutorials
- Stanford CS231n: Convolutional Neural Networks for Visual Recognition: https://cs231n.github.io/