DEV Community

Jeremy
Jeremy

Posted on

Crop Bounding Box + Rotate

Abstract

We are pre-processing photos to ultimately enable OCR. We have bounding boxes for rear-end of cars; now let’s rectangularize the license plates with cropped and rotated photos.

ToC (Step-by-Step)

License plate detection workflow visualized

  1. Overview: OpenCV in Python for End-to-end License Plate Detection.
  2. Camera and computer setup. Raspberry pi (RPi), Canon DSLR, f-stop and ISO.
  3. Capturing images from DSLR to RPi. Automating capture of images and transfer to RPi.
  4. Model for detecting cars. Train model from scratch using YOLO and labeled images.
  5. Crop bounding box + rotate.
  6. TBD -- Model for finding license plates. Train a second model with YOLO.
  7. TBD -- Crop bounding box.
  8. TBD -- Read license plate with OCR. Pre-process image, and extract text with Paddle OCR.

Recipe

We now have model identifying cars in our streamed images from camera. Our high-level goal is cropped-image reduced only to license plate for OCR reading. That means finding cars in image (article 3), cropping image to just rear bumper of car (this article; 4), so that we can find license plate (next article; 5), and crop image to just license plate (article 6).

This interior step is a processing step to ultimately enable OCR. Note particularly in my captured images that images are both looking down, but also from the distant side, from the cars. That eschew is important feature to reduce. OCR expects a straight-on view, thus part of this step is also corrected eschew (by rotation).

At this point we have bounding box per car. We potentially have multiple cars per image - so within this step is a loop that forks all the following steps (for each car: detect license plate, crop, OCR, repeat).

First, we are cropping image to just bounding box from model in step 3, and rotating to match camera’s eschewed perspective in relation to highway.

  1. For each bounding box (ie for each car)
    Sample highway capture
    Input image. Run model, find bounding boxes (cars).

  2. Crop to rear bumper, and rotate to correct eschew

    2x extracted cropped images of cars

Output images cropped down to just rear bumper with clearer view of license plate. Note rotation artifacts in corners. Now, license is straight-on for easier OCR.

from ultralytics import YOLO
import cv2 
import imutils
import numpy as np
from os import path

baseDir = '/Users/japollock/Projects/TrainHighwayCarDetector/'
inputFilePath = baseDir + 'photos/yolo_cars/all/IMG_4553.jpg'
inputFileName = path.basename(inputFilePath)
outputPhotosDir = baseDir + 'photos/yolo_cars/licensePlates/'
model = YOLO(baseDir + 'src/runs/detect/yolov8m_v21_100e/weights/best.pt')

imageOriginal = cv2.imread(inputFilePath)
imageScaled = imutils.resize(imageOriginal, width=1280)

imgRatio = imageOriginal.shape[1] / imageScaled.shape[1]

results = model.predict(source=imageScaled, imgsz=1280)

j=0
for box in results[0].boxes:
    j += 1
    # bounding box in scaled-down image
    x1Float = box.xyxy[0][0].item()
    x2Float = box.xyxy[0][2].item()
    y1Float = box.xyxy[0][1].item()
    y2Float = box.xyxy[0][3].item()

    # calc bounding box in original (scaled-up) image
    x1 = int(imgRatio * x1Float)
    y1 = int(imgRatio * y1Float)
    x2 = int(imgRatio * x2Float)
    y2 = int(imgRatio * y2Float)

    # cropped
    imageCropped = imageOriginal[y1:y2,x1:x2]

    # rotated
    image_center = tuple(np.array(imageCropped.shape[1::-1]) / 2)
    rot_mat = cv2.getRotationMatrix2D(image_center, 4.7, 1.0)
    imageRotated = cv2.warpAffine(imageCropped, rot_mat, imageCropped.shape[1::-1], flags=cv2.INTER_LINEAR)

    outputFilePath = outputPhotosDir + inputFileName[0:(len(inputFileName)-4)] + '_' + format(j,'04d') + '.jpg'
    cv2.imwrite(outputFilePath, imageRotated)
    print('Wrote ' + outputFilePath)
Enter fullscreen mode Exit fullscreen mode

Now we have image clearly showing license plate. We still need to enhance-enhance before handing off to OCR. Thus repeat as before: find the license plate in the now-cropped image, in order to subsequently crop to just license plate. Next: YOLO model for license plates!

Next Link: TBD -- Model for detecting license plates in cropped image

Appendix: Interesting Points

  • Rotation (2d), and not perspective warping (3d) was an evolution of process. Initially I thought I would use perspective warping via homography as I knew how. But long story short, I realized it was over-engineering after tinkering on the solution. Why go through the effort of calculating perspective warp from my camera to highway, when a simple rotation of the image would work just as well - and a lot more simple?

Top comments (0)