Abstract
Build a YOLO model from scratch to find bounding boxes of cars in images. This includes labeling training images, training the model, and evaluating different configurations to choose the best-performing model.
ToC (Step-by-Step)
- Overview: OpenCV in Python for End-to-end License Plate Detection.
- Camera and computer setup. Raspberry Pi (RPi), Canon DSLR, f-stop, and ISO.
- Capturing images from DSLR to RPi. Automating capture of images and transfer to RPi.
- Model for detecting cars. Train model from scratch using YOLO and labeled images.
- Crop bounding box + rotate.
- TBD -- Model for finding license plates. Train a second model with YOLO.
- TBD -- Crop bounding box.
- TBD -- Read license plate with OCR. Pre-process image, and extract text with Paddle OCR.
Recipe
At this point we have a focused camera automatically capturing and transferring images, and we're ready to process photos on the way to reading license plate numbers.
As humans, we see cars in the images from the prior article. But our computers can't yet distinguish those entities the way we can, not without being trained to. So we will teach the computer what a car on a highway looks like.
Teaching a computer what a car looks like distills essentially to: 1) collect training data, 2) label the training data, 3) train a model on it, 4) train variants with different configurations and choose the best. We'll walk through those steps next.
-
Collect training data
- Run the looped capture from step two under good light for a while. I compiled 145 images.
Sample image
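The capture loop can be sketched like this, assuming a gphoto2-based tethered capture as in the earlier setup article. The command, flags, and output paths here are illustrative; swap in your own capture command.

```python
import subprocess
import time

def capture_cmd(index, out_dir='photos/yolo_cars/all'):
    # Build the tethered-capture command for frame `index`.
    # (Illustrative gphoto2 invocation -- substitute your own capture command.)
    filename = f'{out_dir}/capture_{index:04d}.jpg'
    return ['gphoto2', '--capture-image-and-download', '--filename', filename]

def capture_batch(count, interval_s=5):
    # Capture `count` frames, one every `interval_s` seconds.
    for i in range(count):
        subprocess.run(capture_cmd(i), check=True)  # requires a connected camera
        time.sleep(interval_s)
```

Running this for a few minutes in good light is enough to build a small training set like the 145 images above.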
-
Label training data
- I used roboflow.com and recommend their tooling. I'm sure there are alternatives if you prefer, but Roboflow made the end-to-end process easier: training the model requires label data in a particular format, which Roboflow generates for you, and its online tooling streamlines building the labels.
- Upload your images
- Hand-label your images. This is time-consuming. I suggest learning the hotkeys.
Sample labeling from roboflow - one of many
- I suggest defining guidelines for yourself to mitigate the monotony. My bounding boxes generally stretched from the top-left turn-signal light to the bottom-right wheel well, highlighting the entire rear bumper. I'm not saying use the same bounding box. I'm suggesting: know what your guideposts are. Knowing your guideposts will keep your labeling consistent and help with the tedium.
- Download the training data. This is just the formatted annotation files capturing all your bounding boxes for training (for the YOLOv8 export, plain-text label files plus a data.yaml).
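For reference, the YOLO-format export is one .txt file per image, with one line per bounding box: a class index followed by the normalized center x, center y, width, and height (all relative to the image dimensions). The values below are illustrative; a file with two labeled cars might look like:

```
0 0.512 0.634 0.210 0.180
0 0.133 0.420 0.175 0.150
```

Roboflow also writes the data.yaml that the training script points at, mapping class indices to names and listing the train/val image folders.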
-
Train model YOLO
- Note that my output model is "v21", i.e., I trained several models to experiment with different parameters. I recommend testing and iterating on configurations that work for you: nano model vs. medium, fewer epochs or more, etc.
- I found the medium model with default epoch and batch settings produced the best results for my training set and labels. I trained five variant models, including nano, medium, and xlarge, with various permutations of epochs and batch size.
- On a MacBook M1 Pro with 32GB of memory, the winning model took 4.5 hours to train. The five training permutations took ~33 hours total.
```python
from ultralytics import YOLO

###########
# Training

# Load the model
baseDir = '/Users/japollock/Projects/TrainHighwayCarDetector/'
model = YOLO(baseDir + 'resources/yolov8m.pt')

results = model.train(
    imgsz=1280,
    # epochs=100,
    # batch=8,
    # device='mps',
    data=baseDir + 'resources/v2data_CARS/data.yaml',
    name='yolov8m_100_8_CARS_v21'
)
###########
```
-
Test YOLO.
- I compared the five model variants on ClearML score dimensions: precision, recall, mAP50, and mAP50-95. I also confirmed model performance manually by visualizing results with the script below, for both the best and worst models. The worst model had false positives, multiple detections on a single car, and generally lower confidence scores. The best model had no false positives (in limited testing), no multiple detections, etc. Seeing the output helped, in addition to the ClearML scoring.
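Those mAP scores all rest on intersection-over-union (IoU) between predicted and labeled boxes: a prediction counts as correct when its IoU with a ground-truth box clears a threshold (0.5 for mAP50; mAP50-95 averages over thresholds from 0.5 to 0.95). A minimal IoU computation, for intuition:

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    # Intersection rectangle (degenerates to zero area when boxes are disjoint)
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    # Union = sum of both areas minus the double-counted intersection
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

Identical boxes score 1.0, disjoint boxes 0.0; a half-overlapping pair lands at 1/3, below the 0.5 cutoff.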
```python
from ultralytics import YOLO
import cv2

###########
# predict
baseDir = '/Users/japollock/Projects/TrainHighwayCarDetector/'

###########
# model
model = YOLO(baseDir + 'src/runs/detect/yolov8m_v21_100e/weights/best.pt')

###########
# images
img = cv2.imread(baseDir + 'photos/yolo_cars/all/IMG_4553.jpg')
results = model.predict(
    source=img,
    imgsz=1280
)
cv2.imshow("image", results[0].plot())
cv2.waitKey(0)
```
Sample classification from best model. “Cars” is the classifier label.
Now we have a model detecting car bounding boxes. This lets us hone in on the license plate from the highway view. Next up is cropping down to the bounding box, and perspective-warping to rectangularize the license plate from its askew angle.
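The cropping step amounts to slicing the image array with the box's xyxy coordinates. A sketch with a dummy array (with real results you would read the coordinates from `results[0].boxes`):

```python
import numpy as np

def crop_box(img, xyxy):
    """Crop an HxWxC image array to an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = map(int, xyxy)
    return img[y1:y2, x1:x2]

# Dummy 720x1280 RGB frame standing in for a real photo
frame = np.zeros((720, 1280, 3), dtype=np.uint8)
crop = crop_box(frame, (100, 200, 500, 450))  # 400px wide, 250px tall region
```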
Next Link: Crop bounding box + rotate.
Appendix: References
- https://app.roboflow.com/trainhighwaydetection
- https://learnopencv.com/train-yolov8-on-custom-dataset/
- https://app.clear.ml/projects/259cb1f4acbf49f4be55654b329429b1/experiments/049c87e301e84eb0986d74d7b5ed16d1/output/log
Appendix: Interesting Points
- Rabbit hole of car detectors
- Training a model to detect cars was an evolution in process. I started with the idea that I could just download an off-the-shelf model somewhere on the Internet and use that.
- First contact with “cars” model https://www.analyticsvidhya.com/blog/2021/12/vehicle-detection-and-counting-system-using-opencv/#wait_approval
- BUT from what I can gather, the first link was really based on this work: https://github.com/andrewssobral/vehicle_detection_haarcascades
- Actual academic style paper detailing classifier https://github.com/andrewssobral/vehicle_detection_haarcascades/blob/master/doc/Automatic_Detection_of_Cars_in_Real_Roads_using_Haar-like_Features.pdf
- Aside from folks using the model without attribution, this model had terrible performance on my camera setup (i.e., an askew, rear-facing orientation to the cars), and probably will on yours too. Models generally have this quality: they work well on their training data, and not necessarily on yours. I'm sure this model was great for detecting the front orientation of a car, or the side view. But our setup necessitated training a model from scratch.