Model for Detecting Cars

Abstract

Build a YOLO model from scratch to find bounding boxes of cars in images. This covers labeling the training images, training the model, and evaluating different configurations to pick the best model.

ToC (Step-by-Step)

License plate detection workflow visualized

  1. Overview: OpenCV in Python for End-to-end License Plate Detection.
  2. Camera and computer setup. Raspberry Pi (RPi), Canon DSLR, f-stop, and ISO.
  3. Capturing images from DSLR to RPi. Automating capture of images and transfer to RPi.
  4. Model for detecting cars. Train model from scratch using YOLO and labeled images.
  5. Crop bounding box + rotate.
  6. TBD -- Model for finding license plates. Train a second model with YOLO.
  7. TBD -- Crop bounding box.
  8. TBD -- Read license plate with OCR. Pre-process image, and extract text with Paddle OCR.

Recipe

At this point we have a focused camera automatically capturing and transferring images, and we’re ready to process photos on the way to reading license plate numbers.

As humans, we see the cars in the images from the prior article. But our computers can't distinguish those entities the way we can, not without being trained to. So we will teach the computer what a car on a highway looks like.

Teaching the computer what a car looks like essentially distills to: 1) collect training data, 2) label the training data, 3) train a model on the labeled data, and 4) train variants with different configurations and choose the best. We’ll walk through those steps next.

  1. Collect training data

    • Run the looped capture from step two under good light for a while; I compiled 145 images. A minimal sketch of that kind of capture loop follows below. (Sample image from my camera setup, used for labeling.)
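    • Below is a minimal sketch of the kind of capture loop I mean. It assumes gphoto2 triggers the DSLR from the RPi; the command, image count, and output directory are placeholders, so substitute whatever you set up in the capture step.
    import subprocess
    import time
    
    ###########
    # Collect training images in a loop (sketch; paths are placeholders)
    NUM_IMAGES = 145
    OUT_DIR = '/home/pi/captures'
    
    for i in range(NUM_IMAGES):
        # capture one frame on the DSLR and pull it onto the RPi
        subprocess.run([
            'gphoto2',
            '--capture-image-and-download',
            '--filename', f'{OUT_DIR}/train_{i:04d}.jpg'
        ], check=True)
        time.sleep(2)  # short pause so consecutive frames show different traffic
    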
  2. Label training data

    • I used roboflow.com and recommend their tooling. There are alternatives if you prefer, but Roboflow made the end-to-end process easier: training the model requires label data in a particular format, which Roboflow generates for you, and its online tooling streamlines building the labels.
    • Upload your images
    • Hand-label your images. This is time consuming; I suggest learning the hotkeys. (Sample labeling of cars on the highway from Roboflow, one of many.)
    • I suggest defining guidelines for yourself to mitigate the monotony. My bounding boxes generally stretched from the top-left turn-signal light to the bottom-right wheel well, highlighting the entire rear bumper. I’m not saying use the same bounding box; I’m suggesting you know what your guideposts are. Knowing your guideposts will keep your labeling consistent and ease the tedium.
    • Download the training data. This is just the formatted annotation files capturing all your bounding boxes for training; the YOLO export used in the next step is plain-text label files plus a data.yaml rather than XML. A small sketch of the label format follows below.
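    • For reference, here is a small sketch of reading one YOLO-format label back into pixel coordinates and drawing it, which is mostly useful for sanity-checking an export. The file names are hypothetical.
    import cv2
    
    ###########
    # Inspect one label file from the export (sketch; paths are hypothetical)
    IMG_PATH = 'v2data_CARS/train/images/example.jpg'
    LBL_PATH = 'v2data_CARS/train/labels/example.txt'
    
    img = cv2.imread(IMG_PATH)
    h, w = img.shape[:2]
    
    with open(LBL_PATH) as f:
        for line in f:
            # each row: class_id x_center y_center width height (normalized 0-1)
            cls_id, xc, yc, bw, bh = line.split()
            xc, yc, bw, bh = float(xc) * w, float(yc) * h, float(bw) * w, float(bh) * h
            x1, y1 = int(xc - bw / 2), int(yc - bh / 2)
            x2, y2 = int(xc + bw / 2), int(yc + bh / 2)
            cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
    
    cv2.imshow("labels", img)
    cv2.waitKey(0)
    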
  3. Train the YOLO model

    • Note that my output model is “v21”, i.e. I trained several models to experiment with different parameters. I recommend testing and iterating on the configurations that work for you: nano model vs. medium, fewer epochs or more, etc.
    • I found the medium model with default epoch and batch settings produced the best results for my training set and labels. I trained five variant models, spanning nano, medium, and xlarge, with various permutations of epochs and batch size.
    • On a MacBook M1 Pro with 32 GB of memory, it took 4.5 hours to train the winning model; the five training permutations took ~33 hours total.
    from ultralytics import YOLO
    
    ###########
    # Training
    # Load the pretrained medium checkpoint (yolov8m) as the starting point
    baseDir = '/Users/japollock/Projects/TrainHighwayCarDetector/'
    model = YOLO(baseDir + 'resources/yolov8m.pt')
    
    # Train on the labeled export; the commented lines are the knobs I varied
    # across runs (epochs, batch size, and device='mps' for the Mac GPU)
    results = model.train(
       imgsz=1280,
    #   epochs=100,
    #   batch=8,
    #   device='mps',
       data=baseDir + 'resources/v2data_CARS/data.yaml',
       name='yolov8m_100_8_CARS_v21'
    )
    ###########
    
  4. Test YOLO.

    • I compared the five model variants on ClearML score dimensions: precision, recall, mAP50, and mAP50-95. I also confirmed model performance by manually visualizing the results with the script below, for both the best and worst models. The worst model produced false positives, multiple detections on a single car, and generally lower confidence scores; the best model had no false positives (in limited testing), no multiple detections, etc. It helped to see the output in addition to the ClearML scoring. (A sketch of pulling the same metrics programmatically follows after this step.)
    from ultralytics import YOLO
    import cv2
    
    ###########
    # predict
    baseDir = '/Users/japollock/Projects/TrainHighwayCarDetector/'
    
    ###########
    # model: load the best weights from the winning training run
    model = YOLO(baseDir + 'src/runs/detect/yolov8m_v21_100e/weights/best.pt')
    
    ###########
    # images: read a sample capture and run detection at the training resolution
    img = cv2.imread(baseDir + 'photos/yolo_cars/all/IMG_4553.jpg')
    
    results = model.predict(
       source=img,
       imgsz=1280
    )
    
    # draw the predicted boxes on the image and show it
    cv2.imshow("image", results[0].plot())
    cv2.waitKey(0)
    

    Sample detection from the best model, successfully detecting cars in the highway image. “Cars” is the classifier label.
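
If you want the same precision, recall, and mAP numbers outside of ClearML, Ultralytics can compute them per variant with its built-in validation. Here is a rough sketch; the run names are illustrative, not my exact folder names.

    from ultralytics import YOLO
    
    ###########
    # Compare trained variants on the same validation set (sketch; run names illustrative)
    baseDir = '/Users/japollock/Projects/TrainHighwayCarDetector/'
    variants = ['yolov8n_v21_100e', 'yolov8m_v21_100e']
    
    for name in variants:
        model = YOLO(baseDir + 'src/runs/detect/' + name + '/weights/best.pt')
        metrics = model.val(data=baseDir + 'resources/v2data_CARS/data.yaml', imgsz=1280)
        print(name,
              'precision:', round(metrics.box.mp, 3),
              'recall:', round(metrics.box.mr, 3),
              'mAP50:', round(metrics.box.map50, 3),
              'mAP50-95:', round(metrics.box.map, 3))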

Now we have a model detecting cars’ bounding boxes. This lets us home in on the license plate from the highway view. Next up is cropping down to the bounding box and perspective-warping to rectangularize the askew license plate.
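
As a small preview of that next step, the predicted boxes are already on the results object, so cropping each car is just array slicing. This sketch reuses the model and image paths from the test script above; the actual crop-and-rotate approach is the subject of the next article.

    from ultralytics import YOLO
    import cv2
    
    ###########
    # Preview: crop each detected car out of the frame (sketch)
    baseDir = '/Users/japollock/Projects/TrainHighwayCarDetector/'
    model = YOLO(baseDir + 'src/runs/detect/yolov8m_v21_100e/weights/best.pt')
    img = cv2.imread(baseDir + 'photos/yolo_cars/all/IMG_4553.jpg')
    
    results = model.predict(source=img, imgsz=1280)
    
    # boxes.xyxy holds pixel-coordinate corners for each detection
    for i, (x1, y1, x2, y2) in enumerate(results[0].boxes.xyxy.cpu().numpy().astype(int)):
        crop = img[y1:y2, x1:x2]
        cv2.imwrite(f'car_{i}.jpg', crop)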

Next Link: Crop bounding box + rotate.

Appendix: References

Appendix: Interesting Points
