Abstract
So close! We now have cropped images containing just the license plate, derived in stages: highway view, down to car view, down to license plate. In this step, we pre-process the image to improve OCR, then read the characters from it.
Recipe
The following pre-processing steps help focus OCR on just the important image information and suppress everything else (e.g. RGB color, grays, small contours). To be clear, this pre-processing sequence prior to OCR is a fairly typical pattern taught in computer vision; I'm not inventing or discovering something new here. At the bottom of the article, we run OCR with no pre-processing to illustrate the difference.
Before pre-processing - cropped image of license plate from prior steps:
-
Pre-process step: desaturate colors. Go gray.
RGB color is not helpful for OCR here. It doesn't matter whether the license plate characters are magenta or cyan, so we discard the color information (see the snippet below).
-
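In OpenCV this is a single call. A minimal sketch (the filename here is a placeholder; the full script at the end uses the real path):

import cv2

# cv2.imread returns pixels in BGR order, so convert BGR -> grayscale
img = cv2.imread('croppedPlate.jpg')  # placeholder path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)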
Pre-process step: gaussian blur to reduce noise / artifacts slightly.
Never mind the "Gaussian" part; this is just applying a blur filter to the image. The blur softens small differences, and because small differences are reduced (e.g. grime on a dirty-ish license plate), the big differences (e.g. dark characters on a white background) become easier to distinguish. See the call below.
-
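Continuing from the gray image above, the blur is one line; the kernel size controls how much smoothing you get:

# 5x5 Gaussian kernel; a sigma of 0 lets OpenCV derive it from the kernel size.
# A larger kernel blurs more aggressively; (5, 5) is a mild smoothing.
blur = cv2.GaussianBlur(gray, (5, 5), 0)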
Pre-process step: threshold to really highlight license plate characters
Thresholding converts the blurred grayscale image to pure black and white, further highlighting the big differences such as dark characters on a white background (see the call below).
-
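Continuing from the blurred image, this is the thresholding call from the full script, annotated:

# Otsu's method picks the threshold value automatically (so the 0 is ignored),
# and THRESH_BINARY_INV inverts the result: characters become white on black.
ret, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)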
OCR license plate
I experimented with a bunch of OCR libraries here. The best for this use case was PaddleOCR; it beat Tesseract and the others I tried. Run PaddleOCR on the thresholded image, and get the license plate number back from the image as a text string!
Profit!
import cv2
from paddleocr import PaddleOCR
from ppocr.utils.logging import get_logger
import logging

# Initialize PaddleOCR
ocr = PaddleOCR(use_angle_cls=True, lang='en', use_space_char=True)
logger = get_logger()
logger.setLevel(logging.ERROR)  # avoid debug statements in OCR results

#baseDir = '/Users/japollock/Projects/TrainHighwayCarDetector/'
baseDir = '/home/pi/Projects/TrainHighwayCarDetector/'
img_path = baseDir + 'photos/yolo_licensePlates/croppedPlates/IMG_4554_0002.jpg'
print("Human readable is CEZ2594")

# Pre-process: grayscale, blur, threshold
img = cv2.imread(img_path)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # cv2.imread returns BGR, not RGB
blur = cv2.GaussianBlur(gray, (5, 5), 0)
ret, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)

# Perform OCR
result = ocr.ocr(thresh, cls=True)
for line in result:
    print(line)
The print statement outputs:
[[[[14.0, 29.0], [181.0, 22.0], [184.0, 76.0], [17.0, 83.0]], ('CEZ2594', 0.9654235243797302)]]
We read the license plate! The number to the right of the string is the confidence score, i.e. PaddleOCR is ~96.5% confident the string is correct. The numbers on the left are the corner coordinates of the bounding box. 🥳
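If you want those fields individually rather than the raw nested list, you can unpack each detection. A sketch against the result shape printed above (note the exact nesting can vary between PaddleOCR versions):

for page in result:
    for box, (text, confidence) in page:
        print(f"text={text} confidence={confidence:.3f} box={box}")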
Next Link: Tie it all together: End-to-end License Plate Detection
Appendix: References
- https://paddlepaddle.github.io/PaddleOCR/main/en/index.html#pp-ocrv3-chinese-model
- https://pypi.org/project/paddleocr/2.4/
Appendix: Interesting Points
Folks might wonder: why not just OCR the original image? You can. But this is what you get:
Note the results on the right-hand side, not the red bounding box. Firstly, three of the seven characters are wrong. This is the same input we just demonstrated (same license plate, same image), only without the pre-processing steps. Secondly, the confidence score is low. This is why we do the pre-processing steps.
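For reference, the no-pre-processing run is just the same OCR call pointed at the raw crop, reusing ocr and img from the script above:

# Same OCR call, but on the original BGR crop with no pre-processing.
# Expect wrong characters and a noticeably lower confidence score.
raw_result = ocr.ocr(img, cls=True)
for line in raw_result:
    print(line)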