License Plate Recognition
Scott Douglas
Sean Neubert

Overview:
Professor Gaborski will provide a database of images in which the license plates are clearly visible. If these are not provided, we will use a digital camera to take pictures in the parking lot. We will then develop an algorithm to locate the license plate in each image and draw a box around it. We will not recognize the characters on the plate.
Background:
There are several current implementations and theories on how to solve the problem of license plate recognition.
In "An Approach To License Plate Recognition", Parker and Federl first convert the image to grayscale. They then smooth the image with a 5x5 median filter. Afterwards they use a Shen-Castan edge detector to get an edge map. They go back to the smoothed image and locate the characters in the image as boxes. They then apply a genetic algorithm to find the license plate, scoring each candidate box by how well it matches the character boxes and the edge map. The character boxes give a good idea of where the license plate is, but the edge map is used to find the actual bounds of the plate.
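As an illustration only (this is not Parker and Federl's code), the 5x5 median smoothing step can be sketched in Python with NumPy. The function name and the choice to replicate border pixels are assumptions made for this sketch.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter(gray, size=5):
    """Replace each pixel with the median of its size x size neighbourhood."""
    pad = size // 2
    # Assumption: borders are handled by repeating the edge pixels.
    padded = np.pad(gray, pad, mode="edge")
    # One size x size window per output pixel.
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))
```

A median filter removes isolated bright or dark pixels while keeping step edges sharper than an averaging filter would, which is why it is a common choice before edge detection.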
In "Automatic License Plate Recognition", license plates are found by scanning an RGB image for specific color edges (black and white, red and white, and green and white, since these color pairs appear on nearly all types of license plates). Looking for only these edges eliminates most false positives and might remove the common reliance on ratio requirements to confirm a license plate match within an image.
However, these approaches are complex to implement and are not likely to be feasible in the time currently allocated. In "A Fast License Plate Extraction Method on Complex Background", Bai, Zhu, and Liu describe a method that uses edge detection and then removes noise effectively (edge detection methods normally suffer from too many false positives). We will attempt to employ this method.
Data:
Professor Gaborski has offered to furnish test images. If more images are required, Scott has a digital camera and there are lots of parking lots at RIT.
Work Plan:
We will work together as a team for the four major parts of the project: vertical edge detection, edge density map generation, binarization and dilation, and license plate location.
Progress for 01.26.05:
We have implemented the image preprocessing for the algorithm presented in "A Fast License Plate Extraction Method on Complex Background." The images below show how we take an image and create a binary representation of its vertical edges with little noise. Our parameters still need to be tweaked to reduce noise further. Our weak link right now is edge map generation.
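As an illustration only (this is not the project code), the two named building blocks of this preprocessing, a Sobel filter for vertical edges and Otsu's method for thresholding, can be sketched in Python with NumPy. The function names are assumptions, and the sketch thresholds the Sobel response directly, skipping the intermediate edge map step.

```python
import numpy as np


def sobel_vertical(gray):
    """Absolute horizontal-gradient response, which is strong at vertical edges."""
    g = np.pad(gray.astype(float), 1, mode="edge")
    # Sobel x kernel [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]:
    # weighted right-hand column minus weighted left-hand column.
    left = g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2]
    right = g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]
    return np.abs(right - left)


def otsu_threshold(values, bins=256):
    """Threshold that maximises the between-class variance of the histogram."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)                # pixel count at or below each bin
    w1 = w0[-1] - w0                    # pixel count above each bin
    s0 = np.cumsum(hist * centers)
    mu0 = s0 / np.maximum(w0, 1)        # mean of the lower class
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)  # mean of the upper class
    between = w0 * w1 * (mu0 - mu1) ** 2
    return centers[np.argmax(between)]


def vertical_edge_mask(gray):
    """Binary image that is True where the vertical-edge response is strong."""
    response = sobel_vertical(gray)
    return response > otsu_threshold(response)
```

For example, an image whose left half is black and whose right half is white produces a mask that is True only in the two columns next to the boundary.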
One interesting thing to note is that the clean-up stage tests each vertical strip of white pixels to make sure it forms a connected run of at least n pixels. If the run is shorter, it is converted to background. This threshold must be set in code, since we don't know how far the image is zoomed in. This limits the generalizability of the algorithm a bit.
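As an illustration only (this is not the project code), the clean-up test described above can be sketched in Python with NumPy. The function name and the min_len parameter, which plays the role of n, are assumptions.

```python
import numpy as np


def remove_short_vertical_runs(binary, min_len):
    """Set to background every vertical run of white pixels shorter than min_len."""
    cleaned = binary.copy()
    rows, cols = binary.shape
    for c in range(cols):
        r = 0
        while r < rows:
            if binary[r, c]:
                start = r
                # Walk down to the end of this run of white pixels.
                while r < rows and binary[r, c]:
                    r += 1
                if r - start < min_len:
                    cleaned[start:r, c] = False
            else:
                r += 1
    return cleaned
```

Because min_len is measured in pixels, a value that suits one zoom level will remove real plate edges or keep noise at another, which is the generalizability limit noted above.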
Our current code can be found here.
Original Image:

Vertical Edge Detection With Sobel Filter:

Edge Map:

Edge Map Thresholded Using Otsu's Method:

Cleaned Image:

Dilated Image:

Progress for 02.01.05:
We have greatly improved our algorithm. Comments are inline. Keep reading.
Our current code can be found here.
Original Image:

Grayscale Image:

Converting to grayscale first gives us more reliable edge map data.
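As an illustration only, one common way to convert an RGB image to grayscale is a weighted sum of the channels. The text does not say which conversion was used, so the ITU-R BT.601 luma weights and the function name below are assumptions.

```python
import numpy as np


def to_grayscale(rgb):
    """Weighted sum of the R, G and B channels (BT.601 luma weights, assumed)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3].astype(float) @ weights
```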
Red Map:

The Red Map has proven to be useless. Some pictures don't have headlights because it's a front view of the car. On some cars in the database, the headlights are below the license plate, which eliminates the application of vertical thresholding.
Vertical Edge Detection Using Sobel Filter:

Edge Map:

Edge Map Thresholded Using Otsu's Method:

Edge Map Dilated:

All the Thresholded Regions with Rectangles Around Them:

In the above image, each of the regions in our thresholded image has a rectangle drawn around it. We can now use this data to help determine the placement of the license plate. We plan to combine boxes in similar regions to form larger boxes. We can then use heuristics such as edge density and aspect ratio to determine which rectangle contains the actual license plate.
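As an illustration only (this is not the project code), finding a rectangle around each thresholded region can be sketched in Python as a flood fill over the binary image. The function name, the use of 4-connectivity, and the inclusive (top, left, bottom, right) box format are assumptions.

```python
from collections import deque

import numpy as np


def bounding_boxes(binary):
    """Return (top, left, bottom, right), inclusive, for each 4-connected region."""
    rows, cols = binary.shape
    seen = np.zeros((rows, cols), dtype=bool)
    boxes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not binary[r0, c0] or seen[r0, c0]:
                continue
            # Breadth-first flood fill from the first unseen pixel of a region.
            top, left, bottom, right = r0, c0, r0, c0
            queue = deque([(r0, c0)])
            seen[r0, c0] = True
            while queue:
                r, c = queue.popleft()
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and binary[nr, nc] and not seen[nr, nc]):
                        seen[nr, nc] = True
                        queue.append((nr, nc))
            boxes.append((top, left, bottom, right))
    return boxes
```

The boxes are returned in raster-scan order of each region's first pixel, so later heuristics can rank or merge them without touching the image again.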
Progress for 02.08.05:
Our current code can be found here.
We started using both vertical and horizontal edge data to get a better representation of the license plate. When we used only the vertical data, the plate broke up into small blobs, and these small blobs were then removed when we filtered the regions. We also began to dilate by a larger amount, which helped increase the blob size. The paper we were trying to implement uses a complicated density-based method, but its description is very ambiguous, so we decided not to use it. The dilation seems to do the same thing, but much faster.
We are now accurate approximately 33% of the time. In a few cases we don't find anything that meets our criteria. In most of the other cases, we find the license plate, but also find other areas.
Our work needs to focus on finding the next best way to filter out regions. We currently filter on size and aspect ratio. We could then look into how to locate the license plate in the remaining images, where it didn't show up using the current method. Some of our results are pictured below:
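As an illustration only (this is not the project code), the dilation and the size and aspect ratio filter can be sketched in Python with NumPy. The function names, the square structuring element, and every threshold value are assumptions; the text does not state the values actually used.

```python
import numpy as np


def dilate(binary, radius):
    """Binary dilation with a (2 * radius + 1) square structuring element."""
    binary = binary.astype(bool)
    rows, cols = binary.shape
    padded = np.pad(binary, radius, mode="constant", constant_values=False)
    out = np.zeros((rows, cols), dtype=bool)
    # OR together every shifted copy of the image within the square window.
    for dr in range(2 * radius + 1):
        for dc in range(2 * radius + 1):
            out |= padded[dr:dr + rows, dc:dc + cols]
    return out


def plausible_plate(box, min_area, min_aspect, max_aspect):
    """Keep a box (top, left, bottom, right) only if its size and shape fit."""
    top, left, bottom, right = box
    height = bottom - top + 1
    width = right - left + 1
    return (width * height >= min_area
            and min_aspect <= width / height <= max_aspect)
```

Vertical and horizontal edge masks of the same shape can be combined with `vertical_mask | horizontal_mask` before calling `dilate`. A larger radius merges the small blobs of a plate into one region, at the cost of also merging nearby clutter.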










Progress for 02.15.05:
Our current code can be found here.
Since we started to correctly identify the license plate most of the time (along with some other junk), our main goal has been to narrow our results and make sure the algorithm returns only the license plate. Our new algorithm returns either one box that is its best guess of where the license plate is, or nothing if it has no idea where the license plate is. To do this, we run the algorithm as in the previous week. If multiple regions are found, we take a horizontal sliver from the center of each region. We then analyze the slivers to see which one contains the most edges from our vertical edge detection. That region becomes our best guess.
On the first twenty images, our success rate is 70% with 15% false positives. The remaining 15% of the time we don't find anything. On the next twenty images, our success rate drops dramatically to 35% with 50% false positives. The remaining 15% of the time we don't find anything. The only explanations we can come up with are that the last twenty images are simply more difficult, or that we tuned our algorithm to the first twenty images more than to the last twenty. Our results for the entire image database were as follows:
Find Plate: 52.5%
Find Something Else: 32.5%
Find Nothing: 15%
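As an illustration only (this is not the project code), the sliver test used to pick the best guess can be sketched in Python with NumPy. The function name, the inclusive (top, left, bottom, right) box format, and the sliver_fraction parameter are assumptions; the text does not say how tall the sliver is.

```python
import numpy as np


def best_region(vertical_edges, boxes, sliver_fraction=0.2):
    """Pick the box whose central horizontal sliver holds the most edge pixels.

    Returns None when there are no candidate boxes.
    """
    if not boxes:
        return None

    def sliver_count(box):
        top, left, bottom, right = box
        height = bottom - top + 1
        # Sliver height is a fraction of the box height, at least one row.
        sliver_h = max(1, int(round(height * sliver_fraction)))
        start = top + (height - sliver_h) // 2
        sliver = vertical_edges[start:start + sliver_h, left:right + 1]
        return int(sliver.sum())

    return max(boxes, key=sliver_count)
```

Counting raw edge pixels favors wide regions; dividing the count by the sliver width would compare edge density instead, which is a different heuristic from the one described.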







