Pretrained PyTorch license plate segmentation model (DeepLabV3 with ResNet-101 backbone)
Install all dependencies:
# With conda - best to start in a fresh environment:
conda install --yes pytorch torchvision cudatoolkit=10.2 -c pytorch
conda install --yes opencv
conda install --yes matplotlib
conda install --yes -c conda-forge tensorboard
pip install mmcv
# Or clone this repo, replacing the '-' in the directory name with '_' to allow Python imports:
git clone https://github.com/dennisbappert/pytorch-licenseplate-segmentation pytorch_licenseplate_segmentation
# Get started with the sample notebooks
Making predictions:
# Imports used by the snippets below
import numpy as np
import torch
from PIL import Image
from torchvision import transforms

# Load the model (create_model() is assumed to be importable from this repository;
# `weights` is the path to a downloaded checkpoint such as model_v2.pth):
model = create_model()
checkpoint = torch.load(weights, map_location='cpu')
model.load_state_dict(checkpoint['model'])
_ = model.eval()

if torch.cuda.is_available():
    model.to('cuda')

# Prediction pipeline
def pred(image, model):
    # Convert to a tensor and normalize with the ImageNet mean and standard deviation
    preprocess = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    input_tensor = preprocess(image)
    input_batch = input_tensor.unsqueeze(0)  # add a batch dimension

    if torch.cuda.is_available():
        input_batch = input_batch.to('cuda')

    with torch.no_grad():
        output = model(input_batch)['out'][0]
        return output

# Loading an image (`filename` is the path to the input image)
img = Image.open(filename).convert('RGB')

# Defining a threshold for predictions
threshold = 0.1  # 0.1 seems appropriate for the pre-trained model

# Predict
output = pred(img, model)
output = (output > threshold).type(torch.IntTensor)
output = output.cpu().numpy()[0]

# Extracting coordinates as (row, column) pairs
result = np.where(output > 0)
coords = list(zip(result[0], result[1]))

# Overlay the original image (putpixel expects (x, y), i.e. (column, row))
for cord in coords:
    img.putpixel((cord[1], cord[0]), (255, 0, 0))
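
The per-pixel putpixel loop above can be slow on large images. The following is a minimal sketch of a vectorized alternative, assuming the image has been converted to a NumPy array of shape (H, W, 3) and the mask has shape (H, W). The helper name overlay_mask is hypothetical and not part of this repository.

```python
import numpy as np

def overlay_mask(image_array, mask, color=(255, 0, 0)):
    """Return a copy of image_array with every masked pixel set to color."""
    out = np.array(image_array, copy=True)  # leave the input untouched
    out[np.asarray(mask) > 0] = color       # boolean indexing over (H, W)
    return out
```

With the variables from the snippet above, this could be used as `Image.fromarray(overlay_mask(np.array(img), output))`.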
The following models have been pretrained (with links to download the PyTorch state_dicts):
Model name | mIOU | Training dataset |
---|---|---|
model.pth (477MB) | 79.51 | Handcrafted |
model_v2.pth - new version (477MB) | 83.17 | Handcrafted |
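
For readers unfamiliar with the mIoU column, here is a minimal sketch of how intersection over union can be computed for binary masks. It assumes the mean is taken over the two classes (plate and background). How the values in the table were actually computed is not stated here, and the function names are hypothetical.

```python
import numpy as np

def binary_iou(pred_mask, true_mask):
    """Intersection over union of two binary masks of equal shape."""
    pred = np.asarray(pred_mask).astype(bool)
    true = np.asarray(true_mask).astype(bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return float(np.logical_and(pred, true).sum() / union)

def mean_iou(pred_mask, true_mask):
    """Average the IoU of the plate class and the background class."""
    pred = np.asarray(pred_mask).astype(bool)
    true = np.asarray(true_mask).astype(bool)
    return 0.5 * (binary_iou(pred, true) + binary_iou(~pred, ~true))
```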
The base weights are from here.
The model has been trained via transfer learning on a small hand-crafted dataset of 130 images. Several augmentations were applied during each epoch to help the model generalize. However, certain classes are underrepresented (motorcycles, buses, American trucks).
The learning process has been adapted from the original training with the following changes:
The model has been trained on an RTX 2060 Super with a batch size of 2 for 250 epochs (8 hours of training time) on a Windows machine with an eGPU. After training, the model was fine-tuned using Stochastic Weight Averaging in order to generalize better (with big success; most initial bugs disappeared).
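
The core idea of Stochastic Weight Averaging is to keep an equal-weight running average of the model weights collected at several points late in training. The sketch below illustrates only that averaging step, using plain NumPy dictionaries in place of a real state_dict. The class name is hypothetical and this is not the training code used for this model. PyTorch ships utilities for this in torch.optim.swa_utils (AveragedModel, SWALR, update_bn), and batch-norm statistics generally need to be recomputed after averaging.

```python
import numpy as np

class RunningWeightAverage:
    """Keeps an equal-weight running average of parameter snapshots."""

    def __init__(self):
        self.n_averaged = 0
        self.average = None

    def update(self, state):
        # state: dict mapping parameter name -> array-like of weights
        if self.average is None:
            self.average = {k: np.array(v, dtype=np.float64) for k, v in state.items()}
        else:
            for k, v in state.items():
                # incremental mean: avg += (new - avg) / (n + 1)
                self.average[k] += (np.asarray(v) - self.average[k]) / (self.n_averaged + 1)
        self.n_averaged += 1
```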
The attached YouTube video shows the strengths and weaknesses of the current model:
This notebook demonstrates how you can leverage the model to make predictions on single images.
This notebook demonstrates how you can leverage the model to overlay license plates in videos. This is just an example: it loads all frames into memory, which is not optimal for large videos. No resizing is applied either, so you need a GPU with at least 8 GB of memory for HD videos.
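
Both limitations could be eased by streaming frames in small batches and downscaling them before inference. The following is a minimal sketch of those two building blocks in plain Python. The helper names batched and fit_within are hypothetical, and the notebook itself does not work this way.

```python
from itertools import islice

def batched(frames, batch_size):
    """Yield lists of up to batch_size frames from any iterable, lazily."""
    it = iter(frames)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def fit_within(width, height, max_side):
    """Return (width, height) scaled down so the longer side is at most max_side."""
    scale = min(1.0, max_side / max(width, height))  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The frame source can be any generator, for example one that wraps a video reader and yields one frame per read, so that only one batch is held in memory at a time.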
Another obvious use case is to extract the license plates and use the segmentation results for further license plate recognition. This notebook shows how to extract the results.
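
As a rough illustration of the extraction idea, the sketch below crops an image to the bounding box of the predicted mask using NumPy. It assumes a single plate per image and an image array whose first two axes match the mask. The function names are hypothetical, and the notebook may take a different approach.

```python
import numpy as np

def mask_bounding_box(mask):
    """Return (top, left, bottom, right) around nonzero pixels, or None if empty.

    bottom and right are exclusive, so they can be used directly in slices.
    """
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()) + 1, int(cols.max()) + 1

def crop_to_mask(image, mask):
    """Crop image (H, W) or (H, W, C) to the bounding box of mask (H, W)."""
    box = mask_bounding_box(mask)
    if box is None:
        return None
    top, left, bottom, right = box
    return image[top:bottom, left:right]
```

Images with several plates would need the mask to be split into connected regions first, for example with OpenCV's connectedComponents.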
Example:
I started this project out of curiosity and to improve my skill set. In particular, I wanted to examine whether it is possible to solve an image segmentation problem with a very small dataset, and I think I have been somewhat successful so far. I’m pretty new to this domain and appreciate all feedback. That said, feel free to open as many issues as required.
I’m planning to do the following things in the future:
By default the dataloader expects the following structure:
Feel free to change the dataloader to your needs. The masks are expected to be black-and-white PNGs using 8-bit pixels. The training outputs a TensorBoard summary to ./runs/.
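
To make the mask format concrete, here is a minimal sketch that validates an 8-bit grayscale mask array and converts it to a 0/1 label map. It assumes that white pixels mark the plate. The cutoff value and the function name are hypothetical and may not match what the dataloader does.

```python
import numpy as np

def to_binary_mask(mask, cutoff=127):
    """Convert an 8-bit grayscale mask (0..255) to 0/1 labels."""
    mask = np.asarray(mask)
    if mask.dtype != np.uint8 or mask.ndim != 2:
        raise ValueError("expected a 2-D uint8 (8-bit grayscale) mask")
    return (mask > cutoff).astype(np.uint8)  # white -> 1 (plate), black -> 0
```

A PNG can be loaded into such an array with `np.array(Image.open(path).convert('L'))` from Pillow.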
The data should look like this:
PyTorch DeepLabV3 ResNet-101 implementation and weights: https://pytorch.org/hub/pytorch_vision_deeplabv3_resnet101/
Lovasz-Softmax loss function: https://github.com/bermanmaxim/LovaszSoftmax
PyTorch reference implementation used to train the model: https://github.com/pytorch/vision/tree/master/references/segmentation
Example videos: https://www.pexels.com/