At Parkopedia, we apply cutting-edge machine learning and computer vision methods to solve difficult parking problems. Several of our projects use street-level imagery and video from sources such as dashcams to extract parking insights.
In recent years, computer vision has made significant progress. Great effort now goes into making these developments generally applicable. However, most are still only applied and tested in laboratories on the same limited set of standard datasets. When working with real-world images, you often find yourself confronted with unexpected problems.
At Parkopedia, we constantly deal with these issues while working with ‘real world’ dashcam footage. In our efforts to extract useful parking information from this data, we started from our standard street segmentation model, which classifies each pixel in an image into one of 16 classes. The simple version of that model has 5 classes: road, vehicle, sidewalk, curb, and other. The model was originally trained on the A2D2 dataset, a public dataset provided by Audi that contains a set of videos recorded in Germany. Performance was strong, with the model able to discern the different classes with an accuracy of 97.5%.
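The article does not say how the accuracy figure is computed. One common choice for segmentation is pixel accuracy, the fraction of pixels whose predicted class matches the label. The sketch below assumes that metric and uses a tiny made-up label map; the function name and the class indices are illustrative only.

```python
import numpy as np

# The five classes of the simple model; the index order is an assumption.
CLASSES = ["road", "vehicle", "sidewalk", "curb", "other"]


def pixel_accuracy(predicted: np.ndarray, target: np.ndarray) -> float:
    """Return the fraction of pixels where the predicted class equals the label."""
    if predicted.shape != target.shape:
        raise ValueError("prediction and label maps must have the same shape")
    return float(np.mean(predicted == target))


# Toy 2x3 label map and prediction (class indices into CLASSES).
target = np.array([[0, 0, 1],
                   [2, 3, 4]])
predicted = np.array([[0, 0, 1],
                      [2, 4, 4]])  # one curb pixel misclassified as "other"

accuracy = pixel_accuracy(predicted, target)
print(f"pixel accuracy: {accuracy:.3f}")
```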
When applied to our dashcam videos recorded in London, the performance of the model significantly deteriorated.
The model seemed to malfunction in areas of the image where strong reflections appear. These are reflections of objects left on the vehicle dashboard, or of the dashboard itself. Humans have become accustomed to ignoring such reflections, but a model is limited to what it has seen during training. The training data did not contain reflections, so it was no surprise to see the model fail here.
One could also argue that the model fails because it was trained on videos of German streets rather than London streets. However, the model provides consistent results on the ‘CamVid’ dataset, another publicly available dataset recorded in the UK.
The next error source checked was the reflection itself. This kind of reflection can be diminished when recording a video by using a ‘dash mat’ which is a non-reflective cloth laid on the dashboard. However, we are not always able to control how the videos are being recorded as many are received from third parties. To use this form of data, we needed our segmentation model to be robust to reflection interference.
Augmenting Images with Artificial Reflections
Data augmentation is a technique used in machine learning that consists of randomly applying slight modifications to data so that the model sees beyond the original dataset. For instance, if your training data only contains bright images, your model might not perform as expected on dark images. Rather than collecting a new dataset of dark images, a simple solution is to artificially make your training images darker. You can apply the same logic to contrast, colors, etc.
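As a minimal sketch of the darkening example, the snippet below scales pixel intensities by a random factor. It assumes 8-bit images stored as NumPy arrays; the function name `random_darken` and the factor range are illustrative choices, not values from our pipeline.

```python
import numpy as np


def random_darken(image: np.ndarray, rng: np.random.Generator,
                  min_factor: float = 0.4, max_factor: float = 1.0) -> np.ndarray:
    """Scale pixel intensities by a random factor in [min_factor, max_factor]."""
    factor = rng.uniform(min_factor, max_factor)
    darkened = image.astype(np.float32) * factor
    # Clip and convert back to the original 8-bit range.
    return np.clip(darkened, 0, 255).astype(np.uint8)


rng = np.random.default_rng(0)
image = np.full((4, 4, 3), 200, dtype=np.uint8)  # stand-in for a bright training image
augmented = random_darken(image, rng)
```

Because the transform only changes intensities, the segmentation labels for the image can be reused unchanged.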
Similarly, our dataset did not contain images with reflections, so we started adding artificial reflections to our training images. In this case, the inside of the vehicle is reflected onto the image, which means that anything lying on the dashboard could end up appearing in the image. The most visible and damaging reflections are those of the dashboard itself, but notebooks, wrappers, or anything else left there by the driver can also show up. We reproduced these reflections by adding such objects to our training images so that they look like reflections of items lying on the dashboard.
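One plausible way to add such an object is to alpha-blend a picture of it over the training image so that the scene still shows through, as it does with a real reflection. The sketch below assumes that approach; the function name `add_random_reflection`, the opacity range, and the random placement are all illustrative assumptions rather than a description of our actual pipeline.

```python
import numpy as np


def add_random_reflection(image: np.ndarray, overlay: np.ndarray,
                          rng: np.random.Generator,
                          min_alpha: float = 0.1,
                          max_alpha: float = 0.5) -> np.ndarray:
    """Blend `overlay` semi-transparently into `image` at a random position."""
    img_h, img_w = image.shape[:2]
    ov_h, ov_w = overlay.shape[:2]
    if ov_h > img_h or ov_w > img_w:
        raise ValueError("overlay must fit inside the image")

    # Random opacity and placement (both ranges are assumptions).
    alpha = rng.uniform(min_alpha, max_alpha)
    top = int(rng.integers(0, img_h - ov_h + 1))
    left = int(rng.integers(0, img_w - ov_w + 1))

    out = image.astype(np.float32)
    region = out[top:top + ov_h, left:left + ov_w]
    # Weighted blend: the scene stays visible underneath the fake reflection.
    out[top:top + ov_h, left:left + ov_w] = (
        (1.0 - alpha) * region + alpha * overlay.astype(np.float32)
    )
    return np.clip(out, 0, 255).astype(np.uint8)


rng = np.random.default_rng(0)
scene = np.zeros((8, 8, 3), dtype=np.uint8)       # stand-in for a training image
notebook = np.full((3, 4, 3), 200, dtype=np.uint8)  # stand-in for a dashboard object
with_reflection = add_random_reflection(scene, notebook, rng)
```

In this sketch the segmentation labels are left untouched, on the assumption that the model should keep predicting the underlying street scene and treat the reflection as noise.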
We successfully trained our model on the augmented dataset and it now reaches an accuracy of 97.2%, very similar to the original version, indicating that it has learned to handle the artificial reflections well. Analyzing the model’s performance on the ‘real world’ data, we can see that it is significantly less disorientated by reflections coming from the dashboard, as illustrated by the following images.
When working with real-world data, our researchers are constantly confronted with new problems coming from imperfect data. These can include reflections, occlusions, vandalized information signs, or poorly maintained parking infrastructure with completely washed-out demarcations. Sometimes the data is unfit for purpose and there is no option other than discarding it completely. More often, however, it just presents one more interesting problem for our team to resolve and, as this example shows, a little creative manipulation can do the trick!
Parkopedia is the world’s leading parking services provider used by millions of drivers and organizations such as Audi, Apple, BMW, Ford, Garmin, GM, Jaguar, Land Rover, Mercedes-Benz, Peugeot, Sygic, TomTom, Toyota, Volkswagen, and many others. Parkopedia is available in 15,000 cities across 89 countries globally, covering over 70 million parking spaces, helping drivers take the pain out of parking. Parkopedia helps drivers find the closest, cheapest, or available parking to their destination, pay in selected locations, and navigate directly to the parking space. Visit business.parkopedia.com for more information.