Lessons Learned from Kaggle’s Airbus Challenge.

By Yassine Alouini

The challenge banner

Over the last three months, I participated in the Airbus Ship Detection Kaggle challenge. As the title suggests, it is a computer vision detection competition (segmentation, to be more precise) proposed by Airbus (its satellite data division) that consists in detecting ships in satellite images.

Before I started this challenge, I was (and to some extent still am) a beginner in the domain of segmentation. I have a good grasp of “classical” machine learning (gradient boosting trees, linear models, and the like) and have used it in production, but deep learning was still new to me.

Previously, I wrote a series of blog posts explaining and implementing CNNs using Keras (check them here) and took the great Coursera deep learning track. However, I felt that what I had learned lacked practical applications. In fact, despite some guided projects, the course alone won’t help you develop “real-world” skills.

The competition’s timeline

So, what to do about that?

I started looking for “real-world” applications and, around March 2018, I came across the Data Science Bowl 2018 Kaggle challenge. The competition consisted in detecting cell nuclei. Like the Airbus challenge, it was an instance segmentation task.

When I found out about this challenge, it was nearing its end (the competition was over by the 16th of April 2018). So, I followed some of the discussions, read and explored some of the models, learned a lot about segmentation using the U-Net model (and its variations), but didn’t have time to participate.

Thus, when the Airbus challenge came, I was more excited and set the following goal for myself: train a segmentation model and make at least one submission from it.

Have I achieved my goal?

Yes, I have, and probably a lot more (you can decide for yourself once you have read this post).

Overall, the process was enjoyable (most of the time at least) and I have gained a lot of practical knowledge. In what follows, I will share with you some of the lessons that I have learned (in no particular order).

Let’s get started.

That’s an obvious one of course but isn’t the focus of many people (at least when they start).

As obvious as this will sound, each task is different. Many competitions look similar from afar (segmentation for satellite images for example) but subtle differences make each task unique and hard (data collected following a particular process, unbalanced dataset, different evaluation metric…).

So make sure to understand the task and don’t hesitate to rely on the community to help you.

Various computer vision tasks (source: http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture11.pdf)

By the way, if you are new to computer vision in general and image segmentation in particular, check these slides from Stanford’s CS231n course.

Don’t reinvent the wheel or in the case of deep learning, don’t re-learn all the weights from scratch.

In fact, a lot of segmentation architectures contain two parts: an encoder and a decoder. The encoder is generally initialized with pre-trained weights learned on another dataset (for example ImageNet).

It is also very easy to initialize your model using these weights. Check the following example:
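Here is a minimal sketch of what such an initialization might look like, assuming the segmentation-models package (credited just below) is installed. The function name build_unet and its default values are illustrative choices, not taken from my challenge code:

```python
def build_unet(backbone="resnet34", encoder_weights="imagenet"):
    """Build a U-Net whose encoder starts from pre-trained weights."""
    # Imported lazily so that defining this function does not require
    # the package to be installed.
    import segmentation_models as sm

    # encoder_weights="imagenet" loads ImageNet weights into the encoder;
    # passing None instead would train the encoder from scratch.
    return sm.Unet(
        backbone,
        encoder_weights=encoder_weights,
        classes=1,             # a single "ship" output channel
        activation="sigmoid",  # per-pixel probability
    )
```

Calling build_unet() returns a Keras model that you compile and fit as usual. Only the encoder is pre-trained; the decoder still has to be learned on your data.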

And before I forget, many thanks to Pavel Yakubovskiy (and other contributors) for the segmentation-models repo (note that it contains many more architectures than just U-Net). ;)

A partial view of my solution (source: https://github.com/yassineAlouini/airbus_ship_detection)

Many Kaggle users start a competition using (or forking) a kernel (Kaggle’s custom Jupyter notebook implementation).

This is fine for doing EDA (exploratory data analysis) and getting a feel for the competition and exploring new ideas.

However, this doesn’t scale well.

Imagine you want to collaborate with a colleague (a common practice for top competitors) and the entire code base is in notebooks. It is common knowledge that notebooks are hard to collaborate on and hard to version-control (even though there are some solutions such as nbdime).

Instead, I recommend the following better workflow:

  • Only keep notebooks for EDA and visualizations
  • Use git for version control
  • Make your code modular: a file for data pre-processing, one for modelling, one for evaluation, one for running the pipeline, and so on.
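To illustrate the modular idea, here is a toy sketch of an entry point that chains independent stages. The stage functions are hypothetical stand-ins; in a real project each one would live in its own file (pre-processing, modelling, evaluation) and be imported by the script that runs the pipeline:

```python
from typing import Callable, Iterable


def run_pipeline(data, stages: Iterable[Callable]):
    """Feed `data` through each stage in order and return the final result."""
    for stage in stages:
        data = stage(data)
    return data


# Hypothetical stages, kept trivial on purpose.
def preprocess(pixels):
    return [p / 255.0 for p in pixels]  # scale raw 0-255 values to [0, 1]


def predict(pixels):
    return [1 if p > 0.5 else 0 for p in pixels]  # stand-in for a trained model


def evaluate(predictions):
    return {"positive_fraction": sum(predictions) / len(predictions)}


result = run_pipeline([0, 64, 200, 255], [preprocess, predict, evaluate])
```

Because each stage only depends on its input, you can test, swap, or version any of them without touching the others.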

These steps may feel like overkill when you start but trust me, it will be worth it in the long run.

Another benefit of this approach is that you will develop better software engineering skills. These are very valuable skills in the job market. In fact, you don’t only want to design good models but you also want to learn how to effectively collaborate with other colleagues, put models into production and make sure they scale.

Example of image data augmentation (source: https://github.com/albu/albumentations)

When working with deep learning models, the more data you have, the better. However, in most ML tasks, getting more labeled data is expensive. Labeling can be done manually using Amazon Mechanical Turk (or similar alternatives) or by experts (doctors for medical images, for example).

Fortunately, generating new images is more scalable thanks to data augmentation. This is a valuable technique to avoid overfitting.

A sample of data augmentation using Keras (source: https://github.com/yassineAlouini/airbus_ship_detection/blob/master/asd/preprocessing.py)

During this challenge, I used Keras’s ImageDataGenerator, which is a very easy way to generate batches of augmented images.
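One detail matters for segmentation: the image and its mask must receive exactly the same random transform, otherwise the labels no longer line up with the pixels. The following is a minimal NumPy sketch of that idea using flips and 90-degree rotations. It is not the ImageDataGenerator pipeline from my repo, and the function name augment_pair is made up for the example:

```python
import numpy as np


def augment_pair(image, mask, rng):
    """Apply the same random flip and rotation to an image and its mask."""
    if rng.random() < 0.5:  # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:  # vertical flip
        image, mask = image[::-1], mask[::-1]
    k = int(rng.integers(0, 4))  # rotate by k * 90 degrees
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    return image.copy(), mask.copy()


rng = np.random.default_rng(0)
# Toy 4x4 RGB image and a mask derived from its first channel.
image = np.arange(4 * 4 * 3, dtype=np.float32).reshape(4, 4, 3)
mask = (image[..., 0] > 20).astype(np.uint8)
aug_image, aug_mask = augment_pair(image, mask, rng)
```

Since the toy mask is computed from the image, recomputing it from the augmented image should give the augmented mask, which is a quick way to check that the two stayed aligned.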

At the end of the challenge, I discovered a new augmentation library: albumentations. I wasn’t able to use it back then since it integrated more easily with PyTorch, but I plan to give it a try in later challenges. Check the repo and paper. It is worth the read and effort.

An example of a monitoring dashboard (using comet.ml)

You can’t improve what you can’t measure (at least it will be harder).

Thus, it is a good practice to invest some time setting up ML monitoring and being organised in the various experiments you will conduct.

Fortunately, there are a lot of tools available to make your life easier. comet.ml is a great one and has a free plan if your code is open-source. You can find my dashboard for the Airbus challenge here.
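If you want to start even simpler before adopting a hosted tool, the core idea fits in a few lines: record the parameters and metrics of every run somewhere you can query later. The sketch below uses only the standard library and a JSON-lines file. The helper names are hypothetical and the metric values are toy numbers, not results from the challenge:

```python
import json
import os
import tempfile
import time


def log_run(path, params, metrics):
    """Append one experiment record (parameters and metrics) as a JSON line."""
    record = {"timestamp": time.time(), "params": params, "metrics": metrics}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")


def best_run(path, metric):
    """Return the logged record with the highest value of `metric`."""
    with open(path) as f:
        records = [json.loads(line) for line in f]
    return max(records, key=lambda r: r["metrics"][metric])


# Toy usage with made-up numbers.
log_path = os.path.join(tempfile.mkdtemp(), "runs.jsonl")
log_run(log_path, {"loss": "dice", "lr": 1e-3}, {"val_iou": 0.61})
log_run(log_path, {"loss": "bce_dice", "lr": 1e-3}, {"val_iou": 0.66})
best = best_run(log_path, "val_iou")
```

A dedicated tool adds dashboards, code snapshots and comparisons on top, but the habit of logging every run is the part that pays off.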

Predicted ships masks (source: https://www.kaggle.com/iafoss/unet34-dice-0-87)

Alright, let’s say you have coded a model, trained it, and it gave a good score on the cross-validation dataset. What should you do?

An obvious thing to do, since this is a computer vision competition, is to check some of the predicted segmentations. This will give you an intuition about what could be improved (maybe your model has a hard time finding smaller ships or correctly segmenting nearby ships) and what post-processing techniques you could try.
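One way to decide which predictions to look at first is to rank validation images by how badly they were segmented. Here is a minimal sketch that uses plain intersection over union (IoU) as a proxy; this is an assumption made for simplicity and not the competition’s official metric, and the toy masks are invented for the example:

```python
import numpy as np


def iou(pred, target):
    """Intersection over union of two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: count as a perfect match
    return float(np.logical_and(pred, target).sum() / union)


def worst_predictions(preds, targets, n=3):
    """Return (index, IoU) for the n lowest-scoring images."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    order = np.argsort(scores)[:n]
    return [(int(i), scores[int(i)]) for i in order]


def blank():
    return np.zeros((4, 4), dtype=np.uint8)


# Image 0: perfect match. Image 1: a one-pixel ship is missed entirely.
# Image 2: the ship is found but over-segmented.
t0, p0 = blank(), blank()
t0[1:3, 1:3] = 1
p0[1:3, 1:3] = 1
t1, p1 = blank(), blank()
t1[0, 0] = 1
t2, p2 = blank(), blank()
t2[0:2, 0:2] = 1
p2[0:2, 0:3] = 1

worst = worst_predictions([p0, p1, p2], [t0, t1, t2], n=2)
```

Plotting the images returned by worst_predictions next to their ground-truth masks usually reveals the failure patterns faster than browsing at random.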

Build your own desktop if you can.

Many providers make participating in ML competitions easier than before. For instance, Google and Kaggle offer free GPU notebooks. Thus, theoretically, you just need an internet connection and a laptop.

That being said, if you are serious about Kaggle competitions (and ML experimentation more broadly), then the best investment is to build your own desktop. Don’t worry if you aren’t good with hardware, I’ve got you covered. Check my build post where I explain the whole process. I have also included a section where I talk about other alternatives.

What you optimize in your model is sometimes as important as the model’s structure itself.

For the segmentation task, for example, there are a few variants:

One of the losses I have tried (source: https://github.com/yassineAlouini/airbus_ship_detection/blob/master/asd/losses_metrics.py)

I ended up trying a weighted sum of dice and cross-entropy, and also a combination of focal loss and dice.
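To make these names concrete, here is a minimal NumPy sketch of the three losses and of a weighted sum of cross-entropy and dice. It is not the code from the repository above: in a Keras model the same formulas would be written with backend tensor operations, and the smooth, gamma, alpha and bce_weight defaults are assumed, illustrative values:

```python
import numpy as np


def dice_loss(y_true, y_pred, smooth=1.0):
    """1 - soft Dice coefficient; y_pred holds probabilities in [0, 1]."""
    intersection = np.sum(y_true * y_pred)
    dice = (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)
    return float(1.0 - dice)


def bce_loss(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))))


def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Cross-entropy that down-weights pixels the model already gets right."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    p_t = np.where(y_true == 1, y_pred, 1 - y_pred)    # probability of the true class
    alpha_t = np.where(y_true == 1, alpha, 1 - alpha)  # class-balancing weight
    return float(np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t)))


def bce_dice_loss(y_true, y_pred, bce_weight=0.5):
    """Weighted sum of cross-entropy and dice."""
    return bce_weight * bce_loss(y_true, y_pred) + (1 - bce_weight) * dice_loss(y_true, y_pred)


y = np.array([[0.0, 1.0], [1.0, 1.0]])  # toy ground-truth mask
uncertain = np.full_like(y, 0.5)        # a model that predicts 0.5 everywhere
```

Cross-entropy scores every pixel independently, dice looks at the overlap of the whole mask (which helps when ships cover few pixels), and focal loss shrinks the contribution of easy pixels.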

So, which one to choose then? If you are familiar with computer vision tasks and competitions, you will have an intuition for what might work best.

Now, if you are new (as I am), try various ones and select the one that seems the most promising for both execution time and cross-validation performance.

Finally, if you are unfamiliar with loss functions for classification, the Wikipedia page is a good starting point.

Note that I am planning to write a longer blog post about loss functions in various machine learning settings. So stay tuned for more about this. ;)

Is there a ship here or only waves?

Is the data unbalanced? Is it messy and in need of heavy preprocessing?

These questions are important to determine what you should invest your time in and how you should approach the problem.

In fact, since the data in this competition is very unbalanced (a lot more empty sea images than ones with ships), a clever solution was to first train a classification model (whether there is a ship or not), then train a segmentation model on the images predicted to contain ships. Check this discussion thread for such a solution.
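At prediction time, the two-stage idea boils down to a simple gate: only images that the classifier flags are passed to the segmentation model, and all others get an empty mask. The sketch below shows that inference step only (not the training). The two model callables and the 0.5 threshold are hypothetical stand-ins, not the models from the discussion thread:

```python
import numpy as np


def two_stage_predict(images, has_ship_proba, segment, threshold=0.5):
    """Segment only the images the classifier flags as containing a ship."""
    masks = []
    for image in images:
        if has_ship_proba(image) < threshold:
            # Classifier says "only sea": skip segmentation, predict nothing.
            masks.append(np.zeros(image.shape[:2], dtype=np.uint8))
        else:
            masks.append(segment(image))
    return masks


# Toy stand-ins for trained models.
def toy_classifier(image):
    return float(image.max())  # "ship" if any pixel is bright


def toy_segmenter(image):
    return (image[..., 0] > 0.5).astype(np.uint8)


sea = np.zeros((4, 4, 3))
ship = np.zeros((4, 4, 3))
ship[1, 1, :] = 1.0  # a single bright pixel plays the role of a ship

masks = two_stage_predict([sea, ship], toy_classifier, toy_segmenter)
```

Besides saving compute, the gate removes false positives on empty images, which is where an unbalanced dataset tends to hurt a segmentation model the most.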

One day you could be here ;) (source: https://www.kaggle.com/c/airbus-ship-detection/discussion/71591#422787)

There is a huge and helpful Kaggle community, so use it to your advantage: ask questions if you feel stuck, share what you have learned, upvote good kernels and posts, and stay up-to-date with the competition. This was especially crucial in this competition. In fact, a data leak was discovered and shared with the whole community. A fix was eventually made and things went smoothly from there.

Stacking classifiers (source: http://rasbt.github.io/mlxtend/user_guide/classifier/StackingClassifier/)

A lot of the recent winning solutions (check this discussion for example) are stacked models (with one and sometimes two or more stacking levels). Often, stacking is also combined with post-processing techniques since these matter a lot in image segmentation. You can find a good example of these techniques in this code repo.
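Full stacking trains a second-level model on the predictions of the first-level ones. As a much simpler relative, here is a sketch of weighted averaging of the probability maps from several segmentation models, followed by a threshold. It is plain ensembling, not a stacked meta-learner, and the probability values are toy numbers:

```python
import numpy as np


def average_ensemble(prob_maps, weights=None, threshold=0.5):
    """Average per-pixel probabilities from several models, then binarize."""
    stacked = np.stack(prob_maps)  # shape: (n_models, height, width)
    mean = np.average(stacked, axis=0, weights=weights)
    return (mean > threshold).astype(np.uint8)


# Toy probability maps from two hypothetical models.
model_a = np.array([[0.9, 0.4], [0.2, 0.1]])
model_b = np.array([[0.8, 0.7], [0.3, 0.1]])

equal_vote = average_ensemble([model_a, model_b])
trust_a_more = average_ensemble([model_a, model_b], weights=[3, 1])
```

With equal weights the top-right pixel is kept (mean 0.55), while trusting the first model three times more drops it (mean 0.475). In practice, such weights would be tuned on validation data.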

Every competition comes with new variations and specific problems: different evaluation metrics, tricky processing steps, unbalanced data, low-quality data, and so on.

Thus, don’t expect to get “good” results quickly (otherwise everyone would, and the competition wouldn’t be worth it). Instead, keep working on the competition for as long as you can. Don’t get frustrated quickly (easier said than done) and trust that, with enough time and effort, you will eventually get better results (and at some point, win).

This should be an important aspect of course.

Try to balance the fun of the learning experience and your state of flow: challenge yourself progressively and don’t overwhelm yourself with everything at first.

Start slowly and ramp up as you progress. Try to focus on one aspect at a time and make a deliberate effort to improve the areas that you aren’t good at. For me it was image pre-processing, augmentation, and post-processing. And give yourself time.

I hope you have enjoyed this post and gained new insights. Stay tuned for the next one!