Camera technology is beautiful. It’s given us all a chance to save our memories, and to relive them when we see them again in our photos.
That technology has come quite a long way over the past several years. With all kinds of new features like 4K, HDR, and color enhancement, the photos one can capture are awe-inspiring.
But it does come at a price. Not everyone can afford a top-of-the-line camera. Consumer DSLR cameras range anywhere from a few hundred to several thousand dollars. Not only that, but not everyone can get the most out of those cameras; we’re not all expert photographers, after all!
Most of us just use our phones. But smartphones often take very bland photos in comparison to high-end DSLRs.
Deep learning changes all of that.
Research from ETH Zurich’s Computer Vision Lab shows how you can automatically enhance photos taken by low-quality cameras and make them look like they were taken by a pro photographer with a DSLR.
Here’s how they did it.
The team first collected a dataset of low-quality (from cell phones) and high-quality (from DSLR) photos, which you can download from the project page. This is exactly the data we want for such an enhancement task: input a low-quality image (from the phone) and have the deep network try to predict what the high-quality version (from the DSLR) would look like.
An image has several attributes that we may want to enhance: lighting, colors, texture, contrast, and sharpness. The deep network is trained to address all of these attributes with four different loss functions:
- Color loss: Euclidean distance between the blurred versions of the predicted and target images.
- Texture loss: based on the classification loss from a Generative Adversarial Network (GAN). The GAN is trained to predict whether a grayscale photo is of high or low quality. Since grayscale is used, the network focuses on the textures of the image rather than its color.
- Content loss: difference between VGG features of the predicted image and the ground truth. This loss ensures that the overall structure and objects in the image (i.e., image semantics) remain the same.
- Total Variation loss: total vertical and horizontal gradients in the image. This enforces smoothness in the image, such that the final result is not too grainy or noisy.
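To make the color and total variation terms above a bit more concrete, here is a minimal NumPy sketch. It is not the authors' code: the kernel size, the sigma, the use of a squared distance, and the normalisation of the total variation term are all assumptions chosen for illustration, and the function names are made up for this example.

```python
import numpy as np


def gaussian_kernel(size=9, sigma=2.0):
    """Return a normalized 2-D Gaussian kernel of shape (size, size)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def blur(image, kernel):
    """Blur an (H, W, C) float image channel by channel with edge padding."""
    k = kernel.shape[0]  # assumed odd and square
    pad = k // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w = image.shape[:2]
    out = np.zeros(image.shape, dtype=np.float64)
    # Accumulate shifted copies of the image, weighted by the kernel.
    for dy in range(k):
        for dx in range(k):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w, :]
    return out


def color_loss(pred, target, kernel):
    """Squared Euclidean distance between the blurred images (assumed form)."""
    diff = blur(pred, kernel) - blur(target, kernel)
    return float(np.sum(diff ** 2))


def total_variation_loss(image):
    """Sum of squared vertical and horizontal gradients, per pixel value."""
    dy = image[1:, :, :] - image[:-1, :, :]
    dx = image[:, 1:, :] - image[:, :-1, :]
    return float(np.sum(dy ** 2) + np.sum(dx ** 2)) / image.size


# Toy data: a smooth horizontal ramp as "target", plus a noisy "prediction".
ramp = np.linspace(0.0, 1.0, 32)
target = np.tile(ramp[None, :, None], (32, 1, 3))  # shape (32, 32, 3)
rng = np.random.default_rng(0)
noisy = target + 0.05 * rng.standard_normal(target.shape)

kernel = gaussian_kernel()
print("color loss:", color_loss(noisy, target, kernel))
print("tv loss (target):", total_variation_loss(target))
print("tv loss (noisy):", total_variation_loss(noisy))
```

Blurring before comparing means small texture differences get averaged away, so this term mostly reacts to differences in color and brightness. The total variation term, on the other hand, grows as the image gets noisier.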
Finally, these losses are all added up and an end-to-end network is trained to make the prediction! The overall GAN architecture is shown below. You can check out the paper if you’d like more details.
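Adding up the losses can be sketched as a simple weighted sum. The loss values and weights below are made-up placeholders, not numbers from the paper or the repository; with every weight set to 1.0 this reduces to a plain sum.

```python
def total_loss(losses, weights):
    """Weighted sum of named loss terms; every weighted term must be present."""
    missing = set(weights) - set(losses)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    return sum(weights[name] * losses[name] for name in weights)


# Hypothetical per-batch loss values and weights, for illustration only.
losses = {"color": 3.0, "texture": 0.7, "content": 1.2, "tv": 0.01}
weights = {"color": 1.0, "texture": 1.0, "content": 1.0, "tv": 1.0}

print("total loss:", total_loss(losses, weights))
```

Keeping the weights in one place makes it easy to experiment with how much each attribute matters to the final result.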
Thanks to the beauty of the open source mindset in the AI community, there is a publicly available implementation of this photo enhancer right here! Here’s how you can use it.
First, clone the repository:
git clone https://github.com/aiff22/DPED
Then install the required libraries:
pip install tensorflow-gpu
pip install numpy
pip install scipy
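If you want to double-check that the installs worked before going further, a small sketch like the one below can help. The `check_packages` helper is just a name made up for this example; it only looks the modules up, it doesn't import them.

```python
import importlib.util


def check_packages(names):
    """Map each top-level module name to True if it can be found, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_packages(["tensorflow", "numpy", "scipy"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```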
All of the pre-trained models already come with the repository in the models_orig folder, so there’s no need to download them!
Place the photos you want to enhance in the following directory:
This is the default directory for “iphone”, but you can edit the test_model.py script if you want to change it. It says “iphone” because the authors originally trained three separate models using photos from three smartphones: iphone, sony, and blackberry, so those are your three options. But the model works quite well on most photos with any of these options, so we can just pick one and run with it!
Finally, to enhance the photos we just run one simple script:
python test_model.py model=iphone_orig \
Voila! Your enhanced and professional-looking photos will be saved in the
Give the code a try yourself; it’s great fun! See how your photos look after the enhancement. Feel free to post a link below to share your photos with the community. In the meantime, here are a few results from my own tests.