Today's world is full of data, and images form a significant part of this data. However, before they can be used, these digital images must be processed—analyzed and manipulated in order to improve their quality or extract some information that can be put to use.
Common image processing tasks include display; basic manipulations such as cropping, flipping, and rotating; image segmentation, classification, and feature extraction; image restoration; and image recognition. Python is an excellent choice for these types of image processing tasks due to its growing popularity as a scientific programming language and the free availability of many state-of-the-art image processing tools in its ecosystem.
This article looks at 10 of the most commonly used Python libraries for image manipulation tasks. These libraries provide an easy and intuitive way to transform images and make sense of the underlying data.
scikit-image is an open source Python package that works with NumPy arrays. It implements algorithms and utilities for use in research, education, and industry applications. It is a fairly simple and straightforward library, even for those who are new to Python's ecosystem. The code is high-quality, peer-reviewed, and written by an active community of volunteers.
scikit-image is very well documented with a lot of examples and practical use cases.
The package is imported as skimage, and most functions are found within the submodules.
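import matplotlib.pyplot as plt
# Jupyter/IPython magic; omit this line outside a notebook
%matplotlib inline
from skimage import data, filters
image = data.coins()  # ... or any other NumPy array!
edges = filters.sobel(image)  # edge magnitude from the Sobel filter
plt.imshow(edges, cmap='gray')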
Template matching using the match_template function:
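A minimal sketch of what such a template match might look like, assuming the template is simply a patch cropped from the same coins image (the crop coordinates are an arbitrary choice for illustration). `match_template` returns a correlation map, and without padding the peak of that map marks the top-left corner of the best match.

```python
import numpy as np
from skimage import data
from skimage.feature import match_template

image = data.coins()

# Illustrative template: a patch cropped around a single coin.
# The coordinates are an arbitrary choice, not part of any API.
template = image[170:220, 75:130]

# Normalized cross-correlation between the image and the template.
result = match_template(image, template)

# The peak of the correlation map is the top-left corner of the best match.
row, col = np.unravel_index(np.argmax(result), result.shape)
print(f"Best match at row={row}, col={col}")
```

Because the template was cut from the image itself, the best match should land on the crop's own position; with a template from another source, the peak value indicates how good the match is.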
You can find more examples in the gallery.
NumPy is one of the core libraries in Python programming and provides support for arrays. An image is essentially a standard NumPy array containing pixels of data points. Therefore, by using basic NumPy operations, such as slicing, masking, and fancy indexing, you can modify the pixel values of an image. The image can be loaded using skimage and displayed using Matplotlib.
A complete list of resources and documentation is available on NumPy's official documentation page.
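Using NumPy to mask an image:
import numpy as np
from skimage import data
import matplotlib.pyplot as plt
# Jupyter/IPython magic; omit this line outside a notebook
%matplotlib inline
image = data.camera()
type(image)  # numpy.ndarray: the image is a NumPy array
mask = image < 87  # boolean mask selecting the darker pixels
image[mask] = 255  # set the masked pixels to white
plt.imshow(image, cmap='gray')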
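The slicing and fancy indexing mentioned above work the same way as masking. The following is a small sketch on a synthetic array (an assumption, so that no image file is needed); the variable names are illustrative only.

```python
import numpy as np

# Synthetic 8-bit grayscale "image": pixel (r, c) holds the value 6*r + c.
image = np.arange(36, dtype=np.uint8).reshape(6, 6)

# Slicing: crop rows 1-3 and columns 2-4.
crop = image[1:4, 2:5]

# Slicing with a negative step: flip the image vertically.
flipped = image[::-1, :]

# Fancy indexing: pick individual pixels by paired (row, col) coordinates.
rows = np.array([0, 2, 5])
cols = np.array([1, 3, 4])
picked = image[rows, cols]

# Masking: set every pixel below a threshold to white (255) on a copy.
masked = image.copy()
masked[masked < 10] = 255
```

Note that slices such as `crop` and `flipped` are views on the original array, so writing to them changes `image`; `masked` is an explicit copy to avoid that.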
SciPy is another of Python's core scientific modules (like NumPy) and can be used for basic image manipulation and processing tasks. In particular, the submodule scipy.ndimage (in SciPy v1.1.0) provides functions operating on n-dimensional NumPy arrays. The package currently includes functions for linear and non-linear filtering, binary morphology, B-spline interpolation, and object measurements.
For a complete list of functions provided by the scipy.ndimage package, refer to the documentation.
Using SciPy for blurring using a Gaussian filter:
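import matplotlib.pyplot as plt
from scipy import misc, ndimage
# Note: scipy.misc.face() was removed in SciPy 1.12; newer releases provide scipy.datasets.face() instead
face = misc.face()
blurred_face = ndimage.gaussian_filter(face, sigma=3)
very_blurred = ndimage.gaussian_filter(face, sigma=5)
# Display one of the results
plt.imshow(blurred_face)  # or very_blurred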
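Beyond filtering, scipy.ndimage also covers the binary morphology and object measurements mentioned above. Here is a rough sketch on a synthetic binary image (an assumption for illustration); `ndimage.sum_labels` requires a reasonably recent SciPy release.

```python
import numpy as np
from scipy import ndimage

# Synthetic binary image with two separate rectangular objects.
binary = np.zeros((8, 8), dtype=bool)
binary[1:3, 1:3] = True   # 2x2 object
binary[4:7, 4:7] = True   # 3x3 object

# Object measurement: label connected components and count them.
labels, n_objects = ndimage.label(binary)

# Measure the size (pixel count) of each labeled object.
sizes = ndimage.sum_labels(binary, labels, index=[1, 2])

# Binary morphology: erosion strips the outer layer of each object.
eroded = ndimage.binary_erosion(binary)
```

With the default cross-shaped structuring element, erosion removes the 2x2 object entirely and leaves only the center pixel of the 3x3 object.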
PIL (Python Imaging Library) is a free library for the Python programming language that adds support for opening, manipulating, and saving many different image file formats. However, its development has stagnated, with its last release in 2009. Fortunately, there is Pillow, an actively developed fork of PIL, that is easier to install, runs on all major operating systems, and supports Python 3. The library contains basic image processing functionality, including point operations, filtering with a set of built-in convolution kernels, and color-space conversions.
The documentation has instructions for installation as well as examples covering every module of the library.
Enhancing an image in Pillow using ImageEnhance:
from PIL import Image, ImageEnhance
im = Image.open('image.jpg')  # read the image
enh = ImageEnhance.Contrast(im)
# A factor of 1.0 returns the original image; 1.8 raises the contrast by 80%
enh.enhance(1.8).show("80% more contrast")
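The filtering, color-space conversion, and point operations described above can be sketched in a few lines. This example builds a synthetic image in memory (an assumption, so it does not depend on image.jpg), and the variable names are illustrative.

```python
from PIL import Image, ImageFilter

# Synthetic RGB image, created in memory instead of being read from disk.
im = Image.new("RGB", (64, 64), color=(200, 30, 30))

# Filtering with one of the built-in convolution kernels.
blurred = im.filter(ImageFilter.BLUR)

# Color-space conversion: RGB to 8-bit grayscale ("L" mode).
gray = im.convert("L")

# Point operation: brighten every pixel by 50, capped at 255.
brighter = gray.point(lambda v: min(255, v + 50))
```

Each call returns a new image and leaves the original untouched, so the operations can be chained or compared side by side.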
OpenCV (Open Source Computer Vision Library) is one of the most widely used libraries for computer vision applications. OpenCV-Python is the Python API for OpenCV. OpenCV-Python is not only fast, because the heavy lifting is done by code written in C/C++, but also easy to code and deploy, thanks to the Python wrapper in the foreground. This makes it a great choice for computationally intensive computer vision programs.
The OpenCV2-Python-Guide makes it easy to get started with OpenCV-Python.
Image blending using pyramids in OpenCV-Python to create an "Orapple":
SimpleCV is another open source framework for building computer vision applications. It offers access to several high-powered computer vision libraries such as OpenCV, but without requiring you to know about bit depths, file formats, color spaces, etc. Its learning curve is substantially gentler than OpenCV's, and, as its tagline says, "it's computer vision made easy." Some points in favor of SimpleCV are:
- Even beginning programmers can write simple machine vision tests
- Cameras, video files, images, and video streams are all interoperable
The official documentation is very easy to follow and has plenty of examples and use cases.
Mahotas is another computer vision and image processing library for Python. It contains traditional image processing functions such as filtering and morphological operations, as well as more modern computer vision functions for feature computation, including interest point detection and local descriptors. The interface is in Python, which is appropriate for fast development, but the algorithms are implemented in C++ and tuned for speed. Mahotas is fast, with minimalistic code and minimal dependencies. Read its official paper for more insights.
The documentation contains installation instructions, examples, and even some tutorials to help you get started using Mahotas easily.
The Mahotas library relies on simple code to get things done. For example, it does a good job with the Finding Wally problem with a minimum amount of code.
Solving the Finding Wally problem:
ITK (Insight Segmentation and Registration Toolkit) is an "open source, cross-platform system that provides developers with an extensive suite of software tools for image analysis. SimpleITK is a simplified layer built on top of ITK, intended to facilitate its use in rapid prototyping, education, [and] interpreted languages." It's also an image analysis toolkit with a large number of components supporting general filtering operations, image segmentation, and registration. SimpleITK is written in C++, but it's available for a large number of programming languages including Python.
There are a large number of Jupyter Notebooks illustrating the use of SimpleITK for educational and research activities. The notebooks demonstrate using SimpleITK for interactive image analysis using the Python and R programming languages.
Visualization of a rigid CT/MR registration process created with SimpleITK and Python:
pgmagick is a Python-based wrapper for the GraphicsMagick library. The GraphicsMagick image processing system is sometimes called the Swiss Army Knife of image processing. Its robust and efficient collection of tools and libraries supports reading, writing, and manipulating images in over 88 major formats including DPX, GIF, JPEG, JPEG-2000, PNG, PDF, PNM, and TIFF.
Pycairo is a set of Python bindings for the Cairo graphics library. Cairo is a 2D graphics library for drawing vector graphics. Vector graphics are interesting because they don't lose clarity when resized or transformed. Pycairo can call Cairo commands from Python.
Drawing lines, basic shapes, and radial gradients with Pycairo:
These are some of the useful and freely available image processing libraries in Python. Some are well known and others may be new to you. Try them out to get to know more about them!