Face detection using Python and OpenCV
Most of the posts you will find in this blog are Erlang related (of course they are!), but sometimes I also like writing about my experiences at the University of Trento, as I am doing right now. During the last couple of years I have attended many courses on Computer Vision and Digital Signal Processing, so today I would like to show you something from that area.
In this post I will write about face detection using Python and OpenCV. The full code is not reproduced here: you can just grab my original code from here (the files needed are faces.py and haarcascade_frontalface_alt.xml).
Face detection is a computer technology that determines the locations and sizes of human faces in images or video. It detects facial features and ignores anything else, such as buildings, trees and bodies.
Face detection is used for different purposes, such as recognition or video surveillance, but at University I mainly focused on human-computer interfaces and image database management (for example, one of the projects I did needed an accurate computation of the pupil centers).
In order to identify the face ROI (Region Of Interest), I exploit the framework proposed in 2001 by Paul Viola and Michael Jones, which brings together new algorithms and insights to achieve robust and extremely rapid object detection. This framework was mainly developed to solve the problem of face detection, but it can be trained to detect a large variety of object classes. The Viola-Jones algorithm introduces three main contributions.
The first contribution is a new image representation called the “integral image”, which allows very fast feature evaluation. Briefly, the detector uses a set of features reminiscent of Haar basis functions; thanks to the integral image representation, any of these Haar-like features can be computed at any scale or location in constant time.
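To make the constant-time claim concrete, here is a minimal sketch of an integral image in plain Python (the names integral_image and box_sum are mine, not OpenCV's): once the table is built, the sum of any rectangle costs at most four lookups, regardless of its size.

```python
def integral_image(img):
    # ii[y][x] holds the sum of all pixels in the rectangle
    # from (0, 0) to (x, y), inclusive.
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def box_sum(ii, x0, y0, x1, y1):
    # Sum of the pixels in the box (x0, y0)..(x1, y1), inclusive,
    # recovered in constant time from at most four table lookups.
    total = ii[y1][x1]
    if x0 > 0:
        total -= ii[y1][x0 - 1]
    if y0 > 0:
        total -= ii[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        total += ii[y0 - 1][x0 - 1]
    return total
```

A Haar-like feature is then just a difference of a few such box sums, which is why its evaluation cost does not depend on the scale of the feature.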
The second contribution is a method for constructing a classifier by selecting a small number of important features with the AdaBoost algorithm. Within any image subwindow the total number of Haar-like features is quite large, in fact much larger than the number of pixels. To ensure fast classification, the learning process must exclude the large majority of the available features and focus on a small set of critical ones. In practice, feature selection is obtained with a simple modification of AdaBoost: the weak learner is constrained so that each weak classifier returned depends on only a single feature. This means that each stage of the boosting process, which selects a new weak classifier, can be viewed as a feature selection step.
The third major contribution is a method for combining successively more complex classifiers in a cascade structure. This boosts the speed of the detector: attention is concentrated on promising regions of the image, since it is often possible to quickly rule out windows where no object occurs. More complex and accurate processing is reserved for the promising regions only.
OpenCV provides an implementation of the Viola-Jones algorithm. OpenCV (Open Source Computer Vision Library) is a cross-platform library developed by Intel, Willow Garage and Itseez for real-time computer vision. The functions offered by OpenCV can be used from C++ and Python, but some other ports are available as well (e.g. Emgu CV for C#).
The functions I used for a trivial face detection are:
- CascadeClassifier: loads a classifier from a file and returns an object of type CascadeClassifier. In my code I load the file proposed for face detection by OpenCV, namely haarcascade_frontalface_alt.xml.
- detectMultiScale: this function must be applied to a CascadeClassifier object. It detects objects of different sizes in the input image and returns them as a list of OpenCV rectangles.
As I said before, the full code lives in the repository and is quite self-explanatory, but if you have any trouble don't hesitate to comment on this post. Here is how to run the code, and the output I got by applying it to the most famous picture in Computer Vision: Lena.
python faces.py /path/picture.jpg
A couple of notes about the code:
- I assume that at least one face ROI is found in the image. This means that if you run this code on an image with no faces you will probably end up with an error due to an unhandled case.
- The detection may return more than one face; the results are stored in the vector “rects”. You may notice that I assume the first element of this array to be the face we were looking for. This could in fact lead to errors, since the classifier may detect some pattern in the background as a face and store it in the first element of the array. You can mitigate this by changing the parameters in line 6 of faces.py (especially by raising the minimum size allowed).
- The classifier is not bulletproof: if you run this code on a very difficult database (e.g. the BioID database) you will find many pictures where the face is not found by the classifier, mostly because the face is too close to the camera or in an unusual position.
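If taking the first element of "rects" proves fragile, a simple alternative (a sketch of mine, not part of faces.py) is to pick the largest detection instead, on the assumption that a false positive in the background is usually smaller than the real face:

```python
def pick_largest_face(rects):
    # rects: iterable of (x, y, w, h) rectangles, as returned by
    # detectMultiScale. Returns the rectangle with the largest area,
    # or None when nothing was detected.
    rects = list(rects)
    if not rects:
        return None
    return max(rects, key=lambda r: r[2] * r[3])
```

This is only a heuristic, of course; tightening minSize in detectMultiScale, as suggested above, attacks the same problem from the other side.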