Originally developed by researchers at Intel, OpenCV is an open-source library of functions dedicated to image processing. OpenCV (Open Source Computer Vision), as the name suggests, is primarily focused on computer vision and image processing. It has a very rich (and equally complex) library of functions, thanks in part to being an open-source tool. Learning all of these functions up front is difficult, so I approached it differently - learn a function as and when I need it.
Okay, so I started by writing some basic programs: displaying an image or a video feed in a window, showing a real-time feed from a webcam, accessing individual frames from a video and filtering them, and so on. I did that to get acquainted with the tool. Let me tell you that those programs are not as tough as they sound - just a few lines of code serve the purpose. Some of you might even think of it as a waste of time, but trust me - the work pays dividends. Once I was comfortable with the basic functionality, I developed a slightly more complicated program - 'Real time face detection and tracking'.
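To give an idea of how small those warm-up programs are, here is a minimal sketch of one of them: read frames from a video source, filter each one, and show the result in a window. It is written in Python with the `cv2` bindings rather than my C/C++ setup, purely to keep it short. The function names `odd_kernel_size` and `show_blurred`, the choice of a Gaussian blur as the filter, and the default kernel size are all assumptions made for illustration.

```python
def odd_kernel_size(size):
    """Round a requested blur size up to an odd number >= 1.

    cv2.GaussianBlur only accepts odd, positive kernel dimensions.
    """
    size = max(int(size), 1)
    return size if size % 2 == 1 else size + 1


def show_blurred(source=0, kernel=5):
    """Display a blurred version of a video source until 'q' is pressed."""
    # Imported here so the helper above works even without OpenCV installed.
    import cv2

    k = odd_kernel_size(kernel)

    # source=0 is the default webcam; a file path opens a video file instead.
    capture = cv2.VideoCapture(source)
    if not capture.isOpened():
        raise RuntimeError("could not open video source: %r" % (source,))

    try:
        while True:
            # Each read() call hands back one frame.
            ok, frame = capture.read()
            if not ok:  # end of file or camera error
                break

            # Filter the frame (a Gaussian blur, as an example filter).
            blurred = cv2.GaussianBlur(frame, (k, k), 0)

            cv2.imshow("Filtered feed", blurred)
            if cv2.waitKey(30) & 0xFF == ord("q"):
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

Calling `show_blurred()` should open the default webcam, while `show_blurred("clip.avi")` (a hypothetical file name) would play a file instead.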
OpenCV can be used on Windows, Unix and Mac, and with a variety of development environments - Visual Studio C/C++, Eclipse and many more. I preferred to use it with Windows, VS 2008 and the .NET framework. If you're familiar with OpenCV, you will know that you can even use a wrapper instead of the .NET framework, but since it was my first project, I decided to stay away from it. The general logical flow of the program follows. I'll try to upload a video of it as soon as possible so that you can get an idea of what exactly the program does. The code is available upon request (please contact me at varunshah444[at]gmail[dot]com). I welcome queries, suggestions and related comments from everyone.
Logical flow:
1. Start the webcam, and gain access to the live feed from it
2. Access the individual frames (this is where your hard work pays off!!)
3. Load a cascade classifier for the front of the face, and run it on each frame (it scans the frame at several scales looking for face-like patterns)
4. You will now have the location of the face(s) in your video
5. Draw a rectangle (or whatever god damn shape you prefer) around the detected object (face in our case)
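The five steps above can be sketched roughly as follows. This is not my original program (that one was written against the C/C++ API in VS 2008); it is a minimal Python sketch of the same flow using the `cv2` bindings. It assumes the `opencv-python` package, which ships pre-trained cascades under `cv2.data.haarcascades`. The names `box_corners` and `run_face_tracker`, and the `scaleFactor` and `minNeighbors` values, are illustrative choices and not tuned settings.

```python
def box_corners(face):
    """Convert an (x, y, w, h) detection into top-left and bottom-right corners."""
    x, y, w, h = (int(v) for v in face)
    return (x, y), (x + w, y + h)


def run_face_tracker(camera_index=0):
    """Show the webcam feed with a rectangle around each detected face."""
    # Imported here so box_corners() stays usable without OpenCV installed.
    import cv2

    # Step 3: load a pre-trained frontal-face cascade.
    # The path below is an assumption based on the opencv-python package layout.
    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    cascade = cv2.CascadeClassifier(cascade_path)
    if cascade.empty():
        raise RuntimeError("could not load cascade: " + cascade_path)

    # Step 1: start the webcam.
    capture = cv2.VideoCapture(camera_index)
    if not capture.isOpened():
        raise RuntimeError("could not open camera %d" % camera_index)

    try:
        while True:
            # Step 2: access one frame of the live feed.
            ok, frame = capture.read()
            if not ok:
                break

            # The detector works on intensity, so convert to grayscale first.
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

            # Step 4: get the face locations as (x, y, w, h) boxes.
            faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

            # Step 5: draw a rectangle around every detected face.
            for face in faces:
                top_left, bottom_right = box_corners(face)
                cv2.rectangle(frame, top_left, bottom_right, (0, 255, 0), 2)

            cv2.imshow("Face detection", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):  # press 'q' to quit
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

Calling `run_face_tracker()` should open the default webcam. Swapping `cv2.rectangle` for `cv2.circle` or `cv2.ellipse` gives you whatever shape you prefer.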
For those who are familiar, I used a ready-made cascade classifier rather than the 'training-images' method (training my own classifier from sample images), for a few reasons:
1. The latter takes a hell of a lot of time to generate the .xml file - typically 3-4 days of continuous processing
2. It's a tad more complex than using a ready-made classifier cascade
3. For a beginner, it's better to use the former method
The moment it compiled and started working, I was on cloud nine, phew!!! It is a wonderful feeling when your hard work ultimately pays off. As I always say - Work hard, party harder! Happy coding!!!