Have you ever wondered how smartphones detect faces while taking pictures, or how a phone unlocks by recognizing your face? How do Augmented Reality (AR) and Mixed Reality (MR) work? All of these technologies are built with Computer Vision at their core.
What is Computer Vision?
Computer Vision is the science of teaching machines to see and interpret images the way humans do. It is an integral part of artificial intelligence, and it is what lets machines identify people, detect objects in a scene, or play games with humans.
Now that we know what Computer Vision is, let us dive into Python and write some simple programs that read a picture from disk and perform some pre-processing on it.
Requirements :
- Python 3.6 or Higher
- OpenCV library (Installed)
Quick Note for Installations:
Download Python from the official Python page and install it; remember to pick Python 3.6 or higher. Once Python is running on your machine, open "Command Prompt" (or "Terminal" on macOS and Linux) and install the OpenCV library using the following command:
pip install opencv-python
Once the requirements are installed, we can start working with OpenCV, a huge open-source library that contains hundreds of CV (Computer Vision) algorithms. OpenCV is available for C++, Java, Python, MATLAB, etc., but in this article we will use Python.
Let's begin with a simple program where we will convert a colored image into a black & white image.
Converting Colored Images to Black & White (GrayScale):
The steps below read an image as grayscale, store it, display it, and close the window.
import cv2
image = cv2.imread("path_of_file", 0)  # second argument is optional; 0 reads the image as grayscale
cv2.imwrite("NewFile_name",image)
cv2.imshow("TitleOfWindow",image)
cv2.waitKey(0)
cv2.destroyAllWindows()
In the first line we import the OpenCV library. The cv2 module is the name through which we call the different functions available in OpenCV.
cv2.imread():
In the second line of the program above, we use the cv2.imread() function, which reads an image. The first argument is the path of the image, including its extension.
The second argument controls how the image is interpreted: passing 0 (zero) reads the image in grayscale (black and white), passing a value greater than 0 reads it as a 3-channel colour image (OpenCV stores the channels in blue, green, red order, not RGB), and passing a value less than 0 loads the image unchanged, including the alpha channel if it has one. The cv2.imread() function returns a NumPy array (the image data stored in an array), which we store in the image variable. If the file cannot be read, it returns None instead of raising an error.
In simpler terms, the image variable contains the actual data of the image, in our case the bird image.
cv2.imwrite():
The third line stores the image. The cv2.imwrite() function takes the path, including the filename, where the image should be stored; the second argument is the image variable returned by cv2.imread(). The file format is chosen from the filename extension.
cv2.imshow():
This function displays the image in a window; it takes the title of the window and the image variable. On its own, the window would close immediately, so we call cv2.waitKey(0) to keep it open until we press a key on the keyboard.
Hence, when the window opens, press any key on the keyboard and the window will close. cv2.destroyAllWindows() then closes every window the program opened and releases the resources used for them.
Here's the Program once again:
import cv2
image = cv2.imread('bird.jpg',0)
cv2.imwrite('birdBW.png',image)
cv2.imshow('Converted to GrayScale',image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Input Image:
Following image was taken for processing,
Output Image:
Following is the window displayed using cv2.imshow() to show the output image,
A new file is created in the working directory for this grayscale image with name birdBW.png.
TIP: Close the window by pressing any key on the keyboard rather than closing it with the mouse, so that cv2.waitKey(0) returns and cv2.destroyAllWindows() can release the window resources cleanly.
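For the curious, here is a rough idea of what turning a colour pixel into a grey one involves. The sketch uses the luma weights that OpenCV documents for its colour-to-grey conversion (0.299 for red, 0.587 for green, 0.114 for blue). Note that cv2.imread() with flag 0 may rely on the image codec's own conversion, so its values can differ slightly; the function bgr_to_gray and the 2x2 pixel values are made up for illustration.

```python
def bgr_to_gray(pixel):
    """Convert one (blue, green, red) pixel to a single grey intensity."""
    blue, green, red = pixel
    # Weighted sum: green contributes most, blue least.
    return round(0.299 * red + 0.587 * green + 0.114 * blue)


# A tiny 2x2 "image" as nested lists of BGR pixels (illustrative values).
image = [
    [(255, 0, 0), (0, 255, 0)],      # pure blue, pure green
    [(0, 0, 255), (255, 255, 255)],  # pure red, white
]

gray = [[bgr_to_gray(pixel) for pixel in row] for row in image]
print(gray)
```

Pure green comes out much brighter than pure blue, which matches how the human eye perceives the two colours.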
Applying Gaussian Blur to Images:
The steps for applying Gaussian blur are similar to the previous program, but this time we don't have to convert the image to grayscale. In the program above we learned how to read an image using cv2.imread(); now let's learn how to apply Gaussian blur to the image.
Code for applying Gaussian Blur:
import cv2
image = cv2.imread('bird.jpg')
matrix = (9,11)
image_blur = cv2.GaussianBlur(image,matrix,0)
cv2.imshow('Smoothened Image',image_blur)
cv2.waitKey(0)
cv2.destroyAllWindows()
After reading the image using the cv2.imread() method and storing it in the variable image, we need a tuple holding the kernel size, that is, the width (x axis) and height (y axis) of the blur window in pixels, stored in the variable matrix.
The main purpose of applying Gaussian blur is to smoothen the image. Each pixel is replaced by a weighted average of its neighbours inside a window whose size is given by the matrix tuple, so larger values produce a stronger blur.
Quick Note about Python Tuples:
Python tuples are data structures that store Python objects as a comma-separated sequence. Follow the link to learn more about them: Tuples in Python
Remember that while creating the tuple, both values must be positive and odd, otherwise you will get an error. Now just call cv2.GaussianBlur(image, matrix, 0) to blur the image and store the result in a variable for further use, such as saving the image on your device or displaying it in a window. The third argument, 0, is the standard deviation of the Gaussian along the x axis; passing 0 lets OpenCV compute it from the kernel size.
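Since an even or negative value in the tuple triggers an error, it can help to check the kernel size before calling cv2.GaussianBlur(). The two helpers below, is_valid_kernel_size and nearest_odd, are hypothetical names written for this sketch and are not part of OpenCV.

```python
def is_valid_kernel_size(ksize):
    """Return True if ksize is a (width, height) pair of positive odd integers."""
    if len(ksize) != 2:
        return False
    return all(isinstance(v, int) and v > 0 and v % 2 == 1 for v in ksize)


def nearest_odd(value):
    """Bump a positive even integer up to the next odd number."""
    return value if value % 2 == 1 else value + 1


print(is_valid_kernel_size((9, 11)))      # the tuple used in the program above
print(is_valid_kernel_size((8, 11)))      # 8 is even, so this one is rejected
print((nearest_odd(8), nearest_odd(11)))  # a repaired version of (8, 11)
```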
The NERD way of explaining Gaussian Blur:
To perform a smoothing operation we apply a filter to our image. The most common type of filter is linear, in which an output pixel's value g(i, j) is determined as a weighted sum of input pixel values f(i + k, j + l):
$$g(i, j) = \sum_{k, l} f(i + k, j + l) \, h(k, l)$$
h(k, l) is called the kernel, which is nothing but the coefficients of the filter. It helps to visualize a filter as a window of coefficients sliding across the image.
Feel free to skip the equation above if it doesn't make sense yet.
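If code is easier to read than math, the weighted sum can be sketched in plain Python. Here g(i, j) is the output pixel, f is the input image, and h is the kernel, so that g(i, j) is the sum over k and l of f(i + k, j + l) * h(k, l). This is an illustrative toy, not how OpenCV implements the blur: it handles only pixels far enough from the border, and the function names, the 3x3 kernel, and the sigma of 1.0 are assumptions chosen for the example.

```python
import math


def gaussian_kernel_1d(size, sigma):
    """Return `size` Gaussian weights, normalised so that they sum to 1."""
    half = size // 2
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def filter_pixel(image, i, j, kernel):
    """Compute g(i, j) as the weighted sum of the neighbours of pixel (i, j)."""
    half_rows = len(kernel) // 2
    half_cols = len(kernel[0]) // 2
    total = 0.0
    for k in range(-half_rows, half_rows + 1):
        for l in range(-half_cols, half_cols + 1):
            total += image[i + k][j + l] * kernel[k + half_rows][l + half_cols]
    return total


# Build a 3x3 kernel h as the outer product of a 1-D Gaussian with itself.
row = gaussian_kernel_1d(3, 1.0)
kernel = [[a * b for b in row] for a in row]

# One bright pixel surrounded by darker neighbours.
image = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]

center = filter_pixel(image, 1, 1, kernel)
print(round(center, 2))
```

The bright centre pixel is pulled toward its darker neighbours, which is exactly the smoothing effect seen in the blurred image.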
Input Image:
Following image was taken for processing,
Output Image:
Following is the output image,
Applications of OpenCV Library:
In many applications we convert images to grayscale first, which is useful for tasks such as finding edges. Working directly with colour images increases code complexity, because the same operation has to handle three channels instead of one.
Gaussian blur is mostly used in image processing to remove noise. Pictures taken at night, for example, often end up with significant noise.
Example: footage from CCTV cameras and images in various datasets.
There will be more articles regarding OpenCV in which you will learn to access the webcam and take pictures or record videos and perform some cool stuff on them.
HAPPY CODING..!