Moving Object Detection:-
Moving object detection is a technique used in computer vision and image processing: multiple consecutive frames from a video are compared, by various methods, to determine whether any moving object is present.
MOD using Computer Vision
In recent years, deep learning techniques have contributed hugely to the field of computer vision and object detection. They can employ GPUs for efficient computing, which makes it easier to use them for large-scale applications.
Still, some very simple computer vision techniques work equally well today. Moving object detection using the frame differencing and summing technique is one such method.
Frame Differencing and Summing Technique in Computer Vision
Frame Differencing and Summing Technique (DST for short) is a very simple yet effective computer vision technique. We can use it to determine whether there are any moving objects in a video.
We know that a video consists of multiple consecutive frames. When a frame is converted to grayscale, each pixel holds an intensity between 0 and 255, where 0 is completely black and 255 is white. Subtracting consecutive frames therefore highlights exactly the pixels that changed, and summing the differences gives a measure of how much motion occurred.
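As a toy illustration of differencing and summing (the frames here are synthetic NumPy arrays, not from a real video), the core arithmetic can be sketched as:

```python
import numpy as np

# Two synthetic 8-bit grayscale "frames": a dark background with a
# bright 3x3 square that shifts one pixel to the right between frames.
frame1 = np.zeros((10, 10), dtype=np.uint8)
frame1[3:6, 3:6] = 200
frame2 = np.zeros((10, 10), dtype=np.uint8)
frame2[3:6, 4:7] = 200

# Absolute per-pixel difference (computed in int16 to avoid uint8
# wrap-around); non-zero values mark changed pixels.
diff = np.abs(frame1.astype(np.int16) - frame2.astype(np.int16)).astype(np.uint8)

# Summing the difference image gives a single "motion score" for the
# frame pair; any non-zero score means something moved.
motion_score = int(diff.sum())
moving = motion_score > 0
```

In the real pipeline below, OpenCV's `cv2.absdiff` performs the same per-pixel subtraction without the manual type juggling.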
Moving Object Detection and Segmentation
Moving object detection and segmentation work directly on the video frames we are dealing with. This is also where we apply frame DST (Differencing and Summing Technique).
Moving further, let’s learn about the steps that we have to go through for moving object detection and segmentation from the video frames.
Image Resize:-
import cv2
import imutils
img = cv2.imread('sample2.jpg')
resizedImg = imutils.resize(img, width=500)
cv2.imwrite('resizedImage.jpg', resizedImg)
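A note on the resize step: `imutils.resize` preserves the aspect ratio, so we pass only the target width and the height is scaled by the same factor. The arithmetic, with hypothetical dimensions chosen just for illustration, is:

```python
# Hypothetical original dimensions; imutils.resize scales the height
# by the same factor it scales the width.
old_w, old_h = 1000, 400
new_w = 500
scale = new_w / old_w
new_h = round(old_h * scale)
```

Downscaling to a fixed width like 500 pixels also reduces the number of pixels every later step (blur, diff, threshold) has to process.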
Gaussian Blur – Smoothening:-
import cv2
img = cv2.imread('sample2.jpg')
grayImg = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# dst = cv2.GaussianBlur(src, ksize, sigmaX)
gaussianImg = cv2.GaussianBlur(grayImg, (21, 21), 0)
cv2.imwrite("GaussianBlur.jpg", gaussianImg)
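Why blur before differencing? Sensor noise flips individual pixels between frames, and those flips would survive the difference step as false motion. A rough one-dimensional sketch (a plain 3-tap moving average standing in for a Gaussian kernel, with made-up signal values) shows how smoothing damps an isolated spike:

```python
import numpy as np

# Toy 1D "scanline" with a single noise spike at the center.
sig = np.array([10, 10, 100, 10, 10], dtype=float)

# A simple 3-tap average; a Gaussian kernel weights the center more
# but has the same damping effect on isolated spikes.
kernel = np.array([1, 1, 1]) / 3.0
smoothed = np.convolve(sig, kernel, mode="same")
```

After smoothing, the spike's peak drops from 100 to about 40, so a threshold of, say, 50 on the frame difference would no longer fire on it.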
Threshold:-
# retval, dst = cv2.threshold(src, thresh, maxval, type)
import cv2
img = cv2.imread("sample.jpg")
grayImg = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gaussBlur = cv2.GaussianBlur(grayImg, (21, 21), 0)
# Threshold the blurred grayscale image.
thresholdImg = cv2.threshold(gaussBlur, 150, 255, cv2.THRESH_BINARY)[1]
cv2.imwrite("threshold.jpg", thresholdImg)
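To make the `THRESH_BINARY` rule concrete, here is a small NumPy sketch (toy pixel values, not from the article) of what `cv2.threshold` computes: pixels strictly greater than the threshold become `maxval`, everything else becomes 0.

```python
import numpy as np

# Toy 2x2 grayscale image.
img = np.array([[10, 160],
                [150, 255]], dtype=np.uint8)

thresh, maxval = 150, 255
# THRESH_BINARY: dst = maxval if src > thresh else 0.
# Note the strict inequality: a pixel exactly equal to thresh maps to 0.
th = np.where(img > thresh, maxval, 0).astype(np.uint8)
```

The result is a clean binary mask, which is exactly the form the later contour-finding step expects.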
Drawing Rectangle:-
# cv2.rectangle(img, startPoint, endPoint, color, thickness)
cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
Putting Text in Image:-
# cv2.putText(img, text, position, fontFace, fontScale, color, thickness)
cv2.putText(img, text, (10, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
findContours:-
# contours = cv2.findContours(srcImageCopy, contourRetrievalMode, contourApproximationMethod)
cnts = cv2.findContours(threshImg.copy(), cv2.RETR_EXTERNAL,
                        cv2.CHAIN_APPROX_SIMPLE)
Note that the shape of the tuple returned by cv2.findContours differs between OpenCV versions; imutils.grab_contours (used in the full program below) extracts the contour list regardless of the installed version.
Source Code:-
import imutils
import time
import cv2
vs = cv2.VideoCapture(0)
firstFrame = None
area=500
while True:
    _, img = vs.read()
    text = "Normal"
    img = imutils.resize(img, width=500)
    grayImg = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    grayImg = cv2.GaussianBlur(grayImg, (21, 21), 0)
    # The first captured frame serves as the static background reference.
    if firstFrame is None:
        firstFrame = grayImg
        continue
    imgDiff = cv2.absdiff(firstFrame, grayImg)
    threshImg = cv2.threshold(imgDiff, 25, 255, cv2.THRESH_BINARY)[1]
    # Dilate to fill small holes so contours form solid blobs.
    threshImg = cv2.dilate(threshImg, None, iterations=2)
    cnts = cv2.findContours(threshImg.copy(), cv2.RETR_EXTERNAL,
                            cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    for c in cnts:
        # Skip contours smaller than the minimum area to suppress noise.
        if cv2.contourArea(c) < area:
            continue
        (x, y, w, h) = cv2.boundingRect(c)
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
        text = "Moving Object detected"
        print(text)
    cv2.putText(img, text, (10, 20),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
    cv2.imshow("VideoStream", img)
    cv2.imshow("Thresh", threshImg)
    cv2.imshow("Image Difference", imgDiff)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break
vs.release()
cv2.destroyAllWindows()
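For intuition about what `cv2.boundingRect` returns in the loop above, the same `(x, y, w, h)` box can be computed for a single blob directly from a binary mask with NumPy (the mask here is synthetic, purely for illustration):

```python
import numpy as np

# Synthetic binary mask: 255 where motion was detected, 0 elsewhere.
mask = np.zeros((20, 20), dtype=np.uint8)
mask[5:9, 7:12] = 255

# Bounding box of the non-zero region, matching cv2.boundingRect's
# (x, y, w, h) convention: x/y are the top-left corner, w/h the extent.
ys, xs = np.nonzero(mask)
x, y = int(xs.min()), int(ys.min())
w, h = int(xs.max()) - x + 1, int(ys.max()) - y + 1
```

In the real program, `cv2.findContours` first separates the mask into individual blobs, so each contour gets its own rectangle instead of one box around all motion.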
Advantages
- The first advantage is low computational cost. As we do not use any neural network or deep learning technique, the method is not computationally demanding.
- We can run it entirely on a CPU; even a moderately powerful CPU suffices for this kind of moving object detection.
Disadvantages
- First of all, we can only detect moving objects. If that is our goal, it is all fine, but static objects will go undetected with this technique.
- This also means that we cannot use the technique on single images, only on videos.
- It cannot be truly real-time from the very start, as we have to wait at least for a certain number of frames to establish the background reference.
- Only once enough frames are available for differencing and summing can detection actually begin.