In this article, I describe the first part of my computer vision project: a hybrid method for moving object detection. After discussing the details of the method, I will share the Python code. The rest of the project will be covered in upcoming posts.
As I mentioned in the previous post, object detection could be performed by a human operator, but this is not feasible: an operator cannot concentrate on the screen consistently and will miss many detections. Another family of techniques is deep learning-based approaches such as YOLO, R-CNN, Faster R-CNN, etc. Although these approaches are the state of the art, they require high-performance hardware and may not meet real-time requirements. If the objects we are interested in move within the scene, we can exploit this property to capture them.
There are several automatic moving object detectors in the literature, such as background subtraction, temporal differencing, and statistical methods. In this post, I describe a hybrid method that combines frame differencing and background subtraction. The frame differencing scheme that I present is a novel technique. Let’s describe the method step by step.
Assume that the latest four grayscale frames in our video sequence up to the current time t are f(t-3), f(t-2), f(t-1), and f(t). First, all frames are filtered with a Gaussian kernel for smoothing. Then I apply the following steps:
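Why smooth first? The differencing steps threshold pixel-wise differences, so single-pixel sensor noise can easily exceed the threshold. A quick 1-D sketch (using a 3-tap binomial kernel as a stand-in for the Gaussian; this is an illustration, not the detector’s code) shows how smoothing spreads an isolated spike and lowers its peak:

```python
import numpy as np

# a 3-tap binomial kernel, a 1-D stand-in for a small Gaussian kernel
KERNEL = np.array([0.25, 0.5, 0.25])

def smooth(row):
    # convolve the row with the kernel, keeping the original length
    return np.convolve(row, KERNEL, mode='same')

row = np.array([0, 0, 0, 255, 0, 0, 0], dtype=float)  # a single-pixel noise spike
print(smooth(row))  # the spike's energy spreads out and its peak is halved
```

After smoothing, the spike contributes a lower peak (127.5 instead of 255) spread over three pixels, so a fixed difference threshold is less likely to fire on noise while true object motion, which covers many pixels, still passes.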
#steps for frame differencing
diff0 = abs(f(t) - f(t-2))      #even frames
diff1 = abs(f(t-1) - f(t-3))    #odd frames
diff0 = diff0 > threshold       #binary thresholding
diff1 = diff1 > threshold       #binary thresholding
result0 = diff0 & diff1         #bitwise and
result0 = dilate(result0)       #morphological dilation to extend the objects
result0 = open(result0)         #morphological opening to remove noise

#steps for background subtraction
background = alpha * background + (1-alpha) * f(t)
diff = abs(f(t) - background)
diff = diff > threshold         #binary thresholding
result1 = dilate(diff)          #morphological dilation to extend the objects
result1 = open(result1)         #morphological opening to remove noise

#final step
final_result = result0 & result1    #bitwise and
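The steps above can be sketched directly in NumPy. This is a minimal illustration, not the full detector: the morphological dilation and opening are omitted, and the background starts from zeros instead of the first frame. Frames are cast to a signed type so the absolute differences don’t wrap around:

```python
import numpy as np

ALPHA, THRESH = 0.9, 20

def hybrid_mask(f, background):
    """One step of the hybrid detector on grayscale frames
    f = [f(t-3), f(t-2), f(t-1), f(t)]. Morphology omitted for brevity."""
    f = [x.astype(np.int16) for x in f]            # signed, so subtraction can't wrap
    diff0 = np.abs(f[3] - f[1]) > THRESH           # even frames: |f(t) - f(t-2)|
    diff1 = np.abs(f[2] - f[0]) > THRESH           # odd frames:  |f(t-1) - f(t-3)|
    result0 = diff0 & diff1                        # frame differencing result
    background = ALPHA * background + (1 - ALPHA) * f[3]  # running-average update
    result1 = np.abs(f[3] - background) > THRESH   # background subtraction result
    return result0 & result1, background

# a 3-pixel-wide bright object moving right by one column per frame
frames = [np.zeros((5, 10), dtype=np.uint8) for _ in range(4)]
for t, fr in enumerate(frames):
    fr[2, 2 + t:5 + t] = 255

mask, bg = hybrid_mask(frames, np.zeros((5, 10)))
print(np.argwhere(mask))  # → [[2 6]]
```

For this toy sequence, frame differencing fires at both the trailing edge and the current position of the object, and intersecting with the background-subtraction mask keeps only the pixel inside the object’s current position. In the real detector, dilation grows these sparse hits into a blob covering the object before the opening step removes residual speckle.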
After this step, the algorithm finds the connected components in the resulting binary image. The output is a list of bounding boxes, each containing the x, y coordinates and the w, h (width, height) of a moving object. Using the OpenCV drawing functions, we can then draw the bounding boxes on each frame. Let’s move on to the Python code. The name of the class is ‘objectDetector’. Please make sure that Python 3.7 and Anaconda are installed on your computer. The project requires the opencv and numpy libraries (time is part of the standard library); you can install them with the pip command in the Anaconda prompt. Now create a file named ‘objectDetector.py’ in your working folder and copy-paste the following lines using your favorite editor (I prefer Spyder):
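To make the connected-components step concrete, here is a plain-Python stand-in for what cv2.connectedComponentsWithStats does: a 4-connected flood fill over a binary mask that returns one (x, y, w, h) bounding box per blob. (The real code below additionally filters boxes by area and pads them by a small offset.)

```python
import numpy as np
from collections import deque

def bounding_boxes(mask):
    """4-connected components of a boolean mask -> list of (x, y, w, h) boxes."""
    seen = np.zeros_like(mask, dtype=bool)
    boxes = []
    h, w = mask.shape
    for sy, sx in np.argwhere(mask):          # every foreground pixel, row-major
        if seen[sy, sx]:
            continue                          # already part of an earlier blob
        q = deque([(sy, sx)])                 # breadth-first flood fill from here
        seen[sy, sx] = True
        ys, xs = [sy], [sx]
        while q:
            y, x = q.popleft()
            for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    ys.append(ny); xs.append(nx)
                    q.append((ny, nx))
        boxes.append((min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1))
    return boxes

mask = np.zeros((6, 10), dtype=bool)
mask[1:3, 1:4] = True   # one 3x2 blob
mask[4:6, 6:9] = True   # another 3x2 blob
print(bounding_boxes(mask))  # → [(1, 1, 3, 2), (6, 4, 3, 2)]
```

OpenCV’s implementation is far faster and also reports each component’s area and centroid, which is what the detector uses for its size filter.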
import cv2
import numpy as np
import time

class objectDetector():
    def __init__(self, width, height):
        self.threshold_frame_diff = 20
        self.threshold_background = 20
        self.alpha_background = 0.9
        self.min_pixels = 150
        self.max_pixels = 10000
        self.kernel = np.ones((5,5), np.uint8)  #structuring element for morphology
        self.gaussKernel = (3,3)
        self.search_offset = 2
        self.pen_color = (255,255,255)
        self.pen_thickness = 2
        self.detection_list = []
        self.frame_p = np.zeros((1,1))    #f(t-1)
        self.frame_pp = np.zeros((1,1))   #f(t-2)
        self.frame_ppp = np.zeros((1,1))  #f(t-3)
        self.background = np.zeros((1,1))
        self.width = width
        self.height = height

    def background_subtraction(self, frame):
        frame_gry = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        frame_gry = cv2.GaussianBlur(frame_gry, self.gaussKernel, 0)
        if self.background.shape != (1,1):
            #update the running-average background
            self.background = cv2.addWeighted(self.background, self.alpha_background,
                                              frame_gry, (1-self.alpha_background), 0)
        else:
            #the first frame initializes the background
            self.background = frame_gry
        diff = cv2.absdiff(frame_gry, self.background)
        _, new_img = cv2.threshold(diff, self.threshold_background, 255, cv2.THRESH_BINARY)
        new_img = cv2.morphologyEx(new_img, cv2.MORPH_DILATE, self.kernel)
        new_img = cv2.morphologyEx(new_img, cv2.MORPH_OPEN, self.kernel)
        return new_img

    def frame_differencing(self, frame):
        frame_gry = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        new_img = np.zeros(frame_gry.shape, dtype=np.uint8)
        frame_gry = cv2.GaussianBlur(frame_gry, self.gaussKernel, 0)
        if self.frame_p.shape != (1,1) and self.frame_pp.shape != (1,1) and self.frame_ppp.shape != (1,1):
            diff0 = cv2.absdiff(self.frame_p, self.frame_ppp)  #odd frames: |f(t-1) - f(t-3)|
            _, diff0 = cv2.threshold(diff0, self.threshold_frame_diff, 255, cv2.THRESH_BINARY)
            diff1 = cv2.absdiff(frame_gry, self.frame_pp)      #even frames: |f(t) - f(t-2)|
            _, diff1 = cv2.threshold(diff1, self.threshold_frame_diff, 255, cv2.THRESH_BINARY)
            new_img = cv2.bitwise_and(diff0, diff1)
            new_img = cv2.morphologyEx(new_img, cv2.MORPH_DILATE, self.kernel)
            new_img = cv2.morphologyEx(new_img, cv2.MORPH_OPEN, self.kernel)
        self.frame_ppp = self.frame_pp
        self.frame_pp = self.frame_p
        self.frame_p = frame_gry
        return new_img

    def find_detections(self, frame):
        bck = self.background_subtraction(frame)
        diff = self.frame_differencing(frame)
        fgmask = cv2.bitwise_and(diff, bck)
        connectivity = 4
        output = cv2.connectedComponentsWithStats(fgmask, connectivity, cv2.CV_32S)
        num_labels = output[0]
        #labels = output[1]
        stats = output[2]
        #centroids = output[3]
        self.detection_list = []
        for i in range(1, num_labels):  #label 0 is the background component
            area = stats[i][cv2.CC_STAT_AREA]
            if area >= self.min_pixels and area <= self.max_pixels:
                x = stats[i][cv2.CC_STAT_LEFT] - self.search_offset
                y = stats[i][cv2.CC_STAT_TOP] - self.search_offset
                w = stats[i][cv2.CC_STAT_WIDTH] + self.search_offset*2
                h = stats[i][cv2.CC_STAT_HEIGHT] + self.search_offset*2
                x = max(0, x)
                y = max(0, y)
                box = (x, y, w, h)
                if x + w < self.width and y + h < self.height:
                    self.detection_list.append(box)
                    self.putRectangle(frame, box)
                    self.putText(frame, 'Target', (x, y-5))
        return fgmask

    def putRectangle(self, frame, box):
        (x, y, w, h) = box
        cv2.rectangle(frame, (x, y), (x + w, y + h), self.pen_color, self.pen_thickness)

    def putText(self, frame, text, location):
        cv2.putText(frame, text, org=location, fontFace=cv2.FONT_HERSHEY_COMPLEX,
                    fontScale=0.5, color=self.pen_color)

def main():
    file_name = 'your_file_name.mp4'  #put your file name here
    frame_rate = 30  #frame rate of your video
    capture = cv2.VideoCapture(file_name)
    if not capture.isOpened():
        print("Error opening video stream or file")
        return
    frame_name = 'Moving Object Detection'
    width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
    detector = objectDetector(width, height)
    is_reading, frame = capture.read()
    while is_reading:
        time.sleep(1.0/frame_rate)
        try:
            #find and draw the moving objects
            fgmask = detector.find_detections(frame)
            cv2.imshow(frame_name, frame)
        except Exception as e:
            print('exception: ', e)
        key = cv2.waitKey(20) & 0xFF
        if key == ord('q'):  #press q to quit
            break
        is_reading, frame = capture.read()
    capture.release()
    cv2.destroyAllWindows()

if __name__ == '__main__':
    main()
Before running the code, make sure the file_name variable is set to a valid video file name in your working directory. You can run the file by pressing F5 on your keyboard in Spyder, or by typing ‘python objectDetector.py’ on the command line. You should see something like the following figure (in your video, the moving objects may be different from cars):
That’s all for now, enjoy your detector!