A hybrid moving object detector (CV project part 1)

In this article, I’m going to talk about the first part of my computer vision project: a hybrid method for moving object detection. After discussing the details of the method, I will share the Python code. The rest of the project will be shared in upcoming posts.

As I mentioned in the previous post, object detection could be conducted by a human operator, but this is not feasible: a person cannot concentrate on the screen consistently and will miss many detections. Another family of techniques is deep learning-based approaches such as YOLO, R-CNN, Faster R-CNN, etc. Although those approaches are the state of the art, they require high-performance hardware and may not meet real-time requirements. If the objects we are interested in can move in the scene of our system, we can exploit this property to capture the moving objects.

There are some automatic moving object detectors in the literature, such as background subtraction, temporal differencing, and statistical methods. In this post, I’m going to describe a hybrid method that combines frame differencing and background subtraction. The frame differencing scheme I present is a novel technique. Let’s describe the method step by step.

Assume that the latest four grayscale frames in our video sequence up to the current time t are f(t-3), f(t-2), f(t-1), and f(t). In the beginning, every frame is filtered with a Gaussian kernel for smoothing. Then, I apply the following steps:

#steps for frame differencing 
diff0 = abs(f(t) - f(t-2))    #even frames
diff1 = abs(f(t-1) - f(t-3))  #odd frames

diff0 = diff0 > threshold #binary thresholding
diff1 = diff1 > threshold #binary thresholding

result0 = diff0 & diff1 #bitwise and
result0 = dilate(result0) #morphological dilate to extend the objects
result0 = open(result0) #morphological open to remove noise

#steps for background subtraction
background = alpha * background + (1-alpha) * f(t)

diff = abs(f(t) - background)

diff = diff > threshold #binary thresholding

result1 = dilate(diff) #morphological dilate to extend the objects
result1 = open(result1) #morphological open to remove noise

#final step
final_result = result0 & result1 #bitwise and
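The steps above can be sketched in pure NumPy on tiny synthetic frames (the Gaussian smoothing and morphological steps are omitted for brevity, and the threshold and alpha values are just illustrative):

```python
import numpy as np

def hybrid_mask(f3, f2, f1, f0, background, threshold=20, alpha=0.9):
    """f3..f0 are f(t-3)..f(t); returns (moving-pixel mask, updated background)."""
    # frame differencing: pair f(t) with f(t-2), and f(t-1) with f(t-3)
    diff0 = np.abs(f0.astype(np.int16) - f2.astype(np.int16)) > threshold
    diff1 = np.abs(f1.astype(np.int16) - f3.astype(np.int16)) > threshold
    result0 = diff0 & diff1

    # background subtraction against a running-average background
    background = alpha * background + (1 - alpha) * f0
    result1 = np.abs(f0 - background) > threshold

    # a pixel counts as moving only if both detectors agree
    return result0 & result1, background

# synthetic 5x5 frames: a 2-pixel-wide object moves one column per frame
frames = [np.zeros((5, 5), dtype=np.uint8) for _ in range(4)]
for i, f in enumerate(frames):
    f[2, i:i+2] = 255               # object occupies columns i and i+1

background = np.zeros((5, 5), dtype=np.float64)
mask, background = hybrid_mask(*frames, background)
print(mask[2, 3])   # the overlap of the two detectors lands on the object
```

On real video, the morphological open is what suppresses the isolated noise pixels that survive both thresholds, and the dilate grows the surviving regions back to roughly object size.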

After this step, the algorithm finds the connected components in the resulting image. This yields a list of bounding boxes, each holding the x, y coordinates and the width and height (w, h) of a moving object. Using OpenCV’s drawing functions, we can then draw the bounding boxes of the objects on each frame.

Now let’s run the Python code. The name of the class is ‘objectDetector’. Please make sure that Python 3.7 and Anaconda are installed on your computer. The project requires the opencv and numpy libraries (the time module ships with Python); you can install them with the pip command in the Anaconda prompt. Now create a file named ‘objectDetector.py’ in your working folder and copy-paste the following lines using your favorite editor (I prefer Spyder):

import cv2
import numpy as np
import time

class objectDetector():
    
    def __init__(self, width, height):

        self.threshold_frame_diff = 20
        self.threshold_background = 20
        self.alpha_background = 0.9

        self.min_pixels = 150
        self.max_pixels = 10000
        
        self.kernel = np.ones((5,5), dtype=np.uint8) #structuring element for morphology
        self.gaussKernel = (3,3)
        
        self.search_offset = 2
        self.pen_color = (255,255,255)
        self.pen_thickness = 2
        self.detection_list = []
        self.frame_p = np.zeros((1,1))
        self.frame_pp = np.zeros((1,1))
        self.frame_ppp = np.zeros((1,1))
        self.background = np.zeros((1,1))
        self.width = width
        self.height = height

    def background_subtraction(self, frame):
        
        frame_gry = cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)
        frame_gry = cv2.GaussianBlur(frame_gry, self.gaussKernel,0)
    
        if self.background.shape[0] != 1:
            #update the running-average background with the current frame
            self.background = cv2.addWeighted(self.background, self.alpha_background,
                                              frame_gry, (1-self.alpha_background), 0)
        else:
            #first frame: initialize the background directly
            self.background = frame_gry
            
        diff = cv2.absdiff(frame_gry, self.background )
            
        _, new_img = cv2.threshold(diff, self.threshold_background ,255,cv2.THRESH_BINARY)
        
        new_img = cv2.morphologyEx(new_img, cv2.MORPH_DILATE, self.kernel)
        new_img = cv2.morphologyEx(new_img, cv2.MORPH_OPEN, self.kernel)
        
        
        return new_img
    
    def frame_differencing(self,frame):
        
        frame_gry = cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)

        new_img = np.zeros(frame_gry.shape,dtype=np.uint8)
        
        frame_gry = cv2.GaussianBlur(frame_gry, self.gaussKernel,0)
        
        #differencing needs three previous frames in the history
        if self.frame_p.shape[0] != 1 and self.frame_pp.shape[0] != 1 and self.frame_ppp.shape[0] != 1:

            diff0 = cv2.absdiff(self.frame_p, self.frame_ppp)
            _, diff0 = cv2.threshold(diff0,self.threshold_frame_diff ,255,cv2.THRESH_BINARY)
            
            
            diff1 = cv2.absdiff(frame_gry, self.frame_pp)
            _, diff1 = cv2.threshold(diff1,self.threshold_frame_diff ,255,cv2.THRESH_BINARY)
            
            new_img = cv2.bitwise_and(diff0,diff1)
            
            new_img = cv2.morphologyEx(new_img, cv2.MORPH_DILATE, self.kernel)
            new_img = cv2.morphologyEx(new_img, cv2.MORPH_OPEN, self.kernel)

               
        #shift the frame history: p -> pp -> ppp
        self.frame_ppp = self.frame_pp
        self.frame_pp = self.frame_p
        self.frame_p = frame_gry
        
        return new_img
        
    def find_detections(self, frame):

        bck = self.background_subtraction(frame)
        diff = self.frame_differencing(frame)
        fgmask = cv2.bitwise_and(diff,bck)
        connectivity = 4
        output = cv2.connectedComponentsWithStats(fgmask, connectivity, cv2.CV_32S)
        num_labels = output[0]
        #labels = output[1]
        stats = output[2]
        #centroids = output[3]
        self.detection_list = []

        for i in range(1, num_labels): #label 0 is the background component
            area = stats[i][cv2.CC_STAT_AREA]
            if area >= self.min_pixels and area <= self.max_pixels:
                x = stats[i][cv2.CC_STAT_LEFT]-self.search_offset
                y = stats[i][cv2.CC_STAT_TOP] -self.search_offset
                w = stats[i][cv2.CC_STAT_WIDTH] +self.search_offset*2
                h = stats[i][cv2.CC_STAT_HEIGHT] +self.search_offset*2
                x = max(0,x)
                y = max(0,y)
                box = (x,y,w,h)
                if x + w < self.width and y+h < self.height:    
                    self.detection_list.append(box)
                    self.putRectangle(frame, box)
                    self.putText(frame, 'Target', (x,y-5))        
                        
        return fgmask
    
    def putRectangle(self,frame, box):
        (x,y,w,h) = box
        cv2.rectangle(frame, (x, y), (x + w, y + h), self.pen_color, self.pen_thickness)
        
    def putText(self, frame, text, location):
        
        cv2.putText(frame,text,org=location, fontFace=cv2.FONT_HERSHEY_COMPLEX,
                    fontScale=0.5,color=self.pen_color)
        
        
def main():
   
    file_name = 'your_file_name.mp4' #put your file name here
    frame_rate = 30 #frame rate of your video
    capture = cv2.VideoCapture(file_name)
    if not capture.isOpened():
        print("Error opening video stream or file")
        return
    
    frame_name = 'Moving Object Detection'
    width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
    detector = objectDetector(width, height)
    is_reading, frame = capture.read()
    while is_reading:
        time.sleep(1.0/frame_rate)
        try:
            #find moving objects
            fgmask = detector.find_detections(frame)
            cv2.imshow(frame_name, frame)
        except Exception as e:
            print('exception: ', e)
        
        key = (cv2.waitKey(20) & 0xFF)
        
        if key == ord('q'): #press q to quit
            break
        
        is_reading, frame = capture.read()
        
    capture.release()
    cv2.destroyAllWindows()
    
if __name__ == '__main__':
    main()

Before running the code, make sure the file_name variable is set to a valid video file name in your working directory. You can run the file by pressing F5 on your keyboard (in Spyder) or by typing ‘python objectDetector.py’ on the command line. You should see something like the following figure (in your video, the moving objects may be different than cars):
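If the dependencies are missing, a typical setup from the Anaconda prompt might look like this (the package names below are the standard PyPI ones; your environment may differ):

```shell
# install dependencies (opencv-python provides the cv2 module)
pip install opencv-python numpy

# run the detector from the working directory; press 'q' in the window to quit
python objectDetector.py
```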

Detected Moving Objects

That’s all for now, enjoy your detector!