Object detection, tracking, and 3D positioning using a single camera

In this article, I’m going to share some results from my computer vision project: moving object detection, tracking, 3D positioning, and speed estimation using only a single camera. The project was implemented with OpenCV and Python. This post summarizes the project; in upcoming posts I will present the details step by step and share the code.

The first stage of the project is moving object detection. A human operator could do this, but that is impractical: an operator cannot focus on the system constantly and can miss objects in the scene. Therefore, automatic methods such as background subtraction, temporal differencing, and statistical methods are commonly preferred in the literature. In this project, I use a novel technique in which sequential odd and even frames are differenced and the results are combined with a bitwise AND, along with some pre- and post-processing. The method can successfully capture the moving objects and eliminate the background clutter. The following figure shows how a moving object is captured in a video sequence using this technique.

Detection Result
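To make the idea of differencing plus a bitwise AND more concrete, here is a minimal sketch. It is a generic three-frame "double difference", not necessarily the exact frame pairing or the pre- and post-processing used in the project. The function name `moving_mask`, the threshold value, and the synthetic frames are assumptions made for illustration, and NumPy is used so the example runs without a video file. With OpenCV, the same steps could be written with `cv2.absdiff`, `cv2.threshold`, and `cv2.bitwise_and`.

```python
import numpy as np


def moving_mask(prev_frame, curr_frame, next_frame, threshold=25):
    """Return a boolean mask of pixels that changed in both frame pairs.

    Hypothetical helper: a plain three-frame double difference.
    """
    # Use a signed type so subtracting uint8 images cannot wrap around.
    prev_f = prev_frame.astype(np.int16)
    curr_f = curr_frame.astype(np.int16)
    next_f = next_frame.astype(np.int16)

    diff_a = np.abs(curr_f - prev_f) > threshold  # change between t-1 and t
    diff_b = np.abs(next_f - curr_f) > threshold  # change between t and t+1

    # Keep only pixels flagged in both differences (the bitwise AND step).
    return diff_a & diff_b


def make_frame(col):
    """Synthetic 8x8 grayscale frame: one static bright pixel, one moving block."""
    frame = np.zeros((8, 8), dtype=np.uint8)
    frame[0, 7] = 200                 # static background clutter
    frame[3:5, col:col + 2] = 255     # 2x2 object at the given column
    return frame


# The object moves two pixels to the right in every frame.
frames = [make_frame(col) for col in (0, 2, 4)]
mask = moving_mask(*frames)
print(mask.astype(np.uint8))
```

In this toy case the mask marks only the 2x2 block at its position in the middle frame, while the static bright pixel disappears because it never changes. Note that if the object overlaps its own previous position, the AND can remove part of it, which is one reason extra processing is usually needed around this step.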

In the second stage, a track is initiated for each detected object in the sequence. There are various visual object trackers in the literature, and most of them are based on convolutional neural networks, discriminative correlation filters, structured SVMs, HOG features, mean shift, or optical flow. DSLT, ECO, KCF, TLD, and MOSSE are some examples of sophisticated trackers. In this project, I use my own custom tracker, which is based on template matching in an acquisition window, adaptive scaling, template updating, and Kalman filtering. The proposed tracker is lightweight and can track multiple objects online and in real time.
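As a rough illustration of how template matching in an acquisition window can be combined with a Kalman filter, here is a minimal sketch. It is not the project's tracker: adaptive scaling and template updating are left out, the matching score is a plain sum of squared differences, and the motion model is assumed to be constant velocity. The names `match_in_window` and `ConstantVelocityKalman`, the noise values, and the synthetic frames are all assumptions for this example.

```python
import numpy as np


def match_in_window(frame, template, center, half_size):
    """Search for the template's top-left corner within +/- half_size of center."""
    th, tw = template.shape
    r0 = max(0, center[0] - half_size)
    c0 = max(0, center[1] - half_size)
    r1 = min(frame.shape[0] - th, center[0] + half_size)
    c1 = min(frame.shape[1] - tw, center[1] + half_size)
    best, best_pos = None, None
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            patch = frame[r:r + th, c:c + tw].astype(np.float64)
            ssd = np.sum((patch - template) ** 2)  # lower is a better match
            if best is None or ssd < best:
                best, best_pos = ssd, (r, c)
    return best_pos


class ConstantVelocityKalman:
    """Kalman filter with state [row, col, v_row, v_col] (assumed model)."""

    def __init__(self, row, col, dt=1.0, process_var=1e-2, meas_var=1.0):
        self.x = np.array([row, col, 0.0, 0.0], dtype=np.float64)
        self.P = np.eye(4) * 10.0            # initial uncertainty
        self.F = np.eye(4)                   # state transition
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                # only position is measured
        self.Q = np.eye(4) * process_var
        self.R = np.eye(2) * meas_var

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, measurement):
        z = np.asarray(measurement, dtype=np.float64)
        y = z - self.H @ self.x                       # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P


# Synthetic target: a 3x3 pattern moving 1 row and 2 columns per frame.
template = np.arange(1, 10, dtype=np.float64).reshape(3, 3) * 20.0
kf = ConstantVelocityKalman(row=2, col=2)
found = []
for k in range(1, 9):
    frame = np.zeros((20, 30))
    r, c = 2 + k, 2 + 2 * k
    frame[r:r + 3, c:c + 3] = template
    pred = kf.predict()                               # where to look
    center = (int(round(pred[0])), int(round(pred[1])))
    pos = match_in_window(frame, template, center, half_size=4)
    kf.update(pos)                                    # correct with the match
    found.append(pos)

print(found[-1], kf.x[2:])
```

The filter's prediction decides where the search window is placed, so only a small region of each frame has to be scanned, and the velocity part of the state converges toward the true motion of the target as more frames arrive.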

The third step is 3D positioning of the moving objects using a single camera. In most cases, stereo or depth cameras are used to capture 3D images and depth information of the objects in the scene. With a single camera, however, it is not easy to extract depth information accurately. Once depth has been extracted, one can reconstruct the 3D position of the object by means of the camera properties and mapping formulations. In this stage, I propose a method that extracts the depth of an object whose real size is known. Finally, the speed of the object is calculated as the norm of the velocity vector, which the Kalman filter already provides at no extra cost. In the following figure, we see how a moving target is tracked and how its position and speed are estimated.

Tracked Object with Position and Speed Estimation
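To show how depth can follow from a known real size, here is a minimal sketch based on the standard pinhole camera model. This is an assumption about the general approach, not the exact formulation used in the project. The function names, the focal length, the principal point, and the numbers in the example are all hypothetical. Depth comes from similar triangles, the pixel is then back-projected to a 3D point, and speed is the Euclidean norm of a velocity vector.

```python
import math


def depth_from_size(focal_px, real_width_m, pixel_width):
    """Depth Z (meters) from similar triangles: Z = f * W / w."""
    return focal_px * real_width_m / pixel_width


def back_project(u, v, depth_m, focal_px, cx, cy):
    """3D camera-frame point (X, Y, Z) for pixel (u, v) at the given depth."""
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return (x, y, depth_m)


def speed(velocity):
    """Speed as the Euclidean norm of a velocity vector."""
    return math.sqrt(sum(component * component for component in velocity))


# Hypothetical numbers: an object 2 m wide appears 80 px wide in the image.
focal_px = 800.0            # focal length in pixels (assumed, from calibration)
cx, cy = 640.0, 360.0       # principal point (assumed image center)

z = depth_from_size(focal_px, real_width_m=2.0, pixel_width=80.0)
position = back_project(720.0, 300.0, z, focal_px, cx, cy)
object_speed = speed((3.0, 0.0, 4.0))   # velocity in m/s, e.g. from a filter

print(z, position, object_speed)
```

With these numbers the object sits 20 m from the camera at roughly (2.0, -1.5, 20.0) m, and the speed is 5 m/s. The estimate is only as good as the focal length and the measured pixel width, so small errors in the bounding box translate directly into depth errors.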

The video of the experiment can be seen at the following link.

As I said earlier, I will share the code step by step in the upcoming posts. That’s all for now. Enjoy your day!
