In this post, I use a single camera to determine the 3D position of a target whose size is known a priori. The extracted position is expressed in the camera coordinate system. The first step is a calibration stage: I take photos of a known object at distances of 1, 2, …, 5 meters. The actual width of the red object is 0.082 meters, and I fix the camera resolution to 720p. In each photo, I measure the object width in pixels. The procedure can be seen in the following figure.

After that, I obtain the following plot of the number of pixels versus distance.

Since the plot shows an exponential behaviour, I fit an exponential function to the data. The resulting fitted curve can be seen in the same figure. The function that I use has the following form:

```
p = alpha * exp(-beta * d^gamma)
d: distance
p: number of pixels
alpha, beta, gamma: unknown coefficients
```

The result of the fitting operation gives the following numbers for the unknown coefficients:

```
alpha: 329
beta: 2
gamma: 0.4
```
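The post does not say which fitting routine was used, so the following is only a sketch of one possible approach. For a fixed `gamma`, the model is linear in `log(alpha)` and `beta` after taking logarithms, so it can be solved by least squares while scanning `gamma` over a grid. The function name `fit_pixel_model`, the grid range, and the data are assumptions: the samples are synthetic and noise-free, generated from the coefficients quoted above, not the measured pixel widths.

```python
import numpy as np

def fit_pixel_model(d, p, gammas=np.linspace(0.1, 1.0, 91)):
    """Fit p = alpha * exp(-beta * d**gamma) by scanning gamma.

    For a fixed gamma, log(p) = log(alpha) - beta * d**gamma is linear in
    (log(alpha), beta), so ordinary least squares solves it directly.
    """
    d = np.asarray(d, dtype=float)
    log_p = np.log(np.asarray(p, dtype=float))
    best = None
    for gamma in gammas:
        # Columns: constant term for log(alpha), and -d**gamma for beta.
        A = np.column_stack([np.ones_like(d), -d**gamma])
        coef, *_ = np.linalg.lstsq(A, log_p, rcond=None)
        err = np.sum((A @ coef - log_p) ** 2)
        if best is None or err < best[0]:
            best = (err, np.exp(coef[0]), coef[1], gamma)
    _, alpha, beta, gamma = best
    return alpha, beta, gamma

# Synthetic stand-in for the calibration measurements (assumption):
# generated from the quoted coefficients at the five calibration distances.
d = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
p = 329 * np.exp(-2 * d**0.4)

alpha, beta, gamma = fit_pixel_model(d, p)
print(alpha, beta, gamma)
```

Note that fitting in log space weights the samples differently from a direct fit on the pixel values, so with real, noisy measurements the coefficients may differ slightly from those of a nonlinear least-squares fit.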

We next need to invert the function so that it returns the distance for a measured pixel width:

`d = (-1/beta*log(p/alpha))^(1/gamma)`

If we multiply this function by the ratio (real object width/reference object width), where the reference is the calibration object, we can then approximately calculate the distance of an object of a different width. The final formula becomes:

`d = (-1/beta*log(p/alpha))^(1/gamma)*(real object width/reference object width)`
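As a quick sanity check of the inversion, here is a small sketch that maps a distance to a pixel width and back, and then applies the width ratio. The function names `pixels_to_distance` and `distance_to_pixels` are hypothetical, and the range check is an addition of mine: the formula only yields a real number when the pixel width is below `alpha`, because the logarithm must be negative before the fractional power is taken.

```python
import numpy as np

ALPHA, BETA, GAMMA = 329.0, 2.0, 0.4   # fitted coefficients quoted in the post
REF_WIDTH = 0.082                      # width of the calibration object in meters

def distance_to_pixels(d):
    """Forward model: pixel width of the reference object at distance d."""
    return ALPHA * np.exp(-BETA * d ** GAMMA)

def pixels_to_distance(pixels, real_width=REF_WIDTH):
    """Inverse model, rescaled by (real object width / reference object width)."""
    if not 0 < pixels < ALPHA:
        # Outside this range log(pixels / ALPHA) is not negative, so the
        # fractional power would not give a real distance.
        raise ValueError("pixel width must lie in (0, alpha)")
    d_ref = (-np.log(pixels / ALPHA) / BETA) ** (1.0 / GAMMA)
    return d_ref * real_width / REF_WIDTH

p_at_3m = distance_to_pixels(3.0)
d_back = pixels_to_distance(p_at_3m)                              # round trip
d_double = pixels_to_distance(p_at_3m, real_width=2 * REF_WIDTH)  # object twice as wide
print(p_at_3m, d_back, d_double)
```

The round trip recovers the 3 m input, and an object twice as wide that produces the same pixel width is placed twice as far away.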

We also need the x and y coordinates of the object of interest in order to localize it in 3D space. If we know the pixel location (i,j) of the object in the image, we can convert it to x-y coordinates using the following well-known formulae:

```
x = depth * (i/(width-1)-0.5)*tan(fh/2)
y = depth * (0.5-j/(height-1))*tan(fv/2)
width: width of the image in pixels
height: height of the image in pixels
fh: horizontal fov of the camera in radians
fv: vertical fov of the camera in radians
depth: equivalent to d which we already determined
```
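To make the sign conventions concrete, here is a short sketch that evaluates the two formulae exactly as written above. The helper name `pixel_to_xy` is hypothetical. The 1280x720 image size corresponds to the 720p resolution, and the 100 and 70 degree fields of view are the values used in the class in this post. The depth of 2 meters is an arbitrary example value.

```python
import numpy as np

def pixel_to_xy(depth, i, j, width, height, fh, fv):
    """Convert pixel location (i, j) to x-y coordinates at the given depth."""
    x = depth * (i / (width - 1) - 0.5) * np.tan(fh / 2)
    y = depth * (0.5 - j / (height - 1)) * np.tan(fv / 2)
    return x, y

width, height = 1280, 720                   # 720p image
fh, fv = np.radians(100), np.radians(70)    # fields of view in radians
depth = 2.0                                 # example depth in meters (assumption)

# The image center maps to the optical axis, x = y = 0.
cx, cy = pixel_to_xy(depth, (width - 1) / 2, (height - 1) / 2, width, height, fh, fv)

# The top-left pixel (0, 0) maps to negative x and positive y.
left_x, top_y = pixel_to_xy(depth, 0, 0, width, height, fh, fv)
print(cx, cy, left_x, top_y)
```

So x grows to the right of the image and y grows upward, with the origin on the optical axis.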

With the details covered, here is the full Python localization class:

```
from kalmanFilter import kalmanFilter
import numpy as np


class localization():
    def __init__(self, frame_rate, width, height):
        self.width = width            # image width in pixels
        self.height = height          # image height in pixels
        self.frame_rate = frame_rate
        # 6 states (position and velocity), 3 measurements (position)
        self.kalman_speed = kalmanFilter(state_dim=6, measurement_dim=3,
                                         dt=1/self.frame_rate, Q=0.001, R=50)
        self.reduction_rate = 0.8     # scale applied to the bounding-box width
        self.fov_h = 100              # horizontal fov in degrees
        self.fov_v = 70               # vertical fov in degrees

    def predict(self, boundingBox, real_length):
        (x, y, w, h) = boundingBox
        depth = self.calculate_dist(w*self.reduction_rate, real_length)
        loc = (x+w/2, y+h/2)          # center of the bounding box in pixels
        #print('loc: {:.2f}, {:.2f}'.format(loc[0], loc[1]))
        (px, py, pz) = self.calx_xyz(depth, loc)
        ret = self.kalman_speed.predict(np.array([px, py, pz]))
        (kx, ky, kz, kvx, kvy, kvz) = ret.ravel()
        pos = (kx, ky, kz)
        vel = (kvx, kvy, kvz)
        return pos, vel

    def calculate_dist(self, pixels, real_length):
        # Fitted calibration coefficients
        alpha = 329
        beta = 2
        gamma = 0.4
        # 0.082 is the width of the reference object in meters
        return (-1/beta*np.log(pixels/alpha))**(1/gamma) * real_length/0.082

    def calx_xyz(self, depth, search_point):
        (i, j) = search_point
        # Convert the fields of view from degrees to radians
        fh = self.fov_h/180*np.pi
        fv = self.fov_v/180*np.pi
        px = depth * (i/(self.width-1)-0.5)*np.tan(fh/2)
        py = depth * (0.5-j/(self.height-1))*np.tan(fv/2)
        pz = depth
        return (px, py, pz)
```
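The class imports a `kalmanFilter` module that is not reproduced in this post. For experimenting without it, a hypothetical stand-in could look like the sketch below. It is not the original filter. It only mirrors the constructor arguments and the `predict` call seen in the class, and it assumes a constant-velocity model, a state ordered as position then velocity, and `Q` and `R` given as scalars that scale identity matrices.

```python
import numpy as np

class kalmanFilter:
    """Hypothetical stand-in: constant-velocity Kalman filter (assumption)."""

    def __init__(self, state_dim=6, measurement_dim=3, dt=1/30, Q=0.001, R=50):
        n, m = state_dim, measurement_dim   # assumes n == 2 * m
        self.x = np.zeros((n, 1))           # state: positions, then velocities
        self.P = np.eye(n)                  # state covariance
        self.F = np.eye(n)                  # state transition
        self.F[:m, m:] = dt * np.eye(m)     # position += velocity * dt
        self.H = np.hstack([np.eye(m), np.zeros((m, n - m))])  # observe positions
        self.Q = Q * np.eye(n)              # process noise
        self.R = R * np.eye(m)              # measurement noise

    def predict(self, z):
        """Run one time update and one measurement update; return the state."""
        z = np.asarray(z, dtype=float).reshape(-1, 1)
        # Time update
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Measurement update
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(len(self.x)) - K @ self.H) @ self.P
        return self.x

kf = kalmanFilter(state_dim=6, measurement_dim=3, dt=1/30, Q=0.001, R=50)
state = kf.predict(np.array([1.0, 2.0, 3.0]))
print(state.ravel())   # six values: filtered position followed by velocity
```

With `R` much larger than `Q`, the filter trusts the measurements only weakly, so the estimate moves toward a new measurement slowly over many frames.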

Please note that the class requires the kalmanFilter class, which I have already shared at the following link: