We have implemented a correlation-based stereo algorithm, following the approach taken by Fua [4]. The algorithm computes similarity scores for every pixel in the image by taking a fixed window in the left image and shifting it along the epipolar line in the right image. The scores are determined using the normalized mean-squared difference of gray levels:
$$
C(x, y, d) = \frac{\sum \big[\, (I_L(x+i,\, y+j) - \bar{I}_L) - (I_R(x+i-d,\, y+j) - \bar{I}_R) \,\big]^2}{\sqrt{\sum (I_L(x+i,\, y+j) - \bar{I}_L)^2 \; \sum (I_R(x+i-d,\, y+j) - \bar{I}_R)^2}}
$$

where $C(x, y, d)$ is the correlation score. The summations are performed over all pixels $(i, j)$ in the correlation window. $I_L$ and $I_R$ are pixels from the left and right correlation windows respectively, $\bar{I}_L$ and $\bar{I}_R$ are their average values over the correlation window, $(x, y)$ are the coordinates of the correlation window in the left image, and $d$ is the disparity at which the comparison is made.
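The score above can be sketched for a single window pair as follows; this is our illustrative NumPy version, not the paper's implementation, and the function name, square-window parameterization, and the convention that a disparity $d$ shifts the right window to the left are our assumptions:

```python
import numpy as np

def nmsd_score(left, right, x, y, d, half):
    """Normalized mean-squared difference between the window centered at
    (x, y) in the left image and the window shifted by disparity d in the
    right image.  `half` is the half-width of the square correlation window.
    Lower scores mean better matches."""
    wl = left[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    wr = right[y - half:y + half + 1, x - d - half:x - d + half + 1].astype(float)
    dl = wl - wl.mean()          # deviations from the left window mean
    dr = wr - wr.mean()          # deviations from the right window mean
    denom = np.sqrt((dl ** 2).sum() * (dr ** 2).sum())
    if denom == 0.0:             # textureless windows cannot be matched
        return np.inf
    return ((dl - dr) ** 2).sum() / denom
```

Subtracting the window means and dividing by the normalization term makes the score insensitive to additive and multiplicative gray-level differences between the two cameras.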
The desired disparity at the given pixel is then the one that provides the minimum correlation score:
$$
D(x, y) = \operatorname*{arg\,min}_{d \in [d_{\min},\, d_{\max}]} C(x, y, d)
$$

where $D(x, y)$ is the disparity map and $[d_{\min}, d_{\max}]$ is the disparity search range.
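Given a precomputed volume of correlation scores, this winner-take-all selection is a one-line reduction; the following sketch (our naming, assuming the score volume is laid out as height x width x disparity) illustrates it:

```python
import numpy as np

def wta_disparity(scores, d_min=0):
    """Winner-take-all disparity selection.  `scores` is an (H, W, D)
    volume of correlation scores for disparities d_min .. d_min + D - 1;
    the chosen disparity at each pixel is the one whose score is minimal."""
    return d_min + np.argmin(scores, axis=2)
```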
The disparity map can be interpreted as the distance from the robot to the objects in the viewed scene, under the assumption of parallel camera image planes. Each disparity is inversely proportional to the distance of the object along the line of sight of each pixel [6]:
$$
Z(x, y) = \frac{b f}{D(x, y)}
$$

where $b$ is the baseline distance between the cameras and $f$ is the focal length of the camera.
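The inverse relationship between disparity and distance can be written as a small helper (the function name and units are ours; baseline in meters and focal length in pixels give a depth in meters):

```python
def depth_from_disparity(d, baseline, focal_length):
    """Distance along the pixel's line of sight, under the parallel
    camera image plane assumption: Z = b * f / d."""
    if d <= 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return baseline * focal_length / d
```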
The performance of the stereo algorithm is improved by temporally extending its results. The speedup is achieved by reducing the amount of searching done along the epipolar lines.
In general, stereo algorithms search for the best match within a fixed disparity range $[d_{\min}, d_{\max}]$. Our algorithm accepts a different disparity search range, $[d_{\min}(x, y), d_{\max}(x, y)]$, for each pixel in the image. The ranges provided are less than or equal to the full disparity range; therefore, the amount of searching performed by the algorithm is reduced.
The disparity ranges are computed from the previous
disparity map and the constraints on the motion of the robot.
The first step in computing the disparity ranges is
to determine how much each pixel can move in the scene, given
the constraints on the motion of the robot. The area of the image
to which the pixel can move will be referred to as the ambiguity area.
Once the ambiguity areas are computed for each pixel, we have determined all points in the scene that the pixels can possibly see. By scanning the ambiguity area in the disparity map it is possible to determine the minimum and maximum disparity that the pixel may have in the next time interval. This disparity range is determined for each pixel, and provided to the next iteration of the stereo algorithm as input.
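The range computation can be sketched as below. This is a simplified illustration under our own assumptions: the ambiguity area is approximated by a square of half-width `radius` pixels, and `d_change` stands in for the largest disparity change the motion constraints allow between frames; the paper's actual areas and bounds depend on the specific robot motion model:

```python
import numpy as np

def disparity_ranges(prev_disp, radius, d_change):
    """Per-pixel disparity search ranges for the next iteration.
    For each pixel, scan its ambiguity area (here a square of half-width
    `radius`) in the previous disparity map to find the minimum and
    maximum disparity the pixel may see, widened by `d_change`."""
    h, w = prev_disp.shape
    d_lo = np.empty_like(prev_disp)
    d_hi = np.empty_like(prev_disp)
    for y in range(h):
        for x in range(w):
            window = prev_disp[max(0, y - radius):y + radius + 1,
                               max(0, x - radius):x + radius + 1]
            d_lo[y, x] = max(0, window.min() - d_change)  # disparities stay non-negative
            d_hi[y, x] = window.max() + d_change
    return d_lo, d_hi
```

Each `[d_lo, d_hi]` pair is then fed to the stereo matcher in place of the full fixed range, so pixels whose neighborhood had stable disparity are searched over only a few candidates.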