Omnidirectional Vision for Mobile Robots

This page will contain materials and progress reports related to my diploma thesis. The task is to enable robot navigation on a checkerboard floor of the Eurobot contest and to detect specific colored objects (skittles) with an omnidirectional visual sensor.

06-02-07 Defended!

The thesis was defended and marked as excellent by both the opponent and the entire committee.

05-12-16 Thesis Online

The master thesis is finished and published online! You can either read the individual chapters or download the entire thesis.

05-09-22 Relevant Papers

Started compiling a summary of all the known Vision-Based Localization - Relevant Papers. Each paper has its own wiki page with an abstract, a short annotation, and linked citations.

05-04-15 Monte Carlo Localization

05-03-30 Input normalization

Definitively sick and tired of the camera’s unpredictable white balancing, I decided to try prepending a color normalization step to the filter chain.

The first filter now processes the captured frame so that all three channels share the same statistics. RGB values are recomputed so that each channel has a mean (average value) of 65 and a standard deviation (square root of variance) of 50. With these values, the RGB cube optimally fills the YCrCb color space:

Histogram of the YCrCb transformed image

Thanks to the normalization, all the tracked colors are always present in the thresholded image (even in the camera startup phase :!:). This makes it possible to track the White playing table borders (which used to be very elusive).

Border peaks Directrices of the border

Thanks to the new mounting of the camera, the whole playing field is visible all the time, and even the Checkerboard Pattern in Eurobot 2005 boundaries are easier to track:

Boundary peaks Directrices of the boundaries

05-03-21 Boundary Transform Evaluation

The current algorithm detects the orientation and shift of the Checkerboard Pattern in Eurobot 2005 boundaries by performing the following filter chain:

The input image is thresholded and the browN-beigE boundary is detected

Input image Thresholded Boundaries

The boundary image is then transformed to the ground plane, where it is summed along 128 different directions,

Transformed Peaks in summed image

and the peaks in the sum image denote orientation and shift of straight lines in the transformed image:


05-03-15 New Vision Framework

The vision testing application has been rewritten to support capturing into several so-called filter chains, each of which consists of several image filters that perform some operation on the image as it passes from the camera through the chain. The image below shows the output of the new BoundaryFilter, which extracts browN / beigE boundary information from the thresholded image.

Screenshot of the BoundaryChain in action
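The filter-chain concept can be sketched with function pointers. The Image type, the filter names, and the in-place convention below are illustrative assumptions, not the actual framework API:

```c
#include <stddef.h>

/* A toy grayscale image and a filter that transforms it in place. */
typedef struct {
    unsigned char *data;   /* w * h pixels */
    int w, h;
} Image;

typedef void (*Filter)(Image *img);

/* A chain is just an ordered array of filters; each filter sees the
 * output of the previous one. */
void run_chain(Image *img, Filter *chain, size_t n)
{
    for (size_t i = 0; i < n; i++)
        chain[i](img);
}

/* Two toy filters: binary threshold, then inversion. */
void threshold_filter(Image *img)
{
    for (int i = 0; i < img->w * img->h; i++)
        img->data[i] = img->data[i] > 128 ? 255 : 0;
}

void invert_filter(Image *img)
{
    for (int i = 0; i < img->w * img->h; i++)
        img->data[i] = 255 - img->data[i];
}
```

A chain such as `{ threshold_filter, invert_filter }` is then executed with `run_chain(&img, chain, 2)`; swapping or inserting filters requires no change to the capture code, which is the point of the design.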

The next step is to sum these boundaries in different directions to get the most probable orientation and shift of the Checkerboard Pattern in Eurobot 2005.

05-03-06 New Beacon Support

Dense transformation of the new camera image Finished the final mounting of the new camera to the Beacon Support with Omnivisual Sensor.

The mount allows fine-tuning of the camera position and orientation, resulting in an evidently more precise shape of the projected surface.

05-02-24 Mirror Projection

First try of dense transformation Example of sparse pixelwise transformation Applied the pixelwise transformation to the camera image to get a sparse matrix of transformed camera pixels. It is encouraging that the computed transformation seems to be correct :-) On the other hand, proper camera mounting is necessary to see the Checkerboard Pattern in Eurobot 2005 as actual squares.

I also tried to make a dense image using a naive interpolation algorithm (sparse pixels are interpolated along horizontal lines only). It looks bad because the pixels are very sparse in the top and bottom parts… It would be necessary to find the nearest pixels in the Euclidean metric.
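The proposed fix, taking the nearest sample in the Euclidean metric, could look like this brute-force sketch (the Sample type and function name are illustrative; a spatial index would be needed to make this fast enough for real-time use):

```c
#include <float.h>

/* One transformed camera pixel: its ground-plane position and value. */
typedef struct { double x, y; unsigned char value; } Sample;

/* Fill a dense w x h grid: each output pixel copies the value of the
 * Euclidean-nearest sparse sample.  O(w * h * n) brute force. */
void fill_nearest(unsigned char *out, int w, int h,
                  const Sample *s, int n)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            double best = DBL_MAX;
            unsigned char v = 0;
            for (int i = 0; i < n; i++) {
                double dx = s[i].x - x, dy = s[i].y - y;
                double d = dx * dx + dy * dy;   /* squared distance */
                if (d < best) { best = d; v = s[i].value; }
            }
            out[y * w + x] = v;
        }
}
```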

Future work:

05-02-17 Mirror Geometry

A simple schematic of the H3G mirror - courtesy of An Excel chart showing some of the reflected rays Performed a few computations to derive the formulas necessary to describe the reflection of a ray corresponding to an image pixel.

I finally assumed that the center of camera projection (the first principal point) is located in the second focus F of the mirror surface (the one not contained in the mirror). As a result, the prolongations of all the reflected rays intersect in the first focal point E.

The intersection of a ray passing from the camera chip towards the mirror with the mirror surface lies at the coordinates [t, F(t)], where F is the mirror surface function and t can be computed as:



Or alternatively (when fixing )
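Since the original formulas are not reproduced above, the following is a hedged reconstruction under standard catadioptric assumptions: a hyperbolic mirror profile with semi-axes a and b, foci at ±c, and a pixel ray of slope k emanating from the second focus. The symbols a, b, c, k are introduced for this sketch only and need not match the thesis notation.

```latex
% Hyperbolic mirror profile and its focal distance:
F(t) = a\sqrt{1 + \frac{t^2}{b^2}}, \qquad c = \sqrt{a^2 + b^2}

% A pixel defines a ray of slope k through the camera center (0, -c):
z(t) = k t - c

% Intersecting z(t) = F(t) and squaring yields a quadratic in t:
\left(k^2 - \frac{a^2}{b^2}\right) t^2 - 2 k c\, t + b^2 = 0

% Using c^2 - a^2 = b^2, the discriminant collapses to a^2 (1 + k^2):
t = \frac{k c \pm a\sqrt{1 + k^2}}{k^2 - a^2/b^2}
```

The ± sign selects the physically relevant intersection on the mirror lobe actually hit by the ray.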

05-02-01 Meeting

Today’s meeting brought some specifications for the corner detection algorithm (checkerboard processing).

  • Using browN/beigE boundary pixels to locate edges and corners.
  • Transformation of such pixels into plane using a predicted camera location.
  • If location is well predicted, boundary pixels make rows and columns → SUM them.
  • Fix rotation error either using:
    1. repetition of the same transform using different alpha,
    2. estimated alpha from the location of static beacons.

Yet another possible approach:

More notes:

  1. You almost never get a whole square segment (it is either cut off or merged with a neighbor).
  2. Use information of the white border - to stop processing squares?
  3. Add another segmentation color (split White to “pure White”/graY).

05-01-23 Thresholding Visualization

Part of the color space defined by the YUV thresholds All the 6 tracked colors displayed after thresholding Today I just created pieces of code to better understand yesterday’s work :-)

The code performs all 6 thresholding operations at once (as described earlier) and shows the image in the six representative colors. The fact that the different cones in YUV space overlap is not a big problem because all 6 classes are maintained, i.e. when searching for browN and beigE, it doesn’t matter that some of the beigE is classified as White too.

From the right image it is obvious how the YUV space is oriented within the RGB one. The bigger window contains a UV plane for a fixed Y, while the upper one contains the result of the segmentation process applied to that UV plane.

The color definitions will definitely be tuned further, but I think this can be postponed until the final cameras arrive, because I fear that the colors will be a bit different.

05-01-23 Color Classification

Today’s testing was based on the CMVision idea:

  1. The image is transformed to the YUV (brightness, chromaticity) color space, where an object color can be described as a cone with a wide span in the brightness dimension.
  2. The cone is described as an intersection of three subspaces described by three binary tables.
  3. If 32 such binary tables are stored in different bits of one int table, one can track at most 32 different colors at a time using just two binary AND operations on every pixel.
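The bit-parallel classification above can be sketched directly. This is a minimal sketch; the table names are illustrative, and the classes here are axis-aligned boxes rather than the calibrated cones, but the lookup mechanism is identical:

```c
#include <stdint.h>

enum { LEVELS = 256 };

/* For every channel value, each table stores a 32-bit mask with bit k
 * set iff color class k accepts that value.  A pixel belongs to every
 * class whose bit survives the two ANDs. */
uint32_t y_tab[LEVELS], u_tab[LEVELS], v_tab[LEVELS];

/* Register class `bit` as the box [y0,y1] x [u0,u1] x [v0,v1]. */
void add_color_class(int bit, int y0, int y1, int u0, int u1, int v0, int v1)
{
    for (int i = y0; i <= y1; i++) y_tab[i] |= 1u << bit;
    for (int i = u0; i <= u1; i++) u_tab[i] |= 1u << bit;
    for (int i = v0; i <= v1; i++) v_tab[i] |= 1u << bit;
}

/* Classify one YUV pixel: all 32 classes at once, two binary ANDs. */
uint32_t classify(int y, int u, int v)
{
    return y_tab[y] & u_tab[u] & v_tab[v];
}
```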

beigE squares highlighted by a YUV thresholding The test code (color_segmentation.c) allows the area in the YUV space to be calibrated interactively. The left mouse button adds the selected pixel to the set, while the right mouse button removes all three associated coordinates. The visually selected region is then manually modified to form a cone. The selected areas will be described as (y, u, v, y_span, u_span, v_span), which makes it possible to automatically adjust the span tolerances in reaction to ambient conditions.


  • Blue, Red, beigE and browN – perfect sharp recognition (although beigE and browN result in Red when blurred together)
  • Green and White – poor recognition (Green is very dark - dark Blue and blacK are often misclassified as Green; the chromaticity of the White stripe differs between Green and Red skittles, so the resulting cone is very big)


  1. Mask out the camera, robot and other static stuff.
  2. Color segmentation – all colors at once.
  3. Detect and process floor segments (Blue, beigE, browN).
  4. Crop the skittle segments far beyond table.
  5. Process individual skittle segments (Red, Green) and find an adjacent stripe (White) if any.

Future work:

  • Extract color information to enable auto-calibration.
  • Color thresholding to segments (maybe using CMVision).

05-01-23 Pyramid Segmentation

Pyramid segmentation marking a red skittle Today I tested the OpenCV cvPyrSegmentation function. This function combines color thresholding with descending an image pyramid to create initial connected segments. It then merges these into larger connected components using a second threshold.


  • Supports both 1-channel and 3-channel images (incorporating color-perception weights into the distance function).
  • Returns a list of all the connected components – FPS dramatically decreases for a small threshold and thus a large number of components.
  • The resulting segment color as well as the size of the connected component varies a lot from frame to frame – segment color classification will be difficult.


  • ROI mask to speed up the process throwing away uninteresting pixels.
  • Good illumination, fine-tuned thresholds (maybe individual for different tasks – skittles/table).
  • Merging (or at least preserving) information across frames.
  • Better color classification for joining segments.

Pyramid segmentation applied on the hue channel I tried to improve the results using cvCvtColor. The idea is that if the game elements are strongly differentiated in color, I can throw away the brightness and saturation information and process only the hue channel at a much higher frame rate.

In fact, the speed-up is the only positive of this approach. The frame rate climbed to 4 fps (from around 2.5 fps for a 3-channel source), which is still not really impressive anyway.

Although the red skittles are pretty distinctive, the bad news is that the green skittle color (dark green-yellow) is very close to the floor colors (beigE and browN become yellow and orange-yellow respectively). It is almost impossible to get any useful results by adjusting the thresholds.
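The hue-only idea boils down to the standard hexcone conversion, sketched below (this is not the OpenCV internal code). It also illustrates why the green skittles collapse onto the floor: beigE, browN and dark green-yellow all land in the narrow yellow band of the hue circle.

```c
/* Compute hue in degrees [0, 360) from 8-bit RGB using the standard
 * hexcone formula; returns -1 for achromatic pixels (r == g == b),
 * where hue is undefined. */
double rgb_to_hue(int r, int g, int b)
{
    int max = r > g ? (r > b ? r : b) : (g > b ? g : b);
    int min = r < g ? (r < b ? r : b) : (g < b ? g : b);
    if (max == min)
        return -1.0;                   /* gray: hue undefined */
    double d = max - min, h;
    if (max == r)      h = (g - b) / d;          /* between yellow & magenta */
    else if (max == g) h = 2.0 + (b - r) / d;    /* between cyan & yellow    */
    else               h = 4.0 + (r - g) / d;    /* between magenta & cyan   */
    h *= 60.0;                         /* sextant -> degrees */
    return h < 0.0 ? h + 360.0 : h;
}
```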

Future work:

05-01-13 Localization Algorithm Outline

Meeting with Zbynek Winkler.
Created the outline of the localization algorithm:

  1. Apply a ROI mask to the omnidirectional image.
  2. Perform the color-based segmentation:
  3. Recalculate the segment coordinates using a predefined lookup table.
  4. The centroids of floor segments now make a rectangular grid …
  5. … match the grid with a known map (using a predicted viewpoint).
  6. Update the lookup table to minimize the grid-matching error.
  7. Update the current viewpoint.

05-01-13 Skittle Detection Algorithm Outline

Based on the e-mail discussion transcribed below, the skittle detection algorithm evolved into the following steps:

  1. Perform color-based segmentation
  2. A skittle consists of 2 or 3 individual segments:
    • Red/green body and white stripe (standing upside-down or lying with the screws towards the camera)
    • Red/green head and body separated by a white stripe (standing upright or lying with the head towards the camera)
  3. Identify the corresponding segments (a heuristic is needed to join neighboring segments - consider the direction of the principal body axis)
  4. Decide if the skittle is standing (consider the direction from the camera - both panoramic and omnidirectional):
    • 2 segments - white stripe first, then the body (upside down)
    • 3 segments - larger (rectangular) body first, then the white stripe, finally the smaller (elliptical) head (on screws)


Mail Archive

Computer Vision Libraries

  • OpenCV (home) — Camera calibration, capture, color conversions etc. (C, Windows / Linux).
  • CMVision (home) — Realtime color segmentation (C++, Linux).
  • LTI-lib (home) — Corner detection etc. (C++, Windows / Linux).

Other Material

  omni/start.txt · Last modified: 2006/02/21 17:41