Figure 19 Stereo vision
Many years ago, Julesz used `random dot stereograms' to show that the brain can interpret stereograms in which each monocular image by itself lacks any recognizable content (see [Julesz 60] and [Julesz 78]). Considered alone, each of the left and right images appears to be completely random and contains no information. The 3D effect emerges from the interaction of the two images inside the brain, where the stereo disparity in the random patterns is analyzed and converted into spatial sensations. Random dot stereograms are very easy to generate on a computer system and do not have the disadvantages of photographs, such as imprecision and alignment problems. Nor do they require edge detection, since random dot stereograms consist of edges only.
Generation of random dot stereograms
1. Fill the left and right images with identical random values
2. Raising or lowering areas
An object is raised to a given height by shifting it, in the right image only, a corresponding number of pixels to the left (or lowered to a given depth by shifting it to the right). The gap created by the shift is then filled with a new random dot pattern. The left image remains unchanged. This step is repeated for every area that is to be raised or lowered relative to the original image plane.
TYPE b_image = grid OF BOOLEAN;
     g_image = grid OF [0..255];
     i_image = grid OF INTEGER;
...
PROCEDURE generate_random_dot(VAR l_pic, r_pic: b_image);
VAR i, num, from_x, from_y, to_x, to_y, shifts: INTEGER;
BEGIN
  l_pic := RandomVB();                     (* random vector boolean *)
  r_pic := l_pic;
  WriteString("Number of Areas to Elevate: "); ReadInt(num);
  FOR i := 1 TO num DO
    WriteString("Area: from-x from-y to-x to-y shifts ");
    ReadInt(from_x); ReadInt(from_y);
    ReadInt(to_x); ReadInt(to_y);
    ReadInt(shifts);
    IF (from_y <= DIM(image,2) <= to_y) AND
       (from_x - shifts <= DIM(image,1) <= to_x) THEN
      r_pic := MOVE.left:shifts (r_pic)    (* move rectangle *)
    END;
    IF (from_y <= DIM(image,2) <= to_y) AND
       (to_x - shifts <= DIM(image,1) <= to_x) THEN
      r_pic := RandomVB();                 (* fill gap *)
    END;
  END;
END generate_random_dot;
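The two generation steps can also be sketched sequentially in plain Python. This is a minimal illustration, not a translation of the listing above: the function signature, the list-of-rows image representation, the inclusive area bounds, the `seed` parameter, and the handling of negative shifts (lowering) are all assumptions made for this example.

```python
import random


def generate_random_dot(width, height, areas, seed=None):
    """Return (left, right) images as lists of rows holding 0/1 dots.

    areas is a list of (from_x, from_y, to_x, to_y, shifts) tuples with
    inclusive bounds. A positive shift raises the area (it moves left in
    the right image); a negative shift lowers it (it moves right).
    """
    rng = random.Random(seed)
    # Step 1: fill both images with identical random values.
    left = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    right = [row[:] for row in left]

    # Step 2: shift each area in the right image only.
    for from_x, from_y, to_x, to_y, shifts in areas:
        for y in range(from_y, to_y + 1):
            src = right[y][from_x:to_x + 1]  # snapshot before overwriting
            for k, value in enumerate(src):
                x = from_x - shifts + k
                if 0 <= x < width:
                    right[y][x] = value
            # Refill the strip uncovered by the move with fresh random dots.
            if shifts >= 0:
                gap = range(to_x - shifts + 1, to_x + 1)
            else:
                gap = range(from_x, from_x - shifts)
            for x in gap:
                if 0 <= x < width:
                    right[y][x] = rng.randint(0, 1)
    return left, right


left, right = generate_random_dot(20, 10, [(8, 2, 14, 6, 3)], seed=1)
```

In this call, columns 8 to 14 of rows 2 to 6 are raised by three levels, so in those rows `right[y][5:12]` holds the same dots as `left[y][8:15]`, while the left image is untouched.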
There are quite a number of techniques for viewing stereograms, all of which ensure that the left eye sees only the left picture and the right eye sees only the right picture. Examples are prisms, LCD shutter displays, polarization filters, red/green filters, head-mounted displays (two screens), or simply concentrating on each image with one eye. Red/green glasses (the anaglyph method) are the easiest option to implement. First, the left image is colored green/white and the right image red/white. Then the images are printed on top of each other, but slightly out of alignment, so that the displacement cannot be recognized without the glasses. Wherever two colored pixels coincide, the result should appear on paper or on the screen as a black pixel.
The red filter in front of the left eye lets only red light pass, so green and black points appear dark while white and red points appear light. This corresponds exactly to the original left image. The green filter in front of the right eye lets only green light pass, so red and black pixels appear dark. The green image is thereby filtered out of the combined stereogram, leaving the original right image (see Figure 20).
Figure 20 Effect of stereo glasses
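The color logic of the anaglyph method can be checked with a small Python sketch. It is an illustration under simplifying assumptions: pixels are represented by color names rather than real color values, the two images are overlaid without the slight misalignment mentioned above, and the function names `make_anaglyph` and `view_through` are hypothetical.

```python
def make_anaglyph(left, right):
    """Overlay two dot images (1 = dot, 0 = background) into one color image.

    Left dots are printed green, right dots red; where both coincide,
    the pixel is printed black.
    """
    colors = {(0, 0): "white", (1, 0): "green", (0, 1): "red", (1, 1): "black"}
    return [[colors[(l, r)] for l, r in zip(lrow, rrow)]
            for lrow, rrow in zip(left, right)]


def view_through(anaglyph, filter_color):
    """Return the dot image seen through a red or green filter (1 = dark)."""
    # A red filter passes red and white light, so green and black look dark;
    # a green filter passes green and white light, so red and black look dark.
    dark = {"red": {"green", "black"}, "green": {"red", "black"}}[filter_color]
    return [[1 if color in dark else 0 for color in row] for row in anaglyph]


left = [[0, 1, 1, 0]]
right = [[0, 0, 1, 1]]
anaglyph = make_anaglyph(left, right)
left_eye = view_through(anaglyph, "red")     # recovers the left image
right_eye = view_through(anaglyph, "green")  # recovers the right image
```

Viewing the overlay through the red filter gives back the left image and through the green filter the right image, which is the separation the glasses are meant to achieve.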
The reverse calculation from the stereograms (i.e. regenerating the depth information from the left and right images) is much more difficult and computationally expensive than generating the stereograms. In addition, the reverse calculation can never be 100% correct. Usually, some individual points are assigned false depth values.
Analysis of random dot stereograms
(Reverse calculation of depth information from stereograms)
1. Comparison of the right image to the left image and search for matching pixels
A pair of corresponding pixels (one pixel from left and right image each, at the same position) match if both pixels are black or both are white. This operation can be performed in parallel on all pixels.
2. Shifting of the left image one pixel to the left and comparison with the right image
This shift step with subsequent comparison of left and right image (see step 1) is carried out iteratively for each depth level. This gives the matching pixels for all depth levels.
3. Determining the depth of each pixel
This step is also carried out iteratively for each depth level, but in parallel for all pixels. Data from a 5 x 5 local neighbor field (only a 3 x 3 field is shown in the figure) with eight-way grid connections is used to determine the matching neighborhood for all pixels. Pixels not matching receive the value zero, while pixels matching receive the sum of all neighbor pixels that also match, plus one for themselves.
4. Selection of the best fitting level (depth) for each pixel
The level (depth) that gives the largest matching sum for each pixel is selected as the level for that pixel (in the case of equal values, the lower level is selected). If the neighborhood values for all levels are zero, the level of one of the neighbor pixels is chosen. This computation can be done in parallel for all pixels.
The data from all the levels can be combined into a depth image (`false color image'), as shown here. For each pixel, its color represents its depth. The solution found in the example is not entirely correct, since errors at isolated pixels cannot be avoided with this procedure. The correct depth image is shown in parentheses below it.
5. Filter (optional)
In order to eliminate single pixel errors, a local 3 x 3 matrix can be used in each level, to sum the number of pixels already in that level. Pixels having too few neighbors in this level (number < threshold) are assigned to another level. The filter operation can be performed in parallel for all pixels.
The algorithm for analyzing random dot stereograms shown here can very easily be translated into a data parallel program. Function sum_3x3 returns the sum of a local 3x3 neighborhood.
PROCEDURE analyze_random_dot(l_pic, r_pic: b_image;
                             steps: CARDINAL; VAR elev: g_image);
VAR equal          : b_image;
    level, maxlevel: i_image;
    i              : INTEGER;
  PROCEDURE sum_3x3(val: i_image): i_image;
  VAR l_r: i_image;
  BEGIN
    l_r := val + MOVE.left(val) + MOVE.right(val);   (* horiz. sum *)
    RETURN( l_r + MOVE.up(l_r) + MOVE.down(l_r) );   (* vert. sum *)
  END sum_3x3; (* cumulative *)
BEGIN
  elev := 0;
  maxlevel := 0;
  FOR i := 0 TO steps DO
    equal := l_pic = r_pic;            (* matching pixels at level i *)
    level := sum_3x3( ORD(equal) );    (* count matches in 3x3 neighborhood *)
    (* find image plane with max. value *)
    IF equal AND (level > maxlevel) THEN
      elev := i;
      maxlevel := level;
    END;
    l_pic := MOVE.left(l_pic);         (* move image *)
  END;
END analyze_random_dot;
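The same analysis loop (steps 1 to 4) can be sketched sequentially in plain Python. The sketch follows the structure of the listing above with a 3x3 neighborhood, but several details are assumptions made for this example: images are lists of rows, pixels shifted in from outside the image never match, and neighborhoods are clipped at the image border. The fallback of step 4 for pixels that match at no level is not included; such pixels simply keep level 0.

```python
def analyze_random_dot(left, right, steps):
    """Estimate a depth level in 0..steps for every pixel of a stereogram."""
    height, width = len(left), len(left[0])
    elev = [[0] * width for _ in range(height)]  # best level found so far
    best = [[0] * width for _ in range(height)]  # its neighborhood score

    for level in range(steps + 1):
        # Steps 1 and 2: compare the right image with the left image
        # shifted `level` pixels to the left.
        equal = [[x + level < width and left[y][x + level] == right[y][x]
                  for x in range(width)] for y in range(height)]
        for y in range(height):
            for x in range(width):
                if not equal[y][x]:
                    continue  # non-matching pixels score zero
                # Step 3: count matching pixels in the 3x3 neighborhood,
                # including the pixel itself.
                score = sum(equal[ny][nx]
                            for ny in range(max(0, y - 1), min(height, y + 2))
                            for nx in range(max(0, x - 1), min(width, x + 2)))
                # Step 4: the strict comparison keeps the lower level on ties.
                if score > best[y][x]:
                    best[y][x] = score
                    elev[y][x] = level
    return elev


# A left image with a period-4 dot pattern, and a right image in which
# the whole pattern is shifted two pixels to the left.
left = [[0, 0, 1, 1] * 3 for _ in range(5)]
right = [row[2:] + [0, 0] for row in left]
elev = analyze_random_dot(left, right, 4)
```

For this regular test pattern every interior pixel is assigned level 2. With genuinely random dots, chance matches at other levels produce the isolated errors described above.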
Figure 21 shows a sample stereogram and the corresponding calculated depth
information (containing a few inevitable defects) together with a subsequent
smoothing using a 3x3 median filter operation.
[Figure 21 panels: generated random dot stereogram (left image, right image); computed depth image; filtered depth image]
Figure 21 Random dot stereogram with calculated depth information
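A 3x3 median filter of the kind used for the smoothing in Figure 21 can be sketched in a few lines of Python. This is an illustrative version only: the function name `median_filter_3x3` is hypothetical, and leaving the border pixels unchanged is an assumption, since the text does not say how the border is treated.

```python
def median_filter_3x3(depth):
    """Replace each interior pixel by the median of its 3x3 neighborhood."""
    height, width = len(depth), len(depth[0])
    out = [row[:] for row in depth]  # border pixels are copied unchanged
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = sorted(depth[ny][nx]
                            for ny in (y - 1, y, y + 1)
                            for nx in (x - 1, x, x + 1))
            out[y][x] = window[4]  # middle of the nine sorted values
    return out


# A flat depth image with one wrongly assigned pixel in the center.
depth = [[0] * 5 for _ in range(5)]
depth[2][2] = 3
smoothed = median_filter_3x3(depth)
```

The single outlier is removed because eight of the nine values in its neighborhood belong to level 0, whereas a pixel inside a larger raised area keeps its level.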
Figure 22 shows the generation of a random dot stereogram with 100x100 pixels and the subsequent computation of the corresponding depth image. The first half of the diagram corresponds to the stereogram generation. The PE activation decreases, since in this sample run the image area being shifted shrinks as well. The second half of the diagram shows continuously high processor utilization, with the individual peaks corresponding to the iterations of the main analysis loop, which comprises steps 2 to 4 of the algorithm above.
Figure 22 Processor utilization of random dot stereogram program