Now a quick look at four 3D systems.
Stereo Vision

Stereo vision uses the principle of human binocular vision. With our two eyes, placed a certain distance apart and viewing the same scene from two slightly different angles, we are able to judge how far away objects in the scene are. Using the same principle to measure distances, and hence to record 3D information of natural scenes, has intrigued people for a very long time. The working principle was already recognized by the Greek mathematician Euclid around 300 BC, so the mathematics, or at least its basic part, has been around for a very long time. Stereo photography, the recording and displaying of images such that the human viewer experiences depth, has been popular for almost 100 years now.

However, using a computer to reconstruct depth from a stereo image pair is a totally different ballgame. One part is a trigonometric calculation (triangulation), which is a trivial task for a computer. But before this calculation can take place, one has to determine "corresponding pixels" in the two camera images, i.e. find the part of the left image that corresponds to a part of the right image. This is one of the many image processing tasks that our brains are excellent at. Today there is a range of algorithms which, together with quite powerful PCs, can perform this 3D reconstruction sufficiently fast and robustly. This has not been true for very long, though. I don't have an exact year here, but digital stereo vision originates from the early 1980s. Both PCs and algorithms have evolved quite a bit since then, and there are both algorithms and suitable hardware to make stereo vision useful in today's industry.

A stereo vision system typically reaches a depth resolution of WD/100. The frame rate is typically below 20 fps for 1.3 Mpixel image sensors. Since stereo vision typically delivers a 3D point for only about 80% of the sensor pixel pairs, a stereo vision system delivers about 20 × 1.3×10⁶ × 0.8 ≈ 2.1×10⁷ 3Dp/s.
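To make the triangulation step concrete, here is a minimal Python sketch of depth from disparity for an ideal, rectified stereo pair, together with the 3D point rate arithmetic from above. The focal length and baseline values are hypothetical examples, not figures from any particular system.

    # A minimal sketch of the triangulation step for an ideal, rectified
    # stereo pair. Depth follows Z = f * B / d, where d is the disparity
    # between corresponding pixels. The focal length and baseline below
    # are hypothetical example values.

    def depth_from_disparity(d_pixels, f_pixels=1400.0, baseline_m=0.10):
        """Depth in metres from disparity in pixels."""
        if d_pixels <= 0:
            return None  # no valid correspondence for this pixel pair
        return f_pixels * baseline_m / d_pixels

    # 3D point rate from the figures above: 20 fps, 1.3 Mpixel sensors,
    # valid correspondences for ~80% of the pixel pairs.
    rate_3dp = 20 * 1.3e6 * 0.8
    print(f"Z at 35 px disparity: {depth_from_disparity(35.0):.2f} m")
    print(f"approx. {rate_3dp:.1e} 3D points per second")  # ~2.1e+07

Note that the hard part, finding the disparity d for each pixel pair, is exactly the correspondence problem described above; the sketch only covers the easy triangulation step.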
“Sheet Of Light” (SOL)
The SOL technique is based on triangulation, as is (partly) stereo vision, but instead of using two cameras, SOL uses one camera and one (laser) light source. The rays of the light source are (ideally) located in a single plane, or sheet, hence the name Sheet Of Light. The simplest way to create such a sheet is to use a spot laser and let it shine through a cylindrical lens: when the narrow, cylindrically shaped beam passes through the lens, it is transformed (re-shaped) into a narrow triangular sheet (see Figure 1). Today, special optics are available which perform this transformation such that we get a high-quality light sheet, narrow and flat and with homogeneous intensity.

The light sheet is directed towards the object we want to measure, and we typically observe a bright line on that object. Knowing the location and orientation of the light sheet relative to the location and orientation of the camera, it is easy to reconstruct where in space the different parts of this line are located (using triangulation). Hence, we can reconstruct the 3D shape of the object, albeit only along that very line. To get the shape of the whole visible part of the object, we need to repeat the same action after e.g. translating the object (or the laser-and-camera setup) over a series of short distances. These short distances are typically measured very accurately by means of an encoder. With this background, it might not be surprising that SOL typically reaches a depth resolution of about WD/1000 (!), and with a data rate of 10k 3D profiles/s where each profile contains 2k 3D points, a speed of 2×10⁷ 3Dp/s is achieved.
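As a rough illustration of the per-profile triangulation, the following Python sketch converts the detected laser-line row in each sensor column into a height value. The simplified geometry (a single triangulation angle theta) and all parameter values are assumptions for illustration only, not taken from a real SOL system.

    import math

    # A simplified sheet-of-light model: the camera views the laser line
    # at triangulation angle theta, so a line displacement on the object
    # maps to height as z = displacement / tan(theta). Pixel pitch,
    # magnification, angle and the example rows are hypothetical values.

    def height_from_row(row, ref_row=512, pixel_pitch_m=5e-6,
                        magnification=0.1, theta_deg=30.0):
        displacement_m = (row - ref_row) * pixel_pitch_m / magnification
        return displacement_m / math.tan(math.radians(theta_deg))

    # One 3D profile: the detected laser-line row in each sensor column.
    # An encoder supplies the y-position of each successive profile.
    profile_rows = [512, 515, 530, 540, 512]
    profile_z = [height_from_row(r) for r in profile_rows]
    print([f"{z * 1e3:.2f} mm" for z in profile_z])

Stacking such profiles at the encoder-measured positions yields the full 3D shape of the visible surface.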
“Time Of Flight” (TOF)
Time Of Flight, or TOF, is the youngest of the four techniques discussed here; the first TOF products came on the market a few years after the millennium shift. TOF uses the principle of measuring the time it takes for light to travel from the camera to an object and back to the camera. Since it takes only about 3.3 ns for light to travel 1 m in air, the electronics handling this "measurement" need to be very fast. Fast electronics for the controllable LED illumination and fast signal-processing electronics, in combination with dedicated CMOS image chips, are what have enabled this technique. The principle is based on the technique used in LIDARs (Light Detection And Ranging), but instead of scanning a single laser beam and using a single detector, TOF uses a matrix sensor and an LED source to illuminate and capture the complete scene at once. TOF thus eliminates the need for scanning. The illumination source for TOF is typically a NIR LED (Near-InfraRed LED), and TOF can measure distances to the various objects in the scene (3D) as well as generate gray-value images (2D). A TOF system reaches a depth resolution of about WD/100 and a frame rate typically below 20 fps for 0.3 Mpixel images, i.e. about 0.6×10⁷ 3Dp/s.
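The underlying distance relation is d = c·t/2, where t is the measured round-trip time and the factor 2 accounts for the light travelling out and back. A minimal Python sketch:

    # Time-of-flight relation: distance d = c * t / 2, where t is the
    # measured round-trip time and the factor 2 accounts for the light
    # travelling out and back.

    C = 299_792_458.0  # speed of light in m/s

    def distance_from_round_trip(t_seconds):
        return C * t_seconds / 2.0

    # 1 m of distance corresponds to ~6.67 ns round trip, i.e. the
    # ~3.3 ns per metre travelled mentioned above.
    print(f"{distance_from_round_trip(6.67e-9):.3f} m")  # ~1.000 m

The nanosecond time scale of this measurement is precisely why the fast dedicated electronics mentioned above are required.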
Structured Light

This principle is closely related to SOL, but instead of using a light sheet, a projector (typically a DLP projector) projects alternating bright and dark stripes onto the object. It is powerful DLP projectors that have made this technique a valid 3D system candidate; such projectors have been on the market since around the year 2000, and hence so has this technique. Here it is the transition between a bright and a dark stripe (the edge) that the algorithms search for, instead of the light line in the case of SOL. The originally straight edge is deformed by the shape of the object, and as with SOL, triangulation is used for the 3D reconstruction. One benefit of using a DLP projector rather than a fixed light line is that the projector, being able to project almost any pattern onto the target, can move this edge over the object, so the mechanical translation or motion required by SOL is not needed.

Thanks to the flexibility the DLP projector introduces, structured light uses much smarter patterns than a simple edge that appears to translate over the object. Typically, a series of patterns is projected one after another (see the sketch below), starting with wide bright and dark stripes that give a few edges equidistantly distributed over the object, and then systematically increasing the number of stripes, and hence edges, still equidistantly distributed over the object. In this way it is possible to extract 3D information at a low spatial resolution first (only large variations), and then increase the spatial resolution up to the required level. By applying clever projector patterns and matching algorithms for edge detection, structured light can detect the edges (transitions) with sub-pixel precision. A structured light system typically reaches a depth resolution of about WD/500. For a sensor resolution of 1.3 Mpixels and a 3D frame rate of 20 fps, the 3D point speed is 2.6×10⁷ 3Dp/s.
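As an illustration of such a coarse-to-fine stripe series, here is a Python sketch using Gray-coded binary stripes, a common choice for this kind of pattern sequence (the text above does not prescribe a specific code, so this is an assumption). Each projector column receives a unique bit sequence across the patterns; decoding the sequence observed at a camera pixel identifies the projector column, which then feeds the triangulation.

    # Gray-coded stripe patterns: pattern b is bright (1) or dark (0)
    # per projector column; the stripe width halves from one pattern to
    # the next, doubling the number of edges.

    def gray_code_patterns(n_bits, n_columns):
        patterns = []
        for b in range(n_bits):  # coarse to fine stripes
            pat = [((col ^ (col >> 1)) >> (n_bits - 1 - b)) & 1
                   for col in range(n_columns)]
            patterns.append(pat)
        return patterns

    def decode_column(bits):
        """Recover the projector column from the bit sequence observed
        at one camera pixel (Gray code back to binary)."""
        gray = 0
        for b in bits:
            gray = (gray << 1) | b
        col, shift = gray, 1
        while gray >> shift:
            col ^= gray >> shift
            shift += 1
        return col

    pats = gray_code_patterns(n_bits=4, n_columns=16)
    seen = [pats[b][9] for b in range(4)]  # bits seen at one pixel
    print(decode_column(seen))             # -> projector column 9

Gray codes are a popular choice here because neighbouring columns differ in a single bit, which makes the decoding tolerant to small edge-localization errors.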