METHOD AND SYSTEM FOR ACQUIRING A THREE-DIMENSIONAL SHAPE DESCRIPTION

Info

Publication number: 20020057832
Type: Application
Filed: Mar 16, 1999
Publication Date: May 16, 2002
Patent Grant number: 6510244
Inventors: MARC R.A.B. PROESMANS (LEDE), LUC S.J. VAN GOOL (ANTWERPEN), ANDRE J.J. OOSTERLINCK (LOVENJOEL), FILIP P. DEFOORT (ZWEVEGEM)
Application Number: 09202184

Abstract

Method for acquiring a three-dimensional shape or image of a scene, wherein a predetermined pattern of lines is projected onto the scene and the shape is acquired on the basis of relative distances between the lines and/or intersections of the lines of the pattern.

Description

[0001] The use of three-dimensional images or shapes is becoming increasingly important, also for new applications, such as showing objects on the Internet, virtual reality, the film industry and so on.

[0002] Three-dimensional shape descriptions are usually acquired with two or more cameras or using laser systems. Existing systems may be expensive, may require a time-consuming calibration procedure and may be difficult to transport. In some cases the scene must be static, the working memory volume for the scene is fixed in advance, or insufficient information concerning the topology of the shape is obtained.

[0003] An example of related research is an article by A. Blake et al., “Trinocular active range-sensing”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, No. 5, pp. 477-483, 1993. According to this method two series of parallel lines are projected sequentially. The lines in the image can be identified by ensuring that the projected lines and epipolar lines intersect laterally. The lines therefore cannot be placed too close together.

[0004] Another example is an article by Vuori and Smith, “Three dimensional imaging system with structured lighting and practical constraints”, Journal of Electronic Imaging 6(1), pp. 140-144, 1997. According to this method a single pattern is projected but limitations are imposed on the dimensions of the scene for recording and on the relative distances between the pattern elements. This makes it possible to predict in which part of the image plane each individual pattern element is projected.

[0005] In addition, known systems and methods always start from absolute measurements and dimensions, which is not always required, particularly not when the application is aimed at showing and not at measuring the objects.

[0006] The present invention provides a method for acquiring a three-dimensional shape or image of a scene, wherein a predetermined pattern is projected onto the scene and the shape is acquired on the basis of the relative deformations of the pattern as detected in one or more images.

[0007] The invention further provides a system for acquiring a three-dimensional shape or image of a scene, comprising:

[0008] at least one pattern generator for projecting a pattern on the scene;

[0009] at least one camera for making an image or images of the scene;

[0010] computing means for determining the shape of the scene.

[0011] The required hardware is simple and the system according to the present invention is therefore not very expensive.

[0012] According to the present invention it is no longer required to individually identify components of the projected pattern, such as lines or dots, either via a code in the pattern itself or via limitations imposed by the scene and/or the arrangement or a priori knowledge in respect thereof. In the method according to the present invention the shape can be acquired via the relative positions of the components in the pattern, for instance by determining the relative sequence of pattern lines.

[0013] According to the present invention a single image suffices to acquire a three-dimensional image or a three-dimensional shape description, and the acquisition time required is no longer than that needed to make a single image.

[0014] Because the recording time of a single image can remain limited and since reconstructions can be performed for a quick succession of different images, moving objects and their three-dimensional movements and changes of shape can be reconstructed.

[0015] According to the present invention the calibration procedure can be kept simple and few limitations need be imposed on the relative position of the pattern generator(s) and camera(s).

[0016] According to the present invention the projected pattern can be kept very simple and strongly interlinked, such as for instance a grid of lines. Owing to the simplicity the resolution of the pattern and thereby of the three-dimensional reconstruction can be kept high. Due to the strong interlinking of the pattern it becomes possible to determine the correct surface topology explicitly as a component of the output.

[0017] According to the present invention the absolute position of components of the pattern within the entirety of the pattern is of secondary importance. Three-dimensional shape descriptions can be generated without such information, except for a scale factor, by assuming a (pseudo-)orthographic model for the geometry of pattern projection and image formation. Perspective effects in this geometry are preferably kept limited.

[0018] According to the present invention both the three-dimensional shape and the surface texture of scenes can be determined. It is herein possible to eliminate problems of alignment of shape and texture since shape and texture are determined using the same camera(s). The final output, which can be a three-dimensional shape description with or without texture, can of course be made in any desired format, including formats which are for instance suitable for graphic work-stations or Internet.

[0019] According to a preferred embodiment the surface texture can moreover be acquired from the same image that is used to determine the three-dimensional shape. A similar method can also be extended to cases where a pattern is employed with absolute definition of the components. It will be apparent that the procedure for acquiring coloured or multi-spectral surface texture, i.e. built up of different spectral base images, does not differ essentially from the procedures for one base image.

[0020] According to a preferred embodiment the projected pattern is regular and composed of similar basic shapes, such as rectangles or squares, formed by the composition of two mutually perpendicular series of parallel equidistant lines.

[0021] According to a preferred embodiment calibration takes place by showing the system two planes between which the angle is known, for instance a right angle. The parameters required for the shape definition can be determined herefrom, such as the angle of reflection of the projected rays relative to the image plane and, if necessary, the height/width ratio of the pixels.
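One ingredient of such a calibration can be sketched in code: fit a plane to each of the two reconstructed surface patches and measure the angle between them, so that system parameters can be adjusted until the measured angle matches the known one. This is a minimal sketch and not the inventors' procedure; the function names `fit_plane_normal` and `angle_between_planes_deg` and the least-squares (SVD) plane fit are assumptions made for illustration.

```python
import numpy as np

def fit_plane_normal(points):
    """Return the unit normal of the least-squares plane through an (N, 3) point set."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, i.e. the plane normal.
    _, _, vh = np.linalg.svd(centered)
    return vh[-1]

def angle_between_planes_deg(points_a, points_b):
    """Angle in degrees (0 to 90) between the planes fitted to two point sets."""
    n_a = fit_plane_normal(points_a)
    n_b = fit_plane_normal(points_b)
    # abs() makes the result independent of the sign of each normal.
    cos_angle = np.clip(abs(np.dot(n_a, n_b)), 0.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))
```

A calibration loop could then vary a candidate parameter, reconstruct both planes, and keep the value for which the measured angle is closest to the known one.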

[0022] Further advantages, features and details of the present invention will be elucidated on the basis of the following description of the preferred embodiment thereof with reference to the annexed figures, in which:

[0023] FIG. 1 shows a schematic view of a preferred embodiment of the method according to the present invention;

[0024] FIG. 2 shows an example of a detail of an image acquired with the arrangement of FIG. 1;

[0025] FIGS. 3a-3d show a detail view with the results of line detectors and the combination thereof after a first correction step;

[0026] FIG. 4 shows an example of a discontinuity in the detected pattern;

[0027] FIGS. 5a-5c show a diagram elucidating the construction of the grid;

[0028] FIGS. 6a-6b show a diagram elucidating the refining of the positions of the grid;

[0029] FIG. 7 shows a diagram elucidating the provision of the grid points with a label;

[0030] FIGS. 8a-8b show the result of the complete process of grid extraction for a detail image;

[0031] FIGS. 9a-9b show two images elucidating the step for acquiring surface texture;

[0032] FIGS. 10a-h show examples of obtained results.

[0033] In the arrangement of FIG. 1 a pattern is cast onto the scene by a projector and captured by a camera in an image plane. In this embodiment the projector and camera were placed at a distance of about 4 m from the scene. The angle between the projection direction and the viewing direction was approximately 7°. The projected pattern was a substantially square grid consisting of 600 horizontal and 600 vertical lines arranged on a slide using lithographic techniques, with a line thickness of 10 µm and placed at mutual distances of 50 µm. In this embodiment a 512×512 image was formed, taken up for the greater part by the part of the scene for reconstruction, and typically 100 horizontal and 100 vertical projected lines were effectively visible in the image.

[0034] FIG. 2 shows a detail of an image recorded under the above mentioned conditions.

[0035] FIG. 3a shows such an image in more detail, and the responses of the horizontal and vertical line detectors are shown in FIGS. 3b and 3c. After determining the central line of these responses and a first filling of gaps, the central lines are combined in an initial reconstruction of the image pattern (FIG. 3d).
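The description does not specify the line detectors. As a hedged illustration only, a directional second-difference filter gives a strong response on thin lines in one orientation and none on lines in the other. The function name `line_response` and the assumption that pattern lines appear darker than their surroundings are choices made for this sketch.

```python
import numpy as np

def line_response(image, axis):
    """Second-difference response, positive on dark lines.

    axis=0 differences across rows and so responds to horizontal lines;
    axis=1 differences across columns and so responds to vertical lines.
    """
    image = np.asarray(image, dtype=float)
    padded = np.pad(image, 1, mode="edge")
    if axis == 0:
        # neighbour above + neighbour below - 2 * centre
        return padded[:-2, 1:-1] + padded[2:, 1:-1] - 2.0 * image
    # neighbour left + neighbour right - 2 * centre
    return padded[1:-1, :-2] + padded[1:-1, 2:] - 2.0 * image
```

The central line of a response can then be taken as the positions where the response is locally maximal across the line direction.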

[0036] Owing to noise or other causes, discontinuities in the found pattern lines will be present in regions where the response of a line detector is relatively poor. FIG. 4 shows an example of two lines which are interrupted, in this case in the vicinity of their intersection. As shown in FIG. 4, the immediate vicinity of the two end points of a line segment is scanned in the present preferred embodiment of the invention over a region in the line of a width s. If points of other line segments are encountered in this process, a score is assigned thereto based on characteristics such as the difference in the angle of inclination ε of the line segments close to the discontinuity, the distance to the end points of those segments, the difference in intensity of the given end point and the end point of a candidate segment, as well as the average intensity along the rectilinear link between the end points. Two segments are joined together if the sum of the scores is optimal. Such a result is already shown in FIG. 3d.
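The scoring of candidate links can be illustrated with a small sketch. The four criteria come from the paragraph above; the class `EndPoint`, the functions `link_score` and `best_candidate`, the equal weights, and the convention that a lower score is better (dark lines, so a dark link is preferred) are all assumptions made for this example.

```python
import math
from dataclasses import dataclass

@dataclass
class EndPoint:
    x: float
    y: float
    angle: float      # local line orientation in radians
    intensity: float  # image intensity at the end point

def link_score(a, b, mean_link_intensity, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four criteria; lower means a more plausible link."""
    # Orientation difference, ignoring the direction in which the line is traversed.
    d_angle = abs(a.angle - b.angle) % math.pi
    d_angle = min(d_angle, math.pi - d_angle)
    distance = math.hypot(a.x - b.x, a.y - b.y)
    d_intensity = abs(a.intensity - b.intensity)
    w_angle, w_dist, w_int, w_link = weights
    return (w_angle * d_angle + w_dist * distance
            + w_int * d_intensity + w_link * mean_link_intensity)

def best_candidate(end, candidates):
    """candidates: list of (EndPoint, mean intensity along the straight link)."""
    return min(candidates, key=lambda c: link_score(end, c[0], c[1]))[0]
```

In practice the weights would have to be tuned, and a threshold would be needed so that poor candidates are rejected rather than joined.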

[0037] FIG. 5 shows how the crossings or intersections are constructed. FIG. 5a shows the intersecting of two lines. Such intersections are brought together in a grid representation, wherein each grid point has a maximum of four neighbours (N, E, S and W), as shown in FIG. 5b. Grid points which are vertices of a full quadrangle are of particular importance in subsequent correction steps. These are found as points of return by successive selection of the E neighbour, its S neighbour, its W neighbour and its N neighbour (FIG. 5c).
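The "point of return" test lends itself to a short sketch. The data structure here is an assumption: a dictionary `neighbours` mapping each grid point to a dictionary of its available N, E, S and W neighbours; the function name `is_quadrangle_vertex` is likewise hypothetical.

```python
def is_quadrangle_vertex(neighbours, start):
    """Return True if walking E, S, W, N from `start` leads back to `start`."""
    current = start
    for direction in ("E", "S", "W", "N"):
        current = neighbours.get(current, {}).get(direction)
        if current is None:
            # A missing link means the quadrangle is incomplete.
            return False
    return current == start
```

For a complete quadrangle with corners a (top left), b (top right), c (bottom right) and d (bottom left), the walk from a visits b, c and d and returns to a.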

[0038] FIG. 6 shows a refining step applied to the grid. This step treats the grid as a system of linked lines, the position of which is refined by drawing out their continuity relative to their original discrete representation, shown in FIG. 6a, and by adding an attractive force which also gives the refined line positions a preference for positions with image intensities of the same polarity as the lines in the pattern. The result of this improvement step is shown in FIG. 6b.
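A much simplified version of such a refinement is sketched below. It is not the inventors' implementation: it handles a single, roughly horizontal line stored as one sub-pixel row position per image column, assumes dark pattern lines, and uses plain gradient-descent iterations. The name `refine_line` and the parameters `alpha` (continuity weight) and `beta` (image attraction weight) are assumptions.

```python
import numpy as np

def refine_line(image, y_init, alpha=0.2, beta=0.5, iterations=100):
    """Relax a roughly horizontal dark line; y_init holds one row position per column."""
    image = np.asarray(image, dtype=float)
    rows = np.arange(image.shape[0])
    grad_rows = np.gradient(image, axis=0)  # d(intensity) / d(row)
    y = np.asarray(y_init, dtype=float).copy()
    cols = np.arange(len(y))
    for _ in range(iterations):
        # Continuity term: discrete second difference along the line.
        padded = np.pad(y, 1, mode="edge")
        smooth = padded[:-2] - 2.0 * y + padded[2:]
        # Attraction term: intensity slope sampled at the current sub-pixel position.
        slope = np.array([np.interp(y[c], rows, grad_rows[:, c]) for c in cols])
        # Moving against the slope pulls the line towards darker pixels.
        y = y + alpha * smooth - beta * slope
    return y
```

Because positions are floating-point values, the refined line is no longer tied to the discrete pixel raster of the initial detection.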

[0039] The grid points are provided with labels or numbers as illustrated in FIG. 7. An arbitrary grid point in the image is chosen as reference point O(0,0). The other grid points are given two labels, a “horizontal” label i and a “vertical” label j, which give their position relative to the reference point. The horizontal line labels i are incremented with each E step or decremented with each W step required to reach the grid point from the reference point. The vertical line labels j are incremented with each N step or decremented with each S step required to reach the grid point from the reference point. Defining the labels amounts to defining the coherence of the grid.
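The labelling rule can be sketched as a breadth-first traversal from the reference point. The traversal order, the function name `label_grid` and the `neighbours` dictionary (grid point to its available N, E, S, W neighbours) are assumptions for illustration; the increments per step follow the paragraph above.

```python
from collections import deque

# Label increments (di, dj) for one step in each direction.
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}

def label_grid(neighbours, reference):
    """Assign (i, j) labels to every grid point reachable from `reference`."""
    labels = {reference: (0, 0)}
    queue = deque([reference])
    while queue:
        point = queue.popleft()
        i, j = labels[point]
        for direction, (di, dj) in STEP.items():
            nxt = neighbours.get(point, {}).get(direction)
            if nxt is not None and nxt not in labels:
                labels[nxt] = (i + di, j + dj)
                queue.append(nxt)
    return labels
```

This simple version keeps the first label found for each point and therefore does not yet resolve conflicts between different paths.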

[0040] FIG. 7 likewise shows an example wherein different labels can result from choosing a different path, for instance for the point P. Herein the circumscribed path to the right is more reliable since this avoids the false parts of the grid designated with white. The reliability of a path is determined on the basis of the reliability of the individual grid points along the path. It decreases as more links to adjacent grid points are missing and as the individually found quadrangles in the observed pattern bear less resemblance to squares as viewed from the projection direction. The labels are corrected iteratively/recursively where necessary. Older labels are herein overwritten by newer labels if the latter are more reliable.
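The overwrite rule can be illustrated as follows. How the reliabilities of individual grid points are combined into a path reliability is not stated here; taking their product is an assumption of this sketch, as are the names `path_reliability` and `propose_label`.

```python
import math

def path_reliability(point_reliabilities):
    """Combine per-point reliabilities in [0, 1] along a path (assumed: product)."""
    return math.prod(point_reliabilities)

def propose_label(labels, point, label, reliability):
    """Store `label` for `point` unless a more reliable label is already present.

    `labels` maps a grid point to a (label, reliability) pair.
    Returns True if the label was stored.
    """
    current = labels.get(point)
    if current is None or reliability > current[1]:
        labels[point] = (label, reliability)
        return True
    return False
```

With this rule a longer path through well-formed quadrangles can override a shorter path that crosses a doubtful part of the grid.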

[0041] FIG. 8b shows the finally found grid for the detail image of FIG. 8a.

[0042] For further details in respect of defining the grid reference is made to the enclosure which includes the article “Active acquisition of 3D shape for moving objects” by the inventors.

[0043] After the grid has thus been precisely defined, the three-dimensional shape thereof can be computed.

[0044] The three-dimensional shape of the visible surface illuminated with the pattern is reconstructed on the basis of the three-dimensional positions of the grid points. In the preferred embodiment the relative positions of these latter are obtained from the intersections of two parallel beams of rays, one of which represents the pattern projection rays and the other the image projection rays and which are such that they form an angle as defined by calibration and that their intersections are consistent with the detected positions of the grid points in the image.
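The intersection of the two parallel beams can be made concrete under a simplified model, which is an assumption of this sketch and not taken from the referenced article. The camera views along the Z axis, so the image coordinate x equals the scene coordinate X. The projector rays are parallel to the direction (sin θ, 0, cos θ), so every point lit by vertical pattern line i satisfies X·cos θ − Z·sin θ = i·pitch. The function name `depth_from_label` and the parameter `pitch` (line spacing in image units) are hypothetical.

```python
import math

def depth_from_label(x, i, theta, pitch=1.0):
    """Depth Z of a grid point seen at image coordinate x on pattern line i.

    theta is the angle in radians between projection and viewing direction.
    Because the labels i are only relative, Z is defined up to a constant offset.
    """
    return (x * math.cos(theta) - i * pitch) / math.sin(theta)
```

The division by sin θ shows why a small angle between the two directions, such as the 7° of this embodiment, makes depth sensitive to errors in the detected line positions.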

[0045] For further details relating to the definition of the three-dimensional shape description and the calibration used in the preferred embodiment reference is made to the enclosure which includes the article by the inventors “One-shot active 3D shape acquisition”.

[0046] After the grid has thus been precisely defined, in a preferred embodiment the texture is also extracted from the same image which was used for the shape definition.

[0047] In the image the cross section of an intensity profile of a pattern stripe will be Gaussian, approximated by the function:

$$w(x) = 1 - e^{-(x/\sigma)^{n}}$$

[0048] In an experiment n=2 was chosen and the function was Gaussian. The intensity attenuation of the pattern stripes was subsequently modelled by superimposing this profile over the extracted grid lines. The result for the grid of FIG. 8b is shown in FIG. 9a. The greater the attenuation, the less representative for the real texture the image intensity is considered to be. In order to determine the texture an interpolation of intensities is taken as starting point, wherein account is taken of the relative attenuations and the distances to the point where the intensity is restored. The result serves only to initialize a non-linear diffusion process on the intensity, which eliminates the remaining traces of the pattern stripes. The result for the image of FIG. 8a is shown in FIG. 9b.
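The use of the stripe profile as a weight in the initial interpolation can be sketched in one dimension. The profile follows the function w(x) given above, which is 0 at the stripe centre and tends to 1 away from it. The inverse-distance falloff, the window `radius`, the use of |x| so that odd n also works, and the names `stripe_profile` and `restore_intensity` are assumptions; the subsequent non-linear diffusion is not shown.

```python
import numpy as np

def stripe_profile(x, sigma, n=2):
    """w(x) = 1 - exp(-(|x| / sigma)^n), with x the distance to the stripe centre."""
    x = np.asarray(x, dtype=float)
    return 1.0 - np.exp(-(np.abs(x) / sigma) ** n)

def restore_intensity(observed, weight, radius=3):
    """Weighted interpolation: pixels with low weight contribute little."""
    observed = np.asarray(observed, dtype=float)
    weight = np.asarray(weight, dtype=float)
    restored = np.empty_like(observed)
    for p in range(len(observed)):
        lo, hi = max(0, p - radius), min(len(observed), p + radius + 1)
        q = np.arange(lo, hi)
        # Combine stripe weight with a falloff in distance to the restored pixel.
        k = weight[q] / (1.0 + np.abs(q - p))
        restored[p] = np.sum(k * observed[q]) / np.sum(k)
    return restored
```

A pixel exactly on a stripe centre has weight 0, so its restored value comes entirely from its less attenuated neighbours.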

[0049] Finally, the extracted texture is arranged (“mapped”) on the surface.

[0050] For further details relating to the determining of the surface texture, reference is made to the enclosure, which includes the article by the inventors “A sensor that extracts both 3D shape and surface texture”.

[0051] FIG. 10 shows results obtained with this embodiment. FIGS. 10a and 10b show two input images for a hand and a face. FIGS. 10c and 10d show two views of the shape description obtained for the hand, FIGS. 10e and 10f the shape description of the face. FIGS. 10g and 10h show two views of the texture coated (“texture mapped”) shape description of the face.

[0052] Using additional steps, perspective effects in the projection and viewing geometry can be taken into account in the above described preferred embodiment of the method and device. Using additional data, absolute dimensions and distances can be obtained, so that the unknown scale factor of the shape description disappears. It of course remains possible to add codes to the projection pattern which facilitate the identification of pattern components. Different projections and/or images can be used simultaneously. The surface texture can also be obtained from images in which the pattern is not visible (for instance by using non-visible wavelengths or switching off the pattern), and/or the texture can be extracted by filtering out the pattern in other ways than described in this embodiment.

[0053] The present invention is not defined by the above described preferred embodiment, in which many modifications are conceivable; the rights applied for are defined by the scope of the following claims within which variations can be envisaged.

Claims

1. A method for acquiring a three dimensional shape or image of an object or a scene including freeform surfaces, wherein one or more predetermined patterns are projected onto the object or the scene and wherein the shape is acquired on the basis of relative distances between the primitives, e.g. lines and/or intersections of the lines, from which the patterns are composed, as observed in one or more images.

2. A method according to claim 1, wherein relative, spatial positions are calculated directly from relative, observed positions of pattern primitives in the image or images.

3. A method according to claim 1 or 2, wherein one single pattern is projected and a three dimensional shape of an object or a scene is reconstructed on the basis of a single image.

4. A method according to claim 1, 2 or 3, wherein a sequence of three dimensional shapes is reconstructed from images acquired sequentially in time, e.g. to determine changes in the three dimensional shape of objects or three dimensional motions of objects.

5. A method according to any of claims 1-4, wherein one or more patterns of lines are projected onto the scene or the object and the shape is acquired on the basis of observed, relative distances between the lines and/or intersections of the lines of the pattern.

6. A method according to claim 5, wherein the pattern of lines is a grid of straight lines.

7. A method according to claim 6, wherein the grid consists of two mutually perpendicular series of parallel lines.

8. A method according to any of claims 1-7, based upon observed deformations of the pattern(s) other than perspective effects.

9. A method according to claim 8 that includes a step of minimizing inaccuracies in the extracted three dimensional shape that are due to perspective effects in the pattern projection or in the acquisition of the image(s).

10. A method according to any of claims 5-9, wherein pattern lines and/or intersections thereof are extracted from the image or images.

11. A method according to claim 10, wherein the different lines and/or intersections are jointly extracted in order to increase robustness and/or precision.

12. A method according to claim 10, wherein the intersections of the lines are determined and the relative positions thereof in the pattern(s) are provided with relative sequential numbers.

13. A method according to claim 10, wherein the positions of the lines are determined with a higher precision than that of the imaging device(s), e.g. with a precision better than the size of a pixel.

14. A method according to claim 13, wherein the positions of the lines are refined by treating them as a ‘snake’ or through a regularization technique.

15. A method according to claim 10, wherein the intersections are determined with a higher precision than that of the imaging device(s), e.g. with a precision better than the size of a pixel.

16. A method according to claim 10, wherein the pattern lines in an image are spaced over a relatively small number of pixels, e.g. ten pixels or less.

17. A method according to any of claims 1-9, wherein the correctness of the extracted pattern(s) is checked and/or corrections are applied to the extracted pattern(s) through the comparison of the pattern that is actually projected with the shape the extracted pattern would yield if its three dimensional interpretation were viewed from the projector's position.

18. A method according to any of claims 1-9, wherein the texture is obtained by filtering out the pattern(s).

19. A method according to any of claims 1-9, wherein a code is added to the projected pattern(s), such as to simplify one or more steps in the method or to increase the reliability. Such a code may consist of the use of multiple colours, the addition of geometric elements to the pattern(s), etc.

20. A method according to claim 19, wherein the code is used to correctly process and/or identify depth discontinuities from the viewpoint of the imaging device(s).

21. A method according to any of claims 1-9, wherein calibration is achieved through a reversed process, i.e. by presenting an object or a scene and extracting its shape in order to derive system parameters.

22. A method according to claim 21, wherein an object or a scene is presented that contains two planes that subtend a known angle and that are used for calibration.

23. A method according to claim 21, wherein system parameters are calibrated by taking at least two views of the same scene or the same object.

24. A method according to claim 23, wherein system parameters are calibrated by exploiting the fact that upon reconstruction from every image the shape of the part visible in more than one of the views should be identical.

25. A method according to claim 23 or 24, wherein a pattern is projected on a larger part of the object or the scene than that of which the three dimensional shape is reconstructed from a single image, while using the connectivity of the pattern to facilitate the registration and/or combination of multiple, partial reconstructions obtained from multiple images.

26. A method according to claim 21, wherein the shape of the object or scene is known and the system parameters are obtained by a comparison of the reconstructed shape with the known shape.

27. A method according to any of claims 1-9, wherein the calibration of the system only requires the determination of the height-width proportion (i.e. the ‘aspect ratio’) of the pixels and the relative direction of the pattern projection and image acquisition.

28. A system for obtaining a three dimensional shape or image of a scene or object, using a method according to any of claims 1-27 and comprising:

at least one pattern generator for projecting a pattern on a scene or object;

at least one imaging device, e.g. a camera, for observing a scene or an object; and

computing means for determining the shape of the scene or the object.

29. A system according to claim 28, wherein only one pattern generator and one imaging device are used.

30. A system according to claim 28 or 29, wherein the only constraint for the relative arrangements of pattern generator(s) and imaging device(s) is that at least a part of the pattern(s) is observed by the imaging device(s), and the projection and imaging direction(s) do not all coincide.

31. A system according to any of claims 28, 29 or 30, wherein the direction of pattern projection and the imaging direction subtend a relatively small angle, e.g. 15 degrees or less.

32. A system according to any of claims 28, 29, 30 or 31, wherein radiation is used that is invisible to the human eye.

33. A system according to any of claims 28, 29, 30 or 31, wherein for the projection of the pattern(s) and/or the acquisition of the image(s) a combination of different wavelengths is used.

34. A system according to any of claims 28, 29, 30 or 31, wherein the pattern or patterns are projected during a short period of time and/or with interruptions.

35. A system according to claim 34, which is used to reduce the relative movement of the object or the scene during the effective imaging time.

36. A system according to claim 34, which is used to extract the texture or surface patterns during the period wherein no pattern(s) are projected.

37. A system according to any of claims 28, 29, 30 or 31, wherein components are integrated into a single apparatus, such as a photo or video camera suitable for three dimensional imaging.

38. A system and/or method substantially according to the description and/or figures.

39. Image acquired according to any method as claimed in any of claims 1-27, and/or any system according to any of claims 28-37.