RESOLVING HOMOGRAPHY DECOMPOSITION AMBIGUITY BASED ON VIEWING ANGLE RANGE

- QUALCOMM Incorporated

The homography between captured images of a planar object is determined and decomposed into at least one possible solution, and typically at least two ambiguous solutions. The removal of the ambiguity between the two solutions, or validation of a single solution, is performed using a viewing angle range. The viewing angle range may be used by comparing the viewing angle range to the orientation of each solution as derived from the rotation matrix resulting from the homography decomposition. Any solution with an orientation outside the viewing angle range may be eliminated as a solution.

Description
CROSS-REFERENCE TO PENDING PROVISIONAL APPLICATION

This application claims priority under 35 USC 119 to U.S. Provisional Application No. 61/533,733, filed Sep. 12, 2011, and entitled “Resolving Homography Decomposition Ambiguity,” which is assigned to the assignee hereof and which is incorporated herein by reference.

BACKGROUND

Vision based tracking techniques use images captured by a mobile platform to determine the position and orientation (pose) of the mobile platform with respect to an object in the environment. Tracking is useful for many applications such as navigation and augmented reality, in which virtual objects are inserted into a user's view of the real world.

One type of vision based tracking initializes a reference patch by detecting a planar surface in the environment. The surface is typically detected using multiple images of the surface: the homography between the two images is computed and used to estimate 3D locations for the points detected on the surface. Any two camera images of the same planar surface are related by a 3×3 homography matrix h. The homography h can be decomposed into the rotation R and translation t between the two images. The pose information [R|t] may then be used for navigation, augmented reality, or other such applications.

However, in most cases, the decomposition of homography h yields multiple possible solutions. Only one of these solutions, however, represents the actual planar surface. Thus, there is an ambiguity in the decomposition of homography h that must be resolved. Known methods of resolving homography decomposition ambiguity require the use of extra information to select the correct solution, such as additional images or prior knowledge of the planar surface.

By way of example, tracking technologies such as that described by Georg Klein and David Murray, “Parallel Tracking and Mapping on a Camera Phone”, In Proc. International Symposium on Mixed and Augmented Reality (ISMAR), 4 pages, 2009 (“PTAM”), suffer from this ambiguity in pose selection after homography decomposition. PTAM requires additional video frames, i.e., images, to resolve the ambiguity. For each possible solution, PTAM computes the 3D camera pose and compares the pose reprojection error over a number of subsequent frames. When the average reprojection error for one solution is greater than the other's, e.g., two times greater, the solution with the greater error is eliminated. Using pose reprojection to resolve the ambiguity, however, takes a long time to converge and often yields incorrect results.

Another approach used to resolve the ambiguity is to choose the homography solution whose normal is closest to the initial orientation of the camera. This approach, however, restricts the user to always begin close to a head-on orientation and then move the camera away from that position.

In an approach described by D. Santosh Kumar and C. V. Jawahar, “Robust Homography-Based Control for Camera Positioning in Piecewise Planar Environments”, Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), 906-918 (2006), another planar surface in space is required or prior knowledge about the plane is needed to select the correct solution. Thus, this approach has limited practical application.

SUMMARY

The homography between captured images of a planar object is determined and decomposed into at least one possible solution, and typically at least two ambiguous solutions. The removal of the ambiguity between the two solutions, or validation of a single solution, is performed using a viewing angle range. The viewing angle range may be used by comparing the viewing angle range to the orientation of each solution as derived from the rotation matrix resulting from the homography decomposition. Any solution with an orientation outside the viewing angle range may be eliminated as a solution.

In one embodiment, a method includes capturing two images of a planar object with at least one camera from a first position and a second position; determining a homography between the two images; decomposing the homography to obtain at least one possible solution for the second position; using a viewing angle range to eliminate the at least one possible solution; and storing any remaining solution for the second position.

In another embodiment, an apparatus includes a camera for capturing images of a planar object; and a processor coupled to receive two images captured from a first position and a second position, the processor configured to determine a homography between the two images, decompose the homography to obtain at least one possible solution for the second position, use a viewing angle range to eliminate the at least one possible solution, and store any remaining solution for the second position in a memory coupled to the processor.

In another embodiment, an apparatus includes means for capturing two images of a planar object with at least one camera from a first position and a second position; means for determining a homography between the two images; means for decomposing the homography to obtain at least one possible solution for the second position; means for using a viewing angle range to eliminate the at least one possible solution; and means for storing any remaining solution for the second position.

In yet another embodiment, a non-transitory computer-readable medium including program code stored thereon includes program code to determine a homography between two images of a planar object captured from different positions by at least one camera; program code to decompose the homography to obtain at least one possible solution; program code to use a viewing angle range to eliminate the at least one possible solution; and program code to store any remaining solution.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a mobile platform capturing images of a planar object at two different positions.

FIG. 2 illustrates the projection of a three-dimensional (3D) point on a planar object onto two images captured at different positions.

FIG. 3 illustrates resolving ambiguity in solutions to homography decomposition using viewing angle.

FIGS. 4A, 4B, 4C, and 4D illustrate possible viewing directions of the mobile platform with respect to the planar object and the valid viewing angle ranges associated with the viewing directions.

FIG. 5 is a flow chart illustrating the process of resolving ambiguity in the homography decomposition using viewing angle.

FIG. 6 is a flow chart illustrating the process of using the viewing angle between the camera and the planar object to eliminate at least one possible solution.

FIG. 7 is a flow chart illustrating the process of selecting a solution, where the planar object may be horizontal or vertical.

FIG. 8 is a block diagram of a mobile platform capable of resolving ambiguity in the homography decomposition using only two images of the planar object and without any prior knowledge of the planar object.

DETAILED DESCRIPTION

FIG. 1 illustrates a mobile platform 100 at two different positions A and B. The mobile platform 100 captures images of a planar object 102 with a camera 114, illustrated as 102A and 102B in the display 112 of the mobile platform 100. In practice, a single mobile platform 100 may capture a series of frames from a live video stream while it is moved from position A to position B, as indicated by the broken arrow in FIG. 1. Alternatively, two different mobile platforms may be used to capture images of planar object 102 from different positions. The mobile platform may also include orientation sensors 116, such as accelerometers, magnetometers, and/or gyroscopes.

As shown in FIG. 2, if a 3D point on a plane is viewed in two images I′ and I, its 2D projections q′=(x′,y′,1) on image I′ and q=(x,y,1) on image I are related by a homography h as:


q′ ≈ hq   (eq. 1)

The homography h between two views of a planar surface can be decomposed into the rotation matrix R, translation t, and the normal n using the well-known procedure described in Faugeras, O., Lustman, F.: “Motion and structure from motion in a piecewise planar environment”, International Journal of Pattern Recognition and Artificial Intelligence 2 (1988) 485-508, which is incorporated herein by reference. In the most general case, the decomposition of homography h generates four possible solutions. Two of these solutions can be eliminated by enforcing non-crossing and visibility constraints. The non-crossing constraint requires that the two camera images are captured from the same side of the planar object, e.g., both images are captured from above the planar object. The visibility constraint requires that all the 3D points on the planar object be in front of the camera when the images are captured. However, the ambiguity between the other two possible solutions remains.
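
For concreteness, the decomposition and the visibility-based pre-filtering might be sketched as follows using OpenCV (a minimal sketch only; the homography H, intrinsic matrix K, and matched point arrays are assumed inputs, and the helper function name is illustrative, not part of the patent):

```python
import cv2
import numpy as np

def decompose_and_prefilter(H, K, pts1, pts2):
    """Decompose homography H given camera intrinsics K, then enforce
    the visibility constraint with OpenCV's built-in filter.

    pts1, pts2: Nx1x2 float32 arrays of matched points in the two views
    (assumed to come from feature matching).
    """
    # decomposeHomographyMat returns up to four (R, t, n) candidates.
    num, Rs, ts, ns = cv2.decomposeHomographyMat(H, K)
    # Keep only candidates for which the matched reference points lie
    # in front of the camera in both views (the visibility constraint).
    good = cv2.filterHomographyDecompByVisibleRefpoints(Rs, ns, pts1, pts2)
    keep = np.asarray(good).ravel() if good is not None else range(num)
    return [(Rs[i], ts[i], ns[i]) for i in keep]
```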

FIG. 3 is another illustration of the mobile platform 100 at the initial position A and at the current position B (along with user 201) with respect to the planar object 102. The homography decomposition from the images produced at the initial position A and the current position B produces two possible solutions 200 and 202, wherein solution 200 corresponds to the correct position B of the mobile platform 100 and solution 202 is incorrect and is illustrated with dotted lines. As discussed above, each solution 200 and 202 to the homography decomposition includes the plane normal n, shown in FIG. 3, as well as a rotation matrix R, illustrated by arrows R200 and R202 in FIG. 3, and a translation t, not shown in FIG. 3. It should be understood, as discussed above, that the homography decomposition may produce up to four possible solutions, but two solutions may be easily eliminated by enforcing the non-crossing and visibility constraints and are therefore not shown in FIG. 3. The two remaining possible solutions 200 and 202 shown in FIG. 3 are both valid solutions of the homography decomposition, and thus it is desirable to resolve the ambiguity. Additionally, it should be understood that the homography decomposition may produce only one possible solution, e.g., solution 200, in which case it may be desirable to validate that solution.

The ambiguity between the two remaining solutions may be resolved (or a single remaining solution validated) using a valid viewing angle range within which the correct solution is presumed to lie. The valid viewing angle range is based on the viewing direction of the user 201. FIGS. 4A, 4B, 4C, and 4D, by way of example, illustrate several possible viewing directions of the mobile platform 100 with respect to the planar object 102 and the valid viewing angle ranges associated with those viewing directions. The viewing angle range is a range of angles in which the user and mobile platform 100 are most likely positioned, based on the assumption that the user and the mobile platform are aligned with respect to the planar object such that the user can see the display of the mobile platform 100. For example, FIG. 4A illustrates the mobile platform 100 viewing a horizontal planar object 102 in a downward direction, while in an upright orientation, from the fourth (IV) quadrant as defined by the planar object 102 and normal n. Thus, as illustrated in FIG. 4A, the valid viewing angle range α is between 270° and 360° from normal n to the planar object 102, as illustrated by line 206. The starting and ending angle values of the viewing angle range α may, however, have values other than 270° and 360° depending on how the homography h is decomposed and how the orientation is extracted from the resulting rotation matrices, i.e., depending on the coordinate systems chosen. In general, it has been determined that an approximately 90° wide viewing angle range α (i.e., end angle minus start angle ≈ 90°) provides adequate results, although a variation in the range of 1°, 2°, 5°, or 10° may also provide useful results.

FIG. 4B is similar to FIG. 4A, but illustrates the mobile platform 100 viewing a horizontal planar object 102 in a downward direction from the first (I) quadrant, while in an upside-down orientation. Due to the viewing direction of the mobile platform 100, the valid viewing angle range α in FIG. 4B is between 0° and 90°, where the homography h is decomposed and the orientation extracted from the resulting rotation matrices in the same manner as in FIG. 4A, as illustrated by line 206. The viewing direction of the mobile platform 100 with respect to the planar object 102 may be determined, e.g., based on orientation sensors, which would indicate that the mobile platform 100 is in the upside-down orientation. A user is unlikely to hold the mobile platform 100 in an upside-down orientation while viewing a horizontal planar object 102 from the fourth (IV) quadrant, and thus it may be safely presumed that an upside-down orientation indicates that the mobile platform 100 is viewing the planar object 102 from the first (I) quadrant as shown in FIG. 4B. Alternatively, the viewing direction of the mobile platform 100 may be provided by user input, e.g., from a selection in a menu.

FIG. 4C is also similar to FIG. 4A, but illustrates the mobile platform 100 viewing a vertical planar object 102 in an upward direction from the fourth (IV) quadrant. Due to the viewing direction of the mobile platform 100 in FIG. 4C, the valid viewing angle range α in FIG. 4C is between 270° and 360°, where the homography h is decomposed and the orientation extracted from the resulting rotation matrices in the same manner as in FIG. 4A, as illustrated by line 206. The viewing direction of the mobile platform 100 with respect to the planar object 102 may be determined, e.g., based on orientation sensors, which would indicate that the mobile platform 100 is held in an upward orientation. A user cannot hold the mobile platform 100 in an upward orientation while viewing a planar object 102 from the first (I) quadrant, or while viewing a horizontal planar object, as illustrated in FIGS. 4A and 4B, and thus it may be safely presumed that an upward orientation indicates that the mobile platform 100 is viewing a vertical planar object 102 from the fourth (IV) quadrant. Alternatively, the viewing direction of the mobile platform 100 may be provided by user input, e.g., from a selection in a menu.

FIG. 4D is also similar to FIG. 4A, but illustrates the mobile platform 100 viewing a vertical planar object 102 in a downward direction from the first (I) quadrant. Due to the viewing direction of the mobile platform 100 in FIG. 4D, the valid viewing angle range α in FIG. 4D is between 0° and 90°, where the homography h is decomposed and the orientation extracted from the resulting rotation matrices in the same manner as in FIG. 4A, as illustrated by line 206. The viewing direction of the mobile platform 100 with respect to the planar object 102 may be determined, e.g., based on orientation sensors, which would indicate that the mobile platform 100 is held in a downward orientation, along with a heuristic analysis of the possible homography solutions, which may be used to distinguish a vertical planar object 102, as illustrated in FIG. 4D, from a horizontal planar object 102, as illustrated in FIG. 4A, thereby identifying the viewing direction as from the first (I) quadrant. For example, if the orientation sensors 116 indicate that the mobile platform 100 is looking downward and no possible solutions lie in the fourth (IV) quadrant, then the planar object 102 is most likely vertical. Additionally, out of two possible solutions, the solution that is within a small margin of error, e.g., 10°, of the orientation of the mobile platform 100 as determined by the orientation sensors 116 may be selected. Alternatively, the viewing direction of the mobile platform 100 may be provided by user input, e.g., from a selection in a menu.

Thus, it can be seen that the viewing direction with respect to a horizontal or vertical planar object (or any orientation therebetween) is either from the first (I) quadrant, in which case a first valid viewing angle range α, e.g., between 0° and 90°, is used, or from the fourth (IV) quadrant, in which case a second valid viewing angle range α, e.g., between 270° and 360°, is used. The quadrant in which the mobile platform 100 is located, and thus the viewing direction, may be determined based on orientation sensors, user input, heuristics, or in any other desired manner.
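
As a minimal illustration of this quadrant-based selection (a Python sketch; the function name, the upside-down test, and the exact endpoint values are assumptions, since the patent notes the endpoints depend on the coordinate systems chosen):

```python
def select_viewing_angle_range(device_upside_down: bool) -> tuple:
    """Return the valid viewing angle range (start, end) in degrees.

    Assumption: an upright device viewing from quadrant IV uses
    (270, 360), while an upside-down device viewing from quadrant I
    uses (0, 90), mirroring FIGS. 4A and 4B.
    """
    return (0.0, 90.0) if device_upside_down else (270.0, 360.0)
```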

Referring back to FIG. 3, each solution 200 and 202 has a different rotation matrix, R200 and R202, as well as a different translation t, which is not illustrated. The orientation of each solution 200 and 202 with respect to normal n may be extracted from each solution's rotation matrix, R200 and R202. For example, as illustrated in FIG. 3, an orientation θ200 with respect to normal n is extracted from the rotation matrix R200 associated with solution 200 and a different orientation θ202 with respect to normal n is extracted from the rotation matrix R202 associated with solution 202. FIG. 3, by way of example, illustrates the orientations θ200 and θ202 from normal n as approximately 300° (e.g., in the fourth (IV) quadrant) and 30° (e.g., in the first (I) quadrant), respectively, but of course the orientations of the possible solutions will vary based on the homography matrix h. The direction in which the orientations extend with respect to normal n is a function of the decomposition of the homography between the two images, but both solutions will extend in the same general direction, illustrated as the clockwise direction in FIG. 3. It should be understood that while FIG. 3 illustrates both solutions 200 and 202 along the same projected line 204 (but asymmetrically arranged with respect to normal n), the projections of the solutions 200 and 202 may not always be collinear; however, because the orientations θ200 and θ202 are derived from the rotation matrices R200 and R202 using an identical method, the orientations will extend in the same general direction from normal n.
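
One way such an orientation might be extracted is sketched below (Python/NumPy; the convention that R maps first-camera coordinates into the second camera's frame, and the in-plane reference axis that fixes the sign of the angle, are assumptions standing in for the coordinate-system choice the patent leaves open):

```python
import numpy as np

def orientation_from_rotation(R: np.ndarray, n: np.ndarray) -> float:
    """Angle (degrees, in [0, 360)) of the camera viewing direction,
    measured from the plane normal n, for one decomposition solution."""
    # Assumption: the second camera's optical axis, expressed in the
    # first camera's frame, is R.T @ [0, 0, 1].
    view_dir = R.T @ np.array([0.0, 0.0, 1.0])
    n_unit = n / np.linalg.norm(n)
    # Assumed in-plane reference axis, perpendicular to n.
    ref = np.cross(n_unit, np.array([0.0, 1.0, 0.0]))
    if np.linalg.norm(ref) < 1e-9:          # n parallel to the y-axis
        ref = np.cross(n_unit, np.array([1.0, 0.0, 0.0]))
    ref = ref / np.linalg.norm(ref)
    # Signed angle from n toward ref, wrapped to [0, 360): two ambiguous
    # solutions mirrored about n come out as theta and 360 - theta,
    # matching the ~30 deg vs ~300 deg picture of FIG. 3.
    return float(np.degrees(np.arctan2(view_dir @ ref,
                                       view_dir @ n_unit)) % 360.0)
```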

In order to resolve the ambiguity between the possible solutions 200 and 202, the orientations θ200 and θ202 are compared to the valid viewing angle range α. As discussed above, the viewing angle range α is a range of angles in which the user 201 and mobile platform 100 are most likely positioned. The viewing angle range α may be defined as a predetermined angular range {start angle, end angle} extending in the same general direction as the orientation for each possible solution and encompassing the likely position of the mobile platform 100. By way of example, FIG. 3 illustrates the viewing angle range α as between 270° and 360°, as illustrated by lines 206 and 208. As discussed above, however, the viewing angle range α is a function of how the homography h is decomposed and how the orientation is extracted from the resulting rotation matrices; thus, the starting and ending angle values of the viewing angle range α may have values other than 270° and 360° depending on the coordinate systems chosen during the homography decomposition and orientation extraction.

Any possible homography h decomposition solution with an orientation θ with respect to the n axis that is outside the viewing angle range α may be eliminated as a valid solution. Thus, for example, as illustrated in FIG. 3, the orientation θ202 (e.g., approximately 30°) is outside the viewing angle range α (270° to 360°) and, accordingly, the associated solution 202 may be eliminated as a valid solution. The orientation θ200 (e.g., approximately 300°), on the other hand, is within the viewing angle range α and therefore the associated solution 200 remains as the correct homography h decomposition solution.
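
This membership test might be sketched as follows (Python; the wraparound handling and function name are illustrative assumptions):

```python
def within_viewing_angle_range(theta: float, start: float, end: float) -> bool:
    """True if orientation theta (degrees) lies inside the valid viewing
    angle range [start, end], with all angles normalized to [0, 360).
    Handles ranges that wrap past 360 degrees (e.g., 330 to 60)."""
    theta, start, end = theta % 360.0, start % 360.0, end % 360.0
    if start <= end:
        return start <= theta <= end
    return theta >= start or theta <= end

# Example mirroring FIG. 3: with a valid range of (270, 360), a solution
# at ~300 degrees is kept and one at ~30 degrees is eliminated.
assert within_viewing_angle_range(300.0, 270.0, 360.0)
assert not within_viewing_angle_range(30.0, 270.0, 360.0)
```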

As discussed above, the output of the orientation sensors 116 may be used to indicate the orientation of the mobile platform 100 with respect to gravity, from which the viewing direction may be determined and the viewing angle range α adjusted as appropriate. For example, if the orientation sensors 116 indicate that the mobile platform 100 is held upside-down while capturing the second image at position B, the viewing angle range α may be modified to extend from 0° to 90°, thus making solution 202 the valid solution and eliminating solution 200. Additionally, a heuristic analysis of the possible solutions or user input may be used to determine whether the mobile platform is in the first (I) quadrant or the fourth (IV) quadrant, and thus to select the appropriate viewing angle range α.

In the rare case when both possible solutions 200 and 202 fall within the viewing angle range α, the correct solution may be selected by continuing to track both solutions until the orientation of one of the solutions falls outside the viewing angle range α as a result of user-generated motion of the mobile platform 100, as sketched below. If the ambiguity remains unresolved past a threshold, e.g., a desired period of time or number of images, the correct solution may be selected using other known techniques, such as pose reprojection error, if desired.
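
A minimal sketch of that fallback loop, reusing the range test sketched above (the stream interface and the frame budget are assumptions):

```python
def track_until_disambiguated(orientation_stream, angle_range, max_frames=30):
    """Keep tracking both candidates until user motion pushes one of
    them outside the valid viewing angle range.

    orientation_stream yields (theta_a, theta_b) pairs, one per tracked
    frame (assumed input); max_frames is an assumed threshold after
    which another technique (e.g., pose reprojection error) takes over.
    """
    for frame_idx, (theta_a, theta_b) in enumerate(orientation_stream):
        a_in = within_viewing_angle_range(theta_a, *angle_range)
        b_in = within_viewing_angle_range(theta_b, *angle_range)
        if a_in != b_in:
            return "a" if a_in else "b"   # ambiguity resolved
        if frame_idx + 1 >= max_frames:
            break
    return None  # still ambiguous: fall back to other known techniques
```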

Additionally, there are cases where only one possible solution may be generated from homography decomposition and the solution may be incorrect due to poor correlation of the 2D points. In this case, the same process may be used to validate the solution, i.e., if the possible solution has an orientation θ with respect to the n axis that is outside the viewing angle range α, the solution fails, and the process may be reset rather than assuming the only solution is correct.

FIG. 5 is a flow chart illustrating the process of resolving ambiguity in the homography decomposition using a viewing angle range. As illustrated, two images of a planar object are captured by at least one camera from different positions (302). The homography h between the two images of the planar object is determined (304). The homography h is decomposed to obtain at least one possible solution (306). For example, only one possible solution may be obtained, which, as described above, may be validated based on the viewing angle range. Typically, however, up to four possible solutions are obtained, a portion of which may be eliminated using the non-crossing and visibility constraints, leaving two ambiguous possible solutions. A viewing angle range is used to eliminate the at least one possible solution (308), and any remaining solution is stored as the correct position and orientation of the camera (310). Using the viewing angle range to eliminate the at least one possible solution advantageously avoids the use of prior knowledge of the planar object as well as the need for additional images of the planar object.
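
Putting the steps of FIG. 5 together, a minimal end-to-end sketch might look as follows (the intrinsics K, the matched points, and the default range are assumed inputs; cv2.findHomography and cv2.decomposeHomographyMat are standard OpenCV calls, while orientation_from_rotation and within_viewing_angle_range are the illustrative helpers sketched earlier, not functions defined by the patent):

```python
import cv2
import numpy as np

def resolve_ambiguity(pts1, pts2, K, angle_range=(270.0, 360.0)):
    """Sketch of FIG. 5: homography (304) -> decomposition (306) ->
    viewing-angle filter (308) -> surviving solutions (310).

    pts1, pts2: Nx2 float32 arrays of matched points from the two
    captured images; K: 3x3 intrinsic matrix from calibration.
    """
    H, _ = cv2.findHomography(pts1, pts2, cv2.RANSAC)        # step 304
    _, Rs, ts, ns = cv2.decomposeHomographyMat(H, K)         # step 306
    survivors = []
    for R, t, n in zip(Rs, ts, ns):
        theta = orientation_from_rotation(R, n.ravel())      # helper above
        if within_viewing_angle_range(theta, *angle_range):  # step 308
            survivors.append((R, t, n))
    return survivors                                         # step 310
```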

FIG. 6 is a flow chart illustrating the process of using the viewing angle range to eliminate at least one possible solution. As illustrated, the orientation of the at least one possible solution with respect to the normal of the planar surface is extracted from a rotation matrix resulting from the homography decomposition (322). The orientation of the at least one possible solution is compared to the viewing angle range (324), which is predefined as discussed above. The at least one possible solution is eliminated if the orientation is outside the predefined viewing angle range (326). For example, the predefined viewing angle range may be between approximately 270° and 360°. The viewing angle range may be adjusted using an orientation of the camera with respect to gravity as determined, e.g., using an accelerometer or other appropriate orientation sensor.

FIG. 7, by way of example, is a flow chart illustrating the process of selecting a solution where the planar object 102 may be horizontal (as illustrated in FIGS. 4A and 4B) or vertical (as illustrated in FIGS. 4C and 4D). As illustrated, a solution is selected based on the viewing angle range (352), as discussed above. The selected solution is compared to the orientation of the mobile platform 100 as derived from the orientation sensors 116, assuming that the planar object 102 is horizontal (354). If the selected solution falls within a margin of error (354), e.g., 10°, of the orientation of the mobile platform 100 as determined by the orientation sensors 116, the selected solution is considered to be correct and is stored (362). If the selected solution is not within the margin of error (354), the other solution is compared to the orientation of the mobile platform 100 as derived from the orientation sensors 116, assuming that the planar object 102 is vertical (356). If the other solution falls within a margin of error (358), e.g., 10°, of the orientation of the mobile platform 100 as determined by the orientation sensors 116, assuming the planar object 102 is vertical, the other solution is considered to be correct and is stored (362). If the other solution is also not within the margin of error (358), then the planar object 102 is assumed to be vertical, the solution closest to the orientation of the mobile platform 100 as determined by the orientation sensors 116 may be selected (360), and that solution is stored (362).
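
The decision flow of FIG. 7 might be sketched as follows (Python; the two-solution interface, the sensor-derived orientation inputs, and the 10° margin are illustrative assumptions):

```python
def select_solution(sol_a_theta, sol_b_theta, sensor_theta_horizontal,
                    sensor_theta_vertical, margin=10.0):
    """Sketch of FIG. 7: pick between two candidate orientations using
    the sensor-derived device orientation.

    sol_a_theta: orientation of the range-selected solution (step 352).
    sol_b_theta: orientation of the other solution.
    sensor_theta_horizontal / sensor_theta_vertical: device orientation
    from the sensors, computed under the horizontal and vertical plane
    assumptions, respectively (assumed inputs).
    """
    def angle_diff(a, b):
        # Smallest angular difference in degrees, in [0, 180].
        return abs((a - b + 180.0) % 360.0 - 180.0)

    # Step 354: selected solution vs sensors, assuming a horizontal plane.
    if angle_diff(sol_a_theta, sensor_theta_horizontal) <= margin:
        return sol_a_theta                      # store (step 362)
    # Steps 356/358: other solution vs sensors, assuming a vertical plane.
    if angle_diff(sol_b_theta, sensor_theta_vertical) <= margin:
        return sol_b_theta                      # store (step 362)
    # Step 360: plane assumed vertical; pick the closer solution.
    if (angle_diff(sol_a_theta, sensor_theta_vertical)
            <= angle_diff(sol_b_theta, sensor_theta_vertical)):
        return sol_a_theta
    return sol_b_theta
```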

FIG. 8 is a block diagram of a mobile platform capable of resolving ambiguity in the homography decomposition using only two images of the planar object and without any prior knowledge of the planar object. The mobile platform 100 includes a means for capturing images of a planar object, such as camera 114 or multiple cameras. The mobile platform 100 also includes a means for sensing orientation, such as orientation sensors 116, which may be accelerometers, gyroscopes, an electronic compass, or other similar sensing elements. The mobile platform 100 may further include a user interface 150 that includes a means for displaying the image of the environment, such as the display 112. The user interface 150 may also include a keypad 152 or other input device through which the user can input information into the mobile platform 100. If desired, the keypad 152 may be obviated by integrating a virtual keypad into the display 112 with a touch sensor (or gesture control). The user interface 150 may also include a microphone 154 and speaker 156, e.g., if the mobile platform 100 is a cellular telephone. Of course, the mobile platform 100 may include other elements unrelated to the present disclosure.

The mobile platform 100 also includes a control unit 160 that is connected to and communicates with the camera 114 and orientation sensors 116. The control unit 160 accepts and processes images captured by camera 114 or multiple cameras, signals from the orientation sensors 116 and controls the display 112. The control unit 160 may be provided by a processor 161 and associated memory 164, hardware 162, software 165, and firmware 163. The control unit 160 may include an image processing unit 166 that performs homography decomposition on two images captured by the camera 114. The control unit 160 further includes a solution validating unit 168 that receives the solutions from the homography decomposition and determines if a solution is correct based on the viewing angle range as described in FIGS. 4 and 5. The selected solution may be stored in memory 164 or other storage unit as the position and orientation of the mobile platform 100.

The image processing unit 166 and solution validating unit 168 are illustrated separately from processor 161 for clarity, but may be part of the processor 161 or implemented in the processor based on instructions in the software 165 which is run in the processor 161. It will be understood that, as used herein, the processor 161 can, but need not necessarily, include one or more microprocessors, embedded processors, controllers, application specific integrated circuits (ASICs), digital signal processors (DSPs), and the like. The term processor is intended to describe the functions implemented by the system rather than specific hardware. Moreover, as used herein the term “memory” refers to any type of computer storage medium, including long term, short term, or other memory associated with the mobile platform, and is not to be limited to any particular type of memory or number of memories, or type of media upon which memory is stored.

The methodologies described herein may be implemented by various means depending upon the application. For example, these methodologies may be implemented in hardware 162, firmware 163, software 165, or any combination thereof. For a hardware implementation, the processing units may be implemented within one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, electronic devices, other electronic units designed to perform the functions described herein, or a combination thereof.

For a firmware and/or software implementation, the methodologies may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. Any machine-readable medium tangibly embodying instructions may be used in implementing the methodologies described herein. For example, software codes may be stored in memory 164 and executed by the processor 161. Memory may be implemented within or external to the processor 161. If implemented in firmware and/or software, the functions may be stored as one or more instructions or code on a computer-readable medium. Examples include non-transitory computer-readable media encoded with a data structure and computer-readable media encoded with a computer program. Computer-readable media includes physical computer storage media. A storage medium may be any available medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer; disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

Although the present invention is illustrated in connection with specific embodiments for instructional purposes, the present invention is not limited thereto. Various adaptations and modifications may be made without departing from the scope of the invention. Therefore, the spirit and scope of the appended claims should not be limited to the foregoing description.

Claims

1. A method comprising:

capturing two images of a planar object with at least one camera from a first position and a second position;
determining a homography between the two images;
decomposing the homography to obtain at least one possible solution for the second position;
using a viewing angle range to eliminate the at least one possible solution; and
storing any remaining solution for the second position.

2. The method of claim 1, wherein a plurality of possible solutions are obtained and the viewing angle range is used to eliminate at least one of the plurality of possible solutions.

3. The method of claim 1, wherein decomposing the homography produces a rotation matrix associated with the at least one possible solution, wherein using the viewing angle range comprises:

extracting an orientation with respect to normal of the planar object from the rotation matrix;
comparing the orientation to the viewing angle range; and
eliminating any possible solution with the orientation outside the viewing angle range.

4. The method of claim 3, wherein the viewing angle range is a predefined angular range between the planar object and the normal of the planar object.

5. The method of claim 3, wherein the viewing angle range is between approximately 270° and 360°.

6. The method of claim 1, the method further comprising:

determining an orientation of the at least one camera with respect to gravity; and
using the orientation of the at least one camera with respect to gravity to define the viewing angle range.

7. The method of claim 1, wherein using the viewing angle range to eliminate the at least one possible solution does not use prior knowledge of the planar object and does not use additional images of the planar object.

8. The method of claim 1, wherein a plurality of possible solutions are obtained and each of the plurality of possible solutions are within the viewing angle range, the method further comprising tracking each of the plurality of possible solutions until a solution is outside the viewing angle range before using the viewing angle range to eliminate the solution that is outside the viewing angle range.

9. An apparatus comprising:

a camera for capturing images of a planar object; and
a processor coupled to receive two images captured from a first position and a second position, the processor configured to determine a homography between the two images, decompose the homography to obtain at least one possible solution for the second position, use a viewing angle range to eliminate the at least one possible solution, and store any remaining solution for the second position in a memory coupled to the processor.

10. The apparatus of claim 9, wherein the at least one possible solution is a plurality of possible solutions and wherein the processor is configured to use the viewing angle range to eliminate at least one of the plurality of possible solutions.

11. The apparatus of claim 9, wherein the processor is configured to produce a rotation matrix associated with the at least one possible solution, wherein the processor is configured to use the viewing angle range by being configured to:

extract an orientation with respect to normal of the planar object from the rotation matrix;
compare the orientation to the viewing angle range; and
eliminate any possible solution with the orientation outside the viewing angle range.

12. The apparatus of claim 11, wherein the viewing angle range is predefined and is between the planar object and the normal of the planar object.

13. The apparatus of claim 11, wherein the viewing angle range is between approximately 270° and 360°.

14. The apparatus of claim 9, further comprising orientation sensors coupled to the processor, the processor being further configured to determine an orientation of the camera with respect to gravity using the orientation sensors, and to use the orientation of the camera with respect to gravity to define the viewing angle range.

15. The apparatus of claim 9, wherein the processor is configured to use the viewing angle range to eliminate the at least one possible solution without prior knowledge of the planar object and without additional images of the planar object.

16. The apparatus of claim 9, wherein a plurality of possible solutions are obtained and each of the plurality of possible solutions are within the viewing angle range, wherein the processor is further configured to track each of the plurality of possible solutions until a solution is outside the viewing angle range before the processor uses the viewing angle range to eliminate the solution that is outside the viewing angle range.

17. An apparatus comprising:

means for capturing two images of a planar object with at least one camera from a first position and a second position;
means for determining a homography between the two images;
means for decomposing the homography to obtain at least one possible solution for the second position;
means for using a viewing angle range to eliminate the at least one possible solution; and
means for storing any remaining solution for the second position.

18. The apparatus of claim 17, wherein the means for decomposing the homography produces a plurality of possible solutions and wherein the means for using the viewing angle range eliminates at least one of the plurality of possible solutions.

19. The apparatus of claim 17, wherein the means for decomposing the homography produces a rotation matrix associated with the at least one possible solution, wherein the means for using the viewing angle range comprises:

means for extracting an orientation with respect to normal of the planar object from the rotation matrix;
means for comparing the orientation to the viewing angle range; and
means for eliminating any possible solution with the orientation outside the viewing angle range.

20. The apparatus of claim 19, wherein the viewing angle range is predefined and is between the planar object and the normal of the planar object.

21. The apparatus of claim 19, wherein the viewing angle range is between approximately 270° and 360°.

22. The apparatus of claim 17, the apparatus further comprising:

means for determining an orientation of the at least one camera with respect to gravity; and
means for using the orientation of the at least one camera with respect to gravity to define the viewing angle range.

23. The apparatus of claim 17, wherein the means for using the viewing angle range to eliminate the at least one possible solution does not use prior knowledge of the planar object and does not use additional images of the planar object.

24. The apparatus of claim 17, wherein a plurality of possible solutions are obtained and each of the plurality of possible solutions are within the viewing angle range, the apparatus further comprising means for tracking each of the plurality of possible solutions until a solution is outside the viewing angle range before the means for eliminating eliminates the solution that is outside the viewing angle range.

25. A non-transitory computer-readable medium including program code stored thereon, comprising:

program code to determine a homography between two images of a planar object captured from different positions by at least one camera;
program code to decompose the homography to obtain at least one possible solution;
program code to use a viewing angle range to eliminate the at least one possible solution; and
program code to store any remaining solution.

26. The non-transitory computer-readable medium of claim 25, wherein a plurality of possible solutions are obtained and the viewing angle range is used to eliminate at least one of the plurality of possible solutions.

27. The non-transitory computer-readable medium of claim 25, wherein a rotation matrix is produced that is associated with the at least one possible solution, wherein the program code to use the viewing angle range comprises:

program code to extract an orientation with respect to normal of the planar object from the rotation matrix;
program code to compare the orientation to the viewing angle range; and
program code to eliminate any possible solution with the orientation outside the viewing angle range.

28. The non-transitory computer-readable medium of claim 27, wherein the viewing angle range is predefined and is between the planar object and the normal of the planar object.

29. The non-transitory computer-readable medium of claim 27, wherein the viewing angle range is between approximately 270° and 360°.

30. The non-transitory computer-readable medium of claim 25, further comprising:

program code to determine an orientation of the at least one camera with respect to gravity; and
program code to use the orientation of the at least one camera with respect to gravity to define the viewing angle range.

31. The non-transitory computer-readable medium of claim 25, wherein the program code to use the viewing angle range to eliminate the at least one possible solution does not use prior knowledge of the planar object and does not use additional images of the planar object.

32. The non-transitory computer-readable medium of claim 25, wherein a plurality of possible solutions are obtained and each of the plurality of possible solutions are within the viewing angle range, the non-transitory computer-readable medium further comprising program code to track each of the plurality of possible solutions until a solution is outside the viewing angle range before the solution that is outside the viewing angle range is eliminated.

Patent History
Publication number: 20130064421
Type: Application
Filed: Jan 27, 2012
Publication Date: Mar 14, 2013
Applicant: QUALCOMM Incorporated (San Diego, CA)
Inventor: Dheeraj Ahuja (San Diego, CA)
Application Number: 13/360,505
Classifications
Current U.S. Class: Target Tracking Or Detecting (382/103)
International Classification: G06K 9/00 (20060101);