Supplementing depth information for anchoring
According to various implementations, a method includes obtaining pose data that indicates a plurality of poses of the electronic device within a physical environment. The method includes obtaining a representation of a three-dimensional (3D) environment. The representation of the 3D environment includes a plurality of planes. Each of the plurality of planes defines a plurality of points in xy space. The representation of the 3D environment does not include z space (e.g., depth) information. The method includes anchoring the representation of the 3D environment to a physical anchor point within the physical environment based on the pose data. The method includes anchoring a computer-generated object to the representation of the 3D environment based on the pose data. For example, the pose data includes z space (e.g., depth) information, which is used to anchor the computer-generated object to the representation of the 3D environment.
This application claims priority to U.S. Provisional Patent App. No. 63/337,860, filed on May 3, 2022, which is hereby incorporated by reference in its entirety.
TECHNICAL FIELD
The present disclosure relates to displaying an environment, and in particular anchoring the environment to a physical environment.
BACKGROUND
Anchoring an environment (e.g., a virtual 3D environment) includes rendering the environment such that the environment appears world-locked to a physical anchor point of a physical environment. Because rendering is a computationally expensive process, certain techniques include rendering a representation of the environment, in order to reduce computational demands associated with the anchoring. For example, the representation of the environment includes less graphical information than the environment. However, because of the lower amount of graphical information, it is challenging to efficiently anchor additional content to the anchored representation of the environment.
SUMMARY
In accordance with some implementations, a method is performed at an electronic device including one or more processors, a non-transitory memory, and a display. The method includes obtaining pose data that indicates a plurality of poses of the electronic device within a physical environment. The method includes obtaining a representation of a 3D environment. The representation of the 3D environment includes a first plurality of planes. Each of the first plurality of planes defines a plurality of points in xy space. The method includes anchoring the representation of the 3D environment to a physical anchor point within the physical environment based on the pose data. The method includes anchoring a computer-generated object to the representation of the 3D environment based on the pose data.
In accordance with some implementations, an electronic device includes one or more processors, a non-transitory memory, a display, and one or more programs. The one or more programs are stored in the non-transitory memory and configured to be executed by the one or more processors, and the one or more programs include instructions for performing or causing performance of the operations of any of the methods described herein. In accordance with some implementations, a non-transitory computer readable storage medium has stored therein instructions which, when executed by one or more processors of an electronic device, cause the device to perform or cause performance of the operations of any of the methods described herein. In accordance with some implementations, an electronic device includes means for performing or causing performance of the operations of any of the methods described herein. In accordance with some implementations, an information processing apparatus, for use in an electronic device, includes means for performing or causing performance of the operations of any of the methods described herein.
For a better understanding of the various described implementations, reference should be made to the Description, below, in conjunction with the following drawings in which like reference numerals refer to corresponding parts throughout the figures.
In some circumstances, a device anchors an environment to a physical anchor point of a physical (e.g., real-world) environment, across a plurality of poses of the device. For example, in an augmented reality (AR) application, the device renders a 3D virtual house such that the rendered 3D virtual house appears anchored to a physical wall. Because rendering is a computationally expensive process, some techniques include rendering a simplified representation of the environment, in order to reduce computational demands associated with the anchoring. For example, the simplified representation of the environment includes information in two dimensions (e.g., xy space), whereas the environment includes information in three dimensions (e.g., xyz space). As another example, the simplified representation of the environment includes multiple 2D planes (e.g., in xy space), whereas the environment corresponds to a 3D mesh (e.g., in xyz space). In other words, in contrast to the environment, the simplified representation of the environment lacks z space (e.g., depth) information. Because of the lack of the z space information, the techniques lack a mechanism for efficiently anchoring additional content to the anchored simplified representation of the environment.
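The contrast between a 3D mesh in xyz space and a set of 2D planes in xy space can be made concrete with a minimal sketch. The names `Plane2D` and `flatten_face`, and the idea of deriving each plane by simply dropping the z coordinate of a mesh face, are illustrative assumptions; the disclosure does not specify how the simplified representation is produced.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


@dataclass(frozen=True)
class Plane2D:
    """One plane of the simplified representation: points in xy space only."""
    points: Tuple[Point2D, ...]


def flatten_face(face: List[Point3D]) -> Plane2D:
    """Drop the z coordinate of every vertex (hypothetical simplification step)."""
    return Plane2D(tuple((x, y) for x, y, _z in face))


# A toy "3D environment": two triangular mesh faces in xyz space.
mesh_faces = [
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.5)],
    [(0.0, 0.0, 2.0), (1.0, 1.0, 2.5), (0.0, 1.0, 2.0)],
]

# The simplified representation keeps only xy information.
planes = [flatten_face(face) for face in mesh_faces]

# Number of scalar coordinates stored by each form.
mesh_scalars = sum(len(face) * 3 for face in mesh_faces)
plane_scalars = sum(len(plane.points) * 2 for plane in planes)
```

In this toy case the planes store fewer scalars than the mesh, which illustrates why the representation is cheaper to handle and also why depth has to come from somewhere else.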
By contrast, various implementations disclosed herein include methods, systems, and electronic devices for using pose data to supplement the lack of z space information associated with a representation of a 3D environment. Namely, the representation of the 3D environment includes a plurality of planes, and each of the plurality of planes defines a plurality of points in xy space. Thus, the z space information indicated in the pose data is used to supplement the z space information not indicated by the representation of the 3D environment. In some implementations, the pose data indicates a 3D map or a 3D point cloud. To that end, in some implementations, an electronic device performs a simultaneous localization and mapping (SLAM) operation with respect to image data of the physical environment and positional sensor data characterizing the position of the electronic device. The z space information enables the electronic device to anchor a computer-generated object to the representation of the 3D environment.
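One way to picture the supplementing step is as a depth lookup against a point cloud carried by the pose data. The following is a minimal sketch under stated assumptions: the functions `depth_at` and `lift_plane` are hypothetical, and nearest-neighbor lookup in the xy plane is only one of many possible ways to borrow a z value; the disclosure does not prescribe a particular lookup.

```python
import math
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def depth_at(xy: Point2D, cloud: List[Point3D]) -> float:
    """Return the z value of the cloud point nearest to xy (measured in xy only)."""
    nearest = min(cloud, key=lambda p: math.hypot(p[0] - xy[0], p[1] - xy[1]))
    return nearest[2]


def lift_plane(plane: List[Point2D], cloud: List[Point3D]) -> List[Point3D]:
    """Attach a z value from the pose data to each xy point of a plane."""
    return [(x, y, depth_at((x, y), cloud)) for x, y in plane]


# Hypothetical 3D point cloud indicated by the pose data.
cloud = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.1), (1.0, 1.0, 2.4)]

# One plane of the representation, defined only in xy space.
plane = [(0.1, 0.0), (0.9, 0.1), (0.9, 0.9)]

lifted = lift_plane(plane, cloud)
```

The plane itself is left untouched; the depth is supplied at anchoring time from the pose data, which matches the division of labor described above.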
Reference will now be made in detail to implementations, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the various described implementations. However, it will be apparent to one of ordinary skill in the art that the various described implementations may be practiced without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the implementations.
It will also be understood that, although the terms first, second, etc. are, in some instances, used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first contact could be termed a second contact, and, similarly, a second contact could be termed a first contact, without departing from the scope of the various described implementations. The first contact and the second contact are both contacts, but they are not the same contact, unless the context clearly indicates otherwise.
The terminology used in the description of the various described implementations herein is for the purpose of describing particular implementations only and is not intended to be limiting. As used in the description of the various described implementations and the appended claims, the singular forms “a”, “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “includes,” “including”, “comprises”, and/or “comprising”, when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
As used herein, the term “if” is, optionally, construed to mean “when” or “upon” or “in response to determining” or “in response to detecting”, depending on the context. Similarly, the phrase “if it is determined” or “if [a stated condition or event] is detected” is, optionally, construed to mean “upon determining” or “in response to determining” or “upon detecting [the stated condition or event]” or “in response to detecting [the stated condition or event]”, depending on the context.
Because the representation 120 includes less informational content than the 3D environment 100, an electronic device may anchor the representation 120 to a physical environment more efficiently (e.g., fewer graphics processing unit (GPU) cycles) than anchoring the 3D environment 100 to the physical environment. However, certain techniques cannot efficiently anchor a computer-generated object to the representation 120, due to the lack of z space (depth) information indicated by the representation 120. Accordingly, various implementations disclosed herein include supplementing the lack of z space information.
As illustrated in
Referring to
For example, referring to
According to various implementations, a physical environment includes a physical anchor point, to which a representation of a 3D environment is anchored. For example, with reference to
In some implementations, the electronic device 210 sets a physical anchor point independent of a user input. For example, the electronic device 210 sets the physical anchor point to a default location of the physical environment 200 corresponding to the middle of the display 212.
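A default anchor point at the middle of the display can be sketched as picking the center pixel and unprojecting it with a depth value. The pinhole camera model, the intrinsic parameters (`fx`, `fy`, `cx`, `cy`), the display size, and the depth value are all assumptions introduced for illustration; the disclosure only states that the default location corresponds to the middle of the display.

```python
from typing import Tuple


def display_center(width_px: int, height_px: int) -> Tuple[float, float]:
    """Pixel coordinates of the middle of the display."""
    return (width_px / 2.0, height_px / 2.0)


def unproject(pixel: Tuple[float, float], depth: float,
              fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float, float]:
    """Map a pixel plus a depth to a camera-space 3D point (pinhole model, assumed)."""
    u, v = pixel
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


# Hypothetical display size and camera intrinsics.
center = display_center(1920, 1080)

# Depth of the physical surface behind the center pixel, e.g. taken from pose data.
anchor = unproject(center, depth=2.0, fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
```

With the principal point at the display center, the default anchor lies straight ahead of the device at the looked-up depth.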
In some implementations, the electronic device 210 sets a physical anchor point based on a user input. To that end and with reference to
The electronic device 210 anchors the representation 120 to the physical anchor point 220 based on the pose data 336, as illustrated in
In some implementations, the anchoring is in response to detecting the first user input that specifies the physical anchor point 220. In some implementations, the anchoring is in response to detecting a user input directed to an affordance, such as a user input selecting a “place virtual world” affordance.
As illustrated in
As illustrated in
The electronic device 210 detects the positional change based on the positional sensor data 316. In response to detecting the positional change, the electronic device 210 maintains anchoring the representation 120 to the physical anchor point 220, based on the pose data 336. To that end, the electronic device 210 anchors a second plurality of planes of the representation 120 to the physical anchor point 220 based on the pose data 336, and ceases to anchor the first plurality of planes of the representation 120 to the physical anchor point 220. For example, whereas the first plurality of planes is associated with a first pose of the electronic device 210 (e.g., illustrated in
The second plurality of planes of the representation 120 is illustrated in
As further illustrated in
The electronic device 400 includes a memory 402 (e.g., a non-transitory computer readable storage medium), a memory controller 422, one or more processing units (CPUs) 420, a peripherals interface 418, an input/output (I/O) subsystem 406, a display system 412, an inertial measurement unit (IMU) 430, image sensor(s) 443 (e.g., camera), contact intensity sensor(s) 465, and other input or control device(s) 416. In some implementations, the electronic device 400 corresponds to one of a mobile phone, tablet, laptop, wearable computing device, head-mountable device (HMD), head-mountable enclosure (e.g., the electronic device 400 slides into or otherwise attaches to a head-mountable enclosure), or the like. In some implementations, the head-mountable enclosure is shaped to form a receptacle for receiving the electronic device 400 with a display.
In some implementations, the peripherals interface 418, the one or more processing units 420, and the memory controller 422 are, optionally, implemented on a single chip, such as a chip 403. In some other implementations, they are, optionally, implemented on separate chips.
The I/O subsystem 406 couples input/output peripherals on the electronic device 400, such as the display system 412 and the other input or control devices 416, with the peripherals interface 418. The I/O subsystem 406 optionally includes a display controller 456, an image sensor controller 458, an intensity sensor controller 459, one or more input controllers 452 for other input or control devices, and an IMU controller 432. The one or more input controllers 452 receive/send electrical signals from/to the other input or control devices 416. One example of the other input or control devices 416 is an eye tracker that tracks an eye gaze of a user. Another example of the other input or control devices 416 is an extremity tracker that tracks an extremity (e.g., a finger) of a user. In some implementations, the one or more input controllers 452 are, optionally, coupled with any (or none) of the following: a keyboard, infrared port, Universal Serial Bus (USB) port, stylus, finger-wearable device, and/or a pointer device such as a mouse. The one or more buttons optionally include a push button. In some implementations, the other input or control devices 416 include a positional system (e.g., GPS) that obtains information concerning the location and/or orientation of the electronic device 400 relative to a particular object. In some implementations, the other input or control devices 416 include a depth sensor and/or a time-of-flight sensor that obtains depth information characterizing a physical object within a physical environment. In some implementations, the other input or control devices 416 include an ambient light sensor that senses ambient light from a physical environment and outputs corresponding ambient light data.
The display system 412 provides an input interface and an output interface between the electronic device 400 and a user. The display controller 456 receives and/or sends electrical signals from/to the display system 412. The display system 412 displays visual output to the user. The visual output optionally includes graphics, text, icons, video, and any combination thereof (sometimes referred to herein as “computer-generated content”). In some implementations, some or all of the visual output corresponds to user interface objects. As used herein, the term “affordance” refers to a user-interactive graphical user interface object (e.g., a graphical user interface object that is configured to respond to inputs directed toward the graphical user interface object). Examples of user-interactive graphical user interface objects include, without limitation, a button, slider, icon, selectable menu item, switch, hyperlink, or other user interface control.
The display system 412 may have a touch-sensitive surface, sensor, or set of sensors that accepts input from the user based on haptic and/or tactile contact. The display system 412 and the display controller 456 (along with any associated modules and/or sets of instructions in the memory 402) detect contact (and any movement or breaking of the contact) on the display system 412 and convert the detected contact into interaction with user-interface objects (e.g., one or more soft keys, icons, web pages or images) that are displayed on the display system 412.
The display system 412 optionally uses LCD (liquid crystal display) technology, LPD (light emitting polymer display) technology, or LED (light emitting diode) technology, although other display technologies are used in other implementations. The display system 412 and the display controller 456 optionally detect contact and any movement or breaking thereof using any of a plurality of touch sensing technologies now known or later developed, including but not limited to capacitive, resistive, infrared, and surface acoustic wave technologies, as well as other proximity sensor arrays or other elements for determining one or more points of contact with the display system 412.
The user optionally makes contact with the display system 412 using any suitable object or appendage, such as a stylus, a finger-wearable device, a finger, and so forth. In some implementations, the user interface is designed to work with finger-based contacts and gestures, which can be less precise than stylus-based input due to the larger area of contact of a finger on the touch screen. In some implementations, the electronic device 400 translates the rough finger-based input into a precise pointer/cursor position or command for performing the actions desired by the user.
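As a rough illustration of turning an imprecise finger contact into a single pointer position, the sketch below computes a weighted centroid of the touched sensor cells. The function `contact_centroid` and the use of per-cell weights (for example, signal strength) are assumptions for illustration; the disclosure does not state how the translation is performed.

```python
from typing import List, Tuple


def contact_centroid(samples: List[Tuple[float, float, float]]) -> Tuple[float, float]:
    """Weighted centroid of (x, y, weight) samples covering a finger contact."""
    total = sum(weight for _x, _y, weight in samples)
    if total <= 0:
        raise ValueError("contact needs at least one positively weighted sample")
    x = sum(x * weight for x, _y, weight in samples) / total
    y = sum(y * weight for _x, y, weight in samples) / total
    return (x, y)


# Two touched cells; the second carries three times the weight of the first.
pointer = contact_centroid([(10.0, 20.0, 1.0), (14.0, 20.0, 3.0)])
```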
The inertial measurement unit (IMU) 430 includes accelerometers, gyroscopes, and/or magnetometers in order to measure various forces, angular rates, and/or magnetic field information with respect to the electronic device 400. Accordingly, in various implementations, the IMU 430 detects one or more positional change inputs of the electronic device 400, such as the electronic device 400 being shaken, rotated, moved in a particular direction, and/or the like.
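Detecting a positional change from IMU readings can be sketched as a threshold test on the measured angular rate and acceleration. The `ImuSample` type, the threshold values, and the assumption that gravity has already been removed from the accelerometer reading are all hypothetical; this is a simplified classifier, not the device's actual detection logic.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class ImuSample:
    accel: Vec3  # linear acceleration in m/s^2, gravity already removed (assumed)
    gyro: Vec3   # angular rate in rad/s


def magnitude(v: Vec3) -> float:
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


def detect_positional_change(sample: ImuSample,
                             accel_threshold: float = 0.5,
                             gyro_threshold: float = 0.1) -> Optional[str]:
    """Classify a sample as 'rotation', 'translation', or None (no change)."""
    if magnitude(sample.gyro) > gyro_threshold:
        return "rotation"
    if magnitude(sample.accel) > accel_threshold:
        return "translation"
    return None


still = ImuSample(accel=(0.01, 0.0, 0.0), gyro=(0.0, 0.0, 0.01))
turning = ImuSample(accel=(0.0, 0.0, 0.0), gyro=(0.0, 0.0, 0.5))
sliding = ImuSample(accel=(1.0, 0.0, 0.0), gyro=(0.0, 0.0, 0.0))
```

A real device would typically filter several samples over time; a single-sample threshold is used here only to keep the sketch short.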
The image sensor(s) 443 capture still images and/or video. In some implementations, an image sensor 443 is located on the back of the electronic device 400, opposite a touch screen on the front of the electronic device 400, so that the touch screen is enabled for use as a viewfinder for still and/or video image acquisition. In some implementations, another image sensor 443 is located on the front of the electronic device 400 so that the user's image is obtained (e.g., for selfies, for videoconferencing while the user views the other video conference participants on the touch screen, etc.). In some implementations, the image sensor(s) are integrated within an HMD. For example, the image sensor(s) 443 output image data that represents a physical object (e.g., a physical agent) within a physical environment.
The contact intensity sensors 465 detect intensity of contacts on the electronic device 400 (e.g., a touch input on a touch-sensitive surface of the electronic device 400). The contact intensity sensors 465 are coupled with the intensity sensor controller 459 in the I/O subsystem 406. The contact intensity sensor(s) 465 optionally include one or more piezoresistive strain gauges, capacitive force sensors, electric force sensors, piezoelectric force sensors, optical force sensors, capacitive touch-sensitive surfaces, or other intensity sensors (e.g., sensors used to measure the force (or pressure) of a contact on a touch-sensitive surface). The contact intensity sensor(s) 465 receive contact intensity information (e.g., pressure information or a proxy for pressure information) from the physical environment. In some implementations, at least one contact intensity sensor 465 is collocated with, or proximate to, a touch-sensitive surface of the electronic device 400. In some implementations, at least one contact intensity sensor 465 is located on the side of the electronic device 400.
As represented by block 502, in some implementations, the method 500 includes obtaining image data of a physical environment. For example, with reference to
As represented by block 504, in some implementations, the method 500 includes obtaining positional sensor data characterizing a position or a positional change of the electronic device. The positional change may correspond to a rotational movement, a translational movement, a shaking movement, etc. For example, with reference to
As represented by block 506, the method 500 includes obtaining pose data that indicates a plurality of poses of the electronic device within a physical environment. In some implementations, the pose data includes 3D positional information regarding the physical environment. The 3D positional information includes one or more points in z space of the physical environment. Accordingly, the 3D positional information includes depth information regarding the physical environment. In some implementations, and with reference to
As represented by block 512, the method 500 includes obtaining a representation of a 3D environment that includes a first plurality of planes. Each of the first plurality of planes defines a plurality of points in xy space. Accordingly, none of the first plurality of planes defines a point in z space. In other words, the representation of the 3D environment does not include depth information. For example, with reference to
As represented by block 514, the method 500 includes anchoring the representation of the 3D environment to a physical anchor point within the physical environment based on the pose data. For example and with reference to
As another example and with reference to
As represented by block 516, the method 500 includes anchoring a computer-generated object to the representation of the 3D environment based on the pose data. Anchoring the computer-generated object to the representation of the 3D environment may be based on the point in the z space (e.g., depth information), indicated within the pose data. For example, with reference to
As another example and with reference to
In some implementations, anchoring the computer-generated object is in response to detecting a second user input that requests anchoring the computer-generated object to the representation of the 3D environment. For example, the method 500 includes detecting the second user input after detecting the first user input. As another example, the method 500 includes detecting the second user input while the representation of the 3D environment is anchored to the physical anchor point.
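The ordering of the two user inputs can be sketched as a small state holder: the first input fixes the physical anchor point, and the second input anchors the object only while the representation is anchored. The class `AnchoringSession` and its method names are hypothetical and cover just this one ordering; the disclosure also describes other triggers.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point3D = Tuple[float, float, float]


@dataclass
class AnchoringSession:
    """Tracks which of the two anchoring steps have been carried out."""
    anchor_point: Optional[Point3D] = None
    object_anchored: bool = False

    def on_first_input(self, point: Point3D) -> None:
        """First user input: specifies the physical anchor point."""
        self.anchor_point = point

    def on_second_input(self) -> bool:
        """Second user input: anchor the object if the representation is anchored."""
        if self.anchor_point is None:
            return False  # nothing to anchor the object to yet
        self.object_anchored = True
        return True


session = AnchoringSession()
ignored = session.on_second_input()       # arrives before any anchor point exists
session.on_first_input((0.0, 0.0, 2.0))   # representation is now anchored
accepted = session.on_second_input()      # object is anchored to the representation
```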
In some implementations, anchoring the representation of the 3D environment to the physical anchor point is substantially concurrent with anchoring the computer-generated object to the representation of the 3D environment. For example, with reference to
As represented by block 518, in some implementations, the method 500 includes maintaining the dual anchoring (representation anchored to the physical anchor point, and computer-generated object anchored to the representation), based on a positional change of the electronic device. To that end, in some implementations, the method 500 includes detecting, based on the positional sensor data, a positional change of the electronic device. For example, with reference to
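Maintaining the dual anchoring across a positional change can be illustrated with a deliberately simplified sketch. It assumes a pose that consists of a device position only (no rotation), a world-space anchor point, and a fixed offset of the object from the representation; `to_view` and `render_positions` are hypothetical names, and the selection of a second plurality of planes is not modeled here.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def to_view(world_point: Vec3, device_position: Vec3) -> Vec3:
    """Express a world-space point relative to the device (translation-only pose)."""
    return tuple(w - d for w, d in zip(world_point, device_position))


# The representation is world-locked to the physical anchor point.
anchor_world: Vec3 = (0.0, 0.0, 2.0)

# The object is anchored to the representation at a fixed offset that includes depth.
object_offset: Vec3 = (0.5, 0.0, 0.25)
object_world: Vec3 = tuple(a + o for a, o in zip(anchor_world, object_offset))


def render_positions(device_position: Vec3) -> Tuple[Vec3, Vec3]:
    """View-space positions of the representation and the object for one pose."""
    return to_view(anchor_world, device_position), to_view(object_world, device_position)


before = render_positions((0.0, 0.0, 0.0))
after = render_positions((1.0, 0.0, 0.0))  # device moved one unit along x
```

Both view-space positions shift when the device moves, while the offset between the object and the representation stays the same, which is the behavior that world-locking both anchors is meant to preserve.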
The present disclosure describes various features, no single one of which is solely responsible for the benefits described herein. It will be understood that various features described herein may be combined, modified, or omitted, as would be apparent to one of ordinary skill. Other combinations and sub-combinations than those specifically described herein will be apparent to one of ordinary skill, and are intended to form a part of this disclosure. Various methods are described herein in connection with various flowchart steps and/or phases. It will be understood that in many cases, certain steps and/or phases may be combined together such that multiple steps and/or phases shown in the flowcharts can be performed as a single step and/or phase. Also, certain steps and/or phases can be broken into additional sub-components to be performed separately. In some instances, the order of the steps and/or phases can be rearranged and certain steps and/or phases may be omitted entirely. Also, the methods described herein are to be understood to be open-ended, such that additional steps and/or phases to those shown and described herein can also be performed.
Some or all of the methods and tasks described herein may be performed and fully automated by a computer system. The computer system may, in some cases, include multiple distinct computers or computing devices (e.g., physical servers, workstations, storage arrays, etc.) that communicate and interoperate over a network to perform the described functions. Each such computing device typically includes a processor (or multiple processors) that executes program instructions or modules stored in a memory or other non-transitory computer-readable storage medium or device. The various functions disclosed herein may be implemented in such program instructions, although some or all of the disclosed functions may alternatively be implemented in application-specific circuitry (e.g., ASICs or FPGAs or GP-GPUs) of the computer system. Where the computer system includes multiple computing devices, these devices may be co-located or not co-located. The results of the disclosed methods and tasks may be persistently stored by transforming physical storage devices, such as solid-state memory chips and/or magnetic disks, into a different state.
The disclosure is not intended to be limited to the implementations shown herein. Various modifications to the implementations described in this disclosure may be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other implementations without departing from the spirit or scope of this disclosure. The teachings of the invention provided herein can be applied to other methods and systems, and are not limited to the methods and systems described above, and elements and acts of the various implementations described above can be combined to provide further implementations. Accordingly, the novel methods and systems described herein may be implemented in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the spirit of the disclosure. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the disclosure.
Claims
1. A method comprising:
- at an electronic device with one or more processors, a non-transitory memory, and a display: obtaining pose data that indicates a plurality of poses of the electronic device within a physical environment; obtaining a representation of a three-dimensional (3D) environment, wherein the representation of the 3D environment includes a first plurality of planes, wherein each of the first plurality of planes defines a plurality of points in xy space, and wherein the representation is a modified version of the 3D environment that lacks z space information; anchoring the representation of the 3D environment to a physical anchor point within the physical environment based on z space information indicated in the pose data; and anchoring a computer-generated object to the representation of the 3D environment based on the pose data.
2. The method of claim 1, wherein none of the first plurality of planes defines a point in z space.
3. The method of claim 2, wherein the pose data includes 3D positional information regarding the physical environment, wherein the 3D positional information includes a point in the z space, and wherein anchoring the computer-generated object to the representation of the 3D environment is based on the point in the z space.
4. The method of claim 1, wherein obtaining the pose data includes generating a 3D map that characterizes the physical environment.
5. The method of claim 1, wherein obtaining the pose data includes generating a 3D point cloud that characterizes the physical environment.
6. The method of claim 1,
- wherein anchoring the representation of the 3D environment to the physical anchor point includes: rendering the representation of the 3D environment based on the pose data in order to generate a first portion of display data, and displaying, on the display, the first portion of the display data; and
- wherein anchoring the computer-generated object to the representation of the 3D environment includes: rendering the computer-generated object based on the pose data in order to generate a second portion of the display data, and displaying, on the display, the second portion of the display data while displaying the first portion of the display data.
7. The method of claim 1, wherein the representation of the 3D environment corresponds to a 3D representation of a virtual environment.
8. The method of claim 1, wherein the representation of the 3D environment corresponds to a 3D representation of a physical environment.
9. The method of claim 1, wherein anchoring the representation of the 3D environment to the physical anchor point is substantially concurrent with anchoring the computer-generated object to the representation of the 3D environment.
10. The method of claim 1, wherein the electronic device further comprises an image sensor that captures image data of the physical environment, wherein the electronic device further comprises a positional sensor that generates positional sensor data characterizing a position of the electronic device, and wherein obtaining the pose data includes determining the pose data based on the image data and the positional sensor data.
11. The method of claim 10, further comprising:
- detecting, based on the positional sensor data, a positional change of the electronic device; and
- in response to detecting the positional change: maintaining anchoring the representation of the 3D environment to the physical anchor point by anchoring a second plurality of planes of the representation of the 3D environment based on the pose data, wherein each of the second plurality of planes defines a plurality of points in the xy space; and maintaining anchoring the computer-generated object to the representation of the 3D environment by anchoring the computer-generated object to the second plurality of planes of the representation of the 3D environment.
12. The method of claim 10, wherein determining the pose data includes applying simultaneous localization and mapping (SLAM) to the image data and the positional sensor data.
13. The method of claim 1, wherein a corresponding 3D environment defines a plurality of points in xyz space, and wherein the corresponding 3D environment is more graphically complex than the representation of the 3D environment.
14. The method of claim 13, wherein rendering the corresponding 3D environment is associated with a first amount of resource utilization by the electronic device, and wherein rendering the representation of the 3D environment is associated with a second amount of resource utilization by the electronic device that is less than the first amount of resource utilization.
15. The method of claim 1, further comprising:
- detecting a first user input that specifies the physical anchor point, wherein anchoring the representation of the 3D environment to the physical anchor point is in response to detecting the first user input.
16. The method of claim 15, further comprising detecting a second user input that requests anchoring the computer-generated object to the representation of the 3D environment, wherein anchoring the computer-generated object to the representation of the 3D environment is in response to detecting the second user input.
17. An electronic device comprising:
- a display;
- a non-transitory memory; and
- one or more processors to: obtain pose data that indicates a plurality of poses of the electronic device within a physical environment; obtain a representation of a 3D environment, wherein the representation of the 3D environment includes a first plurality of planes, wherein each of the first plurality of planes defines a plurality of points in xy space, and wherein the representation is a modified version of the 3D environment that lacks z space information; anchor the representation of the 3D environment to a physical anchor point within the physical environment based on z space information indicated in the pose data; and anchor a computer-generated object to the representation of the 3D environment based on the pose data.
18. The electronic device of claim 17, wherein none of the first plurality of planes defines a point in z space, wherein the pose data indicates a point in the z space, and wherein anchoring the computer-generated object to the representation of the 3D environment is based on the point in the z space.
19. The electronic device of claim 17, wherein the electronic device further comprises an image sensor that captures image data of the physical environment, wherein the electronic device further comprises a positional sensor that generates positional sensor data characterizing a position of the electronic device, and wherein obtaining the pose data includes determining the pose data based on the image data and the positional sensor data.
20. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which, when executed by an electronic device including a display, cause the electronic device to:
- obtain pose data that indicates a plurality of poses of the electronic device within a physical environment;
- obtain a representation of a 3D environment, wherein the representation of the 3D environment includes a first plurality of planes, wherein each of the first plurality of planes defines a plurality of points in xy space, and wherein the representation is a modified version of the 3D environment that lacks z space information;
- anchor the representation of the 3D environment to a physical anchor point within the physical environment based on z space information indicated in the pose data; and
- anchor a computer-generated object to the representation of the 3D environment based on the pose data.
| 10740960 | August 11, 2020 | Stachniak |
| 10832417 | November 10, 2020 | Tzur |
| 20210042958 | February 11, 2021 | Engel et al. |
| 20210183161 | June 17, 2021 | Upendran et al. |
| 20220229534 | July 21, 2022 | Terre |
| 20220292715 | September 15, 2022 | Liu |
| 20230315383 | October 5, 2023 | Canberk |
Type: Grant
Filed: May 2, 2023
Date of Patent: May 5, 2026
Assignee: APPLE INC. (Cupertino, CA)
Inventors: Rudy Poot (Mill Valley, CA), Alvin Chung (Santa Clara, CA), Tomas Alvarez Rodriguez (Boulder, CO)
Primary Examiner: Saptarshi Mazumder
Application Number: 18/142,322
International Classification: G06T 19/00 (20110101);