Free-space user interface and control using virtual constructs

During control of a user interface via free-space motions of a hand or other suitable control object, switching between control modes can be facilitated by tracking the control object's movements relative to, and its penetration of, a virtual control construct (such as a virtual surface construct). The position of the virtual control construct can be updated, continuously or from time to time, based on the control object's location.

Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 16/054,891, filed Aug. 3, 2018, entitled “Free-space User Interface and Control Using Virtual Constructs”, which is a continuation of U.S. patent application Ser. No. 15/358,104, filed on Nov. 21, 2016, entitled “Free-space User Interface and Control Using Virtual Constructs”, which is a continuation of U.S. patent application Ser. No. 14/154,730, filed on Jan. 14, 2014, entitled “Free-space User Interface and Control Using Virtual Constructs” which claims priority to and the benefit of, and incorporates herein by reference in their entireties, U.S. Provisional Application Nos. 61/825,515 and 61/825,480, both filed on May 20, 2013; No. 61/873,351, filed on Sep. 3, 2013; No. 61/877,641, filed on Sep. 13, 2013; No. 61/816,487, filed on Apr. 26, 2013; No. 61/824,691, filed on May 17, 2013; Nos. 61/752,725, 61/752,731, and 61/752,733, all filed on Jan. 15, 2013; No. 61/791,204, filed on Mar. 15, 2013; Nos. 61/808,959 and 61/808,984, both filed on Apr. 5, 2013; and No. 61/872,538, filed on Aug. 30, 2013.

TECHNICAL FIELD

Implementations relate generally to machine-user interfaces, and more specifically to the interpretation of free-space user movements as control inputs.

BACKGROUND

Current computer systems typically include a graphical user interface that can be navigated by a cursor, i.e., a graphic element displayed on the screen and movable relative to other screen content, which serves to indicate a position on the screen. The cursor is usually controlled by the user via a computer mouse or touch pad. In some systems, the screen itself doubles as an input device, allowing the user to select and manipulate graphical user interface components by touching the screen where they are located. While touch can be convenient and relatively intuitive for many users, it is not very accurate: a fingertip is large relative to on-screen targets, so the user's fingers can easily cover multiple links on a crowded display, leading to erroneous selection. Touch is also unforgiving: it requires the user's motions to be confined to specific areas of space. For example, if one's hand is displaced by merely one key-width to the right or left while typing, nonsense appears on the screen.

Mice, touch pads, and touch screens can be cumbersome and inconvenient to use. Touch pads and touch screens require the user to be in close physical proximity to the pad (which is often integrated into a keyboard) or screen so as to be able to reach them, which significantly restricts users' range of motion while providing input to the system. Touch is, moreover, not always reliably detected, sometimes necessitating repeated motions across the pad or screen to effect the input. Mice facilitate user input at some distance from the computer and screen (determined by the length of the connection cable or the range of the wireless connection between computer and mouse), but require a flat surface with suitable surface properties, or even a special mouse pad, to function properly. Furthermore, prolonged use of a mouse, in particular if it is positioned sub-optimally relative to the user, can result in discomfort or even pain.

Accordingly, alternative input mechanisms that provide users with the advantages of touch-based controls while freeing them from the many disadvantages of touch-based control are highly desirable.

SUMMARY

Aspects of the systems and methods described herein provide for improved machine interface and/or control by interpreting the motions (and/or position, configuration) of one or more control objects or portions thereof relative to one or more virtual control constructs defined (e.g., programmatically) in free space disposed at least partially within a field of view of an image-capture device. In implementations, the position, orientation, and/or motion of control object(s) (e.g., a user's finger(s), thumb, etc.; a suitable hand-held pointing device such as a stylus, wand, or some other control object; portions and/or combinations thereof) are tracked relative to virtual control surface(s) to facilitate determining whether an engagement gesture has occurred. Engagement gestures can include engaging with a control (e.g., selecting a button or switch), disengaging from a control (e.g., releasing a button or switch), motions that do not involve engagement with any control (e.g., motion that is tracked by the system, possibly followed by a cursor, and/or a single object in an application or the like), environmental interactions (i.e., gestures to direct an environment rather than a specific control, such as scroll up/down), special-purpose gestures (e.g., brighten/darken screen, volume control, etc.), as well as others or combinations thereof.

Engagement gestures can be mapped to one or more controls, or a control-less screen location, of a display device associated with the machine under control. Implementations provide for mapping of movements in three-dimensional (3D) space conveying control and/or other information to zero, one, or more controls. Controls can include embedded controls (e.g., sliders, buttons, and other control objects in an application), or environmental-level controls (e.g., windowing controls, scrolls within a window, and other controls affecting the control environment). In implementations, controls can be displayable using two-dimensional (2D) presentations (e.g., a traditional cursor symbol, cross-hairs, icon, graphical representation of the control object, or other displayable object) on, e.g., one or more display screens, and/or 3D presentations using holography, projectors, or other mechanisms for creating 3D presentations. Presentations can also be audible (e.g., mapped to sounds, or other mechanisms for conveying audible information) and/or haptic.

In an implementation, determining whether motion information defines an engagement gesture can include finding an intersection (also referred to as a contact, pierce, or a “virtual touch”) of motion of a control object with a virtual control surface, whether actually detected or determined to be imminent; dis-intersection (also referred to as a “pull back” or “withdrawal”) of the control object with a virtual control surface; a non-intersection—i.e., motion relative to a virtual control surface (e.g., wave of a hand approximately parallel to the virtual surface to “erase” a virtual chalk board); or other types of identified motions relative to the virtual control surface suited to defining gestures conveying information to the machine. In an implementation and by way of example, one or more virtual control constructs can be defined computationally (e.g., programmatically using a computer or other intelligent machinery) based upon one or more geometric constructs to facilitate determining occurrence of engagement gestures from information about one or more control objects (e.g., hand, tool, combinations thereof) captured using imaging systems, scanning systems, or combinations thereof. Virtual control constructs in an implementation can include virtual surface constructs, virtual linear or curvilinear constructs, virtual point constructs, virtual solid constructs, and complex virtual constructs comprising combinations thereof. Virtual surface constructs can comprise one or more surfaces, e.g., a plane, curved open surface, closed surface, bounded open surface, or generally any multi-dimensional virtual surface definable in two or three dimensions. Virtual linear or curvilinear constructs can comprise any one-dimensional virtual line, curve, line segment or curve segment definable in one, two, or three dimensions. Virtual point constructs can comprise any zero-dimensional virtual point definable in one, two, or three dimensions. 
Virtual solids can comprise one or more solids, e.g., spheres, cylinders, cubes, or generally any three-dimensional virtual solid definable in three dimensions.

In an implementation, an engagement target can be defined using one or more virtual construct(s) coupled with a virtual control (e.g., slider, button, rotatable knob, or any graphical user interface component) for presentation to user(s) by a presentation system (e.g., displays, 3D projections, holographic presentation devices, non-visual presentation systems such as haptics, audio, and the like, any other devices for presenting information to users, or combinations thereof). Coupling a virtual control with a virtual construct enables the control object to “aim” for, or move relative to, the virtual control—and therefore the virtual control construct. Engagement targets in an implementation can include engagement volumes, engagement surfaces, engagement lines, engagement points, or the like, as well as complex engagement targets comprising combinations thereof. An engagement target can be associated with an application or non-application (e.g., OS, systems software, etc.) so that virtual control managers (i.e., program routines, classes, objects, etc. that manage the virtual control) can trigger differences in interpretation of engagement gestures including presence, position and/or shape of control objects, control object motions, or combinations thereof to conduct machine control. As explained in more detail below with reference to example implementations, engagement targets can be used to determine engagement gestures by providing the capability to discriminate between engagement and non-engagement (e.g., virtual touches, moves in relation to, and/or virtual pierces) of the engagement target by the control object.

In an implementation, determining whether motion information defines an engagement gesture can include determining one or more engagement attributes from the motion information about the control object. In an implementation, engagement attributes include motion attributes (e.g., speed, acceleration, duration, distance, etc.), gesture attributes (e.g., hand, two hands, tools, type, precision, etc.), other attributes and/or combinations thereof.

In an implementation, determining whether motion information defines an engagement gesture can include filtering motion information to determine whether motion comprises an engagement gesture. Filtering can be applied based upon engagement attributes, characteristics of motion, position in space, other criteria, and/or combinations thereof. Filtering can enable identification of engagement gestures, discrimination of engagement gestures from extraneous motions, discrimination of engagement gestures of differing types or meanings, and so forth.

In an implementation, sensing an engagement gesture provides an indication for selecting a mode to control a user interface of the machine (e.g., an “engaged mode” simulating a touch, or a “disengaged mode” simulating no contact and/or a hover in which a control is selected but not actuated). Other modes useful in various implementations include an “idle,” in which no control is selected nor virtually touched, and a “lock,” in which the last control to be engaged with remains engaged until disengaged. Yet further, hybrid modes can be created from the definitions of the foregoing modes in implementations.

In various implementations, to trigger an engaged mode (corresponding to, e.g., touching an object or a virtual object displayed on a screen), the control object's motion toward an engagement target such as a virtual surface construct (i.e., a plane, plane portion, or other (non-planar or curved) surface computationally or programmatically defined in space, but not necessarily corresponding to any physical surface) can be tracked; the motion can be, e.g., a forward motion starting from a disengaged mode, or a backward retreating motion. When the control object reaches a spatial location corresponding to this virtual surface construct, i.e., when the control object intersects (“touches” or “pierces”) the virtual surface construct, the user interface (or a component thereof, such as a cursor, user-interface control, or user-interface environment) is operated in the engaged mode; as the control object retracts from the virtual surface construct, user-interface operation switches back to the disengaged mode.

In implementations, the virtual surface construct can be fixed in space, e.g., relative to the screen; for example, it can be defined as a plane (or portion of a plane) parallel to and located several inches in front of the screen in one application, or as a curved surface defined in free space convenient to one or more users and optionally proximately to display(s) associated with one or more machines under control. The user can engage this plane while remaining at a comfortable distance from the screen (e.g., without needing to lean forward to reach the screen). The position of the plane can be adjusted by the user from time to time. In implementations, however, the user is relieved of the need to explicitly change the plane's position; instead, the plane (or other virtual surface construct) automatically moves along with, as if tethered to, the user's control object. For example, a virtual plane can be computationally defined as perpendicular to the orientation of the control object and located a certain distance, e.g., 3-4 millimeters, in front of its tip when the control object is at rest or moving with constant velocity. As the control object moves, the plane follows it, but with a certain time lag (e.g., 0.2 second). As a result, as the control object accelerates, the distance between its tip and the virtual touch plane changes, allowing the control object, when moving towards the plane, to eventually “catch” the plane—that is, the tip of the control object to touch or pierce the plane. Alternatively, instead of being based on a fixed time lag, updates to the position of the virtual plane can be computed based on a virtual energy potential defined to accelerate the plane towards (or away from) the control object tip depending on the plane-to-tip distance, likewise allowing the control object to touch or pierce the plane. Either way, such virtual touching or piercing can be interpreted as engagement events. 
Further, in some implementations, the degree of piercing (i.e., the distance beyond the plane that the control object reaches) is interpreted as an intensity level. To guide the user as she engages with or disengages from the virtual plane (or other virtual surface construct), the cursor symbol can encode the distance from the virtual surface visually, e.g., by changing in size with varying distance.
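The time-lagged tracking described above can be sketched in one dimension as follows. This is a minimal illustration under stated assumptions, not the implementation of this application: positions are scalars in millimeters measured along the control object's pointing axis, the lag is expressed in whole frames (e.g., a 0.2-second lag at 10 frames per second is 2 frames), and the plane is placed by extrapolating the tip's lagged position with its lagged velocity, so that the plane sits at the rest offset whenever the tip is at rest or moves at constant velocity. The names LaggedPlane, offset_mm, lag_frames, and intensity are hypothetical.

```python
from collections import deque


class LaggedPlane:
    """1D sketch of a virtual plane that trails a control-object tip with a lag."""

    def __init__(self, offset_mm=3.5, lag_frames=2):
        self.offset_mm = offset_mm    # rest distance in front of the tip (assumed)
        self.lag_frames = lag_frames  # time lag expressed in frames (assumed)
        # Keep just enough history to read the tip position and velocity
        # as they were lag_frames ago.
        self.samples = deque(maxlen=lag_frames + 2)
        self.plane_mm = None

    def update(self, tip_mm):
        """Feed one tip sample; return the piercing depth (positive = pierced)."""
        self.samples.append(tip_mm)
        if len(self.samples) < self.samples.maxlen:
            # Not enough history yet: treat the tip as being at rest.
            self.plane_mm = tip_mm + self.offset_mm
        else:
            lagged_tip = self.samples[1]
            lagged_velocity = self.samples[1] - self.samples[0]  # mm per frame
            # Extrapolate the lagged state to "now"; exact for constant velocity.
            self.plane_mm = (lagged_tip
                             + lagged_velocity * self.lag_frames
                             + self.offset_mm)
        return tip_mm - self.plane_mm


def intensity(depth_mm):
    """Interpret piercing depth as an intensity level (zero when not pierced)."""
    return max(0.0, depth_mm)


if __name__ == "__main__":
    plane = LaggedPlane()
    for tip in [0, 0, 0, 0, 0, 2, 6, 12]:  # the tip accelerates toward the plane
        depth = plane.update(tip)
        print(tip, round(depth, 2), "engaged" if depth > 0 else "disengaged")
```

In this sketch a tip moving at constant velocity never reaches the plane, whereas an accelerating tip outruns the lagged extrapolation and pierces it, which mirrors the "catching" behavior described above.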

In an implementation, once engaged, further movements of the control object can serve to move graphical components across the screen (e.g., drag an icon, shift a scroll bar, etc.), change perceived “depth” of the object to the viewer (e.g., resize and/or change shape of objects displayed on the screen in connection, alone, or coupled with other visual effects) to create perception of “pulling” objects into the foreground of the display or “pushing” objects into the background of the display, create new screen content (e.g., draw a line), or otherwise manipulate screen content until the control object disengages (e.g., by pulling away from the virtual surface, indicating disengagement with some other gesture of the control object (e.g., curling the forefinger backward); and/or with some other movement of a second control object (e.g., waving the other hand, etc.)). Advantageously, tying the virtual surface construct to the control object (e.g., the user's finger), rather than fixing it relative to the screen or other stationary objects, allows the user to consistently use the same motions and gestures to engage and manipulate screen content regardless of his precise location relative to the screen. To eliminate the inevitable jitter typically accompanying the control object's movements and which might otherwise result in switching back and forth between the modes unintentionally, the control object's movements can be filtered and the cursor position thereby stabilized. Since faster movements will generally result in more jitter, the strength of the filter can depend on the speed of motion.
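One way to picture a filter whose strength depends on the speed of motion is an exponential smoother whose blending weight varies with the estimated speed. The sketch below is an assumption-laden illustration rather than the filter of this application: the class name SpeedAdaptiveFilter and the parameters weight_slow, weight_fast, and speed_ref are hypothetical, the signal is a single scalar coordinate, and the default weights are chosen so that faster motion is smoothed more heavily, following the rationale stated above. Swapping the two weights yields the opposite behavior.

```python
class SpeedAdaptiveFilter:
    """Exponential smoother whose weight on new samples varies with speed."""

    def __init__(self, weight_slow=0.8, weight_fast=0.3, speed_ref=100.0):
        self.weight_slow = weight_slow  # weight on new samples at zero speed
        self.weight_fast = weight_fast  # weight on new samples at/above speed_ref
        self.speed_ref = speed_ref      # speed at which weight_fast is reached
        self.value = None               # current filtered value

    def step(self, sample, dt):
        """Feed one raw sample taken dt seconds after the previous one."""
        if self.value is None:
            self.value = float(sample)
            return self.value
        # Rough speed estimate from the raw sample and the filtered value.
        speed = abs(sample - self.value) / dt
        blend = min(speed / self.speed_ref, 1.0)
        weight = self.weight_slow + (self.weight_fast - self.weight_slow) * blend
        self.value += weight * (sample - self.value)
        return self.value


if __name__ == "__main__":
    f = SpeedAdaptiveFilter()
    for raw in [0.0, 0.2, 0.1, 5.0, 9.0, 9.1]:
        print(raw, round(f.step(raw, dt=0.01), 3))
```

A smaller weight means the filtered cursor position follows the raw position more sluggishly, trading responsiveness for stability.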

Accordingly, in one aspect, a computer-implemented method of controlling a machine user interface is provided. The method involves receiving information including motion information for a control object; determining from the motion information whether a motion of the control object is an engagement gesture according to an occurrence of an engagement gesture applied to at least one virtual control construct defined within a field of view of an image capturing device; determining a control to which the engagement gesture is applicable; and manipulating the control according to at least the motion information. The method can further include updating at least a spatial position of the virtual control construct(s) based at least in part on a spatial position of the control object determined from the motion information, thereby enabling the spatial position of the virtual control construct(s) to follow tracked motions of the control object.

In some implementations, determining whether a motion of the control object is an engagement gesture includes determining whether an intersection between the control object and the virtual control construct(s), a dis-intersection of the control object from the virtual control construct(s), or a motion of the control object relative to the virtual control construct(s) occurred. The method can further include determining from the motion information whether the engagement includes continued motion after intersection. In some implementations, determining from the motion information whether a motion of the control object is an engagement gesture includes determining from the motion information one or more engagement attributes (e.g., a potential energy) defining an engagement gesture. In some implementations, determining whether a motion of the control object is an engagement gesture includes identifying an engagement gesture by correlating motion information to at least one engagement gesture based at least upon one or more of motion of the control object, occurrence of any of an intersection, a dis-intersection or a non-intersection of the control object with the virtual control construct, and the set of engagement attributes.

Determining a control to which the engagement gesture is applicable can include selecting a control associated with an application, a control associated with an operating environment, and/or a special control. Manipulating a control according to at least the motion information can include controlling a user interface in a first mode, and otherwise controlling the user interface in a second mode different from the first mode.

In another aspect, a computer-implemented method of controlling a machine user interface is provided. The method includes receiving information including motion information for a control object. Further, it includes determining from the motion information whether a motion of the control object is an engagement gesture according to an occurrence of an engagement gesture applied to at least one virtual control construct defined within a field of view of an image capturing device by (i) determining whether an intersection occurred between the control object and at least one virtual control construct, and when an intersection has occurred determining from the motion information whether the engagement includes continued motion after intersection; otherwise (ii) determining whether a dis-intersection of the control object from the at least one virtual control construct occurred; otherwise (iii) determining whether motion of the control object occurred relative to at least one virtual control construct; (iv) determining from the motion information a set of engagement attributes defining an engagement gesture; and (v) identifying an engagement gesture by correlating motion information to at least one engagement gesture based at least upon one or more of motion of the control object, occurrence of any of an intersection, a dis-intersection or a non-intersection of the control object with the virtual control construct, and the set of engagement attributes. Further, the method involves determining a control to which the engagement gesture is applicable, and manipulating the control according to at least the engagement gesture.
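The ordered checks (i) through (iii) above can be illustrated with a small per-frame classifier. This is a hedged sketch, not the claimed method: it assumes the motion information has already been reduced to a signed penetration depth (positive when the control object is past the construct) and a lateral displacement per frame, and the function name classify_step, the threshold move_eps, and the returned labels are all hypothetical.

```python
def classify_step(prev_depth, curr_depth, lateral_move, move_eps=1.0):
    """Classify one frame of motion relative to a virtual control construct.

    prev_depth, curr_depth: signed penetration depth in the previous and
        current frame (positive = past the construct).
    lateral_move: displacement parallel to the construct during this frame.
    move_eps: smallest displacement treated as deliberate motion (assumed).
    """
    # (i) Intersection: the control object crossed into the construct.
    if prev_depth <= 0 < curr_depth:
        return "intersection"
    # (ii) Dis-intersection: the control object withdrew from the construct.
    if prev_depth > 0 >= curr_depth:
        return "dis-intersection"
    # (iii) Motion relative to the construct without crossing it.
    moved = abs(lateral_move) > move_eps or abs(curr_depth - prev_depth) > move_eps
    if moved:
        # While pierced, this is continued motion after intersection.
        return "continued-motion" if curr_depth > 0 else "non-intersection-motion"
    return "none"


if __name__ == "__main__":
    frames = [(-2.0, 1.0, 0.0), (1.0, 1.2, 5.0), (1.2, -1.0, 0.0), (-3.0, -3.0, 5.0)]
    for prev, curr, lateral in frames:
        print(classify_step(prev, curr, lateral))
```

The labels produced here would then be combined with engagement attributes (steps (iv) and (v)) to identify a specific gesture; that correlation step is not shown.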

In another aspect, a computer-implemented method for facilitating control of a user interface via free-space motions of a control object is provided. One method implementation includes receiving data indicative of tracked motions of the control object, and computationally (i.e., using a processor) defining a virtual control construct and updating a spatial position (and, in some implementations, also a spatial orientation) of the virtual control construct based at least in part on the data such that the position of the virtual control construct follows the tracked motions of the control object. Further, implementations of the method involve computationally determining whether the control object intersects the virtual control construct, and, if so, controlling the user interface in a first mode (e.g., an engaged mode), and otherwise controlling the user interface in a second mode different from the first mode (e.g., a disengaged mode).

In some implementations, the virtual control construct follows the tracked motions of the control object with a time lag, which can be fixed or, e.g., depend on a motion parameter of the control object. In alternative implementations, the spatial position of the virtual control construct is updated based on a current distance between the control object and the virtual control construct, e.g., in accordance with a virtual energy potential defined as a function of that distance. The virtual energy potential can have minima at steady-state distances between the control object and the virtual control construct in the engaged mode and the disengaged mode. In some implementations, the steady-state distance in the engaged mode is equal to the steady-state distance in the disengaged mode; in other implementations, the steady-state distance in the engaged mode is larger (or smaller) than the steady-state distance in the disengaged mode.
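A potential with minima at two steady-state distances can be illustrated with a simple double-well polynomial. The sketch below rests on several assumptions that are not taken from this application: the distance d is defined as the plane position minus the tip position along the pointing axis (so d is negative once the tip has pierced the plane), the potential has the hypothetical form U(d) = (d - d_dis)^2 (d - d_eng)^2, and the plane is relaxed by plain first-order gradient steps instead of being accelerated. The names potential, potential_slope, relax_plane, d_dis, and d_eng are illustrative.

```python
def potential(d, d_dis=3.0, d_eng=-3.0):
    """Double-well potential with minima at d_dis (disengaged) and d_eng (engaged)."""
    return (d - d_dis) ** 2 * (d - d_eng) ** 2


def potential_slope(d, d_dis=3.0, d_eng=-3.0):
    """Derivative dU/dd of the double-well potential."""
    return 2.0 * (d - d_dis) * (d - d_eng) * (2.0 * d - d_dis - d_eng)


def relax_plane(plane, tip, steps=500, rate=0.001):
    """Move the plane downhill on the potential while the tip is held fixed."""
    for _ in range(steps):
        plane -= rate * potential_slope(plane - tip)
    return plane


if __name__ == "__main__":
    # Plane starts 5 units ahead of the tip and settles at the disengaged distance.
    print(round(relax_plane(plane=5.0, tip=0.0), 4))
    # Tip jumps past the plane (d = -1); the plane settles at the engaged distance.
    print(round(relax_plane(plane=3.0, tip=4.0), 4))
```

With these defaults the barrier between the two wells sits at d = 0, so a slow approach lets the plane retreat ahead of the tip, while a quick thrust that carries the tip past the plane leaves the system in the engaged well.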

Determining whether the control object intersects the virtual control construct can involve computing an intersection of a straight line through the axis of the control object with a screen displaying the user interface or, alternatively, computationally projecting a tip of the control object perpendicularly onto the screen. Controlling the user interface can involve updating the screen content based, at least in part, on the tracked control object motions and the operational mode (e.g., the engaged or disengaged mode). For example, in some implementations, it involves operating a cursor variably associated with a screen position; a cursor symbol can be displayed on the screen at that position. The cursor can also be indicative of a distance between the control object and the virtual control construct. (The term “cursor,” as used herein, refers to a control element operable to select a screen position—whether or not the control element is actually displayed—and manipulate screen content via movement across the screen, i.e., changes in the selected position.) In some implementations, the method further includes computationally determining, for a transition from the disengaged mode to the engaged mode, a degree of penetration of the virtual control construct by the control object, and controlling the user interface based at least in part thereon.
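The two ways of mapping the control object to a screen position mentioned above (extending a line along the object's axis, or projecting the tip perpendicularly) reduce to standard line-plane geometry. The following sketch assumes the screen is modeled as an infinite plane given by a point and a normal vector, with all coordinates in one common 3D frame; the function names axis_hit and perpendicular_projection are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def axis_hit(tip, direction, origin, normal):
    """Intersect the line through `tip` along `direction` with the screen plane.

    The plane passes through `origin` and has normal vector `normal`.
    Returns None when the line is parallel to the plane.
    """
    denom = dot(direction, normal)
    if abs(denom) < 1e-12:
        return None
    t = dot([o - p for o, p in zip(origin, tip)], normal) / denom
    return tuple(p + t * d for p, d in zip(tip, direction))


def perpendicular_projection(tip, origin, normal):
    """Project `tip` perpendicularly onto the screen plane."""
    dist = dot([p - o for p, o in zip(tip, origin)], normal) / dot(normal, normal)
    return tuple(p - dist * n for p, n in zip(tip, normal))


if __name__ == "__main__":
    screen_origin, screen_normal = (0, 0, 0), (0, 0, 1)
    finger_tip = (1, 2, 10)
    print(axis_hit(finger_tip, (1, 0, -1), screen_origin, screen_normal))
    print(perpendicular_projection(finger_tip, screen_origin, screen_normal))
```

The axis-based mapping lets a small change in pointing angle sweep the cursor across the screen, whereas the perpendicular projection follows only the tip's lateral position.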

The method can also include acquiring a temporal sequence of images of the control object (e.g., with a camera system having depth-sensing capability) and/or computationally tracking the motions of the control object based on the sequence of images. In some implementations, the control object motions are computationally filtered based, at least in part, on the control object's velocity.

In another aspect, implementations pertain to a computer-implemented method for controlling a user interface via free-space motions of a control object. The method involves receiving motion information indicating positions of a control object being tracked in free space, and, using a processor, (i) defining a virtual control construct, at least a portion thereof having a spatial position determined based at least in part on the motion information such that the virtual control construct portion is positioned proximate to the control object, (ii) determining from the motion information whether the tracked motions of the control object indicate that the control object has intersected the virtual control construct, and (iii) switching from conducting control of a user interface in a first mode to conducting control of the user interface in a second mode based at least in part upon an occurrence of the control object intersecting the virtual control construct. The method can further involve updating at least the spatial position of the virtual control construct portion based at least in part on the motion information such that the virtual control construct portion is enabled to follow the control object.

In another aspect, implementations provide a system for controlling a machine user interface via free-space motions of a control object tracked with an image capturing device, the system including a processor and memory. The memory stores (i) motion information for the control object; and (ii) processor-executable instructions for causing the processor to determine from the motion information whether a motion of the control object is an engagement gesture according to an occurrence of an engagement gesture applied to at least one virtual control construct defined within a field of view of the image capturing device, to determine a control to which the engagement gesture is applicable, and to manipulate the control according to at least the motion information.

Yet another aspect pertains to a non-transitory machine-readable medium. In implementations, the medium stores one or more instructions which, when executed by one or more processors, cause the one or more processors to determine from motion information received for a control object whether a motion of the control object is an engagement gesture according to an occurrence of an engagement gesture applied to at least one virtual control construct defined within a field of view of an image capturing device; determine a control to which the engagement gesture is applicable; and manipulate the control according to at least the motion information.

In a further aspect, a system for controlling a user interface via free-space motions of a control object tracked by a motion-capture system is provided. The system includes a processor and associated memory, the memory storing processor-executable instructions for causing the processor to (i) computationally define a virtual control construct relative to the control object and update at least a spatial position thereof, based at least in part on the tracked motions of the control object, such that the spatial position of the virtual control construct follows the tracked motions of the control object, (ii) computationally determine whether the control object, in the current spatial position, intersects the virtual control construct, and (iii) if so, control the user interface in a first mode, and otherwise control the user interface in a second mode different from the first mode. In some implementations, the first and second modes are engaged and disengaged modes, respectively. Execution of the instructions by the processor can cause the processor to compute a position of the virtual control construct relative to the current position of the control object such that the virtual control construct follows the tracked motions of the control object with a time lag, and/or to update the spatial position of the virtual control construct in accordance with a virtual energy potential defined as a function of a distance between the control object and the virtual control construct.

The system can further include the motion-capture system for tracking the motions of the control object in three dimensions based on a temporal sequence of images of the control object. In some implementations, the motion-capture system includes one or more camera(s) acquiring the images and a plurality of image buffers for storing a most recent set of the images. The system can also have a filter for computationally filtering the motions of the control object based, at least in part, on a velocity of these motions. In addition, the system can include a screen for displaying the user interface; execution of the instructions by the processor can cause the processor to update screen content based, at least in part, on the mode and the tracked motions of the control object. In some implementations, execution of the instructions by the processor causes the processor to operate a cursor associated with a position on a screen based, at least in part, on the mode and the tracked motions of the control object. The screen can display a cursor symbol at the associated position; the cursor symbol can be indicative of a distance between the control object and the virtual control construct.

In another aspect, a non-transitory machine-readable medium storing one or more instructions is provided. The instructions, when executed by one or more processors, cause the one or more processors to (i) computationally define a virtual control construct and update at least a spatial position thereof based at least in part on data indicative of tracked motions of a control object such that the position of the virtual control construct follows the tracked motions of the control object, (ii) computationally determine whether the control object intersects the virtual control construct, and (iii) if so, control the user interface in a first mode, and otherwise control the user interface in a second mode different from the first mode.

In yet another aspect, a computer-implemented method for facilitating control of a user interface via free-space motions of a control object is provided. The method involves receiving data indicative of tracked motions of the control object, and, using a processor, (i) computationally defining a virtual control construct and updating at least a spatial position thereof based at least in part on the data such that the position of the virtual control construct follows the tracked motions of the control object, (ii) computationally detecting when a tip of the control object transitions from one side of the virtual control construct to another side, and (iii) whenever it does, switching between two modes of controlling the user interface.
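Detecting when the tip moves from one side of the construct to the other amounts to watching for a sign change in the tip's signed distance to the construct. The sketch below is a minimal illustration under that assumption; the class name SideCrossingToggle and the mode labels are hypothetical, and computing the signed distance itself is left to the tracking system.

```python
class SideCrossingToggle:
    """Switch between two modes each time the tip changes sides of the construct."""

    def __init__(self, modes=("disengaged", "engaged")):
        self.modes = modes
        self.index = 0         # start in the first mode
        self.last_side = None  # side seen in the previous frame

    def update(self, signed_distance):
        """Feed the tip's signed distance to the construct; return the mode."""
        side = signed_distance >= 0
        if self.last_side is not None and side != self.last_side:
            self.index = 1 - self.index  # a side change toggles the mode
        self.last_side = side
        return self.modes[self.index]


if __name__ == "__main__":
    toggle = SideCrossingToggle()
    for dist in [2.0, 1.0, -0.5, -1.0, 0.5]:
        print(dist, toggle.update(dist))
```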

In a further aspect, yet another computer-implemented method for facilitating control of a user interface via free-space motions of a control object is provided. The method includes tracking motions of a control object and a gesturer; using a processor to continuously determine computationally whether the control object intersects a virtual control construct located at a temporarily fixed location in space and, if so, controlling the user interface in a first mode and otherwise controlling the user interface in a second mode different from the first mode; and, each time upon recognition of a specified gesture performed by the gesturer, using the processor to relocate the virtual control construct to a specified distance from an instantaneous position of the control object.
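The relocate-on-gesture variant can be sketched in one dimension as follows. This is an illustrative simplification, not the claimed method: positions are scalars along the pointing axis, gesture recognition is reduced to a boolean flag supplied by the caller, and the names RelocatablePlane, standoff, and relocate_gesture are hypothetical.

```python
class RelocatablePlane:
    """Virtual plane that stays fixed until a gesture asks for relocation."""

    def __init__(self, plane, standoff):
        self.plane = plane        # current (temporarily fixed) plane position
        self.standoff = standoff  # distance in front of the tip after relocation

    def update(self, tip, relocate_gesture=False):
        """Return the control mode for this frame, relocating the plane if asked."""
        if relocate_gesture:
            # Place the plane a fixed distance ahead of the tip's current position.
            self.plane = tip + self.standoff
        return "engaged" if tip >= self.plane else "disengaged"


if __name__ == "__main__":
    p = RelocatablePlane(plane=10.0, standoff=3.0)
    print(p.update(5.0))                          # tip short of the plane
    print(p.update(11.0))                         # tip past the plane
    print(p.update(11.0, relocate_gesture=True))  # plane moved ahead of the tip
    print(p.update(15.0))                         # tip past the relocated plane
```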

Among other aspects, implementations can enable quicker, crisper gesture-based or “free space” (i.e., not requiring physical contact) interfacing with a variety of machines (e.g., computing systems, including desktop, laptop, tablet computing devices, special purpose computing machinery, including graphics processors, embedded microcontrollers, gaming consoles, audio mixers, or the like; wired or wirelessly coupled networks of one or more of the foregoing, and/or combinations thereof), obviating or reducing the need for contact-based input devices such as a mouse, joystick, touch pad, or touch screen.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing will be more readily understood from the following detailed description, in particular, when taken in conjunction with the drawings, in which:

FIGS. 1A and 1B are perspective views of a planar virtual surface construct and a control object in the disengaged and engaged modes, respectively, illustrating free-space gesture control of a desktop computer in accordance with various implementations;

FIG. 1C-1 is a perspective view of a tablet connected to a motion-capture device, illustrating free-space gesture control of the tablet in accordance with various implementations;

FIG. 1C-2 is a perspective view of a tablet incorporating a motion-capture device, illustrating free-space gesture control of the tablet in accordance with various implementations;

FIG. 1D is a perspective view of a curved virtual surface construct accommodating free-space gesture control of a multi-screen computer system in accordance with various implementations;

FIG. 2 illustrates motion of a virtual surface construct relative to a user's finger in accordance with various implementations;

FIGS. 3A and 3B are plots of a virtual energy potential and its derivative, respectively, in accordance with various implementations for updating the position of a virtual surface construct;

FIGS. 3C-3E are plots of alternative virtual energy potentials in accordance with various implementations for updating the position of a virtual surface construct;

FIGS. 4A, 4B, and 4B-1 are flow charts illustrating methods for machine and/or user interface control in accordance with various implementations;

FIG. 5A is a schematic diagram of a system for tracking control object movements in accordance with various implementations;

FIG. 5B is a block diagram of a computer system for machine control based on tracked control object movements in accordance with various implementations;

FIGS. 6A-6D illustrate a free-space compound gesture in accordance with various implementations;

FIGS. 7A and 7B illustrate, in two snap shots, a zooming action performed by a user via a free-space gesture in accordance with various implementations;

FIGS. 8A and 8B illustrate, in two snap shots, a swiping action performed by a user via a free-space gesture in accordance with various implementations; and

FIGS. 9A and 9B illustrate, in two snap shots, a drawing action performed by a user via free-space hand motions in accordance with various implementations.

DETAILED DESCRIPTION

Systems and methods in accordance herewith generally utilize information about the motion of a control object, such as a user's finger or a stylus, in three-dimensional space to operate a user interface and/or components thereof based on the motion information. Various implementations take advantage of motion-capture technology to track the motions of the control object in real time (or near real time, i.e., sufficiently fast that any residual lag between the control object and the system's response is unnoticeable or practically insignificant). Other implementations can use synthetic motion data (e.g., generated by a computer game) or stored motion data (e.g., previously captured or generated). References to motions in “free space” or “touchless” motions are used herein with reference to an implementation to distinguish them from motions tied to and/or requiring physical contact of the moving object with a physical surface to effect input; however, in some applications, the control object can contact a physical surface ancillary to providing input, in which case the motion is still considered a “free-space” motion. Further, in some implementations, the virtual surface can be defined to co-reside at or very near a physical surface (e.g., a virtual touch screen can be created by defining a (substantially planar) virtual surface at or very near the screen of a display (e.g., television, monitor, or the like); or a virtual active table top can be created by defining a (substantially planar) virtual surface at or very near a table top convenient to the machine receiving the input).

A “control object” as used herein with reference to an implementation is generally any three-dimensionally movable object or appendage with an associated position and/or orientation (e.g., the orientation of its longest axis) suitable for pointing at a certain location and/or in a certain direction. Control objects include, e.g., hands, fingers, feet, or other anatomical parts, as well as inanimate objects such as pens, styluses, handheld controls, portions thereof, and/or combinations thereof. Where a specific type of control object, such as the user's finger, is used hereinafter for ease of illustration, it is to be understood that, unless otherwise indicated or clear from context, any other type of control object can be used as well.

A “virtual control construct” as used herein with reference to an implementation denotes a geometric locus defined (e.g., programmatically) in space and useful in conjunction with a control object, but not corresponding to a physical object; its purpose is to discriminate between different operational modes of the control object (and/or a user-interface element controlled therewith, such as a cursor) based on whether the control object intersects the virtual control construct. The virtual control construct, in turn, can be, e.g., a virtual surface construct (a plane oriented relative to a tracked orientation of the control object or an orientation of a screen displaying the user interface) or a point along a line or line segment extending from the tip of the control object.

The term “intersect” is herein used broadly with reference to an implementation to denote any instance in which the control object, which is an extended object, has at least one point in common with the virtual control construct and, in the case of an extended virtual control construct such as a line or two-dimensional surface, is not parallel thereto. This includes “touching” as an extreme case, but typically involves that portions of the control object fall on both sides of the virtual control construct.

Using the output of a suitable motion-capture system or motion information received from another source, various implementations facilitate user input via gestures and motions performed by the user's hand or a (typically handheld) pointing device. For example, in some implementations, the user can control the position of a cursor and/or other object on the screen by pointing at the desired screen location, e.g., with his index finger, without the need to touch the screen. The position and orientation of the finger relative to the screen, as determined by the motion-capture system, can be used to compute the intersection of a straight line through the axis of the finger with the screen, and a cursor symbol (e.g., an arrow, circle, cross hair, or hand symbol) can be displayed at the point of intersection. If the range of motion causes the intersection point to move outside the boundaries of the screen, the intersection with a (virtual) plane through the screen can be used, and the cursor motions can be re-scaled, relative to the finger motions, to remain within the screen boundaries. Alternatively to extrapolating the finger towards the screen, the position of the finger (or control object) tip can be projected perpendicularly onto the screen; in this implementation, the control object orientation can be disregarded. As will be readily apparent to one of skill in the art, many other ways of mapping the control object position and/or orientation onto a screen location can, in principle, be used; a particular mapping can be selected based on considerations such as, without limitation, the requisite amount of information about the control object, the intuitiveness of the mapping to the user, and the complexity of the computation. 
For example, in some implementations, the mapping is based on intersections with or projections onto a (virtual) plane defined relative to the camera, under the assumption that the screen is located within that plane (which is correct, at least approximately, if the camera is correctly aligned relative to the screen), whereas, in other implementations, the screen location relative to the camera is established via explicit calibration (e.g., based on camera images including the screen).
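The two mappings described above (extrapolating the finger axis onto the screen plane, and projecting the fingertip perpendicularly onto it) can be illustrated with a minimal sketch. The function names, the tuple-based vector representation, and the choice of a plane given by a point and a normal are assumptions made for illustration only; they are not taken from any particular motion-capture interface.

```python
def ray_plane_intersection(tip, direction, plane_point, plane_normal, eps=1e-9):
    """Intersect the straight line through `tip` along `direction` with a plane.

    Returns the intersection point, or None if the line is parallel to the plane.
    All arguments are 3-tuples of floats (hypothetical representation).
    """
    denom = sum(d * n for d, n in zip(direction, plane_normal))
    if abs(denom) < eps:
        return None  # finger axis runs parallel to the screen plane
    diff = [p - t for p, t in zip(plane_point, tip)]
    s = sum(a * n for a, n in zip(diff, plane_normal)) / denom
    return tuple(t + s * d for t, d in zip(tip, direction))


def perpendicular_projection(tip, plane_point, plane_normal):
    """Project `tip` perpendicularly onto the plane; orientation is disregarded."""
    nn = sum(n * n for n in plane_normal)
    dist = sum((t - p) * n for t, p, n in zip(tip, plane_point, plane_normal)) / nn
    return tuple(t - dist * n for t, n in zip(tip, plane_normal))


# Screen assumed to lie in the plane z = 0, with the finger 10 units in front of it.
screen_point, screen_normal = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
finger_tip, finger_axis = (0.0, 0.0, 10.0), (1.0, 0.0, -1.0)
pointed_at = ray_plane_intersection(finger_tip, finger_axis, screen_point, screen_normal)
projected = perpendicular_projection(finger_tip, screen_point, screen_normal)
```

In this sketch the extrapolated cursor position depends on the finger orientation, whereas the projected position depends only on the fingertip location, which mirrors the trade-off between information required and intuitiveness noted above.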

In some implementations, the cursor can be operated in at least two modes: a disengaged mode in which it merely indicates a position on the screen, typically without otherwise affecting the screen content; and one or more engaged modes, which allow the user to manipulate the screen content. In the engaged mode, the user can, for example, drag graphical user-interface elements (such as icons representing files or applications, controls such as scroll bars, or displayed objects) across the screen, or draw or write on a virtual canvas. Further, transient operation in the engaged mode can be interpreted as a click event. Thus, operation in the engaged mode generally corresponds to, or emulates, touching a touch screen or touch pad, or controlling a mouse with a mouse button held down.

The term “cursor,” as used in this discussion, refers generally to the cursor functionality rather than the visual element; in other words, the cursor is a control element operable to select a screen position (whether or not the control element is actually displayed) and manipulate screen content via movement across the screen, i.e., changes in the selected position. The cursor need not always be visible in the engaged mode. In some instances, a cursor symbol still appears, e.g., overlaid onto another graphical element that is moved across the screen, whereas in other instances, cursor motion is implicit in the motion of other screen elements or in newly created screen content (such as a line that appears on the screen as the control object moves), obviating the need for a special symbol. In the disengaged mode, a cursor symbol is typically used to visualize the current cursor location. Alternatively or additionally, a screen element or portion presently co-located with the cursor (and thus the selected screen location) can change brightness, color, or some other property to indicate that it is being pointed at. However, in certain implementations, the symbol or other visual indication of the cursor location can be omitted so that the user has to rely on his own observation of the control object relative to the screen to estimate the screen location pointed at. (For example, in a shooter game, the player can have the option to shoot with or without a “virtual sight” indicating a pointed-to screen location.)

Discrimination between the engaged and disengaged modes can be achieved by tracking the control object relative to a virtual control construct such as a virtual plane (or, more generally, a virtual surface construct). In an implementation and by way of example, as illustrated in FIGS. 1A and 1B, a virtual control construct implemented by a virtual plane 100 can be defined in front of and substantially parallel to the screen 102. When the control object 104 “touches” or “pierces” the virtual plane (i.e., when its spatial location coincides with, intersects, or moves beyond the virtual plane's computationally defined spatial location), the cursor 106 and/or machine interface operates in the engaged mode (FIG. 1B); otherwise, the cursor and/or machine interface operates in the disengaged mode (FIG. 1A). To implement two or more distinct engaged modes, multiple virtual planes can be defined. For instance, a drawing application can define two substantially parallel virtual planes at different distances from the screen. When the user, moving his finger towards the screen, pierces the first virtual plane, the user can be able to operate menus and controls within the application; when his finger pierces the second virtual plane, the finger's further (e.g., lateral) motions can be converted to line drawings on the screen. Two parallel virtual planes can also be used to, effectively, define a virtual control construct with a certain associated thickness (i.e., a “virtual slab”). Control object movements within that virtual slab can operate the cursor in the engaged mode, while movements on either side of the virtual slab correspond to the disengaged mode. A planar virtual control construct with a non-zero thickness can serve to avoid unintended engagement and disengagement resulting from inevitable small motions in and out of the virtual plane (e.g., due to the inherent instability of the user's hand and/or the user's perception of depth). 
The thickness can vary depending on one or more sensed parameters (e.g., the overall speed of the control object's motion; the faster the movements, the thicker the slab can be chosen to be).
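The mode discrimination described above can be sketched in one dimension, measuring positions along the axis perpendicular to the screen. The function names, the linear speed-to-thickness rule, and all numeric defaults are hypothetical choices for illustration; the description does not prescribe them.

```python
def layered_mode(tip_distance_to_screen, plane_distances):
    """Count how many parallel virtual planes the tip has pierced.

    `plane_distances` lists each plane's distance from the screen.
    0 means disengaged; 1, 2, ... index successive engaged modes.
    """
    return sum(1 for d in plane_distances if tip_distance_to_screen <= d)


def slab_thickness(speed, base=2.0, gain=0.5, max_thickness=20.0):
    """Thicker slab for faster motion (assumed linear rule, clamped)."""
    return min(base + gain * speed, max_thickness)


def slab_mode(tip_depth, slab_center, thickness):
    """Engaged while the tip is inside the slab, disengaged on either side."""
    return "engaged" if abs(tip_depth - slab_center) <= thickness / 2.0 else "disengaged"


# Drawing-application example: first plane at 100 units (menus), second at 60 (drawing).
modes = [layered_mode(z, [100.0, 60.0]) for z in (120.0, 80.0, 50.0)]
```

Here `modes` evaluates to `[0, 1, 2]`, corresponding to the disengaged mode, the menu mode, and the drawing mode, respectively.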

Transitions between the different operational modes can, but need not, be visually indicated by a change in the shape, color (as in FIGS. 1A and 1B), or other visual property of the cursor or other displayable object and/or audio feedback. In some implementations, the cursor symbol indicates not only the operational mode, but also the control object's distance from the virtual control construct. For instance, the cursor symbol can take the form of a circle, centered at the cursor location, whose radius is proportional to (or otherwise monotonically increasing with) the distance between control object and virtual control construct, and which, optionally, changes color when switching from the disengaged mode into the engaged mode.
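As a minimal sketch of such a cursor symbol, the following function maps a signed distance between control object and virtual control construct to a circle radius and color. The sign convention (non-positive distance meaning the construct has been touched or pierced), the colors, and the scale factors are assumptions made for illustration.

```python
def cursor_appearance(signed_distance, scale=2.0, min_radius=3.0):
    """Return (radius, color) for a circular cursor symbol.

    The radius grows linearly with the magnitude of the distance; the color
    switches when the construct is touched or pierced (distance <= 0).
    """
    engaged = signed_distance <= 0
    radius = min_radius + scale * abs(signed_distance)
    color = "red" if engaged else "gray"
    return radius, color
```

Any other monotonically increasing radius function could be substituted for the linear one used here.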

Of course, the system under control need not be a desktop computer. FIG. 1C-1 illustrates an implementation in which free-space gestures are used to operate a handheld tablet 110. The tablet 110 can be connected, e.g., via a USB cable 112 (or any other wired or wireless connection), to a motion-capture device 114 (such as for example, a dual-camera motion controller as provided by Leap Motion, Inc., San Francisco, Calif. or other interfacing mechanisms and/or combinations thereof) that is positioned and oriented so as to monitor a region where hand motions normally take place. For example, the motion-capture device 114 can be placed onto a desk or other working surface, and the tablet 110 can be held at an angle to that working surface to facilitate easy viewing of the displayed content. The tablet 110 can be propped up on a tablet stand or against a wall or other suitable vertical surface to free up the second hand, facilitating two-hand gestures. FIG. 1C-2 illustrates a modified tablet implementation, in which the motion-capture device 114 is integrated into the frame of the tablet 110.

The virtual surface construct need not be planar, but can be curved in space, e.g., to conform to the user's range of movements. FIG. 1D illustrates, for example, a cylindrical virtual surface construct 120 in front of an arrangement of three monitors 122, 124, 126, which can all be connected to the same computer. The user's finger motions can control screen content on any one of the screens, depending on the direction in which the finger 128 points and/or the portion of the virtual surface construct 120 that it pierces. Of course, other types of curved virtual surface constructs of regular (e.g., spherical) or irregular shape, or virtual surface constructs composed of multiple (planar or curved) segments, can also be used in combination with one or more screens. Further, in some implementations, the virtual control construct is a virtual solid construct or a virtual closed surface (such as, e.g., a sphere, box, oriented ellipsoid, etc.) or portion thereof, having an interior (or, alternatively, exterior) that defines a three-dimensional engagement target. For instance, in an application that allows the user to manipulate a globe depicted on the screen, the virtual control construct can be a virtual sphere located at some distance in front of the screen. The user can rotate the on-screen globe by moving his fingertips while they are touching or piercing the spherical virtual surface construct (from outside). To allow the user to manipulate the globe from inside, the spherical virtual surface construct can be defined as surrounding the user (or at least his hand), with its exterior serving as the engagement target. Engagement and disengagement of the control object need not necessarily be defined relative to a two-dimensional surface. 
Rather, in some implementations, the virtual control construct can be a virtual point construct along a virtual line (or line segment) extending from the control object, or a line within a plane extending from the control object.

The location and/or orientation of the virtual surface construct (or other virtual control construct) can be defined relative to the room and/or stationary objects (e.g., a screen) therein, relative to the user, relative to the device 114 or relative to some combination. For example, a planar virtual surface construct can be oriented parallel to the screen, perpendicular to the direction of the control object, or at some angle in between. The location of the virtual surface construct can, in some implementations, be set by the user, e.g., by means of a particular gesture recognized by the motion-capture system. To give just one example, the user can, with her index finger stretched out, have her thumb and middle finger touch so as to pin the virtual surface construct at a certain location relative to the current position of the index-finger-tip. Once set in this manner, the virtual surface construct can be stationary until reset by the user via performance of the same gesture in a different location.

In some implementations, the virtual surface construct is tied to and moves along with the control object, i.e., the position and/or orientation of the virtual surface construct are updated based on the tracked control object motion. This affords the user maximum freedom of motion by allowing the user to control the user interface from anywhere (or almost anywhere) within the space monitored by the motion-capture system. To enable the relative motion between the control object and virtual surface construct that is necessary for piercing the surface, the virtual surface construct follows the control object's movements with some delay. Thus, starting from a steady-state distance between the virtual surface construct and the control object tip in the disengaged mode, the distance generally decreases as the control object accelerates towards the virtual surface construct, and increases as the control object accelerates away from the virtual surface construct. If the control object's forward acceleration (i.e., towards the virtual surface construct) is sufficiently fast and/or prolonged, the control object eventually pierces the virtual surface construct. Once pierced, the virtual surface construct again follows the control object's movements. However, whereas, in the disengaged mode, the virtual surface construct is “pushed” ahead of the control object (i.e., is located in front of the control object tip), it is “pulled” behind the control object in the engaged mode (i.e., is located behind the control object tip). To disengage, the control object generally needs to be pulled back through the virtual surface construct with sufficient acceleration to exceed the surface's responsive movement.

In an implementation, an engagement target can be defined as merely the point where the user touches or pierces a virtual control construct. For example, a virtual point construct can be defined along a line extending from or through the control object tip, or any other point or points on the control object, located a certain distance from the control object tip in the steady state, and moving along the line to follow the control object. The line can, e.g., be oriented in the direction of the control object's motion, perpendicularly project the control object tip onto the screen, extend in the direction of the control object's axis, or connect the control object tip to a fixed location, e.g., a point on the display screen. Irrespective of how the line and virtual point construct are defined, the control object can, when moving sufficiently fast and in a certain manner, “catch” the virtual point construct. Similarly, a virtual line construct (straight or curved) can be defined as a line within a surface intersecting the control object at its tip, e.g., as a line lying in the same plane as the control object and oriented perpendicular (or at some other non-zero angle) to the control object. Defining the virtual line construct within a surface tied to and intersecting the control object tip ensures that the control object can eventually intersect the virtual line construct.

In an implementation, engagement targets defined by one or more virtual point constructs or virtual line (i.e., linear or curvilinear) constructs can be mapped onto engagement targets defined as virtual surface constructs, in the sense that the different mathematical descriptions are functionally equivalent. For example, a virtual point construct can correspond to the point of a virtual surface construct that is pierced by the control object (and a virtual line construct can correspond to a line in the virtual surface construct going through the virtual point construct). If the virtual point construct is defined on a line projecting the control object tip onto the screen, control object motions perpendicular to that line move the virtual point construct in a plane parallel to the screen, and if the virtual point construct is defined along a line extending in the direction of the control object's axis, control object motions perpendicular to that line move the virtual point construct in a plane perpendicular to that axis; in either case, control object motions along the line move the control object tip towards or away from the virtual point construct and, thus, the respective plane. Thus, the user's experience interacting with a virtual point construct can be little (or no) different from interacting with a virtual surface construct. Hereinafter, the description will, for ease of illustration, focus on virtual surface constructs. A person of skill in the art will appreciate, however, that the approaches, methods, and systems described can be straightforwardly modified and applied to other virtual control constructs (e.g., virtual point constructs or virtual linear/curvilinear constructs).

The position and/or orientation of the virtual surface construct (or other virtual control construct) are typically updated continuously or quasi-continuously, i.e., as often as the motion-capture system determines the control object location and/or direction (which, in visual systems, corresponds to the frame rate of image acquisition and/or image processing). However, the virtual surface construct can also be updated less frequently (e.g., only every other frame, to save computational resources) or more frequently (e.g., based on interpolations between the measured control object positions).

In some implementations, the virtual surface construct follows the control object with a fixed time lag, e.g., between 0.1 and 1.0 second. In other words, the location of the virtual surface construct is updated, for each frame, based on where the control object tip was a certain amount of time (e.g., 0.2 second) in the past. This is illustrated in FIG. 2, which shows the control object and the virtual surface construct (represented as a plane) at locations within a consistent coordinate system across the figures for various points in time according to various implementations. As depicted, the plane can be computationally defined as substantially perpendicular to the orientation of the control object (meaning that its normal is angled relative to the control object orientation by less than a certain small amount, e.g., less than 5°, and preferably smaller than 1°). Of course, the virtual plane need not necessarily be perpendicular to the orientation of the control object. In some implementations, it is, instead, substantially parallel to the screen, but still dynamically positioned relative to the control object (e.g., so as to remain at a certain distance from the control object tip, where distance can be measured, e.g., in a direction perpendicular to the screen or, alternatively, in the direction of the control object).

At a first point t=t0 in time, when the control object is at rest, the virtual plane is located at its steady-state distance d in front of the control object tip; this distance can be, e.g., a few millimeters. At a second point t=t1 in time (after the control object has started moving towards the virtual plane, but before the lag period has passed), the virtual plane is still in the same location, but its distance from the control object tip has decreased due to the control object's movement. One lag period later, at t=t1+Δtlag, the virtual plane is positioned the steady-state distance away from the location of the control object tip at the second point in time, but due to the control object's continued forward motion, the distance between the control object tip and the virtual plane has further decreased. Finally, at a fourth point in time t=t2, the control object has pierced the virtual plane. One lag time after the control object has come to a halt, at t=t2+Δtlag, the virtual plane is again a steady-state distance away from the control object tip, but now on the other side. When the control object is subsequently pulled backwards, the distance between its tip and the virtual plane decreases again (t=t3 and t=t4), until the control object tip emerges at the first side of the virtual plane (t=t5). The control object can stop at a different position than where it started, and the virtual plane will eventually follow it and be, once more, a steady-state distance away from the control object tip (t=t6). Even if the control object continues moving, if it does so at a constant speed, the virtual plane will, after an initial lag period to “catch up,” follow the control object at a constant distance.
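The fixed-lag behavior walked through above can be sketched in one dimension, with positions measured along the pointing direction and the lag expressed as a whole number of frames. The class name `LaggedPlane`, the frame-based lag, and the numeric values are hypothetical; the sketch assumes equal steady-state distances in both modes.

```python
from collections import deque


class LaggedPlane:
    """Virtual plane that trails the control object tip by a fixed number of frames."""

    def __init__(self, initial_tip, lag_frames=3, steady_distance=5.0):
        # History holds the current tip position plus the `lag_frames` before it.
        self.history = deque([initial_tip] * (lag_frames + 1), maxlen=lag_frames + 1)
        self.d = steady_distance
        self.engaged = False
        self.plane = initial_tip + steady_distance

    def update(self, tip):
        """Feed one tracked tip position; return True while in the engaged mode."""
        self.history.append(tip)
        delayed_tip = self.history[0]  # where the tip was `lag_frames` ago
        # Disengaged: plane is pushed ahead of the tip. Engaged: pulled behind it.
        self.plane = delayed_tip + (-self.d if self.engaged else self.d)
        if not self.engaged and tip >= self.plane:
            self.engaged = True   # tip caught up with and pierced the plane
        elif self.engaged and tip <= self.plane:
            self.engaged = False  # tip was pulled back through the plane
        return self.engaged


plane = LaggedPlane(0.0)
states = [plane.update(t) for t in (3.0, 6.0, 6.0, 6.0, 6.0, 3.0, 0.0)]
```

In this example the tip advances by 3 units per frame, pierces the plane on the second frame, rests, and then withdraws quickly enough to disengage on the last frame; advancing by only 1 unit per frame would never engage, because the plane keeps pace.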

The steady-state distances in the disengaged mode and the engaged mode can, but need not, be the same. In some implementations, for instance, the steady-state distance in the engaged mode is larger, such that disengaging from the virtual plane (i.e., “unclicking”) appears harder to the user than engaging (i.e., “clicking”) because it requires a larger motion. Alternatively or additionally, to achieve a similar result, the lag times can differ between the engaged and disengaged modes. Further, in some implementations, the steady-state distance is not fixed, but adjustable based on the control object's speed of motion, generally being greater for higher control object speeds. As a result, when the control object moves very fast, motions toward the plane are “buffered” by the rather long distance that the control object has to traverse relative to the virtual plane before an engagement event is recognized (and, similarly, backwards motions for disengagement are buffered by a long disengagement steady-state distance). A similar effect can also be achieved by decreasing the lag time, i.e., increasing the responsiveness of touch-surface position updates, as the control object speed increases. Such speed-based adjustments can serve to avoid undesired switching between the modes that can otherwise be incidental to fast control object movements.

In various implementations, the position of the virtual plane (or other virtual surface construct) is updated not based on a time lag, but based on its current distance from the control object tip. That is, for any image frame, the distance between the current control object tip position and the virtual plane is computed (e.g., with the virtual-plane position being taken from the previous frame), and, based thereon, a displacement or shift to be applied to the virtual plane is determined. In some implementations, the update rate as a function of distance can be defined in terms of a virtual “potential-energy surface” or “potential-energy curve.” In FIG. 3A, an exemplary such potential-energy curve 300 is plotted as a function of the distance of the virtual plane from the control object tip according to various implementations. The negative derivative 302 (or slope) of this curve, which specifies the update rate, i.e., the shift in the virtual plane's position per frame (in arbitrary units), is shown in FIG. 3B. The minima of the potential-energy curve 300 determine the steady-state distances 304, 306 to both sides of the control object; at these distances, the virtual plane is not updated at all. At larger distances, the virtual plane is attracted towards the control object tip, at a rate that generally increases with distance. For example, at point 308, where the virtual plane is a positive distance d1 away from the control object, a negative displacement or shift Δs1 is applied to bring the virtual plane closer. Conversely, at point 310, where the virtual plane has a negative distance d2 from the control object tip (corresponding to piercing of the virtual plane, i.e., the engaged mode), a positive shift Δs2 is applied to move the virtual plane closer to the control object. At distances below the steady-state distance (e.g., at point 312), the virtual plane is repelled by the control object and driven back towards the steady state. 
The magnitude of the local maximum 314 between the two steady states determines the level of force or acceleration needed to cross from the disengaged to the engaged mode or back. In certain implementations, the potential-energy curve 300 is given an even more physical interpretation, and its negative slope is associated with an acceleration, i.e., a change in the velocity of the virtual plane, rather than a change in its position. In this case, the virtual plane does not immediately stop as it reaches a steady state, but oscillates around the steady state. To slow down the virtual plane's motion and thereby stabilize its position, a friction term can be introduced into the physical model.
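As a minimal sketch of a distance-based update, the following code uses a symmetric double-well potential U(x) = k(x² − d²)², where x is the signed distance of the virtual plane from the control object tip (positive when the plane is in front of the tip). This particular functional form, the names, and the parameter values are assumptions chosen for illustration; the curve 300 of FIG. 3A is not given in closed form. The per-frame shift is the negative slope of U scaled by a rate constant.

```python
def potential_slope(x, d=1.0, k=1.0):
    """Slope dU/dx of the assumed double-well potential U(x) = k*(x**2 - d**2)**2."""
    return 4.0 * k * x * (x * x - d * d)


def update_plane(plane, tip, rate=0.05, d=1.0, k=1.0):
    """Shift the plane by the negative potential slope at its current distance."""
    x = plane - tip  # signed distance; x < 0 means the tip has pierced the plane
    return plane - rate * potential_slope(x, d, k)


def is_engaged(plane, tip):
    """Engaged mode corresponds to a negative plane-to-tip distance."""
    return plane - tip < 0.0


def relax(plane, tip, frames=200):
    """Apply the per-frame update repeatedly while the tip stays put."""
    for _ in range(frames):
        plane = update_plane(plane, tip)
    return plane


# Tip jumps from 0 to 1.5 within one frame while the plane sits at 1.0: pierced.
plane_after_push = relax(1.0, 1.5)
```

With these values the plane settles one steady-state distance behind the tip (near 0.5), whereas a plane released at distance 2.0 in front of a resting tip settles near distance 1.0, matching the attraction and repulsion behavior described for points 308, 310, and 312. In this sketch the barrier height at x = 0 is k·d⁴.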

The potential-energy curve need not be symmetric, of course. FIG. 3C, for example, shows an asymmetric curve in which the steady-state distance in the engaged mode is larger than that in the disengaged mode, rendering disengagement harder. Further, as illustrated in FIG. 3D, the curve can have more than two (e.g., four) steady states 320, which can correspond to one disengaged and three engaged modes. The requisite force to transition between modes depends, again, on the heights of the local maxima 322 between the steady states. In some implementations, the curve abruptly jumps at the steady-state points and assumes a constant, higher value therebetween. In this case, which is illustrated in FIG. 3E, the position of the virtual plane is not updated whenever the control object tip is within the steady-state distance from the virtual plane on either side, allowing fast transitions between the modes. Accordingly, the potential-energy curve can take many other forms, which can be tailored to a desired engagement-disengagement force profile experienced by the user. Moreover, the virtual plane can be updated in accordance with a two-dimensional potential-energy surface that defines the update rate depending on, e.g., the distances between the virtual plane and control object tip along various directions (as opposed to only one, e.g., the perpendicular and shortest, distance of the control object tip from the virtual plane). For example, the virtual plane can follow the control object differently for different relative orientations between the control object and the virtual plane, and each such relative orientation can correspond to a cross-section through the potential-energy surface. Two-dimensional potential-energy surfaces can also be useful to control position updates applied to a curved virtual surface construct.
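The FIG. 3E variant, in which no update is applied while the tip is within the steady-state distance on either side of the plane, can be sketched as a dead-zone rule. The linear attraction outside the dead zone, the function name, and the gain value are assumptions for illustration only.

```python
def dead_zone_shift(x, d=1.0, gain=0.5):
    """Per-frame plane shift for signed plane-to-tip distance `x`.

    Inside [-d, d] the plane is left alone, so the tip can cross it freely;
    outside, the plane is pulled back towards the nearer steady-state distance.
    """
    if abs(x) <= d:
        return 0.0
    excess = x - d if x > 0 else x + d
    return -gain * excess
```

Because the plane stays put anywhere inside the dead zone, a small forward or backward motion of the tip suffices to switch modes, which is consistent with the fast transitions noted above.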

Furthermore, the potential piercing energy need not, or not only, be a function of the distance from the control object tip to the virtual surface construct, but can depend on other factors. For example, in some implementations, a stylus with a pressure-sensitive grip is used as the control object. In this case, the pressure with which the user squeezes the stylus can be mapped to the piercing energy.

Whichever way the virtual surface construct is updated, jitter in the control object's motions can result in unintentional transitions between the engaged and disengaged modes. While such modal instability can be combatted by increasing the steady-state distance (i.e., the “buffer zone” between control object and virtual surface construct), this comes at the cost of requiring the user, when she intends to switch modes, to perform larger movements that can feel unnatural. The trade-off between modal stability and user convenience can be improved by filtering the tracked control object movements. Specifically, jitter can be filtered out, based on the generally more frequent changes in direction associated with it, with some form of time averaging. Accordingly, in one implementation, a moving-average filter spanning, e.g., a few frames, is applied to the tracked movements, such that only a net movement within each time window is used as input for cursor control. Since jitter generally increases with faster movements, the time-averaging window can be chosen to likewise increase as a function of control object velocity (such as a function of overall control object speed or of a velocity component, e.g., perpendicular to the virtual plane). In another implementation, the control object's previous and newly measured positions are averaged with weighting factors that depend, e.g., on velocity, frame rate, and/or other factors. For example, the old and new positions can be weighted with multipliers of x and (1−x), respectively, where x varies between 0 and 1 and increases with velocity. In one extreme, for x=1, the cursor remains completely still, whereas for the other extreme, x=0, no filtering is performed at all.
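
The weighted averaging of old and new positions described above can be sketched, for a single coordinate, as follows. The particular mapping from speed to the weight x (`blend_weight`), the parameter `speed_scale`, and the function names are hypothetical choices for this sketch; the description only requires that x lie between 0 and 1 and increase with velocity.

```python
def blend_weight(speed, speed_scale=1.0):
    """Assumed mapping from speed to the weight x; grows from 0 toward 1."""
    return speed / (speed + speed_scale)


def filter_position(old, new, dt, speed_scale=1.0):
    """Blend the previous and the newly measured coordinate.

    The old position is weighted with x and the new one with (1 - x),
    where x increases with the measured speed. dt must be positive.
    """
    speed = abs(new - old) / dt
    x = blend_weight(speed, speed_scale)
    return x * old + (1.0 - x) * new


if __name__ == "__main__":
    # A unit step observed over one time unit gives x = 0.5 under these assumptions.
    print(filter_position(0.0, 1.0, dt=1.0))
```

For three-dimensional tracking, the same blend could be applied per coordinate, or x could be computed once from the overall speed or from the velocity component perpendicular to the virtual plane.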

FIG. 4A summarizes representative methods for control-object-controlled cursor operation that utilize a virtual surface construct moving with the control object in accordance with various implementations. In the method implementation illustrated by FIG. 4A, a control object is tracked (400), based on computer vision or otherwise, to determine its position and/or orientation in space (typically within a detection zone proximate to the computer screen). Optionally, the tracked control object motion is computationally filtered to reduce jitter (402). Based on the tracked control object in conjunction with a definition of the virtual surface construct relative thereto, the position and/or orientation of the virtual surface construct are then computed (404). In implementations where the virtual surface construct is updated based on a control object position in the past, it can initially take a few control object tracking cycles (e.g., frames in image-based tracking) before the first position of the virtual surface construct is established; thereafter, the virtual surface construct can be updated every cycle. In implementations where the virtual surface construct is shifted from cycle to cycle based on its instantaneous distance from the control object tip, the position of the virtual surface construct can be initiated arbitrarily, e.g., such that the virtual surface construct starts a steady-state distance away from the control object. Following computation of the virtual surface construct, the current operational mode (engaged or disengaged) is identified based on a determination whether the control object touches or pierces the virtual surface construct or not (406). Further, the current cursor position is calculated, typically from the control object's position and orientation relative to the screen (408). (This step can be performed prior to, or in parallel with, the computations of the virtual surface construct.) 
Based on the operational mode and cursor position, the screen content is then updated (410), e.g., to move the cursor symbol or re-arrange other screen content. Steps 400-410 are executed in a loop as long as the user interacts with the system via free-space control object motions.
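
A simplified, one-dimensional sketch of steps 400-406 is given below for the case in which the virtual surface construct is derived from a past control object position. It assumes that the construct sits a fixed `offset` beyond the tip position recorded `lag` frames earlier, measured along the axis toward the screen; the function name `run_loop` and both parameters are hypothetical. Filtering (402), cursor computation (408), and the screen update (410) are omitted.

```python
from collections import deque


def run_loop(tip_depths, lag=3, offset=1.0):
    """Return the operational mode for each frame.

    tip_depths: per-frame tip coordinate along the axis toward the screen
    (larger values are closer to the screen). Until `lag` earlier frames
    exist, the construct is not yet established and the mode is None.
    """
    history = deque(maxlen=lag + 1)
    modes = []
    for z in tip_depths:                 # step 400: tracked tip position
        history.append(z)
        if len(history) <= lag:          # construct not yet established
            modes.append(None)
            continue
        plane = history[0] + offset      # step 404: construct from past position
        engaged = z >= plane             # step 406: touch or pierce test
        modes.append("engaged" if engaged else "disengaged")
    return modes


if __name__ == "__main__":
    print(run_loop([0, 0, 0, 0, 0, 2, 2, 2, 2, 2]))
```

In this sketch a sudden forward movement engages the construct for a few frames, after which the construct catches up with the tip and the mode reverts to disengaged.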

In some implementations, temporary piercing of the virtual surface construct—i.e., a clicking motion including penetration of the virtual surface construct immediately followed by withdrawal from the virtual surface construct—switches between modes and locks in the new mode. For example, starting in the disengaged mode, a first click event can switch the control object into the engaged mode, where it can then remain until the virtual surface construct is clicked at again.
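
The mode-locking behavior of such click events can be sketched as a small state machine over per-frame piercing flags. The function name `locked_modes` and the limit `max_click_frames`, which bounds how long a penetration may last and still count as a click, are assumptions made for this sketch.

```python
def locked_modes(piercing, max_click_frames=5):
    """Toggle the locked mode on each click.

    piercing: iterable of booleans, True while the control object pierces
    the virtual surface construct. A click is a run of at most
    max_click_frames piercing frames followed by withdrawal.
    """
    engaged = False   # start in the disengaged mode
    run = 0           # length of the current piercing run
    out = []
    for p in piercing:
        if p:
            run += 1
        else:
            if 0 < run <= max_click_frames:
                engaged = not engaged   # click completed: switch and lock mode
            run = 0
        out.append(engaged)
    return out


if __name__ == "__main__":
    print(locked_modes([False, True, True, False, False, True, False, False]))
```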

Further, in some implementations, the degree of piercing (i.e., the distance beyond the virtual surface construct that the control object initially reaches, before the virtual surface construct catches up) is interpreted as an intensity level that can be used to refine the control input. For example, the intensity (of engagement) in a swiping gesture for scrolling through screen content can determine the speed of scrolling. Further, in a gaming environment or other virtual world, different intensity levels when touching a virtual object (by penetrating the virtual surface construct while the cursor is positioned on the object as displayed on the screen) can correspond to merely touching the object versus pushing the object over. As another example, when hitting the keys of a virtual piano displayed on the screen, the intensity level can translate into the volume of the sound created. Thus, touching or engagement of a virtual surface construct (or other virtual control construct) can provide user input beyond the binary discrimination between engaged and disengaged modes.

FIGS. 4B and 4B-1 illustrate at a higher conceptual level various methods for controlling a machine-user interface using free-space gestures or motions performed by a control object. The method involves receiving information including motion information for a control object (420). Further, it includes determining from the motion information whether the motion corresponds to an engagement gesture (422). This determination can be made by determining whether an intersection occurred between the control object and a virtual control construct (424); whether a dis-intersection of the control object from the at least one virtual control construct occurred (426); and/or whether motion of the control object occurred relative to at least one virtual control construct (428). Further, the determination can involve determining, from the motion information, one or more engagement attributes (e.g., a potential energy) defining an engagement gesture (430), and/or identifying an engagement gesture by correlating the motion information to one of a plurality of engagement gestures based in part upon one or more of motion of the control object, occurrence of any of an intersection, a dis-intersection or a non-intersection of the control object with the virtual control construct, and the set of engagement attributes (432). Once an engagement gesture has been recognized, the user-interface control to which the gesture applies (e.g., a control associated with an application or an operating environment, or a special control) is selected or otherwise determined (434). The control can then be manipulated according to the gesture (436).

As will be readily apparent to those of skill in the art, the methods described above can be readily extended to the control of a user interface with multiple simultaneously tracked control objects. For instance, both left and right index fingers of a user can be tracked, each relative to its own associated virtual touch surface, to operate two cursors simultaneously and independently. As another example, the user's hand can be tracked to determine the positions and orientations of all fingers; each finger can have its own associated virtual surface construct (or other virtual control construct) or, alternatively, all fingers can share the same virtual surface construct, which can follow the overall hand motions. A joint virtual plane can serve, e.g., as a virtual drawing canvas on which multiple lines can be drawn by the fingers at once.

In an implementation and by way of example, one or more control parameter(s) and the control object are applied to some control mechanism to determine the distance of the virtual control construct to a portion of the control object (e.g., tool tip(s), point(s) of interest on a user's hand or other points of interest). In some implementations, a lag (e.g., filter or filtering function) is introduced to delay, or modify, application of the control mechanism according to a variable or a fixed increment of time, for example. Accordingly, implementations can provide enhanced verisimilitude to the human-machine interaction, and/or increased fidelity of tracking control object(s) and/or control object portion(s).

In one example, the control object portion is a user's finger-tip. A control parameter is also the user's finger-tip. A control mechanism includes equating a plane-distance between virtual control construct and finger-tip to a distance between finger-tip and an arbitrary coordinate (e.g., center (or origin) of an interaction zone of the controller). Accordingly, the closer the finger-tip approaches to the arbitrary coordinate, the closer the virtual control construct approaches the finger-tip.

In another example, the control object is a hand, which includes a control object portion, e.g., a palm, determined by a “palm-point” or center of mass of the entire hand. A control parameter includes a velocity of the hand, as measured at the control object portion, i.e., the center of mass of the hand. A control mechanism includes filtering forward velocity over the last one (1) second. Accordingly, the faster the palm has recently been travelling forward, the closer the virtual control construct approaches to the control object (i.e., the hand).
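
The palm-velocity example can be sketched as follows. The sliding-window class, the function `plane_distance`, and the constants `d_max` and `v_scale` are hypothetical; the description specifies only that forward velocity is filtered over the last second and that a faster recent forward motion brings the virtual control construct closer to the hand.

```python
from collections import deque


class ForwardVelocityWindow:
    """Mean forward velocity of the palm point over a sliding time window."""

    def __init__(self, window_s=1.0):
        self.window_s = window_s
        self.samples = deque()  # (timestamp in seconds, forward velocity)

    def add(self, t, v_forward):
        self.samples.append((t, v_forward))
        # Drop samples that are older than the window.
        while t - self.samples[0][0] > self.window_s:
            self.samples.popleft()

    def mean(self):
        if not self.samples:
            return 0.0
        return sum(v for _, v in self.samples) / len(self.samples)


def plane_distance(mean_forward_velocity, d_max=0.10, v_scale=0.5):
    """Assumed mapping: distance shrinks from d_max as forward velocity grows."""
    v = max(mean_forward_velocity, 0.0)  # backward motion is treated as zero
    return d_max / (1.0 + v / v_scale)


if __name__ == "__main__":
    w = ForwardVelocityWindow()
    for t, v in [(0.0, 0.0), (0.5, 1.0), (1.2, 1.0)]:
        w.add(t, v)
        print(t, w.mean(), plane_distance(w.mean()))
```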

In a further example, a control object includes a control object portion (e.g., a finger-tip). A control mechanism includes determining a distance between a thumb-tip (e.g., a first control object portion) and an index finger (e.g., a second control object portion). This distance can be used as a control parameter. Accordingly, the closer the thumb-tip and index-finger, the closer the virtual control construct is determined to be to the index finger. When the thumb-tip and index finger touch one another, the virtual control construct is determined to be partially pierced by the index finger. A lag (e.g., filter or filtering function) can introduce a delay in the application of the control mechanism by some time-increment proportional to any quantity of interest, for example horizontal jitter (i.e., the random motion of the control object in a substantially horizontal dimension). Accordingly, the greater the shake in a user's hand, the more lag will be introduced into the control mechanism.
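
The thumb-to-index mechanism of this example can be sketched as follows. The linear mapping with `gain`, the touch threshold `touch_eps`, and the function name `construct_offset` are assumptions of the sketch; the lag described above is not modeled.

```python
import math


def construct_offset(thumb_tip, index_tip, gain=1.0, touch_eps=0.005):
    """Map the thumb-to-index gap to the construct's distance from the index finger.

    Returns (offset, pierced): the offset shrinks as the finger tips close,
    and the construct counts as pierced once the tips (nearly) touch.
    Positions are (x, y, z) tuples in the same length unit as touch_eps.
    """
    gap = math.dist(thumb_tip, index_tip)
    return gain * gap, gap <= touch_eps


if __name__ == "__main__":
    print(construct_offset((0.0, 0.0, 0.0), (0.03, 0.04, 0.0)))
    print(construct_offset((0.0, 0.0, 0.0), (0.0, 0.0, 0.0)))
```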

User-interface control via free-space motions relies generally on a suitable motion-capture device or system for tracking the positions, orientations, and motions of one or more control objects. For a description of tracking positions, orientations, and motions of control objects, reference can be had to U.S. patent application Ser. No. 13/414,485, filed on Mar. 7, 2012, the entire disclosure of which is incorporated herein by reference. In various implementations, motion capture can be accomplished visually, based on a temporal sequence of images of the control object (or a larger object of interest including the control object, such as the user's hand) captured by one or more cameras. In one implementation, images acquired from two (or more) vantage points are used to define tangent lines to the surface of the object and approximate the location and shape of the object based thereon, as explained in more detail below. Other vision-based approaches that can be used in implementations include, without limitation, stereo imaging, detection of patterned light projected onto the object, or the use of sensors and markers attached to or worn by the object (such as, e.g., markers integrated into a glove) and/or combinations thereof. Alternatively or additionally, the control object can be tracked acoustically or ultrasonically, or using inertial sensors such as accelerometers, gyroscopes, and/or magnetometers (e.g., MEMS sensors) attached to or embedded within the control object. Implementations can be built employing one or more particular motion-tracking approaches that provide control object position and/or orientation (and/or derivatives thereof) tracking with sufficient accuracy, precision, and responsiveness for the particular application.

FIGS. 5A and 5B illustrate an exemplary system for capturing images and controlling a machine based on motions relative to a virtual control construct according to various implementations. As shown in FIG. 5A, the system includes motion-capture hardware including two video cameras 500, 502 that acquire a stream of images of a region of interest 504 from two different vantage points. The cameras 500, 502 are connected to a computer 506 that processes these images to infer three-dimensional information about the position and orientation of a control object 508, or a larger object of interest including the control object (e.g., a user's hand), in the region of interest 504, and computes suitable control signals to the user interface based thereon. The cameras can be, e.g., CCD or CMOS cameras, and can operate, e.g., in the visible, infrared (IR), or ultraviolet wavelength regime, either by virtue of the intrinsic sensitivity of their sensors primarily to these wavelengths, or due to appropriate filters 510 placed in front of the cameras. In some implementations, the motion-capture hardware includes, co-located with the cameras 500, 502, one or more light sources 512 that illuminate the region of interest 504 at wavelengths matching the wavelength regime of the cameras 500, 502. For example, the light sources 512 can be LEDs that emit IR light, and the cameras 500, 502 can capture IR light that is reflected off the control object and/or objects in the background. Due to the inverse-square dependence of the illumination intensity on the distance between the light sources 512 and the illuminated object, foreground objects such as the control object generally appear significantly brighter in the images than background objects, aiding in intensity-based foreground/background discrimination. In some implementations, the cameras 500, 502 and light sources 512 are disposed below the control object to be tracked and point upward. 
For example, they can be placed on a desk to capture hand motions taking place in a spatial region above the desk, e.g., in front of the screen. This location can be optimal both for foreground/background discrimination (because the background is in this case typically the ceiling and, thus, far away) and for discerning the control object's direction and tip position (because the usual pointing direction will lie, more or less, in the image plane).

The computer 506 processing the images acquired by the cameras 500, 502 can be a suitably programmed general-purpose computer. As shown in FIG. 5B, it can include a processor (or CPU) 520, associated system memory 522 (typically volatile memory, e.g., RAM), one or more permanent storage devices 524 (such as hard disks, CDs, DVDs, memory keys, etc.), a display screen 526 (e.g., an LCD screen or CRT monitor), input devices (such as a keyboard and, optionally, a mouse) 528, and a system bus 530 that facilitates communication between these components and, optionally via a dedicated interface, with the cameras 500, 502 and/or other motion-capture hardware. The memory 522 can store computer-executable instructions, conceptually illustrated as a group of modules and programmed in any of various suitable programming languages (such as, e.g., C, C++, Java, Basic, Python, Pascal, Fortran, assembler languages, etc.), that control the operation of the CPU and provide the requisite computational functionality for implementing methods in accordance herewith. Specifically, in addition to an operating system 532 that stores low-level system functions (such as memory allocation and file management) and one or more end-user applications 534 (such as, e.g., web browsers, office applications, or video games), the memory can store modules for image processing and control object tracking, computation of the virtual control construct and determination of the operational mode, and cursor operation and user-interface control.

The image-processing and tracking module 536 can analyze pairs of image frames acquired by the two cameras 500, 502 (and stored, e.g., in image buffers in memory 522) to identify the control object (or an object including the control object or multiple control objects, such as a user's hand) therein (e.g., as a non-stationary foreground object) and detect its edges. Next, the module 536 can, for each pair of corresponding rows in the two images, find an approximate cross-section of the control object by defining tangent lines on the control object that extend from the vantage points (i.e., the cameras) to the respective edge points of the control object, and inscribe an ellipse (or other geometric shape defined by only a few parameters) therein. The cross-sections can then be computationally connected in a manner that is consistent with certain heuristics and known properties of the control object (e.g., the requirement of a smooth surface) and resolves any ambiguities in the fitted ellipse parameters. As a result, the control object is reconstructed or modeled in three dimensions. This method, and systems for its implementation, are described in more detail in U.S. patent application Ser. No. 13/414,485, filed on Mar. 7, 2012, the entire disclosure of which is incorporated herein by reference. A larger object including multiple control objects can similarly be reconstructed with respective tangent lines and fitted ellipses, typically exploiting information of internal constraints of the object (such as a maximum physical separation between the fingertips of one hand). The image-processing and tracking module 536 can, further, extract relevant control object parameters, such as tip positions and orientations as well as velocities, from the three-dimensional model. In some implementations, this information can be inferred from the images at a lower level, prior to or without the need for fully reconstructing the control object. 
These operations are readily implemented by those skilled in the art without undue experimentation. In some implementations, a filter module 538 receives input from the image-processing and tracking module 536, and smooths or averages the tracked control object motions; the degree of smoothing or averaging can depend on a control object velocity as determined by the tracking module 536.

An engagement-target module 540 can receive tracking data about the control object from the image-processing and tracking module 536 and/or the filter module 538, and use that data to compute a representation of the virtual control construct, i.e., to define and/or update the position and orientation of the virtual control construct relative to the control object (and/or the screen); the representation can be stored in memory in any suitable mathematical form. A touch-detection module 542 in communication with the engagement-target module 540 can determine, for each frame, whether the control object touches or pierces the virtual control construct. A cursor module 544 can, based on tracking data from the image-processing and tracking module 536, determine a cursor location on the screen (e.g., as the projection of the control object tip onto the screen). The cursor module 544 can also include a visualization component that depicts a cursor at the computed location, preferably in a way that discriminates, based on output from the touch-detection module 542, between the engaged and disengaged mode (e.g., by using different colors). The visualization component of the cursor module 544 can also modify the cursor appearance based on the control object distance from the virtual control construct; for instance, the cursor can take the form of a circle having a radius proportional to the distance between the control object tip and the virtual control construct. A user-interface control module 546 can map detected motions in the engaged mode into control input for the applications 534 running on the computer 506. Collectively, the end-user application 534, user-interface control module 546, and cursor module 544 can compute the screen content, i.e., an image for display on the screen 526, which can be stored in a display buffer (e.g., in memory 522 or in the buffer of a GPU included in the system).
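
As one hedged illustration of the visualization component, the cursor appearance can be derived from the touch-detection output and the tip-to-construct distance as sketched below. The function name `cursor_style`, the scale factor `pixels_per_unit`, the minimum radius, and the color names are hypothetical.

```python
def cursor_style(distance_to_construct, engaged, pixels_per_unit=2000.0, min_radius=2.0):
    """Return (radius in pixels, color) for the cursor circle.

    The radius is proportional to the distance between the control object
    tip and the virtual control construct, clamped to a minimum once the
    construct is touched or pierced; the color discriminates the modes.
    """
    radius = max(min_radius, pixels_per_unit * max(distance_to_construct, 0.0))
    color = "red" if engaged else "gray"
    return radius, color


if __name__ == "__main__":
    print(cursor_style(0.01, engaged=False))
    print(cursor_style(-0.002, engaged=True))
```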

The functionality of the different modules can, of course, be grouped and organized in many different ways, as a person of skill in the art would readily understand. Further, it need not necessarily be implemented on a single computer, but can be distributed between multiple computers. For example, the image-processing and tracking functionality of module 536 can be provided by a separate computer in communication with the computer on which the end-user applications controlled via free-space control object motions are executed. In one exemplary implementation, the cameras 500, 502, light sources 512, and computational facility for image-processing and tracking are integrated into a single motion-capture device (which, typically, utilizes an application-specific integrated circuit (ASIC) or other special-purpose computer for image-processing). In another exemplary implementation, the camera images are sent from a client terminal over a network to a remote server computer for processing, and the tracked control object positions and orientations are sent back to the client terminal as input into the user interface. Implementations can be realized using any number and arrangement of computers (broadly understood to include any kind of general-purpose or special-purpose processing device, including, e.g., microcontrollers, ASICs, programmable gate arrays (PGAs), or digital signal processors (DSPs) and associated peripherals) executing the methods described herein, and any implementation of the various functional modules in hardware, software, or a combination thereof.

Computer programs incorporating various features or functionality described herein can be encoded on various computer readable storage media; suitable media include magnetic disk or tape, optical storage media such as compact disk (CD) or DVD (digital versatile disk), flash memory, and any other non-transitory medium capable of holding data in a computer-readable form. Computer-readable storage media encoded with the program code can be packaged with a compatible device or provided separately from other devices. In addition, program code can be encoded and transmitted via wired, optical, and/or wireless networks conforming to a variety of protocols, including the Internet, thereby allowing distribution, e.g., via Internet download and/or provided on-demand as web-services.

The systems and methods described herein can find application in a variety of computer-user-interface contexts, and can replace mouse operation or other traditional means of user input as well as provide new user-input modalities. Free-space control object motions and virtual-touch recognition can be used, for example, to provide input to commercial and industrial legacy applications (such as, e.g., business applications, including Microsoft Outlook™; office software, including Microsoft Office™, Windows™, Excel™, etc.; and graphic design programs, including Microsoft Visio™, etc.); operating systems, such as Microsoft Windows™; web applications (e.g., browsers, such as Internet Explorer™); and other applications (such as, e.g., audio, video, and graphics programs, etc.); to navigate virtual worlds (e.g., in video games) or computer representations of the real world (e.g., Google Street View™); or to interact with three-dimensional virtual objects (e.g., Google Earth™). FIGS. 6A-9B illustrate various exemplary control inputs achievable with free-space hand motions and gestures when using systems and methods in accordance herewith.

An example of a compound gesture will be illustrated with reference to an implementation illustrated by FIGS. 6A-6D. These diagrams are merely an example; one of ordinary skill in the art would recognize many other variations, alternatives, and modifications. FIG. 6A illustrates a system 100a comprising wired and/or wirelessly communicatively coupled components of a tower 602a, a display device 604a, a keyboard 606a and optionally a tactile pointing device (e.g., mouse, or track ball) 608a. In some implementations, computing machinery of tower 602a can be integrated into display device 604a in an “all in one” configuration. A position and motion sensing device (e.g., 600a-1, 600a-2 and/or 600a-3) comprises all or a portion of the non-tactile interface system of FIG. 1A, which provides for receiving non-tactile input based upon detected position(s), shape(s) and/or motion(s) made by a hand 104 and/or any other detectable object serving as a control object. The position and motion sensing device can be embodied as a stand-alone entity or integrated into another device, e.g., a computer, workstation, laptop, notebook, smartphone, tablet, smart watch or other type of wearable intelligent device(s) and/or combinations thereof. The position and motion sensing device can be communicatively coupled with, and/or integrated within, one or more of the other elements of the system, and can interoperate cooperatively with component(s) of the system 100a to provide non-tactile interface capabilities, such as illustrated by the non-tactile interface system 100 of FIG. 1A.

The motion sensing device (e.g., 600a-1, 600a-2 and/or 600a-3) is capable of detecting position as well as motion of hands and/or portions of hands and/or other detectable objects (e.g., a pen, a pencil, a stylus, a paintbrush, an eraser, a virtualized tool, and/or a combination thereof), within a region of space 110a from which it is convenient for a user to interact with system 100a. Region 110a can be situated in front of, nearby, and/or surrounding system 100a. In some implementations, the position and motion sensing device can be integrated directly into display device 604a as integrated device 600a-2 and/or keyboard 606a as integrated device 600a-3. While FIG. 6A illustrates devices 600a-1, 600a-2 and 600a-3, it will be appreciated that these are alternative implementations shown together in FIG. 6A for clarity's sake. Keyboard 606a and position and motion sensing device are representative types of “user input devices.” Other examples of user input devices (not shown in FIG. 6A) can be used in conjunction with system 100a, such as, for example, a touch screen, light pen, mouse, track ball, touch pad, data glove and so forth. Accordingly, FIG. 6A is representative of but one type of system implementation. It will be readily apparent to one of ordinary skill in the art that many system types and configurations are suitable for use in conjunction with various implementations.

Tower 602a and/or position and motion sensing device and/or other elements of system 100a can implement functionality to provide virtual control surface 600a within region 110a with which engagement gestures are sensed and interpreted to facilitate user interactions with system 100a. Accordingly, objects and/or motions occurring relative to virtual control surface 600a within region 110a can be afforded differing interpretations than like (and/or similar) objects and/or motions otherwise occurring.

As illustrated in FIG. 6A, control object 104 (a pointing finger in this example) is moving toward an “Erase” button being displayed on display 604a by a user desiring to select the “Erase” button. Now with reference to FIG. 6B, control object 104 has triggered an engagement gesture by means of “virtually contacting”, i.e., intersecting, virtual control surface 600a. At this point, unfortunately, the user has suffered misgivings about executing an “Erase.” Since the “Erase” button has been engaged, however, mere withdrawal of control object 104 (i.e., a “dis-intersection”) will not undo the erase operation selected. Accordingly, with reference to FIG. 6C, the user makes a wiping motion with a second control object (i.e., the user's other hand in this example) indicating that the user would like to cancel an operation that is underway. Motion by a second control object illustrates a “compound gesture” that includes two or more gestures, sequentially or simultaneously. Compound gestures can be performed using a single control object, or two or more control objects (e.g., one hand, two hands, one stylus and one hand, etc.). In the illustrated case, the point/select and the wipe are two gestures made by two different control objects (two hands) occurring contemporaneously. Now with reference to FIG. 6D, when the second part of the compound gesture is recognized, the Erase button is no longer highlighted, indicating that the button is now “unselected”. The user is free to withdraw the first control object from engagement with the virtual control surface without triggering an “Erase” operation.

FIGS. 7A and 7B illustrate a zooming action performed by two fingers (thumb and index finger) according to various implementations. These diagrams are merely an example; one of ordinary skill in the art would recognize many other variations, alternatives, and modifications. As illustrated by FIG. 7A, an image 706 (happens to be a web page feed) is being displayed by display 704, by a browser or other application. To zoom in, the user commences a motion including engaging a virtual control construct (not shown) interposed between the user and display 704 at an engagement target approximately over the right most column being displayed. In FIG. 7B, the finger tips 104a, 104b of the user are moved away from each other. This motion is recognized by device 700 from differences in images captured of the control object portion 104a, 104b and determined to be an engagement gesture including a spreading motion of the thumb and index finger-tip in front of the screen using the techniques described hereinabove. The result of interpreting the engagement gesture is passed to an application (and/or to the OS) owning the display 704. The application owning display 704 responds by zooming-in the image of display 704.

FIGS. 8A and 8B show how a swiping gesture by a finger in engaged mode can serve to scroll through screen content according to various implementations. These diagrams are merely an example; one of ordinary skill in the art would recognize many other variations, alternatives, and modifications. As illustrated by FIG. 8A, an image 806 (happens to be of dogs in this example) is being displayed by display 804. When the user commences a motion relative to and engaged with a virtual control construct (not shown) interposed between the user and display 804 (e.g., at an engagement target approximately over the left-most dog), the user's gesture can be interpreted as a control input for the application displaying the images. For example, in FIG. 8B, the user has swiped a finger-tip 104a from left to right. This motion is recognized by device 800 from differences in images captured of the control object portion 104a and determined to be an engagement gesture including a swiping motion from left to right that pierces the virtual control construct using the techniques described hereinabove. The result of interpreting the engagement gesture is passed to the image application, which responds by scrolling the image on the display 804. On the other hand, the same gesture performed without engaging the virtual control construct can be passed to the operating system and, for example, used to switch the display 804 between multiple desktops or trigger some other higher-level function. This is just one example of how engagement gestures, i.e., gestures performed relative to a virtual control construct (whether in the engaged or the disengaged mode, or changing between the modes), can be used to provide different types of control input.

FIGS. 9A and 9B show how the motion of a control object in free space in conjunction with a virtual plane (or a slice of a certain thickness) can provide writing with a virtual pen onto a virtual paper defined in space according to various implementations. These diagrams are merely an example; one of ordinary skill in the art would recognize many other variations, alternatives, and modifications. As shown in FIG. 9A, a user moves a tool 104b (in this example, a stylus) in free space in front of a writing area being displayed on the screen of display 904 so as to pierce a virtual control construct (not shown; in this example, a plane) interposed between the user and display 904. This motion is recognized by device 900 from differences in images captured of the control object portion 104b and is determined to be an engagement gesture including placing a virtual pen onto a virtual paper of space, and is reflected by the contents of display 904. Continuing motion of the stylus 104b in space by the user after engaging the virtual control plane is interpreted as writing with the stylus 104b on the virtual paper of space and is reflected by the contents of display 904. As shown in FIG. 9B, when the user disengages from the virtual control construct, the virtual pen is lifted from the virtual paper, completing the letter “D” in script matching the handwriting of the user in free space. Accordingly, implementations can enable, e.g., signature capture, free-hand drawings, etc.
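The pen-down and pen-up behavior of FIGS. 9A and 9B can be sketched as a stroke-capture loop. This is a minimal sketch, assuming a flat virtual plane at a fixed `plane_z` and the convention that `z` is the distance from the display; `capture_strokes` and its parameters are hypothetical names, and the variant using a slice of a certain thickness is not modeled here.

```python
def capture_strokes(samples, plane_z):
    """Group tracked stylus-tip samples into pen strokes.

    samples is an iterable of (x, y, z) tip positions in time order.
    While the tip is past the virtual plane (z < plane_z) the virtual pen
    is on the virtual paper and (x, y) is appended to the current stroke;
    when the tip withdraws, the pen is lifted and the stroke ends.
    """
    strokes = []
    current = None
    for x, y, z in samples:
        if z < plane_z:
            if current is None:      # pen just touched the paper
                current = []
                strokes.append(current)
            current.append((x, y))
        else:
            current = None           # pen lifted: close the stroke
    return strokes

# The tip approaches, pierces the plane, withdraws, and pierces it again.
tip_path = [
    (0.0, 0.0, 60.0),
    (0.0, 0.0, 40.0),
    (1.0, 1.0, 40.0),
    (2.0, 1.0, 55.0),
    (3.0, 0.0, 45.0),
]
strokes = capture_strokes(tip_path, plane_z=50.0)
```

Each captured stroke could then be rendered on the display or stored, for example for signature capture.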

Certain implementations were described above. It is, however, expressly noted that the described implementations are neither limiting nor exhaustive, but rather the intention is that additions and modifications to what was expressly described herein can be provided for in implementations readily apparent to one of ordinary skill having access to the foregoing. Moreover, it is to be understood that the features of the various implementations described herein are not mutually exclusive and can exist in various combinations and permutations, even if such combinations or permutations were not made express herein. The implementations described herein have been presented for purposes of illustration and are not intended to be exhaustive or limiting. Many variations and modifications are possible in light of the foregoing teaching. The implementations described herein as well as implementations apparent in view of the foregoing description are limited only by the following claims.

Claims

1. A computer-implemented method for controlling a user interface via free-space motions of a control object, the method comprising:

receiving motion information indicating positions of a control object being tracked in a region of free space; and
using a processor: (i) receiving definitions defining a plurality of virtual control constructs, including at least a first virtual control construct defined at a spatial position determined based at least in part on the motion information for a corresponding first portion of the control object; whereby the first virtual control construct is positioned relative to the first portion of the control object, and a second virtual control construct defined at a spatial position determined based at least in part on the motion information for a corresponding second portion of the control object; whereby the second virtual control construct is positioned relative to the second portion of the control object; (ii) obtaining a determination of an input gesture determined from motion made by the control object based on a first portion state determined for the first portion of the control object and including any change in spatial position of the first portion of the control object relative to the first virtual control construct and a second portion state determined for the second portion of the control object and including any change in spatial position of the second portion of the control object relative to the second virtual control construct; and (iii) switching from conducting control of a user interface in a first mode to conducting control of the user interface in a second mode based at least in part upon interpreting the input gesture determined from the first portion state and the second portion state.

2. The computer-implemented method of claim 1, wherein: the control object includes a hand, the first portion includes a finger and the second portion includes a thumb, and wherein the determining an input gesture includes:

determining that the first portion state and the second portion state indicate that the finger and the thumb changed distance from their corresponding user-specific virtual planes; thereby reducing a distance between the finger and the thumb; and
determining from the first portion state and the second portion state that the input gesture comprises a pinching gesture of the thumb and finger.

3. The computer-implemented method of claim 2, wherein the switching further comprises:

interpreting the pinching gesture to be a command indicating a zooming out of displayed content; and
conducting control of the user interface to zoom out of displayed content.

4. The computer-implemented method of claim 2, further comprising:

determining that at least one of the finger and the thumb penetrated a corresponding virtual control construct; and
determining from the first portion state and the second portion state that the input gesture comprises a maximal pinching gesture of the thumb and finger.

5. The computer-implemented method of claim 3, wherein the switching further comprises:

interpreting the maximal pinching gesture to be a command indicating a maximum zooming out of displayed content; and
conducting control of the user interface to perform continued zooming out of displayed content.

6. The computer-implemented method of claim 1, wherein: the control object includes a hand, the first portion includes a finger and the second portion includes a thumb, and wherein the determining an input gesture includes:

determining that the first portion state and the second portion state indicate that the finger and the thumb changed distance from their corresponding user-specific virtual planes; thereby increasing a distance between the finger and the thumb; and
determining from the first portion state and the second portion state that the input gesture comprises a spreading gesture of the thumb and finger.

7. The computer-implemented method of claim 6, wherein the switching further comprises:

interpreting the spreading gesture to be a command indicating a zooming in of displayed content; and
conducting control of the user interface to zoom in on displayed content.

8. The computer-implemented method of claim 6, further comprising:

determining that at least one of the finger and thumb dis-engaged from a corresponding virtual control construct; and
determining from the first portion state and the second portion state that the input gesture comprises a maximal spreading gesture of the thumb and finger.

9. The computer-implemented method of claim 8, wherein the switching further comprises:

interpreting the maximal spreading gesture to be a command indicating a maximum zooming in of displayed content; and
conducting control of the user interface to perform continued zooming in of displayed content.

10. The computer-implemented method of claim 1, further comprising:

updating a spatial position of at least one virtual control construct based at least in part on the motion information of a corresponding portion of the control object such that the virtual control construct is enabled to follow the corresponding portion of the control object.

11. The computer-implemented method of claim 10, wherein the virtual control construct computationally follows motions of the control object portion as tracked with a time lag.

12. The computer-implemented method of claim 11, wherein the time lag is fixed.

13. The computer-implemented method of claim 11, wherein the time lag is computed by the processor and depends on a motion parameter of the control object portion.

14. The computer-implemented method of claim 10, wherein the spatial position of the virtual control construct is updated by the processor based on a current distance between the control object portion and the virtual control construct.

15. The computer-implemented method of claim 14, wherein the spatial position of the virtual control construct is updated in accordance with a virtual energy potential defined as a function of a distance between the control object portion and a corresponding virtual control construct; wherein the virtual energy potential comprises minima at steady-state distances between the control object portion and the corresponding virtual control construct at a time when the control object portion is engaged with the virtual control construct and a time when the control object portion is disengaged from the virtual control construct.

16. The computer-implemented method of claim 1, further comprising computationally tracking the motions of the control object portions based on a temporal sequence of images of the control object; wherein the sequence of images are captured with at least one of a monocular camera system, a stereoscopic camera system; and a camera system having depth-sensing capability.

17. The computer-implemented method of claim 1, wherein the first mode is an engaged mode and the second mode is a disengaged mode, further comprising computationally determining, during a transition from the disengaged mode to the engaged mode, a degree of penetration of at least one virtual control construct by the corresponding control object portion, and controlling the user interface based at least in part thereon.

18. The computer-implemented method of claim 1, wherein conducting control of the user interface comprises at least one of: updating screen content based, at least in part, on the mode and motions of the control object portion as tracked; and operating a cursor associated with a position on a screen based, at least in part, on the mode and motions of the control object portion as tracked.

19. The computer-implemented method of claim 18, wherein operating the cursor comprises displaying a cursor symbol on the screen at the associated position; wherein the cursor symbol is indicative of a distance between the control object portion and a corresponding virtual control construct.

20. A system including one or more processors coupled to memory, the memory loaded with computer instructions to control a user interface via free-space motions of a control object, the instructions, when executed on the processors, implement actions comprising:

receiving motion information indicating positions of a control object being tracked in a region of free space;
obtaining a determination of an input gesture determined from motion made by the control object based on a first portion state determined for the first portion of the control object and including any change in spatial position of the first portion of the control object relative to a first virtual control construct and a second portion state determined for the second portion of the control object and including any change in spatial position of the second portion of the control object relative to a second virtual control construct;
wherein the first virtual control construct is defined at a spatial position determined based at least in part on motion information for a corresponding first portion of the control object; whereby the first virtual control construct is positioned relative to the first portion of the control object, and the second virtual control construct is defined at a spatial position determined based at least in part on motion information for a corresponding second portion of the control object; whereby the second virtual control construct is positioned relative to the second portion of the control object; and
switching from conducting control of a user interface in a first mode to conducting control of the user interface in a second mode based at least in part upon interpreting the input gesture determined from the first portion state and the second portion state.

21. A non-transitory computer readable storage medium impressed with computer program instructions to control a user interface via free-space motions of a control object, the instructions, when executed on a processor, implement a method comprising:

receiving motion information indicating positions of a control object being tracked in a region of free space;
receiving definitions defining a plurality of virtual control constructs, including at least a first virtual control construct defined at a spatial position determined based at least in part on the motion information for a corresponding first portion of the control object; whereby the first virtual control construct is positioned relative to the first portion of the control object, and a second virtual control construct defined at a spatial position determined based at least in part on the motion information for a corresponding second portion of the control object; whereby the second virtual control construct is positioned relative to the second portion of the control object;
obtaining a determination of an input gesture determined from motion made by the control object based on a first portion state determined for the first portion of the control object and including any change in spatial position of the first portion of the control object relative to the first virtual control construct and a second portion state determined for the second portion of the control object and including any change in spatial position of the second portion of the control object relative to the second virtual control construct; and
switching from conducting control of a user interface in a first mode to conducting control of the user interface in a second mode based at least in part upon interpreting the input gesture determined from the first portion state and the second portion state.
  • U.S. Appl. No. 14/626,820—Advisory Action dated Jan. 26, 2017, 4 pages.
  • PCT/US2016/017632—International Search Report and Written Opinion dated Jul. 27, 2016, 13 pages.
  • U.S. Appl. No. 14/626,898—Notice of Allowance dated Feb. 15, 2017, 13 pages.
  • PCT/US2016/017632—International Preliminary Report on Patentability dated Aug. 24, 2017, 12 pages.
  • U.S. Appl. No. 15/358,104—Response to Office Action dated Nov. 2, 2017, filed Mar. 2, 2018, 9 pages.
  • U.S. Appl. No. 15/358,104—Notice of Allowance dated Apr. 11, 2018, 41 pages.
  • U.S. Appl. No. 14/476,694—Office Action dated Apr. 7, 2017, 32 pages.
  • U.S. Appl. No. 14/155,722—Response to Office Action dated Nov. 20, 2015, filed Feb. 2, 2016, 15 pages.
  • U.S. Appl. No. 15/279,363—Office Action dated Jan. 25, 2018, 29 pages.
  • U.S. Appl. No. 15/279,363—Response to Office Action dated Jan. 25, 2018, filed May 24, 2018, 11 pages.
  • U.S. Appl. No. 15/279,363—Notice of Allowance dated Jul. 10, 2018, 10 pages.
  • U.S. Appl. No. 14/476,694—Final Office Action dated Apr. 7, 2017, 32 pages.
  • U.S. Appl. No. 14/476,694—Response to Final Office Action dated Apr. 7, 2017 filed Jul. 6, 2017, 22 pages.
  • U.S. Appl. No. 14/262,691—Final Office Action dated Aug. 19, 2016, 36 pages.
  • U.S. Appl. No. 14/262,691—Response to Final Office Action dated Aug. 19, 2016, filed Nov. 21, 2016, 13 pages.
  • U.S. Appl. No. 14/476,694—Final Office Action dated Feb. 26, 2018, 53 pages.
  • U.S. Appl. No. 14/476,694—Office Action dated Jul. 30, 2018, 68 pages.
  • U.S. Appl. No. 14/476,694—Response to Final Office Action dated Feb. 26, 2018 filed Jun. 19, 2018, 16 pages.
  • U.S. Appl. No. 14/476,694—Response to Office Action dated Jul. 30, 2018 filed Sep. 9, 2018, 19 pages.
  • U.S. Appl. No. 15/917,066—Office Action dated Nov. 1, 2018, 31 pages.
  • U.S. Appl. No. 14/262,691—Supplemental Response to Office Action dated Jan. 31, 2017, filed Jul. 20, 2018, 22 pages.
  • U.S. Appl. No. 16/054,891—Office Action dated Oct. 24, 2019, 26 pages.
  • U.S. Appl. No. 16/054,891—Response to Office Action dated Oct. 24, 2019, filed Feb. 24, 2020, 15 pages.
  • U.S. Appl. No. 16/054,891—Notice of Allowance dated Apr. 1, 2020, 6 pages.
  • U.S. Appl. No. 15/917,066—Response to Office Action dated Nov. 1, 2018, filed Mar. 1, 2019, 12 pages.
  • U.S. Appl. No. 15/917,066—Office Action dated Mar. 19, 2019, 71 pages.
  • U.S. Appl. No. 15/917,066—Response to Office Action dated Mar. 19, 2019, filed May 23, 2019, 12 pages.
  • U.S. Appl. No. 15/917,066—Notice of Allowance dated Jun. 14, 2019, 5 pages.
  • U.S. Appl. No. 16/659,468—Office Action dated Jun. 19, 2020, 111 pages.
  • U.S. Appl. No. 15/917,066—Nonfinal Office Action dated Nov. 1, 2018, 31 pages.
  • U.S. Appl. No. 16/659,468—Non-Final Office Action dated Jun. 19, 2020, 111 pages.
  • U.S. Appl. No. 16/659,468—Response to Office Action dated Jun. 19, 2020 filed Sep. 18, 2020, 12 pages.
  • U.S. Appl. No. 16/659,468—Final Office Action dated Nov. 20, 2020, 18 pages.
  • U.S. Appl. No. 16/659,468—Response to Office Action dated Nov. 20, 2020 filed Mar. 22, 2021, 13 pages.
  • U.S. Appl. No. 16/659,468—Notice of Allowance dated Apr. 23, 2021, 11 pages.
  • U.S. Appl. No. 14/262,691, filed Apr. 25, 2014, U.S. Pat. No. 9,916,009, Mar. 13, 2018, Issued.
  • U.S. Appl. No. 15/917,066, filed Mar. 9, 2018, U.S. Pat. No. 10,452,151, Oct. 22, 2019, Issued.
  • U.S. Appl. No. 16/659,468, filed Oct. 21, 2019, U.S. Pat. No. 11,099,653, Aug. 24, 2021, Issued.
  • U.S. Appl. No. 17/409,767, filed Aug. 23, 2021, US-2021-0382563-A1, Dec. 9, 2021, Published.
  • U.S. Appl. No. 14/457,015, filed Aug. 11, 2014, Abandoned.
  • U.S. Appl. No. 14/476,694, filed Sep. 3, 2014, U.S. Pat. No. 10,281,987, May 7, 2019, Issued.
  • U.S. Appl. No. 16/402,134, filed May 2, 2019, U.S. Pat. No. 10,831,281, Nov. 10, 2020, Issued.
  • U.S. Appl. No. 17/093,490, filed Nov. 9, 2020, US-2021-0081054-A1, Mar. 18, 2021, Published.
  • U.S. Appl. No. 14/154,730, filed Jan. 14, 2014, U.S. Pat. No. 9,501,152, Nov. 22, 2016, Issued.
  • U.S. Appl. No. 15/358,104, filed Nov. 21, 2016, U.S. Pat. No. 10,042,430, Aug. 7, 2018, Issued.
  • U.S. Appl. No. 16/054,891, filed Aug. 3, 2018, U.S. Pat. No. 10,739,862, Aug. 11, 2020, Issued.
  • U.S. Appl. No. 14/155,722, filed Jan. 15, 2014, U.S. Pat. No. 9,459,697, Oct. 4, 2016, Issued.
  • U.S. Appl. No. 15/279,363, filed Sep. 28, 2016, U.S. Pat. No. 10,139,918, Nov. 27, 2018, Issued.
  • U.S. Appl. No. 16/195,755, filed Nov. 19, 2018, US-2019-0155394-A1, May 23, 2019, Allowed.
  • U.S. Appl. No. 14/476,694—Notice of Allowance dated Dec. 28, 2018, 22 pages.
  • U.S. Appl. No. 16/195,755—Office Action dated Nov. 29, 2019, 46 pages.
  • U.S. Appl. No. 16/195,755—Office Action dated Jun. 8, 2020, 15 pages.
  • U.S. Appl. No. 16/195,755—Response to Office Action dated Nov. 29, 2019, filed Feb. 27, 2020, 13 pages.
  • U.S. Appl. No. 16/402,134—Non-Final Office Action dated Jan. 27, 2020, 58 pages.
  • U.S. Appl. No. 16/402,134—Notice of Allowance dated Jul. 15, 2020, 9 pages.
  • U.S. Appl. No. 14/476,694—Response to Office Action dated Jul. 30, 2018 filed Nov. 9, 2018, 19 pages.
  • U.S. Appl. No. 16/659,468—Response to Office Action dated Jun. 19, 2020 filed , 12 pages.
  • U.S. Appl. No. 16/195,755—Response to Final Office Action dated Jun. 8, 2020 filed Sep. 21, 2020, 17 pages.
  • U.S. Appl. No. 17/093,490—Office Action dated Dec. 17, 2021, 101 pages.
  • U.S. Appl. No. 16/195,755—Advisory Action dated Sep. 30, 2020, 3 pages.
  • U.S. Appl. No. 16/195,755—Non Final Office Action dated May 25, 2021, 19 pages.
  • U.S. Appl. No. 16/195,755—Response to Non-Final Office Action dated May 25, 2021, filed Aug. 25, 2021, 15 pages.
  • U.S. Appl. No. 16/195,755—Notice of Allowance dated Sep. 29, 2021, 6 pages.
  • U.S. Appl. No. 16/195,755—Supplemental Notice of Allowance dated Oct. 14, 2021, 9 pages.
  • U.S. Appl. No. 14/154,730—Office Action dated Nov. 6, 2015, 9 pages.
  • U.S. Appl. No. 14/155,722—Office Action dated Nov. 20, 2015, 14 pages.
  • U.S. Appl. No. 14/281,817—Office Action dated Sep. 28, 2015, 5 pages.
  • U.S. Appl. No. 14/262,691—Office Action dated Dec. 11, 2015, 31 pages.
  • U.S. Appl. No. 14/154,730—Response to Office Action dated Nov. 6, 2015, filed Feb. 4, 2016, 9 pages.
  • U.S. Appl. No. 14/154,730—Notice of Allowance dated May 3, 2016, 5 pages.
  • U.S. Appl. No. 14/476,694—Office Action dated Nov. 1, 2016, 28 pages.
  • U.S. Appl. No. 14/476,694—Response to Office Action dated Nov. 1, 2016 filed Jan. 31, 2017, 15 pages.
  • U.S. Appl. No. 15/358,104—Office Action dated Nov. 2, 2017, 9 pages.
  • U.S. Appl. No. 14/476,694—Response to Office Action dated Apr. 7, 2017 filed Jul. 6, 2017, 22 pages.
  • U.S. Appl. No. 14/476,694—Advisory Action dated Jun. 22, 2017, 8 pages.
  • U.S. Appl. No. 14/516,493—Office Action dated May 9, 2016, 21 pages.
  • U.S. Appl. No. 14/516,493—Response to Office Action dated May 9, 2016, filed Aug. 9, 2016, 18 pages.
  • U.S. Appl. No. 14/516,493—Office Action dated Nov. 17, 2016, 30 pages.
  • Pavlovic, V.I., et al., “Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, No. 7, Jul. 1997, pp. 677-695.
  • Wu, Y., et al., “Vision-Based Gesture Recognition: A Review,” Beckman Institute, Copyright 1999, pp. 103-115.
  • U.S. Appl. No. 14/280,018—Office Action dated Feb. 12, 2016, 38 pages.
  • PCT/US2013/021713—International Preliminary Report on Patentability dated Jul. 22, 2014, 13 pages.
  • PCT/US2013/021713—International Search Report and Written Opinion dated Sep. 11, 2013, 7 pages.
  • Arthington, et al., “Cross-section Reconstruction During Uniaxial Loading,” Measurement Science and Technology, vol. 20, No. 7, Jun. 10, 2009, Retrieved from the Internet: http://iopscience.iop.org/0957-0233/20/7/075701, pp. 1-9.
  • Barat et al., “Feature Correspondences From Multiple Views of Coplanar Ellipses”, 2nd International Symposium on Visual Computing, Author Manuscript, 2006, 10 pages.
  • Bardinet, et al., “Fitting of iso-Surfaces Using Superquadrics and Free-Form Deformations” [on-line], Jun. 24-25, 1994 [retrieved Jan. 9, 2014], 1994 Proceedings of IEEE Workshop on Biomedical Image Analysis, Retrieved from the Internet: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=315882&tag=1, pp. 184-193.
  • Butail, S., et al., “Three-Dimensional Reconstruction of the Fast-Start Swimming Kinematics of Densely Schooling Fish,” Journal of the Royal Society Interface, Jun. 3, 2011, retrieved from the Internet <http://www.ncbi.nlm.nih.gov/pubmed/21642367>, pp. 0, 1-12.
  • Cheikh et al., “Multipeople Tracking Across Multiple Cameras”, International Journal on New Computer Architectures and Their Applications (IJNCAA), vol. 2, No. 1, 2012, pp. 23-33.
  • Chung, et al., “Recovering LSHGCs and SHGCs from Stereo,” International Journal of Computer Vision, vol. 20, No. 1/2, 1996, pp. 43-58.
  • Cumani, A., et al., “Recovering the 3D Structure of Tubular Objects from Stereo Silhouettes,” Pattern Recognition, Elsevier, GB, vol. 30, No. 7, Jul. 1, 1997, 9 pages.
  • Davis et al., “Toward 3-D Gesture Recognition”, International Journal of Pattern Recognition and Artificial Intelligence, vol. 13, No. 3, 1999, pp. 381-393.
  • Di Zenzo, S., et al., “Advances in Image Segmentation,” Image and Vision Computing, Elsevier, Guildford, GBN, vol. 1, No. 1, Copyright Butterworth & Co Ltd., Nov. 1, 1983, pp. 196-210.
  • Forbes, K., et al., “Using Silhouette Consistency Constraints to Build 3D Models,” University of Cape Town, Copyright De Beers 2003, Retrieved from the internet: <http://www.dip.ee.uct.ac.za/~kforbes/Publications/Forbes2003Prasa.pdf> on Jun. 17, 2013, 6 pages.
  • Heikkila, J., “Accurate Camera Calibration and Feature Based 3-D Reconstruction from Monocular Image Sequences”, Infotech Oulu and Department of Electrical Engineering, University of Oulu, 1997, 126 pages.
  • Kanhangad, V., et al., “A Unified Framework for Contactless Hand Verification,” IEEE Transactions on Information Forensics and Security, IEEE, Piscataway, NJ, US., vol. 6, No. 3, Sep. 1, 2011, pp. 1014-1027.
  • Kim, et al., “Development of an Orthogonal Double-Image Processing Algorithm to Measure Bubble,” Department of Nuclear Engineering and Technology, Seoul National University Korea, vol. 39 No. 4, Published Jul. 6, 2007, pp. 313-326.
  • Kulesza, et al., “Arrangement of a Multi Stereo Visual Sensor System for a Human Activities Space,” Source: Stereo Vision, Book edited by: Dr. Asim Bhatti, ISBN 978-953-7619-22-0, Copyright Nov. 2008, I-Tech, Vienna, Austria, www.intechopen.com, pp. 153-173.
  • May, S., et al., “Robust 3D-Mapping with Time-of-Flight Cameras,” 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, Piscataway, NJ, USA, Oct. 10, 2009, pp. 1673-1678.
  • Olsson, K., et al., “Shape from Silhouette Scanner—Creating a Digital 3D Model of a Real Object by Analyzing Photos From Multiple Views,” University of Linkoping, Sweden, Copyright VCG 2001, Retrieved from the Internet: <http://liu.diva-portal.org/smash/get/diva2:18671/FULLTEXT01> on Jun. 17, 2013, 52 pages.
  • Pedersini, et al., “Accurate Surface Reconstruction from Apparent Contours,” Sep. 5-8, 2000 European Signal Processing Conference EUSIPCO 2000, vol. 4, Retrieved from the Internet: http://home.deib.polimi.it/sarti/CV_and_publications.html, pp. 1-4.
  • Rasmussen, Matthew K., “An Analytical Framework for the Preparation and Animation of a Virtual Mannequin for the Purpose of Mannequin-Clothing Interaction Modeling”, A Thesis Submitted in Partial Fulfillment of the Requirements for the Master of Science Degree in Civil and Environmental Engineering in the Graduate College of the University of Iowa, Dec. 2008, 98 pages.
  • U.S. Appl. No. 14/280,018—Replacement Response to Office Action, dated Feb. 12, 2016, filed Jun. 8, 2016, 16 pages.
  • U.S. Appl. No. 14/280,018—Notice of Allowance dated Sep. 7, 2016, 7 pages.
  • U.S. Appl. No. 14/280,018—Response to Office Action dated Feb. 12, 2016, filed May 12, 2016, 15 pages.
  • U.S. Appl. No. 14/262,691—Response to Office Action dated Dec. 11, 2015, filed May 11, 2016, 15 pages.
  • U.S. Appl. No. 14/262,691—Office Action dated Aug. 19, 2016, 36 pages.
  • U.S. Appl. No. 14/262,691—Response to Office Action dated Aug. 19, 2016, filed Nov. 21, 2016, 13 pages.
  • U.S. Appl. No. 14/262,691—Office Action dated Jan. 31, 2017, 27 pages.
  • U.S. Appl. No. 14/262,691—Response to Office Action dated Jan. 31, 2017, filed Jun. 30, 2017, 20 pages.
  • U.S. Appl. No. 14/262,691—Notice of Allowance dated Oct. 30, 2017, 35 pages.
  • U.S. Appl. No. 14/476,694—Office Action dated Aug. 10, 2017, 71 pages.
  • U.S. Appl. No. 14/476,694—Response to Office Action dated Aug. 10, 2017, filed Nov. 10, 2017, 14 pages.
  • U.S. Appl. No. 14/155,722—Response to Office Action dated Nov. 20, 2015, filed Feb. 19, 2016, 15 pages.
Patent History
Patent number: 11353962
Type: Grant
Filed: Aug 6, 2020
Date of Patent: Jun 7, 2022
Patent Publication Number: 20200363874
Assignee: Ultrahaptics IP Two Limited (Bristol)
Inventors: Raffi Bedikian (San Francisco, CA), Jonathan Marsden (San Mateo, CA), Keith Mertens (Oakland, CA), David Holz (San Francisco, CA)
Primary Examiner: Cao H Nguyen
Application Number: 16/987,289
Classifications
Current U.S. Class: Gesture-based (715/863)
International Classification: G06F 3/01 (20060101); G06F 3/03 (20060101); G06V 40/20 (20220101); G06F 3/04845 (20220101)