METHODS, DEVICES AND SYSTEMS ENABLING DETERMINATION OF EYE STATE VARIABLES
Methods, devices and systems for generating data suitable for determining at least one eye state variable of at least one eye of a subject, and methods and systems for determining such eye state variables are provided. The at least one eye state variable is derivable from at least one image of the eye taken with a camera of known camera intrinsics. Synthetic image data of a first 3D model eye which models corneal refraction can be generated for different sets of eye state variables and may be used to determine said eye state variables using a further 3D eye model comprising at least one parameter. A characteristic of the pupil image in the synthetic images can be determined for later use to determine eye state variables based on image data of real eyes and under use of the further 3D eye model.
Embodiments of the present invention relate to methods, devices and systems that may be used in the context of eye tracking, in particular methods for generating data suitable for and enabling determining a state of an eye of a human or animal subject.
BACKGROUND
Over the last decades, camera-based eye trackers have become a potent and wide-spread research tool in many fields including human-computer interaction, psychology, and market research. Offering increased mobility compared to remote eye-tracking solutions, head-mounted eye trackers, in particular, have enabled the acquisition of gaze data during dynamic activities also in outdoor environments. The traditional computational pipeline for mobile gaze estimation using head-worn eye trackers involves eye landmark detection, in particular detecting the pupil center or ellipse fitting either using special-purpose image processing techniques or machine learning, and gaze mapping, traditionally using a geometric eye model or by directly mapping 2D pupil positions to 3D gaze directions or points, or 2D gaze points within a camera image of a likewise head-worn front facing scene camera. While the latter approach works well when calibrating the eye tracking device to a particular user and wearing state, once the head-worn eye tracker slightly slips or moves with respect to its initial, calibrated position on the head of the user, the calibrated mapping from 2D pupil positions to gaze points deteriorates or breaks down entirely. In order to cope with such eye tracker “slippage”, eye tracking strategies using full 3D eye models are superior, since they allow constant recalculation of the actual 3D location of the eye(s) (eyeball centers) with respect to the head-worn eye tracker, in particular with respect to the coordinate system(s) defined by the camera(s) recording the eye(s), and of the corresponding gaze vectors. Knowing the location/coordinates of the eyeball center also opens the way to pupillometry, i.e. measuring the actual size of the pupil.
Methods employing 3D eye models can in turn be divided into methods making use of corneal reflections (so-called “glints”) produced by light sources located at known positions with respect to the cameras recording the eye images, and methods which instead derive the eye model location and gaze direction directly from the pupil shape, without the use of any artificially produced reflections.
Eye trackers using glints rely on complex optical setups involving the active generation of said corneal reflections by means of infrared (IR) LEDs and/or pairs of calibrated stereo cameras. Glint-based (i.e. using corneal reflections) gaze estimation needs to reliably detect those reflections in the camera image and needs to be able to associate each with a unique light source. If successful, the 3D position of the cornea center (assuming a known radius of curvature, i.e. a parameter of a 3D eye model) can be determined. Besides the additional hardware requirements, another issue encountered in this approach is spurious reflections produced by other illuminators, which may strongly impact the achievable accuracy. From an engineering point of view, glint-free estimation of gaze-related and other eye state variables of an eye is therefore highly desirable. However, determining eye state variables from camera images alone (solving an inverse problem) is challenging and so far requires comparatively high computing power, often limiting the application area, in particular if head and/or eye movement with respect to the camera is to be compensated (e.g. “slippage” of a head-mounted eye tracker). Head-mounted eye trackers are in general desired to resolve ambiguities during eye state estimation with more restricted hardware setups than remote eye-trackers.
As an alternative to “glint-based” methods for eye state estimation, there exist methods which instead derive a 3D eye model location and gaze direction directly from the pupil shape, without the use of any artificially produced reflections, see for example reference [1]. One of the challenges of such methods is the size-distance ambiguity: given only one 2D image of an eye it is not possible to know a priori whether the pupil of the eye is small and close or large and far away. Resolving this ambiguity requires a time series of many camera images which show the eye under largely varying gaze angles with respect to the camera, and complex numerical optimization methods to fit the 3D eye model in an iterative fashion to said time series of eye observations to yield the final eyeball center coordinates in camera coordinate space, which in turn are needed to derive quantities like the 3D gaze vector or the pupil size in physical units, such as millimeters.
Simpler and faster ways of calculating the eyeball center, and thus of resolving the size-distance ambiguity without requiring computationally expensive iterative numerical optimization methods, have been proposed in [4], see also WO2020/244752 and WO2020/244971, which are hereby incorporated in their entirety. The methods described therein employ the same 3D eye model as [1], which does not include a cornea and has a single parameter, namely the distance R between eyeball center and pupil center, which can be assumed as a physiological constant since human eyes vary only to a small extent between individuals. A post-hoc refraction correction strategy to deal with the effects of corneal refraction is described in [4] and WO2020/244752. While this method of dealing with corneal refraction has been shown to work well, it requires ray-tracing of a substantial amount of synthetic images. Also, generating the required polynomial features from the preliminary values of the eye state at a given point in time (one eye observation) during runtime and applying the correction mapping takes some calculation time. While the method is able to perform at common frame rates used in real-time applications, saving computational time and energy in mobile, real-time applications is always a prime directive and even faster methods are thus desirable.
Pupillometry, the study of temporal changes in pupil diameter as a function of external light stimuli or cognitive processing, is another field of application of general purpose eye-trackers and requires accurate measurements of pupil dilation. Average human pupil diameters are of the order of 3 mm (size of the aperture stop), while peak dilation in cognitive processes can amount to merely a few percent with respect to a baseline pupil size, thus demanding sub-millimeter accuracy. Video-based eye trackers are in general able to provide apparent (entrance) pupil size signals. However, the latter are usually subject to pupil foreshortening errors, i.e. the combined effect of the change of apparent pupil size as the eye rotates away from or towards the camera and the gaze-angle dependent influence of corneal refraction. Such errors can easily amount to more than 10%, thus being larger than the pupil size changes that need to be measured. Also, many prior art methods and devices only provide pupil size in (pixel-based) arbitrary units, while there is an inherent merit in providing an absolute value in units of physical length (e.g. [mm]), since cognitively induced absolute changes are largely independent of baseline pupil radius, and hence only measuring absolute values makes experiments comparable. Hence, only a 3D eye model based eye state determination which takes effects of corneal refraction into account is maximally useful for the purpose of precision pupillometry.
Accordingly, there is a need to further improve the speed, robustness and accuracy of the detection of eyeball position, gaze direction, pupil size and other eye state variables and reduce the computational effort required therefor, while taking into account the effects of corneal refraction.
SUMMARY
According to an embodiment of a method for generating data suitable for determining at least one eye state variable of at least one eye of a subject, the eye comprising an eyeball, an iris defining a pupil, and a cornea, the at least one eye state variable being derivable from at least one image of the eye taken with a camera of known camera intrinsics, the method includes providing a first 3D eye model modeling corneal refraction. Using the known camera intrinsics, synthetic image data of several model eyes according to the first 3D eye model is generated for a plurality of given values of at least one eye state variable. Using a given algorithm the at least one eye state variable is calculated using one or more of the synthetic images and a further 3D eye model having at least one parameter. A characteristic of the image of the pupil within each of the synthetic images is determined and one or more hypothetically optimal values of the at least one parameter of the further 3D eye model that minimize the error between the value(s) of the at least one given eye state variable and the value(s) of the corresponding eye state variable obtained when applying the given algorithm are determined. Finally, a relationship between the one or more hypothetically optimal values of the at least one parameter of the further 3D eye model and the characteristic of the pupil image is established.
According to an embodiment of a method for determining at least one eye state variable of at least one eye of a subject, the eye comprising an eyeball, an iris defining a pupil, and a cornea, the at least one eye state variable being derivable from at least one image of the eye taken with a camera of known camera intrinsics, the method comprises receiving image data of the at least one eye from a camera of known camera intrinsics and defining an image plane, determining a characteristic of the image of the pupil within the image data, providing a 3D eye model having at least one parameter, the parameter depending in a pre-determined relationship on the characteristic and using a given algorithm to calculate the at least one eye state variable using the image data and the 3D eye model including the at least one characteristic-dependent parameter.
According to an embodiment of a system for generating data suitable for determining at least one eye state variable of at least one eye of a subject, the eye comprising an eyeball, an iris defining a pupil, and a cornea, the at least one eye state variable being derivable from at least one image of the eye taken with a camera of known camera intrinsics, the system comprises a computing and control unit configured to generate, using the known camera intrinsics, synthetic image data of several model eyes according to a first 3D eye model modeling corneal refraction, for a plurality of given values of at least one eye state variable, calculate, using a given algorithm, the at least one eye state variable making use of one or more of the synthetic images and a further 3D eye model having at least one parameter, determine a characteristic of the image of the pupil within each of the synthetic images, determine one or more hypothetically optimal values of the at least one parameter of the further 3D eye model that minimize the error between the value(s) of the at least one given eye state variable and the value(s) of the corresponding eye state variable obtained when applying the given algorithm, and establish a relationship between the one or more hypothetically optimal values of the at least one parameter of the further 3D eye model and the characteristic of the pupil image. The relationship can be stored in a memory.
According to an embodiment of a system for determining at least one eye state variable of at least one eye of a subject, the eye comprising an eyeball, an iris defining a pupil, and a cornea, the at least one eye state variable being derivable from at least one image of the eye taken with a camera of known camera intrinsics, the system comprises a device comprising at least one camera of known camera intrinsics for producing image data including at least one eye of a subject, the at least one camera comprising a sensor defining an image plane, the at least one eye comprising an eyeball, an iris defining a pupil, and a cornea. The system further comprises a computing and control unit configured to receive image data of the at least one eye from the at least one camera, determine a characteristic of the image of the pupil within the image data, calculate, using a given algorithm, the at least one eye state variable making use of the image data and a 3D eye model having at least one parameter, the parameter depending in a pre-determined relationship on the characteristic, the relationship being retrieved from a memory.
Other embodiments include (non-volatile) computer-readable storage media or devices, and one or more computer programs recorded on one or more computer-readable storage media or computer storage devices. The one or more computer programs can be configured to perform particular operations or processes by virtue of including instructions that, when executed by one or more processors of a system, in particular one of the systems as explained herein, cause the system to perform the operations or processes.
The components in the figures are not necessarily to scale, instead emphasis being placed upon illustrating the principles of the invention. Moreover, in the figures, like reference numerals designate corresponding parts. In the drawings:
In the following Detailed Description, reference is made to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration specific embodiments in which the invention may be practiced. In this regard, directional terminology, such as “top,” “bottom,” “front,” “back,” “leading,” “trailing,” etc., is used with reference to the orientation of the Figure(s) being described. Because components of embodiments can be positioned in a number of different orientations, the directional terminology is used for purposes of illustration and is in no way limiting. It is to be understood that other embodiments may be utilized and structural or logical changes may be made without departing from the scope of the present invention. The following detailed description, therefore, is not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.
The terms “user” and “subject” are used interchangeably and designate a human or animal being having one or more eyes.
The term “3D” is used to signify “three-dimensional”.
The terms “eye state” and “eye state variable(s)” are used to signify quantities that characterize the pose of an eye (e.g. eyeball position and orientation such as via the gaze vector in a given coordinate system), the size of the pupil or any other quantity that is typically primarily variable during observation of a real eye. In contrast, the term “eye model parameter(s)” is used to signify quantities which characterize an abstract, idealized 3D model of an eye, e.g. a radius of an eyeball, a radius of an eye sphere, a radius (of curvature) of a cornea, an outer radius of an iris, an index of refraction of certain eye structures or various distance measures between an eyeball center, a pupil center, a cornea center, etc. Statistical information about such parameters like their means and standard deviations can be measured for a given species, like humans, for which such information is typically known from the literature.
It is a task of the invention to provide methods and systems allowing for improved generation, in particular computationally faster, easier and/or more reliable and accurate generation, of data suitable for determining eye state variables of a human or animal eye, and correspondingly to provide methods, systems and devices for determining such eye state variables. A suitable device includes one or more cameras for generating image data of one or more respective eyes of a human or animal subject or user within the field-of-view of the device.
Said task is solved by the subject matter of the independent claims.
The device may be a head-wearable device, configured for being wearable on a user’s head and may be used for determining one or more gaze- and/or eye-related state variables of a user wearing the head-wearable device.
Alternatively, the device may be remote from the subject, such as a commonly known remote eye-tracking camera module.
The head-wearable device may be implemented as a (head-wearable) spectacles device comprising a spectacles body, which is configured such that it can be worn on a head of a user, for example in a way usual glasses are worn. Hence, the spectacles device when worn by a user may in particular be supported at least partially by a nose area of the user’s face. The head-wearable device may also be implemented as an augmented reality (AR-) and/or virtual reality (VR-) device (AR/VR headset), in particular a goggles, or a head-mounted display (HMD). For the sake of clarity, devices are mainly described with regard to head-wearable spectacles devices in the following.
The device has at least one camera having a sensor arranged in or defining an image plane for producing image data, typically taking images, of one or more eyes of the user, e.g. of a left and/or a right eye of the user. In other words, the camera, which is in the following also referred to as eye camera, may be a single camera of the device. This may in particular be the case if the device is remote from the user. As used herein, the term “remote” shall describe distances of approximately more than 20 centimeters from the eye(s) of the user. In such a setup, a single eye camera may be able to produce image data of more than one eye of the user simultaneously, in particular images which show both a left and right eye of a user.
Alternatively, the device may have more than one eye camera. This may in particular be the case if the device is a head-wearable device. Such devices are located in close proximity to the user when in use. An eye camera located on such a device may thus only be able to view and image one eye of the user. Such a camera is often referred to as a near-eye camera. Typically, head-wearable devices thus comprise more than one (near-)eye camera, for example, in a binocular setup, at least a first or left (side) eye camera and a second or right (side) eye camera, wherein the left camera serves for taking a left image or a stream of images of at least a portion of the left eye of the user, and wherein the right camera takes an image or a stream of images of at least a portion of a right eye of the user. In the following, any eye camera beyond the first is also called a further eye camera.
In case of a head-wearable device, the eye camera(s) can be arranged at the spectacles body in inner eye camera placement zones and/or in outer eye camera placement zones, in particular wherein said zones are determined such that an appropriate picture of at least a portion of the respective eye can be taken for the purpose of determining one or more eye state variables. In particular, the cameras may be arranged in a nose bridge portion and/or in a lateral edge portion of the spectacles frame, such that an optical field of a respective eye is not obstructed by the respective camera. For example, the cameras can be integrated into a frame of the spectacles body and thereby be non-obstructive.
Furthermore, the device may have illumination means for illuminating the left and/or right eye of the user, in order to increase image data quality, in particular if the light conditions within an environment of the spectacles device are not optimal. Infrared (IR) light may be used for this purpose. Accordingly, the recorded eye image data does not necessarily need to be in the form of pictures as visible to the human eye, but can also be an appropriate representation of the recorded (filmed) eye(s) in a range of light non-visible for humans.
The eye camera(s) is/are typically of known camera intrinsics. As used herein, the term “camera of known camera intrinsics” shall describe that the optical properties of the camera, in particular its imaging properties, are known and/or can be modeled using a respective camera model including the known intrinsic(s) (parameters) approximating the eye camera producing the eye images. Typically, a pinhole camera model is used and full perspective projection is assumed for modeling the eye camera and imaging process. The known intrinsic parameters may include a focal length of the camera, an image sensor format of the camera, a principal point of the camera, a shift of a central image pixel of the camera, a shear parameter of the camera, and/or one or more distortion parameters of the camera.
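As an illustration of the pinhole camera model with full perspective projection, the following Python sketch projects a 3D point given in camera coordinates onto the image plane. The helper names `intrinsic_matrix` and `project_point`, the numerical intrinsics, and the omission of lens distortion are assumptions made for this example and are not taken from the description.

```python
import numpy as np

def intrinsic_matrix(fx, fy, cx, cy, shear=0.0):
    """Build a 3x3 pinhole intrinsic matrix (hypothetical helper).

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels;
    shear: shear (skew) parameter. Distortion is deliberately ignored here.
    """
    return np.array([[fx, shear, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])

def project_point(K, point_cam):
    """Perspective projection of a 3D point (camera coordinates) to pixels."""
    p = np.asarray(point_cam, dtype=float)
    if p[2] <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    uvw = K @ p                # homogeneous image coordinates
    return uvw[:2] / uvw[2]    # perspective division by depth

# Assumed example intrinsics for a 640x480 sensor.
K = intrinsic_matrix(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
pixel = project_point(K, [1.0, 2.0, 10.0])       # -> (370, 340)
on_axis = project_point(K, [0.0, 0.0, 5.0])      # -> principal point (320, 240)
```

A point on the optical axis lands on the principal point regardless of its depth, which is one quick sanity check for a set of intrinsics.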
The eye state of the subject’s eye typically refers to an eyeball, a gaze and/or a pupil of the subject’s eye, in particular it may refer to and/or be a center of the eyeball, in particular a center of rotation of the eyeball or an optical center of the eyeball, or a certain subset of 3D space in which said center is to be located, like for example a line in 3D, or a gaze-related variable of the eye, for example a gaze direction, a cyclopean gaze direction, a 3D gaze point, a 2D gaze point, a visual axis orientation, an optical axis orientation, a pupil axis orientation, a line of sight orientation, a limbus major and/or minor axes orientation, an eye cyclo-torsion, an eye vergence, a statistics over eye adduction and/or eye abduction, and a statistics over eye elevation and/or eye depression, and data about drowsiness and/or awareness of the user.
The eye state (variable(s)) may as well refer to and/or be a measure of the pupil size of the eye, such as a pupil radius, a pupil diameter or a pupil area.
Gaze- or eye-related variables, points and directions are typically determined with respect to a coordinate system that is fixed to the eye camera(s) and/or the device.
For example, (a) Cartesian coordinate system(s) defined by the image plane(s) of the eye camera(s) may be used.
Variables, points and directions may also be specified or determined within and/or converted into a device coordinate system, a head coordinate system, a world coordinate system or any other suitable coordinate system.
In particular, if the device comprises more than one eye camera and the relative poses, i.e. the relative positions and orientations of the eye cameras, are known, geometric quantities like points and directions which have been specified or determined in any one of the eye camera coordinate systems can be converted into a common coordinate system. Relative camera poses may be known because they are fixed by design, or because they have been measured after each camera has been adjusted into its use position.
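The conversion into a common coordinate system can be sketched as a rigid transformation. In the following Python example the pose of an eye camera is assumed to be given as a rotation matrix `R` and a translation vector `t` that map camera coordinates into the common frame; these names and the numerical pose are illustrative assumptions. Points are rotated and translated, whereas directions are only rotated.

```python
import numpy as np

def transform_point(R, t, point_cam):
    """Map a 3D point from eye camera coordinates into the common frame."""
    return R @ np.asarray(point_cam, dtype=float) + t

def transform_direction(R, direction_cam):
    """Map a direction (e.g. a gaze vector); translation does not apply."""
    return R @ np.asarray(direction_cam, dtype=float)

# Assumed example pose: 90 degree rotation about the z axis plus a 10 mm offset.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([10.0, 0.0, 0.0])

point_common = transform_point(R, t, [1.0, 0.0, 0.0])       # -> (10, 1, 0)
direction_common = transform_direction(R, [1.0, 0.0, 0.0])  # -> (0, 1, 0)
```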
Eye model parameter(s) may for example be a distance between a center of an eyeball, in particular a rotational, geometrical or optical center, and a center of a pupil or cornea, a size measure of an eyeball, a cornea or an iris such as an eyeball radius, a cornea radius, an iris diameter, a distance pupil-center to cornea-center, a distance cornea-center to eyeball-center, a distance pupil-center to limbus center, a distance crystalline lens to eyeball-center, to cornea center and/or to corneal apex, a refractive property of an eye structure such as an index of refraction of a cornea, vitreous humor or crystalline lens, an ellipsoidal shape measure of an eyeball or cornea, a degree of astigmatism, and an eye intra-ocular distance or inter-pupillary distance.
In the following, exemplary algorithms for determining eye state variables using 3D eye models will be discussed. Other such algorithms exist and can be used in the methods, devices and systems of the present invention.
In one example, an algorithm suitable for determining eye state variables of at least one eye of a subject, the eye comprising an eyeball and an iris defining a pupil, includes receiving image data of an eye at a first time from a camera of known camera intrinsics, which camera defines an image plane. A first ellipse representing a border of the pupil of the eye at the first time is determined in the image data. The camera intrinsics and the first ellipse are used to determine a 3D orientation vector of a first circle in 3D and a first center line on which a center of the first circle is located in 3D, so that a projection of the first circle, in a direction parallel to the first center line, onto the image plane is expected to reproduce the first ellipse. A first eye intersecting line in 3D expected to intersect a 3D center of the eyeball at the first time is determined as a line which is, in the direction of the orientation vector, parallel-shifted to the first center line by an expected distance between the center of the eyeball and a center of the pupil.
Accordingly, the first eye intersecting line, which limits the position of the center of the eyeball to a line and thus can be considered as one of several variables characterizing the state of the eye, can be determined without using glints or markers and with low calculation costs, low numerical effort and/or very quickly. This even allows determining the state of an eye in real time (within the sub-millisecond range per processed image) with comparatively low hardware requirements. Accordingly, eye state variables may be determined with hardware that is integrated into a head-wearable device while eye images are being taken with the camera of the head-wearable device and with only a negligible delay, or with hardware of low computational power, like smart devices, connectable to the head-wearable device.
Note that the process of determining the orientation vector of the first circle and the first center line is typically done similarly to what is explained in reference [1]. Reference [1] describes a method of 3D eye model fitting and gaze estimation based on pupil shape derived from (monocular) eye images. Starting from a camera image of an eye and having determined the area of the pupil represented by an ellipse, the first step is to determine the circle in 3D space, which gives rise to the observed elliptical image pupil, assuming a (full) perspective projection and known camera parameters. Once this circle is found, it can serve as an approximation of the actual pupil of the eye, i.e. the approximately circular opening of varying size within the iris.
Note that due to a property of perspective projection, the circle center line, i.e. the line in 3D on which lie the centers of all possible circles which produce one and the same particular ellipse in 2D (camera) image space under perspective projection, is NOT trivially obtained, since it does NOT go through the center of said ellipse. Instead, the rather involved mathematical methods for achieving this “unprojection” are explained in reference [3].
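The property noted above can be checked numerically. The following Python sketch is an illustration only and does not implement the unprojection of reference [3]: it samples a tilted 3D circle, projects it with an assumed pinhole camera, and compares the center of the resulting image ellipse with the projection of the 3D circle center. All numerical values (distance, radius, tilt angle, focal length) and helper names are assumptions chosen for the example.

```python
import numpy as np

def circle_points(center, normal, radius, n=3600):
    """Sample n points on a 3D circle with the given center, normal and radius."""
    center = np.asarray(center, dtype=float)
    normal = np.asarray(normal, dtype=float)
    normal = normal / np.linalg.norm(normal)
    # Two orthonormal vectors spanning the plane of the circle.
    helper = np.array([0.0, 1.0, 0.0]) if abs(normal[1]) < 0.9 else np.array([1.0, 0.0, 0.0])
    u = np.cross(normal, helper)
    u = u / np.linalg.norm(u)
    v = np.cross(normal, u)
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return center + radius * (np.outer(np.cos(t), u) + np.outer(np.sin(t), v))

def project(points, focal_px):
    """Pinhole projection with the principal point at the image origin."""
    points = np.atleast_2d(points)
    return focal_px * points[:, :2] / points[:, 2:3]

# Assumed toy configuration: circle 30 mm in front of the camera, tilted by 40 degrees.
center = np.array([0.0, 0.0, 30.0])
angle = np.deg2rad(40.0)
normal = np.array([np.sin(angle), 0.0, -np.cos(angle)])
outline = project(circle_points(center, normal, radius=2.0), focal_px=600.0)

# An ellipse is centrally symmetric, so its center equals the center of its
# axis-aligned bounding box (approximated here from densely sampled points).
ellipse_center = 0.5 * (outline.min(axis=0) + outline.max(axis=0))
projected_circle_center = project(center, focal_px=600.0)[0]
offset_px = ellipse_center - projected_circle_center
```

With these assumed values the two points differ by somewhat more than one pixel along the tilt direction, so taking the ellipse center as the image of the 3D circle center would introduce a systematic error.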
Said ellipse “unprojection” as detailed in [3] gives rise to two ambiguities: firstly, there are two solution circles for a given fixed circle radius on the cone which represents the space of all possible solutions. Deciding which one is a correct pupil candidate is described in [1].
The second ambiguity is a size-distance ambiguity, which is the harder one to resolve: given only a 2D image of the pupil it is not possible to know a priori whether the pupil is small and close to the camera or large and far away from the camera. This second ambiguity is resolved in reference [1] by generating a model which comprises 3+3N parameters, including the 3 eyeball center coordinates and parameters of pupil candidates extracted from a time series of N camera images. This model is then optimized numerically in a sophisticated iterative fashion to yield the final eyeball center coordinates.
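The size-distance ambiguity follows directly from perspective projection and can be reproduced with a few lines of Python. In this sketch, which uses an assumed pinhole camera of unit focal length and arbitrarily chosen toy values, a small circle close to the camera and a circle twice as large at twice the distance along the same viewing ray yield the same image outline.

```python
import numpy as np

def project_circle(center, normal, radius, n=360):
    """Sample a 3D circle and project it with a unit-focal-length pinhole camera."""
    center = np.asarray(center, dtype=float)
    normal = np.asarray(normal, dtype=float)
    normal = normal / np.linalg.norm(normal)
    # Two orthonormal vectors spanning the circle plane; the fixed helper
    # vector assumes the normal is not parallel to the y axis.
    u = np.cross(normal, [0.0, 1.0, 0.0])
    u = u / np.linalg.norm(u)
    v = np.cross(normal, u)
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    pts = center + radius * (np.outer(np.cos(t), u) + np.outer(np.sin(t), v))
    return pts[:, :2] / pts[:, 2:3]   # perspective division

normal = [0.3, 0.2, -1.0]             # same orientation for both circles
small_near = project_circle([1.0, 0.5, 20.0], normal, radius=1.5)
large_far = project_circle([2.0, 1.0, 40.0], normal, radius=3.0)

same_image = np.allclose(small_near, large_far)
```

Since both outlines coincide, a single pupil ellipse cannot distinguish between the two configurations; additional information, such as a constraint from a 3D eye model, is needed.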
Note that even the projection of the 3D eyeball center into 2D image space can in general NOT be trivially obtained based on 2D pupil image measures or calculations alone. Some prior art alleges that this can be done by intersecting minor axes of multiple pupil image ellipse observations under varying gaze angles. This is true if and only if the optical axis of the camera points exactly at the 3D eyeball center, which is in general NOT the case and can NOT be reliably achieved or assumed when acquiring images of real eyes. Even IF such a 2D projection of the true eyeball center could be obtained, the mere derivation of such a point in 2D image space does firstly not imply the construction of a line in 3D from the camera through said point and secondly, even IF such a line were constructed, it would not resolve the size-distance ambiguity. This is the reason why reference [1] resorts to iterative numerical optimization as explained above.
Note also that algorithms exist which make use of the OUTER iris contour or limbus, i.e. the edge or contour where the iris and cornea transition to the sclera. They have the advantage that unlike the pupil (the “inner” iris contour), the outer iris contour does not change size. Furthermore, in humans the outer iris has a fairly uniform radius of ri = 6 mm, and is often used as a parameter of a 3D eye model. When detecting the outer iris contour as an elliptical shape in a camera image, it is thus possible to apply the same strategies as outlined in [1] and [3] and calculate the circle in 3D that gave rise to said elliptical shape; the size-distance ambiguity does not exist in this case, since the size of the circle in 3D can be assumed as known. However, limbus tracking methods have two inherent disadvantages. Firstly, the contrast of the limbus is mostly inferior to the contrast of the pupil, and secondly, larger parts of the limbus are usually occluded, either by the eyelids or, in particular if head-mounted eye cameras are used, because the viewing angle of the camera onto the eye makes it difficult or impossible to image the entire iris. Both issues make reliable limbus detection difficult and pupil detection based methods for determining eye state variables are thus largely preferable, in particular in head-mounted scenarios using near-eye cameras.
Compared to reference [1], the solutions proposed in [4], WO2020/244752 and WO2020/244971 for resolving the size-distance ambiguity (which are based on the proposed eye intersecting line(s)) represent a considerable conceptual and computational simplification. This allows determining eye state variables such as eyeball position(s), pupil size(s) and gaze direction(s) in real-time with considerably reduced hardware and/or software requirements. Accordingly, lightweight and/or comparatively simple head-wearable devices may be more broadly used for purposes like gaze-estimation and/or pupillometry, i.e. measuring the actual size of the pupil in physical units of length.
Note that these methods do not require taking into account a glint from the eye for generating data suitable for determining eye state variables. In other words, the methods are glint-free and do not require using structured light and/or special purpose illumination hardware.
Note further that eyes within a given species, e.g. humans, only vary in size within a very narrow margin and many physiological parameters can thus be assumed constant/equal between different subjects, which enables the use of 3D models of an average eye for the purpose of determining eye state variables. An example for such a physiological parameter is the distance R between center of the eyeball, in the following also referred to as eyeball center, and center of the pupil, in the following also referred to as pupil center. In human eyes the distance R can be assumed with high accuracy as a constant (R = 10.39 mm), which can therefore be used as the expected distance in a 3D model of the human eye for calculating eye state variables.
Therefore, the expected value R can be used to construct an ensemble of possible eyeball center positions (a 3D eye intersecting line), based on an ensemble of possible pupil center positions (a 3D circle center line) and a 3D orientation vector of the ensemble of possible 3D pupil circles, by parallel-shifting the 3D circle center line by the expected distance R between the center of the eyeball and a center of the pupil along the direction of the 3D orientation vector. Note again that in this particular scenario, distance R is a (constant) physiological parameter of the underlying 3D eye model and NOT a quantity that needs to be measured for each subject.
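The parallel shift described above can be sketched in a few lines. This is a minimal illustration only, assuming that the 3D circle center line is given as a point and a direction in camera coordinates and that the orientation vector points away from the camera; the function name eye_intersecting_line and its argument names are hypothetical and not taken from the cited references.

```python
import numpy as np

R_EYE_MM = 10.39  # expected eyeball-center-to-pupil-center distance stated in the text


def eye_intersecting_line(line_point, line_dir, orientation, R=R_EYE_MM):
    """Parallel-shift a 3D circle center line by R along the orientation vector.

    line_point, line_dir: a point on, and the direction of, the circle center line.
    orientation: 3D orientation vector of the pupil circle (assumed to point
    away from the camera, i.e. from the pupil center towards the eyeball center).
    Returns (point, unit direction) of the resulting eye intersecting line.
    """
    line_point = np.asarray(line_point, dtype=float)
    d = np.asarray(line_dir, dtype=float)
    n = np.asarray(orientation, dtype=float)
    d = d / np.linalg.norm(d)
    n = n / np.linalg.norm(n)
    # Shifting every point of the line by R * n keeps the direction unchanged.
    return line_point + R * n, d
```

Because only the base point of the line is moved, the shifted line contains the eyeball center whenever the original line contains the pupil center and the model distance R holds.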
Each further image / observation of one and the same eye but with a different gaze direction gives rise to an independent eye intersecting line in 3D. Finding the nearest point between or intersection of at least two independent eye intersecting lines thus yields the coordinates of the eyeball center in a non-iterative manner. This provides considerable conceptual and computational simplification over prior art methods.
Accordingly, in a monocular version of an example algorithm for determining eye state variables, this algorithm includes receiving a second image of the eye at a second time from the camera, more typically a plurality of further images at respective times, determining a second ellipse in the second image, the second ellipse at least substantially representing the border of the pupil at the second time, more typically determining for each of the further images a respective ellipse, using the second ellipse to determine an orientation vector of a second circle and a second center line on which a center of the second circle is located, so that a projection of the second circle, in a direction parallel to the second center line, onto the image plane is expected to reproduce the second ellipse, more typically using the respective ellipse to determine an orientation vector of the further circle and a further center line on which the further circle is located, so that a projection of the further circle, in a direction parallel to the further center line, onto the image plane is expected to reproduce the respective further ellipse, and determining a second eye intersecting line expected to intersect the center of the eyeball at the second time as a line which is, in the direction of the orientation vector of the second circle, parallel-shifted to the second center line by the expected distance, more typically determining further eye intersecting lines each of which is expected to intersect the center of the eyeball at the respective further time as a line which is, in the direction of the orientation vector of the further circle, parallel-shifted to the further center line by the expected distance.
In other words, a camera model such as a pinhole camera model describing the imaging characteristics of the camera and defining an image plane (and known camera intrinsic parameters as parameters of the camera model) is used to determine for several images taken at different times with the camera an orientation vector of a respective circle and a respective center line on which a center of the circle is located, so that a projection of the circle, in a direction parallel to the center line, onto the image plane reproduces the respective ellipse in the camera model, and determining a respective line which is, in the direction of the orientation vector, which typically points away from the camera, parallel-shifted to the center line by an expected distance between a center of an eyeball of the eye and a center of a pupil of the eye as an eye intersecting line which intersects the center of the eyeball at the corresponding time. Thereafter, the eyeball center may be determined as nearest intersection point of the eye intersecting lines in a least squares sense.
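The last step, finding the nearest intersection point of several eye intersecting lines in a least squares sense, has a standard closed-form solution. The following sketch assumes each line is given by a point and a direction; the function name nearest_point_to_lines is hypothetical, and the formulation shown is the generic textbook one rather than necessarily the one used in the cited references.

```python
import numpy as np


def nearest_point_to_lines(points, directions):
    """Return the 3D point minimising the summed squared distance to all lines.

    points: (n, 3) array, one point on each line.
    directions: (n, 3) array, direction of each line (need not be normalised).
    """
    points = np.asarray(points, dtype=float)
    directions = np.asarray(directions, dtype=float)
    directions = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for a, d in zip(points, directions):
        # Projector onto the plane orthogonal to d: measures distance to the line.
        P = np.eye(3) - np.outer(d, d)
        A += P
        b += P @ a
    # A is singular if all lines are parallel; np.linalg.solve then raises.
    return np.linalg.solve(A, b)
```

The normal equations become ill-conditioned when the line directions are nearly parallel, which is one way to see why the monocular approach benefits from observations with sufficiently different gaze directions.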
Typically, the respective images of the eye which are used to determine the plurality of eye intersecting lines (Dk, k = 1 ... n) are acquired with a frame rate of at least 25 frames per second (fps), more typically of at least 30 fps, more typically of at least 60 fps, and even more typically of at least 120 fps or even 200 fps.
Once the eyeball center is known, other eye state variables of the human eye such as gaze direction and pupil radius or size can also be calculated non-iteratively.
In particular, an expected gaze direction of the eye may be determined as a vector which is antiparallel to the respective circle orientation vector.
Further, the expected co-ordinates of the center of the eyeball may be used to determine for at least one of the times an expected optical axis of the eye, an expected orientation of the eye, an expected visual axis of the eye, an expected size of the pupil and/or an expected radius of the pupil.
Furthermore, at one or more later times a respective later image of the eye may be acquired by the camera and used to determine, based on the determined respective later eye intersecting line, at the later time(s) an expected gaze direction, an expected optical axis of the eye, an expected orientation of the eye, an expected visual axis of the eye, an expected size of the pupil and/or an expected radius of the pupil.
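One possible non-iterative way to obtain pupil center, gaze direction and pupil radius once the eyeball center is known is sketched below. It rests on assumptions that are not spelled out in the text: the camera sits at the origin, the circle center line passes through the origin, the pupil center lies at distance R from the eyeball center, and the radius of the unprojected circle grows linearly with its distance from the camera (parameter radius_per_unit_distance). All names are hypothetical.

```python
import numpy as np


def pupil_from_eyeball_center(eyeball_center, center_line_dir,
                              radius_per_unit_distance, R=10.39):
    """Intersect the circle center line with the sphere of radius R around the eyeball center.

    Returns (pupil_center, gaze_direction, pupil_radius) in camera coordinates.
    """
    e = np.asarray(eyeball_center, dtype=float)
    c = np.asarray(center_line_dir, dtype=float)
    c = c / np.linalg.norm(c)
    # Solve |t * c - e|^2 = R^2 for the distance t along the center line.
    ce = c @ e
    disc = ce ** 2 - (e @ e - R ** 2)
    if disc < 0:
        raise ValueError("circle center line misses the sphere of radius R")
    t = ce - np.sqrt(disc)  # nearer intersection: the pupil faces the camera
    pupil_center = t * c
    # Unit vector from eyeball center through pupil center (points towards the camera).
    gaze_direction = (pupil_center - e) / R
    pupil_radius = t * radius_per_unit_distance
    return pupil_center, gaze_direction, pupil_radius
```

The returned gaze direction points from the eyeball center through the pupil center and is therefore antiparallel to an orientation vector that points away from the camera.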
In such a monocular version of an example algorithm for determining eye state variables, the need remains to acquire a time series of N>1 eye images (also called observations) and the method requires those observations to show the eye under a relatively large variation of gaze angles in order for the intersection of those N eye intersecting lines to provide a reliable eyeball center calculation.
Accordingly, in another, binocular version of an example algorithm for determining eye state variables of one or more eyes, this algorithm includes receiving image data of a further eye of the subject at a second time, substantially corresponding to the first time, from a camera of known camera intrinsics and defining an image plane, the further eye comprising a further eyeball and a further iris defining a further pupil, determining a further ellipse in the image data, the further ellipse at least substantially representing the border of the further pupil of the further eye at the second time, using the camera intrinsics and the further ellipse to determine a 3D orientation vector of a further circle in 3D and a further center line on which a center of the further circle is located in 3D, so that a projection of the further circle, in a direction parallel to the further center line, onto the image plane is expected to reproduce the further ellipse, and determining a further eye intersecting line in 3D expected to intersect a 3D center of the further eyeball at the second time as a line which is, in the direction of the 3D orientation vector of the further circle, parallel-shifted to the further center line by an expected distance between the center of the further eyeball and a center of the further pupil.
In other words, instead of a purely monocular paradigm, image data from more than one eye of the subject, recorded substantially simultaneously can be leveraged in a binocular or multiocular setup.
Typically, the respective images of an/each eye which are used to determine the eye intersecting lines are acquired with a frame rate of at least 25 frames per second (fps), more typically of at least 30 fps, more typically of at least 60 fps, and even more typically of at least 120 fps or even 200 fps.
In this way, in case image data from one eye originates from a different eye camera than image data from a further eye, it can be guaranteed that eye observations are sufficiently densely sampled in time in order to provide substantial simultaneous image data of different eyes. Image frames stemming from different cameras can be marked with timestamps from a common clock. This way, for each image frame recorded by a given camera at a (first) time t, a correspondingly closest image frame recorded by another camera at a (second) time t′ can be selected, such that abs(t-t′) is minimal (e.g. at most 2.5 ms if cameras capture image frames at 200 fps).
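The selection of the temporally closest frame from another camera can be illustrated as follows. This is a sketch under the assumption that both cameras deliver timestamps from a common clock and that the second list is sorted in ascending order and has at least two entries; the function name match_nearest_frames is hypothetical.

```python
import numpy as np


def match_nearest_frames(t_a, t_b):
    """For each timestamp in t_a, find the index of the closest timestamp in sorted t_b.

    Returns (indices into t_b, absolute time offsets abs(t - t')).
    """
    t_a = np.asarray(t_a, dtype=float)
    t_b = np.asarray(t_b, dtype=float)
    # Position at which each t_a would be inserted into t_b to keep it sorted.
    idx = np.searchsorted(t_b, t_a)
    idx = np.clip(idx, 1, len(t_b) - 1)
    left, right = t_b[idx - 1], t_b[idx]
    # Pick whichever neighbour is closer in time.
    idx = np.where(t_a - left <= right - t_a, idx - 1, idx)
    return idx, np.abs(t_b[idx] - t_a)
```

With both cameras running at 200 fps, i.e. a frame period of 5 ms, the offsets returned for frames inside the recording interval cannot exceed half a period, which is the 2.5 ms mentioned above.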
In case image data from one eye and from a further eye originates from the same camera, the second time can naturally correspond exactly to the first time, in particular the image data of the eye and the image data of the further eye can be one and the same image comprising both (all) eyes.
Such a binocular algorithm may include using the first eye intersecting line and the further eye intersecting line to determine expected coordinates of the center of the eyeball and of the center of the further eyeball, such that each eyeball center lies on the respective eye intersecting line and the 3D distance between the eyeball centers corresponds to a predetermined value (IED, IPD), in particular a predetermined inter-eyeball or inter-pupillary distance.
Accordingly, the centers of both eyeballs of a subject may be determined simultaneously, based on a binocular observation at merely a single point in time, instead of having to accumulate a time series of N>1 observations. Also, no monocular intersection of eye intersecting lines needs to be performed and this algorithm thus works under entirely static gaze of the subject, on a frame by frame basis. This is made possible by the insight that the distance between two eyes of a subject can be considered another physiological constant and can thus be leveraged for determining eye state variables of one or more eyes of a subject in the framework of an extended 3D eye model.
The predetermined distance value (IED, IPD) between the center of the eyeball and the center of the further eyeball can be an average value, in particular a physiological constant or population average, or an individually measured or known value of the subject. The average human inter-pupillary distance (IPD) at fixation at infinity can be assumed as IPD = 63.0 mm. This value is therefore a proxy for the actual 3D distance between the eyeball centers of a subject, the inter-eyeball distance (IED). Individually measuring the IPD can for example be performed with a simple ruler, as routinely done by optometrists.
The expected coordinates of the center of the eyeball and of the center of the further eyeball can in particular be determined such that the radius of the first circle in 3D, representing the pupil of the eyeball, and the radius of the further circle in 3D, representing the further pupil, are substantially equal. As a further insight, it is possible to leverage the physiological fact that in most beings, pupils of different eyes are controlled by the same neural pathways and cannot change size independently of each other. In other words, the pupil size of the left and of the right eye of, for example, a human is substantially equal at any instant in time.
Mathematically requesting the condition that the size of the circle and the size of the further circle in 3D have to be equal provides an unambiguous solution which yields both 3D eyeball center positions as well as the pupil size with merely a single binocular observation in time.
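How the two conditions (fixed distance between the eyeball centers, equal pupil radii) can pin down a solution is illustrated by the following sketch. It is one possible formulation under stated assumptions, not necessarily the one used in the cited applications: both eye intersecting lines are expressed in a common coordinate system, each circle center line starts at its camera origin o, and the radius of each unprojected circle equals its distance t along the center line times a known factor rho. All names are hypothetical.

```python
import numpy as np


def binocular_eyeball_centers(o1, c1, n1, rho1, o2, c2, n2, rho2,
                              R=10.39, ied=63.0):
    """Solve for both eyeball centers and the common pupil radius from one observation.

    o: camera origin, c: circle center line direction, n: circle orientation
    vector (pointing away from the camera), rho: circle radius per unit
    distance along the center line. Index 1 and 2 denote the two eyes.
    """
    o1, c1, n1, o2, c2, n2 = (np.asarray(v, dtype=float)
                              for v in (o1, c1, n1, o2, c2, n2))
    c1, c2 = c1 / np.linalg.norm(c1), c2 / np.linalg.norm(c2)
    n1, n2 = n1 / np.linalg.norm(n1), n2 / np.linalg.norm(n2)
    # Equal pupil radii: t1 * rho1 == t2 * rho2, hence t2 = k * t1.
    k = rho1 / rho2
    # Difference of the eyeball centers as a function of t1: w + t1 * u.
    w = (o1 + R * n1) - (o2 + R * n2)
    u = c1 - k * c2
    # |w + t1 * u|^2 = ied^2 is a quadratic A*t1^2 + B*t1 + C = 0.
    A, B, C = u @ u, 2.0 * (u @ w), w @ w - ied ** 2
    if A == 0 or C >= 0:
        raise ValueError("no unique positive solution for this configuration")
    # With A > 0 and C < 0 the roots have opposite signs; take the positive one.
    t1 = (-B + np.sqrt(B * B - 4.0 * A * C)) / (2.0 * A)
    t2 = k * t1
    center1 = o1 + t1 * c1 + R * n1
    center2 = o2 + t2 * c2 + R * n2
    return center1, center2, t1 * rho1
```

In this formulation the quadratic has exactly one positive root whenever the offset w between the two shifted line origins is shorter than the predetermined distance, which holds for instance when both eyes are observed by a single camera, since w is then at most 2R long.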
This non-iterative method is numerically stable, especially under static gaze conditions, and extremely fast and can be performed on a frame by frame basis in real-time. Alternatively, to be more robust to noise, observations can be averaged over a given time span. Once the center of an eyeball is known, other eye state variables such as an expected gaze direction, optical axis, orientation, visual axis of the eye, size or radius of the pupil of the eye can be calculated (also non-iteratively) for subsequent observations at later instants in time, simply based on the “unprojection” of pupil ellipse contours, providing even faster computation.
The algorithms detailed above merely constitute examples of algorithms for determining eye state variables, which make use of a 3D eye model. Other such algorithms are possible and can be used in the methods according to the invention.
According to the invention and contrary to prior art methods, effects of refraction by the cornea may be taken into account by adapting the 3D eye model.
It has been surprisingly found that the simple cornea-less 3D eye model employed in [1], which forms the basis of calculating approximate eye state variables in [4], WO2020/244752 and WO2020/244971, can be adapted to yield the correct eye state values at runtime in the following way. Note first that said eye model employed in [1] has a single parameter, namely the (physiologically constant) distance R between eyeball rotation center and pupil center. Note further that the shape and degree of distortion of the pupil image as seen by the eye camera depend in a complex non-linear manner on the pose of the eye with respect to the camera and on the radius of the pupil (see reference [5]). The pose of the eye is composed of the orientation of the gaze direction of the eye with respect to the optical axis of the camera and the position of the eyeball with respect to the camera (i.e. in general offset from the optical axis of the camera and at an unknown distance). In fact, even given a particular pose of the eye with respect to the camera and given the pupil radius, it is impossible to analytically calculate the pupil contour as it would appear in a camera image under perspective projection, due to the complex non-linear nature of refraction through the cornea. This is only possible in scenarios like those described in [1], based on an eye model which has no cornea: the perspective projection of a pupil assumed to be a perfect circle is then a perfect ellipse in the image, and this ellipse can be analytically calculated given a particular set of eye state variables (pose and pupil radius). As soon as an eye with a cornea is considered, no closed-form analytical solution characterizing the shape of the pupil under perspective projection given the pose of the eye and the pupil radius is possible. Likewise, the “inverse” problem of deriving the pose of the eye and the pupil radius from the image of the pupil has no closed-form analytical solution.
It has now been surprisingly found by the inventors that a quantity which is very easily obtainable from the camera image, namely a measure of the shape which represents the pupil in the camera image, like for example a circularity measure, can not only serve as a first order approximation or “summary” of this eye pose and pupil radius dependent distortion, but at the same time is suitable to make the simple cornea-less 1-parameter eye model adaptive to said measure of shape. In other words, it is possible to find simple relationships which make the parameter(s) of the 3D eye models of the prior art adaptive to account for the effects of corneal refraction in a very simple and efficient way.
Note that algorithms exist which try to derive or fit an individual value for one or more parameters of a 3D eye model for each subject or even for a particular/each eye, as part of numerical optimization schemes. This however brings the disadvantages of iterative optimization based algorithms like [1], which have already been mentioned. In particular, this has to be done using real world eye image data, i.e. in real-time, which is typically not feasible.
Note further that the goal of the present invention is NOT to determine individual geometrico-morphological measurements of an individual subject. Such measurements can be done offline in a non-time-critical manner. In general, such individual measurements are also often not necessary for more general determination of eye state variables in an eye tracking context, since variation in individual eyeball measures is limited, as already mentioned. Employing “average” 3D eye models which represent a certain population of subjects is in many cases a viable strategy to obtain statistically significant results of eye state variables in experiments with multiple subjects, like for example in many pupillometry studies. The present invention therefore provides the advantage of providing “adaptive” eye model parameters, derived via eye models of population averages but correctly modeling corneal refraction, as a function of pupil image observation characteristics. This way, non-time-critical offline simulations using eye models aware of corneal refraction can enable calibration-free methods for determining eye state variables in real-time using simpler eye models which are “made” refraction aware via simple pre-established relationships between eye model parameters and easily obtainable pupil image characteristics.
According to an embodiment of a method for generating data suitable for determining at least one eye state variable of at least one eye of a subject, the eye comprising an eyeball, an iris defining a pupil, and a cornea, the at least one eye state variable being derivable from at least one image of the eye taken with a camera of known camera intrinsics, the method includes providing a first 3D eye model modeling corneal refraction. Using the known camera intrinsics, synthetic image data of several model eyes according to the first 3D eye model is generated for a plurality of given values of at least one eye state variable. Using a given algorithm the at least one eye state variable is calculated using one or more of the synthetic images and a further 3D eye model having at least one parameter. A characteristic of the image of the pupil within each of the synthetic images is determined and one or more hypothetically optimal values of the at least one parameter of the further 3D eye model that minimize the error between the value(s) of the at least one given eye state variable and the value(s) of the corresponding eye state variable obtained when applying the given algorithm are determined. Finally, a relationship between the one or more hypothetically optimal values of the at least one parameter of the further 3D eye model and the characteristic of the pupil image is established.
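The step of determining a hypothetically optimal parameter value for one synthetic image can be illustrated by a simple search over candidate values. This is a sketch only: the given algorithm is represented by a hypothetical callable algorithm(observation, parameter), and an exhaustive search over a candidate list is merely one possible way of minimizing the error; the text does not prescribe a particular minimization strategy.

```python
import numpy as np


def optimal_parameter(algorithm, observation, true_value, candidates):
    """Pick the candidate parameter value whose result is closest to the given value.

    algorithm: callable (observation, parameter) -> estimated eye state variable.
    observation: data derived from one synthetic image (e.g. a pupil ellipse).
    true_value: the given eye state variable used to render that image.
    candidates: parameter values of the further 3D eye model to try.
    Returns (best parameter value, remaining error).
    """
    candidates = np.asarray(candidates, dtype=float)
    truth = np.atleast_1d(np.asarray(true_value, dtype=float))
    errors = np.array([
        np.linalg.norm(np.atleast_1d(algorithm(observation, p)) - truth)
        for p in candidates
    ])
    best = int(np.argmin(errors))
    return candidates[best], errors[best]
```

Repeating this for every synthetic image yields pairs of optimal parameter value and pupil image characteristic, from which the relationship can then be established.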
According to an embodiment of a method for determining at least one eye state variable of at least one eye of a subject, the eye comprising an eyeball, an iris defining a pupil, and a cornea, the at least one eye state variable being derivable from at least one image of the eye taken with a camera of known camera intrinsics, the method includes receiving image data of the at least one eye from a camera of known camera intrinsics and defining an image plane. Further, a characteristic of the image of the pupil within the image data is determined. A 3D eye model having at least one parameter is provided, the parameter depending in a pre-determined relationship on the characteristic. Finally, the method further includes using a given algorithm to calculate the at least one eye state variable using the image data and the 3D eye model including the at least one characteristic-dependent parameter.
According to a preferred embodiment of either method, the characteristic of the image of the pupil may be a measure of the circularity of the pupil area or outline, in particular a ratio of minor to major axis length of an ellipse fit to the pupil image area, a measure of variation of the curvature of the pupil outline, a measure of elongation of the pupil or a measure of the bounding box of the pupil area.
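As an illustration of such a circularity measure, the ratio of minor to major axis can be estimated directly from a binary pupil mask via its second-order moments. The sketch below assumes the pupil has already been segmented into a boolean image; it uses a moment-based estimate rather than an explicit contour ellipse fit, and the function name axis_ratio_from_mask is hypothetical.

```python
import numpy as np


def axis_ratio_from_mask(mask):
    """Estimate the minor-to-major axis ratio of a pupil region from a binary mask.

    For a filled ellipse the covariance eigenvalues of the pixel coordinates
    are proportional to the squared semi-axes, so their ratio gives (b/a)^2.
    """
    ys, xs = np.nonzero(mask)
    if xs.size < 3:
        raise ValueError("mask contains too few pupil pixels")
    cov = np.cov(np.stack([xs, ys]).astype(float))
    evals = np.linalg.eigvalsh(cov)  # eigenvalues in ascending order
    if evals[1] <= 0:
        raise ValueError("degenerate pupil region")
    return float(np.sqrt(max(evals[0], 0.0) / evals[1]))
```

A value of 1 corresponds to a circular pupil image, and smaller values indicate a more elongated pupil image.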
Typically, the relationship between the hypothetically optimal values of the at least one further 3D eye model parameter and the characteristic of the pupil image may be a constant value, in particular a constant value smaller or larger than the corresponding average parameter of the first 3D eye model, a linear relationship, or a polynomial relationship, or another non-linear relationship, e.g. based on a regression fit. This relationship may be stored to/in a memory. That way, a given algorithm to calculate the at least one eye state variable using the image data and a 3D eye model including the at least one parameter may later retrieve the relationship from memory and use it to calculate the one or more eye state variables in a fast and accurate way, taking corneal refraction into account, by making use of a pupil characteristic-dependent 3D eye model parameter.
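Establishing a polynomial relationship, storing it, and retrieving it at runtime might look as follows. This is a minimal sketch assuming a regression fit with numpy.polyfit and storage as a JSON file; the file layout, the key name polynomial_coefficients and all function names are hypothetical choices for illustration.

```python
import json

import numpy as np


def fit_relationship(characteristic, optimal_values, degree=1):
    """Fit a polynomial mapping pupil image characteristic -> optimal parameter value."""
    return np.polyfit(characteristic, optimal_values, degree).tolist()


def save_relationship(path, coeffs):
    """Store the polynomial coefficients (highest power first) in a JSON file."""
    with open(path, "w") as f:
        json.dump({"polynomial_coefficients": coeffs}, f)


def load_relationship(path):
    """Retrieve previously stored polynomial coefficients."""
    with open(path) as f:
        return json.load(f)["polynomial_coefficients"]


def parameter_for(characteristic, coeffs):
    """Evaluate the characteristic-dependent eye model parameter at runtime."""
    return float(np.polyval(coeffs, characteristic))
```

At runtime only load_relationship and parameter_for would be needed, so the cost per image is a single polynomial evaluation. A constant relationship corresponds to degree 0.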
In a particular embodiment, the 3D eye model or, respectively, the further 3D eye model has at most one parameter. In particular, unlike the first 3D eye model, these models do not need to model corneal refraction. Thus a very simple and fast method is provided.
Alternatively, the 3D eye model or, respectively, the further 3D eye model may have more than one parameter, and in a variant a separate relationship with the pupil characteristic may be established for more than one of them. In this way, the advantages of more complex eye models may be leveraged.
The further 3D eye model of the embodiments of methods for generating data suitable for determining at least one eye state variable and the 3D eye model of embodiments of methods for determining at least one eye state variable may be the same model, or may be partly different, the only decisive point being that they comprise a corresponding parameter for which a relationship with the characteristic of the pupil has been established.
Examples for parameters of the (any) 3D eye model as described in embodiments are a distance between a center of an eyeball, in particular a rotational, geometrical or optical center, and a center of a pupil or cornea, a size measure of an eyeball, a cornea or an iris such as an eyeball radius, a cornea radius, an iris diameter, a distance pupil-center to cornea-center, a distance cornea-center to eyeball-center, a distance pupil-center to limbus center, a distance crystalline lens to eyeball-center, to cornea center and/or to corneal apex, a refractive property of an eye structure such as an index of refraction of a cornea, vitreous humor or crystalline lens, an ellipsoidal shape measure of an eyeball or cornea, a degree of astigmatism, and an inter-ocular distance or inter-pupillary distance.
According to a variant, said relationship between a particular 3D eye model parameter and the characteristic of the pupil may be the same for all eye state variables.
According to a preferred embodiment, a different relationship between a parameter of the 3D eye model or, respectively, the further 3D eye model and the characteristic of the pupil image may be/have been established for each eye state variable or for groups of eye state variables. This way, the different ways in which a certain parameter of a 3D eye model influences the determination of certain eye state variables as a consequence of the particular given algorithm used can be taken into account, and an optimal accuracy for all eye state variables of interest can be achieved.
The eye state variable typically is selected from the list of a pose of an eye such as a location of an eye, in particular an eyeball center, and/or an orientation of an eye, in particular a gaze vector, optical axis orientation or visual axis orientation, a 3D circle center line, a 3D eye intersecting line, and a size measure of a pupil of an eye, such as a pupil radius or diameter.
Further, the given algorithm typically does not take into account a glint from the eye for calculating the at least one eye state variable, in other words the algorithm is “glint-free”. Also, the algorithm typically does not require structured light and/or special purpose illumination to derive eye state variables.
The given algorithm typically calculates the at least one eye state variable in a non-iterative way.
According to an embodiment, a system for generating data suitable for determining at least one eye state variable of at least one eye of a subject, the eye comprising an eyeball, an iris defining a pupil, and a cornea, the at least one eye state variable being derivable from at least one image of the eye taken with a camera of known camera intrinsics is provided. The system comprises a computing and control unit configured to generate, using the known camera intrinsics, synthetic image data of several model eyes according to a first 3D eye model modeling corneal refraction, for a plurality of given values of at least one eye state variable, to calculate, using a given algorithm, the at least one eye state variable making use of one or more of the synthetic images and a further 3D eye model having at least one parameter, to determine a characteristic of the image of the pupil within each of the synthetic images, to determine one or more hypothetically optimal values of the at least one parameter of the further 3D eye model that minimize the error between the value(s) of the at least one given eye state variable and the value(s) of the corresponding eye state variable obtained when applying the given algorithm, and to establish a relationship between the one or more hypothetically optimal values of the at least one parameter of the further 3D eye model and the characteristic of the pupil image and store it in a memory.
Typically, the computing and control unit is configured to perform the methods for generating data suitable for determining at least one eye state variable of at least one eye of a subject as explained herein.
The computing and control unit of the system may be part of a device such as a personal computer, laptop, server or part of a cloud computing system.
According to an embodiment, a system for determining at least one eye state variable of at least one eye of a subject, the eye comprising an eyeball, an iris defining a pupil, and a cornea, the at least one eye state variable being derivable from at least one image of the eye taken with a camera of known camera intrinsics is provided. The system comprises a device comprising at least one camera of known camera intrinsics for producing image data including at least one eye of a subject, the at least one camera comprising a sensor defining an image plane, the at least one eye comprising an eyeball, an iris defining a pupil, and a cornea. The system further comprises a computing and control unit configured to receive image data of the at least one eye from the at least one camera, determine a characteristic of the image of the pupil within the image data, calculate, using a given algorithm, the at least one eye state variable making use of the image data and a 3D eye model having at least one parameter, the parameter depending in a pre-determined relationship on the characteristic, the relationship being retrieved from a memory.
Typically, the computing and control unit of this system is configured to perform the methods for determining at least one eye state variable of at least one eye of a subject as explained herein.
The device may be a head-wearable device or a remote (eye-tracking) device.
The computing and control unit can be at least partly integrated into the device and/or at least partly provided by a companion device of the system, for example a mobile companion device such as a mobile phone, tablet or laptop computer. Both the device and the companion device may have computing and control units, which typically communicate with each other via an interface board (interface controller), for example a USB-hub board (controller). Either of these computing and control units may be solely or partly responsible for determining the one or more eye state variables of an eye and/or a further eye of the user.
Different thereto, as previously mentioned the system for generating data suitable for determining at least one eye state variable of at least one eye of a subject, which generates synthetic image data, may typically comprise a more powerful computing and control unit such as a personal / desktop computer, server or the like. The system for generating data suitable for determining at least one eye state variable of at least one eye of a subject can be connected with or otherwise set into communication with the system for determining at least one eye state variable of at least one eye of a subject, by any suitable means known to the skilled person, in particular to communicate the established relationship(s).
In one embodiment, the head-wearable (spectacles) device is provided with electric power from a companion device of the system during operation of the spectacles device, and may thus not require an internal energy storage such as a battery. Accordingly, the head-wearable (spectacles) device may be particularly lightweight. Further, less heat may be produced during device operation compared to a device with an internal (rechargeable) energy storage. This may also improve comfort of wearing.
The computing and control unit of the head-wearable (spectacles) device may have a USB-hub board, a camera controller board connected with the camera, and a power-IC connected with the camera controller board, the camera and/or the connector for power supply and/or data exchange, and an optional head orientation sensor having an inertial measurement unit (IMU).
Reference will now be made in detail to various embodiments, one or more examples of which are illustrated in the figures. Each example is provided by way of explanation, and is not meant as a limitation of the invention. For example, features illustrated or described as part of one embodiment can be used on or in conjunction with other embodiments to yield yet a further embodiment. It is intended that the present invention includes such modifications and variations. The examples are described using specific language which should not be construed as limiting the scope of the appended claims. The drawings are not scaled and are for illustrative purposes only. For clarity, the same elements or steps have been designated by the same references in the different drawings if not stated otherwise.
With reference to
The spectacles device 1 as depicted in
According to the examples represented by
If a camera 14 or 24 is arranged in the nose bridge portion 3 of the spectacles body 2, the optical axis 15 of the left camera 14 may be inclined with an angle α of 142° to 150°, preferably 144°, measured in counter-clockwise direction (or -30° to -38°, preferably -36°) with respect to the middle plane 100. Accordingly, the optical axis 25 of the right camera 24 may have an angle β of inclination of 30° to 38°, preferably 36°, with respect to the middle plane 100.
If a position of a camera 14, 24 is located in one of the lateral portions 12, 22 of the spectacles body 2, the optical axis 15 of the left camera 14 may have an angle γ of 55° to 70°, preferably 62°, with respect to the middle plane, and/or the optical axis 25 of the right camera 24 may be inclined about an angle δ of 125° to 110° (or -55° to -70°), preferably 118° (or -62°).
Furthermore, a bounding cuboid 30, in particular a rectangular cuboid, may be defined by the optical openings 11, 21, which serves for specifying positions of camera placement zones 17, 27, 18, 28. As shown in
In case a left/right camera 14, 24 is arranged in the nose bridge portion 3, a projected position of the left camera 14 would be set in a left inner eye camera placement zone 17 and the right camera 24 would be (projected) in the right inner eye camera placement zone 27.
When being in the left/right lateral portion 12, 22, the left camera 14 may be positioned – when projected in the plane of the camera placement zones – in the left outer eye camera placement zone 18, and the right camera 24 is in the right outer eye camera placement zone 28.
With the help of the front view on the spectacles device 1 depicted in
All examples of the spectacles device 1 as represented by
In the example shown in
The spectacles device 100 as shown in
Typically, the computing and control unit is non-visibly integrated within the holder, for example within the right holder 23 or the left holder 13 of the spectacles device 1. According to a non-shown example, a processing unit can be located within the left holder. Alternatively, the processing of the left and the right images from the cameras 14, 24 for determining the eye state variable(s) may be performed by a connected companion device such as a smartphone or tablet, or by another computing device such as a desktop or laptop computer, and may also be performed entirely offline, based on videos recorded by the left and/or right cameras 14, 24.
The head wearable device 1 may also include components that allow determining the device orientation in 3D space, accelerometers, GPS functionality and the like.
The head wearable device 1 may further include any kind of power source, such as a replaceable or rechargeable battery, or a solar cell. Alternatively (or in addition), the head wearable device may be supplied with electric power during operation by a connected companion device, and may even be free of a battery or energy source.
The device of the present invention may however also be embodied in configurations other than in the form of spectacles, such as for example as integrated in the nose piece or frame assembly of an AR or VR head-mounted display (HMD) or goggles or similar device, or as a separate nose clip add-on or module for use with such devices. Also, the device may be a remote device, which is not wearable or otherwise in physical contact with the user.
In combination, a device and computing and control unit as detailed above may form a system for determining at least one eye state variable of at least one eye of a subject according to embodiments of the invention.
Referring first to the example of
In gaze estimation, estimating the optical axis g of the eye is a primary goal. In pupillometry, estimating the actual size (radius) of the pupil in units of physical length (e.g. mm) is the primary goal. The state of the eye model, similar to the one employed by reference [1], which is incorporated by reference in its entirety, is uniquely determined by specifying the position of the eyeball center M and the pose and radius of the pupil H3 = (φ, θ, r), where φ and θ are the spherical coordinates of the normalized vector pointing from M into the direction of the center of the pupil P. We will refer to φ and θ as gaze angles. In some cases, we will also refer to the angle between the optical axis g and the negative z-axis as gaze angle. Determining the eyeball center M is therefore a necessary first step in video image based, glint-free gaze estimation and pupillometry.
In reference [1], a complex iterative optimization is performed to estimate eyeball positions as well as gaze angles and pupil size based on a time series of observations. In this respect, and in the context of the present disclosure, the expressions “iterative” and “optimization” or “optimization based” refer to algorithms which take as input image data from one or several points in time, and try to derive eye state variables in a loop-like application of the same core algorithm, until some cost function or criterion is optimized (e.g. minimized or maximized). Note that the expression “iterative” is thus NOT in any way linked to whether the algorithm operates on a single image or on a series of image data from different points in time.
In contrast, examples of computationally less demanding non-iterative algorithms suitable for use in the methods of the present invention are described in the following. The examples given are based on analytical geometry. However, other non-iterative algorithms which use 3D eye model assumptions in some way may be used. For example, machine-learning based algorithms, such as those using neural networks, may be combined with 3D eye models.
In particular, as a first step a first ellipse E1 representing a border (outer contour) of the pupil H3 at the first time t1 is determined in a first image taken with the camera 24. This is typically achieved using image processing or machine-learning techniques.
As explained in detail in reference [1] a camera model of the camera 24 is used to determine an orientation vector n1 of the first circle C1 and a first center line L1 on which a center of the first circle C1 is located, so that a projection of the first circle C1, in a direction parallel to the first center line L1, onto the image plane Ip reproduces the first ellipse E1 in the image. In this step, the same disambiguation procedure on pairs of unprojected circles as proposed in reference [1] may be used.
As a result, we obtain circle C1, which we can choose as that circle along the unprojection cone which has radius r = 1.0 mm, and its orientation vector n1 in 3D. We will call c1 the vector from the camera center X (the center of the perspective projection) to the center of this circle C1 of radius r = 1.0 mm, i.e. c1 = C1 - X. The center line can then be written as L1(r) = r*c1 with r taking any positive real value. Note that vector c1 does not necessarily have length equal to 1.
However, the size-distance ambiguity explained above remains so far. It is this size-distance ambiguity which is resolved in a much simpler manner than proposed in [1] by the example algorithms presented in the following.
For this purpose, a first eye intersecting line D1 expected to intersect the center M of the eyeball at the first time t1 may be determined as a line which is, in the direction of the orientation vector n1, parallel-shifted to the first center line L1 by the expected distance R between the center M of the eyeball and the center P of the pupil. This expected distance R is usually set to its average human (physiological) value R = 10.39 mm, which is in the following also referred to as a physiological constant of human eyes. In this 3D eye model, this is the sole parameter.
Note that for each choice of pupil radius r, the circle selected by r*c1 constitutes a 3D pupil candidate that is consistent with the observed pupil ellipse E1. In the framework of the 3D eye model, if the circle thus chosen were to be the actual pupil, it would thus need to be tangent to a sphere of radius R and position given by
$$M(r) = r\,c_1 + R\,n_1 \qquad \text{(Eq. 1)}$$
defining a line in 3D that is parametrized by pupil radius r, in which r*c1 represents the ensemble of possible pupil circle centers, i.e. the circle center line L1. Note that n1 is normalized to length equal to 1, but vector c1 is not, as explained above. As the center of the 3D pupil equals P = r*c1 when r is chosen to be the actual pupil radius, the actual eyeball center M thus indeed needs to be contained in this line D1.
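The construction of the eye intersecting line can be sketched in a few lines of Python. This is a minimal illustration, assuming that the unprojection step has already provided c1 and the unit vector n1 in camera-centered coordinates; the function name and the numerical values are hypothetical and chosen only for illustration.

```python
R_MM = 10.39  # expected pupil-center-to-eyeball-center distance given in the text


def eye_intersecting_point(c1, n1, r, R=R_MM):
    """Return the point r*c1 + R*n1 on the eye intersecting line D1.

    c1: center of the candidate circle of radius 1 mm, relative to the camera center.
    n1: unit orientation vector of that circle.
    r:  candidate pupil radius in mm (the line parameter).
    """
    return tuple(r * c + R * n for c, n in zip(c1, n1))


# Hypothetical unprojection result, not taken from the source.
c1 = (0.5, 0.2, 10.0)
n1 = (0.0, 0.0, 1.0)
for r in (1.0, 2.0, 3.0):
    # Each candidate radius yields one candidate eyeball center on D1.
    print(r, eye_intersecting_point(c1, n1, r))
```

Each value of r yields one candidate eyeball center, so the size-distance ambiguity shows up here as the free line parameter.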
Note again that it is a property of perspective projection, that the center of the ellipse E1 in the camera image, which ellipse is the result of perspective projection of any of the possible 3D pupil circles corresponding to r*c1, does NOT lie on the circle center line L1.
Such eye intersecting lines D and such circle center lines L constitute eye state variables in the sense of the present disclosure.
In a monocular algorithm, referring to
The number of pupils (image frames) that can be calculated with the monocular algorithm explained above is, for the same computing hardware, typically at least one order of magnitude higher compared to the method of reference [1].
In an example of a binocular algorithm, referring again to
The expected distance R′ between the center of the eyeball M′ and the center of the pupil P′ of the further eye H′ may be set equal to the corresponding value R of eye H, or may be an eye-specific value.
In an example, a binocular algorithm further comprises using the first eye intersecting line D1 and the further eye intersecting line D′1 to determine expected coordinates of the center M of the eyeball H and of the center M′ of the further eyeball H′, such that each eyeball center lies on the respective eye intersecting line and the 3D distance between the eyeball centers corresponds to a predetermined value (IED, IPD), in particular a predetermined inter-eyeball distance IED, as indicated in
In particular, the predetermined distance value (IED, IPD) between the center of the eyeball and the center of the further eyeball may be an average value, in particular a physiological constant or population average, or an individually measured value of the subject. The average human inter-pupillary distance (IPD) at fixation at infinity can be assumed as IPD=63.0 mm. This value is therefore a proxy for the actual 3D distance between the eyeball centers of a human subject, the inter-eyeball distance (IED). Individually measuring the IPD can for example be performed with a simple ruler.
In this example, the center of the eyeball and the center of the further eyeball can for example be found based on some assumption about the geometric setup of the device with respect to the eyes and head of the subject, for example that the interaural axis has to be perpendicular to some particular direction, like for example the z-axis of a device coordinate system such as shown in the example of
In a further example, a binocular algorithm further comprises determining the expected coordinates of the center M of the eyeball and of the center M′ of the further eyeball, such that the radius r of the first circle in 3D and the radius r′ of the further circle in 3D are substantially equal, thereby also determining said radius. As previously set out, the center of the 3D pupil equals P = r*c1 when r is chosen to be the actual pupil radius. The same applies to the further eye, where P′ = r′*c′1, with c′1 being the vector from the camera center X′ to the center of this circle C′1 of radius r = 1.0 mm, i.e. c′1 = C′1 - X′ .
As a physiological fact, in most beings the pupils of the two eyes are controlled by the same neural pathways and cannot change size independently of each other. In other words, the pupil sizes of the left and of the right eye of, for example, a human are substantially equal at any instant in time. This insight was surprisingly found to enable a particularly simple and fast solution to both the gaze-estimation (3D eyeball center and optical axis) and pupillometry (pupil size) problems, in a glint-free scenario based on a single observation in time of two eyes as follows. Since the center coordinates of the eyeball can be determined as
$$M = X + r\,c_k + R\,n_k$$
with r being the actual (but so far unknown) pupil radius, and correspondingly for the further eye, with primed quantities, at any given time tk ~ t′k, one arrives at the condition for the distance ∥M-M′∥ between the eyeball centers
$$\lVert M - M' \rVert = \lVert X + r\,c_k + R\,n_k - X' - r'\,c'_k - R'\,n'_k \rVert = \mathrm{IED} \qquad \text{(Eq. 2)}$$
in which ||.|| denotes the length or norm of a vector. If one makes the physiologically plausible assumptions that R = R′ (eyeballs of equal size, this is optional though) and r = r′ (pupil radii are equal in both eyes at any given time), (Eq. 2) can be rewritten as
$$\lVert a + r\,b \rVert = \mathrm{IED}$$
where a := X - X′ + R*(nk - n′k) and b := ck - c′k. This leads to a quadratic equation for pupil radius r, which has the solutions
$$r_{1,2} = \frac{-(a \cdot b) \pm \sqrt{(a \cdot b)^2 - \lVert b \rVert^2 \left(\lVert a \rVert^2 - \mathrm{IED}^2\right)}}{\lVert b \rVert^2} \qquad \text{(Eq. 3)}$$
with sqrt() signifying the square root operation and (a·b) signifying the dot product between these two vectors. The right side of (Eq. 3) only contains known or measured quantities.
Which of the two solutions is the correct pupil radius can be easily decided either based on comparison with physiologically possible ranges (e.g. r > 0.5 mm and r < 4.5 mm) and/or based on the geometric layout of the cameras and eyeballs. In
All the above calculations are performed with respect to a common 3D coordinate system, which can be the 3D coordinate system defined by a single camera of the device, or any other arbitrarily chosen coordinate system into which quantities have been transformed via the known relative camera poses, as is the case in the example of
Therefore, in this example algorithm a particularly simple and even faster solution for calculating all of the 3D eyeball centers, the optical axes (gaze vectors gk, g′k, which are antiparallel to nk, n′k respectively) and the (joint) pupil size of both eyes is provided in a glint-free scenario based on merely a single observation in time of two eyes of a subject.
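The closed-form binocular solution may be illustrated by the following Python sketch. It assumes R = R′ and r = r′ as discussed above, expresses all vectors in a common coordinate system, and solves the quadratic condition ||a + r*b|| = IED. The function names and the synthetic scene (camera positions, eyeball centers, normals) are hypothetical and serve only to check the algebra.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def binocular_pupil_radii(X, c, n, Xp, cp, n_p, R=10.39, ied=63.0):
    """Return both roots of ||a + r*b|| = IED, assuming R = R' and r = r'."""
    a = [X[i] - Xp[i] + R * (n[i] - n_p[i]) for i in range(3)]
    b = [c[i] - cp[i] for i in range(3)]
    ab, bb = dot(a, b), dot(b, b)
    disc = ab * ab - bb * (dot(a, a) - ied * ied)
    if bb == 0.0 or disc < 0.0:
        return []  # degenerate geometry or no real solution
    root = math.sqrt(disc)
    return [(-ab - root) / bb, (-ab + root) / bb]


def eyeball_center(X, c, n, r, R=10.39):
    """Eyeball center M = X + r*c + R*n for a given pupil radius r."""
    return [X[i] + r * c[i] + R * n[i] for i in range(3)]


# Hypothetical synthetic scene (mm): eyeball centers 63 mm apart, pupil radius 2 mm.
R, r_true = 10.39, 2.0
M, Mp = (-31.5, 0.0, 35.0), (31.5, 0.0, 35.0)
X, Xp = (-5.0, 10.0, 0.0), (5.0, 10.0, 0.0)
n, n_p = (0.0, 0.0, 1.0), (0.6, 0.0, 0.8)
# Unit-radius circle centers relative to each camera: c = (P - X) / r, with P = M - R*n.
c = [(M[i] - R * n[i] - X[i]) / r_true for i in range(3)]
cp = [(Mp[i] - R * n_p[i] - Xp[i]) / r_true for i in range(3)]

candidates = binocular_pupil_radii(X, c, n, Xp, cp, n_p, R)
plausible = [r for r in candidates if 0.5 < r < 4.5]  # physiological range from the text
print(candidates, plausible)
print(eyeball_center(X, c, n, plausible[0], R))
```

In this synthetic scene the second root is negative, so the physiological range check alone selects the correct radius; with real data the geometric layout may be needed as an additional criterion.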
Reference will now be made to
As has been set out previously, given prior art algorithms for calculating eye state variables based on eye video/image data and 3D eye models often use very simple eye models, like for example the model with a single parameter R used in [1] or [4]. Such algorithms work on image data of real eyes, i.e. eyes which have a cornea, but utilize eye models which do not include such a cornea. Consequently, they can only determine approximations to the actual eye state variables and need strategies to correct them. This is achieved in the prior art by either employing computationally costly iterative numerical optimization methods, or by performing extensive simulations with synthetic data to provide multivariate polynomial post-hoc correction mappings or functions. According to the invention, a simpler method to generate data suitable for determining eye state variables, and an even faster and more accurate method for determining such eye state variables based on existing algorithms that assume very simple 3D eye models, are provided.
The underlying insights are illustrated with reference to
A first insight of the invention is that, even though in real eyes a cornea Hc distorts the apparent pupil (and hence the pupil image in the eye camera image) in a complex non-linear way, some aspects of this complex distortion can be summarized in a simple way. Namely, due to the refractive effects of the cornea, the apparent pupil H′3 appears both further away from the eyeball center M as well as tilted towards the observing camera. Note that in
If given prior art algorithms are applied to such a distorted pupil image, the resulting eye state variables, indicated in
Note that this insight is broadly applicable, in the sense that it is independent of the particular algorithm, the particular 3D eye model, the particular eye model parameter and the particular eye state variable. The algorithm used for determining eye state variables including the 3D eye model can in principle be a “black box” as long as the possibility is provided to inject different values for the parameter of the model which is to be optimized with respect to a certain eye state variable. The optimal value can be found via numeric optimization in a simulation scenario based on synthetic data in the following way.
A first 3D eye model modeling corneal refraction is chosen. As an example, a two-sphere eye model may be used to model eyeballs and corneal surfaces. For example, the so-called LeGrand eye model may be used, a schematic of which is presented in
Alternatively, the so-called Navarro eye model (see reference [2]) or any other 3D eye model which includes a model of a cornea may be used for modeling eyes and generating synthetic images, respectively.
According to such a chosen, first 3D eye model which models corneal refraction, for a plurality of sets of chosen eye state variables defining different possible states of the 3D eye model, synthetic images of the thus obtained eyes can be generated using known (optical) camera properties (typically including camera intrinsics) of the camera intended to be used in a corresponding device for producing image data of a subject’s eye.
Generating the synthetic images may be achieved by raytracing an arrangement of a camera model, which describes the camera, and 3D model eyeballs according to the first 3D eye model arranged in the field of view of the camera model.
The model of the camera typically includes a focal length, a shift of a central image pixel, a shear parameter, and/or one or more distortion parameters of the camera. The camera may be modeled as a pinhole camera. Typically, the camera defines a co-ordinate system, wherein all calculations described herein are performed with respect to this co-ordinate system.
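A pinhole camera model with the intrinsics named above can be sketched as follows. This is a minimal sketch assuming an ideal pinhole without lens distortion; the parameter names fx, fy, cx, cy and skew are conventional labels for focal length, principal point shift and shear, not identifiers from the source, and the numerical values are hypothetical.

```python
def project_pinhole(point, fx, fy, cx, cy, skew=0.0):
    """Project a 3D point in camera coordinates to pixel coordinates.

    fx, fy: focal lengths in pixels; cx, cy: shift of the central image pixel;
    skew: shear parameter. Lens distortion is deliberately omitted here.
    """
    x, y, z = point
    if z <= 0.0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    xn, yn = x / z, y / z  # perspective division onto the normalized image plane
    u = fx * xn + skew * yn + cx
    v = fy * yn + cy
    return u, v


# Hypothetical intrinsics and a point 10 mm in front of the camera.
print(project_pinhole((1.0, 2.0, 10.0), fx=500.0, fy=500.0, cx=320.0, cy=240.0))
```

A ray tracer for synthetic eye images would use the inverse mapping, casting one ray per pixel through the camera center.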
These synthetic images are used to determine (calculate) expected values of the one or more eye state variables, using a given algorithm. Said given algorithm uses a further 3D eye model having at least one parameter. It is emphasized that the first 3D eye model, which is used to generate the synthetic images, is required to model corneal refraction, while the further 3D eye model, used by the given algorithm to determine eye state variables, can be a simpler model, in particular one that does not comprise a cornea, in particular even an eye model with just a single parameter.
The chosen eye state variable values typically include co-ordinates of respective centers of the model eyeballs, given radii of a pupil of the model eyeballs and/or given gaze directions of the model eyeballs. Two examples of such images are presented in
Given one or several of such synthetic images, the given algorithm calculates one or more eye state variables, and a numeric optimization determines the hypothetically optimal value or values of one or more parameters of the further 3D eye model (used by the algorithm) which minimize(s) the error between the (calculated) expected value of one or more eye state variables and the corresponding chosen (ground truth) values. The algorithm might take a single synthetic image as input to calculate a certain eye state variable, and thus a hypothetically optimal value of the one or more parameters may be obtained for each synthetic image, or the algorithm might operate on an ensemble of several synthetic images.
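The per-image optimization described above may be sketched as a one-dimensional search over the model parameter, treating the given algorithm as a black box. The sketch below uses a golden-section search, which is one possible choice and is not prescribed by the source; it assumes the error is unimodal in the parameter on the search interval. The stand-in algorithm and all numbers are hypothetical.

```python
import math


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


def optimal_parameter(algorithm, observation, x_gt, lo, hi):
    """Parameter value minimizing the squared error to the ground truth x_gt."""
    return golden_section_minimize(
        lambda R: (algorithm(observation, R) - x_gt) ** 2, lo, hi
    )


# Stand-in for the "given algorithm": its output scales linearly with R.
# A real algorithm would take a synthetic image or pupil ellipse as observation.
def stand_in_algorithm(observation, R):
    return observation * R


# Hypothetical case: with R = 10.39 the estimate is 2.0, the ground truth is 1.8.
R_opt = optimal_parameter(stand_in_algorithm, 2.0 / 10.39, 1.8, lo=5.0, hi=15.0)
print(R_opt)
```

Because this step runs offline on synthetic data, a slower but more robust optimizer could be substituted without affecting runtime performance.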
Referring again to the example of
It shall be emphasized at this point that said numerical optimization is fundamentally different from the optimization-based methods of the prior art which have been previously referenced. Prior art methods use iterative numerical optimization schemes to derive the eye state variables themselves, based on time series of real eye image data. Therein lies their weakness, since they cannot operate in real time due to the computational complexity and the high frame rates encountered in state-of-the-art systems for eye state variable determination. In contrast thereto, the methods presented herein provide means to adapt simple eye models based on simulation data which can be pre-computed in a non-time-critical manner. In other words, according to the invention, a method for generating data suitable for determining eye state variables may use iterative numerical optimization techniques in order to generate such data, because at that stage calculations are not time critical. This enables the use of non-iterative algorithms in methods for determining said eye state variables, where speed of calculation is of utmost importance.
The hypothetically optimal value(s) of one or more parameters of the further 3D eye model constitute data suitable for determining at least one eye state variable of at least one eye of a subject, and their application and use therefore will be detailed in the following example embodiments.
As a further insight, the inventors have surprisingly found that it is possible to find generalizable relationships between said optimal values and characteristics of the (camera) image of the pupil. Embodiments thus include establishing a relationship between the hypothetically optimal value(s) of the at least one parameter of the further 3D eye model and a characteristic of the pupil image.
According to a preferred embodiment, the characteristic of the image of the pupil is a measure of the circularity (c) of the pupil area or outline, in particular a ratio of minor to major axis length of an ellipse fit to the pupil image area, a measure of variation of the curvature of the pupil outline, a measure of elongation of the pupil or a measure of the bounding box of the pupil area.
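One such circularity measure can be sketched as follows. The first function computes the ratio of minor to major axis from an existing ellipse fit. The second is a moment-based proxy, which is an assumption of this sketch rather than the source's method: it derives the axis ratio from the second moments of the contour points and matches the ellipse axis ratio when the points are sampled uniformly in the ellipse parameter. All names and values are hypothetical.

```python
import math


def circularity_from_axes(axis_a, axis_b):
    """Ratio of minor to major axis length of a fitted ellipse, in (0, 1]."""
    major, minor = max(axis_a, axis_b), min(axis_a, axis_b)
    if major <= 0.0:
        raise ValueError("axis lengths must be positive")
    return minor / major


def circularity_from_contour(points):
    """Axis-ratio proxy from second moments of 2D contour points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Eigenvalues of the 2x2 covariance matrix in closed form.
    mean = (sxx + syy) / 2.0
    dev = math.hypot((sxx - syy) / 2.0, sxy)
    lam_max, lam_min = mean + dev, mean - dev
    if lam_max <= 0.0:
        raise ValueError("degenerate contour")
    return math.sqrt(max(lam_min, 0.0) / lam_max)


# Hypothetical pupil outline: ellipse with semi-axes 40 px and 30 px, rotated by 30 degrees.
rot = math.radians(30.0)
contour = []
for k in range(360):
    t = math.radians(k)
    x, y = 40.0 * math.cos(t), 30.0 * math.sin(t)
    contour.append((x * math.cos(rot) - y * math.sin(rot) + 100.0,
                    x * math.sin(rot) + y * math.cos(rot) + 80.0))

print(circularity_from_axes(40.0, 30.0), circularity_from_contour(contour))
```

Both functions involve only a handful of arithmetic operations per frame, which is consistent with the requirement that the characteristic be obtainable in real time.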
Despite the complex dependency of the shape of the pupil image on the pose and pupil radius of the eye due to corneal refraction, it has surprisingly been found that a measure of the shape which represents the pupil in the camera image, like for example a circularity measure, which can be very easily obtained from given image data in real-time, makes it possible to find simple relationships which make the parameter(s) of the 3D eye models of the prior art adaptive to account for the effects of corneal refraction in a very simple and efficient way.
Reference is made to
In the context of the disclosure, a relationship between the hypothetically optimal values of the at least one further 3D eye model parameter and the characteristic of the pupil image is to signify any numerical link between these two quantities, for example also a constant value. Such constant value can for example be an average value of the optimal eye model parameter over a certain range of pupil characteristic values.
In particular a constant value smaller or larger than the corresponding average parameter of the first 3D eye model can be used.
In the example of
Other relationships include a linear relationship, such as a linear least-squares fit, as indicated by the dashed line in
As can be seen from
While the number of eye image frames that can be processed with a method such as described in [4] is, for the same computing hardware, already typically at least one order of magnitude higher compared to the method of Swirski (reference [1]), application of such post-hoc correction mapping of the prior art typically can take on the order of 10-100 microseconds per eye image/observation, depending on the complexity of the mapping, like the polynomial degree. In contrast, the method of the present invention requires either zero additional calculation time at runtime if a constant (optimal) value for the adapted parameter of the further 3D eye model is used (column three of
A further advantage of the methods of the present invention is that they are entirely independent of the choice of any coordinate system, unlike prior art methods like [4] which apply a multi-dimensional correction mapping to a set of eye state variables which may only be defined at least partly in a particular coordinate system (e.g. eyeball center coordinates, eye intersecting line directions, gaze vector directions, etc.). In contrast, the methods of the present invention operate by adapting parameters of the (further) 3D eye model, which are entirely independent of any choice of particular coordinate system that the algorithm for determining eye state variables might be using.
According to other embodiments, the further 3D eye model may have more than one parameter and a relationship may be established for more than one of them.
According to embodiments, the relationship may be the same for all eye state variables, or a different relationship between a (any) parameter of the (further) 3D eye model and the characteristic of the pupil image may be established for each eye state variable or for groups of eye state variables.
For example, eye state variables may be selected from the non-exhaustive list of a pose of an eye such as a location of an eye, in particular an eyeball center, an orientation of an eye, in particular a gaze vector, optical axis orientation or visual axis orientation, a 3D circle center line, a 3D eye intersecting line, and a size measure of a pupil of an eye, such as a pupil radius or diameter.
Referring to
As has been previously detailed in connection with the monocular and binocular algorithms and the underlying mathematical methods referenced in [1] and [3], having detected the elliptical shape best approximating the pupil in a camera image of an eye, a set of parallel-shifted circles in 3D can be calculated, said circles increasing in radius as the distance from the camera (center of perspective projection) increases, their centers forming a circle center line. As long as the location of the eye is unknown, said size-distance ambiguity exists. Once the center of the eye M is known, the circle which lies tangent to an eye sphere of radius R, where R = PM represents the assumed distance between the pupil center P and the eyeball center M, represents the actual pupil circle in 3D. Its radius can for example be determined by first finding the circle of radius r = 1 mm along the circle center line. The center of this circle is designated by its vector c1 as previously explained, e.g. with (Eq. 1). Shifting, that is scaling, this circle such that it lies tangent to the eye sphere of center M and radius R will bring the center of the circle to a distance |c1| * rgt from the camera center, and the radius rgt of the pupil has thus been found. This is illustrated schematically (not to scale) in
This procedure is however only correct for an eye which has NO cornea. Corneal refraction adds effects of non-linear distortion to the image of the pupil. In particular, the cornea “magnifies” the apparent pupil. This is synonymous with saying that the eye constitutes a fish-eye camera/lens: the cornea allows it to collect light from a wider angle than it would be able to without a cornea. This magnification has been symbolized in
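The scaling step for an eye without cornea may be sketched in Python as follows. The sketch assumes camera-centered coordinates and finds the scale r at which the point r*c1 on the circle center line lies on the eye sphere of center M and radius R. Taking the intersection nearer to the camera is an assumption of this sketch (the pupil faces the camera); names and values are hypothetical.

```python
import math


def pupil_radius_from_center(c1, M, R=10.39):
    """Scale r such that r*c1 lies on the sphere of center M and radius R.

    c1: center of the unit-radius (1 mm) circle, relative to the camera center.
    M:  eyeball center in camera coordinates.
    Returns None if the circle center line misses the eye sphere.
    """
    cc = sum(x * x for x in c1)
    cm = sum(x * y for x, y in zip(c1, M))
    mm = sum(x * x for x in M)
    # |r*c1 - M|^2 = R^2  <=>  cc*r^2 - 2*cm*r + (mm - R^2) = 0
    disc = cm * cm - cc * (mm - R * R)
    if disc < 0.0:
        return None
    return (cm - math.sqrt(disc)) / cc  # nearer intersection (assumed visible side)


# Hypothetical check: build c1 from a known pupil radius and recover it.
R, r_gt = 10.39, 2.0
M = (0.0, 0.0, 40.0)
n = (0.6, 0.0, 0.8)                               # unit vector from pupil center towards M
P = tuple(M[i] - R * n[i] for i in range(3))      # pupil center on the eye sphere
c1 = tuple(p / r_gt for p in P)                   # center of the 1 mm circle
print(pupil_radius_from_center(c1, M, R))
```

Passing a different value for R to the same function shows how the recovered radius depends on the model parameter, which is the handle used later to compensate for corneal magnification.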
The unprojection cone of the magnified pupil of apparent radius rmag > rgt, which has been indicated in
Referring to
One possible way of determining a gaze vector is to directly use the circle normal vector, as provided by the “unprojection” of the pupil image ellipse (based on methodology described in reference [3]), see vectors g respectively g′ in
Having detected the elliptical shape best approximating the pupil in a camera image of an eye and the corresponding pupil circle center line L as detailed herein already in connection with
Again, this procedure is however only correct for an eye which has NO cornea. As has been explained in connection with
However, it has been found that also in this example of the determination of another eye state variable, a hypothetically optimal value for a parameter of the further 3D eye model, in this case the distance which represents the distance between eyeball center and pupil center, can be determined for any eye observation in a simulation scenario as previously detailed. In
Referring now to
In a first step 2100, a first 3D eye model modeling corneal refraction is provided.
In a second step 2200, synthetic images SIi of several model eyes H with corneal refractive properties symbolized by an effective corneal refraction index nref in the flow chart and a plurality of given values {Xgt} of one or more eye state variables {X} of the model eye are generated using a model of the camera such as a pinhole model, assuming full perspective projection. For example, a ray tracer may be used to generate the synthetic images. For accuracy reasons, synthetic images may be ray traced at arbitrarily large image resolutions.
Eye state variables may for example include eyeball center locations M, gaze vectors g and pupil radii r, and may be sampled from physiologically plausible ranges as well as value ranges that may be expected for a given scenario, such as head-mounted eye cameras or remote eye tracking devices. For example, after fixing Mgt at a position randomly drawn from a range of practically relevant eyeball positions corresponding to a typical geometric setup of the eye camera, a number of synthetic eye images are generated, with gaze angles φ and θ (forming ggt) randomly chosen from a uniform distribution between physiologically plausible maximum gaze angles, and with pupil radii rgt randomly chosen from a uniform distribution between 0.5 mm and 4.5 mm. Typically, a small number N of eye state variable tuples {gex, rex, Mex}i, {ggt, rgt, Mgt}i with i=1...N suffices to establish a relationship between the hypothetically optimal values of the further 3D eye model parameter and the pupil characteristic in a later step. For example, N may be of the order of 10³ or even only 10².
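The sampling of ground-truth eye states can be sketched as follows. The pupil radius range of 0.5 mm to 4.5 mm is taken from the text; the maximum gaze angle of 30 degrees and the box of eyeball positions are assumptions of this sketch, as are all names.

```python
import math
import random


def sample_eye_states(n, max_gaze_deg=30.0, r_range=(0.5, 4.5),
                      m_box=((-5.0, 5.0), (-5.0, 5.0), (30.0, 40.0)), seed=0):
    """Draw n ground-truth eye states for one fixed, randomly drawn eyeball center.

    max_gaze_deg and m_box (x, y, z ranges in mm) are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Fix the eyeball center once, then vary gaze angles and pupil radius.
    m_gt = tuple(rng.uniform(lo, hi) for lo, hi in m_box)
    states = []
    for _ in range(n):
        states.append({
            "M": m_gt,
            "phi": math.radians(rng.uniform(-max_gaze_deg, max_gaze_deg)),
            "theta": math.radians(rng.uniform(-max_gaze_deg, max_gaze_deg)),
            "r": rng.uniform(*r_range),
        })
    return states


states = sample_eye_states(100)
print(states[0])
```

Each sampled state would then be rendered into one synthetic image by ray tracing the first 3D eye model.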
Eye model parameters may or may not be subject to variation in this step. In particular, they may be set to constant physiologically average values as for example detailed in connection with the eye model of
In step 2300, a characteristic ci of the image of the pupil within each of the synthetic images SIi is determined.
The characteristic may for example be a measure of the circularity of the pupil area or outline, in particular a ratio of minor to major axis length of an ellipse fit to the pupil image area, a measure of variation of the curvature of the pupil outline, a measure of elongation of the pupil or a measure of the bounding box of the pupil area.
In step 2410, a further 3D eye model having at least one parameter R is provided.
In particular, the further 3D eye model can be different from the first 3D eye model, in particular simpler. The further 3D eye model can have multiple parameters, but can in particular also have a single parameter R, which for the sake of clarity is the case illustrated in this flow chart.
In step 2420, a given algorithm is used to calculate one or more eye state variables {Xex} using one or more of the synthetic images SIi and the further 3D eye model having at least one parameter R. As explained previously in more detail with regard to a monocular and a binocular algorithm for determining eye state variables, the expected values of the one or more eye state variables {Xex} can be determined according to any suitable algorithm.
Thereafter, in step 2500, the given values {Xgt} and the calculated, expected values {Xex} of one or more eye state variables {X} are used in an error minimization step to determine one or more hypothetically optimal values of the at least one parameter of the further 3D eye model that minimize the error between the values of the corresponding at least one given eye state variable and the value of the (calculated respectively expected) eye state variable obtained when applying the given algorithm.
The superscript ′ in R′ indicates that the value of the parameter R is being changed from its original value, and the subscript opt in Ropt indicates that it is optimal in some sense. The curly brackets {.} indicate that the parameter may be optimized for calculating a (each) particular eye state variable or group of eye state variables, such that a set of relationships of optimal parameters {R′opt(c)} results. Alternatively, only one such relationship may be determined for a certain parameter, which relationship can then be used by a given algorithm to calculate all possible eye state variables.
Finally, in step 2600 a relationship between the hypothetically optimal values of the at least one parameter of the further 3D eye model and the characteristic of the pupil image is established. The relationship(s) may be stored in a memory (not shown).
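Establishing such a relationship may be sketched with two simple options, a constant (average) value and a linear least-squares fit of the optimal parameter against the pupil characteristic. The function names are hypothetical, and the data pairs below are made up solely to exercise the code; they are not results from the source.

```python
def fit_constant(r_opt_values):
    """Constant relationship: average of the optimal parameter values."""
    return sum(r_opt_values) / len(r_opt_values)


def fit_linear(characteristics, r_opt_values):
    """Linear least-squares fit R_opt(c) = slope * c + intercept."""
    n = len(characteristics)
    mc = sum(characteristics) / n
    mr = sum(r_opt_values) / n
    var = sum((c - mc) ** 2 for c in characteristics)
    if var == 0.0:
        raise ValueError("characteristic values must not all be equal")
    cov = sum((c - mc) * (r - mr) for c, r in zip(characteristics, r_opt_values))
    slope = cov / var
    return slope, mr - slope * mc


# Made-up pairs (circularity, optimal parameter) lying exactly on a line.
cs = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
r_opts = [8.0 + 2.0 * c for c in cs]
print(fit_constant(r_opts), fit_linear(cs, r_opts))
```

Only the fitted coefficients need to be stored in memory for later use at runtime.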
Steps of the method as detailed with reference to
In a first step 1100, image data Ik of the user’s eye, taken by an eye camera of known camera intrinsics of a device at one or more times tk is received.
Said image data may consist for example of one or several images, showing one or several eyes of the subject.
In a subsequent step 1200, a characteristic of the image of the pupil within the image data is determined. In case said image data comprises multiple images, such characteristic is determined in each image, and if the image data comprises images of multiple eyes, such characteristic may be determined for each eye separately.
In a subsequent step 1300, a 3D eye model having at least one parameter R is provided, wherein the parameter depends in a pre-determined relationship on the characteristic.
In step 1400, a given algorithm is used to calculate the at least one eye state variable {X} using the image data Ik and the 3D eye model including the at least one characteristic-dependent parameter.
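At runtime, steps 1200 to 1400 may be sketched as follows, assuming a linear pre-determined relationship between the parameter and the pupil characteristic. The stand-in algorithm merely reports the parameter it was given; a real implementation would be a monocular or binocular algorithm operating on the image data. All names and coefficients are hypothetical.

```python
def adapted_parameter(characteristic, slope, intercept):
    """Evaluate the pre-determined (here: linear) relationship R(c)."""
    return slope * characteristic + intercept


def determine_eye_state(image_data, characteristic, algorithm, slope, intercept):
    """Run the given algorithm with a characteristic-dependent model parameter."""
    R = adapted_parameter(characteristic, slope, intercept)
    return algorithm(image_data, R)


# Stand-in for the given algorithm; it only echoes its inputs.
def stand_in_algorithm(image_data, R):
    return {"image": image_data, "R_used": R}


# Hypothetical coefficients and a pupil circularity of 0.8 measured in the frame.
result = determine_eye_state("frame-0", 0.8, stand_in_algorithm, slope=2.0, intercept=8.0)
print(result)
```

The only runtime overhead relative to the unadapted algorithm is one multiplication and one addition per frame; with a constant relationship there is none.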
The given algorithms used in steps 2420 and 1400 may for example employ methods such as the monocular or binocular algorithms previously explained with regard to
The further 3D eye model provided in step 2410 and the 3D eye model provided in step 1300 may be the same or different ones, as long as they comprise a corresponding parameter or corresponding parameters {R} for which optimal relationships in the sense of step 2600 have been determined.
According to the present disclosure, methods for generating data suitable for determining eye state variables are provided, which open the way to a fast non-iterative approach to the tasks of refraction-aware 3D gaze prediction and pupillometry based on pupil contours alone. Leveraging geometrical insights with regard to the two-sphere eye model and/or with regard to human ocular physiology, in particular the distortion of the image of the pupil due to corneal refraction, these tasks are solved by making simple 3D eye models adaptive, which virtually eliminates the systematic errors due to corneal refraction of prior art methods.
Although various exemplary embodiments of the invention have been disclosed, it will be apparent to those skilled in the art that various changes and modifications can be made which will achieve some of the advantages of the invention without departing from the spirit and scope of the invention. The present invention is therefore limited only by the following claims and their legal equivalents.
REFERENCES
- Swirski L. et al: A fully-automatic, temporal approach to single camera, glint-free 3D eye model fitting, Proc. PETMEI Lund/Sweden, 13.08.2013
- Navarro R. et al: Accommodation-dependent model of the human eye with aspherics, J. Opt. Soc. Am. A 2(8), 1273-1281 (1985)
- Safaee-Rad R. et al: Three-dimensional location estimation of circular features for machine vision, IEEE Transactions on Robotics and Automation 8(5), 624-640 (1992)
- Dierkes K. et al: A fast approach to refraction-aware eye-model fitting and gaze prediction, Proc. ETRA Denver/USA, 25.-28.06.2019
- Fedtke C. et al: The entrance pupil of the human eye: a three-dimensional model as a function of viewing angle. Optics Express 18(21), 22364-22376 (2010)
Claims
1-30. (canceled)
31. A method for generating data suitable for determining at least one eye state variable of at least one eye of a subject, the eye comprising an eyeball, an iris defining a pupil, and a cornea, the at least one eye state variable being derivable from at least one image of the eye taken with a camera of known camera intrinsics, the method comprising:
- providing a first 3D eye model modeling corneal refraction;
- generating, using the known camera intrinsics, synthetic images of several model eyes according to the first 3D eye model, for a plurality of given values of at least one eye state variable;
- using a given algorithm to calculate the at least one eye state variable using one or more of the synthetic images and a further 3D eye model having at least one parameter;
- determining a characteristic of the image of the pupil within each of the synthetic images;
- determining one or more hypothetically optimal values of the at least one parameter of the further 3D eye model that minimize the error between the value(s) of the at least one given eye state variable and the value(s) of the corresponding eye state variable obtained when applying the given algorithm; and
- establishing a relationship between the one or more hypothetically optimal values of the at least one parameter of the further 3D eye model and the characteristic of the pupil image.
32. The method of claim 31, wherein the characteristic of the image of the pupil is a measure of the circularity of the pupil area or outline, in particular a ratio of minor to major axis length of an ellipse fit to the pupil image area or outline, a measure of variation of the curvature of the pupil outline, a measure of elongation or a measure of the bounding box of the pupil area.
33. The method of claim 31, wherein the relationship between the hypothetically optimal values of the at least one further 3D eye model parameter and the characteristic of the pupil image is chosen from the list of a constant value, in particular a constant value smaller or larger than the corresponding average parameter of the first 3D eye model, a linear relationship, a polynomial relationship, or another non-linear relationship, in particular a relationship derived via a regression fit.
34. The method of claim 31, wherein the further 3D eye model has at most one parameter.
35. The method of claim 31, wherein the further 3D eye model has multiple parameters and a relationship is established for more than one of them.
36. The method of claim 31, wherein any parameter of the first and/or of the further 3D eye model is/are selected from the list of a distance between a center of an eyeball, in particular a rotational, geometrical or optical center, and a center of a pupil or cornea, a size measure of an eyeball, a cornea or an iris such as an eyeball radius, a cornea radius, an iris diameter, a distance pupil center to cornea center, a distance cornea center to eyeball center, a distance pupil center to limbus center, a distance crystalline lens to eyeball center, to cornea center and/or to corneal apex, a refractive property of an eye structure such as an index of refraction of a cornea, vitreous humor or crystalline lens, an ellipsoidal shape measure of an eyeball or cornea, and a degree of astigmatism.
37. The method of claim 31, wherein said relationship is the same for all eye state variables, or wherein a different relationship between a parameter of the further 3D eye model and the characteristic of the pupil image is established for each eye state variable or for groups of eye state variables.
38. The method of claim 31, wherein the eye state variable is selected from the list of a pose of an eye such as a location of an eye, in particular an eyeball center, and/or an orientation of an eye, in particular a gaze vector, optical axis orientation or visual axis orientation, a 3D circle center line, a 3D eye intersecting line, and a size measure of a pupil of an eye, such as a pupil radius or diameter.
39. A method for determining at least one eye state variable of at least one eye of a subject, the eye comprising an eyeball, an iris defining a pupil, and a cornea, the at least one eye state variable being derivable from at least one image of the eye taken with a camera of known camera intrinsics, the method comprising:
- receiving image data of the at least one eye from a camera of known camera intrinsics and defining an image plane;
- determining a characteristic of the image of the pupil within the image data;
- providing a 3D eye model having at least one parameter, the at least one parameter depending in a pre-determined relationship on the characteristic; and
- using a given algorithm to calculate the at least one eye state variable using the image data and the 3D eye model including the at least one parameter.
40. The method of claim 39, wherein the characteristic of the image of the pupil is a measure of the circularity of the pupil area or outline, in particular a ratio of minor to major axis length of an ellipse fit to the pupil image area or outline, a measure of variation of the curvature of the pupil outline, a measure of elongation or a measure of the bounding box of the pupil area.
41. The method of claim 39, wherein the pre-determined relationship between the at least one parameter of the 3D eye model and the characteristic of the pupil image is chosen from the list of a constant value, a linear relationship, a polynomial relationship, or another non-linear relationship, in particular a relationship derived via a regression fit, in particular wherein the relationship is stored in analytical form and evaluated on-the-fly for given image data or stored as a lookup-table.
42. The method of claim 39, wherein the 3D eye model has either only one parameter, or wherein the 3D eye model has multiple parameters and a pre-determined relationship between any of them and the characteristic is used for at least one of the parameters.
43. The method of claim 39, wherein the respective parameter of the 3D eye model is selected from the list of a distance between a center of an eyeball, in particular a rotational, geometrical or optical center, and a center of a pupil or cornea, a size measure of an eyeball, a cornea or an iris such as an eyeball radius, a cornea radius, an iris diameter, a distance pupil-center to cornea-center, a distance cornea-center to eyeball-center, a distance pupil-center to limbus center, a distance crystalline lens to eyeball-center, to cornea center and/or to corneal apex, a refractive property of an eye structure such as an index of refraction of a cornea, vitreous humor or crystalline lens, an ellipsoidal shape measure of an eyeball or cornea, and a degree of astigmatism.
44. The method of claim 39, wherein said relationship is the same for all eye state variables, or wherein a different pre-determined relationship between a parameter of the 3D eye model and the characteristic of the pupil image is used for each eye state variable or for groups of eye state variables.
45. The method of claim 39, wherein the eye state variable is selected from the list of a pose of an eye such as a location of an eye, in particular an eyeball center, and/or an orientation of an eye, in particular a gaze vector, optical axis orientation or visual axis orientation, a 3D circle center line, a 3D eye intersecting line, and a size measure of a pupil of an eye, such as a pupil radius or diameter.
46. The method of claim 31, wherein the given algorithm does not take into account a glint from the eye for calculating the at least one eye state variable, wherein the algorithm is glint-free, and/or wherein the algorithm does not require structured light and/or special purpose illumination to derive eye state variables, and/or wherein the given algorithm calculates the at least one eye state variable in a non-iterative way.
47. The method of claim 31, the given algorithm including:
- determining a first ellipse in the image data, the first ellipse at least substantially representing a border of the pupil of the at least one eye at a first time;
- using the camera intrinsics and the first ellipse to determine a 3D orientation vector of a first circle in 3D and a first center line on which a center of the first circle is located in 3D, so that a projection of the first circle, in a direction parallel to the first center line, onto the image plane is expected to reproduce the first ellipse; and
- determining a first eye intersecting line in 3D expected to intersect a 3D center of the eyeball at the corresponding time as a line which is, in the direction of the orientation vector, parallel-shifted to the first center line by an expected distance between the center of the eyeball and a center of the pupil.
48. The method of claim 47, further comprising at least one of:
- receiving image data of a further eye of the subject at a time substantially corresponding to the first time, from a camera of known camera intrinsics and defining an image plane, the further eye comprising a further eyeball, a further iris defining a further pupil, and a further cornea, the given algorithm further including: determining a further ellipse in the image data, the further ellipse at least substantially representing the border of the further pupil of the further eye at the corresponding time; using the camera intrinsics and the further ellipse to determine a 3D orientation vector of a further circle in 3D and a further center line on which a center of the further circle is located in 3D, so that a projection of the further circle, in a direction parallel to the further center line, onto the image plane is expected to reproduce the further ellipse; and determining a further eye intersecting line in 3D expected to intersect a 3D center of the further eyeball at the corresponding time as a line which is, in the direction of the 3D orientation vector of the further circle, parallel-shifted to the further center line by an expected distance between the center of the further eyeball and a center of the further pupil; and
- receiving second image data of the at least one eye at a second time from the camera, the given algorithm further including: determining a second ellipse in the second image data, the second ellipse at least substantially representing the border of the pupil at the second time; using the camera intrinsics and the second ellipse to determine an orientation vector of a second circle and a second center line on which a center of the second circle is located, so that a projection of the second circle, in a direction parallel to the second center line, onto the image plane is expected to reproduce the second ellipse; and determining a second eye intersecting line expected to intersect the center of the eyeball at the second time as a line which is, in the direction of the orientation vector of the second circle, parallel-shifted to the second center line by the expected distance.
49. The method of claim 48, wherein the given algorithm further includes using the first eye intersecting line and the second eye intersecting line, respectively the first eye intersecting line and the further eye intersecting line to determine other eye state variables such as co-ordinates of the center of the eyeball of the at least one eye respectively of the at least one eye and the further eye, a gaze direction, an optical axis, an orientation, a visual axis, a size of the pupil and/or a radius of the pupil of the at least one eye and/or of the further eye, wherein the expected distance between the center of the eyeball and the center of the pupil is a parameter of the 3D eye model respectively of the further 3D eye model, depending in the pre-determined relationship on the characteristic of the image of the pupil of the corresponding eye, wherein the respective center line and/or the respective eye intersecting line is determined using a model of the camera and/or the 3D eye model respectively the further 3D eye model, wherein the camera is modeled as a pinhole camera, and/or wherein the model of the camera comprises at least one of a focal length, a shift of a central image pixel, a shear parameter, and a distortion parameter.
50. A computer program product or a non-volatile computer-readable storage medium comprising instructions which, when executed by one or more processors of a system, cause the system to carry out the following steps:
- providing a first 3D eye model modeling corneal refraction;
- generating, using known camera intrinsics of a camera, synthetic images of several model eyes according to the first 3D eye model, for a plurality of given values of at least one eye state variable, the at least one eye state variable being derivable from at least one image of an eye of a subject taken with the camera;
- using a given algorithm to calculate the at least one eye state variable using one or more of the synthetic images and a further 3D eye model having at least one parameter;
- determining a characteristic of an image of a pupil within each of the synthetic images;
- determining one or more hypothetically optimal values of the at least one parameter of the further 3D eye model that minimize the error between the value(s) of the at least one given eye state variable and the value(s) of the corresponding eye state variable obtained when applying the given algorithm; and
- establishing a relationship between the one or more hypothetically optimal values of the at least one parameter of the further 3D eye model and the characteristic of the pupil image.
Type: Application
Filed: Mar 12, 2021
Publication Date: Aug 17, 2023
Applicant: PUPIL LABS GmbH (Berlin)
Inventors: Bernhard PETERSCH (Berlin), Kai DIERKES (Berlin)
Application Number: 17/927,650