Red eye reduction technique
A red-eye reduction technique includes converting a multi-channel image to a hue, saturation, value color space. The hue channel, the saturation channel, and the value channel are processed to identify the location of the red-eye within the image, if any.
This application is a continuation of U.S. patent application Ser. No. 10/676,277, filed Sep. 30, 2003.
BACKGROUND OF THE INVENTION
The invention relates generally to the field of digital image processing, and in particular, to the identification of and the reduction of the red-eye effect in images.
The increased use of computers in many applications has drawn increasing focus on improving the man-machine interface. It is the desire of many applications to locate the face of the user in an image, then to process it to robustly identify the person. The algorithms for facial recognition have dramatically improved in recent years and are now sufficiently robust for many applications. The weak part of the system is the face detection and location. Other applications for facial imaging beyond identification are also growing in interest, in particular perceptual computing, such as discerning a reaction or emotion from a user's face. This would enable computer-driven systems to be more responsive, like a human. Again, these algorithms will be limited by the weaknesses in face detection and location.
When flash illumination is used during the capture of an image that contains sizable human faces, the pupils of people sometimes appear red because the light is partially absorbed by capillaries in the retina.
The general technique for red-eye reduction within cameras has been to impact two parameters: (a) reduce the pupil diameter of the subject, for example by emitting a series of small pre-flashes prior to capturing the desired image with full illumination; and (b) increase the flash to lens separation, so that the illumination impinging on the subject's eyes is reflected at an angle that misses the taking lens.
In most cases, where a flash is needed to illuminate the subject, the subject's pupils are dilated due to the low ambient illumination. Light from the flash can then enter the eye through the pupil and is reflected off the blood vessels at the back of the retina. This reflection may be recorded by the camera if the geometry of the camera lens, the flash, and the subject's eyes is just right, rendering the captured image unpleasant and objectionable to viewers. Hence there is a significant need for automatic methods that identify and correct red-eye areas in a captured image.
A number of methods have been proposed for detecting and/or removing red-eye artifacts in the images themselves. The majority of these methods are either (i) supervised, i.e. they require the user to manually identify the subregions in an image where the artifacts are observed, or (ii) dependent on skin/face and/or eye detection to find the areas of interest. However, manual identification is cumbersome for the user, especially when many images are involved. In addition, typical skin, face, and eye detection techniques are computationally intensive.
The foregoing and other objectives, features, and advantages of the invention will be more readily understood upon consideration of the following detailed description of the invention, taken in conjunction with the accompanying drawings.
To identify the existence of the red-eye in an image in a manner that is free from user identification of an image as containing the red-eye or otherwise the sub-region of the image containing the red-eye, the present inventor came to the realization that modification of a typical red, green, blue (“RGB”) image to one that includes an enhanced luminance channel (e.g., >60% of the luminance information in a single channel) facilitates such an identification and reduction.
With the color channels of the image modified to a hue, saturation, value (“HSV”) color space, each channel of the HSV color space may be processed and analyzed in a different manner, and combined in some manner, to accurately identify the red-eye artifacts.
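The conversion to HSV can be sketched as follows. This is only an illustration using Python's standard `colorsys` module; the helper name `rgb_image_to_hsv`, the list-of-rows image layout, and the scaling of all three channels to [0, 1] are assumptions made for the example, not details taken from the description.

```python
import colorsys


def rgb_image_to_hsv(rgb_rows):
    """Convert rows of 8-bit (R, G, B) tuples into H, S and V planes in [0, 1]."""
    hue, sat, val = [], [], []
    for row in rgb_rows:
        hsv_row = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
                   for r, g, b in row]
        hue.append([p[0] for p in hsv_row])
        sat.append([p[1] for p in hsv_row])
        val.append([p[2] for p in hsv_row])
    return hue, sat, val


# A 1x3 toy image (hypothetical data): pure red, mid gray, dark blue.
image = [[(255, 0, 0), (128, 128, 128), (0, 0, 64)]]
I_h, I_s, I_v = rgb_image_to_hsv(image)
```

Each of the three planes can then be processed separately, which is what the remaining steps rely on.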
As previously noted, the red-eye artifacts in an image occur as a direct consequence of using a flash while acquiring the image. Accordingly, the red-eye detection technique should focus on those regions of the image that have been affected (i.e. illuminated) by the flash. At block 120, to identify such potential red-eye regions, a thresholding operation is applied to the brightness (V) component Iv of the original image. The pixels that exceed the threshold value Tf comprise a flash mask, Mf:
Mf(i,j) = 1 if Iv(i,j) > Tf, and Mf(i,j) = 0 otherwise.
The value of threshold Tf may be any suitable value, such as, for example, a scalar value, an integer, or a dynamic value based upon the particular image. For example, Tf is computed for each input image individually using the technique described by Otsu, N. (1979), “A threshold selection method from gray-level histograms”, IEEE Trans. Syst. Man Cybernet. 9(1), 62-66. Furthermore, the value of Tf may be selected such that the resulting mask function may be used to determine whether the input image is a flash image or not (e.g., has sufficient red-eye effect).
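A minimal sketch of a per-image threshold and the resulting flash mask is given below. It assumes the V channel is stored as rows of floats in [0, 1]; the function names, the 256-bin histogram, and the choice of returning the upper edge of the selected bin are assumptions of this example. The threshold search is a plain histogram implementation of Otsu's between-class variance criterion, not necessarily the exact variant used in the preferred embodiment.

```python
def otsu_threshold(values, bins=256):
    """Return a threshold in [0, 1] that maximizes the between-class variance."""
    hist = [0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w0 = 0    # number of pixels in bins 0..t (the darker class)
    sum0 = 0  # sum of bin indices over the darker class
    for t in range(bins):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so no split is defined here
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_bin = var_between, t
    # Assumption: report the upper edge of the last "dark" bin as the threshold.
    return (best_bin + 1) / bins


def flash_mask(I_v, T_f):
    """Mark pixels whose brightness exceeds T_f."""
    return [[1 if v > T_f else 0 for v in row] for row in I_v]


# Hypothetical 2x3 brightness plane: a dark row and a flash-lit row.
I_v = [[0.10, 0.20, 0.15],
       [0.90, 0.85, 0.80]]
T_f = otsu_threshold([v for row in I_v for v in row])
M_f = flash_mask(I_v, T_f)
```

For this toy input the threshold falls between the two brightness clusters, so only the second row is marked as flash-affected.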
Once the flash mask Mf(i,j) is determined, other post-processing operations may be applied at block 120 to reduce the number of isolated pixels. These operations may include, for example, median filtering, and morphological operations such as erosion and opening. At block 130, the remaining pixels in Mf are then grouped into a plurality of “contiguous” regions using a connected component technique, such as a convex hull technique or otherwise, and the areas of the connected components are computed. The convex hull of a region is the polygonal area of smallest perimeter that encloses the region, such that the line segment between any pair of points within the area is contained entirely inside the area. Regions with areas smaller than a threshold are discarded or otherwise not used. The convex hull of each remaining region is subsequently computed and a binary mask that comprises the union of the convex hulls is constructed to yield the final flash mask Mf.
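The grouping, area filtering, and convex hull steps might look like the sketch below. It is an illustration under stated assumptions: regions are 4-connected, the hull is found with Andrew's monotone chain algorithm, and `min_area` is a hypothetical parameter; the description does not fix any of these choices.

```python
from collections import deque


def connected_components(mask):
    """Group 1-pixels into 4-connected regions, each a list of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in a consistent orientation."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def flash_mask_from_hulls(mask, min_area):
    """Drop small regions, then return the union of the remaining hulls."""
    out = [[0] * len(mask[0]) for _ in mask]
    for region in connected_components(mask):
        if len(region) < min_area:
            continue  # region too small: discard
        hull = convex_hull(region)
        ys = [p[0] for p in region]
        xs = [p[1] for p in region]
        # Only pixels in the bounding box can lie inside the hull.
        for y in range(min(ys), max(ys) + 1):
            for x in range(min(xs), max(xs) + 1):
                if all(cross(hull[i], hull[(i + 1) % len(hull)], (y, x)) >= 0
                       for i in range(len(hull))):
                    out[y][x] = 1
    return out


# Hypothetical mask: an L-shaped region and one isolated pixel.
M_f = [[1, 0, 0, 0, 0],
       [1, 0, 0, 0, 1],
       [1, 1, 1, 0, 0],
       [0, 0, 0, 0, 0]]
final_M_f = flash_mask_from_hulls(M_f, min_area=3)
```

In this example the isolated pixel is removed by the area test, and the hull of the L-shaped region fills in the pixel at row 1, column 1.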
After Mf is computed, it may be used for further processing on another component of the image, such as the hue component Ih. Mf may be applied to Ih to obtain a masked hue version at block 140. Hue may be defined as the dominant color of a pixel, and it is represented as an angle on the unit circle between 0 degrees and 360 degrees. The present inventor came to the realization that when the hue values are mapped to an appropriate interval for display (e.g., to [0,1] or [0,255]), red-eye locations are observed to appear as light, contiguous regions on darker backgrounds.
The value of the threshold Th can be chosen in any suitable manner; for example, with Th ∈ [0,1], it may be set to 0.125.
After Mh is calculated, several post-processing operations at block 145 may be applied to refine it. These operations may include, for example, median filtering, and morphological filtering such as dilation and closing. The selected pixels in Mh are then grouped into contiguous regions using a connected component technique, and several features are computed for each region. Specifically, one may consider the area, aspect ratio, and/or extent of each region to determine the likelihood that the region is a red-eye area. Extent may be defined as the ratio of the total area of the region (i.e. the number of pixels in the region) to the number of pixels in the smallest bounding box for the region. Regions whose areas and/or aspect ratios fall outside predetermined ranges, or whose extent values are below a specified threshold, are discarded. In the preferred embodiment, the minimum and maximum allowed sizes for a region are computed dynamically based on the size of the input image. The aspect ratio test permits one to eliminate regions that are elongated; the aspect ratio of a candidate red-eye region is expected to be in the interval (0.33,2). Also, if the extent of a region is less than 0.33, the region is removed from the list of candidate red-eye locations.
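The region tests above can be sketched as a small filter. The aspect ratio interval (0.33, 2) and the minimum extent of 0.33 come from the text; defining aspect ratio as bounding-box width divided by height, and the `min_area` and `max_area` values used in the example, are assumptions (the text computes the size limits dynamically from the image size without giving a formula).

```python
def region_features(region):
    """Return (area, aspect ratio, extent) for a list of (row, col) pixels."""
    ys = [p[0] for p in region]
    xs = [p[1] for p in region]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    area = len(region)
    # Assumption: aspect ratio is bounding-box width over height.
    return area, width / height, area / (width * height)


def is_candidate(region, min_area, max_area,
                 aspect_range=(0.33, 2.0), min_extent=0.33):
    """Keep a region only if its area, aspect ratio and extent all pass."""
    area, aspect, extent = region_features(region)
    return (min_area <= area <= max_area
            and aspect_range[0] < aspect < aspect_range[1]
            and extent >= min_extent)


# Hypothetical regions: a compact blob, a horizontal streak, a thin diagonal.
blob = [(y, x) for y in range(3) for x in range(3)]
streak = [(0, x) for x in range(8)]
diagonal = [(i, i) for i in range(4)]
kept = [is_candidate(r, min_area=4, max_area=50) for r in (blob, streak, diagonal)]
```

The blob passes all three tests, the streak fails the aspect ratio test (8 is outside the interval), and the diagonal fails the extent test (4 pixels in a 4x4 bounding box gives 0.25).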
The present inventor also came to the realization that the information in the saturation component of the image may be used to further refine the potential candidate red-eye regions. It was observed that pixels in the red-eye regions often have high saturation values.
The intersection of Mh and Mσ is then computed to yield a final mask Mhσ.
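The sketch below illustrates one way the saturation mask and the intersection could be computed. The definition of Mσ is not reproduced in this text, so the example leans on claims 2 and 3, which describe comparing the standard deviation of saturation in a local neighborhood to a threshold. The window radius, the threshold value `T_sigma`, and the direction of the comparison (keeping pixels whose local deviation exceeds the threshold) are all assumptions.

```python
import statistics


def local_std(channel, radius=1):
    """Population standard deviation of each pixel's (2*radius+1)^2 window."""
    rows, cols = len(channel), len(channel[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [channel[y][x]
                      for y in range(max(0, r - radius), min(rows, r + radius + 1))
                      for x in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = statistics.pstdev(window)
    return out


def threshold_mask(channel, threshold):
    """Binary mask of the values that exceed the threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in channel]


def intersect(mask_a, mask_b):
    """Pixel-wise logical AND of two binary masks of equal size."""
    return [[a & b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


# Hypothetical saturation plane with one highly saturated pixel.
I_s = [[0.1, 0.1, 0.1, 0.1],
       [0.1, 0.1, 0.9, 0.1],
       [0.1, 0.1, 0.1, 0.1]]
T_sigma = 0.05  # assumed value
M_sigma = threshold_mask(local_std(I_s), T_sigma)

# Hypothetical hue mask from the previous stage.
M_h = [[1, 0, 1, 1],
       [1, 1, 1, 0],
       [0, 0, 0, 0]]
M_h_sigma = intersect(M_h, M_sigma)
```

Here the first column has no saturation variation in its neighborhood, so the two hue-mask pixels in that column are removed by the intersection.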
The final stage of the technique involves color-based analysis of the remaining regions to determine which pixels are strongly red. This may be achieved using the hue component, by specifying the appropriate range of hue angles corresponding to the color red. Alternatively, this color test may be carried out in other color spaces, such as RGB, YCrCb, L*a*b*, and so on. In the preferred embodiment, the system utilizes the RGB values of the pixels in each candidate region to determine whether the region contains a red-eye artifact. The RGB values can be computed directly from the available HSV components by using a simple transformation. For each region, the system may compute the mean of each primary. The system may then determine whether (i) the mean red value is less than a specified threshold, or (ii) the ratio of the means of the green and blue components is below another predetermined threshold. Any region that satisfies either of the above criteria is discarded, and the remaining regions are identified as red-eye artifact locations.
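A possible reading of the two mean-color criteria is sketched below using `colorsys.hsv_to_rgb` for the HSV-to-RGB transformation. The text does not spell out the ratio, so this example assumes it is mean green divided by mean blue; both threshold values are hypothetical, and skipping the ratio test when the mean blue is zero is a choice made here to avoid division by zero.

```python
import colorsys


def passes_color_test(region, I_h, I_s, I_v, min_mean_red, min_green_blue_ratio):
    """Return False if the region meets either discard criterion."""
    rgb = [colorsys.hsv_to_rgb(I_h[y][x], I_s[y][x], I_v[y][x]) for y, x in region]
    n = len(rgb)
    mean_r = sum(p[0] for p in rgb) / n
    mean_g = sum(p[1] for p in rgb) / n
    mean_b = sum(p[2] for p in rgb) / n
    if mean_r < min_mean_red:
        return False  # criterion (i): not red enough
    # Assumption: the ratio in criterion (ii) is mean green over mean blue.
    if mean_b > 0 and mean_g / mean_b < min_green_blue_ratio:
        return False  # criterion (ii)
    return True


# Hypothetical single-pixel regions given as RGB in [0, 1], converted to HSV planes.
pixels = [(0.80, 0.20, 0.25), (0.30, 0.20, 0.20), (0.80, 0.10, 0.70)]
hsv = [colorsys.rgb_to_hsv(*p) for p in pixels]
I_h = [[p[0] for p in hsv]]
I_s = [[p[1] for p in hsv]]
I_v = [[p[2] for p in hsv]]
regions = [[(0, 0)], [(0, 1)], [(0, 2)]]
kept = [passes_color_test(r, I_h, I_s, I_v, 0.5, 0.5) for r in regions]
```

With these assumed thresholds, only the first region survives: the second has too little red, and the third has a green-to-blue ratio of about 0.14.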
The individual pixels that require correction within these regions are identified through an analysis of the color properties of the individual pixels. This analysis may include, for example, thresholding based on pixel color values, and clustering/region growing based on color similarity. The final output of the technique is a mask that identifies the individual pixels in the image that require red-eye correction. It is to be understood that the techniques described herein may be performed separately or as a result of mathematical equations without the need to convert an entire image.
It is noted that the preferred embodiment is capable of performing the entire operation in an unsupervised manner. In addition, the technique does not require the detection of the face and/or skin regions in an image, and is therefore computationally efficient. Further, limiting the processing of the red-eye to those regions of the image that are affected by the flash illumination improves the computational efficiency.
The embodiments described herein can be implemented in any manner, such as for example, as a stand-alone computer application that operates on digital images or as a plug-in to other image/document management software; or it may be incorporated into a multi-function machine.
Claims
1. A method to identify sub-regions of a multi-channel image as containing red-eye, said method comprising the steps of:
- (a) converting said multi-channel image to a modified multi-channel image wherein at least one of said channels is an enhanced luminance channel that has more than 60% of the luminance information of said multi-channel image and at least one of said channels is a saturation channel;
- (b) deriving a spatial flash mask by applying a luminance threshold operation to said enhanced luminance channel, said spatial flash mask identifying regions of said multi-channel image potentially affected by a flash;
- (c) masking said saturation channel with said spatial flash mask to derive a masked saturation channel;
- (d) deriving a saturation mask by applying a saturation threshold to said masked saturation channel;
- (e) identifying red-eye using said saturation mask and removing the identified said red eye from said multi-channel image.
2. The method of claim 1 where said saturation mask compares the standard deviation of the saturation value of a respective pixel to a threshold.
3. The method of claim 2 wherein said standard deviation of said saturation value is measured relative to the mean saturation of pixels in a neighborhood local to said respective pixel.
4. The method of claim 3 wherein said saturation channel represents the relative bandwidth of the visible output from a light source.
5. The method of claim 1 where said flash mask identifies spatial boundaries around regions of said image affected by a flash.
6. The method of claim 1 where said flash mask is derived at least in part by use of a convex hull technique.
7. The method of claim 1 where said flash mask comprises a first region of said image having a first set of pixels that satisfy said threshold operation, a second region of said image having a second set of pixels that satisfy said threshold operation, and a third region of said image having a third set of pixels that does not satisfy said threshold operation.
8. The method of claim 7 where said third region is constructed so as to be contiguous with both said first and second region.
9. A method to identify sub-regions of a multi-channel image stored on a computer-readable medium operably interactive with a processor, said multichannel image containing red-eye, said method comprising the steps of:
- (a) said processor converting said multi-channel image to a modified multi-channel image including a luminance channel, a saturation channel, and a hue channel;
- (b) said processor masking at least one of said saturation channel and said hue channel with a flash mask, said flash mask determined by applying a threshold operation to respective pixels of said luminance channel and by computing spatial bounds around those said pixels that satisfy said threshold operation;
- (c) said processor determining pixels representing red-eye from said masked at least one of said saturation channel and said hue channel; and
- (d) said processor correcting said red-eye and storing the corrected said image on said computer-readable medium.
10. The method of claim 9 where said flash mask is used to mask said saturation channel, said processor determining a saturation mask from the masked said saturation channel, and using said saturation mask to mask said hue channel so as to determine said pixels representing red-eye from said masked hue channel.
11. The method of claim 10 where said saturation mask is determined by applying a second threshold operation to respective pixels of said saturation channel and by computing spatial bounds around those said pixels that satisfy said second threshold operation.
12. The method of claim 10 where said saturation mask compares the standard deviation of the saturation value of a respective pixel to a threshold.
13. The method of claim 12 wherein said standard deviation of said saturation value is measured relative to the mean saturation of pixels in a neighborhood local to said respective pixel.
14. The method of claim 13 wherein said saturation channel represents the relative bandwidth of the visible output from a light source.
15. The method of claim 9 where said flash mask is derived at least in part by use of a convex hull technique.
16. The method of claim 9 where said flash mask comprises a first region of said image having a first set of pixels that satisfy said threshold operation, a second region of said image having a second set of pixels that satisfy said threshold operation, and a third region of said image having a third set of pixels that does not satisfy said threshold operation.
17. The method of claim 16 where said third region is constructed so as to be contiguous with both said first and second region.
18. A method to identify sub-regions of a multi-channel image stored on a computer-readable medium operably interactive with a processor, said multichannel image containing red-eye, said method comprising the steps of:
- (a) said processor converting said multi-channel image to a modified multi-channel image including a luminance channel, a saturation channel, and a hue channel;
- (b) said processor masking at least one of said saturation channel and said hue channel with a flash mask, said flash mask comprising identified spatial boundaries around regions of said image potentially affected by a flash;
- (c) said processor determining pixels representing red-eye from said masked at least one of said saturation channel and said hue channel; and
- (d) said processor correcting said red-eye and storing the corrected said image on said computer-readable medium.
19. The method of claim 18 where said spatial boundaries are identified using a connected component technique.
20. The method of claim 19 where said connected component technique is a convex hull technique.
Type: Application
Filed: Mar 30, 2010
Publication Date: Dec 2, 2010
Applicant:
Inventor: A. Mufit Ferman (Vancouver, WA)
Application Number: 12/798,122
International Classification: G06K 9/36 (20060101);