BACKGROUND BLURRING FOR VIDEO CONFERENCING
Background blurring is an effective way to both preserve privacy and keep communication effective during video conferencing. The present image background blurring technique is a lightweight real-time technique that performs background blurring using a fast background modeling procedure combined with an object (e.g., face) detector/tracker. A soft decision is made at each pixel as to whether it belongs to the foreground or the background, based on multiple vision features. The classification results are mapped to a per-pixel blurring radius image that is used to blur the background. In another embodiment, the image background blurring technique blurs the background of the image without using the object detector.
Video conferencing has become more and more popular thanks to the emergence of high speed Internet and reduced prices of high quality web cameras. Wide-spread instant messaging software supports voice/video chatting, where people can view each other while talking. There are, however, privacy concerns with video conferencing. For instance, some people do not want to show their living rooms, bedrooms or offices to other people.
There are a number of approaches to overcoming the privacy issue with video conferencing. For example, some applications replace a talking face with a 3D animated avatar. It is fun to play with such effects, though the expressiveness of the avatars is often limited and cannot deliver information as rich as that conveyed by true faces. An alternative solution is to separate the foreground and background objects in the video, so that the background can be replaced by a different image or video. This would be ideal for preserving privacy while maintaining the effectiveness of the conversation, except that automatic video foreground/background segmentation is a very challenging task. In addition, human eyes are very sensitive to segmentation errors during background replacement, which requires the segmentation algorithm to be extremely accurate.
Existing approaches to preserving privacy in video conferencing are either too slow to be processed in real time, or assume a known background, or require a stereo camera pair. Few of them have achieved the efficiency, accuracy and convenience needed in real-world applications.
The present image background blurring technique employs background blurring as an effective way to both preserve privacy and ensure effective communication during video conferencing. In one embodiment, the present image background blurring technique is a lightweight real-time technique that performs background blurring using a fast background modeling procedure combined with an object detector/tracker, such as a face detector/tracker. A soft decision is made at each pixel as to whether it belongs to the foreground or the background, based on multiple vision features. The classification results are mapped to a per-pixel blurring radius image, which is used to blur the background.
In another embodiment, the image background blurring technique blurs the background of the image without using the object detector. In this embodiment, a background classification procedure is also used to create a probability map to determine the likelihood of each pixel in an image being background. The probability map is then used in blurring the pixels of the image.
It is noted that while the foregoing limitations in existing techniques for overcoming privacy issues described in the Background section can be resolved by a particular implementation of the image background blurring technique described, this technique is in no way limited to implementations that just solve any or all of the noted disadvantages. Rather, the present technique has a much wider application as will become evident from the descriptions to follow.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
DESCRIPTION OF THE DRAWINGS
The specific features, aspects, and advantages of the claimed subject matter will become better understood with regard to the following description, appended claims, and accompanying drawings where:
In the following description, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific embodiments in which the invention may be practiced. It is understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the present disclosure.
1.0 The Computing Environment
Before providing a description of embodiments of the present image background blurring technique, a brief, general description of a suitable computing environment in which portions of the technique may be implemented will be described. The technique is operational with numerous general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the process include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
Device 100 may also contain communications connection(s) 112 that allow the device to communicate with other devices. Communications connection(s) 112 is an example of communication media. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. The term computer readable media as used herein includes both storage media and communication media.
Device 100 may also have input device(s) 114 such as keyboard, mouse, camera, microphone, pen, voice input device, touch input device, etc. In particular, one such input device is a video camera. Output device(s) 116 such as a display, speakers, printer, etc. may also be included. All these devices are well known in the art and need not be discussed at length here.
The present technique may be described in the general context of computer-executable instructions, such as program modules, being executed by a computing device. Generally, program modules include routines, programs, objects, components, data structures, and so on that perform particular tasks or implement particular abstract data types. The process may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The exemplary operating environment having now been discussed, the remaining parts of this description section will be devoted to a description of the program modules embodying the present image background blurring technique. A more detailed view of an exemplary overall operating environment, such as would be found in a video conferencing application, is shown in
2.0 Image Background Blurring Technique
The present image background blurring technique blurs the background in an image, such as one that would be used in video conferencing, instead of replacing the background completely. This has a number of advantages. First, after background blurring, the foreground objects stay in focus while the background objects are blurred. This protects the privacy of the person while maintaining an effective conversation. Second, people are much more forgiving of errors made in background blurring than of errors made in background replacement. This allows one to develop very efficient foreground/background classification procedures without much concern about classification errors. Finally, background blurring has an effect similar to that of a wide-aperture video camera that always focuses on the foreground objects. It can make the foreground objects stand out as the background is blurred, adding an extra dimension to the video.
2.1 System and Process Overview
The present image background blurring technique can be deployed in a typical video conferencing environment. For example, in one embodiment, shown in
The following sections provide details and variations of the image background blurring technique described above.
2.2 Fast Pixel-Level Background Modeling
Fast pixel-level background modeling has been extensively studied in the literature. Gaussian distributions have been used to model the color variation of the background pixels, mostly for indoor environments. This was later extended to Gaussian mixture models, non-parametric kernel density estimators and three-state Hidden Markov Models to adapt to outdoor, dynamic backgrounds. A separate region-level or even object-level model can be added to further improve the background modeling quality in dynamic scenes.
Since most video conferencing applications are indoors, a single Gaussian distribution is typically sufficient to model the color variations of background pixels. Another reason to use a single Gaussian distribution is its simplicity and efficiency in implementation.
The background modeling process 500 used in one embodiment of the image background blurring technique is shown in
More particularly, let an input video frame be It at time t. Let ct(x) be the color of the pixel at location x in It. To make the background model robust to lighting variations, and to simplify computations, convert the video frame from the RGB color space to the YUV color space, and only use the UV components for the task. Thus ct(x) is a two dimensional vector. Initially, all pixels are assumed to be foreground pixels. Let the initial mean color be
where I is a two dimensional identity matrix and C is a tiny number (e.g., 10⁻³).
At each pixel, a background mean and a background variance are computed, denoted as mt(x) and
Given a new input frame, the likelihood of a pixel belonging to the background can be computed using the standard Gaussian kernel:
Here the pixel location variable x is ignored for conciseness. If this probability is above a certain threshold, e.g., pt>0.2, the new pixel will be used to update the background model as:
where a is a decay factor indicating the updating rate. The above updating mechanism can handle slow background variations very well.
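The per-pixel model described above can be sketched as follows. This is a minimal illustration rather than the exact formulation of the technique: the function names are hypothetical, the UV conversion uses the common BT.601 analog weights, the likelihood is taken to be the unnormalized Gaussian kernel exp(-½ dᵀΣ⁻¹d), and the update is assumed to be an exponential moving average in which the decay factor weights the previous estimate.

```python
import math

def rgb_to_uv(r, g, b):
    """Return the (U, V) chroma pair of an RGB color (BT.601 analog weights)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return (0.492 * (b - y), 0.877 * (r - y))

def background_likelihood(c, mean, cov):
    """Unnormalized Gaussian kernel exp(-0.5 * d^T cov^-1 d), with d = c - mean.

    cov is a symmetric 2x2 covariance stored as (sxx, sxy, syy).
    """
    dx, dy = c[0] - mean[0], c[1] - mean[1]
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    # Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * maha)

def update_background(c, mean, cov, alpha=0.95, threshold=0.2):
    """Return (likelihood, mean, cov) after observing the UV color c.

    The model is updated only when the likelihood exceeds the threshold.
    Assumed convention: new = alpha * old + (1 - alpha) * sample.
    """
    p = background_likelihood(c, mean, cov)
    if p <= threshold:
        return p, mean, cov  # unlikely to be background: leave the model alone
    dx, dy = c[0] - mean[0], c[1] - mean[1]
    new_mean = (alpha * mean[0] + (1 - alpha) * c[0],
                alpha * mean[1] + (1 - alpha) * c[1])
    new_cov = (alpha * cov[0] + (1 - alpha) * dx * dx,
               alpha * cov[1] + (1 - alpha) * dx * dy,
               alpha * cov[2] + (1 - alpha) * dy * dy)
    return p, new_mean, new_cov
```

Because only pixels that already look like background update the model, gradual lighting drift is absorbed while a sudden color change (for example, a person moving in front of the pixel) leaves the background estimate untouched.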
Another functionality that is needed in background modeling is the ability to push a pixel into the background if its color does not change for a long period of time. To enable this, a running mean μt(x) and a running variance Ωt(x) are computed for each pixel. Whenever a new frame comes in, these running means and variances are updated similarly as:
Initially μ0(x)=c0(x) and Ω0(x)=ρI, where ρ is a big number (e.g., 20). If a pixel's color remains constant for a long period, the trace of the covariance matrix Ωt will decrease. If the trace is smaller than a certain threshold, the pixel will be pushed into the background, i.e., one sets mt(x)=μt(x) and
A pixel background probability map is thus created using this information for each pixel.
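The push-into-background mechanism can be illustrated with a short sketch. The names, the trace threshold of 1.0 and the decay value of 0.95 are assumptions made for illustration; only the initial values μ0 = c0 and Ω0 = ρI with ρ = 20 come from the description above. The running statistics are assumed to follow an exponential moving average and, unlike the background model, are updated on every frame.

```python
def update_running_stats(c, mu, omega, alpha=0.95):
    """Update the running mean and covariance with the UV color c.

    omega is a symmetric 2x2 covariance stored as (oxx, oxy, oyy).
    Assumed convention: new = alpha * old + (1 - alpha) * sample.
    """
    dx, dy = c[0] - mu[0], c[1] - mu[1]  # deviation from the previous mean
    new_mu = (alpha * mu[0] + (1 - alpha) * c[0],
              alpha * mu[1] + (1 - alpha) * c[1])
    new_omega = (alpha * omega[0] + (1 - alpha) * dx * dx,
                 alpha * omega[1] + (1 - alpha) * dx * dy,
                 alpha * omega[2] + (1 - alpha) * dy * dy)
    return new_mu, new_omega

def should_push_to_background(omega, trace_threshold=1.0):
    """True when the trace of the running covariance drops below the threshold."""
    return omega[0] + omega[2] < trace_threshold

def frames_until_pushed(color, rho=20.0, trace_threshold=1.0, alpha=0.95,
                        max_frames=1000):
    """Count the frames a constant-colored pixel needs before it is pushed."""
    mu, omega = color, (rho, 0.0, rho)  # mu_0 = c_0, Omega_0 = rho * I
    for n in range(1, max_frames + 1):
        mu, omega = update_running_stats(color, mu, omega, alpha)
        if should_push_to_background(omega, trace_threshold):
            return n
    return None
```

With these illustrative values the trace decays geometrically for a static pixel, so the decay factor and the threshold together determine how long an object must stay still before it is absorbed into the background.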
2.3 Object (Face) Detection and Tracking
The background modeling procedure described in Section 2.2 works reasonably well if the foreground person is constantly moving. Unfortunately, many people do not move around all the time during a video conferencing session. When a person stays still for a while, the above procedure will gradually merge the foreground pixels into the background, generating a blurry foreground person. While more sophisticated algorithms exist for background modeling, they inherently suffer from the same problem.
It is observed that in video conferencing applications, the face is by far the most important foreground object that should always be in focus. Therefore, as discussed previously, an object detector and tracker (e.g., a face detector and a face tracker) can be adopted to identify foreground objects, such as, for example, the face region, in order to remedy the above-mentioned problem.
The object detection process 600 employed in one embodiment of the image background blurring technique is shown in
It should be noted that any conventional object detector can be used with the image background blurring technique. However, in one exemplary working embodiment, the object detector is a face detector that employs a three-step detector consisting of a linear pre-filter, a boosting chain and a number of post-filtering algorithms such as support vector machine and color filters. Thanks to the post-filters, the detector has a high detection rate for frontal faces with few false alarms. However, its detection rate on profile faces is relatively low, and it is too expensive to run the detector on every video frame. The image background blurring technique therefore combines the detector with a color-based non-rigid object tracker called pixel classification and integration (PCI). PCI has a number of advantages over the popular mean-shift (MS) algorithm. It guarantees a globally optimal solution, whereas MS finds only a locally optimal one. PCI is also computationally more efficient than MS, with better scale adaptation and appearance modeling. The face detector in the working embodiment is used as both a detector and a verifier for the process. In one embodiment, if no face is detected in the image, the detector will be fired once every second. Otherwise, it is used to verify the sub-image cropped by the tracked face twice a second. If a face is not verified for a number of tries, the detector is launched again for a whole image detection. The detected/tracked faces may be expanded slightly up and down to cover the hair and the neck of the person. In this working embodiment, the image background blurring technique then generates a face background likelihood map as:
$$f_t(x) = \begin{cases} 0 & \text{if the pixel belongs to a face} \\ 1 & \text{otherwise} \end{cases} \qquad (6)$$
That is, pixels belonging to a face have probability 0 as background, and 1 otherwise. In addition, if a pixel belongs to a face region, it will not be pushed into the background model in the fast pixel-level background modeling no matter how small the running variance, Ωt(x), is. (See Section 2.2)
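As an illustration of equation (6), the sketch below builds the face background likelihood map from a list of face rectangles. The rectangle format (left, top, right, bottom with exclusive right and bottom edges), the function names and the 30% vertical expansion are assumptions chosen for the example; the description above only states that faces may be expanded slightly up and down.

```python
def expand_box(box, image_height, up=0.3, down=0.3):
    """Grow a face box vertically to cover hair and neck, clamped to the image.

    box is (left, top, right, bottom); right and bottom are exclusive.
    up and down are fractions of the box height (illustrative values).
    """
    left, top, right, bottom = box
    h = bottom - top
    new_top = max(0, top - int(round(up * h)))
    new_bottom = min(image_height, bottom + int(round(down * h)))
    return (left, new_top, right, new_bottom)

def face_background_map(width, height, boxes, up=0.3, down=0.3):
    """Return f[y][x]: 0.0 inside any (expanded) face box, 1.0 elsewhere."""
    f = [[1.0] * width for _ in range(height)]
    for box in boxes:
        left, top, right, bottom = expand_box(box, height, up, down)
        for y in range(top, bottom):
            for x in range(max(0, left), min(width, right)):
                f[y][x] = 0.0  # face pixels have zero background likelihood
    return f
```

The same map can double as a mask for the background model, so that face pixels are never pushed into the background regardless of their running variance.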
2.4 Background Blurring
The background blurring procedure 700 used in one embodiment of the image background blurring technique is shown in
Mathematically, the above discussed blurring process can be explained as follows. The two background likelihood maps are combined into one for background blurring. Let:
qt(x)=min(pt(x), ft(x)). (7)
where qt(x) is the combined probability map, pt(x) is the background probability map and ft(x) is the object detector probability map.
In one embodiment, the image background blurring technique maps this combined likelihood image into a blurring radius image as:
where rmax is the maximum blurring radius set by the user and δ is a small thresholding probability. If qt(x) is greater than δ, the pixel will be fully blurred. In one working embodiment it was found that δ=0.01 works well. The blurring radius image is then used to blur the original image. That is, for each pixel, the corresponding blurring radius is used to blur the input image by averaging pixels within the radius. Various averaging kernels can be used for this, such as Gaussian or rectangular kernels.
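The combination in equation (7) and the mapping to a blurring radius can be sketched as below. The exact mapping is not reproduced in this text, so the sketch assumes a linear ramp that saturates at rmax once the combined probability reaches δ; this is consistent with the statement that pixels with qt(x) greater than δ are fully blurred, but the behavior below δ is an assumption, as are the function names.

```python
def combine_maps(p_map, f_map):
    """Equation (7): q(x) = min(p(x), f(x)) for every pixel."""
    return [[min(p, f) for p, f in zip(p_row, f_row)]
            for p_row, f_row in zip(p_map, f_map)]

def blur_radius(q, r_max, delta=0.01):
    """Assumed mapping: linear in q below delta, saturating at r_max above it."""
    return r_max * min(1.0, q / delta)

def blur_radius_image(q_map, r_max, delta=0.01):
    """Integer blurring radius for each pixel of the combined map."""
    return [[int(round(blur_radius(q, r_max, delta))) for q in row]
            for row in q_map]
```

For example, with rmax = 8, a pixel with background probability 0.9 that lies outside any face gets the full radius of 8, while a pixel inside a face region gets radius 0 however high its background probability is.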
One challenge that is faced during the blurring process is that it can be very computationally expensive, because each pixel can have a different blurring radius. When the maximum blurring radius rmax is large, the adaptive blurring procedure can be very slow. Fortunately, for certain blurring kernels such as the rectangular kernel, this procedure can be greatly sped up with the help of integral images. As shown in
where R(0,x) is the rectangular region formed by the origin and x, as shown in the region 802 in the figure on the left. The computational cost of calculating the integral image is low: two additions for each pixel in the image.
After the integral image has been calculated, the sum of colors in an arbitrary rectangular region 804 can be computed with 3 additions, as shown in
The blurred pixel is thus the sum of the pixels within the radius divided by the size of the rectangular region, which can be computed efficiently for neighborhoods of arbitrary size.
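A minimal sketch of the integral-image approach for a rectangular kernel follows. It works on a single-channel image stored as a list of rows (a color image would be processed one channel at a time), pads the integral image with a leading row and column of zeros to simplify the boundary handling, and clips each window to the image borders. These layout choices and the function names are assumptions for illustration.

```python
def integral_image(img):
    """S[y][x] = sum of img over all rows < y and columns < x (zero-padded)."""
    h, w = len(img), len(img[0])
    S = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]                    # first addition
            S[y + 1][x + 1] = S[y][x + 1] + row_sum  # second addition
    return S

def rect_sum(S, x0, y0, x1, y1):
    """Sum of img over columns x0..x1-1 and rows y0..y1-1 (three operations)."""
    return S[y1][x1] - S[y0][x1] - S[y1][x0] + S[y0][x0]

def adaptive_box_blur(img, radii):
    """Blur each pixel with a box whose half-width is radii[y][x]."""
    h, w = len(img), len(img[0])
    S = integral_image(img)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            r = radii[y][x]
            x0, x1 = max(0, x - r), min(w, x + r + 1)  # clip window to image
            y0, y1 = max(0, y - r), min(h, y + r + 1)
            area = (x1 - x0) * (y1 - y0)
            out[y][x] = rect_sum(S, x0, y0, x1, y1) / area
    return out
```

The cost per output pixel is constant regardless of the radius, which is what makes a per-pixel blurring radius affordable in real time.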
3.0 Other Embodiments
In an alternate embodiment, the image background blurring process does not employ an object or face detector/tracker. In this embodiment 900, shown in
It should also be noted that any or all of the aforementioned alternate embodiments may be used in any combination desired to form additional hybrid embodiments. Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
1. A computer-implemented process for blurring the background in an image of an image sequence, comprising using a computer to perform the process actions of:
- (a) dividing an image of an image sequence into pixels;
- (b) creating a first probability map of each pixel's probability it is background;
- (c) using an object detector to find an object in the image and using any found object to determine the probability of each pixel being background thereby creating a second probability map;
- (d) specifying that each pixel in the first probability map is not background, if it was determined by the object detector to belong to an object;
- (e) combining the first and second probability maps to obtain a combined probability map that defines a probability of each pixel in the image being a background pixel;
- (f) determining a blurring radius for each pixel based on its probability of being a background pixel; and
- (g) blurring each pixel in the image using the blurring radius for each pixel to create an output image with a blurred background.
2. The computer-implemented process of claim 1 further comprising repeating process actions (a) through (g) for subsequent images in the image sequence.
3. The computer-implemented process of claim 1 wherein the process action of creating the first probability map, comprises the process actions of:
- inputting a first image of an image sequence;
- designating all pixels in the first image as foreground;
- for each pixel in the first image, computing a mean color and a variance;
- inputting the next image in the image sequence;
- for each pixel in the next image, determining the probability of it being a background pixel;
- updating the mean color and variance, using the probability of each pixel in said next image being a background pixel, to compute a running mean and variance for each pixel;
- for each pixel, determining if the running variance is large; if the running pixel variance is large, classifying the pixel as not being a background pixel; if the running pixel variance is not large, classifying the pixel as being a background pixel; and
- creating a probability map that each pixel is background based on the pixel classifications.
4. The computer-implemented process of claim 3 wherein the process action of determining the probability of a pixel belonging to the background is computed using a standard Gaussian kernel.
5. The computer-implemented process of claim 3 wherein each pixel in the input image is converted from red, green, blue (RGB) color space to YUV color space prior to computing a mean color and variance.
6. The computer-implemented process of claim 1 wherein the process action of using an object detector to find an object in the image and using any found object to determine the probability of a pixel being background thereby creating a second probability map, comprises the process actions of:
- (a) inputting the image divided into pixels;
- (b) using an object detector to detect any objects in the image; if an object is detected, verifying the location of a cropped sub-image of the object; if the location of cropped sub-image of the object has not been verified for a number of tries, attempting to use the object detector to find the object in the whole image; if the object is not found, designating all pixels as background, if the object is found, designating each pixel in the image belonging to the object as having a 0 probability of being background; if an object is not detected, designating all pixels as background; and
- (c) creating the second probability map by using the designation for each pixel.
7. The computer-implemented process of claim 1 wherein the object detector is a face detector and the object detected is a face.
8. The computer-implemented process of claim 1 wherein the process action of blurring each pixel in the image using the blurring radius for each pixel to create an output image with a blurred background, comprises the process actions of:
- obtaining a combined probability derived from the first and second probability maps of each pixel in the image being background;
- determining a blurring radius for each pixel based on the probability of it being background obtained from the combined probability map;
- using the second probability map created by using the object detector to specify that pixels that belong to objects are not background; and
- using the blurring radius to blur pixels that are background.
9. The computer-implemented process of claim 8 further comprising the process action of smoothing the image after the blurring radius for each image has been determined.
10. The computer-implemented process of claim 1 wherein the blurring radius is larger if the probability is higher that the pixel is background.
11. The computer-implemented process of claim 1 wherein the background in the image is blurred by using the corresponding blurring radius for each pixel and averaging the pixel colors within the radius.
12. The computer-implemented process of claim 1 wherein a Gaussian distribution with a rectangular kernel is used in determining each blurring radius, and wherein integral images are used in determining each blurring radius.
13. A computer-readable medium having computer-executable instructions for performing the computer-implemented process recited in claim 1.
14. A system for blurring the background in an image, comprising:
- a general purpose computing device;
- a computer program comprising program modules executable by the general purpose computing device, wherein the computing device is directed by the program modules of the computer program to, divide an input image into pixels;
- perform background modeling to determine a first probability map that each pixel in the image is background;
- perform object detection to find an object and specify that pixels of any object detected are not background pixels to create a second probability map that each pixel in the image is background;
- perform image background blurring using the first and second probability maps that each pixel in the image is background to create an image with a blurred background.
15. The system of claim 14 wherein object detection is performed by a user manually segmenting an object from background in the input image.
16. The system of claim 14 wherein object detection is automatically performed by using an object detector.
17. The system of claim 14 wherein the program module to perform image background blurring comprises:
- for each pixel, averaging colors of the pixels in an area corresponding to the probability that a pixel is background, and replacing the pixel color with the averaged color.
18. A computer-implemented process for blurring the background in an image of an image sequence, comprising:
- inputting an image divided into pixels;
- performing foreground/background modeling to determine the probability of each pixel in the image being foreground or background;
- using the probability that each pixel is foreground or background to blur background pixels in the image.
19. The computer-implemented process of claim 18 wherein pixels with a high probability of being foreground pixels are not blurred.
20. The computer-implemented process of claim 19 wherein pixels with a probability of being background pixels are blurred proportional to the likelihood that they are background pixels.
International Classification: G06K 9/40 (20060101); G06K 9/34 (20060101);