Patents by Inventor Zhengyou Zhang
Zhengyou Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7539616
Abstract: Speaker authentication is performed by determining a similarity score for a test utterance and a stored training utterance. Computing the similarity score involves determining the sum of a group of functions, where each function includes the product of a posterior probability of a mixture component and a difference between an adapted mean and a background mean. The adapted mean is formed based on the background mean and the test utterance. The speech content provided by the speaker for authentication can be text-independent (i.e., any content they want to say) or text-dependent (i.e., a particular phrase used for training).
Type: Grant
Filed: February 20, 2006
Date of Patent: May 26, 2009
Assignee: Microsoft Corporation
Inventors: Zhengyou Zhang, Ming Liu
-
Patent number: 7532359
Abstract: A system and process for improving the appearance of improperly colored and/or improperly exposed images is presented. This involves the use of two novel techniques, namely an automatic color correction technique and an automatic exposure correction technique. The automatic color correction technique takes information from within an image to determine true color characteristics, and improves the color in improperly colored pixels. The automatic exposure correction technique measures the average intensity of all of the pixels and adjusts the entire image pixel by pixel to compensate for over or under exposure. These techniques are stand alone in that each can be applied to an image exclusive of the other, or they can both be applied in any order desired.
Type: Grant
Filed: March 9, 2004
Date of Patent: May 12, 2009
Assignee: Microsoft Corporation
Inventors: Po Yuan, Zhengyou Zhang
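As a rough illustration of the exposure-correction idea in this abstract (measure the average intensity, then adjust every pixel), here is a minimal Python sketch. The abstract does not state the adjustment formula, so the uniform gain, the `target_mean` of 128, and the function name `correct_exposure` are all assumptions made for illustration, not the patented method.

```python
def correct_exposure(pixels, target_mean=128.0, max_value=255):
    """Scale every pixel so the image's mean intensity moves to target_mean.

    pixels is a list of rows of grayscale intensities. The uniform gain and
    the default target_mean are illustrative assumptions, not taken from
    the patent.
    """
    flat = [p for row in pixels for p in row]
    if not flat:
        return []
    mean = sum(flat) / len(flat)
    if mean == 0:
        # All-black image: no gain can be estimated, so return a copy.
        return [row[:] for row in pixels]
    gain = target_mean / mean
    # Adjust pixel by pixel, clamping to the valid intensity range.
    return [
        [min(max_value, max(0, round(p * gain))) for p in row]
        for row in pixels
    ]


# An under-exposed 2x2 image with mean intensity 25.
dark = [[10, 20], [30, 40]]
corrected = correct_exposure(dark)
```

With this input the gain is 128 / 25 = 5.12, so the corrected image has a mean intensity of 128. A practical system would more likely use a non-linear tone curve than a single gain, since a plain gain clips highlights.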
-
Patent number: 7518631
Abstract: A visual control system controls a controlled component. In one embodiment, the visual control system controls the controlled component based on a visual location of a user. In another embodiment, input from a visual perception device is used to provide focus control for an audio input device. In additional embodiments, the visual control system stops, starts or suppresses speech recognition or other audio functions when the direction of the sound detected by the audio input device is not coming from the user's visual location.
Type: Grant
Filed: June 28, 2005
Date of Patent: April 14, 2009
Assignee: Microsoft Corporation
Inventors: John R. Hershey, Zhengyou Zhang
-
Patent number: 7515173
Abstract: Video images representative of a conferee's head are received and evaluated with respect to a reference model to monitor a head position of the conferee. A personalized face model of the conferee is captured to track head position of the conferee. In a stereo implementation, first and second video images representative of a first conferee taken from different views are concurrently captured. A head position of the first conferee is tracked from the first and second video images. The tracking of head-position through a personalized model-based approach can be used in a number of applications such as human-computer interaction and eye-gaze correction for video conferencing.
Type: Grant
Filed: May 23, 2002
Date of Patent: April 7, 2009
Assignee: Microsoft Corporation
Inventors: Zhengyou Zhang, Ruigang Yang
-
Publication number: 20090080632
Abstract: Audio in an audio conference is spatialized using either virtual sound-source positioning or sound-field capture. A spatial audio conference is provided between a local party and remote parties using audio conferencing devices (ACDs) interconnected by a network. Each ACD captures spatial audio information from the local party, generates either one, or three or more, audio data streams which include the captured information, and transmits the generated stream(s) to each remote party. Each ACD also receives the generated audio data stream(s) transmitted from each of the remote parties, processes the received streams to generate a plurality of audio signals, and renders the signals to produce a sound-field that is perceived by the local party, where the sound-field includes the spatial audio information captured from the remote parties. A sound-field capture device is also provided which includes at least three directional microphones symmetrically configured about a center axis in a semicircular array.
Type: Application
Filed: September 25, 2007
Publication date: March 26, 2009
Applicant: Microsoft Corporation
Inventors: Zhengyou Zhang, James D. Johnston
-
Patent number: 7508993
Abstract: A system and process for improving the appearance of improperly colored and/or improperly exposed images is presented. This involves the use of two novel techniques, namely an automatic color correction technique and an automatic exposure correction technique. The automatic color correction technique takes information from within an image to determine true color characteristics, and improves the color in improperly colored pixels. The automatic exposure correction technique measures the average intensity of all of the pixels and adjusts the entire image pixel by pixel to compensate for over or under exposure. These techniques are stand alone in that each can be applied to an image exclusive of the other, or they can both be applied in any order desired.
Type: Grant
Filed: March 9, 2004
Date of Patent: March 24, 2009
Assignee: Microsoft Corporation
Inventors: Po Yuan, Zhengyou Zhang
-
Publication number: 20090075634
Abstract: Multi-modal, multi-lingual devices can be employed to consolidate numerous items including, but not limited to, keys, remote controls, image capture devices, audio recorders, cellular telephone functionalities, location/direction detectors, health monitors, calendars, gaming devices, smart home inputs, pens, optical pointing devices or the like. For example, a corner of a cellular telephone can be used as an electronic pen. Moreover, the device can be used to snap multiple pictures stitching them together to create a panoramic image. A device can automate ignition of an automobile, initiate appliances, etc. based upon relative distance. The device can provide for near to eye capabilities for enhanced image viewing. Multiple cameras/sensors can be provided on a single device to provide for stereoscopic capabilities. The device can also provide assistance to blind, privacy, etc. by consolidating services.
Type: Application
Filed: November 26, 2008
Publication date: March 19, 2009
Applicant: Microsoft Corporation
Inventors: Michael J. Sinclair, Yuan Kong, Zhengyou Zhang, Behrooz Chitsaz, David W. Williams, Silviu-Petru Cucerzan, Zicheng Liu
-
Patent number: 7499686
Abstract: A mobile device is provided that includes a digit input that can be manipulated by a user's fingers or thumb, an air conduction microphone and an alternative sensor that provides an alternative sensor signal indicative of speech. Under some embodiments, the mobile device also includes a proximity sensor that provides a proximity signal indicative of the distance from the mobile device to an object. Under some embodiments, the signal from the air conduction microphone, the alternative sensor signal, and the proximity signal are used to form an estimate of a clean speech value. In further embodiments, a sound is produced through a speaker in the mobile device based on the amount of noise in the clean speech value. In other embodiments, the sound produced through the speaker is based on the proximity sensor signal.
Type: Grant
Filed: February 24, 2004
Date of Patent: March 3, 2009
Assignee: Microsoft Corporation
Inventors: Michael J. Sinclair, Xuedong David Huang, Zhengyou Zhang
-
Patent number: 7496229
Abstract: A system and method for transmitting a clear image of a whiteboard work surface for remote collaboration. The image is separated into two portions: the projected image of the work surface, and the writing physically added to the whiteboard by participants. This separation allows several benefits. The bandwidth requirements are much lower than video teleconferencing, and the benefits of whiteboard sharing are improved. The visual echo created on a physical whiteboard can be canceled.
Type: Grant
Filed: February 17, 2004
Date of Patent: February 24, 2009
Assignee: Microsoft Corp.
Inventors: Zhengyou Zhang, Hanning Zhou
-
Patent number: 7477762
Abstract: An incremental motion estimation system and process for estimating the camera pose parameters associated with each image of a long image sequence. Unlike previous approaches, which rely on point matches across three or more views, the present system and process also includes those points shared only by two views. The problem is formulated as a series of localized bundle adjustments in such a way that the estimated camera motions in the whole sequence are consistent with each other. The result of the inclusion of two-view matching points and the localized bundle adjustment approach is more accurate estimates of the camera pose parameters for each image in the sequence than previous incremental techniques, with an accuracy approaching that of global bundle adjustment techniques but with processing about 100 to 700 times faster than the global approaches.
Type: Grant
Filed: August 31, 2004
Date of Patent: January 13, 2009
Assignee: Microsoft Corporation
Inventors: Zhengyou Zhang, Ying Shan
-
Publication number: 20090001165
Abstract: Systems and methods for 2-D barcode recognition are described. In one aspect, the systems and methods use a charge coupled camera capturing device to capture a digital image of a 3-D scene. The systems and methods evaluate the digital image to localize and segment a 2-D barcode from the digital image of the 3-D scene. The 2-D barcode is rectified to remove non-uniform lighting and correct any perspective distortion. The rectified 2-D barcode is divided into multiple uniform cells to generate a 2-D matrix array of symbols. A barcode processing application evaluates the 2-D matrix array of symbols to present data to the user.
Type: Application
Filed: June 29, 2007
Publication date: January 1, 2009
Applicant: Microsoft Corporation
Inventors: Chunhui Zhang, Zhouchen Lin, Zhengyou Zhang, Shi Han
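The step in this abstract that divides the rectified barcode into uniform cells to form a 2-D matrix of symbols can be sketched in Python as below. This assumes the barcode has already been localized, rectified, and binarized (1 for dark, 0 for light). The function name `cells_to_matrix` and the per-cell majority vote are illustrative assumptions, not details from the application.

```python
def cells_to_matrix(image, rows, cols):
    """Divide a rectified binary image into rows x cols uniform cells and
    return one symbol (0 or 1) per cell, chosen by majority vote.

    The majority-vote rule is an assumption made for this sketch.
    """
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be a multiple of the grid size")
    cell_h, cell_w = height // rows, width // cols
    matrix = []
    for r in range(rows):
        symbols = []
        for c in range(cols):
            # Count dark pixels inside the cell at grid position (r, c).
            dark = sum(
                image[y][x]
                for y in range(r * cell_h, (r + 1) * cell_h)
                for x in range(c * cell_w, (c + 1) * cell_w)
            )
            # The cell is dark only if more than half of its pixels are dark.
            symbols.append(1 if 2 * dark > cell_h * cell_w else 0)
        matrix.append(symbols)
    return matrix


# A 4x4 rectified image read as a 2x2 grid of cells.
rectified = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
matrix = cells_to_matrix(rectified, 2, 2)
```

Voting over the whole cell tolerates a few misclassified pixels left over from rectification. Sampling only the cell center would be cheaper but more sensitive to small alignment errors.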
-
Publication number: 20080317371
Abstract: A video noise reduction technique is presented. Generally, the technique involves first decomposing each frame of the video into low-pass and high-pass frequency components. Then, for each frame of the video after the first frame, an estimate of a noise variance in the high pass component is obtained. The noise in the high pass component of each pixel of each frame is reduced using the noise variance estimate obtained for the frame under consideration, whenever there has been no substantial motion exhibited by the pixel since the last previous frame. Evidence of motion is determined by analyzing the high and low pass components.
Type: Application
Filed: June 19, 2007
Publication date: December 25, 2008
Applicant: Microsoft Corporation
Inventors: Cha Zhang, Zhengyou Zhang, Zicheng Liu
-
Patent number: 7460884
Abstract: Multi-modal, multi-lingual devices can be employed to consolidate numerous items including, but not limited to, keys, remote controls, image capture devices, audio recorders, cellular telephone functionalities, location/direction detectors, health monitors, calendars, gaming devices, smart home inputs, pens, optical pointing devices or the like. For example, a corner of a cellular telephone can be used as an electronic pen. Moreover, the device can be used to snap multiple pictures stitching them together to create a panoramic image. A device can automate ignition of an automobile, initiate appliances, etc. based upon relative distance. The device can provide for near to eye capabilities for enhanced image viewing. Multiple cameras/sensors can be provided on a single device to provide for stereoscopic capabilities. The device can also provide assistance to blind, privacy, etc. by consolidating services.
Type: Grant
Filed: June 29, 2005
Date of Patent: December 2, 2008
Assignee: Microsoft Corporation
Inventors: Michael J. Sinclair, Yuan Kong, Zhengyou Zhang, Behrooz Chitsaz, David W. Williams, Silviu-Petru Cucerzan, Zicheng Liu
-
Patent number: 7460160
Abstract: A digital camera having a single image sensor made up of an array of filtered photosites used to capture non-visible light wavelengths in addition to the standard red/green/blue (RGB) or other visible light intensity values is presented. Essentially, this is accomplished using a separate filter disposed over each photosite that exhibits a light transmission function with regard to wavelength which passes only a prescribed range of wavelengths, some passing light in the visible light spectrum and others in the non-visible light spectrum. The photosites passing non-visible light wavelengths can be configured to pass light in the infrared (IR) light spectrum, which can be limited to just the near infrared (NIR) spectrum if desired, or alternately light in the ultra-violet (UV) light spectrum.
Type: Grant
Filed: September 24, 2004
Date of Patent: December 2, 2008
Assignee: Microsoft Corporation
Inventors: John Hershey, Zhengyou Zhang
-
Publication number: 20080279423
Abstract: A subregion-based image parameter recovery system and method for recovering image parameters from a single image containing a face taken under sub-optimal illumination conditions. The recovered image parameters (including albedo, illumination, and face geometry) can be used to generate face images under a new lighting environment. The method includes dividing the face in the image into numerous smaller regions, generating an albedo morphable model for each region, and using a Markov Random Fields (MRF)-based framework to model the spatial dependence between neighboring regions. Different types of regions are defined, including saturated, shadow, regular, and occluded regions. Each pixel in the image is classified and assigned to a region based on intensity, and then weighted based on its classification.
Type: Application
Filed: May 11, 2007
Publication date: November 13, 2008
Applicant: Microsoft Corporation
Inventors: Zhengyou Zhang, Zicheng Liu, Gang Hua, Yang Wang
-
Publication number: 20080279467
Abstract: Image enhancement techniques are described to enhance an image in accordance with a set of training images. In an implementation, an image color tone map is generated for a facial region included in an image. The image color tone map may be normalized to a color tone map for a set of training images so that the image color tone map matches the map for the training images. The normalized color tone map may be applied to the image to enhance the in-question image. In further implementations, the procedure may be updated when the average color intensity in non-facial regions differs from an accumulated mean by a threshold amount.
Type: Application
Filed: May 10, 2007
Publication date: November 13, 2008
Applicant: Microsoft Corporation
Inventors: Zicheng Liu, Cha Zhang, Zhengyou Zhang
-
Patent number: 7447630
Abstract: A method and system use an alternative sensor signal received from a sensor other than an air conduction microphone to estimate a clean speech value. The estimation uses either the alternative sensor signal alone, or in conjunction with the air conduction microphone signal. The clean speech value is estimated without using a model trained from noisy training data collected from an air conduction microphone. Under one embodiment, correction vectors are added to a vector formed from the alternative sensor signal in order to form a filter, which is applied to the air conduction microphone signal to produce the clean speech estimate. In other embodiments, the pitch of a speech signal is determined from the alternative sensor signal and is used to decompose an air conduction microphone signal. The decomposed signal is then used to determine a clean signal estimate.
Type: Grant
Filed: November 26, 2003
Date of Patent: November 4, 2008
Assignee: Microsoft Corporation
Inventors: Zicheng Liu, Michael J. Sinclair, Alejandro Acero, Xuedong D. Huang, James G. Droppo, Li Deng, Zhengyou Zhang, Yanli Zheng
-
Publication number: 20080267578
Abstract: The present virtual video muting technique seamlessly inserts a virtual video into a live video when the user does not want to reveal his/her actual activity. The virtual video is generated based on real video frames captured earlier and thus appears to be real.
Type: Application
Filed: April 30, 2007
Publication date: October 30, 2008
Applicant: Microsoft Corporation
Inventors: Zhengyou Zhang, Aaron Fred Bobick
-
Publication number: 20080234842
Abstract: A device controller that controls a device by tapping or rubbing the surface of microphones on the device. It allows microphones to be used as both speech sensors (to capture speech signals, the original functionality) and a device controller (the new functionality). Tapping or rubbing the surface of microphones on the device produces complex yet distinctive signals. By detecting these events, the present device controller can generate appropriate commands to control the device.
Type: Application
Filed: March 21, 2007
Publication date: September 25, 2008
Applicant: Microsoft Corporation
Inventor: Zhengyou Zhang
-
Patent number: 7426297
Abstract: A system that captures both whiteboard content and audio signals of a meeting using a video camera and records or transmits them in real-time. The Real-Time Whiteboard Capture System (RTWCS) captures pen strokes on whiteboards in real time using an off-the-shelf video camera. Unlike many existing tools, the RTWCS does not instrument the pens or the whiteboard. It analyzes the sequence of captured video images in real time, classifies the pixels into whiteboard background, pen strokes and foreground objects (e.g., people in front of the whiteboard), and extracts newly written pen strokes. This allows the RTWCS to transmit whiteboard contents using very low bandwidth to remote meeting participants. Combined with other teleconferencing tools such as voice conference and application sharing, the RTWCS becomes a powerful tool to share ideas during online meetings.
Type: Grant
Filed: March 21, 2007
Date of Patent: September 16, 2008
Assignee: Microsoft Corp.
Inventors: Zhengyou Zhang, Liwei He