Patents by Inventor Zhengyou Zhang

Zhengyou Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Speaker authentication using adapted background models

Patent number: 7539616

Abstract: Speaker authentication is performed by determining a similarity score for a test utterance and a stored training utterance. Computing the similarity score involves determining the sum of a group of functions, where each function includes the product of a posterior probability of a mixture component and a difference between an adapted mean and a background mean. The adapted mean is formed based on the background mean and the test utterance. The speech content provided by the speaker for authentication can be text-independent (i.e., any content they want to say) or text-dependent (i.e., a particular phrase used for training).

Type: Grant

Filed: February 20, 2006

Date of Patent: May 26, 2009

Assignee: Microsoft Corporation

Inventors: Zhengyou Zhang, Ming Liu
System and process for automatic color and exposure correction in an image

Patent number: 7532359

Abstract: A system and process for improving the appearance of improperly colored and/or improperly exposed images is presented. This involves the use of two novel techniques—namely an automatic color correction technique and an automatic exposure correction technique. The automatic color correction technique takes information from within an image to determine true color characteristics, and improves the color in improperly colored pixels. The automatic exposure correction technique measures the average intensity of all of the pixels and adjusts the entire image pixel by pixel to compensate for over or under exposure. These techniques are stand alone in that each can be applied to an image exclusive of the other, or they can both be applied in any order desired.

Type: Grant

Filed: March 9, 2004

Date of Patent: May 12, 2009

Assignee: Microsoft Corporation

Inventors: Po Yuan, Zhengyou Zhang
Audio-visual control system

Patent number: 7518631

Abstract: A visual control system controls a controlled component. In one embodiment, the visual control system controls the controlled component based on a visual location of a user. In another embodiment, input from a visual perception device is used to provide focus control for an audio input device. In additional embodiments, the visual control system stops, starts or suppresses speech recognition or other audio functions when the direction of the sound detected by the audio input device is not coming from the user's visual location.

Type: Grant

Filed: June 28, 2005

Date of Patent: April 14, 2009

Assignee: Microsoft Corporation

Inventors: John R. Hershey, Zhengyou Zhang
Head pose tracking system

Patent number: 7515173

Abstract: Video images representative of a conferee's head are received and evaluated with respect to a reference model to monitor a head position of the conferee. A personalized face model of the conferee is captured to track head position of the conferee. In a stereo implementation, first and second video images representative of a first conferee taken from different views are concurrently captured. A head position of the first conferee is tracked from the first and second video images. The tracking of head-position through a personalized model-based approach can be used in a number of applications such as human-computer interaction and eye-gaze correction for video conferencing.

Type: Grant

Filed: May 23, 2002

Date of Patent: April 7, 2009

Assignee: Microsoft Corporation

Inventors: Zhengyou Zhang, Ruigang Yang
SPATIAL AUDIO CONFERENCING

Publication number: 20090080632

Abstract: Audio in an audio conference is spatialized using either virtual sound-source positioning or sound-field capture. A spatial audio conference is provided between a local party and remote parties using audio conferencing devices (ACDs) interconnected by a network. Each ACD captures spatial audio information from the local party, generates either one, or three or more, audio data streams which include the captured information, and transmits the generated stream(s) to each remote party. Each ACD also receives the generated audio data stream(s) transmitted from each of the remote parties, processes the received streams to generate a plurality of audio signals, and renders the signals to produce a sound-field that is perceived by the local party, where the sound-field includes the spatial audio information captured from the remote parties. A sound-field capture device is also provided which includes at least three directional microphones symmetrically configured about a center axis in a semicircular array.

Type: Application

Filed: September 25, 2007

Publication date: March 26, 2009

Applicant: Microsoft Corporation

Inventors: Zhengyou Zhang, James D. Johnston
System and process for automatic exposure correction in an image

Patent number: 7508993

Abstract: A system and process for improving the appearance of improperly colored and/or improperly exposed images is presented. This involves the use of two novel techniques—namely an automatic color correction technique and an automatic exposure correction technique. The automatic color correction technique takes information from within an image to determine true color characteristics, and improves the color in improperly colored pixels. The automatic exposure correction technique measures the average intensity of all of the pixels and adjusts the entire image pixel by pixel to compensate for over or under exposure. These techniques are stand alone in that each can be applied to an image exclusive of the other, or they can both be applied in any order desired.

Type: Grant

Filed: March 9, 2004

Date of Patent: March 24, 2009

Assignee: Microsoft Corporation

Inventors: Po Yuan, Zhengyou Zhang
DATA BUDDY

Publication number: 20090075634

Abstract: Multi-modal, multi-lingual devices can be employed to consolidate numerous items including, but not limited to, keys, remote controls, image capture devices, audio recorders, cellular telephone functionalities, location/direction detectors, health monitors, calendars, gaming devices, smart home inputs, pens, optical pointing devices or the like. For example, a corner of a cellular telephone can be used as an electronic pen. Moreover, the device can be used to snap multiple pictures stitching them together to create a panoramic image. A device can automate ignition of an automobile, initiate appliances, etc. based upon relative distance. The device can provide for near to eye capabilities for enhanced image viewing. Multiple cameras/sensors can be provided on a single device to provide for stereoscopic capabilities. The device can also provide assistance to blind, privacy, etc. by consolidating services.

Type: Application

Filed: November 26, 2008

Publication date: March 19, 2009

Applicant: Microsoft Corporation

Inventors: Michael J. Sinclair, Yuan Kong, Zhengyou Zhang, Behrooz Chitsaz, David W. Williams, Silviu-Petru Cucerzan, Zicheng Liu
Method and apparatus for multi-sensory speech enhancement on a mobile device

Patent number: 7499686

Abstract: A mobile device is provided that includes a digit input that can be manipulated by a user's fingers or thumb, an air conduction microphone and an alternative sensor that provides an alternative sensor signal indicative of speech. Under some embodiments, the mobile device also includes a proximity sensor that provides a proximity signal indicative of the distance from the mobile device to an object. Under some embodiments, the signal from the air conduction microphone, the alternative sensor signal, and the proximity signal are used to form an estimate of a clean speech value. In further embodiments, a sound is produced through a speaker in the mobile device based on the amount of noise in the clean speech value. In other embodiments, the sound produced through the speaker is based on the proximity sensor signal.

Type: Grant

Filed: February 24, 2004

Date of Patent: March 3, 2009

Assignee: Microsoft Corporation

Inventors: Michael J. Sinclair, Xuedong David Huang, Zhengyou Zhang
System and method for visual echo cancellation in a projector-camera-whiteboard system

Patent number: 7496229

Abstract: A system and method for transmitting a clear image of a whiteboard work surface for remote collaboration. The image is separated into two portions; the projected image of the work surface, and the writing physically added to the whiteboard by participants. This separation allows several benefits. The bandwidth requirements are much lower than video teleconferencing, and the benefits of whiteboard sharing are improved. The visual echo created on a physical whiteboard can be canceled.

Type: Grant

Filed: February 17, 2004

Date of Patent: February 24, 2009

Assignee: Microsoft Corp.

Inventors: Zhengyou Zhang, Hanning Zhou
Incremental motion estimation through local bundle adjustment

Patent number: 7477762

Abstract: An incremental motion estimation system and process for estimating the camera pose parameters associated with each image of a long image sequence. Unlike previous approaches, which rely on point matches across three or more views, the present system and process also includes those points shared only by two views. The problem is formulated as a series of localized bundle adjustments in such a way that the estimated camera motions in the whole sequence are consistent with each other. Including two-view matching points and using localized bundle adjustment yields more accurate estimates of the camera pose parameters for each image in the sequence than previous incremental techniques, with an accuracy approaching that of global bundle adjustment techniques while running about 100 to 700 times faster than the global approaches.

Type: Grant

Filed: August 31, 2004

Date of Patent: January 13, 2009

Assignee: Microsoft Corporation

Inventors: Zhengyou Zhang, Ying Shan
2-D Barcode Recognition

Publication number: 20090001165

Abstract: Systems and methods for 2-D barcode recognition are described. In one aspect, the systems and methods use a charge coupled camera capturing device to capture a digital image of a 3-D scene. The systems and methods evaluate the digital image to localize and segment a 2-D barcode from the digital image of the 3-D scene. The 2-D barcode is rectified to remove non-uniform lighting and correct any perspective distortion. The rectified 2-D barcode is divided into multiple uniform cells to generate a 2-D matrix array of symbols. A barcode processing application evaluates the 2-D matrix array of symbols to present data to the user.

Type: Application

Filed: June 29, 2007

Publication date: January 1, 2009

Applicant: Microsoft Corporation

Inventors: Chunhui Zhang, Zhouchen Lin, Zhengyou Zhang, Shi Han
VIDEO NOISE REDUCTION

Publication number: 20080317371

Abstract: A video noise reduction technique is presented. Generally, the technique involves first decomposing each frame of the video into low-pass and high-pass frequency components. Then, for each frame of the video after the first frame, an estimate of a noise variance in the high pass component is obtained. The noise in the high pass component of each pixel of each frame is reduced using the noise variance estimate obtained for the frame under consideration, whenever there has been no substantial motion exhibited by the pixel since the last previous frame. Evidence of motion is determined by analyzing the high and low pass components.

Type: Application

Filed: June 19, 2007

Publication date: December 25, 2008

Applicant: Microsoft Corporation

Inventors: Cha Zhang, Zhengyou Zhang, Zicheng Liu
Data buddy

Patent number: 7460884

Abstract: Multi-modal, multi-lingual devices can be employed to consolidate numerous items including, but not limited to, keys, remote controls, image capture devices, audio recorders, cellular telephone functionalities, location/direction detectors, health monitors, calendars, gaming devices, smart home inputs, pens, optical pointing devices or the like. For example, a corner of a cellular telephone can be used as an electronic pen. Moreover, the device can be used to snap multiple pictures stitching them together to create a panoramic image. A device can automate ignition of an automobile, initiate appliances, etc. based upon relative distance. The device can provide for near to eye capabilities for enhanced image viewing. Multiple cameras/sensors can be provided on a single device to provide for stereoscopic capabilities. The device can also provide assistance to blind, privacy, etc. by consolidating services.

Type: Grant

Filed: June 29, 2005

Date of Patent: December 2, 2008

Assignee: Microsoft Corporation

Inventors: Michael J. Sinclair, Yuan Kong, Zhengyou Zhang, Behrooz Chitsaz, David W. Williams, Silviu-Petru Cucerzan, Zicheng Liu
Multispectral digital camera employing both visible light and non-visible light sensing on a single image sensor

Patent number: 7460160

Abstract: A digital camera having a single image sensor made up of an array of filtered photosites used to capture non-visible light wavelengths in addition to the standard red/green/blue (RGB) or other visible light intensity values is presented. Essentially, this is accomplished using a separate filter disposed over each photosite that exhibits a light transmission function with regard to wavelength which passes only a prescribed range of wavelengths—some passing light in the visible light spectrum and others in the non-visible light spectrum. The photosites passing non-visible light wavelengths can be configured to pass light in the infrared (IR) light spectrum, which can be limited to just the near infrared (NIR) spectrum if desired, or alternately light in the ultra-violet (UV) light spectrum.

Type: Grant

Filed: September 24, 2004

Date of Patent: December 2, 2008

Assignee: Microsoft Corporation

Inventors: John Hershey, Zhengyou Zhang
RECOVERING PARAMETERS FROM A SUB-OPTIMAL IMAGE

Publication number: 20080279423

Abstract: A subregion-based image parameter recovery system and method for recovering image parameters from a single image containing a face taken under sub-optimal illumination conditions. The recovered image parameters (including albedo, illumination, and face geometry) can be used to generate face images under a new lighting environment. The method includes dividing the face in the image into numerous smaller regions, generating an albedo morphable model for each region, and using a Markov Random Fields (MRF)-based framework to model the spatial dependence between neighboring regions. Different types of regions are defined, including saturated, shadow, regular, and occluded regions. Each pixel in the image is classified and assigned to a region based on intensity, and then weighted based on its classification.

Type: Application

Filed: May 11, 2007

Publication date: November 13, 2008

Applicant: Microsoft Corporation

Inventors: Zhengyou Zhang, Zicheng Liu, Gang Hua, Yang Wang
Learning image enhancement

Publication number: 20080279467

Abstract: Image enhancement techniques are described to enhance an image in accordance with a set of training images. In an implementation, an image color tone map is generated for a facial region included in an image. The image color tone map may be normalized to a color tone map for a set of training images so that the image color tone map matches the map for the training images. The normalized color tone map may be applied to the image to enhance the in-question image. In further implementations, the procedure may be updated when the average color intensity in non-facial regions differs from an accumulated mean by a threshold amount.

Type: Application

Filed: May 10, 2007

Publication date: November 13, 2008

Applicant: Microsoft Corporation

Inventors: Zicheng Liu, Cha Zhang, Zhengyou Zhang
Method and apparatus for multi-sensory speech enhancement

Patent number: 7447630

Abstract: A method and system use an alternative sensor signal received from a sensor other than an air conduction microphone to estimate a clean speech value. The estimation uses either the alternative sensor signal alone, or in conjunction with the air conduction microphone signal. The clean speech value is estimated without using a model trained from noisy training data collected from an air conduction microphone. Under one embodiment, correction vectors are added to a vector formed from the alternative sensor signal in order to form a filter, which is applied to the air conductive microphone signal to produce the clean speech estimate. In other embodiments, the pitch of a speech signal is determined from the alternative sensor signal and is used to decompose an air conduction microphone signal. The decomposed signal is then used to determine a clean signal estimate.

Type: Grant

Filed: November 26, 2003

Date of Patent: November 4, 2008

Assignee: Microsoft Corporation

Inventors: Zicheng Liu, Michael J. Sinclair, Alejandro Acero, Xuedong D. Huang, James G. Droppo, Li Deng, Zhengyou Zhang, Yanli Zheng
INSERTION OF VIRTUAL VIDEO INTO LIVE VIDEO

Publication number: 20080267578

Abstract: The present virtual video muting technique seamlessly inserts a virtual video into a live video when the user does not want to reveal his/her actual activity. The virtual video is generated based on real video frames captured earlier and thus makes the virtual video appear to be real.

Type: Application

Filed: April 30, 2007

Publication date: October 30, 2008

Applicant: Microsoft Corporation

Inventors: Zhengyou Zhang, Aaron Fred Bobick
MICROPHONES AS CONTACT SENSORS FOR DEVICE CONTROL

Publication number: 20080234842

Abstract: A device controller that controls a device by tapping or rubbing the surface of microphones on the device. It allows microphones to be used as both speech sensors (to capture speech signals, the original functionality) and a device controller (the new functionality). Tapping or rubbing the surface of microphones on the device produces complex yet distinctive signals. By detecting these events, the present device controller can generate appropriate commands to control the device.

Type: Application

Filed: March 21, 2007

Publication date: September 25, 2008

Applicant: Microsoft Corporation

Inventor: Zhengyou Zhang
System and method for real-time whiteboard capture and processing

Patent number: 7426297

Abstract: A system that captures both whiteboard content and audio signals of a meeting using a video camera and records or transmits them in real-time. The Real-Time Whiteboard Capture System (RTWCS) captures pen strokes on whiteboards in real time using an off-the-shelf video camera. Unlike many existing tools, the RTWCS does not instrument the pens or the whiteboard. It analyzes the sequence of captured video images in real time, classifies the pixels into whiteboard background, pen strokes and foreground objects (e.g., people in front of the whiteboard), and extracts newly written pen strokes. This allows the RTWCS to transmit whiteboard contents using very low bandwidth to remote meeting participants. Combined with other teleconferencing tools such as voice conference and application sharing, the RTWCS becomes a powerful tool to share ideas during online meetings.

Type: Grant

Filed: March 21, 2007

Date of Patent: September 16, 2008

Assignee: Microsoft Corp.

Inventors: Zhengyou Zhang, Liwei He
