Patents by Inventor Cha Zhang
Cha Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20110313766
Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
Type: Application
Filed: August 30, 2011
Publication date: December 22, 2011
Applicant: MICROSOFT CORPORATION
Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
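
The classifier described above can be approximated with off-the-shelf boosting. A minimal sketch, assuming hypothetical per-window audio and video features pooled into one matrix; the feature counts, labels, and data are illustrative, not the patented feature set:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)

# Hypothetical per-window features: 8 audio measures (e.g., sound source
# localization energies) and 16 video measures (e.g., motion/edge sums).
audio = rng.normal(size=(500, 8))
video = rng.normal(size=(500, 16))
X = np.hstack([audio, video])                 # the pooled feature set
y = (X[:, 0] + X[:, 9] > 0).astype(int)       # synthetic "person/speaker present" label

# Boosting over decision stumps plays the role of the learning algorithm that
# picks useful features from the pool, whichever modality they come from.
clf = AdaBoostClassifier(n_estimators=50)
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))
```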
-
Publication number: 20110268281
Abstract: Described are systems and methods performed by computer to reduce crosstalk produced by loudspeakers when rendering binaural sound that is emitted from the loudspeakers into a room. The room may have sound-reflecting surfaces that reflect some of the sound produced by the loudspeakers. To reduce crosstalk, a room model stored by the computer is accessed. The room model models at least sound reflected by one or more of the physical surfaces. The room model is used to calculate a model of an audio channel from the loudspeakers to a listener. The model of the audio channel models sound transmission from the loudspeakers to the listener. The computer uses the model of the audio channel to cancel crosstalk from the loudspeakers when rendering the binaural sound.
Type: Application
Filed: April 30, 2010
Publication date: November 3, 2011
Applicant: Microsoft Corporation
Inventors: Dinei A. Florencio, Cha Zhang, Myung-Suk Song
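
One standard way to cancel crosstalk once a channel model is available is to invert the loudspeaker-to-ear transfer matrix in each frequency bin. A numerical sketch under that assumption; the 2x2 transfer matrix, regularization constant, and signals below are placeholders rather than values derived from the patent's room model:

```python
import numpy as np

n_freq = 512
rng = np.random.default_rng(1)
# H[f] is a 2x2 complex matrix per frequency bin: H[f][ear, speaker],
# assumed to come from the room/channel model.
H = rng.normal(size=(n_freq, 2, 2)) + 1j * rng.normal(size=(n_freq, 2, 2))

beta = 1e-2          # regularization to avoid boosting ill-conditioned bins
eye = np.eye(2)

# Regularized inverse per bin: C = (H^H H + beta I)^-1 H^H, so that H @ C is
# approximately the identity, i.e., each ear hears only its own channel.
C = np.empty_like(H)
for f in range(n_freq):
    Hf = H[f]
    C[f] = np.linalg.solve(Hf.conj().T @ Hf + beta * eye, Hf.conj().T)

# Apply the canceller to a binaural spectrum B[f] (2 channels) to get speaker feeds.
B = rng.normal(size=(n_freq, 2)) + 1j * rng.normal(size=(n_freq, 2))
speaker_feeds = np.einsum('fij,fj->fi', C, B)
print(speaker_feeds.shape)   # (512, 2)
```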
-
Patent number: 8031967
Abstract: A video noise reduction technique is presented. Generally, the technique involves first decomposing each frame of the video into low-pass and high-pass frequency components. Then, for each frame of the video after the first frame, an estimate of the noise variance in the high-pass component is obtained. The noise in the high-pass component of each pixel of each frame is reduced using the noise variance estimate obtained for the frame under consideration, whenever there has been no substantial motion exhibited by the pixel since the previous frame. Evidence of motion is determined by analyzing the high- and low-pass components.
Type: Grant
Filed: June 19, 2007
Date of Patent: October 4, 2011
Assignee: Microsoft Corporation
Inventors: Cha Zhang, Zhengyou Zhang, Zicheng Liu
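
A rough sketch of that pipeline on grayscale frames; the Gaussian low-pass split, the MAD-based noise estimate, and the motion test below are simplified stand-ins chosen for illustration, not the patented procedure:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def denoise_frame(frame, prev_low, motion_thresh=4.0):
    low = gaussian_filter(frame, sigma=1.5)        # low-pass component
    high = frame - low                             # high-pass component

    # Robust noise estimate for the high-pass band (median absolute deviation).
    sigma = 1.4826 * np.median(np.abs(high - np.median(high)))

    # "No substantial motion": the low-pass component barely changed since the
    # previous frame, so high-pass change there is treated as noise.
    static = np.abs(low - prev_low) < motion_thresh

    # Shrink high-pass energy on static pixels in proportion to the noise estimate.
    shrink = np.maximum(0.0, 1.0 - sigma**2 / (high**2 + 1e-6))
    denoised = low + np.where(static, high * shrink, high)
    return denoised, low

rng = np.random.default_rng(2)
frames = rng.normal(128.0, 10.0, size=(5, 64, 64))   # synthetic noisy frames
prev_low = gaussian_filter(frames[0], sigma=1.5)
for frame in frames[1:]:
    denoised, prev_low = denoise_frame(frame, prev_low)
```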
-
Patent number: 8024189
Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
Type: Grant
Filed: June 22, 2006
Date of Patent: September 20, 2011
Assignee: Microsoft Corporation
Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
-
Patent number: 8010471
Abstract: A “Classifier Trainer” trains a combination classifier for detecting specific objects in signals (e.g., faces in images, words in speech, patterns in signals, etc.). In one embodiment, “multiple instance pruning” (MIP) is introduced for training weak classifiers or “features” of the combination classifier. Specifically, a trained combination classifier and associated final threshold for setting false positive/negative operating points are combined with learned intermediate rejection thresholds to construct the combination classifier. Rejection thresholds are learned using a pruning process which ensures that objects detected by the original combination classifier are also detected by the pruned combination classifier, thereby guaranteeing the same detection rate on the training set after pruning. The only parameter required throughout training is a target detection rate for the final cascade system.
Type: Grant
Filed: July 13, 2007
Date of Patent: August 30, 2011
Assignee: Microsoft Corporation
Inventors: Cha Zhang, Paul Viola
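
The core of the pruning step, setting each intermediate rejection threshold so that no positive example detected by the full classifier is rejected early, can be sketched in a few lines. The scores and final threshold below are synthetic, and the per-face multiple-instance grouping is omitted:

```python
import numpy as np

rng = np.random.default_rng(3)
T, N = 20, 200                                   # weak classifiers, positive examples
weak_scores = rng.normal(0.2, 1.0, size=(N, T))  # per-stage weak classifier outputs
cum = np.cumsum(weak_scores, axis=1)             # partial sums s_t(x) after each stage

final_threshold = 0.0
detected = cum[:, -1] >= final_threshold         # positives kept by the full classifier

# For each stage t, reject anything whose partial score falls below the lowest
# partial score of any still-detected positive, so pruning cannot lose them.
rejection_thresholds = cum[detected].min(axis=0)
print(rejection_thresholds.round(2))
```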
-
Publication number: 20110170739
Abstract: Described is a technology by which medical patient facial images are acquired and maintained for associating with a patient's records and/or other items. A video camera may provide video frames, such as captured when a patient is being admitted to a hospital. Face detection may be employed to clip the facial part from the frame. Multiple images of a patient's face may be displayed on a user interface to allow selection of a representative image. Also described is obtaining the patient images by processing electronic documents (e.g., patient records) to look for a face pictured therein.
Type: Application
Filed: January 12, 2010
Publication date: July 14, 2011
Applicant: Microsoft Corporation
Inventors: Michael Gillam, John Christopher Gillotte, Craig Frederick Feied, Jonathan Alan Handler, Renato Reder Cazangi, Rajesh Kutpadi Hegde, Zhengyou Zhang, Cha Zhang
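
The face-clipping step can be illustrated with any off-the-shelf detector; the sketch below uses OpenCV's stock Haar cascade purely as a stand-in (the abstract does not name a detector), and the blank frame and output filenames are placeholders:

```python
import cv2
import numpy as np

# Stock OpenCV Haar cascade used as a stand-in face detector.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

# In practice `frame` would come from the admission video camera; a blank
# image is used here so the sketch runs standalone.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for i, (x, y, w, h) in enumerate(faces):
    # Clip the facial region for association with the patient record.
    cv2.imwrite(f"patient_face_{i}.png", frame[y:y + h, x:x + w])
```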
-
Publication number: 20110119210
Abstract: Described is multiple category learning to jointly train a plurality of classifiers in an iterative manner. Each training iteration associates an adaptive label with each training example, such that during the iterations the adaptive label of any example can be changed by subsequent reclassification. In this manner, any mislabeled training example is corrected by the classifiers during training. The training may use a probabilistic multiple category boosting algorithm that maintains probability data provided by the classifiers, or a winner-take-all multiple category boosting algorithm that selects the adaptive label based upon the highest probability classification. The multiple category boosting training system may be coupled to a multiple instance learning mechanism to obtain the training examples. The trained classifiers may be used as weak classifiers that provide a label used to select a deep classifier for further classification, e.g., to provide a multi-view object detector.
Type: Application
Filed: November 16, 2009
Publication date: May 19, 2011
Applicant: Microsoft Corporation
Inventors: Cha Zhang, Zhengyou Zhang
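
A toy sketch of the winner-take-all relabeling loop; logistic regression stands in for the boosted category classifiers, and the data, initial labels, and iteration count are synthetic:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
K, N, D = 3, 300, 5
X = rng.normal(size=(N, D))
labels = rng.integers(0, K, size=N)        # noisy initial labels, possibly wrong

for _ in range(5):
    if np.bincount(labels, minlength=K).min() == 0:
        break                              # a category emptied out; stop relabeling
    # Train one one-vs-rest classifier per category on the current adaptive labels.
    clfs = [LogisticRegression(max_iter=200).fit(X, (labels == k).astype(int))
            for k in range(K)]
    # Winner-take-all step: each example's adaptive label moves to the category
    # whose classifier assigns it the highest probability.
    probs = np.column_stack([c.predict_proba(X)[:, 1] for c in clfs])
    labels = probs.argmax(axis=1)
```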
-
Patent number: 7890443
Abstract: A “Classifier Trainer” trains a combination classifier for detecting specific objects in signals (e.g., faces in images, words in speech, patterns in signals, etc.). In one embodiment, “multiple instance pruning” (MIP) is introduced for training weak classifiers or “features” of the combination classifier. Specifically, a trained combination classifier and associated final threshold for setting false positive/negative operating points are combined with learned intermediate rejection thresholds to construct the combination classifier. Rejection thresholds are learned using a pruning process which ensures that objects detected by the original combination classifier are also detected by the pruned combination classifier, thereby guaranteeing the same detection rate on the training set after pruning. The only parameter required throughout training is a target detection rate for the final cascade system.
Type: Grant
Filed: July 13, 2007
Date of Patent: February 15, 2011
Assignee: Microsoft Corporation
Inventors: Cha Zhang, Paul Viola
-
Patent number: 7885463
Abstract: A spatial-color Gaussian mixture model (SCGMM) technique for segmenting images. The SCGMM image segmentation technique specifies foreground objects in the first frame of an image sequence, either manually or automatically. From the initial segmentation, the SCGMM segmentation system learns two spatial-color Gaussian mixture models (SCGMMs) for the foreground and background objects. These models are built into a first-order Markov random field (MRF) energy function. The minimization of the energy function leads to a binary segmentation of the images in the image sequence, which can be solved efficiently using a conventional graph cut procedure.
Type: Grant
Filed: March 30, 2006
Date of Patent: February 8, 2011
Assignee: Microsoft Corp.
Inventors: Cha Zhang, Michael Cohen, Yong Rui, Ting Yu
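
A condensed sketch of the data-term side of that model: fit one Gaussian mixture over (x, y, R, G, B) features for the foreground seed pixels and one for the background, then label pixels by likelihood. The image and seed region are synthetic, and the MRF smoothness term and graph-cut solve from the abstract are omitted:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
H, W = 60, 80
image = rng.uniform(0, 255, size=(H, W, 3))
ys, xs = np.mgrid[0:H, 0:W]
feats = np.column_stack([xs.ravel(), ys.ravel(), image.reshape(-1, 3)])  # (x, y, R, G, B)

# Hypothetical initial segmentation (e.g., a box from the first frame).
fg_seed = ((xs.ravel() > 20) & (xs.ravel() < 60) &
           (ys.ravel() > 15) & (ys.ravel() < 45))

fg_gmm = GaussianMixture(n_components=5, covariance_type="full").fit(feats[fg_seed])
bg_gmm = GaussianMixture(n_components=5, covariance_type="full").fit(feats[~fg_seed])

# Per-pixel data term: assign each pixel to the model with higher log-likelihood.
segmentation = (fg_gmm.score_samples(feats) >
                bg_gmm.score_samples(feats)).reshape(H, W)
```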
-
Publication number: 20100329358
Abstract: Multi-view video that is being streamed to a remote device in real time may be encoded. Frames of a real-world scene captured by respective video cameras are received for compression. A virtual viewpoint, positioned relative to the video cameras, is used to determine the expected contribution of individual portions of the frames to an image of the scene synthesized from that viewpoint. For each frame, compression rates for individual blocks of the frame are computed based on the determined contributions of the individual portions of the frame. The frames are compressed by compressing their blocks according to the respective determined compression rates. The frames are transmitted in compressed form via a network to a remote device, which is configured to render the scene using the compressed frames.
Type: Application
Filed: June 25, 2009
Publication date: December 30, 2010
Applicant: MICROSOFT CORPORATION
Inventors: Cha Zhang, Dinei Florencio
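
A toy sketch of the rate-allocation idea: macroblocks that contribute more to the synthesized virtual view get finer quantization. The contribution weights and the QP range below are placeholders, not the geometry-derived values the abstract describes:

```python
import numpy as np

rng = np.random.default_rng(6)
blocks_y, blocks_x = 18, 30                     # 16x16 macroblock grid of one frame
# Placeholder per-block contribution of this frame to the virtual-view synthesis.
contribution = rng.uniform(0.0, 1.0, size=(blocks_y, blocks_x))

qp_min, qp_max = 20, 44                         # hypothetical H.264-style QP range
# Higher contribution -> lower QP (finer quantization, more bits for that block).
qp = np.round(qp_max - (qp_max - qp_min) * contribution).astype(int)
print(qp.min(), qp.max())
```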
-
Publication number: 20100329517
Abstract: Techniques for face verification are described. Local binary pattern (LBP) features and boosting classifiers are used to verify faces in images. A boosted multi-task learning algorithm is used for face verification in images. Finally, boosted face verification is used to verify faces in videos.
Type: Application
Filed: June 26, 2009
Publication date: December 30, 2010
Applicant: MICROSOFT CORPORATION
Inventors: Cha Zhang, Xiaogang Wang, Zhengyou Zhang
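
The LBP feature-extraction stage can be sketched with scikit-image; the grid size, radius, and chi-square comparison below are illustrative defaults, and the boosting and multi-task learning stages from the abstract are not reproduced:

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(face_gray, radius=1, n_points=8, grid=(4, 4)):
    """Concatenate uniform-LBP histograms over a grid of face cells."""
    lbp = local_binary_pattern(face_gray, n_points, radius, method="uniform")
    n_bins = n_points + 2
    h, w = face_gray.shape
    hists = []
    for gy in range(grid[0]):
        for gx in range(grid[1]):
            cell = lbp[gy * h // grid[0]:(gy + 1) * h // grid[0],
                       gx * w // grid[1]:(gx + 1) * w // grid[1]]
            hist, _ = np.histogram(cell, bins=n_bins, range=(0, n_bins), density=True)
            hists.append(hist)
    return np.concatenate(hists)

face_a = np.random.default_rng(7).integers(0, 256, size=(64, 64)).astype(np.uint8)
face_b = np.random.default_rng(8).integers(0, 256, size=(64, 64)).astype(np.uint8)
ha, hb = lbp_histogram(face_a), lbp_histogram(face_b)
# Verification can then compare the descriptors, e.g. with a chi-square distance.
chi2 = 0.5 * np.sum((ha - hb) ** 2 / (ha + hb + 1e-10))
print("chi-square distance:", round(chi2, 3))
```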
-
Patent number: 7840638
Abstract: A multimedia conference technique is disclosed that allows physically remote users to participate in an immersive telecollaborative environment by synchronizing multiple data, images and sounds. The multimedia conference implementation provides users with the perception of being in the same room visually as well as acoustically according to an orientation plan which reflects each remote user's position within the multimedia conference environment.
Type: Grant
Filed: June 27, 2008
Date of Patent: November 23, 2010
Assignee: Microsoft Corporation
Inventors: Zhengyou Zhang, Xuedong David Huang, Zicheng Liu, Cha Zhang, Philip A. Chou, Christian Huitema
-
Publication number: 20100289904
Abstract: Systems are disclosed that provide improved transfer speed of video data from a video capture device to a computing device using multiple video feeds respectively comprising different resolutions. A high-resolution image sensor is used to convert light images into a high-resolution video data stream. A down sampler converts the high-resolution video data stream to a low-resolution video data stream, so that both a low-resolution data stream and a high-resolution data stream are available. While the low-resolution data stream can be sent to the computing device, a digital signal processor (DSP) processes the high-resolution video data stream in accordance with an input control signal that is comprised of desired high-resolution video stream parameters derived from the low-resolution video data stream.
Type: Application
Filed: May 15, 2009
Publication date: November 18, 2010
Applicant: Microsoft Corporation
Inventors: Cha Zhang, Zhengyou Zhang, Zicheng Liu, Wanghong Yuan, Christian Huitema
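
A rough numpy sketch of the dual-stream flow: produce the low-resolution copy by downsampling, derive control parameters from it (here, a bounding box around bright pixels as a stand-in for host-side analysis), and use those parameters to select which part of the high-resolution stream to process. All resolutions and thresholds are illustrative:

```python
import numpy as np

rng = np.random.default_rng(12)
hi = rng.uniform(0.0, 1.0, size=(1080, 1920))       # one high-resolution frame
factor = 8
# Block-mean downsampling stands in for the device's down sampler.
lo = hi.reshape(1080 // factor, factor, 1920 // factor, factor).mean(axis=(1, 3))

# Derive control parameters from the low-resolution stream: the bounding box of
# unusually bright pixels (a placeholder for detection done on the host).
ys, xs = np.where(lo > lo.mean() + 2 * lo.std())
if ys.size:
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    # The control signal then requests only this region from the high-res stream.
    hi_roi = hi[y0 * factor:y1 * factor, x0 * factor:x1 * factor]
```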
-
Patent number: 7822696
Abstract: A “Classifier Trainer” trains a combination classifier for detecting specific objects in signals (e.g., faces in images, words in speech, patterns in signals, etc.). In one embodiment, “multiple instance pruning” (MIP) is introduced for training weak classifiers or “features” of the combination classifier. Specifically, a trained combination classifier and associated final threshold for setting false positive/negative operating points are combined with learned intermediate rejection thresholds to construct the combination classifier. Rejection thresholds are learned using a pruning process which ensures that objects detected by the original combination classifier are also detected by the pruned combination classifier, thereby guaranteeing the same detection rate on the training set after pruning. The only parameter required throughout training is a target detection rate for the final cascade system.
Type: Grant
Filed: July 13, 2007
Date of Patent: October 26, 2010
Assignee: Microsoft Corporation
Inventors: Cha Zhang, Paul Viola
-
Publication number: 20100225743
Abstract: Techniques and technologies are described herein for motion parallax three-dimensional (3D) imaging. Such techniques and technologies do not require special glasses, virtual reality helmets, or other user-attachable devices. More particularly, some of the described motion parallax 3D imaging techniques and technologies generate sequential images, including motion parallax depictions of various scenes derived from clues in views obtained of or created for the displayed scene.
Type: Application
Filed: March 5, 2009
Publication date: September 9, 2010
Applicant: Microsoft Corporation
Inventors: Dinei Afonso Ferreira Florencio, Cha Zhang
-
Patent number: 7783075
Abstract: Background blurring is an effective way to both preserve privacy and keep communication effective during video conferencing. The present image background blurring technique is a lightweight real-time technique that performs background blurring using a fast background modeling procedure combined with an object (e.g., face) detector/tracker. A soft decision is made at each pixel as to whether it belongs to the foreground or the background, based on multiple vision features. The classification results are mapped to a per-pixel blurring radius image to blur the background. In another embodiment, the image background blurring technique blurs the background of the image without using the object detector.
Type: Grant
Filed: June 7, 2006
Date of Patent: August 24, 2010
Assignee: Microsoft Corp.
Inventors: Cha Zhang, Li-wei He, Yong Rui
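
The mapping from a soft foreground decision to a per-pixel blur can be approximated by blending a few pre-blurred copies of the frame. In the sketch below the soft mask is synthetic (in the abstract it comes from background modeling plus the detector/tracker), and the sigma levels are arbitrary:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(9)
frame = rng.uniform(0.0, 1.0, size=(120, 160, 3))

# Hypothetical per-pixel foreground probability (1 = keep sharp, 0 = blur fully).
ys, xs = np.mgrid[0:120, 0:160]
fg_prob = np.exp(-(((xs - 80) / 30.0) ** 2 + ((ys - 60) / 40.0) ** 2))

# Map the soft decision to a per-pixel blurring-radius image.
max_sigma = 6.0
radius = (1.0 - fg_prob) * max_sigma

# Approximate spatially varying blur by blending a stack of pre-blurred copies.
sigmas = [0.0, 2.0, 4.0, 6.0]
levels = [frame] + [gaussian_filter(frame, sigma=(s, s, 0)) for s in sigmas[1:]]
idx = np.clip(radius / 2.0, 0, len(levels) - 1)
low, frac = np.floor(idx).astype(int), idx - np.floor(idx)
high = np.minimum(low + 1, len(levels) - 1)
out = np.empty_like(frame)
for c in range(3):
    channel = [lv[:, :, c] for lv in levels]
    out[:, :, c] = (1 - frac) * np.choose(low, channel) + frac * np.choose(high, channel)
```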
-
Publication number: 20100085416
Abstract: Multi-device capture and spatial browsing of conferences is described. In one implementation, a system detects cameras and microphones, such as the webcams on participants' notebook computers, in a conference room, group meeting, or table game, and enlists an ad-hoc array of available devices to capture each participant and the spatial relationships between participants. A video stream composited from the array is browsable by a user to navigate a 3-dimensional representation of the meeting. Each participant may be represented by a video pane, a foreground object, or a 3-D geometric model of the participant's face or body displayed in spatial relation to the other participants in a 3-dimensional arrangement analogous to the spatial arrangement of the meeting.
Type: Application
Filed: October 6, 2008
Publication date: April 8, 2010
Applicant: Microsoft Corporation
Inventors: Rajesh K. Hegde, Zhengyou Zhang, Philip A. Chou, Cha Zhang, Zicheng Liu, Sasa Junuzovic
-
Publication number: 20090327418
Abstract: A multimedia conference technique is disclosed that allows physically remote users to participate in an immersive telecollaborative environment by synchronizing multiple data, images and sounds. The multimedia conference implementation provides users with the perception of being in the same room visually as well as acoustically according to an orientation plan which reflects each remote user's position within the multimedia conference environment.
Type: Application
Filed: June 27, 2008
Publication date: December 31, 2009
Applicant: MICROSOFT CORPORATION
Inventors: Zhengyou Zhang, Xuedong David Huang, Zicheng Liu, Cha Zhang, Philip A. Chou, Christian Huitema
-
Publication number: 20090263010
Abstract: A classifier is trained on a first set of examples, and the trained classifier is adapted to perform on a second set of examples. The classifier implements a parameterized labeling function. Initial training of the classifier optimizes the labeling function's parameters to minimize a cost function. The classifier and its parameters are provided to an environment in which it will operate, along with an approximation function that approximates the cost function using a compact representation of the first set of examples in place of the actual first set. A second set of examples is collected, and the parameters are modified to minimize a combined cost of labeling the first and second sets of examples. The part of the combined cost that represents the cost of the modified parameters applied to the first set is calculated using the approximation function.
Type: Application
Filed: April 18, 2008
Publication date: October 22, 2009
Applicant: MICROSOFT CORPORATION
Inventors: Cha Zhang, Zhengyou Zhang
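
A schematic sketch of that adaptation for a linear classifier, using a quadratic function around the shipped parameters as the compact stand-in for the original-set cost; the curvature matrix, learning rate, and new-environment data below are assumptions, not the patent's approximation function:

```python
import numpy as np

rng = np.random.default_rng(10)
D = 4
w_old = rng.normal(size=D)                       # parameters learned on the first example set
H_old = np.diag(rng.uniform(0.5, 2.0, size=D))   # compact curvature summary shipped with w_old

# New examples collected in the deployment environment (synthetic here).
X_new = rng.normal(size=(100, D))
y_new = (X_new @ rng.normal(size=D) > 0).astype(float) * 2 - 1   # labels in {-1, +1}

def combined_grad(w, lam=1.0):
    # Logistic-loss gradient on the new set ...
    margins = y_new * (X_new @ w)
    g_new = -(X_new * (y_new / (1 + np.exp(margins)))[:, None]).mean(axis=0)
    # ... plus the gradient of the quadratic approximation of the old-set cost.
    g_old = H_old @ (w - w_old)
    return g_new + lam * g_old

w = w_old.copy()
for _ in range(200):
    w -= 0.1 * combined_grad(w)                  # minimize the combined cost
```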
-
Publication number: 20090251594
Abstract: Videos are retargeted to a target display for viewing with little to no geometric distortion or video information loss. Salient regions of video frames may be determined using scale-space spatiotemporal information. Video information loss may be a result of spatial loss, due to cropping, and resolution loss, due to resizing. A desired cropping window may be determined using a coarse-to-fine searching strategy. Video frames may be cropped with a window that matches an aspect ratio of the target display, and resized isotropically to match a size of the target display.
Type: Application
Filed: April 2, 2008
Publication date: October 8, 2009
Applicant: MICROSOFT CORPORATION
Inventors: Gang Hua, Cha Zhang, Zhengyou Zhang, Zicheng Liu, Ying Shan
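
A compact sketch of the crop-window search: a coarse pass over window sizes and positions at the target aspect ratio, then a finer pass around the best candidate, scoring each window by the saliency it retains (using an integral image for fast sums). The random saliency map and step sizes are placeholders for the scale-space spatiotemporal saliency the abstract describes:

```python
import numpy as np

def window_sum(integral, x0, y0, cw, ch):
    # Sum of the saliency map over the window [x0, x0+cw) x [y0, y0+ch).
    return (integral[y0 + ch, x0 + cw] - integral[y0, x0 + cw]
            - integral[y0 + ch, x0] + integral[y0, x0])

def best_crop(saliency, target_w, target_h, coarse=16):
    h, w = saliency.shape
    aspect = target_w / target_h
    integral = np.pad(saliency, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    best = (0, 0, target_w, target_h, -1.0)

    def search(widths, step):
        nonlocal best
        for cw in widths:
            ch = int(round(cw / aspect))        # keep the target aspect ratio
            if cw > w or ch > h:
                continue
            for x0 in range(0, w - cw + 1, step):
                for y0 in range(0, h - ch + 1, step):
                    s = window_sum(integral, x0, y0, cw, ch)
                    if s > best[4]:
                        best = (x0, y0, cw, ch, s)

    search(range(target_w, w + 1, coarse), coarse)                     # coarse pass
    cw0 = best[2]
    search(range(max(target_w, cw0 - coarse), min(w, cw0 + coarse) + 1, 2), 2)  # fine pass
    return best[:4]

saliency = np.random.default_rng(11).uniform(size=(240, 320))
x0, y0, cw, ch = best_crop(saliency, target_w=160, target_h=120)
# The selected window would then be resized isotropically to the target display size.
print(x0, y0, cw, ch)
```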