Patents by Inventor Zhengyou Zhang

Zhengyou Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20130121526
    Abstract: A three-dimensional shape parameter computation system and method for computing three-dimensional human head shape parameters from two-dimensional facial feature points. A series of images containing a user's face is captured. Embodiments of the system and method deduce the 3D parameters of the user's head by examining a series of captured images of the user over time and in a variety of head poses and facial expressions, and then computing an average. An energy function is constructed over a batch of frames containing 2D face feature points obtained from the captured images, and the energy function is minimized to solve for the head shape parameters valid for the batch of frames. Head pose parameters and facial expression and animation parameters can vary over each captured image in the batch of frames. In some embodiments, this minimization is performed using a modified Gauss-Newton minimization technique with a single iteration.
    Type: Application
    Filed: November 11, 2011
    Publication date: May 16, 2013
    Applicant: Microsoft Corporation
    Inventors: Nikolay Smolyanskiy, Christian F. Huitema, Cha Zhang, Lin Liang, Sean Eron Anderson, Zhengyou Zhang
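The single-iteration Gauss-Newton minimization mentioned in this abstract can be illustrated on a toy least-squares problem. This is a generic sketch of the update p ← p − (JᵀJ)⁻¹Jᵀr for a two-parameter energy, not the patent's face-shape model; the data and function names here are illustrative only.

```python
def gauss_newton_step(params, residual_fn, jacobian_fn):
    """One Gauss-Newton update p <- p - (J^T J)^-1 J^T r (2-parameter case)."""
    r = residual_fn(params)
    J = jacobian_fn(params)
    # Normal equations (J^T J) dp = -J^T r, solved in closed form for 2 params.
    a = sum(j[0] * j[0] for j in J)
    b = sum(j[0] * j[1] for j in J)
    c = sum(j[1] * j[1] for j in J)
    g0 = sum(j[0] * ri for j, ri in zip(J, r))
    g1 = sum(j[1] * ri for j, ri in zip(J, r))
    det = a * c - b * b
    dp0 = -(c * g0 - b * g1) / det
    dp1 = -(-b * g0 + a * g1) / det
    return (params[0] + dp0, params[1] + dp1)

# Toy problem: fit y = m*x + k to points. The residuals are linear in the
# parameters, so a single iteration reaches the exact least-squares solution.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]                     # y = 2x + 1 exactly
res = lambda p: [p[0] * x + p[1] - y for x, y in zip(xs, ys)]
jac = lambda p: [(x, 1.0) for x in xs]
m, k = gauss_newton_step((0.0, 0.0), res, jac)
```

For the nonlinear energy of the patent (pose, expression, and shape parameters varying per frame), one such step would be taken per batch rather than iterating to convergence.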
  • Patent number: 8406483
    Abstract: Techniques for face verification are described. Local binary pattern (LBP) features and boosting classifiers are used to verify faces in images. A boosted multi-task learning algorithm is used for face verification in images. Finally, boosted face verification is used to verify faces in videos.
    Type: Grant
    Filed: June 26, 2009
    Date of Patent: March 26, 2013
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Xiaogang Wang, Zhengyou Zhang
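The local binary pattern (LBP) features named in this abstract admit a compact sketch: each pixel's neighbours are thresholded at the centre value and the comparison bits packed into a code, with histograms of codes serving as the features fed to boosting classifiers. A minimal illustration follows; the neighbour ordering and histogram layout are arbitrary choices here, not the patent's.

```python
def lbp_code(img, r, c):
    """8-neighbour local binary pattern: threshold each neighbour at the
    centre value and pack the comparison bits, clockwise from top-left."""
    center = img[r][c]
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offs):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over interior pixels -- the kind of
    descriptor a boosting classifier could consume as a face feature."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist

img = [[10, 20, 30],
       [40, 25, 15],
       [ 5, 50, 35]]
code = lbp_code(img, 1, 1)
hist = lbp_histogram(img)
```

In practice such histograms are computed per image block and concatenated before boosting selects the discriminative bins.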
  • Patent number: 8405706
    Abstract: A videoconferencing conferee may be provided with feedback on his or her location relative to a local video camera by altering how remote videoconference video is displayed on a local videoconference display viewed by the conferee. The conferee's location may be tracked and the displayed remote video may be altered in accordance with the changing location of the conferee. The remote video may appear to move in directions mirroring movement of the conferee. This effect may be achieved by modeling the remote video as offset behind a virtual portal corresponding to the display. The remote video may be displayed according to a view of the remote video through the virtual portal. As the conferee's position changes, the view through the portal changes, and the remote video changes accordingly.
    Type: Grant
    Filed: December 17, 2008
    Date of Patent: March 26, 2013
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Christian Huitema, Alejandro Acero
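The virtual-portal geometry in this abstract reduces to similar triangles: a point on the remote-video plane, sitting some depth behind the screen, is rendered where the line from the viewer's eye to that point crosses the screen plane. A toy sketch under assumed units (metres, screen at z = 0, eye in front at positive z); this is one plausible reading of the geometry, not the patent's implementation.

```python
def portal_projection(eye_x, eye_z, point_x, depth_behind):
    """Where a point on the remote-video plane (depth_behind the screen)
    appears on the screen (z = 0) as seen from an eye at (eye_x, eye_z)."""
    t = eye_z / (eye_z + depth_behind)   # fraction of the eye-to-point ray at z = 0
    return eye_x + t * (point_x - eye_x)

# Eye centred, 1 m from a portal whose video plane sits 0.5 m behind it.
centred = portal_projection(0.0, 1.0, 0.0, 0.5)
# Move the eye 0.3 m to the right: the point's on-screen position shifts
# by less than the eye's motion, producing the motion-parallax cue.
shifted = portal_projection(0.3, 1.0, 0.0, 0.5)
```

Re-rendering the remote video from the tracked eye position each frame is what makes the display behave like a window rather than a flat picture.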
  • Patent number: 8401979
    Abstract: Described is multiple category learning to jointly train a plurality of classifiers in an iterative manner. Each training iteration associates an adaptive label with each training example, and during the iterations the adaptive label of any example can be changed by subsequent reclassification. In this manner, any mislabeled training example is corrected by the classifiers during training. The training may use a probabilistic multiple category boosting algorithm that maintains probability data provided by the classifiers, or a winner-take-all multiple category boosting algorithm that selects the adaptive label based upon the highest-probability classification. The multiple category boosting training system may be coupled to a multiple instance learning mechanism to obtain the training examples. The trained classifiers may be used as weak classifiers that provide a label used to select a deep classifier for further classification, e.g., to provide a multi-view object detector.
    Type: Grant
    Filed: November 16, 2009
    Date of Patent: March 19, 2013
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Zhengyou Zhang
  • Patent number: 8396247
    Abstract: A system that facilitates automatically determining an action of an animate object is described herein. The system includes a receiver component that receives video data that includes images of an animate object. The system additionally includes a determiner component that accesses a data store that includes an action graph and automatically determines an action undertaken by the animate object in the received video data based at least in part upon the action graph. The action graph comprises a plurality of nodes that are representative of multiple possible postures of the animate object. At least one node in the action graph is shared amongst multiple actions represented in the action graph.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: March 12, 2013
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Wanqing Li, Zicheng Liu
  • Publication number: 20130010079
    Abstract: A system described herein includes a receiver component that receives a first digital image from a color camera, wherein the first digital image comprises a planar object, and a second digital image from a depth sensor, wherein the second digital image comprises the planar object. The system also includes a calibrator component that jointly calibrates the color camera and the depth sensor based at least in part upon the first digital image and the second digital image.
    Type: Application
    Filed: July 8, 2011
    Publication date: January 10, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Cha Zhang, Zhengyou Zhang
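Joint color/depth calibration against a planar object typically begins by recovering the plane each sensor observes. As one hedged illustration (a generic first step, not the patent's calibration algorithm), the depth sensor's samples of the calibration plane can be fit by least squares as z = ax + by + c via the 3×3 normal equations:

```python
def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to 3-D points, e.g. depth-sensor
    samples of a calibration plane. Solves the 3x3 normal equations directly."""
    sxx = sxy = syy = sx = sy = n = 0.0
    sxz = syz = sz = 0.0
    for x, y, z in points:
        sxx += x * x; sxy += x * y; syy += y * y
        sx += x; sy += y; n += 1.0
        sxz += x * z; syz += y * z; sz += z
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(A)
    sol = []
    for i in range(3):          # Cramer's rule, one column swap per unknown
        Ai = [row[:] for row in A]
        for r in range(3):
            Ai[r][i] = rhs[r]
        sol.append(det3(Ai) / d)
    return tuple(sol)           # (a, b, c)

# Noise-free samples of the plane z = 0.5*x - 0.25*y + 2.
pts = [(x, y, 0.5 * x - 0.25 * y + 2.0) for x in range(3) for y in range(3)]
a, b, c = fit_plane(pts)
```

With the plane known in both sensors' frames, the relative pose between the color camera and the depth sensor can then be estimated from the plane correspondences.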
  • Patent number: 8340267
    Abstract: The claimed subject matter relates to an architecture that can preprocess audio portions of communications in order to enrich multiparty communication sessions or environments. In particular, the architecture can provide both a public channel for public communications that are received by substantially all connected parties and can further provide a private channel for private communications that are received by a selected subset of all connected parties. Most particularly, the architecture can apply an audio transform to communications that occur during the multiparty communication session based upon a target audience of the communication. By way of illustration, the architecture can apply a whisper transform to private communications, an emotion transform based upon relationships, an ambience or spatial transform based upon physical locations, or a pace transform based upon lack of presence.
    Type: Grant
    Filed: February 5, 2009
    Date of Patent: December 25, 2012
    Assignee: Microsoft Corporation
    Inventors: Dinei A. Florencio, Alejandro Acero, William Buxton, Phillip A. Chou, Ross G. Cutler, Jason Garms, Christian Huitema, Kori M. Quinn, Daniel Allen Rosenfeld, Zhengyou Zhang
  • Patent number: 8339459
    Abstract: Techniques and technologies for tracking a face with a plurality of cameras wherein the geometry between the cameras is initially unknown. One disclosed method includes detecting a head with two of the cameras and registering a head model with the image of the head as detected by one of the cameras. The method also includes back-projecting the head image detected by the other camera to the head model and determining a head pose from the back-projected head image. The camera geometry determined in this way is then used to track the face with at least one of the cameras.
    Type: Grant
    Filed: September 16, 2009
    Date of Patent: December 25, 2012
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Aswin Sankaranarayanan, Qing Zhang, Zicheng Liu, Qin Cai
  • Publication number: 20120306995
    Abstract: A system facilitates managing one or more devices utilized for communicating data within a telepresence session. A telepresence session can be initiated within a communication framework that includes a first user and one or more second users. In response to determining a temporary absence of the first user from the telepresence session, a recordation of the telepresence session is initialized to enable a playback of a portion or a summary of the telepresence session that the first user has missed.
    Type: Application
    Filed: August 13, 2012
    Publication date: December 6, 2012
    Applicant: Microsoft Corporation
    Inventors: Christian Huitema, William A.S. Buxton, Jonathan E. Paff, Zicheng Liu, Rajesh Kutpadi Hegde, Zhengyou Zhang, Kori Marie Quinn, Jin Li, Michel Pahud
  • Publication number: 20120294510
    Abstract: A depth construction module is described that receives depth images provided by two or more depth capture units. Each depth capture unit generates its depth image using a structured light technique, that is, by projecting a pattern onto an object and receiving a captured image in response thereto. The depth construction module then identifies at least one deficient portion in at least one depth image that has been received, which may be attributed to overlapping projected patterns that impinge on the object. The depth construction module then uses a multi-view reconstruction technique, such as a plane sweeping technique, to supply depth information for the deficient portion. In another mode, a multi-view reconstruction technique can be used to produce an entire depth scene based on captured images received from the depth capture units, that is, without first identifying deficient portions in the depth images.
    Type: Application
    Filed: May 16, 2011
    Publication date: November 22, 2012
    Applicant: Microsoft Corporation
    Inventors: Cha Zhang, Wenwu Zhu, Zhengyou Zhang, Philip A. Chou
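The plane-sweeping reconstruction this abstract invokes can be miniaturised to one dimension: hypothesise a set of depth planes, warp one view by the disparity each depth implies, and keep the depth whose warp best explains the other view. The following is a toy sketch with made-up numbers; real plane sweeping operates on 2-D images over many candidate planes.

```python
def plane_sweep_depth(left, right, depths, baseline, focal):
    """Toy 1-D plane sweep: for each candidate depth, warp the right signal
    by the implied disparity and keep the depth with the lowest photometric
    cost (mean absolute difference against the left signal)."""
    best_depth, best_cost = None, float("inf")
    for d in depths:
        disparity = round(baseline * focal / d)   # integer pixel shift
        cost, count = 0.0, 0
        for i in range(len(left)):
            j = i + disparity       # a left-view point appears shifted in the right view
            if 0 <= j < len(right):
                cost += abs(left[i] - right[j])
                count += 1
        if count and cost / count < best_cost:
            best_cost, best_depth = cost / count, d
    return best_depth

# The right view is the left view shifted by 4 pixels; with
# baseline * focal = 8, that disparity corresponds to a depth of 2.
left  = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0]
right = [0, 0, 0, 0, 0, 0, 1, 5, 9, 5, 1, 0]
depth = plane_sweep_depth(left, right, [1, 2, 4, 8], baseline=2.0, focal=4.0)
```

A deficient region in one structured-light depth image could then be filled with depths recovered this way from the overlapping colour or infrared views.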
  • Publication number: 20120287223
    Abstract: The described implementations relate to enhancing images, such as in videoconferencing scenarios. One system includes a poriferous display screen having generally opposing front and back surfaces. This system also includes a camera positioned proximate to the back surface to capture an image through the poriferous display screen.
    Type: Application
    Filed: May 11, 2011
    Publication date: November 15, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Cha Zhang, Timothy A. Large, Zhengyou Zhang, Ruigang Yang
  • Patent number: 8310521
    Abstract: The present virtual video muting technique seamlessly inserts a virtual video into a live video when the user does not want to reveal his or her actual activity. The virtual video is generated from real video frames captured earlier, and thus appears to be real.
    Type: Grant
    Filed: April 30, 2007
    Date of Patent: November 13, 2012
    Assignee: Microsoft Corp.
    Inventors: Zhengyou Zhang, Aaron Bobick
  • Publication number: 20120280974
    Abstract: Dynamic texture mapping is used to create a photorealistic three dimensional animation of an individual with facial features synchronized with desired speech. Audiovisual data of an individual reading a known script is obtained and stored in an audio library and an image library. The audiovisual data is processed to extract feature vectors used to train a statistical model. An input audio feature vector corresponding to desired speech with which the animation will be synchronized is provided. The statistical model is used to generate a trajectory of visual feature vectors that corresponds to the input audio feature vector. These visual feature vectors are used to identify a matching image sequence from the image library. The resulting sequence of images, concatenated from the image library, provides a photorealistic image sequence with facial features, such as lip movements, synchronized with the desired speech. This image sequence is applied to the three-dimensional model.
    Type: Application
    Filed: May 3, 2011
    Publication date: November 8, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Lijuan Wang, Frank Soong, Qiang Huo, Zhengyou Zhang
  • Publication number: 20120281059
    Abstract: The subject disclosure is directed towards an immersive conference, in which participants in separate locations are brought together into a common virtual environment (scene), such that they appear to each other to be in a common space, with geometry, appearance, and real-time natural interaction (e.g., gestures) preserved. In one aspect, depth data and video data are processed to place remote participants in the common scene from the first person point of view of a local participant. Sound data may be spatially controlled, and parallax computed to provide a realistic experience. The scene may be augmented with various data, videos and other effects/animations.
    Type: Application
    Filed: May 4, 2011
    Publication date: November 8, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Philip A. Chou, Zhengyou Zhang, Cha Zhang, Dinei A. Florencio, Zicheng Liu, Rajesh K. Hegde, Nirupama Chandrasekaran
  • Publication number: 20120268563
    Abstract: A person is provided with the ability to auditorily determine the spatial geometry of his current physical environment. A spatial map of the current physical environment of the person is generated. The spatial map is then used to generate a spatialized audio representation of the environment. The spatialized audio representation is then output to a stereo listening device which is being worn by the person.
    Type: Application
    Filed: April 22, 2011
    Publication date: October 25, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Philip A. Chou, Zhengyou Zhang, Dinei Florencio
  • Publication number: 20120262536
    Abstract: Stereophonic teleconferencing system embodiments are described which advantageously employ a microphone array at a remote conference site having multiple conferencees to produce a separate output channel from each microphone in the array. Audio data streams each representing one of the audio output channels from the microphone array are then sent to a local conference site where a local conferencee is in attendance. The voices of the aforementioned remote conferencees are spatialized within a sound-field of the local site using multiple loudspeakers. Generally, this involves receiving the monophonic audio data streams from the remote site, and processing them to generate an audio signal for each loudspeaker. Each of the generated audio signals is then played through its respective loudspeaker to produce a spatial audio sound-field which is audibly perceived by the local conferencee as having the voice of each of the remote conferencees coming from a different location.
    Type: Application
    Filed: April 14, 2011
    Publication date: October 18, 2012
    Applicant: Microsoft Corporation
    Inventors: Wei-ge Chen, Zhengyou Zhang
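Placing each remote talker at a distinct location between two loudspeakers can be sketched with classic constant-power panning. This is a much-simplified stand-in for the spatialization processing a full system would use, and the 60° stage width below is an arbitrary assumption.

```python
import math

def pan_gains(azimuth_deg, width_deg=60.0):
    """Constant-power panning: place a mono source at an azimuth between two
    loudspeakers. Returns (left_gain, right_gain) with gl^2 + gr^2 = 1."""
    # Map azimuth in [-width/2, +width/2] to a pan angle in [0, pi/2];
    # 0 is hard left, pi/2 is hard right.
    x = (azimuth_deg + width_deg / 2.0) / width_deg
    theta = x * math.pi / 2.0
    return math.cos(theta), math.sin(theta)

def spatialize(samples, azimuth_deg):
    """Produce left/right loudspeaker signals for one talker's mono stream."""
    gl, gr = pan_gains(azimuth_deg)
    return [s * gl for s in samples], [s * gr for s in samples]

gl, gr = pan_gains(0.0)   # a centred talker gets equal gains in both channels
```

Each remote conferencee's channel would be panned to its own azimuth and the resulting pairs summed into the two loudspeaker feeds.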
  • Patent number: 8281334
    Abstract: Systems, methods, computer-readable media, and graphical user interfaces for facilitating advertisement placement over video content are provided. Images within a video are partitioned into image regions. Upon partitioning images into image regions, an intrusiveness score is determined for each image region. Based on the intrusiveness scores, optimal placement of an advertisement within the video is determined.
    Type: Grant
    Filed: March 31, 2008
    Date of Patent: October 2, 2012
    Assignee: Microsoft Corporation
    Inventors: Ying Shan, Yue Zhou, Xu Liu, Ying Li, Zhengyou Zhang
  • Patent number: 8253774
    Abstract: The claimed subject matter provides a system and/or a method that facilitates managing one or more devices utilized for communicating data within a telepresence session. A telepresence session can be initiated within a communication framework that includes two or more virtually represented users that communicate therein. A device can be utilized by at least one virtually represented user that enables communication within the telepresence session, the device includes at least one of an input to transmit a portion of a communication to the telepresence session or an output to receive a portion of a communication from the telepresence session. A detection component can adjust at least one of the input related to the device or the output related to the device based upon the identification of a cue, the cue is at least one of a movement detected, an event detected, or an ambient variation.
    Type: Grant
    Filed: March 30, 2009
    Date of Patent: August 28, 2012
    Assignee: Microsoft Corporation
    Inventors: Christian Huitema, William A. S. Buxton, John E. Paff, Zicheng Liu, Rajesh Kutpadi Hegde, Zhengyou Zhang, Kori Marie Quinn, Jin Li, Michel Pahud
  • Patent number: 8233353
    Abstract: A multi-sensor sound source localization (SSL) technique is presented which provides a true maximum likelihood (ML) treatment for microphone arrays having more than one pair of audio sensors. Generally, this is accomplished by selecting a sound source location that results in a time of propagation from the sound source to the audio sensors of the array, which maximizes a likelihood of simultaneously producing audio sensor output signals inputted from all the sensors in the array. The likelihood includes a unique term that estimates an unknown audio sensor response to the source signal for each of the sensors in the array.
    Type: Grant
    Filed: January 26, 2007
    Date of Patent: July 31, 2012
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Dinei Florencio, Zhengyou Zhang
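The core search in sound source localization, scoring candidate source positions by how well they explain the arrival times at every sensor, can be sketched with a delay-and-sum steered response. This is a simplified stand-in for the patent's maximum-likelihood criterion (which additionally estimates per-sensor responses); the geometry and units below are toy assumptions.

```python
def srp_localize(signals, mic_x, candidates, fs, c=343.0):
    """Steered response: score each candidate source position by the energy
    of the delay-aligned sum of the microphone signals, and return the
    highest-scoring position."""
    best_x, best_power = None, -1.0
    n = len(signals[0])
    for sx in candidates:
        # Integer sample delays from the candidate position to each microphone.
        delays = [round(abs(sx - mx) / c * fs) for mx in mic_x]
        d0 = min(delays)
        power = 0.0
        for t in range(n):
            s = 0.0
            for sig, d in zip(signals, delays):
                idx = t + (d - d0)       # advance later arrivals into alignment
                if idx < n:
                    s += sig[idx]
            power += s * s
        if power > best_power:
            best_power, best_x = power, sx
    return best_x

# Toy setup (unit speed of sound, unit sample rate): a pulse emitted at
# x = 1.0 reaches the microphone at x = 0.0 one sample before the one at 3.0.
sig0 = [0, 0, 0, 5, 0, 0, 0, 0]
sig1 = [0, 0, 0, 0, 5, 0, 0, 0]
src_x = srp_localize([sig0, sig1], mic_x=[0.0, 3.0],
                     candidates=[1.0, 2.0], fs=1.0, c=1.0)
```

With more than one sensor pair, this grid search is what the abstract's likelihood replaces with a principled joint treatment of all sensors at once.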
  • Publication number: 20120155680
    Abstract: The disclosed architecture employs signal processing techniques to provide audio-only perception, or audio perception that matches the visual perception. This also provides spatial audio reproduction for multiparty teleconferencing such that the teleconferencing participants perceive themselves as if they were sitting in the same room. The solution is based on the premise that people perceive sounds as a reconstructed wavefront, and hence, the wavefronts are used to provide the spatial perceptual cues. The differences between the spatial perceptual cues derived from the reconstructed wavefront of sound waves and the ideal wavefront of sound waves form an objective metric for spatial perceptual quality, and provide the means of evaluating the overall system performance. Additionally, compensation filters are employed to improve the spatial perceptual quality of stereophonic systems by optimizing the objective metrics.
    Type: Application
    Filed: December 17, 2010
    Publication date: June 21, 2012
    Applicant: Microsoft Corporation
    Inventors: Wei-ge Chen, Zhengyou Zhang, Yoomi Hur