Patents by Inventor Zhengyou Zhang

Zhengyou Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8797386
    Abstract: A person is provided with the ability to auditorily determine the spatial geometry of his current physical environment. A spatial map of the current physical environment of the person is generated. The spatial map is then used to generate a spatialized audio representation of the environment. The spatialized audio representation is then output to a stereo listening device which is being worn by the person.
    Type: Grant
    Filed: April 22, 2011
    Date of Patent: August 5, 2014
    Assignee: Microsoft Corporation
    Inventors: Philip A. Chou, Zhengyou Zhang, Dinei Florencio
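As a rough illustration of the idea in this abstract (not the patented method itself), the sketch below turns an obstacle's azimuth and distance from a spatial map into stereo cues: an interaural time difference from a spherical-head model and a constant-power level difference, attenuated with distance. The head radius, pan law, and attenuation law are all assumptions for the example.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s
HEAD_RADIUS = 0.0875     # m, average head radius (assumed)

def spatial_cues(azimuth_deg, distance_m):
    """Return (left_gain, right_gain, itd_seconds) for one obstacle.

    Azimuth is measured from straight ahead: negative is to the
    listener's left, positive to the right.
    """
    az = math.radians(azimuth_deg)
    # Interaural time difference (Woodworth spherical-head approximation).
    itd = (HEAD_RADIUS / SPEED_OF_SOUND) * (math.sin(az) + az)
    # Constant-power pan for the interaural level difference.
    pan = (math.sin(az) + 1.0) / 2.0          # 0 = hard left, 1 = hard right
    left = math.cos(pan * math.pi / 2.0)
    right = math.sin(pan * math.pi / 2.0)
    # Attenuate with distance so nearer obstacles sound louder.
    atten = 1.0 / max(distance_m, 1.0)
    return left * atten, right * atten, itd

# An obstacle dead ahead is heard equally in both ears with no delay;
# one at 90 degrees right is louder in the right ear and arrives there first.
l, r, itd = spatial_cues(0.0, 2.0)
```

A real system would convolve each cue pair with head-related transfer functions before sending the result to the stereo listening device; the gains above only capture the coarse geometry.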
  • Publication number: 20140168204
    Abstract: A method, system, and computer-readable storage media for model based video projection are provided herein. The method includes tracking an object within a video based on a three-dimensional parametric model via a computing device and projecting the video onto the three-dimensional parametric model. The method also includes updating a texture map corresponding to the object within the video and rendering a three-dimensional video of the object from any of a number of viewpoints by loosely coupling the three-dimensional parametric model and the updated texture map.
    Type: Application
    Filed: December 13, 2012
    Publication date: June 19, 2014
Applicant: Microsoft Corporation
    Inventors: Zhengyou Zhang, Qin Cai, Philip A. Chou
  • Patent number: 8737648
    Abstract: A spatial element is added to communications, including over telephone conference calls heard through headphones or a stereo speaker setup. Functions are created to modify signals from different callers to create the illusion that the callers are speaking from different parts of the room.
    Type: Grant
    Filed: May 26, 2009
    Date of Patent: May 27, 2014
    Inventors: Wei-ge Chen, Zhengyou Zhang
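A minimal sketch of the panning idea described above, assuming each caller is assigned an evenly spaced azimuth across a frontal arc and mixed into a stereo pair with constant-power gains (the arc width and pan law are illustrative choices, not taken from the patent):

```python
import math

def seat_callers(num_callers, arc_deg=120.0):
    """Assign each caller an azimuth evenly spaced across a frontal arc
    and return per-caller (left_gain, right_gain) stereo pan weights."""
    gains = []
    for i in range(num_callers):
        if num_callers == 1:
            az = 0.0
        else:
            az = -arc_deg / 2.0 + i * arc_deg / (num_callers - 1)
        pan = (az + arc_deg / 2.0) / arc_deg   # 0 = far left, 1 = far right
        gains.append((math.cos(pan * math.pi / 2.0),
                      math.sin(pan * math.pi / 2.0)))
    return gains

def mix(caller_signals, gains):
    """Mix mono caller signals into one stereo (left, right) track pair."""
    n = len(caller_signals[0])
    left, right = [0.0] * n, [0.0] * n
    for sig, (gl, gr) in zip(caller_signals, gains):
        for t, s in enumerate(sig):
            left[t] += gl * s
            right[t] += gr * s
    return left, right
```

With three callers, the first is panned hard left, the second centred, and the third hard right, which is enough to create the "different parts of the room" illusion over headphones.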
  • Patent number: 8731911
    Abstract: Speech quality estimation technique embodiments are described which generally involve estimating the human speech quality of an audio frame in a single-channel audio signal. A representation of a harmonic component of the frame is synthesized and used to compute a non-harmonic component of the frame. The synthesized harmonic component representation and the non-harmonic component are then used to compute a harmonic to non-harmonic ratio (HnHR). This HnHR is indicative of the quality of a user's speech and is designated as an estimate of the speech quality of the frame. In one implementation, the HnHR is used to establish a minimum speech quality threshold below which the quality of the user's speech is considered unacceptable. Feedback to the user is then provided based on whether the HnHR falls below the threshold.
    Type: Grant
    Filed: December 9, 2011
    Date of Patent: May 20, 2014
    Assignee: Microsoft Corporation
    Inventors: Wei-ge Chen, Zhengyou Zhang, Jaemo Yang
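The HnHR computation described above can be sketched as follows, assuming the pitch f0 is already known: the harmonic component is synthesized by projecting the frame onto sinusoids at multiples of f0, the non-harmonic component is the residual, and their power ratio in dB is compared against a quality threshold. The harmonic count and threshold value are illustrative assumptions.

```python
import math

def hnhr_db(frame, f0, sample_rate, num_harmonics=5):
    """Estimate the harmonic-to-non-harmonic ratio (dB) of one frame.

    The harmonic part is synthesized by projecting the frame onto
    sines/cosines at integer multiples of the pitch f0; whatever is
    left over is treated as the non-harmonic component.
    """
    n = len(frame)
    harmonic = [0.0] * n
    for k in range(1, num_harmonics + 1):
        w = 2.0 * math.pi * k * f0 / sample_rate
        c = [math.cos(w * t) for t in range(n)]
        s = [math.sin(w * t) for t in range(n)]
        # Least-squares amplitude of each basis vector (approximately
        # orthogonal when the frame spans whole pitch periods).
        a = sum(x * y for x, y in zip(frame, c)) / sum(y * y for y in c)
        b = sum(x * y for x, y in zip(frame, s)) / sum(y * y for y in s)
        for t in range(n):
            harmonic[t] += a * c[t] + b * s[t]
    residual = [x - h for x, h in zip(frame, harmonic)]
    p_h = sum(h * h for h in harmonic) + 1e-12
    p_n = sum(r * r for r in residual) + 1e-12
    return 10.0 * math.log10(p_h / p_n)

def speech_acceptable(frame, f0, sample_rate, threshold_db=10.0):
    """Flag the frame as acceptable speech when HnHR clears a threshold."""
    return hnhr_db(frame, f0, sample_rate) >= threshold_db
```

A clean sinusoid at f0 yields a very high HnHR, while adding an off-harmonic component drives the ratio down toward the threshold, mirroring how reverberant or noisy speech would be flagged for user feedback.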
  • Publication number: 20140098183
Abstract: A controlled three-dimensional (3D) communication endpoint system and method for simulating in-person communication between participants in an online meeting or conference, and for easily scaling the virtual environment when additional participants join. This gives the participants the illusion that the other participants are in the same room and sitting around the same table with the viewer. The controlled communication endpoint includes a plurality of camera pods that capture video of a participant from 360 degrees around the participant. It also includes a display device configuration whose display devices span at least 180 degrees around the participant and display the virtual environment containing geometric proxies of the other participants. Scalability is achieved by seating the participants at a round virtual table and increasing the table's diameter as additional participants are added.
    Type: Application
    Filed: October 10, 2012
    Publication date: April 10, 2014
    Applicant: Microsoft Corporation
    Inventors: Yancey Christopher Smith, Eric G. Lang, Christian F. Huitema, Zhengyou Zhang
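The round-table scaling rule in this abstract reduces to simple geometry: give every participant a fixed arc of table edge, so the radius grows linearly with the head count. The sketch below (seat arc width and minimum radius are assumed values) computes the table radius and seat positions:

```python
import math

def table_layout(num_participants, seat_arc_m=0.9, min_radius_m=1.0):
    """Return the virtual table radius and each participant's (x, y) seat.

    The circumference must provide `seat_arc_m` of arc per participant,
    so radius = N * seat_arc / (2 * pi), clamped to a minimum size.
    """
    radius = max(min_radius_m,
                 num_participants * seat_arc_m / (2.0 * math.pi))
    seats = []
    for i in range(num_participants):
        theta = 2.0 * math.pi * i / num_participants
        seats.append((radius * math.cos(theta), radius * math.sin(theta)))
    return radius, seats
```

Adding participants never crowds the existing ones; the table simply widens, which is why the abstract calls this arrangement easily scalable.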
  • Patent number: 8693713
    Abstract: The disclosed architecture employs signal processing techniques to provide audio perception only, or audio perception that matches the visual perception. This also provides spatial audio reproduction for multiparty teleconferencing such that the teleconferencing participants perceive themselves as if they were sitting in the same room. The solution is based on the premise that people perceive sounds as a reconstructed wavefront, and hence, the wavefronts are used to provide the spatial perceptual cues. The differences between the spatial perceptual cues derived from the reconstructed wavefront of sound waves and the ideal wavefront of sound waves form an objective metric for spatial perceptual quality, and provide the means of evaluating the overall system performance. Additionally, compensation filters are employed to improve the spatial perceptual quality of stereophonic systems by optimizing the objective metrics.
    Type: Grant
    Filed: December 17, 2010
    Date of Patent: April 8, 2014
    Assignee: Microsoft Corporation
    Inventors: Wei-ge Chen, Zhengyou Zhang, Yoomi Hur
  • Patent number: 8675067
    Abstract: The subject disclosure is directed towards an immersive conference, in which participants in separate locations are brought together into a common virtual environment (scene), such that they appear to each other to be in a common space, with geometry, appearance, and real-time natural interaction (e.g., gestures) preserved. In one aspect, depth data and video data are processed to place remote participants in the common scene from the first person point of view of a local participant. Sound data may be spatially controlled, and parallax computed to provide a realistic experience. The scene may be augmented with various data, videos and other effects/animations.
    Type: Grant
    Filed: May 4, 2011
    Date of Patent: March 18, 2014
    Assignee: Microsoft Corporation
    Inventors: Philip A. Chou, Zhengyou Zhang, Cha Zhang, Dinei A. Florencio, Zicheng Liu, Rajesh K. Hegde, Nirupama Chandrasekaran
  • Patent number: 8675926
Abstract: Multiple images including a face presented by a user are accessed. One or more determinations are made based on the multiple images, such as a determination of whether the face included in the multiple images is a 3-dimensional structure or a flat surface and/or a determination of whether motion is present in one or more face components (e.g., eyes or mouth). If it is determined that the face included in the multiple images is a 3-dimensional structure or that motion is present in the one or more face components, then an indication is provided that the user can be authenticated. However, if it is determined that the face included in the multiple images is a flat surface or that motion is not present in the one or more face components, then an indication is provided that the user cannot be authenticated.
    Type: Grant
    Filed: June 8, 2010
    Date of Patent: March 18, 2014
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Qin Cai, Pieter R. Kasselman, Arthur H. Baker
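A toy version of the two liveness tests above, under heavy assumptions: "3-dimensional structure" is approximated by the depth spread across facial landmarks (a printed photo is nearly planar), and "motion in face components" by the mean frame-to-frame change in an eye region (a photo never blinks). The thresholds are illustrative.

```python
def is_live_face(landmark_depths_mm, eye_frames,
                 depth_spread_mm=15.0, motion_threshold=4.0):
    """Crude liveness decision combining a 3-D structure test and an
    eye-motion test, as in the abstract's either/or rule."""
    # Test 1: depth spread across landmarks (flat print -> tiny spread).
    spread = max(landmark_depths_mm) - min(landmark_depths_mm)
    is_3d = spread > depth_spread_mm
    # Test 2: mean absolute difference of the eye region across frames.
    motion = 0.0
    for prev, cur in zip(eye_frames, eye_frames[1:]):
        motion += sum(abs(a - b) for a, b in zip(prev, cur)) / len(prev)
    motion /= max(len(eye_frames) - 1, 1)
    has_motion = motion > motion_threshold
    # Authenticate if either cue indicates a live face.
    return is_3d or has_motion
```

Either cue alone suffices to accept, matching the abstract; rejecting requires the face to be both flat and motionless.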
  • Patent number: 8675981
    Abstract: Gender recognition is performed using two or more modalities. For example, depth image data and one or more types of data other than depth image data is received. The data pertains to a person. The different types of data are fused together to automatically determine gender of the person. A computing system can subsequently interact with the person based on the determination of gender.
    Type: Grant
    Filed: June 11, 2010
    Date of Patent: March 18, 2014
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Alex Aben-Athar Kipman
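One common way to fuse modalities as described above is late fusion: each modality produces its own gender probability and the scores are combined by a weighted average. The sketch below assumes two example modalities (a depth-based classifier and an RGB-based one); the weights and the averaging rule are illustrative, not the patented combination.

```python
def fuse_gender_scores(modality_scores, weights=None):
    """Late fusion: combine per-modality P(male) estimates, e.g. from a
    depth-image classifier and an RGB face classifier, into one decision
    via a weighted average."""
    if weights is None:
        weights = [1.0] * len(modality_scores)
    total = sum(weights)
    p_male = sum(w * s for w, s in zip(weights, modality_scores)) / total
    return ("male" if p_male >= 0.5 else "female"), p_male
```

Fusing a confident and a hesitant modality yields an intermediate score, so one noisy sensor cannot flip a decision the other strongly supports.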
  • Patent number: 8670018
    Abstract: Reaction information of participants to an interaction may be sensed and analyzed to determine one or more reactions or dispositions of the participants. Feedback may be provided based on the determined reactions. The participants may be given an opportunity to opt in to having their reaction information collected, and may be provided complete control over how their reaction information is shared or used.
    Type: Grant
    Filed: May 27, 2010
    Date of Patent: March 11, 2014
    Assignee: Microsoft Corporation
    Inventors: Sharon K. Cunnington, Rajesh K. Hegde, Kori Quinn, Jin Li, Philip A. Chou, Zhengyou Zhang, Desney S. Tan
  • Patent number: 8639042
Abstract: Described is a hierarchical filtered motion field technology such as for use in recognizing actions in videos with crowded backgrounds. Interest points are detected, e.g., as 2D Harris corners with recent motion, e.g., locations with high intensities in a motion history image (MHI). A global spatial motion smoothing filter is applied to the gradients of MHI to eliminate low intensity corners that are likely isolated, unreliable or noisy motions. At each remaining interest point, a local motion field filter is applied to the smoothed gradients by computing a structure proximity between sets of pixels in the local region and the interest point. The motion at a pixel/pixel set is enhanced or weakened based on its structure proximity with the interest point (nearer pixels are enhanced).
    Type: Grant
    Filed: June 22, 2010
    Date of Patent: January 28, 2014
    Assignee: Microsoft Corporation
    Inventors: Zicheng Liu, Yingli Tian, Liangliang Cao, Zhengyou Zhang
  • Publication number: 20140009562
    Abstract: Multi-device capture and spatial browsing of conferences is described. In one implementation, a system detects cameras and microphones, such as the webcams on participants' notebook computers, in a conference room, group meeting, or table game, and enlists an ad-hoc array of available devices to capture each participant and the spatial relationships between participants. A video stream composited from the array is browsable by a user to navigate a 3-dimensional representation of the meeting. Each participant may be represented by a video pane, a foreground object, or a 3-D geometric model of the participant's face or body displayed in spatial relation to the other participants in a 3-dimensional arrangement analogous to the spatial arrangement of the meeting. The system may automatically re-orient the 3-dimensional representation as needed to best show a currently interesting event.
    Type: Application
    Filed: July 9, 2013
    Publication date: January 9, 2014
Inventors: Rajesh K. Hegde, Zhengyou Zhang, Philip A. Chou, Cha Zhang, Zicheng Liu, Sasa Junuzovic
  • Patent number: 8620009
Abstract: Systems and methods for simulating a virtual sound source position by determining each loudspeaker's output from the loudspeaker's position in relation to a listener. The output of the respective loudspeakers is generated using aural cues that give the listener knowledge of the virtual position of the virtual sound source. Both a gain in intensity and a delay are simulated.
    Type: Grant
    Filed: June 17, 2008
    Date of Patent: December 31, 2013
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, James D. Johnston
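The two simulated cues named in the abstract, gain and delay, follow directly from geometry: intensity falls off with distance and arrival time grows as distance over the speed of sound. A minimal sketch, assuming 2-D positions in metres and a simple 1/distance gain law (the reference distance and clamping are assumptions):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s

def speaker_feed(virtual_src, speaker_pos, sample_rate=48000, ref_dist=1.0):
    """Gain and integer sample delay for one loudspeaker so the ensemble
    mimics a source at `virtual_src`.

    Gain falls off as ref_dist / distance (clamped so nearby sources do
    not blow up) and the delay is distance / speed of sound, quantized
    to whole samples.
    """
    d = math.hypot(virtual_src[0] - speaker_pos[0],
                   virtual_src[1] - speaker_pos[1])
    gain = ref_dist / max(d, ref_dist)
    delay_samples = int(round(sample_rate * d / SPEED_OF_SOUND))
    return gain, delay_samples
```

A loudspeaker closer to the virtual source plays louder and earlier than a farther one, which is exactly the pair of cues the listener's ears use to localize the phantom source.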
  • Publication number: 20130336524
    Abstract: The subject disclosure is directed towards a technology by which dynamic hand gestures are recognized by processing depth data, including in real-time. In an offline stage, a classifier is trained from feature values extracted from frames of depth data that are associated with intended hand gestures. In an online stage, a feature extractor extracts feature values from sensed depth data that corresponds to an unknown hand gesture. These feature values are input to the classifier as a feature vector to receive a recognition result of the unknown hand gesture. The technology may be used in real time, and may be robust to variations in lighting, hand orientation, and the user's gesturing speed and style.
    Type: Application
    Filed: June 18, 2012
    Publication date: December 19, 2013
Applicant: Microsoft Corporation
    Inventors: Zhengyou Zhang, Alexey Vladimirovich Kurakin
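The offline/online split described above can be sketched with a deliberately simple stand-in pipeline: a feature extractor that summarizes a depth frame as per-cell mean depths over a grid, and a nearest-centroid classifier in place of the trained classifier (both the grid feature and the centroid rule are illustrative assumptions, not the patent's actual feature set or model).

```python
def depth_features(depth_frame, grid=2):
    """Feature vector for one depth frame: mean depth of each cell in a
    grid x grid partition (a stand-in for richer gesture features)."""
    rows, cols = len(depth_frame), len(depth_frame[0])
    feats = []
    for gr in range(grid):
        for gc in range(grid):
            cell = [depth_frame[r][c]
                    for r in range(gr * rows // grid, (gr + 1) * rows // grid)
                    for c in range(gc * cols // grid, (gc + 1) * cols // grid)]
            feats.append(sum(cell) / len(cell))
    return feats

def nearest_centroid(feature_vec, centroids):
    """Online stage: classify by the closest per-gesture centroid.

    The offline stage would build `centroids` by averaging feature
    vectors extracted from labelled training frames.
    """
    best, best_d = None, float("inf")
    for label, c in centroids.items():
        d = sum((a - b) ** 2 for a, b in zip(feature_vec, c))
        if d < best_d:
            best, best_d = label, d
    return best
```

Because the features are computed from depth rather than color, the pipeline inherits the robustness to lighting changes that the abstract highlights.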
  • Patent number: 8605890
    Abstract: A multi-party spatial audio conferencing system is configured to receive far end signals from remote participants. The system comprises a speaker array that outputs spatialized sound signals and one or more microphones that capture and relay a sound signal comprising an echo of the spatialized sound signal to a multichannel acoustic echo cancellation (MC-AEC) unit having a plurality of echo cancellers. Respective echo cancellers perform cancellation of an echo associated with a far end signal from one of the multiple participants according to an algorithm based upon echo cancellation coefficients. The echo cancellation coefficients are determined from the input channel signals, the spatialization parameters associated with each input channel, and the audio signals captured by the microphones. This allows respective echo cancellation filters to be updated simultaneously even though the corresponding remote participant is not talking.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: December 10, 2013
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Qin Cai
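The adaptive echo cancellers in this abstract update filter coefficients from the far-end signals and the microphone capture. As a generic stand-in for the patented multichannel scheme, the sketch below shows one normalized-LMS (NLMS) update step for a single echo path; the step size and filter length are assumptions.

```python
def nlms_step(weights, x_hist, mic_sample, mu=0.5, eps=1e-6):
    """One NLMS update of an acoustic echo canceller.

    `x_hist` holds the most recent far-end samples (newest first).
    The filter predicts the echo, subtracts it from the microphone
    sample, and nudges its coefficients toward a smaller residual.
    """
    echo_est = sum(w * x for w, x in zip(weights, x_hist))
    err = mic_sample - echo_est               # echo-free residual
    norm = sum(x * x for x in x_hist) + eps   # input power normalization
    new_w = [w + mu * err * x / norm for w, x in zip(weights, x_hist)]
    return new_w, err
```

Driven by a persistently exciting far-end signal, the weights converge to the true echo path and the residual sent to the far end shrinks toward zero; the patent's contribution is coupling many such filters through the spatialization parameters so they adapt even for silent participants.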
  • Publication number: 20130321564
Abstract: A perspective-correct communication window system and method for communicating between participants in an online meeting who are not in the same physical location. Embodiments of the system and method provide an in-person communication experience by changing the virtual viewpoint for each participant viewing the online meeting. The participant sees a different perspective displayed on a monitor based on the location of the participant's eyes. Embodiments of the system and method include a capture and creation component that is used to capture visual data about each participant and create a realistic geometric proxy from the data. A scene geometry component is used to create a virtual scene geometry that mimics the arrangement of an in-person meeting. A virtual viewpoint component displays the changing virtual viewpoint to the viewer and can add perceived depth using motion parallax.
    Type: Application
    Filed: August 31, 2012
    Publication date: December 5, 2013
    Applicant: Microsoft Corporation
    Inventors: Yancey Christopher Smith, Eric G. Lang, Zhengyou Zhang, Christian F. Huitema
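The motion-parallax effect mentioned above amounts to translating the virtual camera opposite the tracked head motion. A minimal sketch, where the neutral eye position and the parallax scale factor are assumed parameters:

```python
def virtual_viewpoint(eye_pos_mm, neutral=(0.0, 0.0), scale=0.8):
    """Map the tracked eye position (mm offset from screen centre) to a
    virtual-camera translation; the scene shifts opposite the head so
    nearer objects appear to move more, creating motion parallax."""
    dx = -(eye_pos_mm[0] - neutral[0]) * scale
    dy = -(eye_pos_mm[1] - neutral[1]) * scale
    return dx, dy
```

Re-rendering the scene geometry from this shifted camera each frame is what makes the flat monitor read as a window into the meeting.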
  • Patent number: 8600731
Abstract: The claimed subject matter provides a system and/or a method that facilitates communication within a telepresence session. A telepresence session can be initiated within a communication framework that includes two or more virtually represented users that communicate therein. The telepresence session can include at least one virtually represented user that communicates in a first language, where the communication is at least one of a portion of audio, a portion of video, a portion of graphics, a gesture, or a portion of text. An interpreter component can evaluate the communication to translate the identified first language into a second language within the telepresence session; the translation is automatically provided to at least one virtually represented user within the telepresence session.
    Type: Grant
    Filed: February 4, 2009
    Date of Patent: December 3, 2013
    Assignee: Microsoft Corporation
    Inventors: Sharon Kay Cunnington, Jin Li, Michel Pahud, Rajesh K. Hegde, Zhengyou Zhang
  • Publication number: 20130294710
    Abstract: A temporal information integration dis-occlusion system and method for using historical data to reconstruct a virtual view containing an occluded area. Embodiments of the system and method use temporal information of the scene captured previously to obtain a total history. This total history is warped onto information captured by a camera at a current time in order to help reconstruct the dis-occluded areas. The historical data (or frames) from the total history match only a portion of the frames contained in the captured information. This warping yields warped history information. Warping is performed by using one of two embodiments to match points in an estimation of the current information to points in the captured information. Next, regions of current information are split using a classifier. The warped history information and the captured information then are merged to obtain an estimate for the current information and the reconstructed virtual view.
    Type: Application
    Filed: May 4, 2012
    Publication date: November 7, 2013
    Applicant: Microsoft Corporation
    Inventors: Philip Andrew Chou, Cha Zhang, Zhengyou Zhang, Shujie Liu
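The final merge step described above, combining the warped history with the currently captured information, can be sketched as a per-pixel rule: a pixel observed at the current time always wins, and a dis-occluded (hole) pixel is filled from the history warped into the current viewpoint. The hole marker and grid representation are assumptions for the example.

```python
HOLE = None  # marker for a dis-occluded (unknown) pixel

def merge_views(current, warped_history):
    """Fill dis-occluded pixels of the current virtual view from the
    warped history; pixels observed now take priority over history."""
    return [[cur if cur is not HOLE else hist
             for cur, hist in zip(row_c, row_h)]
            for row_c, row_h in zip(current, warped_history)]
```

The hard part the patent addresses is producing `warped_history` in the first place, matching points in an estimate of the current frame to the captured frame so old observations land in the right place.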
  • Patent number: 8537196
    Abstract: Multi-device capture and spatial browsing of conferences is described. In one implementation, a system detects cameras and microphones, such as the webcams on participants' notebook computers, in a conference room, group meeting, or table game, and enlists an ad-hoc array of available devices to capture each participant and the spatial relationships between participants. A video stream composited from the array is browsable by a user to navigate a 3-dimensional representation of the meeting. Each participant may be represented by a video pane, a foreground object, or a 3-D geometric model of the participant's face or body displayed in spatial relation to the other participants in a 3-dimensional arrangement analogous to the spatial arrangement of the meeting.
    Type: Grant
    Filed: October 6, 2008
    Date of Patent: September 17, 2013
    Assignee: Microsoft Corporation
    Inventors: Rajesh K. Hegde, Zhengyou Zhang, Philip A. Chou, Cha Zhang, Zicheng Liu, Sasa Junuzovic
  • Publication number: 20130232515
    Abstract: Technologies described herein relate to estimating engagement of a person with respect to content being presented to the person. A sensor outputs a stream of data relating to the person as the person is consuming the content. At least one feature is extracted from the stream of data, and a level of engagement of the person is estimated based at least in part upon the at least one feature. A computing function is performed based upon the estimated level of engagement of the person.
    Type: Application
    Filed: April 19, 2013
    Publication date: September 5, 2013
    Applicant: Microsoft Corporation
    Inventors: Javier Hernandez Rivera, Zicheng Liu, Geoff Hulten, Michael Conrad, Kyle Krum, David DeBarr, Zhengyou Zhang
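The pipeline above (sensor stream, feature extraction, engagement estimate, computing function) can be illustrated with two made-up features and a coarse thresholded score; the feature names, weights, and level boundaries are all assumptions for the sketch, not the patented estimator.

```python
def engagement_level(gaze_on_screen, lean_forward_ratio,
                     w_gaze=0.7, w_lean=0.3):
    """Blend two example features, the fraction of frames with gaze on
    the screen and the fraction spent leaning toward the display, into
    a coarse engagement label plus a raw score."""
    score = w_gaze * gaze_on_screen + w_lean * lean_forward_ratio
    if score >= 0.66:
        return "engaged", score
    if score >= 0.33:
        return "neutral", score
    return "disengaged", score
```

A downstream computing function might, for instance, pause or adapt the presented content whenever the label drops to "disengaged".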