Patents by Inventor Zhengyou Zhang

Zhengyou Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20130121526
    Abstract: A three-dimensional shape parameter computation system and method for computing three-dimensional human head shape parameters from two-dimensional facial feature points. A series of images containing a user's face is captured. Embodiments of the system and method deduce the 3D parameters of the user's head by examining a series of captured images of the user over time and in a variety of head poses and facial expressions, and then computing an average. An energy function is constructed over a batch of frames containing 2D face feature points obtained from the captured images, and the energy function is minimized to solve for the head shape parameters valid for the batch of frames. Head pose parameters and facial expression and animation parameters can vary over each captured image in the batch of frames. In some embodiments, this minimization is performed using a modified Gauss-Newton minimization technique with a single iteration.
    Type: Application
    Filed: November 11, 2011
    Publication date: May 16, 2013
    Applicant: Microsoft Corporation
    Inventors: Nikolay Smolyanskiy, Christian F. Huitema, Cha Zhang, Lin Liang, Sean Eron Anderson, Zhengyou Zhang
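The single-iteration Gauss-Newton minimization mentioned in this abstract can be illustrated on a toy least-squares problem. This is a generic sketch of the update p ← p − (JᵀJ)⁻¹Jᵀr for a two-parameter energy, not the patent's face-shape model; the data and function names here are illustrative only.

```python
def gauss_newton_step(params, residual_fn, jacobian_fn):
    """One Gauss-Newton update p <- p - (J^T J)^-1 J^T r (2-parameter case)."""
    r = residual_fn(params)
    J = jacobian_fn(params)
    # Normal equations (J^T J) dp = -J^T r, solved in closed form for 2 params.
    a = sum(j[0] * j[0] for j in J)
    b = sum(j[0] * j[1] for j in J)
    c = sum(j[1] * j[1] for j in J)
    g0 = sum(j[0] * ri for j, ri in zip(J, r))
    g1 = sum(j[1] * ri for j, ri in zip(J, r))
    det = a * c - b * b
    dp0 = -(c * g0 - b * g1) / det
    dp1 = -(-b * g0 + a * g1) / det
    return (params[0] + dp0, params[1] + dp1)

# Toy problem: fit y = m*x + k to points. The residuals are linear in the
# parameters, so a single iteration reaches the exact least-squares solution.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]                     # y = 2x + 1 exactly
res = lambda p: [p[0] * x + p[1] - y for x, y in zip(xs, ys)]
jac = lambda p: [(x, 1.0) for x in xs]
m, k = gauss_newton_step((0.0, 0.0), res, jac)
```

For the nonlinear energy of the patent (pose, expression, and shape parameters varying per frame), one such step would be taken per batch rather than iterating to convergence.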
  • Patent number: 8406483
    Abstract: Techniques for face verification are described. Local binary pattern (LBP) features and boosting classifiers are used to verify faces in images. A boosted multi-task learning algorithm is used for face verification in images. Finally, boosted face verification is used to verify faces in videos.
    Type: Grant
    Filed: June 26, 2009
    Date of Patent: March 26, 2013
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Xiaogang Wang, Zhengyou Zhang
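The local binary pattern (LBP) features named in this abstract admit a compact sketch: each pixel's neighbours are thresholded at the centre value and the comparison bits packed into a code, with histograms of codes serving as the features fed to boosting classifiers. A minimal illustration follows; the neighbour ordering and histogram layout are arbitrary choices here, not the patent's.

```python
def lbp_code(img, r, c):
    """8-neighbour local binary pattern: threshold each neighbour at the
    centre value and pack the comparison bits, clockwise from top-left."""
    center = img[r][c]
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offs):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over interior pixels -- the kind of
    descriptor a boosting classifier could consume as a face feature."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist

img = [[10, 20, 30],
       [40, 25, 15],
       [ 5, 50, 35]]
code = lbp_code(img, 1, 1)
hist = lbp_histogram(img)
```

In practice such histograms are computed per image block and concatenated before boosting selects the discriminative bins.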
  • Patent number: 8405706
    Abstract: A videoconferencing conferee may be provided with feedback on his or her location relative to a local video camera by altering how remote videoconference video is displayed on a local videoconference display viewed by the conferee. The conferee's location may be tracked and the displayed remote video may be altered in accordance with the changing location of the conferee. The remote video may appear to move in directions mirroring movement of the conferee. This effect may be achieved by modeling the remote video as offset behind a virtual portal corresponding to the display. The remote video may be displayed according to a view of the remote video through the virtual portal. As the conferee's position changes, the view through the portal changes, and the remote video changes accordingly.
    Type: Grant
    Filed: December 17, 2008
    Date of Patent: March 26, 2013
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Christian Huitema, Alejandro Acero
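The virtual-portal geometry in this abstract reduces to similar triangles: a point on the remote-video plane, sitting some depth behind the screen, is rendered where the line from the viewer's eye to that point crosses the screen plane. A toy sketch under assumed units (metres, screen at z = 0, eye in front at positive z); this is one plausible reading of the geometry, not the patent's implementation.

```python
def portal_projection(eye_x, eye_z, point_x, depth_behind):
    """Where a point on the remote-video plane (depth_behind the screen)
    appears on the screen (z = 0) as seen from an eye at (eye_x, eye_z)."""
    t = eye_z / (eye_z + depth_behind)   # fraction of the eye-to-point ray at z = 0
    return eye_x + t * (point_x - eye_x)

# Eye centred, 1 m from a portal whose video plane sits 0.5 m behind it.
centred = portal_projection(0.0, 1.0, 0.0, 0.5)
# Move the eye 0.3 m to the right: the point's on-screen position shifts
# by less than the eye's motion, producing the motion-parallax cue.
shifted = portal_projection(0.3, 1.0, 0.0, 0.5)
```

Re-rendering the remote video from the tracked eye position each frame is what makes the display behave like a window rather than a flat picture.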
  • Patent number: 8401979
    Abstract: Described is multiple category learning to jointly train a plurality of classifiers in an iterative manner. Each training iteration associates an adaptive label with each training example, and during the iterations the adaptive label of any example can be changed by subsequent reclassification. In this manner, any mislabeled training example is corrected by the classifiers during training. The training may use a probabilistic multiple category boosting algorithm that maintains probability data provided by the classifiers, or a winner-take-all multiple category boosting algorithm that selects the adaptive label based upon the highest-probability classification. The multiple category boosting training system may be coupled to a multiple instance learning mechanism to obtain the training examples. The trained classifiers may be used as weak classifiers that provide a label used to select a deep classifier for further classification, e.g., to provide a multi-view object detector.
    Type: Grant
    Filed: November 16, 2009
    Date of Patent: March 19, 2013
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Zhengyou Zhang
  • Patent number: 8396247
    Abstract: A system that facilitates automatically determining an action of an animate object is described herein. The system includes a receiver component that receives video data that includes images of an animate object. The system additionally includes a determiner component that accesses a data store that includes an action graph and automatically determines an action undertaken by the animate object in the received video data based at least in part upon the action graph. The action graph comprises a plurality of nodes that are representative of multiple possible postures of the animate object. At least one node in the action graph is shared amongst multiple actions represented in the action graph.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: March 12, 2013
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Wanqing Li, Zicheng Liu
  • Publication number: 20130010079
    Abstract: A system described herein includes a receiver component that receives a first digital image from a color camera, wherein the first digital image comprises a planar object, and a second digital image from a depth sensor, wherein the second digital image comprises the planar object. The system also includes a calibrator component that jointly calibrates the color camera and the depth sensor based at least in part upon the first digital image and the second digital image.
    Type: Application
    Filed: July 8, 2011
    Publication date: January 10, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Cha Zhang, Zhengyou Zhang
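Joint color/depth calibration against a planar object typically begins by recovering the plane each sensor observes. As one hedged illustration (a generic first step, not the patent's calibration algorithm), the depth sensor's samples of the calibration plane can be fit by least squares as z = ax + by + c via the 3×3 normal equations:

```python
def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to 3-D points, e.g. depth-sensor
    samples of a calibration plane. Solves the 3x3 normal equations directly."""
    sxx = sxy = syy = sx = sy = n = 0.0
    sxz = syz = sz = 0.0
    for x, y, z in points:
        sxx += x * x; sxy += x * y; syy += y * y
        sx += x; sy += y; n += 1.0
        sxz += x * z; syz += y * z; sz += z
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(A)
    sol = []
    for i in range(3):          # Cramer's rule, one column swap per unknown
        Ai = [row[:] for row in A]
        for r in range(3):
            Ai[r][i] = rhs[r]
        sol.append(det3(Ai) / d)
    return tuple(sol)           # (a, b, c)

# Noise-free samples of the plane z = 0.5*x - 0.25*y + 2.
pts = [(x, y, 0.5 * x - 0.25 * y + 2.0) for x in range(3) for y in range(3)]
a, b, c = fit_plane(pts)
```

With the plane known in both sensors' frames, the relative pose between the color camera and the depth sensor can then be estimated from the plane correspondences.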
  • Patent number: 8340267
    Abstract: The claimed subject matter relates to an architecture that can preprocess audio portions of communications in order to enrich multiparty communication sessions or environments. In particular, the architecture can provide both a public channel for public communications that are received by substantially all connected parties and can further provide a private channel for private communications that are received by a selected subset of all connected parties. Most particularly, the architecture can apply an audio transform to communications that occur during the multiparty communication session based upon a target audience of the communication. By way of illustration, the architecture can apply a whisper transform to private communications, an emotion transform based upon relationships, an ambience or spatial transform based upon physical locations, or a pace transform based upon lack of presence.
    Type: Grant
    Filed: February 5, 2009
    Date of Patent: December 25, 2012
    Assignee: Microsoft Corporation
    Inventors: Dinei A. Florencio, Alejandro Acero, William Buxton, Phillip A. Chou, Ross G. Cutler, Jason Garms, Christian Huitema, Kori M. Quinn, Daniel Allen Rosenfeld, Zhengyou Zhang
  • Patent number: 8339459
    Abstract: Techniques and technologies for tracking a face with a plurality of cameras wherein the geometry between the cameras is initially unknown. One disclosed method includes detecting a head with two of the cameras and registering a head model with the image of the head as detected by one of the cameras. The method also includes back-projecting the head image detected by the other camera to the head model and determining a head pose from the back-projected head image. The camera geometry determined in this way is then used to track the face with at least one of the cameras.
    Type: Grant
    Filed: September 16, 2009
    Date of Patent: December 25, 2012
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Aswin Sankaranarayanan, Qing Zhang, Zicheng Liu, Qin Cai
  • Publication number: 20120306995
    Abstract: A system facilitates managing one or more devices utilized for communicating data within a telepresence session. A telepresence session can be initiated within a communication framework that includes a first user and one or more second users. In response to determining a temporary absence of the first user from the telepresence session, a recordation of the telepresence session is initialized to enable a playback of a portion or a summary of the telepresence session that the first user has missed.
    Type: Application
    Filed: August 13, 2012
    Publication date: December 6, 2012
    Applicant: Microsoft Corporation
    Inventors: Christian Huitema, William A.S. Buxton, Jonathan E. Paff, Zicheng Liu, Rajesh Kutpadi Hegde, Zhengyou Zhang, Kori Marie Quinn, Jin Li, Michel Pahud
  • Publication number: 20120294510
    Abstract: A depth construction module is described that receives depth images provided by two or more depth capture units. Each depth capture unit generates its depth image using a structured light technique, that is, by projecting a pattern onto an object and receiving a captured image in response thereto. The depth construction module then identifies at least one deficient portion in at least one depth image that has been received, which may be attributed to overlapping projected patterns that impinge on the object. The depth construction module then uses a multi-view reconstruction technique, such as a plane sweeping technique, to supply depth information for the deficient portion. In another mode, a multi-view reconstruction technique can be used to produce an entire depth scene based on captured images received from the depth capture units, that is, without first identifying deficient portions in the depth images.
    Type: Application
    Filed: May 16, 2011
    Publication date: November 22, 2012
    Applicant: Microsoft Corporation
    Inventors: Cha Zhang, Wenwu Zhu, Zhengyou Zhang, Philip A. Chou
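The plane-sweeping reconstruction this abstract invokes can be miniaturised to one dimension: hypothesise a set of depth planes, warp one view by the disparity each depth implies, and keep the depth whose warp best explains the other view. The following is a toy sketch with made-up numbers; real plane sweeping operates on 2-D images over many candidate planes.

```python
def plane_sweep_depth(left, right, depths, baseline, focal):
    """Toy 1-D plane sweep: for each candidate depth, warp the right signal
    by the implied disparity and keep the depth with the lowest photometric
    cost (mean absolute difference against the left signal)."""
    best_depth, best_cost = None, float("inf")
    for d in depths:
        disparity = round(baseline * focal / d)   # integer pixel shift
        cost, count = 0.0, 0
        for i in range(len(left)):
            j = i + disparity       # a left-view point appears shifted in the right view
            if 0 <= j < len(right):
                cost += abs(left[i] - right[j])
                count += 1
        if count and cost / count < best_cost:
            best_cost, best_depth = cost / count, d
    return best_depth

# The right view is the left view shifted by 4 pixels; with
# baseline * focal = 8, that disparity corresponds to a depth of 2.
left  = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0]
right = [0, 0, 0, 0, 0, 0, 1, 5, 9, 5, 1, 0]
depth = plane_sweep_depth(left, right, [1, 2, 4, 8], baseline=2.0, focal=4.0)
```

A deficient region in one structured-light depth image could then be filled with depths recovered this way from the overlapping colour or infrared views.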
  • Publication number: 20120287223
    Abstract: The described implementations relate to enhancing images, such as in videoconferencing scenarios. One system includes a poriferous display screen having generally opposing front and back surfaces. This system also includes a camera positioned proximate to the back surface to capture an image through the poriferous display screen.
    Type: Application
    Filed: May 11, 2011
    Publication date: November 15, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Cha Zhang, Timothy A. Large, Zhengyou Zhang, Ruigang Yang
  • Patent number: 8310521
    Abstract: The present virtual video muting technique seamlessly inserts a virtual video into a live video when the user does not want to reveal his or her actual activity. The virtual video is generated from real video frames captured earlier, and thus appears to be real.
    Type: Grant
    Filed: April 30, 2007
    Date of Patent: November 13, 2012
    Assignee: Microsoft Corp.
    Inventors: Zhengyou Zhang, Aaron Bobick
  • Publication number: 20120280974
    Abstract: Dynamic texture mapping is used to create a photorealistic three dimensional animation of an individual with facial features synchronized with desired speech. Audiovisual data of an individual reading a known script is obtained and stored in an audio library and an image library. The audiovisual data is processed to extract feature vectors used to train a statistical model. An input audio feature vector corresponding to desired speech with which the animation will be synchronized is provided. The statistical model is used to generate a trajectory of visual feature vectors that corresponds to the input audio feature vector. These visual feature vectors are used to identify a matching image sequence from the image library. The resulting sequence of images, concatenated from the image library, provides a photorealistic image sequence with facial features, such as lip movements, synchronized with the desired speech. This image sequence is applied to the three-dimensional model.
    Type: Application
    Filed: May 3, 2011
    Publication date: November 8, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Lijuan Wang, Frank Soong, Qiang Huo, Zhengyou Zhang
  • Publication number: 20120281059
    Abstract: The subject disclosure is directed towards an immersive conference, in which participants in separate locations are brought together into a common virtual environment (scene), such that they appear to each other to be in a common space, with geometry, appearance, and real-time natural interaction (e.g., gestures) preserved. In one aspect, depth data and video data are processed to place remote participants in the common scene from the first person point of view of a local participant. Sound data may be spatially controlled, and parallax computed to provide a realistic experience. The scene may be augmented with various data, videos and other effects/animations.
    Type: Application
    Filed: May 4, 2011
    Publication date: November 8, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Philip A. Chou, Zhengyou Zhang, Cha Zhang, Dinei A. Florencio, Zicheng Liu, Rajesh K. Hegde, Nirupama Chandrasekaran
  • Publication number: 20120268563
    Abstract: A person is provided with the ability to auditorily determine the spatial geometry of his current physical environment. A spatial map of the current physical environment of the person is generated. The spatial map is then used to generate a spatialized audio representation of the environment. The spatialized audio representation is then output to a stereo listening device which is being worn by the person.
    Type: Application
    Filed: April 22, 2011
    Publication date: October 25, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Philip A. Chou, Zhengyou Zhang, Dinei Florencio
  • Publication number: 20120262536
    Abstract: Stereophonic teleconferencing system embodiments are described which advantageously employ a microphone array at a remote conference site having multiple conferencees to produce a separate output channel from each microphone in the array. Audio data streams each representing one of the audio output channels from the microphone array are then sent to a local conference site where a local conferencee is in attendance. The voices of the aforementioned remote conferencees are spatialized within a sound-field of the local site using multiple loudspeakers. Generally, this involves receiving the monophonic audio data streams from the remote site, and processing them to generate an audio signal for each loudspeaker. Each of the generated audio signals is then played through its respective loudspeaker to produce a spatial audio sound-field which is audibly perceived by the local conferencee as having the voice of each of the remote conferencees coming from a different location.
    Type: Application
    Filed: April 14, 2011
    Publication date: October 18, 2012
    Applicant: Microsoft Corporation
    Inventors: Wei-ge Chen, Zhengyou Zhang
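Placing each remote talker at a distinct location between two loudspeakers can be sketched with classic constant-power panning. This is a much-simplified stand-in for the spatialization processing a full system would use, and the 60° stage width below is an arbitrary assumption.

```python
import math

def pan_gains(azimuth_deg, width_deg=60.0):
    """Constant-power panning: place a mono source at an azimuth between two
    loudspeakers. Returns (left_gain, right_gain) with gl^2 + gr^2 = 1."""
    # Map azimuth in [-width/2, +width/2] to a pan angle in [0, pi/2];
    # 0 is hard left, pi/2 is hard right.
    x = (azimuth_deg + width_deg / 2.0) / width_deg
    theta = x * math.pi / 2.0
    return math.cos(theta), math.sin(theta)

def spatialize(samples, azimuth_deg):
    """Produce left/right loudspeaker signals for one talker's mono stream."""
    gl, gr = pan_gains(azimuth_deg)
    return [s * gl for s in samples], [s * gr for s in samples]

gl, gr = pan_gains(0.0)   # a centred talker gets equal gains in both channels
```

Each remote conferencee's channel would be panned to its own azimuth and the resulting pairs summed into the two loudspeaker feeds.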
  • Patent number: 8281334
    Abstract: Systems, methods, computer-readable media, and graphical user interfaces for facilitating advertisement placement over video content are provided. Images within a video are partitioned into image regions. Upon partitioning images into image regions, an intrusiveness score is determined for each image region. Based on the intrusiveness scores, optimal placement of an advertisement within the video is determined.
    Type: Grant
    Filed: March 31, 2008
    Date of Patent: October 2, 2012
    Assignee: Microsoft Corporation
    Inventors: Ying Shan, Yue Zhou, Xu Liu, Ying Li, Zhengyou Zhang
  • Patent number: 8253774
    Abstract: The claimed subject matter provides a system and/or a method that facilitates managing one or more devices utilized for communicating data within a telepresence session. A telepresence session can be initiated within a communication framework that includes two or more virtually represented users that communicate therein. A device can be utilized by at least one virtually represented user that enables communication within the telepresence session, the device includes at least one of an input to transmit a portion of a communication to the telepresence session or an output to receive a portion of a communication from the telepresence session. A detection component can adjust at least one of the input related to the device or the output related to the device based upon the identification of a cue, the cue is at least one of a movement detected, an event detected, or an ambient variation.
    Type: Grant
    Filed: March 30, 2009
    Date of Patent: August 28, 2012
    Assignee: Microsoft Corporation
    Inventors: Christian Huitema, William A. S. Buxton, John E. Paff, Zicheng Liu, Rajesh Kutpadi Hegde, Zhengyou Zhang, Kori Marie Quinn, Jin Li, Michel Pahud
  • Patent number: 8233353
    Abstract: A multi-sensor sound source localization (SSL) technique is presented which provides a true maximum likelihood (ML) treatment for microphone arrays having more than one pair of audio sensors. Generally, this is accomplished by selecting a sound source location that results in a time of propagation from the sound source to the audio sensors of the array, which maximizes a likelihood of simultaneously producing audio sensor output signals inputted from all the sensors in the array. The likelihood includes a unique term that estimates an unknown audio sensor response to the source signal for each of the sensors in the array.
    Type: Grant
    Filed: January 26, 2007
    Date of Patent: July 31, 2012
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Dinei Florencio, Zhengyou Zhang
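The core search in sound source localization, scoring candidate source positions by how well they explain the arrival times at every sensor, can be sketched with a delay-and-sum steered response. This is a simplified stand-in for the patent's maximum-likelihood criterion (which additionally estimates per-sensor responses); the geometry and units below are toy assumptions.

```python
def srp_localize(signals, mic_x, candidates, fs, c=343.0):
    """Steered response: score each candidate source position by the energy
    of the delay-aligned sum of the microphone signals, and return the
    highest-scoring position."""
    best_x, best_power = None, -1.0
    n = len(signals[0])
    for sx in candidates:
        # Integer sample delays from the candidate position to each microphone.
        delays = [round(abs(sx - mx) / c * fs) for mx in mic_x]
        d0 = min(delays)
        power = 0.0
        for t in range(n):
            s = 0.0
            for sig, d in zip(signals, delays):
                idx = t + (d - d0)       # advance later arrivals into alignment
                if idx < n:
                    s += sig[idx]
            power += s * s
        if power > best_power:
            best_power, best_x = power, sx
    return best_x

# Toy setup (unit speed of sound, unit sample rate): a pulse emitted at
# x = 1.0 reaches the microphone at x = 0.0 one sample before the one at 3.0.
sig0 = [0, 0, 0, 5, 0, 0, 0, 0]
sig1 = [0, 0, 0, 0, 5, 0, 0, 0]
src_x = srp_localize([sig0, sig1], mic_x=[0.0, 3.0],
                     candidates=[1.0, 2.0], fs=1.0, c=1.0)
```

With more than one sensor pair, this grid search is what the abstract's likelihood replaces with a principled joint treatment of all sensors at once.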
  • Publication number: 20120155680
    Abstract: The disclosed architecture employs signal processing techniques to provide audio-only perception, or audio perception that matches the visual perception. This also provides spatial audio reproduction for multiparty teleconferencing such that the teleconferencing participants perceive themselves as if they were sitting in the same room. The solution is based on the premise that people perceive sounds as a reconstructed wavefront, and hence, the wavefronts are used to provide the spatial perceptual cues. The differences between the spatial perceptual cues derived from the reconstructed wavefront of sound waves and the ideal wavefront of sound waves form an objective metric for spatial perceptual quality, and provide the means of evaluating the overall system performance. Additionally, compensation filters are employed to improve the spatial perceptual quality of stereophonic systems by optimizing the objective metrics.
    Type: Application
    Filed: December 17, 2010
    Publication date: June 21, 2012
    Applicant: Microsoft Corporation
    Inventors: Wei-ge Chen, Zhengyou Zhang, Yoomi Hur