Patents by Inventor Zhengyou Zhang

Zhengyou Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7274388
    Abstract: Calibration for a camera is achieved by receiving images of a calibration object whose geometry is one-dimensional in space. The received images show the calibration object in several distinct positions. Calibration for the camera is then calculated based on the received images of the calibration object.
    Type: Grant
    Filed: January 5, 2006
    Date of Patent: September 25, 2007
    Assignee: Microsoft Corporation
    Inventor: Zhengyou Zhang
  • Patent number: 7272256
    Abstract: A method and a system for correlating pixels between two digital images. In general, the present invention uses a progressive iterative technique that finds generally unambiguous pixel matches by beginning with a few reliable pixel matches and finding progressively more unambiguous pixel matches. Unambiguous pixel matches in the current iteration are then found using the correlation technique and based on a correlation score associated with a pixel match. The search range is capable of being rotated, and is part of a novel correlation technique of the present invention that provides a more robust estimate of pixel match reliability. Potential pixel matches found in the search ranges are tested for ambiguity and any unambiguous matches are selected and added to the set of reliable pixel matches. The ambiguity testing includes determining a correlation score for the pixel match and classifying the match based on the correlation score.
    Type: Grant
    Filed: December 30, 2004
    Date of Patent: September 18, 2007
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Ying Shan
  • Publication number: 20070206878
    Abstract: A real-time approximately 360 degree image correction system and a method for alleviating distortion and perception problems in images captured by omni-directional cameras. In general, the real-time panoramic image correction method generates a warp table from pixel coordinates of a panoramic image and applies the warp table to the panoramic image to create a corrected panoramic image. The corrections are performed using a parametric class of warping functions that include Spatially Varying Uniform (SVU) scaling functions. The SVU scaling functions and scaling factors are used to perform vertical scaling and horizontal scaling on the panoramic image pixel coordinates. A horizontal distortion correction is performed using the SVU scaling functions with at least two different scaling factors. This processing generates a warp table that can be applied to the panoramic image to yield the corrected panoramic image.
    Type: Application
    Filed: August 18, 2006
    Publication date: September 6, 2007
    Applicant: Microsoft Corporation
    Inventors: Zicheng Liu, Ross Cutler, Michael Cohen, Zhengyou Zhang
  • Publication number: 20070198257
    Abstract: Speaker authentication is performed by determining a similarity score for a test utterance and a stored training utterance. Computing the similarity score involves determining the sum of a group of functions, where each function includes the product of a posterior probability of a mixture component and a difference between an adapted mean and a background mean. The adapted mean is formed based on the background mean and the test utterance. The speech content provided by the speaker for authentication can be text-independent (i.e., any content they want to say) or text-dependent (i.e., a particular phrase used for training).
    Type: Application
    Filed: February 20, 2006
    Publication date: August 23, 2007
    Applicant: Microsoft Corporation
    Inventors: Zhengyou Zhang, Ming Liu
  • Patent number: 7260257
    Abstract: A system that captures both whiteboard content and audio signals of a meeting using a digital camera and a microphone. The system can be retrofit to any existing whiteboard. It computes the time stamps of pen strokes on the whiteboard by analyzing the sequence of captured snapshots. It also automatically produces a set of key frames representing all the written content on the whiteboard before each erasure. The whiteboard content serves as a visual index to efficiently browse the audio meeting. The system not only captures the whiteboard content, but also helps the users to view and manage the captured meeting content efficiently and securely.
    Type: Grant
    Filed: June 19, 2002
    Date of Patent: August 21, 2007
    Assignee: Microsoft Corp.
    Inventors: Zhengyou Zhang, Ross Cutler, Zicheng Liu, Anoop Gupta, Li-wei He
  • Patent number: 7260278
    Abstract: A system that captures both whiteboard content and audio signals of a meeting using a video camera and records or transmits them in real-time. The Real-Time Whiteboard Capture System (RTWCS) captures pen strokes on whiteboards in real time using an off-the-shelf video camera. Unlike many existing tools, the RTWCS does not instrument the pens or the whiteboard. It analyzes the sequence of captured video images in real time, classifies the pixels into whiteboard background, pen strokes and foreground objects (e.g., people in front of the whiteboard), and extracts newly written pen strokes. This allows the RTWCS to transmit whiteboard contents using very low bandwidth to remote meeting participants. Combined with other teleconferencing tools such as voice conference and application sharing, the RTWCS becomes a powerful tool to share ideas during online meetings.
    Type: Grant
    Filed: March 19, 2004
    Date of Patent: August 21, 2007
    Assignee: Microsoft Corp.
    Inventors: Zhengyou Zhang, Liwei He
  • Publication number: 20070177183
    Abstract: A system for generating soft copy (digital) versions of hard copy documents uses images of the hard copy documents. The images may be captured using a device suitable for capturing images, like a camera phone. Once available, the images may be processed to improve their suitability for document generation. The images may then be processed to recognize and generate soft copy versions of the documents represented by the images.
    Type: Application
    Filed: February 2, 2006
    Publication date: August 2, 2007
    Applicant: Microsoft Corporation
    Inventors: Merle Robinson, Matthieu Uyttendaele, Zhengyou Zhang, Patrice Simard
  • Patent number: 7250965
    Abstract: Calibration for a camera is achieved by receiving images of a calibration object whose geometry is one-dimensional in space. The received images show the calibration object in several distinct positions. Calibration for the camera is then calculated based on the received images of the calibration object.
    Type: Grant
    Filed: January 5, 2006
    Date of Patent: July 31, 2007
    Assignee: Microsoft Corporation
    Inventor: Zhengyou Zhang
  • Publication number: 20070172144
    Abstract: A video clip is processed by selecting a plurality of video frames of the video clip. A plurality of the pixels of the selected video frames are modified to form modified video frames. The modification to each of the plurality of the pixels is based on the intensity of the pixel, a change in the intensity of the pixel from the corresponding pixel in at least one related video frame, and the intensity of the corresponding pixel. A second video clip is formed that comprises the modified video frames.
    Type: Application
    Filed: January 26, 2006
    Publication date: July 26, 2007
    Applicant: Microsoft Corporation
    Inventors: Zhengyou Zhang, An Xu, Chunhui Zhang
  • Publication number: 20070156816
    Abstract: A system that captures both whiteboard content and audio signals of a meeting using a video camera and records or transmits them in real-time. The Real-Time Whiteboard Capture System (RTWCS) captures pen strokes on whiteboards in real time using an off-the-shelf video camera. Unlike many existing tools, the RTWCS does not instrument the pens or the whiteboard. It analyzes the sequence of captured video images in real time, classifies the pixels into whiteboard background, pen strokes and foreground objects (e.g., people in front of the whiteboard), and extracts newly written pen strokes. This allows the RTWCS to transmit whiteboard contents using very low bandwidth to remote meeting participants. Combined with other teleconferencing tools such as voice conference and application sharing, the RTWCS becomes a powerful tool to share ideas during online meetings.
    Type: Application
    Filed: March 21, 2007
    Publication date: July 5, 2007
    Applicant: Microsoft Corporation
    Inventors: Zhengyou Zhang, Liwei He
  • Publication number: 20070150263
    Abstract: A frame of a speech signal is converted into the spectral domain to identify a plurality of frequency components and an energy value for the frame is determined. The plurality of frequency components is divided by the energy value for the frame to form energy-normalized frequency components. A model is then constructed from the energy-normalized frequency components and can be used for speech recognition and speech enhancement.
    Type: Application
    Filed: December 23, 2005
    Publication date: June 28, 2007
    Applicant: Microsoft Corporation
    Inventors: Zhengyou Zhang, Alejandro Acero, Amarnag Subramanya, Zicheng Liu
  • Publication number: 20070126755
    Abstract: A system that captures both whiteboard content and audio signals of a meeting using a digital camera and a microphone. The system can be retrofit to any existing whiteboard. It computes the time stamps of pen strokes on the whiteboard by analyzing the sequence of captured snapshots. It also automatically produces a set of key frames representing all the written content on the whiteboard before each erasure. The whiteboard content serves as a visual index to efficiently browse the audio meeting. The system not only captures the whiteboard content, but also helps the users to view and manage the captured meeting content efficiently and securely.
    Type: Application
    Filed: November 30, 2006
    Publication date: June 7, 2007
    Applicant: Microsoft Corporation
    Inventors: Zhengyou Zhang, Ross Cutler, Zicheng Liu, Anoop Gupta, Li-wei He
  • Publication number: 20070122039
    Abstract: An “Image Segmenter” provides a variational energy formulation for segmentation of natural objects from images. In general, the Image Segmenter operates by adopting Gaussian mixture models (GMM) to capture the appearance variation of objects in one or more images. A global image data likelihood potential is then computed and combined with local region potentials to obtain a robust and accurate estimation of pixel foreground and background distributions. Iterative minimization of a “global-local energy function” is then accomplished by evolution of a foreground/background boundary curve by level set, and estimation of a foreground/background model by fixed-point iteration, termed “quasi-semi-supervised EM.” In various embodiments, this process is further improved by providing general object shape information for use in rectifying objects segmented from the image.
    Type: Application
    Filed: November 29, 2005
    Publication date: May 31, 2007
    Applicant: Microsoft Corporation
    Inventors: Zhengyou Zhang, Zicheng Liu, Gang Hua
  • Patent number: 7224847
    Abstract: A system and method for streaming whiteboard content to computing devices in a networked environment. The invention is an extension of whiteboard image generation technology to provide network-based collaboration of a target meeting. In one aspect, each networked client can receive audio content and whiteboard content (video images). In another aspect, each networked client can transmit audio content and annotation content which is displayed separately or generated on the whiteboard image. The streaming content is built on external collaboration frameworks.
    Type: Grant
    Filed: June 17, 2003
    Date of Patent: May 29, 2007
    Assignee: Microsoft Corp.
    Inventors: Zhengyou Zhang, Li-wei He
  • Publication number: 20070112906
    Abstract: Infrastructure for a multi-modal multilingual communications device (MMCD) is presented. A communications component is provided that includes wireless and wired IP networks (e.g., LANs, MANs, and WANs), as well as cellular and/or wired telecommunications networks for cellular communications. A management component can include software and hardware entities that facilitate the activation, authentication, accounting, updating of the MMCD systems, and synchronization to other entities. Additionally, the management component can facilitate the dissemination of applications, third-party services, and subscription information. An access component (e.g., a web server and interface) facilitates access to one or more of these entities such that administrators and/or users can access aspects of setup, configuration, subscriptions, updates, etc.
    Type: Application
    Filed: November 15, 2005
    Publication date: May 17, 2007
    Applicant: Microsoft Corporation
    Inventors: Zicheng Liu, David Kurlander, David Williams, Michael Sinclair, Zhengyou Zhang
  • Publication number: 20070100480
    Abstract: A system that facilitates managing resources (e.g., functionality, services) based at least in part upon an established context. More particularly, a context determination component can be employed to establish a context by processing sensor inputs or learning/inferring a user action/preference. Once the context is established via context determination component, a power/mode management component can be employed to activate and/or mask resources in accordance with the established context. The power and mode management of the device can extend life of a power source (e.g., battery) and mask functionality in accordance with a user and/or device state.
    Type: Application
    Filed: October 28, 2005
    Publication date: May 3, 2007
    Applicant: Microsoft Corporation
    Inventors: Michael Sinclair, David Williams, Zhengyou Zhang, Zicheng Liu
  • Publication number: 20070099602
    Abstract: A multi-modal multi-lingual mobile device that facilitates intelligently automating an action. The device can automatically synchronize a user schedule based upon a user state, intention, preference and/or limitation. The device can employ sensors to automatically detect criteria by which to automatically implement an action. Moreover, the system can interrogate a user thus converging upon a user intention and/or preference. An analyzer component can intelligently evaluate the compiled criterion in order to automatically perform an action. The multi-modal multi-lingual mobile device can automatically facilitate identification of an individual. Other actions that are automatically performed can include modifying personal information manager data, translating languages into a language comprehendible to a user, etc. Implementation of these actions can be based at least in part upon an environmental factor, a conversation, a location factor and a temporal factor.
    Type: Application
    Filed: October 28, 2005
    Publication date: May 3, 2007
    Applicant: Microsoft Corporation
    Inventors: David Kurlander, David Williams, Yuan Kong, Zhengyou Zhang
  • Publication number: 20070100704
    Abstract: A multi-modal device that can substantially facilitate intelligent shopping. Electronic receipts can be provided to a user wirelessly and stored/indexed on the multi-modal device. Receipts can be categorized (e.g., personal, business, client entertainment) thereby facilitating financial management and accounting. Likewise, such electronic receipts can provide for easier return/exchange of goods. The multi-modal device can also assist in tracking/managing shopping lists and business cards (e.g., provide for business card exchanges). Moreover, the multi-modal device can provide for comparison shopping, catalog shopping, locating products and obtaining more information about a product via visual or audio mechanisms.
    Type: Application
    Filed: October 28, 2005
    Publication date: May 3, 2007
    Applicant: Microsoft Corporation
    Inventors: Zicheng Liu, Silviu-Petru Cucerzan, Zhengyou Zhang, David Kurlander, Alejandro Acero
  • Patent number: 7212656
    Abstract: Described herein is a technique for creating a 3D face model using images obtained from an inexpensive camera associated with a general-purpose computer. Two still images and two video sequences of the user are captured. The user is asked to identify five facial features, which are used to calculate a mask and to perform fitting operations. Based on a comparison of the still images, deformation vectors are applied to a neutral face model to create the 3D model. The video sequences are used to create a texture map. The process of creating the texture map references the previously obtained 3D model to determine poses of the sequential video images.
    Type: Grant
    Filed: January 26, 2006
    Date of Patent: May 1, 2007
    Assignee: Microsoft Corporation
    Inventors: Zicheng Liu, Zhengyou Zhang, Charles E. Jacobs, Michael F. Cohen
  • Publication number: 20070088544
    Abstract: A first set of signals from an array of one or more microphones, and a second signal from a reference microphone are used to calibrate a set of filter parameters such that the filter parameters minimize a difference between the second signal and a beamformer output signal that is based on the first set of signals. Once calibrated, the filter parameters are used to form a beamformer output signal that is filtered using a non-linear adaptive filter that is adapted based on portions of a signal that do not contain speech, as determined by a speech detection sensor.
    Type: Application
    Filed: October 14, 2005
    Publication date: April 19, 2007
    Applicant: Microsoft Corporation
    Inventors: Alejandro Acero, Michael Seltzer, Zhengyou Zhang, Zicheng Liu
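
The calibration step in the last abstract above (publication 20070088544) chooses filter parameters that minimize the difference between the reference-microphone signal and the beamformer output. The following is only a minimal sketch of that idea, not the method claimed in the application: it assumes each filter is a single scalar weight per microphone and that the difference is measured as a sum of squared errors, and it omits the later non-linear adaptive filtering stage entirely. The function names `solve_linear`, `calibrate_weights`, and `beamform` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def calibrate_weights(array_signals, reference):
    """Assumed simplification: one scalar weight per microphone.

    Finds w minimizing sum_t (reference[t] - sum_m w[m] * x[m][t])**2
    by solving the normal equations (X X^T) w = X reference.
    """
    n = len(array_signals)
    gram = [
        [sum(p * q for p, q in zip(array_signals[i], array_signals[j]))
         for j in range(n)]
        for i in range(n)
    ]
    rhs = [sum(p * r for p, r in zip(array_signals[i], reference))
           for i in range(n)]
    return solve_linear(gram, rhs)


def beamform(array_signals, weights):
    """Weighted sum of the microphone signals at each sample."""
    length = len(array_signals[0])
    return [
        sum(w * sig[t] for w, sig in zip(weights, array_signals))
        for t in range(length)
    ]


# Toy data: the reference is an exact weighted mix of two microphones.
mics = [[1.0, 0.0, 2.0, -1.0], [0.0, 1.0, 1.0, 3.0]]
ref = [0.5 * a + 2.0 * b for a, b in zip(mics[0], mics[1])]
weights = calibrate_weights(mics, ref)
output = beamform(mics, weights)
```

On this toy data the recovered weights match the mixing coefficients used to build the reference, so the beamformer output reproduces the reference signal. Real microphone signals are noisy and would call for multi-tap filters rather than scalar weights.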