Patents by Inventor Yong Rui

Yong Rui has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7305095
    Abstract: A system and process is described for estimating the location of a speaker using signals output by a microphone array characterized by multiple pairs of audio sensors. The location of a speaker is estimated by first determining whether the signal data contains human speech components and filtering out noise attributable to stationary sources. The location of the person speaking is then estimated using a time-delay-of-arrival based SSL technique on those parts of the data determined to contain human speech components. A consensus location for the speaker is computed from the individual location estimates associated with each pair of microphone array audio sensors taking into consideration the uncertainty of each estimate. A final consensus location is also computed from the individual consensus locations computed over a prescribed number of sampling periods using a temporal filtering technique.
    Type: Grant
    Filed: July 15, 2005
    Date of Patent: December 4, 2007
    Assignee: Microsoft Corporation
    Inventor: Yong Rui
  • Patent number: 7293280
    Abstract: A program distribution system includes a plurality of set-top boxes that receive broadcast programming and segmentation data from content and information providers. The segmentation information indicates portions of programs that are to be included in skimmed or condensed versions of the received programming, and is produced using manual or automated methods. Automated methods include the use of ancillary production data to detect the most important parts of a program. A user interface allows a user to control time scale modification and skimming during playback, and also allows the user to easily browse to different points within the current program.
    Type: Grant
    Filed: May 5, 2000
    Date of Patent: November 6, 2007
    Assignee: Microsoft Corporation
    Inventors: Anoop Gupta, Li-Wei He, Francis C. Li, Yong Rui
  • Publication number: 20070237393
    Abstract: A spatial-color Gaussian mixture model (SCGMM) image segmentation technique for segmenting images. The SCGMM image segmentation technique specifies foreground objects in the first frame of an image sequence, either manually or automatically. From the initial segmentation, the SCGMM segmentation system learns two spatial-color Gaussian mixture models (SCGMM) for the foreground and background objects. These models are built into a first-order Markov random field (MRF) energy function.
    Type: Application
    Filed: March 30, 2006
    Publication date: October 11, 2007
    Applicant: Microsoft Corporation
    Inventors: Cha Zhang, Michael Cohen, Yong Rui, Ting Yu
  • Publication number: 20070237099
    Abstract: A decentralized computer network architecture and method that gathers metadata from local and remote clients and, based on that metadata, locally makes a decision whether to send a packet over the network. Each client listens to what other clients are doing, and only sends when the total number of concurrent speakers is below some threshold. In a multi-party voice conferencing embodiment, the threshold is a number of concurrent speakers that is restricted to less than a certain number. Under the decentralized computer network architecture, the type of network topology used to connect the clients is flexible, as long as each client is running a peer-aware system to decide locally whether to send their packets. The decentralized computer network architecture and method is distributed to run on each client, making it suitable for a wide variety of network topologies (such as full-mesh, bridge-based, or a hybrid of the two).
    Type: Application
    Filed: March 29, 2006
    Publication date: October 11, 2007
    Applicant: Microsoft Corporation
    Inventors: Li-wei He, Dinei Florencio, Yong Rui
  • Publication number: 20070186171
    Abstract: Techniques are provided for indicating workspace awareness using one or more of a write shadow, a read shadow, and/or a shadowbar providing an indication of operations performed at associated locations by various users accessing a same document. A write shadow may be used to indicate a position in a document being modified by a user. A read shadow may be used to indicate a position being viewed by a user. A shadowbar may be used to indicate areas of overlap among users with a shading and coloring indicative of a degree of overlap.
    Type: Application
    Filed: February 9, 2006
    Publication date: August 9, 2007
    Applicant: Microsoft Corporation
    Inventors: Sasa Junuzovic, Prasun Dewan, Yong Rui
  • Patent number: 7254241
    Abstract: A system and process for finding the location of a sound source using direct approaches having weighting factors that mitigate the effect of both correlated and reverberation noise is presented. When more than two microphones are used, the traditional time-delay-of-arrival (TDOA) based sound source localization (SSL) approach involves two steps. The first step computes TDOA for each microphone pair, and the second step combines these estimates. This two-step process discards relevant information in the first step, thus degrading the SSL accuracy and robustness. In the present invention, direct, one-step, approaches are employed. Namely, a one-step TDOA SSL approach and a steered beam (SB) SSL approach are employed. Each of these approaches provides an accuracy and robustness not available with the traditional two-step approaches.
    Type: Grant
    Filed: July 26, 2005
    Date of Patent: August 7, 2007
    Assignee: Microsoft Corporation
    Inventors: Yong Rui, Dinei Florencio
  • Patent number: 7231064
    Abstract: A system and method for object tracking using probabilistic mode-based multi-hypothesis tracking (MHT) provides for robust and computationally efficient tracking of moving objects such as heads and faces in complex environments. A mode-based multi-hypothesis tracker uses modes that are local maximums which are refined from initial samples in a parametric state space. Because the modes are highly representative, the mode-based multi-hypothesis tracker effectively models non-linear probabilistic distributions using a small number of hypotheses. Real-time tracking performance is achieved by using a parametric causal contour model to refine initial contours to nearby modes. In addition, one common drawback of conventional MHT schemes, i.e., producing only maximum likelihood estimates instead of a desired posterior probability distribution, is addressed by introducing an importance sampling framework into MHT, and estimating the posterior probability distribution from the importance function.
    Type: Grant
    Filed: November 17, 2005
    Date of Patent: June 12, 2007
    Assignee: Microsoft Corporation
    Inventors: Yong Rui, Yunqiang Chen
  • Publication number: 20070124370
    Abstract: A unique system and method that facilitates multi-user collaborative interactions is provided. Multiple users can provide input to an interactive surface at or about the same time without yielding control of the surface to any one user. The multiple users can share control of the surface and perform operations on various objects displayed on the surface. The objects can undergo a variety of manipulations and modifications depending on the particular application in use. Objects can be moved or copied between the interactive surface (a public workspace) and a more private workspace where a single user controls the workspace.
    Type: Application
    Filed: November 29, 2005
    Publication date: May 31, 2007
    Applicant: Microsoft Corporation
    Inventors: Krishnamohan Nareddy, Andrew Wilson, Yong Rui
  • Publication number: 20070120979
    Abstract: A combined digital and mechanical tracking system and process for generating a video using a single digital video camera that tracks a person or object of interest moving in a scene is presented. This generally involves operating the camera at a higher resolution than is needed for the application, and cropping a sub-region out of the image captured that is output as the output video. The person or object being tracked is at least partially contained within the cropped sub-region. As the person or object moves within the field of view of the camera, the location of the cropped sub-region is also moved so as to keep the subject of interest within its boundaries. When the subject of interest moves to the boundary of the FOV of the camera, the camera is mechanically panned to keep the person or object inside its FOV.
    Type: Application
    Filed: November 21, 2005
    Publication date: May 31, 2007
    Applicant: Microsoft Corporation
    Inventors: Cha Zhang, Li-wei He, Yong Rui
  • Publication number: 20070118868
    Abstract: A computer network-based distributed presentation system and process is presented that controls the display of one or more video streams output by multiple video cameras located across multiple presentation sites on display screens located at each presentation site. The distributed presentation system and process provides the ability for a user at a site to customize the screen configuration (i.e., what video streams are displayed at any one time and in what format) for that site via a two-layer display director module. In the design layer of the module, a user interface is provided for a user to specify display priorities dictating what video streams are to be displayed on the screen over time. These display priorities are then provided to the execution layer of the module, which translates them into probabilistic timed automata and uses the automata to control what is displayed on the display screen.
    Type: Application
    Filed: November 23, 2005
    Publication date: May 24, 2007
    Applicant: Microsoft Corporation
    Inventors: Cha Zhang, Bin Yu, Yong Rui
  • Patent number: 7171025
    Abstract: Automatic detection and tracking of multiple individuals includes receiving a frame of video and/or audio content and identifying a candidate area for a new face region in the frame. One or more hierarchical verification levels are used to verify whether a human face is in the candidate area, and an indication made that the candidate area includes a face if the one or more hierarchical verification levels verify that a human face is in the candidate area. A plurality of audio and/or video cues are used to track each verified face in the video content from frame to frame.
    Type: Grant
    Filed: January 25, 2005
    Date of Patent: January 30, 2007
    Assignee: Microsoft Corporation
    Inventors: Yong Rui, Yunqiang Chen
  • Patent number: 7151843
    Abstract: Automatic detection and tracking of multiple individuals includes receiving a frame of video and/or audio content and identifying a candidate area for a new face region in the frame. One or more hierarchical verification levels are used to verify whether a human face is in the candidate area, and an indication made that the candidate area includes a face if the one or more hierarchical verification levels verify that a human face is in the candidate area. A plurality of audio and/or video cues are used to track each verified face in the video content from frame to frame.
    Type: Grant
    Filed: January 25, 2005
    Date of Patent: December 19, 2006
    Assignee: Microsoft Corporation
    Inventors: Yong Rui, Yunqiang Chen
  • Publication number: 20060268101
    Abstract: A method of digitally adding the appearance of makeup to a videoconferencing participant. The system and method for applying digital make-up operates in a loop processing sequential video frames. For each input frame, there are typically three general steps: 1) Locating the face and eye and mouth regions; 2) Applying digital make-up to the face, preferably with the exception of the eye and open mouth areas; and 3) Blending the make-up region with the rest of the face. In one embodiment of the invention, the background in the frame containing a video conferencing participant can also be modified so that other video conferencing participants cannot clearly see the background behind the participant in the image frame. In one such embodiment of the invention, the video conferencing participant tries to make his or her own image look comical or altered. In another embodiment of the invention, a particular remote participant tries to make another participant look funny to the other participants.
    Type: Application
    Filed: May 25, 2005
    Publication date: November 30, 2006
    Applicant: Microsoft Corporation
    Inventors: Li-wei He, Michael Cohen, Yong Rui, Shinichi Manaka
  • Patent number: 7130446
    Abstract: Automatic detection and tracking of multiple individuals includes receiving a frame of video and/or audio content and identifying a candidate area for a new face region in the frame. One or more hierarchical verification levels are used to verify whether a human face is in the candidate area, and an indication made that the candidate area includes a face if the one or more hierarchical verification levels verify that a human face is in the candidate area. A plurality of audio and/or video cues are used to track each verified face in the video content from frame to frame.
    Type: Grant
    Filed: December 3, 2001
    Date of Patent: October 31, 2006
    Assignee: Microsoft Corporation
    Inventors: Yong Rui, Yunqiang Chen
  • Patent number: 7127071
    Abstract: A system and process for finding the location of a sound source using direct approaches having weighting factors that mitigate the effect of both correlated and reverberation noise is presented. When more than two microphones are used, the traditional time-delay-of-arrival (TDOA) based sound source localization (SSL) approach involves two steps. The first step computes TDOA for each microphone pair, and the second step combines these estimates. This two-step process discards relevant information in the first step, thus degrading the SSL accuracy and robustness. In the present invention, direct, one-step, approaches are employed. Namely, a one-step TDOA SSL approach and a steered beam (SB) SSL approach are employed. Each of these approaches provides an accuracy and robustness not available with the traditional two-step approaches.
    Type: Grant
    Filed: November 4, 2005
    Date of Patent: October 24, 2006
    Assignee: Microsoft Corporation
    Inventors: Yong Rui, Dinei Florencio
  • Publication number: 20060227977
    Abstract: A system and process for finding the location of a sound source using direct approaches having weighting factors that mitigate the effect of both correlated and reverberation noise is presented. When more than two microphones are used, the traditional time-delay-of-arrival (TDOA) based sound source localization (SSL) approach involves two steps. The first step computes TDOA for each microphone pair, and the second step combines these estimates. This two-step process discards relevant information in the first step, thus degrading the SSL accuracy and robustness. In the present invention, direct, one-step, approaches are employed. Namely, a one-step TDOA SSL approach and a steered beam (SB) SSL approach are employed. Each of these approaches provides an accuracy and robustness not available with the traditional two-step approaches.
    Type: Application
    Filed: July 26, 2005
    Publication date: October 12, 2006
    Applicant: Microsoft Corporation
    Inventors: Yong Rui, Dinei Florencio
  • Publication number: 20060215850
    Abstract: A system and process for finding the location of a sound source using direct approaches having weighting factors that mitigate the effect of both correlated and reverberation noise is presented. When more than two microphones are used, the traditional time-delay-of-arrival (TDOA) based sound source localization (SSL) approach involves two steps. The first step computes TDOA for each microphone pair, and the second step combines these estimates. This two-step process discards relevant information in the first step, thus degrading the SSL accuracy and robustness. In the present invention, direct, one-step, approaches are employed. Namely, a one-step TDOA SSL approach and a steered beam (SB) SSL approach are employed. Each of these approaches provides an accuracy and robustness not available with the traditional two-step approaches.
    Type: Application
    Filed: November 4, 2005
    Publication date: September 28, 2006
    Applicant: Microsoft Corporation
    Inventors: Yong Rui, Dinei Florencio
  • Patent number: 7113605
    Abstract: A system and process for estimating the time delay of arrival (TDOA) between a pair of audio sensors of a microphone array is presented. Generally, a generalized cross-correlation (GCC) technique is employed. However, this technique is improved to include provisions for both reducing the influence (including interference) from correlated ambient noise and reverberation noise in the sensor signals prior to computing the TDOA estimate. Two unique correlated ambient noise reduction procedures are also proposed. One involves the application of Wiener filtering, and the other a combination of Wiener filtering with a Gnn subtraction technique. In addition, two unique reverberation noise reduction procedures are proposed. Both involve applying a weighting factor to the signals prior to computing the TDOA which combines the effects of a traditional maximum likelihood (TML) weighting function and a phase transformation (PHAT) weighting function.
    Type: Grant
    Filed: July 14, 2005
    Date of Patent: September 26, 2006
    Assignee: Microsoft Corporation
    Inventors: Yong Rui, Dinei Florencio
  • Patent number: 7099798
    Abstract: An event-based system and process for recording and playback of collaborative electronic presentations is presented. The present system and process includes a technique for recording collaborative electronic presentations by capturing and storing the interactions between each participant and presentation data where each interaction event is timestamped and linked to a data file comprising the presentation data. The present system and process also includes a technique for playing back the recorded collaborative electronic presentation, which involves displaying the presentation data in an order it was originally presented and reproducing the recorded interactions between each participant and the displayed presentation data at the same point in the presentation that they were originally performed, based on the aforementioned timestamps.
    Type: Grant
    Filed: October 25, 2004
    Date of Patent: August 29, 2006
    Assignee: Microsoft Corporation
    Inventors: Bin Yu, Yong Rui
  • Publication number: 20060167995
    Abstract: A system and process for muting the audio transmission from a location of a participant engaged in a multi-party, computer network-based teleconference when that participant is working on a keyboard is presented. The audio is muted because it is assumed that the participant is doing something other than actively participating in the meeting when typing on the keyboard. If left un-muted, the sound of typing would distract the other participants in the teleconference.
    Type: Application
    Filed: January 12, 2005
    Publication date: July 27, 2006
    Applicant: Microsoft Corporation
    Inventor: Yong Rui
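
The keyboard-based muting idea in the last entry above (publication 20060167995) can be illustrated with a minimal sketch. This is not the method claimed in the application, only one plausible reading of the abstract: outgoing audio is replaced with silence for a short period after each keystroke. The class name TypingMuteGate, the hold_seconds parameter, and its default value are hypothetical and do not come from the listing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TypingMuteGate:
    """Mutes outgoing audio while the participant appears to be typing.

    hold_seconds is an assumed tuning parameter: how long the gate stays
    muted after the most recent keystroke.
    """

    hold_seconds: float = 1.5
    _last_key_time: Optional[float] = None

    def on_keystroke(self, now: float) -> None:
        # Record the time of the most recent key press.
        self._last_key_time = now

    def is_muted(self, now: float) -> bool:
        # Muted only if a keystroke occurred less than hold_seconds ago.
        if self._last_key_time is None:
            return False
        return (now - self._last_key_time) < self.hold_seconds

    def process(self, frame: bytes, now: float) -> bytes:
        # While muted, send zero bytes of the same length (silence for
        # signed PCM audio); otherwise pass the frame through unchanged.
        return bytes(len(frame)) if self.is_muted(now) else frame
```

A caller would invoke on_keystroke from a keyboard hook and pass each captured audio frame through process before transmission. Timestamps are passed in explicitly so the gate can be exercised without a real clock.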