Patents by Inventor Yong Rui

Yong Rui has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

System and method for communicating audio data signals via an audio communications medium

Patent number: 6934370

Abstract: A system for communicating audio data signals comprises a source computer that performs an action, generates an event message corresponding to the action, converts the event message into an audio data signal, and communicates the audio data signal through its speaker. A source telephone receives a voice signal from a participant and the audio data signal through its microphone and communicates the audio data signal and voice as coherent sound via an audio communications medium. A recipient telephone receives the audio data signal from the coherent sound communicated via the audio communications medium and communicates the audio data signal via its speaker. A recipient computer receives the audio data signal through its microphone, extracts the event message from the audio data signal, and performs an action based on the event message from the audio data signal. The audio communications medium can comprise a telephone communications system or air.

Type: Grant

Filed: June 16, 2003

Date of Patent: August 23, 2005

Assignee: Microsoft Corporation

Inventors: Roy Leban, Ross Garrett Cutler, Henrique S. Malvar, Yong Rui
Annotating programs for automatic summary generations

Publication number: 20050160457

Abstract: Audio/video programming content is made available to a receiver from a content provider, and meta data is made available to the receiver from a meta data provider. The meta data corresponds to the programming content, and identifies, for each of multiple portions of the programming content, an indicator of a likelihood that the portion is an exciting portion of the content. In one implementation, the meta data includes probabilities that segments of a baseball program are exciting, and is generated by analyzing the audio data of the baseball program for both excited speech and baseball hits. The meta data can then be used to generate a summary for the baseball program.

Type: Application

Filed: March 15, 2005

Publication date: July 21, 2005

Applicant: Microsoft Corporation

Inventors: Yong Rui, Anoop Gupta, Alejandro Acero
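
The abstract above describes meta data that assigns each portion of a program a likelihood of being exciting and says the meta data can be used to generate a summary. A minimal sketch of that last step follows, assuming a greedy selection under a time budget. The `summarize` function, the segment tuples, and the budget are hypothetical illustrations, not the method claimed in the application.

```python
from typing import List, Tuple

# (start_seconds, end_seconds, probability_exciting); layout is an assumption.
Segment = Tuple[float, float, float]

def summarize(segments: List[Segment], budget_s: float) -> List[Segment]:
    """Greedily keep the most probably exciting segments that fit the budget."""
    chosen = []
    used = 0.0
    # Visit segments from most to least likely to be exciting.
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        length = seg[1] - seg[0]
        if used + length <= budget_s:
            chosen.append(seg)
            used += length
    # Return the kept segments in program order for playback.
    return sorted(chosen, key=lambda s: s[0])

# Made-up meta data for five portions of a program.
metadata = [
    (0.0, 30.0, 0.10),
    (30.0, 50.0, 0.85),
    (50.0, 90.0, 0.20),
    (90.0, 105.0, 0.95),
    (105.0, 140.0, 0.40),
]
summary = summarize(metadata, budget_s=40.0)
print(summary)
```

With a 40-second budget, only the two highest-probability segments fit, and they are returned in their original order.
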
Annotating programs for automatic summary generation

Publication number: 20050159956

Abstract: Audio/video programming content is made available to a receiver from a content provider, and meta data is made available to the receiver from a meta data provider. The meta data corresponds to the programming content, and identifies, for each of multiple portions of the programming content, an indicator of a likelihood that the portion is an exciting portion of the content. In one implementation, the meta data includes probabilities that segments of a baseball program are exciting, and is generated by analyzing the audio data of the baseball program for both excited speech and baseball hits. The meta data can then be used to generate a summary for the baseball program.

Type: Application

Filed: March 4, 2005

Publication date: July 21, 2005

Applicant: Microsoft Corporation

Inventors: Yong Rui, Anoop Gupta, Alejandro Acero
Automatic detection and tracking of multiple individuals using multiple cues

Publication number: 20050147278

Abstract: Automatic detection and tracking of multiple individuals includes receiving a frame of video and/or audio content and identifying a candidate area for a new face region in the frame. One or more hierarchical verification levels are used to verify whether a human face is in the candidate area, and an indication made that the candidate area includes a face if the one or more hierarchical verification levels verify that a human face is in the candidate area. A plurality of audio and/or video cues are used to track each verified face in the video content from frame to frame.

Type: Application

Filed: January 25, 2005

Publication date: July 7, 2005

Applicant: Microsoft Corporation

Inventors: Yong Rui, Yunqiang Chen
Automatic detection and tracking of multiple individuals using multiple cues

Publication number: 20050129278

Abstract: Automatic detection and tracking of multiple individuals includes receiving a frame of video and/or audio content and identifying a candidate area for a new face region in the frame. One or more hierarchical verification levels are used to verify whether a human face is in the candidate area, and an indication made that the candidate area includes a face if the one or more hierarchical verification levels verify that a human face is in the candidate area. A plurality of audio and/or video cues are used to track each verified face in the video content from frame to frame.

Type: Application

Filed: January 25, 2005

Publication date: June 16, 2005

Applicant: Microsoft Corporation

Inventors: Yong Rui, Yunqiang Chen
System and process for tracking an object state using a particle filter sensor fusion technique

Publication number: 20050114079

Abstract: A system and process for tracking an object state over time using particle filter sensor fusion and a plurality of logical sensor modules is presented. This new fusion framework combines both the bottom-up and top-down approaches to sensor fusion to probabilistically fuse multiple sensing modalities. At the lower level, individual vision and audio trackers can be designed to generate effective proposals for the fuser. At the higher level, the fuser performs reliable tracking by verifying hypotheses over multiple likelihood models from multiple cues. Different from the traditional fusion algorithms, the present framework is a closed-loop system where the fuser and trackers coordinate their tracking information. Furthermore, to handle non-stationary situations, the present framework evaluates the performance of the individual trackers and dynamically updates their object states.

Type: Application

Filed: November 10, 2004

Publication date: May 26, 2005

Applicant: Microsoft Corporation

Inventors: Yong Rui, Yunqiang Chen
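
The abstract above describes individual trackers that generate proposals and a fuser that verifies hypotheses against several likelihood models, with results fed back to the trackers. The sketch below illustrates that general idea for a one-dimensional state using importance sampling. The Gaussian proposals, the Gaussian cue likelihoods, and the `fuse` function are all assumptions made for illustration; they are not taken from the application.

```python
import math
import random

def gaussian(x, mean, std):
    """Gaussian probability density at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def fuse(proposals, likelihoods, n_particles=2000, rng=None):
    """Estimate a 1-D state from tracker proposals and cue likelihoods.

    proposals:   list of (mean, std) pairs, one per tracker (assumed Gaussian).
    likelihoods: list of functions x -> likelihood, one per cue.
    """
    rng = rng or random.Random(0)
    particles, weights = [], []
    for _ in range(n_particles):
        # Sample a particle from an equal-weight mixture of the proposals.
        mean, std = rng.choice(proposals)
        x = rng.gauss(mean, std)
        # Density of the mixture proposal at this particle.
        q = sum(gaussian(x, m, s) for m, s in proposals) / len(proposals)
        # Verify the hypothesis against every cue's likelihood model.
        p = 1.0
        for likelihood in likelihoods:
            p *= likelihood(x)
        particles.append(x)
        weights.append(p / q)  # importance weight
    total = sum(weights)
    return sum(x * w for x, w in zip(particles, weights)) / total

# Hypothetical proposals from a vision tracker and an audio tracker.
vision_proposal = (2.0, 0.5)
audio_proposal = (2.6, 0.5)
cues = [lambda x: gaussian(x, 2.2, 0.3), lambda x: gaussian(x, 2.4, 0.3)]

estimate = fuse([vision_proposal, audio_proposal], cues, rng=random.Random(0))

# Closed loop (simplified): the fused estimate becomes each tracker's next mean.
next_proposals = [(estimate, std) for _, std in (vision_proposal, audio_proposal)]
print(round(estimate, 2))
```

Sampling from the trackers' proposals rather than from a broad prior concentrates particles where at least one tracker believes the object is, which is the role the abstract assigns to the lower level.
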
Image retrieval based on relevance feedback

Publication number: 20050086223

Abstract: An improved image retrieval process based on relevance feedback uses a hierarchical (per-feature) approach in comparing images. Multiple query vectors are generated for an initial image by extracting multiple low-level features from the initial image. When determining how closely a particular image in an image collection matches the initial image, a distance is calculated between the query vectors and corresponding low-level feature vectors extracted from the particular image. Once these individual distances are calculated, they are combined to generate an overall distance that represents how closely the two images match. According to other aspects, relevancy feedback received regarding previously retrieved images is used during the query vector generation and the distance determination to influence which images are subsequently retrieved.

Type: Application

Filed: October 21, 2004

Publication date: April 21, 2005

Applicant: Microsoft Corporation

Inventor: Yong Rui
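
The abstract above describes computing one distance per low-level feature and combining those distances into an overall distance, with relevance feedback influencing later retrievals. A minimal sketch of that idea follows, assuming Euclidean per-feature distances and a weighted sum. The feature names, the vectors, and the idea of expressing feedback as per-feature weights are illustrative assumptions, not details from the application.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def overall_distance(query, candidate, weights):
    """Combine per-feature distances into one weighted overall distance."""
    return sum(
        weights[name] * euclidean(query[name], candidate[name])
        for name in query
    )

def rank(query, collection, weights):
    """Return image ids ordered from closest to farthest match."""
    return sorted(
        collection,
        key=lambda image_id: overall_distance(query, collection[image_id], weights),
    )

# Hypothetical query vectors, one per low-level feature.
query = {"color": [0.8, 0.1, 0.1], "texture": [0.5, 0.5]}
collection = {
    "sunset": {"color": [0.7, 0.2, 0.1], "texture": [0.1, 0.9]},
    "brick": {"color": [0.2, 0.4, 0.4], "texture": [0.5, 0.5]},
}

# Before feedback: every feature counts equally.
initial = rank(query, collection, {"color": 1.0, "texture": 1.0})

# After (assumed) feedback that texture matters more to this user.
refined = rank(query, collection, {"color": 0.2, "texture": 1.8})
print(initial, refined)
```

Keeping the distances separate per feature is what lets feedback reorder results here: the same two images swap places once texture is weighted more heavily.
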
Skimming continuous multimedia content

Publication number: 20050086703

Abstract: A program distribution system includes a plurality of set-top boxes that receive broadcast programming and segmentation data from content and information providers. The segmentation information indicates portions of programs that are to be included in skimmed or condensed versions of the received programming, and is produced using manual or automated methods. Automated methods include the use of ancillary production data to detect the most important parts of a program. A user interface allows a user to control time scale modification and skimming during playback, and also allows the user to easily browse to different points within the current program.

Type: Application

Filed: October 22, 2004

Publication date: April 21, 2005

Applicant: Microsoft Corporation

Inventors: Anoop Gupta, Li-Wei He, Francis Li, Yong Rui
Methods and systems for estimating network available bandwidth using packet pairs and spatial filtering

Publication number: 20050083849

Abstract: Estimation of available bandwidth on a network uses packet pairs and spatial filtering. Packet pairs are transmitted over the network. The dispersion of the packet pairs is used to generate samples of the available bandwidth, which are then classified into bins to generate a histogram. The bins can have uniform bin widths, and the histogram data can be aged so that older samples are given less weight in the estimation. The histogram data is then spatially filtered. Kernel density algorithms can be used to spatially filter the histogram data. The network available bandwidth is estimated using the spatially filtered histogram data. Alternatively, the spatially filtered histogram data can be temporally filtered before the available bandwidth is estimated.

Type: Application

Filed: October 15, 2003

Publication date: April 21, 2005

Inventors: Yong Rui, Andres Vega-Garcia
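
The pipeline in the abstract above (dispersion samples, aged histogram, spatial filtering, estimate) can be sketched as follows. This is a rough illustration under stated assumptions: each sample is taken as packet size divided by dispersion, aging is an exponential decay, the spatial filter is a Gaussian kernel over bin indices, and the estimate is the center of the peak bin. None of these specific choices is confirmed by the application.

```python
import math

def bandwidth_samples(packet_bits, dispersions_s):
    """One bandwidth sample (bits/s) per packet pair: size / dispersion."""
    return [packet_bits / d for d in dispersions_s]

def aged_histogram(samples, bin_width, n_bins, decay):
    """Histogram with uniform bins; samples are ordered oldest to newest,
    and each step back in time multiplies a sample's weight by `decay`."""
    hist = [0.0] * n_bins
    newest = len(samples) - 1
    for i, sample in enumerate(samples):
        b = min(int(sample // bin_width), n_bins - 1)
        hist[b] += decay ** (newest - i)
    return hist

def kernel_smooth(hist, sigma_bins):
    """Spatially filter the histogram with a Gaussian kernel over bins."""
    return [
        sum(h * math.exp(-0.5 * ((i - j) / sigma_bins) ** 2)
            for j, h in enumerate(hist))
        for i in range(len(hist))
    ]

def estimate_bandwidth(smoothed, bin_width):
    """Return the center of the bin holding the smoothed peak."""
    peak = max(range(len(smoothed)), key=lambda i: smoothed[i])
    return (peak + 0.5) * bin_width

# Made-up dispersions (seconds) for 1500-byte (12000-bit) packets.
dispersions = [0.00125, 0.0013, 0.0025, 0.00115, 0.00128, 0.00122]
samples = bandwidth_samples(12000, dispersions)
hist = aged_histogram(samples, bin_width=1e6, n_bins=20, decay=0.9)
smoothed = kernel_smooth(hist, sigma_bins=1.0)
estimate = estimate_bandwidth(smoothed, bin_width=1e6)
print(estimate)
```

Smoothing across neighboring bins keeps a single outlier sample (here the one near 4.8 Mbit/s) from competing with the cluster of samples between 9 and 10 Mbit/s.
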
System and process for tracking an object state using a particle filter sensor fusion technique

Patent number: 6882959

Abstract: A system and process for tracking an object state over time using particle filter sensor fusion and a plurality of logical sensor modules is presented. This new fusion framework combines both the bottom-up and top-down approaches to sensor fusion to probabilistically fuse multiple sensing modalities. At the lower level, individual vision and audio trackers can be designed to generate effective proposals for the fuser. At the higher level, the fuser performs reliable tracking by verifying hypotheses over multiple likelihood models from multiple cues. Different from the traditional fusion algorithms, the present framework is a closed-loop system where the fuser and trackers coordinate their tracking information. Furthermore, to handle non-stationary situations, the present framework evaluates the performance of the individual trackers and dynamically updates their object states.

Type: Grant

Filed: May 2, 2003

Date of Patent: April 19, 2005

Assignee: Microsoft Corporation

Inventors: Yong Rui, Yunqiang Chen
Methods and systems for participant sourcing indication in multi-party conferencing and for audio source discrimination

Publication number: 20050076081

Abstract: Indications are provided of which participant is providing information during a multi-party conference. Each participant has equipment to display information being transferred during the conference. A sourcing signaler residing in the participant equipment provides a signal that indicates the identity of its participant when this participant is providing information to the conference. The source indicators of the other participant equipment receive the signal and cause a UI to indicate that the participant identified by the received signal is providing information (e.g., the UI can cause the identifier to change appearance). An audio discriminator is used to distinguish an acoustic signal that was generated by a person speaking from one generated in a band-limited manner. The audio discriminator analyzes the spectrum of detected audio signals and generates several parameters from the spectrum and from past determinations to determine the source of an audio signal on a frame-by-frame basis.

Type: Application

Filed: October 1, 2003

Publication date: April 7, 2005

Inventors: Yong Rui, Anoop Gupta
Image retrieval based on relevance feedback

Publication number: 20050065929

Abstract: An improved image retrieval process based on relevance feedback uses a hierarchical (per-feature) approach in comparing images. Multiple query vectors are generated for an initial image by extracting multiple low-level features from the initial image. When determining how closely a particular image in an image collection matches the initial image, a distance is calculated between the query vectors and corresponding low-level feature vectors extracted from the particular image. Once these individual distances are calculated, they are combined to generate an overall distance that represents how closely the two images match. According to other aspects, relevancy feedback received regarding previously retrieved images is used during the query vector generation and the distance determination to influence which images are subsequently retrieved.

Type: Application

Filed: October 26, 2004

Publication date: March 24, 2005

Applicant: Microsoft Corporation

Inventor: Yong Rui
System and method for devising a human interactive proof that determines whether a remote client is a human or a computer program

Publication number: 20050065802

Abstract: A system and method for automatically determining if a remote client is a human or a computer. A set of human interactive proof (HIP) design guidelines that are important to ensure the security and usability of a HIP system is described. Furthermore, one embodiment of this new HIP system and method is based on human face and facial feature detection. Because the human face is the most familiar object to all human users, the embodiment of the invention employing a face is possibly the most universal HIP system so far.

Type: Application

Filed: September 19, 2003

Publication date: March 24, 2005

Applicant: Microsoft Corporation

Inventors: Yong Rui, Zicheng Liu
Image retrieval based on relevance feedback

Patent number: 6859802

Abstract: An improved image retrieval process based on relevance feedback uses a hierarchical (per-feature) approach in comparing images. Multiple query vectors are generated for an initial image by extracting multiple low-level features from the initial image. When determining how closely a particular image in an image collection matches the initial image, a distance is calculated between the query vectors and corresponding low-level feature vectors extracted from the particular image. Once these individual distances are calculated, they are combined to generate an overall distance that represents how closely the two images match. According to other aspects, relevancy feedback received regarding previously retrieved images is used during the query vector generation and the distance determination to influence which images are subsequently retrieved.

Type: Grant

Filed: September 13, 2000

Date of Patent: February 22, 2005

Assignee: Microsoft Corporation

Inventor: Yong Rui
System and method for distributed meetings

Publication number: 20040263636

Abstract: A system and method for teleconferencing and recording of meetings. The system uses a variety of capture devices (a novel 360° camera, a whiteboard camera, a presenter view camera, a remote view camera, and a microphone array) to provide a rich experience for people who want to participate in a meeting from a distance. The system is also combined with speaker clustering, spatial indexing, and time compression to provide a rich experience for people who miss a meeting and want to watch it afterward.

Type: Application

Filed: June 26, 2003

Publication date: December 30, 2004

Applicant: Microsoft Corporation

Inventors: Ross Cutler, Yong Rui, Anoop Gupta
System and process for robust sound source localization

Publication number: 20040240680

Abstract: A system and process for finding the location of a sound source using direct approaches having weighting factors that mitigate the effect of both correlated and reverberation noise is presented. When more than two microphones are used, the traditional time-delay-of-arrival (TDOA) based sound source localization (SSL) approach involves two steps. The first step computes TDOA for each microphone pair, and the second step combines these estimates. This two-step process discards relevant information in the first step, thus degrading the SSL accuracy and robustness. In the present invention, direct, one-step, approaches are employed. Namely, a one-step TDOA SSL approach and a steered beam (SB) SSL approach are employed. Each of these approaches provides an accuracy and robustness not available with the traditional two-step approaches.

Type: Application

Filed: May 28, 2003

Publication date: December 2, 2004

Inventors: Yong Rui, Dinei A. Florencio
System and process for tracking an object state using a particle filter sensor fusion technique

Publication number: 20040220769

Abstract: A system and process for tracking an object state over time using particle filter sensor fusion and a plurality of logical sensor modules is presented. This new fusion framework combines both the bottom-up and top-down approaches to sensor fusion to probabilistically fuse multiple sensing modalities. At the lower level, individual vision and audio trackers can be designed to generate effective proposals for the fuser. At the higher level, the fuser performs reliable tracking by verifying hypotheses over multiple likelihood models from multiple cues. Different from the traditional fusion algorithms, the present framework is a closed-loop system where the fuser and trackers coordinate their tracking information. Furthermore, to handle non-stationary situations, the present framework evaluates the performance of the individual trackers and dynamically updates their object states.

Type: Application

Filed: May 2, 2003

Publication date: November 4, 2004

Inventors: Yong Rui, Yunqiang Chen
System and process for time delay estimation in the presence of correlated noise and reverberation

Publication number: 20040190730

Abstract: A system and process for estimating the time delay of arrival (TDOA) between a pair of audio sensors of a microphone array is presented. Generally, a generalized cross-correlation (GCC) technique is employed. However, this technique is improved to include provisions for both reducing the influence (including interference) from correlated ambient noise and reverberation noise in the sensor signals prior to computing the TDOA estimate. Two unique correlated ambient noise reduction procedures are also proposed. One involves the application of Wiener filtering, and the other a combination of Wiener filtering with a Gnn subtraction technique. In addition, two unique reverberation noise reduction procedures are proposed. Both involve applying a weighting factor to the signals prior to computing the TDOA which combines the effects of a traditional maximum likelihood (TML) weighting function and a phase transformation (PHAT) weighting function.

Type: Application

Filed: March 31, 2003

Publication date: September 30, 2004

Inventors: Yong Rui, Dinei A. Florencio
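
The abstract above builds on generalized cross-correlation (GCC) with a phase transformation (PHAT) weighting. The sketch below shows only plain GCC-PHAT for a pair of sensors; it does not reproduce the Wiener filtering, the Gnn subtraction, or the combined TML/PHAT weighting that the application proposes. The naive `dft` helper, the function names, and the test signal are assumptions chosen to keep the example dependency-free.

```python
import cmath
import random

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def gcc_phat_delay(sig_a, sig_b, max_lag):
    """Return the lag, in samples, by which sig_b trails sig_a."""
    n = len(sig_a) + len(sig_b)  # zero-pad so the correlation does not wrap
    spec_a = dft(list(sig_a) + [0.0] * (n - len(sig_a)))
    spec_b = dft(list(sig_b) + [0.0] * (n - len(sig_b)))
    cross = []
    for fa, fb in zip(spec_a, spec_b):
        c = fb * fa.conjugate()
        mag = abs(c)
        # PHAT weighting: keep only the phase of the cross-spectrum.
        cross.append(c / mag if mag > 1e-12 else 0j)
    corr = dft(cross, inverse=True)
    # Negative lags live at the end of the circular correlation.
    return max(range(-max_lag, max_lag + 1), key=lambda lag: corr[lag % n].real)

# Synthetic test: the second sensor hears the same noise 3 samples later.
rng = random.Random(7)
mic_a = [rng.uniform(-1.0, 1.0) for _ in range(32)]
mic_b = [0.0, 0.0, 0.0] + mic_a
delay = gcc_phat_delay(mic_a, mic_b, max_lag=10)
print(delay)
```

Dividing the cross-spectrum by its magnitude whitens the signals, so the correlation peak depends on phase alone; in this noise-free case the peak falls at the inserted delay.
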
Automated camera management system and method for capturing presentations using videography rules

Publication number: 20040105004

Abstract: An automated camera management system and method for capturing presentations using videography rules. The system and method use technology components and aesthetic components represented by the videography rules to capture a presentation. In general, the automated camera management method captures a presentation using videography rules to determine camera positioning, camera movement, and switching or transition between cameras. The videography rules depend on the type of presentation room and the number of audio-visual camera units used to capture the presentation. The automated camera management system of the invention uses the above method to capture a presentation in a presentation room. The system includes at least one audio-visual (A-V) camera unit for capturing and tracking a subject based on vision or sound. The A-V camera unit includes any combination of the following components: (1) a pan-tilt-zoom (PTZ) camera; (2) a fixed camera; and (3) a microphone array.

Type: Application

Filed: November 30, 2002

Publication date: June 3, 2004

Inventors: Yong Rui, Anoop Gupta, Jonathan Thomas Grudin
System and process for locating a speaker using 360 degree sound source localization

Publication number: 20040037436

Abstract: A system and process is described for estimating the location of a speaker using signals output by a microphone array characterized by multiple pairs of audio sensors. The location of a speaker is estimated by first determining whether the signal data contains human speech components and filtering out noise attributable to stationary sources. The location of the person speaking is then estimated using a time-delay-of-arrival based SSL technique on those parts of the data determined to contain human speech components. A consensus location for the speaker is computed from the individual location estimates associated with each pair of microphone array audio sensors taking into consideration the uncertainty of each estimate. A final consensus location is also computed from the individual consensus locations computed over a prescribed number of sampling periods using a temporal filtering technique.

Type: Application

Filed: August 26, 2002

Publication date: February 26, 2004

Inventor: Yong Rui
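
The abstract above describes computing a consensus location from per-pair estimates while accounting for each estimate's uncertainty, then filtering consensus locations over several sampling periods. The sketch below is one possible reading of that, assuming each estimate is a bearing in degrees with a standard deviation, that estimates are combined by an inverse-variance weighted circular mean, and that the temporal filter is a plain circular average. These choices and the `consensus_angle` function are illustrative assumptions, not the technique in the application.

```python
import math

def consensus_angle(estimates):
    """Inverse-variance weighted circular mean of bearing estimates.

    estimates: list of (angle_degrees, std_degrees) pairs.
    Returns an angle in the range [0, 360).
    """
    x = sum(math.cos(math.radians(a)) / (s * s) for a, s in estimates)
    y = sum(math.sin(math.radians(a)) / (s * s) for a, s in estimates)
    return math.degrees(math.atan2(y, x)) % 360.0

# Made-up estimates from three sensor pairs; the last is very uncertain.
pair_estimates = [(350.0, 5.0), (10.0, 5.0), (90.0, 40.0)]
consensus = consensus_angle(pair_estimates)

# Temporal filtering (simplified): average consensus values from recent
# sampling periods, giving each period equal confidence.
recent_periods = [358.0, 2.0, consensus]
final = consensus_angle([(angle, 1.0) for angle in recent_periods])
print(round(consensus, 2), round(final, 2))
```

Averaging on the unit circle matters for a 360-degree array: an arithmetic mean of 350 and 10 degrees would point at 180 degrees, directly away from the speaker.
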
