Patents by Inventor Xinding Sun

Xinding Sun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8510110
    Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers. (A minimal illustrative sketch of this feature-pooling idea appears after the listing.)
    Type: Grant
    Filed: July 11, 2012
    Date of Patent: August 13, 2013
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
  • Patent number: 8446456
    Abstract: A system is provided for detecting an intersection between more than one panoramic video sequence and detecting the orientation of the sequences forming the intersection. Video images and corresponding location data are received and, if required, processed to ensure the images are associated with location data. An intersection between two paths is then derived from the video images by deriving a rough intersection between two images, determining a neighborhood for the two images, and dividing each image in the neighborhood into strips. An identifying value is derived from each strip to create a row of strip values, which is then converted to the frequency domain. A distance measure is taken between the rows of strip values in the frequency domain, and the intersection is determined from the images having the smallest distance measure between them. (A minimal illustrative sketch of this strip comparison appears after the listing.)
    Type: Grant
    Filed: September 7, 2007
    Date of Patent: May 21, 2013
    Assignee: Fuji Xerox Co., Ltd.
    Inventors: Jonathan T. Foote, Donald G. Kimber, Xinding Sun, John E. Adcock
  • Publication number: 20120278077
    Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
    Type: Application
    Filed: July 11, 2012
    Publication date: November 1, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
  • Patent number: 8234113
    Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
    Type: Grant
    Filed: August 30, 2011
    Date of Patent: July 31, 2012
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
  • Patent number: 8219387
    Abstract: Frames of audio data derived from a microphone array may be received; at least some of the frames contain residual acoustic echo after acoustic echo has been partially removed from them. Probability distribution functions are determined from the frames of audio data, each comprising the likelihoods that respective directions are the directions of sound sources. An active speaker may be identified in frames of video data based on the video data and on audio information derived from the audio data, where use of the audio information for identifying the active speaker is controlled by determining whether the probability distribution functions indicate that the corresponding audio data includes residual acoustic echo. (A minimal illustrative sketch of this gating idea appears after the listing.)
    Type: Grant
    Filed: December 10, 2007
    Date of Patent: July 10, 2012
    Assignee: Microsoft Corporation
    Inventors: Ross Cutler, Xinding Sun, Senthil Velayutham
  • Patent number: 8130978
    Abstract: This disclosure describes techniques for automatically identifying the direction of a speech source relative to an array of directional microphones using audio streams from some or all of the microphones. Whether audio streams from a subgroup of the directional microphones or from all of them are used depends on which option is more likely to identify the direction of the speech source correctly, and switching between the two may occur automatically so as to best identify that direction. A display screen at a remote venue may then display images having angles of view that are centered generally in the direction of the speech source. (A minimal illustrative sketch of this subgroup-versus-full-array choice appears after the listing.)
    Type: Grant
    Filed: October 15, 2008
    Date of Patent: March 6, 2012
    Assignee: Microsoft Corporation
    Inventor: Xinding Sun
  • Publication number: 20110313766
    Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
    Type: Application
    Filed: August 30, 2011
    Publication date: December 22, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
  • Patent number: 8024189
    Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
    Type: Grant
    Filed: June 22, 2006
    Date of Patent: September 20, 2011
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
  • Publication number: 20100092007
    Abstract: This disclosure describes techniques for automatically identifying the direction of a speech source relative to an array of directional microphones using audio streams from some or all of the microphones. Whether audio streams from a subgroup of the directional microphones or from all of them are used depends on which option is more likely to identify the direction of the speech source correctly, and switching between the two may occur automatically so as to best identify that direction. A display screen at a remote venue may then display images having angles of view that are centered generally in the direction of the speech source.
    Type: Application
    Filed: October 15, 2008
    Publication date: April 15, 2010
    Applicant: MICROSOFT CORPORATION
    Inventor: Xinding Sun
  • Patent number: 7656951
    Abstract: A digital video processing method and an apparatus thereof are provided. The method for processing digital images received in the form of compressed video streams comprises the step of determining a region intensity histogram (RIH) based on the motion-compensation information of inter frames. Because the RIH is derived from the motion-compensation values of inter frames, it is a good indicator of the motion in a video scene; because it also reflects the intensity of the scene, video streams with similar intensities can be found effectively by searching for similar video scenes based on their RIH information. (A minimal illustrative sketch of an RIH-style descriptor appears after the listing.)
    Type: Grant
    Filed: August 5, 2003
    Date of Patent: February 2, 2010
    Assignees: Samsung Electronics Co., Ltd., The Regents of the University of California
    Inventors: Hyun-doo Shin, Yang-lim Choi, B. S. Manjunath, Xinding Sun
  • Publication number: 20090150149
    Abstract: Frames of audio data derived from a microphone array may be received; at least some of the frames contain residual acoustic echo after acoustic echo has been partially removed from them. Probability distribution functions are determined from the frames of audio data, each comprising the likelihoods that respective directions are the directions of sound sources. An active speaker may be identified in frames of video data based on the video data and on audio information derived from the audio data, where use of the audio information for identifying the active speaker is controlled by determining whether the probability distribution functions indicate that the corresponding audio data includes residual acoustic echo.
    Type: Application
    Filed: December 10, 2007
    Publication date: June 11, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Ross Cutler, Xinding Sun, Senthil Velayutham
  • Patent number: 7362806
    Abstract: An object activity modeling method which can efficiently model complex objects such as a human body is provided. The method includes the steps of (a) obtaining an optical flow vector from a video sequence; (b) obtaining the probability distribution of the feature vector over a plurality of video frames, using the optical flow vector; (c) modeling states, using the probability distribution of the feature vector; and (d) expressing the activity of the object in the video sequence based on state transitions. With this modeling method, complex activities such as human activities can be efficiently modeled and recognized in the video indexing and recognition field without segmenting objects. (A minimal illustrative sketch of flow-based state modeling appears after the listing.)
    Type: Grant
    Filed: July 27, 2001
    Date of Patent: April 22, 2008
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yang-lim Choi, Yun-ju Yu, Bangalore S. Manjunath, Xinding Sun, Ching-wei Chen
  • Publication number: 20070296807
    Abstract: A system is provided for detecting an intersection between more than one panoramic video sequence and detecting the orientation of the sequences forming the intersection. Video images and corresponding location data are received and, if required, processed to ensure the images are associated with location data. An intersection between two paths is then derived from the video images by deriving a rough intersection between two images, determining a neighborhood for the two images, and dividing each image in the neighborhood into strips. An identifying value is derived from each strip to create a row of strip values, which is then converted to the frequency domain. A distance measure is taken between the rows of strip values in the frequency domain, and the intersection is determined from the images having the smallest distance measure between them.
    Type: Application
    Filed: September 7, 2007
    Publication date: December 27, 2007
    Applicant: FUJI XEROX CO., LTD.
    Inventors: Jonathan Foote, Donald Kimber, Xinding Sun, John Adcock
  • Publication number: 20070297682
    Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
    Type: Application
    Filed: June 22, 2006
    Publication date: December 27, 2007
    Applicant: Microsoft Corporation
    Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
  • Patent number: 7308030
    Abstract: An object activity modeling method which can efficiently model complex objects such as a human body is provided. The method includes the steps of (a) obtaining an optical flow vector from a video sequence; (b) obtaining the probability distribution of the feature vector over a plurality of video frames, using the optical flow vector; (c) modeling states, using the probability distribution of the feature vector; and (d) expressing the activity of the object in the video sequence based on state transitions. With this modeling method, complex activities such as human activities can be efficiently modeled and recognized in the video indexing and recognition field without segmenting objects.
    Type: Grant
    Filed: April 12, 2005
    Date of Patent: December 11, 2007
    Assignees: Samsung Electronics Co., Ltd., The Regents of the University of California
    Inventors: Yang-lim Choi, Yun-ju Yu, Bangalore S. Manjunath, Xinding Sun, Ching-wei Chen
  • Patent number: 7289138
    Abstract: A system is provided for detecting an intersection between more than one panoramic video sequence and detecting the orientation of the sequences forming the intersection. Video images and corresponding location data are received and, if required, processed to ensure the images are associated with location data. An intersection between two paths is then derived from the video images by deriving a rough intersection between two images, determining a neighborhood for the two images, and dividing each image in the neighborhood into strips. An identifying value is derived from each strip to create a row of strip values, which is then converted to the frequency domain. A distance measure is taken between the rows of strip values in the frequency domain, and the intersection is determined from the images having the smallest distance measure between them.
    Type: Grant
    Filed: July 2, 2002
    Date of Patent: October 30, 2007
    Assignee: Fuji Xerox Co., Ltd.
    Inventors: Jonathan T. Foote, Donald Kimber, Xinding Sun, John Adcock
  • Patent number: 7006569
    Abstract: A digital video processing method and an apparatus thereof are provided. The method for processing digital images received in the form of compressed video streams comprises the step of determining a region intensity histogram (RIH) based on the motion-compensation information of inter frames. Because the RIH is derived from the motion-compensation values of inter frames, it is a good indicator of the motion in a video scene; because it also reflects the intensity of the scene, video streams with similar intensities can be found effectively by searching for similar video scenes based on their RIH information.
    Type: Grant
    Filed: February 4, 2000
    Date of Patent: February 28, 2006
    Assignees: Samsung Electronics Co., Ltd., The Regents of the University of California
    Inventors: Hyun-doo Shin, Yang-lim Choi, Bangalore S. Manjunath, Xinding Sun
  • Patent number: 7003038
    Abstract: A method describes activity in a video sequence. The method measures intensity, direction, spatial, and temporal attributes of the video sequence, and the measured attributes are combined into a digital descriptor of the sequence's activity. (A minimal illustrative sketch of such a combined descriptor appears after the listing.)
    Type: Grant
    Filed: August 13, 2002
    Date of Patent: February 21, 2006
    Assignee: Mitsubishi Electric Research Labs., Inc.
    Inventors: Ajay Divakaran, Huifang Sun, Hae-Kwang Kim, Chul-Soo Park, Xinding Sun, Bangalore S. Manjunath, Vinod V. Vasudevan, Manoranjan D. Jesudoss, Ganesh Rattinassababady, Hyundoo Shin
  • Publication number: 20050220191
    Abstract: An object activity modeling method which can efficiently model complex objects such as a human body is provided. The method includes the steps of (a) obtaining an optical flow vector from a video sequence; (b) obtaining the probability distribution of the feature vector over a plurality of video frames, using the optical flow vector; (c) modeling states, using the probability distribution of the feature vector; and (d) expressing the activity of the object in the video sequence based on state transitions. With this modeling method, complex activities such as human activities can be efficiently modeled and recognized in the video indexing and recognition field without segmenting objects.
    Type: Application
    Filed: April 12, 2005
    Publication date: October 6, 2005
    Inventors: Yang-lim Choi, Yun-ju Yu, Bangalore Manjunath, Xinding Sun, Ching-wei Chen
  • Publication number: 20040022317
    Abstract: A digital video processing method and an apparatus thereof are provided. The method for processing digital images received in the form of compressed video streams comprises the step of determining a region intensity histogram (RIH) based on the motion-compensation information of inter frames. Because the RIH is derived from the motion-compensation values of inter frames, it is a good indicator of the motion in a video scene; because it also reflects the intensity of the scene, video streams with similar intensities can be found effectively by searching for similar video scenes based on their RIH information.
    Type: Application
    Filed: July 18, 2003
    Publication date: February 5, 2004
    Applicants: SAMSUNG ELECTRONICS CO., LTD., THE REGENTS OF THE UNIVERSITY OF CALIF.
    Inventors: Hyun-Doo Shin, Yang-Lim Choi, B.S. Manjunath, Xinding Sun
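
The person/speaker-detection entries above (patent 8510110 and its related applications and grants) describe pooling features from more than one input type and handing them to a learning algorithm that produces a classifier. The sketch below only illustrates that general pattern; it is not the patented method. The feature arrays are synthetic stand-ins, and scikit-learn's AdaBoost is used as a generic boosting learner.

```python
# Minimal sketch of learning a person/speaker classifier from a pooled
# audio + video feature set. This is NOT the patented algorithm; it only
# illustrates the "pool features, learn, classify" pattern.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)

# Hypothetical precomputed features: one row per time window / image region.
n = 500
audio_feats = rng.normal(size=(n, 8))    # e.g., sound-source-direction features
video_feats = rng.normal(size=(n, 16))   # e.g., appearance / motion features
labels = rng.integers(0, 2, size=n)      # 1 = window contains a person/speaker

# Pool the heterogeneous features into one vector per example.
X = np.hstack([audio_feats, video_feats])

# A boosting learner picks out the most useful features from the pool.
clf = AdaBoostClassifier(n_estimators=100, random_state=0)
clf.fit(X, labels)

# Evaluate the resulting classifier on new windows to detect speakers.
print("speaker probability:", clf.predict_proba(X[:5])[:, 1])
```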
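
The panoramic-intersection entries (patents 8446456 and 7289138) describe dividing images into strips, deriving a value per strip, converting the resulting row of values to the frequency domain, and comparing frames by a distance measure there. The following is a minimal sketch of that comparison, assuming each panoramic frame is already a 2-D grayscale NumPy array; the strip count, mean-value summary, and Euclidean distance are illustrative choices, not details taken from the patents.

```python
# Sketch of a strip/frequency-domain frame comparison for locating an
# intersection between two panoramic video paths. All names are illustrative.
import numpy as np

def strip_signature(frame, n_strips=64):
    """Split a panorama into vertical strips and keep one value per strip."""
    strips = np.array_split(frame, n_strips, axis=1)
    row = np.array([s.mean() for s in strips])          # identifying value per strip
    # Magnitude spectrum is unchanged by a circular shift of the strips,
    # which helps tolerate different orientations at the same spot.
    return np.abs(np.fft.rfft(row - row.mean()))

def strip_distance(frame_a, frame_b, n_strips=64):
    """Distance between two frames' strip signatures (smaller = more similar)."""
    return float(np.linalg.norm(strip_signature(frame_a, n_strips) -
                                strip_signature(frame_b, n_strips)))

# Toy usage: the candidate pair with the smallest distance marks the intersection.
rng = np.random.default_rng(1)
seq1 = [rng.random((128, 512)) for _ in range(5)]   # stand-ins for panoramic frames
seq2 = [rng.random((128, 512)) for _ in range(5)]
i, j = min(((i, j) for i in range(len(seq1)) for j in range(len(seq2))),
           key=lambda ij: strip_distance(seq1[ij[0]], seq2[ij[1]]))
print("closest pair:", i, j)
```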
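
Patent 8219387 and its application describe forming probability distribution functions over candidate sound-source directions and deciding, from those distributions, whether the audio cue should be trusted when identifying an active speaker. The sketch below uses distribution entropy as a stand-in reliability test; the actual criterion used in the patent is not reproduced here, and the threshold is invented for the example.

```python
# Sketch of gating an audio direction cue on the shape of its probability
# distribution. The entropy test below is an illustrative stand-in only.
import numpy as np

def direction_pdf_is_reliable(pdf, max_entropy_ratio=0.9):
    """Return True if the direction PDF is peaked enough to trust.

    A nearly uniform PDF often means the apparent source is diffuse (for
    example, loudspeaker echo that was only partially cancelled), so the
    audio cue should not steer active-speaker identification.
    """
    pdf = np.asarray(pdf, dtype=float)
    pdf = pdf / pdf.sum()
    entropy = -np.sum(pdf * np.log(pdf + 1e-12))
    return bool(entropy < max_entropy_ratio * np.log(len(pdf)))

peaked = np.array([0.01] * 35 + [0.65])       # strong single direction
flat = np.full(36, 1 / 36)                    # diffuse, echo-like energy
print(direction_pdf_is_reliable(peaked))      # True  -> use audio to pick speaker
print(direction_pdf_is_reliable(flat))        # False -> fall back to video only
```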
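
Patent 8130978 and its application describe switching between a subgroup of directional microphones and the full array, depending on which is more likely to identify the speech direction correctly. The sketch below is an assumption-heavy illustration: the peak-to-mean energy ratio used as a confidence measure, the microphone geometry, and all names are invented for the example.

```python
# Sketch of choosing between a microphone subgroup and the full array when
# estimating the direction of a speech source. Illustrative only.
import numpy as np

def pick_direction(energies_by_mic, mic_angles_deg, subgroup):
    """Return an estimated speech direction (degrees) and the mic indices used.

    energies_by_mic: speech-band energy observed at each directional mic.
    subgroup:        indices of a candidate subset (e.g., mics facing the table).
    """
    energies_by_mic = np.asarray(energies_by_mic, dtype=float)
    mic_angles_deg = np.asarray(mic_angles_deg, dtype=float)

    def confidence(idx):
        e = energies_by_mic[idx]
        return e.max() / (e.mean() + 1e-12)   # peaked energy -> confident estimate

    # Use whichever grouping is more likely to point at the true source.
    subgroup = np.asarray(subgroup)
    all_idx = np.arange(len(energies_by_mic))
    use = subgroup if confidence(subgroup) > confidence(all_idx) else all_idx
    direction = mic_angles_deg[use][np.argmax(energies_by_mic[use])]
    return direction, use

angles = [0, 60, 120, 180, 240, 300]          # six directional mics
energies = [0.2, 0.3, 2.5, 0.4, 0.2, 0.3]     # speaker roughly at 120 degrees
print(pick_direction(energies, angles, subgroup=[1, 2, 3]))
```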
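
Patents 7656951 and 7006569 describe building a region intensity histogram (RIH) from the motion-compensation information of inter frames and finding similar scenes by comparing these histograms. The sketch below assumes per-macroblock motion magnitudes have already been parsed out of the compressed stream; the 4x4 region grid and L1 distance are illustrative choices, not the patented definitions.

```python
# Sketch of an RIH-style descriptor built from inter-frame motion-compensation
# magnitudes, and of matching scenes by comparing descriptors.
import numpy as np

def region_intensity_histogram(mv_magnitudes, grid=(4, 4)):
    """mv_magnitudes: 2-D array of per-macroblock motion magnitudes for one inter frame."""
    h, w = mv_magnitudes.shape
    rows, cols = grid
    rih = np.empty(rows * cols)
    for r in range(rows):
        for c in range(cols):
            block = mv_magnitudes[r * h // rows:(r + 1) * h // rows,
                                  c * w // cols:(c + 1) * w // cols]
            rih[r * cols + c] = block.mean()      # motion "intensity" of this region
    return rih / (rih.sum() + 1e-12)              # normalise into a histogram

def scene_distance(rih_a, rih_b):
    return float(np.abs(rih_a - rih_b).sum())     # L1 distance between descriptors

# Toy usage: find the stored scene whose RIH is closest to the query's.
rng = np.random.default_rng(2)
query = region_intensity_histogram(rng.random((36, 44)))
database = [region_intensity_histogram(rng.random((36, 44))) for _ in range(10)]
best = min(range(len(database)), key=lambda i: scene_distance(query, database[i]))
print("most similar scene:", best)
```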
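
Patents 7362806 and 7308030 describe computing optical flow, forming per-frame feature distributions, modeling states, and expressing activity through state transitions. The sketch below follows that outline loosely, with OpenCV's Farneback flow, orientation histograms as features, k-means clusters standing in for the patented state modeling, and a hypothetical clip name.

```python
# Sketch of optical-flow-based activity modelling: per-frame flow features,
# k-means "states" as a stand-in for the patented state modelling, and a
# state-transition matrix summarising the activity. File name is hypothetical.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def flow_features(video_path, bins=8):
    """One orientation-histogram feature vector per consecutive frame pair."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        raise IOError(f"could not read {video_path}")
    prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    feats = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
        hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
        feats.append(hist / (hist.sum() + 1e-12))
        prev = gray
    cap.release()
    return np.array(feats)

feats = flow_features("activity_clip.avi")                        # hypothetical clip
states = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(feats)

# Activity summarised by how the sequence moves between states.
transitions = np.zeros((4, 4))
for a, b in zip(states[:-1], states[1:]):
    transitions[a, b] += 1
print(transitions / (transitions.sum(axis=1, keepdims=True) + 1e-12))
```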
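
Patent 7003038 describes measuring intensity, direction, spatial, and temporal attributes of motion and combining them into a single digital descriptor. The sketch below computes simple stand-ins for each attribute from per-frame motion-vector fields; none of the specific attribute definitions is taken from the patent.

```python
# Sketch of combining intensity, direction, spatial and temporal motion
# attributes into one descriptor. The attribute definitions are illustrative.
import numpy as np

def activity_descriptor(mv_fields, active_thresh=1.0):
    """mv_fields: list of (H, W, 2) motion-vector arrays, one per inter frame."""
    mags = np.array([np.linalg.norm(f, axis=-1) for f in mv_fields])      # (T, H, W)
    angs = np.array([np.arctan2(f[..., 1], f[..., 0]) for f in mv_fields])

    intensity = float(mags.std())                           # how much motion overall
    dom_direction = float(np.histogram(angs, bins=8,        # dominant motion direction
                          range=(-np.pi, np.pi), weights=mags)[0].argmax())
    spatial = float((mags > active_thresh).mean())          # fraction of "active" blocks
    temporal, _ = np.histogram(mags.mean(axis=(1, 2)), bins=4)   # activity over time

    # Combine the measured attributes into a single digital descriptor.
    return np.concatenate([[intensity, dom_direction, spatial],
                           temporal / (temporal.sum() + 1e-12)])

rng = np.random.default_rng(3)
mv_fields = [rng.normal(size=(18, 22, 2)) for _ in range(30)]   # toy motion-vector data
print(activity_descriptor(mv_fields))
```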