Patents by Inventor Xian-Sheng Hua

Xian-Sheng Hua has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10592769
    Abstract: Techniques describe submitting a video clip as a query by a user. A process retrieves images and information associated with the images in response to the query. The process decomposes the video clip into a sequence of frames to extract the features in a frame and to quantize the extracted features into descriptive words. The process further tracks the extracted features as points in the frame, a first set of points to correspond to a second set of points in consecutive frames to construct a sequence of points. Then the process identifies the points that satisfy criteria of being stable points and being centrally located in the frame to represent the video clip as a bag of descriptive words for searching for images and information related to the video clip.
    Type: Grant
    Filed: August 18, 2016
    Date of Patent: March 17, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Linjun Yang, Xian-Sheng Hua, Yang Cai
  • Patent number: 10521692
    Abstract: Techniques for intelligent image search results summarization and browsing scheme are described. Images having visual attributes are evaluated for similarities based in part on their visual attributes. At least one preference score indicating a probability of an image to be selected into a summary is calculated for each image. Images are selected based on the similarity of the selected images to the other images and the preference scores of the selected images. A summary of the plurality of images is generated including the selected one individual image.
    Type: Grant
    Filed: July 7, 2014
    Date of Patent: December 31, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jingdong Wang, Xian-Sheng Hua
  • Patent number: 10013637
    Abstract: Optimizing multi-class image classification by leveraging patch-based features extracted from weakly supervised images to train classifiers is described. A corpus of images associated with a set of labels may be received. One or more patches may be extracted from individual images in the corpus. Patch-based features may be extracted from the one or more patches and patch representations may be extracted from individual patches of the one or more patches. The patches may be arranged into clusters based at least in part on the patch-based features. At least some of the individual patches may be removed from individual clusters based at least in part on determined similarity values that are representative of similarity between the individual patches. The system may train classifiers based in part on patch-based features extracted from patches in the refined clusters. The classifiers may be used to accurately and efficiently classify new images.
    Type: Grant
    Filed: January 22, 2015
    Date of Patent: July 3, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ishan Misra, Jin Li, Xian-Sheng Hua
  • Patent number: 9875301
    Abstract: Systems and methods for learning topic models from unstructured data and applying the learned topic models to recognize semantics for new data items are described herein. In at least one embodiment, a corpus of multimedia data items associated with a set of labels may be processed to generate a refined corpus of multimedia data items associated with the set of labels. Such processing may include arranging the multimedia data items in clusters based on similarities of extracted multimedia features and generating intra-cluster and inter-cluster features. The intra-cluster and the inter-cluster features may be used for removing multimedia data items from the corpus to generate the refined corpus. The refined corpus may be used for training topic models for identifying labels. The resulting models may be stored and subsequently used for identifying semantics of a multimedia data item input by a user.
    Type: Grant
    Filed: April 30, 2014
    Date of Patent: January 23, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xian-Sheng Hua, Jin Li, Yoshitaka Ushiku
  • Patent number: 9785866
    Abstract: Techniques for optimizing multi-class image classification by leveraging negative multimedia data items to train and update classifiers are described. The techniques describe accessing positive multimedia data items of a plurality of multimedia data items, extracting features from the positive multimedia data items, and training classifiers based at least in part on the features. The classifiers may include a plurality of model vectors each corresponding to one of the individual labels. The system may iteratively test the classifiers using positive multimedia data and negative multimedia data and may update one or more model vectors associated with the classifiers differently, depending on whether multimedia data items are positive or negative. Techniques for applying the classifiers to determine whether a new multimedia data item is associated with a topic based at least in part on comparing similarity values with corresponding statistics derived from classifier training are also described.
    Type: Grant
    Filed: January 22, 2015
    Date of Patent: October 10, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xian-Sheng Hua, Jin Li, Ishan Misra
  • Patent number: 9788080
    Abstract: Systems and methods for automatically inserting advertisements into source video content playback streams are described. In one aspect, the systems and methods communicate a source video content playback stream to a video player to present source video to a user. During playback of the source video, and in response to receipt of a request from the user to navigate portions of the source video (e.g., a user command to fast forward the source video, rewind the source video, or other action), the systems and methods dynamically define a video advertisement clip insertion point (e.g., an insertion point based on a current playback position). The systems and methods then insert a contextually relevant and/or targeted video advertisement clip into the playback stream for presentation to the user.
    Type: Grant
    Filed: December 19, 2016
    Date of Patent: October 10, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xian-Sheng Hua, Wei Lai, Wei-Ying Ma, Shipeng Li
  • Patent number: 9646227
    Abstract: This disclosure describes techniques for training models from video data and applying the learned models to identify desirable video data. Video data may be labeled to indicate a semantic category and/or a score indicative of desirability. The video data may be processed to extract low and high level features. A classifier and a scoring model may be trained based on the extracted features. The classifier may estimate a probability that the video data belongs to at least one of the categories in a set of semantic categories. The scoring model may determine a desirability score for the video data. New video data may be processed to extract low and high level features, and feature values may be determined based on the extracted features. The learned classifier and scoring model may be applied to the feature values to determine a desirability score associated with the new video data.
    Type: Grant
    Filed: July 29, 2014
    Date of Patent: May 9, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nitin Suri, Xian-Sheng Hua, Tzong-Jhy Wang, William D. Sproule, Andrew S. Ivory, Jin Li
  • Patent number: 9628673
    Abstract: Described is perceptually near-lossless video summarization for use in maintaining video summaries, which operates to substantially reconstruct an original video in a generally perceptually near-lossless manner. A video stream is summarized with little information loss by using a relatively very small piece of summary metadata. The summary metadata comprises an image set of synthesized mosaics and representative keyframes, audio data, and the metadata about video structure and motion. In one implementation, the metadata is computed and maintained (e.g., as a file) to summarize a relatively large video sequence, by segmenting a video shot into subshots, and selecting keyframes and mosaics based upon motion data corresponding to those subshots. The motion data is maintained as a semantic description associated with the image set.
    Type: Grant
    Filed: April 28, 2010
    Date of Patent: April 18, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Lin-Xie Tang
  • Publication number: 20170099526
    Abstract: Systems and methods for automatically inserting advertisements into source video content playback streams are described. In one aspect, the systems and methods communicate a source video content playback stream to a video player to present source video to a user. During playback of the source video, and in response to receipt of a request from the user to navigate portions of the source video (e.g., a user command to fast forward the source video, rewind the source video, or other action), the systems and methods dynamically define a video advertisement clip insertion point (e.g., an insertion point based on a current playback position). The systems and methods then insert a contextually relevant and/or targeted video advertisement clip into the playback stream for presentation to the user.
    Type: Application
    Filed: December 19, 2016
    Publication date: April 6, 2017
    Inventors: Xian-Sheng Hua, Wei Lai, Wei-Ying Ma, Shipeng Li
  • Patent number: 9554093
    Abstract: Systems and methods for automatically inserting advertisements into source video content playback streams are described. In one aspect, the systems and methods communicate a source video content playback stream to a video player to present source video to a user. During playback of the source video, and in response to receipt of a request from the user to navigate portions of the source video (e.g., a user command to fast forward the source video, rewind the source video, or other action), the systems and methods dynamically define a video advertisement clip insertion point (e.g., an insertion point based on a current playback position). The systems and methods then insert a contextually relevant and/or targeted video advertisement clip into the playback stream for presentation to the user.
    Type: Grant
    Filed: January 23, 2007
    Date of Patent: January 24, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xian-Sheng Hua, Wei Lai, Wei-Ying Ma, Shipeng Li
  • Publication number: 20160358036
    Abstract: Techniques describe submitting a video clip as a query by a user. A process retrieves images and information associated with the images in response to the query. The process decomposes the video clip into a sequence of frames to extract the features in a frame and to quantize the extracted features into descriptive words. The process further tracks the extracted features as points in the frame, a first set of points to correspond to a second set of points in consecutive frames to construct a sequence of points. Then the process identifies the points that satisfy criteria of being stable points and being centrally located in the frame to represent the video clip as a bag of descriptive words for searching for images and information related to the video clip.
    Type: Application
    Filed: August 18, 2016
    Publication date: December 8, 2016
    Inventors: Linjun Yang, Xian-Sheng Hua, Yang Cai
  • Publication number: 20160358025
    Abstract: Many internet users consume content through online videos. For example, users may view movies, television shows, music videos, and/or homemade videos. It may be advantageous to provide additional information to users consuming the online videos. Unfortunately, many current techniques may be unable to provide additional information relevant to the online videos from outside sources. Accordingly, one or more systems and/or techniques for determining a set of additional information relevant to an online video are disclosed herein. In particular, visual, textual, audio, and/or other features may be extracted from an online video (e.g., original content of the online video and/or embedded advertisements). Using the extracted features, additional information (e.g., images, advertisements, etc.) may be determined based upon matching the extracted features with content of a database. The additional information may be presented to a user consuming the online video.
    Type: Application
    Filed: August 19, 2016
    Publication date: December 8, 2016
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li
  • Patent number: 9443011
    Abstract: Techniques describe submitting a video clip as a query by a user. A process retrieves images and information associated with the images in response to the query. The process decomposes the video clip into a sequence of frames to extract the features in a frame and to quantize the extracted features into descriptive words. The process further tracks the extracted features as points in the frame, a first set of points to correspond to a second set of points in consecutive frames to construct a sequence of points. Then the process identifies the points that satisfy criteria of being stable points and being centrally located in the frame to represent the video clip as a bag of descriptive words for searching for images and information related to the video clip.
    Type: Grant
    Filed: May 18, 2011
    Date of Patent: September 13, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Linjun Yang, Xian-Sheng Hua, Yang Cai
  • Patent number: 9443147
    Abstract: Many internet users consume content through online videos. For example, users may view movies, television shows, music videos, and/or homemade videos. It may be advantageous to provide additional information to users consuming the online videos. Unfortunately, many current techniques may be unable to provide additional information relevant to the online videos from outside sources. Accordingly, one or more systems and/or techniques for determining a set of additional information relevant to an online video are disclosed herein. In particular, visual, textual, audio, and/or other features may be extracted from an online video (e.g., original content of the online video and/or embedded advertisements). Using the extracted features, additional information (e.g., images, advertisements, etc.) may be determined based upon matching the extracted features with content of a database. The additional information may be presented to a user consuming the online video.
    Type: Grant
    Filed: April 26, 2010
    Date of Patent: September 13, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li
  • Publication number: 20160217349
    Abstract: Techniques for optimizing multi-class image classification by leveraging negative multimedia data items to train and update classifiers are described. The techniques describe accessing positive multimedia data items of a plurality of multimedia data items, extracting features from the positive multimedia data items, and training classifiers based at least in part on the features. The classifiers may include a plurality of model vectors each corresponding to one of the individual labels. The system may iteratively test the classifiers using positive multimedia data and negative multimedia data and may update one or more model vectors associated with the classifiers differently, depending on whether multimedia data items are positive or negative. Techniques for applying the classifiers to determine whether a new multimedia data item is associated with a topic based at least in part on comparing similarity values with corresponding statistics derived from classifier training are also described.
    Type: Application
    Filed: January 22, 2015
    Publication date: July 28, 2016
    Inventors: Xian-Sheng Hua, Jin Li, Ishan Misra
  • Publication number: 20160217344
    Abstract: Optimizing multi-class image classification by leveraging patch-based features extracted from weakly supervised images to train classifiers is described. A corpus of images associated with a set of labels may be received. One or more patches may be extracted from individual images in the corpus. Patch-based features may be extracted from the one or more patches and patch representations may be extracted from individual patches of the one or more patches. The patches may be arranged into clusters based at least in part on the patch-based features. At least some of the individual patches may be removed from individual clusters based at least in part on determined similarity values that are representative of similarity between the individual patches. The system may train classifiers based in part on patch-based features extracted from patches in the refined clusters. The classifiers may be used to accurately and efficiently classify new images.
    Type: Application
    Filed: January 22, 2015
    Publication date: July 28, 2016
    Inventors: Ishan Misra, Jin Li, Xian-Sheng Hua
  • Patent number: 9336316
    Abstract: Architecture that includes a junk (unwanted) image detection algorithm which performs junk image detection of unwanted images before the images are actually downloaded for indexing. Features are employed related to image location information and host websites, such as image path descriptor (e.g., URL-uniform resource locator) pattern features, webpage content features, click features, and image aggregated information in a machine learning based framework to predict the probability that an image is unwanted (or wanted) before the images are downloaded. The framework is then applied to build a statistical model and predict junk scores. By removing image URLs marked as “junk” from the work list of an automated indexer (e.g., crawler), the indexer bandwidth is significantly improved with a corresponding improvement in the publish rate.
    Type: Grant
    Filed: November 5, 2012
    Date of Patent: May 10, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Zhong Wu, Xian-Sheng Hua
  • Patent number: 9271035
    Abstract: Tools and techniques for acquiring key roles and their relationships from a video independent of metadata, such as cast lists and scripts, are described herein. These techniques include discovering key roles and their relationships by treating a video (e.g., a movie, television program, music video, and personal video, etc.) as a community. For instance, a video is segmented into a hierarchical structure that includes levels for scenes, shots, and key frames. In some implementations, the techniques include performing face detection and grouping on the detected key frames. In some implementations, the techniques include exploiting the key roles and their correlations in this video to discover a community. The discovered community provides for a wide variety of applications, including the automatic generation of visual summaries or video posters including acquired key roles.
    Type: Grant
    Filed: April 12, 2011
    Date of Patent: February 23, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Yan Wang
  • Publication number: 20160034786
    Abstract: This disclosure describes techniques for training models from video data and applying the learned models to identify desirable video data. Video data may be labeled to indicate a semantic category and/or a score indicative of desirability. The video data may be processed to extract low and high level features. A classifier and a scoring model may be trained based on the extracted features. The classifier may estimate a probability that the video data belongs to at least one of the categories in a set of semantic categories. The scoring model may determine a desirability score for the video data. New video data may be processed to extract low and high level features, and feature values may be determined based on the extracted features. The learned classifier and scoring model may be applied to the feature values to determine a desirability score associated with the new video data.
    Type: Application
    Filed: July 29, 2014
    Publication date: February 4, 2016
    Inventors: Nitin Suri, Xian-Sheng Hua, Tzong-Jhy Wang, William D. Sproule, Andrew S. Ivory, Jin Li
  • Publication number: 20150332124
    Abstract: A similarity of a first video to a second video may be identified automatically. Images are received from the videos, and divided into sub-images. The sub-images are evaluated based on a feature common to each of the sub-images. Binary representations of the images may be created based on the evaluation of the sub-images. A similarity of the first video to the second video may be determined based on a number of occurrences of a binary representation in the first video and the second video.
    Type: Application
    Filed: July 27, 2015
    Publication date: November 19, 2015
    Inventors: Linjun Yang, Lifeng Shang, Xian-Sheng Hua, Fei Wang
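
The abstract of publication 20150332124 above outlines a pipeline: split each image into sub-images, evaluate them on a shared feature, build a binary representation per image, and compare videos by how often each representation occurs. The following is only an illustrative sketch of that general idea, not the method claimed in the application. Everything specific here is an assumption: frames are plain 2D lists of grayscale values, the shared feature is mean intensity, a bit is set when a sub-image is at least as bright as the average sub-image, and similarity is a normalized histogram intersection. The names `binary_code` and `video_similarity` are hypothetical.

```python
from collections import Counter


def binary_code(image, grid=2):
    """Return one bit per sub-image of a grid x grid split (assumed scheme).

    `image` is a 2D list of grayscale values. A bit is 1 when the sub-image's
    mean intensity is at least the average of all sub-image means.
    """
    rows, cols = len(image), len(image[0])
    means = []
    for gr in range(grid):
        for gc in range(grid):
            r0, r1 = gr * rows // grid, (gr + 1) * rows // grid
            c0, c1 = gc * cols // grid, (gc + 1) * cols // grid
            block = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            means.append(sum(block) / len(block))
    overall = sum(means) / len(means)
    return tuple(1 if m >= overall else 0 for m in means)


def video_similarity(frames_a, frames_b, grid=2):
    """Compare two videos by occurrence counts of their binary codes.

    Assumed measure: histogram intersection divided by the longer video's
    frame count, giving a value in [0, 1].
    """
    if not frames_a or not frames_b:
        return 0.0
    hist_a = Counter(binary_code(f, grid) for f in frames_a)
    hist_b = Counter(binary_code(f, grid) for f in frames_b)
    shared = sum((hist_a & hist_b).values())  # per-code minimum of counts
    return shared / max(len(frames_a), len(frames_b))


# Two toy 4x4 frames: one bright on the left half, one bright on the top half.
bright_left = [[200, 200, 10, 10] for _ in range(4)]
bright_top = [[200] * 4, [200] * 4, [10] * 4, [10] * 4]

video_a = [bright_left, bright_left, bright_top]
video_b = [bright_left, bright_top, bright_top]
print(video_similarity(video_a, video_b))
```

In this toy setup the two videos share one occurrence of each code out of three frames, so the score falls between 0 and 1; a video compared with itself scores 1.0. Because the comparison uses only occurrence counts, it ignores the order in which frames appear.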