Patents by Inventor Xian-Sheng Hua
Xian-Sheng Hua has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10592769
Abstract: Techniques describe submitting a video clip as a query by a user. A process retrieves images and information associated with the images in response to the query. The process decomposes the video clip into a sequence of frames, extracts the features in each frame, and quantizes the extracted features into descriptive words. The process further tracks the extracted features as points in the frame, with a first set of points corresponding to a second set of points in consecutive frames to construct a sequence of points. The process then identifies the points that satisfy the criteria of being stable and centrally located in the frame, to represent the video clip as a bag of descriptive words for searching for images and information related to the video clip.
Type: Grant
Filed: August 18, 2016
Date of Patent: March 17, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Linjun Yang, Xian-Sheng Hua, Yang Cai
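The tracking-and-quantization pipeline above can be sketched in miniature. This is an illustrative toy, not the patented implementation: the track format, the vocabulary of "descriptive words", the stability criterion (a minimum track length), and the "central" region (the middle half of the frame) are all assumptions made for the example.

```python
# Toy sketch: keep only stable, centrally located tracked points and
# quantize them into "descriptive words" from a (hypothetical) vocabulary.

def quantize(point, vocab):
    """Map a point to the nearest vocabulary word by squared distance."""
    return min(vocab, key=lambda w: (w[0] - point[0]) ** 2 + (w[1] - point[1]) ** 2)

def stable_central_words(tracks, vocab, frame_w=640, frame_h=480, min_len=3):
    """A track is a list of per-frame (x, y) positions of one feature point.
    Keep tracks followed across >= min_len consecutive frames whose mean
    position lies in the central half of the frame, then quantize to words."""
    words = []
    for track in tracks:
        if len(track) < min_len:
            continue  # unstable: not tracked across enough consecutive frames
        cx = sum(p[0] for p in track) / len(track)
        cy = sum(p[1] for p in track) / len(track)
        central_x = frame_w * 0.25 <= cx <= frame_w * 0.75
        central_y = frame_h * 0.25 <= cy <= frame_h * 0.75
        if central_x and central_y:
            words.append(quantize((cx, cy), vocab))
    return words  # the video clip's "bag of descriptive words"
```

A track that drifts near a vocabulary word for several frames contributes that word; short or peripheral tracks are discarded before quantization.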
-
Patent number: 10521692
Abstract: Techniques for an intelligent image search results summarization and browsing scheme are described. Images having visual attributes are evaluated for similarities based in part on those visual attributes. At least one preference score, indicating the probability of an image being selected into a summary, is calculated for each image. Images are selected based on their similarity to the other images and on their preference scores. A summary of the plurality of images is then generated from the selected images.
Type: Grant
Filed: July 7, 2014
Date of Patent: December 31, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jingdong Wang, Xian-Sheng Hua
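One common way to trade preference scores against inter-image similarity, offered here only as a hedged stand-in for the patented selection step, is a greedy pick that penalizes redundancy with images already chosen; the `trade_off` weight and the use of max-similarity as the redundancy measure are assumptions.

```python
# Toy summary selection: repeatedly pick the image whose preference score,
# minus a penalty for resembling already-selected images, is highest.

def summarize(scores, sim, k, trade_off=0.5):
    """scores[i]: preference score of image i; sim[i][j]: pairwise similarity.
    Returns the indices of the k images chosen for the summary."""
    selected = []
    candidates = set(range(len(scores)))
    while len(selected) < k and candidates:
        def gain(i):
            # redundancy: closest match among images already in the summary
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return scores[i] - trade_off * redundancy
        best = max(candidates, key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With two near-duplicates and one distinct image, the greedy rule prefers the distinct image over the slightly higher-scoring duplicate.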
-
Patent number: 10013637
Abstract: Optimizing multi-class image classification by leveraging patch-based features extracted from weakly supervised images to train classifiers is described. A corpus of images associated with a set of labels may be received. One or more patches may be extracted from individual images in the corpus. Patch-based features may be extracted from the one or more patches and patch representations may be extracted from individual patches of the one or more patches. The patches may be arranged into clusters based at least in part on the patch-based features. At least some of the individual patches may be removed from individual clusters based at least in part on determined similarity values that are representative of similarity between the individual patches. The system may train classifiers based in part on patch-based features extracted from patches in the refined clusters. The classifiers may be used to accurately and efficiently classify new images.
Type: Grant
Filed: January 22, 2015
Date of Patent: July 3, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Ishan Misra, Jin Li, Xian-Sheng Hua
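The cluster-refinement step (removing patches dissimilar to the rest of their cluster) might look like the following toy; cosine similarity, the averaging rule, and the threshold are assumptions, not the patent's determined similarity values.

```python
# Toy cluster refinement: keep a patch only if its average cosine
# similarity to the other patches in its cluster clears a threshold.

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def refine_cluster(patches, threshold=0.4):
    """patches: list of feature vectors assigned to one cluster."""
    kept = []
    for i, p in enumerate(patches):
        others = [q for j, q in enumerate(patches) if j != i]
        if not others:
            kept.append(p)
            continue
        avg_sim = sum(cosine(p, q) for q in others) / len(others)
        if avg_sim >= threshold:
            kept.append(p)  # consistent with its cluster; survives refinement
    return kept
```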
-
Patent number: 9875301
Abstract: Systems and methods for learning topic models from unstructured data and applying the learned topic models to recognize semantics for new data items are described herein. In at least one embodiment, a corpus of multimedia data items associated with a set of labels may be processed to generate a refined corpus of multimedia data items associated with the set of labels. Such processing may include arranging the multimedia data items in clusters based on similarities of extracted multimedia features and generating intra-cluster and inter-cluster features. The intra-cluster and the inter-cluster features may be used for removing multimedia data items from the corpus to generate the refined corpus. The refined corpus may be used for training topic models for identifying labels. The resulting models may be stored and subsequently used for identifying semantics of a multimedia data item input by a user.
Type: Grant
Filed: April 30, 2014
Date of Patent: January 23, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Xian-Sheng Hua, Jin Li, Yoshitaka Ushiku
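As a hedged illustration of refining a corpus with intra- versus inter-cluster measures (the patent's features are learned, not this simple rule): drop an item when it lies closer to another cluster's centroid than to its own.

```python
# Toy corpus refinement: an item survives only if its intra-cluster
# distance (to its own centroid) does not exceed every inter-cluster
# distance (to the other labels' centroids).

def refine_corpus(clusters):
    """clusters: dict mapping label -> list of feature vectors."""
    def centroid(vecs):
        return [sum(coord) / len(vecs) for coord in zip(*vecs)]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centroids = {label: centroid(vecs) for label, vecs in clusters.items()}
    refined = {}
    for label, vecs in clusters.items():
        kept = []
        for v in vecs:
            own = dist(v, centroids[label])  # intra-cluster distance
            others = [dist(v, c) for l, c in centroids.items() if l != label]
            if not others or own <= min(others):  # inter-cluster check
                kept.append(v)
        refined[label] = kept
    return refined
```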
-
Patent number: 9785866
Abstract: Techniques for optimizing multi-class image classification by leveraging negative multimedia data items to train and update classifiers are described. The techniques describe accessing positive multimedia data items of a plurality of multimedia data items, extracting features from the positive multimedia data items, and training classifiers based at least in part on the features. The classifiers may include a plurality of model vectors each corresponding to one of the individual labels. The system may iteratively test the classifiers using positive multimedia data and negative multimedia data and may update one or more model vectors associated with the classifiers differently, depending on whether multimedia data items are positive or negative. Techniques for applying the classifiers to determine whether a new multimedia data item is associated with a topic based at least in part on comparing similarity values with corresponding statistics derived from classifier training are also described.
Type: Grant
Filed: January 22, 2015
Date of Patent: October 10, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Xian-Sheng Hua, Jin Li, Ishan Misra
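The "update model vectors differently for positive and negative items" idea can be sketched as a perceptron-style rule. This is an assumption-laden toy, not the patented procedure: one weight vector per label, where a mis-scored positive pulls its label's vector toward the features and a mis-scored negative pushes it away.

```python
# Toy asymmetric update: positives and negatives trigger different,
# mistake-driven adjustments to the label's model vector.

def update(model, features, label, is_positive, lr=0.1):
    """model: dict mapping label -> weight vector (one model vector per label).
    Updates model[label] in place depending on item polarity."""
    score = sum(w * f for w, f in zip(model[label], features))
    if is_positive and score <= 0:
        # positive item mis-scored: pull the vector toward the features
        model[label] = [w + lr * f for w, f in zip(model[label], features)]
    elif not is_positive and score > 0:
        # negative item mis-scored: push the vector away from the features
        model[label] = [w - lr * f for w, f in zip(model[label], features)]
    return model
```

Correctly scored items leave the model untouched, so repeated passes over the data converge once every item is on the right side of zero.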
-
Patent number: 9788080
Abstract: Systems and methods for automatically inserting advertisements into source video content playback streams are described. In one aspect, the systems and methods communicate a source video content playback stream to a video player to present source video to a user. During playback of the source video, and in response to receipt of a request from the user to navigate portions of the source video (e.g., a user command to fast forward the source video, rewind the source video, or another action), the systems and methods dynamically define a video advertisement clip insertion point (e.g., an insertion point based on the current playback position). The systems and methods then insert a contextually relevant and/or targeted video advertisement clip into the playback stream for presentation to the user.
Type: Grant
Filed: December 19, 2016
Date of Patent: October 10, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Xian-Sheng Hua, Wei Lai, Wei-Ying Ma, Shipeng Li
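A minimal sketch of the dynamic insertion-point idea, with an invented keyword-overlap relevance measure standing in for the patent's contextual targeting:

```python
# Toy seek handler: when the viewer navigates, treat the current playback
# position as the dynamically defined insertion point and pick the ad clip
# whose keywords best overlap the video's context.

def on_seek(playback_position, ad_clips, context_keywords):
    """ad_clips: list of (keywords, clip_id) pairs.
    Returns (insertion_point, chosen clip_id)."""
    def relevance(keywords):
        return len(set(keywords) & set(context_keywords))

    best_keywords, best_clip = max(ad_clips, key=lambda c: relevance(c[0]))
    insertion_point = playback_position  # defined at the moment of the seek
    return insertion_point, best_clip
```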
-
Patent number: 9646227
Abstract: This disclosure describes techniques for training models from video data and applying the learned models to identify desirable video data. Video data may be labeled to indicate a semantic category and/or a score indicative of desirability. The video data may be processed to extract low and high level features. A classifier and a scoring model may be trained based on the extracted features. The classifier may estimate a probability that the video data belongs to at least one of the categories in a set of semantic categories. The scoring model may determine a desirability score for the video data. New video data may be processed to extract low and high level features, and feature values may be determined based on the extracted features. The learned classifier and scoring model may be applied to the feature values to determine a desirability score associated with the new video data.
Type: Grant
Filed: July 29, 2014
Date of Patent: May 9, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Nitin Suri, Xian-Sheng Hua, Tzong-Jhy Wang, William D. Sproule, Andrew S. Ivory, Jin Li
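A hedged stand-in for the trained scoring model (the real model is learned from labeled video; the feature names, weights, and sigmoid squashing here are invented for illustration):

```python
import math

# Toy desirability scorer: a linear combination of extracted feature
# values, squashed to (0, 1) so scores are comparable across videos.

def desirability_score(features, weights, bias=0.0):
    """features and weights: dicts keyed by (hypothetical) feature names."""
    s = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-s))  # sigmoid squash to (0, 1)
```

A zero-weight model scores every video at exactly 0.5; increasing the weight on a feature a video exhibits pushes its score above that neutral point.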
-
Patent number: 9628673
Abstract: Described is perceptually near-lossless video summarization for use in maintaining video summaries, which can substantially reconstruct the original video. A video stream is summarized with little information loss by using a relatively small piece of summary metadata. The summary metadata comprises an image set of synthesized mosaics and representative keyframes, audio data, and metadata about video structure and motion. In one implementation, the metadata is computed and maintained (e.g., as a file) to summarize a relatively large video sequence by segmenting a video shot into subshots and selecting keyframes and mosaics based upon motion data corresponding to those subshots. The motion data is maintained as a semantic description associated with the image set.
Type: Grant
Filed: April 28, 2010
Date of Patent: April 18, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Lin-Xie Tang
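The subshot segmentation step can be illustrated with a toy motion-threshold rule; the threshold value and the "least outgoing motion" keyframe choice are assumptions, not the patented selection logic.

```python
# Toy subshot segmentation: cut wherever inter-frame motion spikes, then
# pick the most stable frame of each subshot as its keyframe.

def segment_subshots(motion, threshold=0.6):
    """motion[i]: motion magnitude between frame i and frame i + 1.
    Returns (subshots, keyframes): lists of frame-index lists, and one
    keyframe index per subshot."""
    subshots, current = [], [0]
    for i, m in enumerate(motion):
        if m > threshold:           # large motion: start a new subshot
            subshots.append(current)
            current = [i + 1]
        else:
            current.append(i + 1)
    subshots.append(current)

    keyframes = []
    for shot in subshots:
        # keyframe = frame with the least outgoing motion (stable content);
        # the final frame has no outgoing motion, so it counts as 0.0
        keyframes.append(min(shot, key=lambda f: motion[f] if f < len(motion) else 0.0))
    return subshots, keyframes
```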
-
Publication number: 20170099526
Abstract: Systems and methods for automatically inserting advertisements into source video content playback streams are described. In one aspect, the systems and methods communicate a source video content playback stream to a video player to present source video to a user. During playback of the source video, and in response to receipt of a request from the user to navigate portions of the source video (e.g., a user command to fast forward the source video, rewind the source video, or another action), the systems and methods dynamically define a video advertisement clip insertion point (e.g., an insertion point based on the current playback position). The systems and methods then insert a contextually relevant and/or targeted video advertisement clip into the playback stream for presentation to the user.
Type: Application
Filed: December 19, 2016
Publication date: April 6, 2017
Inventors: Xian-Sheng Hua, Wei Lai, Wei-Ying Ma, Shipeng Li
-
Patent number: 9554093
Abstract: Systems and methods for automatically inserting advertisements into source video content playback streams are described. In one aspect, the systems and methods communicate a source video content playback stream to a video player to present source video to a user. During playback of the source video, and in response to receipt of a request from the user to navigate portions of the source video (e.g., a user command to fast forward the source video, rewind the source video, or another action), the systems and methods dynamically define a video advertisement clip insertion point (e.g., an insertion point based on the current playback position). The systems and methods then insert a contextually relevant and/or targeted video advertisement clip into the playback stream for presentation to the user.
Type: Grant
Filed: January 23, 2007
Date of Patent: January 24, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Xian-Sheng Hua, Wei Lai, Wei-Ying Ma, Shipeng Li
-
Publication number: 20160358036
Abstract: Techniques describe submitting a video clip as a query by a user. A process retrieves images and information associated with the images in response to the query. The process decomposes the video clip into a sequence of frames, extracts the features in each frame, and quantizes the extracted features into descriptive words. The process further tracks the extracted features as points in the frame, with a first set of points corresponding to a second set of points in consecutive frames to construct a sequence of points. The process then identifies the points that satisfy the criteria of being stable and centrally located in the frame, to represent the video clip as a bag of descriptive words for searching for images and information related to the video clip.
Type: Application
Filed: August 18, 2016
Publication date: December 8, 2016
Inventors: Linjun Yang, Xian-Sheng Hua, Yang Cai
-
Publication number: 20160358025
Abstract: Many internet users consume content through online videos. For example, users may view movies, television shows, music videos, and/or homemade videos. It may be advantageous to provide additional information to users consuming the online videos. Unfortunately, many current techniques may be unable to provide additional information relevant to the online videos from outside sources. Accordingly, one or more systems and/or techniques for determining a set of additional information relevant to an online video are disclosed herein. In particular, visual, textual, audio, and/or other features may be extracted from an online video (e.g., original content of the online video and/or embedded advertisements). Using the extracted features, additional information (e.g., images, advertisements, etc.) may be determined based upon matching the extracted features with content of a database. The additional information may be presented to a user consuming the online video.
Type: Application
Filed: August 19, 2016
Publication date: December 8, 2016
Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li
-
Patent number: 9443147
Abstract: Many internet users consume content through online videos. For example, users may view movies, television shows, music videos, and/or homemade videos. It may be advantageous to provide additional information to users consuming the online videos. Unfortunately, many current techniques may be unable to provide additional information relevant to the online videos from outside sources. Accordingly, one or more systems and/or techniques for determining a set of additional information relevant to an online video are disclosed herein. In particular, visual, textual, audio, and/or other features may be extracted from an online video (e.g., original content of the online video and/or embedded advertisements). Using the extracted features, additional information (e.g., images, advertisements, etc.) may be determined based upon matching the extracted features with content of a database. The additional information may be presented to a user consuming the online video.
Type: Grant
Filed: April 26, 2010
Date of Patent: September 13, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li
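A minimal sketch of matching extracted features against a database to retrieve additional information; the squared-distance ranking and the (features, info) database layout are assumptions made for the example.

```python
# Toy feature matcher: rank database entries by distance between their
# feature vectors and the video's extracted features, and return the
# additional information attached to the closest matches.

def match_additional_info(video_features, database, top_k=2):
    """database: list of (feature_vector, info) pairs."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    ranked = sorted(database, key=lambda entry: dist(entry[0], video_features))
    return [info for _, info in ranked[:top_k]]
```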
-
Patent number: 9443011
Abstract: Techniques describe submitting a video clip as a query by a user. A process retrieves images and information associated with the images in response to the query. The process decomposes the video clip into a sequence of frames, extracts the features in each frame, and quantizes the extracted features into descriptive words. The process further tracks the extracted features as points in the frame, with a first set of points corresponding to a second set of points in consecutive frames to construct a sequence of points. The process then identifies the points that satisfy the criteria of being stable and centrally located in the frame, to represent the video clip as a bag of descriptive words for searching for images and information related to the video clip.
Type: Grant
Filed: May 18, 2011
Date of Patent: September 13, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Linjun Yang, Xian-Sheng Hua, Yang Cai
-
Publication number: 20160217349
Abstract: Techniques for optimizing multi-class image classification by leveraging negative multimedia data items to train and update classifiers are described. The techniques describe accessing positive multimedia data items of a plurality of multimedia data items, extracting features from the positive multimedia data items, and training classifiers based at least in part on the features. The classifiers may include a plurality of model vectors each corresponding to one of the individual labels. The system may iteratively test the classifiers using positive multimedia data and negative multimedia data and may update one or more model vectors associated with the classifiers differently, depending on whether multimedia data items are positive or negative. Techniques for applying the classifiers to determine whether a new multimedia data item is associated with a topic based at least in part on comparing similarity values with corresponding statistics derived from classifier training are also described.
Type: Application
Filed: January 22, 2015
Publication date: July 28, 2016
Inventors: Xian-Sheng Hua, Jin Li, Ishan Misra
-
Publication number: 20160217344
Abstract: Optimizing multi-class image classification by leveraging patch-based features extracted from weakly supervised images to train classifiers is described. A corpus of images associated with a set of labels may be received. One or more patches may be extracted from individual images in the corpus. Patch-based features may be extracted from the one or more patches and patch representations may be extracted from individual patches of the one or more patches. The patches may be arranged into clusters based at least in part on the patch-based features. At least some of the individual patches may be removed from individual clusters based at least in part on determined similarity values that are representative of similarity between the individual patches. The system may train classifiers based in part on patch-based features extracted from patches in the refined clusters. The classifiers may be used to accurately and efficiently classify new images.
Type: Application
Filed: January 22, 2015
Publication date: July 28, 2016
Inventors: Ishan Misra, Jin Li, Xian-Sheng Hua
-
Patent number: 9336316
Abstract: Architecture that includes a junk (unwanted) image detection algorithm, which detects unwanted images before they are actually downloaded for indexing. Features related to image location information and host websites, such as image path descriptor (e.g., uniform resource locator (URL)) pattern features, webpage content features, click features, and aggregated image information, are employed in a machine-learning-based framework to predict the probability that an image is unwanted (or wanted) before the images are downloaded. The framework is then applied to build a statistical model and predict junk scores. By removing image URLs marked as "junk" from the work list of an automated indexer (e.g., crawler), indexer bandwidth is significantly improved, with a corresponding improvement in the publish rate.
Type: Grant
Filed: November 5, 2012
Date of Patent: May 10, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Zhong Wu, Xian-Sheng Hua
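The predict-before-download idea can be sketched with a toy linear model over invented URL-pattern features; the patent's model is learned from many more signals (click and webpage-content features among them), so the feature names and weights below are pure assumptions.

```python
import math

# Toy junk predictor: score a URL from path-pattern features alone, then
# filter a crawler work list so junk-scored images are never downloaded.

# invented weights for invented features
WEIGHTS = {"has_thumb": 1.5, "has_ad_path": 2.0}

def junk_score(url):
    """Probability-like junk score computed before any download."""
    features = {
        "has_thumb": "thumb" in url,
        "has_ad_path": "/ads/" in url,
    }
    s = sum(WEIGHTS[name] for name, active in features.items() if active)
    return 1.0 / (1.0 + math.exp(-s))  # sigmoid of the linear score

def filter_worklist(urls, cutoff=0.7):
    """Keep only URLs whose junk score stays below the cutoff."""
    return [u for u in urls if junk_score(u) < cutoff]
```

Filtering happens purely on the URL string, which is the point of the design: the bandwidth saving comes from never fetching the image bytes at all.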
-
Patent number: 9271035
Abstract: Tools and techniques for acquiring key roles and their relationships from a video independent of metadata, such as cast lists and scripts, are described herein. These techniques include discovering key roles and their relationships by treating a video (e.g., a movie, television program, music video, personal video, etc.) as a community. For instance, a video is segmented into a hierarchical structure that includes levels for scenes, shots, and key frames. In some implementations, the techniques include performing face detection and grouping on the detected key frames. In some implementations, the techniques include exploiting the key roles and their correlations in the video to discover a community. The discovered community enables a wide variety of applications, including the automatic generation of visual summaries or video posters featuring the acquired key roles.
Type: Grant
Filed: April 12, 2011
Date of Patent: February 23, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Yan Wang
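After face grouping, the community over roles can be illustrated as a scene co-occurrence graph; treating scenes as sets of role ids and ranking key roles by appearance count are simplifying assumptions, not the patented analysis.

```python
from collections import Counter
from itertools import combinations

# Toy community builder: count each role's scene appearances and each
# pair's co-occurrences across scenes.

def role_community(scenes):
    """scenes: list of sets of role ids appearing together in a scene.
    Returns (appearance counts, co-occurrence edge counts)."""
    appearances = Counter()
    edges = Counter()
    for roles in scenes:
        appearances.update(roles)
        for a, b in combinations(sorted(roles), 2):
            edges[(a, b)] += 1  # undirected edge, sorted for a stable key
    return appearances, edges

def key_roles(scenes, top_k=2):
    """Rank roles by how many scenes they appear in."""
    appearances, _ = role_community(scenes)
    return [role for role, _ in appearances.most_common(top_k)]
```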
-
Publication number: 20160034786
Abstract: This disclosure describes techniques for training models from video data and applying the learned models to identify desirable video data. Video data may be labeled to indicate a semantic category and/or a score indicative of desirability. The video data may be processed to extract low and high level features. A classifier and a scoring model may be trained based on the extracted features. The classifier may estimate a probability that the video data belongs to at least one of the categories in a set of semantic categories. The scoring model may determine a desirability score for the video data. New video data may be processed to extract low and high level features, and feature values may be determined based on the extracted features. The learned classifier and scoring model may be applied to the feature values to determine a desirability score associated with the new video data.
Type: Application
Filed: July 29, 2014
Publication date: February 4, 2016
Inventors: Nitin Suri, Xian-Sheng Hua, Tzong-Jhy Wang, William D. Sproule, Andrew S. Ivory, Jin Li
-
Publication number: 20150332124
Abstract: A similarity of a first video to a second video may be identified automatically. Images are received from the videos and divided into sub-images. The sub-images are evaluated based on a feature common to each of the sub-images. Binary representations of the images may be created based on the evaluation of the sub-images. A similarity of the first video to the second video may be determined based on the number of occurrences of a binary representation in the first video and the second video.
Type: Application
Filed: July 27, 2015
Publication date: November 19, 2015
Inventors: Linjun Yang, Lifeng Shang, Xian-Sheng Hua, Fei Wang
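A toy version of the binary sub-image representation and occurrence-count similarity (the grid size, mean-intensity thresholding, and histogram-intersection measure are assumptions, not the patented scheme):

```python
from collections import Counter

# Toy binary representation: divide each frame into a grid of sub-images,
# threshold each sub-image's mean intensity against the whole-frame mean
# to get one bit, and compare videos by how often each code occurs.

def frame_code(frame, grid=2):
    """frame: 2-D list of intensities; returns a tuple of grid*grid bits."""
    h, w = len(frame), len(frame[0])
    mean = sum(sum(row) for row in frame) / (h * w)
    bits = []
    for gy in range(grid):
        for gx in range(grid):
            cells = [frame[y][x]
                     for y in range(gy * h // grid, (gy + 1) * h // grid)
                     for x in range(gx * w // grid, (gx + 1) * w // grid)]
            bits.append(1 if sum(cells) / len(cells) >= mean else 0)
    return tuple(bits)

def video_similarity(frames_a, frames_b, grid=2):
    """Histogram intersection over binary-code occurrence counts."""
    counts_a = Counter(frame_code(f, grid) for f in frames_a)
    counts_b = Counter(frame_code(f, grid) for f in frames_b)
    overlap = sum(min(counts_a[c], counts_b[c]) for c in counts_a)
    return overlap / max(len(frames_a), len(frames_b))
```

Two videos sharing the same bright/dark layout in every frame score 1.0; videos whose layouts never coincide score 0.0.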