Patents by Inventor Xian-Sheng Hua

Xian-Sheng Hua has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20100158412
    Abstract: This disclosure describes various exemplary user interfaces, methods, and computer program products for interactively refining and ranking image search results using a color layout. The method includes receiving a text query for an image search and presenting image search results in a structured presentation based on the text query and information from an interest color layout. The process creates image search results that may be selected by the user based on color selection palettes or color layout specification schemes. Then the process ranks the image search results by sorting the results according to similarity scores between color layouts from the image search results and the interest color layout from a user based on the color selection palettes and the color layout specification schemes.
    Type: Application
    Filed: December 22, 2008
    Publication date: June 24, 2010
    Applicant: Microsoft Corporation
    Inventors: Jingdong Wang, Shipeng Li, Xian-Sheng Hua, Yinghai Zhao
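
The ranking step in publication 20100158412 (sorting results by the similarity between each result's color layout and the user's interest color layout) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from the application: the layout is modeled as a flat list of RGB cells, similarity is derived from mean Euclidean color distance, and the names `layout_similarity` and `rank_by_color_layout` are hypothetical.

```python
from math import sqrt


def layout_similarity(layout_a, layout_b):
    """Return a similarity in (0, 1]; 1.0 means the two layouts are identical.

    Assumption: a layout is a flat list of (R, G, B) cells taken from a
    coarse grid over the image, and both layouts use the same grid.
    """
    if not layout_a or len(layout_a) != len(layout_b):
        raise ValueError("layouts must be non-empty and have the same number of cells")
    total = 0.0
    for (r1, g1, b1), (r2, g2, b2) in zip(layout_a, layout_b):
        total += sqrt((r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2)
    mean_distance = total / len(layout_a)
    # Map a distance in [0, inf) to a similarity in (0, 1].
    return 1.0 / (1.0 + mean_distance)


def rank_by_color_layout(results, interest_layout):
    """Sort (image_id, layout) pairs by descending similarity to the interest layout."""
    scored = [(layout_similarity(layout, interest_layout), image_id)
              for image_id, layout in results]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored]


if __name__ == "__main__":
    RED, BLUE = (255, 0, 0), (0, 0, 255)
    interest = [RED, BLUE]  # e.g. red on the left, blue on the right
    results = [("a", [BLUE, RED]), ("b", [RED, BLUE]), ("c", [RED, RED])]
    print(rank_by_color_layout(results, interest))
```

A real system would likely compare layouts in a perceptual color space and tolerate small spatial shifts; the sketch only shows the score-then-sort structure the abstract describes.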
  • Publication number: 20100158396
    Abstract: This disclosure describes various exemplary systems, computer program products, and methods for feature distance metric learning with feature decomposition (DMLFD). The disclosure describes decomposing a high-dimensional feature space into one or more low-dimensional feature spaces according to minimum dependence. Furthermore, the disclosure describes how the sub-metrics are constructed and combined to form a global metric.
    Type: Application
    Filed: December 24, 2008
    Publication date: June 24, 2010
    Applicant: Microsoft Corporation
    Inventors: Meng Wang, Xian-Sheng Hua
  • Publication number: 20100149419
    Abstract: Embodiments that provide multi-video synthesis are disclosed. In accordance with one embodiment, multi-video synthesis includes breaking a main video into a plurality of main frames and breaking a supplementary video into a plurality of supplementary frames. The multi-video synthesis also includes assigning one or more supplementary frames into each of a plurality of states of a Hidden Markov Model (HMM), where each of the plurality of states corresponds to one or more main frames. The multi-video synthesis further includes determining optimal frames in the plurality of main frames for insertion of the plurality of supplementary frames based on the plurality of states and visual properties. The optimal frames include optimal insertion positions. The multi-video synthesis additionally includes inserting the plurality of supplementary frames into the optimal insertion positions to form a synthesized video.
    Type: Application
    Filed: December 12, 2008
    Publication date: June 17, 2010
    Applicant: Microsoft Corporation
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Teng Li
  • Publication number: 20100153219
    Abstract: Computer program products, devices, and methods for generating in-text embedded advertising are described. Embedded advertising is “hidden” or embedded into a message by matching an advertisement to the message and identifying a place in the message to insert the advertisement. For textual messages, statistical analysis of individual sentences is performed to determine where it would be most natural to insert an advertisement. Statistical rules of grammar derived from a language model may be used to choose a natural and grammatical place in the sentence for inserting the advertisement. Insertion of the advertisement creates a modified sentence without degrading a meaning of the original sentence, yet also includes the advertisement as a part of a new sentence.
    Type: Application
    Filed: December 12, 2008
    Publication date: June 17, 2010
    Applicant: Microsoft Corporation
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Linjun Yang
  • Publication number: 20100142803
    Abstract: This disclosure describes various exemplary methods and computer program products for transductive multi-label classification in detecting video concepts for information retrieval. This disclosure describes utilizing a hidden Markov random field formulation to detect labels for concepts in video content and modeling a multi-label interdependence between the labels by a pairwise Markov random field. The process groups the labels into several parts to speed up a labeling inference and calculates a conditional probability score for the labels; the conditional probability scores are ordered for ranking in a video retrieval evaluation.
    Type: Application
    Filed: December 5, 2008
    Publication date: June 10, 2010
    Applicant: Microsoft Corporation
    Inventors: Jingdong Wang, Shipeng Li, Xian-Sheng Hua, Yinghai Zhao
  • Publication number: 20100106486
    Abstract: Image-based semantic distance technique embodiments are presented that involve establishing a measure of an image-based semantic distance between semantic concepts. Generally, this entails respectively computing a semantic concept representation for each concept based on a collection of images associated with the concept. A degree of difference is then computed between two semantic concept representations to produce the aforementioned semantic distance measure for the pair of corresponding concepts.
    Type: Application
    Filed: December 19, 2008
    Publication date: April 29, 2010
    Applicant: Microsoft Corporation
    Inventors: Xian-Sheng Hua, Lei Wu, Shipeng Li
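
The two-stage structure in publication 20100106486 (compute one representation per concept from its associated images, then measure the difference between two representations) can be sketched as follows. This is a deliberately simple stand-in and not the representation described in the application: it assumes each image is already reduced to a fixed-length feature vector, uses the mean vector as the concept representation, and uses Euclidean distance as the degree of difference. The function names are hypothetical.

```python
from math import sqrt


def concept_representation(image_features):
    """Average the feature vectors of the images associated with one concept.

    Assumption: every image is a list of floats of the same length.
    """
    if not image_features:
        raise ValueError("a concept needs at least one associated image")
    dim = len(image_features[0])
    count = len(image_features)
    return [sum(vec[i] for vec in image_features) / count for i in range(dim)]


def semantic_distance(rep_a, rep_b):
    """Euclidean distance between two concept representations (an assumed measure)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(rep_a, rep_b)))


if __name__ == "__main__":
    # Toy 2-D features for images tagged with two concepts.
    cat_images = [[1.0, 0.0], [3.0, 0.0]]
    dog_images = [[2.0, 3.0], [2.0, 5.0]]
    cat = concept_representation(cat_images)
    dog = concept_representation(dog_images)
    print(semantic_distance(cat, dog))
```

Swapping in a richer representation (for example, a distribution over visual words) would change only `concept_representation` and the distance function, which is the point of separating the two stages.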
  • Publication number: 20100082614
    Abstract: A general framework for video search reranking is disclosed which explicitly formulates reranking into a global optimization problem from the Bayesian perspective. Under this framework, with two novel pair-wise ranking distances, two effective video search reranking methods, hinge reranking and preference strength reranking, are disclosed. Experiments conducted on the TRECVID dataset have demonstrated that the disclosed methods outperform several existing reranking approaches.
    Type: Application
    Filed: September 22, 2008
    Publication date: April 1, 2010
    Applicant: Microsoft Corporation
    Inventors: Linjun Yang, Jingdong Wang, Xian-Sheng Hua, Xinmei Tian
  • Publication number: 20100076923
    Abstract: Online multi-label active annotation may include building a preliminary classifier from a pre-labeled training set included with an initial batch of annotated data samples, and selecting a first batch of sample-label pairs from the initial batch of annotated data samples. The sample-label pairs may be selected by using a sample-label pair selection module. The first batch of sample-label pairs may be provided to online participants to manually annotate the first batch of sample-label pairs based on the preliminary classifier. The preliminary classifier may be updated to form a first updated classifier based on an outcome of providing the first batch of sample-label pairs to the online participants.
    Type: Application
    Filed: September 25, 2008
    Publication date: March 25, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Xian-Sheng Hua, Guo-Jun Qi, Shipeng Li
  • Publication number: 20100074537
    Abstract: Kernelized spatial-contextual image classification is disclosed. One embodiment comprises generating a first spatial-contextual model to represent a first image, the first spatial-contextual model having a plurality of interconnected nodes arranged in a first pattern of connections with each node connected to at least one other node, generating a second spatial-contextual model to represent a second image using the first pattern of connections, and estimating the distance between corresponding nodes in the first spatial-contextual model and the second spatial-contextual model based on a relationship with adjacent connected nodes to determine a distance between the first image and the second image.
    Type: Application
    Filed: September 24, 2008
    Publication date: March 25, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Xian-Sheng Hua, Guo-Jun Qi, Yong Rui, Hong-Jiang Zhang
  • Publication number: 20090319883
    Abstract: Described is a technology in which a new video is automatically annotated based on terms mined from the text associated with similar videos. In a search phase, searching by one or more various search modalities (e.g., text, concept and/or video) finds a set of videos that are similar to a new video. Text associated with the new video and with the set of videos is obtained, such as by automatic speech recognition that generates transcripts. A mining mechanism combines the associated text of the similar videos with that of the new video to find the terms that annotate the new video. For example, the mining mechanism creates a new term frequency vector by combining term frequency vectors for the set of similar videos with a term frequency vector for the new video, and provides the mined terms by fitting a Zipf curve to the new term frequency vector.
    Type: Application
    Filed: June 19, 2008
    Publication date: December 24, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Tao Mei, Xian-Sheng Hua, Wei-Ying Ma, Emily Kay Moxley
  • Publication number: 20090310854
    Abstract: Described is a technology by which an image is classified (e.g., grouped and/or labeled), based on multi-label multi-instance data learning-based classification according to semantic labels and regions. An image is processed in an integrated framework into multi-label multi-instance data, including region and image labels. The framework determines local association data based on each region of an image. Other multi-label multi-instance data is based on relationships between region labels of the image, relationships between image labels of the image, and relationships between the region and image labels. These data are combined to classify the image. Training is also described.
    Type: Application
    Filed: June 16, 2008
    Publication date: December 17, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Zheng-Jun Zha
  • Publication number: 20090313294
    Abstract: Images are automatically annotated using semantic distance learning. Training images are manually annotated and partitioned into semantic clusters. Semantic distance functions (SDFs) are learned for the clusters. The SDF for each cluster is used to compute semantic distance scores between a new image and each image in the cluster. The scores for each cluster are used to generate a ranking list which ranks each image in the cluster according to its semantic distance from the new image. An association probability is estimated for each cluster which specifies the probability of the new image being semantically associated with the cluster. Cluster-specific probabilistic annotations for the new image are generated from the manual annotations for the images in each cluster. The association probabilities and cluster-specific probabilistic annotations for all the clusters are used to generate final annotations for the new image.
    Type: Application
    Filed: June 11, 2008
    Publication date: December 17, 2009
    Applicant: Microsoft Corporation
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Yong Wang
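
The last step in publication 20090313294 (using the per-cluster association probabilities together with the cluster-specific probabilistic annotations to produce final annotations) can be sketched as a weighted sum. The weighted-sum rule, the dictionary layout, and the name `final_annotations` are assumptions for illustration; the abstract does not state how the two quantities are combined.

```python
def final_annotations(association_probs, cluster_annotations, top_k=3):
    """Combine per-cluster keyword probabilities, weighted by cluster association.

    association_probs:   {cluster_id: P(new image is associated with the cluster)}
    cluster_annotations: {cluster_id: {keyword: P(keyword | cluster)}}
    Returns the top_k (keyword, score) pairs, highest score first.
    """
    scores = {}
    for cluster_id, assoc in association_probs.items():
        for keyword, prob in cluster_annotations.get(cluster_id, {}).items():
            scores[keyword] = scores.get(keyword, 0.0) + assoc * prob
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    assoc = {"beach": 0.75, "city": 0.25}
    annotations = {
        "beach": {"sea": 1.0, "sky": 0.5},
        "city": {"sky": 1.0, "car": 0.5},
    }
    for keyword, score in final_annotations(assoc, annotations, top_k=2):
        print(keyword, score)
```

A keyword that appears in several clusters ("sky" above) accumulates score from each of them, so it can outrank a keyword that is strong in only one weakly associated cluster.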
  • Publication number: 20090292685
    Abstract: A video search re-ranking via multi-graph propagation technique employing multimodal fusion in video search is presented. It employs not only textual and visual features, but also semantic and conceptual similarity between video shots to rank or re-rank the search results received in response to a text-based search query. In one embodiment, the technique employs an object-sensitive approach to query analysis to improve the baseline result of text-based video search. The technique then employs a graph-based approach to text-based search result ranking or re-ranking. To better exploit the underlying relationship between video shots, the re-ranking scheme simultaneously leverages textual relevancy, semantic concept relevancy, and low-level-feature-based visual similarity. The technique constructs a set of graphs with the video shots as vertices, and the conceptual and visual similarity between video shots as hyperlinks.
    Type: Application
    Filed: May 22, 2008
    Publication date: November 26, 2009
    Applicant: Microsoft Corporation
    Inventors: Jingjing Liu, Xian-Sheng Hua, Wei Lai, Shipeng Li
  • Publication number: 20090290802
    Abstract: The concurrent multiple instance learning technique described encodes the inter-dependency between instances (e.g., regions in an image) in order to predict a label for a future instance, and, if desired, the label for an image determined from the label of these instances. The technique, in one embodiment, uses a concurrent tensor to model the semantic linkage between instances in a set of images. Based on the concurrent tensor, rank-1 supersymmetric non-negative tensor factorization (SNTF) can be applied to estimate the probability of each instance being relevant to a target category. In one embodiment, the technique formulates the label prediction processes in a regularization framework, which avoids overfitting, and significantly improves a learning machine's generalization capability, similar to that in SVMs. The technique, in one embodiment, uses Reproducing Kernel Hilbert Space (RKHS) to extend predicted labels to the whole feature space based on the generalized representer theorem.
    Type: Application
    Filed: May 22, 2008
    Publication date: November 26, 2009
    Applicant: Microsoft Corporation
    Inventors: Xian-Sheng Hua, Guo-Jun Qi, Yong Rui, Tao Mei, Hong-Jiang Zhang
  • Publication number: 20090274434
    Abstract: Visual concepts contained within a video clip are classified based upon a set of target concepts. The clip is segmented into shots and a multi-layer multi-instance (MLMI) structured metadata representation of each shot is constructed. A set of pre-generated trained models of the target concepts is validated using a set of training shots. An MLMI kernel is recursively generated which models the MLMI structured metadata representation of each shot by comparing prescribed pairs of shots. The MLMI kernel is subsequently utilized to generate a learned objective decision function which learns a classifier for determining if a particular shot (that is not in the set of training shots) contains instances of the target concepts. A regularization framework can also be utilized in conjunction with the MLMI kernel to generate modified learned objective decision functions. The regularization framework introduces explicit constraints which serve to maximize the precision of the classifier.
    Type: Application
    Filed: April 29, 2008
    Publication date: November 5, 2009
    Applicant: Microsoft Corporation
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Zhiwei Gu
  • Patent number: 7610554
    Abstract: Systems and methods for template-based multimedia capturing are described. In one aspect, a capturing template is selected to facilitate capturing a particular quantity and type(s) of media content. Media content is captured based on a temporal structure provided by the capturing template. These quantities and types of media content captured with respect to the temporal structure facilitate media content browsing, indexing, authoring, and sharing activities.
    Type: Grant
    Filed: November 1, 2005
    Date of Patent: October 27, 2009
    Assignee: Microsoft Corporation
    Inventors: Xian-Sheng Hua, Shipeng Li
  • Patent number: 7565016
    Abstract: Systems and methods for learning-based automatic commercial content detection are described. In one aspect, the systems and methods include a training component and an analyzing component. The training component trains a commercial content classification model using a kernel support vector machine. The analyzing component analyzes program data such as video and audio data using the commercial content classification model and one or more of single-side left neighborhood(s) and right neighborhood(s) of program data segments. Based on this analysis, each of the program data segments is classified as a commercial or non-commercial segment.
    Type: Grant
    Filed: January 15, 2007
    Date of Patent: July 21, 2009
    Assignee: Microsoft Corporation
    Inventors: Xian-Sheng Hua, Lie Lu, Mingjing Li, Hong-Jiang Zhang
  • Publication number: 20090171787
    Abstract: A method for online advertising makes an impressionative presentation of an advertisement to a viewer. The impressionative presentation is an impressionized version of an original online source medium such as a photo. The method associates advertisements with the source medium based, at least in part, on calculated ad relevance, and determines one or more viewer interactive points on the original source medium. The method then presents to the viewer an ad-augmented medium including an impressionized version of the source medium, which has the ability to change the form of impression to a viewer in response to an interactive act conducted by the viewer. The ad-augmented medium may include the associated advertisement content or direct the viewer's attention thereto.
    Type: Application
    Filed: June 20, 2008
    Publication date: July 2, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li
  • Publication number: 20090125461
    Abstract: Multi-label active learning may entail training a classifier with a set of training samples having multiple labels per sample. In an example embodiment, a method includes accepting a set of training samples, with the set of training samples having multiple respective samples that are each respectively associated with multiple labels. The set of training samples is analyzed to select a sample-label pair responsive to at least one error parameter. The selected sample-label pair is then submitted to an oracle for labeling.
    Type: Application
    Filed: December 17, 2007
    Publication date: May 14, 2009
    Applicant: Microsoft Corporation
    Inventors: Guo-Jun Qi, Xian-Sheng Hua, Yong Rui, Hong-Jiang Zhang, Shipeng Li
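
The selection step in publication 20090125461 (choosing one sample-label pair to send to an oracle) can be sketched with a simple uncertainty criterion. The abstract says only that selection is "responsive to at least one error parameter"; using the distance of the predicted probability from 0.5 as that parameter is an assumption of this sketch, as are the data layout and the name `select_sample_label_pair`.

```python
def select_sample_label_pair(predicted_probs, labeled_pairs):
    """Pick the unlabeled (sample, label) pair the classifier is least sure about.

    predicted_probs: {(sample_id, label): P(label applies to sample)}
    labeled_pairs:   set of (sample_id, label) pairs already annotated
    Returns the chosen pair, or None if every pair is already labeled.
    """
    candidates = [
        (abs(prob - 0.5), pair)  # 0.0 means maximally uncertain
        for pair, prob in predicted_probs.items()
        if pair not in labeled_pairs
    ]
    if not candidates:
        return None
    return min(candidates)[1]


if __name__ == "__main__":
    probs = {
        ("img1", "sky"): 0.9,
        ("img1", "sea"): 0.55,
        ("img2", "sky"): 0.5,
        ("img2", "sea"): 0.1,
    }
    already_labeled = {("img2", "sky")}
    print(select_sample_label_pair(probs, already_labeled))
```

Selecting pairs rather than whole samples means the oracle answers one yes/no question at a time, which is the distinction the abstract draws for the multi-label setting.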
  • Publication number: 20090079871
    Abstract: Systems and methods for determining insertion points in a first video stream are described. The insertion points are configured for inserting at least one second video into the first video. In accordance with one embodiment, a method for determining the insertion points includes parsing the first video into a plurality of shots. The plurality of shots includes one or more shot boundaries. The method then determines one or more insertion points by balancing a discontinuity metric and an attractiveness metric of each shot boundary.
    Type: Application
    Filed: September 20, 2007
    Publication date: March 26, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Xian-Sheng Hua, Tao Mei, Linjun Yang, Shipeng Li
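
The balancing step in publication 20090079871 (trading off a discontinuity metric against an attractiveness metric at each shot boundary) can be sketched as a weighted sum followed by a top-k selection. The linear weighting, the assumption that both metrics lie in [0, 1], and the name `pick_insertion_points` are illustrative choices, not details from the application.

```python
def pick_insertion_points(boundaries, num_points, weight=0.5):
    """Choose shot boundaries at which to insert a second video.

    boundaries: list of (time_seconds, discontinuity, attractiveness) tuples,
                with both metrics assumed to be normalized to [0, 1].
    weight:     share given to discontinuity; (1 - weight) goes to attractiveness.
    Returns the chosen boundary times in chronological order.
    """
    scored = [(weight * disc + (1.0 - weight) * attr, time)
              for time, disc, attr in boundaries]
    # Highest combined score first; earlier boundary wins a tie.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return sorted(time for _, time in scored[:num_points])


if __name__ == "__main__":
    shot_boundaries = [
        (10.0, 0.25, 0.25),
        (20.0, 1.0, 0.5),
        (30.0, 0.5, 1.0),
        (40.0, 0.0, 0.5),
    ]
    print(pick_insertion_points(shot_boundaries, num_points=2))
```

Setting `weight` near 1.0 favors boundaries where the surrounding content changes sharply, and setting it near 0.0 favors boundaries next to content viewers are likely to be watching closely.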