Patents by Inventor Xian-Sheng Hua

Xian-Sheng Hua has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20130013578
    Abstract: Some implementations provide techniques and arrangements to perform image retrieval. For example, some implementations identify an object of interest and a visual context in a first image. In some implementations, a second image that includes a second object of interest and a second visual context may be compared to the object of interest and the visual context, respectively, to determine whether the second image matches the first image.
    Type: Application
    Filed: July 5, 2011
    Publication date: January 10, 2013
    Applicant: Microsoft Corporation
    Inventors: Linjun Yang, Bo Geng, Xian-Sheng Hua, Yang Cai
  • Patent number: 8352321
    Abstract: Computer program products, devices, and methods for generating in-text embedded advertising are described. Embedded advertising is “hidden” or embedded into a message by matching an advertisement to the message and identifying a place in the message to insert the advertisement. For textual messages, statistical analysis of individual sentences is performed to determine where it would be most natural to insert an advertisement. Statistical rules of grammar derived from a language model may be used to choose a natural and grammatical place in the sentence for inserting the advertisement. Insertion of the advertisement creates a modified sentence without degrading a meaning of the original sentence, yet also includes the advertisement as a part of a new sentence.
    Type: Grant
    Filed: December 12, 2008
    Date of Patent: January 8, 2013
    Assignee: Microsoft Corporation
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Linjun Yang
  • Patent number: 8346767
    Abstract: An informative priors image search result summarization system and method that summarizes image search results based on the image relevance (as determined by a search engine's initial ranking) and the image quality. Embodiments of the system and method cluster the image search results, rank images within each cluster based on a computed image score, and then select a summary image for the cluster. Each cluster is analyzed and an image in the cluster having the maximum image score is included in a selected summary collection. The image score is computed using the image relevance and the image quality, as well as a cluster coherence, a density, and a diversity. The selection of images from a collection of candidate images generates an image search result summarization, which is presented to a user. The summaries are presented to the user in a ranked order based on their image scores.
    Type: Grant
    Filed: April 21, 2010
    Date of Patent: January 1, 2013
    Assignee: Microsoft Corporation
    Inventors: Linjun Yang, Rui Liu, Xian-Sheng Hua
  • Publication number: 20120321181
    Abstract: A similarity of a first video to a second video may be identified automatically. Images are received from the videos, and divided into sub-images. The sub-images are evaluated based on a feature common to each of the sub-images. Binary representations of the images may be created based on the evaluation of the sub-images. A similarity of the first video to the second video may be determined based on a number of occurrences of a binary representation in the first video and the second video.
    Type: Application
    Filed: June 20, 2011
    Publication date: December 20, 2012
    Applicant: Microsoft Corporation
    Inventors: Linjun Yang, Lifeng Shang, Xian-Sheng Hua, Fei Wang
  • Publication number: 20120297038
    Abstract: Techniques describe analyzing users and groups of a social network to identify user interests and providing recommendations for a user based on the user's identified interests. A content-awareness application obtains a collection of images and tags associated with the images belonging to members in the social network. The content-awareness application decomposes the members into a representative matrix to identify users and groups in order to calculate a similarity matrix between the users and their images based on a visual content of the images and a textual content of the tags. The content-awareness application further constructs a graph Laplacian over the users and the groups to align with the representative matrix based at least in part on the similarity matrix and further provides recommendations of groups for a user to join in the social network based at least in part on the graph Laplacian identifying the user's interests.
    Type: Application
    Filed: May 16, 2011
    Publication date: November 22, 2012
    Applicant: Microsoft Corporation
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Jinfeng Zhuang
  • Publication number: 20120294477
    Abstract: Techniques describe submitting a video clip as a query by a user. A process retrieves images and information associated with the images in response to the query. The process decomposes the video clip into a sequence of frames to extract the features in a frame and to quantize the extracted features into descriptive words. The process further tracks the extracted features as points in the frame, a first set of points to correspond to a second set of points in consecutive frames to construct a sequence of points. Then the process identifies the points that satisfy criteria of being stable points and being centrally located in the frame to represent the video clip as a bag of descriptive words for searching for images and information related to the video clip.
    Type: Application
    Filed: May 18, 2011
    Publication date: November 22, 2012
    Applicant: Microsoft Corporation
    Inventors: Linjun Yang, Xian-Sheng Hua, Yang Cai
  • Publication number: 20120271833
    Abstract: A hybrid search method may be used to identify information responsive to a query. A search may be performed utilizing a neighborhood graph and a partitioning tree. The partitioning tree may be searched to select one or more pivots that may be used to guide a subsequent search in the neighborhood graph. Once the search in the neighborhood graph is unable to identify nearest neighbors in closer proximity to the query, the search may be switched to the partitioning tree. The partitioning tree may then be searched to select pivots that may be used to guide subsequent searches in the neighborhood graph. The searches performed in the partitioning tree and/or the neighborhood graph may be conducted utilizing an iterative algorithm.
    Type: Application
    Filed: April 21, 2011
    Publication date: October 25, 2012
    Applicant: Microsoft Corporation
    Inventors: Jingdong Wang, Xian-Sheng Hua, Shipeng Li, Jing Wang
  • Publication number: 20120265772
    Abstract: Technologies for recommending relevant tags for the tagging of media based on one or more initial tags provided for the media and based on a large quantity of other tagged media. Sample media as candidates for recommendation are provided by a set of weak rankers based on corresponding relevance measures in semantic and visual domains. The various samples provided by the weak rankers are then ranked based on relative order to provide a list of recommended tags for the media. The weak rankers provide sample tags based on relevance measures including tag co-occurrence, tag content correlation, and image-conditioned tag correlation.
    Type: Application
    Filed: June 29, 2012
    Publication date: October 18, 2012
    Applicant: Microsoft Corporation
    Inventors: Linjun Yang, Lei Wu, Xian-Sheng Hua
  • Publication number: 20120263433
    Abstract: Tools and techniques for acquiring key roles and their relationships from a video independent of metadata, such as cast lists and scripts, are described herein. These techniques include discovering key roles and their relationships by treating a video (e.g., a movie, television program, music video, or personal video) as a community. For instance, a video is segmented into a hierarchical structure that includes levels for scenes, shots, and key frames. In some implementations, the techniques include performing face detection and grouping on the detected key frames. In some implementations, the techniques include exploiting the key roles and their correlations in this video to discover a community. The discovered community provides for a wide variety of applications, including the automatic generation of visual summaries or video posters including acquired key roles.
    Type: Application
    Filed: April 12, 2011
    Publication date: October 18, 2012
    Applicant: Microsoft Corporation
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Yan Wang
  • Publication number: 20120251007
    Abstract: Techniques for construction of a visual codebook are described herein. Feature points may be extracted from large numbers of images. In one example, images providing N feature points may be used to construct a codebook of K words. The centers of each of K clusters of feature points may be initialized. In a looping or iterative manner, an assignment step assigns each feature point to a cluster and an update step locates a center of each cluster. The feature points may be assigned to a cluster based on a lesser of a distance to a center of a previously assigned cluster and a distance to a center derived by operation of an approximate nearest neighbor algorithm having aspects of randomization. The loop terminates when the feature points have sufficiently converged to their respective clusters. Centers of the clusters represent visual words, which may be used to construct the visual codebook.
    Type: Application
    Filed: March 31, 2011
    Publication date: October 4, 2012
    Applicant: Microsoft Corporation
    Inventors: Linjun Yang, Darui Li, Xian-Sheng Hua, Hong-Jiang Zhang
  • Publication number: 20120254076
    Abstract: Supervised re-ranking for visual search may include re-ordering images that are returned in response to a text-based image search by exploiting visual information included in the images. In one example, supervised re-ranking for visual search may include receiving a textual query, obtaining an initial ranking result including a plurality of images corresponding to the textual query, and representing the textual query by a visual context of the plurality of images. A query-independent re-ranking model may be trained based on visual re-ranking features of the plurality of images of the textual query in accordance with a supervised training algorithm.
    Type: Application
    Filed: March 30, 2011
    Publication date: October 4, 2012
    Applicant: Microsoft Corporation
    Inventors: Linjun Yang, Xian-Sheng Hua
  • Publication number: 20120251011
    Abstract: Events may be determined based on an image and context data associated with the image. An event type associated with the image may be determined based on a concept of the image. A list of events may be retrieved from an event database based on the context data. The retrieved list of events may then be ranked based on the determined event type and the context data. Through this event determination, a user may obtain information about one or more events happening at a specific location simply by capturing an image of that specific location, thereby saving the user from searching and browsing the Internet or a brochure to locate the information about the one or more events at the specific location.
    Type: Application
    Filed: April 4, 2011
    Publication date: October 4, 2012
    Applicant: Microsoft Corporation
    Inventors: Mingyan Gao, Xian-Sheng Hua
  • Patent number: 8249366
    Abstract: Described is a technology by which an image is classified (e.g., grouped and/or labeled), based on multi-label multi-instance data learning-based classification according to semantic labels and regions. An image is processed in an integrated framework into multi-label multi-instance data, including region and image labels. The framework determines local association data based on each region of an image. Other multi-label multi-instance data is based on relationships between region labels of the image, relationships between image labels of the image, and relationships between the region and image labels. These data are combined to classify the image. Training is also described.
    Type: Grant
    Filed: June 16, 2008
    Date of Patent: August 21, 2012
    Assignee: Microsoft Corporation
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Zheng-Jun Zha
  • Patent number: 8239333
    Abstract: Technologies for recommending relevant tags for the tagging of media based on one or more initial tags provided for the media and based on a large quantity of other tagged media. Sample media as candidates for recommendation are provided by a set of weak rankers based on corresponding relevance measures in semantic and visual domains. The various samples provided by the weak rankers are then ranked based on relative order to provide a list of recommended tags for the media. The weak rankers provide sample tags based on relevance measures including tag co-occurrence, tag content correlation, and image-conditioned tag correlation.
    Type: Grant
    Filed: March 3, 2009
    Date of Patent: August 7, 2012
    Assignee: Microsoft Corporation
    Inventors: Linjun Yang, Lei Wu, Xian-Sheng Hua
  • Patent number: 8218859
    Abstract: This disclosure describes various exemplary methods and computer program products for transductive multi-label classification in detecting video concepts for information retrieval. This disclosure describes utilizing a hidden Markov random field formulation to detect labels for concepts in video content and modeling a multi-label interdependence between the labels by a pairwise Markov random field. The process groups the labels into several parts to speed up a labeling inference and calculates a conditional probability score for the labels; the conditional probability scores are ordered for ranking in a video retrieval evaluation.
    Type: Grant
    Filed: December 5, 2008
    Date of Patent: July 10, 2012
    Assignee: Microsoft Corporation
    Inventors: Jingdong Wang, Shipeng Li, Xian-Sheng Hua, Yinghai Zhao
  • Patent number: 8219511
    Abstract: Techniques described herein create an accurate active-learning model that takes into account a sample selection bias of elements, such as images, selected for labeling by a user. These techniques select a first set of elements for labeling. Once a user labels these elements, the techniques calculate a sample selection bias of the selected elements and train a model that takes into account the sample selection bias. The techniques then select a second set of elements based, in part, on a sample selection bias of the elements. Again, once a user labels the second set of elements, the techniques train the model while taking into account the calculated sample selection bias. Once the trained model satisfies a predefined stop condition, the techniques use the trained model to predict labels for the remaining unlabeled elements.
    Type: Grant
    Filed: February 24, 2009
    Date of Patent: July 10, 2012
    Assignee: Microsoft Corporation
    Inventors: Linjun Yang, Bo Geng, Xian-Sheng Hua
  • Patent number: 8207989
    Abstract: Embodiments that provide multi-video synthesis are disclosed. In accordance with one embodiment, multi-video synthesis includes breaking a main video into a plurality of main frames and breaking a supplementary video into a plurality of supplementary frames. The multi-video synthesis also includes assigning one or more supplementary frames into each of a plurality of states of a Hidden Markov Model (HMM), where each of the plurality of states corresponds to one or more main frames. The multi-video synthesis further includes determining optimal frames in the plurality of main frames for insertion of the plurality of supplementary frames based on the plurality of states and visual properties. The optimal frames include optimal insertion positions. The multi-video synthesis additionally includes inserting the plurality of supplementary frames into the optimal insertion positions to form a synthesized video.
    Type: Grant
    Filed: December 12, 2008
    Date of Patent: June 26, 2012
    Assignee: Microsoft Corporation
    Inventors: Tao Mei, Xian-Sheng Hua, Shipeng Li, Teng Li
  • Publication number: 20120158721
    Abstract: A computing device configured to determine that one or more regions of an image are associated with a tag of the image is described herein. The computing device is further configured to determine one or more attribute tags describing at least one of the content or context of the one or more regions. Upon determining the attribute tags, the computing device associates the attribute tags with the tag to enable image searching based on the tag and attribute tags.
    Type: Application
    Filed: December 17, 2010
    Publication date: June 21, 2012
    Applicant: Microsoft Corporation
    Inventors: Xian-Sheng Hua, Kuiyuan Yang, Meng Wang, Hong-Jiang Zhang
  • Publication number: 20120158686
    Abstract: A computing device configured to determine a subset of the tags associated with at least one image of a plurality of received, tagged images is described herein. The computing device performs the determining based on one or more measures of consistency of visual similarity between ones of the images with semantic similarity between tags of the ones of the images.
    Type: Application
    Filed: December 17, 2010
    Publication date: June 21, 2012
    Applicant: Microsoft Corporation
    Inventors: Xian-Sheng Hua, Dong Liu, Meng Wang, Hong-Jiang Zhang
  • Publication number: 20120150871
    Abstract: An autonomous blog engine is implemented to enable the autonomous generation of a blog. The autonomous blog engine receives media objects that are captured by an electronic device during a trip session. The autonomous blog engine determines a place of interest based on photographs selected from the media objects. The autonomous blog engine then generates textual content using one or more pre-stored knowledge items that include information on the place of interest. The autonomous blog engine further autonomously publishes a blog entry on the place of interest that includes one or more photographs from the photograph cluster and the textual content.
    Type: Application
    Filed: December 10, 2010
    Publication date: June 14, 2012
    Applicant: Microsoft Corporation
    Inventors: Xian-Sheng Hua, Hongzhi Li, Shipeng Li
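
As an illustration of the assignment and update loop summarized in the abstract of publication 20120251007 (visual codebook construction) in the listing above, the following is a minimal Python sketch and not the patented method. It assumes plain Lloyd-style clustering with an exact nearest-center search in place of the randomized approximate nearest neighbor step the abstract mentions, and it initializes the centers from the first K points, which the abstract does not specify. The function name build_codebook and its parameters are hypothetical.

```python
from math import dist


def build_codebook(points, k, max_iters=100):
    """Cluster feature points into k visual words with plain Lloyd iterations."""
    # Assumption: centers start at the first k points (the abstract does not
    # say how the centers are initialized).
    centers = [tuple(p) for p in points[:k]]
    assignment = [None] * len(points)

    for _ in range(max_iters):
        # Assignment step: exact nearest center. The abstract describes an
        # approximate nearest neighbor search; exact search is swapped in here.
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centers[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment

        # Update step: move each center to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))

    # The final centers play the role of the visual words in the codebook.
    return centers, assignment


# Toy 2-D "feature points" forming two well-separated groups.
points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
centers, assignment = build_codebook(points, k=2)
```

Real feature descriptors would have many more dimensions than these toy points, which is where an approximate search, as described in the abstract, would matter for speed.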