Patents by Inventor Ji-Rong Wen

Ji-Rong Wen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20120303557
    Abstract: A “Name Disambiguator” provides various techniques for implementing an interactive framework for resolving or disambiguating entity names (associated with objects such as publications) for entity searches where two or more same or similar names may refer to different entities. More specifically, the Name Disambiguator uses a combination of user input and automatic models to address the disambiguation problem. In various embodiments, the Name Disambiguator uses a two-part process, including: 1) a global SVM trained from large sets of documents or objects in a simulated interactive mode, and 2) further personalization of local SVM models (associated with individual names or groups of names such as, for example, a group of coauthors) derived from the global SVM model. The result of this process is that large sets of documents or objects are rapidly and accurately condensed or clustered into ordered sets that are organized by entity names.
    Type: Application
    Filed: May 28, 2011
    Publication date: November 29, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Zhengdong Lu, Zaiqing Nie, Gang Luo, Yong Cao, Ji-Rong Wen, Wei-Ying Ma
  • Patent number: 8229960
    Abstract: Described is a technology for summarizing a web entity (e.g., a person, place, product or so forth) based upon the entity's appearance in web documents (e.g., on the order of hundreds of millions or billions of webpages). Webpages are separated into blocks, which are then processed according to various features to filter the number of blocks to further process, and to rank the most relevant remaining blocks with respect to the entity. A redundancy removal mechanism removes redundant blocks, leaving a set of remaining blocks that are used to provide a summary of information that is relevant to the entity.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: July 24, 2012
    Assignee: Microsoft Corporation
    Inventors: Zaiqing Nie, Ji-Rong Wen, Liu Yang
  • Publication number: 20120109950
    Abstract: A method and system for identifying the importance of information areas of a display page. An importance system identifies information areas or blocks of a web page. A block of a web page represents an area of the web page that appears to relate to a similar topic. The importance system provides the characteristics or features of a block to an importance function that generates an indication of the importance of that block to its web page. The importance system “learns” the importance function by generating a model based on the features of blocks and the user-specified importance of those blocks. To learn the importance function, the importance system asks users to provide an indication of the importance of blocks of web pages in a collection of web pages.
    Type: Application
    Filed: January 10, 2012
    Publication date: May 3, 2012
    Applicant: Microsoft Corporation
    Inventors: Wei-Ying Ma, Ji-Rong Wen, Ruihua Song, Haifeng Liu
  • Patent number: 8112421
    Abstract: A learning system for a search ranking function model may include a computer program that iteratively refines the model using new queries and associated documents from an unlabeled training set. The unlabeled training set may include a set of queries for which the associated documents have not been labeled as “relevant” or otherwise labeled. The new queries may be selected based on a similarity to and an accuracy of each neighbor from a labeled training set, such as a labeled validation set. Upon selection, the documents associated with the new queries may be labeled. The new queries and their associated documents may be accumulated into the labeled training set, and a refined model may be learned based on the augmented labeled training set. The model may be iteratively refined until it is determined that the model is adequate.
    Type: Grant
    Filed: July 20, 2007
    Date of Patent: February 7, 2012
    Assignee: Microsoft Corporation
    Inventors: Nan Sun, Qing Yu, Shuming Shi, Ji-Rong Wen
  • Publication number: 20120030206
    Abstract: A topic modeling architecture is used to discover high-quality semantic classes from a large collection of raw semantic classes (RASCs) for use in generating responses to queries. A specific semantic class is identified from a collection of RASCs, and a preprocessing operation is conducted to remove one or more items with a semantic class frequency less than a predetermined threshold. A topic model is then applied to the specific semantic class for each of the items that remain in the specific semantic class after the preprocessing operation. A postprocessing operation is then conducted on the items of the specific semantic class to merge and sort the results of the topic model and generate final semantic classes for use by a search engine to respond to a query.
    Type: Application
    Filed: July 29, 2010
    Publication date: February 2, 2012
    Applicant: Microsoft Corporation
    Inventors: Shuming Shi, Ji-Rong Wen
  • Patent number: 8095478
    Abstract: A method and system for identifying the importance of information areas of a display page. An importance system identifies information areas or blocks of a web page. A block of a web page represents an area of the web page that appears to relate to a similar topic. The importance system provides the characteristics or features of a block to an importance function that generates an indication of the importance of that block to its web page. The importance system “learns” the importance function by generating a model based on the features of blocks and the user-specified importance of those blocks. To learn the importance function, the importance system asks users to provide an indication of the importance of blocks of web pages in a collection of web pages.
    Type: Grant
    Filed: April 10, 2008
    Date of Patent: January 10, 2012
    Assignee: Microsoft Corporation
    Inventors: Wei-Ying Ma, Ji-Rong Wen, Ruihua Song, Haifeng Liu
  • Patent number: 8073838
    Abstract: A search method uses pseudo-anchor text associated with search objects to improve search performance. The pseudo-anchor text may be extracted in combination with an identifier of the search objects (such as a pseudo-URL) from a digital corpus such as a collection of documents. Pseudo-anchor texts for each object are preferably extracted from candidate anchor blocks using a machine learning based approach. The pseudo-anchor texts are made available for searching and used to help rank the objects in a search result to improve search performance. The method may be used in vertical search of objects such as published articles, products and images that lack explicit URLs and anchor text information.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: December 6, 2011
    Assignee: Microsoft Corporation
    Inventors: Shuming Shi, Ji-Rong Wen, Mingjie Zhu, Fei Xing, Zaiqing Nie
  • Publication number: 20110283205
    Abstract: The automated social networking graph mining and visualization technique described herein mines social connections and allows creation of a social networking graph from general (not necessarily social-application specific) Web pages. The technique uses the distances between a person's/entity's name and related people's/entities' names on one or more Web pages to determine connections between people/entities and the strengths of the connections. In one embodiment, the technique lays out these connections, and then clusters them, in a 2-D layout of a social networking graph that represents the Web connection strengths among the related people's or entities' names, by using a force-directed model.
    Type: Application
    Filed: May 14, 2010
    Publication date: November 17, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Zaiqing Nie, Yong Cao, Gang Luo, Ruochi Zhang, Xiaojiang Liu, Yunxiao Ma, Bo Zhang, Ying-Qing Xu, Ji-Rong Wen
  • Publication number: 20110264658
    Abstract: A method and system are provided for determining relevance of an object to a term based on a language model. The relevance system provides records extracted from web pages that relate to the object. To determine the relevance of the object to a term, the relevance system first determines, for each record of the object, a probability of generating that term using a language model of the record of that object. The relevance system then calculates the relevance of the object to the term by combining the probabilities. The relevance system may also weight the probabilities based on the accuracy or reliability of the extracted information for each data source.
    Type: Application
    Filed: July 1, 2011
    Publication date: October 27, 2011
    Applicant: Microsoft Corporation
    Inventors: Ji-Rong Wen, Shuming Shi, Wei-Ying Ma, Yunxiao Ma, Zaiqing Nie
  • Patent number: 8046370
    Abstract: This disclosure relates to performing a query for a search term of a database containing a plurality of structured documents. Those structured documents that do not include the search term are ferreted or filtered out during an initial search. Matched structured documents, which are those structured documents that do contain the search term, are evaluated by ranking the individual elements based on how well each individual element matches the search term, and by indicating to the user the ranking of the individual elements, wherein the individual elements can be accessed by the user.
    Type: Grant
    Filed: September 16, 2008
    Date of Patent: October 25, 2011
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Hang Cui
  • Publication number: 20110251984
    Abstract: Methods and systems for Web-scale entity relationship extraction are usable to build large-scale entity relationship graphs from any data corpora stored on a computer-readable medium or accessible through a network. Such entity relationship graphs may be used to navigate previously undiscoverable relationships among entities within data corpora. Additionally, the entity relationship extraction may be configured to utilize discriminative models to jointly model correlated data found within the selected corpora.
    Type: Application
    Filed: April 9, 2010
    Publication date: October 13, 2011
    Applicant: Microsoft Corporation
    Inventors: Zaiqing Nie, Xiaojiang Liu, Jun Zhu, Ji-Rong Wen
  • Publication number: 20110238644
    Abstract: This document describes tools for adjusting anchor text weight to provide more relevant search engine results. Specifically, these tools take advantage of a site-relationship model to consider relationships not only between an anchor text source site and a destination page but also relationships between multiple anchor text source sites to improve web searches. Consideration of these relationships aids in determining a new anchor text weight, which in turn results in more relevant search results.
    Type: Application
    Filed: March 29, 2010
    Publication date: September 29, 2011
    Applicant: Microsoft Corporation
    Inventors: Zhicheng Dou, Junyan Chen, Ruihua Song, Ji-Rong Wen
  • Patent number: 8024319
    Abstract: A method of creating an index of web queries is discussed. The method includes receiving a first query representative of one or more symbolic characters and assigning the first query to a first data structure. A first text string representative of the first query is created and assigned to a second data structure. The first and second data structures are stored on a tangible computer readable medium.
    Type: Grant
    Filed: January 25, 2007
    Date of Patent: September 20, 2011
    Assignee: Microsoft Corporation
    Inventors: Jianfeng Gao, Qi Yao, Ji-Rong Wen
  • Publication number: 20110209048
    Abstract: Interactive synchronization of Web data and spreadsheets is usable to build data wrappers based on any type of data found in a document. Such data wrappers can be used to interact with source documents, crawl a network for additional data, map data from across domains, and/or synchronize data from dynamic Web documents.
    Type: Application
    Filed: February 19, 2010
    Publication date: August 25, 2011
    Applicant: Microsoft Corporation
    Inventors: Matthew Robert Scott, Ruochi Zhang, Ruihua Song, Ji-Rong Wen
  • Patent number: 8001130
    Abstract: A method and system are provided for determining relevance of an object to a term based on a language model. The relevance system provides records extracted from web pages that relate to the object. To determine the relevance of the object to a term, the relevance system first determines, for each record of the object, a probability of generating that term using a language model of the record of that object. The relevance system then calculates the relevance of the object to the term by combining the probabilities. The relevance system may also weight the probabilities based on the accuracy or reliability of the extracted information for each data source.
    Type: Grant
    Filed: July 25, 2006
    Date of Patent: August 16, 2011
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Shuming Shi, Wei-Ying Ma, Yunxiao Ma, Zaiqing Nie
  • Publication number: 20110191381
    Abstract: Described is a technology for efficiently labeling a webpage. A wrapper tool labels records of a webpage at the record level. If a wrapper already exists that is appropriate for labeling a record, the wrapper tool automatically labels that record. For unlabeled records, the tool provides a user interface to label those records, and updates the set of existing wrappers with a new wrapper that is generated based upon the labeling operation; the new wrapper is then applied to any unlabeled records if appropriate for those records. As a result, a user typically needs to label only relatively few records, with the wrappers generated for those records automatically used to label the other unlabeled records of the webpage.
    Type: Application
    Filed: January 29, 2010
    Publication date: August 4, 2011
    Applicant: Microsoft Corporation
    Inventors: Shuyi Zheng, Ruihua Song, Matthew Robert Scott, Ji-Rong Wen
  • Patent number: 7979459
    Abstract: Aspects of the subject matter described herein relate to matching product information to products. In aspects, a product matching component receives product information. The product matching component normalizes the product information and obtains keywords from the product information. By querying a database of recognized products, the keywords are used to obtain a list of products that potentially match the product information. A confidence level is assigned to each of the potential matches in the list. A match may be returned for the highest matched product or for a selectable number of products whose confidence level(s) exceed a selectable threshold.
    Type: Grant
    Filed: June 15, 2007
    Date of Patent: July 12, 2011
    Assignee: Microsoft Corporation
    Inventors: Kai Wu, Daniel Takacs, Tong Yao, Jiyu Zhang, Hua Yang, Ji-Rong Wen, Jonathan R M Hart, Eric Anthony Reel
  • Patent number: 7974957
    Abstract: A method and system for ranking pages of a search result based on the mobile readiness of the pages is provided. A mobile-readiness system receives an indication of pages that are to be ranked. The mobile-readiness system evaluates the mobile readiness for each of the pages. Mobile readiness indicates suitability of the page for a mobile device. The mobile-readiness system then ranks the pages based on the generated mobile readiness and some other criterion such as a relevance score or an importance score. The mobile-readiness system may train a classifier to classify pages based on their mobile readiness.
    Type: Grant
    Filed: April 5, 2007
    Date of Patent: July 5, 2011
    Assignee: Microsoft Corporation
    Inventors: Xing Xie, Jihwan Song, Ji-Rong Wen
  • Publication number: 20110137886
    Abstract: Described is a data-centric web search engine technology/architecture, in which document metadata, including offline-extracted metadata, is used as part of a search indexing and ranking pipeline. A web data management component receives crawled documents and extracts document metadata from the documents. An indexing component uses the document metadata to build an index for the documents. A serving component uses the index and the document metadata to serve content, e.g., search results. Also described is the use of query metadata extracted from queries of a query log for use in the pipeline.
    Type: Application
    Filed: December 8, 2009
    Publication date: June 9, 2011
    Applicant: Microsoft Corporation
    Inventors: Ji-Rong Wen, Guomao Xin, Yunxiao Ma, Yu Chen, Qing Yu, Yi Liu, Zhicheng Dou, Shuming Shi
  • Publication number: 20110087660
    Abstract: A method and system for determining relevance of a document having text and images to a text string is provided. A scoring system identifies image text associated with an image of the document. The scoring system calculates an image score indicating relevance of the image text to the text string. The image score may be used in many applications, such as searching, summary generation, document classification, image search, and image classification.
    Type: Application
    Filed: December 17, 2010
    Publication date: April 14, 2011
    Applicant: Microsoft Corporation
    Inventors: Qing Yu, Shuming Shi, Zhiwei Li, Ji-Rong Wen, Wei-Ying Ma
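
For readers who want a concrete picture of the language-model relevance scoring summarized in the abstract for Patent 8001130 (and Publication 20110264658) above, the following is a minimal sketch only, not the patented method. It assumes each record is a list of tokens paired with a reliability weight for its data source, estimates the per-record term probability with a simple smoothed unigram model, and combines the probabilities as a weighted average. The function names, the smoothing parameter `lam`, and the choice of a weighted average are all illustrative assumptions that do not come from the abstract.

```python
from collections import Counter


def term_probability(term, record_tokens, collection_tokens, lam=0.8):
    """Estimate P(term | record) with a unigram language model.

    Assumption: the record estimate is linearly interpolated with a
    collection-wide estimate (weight `lam` on the record) so that a term
    missing from one record does not force a zero probability.
    """
    record_counts = Counter(record_tokens)
    collection_counts = Counter(collection_tokens)
    p_record = record_counts[term] / len(record_tokens) if record_tokens else 0.0
    p_collection = (
        collection_counts[term] / len(collection_tokens) if collection_tokens else 0.0
    )
    return lam * p_record + (1.0 - lam) * p_collection


def object_relevance(term, records):
    """Combine per-record probabilities into one relevance score for an object.

    `records` is a list of (tokens, source_weight) pairs, where source_weight
    is a hypothetical reliability score for the data source of that record.
    Assumption: the combination is a weighted average of the probabilities.
    """
    collection = [token for tokens, _ in records for token in tokens]
    total_weight = sum(weight for _, weight in records)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(
        weight * term_probability(term, tokens, collection)
        for tokens, weight in records
    )
    return weighted_sum / total_weight


# Two records extracted for the same object, from sources of differing reliability.
records = [
    (["fast", "ranking", "model"], 0.9),
    (["ranking", "of", "web", "pages"], 0.5),
]
ranking_score = object_relevance("ranking", records)
web_score = object_relevance("web", records)
print(ranking_score > web_score)
```

In this sketch a term that appears in both records, and in the more reliable source, scores higher than a term that appears only in the less reliable record.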