Patents by Inventor Ji-Rong Wen

Ji-Rong Wen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

INTERACTIVE FRAMEWORK FOR NAME DISAMBIGUATION

Publication number: 20120303557

Abstract: A “Name Disambiguator” provides various techniques for implementing an interactive framework for resolving or disambiguating entity names (associated with objects such as publications) for entity searches where two or more same or similar names may refer to different entities. More specifically, the Name Disambiguator uses a combination of user input and automatic models to address the disambiguation problem. In various embodiments, the Name Disambiguator uses a two-part process, including: 1) a global SVM trained from large sets of documents or objects in a simulated interactive mode, and 2) further personalization of local SVM models (associated with individual names or groups of names such as, for example, a group of coauthors) derived from the global SVM model. The result of this process is that large sets of documents or objects are rapidly and accurately condensed or clustered into ordered sets that are organized by entity names.

Type: Application

Filed: May 28, 2011

Publication date: November 29, 2012

Applicant: MICROSOFT CORPORATION

Inventors: Zhengdong Lu, Zaiqing Nie, Gang Luo, Yong Cao, Ji-Rong Wen, Wei-Ying Ma

Web-scale entity summarization

Patent number: 8229960

Abstract: Described is a technology for summarizing a web entity (e.g., a person, place, product or so forth) based upon the entity's appearance in web documents (e.g., on the order of hundreds of millions or billions of webpages). Webpages are separated into blocks, which are then processed according to various features to filter the number of blocks to further process, and to rank the most relevant remaining blocks with respect to the entity. A redundancy removal mechanism removes redundant blocks, leaving a set of remaining blocks that are used to provide a summary of information that is relevant to the entity.
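
The filter, rank, and redundancy-removal steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the mention-count ranking, the Jaccard overlap test, and the names `summarize`, `max_overlap`, and `top_k` are assumptions chosen only for illustration.

```python
import re

def tokens(text):
    """Return the set of lowercase word tokens in a text block."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def summarize(entity, blocks, max_overlap=0.6, top_k=3):
    """Pick up to top_k non-redundant blocks that mention the entity."""
    entity_lc = entity.lower()
    # Filter: keep only blocks that mention the entity at all.
    candidates = [b for b in blocks if entity_lc in b.lower()]
    # Rank: blocks with more mentions first (assumed stand-in for the real features).
    candidates.sort(key=lambda b: b.lower().count(entity_lc), reverse=True)
    summary = []
    for block in candidates:
        # Redundancy removal: skip blocks too similar to one already kept.
        if all(jaccard(tokens(block), tokens(kept)) <= max_overlap for kept in summary):
            summary.append(block)
        if len(summary) == top_k:
            break
    return summary
```

A production system would rely on learned block features instead of raw mention counts; the sketch only shows how the three stages fit together.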

Type: Grant

Filed: September 30, 2009

Date of Patent: July 24, 2012

Assignee: Microsoft Corporation

Inventors: Zaiqing Nie, Ji-Rong Wen, Liu Yang

METHOD AND SYSTEM FOR CALCULATING IMPORTANCE OF A BLOCK WITHIN A DISPLAY PAGE

Publication number: 20120109950

Abstract: A method and system for identifying the importance of information areas of a display page. An importance system identifies information areas or blocks of a web page. A block of a web page represents an area of the web page that appears to relate to a similar topic. The importance system provides the characteristics or features of a block to an importance function that generates an indication of the importance of that block to its web page. The importance system “learns” the importance function by generating a model based on the features of blocks and the user-specified importance of those blocks. To learn the importance function, the importance system asks users to provide an indication of the importance of blocks of web pages in a collection of web pages.

Type: Application

Filed: January 10, 2012

Publication date: May 3, 2012

Applicant: Microsoft Corporation

Inventors: Wei-Ying Ma, Ji-Rong Wen, Ruihua Song, Haifeng Liu

Query selection for effectively learning ranking functions

Patent number: 8112421

Abstract: A learning system for a search ranking function model may include a computer program that iteratively refines the model using new queries and associated documents from an unlabeled training set. The unlabeled training set may include a set of queries for which the associated documents have not been labeled as “relevant” or otherwise labeled. The new queries may be selected based on a similarity to and an accuracy of each neighbor from a labeled training set, such as a labeled validation set. Upon selection, the documents associated with the new queries may be labeled. The new queries and their associated documents may be accumulated into a labeled training set, and a refined model may be learned based on the augmented labeled training set. The model may be iteratively refined until it is determined that the model is adequate.

Type: Grant

Filed: July 20, 2007

Date of Patent: February 7, 2012

Assignee: Microsoft Corporation

Inventors: Nan Sun, Qing Yu, Shuming Shi, Ji-Rong Wen

Employing Topic Models for Semantic Class Mining

Publication number: 20120030206

Abstract: A topic modeling architecture is used to discover high-quality semantic classes from a large collection of raw semantic classes (RASCs) for use in generating responses to queries. A specific semantic class is identified from a collection of RASCs, and a preprocessing operation is conducted to remove one or more items with a semantic class frequency less than a predetermined threshold. A topic model is then applied to the specific semantic class for each of the items that remain in the specific semantic class after the preprocessing operation. A postprocessing operation is then conducted on the items of the specific semantic class to merge and sort the results of the topic model and generate final semantic classes for use by a search engine to respond to a query.
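
As a rough illustration of the preprocessing step only, the Python sketch below drops items whose semantic class frequency falls below a threshold. It assumes that "semantic class frequency" means the number of RASCs containing an item; the function name `filter_rare_items` and the list-of-lists representation are hypothetical.

```python
from collections import Counter

def filter_rare_items(rascs, min_frequency):
    """Remove items that appear in fewer than min_frequency raw semantic classes."""
    # Count each item once per RASC, so repeats inside one class do not inflate it.
    frequency = Counter(item for rasc in rascs for item in set(rasc))
    return [[item for item in rasc if frequency[item] >= min_frequency]
            for rasc in rascs]
```

The topic model and the merge-and-sort postprocessing described in the abstract would then run on the filtered classes; they are outside the scope of this sketch.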

Type: Application

Filed: July 29, 2010

Publication date: February 2, 2012

Applicant: Microsoft Corporation

Inventors: Shuming Shi, Ji-Rong Wen

Method and system for calculating importance of a block within a display page

Patent number: 8095478

Abstract: A method and system for identifying the importance of information areas of a display page. An importance system identifies information areas or blocks of a web page. A block of a web page represents an area of the web page that appears to relate to a similar topic. The importance system provides the characteristics or features of a block to an importance function that generates an indication of the importance of that block to its web page. The importance system “learns” the importance function by generating a model based on the features of blocks and the user-specified importance of those blocks. To learn the importance function, the importance system asks users to provide an indication of the importance of blocks of web pages in a collection of web pages.

Type: Grant

Filed: April 10, 2008

Date of Patent: January 10, 2012

Assignee: Microsoft Corporation

Inventors: Wei-Ying Ma, Ji-Rong Wen, Ruihua Song, Haifeng Liu

Pseudo-anchor text extraction

Patent number: 8073838

Abstract: A search method uses pseudo-anchor text associated with search objects to improve search performance. The pseudo-anchor text may be extracted in combination with an identifier of the search objects (such as a pseudo-URL) from a digital corpus such as a collection of documents. Pseudo-anchor texts for each object are preferably extracted from candidate anchor blocks using a machine learning based approach. The pseudo-anchor texts are made available for searching and used to help rank the objects in a search result to improve search performance. The method may be used in vertical search of objects such as published articles, products and images that lack explicit URLs and anchor text information.

Type: Grant

Filed: January 29, 2010

Date of Patent: December 6, 2011

Assignee: Microsoft Corporation

Inventors: Shuming Shi, Ji-Rong Wen, Mingjie Zhu, Fei Xing, Zaiqing Nie

AUTOMATED SOCIAL NETWORKING GRAPH MINING AND VISUALIZATION

Publication number: 20110283205

Abstract: The automated social networking graph mining and visualization technique described herein mines social connections and allows creation of a social networking graph from general (not necessarily social-application specific) Web pages. The technique uses the distances between a person's/entity's name and related people's/entities' names on one or more Web pages to determine connections between people/entities and the strengths of the connections. In one embodiment, the technique lays out these connections, and then clusters them, in a 2-D layout of a social networking graph that represents the Web connection strengths among the related people's or entities' names, by using a force-directed model.
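
The distance-based idea can be sketched in Python as follows. The scoring formula `1 / (1 + distance)`, with distance measured in word tokens, is an assumption made for illustration, as are the names `find_positions` and `connection_strengths`; the abstract does not specify how distances map to strengths.

```python
import re
from collections import defaultdict

def find_positions(page_tokens, name_tokens):
    """Return every token index where the multi-word name starts."""
    n = len(name_tokens)
    return [i for i in range(len(page_tokens) - n + 1)
            if page_tokens[i:i + n] == name_tokens]

def connection_strengths(pages, target, others):
    """Accumulate a strength per related name; closer co-occurrences score higher."""
    strengths = defaultdict(float)
    target_tokens = re.findall(r"\w+", target.lower())
    for page in pages:
        page_tokens = re.findall(r"\w+", page.lower())
        target_pos = find_positions(page_tokens, target_tokens)
        for other in others:
            other_pos = find_positions(page_tokens, re.findall(r"\w+", other.lower()))
            if target_pos and other_pos:
                # Closest pair of mentions on this page, in tokens.
                distance = min(abs(p - q) for p in target_pos for q in other_pos)
                strengths[other] += 1.0 / (1 + distance)
    return dict(strengths)
```

The resulting weights could serve as edge weights for a force-directed layout, which this sketch does not attempt.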

Type: Application

Filed: May 14, 2010

Publication date: November 17, 2011

Applicant: MICROSOFT CORPORATION

Inventors: Zaiqing Nie, Yong Cao, Gang Luo, Ruochi Zhang, Xiaojiang Liu, Yunxiao Ma, Bo Zhang, Ying-Qing Xu, Ji-Rong Wen

WEB OBJECT RETRIEVAL BASED ON A LANGUAGE MODEL

Publication number: 20110264658

Abstract: A method and system is provided for determining relevance of an object to a term based on a language model. The relevance system provides records extracted from web pages that relate to the object. To determine the relevance of the object to a term, the relevance system first determines, for each record of the object, a probability of generating that term using a language model of the record of that object. The relevance system then calculates the relevance of the object to the term by combining the probabilities. The relevance system may also weight the probabilities based on the accuracy or reliability of the extracted information for each data source.
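
One way to read the combination step is as an accuracy-weighted average of per-record term probabilities. The Python sketch below assumes an unsmoothed unigram language model and a normalized weighted sum; both choices, and the names `term_probability` and `object_relevance`, are illustrative assumptions, not details taken from the patent.

```python
import re

def term_probability(term, record_text):
    """Maximum-likelihood unigram probability of the term within one record."""
    words = re.findall(r"\w+", record_text.lower())
    if not words:
        return 0.0
    return words.count(term.lower()) / len(words)

def object_relevance(term, records):
    """Combine per-record probabilities, weighted by each source's accuracy.

    records is a list of (record_text, source_accuracy) pairs.
    """
    total_weight = sum(weight for _, weight in records)
    if total_weight == 0:
        return 0.0
    weighted = sum(weight * term_probability(term, text) for text, weight in records)
    return weighted / total_weight
```

For example, a record from a source with accuracy 0.9 contributes nine times as much to the final score as a record from a source with accuracy 0.1.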

Type: Application

Filed: July 1, 2011

Publication date: October 27, 2011

Applicant: Microsoft Corporation

Inventors: Ji-Rong Wen, Shuming Shi, Wei-Ying Ma, Yunxiao Ma, Zaiqing Nie

Retrieval of structured documents

Patent number: 8046370

Abstract: This disclosure relates to performing a query for a search term of a database containing a plurality of structured documents. Those structured documents that do not include the search term are ferreted or filtered out during an initial search. Matched structured documents which are those structured documents that do contain the search term are evaluated by ranking the individual elements based on how well each individual element matches the search term, and indicating to the user the ranking of the individual elements wherein the individual elements can be accessed by the user.

Type: Grant

Filed: September 16, 2008

Date of Patent: October 25, 2011

Assignee: Microsoft Corporation

Inventors: Ji-Rong Wen, Hang Cui

WEB-SCALE ENTITY RELATIONSHIP EXTRACTION

Publication number: 20110251984

Abstract: Methods and systems for Web-scale entity relationship extraction are usable to build large-scale entity relationship graphs from any data corpora stored on a computer-readable medium or accessible through a network. Such entity relationship graphs may be used to navigate previously undiscoverable relationships among entities within data corpora. Additionally, the entity relationship extraction may be configured to utilize discriminative models to jointly model correlated data found within the selected corpora.

Type: Application

Filed: April 9, 2010

Publication date: October 13, 2011

Applicant: Microsoft Corporation

Inventors: Zaiqing Nie, Xiaojiang Liu, Jun Zhu, Ji-Rong Wen

Using Anchor Text With Hyperlink Structures for Web Searches

Publication number: 20110238644

Abstract: This document describes tools for adjusting anchor text weight to provide more relevant search engine results. Specifically, these tools take advantage of a site-relationship model to consider relationships not only between an anchor text source site and a destination page but also relationships between multiple anchor text source sites to improve web searches. Consideration of these relationships aids in determining a new anchor text weight, which in turn results in more relevant search results.

Type: Application

Filed: March 29, 2010

Publication date: September 29, 2011

Applicant: Microsoft Corporation

Inventors: Zhicheng Dou, Junyan Chen, Ruihua Song, Ji-Rong Wen

Finite-state model for processing web queries

Patent number: 8024319

Abstract: A method of creating an index of web queries is discussed. The method includes receiving a first query representative of one or more symbolic characters and assigning the first query to a first data structure. A first text string representative of the first query is created and assigned to a second data structure. The first and second data structures are stored on a tangible computer readable medium.

Type: Grant

Filed: January 25, 2007

Date of Patent: September 20, 2011

Assignee: Microsoft Corporation

Inventors: Jianfeng Gao, Qi Yao, Ji-Rong Wen

INTERACTIVE SYNCHRONIZATION OF WEB DATA AND SPREADSHEETS

Publication number: 20110209048

Abstract: Interactive synchronization of Web data and spreadsheets is usable to build data wrappers based on any type of data found in a document. Such data wrappers can be used to interact with source documents, crawl a network for additional data, map data from across domains, and/or synchronize data from dynamic Web documents.

Type: Application

Filed: February 19, 2010

Publication date: August 25, 2011

Applicant: Microsoft Corporation

Inventors: Matthew Robert Scott, Ruochi Zhang, Ruihua Song, Ji-Rong Wen

Web object retrieval based on a language model

Patent number: 8001130

Abstract: A method and system is provided for determining relevance of an object to a term based on a language model. The relevance system provides records extracted from web pages that relate to the object. To determine the relevance of the object to a term, the relevance system first determines, for each record of the object, a probability of generating that term using a language model of the record of that object. The relevance system then calculates the relevance of the object to the term by combining the probabilities. The relevance system may also weight the probabilities based on the accuracy or reliability of the extracted information for each data source.

Type: Grant

Filed: July 25, 2006

Date of Patent: August 16, 2011

Assignee: Microsoft Corporation

Inventors: Ji-Rong Wen, Shuming Shi, Wei-Ying Ma, Yunxiao Ma, Zaiqing Nie

Interactive System for Extracting Data from a Website

Publication number: 20110191381

Abstract: Described is a technology for efficiently labeling a webpage. A wrapper tool labels records of a webpage at the record level. If an existing wrapper exists that is appropriate for labeling a record, the wrapper tool automatically labels that record. For unlabeled records, the tool provides a user interface to label those records, and updates the set of existing wrappers with a new wrapper that is generated based upon the labeling operation; the new wrapper is then applied to any unlabeled records if appropriate for those records. As a result, a user typically needs only to label a relatively few records, with the wrappers generated for those records automatically used to label the other unlabeled records of the webpage.

Type: Application

Filed: January 29, 2010

Publication date: August 4, 2011

Applicant: Microsoft Corporation

Inventors: Shuyi Zheng, Ruihua Song, Matthew Robert Scott, Ji-Rong Wen

Scalable model-based product matching

Patent number: 7979459

Abstract: Aspects of the subject matter described herein relate to matching product information to products. In aspects, a product matching component receives product information. The product matching component normalizes the product information and obtains keywords from the product information. By querying a database of recognized products, the keywords are used to obtain a list of products that potentially match the product information. A confidence level is assigned to each of the potential matches in the list. A match may be returned for the highest matched product or for a selectable number of products whose confidence level(s) exceed a selectable threshold.
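
The normalize, query, score, and threshold flow can be sketched in Python as below. The catalog is a plain list of product names and the confidence is a Jaccard keyword overlap; both are stand-ins assumed for illustration, and `normalize`, `match_product`, `threshold`, and `max_results` are hypothetical names.

```python
import re

def normalize(text):
    """Lowercase the text, replace punctuation with spaces, and split into keywords."""
    return re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split()

def match_product(offer, catalog, threshold=0.5, max_results=3):
    """Return (confidence, product) pairs whose confidence meets the threshold."""
    keywords = set(normalize(offer))
    scored = []
    for product in catalog:
        product_words = set(normalize(product))
        shared = keywords & product_words
        if not shared:
            continue  # stands in for the database query that finds candidates
        # Confidence: share of keywords the offer and product have in common.
        confidence = len(shared) / len(keywords | product_words)
        if confidence >= threshold:
            scored.append((confidence, product))
    # Highest-confidence matches first, limited to a selectable number.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:max_results]
```

Setting `max_results=1` returns only the highest matched product, while a larger value returns every candidate above the selectable threshold up to that limit.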

Type: Grant

Filed: June 15, 2007

Date of Patent: July 12, 2011

Assignee: Microsoft Corporation

Inventors: Kai Wu, Daniel Takacs, Tong Yao, Jiyu Zhang, Hua Yang, Ji-Rong Wen, Jonathan R M Hart, Eric Anthony Reel

Assessing mobile readiness of a page using a trained scorer

Patent number: 7974957

Abstract: A method and system for ranking pages of a search result based on the mobile readiness of the pages is provided. A mobile-readiness system receives an indication of pages that are to be ranked. The mobile-readiness system evaluates the mobile readiness for each of the pages. Mobile readiness indicates suitability of the page for a mobile device. The mobile readiness system then ranks the pages based on the generated mobile readiness and some other criterion such as a relevance score or an importance score. The mobile-readiness system may train a classifier to classify pages based on their mobile readiness.

Type: Grant

Filed: April 5, 2007

Date of Patent: July 5, 2011

Assignee: Microsoft Corporation

Inventors: Xing Xie, Jihwan Song, Ji-Rong Wen

Data-Centric Search Engine Architecture

Publication number: 20110137886

Abstract: Described is a data-centric web search engine technology/architecture, in which document metadata, including offline-extracted metadata, is used as part of a search indexing and ranking pipeline. A web data management component receives crawled documents and extracts document metadata from the documents. An indexing component uses the document metadata to build an index for the documents. A serving component uses the index and the document metadata to serve content, e.g., search results. Also described is the use of query metadata extracted from queries of a query log for use in the pipeline.

Type: Application

Filed: December 8, 2009

Publication date: June 9, 2011

Applicant: Microsoft Corporation

Inventors: Ji-Rong Wen, Guomao Xin, Yunxiao Ma, Yu Chen, Qing Yu, Yi Liu, Zhicheng Dou, Shuming Shi

SCORING RELEVANCE OF A DOCUMENT BASED ON IMAGE TEXT

Publication number: 20110087660

Abstract: A method and system for determining relevance of a document having text and images to a text string is provided. A scoring system identifies image text associated with an image of the document. The scoring system calculates an image score indicating relevance of the image text to the text string. The image score may be used in many applications, such as searching, summary generation, document classification, image search, and image classification.

Type: Application

Filed: December 17, 2010

Publication date: April 14, 2011

Applicant: Microsoft Corporation

Inventors: Qing Yu, Shuming Shi, Zhiwei Li, Ji-Rong Wen, Wei-Ying Ma
