Patents by Inventor Benyu Zhang

Benyu Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and system for classifying and identifying messages as question or not a question within a discussion thread

Patent number: 7590603

Abstract: A method and system for classifying messages of a discussion thread as questions is provided. A classification system generates a classifier to classify messages of discussion threads as question messages or non-question messages. The system trains the classifier using the feature vectors and input classifications derived from a training set of discussion threads. After the classifier is trained, the classification system uses the classifier to classify messages within a corpus of discussion threads as question or non-question messages. To classify a message, the classification system generates a feature vector for the message and submits that feature vector to the classifier. The classifier generates a score for the message indicating a likelihood that the message is a question message.

Type: Grant

Filed: October 1, 2004

Date of Patent: September 15, 2009

Assignee: Microsoft Corporation

Inventors: Benyu Zhang, Zheng Chen, Hua-Jun Zeng, Wei-Ying Ma

METHOD AND SYSTEM FOR MINING INFORMATION BASED ON RELATIONSHIPS

Publication number: 20090228452

Abstract: A method and system for identifying information about people is provided. The information system identifies groups of people that have relationships based on their relationships to documents or more generally to objects. The information system initially is provided with an indication of which people have which relationships to which documents. The information system then identifies clusters of people based on having a relationship to the same objects. The information system may also identify clusters of related objects associated with a cluster of people. When a user wants to identify information about a person, the user can provide the name of that person to the information system. The information system then can retrieve and display the names of the other people who are in the same cluster as the person.

Type: Application

Filed: March 17, 2009

Publication date: September 10, 2009

Applicant: Microsoft Corporation

Inventors: Benyu Zhang, Wei-Ying Ma, Gu Xu, Hongbin Gao, Zheng Chen, Randy Hinrichs, Hua-Jun Zeng
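
Illustrative sketch: the clustering step described in the abstract above (grouping people who have a relationship to the same objects) could look roughly like the following. The input format, the function names `cluster_people` and `same_cluster`, and the use of union-find are assumptions made for illustration; the abstract does not specify the clustering procedure.

```python
from collections import defaultdict

def cluster_people(relationships):
    """Group people who share at least one document (hypothetical helper).

    relationships: iterable of (person, document) pairs.
    Returns a list of sets of people.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_person_for_doc = {}
    for person, doc in relationships:
        find(person)  # register the person even if the document is new
        if doc in first_person_for_doc:
            union(first_person_for_doc[doc], person)
        else:
            first_person_for_doc[doc] = person

    clusters = defaultdict(set)
    for person in parent:
        clusters[find(person)].add(person)
    return sorted(clusters.values(), key=lambda c: sorted(c))

def same_cluster(relationships, name):
    """Return the other people in the same cluster as `name`."""
    for cluster in cluster_people(relationships):
        if name in cluster:
            return cluster - {name}
    return set()
```

Given pairs such as `("ann", "d1")` and `("bob", "d1")`, calling `same_cluster(pairs, "ann")` returns the people linked to "ann" through a chain of shared documents.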

Method and system for clustering using generalized sentence patterns

Patent number: 7584100

Abstract: A method and system for clustering documents based on generalized sentence patterns of the topics of the documents is provided. A generalized sentence patterns (“GSP”) system identifies a “sentence” that describes the topic of a document. To cluster documents, the GSP system generates a “generalized sentence” form of the sentence that describes the topic of each document. The generalized sentence is an abstraction of the words of the sentence. The GSP system identifies clusters of documents based on the patterns of their generalized sentences. The GSP system clusters documents when the generalized sentence representations of their topics have a similar pattern.

Type: Grant

Filed: June 30, 2004

Date of Patent: September 1, 2009

Assignee: Microsoft Corporation

Inventors: Benyu Zhang, Wei-Ying Ma, Zheng Chen, Hua-Jun Zeng

CONSTRUCTING WEB QUERY HIERARCHIES FROM CLICK-THROUGH DATA

Publication number: 20090193047

Abstract: The claimed subject matter is directed to constructing query hierarchies in response to a query request. To construct a query hierarchy, a list of related candidate queries is generated in response to the received query request. The list of related candidate queries is generated by determining the relative coverage of information shared by the candidate queries and the query request. Relationships between the submitted query request and the candidate queries in the list are determined based upon the extent of relative coverage of information shared by the candidate queries and the query request. A query hierarchy is then constructed to reflect the determined relationships between the query request and the candidate queries.

Type: Application

Filed: January 28, 2008

Publication date: July 30, 2009

Applicant: MICROSOFT CORPORATION

Inventors: Weizhu Chen, Benyu Zhang, Zheng Chen, Jian Wang, Dou Shen
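
Illustrative sketch: one way to read "relative coverage of information shared by the candidate queries and the query request" in the abstract above is as overlap between the sets of URLs clicked for each query. The sketch below takes that reading. The click-set representation, the `coverage` formula, the 0.6 threshold, and the relation labels are all assumptions for illustration and are not taken from the application.

```python
def coverage(clicks_a, clicks_b):
    """Fraction of b's clicked URLs that also appear among a's clicked URLs."""
    if not clicks_b:
        return 0.0
    return len(clicks_a & clicks_b) / len(clicks_b)

def relate(query, candidates, clicks, threshold=0.6):
    """Label each candidate query relative to `query` (hypothetical labels).

    clicks: dict mapping a query string to the set of URLs clicked for it.
    Candidates that share no clicked URL with the query are left out.
    """
    q = clicks[query]
    relations = {}
    for cand in candidates:
        c = clicks[cand]
        q_covers_c = coverage(q, c)
        c_covers_q = coverage(c, q)
        if q_covers_c >= threshold and c_covers_q >= threshold:
            relations[cand] = "equivalent"
        elif q_covers_c >= threshold:
            relations[cand] = "child"    # the query is broader than the candidate
        elif c_covers_q >= threshold:
            relations[cand] = "parent"   # the candidate is broader than the query
        elif q_covers_c > 0:
            relations[cand] = "sibling"  # partial overlap only
    return relations
```

The resulting labels could then be arranged into a hierarchy with parents above the query and children below it.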

Method and system for prioritizing communications based on sentence classifications

Patent number: 7567895

Abstract: A method and system for prioritizing communications based on classifications of sentences within the communications is provided. A sentence classification system may classify sentences of communications according to various classifications such as “sentence mode.” The sentence classification system trains a sentence classifier using training data and then classifies sentences using the trained sentence classifier. After the sentences of a communication are classified, a document ranking system may generate a rank for the communication based on the classifications of the sentences within the communication. The document ranking system trains a document rank classifier using training data and then calculates the rank of communications using the trained document rank classifier.

Type: Grant

Filed: August 31, 2004

Date of Patent: July 28, 2009

Assignee: Microsoft Corporation

Inventors: Zheng Chen, Wei-Ying Ma, Hua-Jun Zeng, Benyu Zhang

Evaluating and generating summaries using normalized probabilities

Patent number: 7565372

Abstract: A summary system for evaluating summaries of documents and for generating summaries of documents based on normalized probabilities of portions of the document. A summarization system generates a summary by selecting sentences for the summary based on their normalized probabilities as derived from a document model. An evaluation system evaluates the effectiveness of a summary based on a normalized probability for the summary that is derived from a document model.

Type: Grant

Filed: September 13, 2005

Date of Patent: July 21, 2009

Assignee: Microsoft Corporation

Inventors: Benyu Zhang, Hua-Jun Zeng, Wei-Ying Ma, Zheng Chen, Hua Li
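
Illustrative sketch: the abstract above scores sentences and summaries by a normalized probability derived from a document model. The sketch below assumes a unigram document model and takes "normalized" to mean length-normalized (the geometric mean of per-word probabilities). Both choices, along with all function names, are assumptions for illustration; the patent may define the model and the normalization differently.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def unigram_model(document):
    """Word -> relative frequency within the document (assumed document model)."""
    counts = Counter(tokenize(document))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def normalized_probability(text, model, floor=1e-6):
    """Geometric mean of per-word probabilities, so length does not dominate."""
    words = tokenize(text)
    if not words:
        return 0.0
    log_p = sum(math.log(model.get(w, floor)) for w in words)
    return math.exp(log_p / len(words))

def summarize(document, k=1):
    """Pick the k sentences with the highest normalized probability."""
    model = unigram_model(document)
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    ranked = sorted(sentences, key=lambda s: normalized_probability(s, model), reverse=True)
    chosen = set(ranked[:k])
    return [s for s in sentences if s in chosen]  # keep original order
```

The same `normalized_probability` function can score a candidate summary against the document model, which mirrors the evaluation use described in the abstract.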

Comparatively crawling web page data records relative to a template

Patent number: 7555480

Abstract: The invention provides a method of interactively crawling data records on a web page. Users may select various data records of interest on a web page to generate templates to search for similar data items on the same web page or on different web pages. A tree matching algorithm may be used to compare and extract data matching the generated template.

Type: Grant

Filed: July 11, 2006

Date of Patent: June 30, 2009

Assignee: Microsoft Corporation

Inventors: Benyu Zhang, Chenxi Lin, Hua-Jun Zeng, Jian Wang, Ke Tang, Zheng Chen
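
Illustrative sketch: the abstract above says only that "a tree matching algorithm may be used" to compare data records against a template. Simple Tree Matching, a well-known dynamic-programming algorithm for ordered trees, is one possible choice and is shown below. The `(tag, children)` tree representation and the normalized `similarity` score are assumptions for illustration, and the patent may use a different algorithm.

```python
def simple_tree_match(a, b):
    """Return the number of matching nodes between two ordered trees.

    Each tree is a (tag, children) tuple, where children is a list of trees.
    """
    tag_a, kids_a = a
    tag_b, kids_b = b
    if tag_a != tag_b:
        return 0
    m, n = len(kids_a), len(kids_b)
    # dp[i][j]: best match count using the first i children of a
    # and the first j children of b, preserving sibling order.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = max(
                dp[i - 1][j],
                dp[i][j - 1],
                dp[i - 1][j - 1] + simple_tree_match(kids_a[i - 1], kids_b[j - 1]),
            )
    return dp[m][n] + 1  # +1 for the matching roots

def tree_size(tree):
    return 1 + sum(tree_size(child) for child in tree[1])

def similarity(template, candidate):
    """Match count divided by the average tree size, in [0, 1]."""
    matched = simple_tree_match(template, candidate)
    return 2 * matched / (tree_size(template) + tree_size(candidate))
```

A crawler could treat subtrees whose `similarity` to the user-selected template exceeds some threshold as matching data records.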

WEB CONTENT MINING OF PAIR-BASED DATA

Publication number: 20090132530

Abstract: Described herein is technology for, among other things, mining pair-based data on the web. The technology involves an online pair-based data mining system as well as an offline SVM training system. By subjecting pair-based input data to the systems, one may grow a pool of pair-based data that shares characteristics of the pair-based input data in a more efficient manner.

Type: Application

Filed: November 19, 2007

Publication date: May 21, 2009

Applicant: MICROSOFT CORPORATION

Inventors: Weizhu Chen, Long Jiang, Ming Zhou, Benyu Zhang, Zheng Chen, Jian Wang

Method and system for determining similarity of items based on similarity objects and their features

Patent number: 7533094

Abstract: A method and system for determining similarity between items is provided. To calculate similarity scores for pairs of items, the similarity system initializes a similarity score for each pair of objects and each pair of features. The similarity system then iteratively calculates the similarity scores for each pair of objects based on the similarity scores of the pairs of features calculated during a previous iteration and calculates the similarity scores for each pair of features based on the similarity scores of the pairs of objects calculated during a previous iteration. The similarity system implements an algorithm that is based on a recursive definition of the similarities between objects and between features. The similarity system continues the iterations of recalculating the similarity scores until the similarity scores converge on a solution.

Type: Grant

Filed: November 23, 2004

Date of Patent: May 12, 2009

Assignee: Microsoft Corporation

Inventors: Benyu Zhang, Hua-Jun Zeng, Wei-Ying Ma, Zheng Chen, Ning Liu, Jun Yan
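
Illustrative sketch: the iteration described in the abstract above (object similarities computed from the previous round's feature similarities and vice versa, repeated until convergence) can be illustrated with a SimRank-style bipartite update. The averaging formula, the `decay` factor, the convergence tolerance, and the name `mutual_similarity` are assumptions standing in for the patent's actual update rules, which the abstract does not give.

```python
from itertools import product

def mutual_similarity(object_features, decay=0.8, tol=1e-6, max_iter=100):
    """object_features: dict mapping each object to the set of its features.

    Returns (object_similarity, feature_similarity) as dicts keyed by pairs.
    """
    objects = sorted(object_features)
    features = sorted({f for fs in object_features.values() for f in fs})
    feature_objects = {
        f: {o for o in objects if f in object_features[o]} for f in features
    }

    # Initialize: every item is fully similar to itself and to nothing else.
    sim_o = {(a, b): float(a == b) for a, b in product(objects, repeat=2)}
    sim_f = {(f, g): float(f == g) for f, g in product(features, repeat=2)}

    def update(items, neighbors, other_sim):
        new = {}
        for a, b in product(items, repeat=2):
            na, nb = neighbors[a], neighbors[b]
            if a == b:
                new[(a, b)] = 1.0
            elif not na or not nb:
                new[(a, b)] = 0.0
            else:
                total = sum(other_sim[(x, y)] for x in na for y in nb)
                new[(a, b)] = decay * total / (len(na) * len(nb))
        return new

    for _ in range(max_iter):
        new_o = update(objects, object_features, sim_f)   # previous feature scores
        new_f = update(features, feature_objects, sim_o)  # previous object scores
        delta = max(
            max((abs(new_o[k] - sim_o[k]) for k in sim_o), default=0.0),
            max((abs(new_f[k] - sim_f[k]) for k in sim_f), default=0.0),
        )
        sim_o, sim_f = new_o, new_f
        if delta < tol:
            break
    return sim_o, sim_f
```

With this update, two objects that share similar features score higher than objects with unrelated features, and the scores stop changing once the recursion reaches a fixed point.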

METHOD AND SYSTEM FOR CLASSIFYING DISPLAY PAGES USING SUMMARIES

Publication number: 20090119284

Abstract: A method and system for classifying display pages based on automatically generated summaries of display pages. A web page classification system uses a web page summarization system to generate summaries of web pages. The summary of a web page may include the sentences of the web page that are most closely related to the primary topic of the web page. The summarization system may combine the benefits of multiple summarization techniques to identify the sentences of a web page that represent the primary topic of the web page. Once the summary is generated, the classification system may apply conventional classification techniques to the summary to classify the web page. The classification system may use conventional classification techniques such as a Naïve Bayesian classifier or a support vector machine to identify the classifications of a web page based on the summary generated by the summarization system.

Type: Application

Filed: June 24, 2008

Publication date: May 7, 2009

Applicant: Microsoft Corporation

Inventors: Zheng Chen, Dou Shen, Benyu Zhang, Hua-Jun Zeng, Wei-Ying Ma

Method and system for mining information based on relationships

Patent number: 7529735

Abstract: A method and system for identifying information about people is provided. The information system identifies groups of people that have relationships based on their relationships to documents or more generally to objects. The information system initially is provided with an indication of which people have which relationships to which documents. The information system then identifies clusters of people based on having a relationship to the same objects. The information system may also identify clusters of related objects associated with a cluster of people. When a user wants to identify information about a person, the user can provide the name of that person to the information system. The information system then can retrieve and display the names of the other people who are in the same cluster as the person.

Type: Grant

Filed: February 11, 2005

Date of Patent: May 5, 2009

Assignee: Microsoft Corporation

Inventors: Benyu Zhang, Wei-Ying Ma, Gu Xu, Hongbin Gao, Zheng Chen, Randy Hinrichs, Hua-Jun Zeng

Document characterization using a tensor space model

Patent number: 7529719

Abstract: Computer-readable media having computer-executable instructions and apparatuses categorize a document or a corpus of documents. A Tensor Space Model (TSM), which models the text by a higher-order tensor, represents a document or a corpus of documents. Supported by techniques of multilinear algebra, TSM provides a framework for analyzing the multifactor structures. TSM is further supported by operations and presented tools, such as the High-Order Singular Value Decomposition (HOSVD) for a reduction of the dimensions of the higher-order tensor. The dimensionally reduced tensor is compared with tensors that represent possible categories. Consequently, a category is selected for the document or corpus of documents. Experimental results on the 20 Newsgroups dataset suggest that TSM has advantages over a Vector Space Model (VSM) for text classification.

Type: Grant

Filed: March 17, 2006

Date of Patent: May 5, 2009

Assignee: Microsoft Corporation

Inventors: Ning Liu, Benyu Zhang, Jun Yan, Zheng Chen, Hua-Jun Zeng, Jian Wang

METHOD AND SYSTEM FOR PRIORITIZING COMMUNICATIONS BASED ON SENTENCE CLASSIFICATIONS

Publication number: 20090106019

Abstract: A method and system for prioritizing communications based on classifications of sentences within the communications is provided. A sentence classification system may classify sentences of communications according to various classifications such as “sentence mode.” The sentence classification system trains a sentence classifier using training data and then classifies sentences using the trained sentence classifier. After the sentences of a communication are classified, a document ranking system may generate a rank for the communication based on the classifications of the sentences within the communication. The document ranking system trains a document rank classifier using training data and then calculates the rank of communications using the trained document rank classifier.

Type: Application

Filed: October 20, 2008

Publication date: April 23, 2009

Applicant: Microsoft Corporation

Inventors: Zheng Chen, Wei-Ying Ma, Hua-Jun Zeng, Benyu Zhang

System and method for interactively displaying multi-dimensional data

Patent number: 7506274

Abstract: Methods and systems for displaying data retrieved from a multi-dimensional data source via an interactive data diagram. A graphical user interface is responsive to input from a user to retrieve multi-dimensional data for display via an interactive data diagram. The interactive data diagram displays multi-dimensional data in a hierarchical structure that includes a plurality of dimension levels and one or more member levels within each dimension level. A user specifies a change to the display structure by selecting a displayed member level in the hierarchical structure. The interactive data diagram is responsive to the user specified change to generate a drilled down data diagram displaying detailed dimension and member levels related to the selected member level.

Type: Grant

Filed: May 18, 2005

Date of Patent: March 17, 2009

Assignee: Microsoft Corporation

Inventors: Benyu Zhang, Teresa B. Mah, Lee Wang, Julie L. Hesseltine Richardson

Extracting semantic attributes

Patent number: 7502785

Abstract: Extraction of semantic information and the generation of semantic attributes allows for improved organization and management of data. Semantic attributes are automatically generated and eliminate the need for manual entry of attribute information. A semantic file network may further be constructed based on similarities between files that are based on the semantic attribute information. Semantic links representing a semantic relationship may be built between similar or relevant files. In addition, user operations and user operation patterns may also be considered in building the file network. Semantic attributes and information may further facilitate browsing the file systems as well as improve the accuracy and speed of queries.

Type: Grant

Filed: March 30, 2006

Date of Patent: March 10, 2009

Assignee: Microsoft Corporation

Inventors: Zheng Chen, Lei Li, Chenxi Lin, Qiaoling Liu, Jian Wang, Benyu Zhang

Method and system for incrementally learning an adaptive subspace by optimizing the maximum margin criterion

Patent number: 7502495

Abstract: A method and system for generating a projection matrix for projecting data from a high dimensional space to a low dimensional space. The system establishes an objective function based on a maximum margin criterion matrix. The system then provides data samples that are in the high dimensional space and have a class. For each data sample, the system incrementally derives leading eigenvectors of the maximum margin criterion matrix based on the derivation of the leading eigenvectors of the last data sample. The derived eigenvectors compose the projection matrix, which can be used to project data samples in a high dimensional space into a low dimensional space.

Type: Grant

Filed: March 1, 2005

Date of Patent: March 10, 2009

Assignee: Microsoft Corporation

Inventors: Benyu Zhang, Hua-Jun Zeng, Jun Yan, Wei-Ying Ma, Zheng Chen

REPRESENTING QUERIES AND DETERMINING SIMILARITY BASED ON AN ARIMA MODEL

Publication number: 20090006326

Abstract: Representing queries and determining similarity of queries based on an autoregressive integrated moving average (“ARIMA”) model is provided. A query analysis system represents each query by its ARIMA coefficients. The query analysis system may estimate the frequency information for a desired past or future interval based on frequency information for some initial intervals. The query analysis system may also determine the similarity of a pair of queries based on the similarity of their ARIMA coefficients. The query analysis system may use various metrics, such as a correlation metric, to determine the similarity of the ARIMA coefficients.

Type: Application

Filed: June 28, 2007

Publication date: January 1, 2009

Applicant: Microsoft Corporation

Inventors: Ning Liu, Jun Yan, Benyu Zhang, Zheng Chen, Jian Wang

FORECASTING TIME-DEPENDENT SEARCH QUERIES

Publication number: 20090006045

Abstract: Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.

Type: Application

Filed: June 28, 2007

Publication date: January 1, 2009

Applicant: Microsoft Corporation

Inventors: Ning Liu, Jun Yan, Benyu Zhang, Zheng Chen, Jian Wang

DETERMINATION OF TIME DEPENDENCY OF SEARCH QUERIES

Publication number: 20090006312

Abstract: Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.

Type: Application

Filed: June 28, 2007

Publication date: January 1, 2009

Applicant: Microsoft Corporation

Inventors: Ning Liu, Jun Yan, Benyu Zhang, Zheng Chen, Jian Wang

FORECASTING TIME-INDEPENDENT SEARCH QUERIES

Publication number: 20090006284

Abstract: Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.

Type: Application

Filed: June 28, 2007

Publication date: January 1, 2009

Applicant: Microsoft Corporation

Inventors: Ning Liu, Jun Yan, Benyu Zhang, Zheng Chen, Jian Wang
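
Illustrative sketch: the three query-analysis abstracts above describe two steps, deciding whether a query's frequency is periodic (time-dependent) and finding "query events" where the frequency increases significantly. The sketch below uses sample autocorrelation for the first step and a trailing-window threshold for the second. Both techniques, the thresholds, and all function names are assumptions for illustration; the applications do not state which tests are used.

```python
from statistics import mean, pstdev

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at `lag`."""
    n = len(series)
    mu = mean(series)
    var = sum((x - mu) ** 2 for x in series)
    if var == 0 or lag >= n:
        return 0.0
    cov = sum((series[t] - mu) * (series[t + lag] - mu) for t in range(n - lag))
    return cov / var

def dominant_period(series, min_lag=2, threshold=0.5):
    """Return the lag with the strongest autocorrelation above `threshold`.

    A result of None suggests the query is not periodic under this test.
    """
    best_lag, best = None, threshold
    for lag in range(min_lag, len(series) // 2 + 1):
        r = autocorrelation(series, lag)
        if r > best:
            best_lag, best = lag, r
    return best_lag

def query_events(series, window=4, k=3.0):
    """Indices where frequency exceeds the trailing mean by k standard deviations."""
    events = []
    for t in range(window, len(series)):
        past = series[t - window:t]
        spread = pstdev(past) or 1.0  # avoid a zero threshold on flat history
        if series[t] > mean(past) + k * spread:
            events.append(t)
    return events
```

A query with a dominant period could be forecast from its past cycles, while event indices from `query_events` for different queries could be compared to look for queries whose events tend to precede one another.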
