Patents by Inventor Benyu Zhang

Benyu Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7590603
    Abstract: A method and system for classifying messages of a discussion thread as questions is provided. A classification system generates a classifier to classify messages of discussion threads as question messages or non-question messages. The system trains the classifier using the feature vectors and input classifications derived from a training set of discussion threads. After the classifier is trained, the classification system uses the classifier to classify messages within a corpus of discussion threads as question or non-question messages. To classify a message, the classification system generates a feature vector for the messages and submits that feature vector to the classifier. The classifier generates a score for the message indicating a likelihood that the message is a question message.
    Type: Grant
    Filed: October 1, 2004
    Date of Patent: September 15, 2009
    Assignee: Microsoft Corporation
    Inventors: Benyu Zhang, Zheng Chen, Hua-Jun Zeng, Wei-Ying Ma
  • Publication number: 20090228452
    Abstract: A method and system for identifying information about people is provided. The information system identifies groups of people that have relationships based on their relationships to documents or more generally to objects. The information system initially is provided with an indication of which people have which relationships to which documents. The information system then identifies clusters of people based on having a relationship to the same objects. The information system may also identify clusters of related objects associated with a cluster of people. When a user wants to identify information about a person, the user can provide the name of that person to the information system. The information system then can retrieve and display the names of the other people who are in the same cluster as the person.
    Type: Application
    Filed: March 17, 2009
    Publication date: September 10, 2009
    Applicant: Microsoft Corporation
    Inventors: Benyu Zhang, Wei-Ying Ma, Gu Xu, Hongbin Gao, Zheng Chen, Randy Hinrichs, Hua-Jun Zeng
  • Patent number: 7584100
    Abstract: A method and system for clustering documents based on generalized sentence patterns of the topics of the documents is provided. A generalized sentence patterns (“GSP”) system identifies a “sentence” that describes the topic of a document. To cluster documents, the GSP system generates a “generalized sentence” form of the sentence that describes the topic of each document. The generalized sentence is an abstraction of the words of the sentence. The GSP system identifies clusters of documents based on the patterns of their generalized sentences. The GSP system clusters documents when the generalized sentence representations of their topics have a similar pattern.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: September 1, 2009
    Assignee: Microsoft Corporation
    Inventors: Benyu Zhang, Wei-Ying Ma, Zheng Chen, Hua-Jun Zeng
  • Publication number: 20090193047
    Abstract: The claimed subject matter is directed to constructing query hierarchies in response to a query request. To construct a query hierarchy, a list of related candidate queries is generated in response to the received query request. The list of related candidate queries is generated by determining the relative coverage of information shared by the candidate queries and the query request. Relationships between the submitted query request and the candidate queries in the list are determined based upon the extent of relative coverage of information shared by the candidate queries and the query request. A query hierarchy is then constructed to reflect the determined relationships between the query request and the candidate queries.
    Type: Application
    Filed: January 28, 2008
    Publication date: July 30, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Weizhu Chen, Benyu Zhang, Zheng Chen, Jian Wang, Dou Shen
  • Patent number: 7567895
    Abstract: A method and system for prioritizing communications based on classifications of sentences within the communications is provided. A sentence classification system may classify sentences of communications according to various classifications such as “sentence mode.” The sentence classification system trains a sentence classifier using training data and then classifies sentences using the trained sentence classifier. After the sentences of a communication are classified, a document ranking system may generate a rank for the communication based on the classifications of the sentences within the communication. The document ranking system trains a document rank classifier using training data and then calculates the rank of communications using the trained document rank classifier.
    Type: Grant
    Filed: August 31, 2004
    Date of Patent: July 28, 2009
    Assignee: Microsoft Corporation
    Inventors: Zheng Chen, Wei-Ying Ma, Hua-Jun Zeng, Benyu Zhang
  • Patent number: 7565372
    Abstract: A summary system for evaluating summaries of documents and for generating summaries of documents based on normalized probabilities of portions of the document. A summarization system generates a summary by selecting sentences for the summary based on their normalized probabilities as derived from a document model. An evaluation system evaluates the effectiveness of a summary based on a normalized probability for the summary that is derived from a document model.
    Type: Grant
    Filed: September 13, 2005
    Date of Patent: July 21, 2009
    Assignee: Microsoft Corporation
    Inventors: Benyu Zhang, Hua-Jun Zeng, Wei-Ying Ma, Zheng Chen, Hua Li
  • Patent number: 7555480
    Abstract: The invention provides a method of interactively crawling data records on a web page. Users may select various data records of interest on a web page to generate templates to search for similar data items on the same web page or on different web pages. A tree matching algorithm may be used to compare and extract data matching the generated template.
    Type: Grant
    Filed: July 11, 2006
    Date of Patent: June 30, 2009
    Assignee: Microsoft Corporation
    Inventors: Benyu Zhang, Chenxi Lin, Hua-Jun Zeng, Jian Wang, Ke Tang, Zheng Chen
  • Publication number: 20090132530
    Abstract: Described herein is technology for, among other things, mining pair-based data on the web. The technology involves an online pair-based data mining system as well as an offline SVM training system. By subjecting a pair-based input data to the systems, one may grow a pool of pair-based data which share characteristics of the pair-based input data in more efficient manner.
    Type: Application
    Filed: November 19, 2007
    Publication date: May 21, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Weizhu Chen, Long Jiang, Ming Zhou, Benyu Zhang, Zheng Chen, Jian Wang
  • Patent number: 7533094
    Abstract: A method and system for determining similarity between items is provided. To calculate similarity scores for pairs of items, the similarity system initializes a similarity score for each pair of objects and each pair of features. The similarity system then iteratively calculates the similarity scores for each pair of objects based on the similar scores of the pairs of features calculated during a previous iteration and calculates the similarity scores for each pair of features based on the similarity scores of the pairs of objects calculated during a previous iteration. The similarity system implements an algorithm that is based on a recursive definition of the similarities between objects and between features. The similarity system continues the iterations of recalculating the similarity scores until the similarity scores converge on a solution.
    Type: Grant
    Filed: November 23, 2004
    Date of Patent: May 12, 2009
    Assignee: Microsoft Corporation
    Inventors: Benyu Zhang, Hua-Jun Zeng, Wei-Ying Ma, Zheng Chen, Ning Liu, Jun Yan
  • Publication number: 20090119284
    Abstract: A method and system for classifying display pages based on automatically generated summaries of display pages. A web page classification system uses a web page summarization system to generate summaries of web pages. The summary of a web page may include the sentences of the web page that are most closely related to the primary topic of the web page. The summarization system may combine the benefits of multiple summarization techniques to identify the sentences of a web page that represent the primary topic of the web page. Once the summary is generated, the classification system may apply conventional classification techniques to the summary to classify the web page. The classification system may use conventional classification techniques such as a Naïve Bayesian classifier or a support vector machine to identify the classifications of a web page based on the summary generated by the summarization system.
    Type: Application
    Filed: June 24, 2008
    Publication date: May 7, 2009
    Applicant: Microsoft Corporation
    Inventors: Zheng Chen, Dou Shen, Benyu Zhang, Hua-Jun Zeng, Wei-Ying Ma
  • Patent number: 7529735
    Abstract: A method and system for identifying information about people is provided. The information system identifies groups of people that have relationships based on their relationships to documents or more generally to objects. The information system initially is provided with an indication of which people have which relationships to which documents. The information system then identifies clusters of people based on having a relationship to the same objects. The information system may also identify clusters of related objects associated with a cluster of people. When a user wants to identify information about a person, the user can provide the name of that person to the information system. The information system then can retrieve and display the names of the other people who are in the same cluster as the person.
    Type: Grant
    Filed: February 11, 2005
    Date of Patent: May 5, 2009
    Assignee: Microsoft Corporation
    Inventors: Benyu Zhang, Wei-Ying Ma, Gu Xu, Hongbin Gao, Zheng Chen, Randy Hinrichs, Hua-Jun Zeng
  • Patent number: 7529719
    Abstract: Computer-readable media having computer-executable instructions and apparatuses categorize documents or corpus of documents. A Tensor Space Model (TSM), which models the text by a higher-order tensor, represents a document or a corpus of documents. Supported by techniques of multilinear algebra, TSM provides a framework for analyzing the multifactor structures. TSM is further supported by operations and presented tools, such as the High-Order Singular Value Decomposition (HOSVD) for a reduction of the dimensions of the higher-order tensor. The dimensionally reduced tensor is compared with tensors that represent possible categories. Consequently, a category is selected for the document or corpus of documents. Experimental results on the dataset for 20 Newsgroups suggest that TSM is advantageous to a Vector Space Model (VSM) for text classification.
    Type: Grant
    Filed: March 17, 2006
    Date of Patent: May 5, 2009
    Assignee: Microsoft Corporation
    Inventors: Ning Liu, Benyu Zhang, Jun Yan, Zheng Chen, Hua-Jun Zeng, Jian Wang
  • Publication number: 20090106019
    Abstract: A method and system for prioritizing communications based on classifications of sentences within the communications is provided. A sentence classification system may classify sentences of communications according to various classifications such as “sentence mode.” The sentence classification system trains a sentence classifier using training data and then classifies sentences using the trained sentence classifier. After the sentences of a communication are classified, a document ranking system may generate a rank for the communication based on the classifications of the sentences within the communication. The document ranking system trains a document rank classifier using training data and then calculates the rank of communications using the trained document rank classifier.
    Type: Application
    Filed: October 20, 2008
    Publication date: April 23, 2009
    Applicant: Microsoft Corporation
    Inventors: Zheng Chen, Wei-Ying Ma, Hua-Jun Zeng, Benyu Zhang
  • Patent number: 7506274
    Abstract: Methods and systems for displaying data retrieved from a multi-dimensional data source via an interactive data diagram. A graphical user interface is responsive to input from a user to retrieve multi-dimensional data for display via an interactive data diagram. The interactive data diagram displays multi-dimensional data in a hierarchical structure that includes a plurality of dimension levels and one or more member levels within each dimension level. A user specifies a change to the display structure by selecting a displayed member level in the hierarchical structure. The interactive data diagram is responsive to the user specified change to generate a drilled down data diagram displaying detailed dimension and member levels related to the selected member level.
    Type: Grant
    Filed: May 18, 2005
    Date of Patent: March 17, 2009
    Assignee: Microsoft Corporation
    Inventors: Benyu Zhang, Teresa B. Mah, Lee Wang, Julie L. Hesseltine Richardson
  • Patent number: 7502785
    Abstract: Extraction of semantic information and the generation of semantic attributes allows for improved organization and management of data. Semantic attributes are automatically generated and eliminate the need for manual entry of attribute information. A semantic file network may further be constructed based on similarities between files that are based on the semantic attribute information. Semantic links representing a semantic relationship may be built between similar or relevant files. In addition, user operations and user operation patterns may also be considered in building the file network. Semantic attributes and information may further facilitate browsing the file systems as well as improve the accuracy and speed of queries.
    Type: Grant
    Filed: March 30, 2006
    Date of Patent: March 10, 2009
    Assignee: Microsoft Corporation
    Inventors: Zheng Chen, Lei Li, Chenxi Lin, Qiaoling Liu, Jian Wang, Benyu Zhang
  • Patent number: 7502495
    Abstract: A method and system for generating a projection matrix for projecting data from a high dimensional space to a low dimensional space. The system establishes an objective function based on a maximum margin criterion matrix. The system then provides data samples that are in the high dimensional space and have a class. For each data sample, the system incrementally derives leading eigenvectors of the maximum margin criterion matrix based on the derivation of the leading eigenvectors of the last data sample. The derived eigenvectors compose the projection matrix, which can be used to project data samples in a high dimensional space into a low dimensional space.
    Type: Grant
    Filed: March 1, 2005
    Date of Patent: March 10, 2009
    Assignee: Microsoft Corporation
    Inventors: Benyu Zhang, Hua-Jun Zeng, Jun Yan, Wei-Ying Ma, Zheng Chen
  • Publication number: 20090006326
    Abstract: Representing queries and determining similarity of queries based on an autoregressive integrated moving average (“ARIMA”) model is provided. A query analysis system represents each query by its ARIMA coefficients. The query analysis system may estimate the frequency information for a desired past or future interval based on frequency information for some initial intervals. The query analysis system may also determine the similarity of a pair of queries based on the similarity of their ARIMA coefficients. The query analysis system may use various metrics, such as a correlation metric, to determine the similarity of the ARIMA coefficients.
    Type: Application
    Filed: June 28, 2007
    Publication date: January 1, 2009
    Applicant: Microsoft Corporation
    Inventors: Ning Liu, Jun Yan, Benyu Zhang, Zheng Chen, Jian Wang
  • Publication number: 20090006045
    Abstract: Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.
    Type: Application
    Filed: June 28, 2007
    Publication date: January 1, 2009
    Applicant: Microsoft Corporation
    Inventors: Ning Liu, Jun Yan, Benyu Zhang, Zheng Chen, Jian Wang
  • Publication number: 20090006312
    Abstract: Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.
    Type: Application
    Filed: June 28, 2007
    Publication date: January 1, 2009
    Applicant: Microsoft Corporation
    Inventors: Ning Liu, Jun Yan, Benyu Zhang, Zheng Chen, Jian Wang
  • Publication number: 20090006284
    Abstract: Techniques for analyzing and modeling the frequency of queries are provided by a query analysis system. A query analysis system analyzes frequencies of a query over time to determine whether the query is time-dependent or time-independent. The query analysis system forecasts the frequency of time-dependent queries based on their periodicities. The query analysis system forecasts the frequency of time-independent queries based on causal relationships with other queries. To forecast the frequency of time-independent queries, the query analysis system analyzes the frequency of a query over time to identify significant increases in the frequency, which are referred to as “query events” or “events.” The query analysis system forecasts frequencies of time-independent queries based on queries with events that tend to causally precede events of the query to be forecasted.
    Type: Application
    Filed: June 28, 2007
    Publication date: January 1, 2009
    Applicant: Microsoft Corporation
    Inventors: Ning Liu, Jun Yan, Benyu Zhang, Zheng Chen, Jian Wang