Patents by Inventor Benyu Zhang

Benyu Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20080070209
    Abstract: An influential persons identification system and method for identifying a set of influential persons (or influencers) in a social network (such as an online social network). The influential persons set is generated such that by sending a message to the set the message will be propagated through the network at the greatest speed and coverage. A ranking of users is generated, and a pruning process is performed starting with the top-ranked user and working down the list. For each user on the list, the user is identified as an influencer and then the user and each of his friends are deleted from the social network users list. Next, the same process is performed for the second-ranked user, the third-ranked user, and so forth. The process terminates when the list of users of the social network is exhausted or the desired number of influencers on the influential person set is reached.
    Type: Application
    Filed: September 20, 2006
    Publication date: March 20, 2008
    Applicant: Microsoft Corporation
    Inventors: Dong Zhuang, Benyu Zhang, Heng Zhang, Jeremy Tantrum, Teresa Mah, Hua-Jun Zeng, Zheng Chen, Jian Wang
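
The pruning process described in the abstract above can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the applicants' implementation: the function name `select_influencers`, the `friends` mapping, and the sample users are all hypothetical, and the ranking is assumed to be supplied already sorted from top-ranked to bottom-ranked.

```python
def select_influencers(ranked_users, friends, max_influencers=None):
    """Greedy pruning over a ranked user list (hypothetical sketch).

    ranked_users: user ids, best-ranked first (assumed precomputed).
    friends: dict mapping a user id to the set of that user's friends.
    max_influencers: optional cap on the size of the influencer set.
    """
    remaining = set(ranked_users)  # users still on the network's user list
    influencers = []
    for user in ranked_users:
        # Stop when the desired number of influencers has been reached.
        if max_influencers is not None and len(influencers) >= max_influencers:
            break
        # Stop when the user list is exhausted.
        if not remaining:
            break
        # Skip users already deleted as a friend of an earlier influencer.
        if user not in remaining:
            continue
        influencers.append(user)
        # Delete the influencer and each of the influencer's friends.
        remaining.discard(user)
        remaining -= friends.get(user, set())
    return influencers


ranked = ["ann", "bob", "cat", "dan", "eve"]
friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann"},
    "cat": {"ann", "dan"},
    "dan": {"cat", "eve"},
    "eve": {"dan"},
}
all_influencers = select_influencers(ranked, friends)
top_one = select_influencers(ranked, friends, max_influencers=1)
```

In this toy network, picking "ann" removes "bob" and "cat", so the next user still on the list is "dan".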
  • Patent number: 7346621
    Abstract: A method and system for ranking objects based on relationships with objects of a different object type is provided. The ranking system defines an equation for each attribute of each type of object. The equations define the attribute values and are based on relationships between the attribute and the attributes associated with the same type of object and different types of objects. The ranking system iteratively calculates the attribute values for the objects using the equations until the attribute values converge on a solution. The ranking system then ranks objects based on attribute values.
    Type: Grant
    Filed: May 14, 2004
    Date of Patent: March 18, 2008
    Assignee: Microsoft Corporation
    Inventors: Benyu Zhang, Hua-Jun Zeng, Wei-Ying Ma, Wensi Xi, Zheng Chen, Edward A. Fox
  • Publication number: 20080016087
    Abstract: The invention provides a method of interactively crawling data records on a web page. Users may select various data records of interest on a web page to generate templates to search for similar data items on the same web page or on different web pages. A tree matching algorithm may be used to compare and extract data matching the generated template.
    Type: Application
    Filed: July 11, 2006
    Publication date: January 17, 2008
    Applicant: One Microsoft Way
    Inventors: Benyu Zhang, Chenxi Lin, Hua-Jun Zeng, Jian Wang, Ke Tang, Zheng Chen
  • Patent number: 7305389
    Abstract: Systems and methods providing computer-implemented content propagation for enhanced document retrieval are described. In one aspect, reference information directed to one or more documents is identified. The reference information is identified from one or more sources of data that are independent of a data source that includes the one or more documents. Metadata that is proximally located to the reference information is extracted from the one or more sources of data. Relevance between respective features of the metadata to content of associated ones of the one or more documents is calculated. For each document of the one or more documents, associated portions of the metadata are indexed with the relevance of features from the respective portions into original content of the document. The indexing generates one or more enhanced documents.
    Type: Grant
    Filed: April 15, 2004
    Date of Patent: December 4, 2007
    Assignee: Microsoft Corporation
    Inventors: Hua-Jun Zeng, Benyu Zhang, Zheng Chen, Wei-Ying Ma, Hsiao-Wuen Hon, Daniel B. Cook, Gabor Hirschler, Karen Fries, Kurt Samuelson
  • Patent number: 7289985
    Abstract: Systems and methods for enhanced document retrieval are described. In one aspect, a search query from an end-user is received. Responsive to receiving the search query, search results are retrieved. The search results include an enhanced document and a set of non-enhanced documents. The enhanced document and the non-enhanced documents include term(s) of the search query. The enhanced document is derived from a base document. The base document was modified with metadata mined from one or more different documents. The metadata is associated with one or more respective references to the base document. The one or more different documents are independent of the base document.
    Type: Grant
    Filed: April 15, 2004
    Date of Patent: October 30, 2007
    Assignee: Microsoft Corporation
    Inventors: Hua-Jun Zeng, Benyu Zhang, Zheng Chen, Wei-Ying Ma, Hsiao-Wuen Hon, Daniel B. Cook, Gabor Hirschler, Karen Fries, Kurt Samuelson
  • Publication number: 20070239643
    Abstract: Computer-readable media having computer-executable instructions and apparatuses categorize documents or corpus of documents. A Tensor Space Model (TSM), which models the text by a higher-order tensor, represents a document or a corpus of documents. Supported by techniques of multilinear algebra, TSM provides a framework for analyzing the multifactor structures. TSM is further supported by operations and presented tools, such as the High-Order Singular Value Decomposition (HOSVD) for a reduction of the dimensions of the higher-order tensor. The dimensionally reduced tensor is compared with tensors that represent possible categories. Consequently, a category is selected for the document or corpus of documents. Experimental results on the dataset for 20 Newsgroups suggest that TSM is advantageous to a Vector Space Model (VSM) for text classification.
    Type: Application
    Filed: March 17, 2006
    Publication date: October 11, 2007
    Applicant: Microsoft Corporation
    Inventors: Ning Liu, Benyu Zhang, Jun Yan, Zheng Chen, Hua-Jun Zeng, Jian Wang
  • Publication number: 20070239431
    Abstract: A scalable two-pass probabilistic latent semantic analysis (PLSA) methodology is disclosed that may perform more efficiently, and in some cases more accurately, than traditional PLSA, especially where large and/or sparse data sets are provided for analysis. The improved methodology can greatly reduce the storage and/or computational costs of training a PLSA model. In the first pass of the two-pass methodology, objects are clustered into groups, and PLSA is performed on the groups instead of the original individual objects. In the second pass, the conditional probability of a latent class, given an object, is obtained. This may be done by extending the training results of the first pass. During the second pass, the most likely latent classes for each object are identified.
    Type: Application
    Filed: March 30, 2006
    Publication date: October 11, 2007
    Applicant: Microsoft Corporation
    Inventors: Chenxi Lin, Jie Han, Guirong Xue, Hua-Jun Zeng, Benyu Zhang, Zheng Chen, Jian Wang
  • Publication number: 20070239697
    Abstract: Extraction of semantic information and the generation of semantic attributes allows for improved organization and management of data. Semantic attributes are automatically generated and eliminate the need for manual entry of attribute information. A semantic file network may further be constructed based on similarities between files that are based on the semantic attribute information. Semantic links representing a semantic relationship may be built between similar or relevant files. In addition, user operations and user operation patterns may also be considered in building the file network. Semantic attributes and information may further facilitate browsing the file systems as well as improve the accuracy and speed of queries.
    Type: Application
    Filed: March 30, 2006
    Publication date: October 11, 2007
    Applicant: Microsoft Corporation
    Inventors: Zheng Chen, Lei Li, Chenxi Lin, Qiaoling Liu, Jian Wang, Benyu Zhang
  • Publication number: 20070239712
    Abstract: Extraction of semantic information and the generation of semantic attributes allows for improved organization and management of data. Semantic attributes are automatically generated and eliminate the need for manual entry of attribute information. A semantic file network may further be constructed based on similarities between files that are based on the semantic attribute information. Semantic links representing a semantic relationship may be built between similar or relevant files. In addition, user operations and user operation patterns may also be considered in building the file network. Semantic attributes and information may further facilitate browsing the file systems as well as improve the accuracy and speed of queries.
    Type: Application
    Filed: March 30, 2006
    Publication date: October 11, 2007
    Applicant: Microsoft Corporation
    Inventors: Zheng Chen, Lei Li, Chenxi Lin, Qiaoling Liu, Jian Wang, Benyu Zhang
  • Publication number: 20070239638
    Abstract: Embodiments of the invention relate to improvements to the support vector machine (SVM) classification model. When text data is significantly unbalanced (i.e., positive and negative labeled data are in disproportion), the classification quality of standard SVM deteriorates. Embodiments of the invention are directed to a weighted proximal SVM (WPSVM) model that achieves substantially the same accuracy as the traditional SVM model while requiring significantly less computational time. A weighted proximal SVM (WPSVM) model in accordance with embodiments of the invention may include a weight for each training error and a method for estimating the weights, which automatically solves the unbalanced data problem.
    Type: Application
    Filed: March 20, 2006
    Publication date: October 11, 2007
    Applicant: Microsoft Corporation
    Inventors: Dong Zhuang, Benyu Zhang, Zheng Chen, Hua-Jun Zeng, Jian Wang
  • Publication number: 20070239553
    Abstract: In an embodiment, a method of predicting an active user's rating for an item is disclosed. A database of users may be sorted into clusters. The data associated with the users in each cluster may be smoothed to fill in ratings for items that the users have not personally rated. An active user may then be compared to a set of users, where the set may be all or some portion of the database, to determine the K users that are most similar to the active user. The ratings of the K users regarding the item may be used to predict the active user's rating for the item. In an embodiment, the rating of each of the K users is assigned a confidence value associated with whether the user personally rated the item or if the rating was generated by the data smoothing process.
    Type: Application
    Filed: March 16, 2006
    Publication date: October 11, 2007
    Applicant: Microsoft Corporation
    Inventors: Chenxi Lin, Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Benyu Zhang, Jian Wang
  • Publication number: 20070239554
    Abstract: Methods for determining a predictive rating are disclosed. In an embodiment, an active user is compared to a set of clusters. One or more of the clusters are determined to be most similar to the active user. From the one or more clusters, K users are determined to be most similar to the active user. Prior ratings for an item by the K users may be used to predict a rating for the item for the active user.
    Type: Application
    Filed: March 16, 2006
    Publication date: October 11, 2007
    Applicant: Microsoft Corporation
    Inventors: Chenxi Lin, Gui-Rong Xue, Hua-Jun Zeng, Zheng Chen, Benyu Zhang, Jian Wang
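
The cluster-then-neighbor prediction described in the abstract above might be sketched as follows. The abstract does not specify a similarity measure, a cluster representation, or how the K ratings are combined, so everything here is an assumption chosen for illustration: similarity is taken as 1 / (1 + mean absolute rating difference on co-rated items), each cluster is summarized by a per-item mean (`centroid`), and the prediction is a similarity-weighted average. All names and data are hypothetical.

```python
def similarity(a, b):
    """Assumed measure: 1 / (1 + mean absolute difference on co-rated items)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_diff = sum(abs(a[i] - b[i]) for i in shared) / len(shared)
    return 1.0 / (1.0 + mean_diff)


def centroid(members, ratings):
    """Mean rating per item over the cluster members who rated it."""
    totals, counts = {}, {}
    for user in members:
        for item, value in ratings[user].items():
            totals[item] = totals.get(item, 0.0) + value
            counts[item] = counts.get(item, 0) + 1
    return {item: totals[item] / counts[item] for item in totals}


def predict(active, item, clusters, ratings, k=2, n_clusters=1):
    # Step 1: find the clusters most similar to the active user.
    ranked = sorted(
        clusters,
        key=lambda c: similarity(active, centroid(c, ratings)),
        reverse=True,
    )
    # Step 2: within those clusters, keep the K most similar users
    # who have a prior rating for the item.
    candidates = [u for c in ranked[:n_clusters] for u in c if item in ratings[u]]
    neighbours = sorted(
        candidates, key=lambda u: similarity(active, ratings[u]), reverse=True
    )[:k]
    # Step 3: combine their prior ratings, weighted by similarity.
    weights = [similarity(active, ratings[u]) for u in neighbours]
    total = sum(weights)
    if total == 0:
        return None  # no usable neighbours
    return sum(w * ratings[u][item] for w, u in zip(weights, neighbours)) / total


ratings = {
    "u1": {"a": 5, "b": 4, "c": 5},
    "u2": {"a": 4, "b": 5, "c": 4},
    "u3": {"a": 1, "b": 2, "c": 1},
    "u4": {"a": 2, "b": 1, "c": 2},
}
clusters = [["u1", "u2"], ["u3", "u4"]]
active_user = {"a": 5, "b": 5}
predicted = predict(active_user, "c", clusters, ratings)
```

Restricting the neighbor search to the closest clusters keeps the comparison set small, which appears to be the point of the clustering step.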
  • Publication number: 20070239792
    Abstract: Extraction of semantic information and the generation of semantic attributes allows for improved organization and management of data. Semantic attributes are automatically generated and eliminate the need for manual entry of attribute information. A semantic file network may further be constructed based on similarities between files that are based on the semantic attribute information. Semantic links representing a semantic relationship may be built between similar or relevant files. In addition, user operations and user operation patterns may also be considered in building the file network. Semantic attributes and information may further facilitate browsing the file systems as well as improve the accuracy and speed of queries.
    Type: Application
    Filed: March 30, 2006
    Publication date: October 11, 2007
    Applicant: Microsoft Corporation
    Inventors: Zheng Chen, Lei Li, Chenxi Lin, Qiaoling Liu, Jian Wang, Benyu Zhang
  • Publication number: 20070219945
    Abstract: Computer-readable media having computer-executable instructions and apparatuses provide a keyphrase navigation map (KNM) for a document page. Keyphrases are extracted from the document page. Keyphrase clusters are subsequently formed by a measure of relevancy, and a salient keyphrase is determined for each cluster. A thumbnail is formed with tags corresponding to the salient keyphrases. A selected tag is expanded with associated keyphrases. An associated keyphrase may be further selected in order to facilitate the navigation of the document page. The displayed tags on the thumbnail are positioned in accordance with locations of associated keyphrases in the document page.
    Type: Application
    Filed: March 9, 2006
    Publication date: September 20, 2007
    Applicant: Microsoft Corporation
    Inventors: Min Wang, Benyu Zhang, Hua-Jun Zeng, Jian Wang, Shiguang Liu, Zheng Chen
  • Publication number: 20070208728
    Abstract: This invention provides a system and method for predicting user demographic attributes for non-registered users and users with incomplete profiles. The invention uses demographic information from registered users and user search history logs to create a database of information that associates the users' search history habits with their demographic attributes. The invention creates a first database that associates users' search query history with their demographic attributes, and also creates a second database that associates web pages that users have visited frequently along with the users' demographic attributes. The invention can compare the searching and browsing habits of non-registered users and users with incomplete profiles to the searching and browsing habits of registered users.
    Type: Application
    Filed: March 3, 2006
    Publication date: September 6, 2007
    Applicant: Microsoft Corporation
    Inventors: Benyu Zhang, Honghua Dai, Hua-Jun Zeng, Li Qi, Tarek Najm, Teresa Mah, Vladimir Shipunov, Ying Li, Zheng Chen
  • Patent number: 7260568
    Abstract: Systems and methods for verifying relevance between terms and Web site contents are described. In one aspect, site contents from a bid URL are retrieved. Expanded term(s) semantically and/or contextually related to bid term(s) are calculated. Content similarity and expanded similarity measurements are calculated from respective combinations of the bid term(s), the site contents, and the expanded terms. Category similarity measurements between the expanded terms and the site contents are determined in view of a trained similarity classifier, which has been trained from mined web site content associated with directory data. A confidence value providing an objective measure of relevance between the bid term(s) and the site contents is determined by evaluating the content, expanded, and category similarity measurements in view of a trained relevance classifier model.
    Type: Grant
    Filed: April 15, 2004
    Date of Patent: August 21, 2007
    Assignee: Microsoft Corporation
    Inventors: Benyu Zhang, Hua-Jun Zeng, Zheng Chen, Wei-Ying Ma, Li Li, Ying Li, Tarek Najm
  • Publication number: 20070192164
    Abstract: According to embodiments of the invention, an advertisement-generation system generates image-containing advertisements. The advertisement-generation system includes: at least one feature-selection guideline that specifies at least one recommended feature for image-containing advertisements based on advertiser inputs that specify at least one of advertisement-target-audience information, cost information, and advertiser-industry information; an image-clip library from which images are selected for inclusion in the image-containing advertisements; and at least one advertisement template that is based on the at least one feature-selection guideline; wherein the system automatically generates image-containing advertisements that contain one or more suggested colors that are automatically suggested based on one or more colors present on a web page that will host the image-containing advertisement.
    Type: Application
    Filed: February 15, 2006
    Publication date: August 16, 2007
    Applicant: Microsoft Corporation
    Inventors: Shuzhen Nong, Ying Li, Tarek Najm, Li Li, Zheng Chen, Hua-Jun Zeng, Benyu Zhang, Yin Li, Dean Carignan, Ying-Qing Xu
  • Publication number: 20070143176
    Abstract: Seed keywords are leveraged to provide expanded keywords that are then associated with relevant advertisers. Instances can also include locating potential advertisers based on the expanded keywords. Inverse lookup techniques are employed to determine which keywords are associated with an advertiser. Filtering can then be employed to eliminate inappropriate keywords for that advertiser. The keywords are then automatically revealed to the advertiser for consideration as relevant search terms for their advertisements. In this manner, revenue for a search engine and/or for an advertiser can be substantially enhanced through the automatic expansion of relevant search terms. Advertisers also benefit by having larger and more relevant search term selections automatically available to them, saving them both time and money.
    Type: Application
    Filed: December 15, 2005
    Publication date: June 21, 2007
    Applicant: Microsoft Corporation
    Inventors: Shuzhen Nong, Ying Li, Tarek Najm, Li Li, Hua-Jun Zeng, Zheng Chen, Benyu Zhang
  • Publication number: 20070061356
    Abstract: A summary system for evaluating summaries of documents and for generating summaries of documents based on normalized probabilities of portions of the documents is provided. A summarization system generates a summary by selecting sentences for the summary based on their normalized probabilities as derived from a document model. An evaluation system evaluates the effectiveness of a summary based on a normalized probability for the summary that is derived from a document model.
    Type: Application
    Filed: September 13, 2005
    Publication date: March 15, 2007
    Applicant: Microsoft Corporation
    Inventors: Benyu Zhang, Hua-Jun Zeng, Wei-Ying Ma, Zheng Chen, Hua Li
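
One way to picture sentence selection by normalized probability, as described in the abstract above, is the sketch below. The abstract does not say which document model or normalization is used; this example assumes a unigram model estimated from the document itself and normalizes each sentence's log probability by its length. The helper names and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (w.strip(".,;:!?").lower() for w in text.split())
    return [t for t in tokens if t]


def summarize(sentences, n=1):
    """Pick the n sentences with the highest length-normalized log probability."""
    words = [w for s in sentences for w in tokenize(s)]
    counts = Counter(words)  # assumed document model: unigram frequencies
    total = len(words)

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return float("-inf")
        # Average log probability per word, so long sentences are not penalized.
        return sum(math.log(counts[w] / total) for w in tokens) / len(tokens)

    chosen = set(sorted(sentences, key=score, reverse=True)[:n])
    # Return the selected sentences in their original document order.
    return [s for s in sentences if s in chosen]


document = [
    "Search engines rank documents.",
    "Ranking documents helps search.",
    "Lunch was good.",
]
summary = summarize(document, n=2)
```

The same score could in principle be applied to a finished summary as a whole, which is roughly what the evaluation system in the abstract does.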
  • Publication number: 20070055646
    Abstract: A system for augmenting click-through data with latent information present in the click-through data for use in generating search results that are better tailored to the information needs of a user submitting a query is provided. The augmentation system creates a three-dimensional matrix with the dimensions of users, queries, and documents. The augmentation system then performs a three-order singular value decomposition of the three-dimensional matrix to generate a three-dimensional core singular value matrix and a left singular matrix for each dimension. The augmentation system finally multiplies the three-dimensional core singular value matrix by the left singular matrices to generate an augmented three-dimensional matrix that explicitly contains the information that was latent in the un-augmented three-dimensional matrix.
    Type: Application
    Filed: September 8, 2005
    Publication date: March 8, 2007
    Applicant: Microsoft Corporation
    Inventors: Hua-Jun Zeng, Jian-Tao Sun, Wei-Ying Ma, Zheng Chen, Benyu Zhang