Patents by Inventor Rong Wen

Rong Wen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7428538
    Abstract: This disclosure relates to performing a query for a search term of a database containing a plurality of structured documents. Those structured documents that do not include the search term are ferreted or filtered out during an initial search. Matched structured documents, which are those structured documents that do contain the search term, are evaluated by ranking the individual elements based on how well each individual element matches the search term, and indicating to the user the ranking of the individual elements, wherein the individual elements can be accessed by the user.
    Type: Grant
    Filed: March 23, 2006
    Date of Patent: September 23, 2008
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Hang Cui
  • Patent number: 7428700
    Abstract: Vision-based document segmentation identifies one or more portions of semantic content of a document. The one or more portions are identified by identifying a plurality of visual blocks in the document, and detecting one or more separators between the visual blocks of the plurality of visual blocks. A content structure for the document is constructed based at least in part on the plurality of visual blocks and the one or more separators, and the content structure identifies the one or more portions of semantic content of the document. The content structure obtained using the vision-based document segmentation can optionally be used during document retrieval.
    Type: Grant
    Filed: July 28, 2003
    Date of Patent: September 23, 2008
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Shipeng Yu, Deng Cai, Wei-Ying Ma
  • Publication number: 20080215563
    Abstract: A search method uses pseudo-anchor text associated with search objects to improve search performance. The pseudo-anchor text may be extracted in combination with an identifier of the search objects (such as a pseudo-URL) from a digital corpus such as a collection of documents. Pseudo-anchor texts for each object are preferably extracted from candidate anchor blocks using a machine learning based approach. The pseudo-anchor texts are made available for searching and used to help rank the objects in a search result to improve search performance. The method may be used in vertical search of objects such as published articles, products and images that lack explicit URL and anchor text information.
    Type: Application
    Filed: March 2, 2007
    Publication date: September 4, 2008
    Applicant: Microsoft Corporation
    Inventors: Shuming Shi, Zaiqing Nie, Ji-Rong Wen, Mingjie Zhu, Fei Xing
  • Publication number: 20080215561
    Abstract: A method and system for determining relevance of a document having text and images to a text string is provided. A scoring system identifies image text associated with an image of the document. The scoring system calculates an image score indicating relevance of the image text to the text string. The image score may be used in many applications, such as searching, summary generation, document classification, image search, and image classification.
    Type: Application
    Filed: March 1, 2007
    Publication date: September 4, 2008
    Applicant: Microsoft Corporation
    Inventors: Qing Yu, Shuming Shi, Zhiwei Li, Ji-Rong Wen, Wei-Ying Ma
  • Publication number: 20080183673
    Abstract: A method of creating an index of web queries is discussed. The method includes receiving a first query representative of one or more symbolic characters and assigning the first query to a first data structure. A first text string representative of the first query is created and assigned to a second data structure. The first and second data structures are stored on a tangible computer readable medium.
    Type: Application
    Filed: January 25, 2007
    Publication date: July 31, 2008
    Applicant: Microsoft Corporation
    Inventors: Jianfeng Gao, Qi Yao, Ji-Rong Wen
  • Patent number: 7389444
    Abstract: A method and system for ranking possible causes of a component exhibiting a certain behavior is provided. In one embodiment, a troubleshooting system ranks candidate configuration parameters that may be causing a software application to exhibit an undesired behavior using support information relating to problems resulting from the settings of configuration parameters. The support information may be collected from problem reports generated by product support services personnel when troubleshooting problems that users encounter with the application. The troubleshooting system ranks the candidate configuration parameters as likely causing the application to exhibit the undesired behavior based on analysis of the support information.
    Type: Grant
    Filed: July 27, 2004
    Date of Patent: June 17, 2008
    Assignee: Microsoft Corporation
    Inventors: Wei-Ying Ma, Yi-Min Wang, Ji-Rong Wen
  • Patent number: 7383254
    Abstract: A method and system for identifying object information of an information page is provided. An information extraction system identifies the object blocks of an information page. The extraction system classifies the object blocks into object types. Each object type has associated attributes that define a schema for the information of the object type. The extraction system identifies object elements within an object block that may represent an attribute value for the object. After the object elements are identified, the extraction system attempts to identify which object elements correspond to which attributes of the object type in a process referred to as “labeling.” The extraction system uses an algorithm to determine the confidence that a certain object element corresponds to a certain attribute. The extraction system then selects the set of labels with the highest confidence as being the labels for the object elements.
    Type: Grant
    Filed: April 13, 2005
    Date of Patent: June 3, 2008
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Wei-Ying Ma, Zaiqing Nie
  • Patent number: 7363279
    Abstract: A method and system for identifying the importance of information areas of a display page is provided. An importance system identifies information areas or blocks of a web page. A block of a web page represents an area of the web page that appears to relate to a similar topic. The importance system provides the characteristics or features of a block to an importance function that generates an indication of the importance of that block to its web page. The importance system “learns” the importance function by generating a model based on the features of blocks and the user-specified importance of those blocks. To learn the importance function, the importance system asks users to provide an indication of the importance of blocks of web pages in a collection of web pages.
    Type: Grant
    Filed: April 29, 2004
    Date of Patent: April 22, 2008
    Assignee: Microsoft Corporation
    Inventors: Wei-Ying Ma, Ji-Rong Wen, Ruihua Song, Haifeng Liu
  • Publication number: 20080065627
    Abstract: A method and system for determining relatedness of images of pages based on link and page layout analysis is provided. A link analysis system determines relatedness between images by first identifying blocks within web pages, and then analyzing the importance of the blocks to web pages, web pages to blocks, and images to blocks. Based on this analysis, the link analysis system determines the degree to which each image is related to each other image. The link analysis system may also use the relatedness of images to generate a ranking of the images. The link analysis system may also generate a vector representation of the images based on their relatedness and apply a clustering algorithm to the vector representations to identify clusters of related images.
    Type: Application
    Filed: November 6, 2007
    Publication date: March 13, 2008
    Applicant: Microsoft Corporation
    Inventors: Wei-Ying Ma, Ji-Rong Wen, Xiaofei He, Deng Cai
  • Patent number: 7337092
    Abstract: System events preceding occurrence of a problem are likely to be similar to events preceding occurrence of the same problem at other times or on other systems. Thus, the cause of a problem may be identified by comparing a trace of events preceding occurrence of the problem with previously diagnosed traces. Traces of events preceding occurrences of a problem arising from a known cause are reduced to a series of descriptive elements. These elements are aligned to correlate differently timed but otherwise similar traces of events, converted into symbolic representations, and archived. A trace of events leading to an undiagnosed problem is similarly converted to a symbolic representation. The representation of the undiagnosed trace is then compared to the archived representations to identify a similar archived representation. The cause of the similar archived representation is presented as a diagnosis of the problem.
    Type: Grant
    Filed: November 3, 2006
    Date of Patent: February 26, 2008
    Assignee: Microsoft Corporation
    Inventors: Chun Yuan, Ji-Rong Wen, Wei-Ying Ma, Yi-Min Wang, Zheng Zhang
  • Publication number: 20080046441
    Abstract: A method and system for generating wrappers for hierarchically organized documents by jointly optimizing template detection and wrapper generation is provided. A wrapper generation system generates a wrapper for documents with similar templates by identifying a cluster of document trees and generating a wrapper tree for the cluster. A wrapper tree defines the wrapper for documents that match the template of the cluster. The wrapper generation system clusters document trees by generating a wrapper tree for the cluster based on an initial document tree. The wrapper generation system then repeatedly determines whether any other document tree matches or nearly matches the wrapper tree for the cluster and, if so, adds the document tree to the cluster and adjusts the wrapper tree as appropriate so that all the document trees, including the newly added one, match the wrapper tree.
    Type: Application
    Filed: August 16, 2006
    Publication date: February 21, 2008
    Applicant: Microsoft Corporation
    Inventors: Ji-Rong Wen, Min Wan, Ruihua Song, Wei-Ying Ma, Shuyi Zeng
  • Publication number: 20080027910
    Abstract: A method and system is provided for determining relevance of an object to a term based on a language model. The relevance system provides records extracted from web pages that relate to the object. To determine the relevance of the object to a term, the relevance system first determines, for each record of the object, a probability of generating that term using a language model of the record of that object. The relevance system then calculates the relevance of the object to the term by combining the probabilities. The relevance system may also weight the probabilities based on the accuracy or reliability of the extracted information for each data source.
    Type: Application
    Filed: July 25, 2006
    Publication date: January 31, 2008
    Applicant: Microsoft Corporation
    Inventors: Ji-Rong Wen, Shuming Shi, Wei-Ying Ma, Yunxiao Ma, Zaiqing Nie
  • Publication number: 20080027969
    Abstract: A method and system for labeling object information of an information page is provided. A labeling system identifies an object record of an information page based on the labeling of object elements within an object record and labels object elements based on the identification of an object record that contains the object elements. To identify the records and label the elements, the labeling system generates a hierarchical representation of blocks of an information page. The labeling system identifies records and elements within the records by propagating probability-related information of record labels and element labels through the hierarchy of the blocks. The labeling system generates a feature vector for each block to represent the block and calculates a probability of a label for a block being correct based on a score derived from the feature vectors associated with related blocks. The labeling system searches for the labeling of records and elements that has the highest probability of being correct.
    Type: Application
    Filed: July 31, 2006
    Publication date: January 31, 2008
    Applicant: Microsoft Corporation
    Inventors: Ji-Rong Wen, Wei-Ying Ma, Zaiqing Nie, Jun Zhu
  • Patent number: 7293007
    Abstract: A method and system for determining relatedness of images of pages based on link and page layout analysis is provided. A link analysis system determines relatedness between images by first identifying blocks within web pages, and then analyzing the importance of the blocks to web pages, web pages to blocks, and images to blocks. Based on this analysis, the link analysis system determines the degree to which each image is related to each other image. The link analysis system may also use the relatedness of images to generate a ranking of the images. The link analysis system may also generate a vector representation of the images based on their relatedness and apply a clustering algorithm to the vector representations to identify clusters of related images.
    Type: Grant
    Filed: April 29, 2004
    Date of Patent: November 6, 2007
    Assignee: Microsoft Corporation
    Inventors: Wei-Ying Ma, Ji-Rong Wen, Xiaofei He, Deng Cai
  • Patent number: 7287025
    Abstract: Systems and methods for query expansion are described. In one aspect, new terms are extracted from a newly submitted query. Terms to expand the new terms are identified to a relevant document list. The expansion terms are identified based at least in part on the new terms and probabilistic correlations from information in a query log. The query log information includes one or more query terms and a corresponding set of document identifiers (IDs). The query terms were previously submitted to a search engine. The document IDs represent each document selected from a list generated by the search engine in response to searching for information relevant to corresponding ones of the query terms.
    Type: Grant
    Filed: February 12, 2003
    Date of Patent: October 23, 2007
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Hang Cui, Wei-Ying Ma
  • Patent number: 7249135
    Abstract: A method and system for identifying schemas of web databases is provided. A schema matching system generates a mapping between an interface schema and a result schema of a web database, which is used to represent the underlying database schema. The schema matching system also generates a mapping of the interface attributes and the result attributes of the web database to global attributes of a global schema whose semantics are known. Using these mappings, a search engine service can formulate queries using the global attributes, map those queries to the corresponding interface attributes, submit the query, and retrieve the values from the result attributes that correspond to the desired global attributes.
    Type: Grant
    Filed: May 14, 2004
    Date of Patent: July 24, 2007
    Assignee: Microsoft Corporation
    Inventors: Wei-Ying Ma, Ji-Rong Wen
  • Publication number: 20070162408
    Abstract: A content object indexing process includes creating a content object knowledge index, calculating a description vector of a target content object, and indexing the target content object by searching for the description vector in the content object knowledge database. It may be difficult to search for an exact content object, such as a music file or academic researcher, because a conventional search index may not include related hierarchical information. A content object indexing process may add hierarchical information taken from a content object knowledge index and incorporate the hierarchical information into the index entry for a specific content object. An application of such a content object indexing process may be a World Wide Web search engine.
    Type: Application
    Filed: January 11, 2006
    Publication date: July 12, 2007
    Applicant: Microsoft Corporation
    Inventors: Wei-Ying Ma, Lie Lu, Ji-Rong Wen, Zhiwei Li, Zaiqing Nie, Hsiao-Wuen Hon
  • Publication number: 20070150486
    Abstract: A labeling system uses a two-dimensional conditional random fields technique to label the object elements. The labeling system represents transition features and state features that depend on object elements that are adjacent in two dimensions. The labeling system represents the grid as a graph of vertices and edges with a vertex representing an object element and an edge representing a relationship between the object elements. The labeling system represents each diagonal of the graph as a sequence of states. The labeling system selects a labeling for the vertices of the diagonals that has the highest probability based on transition probabilities between vertices of adjacent diagonals and on the state probabilities of a position within a diagonal.
    Type: Application
    Filed: December 14, 2005
    Publication date: June 28, 2007
    Applicant: Microsoft Corporation
    Inventors: Ji-Rong Wen, Wei-Ying Ma, Zaiqing Nie, Jun Zhu
  • Publication number: 20070136457
    Abstract: Features extracted from network browser pages and/or network search queries are leveraged to facilitate detecting a user's browsing and/or searching intent. Machine learning classifiers constructed from these features automatically detect a user's online commercial intention (OCI). A user's intention can be commercial or non-commercial, with commercial intentions being informational or transactional. In one instance, an OCI ranking mechanism is employed with a search engine to facilitate providing search results that are ranked according to a user's intention. This also provides a means to match purchasing advertisements with potential customers who are more than likely ready to make a purchase (transactional stage). Additionally, informational advertisements can be matched to users who are researching a potential purchase (informational stage).
    Type: Application
    Filed: December 14, 2005
    Publication date: June 14, 2007
    Applicant: Microsoft Corporation
    Inventors: Honghua Dai, Lee Wang, Ying Li, Zaiqing Nie, Ji-Rong Wen, Lingzhi Zhao
  • Publication number: 20070112756
    Abstract: A mechanism is provided to classify source documents into one of two categories: either likely to contain desired information or unlikely to contain desired information. Generally, some form of rules-based classification is utilized in conjunction with deeper analysis using advanced techniques on difficult cases. The rules-based classification is generally good for eliminating cases from further consideration and for identifying documents of interest based on generally discernible relationships between data or based on the presence or absence of data. The deeper analysis is used to uncover more complex relationships between data that may identify documents of interest. Portions of the process may use the entire document while other portions of the process may use only a portion of the document.
    Type: Application
    Filed: March 15, 2006
    Publication date: May 17, 2007
    Applicant: Microsoft Corporation
    Inventors: Ji-Rong Wen, Yan-Feng Sun, Wei-Ying Ma, Zaiqing Nie, Renkuan Jiang
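
The two-stage classification described in the last abstract above (publication 20070112756) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: the function names, the specific rules, the cue words, the 200-character portion, and the 0.5 threshold are all hypothetical. The cue-word score merely stands in for whatever "advanced techniques" the deeper analysis actually uses.

```python
from typing import Optional

# Hypothetical cue words standing in for the "deeper analysis" features.
CUE_WORDS = ("buy", "price", "shipping", "cart")


def rule_stage(doc: str) -> Optional[bool]:
    """Cheap rules over the entire document.

    Returns True (likely), False (unlikely), or None when the rules
    cannot decide and the document counts as a difficult case.
    """
    text = doc.lower()
    if not text.strip():
        return False  # absence of data: eliminate from further consideration
    if "price:" in text and "sku" in text:
        return True   # presence of easily discernible data
    return None


def deep_stage(doc: str, threshold: float = 0.5) -> bool:
    """Stand-in for deeper analysis, run only on difficult cases.

    Uses only a portion of the document (its first 200 characters) and
    scores it by the fraction of cue words that appear.
    """
    portion = doc[:200].lower()
    score = sum(cue in portion for cue in CUE_WORDS) / len(CUE_WORDS)
    return score >= threshold


def classify(doc: str) -> bool:
    """True if the document is judged likely to contain desired information."""
    decision = rule_stage(doc)
    return deep_stage(doc) if decision is None else decision


if __name__ == "__main__":
    for sample in ("", "Price: $5, SKU 123", "Add to cart, free shipping",
                   "A history of lighthouses"):
        print(repr(sample), "->", classify(sample))
```

The design point the sketch tries to capture is that the rules settle the easy cases in either direction, so the more expensive analysis runs only on documents the rules leave undecided.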