Patents by Inventor Ji-Rong Wen

Ji-Rong Wen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7529748
    Abstract: A mechanism to classify source documents into one of two categories, either likely to contain desired information or unlikely to contain desired information. Generally, some form of rules-based classification is used in conjunction with deeper analysis that applies advanced techniques to difficult cases. The rules-based classification is generally good for eliminating cases from further consideration and for identifying documents of interest based on generally discernible relationships between data or based on the presence or absence of data. The deeper analysis is used to uncover more complex relationships between data that may identify documents of interest. Portions of the process may use the entire document while other portions of the process may use only a portion of the document.
    Type: Grant
    Filed: March 15, 2006
    Date of Patent: May 5, 2009
    Inventors: Ji-Rong Wen, Yan-Feng Sun, Wei-Ying Ma, Zaiqing Nie, Renkuan Jiang
  • Patent number: 7523105
    Abstract: Systems and methods for clustering Web queries are described. In one aspect, one or more of a same document and a plurality of similar documents selected by a user in response to a plurality of queries is identified. Responsive to this identification, a query cluster is generated. The query cluster indicates that the queries are similar independent of whether individual ones of the queries comprise similar composition with respect to other ones of the queries.
    Type: Grant
    Filed: February 23, 2006
    Date of Patent: April 21, 2009
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Jian-Yun Nie, Mingjing Li, Hong-Jiang Zhang
  • Patent number: 7490097
    Abstract: In one aspect, this disclosure relates to a method and associated apparatus that allows a user to obtain a semi-structured data input and a workload input. An improved semi-structured data storage schema is selected for a relational schema in response to the semi-structured data input and the workload input. The semi-structured data is segmented based on the selected improved semi-structured data storage schema. In one aspect, the semi-structured data is XML data.
    Type: Grant
    Filed: February 20, 2003
    Date of Patent: February 10, 2009
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Shihui Zheng, Hongjun Lu
  • Publication number: 20090024607
    Abstract: A learning system for a search ranking function model may include a computer program that iteratively refines the model using new queries and associated documents from an unlabeled training set. The unlabeled training set may include a set of queries for which the associated documents have not been labeled as “relevant” or otherwise labeled. The new queries may be selected based on a similarity to and an accuracy of each neighbor from a labeled training set, such as a labeled validation set. Upon selection, the documents associated with the new queries may be labeled. The new queries and their associated documents may be accumulated into a labeled training set, and a refined model may be learned based on the augmented labeled training set. The model may be iteratively refined until it is determined that the model is adequate.
    Type: Application
    Filed: July 20, 2007
    Publication date: January 22, 2009
    Applicant: Microsoft Corporation
    Inventors: Nan Sun, Qing Yu, Shuming Shi, Ji-Rong Wen
  • Patent number: 7480652
    Abstract: A relevance system determines the relevance of a query term to a document based on spans within the document that contain the query term. The relevance system aggregates the relevance of the query terms into an overall relevance for the document. For each query term, the relevance system calculates a span relevance for each span that contains that query term. The relevance system then aggregates the span relevances for a query term into a query term relevance for that document. The relevance system may aggregate the query term relevances into a document relevance.
    Type: Grant
    Filed: October 26, 2005
    Date of Patent: January 20, 2009
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Ruihua Song, Wei-Ying Ma
  • Publication number: 20090012956
    Abstract: This disclosure relates to performing a query for a search term of a database containing a plurality of structured documents. Those structured documents that do not include the search term are ferreted or filtered out during an initial search. Matched structured documents which are those structured documents that do contain the search term are evaluated by ranking the individual elements based on how well each individual element matches the search term, and indicating to the user the ranking of the individual elements wherein the individual elements can be accessed by the user.
    Type: Application
    Filed: September 16, 2008
    Publication date: January 8, 2009
    Applicant: Microsoft Corporation
    Inventors: Ji-Rong Wen, Hang Cui
  • Publication number: 20080313165
    Abstract: Aspects of the subject matter described herein relate to matching product information to products. In aspects, a product matching component receives product information. The product matching component normalizes the product information and obtains keywords from the product information. By querying a database of recognized products, the keywords are used to obtain a list of products that potentially match the product information. A confidence level is assigned to each of the potential matches in the list. A match may be returned for the highest matched product or for a selectable number of products whose confidence level(s) exceed a selectable threshold.
    Type: Application
    Filed: June 15, 2007
    Publication date: December 18, 2008
    Applicant: Microsoft Corporation
    Inventors: Kai Wu, Daniel Takacs, Tong Yao, Jiyu Zhang, Hua Yang, Ji-Rong Wen, Jonathan R.M. Hart, Eric Anthony Reel
  • Publication number: 20080256068
    Abstract: A method and system for identifying the importance of information areas of a display page. An importance system identifies information areas or blocks of a web page. A block of a web page represents an area of the web page that appears to relate to a similar topic. The importance system provides the characteristics or features of a block to an importance function that generates an indication of the importance of that block to its web page. The importance system “learns” the importance function by generating a model based on the features of blocks and the user-specified importance of those blocks. To learn the importance function, the importance system asks users to provide an indication of the importance of blocks of web pages in a collection of web pages.
    Type: Application
    Filed: April 10, 2008
    Publication date: October 16, 2008
    Applicant: Microsoft Corporation
    Inventors: Wei-Ying Ma, Ji-Rong Wen, Ruihua Song, Haifeng Liu
  • Publication number: 20080250009
    Abstract: A method and system for ranking pages of a search result based on the mobile readiness of the pages is provided. A mobile-readiness system receives an indication of pages that are to be ranked. The mobile-readiness system evaluates the mobile readiness for each of the pages. Mobile readiness indicates suitability of the page for a mobile device. The mobile readiness system then ranks the pages based on the generated mobile readiness and some other criterion such as a relevance score or an importance score. The mobile-readiness system may train a classifier to classify pages based on their mobile readiness.
    Type: Application
    Filed: April 5, 2007
    Publication date: October 9, 2008
    Applicant: Microsoft Corporation
    Inventors: Xing Xie, Jihwan Song, Ji-Rong Wen
  • Patent number: 7428538
    Abstract: This disclosure relates to performing a query for a search term of a database containing a plurality of structured documents. Those structured documents that do not include the search term are ferreted or filtered out during an initial search. Matched structured documents which are those structured documents that do contain the search term are evaluated by ranking the individual elements based on how well each individual element matches the search term, and indicating to the user the ranking of the individual elements wherein the individual elements can be accessed by the user.
    Type: Grant
    Filed: March 23, 2006
    Date of Patent: September 23, 2008
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Hang Cui
  • Patent number: 7428700
    Abstract: Vision-based document segmentation identifies one or more portions of semantic content of a document. The one or more portions are identified by identifying a plurality of visual blocks in the document, and detecting one or more separators between the visual blocks of the plurality of visual blocks. A content structure for the document is constructed based at least in part on the plurality of visual blocks and the one or more separators, and the content structure identifies the one or more portions of semantic content of the document. The content structure obtained using the vision-based document segmentation can optionally be used during document retrieval.
    Type: Grant
    Filed: July 28, 2003
    Date of Patent: September 23, 2008
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Shipeng Yu, Deng Cai, Wei-Ying Ma
  • Publication number: 20080215563
    Abstract: A search method uses pseudo-anchor text associated with search objects to improve search performance. The pseudo-anchor text may be extracted in combination with an identifier of the search objects (such as a pseudo-URL) from a digital corpus such as a collection of documents. Pseudo-anchor texts for each object are preferably extracted from candidate anchor blocks using a machine learning based approach. The pseudo-anchor texts are made available for searching and used to help rank the objects in a search result to improve search performance. The method may be used in vertical search of objects such as published articles, products, and images that lack explicit URL and anchor text information.
    Type: Application
    Filed: March 2, 2007
    Publication date: September 4, 2008
    Applicant: Microsoft Corporation
    Inventors: Shuming Shi, Zaiqing Nie, Ji-Rong Wen, Mingjie Zhu, Fei Xing
  • Publication number: 20080215561
    Abstract: A method and system for determining relevance of a document having text and images to a text string is provided. A scoring system identifies image text associated with an image of the document. The scoring system calculates an image score indicating relevance of the image text to the text string. The image score may be used in many applications, such as searching, summary generation, document classification, image search, and image classification.
    Type: Application
    Filed: March 1, 2007
    Publication date: September 4, 2008
    Applicant: Microsoft Corporation
    Inventors: Qing Yu, Shuming Shi, Zhiwei Li, Ji-Rong Wen, Wei-Ying Ma
  • Publication number: 20080183673
    Abstract: A method of creating an index of web queries is discussed. The method includes receiving a first query representative of one or more symbolic characters and assigning the first query to a first data structure. A first text string representative of the first query is created and assigned to a second data structure. The first and second data structures are stored on a tangible computer readable medium.
    Type: Application
    Filed: January 25, 2007
    Publication date: July 31, 2008
    Applicant: Microsoft Corporation
    Inventors: Jianfeng Gao, Qi Yao, Ji-Rong Wen
  • Patent number: 7389444
    Abstract: A method and system for ranking possible causes of a component exhibiting a certain behavior is provided. In one embodiment, a troubleshooting system ranks candidate configuration parameters that may be causing a software application to exhibit an undesired behavior using support information relating to problems resulting from the settings of configuration parameters. The support information may be collected from problem reports generated by product support services personnel when troubleshooting problems that users encounter with the application. The troubleshooting system ranks the candidate configuration parameters as likely causing the application to exhibit the undesired behavior based on analysis of the support information.
    Type: Grant
    Filed: July 27, 2004
    Date of Patent: June 17, 2008
    Assignee: Microsoft Corporation
    Inventors: Wei-Ying Ma, Yi-Min Wang, Ji-Rong Wen
  • Patent number: 7383254
    Abstract: A method and system for identifying object information of an information page is provided. An information extraction system identifies the object blocks of an information page. The extraction system classifies the object blocks into object types. Each object type has associated attributes that define a schema for the information of the object type. The extraction system identifies object elements within an object block that may represent an attribute value for the object. After the object elements are identified, the extraction system attempts to identify which object elements correspond to which attributes of the object type in a process referred to as “labeling.” The extraction system uses an algorithm to determine the confidence that a certain object element corresponds to a certain attribute. The extraction system then selects the set of labels with the highest confidence as being the labels for the object elements.
    Type: Grant
    Filed: April 13, 2005
    Date of Patent: June 3, 2008
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Wei-Ying Ma, Zaiqing Nie
  • Patent number: 7363279
    Abstract: A method and system for identifying the importance of information areas of a display page. An importance system identifies information areas or blocks of a web page. A block of a web page represents an area of the web page that appears to relate to a similar topic. The importance system provides the characteristics or features of a block to an importance function that generates an indication of the importance of that block to its web page. The importance system “learns” the importance function by generating a model based on the features of blocks and the user-specified importance of those blocks. To learn the importance function, the importance system asks users to provide an indication of the importance of blocks of web pages in a collection of web pages.
    Type: Grant
    Filed: April 29, 2004
    Date of Patent: April 22, 2008
    Assignee: Microsoft Corporation
    Inventors: Wei-Ying Ma, Ji-Rong Wen, Ruihua Song, Haifeng Liu
  • Publication number: 20080065627
    Abstract: A method and system for determining relatedness of images of pages based on link and page layout analysis. A link analysis system determines relatedness between images by first identifying blocks within web pages, and then analyzing the importance of the blocks to web pages, web pages to blocks, and images to blocks. Based on this analysis, the link analysis system determines the degree to which each image is related to each other image. The link analysis system may also use the relatedness of images to generate a ranking of the images. The link analysis system may also generate a vector representation of the images based on their relatedness and apply a clustering algorithm to the vector representations to identify clusters of related images.
    Type: Application
    Filed: November 6, 2007
    Publication date: March 13, 2008
    Applicant: Microsoft Corporation
    Inventors: Wei-Ying Ma, Ji-Rong Wen, Xiaofei He, Deng Cai
  • Patent number: 7337092
    Abstract: System events preceding occurrence of a problem are likely to be similar to events preceding occurrence of the same problem at other times or on other systems. Thus, the cause of a problem may be identified by comparing a trace of events preceding occurrence of the problem with previously diagnosed traces. Traces of events preceding occurrences of a problem arising from a known cause are reduced to a series of descriptive elements. These elements are aligned to correlate differently timed but otherwise similar traces of events, converted into symbolic representations, and archived. A trace of events leading to an undiagnosed problem similarly is converted to a symbolic representation. The representation of the undiagnosed trace is then compared to the archived representations to identify a similar archived representation. The cause of the similar archived representation is presented as a diagnosis of the problem.
    Type: Grant
    Filed: November 3, 2006
    Date of Patent: February 26, 2008
    Assignee: Microsoft Corporation
    Inventors: Chun Yuan, Ji-Rong Wen, Wei-Ying Ma, Yi-Min Wang, Zheng Zhang
  • Publication number: 20080046441
    Abstract: A method and system for generating wrappers for hierarchically organized documents by jointly optimizing template detection and wrapper generation is provided. A wrapper generation system generates a wrapper for documents with similar templates by identifying a cluster of document trees and generating a wrapper tree for the cluster. A wrapper tree defines the wrapper for documents that match the template of the cluster. The wrapper generation system clusters document trees by generating a wrapper tree for the cluster based on an initial document tree. The wrapper generation system then repeatedly determines whether any other document tree matches or nearly matches the wrapper tree for the cluster and, if so, adds the document tree to the cluster and adjusts the wrapper tree as appropriate so that all the document trees, including the newly added one, match the wrapper tree.
    Type: Application
    Filed: August 16, 2006
    Publication date: February 21, 2008
    Applicant: Microsoft Corporation
    Inventors: Ji-Rong Wen, Min Wan, Ruihua Song, Wei-Ying Ma, Shuyi Zeng
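
Illustrative sketch (not taken from any of the patents listed): the abstract of patent 7523105 above describes grouping Web queries into a cluster when users selected the same document in response to them, regardless of whether the queries share wording. A minimal way to picture that idea, assuming a hypothetical click log of (query, clicked document ID) pairs and using a union-find structure chosen here purely for illustration, might look like the following. It handles only the "same document" case, not the "similar documents" case, and the function name and data layout are assumptions.

```python
from collections import defaultdict


def cluster_queries_by_clicks(click_log):
    """Group queries whose users clicked at least one common document.

    click_log: iterable of (query, clicked_document_id) pairs
               (a hypothetical log format, assumed for this sketch).
    Returns a list of sets of queries, in a deterministic order.
    """
    parent = {}  # union-find forest over query strings

    def find(q):
        parent.setdefault(q, q)
        while parent[q] != q:
            parent[q] = parent[parent[q]]  # path halving
            q = parent[q]
        return q

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember one representative query per document; any later query
    # that clicked the same document is merged into its cluster.
    first_query_for_doc = {}
    for query, doc in click_log:
        find(query)  # register the query even if it never merges
        if doc in first_query_for_doc:
            union(first_query_for_doc[doc], query)
        else:
            first_query_for_doc[doc] = query

    clusters = defaultdict(set)
    for q in parent:
        clusters[find(q)].add(q)
    return sorted(clusters.values(), key=lambda c: sorted(c))


if __name__ == "__main__":
    log = [
        ("cheap flights", "d1"),
        ("low cost airfare", "d1"),
        ("python tutorial", "d2"),
        ("learn python", "d2"),
        ("weather", "d3"),
    ]
    for cluster in cluster_queries_by_clicks(log):
        print(sorted(cluster))
```

In this toy log, "cheap flights" and "low cost airfare" share no words but land in one cluster because both led to a click on the same document, which is the behavior the abstract describes.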