Patents by Inventor Ji-Rong Wen
Ji-Rong Wen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7529748
Abstract: A mechanism to classify source documents into one of two categories, either likely to contain desired information or unlikely to contain desired information. Generally some form of rules based classification in conjunction with deeper analysis using advanced techniques on difficult cases is utilized. The rules based classification is generally good for eliminating cases from further consideration and for identifying documents of interest based on generally discernable relationships between data or based on the presence or absence of data. The deeper analysis is used to uncover more complex relationships between data that may identify documents of interest. Portions of the process may use the entire document while other portions of the process may use only a portion of the document.
Type: Grant
Filed: March 15, 2006
Date of Patent: May 5, 2009
Inventors: Ji-Rong Wen, Yan-Feng Sun, Wei-Ying Ma, Zaiqing Nie, Renkuan Jiang
-
Patent number: 7523105
Abstract: Systems and methods for clustering Web queries are described. In one aspect, one or more of a same document and a plurality of similar documents selected by a user in response to a plurality of queries is identified. Responsive to this identification, a query cluster is generated. The query cluster indicates that the queries are similar independent of whether individual ones of the queries comprise similar composition with respect to other ones of the queries.
Type: Grant
Filed: February 23, 2006
Date of Patent: April 21, 2009
Assignee: Microsoft Corporation
Inventors: Ji-Rong Wen, Jian-Yun Nie, Mingjing Li, Hong-Jiang Zhang
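As a rough illustration of the idea in this abstract, the sketch below groups queries whenever users selected the same document for them, regardless of how the query text is worded. It is a minimal sketch only, not the patented method: the function name `cluster_queries`, the `(query, clicked_doc)` log format, and the union-find grouping are assumptions made for illustration, and the "similar documents" case is not covered.

```python
def cluster_queries(clicks):
    """Group queries that led users to select the same document.

    clicks: iterable of (query, clicked_doc) pairs (hypothetical log format).
    Returns a sorted list of clusters, each a sorted list of queries.
    """
    parent = {}

    def find(query):
        # Union-find lookup with path halving.
        parent.setdefault(query, query)
        while parent[query] != query:
            parent[query] = parent[parent[query]]
            query = parent[query]
        return query

    first_query_for_doc = {}
    for query, doc in clicks:
        find(query)  # register the query even if it never merges
        if doc in first_query_for_doc:
            # Same document selected for two queries: merge their clusters.
            parent[find(query)] = find(first_query_for_doc[doc])
        else:
            first_query_for_doc[doc] = query

    clusters = {}
    for query in parent:
        clusters.setdefault(find(query), set()).add(query)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    log = [
        ("cheap flights", "d1"),
        ("low cost airfare", "d1"),
        ("weather", "d3"),
    ]
    print(cluster_queries(log))
```

Note that "cheap flights" and "low cost airfare" share no words, yet they land in one cluster because the same document was selected for both.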
-
Patent number: 7490097
Abstract: In one aspect, this disclosure relates to a method and associated apparatus that allows a user to obtain a semi-structured data input and a workload input. An improved semi-structured data storage schema is selected for a relational schema in response to the semi-structured data input and the workload input. The semi-structured data is segmented based on the selected improved semi-structured data storage schema. In one aspect, the semi-structured data is XML data.
Type: Grant
Filed: February 20, 2003
Date of Patent: February 10, 2009
Assignee: Microsoft Corporation
Inventors: Ji-Rong Wen, Shihui Zheng, Hongjun Lu
-
Publication number: 20090024607
Abstract: A learning system for a search ranking function model may include a computer program that iteratively refines the model using new queries and associated documents from an unlabeled training set. The unlabeled training set may include a set of queries for which the associated documents have not been labeled as “relevant” or otherwise labeled. The new queries may be selected based on a similarity to and an accuracy of each neighbor from a labeled training set, such as a labeled validation set. Upon selection, the documents associated with the new queries may be labeled. The new queries and their associated documents may be accumulated into a labeled training set, and a refined model may be learned based on the augmented labeled training set. The model may be iteratively refined until it is determined that the model is adequate.
Type: Application
Filed: July 20, 2007
Publication date: January 22, 2009
Applicant: Microsoft Corporation
Inventors: Nan Sun, Qing Yu, Shuming Shi, Ji-Rong Wen
-
Patent number: 7480652
Abstract: A relevance system determines the relevance of a query term to a document based on spans within the document that contain the query term. The relevance system aggregates the relevance of the query terms into an overall relevance for the document. For each query term, the relevance system calculates a span relevance for each span that contains that query term. The relevance system then aggregates the span relevances for a query term into a query term relevance for that document. The relevance system may aggregate the query term relevances into a document relevance.
Type: Grant
Filed: October 26, 2005
Date of Patent: January 20, 2009
Assignee: Microsoft Corporation
Inventors: Ji-Rong Wen, Ruihua Song, Wei-Ying Ma
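The two-level aggregation described in this abstract (span relevances into a query term relevance, then query term relevances into a document relevance) might be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not give the scoring formula, so the term-density score and the plain sums used here are hypothetical stand-ins, as is the function name `document_relevance`.

```python
from collections import defaultdict


def document_relevance(spans, query_terms):
    """Aggregate span-level scores into term-level and document-level scores.

    spans: list of token lists, one per non-empty span of the document.
    Returns (per_term_relevance, overall_relevance).
    """
    per_term = defaultdict(float)
    for span in spans:
        for term in query_terms:
            count = span.count(term)
            if count:
                # Hypothetical span relevance: density of the term in the span.
                per_term[term] += count / len(span)
    # Hypothetical document relevance: sum of the query term relevances.
    overall = sum(per_term.values())
    return dict(per_term), overall


if __name__ == "__main__":
    spans = [["a", "b"], ["a", "a", "c", "d"]]
    print(document_relevance(spans, ["a", "z"]))
```

Terms that appear in no span contribute nothing and are omitted from the per-term result.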
-
Publication number: 20090012956
Abstract: This disclosure relates to performing a query for a search term of a database containing a plurality of structured documents. Those structured documents that do not include the search term are ferreted or filtered out during an initial search. Matched structured documents which are those structured documents that do contain the search term are evaluated by ranking the individual elements based on how well each individual element matches the search term, and indicating to the user the ranking of the individual elements wherein the individual elements can be accessed by the user.
Type: Application
Filed: September 16, 2008
Publication date: January 8, 2009
Applicant: MICROSOFT CORPORATION
Inventors: Ji-Rong Wen, Hang Cui
-
Publication number: 20080313165
Abstract: Aspects of the subject matter described herein relate to matching product information to products. In aspects, a product matching component receives product information. The product matching component normalizes the product information and obtains keywords from the product information. By querying a database of recognized products, the keywords are used to obtain a list of products that potentially match the product information. A confidence level is assigned to each of the potential matches in the list. A match may be returned for the highest matched product or for a selectable number of products whose confidence level(s) exceed a selectable threshold.
Type: Application
Filed: June 15, 2007
Publication date: December 18, 2008
Applicant: MICROSOFT CORPORATION
Inventors: Kai Wu, Daniel Takacs, Tong Yao, Jiyu Zhang, Hua Yang, Ji-Rong Wen, Jonathan R.M. Hart, Eric Anthony Reel
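The pipeline in this abstract (normalize, extract keywords, look up candidates, assign confidence, apply a threshold) can be illustrated with a small sketch. Everything specific here is an assumption for illustration: the in-memory list standing in for the product database, the keyword-overlap confidence measure, and the names `normalize` and `match_products` do not come from the publication.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, and split into keywords."""
    return re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split()


def match_products(product_info, catalog, threshold=0.5, limit=3):
    """Return up to `limit` (product, confidence) pairs above `threshold`.

    catalog: list of recognized product names (hypothetical stand-in for
    the database of recognized products).
    """
    keywords = set(normalize(product_info))
    scored = []
    for name in catalog:
        name_words = set(normalize(name))
        if not name_words:
            continue
        # Hypothetical confidence: share of the product name's words
        # that also appear among the extracted keywords.
        confidence = len(keywords & name_words) / len(name_words)
        if confidence > threshold:
            scored.append((name, confidence))
    # Highest confidence first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:limit]


if __name__ == "__main__":
    catalog = ["Acme Phone X2", "Acme Tablet Pro", "Zed Camera"]
    print(match_products("ACME phone x2, black 64GB", catalog))
```

Setting `limit=1` corresponds to returning only the highest matched product; a larger `limit` with a chosen `threshold` corresponds to the "selectable number of products" case.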
-
Publication number: 20080256068
Abstract: A method and system for identifying the importance of information areas of a display page. An importance system identifies information areas or blocks of a web page. A block of a web page represents an area of the web page that appears to relate to a similar topic. The importance system provides the characteristics or features of a block to an importance function that generates an indication of the importance of that block to its web page. The importance system “learns” the importance function by generating a model based on the features of blocks and the user-specified importance of those blocks. To learn the importance function, the importance system asks users to provide an indication of the importance of blocks of web pages in a collection of web pages.
Type: Application
Filed: April 10, 2008
Publication date: October 16, 2008
Applicant: Microsoft Corporation
Inventors: Wei-Ying Ma, Ji-Rong Wen, Ruihua Song, Haifeng Liu
-
Publication number: 20080250009
Abstract: A method and system for ranking pages of a search result based on the mobile readiness of the pages is provided. A mobile-readiness system receives an indication of pages that are to be ranked. The mobile-readiness system evaluates the mobile readiness for each of the pages. Mobile readiness indicates suitability of the page for a mobile device. The mobile-readiness system then ranks the pages based on the generated mobile readiness and some other criterion such as a relevance score or an importance score. The mobile-readiness system may train a classifier to classify pages based on their mobile readiness.
Type: Application
Filed: April 5, 2007
Publication date: October 9, 2008
Applicant: Microsoft Corporation
Inventors: Xing Xie, Jihwan Song, Ji-Rong Wen
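Ranking on mobile readiness together with another criterion could look like the sketch below. The abstract does not say how the two scores are combined, so the weighted sum, the `readiness_weight` parameter, and the dictionary keys `mobile_readiness` and `relevance` are all assumptions made for illustration; the scores themselves are taken as given rather than produced by a trained classifier.

```python
def rank_pages(pages, readiness_weight=0.5):
    """Order pages by a weighted blend of mobile readiness and relevance.

    pages: list of dicts with hypothetical keys "url", "mobile_readiness",
    and "relevance", each score in [0, 1].
    """
    def combined(page):
        # Hypothetical combination: a simple weighted sum of the two scores.
        return (readiness_weight * page["mobile_readiness"]
                + (1 - readiness_weight) * page["relevance"])

    return sorted(pages, key=combined, reverse=True)


if __name__ == "__main__":
    pages = [
        {"url": "a", "mobile_readiness": 0.2, "relevance": 0.9},
        {"url": "b", "mobile_readiness": 0.9, "relevance": 0.6},
    ]
    print([page["url"] for page in rank_pages(pages)])
```

With `readiness_weight=0` the ordering falls back to relevance alone, which makes the effect of the mobile-readiness signal easy to compare.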
-
Patent number: 7428538
Abstract: This disclosure relates to performing a query for a search term of a database containing a plurality of structured documents. Those structured documents that do not include the search term are ferreted or filtered out during an initial search. Matched structured documents which are those structured documents that do contain the search term are evaluated by ranking the individual elements based on how well each individual element matches the search term, and indicating to the user the ranking of the individual elements wherein the individual elements can be accessed by the user.
Type: Grant
Filed: March 23, 2006
Date of Patent: September 23, 2008
Assignee: Microsoft Corporation
Inventors: Ji-Rong Wen, Hang Cui
-
Patent number: 7428700
Abstract: Vision-based document segmentation identifies one or more portions of semantic content of a document. The one or more portions are identified by identifying a plurality of visual blocks in the document, and detecting one or more separators between the visual blocks of the plurality of visual blocks. A content structure for the document is constructed based at least in part on the plurality of visual blocks and the one or more separators, and the content structure identifies the one or more portions of semantic content of the document. The content structure obtained using the vision-based document segmentation can optionally be used during document retrieval.
Type: Grant
Filed: July 28, 2003
Date of Patent: September 23, 2008
Assignee: Microsoft Corporation
Inventors: Ji-Rong Wen, Shipeng Yu, Deng Cai, Wei-Ying Ma
-
Publication number: 20080215563
Abstract: A search method uses pseudo-anchor text associated with search objects to improve search performance. The pseudo-anchor text may be extracted in combination with an identifier of the search objects (such as a pseudo-URL) from a digital corpus such as a collection of documents. Pseudo-anchor texts for each object are preferably extracted from candidate anchor blocks using a machine learning based approach. The pseudo-anchor texts are made available for searching and used to help rank the objects in a search result to improve search performance. The method may be used in vertical search of objects such as published articles, products and images that lack explicit URL and anchor text information.
Type: Application
Filed: March 2, 2007
Publication date: September 4, 2008
Applicant: MICROSOFT CORPORATION
Inventors: Shuming Shi, Zaiqing Nie, Ji-Rong Wen, Mingjie Zhu, Fei Xing
-
Publication number: 20080215561
Abstract: A method and system for determining relevance of a document having text and images to a text string is provided. A scoring system identifies image text associated with an image of the document. The scoring system calculates an image score indicating relevance of the image text to the text string. The image score may be used in many applications, such as searching, summary generation, document classification, image search, and image classification.
Type: Application
Filed: March 1, 2007
Publication date: September 4, 2008
Applicant: Microsoft Corporation
Inventors: Qing Yu, Shuming Shi, Zhiwei Li, Ji-Rong Wen, Wei-Ying Ma
-
Publication number: 20080183673
Abstract: A method of creating an index of web queries is discussed. The method includes receiving a first query representative of one or more symbolic characters and assigning the first query to a first data structure. A first text string representative of the first query is created and assigned to a second data structure. The first and second data structures are stored on a tangible computer readable medium.
Type: Application
Filed: January 25, 2007
Publication date: July 31, 2008
Applicant: Microsoft Corporation
Inventors: Jianfeng Gao, Qi Yao, Ji-Rong Wen
-
Patent number: 7389444
Abstract: A method and system for ranking possible causes of a component exhibiting a certain behavior is provided. In one embodiment, a troubleshooting system ranks candidate configuration parameters that may be causing a software application to exhibit an undesired behavior using support information relating to problems resulting from the settings of configuration parameters. The support information may be collected from problem reports generated by product support services personnel when troubleshooting problems that users encounter with the application. The troubleshooting system ranks the candidate configuration parameters as likely causing the application to exhibit the undesired behavior based on analysis of the support information.
Type: Grant
Filed: July 27, 2004
Date of Patent: June 17, 2008
Assignee: Microsoft Corporation
Inventors: Wei-Ying Ma, Yi-Min Wang, Ji-Rong Wen
-
Patent number: 7383254
Abstract: A method and system for identifying object information of an information page is provided. An information extraction system identifies the object blocks of an information page. The extraction system classifies the object blocks into object types. Each object type has associated attributes that define a schema for the information of the object type. The extraction system identifies object elements within an object block that may represent an attribute value for the object. After the object elements are identified, the extraction system attempts to identify which object elements correspond to which attributes of the object type in a process referred to as “labeling.” The extraction system uses an algorithm to determine the confidence that a certain object element corresponds to a certain attribute. The extraction system then selects the set of labels with the highest confidence as being the labels for the object elements.
Type: Grant
Filed: April 13, 2005
Date of Patent: June 3, 2008
Assignee: Microsoft Corporation
Inventors: Ji-Rong Wen, Wei-Ying Ma, Zaiqing Nie
-
Patent number: 7363279
Abstract: A method and system for identifying the importance of information areas of a display page. An importance system identifies information areas or blocks of a web page. A block of a web page represents an area of the web page that appears to relate to a similar topic. The importance system provides the characteristics or features of a block to an importance function that generates an indication of the importance of that block to its web page. The importance system “learns” the importance function by generating a model based on the features of blocks and the user-specified importance of those blocks. To learn the importance function, the importance system asks users to provide an indication of the importance of blocks of web pages in a collection of web pages.
Type: Grant
Filed: April 29, 2004
Date of Patent: April 22, 2008
Assignee: Microsoft Corporation
Inventors: Wei-Ying Ma, Ji-Rong Wen, Ruihua Song, Haifeng Liu
-
Publication number: 20080065627
Abstract: A method and system for determining relatedness of images of pages based on link and page layout analysis. A link analysis system determines relatedness between images by first identifying blocks within web pages, and then analyzing the importance of the blocks to web pages, web pages to blocks, and images to blocks. Based on this analysis, the link analysis system determines the degree to which each image is related to each other image. The link analysis system may also use the relatedness of images to generate a ranking of the images. The link analysis system may also generate a vector representation of the images based on their relatedness and apply a clustering algorithm to the vector representations to identify clusters of related images.
Type: Application
Filed: November 6, 2007
Publication date: March 13, 2008
Applicant: Microsoft Corporation
Inventors: Wei-Ying Ma, Ji-Rong Wen, Xiaofei He, Deng Cai
-
Patent number: 7337092
Abstract: System events preceding occurrence of a problem are likely to be similar to events preceding occurrence of the same problem at other times or on other systems. Thus, the cause of a problem may be identified by comparing a trace of events preceding occurrence of the problem with previously diagnosed traces. Traces of events preceding occurrences of a problem arising from a known cause are reduced to a series of descriptive elements. These elements are aligned to correlate differently timed but otherwise similar traces of events, converted into symbolic representations, and archived. A trace of events leading to an undiagnosed problem similarly is converted to a symbolic representation. The representation of the undiagnosed trace is then compared to the archived representations to identify a similar archived representation. The cause of the similar archived representation is presented as a diagnosis of the problem.
Type: Grant
Filed: November 3, 2006
Date of Patent: February 26, 2008
Assignee: Microsoft Corporation
Inventors: Chun Yuan, Ji-Rong Wen, Wei-Ying Ma, Yi-Min Wang, Zheng Zhang
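The final comparison step in this abstract, matching an undiagnosed trace against archived, already diagnosed traces, might be sketched as below. This is a simplified illustration, not the patented procedure: traces are assumed to be already reduced to lists of symbolic event names, the alignment step is skipped, and Python's standard `difflib.SequenceMatcher` is swapped in as the similarity measure. The function name `diagnose` and the `(trace, cause)` archive format are hypothetical.

```python
from difflib import SequenceMatcher


def diagnose(trace, archive):
    """Return (cause, similarity) of the archived trace most like `trace`.

    trace: list of symbolic event names for the undiagnosed problem.
    archive: list of (archived_trace, known_cause) pairs.
    Returns (None, 0.0) when nothing in the archive is similar at all.
    """
    best_cause, best_score = None, 0.0
    for archived_trace, cause in archive:
        # Stand-in similarity measure: ratio of matching events in order.
        score = SequenceMatcher(None, trace, archived_trace).ratio()
        if score > best_score:
            best_cause, best_score = cause, score
    return best_cause, best_score


if __name__ == "__main__":
    archive = [
        (["open", "read", "fail"], "missing file"),
        (["connect", "timeout", "retry"], "network down"),
    ]
    print(diagnose(["open", "read", "read", "fail"], archive))
```

Because the comparison works on event order rather than timestamps, a trace with a repeated event still matches its archived counterpart closely.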
-
Publication number: 20080046441
Abstract: A method and system for generating wrappers for hierarchically organized documents by jointly optimizing template detection and wrapper generation is provided. A wrapper generation system generates a wrapper for documents with similar templates by identifying a cluster of document trees and generating a wrapper tree for the cluster. A wrapper tree defines the wrapper for documents that match the template of the cluster. The wrapper generation system clusters document trees by generating a wrapper tree for the cluster based on an initial document tree. The wrapper generation system then repeatedly determines whether any other document tree matches or nearly matches the wrapper tree for the cluster and, if so, adds the document tree to the cluster and adjusts the wrapper tree as appropriate so that all the document trees, including the newly added one, match the wrapper tree.
Type: Application
Filed: August 16, 2006
Publication date: February 21, 2008
Applicant: Microsoft Corporation
Inventors: Ji-Rong Wen, Min Wan, Ruihua Song, Wei-Ying Ma, Shuyi Zeng