Patents by Inventor Meghana Kshirsagar

Meghana Kshirsagar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20100257440
    Abstract: Techniques for high precision web extraction using site knowledge are provided. Portions of repeating text are identified in unlabeled web pages from a particular web site. Based on the portions of repeating text, the unlabeled web pages are partitioned into a set of segments. Multiple labels are assigned to respectively corresponding multiple attributes in the set of segments, where assigning the multiple labels comprises applying a classification model to each separate segment in the set of segments. First one or more labels are identified that were erroneously assigned to one or more attributes in the set of segments. Second one or more correct labels for the one or more attributes are determined. The first one or more labels in the set of segments are corrected by assigning the second one or more labels to the one or more attributes.
    Type: Application
    Filed: April 1, 2009
    Publication date: October 7, 2010
    Inventors: Meghana Kshirsagar, Rajeev Rastogi, Sandeepkumar Bhuramal Satpal, Srinivasan H. Sengamedu, Venu Satuluri
  • Publication number: 20100223214
    Abstract: A method and apparatus for automatically extracting information from a large number of documents through applying machine learning techniques and exploiting structural similarities among documents. A machine learning model is trained to have at least 50% accuracy. The trained machine learning model is used to identify information attributes in a sample of pages from a cluster of structurally similar documents. A structure-specific model of the cluster is created by compiling a list of top-K locations for each attribute identified by the trained machine learning model in the sample. These top-K lists are used to extract information from the pages of the cluster from which the sample of pages was taken.
    Type: Application
    Filed: February 27, 2009
    Publication date: September 2, 2010
    Inventors: Alok S. Kirpal, Sandeepkumar Bhuramal Satpal, Meghana Kshirsagar, Srinivasan H. Sengamedu
  • Publication number: 20090216739
    Abstract: Methods and apparatus are described for use with information extraction techniques based on sequential models. Additional statistics are maintained during inference and employed to boost the accuracy of the extraction algorithm and mitigate the effects of training bias.
    Type: Application
    Filed: February 22, 2008
    Publication date: August 27, 2009
    Applicant: YAHOO! INC.
    Inventors: Alok S. Kirpal, Meghana Kshirsagar
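
The abstract of publication 20100223214 describes building, for each attribute, a list of top-K locations from a labeled sample of pages and then using those lists to extract from the rest of the cluster. The following is a minimal sketch of that idea only, not the method claimed in the application. It assumes that a "location" is a path-like string, that the trained model's output is already available as (location, attribute) pairs, and that locations are ranked by how often they occur in the sample. The function names `build_top_k_lists` and `extract` are hypothetical.

```python
from collections import Counter, defaultdict


def build_top_k_lists(labeled_sample, k=2):
    """Compile the k most frequent locations for each attribute.

    labeled_sample: iterable of pages, where each page is a list of
    (location, attribute) pairs. In this sketch the pairs stand in for
    the predictions of a trained model (an assumption).
    """
    counts = defaultdict(Counter)
    for page in labeled_sample:
        for location, attribute in page:
            counts[attribute][location] += 1
    # Rank by frequency; this ranking criterion is an assumption.
    return {
        attribute: [loc for loc, _ in counter.most_common(k)]
        for attribute, counter in counts.items()
    }


def extract(page_nodes, top_k):
    """Extract attributes from one page using the top-K lists.

    page_nodes: dict mapping location -> text for a single page.
    For each attribute, the first listed location present on the page wins.
    """
    result = {}
    for attribute, locations in top_k.items():
        for loc in locations:
            if loc in page_nodes:
                result[attribute] = page_nodes[loc]
                break
    return result


# Hypothetical sample of three structurally similar pages.
sample = [
    [("/html/body/h1", "title"), ("/html/body/span[2]", "price")],
    [("/html/body/h1", "title"), ("/html/body/span[2]", "price")],
    [("/html/body/h2", "title"), ("/html/body/span[2]", "price")],
]
top_k = build_top_k_lists(sample, k=2)
new_page = {"/html/body/h2": "Widget", "/html/body/span[2]": "$5"}
extracted = extract(new_page, top_k)
```

Keeping more than one location per attribute lets extraction fall back to the next candidate when a page in the cluster deviates slightly from the most common layout, as `new_page` does here for the title.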