Patents by Inventor Swapnil Hajela

Swapnil Hajela has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9514216
    Abstract: Exemplary methods and apparatuses are provided which may be used for classifying and indexing segmented portions of web pages and providing related information for use in information extraction and/or information retrieval systems. In an embodiment, an index of segmented portions may be used by a search engine to respond to a search query. In an embodiment, one or more machine learned models may be used to identify one or more feature properties of a plurality of segmented portions within one or more files, or otherwise inferable from the one or more files. In an embodiment, one or more machine learned models may be used to classify one or more of a plurality of segmented portions as being at least one of a plurality of segment types.
    Type: Grant
    Filed: September 8, 2014
    Date of Patent: December 6, 2016
    Assignee: Yahoo! Inc.
    Inventors: Lei Duan, Fan Li, Srinivas Vadrevu, Emre Velipasaoglu, Swapnil Hajela, Deepayan Chakrabarti
  • Publication number: 20150066934
    Abstract: Exemplary methods and apparatuses are provided which may be used for classifying and indexing segmented portions of web pages and providing related information for use in information extraction and/or information retrieval systems.
    Type: Application
    Filed: September 8, 2014
    Publication date: March 5, 2015
    Inventors: Lei Duan, Fan Li, Srinivas Vadrevu, Emre Velipasaoglu, Swapnil Hajela, Deepayan Chakrabarti
  • Patent number: 8849725
    Abstract: Exemplary methods and apparatuses are provided which may be used for classifying and indexing segmented portions of web pages and providing related information for use in information extraction and/or information retrieval systems.
    Type: Grant
    Filed: August 10, 2009
    Date of Patent: September 30, 2014
    Assignee: Yahoo! Inc.
    Inventors: Lei Duan, Fan Li, Srinivas Vadrevu, Emre Velipasaoglu, Swapnil Hajela, Deepayan Chakrabarti
  • Patent number: 8255793
    Abstract: To provide valuable information regarding a webpage, the webpage must be divided into distinct semantically coherent segments for analysis. A set of heuristics allow a segmentation algorithm to identify an optimal number of segments for a given webpage or any portion thereof more accurately. A first heuristic estimates the optimal number of segments for any given webpage or portion thereof. A second heuristic coalesces segments where the number of segments identified far exceeds the optimal number recommended. A third heuristic coalesces segments corresponding to a portion of a webpage with much unused whitespace and little content. A fourth heuristic coalesces segments of nodes that have a recommended number of segments below a certain threshold into segments of other nodes. A fifth heuristic recursively analyzes and splits segments that correspond to webpage portions surpassing a certain threshold portion size.
    Type: Grant
    Filed: January 8, 2008
    Date of Patent: August 28, 2012
    Assignee: Yahoo! Inc.
    Inventors: Deepayan Chakrabarti, Manav Ratan Mital, Swapnil Hajela, Emre Velipasaoglu
  • Patent number: 8135717
    Abstract: Words having selected characteristics in a corpus of documents are found using a data processor arranged to execute queries. Memory stores an index structure in which entries in the index structure map words and marks for words having the selected characteristics to locations within documents in the corpus. Entries in the index structure represent words and other entries represent marks with the location information of a marked word. The entries for the marks can be tokens coalesced with prefixes of respective marked words or adjacent words. A query processor forms a modified query by adding a mark for a word to the query. The processor executes the modified query.
    Type: Grant
    Filed: March 30, 2009
    Date of Patent: March 13, 2012
    Assignee: SAP America, Inc.
    Inventors: Ramana B. Rao, Swapnil Hajela, Nareshkumar Rajkumar
  • Patent number: 8131730
    Abstract: Phrases in a corpus of documents including stopwords are found using a data processor arranged to execute phrase queries. Memory stores an index structure which maps entries in the index structure to documents in the corpus. Entries in the index structure represent words and other entries represent stopwords found in the corpus coalesced with prefixes of respective adjacent words adjacent to the stopwords. The prefixes comprise one or more leading characters of the respective adjacent words. A query processor forms a modified query by substituting a stopword with a search token representing the stopword coalesced with a prefix of the next word in the query. The processor executes the modified query. Also, index structures including coalesced stopwords are created and maintained.
    Type: Grant
    Filed: March 30, 2009
    Date of Patent: March 6, 2012
    Assignee: SAP America, Inc.
    Inventors: Swapnil Hajela, Nareshkumar Rajkumar
  • Publication number: 20110035345
    Abstract: Exemplary methods and apparatuses are provided which may be used for classifying and indexing segmented portions of web pages and providing related information for use in information extraction and/or information retrieval systems.
    Type: Application
    Filed: August 10, 2009
    Publication date: February 10, 2011
    Applicant: Yahoo! Inc.
    Inventors: Lei Duan, Fan Li, Srinivas Vadrevu, Emre Velipasaoglu, Swapnil Hajela, Deepayan Chakrabarti
  • Publication number: 20090193005
    Abstract: Words having selected characteristics in a corpus of documents are found using a data processor arranged to execute queries. Memory stores an index structure in which entries in the index structure map words and marks for words having the selected characteristics to locations within documents in the corpus. Entries in the index structure represent words and other entries represent marks with the location information of a marked word. The entries for the marks can be tokens coalesced with prefixes of respective marked words or adjacent words. A query processor forms a modified query by adding a mark for a word to the query. The processor executes the modified query.
    Type: Application
    Filed: March 30, 2009
    Publication date: July 30, 2009
    Inventors: Ramana B. Rao, Swapnil Hajela, Nareshkumar Rajkumar
  • Publication number: 20090187564
    Abstract: Phrases in a corpus of documents including stopwords are found using a data processor arranged to execute phrase queries. Memory stores an index structure which maps entries in the index structure to documents in the corpus. Entries in the index structure represent words and other entries represent stopwords found in the corpus coalesced with prefixes of respective adjacent words adjacent to the stopwords. The prefixes comprise one or more leading characters of the respective adjacent words. A query processor forms a modified query by substituting a stopword with a search token representing the stopword coalesced with a prefix of the next word in the query. The processor executes the modified query. Also, index structures including coalesced stopwords are created and maintained.
    Type: Application
    Filed: March 30, 2009
    Publication date: July 23, 2009
    Inventors: Swapnil Hajela, Nareshkumar Rajkumar
  • Publication number: 20090177959
    Abstract: To provide valuable information regarding a webpage, the webpage must be divided into distinct semantically coherent segments for analysis. A set of heuristics allow a segmentation algorithm to identify an optimal number of segments for a given webpage or any portion thereof more accurately. A first heuristic estimates the optimal number of segments for any given webpage or portion thereof. A second heuristic coalesces segments where the number of segments identified far exceeds the optimal number recommended. A third heuristic coalesces segments corresponding to a portion of a webpage with much unused whitespace and little content. A fourth heuristic coalesces segments of nodes that have a recommended number of segments below a certain threshold into segments of other nodes. A fifth heuristic recursively analyzes and splits segments that correspond to webpage portions surpassing a certain threshold portion size.
    Type: Application
    Filed: January 8, 2008
    Publication date: July 9, 2009
    Inventors: Deepayan Chakrabarti, Manav Ratan Mital, Swapnil Hajela, Emre Velipasaoglu
  • Patent number: 7516125
    Abstract: Words having selected characteristics in a corpus of documents are found using a data processor arranged to execute queries. Memory stores an index structure in which entries in the index structure map words and marks for words having the selected characteristics to locations within documents in the corpus. Entries in the index structure represent words and other entries represent marks with the location information of a marked word. The entries for the marks can be tokens coalesced with prefixes of respective marked words or adjacent words. A query processor forms a modified query by adding a mark for a word to the query. The processor executes the modified query.
    Type: Grant
    Filed: March 29, 2006
    Date of Patent: April 7, 2009
    Assignee: Business Objects Americas
    Inventors: Ramana B. Rao, Swapnil Hajela, Nareshkumar Rajkumar
  • Patent number: 7512596
    Abstract: Phrases in a corpus of documents including stopwords are found using a data processor arranged to execute phrase queries. Memory stores an index structure which maps entries in the index structure to documents in the corpus. Entries in the index structure represent words and other entries represent stopwords found in the corpus coalesced with prefixes of respective adjacent words adjacent to the stopwords. The prefixes comprise one or more leading characters of the respective adjacent words. A query processor forms a modified query by substituting a stopword with a search token representing the stopword coalesced with a prefix of the next word in the query. The processor executes the modified query. Also, index structures including coalesced stopwords are created and maintained.
    Type: Grant
    Filed: March 29, 2006
    Date of Patent: March 31, 2009
    Assignee: Business Objects Americas
    Inventors: Swapnil Hajela, Nareshkumar Rajkumar
  • Publication number: 20070027854
    Abstract: Words having selected characteristics in a corpus of documents are found using a data processor arranged to execute queries. Memory stores an index structure in which entries in the index structure map words and marks for words having the selected characteristics to locations within documents in the corpus. Entries in the index structure represent words and other entries represent marks with the location information of a marked word. The entries for the marks can be tokens coalesced with prefixes of respective marked words or adjacent words. A query processor forms a modified query by adding a mark for a word to the query. The processor executes the modified query.
    Type: Application
    Filed: March 29, 2006
    Publication date: February 1, 2007
    Applicant: Inxight Software, Inc.
    Inventors: Ramana Rao, Swapnil Hajela, Nareshkumar Rajkumar
  • Publication number: 20070027853
    Abstract: Phrases in a corpus of documents including stopwords are found using a data processor arranged to execute phrase queries. Memory stores an index structure which maps entries in the index structure to documents in the corpus. Entries in the index structure represent words and other entries represent stopwords found in the corpus coalesced with prefixes of respective adjacent words adjacent to the stopwords. The prefixes comprise one or more leading characters of the respective adjacent words. A query processor forms a modified query by substituting a stopword with a search token representing the stopword coalesced with a prefix of the next word in the query. The processor executes the modified query. Also, index structures including coalesced stopwords are created and maintained.
    Type: Application
    Filed: March 29, 2006
    Publication date: February 1, 2007
    Applicant: Inxight Software, Inc.
    Inventors: Swapnil Hajela, Nareshkumar Rajkumar
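
Illustrative sketch (not taken from the patents): the stopword abstracts above describe index entries in which a stopword is coalesced with a prefix of the adjacent word, and a query processor that rewrites a phrase query the same way before executing it. The Python below is one minimal way such a scheme might look. The stopword list, the two-character prefix length, the "+" separator, and every function name are assumptions made for this example only, and the positional index is a simplification rather than the claimed structure.

```python
from collections import defaultdict

# Assumed, illustrative values; the abstracts do not specify them.
STOPWORDS = {"the", "of", "a", "to", "in"}
PREFIX_LEN = 2  # number of leading characters of the next word to coalesce


def coalesce(tokens):
    """Replace each stopword with 'stopword+prefix-of-next-word'.

    A stopword at the very end of the sequence has no next word and is
    kept unchanged, so a query ending in a stopword may miss matches.
    """
    out = []
    for i, tok in enumerate(tokens):
        if tok in STOPWORDS and i + 1 < len(tokens):
            out.append(f"{tok}+{tokens[i + 1][:PREFIX_LEN]}")
        else:
            out.append(tok)
    return out


def build_index(docs):
    """Map each (possibly coalesced) token to {doc_id: [positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, tok in enumerate(coalesce(text.lower().split())):
            index[tok][doc_id].append(pos)
    return index


def phrase_search(index, phrase):
    """Rewrite the phrase like the documents, then match consecutive positions."""
    terms = coalesce(phrase.lower().split())
    if not terms:
        return []
    hits = []
    for doc_id, starts in index.get(terms[0], {}).items():
        for start in starts:
            if all(start + k in index.get(term, {}).get(doc_id, [])
                   for k, term in enumerate(terms)):
                hits.append(doc_id)
                break
    return sorted(hits)


docs = {
    1: "the quick brown fox",
    2: "the quiet brown fox",
    3: "state of the art",
}
index = build_index(docs)
print(phrase_search(index, "the quick"))   # documents containing the exact phrase
print(phrase_search(index, "of the art"))
```

The point of the coalesced token is that "the+qu" occurs in far fewer documents than the bare stopword "the", so a phrase query that includes stopwords starts from a much shorter posting list. In this sketch the following word is still matched exactly, so "the quick" does not match a document that only contains "the quiet".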