Patents by Inventor Venkatesh Ganti

Venkatesh Ganti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9600566
    Abstract: Embodiments for identifying an entity synonym of an entity are described. A query log is stored in a database located on at least one computing device. A candidate generation module can select a candidate query in the query log that shares a click on a URL with the entity. A correlated tag module can generate a set of phrase-tag pairs for the entity and the candidate query and measure a mutual information value for each phrase-tag pair. A candidate filtering module can determine a click similarity value between the candidate query and the entity based on a set of URLs selected in the search engine results and a tag similarity value based on the mutual information values. A candidate query is selected as an entity synonym if the click similarity value and the tag similarity value are greater than predetermined thresholds respectively.
    Type: Grant
    Filed: May 14, 2010
    Date of Patent: March 21, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Venkatesh Ganti, Dong Xin
  • Patent number: 9501475
    Abstract: A set of documents is filtered for entity extraction. A list of entity strings is received. A set of token sets that covers the entity strings in the list is determined. An inverted index generated on a first set of documents is queried using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set. A second set of documents identified by the set of document identifiers is retrieved from the first set of documents. The second set of documents is filtered to include one or more documents of the second set that each includes a match with at least one entity string of the list of entity strings. Entity recognition may be performed on the filtered second set of documents.
    Type: Grant
    Filed: June 3, 2014
    Date of Patent: November 22, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
  • Patent number: 9244952
    Abstract: Disclosed are a method, a device and/or a system of editable and searchable markup pages automatically populated through query monitoring of users of a database. In one aspect, a method includes automatically generating an editable markup page and/or a page name based on an initial query of a database using a processor and a memory, associating the generated markup page with a user of the database, and appending information to the editable markup page based on a similar query of the database by another user. The method may include permitting other users of the database to access, modify, append, and/or delete entries from the editable markup page.
    Type: Grant
    Filed: October 18, 2013
    Date of Patent: January 26, 2016
    Assignee: Alation, Inc.
    Inventors: Venkatesh Ganti, Aaron Kalb, Feng Niu, Satyen Sangani
  • Patent number: 8996559
    Abstract: Disclosed are a method, a device and/or a system of assisted query formation, validation, and result previewing in a database having a complex schema. In one aspect, a method of a query editor includes generating a data profile which includes a set of characteristics captured at various granularities of an initial result set generated from an initial query using a processor and a memory. The method determines what a user expects in the initial result set of the initial query and/or a subsequent result set of a subsequent query based on the data profile and/or a heuristically estimated data profile. The method includes enabling the user to evaluate a semantic accuracy of the subsequent query based on the likely expectation of the user as determined through the set of characteristics of the data profile. The set of characteristics may include metadata of the initial query.
    Type: Grant
    Filed: October 18, 2013
    Date of Patent: March 31, 2015
    Assignee: Alation, Inc.
    Inventors: Venkatesh Ganti, Aaron Kalb, Feng Niu, Satyen Sangani
  • Patent number: 8965915
    Abstract: Disclosed are a method, a device and/or a system of assisted query formation, validation, and result previewing in a database having a complex schema. In one aspect, a method of a query editor includes generating a data profile which includes a set of characteristics captured at various granularities of an initial result set generated from an initial query using a processor and a memory. The method determines what a user expects in the initial result set of the initial query and/or a subsequent result set of a subsequent query based on the data profile and/or a heuristically estimated data profile. The method includes enabling the user to evaluate a semantic accuracy of the subsequent query based on the likely expectation of the user as determined through the set of characteristics of the data profile. The set of characteristics may include metadata of the initial query.
    Type: Grant
    Filed: October 18, 2013
    Date of Patent: February 24, 2015
    Assignee: Alation, Inc.
    Inventors: Venkatesh Ganti, Aaron Kalb, Feng Niu, Satyen Sangani
  • Patent number: 8935272
    Abstract: In one embodiment, a method of a curated answers system includes automatically populating a profile markup page of a user with information describing an initial query of a database that the user has generated using a processor and a memory, determining that another user of the database has submitted a similar query that is semantically proximate to the initial query of the database that the user has generated, and presenting the profile markup page of the user to the other user. The method of the curated answers system may include enabling the other user to communicate with the user through a communication channel on the profile markup page. A question of the other user may be published to the user on the profile markup page of the user, and/or other profile markup page of the other user. The question may be associated as being posted by the other user.
    Type: Grant
    Filed: October 18, 2013
    Date of Patent: January 13, 2015
    Assignee: Alation, Inc.
    Inventors: Venkatesh Ganti, Aaron Kalb, Feng Niu, Satyen Sangani
  • Publication number: 20140351274
    Abstract: A set of documents is filtered for entity extraction. A list of entity strings is received. A set of token sets that covers the entity strings in the list is determined. An inverted index generated on a first set of documents is queried using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set. A second set of documents identified by the set of document identifiers is retrieved from the first set of documents. The second set of documents is filtered to include one or more documents of the second set that each includes a match with at least one entity string of the list of entity strings. Entity recognition may be performed on the filtered second set of documents.
    Type: Application
    Filed: June 3, 2014
    Publication date: November 27, 2014
    Applicant: Microsoft Corporation
    Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
  • Publication number: 20140280286
    Abstract: Disclosed are a method, a device and/or a system of assisted query formation, validation, and result previewing in a database having a complex schema. In one aspect, a method of a query editor includes generating a data profile which includes a set of characteristics captured at various granularities of an initial result set generated from an initial query using a processor and a memory. The method determines what a user expects in the initial result set of the initial query and/or a subsequent result set of a subsequent query based on the data profile and/or a heuristically estimated data profile. The method includes enabling the user to evaluate a semantic accuracy of the subsequent query based on the likely expectation of the user as determined through the set of characteristics of the data profile. The set of characteristics may include metadata of the initial query.
    Type: Application
    Filed: October 18, 2013
    Publication date: September 18, 2014
    Inventors: Venkatesh Ganti, Aaron Kalb, Feng Niu, Satyen Sangani
  • Publication number: 20140279845
    Abstract: Disclosed are a method, a device and/or a system of editable and searchable markup pages automatically populated through query monitoring of users of a database. In one aspect, a method includes automatically generating an editable markup page and/or a page name based on an initial query of a database using a processor and a memory, associating the generated markup page with a user of the database, and appending information to the editable markup page based on a similar query of the database by another user. The method may include permitting other users of the database to access, modify, append, and/or delete entries from the editable markup page.
    Type: Application
    Filed: October 18, 2013
    Publication date: September 18, 2014
    Inventors: Venkatesh Ganti, Aaron Kalb, Feng Niu, Satyen Sangani
  • Publication number: 20140280287
    Abstract: Disclosed are a method, a device and/or a system of assisted query formation, validation, and result previewing in a database having a complex schema. In one aspect, a method of a query editor includes generating a data profile which includes a set of characteristics captured at various granularities of an initial result set generated from an initial query using a processor and a memory. The method determines what a user expects in the initial result set of the initial query and/or a subsequent result set of a subsequent query based on the data profile and/or a heuristically estimated data profile. The method includes enabling the user to evaluate a semantic accuracy of the subsequent query based on the likely expectation of the user as determined through the set of characteristics of the data profile. The set of characteristics may include metadata of the initial query.
    Type: Application
    Filed: October 18, 2013
    Publication date: September 18, 2014
    Inventors: Venkatesh Ganti, Aaron Kalb, Feng Niu, Satyen Sangani
  • Publication number: 20140280067
    Abstract: In one embodiment, a method of a curated answers system includes automatically populating a profile markup page of a user with information describing an initial query of a database that the user has generated using a processor and a memory, determining that another user of the database has submitted a similar query that is semantically proximate to the initial query of the database that the user has generated, and presenting the profile markup page of the user to the other user. The method of the curated answers system may include enabling the other user to communicate with the user through a communication channel on the profile markup page. A question of the other user may be published to the user on the profile markup page of the user, and/or other profile markup page of the other user. The question may be associated as being posted by the other user.
    Type: Application
    Filed: October 18, 2013
    Publication date: September 18, 2014
    Inventors: Venkatesh Ganti, Aaron Kalb, Feng Niu, Satyen Sangani
  • Patent number: 8782061
    Abstract: A set of documents is filtered for entity extraction. A list of entity strings is received. A set of token sets that covers the entity strings in the list is determined. An inverted index generated on a first set of documents is queried using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set. A second set of documents identified by the set of document identifiers is retrieved from the first set of documents. The second set of documents is filtered to include one or more documents of the second set that each includes a match with at least one entity string of the list of entity strings. Entity recognition may be performed on the filtered second set of documents.
    Type: Grant
    Filed: June 24, 2008
    Date of Patent: July 15, 2014
    Assignee: Microsoft Corporation
    Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
  • Patent number: 8533203
    Abstract: Identifying synonyms of entities using a collection of documents is disclosed herein. In some aspects, a document from a collection of documents may be analyzed to identify hit sequences that include one or more tokens (e.g., words, numbers, etc.). The hit sequences may then be used to generate discriminating token sets (DTS's) that are subsets of both the hit sequences and the entity names. The DTS's are matched with corresponding entity names, and then used to create DTS phrases by selecting adjacent text in the document that is proximate to the DTS. The DTS phrases may be analyzed to determine whether the corresponding DTS is a synonym of the entity name. In various aspects, the tokens of an associated entity name that are present in the DTS phrases are used to generate a score for the DTS. When the score at least reaches a threshold, the DTS may be designated as a synonym. A list of synonyms may be generated for each entity name.
    Type: Grant
    Filed: June 4, 2009
    Date of Patent: September 10, 2013
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Venkatesh Ganti, Dong Xin
  • Patent number: 8527893
    Abstract: This patent application relates to taxonomy editing. One implementation involves a taxonomy editor configured to generate a visual representation of a taxonomy associated with a set of scientific papers. The taxonomy editor includes a properties module configured to identify properties relating to an individual node of the taxonomy and a statistics module configured to determine trends relating to the individual node. The taxonomy editor further includes a similarity module configured to evaluate keyword similarity relative to individual scientific papers associated with the individual node. The taxonomy editor also includes a suggestion module configured to utilize the properties, the trends and the keyword similarity to identify potential modifications to the taxonomy. The taxonomy editor is further configured to present at least some of the potential modifications, the properties, the trends, and the keyword similarity concurrently with the visual representation of the taxonomy.
    Type: Grant
    Filed: February 26, 2010
    Date of Patent: September 3, 2013
    Assignee: Microsoft Corporation
    Inventors: Sanjay Agrawal, Surajit Chaudhuri, Venkatesh Ganti, Yuri Siradeghyan
  • Patent number: 8204866
    Abstract: A deduplication algorithm that provides improved accuracy in data deduplication by using aggregate and/or groupwise constraints. Deduplication is accomplished by satisfying as many of these constraints as possible rather than imposing them inflexibly as hard constraints. Additionally, textual similarity between tuples is leveraged to restrict the search space. The algorithm begins with a coarse initial partition of data records and continues by raising the similarity threshold until the threshold splits a given partition. This sequence of splits defines a rich space of alternatives. Over this space, an algorithm finds a partition of the input that maximizes constraint satisfaction. In the context of groupwise aggregation constraints for deduplication, all SQL (structured query language) aggregates are allowed, including summation.
    Type: Grant
    Filed: May 18, 2007
    Date of Patent: June 19, 2012
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Venkatesh Ganti, Shriraghav Kaushik, Anish Das Sarma
  • Patent number: 8195655
    Abstract: Architecture for finding related entities for web search queries. An extraction component takes a document as input and outputs all the mentions (or occurrences) of named entities such as names of people, organizations, locations, and products in the document, as well as entity metadata. An indexing component takes a document identifier (docID) and the set of mentions of named entities, and stores and indexes the information for retrieval. A document-based search component takes a keyword query and returns the docIDs of the top documents matching the query. A retrieval component takes a docID as input, accesses the information stored by the indexing component and returns the set of mentions of named entities in the document. This information is then passed to an entity scoring and thresholding component that computes an aggregate score of each entity and selects the entities to return to the user.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: June 5, 2012
    Assignee: Microsoft Corporation
    Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
  • Publication number: 20110320446
    Abstract: This patent application relates to interval-based information retrieval (IR) search techniques for efficiently and correctly answering keyword search queries. In some embodiments, a range of information-containing blocks for a search query can be identified. Each of these blocks, and thus the range, can include document identifiers that identify individual corresponding documents that contain a term found in the search query. From the range, a subrange(s) having a smaller number of blocks than the range can be selected. This can be accomplished without decompressing the blocks by partitioning the range into intervals and evaluating the intervals. The smaller number of blocks in the subrange(s) can then be decompressed and processed to identify a doc ID(s) and thus document(s) that satisfies the query.
    Type: Application
    Filed: June 25, 2010
    Publication date: December 29, 2011
    Applicant: Microsoft Corporation
    Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
  • Publication number: 20110314010
    Abstract: A query comprising a set of keywords may be applied to a data set having various attributes, but it may be difficult to determine the query predicates intended for each keyword (e.g., the attributes targeted by each keyword, and the values of those attributes satisfying the keyword.) The meaning of a keyword of interest may be inferred from a set of query pairs, comprising a background query (comprising a set of keywords excluding the keyword of interest) and a foreground query (comprising the same set of keywords but also including the keyword of interest.) Differences in the query results for the foreground query and the background query of many query pairs may identify a query predicate intended by the keyword and a confidence score. These results may be associated with the keyword in a keyword map, useful for translating queries into query predicates that may yield relevant query results.
    Type: Application
    Filed: June 17, 2010
    Publication date: December 22, 2011
    Applicant: Microsoft Corporation
    Inventors: Venkatesh Ganti, Dong Xin, Yeye He
  • Publication number: 20110282856
    Abstract: Embodiments for identifying an entity synonym of an entity are described. A query log is stored in a database located on at least one computing device. A candidate generation module can select a candidate query in the query log that shares a click on a URL with the entity. A correlated tag module can generate a set of phrase-tag pairs for the entity and the candidate query and measure a mutual information value for each phrase-tag pair. A candidate filtering module can determine a click similarity value between the candidate query and the entity based on a set of URLs selected in the search engine results and a tag similarity value based on the mutual information values. A candidate query is selected as an entity synonym if the click similarity value and the tag similarity value are greater than predetermined thresholds respectively.
    Type: Application
    Filed: May 14, 2010
    Publication date: November 17, 2011
    Applicant: Microsoft Corporation
    Inventors: Venkatesh Ganti, Dong Xin
  • Patent number: 8046339
    Abstract: Example-driven creation of record matching queries. The disclosed architecture employs techniques that exploit the availability of positive (or matching) and negative (non-matching) examples to search through this space and suggest an initial record matching query. The record matching task is modeled as that of designing an operator tree obtained by composing a few primitive operators. This ensures that record matching programs can be executed efficiently and scalably over large input relations. The architecture joins records across multiple (e.g., two) relations (e.g., R and S). The architecture exploits the monotonicity property of similarity functions for record matching in the relations, in that any pair of matching records has a higher similarity value than non-matching record pairs on at least one similarity function.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: October 25, 2011
    Assignee: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Bee Chung Chen, Venkatesh Ganti, Shriraghav Kaushik
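
The document-filtering steps described in the abstract shared by patents 9501475 and 8782061 (and publication 20140351274) above can be illustrated with a small sketch. It is not the patented implementation. The function names, the whitespace tokenizer, and the choice of one token set per entity string as the "cover" are all assumptions made for illustration, since the abstract does not say how the covering token sets are chosen.

```python
from collections import defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for a real tokenizer.
    return text.lower().split()


def build_inverted_index(docs):
    # Map each token to the set of document identifiers that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in set(tokenize(text)):
            index[token].add(doc_id)
    return index


def candidate_doc_ids(index, token_sets):
    # A document is a candidate if it contains every token of at least one token set.
    ids = set()
    for token_set in token_sets:
        postings = [index.get(token, set()) for token in token_set]
        if postings:
            ids |= set.intersection(*postings)
    return ids


def contains_entity(text, entity):
    # Exact match of the entity's tokens as a contiguous sequence in the text.
    tokens, target = tokenize(text), tokenize(entity)
    n = len(target)
    return n > 0 and any(
        tokens[i:i + n] == target for i in range(len(tokens) - n + 1)
    )


def filter_documents(docs, entities):
    index = build_inverted_index(docs)
    # Assumption: the simplest cover, one token set per entity string.
    token_sets = [frozenset(tokenize(e)) for e in entities]
    # "Second set": documents retrieved by identifier from the index lookup.
    second_set = {d: docs[d] for d in candidate_doc_ids(index, token_sets)}
    # Keep only documents that actually contain a matching entity string.
    return {
        d: text
        for d, text in second_set.items()
        if any(contains_entity(text, e) for e in entities)
    }


docs = {
    1: "the canon eos 350d is a camera",
    2: "eos is a goddess and canon is law 350d",
    3: "nothing relevant here",
}
print(sorted(filter_documents(docs, ["canon eos 350d"])))
```

In this toy run, documents 1 and 2 both pass the index lookup because each contains all three tokens, but only document 1 survives the final match, which is the point of the second filtering step before entity recognition is run.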