Patents by Inventor Kaushik Chakrabarti

Kaushik Chakrabarti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

REFINING STRUCTURED DATA INDEXES

Publication number: 20180232410

Abstract: The present invention extends to methods, systems, and computer program products for refining structured data indexes. Aspects of the invention include associating structured data, such as, for example, tables, with additional content. Additional content can include content outside the <table> and </table> tags of a web table. Indexes for structured data (e.g., table indexes) can be refined based on the additional content to improve the relevance of providing parts of the structured data (e.g., parts of the table) in search results.

Type: Application

Filed: April 11, 2018

Publication date: August 16, 2018

Inventors: Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
Data services for enterprises leveraging search system data assets

Patent number: 10032131

Abstract: A data service system is described herein which processes raw data assets from at least one network-accessible system (such as a search system), to produce processed data assets. Enterprise applications can then leverage the processed data assets to perform various environment-specific tasks. In one implementation, the data service system can generate any of: synonym resources for use by an enterprise application in providing synonyms for specified terms associated with entities; augmentation resources for use by an enterprise application in providing supplemental information for specified seed information; and spelling-correction resources for use by an enterprise application in providing spelling information for specified terms, and so on.

Type: Grant

Filed: June 20, 2012

Date of Patent: July 24, 2018

Assignee: Microsoft Technology Licensing, LLC

Inventors: Tao Cheng, Kris Ganjam, Kaushik Chakrabarti, Zhimin Chen, Vivek R. Narasayya, Surajit Chaudhuri
Annotating structured data for search

Patent number: 9959305

Abstract: The present invention extends to methods, systems, and computer program products for annotating structured data for search. Aspects of the invention include associating structured data, such as, for example, tables, with additional content to improve indexing of the structured data for search and/or provide improved search results for structured data. Web pages can include tables as well as other content. The other content in a web page, such as, for example, content outside the <table> and </table> tags of a web table, can be useful in supporting searches for web tables. Content in one web page can also be useful in supporting searches for a table in another web page.

Type: Grant

Filed: July 8, 2014

Date of Patent: May 1, 2018

Assignee: Microsoft Technology Licensing, LLC

Inventors: Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
Ranking tables for keyword search

Patent number: 9940365

Abstract: The present invention extends to methods, systems, and computer program products for ranking tables for keyword search. Aspects of the invention include generating lists of candidate tables for inclusion in a search query response, computing table hit matrices, retrieving content from fields of candidate tables having keyword hits, generating ranking features of tables, and computing ranking scores for tables. Aspects of the invention can be used to match keywords against column names, to match keywords against values in subject and non-subject columns, and to match keywords against table descriptions like page titles, table captions, cell values, nearest headings and surrounding text. Which keywords are matched against which fields can depend on the table and/or the query (referred to as “late binding”).

Type: Grant

Filed: July 8, 2014

Date of Patent: April 10, 2018

Assignee: Microsoft Technology Licensing, LLC

Inventors: Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
LEVERAGING CORPORAL DATA FOR DATA PARSING AND PREDICTING

Publication number: 20170371958

Abstract: The techniques discussed herein leverage structure within data of a corpus to parse unstructured data to obtain structured data and/or to predict latent data that is related to the unstructured and/or structured data. In some examples, parsing and/or predicting can be conducted at varying levels of granularity. In some examples, parsing and/or predicting can be iteratively conducted to improve accuracy and/or to expose more hidden data.

Type: Application

Filed: June 28, 2016

Publication date: December 28, 2017

Inventors: Kris K. Ganjam, Kaushik Chakrabarti
Aggregate-Query Database System and Processing

Publication number: 20170371924

Abstract: A processing unit can determine a first subset of a data set including data records selected based on measure values thereof. The processing unit can determine an index mapping a predicate to data records associated with that predicate and approximation values of the records. The processing unit can process a query against the first subset to provide a first result and a first accuracy value, determine that the first accuracy value does not satisfy an accuracy criterion, and process the query against the index. In some examples, the processing unit can process the query against a second subset including data records satisfying a predetermined predicate. In some examples, the processing unit can receive data records and determine the first subset. Data records can include respective measure values. Data records with higher measure values can occur in the first subset more frequently than data records with lower measure values.

Type: Application

Filed: June 24, 2016

Publication date: December 28, 2017

Inventors: Bolin Ding, Silu Huang, Chi Wang, Kaushik Chakrabarti, Surajit Chaudhuri
UNDERSTANDING TABLES FOR SEARCH

Publication number: 20170322964

Abstract: The present invention extends to methods, systems, and computer program products for understanding tables for search. Aspects of the invention include identifying a subject tuple (e.g., a subject column) for a table, detecting a tuple header (e.g., a column header) using other tables, and detecting a tuple header (e.g., a column header) using a knowledge base. Implementations can be utilized in a structured data search system (SDSS) that indexes structured information, such as tables in a relational database or HTML tables extracted from web pages. The SDSS allows users to search over the structured information (tables) using different mechanisms including keyword search and data finding data.

Type: Application

Filed: July 27, 2017

Publication date: November 9, 2017

Inventors: Zhongyuan Wang, Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
Understanding tables for search

Patent number: 9734181

Abstract: The present invention extends to methods, systems, and computer program products for understanding tables for search. Aspects of the invention include identifying a subject column for a table, detecting a column header using other tables, and detecting a column header using a knowledge base. Implementations can be utilized in a structured data search system (SDSS) that indexes structured information, such as tables in a relational database or HTML tables extracted from web pages. The SDSS allows users to search over the structured information (tables) using different mechanisms including keyword search and data finding data.

Type: Grant

Filed: October 2, 2014

Date of Patent: August 15, 2017

Assignee: Microsoft Technology Licensing, LLC

Inventors: Zhongyuan Wang, Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
TECHNIQUES FOR DIGITAL ENTITY CORRELATION

Publication number: 20170132329

Abstract: Techniques for using digital entity correlation to generate a composite knowledge graph from constituent graphs. In an aspect, digital attribute values associated with primary entities may be encoded into primitives, e.g., using a multi-resolution encoding scheme. A pairs graph may be constructed, based on seed pairs calculated from correlating encoded primitives, and further expanded to include subjects and objects of the seed pairs, as well as pairs connected to relationship entities. A similarity metric is computed for each candidate pair to determine whether a match exists. The similarity metric may be based on summing a weighted landing probability over all primitives associated directly or indirectly with each candidate pair. By incorporating primitive matches from not only the candidate pair but also from pairs surrounding the candidate pair, entity matching may be efficiently implemented on a holistic basis.

Type: Application

Filed: November 5, 2015

Publication date: May 11, 2017

Inventors: Mohamed Yakout, Kaushik Chakrabarti, Maria Pershina
Targeted disambiguation of named entities

Patent number: 9594831

Abstract: A targeted disambiguation system is described herein which determines true mentions of a list of named entities in a collection of documents. The list of named entities is homogenous in the sense that the entities pertain to the same subject matter domain. The system determines the true mentions by leveraging the homogeneity in the list, and, more specifically by applying a context similarity hypothesis, a co-mention hypothesis, and an interdependency hypothesis. In one implementation, the system executes its analysis using a graph-based model. The system can operate without the existence of additional information regarding the entities in the list; nevertheless, if such information is available, the system can integrate it into its analysis.

Type: Grant

Filed: June 22, 2012

Date of Patent: March 14, 2017

Assignee: Microsoft Technology Licensing, LLC

Inventors: Chi Wang, Kaushik Chakrabarti, Tao Cheng, Surajit Chaudhuri
CONCEPT EXPANSION USING TABLES

Publication number: 20160378765

Abstract: Concept expansion using tables, such as web tables, can return entities belonging to a concept based on an input of the concept and at least one seed entity that belongs to the concept. A concept expansion frontend can receive the concept and seed entity and provide them to a concept expansion framework. The concept expansion framework can expand the coverage of entities for concepts, including tail concepts, using tables by leveraging rich content signals corresponding to concept names. Such content signals can include content matching the concept that appears in captions, early headings, page titles, surrounding text, anchor text, and queries for which the page has been clicked. The concept expansion framework can use the structured entities in tables to infer exclusive tables. Such inference differs from previous label propagation methods and involves modeling a table-entity relationship. The table-entity relationship reduces semantic drift without using a reference ontology.

Type: Application

Filed: June 29, 2015

Publication date: December 29, 2016

Inventors: Philip A. Bernstein, Kaushik Chakrabarti, Zhimin Chen, Yeye He, Chi Wang, Kris K. Ganjam
Scalable lookup-driven entity extraction from indexed document collections

Patent number: 9501475

Abstract: A set of documents is filtered for entity extraction. A list of entity strings is received. A set of token sets that covers the entity strings in the list is determined. An inverted index generated on a first set of documents is queried using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set. A second set of documents identified by the set of document identifiers is retrieved from the first set of documents. The second set of documents is filtered to include one or more documents of the second set that each includes a match with at least one entity string of the list of entity strings. Entity recognition may be performed on the filtered second set of documents.
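The filtering steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the whitespace tokenizer, the choice of one token set per entity string (all of its tokens), the contiguous-token match, and every function name below are assumptions made only for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase whitespace tokenization (an assumption for this sketch)."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each token to the set of ids of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def candidate_doc_ids(index, token_sets):
    """Union, over all token sets, of the documents containing every token in a set."""
    result = set()
    for token_set in token_sets:
        postings = [index.get(token, set()) for token in token_set]
        if postings:
            result |= set.intersection(*postings)
    return result


def contains_entity(text, entity):
    """True if the entity's tokens appear as a contiguous run in the text."""
    doc_tokens, ent_tokens = tokenize(text), tokenize(entity)
    n = len(ent_tokens)
    if n == 0:
        return False
    return any(doc_tokens[i:i + n] == ent_tokens
               for i in range(len(doc_tokens) - n + 1))


def filter_documents(docs, entities):
    """Index lookup first, then an exact check on the retrieved candidates only."""
    index = build_inverted_index(docs)
    # Simplification (assumption): one token set per entity string.
    token_sets = [frozenset(tokenize(e)) for e in entities]
    candidates = candidate_doc_ids(index, token_sets)
    return sorted(
        doc_id for doc_id in candidates
        if any(contains_entity(docs[doc_id], e) for e in entities)
    )


# Hypothetical sample data.
docs = {
    1: "Canon EOS 5D review",
    2: "EOS was released; the 5D from Canon",
    3: "Nikon D850 specs",
}
entities = ["canon eos 5d", "nikon d850"]
matching_ids = filter_documents(docs, entities)
```

In this sketch the index lookup only narrows the collection; document 2 contains all three tokens of the first entity but not as a phrase, so it survives the lookup and is dropped by the final match check.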

Type: Grant

Filed: June 3, 2014

Date of Patent: November 22, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
Tagging entities with descriptive phrases

Patent number: 9298825

Abstract: A plurality of description phrases associated with a first domain may be determined, based on an analysis of a first plurality of documents to determine co-occurrences of the description phrases with one or more name labels associated with the first domain. An entity associated with the first domain may be obtained. An analysis of a second plurality of documents may be initiated to identify co-occurrences of mentions of the obtained entity and one or more of the plurality of description phrases, and contexts associated with each of the co-occurrences of the mentions and description phrases, in each one of the second plurality of documents. A description tag association between the obtained entity and one of the description phrases may be determined, based on an analysis of the identified contexts.

Type: Grant

Filed: November 17, 2011

Date of Patent: March 29, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Tao Cheng
RANKING TABLES FOR KEYWORD SEARCH

Publication number: 20160012052

Abstract: The present invention extends to methods, systems, and computer program products for ranking tables for keyword search. Aspects of the invention include generating lists of candidate tables for inclusion in a search query response, computing table hit matrices, retrieving content from fields of candidate tables having keyword hits, generating ranking features of tables, and computing ranking scores for tables. Aspects of the invention can be used to match keywords against column names, to match keywords against values in subject and non-subject columns, and to match keywords against table descriptions like page titles, table captions, cell values, nearest headings and surrounding text. Which keywords are matched against which fields can depend on the table and/or the query (referred to as “late binding”).

Type: Application

Filed: July 8, 2014

Publication date: January 14, 2016

Inventors: Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
ANNOTATING STRUCTURED DATA FOR SEARCH

Publication number: 20160012091

Abstract: The present invention extends to methods, systems, and computer program products for annotating structured data for search. Aspects of the invention include associating structured data, such as, for example, tables, with additional content to improve indexing of the structured data for search and/or provide improved search results for structured data. Web pages can include tables as well as other content. The other content in a web page, such as, for example, content outside the <table> and </table> tags of a web table, can be useful in supporting searches for web tables. Content in one web page can also be useful in supporting searches for a table in another web page.

Type: Application

Filed: July 8, 2014

Publication date: January 14, 2016

Inventors: Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
COMPUTING FEATURES OF STRUCTURED DATA

Publication number: 20160012051

Abstract: The present invention extends to methods, systems, and computer program products for computing features of structured data. Aspects of the invention include computing features of table components (e.g., of rows, columns, cells, etc.). Computed features can be used for ranking the table components. When aggregated, features for different components of a table can be used for ranking the table (e.g., a web table).

Type: Application

Filed: July 8, 2014

Publication date: January 14, 2016

Inventors: Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
UNDERSTANDING TABLES FOR SEARCH

Publication number: 20150379057

Abstract: The present invention extends to methods, systems, and computer program products for understanding tables for search. Aspects of the invention include identifying a subject column for a table, detecting a column header using other tables, and detecting a column header using a knowledge base. Implementations can be utilized in a structured data search system (SDSS) that indexes structured information, such as tables in a relational database or HTML tables extracted from web pages. The SDSS allows users to search over the structured information (tables) using different mechanisms including keyword search and data finding data.

Type: Application

Filed: October 2, 2014

Publication date: December 31, 2015

Inventors: Zhongyuan Wang, Kanstantsyn Zoryn, Zhimin Chen, Kaushik Chakrabarti, James P. Finnigan, Vivek R. Narasayya, Surajit Chaudhuri, Kris Ganjam
FINDING PATTERNS IN A KNOWLEDGE BASE TO COMPOSE TABLE ANSWERS

Publication number: 20150310073

Abstract: In general, the knowledge base table composer embodiments described herein provide table answers to keyword queries against one or more knowledge bases. Highly relevant patterns in a knowledge base are found for user-given keyword queries. These patterns are used to compose table answers. To this end, a knowledge base is modeled as a directed graph called a knowledge graph, where nodes represent entities in the knowledge base and edges represent the relationships among them. Each node/edge is labeled with a type and text. A pattern that is an aggregation of subtrees which contain all keywords in the texts and have the same structure and types on node/edges is sought. Patterns that are relevant to a query for a class can be found using a set of scoring functions. Furthermore, path-based indexes and various query-processing procedures can be employed to speed up processing.

Type: Application

Filed: April 29, 2014

Publication date: October 29, 2015

Applicant: Microsoft Corporation

Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Bolin Ding, Mohan Yang
Entity augmentation service from latent relational data

Patent number: 9171081

Abstract: The subject disclosure is directed towards providing data for augmenting an entity-attribute-related task. Pre-processing is performed on entity-attribute tables extracted from the web, e.g., to provide indexes that are accessible to find data that completes augmentation tasks. The indexes are based on both direct mappings and indirect mappings between tables. Example augmentation tasks include queries for augmented data based on an attribute name or examples, or finding synonyms for augmentation. An online query is efficiently processed by accessing the indexes to return augmented data related to the task.

Type: Grant

Filed: March 6, 2012

Date of Patent: October 27, 2015

Assignee: Microsoft Technology Licensing, LLC

Inventors: Kris K. Ganjam, Kaushik Chakrabarti, Mohamed A. Yakout, Surajit Chaudhuri
SEMANTIC MATCHING AND ANNOTATION OF ATTRIBUTES

Publication number: 20150227589

Abstract: Techniques and constructs to facilitate semantic matching and automated annotation (SMA) of attributes can take entity names and a keyword describing an attribute associated with the named entities as input and leverage a corpus of data such as data from tables, which can include HTML web tables, to automatically populate values associated with the named entities for the attribute. The constructs enable accurate SMA of attributes, such as attributes that relate to the entity and include numeric values in a different unit than the query, in a different scale than the query, and/or reflecting a time different from that of the query. An entity augmentation application programming interface (API) may be used to accept queries that include numeric criteria, parameters, or arguments, including query attributes represented by numeric values, which may be in different units or scales, and attributes represented by numeric values that can vary by time.

Type: Application

Filed: February 10, 2014

Publication date: August 13, 2015

Applicant: Microsoft Corporation

Inventors: Kaushik Chakrabarti, Meihui Zhang
