Patents by Inventor Kaushik Chakrabarti

Kaushik Chakrabarti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ENRICHING PRODUCT CATALOG WITH SEARCH KEYWORDS

Publication number: 20150154682

Abstract: A keyword generator identifies words or phrases of interest in a product catalog and also identifies synonyms for the words or phrases of interest. The synonyms are integrated into the product catalog to generate an enriched product catalog. The enriched product catalog is published for use in one or more commercial channels.

Type: Application

Filed: April 15, 2014

Publication date: June 4, 2015

Applicant: MICROSOFT CORPORATION

Inventors: Raghu Ram, Kaushik Chakrabarti, Meera Mahabala, Navid Azimi-Garakani, Tao Cheng, Yeye He
ENRICHING PRODUCT CATALOG WITH PRODUCT NAME KEYWORDS

Publication number: 20150154681

Abstract: A keyword generator identifies words or phrases of interest in a product catalog and also identifies synonyms for the words or phrases of interest. The synonyms are integrated into the product catalog to generate an enriched product catalog. The level of co-occurrence of synonyms between sets of product catalog entries is identified and, if it meets a threshold level, the product names from the catalog entries are integrated into the other catalog entries in the set, as synonyms.

Type: Application

Filed: April 15, 2014

Publication date: June 4, 2015

Applicant: Microsoft Corporation

Inventors: Raghu Ram, Kaushik Chakrabarti, Meera Mahabala, Navid Azimi-Garakani, Tao Cheng, Yeye He
PROGRESSIVE SPATIAL SEARCHING USING AUGMENTED STRUCTURES

Publication number: 20150088904

Abstract: A location associated with a user of a computing device and a prefix portion of an input string may be received as one or more successive characters of the input string are provided by the user via the computing device. A list of suggested items may be obtained based on a function of respective recommendation indicators and proximities of the items to the location in response to receiving the prefix portion, and based on partially traversing a character string search structure having a plurality of non-terminal nodes augmented with bound indicators associated with spatial regions. The list of suggested items and descriptive information associated with each suggested item may be returned to the user, in response to receiving the prefix portion, for rendering an image illustrating indicators associated with the list in a manner relative to the location, as the user provides each successive character of the input string.

Type: Application

Filed: November 30, 2014

Publication date: March 26, 2015

Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Senjuti Basu Roy
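
The abstract above describes a character string search structure whose non-terminal nodes carry spatial bound indicators. A minimal sketch of that idea follows, assuming a plain trie in which each node stores the bounding box of every item beneath it. The names `Node`, `insert`, and `suggest` are hypothetical, and ranking uses distance only, whereas the abstract also mentions recommendation indicators.

```python
import heapq
import math


class Node:
    """Trie node augmented with the bounding box of all items below it."""

    def __init__(self):
        self.children = {}
        self.items = []   # (name, x, y) tuples whose name ends at this node
        self.box = None   # (min_x, min_y, max_x, max_y), an assumed bound format


def _grow(box, x, y):
    # Extend a bounding box so that it also covers the point (x, y).
    if box is None:
        return (x, y, x, y)
    return (min(box[0], x), min(box[1], y), max(box[2], x), max(box[3], y))


def insert(root, name, x, y):
    node = root
    node.box = _grow(node.box, x, y)
    for ch in name.lower():
        node = node.children.setdefault(ch, Node())
        node.box = _grow(node.box, x, y)
    node.items.append((name, x, y))


def _min_dist(box, x, y):
    # Lower bound on the distance from (x, y) to any point inside the box.
    dx = max(box[0] - x, 0, x - box[2])
    dy = max(box[1] - y, 0, y - box[3])
    return math.hypot(dx, dy)


def suggest(root, prefix, x, y, k=3):
    """Return up to k item names matching the prefix, nearest to (x, y) first."""
    node = root
    for ch in prefix.lower():
        node = node.children.get(ch)
        if node is None:
            return []
    if node.box is None:
        return []
    counter = 0  # tie-breaker so heap entries never compare nodes directly
    heap = [(_min_dist(node.box, x, y), counter, node)]
    results = []
    while heap and len(results) < k:
        _, _, entry = heapq.heappop(heap)
        if isinstance(entry, tuple):       # a concrete item: emit it
            results.append(entry[0])
            continue
        for item in entry.items:           # exact distance for items
            counter += 1
            dist = math.hypot(item[1] - x, item[2] - y)
            heapq.heappush(heap, (dist, counter, item))
        for child in entry.children.values():  # lower bound for subtrees
            counter += 1
            heapq.heappush(heap, (_min_dist(child.box, x, y), counter, child))
    return results
```

Because each box distance is a lower bound on the distance to anything in that subtree, the best-first traversal can stop after k items without visiting far-away branches, which is the "partial traversal" the abstract refers to.
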
RETRIEVAL OF ATTRIBUTE VALUES BASED UPON IDENTIFIED ENTITIES

Publication number: 20150019540

Abstract: Various technologies that facilitate performance of a data finding data (DFD) search are described herein. A user specifies entities, for example, by entering the entities into a query field, selecting the entities from a computer-executable application, or the like. The user further specifies an attribute of the entities that is of interest. A query is constructed based upon the entities and the attribute, and a search for tables is performed based upon the entities and the attribute. Values of the attribute for the selected entities are identified in a table, and the values of the attribute are returned.

Type: Application

Filed: May 21, 2014

Publication date: January 15, 2015

Applicant: Microsoft Corporation

Inventors: Kris Ganjam, Zhimin Chen, Kaushik Chakrabarti, Surajit Chaudhuri, Vivek Narasayya, James Finnigan, Kanstantsyn Zoryn
PERFORMING AN OPERATION RELATIVE TO TABULAR DATA BASED UPON VOICE INPUT

Publication number: 20150019216

Abstract: Described herein are various technologies pertaining to performing an operation relative to tabular data based upon voice input. An ASR system includes a language model that is customized based upon content of the tabular data. The ASR system receives a voice signal that is representative of speech of a user. The ASR system creates a transcription of the voice signal based upon the ASR being customized with the content of the tabular data. The operation relative to the tabular data is performed based upon the transcription of the voice signal.

Type: Application

Filed: May 21, 2014

Publication date: January 15, 2015

Applicant: Microsoft Corporation

Inventors: Prabhdeep Singh, Kris Ganjam, Sumit Gulwani, Mark Marron, Yun-Cheng Ju, Kaushik Chakrabarti
Progressive spatial searching using augmented structures

Patent number: 8930391

Abstract: A location associated with a user of a computing device and a prefix portion of an input string may be received as one or more successive characters of the input string are provided by the user via the computing device. A list of suggested items may be obtained based on a function of respective recommendation indicators and proximities of the items to the location in response to receiving the prefix portion, and based on partially traversing a character string search structure having a plurality of non-terminal nodes augmented with bound indicators associated with spatial regions. The list of suggested items and descriptive information associated with each suggested item may be returned to the user, in response to receiving the prefix portion, for rendering an image illustrating indicators associated with the list in a manner relative to the location, as the user provides each successive character of the input string.

Type: Grant

Filed: December 29, 2010

Date of Patent: January 6, 2015

Assignee: Microsoft Corporation

Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Senjuti Basu Roy
SCALABLE LOOKUP-DRIVEN ENTITY EXTRACTION FROM INDEXED DOCUMENT COLLECTIONS

Publication number: 20140351274

Abstract: A set of documents is filtered for entity extraction. A list of entity strings is received. A set of token sets that covers the entity strings in the list is determined. An inverted index generated on a first set of documents is queried using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set. A second set of documents identified by the set of document identifiers is retrieved from the first set of documents. The second set of documents is filtered to include one or more documents of the second set that each includes a match with at least one entity string of the list of entity strings. Entity recognition may be performed on the filtered second set of documents.

Type: Application

Filed: June 3, 2014

Publication date: November 27, 2014

Applicant: Microsoft Corporation

Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
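
The filtering steps in the abstract above can be sketched as follows. This is a simplified illustration, not the patented method: it assumes whitespace tokenization and uses one token set per entity string (all of its tokens) rather than a computed covering set. The function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def candidate_doc_ids(index, entity_strings):
    """Query the index with one token set per entity (an assumed simplification)."""
    candidates = set()
    for entity in entity_strings:
        tokens = entity.lower().split()
        if not tokens:
            continue
        postings = [index.get(token, set()) for token in tokens]
        # A document is a candidate only if it contains every token of the entity.
        candidates |= set.intersection(*postings)
    return candidates


def filter_documents(docs, entity_strings):
    """Keep only candidate documents that contain an entity as a contiguous phrase."""
    index = build_inverted_index(docs)
    phrases = [" " + " ".join(e.lower().split()) + " "
               for e in entity_strings if e.split()]
    kept = {}
    for doc_id in sorted(candidate_doc_ids(index, entity_strings)):
        padded = " " + " ".join(docs[doc_id].lower().split()) + " "
        if any(phrase in padded for phrase in phrases):
            kept[doc_id] = docs[doc_id]
    return kept
```

The index lookup narrows the collection cheaply, and the phrase check then discards documents that contain the tokens but not the entity string itself. Entity recognition would run only on the documents that survive both steps.
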
Fast personalized page rank on map reduce

Patent number: 8856047

Abstract: A personalized page rank computation system is described herein that provides a fast MapReduce method for Monte Carlo approximation of personalized PageRank vectors of all the nodes in a graph. The method presented is both faster and less computationally intensive than existing methods, allowing a broader scope of problems to be solved by existing computing hardware. The system adopts the Monte Carlo approach and provides a method to compute single random walks of a given length for all nodes in a graph that is superior in terms of the number of map-reduce iterations among a broad class of methods. The resulting solution reduces the I/O cost and outperforms the state-of-the-art FPPR approximation methods in terms of efficiency and approximation error. Thus, the system can very efficiently perform single random walks of a given length starting at each node in the graph and can very efficiently approximate all the personalized PageRank vectors.

Type: Grant

Filed: June 21, 2011

Date of Patent: October 7, 2014

Assignee: Microsoft Corporation

Inventors: Kaushik Chakrabarti, Dong Xin, Bahman Bahmani
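
As a rough illustration of the Monte Carlo approach named in the abstract above, the sketch below estimates the personalized PageRank vector of one source node by simulating random walks on a single machine. It is not the MapReduce method of the patent. The function name, the teleport probability of 0.15, and the adjacency-list graph format are assumptions made for the example.

```python
import random
from collections import Counter


def monte_carlo_ppr(graph, source, num_walks=10_000, max_length=50,
                    teleport=0.15, seed=0):
    """Estimate personalized PageRank for `source` from random-walk endpoints.

    graph: dict mapping a node to a list of its out-neighbors (assumed format).
    Each walk stops with probability `teleport` at every step, at a node with
    no out-neighbors, or after `max_length` steps, whichever comes first.
    """
    rng = random.Random(seed)
    endpoints = Counter()
    for _ in range(num_walks):
        node = source
        for _ in range(max_length):
            neighbors = graph.get(node, [])
            if not neighbors or rng.random() < teleport:
                break
            node = rng.choice(neighbors)
        endpoints[node] += 1
    # The fraction of walks ending at a node approximates its PPR score.
    return {node: count / num_walks for node, count in endpoints.items()}
```

Calling `monte_carlo_ppr({"a": ["b"], "b": ["a"]}, "a")` returns a dictionary of scores that sum to 1, with more mass on the source node. Raising `num_walks` reduces the approximation error at the cost of more simulation.
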
Scalable lookup-driven entity extraction from indexed document collections

Patent number: 8782061

Abstract: A set of documents is filtered for entity extraction. A list of entity strings is received. A set of token sets that covers the entity strings in the list is determined. An inverted index generated on a first set of documents is queried using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set. A second set of documents identified by the set of document identifiers is retrieved from the first set of documents. The second set of documents is filtered to include one or more documents of the second set that each includes a match with at least one entity string of the list of entity strings. Entity recognition may be performed on the filtered second set of documents.

Type: Grant

Filed: June 24, 2008

Date of Patent: July 15, 2014

Assignee: Microsoft Corporation

Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
Robust discovery of entity synonyms using query logs

Patent number: 8745019

Abstract: A similarity analysis framework is described herein which leverages two or more similarity analysis functions to generate synonyms for an entity reference string r_e. The functions are selected such that the synonyms that are generated by the framework satisfy a core set of synonym-related properties. The functions operate by leveraging query log data. One similarity analysis function takes into consideration the strength of similarity between a particular candidate string s_e and an entity reference string r_e even in the presence of sparse query log data, while another function takes into account the classes of s_e and r_e. The framework also provides indexing mechanisms that expedite its computations. The framework also provides a reduction module for converting long entity reference strings into shorter strings, where each shorter string (if found) contains a subset of the terms in its longer counterpart.

Type: Grant

Filed: June 4, 2012

Date of Patent: June 3, 2014

Assignee: Microsoft Corporation

Inventors: Tao Cheng, Kaushik Chakrabarti, Surajit Chaudhuri, Dong Xin
Data Services for Enterprises Leveraging Search System Data Assets

Publication number: 20130346464

Abstract: A data service system is described herein which processes raw data assets from at least one network-accessible system (such as a search system), to produce processed data assets. Enterprise applications can then leverage the processed data assets to perform various environment-specific tasks. In one implementation, the data service system can generate any of: synonym resources for use by an enterprise application in providing synonyms for specified terms associated with entities; augmentation resources for use by an enterprise application in providing supplemental information for specified seed information; and spelling-correction resources for use by an enterprise application in providing spelling information for specified terms, and so on.

Type: Application

Filed: June 20, 2012

Publication date: December 26, 2013

Applicant: MICROSOFT CORPORATION

Inventors: Tao Cheng, Kris Ganjam, Kaushik Chakrabarti, Zhimin Chen, Vivek R. Narasayya, Surajit Chaudhuri
TARGETED DISAMBIGUATION OF NAMED ENTITIES

Publication number: 20130346421

Abstract: A targeted disambiguation system is described herein which determines true mentions of a list of named entities in a collection of documents. The list of named entities is homogenous in the sense that the entities pertain to the same subject matter domain. The system determines the true mentions by leveraging the homogeneity in the list, and, more specifically by applying a context similarity hypothesis, a co-mention hypothesis, and an interdependency hypothesis. In one implementation, the system executes its analysis using a graph-based model. The system can operate without the existence of additional information regarding the entities in the list; nevertheless, if such information is available, the system can integrate it into its analysis.

Type: Application

Filed: June 22, 2012

Publication date: December 26, 2013

Applicant: Microsoft Corporation

Inventors: Chi Wang, Kaushik Chakrabarti, Tao Cheng, Surajit Chaudhuri
Entity Augmentation Service from Latent Relational Data

Publication number: 20130238621

Abstract: The subject disclosure is directed towards providing data for augmenting an entity-attribute-related task. Pre-processing is performed on entity-attribute tables extracted from the web, e.g., to provide indexes that are accessible to find data that completes augmentation tasks. The indexes are based on both direct mappings and indirect mappings between tables. Example augmentation tasks include queries for augmented data based on an attribute name or examples, or finding synonyms for augmentation. An online query is efficiently processed by accessing the indexes to return augmented data related to the task.

Type: Application

Filed: March 6, 2012

Publication date: September 12, 2013

Applicant: Microsoft Corporation

Inventors: Kris K. Ganjam, Kaushik Chakrabarti, Mohamed A. Yakout, Surajit Chaudhuri
ROBUST DISCOVERY OF ENTITY SYNONYMS USING QUERY LOGS

Publication number: 20130232129

Abstract: A similarity analysis framework is described herein which leverages two or more similarity analysis functions to generate synonyms for an entity reference string r_e. The functions are selected such that the synonyms that are generated by the framework satisfy a core set of synonym-related properties. The functions operate by leveraging query log data. One similarity analysis function takes into consideration the strength of similarity between a particular candidate string s_e and an entity reference string r_e even in the presence of sparse query log data, while another function takes into account the classes of s_e and r_e. The framework also provides indexing mechanisms that expedite its computations. The framework also provides a reduction module for converting long entity reference strings into shorter strings, where each shorter string (if found) contains a subset of the terms in its longer counterpart.

Type: Application

Filed: June 4, 2012

Publication date: September 5, 2013

Applicant: MICROSOFT CORPORATION

Inventors: Tao Cheng, Kaushik Chakrabarti, Surajit Chaudhuri, Dong Xin
TAGGING ENTITIES WITH DESCRIPTIVE PHRASES

Publication number: 20130132381

Abstract: A plurality of description phrases associated with a first domain may be determined, based on an analysis of a first plurality of documents to determine co-occurrences of the description phrases with one or more name labels associated with the first domain. An entity associated with the first domain may be obtained. An analysis of a second plurality of documents may be initiated to identify co-occurrences of mentions of the obtained entity and one or more of the plurality of description phrases, and contexts associated with each of the co-occurrences of the mentions and description phrases, in each one of the second plurality of documents. A description tag association between the obtained entity and one of the description phrases may be determined, based on an analysis of the identified contexts.

Type: Application

Filed: November 17, 2011

Publication date: May 23, 2013

Applicant: MICROSOFT CORPORATION

Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Tao Cheng
FAST PERSONALIZED PAGE RANK ON MAP REDUCE

Publication number: 20120330864

Abstract: A personalized page rank computation system is described herein that provides a fast MapReduce method for Monte Carlo approximation of personalized PageRank vectors of all the nodes in a graph. The method presented is both faster and less computationally intensive than existing methods, allowing a broader scope of problems to be solved by existing computing hardware. The system adopts the Monte Carlo approach and provides a method to compute single random walks of a given length for all nodes in a graph that is superior in terms of the number of map-reduce iterations among a broad class of methods. The resulting solution reduces the I/O cost and outperforms the state-of-the-art FPPR approximation methods in terms of efficiency and approximation error. Thus, the system can very efficiently perform single random walks of a given length starting at each node in the graph and can very efficiently approximate all the personalized PageRank vectors.

Type: Application

Filed: June 21, 2011

Publication date: December 27, 2012

Applicant: Microsoft Corporation

Inventors: Kaushik Chakrabarti, Dong Xin, Bahman Bahmani
PROGRESSIVE SPATIAL SEARCHING USING AUGMENTED STRUCTURES

Publication number: 20120173500

Abstract: A location associated with a user of a computing device and a prefix portion of an input string may be received as one or more successive characters of the input string are provided by the user via the computing device. A list of suggested items may be obtained based on a function of respective recommendation indicators and proximities of the items to the location in response to receiving the prefix portion, and based on partially traversing a character string search structure having a plurality of non-terminal nodes augmented with bound indicators associated with spatial regions. The list of suggested items and descriptive information associated with each suggested item may be returned to the user, in response to receiving the prefix portion, for rendering an image illustrating indicators associated with the list in a manner relative to the location, as the user provides each successive character of the input string.

Type: Application

Filed: December 29, 2010

Publication date: July 5, 2012

Applicant: MICROSOFT CORPORATION

Inventors: Kaushik Chakrabarti, Surajit Chaudhuri
Finding related entity results for search queries

Patent number: 8195655

Abstract: Architecture for finding related entities for web search queries. An extraction component takes a document as input and outputs all the mentions (or occurrences) of named entities such as names of people, organizations, locations, and products in the document, as well as entity metadata. An indexing component takes a document identifier (docID) and the set of mentions of named entities, and stores and indexes the information for retrieval. A document-based search component takes a keyword query and returns the docIDs of the top documents matching the query. A retrieval component takes a docID as input, accesses the information stored by the indexing component, and returns the set of mentions of named entities in the document. This information is then passed to an entity scoring and thresholding component that computes an aggregate score of each entity and selects the entities to return to the user.

Type: Grant

Filed: June 5, 2007

Date of Patent: June 5, 2012

Assignee: Microsoft Corporation

Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
Pushing Search Query Constraints Into Information Retrieval Processing

Publication number: 20110320446

Abstract: This patent application relates to interval-based information retrieval (IR) search techniques for efficiently and correctly answering keyword search queries. In some embodiments, a range of information-containing blocks for a search query can be identified. Each of these blocks, and thus the range, can include document identifiers that identify individual corresponding documents that contain a term found in the search query. From the range, a subrange(s) having a smaller number of blocks than the range can be selected. This can be accomplished without decompressing the blocks by partitioning the range into intervals and evaluating the intervals. The smaller number of blocks in the subrange(s) can then be decompressed and processed to identify a doc ID(s) and thus document(s) that satisfies the query.

Type: Application

Filed: June 25, 2010

Publication date: December 29, 2011

Applicant: MICROSOFT CORPORATION

Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
Membership checking of digital text

Patent number: 8037069

Abstract: The described implementations relate to data analysis, such as membership checking. One technique identifies candidate matches between document sub-strings and database members utilizing signatures. The technique further verifies that the candidate matches are true matches.

Type: Grant

Filed: June 3, 2008

Date of Patent: October 11, 2011

Assignee: Microsoft Corporation

Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti, Dong Xin
