Patents by Inventor Kaushik Chakrabarti

Kaushik Chakrabarti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150154682
    Abstract: A keyword generator identifies words or phrases of interest in a product catalog and also identifies synonyms for the words or phrases of interest. The synonyms are integrated into the product catalog to generate an enriched product catalog. The enriched product catalog is published for use in one or more commercial channels.
    Type: Application
    Filed: April 15, 2014
    Publication date: June 4, 2015
    Applicant: MICROSOFT CORPORATION
    Inventors: Raghu Ram, Kaushik Chakrabarti, Meera Mahabala, Navid Azimi-Garakani, Tao Cheng, Yeye He
  • Publication number: 20150154681
    Abstract: A keyword generator identifies words or phrases of interest in a product catalog and also identifies synonyms for the words or phrases of interest. The synonyms are integrated into the product catalog to generate an enriched product catalog. The level of co-occurrence of synonyms between sets of product catalog entries is identified and, if it meets a threshold level, the product names from the catalog entries are integrated into the other catalog entries in the set, as synonyms.
    Type: Application
    Filed: April 15, 2014
    Publication date: June 4, 2015
    Applicant: Microsoft Corporation
    Inventors: Raghu Ram, Kaushik Chakrabarti, Meera Mahabala, Navid Azimi-Garakani, Tao Cheng, Yeye He
  • Publication number: 20150088904
    Abstract: A location associated with a user of a computing device and a prefix portion of an input string may be received as one or more successive characters of the input string are provided by the user via the computing device. A list of suggested items may be obtained based on a function of respective recommendation indicators and proximities of the items to the location in response to receiving the prefix portion, and based on partially traversing a character string search structure having a plurality of non-terminal nodes augmented with bound indicators associated with spatial regions. The list of suggested items and descriptive information associated with each suggested item may be returned to the user, in response to receiving the prefix portion, for rendering an image illustrating indicators associated with the list in a manner relative to the location, as the user provides each successive character of the input string.
    Type: Application
    Filed: November 30, 2014
    Publication date: March 26, 2015
    Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Senjuti Basu Roy
  • Publication number: 20150019540
    Abstract: Various technologies that facilitate performance of a data finding data (DFD) search are described herein. A user specifies entities, for example, by entering the entities into a query field, selecting the entities from a computer-executable application, or the like. The user further specifies an attribute of the entities that is of interest. A query is constructed based upon the entities and the attribute, and a search for tables is performed based upon the entities and the attribute. Values of the attribute for the selected entities are identified in a table, and the values of the attribute are returned.
    Type: Application
    Filed: May 21, 2014
    Publication date: January 15, 2015
    Applicant: Microsoft Corporation
    Inventors: Kris Ganjam, Zhimin Chen, Kaushik Chakrabarti, Surajit Chaudhuri, Vivek Narasayya, James Finnigan, Kanstantsyn Zoryn
  • Publication number: 20150019216
    Abstract: Described herein are various technologies pertaining to performing an operation relative to tabular data based upon voice input. An ASR system includes a language model that is customized based upon content of the tabular data. The ASR system receives a voice signal that is representative of speech of a user. The ASR system creates a transcription of the voice signal based upon the ASR being customized with the content of the tabular data. The operation relative to the tabular data is performed based upon the transcription of the voice signal.
    Type: Application
    Filed: May 21, 2014
    Publication date: January 15, 2015
    Applicant: Microsoft Corporation
    Inventors: Prabhdeep Singh, Kris Ganjam, Sumit Gulwani, Mark Marron, Yun-Cheng Ju, Kaushik Chakrabarti
  • Patent number: 8930391
    Abstract: A location associated with a user of a computing device and a prefix portion of an input string may be received as one or more successive characters of the input string are provided by the user via the computing device. A list of suggested items may be obtained based on a function of respective recommendation indicators and proximities of the items to the location in response to receiving the prefix portion, and based on partially traversing a character string search structure having a plurality of non-terminal nodes augmented with bound indicators associated with spatial regions. The list of suggested items and descriptive information associated with each suggested item may be returned to the user, in response to receiving the prefix portion, for rendering an image illustrating indicators associated with the list in a manner relative to the location, as the user provides each successive character of the input string.
    Type: Grant
    Filed: December 29, 2010
    Date of Patent: January 6, 2015
    Assignee: Microsoft Corporation
    Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Senjuti Basu Roy
  • Publication number: 20140351274
    Abstract: A set of documents is filtered for entity extraction. A list of entity strings is received. A set of token sets that covers the entity strings in the list is determined. An inverted index generated on a first set of documents is queried using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set. A second set of documents identified by the set of document identifiers is retrieved from the first set of documents. The second set of documents is filtered to include one or more documents of the second set that each includes a match with at least one entity string of the list of entity strings. Entity recognition may be performed on the filtered second set of documents.
    Type: Application
    Filed: June 3, 2014
    Publication date: November 27, 2014
    Applicant: Microsoft Corporation
    Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
  • Patent number: 8856047
    Abstract: A personalized page rank computation system is described herein that provides a fast MapReduce method for Monte Carlo approximation of personalized PageRank vectors of all the nodes in a graph. The method presented is both faster and less computationally intensive than existing methods, allowing a broader scope of problems to be solved by existing computing hardware. The system adopts the Monte Carlo approach and provides a method to compute single random walks of a given length for all nodes in a graph that it is superior in terms of the number of map-reduce iterations among a broad class of methods. The resulting solution reduces the I/O cost and outperforms the state-of-the-art FPPR approximation methods, in terms of efficiency and approximation error. Thus, the system can very efficiently perform single random walks of a given length starting at each node in the graph and can very efficiently approximate all the personalized PageRank vectors.
    Type: Grant
    Filed: June 21, 2011
    Date of Patent: October 7, 2014
    Assignee: Microsoft Corporation
    Inventors: Kaushik Chakrabarti, Dong Xin, Bahman Bahmani
  • Patent number: 8782061
    Abstract: A set of documents is filtered for entity extraction. A list of entity strings is received. A set of token sets that covers the entity strings in the list is determined. An inverted index generated on a first set of documents is queried using the set of token sets to determine a set of document identifiers for a subset of the documents in the first set. A second set of documents identified by the set of document identifiers is retrieved from the first set of documents. The second set of documents is filtered to include one or more documents of the second set that each includes a match with at least one entity string of the list of entity strings. Entity recognition may be performed on the filtered second set of documents.
    Type: Grant
    Filed: June 24, 2008
    Date of Patent: July 15, 2014
    Assignee: Microsoft Corporation
    Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
  • Patent number: 8745019
    Abstract: A similarity analysis framework is described herein which leverages two or more similarity analysis functions to generate synonyms for an entity reference string re. The functions are selected such that the synonyms that are generated by the framework satisfy a core set of synonym-related properties. The functions operate by leveraging query log data. One similarity analysis function takes into consideration the strength of similarity between a particular candidate string se and an entity reference string re even in the presence of sparse query log data, while another function takes into account the classes of se and re. The framework also provides indexing mechanisms that expedite its computations. The framework also provides a reduction module for converting long entity reference strings into shorter strings, where each shorter string (if found) contains a subset of the terms in its longer counterpart.
    Type: Grant
    Filed: June 4, 2012
    Date of Patent: June 3, 2014
    Assignee: Microsoft Corporation
    Inventors: Tao Cheng, Kaushik Chakrabarti, Surajit Chaudhuri, Dong Xin
  • Publication number: 20130346464
    Abstract: A data service system is described herein which processes raw data assets from at least one network-accessible system (such as a search system), to produce processed data assets. Enterprise applications can then leverage the processed data assets to perform various environment-specific tasks. In one implementation, the data service system can generate any of: synonym resources for use by an enterprise application in providing synonyms for specified terms associated with entities; augmentation resources for use by an enterprise application in providing supplemental information for specified seed information; and spelling-correction resources for use by an enterprise application in providing spelling information for specified terms, and so on.
    Type: Application
    Filed: June 20, 2012
    Publication date: December 26, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Tao Cheng, Kris Ganjam, Kaushik Chakrabarti, Zhimin Chen, Vivek R. Narasayya, Surajit Chaudhuri
  • Publication number: 20130346421
    Abstract: A targeted disambiguation system is described herein which determines true mentions of a list of named entities in a collection of documents. The list of named entities is homogenous in the sense that the entities pertain to the same subject matter domain. The system determines the true mentions by leveraging the homogeneity in the list, and, more specifically by applying a context similarity hypothesis, a co-mention hypothesis, and an interdependency hypothesis. In one implementation, the system executes its analysis using a graph-based model. The system can operate without the existence of additional information regarding the entities in the list; nevertheless, if such information is available, the system can integrate it into its analysis.
    Type: Application
    Filed: June 22, 2012
    Publication date: December 26, 2013
    Applicant: Microsoft Corporation
    Inventors: Chi Wang, Kaushik Chakrabarti, Tao Cheng, Surajit Chaudhuri
  • Publication number: 20130238621
    Abstract: The subject disclosure is directed towards providing data for augmenting an entity-attribute-related task. Pre-processing is preformed on entity-attribute tables extracted from the web, e.g., to provide indexes that are accessible to find data that completes augmentation tasks. The indexes are based on both direct mappings and indirect mappings between tables. Example augmentation tasks include queries for augmented data based on an attribute name or examples, or finding synonyms for augmentation. An online query is efficiently processed by accessing the indexes to return augmented data related to the task.
    Type: Application
    Filed: March 6, 2012
    Publication date: September 12, 2013
    Applicant: Microsoft Corporation
    Inventors: Kris K. Ganjam, Kaushik Chakrabarti, Mohamed A. Yakout, Surajit Chaudhuri
  • Publication number: 20130232129
    Abstract: A similarity analysis framework is described herein which leverages two or more similarity analysis functions to generate synonyms for an entity reference string re. The functions are selected such that the synonyms that are generated by the framework satisfy a core set of synonym-related properties. The functions operate by leveraging query log data. One similarity analysis function takes into consideration the strength of similarity between a particular candidate string se and an entity reference string re even in the presence of sparse query log data, while another function takes into account the classes of se and re. The framework also provides indexing mechanisms that expedite its computations. The framework also provides a reduction module for converting long entity reference strings into shorter strings, where each shorter string (if found) contains a subset of the terms in its longer counterpart.
    Type: Application
    Filed: June 4, 2012
    Publication date: September 5, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Tao Cheng, Kaushik Chakrabarti, Surajit Chaudhuri, Dong Xin
  • Publication number: 20130132381
    Abstract: A plurality of description phrases associated with a first domain may be determined, based on an analysis of a first plurality of documents to determine co-occurrences of the description phrases with one or more name labels associated with the first domain. An entity associated with the first domain may be obtained. An analysis of a second plurality of documents may be initiated to identify co-occurrences of mentions of the obtained entity and one or more of the plurality of description phrases, and contexts associated with each of the co-occurrences of the mentions and description phrases, in each one of the second plurality of documents. A description tag association between the obtained entity and one of the description phrases may be determined, based on an analysis of the identified contexts.
    Type: Application
    Filed: November 17, 2011
    Publication date: May 23, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Tao Cheng
  • Publication number: 20120330864
    Abstract: A personalized page rank computation system is described herein that provides a fast MapReduce method for Monte Carlo approximation of personalized PageRank vectors of all the nodes in a graph. The method presented is both faster and less computationally intensive than existing methods, allowing a broader scope of problems to be solved by existing computing hardware. The system adopts the Monte Carlo approach and provides a method to compute single random walks of a given length for all nodes in a graph that it is superior in terms of the number of map-reduce iterations among a broad class of methods. The resulting solution reduces the I/O cost and outperforms the state-of-the-art FPPR approximation methods, in terms of efficiency and approximation error. Thus, the system can very efficiently perform single random walks of a given length starting at each node in the graph and can very efficiently approximate all the personalized PageRank vectors.
    Type: Application
    Filed: June 21, 2011
    Publication date: December 27, 2012
    Applicant: Microsoft Corporation
    Inventors: Kaushik Chakrabarti, Dong Xin, Bahman Bahmani
  • Publication number: 20120173500
    Abstract: A location associated with a user of a computing device and a prefix portion of an input string may be received as one or more successive characters of the input string are provided by the user via the computing device. A list of suggested items may be obtained based on a function of respective recommendation indicators and proximities of the items to the location in response to receiving the prefix portion, and based on partially traversing a character string search structure having a plurality of non-terminal nodes augmented with bound indicators associated with spatial regions. The list of suggested items and descriptive information associated with each suggested item may be returned to the user, in response to receiving the prefix portion, for rendering an image illustrating indicators associated with the list in a manner relative to the location, as the user provides each successive character of the input string.
    Type: Application
    Filed: December 29, 2010
    Publication date: July 5, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Kaushik Chakrabarti, Surajit Chaudhuri
  • Patent number: 8195655
    Abstract: Architecture for finding related entities for web search queries. An extraction component takes a document as input and outputs all the mentions (or occurrences) of named entities such as names of people, organizations, locations, and products in the document, as well as entity metadata. An indexing component takes a document identifier (docID) and the set of mentions of named entities and, stores and indexes the information for retrieval. A document-based search component takes a keyword query and returns the docIDs of the top documents matching with the query. A retrieval component takes a docID as input, accesses the information stored by the indexing component and returns the set of mentions of named entities in the document. This information is then passed to an entity scoring and thresholding component that computes an aggregate score of each entity and selects the entities to return to the user.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: June 5, 2012
    Assignee: Microsoft Corporation
    Inventors: Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
  • Publication number: 20110320446
    Abstract: This patent application relates to interval-based information retrieval (IR) search techniques for efficiently and correctly answering keyword search queries. In some embodiments, a range of information-containing blocks for a search query can be identified. Each of these blocks, and thus the range, can include document identifiers that identify individual corresponding documents that contain a term found in the search query. From the range, a subrange(s) having a smaller number of blocks than the range can be selected. This can be accomplished without decompressing the blocks by partitioning the range into intervals and evaluating the intervals. The smaller number of blocks in the subranges(s) can then be decompressed and processed to identify a doc ID(s) and thus document(s) that satisfies the query.
    Type: Application
    Filed: June 25, 2010
    Publication date: December 29, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti
  • Patent number: 8037069
    Abstract: The described implementations relate to data analysis, such as membership checking. One technique identifies candidate matches between document sub-strings and database members utilizing signatures. The technique further verifies that the candidate matches are true matches.
    Type: Grant
    Filed: June 3, 2008
    Date of Patent: October 11, 2011
    Assignee: Microsoft Corporation
    Inventors: Kaushik Chakrabarti, Surajt Chaudhuri, Venkatesh Ganti, Dong Xin