Patents by Inventor Joseph M. Kaufmann

Joseph M. Kaufmann has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11397854
    Abstract: Embodiments provide a computer implemented method for generating a domain-specific thesaurus on a cognitive system, comprising: receiving data of the domain-specific corpus and a plurality of terms of interest from a user; splitting the data of the domain-specific corpus into a plurality of sentences using natural language processing techniques; for each term in the plurality of terms of interest, retrieving a plurality of candidate sentences containing a corresponding term, from the plurality of sentences; for each candidate sentence, providing a list of synonyms of the corresponding term, wherein the synonyms are contextual alternatives in the corresponding candidate sentence; for each term in the plurality of terms of interest, tracking a frequency of each synonym, and forming a frequency map including all the synonyms of a corresponding term and the frequency of each synonym; and generating a domain-specific thesaurus based on a combination of all the synonyms in the frequency map.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: July 26, 2022
    Assignee: International Business Machines Corporation
    Inventors: Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy
  • Patent number: 10832145
    Abstract: A technique for resolving entities provided in a question includes creating respective entity context vectors (ECVs) for respective entities in an applicable knowledge graph (KG). A question is received from a user. A first entity is identified in the question. The first entity is associated with a matching one of the entities in the KG. An ECV for the matching one of the entities in the KG is modified. An answer to the question is generated based on the modified ECV.
    Type: Grant
    Filed: October 5, 2015
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Swaminathan Chandrasekaran, Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy
  • Patent number: 10832146
    Abstract: Embodiments are directed to a method of utilizing an ensemble of distributional semantics systems in conjunction with a domain term extractor for generating domain-specific synonyms. The method allows for extraction of high-quality, domain-specific synonyms that can be used in an information handling system, such as a question-answer system or in an information retrieval (IR) system, capable of processing natural language. According to embodiments, the domain term extractor identifies the words for which synonyms are sought, and the ensemble of distributional semantics systems determines the synonyms.
    Type: Grant
    Filed: January 19, 2016
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy, Pablo N. Mendes
  • Publication number: 20200342061
    Abstract: Embodiments provide a computer implemented method for generating a domain-specific thesaurus on a cognitive system, comprising: receiving data of the domain-specific corpus and a plurality of terms of interest from a user; splitting the data of the domain-specific corpus into a plurality of sentences using natural language processing techniques; for each term in the plurality of terms of interest, retrieving a plurality of candidate sentences containing a corresponding term, from the plurality of sentences; for each candidate sentence, providing a list of synonyms of the corresponding term, wherein the synonyms are contextual alternatives in the corresponding candidate sentence; for each term in the plurality of terms of interest, tracking a frequency of each synonym, and forming a frequency map including all the synonyms of a corresponding term and the frequency of each synonym; and generating a domain-specific thesaurus based on a combination of all the synonyms in the frequency map.
    Type: Application
    Filed: April 26, 2019
    Publication date: October 29, 2020
    Inventors: Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy
  • Patent number: 10754886
    Abstract: A natural language query (NLQ) is translated to a structured data query (e.g., a SQL statement) by extracting entities from the NLQ and replacing them with generic variables to form a generic query. The generic query is associated with a structured question type which includes structured data variables using natural language classifiers (NLCs). Specific data is inserted in the structured question type in relation to the structured data variables based on the extracted entities to form the structured data query. An ensemble of NLCs trained with different ground truths can be used to yield multiple candidate question types. One of the candidate question types is selected based on confidence levels. The multiple NLCs can include an NLC which is optimized according to a focus of the generic query. For example, an NLC can be optimized for a specific data structure (such as SQL), or for comparative queries.
    Type: Grant
    Filed: October 5, 2016
    Date of Patent: August 25, 2020
    Assignee: International Business Machines Corporation
    Inventors: Ryan R. Anderson, Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy, Niyati Parameswaran
  • Patent number: 10572601
    Abstract: An approach is provided that improves a question answering (QA) computer system by automatically generating relationship templates. Event patterns are extracted from data in a corpus utilized by the QA computer system. The extracted event patterns are analyzed with the analysis resulting in a number of clusters of related event patterns. Relationship templates are then created from the plurality of clusters of related event patterns and these relationship templates are then utilized to visually interact with the corpus.
    Type: Grant
    Filed: July 28, 2017
    Date of Patent: February 25, 2020
    Assignee: International Business Machines Corporation
    Inventors: Eddy Hudson, Joseph M. Kaufmann, Niyati Parameswaran
  • Patent number: 10558760
    Abstract: An approach is provided that improves a question answering (QA) computer system by automatically generating relationship templates. Event patterns are extracted from data in a corpus utilized by the QA computer system. The extracted event patterns are analyzed with the analysis resulting in a number of clusters of related event patterns. Relationship templates are then created from the plurality of clusters of related event patterns and these relationship templates are then utilized to visually interact with the corpus.
    Type: Grant
    Filed: October 23, 2017
    Date of Patent: February 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Eddy Hudson, Joseph M. Kaufmann, Niyati Parameswaran
  • Patent number: 10303683
    Abstract: A natural language query (NLQ) is translated to a structured data query (e.g., a SQL statement) by extracting entities from the NLQ and replacing them with generic variables to form a generic query. The generic query is associated with a structured question type which includes structured data variables using natural language classifiers (NLCs). Specific data is inserted in the structured question type in relation to the structured data variables based on the extracted entities to form the structured data query. An ensemble of NLCs trained with different ground truths can be used to yield multiple candidate question types. One of the candidate question types is selected based on confidence levels. The multiple NLCs can include an NLC which is optimized according to a focus of the generic query. For example, an NLC can be optimized for a specific data structure (such as SQL), or for comparative queries.
    Type: Grant
    Filed: October 5, 2016
    Date of Patent: May 28, 2019
    Assignee: International Business Machines Corporation
    Inventors: Ryan R. Anderson, Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy, Niyati Parameswaran
  • Publication number: 20190034410
    Abstract: An approach is provided that improves a question answering (QA) computer system by automatically generating relationship templates. Event patterns are extracted from data in a corpus utilized by the QA computer system. The extracted event patterns are analyzed with the analysis resulting in a number of clusters of related event patterns. Relationship templates are then created from the plurality of clusters of related event patterns and these relationship templates are then utilized to visually interact with the corpus.
    Type: Application
    Filed: October 23, 2017
    Publication date: January 31, 2019
    Inventors: Eddy Hudson, Joseph M. Kaufmann, Niyati Parameswaran
  • Publication number: 20190034408
    Abstract: An approach is provided that improves a question answering (QA) computer system by automatically generating relationship templates. Event patterns are extracted from data in a corpus utilized by the QA computer system. The extracted event patterns are analyzed with the analysis resulting in a number of clusters of related event patterns. Relationship templates are then created from the plurality of clusters of related event patterns and these relationship templates are then utilized to visually interact with the corpus.
    Type: Application
    Filed: July 28, 2017
    Publication date: January 31, 2019
    Inventors: Eddy Hudson, Joseph M. Kaufmann, Niyati Parameswaran
  • Publication number: 20180095962
    Abstract: A natural language query (NLQ) is translated to a structured data query (e.g., a SQL statement) by extracting entities from the NLQ and replacing them with generic variables to form a generic query. The generic query is associated with a structured question type which includes structured data variables using natural language classifiers (NLCs). Specific data is inserted in the structured question type in relation to the structured data variables based on the extracted entities to form the structured data query. An ensemble of NLCs trained with different ground truths can be used to yield multiple candidate question types. One of the candidate question types is selected based on confidence levels. The multiple NLCs can include an NLC which is optimized according to a focus of the generic query. For example, an NLC can be optimized for a specific data structure (such as SQL), or for comparative queries.
    Type: Application
    Filed: October 5, 2016
    Publication date: April 5, 2018
    Inventors: Ryan R. Anderson, Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy, Niyati Parameswaran
  • Publication number: 20180096058
    Abstract: A natural language query (NLQ) is translated to a structured data query (e.g., a SQL statement) by extracting entities from the NLQ and replacing them with generic variables to form a generic query. The generic query is associated with a structured question type which includes structured data variables using natural language classifiers (NLCs). Specific data is inserted in the structured question type in relation to the structured data variables based on the extracted entities to form the structured data query. An ensemble of NLCs trained with different ground truths can be used to yield multiple candidate question types. One of the candidate question types is selected based on confidence levels. The multiple NLCs can include an NLC which is optimized according to a focus of the generic query. For example, an NLC can be optimized for a specific data structure (such as SQL), or for comparative queries.
    Type: Application
    Filed: October 5, 2016
    Publication date: April 5, 2018
    Inventors: Ryan R. Anderson, Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy, Niyati Parameswaran
  • Publication number: 20170220936
    Abstract: Embodiments of the invention relate to identification of material that contains linguistically related content. Key phrases are filtered through a content store to ascertain the linguistically related content and to move the identified content to a target corpus. At least two iterations of the filtering process are employed. Each subsequent iteration of the filtering process identifies at least one new key phrase within the filtered material. In addition, each subsequent iteration takes place with a union of each previously employed key phrase and each new key phrase. As new content is identified, the content is populated to the target corpus.
    Type: Application
    Filed: January 29, 2016
    Publication date: August 3, 2017
    Applicant: International Business Machines Corporation
    Inventors: Daniel F. Gruhl, Joseph M. Kaufmann, Joseph N. Kozhaya, Pablo N. Mendes, Sridhar Sudarsan
  • Publication number: 20170220584
    Abstract: Embodiments of the invention relate to identification of material that contains linguistically related content. Key phrases are filtered through a content store to ascertain the linguistically related content and to move the identified content to a target corpus. At least two iterations of the filtering process are employed. Each subsequent iteration of the filtering process identifies at least one new key phrase within the filtered material. In addition, each subsequent iteration takes place with a union of each previously employed key phrase and each new key phrase. As new content is identified, the content is populated to the target corpus.
    Type: Application
    Filed: February 22, 2016
    Publication date: August 3, 2017
    Applicant: International Business Machines Corporation
    Inventors: Daniel F. Gruhl, Joseph M. Kaufmann, Joseph N. Kozhaya, Pablo N. Mendes, Sridhar Sudarsan
  • Publication number: 20170206453
    Abstract: Embodiments are directed to a method of utilizing an ensemble of distributional semantics systems in conjunction with a domain term extractor for generating domain-specific synonyms. The method allows for extraction of high-quality, domain-specific synonyms that can be used in an information handling system, such as a question-answer system or in an information retrieval (IR) system, capable of processing natural language. According to embodiments, the domain term extractor identifies the words for which synonyms are sought, and the ensemble of distributional semantics systems determines the synonyms.
    Type: Application
    Filed: January 19, 2016
    Publication date: July 20, 2017
    Inventors: Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy, Pablo N. Mendes
  • Publication number: 20170098163
    Abstract: A technique for resolving entities provided in a question includes creating respective entity context vectors (ECVs) for respective entities in an applicable knowledge graph (KG). A question is received from a user. A first entity is identified in the question. The first entity is associated with a matching one of the entities in the KG. An ECV for the matching one of the entities in the KG is modified. An answer to the question is generated based on the modified ECV.
    Type: Application
    Filed: October 5, 2015
    Publication date: April 6, 2017
    Inventors: Swaminathan Chandrasekaran, Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy
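
Illustrative sketch: the abstract of patent 11397854 (domain-specific thesaurus generation) describes a sequence of steps: split the corpus into sentences, retrieve candidate sentences for each term of interest, collect contextual synonyms per candidate sentence, track synonym frequencies, and build the thesaurus from the frequency map. The Python below is a minimal sketch of that flow and is not the patented implementation. The names split_sentences, build_thesaurus, toy_suggester, and min_count are hypothetical, the regex sentence splitter stands in for the "natural language processing techniques" the abstract mentions, and the synonym source is a caller-supplied function because the abstract does not say how synonyms are produced.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable, List


def split_sentences(corpus: str) -> List[str]:
    """Naive regex splitter; an assumption standing in for a real NLP segmenter."""
    parts = re.split(r"(?<=[.!?])\s+", corpus.strip())
    return [p for p in parts if p]


def build_thesaurus(
    corpus: str,
    terms: Iterable[str],
    suggest_synonyms: Callable[[str, str], Iterable[str]],
    min_count: int = 1,
) -> Dict[str, List[str]]:
    """Build {term: [synonyms]} ordered by how often each synonym was suggested.

    suggest_synonyms(term, sentence) is a hypothetical hook that returns
    contextual alternatives for `term` within `sentence`.
    """
    sentences = split_sentences(corpus)
    thesaurus: Dict[str, List[str]] = {}
    for term in terms:
        # Candidate sentences: those containing the term as a whole word.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        candidates = [s for s in sentences if pattern.search(s)]

        # Frequency map: synonym -> number of times it was suggested.
        freq: Counter = Counter()
        for sentence in candidates:
            freq.update(suggest_synonyms(term, sentence))

        # Keep synonyms that meet the (hypothetical) frequency threshold.
        thesaurus[term] = [syn for syn, n in freq.most_common() if n >= min_count]
    return thesaurus


def toy_suggester(term: str, sentence: str) -> List[str]:
    """Toy stand-in for a contextual synonym model, for demonstration only."""
    if term != "engine":
        return []
    if "aircraft" in sentence.lower():
        return ["powerplant", "motor"]
    return ["motor"]


if __name__ == "__main__":
    text = "The engine failed. The aircraft engine was replaced. Check the pump."
    print(build_thesaurus(text, ["engine"], toy_suggester))
```

Passing the synonym source as a function keeps the sketch independent of any particular language model or lexical resource; raising min_count would drop synonyms that were suggested only rarely.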