Patents by Inventor JOSEPH M. KAUFMANN
JOSEPH M. KAUFMANN has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11397854
Abstract: Embodiments provide a computer implemented method for generating a domain-specific thesaurus on a cognitive system, comprising: receiving data of the domain-specific corpus and a plurality of terms of interest from a user; splitting the data of the domain-specific corpus into a plurality of sentences using natural language processing techniques; for each term in the plurality of terms of interest, retrieving a plurality of candidate sentences containing a corresponding term, from the plurality of sentences; for each candidate sentence, providing a list of synonyms of the corresponding term, wherein the synonyms are contextual alternatives in the corresponding candidate sentence; for each term in the plurality of terms of interest, tracking a frequency of each synonym, and forming a frequency map including all the synonyms of a corresponding term and the frequency of each synonym; and generating a domain-specific thesaurus based on a combination of all the synonyms in the frequency map.
Type: Grant
Filed: April 26, 2019
Date of Patent: July 26, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy
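Illustrative sketch (not part of the patent record): the steps named in the abstract above (sentence splitting, candidate retrieval, per-sentence synonym lists, a frequency map, and a combined thesaurus) could be approximated as follows. The regex sentence splitter, the `suggest_synonyms` callback, and the `min_count` cutoff are all assumptions introduced for illustration; the abstract does not say how synonyms are produced or how the frequency map is combined.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable, List

# Hypothetical signature: (candidate sentence, term) -> contextual synonyms.
SynonymFn = Callable[[str, str], Iterable[str]]


def split_sentences(text: str) -> List[str]:
    # Naive splitter standing in for a real NLP sentence segmenter.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def frequency_map(sentences: List[str], term: str,
                  suggest_synonyms: SynonymFn) -> Counter:
    # Retrieve candidate sentences containing the term, then count how
    # often each suggested synonym appears across those sentences.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    freq: Counter = Counter()
    for sentence in sentences:
        if pattern.search(sentence):
            freq.update(s.lower() for s in suggest_synonyms(sentence, term))
    return freq


def build_thesaurus(corpus: str, terms: Iterable[str],
                    suggest_synonyms: SynonymFn,
                    min_count: int = 1) -> Dict[str, List[str]]:
    # Assumed combination rule: keep synonyms seen at least min_count
    # times, ordered from most to least frequent.
    sentences = split_sentences(corpus)
    thesaurus: Dict[str, List[str]] = {}
    for term in terms:
        freq = frequency_map(sentences, term, suggest_synonyms)
        thesaurus[term] = [s for s, c in freq.most_common() if c >= min_count]
    return thesaurus
```

Passing the synonym source in as a callback keeps the sketch independent of any particular language model or lexical resource.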
-
Patent number: 10832145
Abstract: A technique for resolving entities provided in a question includes creating respective entity context vectors (ECVs) for respective entities in an applicable knowledge graph (KG). A question is received from a user. A first entity is identified in the question. The first entity is associated with a matching one of the entities in the KG. An ECV for the matching one of the entities in the KG is modified. An answer to the question is generated based on the modified ECV.
Type: Grant
Filed: October 5, 2015
Date of Patent: November 10, 2020
Assignee: International Business Machines Corporation
Inventors: Swaminathan Chandrasekaran, Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy
-
Patent number: 10832146
Abstract: Embodiments are directed to a method of utilizing an ensemble of distributional semantics systems in conjunction with a domain term extractor for generating domain-specific synonyms. The method allows for extraction of high-quality, domain-specific synonyms that can be used in an information handling system, such as a question-answer system or in an information retrieval (IR) system, capable of processing natural language. According to embodiments, the domain term extractor identifies the words for which synonyms are sought, and the ensemble of distributional semantics systems determines the synonyms.
Type: Grant
Filed: January 19, 2016
Date of Patent: November 10, 2020
Assignee: International Business Machines Corporation
Inventors: Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy, Pablo N. Mendes
-
Publication number: 20200342061
Abstract: Embodiments provide a computer implemented method for generating a domain-specific thesaurus on a cognitive system, comprising: receiving data of the domain-specific corpus and a plurality of terms of interest from a user; splitting the data of the domain-specific corpus into a plurality of sentences using natural language processing techniques; for each term in the plurality of terms of interest, retrieving a plurality of candidate sentences containing a corresponding term, from the plurality of sentences; for each candidate sentence, providing a list of synonyms of the corresponding term, wherein the synonyms are contextual alternatives in the corresponding candidate sentence; for each term in the plurality of terms of interest, tracking a frequency of each synonym, and forming a frequency map including all the synonyms of a corresponding term and the frequency of each synonym; and generating a domain-specific thesaurus based on a combination of all the synonyms in the frequency map.
Type: Application
Filed: April 26, 2019
Publication date: October 29, 2020
Inventors: Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy
-
Patent number: 10754886
Abstract: A natural language query (NLQ) is translated to a structured data query (e.g., a SQL statement) by extracting entities from the NLQ and replacing them with generic variables to form a generic query. The generic query is associated with a structured question type which includes structured data variables using natural language classifiers (NLCs). Specific data is inserted in the structured question type in relation to the structured data variables based on the extracted entities to form the structured data query. An ensemble of NLCs trained with different ground truths can be used to yield multiple candidate question types. One of the candidate question types is selected based on confidence levels. The multiple NLCs can include an NLC which is optimized according to a focus of the generic query. For example, an NLC can be optimized for a specific data structure (such as SQL), or for comparative queries.
Type: Grant
Filed: October 5, 2016
Date of Patent: August 25, 2020
Assignee: International Business Machines Corporation
Inventors: Ryan R. Anderson, Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy, Niyati Parameswaran
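Illustrative sketch (not part of the patent record): the flow described in the abstract above (replace entities with generic variables, let several classifiers propose a question type, pick the most confident, then fill in the extracted values) might look roughly like this. The `ENTITIES` lexicon, the `TEMPLATES` table, the table and column names, and the classifier signature are hypothetical; real natural language classifiers would be trained models rather than plain functions.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical entity lexicon: surface form -> (generic variable, value).
ENTITIES: Dict[str, Tuple[str, object]] = {
    "texas": ("@state", "TX"),
    "2015": ("@year", 2015),
}

# Hypothetical question types mapped to parameterized SQL templates.
TEMPLATES: Dict[str, str] = {
    "sales_by_state_year":
        "SELECT SUM(amount) FROM sales WHERE state = :state AND year = :year",
}

# A classifier maps a generic query to (question type, confidence).
Classifier = Callable[[str], Tuple[str, float]]


def generalize(nlq: str) -> Tuple[str, Dict[str, object]]:
    # Replace known entities with generic variables and remember the values.
    bindings: Dict[str, object] = {}
    tokens: List[str] = []
    for word in nlq.rstrip("?.!").split():
        hit = ENTITIES.get(word.lower())
        if hit:
            variable, value = hit
            bindings[variable.lstrip("@")] = value
            tokens.append(variable)
        else:
            tokens.append(word)
    return " ".join(tokens), bindings


def translate(nlq: str,
              classifiers: List[Classifier]) -> Tuple[str, Dict[str, object]]:
    # Ask every classifier for a candidate type and keep the most confident.
    generic, bindings = generalize(nlq)
    question_type, _confidence = max(
        (clf(generic) for clf in classifiers), key=lambda result: result[1]
    )
    return TEMPLATES[question_type], bindings
```

The sketch returns bound parameters rather than splicing values into the SQL string, a choice made here to avoid injection issues and not something the abstract specifies.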
-
Patent number: 10572601
Abstract: An approach is provided that improves a question answering (QA) computer system by automatically generating relationship templates. Event patterns are extracted from data in a corpus utilized by the QA computer system. The extracted event patterns are analyzed with the analysis resulting in a number of clusters of related event patterns. Relationship templates are then created from the plurality of clusters of related event patterns and these relationship templates are then utilized to visually interact with the corpus.
Type: Grant
Filed: July 28, 2017
Date of Patent: February 25, 2020
Assignee: International Business Machines Corporation
Inventors: Eddy Hudson, Joseph M. Kaufmann, Niyati Parameswaran
-
Patent number: 10558760
Abstract: An approach is provided that improves a question answering (QA) computer system by automatically generating relationship templates. Event patterns are extracted from data in a corpus utilized by the QA computer system. The extracted event patterns are analyzed with the analysis resulting in a number of clusters of related event patterns. Relationship templates are then created from the plurality of clusters of related event patterns and these relationship templates are then utilized to visually interact with the corpus.
Type: Grant
Filed: October 23, 2017
Date of Patent: February 11, 2020
Assignee: International Business Machines Corporation
Inventors: Eddy Hudson, Joseph M. Kaufmann, Niyati Parameswaran
-
Patent number: 10303683
Abstract: A natural language query (NLQ) is translated to a structured data query (e.g., a SQL statement) by extracting entities from the NLQ and replacing them with generic variables to form a generic query. The generic query is associated with a structured question type which includes structured data variables using natural language classifiers (NLCs). Specific data is inserted in the structured question type in relation to the structured data variables based on the extracted entities to form the structured data query. An ensemble of NLCs trained with different ground truths can be used to yield multiple candidate question types. One of the candidate question types is selected based on confidence levels. The multiple NLCs can include an NLC which is optimized according to a focus of the generic query. For example, an NLC can be optimized for a specific data structure (such as SQL), or for comparative queries.
Type: Grant
Filed: October 5, 2016
Date of Patent: May 28, 2019
Assignee: International Business Machines Corporation
Inventors: Ryan R. Anderson, Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy, Niyati Parameswaran
-
Publication number: 20190034410
Abstract: An approach is provided that improves a question answering (QA) computer system by automatically generating relationship templates. Event patterns are extracted from data in a corpus utilized by the QA computer system. The extracted event patterns are analyzed with the analysis resulting in a number of clusters of related event patterns. Relationship templates are then created from the plurality of clusters of related event patterns and these relationship templates are then utilized to visually interact with the corpus.
Type: Application
Filed: October 23, 2017
Publication date: January 31, 2019
Inventors: Eddy Hudson, Joseph M. Kaufmann, Niyati Parameswaran
-
Publication number: 20190034408
Abstract: An approach is provided that improves a question answering (QA) computer system by automatically generating relationship templates. Event patterns are extracted from data in a corpus utilized by the QA computer system. The extracted event patterns are analyzed with the analysis resulting in a number of clusters of related event patterns. Relationship templates are then created from the plurality of clusters of related event patterns and these relationship templates are then utilized to visually interact with the corpus.
Type: Application
Filed: July 28, 2017
Publication date: January 31, 2019
Inventors: Eddy Hudson, Joseph M. Kaufmann, Niyati Parameswaran
-
Publication number: 20180095962
Abstract: A natural language query (NLQ) is translated to a structured data query (e.g., a SQL statement) by extracting entities from the NLQ and replacing them with generic variables to form a generic query. The generic query is associated with a structured question type which includes structured data variables using natural language classifiers (NLCs). Specific data is inserted in the structured question type in relation to the structured data variables based on the extracted entities to form the structured data query. An ensemble of NLCs trained with different ground truths can be used to yield multiple candidate question types. One of the candidate question types is selected based on confidence levels. The multiple NLCs can include an NLC which is optimized according to a focus of the generic query. For example, an NLC can be optimized for a specific data structure (such as SQL), or for comparative queries.
Type: Application
Filed: October 5, 2016
Publication date: April 5, 2018
Inventors: Ryan R. Anderson, Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy, Niyati Parameswaran
-
Publication number: 20180096058
Abstract: A natural language query (NLQ) is translated to a structured data query (e.g., a SQL statement) by extracting entities from the NLQ and replacing them with generic variables to form a generic query. The generic query is associated with a structured question type which includes structured data variables using natural language classifiers (NLCs). Specific data is inserted in the structured question type in relation to the structured data variables based on the extracted entities to form the structured data query. An ensemble of NLCs trained with different ground truths can be used to yield multiple candidate question types. One of the candidate question types is selected based on confidence levels. The multiple NLCs can include an NLC which is optimized according to a focus of the generic query. For example, an NLC can be optimized for a specific data structure (such as SQL), or for comparative queries.
Type: Application
Filed: October 5, 2016
Publication date: April 5, 2018
Inventors: Ryan R. Anderson, Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy, Niyati Parameswaran
-
Publication number: 20170220936
Abstract: Embodiments of the invention relate to identification of material that contains linguistically related content. Key phrases are filtered through a content store to ascertain the linguistically related content and to move the identified content to a target corpus. At least two iterations of the filtering process are employed. Each subsequent iteration of the filtering process identifies at least one new key phrase within the filtered material. In addition, each subsequent iteration takes place with a union of each previously employed key phrase and each new key phrase. As new content is identified, the content is populated to the target corpus.
Type: Application
Filed: January 29, 2016
Publication date: August 3, 2017
Applicant: International Business Machines Corporation
Inventors: Daniel F. Gruhl, Joseph M. Kaufmann, Joseph N. Kozhaya, Pablo N. Mendes, Sridhar Sudarsan
-
Publication number: 20170220584
Abstract: Embodiments of the invention relate to identification of material that contains linguistically related content. Key phrases are filtered through a content store to ascertain the linguistically related content and to move the identified content to a target corpus. At least two iterations of the filtering process are employed. Each subsequent iteration of the filtering process identifies at least one new key phrase within the filtered material. In addition, each subsequent iteration takes place with a union of each previously employed key phrase and each new key phrase. As new content is identified, the content is populated to the target corpus.
Type: Application
Filed: February 22, 2016
Publication date: August 3, 2017
Applicant: International Business Machines Corporation
Inventors: Daniel F. Gruhl, Joseph M. Kaufmann, Joseph N. Kozhaya, Pablo N. Mendes, Sridhar Sudarsan
-
Publication number: 20170206453
Abstract: Embodiments are directed to a method of utilizing an ensemble of distributional semantics systems in conjunction with a domain term extractor for generating domain-specific synonyms. The method allows for extraction of high-quality, domain-specific synonyms that can be used in an information handling system, such as a question-answer system or in an information retrieval (IR) system, capable of processing natural language. According to embodiments, the domain term extractor identifies the words for which synonyms are sought, and the ensemble of distributional semantics systems determines the synonyms.
Type: Application
Filed: January 19, 2016
Publication date: July 20, 2017
Inventors: Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy, Pablo N. Mendes
-
Publication number: 20170098163
Abstract: A technique for resolving entities provided in a question includes creating respective entity context vectors (ECVs) for respective entities in an applicable knowledge graph (KG). A question is received from a user. A first entity is identified in the question. The first entity is associated with a matching one of the entities in the KG. An ECV for the matching one of the entities in the KG is modified. An answer to the question is generated based on the modified ECV.
Type: Application
Filed: October 5, 2015
Publication date: April 6, 2017
Inventors: Swaminathan Chandrasekaran, Joseph M. Kaufmann, Lakshminarayanan Krishnamurthy