Patents by Inventor Philip E. Parker

Philip E. Parker has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systematic tuning of text analytic annotators

Patent number: 10803254

Abstract: A data structure is generated containing enumerators for data types of a domain, text forms of the enumerators and context patterns for the text forms. The data structure also includes information extraction rules that are associated with the enumerators. The data structure is updated with additional context patterns and text forms that are identified within a set of documents to which text analytic annotators are to be tuned. The set of documents is analyzed against the updated data structure and additional extraction rules are generated based on the analysis.

Type: Grant

Filed: July 13, 2018

Date of Patent: October 13, 2020

Assignee: International Business Machines Corporation

Inventors: Harish Deshmukh, Philip E. Parker, Roger C. Raphael, Paul S. Taylor, Gabriel Valencia
Generating training data for machine learning

Patent number: 10719781

Abstract: A computer-implemented method includes receiving a rule, wherein the rule includes at least one token, and receiving at least two dictionaries, wherein the at least two dictionaries include at least one general language dictionary and at least one domain-specific dictionary for a domain. The computer-implemented method further includes, for each of the at least one token, selecting at least one word at random from at least one of the at least two dictionaries and adding the at least one word to a test data line, such that the test data line includes a candidate statement conforming to the rule. The computer-implemented method further includes filtering the candidate statement based on a domain-specific model for the domain and including the candidate statement in training data provided to a machine learning model. A corresponding computer program product and computer system are also disclosed.

Type: Grant

Filed: July 17, 2017

Date of Patent: July 21, 2020

Assignee: International Business Machines Corporation

Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker, David B. Werts
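The generation steps named in this abstract (fill each rule token with a randomly chosen dictionary word, then filter the candidate with a domain model) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the dictionary contents, the part-of-speech style token names, and the keyword check that stands in for the domain-specific model. None of it is taken from the patent itself.

```python
import random

# Hypothetical dictionaries keyed by token type; contents are invented.
GENERAL = {"VERB": ["has", "reports", "denies"], "NOUN": ["patient", "person"]}
DOMAIN = {"NOUN": ["fever", "asthma", "cough"]}


def generate_candidate(rule, dictionaries, rng):
    """Fill each token of the rule with a random word.

    For every token, one dictionary that has entries for that token is
    picked at random, then one word is picked at random from it.
    """
    words = []
    for token in rule:
        sources = [d[token] for d in dictionaries if token in d]
        words.append(rng.choice(rng.choice(sources)))
    return " ".join(words)


def domain_filter(statement, domain_terms):
    """Stand-in for the domain-specific model (an assumption):
    keep statements that mention at least one domain term."""
    return any(word in domain_terms for word in statement.split())


def build_training_data(rule, n, seed=0):
    """Generate n candidates for the rule and keep those passing the filter."""
    rng = random.Random(seed)
    domain_terms = {w for words in DOMAIN.values() for w in words}
    candidates = (
        generate_candidate(rule, [GENERAL, DOMAIN], rng) for _ in range(n)
    )
    return [c for c in candidates if domain_filter(c, domain_terms)]
```

Calling `build_training_data(["NOUN", "VERB", "NOUN"], 50)` returns only three-word statements that conform to the rule and contain at least one domain term; a real system would replace the keyword filter with a trained model.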
Determining correlation between medical symptoms and environmental factors

Patent number: 10698982

Abstract: A method, a processing device, and a computer program product are provided. Unstructured text may be analyzed to identify medical condition information of multiple occurrences of a medical condition for at least one subject. Times and geographic locations corresponding to the multiple occurrences of the medical condition may be obtained. Environmental information that corresponds to the times and the geographic locations of the multiple medical condition occurrences may be retrieved. Correlations between the medical condition information and the retrieved environmental information for the at least one subject may be determined. Environmental factors affecting the medical condition, based on the determined correlations, are identified.

Type: Grant

Filed: February 24, 2016

Date of Patent: June 30, 2020

Assignee: International Business Machines Corporation

Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker, David B. Werts
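As a rough illustration of the correlation step in this abstract, the sketch below ranks environmental factors by the strength of their Pearson correlation with symptom severity. The choice of Pearson correlation, the severity scale, and the factor names are assumptions made for the example; the abstract does not specify how correlations are computed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0
    return cov / (dev_x * dev_y)


def rank_factors(severity, environment):
    """Rank environmental factors by absolute correlation with severity.

    severity: one value per recorded occurrence of the condition.
    environment: hypothetical mapping of factor name to readings taken at
    the same times and locations as the occurrences.
    """
    scores = {name: pearson(severity, values) for name, values in environment.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```

For example, with severities `[1, 2, 3, 4]`, a pollen series that rises in step with them ranks above a humidity series that never changes.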
Generating training data for machine learning

Patent number: 10679144

Abstract: A computer-implemented method includes receiving a rule, wherein the rule includes at least one token, and receiving at least two dictionaries, wherein the at least two dictionaries include at least one general language dictionary and at least one domain-specific dictionary for a domain. The computer-implemented method further includes, for each of the at least one token, selecting at least one word at random from at least one of the at least two dictionaries and adding the at least one word to a test data line, such that the test data line includes a candidate statement conforming to the rule. The computer-implemented method further includes filtering the candidate statement based on a domain-specific model for the domain and including the candidate statement in training data provided to a machine learning model. A corresponding computer program product and computer system are also disclosed.

Type: Grant

Filed: July 12, 2016

Date of Patent: June 9, 2020

Assignee: International Business Machines Corporation

Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker, David B. Werts
Natural language processing of formatted documents

Patent number: 10628525

Abstract: Detecting and incorporating formatting characteristics within natural language processing analytics. Source documents are ingested and the markup formatting language is identified by the program. Once identified, the markup language is parsed and examined for formatting characteristics, embedded notes, comments and other metadata. The formatting characteristics of the plain text are extracted, along with the plain text, and converted into a common analysis structure (CAS), or CAS-equivalent structure, which annotates the natural language text together with its respective formatting characteristics. The CAS or CAS-equivalent structures are stored and sent to a natural language processing pipeline for further analysis via complex algorithms and rules. The natural language processing results data are curated to reflect meaningful analysis of the extracted CAS or CAS-equivalent structure.

Type: Grant

Filed: May 17, 2017

Date of Patent: April 21, 2020

Assignee: International Business Machines Corporation

Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker, David B. Werts
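The idea of keeping formatting characteristics alongside the plain text can be sketched in Python with the standard-library `html.parser` module. This is only an illustration under stated assumptions: it handles HTML alone, the set of formatting tags is arbitrary, and the list of `(start, end, tags)` tuples is a simplified stand-in for a CAS or CAS-equivalent structure, not the structure the patent describes.

```python
from html.parser import HTMLParser


class FormattingExtractor(HTMLParser):
    """Collects plain text plus character spans that carry formatting."""

    FORMAT_TAGS = {"b", "i", "em", "strong", "u", "h1", "h2"}  # assumed set

    def __init__(self):
        super().__init__()
        self.active = []       # formatting tags currently open
        self.plain = ""        # accumulated plain text
        self.annotations = []  # (start, end, tags) over self.plain

    def handle_starttag(self, tag, attrs):
        if tag in self.FORMAT_TAGS:
            self.active.append(tag)

    def handle_endtag(self, tag):
        if tag in self.active:
            # Close the most recently opened tag of this name.
            index = len(self.active) - 1 - self.active[::-1].index(tag)
            del self.active[index]

    def handle_data(self, data):
        start = len(self.plain)
        self.plain += data
        if self.active and data.strip():
            self.annotations.append((start, len(self.plain), tuple(self.active)))


def extract(markup):
    """Return (plain_text, formatting_annotations) for an HTML fragment."""
    parser = FormattingExtractor()
    parser.feed(markup)
    parser.close()
    return parser.plain, parser.annotations
```

A downstream pipeline could then treat a bold span differently from surrounding text, for instance by weighting it as emphasized.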
Configurable analytics framework for assistance needs detection

Patent number: 10628520

Abstract: According to an embodiment of the present invention, a system dynamically processes a document including unstructured text and comprises a computer system including at least one processor. Initially, the system configures a plurality of dictionaries with terms supplied by a user and associated with a desired category. The processor in the system applies a set of rules to the unstructured text of the document to detect patterns indicating a presence of the desired category, wherein the set of rules is re-usable across dictionaries configured for different categories and pertains to arrangements of dictionary terms within sentences. The system produces annotations associated with the desired category for the document based on the detected patterns. Embodiments of the present invention further include a method and computer program product for dynamically processing a document including unstructured text in substantially the same manner as is described above.

Type: Grant

Filed: May 10, 2017

Date of Patent: April 21, 2020

Assignee: International Business Machines Corporation

Inventors: Stephen D. Bowman, Kristin E. McNeil, Philip E. Parker
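To make the notion of a rule that is re-usable across dictionaries more concrete, here is a minimal Python sketch. The single rule shown (a "need" term appearing before a "subject" term in the same sentence), the dictionary keys, and the sample terms are all hypothetical; the abstract only states that rules pertain to arrangements of dictionary terms within sentences.

```python
import re


def annotate(text, category, dictionaries):
    """Apply one fixed rule to the text using category-specific dictionaries.

    Rule (an assumption for illustration): annotate a sentence when a term
    from the "need" dictionary appears before a term from the "subject"
    dictionary.
    """
    need_terms = dictionaries["need"]
    subject_terms = dictionaries["subject"]
    annotations = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[a-z']+", sentence.lower())
        need_pos = [i for i, w in enumerate(words) if w in need_terms]
        subject_pos = [i for i, w in enumerate(words) if w in subject_terms]
        if need_pos and subject_pos and min(need_pos) < max(subject_pos):
            annotations.append({"category": category, "sentence": sentence})
    return annotations
```

The same function serves a "housing" category and a "food" category unchanged; only the user-supplied dictionaries differ.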
Image processing and text analysis to determine medical condition

Patent number: 10565350

Abstract: A method, a processing device, and a computer program product are provided. At least one processing device correlates textual medical information related to the subject with characteristics of an image of a medical condition of the subject to generate a subject signature. The at least one processing device compares the subject signature with multiple reference signatures to determine at least one reference signature corresponding to the subject signature. Each reference signature is associated with a corresponding medical condition and is generated by correlating textual medical information regarding the corresponding medical condition with characteristics of an image of the corresponding medical condition. The at least one processing device identifies the medical condition of the subject based on the medical conditions associated with the determined at least one reference signature. Information is provided regarding the identified medical condition of the subject.

Type: Grant

Filed: June 30, 2017

Date of Patent: February 18, 2020

Assignee: International Business Machines Corporation

Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker, David B. Werts
Identifying potential patient candidates for clinical trials

Patent number: 10417240

Abstract: A computer system gleans data from patient records and clinical trial descriptions using NLP techniques. NLP annotation data is used to generate clinical trial feature vectors and patient feature vectors. Clinical trial feature vectors and patient feature vectors are compared to match appropriate patient candidates with clinical trial openings.

Type: Grant

Filed: June 3, 2016

Date of Patent: September 17, 2019

Assignee: International Business Machines Corporation

Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker, David B. Werts
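The comparison of clinical trial feature vectors with patient feature vectors might look like the following Python sketch. Cosine similarity, the sparse dictionary representation, the feature names, and the 0.5 threshold are all assumptions chosen for illustration; the abstract says only that the vectors are compared.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_patients(trial_vector, patient_vectors, threshold=0.5):
    """Return (patient_id, score) pairs at or above the threshold, best first."""
    scored = [(pid, cosine(trial_vector, vec)) for pid, vec in patient_vectors.items()]
    kept = [(pid, score) for pid, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

In practice the feature values would come from NLP annotations over patient records and trial descriptions rather than being written by hand.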
Weighted annotation evaluation

Publication number: 20190251155

Abstract: A method for providing annotation summaries for annotations is provided. The method may include receiving annotations associated with analyzed unstructured data. The method may further include sorting the received annotations. Additionally, the method may include receiving focal points on the analyzed unstructured data. The method may also include extracting the sorted annotations associated with the focal points. The method may further include normalizing terms and phrases associated with the extracted annotations. The method may also include determining topics based on the normalized terms and phrases associated with the extracted annotations. The method may further include grouping the extracted annotations based on the determined topics. The method may also include summarizing the grouped annotations to generate a summarized annotation. The method may further include replacing the extracted annotations with the summarized annotation.

Type: Application

Filed: April 25, 2019

Publication date: August 15, 2019

Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker, David B. Werts
Weighted annotation evaluation

Publication number: 20190251154

Abstract: A method for providing annotation summaries for annotations is provided. The method may include receiving annotations associated with analyzed unstructured data. The method may further include sorting the received annotations. Additionally, the method may include receiving focal points on the analyzed unstructured data. The method may also include extracting the sorted annotations associated with the focal points. The method may further include normalizing terms and phrases associated with the extracted annotations. The method may also include determining topics based on the normalized terms and phrases associated with the extracted annotations. The method may further include grouping the extracted annotations based on the determined topics. The method may also include summarizing the grouped annotations to generate a summarized annotation. The method may further include replacing the extracted annotations with the summarized annotation.

Type: Application

Filed: April 25, 2019

Publication date: August 15, 2019

Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker, David B. Werts
Automatic discovery and presentation of topic summaries related to a selection of text

Patent number: 10380120

Abstract: Topic summaries related to a selection of text in an electronic document may be generated and presented. A topic summary application receives the user-selected text and identifies entities in the text using natural language processing. Using natural language processing, the summary application also identifies related entities and associated text phrases in a remaining portion of the electronic document. The remaining portion may be a portion of the document that precedes the user-selected text, so that a summary generated therefrom may be used to refresh the memory of the user while not revealing information that the user has not yet encountered. In addition, the summary application determines semantically important text phrases using text analytics and generates a summary, presented to the user in a pop-up window, of most frequently correlated related entities along with text phrases that are semantically important.

Type: Grant

Filed: March 18, 2014

Date of Patent: August 13, 2019

Assignee: International Business Machines Corporation

Inventors: Patrick W. Fink, Philip E. Parker
Automatic discovery and presentation of topic summaries related to a selection of text

Patent number: 10372716

Abstract: Topic summaries related to a selection of text in an electronic document may be generated and presented. A topic summary application receives the user-selected text and identifies entities in the text using natural language processing. Using natural language processing, the summary application also identifies related entities and associated text phrases in a remaining portion of the electronic document. The remaining portion may be a portion of the document that precedes the user-selected text, so that a summary generated therefrom may be used to refresh the memory of the user while not revealing information that the user has not yet encountered. In addition, the summary application determines semantically important text phrases using text analytics and generates a summary, presented to the user in a pop-up window, of most frequently correlated related entities along with text phrases that are semantically important.

Type: Grant

Filed: February 13, 2015

Date of Patent: August 6, 2019

Assignee: International Business Machines Corporation

Inventors: Patrick W. Fink, Philip E. Parker
Weighted annotation evaluation

Patent number: 10318622

Abstract: A method for providing annotation summaries for annotations is provided. The method may include receiving annotations associated with analyzed unstructured data. The method may further include sorting the received annotations. Additionally, the method may include receiving focal points on the analyzed unstructured data. The method may also include extracting the sorted annotations associated with the focal points. The method may further include normalizing terms and phrases associated with the extracted annotations. The method may also include determining topics based on the normalized terms and phrases associated with the extracted annotations. The method may further include grouping the extracted annotations based on the determined topics. The method may also include summarizing the grouped annotations to generate a summarized annotation. The method may further include replacing the extracted annotations with the summarized annotation.

Type: Grant

Filed: March 30, 2016

Date of Patent: June 11, 2019

Assignee: International Business Machines Corporation

Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker, David B. Werts
Systematic tuning of text analytic annotators with specialized information

Patent number: 10275458

Abstract: A data structure is generated containing enumerators for data types of a domain, text forms of the enumerators and context patterns for the text forms. The data structure also includes information extraction rules that are associated with the enumerators. The data structure is updated with additional context patterns and text forms that are identified within a set of documents to which text analytic annotators are to be tuned. The set of documents is analyzed against the updated data structure and additional extraction rules are generated based on the analysis.

Type: Grant

Filed: August 14, 2014

Date of Patent: April 30, 2019

Assignee: International Business Machines Corporation

Inventors: Harish Deshmukh, Philip E. Parker, Roger C. Raphael, Paul S. Taylor, Gabriel Valencia
Systematic tuning of text analytic annotators with specialized information

Patent number: 10169334

Abstract: A data structure is generated containing enumerators for data types of a domain, text forms of the enumerators and context patterns for the text forms. The data structure also includes information extraction rules that are associated with the enumerators. The data structure is updated with additional context patterns and text forms that are identified within a set of documents to which text analytic annotators are to be tuned. The set of documents is analyzed against the updated data structure and additional extraction rules are generated based on the analysis.

Type: Grant

Filed: March 26, 2015

Date of Patent: January 1, 2019

Assignee: International Business Machines Corporation

Inventors: Harish Deshmukh, Philip E. Parker, Roger C. Raphael, Paul S. Taylor, Gabriel Valencia
Identifying potential patient candidates for clinical trials

Patent number: 10162866

Abstract: A computer system gleans data from patient records and clinical trial descriptions using NLP techniques. NLP annotation data is used to generate clinical trial feature vectors and patient feature vectors. Clinical trial feature vectors and patient feature vectors are compared to match appropriate patient candidates with clinical trial openings.

Type: Grant

Filed: August 9, 2017

Date of Patent: December 25, 2018

Assignee: International Business Machines Corporation

Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker, David B. Werts
Natural language processing of formatted documents

Publication number: 20180336181

Abstract: Detecting and incorporating formatting characteristics within natural language processing analytics. Source documents are ingested and the markup formatting language is identified by the program. Once identified, the markup language is parsed and examined for formatting characteristics, embedded notes, comments and other metadata. The formatting characteristics of the plain text are extracted, along with the plain text, and converted into a common analysis structure (CAS), or CAS-equivalent structure, which annotates the natural language text together with its respective formatting characteristics. The CAS or CAS-equivalent structures are stored and sent to a natural language processing pipeline for further analysis via complex algorithms and rules. The natural language processing results data are curated to reflect meaningful analysis of the extracted CAS or CAS-equivalent structure.

Type: Application

Filed: May 17, 2017

Publication date: November 22, 2018

Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker, David B. Werts
Natural language processing of formatted documents

Publication number: 20180336185

Abstract: Detecting and incorporating formatting characteristics within natural language processing analytics. Source documents are ingested and the markup formatting language is identified by the program. Once identified, the markup language is parsed and examined for formatting characteristics, embedded notes, comments and other metadata. The formatting characteristics of the plain text are extracted, along with the plain text, and converted into a common analysis structure (CAS), or CAS-equivalent structure, which annotates the natural language text together with its respective formatting characteristics. The CAS or CAS-equivalent structures are stored and sent to a natural language processing pipeline for further analysis via complex algorithms and rules. The natural language processing results data are curated to reflect meaningful analysis of the extracted CAS or CAS-equivalent structure.

Type: Application

Filed: September 18, 2017

Publication date: November 22, 2018

Inventors: Patrick W. Fink, Kristin E. McNeil, Philip E. Parker, David B. Werts
Domain specific representation of document text for accelerated natural language processing

Patent number: 10133713

Abstract: Provided are techniques for a domain specific representation of document text for accelerated natural language processing. A document is selected from a set of documents to be analyzed. A character stream from the document is converted into a token stream based on tokenization rules. Irrelevant tokens are removed from the token stream. The tokens remaining in the token stream are converted into an integer domain representation based on a domain specific ontology dictionary. The integer domain representation is stored to a Graphics Processing Unit (GPU) processing queue of each of one or more GPUs. Then, a result set is received from the one or more GPUs.

Type: Grant

Filed: June 8, 2016

Date of Patent: November 20, 2018

Assignee: International Business Machines Corporation

Inventors: Rajesh M. Desai, Alon S. Housfater, Philip E. Parker, Roger C. Raphael
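The conversion pipeline in this abstract (character stream to tokens, removal of irrelevant tokens, mapping to integers through an ontology dictionary) can be sketched in a few lines of Python. The tokenization rule, the stopword list, the ontology entries, and the use of `0` for unknown tokens are assumptions for illustration, and the GPU queueing step is deliberately left out.

```python
import re

# Hypothetical domain ontology dictionary mapping terms to integer ids.
ONTOLOGY = {"aspirin": 1, "headache": 2, "fever": 3}

# Hypothetical list of tokens treated as irrelevant.
STOPWORDS = {"the", "a", "for", "and", "of", "took"}


def tokenize(text):
    """Assumed tokenization rule: lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def to_integer_representation(text, ontology=ONTOLOGY, unknown_id=0):
    """Tokenize, drop irrelevant tokens, and map the rest to integer ids.

    Tokens missing from the ontology map to unknown_id (an assumption).
    """
    tokens = [t for t in tokenize(text) if t not in STOPWORDS]
    return [ontology.get(t, unknown_id) for t in tokens]
```

A compact integer array of this kind is the sort of input that could be handed to a GPU processing queue, since fixed-width integers are cheaper to compare in bulk than strings.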
Configurable analytics framework for assistance needs detection

Publication number: 20180329875

Abstract: According to an embodiment of the present invention, a system dynamically processes a document including unstructured text and comprises a computer system including at least one processor. Initially, the system configures a plurality of dictionaries with terms supplied by a user and associated with a desired category. The processor in the system applies a set of rules to the unstructured text of the document to detect patterns indicating a presence of the desired category, wherein the set of rules is re-usable across dictionaries configured for different categories and pertains to arrangements of dictionary terms within sentences. The system produces annotations associated with the desired category for the document based on the detected patterns. Embodiments of the present invention further include a method and computer program product for dynamically processing a document including unstructured text in substantially the same manner as is described above.

Type: Application

Filed: May 10, 2017

Publication date: November 15, 2018

Inventors: Stephen D. Bowman, Kristin E. McNeil, Philip E. Parker
