Patents by Inventor Ritwik Ray

Ritwik Ray has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PRE-PROCESSING A TABLE IN A DOCUMENT FOR NATURAL LANGUAGE PROCESSING

Publication number: 20240096124

Abstract: Provided are a computer program product, system, and method for pre-processing a table in a document for natural language processing (NLP). A graphical user interface (GUI) provides a representation of table items in a table in a document including a set of a main element comprising an entity whose value is to be extracted, a conditional element that refines the entity, and a value element comprising a value for the entity. Graphical controls are rendered in the GUI to enable a user to select an element from the table to be the main element, conditional element, and value element. The set of the main element, conditional element, and value element are updated with the user selected element to form a modified set. The modified set of the main element, conditional element, and the value element are provided to an NLP engine to perform natural language processing.

Type: Application

Filed: November 22, 2023

Publication date: March 21, 2024

Inventors: Scott CARRIER, Ritwik RAY, Jonathan Chapin RAND, Jothilakshmi SIRANGIMOORTHY, Hui WANG, Robert FREDENBURG

Pre-processing a table in a document for natural language processing

Patent number: 11869264

Abstract: Provided are a computer program product, system, and method for pre-processing a table in a document for natural language processing. A table in a document is parsed to extract column headers, row headers, and data cells, which are processed to determine an initial set of a main element comprising an entity whose value is to be extracted, a conditional element that refines the entity, and a value element comprising a value for the entity. A user selection is received of at least one of the column headers, row headers, and data cells for at least one of the main element, conditional element, and the value element in the initial set to produce a modified set of the main element, conditional element, and value element. The modified set is provided to a natural language processing engine to perform natural language processing of the document including the table, using the modified set.
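The flow this abstract describes (parse the table, derive an initial set of main, conditional, and value elements, then apply a user's correction before handing the result to an NLP engine) can be illustrated with a minimal sketch. It is not the patented implementation. The names `ElementSet`, `initial_sets`, and `apply_user_selection` are hypothetical, the sample table is invented, and the rule "row header is the main element, column header is the conditional element" is an assumption made only for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ElementSet:
    main: str         # entity whose value is to be extracted
    conditional: str  # element that refines the entity
    value: str        # value for the entity


def initial_sets(table):
    """Build one ElementSet per data cell.

    Assumed heuristic (not stated in the abstract): the row header is the
    main element and the column header is the conditional element.
    """
    column_headers = table[0][1:]
    sets = []
    for row in table[1:]:
        row_header, cells = row[0], row[1:]
        for column_header, cell in zip(column_headers, cells):
            sets.append(ElementSet(row_header, column_header, cell))
    return sets


def apply_user_selection(element_set, **overrides):
    """Return a modified set with the user's chosen elements swapped in."""
    return replace(element_set, **overrides)


# Invented sample table: first row holds column headers,
# first column holds row headers.
table = [
    ["Plan", "In-network", "Out-of-network"],
    ["Deductible", "$500", "$1,500"],
    ["Copay", "$20", "$60"],
]

sets = initial_sets(table)
modified = apply_user_selection(sets[0], conditional="In-network provider")
```

In this sketch the modified set, rather than the initial one, is what would be passed on to whatever NLP engine processes the document.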

Type: Grant

Filed: January 13, 2023

Date of Patent: January 9, 2024

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Scott Carrier, Ritwik Ray, Jonathan Chapin Rand, Jothilakshmi Sirangimoorthy, Hui Wang, Robert Fredenburg

PRE-PROCESSING A TABLE IN A DOCUMENT FOR NATURAL LANGUAGE PROCESSING

Publication number: 20230154220

Abstract: Provided are a computer program product, system, and method for pre-processing a table in a document for natural language processing. A table in a document is parsed to extract column headers, row headers, and data cells, which are processed to determine an initial set of a main element comprising an entity whose value is to be extracted, a conditional element that refines the entity, and a value element comprising a value for the entity. A user selection is received of at least one of the column headers, row headers, and data cells for at least one of the main element, conditional element, and the value element in the initial set to produce a modified set of the main element, conditional element, and value element. The modified set is provided to a natural language processing engine to perform natural language processing of the document including the table, using the modified set.

Type: Application

Filed: January 13, 2023

Publication date: May 18, 2023

Inventors: Scott CARRIER, Ritwik RAY, Jonathan Chapin RAND, Jothilakshmi SIRANGIMOORTHY, Hui WANG, Robert FREDENBURG

Pre-processing a table in a document for natural language processing

Patent number: 11587347

Abstract: Provided are a computer program product, system, and method for pre-processing a table in a document for natural language processing. A table in a document is parsed to extract column headers, row headers, and data cells, which are processed to determine an initial set of a main element comprising an entity whose value is to be extracted, a conditional element that refines the entity, and a value element comprising a value for the entity. A user selection is received of at least one of the column headers, row headers, and data cells for at least one of the main element, conditional element, and the value element in the initial set to produce a modified set of the main element, conditional element, and value element. The modified set is provided to a natural language processing engine to perform natural language processing of the document including the table, using the modified set.

Type: Grant

Filed: January 21, 2021

Date of Patent: February 21, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Scott Carrier, Ritwik Ray, Jonathan Chapin Rand, Jothilakshmi Sirangimoorthy, Hui Wang, Robert Fredenburg

Extracting information from unstructured documents using natural language processing and conversion of unstructured documents into structured documents

Patent number: 11423042

Abstract: Aspects of the present disclosure describe techniques for generating a machine learning model for extracting information from textual content. The method generally includes receiving a training data set including a plurality of documents having related textual strings. A relevancy model is generated from the training data set. The relevancy model is generally configured to generate relevance scores for a plurality of words extracted from the plurality of documents. A knowledge graph model illustrating relationships between the plurality of words extracted from the plurality of documents is generated from the training data set. The relevancy model and the knowledge graph model are aggregated into a complimentary model including a plurality of nodes from the knowledge graph model and weights associated with edges between connected nodes, wherein the weights comprise relevance scores generated from the relevancy model, and the complimentary model is deployed for use in analyzing documents.
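As a rough illustration of the aggregation step in this abstract, the sketch below builds a toy relevancy model and a toy knowledge graph from the same training documents, then combines them into one weighted graph. Everything here is an assumption for illustration: document frequency stands in for the relevancy model, word co-occurrence stands in for the knowledge graph, and the edge weight is taken to be the product of the two endpoint scores. The abstract does not specify any of these choices, and the function names are hypothetical.

```python
from itertools import combinations


def relevance_scores(documents):
    """Stand-in relevancy model: fraction of documents containing each word."""
    counts = {}
    for doc in documents:
        for word in set(doc.lower().split()):
            counts[word] = counts.get(word, 0) + 1
    return {word: count / len(documents) for word, count in counts.items()}


def knowledge_graph(documents):
    """Stand-in knowledge graph: an edge joins words that share a document."""
    edges = set()
    for doc in documents:
        words = sorted(set(doc.lower().split()))
        edges.update(combinations(words, 2))
    return edges


def combine(scores, edges):
    """Weight each edge with the relevance scores of its two nodes.

    Assumption: the weight is the product of the endpoint scores.
    """
    return {(a, b): scores[a] * scores[b] for a, b in edges}


# Invented training documents.
documents = [
    "deductible applies annually",
    "deductible waived",
    "copay applies",
]

scores = relevance_scores(documents)
graph = combine(scores, knowledge_graph(documents))
```

The result is a dictionary keyed by node pairs, which is one simple way to hold "nodes plus weighted edges"; a graph library would serve equally well.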

Type: Grant

Filed: February 7, 2020

Date of Patent: August 23, 2022

Assignee: International Business Machines Corporation

Inventors: Jothilakshmi Sirangimoorthy, Ritwik Ray, Hui Wang, Jonathan Rand, Scott Carrier

PRE-PROCESSING A TABLE IN A DOCUMENT FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220230012

Abstract: Provided are a computer program product, system, and method for pre-processing a table in a document for natural language processing. A table in a document is parsed to extract column headers, row headers, and data cells, which are processed to determine an initial set of a main element comprising an entity whose value is to be extracted, a conditional element that refines the entity, and a value element comprising a value for the entity. A user selection is received of at least one of the column headers, row headers, and data cells for at least one of the main element, conditional element, and the value element in the initial set to produce a modified set of the main element, conditional element, and value element. The modified set is provided to a natural language processing engine to perform natural language processing of the document including the table, using the modified set.

Type: Application

Filed: January 21, 2021

Publication date: July 21, 2022

Inventors: Scott CARRIER, Ritwik RAY, Jonathan Chapin RAND, Jothilakshmi SIRANGIMOORTHY, Hui WANG, Robert FREDENBURG

Navigating unstructured documents using structured documents including information extracted from unstructured documents

Patent number: 11392753

Abstract: Aspects of the present disclosure describe techniques for generating a machine learning model for extracting information from textual content. The method generally includes receiving an unstructured document and a structured document including information extracted from the unstructured document and position information associated with the extracted information. The unstructured document is rendered in a first pane, and a graphical rendering of the structured document is rendered in a second pane. The graphical rendering generally may be a structure in which content from the structured document is displayed in a hierarchical format. Each element in the structured document is linked to the rendered unstructured document based on position information included in the structured document.
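The linking idea in this abstract, where each element of the structured document carries position information that points back into the unstructured document, can be sketched as follows. This is an illustrative sketch only. The dictionary layout (`label`, `start`, `end`, `children`), the helper names, and the sample text are all hypothetical, and character offsets are assumed as the form of position information. Rendering into two panes is omitted; the sketch only produces the hierarchical rows and the source text each row links to.

```python
def element(source, label, phrase, children=()):
    """Create a structured element whose position info locates `phrase`."""
    start = source.index(phrase)
    return {
        "label": label,
        "start": start,
        "end": start + len(phrase),
        "children": list(children),
    }


def outline(node, source, depth=0):
    """Return (depth, label, linked_text) rows in depth-first order."""
    rows = [(depth, node["label"], source[node["start"]:node["end"]])]
    for child in node["children"]:
        rows.extend(outline(child, source, depth + 1))
    return rows


# Invented unstructured document.
text = "Deductible: $500 per year. Copay: $20 per visit."

# Invented structured document extracted from it.
structured = element(text, "Benefits", text, [
    element(text, "Deductible", "$500"),
    element(text, "Copay", "$20"),
])

rows = outline(structured, text)
for depth, label, linked in rows:
    print("  " * depth + f"{label} -> {linked!r}")
```

A viewer built on this could use the stored `start` and `end` offsets to scroll to and highlight the matching span when a row is selected.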

Type: Grant

Filed: February 7, 2020

Date of Patent: July 19, 2022

Assignee: International Business Machines Corporation

Inventors: Jothilakshmi Sirangimoorthy, Ritwik Ray, Hui Wang, Jonathan Rand, Scott Carrier

Extraction of information and smart annotation of relevant information within complex documents

Patent number: 11163836

Abstract: Methods and systems are provided to extract information within complex documents, and the extracted information may be compared to identify differences between complex documents or the extracted information may be analyzed with respect to the individual document. Information is extracted from complex documents comprising unstructured data to create a structured data repository, or analytics knowledge base. This database may be utilized to compare concepts that are common to one or more documents, allowing ease of comparison of documents, and identification of information that is different or identification of (same or similar) information that is presented differently in a set of complex documents.
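The comparison described in this abstract can be pictured with a small sketch: extract concepts from each document into a structured repository, then report the concepts that the documents share but state differently. The sketch assumes, purely for illustration, that concepts appear as `Name: value` lines; real complex documents would need far richer extraction. The function names and sample contracts are hypothetical.

```python
import re


def extract(document):
    """Pull 'Name: value' lines into a concept dictionary."""
    pairs = re.findall(r"^([^:\n]+):[ \t]*(.+)$", document, flags=re.MULTILINE)
    return {name.strip().lower(): value.strip() for name, value in pairs}


def differences(repository):
    """Return shared concepts whose values differ between documents."""
    shared = set.intersection(*(set(c) for c in repository.values()))
    diffs = {}
    for concept in sorted(shared):
        values = {name: concepts[concept] for name, concepts in repository.items()}
        if len(set(values.values())) > 1:
            diffs[concept] = values
    return diffs


# Invented sample documents.
contract_a = "Term: 12 months\nGoverning law: New York\n"
contract_b = "Term: 24 months\nGoverning law: New York\nNotice: 30 days\n"

# The structured repository: document name -> extracted concepts.
repository = {
    "contract_a": extract(contract_a),
    "contract_b": extract(contract_b),
}

diffs = differences(repository)
```

Here only the shared concept with differing values is reported; concepts present in just one document (such as the notice period) could be surfaced by a separate check.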

Type: Grant

Filed: February 12, 2018

Date of Patent: November 2, 2021

Assignee: International Business Machines Corporation

Inventors: Ritwik Ray, Marie Angelopoulos, Frederick Roberts, Christopher Gagen, Maria Gabrani

Extraction of information and smart annotation of relevant information within complex documents

Patent number: 11163837

Abstract: Methods and systems are provided to extract information within complex documents, and the extracted information may be compared to identify differences between complex documents or the extracted information may be analyzed with respect to the individual document. Information is extracted from complex documents comprising unstructured data to create a structured data repository, or analytics knowledge base. This database may be utilized to compare concepts that are common to one or more documents, allowing ease of comparison of documents, and identification of information that is different or identification of (same or similar) information that is presented differently in a set of complex documents.

Type: Grant

Filed: June 18, 2019

Date of Patent: November 2, 2021

Assignee: International Business Machines Corporation

Inventors: Ritwik Ray, Marie Angelopoulos, Frederick Roberts, Christopher Gagen, Maria Gabrani

EXTRACTING INFORMATION FROM UNSTRUCTURED DOCUMENTS USING NATURAL LANGUAGE PROCESSING AND CONVERSION OF UNSTRUCTURED DOCUMENTS INTO STRUCTURED DOCUMENTS

Publication number: 20210248153

Abstract: Aspects of the present disclosure describe techniques for generating a machine learning model for extracting information from textual content. The method generally includes receiving a training data set including a plurality of documents having related textual strings. A relevancy model is generated from the training data set. The relevancy model is generally configured to generate relevance scores for a plurality of words extracted from the plurality of documents. A knowledge graph model illustrating relationships between the plurality of words extracted from the plurality of documents is generated from the training data set. The relevancy model and the knowledge graph model are aggregated into a complimentary model including a plurality of nodes from the knowledge graph model and weights associated with edges between connected nodes, wherein the weights comprise relevance scores generated from the relevancy model, and the complimentary model is deployed for use in analyzing documents.

Type: Application

Filed: February 7, 2020

Publication date: August 12, 2021

Inventors: Jothilakshmi SIRANGIMOORTHY, Ritwik RAY, Hui WANG, Jonathan RAND, Scott CARRIER

NAVIGATING UNSTRUCTURED DOCUMENTS USING STRUCTURED DOCUMENTS INCLUDING INFORMATION EXTRACTED FROM UNSTRUCTURED DOCUMENTS

Publication number: 20210248303

Abstract: Aspects of the present disclosure describe techniques for generating a machine learning model for extracting information from textual content. The method generally includes receiving an unstructured document and a structured document including information extracted from the unstructured document and position information associated with the extracted information. The unstructured document is rendered in a first pane, and a graphical rendering of the structured document is rendered in a second pane. The graphical rendering generally may be a structure in which content from the structured document is displayed in a hierarchical format. Each element in the structured document is linked to the rendered unstructured document based on position information included in the structured document.

Type: Application

Filed: February 7, 2020

Publication date: August 12, 2021

Inventors: Jothilakshmi SIRANGIMOORTHY, Ritwik RAY, Hui WANG, Jonathan RAND, Scott CARRIER

EXTRACTION OF INFORMATION AND SMART ANNOTATION OF RELEVANT INFORMATION WITHIN COMPLEX DOCUMENTS

Publication number: 20190303412

Abstract: Methods and systems are provided to extract information within complex documents, and the extracted information may be compared to identify differences between complex documents or the extracted information may be analyzed with respect to the individual document. Information is extracted from complex documents comprising unstructured data to create a structured data repository, or analytics knowledge base. This database may be utilized to compare concepts that are common to one or more documents, allowing ease of comparison of documents, and identification of information that is different or identification of (same or similar) information that is presented differently in a set of complex documents.

Type: Application

Filed: June 18, 2019

Publication date: October 3, 2019

Inventors: Ritwik Ray, Marie Angelopoulos, Frederick Roberts, Christopher Gagen, Maria Gabrani

EXTRACTION OF INFORMATION AND SMART ANNOTATION OF RELEVANT INFORMATION WITHIN COMPLEX DOCUMENTS

Publication number: 20190251182

Abstract: Methods and systems are provided to extract information within complex documents, and the extracted information may be compared to identify differences between complex documents or the extracted information may be analyzed with respect to the individual document. Information is extracted from complex documents comprising unstructured data to create a structured data repository, or analytics knowledge base. This database may be utilized to compare concepts that are common to one or more documents, allowing ease of comparison of documents, and identification of information that is different or identification of (same or similar) information that is presented differently in a set of complex documents.

Type: Application

Filed: February 12, 2018

Publication date: August 15, 2019

Inventors: Ritwik Ray, Marie Angelopoulos, Frederick Roberts, Christopher Gagen, Maria Gabrani