Patents by Inventor Ritwik Ray
Ritwik Ray has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240096124
Abstract: Provided are a computer program product, system, and method for pre-processing a table in a document for natural language processing (NLP). A graphical user interface (GUI) provides a representation of table items in a table in a document including a set of a main element comprising an entity whose value is to be extracted, a conditional element that refines the entity, and a value element comprising a value for the entity. Graphical controls are rendered in the GUI to enable a user to select an element from the table to be the main element, conditional element, and value element. The set of the main element, conditional element, and value element are updated with the user selected element to form a modified set. The modified set of the main element, conditional element, and the value element are provided to an NLP engine to perform natural language processing.
Type: Application
Filed: November 22, 2023
Publication date: March 21, 2024
Inventors: Scott CARRIER, Ritwik RAY, Jonathan Chapin RAND, Jothilakshmi SIRANGIMOORTHY, Hui WANG, Robert FREDENBURG
-
Patent number: 11869264
Abstract: Provided are a computer program product, system, and method for pre-processing a table in a document for natural language processing. A table in a document is parsed to extract column headers, row headers, and data cells, which are processed to determine an initial set of a main element comprising an entity whose value is to be extracted, a conditional element that refines the entity, and a value element comprising a value for the entity. A user selection is received of at least one of the column headers, row headers, and data cells for at least one of the main element, conditional element, and the value element in the initial set to produce a modified set of the main element, conditional element, and value element. The modified set is provided to a natural language processing engine to perform natural language processing of the document including the table, using the modified set.
Type: Grant
Filed: January 13, 2023
Date of Patent: January 9, 2024
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Scott Carrier, Ritwik Ray, Jonathan Chapin Rand, Jothilakshmi Sirangimoorthy, Hui Wang, Robert Fredenburg
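The table pre-processing described in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes, purely for illustration, that the first row holds column headers, the first column holds row headers, a row header serves as the main element, and a column header serves as the conditional element. The names `ElementSet`, `initial_sets`, and `apply_user_selection` are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ElementSet:
    main: str         # entity whose value is to be extracted
    conditional: str  # element that refines the entity
    value: str        # value for the entity


def initial_sets(table):
    """Build one ElementSet per data cell.

    Assumption (not from the abstract): row 0 holds the column headers
    and column 0 holds the row headers.
    """
    column_headers = table[0][1:]
    sets = []
    for row in table[1:]:
        row_header, cells = row[0], row[1:]
        for column_header, cell in zip(column_headers, cells):
            sets.append(
                ElementSet(main=row_header, conditional=column_header, value=cell)
            )
    return sets


def apply_user_selection(element_set, **overrides):
    """Return a modified set with the user-selected elements swapped in."""
    return replace(element_set, **overrides)


# Hypothetical sample table.
table = [
    ["Benefit", "In-network", "Out-of-network"],
    ["Deductible", "$500", "$1,000"],
    ["Copay", "$20", "$40"],
]
sets = initial_sets(table)
modified = apply_user_selection(sets[0], conditional="In-network provider")
```

In this sketch the modified set is simply a new immutable record; handing it to an NLP engine is left out because the abstract does not say how that interface looks.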
-
Publication number: 20230154220
Abstract: Provided are a computer program product, system, and method for pre-processing a table in a document for natural language processing. A table in a document is parsed to extract column headers, row headers, and data cells, which are processed to determine an initial set of a main element comprising an entity whose value is to be extracted, a conditional element that refines the entity, and a value element comprising a value for the entity. A user selection is received of at least one of the column headers, row headers, and data cells for at least one of the main element, conditional element, and the value element in the initial set to produce a modified set of the main element, conditional element, and value element. The modified set is provided to a natural language processing engine to perform natural language processing of the document including the table, using the modified set.
Type: Application
Filed: January 13, 2023
Publication date: May 18, 2023
Inventors: Scott CARRIER, Ritwik RAY, Jonathan Chapin RAND, Jothilakshmi SIRANGIMOORTHY, Hui WANG, Robert FREDENBURG
-
Patent number: 11587347
Abstract: Provided are a computer program product, system, and method for pre-processing a table in a document for natural language processing. A table in a document is parsed to extract column headers, row headers, and data cells, which are processed to determine an initial set of a main element comprising an entity whose value is to be extracted, a conditional element that refines the entity, and a value element comprising a value for the entity. A user selection is received of at least one of the column headers, row headers, and data cells for at least one of the main element, conditional element, and the value element in the initial set to produce a modified set of the main element, conditional element, and value element. The modified set is provided to a natural language processing engine to perform natural language processing of the document including the table, using the modified set.
Type: Grant
Filed: January 21, 2021
Date of Patent: February 21, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Scott Carrier, Ritwik Ray, Jonathan Chapin Rand, Jothilakshmi Sirangimoorthy, Hui Wang, Robert Fredenburg
-
Patent number: 11423042
Abstract: Aspects of the present disclosure describe techniques for generating a machine learning model for extracting information from textual content. The method generally includes receiving a training data set including a plurality of documents having related textual strings. A relevancy model is generated from the training data set. The relevancy model is generally configured to generate relevance scores for a plurality of words extracted from the plurality of documents. A knowledge graph model illustrating relationships between the plurality of words extracted from the plurality of documents is generated from the training data set. The relevancy model and the knowledge graph model are aggregated into a complimentary model including a plurality of nodes from the knowledge graph model and weights associated with edges between connected nodes, wherein the weights comprise relevance scores generated from the relevancy model, and the complimentary model is deployed for use in analyzing documents.
Type: Grant
Filed: February 7, 2020
Date of Patent: August 23, 2022
Assignee: International Business Machines Corporation
Inventors: Jothilakshmi Sirangimoorthy, Ritwik Ray, Hui Wang, Jonathan Rand, Scott Carrier
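The aggregation step in this abstract, where knowledge graph edges carry weights derived from relevance scores, can be sketched roughly as follows. Everything concrete here is an assumption made for illustration and not taken from the patent: relevance is approximated by document frequency, an edge joins any two words that share a document, and an edge weight is the mean of its two endpoint scores. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def relevance_scores(documents):
    """Stand-in relevancy model: fraction of documents containing each word."""
    counts = Counter(word for doc in documents for word in set(doc.split()))
    return {word: n / len(documents) for word, n in counts.items()}


def knowledge_graph(documents):
    """Stand-in knowledge graph: an edge joins two words that share a document."""
    edges = set()
    for doc in documents:
        words = sorted(set(doc.split()))
        edges.update(combinations(words, 2))
    return edges


def combined_model(documents):
    """Attach a relevance-derived weight to every knowledge graph edge.

    Assumption: the weight is the mean of the two endpoint scores.
    """
    scores = relevance_scores(documents)
    return {
        (a, b): (scores[a] + scores[b]) / 2
        for a, b in knowledge_graph(documents)
    }


# Hypothetical two-document training set.
docs = ["copay deductible", "copay premium"]
model = combined_model(docs)
```

A real relevancy model and knowledge graph would be learned from the training data; the point of the sketch is only the shape of the result, a graph whose edge weights come from a separate scoring model.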
-
Publication number: 20220230012
Abstract: Provided are a computer program product, system, and method for pre-processing a table in a document for natural language processing. A table in a document is parsed to extract column headers, row headers, and data cells, which are processed to determine an initial set of a main element comprising an entity whose value is to be extracted, a conditional element that refines the entity, and a value element comprising a value for the entity. A user selection is received of at least one of the column headers, row headers, and data cells for at least one of the main element, conditional element, and the value element in the initial set to produce a modified set of the main element, conditional element, and value element. The modified set is provided to a natural language processing engine to perform natural language processing of the document including the table, using the modified set.
Type: Application
Filed: January 21, 2021
Publication date: July 21, 2022
Inventors: Scott CARRIER, Ritwik RAY, Jonathan Chapin RAND, Jothilakshmi SIRANGIMOORTHY, Hui WANG, Robert FREDENBURG
-
Patent number: 11392753
Abstract: Aspects of the present disclosure describe techniques for generating a machine learning model for extracting information from textual content. The method generally includes receiving an unstructured document and a structured document including information extracted from the unstructured document and position information associated with the extracted information. The unstructured document is rendered in a first pane, and a graphical rendering of the structured document is rendered in a second pane. The graphical rendering generally may be a structure in which content from the structured document is displayed in a hierarchical format. Each element in the structured document is linked to the rendered unstructured document based on position information included in the structured document.
Type: Grant
Filed: February 7, 2020
Date of Patent: July 19, 2022
Assignee: International Business Machines Corporation
Inventors: Jothilakshmi Sirangimoorthy, Ritwik Ray, Hui Wang, Jonathan Rand, Scott Carrier
-
Patent number: 11163836
Abstract: Methods and systems are provided to extract information within complex documents, and the extracted information may be compared to identify differences between complex documents or the extracted information may be analyzed with respect to the individual document. Information is extracted from complex documents comprising unstructured data to create a structured data repository, or analytics knowledge base. This database may be utilized to compare concepts that are common to one or more documents, allowing ease of comparison of documents, and identification of information that is different or identification of (same or similar) information that is presented differently in a set of complex documents.
Type: Grant
Filed: February 12, 2018
Date of Patent: November 2, 2021
Assignee: International Business Machines Corporation
Inventors: Ritwik Ray, Marie Angelopoulos, Frederick Roberts, Christopher Gagen, Maria Gabrani
-
Patent number: 11163837
Abstract: Methods and systems are provided to extract information within complex documents, and the extracted information may be compared to identify differences between complex documents or the extracted information may be analyzed with respect to the individual document. Information is extracted from complex documents comprising unstructured data to create a structured data repository, or analytics knowledge base. This database may be utilized to compare concepts that are common to one or more documents, allowing ease of comparison of documents, and identification of information that is different or identification of (same or similar) information that is presented differently in a set of complex documents.
Type: Grant
Filed: June 18, 2019
Date of Patent: November 2, 2021
Assignee: International Business Machines Corporation
Inventors: Ritwik Ray, Marie Angelopoulos, Frederick Roberts, Christopher Gagen, Maria Gabrani
-
Publication number: 20210248153
Abstract: Aspects of the present disclosure describe techniques for generating a machine learning model for extracting information from textual content. The method generally includes receiving a training data set including a plurality of documents having related textual strings. A relevancy model is generated from the training data set. The relevancy model is generally configured to generate relevance scores for a plurality of words extracted from the plurality of documents. A knowledge graph model illustrating relationships between the plurality of words extracted from the plurality of documents is generated from the training data set. The relevancy model and the knowledge graph model are aggregated into a complimentary model including a plurality of nodes from the knowledge graph model and weights associated with edges between connected nodes, wherein the weights comprise relevance scores generated from the relevancy model, and the complimentary model is deployed for use in analyzing documents.
Type: Application
Filed: February 7, 2020
Publication date: August 12, 2021
Inventors: Jothilakshmi SIRANGIMOORTHY, Ritwik RAY, Hui WANG, Jonathan RAND, Scott CARRIER
-
Publication number: 20210248303
Abstract: Aspects of the present disclosure describe techniques for generating a machine learning model for extracting information from textual content. The method generally includes receiving an unstructured document and a structured document including information extracted from the unstructured document and position information associated with the extracted information. The unstructured document is rendered in a first pane, and a graphical rendering of the structured document is rendered in a second pane. The graphical rendering generally may be a structure in which content from the structured document is displayed in a hierarchical format. Each element in the structured document is linked to the rendered unstructured document based on position information included in the structured document.
Type: Application
Filed: February 7, 2020
Publication date: August 12, 2021
Inventors: Jothilakshmi SIRANGIMOORTHY, Ritwik RAY, Hui WANG, Jonathan RAND, Scott CARRIER
-
Publication number: 20190303412
Abstract: Methods and systems are provided to extract information within complex documents, and the extracted information may be compared to identify differences between complex documents or the extracted information may be analyzed with respect to the individual document. Information is extracted from complex documents comprising unstructured data to create a structured data repository, or analytics knowledge base. This database may be utilized to compare concepts that are common to one or more documents, allowing ease of comparison of documents, and identification of information that is different or identification of (same or similar) information that is presented differently in a set of complex documents.
Type: Application
Filed: June 18, 2019
Publication date: October 3, 2019
Inventors: Ritwik Ray, Marie Angelopoulos, Frederick Roberts, Christopher Gagen, Maria Gabrani
-
Publication number: 20190251182
Abstract: Methods and systems are provided to extract information within complex documents, and the extracted information may be compared to identify differences between complex documents or the extracted information may be analyzed with respect to the individual document. Information is extracted from complex documents comprising unstructured data to create a structured data repository, or analytics knowledge base. This database may be utilized to compare concepts that are common to one or more documents, allowing ease of comparison of documents, and identification of information that is different or identification of (same or similar) information that is presented differently in a set of complex documents.
Type: Application
Filed: February 12, 2018
Publication date: August 15, 2019
Inventors: Ritwik Ray, Marie Angelopoulos, Frederick Roberts, Christopher Gagen, Maria Gabrani