Patents by Inventor Marina Danilevsky Hailpern

Marina Danilevsky Hailpern has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11900070
    Abstract: A computer-implemented method according to one embodiment includes receiving, at a deep neural network (DNN), a plurality of sentences each having an associated label; training the DNN, utilizing the plurality of sentences and associated labels; and producing a linguistic expression (LE) utilizing the trained DNN.
    Type: Grant
    Filed: February 3, 2020
    Date of Patent: February 13, 2024
    Assignee: International Business Machines Corporation
    Inventors: Prithviraj Sen, Siddhartha Brahma, Yunyao Li, Laura Chiticariu, Rajasekar Krishnamurthy, Shivakumar Vaithyanathan, Marina Danilevsky Hailpern
  • Patent number: 11650970
    Abstract: Methods, systems, and computer program products for extracting structure and semantics from tabular data are provided herein. A computer-implemented method includes processing tabular data comprising data cells and header cells, wherein the processing includes: identifying one or more regions within the tabular data, wherein each of the regions comprises one or more of the data cells; matching some of the regions to one or more of the header cells, wherein the matched header cells are semantically related to the data cells inside the matched region; and generating, based on the matching, an output describing semantic relationships between the data cells and the header cells. The method also includes creating, for each data cell, a tuple comprising semantic information contained within one or more of the header cells that pertains to the data cell.
    Type: Grant
    Filed: March 9, 2018
    Date of Patent: May 16, 2023
    Assignee: International Business Machines Corporation
    Inventors: Xilun Chen, Laura Chiticariu, Alexandre Evfimievski, Marina Danilevsky Hailpern, Prithviraj Sen
  • Publication number: 20220269858
    Abstract: A system, computer program product, and method are provided for jointly learning dictionary based rules and dictionary candidates. Natural language text is received and parsed into subsets, with the subset being subjected to natural language processing to identify one or more verbs within the subset. The identified verbs are evaluated with respect to a dictionary and one or more rules. The evaluation is directed at each predicate in the rules with respect to the identified verbs. A neural network is leveraged to jointly induce modification of the rules and one or more dictionaries responsive to the evaluation.
    Type: Application
    Filed: February 19, 2021
    Publication date: August 25, 2022
    Applicant: International Business Machines Corporation
    Inventors: Prithviraj Sen, Marina Danilevsky Hailpern, Yunyao Li
  • Patent number: 11200413
    Abstract: Methods, systems, and computer program products for table recognition in PDF documents are provided herein. A computer-implemented method includes discretizing one or more contiguous areas of a PDF document; identifying one or more white-space separator lines within the one or more discretized contiguous areas of the PDF document; detecting one or more candidate table regions within the one or more discretized contiguous areas of the PDF document by clustering the one or more white-space separator lines into one or more grids; and outputting at least one of the candidate table regions as a finalized table in accordance with scores assigned to each of the one or more candidate table regions based on (i) border information and (ii) cell structure information.
    Type: Grant
    Filed: July 31, 2018
    Date of Patent: December 14, 2021
    Assignee: International Business Machines Corporation
    Inventors: Douglas Ronald Burdick, Wei Cheng, Alexandre Evfimievski, Marina Danilevsky Hailpern, Rajasekar Krishnamurthy, Shajith Ikbal Mohamed, Prithviraj Sen, Shivakumar Vaithyanathan
  • Publication number: 20210240917
    Abstract: A computer-implemented method according to one embodiment includes receiving, at a deep neural network (DNN), a plurality of sentences each having an associated label; training the DNN, utilizing the plurality of sentences and associated labels; and producing a linguistic expression (LE) utilizing the trained DNN.
    Type: Application
    Filed: February 3, 2020
    Publication date: August 5, 2021
    Inventors: Prithviraj Sen, Siddhartha Brahma, Yunyao Li, Laura Chiticariu, Rajasekar Krishnamurthy, Shivakumar Vaithyanathan, Marina Danilevsky Hailpern
  • Patent number: 11074508
    Abstract: Methods, systems, and computer program products for constraint tracking and inference generation are provided herein. A computer-implemented method includes parsing descriptions of one or more user-provided constraints pertaining to data within a target system, parsing truth value assignments to the user-provided constraints, and deriving a truth value for at least one of the user-provided constraints that does not correspond to a known truth value, wherein said deriving comprises performing a logical inference utilizing known truth values of one or more of the user-provided constraints. The computer-implemented method also includes storing, in a database, (i) the user-provided constraints, (ii) the known truth values, and (iii) the at least one derived truth value, and outputting the at least one derived truth value, one or more identified contradictions among the known truth values, and/or an indication that one or more unknown truth values corresponding to the user-provided constraints remain unknown.
    Type: Grant
    Filed: March 29, 2018
    Date of Patent: July 27, 2021
    Assignee: International Business Machines Corporation
    Inventors: Huaiyu Zhu, Michael James Wehar, Marina Danilevsky Hailpern, Mauricio Antonio Hernandez-Sherrington, Yunyao Li
  • Publication number: 20200042785
    Abstract: Methods, systems, and computer program products for table recognition in PDF documents are provided herein. A computer-implemented method includes discretizing one or more contiguous areas of a PDF document; identifying one or more white-space separator lines within the one or more discretized contiguous areas of the PDF document; detecting one or more candidate table regions within the one or more discretized contiguous areas of the PDF document by clustering the one or more white-space separator lines into one or more grids; and outputting at least one of the candidate table regions as a finalized table in accordance with scores assigned to each of the one or more candidate table regions based on (i) border information and (ii) cell structure information.
    Type: Application
    Filed: July 31, 2018
    Publication date: February 6, 2020
    Inventors: Douglas Ronald Burdick, Wei Cheng, Alexandre Evfimievski, Marina Danilevsky Hailpern, Rajasekar Krishnamurthy, Shajith Ikbal Mohamed, Prithviraj Sen, Shivakumar Vaithyanathan
  • Publication number: 20190303772
    Abstract: Methods, systems, and computer program products for constraint tracking and inference generation are provided herein. A computer-implemented method includes parsing descriptions of one or more user-provided constraints pertaining to data within a target system, parsing truth value assignments to the user-provided constraints, and deriving a truth value for at least one of the user-provided constraints that does not correspond to a known truth value, wherein said deriving comprises performing a logical inference utilizing known truth values of one or more of the user-provided constraints. The computer-implemented method also includes storing, in a database, (i) the user-provided constraints, (ii) the known truth values, and (iii) the at least one derived truth value, and outputting the at least one derived truth value, one or more identified contradictions among the known truth values, and/or an indication that one or more unknown truth values corresponding to the user-provided constraints remain unknown.
    Type: Application
    Filed: March 29, 2018
    Publication date: October 3, 2019
    Inventors: Huaiyu Zhu, Michael James Wehar, Marina Danilevsky Hailpern, Mauricio Antonio Hernandez-Sherrington, Yunyao Li
  • Publication number: 20190278853
    Abstract: Methods, systems, and computer program products for extracting structure and semantics from tabular data are provided herein. A computer-implemented method includes processing tabular data comprising data cells and header cells, wherein the processing includes: identifying one or more regions within the tabular data, wherein each of the regions comprises one or more of the data cells; matching some of the regions to one or more of the header cells, wherein the matched header cells are semantically related to the data cells inside the matched region; and generating, based on the matching, an output describing semantic relationships between the data cells and the header cells. The method also includes creating, for each data cell, a tuple comprising semantic information contained within one or more of the header cells that pertains to the data cell.
    Type: Application
    Filed: March 9, 2018
    Publication date: September 12, 2019
    Inventors: Xilun Chen, Laura Chiticariu, Alexandre Evfimievski, Marina Danilevsky Hailpern, Prithviraj Sen
  • Patent number: 10042846
    Abstract: One embodiment provides a method for constructing a cross-lingual information extraction program, the method including: utilizing at least one processor to execute computer code that performs the steps of: constructing a plurality of language-specific representations from text expressed in a plurality of languages by parsing the text of each language using a language-specific semantic parser; mapping the plurality of language-specific representations to a single cross-lingual semantic representation, wherein the cross-lingual semantic representation encompasses the plurality of languages; and constructing the cross-lingual information extraction program based on the cross-lingual semantic representation. Other aspects are described and claimed.
    Type: Grant
    Filed: April 28, 2016
    Date of Patent: August 7, 2018
    Assignee: International Business Machines Corporation
    Inventors: Alan Akbik, Laura Chiticariu, Marina Danilevsky Hailpern, Yunyao Li, Huaiyu Zhu
  • Patent number: 9898460
    Abstract: One embodiment provides a method for generating a natural language resource using a parallel corpus, the method including: utilizing at least one processor to execute computer code that performs the steps of: receiving, from a parallel corpus, natural language text in a source language and a corresponding translation of the natural language text in a target language, wherein the natural language text in the source language comprises linguistic annotations; projecting the linguistic annotations from the source language natural language text to the target language natural language text; applying one or more filters to remove at least one projected linguistic annotation from the target language natural language text that results in at least one error; selecting at least one target language natural language text having substantially complete linguistic annotations; training a machine learning model using the selected at least one target language natural language text and annotations; and adding, using the trained ...
    Type: Grant
    Filed: January 26, 2016
    Date of Patent: February 20, 2018
    Assignee: International Business Machines Corporation
    Inventors: Alan Akbik, Laura Chiticariu, Marina Danilevsky Hailpern, Yunyao Li, Huaiyu Zhu
  • Publication number: 20170315986
    Abstract: One embodiment provides a method for constructing a cross-lingual information extraction program, the method including: utilizing at least one processor to execute computer code that performs the steps of: constructing a plurality of language-specific representations from text expressed in a plurality of languages by parsing the text of each language using a language-specific semantic parser; mapping the plurality of language-specific representations to a single cross-lingual semantic representation, wherein the cross-lingual semantic representation encompasses the plurality of languages; and constructing the cross-lingual information extraction program based on the cross-lingual semantic representation. Other aspects are described and claimed.
    Type: Application
    Filed: April 28, 2016
    Publication date: November 2, 2017
    Inventors: Alan Akbik, Laura Chiticariu, Marina Danilevsky Hailpern, Yunyao Li, Huaiyu Zhu
  • Publication number: 20170212890
    Abstract: One embodiment provides a method for generating a natural language resource using a parallel corpus, the method including: utilizing at least one processor to execute computer code that performs the steps of: receiving, from a parallel corpus, natural language text in a source language and a corresponding translation of the natural language text in a target language, wherein the natural language text in the source language comprises linguistic annotations; projecting the linguistic annotations from the source language natural language text to the target language natural language text; applying one or more filters to remove at least one projected linguistic annotation from the target language natural language text that results in at least one error; selecting at least one target language natural language text having substantially complete linguistic annotations; training a machine learning model using the selected at least one target language natural language text and annotations; and adding, using the trained ...
    Type: Application
    Filed: January 26, 2016
    Publication date: July 27, 2017
    Inventors: Alan Akbik, Laura Chiticariu, Marina Danilevsky Hailpern, Yunyao Li, Huaiyu Zhu