Patents by Inventor Rob Voigt

Rob Voigt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210157984
    Abstract: Systems, methods, and apparatuses are presented for a novel natural language tokenizer and tagger. In some embodiments, a method for tokenizing text for natural language processing comprises: generating from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving a document to be processed; dividing the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting the divided tokens for natural language processing.
    Type: Application
    Filed: March 27, 2020
    Publication date: May 27, 2021
    Inventors: Robert J. Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk
  • Publication number: 20190205377
    Abstract: Systems, methods, and apparatuses are presented for a novel natural language tokenizer and tagger. In some embodiments, a method for tokenizing text for natural language processing comprises: generating from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving a document to be processed; dividing the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting the divided tokens for natural language processing.
    Type: Application
    Filed: August 6, 2018
    Publication date: July 4, 2019
    Applicant: Idibon, Inc.
    Inventors: Robert J. Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk
  • Patent number: 9965458
    Abstract: Systems, methods, and apparatuses are presented for a novel natural language tokenizer and tagger. In some embodiments, a method for tokenizing text for natural language processing comprises: generating from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving a document to be processed; dividing the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting the divided tokens for natural language processing.
    Type: Grant
    Filed: December 9, 2015
    Date of Patent: May 8, 2018
    Assignee: Sansa AI Inc.
    Inventors: Robert J. Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk
  • Publication number: 20180095946
    Abstract: Systems, methods, and apparatuses are presented for a novel natural language tokenizer and tagger. In some embodiments, a method for tokenizing text for natural language processing comprises: generating from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving a document to be processed; dividing the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting the divided tokens for natural language processing.
    Type: Application
    Filed: May 16, 2017
    Publication date: April 5, 2018
    Applicant: Idibon, Inc.
    Inventors: Robert Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk
  • Publication number: 20160162466
    Abstract: Systems, methods, and apparatuses are presented for a novel natural language tokenizer and tagger. In some embodiments, a method for tokenizing text for natural language processing comprises: generating from a pool of documents, a set of statistical models comprising one or more entries each indicating a likelihood of appearance of a character/letter sequence in the pool of documents; receiving a set of rules comprising rules that identify character/letter sequences as valid tokens; transforming one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving a document to be processed; dividing the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting the divided tokens for natural language processing.
    Type: Application
    Filed: December 9, 2015
    Publication date: June 9, 2016
    Applicant: Idibon, Inc.
    Inventors: Robert J. Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk