Patents by Inventor Alexander Karl Hudek

Alexander Karl Hudek has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11915157
    Abstract: A computerized method for training a computer executed model for recognizing numerical quantities is provided. An input, at least one unit expression, is received by an input module. The input module may then search for numeric values and the unit expression in a text corpus, wherein, the text corpus comprises sets of words and frequency of occurrence of each of the sets. The input module may identify identified sets, wherein the identified sets may comprise a combination of a numeric value and the unit expression. A synthetic text generation module may then generate sentences from the text corpus by applying the identified sets as input. A training dataset may be generated by a labeling module by auto labelling features in the generated sentences based on the numeric value and the unit expression and further a training module may train the training model by providing input based on the training dataset.
    Type: Grant
    Filed: October 21, 2022
    Date of Patent: February 27, 2024
    Assignees: KIRA INC., ZUVA INC.
    Inventors: Jamal Zabihi, Alexander Karl Hudek
  • Publication number: 20230040388
    Abstract: A computerized method for training a computer executed model for recognizing numerical quantities is provided. An input, at least one unit expression, is received by an input module. The input module may then search for numeric values and the unit expression in a text corpus, wherein, the text corpus comprises sets of words and frequency of occurrence of each of the sets. The input module may identify identified sets, wherein the identified sets may comprise a combination of a numeric value and the unit expression. A synthetic text generation module may then generate sentences from the text corpus by applying the identified sets as input. A training dataset may be generated by a labeling module by auto labelling features in the generated sentences based on the numeric value and the unit expression and further a training module may train the training model by providing input based on the training dataset.
    Type: Application
    Filed: October 21, 2022
    Publication date: February 9, 2023
    Inventors: Jamal ZABIHI, Alexander Karl HUDEK
  • Patent number: 11507864
    Abstract: A computerized method for training a computer executed model for recognizing numerical quantities is provided. An input, at least one unit expression, is received by an input module. The input module may then search for numeric values and the unit expression in a text corpus, wherein, the text corpus comprises sets of words and frequency of occurrence of each of the sets. The input module may identify identified sets, wherein the identified sets may comprise a combination of a numeric value and the unit expression. A synthetic text generation module may then generate sentences from the text corpus by applying the identified sets as input. A training dataset may be generated by a labeling module by auto labelling features in the generated sentences based on the numeric value and the unit expression and further a training module may train the training model by providing input based on the training dataset.
    Type: Grant
    Filed: May 26, 2020
    Date of Patent: November 22, 2022
    Assignees: Kira Inc., Zuva Inc.
    Inventors: Jamal Zabihi, Alexander Karl Hudek
  • Publication number: 20220076010
    Abstract: A method of generating text features from a document comprises one or more processors grouping text comprised in the document into multiple logical text blocks, wherein each of the logical text blocks comprises one or more tokens. One of the logical text blocks is selected for generating features. Thereafter, logical text blocks neighbouring the selected logical block are identified. Further, the processor qualifies one or more of the neighbouring logical text blocks for generating features. The processor generates features for one or more of the tokens in the selected logical block using the qualified logical text blocks.
    Type: Application
    Filed: September 9, 2020
    Publication date: March 10, 2022
    Inventors: Samuel Peter Thomas FLETCHER, Adam ROEGEIST, Alexander Karl HUDEK
  • Publication number: 20210374559
    Abstract: A computerized method for training a computer executed model for recognizing numerical quantities is provided. An input, at least one unit expression, is received by an input module. The input module may then search for numeric values and the unit expression in a text corpus, wherein, the text corpus comprises sets of words and frequency of occurrence of each of the sets. The input module may identify identified sets, wherein the identified sets may comprise a combination of a numeric value and the unit expression. A synthetic text generation module may then generate sentences from the text corpus by applying the identified sets as input. A training dataset may be generated by a labeling module by auto labelling features in the generated sentences based on the numeric value and the unit expression and further a training module may train the training model by providing input based on the training dataset.
    Type: Application
    Filed: May 26, 2020
    Publication date: December 2, 2021
    Inventors: Jamal ZABIHI, Alexander Karl HUDEK
  • Patent number: 11101979
    Abstract: The present invention discloses a method of creating word-level differential privacy with the hashing trick to protect confidentiality of textual data, the method comprising: receiving a list of a plurality of hashes with a weight (or weights) associated with each of the plurality of hashes; updating said list with new hashes that are within the range of allowable hash values but not included in said received list of hashes; updating said list with a new weight to each of said plurality of hashes that are missing said weight; fitting a probability distribution to said list of said weights of said plurality of hashes; and generating said new weights and said adjusted weights based on sampling of said probability distribution.
    Type: Grant
    Filed: May 30, 2019
    Date of Patent: August 24, 2021
    Assignee: KIRA INC.
    Inventors: Samuel Peter Thomas Fletcher, Alexander Karl Hudek
  • Patent number: 11087123
    Abstract: A method for extracting data contained in a fixed format electronic document is disclosed. The method is particularly applicable to extracting data from tables in electronic documents and includes reading, by a computer system, the electronic document as a computer image file; segmenting, by the computer system, the computer image file into document sections representative of distinct portions of data; applying a label to each distinct document section; and executing, by the computer system, an optical character recognition algorithm to convert the image file into computer-readable text, wherein segments of the converted text are associated with a respective label indicative of each distinct document section.
    Type: Grant
    Filed: August 24, 2019
    Date of Patent: August 10, 2021
    Assignee: KIRA INC.
    Inventors: Radha Chitta, Alexander Karl Hudek
  • Publication number: 20210056300
    Abstract: A method for extracting data contained in a fixed format electronic document is disclosed. The method is particularly applicable to extracting data from tables in electronic documents and includes reading, by a computer system, the electronic document as a computer image file; segmenting, by the computer system, the computer image file into document sections representative of distinct portions of data; applying a label to each distinct document section; and executing, by the computer system, an optical character recognition algorithm to convert the image file into computer-readable text, wherein segments of the converted text are associated with a respective label indicative of each distinct document section.
    Type: Application
    Filed: August 24, 2019
    Publication date: February 25, 2021
    Inventors: Radha CHITTA, Alexander Karl HUDEK
  • Publication number: 20200382281
    Abstract: The present invention discloses a method of creating word-level differential privacy with the hashing trick to protect confidentiality of textual data, the method comprising: receiving a list of a plurality of hashes with a weight (or weights) associated with each of the plurality of hashes; updating said list with new hashes that are within the range of allowable hash values but not included in said received list of hashes; updating said list with a new weight to each of said plurality of hashes that are missing said weight; fitting a probability distribution to said list of said weights of said plurality of hashes; and generating said new weights and said adjusted weights based on sampling of said probability distribution.
    Type: Application
    Filed: May 30, 2019
    Publication date: December 3, 2020
    Inventors: Samuel Peter Thomas FLETCHER, Alexander Karl HUDEK
  • Publication number: 20200143274
    Abstract: A system for answering multiple choice questions includes at least one processor configured to create a question answering model using a training data set. The system is configured to create a balanced data from the imbalanced training data set. The balancing of the imbalanced training data set is achieved by generating synthetic instances of at least one minority category, among a plurality of categories into which the training data set is categorized.
    Type: Application
    Filed: November 6, 2018
    Publication date: May 7, 2020
    Inventors: Radha CHITTA, Alexander Karl HUDEK
  • Patent number: 10157177
    Abstract: A method for entity extraction within an electronic document including executing by a computer processor a conditional random field algorithm stored on a computer readable medium to generate a conditional random field model; the conditional random field algorithm having an input including one or more training text documents; executing by a computer processor an entity extraction algorithm stored on a computer readable medium to generate an entity extraction model; the entity extraction algorithm having an input including the same one or more training text documents input into the conditional random field algorithm; applying by a computer processor the conditional random field model to at least one electronic document; wherein application of the conditional random field model returns a list of passages in the at least one electronic document having an entity; applying by a computer processor the entity extraction model to the at least one electronic document; wherein application of the entity extraction model
    Type: Grant
    Filed: October 28, 2016
    Date of Patent: December 18, 2018
    Assignee: KIRA INC.
    Inventors: Robert Henry Warren, Alexander Karl Hudek
  • Publication number: 20180121413
    Abstract: A method for entity extraction within an electronic document including executing by a computer processor a conditional random field algorithm stored on a computer readable medium to generate a conditional random field model; the conditional random field algorithm having an input including one or more training text documents; executing by a computer processor an entity extraction algorithm stored on a computer readable medium to generate an entity extraction model; the entity extraction algorithm having an input including the same one or more training text documents input into the conditional random field algorithm; applying by a computer processor the conditional random field model to at least one electronic document; wherein application of the conditional random field model returns a list of passages in the at least one electronic document having an entity; applying by a computer processor the entity extraction model to the at least one electronic document; wherein application of the entity extraction model
    Type: Application
    Filed: October 28, 2016
    Publication date: May 3, 2018
    Inventors: Robert Henry WARREN, Alexander Karl HUDEK
  • Publication number: 20180011919
    Abstract: A system and method for clustering electronic documents where the method includes identifying a plurality of electronic documents stored on a computer readable medium, determining by a computer processor a distance metric between each document in said plurality of electronic documents, and grouping by the computer processor one or more documents from said plurality of electronic documents into clusters based on a maximum permissible distance metric between documents within a cluster.
    Type: Application
    Filed: July 5, 2016
    Publication date: January 11, 2018
    Inventors: Robert Henry WARREN, Alexander Karl HUDEK
  • Patent number: 9645988
    Abstract: The methods proposed here deconstruct training sentences into a stream of features that represent both the sentences and tokens used by the text, their sequence and other ancillary features extracted using natural language processing. Then, we use a conditional random field where we represent the concept we are looking for as state A and the background (everything not concept A) as a state B. The model created by this training phase is then used to locate the concept as a sequence of sentences within a document. This has distinct advantages in accuracy and speed over methods that individually classify each sentence and then use a secondary method to group the classified sentences into passages. Furthermore, while previous methods were based on searching for the occurrence of tokens only, the use of a wider set of features enables this method to locate relevant passages even though a different terminology is in use.
    Type: Grant
    Filed: August 25, 2016
    Date of Patent: May 9, 2017
    Assignee: KIRA INC.
    Inventors: Robert Henry Warren, Alexander Karl Hudek
  • Patent number: 9158839
    Abstract: A mechanism for training and classifying data is disclosed. The method includes receiving a data set having at least a first annotation and at least a second annotation. The first annotation and the second annotation represent characteristics within the data set. The method also includes determining a first identifier from the first annotation and a second identifier from the second annotation and associating the first identifier to the second identifier to generate a joined identifier. The method also includes computing feature weights and transition weights for the annotated data set based on the at least a first identifier, at least a second identifier, and at least a joined identifier and transitions between each of the first, the second and the joined identifiers. The method further includes receiving a second un-annotated data set and classifying the second data set based on the computed feature weights and the transition weights.
    Type: Grant
    Filed: August 23, 2012
    Date of Patent: October 13, 2015
    Assignee: DiligenceEngine, Inc.
    Inventor: Alexander Karl Hudek
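
Illustrative sketch (not taken from any of the filings): the abstract of publication 20180011919 describes grouping documents into clusters subject to a maximum permissible distance between documents within a cluster. The Python below shows one simple way such a constraint could be enforced. The Jaccard distance on word sets, the greedy first-fit assignment, and the names `jaccard_distance` and `cluster_documents` are all assumptions made for illustration; the abstract does not specify a distance metric or a grouping procedure.

```python
from typing import List, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Return 1 - |intersection| / |union|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_documents(docs: List[str], max_distance: float) -> List[List[int]]:
    """Greedily group documents so that every pair inside a cluster is
    within max_distance of each other. Returns clusters of document indices."""
    # Hypothetical representation: each document is a set of lowercase words.
    token_sets = [set(doc.lower().split()) for doc in docs]
    clusters: List[List[int]] = []
    for i, tokens in enumerate(token_sets):
        for cluster in clusters:
            # Join the first cluster whose members are all close enough.
            if all(jaccard_distance(tokens, token_sets[j]) <= max_distance
                   for j in cluster):
                cluster.append(i)
                break
        else:
            # No existing cluster satisfies the constraint; start a new one.
            clusters.append([i])
    return clusters


if __name__ == "__main__":
    sample = [
        "the lease term is five years",
        "the lease term is ten years",
        "governing law is ontario",
    ]
    print(cluster_documents(sample, max_distance=0.5))
```

Because a document joins a cluster only when it is within the threshold of every existing member, no pair inside a cluster can exceed the maximum distance. The result depends on document order, which a production system would likely need to address.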