Patents by Inventor Alexander Karl Hudek

Alexander Karl Hudek has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Computerized method of training a computer executed model for recognizing numerical quantities

Patent number: 11915157

Abstract: A computerized method for training a computer executed model for recognizing numerical quantities is provided. An input, at least one unit expression, is received by an input module. The input module may then search for numeric values and the unit expression in a text corpus, wherein, the text corpus comprises sets of words and frequency of occurrence of each of the sets. The input module may identify identified sets, wherein the identified sets may comprise a combination of a numeric value and the unit expression. A synthetic text generation module may then generate sentences from the text corpus by applying the identified sets as input. A training dataset may be generated by a labeling module by auto labelling features in the generated sentences based on the numeric value and the unit expression and further a training module may train the training model by providing input based on the training dataset.

Type: Grant

Filed: October 21, 2022

Date of Patent: February 27, 2024

Assignees: KIRA INC., ZUVA INC.

Inventors: Jamal Zabihi, Alexander Karl Hudek
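The pipeline in the abstract above (find numeric value plus unit combinations in a frequency-annotated corpus, generate text from them, auto-label it) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names, the `B-QTY`/`I-QTY` label scheme, and the use of frequency-weighted sampling of matching n-grams in place of the synthetic text generation module, whose details the abstract does not give. The final model-training step is omitted.

```python
import random
import re

# Hypothetical pattern for a numeric token such as "30", "1,000" or "2.5".
NUMBER = re.compile(r"^\d+(?:[.,]\d+)*$")


def find_quantity_ngrams(ngram_counts, unit):
    """Return (tokens, index, count) for n-grams where a number directly precedes the unit."""
    hits = []
    for ngram, count in ngram_counts.items():
        tokens = ngram.split()
        for i in range(len(tokens) - 1):
            if NUMBER.match(tokens[i]) and tokens[i + 1].lower() == unit.lower():
                hits.append((tokens, i, count))
                break
    return hits


def label_tokens(tokens, start):
    """Auto-label the numeric value and unit; every other token is background ("O")."""
    labels = ["O"] * len(tokens)
    labels[start] = "B-QTY"
    labels[start + 1] = "I-QTY"
    return list(zip(tokens, labels))


def build_training_set(ngram_counts, unit, n_samples, seed=0):
    """Sample matching n-grams in proportion to their corpus frequency and label them."""
    hits = find_quantity_ngrams(ngram_counts, unit)
    if not hits:
        return []
    rng = random.Random(seed)
    weights = [count for _, _, count in hits]
    sampled = rng.choices(hits, weights=weights, k=n_samples)
    return [label_tokens(tokens, i) for tokens, i, _ in sampled]


# Toy corpus: n-gram -> frequency of occurrence.
counts = {
    "within 30 days of notice": 5,
    "the days of old": 9,
    "a fee of 30 dollars": 2,
}
training_set = build_training_set(counts, "days", n_samples=3)
```

Each element of `training_set` is a list of (token, label) pairs that a sequence-labelling model could be trained on.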
COMPUTERIZED METHOD OF TRAINING A COMPUTER EXECUTED MODEL FOR RECOGNIZING NUMERICAL QUANTITIES

Publication number: 20230040388

Abstract: A computerized method for training a computer executed model for recognizing numerical quantities is provided. An input, at least one unit expression, is received by an input module. The input module may then search for numeric values and the unit expression in a text corpus, wherein, the text corpus comprises sets of words and frequency of occurrence of each of the sets. The input module may identify identified sets, wherein the identified sets may comprise a combination of a numeric value and the unit expression. A synthetic text generation module may then generate sentences from the text corpus by applying the identified sets as input. A training dataset may be generated by a labeling module by auto labelling features in the generated sentences based on the numeric value and the unit expression and further a training module may train the training model by providing input based on the training dataset.

Type: Application

Filed: October 21, 2022

Publication date: February 9, 2023

Inventors: Jamal ZABIHI, Alexander Karl HUDEK
Computerized method of training a computer executed model for recognizing numerical quantities

Patent number: 11507864

Abstract: A computerized method for training a computer executed model for recognizing numerical quantities is provided. An input, at least one unit expression, is received by an input module. The input module may then search for numeric values and the unit expression in a text corpus, wherein, the text corpus comprises sets of words and frequency of occurrence of each of the sets. The input module may identify identified sets, wherein the identified sets may comprise a combination of a numeric value and the unit expression. A synthetic text generation module may then generate sentences from the text corpus by applying the identified sets as input. A training dataset may be generated by a labeling module by auto labelling features in the generated sentences based on the numeric value and the unit expression and further a training module may train the training model by providing input based on the training dataset.

Type: Grant

Filed: May 26, 2020

Date of Patent: November 22, 2022

Assignees: Kira Inc., Zuva Inc.

Inventors: Jamal Zabihi, Alexander Karl Hudek
METHOD OF GENERATING TEXT FEATURES FROM A DOCUMENT

Publication number: 20220076010

Abstract: A method of generating text features from a document comprises one or more processors grouping text comprised in the document into multiple logical text blocks, wherein each of the logical text blocks comprises one or more tokens. One of the logical text blocks is selected for generating features. Thereafter, logical text blocks neighbouring the selected logical block are identified. Further, the processor qualifies one or more of the neighbouring logical text blocks for generating features. The processor generates features for one or more of the tokens in the selected logical block using the qualified logical text blocks.

Type: Application

Filed: September 9, 2020

Publication date: March 10, 2022

Inventors: Samuel Peter Thomas FLETCHER, Adam ROEGEIST, Alexander Karl HUDEK
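The steps in the abstract above (select a logical block, identify its neighbours, qualify some of them, then build token features from the qualified ones) can be illustrated with a short sketch. The qualification rule used here (a neighbour must contain at least `min_tokens` tokens), the `window` parameter, and the feature names are all hypothetical; the abstract does not say how neighbours are qualified or what the features look like.

```python
def block_features(blocks, selected, window=1, min_tokens=2):
    """Build per-token features for blocks[selected] using qualified neighbouring blocks.

    blocks: list of logical text blocks, each a list of tokens.
    """
    # Identify neighbours: blocks within `window` positions of the selected block.
    neighbours = [
        j
        for j in range(selected - window, selected + window + 1)
        if 0 <= j < len(blocks) and j != selected
    ]
    # Qualify neighbours (assumed rule: long enough to carry useful context).
    qualified = [j for j in neighbours if len(blocks[j]) >= min_tokens]
    context = sorted({tok.lower() for j in qualified for tok in blocks[j]})

    features = []
    for position, token in enumerate(blocks[selected]):
        token_features = {"token": token.lower(), "position": position}
        for ctx in context:
            token_features[f"ctx={ctx}"] = True
        features.append(token_features)
    return features


blocks = [["Term"], ["Rent", "is", "due"], ["Payment", "schedule"]]
features = block_features(blocks, selected=1)
```

In this toy input the one-token heading block is not qualified, so only the tokens of the following block appear as context features.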
COMPUTERIZED METHOD OF TRAINING A COMPUTER EXECUTED MODEL FOR RECOGNIZING NUMERICAL QUANTITIES

Publication number: 20210374559

Abstract: A computerized method for training a computer executed model for recognizing numerical quantities is provided. An input, at least one unit expression, is received by an input module. The input module may then search for numeric values and the unit expression in a text corpus, wherein, the text corpus comprises sets of words and frequency of occurrence of each of the sets. The input module may identify identified sets, wherein the identified sets may comprise a combination of a numeric value and the unit expression. A synthetic text generation module may then generate sentences from the text corpus by applying the identified sets as input. A training dataset may be generated by a labeling module by auto labelling features in the generated sentences based on the numeric value and the unit expression and further a training module may train the training model by providing input based on the training dataset.

Type: Application

Filed: May 26, 2020

Publication date: December 2, 2021

Inventors: Jamal ZABIHI, Alexander Karl HUDEK
Method and system for creating word-level differential privacy using feature hashing techniques

Patent number: 11101979

Abstract: The present invention discloses a method of creating word-level differential privacy with the hashing trick to protect confidentiality of textual data, the method comprising: receiving a list of a plurality of hashes with a weight (or weights) associated with each of the plurality of hashes; updating said list with new hashes that are within the range of allowable hash values but not included in said received list of hashes; updating said list with a new weight to each of said plurality of hashes that are missing said weight; fitting a probability distribution to said list of said weights of said plurality of hashes; and generating said new weights and said adjusted weights based on sampling of said probability distribution.

Type: Grant

Filed: May 30, 2019

Date of Patent: August 24, 2021

Assignee: KIRA INC.

Inventors: Samuel Peter Thomas Fletcher, Alexander Karl Hudek
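The list-filling steps in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the bucket function, the choice of a normal distribution, and all names are hypothetical, and the sketch only fills in missing buckets; it does not adjust existing weights or calibrate anything to a privacy budget.

```python
import hashlib
import random
import statistics


def bucket(word, n_buckets):
    """Hashing trick: map a word to one of n_buckets feature indices."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def fill_hash_weights(weights, n_buckets, seed=0):
    """Give every bucket in range(n_buckets) a weight.

    weights: dict mapping bucket index -> learned weight (only observed buckets).
    Missing buckets receive weights sampled from a normal distribution fitted
    to the observed weights (assumed distribution family).
    """
    observed = list(weights.values())
    mu = statistics.fmean(observed)
    sigma = statistics.pstdev(observed)
    rng = random.Random(seed)
    filled = dict(weights)
    for index in range(n_buckets):
        if index not in filled:
            filled[index] = rng.gauss(mu, sigma)
    return filled


observed_weights = {0: 0.5, 3: -0.2}
filled_weights = fill_hash_weights(observed_weights, n_buckets=8)
```

After filling, the mere presence of a bucket in the list no longer shows that a word hashing to that bucket occurred in the training text.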
Text extraction, in particular table extraction from electronic documents

Patent number: 11087123

Abstract: A method for extracting data contained in a fixed format electronic document is disclosed. The method is particularly applicable to extracting data from tables in electronic documents and includes reading, by a computer system, the electronic document as a computer image file; segmenting, by the computer system, the computer image file into document sections representative of distinct portions of data; applying a label to each distinct document section; and executing, by the computer system, an optical character recognition algorithm to convert the image file into computer-readable text, wherein segments of the converted text are associated with a respective label indicative of each distinct document section.

Type: Grant

Filed: August 24, 2019

Date of Patent: August 10, 2021

Assignee: KIRA INC.

Inventors: Radha Chitta, Alexander Karl Hudek
TEXT EXTRACTION, IN PARTICULAR TABLE EXTRACTION FROM ELECTRONIC DOCUMENTS

Publication number: 20210056300

Abstract: A method for extracting data contained in a fixed format electronic document is disclosed. The method is particularly applicable to extracting data from tables in electronic documents and includes reading, by a computer system, the electronic document as a computer image file; segmenting, by the computer system, the computer image file into document sections representative of distinct portions of data; applying a label to each distinct document section; and executing, by the computer system, an optical character recognition algorithm to convert the image file into computer-readable text, wherein segments of the converted text are associated with a respective label indicative of each distinct document section.

Type: Application

Filed: August 24, 2019

Publication date: February 25, 2021

Inventors: Radha CHITTA, Alexander Karl HUDEK
METHOD AND SYSTEM FOR CREATING WORD-LEVEL DIFFERENTIAL PRIVACY USING FEATURE HASHING TECHNIQUES

Publication number: 20200382281

Abstract: The present invention discloses a method of creating word-level differential privacy with the hashing trick to protect confidentiality of textual data, the method comprising: receiving a list of a plurality of hashes with a weight (or weights) associated with each of the plurality of hashes; updating said list with new hashes that are within the range of allowable hash values but not included in said received list of hashes; updating said list with a new weight to each of said plurality of hashes that are missing said weight; fitting a probability distribution to said list of said weights of said plurality of hashes; and generating said new weights and said adjusted weights based on sampling of said probability distribution.

Type: Application

Filed: May 30, 2019

Publication date: December 3, 2020

Inventors: Samuel Peter Thomas FLETCHER, Alexander Karl HUDEK
SYSTEM AND METHOD FOR APPLYING ARTIFICIAL INTELLIGENCE TECHNIQUES TO RESPOND TO MULTIPLE CHOICE QUESTIONS

Publication number: 20200143274

Abstract: A system for answering multiple choice questions includes at least one processor configured to create a question answering model using a training data set. The system is configured to create a balanced data set from the imbalanced training data set. The balancing of the imbalanced training data set is achieved by generating synthetic instances of at least one minority category, among a plurality of categories into which the training data set is categorized.

Type: Application

Filed: November 6, 2018

Publication date: May 7, 2020

Inventors: Radha CHITTA, Alexander Karl HUDEK
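The balancing step in the abstract above can be illustrated with a small oversampling sketch. The abstract does not say how synthetic instances are generated; the linear interpolation between two random same-category instances used here is an assumption (it is loosely in the style of SMOTE, but without the nearest-neighbour search), and the function name and data are hypothetical.

```python
import random
from collections import Counter


def oversample(features, labels, seed=0):
    """Add synthetic instances until every category matches the largest one.

    features: list of numeric feature vectors; labels: list of category labels.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_features, out_labels = list(features), list(labels)
    for label, count in counts.items():
        pool = [x for x, y in zip(features, labels) if y == label]
        for _ in range(target - count):
            # Synthetic instance: a random point on the segment between two
            # existing instances of the same category.
            a, b = rng.choice(pool), rng.choice(pool)
            t = rng.random()
            out_features.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
            out_labels.append(label)
    return out_features, out_labels


X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [10.0, 10.0]]
y = ["A", "A", "A", "B"]
balanced_X, balanced_y = oversample(X, y)
```

The original instances are kept unchanged; only the minority category "B" gains synthetic rows.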
System and method for extracting entities in electronic documents

Patent number: 10157177

Abstract: A method for entity extraction within an electronic document including executing by a computer processor a conditional random field algorithm stored on a computer readable medium to generate a conditional random field model; the conditional random field algorithm having an input including one or more training text documents; executing by a computer processor an entity extraction algorithm stored on a computer readable medium to generate an entity extraction model; the entity extraction algorithm having an input including the same one or more training text documents input into the conditional random field algorithm; applying by a computer processor the conditional random field model to at least one electronic document; wherein application of the conditional random field model returns a list of passages in the at least one electronic document having an entity; applying by a computer processor the entity extraction model to the at least one electronic document; wherein application of the entity extraction model

Type: Grant

Filed: October 28, 2016

Date of Patent: December 18, 2018

Assignee: KIRA INC.

Inventors: Robert Henry Warren, Alexander Karl Hudek
SYSTEM AND METHOD FOR EXTRACTING ENTITIES IN ELECTRONIC DOCUMENTS

Publication number: 20180121413

Abstract: A method for entity extraction within an electronic document including executing by a computer processor a conditional random field algorithm stored on a computer readable medium to generate a conditional random field model; the conditional random field algorithm having an input including one or more training text documents; executing by a computer processor an entity extraction algorithm stored on a computer readable medium to generate an entity extraction model; the entity extraction algorithm having an input including the same one or more training text documents input into the conditional random field algorithm; applying by a computer processor the conditional random field model to at least one electronic document; wherein application of the conditional random field model returns a list of passages in the at least one electronic document having an entity; applying by a computer processor the entity extraction model to the at least one electronic document; wherein application of the entity extraction model

Type: Application

Filed: October 28, 2016

Publication date: May 3, 2018

Inventors: Robert Henry WARREN, Alexander Karl HUDEK
SYSTEMS AND METHOD FOR CLUSTERING ELECTRONIC DOCUMENTS

Publication number: 20180011919

Abstract: A system and method for clustering electronic documents where the method includes identifying a plurality of electronic documents stored on a computer readable medium, determining by a computer processor a distance metric between each document in said plurality of electronic documents, and grouping by the computer processor one or more documents from said plurality of electronic documents into clusters based on a maximum permissible distance metric between documents within a cluster.

Type: Application

Filed: July 5, 2016

Publication date: January 11, 2018

Inventors: Robert Henry WARREN, Alexander Karl HUDEK
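The grouping rule in the abstract above (clusters bounded by a maximum permissible distance between their documents) can be sketched with a simple greedy pass. The Jaccard distance over whitespace tokens and the first-fit assignment are assumptions chosen for brevity; the abstract names neither a specific distance metric nor a clustering procedure.

```python
def jaccard_distance(a, b):
    """Distance between two documents as 1 minus the Jaccard similarity of their token sets."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return 1.0 - len(tokens_a & tokens_b) / len(union)


def cluster(docs, max_distance):
    """Greedy clustering: a document joins the first cluster in which it is within
    max_distance of every member; otherwise it starts a new cluster."""
    clusters = []
    for i, doc in enumerate(docs):
        for members in clusters:
            if all(jaccard_distance(doc, docs[j]) <= max_distance for j in members):
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


docs = [
    "lease agreement for office",
    "lease agreement for warehouse",
    "employment offer letter",
]
groups = cluster(docs, max_distance=0.5)
```

Because a document is compared against every member of a cluster before joining it, no two documents in the same cluster are farther apart than `max_distance`.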
System and method for identifying passages in electronic documents

Patent number: 9645988

Abstract: The method proposed here deconstructs training sentences into a stream of features that represent both the sentences and tokens used by the text, their sequence and other ancillary features extracted using natural language processing. Then, we use a conditional random field where we represent the concept we are looking for as state A and the background (everything not concept A) as a state B. The model created by this training phase is then used to locate the concept as a sequence of sentences within a document. This has distinct advantages in accuracy and speed over methods that individually classify each sentence and then use a secondary method to group the classified sentences into passages. Furthermore, while previous methods were based on searching for the occurrence of tokens only, the use of a wider set of features enables this method to locate relevant passages even though a different terminology is in use.

Type: Grant

Filed: August 25, 2016

Date of Patent: May 9, 2017

Assignee: KIRA INC.

Inventors: Robert Henry Warren, Alexander Karl Hudek
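The two-state idea in the abstract above (state A for the concept, state B for the background, decoded over a sequence of sentences) can be illustrated with a small Viterbi decoder. This is a sketch of the decoding step only: the per-sentence scores and the `switch_penalty` are hand-set, hypothetical values standing in for what a trained conditional random field would provide, and state B is given a fixed score of 0.

```python
def viterbi_two_state(scores_a, switch_penalty=1.0):
    """Return the best state sequence ("A" or "B") for a list of sentences.

    scores_a[i]: score for sentence i belonging to the concept (state A).
    switch_penalty: cost paid whenever consecutive sentences change state.
    """
    if not scores_a:
        return []
    states = ("A", "B")
    best = {"A": scores_a[0], "B": 0.0}
    back_pointers = []
    for score in scores_a[1:]:
        emit = {"A": score, "B": 0.0}
        new_best, pointers = {}, {}
        for cur in states:
            # Best previous state, accounting for the cost of switching.
            prev = max(
                states,
                key=lambda p: best[p] - (switch_penalty if p != cur else 0.0),
            )
            transition = switch_penalty if prev != cur else 0.0
            new_best[cur] = best[prev] - transition + emit[cur]
            pointers[cur] = prev
        best = new_best
        back_pointers.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda st: best[st])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


sentence_scores = [-2.0, 3.0, -0.5, 3.0, -2.0]
path = viterbi_two_state(sentence_scores)
```

With these scores the third sentence is weakly negative on its own, yet the decoded path keeps it inside the passage because leaving and re-entering state A would cost more than the sentence loses. Classifying each sentence independently would split the passage in two.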
Systems and methods for training and classifying data

Patent number: 9158839

Abstract: A mechanism for training and classifying data is disclosed. The method includes receiving a data set having at least a first annotation and at least a second annotation. The first annotation and the second annotation represent characteristics within the data set. The method also includes determining a first identifier from the first annotation and a second identifier from the second annotation and associating the first identifier to the second identifier to generate a joined identifier. The method also includes computing feature weights and transition weights for the annotated data set based on the at least a first identifier, at least a second identifier, and at least a joined identifier and transitions between each of the first, the second and the joined identifiers. The method further includes receiving a second un-annotated data set and classifying the second data set based on the computed feature weights and the transition weights.

Type: Grant

Filed: August 23, 2012

Date of Patent: October 13, 2015

Assignee: DiligenceEngine, Inc.

Inventor: Alexander Karl Hudek