Patents by Inventor Tharathorn RIMCHALA

Tharathorn RIMCHALA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Methods and systems for generating mobile enabled extraction models

Patent number: 11977842

Abstract: A computing system generates a plurality of training data sets for generating the NLP model. The computing system trains a teacher network to extract and classify tokens from a document. The training includes a pre-training stage where the teacher network is trained to classify generic data in the plurality of training data sets and a fine-tuning stage where the teacher network is trained to classify targeted data in the plurality of training data sets. The computing system trains a student network to extract and classify tokens from a document by distilling knowledge learned by the teacher network during the fine-tuning stage from the teacher network to the student network. The computing system outputs the NLP model based on the training. The computing system causes the NLP model to be deployed in a remote computing environment.

Type: Grant

Filed: April 30, 2021

Date of Patent: May 7, 2024

Assignee: INTUIT INC.

Inventors: Dominic Miguel Rossi, Hui Fang Lee, Tharathorn Rimchala
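
The teacher-to-student distillation step named in the abstract above can be illustrated with a minimal sketch. The abstract does not say which loss is used, so the temperature-softened KL divergence below is an assumption (a common choice for knowledge distillation), and the function names and the temperature value are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions.

    Assumption: this loss is illustrative only; the patent abstract does not specify one.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

if __name__ == "__main__":
    # Per-token class logits from a hypothetical teacher and student.
    teacher = [1.0, 2.0, 3.0]
    student = [3.0, 2.0, 1.0]
    print(distillation_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the signal a student network would be trained to minimize.
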
SYSTEM AND METHOD FOR SPATIAL ENCODING AND FEATURE GENERATORS FOR ENHANCING INFORMATION EXTRACTION

Publication number: 20240054802

Abstract: A system and method for extracting data from a piece of content using spatial information about the piece of content. The system and method may use a conditional random fields process or a bidirectional long short term memory and conditional random fields process to extract structured data using the spatial information.

Type: Application

Filed: October 24, 2023

Publication date: February 15, 2024

Applicant: INTUIT INC.

Inventor: Tharathorn RIMCHALA

TEXT FEATURE GUIDED VISUAL BASED DOCUMENT CLASSIFIER

Publication number: 20240037125

Abstract: A visual-based classification model influenced by text features as a result of the outputs of a text-based classification model is disclosed. A system receives one or more documents to be classified based on one or more visual features and provides the one or more documents to a student classification model, which is a visual-based classification model. The system also classifies, by the student classification model, the one or more documents into one or more document types based on one or more visual features. The one or more visual features are generated by the student classification model that is trained based on important text identified by a teacher classification model for the one or more document types, with the teacher classification model being a text-based classification model. Generating training data and training the student classification model based on the training data are also described.

Type: Application

Filed: June 16, 2023

Publication date: February 1, 2024

Applicant: Intuit Inc.

Inventors: Tharathorn RIMCHALA, Yingxin Wang

Systems and methods for training an information extraction transformer model architecture

Patent number: 11861884

Abstract: Certain aspects of the disclosure provide systems and methods for training an information extraction transformer model architecture directed to pre-training a first multimodal transformer model on an unlabeled dataset, training a second multimodal transformer model on a first labeled dataset to perform a key information extraction task, processing the unlabeled dataset with the second multimodal transformer model to generate pseudo-labels for the unlabeled dataset, training the first multimodal transformer model based on a second labeled dataset comprising one or more labels, the pseudo-labels generated, or combinations thereof to generate a third multimodal transformer model, generating updated pseudo-labels based on label completion predictions from the third multimodal transformer model, and training the third multimodal transformer model using a noise-aware loss function and the updated pseudo-labels to generate an updated third multimodal transformer model.

Type: Grant

Filed: April 10, 2023

Date of Patent: January 2, 2024

Assignee: Intuit, Inc.

Inventors: Karelia Del Carmen Pena Pena, Tharathorn Rimchala, Peter Lee Frick, Tak Yiu Daniel Li

System and method for spatial encoding and feature generators for enhancing information extraction

Patent number: 11837002

Abstract: A system and method for extracting data from a piece of content using spatial information about the piece of content. The system and method may use a conditional random fields process or a bidirectional long short term memory and conditional random fields process to extract structured data using the spatial information.

Type: Grant

Filed: February 1, 2019

Date of Patent: December 5, 2023

Assignee: INTUIT INC.

Inventor: Tharathorn Rimchala

ENTITY EXTRACTION WITH ENCODER DECODER MACHINE LEARNING MODEL

Publication number: 20230386236

Abstract: A method includes executing an encoder machine learning model on multiple token values contained in a document to create an encoder hidden state vector. A decoder machine learning model executing on the encoder hidden state vector generates raw text comprising an entity value and an entity label for each of multiple entities. The method further includes generating a structural representation of the entities directly from the raw text and outputting the structural representation of the entities of the document.

Type: Application

Filed: November 30, 2022

Publication date: November 30, 2023

Applicant: Intuit Inc.

Inventors: Tharathorn Rimchala, Peter Frick
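
The last step in the abstract above, turning the decoder's raw text into a structural representation of the entities, can be sketched as follows. The abstract does not describe the raw text format, so the `label: value; label: value` layout, the separators, and the function name `parse_entities` are all hypothetical.

```python
def parse_entities(raw_text, entity_sep=";", label_sep=":"):
    """Parse decoder output such as 'vendor: Acme; total: 42.10' into a list of records.

    Assumption: the separators and the label-before-value order are illustrative only.
    """
    entities = []
    for chunk in raw_text.split(entity_sep):
        if label_sep not in chunk:
            continue  # skip fragments that carry no label/value pair
        label, value = chunk.split(label_sep, 1)
        label, value = label.strip(), value.strip()
        if label and value:
            entities.append({"label": label, "value": value})
    return entities

if __name__ == "__main__":
    raw = "vendor: Acme Corp; total: 42.10; date: 2022-05-31"
    for entity in parse_entities(raw):
        print(entity)
```

Splitting on the first label separator only keeps values that themselves contain a colon, such as times, intact.
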
Image-based document search using machine learning

Patent number: 11829406

Abstract: Aspects of the present disclosure provide techniques for image-based document search. Embodiments include receiving an image of a document and providing the image of the document as input to a machine learning model, where the machine learning model generates separate embeddings of a plurality of patches of the image of the document and the machine learning model generates an embedding of the image of the document based on the separate embeddings of the plurality of patches. Embodiments include determining a compact embedding of the image of the document based on applying a dimensionality reduction technique to the embedding of the image of the document generated by the machine learning model. Embodiments include performing a search for relevant documents based on the compact embedding of the image of the document. Embodiments include performing one or more actions based on one or more relevant documents identified through the search.

Type: Grant

Filed: June 30, 2023

Date of Patent: November 28, 2023

Assignee: INTUIT, INC.

Inventors: Shir Meir Lador, Sameeksha Khillan, Peter Lee Frick, Tharathorn Rimchala, Guohan Gao

Compositional pipeline for generating synthetic training data for machine learning models to extract line items from OCR text

Patent number: 11798301

Abstract: Systems and methods of generating synthetic training data for machine learning models. First, line items in source documents such as bills, invoices, and/or receipts are identified and labeled. The identification and labeling generate labeled documents. Then, in the labeled documents, the line items are augmented by adding, deleting, and/or swapping line items to generate synthetic training documents. An addition operation randomly selects one or more line items and adds the selected line item(s) to the same labeled document or another labeled document. A deletion operation randomly deletes one or more line items. A swapping operation randomly swaps line items in a single labeled document or across different labeled documents. These operations can generate synthetic labeled documents of any length, which form synthetic training data for training the machine learning models.

Type: Grant

Filed: October 17, 2022

Date of Patent: October 24, 2023

Assignee: INTUIT INC.

Inventor: Tharathorn Rimchala
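
The addition, deletion, and swapping operations described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the patented pipeline: a labeled document is assumed to be a plain list of `(text, label)` line items, and the function names are hypothetical. Only the cross-document form of swapping is shown.

```python
import random

def add_items(doc, donor, rng, k=1):
    """Return a copy of doc with k randomly chosen donor line items inserted at random positions."""
    out = list(doc)
    for item in rng.choices(donor, k=k):
        out.insert(rng.randint(0, len(out)), item)
    return out

def delete_items(doc, rng, k=1):
    """Return a copy of doc with up to k randomly chosen line items removed."""
    out = list(doc)
    for _ in range(min(k, len(out))):
        out.pop(rng.randrange(len(out)))
    return out

def swap_items(doc_a, doc_b, rng):
    """Return copies of two documents with one randomly chosen line item exchanged."""
    a, b = list(doc_a), list(doc_b)
    i, j = rng.randrange(len(a)), rng.randrange(len(b))
    a[i], b[j] = b[j], a[i]
    return a, b

if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are repeatable
    invoice = [("Widget 2 x 5.00", "ITEM"), ("Gadget 1 x 9.99", "ITEM")]
    receipt = [("Coffee 3.50", "ITEM"), ("Bagel 2.25", "ITEM"), ("Tea 2.00", "ITEM")]
    print(add_items(invoice, receipt, rng, k=2))
    print(delete_items(receipt, rng))
    print(swap_items(invoice, receipt, rng))
```

Each operation returns new lists and leaves its inputs untouched, so the same labeled source documents can be reused to generate many synthetic variants of different lengths.
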
Generalizable key-value set extraction from documents using machine learning models

Patent number: 11783605

Abstract: Certain aspects of the present disclosure provide techniques for training and using machine learning models to extract key-value sets from a document. An example method generally includes identifying regions of a document including key-value sets corresponding to inputs to a data processing application based on a first machine learning model and an electronic version of the document. One or more keys and one or more values are identified in the document based on a second machine learning model. One or more key-value sets are generated based on matching keys of the one or more keys and values of the one or more values in the region of the document. The one or more key-value sets are provided to a data processing application for processing.

Type: Grant

Filed: June 30, 2022

Date of Patent: October 10, 2023

Assignee: INTUIT, INC.

Inventors: Amogha Sekhar, Eric Vanoeveren, Deepankar Mohapatra, Tharathorn Rimchala, Priyadarshini Rajendran
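
The matching step in the abstract above, pairing identified keys with identified values inside a region, can be illustrated with a small sketch. The abstract does not state how keys and values are matched, so nearest-neighbor matching on page coordinates is an assumption, and the `Token` type, its fields, and `match_key_values` are hypothetical names.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    text: str
    kind: str    # "key" or "value", as assigned by an upstream model
    region: int  # index of the document region the token falls in
    x: float     # hypothetical page coordinates of the token
    y: float

def match_key_values(tokens):
    """Pair each key with the closest value in the same region.

    Assumption: Euclidean distance is an illustrative matching rule only.
    A value may be paired with more than one key under this simple rule.
    """
    keys = [t for t in tokens if t.kind == "key"]
    values = [t for t in tokens if t.kind == "value"]
    pairs = []
    for key in keys:
        candidates = [v for v in values if v.region == key.region]
        if not candidates:
            continue  # no value detected in this key's region
        nearest = min(candidates, key=lambda v: math.hypot(v.x - key.x, v.y - key.y))
        pairs.append((key.text, nearest.text))
    return pairs

if __name__ == "__main__":
    tokens = [
        Token("Wages", "key", 0, 0.0, 0.0),
        Token("50000", "value", 0, 10.0, 0.0),
        Token("Tax", "key", 0, 0.0, 5.0),
        Token("7000", "value", 0, 10.0, 5.0),
    ]
    print(match_key_values(tokens))
```

Restricting candidates to the key's own region keeps a key from being paired with a closer value that belongs to a neighboring form section.
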
METHOD FOR SERVING PARAMETER EFFICIENT NLP MODELS THROUGH ADAPTIVE ARCHITECTURES

Publication number: 20230316157

Abstract: A machine learning system executed by a processor may generate predictions for a variety of natural language processing (NLP) tasks. The machine learning system may include a single deployment implementing a parameter efficient transfer learning architecture. The machine learning system may use adapter layers to dynamically modify a base model to generate a plurality of fine-tuned models. Each fine-tuned model may generate predictions for a specific NLP task. By transferring knowledge from the base model to each fine-tuned model, the ML system achieves a significant reduction in the number of tunable parameters required to generate a fine-tuned NLP model and decreases the fine-tuned model artifact size. Additionally, the ML system reduces training times for fine-tuned NLP models, promotes transfer learning across NLP tasks with lower labeled data volumes, and enables easier and more computationally efficient deployments for multi-task NLP.

Type: Application

Filed: June 2, 2023

Publication date: October 5, 2023

Applicant: INTUIT INC.

Inventors: Terrence J. TORRES, Tharathorn Rimchala, Andrew Mattarella-Micke

Text feature guided visual based document classifier

Patent number: 11720605

Abstract: A visual-based classification model influenced by text features as a result of the outputs of a text-based classification model is disclosed. A system receives one or more documents to be classified based on one or more visual features and provides the one or more documents to a student classification model, which is a visual-based classification model. The system also classifies, by the student classification model, the one or more documents into one or more document types based on one or more visual features. The one or more visual features are generated by the student classification model that is trained based on important text identified by a teacher classification model for the one or more document types, with the teacher classification model being a text-based classification model. Generating training data and training the student classification model based on the training data are also described.

Type: Grant

Filed: July 28, 2022

Date of Patent: August 8, 2023

Assignee: Intuit Inc.

Inventors: Tharathorn Rimchala, Yingxin Wang

Method for serving parameter efficient NLP models through adaptive architectures

Patent number: 11704602

Abstract: A machine learning system executed by a processor may generate predictions for a variety of natural language processing (NLP) tasks. The machine learning system may include a single deployment implementing a parameter efficient transfer learning architecture. The machine learning system may use adapter layers to dynamically modify a base model to generate a plurality of fine-tuned models. Each fine-tuned model may generate predictions for a specific NLP task. By transferring knowledge from the base model to each fine-tuned model, the ML system achieves a significant reduction in the number of tunable parameters required to generate a fine-tuned NLP model and decreases the fine-tuned model artifact size. Additionally, the ML system reduces training times for fine-tuned NLP models, promotes transfer learning across NLP tasks with lower labeled data volumes, and enables easier and more computationally efficient deployments for multi-task NLP.

Type: Grant

Filed: January 2, 2020

Date of Patent: July 18, 2023

Assignee: Intuit Inc.

Inventors: Terrence J. Torres, Tharathorn Rimchala, Andrew Mattarella-Micke

Systems and methods for determining consensus values

Patent number: 11593555

Abstract: Systems and methods are provided to determine consensus values for duplicate fields in a document or form.

Type: Grant

Filed: May 9, 2022

Date of Patent: February 28, 2023

Assignee: INTUIT INC.

Inventors: Peter Anthony, Preeti Duraipandian, Tharathorn Rimchala, Sricharan Kallur Palli Kumar

Entity extraction with encoder decoder machine learning model

Patent number: 11544943

Abstract: A method includes executing an encoder machine learning model on multiple token values contained in a document to create an encoder hidden state vector. A decoder machine learning model executing on the encoder hidden state vector generates raw text comprising an entity value and an entity label for each of multiple entities. The method further includes generating a structural representation of the entities directly from the raw text and outputting the structural representation of the entities of the document.

Type: Grant

Filed: May 31, 2022

Date of Patent: January 3, 2023

Assignee: Intuit Inc.

Inventors: Tharathorn Rimchala, Peter Frick

METHODS AND SYSTEMS FOR GENERATING MOBILE ENABLED EXTRACTION MODELS

Publication number: 20220350968

Abstract: A computing system generates a plurality of training data sets for generating the NLP model. The computing system trains a teacher network to extract and classify tokens from a document. The training includes a pre-training stage where the teacher network is trained to classify generic data in the plurality of training data sets and a fine-tuning stage where the teacher network is trained to classify targeted data in the plurality of training data sets. The computing system trains a student network to extract and classify tokens from a document by distilling knowledge learned by the teacher network during the fine-tuning stage from the teacher network to the student network. The computing system outputs the NLP model based on the training. The computing system causes the NLP model to be deployed in a remote computing environment.

Type: Application

Filed: April 30, 2021

Publication date: November 3, 2022

Applicant: INTUIT INC.

Inventors: Dominic Miguel ROSSI, Hui Fang LEE, Tharathorn RIMCHALA

System and method of performing patch-based document segmentation for information extraction

Patent number: 11216660

Abstract: A user device associated with a user may receive a document associated with the user. The user device may encrypt the received document. The user device may perform patch-based document segmentation on the received document to form a plurality of patches on the received document. The user device may extract text from each patch of the plurality of patches. The user device may analyze the extracted text from each patch to detect a field title and a field value. The user device may encrypt the extracted text and its associated field value for each patch of the plurality of patches. The user device may send the encrypted extracted text and its associated field value to the user device and instructions to display the extracted text and its associated field value on a user interface.

Type: Grant

Filed: August 30, 2019

Date of Patent: January 4, 2022

Assignee: Intuit Inc.

Inventors: Tharathorn Rimchala, Yang Li

METHOD FOR SERVING PARAMETER EFFICIENT NLP MODELS THROUGH ADAPTIVE ARCHITECTURES

Publication number: 20210209513

Abstract: A machine learning system executed by a processor may generate predictions for a variety of natural language processing (NLP) tasks. The machine learning system may include a single deployment implementing a parameter efficient transfer learning architecture. The machine learning system may use adapter layers to dynamically modify a base model to generate a plurality of fine-tuned models. Each fine-tuned model may generate predictions for a specific NLP task. By transferring knowledge from the base model to each fine-tuned model, the ML system achieves a significant reduction in the number of tunable parameters required to generate a fine-tuned NLP model and decreases the fine-tuned model artifact size. Additionally, the ML system reduces training times for fine-tuned NLP models, promotes transfer learning across NLP tasks with lower labeled data volumes, and enables easier and more computationally efficient deployments for multi-task NLP.

Type: Application

Filed: January 2, 2020

Publication date: July 8, 2021

Applicant: Intuit Inc.

Inventors: Terrence J. TORRES, Tharathorn Rimchala, Andrew Mattarella-Micke

SYSTEM AND METHOD OF PERFORMING PATCH-BASED DOCUMENT SEGMENTATION FOR INFORMATION EXTRACTION

Publication number: 20210064865

Abstract: A user device associated with a user may receive a document associated with the user. The user device may encrypt the received document. The user device may perform patch-based document segmentation on the received document to form a plurality of patches on the received document. The user device may extract text from each patch of the plurality of patches. The user device may analyze the extracted text from each patch to detect a field title and a field value. The user device may encrypt the extracted text and its associated field value for each patch of the plurality of patches. The user device may send the encrypted extracted text and its associated field value to the user device and instructions to display the extracted text and its associated field value on a user interface.

Type: Application

Filed: August 30, 2019

Publication date: March 4, 2021

Applicant: Intuit Inc.

Inventors: Tharathorn RIMCHALA, Yang LI

SYSTEM AND METHOD FOR SPATIAL ENCODING AND FEATURE GENERATORS FOR ENHANCING INFORMATION EXTRACTION

Publication number: 20200250263

Abstract: A system and method for extracting data from a piece of content using spatial information about the piece of content. The system and method may use a conditional random fields process or a bidirectional long short term memory and conditional random fields process to extract structured data using the spatial information.

Type: Application

Filed: February 1, 2019

Publication date: August 6, 2020

Applicant: INTUIT INC.

Inventor: Tharathorn RIMCHALA