Patents by Inventor Tharathorn RIMCHALA

Tharathorn RIMCHALA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11977842
    Abstract: A computing system generates a plurality of training data sets for generating an NLP model. The computing system trains a teacher network to extract and classify tokens from a document. The training includes a pre-training stage where the teacher network is trained to classify generic data in the plurality of training data sets and a fine-tuning stage where the teacher network is trained to classify targeted data in the plurality of training data sets. The computing system trains a student network to extract and classify tokens from a document by distilling knowledge learned by the teacher network during the fine-tuning stage from the teacher network to the student network. The computing system outputs the NLP model based on the training. The computing system causes the NLP model to be deployed in a remote computing environment.
    Type: Grant
    Filed: April 30, 2021
    Date of Patent: May 7, 2024
    Assignee: INTUIT INC.
    Inventors: Dominic Miguel Rossi, Hui Fang Lee, Tharathorn Rimchala
  • Publication number: 20240054802
    Abstract: A system and method for extracting data from a piece of content using spatial information about the piece of content. The system and method may use a conditional random fields process or a bidirectional long short-term memory and conditional random fields process to extract structured data using the spatial information.
    Type: Application
    Filed: October 24, 2023
    Publication date: February 15, 2024
    Applicant: INTUIT INC.
    Inventor: Tharathorn RIMCHALA
  • Publication number: 20240037125
    Abstract: A visual-based classification model influenced by text features as a result of the outputs of a text-based classification model is disclosed. A system receives one or more documents to be classified based on one or more visual features and provides the one or more documents to a student classification model, which is a visual-based classification model. The system also classifies, by the student classification model, the one or more documents into one or more document types based on one or more visual features. The one or more visual features are generated by the student classification model that is trained based on important text identified by a teacher classification model for the one or more document types, with the teacher classification model being a text-based classification model. Generating training data and training the student classification model based on the training data are also described.
    Type: Application
    Filed: June 16, 2023
    Publication date: February 1, 2024
    Applicant: Intuit Inc.
    Inventors: Tharathorn RIMCHALA, Yingxin Wang
  • Patent number: 11861884
    Abstract: Certain aspects of the disclosure provide systems and methods for training an information extraction transformer model architecture directed to pre-training a first multimodal transformer model on an unlabeled dataset, training a second multimodal transformer model on a first labeled dataset to perform a key information extraction task, processing the unlabeled dataset with the second multimodal transformer model to generate pseudo-labels for the unlabeled dataset, training the first multimodal transformer model based on a second labeled dataset comprising one or more labels, the pseudo-labels generated, or combinations thereof to generate a third multimodal transformer model, generating updated pseudo-labels based on label completion predictions from the third multimodal transformer model, and training the third multimodal transformer model using a noise-aware loss function and the updated pseudo-labels to generate an updated third multimodal transformer model.
    Type: Grant
    Filed: April 10, 2023
    Date of Patent: January 2, 2024
    Assignee: Intuit, Inc.
    Inventors: Karelia Del Carmen Pena Pena, Tharathorn Rimchala, Peter Lee Frick, Tak Yiu Daniel Li
  • Patent number: 11837002
    Abstract: A system and method for extracting data from a piece of content using spatial information about the piece of content. The system and method may use a conditional random fields process or a bidirectional long short-term memory and conditional random fields process to extract structured data using the spatial information.
    Type: Grant
    Filed: February 1, 2019
    Date of Patent: December 5, 2023
    Assignee: INTUIT INC.
    Inventor: Tharathorn Rimchala
  • Publication number: 20230386236
    Abstract: A method includes executing an encoder machine learning model on multiple token values contained in a document to create an encoder hidden state vector. A decoder machine learning model executing on the encoder hidden state vector generates raw text comprising an entity value and an entity label for each of multiple entities. The method further includes generating a structural representation of the entities directly from the raw text and outputting the structural representation of the entities of the document.
    Type: Application
    Filed: November 30, 2022
    Publication date: November 30, 2023
    Applicant: Intuit Inc.
    Inventors: Tharathorn Rimchala, Peter Frick
  • Patent number: 11829406
    Abstract: Aspects of the present disclosure provide techniques for image-based document search. Embodiments include receiving an image of a document and providing the image of the document as input to a machine learning model, where the machine learning model generates separate embeddings of a plurality of patches of the image of the document and the machine learning model generates an embedding of the image of the document based on the separate embeddings of the plurality of patches. Embodiments include determining a compact embedding of the image of the document based on applying a dimensionality reduction technique to the embedding of the image of the document generated by the machine learning model. Embodiments include performing a search for relevant documents based on the compact embedding of the image of the document. Embodiments include performing one or more actions based on one or more relevant documents identified through the search.
    Type: Grant
    Filed: June 30, 2023
    Date of Patent: November 28, 2023
    Assignee: INTUIT, INC.
    Inventors: Shir Meir Lador, Sameeksha Khillan, Peter Lee Frick, Tharathorn Rimchala, Guohan Gao
  • Patent number: 11798301
    Abstract: Systems and methods of generating synthetic training data for machine learning models. First, line items in source documents such as bills, invoices, and/or receipts are identified and labeled. The identification and labeling generate labeled documents. Then, in the labeled documents, the line items are augmented by adding, deleting, and/or swapping line items to generate synthetic training documents. An addition operation randomly selects one or more line items and adds the selected line item(s) to the same labeled document or another labeled document. A deletion operation randomly deletes one or more line items. A swapping operation randomly swaps line items in a single labeled document or across different labeled documents. These operations can generate synthetic labeled documents of any length, which form synthetic training data for training the machine learning models.
    Type: Grant
    Filed: October 17, 2022
    Date of Patent: October 24, 2023
    Assignee: INTUIT INC.
    Inventor: Tharathorn Rimchala
  • Patent number: 11783605
    Abstract: Certain aspects of the present disclosure provide techniques for training and using machine learning models to extract key-value sets from a document. An example method generally includes identifying regions of a document including key-value sets corresponding to inputs to a data processing application based on a first machine learning model and an electronic version of the document. One or more keys and one or more values are identified in the document based on a second machine learning model. One or more key-value sets are generated based on matching keys of the one or more keys and values of the one or more values in the region of the document. The one or more key-value sets are provided to a data processing application for processing.
    Type: Grant
    Filed: June 30, 2022
    Date of Patent: October 10, 2023
    Assignee: INTUIT, INC.
    Inventors: Amogha Sekhar, Eric Vanoeveren, Deepankar Mohapatra, Tharathorn Rimchala, Priyadarshini Rajendran
  • Publication number: 20230316157
    Abstract: A machine learning system executed by a processor may generate predictions for a variety of natural language processing (NLP) tasks. The machine learning system may include a single deployment implementing a parameter efficient transfer learning architecture. The machine learning system may use adapter layers to dynamically modify a base model to generate a plurality of fine-tuned models. Each fine-tuned model may generate predictions for a specific NLP task. By transferring knowledge from the base model to each fine-tuned model, the ML system achieves a significant reduction in the number of tunable parameters required to generate a fine-tuned NLP model and decreases the fine-tuned model artifact size. Additionally, the ML system reduces training times for fine-tuned NLP models, promotes transfer learning across NLP tasks with lower labeled data volumes, and enables easier and more computationally efficient deployments for multi-task NLP.
    Type: Application
    Filed: June 2, 2023
    Publication date: October 5, 2023
    Applicant: INTUIT INC.
    Inventors: Terrence J. TORRES, Tharathorn Rimchala, Andrew Mattarella-Micke
  • Patent number: 11720605
    Abstract: A visual-based classification model influenced by text features as a result of the outputs of a text-based classification model is disclosed. A system receives one or more documents to be classified based on one or more visual features and provides the one or more documents to a student classification model, which is a visual-based classification model. The system also classifies, by the student classification model, the one or more documents into one or more document types based on one or more visual features. The one or more visual features are generated by the student classification model that is trained based on important text identified by a teacher classification model for the one or more document types, with the teacher classification model being a text-based classification model. Generating training data and training the student classification model based on the training data are also described.
    Type: Grant
    Filed: July 28, 2022
    Date of Patent: August 8, 2023
    Assignee: Intuit Inc.
    Inventors: Tharathorn Rimchala, Yingxin Wang
  • Patent number: 11704602
    Abstract: A machine learning system executed by a processor may generate predictions for a variety of natural language processing (NLP) tasks. The machine learning system may include a single deployment implementing a parameter efficient transfer learning architecture. The machine learning system may use adapter layers to dynamically modify a base model to generate a plurality of fine-tuned models. Each fine-tuned model may generate predictions for a specific NLP task. By transferring knowledge from the base model to each fine-tuned model, the ML system achieves a significant reduction in the number of tunable parameters required to generate a fine-tuned NLP model and decreases the fine-tuned model artifact size. Additionally, the ML system reduces training times for fine-tuned NLP models, promotes transfer learning across NLP tasks with lower labeled data volumes, and enables easier and more computationally efficient deployments for multi-task NLP.
    Type: Grant
    Filed: January 2, 2020
    Date of Patent: July 18, 2023
    Assignee: Intuit Inc.
    Inventors: Terrence J. Torres, Tharathorn Rimchala, Andrew Mattarella-Micke
  • Patent number: 11593555
    Abstract: Systems and methods are provided to determine consensus values for duplicate fields in a document or form.
    Type: Grant
    Filed: May 9, 2022
    Date of Patent: February 28, 2023
    Assignee: INTUIT INC.
    Inventors: Peter Anthony, Preeti Duraipandian, Tharathorn Rimchala, Sricharan Kallur Palli Kumar
  • Patent number: 11544943
    Abstract: A method includes executing an encoder machine learning model on multiple token values contained in a document to create an encoder hidden state vector. A decoder machine learning model executing on the encoder hidden state vector generates raw text comprising an entity value and an entity label for each of multiple entities. The method further includes generating a structural representation of the entities directly from the raw text and outputting the structural representation of the entities of the document.
    Type: Grant
    Filed: May 31, 2022
    Date of Patent: January 3, 2023
    Assignee: Intuit Inc.
    Inventors: Tharathorn Rimchala, Peter Frick
  • Publication number: 20220350968
    Abstract: A computing system generates a plurality of training data sets for generating an NLP model. The computing system trains a teacher network to extract and classify tokens from a document. The training includes a pre-training stage where the teacher network is trained to classify generic data in the plurality of training data sets and a fine-tuning stage where the teacher network is trained to classify targeted data in the plurality of training data sets. The computing system trains a student network to extract and classify tokens from a document by distilling knowledge learned by the teacher network during the fine-tuning stage from the teacher network to the student network. The computing system outputs the NLP model based on the training. The computing system causes the NLP model to be deployed in a remote computing environment.
    Type: Application
    Filed: April 30, 2021
    Publication date: November 3, 2022
    Applicant: INTUIT INC.
    Inventors: Dominic Miguel ROSSI, Hui Fang LEE, Tharathorn RIMCHALA
  • Patent number: 11216660
    Abstract: A user device associated with a user may receive a document associated with the user. The user device may encrypt the received document. The user device may perform patch-based document segmentation on the received document to form a plurality of patches on the received document. The user device may extract text from each patch of the plurality of patches. The user device may analyze the extracted text from each patch to detect a field title and a field value. The user device may encrypt the extracted text and its associated field value for each patch of the plurality of patches. The user device may send the encrypted extracted text and its associated field value to the user device and instructions to display the extracted text and its associated field value on a user interface.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: January 4, 2022
    Assignee: Intuit Inc.
    Inventors: Tharathorn Rimchala, Yang Li
  • Publication number: 20210209513
    Abstract: A machine learning system executed by a processor may generate predictions for a variety of natural language processing (NLP) tasks. The machine learning system may include a single deployment implementing a parameter efficient transfer learning architecture. The machine learning system may use adapter layers to dynamically modify a base model to generate a plurality of fine-tuned models. Each fine-tuned model may generate predictions for a specific NLP task. By transferring knowledge from the base model to each fine-tuned model, the ML system achieves a significant reduction in the number of tunable parameters required to generate a fine-tuned NLP model and decreases the fine-tuned model artifact size. Additionally, the ML system reduces training times for fine-tuned NLP models, promotes transfer learning across NLP tasks with lower labeled data volumes, and enables easier and more computationally efficient deployments for multi-task NLP.
    Type: Application
    Filed: January 2, 2020
    Publication date: July 8, 2021
    Applicant: Intuit Inc.
    Inventors: Terrence J. TORRES, Tharathorn Rimchala, Andrew Mattarella-Micke
  • Publication number: 20210064865
    Abstract: A user device associated with a user may receive a document associated with the user. The user device may encrypt the received document. The user device may perform patch-based document segmentation on the received document to form a plurality of patches on the received document. The user device may extract text from each patch of the plurality of patches. The user device may analyze the extracted text from each patch to detect a field title and a field value. The user device may encrypt the extracted text and its associated field value for each patch of the plurality of patches. The user device may send the encrypted extracted text and its associated field value to the user device and instructions to display the extracted text and its associated field value on a user interface.
    Type: Application
    Filed: August 30, 2019
    Publication date: March 4, 2021
    Applicant: Intuit Inc.
    Inventors: Tharathorn RIMCHALA, Yang LI
  • Publication number: 20200250263
    Abstract: A system and method for extracting data from a piece of content using spatial information about the piece of content. The system and method may use a conditional random fields process or a bidirectional long short-term memory and conditional random fields process to extract structured data using the spatial information.
    Type: Application
    Filed: February 1, 2019
    Publication date: August 6, 2020
    Applicant: INTUIT INC.
    Inventor: Tharathorn RIMCHALA
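
Illustrative note: the abstract of patent 11798301 above describes three random operations on labeled line items (addition, deletion, and swapping) for producing synthetic training documents. The sketch below shows one way those three operations could look on a single document. It is not the patented implementation. The function name `augment_line_items`, the representation of a document as a list of strings, and the optional `pool` of line items from other documents are all assumptions made for illustration.

```python
import random
from typing import List, Optional


def augment_line_items(
    items: List[str],
    rng: random.Random,
    op: str,
    pool: Optional[List[str]] = None,
) -> List[str]:
    """Return a copy of `items` after one random add, delete, or swap.

    `items` is a hypothetical stand-in for the labeled line items of one
    document. `pool` optionally holds line items from other documents;
    when it is missing or empty, additions draw from `items` itself.
    """
    out = list(items)  # never mutate the caller's document
    if op == "add":
        source = pool if pool else items
        if source:
            # Insert a randomly chosen line item at a random position.
            out.insert(rng.randrange(len(out) + 1), rng.choice(source))
    elif op == "delete":
        # Keep at least one line item so the document stays non-empty.
        if len(out) > 1:
            del out[rng.randrange(len(out))]
    elif op == "swap":
        # Exchange two distinct positions within the same document.
        if len(out) > 1:
            i, j = rng.sample(range(len(out)), 2)
            out[i], out[j] = out[j], out[i]
    else:
        raise ValueError(f"unknown operation: {op!r}")
    return out


# Usage: chain operations to build a synthetic variant of one invoice.
rng = random.Random(0)  # fixed seed so runs are repeatable
invoice = ["Widget x2  10.00", "Shipping   5.00", "Tax        1.50"]
variant = augment_line_items(invoice, rng, "add")
variant = augment_line_items(variant, rng, "swap")
print(variant)
```

Returning a new list rather than editing in place keeps the source document available for further variants, and passing a seeded `random.Random` makes the generated data reproducible. Swapping across different documents, which the abstract also mentions, is not covered by this sketch.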