Patents by Inventor Karpaga Ganesh Patchirajan

Karpaga Ganesh Patchirajan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

System and method for recognizing domain specific named entities using domain specific word embeddings

Patent number: 11687721

Abstract: Systems and methods for recognizing domain specific named entities are disclosed. An example method may be performed by one or more processors of a text incorporation system and include extracting a number of terms from a text under consideration, identifying, among the number of terms, a set of unmatched terms that do not match any of a plurality of known terms, passing each respective unmatched term to a vectorization module, embedding a vectorized version of each respective unmatched term in a vector space, comparing each vectorized version to known term vectors, passing, to a machine learning model, candidate terms corresponding to known term vectors closest to the vectorized versions, identifying, using the machine learning model, a best candidate term for each respective unmatched term, mapping the best candidate terms to unmatched terms in the text under consideration, and incorporating the text under consideration into the system based on the mappings.

Type: Grant

Filed: July 20, 2021

Date of Patent: June 27, 2023

Assignee: Intuit Inc.

Inventors: Conrad De Peuter, Karpaga Ganesh Patchirajan, Saikat Mukherjee
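
The matching flow summarized in the abstract above can be illustrated with a small Python sketch. This is not the patented implementation: the character-trigram vectors stand in for real domain-specific word embeddings, the final "best candidate" step simply takes the top-ranked neighbor where the abstract calls for a machine learning model, and all function names and sample terms are hypothetical.

```python
from collections import Counter
from math import sqrt


def trigram_vector(term: str) -> Counter:
    """Toy stand-in for an embedding: counts of character trigrams."""
    padded = f"  {term.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def map_unmatched_terms(terms, known_terms, top_k=3):
    """Map each term that is not a known term to its closest known term."""
    known_vectors = {t: trigram_vector(t) for t in known_terms}
    known_lookup = {t.lower() for t in known_terms}
    mappings = {}
    for term in terms:
        if term.lower() in known_lookup:
            continue  # exact match: nothing to resolve
        vec = trigram_vector(term)
        ranked = sorted(known_terms,
                        key=lambda k: cosine(vec, known_vectors[k]),
                        reverse=True)
        candidates = ranked[:top_k]
        # Assumption: a trained model would choose among the candidates;
        # this sketch just keeps the nearest one.
        mappings[term] = candidates[0]
    return mappings


known = ["adjusted gross income", "taxable income", "standard deduction"]
extracted = ["taxable income", "adj. gross income", "std deduction"]
print(map_unmatched_terms(extracted, known))
```

Terms that already appear in the known list are skipped, so only the two abbreviated terms receive a mapping.
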

SYSTEM AND METHOD FOR DETERMINING A STRUCTURED REPRESENTATION OF A FORM DOCUMENT UTILIZING MULTIPLE MACHINE LEARNING MODELS

Publication number: 20230147276

Abstract: Systems and methods may be used to generate and use a structured form representation and structured metadata. The structured form representation and structured metadata may include information relevant to a particular context and may be used to update document templates, import new documents and update document versions into software, automate data entry for document completion, update records to include new and/or updated information, and provide other functionality of an information service.

Type: Application

Filed: December 28, 2022

Publication date: May 11, 2023

Applicant: Intuit Inc.

Inventors: Stephanie BROYLES, Andrew Van CAO, Stephen A. EUBANKS, William R. GEORGEN, Karpaga Ganesh PATCHIRAJAN

LEAN PARSING: A NATURAL LANGUAGE PROCESSING SYSTEM AND METHOD FOR PARSING DOMAIN-SPECIFIC LANGUAGES

Publication number: 20230065070

Abstract: Systems and methods for lean parsing are disclosed. An example method is performed by one or more processors of a system and includes retrieving form data including first sentence segments and second sentence segments, determining a first predicate structure for each of the sentence segments based on a set of operators within the first set of sentence segments, identifying known tokens within the second set of sentence segments, each of the known tokens appearing on a list of predetermined tokens, identifying new tokens within the second set of sentence segments, each of the new tokens not on the list, mapping each known and new token to at least one operator, determining a second predicate structure for each sentence segment based on the mapping, and generating a predicate argument structure incorporating the first and second predicate structures, the predicate argument structure ready for mapping to at least one machine executable function.

Type: Application

Filed: October 28, 2022

Publication date: March 2, 2023

Applicant: Intuit Inc.

Inventors: Saikat Mukherjee, Esmé Manandise, Sudhir Agarwal, Karpaga Ganesh Patchirajan

System and method for determining a structured representation of a form document utilizing multiple machine learning models

Patent number: 11568284

Abstract: Systems and methods may be used to generate and use a structured form representation and structured metadata. The structured form representation and structured metadata may include information relevant to a particular context and may be used to update document templates, import new documents and update document versions into software, automate data entry for document completion, update records to include new and/or updated information, and provide other functionality of an information service.

Type: Grant

Filed: June 26, 2020

Date of Patent: January 31, 2023

Assignee: Intuit Inc.

Inventors: Stephanie Broyles, Andrew Van Cao, Stephen A. Eubanks, William R. Georgen, Karpaga Ganesh Patchirajan

Document text extraction to field-specific computer executable operations

Patent number: 11544468

Abstract: This disclosure describes converting computer-executable predicate-argument structures for a specific field to field-specific predicated-argument structures to improve execution. In some implementations, a method can be performed by one or more processors of a computing device, and can include receiving one or more predicate-argument structures (PASs) associated with taxation-specific text and converting the one or more PASs into one or more tax-specific predicate-argument structures (TPASs). Converting the one or more PASs to one or more TPASs may include one or more of: defining terms in a segment based on a definition of the term from a different segment or line description (including from a different document); reordering nodes, replacing nodes, or removing nodes of a segment (such as based on one or more single segment tree traversal rules); or combining multiple PASs for multiple segments of a single line description based on one or more multiple segment tree traversal rules.

Type: Grant

Filed: July 24, 2020

Date of Patent: January 3, 2023

Assignee: Intuit Inc.

Inventors: Esmé Manandise, Karpaga Ganesh Patchirajan, Saikat Mukherjee

Generating structured representations of forms using machine learning

Patent number: 11521405

Abstract: A method may include acquiring, from an initial document having a document type, initial document elements and initial attributes, deriving initial features for the initial document elements using the initial attributes, detecting initial form components using the initial features, clustering the initial form components into initial line objects of an initial structured representation by applying an unsupervised machine learning model to the geometric attributes of the initial document elements, acquiring, from a next document having the document type, next document elements and next attributes describing the next document elements, deriving next features for the next document elements using the next attributes, detecting next form components using the next features, determining that the initial form components and the next form components are different, clustering the next form components into next line objects of a next structured representation, and replacing the initial structured representation with the

Type: Grant

Filed: April 29, 2021

Date of Patent: December 6, 2022

Assignee: Intuit Inc.

Inventors: Anu Singh, Saikat Mukherjee, Mritunjay Kumar, Karpaga Ganesh Patchirajan

Lean parsing: a natural language processing system and method for parsing domain-specific languages

Patent number: 11520975

Abstract: A method and system parses natural language in a unique way, determining important words pertaining to a text corpus of a particular genre, such as tax preparation. Sentences extracted from instructions or forms pertaining to tax preparation, for example are parsed to determine word groups forming various parts of speech, and then are processed to exclude words on an exclusion list and word groups that don't meet predetermined criteria. From the resulting data, synonyms are replaced with a common functional operator and the resulting sentence text is analyzed against predetermined patterns to determine one or more functions to be used in a document preparation system.

Type: Grant

Filed: January 23, 2020

Date of Patent: December 6, 2022

Assignee: Intuit Inc.

Inventors: Saikat Mukherjee, Esmé Manandise, Sudhir Agarwal, Karpaga Ganesh Patchirajan
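
The pipeline outlined in the abstract above (drop excluded words, replace synonyms with a common operator, match the result against known patterns) can be sketched in Python as follows. This is a simplified illustration rather than the patented method: it skips the part-of-speech grouping step, and the exclusion list, synonym table, patterns, and output function strings are all hypothetical.

```python
import re

# Hypothetical exclusion list and synonym table for tax-form instructions.
EXCLUSION_LIST = {"the", "of", "and", "from", "enter", "line", "lines"}
OPERATOR_SYNONYMS = {
    "add": "ADD", "sum": "ADD", "total": "ADD", "combine": "ADD",
    "subtract": "SUB", "minus": "SUB", "deduct": "SUB",
}

# Hypothetical patterns mapping a normalized sentence to a function call.
PATTERNS = [
    (re.compile(r"^ADD (\w+) (\w+)$"),
     lambda a, b: f"add(line_{a}, line_{b})"),
    # "Subtract line A from line B" means B minus A.
    (re.compile(r"^SUB (\w+) (\w+)$"),
     lambda a, b: f"subtract(line_{b}, line_{a})"),
]


def lean_parse(sentence: str):
    """Return a function string for the sentence, or None if no pattern fits."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    kept = [OPERATOR_SYNONYMS.get(t, t) for t in tokens if t not in EXCLUSION_LIST]
    normalized = " ".join(kept)
    for pattern, build in PATTERNS:
        match = pattern.match(normalized)
        if match:
            return build(*match.groups())
    return None


print(lean_parse("Add lines 1 and 2"))
print(lean_parse("Subtract line 4 from line 3"))
print(lean_parse("Enter the total of lines 5 and 6"))
```

Sentences that reduce to no known pattern return `None`, which a real system would presumably route to further processing.
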

System and method of generating deltas between documents

Patent number: 11295076

Abstract: Generating a difference between a first and second plurality of lines of text in structured machine-readable format may include determining, by at least one processor, a line of the second plurality of lines that constitutes a best match for a line of the first plurality of lines. The line of the first plurality of lines and its respective best match may be associated with a similarity score. The at least one processor may compare the similarity score to a threshold value. In response to determining that the similarity score is greater than or equal to the threshold value, the at least one processor may compute the textual difference between the line of the first plurality of lines and its best match. In response to computing the textual difference, the at least one processor may analyze the textual difference to identify a non-meaningful change.

Type: Grant

Filed: July 31, 2019

Date of Patent: April 5, 2022

Assignee: Intuit Inc.

Inventors: Mritunjay Kumar, Saikat Mukherjee, Karpaga Ganesh Patchirajan, Anu Singh
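
A minimal Python sketch of the best-match-and-threshold idea in the abstract above, using the standard library's `difflib.SequenceMatcher` as the similarity measure. The choice of similarity score, the 0.6 threshold, and the rule that whitespace-only or case-only changes are "non-meaningful" are assumptions made for illustration; the patent may define these differently.

```python
import difflib


def similarity(a: str, b: str) -> float:
    """Similarity score in [0, 1] from difflib's sequence matcher."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def is_non_meaningful(old: str, new: str) -> bool:
    """Assumption: changes in whitespace or letter case carry no meaning."""
    return "".join(old.split()).lower() == "".join(new.split()).lower()


def line_deltas(old_lines, new_lines, threshold=0.6):
    """For each old line, find its best match and describe the change."""
    deltas = []
    for old in old_lines:
        if not new_lines:
            break
        match = max(new_lines, key=lambda candidate: similarity(old, candidate))
        score = similarity(old, match)
        if score < threshold or old == match:
            continue  # no acceptable match, or nothing changed
        deltas.append({
            "old": old,
            "new": match,
            "score": round(score, 3),
            "meaningful": not is_non_meaningful(old, match),
        })
    return deltas


old_doc = ["Line 1: Wages, salaries, tips", "Line 2: Taxable interest"]
new_doc = ["Line 1:  Wages, salaries, tips",
           "Line 2: Tax-exempt interest",
           "Line 3: Dividends"]
for delta in line_deltas(old_doc, new_doc):
    print(delta)
```

In this example the first change differs only by an extra space and is flagged as not meaningful, while the second changes the wording of the line.
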

DOCUMENT TEXT EXTRACTION TO FIELD-SPECIFIC COMPUTER EXECUTABLE OPERATIONS

Publication number: 20220027564

Abstract: This disclosure describes converting computer-executable predicate-argument structures for a specific field to field-specific predicated-argument structures to improve execution. In some implementations, a method can be performed by one or more processors of a computing device, and can include receiving one or more predicate-argument structures (PASs) associated with taxation-specific text and converting the one or more PASs into one or more tax-specific predicate-argument structures (TPASs). Converting the one or more PASs to one or more TPASs may include one or more of: defining terms in a segment based on a definition of the term from a different segment or line description (including from a different document); reordering nodes, replacing nodes, or removing nodes of a segment (such as based on one or more single segment tree traversal rules); or combining multiple PASs for multiple segments of a single line description based on one or more multiple segment tree traversal rules.

Type: Application

Filed: July 24, 2020

Publication date: January 27, 2022

Applicant: Intuit Inc.

Inventors: Esmé Manandise, Karpaga Ganesh Patchirajan, Saikat Mukherjee

METHODS AND SYSTEMS FOR ACQUIRING AND MANIPULATING RELEVANT INFORMATION USING MACHINE LEARNING

Publication number: 20210406716

Abstract: Systems and methods may be used to generate and use a structured form representation and structured metadata. The structured form representation and structured metadata may include information relevant to a particular context and may be used to update document templates, import new documents and update document versions into software, automate data entry for document completion, update records to include new and/or updated information, and provide other functionality of an information service.

Type: Application

Filed: June 26, 2020

Publication date: December 30, 2021

Applicant: Intuit Inc.

Inventors: Stephanie BROYLES, Andrew Van CAO, Stephen A. EUBANKS, William R. GEORGEN, Karpaga Ganesh PATCHIRAJAN

SYSTEM AND METHOD FOR RECOGNIZING DOMAIN SPECIFIC NAMED ENTITIES USING DOMAIN SPECIFIC WORD EMBEDDINGS

Publication number: 20210350081

Abstract: Systems and methods for recognizing domain specific named entities are disclosed. An example method may be performed by one or more processors of a text incorporation system and include extracting a number of terms from a text under consideration, identifying, among the number of terms, a set of unmatched terms that do not match any of a plurality of known terms, passing each respective unmatched term to a vectorization module, embedding a vectorized version of each respective unmatched term in a vector space, comparing each vectorized version to known term vectors, passing, to a machine learning model, candidate terms corresponding to known term vectors closest to the vectorized versions, identifying, using the machine learning model, a best candidate term for each respective unmatched term, mapping the best candidate terms to unmatched terms in the text under consideration, and incorporating the text under consideration into the system based on the mappings.

Type: Application

Filed: July 20, 2021

Publication date: November 11, 2021

Applicant: Intuit Inc.

Inventors: Conrad De Peuter, Karpaga Ganesh Patchirajan, Saikat Mukherjee

System and method for recognizing domain specific named entities using domain specific word embeddings

Patent number: 11163956

Abstract: A natural language processing method and system utilizes a combination of rules-based processes, vector-based processes, and machine learning-based processes to identify the meaning of terms extracted from data management system related text. Once the meaning of the terms has been identified, the method and system can automatically incorporate new forms and text into a data management system.

Type: Grant

Filed: May 23, 2019

Date of Patent: November 2, 2021

Assignee: Intuit Inc.

Inventors: Conrad De Peuter, Karpaga Ganesh Patchirajan, Saikat Mukherjee

Reducing nonvisual noise byte codes in machine readable format documents

Patent number: 11163937

Abstract: A method may include obtaining a first byte stream from first document code and a second byte stream from second document code. The first document code has a document type and the second document code has the document type. The method may further include identifying, in the first byte stream, nonvisual noise corresponding to a custom byte code defined in a custom character encoding set. The nonvisual noise is invisible when rendering the first document code. The method may further include replacing, in the first byte stream, the custom byte code with at least one standard byte code defined in a standard character encoding set to obtain modified document code. The second document code uses the standard character encoding set. The method may further include comparing the modified document code with the second document code by comparing the first byte stream with the second byte stream.

Type: Grant

Filed: July 7, 2020

Date of Patent: November 2, 2021

Assignee: Intuit Inc.

Inventors: Karpaga Ganesh Patchirajan, Connor Lawson Mcdonald, Harsha Ilapakurthy
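
The normalization step described in the abstract above can be sketched in a few lines of Python. The byte values below are hypothetical: `0x80` and `0x81` stand in for custom byte codes that render the same as a space and a hyphen, and the mapping table is an assumption rather than anything specified in the patent.

```python
# Hypothetical mapping from custom byte codes to standard (ASCII) byte codes.
CUSTOM_TO_STANDARD = {
    b"\x80": b" ",  # custom code that renders like a space
    b"\x81": b"-",  # custom code that renders like a hyphen
}


def normalize(stream: bytes, mapping=CUSTOM_TO_STANDARD) -> bytes:
    """Replace each custom byte code with its standard equivalent."""
    for custom, standard in mapping.items():
        stream = stream.replace(custom, standard)
    return stream


def same_after_normalizing(first: bytes, second: bytes) -> bool:
    """Compare two byte streams once nonvisual noise is removed from the first."""
    return normalize(first) == second


first_stream = b"Form\x801040\x81EZ"   # uses the custom encoding
second_stream = b"Form 1040-EZ"        # uses the standard encoding
print(first_stream == second_stream)
print(same_after_normalizing(first_stream, second_stream))
```

A raw byte comparison reports the two streams as different even though they would render identically; comparing after normalization removes that false difference.
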

GENERATING STRUCTURED REPRESENTATIONS OF FORMS USING MACHINE LEARNING

Publication number: 20210264146

Abstract: A method may include acquiring, from an initial document having a document type, initial document elements and initial attributes, deriving initial features for the initial document elements using the initial attributes, detecting initial form components using the initial features, clustering the initial form components into initial line objects of an initial structured representation by applying an unsupervised machine learning model to the geometric attributes of the initial document elements, acquiring, from a next document having the document type, next document elements and next attributes describing the next document elements, deriving next features for the next document elements using the next attributes, detecting next form components using the next features, determining that the initial form components and the next form components are different, clustering the next form components into next line objects of a next structured representation, and replacing the initial structured representation with the

Type: Application

Filed: April 29, 2021

Publication date: August 26, 2021

Applicant: Intuit Inc.

Inventors: Anu Singh, Saikat Mukherjee, Mritunjay Kumar, Karpaga Ganesh Patchirajan

Generating structured representations of forms using machine learning

Patent number: 11048933

Abstract: A method may include acquiring, from a document, document elements and attributes describing the document elements. One or more of the attributes may be geometric attributes describing a placement of the corresponding document element within the document. The method may further include deriving features for the document elements using the attributes, detecting form components using the features, clustering the form components into line objects of a structured representation by applying an unsupervised machine learning model to the geometric attributes of the document elements, and populating a compliance form using the structured representation.

Type: Grant

Filed: September 12, 2019

Date of Patent: June 29, 2021

Assignee: Intuit Inc.

Inventors: Anu Singh, Saikat Mukherjee, Mritunjay Kumar, Karpaga Ganesh Patchirajan
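
To make the clustering step in the abstract above concrete, here is a small Python sketch that groups form components into line objects using only their vertical position. It swaps in a simple gap-based grouping for the unsupervised machine learning model the abstract refers to; the `Component` type, the `max_gap` parameter, and the sample coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Component:
    text: str
    x: float  # horizontal position on the page
    y: float  # vertical position on the page


def cluster_into_lines(components, max_gap=5.0):
    """Group components whose vertical positions are within max_gap of each other."""
    lines = []
    for comp in sorted(components, key=lambda c: c.y):
        if lines and comp.y - lines[-1][-1].y <= max_gap:
            lines[-1].append(comp)   # close enough: same line object
        else:
            lines.append([comp])     # large vertical gap: start a new line
    # Order each line left to right and keep only the text.
    return [[c.text for c in sorted(line, key=lambda c: c.x)] for line in lines]


components = [
    Component("1", 10, 100), Component("Wages", 40, 101),
    Component("[field]", 300, 99),
    Component("2", 10, 120), Component("Interest", 40, 121),
]
print(cluster_into_lines(components))
```

Each resulting list corresponds to one line object, with its label, description, and field in reading order.
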

Automated document extraction and classification

Patent number: 10977291

Abstract: A method including receiving a source file containing a plurality of documents which, to a computer, initially are indistinguishable from each other. A first classification stage is applied to the source file using a convolutional neural network image classification to identify source documents in the multitude of documents and to produce a partially parsed file having a multitude of identified source documents. The partially parsed file includes sub-images corresponding to the plurality of identified source documents. A second classification stage, including a natural language processing artificial intelligence, is applied to sets of text in bounding boxes of the sub-images, to classify each of the multitude of identified source documents as a corresponding sub-type of document. Each of the sets of text corresponding to one of the sub-images. A parsed file having a multitude of identified sub-types of documents is produced. The parsed file is further computer processed.

Type: Grant

Filed: August 3, 2018

Date of Patent: April 13, 2021

Assignee: Intuit Inc.

Inventors: Ronnie Douglas Douthit, Deepankar Mohapatra, Ram Mohan Shamanna, Chiranjeev Jagannadha Reddy, Yexin Huang, Trichur Shivaramakrishnan Subramanian, Chinnadurai Duraisami, Karpaga Ganesh Patchirajan, Amar J. Mattey

GENERATING STRUCTURED REPRESENTATIONS OF FORMS USING MACHINE LEARNING

Publication number: 20210034858

Abstract: A method may include acquiring, from a document, document elements and attributes describing the document elements. One or more of the attributes may be geometric attributes describing a placement of the corresponding document element within the document. The method may further include deriving features for the document elements using the attributes, detecting form components using the features, clustering the form components into line objects of a structured representation by applying an unsupervised machine learning model to the geometric attributes of the document elements, and populating a compliance form using the structured representation.

Type: Application

Filed: September 12, 2019

Publication date: February 4, 2021

Applicant: Intuit Inc.

Inventors: Anu Singh, Saikat Mukherjee, Mritunjay Kumar, Karpaga Ganesh Patchirajan

LEAN PARSING: A NATURAL LANGUAGE PROCESSING SYSTEM AND METHOD FOR PARSING DOMAIN-SPECIFIC LANGUAGES

Publication number: 20200159990

Abstract: A method and system parses natural language in a unique way, determining important words pertaining to a text corpus of a particular genre, such as tax preparation. Sentences extracted from instructions or forms pertaining to tax preparation, for example are parsed to determine word groups forming various parts of speech, and then are processed to exclude words on an exclusion list and word groups that don't meet predetermined criteria. From the resulting data, synonyms are replaced with a common functional operator and the resulting sentence text is analyzed against predetermined patterns to determine one or more functions to be used in a document preparation system.

Type: Application

Filed: January 23, 2020

Publication date: May 21, 2020

Applicant: Intuit Inc.

Inventors: Saikat Mukherjee, Esmé Manandise, Sudhir Agarwal, Karpaga Ganesh Patchirajan

Lean parsing: a natural language processing system and method for parsing domain-specific languages

Patent number: 10579721

Abstract: A method and system parses natural language in a unique way, determining important words pertaining to a text corpus of a particular genre, such as tax preparation. Sentences extracted from instructions or forms pertaining to tax preparation, for example are parsed to determine word groups forming various parts of speech, and then are processed to exclude words on an exclusion list and word groups that don't meet predetermined criteria. From the resulting data, synonyms are replaced with a common functional operator and the resulting sentence text is analyzed against predetermined patterns to determine one or more functions to be used in a document preparation system.

Type: Grant

Filed: September 22, 2017

Date of Patent: March 3, 2020

Assignee: Intuit Inc.

Inventors: Saikat Mukherjee, Esmé Manandise, Sudhir Agarwal, Karpaga Ganesh Patchirajan

AUTOMATED DOCUMENT EXTRACTION AND CLASSIFICATION

Publication number: 20200042645

Abstract: A method including receiving a source file containing a plurality of documents which, to a computer, initially are indistinguishable from each other. A first classification stage is applied to the source file using a convolutional neural network image classification to identify source documents in the multitude of documents and to produce a partially parsed file having a multitude of identified source documents. The partially parsed file includes sub-images corresponding to the plurality of identified source documents. A second classification stage, including a natural language processing artificial intelligence, is applied to sets of text in bounding boxes of the sub-images, to classify each of the multitude of identified source documents as a corresponding sub-type of document. Each of the sets of text corresponding to one of the sub-images. A parsed file having a multitude of identified sub-types of documents is produced. The parsed file is further computer processed.

Type: Application

Filed: August 3, 2018

Publication date: February 6, 2020

Applicant: Intuit Inc.

Inventors: Ronnie Douglas Douthit, Deepankar Mohapatra, Ram Mohan Shamanna, Chiranjeev Jagannadha Reddy, Yexin Huang, Trichur Shivaramakrishnan Subramanian, Chinnadurai Duraisami, Karpaga Ganesh Patchirajan, Amar J. Mattey