Patents by Inventor Saikat Mukherjee
Saikat Mukherjee has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220245377
Abstract: This disclosure relates to automatic text information extraction from electronic documents. An example system is configured to perform operations including obtaining text and one or more document features of an electronic document, clustering the text into one or more groups based on the one or more document features, and identifying one or more text strings from the text in one or more groups as one or more keys. Identifying the one or more text strings is based on the clustering. The system is also configured to perform operations including generating one or more key/value pairs. Generating one or more key/value pairs includes associating one or more values to the one or more keys (with a value including text outside of the one or more identified text strings). The system is further configured to output the one or more key/value pairs.
Type: Application
Filed: January 29, 2021
Publication date: August 4, 2022
Applicant: Intuit Inc.
Inventors: Anu Singh, Saikat Mukherjee
-
Patent number: 11361245
Abstract: The disclosure relates to technology that implements flow control for machine learning on data such as Internet of Things (“IoT”) datasets. The system may route outputs of a data splitter function performed on the IoT datasets to a designated target model based on a user specification for routing the outputs. In this manner, the IoT datasets may be dynamically routed to target datasets without reprogramming machine-learning pipelines, which enables rapid training, testing and validation of ML models as well as an ability to concurrently train, validate, and execute ML models.
Type: Grant
Filed: August 9, 2018
Date of Patent: June 14, 2022
Assignee: Hewlett Packard Enterprise Development LP
Inventors: Satish Kumar Mopur, Saikat Mukherjee, Gunalan Perumal Vijayan, Sridhar Balachandriah, Ashutosh Agrawal, Krishnaprasad Lingadahalli Shastry, Gregory S. Battas
-
Patent number: 11295076
Abstract: Generating a difference between a first and second plurality of lines of text in structured machine-readable format may include determining, by at least one processor, a line of the second plurality of lines that constitutes a best match for a line of the first plurality of lines. The line of the first plurality of lines and its respective best match may be associated with a similarity score. The at least one processor may compare the similarity score to a threshold value. In response to determining that the similarity score is greater than or equal to the threshold value, the at least one processor may compute the textual difference between the line of the first plurality of lines and its best match. In response to computing the textual difference, the at least one processor may analyze the textual difference to identify a non-meaningful change.
Type: Grant
Filed: July 31, 2019
Date of Patent: April 5, 2022
Assignee: Intuit Inc.
Inventors: Mritunjay Kumar, Saikat Mukherjee, Karpaga Ganesh Patchirajan, Anu Singh
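The matching steps in this abstract can be illustrated with a short Python sketch. It is not the patented method: the similarity score here is the ratio from the standard library's `difflib.SequenceMatcher`, the threshold of 0.6 is arbitrary, and "non-meaningful change" is assumed to mean a whitespace-only difference. The function names `best_match`, `is_non_meaningful`, and `diff_lines` are hypothetical.

```python
from difflib import SequenceMatcher


def best_match(line, candidates):
    """Return (candidate, score) for the candidate most similar to `line`."""
    if not candidates:
        return None, 0.0
    scored = [(c, SequenceMatcher(None, line, c).ratio()) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


def is_non_meaningful(old, new):
    """Assumption: a change is non-meaningful when only whitespace differs."""
    return old != new and "".join(old.split()) == "".join(new.split())


def diff_lines(first, second, threshold=0.6):
    """For each line in `first`, diff it against its best match in `second`."""
    results = []
    for line in first:
        match, score = best_match(line, second)
        # Only compute a textual difference when the score meets the threshold.
        if match is None or score < threshold:
            continue
        opcodes = SequenceMatcher(None, line, match).get_opcodes()
        results.append({
            "line": line,
            "match": match,
            "score": score,
            "changes": [op for op in opcodes if op[0] != "equal"],
            "non_meaningful": is_non_meaningful(line, match),
        })
    return results
```

Lines whose best match falls below the threshold are simply skipped in this sketch; a fuller treatment would probably report them as additions or deletions.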
-
Publication number: 20220092436
Abstract: A method and system learn functions to be associated with data fields of forms to be incorporated into an electronic document preparation system. The functions are essentially sets of operations required to calculate the data field. The method and system receive form data related to a data field that expects data values resulting from performing specific operations. The method and system utilize machine learning and training set data to generate, test, and evaluate candidate functions to determine acceptable functions.
Type: Application
Filed: December 6, 2021
Publication date: March 24, 2022
Applicant: Intuit Inc.
Inventors: Cem Unsal, Saikat Mukherjee, Roger Charles Meike
-
Publication number: 20220085975
Abstract: Systems and methods are provided for implementing swarm learning while using blockchain technology and election/voting mechanisms to ensure data privacy. Nodes may train local instances of a machine learning model using local data, from which parameters are derived or extracted. Those parameters may be encrypted and persisted until a merge leader is elected that can merge the parameters using a public key generated by an external key manager. A decryptor that is not the merge leader can be elected to decrypt the merged parameter using a corresponding private key, and the decrypted merged parameter can then be shared amongst the nodes, and applied to their local models. This process can be repeated until a desired level of learning has been achieved. The public and private keys are never revealed to the same node, and may be permanently discarded after use to further ensure privacy.
Type: Application
Filed: November 23, 2021
Publication date: March 17, 2022
Inventors: Sathyanarayanan Manamohan, Vishesh Garg, Krishnaprasad Lingadahalli Shastry, Saikat Mukherjee
-
Publication number: 20220027564
Abstract: This disclosure describes converting computer-executable predicate-argument structures for a specific field to field-specific predicate-argument structures to improve execution. In some implementations, a method can be performed by one or more processors of a computing device, and can include receiving one or more predicate-argument structures (PASs) associated with taxation-specific text and converting the one or more PASs into one or more tax-specific predicate-argument structures (TPASs). Converting the one or more PASs to one or more TPASs may include one or more of: defining terms in a segment based on a definition of the term from a different segment or line description (including from a different document); reordering nodes, replacing nodes, or removing nodes of a segment (such as based on one or more single segment tree traversal rules); or combining multiple PASs for multiple segments of a single line description based on one or more multiple segment tree traversal rules.
Type: Application
Filed: July 24, 2020
Publication date: January 27, 2022
Applicant: Intuit Inc.
Inventors: Esmé Manandise, Karpaga Ganesh Patchirajan, Saikat Mukherjee
-
Patent number: 11222266
Abstract: A method and system learns functions to be associated with data fields of forms to be incorporated into an electronic document preparation system. The functions are essentially sets of operations required to calculate the data field. The method and system receive form data related to a data field that expects data values resulting from performing specific operations. The method and system utilize machine learning and training set data to generate, test, and evaluate candidate functions to determine acceptable functions.
Type: Grant
Filed: October 14, 2016
Date of Patent: January 11, 2022
Assignee: Intuit Inc.
Inventors: Cem Unsal, Saikat Mukherjee, Roger Charles Meike
-
Patent number: 11218293
Abstract: Systems and methods are provided for implementing swarm learning while using blockchain technology and election/voting mechanisms to ensure data privacy. Nodes may train local instances of a machine learning model using local data, from which parameters are derived or extracted. Those parameters may be encrypted and persisted until a merge leader is elected that can merge the parameters using a public key generated by an external key manager. A decryptor that is not the merge leader can be elected to decrypt the merged parameter using a corresponding private key, and the decrypted merged parameter can then be shared amongst the nodes, and applied to their local models. This process can be repeated until a desired level of learning has been achieved. The public and private keys are never revealed to the same node, and may be permanently discarded after use to further ensure privacy.
Type: Grant
Filed: January 27, 2020
Date of Patent: January 4, 2022
Assignee: Hewlett Packard Enterprise Development LP
Inventors: Sathyanarayanan Manamohan, Vishesh Garg, Krishnaprasad Lingadahalli Shastry, Saikat Mukherjee
-
Publication number: 20210398017
Abstract: Systems and methods are provided for calculating validation loss in a distributed machine learning network, where nodes train local instances of a machine learning model using local data maintained at those nodes. After each training iteration of the local instances of the machine learning model, each node may calculate a local validation loss value corresponding to the performance of the local instance of the machine learning model trained at each of the nodes. Those local validation loss values may be shared with an elected leader that can average all the local validation loss values and return a global validation loss value to the nodes. The nodes may then determine whether or not training of their local instance of the machine learning model should stop or continue.
Type: Application
Filed: March 18, 2021
Publication date: December 23, 2021
Inventors: Vishesh Garg, Sathyanarayanan Manamohan, Saikat Mukherjee, Krishnaprasad Lingadahalli Shastry
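As a rough illustration of the averaging step described in this abstract, the Python sketch below has a leader compute the mean of the local validation losses, and each node apply a stopping rule to the history of global values. The abstract does not say how nodes decide to stop, so the patience-based rule, its parameters, and the function names `global_validation_loss` and `should_stop` are all assumptions.

```python
def global_validation_loss(local_losses):
    """Leader step: average the local validation losses reported by the nodes."""
    if not local_losses:
        raise ValueError("at least one local loss is required")
    return sum(local_losses) / len(local_losses)


def should_stop(global_history, patience=2, min_delta=1e-3):
    """Node step (assumed rule): stop when the last `patience` rounds failed to
    improve on the best earlier global loss by at least `min_delta`."""
    if len(global_history) <= patience:
        return False  # not enough rounds yet to judge
    best_before = min(global_history[:-patience])
    best_recent = min(global_history[-patience:])
    return best_before - best_recent < min_delta
```

Because every node receives the same global value, every node applying the same rule reaches the same stop-or-continue decision without sharing its local data.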
-
Publication number: 20210350081
Abstract: Systems and methods for recognizing domain specific named entities are disclosed. An example method may be performed by one or more processors of a text incorporation system and include extracting a number of terms from a text under consideration, identifying, among the number of terms, a set of unmatched terms that do not match any of a plurality of known terms, passing each respective unmatched term to a vectorization module, embedding a vectorized version of each respective unmatched term in a vector space, comparing each vectorized version to known term vectors, passing, to a machine learning model, candidate terms corresponding to known term vectors closest to the vectorized versions, identifying, using the machine learning model, a best candidate term for each respective unmatched term, mapping the best candidate terms to unmatched terms in the text under consideration, and incorporating the text under consideration into the system based on the mappings.
Type: Application
Filed: July 20, 2021
Publication date: November 11, 2021
Applicant: Intuit Inc.
Inventors: Conrad De Peuter, Karpaga Ganesh Patchirajan, Saikat Mukherjee
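The first half of this pipeline (finding unmatched terms, vectorizing them, and retrieving the closest known terms as candidates) can be sketched in Python as follows. The abstract does not specify the vectorization, so character-trigram counts with cosine similarity are used here purely as a stand-in, and the final machine learning model that picks the best candidate is left out. All function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def unmatched_terms(terms, known_terms):
    """Terms from the text that do not exactly match any known term."""
    known = {t.lower() for t in known_terms}
    return [t for t in terms if t.lower() not in known]


def trigram_vector(term):
    """Stand-in vectorization: counts of padded character trigrams."""
    padded = f"  {term.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def closest_known_terms(term, known_terms, k=2):
    """Candidate known terms whose vectors are closest to the unmatched term."""
    vec = trigram_vector(term)
    ranked = sorted(known_terms,
                    key=lambda t: cosine(vec, trigram_vector(t)),
                    reverse=True)
    return ranked[:k]
```

In the flow the abstract describes, the candidates returned by `closest_known_terms` would then be handed to a trained model for final selection rather than taken at face value.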
-
Patent number: 11163956
Abstract: A natural language processing method and system utilizes a combination of rules-based processes, vector-based processes, and machine learning-based processes to identify the meaning of terms extracted from data management system related text. Once the meaning of the terms has been identified, the method and system can automatically incorporate new forms and text into a data management system.
Type: Grant
Filed: May 23, 2019
Date of Patent: November 2, 2021
Assignee: Intuit Inc.
Inventors: Conrad De Peuter, Karpaga Ganesh Patchirajan, Saikat Mukherjee
-
Publication number: 20210287302
Abstract: A method and system to learn new forms to be incorporated into an electronic document preparation system, or to learn the behavior of existing systems, receive form data related to a new form having a plurality of data fields that expect data values based on specific functions. The method and system gather training set data including previously filled forms having completed data fields corresponding to the data fields of the new form. The method and system include multiple analysis modules that each generate candidate functions for providing data values for the data fields of the new form. The method and system evaluate the candidate functions from each analysis technique and select the candidate functions that are most accurate based on comparisons with the training set data.
Type: Application
Filed: May 26, 2021
Publication date: September 16, 2021
Applicant: Intuit Inc.
Inventors: Saikat Mukherjee, Cem Unsal, William T. Laaser, Mritunjay Kumar, Anu Sreepathy, Per-Kristian Halvorsen
-
Publication number: 20210264146
Abstract: A method may include acquiring, from an initial document having a document type, initial document elements and initial attributes, deriving initial features for the initial document elements using the initial attributes, detecting initial form components using the initial features, clustering the initial form components into initial line objects of an initial structured representation by applying an unsupervised machine learning model to the geometric attributes of the initial document elements, acquiring, from a next document having the document type, next document elements and next attributes describing the next document elements, deriving next features for the next document elements using the next attributes, detecting next form components using the next features, determining that the initial form components and the next form components are different, clustering the next form components into next line objects of a next structured representation, and replacing the initial structured representation with the
Type: Application
Filed: April 29, 2021
Publication date: August 26, 2021
Applicant: Intuit Inc.
Inventors: Anu Singh, Saikat Mukherjee, Mritunjay Kumar, Karpaga Ganesh Patchirajan
-
Publication number: 20210234668
Abstract: Systems and methods are provided for implementing swarm learning while using blockchain technology and election/voting mechanisms to ensure data privacy. Nodes may train local instances of a machine learning model using local data, from which parameters are derived or extracted. Those parameters may be encrypted and persisted until a merge leader is elected that can merge the parameters using a public key generated by an external key manager. A decryptor that is not the merge leader can be elected to decrypt the merged parameter using a corresponding private key, and the decrypted merged parameter can then be shared amongst the nodes, and applied to their local models. This process can be repeated until a desired level of learning has been achieved. The public and private keys are never revealed to the same node, and may be permanently discarded after use to further ensure privacy.
Type: Application
Filed: January 27, 2020
Publication date: July 29, 2021
Inventors: Sathyanarayanan Manamohan, Vishesh Garg, Krishnaprasad Lingadahalli Shastry, Saikat Mukherjee
-
Patent number: 11048933
Abstract: A method may include acquiring, from a document, document elements and attributes describing the document elements. One or more of the attributes may be geometric attributes describing a placement of the corresponding document element within the document. The method may further include deriving features for the document elements using the attributes, detecting form components using the features, clustering the form components into line objects of a structured representation by applying an unsupervised machine learning model to the geometric attributes of the document elements, and populating a compliance form using the structured representation.
Type: Grant
Filed: September 12, 2019
Date of Patent: June 29, 2021
Assignee: Intuit Inc.
Inventors: Anu Singh, Saikat Mukherjee, Mritunjay Kumar, Karpaga Ganesh Patchirajan
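To make the idea of clustering form components into line objects from geometric attributes more concrete, here is a minimal Python sketch. It does not reproduce the unsupervised model in the patent; it substitutes a simple one-dimensional gap-based grouping on vertical position. The `Element` class, its fields, and the `max_gap` parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Element:
    text: str
    x: float  # horizontal position on the page (assumed units)
    y: float  # vertical position on the page (assumed units)


def cluster_into_lines(elements, max_gap=5.0):
    """Group elements into lines: consecutive elements (sorted by y) stay in
    the same line while the vertical gap between them is at most `max_gap`."""
    lines = []
    for el in sorted(elements, key=lambda e: e.y):
        if lines and el.y - lines[-1][-1].y <= max_gap:
            lines[-1].append(el)
        else:
            lines.append([el])
    # Within each line, order elements left to right.
    return [sorted(line, key=lambda e: e.x) for line in lines]
```

For example, a label at `y=100` and an amount at `y=102` end up in the same line object, while a label at `y=130` starts a new one.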
-
Patent number: 11049190
Abstract: A method and system to learn new forms to be incorporated into an electronic document preparation system, or to learn the behavior of existing systems, receive form data related to a new form having a plurality of data fields that expect data values based on specific functions. The method and system gather training set data including previously filled forms having completed data fields corresponding to the data fields of the new form. The method and system include multiple analysis modules that each generate candidate functions for providing data values for the data fields of the new form. The method and system evaluate the candidate functions from each analysis technique and select the candidate functions that are most accurate based on comparisons with the training set data.
Type: Grant
Filed: December 20, 2016
Date of Patent: June 29, 2021
Assignee: Intuit Inc.
Inventors: Saikat Mukherjee, Cem Unsal, William T. Laaser, Mritunjay Kumar, Anu Sreepathy, Per-Kristian Halvorsen
-
Publication number: 20210034858
Abstract: A method may include acquiring, from a document, document elements and attributes describing the document elements. One or more of the attributes may be geometric attributes describing a placement of the corresponding document element within the document. The method may further include deriving features for the document elements using the attributes, detecting form components using the features, clustering the form components into line objects of a structured representation by applying an unsupervised machine learning model to the geometric attributes of the document elements, and populating a compliance form using the structured representation.
Type: Application
Filed: September 12, 2019
Publication date: February 4, 2021
Applicant: Intuit Inc.
Inventors: Anu Singh, Saikat Mukherjee, Mritunjay Kumar, Karpaga Ganesh Patchirajan
-
Publication number: 20200159990
Abstract: A method and system parses natural language in a unique way, determining important words pertaining to a text corpus of a particular genre, such as tax preparation. Sentences extracted from instructions or forms pertaining to tax preparation, for example, are parsed to determine word groups forming various parts of speech, and then are processed to exclude words on an exclusion list and word groups that don't meet predetermined criteria. From the resulting data, synonyms are replaced with a common functional operator and the resulting sentence text is analyzed against predetermined patterns to determine one or more functions to be used in a document preparation system.
Type: Application
Filed: January 23, 2020
Publication date: May 21, 2020
Applicant: Intuit Inc.
Inventors: Saikat Mukherjee, Esmé Manandise, Sudhir Agarwal, Karpaga Ganesh Patchirajan
-
Publication number: 20200151619
Abstract: A system and method for accounting for the impact of concept drift in selecting machine learning training methods to address the identified impact. Pattern recognition is performed on performance metrics of a deployed production model in an Internet-of-Things (IoT) environment to determine the impact that concept drift (data drift) has had on prediction performance. This concurrent analysis is utilized to select one or more approaches for training machine learning models, thereby accounting for the temporal dynamics of concept drift (and its subsequent impact on prediction performance) in a faster and more efficient manner.
Type: Application
Filed: November 9, 2018
Publication date: May 14, 2020
Inventors: Satish Kumar Mopur, Gregory S. Battas, Gunalan Perumal Vijayan, Krishnaprasad Lingadahalli Shastry, Saikat Mukherjee, Ashutosh Agrawal, Sridhar Balachandriah
-
Publication number: 20200112490
Abstract: The disclosure relates to a framework for dynamic management of analytic functions such as data processors and machine learned (“ML”) models for an Internet of Things intelligent edge that addresses management of the lifecycle of the analytic functions from creation to execution, in production. The end user will be seamlessly able to check in an analytic function, version it, deploy it, evaluate model performance and deploy refined versions into the data flows at the edge or core dynamically for existing and new end points. The framework comprises a hypergraph-based model as a foundation, and may use a microservices architecture with the ML infrastructure and models deployed as containerized microservices.
Type: Application
Filed: October 4, 2018
Publication date: April 9, 2020
Inventors: Satish Kumar Mopur, Saikat Mukherjee, Gunalan Perumal Vijayan, Sridhar Balachandriah, Ashutosh Agrawal, Krishnaprasad Lingadahalli Shastry, Gregory S. Battas