Patents by Inventor Srikant Panda

Srikant Panda has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

DYNAMIC VOCABULARIES FOR CONDITIONING A LANGUAGE MODEL FOR TRANSFORMING NATURAL LANGUAGE TO A LOGICAL FORM

Publication number: 20250238614

Abstract: Techniques are disclosed herein for generating dynamic vocabularies for conditioning a language model. A dynamic vocabulary is constructed from an input prompt, database schema information for a database to be queried, and programming language information for a programming language to be used for querying the database to condition the language model to predict an output statement in the programming language. The dynamic vocabulary can be included in prompt information that is provided to the language model. The number of tokens in the dynamic vocabulary can be different than a number of tokens included in a vocabulary of the language model. By utilizing a dynamic vocabulary, the language model can be conditioned to predict tokens for the output statement that are contextually consistent with the tokens included in the dynamic vocabulary.
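Illustrative sketch (not taken from the application): one way to read the abstract above is that a per-query token list is assembled from the prompt, the schema, and the query language, then placed in the prompt information. The keyword list, the function names, the literal-matching pattern, and the prompt layout below are all assumptions made for illustration.

```python
import re

# Hypothetical keyword list; the abstract does not enumerate one.
SQL_KEYWORDS = ("SELECT", "FROM", "WHERE", "JOIN", "ON",
                "GROUP BY", "ORDER BY", "AND", "OR")


def build_dynamic_vocabulary(prompt, schema, keywords=SQL_KEYWORDS):
    """Return an ordered, duplicate-free token list for one query.

    schema maps table names to lists of column names (assumed shape).
    """
    vocab = []

    def add(token):
        if token not in vocab:
            vocab.append(token)

    # Programming language information: query keywords.
    for keyword in keywords:
        add(keyword)
    # Database schema information: table and column names.
    for table, columns in schema.items():
        add(table)
        for column in columns:
            add(column)
    # Input prompt: quoted strings and numbers that may appear as literals.
    for literal in re.findall(r"'[^']*'|\b\d+(?:\.\d+)?\b", prompt):
        add(literal)
    return vocab


def build_prompt_information(prompt, vocab):
    """Embed the dynamic vocabulary in the text handed to the model."""
    return f"Question: {prompt}\nVocabulary: {', '.join(vocab)}\nSQL:"
```

The resulting list is typically far smaller than a language model's own vocabulary, which matches the abstract's note that the two token counts can differ.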

Type: Application

Filed: January 23, 2024

Publication date: July 24, 2025

Applicant: Oracle International Corporation

Inventors: Srikant Panda, Amit Agarwal, Kulbhushan Pachauri

TABLE EXTRACTION FROM IMAGES USING LANGUAGE MODELS

Publication number: 20250209085

Abstract: Techniques are described for extracting tables from images using a language model. The techniques include detecting, within an image, an area that includes a table. The techniques further include extracting, from the area of the image, tabular data for the table, the extracted tabular data comprising a plurality of content items in the table and structural information for the table. The techniques further include generating a prompt that includes the plurality of content items and the structural information. The techniques further include providing the prompt as input to a language model. The techniques further include, responsive to providing the prompt as input to the language model, generating, by the language model, a parsable representation of the table, wherein the parsable representation is in a format and includes the plurality of content items of the table and the structural information of the table in the image.
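Illustrative sketch (not taken from the application): the prompt-generation step in the abstract above could look roughly like the following, assuming the detection and extraction steps have already produced cell texts with row and column indices. The cell record shape, the function name, and the prompt wording are hypothetical.

```python
def build_table_prompt(cells, n_rows, n_cols, output_format="Markdown"):
    """Combine content items and structural information into one prompt.

    cells is a list of dicts with "row", "col", and "text" keys (assumed shape).
    """
    lines = [
        f"Reconstruct the following table as {output_format}.",
        f"The table has {n_rows} rows and {n_cols} columns.",
        "Cells (row, column, text):",
    ]
    # List cells in reading order so the structure is easy to follow.
    for cell in sorted(cells, key=lambda c: (c["row"], c["col"])):
        lines.append(f"({cell['row']}, {cell['col']}) {cell['text']}")
    return "\n".join(lines)
```

The returned string would then be passed to whichever language model is in use; that call is omitted here because the abstract does not name a model or an API.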

Type: Application

Filed: July 18, 2024

Publication date: June 26, 2025

Applicant: Oracle International Corporation

Inventors: Avinash Rajeshchandra Rai, Srikant Panda, Kulbhushan Pachauri

TECHNIQUES OF INFORMATION EXTRACTION FOR SELECTION MARKS

Publication number: 20250078555

Abstract: A method may include receiving a primary document including one or more selection boxes, one or more text lines, and one or more annotations. The method may include determining a class based on the annotations. The method may include identifying the one or more selection boxes and one or more text lines of the primary document. The method may include generating a graph representing the one or more selection boxes and the one or more text lines. The method may include mapping each of the one or more selection boxes to a respective text line of the one or more text lines of the graph based at least in part on one or more characteristics associated with the selection boxes. The method may include generating a key-value pair associated with each of the one or more text lines and generating a document model of the primary document.

Type: Application

Filed: August 30, 2023

Publication date: March 6, 2025

Applicant: Oracle International Corporation

Inventors: Amit Agarwal, Srikant Panda, Kulbhushan Pachauri

TECHNIQUES OF INFORMATION EXTRACTION FOR SELECTION MARKS

Publication number: 20250078556

Abstract: A method may include detecting one or more selection boxes and one or more text lines in a primary document. The method may include determining respective vectors associated with the selection box and adjacent text lines to the selection box in a plurality of directions. The method may include determining a set of respective vectors associated with a unique selection box. The method may include determining a variance between respective vectors in the set of respective vectors and identifying a particular direction corresponding to a minimal variance between the respective vectors in the set of respective vectors as compared to a variance of other sets of respective vectors. The method may include generating a key-value pair based on the set of respective vectors characterized by the minimal variance. The method may include generating a document model, including the key-value pair, and extracting data according to the document model.
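Illustrative sketch (not taken from the application): the minimal-variance step in the abstract above can be pictured as comparing, for each direction, how consistently the nearest text line sits relative to its selection box. The record layout, the two directions shown, and the use of summed per-axis variance as the spread measure are assumptions.

```python
from statistics import pvariance

# Hypothetical input: for each box, its checked state and, per direction,
# the nearest text line with the (dx, dy) offset from the box to that line.
boxes = [
    {"checked": True,
     "neighbors": {"left": ("Name:", (-90, 0)), "right": ("Yes", (20, 0))}},
    {"checked": False,
     "neighbors": {"left": ("Yes", (-35, 1)), "right": ("No", (21, 0))}},
    {"checked": False,
     "neighbors": {"left": ("No", (-34, 0)), "right": ("Maybe", (20, 1))}},
]


def spread(vectors):
    """Sum of the per-axis population variances of (dx, dy) vectors."""
    return pvariance([v[0] for v in vectors]) + pvariance([v[1] for v in vectors])


def best_direction(boxes):
    """Pick the direction whose box-to-text offsets vary the least."""
    directions = boxes[0]["neighbors"].keys()
    return min(
        directions,
        key=lambda d: spread([b["neighbors"][d][1] for b in boxes]),
    )


def key_value_pairs(boxes):
    """Pair each box's state with its text line in the chosen direction."""
    direction = best_direction(boxes)
    return {b["neighbors"][direction][0]: b["checked"] for b in boxes}
```

With the sample data, the offsets to the right are nearly identical across boxes while the offsets to the left are not, so each label is taken from the right of its box.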

Type: Application

Filed: August 30, 2023

Publication date: March 6, 2025

Applicant: Oracle International Corporation

Inventors: Srikant Panda, Amit Agarwal, Kulbhushan Pachauri

OUT OF DISTRIBUTION ELEMENT DETECTION FOR INFORMATION EXTRACTION

Publication number: 20250014374

Abstract: Techniques for extracting information from unstructured documents that enable an ML model to be trained such that the model can accurately distinguish in-distribution (“in-D”) elements and out-of-distribution (“OO-D”) elements within an unstructured document. Novel training techniques are used that train an ML model using a combination of a regular training dataset and an enhanced augmented training dataset. The regular training dataset is used to train an ML model to identify in-D elements, i.e., to classify an element extracted from a document as belonging to one of the in-D classes contained in the regular training dataset. The augmented training dataset, which is generated based upon the regular training dataset, may contain one or more augmented elements which are used to train the model to identify OO-D elements, i.e., to classify an augmented element extracted from a document as belonging to an OO-D class instead of to an in-D class.
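Illustrative sketch (not taken from the application): the combination of a regular and an augmented training dataset described above might be assembled as follows. The abstract does not say how augmented elements are produced, so the character shuffle used here is a placeholder assumption, as are the label name and function names.

```python
import random

OOD_LABEL = "OO-D"  # hypothetical name for the out-of-distribution class


def augment(text, rng):
    """Placeholder augmentation: shuffle the characters of an element."""
    chars = list(text)
    rng.shuffle(chars)
    return "".join(chars)


def build_training_set(regular, seed=0):
    """Return regular (text, label) pairs followed by augmented OO-D pairs."""
    rng = random.Random(seed)
    augmented = [(augment(text, rng), OOD_LABEL) for text, _ in regular]
    return list(regular) + augmented
```

A classifier trained on the combined list sees every in-D class plus one extra class, which is what lets it assign unfamiliar elements to the OO-D class rather than forcing them into an in-D class.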

Type: Application

Filed: July 6, 2023

Publication date: January 9, 2025

Applicant: Oracle International Corporation

Inventors: Srikant Panda, Amit Agarwal, Gouttham Nambirajan, Kulbhushan Pachauri

DOMAIN ADAPTING GRAPH NETWORKS FOR VISUALLY RICH DOCUMENTS

Publication number: 20240289551

Abstract: In some implementations, techniques described herein may include identifying text in a visually rich document and determining a sequence for the identified text. The techniques may include selecting a language model based at least in part on the identified text and the determined sequence. Moreover, the techniques may include assigning each word of the identified text to a respective token to generate textual features corresponding to the identified text. The techniques may include extracting visual features corresponding to the identified text. The techniques may include determining positional features for each word of the identified text. The techniques may include generating a graph representing the visually rich document, each node in the graph representing each of the visual features, textual features, and positional features of a respective word of the identified text. The techniques may include training a classifier on the graph to classify each respective word of the identified text.

Type: Application

Filed: August 31, 2023

Publication date: August 29, 2024

Applicant: Oracle International Corporation

Inventors: Amit Agarwal, Srikant Panda, Deepak Karmakar, Kulbhushan Pachauri

SYNTHETIC DOCUMENT GENERATION PIPELINE FOR TRAINING ARTIFICIAL INTELLIGENCE MODELS

Publication number: 20240005640

Abstract: Embodiments described herein are directed towards a synthetic document generation pipeline for training artificial intelligence models. One embodiment includes a method including a device that receives an instruction to generate a document to be used as a training instance for a first machine learning model, the instruction including an element configuration, a document class configuration, a format configuration, an augmentation configuration, and data bias and fairness. The device can receive an element from an interface based at least in part on the element configuration; the element can simulate a real-world image, real-world text, or real-world machine-readable visual code. The device can generate metadata describing a layout for the element on the document based on the document class configuration. The device can generate the document by arranging the element on the document based on the metadata, wherein the document is generated in a format based on the format configuration.

Type: Application

Filed: November 28, 2022

Publication date: January 4, 2024

Applicant: Oracle International Corporation

Inventors: Amit Agarwal, Srikant Panda, Kulbhushan Pachauri

APPARATUS AND METHOD FOR FRAUD DETECTION

Publication number: 20200252802

Abstract: Approaches, techniques, and mechanisms are disclosed for generating subscriptions. According to one embodiment, one or more local features of an input request for service subscription are generated based at least in part on one or more messages originated from a client device that represent the input request. One or more global features of a population of input requests originated from a population of client devices are determined based at least in part on a population of input requests. One or more mapped global features of the input request are generated from the one or more global features via one or more mapping functions. One or more machine learning (ML) based prediction models are applied to the one or more local features and the one or more mapped global features of the input request to compute a fraud score for the input request. The fraud score for the input request is used to determine whether the input request for service subscription is to be accepted.
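Illustrative sketch (not taken from the application): the flow in the abstract above, from local features and mapped global features to a fraud score and an accept decision, can be outlined as below. The specific features, the hand-set logistic weights standing in for a trained ML model, and the 0.5 threshold are all assumptions.

```python
import math
from collections import Counter

# Hypothetical weights standing in for a trained prediction model.
WEIGHTS = {"lines_requested": 0.3, "requests_from_same_ip": 0.8}
BIAS = -4.0


def local_features(request):
    """Features computed from the single input request."""
    return {"lines_requested": request["lines_requested"]}


def global_features(population):
    """Features of the whole request population: request count per IP."""
    return Counter(r["ip"] for r in population)


def mapped_global_features(request, ip_counts):
    """Mapping function: look up the population count for this request's IP."""
    return {"requests_from_same_ip": ip_counts[request["ip"]]}


def fraud_score(request, population):
    """Logistic score in (0, 1) over local and mapped global features."""
    features = {
        **local_features(request),
        **mapped_global_features(request, global_features(population)),
    }
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def accept(request, population, threshold=0.5):
    """Accept the subscription request when its score is below the threshold."""
    return fraud_score(request, population) < threshold
```

In this toy setup, a request for many lines from an address that already dominates the population scores high and is rejected, while an isolated single-line request is accepted.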

Type: Application

Filed: February 6, 2019

Publication date: August 6, 2020

Inventors: Kulbhushan Pachauri, Bo Shen, Srikant Panda