Patents by Inventor Budhaditya Saha

Budhaditya Saha has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240062112
    Abstract: Techniques are disclosed herein for adaptive training data augmentation to facilitate training named entity recognition (NER) models. The adaptive augmentation techniques take into consideration the distribution of different entity types within the training data. They generate adaptive numbers of augmented examples (e.g., utterances) based on the distribution of entities, to ensure that enough examples for minority class entities are generated during augmentation of the training data.
    Type: Application
    Filed: August 16, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Omid Mohamad Nezami, Thanh Tien Vu, Budhaditya Saha, Shubham Pawankumar Shah
  • Publication number: 20230325599
    Abstract: Techniques are provided for augmenting training data using gazetteers and perturbations to facilitate training named entity recognition models. The training data can be augmented by generating additional utterances from original utterances in the training data and combining the generated additional utterances with the original utterances to form the augmented training data. The additional utterances can be generated by replacing the named entities in the original utterances with different named entities and/or perturbed versions of the named entities in the original utterances selected from a gazetteer. Gazetteers of named entities can be generated from the training data and expanded by searching a knowledge base and/or perturbing the named entities therein. The named entity recognition model can be trained using the augmented training data.
    Type: Application
    Filed: March 17, 2023
    Publication date: October 12, 2023
    Applicant: Oracle International Corporation
    Inventors: Omid Mohamad Nezami, Shivashankar Subramanian, Thanh Tien Vu, Tuyen Quang Pham, Budhaditya Saha, Aashna Devang Kanuga, Shubham Pawankumar Shah
  • Publication number: 20230095673
    Abstract: Techniques for extracting key information from a document using machine-learning models in a chatbot system are disclosed herein. In one particular aspect, a method is provided that includes receiving a set of data, which includes key fields, within a document at a data processing system that includes a table detection module, a key information extraction module, and a table extraction module. Text information and corresponding location data are extracted via optical character recognition. The table detection module detects whether one or more tables are present in the document and, if applicable, a location of each of the tables. The key information extraction module extracts text from the key fields. The table extraction module extracts each of the tables based on input from the optical character recognition and the table detection module. Extraction results that include the text from the key fields and each of the tables can be output.
    Type: Application
    Filed: August 15, 2022
    Publication date: March 30, 2023
    Applicant: Oracle International Corporation
    Inventors: Yakupitiyage Don Thanuja Samodhye Dharmasiri, Xu Zhong, Ahmed Ataallah Ataallah Abobakr, Hongtao Yang, Budhaditya Saha, Shaoke Xu, Shashi Prasad Suravarapu, Mark Edward Johnson, Thanh Long Duong
  • Publication number: 20230098783
    Abstract: Techniques are disclosed herein for focused training of language models and end-to-end hypertuning of the framework. In one aspect, a method is provided that includes obtaining a machine learning model pre-trained for language modeling, and post-training the machine learning model for various tasks to generate a focused machine learning model. The post-training includes: (i) training the machine learning model on an unlabeled set of training data pertaining to a task that the machine learning model was pre-trained for as part of the language modeling, and the unlabeled set of training data is obtained with respect to a target domain, a target task, or a target language, and (ii) training the machine learning model on a labeled set of training data that pertains to another task that is an auxiliary task related to a downstream task to be performed using the machine learning model or output from the machine learning model.
    Type: Application
    Filed: September 23, 2022
    Publication date: March 30, 2023
    Applicant: Oracle International Corporation
    Inventors: Poorya Zaremoodi, Cong Duy Vu Hoang, Duy Vu, Dai Hoang Tran, Budhaditya Saha, Nagaraj N. Bhat, Thanh Tien Vu, Tuyen Quang Pham, Adam Craig Pocock, Katherine Silverstein, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong
  • Publication number: 20210375488
    Abstract: An automatic medical knowledge curation system automatically extracts medical knowledge from multiple sources, including medical journals, publications and publication databases, and stores this extracted information in the form of a large-scale medical knowledge graph. The system identifies clinical, health and life insurance risk factor entities and medical management information including disease detection, smoking, alcohol consumption patterns, lifestyle information, diagnosis, prognosis, treatment, measuring, monitoring and reporting. The system determines relationships between clinical entities using machine learning and data mining methods. The system determines relationship strengths and can also determine missing and noisy relationships.
    Type: Application
    Filed: May 25, 2021
    Publication date: December 2, 2021
    Applicant: MEDIUS HEALTH
    Inventors: Shameek Ghosh, Budhaditya Saha, Suhrid Satyal
  • Publication number: 20210287800
    Abstract: A personal medical-bot with a natural language translator implemented on a personal communication device. The medical-bot interacts in natural language with a user respondent/patient who presents a medical problem. The medical-bot includes a natural language translator with an artificial intelligence (AI) module that accepts the natural language inputs and identifies medically relevant terminologies and their associations. These are fed to the AI for processing to generate clinical-based queries to be answered by the patient. The responses are used by the medical-bot to extract medically relevant data for establishing a medical history and enabling a medical diagnosis for the patient. The medical-bot is able to simulate the sequential queries of a doctor or nurse practitioner to arrive at a diagnosis and an immediate treatment plan, which determines the triage to be followed by the patient. A health score for the patient is also determined.
    Type: Application
    Filed: March 9, 2021
    Publication date: September 16, 2021
    Applicant: MEDIUS HEALTH
    Inventors: Shameek Ghosh, Budhaditya Saha
  • Patent number: 8744124
    Abstract: The present disclosure concerns methods and/or systems for processing, detecting and/or notifying of the presence of anomalies or infrequent events from data. Some of the disclosed methods and/or systems may be used on large-scale data sets. Certain applications are directed to analyzing sensor surveillance records to identify aberrant behavior. The sensor data may be from a number of sensor types including video and/or audio. Certain applications are directed to methods and/or systems that use compressive sensing. Certain applications may be performed in substantially real time.
    Type: Grant
    Filed: April 1, 2010
    Date of Patent: June 3, 2014
    Assignee: Curtin University of Technology
    Inventors: Svetha Venkatesh, Budhaditya Saha, Mihai Mugurel Lazarescu, Duc-Son Pham
  • Publication number: 20120063641
    Abstract: The present disclosure concerns methods and/or systems for processing, detecting and/or notifying of the presence of anomalies or infrequent events from data. Some of the disclosed methods and/or systems may be used on large-scale data sets. Certain applications are directed to analyzing sensor surveillance records to identify aberrant behavior. The sensor data may be from a number of sensor types including video and/or audio. Certain applications are directed to methods and/or systems that use compressive sensing. Certain applications may be performed in substantially real time.
    Type: Application
    Filed: April 1, 2010
    Publication date: March 15, 2012
    Applicant: Curtin University of Technology
    Inventors: Svetha Venkatesh, Budhaditya Saha, Mihai Mugurel Lazarescu, Duc-Son Pham
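
Illustrative note: the abstract for publication 20240062112 above describes generating adaptive numbers of augmented examples based on the distribution of entity types, so that minority class entities receive enough examples. The abstract does not state a formula, so the following is only a minimal sketch under stated assumptions. The function names `count_entities` and `adaptive_augmentation_counts`, the `target_ratio` parameter, and the rule of topping each entity type up toward the count of the most frequent type are all hypothetical and are not taken from the patent application.

```python
import math
from collections import Counter


def count_entities(examples):
    """Count how often each entity type occurs across labeled utterances.

    `examples` is a list of (utterance, entity_types) pairs; this input
    layout is an assumption made for illustration.
    """
    counts = Counter()
    for _text, entity_types in examples:
        counts.update(entity_types)
    return counts


def adaptive_augmentation_counts(entity_counts, target_ratio=1.0):
    """Return how many augmented examples to generate per entity type.

    Assumed rule (not from the source): every entity type is topped up
    toward target_ratio * (count of the most frequent type), so rarer
    types receive more augmented examples and frequent types receive none.
    """
    if not entity_counts:
        return {}
    target = math.ceil(target_ratio * max(entity_counts.values()))
    return {etype: max(0, target - n) for etype, n in entity_counts.items()}


# Toy training set: LOCATION is the majority class, CURRENCY the rarest.
examples = [
    ("book a flight to Paris", ["LOCATION"]),
    ("fly from Rome to Oslo", ["LOCATION", "LOCATION"]),
    ("pay Alice 20 dollars", ["PERSON", "CURRENCY"]),
    ("meet Bob in Berlin", ["PERSON", "LOCATION"]),
]

counts = count_entities(examples)
plan = adaptive_augmentation_counts(counts)
for etype, n_new in sorted(plan.items()):
    print(f"{etype}: have {counts[etype]}, generate {n_new} more")
```

In this sketch the number of generated examples depends on each entity type's share of the data, which is the general idea the abstract names; a real system would also need an augmentation step that produces the new utterances themselves.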