Patents by Inventor Subrata Mitra

Subrata Mitra has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12164517
    Abstract: To retrieve information derived from a plurality of separately stored datasets, join structures are identified within the plurality of separately stored datasets. Join structures can include datasets joined by a central dataset, datasets joined by a single key, and datasets joined across a plurality of keys. Each of the join structures corresponds to a query processing schema that defines a sampling technique. When a join query is received as a SQL query, the join query identifies a portion of the plurality of separately stored datasets, from which a join structure is selected and a corresponding query processing schema is identified. The join query is reconstructed to form a reconstructed join query that comprises query processing schema instructions to derive the requested information using the sampling technique defined by the identified query processing schema.
    Type: Grant
    Filed: January 3, 2023
    Date of Patent: December 10, 2024
    Assignee: Adobe Inc.
    Inventors: Vibhor Porwal, Yeuk-Yin Chan, Vidit Bhatia, Subrata Mitra, Shaddy Garg, Sergey N Kazarin, Sameeksha Arora, Himanshu Panday, Gautam Pratap Kowshik, Fan Du, Anup Bandigadi Rao, Anil Malkani
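The idea of matching a join query to a join structure, and that structure to a sampling technique, can be illustrated with a minimal sketch. This is not the patented method: the structure labels, the classification rules, and the technique names in `SAMPLING_BY_STRUCTURE` are hypothetical placeholders chosen for illustration only.

```python
from collections import Counter

# Hypothetical mapping from join structure to a sampling technique name.
# The abstract only says each structure corresponds to a schema that
# defines a sampling technique; these names are assumptions.
SAMPLING_BY_STRUCTURE = {
    "single_key": "hash_sample_on_shared_key",
    "star": "sample_central_dataset",
    "multi_key": "sample_per_key_then_join",
}


def classify_join(joins):
    """Classify a join described as (left_table, right_table, key) tuples."""
    keys = {key for _, _, key in joins}
    if len(keys) == 1:
        # Every join uses the same key.
        return "single_key"
    degree = Counter()
    for left, right, _ in joins:
        degree[left] += 1
        degree[right] += 1
    # A central dataset takes part in every join of the query.
    if any(count == len(joins) for count in degree.values()):
        return "star"
    return "multi_key"


def choose_sampling(joins):
    """Return the (structure, technique) pair selected for a join query."""
    structure = classify_join(joins)
    return structure, SAMPLING_BY_STRUCTURE[structure]
```

For example, a query joining a hypothetical `orders` table to `customers` on `customer_id` and to `products` on `product_id` would be classified as `star` under these assumed rules, since `orders` appears in both joins.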
  • Publication number: 20240394407
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that implement a secure distributed data collaboration architecture for generating synthetic datasets. For example, the disclosed system sends a request to perform a data collaboration with a first dataset of a first local node and a second dataset of a second local node. The disclosed system receives intermediate feature maps from the local nodes that correspond with the datasets and generates a combined feature map. Further, the disclosed system generates a synthetic dataset from the combined feature map by utilizing a central generative model. Moreover, the synthetic dataset generated by the disclosed system is statistically representative of the first dataset and the second dataset.
    Type: Application
    Filed: May 26, 2023
    Publication date: November 28, 2024
    Inventors: Sunav Choudhary, Subrata Mitra, Sanjay Sukumaran, Priyanshu Yadav, Munish Gupta, Jashn Arora, Iftikhar Ahamath Burhanuddin, Gautam Choudhary, Atharv Tyagi
  • Publication number: 20240386002
    Abstract: A dataset comprising tables is received. Embeddings are generated for column titles of a table. Based on the embeddings, similar tables are clustered. The tables are organized into smaller clusters based on statistical similarities. Similarity scores are calculated for tables within the same cluster. A relatedness graph is created based on the similarity scores; similar tables are represented by nodes connected by edges. If the similarity score for a pair of tables exceeds a threshold, a table is deleted.
    Type: Application
    Filed: May 18, 2023
    Publication date: November 21, 2024
    Inventors: Raunak Shah, Koyel Mukherjee, Subrata Mitra, Dhruv Joshi, Sai Karnam, Shivam Pravin Bhosale
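The scoring, graph-building, and deletion steps in this abstract can be sketched roughly as follows. As a loud assumption, Jaccard similarity over column titles stands in for the embedding-based and statistical similarity the abstract describes, and the function names and the `0.8` threshold are hypothetical.

```python
from itertools import combinations


def jaccard(columns_a, columns_b):
    """Jaccard similarity of two collections of column titles."""
    a, b = set(columns_a), set(columns_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_redundant_tables(tables, threshold=0.8):
    """Build relatedness edges and pick tables to delete.

    tables: dict mapping table name -> list of column titles.
    Returns (edges, to_delete), where edges are (left, right, score)
    tuples for pairs whose score exceeds the threshold.
    """
    edges = []
    to_delete = set()
    for left, right in combinations(sorted(tables), 2):
        score = jaccard(tables[left], tables[right])
        if score > threshold:
            edges.append((left, right, score))
            # Keep the first table of the pair unless it was already dropped.
            if left not in to_delete:
                to_delete.add(right)
    return edges, to_delete
```

Sorting the table names makes the choice of which table to keep deterministic; a real system would likely use a more considered rule for that choice.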
  • Patent number: 12079217
    Abstract: Some techniques described herein relate to utilizing a machine-learning (ML) model to select respective samples for queries of a query sequence. In one example, a method includes receiving a query in a query sequence, where the query is directed toward a dataset. Samples are available as down-sampled versions of the dataset. The method further includes applying an agent to select, for the query, a sample from among the samples of the dataset. The agent includes an ML model trained, such as via intent-based reinforcement learning, to select respective samples for queries. The query is then executed against the sample to output a response.
    Type: Grant
    Filed: May 11, 2022
    Date of Patent: September 3, 2024
    Assignee: Adobe Inc.
    Inventors: Subrata Mitra, Yash Gadhia, Tong Yu, Shaddy Garg, Nikhil Sheoran, Arjun Kashettiwar, Anjali Yadav
  • Publication number: 20240273296
    Abstract: Embodiments of the technology described herein describe a machine classifier capable of continually learning new classes through a continual few-shot learning approach. A natural language processing (NLP) machine classifier may initially be trained to identify a plurality of other classes through a conventional training process. In order to learn a new class, natural-language training data for a new class is generated. The training data for the new class may be few-shot training data. The training also uses synthetic training data that represents each of the plurality of other classes. The synthetic training data may be generated through a model inversion of the original classifier. The synthetic training data and the natural-language training data are used to retrain the NLP classifier to identify text in the plurality of other classes and the new class.
    Type: Application
    Filed: April 3, 2024
    Publication date: August 15, 2024
    Inventors: Sungchul Kim, Subrata Mitra, Ruiyi Zhang, Rui Wang, Handong Zhao, Tong Yu
  • Publication number: 20240220502
    Abstract: To retrieve information derived from a plurality of separately stored datasets, join structures are identified within the plurality of separately stored datasets. Join structures can include datasets joined by a central dataset, datasets joined by a single key, and datasets joined across a plurality of keys. Each of the join structures corresponds to a query processing schema that defines a sampling technique. When a join query is received as a SQL query, the join query identifies a portion of the plurality of separately stored datasets, from which a join structure is selected and a corresponding query processing schema is identified. The join query is reconstructed to form a reconstructed join query that comprises query processing schema instructions to derive the requested information using the sampling technique defined by the identified query processing schema.
    Type: Application
    Filed: January 3, 2023
    Publication date: July 4, 2024
    Inventors: Vibhor Porwal, Yeuk-Yin Chan, Vidit Bhatia, Subrata Mitra, Shaddy Garg, Sergey N. Kazarin, Sameeksha Arora, Himanshu Panday, Gautam Pratap Kowshik, Fan Du, Anup Bandigadi Rao, Anil Malkani
  • Patent number: 12014217
    Abstract: A resource control system is described that is configured to control scheduling of executable jobs by compute instances of a service provider system. In one example, the resource control system outputs a deployment user interface to obtain job information. Upon receipt of the job information, the resource control system communicates with a service provider system to obtain logs from compute instances implemented by the service provider system for the respective executable jobs. The resource control system uses data obtained from the logs to estimate utility indicating status of respective executable jobs and an amount of time to complete the executable jobs by respective compute instances. The resource control system then employs a machine-learning module to generate an action to be performed by compute instances for respective executable jobs.
    Type: Grant
    Filed: November 30, 2021
    Date of Patent: June 18, 2024
    Assignee: Adobe Inc.
    Inventors: Subrata Mitra, Sunav Choudhary, Shaddy Garg, Anuj Jitendra Diwan, Piyush Kumar Maurya, Arpit Aggarwal, Prateek Jain
  • Patent number: 11995403
    Abstract: Embodiments of the technology described herein describe a machine classifier capable of continually learning new classes through a continual few-shot learning approach. A natural language processing (NLP) machine classifier may initially be trained to identify a plurality of other classes through a conventional training process. In order to learn a new class, natural-language training data for a new class is generated. The training data for the new class may be few-shot training data. The training also uses synthetic training data that represents each of the plurality of other classes. The synthetic training data may be generated through a model inversion of the original classifier. The synthetic training data and the natural-language training data are used to retrain the NLP classifier to identify text in the plurality of other classes and the new class.
    Type: Grant
    Filed: November 11, 2021
    Date of Patent: May 28, 2024
    Assignee: Adobe Inc.
    Inventors: Sungchul Kim, Subrata Mitra, Ruiyi Zhang, Rui Wang, Handong Zhao, Tong Yu
  • Patent number: 11989647
    Abstract: The technology described herein is directed to a self-learning application scheduler for improved scheduling distribution of resource requests, e.g., job and service scheduling requests or tasks derived therefrom, initiated by applications on a shared compute infrastructure. More specifically, the self-learning application scheduler includes a reinforcement learning agent that iteratively learns a scheduling policy to improve scheduling distribution of the resource requests on the shared compute infrastructure. In some implementations, the reinforcement learning agent learns inherent characteristics and patterns of the resource requests initiated by the applications and orchestrates placement or scheduling of the resource requests on the shared compute infrastructure to minimize resource contention and thereby improve application performance for better overall user-experience.
    Type: Grant
    Filed: February 8, 2019
    Date of Patent: May 21, 2024
    Assignee: Adobe Inc.
    Inventors: Subrata Mitra, Nikhil Sheoran, Ramanuja Narasimha Simha, Shanka Subhra Mondal, Neeraj Jagdish Dhake, Ravinder Nehra
  • Patent number: 11947986
    Abstract: Embodiments relate to tenant-side detection and mitigation of performance degradation resulting from interference generated by a noisy neighbor in a distributed computing environment. A first machine-learning model such as a k-means nearest neighbor classifier is operated by a tenant to detect an anomaly with a computer system emulator resulting from a co-located noisy neighbor. A second machine-learning model such as a multi-class classifier is operated by the tenant to identify a contended resource associated with the anomaly. A corresponding trigger signal is generated and provided to trigger various mitigation responses, including an application/framework-specific mitigation strategy (e.g., triggered approximations in application/framework performance, best-efforts paths, run-time changes, etc.), load-balancing, scaling out, updates to a scheduler to avoid impacted nodes, and the like. In this manner, a tenant can detect, classify, and mitigate performance degradation resulting from a noisy neighbor.
    Type: Grant
    Filed: June 23, 2021
    Date of Patent: April 2, 2024
    Assignee: Adobe Inc.
    Inventors: Subrata Mitra, Sopan Khosla, Sanket Vaibhav Mehta, Mekala Rajasekhar Reddy, Aashaka Dhaval Shah
  • Patent number: 11915054
    Abstract: Techniques are provided for scheduling multiple jobs on one or more cloud computing instances, which provide the ability to select a job for execution from among a plurality of jobs, and to further select a designated instance from among a plurality of cloud computing instances for executing the selected job. The job and the designated instance are each selected based on a probability distribution that a cost of executing the job on the designated instance does not exceed the budget. The probability distribution is based on several factors including a cost of prior executions of other jobs on the designated instance and a utility function that represents a value associated with a progress of each job. By scheduling select jobs on discounted cloud computing instances, the aggregate utility of the jobs can be maximized or otherwise improved for a given budget.
    Type: Grant
    Filed: May 19, 2021
    Date of Patent: February 27, 2024
    Assignee: Adobe Inc.
    Inventors: Subrata Mitra, Sunav Choudhary, Sheng Yang, Kanak Vivek Mahadik, Samir Khuller
  • Patent number: 11847496
    Abstract: A digital environment includes multiple computing nodes and a scheduling system that assigns workloads to computing nodes. The scheduling system includes an equivalence-class-based resource usage prediction system that receives a workload request and predicts an equivalence class for that workload request based on resource usage over time by the workload request or metadata associated with the workload request. The scheduling system also includes a workload assignment system that assigns the workload request to one or more of the computing nodes based on the predicted equivalence class. The number of equivalence classes is small relative to the total number of workloads that are scheduled (as an example, 10 to 15 equivalence classes for a total number of workloads in the tens or hundreds of thousands).
    Type: Grant
    Filed: October 28, 2020
    Date of Patent: December 19, 2023
    Assignee: Adobe Inc.
    Inventors: Nikhil Sheoran, Subrata Mitra
  • Patent number: 11829239
    Abstract: A method performed by one or more processors that preserves a machine learning model comprises accessing model parameters associated with a machine learning model. The model parameters are determined responsive to training the machine learning model. The method comprises generating a plurality of model parameter sets, where each of the plurality of model parameter sets comprises a separate portion of the set of model parameters. The method comprises determining one or more parity sets comprising values calculated from the plurality of model parameter sets. The method comprises distributing the plurality of model parameter sets and the one or more parity sets among a plurality of computing devices, where each of the plurality of computing devices stores a model parameter set of the plurality of model parameter sets or a parity set of the one or more parity sets. The method comprises accessing, from the plurality of computing devices, a number of sets comprising model parameter sets and at least one parity set.
    Type: Grant
    Filed: November 17, 2021
    Date of Patent: November 28, 2023
    Assignee: Adobe Inc.
    Inventors: Subrata Mitra, Ayush Chauhan, Sunav Choudhary
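The split, parity, and recovery flow in this abstract can be illustrated with a small sketch. It assumes a single bytewise XOR parity set, which is one simple way to compute parity values; the abstract does not say which parity calculation is used, and the function names here are hypothetical.

```python
import struct


def split_parameters(data: bytes, k: int):
    """Split serialized parameters into k equal-length sets (zero-padded)."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_parity(shards):
    """Compute a parity set as the bytewise XOR of equal-length sets."""
    parity = bytearray(len(shards[0]))
    for shard in shards:
        for i, byte in enumerate(shard):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(shards_with_gap, parity):
    """Rebuild the one missing set (marked None) from survivors and parity."""
    missing = shards_with_gap.index(None)
    survivors = [s for s in shards_with_gap if s is not None]
    return missing, xor_parity(survivors + [parity])


# Example: four float32 parameters, one parameter set per device.
params = struct.pack("<4f", 0.5, -1.25, 3.0, 2.0)
shards = split_parameters(params, 4)
parity = xor_parity(shards)
```

With one XOR parity set, any single lost parameter set can be rebuilt from the remaining sets plus the parity set. Tolerating more than one loss would need a stronger erasure code, which this sketch does not attempt.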
  • Publication number: 20230367772
    Abstract: Some techniques described herein relate to utilizing a machine-learning (ML) model to select respective samples for queries of a query sequence. In one example, a method includes receiving a query in a query sequence, where the query is directed toward a dataset. Samples are available as down-sampled versions of the dataset. The method further includes applying an agent to select, for the query, a sample from among the samples of the dataset. The agent includes an ML model trained, such as via intent-based reinforcement learning, to select respective samples for queries. The query is then executed against the sample to output a response.
    Type: Application
    Filed: May 11, 2022
    Publication date: November 16, 2023
    Inventors: Subrata Mitra, Yash Gadhia, Tong Yu, Shaddy Garg, Nikhil Sheoran, Arjun Kashettiwar, Anjali Yadav
  • Publication number: 20230262237
    Abstract: Systems and methods for image processing are described. The systems and methods include receiving a plurality of frames of a video at an edge device, wherein the video depicts an action that spans the plurality of frames, compressing, using an encoder network, each of the plurality of frames to obtain compressed frame features, wherein the compressed frame features include fewer data bits than the plurality of frames of the video, classifying, using a classification network, the compressed frame features at the edge device to obtain action classification information corresponding to the action in the video, and transmitting the action classification information from the edge device to a central server.
    Type: Application
    Filed: February 15, 2022
    Publication date: August 17, 2023
    Inventors: Subrata Mitra, Aniruddha Mahapatra, Kuldeep Sharad Kulkarni, Abhishek Yadav, Abhijith Kuruba, Manoj Kilaru
  • Publication number: 20230222005
    Abstract: Shared resource interference detection techniques are described. In an example, a resource detection module supports techniques to quantify levels of interference through use of working set sizes. The resource detection module selects working set sizes. The resource detection module then initiates execution of code that utilizes the shared resource based on the first working set size. The resource detection module detects a resource consumption amount based on the execution of the code. The resource detection module then determines whether the detected resource consumption amount corresponds to the defined resource consumption amount for the selected working set size.
    Type: Application
    Filed: January 11, 2022
    Publication date: July 13, 2023
    Applicant: Adobe Inc.
    Inventors: Subrata Mitra, Pradeep Dogga
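The comparison step in this abstract, checking a detected consumption amount against the defined amount for each working set size, might look like the sketch below. The `measure` callable, the `tolerance` parameter, and the ratio-based report are assumptions added for illustration; the abstract does not specify how the comparison is made.

```python
def detect_interference(expected, measure, tolerance=0.25):
    """Flag working set sizes whose observed consumption exceeds the baseline.

    expected: dict mapping working set size -> defined consumption amount
              (for example, a baseline run time measured without contention).
    measure:  callable taking a working set size and returning the
              consumption observed when the probe code runs at that size.
    Returns a dict mapping each flagged size to observed / defined.
    """
    flagged = {}
    for size, defined in expected.items():
        observed = measure(size)
        # Treat anything beyond the tolerance band as likely interference.
        if observed > defined * (1 + tolerance):
            flagged[size] = observed / defined
    return flagged
```

Passing `measure` in as a callable keeps the probe itself, such as timing a loop over a buffer of the given size, separate from the decision logic, so the comparison can be exercised with recorded values.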
  • Publication number: 20230168941
    Abstract: A resource control system is described that is configured to control scheduling of executable jobs by compute instances of a service provider system. In one example, the resource control system outputs a deployment user interface to obtain job information. Upon receipt of the job information, the resource control system communicates with a service provider system to obtain logs from compute instances implemented by the service provider system for the respective executable jobs. The resource control system uses data obtained from the logs to estimate utility indicating status of respective executable jobs and an amount of time to complete the executable jobs by respective compute instances. The resource control system then employs a machine-learning module to generate an action to be performed by compute instances for respective executable jobs.
    Type: Application
    Filed: November 30, 2021
    Publication date: June 1, 2023
    Applicant: Adobe Inc.
    Inventors: Subrata Mitra, Sunav Choudhary, Shaddy Garg, Anuj Jitendra Diwan, Piyush Kumar Maurya, Arpit Aggarwal, Prateek Jain
  • Publication number: 20230153195
    Abstract: A method performed by one or more processors that preserves a machine learning model comprises accessing model parameters associated with a machine learning model. The model parameters are determined responsive to training the machine learning model. The method comprises generating a plurality of model parameter sets, where each of the plurality of model parameter sets comprises a separate portion of the set of model parameters. The method comprises determining one or more parity sets comprising values calculated from the plurality of model parameter sets. The method comprises distributing the plurality of model parameter sets and the one or more parity sets among a plurality of computing devices, where each of the plurality of computing devices stores a model parameter set of the plurality of model parameter sets or a parity set of the one or more parity sets. The method comprises accessing, from the plurality of computing devices, a number of sets comprising model parameter sets and at least one parity set.
    Type: Application
    Filed: November 17, 2021
    Publication date: May 18, 2023
    Inventors: Subrata Mitra, Ayush Chauhan, Sunav Choudhary
  • Publication number: 20230153448
    Abstract: Methods and systems are provided for facilitating generation of representative datasets. In embodiments, an original dataset for which a data representation is to be generated is obtained. A data generation model is trained to generate a representative dataset that represents the original dataset. The data generation model is trained based on the original dataset, a set of privacy settings indicating privacy of data associated with the original dataset, and a set of value settings indicating value of data associated with the original dataset. A representative dataset that represents the original dataset is generated via the trained data generation model. The generated representative dataset maintains a set of desired statistical properties of the original dataset, maintains an extent of data privacy of the set of original data, and maintains an extent of data value of the set of original data.
    Type: Application
    Filed: November 12, 2021
    Publication date: May 18, 2023
    Inventors: Subrata Mitra, Sunny Dhamnani, Piyush Bagad, Raunak Gautam, Haresh Khanna, Atanu R. Sinha
  • Publication number: 20230143721
    Abstract: Embodiments of the technology described herein describe a machine classifier capable of continually learning new classes through a continual few-shot learning approach. A natural language processing (NLP) machine classifier may initially be trained to identify a plurality of other classes through a conventional training process. In order to learn a new class, natural-language training data for a new class is generated. The training data for the new class may be few-shot training data. The training also uses synthetic training data that represents each of the plurality of other classes. The synthetic training data may be generated through a model inversion of the original classifier. The synthetic training data and the natural-language training data are used to retrain the NLP classifier to identify text in the plurality of other classes and the new class.
    Type: Application
    Filed: November 11, 2021
    Publication date: May 11, 2023
    Inventors: Sungchul Kim, Subrata Mitra, Ruiyi Zhang, Rui Wang, Handong Zhao, Tong Yu