Patents by Inventor Subrata Mitra
Subrata Mitra has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12164517
Abstract: To retrieve information derived from a plurality of separately stored datasets, join structures are identified within the plurality of separately stored datasets. Join structures can include datasets joined by a central dataset, datasets joined by a single key, and datasets joined across a plurality of keys. Each of the join structures corresponds to a query processing schema that defines a sampling technique. When a join query is received as a SQL query, the join query identifies a portion of the plurality of separately stored datasets, from which a join structure is selected and a corresponding query processing schema is identified. The join query is reconstructed to form a reconstructed join query that comprises query processing schema instructions to derive the requested information using the sampling technique defined by the identified query processing schema.
Type: Grant
Filed: January 3, 2023
Date of Patent: December 10, 2024
Assignee: Adobe Inc.
Inventors: Vibhor Porwal, Yeuk-Yin Chan, Vidit Bhatia, Subrata Mitra, Shaddy Garg, Sergey N Kazarin, Sameeksha Arora, Himanshu Panday, Gautam Pratap Kowshik, Fan Du, Anup Bandigadi Rao, Anil Malkani
-
Publication number: 20240394407
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that implements a secure distributed data collaboration architecture for generating synthetic datasets. For example, the disclosed system sends a request to perform a data collaboration with a first dataset of a first local node and a second dataset of a second local node. The disclosed system receives intermediate feature maps from the local nodes that correspond with the datasets and generates a combined feature map. Further, the disclosed system generates a synthetic dataset from the combined feature map by utilizing a central generative model. Moreover, the synthetic dataset generated by the disclosed system is statistically representative of the first dataset and the second dataset.
Type: Application
Filed: May 26, 2023
Publication date: November 28, 2024
Inventors: Sunav Choudhary, Subrata Mitra, Sanjay Sukumaran, Priyanshu Yadav, Munish Gupta, Jashn Arora, Iftikhar Ahamath Burhanuddin, Gautam Choudhary, Atharv Tyagi
-
Publication number: 20240386002
Abstract: A dataset comprising tables is received. Embeddings are generated for column titles of a table. Based on the embeddings, similar tables are clustered. The tables are organized into smaller clusters based on statistical similarities. Similarity scores are calculated for tables within the same cluster. A relatedness graph is created based on the similarity scores; similar tables are represented by nodes connected by edges. If the similarity score for a pair of tables exceeds a threshold, a table is deleted.
Type: Application
Filed: May 18, 2023
Publication date: November 21, 2024
Inventors: Raunak Shah, Koyel MUKHERJEE, Subrata MITRA, Dhruv JOSHI, Sai KARNAM, Shivam Pravin BHOSALE
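The graph-and-threshold step described in this abstract can be illustrated with a minimal sketch. It is not the method of the application: it swaps the embedding-based similarity for plain Jaccard overlap of column titles, skips the two clustering stages and compares every pair of tables, and all names (`jaccard`, `relatedness_graph`, `prune_near_duplicates`, both thresholds) are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two column-title lists (a stand-in for embedding similarity)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def relatedness_graph(tables, edge_threshold):
    """Build an undirected graph: tables are nodes, edges carry similarity scores."""
    graph = {name: {} for name in tables}
    for x, y in combinations(tables, 2):
        score = jaccard(tables[x], tables[y])
        if score >= edge_threshold:
            graph[x][y] = score
            graph[y][x] = score
    return graph


def prune_near_duplicates(tables, graph, delete_threshold):
    """For each pair scoring above delete_threshold, keep the first table and drop the other."""
    kept = dict(tables)
    for x in list(tables):
        if x not in kept:
            continue  # x was already deleted as a duplicate of an earlier table
        for y, score in graph[x].items():
            if y in kept and score > delete_threshold:
                del kept[y]
    return kept


# Hypothetical example data: table name -> column titles.
tables = {
    "orders": ["id", "customer", "total"],
    "orders_copy": ["id", "customer", "total"],
    "users": ["id", "name", "email"],
}
graph = relatedness_graph(tables, edge_threshold=0.5)
kept = prune_near_duplicates(tables, graph, delete_threshold=0.9)
```

Here `orders_copy` is dropped because its similarity to `orders` exceeds the deletion threshold, while `users` shares too few column titles to be connected to either table.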
-
Patent number: 12079217
Abstract: Some techniques described herein relate to utilizing a machine-learning (ML) model to select respective samples for queries of a query sequence. In one example, a method includes receiving a query in a query sequence, where the query is directed toward a dataset. Samples are available as down-sampled versions of the dataset. The method further includes applying an agent to select, for the query, a sample from among the samples of the dataset. The agent includes an ML model trained, such as via intent-based reinforcement learning, to select respective samples for queries. The query is then executed against the sample to output a response.
Type: Grant
Filed: May 11, 2022
Date of Patent: September 3, 2024
Assignee: Adobe Inc.
Inventors: Subrata Mitra, Yash Gadhia, Tong Yu, Shaddy Garg, Nikhil Sheoran, Arjun Kashettiwar, Anjali Yadav
-
Publication number: 20240273296
Abstract: Embodiments of the technology described herein describe a machine classifier capable of continually learning new classes through a continual few-shot learning approach. A natural language processing (NLP) machine classifier may initially be trained to identify a plurality of other classes through a conventional training process. In order to learn a new class, natural-language training data for a new class is generated. The training data for the new class may be few-shot training data. The training also uses synthetic training data that represents each of the plurality of other classes. The synthetic training data may be generated through a model inversion of the original classifier. The synthetic training data and the natural-language training data are used to retrain the NLP classifier to identify text in the plurality of other classes and the new class using.
Type: Application
Filed: April 3, 2024
Publication date: August 15, 2024
Inventors: Sungchul KIM, Subrata MITRA, Ruiyi Zhang, Rui Wang, Handong ZHAO, Tong YU
-
Publication number: 20240220502
Abstract: To retrieve information derived from a plurality of separately stored datasets, join structures are identified within the plurality of separately stored datasets. Join structures can include datasets joined by a central dataset, datasets joined by a single key, and datasets joined across a plurality of keys. Each of the join structures corresponds to a query processing schema that defines a sampling technique. When a join query is received as a SQL query, the join query identifies a portion of the plurality of separately stored datasets, from which a join structure is selected and a corresponding query processing schema is identified. The join query is reconstructed to form a reconstructed join query that comprises query processing schema instructions to derive the requested information using the sampling technique defined by the identified query processing schema.
Type: Application
Filed: January 3, 2023
Publication date: July 4, 2024
Inventors: Vibhor PORWAL, Yeuk-Yin CHAN, Vidit BHATIA, Subrata MITRA, Shaddy GARG, Sergey N. KAZARIN, Sameeksha ARORA, Himanshu PANDAY, Gautam Pratap KOWSHIK, Fan DU, Anup Bandigadi RAO, Anil MALKANI
-
Patent number: 12014217
Abstract: A resource control system is described that is configured to control scheduling of executable jobs by compute instances of a service provider system. In one example, the resource control system outputs a deployment user interface to obtain job information. Upon receipt of the job information, the resource control system communicates with a service provider system to obtain logs from compute instances implemented by the service provider system for the respective executable jobs. The resource control system uses data obtained from the logs to estimate utility indicating status of respective executable jobs and an amount of time to complete the executable jobs by respective compute instances. The resource control system then employs a machine-learning module to generate an action to be performed by compute instances for respective executable jobs.
Type: Grant
Filed: November 30, 2021
Date of Patent: June 18, 2024
Assignee: Adobe Inc.
Inventors: Subrata Mitra, Sunav Choudhary, Shaddy Garg, Anuj Jitendra Diwan, Piyush Kumar Maurya, Arpit Aggarwal, Prateek Jain
-
Patent number: 11995403
Abstract: Embodiments of the technology described herein describe a machine classifier capable of continually learning new classes through a continual few-shot learning approach. A natural language processing (NLP) machine classifier may initially be trained to identify a plurality of other classes through a conventional training process. In order to learn a new class, natural-language training data for a new class is generated. The training data for the new class may be few-shot training data. The training also uses synthetic training data that represents each of the plurality of other classes. The synthetic training data may be generated through a model inversion of the original classifier. The synthetic training data and the natural-language training data are used to retrain the NLP classifier to identify text in the plurality of other classes and the new class using.
Type: Grant
Filed: November 11, 2021
Date of Patent: May 28, 2024
Assignee: ADOBE INC.
Inventors: Sungchul Kim, Subrata Mitra, Ruiyi Zhang, Rui Wang, Handong Zhao, Tong Yu
-
Patent number: 11989647
Abstract: The technology described herein is directed to a self-learning application scheduler for improved scheduling distribution of resource requests, e.g., job and service scheduling requests or tasks derived therefrom, initiated by applications on a shared compute infrastructure. More specifically, the self-learning application scheduler includes a reinforcement learning agent that iteratively learns a scheduling policy to improve scheduling distribution of the resource requests on the shared compute infrastructure. In some implementations, the reinforcement learning agent learns inherent characteristics and patterns of the resource requests initiated by the applications and orchestrates placement or scheduling of the resource requests on the shared compute infrastructure to minimize resource contention and thereby improve application performance for better overall user-experience.
Type: Grant
Filed: February 8, 2019
Date of Patent: May 21, 2024
Assignee: Adobe Inc.
Inventors: Subrata Mitra, Nikhil Sheoran, Ramanuja Narasimha Simha, Shanka Subhra Mondal, Neeraj Jagdish Dhake, Ravinder Nehra
-
Patent number: 11947986
Abstract: Embodiments relate to tenant-side detection and mitigation of performance degradation resulting from interference generated by a noisy neighbor in a distributed computing environment. A first machine-learning model such as a k-means nearest neighbor classifier is operated by a tenant to detect an anomaly with a computer system emulator resulting from a co-located noisy neighbor. A second machine-learning model such as a multi-class classifier is operated by the tenant to identify a contended resource associated with the anomaly. A corresponding trigger signal is generated and provided to trigger various mitigation responses, including an application/framework-specific mitigation strategy (e.g., triggered approximations in application/framework performance, best-efforts paths, run-time changes, etc.), load-balancing, scaling out, updates to a scheduler to avoid impacted nodes, and the like. In this manner, a tenant can detect, classify, and mitigate performance degradation resulting from a noisy neighbor.
Type: Grant
Filed: June 23, 2021
Date of Patent: April 2, 2024
Assignee: Adobe Inc.
Inventors: Subrata Mitra, Sopan Khosla, Sanket Vaibhav Mehta, Mekala Rajasekhar Reddy, Aashaka Dhaval Shah
-
Patent number: 11915054
Abstract: Techniques are provided for scheduling multiple jobs on one or more cloud computing instances, which provide the ability to select a job for execution from among a plurality of jobs, and to further select a designated instance from among a plurality of cloud computing instances for executing the selected job. The job and the designated instance are each selected based on a probability distribution that a cost of executing the job on the designated instance does not exceed the budget. The probability distribution is based on several factors including a cost of prior executions of other jobs on the designated instance and a utility function that represents a value associated with a progress of each job. By scheduling select jobs on discounted cloud computing instances, the aggregate utility of the jobs can be maximized or otherwise improved for a given budget.
Type: Grant
Filed: May 19, 2021
Date of Patent: February 27, 2024
Assignee: Adobe Inc.
Inventors: Subrata Mitra, Sunav Choudhary, Sheng Yang, Kanak Vivek Mahadik, Samir Khuller
-
Patent number: 11847496
Abstract: A digital environment includes multiple computing nodes and a scheduling system that assigns workloads to computing nodes. The scheduling system includes an equivalence-class-based resource usage prediction system that receives a workload request and predicts an equivalence class for that workload request based on resource usage over time by the workload request or metadata associated with the workload request. The scheduling system also includes a workload assignment system that assigns the workload request to one or more of the computing nodes based on the predicted equivalence class. The number of equivalence classes is small relative to the total number of workloads that are scheduled (as an example, 10 to 15 equivalence classes for a total number of workloads in the tens or hundreds of thousands).
Type: Grant
Filed: October 28, 2020
Date of Patent: December 19, 2023
Assignee: Adobe Inc.
Inventors: Nikhil Sheoran, Subrata Mitra
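The predict-then-assign flow in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the class prediction is a nearest-profile lookup over a short usage trace (the patent does not say this is how classes are predicted), capacity is a single number per node, and every name (`nearest_class`, `assign`, `class_profiles`, `free_capacity`) is hypothetical.

```python
def nearest_class(usage, class_profiles):
    """Return the label whose usage profile is closest (squared Euclidean) to the trace."""
    def dist(profile):
        return sum((u - p) ** 2 for u, p in zip(usage, profile))
    return min(class_profiles, key=lambda label: dist(class_profiles[label]))


def assign(usage, class_profiles, free_capacity):
    """Predict a class, then place the workload on the node with the most free capacity
    that can still absorb the peak demand of that class. Returns (label, node or None)."""
    label = nearest_class(usage, class_profiles)
    peak = max(class_profiles[label])
    candidates = {n: c for n, c in free_capacity.items() if c >= peak}
    if not candidates:
        return label, None  # no node can host this class right now
    node = max(candidates, key=candidates.get)
    free_capacity[node] -= peak  # reserve capacity for the class's peak usage
    return label, node


# Hypothetical profiles: resource usage over three time steps per equivalence class.
class_profiles = {"bursty": [0.1, 0.9, 0.1], "steady": [0.4, 0.4, 0.4]}
free_capacity = {"node-a": 0.5, "node-b": 1.0}

first = assign([0.2, 0.8, 0.2], class_profiles, free_capacity)
second = assign([0.4, 0.4, 0.4], class_profiles, free_capacity)
```

Because placement depends only on the predicted class, the scheduler needs one profile per class rather than one per workload, which is the point of keeping the number of classes small.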
-
Patent number: 11829239
Abstract: A method performed by one or more processors that preserves a machine learning model comprises accessing model parameters associated with a machine learning model. The model parameters are determined responsive to training the machine learning model. The method comprises generating a plurality of model parameter sets, where each of the plurality of model parameter sets comprises a separate portion of the set of model parameters. The method comprises determining one or more parity sets comprising values calculated from the plurality of model parameter sets. The method comprises distributing the plurality of model parameter sets and the one or more parity sets among a plurality of computing devices, where each of the plurality of computing devices stores a model parameter set of the plurality of model parameter sets or a parity set of the one or more parity sets. The method comprises accessing, from the plurality of computing devices, a number of sets comprising model parameter sets and at least one parity set.
Type: Grant
Filed: November 17, 2021
Date of Patent: November 28, 2023
Assignee: Adobe Inc.
Inventors: Subrata Mitra, Ayush Chauhan, Sunav Choudhary
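The parity idea in this abstract can be illustrated with a minimal sketch that assumes a single XOR parity set, which tolerates the loss of exactly one parameter set. The patent does not state which parity code is used, and the function names below (`xor_bytes`, `split_with_parity`, `recover`) are hypothetical.

```python
import struct
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(params, k):
    """Pack float parameters to bytes, split into k equal shards, add one XOR parity shard."""
    raw = struct.pack(f"<{len(params)}d", *params)
    shard_len = -(-len(raw) // k)           # ceiling division
    raw = raw.ljust(shard_len * k, b"\0")   # zero-pad so all shards have equal length
    shards = [raw[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(xor_bytes, shards)
    return shards, parity


def recover(shards, parity, n_params):
    """Rebuild the parameters when exactly one entry of `shards` is None (a lost device)."""
    missing = shards.index(None)
    present = [s for s in shards if s is not None]
    rebuilt = reduce(xor_bytes, present, parity)  # parity XOR surviving shards = lost shard
    full = shards[:missing] + [rebuilt] + shards[missing + 1:]
    raw = b"".join(full)[: n_params * 8]          # drop the zero padding
    return list(struct.unpack(f"<{n_params}d", raw))


# Hypothetical parameters distributed over three devices plus one parity device.
params = [0.5, -1.25, 3.0, 42.0, 7.75]
shards, parity = split_with_parity(params, 3)
damaged = list(shards)
damaged[1] = None  # simulate losing the device that held the second shard
restored = recover(damaged, parity, len(params))
```

The recovery is bit-exact because the XOR is applied to the packed bytes rather than to the floating-point values. Tolerating more than one lost set would need additional parity sets and a stronger code than this sketch uses.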
-
Publication number: 20230367772
Abstract: Some techniques described herein relate to utilizing a machine-learning (ML) model to select respective samples for queries of a query sequence. In one example, a method includes receiving a query in a query sequence, where the query is directed toward a dataset. Samples are available as down-sampled versions of the dataset. The method further includes applying an agent to select, for the query, a sample from among the samples of the dataset. The agent includes an ML model trained, such as via intent-based reinforcement learning, to select respective samples for queries. The query is then executed against the sample to output a response.
Type: Application
Filed: May 11, 2022
Publication date: November 16, 2023
Inventors: Subrata Mitra, Yash Gadhia, Tong Yu, Shaddy Garg, Nikhil Sheoran, Arjun Kashettiwar, Anjali Yadav
-
Publication number: 20230262237
Abstract: Systems and methods for image processing are described. The systems and methods include receiving a plurality of frames of a video at an edge device, wherein the video depicts an action that spans the plurality of frames, compressing, using an encoder network, each of the plurality of frames to obtain compressed frame features, wherein the compressed frame features include fewer data bits than the plurality of frames of the video, classifying, using a classification network, the compressed frame features at the edge device to obtain action classification information corresponding to the action in the video, and transmitting the action classification information from the edge device to a central server.
Type: Application
Filed: February 15, 2022
Publication date: August 17, 2023
Inventors: Subrata Mitra, Aniruddha Mahapatra, Kuldeep Sharad Kulkarni, Abhishek Yadav, Abhijith Kuruba, Manoj Kilaru
-
Publication number: 20230222005
Abstract: Shared resource interference detection techniques are described. In an example, a resource detection module supports techniques to quantify levels of interference through use of working set sizes. The resource detection module selects working set sizes. The resource detection module then initiates execution of code that utilizes the shared resource based on the first working set size. The resource detection module detects a resource consumption amount based on the execution of the code. The resource detection module then determines whether the detected resource consumption amount corresponds to the defined resource consumption amount for the selected working set size.
Type: Application
Filed: January 11, 2022
Publication date: July 13, 2023
Applicant: Adobe Inc.
Inventors: Subrata Mitra, Pradeep Dogga
-
Publication number: 20230168941
Abstract: A resource control system is described that is configured to control scheduling of executable jobs by compute instances of a service provider system. In one example, the resource control system outputs a deployment user interface to obtain job information. Upon receipt of the job information, the resource control system communicates with a service provider system to obtain logs from compute instances implemented by the service provider system for the respective executable jobs. The resource control system uses data obtained from the logs to estimate utility indicating status of respective executable jobs and an amount of time to complete the executable jobs by respective compute instances. The resource control system then employs a machine-learning module to generate an action to be performed by compute instances for respective executable jobs.
Type: Application
Filed: November 30, 2021
Publication date: June 1, 2023
Applicant: Adobe Inc.
Inventors: Subrata Mitra, Sunav Choudhary, Shaddy Garg, Anuj Jitendra Diwan, Piyush Kumar Maurya, Arpit Aggarwal, Prateek Jain
-
Publication number: 20230153195
Abstract: A method performed by one or more processors that preserves a machine learning model comprises accessing model parameters associated with a machine learning model. The model parameters are determined responsive to training the machine learning model. The method comprises generating a plurality of model parameter sets, where each of the plurality of model parameter sets comprises a separate portion of the set of model parameters. The method comprises determining one or more parity sets comprising values calculated from the plurality of model parameter sets. The method comprises distributing the plurality of model parameter sets and the one or more parity sets among a plurality of computing devices, where each of the plurality of computing devices stores a model parameter set of the plurality of model parameter sets or a parity set of the one or more parity sets. The method comprises accessing, from the plurality of computing devices, a number of sets comprising model parameter sets and at least one parity set.
Type: Application
Filed: November 17, 2021
Publication date: May 18, 2023
Inventors: SUBRATA MITRA, AYUSH CHAUHAN, SUNAV CHOUDHARY
-
Publication number: 20230153448
Abstract: Methods and systems are provided for facilitating generation of representative datasets. In embodiments, an original dataset for which a data representation is to be generated is obtained. A data generation model is trained to generate a representative dataset that represents the original dataset. The data generation model is trained based on the original dataset, a set of privacy settings indicating privacy of data associated with the original dataset, and a set of value settings indicating value of data associated with the original dataset. A representative dataset that represents the original dataset is generated via the trained data generation model. The generated representative dataset maintains a set of desired statistical properties of the original dataset, maintains an extent of data privacy of the set of original data, and maintains an extent of data value of the set of original data.
Type: Application
Filed: November 12, 2021
Publication date: May 18, 2023
Inventors: Subrata Mitra, Sunny Dhamnani, Piyush Bagad, Raunak Gautam, Haresh Khanna, Atanu R. Sinha
-
Publication number: 20230143721
Abstract: Embodiments of the technology described herein describe a machine classifier capable of continually learning new classes through a continual few-shot learning approach. A natural language processing (NLP) machine classifier may initially be trained to identify a plurality of other classes through a conventional training process. In order to learn a new class, natural-language training data for a new class is generated. The training data for the new class may be few-shot training data. The training also uses synthetic training data that represents each of the plurality of other classes. The synthetic training data may be generated through a model inversion of the original classifier. The synthetic training data and the natural-language training data are used to retrain the NLP classifier to identify text in the plurality of other classes and the new class using.
Type: Application
Filed: November 11, 2021
Publication date: May 11, 2023
Inventors: Sungchul Kim, Subrata Mitra, Ruiyi Zhang, Rui Wang, Handong Zhao, Tong Yu