Patents by Inventor Jayaram Kallapalayam Radhakrishnan
Jayaram Kallapalayam Radhakrishnan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11586475
Abstract: One embodiment provides a method, including: receiving at least one deep learning job for scheduling and running on a distributed system comprising a plurality of nodes; receiving a batch size range indicating a minimum batch size and a maximum batch size that can be utilized for running the at least one deep learning job; determining a plurality of runtime estimations for running the at least one deep learning job; creating a list of optimal combinations of (i) batch sizes and (ii) numbers of the plurality of nodes for running both (a) the at least one deep learning job and (b) current deep learning jobs; and scheduling the at least one deep-learning job at the distributed system, responsive to identifying, by utilizing the list, that the distributed system has necessary processing resources for running both (iii) the at least one deep learning job and (iv) the current deep learning jobs.
Type: Grant
Filed: February 28, 2020
Date of Patent: February 21, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Saurav Basu, Vaibhav Saxena, Yogish Sabharwal, Ashish Verma, Jayaram Kallapalayam Radhakrishnan
-
Publication number: 20220374762
Abstract: Techniques for distributed federated learning leverage a multi-layered defense strategy to provide for reduced information leakage. In lieu of aggregating model updates centrally, an aggregation function is decentralized into multiple independent and functionally-equivalent execution entities, each running within its own trusted execution environment (TEE). The TEEs enable confidential and remote-attestable federated aggregation. Preferably, each aggregator entity runs within an encrypted virtual machine that supports runtime in-memory encryption. Each party remotely authenticates the TEE before participating in the training. By using multiple decentralized aggregators, parties are enabled to partition their respective model updates at model-parameter granularity, and can map single weights to a specific aggregator entity. Parties also can dynamically shuffle fragmentary model updates at each training iteration to further obfuscate the information dispatched to each aggregator execution entity.
Type: Application
Filed: May 18, 2021
Publication date: November 24, 2022
Applicant: International Business Machines Corporation
Inventors: Jayaram Kallapalayam Radhakrishnan, Ashish Verma, Zhongshu Gu, Enriquillo Valdez, Pau-Chen Cheng, Hani Talal Jamjoom
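The partition-and-shuffle idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the shared seed, the function names, and the plain averaging are assumptions made for illustration, and the TEE and remote-attestation steps are left out entirely.

```python
import random

def assignment(num_params, num_aggregators, shared_seed, round_no):
    """Map each parameter index to an aggregator, reshuffled every round.

    Assumption: all parties share `shared_seed`, so they derive the same map.
    """
    rng = random.Random(f"{shared_seed}:{round_no}")
    order = list(range(num_params))
    rng.shuffle(order)
    return {idx: pos % num_aggregators for pos, idx in enumerate(order)}

def split_update(update, assign, num_aggregators):
    """Partition one party's update into per-aggregator fragments."""
    fragments = [dict() for _ in range(num_aggregators)]
    for idx, value in enumerate(update):
        fragments[assign[idx]][idx] = value
    return fragments

def aggregate(fragments):
    """One aggregator: average the fragments it received, per parameter."""
    totals = {}
    for fragment in fragments:
        for idx, value in fragment.items():
            totals[idx] = totals.get(idx, 0.0) + value
    return {idx: total / len(fragments) for idx, total in totals.items()}

def federated_round(updates, num_aggregators, shared_seed, round_no):
    """Run one round and reassemble the averaged update."""
    num_params = len(updates[0])
    assign = assignment(num_params, num_aggregators, shared_seed, round_no)
    per_party = [split_update(u, assign, num_aggregators) for u in updates]
    merged = {}
    for a in range(num_aggregators):
        merged.update(aggregate([party[a] for party in per_party]))
    return [merged[i] for i in range(num_params)]
```

In this sketch no single aggregator sees a party's full update, and the parameter-to-aggregator map changes with `round_no`.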
-
Publication number: 20220374763
Abstract: Techniques for distributed federated learning leverage a multi-layered defense strategy to provide for reduced information leakage. In lieu of aggregating model updates centrally, an aggregation function is decentralized into multiple independent and functionally-equivalent execution entities, each running within its own trusted execution environment (TEE). The TEEs enable confidential and remote-attestable federated aggregation. Preferably, each aggregator entity runs within an encrypted virtual machine that supports runtime in-memory encryption. Each party remotely authenticates the TEE before participating in the training. By using multiple decentralized aggregators, parties are enabled to partition their respective model updates at model-parameter granularity, and can map single weights to a specific aggregator entity. Parties also can dynamically shuffle fragmentary model updates at each training iteration to further obfuscate the information dispatched to each aggregator execution entity.
Type: Application
Filed: May 18, 2021
Publication date: November 24, 2022
Applicant: International Business Machines Corporation
Inventors: Zhongshu Gu, Jayaram Kallapalayam Radhakrishnan, Ashish Verma, Enriquillo Valdez, Pau-Chen Cheng, Hani Talal Jamjoom, Kevin Eykholt
-
Patent number: 11269728
Abstract: A lifecycle management method, system, and computer program product include coordinating hardware, platform and application-level health checks for framework-independent and application-specific monitoring, failure detection, and recovery, coordinating the hardware, the platform, and the application-level health check by state-specific aggregation of distributed atomic status events, and creating a recovery policy based on the state-specific aggregation of the distributed atomic status events.
Type: Grant
Filed: March 20, 2019
Date of Patent: March 8, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Jayaram Kallapalayam Radhakrishnan, Vinod Muthusamy, Vatche Isahagian, Scott Boag, Benjamin Herta, Atin Sood
-
Patent number: 11263052
Abstract: Methods, systems, and computer program products for determining optimal compute resources for distributed batch based optimization applications are provided herein. A method includes obtaining a size of an input dataset, a size of a model, and a set of batch sizes corresponding to a job to be processed using a distributed computing system; computing, based at least in part on the set of batch sizes, one or more node counts corresponding to a number of nodes that can be used for processing said job; estimating, for each given one of the node counts, an execution time to process the job based on an average computation time for a batch of said input dataset and an average communication time for said batch of said input dataset; and selecting, based at least in part on said estimating, at least one of said node counts for processing the job.
Type: Grant
Filed: July 29, 2019
Date of Patent: March 1, 2022
Assignee: International Business Machines Corporation
Inventors: Vaibhav Saxena, Saurav Basu, Jayaram Kallapalayam Radhakrishnan, Yogish Sabharwal, Ashish Verma
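As a rough illustration of the estimate-then-select step in this abstract, the sketch below uses a simple cost model that is an assumption, not the patent's formula: each step processes one batch per node and then pays one communication round. The function names and parameters are hypothetical, and the derivation of candidate node counts from the batch sizes is not modeled.

```python
import math

def estimate_time(dataset_size, batch_size, nodes, compute_s, comm_s):
    """Estimated seconds for one pass over the data on `nodes` workers.

    Assumed model: steps * (average compute time + average communication
    time), with no communication cost on a single node.
    """
    steps = math.ceil(dataset_size / (batch_size * nodes))
    comm = comm_s if nodes > 1 else 0.0
    return steps * (compute_s + comm)

def pick_node_count(dataset_size, batch_size, node_counts, compute_s, comm_s):
    """Return the candidate node count with the lowest estimated time."""
    return min(
        node_counts,
        key=lambda n: estimate_time(dataset_size, batch_size, n, compute_s, comm_s),
    )
```

With cheap communication the largest node count wins; as `comm_s` grows, the single-node estimate eventually becomes the minimum.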
-
Patent number: 11196547
Abstract: A lifecycle management method, system, and computer program product include establishing a public key infrastructure (PKI) for end-to-end encryption of control plane and data plane communications by providing encryption between arbitrary components for applicant execution where an interaction pattern is isolated, secure, and a multi-tenant environment.
Type: Grant
Filed: March 20, 2019
Date of Patent: December 7, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Jayaram Kallapalayam Radhakrishnan, Vinod Muthusamy, Vatche Isahagian, Scott Boag, Benjamin Herta, Atin Sood
-
Publication number: 20210271520
Abstract: One embodiment provides a method, including: receiving at least one deep learning job for scheduling and running on a distributed system comprising a plurality of nodes; receiving a batch size range indicating a minimum batch size and a maximum batch size that can be utilized for running the at least one deep learning job; determining a plurality of runtime estimations for running the at least one deep learning job; creating a list of optimal combinations of (i) batch sizes and (ii) numbers of the plurality of nodes for running both (a) the at least one deep learning job and (b) current deep learning jobs; and scheduling the at least one deep-learning job at the distributed system, responsive to identifying, by utilizing the list, that the distributed system has necessary processing resources for running both (iii) the at least one deep learning job and (iv) the current deep learning jobs.
Type: Application
Filed: February 28, 2020
Publication date: September 2, 2021
Inventors: Saurav Basu, Vaibhav Saxena, Yogish Sabharwal, Ashish Verma, Jayaram Kallapalayam Radhakrishnan
-
Publication number: 20210216902
Abstract: Techniques regarding determining hyperparameters for a differentially private federated learning process are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a hyperparameter advisor component that determines a hyperparameter for a model of a differentially private federated learning process based on a defined numeric relationship between a privacy budget, a learning rate schedule, and a batch size.
Type: Application
Filed: January 9, 2020
Publication date: July 15, 2021
Inventors: Colin Sutcher-Shepard, Ashish Verma, Jayaram Kallapalayam Radhakrishnan, Gegi Thomas
-
Publication number: 20210150037
Abstract: Embodiments relate to training a machine learning model based on an iterative algorithm in a distributed, federated, private, and secure manner. Participating entities are registered in a collaborative relationship. The registered participating entities are arranged in a topology and a topological communication direction is established. Each registered participating entity receives a public additive homomorphic encryption (AHE) key and local machine learning model weights are encrypted with the received public key. The encrypted local machine learning model weights are selectively aggregated and distributed to one or more participating entities in the topology responsive to the topological communication direction. The aggregated sum of the encrypted local machine learning model weights is subjected to decryption with a corresponding private AHE key. The decrypted aggregated sum of the encrypted local machine learning model weights is shared with the registered participating entities.
Type: Application
Filed: November 15, 2019
Publication date: May 20, 2021
Applicant: International Business Machines Corporation
Inventors: Jayaram Kallapalayam Radhakrishnan, Gegi Thomas, Ashish Verma
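To make the additive homomorphic aggregation in this abstract concrete, here is a toy sketch. The abstract does not name an AHE scheme; Paillier is used here purely as an example of one, with deliberately tiny primes that are not secure. The ring-style pass, the fixed-point `SCALE`, and all function names are assumptions, and only non-negative weights are handled.

```python
import math
import random

# Toy Paillier key material from two small primes. Illustration only, NOT secure.
P, Q = 10007, 10009
N = P * Q                  # public key (with generator g = N + 1)
N2 = N * N
LAM = (P - 1) * (Q - 1)    # private key
MU = pow(LAM, -1, N)       # modular inverse (Python 3.8+)
SCALE = 1000               # fixed-point scale for non-negative weights

def encrypt(weight):
    """Encrypt one weight with the public key."""
    m = round(weight * SCALE)
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    # With g = N + 1, g**m mod N2 equals 1 + m*N.
    return ((1 + m * N) * pow(r, N, N2)) % N2

def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N2

def decrypt(ciphertext):
    """Decrypt with the private key and undo the fixed-point scaling."""
    m = ((pow(ciphertext, LAM, N2) - 1) // N) * MU % N
    return m / SCALE

def ring_aggregate(local_weights):
    """Pass a running ciphertext from party to party, then decrypt the sum."""
    running = encrypt(local_weights[0])
    for weight in local_weights[1:]:
        running = add_encrypted(running, encrypt(weight))
    return decrypt(running)
```

Only the holder of the private key sees the decrypted sum; each party along the ring handles ciphertexts only.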
-
Publication number: 20210034374
Abstract: Methods, systems, and computer program products for determining optimal compute resources for distributed batch based optimization applications are provided herein. A method includes obtaining a size of an input dataset, a size of a model, and a set of batch sizes corresponding to a job to be processed using a distributed computing system; computing, based at least in part on the set of batch sizes, one or more node counts corresponding to a number of nodes that can be used for processing said job; estimating, for each given one of the node counts, an execution time to process the job based on an average computation time for a batch of said input dataset and an average communication time for said batch of said input dataset; and selecting, based at least in part on said estimating, at least one of said node counts for processing the job.
Type: Application
Filed: July 29, 2019
Publication date: February 4, 2021
Inventors: Vaibhav Saxena, Saurav Basu, Jayaram Kallapalayam Radhakrishnan, Yogish Sabharwal, Ashish Verma
-
Publication number: 20200304297
Abstract: A lifecycle management method, system, and computer program product include establishing a public key infrastructure (PKI) for end-to-end encryption of control plane and data plane communications by providing encryption between arbitrary components for applicant execution where an interaction pattern is isolated, secure, and a multi-tenant environment.
Type: Application
Filed: March 20, 2019
Publication date: September 24, 2020
Inventors: Jayaram Kallapalayam Radhakrishnan, Vinod Muthusamy, Vatche Isahagian, Scott Boag, Benjamin Herta, Atin Sood
-
Publication number: 20200301782
Abstract: A lifecycle management method, system, and computer program product include coordinating hardware, platform and application-level health checks for framework-independent and application-specific monitoring, failure detection, and recovery, coordinating the hardware, the platform, and the application-level health check by state-specific aggregation of distributed atomic status events, and creating a recovery policy based on the state-specific aggregation of the distributed atomic status events.
Type: Application
Filed: March 20, 2019
Publication date: September 24, 2020
Inventors: Jayaram Kallapalayam Radhakrishnan, Vinod Muthusamy, Vatche Isahagian, Scott Boag, Benjamin Herta, Atin Sood
-
Patent number: 10419457
Abstract: In response to determining that an event matches a condition of a rule, a given one of a plurality of computing nodes is selected to send the event, based on one or both of an attribute of the event and an identifier of the rule. Information of the event is sent to the given computing node to perform correlation of the event with another event.
Type: Grant
Filed: April 30, 2014
Date of Patent: September 17, 2019
Assignee: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Inventors: Daniel Juergen Gmach, Alvin AuYoung, Robert Block, Jayaram Kallapalayam Radhakrishnan, Suranjan Pramanik, Julian James Stephen, Anurag Singla
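One plausible way to select a node from an event attribute and a rule identifier is to hash them together, so that events that must be correlated land on the same node. The abstract does not say hashing is used; that choice, the function name `select_node`, and the key format are assumptions for this sketch.

```python
import hashlib

def select_node(nodes, rule_id=None, attribute=None):
    """Pick a node deterministically from the rule id and/or event attribute.

    Events sharing the same (rule_id, attribute) pair always map to the same
    node, which lets that node correlate them locally.
    """
    key = f"{rule_id}|{attribute}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]
```

For example, two failed-login events with the same source address and rule would be routed to one node regardless of which collector first matched them.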
-
Patent number: 10228924
Abstract: Examples of techniques for deploying an application on a cloud environment satisfying integrity and geo-fencing constraints are disclosed herein. A computer implemented method may include: receiving a guest application for deployment on a cloud environment; receiving the integrity constraints on the integrity of each of the plurality of hosts where the application is to be deployed; receiving geo-fencing constraints identifying a geographic location where the guest application is to be deployed; determining for which of the plurality of hosts the integrity constraints and the geo-fencing constraints are satisfied; and deploying the guest application on at least one of the plurality of hosts that satisfy the integrity constraints and the geo-fencing constraints.
Type: Grant
Filed: April 19, 2016
Date of Patent: March 12, 2019
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Stefan Berger, Kenneth A. Goldman, Simon J. Kofkin-Hansen, Hui Lei, Vijay K. Naik, Dimitrios Pendarakis, Jayaram Kallapalayam Radhakrishnan, David R. Safford, Shu Tao
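The host-selection step in this abstract amounts to filtering candidate hosts by two kinds of constraints. The sketch below is a simplified illustration: the `Host` fields, the boolean `attested` flag standing in for an integrity check, and the region strings are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Host:
    name: str
    region: str      # hypothetical geographic label for the host
    attested: bool   # hypothetical result of an integrity check

def eligible_hosts(hosts, allowed_regions, require_attested=True):
    """Return hosts that satisfy both the integrity and geo-fencing constraints."""
    return [
        host
        for host in hosts
        if host.region in allowed_regions
        and (host.attested or not require_attested)
    ]
```

A deployer would then place the guest application on at least one host from the returned list, or refuse deployment if the list is empty.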
-
Patent number: 9830677
Abstract: Examples of GPU resource sharing among applications are disclosed. In one example, a method includes receiving a first request from a first application of the plurality of applications for first requested GPU resources, and receiving a second request from a second application of the plurality of applications for second GPU resources. The method also includes, responsive to determining that the first requested GPU resources are available, allocating a first slice of the GPU resources with a first requested amount of resources to the first application and, responsive to determining that the second requested GPU resources are available, allocating a second slice of the GPU resources with a second requested amount of resources to the second application. Further, the method includes enabling the first application and the second application to execute concurrently within the first slice of the GPU and the second slice of the GPU respectively.
Type: Grant
Filed: March 3, 2016
Date of Patent: November 28, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Anshul Gandhi, Hui Lei, Jayaram Kallapalayam Radhakrishnan, Charles O. Schulz, Shu Tao
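The request-then-allocate flow in this abstract can be sketched as simple bookkeeping over a pool of GPU capacity. This is an illustration only: the class name `GpuSlicer`, the abstract "units" of capacity, and the one-slice-per-application rule are assumptions, and no real GPU API is involved.

```python
class GpuSlicer:
    """Track slices of a single GPU's capacity, measured in abstract units."""

    def __init__(self, total_units):
        self.free = total_units
        self.slices = {}  # application name -> units allocated

    def request(self, app, units):
        """Allocate a slice if the requested amount is available."""
        if units <= 0 or units > self.free or app in self.slices:
            return False
        self.free -= units
        self.slices[app] = units
        return True

    def release(self, app):
        """Return an application's slice to the free pool."""
        self.free += self.slices.pop(app, 0)
```

Applications whose requests succeed hold disjoint slices, which is what allows them to run concurrently on the same GPU.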
-
Patent number: 9830678
Abstract: Examples of GPU resource sharing among distributed applications in a distributed computing environment are disclosed. In one example, a method includes receiving a first request from a first distributed application of the plurality of distributed applications for first requested GPU resources. The method may further include receiving a second request from a second distributed application of the plurality of distributed applications for second requested GPU resources. The method may also include receiving a response from each of the plurality of computing nodes indicating an availability of GPU resources for each of the plurality of computing nodes. Additionally, the method may include, responsive to determining that at least one of the first and second requests can be fulfilled by at least one of the plurality of computing nodes, allocating a first set of GPU slices for the first application and allocating a second set of GPU slices for the second application.
Type: Grant
Filed: March 3, 2016
Date of Patent: November 28, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Anshul Gandhi, Hui Lei, Jayaram Kallapalayam Radhakrishnan, Charles O. Schulz, Shu Tao
-
Publication number: 20170300309
Abstract: Examples of techniques for deploying an application on a cloud environment satisfying integrity and geo-fencing constraints are disclosed herein. A computer implemented method may include: receiving a guest application for deployment on a cloud environment; receiving the integrity constraints on the integrity of each of the plurality of hosts where the application is to be deployed; receiving geo-fencing constraints identifying a geographic location where the guest application is to be deployed; determining for which of the plurality of hosts the integrity constraints and the geo-fencing constraints are satisfied; and deploying the guest application on at least one of the plurality of hosts that satisfy the integrity constraints and the geo-fencing constraints.
Type: Application
Filed: April 19, 2016
Publication date: October 19, 2017
Inventors: Stefan Berger, Kenneth A. Goldman, Simon J. Kofkin-Hansen, Hui Lei, Vijay K. Naik, Dimitrios Pendarakis, Jayaram Kallapalayam Radhakrishnan, David R. Safford, Shu Tao
-
Publication number: 20170256018
Abstract: Examples of GPU resource sharing among distributed applications in a distributed computing environment are disclosed. In one example, a method includes receiving a first request from a first distributed application of the plurality of distributed applications for first requested GPU resources. The method may further include receiving a second request from a second distributed application of the plurality of distributed applications for second requested GPU resources. The method may also include receiving a response from each of the plurality of computing nodes indicating an availability of GPU resources for each of the plurality of computing nodes. Additionally, the method may include, responsive to determining that at least one of the first and second requests can be fulfilled by at least one of the plurality of computing nodes, allocating a first set of GPU slices for the first application and allocating a second set of GPU slices for the second application.
Type: Application
Filed: March 3, 2016
Publication date: September 7, 2017
Inventors: Anshul Gandhi, Hui Lei, Jayaram Kallapalayam Radhakrishnan, Charles O. Schulz, Shu Tao
-
Publication number: 20170256017
Abstract: Examples of GPU resource sharing among applications are disclosed. In one example, a method includes receiving a first request from a first application of the plurality of applications for first requested GPU resources, and receiving a second request from a second application of the plurality of applications for second GPU resources. The method also includes, responsive to determining that the first requested GPU resources are available, allocating a first slice of the GPU resources with a first requested amount of resources to the first application and, responsive to determining that the second requested GPU resources are available, allocating a second slice of the GPU resources with a second requested amount of resources to the second application. Further, the method includes enabling the first application and the second application to execute concurrently within the first slice of the GPU and the second slice of the GPU respectively.
Type: Application
Filed: March 3, 2016
Publication date: September 7, 2017
Inventors: Anshul Gandhi, Hui Lei, Jayaram Kallapalayam Radhakrishnan, Charles O. Schulz, Shu Tao
-
Publication number: 20170048261
Abstract: In response to determining that an event matches a condition of a rule, a given one of a plurality of computing nodes is selected to send the event, based on one or both of an attribute of the event and an identifier of the rule. Information of the event is sent to the given computing node to perform correlation of the event with another event.
Type: Application
Filed: April 30, 2014
Publication date: February 16, 2017
Inventors: Daniel Juergen Gmach, Alvin AuYoung, Robert Block, Jayaram Kallapalayam Radhakrishnan, Suranjan Pramanik, Julian James Stephen, Anurag Singla