Patents by Inventor Muthian Sivathanu

Muthian Sivathanu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12190147
    Abstract: The disclosure herein describes platform-level checkpointing for deep learning (DL) jobs. The checkpointing is performed through capturing two kinds of state data: (i) GPU state (device state), and (ii) CPU state (host state). The GPU state includes GPU data (e.g., model parameters, optimizer state, etc.) that is located in the GPU and GPU context (e.g., the default stream in GPU, various handles created by libraries such as DNN, Blas, etc.). Only a fraction of the GPU memory is copied because the checkpointing is done in a domain-aware manner: the “active” memory contains useful data like model parameters. To capture this useful data, memory management is controlled to identify which parts of the memory are active. Also, to restore the destination GPU to the same context/state, a mechanism captures state-changing events on the original GPU and replays them on the destination GPU.
    Type: Grant
    Filed: June 26, 2021
    Date of Patent: January 7, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Muthian Sivathanu, Srinidhi Viswanatha, Dharma Kiritkumar Shukla, Nipun Kwatra, Ramachandran Ramjee, Rimma Vladimirovna Nehme, Pankaj Sharma, Bhalakumaaran Erode Ranganathan, Vaibhav Sharma
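The domain-aware checkpointing described in the abstract above can be sketched in miniature: the allocator tracks which memory regions are "active", a checkpoint copies only those regions plus a log of state-changing events, and a restore replays the log before reloading data. This is a minimal illustrative sketch, not the patented implementation; all class and function names (`DeviceMemory`, `checkpoint`, `restore`) are my own.

```python
class DeviceMemory:
    """Toy GPU memory: the allocator records which regions hold live data."""
    def __init__(self):
        self.regions = {}        # name -> payload
        self.active = set()      # names of regions with useful data (e.g. model params)
        self.event_log = []      # state-changing calls (handle creation, stream setup)

    def alloc(self, name, payload, active=True):
        self.regions[name] = payload
        if active:
            self.active.add(name)

    def record_event(self, call, *args):
        self.event_log.append((call, args))

def checkpoint(mem):
    """Copy only the active regions, plus the event log needed to rebuild context."""
    return {
        "data": {n: mem.regions[n] for n in mem.active},
        "events": list(mem.event_log),
    }

def restore(snapshot):
    """Replay the logged events on the destination, then reload the active data."""
    dest = DeviceMemory()
    for call, args in snapshot["events"]:
        dest.record_event(call, *args)   # stand-in for re-issuing context calls
    for name, payload in snapshot["data"].items():
        dest.alloc(name, payload)
    return dest
```

The key point the sketch preserves is that scratch allocations marked inactive never enter the checkpoint, which is why only a fraction of GPU memory is copied.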
  • Patent number: 12166829
    Abstract: The disclosure herein describes platform-level migration for deep learning training (DLT) jobs from a checkpointed state between a source node and a destination node. The checkpointing is performed through capturing GPU state (e.g., device state) and CPU state (e.g., host state). The GPU state includes GPU data (e.g., model parameters, optimizer state, etc.) that is located in the GPU and GPU context (e.g., the default stream in GPU, various handles created by libraries). Restoring the DLT job on the destination node involves resumption of processing of a destination GPU at the same checkpointed state.
    Type: Grant
    Filed: June 7, 2023
    Date of Patent: December 10, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dharma Kiritkumar Shukla, Muthian Sivathanu, Lu Xun, Rimma Vladimirovna Nehme
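The migration flow above (snapshot on the source node, resume on the destination at the same checkpointed state) can be sketched as follows. This is an illustrative toy, assuming a job whose whole state is its parameters and step counter; `TrainingJob` and `migrate` are hypothetical names, not from the patent.

```python
class TrainingJob:
    """Toy DLT job: state is just parameters and a step counter."""
    def __init__(self, params=0.0):
        self.params = params
        self.step = 0

    def train_steps(self, n):
        for _ in range(n):
            self.params += 0.1   # stand-in for a real parameter update
            self.step += 1

    def snapshot(self):
        return {"params": self.params, "step": self.step}

    @classmethod
    def from_snapshot(cls, snap):
        job = cls(snap["params"])
        job.step = snap["step"]
        return job

def migrate(source_job):
    """Checkpoint on the source node, restore on the destination node."""
    return TrainingJob.from_snapshot(source_job.snapshot())
```

The migrated job continues from exactly the step at which the checkpoint was taken, which is the property the abstract emphasizes.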
  • Publication number: 20230396682
    Abstract: The disclosure herein describes platform-level migration for deep learning training (DLT) jobs from a checkpointed state between a source node and a destination node. The checkpointing is performed through capturing GPU state (e.g., device state) and CPU state (e.g., host state). The GPU state includes GPU data (e.g., model parameters, optimizer state, etc.) that is located in the GPU and GPU context (e.g., the default stream in GPU, various handles created by libraries). Restoring the DLT job on the destination node involves resumption of processing of a destination GPU at the same checkpointed state.
    Type: Application
    Filed: June 7, 2023
    Publication date: December 7, 2023
    Inventors: Dharma Kiritkumar SHUKLA, Muthian SIVATHANU, Lu XUN, Rimma Vladimirovna NEHME
  • Patent number: 11722573
    Abstract: The disclosure herein describes platform-level migration for deep learning training (DLT) jobs from a checkpointed state between a source node and a destination node. The checkpointing is performed through capturing GPU state (e.g., device state) and CPU state (e.g., host state). The GPU state includes GPU data (e.g., model parameters, optimizer state, etc.) that is located in the GPU and GPU context (e.g., the default stream in GPU, various handles created by libraries). Restoring the DLT job on the destination node involves resumption of processing of a destination GPU at the same checkpointed state.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: August 8, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dharma Kiritkumar Shukla, Muthian Sivathanu, Lu Xun, Rimma Vladimirovna Nehme
  • Publication number: 20230236837
    Abstract: The disclosure herein describes elastically managing the execution of workers of multi-worker workloads on accelerator devices. A first worker of a workload is executed on an accelerator device during a first time interval. A first context switch point is identified when the first worker is in a first worker state. At the identified context switch point, a first memory state of the first worker is stored in a host memory and the accelerator device is configured with a second memory state of a second worker of the workload. The second worker is executed during a second time interval and a second context switch point is identified at the end of the second time interval when the second worker is in a state that is equivalent to the first worker state. During the intervals, collective communication operations between the workers are accumulated and, at the second context switch point, the accumulated operations are performed.
    Type: Application
    Filed: June 30, 2022
    Publication date: July 27, 2023
    Inventors: Muthian SIVATHANU, Srinidhi VISWANATHA, Bhargav GULAVANI, Dharma Kiritkumar SHUKLA, Rimma Vladimirovna NEHME, Amey AGRAWAL, Ramachandran RAMJEE, Kaustubh WELANKAR, Ravi Shreyas ANUPINDI
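The mechanism above — time-slicing two workers on one accelerator, swapping memory state through the host, and deferring collective operations until both workers reach equivalent states — can be sketched like this. It is a simplified illustration under my own naming (`Accelerator`, `Worker`, `flush_collectives`); a real all-reduce is replaced by a plain sum.

```python
class Accelerator:
    """One device shared by several workers via context switching."""
    def __init__(self):
        self.resident = None     # worker currently using device memory
        self.host_store = {}     # worker id -> swapped-out memory state

    def context_switch(self, out_worker, in_worker):
        self.host_store[out_worker.wid] = out_worker.memory  # save to host memory
        out_worker.memory = None
        in_worker.memory = self.host_store.pop(in_worker.wid, {})
        self.resident = in_worker

class Worker:
    def __init__(self, wid):
        self.wid = wid
        self.memory = {}
        self.pending_collectives = []   # accumulated, not executed immediately

    def run_interval(self, grads):
        self.memory["grads"] = grads
        self.pending_collectives.append(("all_reduce", grads))

def flush_collectives(workers):
    """At the second context switch point, perform the accumulated operations together."""
    total = sum(g for w in workers for _, g in w.pending_collectives)
    for w in workers:
        w.pending_collectives.clear()
    return total
```

Deferring the collectives is what makes the time-slicing safe: neither worker blocks waiting for a peer that is currently swapped out.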
  • Publication number: 20220318052
    Abstract: The disclosure herein describes scheduling execution of artificial intelligence (AI) workloads in a cloud infrastructure platform. A global scheduler receives AI workloads associated with resource ticket values. The scheduler distributes the AI workloads to nodes based on balancing resource ticket values. Local schedulers of the nodes schedule AI workloads on resources based on the resource ticket values of the AI workloads. Based on scheduling the AI workloads, coordinator services of the local schedulers execute the distributed AI workloads on the infrastructure resources of the nodes. The disclosure further describes scheduling AI workloads based on priority tiers. A scheduler receives AI workloads, and each AI workload is associated with a priority tier indicative of a preemption priority while being executed. The AI workloads are scheduled for execution on a distributed set of nodes based on the priority tiers and are then executed according to that schedule.
    Type: Application
    Filed: June 28, 2021
    Publication date: October 6, 2022
    Inventors: Muthian SIVATHANU, Atul KATIYAR, Dharma Kiritkumar SHUKLA, Rimma Vladimirovna NEHME, Shreshth SINGHAL, Pankaj SHARMA, Nipun KWATRA, Ramachandran RAMJEE
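The two-level scheme above (a global scheduler balancing ticket totals across nodes, local schedulers ordering work by ticket value) can be sketched as below. The placement heuristic — greedy assignment to the least-loaded node — is my own simplification; the patent does not fix a particular balancing algorithm.

```python
def global_schedule(workloads, nodes):
    """Assign each workload (id -> ticket value) to the node with the
    lowest total ticket load, largest workloads first."""
    load = {n: 0 for n in nodes}
    placement = {}
    for wid, tickets in sorted(workloads.items(), key=lambda kv: -kv[1]):
        node = min(load, key=load.get)
        placement[wid] = node
        load[node] += tickets
    return placement, load

def local_schedule(queue):
    """On a node, run higher-ticket workloads first."""
    return sorted(queue, key=lambda kv: -kv[1])
```

With workloads {a: 8, b: 4, c: 4} and two nodes, the greedy pass leaves both nodes with a ticket load of 8, which is the balance the global scheduler aims for.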
  • Publication number: 20220318674
    Abstract: The disclosure herein describes managing artificial intelligence (AI) workloads in a cloud infrastructure platform. A set of distributed infrastructure resources are integrated into the cloud infrastructure platform via native support interfaces. AI workloads are received from a plurality of tenants, wherein the AI workloads include training workloads and inferencing workloads, and resource subsets of the set of distributed infrastructure resources are assigned to the received AI workloads. The received AI workloads are scheduled for execution on the assigned resource subsets and, based on that scheduling, are executed on them. The described cloud infrastructure platform provides efficient, secure execution of AI workloads for many different tenants and enables the flexible use of a wide variety of both third-party and first-party infrastructure resources.
    Type: Application
    Filed: June 28, 2021
    Publication date: October 6, 2022
    Inventors: Dharma Kiritkumar SHUKLA, Rimma Vladimirovna NEHME, Pankaj SHARMA, Shreshth SINGHAL, Vipul Arunkant MODI, Muthian SIVATHANU, Atul KATIYAR
  • Publication number: 20220308917
    Abstract: The disclosure herein describes platform-level checkpointing for deep learning (DL) jobs. The checkpointing is performed through capturing two kinds of state data: (i) GPU state (device state), and (ii) CPU state (host state). The GPU state includes GPU data (e.g., model parameters, optimizer state, etc.) that is located in the GPU and GPU context (e.g., the default stream in GPU, various handles created by libraries such as DNN, Blas, etc.). Only a fraction of the GPU memory is copied because the checkpointing is done in a domain-aware manner: the “active” memory contains useful data like model parameters. To capture this useful data, memory management is controlled to identify which parts of the memory are active. Also, to restore the destination GPU to the same context/state, a mechanism captures state-changing events on the original GPU and replays them on the destination GPU.
    Type: Application
    Filed: June 26, 2021
    Publication date: September 29, 2022
    Inventors: Muthian SIVATHANU, Srinidhi VISWANATHA, Dharma Kiritkumar SHUKLA, Nipun KWATRA, Ramachandran RAMJEE, Rimma Vladimirovna NEHME, Pankaj SHARMA, Bhalakumaaran Erode RANGANATHAN, Vaibhav SHARMA
  • Publication number: 20220311832
    Abstract: The disclosure herein describes platform-level migration for deep learning training (DLT) jobs from a checkpointed state between a source node and a destination node. The checkpointing is performed through capturing GPU state (e.g., device state) and CPU state (e.g., host state). The GPU state includes GPU data (e.g., model parameters, optimizer state, etc.) that is located in the GPU and GPU context (e.g., the default stream in GPU, various handles created by libraries). Restoring the DLT job on the destination node involves resumption of processing of a destination GPU at the same checkpointed state.
    Type: Application
    Filed: June 25, 2021
    Publication date: September 29, 2022
    Inventors: Dharma Kiritkumar SHUKLA, Muthian SIVATHANU, Lu XUN, Rimma Vladimirovna NEHME
  • Patent number: 11405181
    Abstract: A system includes a set of low resource devices, each configured to: receive transactions to be added to an encrypted block chain ledger from a sample of untrusted high resource devices; prepare a proposed block of the received transactions; provide the proposed block to the sample of untrusted high resource devices; and receive proposed blocks, originating from the set of low resource devices, from the untrusted high resource devices. The low resource devices run a consensus protocol to select one proposed block to add to the encrypted block chain ledger stored on the untrusted high resource devices.
    Type: Grant
    Filed: July 12, 2019
    Date of Patent: August 2, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Muthian Sivathanu, Nishanth Chandran, Divya Gupta, Apurv Mehra, Satyanarayana V. Lokam, Sambhav Satija, Sudheesh Singanamalla
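The data flow above — low resource devices building proposed blocks from sampled transactions, then agreeing on one block to append — can be sketched as follows. The consensus step here is a plain majority vote over block hashes, which is only a stand-in for the real protocol; `make_block` and `consensus` are illustrative names.

```python
import hashlib
import collections

def make_block(prev_hash, transactions):
    """A proposed block: sorted transactions chained to the previous hash."""
    payload = prev_hash + "|" + ",".join(sorted(transactions))
    return {
        "prev": prev_hash,
        "txs": sorted(transactions),
        "hash": hashlib.sha256(payload.encode()).hexdigest(),
    }

def consensus(proposed_blocks):
    """Pick the block proposed by the most devices; ties broken by lowest hash."""
    votes = collections.Counter(b["hash"] for b in proposed_blocks)
    winner_hash, _ = min(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return next(b for b in proposed_blocks if b["hash"] == winner_hash)
```

Because transactions are sorted before hashing, two devices that sampled the same transaction set propose byte-identical blocks, so their votes coincide.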
  • Publication number: 20210014042
    Abstract: A system includes a set of low resource devices, each configured to: receive transactions to be added to an encrypted block chain ledger from a sample of untrusted high resource devices; prepare a proposed block of the received transactions; provide the proposed block to the sample of untrusted high resource devices; and receive proposed blocks, originating from the set of low resource devices, from the untrusted high resource devices. The low resource devices run a consensus protocol to select one proposed block to add to the encrypted block chain ledger stored on the untrusted high resource devices.
    Type: Application
    Filed: July 12, 2019
    Publication date: January 14, 2021
    Inventors: Muthian Sivathanu, Nishanth Chandran, Divya Gupta, Apurv Mehra, Satyanarayana V. Lokam, Sambhav Satija, Sudheesh Singanamalla
  • Patent number: 10810206
    Abstract: Methods, systems, and computer programs are presented for structuring a database to support multiple partitioning orders at the storage layer. One method includes an operation for identifying partitioning fields for a database distributed across computing devices, where each computing device stores an extent that holds a subset of entries from the database. For each partitioning field, the database entries are stored in extents associated with the partitioning field, the database entries in the extents for the partitioning field being organized based on the value of the partitioning field. Further, the method includes operations for receiving a database query that includes a filter based on values of a selected partitioning field, and for retrieving the data for the database query from one or more of the extents associated with the selected partitioning field. The retrieved data is returned for the database query.
    Type: Grant
    Filed: June 15, 2017
    Date of Patent: October 20, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Muthian Sivathanu
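The multi-order partitioning described above — the same table stored once per partitioning field, so a filter on any of those fields touches only the matching extents — can be sketched like this. `MultiOrderTable` is an illustrative name of mine, and the per-value dictionary is a stand-in for extents on separate machines.

```python
from collections import defaultdict

class MultiOrderTable:
    """One logical table kept under several partitioning orders."""
    def __init__(self, partition_fields):
        # one extent map per partitioning field: field -> value -> rows
        self.extents = {f: defaultdict(list) for f in partition_fields}

    def insert(self, row):
        # the row is duplicated into the extents of every partitioning order
        for field, by_value in self.extents.items():
            by_value[row[field]].append(row)

    def query(self, field, value):
        """Read only the extent keyed by the filtered field's value."""
        return list(self.extents[field][value])
```

The trade-off the sketch makes visible: writes are amplified (one copy per partitioning order) in exchange for reads that never scan extents irrelevant to the filter.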
  • Patent number: 10798098
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium for access control for enterprise information. In one aspect, a method includes receiving resources of an enterprise, each resource having a respective access control list specifying access privileges to the resource for one or more members, and the resources including entities related to the enterprise and relationships; identifying entity facts of the entities from the resources; determining, for each entity fact, an entity fact access control list; storing data describing the entities, entity facts and the respective entity fact access control lists, wherein each entity fact is associated with its corresponding entity fact access control list; and providing, to each of the members of the enterprise, access privileges to the data describing the entities and the entity facts according to the respective entity fact access control lists.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: October 6, 2020
    Assignee: Google LLC
    Inventors: Brent VerWeyst, Martin James Cochran, Muthian Sivathanu
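The access-control flow above — deriving a per-fact ACL from the resources a fact was extracted from, then filtering what each member sees — can be sketched as follows. Using the intersection of the source ACLs is one conservative policy I chose for illustration; the patent describes determining per-fact ACLs without this sketch's specific rule.

```python
def derive_fact_acl(source_acls):
    """A fact extracted from several resources is visible only to members
    allowed on every source resource (set intersection)."""
    acl = set(source_acls[0])
    for a in source_acls[1:]:
        acl &= set(a)
    return acl

def visible_facts(facts, member):
    """Return the facts whose ACL grants this member access."""
    return [f["fact"] for f in facts if member in f["acl"]]
```

So a fact drawn from a document readable by {alice, bob} and another readable only by {alice} ends up visible to alice alone.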
  • Patent number: 10474650
    Abstract: Implementations provide an indexing system with near-instant updates to an inverted index while maintaining techniques for query optimization. The system may provision empty positions in posting lists to enable in-place updating, without having to rebuild the posting list or append updates to the end of the posting list. For example, a system comprises at least one processor and memory storing an index that includes at least one posting list that maps a term to a set of the documents. The posting list includes an ordered list of documents and has a plurality of open positions within the ordered list. The memory also stores instructions that, when executed by the at least one processor, cause the system to locate an open position of the plurality of open positions for a new document and to insert the new document into the at least one posting list using the open position.
    Type: Grant
    Filed: November 21, 2013
    Date of Patent: November 12, 2019
    Assignee: GOOGLE LLC
    Inventors: Muthian Sivathanu, Saurabh Goyal, Rajiv Mathews
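The in-place update idea above — provisioning empty positions inside the ordered posting list so a new document can be placed without rebuilding or appending — can be sketched like this. The fixed one-slot gap and the linear scan are simplifications of mine; the patent does not prescribe them.

```python
class PostingList:
    """Ordered doc ids interleaved with provisioned open positions (None)."""
    def __init__(self, doc_ids, gap=1):
        self.slots = []
        for d in sorted(doc_ids):
            self.slots.append(d)
            self.slots.extend([None] * gap)

    def insert(self, doc_id):
        """Place doc_id into an open slot just before the first larger id."""
        after = len(self.slots)   # index of first id greater than doc_id
        for i, s in enumerate(self.slots):
            if s is not None and s > doc_id:
                after = i
                break
        i = after - 1
        if i >= 0 and self.slots[i] is None:
            self.slots[i] = doc_id
            return True
        return False   # no open slot in place; a rebuild would be needed

    def docs(self):
        return [s for s in self.slots if s is not None]
```

When the nearby open slots are exhausted, `insert` reports failure, which is the point at which a real system would re-provision gaps.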
  • Publication number: 20190260749
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium for access control for enterprise information. In one aspect, a method includes receiving resources of an enterprise, each resource having a respective access control list specifying access privileges to the resource for one or more members, and the resources including entities related to the enterprise and relationships; identifying entity facts of the entities from the resources; determining, for each entity fact, an entity fact access control list; storing data describing the entities, entity facts and the respective entity fact access control lists, wherein each entity fact is associated with its corresponding entity fact access control list; and providing, to each of the members of the enterprise, access privileges to the data describing the entities and the entity facts according to the respective entity fact access control lists.
    Type: Application
    Filed: April 30, 2019
    Publication date: August 22, 2019
    Inventors: Brent VerWeyst, Martin James Cochran, Muthian Sivathanu
  • Patent number: 10326768
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium for access control for enterprise information. In one aspect, a method includes receiving resources of an enterprise, each resource having a respective access control list specifying access privileges to the resource for one or more members, and the resources including entities related to the enterprise and relationships; identifying entity facts of the entities from the resources; determining, for each entity fact, an entity fact access control list; storing data describing the entities, entity facts and the respective entity fact access control lists, wherein each entity fact is associated with its corresponding entity fact access control list; and providing, to each of the members of the enterprise, access privileges to the data describing the entities and the entity facts according to the respective entity fact access control lists.
    Type: Grant
    Filed: May 28, 2015
    Date of Patent: June 18, 2019
    Assignee: Google LLC
    Inventors: Brent VerWeyst, Martin James Cochran, Muthian Sivathanu
  • Publication number: 20180365292
    Abstract: Methods, systems, and computer programs are presented for structuring a database to support multiple partitioning orders at the storage layer. One method includes an operation for identifying partitioning fields for a database distributed across computing devices, where each computing device stores an extent that holds a subset of entries from the database. For each partitioning field, the database entries are stored in extents associated with the partitioning field, the database entries in the extents for the partitioning field being organized based on the value of the partitioning field. Further, the method includes operations for receiving a database query that includes a filter based on values of a selected partitioning field, and for retrieving the data for the database query from one or more of the extents associated with the selected partitioning field. The retrieved data is returned for the database query.
    Type: Application
    Filed: June 15, 2017
    Publication date: December 20, 2018
    Inventor: Muthian Sivathanu
  • Patent number: 10102268
    Abstract: A system for efficiently responding to proximity queries may include a memory storing an index for searching a graph-based data store, the index including posting lists for one or more proximity ranges compatible with a space. A posting list can include one or more entities of a type compatible with the space, each entity having a location within the space, the location being a basic unit in a location hierarchy for the space and, for each entity, at least one node in the location hierarchy that falls within the proximity range of the posting list with reference to the location of the entity. The system may also include a memory storing instructions that cause the system to use the index to respond to a query that includes a query proximity range for the space. The space can be a geographic space or a time space.
    Type: Grant
    Filed: December 27, 2016
    Date of Patent: October 16, 2018
    Assignee: GOOGLE LLC
    Inventors: Muthian Sivathanu, Puneet Garg, Rajesh S R
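The proximity index described above — for each entity and proximity range, storing the hierarchy nodes that fall within range of the entity's location, so a proximity query becomes a lookup — can be sketched with grid cells standing in for the location hierarchy. All names here are illustrative, and a Chebyshev-style square range replaces whatever distance the real index uses.

```python
def cells_within(center, radius):
    """Grid cells within `radius` of `center` (square neighborhood)."""
    cx, cy = center
    return {(x, y)
            for x in range(cx - radius, cx + radius + 1)
            for y in range(cy - radius, cy + radius + 1)}

def build_proximity_index(entities, radius):
    """Posting list for one proximity range: entity -> set of in-range cells."""
    return {name: cells_within(loc, radius) for name, loc in entities.items()}

def query(index, query_cell):
    """Entities whose posting list covers the query location."""
    return sorted(name for name, cells in index.items() if query_cell in cells)
```

The work of enumerating in-range locations is paid once at indexing time, so each query is a membership test rather than a distance computation.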
  • Patent number: 10073874
    Abstract: Implementations provide an indexing system with an instant failover that uses a moving snapshot window. For example, a method may include receiving, by a processor, a query and determining that a main query processing engine is not responding. The method may further include generating a search result for the query using a secondary query processing engine that applies at least one snapshot record to a portion of a posting list, the snapshot record including the portion of the posting list as it appeared before a modification, and the modification occurring within a predetermined time before receiving the query. The portion is a fixed size smaller than the posting list. Applying the snapshot record can include overlaying the portion of the posting list with the snapshot record beginning at an offset specified by the snapshot record. The main query processing engine generates a search result without applying snapshot records.
    Type: Grant
    Filed: November 21, 2013
    Date of Patent: September 11, 2018
    Assignee: GOOGLE LLC
    Inventors: Muthian Sivathanu, Saurabh Goyal, Rajiv Mathews
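The moving snapshot window above can be sketched in miniature: before each in-place modification, the pre-image of the touched fixed-size portion is recorded with its offset, and the failover engine overlays those records to see the posting list as it was before recent changes. `SnapshottedList` and the 4-element portion size are illustrative choices of mine.

```python
PORTION = 4  # fixed portion size, smaller than the whole list

class SnapshottedList:
    def __init__(self, values):
        self.values = list(values)
        self.snapshots = []   # (offset, pre-image of the modified portion)

    def modify(self, index, value):
        """Record the portion's pre-image, then update in place."""
        start = (index // PORTION) * PORTION
        self.snapshots.append((start, self.values[start:start + PORTION]))
        self.values[index] = value

    def failover_view(self):
        """Overlay snapshot records at their offsets, oldest pre-image winning,
        to reconstruct the list as it stood before the window's modifications."""
        view = list(self.values)
        for start, pre in reversed(self.snapshots):
            view[start:start + len(pre)] = pre
        return view
```

The main engine reads `values` directly and pays nothing for the snapshots; only the failover path does the overlay work, matching the abstract's split between the two query processing engines.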
  • Patent number: 9576007
    Abstract: A search index for searching a graph-based data store can include triple entries, each triple entry having a posting list value, at least one intersection identifier associated with the posting list value, and at least one result identifier associated with the intersection identifier. The index may also include search entries having a posting list value that corresponds to a text search aid. The search index may also include pre-computed path entries, such as chain path entries and converge path entries. The index may also include bucket posting lists representing ranges of object values for a particular predicate and proximity posting lists that include one or more entities and the areas of a location hierarchy with locations within the proximity of the entity. Queries for the data graph may have at least two stages, each stage being associated with a posting list from a graph index.
    Type: Grant
    Filed: December 10, 2013
    Date of Patent: February 21, 2017
    Assignee: Google Inc.
    Inventors: Muthian Sivathanu, Puneet Garg, Rajesh S R