Patents by Inventor Srinivas Sridharan

Srinivas Sridharan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Communication optimizations for distributed machine learning

Patent number: 12450484

Abstract: Embodiments described herein provide an apparatus comprising an interconnect switch configured to couple with a plurality of graphics processors via a plurality of point-to-point interconnects and one or more processors including a graphics processor coupled with the interconnect switch via a point-to-point interconnect of the plurality of point-to-point interconnects.

Type: Grant

Filed: May 19, 2023

Date of Patent: October 21, 2025

Assignee: Intel Corporation

Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das, Chandrasekaran Sakthivel, Mikhail E. Smorkalov

Machine learning accelerator mechanism

Patent number: 12417380

Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises at least one processor to perform operations to implement a neural network and accelerator logic communicatively coupled to the processor to perform compute operations for the neural network.

Type: Grant

Filed: May 31, 2024

Date of Patent: September 16, 2025

Assignee: Intel Corporation

Inventors: Amit Bleiweiss, Anavai Ramesh, Asit Mishra, Deborah Marr, Jeffrey Cook, Srinivas Sridharan, Eriko Nurvitadhi, Elmoustapha Ould-Ahmed-Vall, Dheevatsa Mudigere, Mohammad Ashraf Bhuiyan, Md Faijul Amin, Wei Wang, Dhawal Srivastava, Niharika Maheshwari

Dynamic precision management for integer deep learning primitives

Patent number: 12412232

Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to convert elements of a floating-point tensor to convert the floating-point tensor into a fixed-point tensor.

Type: Grant

Filed: June 24, 2024

Date of Patent: September 9, 2025

Assignee: Intel Corporation

Inventors: Naveen K. Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan

Abstraction layers for scalable distributed machine learning

Patent number: 12387287

Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.

Type: Grant

Filed: September 5, 2023

Date of Patent: August 12, 2025

Assignee: Intel Corporation

Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das

Hardware implemented point to point communication primitives for machine learning

Patent number: 12354001

Abstract: One embodiment provides for a graphics processing unit including a fabric interface configured to transmit gradient data stored in a memory device of the graphics processing unit according to a pre-defined communication operation. The memory device is a physical memory device shared with a compute block of the graphics processing unit and the fabric interface. The fabric interface automatically transmits the gradient data stored in memory to a second distributed training node based on an address of the gradient data in the memory device.

Type: Grant

Filed: October 25, 2022

Date of Patent: July 8, 2025

Assignee: Intel Corporation

Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das

Data parallelism and halo exchange for distributed machine learning

Publication number: 20250200696

Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising multi-dimensionally partitioning data of a feature map across multiple nodes for distributed training of a convolutional neural network; performing a parallel convolution operation on the multiple partitions to train weight data of the neural network; and exchanging data between nodes to enable computation of halo regions, the halo regions having dependencies on data processed by a different node.

Type: Application

Filed: December 16, 2024

Publication date: June 19, 2025

Applicant: Intel Corporation

Inventors: Dipankar Das, Karthikeyan Vaidyanathan, Srinivas Sridharan

Data parallelism and halo exchange for distributed machine learning

Patent number: 12211117

Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising multi-dimensionally partitioning data of a feature map across multiple nodes for distributed training of a convolutional neural network; performing a parallel convolution operation on the multiple partitions to train weight data of the neural network; and exchanging data between nodes to enable computation of halo regions, the halo regions having dependencies on data processed by a different node.

Type: Grant

Filed: June 27, 2022

Date of Patent: January 28, 2025

Assignee: Intel Corporation

Inventors: Dipankar Das, Karthikeyan Vaidyanathan, Srinivas Sridharan
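
The halo exchange described in the abstract above can be illustrated with a small single-process sketch. This is not the patented method: it partitions along one dimension only, simulates the nodes as slices of a Python list, and uses hypothetical function names (`conv1d_same`, `partitioned_conv1d`). It shows only that each partition needs a few boundary elements owned by its neighbors before it can compute its share of a convolution.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation with an odd-length kernel."""
    r = len(kernel) // 2
    padded = [0.0] * r + list(signal) + [0.0] * r
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def partitioned_conv1d(signal, kernel, num_nodes):
    """Same result as conv1d_same, computed one partition at a time.

    Each simulated node owns signal[start:end]. Before computing, it
    gathers r halo elements on each side; these belong to neighboring
    nodes (or are zero padding at the outer edges).
    """
    n = len(signal)
    if n == 0:
        return []
    r = len(kernel) // 2
    chunk = -(-n // num_nodes)  # ceiling division
    out = []
    for start in range(0, n, chunk):
        end = min(n, start + chunk)
        # Halo exchange (simulated): fetch boundary data owned by other nodes.
        left = [signal[k] if k >= 0 else 0.0 for k in range(start - r, start)]
        right = [signal[k] if k < n else 0.0 for k in range(end, end + r)]
        local = left + list(signal[start:end]) + right
        out.extend(
            sum(local[i + j] * kernel[j] for j in range(len(kernel)))
            for i in range(end - start)
        )
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7]
    taps = [1, 2, 1]
    # The partitioned result matches the unpartitioned one.
    print(partitioned_conv1d(data, taps, 3) == conv1d_same(data, taps))
```

In a real distributed setting the two list comprehensions that build `left` and `right` would be replaced by messages between nodes; the list slicing here merely stands in for that communication.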

Machine learning accelerator mechanism

Publication number: 20240403620

Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises at least one processor to perform operations to implement a neural network and accelerator logic communicatively coupled to the processor to perform compute operations for the neural network.

Type: Application

Filed: May 31, 2024

Publication date: December 5, 2024

Applicant: Intel Corporation

Inventors: Amit Bleiweiss, Anavai Ramesh, Asit Mishra, Deborah Marr, Jeffrey Cook, Srinivas Sridharan, Eriko Nurvitadhi, Elmoustapha Ould-Ahmed-Vall, Dheevatsa Mudigere, Mohammad Ashraf Bhuiyan, Md Faijul Amin, Wei Wang, Dhawal Srivastava, Niharika Maheshwari

Fine-grain compute communication execution for deep learning frameworks via hardware accelerated point-to-point primitives

Patent number: 12154028

Abstract: One embodiment provides for a system to configure distributed training of a neural network. The system includes memory to store a library to facilitate transmission of data during distributed training of the neural network; a network interface to transmit and receive gradient data associated with the trainable parameters; a general-purpose processor to execute instructions provided by the library, the instructions to cause the general-purpose processor to configure the network interface to transmit and receive the gradient data associated with the trainable parameters during a workflow of a machine learning framework; and a graphics processor to perform compute operations associated with machine learning framework workflow to generate the gradient data associated with the trainable parameters, wherein, based on the machine learning framework workflow, the library is to interleave the compute operations on the graphics processor with transmission and receipt of gradient data via the network interface.

Type: Grant

Filed: January 12, 2018

Date of Patent: November 26, 2024

Assignee: Intel Corporation

Inventors: Srinivas Sridharan, Dheevatsa Mudigere

Machine learning accelerator mechanism

Patent number: 12039435

Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises at least one processor to perform operations to implement a neural network and accelerator logic communicatively coupled to the processor to perform compute operations for the neural network.

Type: Grant

Filed: June 21, 2022

Date of Patent: July 16, 2024

Assignee: Intel Corporation

Inventors: Amit Bleiweiss, Anavai Ramesh, Asit Mishra, Deborah Marr, Jeffrey Cook, Srinivas Sridharan, Eriko Nurvitadhi, Elmoustapha Ould-Ahmed-Vall, Dheevatsa Mudigere, Mohammad Ashraf Bhuiyan, Md Faijul Amin, Wei Wang, Dhawal Srivastava, Niharika Maheshwari

Dynamic precision management for integer deep learning primitives

Patent number: 12033237

Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to convert elements of a floating-point tensor to convert the floating-point tensor into a fixed-point tensor.

Type: Grant

Filed: April 24, 2023

Date of Patent: July 9, 2024

Assignee: Intel Corporation

Inventors: Naveen K. Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan

Abstraction layers for scalable distributed machine learning

Publication number: 20240070799

Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.

Type: Application

Filed: September 5, 2023

Publication date: February 29, 2024

Applicant: Intel Corporation

Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das

Communication optimizations for distributed machine learning

Publication number: 20230376762

Abstract: Embodiments described herein provide an apparatus comprising an interconnect switch configured to couple with a plurality of graphics processors via a plurality of point-to-point interconnects and one or more processors including a graphics processor coupled with the interconnect switch via a point-to-point interconnect of the plurality of point-to-point interconnects.

Type: Application

Filed: May 19, 2023

Publication date: November 23, 2023

Applicant: Intel Corporation

Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das, Chandrasekaran Sakthivel, Mikhail E. Smorkalov

Abstraction layers for scalable distributed machine learning

Patent number: 11798120

Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.

Type: Grant

Filed: August 10, 2021

Date of Patent: October 24, 2023

Assignee: Intel Corporation

Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das

Communication optimizations for distributed machine learning

Patent number: 11704565

Abstract: Embodiments described herein provide a system to configure distributed training of a neural network, the system comprising memory to store a library to facilitate data transmission during distributed training of the neural network; a network interface to enable transmission and receipt of configuration data associated with a set of worker nodes, the worker nodes configured to perform distributed training of the neural network; and a processor to execute instructions provided by the library. The instructions cause the processor to create one or more groups of the worker nodes, the one or more groups of worker nodes to be created based on a communication pattern for messages to be transmitted between the worker nodes during distributed training of the neural network. The processor can transparently adjust communication paths between worker nodes based on the communication pattern.

Type: Grant

Filed: March 3, 2022

Date of Patent: July 18, 2023

Assignee: Intel Corporation

Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das, Chandrasekaran Sakthivel, Mikhail E. Smorkalov

Hardware implemented point to point communication primitives for machine learning

Publication number: 20230177328

Abstract: One embodiment provides for a graphics processing unit including a fabric interface configured to transmit gradient data stored in a memory device of the graphics processing unit according to a pre-defined communication operation. The memory device is a physical memory device shared with a compute block of the graphics processing unit and the fabric interface. The fabric interface automatically transmits the gradient data stored in memory to a second distributed training node based on an address of the gradient data in the memory device.

Type: Application

Filed: October 25, 2022

Publication date: June 8, 2023

Applicant: Intel Corporation

Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das

Dynamic precision management for integer deep learning primitives

Patent number: 11669933

Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to quantize elements of a floating-point tensor to convert the floating-point tensor into a dynamic fixed-point tensor.

Type: Grant

Filed: April 27, 2022

Date of Patent: June 6, 2023

Assignee: Intel Corporation

Inventors: Naveen K. Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
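
As a rough software illustration of what the abstract above means by quantizing a floating-point tensor into a dynamic fixed-point tensor, the sketch below converts a list of floats into signed integers that share one power-of-two scale chosen from the data. The patent describes a hardware unit; this code is only an assumption-laden analogy, and the function names, the per-tensor shared exponent, and the 8-bit default are all hypothetical choices, not details taken from the patent.

```python
import math


def quantize_dynamic_fixed_point(values, bits=8):
    """Quantize floats to signed integers sharing one scale exponent.

    Returns (ints, exp) such that each value is approximately
    ints[i] * 2**exp. The exponent is chosen from the data so the
    largest magnitude still fits in the signed integer range.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [0] * len(values), 0
    # Smallest power-of-two scale that keeps max_abs / scale <= qmax.
    exp = math.ceil(math.log2(max_abs / qmax))
    scale = 2.0 ** exp
    # Clamp to guard against floating-point rounding at the boundary.
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return ints, exp


def dequantize(ints, exp):
    """Map the shared-exponent integers back to floats."""
    return [i * 2.0 ** exp for i in ints]


if __name__ == "__main__":
    ints, exp = quantize_dynamic_fixed_point([0.5, -1.0, 0.25], bits=8)
    print(ints, exp)              # [32, -64, 16] -6
    print(dequantize(ints, exp))  # [0.5, -1.0, 0.25]
```

The "dynamic" part in this sketch is that the exponent is recomputed from each tensor's value range rather than fixed in advance, so tensors with very different magnitudes can each use the full integer range.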

Machine learning accelerator mechanism

Publication number: 20230053289

Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises at least one processor to perform operations to implement a neural network and accelerator logic communicatively coupled to the processor to perform compute operations for the neural network.

Type: Application

Filed: June 21, 2022

Publication date: February 16, 2023

Applicant: Intel Corporation

Inventors: Amit Bleiweiss, Anavai Ramesh, Asit Mishra, Deborah Marr, Jeffrey Cook, Srinivas Sridharan, Eriko Nurvitadhi, Elmoustapha Ould-Ahmed-Vall, Dheevatsa Mudigere, Mohammad Ashraf Bhuiyan, Md Faijul Amin, Wei Wang, Dhawal Srivastava, Niharika Maheshwari

Data parallelism and halo exchange for distributed machine learning

Publication number: 20220366526

Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising multi-dimensionally partitioning data of a feature map across multiple nodes for distributed training of a convolutional neural network; performing a parallel convolution operation on the multiple partitions to train weight data of the neural network; and exchanging data between nodes to enable computation of halo regions, the halo regions having dependencies on data processed by a different node.

Type: Application

Filed: June 27, 2022

Publication date: November 17, 2022

Applicant: Intel Corporation

Inventors: Dipankar Das, Karthikeyan Vaidyanathan, Srinivas Sridharan

Hardware implemented point to point communication primitives for machine learning

Patent number: 11488008

Abstract: One embodiment provides for a system to compute and distribute data for distributed training of a neural network, the system including first memory to store a first set of instructions including a machine learning framework; a fabric interface to enable transmission and receipt of data associated with the set of trainable machine learning parameters; a first set of general-purpose processor cores to execute the first set of instructions, the first set of instructions to provide a training workflow for computation of gradients for the trainable machine learning parameters and to communicate with a second set of instructions, the second set of instructions facilitate transmission and receipt of the gradients via the fabric interface; and a graphics processor to perform compute operations associated with the training workflow to generate the gradients for the trainable machine learning parameters.

Type: Grant

Filed: January 12, 2018

Date of Patent: November 1, 2022

Assignee: Intel Corporation

Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das
