Patents by Inventor Srinivas Sridharan

Srinivas Sridharan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240070799
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Application
    Filed: September 5, 2023
    Publication date: February 29, 2024
    Applicant: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
  • Publication number: 20230376762
    Abstract: Embodiments described herein provide an apparatus comprising an interconnect switch configured to couple with a plurality of graphics processors via a plurality of point-to-point interconnects and one or more processors including a graphics processor coupled with the interconnect switch via a point-to-point interconnect of the plurality of point-to-point interconnects.
    Type: Application
    Filed: May 19, 2023
    Publication date: November 23, 2023
    Applicant: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das, Chandrasekaran Sakthivel, Mikhail E. Smorkalov
  • Publication number: 20230351542
    Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to convert elements of a floating-point tensor to convert the floating-point tensor into a fixed-point tensor.
    Type: Application
    Filed: April 24, 2023
    Publication date: November 2, 2023
    Applicant: Intel Corporation
    Inventors: Naveen K. Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
  • Patent number: 11798120
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Grant
    Filed: August 10, 2021
    Date of Patent: October 24, 2023
    Assignee: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
  • Patent number: 11704565
    Abstract: Embodiments described herein provide a system to configure distributed training of a neural network, the system comprising memory to store a library to facilitate data transmission during distributed training of the neural network; a network interface to enable transmission and receipt of configuration data associated with a set of worker nodes, the worker nodes configured to perform distributed training of the neural network; and a processor to execute instructions provided by the library. The instructions cause the processor to create one or more groups of the worker nodes, the one or more groups of worker nodes to be created based on a communication pattern for messages to be transmitted between the worker nodes during distributed training of the neural network. The processor can transparently adjust communication paths between worker nodes based on the communication pattern.
    Type: Grant
    Filed: March 3, 2022
    Date of Patent: July 18, 2023
    Assignee: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das, Chandrasekaran Sakthivel, Mikhail E. Smorkalov
  • Publication number: 20230177328
    Abstract: One embodiment provides for a graphics processing unit including a fabric interface configured to transmit gradient data stored in a memory device of the graphics processing unit according to a pre-defined communication operation. The memory device is a physical memory device shared with a compute block of the graphics processing unit and the fabric interface. The fabric interface automatically transmits the gradient data stored in memory to a second distributed training node based on an address of the gradient data in the memory device.
    Type: Application
    Filed: October 25, 2022
    Publication date: June 8, 2023
    Applicant: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das
  • Patent number: 11669933
    Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to quantize elements of a floating-point tensor to convert the floating-point tensor into a dynamic fixed-point tensor.
    Type: Grant
    Filed: April 27, 2022
    Date of Patent: June 6, 2023
    Assignee: Intel Corporation
    Inventors: Naveen K. Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
  • Publication number: 20230053289
    Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises at least one processor to perform operations to implement a neural network and accelerator logic communicatively coupled to the processor to perform compute operations for the neural network.
    Type: Application
    Filed: June 21, 2022
    Publication date: February 16, 2023
    Applicant: Intel Corporation
    Inventors: Amit Bleiweiss, Anavai Ramesh, Asit Mishra, Deborah Marr, Jeffrey Cook, Srinivas Sridharan, Eriko Nurvitadhi, Elmoustapha Ould-Ahmed-Vall, Dheevatsa Mudigere, Mohammad Ashraf Bhuiyan, Md Faijul Amin, Wei Wang, Dhawal Srivastava, Niharika Maheshwari
  • Publication number: 20220366526
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising multi-dimensionally partitioning data of a feature map across multiple nodes for distributed training of a convolutional neural network; performing a parallel convolution operation on the multiple partitions to train weight data of the neural network; and exchanging data between nodes to enable computation of halo regions, the halo regions having dependencies on data processed by a different node.
    Type: Application
    Filed: June 27, 2022
    Publication date: November 17, 2022
    Applicant: Intel Corporation
    Inventors: Dipankar Das, Karthikeyan Vaidyanathan, Srinivas Sridharan
  • Patent number: 11488008
    Abstract: One embodiment provides for a system to compute and distribute data for distributed training of a neural network, the system including first memory to store a first set of instructions including a machine learning framework; a fabric interface to enable transmission and receipt of data associated with a set of trainable machine learning parameters; a first set of general-purpose processor cores to execute the first set of instructions, the first set of instructions to provide a training workflow for computation of gradients for the trainable machine learning parameters and to communicate with a second set of instructions, the second set of instructions to facilitate transmission and receipt of the gradients via the fabric interface; and a graphics processor to perform compute operations associated with the training workflow to generate the gradients for the trainable machine learning parameters.
    Type: Grant
    Filed: January 12, 2018
    Date of Patent: November 1, 2022
    Assignee: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das
  • Publication number: 20220327656
    Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to quantize elements of a floating-point tensor to convert the floating-point tensor into a dynamic fixed-point tensor.
    Type: Application
    Filed: April 27, 2022
    Publication date: October 13, 2022
    Applicant: Intel Corporation
    Inventors: Naveen K. Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
  • Publication number: 20220245454
    Abstract: Embodiments described herein provide a system to configure distributed training of a neural network, the system comprising memory to store a library to facilitate data transmission during distributed training of the neural network; a network interface to enable transmission and receipt of configuration data associated with a set of worker nodes, the worker nodes configured to perform distributed training of the neural network; and a processor to execute instructions provided by the library. The instructions cause the processor to create one or more groups of the worker nodes, the one or more groups of worker nodes to be created based on a communication pattern for messages to be transmitted between the worker nodes during distributed training of the neural network. The processor can transparently adjust communication paths between worker nodes based on the communication pattern.
    Type: Application
    Filed: March 3, 2022
    Publication date: August 4, 2022
    Applicant: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das, Chandrasekaran Sakthivel, Mikhail E. Smorkalov
  • Patent number: 11373266
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising multi-dimensionally partitioning data of a feature map across multiple nodes for distributed training of a convolutional neural network; performing a parallel convolution operation on the multiple partitions to train weight data of the neural network; and exchanging data between nodes to enable computation of halo regions, the halo regions having dependencies on data processed by a different node.
    Type: Grant
    Filed: January 12, 2018
    Date of Patent: June 28, 2022
    Assignee: Intel Corporation
    Inventors: Dipankar Das, Karthikeyan Vaidyanathan, Srinivas Sridharan
  • Patent number: 11373088
    Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises at least one processor to perform operations to implement a neural network and accelerator logic communicatively coupled to the processor to perform compute operations for the neural network.
    Type: Grant
    Filed: December 30, 2017
    Date of Patent: June 28, 2022
    Assignee: Intel Corporation
    Inventors: Amit Bleiweiss, Anavai Ramesh, Asit Mishra, Deborah Marr, Jeffrey Cook, Srinivas Sridharan, Eriko Nurvitadhi, Elmoustapha Ould-Ahmed-Vall, Dheevatsa Mudigere, Mohammad Ashraf Bhuiyan, Md Faijul Amin, Wei Wang, Dhawal Srivastava, Niharika Maheshwari
  • Patent number: 11321805
    Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a compute unit including a hardware logic unit having dynamic precision fixed-point logic, the compute unit to receive a set of dynamic fixed-point tensors, compute, via the dynamic precision fixed-point logic, a right-shift value using an absolute maximum value within the set of dynamic fixed-point tensors and a dynamic range of the set of dynamic fixed-point tensors, right-shift data values within the set of dynamic fixed-point tensors based on the right-shift value, increment a shared exponent associated with the set of dynamic fixed-point tensors based on the right-shift value, perform a compute operation on the set of dynamic fixed-point tensors, and generate an output tensor via the compute operation on the set of dynamic fixed-point tensors.
    Type: Grant
    Filed: October 29, 2020
    Date of Patent: May 3, 2022
    Assignee: Intel Corporation
    Inventors: Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
  • Publication number: 20220101480
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Application
    Filed: August 10, 2021
    Publication date: March 31, 2022
    Applicant: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
  • Patent number: 11270201
    Abstract: Embodiments described herein provide a system to configure distributed training of a neural network, the system comprising memory to store a library to facilitate data transmission during distributed training of the neural network; a network interface to enable transmission and receipt of configuration data associated with a set of worker nodes, the worker nodes configured to perform distributed training of the neural network; and a processor to execute instructions provided by the library, the instructions to cause the processor to create one or more groups of the worker nodes, the one or more groups of worker nodes to be created based on a communication pattern for messages to be transmitted between the worker nodes during distributed training of the neural network.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: March 8, 2022
    Assignee: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das, Chandrasekaran Sakthivel, Mikhail E. Smorkalov
  • Patent number: 11249910
    Abstract: Systems, apparatuses and methods may provide for technology that detects a runtime call to a communication library, wherein the runtime call identifies a memory buffer, determines that a class of service (CLOS) attribute is associated with the memory buffer, and issues a driver instruction to modify the CLOS attribute in response to the runtime call.
    Type: Grant
    Filed: December 17, 2019
    Date of Patent: February 15, 2022
    Assignee: Intel Corporation
    Inventors: Aravindh Anantaraman, Srinivas Sridharan, Ajaya Durg, Mohammad R. Haghighat, Mikhail E. Smorkalov, Sudarshan Srinivasan
  • Publication number: 20210350212
    Abstract: One embodiment provides for a non-transitory machine readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising providing an interface to define a neural network using machine-learning domain specific terminology, wherein the interface enables selection of a neural network topology and abstracts low-level communication details of distributed training of the neural network.
    Type: Application
    Filed: May 24, 2021
    Publication date: November 11, 2021
    Applicant: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
  • Publication number: 20210342692
    Abstract: Technologies for artificial neural network training include a computing node with a host fabric interface that sends a message that includes one or more artificial neural network training algorithm values to another computing node in response to receipt of a request to send the message. Prior to sending the message, the host fabric interface may receive a request to quantize the message and quantize the message based on a quantization level included in the request to generate a quantized message. The quantized message includes one or more quantized values such that each quantized value has a lower precision than a corresponding artificial neural network training algorithm value. The host fabric interface then transmits the quantized message, which includes metadata indicative of the quantization level, to another computing node in response to quantization of the message for artificial neural network training. Other embodiments are described and claimed.
    Type: Application
    Filed: May 14, 2021
    Publication date: November 4, 2021
    Inventors: Naveen K. Mellempudi, Srinivas Sridharan, Dheevatsa Mudigere, Dipankar Das