Patents by Inventor Srinivas Sridharan

Srinivas Sridharan has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210304025
    Abstract: A processor analyzes a machine learning workload. Corresponding priority levels are assigned to identified data requests in the machine learning workload based on an associated data dependency delay performance impact. The assigned corresponding priority levels are indicated when providing the data requests to a memory controller. The memory controller sorts the received data requests into a plurality of different priority queues based on the indicated corresponding priority levels. The memory controller initiates the data requests from the different priority queues to memory in an order based on different qualities of service of the different priority queues.
    Type: Application
    Filed: March 24, 2020
    Publication date: September 30, 2021
    Inventor: Srinivas Sridharan
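    A minimal Python sketch of the priority-queue scheme this abstract describes, with hypothetical priority levels and QoS weights (the actual priority assignment comes from the workload analysis, which is not modeled here):

        import heapq
        from dataclasses import dataclass
        from itertools import count

        # Hypothetical priority levels; the patent derives them from the
        # data-dependency delay impact of each request in the ML workload.
        HIGH, MEDIUM, LOW = 0, 1, 2

        @dataclass
        class DataRequest:
            address: int
            priority: int  # assumed annotated by the processor's analysis

        class PriorityMemoryController:
            """Sorts requests into per-priority queues and issues them in
            an order weighted by each queue's quality of service."""
            def __init__(self, qos_weights=(4, 2, 1)):
                self.queues = {p: [] for p in (HIGH, MEDIUM, LOW)}
                self.qos_weights = qos_weights
                self._seq = count()  # keeps FIFO order within a priority

            def submit(self, req):
                heapq.heappush(self.queues[req.priority], (next(self._seq), req))

            def issue(self):
                # Drain higher-QoS queues more often than lower ones.
                while any(self.queues.values()):
                    for prio, weight in zip((HIGH, MEDIUM, LOW), self.qos_weights):
                        for _ in range(weight):
                            if self.queues[prio]:
                                yield heapq.heappop(self.queues[prio])[1]

        ctrl = PriorityMemoryController()
        ctrl.submit(DataRequest(0x1000, LOW))
        ctrl.submit(DataRequest(0x2000, HIGH))
        for r in ctrl.issue():
            print(hex(r.address), r.priority)  # high-priority request first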
  • Patent number: 11094029
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Grant
    Filed: April 10, 2017
    Date of Patent: August 17, 2021
    Assignee: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
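    A toy sketch of how a global view could drive the endpoint-count decision, using an assumed alpha-beta cost model (the constants and function names are illustrative, not from the patent):

        # Model the cost of moving the model's total gradient traffic
        # through n endpoints: per-endpoint setup latency plus transfer.
        def comm_cost(message_bytes, endpoints, latency_us=1.0,
                      bw_bytes_per_us=12_000):
            return endpoints * latency_us + message_bytes / (endpoints * bw_bytes_per_us)

        def choose_endpoints(layer_grad_bytes, max_endpoints=8):
            total = sum(layer_grad_bytes)  # "global view" over all layers
            costs = {n: comm_cost(total, n) for n in range(1, max_endpoints + 1)}
            return min(costs, key=costs.get)

        # Gradient sizes (bytes) for three layers of a hypothetical model:
        print(choose_endpoints([4_000_000, 16_000_000, 1_000_000]))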
  • Patent number: 11068780
    Abstract: Technologies for artificial neural network training include a computing node with a host fabric interface that sends a message that includes one or more artificial neural network training algorithm values to another computing node in response to receipt of a request to send the message. Prior to sending the message, the host fabric interface may receive a request to quantize the message and quantize the message based on a quantization level included in the request to generate a quantized message. The quantized message includes one or more quantized values such that each quantized value has a lower precision than a corresponding artificial neural network training algorithm value. The host fabric interface then transmits the quantized message, which includes metadata indicative of the quantization level, to the other computing node for artificial neural network training. Other embodiments are described and claimed.
    Type: Grant
    Filed: April 1, 2017
    Date of Patent: July 20, 2021
    Assignee: Intel Corporation
    Inventors: Naveen K. Mellempudi, Srinivas Sridharan, Dheevatsa Mudigere, Dipankar Das
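    A minimal sketch of the quantize-with-metadata idea, assuming a simple symmetric integer quantizer (the patent does not specify the scheme; names here are illustrative):

        import numpy as np

        def quantize(values, bits):
            # Map float32 training values to `bits`-bit integers and attach
            # the quantization level as message metadata.
            scale = np.abs(values).max() / (2 ** (bits - 1) - 1) or 1.0
            payload = np.round(values / scale).astype(np.int32)
            return {"level_bits": bits, "scale": float(scale), "payload": payload}

        def dequantize(msg):
            return msg["payload"].astype(np.float32) * msg["scale"]

        grads = np.array([0.031, -0.007, 0.115], dtype=np.float32)
        msg = quantize(grads, bits=8)   # lower precision than the source
        print(msg["level_bits"], dequantize(msg))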
  • Patent number: 11023803
    Abstract: One embodiment provides for a non-transitory machine readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising providing an interface to define a neural network using machine-learning domain specific terminology, wherein the interface enables selection of a neural network topology and abstracts low-level communication details of distributed training of the neural network.
    Type: Grant
    Filed: April 10, 2017
    Date of Patent: June 1, 2021
    Assignee: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
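    A hypothetical sketch of such an interface: layers are declared in ML-domain terms while the distributed-communication setup stays behind the API (all names invented for illustration):

        class Layer:
            def __init__(self, kind, units):
                self.kind, self.units = kind, units

        class DistributedNetwork:
            def __init__(self):
                self.layers = []

            def add(self, kind, units):
                self.layers.append(Layer(kind, units))
                return self

            def compile(self, nodes):
                # The low-level detail hidden from the user: decide here
                # how each layer's gradients are reduced across nodes.
                self.plan = {i: f"{l.kind}: allreduce over {nodes} nodes"
                             for i, l in enumerate(self.layers)}
                return self

        net = DistributedNetwork().add("convolution", 64).add("fully_connected", 10)
        print(net.compile(nodes=4).plan)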
  • Patent number: 11012511
    Abstract: A request for data from a distributed table is received at a network interface controller system. The request for data from the distributed table is identified as a request to be processed by the network interface controller system instead of a processor of a host computer system. The requested data is fetched from a memory of the host computer system via a computer interface of the network interface controller system. The received data is then cached in a cache of the network interface controller system.
    Type: Grant
    Filed: January 14, 2020
    Date of Patent: May 18, 2021
    Assignee: Facebook, Inc.
    Inventor: Srinivas Sridharan
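    A small sketch of the NIC-side path, with an LRU dict standing in for the controller's cache and a plain dict standing in for host memory (assumptions, not the patented design):

        from collections import OrderedDict

        class NicTableCache:
            def __init__(self, host_memory, capacity=2):
                self.host_memory = host_memory   # stands in for the DMA path
                self.cache = OrderedDict()
                self.capacity = capacity

            def handle(self, key):
                if key in self.cache:            # served without the host CPU
                    self.cache.move_to_end(key)
                    return self.cache[key]
                value = self.host_memory[key]    # fetch from host memory
                self.cache[key] = value
                if len(self.cache) > self.capacity:
                    self.cache.popitem(last=False)   # simple LRU eviction
                return value

        table = {"user:7": [0.1, 0.9], "user:9": [0.4, 0.2]}
        nic = NicTableCache(table)
        print(nic.handle("user:7"), nic.handle("user:7"))  # second hit cached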
  • Publication number: 20210110508
    Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a compute unit including a hardware logic unit having dynamic precision fixed-point logic, the compute unit to receive a set of dynamic fixed-point tensors, compute, via the dynamic precision fixed-point logic, a right-shift value using an absolute maximum value within the set of dynamic fixed-point tensors and a dynamic range of the set of dynamic fixed-point tensors, right-shift data values within the set of dynamic fixed-point tensors based on the right-shift value, increment a shared exponent associated with the set of dynamic fixed-point tensors based on the right-shift value, perform a compute operation on the set of dynamic fixed-point tensors, and generate an output tensor via the compute operation on the set of dynamic fixed-point tensors.
    Type: Application
    Filed: October 29, 2020
    Publication date: April 15, 2021
    Applicant: Intel Corporation
    Inventors: Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
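    A numeric sketch of the right-shift and shared-exponent update the abstract walks through, assuming an 8-bit dynamic range (constants illustrative):

        import numpy as np

        def renormalize(mantissas, shared_exp, dynamic_range_bits=8):
            # Derive the right-shift from the absolute maximum and the
            # dynamic range, shift the values, and bump the shared
            # exponent so the represented real values are preserved.
            abs_max = int(np.abs(mantissas).max())
            shift = max(0, abs_max.bit_length() - dynamic_range_bits)
            return mantissas >> shift, shared_exp + shift

        vals = np.array([900, -1200, 300], dtype=np.int32)
        shifted, exp = renormalize(vals, shared_exp=-10)
        print(shifted, exp)  # values still approximate mantissa * 2**exp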
  • Publication number: 20210109888
    Abstract: A technique includes performing a collective operation among multiple nodes of a parallel processing computer system using multiple parallel processing stages. The technique includes regulating an ordering of the parallel processing stages so that an initial stage of the plurality of parallel processing stages is associated with a higher node injection bandwidth than a subsequent stage of the plurality of parallel processing stages.
    Type: Application
    Filed: September 30, 2017
    Publication date: April 15, 2021
    Inventors: Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
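    A toy two-stage all-reduce that follows the abstract's ordering rule, with invented bandwidth numbers (the stage with higher injection bandwidth runs first):

        def staged_allreduce(node_values, group_size, stage_bandwidth):
            # Ordering rule from the abstract: the initial stage has the
            # higher per-node injection bandwidth.
            assert stage_bandwidth[0] >= stage_bandwidth[1]
            # Stage 0: reduce inside each group (many concurrent injections).
            groups = [node_values[i:i + group_size]
                      for i in range(0, len(node_values), group_size)]
            partials = [sum(g) for g in groups]
            # Stage 1: reduce across group leaders (fewer, slower injections).
            total = sum(partials)
            return [total] * len(node_values)

        print(staged_allreduce([1, 2, 3, 4], group_size=2, stage_bandwidth=(4, 1)))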
  • Patent number: 10825127
    Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a compute unit including a hardware logic unit having dynamic precision fixed-point logic, the compute unit to receive a set of dynamic fixed-point tensors, compute, via the dynamic precision fixed-point logic, a right-shift value using an absolute maximum value within the set of dynamic fixed-point tensors and a dynamic range of the set of dynamic fixed-point tensors, right-shift data values within the set of dynamic fixed-point tensors based on the right-shift value, increment a shared exponent associated with the set of dynamic fixed-point tensors based on the right-shift value, perform a compute operation on the set of dynamic fixed-point tensors, and generate an output tensor via the compute operation on the set of dynamic fixed-point tensors.
    Type: Grant
    Filed: April 20, 2020
    Date of Patent: November 3, 2020
    Assignee: Intel Corporation
    Inventors: Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
  • Publication number: 20200265545
    Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a compute unit including a hardware logic unit having dynamic precision fixed-point logic, the compute unit to receive a set of dynamic fixed-point tensors, compute, via the dynamic precision fixed-point logic, a right-shift value using an absolute maximum value within the set of dynamic fixed-point tensors and a dynamic range of the set of dynamic fixed-point tensors, right-shift data values within the set of dynamic fixed-point tensors based on the right-shift value, increment a shared exponent associated with the set of dynamic fixed-point tensors based on the right-shift value, perform a compute operation on the set of dynamic fixed-point tensors, and generate an output tensor via the compute operation on the set of dynamic fixed-point tensors.
    Type: Application
    Filed: April 20, 2020
    Publication date: August 20, 2020
    Applicant: Intel Corporation
    Inventors: Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
  • Patent number: 10652353
    Abstract: Technologies for communication with direct data placement include a number of computing nodes in communication over a network. Each computing node includes a many-core processor having an integrated host fabric interface (HFI) that maintains an association table (AT). In response to receiving a message from a remote device, the HFI determines whether the AT includes an entry associating one or more parameters of the message to a destination processor core. If so, the HFI causes a data transfer agent (DTA) of the destination core to receive the message data. The DTA may place the message data in a private cache of the destination core. Message parameters may include a destination process identifier or other network address and a virtual memory address range. The HFI may automatically update the AT based on communication operations generated by software executed by the processor cores. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 24, 2015
    Date of Patent: May 12, 2020
    Assignee: Intel Corporation
    Inventors: James Dinan, Venkata Krishnan, Srinivas Sridharan, David A. Webb
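    A compact sketch of the association-table lookup, with dicts standing in for the AT and the cores' private caches (all structure names here are assumptions):

        class HostFabricInterface:
            def __init__(self):
                self.assoc_table = {}   # (process id, addr range) -> core
                self.core_caches = {}   # core -> list, a mock private cache

            def learn(self, process_id, addr_start, core):
                # The HFI updates the AT from communication operations
                # issued by software on the cores.
                self.assoc_table[(process_id, addr_start)] = core

            def receive(self, process_id, addr_start, payload):
                core = self.assoc_table.get((process_id, addr_start))
                if core is None:
                    return "fallback: deliver via the shared-memory path"
                # The destination core's data transfer agent places the
                # message data directly in that core's private cache.
                self.core_caches.setdefault(core, []).append(payload)
                return f"placed in core {core} private cache"

        hfi = HostFabricInterface()
        hfi.learn(process_id=42, addr_start=0x7000, core=3)
        print(hfi.receive(42, 0x7000, b"message-data"))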
  • Patent number: 10643297
    Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a compute unit including a hardware logic unit having dynamic precision fixed-point logic; a decode unit to decode an instruction for execution by the compute unit, the instruction to cause the compute unit to perform a matrix arithmetic operation on a set of dynamic fixed-point tensors; and a dynamic precision manager to dynamically adjust the precision of a compute operation performed by the compute unit during the matrix arithmetic operation, the dynamic precision manager to adjust the precision of the compute operation to prevent an arithmetic overflow.
    Type: Grant
    Filed: January 29, 2018
    Date of Patent: May 5, 2020
    Assignee: Intel Corporation
    Inventors: Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
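    A worked sketch of overflow-avoiding precision adjustment for an integer matrix multiply; the pre-shift rule here is an assumption chosen to bound the worst-case accumulator, not the patented logic:

        import numpy as np

        def safe_matmul(a_int, b_int, acc_bits=32):
            # Worst-case accumulator magnitude for an (m,k) @ (k,n) product.
            k = a_int.shape[1]
            worst = int(np.abs(a_int).max()) * int(np.abs(b_int).max()) * k
            # Right-shift one operand just enough that `worst` fits.
            shift = max(0, worst.bit_length() - (acc_bits - 1))
            return (a_int >> shift) @ b_int, shift  # shift feeds the exponent

        a = np.random.randint(-2**14, 2**14, size=(4, 8), dtype=np.int64)
        b = np.random.randint(-2**14, 2**14, size=(8, 4), dtype=np.int64)
        out, shift = safe_matmul(a, b)
        print(out.shape, "pre-shift:", shift)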
  • Publication number: 20200125499
    Abstract: Systems, apparatuses and methods may provide for technology that detects a runtime call to a communication library, wherein the runtime call identifies a memory buffer, determines that a class of service (CLOS) attribute is associated with the memory buffer, and issues a driver instruction to modify the CLOS attribute in response to the runtime call.
    Type: Application
    Filed: December 17, 2019
    Publication date: April 23, 2020
    Inventors: Aravindh Anantaraman, Srinivas Sridharan, Ajaya Durg, Mohammad R. Haghighat, Mikhail E. Smorkalov, Sudarshan Srinivasan
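    A mocked sketch of the detect-and-retag flow; the driver call and buffer ids are invented stand-ins, since the real driver interface is not described here:

        clos_attributes = {}   # buffer id -> CLOS value, mock hardware state

        def driver_set_clos(buffer_id, clos):
            clos_attributes[buffer_id] = clos   # mock driver instruction

        def on_runtime_call(buffer_id, wanted_clos):
            # Detection point: a communication-library call names a buffer.
            if clos_attributes.get(buffer_id) != wanted_clos:
                driver_set_clos(buffer_id, wanted_clos)

        def allreduce(buffer_id, data):
            on_runtime_call(buffer_id, wanted_clos="high")  # intercepted
            return sum(data)                                # placeholder op

        print(allreduce("grad_buf_0", [1, 2, 3]), clos_attributes)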
  • Publication number: 20190205737
    Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises at least one processor to perform operations to implement a neural network and accelerator logic communicatively coupled to the processor to perform compute operations for the neural network.
    Type: Application
    Filed: December 30, 2017
    Publication date: July 4, 2019
    Applicant: Intel Corporation
    Inventors: Amit Bleiweiss, Anavai Ramesh, Asit Mishra, Deborah Marr, Jeffrey Cook, Srinivas Sridharan, Eriko Nurvitadhi, Elmoustapha Ould-Ahmed-Vall, Dheevatsa Mudigere, Mohammad Ashraf Bhuiyan, Md Faijul Amin, Wei Wang, Dhawal Srivastava, Niharika Maheshwari
  • Publication number: 20190205745
    Abstract: Embodiments described herein provide a system to configure distributed training of a neural network, the system comprising memory to store a library to facilitate data transmission during distributed training of the neural network; a network interface to enable transmission and receipt of configuration data associated with a set of worker nodes, the worker nodes configured to perform distributed training of the neural network; and a processor to execute instructions provided by the library, the instructions to cause the processor to create one or more groups of the worker nodes, the one or more groups of worker nodes to be created based on a communication pattern for messages to be transmitted between the worker nodes during distributed training of the neural network.
    Type: Application
    Filed: December 29, 2017
    Publication date: July 4, 2019
    Applicant: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das, Chandrasekaran Sakthivel, Mikhail E. Smorkalov
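    A small sketch of communication-pattern-driven grouping; the patterns and worker records are illustrative, not the library's actual configuration format:

        def make_groups(workers, pattern):
            if pattern == "ring":
                return [workers]   # one group; neighbours talk in a ring
            if pattern == "hierarchical":
                # Group workers by host so intra-host reduction runs first.
                by_host = {}
                for w in workers:
                    by_host.setdefault(w["host"], []).append(w)
                return list(by_host.values())
            raise ValueError(f"unknown pattern: {pattern}")

        workers = [{"rank": r, "host": f"node{r // 2}"} for r in range(4)]
        for g in make_groups(workers, "hierarchical"):
            print([w["rank"] for w in g])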
  • Publication number: 20180322386
    Abstract: One embodiment provides for a system to configure distributed training of a neural network. The system includes memory to store a library to facilitate transmission of data during distributed training of the neural network; a network interface to transmit and receive gradient data associated with the trainable parameters; a general-purpose processor to execute instructions provided by the library, the instructions to cause the general-purpose processor to configure the network interface to transmit and receive the gradient data associated with the trainable parameters during a workflow of a machine learning framework; and a graphics processor to perform compute operations associated with the machine learning framework workflow to generate the gradient data associated with the trainable parameters, wherein, based on the machine learning framework workflow, the library is to interleave the compute operations on the graphics processor with transmission and receipt of gradient data via the network interface.
    Type: Application
    Filed: January 12, 2018
    Publication date: November 8, 2018
    Applicant: Intel Corporation
    Inventors: Srinivas Sridharan, Dheevatsa Mudigere
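    A thread-based sketch of the interleaving: a communication thread drains gradients while the backward pass keeps computing (the queue hand-off is an assumption; the patent interleaves via the library):

        import threading, queue

        grad_queue = queue.Queue()

        def comm_thread():
            while True:
                layer, grads = grad_queue.get()
                if layer is None:
                    break   # sentinel: backward pass finished
                print(f"sending gradients for layer {layer}: {grads}")

        t = threading.Thread(target=comm_thread)
        t.start()
        for layer in reversed(range(3)):    # backward pass, last layer first
            grads = [0.1 * layer]           # placeholder compute
            grad_queue.put((layer, grads))  # send overlaps the next layer
        grad_queue.put((None, None))
        t.join()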
  • Publication number: 20180322606
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising multi-dimensionally partitioning data of a feature map across multiple nodes for distributed training of a convolutional neural network; performing a parallel convolution operation on the multiple partitions to train weight data of the neural network; and exchanging data between nodes to enable computation of halo regions, the halo regions having dependencies on data processed by a different node.
    Type: Application
    Filed: January 12, 2018
    Publication date: November 8, 2018
    Applicant: Intel Corporation
    Inventors: Dipankar Das, Karthikeyan Vaidyanathan, Srinivas Sridharan
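    A NumPy sketch of the halo exchange for a feature map split across two nodes along one dimension; the one-row halo is an assumption matching a 3x3 convolution:

        import numpy as np

        feature_map = np.arange(36, dtype=np.float32).reshape(6, 6)
        top, bottom = feature_map[:3], feature_map[3:]   # 2-node partition

        def with_halo(local, neighbour_row, side):
            # neighbour_row is the boundary row received from the other node
            rows = ([neighbour_row[None, :], local] if side == "top"
                    else [local, neighbour_row[None, :]])
            return np.vstack(rows)

        top_padded = with_halo(top, bottom[0], side="bottom")  # halo below
        bottom_padded = with_halo(bottom, top[-1], side="top") # halo above
        print(top_padded.shape, bottom_padded.shape)  # (4, 6) each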
  • Publication number: 20180322387
    Abstract: One embodiment provides for a system to compute and distribute data for distributed training of a neural network, the system including first memory to store a first set of instructions including a machine learning framework; a fabric interface to enable transmission and receipt of data associated with the set of trainable machine learning parameters; a first set of general-purpose processor cores to execute the first set of instructions, the first set of instructions to provide a training workflow for computation of gradients for the trainable machine learning parameters and to communicate with a second set of instructions, the second set of instructions to facilitate transmission and receipt of the gradients via the fabric interface; and a graphics processor to perform compute operations associated with the training workflow to generate the gradients for the trainable machine learning parameters.
    Type: Application
    Filed: January 12, 2018
    Publication date: November 8, 2018
    Applicant: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das
  • Publication number: 20180322607
    Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a compute unit including a hardware logic unit having dynamic precision fixed-point logic; a decode unit to decode an instruction for execution by the compute unit, the instruction to cause the compute unit to perform a matrix arithmetic operation on a set of dynamic fixed-point tensors; and a dynamic precision manager to dynamically adjust the precision of a compute operation performed by the compute unit during the matrix arithmetic operation, the dynamic precision manager to adjust the precision of the compute operation to prevent an arithmetic overflow.
    Type: Application
    Filed: January 29, 2018
    Publication date: November 8, 2018
    Applicant: Intel Corporation
    Inventors: Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
  • Publication number: 20180293492
    Abstract: One embodiment provides for a non-transitory machine readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising providing an interface to define a neural network using machine-learning domain specific terminology, wherein the interface enables selection of a neural network topology and abstracts low-level communication details of distributed training of the neural network.
    Type: Application
    Filed: April 10, 2017
    Publication date: October 11, 2018
    Applicant: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
  • Publication number: 20180293493
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Application
    Filed: April 10, 2017
    Publication date: October 11, 2018
    Applicant: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das