Patents by Inventor DHIRAJ D. KALAMKAR

DHIRAJ D. KALAMKAR has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240427842
    Abstract: An apparatus to facilitate machine learning matrix processing is disclosed. The apparatus comprises a memory to store matrix data one or more processors to execute an instruction to examine a message descriptor included in the instruction to determine a type of matrix layout manipulation operation that is to be executed, examine a message header included in the instruction having a plurality of parameters that define a two-dimensional (2D) memory surface that is to be retrieved, retrieve one or more blocks of the matrix data from the memory based on the plurality of parameters and a register file including a plurality of registers, wherein the one or more blocks of the matrix data is stored within a first set of the plurality of registers.
    Type: Application
    Filed: May 24, 2024
    Publication date: December 26, 2024
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Fangwen Fu, Dhiraj D. Kalamkar, Sasikanth Avancha
  • Patent number: 12039000
    Abstract: An apparatus to facilitate machine learning matrix processing is disclosed. The apparatus comprises a memory to store matrix data one or more processors to execute an instruction to examine a message descriptor included in the instruction to determine a type of matrix layout manipulation operation that is to be executed, examine a message header included in the instruction having a plurality of parameters that define a two-dimensional (2D) memory surface that is to be retrieved, retrieve one or more blocks of the matrix data from the memory based on the plurality of parameters and a register file including a plurality of registers, wherein the one or more blocks of the matrix data is stored within a first set of the plurality of registers.
    Type: Grant
    Filed: February 2, 2023
    Date of Patent: July 16, 2024
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Fangwen Fu, Dhiraj D. Kalamkar, Sasikanth Avancha
  • Publication number: 20240070799
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Application
    Filed: September 5, 2023
    Publication date: February 29, 2024
    Applicant: Intel Corporation
    Inventors: Dhiraj D. KALAMKAR, Karthikeyan VAIDYANATHAN, Srinivas SRIDHARAN, Dipankar DAS
  • Patent number: 11798120
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Grant
    Filed: August 10, 2021
    Date of Patent: October 24, 2023
    Assignee: INTEL CORPORATION
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
  • Publication number: 20230289399
    Abstract: An apparatus to facilitate machine learning matrix processing is disclosed. The apparatus comprises a memory to store matrix data one or more processors to execute an instruction to examine a message descriptor included in the instruction to determine a type of matrix layout manipulation operation that is to be executed, examine a message header included in the instruction having a plurality of parameters that define a two-dimensional (2D) memory surface that is to be retrieved, retrieve one or more blocks of the matrix data from the memory based on the plurality of parameters and a register file including a plurality of registers, wherein the one or more blocks of the matrix data is stored within a first set of the plurality of registers.
    Type: Application
    Filed: February 2, 2023
    Publication date: September 14, 2023
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Fangwen Fu, Dhiraj D. Kalamkar, Sasikanth Avancha
  • Patent number: 11593454
    Abstract: An apparatus to facilitate machine learning matrix processing is disclosed. The apparatus comprises a memory to store matrix data one or more processors to execute an instruction to examine a message descriptor included in the instruction to determine a type of matrix layout manipulation operation that is to be executed, examine a message header included in the instruction having a plurality of parameters that define a two-dimensional (2D) memory surface that is to be retrieved, retrieve one or more blocks of the matrix data from the memory based on the plurality of parameters and a register file including a plurality of registers, wherein the one or more blocks of the matrix data is stored within a first set of the plurality of registers.
    Type: Grant
    Filed: June 2, 2020
    Date of Patent: February 28, 2023
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Fangwen Fu, Dhiraj D. Kalamkar, Sasikanth Avancha
  • Patent number: 11494163
    Abstract: An apparatus to facilitate a computer number format conversion is disclosed. The apparatus comprises a control unit to receive to receive data format information indicating a first precision data format that input data is to be received and converter hardware to receive the input data and convert the first precision data format to a second precision data format based on the data format information.
    Type: Grant
    Filed: September 6, 2019
    Date of Patent: November 8, 2022
    Assignee: Intel Corporation
    Inventors: Naveen Mellempudi, Dipankar Das, Chunhui Mei, Kristopher Wong, Dhiraj D. Kalamkar, Hong H. Jiang, Subramaniam Maiyuran, Varghese George
  • Publication number: 20220101480
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Application
    Filed: August 10, 2021
    Publication date: March 31, 2022
    Applicant: Intel Corporation
    Inventors: DHIRAJ D. KALAMKAR, KARTHIKEYAN VAIDYANATHAN, SRINIVAS SRIDHARAN, DIPANKAR DAS
  • Publication number: 20210374209
    Abstract: An apparatus to facilitate machine learning matrix processing is disclosed. The apparatus comprises a memory to store matrix data one or more processors to execute an instruction to examine a message descriptor included in the instruction to determine a type of matrix layout manipulation operation that is to be executed, examine a message header included in the instruction having a plurality of parameters that define a two-dimensional (2D) memory surface that is to be retrieved, retrieve one or more blocks of the matrix data from the memory based on the plurality of parameters and a register file including a plurality of registers, wherein the one or more blocks of the matrix data is stored within a first set of the plurality of registers.
    Type: Application
    Filed: June 2, 2020
    Publication date: December 2, 2021
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Fangwen Fu, Dhiraj D. Kalamkar, Sasikanth Avancha
  • Publication number: 20210350212
    Abstract: One embodiment provides for a non-transitory machine readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising providing an interface to define a neural network using machine-learning domain specific terminology, wherein the interface enables selection of a neural network topology and abstracts low-level communication details of distributed training of the neural network.
    Type: Application
    Filed: May 24, 2021
    Publication date: November 11, 2021
    Applicant: Intel Corporation
    Inventors: DHIRAJ D. KALAMKAR, KARTHIKEYAN VAIDYANATHAN, SRINIVAS SRIDHARAN, DIPANKAR DAS
  • Patent number: 11094029
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Grant
    Filed: April 10, 2017
    Date of Patent: August 17, 2021
    Assignee: INTEL CORPORATION
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
  • Patent number: 11023803
    Abstract: One embodiment provides for a non-transitory machine readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising providing an interface to define a neural network using machine-learning domain specific terminology, wherein the interface enables selection of a neural network topology and abstracts low-level communication details of distributed training of the neural network.
    Type: Grant
    Filed: April 10, 2017
    Date of Patent: June 1, 2021
    Assignee: INTEL CORPORATION
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
  • Publication number: 20210072955
    Abstract: An apparatus to facilitate a computer number format conversion is disclosed. The apparatus comprises a control unit to receive to receive data format information indicating a first precision data format that input data is to be received and converter hardware to receive the input data and convert the first precision data format to a second precision data format based on the data format information.
    Type: Application
    Filed: September 6, 2019
    Publication date: March 11, 2021
    Applicant: Intel Corporation
    Inventors: Naveen MELLEMPUDI, Dipankar DAS, Chunhui MEI, Kristopher WONG, Dhiraj D. KALAMKAR, Hong H. JIANG, Subramaniam Maiyuran, Varghese George
  • Patent number: 10775873
    Abstract: In an embodiment, a processor includes: a plurality of first cores to independently execute instructions, each of the plurality of first cores including a plurality of counters to store performance information; at least one second core to perform memory operations; and a power controller to receive performance information from at least some of the plurality of counters, determine a workload type executed on the processor based at least in part on the performance information, and based on the workload type dynamically migrate one or more threads from one or more of the plurality of first cores to the at least one second core for execution during a next operation interval. Other embodiments are described and claimed.
    Type: Grant
    Filed: February 28, 2019
    Date of Patent: September 15, 2020
    Assignee: Intel Corporation
    Inventors: Victor W. Lee, Edward T. Grochowski, Daehyun Kim, Yuxin Bai, Sheng Li, Naveen K. Mellempudi, Dhiraj D. Kalamkar
  • Publication number: 20190265777
    Abstract: In an embodiment, a processor includes: a plurality of first cores to independently execute instructions, each of the plurality of first cores including a plurality of counters to store performance information; at least one second core to perform memory operations; and a power controller to receive performance information from at least some of the plurality of counters, determine a workload type executed on the processor based at least in part on the performance information, and based on the workload type dynamically migrate one or more threads from one or more of the plurality of first cores to the at least one second core for execution during a next operation interval. Other embodiments are described and claimed.
    Type: Application
    Filed: February 28, 2019
    Publication date: August 29, 2019
    Inventors: Victor W. Lee, Edward T. Grochowski, Daehyun Kim, Yuxin Bai, Sheng Li, Naveen K. Mellempudi, Dhiraj D. Kalamkar
  • Patent number: 10234930
    Abstract: In an embodiment, a processor includes: a plurality of first cores to independently execute instructions, each of the plurality of first cores including a plurality of counters to store performance information; at least one second core to perform memory operations; and a power controller to receive performance information from at least some of the plurality of counters, determine a workload type executed on the processor based at least in part on the performance information, and based on the workload type dynamically migrate one or more threads from one or more of the plurality of first cores to the at least one second core for execution during a next operation interval. Other embodiments are described and claimed.
    Type: Grant
    Filed: February 13, 2015
    Date of Patent: March 19, 2019
    Assignee: Intel Corporation
    Inventors: Victor W. Lee, Edward T. Grochowski, Daehyun Kim, Yuxin Bai, Sheng Li, Naveen K. Mellempudi, Dhiraj D. Kalamkar
  • Publication number: 20180293493
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Application
    Filed: April 10, 2017
    Publication date: October 11, 2018
    Applicant: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, KARTHIKEYAN VAIDYANATHAN, SRINIVAS SRIDHARAN, DIPANKAR DAS
  • Publication number: 20180293492
    Abstract: One embodiment provides for a non-transitory machine readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising providing an interface to define a neural network using machine-learning domain specific terminology, wherein the interface enables selection of a neural network topology and abstracts low-level communication details of distributed training of the neural network.
    Type: Application
    Filed: April 10, 2017
    Publication date: October 11, 2018
    Applicant: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, KARTHIKEYAN VAIDYANATHAN, SRINIVAS SRIDHARAN, DIPANKAR DAS
  • Patent number: 9910481
    Abstract: In an embodiment, a processor a plurality of cores to independently execute instructions, the cores including a plurality of counters to store performance information, and a power controller coupled to the plurality of cores, the power controller having a logic to receive performance information from at least some of the plurality of counters, determine a number of cores to be active and a performance state for the number of cores for a next operation interval, based at least in part on the performance information and model information, and cause the number of cores to be active during the next operation interval, the performance information associated with execution of a workload on one or more of the plurality of cores. Other embodiments are described and claimed.
    Type: Grant
    Filed: February 13, 2015
    Date of Patent: March 6, 2018
    Assignee: Intel Corporation
    Inventors: Victor W. Lee, Daehyun Kim, Yuxin Bai, Shihao Ji, Sheng Li, Dhiraj D. Kalamkar, Naveen K. Mellempudi
  • Publication number: 20160239065
    Abstract: In an embodiment, a processor a plurality of cores to independently execute instructions, the cores including a plurality of counters to store performance information, and a power controller coupled to the plurality of cores, the power controller having a logic to receive performance information from at least some of the plurality of counters, determine a number of cores to be active and a performance state for the number of cores for a next operation interval, based at least in part on the performance information and model information, and cause the number of cores to be active during the next operation interval, the performance information associated with execution of a workload on one or more of the plurality of cores. Other embodiments are described and claimed.
    Type: Application
    Filed: February 13, 2015
    Publication date: August 18, 2016
    Inventors: VICTOR W. LEE, DAEHYUN KIM, YUXIN BAI, SHIHAO JI, SHENG LI, DHIRAJ D. KALAMKAR, NAVEEN K. MELLEMPUDI