Patents by Inventor Dipankar Das

Dipankar Das has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12106210
    Abstract: One embodiment provides, for a machine-learning accelerator device, a multiprocessor to execute parallel threads of an instruction stream, the multiprocessor including a compute unit with a set of functional units, each functional unit to execute at least one of the parallel threads. The compute unit includes compute logic configured to execute a single instruction that scales an input tensor associated with a layer of a neural network by a scale factor; the input tensor is stored in a floating-point data type, and the scaling shifts the tensor's data distribution so it can be represented by a 16-bit floating-point data type. (A sketch of the idea follows this entry.)
    Type: Grant
    Filed: August 25, 2023
    Date of Patent: October 1, 2024
    Assignee: Intel Corporation
    Inventors: Naveen Mellempudi, Dipankar Das
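
The abstract above describes scaling a floating-point tensor so that its value distribution survives a narrowing conversion to a 16-bit floating-point type. A minimal NumPy sketch of that idea, with np.float16 standing in for the 16-bit type and the scale factor 2**16 chosen purely for illustration:

```python
import numpy as np

def scale_to_half(tensor: np.ndarray, scale: float) -> np.ndarray:
    """Scale an fp32 tensor before narrowing, so small values do not
    underflow in the 16-bit type."""
    return (tensor.astype(np.float32) * scale).astype(np.float16)

# Values this small mostly flush to zero when converted to float16 directly.
grads = np.random.normal(0.0, 3e-8, size=1024).astype(np.float32)
print(np.count_nonzero(grads.astype(np.float16)))       # many zeros
print(np.count_nonzero(scale_to_half(grads, 2.0**16)))  # distribution preserved
# After the 16-bit computation, results are divided by the same scale
# factor to recover the original magnitudes.
```
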
  • Publication number: 20240320000
    Abstract: An apparatus to facilitate utilizing structured sparsity in systolic arrays is disclosed. The apparatus includes a processor comprising a systolic array to receive data from a plurality of source registers, the data comprising unpacked source data, structured source data that is packed based on sparsity, and metadata corresponding to the structured source data; identify portions of the unpacked source data to multiply with the structured source data, the portions of the unpacked source data identified based on the metadata; and output, to a destination register, a result of multiplication of the portions of the unpacked source data and the structured source data.
    Type: Application
    Filed: March 29, 2024
    Publication date: September 26, 2024
    Applicant: Intel Corporation
    Inventors: Subramaniam Maiyuran, Jorge Parra, Ashutosh Garg, Chandra Gurram, Chunhui Mei, Durgesh Borkar, Shubra Marwaha, Supratim Pal, Varghese George, Wei Xiong, Yan Li, Yongsheng Liu, Dipankar Das, Sasikanth Avancha, Dharma Teja Vooturi, Naveen K. Mellempudi
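
The systolic-array mechanism above packs sparse source data and uses metadata to select the matching portions of the unpacked operand. A hedged NumPy sketch of the packing and the metadata-driven multiply, assuming a 2:4 sparsity pattern (two nonzeros kept per group of four, a common structured-sparsity layout that the abstract does not itself specify):

```python
import numpy as np

def pack_2to4(dense: np.ndarray):
    """Pack a 2:4-sparse vector: keep the 2 largest-magnitude values in each
    group of 4, recording their in-group indices as metadata."""
    groups = dense.reshape(-1, 4)
    meta = np.argsort(-np.abs(groups), axis=1)[:, :2]
    meta.sort(axis=1)
    packed = np.take_along_axis(groups, meta, axis=1)
    return packed, meta

def sparse_dot(packed, meta, unpacked: np.ndarray) -> float:
    """Multiply the packed source with the metadata-selected portions of the
    unpacked source, as a processing element of the array would."""
    groups = unpacked.reshape(-1, 4)
    selected = np.take_along_axis(groups, meta, axis=1)
    return float((packed * selected).sum())

a = np.array([0.0, 3.0, 0.0, -1.0, 2.0, 0.0, 0.5, 0.0])
b = np.arange(8, dtype=np.float64)
packed, meta = pack_2to4(a)
assert np.isclose(sparse_dot(packed, meta, b), a @ b)  # matches dense dot
```
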
  • Patent number: 12099184
    Abstract: A system for detecting and cleaning contaminants from an imaging optical path, comprising an imaging device configured to receive a slide and capture a first slide image, at least a computing device configured to determine a contaminant presence indicator associated with a contaminant within an optical path of the imaging device based on the first slide image and execute a contaminant cleaning protocol as a function of the contaminant presence indicator, a contaminant removal mechanism configured to remove the contaminant from the optical path according to the contaminant cleaning protocol, wherein the computing device is further configured to re-evaluate the contaminant presence indicator based on a second slide image of the slide captured using the imaging device and request a user input upon a positive re-evaluation of the contaminant presence indicator.
    Type: Grant
    Filed: October 20, 2023
    Date of Patent: September 24, 2024
    Inventors: Prasanth Perugupalli, Ajay Chadha, Vinothkumar Anbalagan, Rohan Prateek, Shilpa G Krishna, Dipankar Das, Somesh Singh
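
The entry above describes a detect, clean, re-image, re-evaluate loop with escalation to the user. A control-flow sketch under stated assumptions: the capture, detection, cleaning, and user-prompt callables are hypothetical placeholders, not the patented implementation, and the retry limit is invented for illustration.

```python
def clean_optical_path(capture_image, detect_contaminant, run_cleaning,
                       request_user_input, max_attempts: int = 3) -> bool:
    """Detect -> clean -> re-image -> re-evaluate; escalate to the user if
    the contaminant indicator stays positive."""
    image = capture_image()                  # first slide image
    if not detect_contaminant(image):
        return True                          # optical path already clean
    for _ in range(max_attempts):
        run_cleaning()                       # contaminant removal mechanism
        image = capture_image()              # second slide image
        if not detect_contaminant(image):
            return True                      # re-evaluation negative
    request_user_input()                     # positive re-evaluation
    return False
```
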
  • Patent number: 12033237
    Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to convert the elements of a floating-point tensor, thereby converting the floating-point tensor into a fixed-point tensor. (A sketch of the conversion follows this entry.)
    Type: Grant
    Filed: April 24, 2023
    Date of Patent: July 9, 2024
    Assignee: Intel Corporation
    Inventors: Naveen K. Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
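
A minimal software analogue of the dynamic-precision conversion above, assuming the fractional bit count is chosen from the tensor's observed range; the 16-bit total width and the range heuristic are illustrative assumptions, valid for tensors whose magnitudes fit the chosen width:

```python
import numpy as np

def to_dynamic_fixed_point(tensor: np.ndarray, total_bits: int = 16):
    """Choose fractional bits from the tensor's observed range, then
    quantize to integers (the fixed-point representation)."""
    max_abs = float(np.max(np.abs(tensor))) or 1.0
    int_bits = max(0, int(np.ceil(np.log2(max_abs + 1e-12))))
    frac_bits = total_bits - 1 - int_bits        # one bit reserved for sign
    lim = 1 << (total_bits - 1)
    q = np.clip(np.round(tensor * (1 << frac_bits)), -lim, lim - 1)
    return q.astype(np.int16), frac_bits

def from_fixed_point(q: np.ndarray, frac_bits: int) -> np.ndarray:
    return q.astype(np.float32) / (1 << frac_bits)

x = np.random.uniform(-4.0, 4.0, size=8).astype(np.float32)
q, fb = to_dynamic_fixed_point(x)
print(np.max(np.abs(from_fixed_point(q, fb) - x)))  # small quantization error
```
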
  • Publication number: 20240160931
    Abstract: One embodiment provides for a computer-readable medium storing instructions that cause one or more processors to perform operations comprising determining a per-layer scale factor to apply to tensor data associated with layers of a neural network model and converting the tensor data to converted tensor data. The tensor data may be converted from a floating point datatype to a second datatype that is an 8-bit datatype. The instructions further cause the one or more processors to generate an output tensor based on the converted tensor data and the per-layer scale factor.
    Type: Application
    Filed: December 7, 2023
    Publication date: May 16, 2024
    Applicant: Intel Corporation
    Inventors: Abhisek Kundu, Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das
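
The per-layer scale-factor scheme above maps floating-point tensor data onto an 8-bit datatype and recovers the output tensor via the scales. A minimal NumPy sketch, assuming symmetric max-based scaling and np.int8 as the 8-bit type (both illustrative choices):

```python
import numpy as np

def quantize_layer(tensor: np.ndarray):
    """Compute one scale factor for the whole layer and map it onto int8."""
    scale = 127.0 / max(float(np.max(np.abs(tensor))), 1e-12)
    q = np.clip(np.round(tensor * scale), -127, 127).astype(np.int8)
    return q, scale

w = np.random.normal(0.0, 0.1, size=(4, 4)).astype(np.float32)
x = np.random.normal(0.0, 1.0, size=4).astype(np.float32)
qw, sw = quantize_layer(w)
qx, sx = quantize_layer(x)
# Integer matmul; the output tensor is recovered using the per-layer scales.
y = (qw.astype(np.int32) @ qx.astype(np.int32)).astype(np.float32) / (sw * sx)
print(np.max(np.abs(y - w @ x)))  # small quantization error
```
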
  • Patent number: 11977885
    Abstract: An apparatus to facilitate utilizing structured sparsity in systolic arrays is disclosed. The apparatus includes a processor comprising a systolic array to receive data from a plurality of source registers, the data comprising unpacked source data, structured source data that is packed based on sparsity, and metadata corresponding to the structured source data; identify portions of the unpacked source data to multiply with the structured source data, the portions of the unpacked source data identified based on the metadata; and output, to a destination register, a result of multiplication of the portions of the unpacked source data and the structured source data.
    Type: Grant
    Filed: November 30, 2020
    Date of Patent: May 7, 2024
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Jorge Parra, Ashutosh Garg, Chandra Gurram, Chunhui Mei, Durgesh Borkar, Shubra Marwaha, Supratim Pal, Varghese George, Wei Xiong, Yan Li, Yongsheng Liu, Dipankar Das, Sasikanth Avancha, Dharma Teja Vooturi, Naveen K. Mellempudi
  • Publication number: 20240126544
    Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
    Type: Application
    Filed: December 28, 2023
    Publication date: April 18, 2024
    Inventors: Dipankar Das, Naveen K. Mellempudi, Mrinmay Dutta, Arun Kumar, Dheevatsa Mudigere, Abhisek Kundu
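
The asymmetric FMA above multiplies narrow packed elements of the second source against wider elements of the first and accumulates into the destination. A behavioral sketch in plain Python, assuming unsigned little-endian packing (the abstract does not fix signedness or endianness):

```python
def asymmetric_fma(dest: int, src1: list[int], src2_bits: int,
                   elem_width: int, lane_width: int = 32) -> int:
    """src2_bits holds packed unsigned elements of `elem_width` bits; process
    as many as fit in one SIMD lane."""
    n_elems = lane_width // elem_width          # elements per SIMD lane
    mask = (1 << elem_width) - 1
    acc = dest                                  # previous destination contents
    for i in range(n_elems):
        e2 = (src2_bits >> (i * elem_width)) & mask   # unpack element i
        acc += src1[i] * e2                           # multiply-accumulate
    return acc

# Eight 4-bit elements [1..8] packed little-endian into a 32-bit lane:
packed = sum((v & 0xF) << (4 * i) for i, v in enumerate(range(1, 9)))
print(asymmetric_fma(0, [1] * 8, packed, elem_width=4))  # 36 = 1+2+...+8
```
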
  • Publication number: 20240118892
    Abstract: Methods and apparatuses relating to processing neural networks are described. In one embodiment, an apparatus to process a neural network includes a plurality of fully connected layer chips coupled by an interconnect; a plurality of convolutional layer chips each coupled by an interconnect to a respective fully connected layer chip of the plurality of fully connected layer chips and each of the plurality of fully connected layer chips and the plurality of convolutional layer chips including an interconnect to couple each of a forward propagation compute intensive tile, a back propagation compute intensive tile, and a weight gradient compute intensive tile of a column of compute intensive tiles between a first memory intensive tile and a second memory intensive tile.
    Type: Application
    Filed: December 18, 2023
    Publication date: April 11, 2024
    Inventors: Swagath Venkataramani, Dipankar Das, Ashish Ranjan, Subarno Banerjee, Sasikanth Avancha, Ashok Jagannathan, Ajaya V. Durg, Dheemanth Nagaraj, Bharat Kaul, Anand Raghunathan
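
The organization above places forward-propagation, back-propagation, and weight-gradient compute-intensive tiles in a column between two memory-intensive tiles, with convolutional-layer chips coupled to fully connected-layer chips. A purely structural sketch of that topology; every name here is a hypothetical label, not hardware detail from the application:

```python
from dataclasses import dataclass, field

@dataclass
class TileColumn:
    memory_tile_top: str = "MEM"
    compute_tiles: tuple = ("FP", "BP", "WG")   # fwd, back, weight-gradient
    memory_tile_bottom: str = "MEM"

@dataclass
class LayerChip:
    kind: str                                   # "conv" or "fully_connected"
    columns: list = field(default_factory=lambda: [TileColumn()])

# Each convolutional-layer chip couples to a fully connected-layer chip over
# an interconnect; here the coupling is just recorded as a pairing.
chips = [(LayerChip("conv"), LayerChip("fully_connected")) for _ in range(4)]
```
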
  • Publication number: 20240070799
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Application
    Filed: September 5, 2023
    Publication date: February 29, 2024
    Applicant: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
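
The method above builds a global view of the communication operations implied by a machine-learning model, costs them, and picks an endpoint count automatically. A toy sketch using an alpha-beta cost model; the constants, the cost formula, and the brute-force search are illustrative assumptions rather than the patented procedure:

```python
def allreduce_cost(message_bytes: int, endpoints: int,
                   alpha: float = 1e-6, beta: float = 1e-9) -> float:
    """Per-endpoint latency (alpha) plus per-byte time (beta) split across
    endpoints that transmit in parallel."""
    return endpoints * alpha + (message_bytes / endpoints) * beta

def pick_endpoints(layer_sizes_bytes, max_endpoints: int = 16) -> int:
    """Global view: total gradient traffic across all layers; choose the
    endpoint count that minimizes the modeled cost."""
    total = sum(layer_sizes_bytes)
    return min(range(1, max_endpoints + 1),
               key=lambda n: allreduce_cost(total, n))

print(pick_endpoints([4 * 1_000_000, 4 * 250_000]))  # fp32 grads, two layers
```
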
  • Patent number: 11900107
    Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
    Type: Grant
    Filed: March 25, 2022
    Date of Patent: February 13, 2024
    Assignee: Intel Corporation
    Inventors: Dipankar Das, Naveen K. Mellempudi, Mrinmay Dutta, Arun Kumar, Dheevatsa Mudigere, Abhisek Kundu
  • Patent number: 11893490
    Abstract: One embodiment provides for a computer-readable medium storing instructions that cause one or more processors to perform operations comprising determining a per-layer scale factor to apply to tensor data associated with layers of a neural network model and converting the tensor data to converted tensor data. The tensor data may be converted from a floating point datatype to a second datatype that is an 8-bit datatype. The instructions further cause the one or more processors to generate an output tensor based on the converted tensor data and the per-layer scale factor.
    Type: Grant
    Filed: November 30, 2022
    Date of Patent: February 6, 2024
    Assignee: Intel Corporation
    Inventors: Abhisek Kundu, Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das
  • Publication number: 20230409891
    Abstract: One embodiment provides, for a machine-learning accelerator device, a multiprocessor to execute parallel threads of an instruction stream, the multiprocessor including a compute unit with a set of functional units, each functional unit to execute at least one of the parallel threads. The compute unit includes compute logic configured to execute a single instruction that scales an input tensor associated with a layer of a neural network by a scale factor; the input tensor is stored in a floating-point data type, and the scaling shifts the tensor's data distribution so it can be represented by a 16-bit floating-point data type. (This application corresponds to granted patent 12106210 above; see the sketch following that entry.)
    Type: Application
    Filed: August 25, 2023
    Publication date: December 21, 2023
    Applicant: Intel Corporation
    Inventors: Naveen Mellempudi, Dipankar Das
  • Publication number: 20230376762
    Abstract: Embodiments described herein provide an apparatus comprising an interconnect switch configured to couple with a plurality of graphics processors via a plurality of point-to-point interconnects and one or more processors including a graphics processor coupled with the interconnect switch via a point-to-point interconnect of the plurality of point-to-point interconnects.
    Type: Application
    Filed: May 19, 2023
    Publication date: November 23, 2023
    Applicant: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das, Chandrasekaran Sakthivel, Mikhail E. Smorkalov
  • Patent number: 11823034
    Abstract: A graphics processor is described that includes a multiprocessor with a single instruction, multiple thread (SIMT) architecture and hardware multithreading. The multiprocessor can execute parallel threads of instructions associated with a command stream, and includes a set of functional units to execute at least one of the parallel threads. The set of functional units can include a mixed precision tensor processor to perform tensor computations to generate loss data. The loss data is stored as a first floating-point data type and scaled by a scaling factor to enable the data distribution of a gradient tensor generated from the loss data to be represented by a second floating-point data type. (A loss-scaling sketch follows this entry.)
    Type: Grant
    Filed: October 6, 2022
    Date of Patent: November 21, 2023
    Assignee: Intel Corporation
    Inventors: Naveen Mellempudi, Dipankar Das
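
A minimal NumPy sketch of the loss-scaling idea above, with np.float16 standing in for the second floating-point type and 2**14 as an arbitrary illustrative factor:

```python
import numpy as np

# Gradients near float16's subnormal floor (~6e-8) degrade or flush to zero.
loss_grad = np.full(8, 1e-7, dtype=np.float32)
plain = loss_grad.astype(np.float16)               # coarse / lossy
scaled = (loss_grad * 2.0**14).astype(np.float16)  # distribution shifted up
recovered = scaled.astype(np.float32) / 2.0**14    # un-scale after backprop
print(plain.astype(np.float32))   # quantized to the subnormal grid
print(recovered)                  # magnitudes preserved
```
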
  • Publication number: 20230351542
    Abstract: One embodiment provides for a graphics processing unit to perform computations associated with a neural network, the graphics processing unit comprising a hardware processing unit having a dynamic precision fixed-point unit that is configurable to convert the elements of a floating-point tensor, thereby converting the floating-point tensor into a fixed-point tensor. (This application corresponds to granted patent 12033237 above; see the sketch following that entry.)
    Type: Application
    Filed: April 24, 2023
    Publication date: November 2, 2023
    Applicant: Intel Corporation
    Inventors: Naveen K. Mellempudi, Dheevatsa Mudigere, Dipankar Das, Srinivas Sridharan
  • Patent number: 11798120
    Abstract: One embodiment provides for a method of transmitting data between multiple compute nodes of a distributed compute system, the method comprising creating a global view of communication operations to be performed between the multiple compute nodes of the distributed compute system, the global view created using information specific to a machine learning model associated with the distributed compute system; using the global view to determine a communication cost of the communication operations; and automatically determining a number of network endpoints for use in transmitting the data between the multiple compute nodes of the distributed compute system.
    Type: Grant
    Filed: August 10, 2021
    Date of Patent: October 24, 2023
    Assignee: Intel Corporation
    Inventors: Dhiraj D. Kalamkar, Karthikeyan Vaidyanathan, Srinivas Sridharan, Dipankar Das
  • Patent number: 11768681
    Abstract: An apparatus and method for performing multiply-accumulate operations.
    Type: Grant
    Filed: January 24, 2018
    Date of Patent: September 26, 2023
    Assignee: Intel Corporation
    Inventors: Alexander Heinecke, Dipankar Das, Robert Valentine, Mark Charney
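
Since the abstract above is terse, here is a scalar reference model of the multiply-accumulate operation itself; it illustrates the arithmetic only, not the patented apparatus:

```python
def multiply_accumulate(acc: float, a: list[float], b: list[float]) -> float:
    """Accumulate the products of paired elements onto an existing value,
    as a vector MAC instruction does per lane."""
    for x, y in zip(a, b):
        acc += x * y
    return acc

print(multiply_accumulate(10.0, [1.0, 2.0], [3.0, 4.0]))  # 10 + 3 + 8 = 21.0
```
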
  • Patent number: 11704565
    Abstract: Embodiments described herein provide a system to configure distributed training of a neural network, the system comprising memory to store a library to facilitate data transmission during distributed training of the neural network; a network interface to enable transmission and receipt of configuration data associated with a set of worker nodes, the worker nodes configured to perform distributed training of the neural network; and a processor to execute instructions provided by the library. The instructions cause the processor to create one or more groups of the worker nodes, the one or more groups of worker nodes to be created based on a communication pattern for messages to be transmitted between the worker nodes during distributed training of the neural network. The processor can transparently adjust communication paths between worker nodes based on the communication pattern.
    Type: Grant
    Filed: March 3, 2022
    Date of Patent: July 18, 2023
    Assignee: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das, Chandrasekaran Sakthivel, Mikhail E. Smorkalov
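
The library above creates groups of worker nodes from the communication pattern of messages exchanged during distributed training. A hypothetical grouping sketch; the pattern names and the group-size heuristic are invented for illustration:

```python
def create_groups(workers: list[int], pattern: str, group_size: int = 4):
    """Partition workers into groups sized for the given message pattern."""
    if pattern == "allreduce_ring":
        size = len(workers)            # one ring spanning every worker
    else:                              # e.g. hierarchical / tree patterns
        size = group_size
    return [workers[i:i + size] for i in range(0, len(workers), size)]

print(create_groups(list(range(8)), "hierarchical"))  # [[0..3], [4..7]]
```
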
  • Patent number: 11681529
    Abstract: Systems, methods, and apparatuses relating to access synchronization in a shared memory are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction, and an execution unit to execute the decoded instruction to: receive a first input operand of a memory address to be tracked and a second input operand of an allowed sequence of memory accesses to the memory address, and cause a block of a memory access that violates the allowed sequence of memory accesses to the memory address. In one embodiment, a circuit separate from the execution unit compares a memory address for a memory access request to one or more memory addresses in a tracking table, and blocks a memory access for the memory access request when a type of access violates a corresponding allowed sequence of memory accesses to the memory address for the memory access request.
    Type: Grant
    Filed: August 24, 2021
    Date of Patent: June 20, 2023
    Assignee: Intel Corporation
    Inventors: Swagath Venkataramani, Dipankar Das, Sasikanth Avancha, Ashish Ranjan, Subarno Banerjee, Bharat Kaul, Anand Raghunathan
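
The instruction above takes a tracked memory address and an allowed sequence of accesses, and accesses that violate the sequence are blocked. A behavioral Python sketch of that tracking-table check (the class, method names, and access-type strings are hypothetical):

```python
class AccessTracker:
    def __init__(self):
        self.table = {}        # address -> (allowed sequence, position)

    def track(self, addr: int, allowed: list[str]):
        """The decoded instruction's two operands: address + sequence."""
        self.table[addr] = (allowed, 0)

    def access(self, addr: int, kind: str) -> bool:
        """Return True if the access proceeds, False if it is blocked."""
        if addr not in self.table:
            return True                        # untracked: always allowed
        allowed, pos = self.table[addr]
        if pos < len(allowed) and allowed[pos] == kind:
            self.table[addr] = (allowed, pos + 1)
            return True
        return False                           # violates the sequence: block

t = AccessTracker()
t.track(0x1000, ["write", "read"])     # enforce write-then-read
print(t.access(0x1000, "read"))        # False: read before the write
print(t.access(0x1000, "write"))       # True
print(t.access(0x1000, "read"))        # True
```
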
  • Publication number: 20230177328
    Abstract: One embodiment provides for a graphics processing unit including a fabric interface configured to transmit gradient data stored in a memory device of the graphics processing unit according to a pre-defined communication operation. The memory device is a physical memory device shared with a compute block of the graphics processing unit and the fabric interface. The fabric interface automatically transmits the gradient data stored in memory to a second distributed training node based on an address of the gradient data in the memory device.
    Type: Application
    Filed: October 25, 2022
    Publication date: June 8, 2023
    Applicant: Intel Corporation
    Inventors: Srinivas Sridharan, Karthikeyan Vaidyanathan, Dipankar Das
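
The fabric interface above transmits gradient data automatically, keyed on the data's address in the shared memory device. A software analogy under stated assumptions: the watched address ranges and the send callback are hypothetical placeholders for the hardware mechanism.

```python
class FabricInterface:
    def __init__(self, send, registered_ranges):
        self.send = send                       # e.g. a put to the peer node
        self.ranges = registered_ranges        # gradient regions to watch

    def on_write(self, addr: int, data: bytes):
        for lo, hi in self.ranges:
            if lo <= addr < hi:                # gradient address recognized
                self.send(addr, data)          # transmit without host action
                return

sent = []
fabric = FabricInterface(lambda a, d: sent.append((a, d)), [(0x0, 0x1000)])
fabric.on_write(0x40, b"\x00" * 64)            # lands in the watched range
assert sent                                    # transmitted to the second node
```
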