Patents by Inventor Bharat Daga

Bharat Daga has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240078453
    Abstract: Embodiments described herein provide a processing apparatus comprising compute circuitry to generate neural network data for a convolutional neural network (CNN) and write the neural network data to a memory buffer. The apparatus additionally includes a direct memory access (DMA) controller with a hardware codec comprising encode circuitry and decode circuitry. The DMA controller reads the neural network data from the memory buffer, encodes it via the encode circuitry, writes the encoded neural network data and its metadata to a memory device coupled with the processing apparatus, and decodes the encoded neural network data via the decode circuitry in response to a request from the compute circuitry. (An illustrative code sketch follows this entry.)
    Type: Application
    Filed: September 14, 2023
    Publication date: March 7, 2024
    Applicant: Intel Corporation
    Inventors: Ajit Singh, Bharat Daga, Michael Behar
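
The entry above (and its granted counterparts below) describes an in-line hardware codec in the DMA path that compresses CNN data on the way to memory, stores metadata beside it, and decompresses on demand. As a rough software analogy only — the sparsity-bitmap scheme, metadata fields, and the `DmaCodecModel` class are illustrative assumptions, not details taken from the patent:

```python
# Software analogy of the DMA-path codec described above. The bitmap
# scheme and metadata layout are invented for illustration only.

def encode_bitmap(values):
    """Sparsity-bitmap encoding: one mask bit per element plus the
    nonzero values, a common trick for post-ReLU CNN activations."""
    mask, nonzeros = 0, []
    for i, v in enumerate(values):
        if v != 0:
            mask |= 1 << i
            nonzeros.append(v)
    return mask, nonzeros

def decode_bitmap(mask, nonzeros, length):
    """Rebuild the original values from the mask and nonzero list."""
    it = iter(nonzeros)
    return [next(it) if mask & (1 << i) else 0 for i in range(length)]

class DmaCodecModel:
    """Stands in for the DMA controller: writes land in 'memory' encoded,
    with metadata alongside; reads decode on demand."""

    def __init__(self):
        self.memory = {}

    def write(self, addr, values):
        mask, nonzeros = encode_bitmap(values)
        self.memory[addr] = {
            "data": (mask, nonzeros),
            "meta": {"scheme": "bitmap", "decoded_len": len(values)},
        }

    def read(self, addr):
        entry = self.memory[addr]
        mask, nonzeros = entry["data"]
        return decode_bitmap(mask, nonzeros, entry["meta"]["decoded_len"])

dma = DmaCodecModel()
dma.write(0x1000, [0, 0, 5, 0, 7, 0, 0, 0])
assert dma.read(0x1000) == [0, 0, 5, 0, 7, 0, 0, 0]
```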
  • Patent number: 11763183
    Abstract: Embodiments described herein provide a processing apparatus comprising compute circuitry to generate neural network data for a convolutional neural network (CNN) and write the neural network data to a memory buffer. The apparatus additionally includes a direct memory access (DMA) controller with a hardware codec comprising encode circuitry and decode circuitry. The DMA controller reads the neural network data from the memory buffer, encodes it via the encode circuitry, writes the encoded neural network data and its metadata to a memory device coupled with the processing apparatus, and decodes the encoded neural network data via the decode circuitry in response to a request from the compute circuitry.
    Type: Grant
    Filed: July 30, 2021
    Date of Patent: September 19, 2023
    Assignee: Intel Corporation
    Inventors: Ajit Singh, Bharat Daga, Michael Behar
  • Patent number: 11640537
    Abstract: An apparatus to facilitate execution of non-linear function operations is disclosed. The apparatus comprises accelerator circuitry including a compute grid having a plurality of processing elements to execute neural network computations, store values resulting from the neural network computations, and perform piecewise linear (PWL) approximations of one or more non-linear functions using the stored values as input data. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: May 2, 2023
    Assignee: Intel Corporation
    Inventors: Bharat Daga, Krishnakumar Nair, Pradeep Janedula, Aravind Babu Srinivasan, Bijoy Pazhanimala, Ambili Vengallur
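
As a numerical illustration of the PWL idea in the abstract above, the sketch below approximates a sigmoid with precomputed per-segment slopes and intercepts, so each evaluation is a table lookup plus one multiply-add. The breakpoints are an arbitrary assumption; the patent's segment tables and fixed-point details are not public in the abstract:

```python
import math

def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Arbitrary breakpoints chosen for illustration; a real design would pick
# these (and fixed-point formats) to meet an accuracy target.
BREAKPOINTS = [-6.0, -3.0, -1.0, 0.0, 1.0, 3.0, 6.0]

# Precompute slope m and intercept b for each segment of the exact curve.
SEGMENTS = []
for x0, x1 in zip(BREAKPOINTS, BREAKPOINTS[1:]):
    m = (_sigmoid(x1) - _sigmoid(x0)) / (x1 - x0)
    b = _sigmoid(x0) - m * x0
    SEGMENTS.append((x0, x1, m, b))

def pwl_sigmoid(x):
    """One multiply-add per evaluation, as a PWL hardware unit would do."""
    if x <= BREAKPOINTS[0]:
        return _sigmoid(BREAKPOINTS[0])   # clamp below the table
    if x >= BREAKPOINTS[-1]:
        return _sigmoid(BREAKPOINTS[-1])  # clamp above the table
    for x0, x1, m, b in SEGMENTS:
        if x0 <= x < x1:
            return m * x + b

print(pwl_sigmoid(0.5), _sigmoid(0.5))   # close, within the PWL error
```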
  • Patent number: 11544191
    Abstract: Hardware accelerators for grouped convolution operations. A first buffer of a hardware accelerator may receive a first row of an input feature map (IFM) from a memory. A first group comprising a plurality of tiles may receive a first row of the IFM. A plurality of processing elements of the first group may compute a portion of a first row of an output feature map (OFM) based on the first row of the IFM and a kernel. A second buffer of the accelerator may receive a third row of the IFM from the memory. A second group comprising a plurality of tiles may receive the third row of the IFM. A plurality of processing elements of the second group may compute a portion of a third row of the OFM based on the third row of the IFM and the kernel as part of a grouped convolution operation. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: March 26, 2020
    Date of Patent: January 3, 2023
    Assignee: Intel Corporation
    Inventors: Ambili Vengallur, Bharat Daga, Pradeep K. Janedula, Bijoy Pazhanimala, Aravind Babu Srinivasan
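
To make the row partitioning above concrete, this toy sketch has two "groups" each convolve a different IFM row with the same kernel (sequentially here, concurrently in hardware). It collapses the design to a 1-D, single-channel case; real IFMs, tile groups, and 2-D kernels are far richer than this:

```python
import numpy as np

def conv_row(ifm_row, kernel):
    """One group of processing elements: valid 1-D correlation of one row."""
    k = len(kernel)
    return np.array([ifm_row[i:i + k] @ kernel
                     for i in range(len(ifm_row) - k + 1)])

ifm = np.arange(24, dtype=float).reshape(4, 6)   # toy 4x6 input feature map
kernel = np.array([1.0, 0.0, -1.0])              # toy 1x3 kernel

# First buffer/group takes the first IFM row, second buffer/group takes the
# third, mirroring the first-row / third-row split in the abstract.
ofm_row0 = conv_row(ifm[0], kernel)   # portion of OFM row 0
ofm_row2 = conv_row(ifm[2], kernel)   # portion of OFM row 2
print(ofm_row0, ofm_row2)
```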
  • Publication number: 20220043884
    Abstract: One embodiment provides a compute apparatus to perform machine learning operations, the compute apparatus comprising a hardware accelerator including a compute unit to perform a Winograd convolution, the compute unit configurable to perform the Winograd convolution for a first kernel size using a transform associated with a second kernel size. (An illustrative code sketch follows this entry.)
    Type: Application
    Filed: April 22, 2021
    Publication date: February 10, 2022
    Applicant: Intel Corporation
    Inventors: Pradeep K. Janedula, Bijoy Pazhanimala, Bharat Daga, Saurabh M. Dhoble
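
The key claim above is reusing one kernel size's Winograd transform for a different kernel size. One simple way to realize that — an assumption on our part, not necessarily the patented mapping — is zero-padding: the sketch below runs a 2-tap kernel through an unmodified 1-D F(2,3) datapath built for 3-tap kernels:

```python
def winograd_f23(d, g):
    """F(2,3) Winograd: two outputs of a 3-tap correlation from four inputs,
    using four multiplies instead of six."""
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]

# Reusing the 3-tap transform for a different kernel size: zero-pad a 2-tap
# kernel to 3 taps and feed it through the same F(2,3) datapath.
d = [1.0, 2.0, 3.0, 4.0]
g2 = [10.0, 20.0]
y = winograd_f23(d, g2 + [0.0])
assert y == [1*10 + 2*20, 2*10 + 3*20]   # matches a plain 2-tap correlation
```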
  • Publication number: 20210357793
    Abstract: Embodiments described herein provide a processing apparatus comprising compute circuitry to generate neural network data for a convolutional neural network (CNN) and write the neural network data to a memory buffer. The apparatus additionally includes a direct memory access (DMA) controller with a hardware codec comprising encode circuitry and decode circuitry. The DMA controller reads the neural network data from the memory buffer, encodes it via the encode circuitry, writes the encoded neural network data and its metadata to a memory device coupled with the processing apparatus, and decodes the encoded neural network data via the decode circuitry in response to a request from the compute circuitry.
    Type: Application
    Filed: July 30, 2021
    Publication date: November 18, 2021
    Applicant: Intel Corporation
    Inventors: Ajit Singh, Bharat Daga, Michael Behar
  • Patent number: 11080611
    Abstract: Embodiments described herein provide a processing apparatus comprising compute logic to generate neural network data for a convolutional neural network (CNN) and write the neural network data to a memory buffer. The apparatus additionally includes a direct memory access (DMA) controller including a hardware codec having an encode unit and a decode unit, the DMA controller to read the neural network data from the memory buffer, encode the neural network data via the encode unit, write encoded neural network data to a memory device coupled with the processing apparatus, write metadata for the encoded neural network data to the memory device coupled with the processing apparatus, and decode encoded neural network data via the decode unit in response to a request from the compute logic.
    Type: Grant
    Filed: December 22, 2017
    Date of Patent: August 3, 2021
    Assignee: Intel Corporation
    Inventors: Ajit Singh, Bharat Daga, Michael Behar
  • Patent number: 10990648
    Abstract: One embodiment provides a compute apparatus to perform machine learning operations, the compute apparatus comprising a hardware accelerator including a compute unit to perform a Winograd convolution, the compute unit configurable to perform the Winograd convolution for a first kernel size using a transform associated with a second kernel size.
    Type: Grant
    Filed: August 7, 2017
    Date of Patent: April 27, 2021
    Assignee: Intel Corporation
    Inventors: Pradeep Janedula, Bijoy Pazhanimala, Bharat Daga, Saurabh Dhoble
  • Publication number: 20200320403
    Abstract: An apparatus to facilitate execution of non-linear function operations is disclosed. The apparatus comprises accelerator circuitry including a compute grid having a plurality of processing elements to execute neural network computations, store values resulting from the neural network computations, and perform piecewise linear (PWL) approximations of one or more non-linear functions using the stored values as input data.
    Type: Application
    Filed: April 8, 2019
    Publication date: October 8, 2020
    Applicant: Intel Corporation
    Inventors: Bharat Daga, Krishnakumar Nair, Pradeep Janedula, Aravind Babu Srinivasan, Bijoy Pazhanimala, Ambili Vengallur
  • Patent number: 10769526
    Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises accelerator circuitry including a first set of processing elements to perform first computations including matrix multiplication operations, a second set of processing elements to perform second computations including sum-of-weight-elements and offset multiply operations, and a third set of processing elements to perform third computations including sum-of-input-elements and offset multiply operations, wherein the second and third computations are performed in parallel with the first computations. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: April 24, 2018
    Date of Patent: September 8, 2020
    Assignee: Intel Corporation
    Inventors: Bharat Daga, Pradeep Janedula, Aravind Babu Srinivasan, Ambili Vengallur
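
The three parallel computations above match the standard zero-point expansion of a quantized dot product: Σ(a−za)(w−zw) = Σaw − za·Σw − zw·Σa + K·za·zw. This sketch checks that identity; the variable names and sizes are ours, and the hardware computes the three terms in parallel rather than sequentially as here:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 8
a = rng.integers(0, 255, size=K).astype(np.int64)   # quantized inputs
w = rng.integers(0, 255, size=K).astype(np.int64)   # quantized weights
za, zw = 128, 120                                    # zero-point offsets

dot = int(a @ w)              # first computation: plain integer matmul term
w_term = za * int(w.sum())    # second: sum of weight elements x input offset
a_term = zw * int(a.sum())    # third:  sum of input elements x weight offset

result = dot - w_term - a_term + K * za * zw
assert result == int((a - za) @ (w - zw))   # the zero-point identity holds
```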
  • Patent number: 10726583
    Abstract: Embodiments described herein provide a processing apparatus comprising compute logic to generate output feature map data for a convolutional neural network (CNN) and write the feature map data to a memory buffer; a direct memory access (DMA) controller including a feature map encoder, the DMA controller to read the feature map data from the memory buffer, encode the feature map data using one of multiple encode algorithms, and write encoded feature map data to memory coupled with the processing apparatus; and wherein the compute logic is to read the encoded feature map data from the memory in an encoded format and decode the encoded feature map data while reading the encoded feature map data. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: July 28, 2020
    Assignee: Intel Corporation
    Inventors: Ajit Singh, Bharat Daga, Oren Agam, Michael Behar, Dmitri Vainbrand
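
A distinctive element above is choosing among multiple encode algorithms per feature map. The sketch below picks between a zero run-length scheme and raw passthrough using a crude token-count cost model, and tags the choice so the reader can decode; both schemes and the cost model are illustrative assumptions, not the patent's algorithms:

```python
def rle_encode(values):
    """(flag, payload) tokens: flag 0 = a run of zeros, flag 1 = a literal."""
    tokens, zeros = [], 0
    for v in values:
        if v == 0:
            zeros += 1
        else:
            if zeros:
                tokens.append((0, zeros))
                zeros = 0
            tokens.append((1, v))
    if zeros:
        tokens.append((0, zeros))
    return tokens

def rle_decode(tokens):
    out = []
    for flag, payload in tokens:
        out.extend([0] * payload if flag == 0 else [payload])
    return out

def encode_feature_map(row):
    """Pick the cheaper of two schemes and tag the choice, mirroring the
    'one of multiple encode algorithms' behavior in the abstract."""
    rle = rle_encode(row)
    if len(rle) < len(row):          # crude cost model: token count
        return ("rle", rle)
    return ("raw", list(row))

def decode_feature_map(encoded):
    scheme, payload = encoded
    return rle_decode(payload) if scheme == "rle" else list(payload)

sparse = [0, 0, 0, 9, 0, 0, 4, 0]    # wins with RLE
dense = [3, 1, 4, 1, 5, 9, 2, 6]     # stays raw
for row in (sparse, dense):
    assert decode_feature_map(encode_feature_map(row)) == row
```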
  • Publication number: 20200233803
    Abstract: Hardware accelerators for grouped convolution operations. A first buffer of a hardware accelerator may receive a first row of an input feature map (IFM) from a memory. A first group comprising a plurality of tiles may receive a first row of the IFM. A plurality of processing elements of the first group may compute a portion of a first row of an output feature map (OFM) based on the first row of the IFM and a kernel. A second buffer of the accelerator may receive a third row of the IFM from the memory. A second group comprising a plurality of tiles may receive the third row of the IFM. A plurality of processing elements of the second group may compute a portion of a third row of the OFM based on the third row of the IFM and the kernel as part of a grouped convolution operation.
    Type: Application
    Filed: March 26, 2020
    Publication date: July 23, 2020
    Applicant: Intel Corporation
    Inventors: Ambili Vengallur, Bharat Daga, Pradeep K. Janedula, Bijoy Pazhanimala, Aravind Babu Srinivasan
  • Patent number: 10600147
    Abstract: A mechanism is described for facilitating an efficient memory layout that enables smart data compression in machine learning environments. A method of embodiments, as described herein, includes dividing an initial tile representing an image into primary multiple tiles such that each tile of the primary multiple tiles is treated as an independent image when processed by one or more processors of a computing device. The method may further include dividing the primary multiple tiles into secondary multiple tiles compatible in size with a local buffer. The method may further include merging the secondary multiple tiles into a final tile representing the image, and compressing the final tile. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: August 22, 2017
    Date of Patent: March 24, 2020
    Assignee: Intel Corporation
    Inventors: Bharat Daga, Ajit Singh, Pradeep Janedula
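
To illustrate the two-level tiling above, the sketch splits an image into primary tiles, re-tiles each into buffer-sized secondary tiles, "processes" each one (identity here, where the compression stage would sit), and merges everything back. The tile sizes are assumptions, and this simplified version requires them to divide the image evenly:

```python
import numpy as np

BUFFER_TILE = (4, 4)   # assumed local-buffer capacity, in elements

def split(img, th, tw):
    """Divide an array into a grid of (th, tw) tiles; sizes must divide evenly."""
    return [[img[r*th:(r+1)*th, c*tw:(c+1)*tw]
             for c in range(img.shape[1] // tw)]
            for r in range(img.shape[0] // th)]

def process(tile):
    return tile   # placeholder for the per-tile compute/compression stage

image = np.arange(16 * 16).reshape(16, 16)

# Primary tiling: each primary tile is handled as an independent image.
primary = split(image, 8, 8)

# Secondary tiling: re-tile each primary tile to the buffer size, process
# each buffer-sized tile, then merge each primary tile back together.
merged_primary = [
    [np.block([[process(t) for t in row] for row in split(p, *BUFFER_TILE)])
     for p in prow]
    for prow in primary
]

final = np.block(merged_primary)   # the final tile representing the image
assert (final == image).all()      # identity processing round-trips exactly
```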
  • Publication number: 20190325303
    Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises accelerator circuitry including a first set of processing elements to perform first computations including matrix multiplication operations, a second set of processing elements to perform second computations including sum-of-weight-elements and offset multiply operations, and a third set of processing elements to perform third computations including sum-of-input-elements and offset multiply operations, wherein the second and third computations are performed in parallel with the first computations.
    Type: Application
    Filed: April 24, 2018
    Publication date: October 24, 2019
    Applicant: Intel Corporation
    Inventors: Bharat Daga, Pradeep Janedula, Aravind Babu Srinivasan, Ambili Vengallur
  • Publication number: 20190197420
    Abstract: Embodiments described herein provide a processing apparatus comprising compute logic to generate neural network data for a convolutional neural network (CNN) and write the neural network data to a memory buffer. The apparatus additionally includes a direct memory access (DMA) controller including a hardware codec having an encode unit and a decode unit, the DMA controller to read the neural network data from the memory buffer, encode the neural network data via the encode unit, write encoded neural network data to a memory device coupled with the processing apparatus, write metadata for the encoded neural network data to the memory device coupled with the processing apparatus, and decode encoded neural network data via the decode unit in response to a request from the compute logic.
    Type: Application
    Filed: December 22, 2017
    Publication date: June 27, 2019
    Applicant: Intel Corporation
    Inventors: Ajit Singh, Bharat Daga, Michael Behar
  • Publication number: 20190066257
    Abstract: A mechanism is described for facilitating an efficient memory layout that enables smart data compression in machine learning environments. A method of embodiments, as described herein, includes dividing an initial tile representing an image into primary multiple tiles such that each tile of the primary multiple tiles is treated as an independent image when processed by one or more processors of a computing device. The method may further include dividing the primary multiple tiles into secondary multiple tiles compatible in size with a local buffer. The method may further include merging the secondary multiple tiles into a final tile representing the image, and compressing the final tile.
    Type: Application
    Filed: August 22, 2017
    Publication date: February 28, 2019
    Applicant: Intel Corporation
    Inventors: Bharat Daga, Ajit Singh, Pradeep Janedula
  • Publication number: 20190042923
    Abstract: One embodiment provides a compute apparatus to perform machine learning operations, the compute apparatus comprising a hardware accelerator including a compute unit to perform a Winograd convolution, the compute unit configurable to perform the Winograd convolution for a first kernel size using a transform associated with a second kernel size.
    Type: Application
    Filed: August 7, 2017
    Publication date: February 7, 2019
    Applicant: Intel Corporation
    Inventors: Pradeep Janedula, Bijoy Pazhanimala, Bharat Daga, Saurabh Dhoble
  • Publication number: 20180189981
    Abstract: Embodiments described herein provide a processing apparatus comprising compute logic to generate output feature map data for a convolutional neural network (CNN) and write the feature map data to a memory buffer; a direct memory access (DMA) controller including a feature map encoder, the DMA controller to read the feature map data from the memory buffer, encode the feature map data using one of multiple encode algorithms, and write encoded feature map data to memory coupled with the processing apparatus; and wherein the compute logic is to read the encoded feature map data from the memory in an encoded format and decode the encoded feature map data while reading the encoded feature map data.
    Type: Application
    Filed: December 30, 2016
    Publication date: July 5, 2018
    Inventors: Ajit Singh, Bharat Daga, Oren Agam, Michael Behar, Dmitri Vainbrand
  • Publication number: 20170286357
    Abstract: In one embodiment, an apparatus comprises: a controller to communicate data having a format according to a first communication protocol, the controller comprising a Mobile Industry Processor Interface (MIPI)-compatible controller; an interface circuit coupled to the controller to receive the data, convert the data, and communicate the converted data to a physical unit of a second communication protocol, the converted data having a format according to the second communication protocol; and the physical unit coupled to the interface circuit to receive and serialize the converted data and output the serialized converted data to a destination. Other embodiments are described and claimed. (An illustrative code sketch follows this entry.)
    Type: Application
    Filed: March 30, 2016
    Publication date: October 5, 2017
    Inventors: Satheesh Chellappan, Anoop Mukker, Bharat Daga, David W. Vogel
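
As a purely schematic model of the three-stage data path above (controller → interface circuit → physical unit), the sketch below re-frames a toy protocol-1 packet into a toy protocol-2 frame and serializes it to bits. Every framing detail here is invented for illustration; real MIPI packet formats are not reproduced:

```python
# Toy model of the bridge: a controller emits packets in one framing, an
# interface layer re-frames them for a second protocol, and a "physical
# unit" serializes the result. All byte layouts below are assumptions.

def controller_packet(payload: bytes) -> bytes:
    """Controller stage: wrap payload in a toy protocol-1 frame."""
    return bytes([0xA5, len(payload)]) + payload

def convert(frame: bytes) -> bytes:
    """Interface-circuit stage: re-frame protocol-1 data as protocol-2."""
    assert frame[0] == 0xA5                        # toy protocol-1 marker
    payload = frame[2:2 + frame[1]]
    return bytes([0x5A]) + payload + bytes([0xFF]) # toy protocol-2 frame

def serialize(frame: bytes):
    """PHY stage: emit the converted frame as a bit stream, LSB first."""
    return [(byte >> i) & 1 for byte in frame for i in range(8)]

bits = serialize(convert(controller_packet(b"\x01\x02")))
print(bits)
```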
  • Patent number: 7609092
    Abstract: An automatic phase detection circuit for generating an internal synchronization signal when two clock input signals achieve a certain phase relationship. No external reference signal is required. The logic state of one clock is sampled on the active edge of the other clock and stored in a shift register. The content of the shift register is compared to a pre-defined signature and a sync signal is generated when the content matches the pre-defined signature. A mask register may be used to define which bits of the shift register and pre-defined signature are compared. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: January 23, 2008
    Date of Patent: October 27, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Thomas Wicki, Bharat Daga
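
Finally, the phase detector above is easy to model behaviorally: sample one clock's level at each active edge of the other, shift the samples into a register, and assert sync when the masked register matches the masked signature. The register width, signature, and mask below are illustrative values, not the patented circuit's:

```python
# Behavioral model of the phase detector described above.
WIDTH = 8
SIGNATURE = 0b00001111   # expected pattern of clk_b samples at lock
MASK = 0b00111100        # compare only these bits, per the mask register

def phase_detect(clk_b_samples):
    """clk_b_samples: clk_b's logic level at successive clk_a active edges.
    Returns True when the masked shift register matches the signature."""
    shift_reg = 0
    for sample in clk_b_samples:
        shift_reg = ((shift_reg << 1) | (sample & 1)) & ((1 << WIDTH) - 1)
        if (shift_reg & MASK) == (SIGNATURE & MASK):
            return True   # generate the internal sync signal
    return False

# clk_b seen low for four clk_a edges then high for four: matches.
assert phase_detect([0, 0, 0, 0, 1, 1, 1, 1])
```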