Patents by Inventor Abhijit Davare

Abhijit Davare has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Apparatus and method for adaptable and efficient lane-wise tensor processing

Patent number: 11379229

Abstract: An apparatus and method for performing efficient, adaptable tensor operations. For example, one embodiment of a processor comprises: front end circuitry to schedule matrix operations responsive to a matrix multiplication instruction; a plurality of lanes to perform parallel execution of the matrix operations, wherein a lane comprises an arithmetic logic unit to multiply a block of a first matrix with a block of a second matrix to generate a product and to accumulate the product with a block of a third matrix, and wherein the matrix blocks are to be stored in registers within the lane; and broadcast circuitry to broadcast one or more invariant matrix blocks to at least one of different registers within the lane and different registers across different lanes.

Type: Grant

Filed: August 7, 2020

Date of Patent: July 5, 2022

Assignee: INTEL CORPORATION

Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Debbie Marr, Abhijit Davare, Asit Mishra, Steven Burns, Desmond A. Kirkpatrick, Andrey Ayupov, Anton Alexandrovich Sorokin, Eriko Nurvitadhi
APPARATUS, ARTICLES OF MANUFACTURE, AND METHODS FOR COMPOSABLE MACHINE LEARNING COMPUTE NODES

Publication number: 20220114495

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed for composable machine learning compute nodes. An example apparatus includes interface circuitry to receive a workload, instructions in the apparatus, and processor circuitry to at least one of execute or instantiate the instructions to generate a first configuration of one or more machine-learning models based on a workload, generate a second configuration of hardware, determine an evaluation parameter based on an execution of the workload, the execution of the workload based on the first configuration and the second configuration, and, in response to the evaluation parameter satisfying a threshold, execute the one or more machine-learning models in the first configuration on the hardware in the second configuration, the one or more machine-learning models and the hardware to execute the workload.

Type: Application

Filed: December 21, 2021

Publication date: April 14, 2022

Inventors: Eriko Nurvitadhi, Rajesh Poornachandran, Abhijit Davare, Nilesh Jain, Chaunte Lacewell, Anahita Bhiwandiwalla, Juan Pablo Munoz, Andrew Boutros, Yash Akhauri
HARDWARE OFFLOAD CIRCUITRY

Publication number: 20220114270

Abstract: Examples described herein relate to offload circuitry comprising one or more compute engines that are configurable to perform a workload offloaded from a process executed by a processor based on a descriptor particular to the workload. In some examples, the offload circuitry is configurable to perform the workload, among multiple different workloads. In some examples, the multiple different workloads include one or more of: data transformation (DT) for data format conversion, Locality Sensitive Hashing (LSH) for neural network (NN), similarity search, sparse general matrix-matrix multiplication (SpGEMM) acceleration of hash based sparse matrix multiplication, data encode, data decode, or embedding lookup.

Type: Application

Filed: December 22, 2021

Publication date: April 14, 2022

Inventors: Ren WANG, Sameh GOBRIEL, Somnath PAUL, Yipeng WANG, Priya AUTEE, Abhirupa LAYEK, Shaman NARAYANA, Edwin VERPLANKE, Mrittika GANGULI, Jr-Shian TSAI, Anton SOROKIN, Suvadeep BANERJEE, Abhijit DAVARE, Desmond KIRKPATRICK
METHODS AND APPARATUS FOR DATA ENHANCED AUTOMATED MODEL GENERATION

Publication number: 20220114451

Abstract: Methods, apparatus, systems, and articles of manufacture for data enhanced automated model generation are disclosed. Example instructions, when executed, cause at least one processor to access a request to generate a machine learning model to perform a selected task, generate task knowledge based on a previously generated machine learning model, create a search space based on the task knowledge, and generate a machine learning model using neural architecture search, the neural architecture search beginning based on the search space.

Type: Application

Filed: December 22, 2021

Publication date: April 14, 2022

Inventors: Chaunté W. Lacewell, Juan Pablo Muñoz, Rajesh Poornachandran, Nilesh Jain, Anahita Bhiwandiwalla, Eriko Nurvitadhi, Abhijit Davare
Gather-scatter cache architecture having plurality of tag and data banks and arbiter for single program multiple data (SPMD) processor

Patent number: 10896141

Abstract: In one embodiment, a cache memory includes: a plurality of data banks, each of the plurality of data banks having a plurality of entries each to store a portion of a cache line distributed across the plurality of data banks; and a plurality of tag banks decoupled from the plurality of data banks, wherein a tag for a cache line is to be assigned to one of the plurality of tag banks. Other embodiments are described and claimed.

Type: Grant

Filed: March 26, 2019

Date of Patent: January 19, 2021

Assignee: Intel Corporation

Inventors: Jeffrey J. Cook, Jonathan D. Pearce, Srikanth T. Srinivasan, Rishiraj A. Bheda, David B. Sheffield, Abhijit Davare, Anton Alexandrovich Sorokin
APPARATUS AND METHOD FOR ADAPTABLE AND EFFICIENT LANE-WISE TENSOR PROCESSING

Publication number: 20200364045

Abstract: An apparatus and method for performing efficient, adaptable tensor operations. For example, one embodiment of a processor comprises: front end circuitry to schedule matrix operations responsive to a matrix multiplication instruction; a plurality of lanes to perform parallel execution of the matrix operations, wherein a lane comprises an arithmetic logic unit to multiply a block of a first matrix with a block of a second matrix to generate a product and to accumulate the product with a block of a third matrix, and wherein the matrix blocks are to be stored in registers within the lane; and broadcast circuitry to broadcast one or more invariant matrix blocks to at least one of different registers within the lane and different registers across different lanes.

Type: Application

Filed: August 7, 2020

Publication date: November 19, 2020

Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Debbie Marr, Abhijit Davare, Asit Mishra, Steven Burns, Desmond A. Kirkpatrick, Andrey Ayupov, Anton Alexandrovich Sorokin, Eriko Nurvitadhi
Architecture and method for data parallel single program multiple data (SPMD) execution

Patent number: 10831505

Abstract: An apparatus and method for data parallel single program multiple data (SPMD) execution. For example, one embodiment of a processor comprises: instruction fetch circuitry to fetch instructions of one or more primary threads; a decoder to decode the instructions to generate uops; a data parallel cluster (DPC) to execute microthreads comprising a subset of the uops, the DPC further comprising: a plurality of execution lanes to perform parallel execution of the microthreads; an instruction decode queue (IDQ) to store the uops prior to execution; and a scheduler to evaluate the microthreads based on associated variables including instruction pointer (IP) values, the scheduler to gang microthreads into fragments for parallel execution on the execution lanes based on the evaluation.

Type: Grant

Filed: September 29, 2018

Date of Patent: November 10, 2020

Assignee: Intel Corporation

Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Deborah Marr, Abhijit Davare, Andrey Ayupov
Gather-Scatter Cache Architecture For Single Program Multiple Data (SPMD) Processor

Publication number: 20200310992

Abstract: In one embodiment, a cache memory includes: a plurality of data banks, each of the plurality of data banks having a plurality of entries each to store a portion of a cache line distributed across the plurality of data banks; and a plurality of tag banks decoupled from the plurality of data banks, wherein a tag for a cache line is to be assigned to one of the plurality of tag banks. Other embodiments are described and claimed.

Type: Application

Filed: March 26, 2019

Publication date: October 1, 2020

Inventors: Jeffrey J. Cook, Jonathan D. Pearce, Srikanth T. Srinivasan, Rishiraj A. Bheda, David B. Sheffield, Abhijit Davare, Anton Alexandrovich Sorokin
Apparatus and method for adaptable and efficient lane-wise tensor processing

Patent number: 10776110

Abstract: An apparatus and method for performing efficient, adaptable tensor operations.

Type: Grant

Filed: September 29, 2018

Date of Patent: September 15, 2020

Assignee: Intel Corporation

Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Deborah Marr, Abhijit Davare, Asit Mishra, Steven Burns, Desmond Kirkpatrick, Andrey Ayupov, Anton Alexandrovich Sorokin, Eriko Nurvitadhi
APPARATUS AND METHOD FOR ADAPTABLE AND EFFICIENT LANE-WISE TENSOR PROCESSING

Publication number: 20200104126

Abstract: An apparatus and method for performing efficient, adaptable tensor operations.

Type: Application

Filed: September 29, 2018

Publication date: April 2, 2020

Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Deborah Marr, Abhijit Davare, Asit Mishra, Steven Burns, Desmond Kirkpatrick, Andrey Ayupov, Anton Alexandrovich Sorokin, Eriko Nurvitadhi
ARCHITECTURE AND METHOD FOR DATA PARALLEL SINGLE PROGRAM MULTIPLE DATA (SPMD) EXECUTION

Publication number: 20200104139

Abstract: An apparatus and method for data parallel single program multiple data (SPMD) execution. For example, one embodiment of a processor comprises: instruction fetch circuitry to fetch instructions of one or more primary threads; a decoder to decode the instructions to generate uops; a data parallel cluster (DPC) to execute microthreads comprising a subset of the uops, the DPC further comprising: a plurality of execution lanes to perform parallel execution of the microthreads; an instruction decode queue (IDQ) to store the uops prior to execution; and a scheduler to evaluate the microthreads based on associated variables including instruction pointer (IP) values, the scheduler to gang microthreads into fragments for parallel execution on the execution lanes based on the evaluation.

Type: Application

Filed: September 29, 2018

Publication date: April 2, 2020

Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Deborah Marr, Abhijit Davare, Andrey Ayupov
Test, validation, and debug architecture

Patent number: 10198333

Abstract: An apparatus and method is described herein for providing a test, validation, and debug architecture. At a target or base level, hardware hooks (Design for Test or DFx) are designed into and integrated with silicon parts. A controller may provide abstracted access to such hooks, such as through an abstraction layer that abstracts low level details of the hardware DFx. In addition, the abstraction layer through an interface, such as APIs, provides services, routines, and data structures to higher-level software/presentation layers, which are able to collect test data for validation and debug of a unit/platform under test. Moreover, the architecture potentially provides tiered (multiple levels of) secure access to the test architecture. Additionally, physical access to the test architecture for a platform may be simplified through use of a unified, bi-directional test access port, while also potentially allowing remote access to perform remote test and debug of a part/platform under test.

Type: Grant

Filed: December 23, 2010

Date of Patent: February 5, 2019

Assignee: INTEL CORPORATION

Inventors: Mark B. Trobough, Keshavan K. Tiruvallur, Chinna B. Prudvi, Christian E. Iovin, David W. Grawrock, Jay J. Nejedlo, Ashok N. Kabadi, Travis K. Goff, Evan J. Halprin, Kapila B. Udawatta, Jiun Long Foo, Wee Hoo Cheah, Vui Yong Liew, Selvakumar Raja Gopal, Yuen Tat Lee, Samie B. Samaan, Kip C. Killpack, Neil Dobler, Nagib Z. Hakim, Brian Meyer, William H. Penner, John L. Baudrexl, Russell J. Wunderlich, James J. Grealish, Kyle Markley, Timothy S. Storey, Loren J. McConnell, Lyle E. Cool, Mukesh Kataria, Rahima K. Mohammed, Tieyu Zheng, Yi Amy Xia, Ridvan A. Sahan, Arun R. Ramadorai, Priyadarsan Patra, Edwin E. Parks, Abhijit Davare, Padmakumar Gopal, Bruce Querbach, Hermann W. Gartler, Keith Drescher, Sanjay S. Salem, David C. Florey
TEST, VALIDATION, AND DEBUG ARCHITECTURE

Publication number: 20150127983

Abstract: An apparatus and method is described herein for providing a test, validation, and debug architecture. At a target or base level, hardware (Design for Test or DFx) are designed into and integrated with silicon parts. A controller may provide abstracted access to such hooks, such as through an abstraction layer that abstracts low level details of the hardware DFx. In addition, the abstraction layer through an interface, such as APIs, provides services, routines, and data structures to higher-level software/presentation layers, which are able to collect test data for validation and debug of a unit/platform under test. Moreover, the architecture potentially provides tiered (multiple levels of) secure access to the test architecture. Additionally, physical access to the test architecture for a platform may be simplified through use of a unified, bi-directional test access port, while also potentially allowing remote access to perform remote test and de-bug of a part/platform under test.

Type: Application

Filed: December 23, 2010

Publication date: May 7, 2015

Applicant: INTEL CORPORATION

Inventors: Mark B. Trobough, Keshavan K. Tiruvallur, Chinna B. Prudvi, Christian E. Iovin, David W. Grawrock, Jay J. Nejedlo, Ashok N. Kabadi, Travis K. Goff, Evan J. Halprin, Kapila B. Udawatta, Jiun Long Foo, Wee Hoo Cheah, Vui Yong Liew, Selvakumar Raja Gopal, Yuen Tat Lee, Samie B. Samaan, Kip C. Killpack, Neil Dobler, Nagib Z. Hakim, Briar Meyer, William H. Penner, John L. Baudrexl, Russell J. Wunderlich, James J. Grealish, Kyle Markley, Timothy S. Storey, Loren J. McConnell, Lyle E. Cool, Mukesh Kataria, Rahima K. Mohammed, Tieyu Zheng, Yi Amy Xia, Ridvan A. Sahan, Arun R. Ramadorai, Priyadarsan Patra, Edwin E. Parks, Abhijit Davare, Padmakumar Gopal, Bruce Querbach, Hermann W. Gartler, Keith Drescher, Sanjay S. Salem, David C. Florey