Patents by Inventor Kermin ChoFleming

Kermin ChoFleming has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Multi-stage cache tag with first stage tag size reduction

Patent number: 12332802

Abstract: An embodiment of an integrated circuit comprises circuitry to generate a cache tag for data to be stored in a cache memory, store a first portion of the cache tag in a primary tag memory, and store a second portion of the cache tag in a secondary tag memory, wherein a size of the first portion is smaller than a size of the second portion. Other embodiments are disclosed and claimed.

Type: Grant

Filed: June 21, 2021

Date of Patent: June 17, 2025

Assignee: Intel Corporation

Inventors: Kermin ChoFleming, Yu Bai, Ping Zou
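
The split described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the class name `TwoStageTagStore`, the bit widths, the choice of the low bits as the first portion, and the two-step lookup (filter on the small primary portion, then confirm against the larger secondary portion) are all hypothetical. The abstract itself only states that the tag is stored in two portions, with the first smaller than the second.

```python
class TwoStageTagStore:
    """Toy model of a cache tag split across two tag memories (hypothetical)."""

    def __init__(self, tag_bits=20, primary_bits=6):
        # The first (primary) portion must be smaller than the second portion.
        assert 0 < primary_bits < tag_bits - primary_bits
        self.primary_bits = primary_bits
        self.primary = {}    # slot -> small first portion of the tag
        self.secondary = {}  # slot -> larger second portion of the tag

    def split(self, tag):
        # Assumption: the low bits form the first portion, the rest the second.
        mask = (1 << self.primary_bits) - 1
        return tag & mask, tag >> self.primary_bits

    def store(self, slot, tag):
        self.primary[slot], self.secondary[slot] = self.split(tag)

    def lookup(self, tag):
        first, second = self.split(tag)
        # Stage 1: cheap filter on the small primary portion.
        candidates = [s for s, p in self.primary.items() if p == first]
        # Stage 2: confirm candidates against the secondary portion.
        for slot in candidates:
            if self.secondary[slot] == second:
                return slot
        return None


store = TwoStageTagStore()
store.store(0, 0xABCDE)
store.store(1, 0x12345)
print(store.lookup(0xABCDE))  # slot holding this tag
print(store.lookup(0x0001E))  # same primary bits as 0xABCDE, different secondary bits
```

In this model, a lookup that misses in the primary stage never consults the secondary memory, which is one plausible motivation for keeping the first portion small.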
Techniques for near data acceleration for a multi-core architecture

Patent number: 12204478

Abstract: Examples include techniques for near data acceleration for a multi-core architecture. A near data processor included in a memory controller of a processor may access data maintained in a memory device coupled with the near data processor via one or more memory channels responsive to a work request to execute a kernel, an application or a loop routine using the accessed data to generate values. The near data processor provides an indication to the requestor of the work request that values have been generated.

Type: Grant

Filed: March 19, 2021

Date of Patent: January 21, 2025

Assignee: Intel Corporation

Inventors: Swapna Raj, Samantika S. Sury, Kermin ChoFleming, Simon C. Steely, Jr.
NEURAL NETWORK FACILITATING FIXED-POINT EMULATION OF FLOATING-POINT COMPUTATION

Publication number: 20230008856

Abstract: A DNN accelerator can perform fixed-point emulation of floating-point computation. In a multiplication operation on two floating-point matrices, the DNN accelerator determines an extreme exponent for a row in the first floating-point matrix and determines another extreme exponent for a column in the second floating-point matrix. The row and column can be converted to fixed-point vectors based on the extreme exponents. The two fixed-point vectors are fed into a PE array in the DNN accelerator. The PE array performs a multiplication operation on the two fixed-point vectors and generates a fixed-point inner product. The fixed-point inner product can be converted back to a floating-point inner product based on the extreme exponents. The floating-point inner product is an element in the matrix resulting from the multiplication operation on the two floating-point matrices. The matrix can be accumulated with another matrix resulting from a fixed-point emulation of a floating-point matrix multiplication.

Type: Application

Filed: September 5, 2022

Publication date: January 12, 2023

Inventors: Gregory Henry, Kermin ChoFleming, Simon Steely, Jr.
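
The row-times-column step in this abstract can be sketched in a few lines. The sketch below makes several assumptions that the abstract does not confirm: the "extreme exponent" is taken to be the largest binary exponent in the vector, `frac_bits` is a hypothetical fixed-point width, and plain Python integers stand in for the PE array. The function names are illustrative only.

```python
import math


def max_exponent(values):
    # math.frexp(v) returns (m, e) with v == m * 2**e and 0.5 <= |m| < 1.
    # Assumption: the "extreme exponent" is the largest e among nonzero entries.
    return max((math.frexp(v)[1] for v in values if v != 0.0), default=0)


def to_fixed(values, shared_exp, frac_bits):
    # Scale every entry by 2**(frac_bits - shared_exp) and round to an integer,
    # so the largest magnitude stays below 2**frac_bits.
    return [round(math.ldexp(v, frac_bits - shared_exp)) for v in values]


def emulated_dot(row, col, frac_bits=24):
    e_row = max_exponent(row)
    e_col = max_exponent(col)
    q_row = to_fixed(row, e_row, frac_bits)
    q_col = to_fixed(col, e_col, frac_bits)
    # Integer-only inner product (stands in for the PE array).
    acc = sum(a * b for a, b in zip(q_row, q_col))
    # Undo both scalings to recover a floating-point inner product.
    return math.ldexp(acc, e_row + e_col - 2 * frac_bits)


row = [1.5, -2.0, 0.25]
col = [4.0, 0.5, 8.0]
print(emulated_dot(row, col))  # compare with sum(a * b for a, b in zip(row, col))
```

Entries much smaller than the largest entry in the same row or column lose precision in this scheme, because they share that vector's single exponent.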
MULTI-STAGE CACHE TAG WITH FIRST STAGE TAG SIZE REDUCTION

Publication number: 20220405209

Abstract: An embodiment of an integrated circuit comprises circuitry to generate a cache tag for data to be stored in a cache memory, store a first portion of the cache tag in a primary tag memory, and store a second portion of the cache tag in a secondary tag memory, wherein a size of the first portion is smaller than a size of the second portion. Other embodiments are disclosed and claimed.

Type: Application

Filed: June 21, 2021

Publication date: December 22, 2022

Applicant: Intel Corporation

Inventors: Kermin ChoFleming, Yu Bai, Ping Zou
SYSTEMS, APPARATUS, ARTICLES OF MANUFACTURE, AND METHODS FOR IMPROVED DATA TRANSFER FOR HETEROGENEOUS PROGRAMS

Publication number: 20220222177

Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed for improving data transfer for heterogeneous programs. An example apparatus includes instructions in the apparatus, and processor circuitry to at least one of execute or instantiate the instructions to determine a runtime associated with executing a code object by a heterogeneous electronic device based on at least one of a location of a memory object or a data transfer penalty, the data transfer penalty associated with access of the memory object in response to execution of the code object, identify a memory operation for the memory object based on the runtime, and generate an executable file based on the memory operation, the executable file, when executed, to cause execution of the code object by at least one of first hardware or second hardware of the heterogeneous electronic device based on the memory operation.

Type: Application

Filed: March 31, 2022

Publication date: July 14, 2022

Inventors: Kermin ChoFleming, Swapna Raj
Control speculation in dataflow graphs

Patent number: 11385873

Abstract: Systems, apparatuses and methods may provide for technology that determines that a control loop is to be executed for an unspecified number of iterations and automatically forces the control loop to be executed for a fixed number of iterations in addition to the unspecified number of iterations, where execution of the control loop for the fixed number of iterations is conducted in parallel. In one example, the technology also removes one or more dataflow tokens associated with the execution of the control loop for the fixed number of iterations.

Type: Grant

Filed: December 7, 2020

Date of Patent: July 12, 2022

Assignee: Intel Corporation

Inventor: Kermin ChoFleming
Simulated-annealing based memory allocations

Patent number: 11249683

Abstract: Systems, apparatuses and methods may provide for technology that determines a plurality of memory operations associated with a data-flow graph that represents a computer code, where a spatial architecture executes the data-flow graph and the spatial architecture includes a plurality of memory controllers, randomly assigns one or more of the plurality of memory operations to one or more of the plurality of memory controllers to generate a first allocation of the plurality of memory operations to the memory controllers, and determines that the first allocation is to be stored as a permanent memory allocation based on a first performance metric associated with the first allocation.

Type: Grant

Filed: March 13, 2020

Date of Patent: February 15, 2022

Assignee: Intel Corporation

Inventors: Yu Bai, Kermin ChoFleming
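
A generic simulated-annealing loop gives a feel for the approach named in this title. This is textbook annealing under stated assumptions, not the claimed method: the load-imbalance cost `imbalance`, the per-operation `weights`, the cooling schedule, and the single-operation move are all hypothetical stand-ins for the "performance metric" and allocation changes that the abstract leaves unspecified.

```python
import math
import random


def imbalance(allocation, weights, num_controllers):
    # Hypothetical performance metric: spread between the most and least
    # loaded memory controller (lower is better).
    load = [0.0] * num_controllers
    for op, ctrl in enumerate(allocation):
        load[ctrl] += weights[op]
    return max(load) - min(load)


def anneal(weights, num_controllers, steps=2000, t0=1.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    # First allocation: assign each memory operation to a random controller.
    current = [rng.randrange(num_controllers) for _ in weights]
    cur_cost = imbalance(current, weights, num_controllers)
    best, best_cost = list(current), cur_cost
    temp = t0
    for _ in range(steps):
        # Candidate move: reassign one randomly chosen operation.
        cand = list(current)
        cand[rng.randrange(len(cand))] = rng.randrange(num_controllers)
        cost = imbalance(cand, weights, num_controllers)
        # Always accept improvements; accept regressions with a probability
        # that shrinks as the temperature cools.
        if cost <= cur_cost or rng.random() < math.exp((cur_cost - cost) / temp):
            current, cur_cost = cand, cost
            if cur_cost < best_cost:
                best, best_cost = list(current), cur_cost
        temp *= cooling
    return best, best_cost


weights = [3.0, 1.0, 2.0, 2.0, 1.0, 3.0, 2.0, 2.0]
allocation, cost = anneal(weights, num_controllers=4)
print(allocation, cost)
```

The best allocation seen so far plays the role of the allocation that would be kept; a real flow would presumably score candidates with a measured or modeled performance metric rather than this toy cost.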
Apparatuses, methods, and systems for memory interface circuit arbitration in a configurable spatial accelerator

Patent number: 11037050

Abstract: Systems, methods, and apparatuses relating to arbitration among a plurality of memory interface circuits in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator (CSA) includes a plurality of processing elements; a plurality of request address file (RAF) circuits, and a circuit switched interconnect network between the plurality of processing elements and the RAF circuits. As a dataflow architecture, embodiments of CSA have a unique memory architecture where memory accesses are decoupled into an explicit request and response phase allowing pipelining through memory. Certain embodiments herein provide for improved memory sub-system design via arbitration and the improvements to arbitration discussed herein.

Type: Grant

Filed: June 29, 2019

Date of Patent: June 15, 2021

Assignee: Intel Corporation

Inventors: Krishna N. Vinod, Sujoyita Kaushikkar, Aniket S. Kakade, Kermin ChoFleming, Ping Zou, Alexey Suprun, Bhavya K. Daya
CONTROL SPECULATION IN DATAFLOW GRAPHS

Publication number: 20210165642

Abstract: Systems, apparatuses and methods may provide for technology that determines that a control loop is to be executed for an unspecified number of iterations and automatically forces the control loop to be executed for a fixed number of iterations in addition to the unspecified number of iterations, where execution of the control loop for the fixed number of iterations is conducted in parallel. In one example, the technology also removes one or more dataflow tokens associated with the execution of the control loop for the fixed number of iterations.

Type: Application

Filed: December 7, 2020

Publication date: June 3, 2021

Inventor: Kermin ChoFleming
Apparatuses, methods, and systems for memory interface circuit allocation in a configurable spatial accelerator

Patent number: 10915471

Abstract: Systems, methods, and apparatuses relating to memory interface circuit allocation in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator (CSA) includes a plurality of processing elements; a plurality of request address file (RAF) circuits, and a circuit switched interconnect network between the plurality of processing elements and the RAF circuits. As a dataflow architecture, embodiments of CSA have a unique memory architecture where memory accesses are decoupled into an explicit request and response phase allowing pipelining through memory. Certain embodiments herein provide for an improved memory sub-system design via the improvements to allocation discussed herein.

Type: Grant

Filed: March 30, 2019

Date of Patent: February 9, 2021

Assignee: Intel Corporation

Inventors: Kermin ChoFleming, Yu Bai, Simon C. Steely
APPARATUSES, METHODS, AND SYSTEMS FOR MEMORY INTERFACE CIRCUIT ARBITRATION IN A CONFIGURABLE SPATIAL ACCELERATOR

Publication number: 20200410323

Abstract: Systems, methods, and apparatuses relating to arbitration among a plurality of memory interface circuits in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator (CSA) includes a plurality of processing elements; a plurality of request address file (RAF) circuits, and a circuit switched interconnect network between the plurality of processing elements and the RAF circuits. As a dataflow architecture, embodiments of CSA have a unique memory architecture where memory accesses are decoupled into an explicit request and response phase allowing pipelining through memory. Certain embodiments herein provide for improved memory sub-system design via arbitration and the improvements to arbitration discussed herein.

Type: Application

Filed: June 29, 2019

Publication date: December 31, 2020

Inventors: Krishna N. Vinod, Sujoyita Kaushikkar, Aniket S. Kakade, Kermin ChoFleming, Ping Zou, Alexey Suprun, Bhavya K. Daya
APPARATUSES, METHODS, AND SYSTEMS FOR TIME-MULTIPLEXING IN A CONFIGURABLE SPATIAL ACCELERATOR

Publication number: 20200409709

Abstract: Systems, methods, and apparatuses relating to time-multiplexing circuitry in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator (CSA) includes a plurality of processing elements; and a time-multiplexed, circuit switched interconnect network between the plurality of processing elements. In another embodiment, a configurable spatial accelerator (CSA) includes a plurality of time-multiplexed processing elements; and a time-multiplexed, circuit switched interconnect network between the plurality of time-multiplexed processing elements.

Type: Application

Filed: June 29, 2019

Publication date: December 31, 2020

Inventors: Kermin ChoFleming, Simon C. Steely, Jr., Mitchell Diamond
Control speculation in dataflow graphs

Patent number: 10860301

Abstract: Systems, apparatuses and methods may provide for technology that determines that a control loop is to be executed for an unspecified number of iterations and automatically forces the control loop to be executed for a fixed number of iterations in addition to the unspecified number of iterations, where execution of the control loop for the fixed number of iterations is conducted in parallel. In one example, the technology also removes one or more dataflow tokens associated with the execution of the control loop for the fixed number of iterations.

Type: Grant

Filed: June 28, 2019

Date of Patent: December 8, 2020

Assignee: Intel Corporation

Inventor: Kermin ChoFleming
Apparatuses, methods, and systems for swizzle operations in a configurable spatial accelerator

Patent number: 10817291

Abstract: Systems, methods, and apparatuses relating to swizzle operations and disable operations in a configurable spatial accelerator (CSA) are described. Certain embodiments herein provide for an encoding system for a specific set of swizzle primitives across a plurality of packed data elements in a CSA.

Type: Grant

Filed: March 30, 2019

Date of Patent: October 27, 2020

Assignee: Intel Corporation

Inventors: Jesus Corbal, Rohan Sharma, Simon Steely, Jr., Chinmay Ashok, Kent D. Glossop, Dennis Bradford, Paul Caprioli, Louise Huot, Kermin ChoFleming, Barry Tannenbaum
APPARATUSES, METHODS, AND SYSTEMS FOR MEMORY INTERFACE CIRCUIT ALLOCATION IN A CONFIGURABLE SPATIAL ACCELERATOR

Publication number: 20200310994

Abstract: Systems, methods, and apparatuses relating to memory interface circuit allocation in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator (CSA) includes a plurality of processing elements; a plurality of request address file (RAF) circuits, and a circuit switched interconnect network between the plurality of processing elements and the RAF circuits. As a dataflow architecture, embodiments of CSA have a unique memory architecture where memory accesses are decoupled into an explicit request and response phase allowing pipelining through memory. Certain embodiments herein provide for an improved memory sub-system design via the improvements to allocation discussed herein.

Type: Application

Filed: March 30, 2019

Publication date: October 1, 2020

Inventors: Kermin ChoFleming, Yu Bai, Simon C. Steely
APPARATUSES, METHODS, AND SYSTEMS FOR SWIZZLE OPERATIONS IN A CONFIGURABLE SPATIAL ACCELERATOR

Publication number: 20200310797

Abstract: Systems, methods, and apparatuses relating to swizzle operations and disable operations in a configurable spatial accelerator (CSA) are described. Certain embodiments herein provide for an encoding system for a specific set of swizzle primitives across a plurality of packed data elements in a CSA.

Type: Application

Filed: March 30, 2019

Publication date: October 1, 2020

Inventors: Jesus Corbal, Rohan Sharma, Simon Steely, Jr., Chinmay Ashok, Kent D. Glossop, Dennis Bradford, Paul Caprioli, Louise Huot, Kermin ChoFleming, Barry Tannenbaum
APPARATUSES, METHODS, AND SYSTEMS FOR IN-NETWORK STORAGE IN A CONFIGURABLE SPATIAL ACCELERATOR

Publication number: 20200210358

Abstract: Systems, methods, and apparatuses relating to in-network storage for a configurable spatial accelerator are described.

Type: Application

Filed: December 29, 2018

Publication date: July 2, 2020

Inventors: Kermin ChoFleming, Simon Steely, Jr., Kent Glossop
SIMULATED-ANNEALING BASED MEMORY ALLOCATIONS

Publication number: 20200210113

Abstract: Systems, apparatuses and methods may provide for technology that determines a plurality of memory operations associated with a data-flow graph that represents a computer code, where a spatial architecture executes the data-flow graph and the spatial architecture includes a plurality of memory controllers, randomly assigns one or more of the plurality of memory operations to one or more of the plurality of memory controllers to generate a first allocation of the plurality of memory operations to the memory controllers, and determines that the first allocation is to be stored as a permanent memory allocation based on a first performance metric associated with the first allocation.

Type: Application

Filed: March 13, 2020

Publication date: July 2, 2020

Applicant: Intel Corporation

Inventors: Yu Bai, Kermin ChoFleming
Apparatuses, methods, and systems for in-network storage in a configurable spatial accelerator

Patent number: 10678724

Abstract: Systems, methods, and apparatuses relating to in-network storage for a configurable spatial accelerator are described.

Type: Grant

Filed: December 29, 2018

Date of Patent: June 9, 2020

Assignee: Intel Corporation

Inventors: Kermin ChoFleming, Simon Steely, Jr., Kent Glossop
CONTROL SPECULATION IN DATAFLOW GRAPHS

Publication number: 20190317744

Abstract: Systems, apparatuses and methods may provide for technology that determines that a control loop is to be executed for an unspecified number of iterations and automatically forces the control loop to be executed for a fixed number of iterations in addition to the unspecified number of iterations, where execution of the control loop for the fixed number of iterations is conducted in parallel. In one example, the technology also removes one or more dataflow tokens associated with the execution of the control loop for the fixed number of iterations.

Type: Application

Filed: June 28, 2019

Publication date: October 17, 2019

Inventor: Kermin ChoFleming