Patents by Inventor Michael Kinsner

Michael Kinsner has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Data parallel programming task graph optimization through device telemetry

Patent number: 12663970

Abstract: An apparatus to facilitate data parallel programming task graph optimization through device telemetry is disclosed. The apparatus includes a processor to: receive, from a compiler, compiled code generated from source code of an application, the compiled code to support a workload of the application; generate a task graph of the application using the compiled code, the task graph to represent at least one of a relationship or dependency of the compiled code; receive runtime telemetry data corresponding to execution of the compiled code on the one or more accelerator devices; identify one or more scheduling optimizations for the one or more accelerator devices based on the task graph and the received telemetry data; and provide a scheduling command to cause the one or more scheduling optimizations to be implemented in the one or more accelerator devices.
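As a rough illustration of the idea in this abstract, the sketch below orders the kernels of a small task graph so that dependencies are respected and, among kernels that are ready at the same time, the one with the longest observed runtime goes first. The heuristic, the `schedule` function, and the kernel names and telemetry values are all assumptions made for this example; the abstract does not say which scheduling optimizations the apparatus identifies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def schedule(deps, telemetry_ms):
    """Return a dependency-respecting kernel order.

    deps maps each kernel to the set of kernels it depends on.
    telemetry_ms maps each kernel to an observed runtime (hypothetical data).
    Among kernels that become ready together, the slowest observed one is
    placed first; ties are broken by name.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    order = []
    while sorter.is_active():
        ready = sorted(
            sorter.get_ready(),
            key=lambda k: (-telemetry_ms.get(k, 0.0), k),
        )
        for kernel in ready:
            order.append(kernel)
            sorter.done(kernel)
    return order


# Hypothetical task graph: b and c depend on a, d depends on both.
deps = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
telemetry_ms = {"a": 1.0, "b": 2.0, "c": 5.0, "d": 1.0}
print(schedule(deps, telemetry_ms))
```

Here `c` is placed ahead of `b` because the made-up telemetry reports it as slower, while `a` and `d` keep the positions the dependencies force on them.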

Type: Grant

Filed: March 11, 2022

Date of Patent: June 23, 2026

Assignee: INTEL CORPORATION

Inventors: Michael Kinsner, Ben J. Ashbaugh, James Brodman, Rajesh Poornachandran

Apparatus, device, method, and computer program for generating logic to be performed by computing circuitry of a computing architecture

Patent number: 12602528

Abstract: Examples relate to an apparatus, device, method, and computer program for generating logic to be performed by computing circuitry of a computing architecture. The apparatus is configured to determine a performance-critical compute path of a compute kernel to be executed on a plurality of units of computing circuitry of a computing architecture, the compute kernel comprising a plurality of interdependent groups of computational instructions, with the performance-critical compute path being based on a subset of the interdependent groups of computational instructions. The apparatus is configured to determine, for at least one group of computational instructions outside the performance-critical compute path, a reduced clock frequency being lower than a maximally feasible clock frequency of the respective group of computational instructions.
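One way to picture the technique in this abstract is a critical-path calculation over a graph of instruction groups, followed by slowing each off-path group just enough that it still finishes before its successors need it. The sketch below is a simplified model under stated assumptions: each group has a cycle count and a maximum feasible clock, groups start as early as possible, and the `reduced_clocks` function, group names, and numbers are hypothetical. It is not the method claimed in the patent.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def reduced_clocks(groups, deps):
    """Pick a clock per group without lengthening the overall schedule.

    groups maps name -> (cycles, fmax_mhz); deps maps name -> set of
    predecessor names. Each group is stretched only into its free slack
    (the gap before its earliest successor starts), so groups on the
    critical path keep their maximum frequency.
    """
    graph = {g: deps.get(g, set()) for g in groups}
    order = list(TopologicalSorter(graph).static_order())
    # Duration at the maximum clock, in microseconds (cycles / MHz).
    dur = {g: cycles / fmax for g, (cycles, fmax) in groups.items()}

    start, finish = {}, {}
    for g in order:
        start[g] = max((finish[p] for p in graph[g]), default=0.0)
        finish[g] = start[g] + dur[g]
    makespan = max(finish.values())

    succs = {g: [s for s in groups if g in graph[s]] for g in groups}
    freqs = {}
    for g, (cycles, _fmax) in groups.items():
        deadline = min((start[s] for s in succs[g]), default=makespan)
        freqs[g] = cycles / (deadline - start[g])
    return freqs


# Hypothetical kernel: "stats" runs beside the much longer "fft".
groups = {
    "load": (1000, 500),
    "fft": (4000, 400),
    "stats": (1000, 500),
    "store": (500, 250),
}
deps = {"fft": {"load"}, "stats": {"load"}, "store": {"fft", "stats"}}
for name, mhz in reduced_clocks(groups, deps).items():
    print(f"{name}: {mhz:.0f} MHz")
```

In this example only `stats` lies off the critical path, so it is the only group whose clock drops below its maximum. Using free slack rather than total slack keeps the result safe when several off-path groups are slowed at once.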

Type: Grant

Filed: June 28, 2022

Date of Patent: April 14, 2026

Assignee: Altera Corporation

Inventors: Rajesh Poornachandran, Michael Kinsner, John Freeman, Joseph Garvey, Artem Radzikhovskyy

CLOCK GATING AND CLOCK SCALING BASED ON RUNTIME APPLICATION TASK GRAPH INFORMATION

Publication number: 20260072659

Abstract: An apparatus to facilitate clock gating and clock scaling based on runtime application task graph information is disclosed. The apparatus includes a processor to: receive, from a compiler, a bitstream generated from code of an application, the bitstream related to a workload of the application; generate a task graph of the application using at least part of the bitstream, the task graph to represent one of a relationship and dependency of the code; program the bitstream to an accelerator device, wherein the bitstream to configure the accelerator device to support the workload of the application; execute one or more kernels of the code using the accelerator device; identify one or more optimizations for the accelerator device based on the task graph of the application; and transmit a command to cause the one or more optimizations to be implemented in the at least one region of the accelerator device.

Type: Application

Filed: November 12, 2025

Publication date: March 12, 2026

Inventors: Michael Kinsner, Rajesh Poornachandran, John Freeman

Incremental just-in-time (JIT) performance refinement for programmable logic device offload

Patent number: 12493454

Abstract: An apparatus to facilitate incremental just-in-time (JIT) performance refinement for programmable logic device offload is disclosed. The apparatus includes a processor to: initiate multiple just-in-time (JIT) compilation iterations of an application; program a first architecture of a first compilation of the multiple JIT compilation iterations to a programmable logic device and execute the application on the first architecture, wherein the first compilation comprises a faster compilation time amongst the multiple JIT compilation iterations; identify a hotspot; determine that a second compilation of the multiple JIT compilation iterations is complete, wherein the second compilation comprises a slower compilation time than the first compilation; and program a second architecture of the second compilation of the multiple JIT compilation iterations to the programmable logic device and execute the application on the second architecture.
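The flow in this abstract can be sketched as two compilations started together: the application begins running on the result of the quick one and switches to the result of the slow one once it is ready. The example below only simulates this with sleeping threads. `compile_design`, `run_with_refinement`, the build names, and the timings are placeholders invented for illustration, and hotspot identification is not modelled.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def compile_design(name, seconds):
    """Stand-in for a real compilation; it just sleeps, then returns a label."""
    time.sleep(seconds)
    return name


def run_with_refinement(work_items):
    """Process items on the quick build, then switch to the tuned build."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        quick = pool.submit(compile_design, "quick", 0.01)
        tuned = pool.submit(compile_design, "tuned", 0.30)

        active = quick.result()  # start as soon as the fast compile is done
        log = []
        for item in work_items:
            # Swap in the slower, better-optimized build once it completes.
            if active != "tuned" and tuned.done():
                active = tuned.result()
            log.append((item, active))
            time.sleep(0.02)  # simulated execution time per work item
        return log


log = run_with_refinement(range(40))
print(log[0], log[-1])
```

The first work items are served by the quick build and the later ones by the tuned build; the exact switch-over point depends on thread timing.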

Type: Grant

Filed: March 11, 2022

Date of Patent: December 9, 2025

Assignee: Altera Corporation

Inventors: Michael Kinsner, John Freeman, Ben J. Ashbaugh, Rajesh Poornachandran

Techniques For Coarse Grained And Fine Grained Configurations Of Configurable Logic Circuits

Publication number: 20240193331

Abstract: An integrated circuit includes configurable logic circuit blocks that are configurable with a first configuration bitstream according to a coarse grained configuration. The coarse grained configuration implements an aggregate circuit structure of the configurable logic circuit blocks. The configurable logic circuit blocks are configurable with a second configuration bitstream according to a fine grained configuration. A total number of the first and the second configuration bits is fewer than a single fine grained configuration bitstream.

Type: Application

Filed: February 22, 2024

Publication date: June 13, 2024

Applicant: Altera Corporation

Inventors: Michael Kinsner, Byron Sinclair, Gregory Nash

Fast CAD Compilation Through Coarse Macro Lowering

Publication number: 20240020449

Abstract: Systems or methods of the present disclosure may provide a library including multiple macros that may be pre-compiled prior to implementation of the design. For example, a design may be mapped to one or more macros in the library, and the one or more macros may be placed into and routed between a portion of a region, one region, or multiple regions of the integrated circuit device to implement the design. Since the macros may be pre-compiled, compilation time experienced by the designer may correspond to the placement and routing of the one or more macros, which may be less than compilation time for fine-grained operations. The pre-compiled logic within the macros may be set using a lookup table mask to set and/or adjust a functionality of the macro. Additionally or alternatively, the place and route operation may be performed at finer granularities to reduce bottlenecks.

Type: Application

Filed: September 27, 2023

Publication date: January 18, 2024

Inventors: Byron Sinclair, Deshanand P. Singh, Gregg William Baeckler, Mahesh A. Iyer, Michael Kinsner, Chengping Liang, Victor Tzi-on Zhang

FAST FPGA COMPILATION THROUGH BITSTREAM STITCHING

Publication number: 20230333826

Abstract: Systems or methods of the present disclosure may provide a library including multiple regional bitstreams that may be pre-generated by a manufacturer and/or custom generated by a designer that may be used to implement a design onto an integrated circuit device. The design may be decomposed into one or more regional bitstreams and stitched to form a larger combined bitstream to be implemented as coarse-grained operations on the integrated circuit device, thereby decreasing compilation time experienced by the designer. The combined bitstreams may be loaded into all or a portion of the integrated circuit device to realize the design. Additionally or alternatively, the integrated circuit device may include hardened networks-on-chip to improve data routing within the combined bitstream.

Type: Application

Filed: June 20, 2023

Publication date: October 19, 2023

Inventors: Michael Kinsner, Byron Sinclair, Deshanand P. Singh, Scott Jeremy Weber, Mahesh A. Iyer, Chengping Liang, Victor Tzi-on Zhang, Gabriel Quan

FAST FPGA COMPILATION FROM SOFTWARE FLOWS THROUGH PARTIAL RECONFIGURATION AND HARDENED NETWORK-ON-CHIP

Publication number: 20230237230

Abstract: Systems or methods of the present disclosure may provide a library including multiple personas that may be pre-generated by a manufacturer and/or custom generated by a designer that may be used to implement a design onto an integrated circuit device. The design may be decomposed into one or more personas to be implemented as coarse-grained operations on the integrated circuit device, thereby decreasing compilation time experienced by the designer. The personas may be loaded into one or more regions of the integrated circuit device to realize the design. That is, one persona may be implemented across multiple regions, one region may be configured by multiple personas, one persona may configure one region, or any combination thereof. Additionally or alternatively, the integrated circuit device may include networks-on-chip to improve data routing between the regions.

Type: Application

Filed: March 28, 2023

Publication date: July 27, 2023

Inventors: Michael Kinsner, Byron Sinclair, Deshanand P. Singh, Scott Jeremy Weber, Anandh Venkateswaran, Mahesh A. Iyer

Modular Compilation Flows for a Programmable Logic Device

Publication number: 20230237231

Abstract: Systems or methods of the present disclosure may provide an electronic device that includes memory storing instructions and a processor that, when executing the instructions, is to receive a design for a programmable fabric of an integrated circuit device. The instructions are also to cause the processor to cause compilation of the design into a configuration during a compilation window. The instructions further are to cause the processor to determine at least some routing for the configuration outside of the compilation window.

Type: Application

Filed: March 28, 2023

Publication date: July 27, 2023

Inventors: Byron Sinclair, Michael Kinsner, Gabriel Quan, Victor Tzi-on Zhang, Mahesh A. Iyer, Chengping Liang, Deshanand P. Singh

Apparatus, Device, Method, and Computer Program for Scheduling an Execution of Compute Kernels

Publication number: 20220365813

Abstract: Examples relate to an apparatus, a device, a method, and a computer program for scheduling an execution of compute kernels on one or more computing devices, and to a computer system comprising such an apparatus or device. The apparatus comprises processing circuitry and interface circuitry. The processing circuitry is configured to determine an impending execution of two or more compute kernels to the one or more computing devices. The processing circuitry is configured to pipeline a data transfer related to the execution of the two or more compute kernels to the one or more computing devices via the interface circuitry.
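The benefit of pipelining data transfers can be estimated with a small timing model in which the transfer for the next kernel overlaps with the computation of the current one. The sketch below compares that against a fully serial schedule. It assumes a single transfer link, a single compute device, and made-up transfer and compute times; `serial_time` and `pipelined_time` are hypothetical helpers, not part of any runtime described in the abstract.

```python
def serial_time(kernels):
    """Total time when each transfer and each computation runs one after another.

    kernels is a list of (transfer_time, compute_time) pairs.
    """
    return sum(transfer + compute for transfer, compute in kernels)


def pipelined_time(kernels):
    """Total time when transfers run back to back and overlap with computation."""
    transfer_done = 0.0
    compute_done = 0.0
    for transfer, compute in kernels:
        transfer_done += transfer
        # A kernel starts once its data has arrived and the device is free.
        compute_done = max(compute_done, transfer_done) + compute
    return compute_done


# Hypothetical (transfer, compute) times for three kernels, in milliseconds.
kernels = [(2, 5), (3, 4), (1, 6)]
print(serial_time(kernels), pipelined_time(kernels))
```

With these numbers only the first transfer is exposed; the other two are hidden behind computation, which is why the pipelined total is shorter than the serial one.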

Type: Application

Filed: June 28, 2022

Publication date: November 17, 2022

Inventors: Rajesh POORNACHANDRAN, Ben J. ASHBAUGH, Gregory LUECK, James BRODMAN, Simon PENNYCOOK, Michael KINSNER, Roland SCHULZ

Apparatus, Device, Method, and Computer Program for Generating Logic to be Performed by Computing Circuitry of a Computing Architecture

Publication number: 20220327267

Abstract: Examples relate to an apparatus, device, method, and computer program for generating logic to be performed by computing circuitry of a computing architecture. The apparatus is configured to determine a performance-critical compute path of a compute kernel to be executed on a plurality of units of computing circuitry of a computing architecture, the compute kernel comprising a plurality of interdependent groups of computational instructions, with the performance-critical compute path being based on a subset of the interdependent groups of computational instructions. The apparatus is configured to determine, for at least one group of computational instructions outside the performance-critical compute path, a reduced clock frequency being lower than a maximally feasible clock frequency of the respective group of computational instructions.

Type: Application

Filed: June 28, 2022

Publication date: October 13, 2022

Inventors: Rajesh POORNACHANDRAN, Michael KINSNER, John FREEMAN, Joseph GARVEY, Artem RADZIKHOVSKYY

DATA PARALLEL PROGRAMMING TASK GRAPH OPTIMIZATION THROUGH DEVICE TELEMETRY

Publication number: 20220197615

Abstract: An apparatus to facilitate data parallel programming task graph optimization through device telemetry is disclosed. The apparatus includes a processor to: receive, from a compiler, compiled code generated from source code of an application, the compiled code to support a workload of the application; generate a task graph of the application using the compiled code, the task graph to represent at least one of a relationship or dependency of the compiled code; receive runtime telemetry data corresponding to execution of the compiled code on the one or more accelerator devices; identify one or more scheduling optimizations for the one or more accelerator devices based on the task graph and the received telemetry data; and provide a scheduling command to cause the one or more scheduling optimizations to be implemented in the one or more accelerator devices.

Type: Application

Filed: March 11, 2022

Publication date: June 23, 2022

Applicant: Intel Corporation

Inventors: Michael Kinsner, Ben J. Ashbaugh, James Brodman, Rajesh Poornachandran

DATA PARALLEL PROGRAMMING-BASED TRANSPARENT TRANSFER ACROSS HETEROGENEOUS DEVICES

Publication number: 20220197715

Abstract: An apparatus to facilitate data parallel programming-based transparent transfer across heterogeneous devices is disclosed. The apparatus includes a processor to: identify a change in device status that triggers a device transfer process from an original device, wherein the original device is associated with a queue of an application program of a data parallel programming runtime; identify a new device that is compatible with the original device; migrate at least one of a state or data of the original device to the new device; logically map, without user intervention, the queue to the new device in the data parallel programming runtime; and initiate execution of the application program on the new device using the queue.
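The steps listed in this abstract (detect a status change, find a compatible device, migrate data, remap the queue) can be mimicked with a toy runtime. Everything in the sketch below is hypothetical: the `Device` and `Runtime` classes, the rule that "compatible" means "same kind", and the device names are assumptions for illustration and do not correspond to any real data parallel programming runtime API.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    kind: str
    healthy: bool = True
    memory: dict = field(default_factory=dict)


class Runtime:
    """Toy runtime that keeps a queue-to-device mapping."""

    def __init__(self, devices):
        self.devices = devices
        self.queue_to_device = {}

    def bind(self, queue, device):
        self.queue_to_device[queue] = device

    def on_status_change(self, device):
        """If the device became unhealthy, move its queues elsewhere."""
        if device.healthy:
            return device
        # Assumption: a device of the same kind counts as compatible.
        replacement = next(
            (d for d in self.devices
             if d is not device and d.healthy and d.kind == device.kind),
            None,
        )
        if replacement is None:
            raise RuntimeError(f"no compatible device for {device.name}")
        replacement.memory.update(device.memory)  # migrate data
        for queue, bound in list(self.queue_to_device.items()):
            if bound is device:
                self.queue_to_device[queue] = replacement  # remap the queue
        return replacement


gpu0 = Device("gpu0", "gpu", memory={"buf": [1, 2, 3]})
gpu1 = Device("gpu1", "gpu")
runtime = Runtime([gpu0, gpu1])
runtime.bind("q", gpu0)

gpu0.healthy = False
runtime.on_status_change(gpu0)
print(runtime.queue_to_device["q"].name)
```

Application code that only holds the queue name `"q"` keeps working after the switch, since it never refers to a device directly.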

Type: Application

Filed: March 11, 2022

Publication date: June 23, 2022

Applicant: Intel Corporation

Inventors: Ben J. Ashbaugh, Michael Kinsner, James Brodman, Rajesh Poornachandran

CLOCK GATING AND CLOCK SCALING BASED ON RUNTIME APPLICATION TASK GRAPH INFORMATION

Publication number: 20220197613

Abstract: An apparatus to facilitate clock gating and clock scaling based on runtime application task graph information is disclosed. The apparatus includes a processor to: receive, from a compiler, a bitstream generated from code of an application, the bitstream related to a workload of the application; generate a task graph of the application using at least part of the bitstream, the task graph to represent one of a relationship and dependency of the code; program the bitstream to an accelerator device, wherein the bitstream to configure the accelerator device to support the workload of the application; execute one or more kernels of the code using the accelerator device; identify one or more optimizations for the accelerator device based on the task graph of the application; and transmit a command to cause the one or more optimizations to be implemented in the at least one region of the accelerator device.

Type: Application

Filed: March 11, 2022

Publication date: June 23, 2022

Applicant: Intel Corporation

Inventors: Michael Kinsner, Rajesh Poornachandran, John Freeman

INCREMENTAL JUST-IN-TIME (JIT) PERFORMANCE REFINEMENT FOR PROGRAMMABLE LOGIC DEVICE OFFLOAD

Publication number: 20220197610

Abstract: An apparatus to facilitate incremental just-in-time (JIT) performance refinement for programmable logic device offload is disclosed. The apparatus includes a processor to: initiate multiple just-in-time (JIT) compilation iterations of an application; program a first architecture of a first compilation of the multiple JIT compilation iterations to a programmable logic device and execute the application on the first architecture, wherein the first compilation comprises a faster compilation time amongst the multiple JIT compilation iterations; identify a hotspot; determine that a second compilation of the multiple JIT compilation iterations is complete, wherein the second compilation comprises a slower compilation time than the first compilation; and program a second architecture of the second compilation of the multiple JIT compilation iterations to the programmable logic device and execute the application on the second architecture.

Type: Application

Filed: March 11, 2022

Publication date: June 23, 2022

Applicant: Intel Corporation

Inventors: Michael Kinsner, John Freeman, Ben J. Ashbaugh, Rajesh Poornachandran