Patents by Inventor Karthikeyan Sankaralingam
Karthikeyan Sankaralingam has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11853244
Abstract: A reconfigurable hardware accelerator for computers combines a high-speed dataflow processor, having programmable functional units rapidly reconfigured in a network of programmable switches, with a stream processor that may autonomously access memory in predefined access patterns after receiving simple stream instructions. The result is a compact, high-speed processor that may exploit parallelism associated with many application-specific programs susceptible to acceleration.
Type: Grant
Filed: January 26, 2017
Date of Patent: December 26, 2023
Assignee: Wisconsin Alumni Research Foundation
Inventors: Karthikeyan Sankaralingam, Anthony Nowatzki, Vinay Gangadhar
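As a rough illustration of the "predefined access patterns after receiving simple stream instructions" idea, the sketch below models a stream instruction as a (start, stride, length) descriptor that the stream processor expands into a full address sequence on its own. The names `StreamInstruction` and `generate_addresses` are hypothetical, not from the patent.

```python
# Hypothetical sketch: a stream processor is configured once with
# (start, stride, length) and then generates the whole memory access
# pattern autonomously, with no further instructions from the core.
from dataclasses import dataclass

@dataclass
class StreamInstruction:
    start: int    # base address of the stream
    stride: int   # distance between consecutive accesses
    length: int   # number of elements to fetch

def generate_addresses(s: StreamInstruction):
    """Expand one stream instruction into its full access pattern."""
    return [s.start + i * s.stride for i in range(s.length)]

# A strided read of 4 elements starting at address 100 with stride 8:
print(generate_addresses(StreamInstruction(start=100, stride=8, length=4)))
```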
-
Patent number: 11513805
Abstract: A computer architecture employs multiple special-purpose processors having different affinities for program execution to execute substantial portions of general-purpose programs to provide improved performance with respect to a general-purpose processor executing the general-purpose program alone.
Type: Grant
Filed: August 19, 2016
Date of Patent: November 29, 2022
Assignee: Wisconsin Alumni Research Foundation
Inventors: Karthikeyan Sankaralingam, Anthony Nowatzki
-
Patent number: 11151077
Abstract: A hardware accelerator for computers combines a stand-alone, high-speed, fixed program dataflow functional element with a stream processor, the latter of which may autonomously access memory in predefined access patterns after receiving simple stream instructions and provide them to the dataflow functional element. The result is a compact, high-speed processor that may exploit fixed program dataflow functional elements.
Type: Grant
Filed: June 28, 2017
Date of Patent: October 19, 2021
Assignee: Wisconsin Alumni Research Foundation
Inventors: Karthikeyan Sankaralingam, Anthony Nowatzki, Vinay Gangadhar
-
Patent number: 11048661
Abstract: A dataflow accelerator including a control/command core, a scratchpad and a coarse grain reconfigurable array (CGRA) according to an exemplary embodiment is disclosed. The scratchpad may include a write controller to transmit data to an input vector port interface and to receive data from the input vector port interface. The CGRA may receive data from the input vector port interface and includes a plurality of interconnects and a plurality of functional units.
Type: Grant
Filed: April 15, 2019
Date of Patent: June 29, 2021
Assignee: SIMPLE MACHINES INC.
Inventors: Karthikeyan Sankaralingam, Anthony Nowatzki, Vinay Gangadhar, Preyas Shah, Newsha Ardalani
-
Patent number: 11042797
Abstract: According to exemplary embodiments, a method, processor, and system for accelerating a recurrent neural network are presented. A method of accelerating a recurrent neural network may include distributing from a first master core to each of a plurality of processing cores a same relative one or more columns of weight matrix data for each of a plurality of gates in the neural network, broadcasting a current input vector from the first master core to each of the processing cores, and processing each column of weight matrix data in parallel, at each of the respective processing cores.
Type: Grant
Filed: January 6, 2020
Date of Patent: June 22, 2021
Assignee: SIMPLEMACHINES INC.
Inventors: Karthikeyan Sankaralingam, Yunfeng Li, Vinay Gangadhar, Anthony Nowatzki
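The core computation this abstract describes, partitioning a weight matrix column-wise across cores while the input vector is broadcast to all of them, can be sketched as below. This is a minimal functional model, not the patented hardware method; the function name and the use of NumPy are illustrative assumptions.

```python
# Sketch of column-parallel matrix-vector multiply: each "core" holds a
# slice of the weight matrix's columns, receives the broadcast input
# vector, and computes a partial result; partials are summed at the end.
import numpy as np

def column_parallel_matvec(W, x, num_cores):
    # Split columns across cores (each core gets a contiguous chunk).
    col_chunks = np.array_split(np.arange(W.shape[1]), num_cores)
    # Each core multiplies its columns by the matching input elements.
    partials = [W[:, cols] @ x[cols] for cols in col_chunks]
    # Reduction: partial outputs sum to the full product W @ x.
    return np.sum(partials, axis=0)

W = np.arange(12.0).reshape(3, 4)     # stand-in for one gate's weights
x = np.array([1.0, 2.0, 3.0, 4.0])    # broadcast input vector
assert np.allclose(column_parallel_matvec(W, x, num_cores=2), W @ x)
```

In an RNN context the same partitioning would be applied once per gate (e.g. the input, forget, cell, and output gates of an LSTM), which is why the abstract distributes "the same relative" columns for each gate to a given core.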
-
Patent number: 10963384
Abstract: A method for performing acceleration of simultaneous access to shared data may include providing a plurality of groups of cores and a plurality of shared memory structures, providing a pod comprising the plurality of groups of cores linked by a common broadcast channel, and coordinating each shared memory structure to provide a logically unified memory structure. Each memory structure may be associated with a group of cores, and each group of cores may include one or more cores. The common broadcast channel may be operatively coupled to each shared memory structure. The coordinating each shared memory structure may include identifying a simultaneous read-reuse load to a first shared memory structure, fetching data corresponding to the simultaneous read-reuse load, and forwarding the data to shared memory structures other than the first shared memory structure and to groups of cores other than a first group of cores via the broadcast channel.
Type: Grant
Filed: December 18, 2019
Date of Patent: March 30, 2021
Assignee: SimpleMachines Inc.
Inventors: Karthikeyan Sankaralingam, Vinay Gangadhar, Anthony Nowatzki, Yunfeng Li
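The read-reuse mechanism in this abstract, one structure fetches the data and forwards it to its peers over the broadcast channel, can be modeled in a few lines. The `Pod` class and its fields are hypothetical stand-ins for the pod, shared memory structures, and broadcast channel named in the claim.

```python
# Hypothetical model: a read-reuse load arrives at one shared memory
# structure, which fetches the line once from backing memory and
# broadcasts it so every structure in the pod ends up holding a copy.
class Pod:
    def __init__(self, num_structures, backing_memory):
        self.memory = backing_memory                     # address -> data
        self.structures = [dict() for _ in range(num_structures)]
        self.fetch_count = 0                             # memory traffic counter

    def read_reuse_load(self, struct_id, addr):
        # struct_id identifies the structure that services the load.
        data = self.memory[addr]                         # single backing fetch
        self.fetch_count += 1
        for s in self.structures:                        # broadcast channel:
            s[addr] = data                               # forward to all structures
        return data

pod = Pod(num_structures=4, backing_memory={0x10: "line"})
pod.read_reuse_load(0, 0x10)
# All four structures hold the line after exactly one memory fetch.
assert all(0x10 in s for s in pod.structures) and pod.fetch_count == 1
```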
-
Patent number: 10936536
Abstract: Aspects of the present invention provide a memory system comprising a plurality of stacked memory layers, each memory layer divided into memory sections, wherein each memory section connects to a neighboring memory section in an adjacent memory layer, and a logic layer stacked among the plurality of memory layers, the logic layer divided into logic sections, each logic section including a memory processing core, wherein each logic section connects to a neighboring memory section in an adjacent memory layer to form a memory vault of connected logic and memory sections, and wherein each logic section is configured to communicate directly or indirectly with a host processor. Accordingly, each memory processing core may be configured to respond to a procedure call from the host processor by processing data stored in its respective memory vault and providing a result to the host processor. As a result, increased performance may be provided.
Type: Grant
Filed: April 30, 2019
Date of Patent: March 2, 2021
Assignee: Wisconsin Alumni Research Foundation
Inventors: Karthikeyan Sankaralingam, Jaikrishnan Menon, Lorenzo De Carli
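The host-to-vault interaction described here, each memory processing core answers a procedure call by computing over the data in its own vault and returning only the result, can be sketched as follows. `MemoryVault` and the `"sum"` procedure are illustrative assumptions, not part of the patent.

```python
# Hypothetical sketch of processing-in-memory vaults: each logic
# section owns a slice of memory and answers a procedure call from the
# host by computing near its data, returning only the small result.
class MemoryVault:
    def __init__(self, data):
        self.data = data                 # memory sections stacked above this logic section

    def procedure_call(self, op):
        if op == "sum":                  # compute inside the vault
            return sum(self.data)
        raise ValueError(f"unknown procedure: {op}")

# Four vaults, each holding a quarter of a 16-element dataset.
vaults = [MemoryVault(list(range(i, i + 4))) for i in range(0, 16, 4)]
# The host issues one call per vault and combines the small results,
# instead of pulling all 16 elements across the memory bus.
total = sum(v.procedure_call("sum") for v in vaults)
assert total == sum(range(16))
```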
-
Patent number: 10754744
Abstract: The amount of speed-up that can be obtained by optimizing the program to run on a different architecture is determined by static measurements of the program. Multiple such static measurements are processed by a machine learning system after being discretized to alter their accuracy vs precision. Static analysis requires less analysis overhead and permits analysis of program portions to optimize allocation of porting resources on a large program.
Type: Grant
Filed: March 15, 2016
Date of Patent: August 25, 2020
Assignee: Wisconsin Alumni Research Foundation
Inventors: Karthikeyan Sankaralingam, Newsha Ardalani, Urmish Thakker
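A small sketch of the discretization step mentioned in the abstract: continuous static measurements of a program (e.g. instruction-mix ratios) are coarsened into bins before being handed to the learned predictor, trading precision for robustness of the model's inputs. The `discretize` function and the bin setup are illustrative assumptions.

```python
# Hypothetical sketch: coarsen continuous static program measurements
# into bin indices before feeding them to a machine learning model.
import numpy as np

def discretize(features, num_bins, lo, hi):
    """Map continuous static measurements to integer bin indices."""
    edges = np.linspace(lo, hi, num_bins + 1)
    # np.digitize returns 1-based bin positions; shift and clip so every
    # value lands in a valid bin index 0 .. num_bins - 1.
    return np.clip(np.digitize(features, edges) - 1, 0, num_bins - 1)

# e.g. instruction-mix ratios in [0, 1] coarsened into 4 bins:
print(discretize(np.array([0.05, 0.30, 0.55, 0.99]), num_bins=4, lo=0.0, hi=1.0))
```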
-
Publication number: 20200218965
Abstract: According to exemplary embodiments, a method, processor, and system for accelerating a recurrent neural network are presented. A method of accelerating a recurrent neural network may include distributing from a first master core to each of a plurality of processing cores a same relative one or more columns of weight matrix data for each of a plurality of gates in the neural network, broadcasting a current input vector from the first master core to each of the processing cores, and processing each column of weight matrix data in parallel, at each of the respective processing cores.
Type: Application
Filed: January 6, 2020
Publication date: July 9, 2020
Applicant: SimpleMachines Inc.
Inventors: Karthikeyan Sankaralingam, Yunfeng Li, Vinay Gangadhar, Anthony Nowatzki
-
Publication number: 20200201690
Abstract: A method for performing acceleration of simultaneous access to shared data may include providing a plurality of groups of cores and a plurality of shared memory structures, providing a pod comprising the plurality of groups of cores linked by a common broadcast channel, and coordinating each shared memory structure to provide a logically unified memory structure. Each memory structure may be associated with a group of cores, and each group of cores may include one or more cores. The common broadcast channel may be operatively coupled to each shared memory structure. The coordinating each shared memory structure may include identifying a simultaneous read-reuse load to a first shared memory structure, fetching data corresponding to the simultaneous read-reuse load, and forwarding the data to shared memory structures other than the first shared memory structure and to groups of cores other than a first group of cores via the broadcast channel.
Type: Application
Filed: December 18, 2019
Publication date: June 25, 2020
Applicant: SimpleMachines Inc.
Inventors: Karthikeyan Sankaralingam, Vinay Gangadhar, Anthony Nowatzki, Yunfeng Li
-
Patent number: 10591983
Abstract: A specialized memory access processor is placed between a main processor and accelerator hardware to handle memory access for the accelerator hardware. The architecture of the memory access processor is designed to allow lower energy memory accesses than can be obtained by the main processor in providing data to the hardware accelerator while providing the hardware accelerator with a sufficiently high bandwidth memory channel. In some embodiments, the main processor may enter a sleep state during accelerator calculations to substantially lower energy consumption.
Type: Grant
Filed: March 14, 2014
Date of Patent: March 17, 2020
Assignee: Wisconsin Alumni Research Foundation
Inventors: Chen-Han Ho, Karthikeyan Sankaralingam, Sung Kim
-
Publication number: 20190317770
Abstract: According to some embodiments, a dataflow accelerator comprises a control/command core, a scratchpad and a coarse grain reconfigurable array (CGRA). The scratchpad comprises a write controller to transmit data to an input vector port interface and to receive data from the input vector port interface. The CGRA receives data from the input vector port interface, where the CGRA comprises a plurality of interconnects and a plurality of functional units.
Type: Application
Filed: April 15, 2019
Publication date: October 17, 2019
Applicant: SimpleMachines Inc.
Inventors: Karthikeyan Sankaralingam, Anthony Nowatzki, Vinay Gangadhar, Preyas Shah, Newsha Ardalani
-
Publication number: 20190258601
Abstract: Aspects of the present invention provide a memory system comprising a plurality of stacked memory layers, each memory layer divided into memory sections, wherein each memory section connects to a neighboring memory section in an adjacent memory layer, and a logic layer stacked among the plurality of memory layers, the logic layer divided into logic sections, each logic section including a memory processing core, wherein each logic section connects to a neighboring memory section in an adjacent memory layer to form a memory vault of connected logic and memory sections, and wherein each logic section is configured to communicate directly or indirectly with a host processor. Accordingly, each memory processing core may be configured to respond to a procedure call from the host processor by processing data stored in its respective memory vault and providing a result to the host processor. As a result, increased performance may be provided.
Type: Application
Filed: April 30, 2019
Publication date: August 22, 2019
Inventors: Karthikeyan Sankaralingam, Jaikrishnan Menon, Lorenzo De Carli
-
Patent number: 10289604
Abstract: Aspects of the present invention provide a memory system comprising a plurality of stacked memory layers, each memory layer divided into memory sections, wherein each memory section connects to a neighboring memory section in an adjacent memory layer, and a logic layer stacked among the plurality of memory layers, the logic layer divided into logic sections, each logic section including a memory processing core, wherein each logic section connects to a neighboring memory section in an adjacent memory layer to form a memory vault of connected logic and memory sections, and wherein each logic section is configured to communicate directly or indirectly with a host processor. Accordingly, each memory processing core may be configured to respond to a procedure call from the host processor by processing data stored in its respective memory vault and providing a result to the host processor. As a result, increased performance may be provided.
Type: Grant
Filed: August 7, 2014
Date of Patent: May 14, 2019
Assignee: Wisconsin Alumni Research Foundation
Inventors: Karthikeyan Sankaralingam, Jaikrishnan Menon, Lorenzo De Carli
-
Patent number: 10216693
Abstract: A dataflow computer processor is teamed with a general computer processor so that program portions of an application program particularly suited to dataflow execution may be transferred to the dataflow processor during portions of the execution of the application program by the general computer processor. During this time the general computer processor may be placed in partial shutdown for energy conservation.
Type: Grant
Filed: July 30, 2015
Date of Patent: February 26, 2019
Assignee: Wisconsin Alumni Research Foundation
Inventors: Anthony Nowatzki, Vinay Gangadhar, Karthikeyan Sankaralingam
-
Publication number: 20190004995
Abstract: A hardware accelerator for computers combines a stand-alone, high-speed, fixed program dataflow functional element with a stream processor, the latter of which may autonomously access memory in predefined access patterns after receiving simple stream instructions and provide them to the dataflow functional element. The result is a compact, high-speed processor that may exploit fixed program dataflow functional elements.
Type: Application
Filed: June 28, 2017
Publication date: January 3, 2019
Inventors: Karthikeyan Sankaralingam, Anthony Nowatzki, Vinay Gangadhar
-
Publication number: 20180210730
Abstract: A reconfigurable hardware accelerator for computers combines a high-speed dataflow processor, having programmable functional units rapidly reconfigured in a network of programmable switches, with a stream processor that may autonomously access memory in predefined access patterns after receiving simple stream instructions. The result is a compact, high-speed processor that may exploit parallelism associated with many application-specific programs susceptible to acceleration.
Type: Application
Filed: January 26, 2017
Publication date: July 26, 2018
Inventors: Karthikeyan Sankaralingam, Anthony Nowatzki, Vinay Gangadhar
-
Publication number: 20180052693
Abstract: A computer architecture employs multiple special-purpose processors having different affinities for program execution to execute substantial portions of general-purpose programs to provide improved performance with respect to a general-purpose processor executing the general-purpose program alone.
Type: Application
Filed: August 19, 2016
Publication date: February 22, 2018
Inventors: Karthikeyan Sankaralingam, Anthony Nowatzki
-
Publication number: 20170270424
Abstract: The amount of speed-up that can be obtained by optimizing the program to run on a different architecture is determined by static measurements of the program. Multiple such static measurements are processed by a machine learning system after being discretized to alter their accuracy vs precision. Static analysis requires less analysis overhead and permits analysis of program portions to optimize allocation of porting resources on a large program.
Type: Application
Filed: March 15, 2016
Publication date: September 21, 2017
Inventors: Karthikeyan Sankaralingam, Newsha Ardalani, Urmish Thakker
-
Patent number: 9619233
Abstract: A computer architecture allows for simplified exception handling by restarting the program after exceptions at the beginning of idempotent regions, the idempotent regions allowing re-execution without the need for restoring complex state information from checkpoints. Recovery from mis-speculation may be provided by a similar mechanism but using smaller idempotent regions reflecting a more frequent occurrence of mis-speculation. A compiler generating different idempotent regions for speculation and exception handling is also disclosed.
Type: Grant
Filed: February 19, 2016
Date of Patent: April 11, 2017
Assignee: Wisconsin Alumni Research Foundation
Inventors: Jaikrishnan Menon, Marc Asher De Kruijf, Karthikeyan Sankaralingam
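The recovery idea here, a region that never overwrites its own inputs can simply be re-executed from its start after a fault, with no checkpoint to restore, can be illustrated with a short sketch. The function names and the simulated transient fault are hypothetical, not taken from the patent.

```python
# Hypothetical sketch: an idempotent region reads its inputs and writes
# only fresh outputs, so recovery from an exception is just re-execution
# from the start of the region -- no checkpointed state is restored.
def run_idempotent_region(region_fn, inputs, max_retries=3):
    for _ in range(max_retries):
        try:
            return region_fn(inputs)   # region leaves `inputs` untouched
        except RuntimeError:
            continue                   # safe to simply restart the region
    raise RuntimeError("region failed after retries")

faults = {"remaining": 2}              # simulate two transient faults
def region(xs):
    if faults["remaining"] > 0:
        faults["remaining"] -= 1
        raise RuntimeError("transient fault")
    return [x * x for x in xs]         # outputs produced only on success

assert run_idempotent_region(region, [1, 2, 3]) == [1, 4, 9]
```

The trade-off the abstract mentions falls out of this model: smaller regions mean less work repeated per restart, which suits frequent mis-speculation, while larger regions with less bookkeeping suffice for rare exceptions.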