Patents by Inventor Kevin Skadron

Kevin Skadron has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Reconfigurable Crypto-Processor

Publication number: 20200012495

Abstract: The present disclosure relates to systems and methods that provide a reconfigurable cryptographic coprocessor. An example system includes an instruction memory configured to provide ARX instructions and mode control instructions. The system also includes an adjustable-width arithmetic logic unit, an adjustable-width rotator, and a coefficient memory. A bit width of the adjustable-width arithmetic logic unit and a bit width of the adjustable-width rotator are adjusted according to the mode control instructions. The coefficient memory is configured to provide variable-width words to the arithmetic logic unit and the rotator. The arithmetic logic unit and the rotator are configured to carry out the ARX instructions on the provided variable-width words. The systems and methods described herein could accelerate various applications, such as deep learning, by assigning one or more of the disclosed reconfigurable coprocessors to work as a central computation unit in a neural network.
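As a rough illustration of what an ARX (add-rotate-XOR) instruction does at different word widths, the sketch below applies one add, one rotate, and one XOR to a pair of words. The function names, the order of the three operations, and the `width` parameter standing in for the mode control are assumptions made for this example; they are not taken from the application.

```python
def rotl(value: int, amount: int, width: int) -> int:
    """Rotate a width-bit word left by `amount` bits."""
    mask = (1 << width) - 1
    value &= mask
    amount %= width
    return ((value << amount) | (value >> (width - amount))) & mask


def arx_step(a: int, b: int, rotation: int, width: int) -> tuple[int, int]:
    """One hypothetical ARX step on width-bit words.

    `width` plays the role of the mode setting: the same step runs on
    8-, 16-, 32-, or 64-bit words by changing only this argument.
    """
    mask = (1 << width) - 1
    a = (a + b) & mask            # Add, modulo 2**width
    b = rotl(b, rotation, width)  # Rotate
    b ^= a                        # XOR
    return a, b
```

Calling `arx_step(0xFF, 0x01, 1, 8)` wraps the addition at 8 bits, while the same call with `width=16` keeps the carry, which is the kind of behavior an adjustable-width datapath has to select between.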

Type: Application

Filed: July 3, 2018

Publication date: January 9, 2020

Inventors: Mohamed E Aly, Wen-Mei W. Hwu, Kevin Skadron

System, method, and computer readable medium for walking pads: fast power-supply pad-placement optimization

Patent number: 10482210

Abstract: A virtual force controlled collapse chip connection (C4) pad placement optimization framework for 2D power delivery grids is proposed. The present optimization framework regards power pads as mobile “positive charged particles” and current sources as a “negative charged background.” The virtual electrostatic force is calculated from voltage gradients. This optimization framework optimizes pad locations by moving pads according to the virtual forces exerted on them by other pads and current sources in the system. Within this framework, three algorithms are proposed to meet various requirements of optimization quality and speed. These algorithms minimize resistive voltage drop (IR drop), the maximum current density, and power distribution network metal power dissipation at the same time.

Type: Grant

Filed: January 19, 2016

Date of Patent: November 19, 2019

Assignee: University of Virginia Patent Foundation

Inventors: Ke Wang, Kevin Skadron, Mircea R. Stan, Runjie Zhang, Brett Meyer

Disjunctive rule mining with finite automaton hardware

Patent number: 10474690

Abstract: The present invention introduces the development of a flexible CPU-AP (Central Processing Unit-Automata Processor) computing infrastructure for mining hierarchical patterns based on the Apriori algorithm. A novel automaton design strategy, called linear design, is described to generate automata for matching and counting hierarchical patterns and is applied to SPM (Sequential Pattern Mining). In addition, another novel automaton design strategy, called reduction design, is described for disjunctive rule matching (DRM) and counting. The present invention shows performance improvement of AP SPM and DRM solutions and broader capability over multicore and GPU (Graphics Processing Unit) implementations of GSP SPM, and shows that AP SPM and DRM solutions outperform state-of-the-art SPM algorithms SPADE and PrefixSpan (especially for larger datasets).

Type: Grant

Filed: March 31, 2017

Date of Patent: November 12, 2019

Assignee: University of Virginia Patent Foundation

Inventors: Ke Wang, Kevin Skadron, Elaheh Sadredini

Association rule mining with the micron automata processor

Patent number: 10445323

Abstract: The present invention discloses a heterogeneous computation framework for Association Rule Mining (ARM) using Micron's Automata Processor (AP). This framework is based on the Apriori algorithm. Two automaton designs are proposed to match and count the individual itemset. Several performance improvement strategies are proposed, including minimizing the number of reporting vectors and reducing reconfiguration delays. The experimental results show up to 94× speedups of the proposed AP-accelerated Apriori on six synthetic and real-world datasets when compared with the Apriori single-core CPU implementation. The proposed AP-accelerated Apriori solution also outperforms the state-of-the-art multicore and GPU implementations of the Equivalence Class Transformation (Eclat) algorithm on big datasets.
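For readers unfamiliar with the Apriori algorithm that this framework builds on, here is a minimal CPU-only sketch of its generate-count-prune loop. The function name, the absolute `min_support` count, and the data layout are assumptions for illustration; in the patented framework the matching and counting step is what the automata handle, whereas here it is an ordinary Python loop.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset seen in at least
    `min_support` transactions (textbook Apriori, software only)."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: every distinct item is a candidate itemset of size 1.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count step: how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent
```

With the five transactions `{a,b,c}`, `{a,b}`, `{a,c}`, `{b,c}`, `{a,b,c}` and `min_support=3`, all single items and pairs are frequent, while `{a,b,c}` is generated as a candidate but rejected by the count step.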

Type: Grant

Filed: September 30, 2015

Date of Patent: October 15, 2019

Assignee: UNIVERSITY OF VIRGINIA PATENT FOUNDATION

Inventors: Ke Wang, Kevin Skadron

System for placement optimization of chip design for transient noise control and related methods thereof

Patent number: 10417367

Abstract: Transient voltage noise, including resistive and reactive noise, causes timing errors at runtime. A heuristic framework, Walking Pads, is introduced to minimize transient voltage violations by optimizing power supply pad placement. It is shown that the steady-state optimal design point differs from the transient optimum, and further noise reduction can be achieved with transient optimization. The methodology significantly reduces voltage violations by balancing the average transient voltage noise of the four branches at each pad site. When pad placement is optimized using a representative stressmark, voltage violations are reduced 46-80% across 11 Parsec benchmarks with respect to the results from IR-drop-optimized pad placement. It is shown that the allocation of on-chip decoupling capacitance significantly influences the optimal locations of pads.

Type: Grant

Filed: June 1, 2015

Date of Patent: September 17, 2019

Assignee: UNIVERSITY OF VIRGINIA PATENT FOUNDATION

Inventors: Ke Wang, Kevin Skadron, Mircea R. Stan, Runjie Zhang

Methods, circuits, systems, and articles of manufacture for searching a reference sequence for a target sequence within a specified distance

Publication number: 20190258777

Abstract: A method of operating a finite state machine circuit can be provided by determining if a target sequence of characters included in a string of reference characters occurs within a specified difference distance using states indicated by the finite state machine circuit to indicate a number of character mismatches between the target sequence of characters and a respective sequence of characters within the string of reference characters.
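The idea of letting the state stand for the number of mismatches seen so far can be sketched in software as follows. This is a sequential sketch under assumptions: the function name is hypothetical, it counts substitutions only (Hamming distance), and it checks one alignment at a time, whereas a finite state machine circuit would track many alignments in parallel.

```python
def find_within_distance(reference: str, target: str, max_mismatches: int) -> list[int]:
    """Return start offsets where `target` matches `reference`
    with at most `max_mismatches` character mismatches."""
    hits = []
    length = len(target)
    for start in range(len(reference) - length + 1):
        state = 0  # current state = mismatches seen so far in this alignment
        for offset in range(length):
            if reference[start + offset] != target[offset]:
                state += 1  # move to the next mismatch-count state
                if state > max_mismatches:
                    break   # no state left for this alignment: reject it
        else:
            hits.append(start)  # reached the end in an accepting state
    return hits
```

For example, searching `"ACGTACGA"` for `"ACGT"` with one allowed mismatch reports offsets 0 (exact) and 4 (`"ACGA"`, one mismatch).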

Type: Application

Filed: February 16, 2018

Publication date: August 22, 2019

Inventors: Chunkun Bo, Kevin Skadron, Elaheh Sadredini, Vinh Dang

METHODS, CIRCUITS, AND ARTICLES OF MANUFACTURE FOR FREQUENT SUB-TREE MINING USING NON-DETERMINISTIC FINITE STATE MACHINES

Publication number: 20190228012

Abstract: A method of searching tree-structured data can be provided by identifying all labels associated with nodes in a plurality of trees including the tree-structured data, determining which of the labels is included in a percentage of the plurality of trees that exceeds a frequent threshold value to provide frequent labels, defining frequent candidate sub-trees for searching within the plurality of trees using combinations of only the frequent labels, and then searching for the frequent candidate sub-trees in the plurality of trees including the tree-structured data using a plurality of pruning kernels instantiated on a non-deterministic finite state machine to provide a less than exact count of the frequent candidate sub-trees in the plurality of trees.

Type: Application

Filed: January 14, 2019

Publication date: July 25, 2019

Inventors: Elaheh Sadredini, Kevin Skadron, Gholamreza Rahimi, Ke Wang

MEMORY SYSTEMS INCLUDING SUPPORT FOR TRANSPOSITION OPERATIONS AND RELATED METHODS AND CIRCUITS

Publication number: 20190114147

Abstract: A method of operating a memory system can be provided by reading a plurality of data words from a memory system, where each of the plurality of data words is stored in the memory system in a first dimension-major order. The plurality of data words can be shifted into a transpose memory system in the first dimension in parallel with one another using first directly time adjacent clock edges to store a plurality of transposed data words in a second dimension-major order in the transpose memory system relative to the memory system. The plurality of transposed data words can be shifted out of the transpose memory system in the second dimension using second directly time adjacent clock edges.

Type: Application

Filed: October 13, 2017

Publication date: April 18, 2019

Inventors: Mohamed Ezzat El Hadedy Aly, Kevin Skadron

DISJUNCTIVE RULE MINING WITH FINITE AUTOMATON HARDWARE

Publication number: 20180285424

Abstract: The present invention introduces the development of a flexible CPU-AP (Central Processing Unit-Automata Processor) computing infrastructure for mining hierarchical patterns based on the Apriori algorithm. A novel automaton design strategy, called linear design, is described to generate automata for matching and counting hierarchical patterns and is applied to SPM (Sequential Pattern Mining). In addition, another novel automaton design strategy, called reduction design, is described for disjunctive rule matching (DRM) and counting. The present invention shows performance improvement of AP SPM and DRM solutions and broader capability over multicore and GPU (Graphics Processing Unit) implementations of GSP SPM, and shows that AP SPM and DRM solutions outperform state-of-the-art SPM algorithms SPADE and PrefixSpan (especially for larger datasets).

Type: Application

Filed: March 31, 2017

Publication date: October 4, 2018

Applicant: UNIVERSITY OF VIRGINIA PATENT FOUNDATION

Inventors: Ke Wang, Kevin Skadron, Elaheh Sadredini

SEQUENTIAL PATTERN MINING WITH THE MICRON AUTOMATA PROCESSOR

Publication number: 20170293670

Abstract: A hardware-accelerated solution for SPM (Sequential Pattern Mining) is proposed using Micron's Automata Processor (AP), a hardware implementation of non-deterministic finite automata (NFAs). The Generalized Sequential Pattern (GSP) algorithm for SPM searching exposes massive parallelism, and is therefore well-suited for AP acceleration. The multi-pass pruning strategy of GSP is implemented using the AP's fast reconfigurability. A generalized automaton structure is proposed by flattening sequential patterns to simple strings to reduce compilation time and to minimize overhead of reconfiguration. Up to 90× and 29× speedups are achieved by the AP-accelerated GSP on six real-world datasets, when compared with the optimized multicore CPU (Central Processing Unit) and GPU (Graphics Processing Unit) GSP implementations, respectively.

Type: Application

Filed: June 30, 2016

Publication date: October 12, 2017

Applicant: University of Virginia Patent Foundation

Inventors: Ke Wang, Elaheh Sadredini, Kevin Skadron

ASSOCIATION RULE MINING WITH THE MICRON AUTOMATA PROCESSOR

Publication number: 20170091287

Abstract: The present invention discloses a heterogeneous computation framework for Association Rule Mining (ARM) using Micron's Automata Processor (AP). This framework is based on the Apriori algorithm. Two automaton designs are proposed to match and count the individual itemset. Several performance improvement strategies are proposed, including minimizing the number of reporting vectors and reducing reconfiguration delays. The experimental results show up to 94× speedups of the proposed AP-accelerated Apriori on six synthetic and real-world datasets when compared with the Apriori single-core CPU implementation. The proposed AP-accelerated Apriori solution also outperforms the state-of-the-art multicore and GPU implementations of the Equivalence Class Transformation (Eclat) algorithm on big datasets.

Type: Application

Filed: September 30, 2015

Publication date: March 30, 2017

Applicant: UNIVERSITY OF VIRGINIA PATENT FOUNDATION

Inventors: Ke Wang, Kevin Skadron

SYSTEM, METHOD, AND COMPUTER READABLE MEDIUM FOR WALKING PADS: FAST POWER-SUPPLY PAD-PLACEMENT OPTIMIZATION

Publication number: 20160210392

Abstract: A virtual force controlled collapse chip connection (C4) pad placement optimization framework for 2D power delivery grids is proposed. The present optimization framework regards power pads as mobile “positive charged particles” and current sources as a “negative charged background.” The virtual electrostatic force is calculated from voltage gradients. This optimization framework optimizes pad locations by moving pads according to the virtual forces exerted on them by other pads and current sources in the system. Within this framework, three algorithms are proposed to meet various requirements of optimization quality and speed. These algorithms minimize resistive voltage drop (IR drop), the maximum current density, and power distribution network metal power dissipation at the same time.

Type: Application

Filed: January 19, 2016

Publication date: July 21, 2016

Inventors: Ke Wang, Kevin Skadron, Mircea R. Stan, Runjie Zhang, Brett Meyer

SYSTEM FOR PLACEMENT OPTIMIZATION OF CHIP DESIGN FOR TRANSIENT NOISE CONTROL AND RELATED METHODS THEREOF

Publication number: 20150370944

Abstract: Transient voltage noise, including resistive and reactive noise, causes timing errors at runtime. A heuristic framework, Walking Pads, is introduced to minimize transient voltage violations by optimizing power supply pad placement. It is shown that the steady-state optimal design point differs from the transient optimum, and further noise reduction can be achieved with transient optimization. The methodology significantly reduces voltage violations by balancing the average transient voltage noise of the four branches at each pad site. When pad placement is optimized using a representative stressmark, voltage violations are reduced 46-80% across 11 Parsec benchmarks with respect to the results from IR-drop-optimized pad placement. It is shown that the allocation of on-chip decoupling capacitance significantly influences the optimal locations of pads.

Type: Application

Filed: June 1, 2015

Publication date: December 24, 2015

Applicant: UNIVERSITY OF VIRGINIA PATENT FOUNDATION

Inventors: Ke Wang, Kevin Skadron, Mircea R. Stan, Runjie Zhang

Policy based allocation of register file cache to threads in multi-threaded processor

Patent number: 8200949

Abstract: A multi-threaded processor system, method, and computer program product capable of utilizing a register file cache are provided for simultaneously processing a plurality of threads. A processor capable of simultaneously processing a plurality of threads is provided. The processor includes a register file and a register file cache in communication with the register file.

Type: Grant

Filed: December 9, 2008

Date of Patent: June 12, 2012

Assignee: NVIDIA Corporation

Inventors: David Tarjan, Kevin Skadron

System, method, and computer program product for removing a register of a processor from an active state

Patent number: 8078844

Abstract: A system, method, and computer program product are provided for removing a register of a processor from an active state. In operation, an aspect of a portion of a processor capable of simultaneously processing a plurality of threads is identified. Additionally, a register of the processor is conditionally removed from an active state, based on the aspect.

Type: Grant

Filed: December 9, 2008

Date of Patent: December 13, 2011

Assignee: NVIDIA Corporation

Inventors: David Tarjan, Kevin Skadron

Dynamic warp subdivision for integrated branch and memory latency divergence tolerance

Publication number: 20110219221

Abstract: Dynamic warp subdivision (DWS), which allows a single warp to occupy more than one slot in the scheduler without requiring extra register file space, is described. Independent scheduling entities also allow divergent branch paths to interleave their execution, and allow threads that hit in the cache or otherwise have divergent memory-access latency to run ahead. The result is improved latency hiding and memory level parallelism (MLP).

Type: Application

Filed: March 3, 2011

Publication date: September 8, 2011

Inventors: Kevin Skadron, Jiayuan Meng, David Tarjan

ON DEMAND REGISTER ALLOCATION AND DEALLOCATION FOR A MULTITHREADED PROCESSOR

Publication number: 20110161616

Abstract: A system for allocating and de-allocating registers of a processor. The system includes a register file having a plurality of physical registers and a first table coupled to the register file for mapping virtual register IDs to physical register IDs. A second table is coupled to the register file for determining whether a virtual register ID has a physical register mapped to it in a cycle. The first table and the second table enable physical registers of the register file to be allocated and de-allocated on a cycle-by-cycle basis to support execution of instructions by the processor.
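The two-table arrangement can be pictured with a small software model. Everything here is an assumption for illustration: the class and method names are hypothetical, and the free list used to pick a physical register is not mentioned in the abstract. Only the split between a mapping table and a "is this virtual ID currently mapped" table follows the description.

```python
class RegisterTables:
    """Toy model of on-demand virtual-to-physical register mapping."""

    def __init__(self, num_virtual: int, num_physical: int):
        self.map_table = [0] * num_virtual           # first table: virtual ID -> physical ID
        self.valid = [False] * num_virtual           # second table: is the virtual ID mapped?
        self.free_list = list(range(num_physical))   # assumed helper, not from the abstract

    def allocate(self, vreg: int) -> int:
        """Map `vreg` to a physical register if it has none yet."""
        if self.valid[vreg]:
            return self.map_table[vreg]
        if not self.free_list:
            raise RuntimeError("no free physical register")
        preg = self.free_list.pop(0)
        self.map_table[vreg] = preg
        self.valid[vreg] = True
        return preg

    def release(self, vreg: int) -> None:
        """Unmap `vreg` and return its physical register to the free pool."""
        if self.valid[vreg]:
            self.valid[vreg] = False
            self.free_list.append(self.map_table[vreg])
```

In this model, four virtual IDs can share two physical registers as long as no more than two are live at once: releasing one virtual register immediately makes its physical register available to another.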

Type: Application

Filed: December 29, 2009

Publication date: June 30, 2011

Applicant: NVIDIA CORPORATION

Inventors: David Tarjan, Kevin Skadron
