Patents by Inventor Kevin Skadron
Kevin Skadron has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230401034
Abstract: Disclosed herein are systems, methods, and computer-readable media for sorting datasets within a Processing in Memory (PIM)-based system. A request to sort a dataset stored in a 3D-stacked memory can be received. The request can identify a specific dataset and sorting criteria, which includes a plurality of keys. The dataset can be partitioned into several subarrays across various memory banks within the 3D-stacked memory. Each piece of data within these subarrays can be separated into buckets based on the keys. Local histograms for each subarray and bank histograms based on the local histograms can be generated. A prefix-sum operation on the bank histograms can determine individual positions for the sorted dataset. Aggregation of the subarrays from all memory banks can form the sorted dataset, which can be subsequently returned.
Type: Application
Filed: June 7, 2023
Publication date: December 14, 2023
Inventors: Marzieh Lenjani, Alif Ahmed, Kevin Skadron
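The histogram and prefix-sum steps described in this abstract resemble a classic counting sort. The following is only a software sketch of that general idea, not the claimed hardware method: the function name `histogram_sort`, the use of plain Python lists as stand-ins for memory subarrays, and the single merged histogram (in place of per-bank histograms) are all assumptions made for illustration.

```python
from itertools import accumulate

def histogram_sort(subarrays, num_buckets, key):
    """Counting-sort sketch: local histograms -> merged histogram -> prefix sum -> scatter."""
    # One local histogram per subarray (a list here stands in for a memory subarray).
    local = []
    for sub in subarrays:
        hist = [0] * num_buckets
        for item in sub:
            hist[key(item)] += 1
        local.append(hist)

    # Merge the local histograms into one combined histogram (assumed single bank).
    merged = [sum(column) for column in zip(*local)] or [0] * num_buckets

    # An exclusive prefix sum gives the first output position of each bucket.
    starts = [0] + list(accumulate(merged))[:-1]

    # Scatter every item to its final position, bucket by bucket.
    out = [None] * sum(merged)
    cursor = list(starts)
    for sub in subarrays:
        for item in sub:
            bucket = key(item)
            out[cursor[bucket]] = item
            cursor[bucket] += 1
    return out
```

For example, `histogram_sort([[3, 1], [2, 0, 1]], 4, lambda x: x)` places each value at the position computed from the prefix sum and returns the values in ascending order.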
-
Publication number: 20230385258
Abstract: Disclosed herein is a Dynamic Random Access Memory-Based Content-Addressable Memory (DRAM-CAM) architecture and methods relating thereto. The DRAM-CAM architecture can include a memory array, with the data organized into blocks including rows and columns. Input data can be converted into a format with first and second groups of columns. Each first group can correspond to one or more rows of the input data, and each second group can include one or more null columns. A query can be received and loaded into an available column of the second group, and pattern matching can be performed on the data to identify occurrences of elements defined by the query. The pattern matching can be performed concurrently on the first groups of columns and the available columns bit by bit. Results can include a count or location of each identified element.
Type: Application
Filed: May 10, 2023
Publication date: November 30, 2023
Inventors: Lingxi Wu, Kevin Skadron
-
Publication number: 20230343373
Abstract: An integrated circuit memory device can include a plurality of banks of memory, each of the banks of memory including a first pair of sub-arrays comprising first and second sub-arrays, the first pair of sub-arrays configured to store data in memory cells of the first pair of sub-arrays, a first row buffer memory circuit located in the integrated circuit memory device adjacent to the first pair of sub-arrays and configured to store first row data received from the first pair of sub-arrays and configured to transfer the row data into and/or out of the first row buffer memory circuit, and a first sub-array level processor circuit in the integrated circuit memory device adjacent to the first pair of sub-arrays and operatively coupled to the first row data, wherein the first sub-array level processor circuit is configured to perform column oriented processing a sparse matrix kernel stored, at least in-part, in the first pair of sub-arrays, with input vector values stored, at least in part, in the first pair of sub
Type: Application
Filed: April 25, 2023
Publication date: October 26, 2023
Inventors: KEVIN SKADRON, MARZIEH LENJANI
-
Patent number: 11776594
Abstract: Apparatus includes a plurality of memory cells (e.g., a dynamic random access memory (DRAM)) addressable as rows and columns and a plurality of matching circuits configured to be coupled to respective bit lines associated with the columns. A control circuit is configured to store respective reference sequences (e.g., binary-encoded k-mer patterns) in respective ones of the columns, to sequentially provide rows of bits stored in the memory cells and bits of a query to the matching circuits, and to identify one of the reference sequences as corresponding to the query responsive to comparisons by the matching circuits.
Type: Grant
Filed: August 31, 2021
Date of Patent: October 3, 2023
Assignee: University of Virginia Patent Foundation
Inventors: Kevin Skadron, Marzieh Lenjani, Abdolrasoul Sharifi, Lingxi Wu
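The row-by-row comparison described in this abstract can be pictured in software as follows. This is a rough sketch only: the function name `match_columns` and the use of bit strings for columns are hypothetical, and the boolean flag per column is merely an assumed stand-in for whatever state the matching circuits actually hold.

```python
def match_columns(references, query):
    """Return the indices of columns whose stored bit sequence equals the query.

    `references` is a list of equal-length bit strings, one per column;
    `query` is a bit string of the same length.
    """
    if any(len(ref) != len(query) for ref in references):
        raise ValueError("every reference must have the same length as the query")

    # One flag per column, playing the role of a per-bit-line matching circuit.
    still_matching = [True] * len(references)

    # Present one row of stored bits and one query bit at a time.
    for row, query_bit in enumerate(query):
        for col, ref in enumerate(references):
            if ref[row] != query_bit:
                still_matching[col] = False

    return [col for col, ok in enumerate(still_matching) if ok]
```

Each pass of the outer loop handles one row across all columns, which is the part a hardware design would perform in parallel.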
-
Publication number: 20230072191
Abstract: Apparatus includes a plurality of memory cells (e.g., a dynamic random access memory (DRAM)) addressable as rows and columns and a plurality of matching circuits configured to be coupled to respective bit lines associated with the columns. A control circuit is configured to store respective reference sequences (e.g., binary-encoded k-mer patterns) in respective ones of the columns, to sequentially provide rows of bits stored in the memory cells and bits of a query to the matching circuits, and to identify one of the reference sequences as corresponding to the query responsive to comparisons by the matching circuits.
Type: Application
Filed: August 31, 2021
Publication date: March 9, 2023
Inventors: Kevin Skadron, Marzieh Lenjani, Abdolrasoul Sharifi, Lingxi Wu
-
Publication number: 20220415440
Abstract: A method of operating a finite state machine circuit can be provided by determining if a target sequence of characters included in a string of reference characters occurs within a specified difference distance using states indicated by the finite state machine circuit to indicate a number of character mis-matches between the target sequence of characters and a respective sequence of characters within the string of reference characters.
Type: Application
Filed: July 14, 2022
Publication date: December 29, 2022
Inventors: Chunkun BO, Kevin SKADRON, Elaheh SADREDINI, Vinh DANG
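Using states to track a number of mismatches, as this abstract describes, can be illustrated with a small software simulation of a nondeterministic automaton. The sketch below assumes the difference distance counts substitutions only (a Hamming distance); the function name `find_within_mismatches` and the tuple encoding of states are hypothetical and are not taken from the patent.

```python
def find_within_mismatches(reference, target, max_mismatches):
    """Return start indices where `target` occurs in `reference`
    with at most `max_mismatches` substituted characters."""
    if not target:
        return []
    hits = []
    # Each active state is (start index, characters consumed, mismatches so far).
    active = set()
    for i, ch in enumerate(reference):
        active.add((i, 0, 0))  # a new match attempt may begin at every character
        following = set()
        for start, pos, misses in active:
            misses += ch != target[pos]  # True counts as 1
            if misses > max_mismatches:
                continue  # this attempt leaves the automaton
            if pos + 1 == len(target):
                hits.append(start)  # accepting state reached
            else:
                following.add((start, pos + 1, misses))
        active = following
    return sorted(hits)
```

For instance, searching `"ACGTACCT"` for `"ACGT"` with one allowed mismatch reports the exact occurrence at index 0 and the one-substitution occurrence `"ACCT"` at index 4.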
-
Patent number: 11436401
Abstract: Transient voltage noise, including resistive and reactive noise, causes timing errors at runtime. A heuristic framework, Walking Pads, is introduced to minimize transient voltage violations by optimizing power supply pad placement. It is shown that the steady-state optimal design point differs from the transient optimum, and further noise reduction can be achieved with transient optimization. The methodology significantly reduces voltage violations by balancing the average transient voltage noise of the four branches at each pad site. When pad placement is optimized using a representative stressmark, voltage violations are reduced 46-80% across 11 Parsec benchmarks with respect to the results from IR-drop-optimized pad placement. It is shown that the allocation of on-chip decoupling capacitance significantly influences the optimal locations of pads.
Type: Grant
Filed: September 16, 2019
Date of Patent: September 6, 2022
Assignee: UNIVERSITY OF VIRGINIA PATENT FOUNDATION
Inventors: Ke Wang, Kevin Skadron, Mircea R. Stan, Runjie Zhang
-
Patent number: 11393558
Abstract: A method of operating a finite state machine circuit can be provided by determining if a target sequence of characters included in a string of reference characters occurs within a specified difference distance using states indicated by the finite state machine circuit to indicate a number of character mis-matches between the target sequence of characters and a respective sequence of characters within the string of reference characters.
Type: Grant
Filed: February 16, 2018
Date of Patent: July 19, 2022
Assignee: University of Virginia Patent Foundation
Inventors: Chunkun Bo, Kevin Skadron, Elaheh Sadredini, Vinh Dang
-
Patent number: 11314750
Abstract: A method of searching tree-structured data can be provided by identifying all labels associated with nodes in a plurality of trees including the tree-structured data, determining which of the labels is included in a percentage of the plurality of trees that exceeds a frequent threshold value to provide frequent labels, defining frequent candidate sub-trees for searching within the plurality of trees using combinations of only the frequent labels, and then searching for the frequent candidate sub-trees in the plurality of trees including the tree-structured data using a plurality of pruning kernels instantiated on a non-deterministic finite state machine to provide a less than exact count of the frequent candidate sub-trees in the plurality of trees.
Type: Grant
Filed: January 14, 2019
Date of Patent: April 26, 2022
Assignee: University of Virginia Patent Foundation
Inventors: Elaheh Sadredini, Kevin Skadron, Gholamreza Rahimi, Ke Wang
-
Publication number: 20220012052
Abstract: The present disclosure relates to systems and methods that provide a reconfigurable cryptographic coprocessor. An example system includes an instruction memory configured to provide ARX instructions and mode control instructions. The system also includes an adjustable-width arithmetic logic unit, an adjustable-width rotator, and a coefficient memory. A bit width of the adjustable-width arithmetic logic unit and a bit width of the adjustable-width rotator are adjusted according to the mode control instructions. The coefficient memory is configured to provide variable-width words to the arithmetic logic unit and the rotator. The arithmetic logic unit and the rotator are configured to carry out the ARX instructions on the provided variable-width words. The systems and methods described herein could accelerate various applications, such as deep learning, by assigning one or more of the disclosed reconfigurable coprocessors to work as a central computation unit in a neural network.
Type: Application
Filed: September 24, 2021
Publication date: January 13, 2022
Inventors: Mohamed E. Aly, Wen-Mei W. Hwu, Kevin Skadron
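ARX conventionally stands for add, rotate, and XOR, the three operations used by many lightweight ciphers and hash functions. A generic sketch of such operations with a selectable word width is shown below. The names `rotl` and `arx_round`, the particular mixing step, and the `width` parameter (standing in for a mode-control setting) are illustrative assumptions, not the instruction set of the disclosed coprocessor.

```python
def rotl(x, r, width):
    """Rotate the `width`-bit value `x` left by `r` bit positions."""
    mask = (1 << width) - 1
    x &= mask
    r %= width
    return ((x << r) | (x >> (width - r))) & mask

def arx_round(a, b, r, width):
    """One generic add-rotate-XOR mixing step on two `width`-bit words."""
    mask = (1 << width) - 1
    a = (a + b) & mask        # modular addition, wraps at the chosen width
    b = rotl(b, r, width) ^ a  # rotate, then XOR with the new sum
    return a, b
```

The same inputs give different results at different widths because the addition wraps at a different point: `arx_round(0xFF, 0x01, 3, 8)` wraps to zero in the sum, while `arx_round(0xFF, 0x01, 3, 16)` does not.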
-
Patent number: 11157275
Abstract: The present disclosure relates to systems and methods that provide a reconfigurable cryptographic coprocessor. An example system includes an instruction memory configured to provide ARX instructions and mode control instructions. The system also includes an adjustable-width arithmetic logic unit, an adjustable-width rotator, and a coefficient memory. A bit width of the adjustable-width arithmetic logic unit and a bit width of the adjustable-width rotator are adjusted according to the mode control instructions. The coefficient memory is configured to provide variable-width words to the arithmetic logic unit and the rotator. The arithmetic logic unit and the rotator are configured to carry out the ARX instructions on the provided variable-width words. The systems and methods described herein could accelerate various applications, such as deep learning, by assigning one or more of the disclosed reconfigurable coprocessors to work as a central computation unit in a neural network.
Type: Grant
Filed: July 3, 2018
Date of Patent: October 26, 2021
Assignees: The Board of Trustees of the University of Illinois, University of Virginia Patent Foundation
Inventors: Mohamed E Aly, Wen-Mei W. Hwu, Kevin Skadron
-
Patent number: 11049551
Abstract: A method of processing data in a memory can include accessing an array of memory cells located on a semiconductor memory die to provide a row of data including n bits, latching the n bits in one or more row buffer circuits adjacent to the array of memory cells on the semiconductor memory die to provide latched n bits operatively coupled to a column address selection circuit on the semiconductor memory die to provide a portion of the n latched bits as data output from the semiconductor memory die responsive to a memory read operation, and serially transferring the latched n bits in the row buffer circuit to an arithmetic logic unit (ALU) circuit located adjacent to the row buffer circuit on the semiconductor memory die.
Type: Grant
Filed: November 13, 2019
Date of Patent: June 29, 2021
Assignee: University of Virginia Patent Foundation
Inventors: Marzieh Lenjani, Patricia Gonzalez, Mircea R. Stan, Kevin Skadron
-
Publication number: 20210142846
Abstract: A method of processing data in a memory can include accessing an array of memory cells located on a semiconductor memory die to provide a row of data including n bits, latching the n bits in one or more row buffer circuits adjacent to the array of memory cells on the semiconductor memory die to provide latched n bits operatively coupled to a column address selection circuit on the semiconductor memory die to provide a portion of the n latched bits as data output from the semiconductor memory die responsive to a memory read operation, and serially transferring the latched n bits in the row buffer circuit to an arithmetic logic unit (ALU) circuit located adjacent to the row buffer circuit on the semiconductor memory die.
Type: Application
Filed: November 13, 2019
Publication date: May 13, 2021
Inventors: Marzieh Lenjani, Patricia Gonzalez, Mircea R. Stan, Kevin Skadron
-
Patent number: 10664241
Abstract: A method of operating a memory system can be provided by reading a plurality of data words from a memory system, where each of the plurality of data words is stored in the memory system in a first dimension-major order. The plurality of data words can be shifted into a transpose memory system in the first dimension in parallel with one another using first directly time adjacent clock edges to store a plurality of transposed data words in a second dimension-major order in the transpose memory system relative to the memory system. The plurality of transposed data words can be shifted out of the transpose memory system in the second dimension using second directly time adjacent clock edges.
Type: Grant
Filed: October 13, 2017
Date of Patent: May 26, 2020
Assignee: University of Virginia Patent Foundation
Inventors: Mohamed Ezzat El Hadedy Aly, Kevin Skadron
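The effect of shifting words in along one dimension and out along the other is a bit-level transpose. The sketch below shows only that data movement in software; it does not model clock edges or the circuit itself. The function name `transpose_via_shift`, the integer encoding of words, and the most-significant-bit-first ordering are assumptions for illustration.

```python
def transpose_via_shift(words, width):
    """Load `words` (integers of `width` bits) as rows, then read out the columns."""
    grid = []
    for word in words:
        # Shift-in: each word becomes one row of bits, most significant bit first.
        grid.append([(word >> (width - 1 - bit)) & 1 for bit in range(width)])

    transposed = []
    for col in range(width):
        # Shift-out: each output word gathers one bit position from every input word.
        value = 0
        for row in grid:
            value = (value << 1) | row[col]
        transposed.append(value)
    return transposed
```

Applying the function twice, with the second call's `width` set to the original number of words, recovers the original words.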
-
Publication number: 20200151380
Abstract: Transient voltage noise, including resistive and reactive noise, causes timing errors at runtime. A heuristic framework, Walking Pads, is introduced to minimize transient voltage violations by optimizing power supply pad placement. It is shown that the steady-state optimal design point differs from the transient optimum, and further noise reduction can be achieved with transient optimization. The methodology significantly reduces voltage violations by balancing the average transient voltage noise of the four branches at each pad site. When pad placement is optimized using a representative stressmark, voltage violations are reduced 46-80% across 11 Parsec benchmarks with respect to the results from IR-drop-optimized pad placement. It is shown that the allocation of on-chip decoupling capacitance significantly influences the optimal locations of pads.
Type: Application
Filed: September 16, 2019
Publication date: May 14, 2020
Applicant: UNIVERSITY OF VIRGINIA PATENT FOUNDATION
Inventors: Ke Wang, Kevin Skadron, Mircea R. Stan, Runjie Zhang
-
Patent number: 10580481
Abstract: A finite state machine circuit can include a plurality of rows of gain cell embedded Dynamic Random Access Memory (GC-eDRAM) cells that can be configured to store state information representing all N states expressed by a finite state machine circuit. A number of eDRAM switch cells can be electrically coupled to the plurality of rows of the GC-eDRAM cells, where the number of eDRAM switch cells can be arranged in an M×M cross-bar array where M is less than N, and the number of eDRAM switch cells can be configured to provide interconnect for all transitions between the all N states expressed by the finite state machine circuit.
Type: Grant
Filed: January 14, 2019
Date of Patent: March 3, 2020
Assignee: University of Virginia Patent Foundation
Inventors: Elaheh Sadredini, Gholamreza Rahimi, Kevin Skadron, Mircea Stan
-
Publication number: 20200012495
Abstract: The present disclosure relates to systems and methods that provide a reconfigurable cryptographic coprocessor. An example system includes an instruction memory configured to provide ARX instructions and mode control instructions. The system also includes an adjustable-width arithmetic logic unit, an adjustable-width rotator, and a coefficient memory. A bit width of the adjustable-width arithmetic logic unit and a bit width of the adjustable-width rotator are adjusted according to the mode control instructions. The coefficient memory is configured to provide variable-width words to the arithmetic logic unit and the rotator. The arithmetic logic unit and the rotator are configured to carry out the ARX instructions on the provided variable-width words. The systems and methods described herein could accelerate various applications, such as deep learning, by assigning one or more of the disclosed reconfigurable coprocessors to work as a central computation unit in a neural network.
Type: Application
Filed: July 3, 2018
Publication date: January 9, 2020
Inventors: Mohamed E Aly, Wen-Mei W. Hwu, Kevin Skadron
-
Patent number: 10482210
Abstract: A virtual force controlled collapse chip connection (C4) pad placement optimization framework for 2D power delivery grids is proposed. The present optimization framework regards power pads as mobile “positive charged particles” and current resources as a “negative charged background.” The virtual electrostatic force is calculated from voltage gradients. This optimization framework optimizes pad locations by moving pads according to the virtual forces exerted on them by other pads and current sources in the system. Within this framework, three algorithms are proposed to meet various requirements of optimization quality and speed. These algorithms minimize resistive voltage drop (IR drop), the maximum current density, and power distribution network metal power dissipation at the same time.
Type: Grant
Filed: January 19, 2016
Date of Patent: November 19, 2019
Assignee: University of Virginia Patent Foundation
Inventors: Ke Wang, Kevin Skadron, Mircea R. Stan, Runjie Zhang, Brett Meyer
-
Patent number: 10474690
Abstract: The present invention introduces the development of a flexible CPU-AP (Computer Processing Unit-Automata Processor) computing infrastructure for mining hierarchical patterns based on Apriori algorithm. A novel automaton design strategy, called linear design, is described to generate automata for matching and counting hierarchical patterns and apply it on SPM (Sequential Pattern Mining). In addition, another novel automaton design strategy, called reduction design, is described for the disjunctive rule matching (DRM) and counting. The present invention shows performance improvement of AP SPM and DRM solutions and broader capability over multicore and GPU (Graphics Processing Unit) implementations of GSP SPM, and shows that AP SPM and DRM solutions outperform state-of-the-art SPM algorithms SPADE and PrefixSpan (especially for larger datasets).
Type: Grant
Filed: March 31, 2017
Date of Patent: November 12, 2019
Assignee: University of Virginia Patent Foundation
Inventors: Ke Wang, Kevin Skadron, Elaheh Sadredini
-
Patent number: 10445323
Abstract: The present invention discloses a heterogeneous computation framework of Association Rule Mining (ARM) using Micron's Automata Processor (AP). This framework is based on the Apriori algorithm. Two Automaton designs are proposed to match and count the individual itemset. Several performance improvement strategies are proposed including minimizing the number of reporting vectors and reducing reconfiguration delays. The experiment results show up to 94× speed ups of the proposed AP-accelerated Apriori on six synthetic and real-world datasets, when compared with the Apriori single-core CPU implementation. The proposed AP-accelerated Apriori solution also outperforms the state-of-the-art multicore and GPU implementations of Equivalence Class Transformation (Eclat) algorithm on big datasets.
Type: Grant
Filed: September 30, 2015
Date of Patent: October 15, 2019
Assignee: UNIVERSITY OF VIRGINIA PATENT FOUNDATION
Inventors: Ke Wang, Kevin Skadron
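For readers unfamiliar with the Apriori algorithm that this framework builds on, the following is a minimal CPU-only sketch of its level-wise search. It is a textbook version, not the Automata Processor design described in the abstract; the function name `apriori` and the use of an absolute `min_support` count are assumptions. The counting step marked in the comments is the matching-and-counting work that the abstract says is offloaded to automata.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset found in at least `min_support` transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item.
    level = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while level:
        # Matching and counting: how many transactions contain each candidate?
        counts = {cand: sum(cand <= t for t in transactions) for cand in level}
        survivors = {cand: n for cand, n in counts.items() if n >= min_support}
        frequent.update(survivors)

        k += 1
        # Join: merge frequent itemsets into candidates of size k.
        joined = {a | b for a in survivors for b in survivors if len(a | b) == k}
        # Prune: keep a candidate only if all of its (k-1)-subsets are frequent.
        level = {
            cand for cand in joined
            if all(frozenset(sub) in survivors for sub in combinations(cand, k - 1))
        }
    return frequent
```

The prune step is what keeps the candidate set small: an itemset cannot be frequent if any of its subsets is infrequent, so such candidates are never counted.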