Patents by Inventor Wen-mei W. Hwu

Wen-mei W. Hwu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220012052
    Abstract: The present disclosure relates to systems and methods that provide a reconfigurable cryptographic coprocessor. An example system includes an instruction memory configured to provide ARX instructions and mode control instructions. The system also includes an adjustable-width arithmetic logic unit, an adjustable-width rotator, and a coefficient memory. A bit width of the adjustable-width arithmetic logic unit and a bit width of the adjustable-width rotator are adjusted according to the mode control instructions. The coefficient memory is configured to provide variable-width words to the arithmetic logic unit and the rotator. The arithmetic logic unit and the rotator are configured to carry out the ARX instructions on the provided variable-width words. The systems and methods described herein could accelerate various applications, such as deep learning, by assigning one or more of the disclosed reconfigurable coprocessors to work as a central computation unit in a neural network.
    Type: Application
    Filed: September 24, 2021
    Publication date: January 13, 2022
    Inventors: Mohamed E. Aly, Wen-Mei W. Hwu, Kevin Skadron
  • Patent number: 11157275
    Abstract: The present disclosure relates to systems and methods that provide a reconfigurable cryptographic coprocessor. An example system includes an instruction memory configured to provide ARX instructions and mode control instructions. The system also includes an adjustable-width arithmetic logic unit, an adjustable-width rotator, and a coefficient memory. A bit width of the adjustable-width arithmetic logic unit and a bit width of the adjustable-width rotator are adjusted according to the mode control instructions. The coefficient memory is configured to provide variable-width words to the arithmetic logic unit and the rotator. The arithmetic logic unit and the rotator are configured to carry out the ARX instructions on the provided variable-width words. The systems and methods described herein could accelerate various applications, such as deep learning, by assigning one or more of the disclosed reconfigurable coprocessors to work as a central computation unit in a neural network.
    Type: Grant
    Filed: July 3, 2018
    Date of Patent: October 26, 2021
    Assignees: The Board of Trustees of the University of Illinois, University of Virginia Patent Foundation
    Inventors: Mohamed E Aly, Wen-Mei W. Hwu, Kevin Skadron
  • Publication number: 20200012495
    Abstract: The present disclosure relates to systems and methods that provide a reconfigurable cryptographic coprocessor. An example system includes an instruction memory configured to provide ARX instructions and mode control instructions. The system also includes an adjustable-width arithmetic logic unit, an adjustable-width rotator, and a coefficient memory. A bit width of the adjustable-width arithmetic logic unit and a bit width of the adjustable-width rotator are adjusted according to the mode control instructions. The coefficient memory is configured to provide variable-width words to the arithmetic logic unit and the rotator. The arithmetic logic unit and the rotator are configured to carry out the ARX instructions on the provided variable-width words. The systems and methods described herein could accelerate various applications, such as deep learning, by assigning one or more of the disclosed reconfigurable coprocessors to work as a central computation unit in a neural network.
    Type: Application
    Filed: July 3, 2018
    Publication date: January 9, 2020
    Inventors: Mohamed E Aly, Wen-Mei W. Hwu, Kevin Skadron
  • Patent number: 7725696
    Abstract: A processor method and apparatus that allow the overlapped execution of multiple iterations of a loop while the compiler includes only a single copy of the loop body in the code and the hardware automatically manages which iterations are active. Since the prologue and epilogue are implicitly created and maintained within the hardware in the invention, a significant reduction in code size can be achieved compared to software-only modulo scheduling. Furthermore, loops with iteration counts less than the number of concurrent iterations present in the kernel are also automatically handled. This hardware-enhanced scheme achieves the same performance as the fully-specified standard method. Furthermore, the hardware reduces the power requirement as the entire fetch unit can be deactivated for a portion of the loop's execution.
    Type: Grant
    Filed: October 4, 2007
    Date of Patent: May 25, 2010
    Inventors: Wen-mei W. Hwu, Matthew C. Merten
  • Patent number: 7302557
    Abstract: A processor method and apparatus that allow the overlapped execution of multiple iterations of a loop while the compiler includes only a single copy of the loop body in the code and the hardware automatically manages which iterations are active. Since the prologue and epilogue are implicitly created and maintained within the hardware in the invention, a significant reduction in code size can be achieved compared to software-only modulo scheduling. Furthermore, loops with iteration counts less than the number of concurrent iterations present in the kernel are also automatically handled. This hardware-enhanced scheme achieves the same performance as the fully-specified standard method. Furthermore, the hardware reduces the power requirement as the entire fetch unit can be deactivated for a portion of the loop's execution.
    Type: Grant
    Filed: December 1, 2000
    Date of Patent: November 27, 2007
    Assignee: Impact Technologies, Inc.
    Inventors: Wen-mei W. Hwu, Matthew C. Merten
  • Patent number: 6640315
    Abstract: Disclosed is a method and system for handling inline recovery from speculatively executed instructions. Each register may be provided with an E-tag that, when set, indicates an exception occurred in the generation of the value stored in that register, and an R-tag, which is used to manage data flow dependencies in recovery mode. Recovery is performed by speculatively re-executing the set of speculative instructions that are data flow dependent upon a first excepting speculative instruction. The disclosed invention provides an architecture and method for efficient exception handling when combining control speculation, data speculation and predication, thereby resulting in substantially enhanced instruction level parallelism.
    Type: Grant
    Filed: June 26, 1999
    Date of Patent: October 28, 2003
    Assignee: Board of Trustees of the University of Illinois
    Inventors: Wen-mei W. Hwu, Daniel A. Connors, David I. August, John W. Sias
  • Patent number: 5694577
    Abstract: An apparatus is provided, for use in a computer having a register bank and a device for operand fetch and instruction execution, for monitoring a store address to maintain coherency of preloaded data that is fetched by a load operation and should be affected by at least one subsequent store operation. The apparatus includes an address register bank having entries for holding the address of a load having loaded data which should be affected by at least one subsequent store operation. Each of the entries has associated therewith a pre-load flag and a type field, the pre-load flag being set when the load is executed and reset when the loaded data no longer needs to be affected by a subsequent store operation.
    Type: Grant
    Filed: June 6, 1995
    Date of Patent: December 2, 1997
    Assignees: Matsushita Electric Industrial Co., Ltd., The Board of Trustees of the University of Illinois
    Inventors: Tokuzo Kiyohara, Wen-mei W. Hwu, William Chen
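
The reconfigurable coprocessor abstracts above describe ARX instructions carried out on variable-width words. As a rough illustration only, and assuming ARX denotes the usual add-rotate-XOR family of operations, the following Python sketch models those three operations with the word width passed as a parameter. The function names and the shape of `arx_step` are hypothetical and are not taken from the patents; this is a software model, not the claimed hardware.

```python
def mask(width):
    """Bit mask for a word of the given width, e.g. 0xFF for width 8."""
    return (1 << width) - 1


def add_mod(a, b, width):
    """Addition modulo 2**width (the 'A' in ARX)."""
    return (a + b) & mask(width)


def rotl(x, r, width):
    """Rotate a width-bit word left by r bits (the 'R' in ARX)."""
    r %= width
    x &= mask(width)
    return ((x << r) | (x >> (width - r))) & mask(width)


def xor(a, b, width):
    """Bitwise XOR truncated to width bits (the 'X' in ARX)."""
    return (a ^ b) & mask(width)


def arx_step(a, b, r, width):
    """One hypothetical ARX mixing step on width-bit words.

    a <- a + b (mod 2**width); b <- rotl(b, r) XOR a.
    The same code serves any width, loosely mirroring the idea of a
    mode setting that selects the datapath width.
    """
    a = add_mod(a, b, width)
    b = xor(rotl(b, r, width), a, width)
    return a, b


if __name__ == "__main__":
    # The same step evaluated at two different word widths.
    print(arx_step(0xFF, 0x01, 1, 8))
    print(arx_step(0xFF, 0x01, 1, 16))
```

Passing the width as an argument is only a stand-in for the mode control instructions mentioned in the abstracts; how the actual design switches widths is not specified in this listing.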