Patents by Inventor Joshua Wayne Bowman

Joshua Wayne Bowman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12333274
    Abstract: To reduce power consumption, data bits or a portion of a data register that is not expected to toggle frequently can be grouped together, and be clock-gated independently from the rest of the data register. The grouping of the data bits can be determined based on the data types of the workload being operated on. For a data register configured to store a numeric value that supports multiple data types, the portion of the data register being clock-gated may store a group of data bits that are unused for one or more data types of the multiple data types supported by the data register. The portion of the data register being clock-gated can also be a group of data bits that remain unchanged or have a constant value for numeric values within a certain numeric range that is frequently operated on.
    Type: Grant
    Filed: December 11, 2020
    Date of Patent: June 17, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Joshua Wayne Bowman, Thomas A. Volpe, Sundeep Amirineni, Nishith Desai, Ron Diamant
  • Patent number: 12182691
    Abstract: To improve performance of a computational array, the architecture of the array can be modified to allow the processing engines of a column to operate in parallel and the clock frequency of the array to be increased. The processing engines of each column of the array can be grouped into a series of row groups. The processing engines of each row group can be loaded with input values, and computations on the input values can be carried out in parallel to generate the column output. One or more flip-flop stages can be inserted into the computational logic of each of the processing engines. The computational logic can then be distributed across the flip-flop stages to reduce the propagation delay between flip-flop stages of the processing engine, hence allowing the clock frequency of the array to be increased.
    Type: Grant
    Filed: March 17, 2021
    Date of Patent: December 31, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Sundeep Amirineni, Akshay Balasubramanian, Joshua Wayne Bowman, Ron Diamant, Paul Gilbert Meyer, Thomas Elmer
  • Patent number: 11880682
    Abstract: Systems and methods are provided to perform multiply-accumulate operations of reduced precision numbers in a systolic array. Each row of the systolic array can receive reduced inputs from a respective reducer. The reduced input can include a reduced input data element and/or a reduced weight. The systolic array may lack support for inputs with a first bit-length and the reducers may reduce the bit-length of a given input from the first bit-length to a second shorter bit-length and provide the reduced input to the array. In order to reduce the bit-length, the reducer may reduce the number of trailing bits of the input. Further, the systolic array can receive a reduced and rounded input. The systolic array can propagate the reduced input through the processing elements in the systolic array. Each processing element may include a multiplier and/or an adder to perform arithmetical operations based on the reduced input.
    Type: Grant
    Filed: June 30, 2021
    Date of Patent: January 23, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Paul Gilbert Meyer, Thomas A Volpe, Ron Diamant, Joshua Wayne Bowman, Nishith Desai, Thomas Elmer
  • Publication number: 20230004523
    Abstract: Systems and methods are provided to perform multiply-accumulate operations of reduced precision numbers in a systolic array. Each row of the systolic array can receive reduced inputs from a respective reducer. The reducer can receive a particular input and generate multiple reduced inputs from the input. The reduced inputs can include reduced input data elements and/or a reduced weights. The systolic array may lack support for inputs with a first bit-length and the reducers may reduce the bit-length of a given input from the first bit-length to a second shorter bit-length and provide multiple reduced inputs with second shorter bit-length to the array. The systolic array may perform multiply-accumulate operations on each unique combination of the multiple reduced input data elements and the reduced weights to generate multiple partial outputs. The systolic array may sum the partial outputs to generate the output.
    Type: Application
    Filed: June 30, 2021
    Publication date: January 5, 2023
    Inventors: Paul Gilbert Meyer, Thomas A. Volpe, Ron Diamant, Joshua Wayne Bowman, Nishith Desai, Thomas Elmer
  • Publication number: 20230004384
    Abstract: Systems and methods are provided to perform multiply-accumulate operations of reduced precision numbers in a systolic array. Each row of the systolic array can receive reduced inputs from a respective reducer. The reduced input can include a reduced input data element and/or a reduced weight. The systolic array may lack support for inputs with a first bit-length and the reducers may reduce the bit-length of a given input from the first bit-length to a second shorter bit-length and provide the reduced input to the array. In order to reduce the bit-length, the reducer may reduce the number of trailing bits of the input. Further, the systolic array can receive a reduced and rounded input. The systolic array can propagate the reduced input through the processing elements in the systolic array. Each processing element may include a multiplier and/or an adder to perform arithmetical operations based on the reduced input.
    Type: Application
    Filed: June 30, 2021
    Publication date: January 5, 2023
    Inventors: Paul Gilbert Meyer, Thomas A Volpe, Ron Diamant, Joshua Wayne Bowman, Nishith Desai, Thomas Elmer
  • Publication number: 20220188073
    Abstract: To reduce power consumption, data bits or a portion of a data register that is not expected to toggle frequently can be grouped together, and be clock-gated independently from the rest of the data register. The grouping of the data bits can be determined based on the data types of the workload being operated on. For a data register configured to store a numeric value that supports multiple data types, the portion of the data register being clock-gated may store a group of data bits that are unused for one or more data types of the multiple data types supported by the data register. The portion of the data register being clock-gated can also be a group of data bits that remain unchanged or have a constant value for numeric values within a certain numeric range that is frequently operated on.
    Type: Application
    Filed: December 11, 2020
    Publication date: June 16, 2022
    Inventors: Joshua Wayne Bowman, Thomas A. Volpe, Sundeep Amirineni, Nishith Desai, Ron Diamant