Patents by Inventor Jeremy Fowers

Jeremy Fowers has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11886833
    Abstract: Embodiments of the present disclosure include systems and methods for providing hierarchical and shared exponent floating point data types. First and second shared exponent values are determined based on exponent values of a plurality of floating point values. A third shared exponent value is determined based on the first shared exponent value and the second shared exponent value. First and second difference values are determined based on the first shared exponent value, the second shared exponent value, and the third shared exponent value. Sign values and mantissa values are determined for the plurality of floating point values. The sign value and the mantissa value for each floating point value in the plurality of floating point values, the third shared exponent value, the first difference value, and the second difference value are stored in a data structure for a shared exponent floating point data type.
    Type: Grant
    Filed: June 28, 2021
    Date of Patent: January 30, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Bita Darvish Rouhani, Venmugil Elango, Rasoul Shafipour, Jeremy Fowers, Ming Gang Liu, Jinwen Xi, Douglas C. Burger, Eric S. Chung
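
For illustration, the hierarchical encoding described in the abstract above can be sketched in a few lines of Python. This is a minimal sketch, not the patented implementation: it assumes each group's shared exponent is the maximum exponent in that group (the abstract does not say how the shared exponents are chosen), and every name in it is invented.

```python
import numpy as np

def encode_hierarchical_shared_exponent(values_a, values_b):
    """Encode two groups of floats with a hierarchical shared exponent."""
    values_a = np.asarray(values_a, dtype=np.float64)
    values_b = np.asarray(values_b, dtype=np.float64)

    exp_a = int(np.max(np.frexp(values_a)[1]))  # first shared exponent
    exp_b = int(np.max(np.frexp(values_b)[1]))  # second shared exponent
    exp_top = max(exp_a, exp_b)                 # third (top-level) shared exponent
    diff_a = exp_top - exp_a                    # first difference value
    diff_b = exp_top - exp_b                    # second difference value

    def signs_and_mantissas(values, shared_exp):
        # Mantissas are scaled so that |value| = mantissa * 2**shared_exp.
        return np.signbit(values), np.abs(values) / np.exp2(shared_exp)

    signs_a, mant_a = signs_and_mantissas(values_a, exp_a)
    signs_b, mant_b = signs_and_mantissas(values_b, exp_b)

    # The stored data structure: one top-level exponent, two small difference
    # values, and a sign plus mantissa per floating point value.
    return {
        "shared_exponent": exp_top,
        "diffs": (diff_a, diff_b),
        "signs": (signs_a, signs_b),
        "mantissas": (mant_a, mant_b),
    }

enc = encode_hierarchical_shared_exponent([0.5, -2.0, 3.25], [0.01, 0.07])
print(enc["shared_exponent"], enc["diffs"])
```

A value is recovered as mantissa * 2**(shared_exponent - diff) with its sign reapplied; a hardware implementation would also truncate each mantissa to a narrow bit width.
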
  • Patent number: 11663450
    Abstract: Hardware and methods for neural network processing are provided. A method in a hardware node including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the matrix vector unit, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes performing, using the MVU, a first type of instruction that can only be performed by the MVU to generate a first result. The method further includes performing a second type of instruction that can only be performed by one of the multifunction units to generate a second result and, without storing either of the two results in a global register, passing the second result to the second multifunction unit and the third multifunction unit.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: May 30, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jeremy Fowers, Eric S. Chung, Douglas C. Burger
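
The result-forwarding idea in the abstract above can be modeled in a few lines of Python. The sketch below assumes the MVU performs a matrix-vector multiply and each multifunction unit applies an elementwise operation; class and method names are invented, and real hardware would stream vectors through the pipeline rather than pass arrays.

```python
import numpy as np

class MatrixVectorUnit:
    """Toy MVU: executes the matrix-vector multiply only it can perform."""
    def __init__(self, weights):
        self.weights = weights

    def execute(self, vector):
        return self.weights @ vector

class MultiFunctionUnit:
    """Toy multifunction unit: applies an elementwise operation."""
    def __init__(self, op):
        self.op = op

    def execute(self, vector):
        return self.op(vector)

def run_pipeline(mvu, mfus, input_vector):
    # The MVU produces the first result; each multifunction unit then
    # consumes its predecessor's output directly -- no intermediate result
    # is ever written to a global register.
    result = mvu.execute(input_vector)
    for mfu in mfus:
        result = mfu.execute(result)
    return result

mfus = [
    MultiFunctionUnit(lambda v: v + 1.0),  # e.g. a bias add
    MultiFunctionUnit(np.tanh),            # e.g. an activation
    MultiFunctionUnit(lambda v: v * 0.5),  # e.g. a rescale
]
print(run_pipeline(MatrixVectorUnit(np.eye(4) * 2.0), mfus, np.ones(4)))
```
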
  • Patent number: 11556764
    Abstract: Systems and methods for deriving a concordant software neural network layer are provided. A method includes receiving first instructions configured to, using a neural network processor (NNP), process a first set of data corresponding to a neural network layer, where the NNP is configured to quantize the first set of data to generate a set of quantized data and then perform matrix-vector multiply operations on the set of quantized data using a matrix-vector multiplier incorporated within hardware associated with the NNP to generate a first set of results. The method further includes processing the first instructions to automatically generate second instructions configured for use with at least one processor, different from the NNP, such that the second instructions, when executed by the at least one processor to perform matrix multiply operations, generate a second set of results that are concordant with the first set of results.
    Type: Grant
    Filed: March 1, 2019
    Date of Patent: January 17, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jeremy Fowers, Daniel Lo, Deeksha Dangwal
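
A sketch of the concordance idea: the generated software path applies the same quantization the NNP applies before its matrix multiply, so both paths produce matching results. The 8-bit symmetric quantizer below is an assumption; the abstract does not specify the NNP's quantization scheme, and all names are invented.

```python
import numpy as np

def quantize(x, bits=8):
    """Symmetric uniform quantizer (an assumed scheme for this sketch)."""
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1)
    return np.round(x / scale).astype(np.int32), scale

def nnp_layer(weights, activations):
    # Models the hardware path: quantize, then perform the matrix-vector
    # multiply on the quantized integer data.
    qw, sw = quantize(weights)
    qa, sa = quantize(activations)
    return (qw @ qa) * (sw * sa)

def concordant_cpu_layer(weights, activations):
    # The derived software path applies the same quantization before its
    # matrix multiply, so its results are concordant with the hardware's.
    qw, sw = quantize(weights)
    qa, sa = quantize(activations)
    return (qw.astype(np.float64) @ qa.astype(np.float64)) * (sw * sa)

w, a = np.random.randn(4, 8), np.random.randn(8)
assert np.allclose(nnp_layer(w, a), concordant_cpu_layer(w, a))
```

Without the shared quantization step, a plain floating point multiply on the CPU would differ from the hardware results by the quantization error, which is the discordance the method avoids.
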
  • Patent number: 11556762
    Abstract: Neural network processors that have been customized based on application-specific synthesis specialization parameters and related methods are described. Certain example neural network processors and methods described in the present disclosure expose several major synthesis specialization parameters that can be used for specializing a microarchitecture instance of a neural network processor to specific neural network models, including: (1) aligning the native vector dimension to the parameters of the model to minimize padding and waste during model evaluation, (2) increasing lane widths to drive up intra-row-level parallelism, or (3) increasing matrix multiply tiles to exploit sub-matrix parallelism for large neural network models.
    Type: Grant
    Filed: April 21, 2018
    Date of Patent: January 17, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jeremy Fowers, Kalin Ovtcharov, Eric S. Chung, Todd Michael Massengill, Ming Gang Liu, Gabriel Leonard Weisz
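
Parameter (1), aligning the native vector dimension, is easy to quantify. The sketch below uses hypothetical layer dimensions and a simplified cost model to show the fraction of multiply-accumulate work wasted on padding when a weight matrix is rounded up to multiples of the native vector size.

```python
def ceil_to(d, native):
    # Round d up to the next multiple of the native vector dimension.
    return -(-d // native) * native

def matmul_padding_waste(rows, cols, native_dim):
    """Fraction of multiply-accumulate work wasted when a rows x cols weight
    matrix is padded so both dimensions align to the native vector size."""
    padded = ceil_to(rows, native_dim) * ceil_to(cols, native_dim)
    return 1.0 - (rows * cols) / padded

# Hypothetical 500 x 700 layer: a native dimension matched to the model
# wastes far less work than a generic large one.
for native in (40, 128, 256):
    print(f"native={native:3d}  waste={matmul_padding_waste(500, 700, native):.1%}")
```
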
  • Publication number: 20220391209
    Abstract: Hardware and methods for neural network processing are provided. A method in a hardware node including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the matrix vector unit, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes performing, using the MVU, a first type of instruction that can only be performed by the MVU to generate a first result. The method further includes performing a second type of instruction that can only be performed by one of the multifunction units to generate a second result and, without storing either of the two results in a global register, passing the second result to the second multifunction unit and the third multifunction unit.
    Type: Application
    Filed: August 8, 2022
    Publication date: December 8, 2022
    Inventors: Jeremy Fowers, Eric S. Chung, Douglas C. Burger
  • Publication number: 20220253281
    Abstract: Embodiments of the present disclosure include systems and methods for providing hierarchical and shared exponent floating point data types. First and second shared exponent values are determined based on exponent values of a plurality of floating point values. A third shared exponent value is determined based on the first shared exponent value and the second shared exponent value. First and second difference values are determined based on the first shared exponent value, the second shared exponent value, and the third shared exponent value. Sign values and mantissa values are determined for the plurality of floating point values. The sign value and the mantissa value for each floating point value in the plurality of floating point values, the third shared exponent value, the first difference value, and the second difference value are stored in a data structure for a shared exponent floating point data type.
    Type: Application
    Filed: June 28, 2021
    Publication date: August 11, 2022
    Inventors: Bita Darvish Rouhani, Venmugil Elango, Rasoul Shafipour, Jeremy Fowers, Ming Gang Liu, Jinwen Xi, Douglas C. Burger, Eric S. Chung
  • Patent number: 11243778
    Abstract: The present disclosure relates to instruction dispatch mechanisms for superscalar processors having a plurality of functional units for executing operations simultaneously. Each particular functional unit of the plurality of functional units may be configured to output a capability vector indicating a set of operations that the particular functional unit is currently available to perform. As instructions are received in an issue queue, the functional unit to execute each instruction is selected by comparing the capabilities required by the instruction to the asserted capabilities of each of the functional units. A functional unit may reset or de-assert a particular capability while performing an operation and then re-assert the capability when the instruction is completed. A result of the operation may be stored in a skid buffer for at least as long as the chain execution time in order to avoid resource hazards at a write port of the vector register file.
    Type: Grant
    Filed: December 31, 2020
    Date of Patent: February 8, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Skand Hurkat, Jeremy Fowers
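
A toy Python model of the capability-vector dispatch described above. The bit assignments and three-operation capability set are invented; the selection logic (the instruction's required capabilities ANDed against each unit's asserted vector) follows the abstract.

```python
# Capability bits required by each operation (invented encoding).
REQUIRES = {"vmul": 0b001, "vadd": 0b010, "vdiv": 0b100}

class FunctionalUnit:
    def __init__(self, name, capabilities):
        self.name = name
        self.capabilities = capabilities  # currently asserted capability vector

    def deassert(self, caps):
        # Drop a capability while an operation that needs it is in flight.
        self.capabilities &= ~caps

    def reassert(self, caps):
        # Re-assert the capability once the instruction completes.
        self.capabilities |= caps

def dispatch(op, units):
    """Select the first unit whose asserted capability vector covers every
    capability the instruction requires."""
    need = REQUIRES[op]
    for unit in units:
        if unit.capabilities & need == need:
            return unit
    return None  # no unit advertises the capability: the instruction waits

units = [FunctionalUnit("fu0", 0b011), FunctionalUnit("fu1", 0b111)]
unit = dispatch("vdiv", units)         # -> fu1, the only unit with vdiv
unit.deassert(REQUIRES["vdiv"])
print(dispatch("vdiv", units))         # -> None until fu1 re-asserts vdiv
```

De-asserting a capability while an operation is in flight makes the issue queue hold matching instructions until the unit re-asserts it, which mirrors the hazard-avoidance behavior the abstract describes.
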
  • Publication number: 20220012577
    Abstract: Systems and methods for neural network processing are provided. A method in a system comprising a plurality of nodes interconnected via a network, where each node includes a plurality of on-chip memory blocks and a plurality of compute units, is provided. The method includes, upon service activation, receiving an N by M matrix of coefficients corresponding to the neural network model. The method includes loading the coefficients corresponding to the neural network model into the plurality of the on-chip memory blocks for processing by the plurality of compute units. The method includes, regardless of a utilization of the plurality of the on-chip memory blocks as part of an evaluation of the neural network model, maintaining the coefficients corresponding to the neural network model in the plurality of the on-chip memory blocks until the service is interrupted or the neural network model is modified or replaced.
    Type: Application
    Filed: September 23, 2021
    Publication date: January 13, 2022
    Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers, Kalin Ovtcharov
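
The pinning behavior described above can be sketched as follows. The striping of coefficient rows across memory blocks is an assumption and all names are invented; the key property from the abstract is that evaluation never evicts the coefficients once the service is activated.

```python
import numpy as np

class Node:
    """Toy node: on-chip memory blocks that pin the model's coefficients."""
    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.blocks = None

    def activate_service(self, coefficients):
        # On service activation, stripe the N x M coefficient matrix across
        # the on-chip memory blocks (row striping is an assumption here).
        self.blocks = np.array_split(coefficients, self.num_blocks)

    def evaluate(self, vector):
        # Evaluation reads the pinned coefficients but never evicts them,
        # regardless of how heavily the memory blocks are utilized.
        assert self.blocks is not None, "service not activated"
        return np.concatenate([block @ vector for block in self.blocks])

    def interrupt_service(self):
        # Coefficients are released only when the service is interrupted
        # (or, equivalently, when the model is modified or replaced).
        self.blocks = None

node = Node(num_blocks=4)
node.activate_service(np.random.randn(8, 8))  # hypothetical 8 x 8 model
print(node.evaluate(np.ones(8)).shape)        # (8,)
```
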
  • Publication number: 20210406657
    Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the MVU, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding instructions including a first type of instruction for processing by only the MVU and a second type of instruction for processing by only one of the multifunction units. The method includes mapping a first instruction to the matrix vector unit or to any one of the first multifunction unit, the second multifunction unit, or the third multifunction unit, depending on whether the first instruction is the first type of instruction or the second type of instruction.
    Type: Application
    Filed: August 23, 2021
    Publication date: December 30, 2021
    Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers
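
A minimal sketch of the two-way mapping rule in the abstract above: MVU-only instructions go to the MVU, and multifunction-unit instructions are assigned to one of the three chained units. The opcode names and the position-based assignment policy are assumptions.

```python
from dataclasses import dataclass

MVU_OPS = {"mv_mul"}                        # first type: MVU-only instructions
MFU_OPS = {"v_add", "v_sigmoid", "v_tanh"}  # second type: MFU-only instructions

@dataclass
class Instruction:
    op: str

def map_instruction(instr, mfu_slot):
    """Map an instruction to a pipeline stage by its type: MVU-only ops go to
    the MVU; multifunction-unit ops go to MFU 0, 1, or 2 in chain order."""
    if instr.op in MVU_OPS:
        return "MVU"
    if instr.op in MFU_OPS:
        return f"MFU{mfu_slot % 3}"  # three multifunction units in the pipeline
    raise ValueError(f"unknown op: {instr.op}")

chain = [Instruction("mv_mul"), Instruction("v_add"), Instruction("v_sigmoid")]
mfu_slot = 0
for instr in chain:
    target = map_instruction(instr, mfu_slot)
    if target.startswith("MFU"):
        mfu_slot += 1
    print(instr.op, "->", target)  # mv_mul -> MVU, v_add -> MFU0, v_sigmoid -> MFU1
```
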
  • Patent number: 11157801
    Abstract: Systems and methods for neural network processing are provided. A method in a system comprising a plurality of nodes interconnected via a network, where each node includes a plurality of on-chip memory blocks and a plurality of compute units, is provided. The method includes, upon service activation, receiving an N by M matrix of coefficients corresponding to the neural network model. The method includes loading the coefficients corresponding to the neural network model into the plurality of the on-chip memory blocks for processing by the plurality of compute units. The method includes, regardless of a utilization of the plurality of the on-chip memory blocks as part of an evaluation of the neural network model, maintaining the coefficients corresponding to the neural network model in the plurality of the on-chip memory blocks until the service is interrupted or the neural network model is modified or replaced.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: October 26, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers, Kalin Ovtcharov
  • Patent number: 11151445
    Abstract: Neural network processors including a window expander circuit and related methods are provided. The window expander circuit may include a first logic circuit configured to store a set of data elements, corresponding to at least a subset of the input data, into a Q number of logical memories, where each of a P number of data elements of the set of the data elements is stored in each of the Q number of logical memories. The window expander circuit may further include a second logic circuit configured to receive the set of data elements and additional data elements corresponding to the at least the subset of the input data from the Q number of logical memories and expand the at least the subset of the input data until it has been expanded by a predetermined factor.
    Type: Grant
    Filed: April 21, 2018
    Date of Patent: October 19, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jeremy Fowers, Dan Zhang, Mohammadmahdi Ghandi
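
One plausible reading of the window expander, sketched in Python: replicating the input elements across Q logical memories lets Q overlapping windows be fetched in the same step without port conflicts. The replication and fetch scheduling below are simplifications, not the circuit the patent claims.

```python
import numpy as np

def expand_windows(data, window, stride, q_memories):
    """Toy window expander: the input is replicated into Q logical memories
    so Q overlapping windows can be fetched in the same step."""
    # Each of the Q memories holds a copy of the input elements, matching the
    # abstract's "each of a P number of data elements is stored in each of
    # the Q number of logical memories."
    memories = [np.array(data) for _ in range(q_memories)]
    starts = list(range(0, len(data) - window + 1, stride))
    expanded = []
    # One "parallel" fetch per group: window i of the group reads memory i.
    for group in zip(*[iter(starts)] * q_memories):
        expanded.extend(memories[i][s:s + window] for i, s in enumerate(group))
    return np.stack(expanded)

# Twelve input elements expanded into overlapping 3-element windows.
print(expand_windows(np.arange(12), window=3, stride=1, q_memories=2))
```
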
  • Patent number: 11144820
    Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the matrix vector unit, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding a chain of instructions received via an input queue, where the chain of instructions comprises a first instruction that can only be processed by the matrix vector unit and a sequence of instructions that can only be processed by a multifunction unit. The method includes processing the first instruction using the MVU and processing each of instructions in the sequence of instructions depending upon a position of the each of instructions in the sequence of instructions.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: October 12, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers
  • Patent number: 11132599
    Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the MVU, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding instructions including a first type of instruction for processing by only the MVU and a second type of instruction for processing by only one of the multifunction units. The method includes mapping a first instruction to the matrix vector unit or to any one of the first multifunction unit, the second multifunction unit, or the third multifunction unit, depending on whether the first instruction is the first type of instruction or the second type of instruction.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: September 28, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers
  • Patent number: 10795678
    Abstract: Neural network processors including a vector register file (VRF) having a multi-port memory and related methods are provided. The processor may include tiles to process an N by N matrix of data elements and an N by 1 vector of data elements. The VRF may, in response to a write instruction, store N data elements in a multi-port memory and, during each one of P clock cycles, provide N data elements to each one of P input interface circuits of the multi-port memory comprising an input lane configured to carry L data elements in parallel. During each one of the P clock cycles the multi-port memory may be configured to receive N data elements via a selected at least one of the P input interface circuits. The VRF may include output interface circuits for providing N data elements in response to a read instruction.
    Type: Grant
    Filed: April 21, 2018
    Date of Patent: October 6, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jeremy Fowers, Kalin Ovtcharov, Eric S. Chung, Todd Michael Massengill, Ming Gang Liu, Gabriel Leonard Weisz
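
A toy model of the multi-port VRF write path. It assumes N = P x L, so that each of the P cycles delivers one L-wide lane through a selected input interface; that is one consistent reading of the abstract, and the names and port-selection policy are invented.

```python
import numpy as np

class VectorRegisterFile:
    """Toy multi-port VRF: an N-element vector is written over P clock
    cycles through P input interfaces whose lanes carry L elements each
    (so N = P * L in this sketch)."""
    def __init__(self, num_registers, p_ports, l_lane):
        self.p, self.l = p_ports, l_lane
        self.n = p_ports * l_lane
        self.regs = np.zeros((num_registers, self.n))

    def write(self, index, vector):
        # Model the P clock cycles: on each cycle a selected input
        # interface delivers one L-wide lane of the vector in parallel.
        for cycle in range(self.p):
            port = cycle                   # selected input interface this cycle
            lo = port * self.l
            self.regs[index, lo:lo + self.l] = vector[lo:lo + self.l]

    def read(self, index):
        # An output interface returns all N elements for a read instruction.
        return self.regs[index].copy()

vrf = VectorRegisterFile(num_registers=8, p_ports=4, l_lane=2)
vrf.write(0, np.arange(8.0))
print(vrf.read(0))  # [0. 1. 2. 3. 4. 5. 6. 7.]
```
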
  • Publication number: 20200279153
    Abstract: Systems and methods for deriving a concordant software neural network layer are provided. A method includes receiving first instructions configured to, using a neural network processor (NNP), process a first set of data corresponding to a neural network layer, where the NNP is configured to quantize the first set of data to generate a set of quantized data and then perform matrix-vector multiply operations on the set of quantized data using a matrix-vector multiplier incorporated within hardware associated with the NNP to generate a first set of results. The method further includes processing the first instructions to automatically generate second instructions configured for use with at least one processor, different from the NNP, such that the second instructions, when executed by the at least one processor to perform matrix multiply operations, generate a second set of results that are concordant with the first set of results.
    Type: Application
    Filed: March 1, 2019
    Publication date: September 3, 2020
    Inventors: Jeremy Fowers, Daniel Lo, Deeksha Dangwal
  • Publication number: 20190324748
    Abstract: Neural network processors including a vector register file (VRF) having a multi-port memory and related methods are provided. The processor may include tiles to process an N by N matrix of data elements and an N by 1 vector of data elements. The VRF may, in response to a write instruction, store N data elements in a multi-port memory and, during each one of P clock cycles, provide N data elements to each one of P input interface circuits of the multi-port memory comprising an input lane configured to carry L data elements in parallel. During each one of the P clock cycles the multi-port memory may be configured to receive N data elements via a selected at least one of the P input interface circuits. The VRF may include output interface circuits for providing N data elements in response to a read instruction.
    Type: Application
    Filed: April 21, 2018
    Publication date: October 24, 2019
    Inventors: Jeremy Fowers, Kalin Ovtcharov, Eric S. Chung, Todd Michael Massengill, Ming Gang Liu, Gabriel Leonard Weisz
  • Publication number: 20190325296
    Abstract: Neural network processors that have been customized based on application-specific synthesis specialization parameters and related methods are described. Certain example neural network processors and methods described in the present disclosure expose several major synthesis specialization parameters that can be used for specializing a microarchitecture instance of a neural network processor to specific neural network models, including: (1) aligning the native vector dimension to the parameters of the model to minimize padding and waste during model evaluation, (2) increasing lane widths to drive up intra-row-level parallelism, or (3) increasing matrix multiply tiles to exploit sub-matrix parallelism for large neural network models.
    Type: Application
    Filed: April 21, 2018
    Publication date: October 24, 2019
    Inventors: Jeremy Fowers, Kalin Ovtcharov, Eric S. Chung, Todd Michael Massengill, Ming Gang Liu, Gabriel Leonard Weisz
  • Publication number: 20190325297
    Abstract: Neural network processors including a window expander circuit and related methods are provided. The window expander circuit may include a first logic circuit configured to store a set of data elements, corresponding to at least a subset of the input data, into a Q number of logical memories, where each of a P number of data elements of the set of the data elements is stored in each of the Q number of logical memories. The window expander circuit may further include a second logic circuit configured to receive the set of data elements and additional data elements corresponding to the at least the subset of the input data from the Q number of logical memories and expand the at least the subset of the input data until it has been expanded by a predetermined factor.
    Type: Application
    Filed: April 21, 2018
    Publication date: October 24, 2019
    Inventors: Jeremy Fowers, Dan Zhang, Mohammadmahdi Ghandi
  • Patent number: 10140252
    Abstract: Hardware and methods for neural network processing are provided. A method in a system comprising a plurality of nodes, where each node comprises a plurality of tiles, is provided. The method includes receiving an N by M matrix of coefficients configured to control a neural network model. The method includes storing a first row and a second row of the N by M matrix of coefficients in first and second on-chip memories incorporated within a first and a second of the plurality of tiles. The method includes processing the first row of the coefficients and a first set of input vectors using a first compute unit incorporated within the first of the plurality of tiles. The method includes processing the second row of the coefficients and a second set of input vectors using a second compute unit incorporated within the second of the plurality of tiles.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: November 27, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jeremy Fowers, Eric S. Chung
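
The row-per-tile distribution above reduces to a simple sketch: each tile pins its own rows of the coefficient matrix and multiplies them against its own input vectors, so the tiles' compute units work in parallel. The names and the even row split are assumptions.

```python
import numpy as np

def tiled_matvec(coefficients, vectors_per_tile):
    """Toy tiled scheme: each tile stores its own rows of the N x M
    coefficient matrix on-chip and multiplies them against its own set of
    input vectors with its own compute unit."""
    n_tiles = len(vectors_per_tile)
    row_chunks = np.array_split(coefficients, n_tiles)  # rows striped per tile
    # Each tile's compute unit works independently on its stored rows.
    return [rows @ vec for rows, vec in zip(row_chunks, vectors_per_tile)]

coeffs = np.random.randn(4, 6)                      # hypothetical N x M matrix
inputs = [np.random.randn(6), np.random.randn(6)]   # one input vector per tile
print([p.shape for p in tiled_matvec(coeffs, inputs)])  # [(2,), (2,)]
```
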
  • Publication number: 20180247186
    Abstract: Hardware and methods for neural network processing are provided. A method in a hardware node including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the matrix vector unit, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes performing, using the MVU, a first type of instruction that can only be performed by the MVU to generate a first result. The method further includes performing a second type of instruction that can only be performed by one of the multifunction units to generate a second result and, without storing either of the two results in a global register, passing the second result to the second multifunction unit and the third multifunction unit.
    Type: Application
    Filed: June 29, 2017
    Publication date: August 30, 2018
    Inventors: Jeremy Fowers, Eric S. Chung, Douglas C. Burger