Patents by Inventor Jeremy Fowers

Jeremy Fowers has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11886833
    Abstract: Embodiments of the present disclosure include systems and methods for providing hierarchical and shared exponent floating point data types. First and second shared exponent values are determined based on exponent values of a plurality of floating point values. A third shared exponent value is determined based on the first shared exponent value and the second shared exponent value. First and second difference values are determined based on the first shared exponent value, the second shared exponent value, and the third shared exponent value. Sign values and mantissa values are determined for the plurality of floating point values. The sign value and the mantissa value for each floating point value in the plurality of floating point values, the third shared exponent value, the first difference value, and the second difference value are stored in a data structure for a shared exponent floating point data type.
    Type: Grant
    Filed: June 28, 2021
    Date of Patent: January 30, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Bita Darvish Rouhani, Venmugil Elango, Rasoul Shafipour, Jeremy Fowers, Ming Gang Liu, Jinwen Xi, Douglas C. Burger, Eric S. Chung
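
For illustration, the hierarchical encoding described in the abstract above can be sketched in a few lines of Python. This is a minimal sketch, not the patented implementation: it assumes each group's shared exponent is the maximum exponent in that group (the abstract does not say how the shared exponents are chosen), and every name in it is invented.

```python
import numpy as np

def encode_hierarchical_shared_exponent(values_a, values_b):
    """Encode two groups of floats with a hierarchical shared exponent."""
    values_a = np.asarray(values_a, dtype=np.float64)
    values_b = np.asarray(values_b, dtype=np.float64)

    exp_a = int(np.max(np.frexp(values_a)[1]))  # first shared exponent
    exp_b = int(np.max(np.frexp(values_b)[1]))  # second shared exponent
    exp_top = max(exp_a, exp_b)                 # third (top-level) shared exponent
    diff_a = exp_top - exp_a                    # first difference value
    diff_b = exp_top - exp_b                    # second difference value

    def signs_and_mantissas(values, shared_exp):
        # Mantissas are scaled so that |value| = mantissa * 2**shared_exp.
        return np.signbit(values), np.abs(values) / np.exp2(shared_exp)

    signs_a, mant_a = signs_and_mantissas(values_a, exp_a)
    signs_b, mant_b = signs_and_mantissas(values_b, exp_b)

    # The stored data structure: one top-level exponent, two small difference
    # values, and a sign plus mantissa per floating point value.
    return {
        "shared_exponent": exp_top,
        "diffs": (diff_a, diff_b),
        "signs": (signs_a, signs_b),
        "mantissas": (mant_a, mant_b),
    }

enc = encode_hierarchical_shared_exponent([0.5, -2.0, 3.25], [0.01, 0.07])
print(enc["shared_exponent"], enc["diffs"])
```

A value is recovered as mantissa * 2**(shared_exponent - diff) with its sign reapplied; a hardware implementation would also truncate each mantissa to a narrow bit width.
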
  • Patent number: 11663450
    Abstract: Hardware and methods for neural network processing are provided. A method in a hardware node including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the matrix vector unit, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes performing, using the MVU, a first type of instruction that can only be performed by the MVU to generate a first result. The method further includes performing a second type of instruction that can only be performed by one of the multifunction units to generate a second result and, without storing either of the two results in a global register, passing the second result to the second multifunction unit and the third multifunction unit.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: May 30, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jeremy Fowers, Eric S. Chung, Douglas C. Burger
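
The result-forwarding idea in the abstract above can be modeled in a few lines of Python. The sketch below assumes the MVU performs a matrix-vector multiply and each multifunction unit applies an elementwise operation; class and method names are invented, and real hardware would stream vectors through the pipeline rather than pass arrays.

```python
import numpy as np

class MatrixVectorUnit:
    """Toy MVU: executes the matrix-vector multiply only it can perform."""
    def __init__(self, weights):
        self.weights = weights

    def execute(self, vector):
        return self.weights @ vector

class MultiFunctionUnit:
    """Toy multifunction unit: applies an elementwise operation."""
    def __init__(self, op):
        self.op = op

    def execute(self, vector):
        return self.op(vector)

def run_pipeline(mvu, mfus, input_vector):
    # The MVU produces the first result; each multifunction unit then
    # consumes its predecessor's output directly -- no intermediate result
    # is ever written to a global register.
    result = mvu.execute(input_vector)
    for mfu in mfus:
        result = mfu.execute(result)
    return result

mfus = [
    MultiFunctionUnit(lambda v: v + 1.0),  # e.g. a bias add
    MultiFunctionUnit(np.tanh),            # e.g. an activation
    MultiFunctionUnit(lambda v: v * 0.5),  # e.g. a rescale
]
print(run_pipeline(MatrixVectorUnit(np.eye(4) * 2.0), mfus, np.ones(4)))
```
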
  • Patent number: 11556764
    Abstract: Systems and methods for deriving a concordant software neural network layer are provided. A method includes receiving first instructions configured to, using a neural network processor (NNP), process a first set of data corresponding to a neural network layer, where the NNP is configured to quantize the first set of data to generate a set of quantized data and then perform matrix-vector multiply operations on the set of quantized data using a matrix-vector multiplier incorporated within hardware associated with the NNP to generate a first set of results. The method further includes processing the first instructions to automatically generate second instructions configured for use with at least one processor, different from the NNP, such that the second instructions, when executed by the at least one processor to perform matrix multiply operations, generate a second set of results that are concordant with the first set of results.
    Type: Grant
    Filed: March 1, 2019
    Date of Patent: January 17, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jeremy Fowers, Daniel Lo, Deeksha Dangwal
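
A sketch of the concordance idea: the generated software path applies the same quantization the NNP applies before its matrix multiply, so both paths produce matching results. The 8-bit symmetric quantizer below is an assumption; the abstract does not specify the NNP's quantization scheme, and all names are invented.

```python
import numpy as np

def quantize(x, bits=8):
    """Symmetric uniform quantizer (an assumed scheme for this sketch)."""
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1)
    return np.round(x / scale).astype(np.int32), scale

def nnp_layer(weights, activations):
    # Models the hardware path: quantize, then perform the matrix-vector
    # multiply on the quantized integer data.
    qw, sw = quantize(weights)
    qa, sa = quantize(activations)
    return (qw @ qa) * (sw * sa)

def concordant_cpu_layer(weights, activations):
    # The derived software path applies the same quantization before its
    # matrix multiply, so its results are concordant with the hardware's.
    qw, sw = quantize(weights)
    qa, sa = quantize(activations)
    return (qw.astype(np.float64) @ qa.astype(np.float64)) * (sw * sa)

w, a = np.random.randn(4, 8), np.random.randn(8)
assert np.allclose(nnp_layer(w, a), concordant_cpu_layer(w, a))
```

Without the shared quantization step, a plain floating point multiply on the CPU would differ from the hardware results by the quantization error, which is the discordance the method avoids.
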
  • Patent number: 11556762
    Abstract: Neural network processors that have been customized based on application-specific synthesis specialization parameters and related methods are described. Certain example neural network processors and methods described in the present disclosure expose several major synthesis specialization parameters that can be used for specializing a microarchitecture instance of a neural network processor to specific neural network models, including: (1) aligning the native vector dimension to the parameters of the model to minimize padding and waste during model evaluation, (2) increasing lane widths to drive up intra-row-level parallelism, or (3) increasing matrix multiply tiles to exploit sub-matrix parallelism for large neural network models.
    Type: Grant
    Filed: April 21, 2018
    Date of Patent: January 17, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jeremy Fowers, Kalin Ovtcharov, Eric S. Chung, Todd Michael Massengill, Ming Gang Liu, Gabriel Leonard Weisz
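
Parameter (1), aligning the native vector dimension, is easy to quantify. The sketch below uses hypothetical layer dimensions and a simplified cost model to show the fraction of multiply-accumulate work wasted on padding when a weight matrix is rounded up to multiples of the native vector size.

```python
def ceil_to(d, native):
    # Round d up to the next multiple of the native vector dimension.
    return -(-d // native) * native

def matmul_padding_waste(rows, cols, native_dim):
    """Fraction of multiply-accumulate work wasted when a rows x cols weight
    matrix is padded so both dimensions align to the native vector size."""
    padded = ceil_to(rows, native_dim) * ceil_to(cols, native_dim)
    return 1.0 - (rows * cols) / padded

# Hypothetical 500 x 700 layer: a native dimension matched to the model
# wastes far less work than a generic large one.
for native in (40, 128, 256):
    print(f"native={native:3d}  waste={matmul_padding_waste(500, 700, native):.1%}")
```
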
  • Publication number: 20220391209
    Abstract: Hardware and methods for neural network processing are provided. A method in a hardware node including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the matrix vector unit, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes performing, using the MVU, a first type of instruction that can only be performed by the MVU to generate a first result. The method further includes performing a second type of instruction that can only be performed by one of the multifunction units to generate a second result and, without storing either of the two results in a global register, passing the second result to the second multifunction unit and the third multifunction unit.
    Type: Application
    Filed: August 8, 2022
    Publication date: December 8, 2022
    Inventors: Jeremy Fowers, Eric S. Chung, Douglas C. Burger
  • Publication number: 20220253281
    Abstract: Embodiments of the present disclosure include systems and methods for providing hierarchical and shared exponent floating point data types. First and second shared exponent values are determined based on exponent values of a plurality of floating point values. A third shared exponent value is determined based on the first shared exponent value and the second shared exponent value. First and second difference values are determined based on the first shared exponent value, the second shared exponent value, and the third shared exponent value. Sign values and mantissa values are determined for the plurality of floating point values. The sign value and the mantissa value for each floating point value in the plurality of floating point values, the third shared exponent value, the first difference value, and the second difference value are stored in a data structure for a shared exponent floating point data type.
    Type: Application
    Filed: June 28, 2021
    Publication date: August 11, 2022
    Inventors: Bita Darvish Rouhani, Venmugil Elango, Rasoul Shafipour, Jeremy Fowers, Ming Gang Liu, Jinwen Xi, Douglas C. Burger, Eric S. Chung
  • Patent number: 11243778
    Abstract: The present disclosure relates to instruction dispatch mechanisms for superscalar processors having a plurality of functional units for executing operations simultaneously. Each particular functional unit of the plurality of functional units may be configured to output a capability vector indicating a set of operations that the particular functional unit is currently available to perform. As instructions are received in an issue queue, the functional unit to execute each instruction is selected by comparing the capabilities required by the instruction to the asserted capabilities of each of the functional units. A functional unit may reset or de-assert a particular capability while performing an operation and then re-assert the capability when the instruction is completed. A result of the operation may be stored in a skid buffer for at least as long as the chain execution time in order to avoid resource hazards at a write port of the vector register file.
    Type: Grant
    Filed: December 31, 2020
    Date of Patent: February 8, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Skand Hurkat, Jeremy Fowers
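
A toy Python model of the capability-vector dispatch described above. The bit assignments and three-operation capability set are invented; the selection logic (the instruction's required capabilities ANDed against each unit's asserted vector) follows the abstract.

```python
# Capability bits required by each operation (invented encoding).
REQUIRES = {"vmul": 0b001, "vadd": 0b010, "vdiv": 0b100}

class FunctionalUnit:
    def __init__(self, name, capabilities):
        self.name = name
        self.capabilities = capabilities  # currently asserted capability vector

    def deassert(self, caps):
        # Drop a capability while an operation that needs it is in flight.
        self.capabilities &= ~caps

    def reassert(self, caps):
        # Re-assert the capability once the instruction completes.
        self.capabilities |= caps

def dispatch(op, units):
    """Select the first unit whose asserted capability vector covers every
    capability the instruction requires."""
    need = REQUIRES[op]
    for unit in units:
        if unit.capabilities & need == need:
            return unit
    return None  # no unit advertises the capability: the instruction waits

units = [FunctionalUnit("fu0", 0b011), FunctionalUnit("fu1", 0b111)]
unit = dispatch("vdiv", units)         # -> fu1, the only unit with vdiv
unit.deassert(REQUIRES["vdiv"])
print(dispatch("vdiv", units))         # -> None until fu1 re-asserts vdiv
```

De-asserting a capability while an operation is in flight makes the issue queue hold matching instructions until the unit re-asserts it, which mirrors the hazard-avoidance behavior the abstract describes.
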
  • Publication number: 20220012577
    Abstract: Systems and methods for neural network processing are provided. A method in a system comprising a plurality of nodes interconnected via a network, where each node includes a plurality of on-chip memory blocks and a plurality of compute units, is provided. The method includes, upon service activation, receiving an N by M matrix of coefficients corresponding to the neural network model. The method includes loading the coefficients corresponding to the neural network model into the plurality of the on-chip memory blocks for processing by the plurality of compute units. The method includes, regardless of a utilization of the plurality of the on-chip memory blocks as part of an evaluation of the neural network model, maintaining the coefficients corresponding to the neural network model in the plurality of the on-chip memory blocks until the service is interrupted or the neural network model is modified or replaced.
    Type: Application
    Filed: September 23, 2021
    Publication date: January 13, 2022
    Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers, Kalin Ovtcharov
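
The pinning behavior described above can be sketched as follows. The striping of coefficient rows across memory blocks is an assumption and all names are invented; the key property from the abstract is that evaluation never evicts the coefficients once the service is activated.

```python
import numpy as np

class Node:
    """Toy node: on-chip memory blocks that pin the model's coefficients."""
    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.blocks = None

    def activate_service(self, coefficients):
        # On service activation, stripe the N x M coefficient matrix across
        # the on-chip memory blocks (row striping is an assumption here).
        self.blocks = np.array_split(coefficients, self.num_blocks)

    def evaluate(self, vector):
        # Evaluation reads the pinned coefficients but never evicts them,
        # regardless of how heavily the memory blocks are utilized.
        assert self.blocks is not None, "service not activated"
        return np.concatenate([block @ vector for block in self.blocks])

    def interrupt_service(self):
        # Coefficients are released only when the service is interrupted
        # (or, equivalently, when the model is modified or replaced).
        self.blocks = None

node = Node(num_blocks=4)
node.activate_service(np.random.randn(8, 8))  # hypothetical 8 x 8 model
print(node.evaluate(np.ones(8)).shape)        # (8,)
```
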
  • Publication number: 20210406657
    Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the MVU, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding instructions including a first type of instruction for processing by only the MVU and a second type of instruction for processing by only one of the multifunction units. The method includes mapping a first instruction to the matrix vector unit or to any one of the first multifunction unit, the second multifunction unit, or the third multifunction unit, depending on whether the first instruction is the first type of instruction or the second type of instruction.
    Type: Application
    Filed: August 23, 2021
    Publication date: December 30, 2021
    Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers
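
A minimal sketch of the two-way mapping rule in the abstract above: MVU-only instructions go to the MVU, and multifunction-unit instructions are assigned to one of the three chained units. The opcode names and the position-based assignment policy are assumptions.

```python
from dataclasses import dataclass

MVU_OPS = {"mv_mul"}                        # first type: MVU-only instructions
MFU_OPS = {"v_add", "v_sigmoid", "v_tanh"}  # second type: MFU-only instructions

@dataclass
class Instruction:
    op: str

def map_instruction(instr, mfu_slot):
    """Map an instruction to a pipeline stage by its type: MVU-only ops go to
    the MVU; multifunction-unit ops go to MFU 0, 1, or 2 in chain order."""
    if instr.op in MVU_OPS:
        return "MVU"
    if instr.op in MFU_OPS:
        return f"MFU{mfu_slot % 3}"  # three multifunction units in the pipeline
    raise ValueError(f"unknown op: {instr.op}")

chain = [Instruction("mv_mul"), Instruction("v_add"), Instruction("v_sigmoid")]
mfu_slot = 0
for instr in chain:
    target = map_instruction(instr, mfu_slot)
    if target.startswith("MFU"):
        mfu_slot += 1
    print(instr.op, "->", target)  # mv_mul -> MVU, v_add -> MFU0, v_sigmoid -> MFU1
```
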
  • Patent number: 11157801
    Abstract: Systems and methods for neural network processing are provided. A method in a system comprising a plurality of nodes interconnected via a network, where each node includes a plurality of on-chip memory blocks and a plurality of compute units, is provided. The method includes, upon service activation, receiving an N by M matrix of coefficients corresponding to the neural network model. The method includes loading the coefficients corresponding to the neural network model into the plurality of the on-chip memory blocks for processing by the plurality of compute units. The method includes, regardless of a utilization of the plurality of the on-chip memory blocks as part of an evaluation of the neural network model, maintaining the coefficients corresponding to the neural network model in the plurality of the on-chip memory blocks until the service is interrupted or the neural network model is modified or replaced.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: October 26, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers, Kalin Ovtcharov
  • Patent number: 11151445
    Abstract: Neural network processors including a window expander circuit and related methods are provided. The window expander circuit may include a first logic circuit configured to store a set of data elements, corresponding to at least a subset of the input data, into a Q number of logical memories, where each of a P number of data elements of the set of the data elements is stored in each of the Q number of logical memories. The window expander circuit may further include a second logic circuit configured to receive the set of data elements and additional data elements corresponding to the at least the subset of the input data from the Q number of logical memories and expand the at least the subset of the input data until it has been expanded by a predetermined factor.
    Type: Grant
    Filed: April 21, 2018
    Date of Patent: October 19, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jeremy Fowers, Dan Zhang, Mohammadmahdi Ghandi
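
One plausible reading of the window expander, sketched in Python: replicating the input elements across Q logical memories lets Q overlapping windows be fetched in the same step without port conflicts. The replication and fetch scheduling below are simplifications, not the circuit the patent claims.

```python
import numpy as np

def expand_windows(data, window, stride, q_memories):
    """Toy window expander: the input is replicated into Q logical memories
    so Q overlapping windows can be fetched in the same step."""
    # Each of the Q memories holds a copy of the input elements, matching the
    # abstract's "each of a P number of data elements is stored in each of
    # the Q number of logical memories."
    memories = [np.array(data) for _ in range(q_memories)]
    starts = list(range(0, len(data) - window + 1, stride))
    expanded = []
    # One "parallel" fetch per group: window i of the group reads memory i.
    for group in zip(*[iter(starts)] * q_memories):
        expanded.extend(memories[i][s:s + window] for i, s in enumerate(group))
    return np.stack(expanded)

# Twelve input elements expanded into overlapping 3-element windows.
print(expand_windows(np.arange(12), window=3, stride=1, q_memories=2))
```
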
  • Patent number: 11144820
    Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the matrix vector unit, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding a chain of instructions received via an input queue, where the chain of instructions comprises a first instruction that can only be processed by the matrix vector unit and a sequence of instructions that can only be processed by a multifunction unit. The method includes processing the first instruction using the MVU and processing each of instructions in the sequence of instructions depending upon a position of the each of instructions in the sequence of instructions.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: October 12, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers
  • Patent number: 11132599
    Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the MVU, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding instructions including a first type of instruction for processing by only the MVU and a second type of instruction for processing by only one of the multifunction units. The method includes mapping a first instruction to the matrix vector unit or to any one of the first multifunction unit, the second multifunction unit, or the third multifunction unit, depending on whether the first instruction is the first type of instruction or the second type of instruction.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: September 28, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers
  • Patent number: 10795678
    Abstract: Neural network processors including a vector register file (VRF) having a multi-port memory and related methods are provided. The processor may include tiles to process an N by N matrix of data elements and an N by 1 vector of data elements. The VRF may, in response to a write instruction, store N data elements in a multi-port memory and, during each one of P clock cycles, provide N data elements to each one of P input interface circuits of the multi-port memory comprising an input lane configured to carry L data elements in parallel. During each one of the P clock cycles the multi-port memory may be configured to receive N data elements via a selected at least one of the P input interface circuits. The VRF may include output interface circuits for providing N data elements in response to a read instruction.
    Type: Grant
    Filed: April 21, 2018
    Date of Patent: October 6, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jeremy Fowers, Kalin Ovtcharov, Eric S. Chung, Todd Michael Massengill, Ming Gang Liu, Gabriel Leonard Weisz
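
A toy model of the multi-port VRF write path. It assumes N = P x L, so that each of the P cycles delivers one L-wide lane through a selected input interface; that is one consistent reading of the abstract, and the names and port-selection policy are invented.

```python
import numpy as np

class VectorRegisterFile:
    """Toy multi-port VRF: an N-element vector is written over P clock
    cycles through P input interfaces whose lanes carry L elements each
    (so N = P * L in this sketch)."""
    def __init__(self, num_registers, p_ports, l_lane):
        self.p, self.l = p_ports, l_lane
        self.n = p_ports * l_lane
        self.regs = np.zeros((num_registers, self.n))

    def write(self, index, vector):
        # Model the P clock cycles: on each cycle a selected input
        # interface delivers one L-wide lane of the vector in parallel.
        for cycle in range(self.p):
            port = cycle                   # selected input interface this cycle
            lo = port * self.l
            self.regs[index, lo:lo + self.l] = vector[lo:lo + self.l]

    def read(self, index):
        # An output interface returns all N elements for a read instruction.
        return self.regs[index].copy()

vrf = VectorRegisterFile(num_registers=8, p_ports=4, l_lane=2)
vrf.write(0, np.arange(8.0))
print(vrf.read(0))  # [0. 1. 2. 3. 4. 5. 6. 7.]
```
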
  • Publication number: 20200279153
    Abstract: Systems and methods for deriving a concordant software neural network layer are provided. A method includes receiving first instructions configured to, using a neural network processor (NNP), process a first set of data corresponding to a neural network layer, where the NNP is configured to quantize the first set of data to generate a set of quantized data and then perform matrix-vector multiply operations on the set of quantized data using a matrix-vector multiplier incorporated within hardware associated with the NNP to generate a first set of results. The method further includes processing the first instructions to automatically generate second instructions configured for use with at least one processor, different from the NNP, such that the second instructions, when executed by the at least one processor to perform matrix multiply operations, generate a second set of results that are concordant with the first set of results.
    Type: Application
    Filed: March 1, 2019
    Publication date: September 3, 2020
    Inventors: Jeremy Fowers, Daniel Lo, Deeksha Dangwal
  • Publication number: 20190324748
    Abstract: Neural network processors including a vector register file (VRF) having a multi-port memory and related methods are provided. The processor may include tiles to process an N by N matrix of data elements and an N by 1 vector of data elements. The VRF may, in response to a write instruction, store N data elements in a multi-port memory and, during each one of P clock cycles, provide N data elements to each one of P input interface circuits of the multi-port memory comprising an input lane configured to carry L data elements in parallel. During each one of the P clock cycles the multi-port memory may be configured to receive N data elements via a selected at least one of the P input interface circuits. The VRF may include output interface circuits for providing N data elements in response to a read instruction.
    Type: Application
    Filed: April 21, 2018
    Publication date: October 24, 2019
    Inventors: Jeremy Fowers, Kalin Ovtcharov, Eric S. Chung, Todd Michael Massengill, Ming Gang Liu, Gabriel Leonard Weisz
  • Publication number: 20190325296
    Abstract: Neural network processors that have been customized based on application-specific synthesis specialization parameters and related methods are described. Certain example neural network processors and methods described in the present disclosure expose several major synthesis specialization parameters that can be used for specializing a microarchitecture instance of a neural network processor to specific neural network models, including: (1) aligning the native vector dimension to the parameters of the model to minimize padding and waste during model evaluation, (2) increasing lane widths to drive up intra-row-level parallelism, or (3) increasing matrix multiply tiles to exploit sub-matrix parallelism for large neural network models.
    Type: Application
    Filed: April 21, 2018
    Publication date: October 24, 2019
    Inventors: Jeremy Fowers, Kalin Ovtcharov, Eric S. Chung, Todd Michael Massengill, Ming Gang Liu, Gabriel Leonard Weisz
  • Publication number: 20190325297
    Abstract: Neural network processors including a window expander circuit and related methods are provided. The window expander circuit may include a first logic circuit configured to store a set of data elements, corresponding to at least a subset of the input data, into a Q number of logical memories, where each of a P number of data elements of the set of the data elements is stored in each of the Q number of logical memories. The window expander circuit may further include a second logic circuit configured to receive the set of data elements and additional data elements corresponding to the at least the subset of the input data from the Q number of logical memories and expand the at least the subset of the input data until it has been expanded by a predetermined factor.
    Type: Application
    Filed: April 21, 2018
    Publication date: October 24, 2019
    Inventors: Jeremy Fowers, Dan Zhang, Mohammadmahdi Ghandi
  • Patent number: 10140252
    Abstract: Hardware and methods for neural network processing are provided. A method in a system comprising a plurality of nodes, where each node comprises a plurality of tiles, is provided. The method includes receiving an N by M matrix of coefficients configured to control a neural network model. The method includes storing a first row and a second row of the N by M matrix of coefficients in first and second on-chip memories incorporated within a first and a second of the plurality of tiles. The method includes processing the first row of the coefficients and a first set of input vectors using a first compute unit incorporated within the first of the plurality of tiles. The method includes processing the second row of the coefficients and a second set of input vectors using a second compute unit incorporated within the second of the plurality of tiles.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: November 27, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jeremy Fowers, Eric S. Chung
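
The row-per-tile distribution above reduces to a simple sketch: each tile pins its own rows of the coefficient matrix and multiplies them against its own input vectors, so the tiles' compute units work in parallel. The names and the even row split are assumptions.

```python
import numpy as np

def tiled_matvec(coefficients, vectors_per_tile):
    """Toy tiled scheme: each tile stores its own rows of the N x M
    coefficient matrix on-chip and multiplies them against its own set of
    input vectors with its own compute unit."""
    n_tiles = len(vectors_per_tile)
    row_chunks = np.array_split(coefficients, n_tiles)  # rows striped per tile
    # Each tile's compute unit works independently on its stored rows.
    return [rows @ vec for rows, vec in zip(row_chunks, vectors_per_tile)]

coeffs = np.random.randn(4, 6)                      # hypothetical N x M matrix
inputs = [np.random.randn(6), np.random.randn(6)]   # one input vector per tile
print([p.shape for p in tiled_matvec(coeffs, inputs)])  # [(2,), (2,)]
```
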
  • Publication number: 20180247186
    Abstract: Hardware and methods for neural network processing are provided. A method in a hardware node including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the matrix vector unit, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes performing, using the MVU, a first type of instruction that can only be performed by the MVU to generate a first result. The method further includes performing a second type of instruction that can only be performed by one of the multifunction units to generate a second result and, without storing either of the two results in a global register, passing the second result to the second multifunction unit and the third multifunction unit.
    Type: Application
    Filed: June 29, 2017
    Publication date: August 30, 2018
    Inventors: Jeremy Fowers, Eric S. Chung, Douglas C. Burger