Patents by Inventor Jeremy Halden Fowers
Jeremy Halden Fowers has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230342320Abstract: The present disclosure relates to devices for using a configurable stacked architecture for a fixed function datapath with an accelerator for accelerating an operation or a layer of a deep neural network (DNN). The stacked architecture may have a fixed function datapath that includes one or more configurable micro-execution units that execute a series of vector, scalar, reduction, broadcasting, and normalization operations for a DNN layer operation. The fixed function datapath may be customizable based on the DNN or the operation.Type: ApplicationFiled: July 5, 2023Publication date: October 26, 2023Inventors: Stephen Sangho YOUN, Steven Karl REINHARDT, Jeremy Halden FOWERS, Lok Chand KOPPAKA, Kalin OVTCHAROV
-
Patent number: 11734214Abstract: The present disclosure relates to devices for using a configurable stacked architecture for a fixed function datapath with an accelerator for accelerating an operation or a layer of a deep neural network (DNN). The stacked architecture may have a fixed function datapath that includes one or more configurable micro-execution units that execute a series of vector, scalar, reduction, broadcasting, and normalization operations for a DNN layer operation. The fixed function datapath may be customizable based on the DNN or the operation.Type: GrantFiled: March 25, 2021Date of Patent: August 22, 2023Assignee: MICROSOFT TECHNOLOGY LICENSING, LLCInventors: Stephen Sangho Youn, Steven Karl Reinhardt, Jeremy Halden Fowers, Lok Chand Koppaka, Kalin Ovtcharov
-
Publication number: 20220245083Abstract: The present disclosure relates to devices for using a configurable stacked architecture for a fixed function datapath with an accelerator for accelerating an operation or a layer of a deep neural network (DNN). The stacked architecture may have a fixed function datapath that includes one or more configurable micro-execution units that execute a series of vector, scalar, reduction, broadcasting, and normalization operations for a DNN layer operation. The fixed function datapath may be customizable based on the DNN or the operation.Type: ApplicationFiled: March 25, 2021Publication date: August 4, 2022Inventors: Stephen Sangho YOUN, Steven Karl REINHARDT, Jeremy Halden FOWERS, Lok Chand KOPPAKA, Kalin OVTCHAROV
-
Patent number: 10467324Abstract: A method is provided that includes providing a hard-wired integer multiplier circuit configured to multiply a first physical operand and a second physical operand, mapping a first logical operand to a first portion of the first physical operand, mapping a second logical operand to a second portion of the first physical operand, and mapping a third logical operand to the second physical operand. The method further includes multiplying the first physical operand and the second physical operand using the hard-wired integer multiplier circuit to provide a multiplication result that includes a first portion including a product of the first logical operand and the third logical operand, and a second portion including a product of the second logical operand and the third logical operand.Type: GrantFiled: May 24, 2017Date of Patent: November 5, 2019Assignee: Microsoft Technology Licensing, LLCInventors: Eric Sen Chung, Jeremy Halden Fowers, Shlomo Alkalay
-
Patent number: 10372456Abstract: A hardware accelerator having an efficient instruction set is disclosed. An apparatus may comprise logic configured to access a first and a second machine instruction. The second machine instruction may be missing a tensor operand needed to execute the second machine instruction. The logic may be further configured to execute the first machine instruction, resulting in a tensor. The logic may be further configured to execute the second machine instruction using the resultant tensor as the missing tensor operand.Type: GrantFiled: May 24, 2017Date of Patent: August 6, 2019Assignee: Microsoft Technology Licensing, LLCInventors: Jeremy Halden Fowers, Kalin Ovtcharov, Steven Karl Reinhardt, Eric Sen Chung, Ming Gang Liu
-
Patent number: 10338925Abstract: Tensor register files in a hardware accelerator are disclosed. An apparatus may comprise tensor operation calculators each configured to perform a type of tensor operation. The apparatus may also comprises tensor register files, each of which is associated with one of the tensor operation calculators. The apparatus may also comprises logic configured to store respective ones of the tensors in the plurality of tensor register files in accordance with the type of tensor operation to be performed on the respective tensors. The apparatus may also control read access to tensor register files based on a type of tensor operation that a machine instruction is to perform.Type: GrantFiled: May 24, 2017Date of Patent: July 2, 2019Assignee: Microsoft Technology Licensing, LLCInventors: Jeremy Halden Fowers, Steven Karl Reinhardt, Kalin Ovtcharov, Eric Sen Chung
-
Patent number: 10331445Abstract: A processor circuit is provided that includes an input terminal and an output terminal, a plurality of vector processor operation circuits, a selector circuit coupled to the input terminal, the output terminal, and each of the vector processor operation circuits, and a scheduler circuit adapted to control the selector circuit to configure a vector processing pipeline comprising zero, one or more of the vector processor operation circuits in any order between the input terminal and the output terminal.Type: GrantFiled: May 24, 2017Date of Patent: June 25, 2019Assignee: Microsoft Technology Licensing, LLCInventors: Jeremy Halden Fowers, Ming Gang Liu, Kalin Ovtcharov, Steven Karl Reinhardt, Eric Sen Chung
-
Publication number: 20180341486Abstract: A processor circuit is provided that includes an input terminal and an output terminal, a plurality of vector processor operation circuits, a selector circuit coupled to the input terminal, the output terminal, and each of the vector processor operation circuits, and a scheduler circuit adapted to control the selector circuit to configure a vector processing pipeline comprising zero, one or more of the vector processor operation circuits in any order between the input terminal and the output terminal.Type: ApplicationFiled: May 24, 2017Publication date: November 29, 2018Applicant: Microsoft Technology Licensing, LLCInventors: Jeremy Halden FOWERS, Ming Gang LIU, Kalin OVTCHAROV, Steven Karl REINHARDT, Eric Sen CHUNG
-
Publication number: 20180341483Abstract: Tensor register files in a hardware accelerator are disclosed. An apparatus may comprise tensor operation calculators each configured to perform a type of tensor operation. The apparatus may also comprises tensor register files, each of which is associated with one of the tensor operation calculators. The apparatus may also comprises logic configured to store respective ones of the tensors in the plurality of tensor register files in accordance with the type of tensor operation to be performed on the respective tensors. The apparatus may also control read access to tensor register files based on a type of tensor operation that a machine instruction is to perform.Type: ApplicationFiled: May 24, 2017Publication date: November 29, 2018Applicant: Microsoft Technology Licensing, LLCInventors: Jeremy Halden FOWERS, Steven Karl REINHARDT, Kalin OVTCHAROV, Eric Sen CHUNG
-
Publication number: 20180341622Abstract: A method is provided that includes providing a hard-wired integer multiplier circuit configured to multiply a first physical operand and a second physical operand, mapping a first logical operand to a first portion of the first physical operand, mapping a second logical operand to a second portion of the first physical operand, and mapping a third logical operand to the second physical operand. The method further includes multiplying the first physical operand and the second physical operand using the hard-wired integer multiplier circuit to provide a multiplication result that includes a first portion including a product of the first logical operand and the third logical operand, and a second portion including a product of the second logical operand and the third logical operand.Type: ApplicationFiled: May 24, 2017Publication date: November 29, 2018Applicant: Microsoft Technology Licensing, LLCInventors: Eric Sen CHUNG, Jeremy Halden FOWERS, Shlomo ALKALAY
-
Publication number: 20180341484Abstract: A hardware accelerator having an efficient instruction set is disclosed. An apparatus may comprise logic configured to access a first and a second machine instruction. The second machine instruction may be missing a tensor operand needed to execute the second machine instruction. The logic may be further configured to execute the first machine instruction, resulting in a tensor. The logic may be further configured to execute the second machine instruction using the resultant tensor as the missing tensor operand.Type: ApplicationFiled: May 24, 2017Publication date: November 29, 2018Applicant: Microsoft Technology Licensing, LLCInventors: Jeremy Halden FOWERS, Kalin OVTCHAROV, Steven Karl REINHARDT, Eric Sen CHUNG, Ming Gang LIU
-
Patent number: 9590655Abstract: A method of lossless data compression includes receiving a set of parallel data strings; determining compression hash values for each of the parallel data strings; determining bit matches among portions of each of the parallel data strings based, at least in part, on the compression hash values; selecting among literals and the bit matches for each of the parallel data strings; and applying Huffman encoding to the selected literals or the selected bit matches.Type: GrantFiled: June 12, 2015Date of Patent: March 7, 2017Assignee: Microsoft Technology Licensing, LLCInventors: Joo-Young Kim, Douglas C. Burger, Jeremy Halden Fowers, Scott A. Hauck
-
Publication number: 20160285473Abstract: A method of lossless data compression includes receiving a set of parallel data strings; determining compression hash values for each of the parallel data strings; determining bit matches among portions of each of the parallel data strings based, at least in part, on the compression hash values; selecting among literals and the bit matches for each of the parallel data strings; and applying Huffman encoding to the selected literals or the selected bit matches.Type: ApplicationFiled: June 12, 2015Publication date: September 29, 2016Inventors: Joo-Young Kim, Douglas C. Burger, Jeremy Halden Fowers, Scott A. Hauck