Patents by Inventor Steven Karl REINHARDT
Steven Karl REINHARDT has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230342320
Abstract: The present disclosure relates to devices for using a configurable stacked architecture for a fixed function datapath with an accelerator for accelerating an operation or a layer of a deep neural network (DNN). The stacked architecture may have a fixed function datapath that includes one or more configurable micro-execution units that execute a series of vector, scalar, reduction, broadcasting, and normalization operations for a DNN layer operation. The fixed function datapath may be customizable based on the DNN or the operation.
Type: Application
Filed: July 5, 2023
Publication date: October 26, 2023
Inventors: Stephen Sangho YOUN, Steven Karl REINHARDT, Jeremy Halden FOWERS, Lok Chand KOPPAKA, Kalin OVTCHAROV
-
Publication number: 20230305967
Abstract: The present disclosure relates to devices and methods for using a banked memory structure with accelerators. The devices and methods may segment and isolate dataflows in datapath and memory of the accelerator. The devices and methods may provide each data channel with its own register memory bank. The devices and methods may use a memory address decoder to place the local variables in the proper memory bank.
Type: Application
Filed: May 30, 2023
Publication date: September 28, 2023
Inventors: Stephen Sangho YOUN, Steven Karl REINHARDT, Hui GENG
-
Patent number: 11734214
Abstract: The present disclosure relates to devices for using a configurable stacked architecture for a fixed function datapath with an accelerator for accelerating an operation or a layer of a deep neural network (DNN). The stacked architecture may have a fixed function datapath that includes one or more configurable micro-execution units that execute a series of vector, scalar, reduction, broadcasting, and normalization operations for a DNN layer operation. The fixed function datapath may be customizable based on the DNN or the operation.
Type: Grant
Filed: March 25, 2021
Date of Patent: August 22, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Stephen Sangho Youn, Steven Karl Reinhardt, Jeremy Halden Fowers, Lok Chand Koppaka, Kalin Ovtcharov
-
Patent number: 11704251
Abstract: The present disclosure relates to devices and methods for using a banked memory structure with accelerators. The devices and methods may segment and isolate dataflows in datapath and memory of the accelerator. The devices and methods may provide each data channel with its own register memory bank. The devices and methods may use a memory address decoder to place the local variables in the proper memory bank.
Type: Grant
Filed: April 27, 2022
Date of Patent: July 18, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Stephen Sangho Youn, Steven Karl Reinhardt, Hui Geng
-
Publication number: 20220253384
Abstract: The present disclosure relates to devices and methods for using a banked memory structure with accelerators. The devices and methods may segment and isolate dataflows in datapath and memory of the accelerator. The devices and methods may provide each data channel with its own register memory bank. The devices and methods may use a memory address decoder to place the local variables in the proper memory bank.
Type: Application
Filed: April 27, 2022
Publication date: August 11, 2022
Inventors: Stephen Sangho YOUN, Steven Karl REINHARDT, Hui GENG
-
Publication number: 20220245083
Abstract: The present disclosure relates to devices for using a configurable stacked architecture for a fixed function datapath with an accelerator for accelerating an operation or a layer of a deep neural network (DNN). The stacked architecture may have a fixed function datapath that includes one or more configurable micro-execution units that execute a series of vector, scalar, reduction, broadcasting, and normalization operations for a DNN layer operation. The fixed function datapath may be customizable based on the DNN or the operation.
Type: Application
Filed: March 25, 2021
Publication date: August 4, 2022
Inventors: Stephen Sangho YOUN, Steven Karl REINHARDT, Jeremy Halden FOWERS, Lok Chand KOPPAKA, Kalin OVTCHAROV
-
Patent number: 11347652
Abstract: The present disclosure relates to devices and methods for using a banked memory structure with accelerators. The devices and methods may segment and isolate dataflows in datapath and memory of the accelerator. The devices and methods may provide each data channel with its own register memory bank. The devices and methods may use a memory address decoder to place the local variables in the proper memory bank.
Type: Grant
Filed: November 13, 2020
Date of Patent: May 31, 2022
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Stephen Sangho Youn, Steven Karl Reinhardt, Hui Geng
-
Publication number: 20220066943
Abstract: The present disclosure relates to devices and methods for using a banked memory structure with accelerators. The devices and methods may segment and isolate dataflows in datapath and memory of the accelerator. The devices and methods may provide each data channel with its own register memory bank. The devices and methods may use a memory address decoder to place the local variables in the proper memory bank.
Type: Application
Filed: November 13, 2020
Publication date: March 3, 2022
Inventors: Stephen Sangho YOUN, Steven Karl REINHARDT, Hui GENG
-
Publication number: 20210312266
Abstract: Deep neural network accelerators (DNNs) with independent datapaths for simultaneous processing of different classes of operations and related methods are described. An example DNN accelerator includes an instruction dispatcher for receiving chains of instructions having both instructions for performing a first class of operations and a second class of operations corresponding to a neural network model. The DNN accelerator further includes a first datapath and a second datapath, where each is configured to execute at least one instruction chain locally before outputting any results. The instruction dispatcher is configured to forward instructions for performing the first class of operations to the first datapath and forward instructions for performing the second class of operations to the second datapath to overlap in time a performance of at least a subset of the first class of operations with a performance of at least a subset of the second class of operations.
Type: Application
Filed: April 1, 2020
Publication date: October 7, 2021
Applicant: Microsoft Technology Licensing, LLC
Inventors: Stephen Sangho YOUN, Lok Chand KOPPAKA, Steven Karl REINHARDT
-
Patent number: 10997116
Abstract: A computing system is described herein that expedites deep neural network (DNN) operations or other processing operations using a hardware accelerator. The hardware accelerator, in turn, includes a tensor-processing engine that works in conjunction with a scalar-processing unit (SPU). The tensor-processing engine handles various kinds of tensor-based operations required by the DNN, such as multiplying vectors by matrices, combining vectors with other vectors, transforming individual vectors, etc. The SPU performs scalar-based operations, such as forming the reciprocal of a scalar, generating the square root of a scalar, etc. According to one illustrative implementation, the computing system uses the same vector-based programmatic interface to interact with both the tensor-processing engine and the SPU.
Type: Grant
Filed: August 6, 2019
Date of Patent: May 4, 2021
Assignee: Microsoft Technology Licensing, LLC
Inventors: Steven Karl Reinhardt, Joseph Anthony Mayer, II, Dan Zhang
-
Publication number: 20210042260
Abstract: A computing system is described herein that expedites deep neural network (DNN) operations or other processing operations using a hardware accelerator. The hardware accelerator, in turn, includes a tensor-processing engine that works in conjunction with a scalar-processing unit (SPU). The tensor-processing engine handles various kinds of tensor-based operations required by the DNN, such as multiplying vectors by matrices, combining vectors with other vectors, transforming individual vectors, etc. The SPU performs scalar-based operations, such as forming the reciprocal of a scalar, generating the square root of a scalar, etc. According to one illustrative implementation, the computing system uses the same vector-based programmatic interface to interact with both the tensor-processing engine and the SPU.
Type: Application
Filed: August 6, 2019
Publication date: February 11, 2021
Inventors: Steven Karl REINHARDT, Joseph Anthony MAYER, II, Dan ZHANG
-
Patent number: 10372456
Abstract: A hardware accelerator having an efficient instruction set is disclosed. An apparatus may comprise logic configured to access a first and a second machine instruction. The second machine instruction may be missing a tensor operand needed to execute the second machine instruction. The logic may be further configured to execute the first machine instruction, resulting in a tensor. The logic may be further configured to execute the second machine instruction using the resultant tensor as the missing tensor operand.
Type: Grant
Filed: May 24, 2017
Date of Patent: August 6, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jeremy Halden Fowers, Kalin Ovtcharov, Steven Karl Reinhardt, Eric Sen Chung, Ming Gang Liu
-
Patent number: 10338925
Abstract: Tensor register files in a hardware accelerator are disclosed. An apparatus may comprise tensor operation calculators each configured to perform a type of tensor operation. The apparatus may also comprise tensor register files, each of which is associated with one of the tensor operation calculators. The apparatus may also comprise logic configured to store respective ones of the tensors in the plurality of tensor register files in accordance with the type of tensor operation to be performed on the respective tensors. The apparatus may also control read access to tensor register files based on a type of tensor operation that a machine instruction is to perform.
Type: Grant
Filed: May 24, 2017
Date of Patent: July 2, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jeremy Halden Fowers, Steven Karl Reinhardt, Kalin Ovtcharov, Eric Sen Chung
-
Patent number: 10331445
Abstract: A processor circuit is provided that includes an input terminal and an output terminal, a plurality of vector processor operation circuits, a selector circuit coupled to the input terminal, the output terminal, and each of the vector processor operation circuits, and a scheduler circuit adapted to control the selector circuit to configure a vector processing pipeline comprising zero, one or more of the vector processor operation circuits in any order between the input terminal and the output terminal.
Type: Grant
Filed: May 24, 2017
Date of Patent: June 25, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Jeremy Halden Fowers, Ming Gang Liu, Kalin Ovtcharov, Steven Karl Reinhardt, Eric Sen Chung
-
Publication number: 20180341483
Abstract: Tensor register files in a hardware accelerator are disclosed. An apparatus may comprise tensor operation calculators each configured to perform a type of tensor operation. The apparatus may also comprise tensor register files, each of which is associated with one of the tensor operation calculators. The apparatus may also comprise logic configured to store respective ones of the tensors in the plurality of tensor register files in accordance with the type of tensor operation to be performed on the respective tensors. The apparatus may also control read access to tensor register files based on a type of tensor operation that a machine instruction is to perform.
Type: Application
Filed: May 24, 2017
Publication date: November 29, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Jeremy Halden FOWERS, Steven Karl REINHARDT, Kalin OVTCHAROV, Eric Sen CHUNG
-
Publication number: 20180341486
Abstract: A processor circuit is provided that includes an input terminal and an output terminal, a plurality of vector processor operation circuits, a selector circuit coupled to the input terminal, the output terminal, and each of the vector processor operation circuits, and a scheduler circuit adapted to control the selector circuit to configure a vector processing pipeline comprising zero, one or more of the vector processor operation circuits in any order between the input terminal and the output terminal.
Type: Application
Filed: May 24, 2017
Publication date: November 29, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Jeremy Halden FOWERS, Ming Gang LIU, Kalin OVTCHAROV, Steven Karl REINHARDT, Eric Sen CHUNG
-
Publication number: 20180341484
Abstract: A hardware accelerator having an efficient instruction set is disclosed. An apparatus may comprise logic configured to access a first and a second machine instruction. The second machine instruction may be missing a tensor operand needed to execute the second machine instruction. The logic may be further configured to execute the first machine instruction, resulting in a tensor. The logic may be further configured to execute the second machine instruction using the resultant tensor as the missing tensor operand.
Type: Application
Filed: May 24, 2017
Publication date: November 29, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Jeremy Halden FOWERS, Kalin OVTCHAROV, Steven Karl REINHARDT, Eric Sen CHUNG, Ming Gang LIU