Patents by Inventor Douglas C. Burger
Douglas C. Burger has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11494614Abstract: Perplexity scores are computed for training data samples during ANN training. Perplexity scores can be computed as a divergence between data defining a class associated with a current training data sample and a probability vector generated by the ANN model. Perplexity scores can alternately be computed by learning a probability density function (“PDF”) fitting activation maps generated by an ANN model during training. A perplexity score can then be computed for a current training data sample by computing a probability for the current training data sample based on the PDF. If the perplexity score for a training data sample is lower than a threshold, the training data sample is removed from the training data set so that it will not be utilized for training during subsequent epochs. Training of the ANN model continues following the removal of training data samples from the training data set.Type: GrantFiled: March 20, 2019Date of Patent: November 8, 2022Assignee: Microsoft Technology Licensing, LLCInventors: Eric S. Chung, Douglas C. Burger, Bita Darvish Rouhani
-
Publication number: 20220253281Abstract: Embodiments of the present disclosure include systems and methods for providing hierarchical and shared exponent floating point data types. First and second shared exponent values are determined based on exponent values of a plurality of floating point values. A third shared exponent value is determined based the first shared exponent value and the second shared exponent value. First and second difference values are determined based on the first shared exponent value, the second shared exponent value, and the third shared exponent value. Sign values and mantissa values are determined for the plurality of floating point values. The sign value and the mantissa value for each floating point value in the plurality of floating point values, the third shared exponent value, the first difference value, and the second difference value are stored in a data structure for a shared exponent floating point data type.Type: ApplicationFiled: June 28, 2021Publication date: August 11, 2022Inventors: Bita DARVISH ROUHANI, Venmugil ELANGO, Rasoul SHAFIPOUR, Jeremy FOWERS, Ming Gang LIU, Jinwen XI, Douglas C. BURGER, Eric S. CHUNG
-
Publication number: 20220012577Abstract: Systems and methods for neural network processing are provided. A method in a system comprising a plurality of nodes interconnected via a network, where each node includes a plurality of on-chip memory blocks and a plurality of compute units, is provided. The method includes upon service activation receiving an N by M matrix of coefficients corresponding to the neural network model. The method includes loading the coefficients corresponding to the neural network model into the plurality of the on-chip memory blocks for processing by the plurality of compute units. The method includes regardless of a utilization of the plurality of the on-chip memory blocks as part of an evaluation of the neural network model, maintaining the coefficients corresponding to the neural network model in the plurality of the on-chip memory blocks until the service is interrupted or the neural network model is modified or replaced.Type: ApplicationFiled: September 23, 2021Publication date: January 13, 2022Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers, Kalin Ovtcharov
-
Publication number: 20210406657Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the MVU, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding instructions including a first type of instruction for processing by only the MVU and a second type of instruction for processing by only one of the multifunction units. The method includes mapping a first instruction for processing by the matrix vector unit or to any one of the first multifunction unit, the second multifunction unit, or the third multifunction unit depending on whether the first instruction is the first type of instruction or the second type of instruction.Type: ApplicationFiled: August 23, 2021Publication date: December 30, 2021Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers
-
Neural network processing with the neural network model pinned to on-chip memories of hardware nodes
Patent number: 11157801Abstract: Systems and methods for neural network processing are provided. A method in a system comprising a plurality of nodes interconnected via a network, where each node includes a plurality of on-chip memory blocks and a plurality of compute units, is provided. The method includes upon service activation receiving an N by M matrix of coefficients corresponding to the neural network model. The method includes loading the coefficients corresponding to the neural network model into the plurality of the on-chip memory blocks for processing by the plurality of compute units. The method includes regardless of a utilization of the plurality of the on-chip memory blocks as part of an evaluation of the neural network model, maintaining the coefficients corresponding to the neural network model in the plurality of the on-chip memory blocks until the service is interrupted or the neural network model is modified or replaced.Type: GrantFiled: June 29, 2017Date of Patent: October 26, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers, Kalin Ovtcharov -
Patent number: 11144820Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the matrix vector unit, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding a chain of instructions received via an input queue, where the chain of instructions comprises a first instruction that can only be processed by the matrix vector unit and a sequence of instructions that can only be processed by a multifunction unit. The method includes processing the first instruction using the MVU and processing each of instructions in the sequence of instructions depending upon a position of the each of instructions in the sequence of instructions.Type: GrantFiled: June 29, 2017Date of Patent: October 12, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers
-
Patent number: 11132599Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the MVU, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding instructions including a first type of instruction for processing by only the MVU and a second type of instruction for processing by only one of the multifunction units. The method includes mapping a first instruction for processing by the matrix vector unit or to any one of the first multifunction unit, the second multifunction unit, or the third multifunction unit depending on whether the first instruction is the first type of instruction or the second type of instruction.Type: GrantFiled: June 29, 2017Date of Patent: September 28, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers
-
Patent number: 11126433Abstract: Systems, apparatuses, and methods related to a block-based processor core composition register are disclosed. In one example of the disclosed technology, a processor can include a plurality of block-based processor cores for executing a program including a plurality of instruction blocks. A respective block-based processor core can include one or more sharable resources and a programmable composition control register. The programmable composition control register can be used to configure which resources of the one or more sharable resources are shared with other processor cores of the plurality of processor cores.Type: GrantFiled: December 23, 2015Date of Patent: September 21, 2021Assignee: MICROSOFT TECHNOLOGY LICENSING, LLCInventors: Douglas C. Burger, Aaron L. Smith
-
Patent number: 11099906Abstract: A service mapping component (SMC) is described herein for processing requests by instances of tenant functionality that execute on software-driven host components (or some other components) in a data processing system. The SMC is configured to apply at least one rule to determine whether a service requested by an instance of tenant functionality is to be satisfied by at least one of: a local host component, a local hardware acceleration component which is locally coupled to the local host component, and/or at least one remote hardware acceleration component that is indirectly accessible to the local host component via the local hardware acceleration component. In performing its analysis, the SMC can take into account various factors, such as whether or not the service corresponds to a line-rate service, latency-related considerations, security-related considerations, and so on.Type: GrantFiled: September 11, 2018Date of Patent: August 24, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Derek T. Chiou, Sitaram V. Lanka, Douglas C. Burger
-
Publication number: 20210216327Abstract: A reach matrix scheduler circuit for scheduling instructions to be executed in a processor is disclosed. The scheduler circuit includes an N×R matrix wake-up circuit, where ‘N’ is the instruction window size of the scheduler circuit, and ‘R’ is the “reach” with the instruction window of matrix wake-up circuit, with ‘R’ being less than ‘N’. A grant line associated with each instruction request entry in the N×R matrix wake-up circuit is coupled to ‘R’ other instruction entries among the ‘N’ instruction entries. When a producer instruction in an instruction request entry is ready for issuance, the grant line associated with the instruction request entry is activated so that any other instruction entries coupled to the grant line (i.e., within the “reach” of the instruction request entry) that consume the produced value generated by the producer instruction are “woken-up” and subsequently indicated as ready to be issued.Type: ApplicationFiled: January 9, 2020Publication date: July 15, 2021Inventors: Yusuf Cagatay TEKMEN, Rodney Wayne SMITH, Douglas C. BURGER, Gagan GUPTA, Kiran Ravi SETH
-
Publication number: 20210216454Abstract: Systems and methods are disclosed for performing wide memory operations for a wide data cache line. In some examples of the disclosed technology, a processor having two or more execution lanes includes a data cache coupled to memory, a wide memory load circuit that concurrently loads two or more words from a cache line of the data cache, and a writeback circuit situated to send a respective word of the concurrently-loaded words to a selected execution lane of the processor, either into an operand buffer or bypassing the operand buffer. In some examples, a sharding circuit is provided that allows bitwise, byte-wise, and/or word-wise manipulation of memory operation data. In some examples, wide cache loads allows for concurrent execution of plural execution lanes of the processor.Type: ApplicationFiled: March 29, 2021Publication date: July 15, 2021Applicant: Microsoft Technology Licensing, LLCInventors: Douglas C. Burger, Aaron L. Smith, Gagan Gupta, David T. Harper
-
Patent number: 11048517Abstract: A processor core in an instruction block-based microarchitecture is configured so that an instruction window and operand buffers are decoupled for independent operation in which instructions in the block are not tied to resources such as control bits and operands that are maintained in the operand buffers. Instead, pointers are established among instructions in the block and the resources so that control state can be established for a refreshed instruction block (i.e., an instruction block that is reused without re-fetching it from an instruction cache) by following the pointers. Such decoupling of the instruction window from the operand space can provide greater processor efficiency, particularly in multiple core arrays where refreshing is utilized (for example when executing program code that uses tight loops), because the operands and control bits are pre-validated.Type: GrantFiled: June 24, 2019Date of Patent: June 29, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Douglas C. Burger, Aaron Smith, Jan Gray
-
Patent number: 11016770Abstract: Distinct system registers for logical processors are disclosed. In one example of the disclosed technology, a processor includes a plurality of block-based physical processor cores for executing a program comprising a plurality of instruction blocks. The processor also includes a thread scheduler configured to schedule a thread of the program for execution, the thread using the one or more instruction blocks. The processor further includes at least one system register. The at least one system register stores data indicating a number and placement of the plurality of physical processor cores to form a logical processor. The logical processor executes the scheduled thread. The logical processor is configured to execute the thread in a continuous instruction window.Type: GrantFiled: February 15, 2016Date of Patent: May 25, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Douglas C. Burger, Aaron L. Smith
-
Patent number: 11010198Abstract: A data processing system is described herein that includes two or more software-driven host components. The two or more host components collectively provide a software plane. The data processing system also includes two or more hardware acceleration components (such as FPGA devices) that collectively provide a hardware acceleration plane. A common physical network allows the host components to communicate with each other, and which also allows the hardware acceleration components to communicate with each other. Further, the hardware acceleration components in the hardware acceleration plane include functionality that enables them to communicate with each other in a transparent manner without assistance from the software plane.Type: GrantFiled: August 4, 2017Date of Patent: May 18, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Douglas C. Burger, Andrew R. Putnam, Stephen F. Heil
-
Patent number: 10977104Abstract: Aspects extend to methods, systems, and computer program products for partially reconfiguring acceleration components. Partial reconfiguration can be implemented for any of a variety of reasons, including to address an error in functionality at the acceleration component or to update functionality at the acceleration component. During partial reconfiguration, connectivity can be maintained for any other functionality at the acceleration component untouched by the partial reconfiguration. Partial reconfiguration is more efficient to deploy than full reconfiguration of an acceleration component.Type: GrantFiled: January 25, 2019Date of Patent: April 13, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Derek T. Chiou, Sitaram V. Lanka, Adrian M. Caulfield, Andrew R. Putnam, Douglas C. Burger
-
Patent number: 10963379Abstract: Systems and methods are disclosed for performing wide memory operations for a wide data cache line. In some examples of the disclosed technology, a processor having two or more execution lanes includes a data cache coupled to memory, a wide memory load circuit that concurrently loads two or more words from a cache line of the data cache, and a writeback circuit situated to send a respective word of the concurrently-loaded words to a selected execution lane of the processor, either into an operand buffer or bypassing the operand buffer. In some examples, a sharding circuit is provided that allows bitwise, byte-wise, and/or word-wise manipulation of memory operation data. In some examples, wide cache loads allows for concurrent execution of plural execution lanes of the processor.Type: GrantFiled: February 2, 2018Date of Patent: March 30, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Douglas C. Burger, Aaron L. Smith, Gagan Gupta, David T. Harper
-
Patent number: 10958717Abstract: A server system is provided that includes a plurality of servers, each server including at least one hardware acceleration device and at least one processor communicatively coupled to the hardware acceleration device by an internal data bus and executing a host server instance, the host server instances of the plurality of servers collectively providing a software plane, and the hardware acceleration devices of the plurality of servers collectively providing a hardware acceleration plane that implements a plurality of hardware accelerated services, wherein each hardware acceleration device maintains in memory a data structure that contains load data indicating a load of each of a plurality of target hardware acceleration devices, and wherein a requesting hardware acceleration device routes the request to a target hardware acceleration device that is indicated by the load data in the data structure to have a lower load than other of the target hardware acceleration devices.Type: GrantFiled: August 30, 2019Date of Patent: March 23, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Adrian Michael Caulfield, Eric S. Chung, Michael Konstantinos Papamichael, Douglas C. Burger, Shlomi Alkalay
-
Patent number: 10936316Abstract: Apparatus and methods are disclosed for controlling execution of memory access instructions in a block-based processor architecture using an instruction decoder that decodes instructions having variable numbers of target operands. In one example of the disclosed technology, a block-based processor core includes an instruction decoder configured to decode target operands for an instruction in an instruction block, the instruction being encoded to allow for a variable number of target operands and a control unit configured to send data for at least one of the decoded target operands for an operation performed by the at least one of the cores. In some examples, the instruction indicates target instructions with a vector encoding. In other examples, a variable length format allows for the indication of one or more targets.Type: GrantFiled: February 2, 2016Date of Patent: March 2, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Douglas C. Burger, Aaron L. Smith
-
Patent number: 10875525Abstract: Techniques for ability enhancement are described. In some embodiments, devices and systems located in a transportation network share threat information with one another, in order to enhance a user's ability to operate or function in a transportation-related context. In one embodiment, a process in a vehicle receives threat information from a remote device, the threat information based on information about objects or conditions proximate to the remote device. The process then determines that the threat information is relevant to the safe operation of the vehicle. Then, the process modifies operation of the vehicle based on the threat information, such as by presenting a message to the operator of the vehicle and/or controlling the vehicle itself.Type: GrantFiled: August 5, 2015Date of Patent: December 29, 2020Assignee: Microsoft Technology Licensing LLCInventors: Richard T. Lord, Robert W. Lord, Nathan P. Myhrvold, Clarence T. Tegreene, Roderick A. Hyde, Lowell L. Wood, Muriel Y. Ishikawa, Victoria Y. H. Wood, Charles Whitmer, Paramvir Bahl, Douglas C. Burger, Ranveer Chandra, William H. Gates, III, Pablos Holman, Jordin T. Kare, Craig J. Mundie, Tim Paek, Desney S. Tan, Lin Zhong, Matthew G. Dyor
-
Patent number: 10871967Abstract: Apparatus and methods are disclosed for controlling execution of register access instructions in a block-based processor architecture using a hardware structure that indicates a relative ordering of register access instruction in an instruction block. In one example of the disclosed technology, a method of operating a processor includes selecting a register access instruction of the plurality of instructions to execute based at least in part on dependencies encoded within a previous block of instructions and on stored data indicating which of the register write instructions have executed for the previous block, and executing the selected instruction. In some examples, one or more of a write mask, a read mask, a register write vector register, or a counter are used to determine register read/write dependences. Based on the encoded dependencies and the masked write vector, the next instruction block can issue when its register dependencies are available.Type: GrantFiled: February 1, 2016Date of Patent: December 22, 2020Assignee: Microsoft Technology Licensing, LLCInventors: Douglas C. Burger, Aaron L. Smith