Patents by Inventor Douglas C. Burger

Douglas C. Burger has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Subsampling training data during artificial neural network training

Patent number: 11494614

Abstract: Perplexity scores are computed for training data samples during ANN training. Perplexity scores can be computed as a divergence between data defining a class associated with a current training data sample and a probability vector generated by the ANN model. Perplexity scores can alternately be computed by learning a probability density function (“PDF”) fitting activation maps generated by an ANN model during training. A perplexity score can then be computed for a current training data sample by computing a probability for the current training data sample based on the PDF. If the perplexity score for a training data sample is lower than a threshold, the training data sample is removed from the training data set so that it will not be utilized for training during subsequent epochs. Training of the ANN model continues following the removal of training data samples from the training data set.

Type: Grant

Filed: March 20, 2019

Date of Patent: November 8, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventors: Eric S. Chung, Douglas C. Burger, Bita Darvish Rouhani
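The subsampling step in the abstract above can be illustrated with a minimal sketch. It assumes the divergence is the KL divergence between a one-hot class label and the model's probability vector, which reduces to the negative log of the probability assigned to the true class. The abstract does not name a specific divergence, and the function names, the `predict` callback, and the threshold value are hypothetical.

```python
import math

def perplexity_score(probs, label):
    """Assumed score: KL(one-hot label || probs) = -log(probs[label])."""
    return -math.log(max(probs[label], 1e-12))  # clamp to avoid log(0)

def subsample(dataset, predict, threshold):
    """Drop samples scoring below the threshold; keep the rest for later epochs."""
    return [(x, y) for x, y in dataset
            if perplexity_score(predict(x), y) >= threshold]

# Hypothetical model outputs: one confidently correct sample, one uncertain sample.
fake_outputs = {"easy": [0.99, 0.01], "hard": [0.60, 0.40]}
data = [("easy", 0), ("hard", 1)]
kept = subsample(data, lambda x: fake_outputs[x], threshold=0.1)
```

Here the confidently classified sample falls below the threshold and is removed, while the uncertain sample stays in the training set.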

HIERARCHICAL AND SHARED EXPONENT FLOATING POINT DATA TYPES

Publication number: 20220253281

Abstract: Embodiments of the present disclosure include systems and methods for providing hierarchical and shared exponent floating point data types. First and second shared exponent values are determined based on exponent values of a plurality of floating point values. A third shared exponent value is determined based on the first shared exponent value and the second shared exponent value. First and second difference values are determined based on the first shared exponent value, the second shared exponent value, and the third shared exponent value. Sign values and mantissa values are determined for the plurality of floating point values. The sign value and the mantissa value for each floating point value in the plurality of floating point values, the third shared exponent value, the first difference value, and the second difference value are stored in a data structure for a shared exponent floating point data type.

Type: Application

Filed: June 28, 2021

Publication date: August 11, 2022

Inventors: Bita DARVISH ROUHANI, Venmugil ELANGO, Rasoul SHAFIPOUR, Jeremy FOWERS, Ming Gang LIU, Jinwen XI, Douglas C. BURGER, Eric S. CHUNG
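A rough sketch of the data layout described in the abstract above, under several assumptions that the abstract does not state: the values are split into two equal groups, each group's shared exponent is the largest exponent in that group, the third shared exponent is the larger of the two, and mantissas are truncated to a fixed number of bits. The names `encode`, `decode`, and the dictionary keys are hypothetical.

```python
import math

def encode(values, mantissa_bits=4):
    """Pack at least two values into a two-level shared-exponent structure."""
    half = len(values) // 2
    groups = [values[:half], values[half:]]
    # First and second shared exponents: largest exponent per group (assumed rule).
    group_exps = [max(math.frexp(v)[1] for v in g) for g in groups]
    # Third shared exponent, derived from the first two.
    top_exp = max(group_exps)
    # Difference values: how far each group exponent sits below the top one.
    diffs = [top_exp - e for e in group_exps]
    scale = 2 ** mantissa_bits
    elements = []
    for group, exp in zip(groups, group_exps):
        for v in group:
            sign = 1 if v < 0 else 0
            mant = min(round(abs(v) / 2.0 ** exp * scale), scale - 1)
            elements.append((sign, mant))
    return {"top_exp": top_exp, "diffs": diffs,
            "elements": elements, "bits": mantissa_bits}

def decode(packed):
    """Rebuild approximate values from the stored signs, mantissas, and exponents."""
    half = len(packed["elements"]) // 2
    scale = 2 ** packed["bits"]
    out = []
    for i, (sign, mant) in enumerate(packed["elements"]):
        exp = packed["top_exp"] - packed["diffs"][0 if i < half else 1]
        out.append((-1) ** sign * mant / scale * 2.0 ** exp)
    return out

packed = encode([-1.0, 0.5, 8.0, 4.0])
```

Only the top-level exponent and two small differences are stored alongside the per-value sign and mantissa, which is the storage saving the hierarchy is aiming for in this sketch.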

NEURAL NETWORK PROCESSING WITH MODEL PINNING

Publication number: 20220012577

Abstract: Systems and methods for neural network processing are provided. A method in a system comprising a plurality of nodes interconnected via a network, where each node includes a plurality of on-chip memory blocks and a plurality of compute units, is provided. The method includes upon service activation receiving an N by M matrix of coefficients corresponding to the neural network model. The method includes loading the coefficients corresponding to the neural network model into the plurality of the on-chip memory blocks for processing by the plurality of compute units. The method includes regardless of a utilization of the plurality of the on-chip memory blocks as part of an evaluation of the neural network model, maintaining the coefficients corresponding to the neural network model in the plurality of the on-chip memory blocks until the service is interrupted or the neural network model is modified or replaced.

Type: Application

Filed: September 23, 2021

Publication date: January 13, 2022

Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers, Kalin Ovtcharov

MULTI-FUNCTION UNIT FOR PROGRAMMABLE HARDWARE NODES FOR NEURAL NETWORK PROCESSING

Publication number: 20210406657

Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the MVU, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding instructions including a first type of instruction for processing by only the MVU and a second type of instruction for processing by only one of the multifunction units. The method includes mapping a first instruction for processing by the matrix vector unit or to any one of the first multifunction unit, the second multifunction unit, or the third multifunction unit depending on whether the first instruction is the first type of instruction or the second type of instruction.

Type: Application

Filed: August 23, 2021

Publication date: December 30, 2021

Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers

Neural network processing with the neural network model pinned to on-chip memories of hardware nodes

Patent number: 11157801

Abstract: Systems and methods for neural network processing are provided. A method in a system comprising a plurality of nodes interconnected via a network, where each node includes a plurality of on-chip memory blocks and a plurality of compute units, is provided. The method includes upon service activation receiving an N by M matrix of coefficients corresponding to the neural network model. The method includes loading the coefficients corresponding to the neural network model into the plurality of the on-chip memory blocks for processing by the plurality of compute units. The method includes regardless of a utilization of the plurality of the on-chip memory blocks as part of an evaluation of the neural network model, maintaining the coefficients corresponding to the neural network model in the plurality of the on-chip memory blocks until the service is interrupted or the neural network model is modified or replaced.

Type: Grant

Filed: June 29, 2017

Date of Patent: October 26, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers, Kalin Ovtcharov

Hardware node with position-dependent memories for neural network processing

Patent number: 11144820

Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the matrix vector unit, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding a chain of instructions received via an input queue, where the chain of instructions comprises a first instruction that can only be processed by the matrix vector unit and a sequence of instructions that can only be processed by a multifunction unit. The method includes processing the first instruction using the MVU and processing each instruction in the sequence of instructions depending upon the position of that instruction in the sequence.

Type: Grant

Filed: June 29, 2017

Date of Patent: October 12, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers

Multi-function unit for programmable hardware nodes for neural network processing

Patent number: 11132599

Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the MVU, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding instructions including a first type of instruction for processing by only the MVU and a second type of instruction for processing by only one of the multifunction units. The method includes mapping a first instruction for processing by the matrix vector unit or to any one of the first multifunction unit, the second multifunction unit, or the third multifunction unit depending on whether the first instruction is the first type of instruction or the second type of instruction.

Type: Grant

Filed: June 29, 2017

Date of Patent: September 28, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Eric S. Chung, Douglas C. Burger, Jeremy Fowers

Block-based processor core composition register

Patent number: 11126433

Abstract: Systems, apparatuses, and methods related to a block-based processor core composition register are disclosed. In one example of the disclosed technology, a processor can include a plurality of block-based processor cores for executing a program including a plurality of instruction blocks. A respective block-based processor core can include one or more sharable resources and a programmable composition control register. The programmable composition control register can be used to configure which resources of the one or more sharable resources are shared with other processor cores of the plurality of processor cores.

Type: Grant

Filed: December 23, 2015

Date of Patent: September 21, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith

Handling tenant requests in a system that uses hardware acceleration components

Patent number: 11099906

Abstract: A service mapping component (SMC) is described herein for processing requests by instances of tenant functionality that execute on software-driven host components (or some other components) in a data processing system. The SMC is configured to apply at least one rule to determine whether a service requested by an instance of tenant functionality is to be satisfied by at least one of: a local host component, a local hardware acceleration component which is locally coupled to the local host component, and/or at least one remote hardware acceleration component that is indirectly accessible to the local host component via the local hardware acceleration component. In performing its analysis, the SMC can take into account various factors, such as whether or not the service corresponds to a line-rate service, latency-related considerations, security-related considerations, and so on.

Type: Grant

Filed: September 11, 2018

Date of Patent: August 24, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Derek T. Chiou, Sitaram V. Lanka, Douglas C. Burger

REACH MATRIX SCHEDULER CIRCUIT FOR SCHEDULING OF INSTRUCTIONS TO BE EXECUTED IN A PROCESSOR

Publication number: 20210216327

Abstract: A reach matrix scheduler circuit for scheduling instructions to be executed in a processor is disclosed. The scheduler circuit includes an N×R matrix wake-up circuit, where ‘N’ is the instruction window size of the scheduler circuit, and ‘R’ is the “reach” within the instruction window of the matrix wake-up circuit, with ‘R’ being less than ‘N’. A grant line associated with each instruction request entry in the N×R matrix wake-up circuit is coupled to ‘R’ other instruction entries among the ‘N’ instruction entries. When a producer instruction in an instruction request entry is ready for issuance, the grant line associated with the instruction request entry is activated so that any other instruction entries coupled to the grant line (i.e., within the “reach” of the instruction request entry) that consume the produced value generated by the producer instruction are “woken-up” and subsequently indicated as ready to be issued.

Type: Application

Filed: January 9, 2020

Publication date: July 15, 2021

Inventors: Yusuf Cagatay TEKMEN, Rodney Wayne SMITH, Douglas C. BURGER, Gagan GUPTA, Kiran Ravi SETH
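The wake-up behavior in the abstract above can be modeled in software as a small sketch. It assumes that each entry's grant line reaches the next R entries in the window, wrapping around, and that a consumer is woken as soon as one matching producer is granted. The abstract specifies neither detail, and the entry layout (`dest`, `srcs`) and function name are hypothetical.

```python
def wake_up(entries, producer, reach):
    """Return indices of entries woken when entries[producer] is granted.

    Only the `reach` entries after the producer are checked (assumed coupling),
    rather than all N entries in the instruction window.
    """
    dest = entries[producer]["dest"]
    n = len(entries)
    woken = []
    for offset in range(1, reach + 1):
        idx = (producer + offset) % n  # assumed wrap-around within the window
        if dest in entries[idx]["srcs"]:
            woken.append(idx)
    return woken

# Hypothetical four-entry window: entries 1 and 3 consume r1, produced by entry 0.
window = [
    {"dest": "r1", "srcs": []},
    {"dest": "r2", "srcs": ["r1"]},
    {"dest": "r3", "srcs": ["r0"]},
    {"dest": "r4", "srcs": ["r1"]},
]
near = wake_up(window, producer=0, reach=2)
far = wake_up(window, producer=0, reach=3)
```

With a reach of 2, the consumer in entry 3 lies outside the producer's grant line and is not woken by it, which shows the trade-off between a smaller wake-up matrix and coverage of the window.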

COUPLING WIDE MEMORY INTERFACE TO WIDE WRITE BACK PATHS

Publication number: 20210216454

Abstract: Systems and methods are disclosed for performing wide memory operations for a wide data cache line. In some examples of the disclosed technology, a processor having two or more execution lanes includes a data cache coupled to memory, a wide memory load circuit that concurrently loads two or more words from a cache line of the data cache, and a writeback circuit situated to send a respective word of the concurrently-loaded words to a selected execution lane of the processor, either into an operand buffer or bypassing the operand buffer. In some examples, a sharding circuit is provided that allows bitwise, byte-wise, and/or word-wise manipulation of memory operation data. In some examples, wide cache loads allow for concurrent execution of plural execution lanes of the processor.

Type: Application

Filed: March 29, 2021

Publication date: July 15, 2021

Applicant: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith, Gagan Gupta, David T. Harper

Decoupled processor instruction window and operand buffer

Patent number: 11048517

Abstract: A processor core in an instruction block-based microarchitecture is configured so that an instruction window and operand buffers are decoupled for independent operation in which instructions in the block are not tied to resources such as control bits and operands that are maintained in the operand buffers. Instead, pointers are established among instructions in the block and the resources so that control state can be established for a refreshed instruction block (i.e., an instruction block that is reused without re-fetching it from an instruction cache) by following the pointers. Such decoupling of the instruction window from the operand space can provide greater processor efficiency, particularly in multiple core arrays where refreshing is utilized (for example when executing program code that uses tight loops), because the operands and control bits are pre-validated.

Type: Grant

Filed: June 24, 2019

Date of Patent: June 29, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron Smith, Jan Gray

Distinct system registers for logical processors

Patent number: 11016770

Abstract: Distinct system registers for logical processors are disclosed. In one example of the disclosed technology, a processor includes a plurality of block-based physical processor cores for executing a program comprising a plurality of instruction blocks. The processor also includes a thread scheduler configured to schedule a thread of the program for execution, the thread using the one or more instruction blocks. The processor further includes at least one system register. The at least one system register stores data indicating a number and placement of the plurality of physical processor cores to form a logical processor. The logical processor executes the scheduled thread. The logical processor is configured to execute the thread in a continuous instruction window.

Type: Grant

Filed: February 15, 2016

Date of Patent: May 25, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith

Data processing system having a hardware acceleration plane and a software plane

Patent number: 11010198

Abstract: A data processing system is described herein that includes two or more software-driven host components. The two or more host components collectively provide a software plane. The data processing system also includes two or more hardware acceleration components (such as FPGA devices) that collectively provide a hardware acceleration plane. A common physical network allows the host components to communicate with each other, and also allows the hardware acceleration components to communicate with each other. Further, the hardware acceleration components in the hardware acceleration plane include functionality that enables them to communicate with each other in a transparent manner without assistance from the software plane.

Type: Grant

Filed: August 4, 2017

Date of Patent: May 18, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Andrew R. Putnam, Stephen F. Heil

Partially reconfiguring acceleration components

Patent number: 10977104

Abstract: Aspects extend to methods, systems, and computer program products for partially reconfiguring acceleration components. Partial reconfiguration can be implemented for any of a variety of reasons, including to address an error in functionality at the acceleration component or to update functionality at the acceleration component. During partial reconfiguration, connectivity can be maintained for any other functionality at the acceleration component untouched by the partial reconfiguration. Partial reconfiguration is more efficient to deploy than full reconfiguration of an acceleration component.

Type: Grant

Filed: January 25, 2019

Date of Patent: April 13, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Derek T. Chiou, Sitaram V. Lanka, Adrian M. Caulfield, Andrew R. Putnam, Douglas C. Burger

Coupling wide memory interface to wide write back paths

Patent number: 10963379

Abstract: Systems and methods are disclosed for performing wide memory operations for a wide data cache line. In some examples of the disclosed technology, a processor having two or more execution lanes includes a data cache coupled to memory, a wide memory load circuit that concurrently loads two or more words from a cache line of the data cache, and a writeback circuit situated to send a respective word of the concurrently-loaded words to a selected execution lane of the processor, either into an operand buffer or bypassing the operand buffer. In some examples, a sharding circuit is provided that allows bitwise, byte-wise, and/or word-wise manipulation of memory operation data. In some examples, wide cache loads allow for concurrent execution of plural execution lanes of the processor.

Type: Grant

Filed: February 2, 2018

Date of Patent: March 30, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith, Gagan Gupta, David T. Harper

Hardware implemented load balancing

Patent number: 10958717

Abstract: A server system is provided that includes a plurality of servers, each server including at least one hardware acceleration device and at least one processor communicatively coupled to the hardware acceleration device by an internal data bus and executing a host server instance, the host server instances of the plurality of servers collectively providing a software plane, and the hardware acceleration devices of the plurality of servers collectively providing a hardware acceleration plane that implements a plurality of hardware accelerated services, wherein each hardware acceleration device maintains in memory a data structure that contains load data indicating a load of each of a plurality of target hardware acceleration devices, and wherein a requesting hardware acceleration device routes the request to a target hardware acceleration device that is indicated by the load data in the data structure to have a lower load than other of the target hardware acceleration devices.

Type: Grant

Filed: August 30, 2019

Date of Patent: March 23, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Adrian Michael Caulfield, Eric S. Chung, Michael Konstantinos Papamichael, Douglas C. Burger, Shlomi Alkalay
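The routing decision in the abstract above can be sketched as a lookup in a per-device load table. This is a minimal software model, not the hardware design: the device names, the integer load values, and the local load update after routing are assumptions for illustration only.

```python
def pick_target(load_table, candidates):
    """Choose the candidate device with the lowest recorded load."""
    return min(candidates, key=lambda device: load_table[device])

def route_request(load_table, candidates, cost=1):
    """Route one request and bump the chosen device's recorded load (assumed update)."""
    target = pick_target(load_table, candidates)
    load_table[target] += cost
    return target

# Hypothetical load data held by the requesting acceleration device.
loads = {"acc-a": 3, "acc-b": 1, "acc-c": 2}
first = route_request(loads, ["acc-a", "acc-b", "acc-c"])
```

Because the table lives on the requesting device, the choice needs no round trip to the software plane in this model.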

Dense read encoding for dataflow ISA

Patent number: 10936316

Abstract: Apparatus and methods are disclosed for controlling execution of memory access instructions in a block-based processor architecture using an instruction decoder that decodes instructions having variable numbers of target operands. In one example of the disclosed technology, a block-based processor core includes an instruction decoder configured to decode target operands for an instruction in an instruction block, the instruction being encoded to allow for a variable number of target operands and a control unit configured to send data for at least one of the decoded target operands for an operation performed by the at least one of the cores. In some examples, the instruction indicates target instructions with a vector encoding. In other examples, a variable length format allows for the indication of one or more targets.

Type: Grant

Filed: February 2, 2016

Date of Patent: March 2, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith

Ability enhancement

Patent number: 10875525

Abstract: Techniques for ability enhancement are described. In some embodiments, devices and systems located in a transportation network share threat information with one another, in order to enhance a user's ability to operate or function in a transportation-related context. In one embodiment, a process in a vehicle receives threat information from a remote device, the threat information based on information about objects or conditions proximate to the remote device. The process then determines that the threat information is relevant to the safe operation of the vehicle. Then, the process modifies operation of the vehicle based on the threat information, such as by presenting a message to the operator of the vehicle and/or controlling the vehicle itself.

Type: Grant

Filed: August 5, 2015

Date of Patent: December 29, 2020

Assignee: Microsoft Technology Licensing, LLC

Inventors: Richard T. Lord, Robert W. Lord, Nathan P. Myhrvold, Clarence T. Tegreene, Roderick A. Hyde, Lowell L. Wood, Muriel Y. Ishikawa, Victoria Y. H. Wood, Charles Whitmer, Paramvir Bahl, Douglas C. Burger, Ranveer Chandra, William H. Gates, III, Pablos Holman, Jordin T. Kare, Craig J. Mundie, Tim Paek, Desney S. Tan, Lin Zhong, Matthew G. Dyor

Register read/write ordering

Patent number: 10871967

Abstract: Apparatus and methods are disclosed for controlling execution of register access instructions in a block-based processor architecture using a hardware structure that indicates a relative ordering of register access instructions in an instruction block. In one example of the disclosed technology, a method of operating a processor includes selecting a register access instruction of the plurality of instructions to execute based at least in part on dependencies encoded within a previous block of instructions and on stored data indicating which of the register write instructions have executed for the previous block, and executing the selected instruction. In some examples, one or more of a write mask, a read mask, a register write vector register, or a counter are used to determine register read/write dependencies. Based on the encoded dependencies and the masked write vector, the next instruction block can issue when its register dependencies are available.

Type: Grant

Filed: February 1, 2016

Date of Patent: December 22, 2020

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith
