Patents by Inventor Douglas C. Burger

Douglas C. Burger has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

QUANTIZATION FOR DNN ACCELERATORS

Publication number: 20190340499

Abstract: Methods and apparatus are disclosed for providing emulation of quantized precision operations. In some examples, the quantized precision operations are performed for neural network models. Parameters of the quantized precision operations can be selected to emulate operation of hardware accelerators adapted to perform quantized format operations. In some examples, the quantized precision operations are performed in a block floating-point format where one or more values of a tensor, matrix, or vector share a common exponent. Techniques for selecting the exponent, reshaping the input tensors, and training neural networks for use with quantized precision models are also disclosed. In some examples, a neural network model is further retrained based on the quantized model. For example, a normal precision model or a quantized precision model can be retrained by evaluating loss induced by performing operations in the quantized format.

Type: Application

Filed: May 4, 2018

Publication date: November 7, 2019

Applicant: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Eric S. Chung, Bita Darvish Rouhani, Daniel Lo, Ritchie Zhao
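
The block floating-point format mentioned in the abstract above (several values sharing one exponent) can be illustrated with a minimal sketch. The function names, the 8-bit mantissa width, and the rule for picking the shared exponent (smallest power of two that keeps the largest value in range) are assumptions made for illustration, not details taken from the application.

```python
import math

def bfp_quantize(values, mantissa_bits=8):
    """Quantize a block of floats to signed integer mantissas with one shared exponent.

    Hypothetical helper: the exponent-selection rule here is an assumption.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0] * len(values), 0
    # Largest magnitude a signed mantissa of this width can hold.
    limit = (1 << (mantissa_bits - 1)) - 1
    # Smallest exponent such that max_abs / 2**exp still fits within the limit.
    shared_exp = math.ceil(math.log2(max_abs / limit))
    scale = 2.0 ** shared_exp
    # Clamping is a safety net; the exponent choice already keeps values in range.
    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return mantissas, shared_exp

def bfp_dequantize(mantissas, shared_exp):
    """Recover approximate float values from mantissas and the shared exponent."""
    scale = 2.0 ** shared_exp
    return [m * scale for m in mantissas]

block = [0.5, -1.25, 3.0, 0.01]
mantissas, exponent = bfp_quantize(block)
restored = bfp_dequantize(mantissas, exponent)
```

In this sketch the small value 0.01 rounds to zero because the shared exponent is dictated by the largest value in the block, which is the kind of loss the abstract says can be evaluated during retraining.
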
DESIGN FLOW FOR QUANTIZED NEURAL NETWORKS

Publication number: 20190340492

Abstract: Methods and apparatus are disclosed supporting a design flow for developing quantized neural networks. In one example of the disclosed technology, a method includes quantizing a normal-precision floating-point neural network model into a quantized format. For example, the quantized format can be a block floating-point format, where two or more elements of tensors in the neural network share a common exponent. A set of test inputs is applied to a normal-precision floating-point model and the corresponding quantized model, and the respective output tensors are compared. Based on this comparison, hyperparameters or other attributes of the neural networks can be adjusted. Further, quantization parameters determining the widths of data and selection of shared exponents for the block floating-point format can be selected. An adjusted, quantized neural network is retrained and programmed into a hardware accelerator.

Type: Application

Filed: May 4, 2018

Publication date: November 7, 2019

Applicant: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Eric S. Chung, Bita Darvish Rouhani, Daniel Lo, Ritchie Zhao
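
The comparison step in the design-flow abstract above (apply the same test inputs to a normal-precision model and its quantized counterpart, then pick quantization parameters) might look roughly like the following. The "model" here is a single dot product, and the helper names, the error metric, and the search over mantissa widths are all assumptions for illustration only.

```python
import math

def quantize_block(values, mantissa_bits):
    """Round values onto a block floating-point grid and return them as floats."""
    limit = (1 << (mantissa_bits - 1)) - 1
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    scale = 2.0 ** math.ceil(math.log2(max_abs / limit))
    return [round(v / scale) * scale for v in values]

def dot(weights, inputs):
    return sum(w * x for w, x in zip(weights, inputs))

def max_output_error(weights, test_inputs, mantissa_bits):
    """Largest output difference between the reference and quantized model."""
    q_weights = quantize_block(weights, mantissa_bits)
    return max(abs(dot(weights, x) - dot(q_weights, x)) for x in test_inputs)

def smallest_width(weights, test_inputs, tolerance, widths=range(2, 17)):
    """Return the narrowest mantissa width whose output error is within tolerance."""
    for bits in widths:
        if max_output_error(weights, test_inputs, bits) <= tolerance:
            return bits
    return None

weights = [0.5, -1.25, 3.0, 0.01]
test_inputs = [[1, 1, 1, 1], [1, 0, 0, 0], [0, 0, 0, 1]]
chosen_bits = smallest_width(weights, test_inputs, tolerance=1e-3)
```

A real flow would compare full output tensors of a network and could also adjust hyperparameters, as the abstract notes; this sketch covers only the width-selection idea.
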
BLOCK FLOATING POINT COMPUTATIONS USING REDUCED BIT-WIDTH VECTORS

Publication number: 20190339937

Abstract: A system for block floating point computation in a neural network receives a block floating point number comprising a mantissa portion. A bit-width of the block floating point number is reduced by decomposing the block floating point number into a plurality of numbers each having a mantissa portion with a bit-width that is smaller than a bit-width of the mantissa portion of the block floating point number. One or more dot product operations are performed separately on each of the plurality of numbers to obtain individual results, which are summed to generate a final dot product value. The final dot product value is used to implement the neural network. The reduced bit width computations allow higher precision mathematical operations to be performed on lower-precision processors with improved accuracy.

Type: Application

Filed: May 4, 2018

Publication date: November 7, 2019

Inventors: Daniel Lo, Eric S. Chung, Douglas C. Burger
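
The decomposition described in the abstract above (split a wide mantissa into narrower pieces, run the dot products on the pieces, and sum the results) can be sketched on integer mantissas. The 4-bit split, the function names, and the use of plain Python integers are assumptions for illustration; the application does not prescribe this particular split.

```python
def split_mantissa(m, low_bits=4):
    """Split an integer mantissa so that m == hi * 2**low_bits + lo."""
    hi = m >> low_bits               # arithmetic shift keeps the sign in the high part
    lo = m & ((1 << low_bits) - 1)   # low part is always in [0, 2**low_bits)
    return hi, lo

def dot(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))

def decomposed_dot(a, b, low_bits=4):
    """Dot product of two mantissa vectors computed from narrower partial products."""
    a_hi, a_lo = zip(*(split_mantissa(m, low_bits) for m in a))
    b_hi, b_lo = zip(*(split_mantissa(m, low_bits) for m in b))
    # Each partial dot product uses only reduced bit-width operands;
    # the shift restores the weight of the pieces that were multiplied.
    partials = [
        (dot(a_hi, b_hi), 2 * low_bits),
        (dot(a_hi, b_lo), low_bits),
        (dot(a_lo, b_hi), low_bits),
        (dot(a_lo, b_lo), 0),
    ]
    return sum(p << shift for p, shift in partials)

a = [16, -40, 96, 0]
b = [5, -7, 100, -128]
result = decomposed_dot(a, b)   # equals dot(a, b)
```

For block floating-point inputs, the shared exponents of the two blocks would be added and applied once to the final sum; that step is omitted here.
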
Machine learning classification on hardware accelerators with stacked memory

Patent number: 10452995

Abstract: A method is provided for processing on an acceleration component a machine learning classification model. The machine learning classification model includes a plurality of decision trees, the decision trees including a first amount of decision tree data. The acceleration component includes an acceleration component die and a memory stack disposed in an integrated circuit package. The memory die includes an acceleration component memory having a second amount of memory less than the first amount of decision tree data. The memory stack includes a memory bandwidth greater than about 50 GB/sec and a power efficiency of greater than about 20 MB/sec/mW.

Type: Grant

Filed: June 29, 2015

Date of Patent: October 22, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Derek Chiou, Eric Chung, Andrew R. Putnam
Broadcast channel architectures for block-based processors

Patent number: 10452399

Abstract: Apparatus and methods are disclosed for example computer processors that are based on a hybrid dataflow execution model. In particular embodiments, a processor core in a block-based processor comprises: one or more functional units configured to perform functions using one or more operands; an instruction window comprising buffers configured to store individual instructions for execution by the processor core, the instruction window including one or more operand buffers for an individual instruction configured to store operand values; a control unit configured to execute the instructions in the instruction window and control operation of the one or more functional units; and a broadcast value store comprising a plurality of buffers dedicated to storing broadcast values, each buffer of the broadcast value store being associated with a respective broadcast channel from among a plurality of available broadcast channels.

Type: Grant

Filed: March 18, 2016

Date of Patent: October 22, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith
Multimodal targets in a block-based processor

Patent number: 10445097

Abstract: Apparatus and methods are disclosed for decoding targets from an instruction and transmitting data to those targets in accordance with a current instruction. Multimodal target hardware is used in conjunction with one or more of the routers so as to route data to an appropriate target. The data can be one or more operands or a predicate and the targets can include operand buffers, broadcast channels, and general registers. In this way, operands, for example, can be directed for use with multiple subsequent instructions, and there are multiple modes for distributing the operands to the multiple instructions.

Type: Grant

Filed: March 17, 2016

Date of Patent: October 15, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith
DECOUPLED PROCESSOR INSTRUCTION WINDOW AND OPERAND BUFFER

Publication number: 20190310852

Abstract: A processor core in an instruction block-based microarchitecture is configured so that an instruction window and operand buffers are decoupled for independent operation in which instructions in the block are not tied to resources such as control bits and operands that are maintained in the operand buffers. Instead, pointers are established among instructions in the block and the resources so that control state can be established for a refreshed instruction block (i.e., an instruction block that is reused without re-fetching it from an instruction cache) by following the pointers. Such decoupling of the instruction window from the operand space can provide greater processor efficiency, particularly in multiple core arrays where refreshing is utilized (for example when executing program code that uses tight loops), because the operands and control bits are pre-validated.

Type: Application

Filed: June 24, 2019

Publication date: October 10, 2019

Inventors: Douglas C. Burger, Aaron Smith, Jan Gray
Hardware implemented load balancing

Patent number: 10425472

Abstract: A server system is provided that includes a plurality of servers, each server including at least one hardware acceleration device and at least one processor communicatively coupled to the hardware acceleration device by an internal data bus and executing a host server instance, the host server instances of the plurality of servers collectively providing a software plane, and the hardware acceleration devices of the plurality of servers collectively providing a hardware acceleration plane that implements a plurality of hardware accelerated services, wherein each hardware acceleration device maintains in memory a data structure that contains load data indicating a load of each of a plurality of target hardware acceleration devices, and wherein a requesting hardware acceleration device routes the request to a target hardware acceleration device that is indicated by the load data in the data structure to have a lower load than other of the target hardware acceleration devices.

Type: Grant

Filed: January 17, 2017

Date of Patent: September 24, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Adrian Michael Caulfield, Eric S. Chung, Michael Konstantinos Papamichael, Douglas C. Burger, Shlomi Alkalay
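
The routing rule in the abstract above (each device keeps load data for target devices and sends a request to the target with the lowest indicated load) can be sketched in a few lines. The class name, method names, and device labels are hypothetical, and how load data actually propagates between devices is not modeled.

```python
class AccelerationDevice:
    """Hypothetical model of a device holding a local table of target loads."""

    def __init__(self, name):
        self.name = name
        self.load_table = {}   # target device name -> last known load

    def update_load(self, target, load):
        """Record load data received for a target device."""
        self.load_table[target] = load

    def route(self, candidates):
        """Pick the candidate target with the lowest known load."""
        known = [t for t in candidates if t in self.load_table]
        if not known:
            raise LookupError("no load data for any candidate target")
        return min(known, key=lambda t: self.load_table[t])

requester = AccelerationDevice("fpga-0")
requester.update_load("fpga-a", 7)
requester.update_load("fpga-b", 2)
requester.update_load("fpga-c", 5)
target = requester.route(["fpga-a", "fpga-b", "fpga-c"])
```

Because the decision uses only the local table, the sketch needs no host involvement at routing time, which is consistent with the hardware-plane framing of the abstract.
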
UNOBTRUSIVE ACTIVE EYE INTERROGATION

Publication number: 20190274537

Abstract: Methods and systems for determining a physiological parameter of a subject through interrogation of an eye of the subject with an optical signal are described. Interrogation is performed unobtrusively. The physiological parameter is determined from a signal sensed from the eye of a subject according to a schedule, under the control of a scheduling controller.

Type: Application

Filed: March 15, 2019

Publication date: September 12, 2019

Inventors: Allen L. Brown, Jr., Douglas C. Burger, Eric Horvitz, Roderick A. Hyde, Edward K. Y. Jung, Jordin T. Kare, Chris Demetrios Karkanias, Eric C. Leuthardt, John L. Manferdelli, Craig J. Mundie, Nathan P. Myhrvold, Barney Pell, Clarence T. Tegreene, Willard H. Wattenburg, Charles Whitmer, Lowell L. Wood, Jr., Richard N. Zare
Verifying branch targets

Patent number: 10409606

Abstract: Apparatus and methods are disclosed for implementing bad jump detection in block-based processor architectures. In one example of the disclosed technology, a block-based processor includes one or more block-based processing cores configured to fetch and execute atomic blocks of instructions and a control unit configured to, based at least in part on receiving from one of the instruction blocks a branch signal indicating a target location, verify that the target location is a valid branch target.

Type: Grant

Filed: June 26, 2015

Date of Patent: September 10, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith, Jan S. Gray
COUPLING WIDE MEMORY INTERFACE TO WIDE WRITE BACK PATHS

Publication number: 20190236009

Abstract: Systems and methods are disclosed for performing wide memory operations for a wide data cache line. In some examples of the disclosed technology, a processor having two or more execution lanes includes a data cache coupled to memory, a wide memory load circuit that concurrently loads two or more words from a cache line of the data cache, and a writeback circuit situated to send a respective word of the concurrently-loaded words to a selected execution lane of the processor, either into an operand buffer or bypassing the operand buffer. In some examples, a sharding circuit is provided that allows bitwise, byte-wise, and/or word-wise manipulation of memory operation data. In some examples, wide cache loads allow for concurrent execution of plural execution lanes of the processor.

Type: Application

Filed: February 2, 2018

Publication date: August 1, 2019

Applicant: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron L. Smith, Gagan Gupta, David T. Harper
Decoupled processor instruction window and operand buffer

Patent number: 10346168

Abstract: A processor core in an instruction block-based microarchitecture is configured so that an instruction window and operand buffers are decoupled for independent operation in which instructions in the block are not tied to resources such as control bits and operands that are maintained in the operand buffers. Instead, pointers are established among instructions in the block and the resources so that control state can be established for a refreshed instruction block (i.e., an instruction block that is reused without re-fetching it from an instruction cache) by following the pointers. Such decoupling of the instruction window from the operand space can provide greater processor efficiency, particularly in multiple core arrays where refreshing is utilized (for example when executing program code that uses tight loops), because the operands and control bits are pre-validated.

Type: Grant

Filed: June 26, 2015

Date of Patent: July 9, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Aaron Smith, Jan Gray
NEURAL ENTROPY ENHANCED MACHINE LEARNING

Publication number: 20190197406

Abstract: A computer implemented method of optimizing a neural network includes obtaining a deep neural network (DNN) trained with a training dataset, determining a spreading signal between neurons in multiple adjacent layers of the DNN wherein the spreading signal is an element-wise multiplication of input activations between the neurons in a first layer to neurons in a second next layer with a corresponding weight matrix of connections between such neurons, and determining neural entropies of respective connections between neurons by calculating an exponent of a volume of an area covered by the spreading signal. The DNN may be optimized based on the determined neural entropies between the neurons in the multiple adjacent layers.

Type: Application

Filed: December 22, 2017

Publication date: June 27, 2019

Inventors: Bita Darvish Rouhani, Douglas C. Burger, Eric S. Chung
Parallel decision tree processor architecture

Patent number: 10332008

Abstract: A decision tree multi-processor system includes a plurality of decision tree processors that access a common feature vector and execute one or more decision trees with respect to the common feature vector. A related method includes providing a common feature vector to a plurality of decision tree processors implemented within an on-chip decision tree scoring system, and executing, by the plurality of decision tree processors, a plurality of decision trees, by reference to the common feature vector. A related decision tree-walking system includes feature storage that stores a common feature vector and a plurality of decision tree processors that access the common feature vector from the feature storage and execute a plurality of decision trees by comparing threshold values of the decision trees to feature values within the common feature vector.

Type: Grant

Filed: March 17, 2014

Date of Patent: June 25, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, James R. Larus, Andrew Putnam, Jan Gray
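
The tree-walking idea in the abstract above (several decision trees evaluated against one common feature vector by comparing thresholds to feature values) can be sketched in software. The dictionary layout of a tree node, the "less than or equal goes left" convention, and the function names are assumptions for illustration; the patent describes hardware processors, which this sequential sketch does not model.

```python
def walk_tree(node, features):
    """Walk one decision tree against the feature vector and return its leaf value."""
    while "leaf" not in node:
        # Compare the node's threshold with the selected feature value.
        go_left = features[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]

def score_trees(trees, features):
    """Evaluate every tree against the same shared feature vector."""
    return [walk_tree(tree, features) for tree in trees]

tree_a = {"feature": 0, "threshold": 0.5,
          "left": {"leaf": 1.0}, "right": {"leaf": -1.0}}
tree_b = {"feature": 1, "threshold": 3.0,
          "left": {"leaf": 0.25},
          "right": {"feature": 0, "threshold": 0.1,
                    "left": {"leaf": 2.0}, "right": {"leaf": 4.0}}}

common_features = [0.2, 5.0]
scores = score_trees([tree_a, tree_b], common_features)
```

Each call to walk_tree reads the shared vector but never writes to it, so the per-tree walks are independent and could run on separate processors, as the abstract describes.
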
Changing between different roles at acceleration components

Patent number: 10333781

Abstract: Aspects extend to methods, systems, and computer program products for changing between different roles at acceleration components. Changing roles at an acceleration component can be facilitated without loading an image file to configure or partially reconfigure the acceleration component. At configuration time, an acceleration component can be configured with a framework and a plurality of selectable roles. The framework also provides a mechanism for loading different selectable roles for execution at the acceleration component (e.g., the framework can include a superset of instructions for providing any of a plurality of different roles). The framework can receive requests for specified roles from other components and switch to a subset of instructions for the specified roles. Switching between subsets of instructions at an acceleration component is a lower overhead operation relative to reconfiguring or partially reconfiguring an acceleration component by loading an image file.

Type: Grant

Filed: June 26, 2015

Date of Patent: June 25, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Andrew R. Putnam, Douglas C. Burger, Michael David Haselman, Stephen F. Heil, Yi Xiao, Sitaram V. Lanka
ALLOCATING ACCELERATION COMPONENT FUNCTIONALITY FOR SUPPORTING SERVICES

Publication number: 20190190847

Abstract: Aspects extend to methods, systems, and computer program products for allocating acceleration component functionality for supporting services. A service manager uses a finite number of acceleration components to accelerate services. Acceleration components can be allocated in a manner that balances load in a hardware acceleration plane, minimizes role switching, and adapts to demand changes. When role switching is appropriate, less extensive mechanisms (e.g., based on configuration data versus image files) can be used to switch roles to the extent possible.

Type: Application

Filed: February 25, 2019

Publication date: June 20, 2019

Inventors: Douglas C. Burger, Andrew R. Putnam, Stephen F. Heil, Michael David Haselman, Sitaram V. Lanka, Yi Xiao
PARTIALLY RECONFIGURING ACCELERATION COMPONENTS

Publication number: 20190155669

Abstract: Aspects extend to methods, systems, and computer program products for partially reconfiguring acceleration components. Partial reconfiguration can be implemented for any of a variety of reasons, including to address an error in functionality at the acceleration component or to update functionality at the acceleration component. During partial reconfiguration, connectivity can be maintained for any other functionality at the acceleration component untouched by the partial reconfiguration. Partial reconfiguration is more efficient to deploy than full reconfiguration of an acceleration component.

Type: Application

Filed: January 25, 2019

Publication date: May 23, 2019

Inventors: Derek T. Chiou, Sitaram V. Lanka, Adrian M. Caulfield, Andrew R. Putnam, Douglas C. Burger
Implementing a multi-component service using plural hardware acceleration components

Patent number: 10296392

Abstract: A data processing system is described herein that includes two or more software-driven host components that collectively provide a software plane. The data processing system further includes two or more hardware acceleration components that collectively provide a hardware acceleration plane. The hardware acceleration plane implements one or more services, including at least one multi-component service. The multi-component service has plural parts, and is implemented on a collection of two or more hardware acceleration components, where each hardware acceleration component in the collection implements a corresponding part of the multi-component service. Each hardware acceleration component in the collection is configured to interact with other hardware acceleration components in the collection without involvement from any host component. A function parsing component is also described herein that determines a manner of parsing a function into the plural parts of the multi-component service.

Type: Grant

Filed: May 20, 2015

Date of Patent: May 21, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Stephen F. Heil, Adrian M. Caulfield, Douglas C. Burger, Andrew R. Putnam, Eric S. Chung
Allocating acceleration component functionality for supporting services

Patent number: 10270709

Abstract: Aspects extend to methods, systems, and computer program products for allocating acceleration component functionality for supporting services. A service manager uses a finite number of acceleration components to accelerate services. Acceleration components can be allocated in a manner that balances load in a hardware acceleration plane, minimizes role switching, and adapts to demand changes. When role switching is appropriate, less extensive mechanisms (e.g., based on configuration data versus image files) can be used to switch roles to the extent possible.

Type: Grant

Filed: June 26, 2015

Date of Patent: April 23, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Douglas C. Burger, Andrew R. Putnam, Stephen F. Heil, Michael David Haselman, Sitaram V. Lanka, Yi Xiao
Unobtrusive active eye interrogation

Patent number: 10251541

Abstract: Methods and systems for determining a physiological parameter of a subject through interrogation of an eye of the subject with an optical signal are described. Interrogation is performed unobtrusively. The physiological parameter is determined from a signal sensed from the eye of a subject according to a schedule, under the control of a scheduling controller.

Type: Grant

Filed: April 1, 2016

Date of Patent: April 9, 2019

Assignee: Elwha LLC

Inventors: Allen L. Brown, Jr., Douglas C. Burger, Eric Horvitz, Roderick A. Hyde, Edward K. Y. Jung, Jordin T. Kare, Chris Demetrios Karkanias, Eric C. Leuthardt, John L. Manferdelli, Craig J. Mundie, Nathan P. Myhrvold, Barney Pell, Clarence T. Tegreene, Willard H. Wattenburg, Charles Whitmer, Lowell L. Wood, Jr., Richard N. Zare
