Patents by Inventor Daniel I. Lowell

Daniel I. Lowell has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ADAPTIVE QUANTIZATION FOR NEURAL NETWORKS

Publication number: 20240054332

Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.

Type: Application

Filed: October 27, 2023

Publication date: February 15, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Daniel I. Lowell, Sergey Voronov, Mayank Daga
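The select-then-apply flow described in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the choice of excess kurtosis as the distribution statistic, the threshold, and the two candidate quantizers (`uniform_quantize`, `log_quantize`) are all assumptions made for illustration.

```python
import math
import statistics


def uniform_quantize(values, levels=16):
    """Snap each value to one of `levels` evenly spaced points in [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)
    step = (hi - lo) / (levels - 1)
    return [lo + round((v - lo) / step) * step for v in values]


def log_quantize(values):
    """Round each nonzero magnitude to the nearest power of two, keeping the sign."""
    out = []
    for v in values:
        if v == 0:
            out.append(0.0)
        else:
            out.append(math.copysign(2.0 ** round(math.log2(abs(v))), v))
    return out


def excess_kurtosis(values):
    """Hypothetical distribution statistic: large values indicate heavy tails."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)
    if var == 0:
        return 0.0
    m4 = sum((v - mean) ** 4 for v in values) / len(values)
    return m4 / var ** 2 - 3.0


def select_quantizer(values, threshold=1.0):
    """Pick a quantization function from the candidate set based on the distribution."""
    return log_quantize if excess_kurtosis(values) > threshold else uniform_quantize


def adaptive_quantize(values):
    """Return the name of the chosen quantizer and the quantized values."""
    quantizer = select_quantizer(values)
    return quantizer.__name__, quantizer(values)
```

For example, `adaptive_quantize` picks the uniform quantizer for evenly spread weights and the logarithmic one for weights clustered near zero with a few large outliers. The reselection step mentioned in the abstract (re-running the selection when the output correlates poorly with a known correct output) would wrap this call in a loop and is omitted here.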

Adaptive quantization for neural networks

Patent number: 11803734

Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.

Type: Grant

Filed: December 20, 2017

Date of Patent: October 31, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Daniel I. Lowell, Sergey Voronov, Mayank Daga

Data sparsity monitoring during neural network training

Patent number: 11562248

Abstract: An electronic device that includes a processor configured to execute training iterations during a training process for a neural network, each training iteration including processing a separate instance of training data through the neural network, and a sparsity monitor is described. During operation, the sparsity monitor acquires, during a monitoring interval in each of one or more monitoring periods, intermediate data output by at least some intermediate nodes of the neural network during training iterations that occur during each monitoring interval. The sparsity monitor then generates, based at least in part on the intermediate data, one or more values representing sparsity characteristics for the intermediate data. The sparsity monitor next sends, to the processor, the one or more values representing the sparsity characteristics and the processor controls one or more aspects of executing subsequent training iterations based at least in part on the values representing the sparsity characteristics.

Type: Grant

Filed: April 29, 2019

Date of Patent: January 24, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Shi Dong, Daniel I. Lowell
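As a rough software sketch of the monitoring loop described in the abstract above: a monitor samples intermediate outputs over a fixed interval, reduces them to a sparsity value, and the training loop uses that value to adjust later iterations. The class name `SparsityMonitor`, the zero-fraction metric, and the `choose_kernel` policy are assumptions for illustration; the patent describes a monitor alongside the processor, not this code.

```python
class SparsityMonitor:
    """Accumulates zero counts over a monitoring interval of training iterations."""

    def __init__(self, interval):
        self.interval = interval  # iterations sampled per monitoring period
        self._zeros = 0
        self._total = 0
        self._seen = 0

    def observe(self, activations):
        """Record one iteration's intermediate outputs.

        Returns the fraction of zeros once the interval is complete,
        otherwise None. Counters reset after each report.
        """
        self._zeros += sum(1 for a in activations if a == 0)
        self._total += len(activations)
        self._seen += 1
        if self._seen < self.interval:
            return None
        sparsity = self._zeros / self._total if self._total else 0.0
        self._zeros = self._total = self._seen = 0
        return sparsity


def choose_kernel(sparsity, threshold=0.5):
    """Hypothetical control decision: use a sparse kernel when data is mostly zeros."""
    return "sparse" if sparsity >= threshold else "dense"
```

With `SparsityMonitor(interval=2)`, the first call to `observe` returns `None` and the second returns the sparsity across both iterations, which `choose_kernel` can then map to a compute strategy for subsequent iterations.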

Data Sparsity Monitoring During Neural Network Training

Publication number: 20200342327

Abstract: An electronic device that includes a processor configured to execute training iterations during a training process for a neural network, each training iteration including processing a separate instance of training data through the neural network, and a sparsity monitor is described. During operation, the sparsity monitor acquires, during a monitoring interval in each of one or more monitoring periods, intermediate data output by at least some intermediate nodes of the neural network during training iterations that occur during each monitoring interval. The sparsity monitor then generates, based at least in part on the intermediate data, one or more values representing sparsity characteristics for the intermediate data. The sparsity monitor next sends, to the processor, the one or more values representing the sparsity characteristics and the processor controls one or more aspects of executing subsequent training iterations based at least in part on the values representing the sparsity characteristics.

Type: Application

Filed: April 29, 2019

Publication date: October 29, 2020

Inventors: Shi Dong, Daniel I. Lowell

Performance-aware and reliability-aware data placement for n-level heterogeneous memory systems

Patent number: 10365996

Abstract: Techniques for selecting one of a plurality of heterogeneous memory units for placement of blocks of data (e.g., memory pages), based on both reliability and performance, are disclosed. A “cost” for each data block/memory unit combination is determined, based on the frequency of access of the data block, the latency of the memory unit, and, optionally, an architectural vulnerability factor (which represents the level of exposure of a particular memory data value to memory faults such as bit flips). A memory unit is selected for the data block for which the determined cost is the lowest, out of all memory units considered, and the data block is placed into that memory unit.

Type: Grant

Filed: October 21, 2016

Date of Patent: July 30, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Manish Gupta, David A. Roberts, Mitesh R. Meswani, Vilas Sridharan, Steven Raasch, Daniel I. Lowell
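The lowest-cost selection described in the abstract above can be sketched as follows. The abstract names the inputs (access frequency, latency, and an optional architectural vulnerability factor) but not how they are combined, so the formula here, the per-unit `fault_rate` field, and the `reliability_weight` parameter are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryUnit:
    name: str
    latency_ns: float
    fault_rate: float  # hypothetical relative fault rate of this memory unit


def placement_cost(access_freq, unit, avf=None, reliability_weight=1.0):
    """Assumed cost model: a performance term plus an optional reliability term."""
    cost = access_freq * unit.latency_ns
    if avf is not None:
        # AVF scales how much a fault in this block is expected to matter.
        cost += reliability_weight * avf * unit.fault_rate
    return cost


def place_block(access_freq, units, avf=None, reliability_weight=1.0):
    """Return the memory unit with the lowest cost for this data block."""
    return min(
        units,
        key=lambda u: placement_cost(access_freq, u, avf, reliability_weight),
    )
```

Under this model a frequently accessed block lands in the fastest unit when only performance is considered, while supplying a high AVF with a sufficiently large `reliability_weight` can shift the same block to a slower but more reliable unit.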

ADAPTIVE QUANTIZATION FOR NEURAL NETWORKS

Publication number: 20190188557

Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.

Type: Application

Filed: December 20, 2017

Publication date: June 20, 2019

Applicant: Advanced Micro Devices, Inc.

Inventors: Daniel I. Lowell, Sergey Voronov, Mayank Daga

Bufferless communication for redundant multithreading using register permutation

Patent number: 10303472

Abstract: Systems, apparatuses, and methods for implementing bufferless communication for redundant multithreading applications using register permutation are disclosed. In one embodiment, a system includes a parallel processing unit, a register file, and a scheduler. The scheduler is configured to cause execution of a plurality of threads to be performed in lockstep on the parallel processing unit. The plurality of threads include a first thread and a second thread executing on adjacent first and second lanes, respectively, of the parallel processing unit. The second thread is configured to perform a register permute operation from a first register location to a second register location in a first instruction cycle, with the second register location associated with the second processing lane. The second thread is configured to read from the second register location in a second instruction cycle, wherein the first and second instruction cycles are successive instruction cycles.

Type: Grant

Filed: November 22, 2016

Date of Patent: May 28, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Daniel I. Lowell, Manish Gupta

Paired value comparison for redundant multi-threading operations

Patent number: 10042687

Abstract: Techniques for performing redundant multi-threading (“RMT”) include the use of an RMT compare instruction by two program instances (“work-items”). The RMT compare instruction specifies a value from each work-item to be compared. Upon executing the RMT compare instructions, the work-items transmit the values to a hardware comparator unit. The hardware comparator unit compares the received values and performs an error action if the values do not match. The error action may include sending an error code in a return value back to the work-items that requested the comparison or emitting a trap signal. Optionally, the work-items also send addresses for comparison to the comparator unit. If the addresses and values match, then the comparator stores the value at the specified address. If either or both of the values or the addresses do not match, then the comparator performs an error action.

Type: Grant

Filed: August 8, 2016

Date of Patent: August 7, 2018

Assignee: Advanced Micro Devices, Inc.

Inventors: Daniel I. Lowell, Manish Gupta
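A small software model can illustrate the comparator behavior described in the abstract above. The patent describes a hardware comparator unit driven by an RMT compare instruction; the `Comparator` class, its return codes, and the `trap` flag below are hypothetical stand-ins.

```python
class RMTTrap(Exception):
    """Stands in for the trap signal emitted on a mismatch."""


class Comparator:
    """Models a comparator that checks values (and optionally addresses)
    submitted by two redundant work-items."""

    OK = 0
    MISMATCH = 1

    def __init__(self, trap=False):
        self.trap = trap   # if True, raise instead of returning an error code
        self.memory = {}   # address -> value, written only after a successful match

    def compare(self, value_a, value_b, addr_a=None, addr_b=None):
        if value_a != value_b or addr_a != addr_b:
            # Error action: either trap or hand an error code back to the work-items.
            if self.trap:
                raise RMTTrap("redundant work-items disagree")
            return self.MISMATCH
        if addr_a is not None:
            # Values and addresses both match, so the store is performed.
            self.memory[addr_a] = value_a
        return self.OK
```

Calling `compare(7, 7, 0x10, 0x10)` stores `7` at address `0x10` and returns `OK`, while `compare(7, 8)` returns `MISMATCH` (or raises `RMTTrap` when the comparator was created with `trap=True`).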

Fingerprinting of redundant threads using compiler-inserted transformation code

Patent number: 10013240

Abstract: A first processing element is configured to execute a first thread and one or more second processing elements are configured to execute one or more second threads that are redundant to the first thread. The first thread and the one or more second threads are to selectively bypass one or more comparisons of results of operations performed by the first thread and the one or more second threads depending on whether an event trigger for the comparison has occurred a configurable number of times since a previous comparison of previously encoded values of the results. In some cases the comparison can be performed based on hashed (or encoded) values of the results of a current operation and one or more previous operations.

Type: Grant

Filed: June 21, 2016

Date of Patent: July 3, 2018

Assignee: Advanced Micro Devices, Inc.

Inventor: Daniel I. Lowell
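The idea in the abstract above, folding results into a running encoded value and comparing only after a configurable number of events, can be sketched in a few lines. The use of CRC-32 as the encoding, the `fold` helper, and the sequential simulation of the two redundant threads are assumptions for illustration; in the patent the transformation code is inserted by a compiler.

```python
import zlib


def fold(fingerprint, result):
    """Fold one result into a running fingerprint (CRC-32 chosen for illustration)."""
    return zlib.crc32(repr(result).encode(), fingerprint)


def run_redundant(results_a, results_b, compare_every=4):
    """Simulate two redundant threads that each hash their results.

    Fingerprints are compared only on every `compare_every`-th result, so
    intermediate comparisons are bypassed. Returns the 1-based index of the
    result at which a mismatch is detected, or None if none is detected.
    """
    fp_a = fp_b = 0
    for i, (a, b) in enumerate(zip(results_a, results_b), start=1):
        fp_a = fold(fp_a, a)
        fp_b = fold(fp_b, b)
        if i % compare_every == 0 and fp_a != fp_b:
            return i
    return None
```

Because each fingerprint covers the current result and all earlier ones, a divergence at the second result is still caught at the next scheduled comparison. Note that in this sketch, results produced after the last scheduled comparison are never checked; a fuller version would compare once more when the threads finish.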

BUFFERLESS COMMUNICATION FOR REDUNDANT MULTITHREADING USING REGISTER PERMUTATION

Publication number: 20180143829

Abstract: Systems, apparatuses, and methods for implementing bufferless communication for redundant multithreading applications using register permutation are disclosed. In one embodiment, a system includes a parallel processing unit, a register file, and a scheduler. The scheduler is configured to cause execution of a plurality of threads to be performed in lockstep on the parallel processing unit. The plurality of threads include a first thread and a second thread executing on adjacent first and second lanes, respectively, of the parallel processing unit. The second thread is configured to perform a register permute operation from a first register location to a second register location in a first instruction cycle, with the second register location associated with the second processing lane. The second thread is configured to read from the second register location in a second instruction cycle, wherein the first and second instruction cycles are successive instruction cycles.

Type: Application

Filed: November 22, 2016

Publication date: May 24, 2018

Inventors: Daniel I. Lowell, Manish Gupta

PAIRED VALUE COMPARISON FOR REDUNDANT MULTI-THREADING OPERATIONS

Publication number: 20180039531

Abstract: Techniques for performing redundant multi-threading (“RMT”) include the use of an RMT compare instruction by two program instances (“work-items”). The RMT compare instruction specifies a value from each work-item to be compared. Upon executing the RMT compare instructions, the work-items transmit the values to a hardware comparator unit. The hardware comparator unit compares the received values and performs an error action if the values do not match. The error action may include sending an error code in a return value back to the work-items that requested the comparison or emitting a trap signal. Optionally, the work-items also send addresses for comparison to the comparator unit. If the addresses and values match, then the comparator stores the value at the specified address. If either or both of the values or the addresses do not match, then the comparator performs an error action.

Type: Application

Filed: August 8, 2016

Publication date: February 8, 2018

Applicant: Advanced Micro Devices, Inc.

Inventors: Daniel I. Lowell, Manish Gupta

FINGERPRINTING OF REDUNDANT THREADS USING COMPILER-INSERTED TRANSFORMATION CODE

Publication number: 20170364332

Abstract: A first processing element is configured to execute a first thread and one or more second processing elements are configured to execute one or more second threads that are redundant to the first thread. The first thread and the one or more second threads are to selectively bypass one or more comparisons of results of operations performed by the first thread and the one or more second threads depending on whether an event trigger for the comparison has occurred a configurable number of times since a previous comparison of previously encoded values of the results. In some cases the comparison can be performed based on hashed (or encoded) values of the results of a current operation and one or more previous operations.

Type: Application

Filed: June 21, 2016

Publication date: December 21, 2017

Inventor: Daniel I. Lowell

PERFORMANCE-AWARE AND RELIABILITY-AWARE DATA PLACEMENT FOR N-LEVEL HETEROGENEOUS MEMORY SYSTEMS

Publication number: 20170277441

Abstract: Techniques for selecting one of a plurality of heterogeneous memory units for placement of blocks of data (e.g., memory pages), based on both reliability and performance, are disclosed. A “cost” for each data block/memory unit combination is determined, based on the frequency of access of the data block, the latency of the memory unit, and, optionally, an architectural vulnerability factor (which represents the level of exposure of a particular memory data value to memory faults such as bit flips). A memory unit is selected for the data block for which the determined cost is the lowest, out of all memory units considered, and the data block is placed into that memory unit.

Type: Application

Filed: October 21, 2016

Publication date: September 28, 2017

Applicant: Advanced Micro Devices, Inc.

Inventors: Manish Gupta, David A. Roberts, Mitesh R. Meswani, Vilas Sridharan, Steven Raasch, Daniel I. Lowell