Patents by Inventor Daniel I. Lowell

Daniel I. Lowell has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240054332
    Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.
    Type: Application
    Filed: October 27, 2023
    Publication date: February 15, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Daniel I. Lowell, Sergey Voronov, Mayank Daga
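    (A brief illustrative sketch of this adaptive-quantization loop follows the listing.)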
  • Patent number: 11803734
    Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.
    Type: Grant
    Filed: December 20, 2017
    Date of Patent: October 31, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Daniel I. Lowell, Sergey Voronov, Mayank Daga
  • Patent number: 11562248
    Abstract: Described is an electronic device that includes a sparsity monitor and a processor configured to execute training iterations during a training process for a neural network, each training iteration including processing a separate instance of training data through the neural network. During operation, the sparsity monitor acquires, during a monitoring interval in each of one or more monitoring periods, intermediate data output by at least some intermediate nodes of the neural network during training iterations that occur during each monitoring interval. The sparsity monitor then generates, based at least in part on the intermediate data, one or more values representing sparsity characteristics for the intermediate data. The sparsity monitor next sends, to the processor, the one or more values representing the sparsity characteristics, and the processor controls one or more aspects of executing subsequent training iterations based at least in part on the values representing the sparsity characteristics.
    Type: Grant
    Filed: April 29, 2019
    Date of Patent: January 24, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Shi Dong, Daniel I. Lowell
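    (A brief illustrative sketch of this sparsity-monitoring scheme follows the listing.)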
  • Publication number: 20200342327
    Abstract: Described is an electronic device that includes a sparsity monitor and a processor configured to execute training iterations during a training process for a neural network, each training iteration including processing a separate instance of training data through the neural network. During operation, the sparsity monitor acquires, during a monitoring interval in each of one or more monitoring periods, intermediate data output by at least some intermediate nodes of the neural network during training iterations that occur during each monitoring interval. The sparsity monitor then generates, based at least in part on the intermediate data, one or more values representing sparsity characteristics for the intermediate data. The sparsity monitor next sends, to the processor, the one or more values representing the sparsity characteristics, and the processor controls one or more aspects of executing subsequent training iterations based at least in part on the values representing the sparsity characteristics.
    Type: Application
    Filed: April 29, 2019
    Publication date: October 29, 2020
    Inventors: Shi Dong, Daniel I. Lowell
  • Patent number: 10365996
    Abstract: Techniques for selecting one of a plurality of heterogeneous memory units for placement of blocks of data (e.g., memory pages), based on both reliability and performance, are disclosed. A “cost” for each data block/memory unit combination is determined, based on the frequency of access of the data block, the latency of the memory unit, and, optionally, an architectural vulnerability factor (which represents the level of exposure of a particular memory data value to memory faults such as bit flips). A memory unit is selected for the data block for which the determined cost is the lowest, out of all memory units considered, and the data block is placed into that memory unit.
    Type: Grant
    Filed: October 21, 2016
    Date of Patent: July 30, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Manish Gupta, David A. Roberts, Mitesh R. Meswani, Vilas Sridharan, Steven Raasch, Daniel I. Lowell
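    (A brief illustrative sketch of this cost-based memory placement follows the listing.)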
  • Publication number: 20190188557
    Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.
    Type: Application
    Filed: December 20, 2017
    Publication date: June 20, 2019
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Daniel I. Lowell, Sergey Voronov, Mayank Daga
  • Patent number: 10303472
    Abstract: Systems, apparatuses, and methods for implementing bufferless communication for redundant multithreading applications using register permutation are disclosed. In one embodiment, a system includes a parallel processing unit, a register file, and a scheduler. The scheduler is configured to cause execution of a plurality of threads to be performed in lockstep on the parallel processing unit. The plurality of threads include a first thread and a second thread executing on adjacent first and second lanes, respectively, of the parallel processing unit. The second thread is configured to perform a register permute operation from a first register location to a second register location in a first instruction cycle, with the second register location associated with the second processing lane. The second thread is configured to read from the second register location in a second instruction cycle, wherein the first and second instruction cycles are successive instruction cycles.
    Type: Grant
    Filed: November 22, 2016
    Date of Patent: May 28, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Daniel I. Lowell, Manish Gupta
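    (A brief illustrative sketch of this register-permute RMT communication follows the listing.)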
  • Patent number: 10042687
    Abstract: Techniques for performing redundant multi-threading (“RMT”) include the use of an RMT compare instruction by two program instances (“work-items”). The RMT compare instruction specifies a value from each work-item to be compared. Upon executing the RMT compare instructions, the work-items transmit the values to a hardware comparator unit. The hardware comparator unit compares the received values and performs an error action if the values do not match. The error action may include sending an error code in a return value back to the work-items that requested the comparison or emitting a trap signal. Optionally, the work-items also send addresses for comparison to the comparator unit. If the addresses and values match, then the comparator stores the value at the specified address. If either or both of the values or the addresses do not match, then the comparator performs an error action.
    Type: Grant
    Filed: August 8, 2016
    Date of Patent: August 7, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Daniel I. Lowell, Manish Gupta
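    (A brief illustrative sketch of this RMT compare-and-store flow follows the listing.)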
  • Patent number: 10013240
    Abstract: A first processing element is configured to execute a first thread and one or more second processing elements are configured to execute one or more second threads that are redundant to the first thread. The first thread and the one or more second threads are to selectively bypass one or more comparisons of results of operations performed by the first thread and the one or more second threads depending on whether an event trigger for the comparison has occurred a configurable number of times since a previous comparison of previously encoded values of the results. In some cases the comparison can be performed based on hashed (or encoded) values of the results of a current operation and one or more previous operations.
    Type: Grant
    Filed: June 21, 2016
    Date of Patent: July 3, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Daniel I. Lowell
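    (A brief illustrative sketch of this selective-comparison scheme follows the listing.)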
  • Publication number: 20180143829
    Abstract: Systems, apparatuses, and methods for implementing bufferless communication for redundant multithreading applications using register permutation are disclosed. In one embodiment, a system includes a parallel processing unit, a register file, and a scheduler. The scheduler is configured to cause execution of a plurality of threads to be performed in lockstep on the parallel processing unit. The plurality of threads include a first thread and a second thread executing on adjacent first and second lanes, respectively, of the parallel processing unit. The second thread is configured to perform a register permute operation from a first register location to a second register location in a first instruction cycle, with the second register location associated with the second processing lane. The second thread is configured to read from the second register location in a second instruction cycle, wherein the first and second instruction cycles are successive instruction cycles.
    Type: Application
    Filed: November 22, 2016
    Publication date: May 24, 2018
    Inventors: Daniel I. Lowell, Manish Gupta
  • Publication number: 20180039531
    Abstract: Techniques for performing redundant multi-threading (“RMT”) include the use of an RMT compare instruction by two program instances (“work-items”). The RMT compare instruction specifies a value from each work-item to be compared. Upon executing the RMT compare instructions, the work-items transmit the values to a hardware comparator unit. The hardware comparator unit compares the received values and performs an error action if the values do not match. The error action may include sending an error code in a return value back to the work-items that requested the comparison or emitting a trap signal. Optionally, the work-items also send addresses for comparison to the comparator unit. If the addresses and values match, then the comparator stores the value at the specified address. If either or both of the values or the addresses do not match, then the comparator performs an error action.
    Type: Application
    Filed: August 8, 2016
    Publication date: February 8, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Daniel I. Lowell, Manish Gupta
  • Publication number: 20170364332
    Abstract: A first processing element is configured to execute a first thread and one or more second processing elements are configured to execute one or more second threads that are redundant to the first thread. The first thread and the one or more second threads are to selectively bypass one or more comparisons of results of operations performed by the first thread and the one or more second threads depending on whether an event trigger for the comparison has occurred a configurable number of times since a previous comparison of previously encoded values of the results. In some cases the comparison can be performed based on hashed (or encoded) values of the results of a current operation and one or more previous operations.
    Type: Application
    Filed: June 21, 2016
    Publication date: December 21, 2017
    Inventor: Daniel I. Lowell
  • Publication number: 20170277441
    Abstract: Techniques for selecting one of a plurality of heterogeneous memory units for placement of blocks of data (e.g., memory pages), based on both reliability and performance, are disclosed. A “cost” for each data block/memory unit combination is determined, based on the frequency of access of the data block, the latency of the memory unit, and, optionally, an architectural vulnerability factor (which represents the level of exposure of a particular memory data value to memory faults such as bit flips). A memory unit is selected for the data block for which the determined cost is the lowest, out of all memory units considered, and the data block is placed into that memory unit.
    Type: Application
    Filed: October 21, 2016
    Publication date: September 28, 2017
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Manish Gupta, David A. Roberts, Mitesh R. Meswani, Vilas Sridharan, Steven Raasch, Daniel I. Lowell
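
Illustrative sketches of selected techniques

A minimal Python sketch of the adaptive-quantization loop summarized in publication 20240054332, patent 11803734, and publication 20190188557: derive a distribution statistic from the ANN information, select a quantization function accordingly, and recalculate over a resampled distribution whenever the quantized output does not correlate well enough with a known correct output. The two candidate quantizers, the kurtosis-based selection rule, and the correlation threshold are assumptions chosen for illustration, not the patented method.

    # Sketch of the adaptive-quantization flow; all names and thresholds are illustrative.
    import numpy as np

    def uniform_quantize(x, bits=8):
        """Uniform quantizer: evenly spaced levels between the min and max of x."""
        lo, hi = x.min(), x.max()
        step = (hi - lo) / (2 ** bits - 1) or 1.0
        return np.round((x - lo) / step) * step + lo

    def log_quantize(x, bits=8):
        """Logarithmic quantizer: finer resolution near zero, for heavy-tailed data."""
        sign, mag = np.sign(x), np.log1p(np.abs(x))
        return sign * np.expm1(uniform_quantize(mag, bits))

    QUANTIZERS = {"uniform": uniform_quantize, "log": log_quantize}

    def select_quantizer(values):
        """Pick a quantizer from the distribution of the ANN information:
        heavy-tailed (high excess kurtosis) data gets the log quantizer."""
        z = (values - values.mean()) / (values.std() + 1e-12)
        return "log" if np.mean(z ** 4) - 3.0 > 1.0 else "uniform"

    def adaptive_quantize(weights, evaluate, target, min_corr=0.95, max_tries=3):
        """Quantize `weights`, re-deriving the distribution from a resample and
        reselecting the quantizer while the output correlates poorly with `target`."""
        sample = weights
        for _ in range(max_tries):
            name = select_quantizer(sample)
            quantized = QUANTIZERS[name](weights)
            output = evaluate(quantized)              # run the ANN with quantized weights
            corr = np.corrcoef(output, target)[0, 1]
            if corr >= min_corr:
                break
            sample = np.random.choice(weights, size=len(weights) // 2)  # resample, retry
        return quantized, name, corr

    # Toy usage: the "network" is a stand-in linear map, so the first selection passes.
    w = np.random.laplace(scale=0.1, size=1000)
    q, chosen, corr = adaptive_quantize(w, evaluate=lambda qw: 2.0 * qw, target=2.0 * w)
    print(chosen, round(corr, 4))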
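
A minimal sketch of the sparsity-monitor scheme of patent 11562248 and publication 20200342327, assuming a simple schedule (a short observation interval at the start of each monitoring period) and a single sparsity characteristic, the fraction of zero-valued intermediate activations. The SparsityMonitor class, its thresholds, and the way the training loop reacts to the reported value are illustrative assumptions rather than the patented design.

    import numpy as np

    class SparsityMonitor:
        def __init__(self, period=100, interval=10):
            self.period = period      # training iterations per monitoring period
            self.interval = interval  # iterations observed at the start of each period
            self._samples = []

        def observing(self, iteration):
            return iteration % self.period < self.interval

        def record(self, intermediate_activations):
            """Accumulate intermediate-node outputs captured during the interval."""
            self._samples.append(np.mean(intermediate_activations == 0.0))

        def report(self):
            """Return the sparsity characteristic for the last interval
            (mean fraction of zero-valued activations)."""
            value = float(np.mean(self._samples)) if self._samples else 0.0
            self._samples.clear()
            return value

    def train(num_iterations=300):
        monitor = SparsityMonitor()
        for it in range(num_iterations):
            # Stand-in for a forward pass: ReLU output of one hidden layer.
            hidden = np.maximum(np.random.randn(64, 128) - 1.0, 0.0)
            if monitor.observing(it):
                monitor.record(hidden)
            if it % monitor.period == monitor.interval - 1:
                sparsity = monitor.report()
                # The "processor" adapts subsequent iterations, e.g. switching to a
                # sparse-friendly code path when most activations are zero.
                use_sparse_kernels = sparsity > 0.8
                print(f"iter {it}: sparsity={sparsity:.2f} sparse_path={use_sparse_kernels}")

    train()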
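
A minimal sketch of the cost-based page placement of patent 10365996 and publication 20170277441, assuming a linear cost of the form access frequency times latency plus a weighted reliability term built from the block's architectural vulnerability factor and the unit's relative fault rate. The example memory units and all numbers are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class MemoryUnit:
        name: str
        latency_ns: float   # access latency
        fault_rate: float   # relative susceptibility to faults such as bit flips

    @dataclass
    class DataBlock:
        page: int
        accesses_per_sec: float
        avf: float          # architectural vulnerability factor in [0, 1]

    def placement_cost(block, unit, avf_weight=1e5):
        """Cost of placing `block` in `unit`: a performance term plus a reliability term."""
        performance = block.accesses_per_sec * unit.latency_ns
        reliability = avf_weight * block.avf * unit.fault_rate
        return performance + reliability

    def place(block, units):
        """Select the memory unit with the lowest cost for this data block."""
        return min(units, key=lambda unit: placement_cost(block, unit))

    units = [
        MemoryUnit("fast",     latency_ns=80.0,  fault_rate=3.0),
        MemoryUnit("reliable", latency_ns=120.0, fault_rate=1.0),
    ]
    hot_page  = DataBlock(page=1, accesses_per_sec=5e5, avf=0.05)
    cold_page = DataBlock(page=2, accesses_per_sec=1e3, avf=0.90)

    for block in (hot_page, cold_page):
        print(f"page {block.page} -> {place(block, units).name}")

With these numbers the frequently accessed page lands in the lower-latency unit, while the rarely accessed but highly vulnerable page is steered to the more reliable one.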
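
A minimal sketch of bufferless redundant-multithreading communication through a register permute, in the spirit of patent 10303472 and publication 20180143829. A real implementation would use a cross-lane shuffle between adjacent lanes of a wavefront executing in lockstep on a parallel processing unit; the NumPy arrays standing in for registers, the even/odd lane pairing, and the cycle labels are simplifications made for illustration.

    import numpy as np

    NUM_LANES = 8  # even lanes run the primary copy, odd lanes the redundant copy

    def compute(x):
        """Per-lane work; both copies of a pair should produce the same value."""
        return x * x + 3.0

    def rmt_step(inputs):
        # Cycle 0: every lane computes into its own register r0, in lockstep.
        r0 = np.array([compute(inputs[lane // 2]) for lane in range(NUM_LANES)])

        # Cycle 1: register permute -- each odd (redundant) lane pulls its even
        # neighbour's r0 into a second register r1, with no memory buffer involved.
        r1 = np.zeros_like(r0)
        r1[1::2] = r0[0::2]

        # Cycle 2: the redundant lane reads r1 and checks it against its own result.
        for lane in range(1, NUM_LANES, 2):
            if r0[lane] != r1[lane]:
                raise RuntimeError(f"RMT mismatch on lane pair {lane - 1}/{lane}")
        return r0[0::2]  # primary results, now checked

    print(rmt_step(np.arange(NUM_LANES // 2, dtype=float)))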
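
A minimal sketch of the RMT compare instruction and hardware comparator of patent 10042687 and publication 20180039531, with the comparator modelled as a Python object that holds the first work-item's submission and compares on the second. The class and method names, the integer error code, and the single pending slot are assumptions of this sketch.

    class ComparatorUnit:
        OK, MISMATCH = 0, 1

        def __init__(self, memory):
            self.memory = memory
            self._pending = None  # first work-item's (value, address), awaiting its partner

        def rmt_compare(self, value, address=None):
            """Each redundant work-item submits its value (and optional store address);
            the second submission triggers the comparison. In hardware the first
            work-item would also wait for the result; returning OK at once is a
            simplification of this sketch."""
            if self._pending is None:
                self._pending = (value, address)
                return self.OK
            other_value, other_address = self._pending
            self._pending = None
            if value != other_value or address != other_address:
                return self.MISMATCH          # error action: report an error code
            if address is not None:
                self.memory[address] = value  # matched store is committed exactly once
            return self.OK

    memory = {}
    comparator = ComparatorUnit(memory)

    # Both redundant work-items computed 42 and want it stored at address 0x10.
    assert comparator.rmt_compare(42, 0x10) == ComparatorUnit.OK
    assert comparator.rmt_compare(42, 0x10) == ComparatorUnit.OK
    print(memory)                            # {16: 42}

    # A transient fault corrupts one copy's value: the comparator reports a mismatch.
    comparator.rmt_compare(42, 0x10)
    print(comparator.rmt_compare(43, 0x10))  # 1 == ComparatorUnit.MISMATCH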
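
A minimal sketch of selectively bypassed RMT comparisons over encoded result histories, in the spirit of patent 10013240 and publication 20170364332: each redundant copy folds every result into a running CRC32, and the copies are compared only after the event trigger has fired a configurable number of times since the previous comparison. The CRC32 encoding and the fixed trigger count are assumptions.

    import zlib

    class RedundantCopy:
        """One copy of the computation; each result is folded into a running hash so a
        single deferred comparison covers every operation since the last check."""
        def __init__(self):
            self.digest = 0

        def execute(self, op, value):
            result = op(value)
            # Encode the current result together with all previous ones.
            self.digest = zlib.crc32(repr(result).encode(), self.digest)
            return result

    def run_rmt(ops, inputs, compare_every=4):
        primary, redundant = RedundantCopy(), RedundantCopy()
        triggers = 0
        for op, x in zip(ops, inputs):
            primary.execute(op, x)
            redundant.execute(op, x)
            triggers += 1
            # Bypass the comparison until the event trigger has fired
            # `compare_every` times since the previous comparison.
            if triggers == compare_every:
                if primary.digest != redundant.digest:
                    raise RuntimeError("redundant execution diverged")
                triggers = 0
        return primary.digest

    ops = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3, lambda v: v ** 2] * 2
    print(hex(run_rmt(ops, range(8))))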