Patents by Inventor Dana Michelle Vantrease

Dana Michelle Vantrease has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Reducing computations for data including padding

Patent number: 11960566

Abstract: Systems and methods are provided to eliminate multiplication operations with zero padding data for convolution computations. A multiplication matrix is generated from an input feature map matrix with padding by adjusting coordinates and dimensions of the input feature map matrix to exclude padding data. The multiplication matrix is used to perform matrix multiplications with respective weight values which results in fewer computations as compared to matrix multiplications which include the zero padding data.
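As a rough illustration of the idea in this abstract, the sketch below compares a convolution that multiplies through materialized zero padding with one that narrows each weight's coordinate range so padding is never touched. It is not the patented implementation: the function names, the stride of 1, and the pure-Python list layout are all assumptions made for the example.

```python
def conv2d_padded(x, w, pad):
    """Reference version: materialize the zero padding and multiply everything."""
    H, W = len(x), len(x[0])
    R, S = len(w), len(w[0])
    xp = [[0] * (W + 2 * pad) for _ in range(H + 2 * pad)]
    for i in range(H):
        for j in range(W):
            xp[i + pad][j + pad] = x[i][j]
    OH, OW = H + 2 * pad - R + 1, W + 2 * pad - S + 1
    out = [[0] * OW for _ in range(OH)]
    mults = 0
    for r in range(R):
        for s in range(S):
            for oi in range(OH):
                for oj in range(OW):
                    out[oi][oj] += w[r][s] * xp[oi + r][oj + s]
                    mults += 1
    return out, mults


def conv2d_skip_padding(x, w, pad):
    """Hypothetical variant: for each weight, shrink the output range so that
    only real (non-padding) input elements are ever multiplied."""
    H, W = len(x), len(x[0])
    R, S = len(w), len(w[0])
    OH, OW = H + 2 * pad - R + 1, W + 2 * pad - S + 1
    out = [[0] * OW for _ in range(OH)]
    mults = 0
    for r in range(R):
        for s in range(S):
            # Keep only output coordinates whose input index lands inside x:
            # 0 <= oi + r - pad < H and 0 <= oj + s - pad < W.
            oi_lo, oi_hi = max(0, pad - r), min(OH, H + pad - r)
            oj_lo, oj_hi = max(0, pad - s), min(OW, W + pad - s)
            for oi in range(oi_lo, oi_hi):
                for oj in range(oj_lo, oj_hi):
                    out[oi][oj] += w[r][s] * x[oi + r - pad][oj + s - pad]
                    mults += 1
    return out, mults


x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
w = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
full, full_mults = conv2d_padded(x, w, pad=1)
trimmed, trimmed_mults = conv2d_skip_padding(x, w, pad=1)
```

For this 3x3 input with one element of padding, both functions produce the same output, while the second performs 49 multiplications instead of 81.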

Type: Grant

Filed: April 13, 2021

Date of Patent: April 16, 2024

Assignee: Amazon Technologies, Inc.

Inventors: Dana Michelle Vantrease, Ron Diamant

PROCESSING FOR MULTIPLE INPUT DATA SETS

Publication number: 20230351186

Abstract: Disclosed herein are techniques for performing multi-layer neural network processing for multiple contexts. In one embodiment, a computing engine is set in a first configuration to implement a second layer of a neural network and to process first data related to a first context to generate first context second layer output. The computing engine can be switched from the first configuration to a second configuration to implement a first layer of the neural network. The computing engine can be used to process second data related to a second context to generate second context first layer output. The computing engine can be set to a third configuration to implement a third layer of the neural network to process the first context second layer output and the second context first layer output to generate a first processing result of the first context and a second processing result of the second context.

Type: Application

Filed: May 5, 2023

Publication date: November 2, 2023

Inventors: Dana Michelle Vantrease, Ron Diamant, Thomas A. Volpe, Randy Huang

Processing for multiple input data sets

Patent number: 11797853

Abstract: Disclosed herein are techniques for performing multi-layer neural network processing for multiple contexts. In one embodiment, a computing engine is set in a first configuration to implement a second layer of a neural network and to process first data related to a first context to generate first context second layer output. The computing engine can be switched from the first configuration to a second configuration to implement a first layer of the neural network. The computing engine can be used to process second data related to a second context to generate second context first layer output. The computing engine can be set to a third configuration to implement a third layer of the neural network to process the first context second layer output and the second context first layer output to generate a first processing result of the first context and a second processing result of the second context.

Type: Grant

Filed: September 22, 2022

Date of Patent: October 24, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Dana Michelle Vantrease, Ron Diamant, Thomas A. Volpe, Randy Huang

PERFORMING CONCURRENT OPERATIONS IN A PROCESSING ELEMENT

Publication number: 20230325348

Abstract: A processing element (PE) of a systolic array can perform neural network computations on two or more data elements of an input data set using the same weight. Thus, two or more output data elements corresponding to an output data set may be generated. Based on the size of the input data set and an input data type, the systolic array can process a single data element or multiple data elements in parallel.

Type: Application

Filed: June 15, 2023

Publication date: October 12, 2023

Inventors: Dana Michelle Vantrease, Ron Diamant

Performing concurrent operations in a processing element

Patent number: 11720523

Abstract: A processing element (PE) of a systolic array can perform neural network computations on two or more data elements of an input data set using the same weight. Thus, two or more output data elements corresponding to an output data set may be generated. Based on the size of the input data set and an input data type, the systolic array can process a single data element or multiple data elements in parallel.

Type: Grant

Filed: October 15, 2019

Date of Patent: August 8, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Dana Michelle Vantrease, Ron Diamant

PROCESSING FOR MULTIPLE INPUT DATA SETS

Publication number: 20230014783

Abstract: Disclosed herein are techniques for performing multi-layer neural network processing for multiple contexts. In one embodiment, a computing engine is set in a first configuration to implement a second layer of a neural network and to process first data related to a first context to generate first context second layer output. The computing engine can be switched from the first configuration to a second configuration to implement a first layer of the neural network. The computing engine can be used to process second data related to a second context to generate second context first layer output. The computing engine can be set to a third configuration to implement a third layer of the neural network to process the first context second layer output and the second context first layer output to generate a first processing result of the first context and a second processing result of the second context.

Type: Application

Filed: September 22, 2022

Publication date: January 19, 2023

Inventors: Dana Michelle Vantrease, Ron Diamant, Thomas A. Volpe, Randy Huang

Processing for multiple input data sets

Patent number: 11475306

Abstract: Disclosed herein are techniques for performing multi-layer neural network processing for multiple contexts. In one embodiment, a computing engine is set in a first configuration to implement a second layer of a neural network and to process first data related to a first context to generate first context second layer output. The computing engine can be switched from the first configuration to a second configuration to implement a first layer of the neural network. The computing engine can be used to process second data related to a second context to generate second context first layer output. The computing engine can be set to a third configuration to implement a third layer of the neural network to process the first context second layer output and the second context first layer output to generate a first processing result of the first context and a second processing result of the second context.

Type: Grant

Filed: March 22, 2018

Date of Patent: October 18, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Dana Michelle Vantrease, Ron Diamant, Thomas A. Volpe, Randy Huang

Scheduling neural network computations based on memory capacity

Patent number: 11461631

Abstract: Disclosed herein are techniques for scheduling and executing multi-layer neural network computations for multiple contexts. In one embodiment, a method comprises determining a set of computation tasks to be executed, the set of computation tasks including a first computation task and a second computation task, as well as a third computation task and a fourth computation task to provide input data for the first and second computation tasks; determining a first execution batch comprising the first and second computation tasks; determining a second execution batch comprising at least the third computation task to be executed before the first execution batch; determining whether to include the fourth computation task in the second execution batch based on whether a memory device has sufficient capacity to hold input data and output data of both the third and fourth computation tasks; and executing the second execution batch followed by the first execution batch.

Type: Grant

Filed: March 22, 2018

Date of Patent: October 4, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Dana Michelle Vantrease, Ron Diamant, Thomas A. Volpe, Randy Huang

Weight loading in an array

Patent number: 11275997

Abstract: Disclosed herein are techniques for obtaining weights for neural network computations. In one embodiment, an integrated circuit may include memory configured to store a first weight and a second weight; a row of processing elements comprising a first processing element and a second processing element, the first processing element comprising a first weight register, the second processing element comprising a second weight register, both of the first weight register and the second weight register being controllable by a weight load signal; and a controller configured to: provide the first weight from the memory to the row of processing elements; set the weight load signal to enable the first weight to propagate through the row to reach the first processing element; and set the weight load signal to store the first weight at the first weight register and a flush value at the second weight register.

Type: Grant

Filed: April 30, 2018

Date of Patent: March 15, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Dana Michelle Vantrease, Ron Diamant, Sundeep Amirineni

Test generation of a distributed system

Patent number: 11275661

Abstract: A method of generating instructions to be executed by a plurality of execution engines that share a resource is provided. The method comprises, in a first generation step: reading a first engine logical timestamp vector of a first execution engine of the execution engines, the logical timestamp representing a history of access operations for the resource; determining whether the first engine logical timestamp vector includes a most-up-to-date logical timestamp of the resource in the first generation step; based on the first engine logical timestamp vector including the most-up-to-date logical timestamp of the resource in the first generation step, generating an access instruction to be executed by the first execution engine to access the resource; and scheduling the first execution engine to execute the access instruction.

Type: Grant

Filed: September 25, 2019

Date of Patent: March 15, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Dana Michelle Vantrease, Ron Diamant

Synchronization of concurrent computation engines

Patent number: 11175919

Abstract: Integrated circuit devices and methods for synchronizing execution of program code for multiple concurrently operating execution engines of the integrated circuit devices are provided. In some cases, one execution engine of an integrated circuit device may be dependent on the operation of another execution engine of the integrated circuit device. To synchronize the execution engines around the dependency, a first execution engine may execute an instruction to set a value in a register while a second execution engine may execute an instruction to wait for a condition associated with the register value.
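The set-and-wait pattern described in this abstract can be mimicked in software. The sketch below is only an analogy, assuming two Python threads stand in for the execution engines and a hypothetical `SyncRegister` class stands in for the hardware register; none of these names come from the patent.

```python
import threading


class SyncRegister:
    """Hypothetical software stand-in for a hardware synchronization register."""

    def __init__(self, value=0):
        self._value = value
        self._cond = threading.Condition()

    def set(self, value):
        # "Set" side: update the register and wake any waiting engine.
        with self._cond:
            self._value = value
            self._cond.notify_all()

    def wait(self, predicate, timeout=None):
        # "Wait" side: block until the condition on the register value holds.
        with self._cond:
            return self._cond.wait_for(lambda: predicate(self._value), timeout)


def run_demo():
    reg = SyncRegister(0)
    shared = {}
    log = []

    def first_engine():
        shared["data"] = 42          # produce the result the other engine needs
        log.append("first engine: wrote data")
        reg.set(1)                   # signal that the dependency is satisfied

    def second_engine():
        reg.wait(lambda v: v == 1, timeout=5)   # stall until the register is set
        log.append("second engine: read %d" % shared["data"])

    consumer = threading.Thread(target=second_engine)
    producer = threading.Thread(target=first_engine)
    consumer.start()
    producer.start()
    producer.join()
    consumer.join()
    return log


events = run_demo()
```

Because the second thread cannot pass `wait` until the first has called `set`, the read always follows the write regardless of which thread starts first.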

Type: Grant

Filed: December 13, 2018

Date of Patent: November 16, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Ilya Minkin, Ron Diamant, Drazen Borkovic, Jindrich Zejda, Dana Michelle Vantrease

Reducing computations for data including padding

Patent number: 10990650

Abstract: Systems and methods are provided to eliminate multiplication operations with zero padding data for convolution computations. A multiplication matrix is generated from an input feature map matrix with padding by adjusting coordinates and dimensions of the input feature map matrix to exclude padding data. The multiplication matrix is used to perform matrix multiplications with respective weight values which results in fewer computations as compared to matrix multiplications which include the zero padding data.

Type: Grant

Filed: March 22, 2018

Date of Patent: April 27, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Dana Michelle Vantrease, Ron Diamant

Accelerated quantized multiply-and-add operations

Patent number: 10983754

Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. In one example, an apparatus comprises a first circuit, a second circuit, and a third circuit. The first circuit is configured to: receive first values in a first format, the first values being generated from one or more asymmetric quantization operations of second values in a second format, and generate difference values based on subtracting a third value from each of the first values, the third value representing a zero value in the first format. The second circuit is configured to generate a sum of products in the first format using the difference values. The third circuit is configured to convert the sum of products from the first format to the second format based on scaling the sum of products with a scaling factor.

Type: Grant

Filed: June 2, 2020

Date of Patent: April 20, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Dana Michelle Vantrease, Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni

Synchronization of concurrent computation engines

Patent number: 10922146

Abstract: Systems and methods are provided for synchronizing execution of program code for an integrated circuit device having multiple concurrently operating execution engines, where the operation of one execution engine may be dependent on the operation of another execution engine. Data or resource dependencies may be accommodated with a Set instruction to cause a first execution engine to set a register value and a Wait instruction to cause a second execution engine to wait for a condition associated with the register value. Concurrent operation of the execution engines may thus be synchronized.

Type: Grant

Filed: December 13, 2018

Date of Patent: February 16, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Ilya Minkin, Ron Diamant, Drazen Borkovic, Jindrich Zejda, Dana Michelle Vantrease

ACCELERATED QUANTIZED MULTIPLY-AND-ADD OPERATIONS

Publication number: 20200293284

Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. In one example, an apparatus comprises a first circuit, a second circuit, and a third circuit. The first circuit is configured to: receive first values in a first format, the first values being generated from one or more asymmetric quantization operations of second values in a second format, and generate difference values based on subtracting a third value from each of the first values, the third value representing a zero value in the first format. The second circuit is configured to generate a sum of products in the first format using the difference values. The third circuit is configured to convert the sum of products from the first format to the second format based on scaling the sum of products with a scaling factor.

Type: Application

Filed: June 2, 2020

Publication date: September 17, 2020

Inventors: Dana Michelle Vantrease, Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni

Integrated circuit with rate limiting

Patent number: 10705985

Abstract: In various implementations, provided are systems and methods for an integrated circuit implementing a processor that can include a rate limiting circuit that attempts to fairly distribute processor memory bandwidth between transaction generators in the processor. The rate limiting circuit can maintain a count of tokens for each transaction generator, where a transaction generator can only transmit a transaction when the transaction generator has enough tokens to do so. Each transaction generator can send a request to the rate limiting circuit when the transaction generator wants to transmit a transaction. The rate limiting circuit can then check whether the transaction generator has sufficient tokens to transmit the transaction. When the transaction generator has enough tokens, the rate limiting circuit will allow the transaction to enter the interconnect. When the transaction generator does not have enough tokens, the rate limiting circuit will not allow the transaction to enter the interconnect.
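The token check described in this abstract resembles a per-requester token bucket. The following is a minimal software sketch under stated assumptions: the class name, the generator names, the `tick` refill step, and the token cap are all hypothetical, since the abstract does not say how tokens are replenished.

```python
class RateLimiter:
    """Hypothetical model of a rate limiting circuit with per-generator tokens."""

    def __init__(self, generator_ids, max_tokens, refill_per_tick):
        self.max_tokens = max_tokens
        self.refill_per_tick = refill_per_tick
        # Every transaction generator starts with a full token count.
        self.tokens = {g: max_tokens for g in generator_ids}

    def tick(self):
        # Assumed refill policy: add tokens each tick, capped at max_tokens.
        for g in self.tokens:
            self.tokens[g] = min(self.max_tokens,
                                 self.tokens[g] + self.refill_per_tick)

    def request(self, generator_id, cost=1):
        # Allow the transaction onto the interconnect only if the generator
        # holds enough tokens; otherwise refuse and leave the count unchanged.
        if self.tokens[generator_id] >= cost:
            self.tokens[generator_id] -= cost
            return True
        return False


limiter = RateLimiter(["dma", "core"], max_tokens=2, refill_per_tick=1)
dma_results = [limiter.request("dma") for _ in range(3)]   # third is refused
core_result = limiter.request("core")                      # unaffected by dma
limiter.tick()
dma_after_tick = limiter.request("dma")                    # allowed again
```

Keeping a separate count per generator is what prevents one busy requester from starving the others in this model.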

Type: Grant

Filed: March 12, 2018

Date of Patent: July 7, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Benny Pollak, Dana Michelle Vantrease, Adi Habusha

Accelerated quantized multiply-and-add operations

Patent number: 10678508

Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural networks. A computer-implemented method includes receiving low-precision inputs for a convolution operation from a storage device, and subtracting a low-precision value representing a high-precision zero value from the low-precision inputs to generate difference values, where the low-precision inputs are asymmetrically quantized from high-precision inputs. The method also includes performing multiplication and summation operations on the difference values to generate a sum of products, and generating a high-precision output by scaling the sum of products with a scaling factor.
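The arithmetic in this abstract can be shown with a small dot product. This is a sketch rather than the claimed method: the 8-bit unsigned range, the function names, the specific scales and zero points, and the choice to quantize the weights the same way as the inputs are all assumptions made for illustration.

```python
def quantize(values, scale, zero_point):
    """Asymmetric quantization to an assumed unsigned 8-bit range."""
    return [max(0, min(255, round(v / scale) + zero_point)) for v in values]


def quantized_dot(qx, qw, zero_x, zero_w, scale_x, scale_w):
    # Subtract the low-precision value that represents real zero, then do the
    # multiply-and-add entirely on integer difference values.
    acc = sum((a - zero_x) * (b - zero_w) for a, b in zip(qx, qw))
    # A single scaling step converts the integer sum back to high precision.
    return acc * (scale_x * scale_w)


x = [0.5, -0.25, 1.0]
w = [1.0, 2.0, -1.0]
scale_x, zero_x = 0.25, 4
scale_w, zero_w = 0.5, 10

qx = quantize(x, scale_x, zero_x)    # [6, 3, 8]
qw = quantize(w, scale_w, zero_w)    # [12, 14, 8]
approx = quantized_dot(qx, qw, zero_x, zero_w, scale_x, scale_w)
exact = sum(a * b for a, b in zip(x, w))
```

The values here were chosen to be exactly representable, so `approx` and `exact` both equal -1.0; with arbitrary inputs the quantized result would only approximate the floating-point one.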

Type: Grant

Filed: March 23, 2018

Date of Patent: June 9, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Dana Michelle Vantrease, Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni

Runtime augmentation of engine instructions

Patent number: 10664282

Abstract: Methods for repeated execution of program code by an execution engine are provided. In order to execute large programs, the instruction buffer of an execution engine may be refilled many times with program code to complete one execution of the program. At completion of program execution, the program code needed to begin re-execution of the program is no longer in the instruction buffer. A runtime driver program can load instructions into the instruction buffer, or can cause instructions to be loaded. Once the instructions are loaded, the execution engine may be able to re-execute the instructions without needing further assistance from the runtime driver.

Type: Grant

Filed: February 4, 2019

Date of Patent: May 26, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Ilya Minkin, Ron Diamant, Mohammad El-Shabani, Dana Michelle Vantrease

PERFORMING CONCURRENT OPERATIONS IN A PROCESSING ELEMENT

Publication number: 20200050582

Abstract: A processing element (PE) of a systolic array can perform neural network computations on two or more data elements of an input data set using the same weight. Thus, two or more output data elements corresponding to an output data set may be generated. Based on the size of the input data set and an input data type, the systolic array can process a single data element or multiple data elements in parallel.

Type: Application

Filed: October 15, 2019

Publication date: February 13, 2020

Inventors: Dana Michelle Vantrease, Ron Diamant

Performing concurrent operations in a processing element

Patent number: 10459876

Abstract: A processing element (PE) of a systolic array can perform neural network computations in parallel on two or more sequential data elements of an input data set using the same weight. Thus, two or more output data elements corresponding to an output data set may be generated in parallel. Based on the size of the input data set and an input data type, the systolic array can process a single data element or multiple data elements in parallel.

Type: Grant

Filed: January 31, 2018

Date of Patent: October 29, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Dana Michelle Vantrease, Ron Diamant
