Patents by Inventor Ron Diamant

Ron Diamant has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11803736
    Abstract: A systolic array can implement an architecture tailored to perform matrix multiplications on constrained fine-grained sparse weight matrices. Each processing element in the systolic array may include a weight register configured to store a weight value, and a multiplexor configured to select a feature map (FMAP) input element from multiple FMAP input data buses based on metadata associated with the weight value. Each processing element may also include a multiplier configured to multiply the selected feature map input element with the weight value to generate a multiplication result, and an adder configured to add the multiplication result to a partial sum input to generate a partial sum output.
    Type: Grant
    Filed: June 30, 2020
    Date of Patent: October 31, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Paul Gilbert Meyer, Thiam Khean Hah, Randy Renfu Huang, Ron Diamant, Vignesh Vivekraja
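    The per-element datapath described in this abstract can be sketched in a few lines of Python. This is a minimal behavioral illustration, not the patented hardware: the class name `SparsePE`, the bus list, and the two-PE example values are all hypothetical.

```python
class SparsePE:
    """One processing element of a sparse systolic array (illustrative)."""

    def __init__(self, weight, metadata):
        self.weight = weight      # weight register: stored nonzero weight value
        self.metadata = metadata  # selects which FMAP input bus carries the operand

    def step(self, fmap_buses, partial_sum_in):
        """One cycle: multiplexor select, multiply, accumulate."""
        fmap = fmap_buses[self.metadata]   # multiplexor picks one FMAP input bus
        product = fmap * self.weight       # multiplier
        return partial_sum_in + product    # adder produces the partial sum output

# A column of PEs accumulates a sparse dot product, skipping zero weights:
pes = [SparsePE(weight=2.0, metadata=0), SparsePE(weight=-1.0, metadata=2)]
psum = 0.0
for pe in pes:
    psum = pe.step(fmap_buses=[3.0, 5.0, 4.0], partial_sum_in=psum)
print(psum)  # 2.0*3.0 + (-1.0)*4.0 = 2.0
```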
  • Patent number: 11797853
    Abstract: Disclosed herein are techniques for performing multi-layer neural network processing for multiple contexts. In one embodiment, a computing engine is set in a first configuration to implement a second layer of a neural network and to process first data related to a first context to generate first context second layer output. The computing engine can be switched from the first configuration to a second configuration to implement a first layer of the neural network. The computing engine can be used to process second data related to a second context to generate second context first layer output. The computing engine can be set to a third configuration to implement a third layer of the neural network to process the first context second layer output and the second context first layer output to generate a first processing result of the first context and a second processing result of the second context.
    Type: Grant
    Filed: September 22, 2022
    Date of Patent: October 24, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Dana Michelle Vantrease, Ron Diamant, Thomas A. Volpe, Randy Huang
  • Publication number: 20230334294
    Abstract: Provided are systems, methods, and integrated circuits for neural network processing. In various implementations, an integrated circuit for neural network processing can include a plurality of memory banks storing weight values for a neural network. The memory banks can be on the same chip as an array of processing engines. Upon receiving input data, the circuit can be configured to use the set of weight values to perform a task defined for the neural network. Performing the task can include reading weight values from the memory banks, inputting the weight values into the array of processing engines, and computing a result using the array of processing engines, where the result corresponds to an outcome of performing the task.
    Type: Application
    Filed: June 22, 2023
    Publication date: October 19, 2023
    Inventors: Randy Huang, Ron Diamant
  • Publication number: 20230325348
    Abstract: A processing element (PE) of a systolic array can perform neural network computations on two or more data elements of an input data set using the same weight. Thus, two or more output data elements corresponding to an output data set may be generated. Based on the size of the input data set and the input data type, the systolic array can process a single data element or multiple data elements in parallel.
    Type: Application
    Filed: June 15, 2023
    Publication date: October 12, 2023
    Inventors: Dana Michelle Vantrease, Ron Diamant
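    The shared-weight parallelism this abstract describes can be illustrated with a small sketch. The function name `pe_multi_element` and the packed-input example are hypothetical; in hardware, the number of packed elements would depend on the input data type's width.

```python
def pe_multi_element(weight, inputs, psums):
    """Apply one stored weight to one or more packed input elements,
    producing one partial-sum output per element (illustrative)."""
    return [p + weight * x for x, p in zip(inputs, psums)]

# Narrow (e.g. 8-bit) inputs packed two per cycle share a single weight:
out = pe_multi_element(3, inputs=[2, 5], psums=[0, 10])
print(out)  # [6, 25]
```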
  • Patent number: 11782706
    Abstract: In one example, a method comprises: receiving input codes, wherein the input codes represent a computational dataflow graph; traversing the computational dataflow graph to identify single-entry-single-exit (SESE) subgraphs of the computational dataflow graph, wherein each SESE subgraph has a sequence of nodes comprising a root node and a child node and representing a sequence of element-wise operators, wherein the root node receives a single input tensor, and wherein the child node outputs a single output tensor; determining a merged operator for each SESE subgraph; and generating executable instructions for the computational dataflow graph to be executed by a hardware accelerator having a first execution unit and a second execution unit, wherein the executable instructions comprise first executable instructions for the merged operators targeted at the first execution unit, and second executable instructions for other operators of the computational dataflow graph targeted at the second execution unit.
    Type: Grant
    Filed: June 29, 2021
    Date of Patent: October 10, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Ron Diamant, Hongbin Zheng, Drazen Borkovic, Haichen Li
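    The core idea of merging a single-entry-single-exit chain of element-wise operators can be sketched in Python. This fuses a chain of unary functions into one callable; the name `merge_sese_chain` and the example operators are illustrative, and the real compiler would emit accelerator instructions rather than a closure.

```python
def merge_sese_chain(ops):
    """Fuse a single-entry-single-exit chain of element-wise operators
    into one merged operator, runnable in a single pass (illustrative)."""
    def merged(x):
        for op in ops:
            x = op(x)
        return x
    return merged

relu = lambda v: max(v, 0.0)
scale = lambda v: v * 2.0
fused = merge_sese_chain([relu, scale])
print(fused(-3.0), fused(4.0))  # 0.0 8.0
```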
  • Patent number: 11775430
    Abstract: Disclosed herein are techniques for performing memory access. In one embodiment, an integrated circuit includes a port and an access engine. The integrated circuit is coupled with a memory device. The access engine is configured to: receive, from an access requester device, a request to access data stored at a memory device; and based on receiving the request: provide, via the port, a sequential access of a plurality of portions of the data to the access requester device; and access the plurality of portions of the data in a parallel form at the memory device for the access requester device. The sequential access can include a sequential write access or a sequential read access of the plurality of portions of the data.
    Type: Grant
    Filed: August 24, 2020
    Date of Patent: October 3, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Ron Diamant, Sundeep Amirineni, Akshay Balasubramanian, Eyal Freund
  • Patent number: 11775268
    Abstract: A compiler-implemented technique for performing a storage allocation is described. Computer code to be converted into machine instructions for execution on an integrated circuit device is received. The integrated circuit device includes a memory having a set of memory locations. Based on the computer code, a set of values that are to be stored on the integrated circuit device are determined. An interference graph that includes the set of values and a set of interferences is constructed. While traversing the interference graph, a set of memory location assignments are generated by assigning the set of values to the set of memory locations in accordance with one or more color selection schemes.
    Type: Grant
    Filed: June 8, 2021
    Date of Patent: October 3, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Preston Pengra Briggs, Ron Diamant, Robert Geva
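    The allocation scheme in this abstract follows the classic graph-coloring approach, which a greedy sketch can illustrate: each value receives the lowest memory slot not used by any interfering, already-assigned value. The function `allocate` and the three-value example are hypothetical, and a real compiler would use more refined color-selection schemes.

```python
def allocate(values, interferes):
    """Greedy graph coloring over an interference graph (illustrative)."""
    assignment = {}
    for v in values:
        # Slots already taken by interfering values that were assigned earlier.
        taken = {assignment[u] for u in interferes.get(v, ()) if u in assignment}
        slot = 0
        while slot in taken:
            slot += 1
        assignment[v] = slot
    return assignment

# "a" interferes with "b" and "c"; "b" and "c" do not interfere, so they share.
graph = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
print(allocate(["a", "b", "c"], graph))  # {'a': 0, 'b': 1, 'c': 1}
```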
  • Patent number: 11741345
    Abstract: Provided are systems, methods, and integrated circuits for a neural network processing system. In various implementations, the system can include a first array of processing engines coupled to a first set of memory banks and a second array of processing engines coupled to a second set of memory banks. The first and second sets of memory banks can store all of the weight values for a neural network, where the weight values are stored before any input data is received. Upon receiving input data, the system performs a task defined for the neural network. Performing the task can include computing an intermediate result using the first array of processing engines, copying the intermediate result to the second set of memory banks, and computing a final result using the second array of processing engines, where the final result corresponds to an outcome of performing the task.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: August 29, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Randy Huang, Ron Diamant
  • Patent number: 11741350
    Abstract: A computer-implemented method includes receiving a neural network model for implementation using a processing element array, where the neural network model includes a convolution operation on a set of input feature maps and a set of filters. The method also includes determining, based on the neural network model, that the convolution operation utilizes less than a threshold number of rows in the processing element array for applying a set of filter elements to the set of input feature maps, where the set of filter elements includes one filter element in each filter of the set of filters. The method further includes generating, for the convolution operation and based on the neural network model, a first instruction and a second instruction for execution by respective rows in the processing element array, where the first instruction and the second instruction use different filter elements of a filter in the set of filters.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: August 29, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Jeffrey T. Huynh, Ron Diamant, Hongbin Zheng, Yizhi Liu, Animesh Jain, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang, Sundeep Amirineni, Drazen Borkovic
  • Patent number: 11720523
    Abstract: A processing element (PE) of a systolic array can perform neural network computations on two or more data elements of an input data set using the same weight. Thus, two or more output data elements corresponding to an output data set may be generated. Based on the size of the input data set and the input data type, the systolic array can process a single data element or multiple data elements in parallel.
    Type: Grant
    Filed: October 15, 2019
    Date of Patent: August 8, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Dana Michelle Vantrease, Ron Diamant
  • Patent number: 11714992
    Abstract: Systems and methods for providing executable instructions to a neural network processor are provided. In one example, a system comprises a database that stores a plurality of executable instructions and a plurality of subgraph identifiers, each subgraph identifier of the plurality of subgraph identifiers being associated with a subset of instructions of the plurality of executable instructions.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: August 1, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Richard John Heaton, Randy Renfu Huang, Ron Diamant
  • Patent number: 11704211
    Abstract: Techniques for avoiding uncorrectable errors in a memory device can include detecting a correctable error pattern of a memory page of a memory device, and determining that the correctable error pattern of the memory page satisfies a page migration condition. Upon satisfying the page migration condition, write accesses to the memory page are prevented from reaching a memory controller of the memory device. The contents of the memory page are then migrated to a reserved page, and a mapping table is updated to replace accesses to the memory page with accesses to the reserved page.
    Type: Grant
    Filed: December 8, 2021
    Date of Patent: July 18, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Patricio Kaplan, Ron Diamant, Brian Robert Silver
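    The migration flow in this abstract can be sketched as a small state machine: count correctable errors per page, and once a page crosses a threshold, copy it to a reserved page and remap future accesses. The class `PageMigrator`, the threshold, and the page numbers are all hypothetical; the patent's condition is a correctable error *pattern*, which this sketch simplifies to a count.

```python
class PageMigrator:
    """Migrate a memory page to a reserved page once its correctable-error
    count crosses a threshold, then remap accesses (illustrative)."""

    def __init__(self, threshold, reserved_pages):
        self.threshold = threshold
        self.reserved = list(reserved_pages)  # pool of reserved pages
        self.errors = {}                      # page -> correctable error count
        self.remap = {}                       # mapping table: old page -> new page

    def resolve(self, page):
        """Route an access through the mapping table."""
        return self.remap.get(page, page)

    def report_correctable_error(self, page):
        self.errors[page] = self.errors.get(page, 0) + 1
        if self.errors[page] >= self.threshold and page not in self.remap:
            # Migration condition met: contents would be copied here,
            # then the mapping table is updated.
            self.remap[page] = self.reserved.pop(0)

m = PageMigrator(threshold=2, reserved_pages=[100])
m.report_correctable_error(7)
m.report_correctable_error(7)
print(m.resolve(7))  # 100 (accesses now go to the reserved page)
```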
  • Patent number: 11687761
    Abstract: Systems and methods for performing improper input data detection are described. In one example, a system comprises: hardware circuits configured to receive input data and to perform computations of a neural network based on the input data to generate computation outputs; and an improper input detection circuit configured to: determine a relationship between the computation outputs of the hardware circuits and reference outputs; determine that the input data are improper based on the relationship; and perform an action based on determining that the input data are improper.
    Type: Grant
    Filed: December 11, 2018
    Date of Patent: June 27, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Randy Renfu Huang, Richard John Heaton, Andrea Olgiati, Ron Diamant
  • Publication number: 20230196113
    Abstract: Methods and systems for training a neural network are provided. In one example, an apparatus comprises a memory that stores instructions; and a hardware processor configured to execute the instructions to: control a neural network processor to perform a loss gradient operation to generate data gradients; after the loss gradient operation completes, control the neural network processor to perform a forward propagation operation to generate intermediate outputs; control the neural network processor to perform a backward propagation operation based on the data gradients and the intermediate outputs to generate weight gradients; receive the weight gradients from the neural network processor; and update weights of a neural network based on the weight gradients.
    Type: Application
    Filed: February 21, 2023
    Publication date: June 22, 2023
    Inventors: Sudipta Sengupta, Randy Renfu Huang, Ron Diamant, Vignesh Vivekaja
  • Patent number: 11676021
    Abstract: A first worker node of a distributed system computes a first set of gradients using a first neural network model and a first set of weights associated with the first neural network model. The first set of gradients are transmitted from the first worker node to a second worker node of the distributed system. The second worker node computes a first set of synchronized gradients based on the first set of gradients. While the first set of synchronized gradients are being computed, the first worker node computes a second set of gradients using a second neural network model and a second set of weights associated with the second neural network model. The second set of gradients are transmitted from the first worker node to the second worker node. The second worker node computes a second set of synchronized gradients based on the second set of gradients.
    Type: Grant
    Filed: September 19, 2022
    Date of Patent: June 13, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Patricio Kaplan, Ron Diamant
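    The overlap this abstract describes can be shown as a schedule: synchronizing one model's gradients proceeds while the next model's gradients are being computed. The function `pipeline_schedule` and its string log are purely illustrative of the ordering, not of the distributed implementation.

```python
def pipeline_schedule(models):
    """Interleave gradient compute and synchronization so the sync of one
    model overlaps the compute of the next (illustrative schedule)."""
    steps = [f"compute {models[0]}"]
    for prev, cur in zip(models, models[1:]):
        steps.append(f"sync {prev} || compute {cur}")  # overlapped phase
    steps.append(f"sync {models[-1]}")
    return steps

print(pipeline_schedule(["A", "B"]))
# ['compute A', 'sync A || compute B', 'sync B']
```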
  • Publication number: 20230153620
    Abstract: A computer-implemented method includes receiving a neural network model that includes a tensor operation, dividing the tensor operation into a set of sub-operations, and generating instructions for performing a plurality of sub-operations of the set of sub-operations on respective computing engines of a plurality of computing engines on a same integrated circuit device or on different integrated circuit devices. Each sub-operation of the set of sub-operations generates a portion of a final output of the tensor operation. An inference is made based on a result of a sub-operation of the plurality of sub-operations, or based on results of the plurality of sub-operations.
    Type: Application
    Filed: January 13, 2023
    Publication date: May 18, 2023
    Inventors: Randy Renfu Huang, Ron Diamant, Richard John Heaton
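    Dividing a tensor operation into sub-operations that each produce part of the final output can be sketched with a row-blocked matrix multiply. The function `split_matmul_rows` is a hypothetical example of one such split; the patent covers tensor operations generally, and each block here stands in for work dispatched to a separate computing engine.

```python
def split_matmul_rows(A, B, num_parts):
    """Split C = A @ B into row-block sub-operations; each sub-operation
    yields a slice of the final output C (illustrative)."""
    step = (len(A) + num_parts - 1) // num_parts
    blocks = [A[i:i + step] for i in range(0, len(A), step)]

    def matmul(X, Y):
        return [[sum(x * Y[k][j] for k, x in enumerate(row))
                 for j in range(len(Y[0]))] for row in X]

    # Each entry could run on a different computing engine.
    return [matmul(blk, B) for blk in blocks]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(split_matmul_rows(A, B, 2))  # [[[19, 22]], [[43, 50]]]
```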
  • Patent number: 11645075
    Abstract: Execution flows of a program can be characterized by a series of execution events. The rates at which these execution events occur for a particular program can be collected periodically, and the execution events statistics can be utilized for both training a machine learning model, and later on for making classification inferences to determine whether a program run contains any abnormality. When an abnormality is encountered, an alert can be generated and provided to supervisory logic of a computing system to indicate that an abnormal program flow has been detected.
    Type: Grant
    Filed: June 30, 2021
    Date of Patent: May 9, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Barak Wasserstrom, Adi Habusha, Ron Diamant, Erez Sabbag
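    The event-rate classification idea can be sketched with a simple statistical profile: model normal behavior as per-event mean and standard deviation, and flag runs whose rates deviate too far. The patent describes a trained machine learning model; this z-score stand-in, with its made-up `branch_miss` samples, only illustrates the shape of the approach.

```python
import statistics

def train_profile(rate_samples):
    """Model normal behavior as per-event (mean, stdev) of observed rates."""
    return {event: (statistics.mean(v), statistics.stdev(v))
            for event, v in rate_samples.items()}

def is_abnormal(profile, observed, z_limit=3.0):
    """Flag a program run whose event rate deviates beyond z_limit stdevs."""
    for event, rate in observed.items():
        mean, stdev = profile[event]
        if stdev and abs(rate - mean) / stdev > z_limit:
            return True  # would raise an alert to supervisory logic
    return False

p = train_profile({"branch_miss": [10, 12, 11, 9, 10]})
print(is_abnormal(p, {"branch_miss": 11}))   # False
print(is_abnormal(p, {"branch_miss": 500}))  # True
```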
  • Patent number: 11636569
    Abstract: In one example, an apparatus comprises: a buffer memory; and a memory access circuit configured to: fetch, from a first memory, a set of first groups of data elements of a first matrix, each first group of data elements being stored at consecutive memory addresses at the first memory; based on a first configuration, store the set of first groups of data elements at consecutive memory addresses or at non-consecutive memory addresses at the buffer memory; based on a second configuration that defines a memory address offset, fetch a set of second groups of the data elements from the buffer memory, each second group of the data elements being stored at consecutive memory addresses of the buffer memory, each second group being separated by the memory address offset in the buffer memory; and store each fetched second group at consecutive addresses of a destination memory to form a second matrix.
    Type: Grant
    Filed: September 23, 2020
    Date of Patent: April 25, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Kun Xu, Ron Diamant
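    The second configuration in this abstract, fetching groups of consecutive elements separated by a fixed address offset, is the core of a strided-regrouping engine, sketched below. The function `strided_regroup` and the 2x4 example are illustrative only.

```python
def strided_regroup(buffer, group_size, offset, num_groups):
    """Fetch num_groups groups of group_size consecutive elements, with
    successive groups separated by `offset` elements, and pack them
    contiguously into a destination (illustrative)."""
    out = []
    for g in range(num_groups):
        start = g * offset
        out.extend(buffer[start:start + group_size])
    return out

# A 2x4 row-major matrix; group_size=1 with offset=4 gathers a column,
# i.e. one step of forming the transposed second matrix:
buf = [1, 2, 3, 4,
       5, 6, 7, 8]
print(strided_regroup(buf, group_size=1, offset=4, num_groups=2))  # [1, 5]
```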
  • Patent number: 11625453
    Abstract: To improve utilization of a systolic array, each row of the array is provided with a number of general purpose row input data buses. Each of the general purpose row input data buses can be operable to transfer either feature map (FMAP) input elements or weight values into the processing elements of the corresponding row of the array. By using such general purpose row input data buses, concurrent matrix multiplications as well as faster background weight loading can be achieved in the array.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: April 11, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Paul Gilbert Meyer, Ron Diamant
  • Patent number: 11625269
    Abstract: A technique for scheduling instructions includes obtaining a set of instructions that operate on memory objects, and determining the dependencies of the memory objects. The memory objects are then sorted into a sequence of memory objects based on the dependencies of the memory objects, and the set of instructions are scheduled into a sequence of instructions according to the sequence of memory objects. Sorting memory objects allows instructions that operate on the same memory object to be kept together. This helps minimize spilling conditions because intervening instructions that do not operate on the same memory object can be avoided.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: April 11, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Robert Geva, Taylor Goodhart, Ron Diamant, Preston Pengra Briggs
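    The scheduling step this abstract describes, keeping instructions that touch the same memory object together, can be sketched with a stable sort keyed on a dependency-sorted object order. The function `schedule` and the `(name, object)` instruction tuples are hypothetical simplifications of the compiler's internal representation.

```python
def schedule(instructions, object_order):
    """Order instructions so those operating on the same memory object stay
    together, following a dependency-sorted object sequence (illustrative)."""
    rank = {obj: i for i, obj in enumerate(object_order)}
    # sorted() is stable, so instructions on the same object keep their order.
    return sorted(instructions, key=lambda ins: rank[ins[1]])

# (instruction, memory object) pairs; objects sorted by dependency: t0 then t1.
insns = [("load", "t0"), ("store", "t1"), ("add", "t0"), ("mul", "t1")]
print(schedule(insns, ["t0", "t1"]))
# [('load', 't0'), ('add', 't0'), ('store', 't1'), ('mul', 't1')]
```

    Grouping the `t0` instructions ahead of the `t1` instructions removes the intervening uses of `t1` that would otherwise lengthen `t0`'s live range and risk spilling.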