Patents by Inventor Sundeep Amirineni

Sundeep Amirineni has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10983754
    Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural network. In one example, an apparatus comprises a first circuit, a second circuit, and a third circuit. The first circuit is configured to: receive first values in a first format, the first values being generated from one or more asymmetric quantization operations of second values in a second format, and generate difference values based on subtracting a third value from each of the first values, the third value representing a zero value in the first format. The second circuit is configured to generate a sum of products in the first format using the difference values. The third circuit is configured to convert the sum of products from the first format to the second format based on scaling the sum of products with a scaling factor.
    Type: Grant
    Filed: June 2, 2020
    Date of Patent: April 20, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Dana Michelle Vantrease, Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni
  • Patent number: 10942742
    Abstract: A reconfigurable processing circuit and system are provided. The system allows a user to program machine-level instructions in order to reconfigure the way the circuit behaves, including by adding new operations. The system can include a profile access content-addressable memory (CAM) configured to receive an execution step value from a step counter. The execution step value can be incremented and/or reset by a step management logic. The profile access CAM can select an entry of a profile table based on an opcode and the execution step value, and the processing engine can execute microcode based on the selected entry of the profile table. The profile access CAM can translate the opcode to an internal short instruction identifier in order to select the entry of the profile table. The system can further include an instruction decoding module configured to merge multiple instruction fields into a single effective instruction field.
    Type: Grant
    Filed: December 11, 2018
    Date of Patent: March 9, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Ron Diamant, Sundeep Amirineni, Mohammad El-Shabani, Sagar Sonar, Kenneth Wayne Patton
  • Patent number: 10943167
    Abstract: Disclosed herein are techniques for performing neural network computations. In one embodiment, an apparatus includes an array of processing elements, the array having configurable dimensions. The apparatus further includes a controller configured to set the dimensions of the array of processing elements based on at least one of: a first number of input data sets to be received by the array, or a second number of output data sets to be output by the array.
    Type: Grant
    Filed: August 12, 2019
    Date of Patent: March 9, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Sundeep Amirineni, Ron Diamant, Randy Huang, Thomas A. Volpe
  • Patent number: 10877552
    Abstract: Dynamic power dissipation in an integrated circuit device is related to switching activity and can degrade performance or cause premature failure. Methods and apparatuses for dynamic power reduction by limiting data transfer requests between execution engines and memory are provided. Data transfer limiter blocks can be associated with execution engines of the integrated circuit device. Each data transfer limiter block may include a set of counters to control a number of data transfer requests from an execution engine that are permitted to reach the memory in a specified period of time. The counters may incrementally increase the number of data transfer requests permitted to reach the memory subsystem from an initial number up to a maximum number of data transfer requests in the specified period of time.
    Type: Grant
    Filed: September 19, 2019
    Date of Patent: December 29, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Ron Diamant, Sundeep Amirineni, Akshay Balasubramanian
  • Patent number: 10817260
    Abstract: Systems and methods are provided to skip multiplication operations with zeros in processing elements of the systolic array to reduce dynamic power consumption. A value of zero can be detected on an input data element entering each row of the array and respective zero indicators may be generated. These respective zero indicators may be passed to all the processing elements in the respective rows. The multiplication operation with the zero value can be skipped in each processing element based on the zero indicators, thus reducing dynamic power consumption.
    Type: Grant
    Filed: June 13, 2018
    Date of Patent: October 27, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni, Thomas A. Volpe
  • Patent number: 10795742
    Abstract: Disclosed are techniques regarding aspects of implementing client configurable logic within a computer system. The computer system can be a cloud infrastructure. The techniques can include determining that the client configurable logic has performed an errant action.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: October 6, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Asif Khan, Sundeep Amirineni, Kiran Kalkunte Seshadri, Nafea Bshara
  • Publication number: 20200293284
    Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural network. In one example, an apparatus comprises a first circuit, a second circuit, and a third circuit. The first circuit is configured to: receive first values in a first format, the first values being generated from one or more asymmetric quantization operations of second values in a second format, and generate difference values based on subtracting a third value from each of the first values, the third value representing a zero value in the first format. The second circuit is configured to generate a sum of products in the first format using the difference values. The third circuit is configured to convert the sum of products from the first format to the second format based on scaling the sum of products with a scaling factor.
    Type: Application
    Filed: June 2, 2020
    Publication date: September 17, 2020
    Inventors: Dana Michelle Vantrease, Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni
  • Patent number: 10768856
    Abstract: Disclosed herein are techniques for performing memory access. In one embodiment, an integrated circuit may include a memory device, a first port to receive first data elements from a memory access circuit within a first time period, and a second port to transmit second data elements to the memory access circuit within a second time period. The memory access circuit may receive the first data elements from the memory device within a third time period shorter than the first time period and transmit, via the first port, the received first data elements to a first processing circuit sequentially within the first time period. The memory access circuit may receive, via the second port, the second data elements from a second processing circuit sequentially within the second time period, and store the received second data elements in the memory device within a fourth time period shorter than the second time period.
    Type: Grant
    Filed: March 12, 2018
    Date of Patent: September 8, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Ron Diamant, Sundeep Amirineni, Akshay Balasubramanian, Eyal Freund
  • Patent number: 10740432
    Abstract: Methods and systems for performing hardware computations of mathematical functions are provided. In one example, a system comprises a mapping table that maps each base value of a plurality of base values to parameters related to a mathematical function; a selection module configured to select, based on an input value, a first base value and first parameters mapped to the first base value in the mapping table; and arithmetic circuits configured to: receive, from the mapping table, the first base value and the first plurality of parameters; and compute, based on a relationship between the input value and the first base value, and based on the first parameters, an estimated output value of the mathematical function for the input value.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: August 11, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Ron Diamant, Randy Renfu Huang, Mohammad El-Shabani, Sundeep Amirineni, Kenneth Wayne Patton, Willis Wang
  • Patent number: 10733498
    Abstract: Methods and systems for supporting parametric function computations in hardware circuits are proposed. In one example, a system comprises a hardware mapping table, a control circuit, and arithmetic circuits. The control circuit is configured to: in a first mode of operation, forward a set of parameters of a non-parametric function associated with an input value from the hardware mapping table to the arithmetic circuits to compute a first approximation of the non-parametric function at the input value; and in a second mode of operation, based on information indicating whether the input value is in a first input range or in a second input range from the hardware mapping table, forward a first parameter or a second parameter of a parametric function to the arithmetic circuits to compute, respectively, a second approximation or a third approximation of the parametric function at the input value.
    Type: Grant
    Filed: December 10, 2018
    Date of Patent: August 4, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Ron Diamant, Sundeep Amirineni, Mohammad El-Shabani
  • Patent number: 10678479
    Abstract: Provided are integrated circuits and methods for operating integrated circuits. An integrated circuit can include a plurality of memory banks and an execution engine including a set of execution components. Each execution component can be associated with a respective memory bank, and can read from and write to only the respective memory bank. The integrated circuit can further include a set of registers each associated with a respective memory bank from the plurality of memory banks. The integrated circuit can further be operable to load to or store from the set of registers in parallel, and load to or store from the set of registers serially. A parallel operation followed by a serial operation enables data to be moved from many memory banks into one memory bank. A serial operation followed by a parallel operation enables data to be moved from one memory bank into many memory banks.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: June 9, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Ron Diamant, Randy Renfu Huang, Sundeep Amirineni, Jeffrey T. Huynh
  • Patent number: 10678508
    Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural network. A computer-implemented method includes receiving low-precision inputs for a convolution operation from a storage device, and subtracting a low-precision value representing a high-precision zero value from the low-precision inputs to generate difference values, where the low-precision inputs are asymmetrically quantized from high-precision inputs. The method also includes performing multiplication and summation operations on the difference values to generate a sum of products, and generating a high-precision output by scaling the sum of products with a scaling factor.
    Type: Grant
    Filed: March 23, 2018
    Date of Patent: June 9, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Dana Michelle Vantrease, Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni
  • Patent number: 10592322
    Abstract: Disclosed herein are techniques for preventing or minimizing completion timeout errors on a computer device. An apparatus includes a processing logic circuit configured to perform transactions requested by a requester device, and a timeout prevention logic coupled to the processing logic circuit. The timeout prevention logic includes a timeout logic and a moderation logic. The timeout logic is configured to, when the processing logic circuit fails to complete a particular transaction requested by the requester device within a reconfigurable time period, generate a timeout event and complete the particular requested transaction. The moderation logic is configured to determine a number of timeout events generated by the timeout logic during a monitoring time period, and set the reconfigurable time period based on the number of timeout events generated by the timeout logic during the monitoring time period.
    Type: Grant
    Filed: June 27, 2017
    Date of Patent: March 17, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Kiran Kalkunte Seshadri, Sundeep Amirineni, Nafea Bshara, Asif Khan
  • Patent number: 10587514
    Abstract: Packet processing pipelines may implement filtering of control plane decisions. When network packets are received various types of decision-making and processing is performed. In order to complete processing for the network packet, some decisions may need to be determined by a control plane for the packet processing pipeline, such as a general processor. Requests for control plane decisions for received network packets may be filtered prior to sending the requests to the control plane based on whether the same control plane decisions have been requested for previously received network packets. For control plane decisions with outstanding control plane decision requests, an additional control plane decision request for the network packet may be blocked, whereas control plane decisions with no outstanding control plane decision requests may be allowed.
    Type: Grant
    Filed: December 21, 2015
    Date of Patent: March 10, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Bijendra Singh, Thomas A. Volpe, Sundeep Amirineni
  • Patent number: 10445638
    Abstract: Disclosed herein are techniques for performing neural network computations. In one embodiment, an apparatus may include an array of processing elements, the array having a configurable first effective dimension and a configurable second effective dimension. The apparatus may also include a controller configured to determine at least one of: a first number of input data sets to be provided to the array at the first time or a second number of output data sets to be generated by the array at the second time, and to configure, based on at least one of the first number or the second number, at least one of the first effective dimension or the second effective dimension of the array.
    Type: Grant
    Filed: February 28, 2018
    Date of Patent: October 15, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Sundeep Amirineni, Ron Diamant, Randy Huang, Thomas A. Volpe
  • Publication number: 20190294413
    Abstract: Disclosed herein are techniques for accelerating convolution operations or other matrix multiplications in applications such as neural network. A computer-implemented method includes receiving low-precision inputs for a convolution operation from a storage device, and subtracting a low-precision value representing a high-precision zero value from the low-precision inputs to generate difference values, where the low-precision inputs are asymmetrically quantized from high-precision inputs. The method also includes performing multiplication and summation operations on the difference values to generate a sum of products, and generating a high-precision output by scaling the sum of products with a scaling factor.
    Type: Application
    Filed: March 23, 2018
    Publication date: September 26, 2019
    Inventors: Dana Michelle Vantrease, Randy Huang, Ron Diamant, Thomas Elmer, Sundeep Amirineni