Patents by Inventor Rathinakumar Appuswamy

Rathinakumar Appuswamy has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11847553
    Abstract: Neural network processing hardware using parallel computational architectures with reconfigurable core-level and vector-level parallelism is provided. In various embodiments, a neural network model memory is adapted to store a neural network model comprising a plurality of layers. Each layer has at least one dimension and comprises a plurality of synaptic weights. A plurality of neural cores is provided. Each neural core includes a computation unit and an activation memory. The computation unit is adapted to apply a plurality of synaptic weights to a plurality of input activations to produce a plurality of output activations. The computation unit has a plurality of vector units. The activation memory is adapted to store the input activations and the output activations. The system is adapted to partition the plurality of cores into a plurality of partitions based on dimensions of the layer and the vector units.
    Type: Grant
    Filed: June 14, 2018
    Date of Patent: December 19, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Andrew S. Cassidy, Myron D. Flickner, Pallab Datta, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, John V. Arthur, Dharmendra S. Modha, Steven K. Esser, Brian Taba, Jennifer Klamo
  • Patent number: 11823054
    Abstract: Learned step size quantization in artificial neural networks is provided. In various embodiments, a system comprises an artificial neural network and a computing node. The artificial neural network comprises: a quantizer having a configurable step size, the quantizer adapted to receive a plurality of input values and quantize the plurality of input values according to the configurable step size to produce a plurality of quantized input values; at least one matrix multiplier configured to receive the plurality of quantized input values from the quantizer and to apply a plurality of weights to the quantized input values to determine a plurality of output values having a first precision; and a multiplier configured to scale the output values to a second precision.
    Type: Grant
    Filed: February 20, 2020
    Date of Patent: November 21, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Steve Esser, Jeffrey L. McKinstry, Deepika Bablani, Rathinakumar Appuswamy, Dharmendra S. Modha
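    Illustration (not part of the patent record): the forward path described in this abstract can be sketched roughly as follows. The 4-bit range, the function names, and the separate weight step `w_step` are assumptions made for the example; how the step size is learned during training is not shown.

```python
import numpy as np

def quantize(x, step, q_min=-8, q_max=7):
    """Map real values to integer levels using a configurable step size."""
    return np.clip(np.round(x / step), q_min, q_max).astype(np.int32)

def quantized_matmul(x, w_int, x_step, w_step):
    """Quantize inputs, multiply in integers, then rescale the result."""
    x_int = quantize(x, x_step)
    acc = x_int @ w_int             # integer accumulators (first precision)
    return acc * (x_step * w_step)  # rescaled to floating point (second precision)

x = np.array([0.26, -0.9, 3.0])
w_int = np.array([[1], [2], [-1]], dtype=np.int32)  # weights already quantized (assumed)
y = quantized_matmul(x, w_int, x_step=0.25, w_step=0.5)
```

    In this sketch the input 3.0 saturates at the top quantization level, which is the kind of clipping that the choice of step size trades off against rounding error.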
  • Patent number: 11663461
    Abstract: Instruction distribution in an array of neural network cores is provided. In various embodiments, a neural inference chip is initialized with core microcode. The chip comprises a plurality of neural cores. The core microcode is executable by the neural cores to execute a tensor operation of a neural network. The core microcode is distributed to the plurality of neural cores via an on-chip network. The core microcode is executed synchronously by the plurality of neural cores to compute a neural network layer.
    Type: Grant
    Filed: July 5, 2018
    Date of Patent: May 30, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Hartmut Penner, Dharmendra S. Modha, John V. Arthur, Andrew S. Cassidy, Rathinakumar Appuswamy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Jun Sawada, Brian Taba
  • Patent number: 11636317
    Abstract: Long short-term memory (LSTM) cells on spiking neuromorphic hardware are provided. In various embodiments, such systems comprise a spiking neurosynaptic core. The neurosynaptic core comprises a memory cell, an input gate operatively coupled to the memory cell and adapted to selectively admit an input to the memory cell, and an output gate operatively coupled to the memory cell and adapted to selectively release an output from the memory cell. The memory cell is adapted to maintain a value in the absence of input.
    Type: Grant
    Filed: February 16, 2017
    Date of Patent: April 25, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Rathinakumar Appuswamy, Michael Beyeler, Pallab Datta, Myron Flickner, Dharmendra S. Modha
  • Publication number: 20230062217
    Abstract: Hardware neural network processors are provided. A neural core includes a weight memory, an activation memory, a vector-matrix multiplier, and a vector processor. The vector-matrix multiplier is adapted to receive a weight matrix from the weight memory, receive an activation vector from the activation memory, and compute a vector-matrix multiplication of the weight matrix and the activation vector. The vector processor is adapted to receive one or more input vectors from one or more vector sources and perform one or more vector functions on the one or more input vectors to yield an output vector. In some embodiments, a programmable controller is adapted to configure and operate the neural core.
    Type: Application
    Filed: October 13, 2022
    Publication date: March 2, 2023
    Inventors: Andrew S. Cassidy, Rathinakumar Appuswamy, John V. Arthur, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
  • Patent number: 11537859
    Abstract: Neural inference chips are provided. A neural core of the neural inference chip comprises a vector-matrix multiplier; a vector processor; and an activation unit operatively coupled to the vector processor. The vector-matrix multiplier, vector processor, and/or activation unit is adapted to operate at variable precision.
    Type: Grant
    Filed: December 6, 2019
    Date of Patent: December 27, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Andrew S. Cassidy, Rathinakumar Appuswamy, John V. Arthur, Pallab Datta, Steve Esser, Myron D. Flickner, Jeffrey McKinstry, Dharmendra S. Modha, Jun Sawada, Brian Taba
  • Patent number: 11501140
    Abstract: Hardware neural network processors are provided. A neural core includes a weight memory, an activation memory, a vector-matrix multiplier, and a vector processor. The vector-matrix multiplier is adapted to receive a weight matrix from the weight memory, receive an activation vector from the activation memory, and compute a vector-matrix multiplication of the weight matrix and the activation vector. The vector processor is adapted to receive one or more input vectors from one or more vector sources and perform one or more vector functions on the one or more input vectors to yield an output vector. In some embodiments, a programmable controller is adapted to configure and operate the neural core.
    Type: Grant
    Filed: June 19, 2018
    Date of Patent: November 15, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Andrew S. Cassidy, Rathinakumar Appuswamy, John V. Arthur, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
  • Publication number: 20220180177
    Abstract: A neural inference chip is provided, including at least one neural inference core. The at least one neural inference core is adapted to apply a plurality of synaptic weights to a plurality of input activations to produce a plurality of intermediate outputs. The at least one neural inference core comprises a plurality of activation units configured to receive the plurality of intermediate outputs and produce a plurality of activations. Each of the plurality of activation units is configured to apply a configurable activation function to its input. The configurable activation function has at least a re-ranging term and a scaling term, the re-ranging term determining the range of the activations and the scaling term determining the scale of the activations. Each of the plurality of activation units is configured to obtain the re-ranging term and the scaling term from one or more lookup tables.
    Type: Application
    Filed: December 8, 2020
    Publication date: June 9, 2022
    Inventors: Jun Sawada, Myron D. Flickner, Andrew Stephen Cassidy, John Vernon Arthur, Pallab Datta, Dharmendra S. Modha, Steven Kyle Esser, Brian Seisho Taba, Jennifer Klamo, Rathinakumar Appuswamy, Filipp Akopyan, Carlos Ortega Otero
  • Publication number: 20220129436
    Abstract: Systems are provided that can produce symbolic and numeric representations of the neural network outputs, such that these outputs can be used to validate correctness of the implementation of the neural network. In various embodiments, a description of an artificial neural network containing no data-dependent branching is read. Based on the description of the artificial neural network, a symbolic representation is constructed of an output of the artificial neural network, the symbolic representation comprising at least one variable. The symbolic representation is compared to a ground truth symbolic representation, thereby validating the neural network system.
    Type: Application
    Filed: October 22, 2020
    Publication date: April 28, 2022
    Inventors: Alexander Andreopoulos, Dharmendra S. Modha, Andrew Stephen Cassidy, Brian Seisho Taba, Carmelo Di Nolfo, Hartmut Penner, John Vernon Arthur, Jun Sawada, Myron D. Flickner, Pallab Datta, Rathinakumar Appuswamy
  • Publication number: 20220129769
    Abstract: Modular neural network computing apparatus are provided with distributed neural network storage. In various embodiments, a neural inference processor comprises a plurality of neural inference cores, at least one model network interconnecting the plurality of neural inference cores, and at least one activation network interconnecting the plurality of neural inference cores. Each of the plurality of neural inference cores comprises memory adapted to store input activations, output activations, and a neural network model. The neural network model comprises synaptic weights, neuron parameters, and neural network instructions. The at least one model network is configured to distribute the neural network model among the plurality of neural inference cores. Each of the plurality of neural inference cores is configured to apply the synaptic weights to input activations from its memory to produce a plurality of output activations to its memory.
    Type: Application
    Filed: October 22, 2020
    Publication date: April 28, 2022
    Inventors: Jun Sawada, Dharmendra S. Modha, John Vernon Arthur, Andrew Stephen Cassidy, Pallab Datta, Rathinakumar Appuswamy, Tapan Kumar Nayak, Brian Kumar Taba, Carlos Ortega Otero, Filipp Akopyan, Arnon Amir, Nathaniel Joseph McClatchey
  • Publication number: 20220129742
    Abstract: Simulation and validation of neural network systems is provided. In various embodiments, a description of an artificial neural network is read. A directed graph is constructed comprising a plurality of edges and a plurality of nodes, each of the plurality of edges corresponding to a queue and each of the plurality of nodes corresponding to a computing function of the neural network system. A graph state is updated over a plurality of time steps according to the description of the neural network, the graph state being defined by the contents of each of the plurality of queues. Each of a plurality of assertions is tested at each of the plurality of time steps, each of the plurality of assertions being a function of a subset of the graph state. Invalidity of the neural network system is indicated for each violation of one of the plurality of assertions.
    Type: Application
    Filed: October 22, 2020
    Publication date: April 28, 2022
    Inventors: Alexander Andreopoulos, Dharmendra S. Modha, Carmelo Di Nolfo, Myron D. Flickner, Andrew Stephen Cassidy, Brian Seisho Taba, Pallab Datta, Rathinakumar Appuswamy, Jun Sawada
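    Illustration (not part of the patent record): a minimal sketch of the queue-based simulation this abstract describes, in which edges are queues, nodes are functions, and assertions over the queue contents are checked at every time step. The data layout, the node ordering within a step, and all names are assumptions made for the example.

```python
from collections import deque

def simulate(nodes, queues, assertions, steps):
    """nodes: list of (function, input edge, output edge);
    queues: dict mapping edge name to deque (the graph state);
    assertions: dict mapping name to a predicate over the graph state."""
    violations = []
    for t in range(steps):
        # Each node consumes one item from its input queue, if available.
        for fn, src, dst in nodes:
            if queues[src]:
                queues[dst].append(fn(queues[src].popleft()))
        # Test every assertion against the updated graph state.
        for name, check in assertions.items():
            if not check(queues):
                violations.append((t, name))
    return violations

queues = {"in": deque([1, 2, 3]), "mid": deque(), "out": deque()}
nodes = [(lambda v: 2 * v, "in", "mid"), (lambda v: v + 1, "mid", "out")]
assertions = {
    "mid_bounded": lambda q: len(q["mid"]) <= 1,
    "out_small": lambda q: len(q["out"]) <= 2,  # deliberately too strict
}
violations = simulate(nodes, queues, assertions, steps=3)
```

    Here the second assertion is violated once the third result reaches the output queue, so the run would be flagged as invalid at that time step.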
  • Publication number: 20220129743
    Abstract: Neural network accelerator output ranking is provided. In various embodiments, a system comprises a data memory; a memory controller configured to access the data memory; a plurality of comparators configured in a tree; a register; and a two-way comparator. The memory controller is configured to provide a first plurality of values from the data memory to the comparator tree. The comparator tree is configured to perform a plurality of concurrent pairwise comparisons of the first plurality of values to arrive at a first greatest value of the first plurality of values. The two-way comparator is configured to output the greater of the greatest value from the comparator tree and a stored value from the register. The register is configured to store the output of the two-way comparator.
    Type: Application
    Filed: October 23, 2020
    Publication date: April 28, 2022
    Inventors: Jun Sawada, Rathinakumar Appuswamy, John Vernon Arthur, Andrew Stephen Cassidy, Pallab Datta, Michael Vincent DeBole, Steven Kyle Esser, Dharmendra S. Modha
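    Illustration (not part of the patent record): a software sketch of the ranking scheme in this abstract, where a comparator tree reduces each batch of values to its maximum and a two-way comparator folds that result into a register. The batch width and function names are assumptions; real hardware would evaluate each tree level concurrently rather than in a loop.

```python
def tree_max(values):
    """Reduce a batch with rounds of pairwise comparisons, like a comparator tree."""
    level = list(values)
    while len(level) > 1:
        nxt = [max(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # an unpaired value passes through to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]

def running_max(data, width):
    """Feed `width` values at a time to the tree; keep the best so far in a register."""
    register = None
    for start in range(0, len(data), width):
        batch_max = tree_max(data[start:start + width])
        # Two-way comparator: greater of the tree output and the stored value.
        register = batch_max if register is None else max(register, batch_max)
    return register
```

    For example, `running_max([3, 9, 1, 4, 7, 2, 8, 5, 6], width=4)` processes three batches and leaves the overall maximum in the register.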
  • Publication number: 20220121925
    Abstract: Chips supporting constant time program control of nested loops are provided. In various embodiments, a chip comprises at least one arithmetic-logic computing unit and a controller operatively coupled to the at least one arithmetic-logic computing unit. The controller is configured according to a program configuration, the program configuration comprising at least one inner loop and at least one outer loop. The controller is configured to cause the at least one arithmetic-logic computing unit to execute a plurality of operations according to the program configuration. The controller is configured to maintain at least a first loop counter and a second loop counter, the first loop counter configured to count a number of executed iterations of the at least one outer loop, and the second loop counter configured to count a number of executed iterations of the at least one inner loop.
    Type: Application
    Filed: October 21, 2020
    Publication date: April 21, 2022
    Inventors: Arnon Amir, Andrew Stephen Cassidy, Nathaniel Joseph McClatchey, Jun Sawada, Dharmendra S. Modha, Rathinakumar Appuswamy
  • Publication number: 20220121951
    Abstract: Conflict-free, stall-free broadcast networks on neural inference chips are provided. In various embodiments, a neural inference chip comprises a plurality of network nodes and a network on chip interconnecting the plurality of network nodes. The network comprises at least one pair of directional paths. The paths of each pair have opposite directions and a common end. The network is configured to accept data at any of the plurality of nodes. The network is configured to propagate data along a first of the pair of directional paths from a source node to the common end of the pair of directional paths and along a second of the pair of directional paths from the common end of the pair of directional paths to one or more destination nodes.
    Type: Application
    Filed: October 21, 2020
    Publication date: April 21, 2022
    Inventors: Andrew Stephen Cassidy, Rathinakumar Appuswamy, John Vernon Arthur, Jun Sawada, Dharmendra S. Modha, Michael Vincent DeBole, Pallab Datta, Tapan Kumar Nayak
  • Patent number: 11270196
    Abstract: Neural inference chips for computing neural activations are provided. In various embodiments, the neural inference chip is adapted to: receive an input activation tensor comprising a plurality of input activations; receive a weight tensor comprising a plurality of weights; Booth recode each of the plurality of weights into a plurality of Booth-coded weights, each Booth-coded value having an order; multiply the input activation tensor by the Booth-coded weights, yielding a plurality of results for each input activation, each of the plurality of results corresponding to the orders of the Booth-coded weights; for each order of the Booth-coded weights, sum the corresponding results, yielding a plurality of partial sums, one for each order; and compute a neural activation from a sum of the plurality of partial sums.
    Type: Grant
    Filed: October 15, 2019
    Date of Patent: March 8, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jun Sawada, Filipp A. Akopyan, Rathinakumar Appuswamy, John V. Arthur, Andrew S. Cassidy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Dharmendra S. Modha, Tapan K. Nayak, Carlos O. Otero
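    Illustration (not part of the patent record): the per-order partial sums in this abstract can be sketched with radix-4 Booth recoding. The radix, the 8-bit weight width, and the weighting of each partial sum by 4 to the power of its order are assumptions of this example; the abstract does not specify them.

```python
def booth_radix4(w, bits=8):
    """Radix-4 Booth digits of a signed integer, least significant first.
    Each digit is in {-2, -1, 0, 1, 2}; w must fit in `bits`-bit two's complement."""
    u = w & ((1 << bits) - 1)       # two's complement bit pattern
    digits, prev = [], 0            # prev is the implicit bit below bit 0
    for k in range(0, bits, 2):
        b0 = (u >> k) & 1
        b1 = (u >> (k + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits

def booth_dot(activations, weights, bits=8):
    """Dot product computed as one partial sum per Booth digit order."""
    coded = [booth_radix4(w, bits) for w in weights]
    partial_sums = [
        sum(a * c[order] for a, c in zip(activations, coded))
        for order in range(bits // 2)
    ]
    # Combine the partial sums, weighting each by its order (assumed radix 4).
    return sum(p * 4 ** order for order, p in enumerate(partial_sums))
```

    Because every digit has magnitude at most 2, each per-order product needs only a shift and an optional negation, which is the usual motivation for Booth coding in multiplier hardware.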
  • Patent number: 11263011
    Abstract: A device for controlling neural inference processor cores is provided, including a compound instruction set architecture. The device comprises an instruction memory, which comprises a plurality of instructions for controlling a neural inference processor core. Each of the plurality of instructions comprises a control operation. The device further comprises a program counter. The device further comprises at least one loop counter register. The device is adapted to execute the plurality of instructions. Executing the plurality of instructions comprises: reading an instruction from the instruction memory based on a value of the program counter; updating the at least one loop counter register according to the control operation of the instruction; and updating the program counter according to the control operation of the instruction and a value of the at least one loop counter register.
    Type: Grant
    Filed: November 28, 2018
    Date of Patent: March 1, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Andrew S. Cassidy, Rathinakumar Appuswamy, John V. Arthur, Pallab Datta, Michael V. Debole, Steven K. Esser, Myron D. Flickner, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
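    Illustration (not part of the patent record): a toy interpreter for the fetch and update cycle this abstract describes, with a program counter and a single loop counter register. The opcodes SET, LOOP, and NOP and the instruction tuple layout are hypothetical and chosen only for the example.

```python
def run(program, max_steps=1000):
    """program: list of (control_op, argument, label) tuples (hypothetical format)."""
    pc, loop, trace = 0, 0, []
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, arg, label = program[pc]   # read the instruction selected by the program counter
        trace.append(label)
        if op == "SET":                # load the loop counter register
            loop = arg
            pc += 1
        elif op == "LOOP":             # decrement, then branch back while the counter is nonzero
            loop -= 1
            pc = arg if loop > 0 else pc + 1
        else:                          # "NOP": no control effect, fall through
            pc += 1
    return trace

program = [
    ("SET", 3, "init"),
    ("NOP", None, "mac"),
    ("LOOP", 1, "next"),
    ("NOP", None, "store"),
]
trace = run(program)
```

    In this sketch the loop body runs three times before control falls through to the final instruction, with the next program counter value depending on both the control operation and the loop counter.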
  • Patent number: 11238347
    Abstract: Parallel processing among arrays of physical neural cores is provided. An array of neural cores is adapted to compute, in parallel, an output activation tensor of a neural network layer. A network is operatively connected to each of the neural cores. The output activation tensor is distributed across the neural cores. An input activation tensor is distributed across the neural cores. A weight tensor is distributed across the neural cores. Each neural core's computation comprises multiplying elements of a portion of the input activation tensor at that core with elements of a portion of the weight tensor at that core, and storing the summed products in a partial sum corresponding to an element of the output activation tensor. Each element of the output activation tensor is computed by accumulating all of the partial sums corresponding to that element via the network.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: February 1, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Brian Taba, Andrew S. Cassidy, Myron D. Flickner, Pallab Datta, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, John V. Arthur, Dharmendra S. Modha, Steven K. Esser, Jennifer Klamo
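    Illustration (not part of the patent record): a minimal NumPy sketch of the partial-sum scheme in this abstract, reduced to a vector-matrix product. Splitting along the input dimension and summing in a single step are assumptions of the example; on the chip the accumulation would travel over the network between cores.

```python
import numpy as np

def distributed_matvec(x, W, n_cores):
    """Split x and the rows of W across cores; each core forms partial sums locally."""
    x_parts = np.array_split(x, n_cores)
    W_parts = np.array_split(W, n_cores, axis=0)
    # Each core multiplies its slice of the input by its slice of the weights.
    partials = [xp @ Wp for xp, Wp in zip(x_parts, W_parts)]
    # Accumulating the partial sums stands in for the reduction over the network.
    return np.sum(partials, axis=0)

x = np.arange(10.0)
W = np.arange(40.0).reshape(10, 4)
y = distributed_matvec(x, W, n_cores=3)
```

    The result should match the undistributed product `x @ W`, since each output element is simply the sum of the per-core partial sums.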
  • Patent number: 11138495
    Abstract: Embodiments of the invention provide a method comprising receiving a set of features extracted from input data, training a linear classifier based on the set of features extracted, and generating a first matrix using the linear classifier. The first matrix includes multiple dimensions. Each dimension includes multiple elements. Elements of a first dimension correspond to the set of features extracted. Elements of a second dimension correspond to a set of classification labels. The elements of the second dimension are arranged based on one or more synaptic weight arrangements. Each synaptic weight arrangement represents effective synaptic strengths for a classification label of the set of classification labels. A neurosynaptic core circuit is programmed with synaptic connectivity information based on the synaptic weight arrangements. The core circuit is configured to classify one or more objects of interest in the input data.
    Type: Grant
    Filed: August 21, 2018
    Date of Patent: October 5, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Rathinakumar Appuswamy, Steven K. Esser, Dharmendra S. Modha
  • Publication number: 20210264279
    Abstract: Learned step size quantization in artificial neural networks is provided. In various embodiments, a system comprises an artificial neural network and a computing node. The artificial neural network comprises: a quantizer having a configurable step size, the quantizer adapted to receive a plurality of input values and quantize the plurality of input values according to the configurable step size to produce a plurality of quantized input values; at least one matrix multiplier configured to receive the plurality of quantized input values from the quantizer and to apply a plurality of weights to the quantized input values to determine a plurality of output values having a first precision; and a multiplier configured to scale the output values to a second precision.
    Type: Application
    Filed: February 20, 2020
    Publication date: August 26, 2021
    Inventors: Steve Esser, Jeffrey L. McKinstry, Deepika Bablani, Rathinakumar Appuswamy, Dharmendra S. Modha
  • Publication number: 20210209450
    Abstract: A neural inference chip includes a global weight memory; a neural core; and a network connecting the global weight memory to the neural core. The neural core comprises a local weight memory. The local weight memory comprises a plurality of memory banks. Each of the plurality of memory banks is uniquely addressable by at least one index. The neural inference chip is adapted to store in the global weight memory a compressed weight block comprising at least one compressed weight matrix. The neural inference chip is adapted to transmit the compressed weight block from the global weight memory to the core via the network. The core is adapted to decode the at least one compressed weight matrix into a decoded weight matrix and store the decoded weight matrix in its local weight memory. The core is adapted to apply the decoded weight matrix to a plurality of input activations to produce a plurality of output activations.
    Type: Application
    Filed: January 3, 2020
    Publication date: July 8, 2021
    Inventors: Andrew S. Cassidy, Rathinakumar Appuswamy, John V. Arthur, Pallab Datta, Steve Esser, Myron D. Flickner, Dharmendra S. Modha, Jun Sawada