Patents by Inventor Jennifer Klamo
Jennifer Klamo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11847553
Abstract: Neural network processing hardware using parallel computational architectures with reconfigurable core-level and vector-level parallelism is provided. In various embodiments, a neural network model memory is adapted to store a neural network model comprising a plurality of layers. Each layer has at least one dimension and comprises a plurality of synaptic weights. A plurality of neural cores is provided. Each neural core includes a computation unit and an activation memory. The computation unit is adapted to apply a plurality of synaptic weights to a plurality of input activations to produce a plurality of output activations. The computation unit has a plurality of vector units. The activation memory is adapted to store the input activations and the output activations. The system is adapted to partition the plurality of cores into a plurality of partitions based on dimensions of the layer and the vector units.
Type: Grant
Filed: June 14, 2018
Date of Patent: December 19, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Andrew S. Cassidy, Myron D. Flickner, Pallab Datta, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, John V. Arthur, Dharmendra S. Modha, Steven K. Esser, Brian Taba, Jennifer Klamo
-
Patent number: 11663461
Abstract: Instruction distribution in an array of neural network cores is provided. In various embodiments, a neural inference chip is initialized with core microcode. The chip comprises a plurality of neural cores. The core microcode is executable by the neural cores to execute a tensor operation of a neural network. The core microcode is distributed to the plurality of neural cores via an on-chip network. The core microcode is executed synchronously by the plurality of neural cores to compute a neural network layer.
Type: Grant
Filed: July 5, 2018
Date of Patent: May 30, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Hartmut Penner, Dharmendra S. Modha, John V. Arthur, Andrew S. Cassidy, Rathinakumar Appuswamy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Jun Sawada, Brian Taba
-
Publication number: 20230062217
Abstract: Hardware neural network processors are provided. A neural core includes a weight memory, an activation memory, a vector-matrix multiplier, and a vector processor. The vector-matrix multiplier is adapted to receive a weight matrix from the weight memory, receive an activation vector from the activation memory, and compute a vector-matrix multiplication of the weight matrix and the activation vector. The vector processor is adapted to receive one or more input vectors from one or more vector sources and perform one or more vector functions on the one or more input vectors to yield an output vector. In some embodiments, a programmable controller is adapted to configure and operate the neural core.
Type: Application
Filed: October 13, 2022
Publication date: March 2, 2023
Inventors: Andrew S. Cassidy, Rathinakumar Appuswamy, John V. Arthur, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
-
Patent number: 11501140
Abstract: Hardware neural network processors are provided. A neural core includes a weight memory, an activation memory, a vector-matrix multiplier, and a vector processor. The vector-matrix multiplier is adapted to receive a weight matrix from the weight memory, receive an activation vector from the activation memory, and compute a vector-matrix multiplication of the weight matrix and the activation vector. The vector processor is adapted to receive one or more input vectors from one or more vector sources and perform one or more vector functions on the one or more input vectors to yield an output vector. In some embodiments, a programmable controller is adapted to configure and operate the neural core.
Type: Grant
Filed: June 19, 2018
Date of Patent: November 15, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Andrew S. Cassidy, Rathinakumar Appuswamy, John V. Arthur, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
-
Publication number: 20220180177
Abstract: A neural inference chip is provided, including at least one neural inference core. The at least one neural inference core is adapted to apply a plurality of synaptic weights to a plurality of input activations to produce a plurality of intermediate outputs. The at least one neural inference core comprises a plurality of activation units configured to receive the plurality of intermediate outputs and produce a plurality of activations. Each of the plurality of activation units is configured to apply a configurable activation function to its input. The configurable activation function has at least a re-ranging term and a scaling term, the re-ranging term determining the range of the activations and the scaling term determining the scale of the activations. Each of the plurality of activation units is configured to obtain the re-ranging term and the scaling term from one or more look up tables.
Type: Application
Filed: December 8, 2020
Publication date: June 9, 2022
Inventors: Jun Sawada, Myron D. Flickner, Andrew Stephen Cassidy, John Vernon Arthur, Pallab Datta, Dharmendra S. Modha, Steven Kyle Esser, Brian Seisho Taba, Jennifer Klamo, Rathinakumar Appuswamy, Filipp Akopyan, Carlos Ortega Otero
-
Patent number: 11238347
Abstract: Parallel processing among arrays of physical neural cores is provided. An array of neural cores is adapted to compute, in parallel, an output activation tensor of a neural network layer. A network is operatively connected to each of the neural cores. The output activation tensor is distributed across the neural cores. An input activation tensor is distributed across the neural cores. A weight tensor is distributed across the neural cores. Each neural core's computation comprises multiplying elements of a portion of the input activation tensor at that core with elements of a portion of the weight tensor at that core, and storing the summed products in a partial sum corresponding to an element of the output activation tensor. Each element of the output activation tensor is computed by accumulating all of the partial sums corresponding to that element via the network.
Type: Grant
Filed: September 28, 2018
Date of Patent: February 1, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Brian Taba, Andrew S. Cassidy, Myron D. Flickner, Pallab Datta, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, John V. Arthur, Dharmendra S. Modha, Steven K. Esser, Jennifer Klamo
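The partial-sum scheme in the abstract above can be illustrated with a small software sketch. It assumes, purely for illustration, that the input activations are split into contiguous chunks, one per core, and that the "network" is a plain element-wise sum; the function names `core_partial_sums`, `accumulate`, and `distributed_layer` are hypothetical and do not come from the patent.

```python
from typing import List


def core_partial_sums(weight_slice: List[List[float]],
                      input_slice: List[float]) -> List[float]:
    """One core: multiply its input portion by its weight portion and
    sum the products into one partial sum per output element."""
    return [sum(w * x for w, x in zip(row, input_slice))
            for row in weight_slice]


def accumulate(partials_per_core: List[List[float]]) -> List[float]:
    """Stand-in for the network: add up all partial sums that
    correspond to the same output element."""
    return [sum(column) for column in zip(*partials_per_core)]


def distributed_layer(weights: List[List[float]],
                      inputs: List[float],
                      num_cores: int) -> List[float]:
    """Split the input dimension across cores (an assumed partitioning),
    compute per-core partial sums, then accumulate them."""
    n = len(inputs)
    chunk = -(-n // num_cores)  # ceiling division
    partials = []
    for core in range(num_cores):
        lo, hi = core * chunk, min((core + 1) * chunk, n)
        weight_slice = [row[lo:hi] for row in weights]
        partials.append(core_partial_sums(weight_slice, inputs[lo:hi]))
    return accumulate(partials)


weights = [[1.0, 2.0, 3.0, 4.0],
           [0.0, 1.0, 0.0, 1.0]]
inputs = [1.0, 1.0, 2.0, 2.0]
print(distributed_layer(weights, inputs, num_cores=2))
```

The result matches an ordinary matrix-vector product regardless of how many cores the input is split across, which is the property the accumulation step relies on.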
-
Patent number: 11010662
Abstract: Massively parallel neural inference computing elements are provided. A plurality of multipliers is arranged in a plurality of equal-sized groups. Each of the plurality of multipliers is adapted to, in parallel, apply a weight to an input activation to generate an output. A plurality of adders is operatively coupled to one of the groups of multipliers. Each of the plurality of adders is adapted to, in parallel, add the outputs of the multipliers within its associated group to generate a partial sum. A plurality of function blocks is operatively coupled to one of the plurality of adders. Each of the plurality of function blocks is adapted to, in parallel, apply a function to the partial sum of its associated adder to generate an output value.
Type: Grant
Filed: March 4, 2020
Date of Patent: May 18, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Rathinakumar Appuswamy, John V. Arthur, Andrew S. Cassidy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
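The multiplier, adder, and function-block stages described in this abstract can be modeled sequentially in software. The following is a minimal sketch, not the patented hardware: the name `inference_element`, the flat list layout, and the choice of ReLU as the applied function are all assumptions made for illustration.

```python
from typing import Callable, List


def inference_element(weights: List[float],
                      activations: List[float],
                      group_size: int,
                      fn: Callable[[float], float]) -> List[float]:
    """Model three stages: multipliers, per-group adders, function blocks."""
    if len(weights) != len(activations):
        raise ValueError("one weight is needed per input activation")
    if len(weights) % group_size != 0:
        raise ValueError("multipliers must form equal-sized groups")

    # Stage 1: each multiplier applies a weight to an input activation.
    products = [w * a for w, a in zip(weights, activations)]

    # Stage 2: each adder sums the outputs of its group of multipliers.
    partial_sums = [sum(products[i:i + group_size])
                    for i in range(0, len(products), group_size)]

    # Stage 3: each function block applies a function to its partial sum.
    return [fn(s) for s in partial_sums]


def relu(x: float) -> float:
    """Example function for the function blocks (an assumption)."""
    return max(0.0, x)


print(inference_element([1.0, 2.0, 0.5, -3.0],
                        [2.0, 1.0, 4.0, 1.0],
                        group_size=2,
                        fn=relu))
```

In hardware each stage would operate in parallel across all units; the list comprehensions here only reproduce the data flow.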
-
Publication number: 20200202205
Abstract: Massively parallel neural inference computing elements are provided. A plurality of multipliers is arranged in a plurality of equal-sized groups. Each of the plurality of multipliers is adapted to, in parallel, apply a weight to an input activation to generate an output. A plurality of adders is operatively coupled to one of the groups of multipliers. Each of the plurality of adders is adapted to, in parallel, add the outputs of the multipliers within its associated group to generate a partial sum. A plurality of function blocks is operatively coupled to one of the plurality of adders. Each of the plurality of function blocks is adapted to, in parallel, apply a function to the partial sum of its associated adder to generate an output value.
Type: Application
Filed: March 4, 2020
Publication date: June 25, 2020
Inventors: Rathinakumar Appuswamy, John V. Arthur, Andrew S. Cassidy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
-
Publication number: 20200117988
Abstract: Networks for distributing parameters and data to neural network compute cores are provided. In various embodiments, a neural inference chip comprises a plurality of neural cores and at least one network interconnecting the plurality of neural cores. Each of the plurality of neural cores is adapted to apply a plurality of synaptic weights to a plurality of input activations to produce a plurality of output activations. The at least one network is adapted to simultaneously deliver synaptic weights and/or input activations to the plurality of neural cores.
Type: Application
Filed: October 11, 2018
Publication date: April 16, 2020
Inventors: John V. Arthur, Brian Taba, Rathinakumar Appuswamy, Andrew S. Cassidy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada
-
Publication number: 20200117981
Abstract: Systems for neural network computation are provided. A neural network processor comprises a plurality of neural cores. The neural network processor has one or more processor precisions per activation. The processor is configured to accept data having a processor feature dimension. A transformation circuit is coupled to the neural network processor, and is adapted to: receive an input data tensor having an input precision per channel at one or more features; transform the input data tensor from the input precision to the processor precision; divide the input data into a plurality of blocks, each block conforming to one of the processor feature dimensions; and provide each of the plurality of blocks to one of the plurality of neural cores. The neural network processor is adapted to compute, by the plurality of neural cores, output of one or more neural network layers.
Type: Application
Filed: October 11, 2018
Publication date: April 16, 2020
Inventors: John V. Arthur, Andrew S. Cassidy, Myron D. Flickner, Pallab Datta, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, Dharmendra S. Modha, Steven K. Esser, Brian Taba, Jennifer Klamo
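The two transformation steps named in this abstract, precision conversion and division into blocks, might look like the following in software. The abstract does not say how precision is converted or how short blocks are handled, so this sketch assumes unsigned integers rescaled by bit shifting and zero padding of the last block; `to_processor_precision` and `split_into_blocks` are hypothetical names.

```python
from typing import List


def to_processor_precision(values: List[int],
                           in_bits: int,
                           out_bits: int) -> List[int]:
    """Rescale unsigned integers from in_bits to out_bits.
    Assumption: plain bit shifting (truncation when narrowing)."""
    if out_bits <= in_bits:
        return [v >> (in_bits - out_bits) for v in values]
    return [v << (out_bits - in_bits) for v in values]


def split_into_blocks(features: List[int],
                      block_size: int,
                      pad: int = 0) -> List[List[int]]:
    """Divide a feature vector into blocks of the processor feature
    dimension, padding the final block (padding is an assumption)."""
    blocks = []
    for start in range(0, len(features), block_size):
        block = features[start:start + block_size]
        block = block + [pad] * (block_size - len(block))
        blocks.append(block)
    return blocks


# Example: 8-bit inputs, a 4-bit processor precision, feature dimension 2.
pixels = [255, 128, 64, 0, 17]
converted = to_processor_precision(pixels, in_bits=8, out_bits=4)
blocks = split_into_blocks(converted, block_size=2)

# Each block would then be handed to one core; here they are just listed.
for core_id, block in enumerate(blocks):
    print(core_id, block)
```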
-
Patent number: 10621489
Abstract: Massively parallel neural inference computing elements are provided. A plurality of multipliers is arranged in a plurality of equal-sized groups. Each of the plurality of multipliers is adapted to, in parallel, apply a weight to an input activation to generate an output. A plurality of adders is operatively coupled to one of the groups of multipliers. Each of the plurality of adders is adapted to, in parallel, add the outputs of the multipliers within its associated group to generate a partial sum. A plurality of function blocks is operatively coupled to one of the plurality of adders. Each of the plurality of function blocks is adapted to, in parallel, apply a function to the partial sum of its associated adder to generate an output value.
Type: Grant
Filed: March 30, 2018
Date of Patent: April 14, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Rathinakumar Appuswamy, John V. Arthur, Andrew S. Cassidy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
-
Publication number: 20200104718
Abstract: Parallel processing among arrays of physical neural cores is provided. An array of neural cores is adapted to compute, in parallel, an output activation tensor of a neural network layer. A network is operatively connected to each of the neural cores. The output activation tensor is distributed across the neural cores. An input activation tensor is distributed across the neural cores. A weight tensor is distributed across the neural cores. Each neural core's computation comprises multiplying elements of a portion of the input activation tensor at that core with elements of a portion of the weight tensor at that core, and storing the summed products in a partial sum corresponding to an element of the output activation tensor. Each element of the output activation tensor is computed by accumulating all of the partial sums corresponding to that element via the network.
Type: Application
Filed: September 28, 2018
Publication date: April 2, 2020
Inventors: Brian Taba, Andrew S. Cassidy, Myron D. Flickner, Pallab Datta, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, John V. Arthur, Dharmendra S. Modha, Steven K. Esser, Jennifer Klamo
-
Publication number: 20200042856
Abstract: Mapping of neural network layers to physical neural cores is provided. In various embodiments, a neural network description describing a plurality of neural network layers is read. Each of the plurality of neural network layers has an associated weight tensor, input tensor, and output tensor. A plurality of precedence relationships among the plurality of neural network layers is determined. The weight tensor, input tensor, and output tensor of each of the plurality of neural network layers are mapped onto an array of neural cores.
Type: Application
Filed: July 31, 2018
Publication date: February 6, 2020
Inventors: Pallab Datta, Andrew S. Cassidy, Myron D. Flickner, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, John V. Arthur, Dharmendra S. Modha, Steven K. Esser, Brian Taba, Jennifer Klamo
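One common way to work with precedence relationships among layers is a topological ordering, shown here as an illustrative sketch only. The abstract does not state which method the invention uses, and the tensor-to-core mapping itself is not modeled; the layer names and the `layer_order` helper are hypothetical. The code uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def layer_order(inputs_of: Dict[str, List[str]]) -> List[str]:
    """Return layers in an order that respects precedence.
    inputs_of maps each layer to the layers whose outputs it consumes."""
    return list(TopologicalSorter(inputs_of).static_order())


# Hypothetical three-layer network description: a simple chain.
description = {
    "conv1": [],
    "conv2": ["conv1"],
    "fc": ["conv2"],
}

order = layer_order(description)
print(order)
```

For networks with branches, more than one valid order exists; any of them guarantees that a layer's input tensor is produced before the layer is computed.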
-
Publication number: 20200019836
Abstract: Networks of distributed neural cores are provided with hierarchical parallelism. In various embodiments, a plurality of neural cores is provided. Each of the plurality of neural cores comprises a plurality of vector compute units configured to operate in parallel. Each of the plurality of neural cores is configured to compute in parallel output activations by applying its plurality of vector compute units to input activations. Each of the plurality of neural cores is assigned a subset of output activations of a layer of a neural network for computation. Upon receipt of a subset of input activations of the layer of the neural network, each of the plurality of neural cores computes a partial sum for each of its assigned output activations, and computes its assigned output activations from at least the computed partial sums.
Type: Application
Filed: July 12, 2018
Publication date: January 16, 2020
Inventors: John V. Arthur, Andrew S. Cassidy, Myron D. Flickner, Pallab Datta, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, Dharmendra S. Modha, Steven K. Esser, Brian Taba, Jennifer Klamo
-
Publication number: 20200012929
Abstract: Instruction distribution in an array of neural network cores is provided. In various embodiments, a neural inference chip is initialized with core microcode. The chip comprises a plurality of neural cores. The core microcode is executable by the neural cores to execute a tensor operation of a neural network. The core microcode is distributed to the plurality of neural cores via an on-chip network. The core microcode is executed synchronously by the plurality of neural cores to compute a neural network layer.
Type: Application
Filed: July 5, 2018
Publication date: January 9, 2020
Inventors: Hartmut Penner, Dharmendra S. Modha, John V. Arthur, Andrew S. Cassidy, Rathinakumar Appuswamy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Jun Sawada, Brian Taba
-
Publication number: 20190385046
Abstract: Neural network processing hardware using parallel computational architectures with reconfigurable core-level and vector-level parallelism is provided. In various embodiments, a neural network model memory is adapted to store a neural network model comprising a plurality of layers. Each layer has at least one dimension and comprises a plurality of synaptic weights. A plurality of neural cores is provided. Each neural core includes a computation unit and an activation memory. The computation unit is adapted to apply a plurality of synaptic weights to a plurality of input activations to produce a plurality of output activations. The computation unit has a plurality of vector units. The activation memory is adapted to store the input activations and the output activations. The system is adapted to partition the plurality of cores into a plurality of partitions based on dimensions of the layer and the vector units.
Type: Application
Filed: June 14, 2018
Publication date: December 19, 2019
Inventors: Andrew S. Cassidy, Myron D. Flickner, Pallab Datta, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, John V. Arthur, Dharmendra S. Modha, Steven K. Esser, Brian Taba, Jennifer Klamo
-
Publication number: 20190385048
Abstract: Hardware neural network processors are provided. A neural core includes a weight memory, an activation memory, a vector-matrix multiplier, and a vector processor. The vector-matrix multiplier is adapted to receive a weight matrix from the weight memory, receive an activation vector from the activation memory, and compute a vector-matrix multiplication of the weight matrix and the activation vector. The vector processor is adapted to receive one or more input vectors from one or more vector sources and perform one or more vector functions on the one or more input vectors to yield an output vector. In some embodiments, a programmable controller is adapted to configure and operate the neural core.
Type: Application
Filed: June 19, 2018
Publication date: December 19, 2019
Inventors: Andrew S. Cassidy, Rathinakumar Appuswamy, John V. Arthur, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba
-
Publication number: 20190332924
Abstract: Neural inference processors are provided. In various embodiments, a processor includes a plurality of cores. Each core includes a neural computation unit, an activation memory, and a local controller. The neural computation unit is adapted to apply a plurality of synaptic weights to a plurality of input activations to produce a plurality of output activations. The activation memory is adapted to store the input activations and the output activations. The local controller is adapted to load the input activations from the activation memory to the neural computation unit and to store the plurality of output activations from the neural computation unit to the activation memory. The processor includes a neural network model memory adapted to store network parameters, including the plurality of synaptic weights. The processor includes a global scheduler operatively coupled to the plurality of cores, adapted to provide the synaptic weights from the neural network model memory to each core.
Type: Application
Filed: April 27, 2018
Publication date: October 31, 2019
Inventors: Andrew S. Cassidy, Myron D. Flickner, Pallab Datta, Hartmut Penner, Rathinakumar Appuswamy, Jun Sawada, John V. Arthur, Dharmendra S. Modha, Steven K. Esser, Brian Taba, Jennifer Klamo
-
Publication number: 20190325295
Abstract: Neural inference chips and cores adapted to provide time, space, and energy efficient neural inference via parallelism and on-chip memory are provided. In various embodiments, the neural inference chips comprise: a plurality of neural cores interconnected by an on-chip network; a first on-chip memory for storing a neural network model, the first on-chip memory being connected to each of the plurality of cores by the on-chip network; and a second on-chip memory for storing input and output data, the second on-chip memory being connected to each of the plurality of cores by the on-chip network.
Type: Application
Filed: April 20, 2018
Publication date: October 24, 2019
Inventors: Dharmendra S. Modha, John V. Arthur, Jun Sawada, Steven K. Esser, Rathinakumar Appuswamy, Brian Taba, Andrew S. Cassidy, Pallab Datta, Myron D. Flickner, Hartmut Penner, Jennifer Klamo
-
Publication number: 20190303749
Abstract: Massively parallel neural inference computing elements are provided. A plurality of multipliers is arranged in a plurality of equal-sized groups. Each of the plurality of multipliers is adapted to, in parallel, apply a weight to an input activation to generate an output. A plurality of adders is operatively coupled to one of the groups of multipliers. Each of the plurality of adders is adapted to, in parallel, add the outputs of the multipliers within its associated group to generate a partial sum. A plurality of function blocks is operatively coupled to one of the plurality of adders. Each of the plurality of function blocks is adapted to, in parallel, apply a function to the partial sum of its associated adder to generate an output value.
Type: Application
Filed: March 30, 2018
Publication date: October 3, 2019
Inventors: Rathinakumar Appuswamy, John V. Arthur, Andrew S. Cassidy, Pallab Datta, Steven K. Esser, Myron D. Flickner, Jennifer Klamo, Dharmendra S. Modha, Hartmut Penner, Jun Sawada, Brian Taba