Patents by Inventor Pritish Narayanan
Pritish Narayanan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240192921
Abstract: Systems and methods for compensating multiply and accumulate (MAC) operations are described. A processor can send an input vector to a first portion of a memory device. The first portion can store synaptic weights of a trained artificial neural network (ANN). The processor can read a first result of a MAC operation performed on the input vector and the synaptic weights stored in the first portion. The processor can send an inverse of the input vector to a second portion of the memory device. The processor can read a second result of a MAC operation performed on the inverse of the input vector and an inverse of synaptic weights stored in the second portion. The processor can combine the first result and the second result to generate a final result. The final result can be a compensated version of the first result.
Type: Application
Filed: December 9, 2022
Publication date: June 13, 2024
Inventors: Stefano Ambrogio, Pritish Narayanan
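The combine step can be illustrated with a toy Python model (all names and the error model are hypothetical, not from the patent; a simple input-proportional error stands in for real array nonidealities). Since (-x)·(-W) equals x·W, the second portion reproduces the ideal product while the sign of the modeled error flips, so averaging the two reads cancels it:

```python
def mac(x, W, k=0.05):
    """Idealized crossbar MAC with an additive error proportional to the
    total input -- a toy stand-in for array nonidealities such as IR drop."""
    n_out = len(W[0])
    err = k * sum(x)
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + err
            for j in range(n_out)]

def compensated_mac(x, W):
    """Read portion 1 (x, W) and portion 2 (-x, -W), then combine."""
    first = mac(x, W)
    neg_x = [-xi for xi in x]
    neg_W = [[-w for w in row] for row in W]
    second = mac(neg_x, neg_W)          # same ideal product, opposite error
    return [(a + b) / 2 for a, b in zip(first, second)]
```

With this error model the cancellation is exact; with real device noise it would only be partial.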
-
Patent number: 12003240
Abstract: A circuit comprises a first pulse-width modulator configured to generate a first pulse based on a first input, a second pulse-width modulator configured to generate a second pulse based on a second input, a first differential circuit comprising a first transistor, a second transistor, a first resistor, and a second resistor, and a second differential circuit comprising a first transistor, a second transistor, a first resistor, and a second resistor. A gate of the first transistor of the first differential circuit and a gate of the second transistor of the first differential circuit, and a gate of the first transistor of the second differential circuit and a gate of the second transistor of the second differential circuit are configured to be controlled by the first and second pulse-width modulators based on the first input and the second input.
Type: Grant
Filed: October 31, 2022
Date of Patent: June 4, 2024
Assignee: International Business Machines Corporation
Inventors: Charles Mackin, Pritish Narayanan
-
Publication number: 20240162889
Abstract: A circuit comprises a first pulse-width modulator configured to generate a first pulse based on a first input, a second pulse-width modulator configured to generate a second pulse based on a second input, a first differential circuit comprising a first transistor, a second transistor, a first resistor, and a second resistor, and a second differential circuit comprising a first transistor, a second transistor, a first resistor, and a second resistor. A gate of the first transistor of the first differential circuit and a gate of the second transistor of the first differential circuit, and a gate of the first transistor of the second differential circuit and a gate of the second transistor of the second differential circuit are configured to be controlled by the first and second pulse-width modulators based on the first input and the second input.
Type: Application
Filed: October 31, 2022
Publication date: May 16, 2024
Inventors: Charles Mackin, Pritish Narayanan
-
Patent number: 11977974
Abstract: A system, having a memory that stores computer-executable components and a processor that executes the computer-executable components, reduces data size in connection with training a neural network by exploiting spatial locality in weight matrices and effecting frequency transformation and compression. A receiving component receives neural network data in the form of a compressed frequency-domain weight matrix. A segmentation component segments the initial weight matrix into original sub-components, wherein respective original sub-components have spatial weights. A sampling component applies a generalized weight distribution to the respective original sub-components to generate respective normalized sub-components. A transform component applies a transform to the respective normalized sub-components.
Type: Grant
Filed: November 30, 2017
Date of Patent: May 7, 2024
Assignee: International Business Machines Corporation
Inventors: Chia-Yu Chen, Jungwook Choi, Kailash Gopalakrishnan, Suyog Gupta, Pritish Narayanan
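One way to illustrate the segment-then-transform idea (a stand-in sketch, not the patented transform): segment the weight matrix into 2x2 sub-blocks and apply a Haar-style transform to each. Spatially smooth weights concentrate energy in the block-average coefficient, so dropping the detail coefficients compresses the matrix:

```python
def haar_forward(a, b, c, d):
    """2x2 Haar-style transform: block average plus three detail terms."""
    return ((a + b + c + d) / 4, (a - b + c - d) / 4,
            (a + b - c - d) / 4, (a - b - c + d) / 4)

def haar_inverse(s, h, v, g):
    """Exact inverse of haar_forward."""
    return (s + h + v + g, s - h + v - g, s + h - v - g, s - h - v + g)

def compress_weights(W, drop_details=True):
    """Segment an even-sized weight matrix into 2x2 sub-blocks, transform
    each, and optionally zero the detail coefficients (lossy compression
    that exploits spatial locality of the weights)."""
    coeffs = []
    for i in range(0, len(W), 2):
        for j in range(0, len(W[0]), 2):
            s, h, v, g = haar_forward(W[i][j], W[i][j + 1],
                                      W[i + 1][j], W[i + 1][j + 1])
            coeffs.append((s, 0.0, 0.0, 0.0) if drop_details else (s, h, v, g))
    return coeffs
```

A DCT or FFT per block would follow the same segment/transform/quantize pattern.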
-
Patent number: 11868893
Abstract: Implementing a convolutional neural network (CNN) includes configuring a crosspoint array to implement a convolution layer in the CNN. Convolution kernels of the layer are stored in crosspoint devices of the array. Computations for the CNN are performed by iterating a set of operations for a predetermined number of times. The operations include transmitting voltage pulses corresponding to a subpart of a vector of input data to the crosspoint array. The voltage pulses generate electric currents that are representative of performing multiplication operations at the crosspoint devices based on weight values stored at the crosspoint devices. A set of integrators accumulates an electric charge based on the output electric currents from the respective crosspoint devices. The crosspoint array outputs the accumulated charge after iterating for the predetermined number of times. The accumulated charge represents a multiply-add result of the vector of input data and the one or more convolution kernels.
Type: Grant
Filed: December 2, 2022
Date of Patent: January 9, 2024
Assignee: International Business Machines Corporation
Inventors: HsinYu Tsai, Geoffrey Burr, Pritish Narayanan
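A toy model of the iterated accumulation (hypothetical names; pure Python arithmetic in place of analog currents and charges). Each pass over the taps plays the role of one set of voltage pulses: a subpart of the input is applied, the stored kernel values multiply it, and the per-channel integrators accumulate the partial product until readout:

```python
def crosspoint_conv(x, kernels):
    """kernels[t][c] holds tap t of output channel c, as stored in the array.
    For each output position, iterate a predetermined number of times (one
    per tap): each iteration adds one partial product ('charge') into the
    integrators, which are read out after the last iteration.
    (Correlation convention: no kernel flip.)"""
    K, C = len(kernels), len(kernels[0])
    out = []
    for p in range(len(x) - K + 1):
        acc = [0.0] * C                  # integrators, reset per position
        for t in range(K):               # predetermined number of iterations
            for c in range(C):
                acc[c] += x[p + t] * kernels[t][c]
        out.append(acc)
    return out
```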
-
Publication number: 20230419093
Abstract: Techniques for generating digital outputs as stochastic bitstreams with activation function mapping are provided. In one aspect, a system includes: a shared circuitry component including a random number generator (RNG) for generating a sequence of random addresses to read a random sequence of digital voltage references stored in a lookup table (LUT), and a digital-to-analog converter (DAC) for converting the random sequence of digital voltage references into random analog voltage references VL; and one or more comparators for comparing the random analog voltage references VL and input analog voltages VN in sequences of comparisons to produce sequences of digital pulses as stochastic bitstreams. A system having multiple comparators for simultaneously comparing each of the random analog voltage references VL against more than one of the input analog voltages VN in parallel is also provided, as is a method for generating digital outputs from input analog voltages VN.
Type: Application
Filed: June 23, 2022
Publication date: December 28, 2023
Inventors: Pritish Narayanan, Geoffrey Burr
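A minimal sketch of the comparison loop, assuming a uniform reference distribution in place of the actual LUT contents (names are hypothetical). The fraction of 1s in the resulting bitstream approximates the normalized input voltage; a non-uniform LUT would map the input through an activation function instead:

```python
import random

def stochastic_bitstream(v_in, n=2000, v_max=1.0, seed=0):
    """Compare an input voltage against a sequence of random reference
    levels; each comparison emits one bit of the stochastic bitstream."""
    rng = random.Random(seed)  # stands in for the shared RNG + LUT + DAC
    return [1 if v_in > rng.uniform(0.0, v_max) else 0 for _ in range(n)]
```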
-
Patent number: 11823740
Abstract: A computer-implemented method, according to one embodiment, includes: causing a first subset of pulse width modulators in a crossbar array of memory cells to apply respective pulses to the crossbar array together at a same start time and end the respective pulses according to a predetermined distribution of times correlated to stored pulse width data for each pulse width modulator. The method also includes causing a second subset of pulse width modulators in the crossbar array to apply pulses to the crossbar array according to the predetermined distribution of times correlated to stored pulse width data for each pulse width modulator and end the respective pulses together at a same end time.
Type: Grant
Filed: December 8, 2021
Date of Patent: November 21, 2023
Assignee: International Business Machines Corporation
Inventors: Geoffrey Burr, Masatoshi Ishii, Pritish Narayanan, Paul Michael Solomon
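The two pulse alignments can be sketched as time intervals (a toy model with hypothetical names; `T` is the shared end time of the second subset). Start-aligning one subset and end-aligning the other staggers the switching edges in time rather than stacking them all at once:

```python
def schedule_pulses(durations_a, durations_b, T):
    """First subset: all pulses share start t=0, each ends per its stored
    duration. Second subset: starts are staggered so all pulses share the
    end t=T. Returns (start, end) intervals."""
    first = [(0, d) for d in durations_a]
    second = [(T - d, T) for d in durations_b]
    return first + second
```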
-
Patent number: 11797833
Abstract: Optimized synapses for neuromorphic arrays are provided. In various embodiments, first and second single-transistor current sources are electrically coupled in series. The first single-transistor current source is electrically coupled to both a first control circuit and second control circuit, free of any intervening logic gate between the first single-transistor current source and either one of the control circuits. The second single-transistor current source is electrically coupled to both the first control circuit and the second control circuit, free of any intervening logic gate between the second single-transistor current source and either one of the control circuits. A capacitor is electrically coupled to the first and second single-transistor current sources. A read circuit is electrically coupled to the capacitor. The first and second single-transistor current sources are adapted to charge the capacitor only when concurrently receiving a control signal from both the first and second control circuits.
Type: Grant
Filed: November 14, 2017
Date of Patent: October 24, 2023
Assignee: International Business Machines Corporation
Inventors: Geoffrey W. Burr, Pritish Narayanan
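The charging condition reduces to a coincidence rule, sketched here as a one-line toy model (hypothetical names): charge flows onto the capacitor only while both control signals overlap, which is what lets the series transistors replace an explicit logic gate:

```python
def synapse_charge(q, row_active, col_active, dq=1.0):
    """Add dq of charge only when both control signals are asserted
    concurrently -- a local AND implemented by the series current sources."""
    return q + dq if (row_active and col_active) else q
```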
-
Publication number: 20230316060
Abstract: Embodiments disclosed herein include a compute-in-memory (CIM) accelerator architecture for deep neural networks (DNNs). The CIM accelerator architecture may include a first analog fabric engine having a plurality of compute-in-memory (CIM) analog tiles. Each CIM analog tile may be configured to store a matrix of weight operands producing a vector of outputs from a vector of inputs, and to perform in-memory computations. The first analog fabric may also include a plurality of compute cores. Each CIM analog tile and each compute core may include a microcontroller configured to execute a set of instructions. The first analog fabric may also include on-chip interconnects communicatively connecting all CIM analog tiles in the plurality of CIM analog tiles to the compute cores.
Type: Application
Filed: March 31, 2022
Publication date: October 5, 2023
Inventors: Shubham Jain, HsinYu Tsai, Geoffrey Burr, Milos Stanisavljevic, Pritish Narayanan
-
Publication number: 20230306251
Abstract: A device comprises activation function circuitry configured to implement a non-linear activation function. The activation function circuitry comprises a comparator circuit, a capacitor, and a ramp voltage generator circuit. The capacitor comprises a terminal coupled to a first input terminal of the comparator circuit, and is configured to receive and store an input voltage which corresponds to an input value to the non-linear activation function. The ramp voltage generator circuit is configured to generate a ramp voltage which is applied to a second input terminal of the comparator circuit. The comparator circuit is configured to compare, during a conversion period, the stored input voltage to the ramp voltage, and generate a voltage pulse based on a result of the comparing. The voltage pulse comprises a pulse duration which encodes an activation output value of the non-linear activation function based on the input value to the non-linear activation function.
Type: Application
Filed: March 23, 2022
Publication date: September 28, 2023
Inventors: Stefano Ambrogio, Pritish Narayanan
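A discrete-time sketch of the ramp comparison (hypothetical names, not the patented circuit). The comparator output stays high until the ramp crosses the stored voltage, so the pulse duration time-encodes the input; choosing a non-linear ramp `shape` makes the duration encode the inverse of that shape, i.e. a non-linear activation:

```python
def ramp_to_pulse(v_in, t_steps=100, shape=lambda u: u):
    """Count time steps until the rising ramp shape(t / t_steps) crosses
    the stored voltage v_in; the count is the output pulse duration."""
    duration = 0
    for t in range(t_steps):
        if shape(t / t_steps) < v_in:
            duration += 1           # comparator output still high
        else:
            break                   # ramp crossed the stored voltage
    return duration
```

For example, a quadratic ramp makes the duration proportional to the square root of the input.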
-
Publication number: 20230306252
Abstract: A system comprises a processor, and a resistive processing unit (RPU) array. The RPU array comprises an array of cells which respectively comprise resistive memory devices that are programmable to store weight values. The processor is configured to obtain a matrix comprising target weight values, program cells of the array of cells to store weight values in the RPU array, which correspond to respective target weight values of the matrix, and perform a calibration process to calibrate the RPU array. The calibration process comprises iteratively adjusting the target weight values of the matrix, and reprogramming the stored weight values of the matrix in the RPU array based on the respective adjusted target weight values, to reduce a variation between output lines of the RPU array with respect to multiply-and-accumulate distribution data that is generated and output from respective output lines of the RPU array during the calibration process.
Type: Application
Filed: March 25, 2022
Publication date: September 28, 2023
Inventors: Stefano Ambrogio, Pritish Narayanan, Geoffrey Burr
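A toy calibration loop under an assumed per-column gain error (names and the error model are hypothetical, not the patented procedure). Probing with an all-ones input exposes each output line's deviation from its target column sum, and rescaling the adjusted targets drives the lines back into agreement:

```python
def calibrate_rpu(target, column_gain, iters=5):
    """Iteratively adjust target weights so that, after being distorted by
    an unknown per-column gain, each output line's MAC sum matches the
    original target column sum."""
    n, m = len(target), len(target[0])
    adj = [row[:] for row in target]
    for _ in range(iters):
        # 'Reprogram' the array: stored weight = adjusted target * gain.
        stored = [[adj[i][j] * column_gain[j] for j in range(m)]
                  for i in range(n)]
        for j in range(m):
            measured = sum(stored[i][j] for i in range(n))  # all-ones probe
            desired = sum(target[i][j] for i in range(n))
            if abs(measured) > 1e-12:
                scale = desired / measured
                for i in range(n):
                    adj[i][j] *= scale
    return adj
```

With a purely multiplicative error this converges in one pass; noisier models would need the extra iterations.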
-
Publication number: 20230198511
Abstract: A computer-implemented method, according to one embodiment, includes: causing a multi-bit input to be split into two or more chunks, where each of the two or more chunks includes at least one individual bit. Each of the two or more chunks is also converted into a respective pulse width modulated signal, and a partial result is generated in digital form for each of the respective pulse width modulated signals. Each of the partial results is scaled by a respective significance factor corresponding to each of the two or more chunks, and the scaled partial results are also accumulated.
Type: Application
Filed: December 17, 2021
Publication date: June 22, 2023
Inventors: Geoffrey Burr, Masatoshi Ishii, Pritish Narayanan
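The split/convert/scale/accumulate flow can be sketched for an 8-bit input split into two 4-bit chunks (a hypothetical sketch; integer arithmetic stands in for the pulse-width-modulated partial MACs). The low chunk has significance 1 and the high chunk significance 16:

```python
def chunked_mac(inputs, weights, chunk_bits=4, n_chunks=2):
    """Split each input into chunk_bits-wide chunks, compute a partial MAC
    per chunk, scale by the chunk's significance (2**(k*chunk_bits)), and
    accumulate the scaled partial results."""
    n_out = len(weights[0])
    mask = (1 << chunk_bits) - 1
    total = [0] * n_out
    for k in range(n_chunks):
        chunk = [(x >> (k * chunk_bits)) & mask for x in inputs]
        partial = [sum(c * weights[i][j] for i, c in enumerate(chunk))
                   for j in range(n_out)]          # one short PWM pass
        sig = 1 << (k * chunk_bits)                # significance factor
        total = [t + p * sig for t, p in zip(total, partial)]
    return total
```

Shorter chunks mean shorter pulses per pass, trading more passes for less time spent per conversion.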
-
Publication number: 20230178150
Abstract: A computer-implemented method, according to one embodiment, includes: causing a first subset of pulse width modulators in a crossbar array of memory cells to apply respective pulses to the crossbar array together at a same start time and end the respective pulses according to a predetermined distribution of times correlated to stored pulse width data for each pulse width modulator. The method also includes causing a second subset of pulse width modulators in the crossbar array to apply pulses to the crossbar array according to the predetermined distribution of times correlated to stored pulse width data for each pulse width modulator and end the respective pulses together at a same end time.
Type: Application
Filed: December 8, 2021
Publication date: June 8, 2023
Inventors: Geoffrey Burr, Masatoshi Ishii, Pritish Narayanan, Paul Michael Solomon
-
Patent number: 11646944
Abstract: A system according to one embodiment includes a collection of computing nodes arranged in a mesh of N×M×Z topology, the nodes including computational hardware, wherein Z < N and Z < M, and wherein N and M are at least equal to 4; a collection of I/O connections interfaced with one of the sides of the mesh, said side having N×M nodes, each of the I/O connections being tied to a unique one of the nodes in said side; and I/O cards that are tied to the I/O connections.
Type: Grant
Filed: September 21, 2021
Date of Patent: May 9, 2023
Assignee: International Business Machines Corporation
Inventors: Alexis Asseman, Ahmet Serkan Ozcan, Charles Edwin Cox, Pritish Narayanan, Nicolas Antoine
-
Publication number: 20230100139
Abstract: Implementing a convolutional neural network (CNN) includes configuring a crosspoint array to implement a convolution layer in the CNN. Convolution kernels of the layer are stored in crosspoint devices of the array. Computations for the CNN are performed by iterating a set of operations for a predetermined number of times. The operations include transmitting voltage pulses corresponding to a subpart of a vector of input data to the crosspoint array. The voltage pulses generate electric currents that are representative of performing multiplication operations at the crosspoint devices based on weight values stored at the crosspoint devices. A set of integrators accumulates an electric charge based on the output electric currents from the respective crosspoint devices. The crosspoint array outputs the accumulated charge after iterating for the predetermined number of times. The accumulated charge represents a multiply-add result of the vector of input data and the one or more convolution kernels.
Type: Application
Filed: December 2, 2022
Publication date: March 30, 2023
Inventors: HsinYu Tsai, Geoffrey Burr, Pritish Narayanan
-
Publication number: 20230100564
Abstract: Arrays of neural cores are provided. Each neural core comprises ordered input wires, ordered output wires, and synapses, each of the synapses operatively coupled to one of the input wires and one of the output wires. A plurality of signal wires is provided. At least one of the signal wires is disposed along each dimension of the array of neural cores. A plurality of routers is provided, each of which is operatively coupled to one of the neural cores and to at least one of the signal wires along each of the dimensions of the array of neural cores. Each of the routers selectively routes a signal from the at least one signal wire to its coupled neural core. Each of the routers selectively routes a signal from its coupled neural core to the at least one signal wire. The routers segment the ordered input wires and the ordered output wires into segments and independently route the signals of each segment.
Type: Application
Filed: September 29, 2021
Publication date: March 30, 2023
Inventors: Geoffrey Burr, Kohji Hosokawa, HsinYu Tsai, Shubham Jain, Pritish Narayanan
-
Publication number: 20230086636
Abstract: Computations in artificial neural networks (ANNs) are accomplished using simple processing units, called neurons, with data embodied by the connections between neurons, called synapses, and by the strength of these connections, the synaptic weights. Crossbar arrays may be used to represent one layer of the ANN with non-volatile memory (NVM) elements at each crosspoint, where the conductance of the NVM elements may be used to encode the synaptic weights, and a highly parallel current summation on the array achieves a weighted-sum operation that is representative of the values of the output neurons. A method is outlined to transfer such neuron values from the outputs of one array to the inputs of a second array with no need for global clock synchronization, irrespective of the distances between the arrays, and to use such values at the next array, and/or to convert such values into digital bits at the next array.
Type: Application
Filed: November 28, 2022
Publication date: March 23, 2023
Inventors: Geoffrey W. Burr, Pritish Narayanan
-
Patent number: 11580373
Abstract: Computations in artificial neural networks (ANNs) are accomplished using simple processing units, called neurons, with data embodied by the connections between neurons, called synapses, and by the strength of these connections, the synaptic weights. Crossbar arrays may be used to represent one layer of the ANN with non-volatile memory (NVM) elements at each crosspoint, where the conductance of the NVM elements may be used to encode the synaptic weights, and a highly parallel current summation on the array achieves a weighted-sum operation that is representative of the values of the output neurons. A method is outlined to transfer such neuron values from the outputs of one array to the inputs of a second array with no need for global clock synchronization, irrespective of the distances between the arrays, and to use such values at the next array, and/or to convert such values into digital bits at the next array.
Type: Grant
Filed: January 20, 2017
Date of Patent: February 14, 2023
Assignee: International Business Machines Corporation
Inventors: Geoffrey W. Burr, Pritish Narayanan
-
Patent number: 11562240
Abstract: Implementing a convolutional neural network (CNN) includes configuring a crosspoint array to implement a convolution layer in the CNN. Convolution kernels of the layer are stored in crosspoint devices of the array. Computations for the CNN are performed by iterating a set of operations for a predetermined number of times. The operations include transmitting voltage pulses corresponding to a subpart of a vector of input data to the crosspoint array. The voltage pulses generate electric currents that are representative of performing multiplication operations at the crosspoint devices based on weight values stored at the crosspoint devices. A set of integrators accumulates an electric charge based on the output electric currents from the respective crosspoint devices. The crosspoint array outputs the accumulated charge after iterating for the predetermined number of times. The accumulated charge represents a multiply-add result of the vector of input data and the one or more convolution kernels.
Type: Grant
Filed: May 27, 2020
Date of Patent: January 24, 2023
Assignee: International Business Machines Corporation
Inventors: HsinYu Tsai, Geoffrey Burr, Pritish Narayanan
-
Publication number: 20220405554
Abstract: Embodiments herein disclose computer-implemented methods, computer program products and computer systems for balancing neural network weight asymmetries. The computer-implemented method may include providing a neural network with weights comprising one or more major conductance pairs and one or more minor conductance pairs. The method may further include programming the one or more major conductance pairs to force an inference output to an expected duration value, determining a positive weight coefficient based on the one or more major conductance pairs and a negative weight coefficient based on the one or more minor conductance pairs, determining one or more target weights based on one or more of the positive weight coefficient and the negative weight coefficient, programming the one or more minor conductance pairs to force the inference output to the expected duration value, and programming the one or more major conductance pairs with the one or more target weights.
Type: Application
Filed: June 17, 2021
Publication date: December 22, 2022
Inventors: Stefano Ambrogio, Geoffrey Burr, Charles Mackin, Pritish Narayanan, HsinYu Tsai
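The major/minor pair arithmetic can be sketched as follows (a hypothetical sketch: the function names and the significance factor `F = 4` are assumptions, not taken from the abstract). The major pair sets the coarse weight; the minor pair is then programmed to absorb whatever residual the major pair leaves:

```python
F = 4.0  # assumed significance ratio between major and minor pairs

def effective_weight(Gp, Gm, gp, gm, f=F):
    """W = F*(Gp - Gm) + (gp - gm): major conductance pair scaled by F,
    minor pair providing a fine correction."""
    return f * (Gp - Gm) + (gp - gm)

def correct_with_minor_pair(target, Gp, Gm, f=F):
    """Choose the minor pair to cancel the residual left after the major
    pair was (imperfectly) programmed, so the weight hits the target.
    Conductances are non-negative, so the residual's sign picks gp or gm."""
    residual = target - f * (Gp - Gm)
    return (residual, 0.0) if residual >= 0 else (0.0, -residual)
```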