Patents by Inventor Geoffrey Burr

Geoffrey Burr has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

LEARNED COLUMN-WEIGHTS FOR RAPID-ESTIMATION OF PROPERTIES OF AN ENTIRE EXCITATION VECTOR

Publication number: 20240086677

Abstract: A method includes receiving, at a neural network weight layer of an artificial neural network, an incoming excitation vector. The artificial neural network includes one or more operations requiring one or more scalar values, such as a mean or a standard deviation, to be computed across an output data vector of the artificial neural network. The method further includes using a predicted representation of the one or more scalar values during forward inference of the artificial neural network by the incoming excitation vector to apply the one or more operations to the output data vector, thus avoiding any computation needed to compute an exact representation of the one or more scalar values from the output data vector.

Type: Application

Filed: September 12, 2022

Publication date: March 14, 2024

Inventors: Geoffrey Burr, Malte Johannes Rasch
SPECIAL-PURPOSE DIGITAL-COMPUTE HARDWARE FOR EFFICIENT ELEMENT-WISE AGGREGATION, SCALING AND OFFSET

Publication number: 20240086192

Abstract: An efficient pipelined implementation of digital scaling, offset and aggregation operation supports element-by-element programmable scale and offset factors. The method includes time-multiplexed parallel pipelining of a plurality of digital data words, each of the plurality of digital data words encoding an N-bit signed integer, from one of a plurality of receive-registers through a datapath that can either (1) store the plurality of digital data words directly in a dedicated first memory, (2) store the plurality of digital data words directly in a dedicated second memory, or (3) direct the plurality of digital data words into a parallel set of fused-multiply-add units. The method further includes multiplying each digital data word by a corresponding data-word retrieved from the dedicated first memory to form product data words and adding the product data words to a corresponding data-word retrieved from the dedicated second memory to form an output sum-and-product data words.

Type: Application

Filed: September 12, 2022

Publication date: March 14, 2024

Inventors: Geoffrey Burr, Shubham Jain, Milos Stanisavljevic, Yasuteru Kohda
BURIED METAL SIGNAL RAIL FOR MEMORY ARRAYS

Publication number: 20240079326

Abstract: An IC memory device includes a substrate and an array of memory cells on the substrate. Each memory cell includes at least one memory cell transistor in a layer of the device adjacent to the substrate. In the same layer, the device also includes a plurality of shunt transistors. The device also includes a buried metal signal rail, which is disposed between the array of memory cells and the plurality of shunt transistors in a buried layer that is embedded into the substrate below the transistors. The device also includes single-layer vias, which are in same layer as the transistors and electrically connect the memory cell transistors to the shunt transistors through the buried metal signal rail.

Type: Application

Filed: September 6, 2022

Publication date: March 7, 2024

Inventors: Biswanath Senapati, SEIJI MUNETOH, Nicholas Anthony Lanzillo, Lawrence A. Clevenger, Geoffrey Burr, Kohji Hosokawa
Efficient tile mapping for row-by-row convolutional neural network mapping for analog artificial intelligence network inference

Patent number: 11868893

Abstract: Implementing a convolutional neural network (CNN) includes configuring a crosspoint array to implement a convolution layer in the CNN. Convolution kernels of the layer are stored in crosspoint devices of the array. Computations for the CNN are performed by iterating a set of operations for a predetermined number of times. The operations include transmitting voltage pulses corresponding to a subpart of a vector of input data to the crosspoint array. The voltage pulses generate electric currents that are representative of performing multiplication operations at the crosspoint device based on weight values stored at the crosspoint devices. A set of integrators accumulates an electric charge based on the output electric currents from the respective crosspoint devices. The crosspoint array outputs the accumulated charge after iterating for the predetermined number of times. The accumulated charge represents a multiply-add result of the vector of input data and the one or more convolution kernels.

Type: Grant

Filed: December 2, 2022

Date of Patent: January 9, 2024

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: HsinYu Tsai, Geoffrey Burr, Pritish Narayanan
Stochastic Bitstream Generation with In-Situ Function Mapping

Publication number: 20230419093

Abstract: Techniques for generating digital outputs as stochastic bitstreams with activation function mapping are provided. In one aspect, a system includes: a shared circuitry component including a RNG for generating a sequence of random addresses to read a random sequence of digital voltage references stored in a LUT, and a DAC for converting the random sequence of digital voltage references into random analog voltage references VL; and a comparator(s) for comparing the random analog voltage references VL and input analog voltages VN in sequences of comparisons to produce sequences of digital pulses as stochastic bitstreams. A system having multiple comparators for simultaneously comparing each of the random analog voltage references VL against more than one of the input analog voltages VN in parallel is also provided, as is a method for generating digital outputs from input analog voltages VN.

Type: Application

Filed: June 23, 2022

Publication date: December 28, 2023

Inventors: Pritish Narayanan, Geoffrey Burr
Selective application of multiple pulse durations to crossbar arrays

Patent number: 11823740

Abstract: A computer-implemented method, according to one embodiment, includes: causing a first subset of pulse width modulators in a crossbar array of memory cells to apply respective pulses to the crossbar array together at a same start time and end the respective pulses according to a predetermined distribution of times correlated to stored pulse width data for each pulse width modulator. The method also includes causing a second subset of pulse width modulators in the crossbar array to apply pulses to the crossbar array according to the predetermined distribution of times correlated to stored pulse width data for each pulse width modulator and end the respective pulses together at a same end time.

Type: Grant

Filed: December 8, 2021

Date of Patent: November 21, 2023

Assignee: International Business Machines Corporation

Inventors: Geoffrey Burr, Masatoshi Ishii, Pritish Narayanan, Paul Michael Solomon
TWO-DIMENSIONAL MESH FOR COMPUTE-IN-MEMORY ACCELERATOR ARCHITECTURE

Publication number: 20230316060

Abstract: Embodiments disclosed herein include a compute in-memory (CIM) accelerator architecture for deep neural network (DNN). The CIM accelerator architecture may include a first analog fabric engine having a plurality of compute in-memory (CIM) analog tiles. Each CIM analog tile may be configured to store a matrix of weight operands producing a vector of outputs from a vector of inputs, and perform in-memory computations. The first analog fabric may also include a plurality of compute cores. Each CIM analog tile and each compute core may include a microcontroller configured to execute a set of instructions. The first analog fabric may also include on-chip interconnects communicatively connecting all CIM analog tiles in the plurality of CIM analog tile to the compute cores.

Type: Application

Filed: March 31, 2022

Publication date: October 5, 2023

Inventors: Shubham Jain, HsinYu Tsai, Geoffrey Burr, Milos Stanisavljevic, Pritish Narayanan
CALIBRATING ANALOG RESISTIVE PROCESSING UNIT SYSTEM

Publication number: 20230306252

Abstract: A system comprises a processor, and a resistive processing unit (RPU) array. The RPU array comprises an array of cells which respectively comprise resistive memory devices that are programable to store weight values. The processor is configured to obtain a matrix comprising target weight values, program cells of the array of cells to store weight values in the RPU array, which correspond to respective target weight values of the matrix, and perform a calibration process to calibrate the RPU array. The calibration process comprises iteratively adjusting the target weight values of the matrix, and reprogramming the stored weight values of the matrix in the RPU array based on the respective adjusted target weight values, to reduce a variation between output lines of the RPU array with respect to multiply-and-accumulate distribution data that is generated and output from respective output lines of the RPU array during the calibration process.

Type: Application

Filed: March 25, 2022

Publication date: September 28, 2023

Inventors: Stefano Ambrogio, Pritish Narayanan, Geoffrey Burr
Efficient Data Layout and Alignment for Wide-Vector Accelerator Systems

Publication number: 20230305841

Abstract: Efficient data layout and alignment techniques for effectively executing AI workloads in wide-vector accelerator systems are provided. In one aspect, a method for processing AI workloads includes: logically dividing a data vector into a hierarchy of segments and sub-segments with each of the segments including more than one of the sub-segments, wherein each of the sub-segments includes words, and each of the words includes data-bits; and physically mapping the data-bits such that the words belonging to a same given one of the sub-segments are mapped contiguously across all of the segments. An AI accelerator system is also provided.

Type: Application

Filed: March 22, 2022

Publication date: September 28, 2023

Inventors: Shubham Jain, Geoffrey Burr, Yasuteru Kohda
SPLIT PULSE WIDTH MODULATION TO REDUCE CROSSBAR ARRAY INTEGRATION TIME

Publication number: 20230198511

Abstract: A computer-implemented method, according to one embodiment, includes: causing a multi-bit input to be split into two or more chunks, where each of the two or more chunks include at least one individual bit. Each of the two or more chunks are also converted into a respective pulse width modulated signal, and a partial result is generated in digital form for each of the respective pulse width modulated signals. Each of the partial results are scaled by a respective significance factor corresponding to each of the two or more chunks, and the scaled partial results are also accumulated.

Type: Application

Filed: December 17, 2021

Publication date: June 22, 2023

Inventors: Geoffrey Burr, Masatoshi Ishii, Pritish Narayanan
MEMORY CELL IN WAFER BACKSIDE

Publication number: 20230187314

Abstract: A memory cell in a backside of a wafer and methods of forming the memory cell are described. A buried metal structure can be formed through a frontside of a substrate. At least one device can be formed on the frontside of a substrate, where the at least one device can be connected to the buried metal structure in the substrate. A through silicon via (TSV) can be formed through a backside of the substrate, where the TSV can be connected to the buried metal structure. A memory cell can be formed on the backside of the substrate, where the memory cell can be connected to the TSV.

Type: Application

Filed: December 15, 2021

Publication date: June 15, 2023

Inventors: Biswanath Senapati, Seiji Munetoh, Nicholas Anthony Lanzillo, Lawrence A. Clevenger, Geoffrey Burr, Kohji Hosokawa
SELECTIVE APPLICATION OF MULTIPLE PULSE DURATIONS TO CROSSBAR ARRAYS

Publication number: 20230178150

Abstract: A computer-implemented method, according to one embodiment, includes: causing a first subset of pulse width modulators in a crossbar array of memory cells to apply respective pulses to the crossbar array together at a same start time and end the respective pulses according to a predetermined distribution of times correlated to stored pulse width data for each pulse width modulator. The method also includes causing a second subset of pulse width modulators in the crossbar array to apply pulses to the crossbar array according to the predetermined distribution of times correlated to stored pulse width data for each pulse width modulator and end the respective pulses together at a same end time.

Type: Application

Filed: December 8, 2021

Publication date: June 8, 2023

Inventors: Geoffrey Burr, Masatoshi Ishii, Pritish Narayanan, Paul Michael Solomon
ORGANIZING SEQUENCES FOR TRANSFORMER COMPUTE

Publication number: 20230169305

Abstract: A computer-implemented method according to one embodiment includes determining a threshold sequence-size for a transformer; organizing a batch of sequences according to the threshold sequence-size; and inputting the organized batch of sequences into the transformer.

Type: Application

Filed: November 30, 2021

Publication date: June 1, 2023

Inventors: Geoffrey Burr, HsinYu Tsai, Shubham Jain
TRANSLATING ARTIFICIAL NEURAL NETWORK SOFTWARE WEIGHTS TO HARDWARE-SPECIFIC ANALOG CONDUCTANCES

Publication number: 20230105568

Abstract: Translation of artificial neural network (ANN) software weights to analog conductances in the presence of conductance non-idealities for deployment to an analog non-volatile memory device is provided. A plurality of target synaptic weights of an artificial neural network is read. The plurality of target synaptic weights is mapped to a plurality of conductance values, each of the plurality of target synaptic weights being mapped to at least one of the plurality of conductance values. A hardware model is applied to the plurality of conductance values, thereby determining a plurality of hardware-adjusted conductance values, the hardware model corresponding to an analog non-volatile memory device. The plurality of hardware-adjusted conductance values is mapped to a plurality of hardware-adjusted synaptic weights. The plurality of conductance values is optimized in order to minimize an error metric between the target synaptic weights and the hardware-adjusted synaptic weights.

Type: Application

Filed: October 1, 2021

Publication date: April 6, 2023

Inventors: Charles Mackin, Geoffrey Burr, Jonathan Paul Timcheck
IMPLICIT VECTOR CONCATENATION WITHIN 2D MESH ROUTING

Publication number: 20230100564

Abstract: Arrays of neural cores are provided. Each neural core comprises ordered input wires ordered output wires, and synapses, each of the synapses operatively coupled to one of the input wires and one of the output wires. A plurality of signal wires is provided. At least one of the signal wires is disposed along each dimension of the array of neural cores. A plurality of routers is provided, each of which is operatively coupled to one of the neural cores and to at least one of the signal wires along each of the dimensions of the array of neural cores. Each of the routers selectively routes a signal from the at least one signal wire to its coupled neural core. Each of the routers selectively routes a signal from its coupled neural core to the at least one signal wire. The routers segment the ordered input wires and the ordered output wires into segments and independently routes the signals of each segment.

Type: Application

Filed: September 29, 2021

Publication date: March 30, 2023

Inventors: Geoffrey Burr, Kohji Hosokawa, HsinYu Tsai, Shubham Jain, Pritish Narayanan
FLEXIBLE AND ENERGY-EFFICIENT 2-D ROUTER MESH

Publication number: 20230096894

Abstract: An array of neural cores has at least two dimensions. Each of the neural cores comprises ordered input wires, ordered output wires, and synapses, each of the synapses operatively coupled to one of the input wires and one of the output wires. Signal wires are provided. At least one of the signal wires is disposed along each dimension of the array of neural cores. Each of the signal wires is disposed along at least one dimension of the array. Routers are provided, each of which is operatively coupled to (i) one of the neural cores and (ii) at least two of the signal wires, one along each of the dimensions of the array of neural cores. Each of the routers is configured to selectively route a signal from one of its at least two coupled signal wires to its coupled neural core. Each of the routers is configured to selectively route a signal from its coupled neural core to one of its at least two coupled signal wires.

Type: Application

Filed: September 28, 2021

Publication date: March 30, 2023

Inventors: Geoffrey Burr, Kohji Hosokawa
EFFICIENT TILE MAPPING FOR ROW-BY-ROW CONVOLUTIONAL NEURAL NETWORK MAPPING FOR ANALOG ARTIFICIAL INTELLIGENCE NETWORK INFERENCE

Publication number: 20230100139

Abstract: Implementing a convolutional neural network (CNN) includes configuring a crosspoint array to implement a convolution layer in the CNN. Convolution kernels of the layer are stored in crosspoint devices of the array. Computations for the CNN are performed by iterating a set of operations for a predetermined number of times. The operations include transmitting voltage pulses corresponding to a subpart of a vector of input data to the crosspoint array. The voltage pulses generate electric currents that are representative of performing multiplication operations at the crosspoint device based on weight values stored at the crosspoint devices. A set of integrators accumulates an electric charge based on the output electric currents from the respective crosspoint devices. The crosspoint array outputs the accumulated charge after iterating for the predetermined number of times. The accumulated charge represents a multiply-add result of the vector of input data and the one or more convolution kernels.

Type: Application

Filed: December 2, 2022

Publication date: March 30, 2023

Inventors: HsinYu Tsai, Geoffrey Burr, Pritish Narayanan
Efficient tile mapping for row-by-row convolutional neural network mapping for analog artificial intelligence network inference

Patent number: 11562240

Abstract: Implementing a convolutional neural network (CNN) includes configuring a crosspoint array to implement a convolution layer in the CNN. Convolution kernels of the layer are stored in crosspoint devices of the array. Computations for the CNN are performed by iterating a set of operations for a predetermined number of times. The operations include transmitting voltage pulses corresponding to a subpart of a vector of input data to the crosspoint array. The voltage pulses generate electric currents that are representative of performing multiplication operations at the crosspoint device based on weight values stored at the crosspoint devices. A set of integrators accumulates an electric charge based on the output electric currents from the respective crosspoint devices. The crosspoint array outputs the accumulated charge after iterating for the predetermined number of times. The accumulated charge represents a multiply-add result of the vector of input data and the one or more convolution kernels.

Type: Grant

Filed: May 27, 2020

Date of Patent: January 24, 2023

Assignee: International Business Machines Corporation

Inventors: Hsinyu Tsai, Geoffrey Burr, Pritish Narayanan
CALIBRATING PERIPHERAL VARIABILITY

Publication number: 20220405554

Abstract: Embodiments herein disclose computer-implemented methods, computer program products and computer systems for balancing neural network weight asymmetries. The computer-implemented method may include providing a neural network with weights comprising one or more major conductance pairs and one or more minor conductance pairs. The method may further include programming the one or more major conductance pairs to force an inference output to an expected duration value, determining a positive weight coefficient based on the one or more major conductance pairs and a negative weight coefficient based on the one or more minor conductance pairs, determining one or more target weights based on one or more of the positive weight coefficient and the negative weight coefficient, programming the one or more minor conductance pairs to force the inference output to the expected duration value, and programming the one or more major conductance pairs with the one or more target weights.

Type: Application

Filed: June 17, 2021

Publication date: December 22, 2022

Inventors: Stefano Ambrogio, Geoffrey Burr, Charles Mackin, Pritish Narayanan, HsinYu Tsai
Distributing device array currents across segment mirrors

Patent number: 11488664

Abstract: Distributing multiply-accumulate currents across segment mirrors by providing a circuit including an array of resistive elements, the array including rows and columns and first stage current mirrors, each of the first stage current mirrors being electrically coupled to a segment, wherein the segment comprises a columnar subset of the resistive elements, providing, by the array, a vector of current outputs equal to an analog vector-matrix product between a vector of voltage inputs to the array and a matrix of analog resistive weights within the array, wherein the voltage inputs encode a vector of analog input values, wherein each row of resistive elements corresponds to a specific voltage input, determining a score for each of the rows, determining a ranking of the rows of the array according to the score of each row, and mapping each row to a segment according to the ranking.

Type: Grant

Filed: October 13, 2020

Date of Patent: November 1, 2022

Assignee: International Business Machines Corporation

Inventors: Charles Mackin, Pritish Narayanan, Geoffrey Burr

1 2 next