Patents by Inventor Jungwook CHOI
Jungwook CHOI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20210064372
Abstract: An apparatus includes a memory and a processor coupled to the memory. The processor includes first and second sets of arithmetic units having first and second precision for floating-point computations, the second precision being lower than the first precision. The processor is configured to obtain a machine learning model trained in the first precision, to utilize the second set of arithmetic units to perform inference on input data, to utilize the first set of arithmetic units to generate feedback for updating parameters of the second set of arithmetic units based on the inference performed on the input data by the second set of arithmetic units, to tune parameters of the second set of arithmetic units based at least in part on the feedback generated by the first set of arithmetic units, and to utilize the second set of arithmetic units with the tuned parameters to generate inference results.
Type: Application
Filed: September 3, 2019
Publication date: March 4, 2021
Inventors: Xiao Sun, Chia-Yu Chen, Naigang Wang, Jungwook Choi, Kailash Gopalakrishnan
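A minimal sketch of the scheme this abstract describes, assuming float32 stands in for the first (higher) precision, emulated float16 for the second (lower) precision, and a single output scale as the tuned parameter; the names and the tuning rule are illustrative, not taken from the patent:

```python
import numpy as np

def low_precision_matmul(x, w, scale=1.0):
    # Emulate the second (lower-precision) arithmetic units with float16.
    y = (x.astype(np.float16) @ w.astype(np.float16)).astype(np.float32)
    return scale * y

def tune_scale(x, w, scale, lr=0.5):
    # The first (higher-precision) units produce the reference output and
    # the feedback used to tune the low-precision path's scale parameter.
    ref = x.astype(np.float32) @ w.astype(np.float32)
    base = low_precision_matmul(x, w, scale=1.0)
    best = np.sum(ref * base) / (np.sum(base * base) + 1e-12)  # least squares
    return (1 - lr) * scale + lr * best

rng = np.random.default_rng(0)
x, w = rng.normal(size=(8, 64)), rng.normal(size=(64, 32))
scale = 1.0
for _ in range(5):
    scale = tune_scale(x, w, scale)
y = low_precision_matmul(x, w, scale)   # inference with the tuned parameter
```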
-
Publication number: 20210064974
Abstract: A neuromorphic device includes a plurality of first control lines, a plurality of second control lines and a matrix of resistive processing unit cells. Each resistive processing unit cell is electrically connected with one of the first control lines and one of the second control lines. A given resistive processing unit cell includes a first resistive device and a second resistive device. The first resistive device is a positively weighted resistive device and the second resistive device is a negatively weighted resistive device.
Type: Application
Filed: August 30, 2019
Publication date: March 4, 2021
Inventors: Youngseok Kim, Jungwook Choi, Seyoung Kim, Chun-Chen Yeh
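A minimal sketch of a differential resistive-processing-unit array, assuming the standard reading that each cell's effective weight is the positively weighted device's conductance minus the negatively weighted device's conductance; the class and value ranges are illustrative:

```python
import numpy as np

class RPUArray:
    def __init__(self, rows, cols, rng):
        # Conductances are non-negative; signs come from pairing the
        # positively and negatively weighted resistive devices.
        self.g_plus = rng.uniform(0.0, 1.0, size=(rows, cols))
        self.g_minus = rng.uniform(0.0, 1.0, size=(rows, cols))

    def weights(self):
        # Effective weight of each cell: difference of the device pair.
        return self.g_plus - self.g_minus

    def forward(self, v):
        # Voltages on the first control lines produce currents that sum
        # along the second control lines (Ohm's law + Kirchhoff).
        return v @ self.weights()

rng = np.random.default_rng(1)
rpu = RPUArray(16, 4, rng)
print(rpu.forward(rng.normal(size=16)))
```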
-
Publication number: 20210064954
Abstract: A convolutional neural network includes a front layer, a back layer, and a plurality of other layers that are connected between the front layer and the back layer. One of the other layers is a transition layer. A first precision is assigned to activations of neurons from the front layer back to the transition layer and a second precision is assigned to activations of the neurons from the transition layer back to the back layer. A third precision is assigned to weights of inputs to neurons from the front layer back to the transition layer and a fourth precision is assigned to weights of inputs to the neurons from the transition layer back to the back layer. In some embodiments, the layers forward of the transition layer have a different convolutional kernel than the layers rearward of the transition layer.
Type: Application
Filed: August 27, 2019
Publication date: March 4, 2021
Inventors: Jungwook Choi, Swagath Venkataramani, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan
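A minimal sketch of assigning the four precisions around a transition layer; the bit-widths and the plan format are illustrative, not values from the patent:

```python
# Layers from the front layer through the transition layer get the first
# activation precision and third weight precision; layers after it get the
# second and fourth. The 8/4-bit and 4/2-bit choices are illustrative.
def assign_precisions(num_layers, transition,
                      act_bits=(8, 4), wgt_bits=(4, 2)):
    plan = []
    for layer in range(num_layers):
        front = layer <= transition
        plan.append({
            "layer": layer,
            "activation_bits": act_bits[0] if front else act_bits[1],
            "weight_bits": wgt_bits[0] if front else wgt_bits[1],
        })
    return plan

for entry in assign_precisions(num_layers=6, transition=2):
    print(entry)
```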
-
Publication number: 20210064985
Abstract: An apparatus for training and inferencing a neural network includes circuitry that is configured to generate a first weight having a first format including a first number of bits based at least in part on a second weight having a second format including a second number of bits and a residual having a third format including a third number of bits. The second number of bits and the third number of bits are each less than the first number of bits. The circuitry is further configured to update the second weight based at least in part on the first weight and to update the residual based at least in part on the updated second weight and the first weight. The circuitry is further configured to update the first weight based at least in part on the updated second weight and the updated residual.
Type: Application
Filed: September 3, 2019
Publication date: March 4, 2021
Inventors: Xiao Sun, Jungwook Choi, Naigang Wang, Chia-Yu Chen, Kailash Gopalakrishnan
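A minimal sketch of the weight/residual scheme, assuming float32 as the wide first format and float16 for both the narrower second weight and the residual; these format choices are illustrative:

```python
import numpy as np

def decompose(w32):
    # Narrow second weight plus narrow residual approximate the wide weight.
    w16 = w32.astype(np.float16)
    r16 = (w32 - w16.astype(np.float32)).astype(np.float16)
    return w16, r16

def reconstruct(w16, r16):
    # Update the wide first weight from the narrow weight and residual.
    return w16.astype(np.float32) + r16.astype(np.float32)

rng = np.random.default_rng(2)
w32 = rng.normal(size=1000).astype(np.float32)
w16, r16 = decompose(w32)
err_single = np.abs(w32 - w16.astype(np.float32)).max()
err_pair = np.abs(w32 - reconstruct(w16, r16)).max()
print(err_single, err_pair)   # the residual recovers most of the lost bits
```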
-
Publication number: 20200401413
Abstract: Various embodiments are provided for using a reduced-precision-based, programmable, single instruction multiple data (SIMD) dataflow architecture in a computing environment. One or more instructions may be communicated between a plurality of execution units (EUs) operating in parallel within each one of a plurality of execution elements (EEs).
Type: Application
Filed: June 20, 2019
Publication date: December 24, 2020
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Kailash GOPALAKRISHNAN, Sunil SHUKLA, Jungwook CHOI, Silvia MUELLER, Bruce FLEISCHER, Vijayalakshmi SRINIVASAN, Ankur AGRAWAL, Jinwook OH
-
Patent number: 10838868
Abstract: Embodiments for implementing a communicating memory between a plurality of computing components are provided. In one embodiment, an apparatus comprises a plurality of memory components residing on a processing chip, the plurality of memory components interconnected between a plurality of processing elements of at least one processing core of the processing chip and at least one external memory component external to the processing chip. The apparatus further comprises a plurality of load agents and a plurality of store agents on the processing chip, each interfacing with the plurality of memory components. Each of the plurality of load agents and the plurality of store agents execute an independent program specifying a destination of data transacted between the plurality of memory components, the at least one external memory component, and the plurality of processing elements.
Type: Grant
Filed: March 7, 2019
Date of Patent: November 17, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Chia-Yu Chen, Jungwook Choi, Brian Curran, Bruce Fleischer, Kailash Gopalakrishnan, Jinwook Oh, Sunil K. Shukla, Vijayalakshmi Srinivasan, Swagath Venkataramani
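A minimal sketch of communicating memories driven by independent load and store agents, where each program step names the destination of the data it moves; the three-field step format and the memory names are illustrative:

```python
def run_agent(program, memories):
    # Each agent runs its own independent program; each step names the
    # source memory, an address, and the destination memory.
    for src, addr, dst in program:
        memories[dst][addr] = memories[src][addr]

memories = {
    "external": {i: i * i for i in range(4)},   # off-chip memory component
    "scratchpad": {},                           # on-chip memory component
    "pe_regs": {},                              # processing-element registers
}
load_program = [("external", i, "scratchpad") for i in range(4)]
store_program = [("scratchpad", i, "pe_regs") for i in range(4)]

run_agent(load_program, memories)    # load agent fills on-chip memory
run_agent(store_program, memories)   # store agent feeds processing elements
print(memories["pe_regs"])
```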
-
Publication number: 20200356371
Abstract: Various embodiments are provided for reusing an operand in an instruction set architecture (ISA) by one or more processors in a computing system. An instruction may specify that an operand register for a selected operand retain operand data used by a previous instruction. The operand data in the operand register may be reused by the instruction.
Type: Application
Filed: May 8, 2019
Publication date: November 12, 2020
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bruce FLEISCHER, Sunil SHUKLA, Vijayalakshmi SRINIVASAN, Jungwook CHOI
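A minimal sketch of operand reuse, with a REUSE marker standing in for the instruction encoding that tells the decoder to retain an operand register's previous contents; the marker and the toy ISA are illustrative:

```python
REUSE = object()   # stands in for the instruction's reuse encoding

def execute(program):
    operand_a = operand_b = 0.0
    results = []
    for op, a, b in program:
        # A REUSE marker retains the operand data of the prior instruction.
        operand_a = operand_a if a is REUSE else a
        operand_b = operand_b if b is REUSE else b
        results.append(operand_a + operand_b if op == "add"
                       else operand_a * operand_b)
    return results

program = [
    ("add", 2.0, 3.0),       # loads both operands: 5.0
    ("mul", REUSE, 4.0),     # reuses operand A (2.0): 8.0
    ("mul", REUSE, REUSE),   # reuses both operands (2.0 * 4.0): 8.0
]
print(execute(program))
```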
-
Publication number: 20200311536
Abstract: A minibatch in a neural network execution may be dynamically resized based on on-chip memory. For example, a size of the minibatch is configured such that the minibatch fits within on-chip memory. The size of the minibatch may be resized for a sequence of layers in the neural network execution. A next layer's execution can commence responsive to the resized minibatch being completed in a previous layer, without having to wait for all of the minibatch to be completed in the previous layer.
Type: Application
Filed: March 25, 2019
Publication date: October 1, 2020
Inventors: Swagath Venkataramani, Vijayalakshmi Srinivasan, Jungwook Choi
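A minimal sketch of resizing a minibatch to fit on-chip memory, layer by layer; the halving policy and the byte counts are illustrative:

```python
def fit_minibatch(requested, bytes_per_sample, onchip_bytes):
    # Shrink the minibatch until its activation footprint fits on chip.
    size = requested
    while size > 1 and size * bytes_per_sample > onchip_bytes:
        size //= 2
    return size

layers = [
    {"name": "conv1", "bytes_per_sample": 64 * 1024},
    {"name": "conv2", "bytes_per_sample": 16 * 1024},
    {"name": "fc", "bytes_per_sample": 4 * 1024},
]
for layer in layers:
    mb = fit_minibatch(256, layer["bytes_per_sample"], onchip_bytes=2**22)
    # The next layer can start as soon as this resized minibatch finishes
    # in the previous layer.
    print(layer["name"], mb)
```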
-
Publication number: 20200285579
Abstract: Embodiments for implementing a communicating memory between a plurality of computing components are provided. In one embodiment, an apparatus comprises a plurality of memory components residing on a processing chip, the plurality of memory components interconnected between a plurality of processing elements of at least one processing core of the processing chip and at least one external memory component external to the processing chip. The apparatus further comprises a plurality of load agents and a plurality of store agents on the processing chip, each interfacing with the plurality of memory components. Each of the plurality of load agents and the plurality of store agents execute an independent program specifying a destination of data transacted between the plurality of memory components, the at least one external memory component, and the plurality of processing elements.
Type: Application
Filed: March 7, 2019
Publication date: September 10, 2020
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Chia-Yu CHEN, Jungwook CHOI, Brian CURRAN, Bruce FLEISCHER, Kailash GOPALAKRISHNAN, Jinwook OH, Sunil K. SHUKLA, Vijayalakshmi SRINIVASAN, Swagath VENKATARAMANI
-
Patent number: 10769238
Abstract: Techniques facilitating matrix multiplication on a systolic array are provided. A computer-implemented method can comprise populating, by a system operatively coupled to a processor, respective first registers of one or more processing elements of a systolic array structure with respective input data bits of a first data matrix. The one or more processing elements can comprise a first processing element that comprises a first input data bit of the first data matrix and a first activation bit of a second data matrix. The method can also include determining, by the system, at the first processing element, a first partial sum of a third data matrix. Further, the method can include streaming, by the system, the first partial sum of the third data matrix from the first processing element.
Type: Grant
Filed: September 19, 2019
Date of Patent: September 8, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Chia-Yu Chen, Jungwook Choi, Kailash Gopalakrishnan, Victor Han, Vijayalakshmi Srinivasan, Jintao Zhang
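A minimal sketch of the systolic dataflow, with the first matrix resident in the processing elements' first registers and the second matrix streamed through while partial sums are accumulated and streamed out; the scheduling is simplified (no skewed wavefront timing):

```python
import numpy as np

def systolic_matmul(weights, activations):
    rows, inner = weights.shape
    inner2, cols = activations.shape
    assert inner == inner2
    out = np.zeros((rows, cols))
    # Each (i, k) PE holds weights[i, k]; activation columns stream past it
    # and partial sums flow to the neighboring PE before streaming out.
    for j in range(cols):                 # one streamed activation column
        partial = np.zeros(rows)
        for k in range(inner):            # one systolic step
            partial += weights[:, k] * activations[k, j]
        out[:, j] = partial               # partial sums streamed out
    return out

rng = np.random.default_rng(3)
a, b = rng.normal(size=(4, 5)), rng.normal(size=(5, 3))
print(np.allclose(systolic_matmul(a, b), a @ b))  # True
```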
-
Publication number: 20200249910
Abstract: In an embodiment, a method includes configuring a specialized circuit for floating point computations using numbers represented by a hybrid format, wherein the hybrid format includes a first format and a second format. In the embodiment, the method includes operating the configured specialized circuit to store an approximation of a numeric value in the first format during a forward pass for training a deep learning network. In the embodiment, the method includes operating the configured specialized circuit to store an approximation of a second numeric value in the second format during a backward pass for training the deep learning network.
Type: Application
Filed: February 6, 2019
Publication date: August 6, 2020
Applicant: International Business Machines Corporation
Inventors: Naigang Wang, Jungwook Choi, Kailash Gopalakrishnan, Ankur Agrawal, Silvia Melitta Mueller
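A minimal sketch of rounding into two different formats for the two passes. The (exponent, mantissa) splits of 4/3 for the forward pass and 5/2 for the backward pass follow the commonly cited hybrid-FP8 choice and are an assumption here, not quoted from the patent; range handling is simplified (no subnormals):

```python
import numpy as np

def round_to_format(x, exp_bits, man_bits):
    # Round-to-nearest into a sign/exponent/mantissa format; exponent range
    # is clipped rather than modeling overflow and subnormals exactly.
    sign = np.sign(x)
    mag = np.abs(x)
    e = np.floor(np.log2(np.where(mag > 0, mag, 1.0)))
    bias = 2 ** (exp_bits - 1) - 1
    e = np.clip(e, -bias, bias)
    step = 2.0 ** (e - man_bits)          # mantissa granularity at scale e
    return sign * np.round(mag / step) * step

rng = np.random.default_rng(4)
acts = rng.normal(size=5)
grads = 1e-3 * rng.normal(size=5)
fwd = round_to_format(acts, exp_bits=4, man_bits=3)   # forward-pass format
bwd = round_to_format(grads, exp_bits=5, man_bits=2)  # backward-pass format
print(fwd, bwd)
```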
-
Publication number: 20200226459
Abstract: A processor receives input data and provides the input data to a first neural network including a first neural network model. The first neural network model has a first numerical precision level. A first feature vector is generated from the input data using the first neural network. The input data is provided to a second neural network including a second neural network model. The second neural network model has a second numerical precision level different from the first numerical precision level. A second feature vector is generated from the input data using the second neural network. A difference metric is computed between the first feature vector and the second feature vector. The difference metric is indicative of whether the input data includes adversarial data.
Type: Application
Filed: January 11, 2019
Publication date: July 16, 2020
Applicant: International Business Machines Corporation
Inventors: Chia-Yu Chen, Pin-Yu Chen, Pierce I-Jen Chuang, Richard Chen, Jungwook Choi, Kailash Gopalakrishnan
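A minimal sketch of the dual-precision detector, using one linear layer plus ReLU as the model and emulated float16 as the second precision; the normalized difference metric and any threshold are illustrative:

```python
import numpy as np

def features(x, w, dtype):
    # Run the same model at a given numerical precision level.
    z = x.astype(dtype) @ w.astype(dtype)
    return np.maximum(z, 0).astype(np.float32)

def adversarial_score(x, w):
    f_hi = features(x, w, np.float32)   # first (higher) precision model
    f_lo = features(x, w, np.float16)   # second (lower) precision model
    # Large relative disagreement between the feature vectors is taken as
    # an indication of adversarial input.
    return np.linalg.norm(f_hi - f_lo) / (np.linalg.norm(f_hi) + 1e-12)

rng = np.random.default_rng(5)
w = rng.normal(size=(64, 16))
x = rng.normal(size=64)
print("score:", adversarial_score(x, w))  # compare against a threshold
```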
-
Patent number: 10657442
Abstract: A compute matrix is configured to include a set of compute units, each compute unit including a multiplier and an accumulator, each of the multiplier and the accumulator formed using at least one floating point unit (FPU). An accumulator array is configured to include a set of external accumulators. The compute matrix is operated to produce a chunk dot-product using a first chunk of a first input vector and a first chunk of a second input vector. The accumulator array is operated to output a dot-product of the first input vector and the second input vector using the chunk dot-product.
Type: Grant
Filed: April 19, 2018
Date of Patent: May 19, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Naigang Wang, Jungwook Choi, Kailash Gopalakrishnan, Daniel Brand
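A minimal sketch of the chunked dot product: the compute matrix produces one chunk dot-product at a time and an external accumulator combines the chunks into the full result; the chunk size is illustrative:

```python
import numpy as np

def dot_product(x, y, chunk=8):
    external_acc = 0.0
    for start in range(0, len(x), chunk):
        xc, yc = x[start:start + chunk], y[start:start + chunk]
        chunk_dot = float(np.sum(xc * yc))   # compute matrix: one chunk
        external_acc += chunk_dot            # accumulator array: combine
    return external_acc

rng = np.random.default_rng(6)
x, y = rng.normal(size=20), rng.normal(size=20)
print(np.isclose(dot_product(x, y), float(x @ y)))  # True
```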
-
Publication number: 20200134105
Abstract: A method for improving performance of predefined Deep Neural Network (DNN) convolution processing on a computing device includes inputting parameters, as input data, into a processor that formalizes a design space exploration of a convolution mapping on a predefined computer architecture that will execute the predefined convolution processing. The parameters are predefined as guided by a specification for the predefined convolution processing to be implemented by the convolution mapping and by a microarchitectural specification for the processor that will execute the predefined convolution processing. The processor calculates performance metrics for executing the predefined convolution processing on the computing device, as functions of the predefined parameters, as proxy estimates of the performance of different possible design choices for implementing the predefined convolution processing.
Type: Application
Filed: October 31, 2018
Publication date: April 30, 2020
Inventors: Chia-Yu Chen, Jungwook Choi, Kailash Gopalakrishnan, Vijayalakshmi Srinivasan, Swagath Venkataramani, Jintao Zhang
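A minimal sketch of scoring candidate convolution mappings with closed-form proxy metrics rather than executing them; the cost model, hardware constants, and tile parameters are all illustrative assumptions:

```python
from itertools import product

# Illustrative hardware and layer constants (assumptions, not from the patent).
NUM_PES, ONCHIP_BYTES = 256, 512 * 1024
OUT_CH, IN_CH, PIXELS, KERNEL = 128, 64, 56 * 56, 9
TOTAL_MACS = OUT_CH * IN_CH * PIXELS * KERNEL

def score(tile_oc, tile_ic):
    # Proxy performance metrics as functions of the design parameters.
    footprint = tile_oc * tile_ic * KERNEL * 2          # 16-bit weight tile
    if footprint > ONCHIP_BYTES:
        return None                                     # infeasible point
    utilization = min(1.0, tile_oc * tile_ic / NUM_PES)
    cycles = TOTAL_MACS / (NUM_PES * utilization)       # compute-time proxy
    refetches = IN_CH // tile_ic                        # memory-traffic proxy
    return cycles + 1000 * refetches

candidates = {(oc, ic): s
              for oc, ic in product([8, 16, 32, 64], repeat=2)
              if (s := score(oc, ic)) is not None}
print("best tiling:", min(candidates, key=candidates.get))
```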
-
Patent number: 10592208
Abstract: A specialized circuit is configured for floating point computations using numbers represented by a very low precision format (VLP format). The VLP format includes less than sixteen bits and is apportioned into a sign bit, exponent bits (e), and mantissa bits (p). The configured specialized circuit is operated to store an approximation of a numeric value in the VLP format, where the approximation is represented as a function of a multiple of a fraction, where the fraction is an inverse of a number of discrete values that can be represented using only the mantissa bits.
Type: Grant
Filed: May 7, 2018
Date of Patent: March 17, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Naigang Wang, Kailash Gopalakrishnan, Jungwook Choi, Silvia M. Mueller, Ankur Agrawal, Daniel Brand
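A minimal sketch of the VLP rounding rule, where the stored approximation is a multiple of the fraction 1/2**p scaled by a power of two; the 8-bit (1, 5, 2) split is an illustrative choice, and range handling is simplified:

```python
import numpy as np

E_BITS, P_BITS = 5, 2                     # illustrative sub-16-bit split
BIAS = 2 ** (E_BITS - 1) - 1
FRACTION = 1.0 / 2 ** P_BITS              # inverse of the mantissa's
                                          # number of discrete values

def vlp_round(x):
    sign = np.sign(x)
    mag = np.abs(x)
    e = np.clip(np.floor(np.log2(np.where(mag > 0, mag, 1.0))),
                -BIAS, BIAS)
    # Approximation = (multiple of FRACTION) * 2**e.
    multiple = np.round(mag / (2.0 ** e) / FRACTION)
    return sign * multiple * FRACTION * 2.0 ** e

x = np.array([0.1234, -3.7, 42.0, 0.0])
print(vlp_round(x))
```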
-
Patent number: 10565285
Abstract: A convolutional lowering component (CoLor component) between processor and memory units (or within a memory hierarchy) maps a location in a lowered matrix to an equivalent location in a non-lowered matrix and provides auto zero padding in computationally heavy convolutional layers. An identification component identifies processing components that execute computations in deep neural networks (DNNs) in which convolutions are realized as general matrix-to-matrix multiplication (GEMM) operations, and identifies a subset of the processing components that store deep neural network (DNN) features in a non-lowered form. An address translation component translates address requests, generated by the subset of processing components to a memory subsystem, from a lowered index form to a non-lowered index form.
Type: Grant
Filed: December 18, 2017
Date of Patent: February 18, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Jungwook Choi, Bruce Fleischer, Vijayalakshmi Srinivasan, Swagath Venkataramani
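A minimal sketch of translating a lowered (im2col) matrix address into a non-lowered feature-map address with automatic zero padding, so the lowered copy never has to be materialized; the layout conventions are illustrative:

```python
def translate(row, col, in_ch, height, width, k, pad=1):
    # Lowered layout: row indexes an output pixel, col indexes (c, kh, kw).
    out_w = width + 2 * pad - k + 1
    oy, ox = divmod(row, out_w)
    c, rem = divmod(col, k * k)
    assert c < in_ch, "column index outside the lowered matrix"
    kh, kw = divmod(rem, k)
    y, x = oy + kh - pad, ox + kw - pad
    if 0 <= y < height and 0 <= x < width:
        return ("feature_map", c, y, x)   # equivalent non-lowered address
    return ("zero",)                      # auto zero padding

print(translate(row=0, col=0, in_ch=3, height=4, width=4, k=3))  # padding
print(translate(row=5, col=4, in_ch=3, height=4, width=4, k=3))
```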
-
Publication number: 20200012706
Abstract: Techniques facilitating matrix multiplication on a systolic array are provided. A computer-implemented method can comprise populating, by a system operatively coupled to a processor, respective first registers of one or more processing elements of a systolic array structure with respective input data bits of a first data matrix. The one or more processing elements can comprise a first processing element that comprises a first input data bit of the first data matrix and a first activation bit of a second data matrix. The method can also include determining, by the system, at the first processing element, a first partial sum of a third data matrix. Further, the method can include streaming, by the system, the first partial sum of the third data matrix from the first processing element.
Type: Application
Filed: September 19, 2019
Publication date: January 9, 2020
Inventors: Chia-Yu Chen, Jungwook Choi, Kailash Gopalakrishnan, Victor Han, Vijayalakshmi Srinivasan, Jintao Zhang
-
Publication number: 20200005125
Abstract: A compensated deep neural network (compensated-DNN) is provided. A first vector having a set of components and a second vector having a set of corresponding components are received. A component of the first vector includes a first quantized value and a first compensation instruction, and a corresponding component of the second vector includes a second quantized value and a second compensation instruction. The first quantized value is multiplied with the second quantized value to compute a raw product value. The raw product value is compensated for a quantization error according to the first and second compensation instructions to produce a compensated product value. The compensated product value is added into an accumulated value for the dot product. The accumulated value is converted into an output vector of the dot product. The output vector includes an output quantized value and an output compensation instruction.
Type: Application
Filed: June 27, 2018
Publication date: January 2, 2020
Inventors: Swagath Venkataramani, Shubham Jain, Vijayalakshmi Srinivasan, Jungwook Choi, Leland Chang
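A minimal sketch of the compensated dot product, with the compensation instruction reduced to the sign of each value's quantization error plus a fixed error estimate; this encoding is an illustrative stand-in for the patent's instruction format:

```python
import numpy as np

STEP = 1 / 16                 # illustrative quantization step
ERR_EST = STEP / 4            # assumed average quantization error magnitude

def encode(x):
    # Component = quantized value + compensation instruction (error sign).
    q = np.round(x / STEP) * STEP
    return q, np.sign(x - q)

def compensated_dot(xs, ys):
    acc = 0.0
    for (qx, dx), (qy, dy) in zip(map(encode, xs), map(encode, ys)):
        raw = qx * qy
        # Compensate the raw product with first-order error terms, using
        # x*y ~= qx*qy + qx*ey + qy*ex and the encoded error signs.
        raw += qx * dy * ERR_EST + qy * dx * ERR_EST
        acc += raw
    return acc

rng = np.random.default_rng(7)
xs, ys = rng.normal(size=32), rng.normal(size=32)
exact = float(xs @ ys)
plain = sum(encode(x)[0] * encode(y)[0] for x, y in zip(xs, ys))
print(exact, plain, compensated_dot(xs, ys))
```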
-
Publication number: 20190385050
Abstract: Techniques for statistics-aware weight quantization are presented. To facilitate reducing the bit precision of weights, for a set of weights, a quantizer management component can estimate a quantization scale value to apply to a weight as a linear or non-linear function of the mean of the square of the weight values and the mean of the absolute value of the weight values, wherein the quantization scale value is determined to have a smaller quantization error than all, or at least almost all, other quantization errors associated with other quantization scale values. A quantizer component applies the quantization scale value to symmetrically and/or uniformly quantize the weights of a layer of the set of weights to generate quantized weights, the weights being quantized using rounding. The respective quantized weights can be used to facilitate training and inference of a deep learning system.
Type: Application
Filed: June 13, 2018
Publication date: December 19, 2019
Inventors: Zhuo Wang, Jungwook Choi, Kailash Gopalakrishnan, Pierce I-Jen Chuang
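A minimal sketch of statistics-aware scale estimation followed by symmetric, uniform quantization with rounding; the coefficients c1 and c2 are illustrative placeholders for the fitted, bit-width-dependent values:

```python
import numpy as np

def sawb_scale(w, c1=3.2, c2=-2.1):
    # Scale as a linear function of sqrt(E[w^2]) and E[|w|].
    return c1 * np.sqrt(np.mean(w ** 2)) + c2 * np.mean(np.abs(w))

def quantize_weights(w, bits=2):
    scale = sawb_scale(w)                 # clipping range [-scale, scale]
    qmax = 2 ** (bits - 1) - 1            # symmetric integer grid
    step = scale / qmax
    # Uniform, symmetric quantization using rounding.
    return np.clip(np.round(w / step), -qmax, qmax) * step

rng = np.random.default_rng(8)
w = rng.normal(scale=0.05, size=10_000)
wq = quantize_weights(w)
print("distinct levels:", np.unique(wq))
```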
-
Patent number: 10489484
Abstract: Techniques facilitating matrix multiplication on a systolic array are provided. A computer-implemented method can comprise populating, by a system operatively coupled to a processor, respective first registers of one or more processing elements of a systolic array structure with respective input data bits of a first data matrix. The one or more processing elements can comprise a first processing element that comprises a first input data bit of the first data matrix and a first activation bit of a second data matrix. The method can also include determining, by the system, at the first processing element, a first partial sum of a third data matrix. Further, the method can include streaming, by the system, the first partial sum of the third data matrix from the first processing element.
Type: Grant
Filed: April 11, 2019
Date of Patent: November 26, 2019
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Chia-Yu Chen, Jungwook Choi, Kailash Gopalakrishnan, Victor Han, Vijayalakshmi Srinivasan, Jintao Zhang