Abstract: A method is provided. The method includes selecting a neural network model, wherein the neural network model includes a plurality of layers, and wherein each of the plurality of layers includes weights and activations; modifying the neural network model by inserting a plurality of quantization layers within the neural network model; associating a cost function with the modified neural network model, wherein the cost function includes a first coefficient corresponding to a first regularization term, and wherein an initial value of the first coefficient is pre-defined; and training the modified neural network model to generate quantized weights for a layer by increasing the first coefficient until all weights are quantized and the first coefficient satisfies a pre-defined threshold, further including optimizing a weight scaling factor for the quantized weights and an activation scaling factor for quantized activations, and wherein the quantized weights are quantized using the optimized weight scaling factor.
Type:
Grant
Filed:
March 7, 2018
Date of Patent:
March 8, 2022
Inventors:
Yoo Jin Choi, Mostafa El-Khamy, Jungwon Lee
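The quantization procedure this abstract describes can be illustrated with a minimal sketch: a regularization term pulls each weight toward its nearest quantization level, and the regularization coefficient is grown until every weight is quantized and the coefficient exceeds a threshold. This is a toy illustration, not the patented method; all function names, the quantization levels, and the fixed scaling factor (the abstract's scaling-factor optimization is omitted) are assumptions.

```python
def quantize(w, scale, levels=(-1.0, 0.0, 1.0)):
    """Snap w to the nearest allowed quantization level, scaled by `scale`."""
    return min((scale * q for q in levels), key=lambda v: abs(w - v))

def train_quantized(weights, scale=0.5, alpha=0.01, growth=2.0,
                    alpha_max=1e6, lr=0.1, tol=1e-3):
    """Grow the regularization coefficient `alpha` until all weights are
    quantized and alpha satisfies the pre-defined threshold `alpha_max`.
    The task-loss gradient is omitted for brevity: only the regularization
    gradient, which pulls each weight toward its nearest quantized value,
    is applied."""
    while True:
        step = min(1.0, lr * alpha)  # cap the step so weights never overshoot
        weights = [w - step * (w - quantize(w, scale)) for w in weights]
        all_q = all(abs(w - quantize(w, scale)) < tol for w in weights)
        if all_q and alpha >= alpha_max:
            return [quantize(w, scale) for w in weights]
        alpha = min(alpha * growth, alpha_max)

print(train_quantized([0.3, -0.6, 0.9, 0.05]))  # each weight lands on the grid
```

With `scale=0.5` the grid is {-0.5, 0.0, 0.5}, so the example weights collapse to [0.5, -0.5, 0.5, 0.0]; in the actual method the scaling factor would itself be optimized during training rather than fixed.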
Abstract: A neural network embodiment comprises an input layer, an output layer and a filter layer. Each unit of the filter layer receives a filter layer input from a single preceding unit via a respective filter layer input connection. Each filter layer input connection is coupled to a different single preceding unit. The filter layer incentivizes the neural network to learn to produce a target output from the output layer for a given input to the input layer while simultaneously learning weights for each filter layer input connection. The learned weights cause the filter layer to reduce the number of filter layer units that pass respective filter layer inputs as non-zero values. When applied as an initial internal layer between the input layer and the output layer, the filter layer incentivizes the neural network to learn which neural network input features to discard while still producing the target output.
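The filter layer described above, with one weight per one-to-one input connection, behaves like a learned feature-selection gate. A minimal sketch of that idea, assuming a toy linear regression and an L1 penalty (applied via soft-thresholding) as the mechanism that drives unneeded gates to zero; all names and hyperparameters are hypothetical, not the patented design:

```python
def filter_layer(x, gates):
    """One gate weight per input feature; each unit passes a single input."""
    return [g * xi for g, xi in zip(gates, x)]

def soft_threshold(w, t):
    """Proximal step for an L1 penalty: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def train_gates(samples, targets, gates, lr=0.05, l1=0.5, epochs=300):
    """Toy regression through the filter layer; the L1 penalty zeroes the
    gates of features that do not help predict the target."""
    for _ in range(epochs):
        for x, y in zip(samples, targets):
            err = sum(filter_layer(x, gates)) - y
            gates = [soft_threshold(g - lr * err * xi, lr * l1)
                     for g, xi in zip(gates, x)]
    return gates
```

In a run where the target depends only on the first feature (e.g. y = 2·x0 with x1 pure noise), the second gate is driven to zero while the first stays large, i.e. the layer learns which input feature to discard.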
Abstract: A computer-implemented method includes employing a dynamic Boltzmann machine (DyBM) to solve a maximum-likelihood problem for a generalized normal distribution (GND) over time-series datasets. The method further includes acquiring the time-series datasets transmitted from a source node to a destination node of a neural network including a plurality of nodes, learning, by a processor, a time-series generative model based on the GND with eligibility traces, and performing, by the processor, online updating of internal parameters of the GND based on a gradient update to predict updated time-series datasets generated from non-Gaussian distributions.
Type:
Grant
Filed:
October 31, 2018
Date of Patent:
December 7, 2021
Assignee:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Rudy Raymond Harry Putra, Takayuki Osogami, Sakyasingha Dasgupta
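The online gradient update of GND parameters mentioned in this abstract can be sketched in isolation. The generalized normal density is p(x) = β / (2α Γ(1/β)) · exp(−(|x−μ|/α)^β), and each observation triggers one gradient-ascent step on its log-likelihood. This is a simplified illustration, not IBM's DyBM: the eligibility traces and the network structure are omitted, and the starting values and learning rate below are assumptions.

```python
import math

def gnd_log_pdf(x, mu, alpha, beta):
    """Log-density of the generalized normal distribution (GND):
    mu = location, alpha = scale, beta = shape (beta=2 gives Gaussian)."""
    return (math.log(beta) - math.log(2 * alpha)
            - math.lgamma(1 / beta) - (abs(x - mu) / alpha) ** beta)

def online_update(stream, mu=0.0, alpha=1.0, beta=1.5, lr=0.05):
    """One gradient-ascent step on the GND log-likelihood per observation."""
    for x in stream:
        z = abs(x - mu) / alpha
        sign = 1.0 if x >= mu else -1.0
        # d/dmu log p = beta * z**(beta-1) * sign / alpha
        mu += lr * beta * z ** (beta - 1) * sign / alpha
        # d/dalpha log p = (beta * z**beta - 1) / alpha
        alpha += lr * (beta * z ** beta - 1.0) / alpha
        alpha = max(alpha, 1e-6)  # keep the scale positive
    return mu, alpha
```

Fed a stream of samples clustered near 3, the location parameter drifts toward 3 and the scale shrinks toward the spread of the data, which is the online adaptation the abstract describes for streaming, possibly non-Gaussian, time series.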
Abstract: A system and method provide a sequence learning model. The method for training the sequence learning model comprises retrieving input sequence data. The input sequence data includes one or more input time sequences. The method also encodes the input sequence data into output symbol data using the sequence learning model. The output symbol data includes one or more symbolic representations. The method decodes, using a neural network, the output symbol data into decoded sequence data, where the decoded sequence data includes one or more decoded time sequences that are intended to match the one or more input time sequences in the input sequence data. The method further compares the decoded sequence data with the input sequence data and updates the sequence learning model based on the comparison.
Type:
Grant
Filed:
December 22, 2017
Date of Patent:
June 8, 2021
Assignee:
Onu Technology Inc.
Inventors:
Volkmar Frinken, Guha Jayachandran, Shriphani Palakodety, Veni Singh
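The encode / decode / compare / update loop in this last abstract can be sketched with a deliberately tiny stand-in for the model. Here the "sequence learning model" is just a scalar codebook: encoding maps each time-series sample to its nearest symbol, decoding maps symbols back to values, and the update shrinks the reconstruction error. This is an illustrative toy, not the patented system (which uses a neural decoder); every name and constant below is hypothetical.

```python
def encode(seq, codebook):
    """Map each sample to the index of its nearest codebook symbol."""
    return [min(range(len(codebook)), key=lambda s: abs(x - codebook[s]))
            for x in seq]

def decode(symbols, codebook):
    """Map symbols back to values (a neural decoder in the actual method)."""
    return [codebook[s] for s in symbols]

def train(seqs, codebook, lr=0.2, epochs=50):
    """Round-trip training: encode, decode, compare with the input,
    and update the model to reduce the reconstruction error."""
    for _ in range(epochs):
        for seq in seqs:
            symbols = encode(seq, codebook)
            decoded = decode(symbols, codebook)
            for s, x, d in zip(symbols, seq, decoded):
                codebook[s] += lr * (x - d)
    return codebook
```

For an input sequence whose samples cluster near 0.11 and 0.89, the two codebook entries converge to those cluster centers, so the decoded sequence comes to match the input, which is the training criterion the abstract states.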