Patents by Inventor Georgios GEORGIADIS

Georgios GEORGIADIS has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11775611
    Abstract: In some embodiments, a method of quantizing an artificial neural network includes dividing a quantization range for a tensor of the artificial neural network into a first region and a second region, and quantizing values of the tensor in the first region separately from values of the tensor in the second region. In some embodiments, linear or nonlinear quantization is applied to values of the tensor in the first region and the second region. In some embodiments, the method includes locating a breakpoint between the first region and the second region by substantially minimizing an expected quantization error over at least a portion of the quantization range. In some embodiments, the expected quantization error is minimized by solving analytically and/or searching numerically.
    Type: Grant
    Filed: March 11, 2020
    Date of Patent: October 3, 2023
    Inventors: Jun Fang, Joseph H. Hassoun, Ali Shafiee Ardestani, Hamzah Ahmed Ali Abdelaziz, Georgios Georgiadis, Hui Chen, David Philip Lloyd Thorsley
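
The two-region ("piecewise") quantization in this abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the patented method: it assumes uniform quantization within each region, a magnitude split at the breakpoint, and a brute-force numeric search that minimizes the mean squared quantization error; the function names and candidate count are purely illustrative.

```python
import numpy as np

def uniform_quantize(x, lo, hi, n_levels):
    """Uniformly quantize x clipped to [lo, hi] using n_levels levels."""
    if hi <= lo:
        return np.full_like(x, lo)
    step = (hi - lo) / (n_levels - 1)
    return np.round((np.clip(x, lo, hi) - lo) / step) * step + lo

def piecewise_quantize(t, bits=8, candidates=64):
    """Split [0, max|t|] at a breakpoint; quantize each region separately."""
    mags, signs = np.abs(t), np.sign(t)
    hi = mags.max()
    levels = 2 ** (bits - 1)  # levels per region (sign handled separately here)
    best_err, best_q, best_bp = np.inf, None, None
    # Numeric search: try candidate breakpoints, keep the one with lowest
    # expected (mean squared) quantization error over the whole range.
    for bp in np.linspace(hi / candidates, hi * (1 - 1.0 / candidates), candidates):
        inner = mags <= bp
        q = np.empty_like(mags)
        q[inner] = uniform_quantize(mags[inner], 0.0, bp, levels)
        q[~inner] = uniform_quantize(mags[~inner], bp, hi, levels)
        err = np.mean((q - mags) ** 2)
        if err < best_err:
            best_err, best_q, best_bp = err, q, bp
    return signs * best_q, best_bp

w = np.random.randn(1024)  # toy tensor with a bell-shaped value distribution
qw, bp = piecewise_quantize(w)
print(f"breakpoint={bp:.3f}  mse={np.mean((qw - w) ** 2):.6f}")
```
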
  • Patent number: 11588499
    Abstract: A system and a method provide compression and decompression of weights of a layer of a neural network. For compression, the values of the weights are pruned and the weights of a layer are configured as a tensor having a tensor size of H×W×C in which H represents a height of the tensor, W represents a width of the tensor, and C represents a number of channels of the tensor. The tensor is formatted into at least one block of values. Each block is encoded independently from other blocks of the tensor using at least one lossless compression mode. For decoding, each block is decoded independently from other blocks using at least one decompression mode corresponding to the at least one compression mode used to compress the block, and deformatted into a tensor having the size of H×W×C.
    Type: Grant
    Filed: December 17, 2018
    Date of Patent: February 21, 2023
    Inventor: Georgios Georgiadis
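
The pipeline in this abstract (prune, format the H×W×C tensor into blocks, encode each block independently, then decode and deformat) can be illustrated as follows. This is a hedged sketch, not the patented coder: the block size of 16 and the two example modes (raw bytes versus a sparsity bitmap plus nonzero values) are assumptions standing in for the patent's lossless compression modes.

```python
import numpy as np

BLOCK = 16  # values per block (illustrative)

def encode_block(block):
    """Pick the cheaper of two lossless modes for this block."""
    raw = block.astype(np.float32).tobytes()
    # "sparse" mode: bitmap of nonzeros + the nonzero values only.
    bitmap = np.packbits(block != 0).tobytes()
    sparse = bitmap + block[np.flatnonzero(block)].astype(np.float32).tobytes()
    return ("sparse", sparse) if len(sparse) < len(raw) else ("raw", raw)

def decode_block(mode, payload, n):
    """Invert encode_block; each block decodes independently of the others."""
    if mode == "raw":
        return np.frombuffer(payload, dtype=np.float32, count=n)
    nbytes = (n + 7) // 8
    mask = np.unpackbits(np.frombuffer(payload[:nbytes], dtype=np.uint8))[:n].astype(bool)
    out = np.zeros(n, dtype=np.float32)
    out[mask] = np.frombuffer(payload[nbytes:], dtype=np.float32, count=int(mask.sum()))
    return out

# Pruned H x W x C tensor -> flat blocks -> independent encode/decode -> deformat.
H, W, C = 4, 4, 8
w = np.random.randn(H, W, C).astype(np.float32)
w[np.abs(w) < 1.0] = 0.0                       # pruning step (magnitude threshold)
flat = w.reshape(-1)
blocks = [encode_block(flat[i:i + BLOCK]) for i in range(0, flat.size, BLOCK)]
restored = np.concatenate(
    [decode_block(m, p, BLOCK) for m, p in blocks]).reshape(H, W, C)
assert np.array_equal(restored, w)             # lossless round trip
```

Because each block carries its own mode, sparse regions of a pruned tensor compress aggressively while dense blocks fall back to raw storage; the same per-block structure appears in the activation-map publications (20200143226, 20190370667) below.
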
  • Publication number: 20230004813
    Abstract: A system and a method generate a neural network that includes at least one layer having weights and output feature maps that have been jointly pruned and quantized. The weights of the layer are pruned using an analytic threshold function. Each weight remaining after pruning is quantized based on a weighted average of a quantization and dequantization of the weight for all quantization levels to form quantized weights for the layer. Output feature maps of the layer are generated based on the quantized weights of the layer. Each output feature map of the layer is quantized based on a weighted average of a quantization and dequantization of the output feature map for all quantization levels. Parameters of the analytic threshold function, the weighted average of all quantization levels of the weights and the weighted average of each output feature map of the layer are updated using a cost function.
    Type: Application
    Filed: September 12, 2022
    Publication date: January 5, 2023
    Inventors: Georgios GEORGIADIS, Weiran DENG
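
A rough sketch of the joint prune-and-quantize idea follows. The sigmoid-gated threshold function, the softmax-weighted average over quantization levels, and the temperature parameter are all illustrative assumptions; the publication specifies only that an analytic threshold function is used for pruning and that a weighted average of quantization/dequantization over all levels is used, with the parameters updated through a cost function.

```python
import numpy as np

def analytic_threshold(w, alpha=20.0, beta=0.3):
    """Differentiable soft threshold: ~0 for |w| << beta, ~w for |w| >> beta."""
    return w / (1.0 + np.exp(-alpha * (np.abs(w) - beta)))

def soft_quantize(w, levels, temperature=50.0):
    """Weighted average of quantize/dequantize over all quantization levels.

    Each level's weight is a softmax of the negative squared distance between
    the value and that level, so the result is differentiable and approaches
    hard quantization as the temperature grows.
    """
    d = -temperature * (w[..., None] - levels) ** 2   # affinity to each level
    d -= d.max(axis=-1, keepdims=True)                # stabilize the softmax
    p = np.exp(d)
    p /= p.sum(axis=-1, keepdims=True)
    return (p * levels).sum(axis=-1)

levels = np.linspace(-1.0, 1.0, 16)                   # 4-bit symmetric grid
w = np.random.randn(6, 6) * 0.5
w_pruned = analytic_threshold(w)                      # prune (softly)
w_q = soft_quantize(w_pruned, levels)                 # quantize (softly)
# In training, alpha, beta, and the temperature would be updated jointly with
# the weights by backpropagating a task cost function; the same soft
# quantization is applied to the layer's output feature maps.
print(np.round(w_q, 3))
```
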
  • Patent number: 11475308
    Abstract: A system and a method generate a neural network that includes at least one layer having weights and output feature maps that have been jointly pruned and quantized. The weights of the layer are pruned using an analytic threshold function. Each weight remaining after pruning is quantized based on a weighted average of a quantization and dequantization of the weight for all quantization levels to form quantized weights for the layer. Output feature maps of the layer are generated based on the quantized weights of the layer. Each output feature map of the layer is quantized based on a weighted average of a quantization and dequantization of the output feature map for all quantization levels. Parameters of the analytic threshold function, the weighted average of all quantization levels of the weights and the weighted average of each output feature map of the layer are updated using a cost function.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: October 18, 2022
    Inventors: Georgios Georgiadis, Weiran Deng
  • Publication number: 20220129756
    Abstract: A technique to prune weights of a neural network using an analytic threshold function h(w) provides a neural network having weights that have been optimally pruned. The neural network includes a plurality of layers in which each layer includes a set of weights w associated with the layer that enhance a speed performance of the neural network, an accuracy of the neural network, or a combination thereof. Each set of weights is based on a cost function C that has been minimized by back-propagating an output of the neural network in response to input training data. The cost function C is also minimized based on a derivative of the cost function C with respect to a first parameter of the analytic threshold function h(w) and on a derivative of the cost function C with respect to a second parameter of the analytic threshold function h(w).
    Type: Application
    Filed: January 10, 2022
    Publication date: April 28, 2022
    Inventors: Weiran DENG, Georgios GEORGIADIS
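
The point of an analytic (smooth) threshold function is that the cost C is differentiable in the threshold's own parameters, so the pruning threshold can be learned by the same gradient descent that trains the weights. The sketch below illustrates this with an assumed sigmoid-gated form of h(w) and a toy cost; the actual h(w) and parameter names in the patent family may differ.

```python
import numpy as np

def h_and_grads(w, alpha, beta):
    """h(w) = w * sigmoid(alpha * (|w| - beta)) and its parameter gradients."""
    z = alpha * (np.abs(w) - beta)
    s = 1.0 / (1.0 + np.exp(-z))
    h = w * s
    ds = s * (1.0 - s)                       # sigmoid'(z)
    dh_dalpha = w * ds * (np.abs(w) - beta)  # dh/d(alpha)
    dh_dbeta = w * ds * (-alpha)             # dh/d(beta)
    return h, dh_dalpha, dh_dbeta

# Toy "cost": C = mean(h(w)^2). The chain rule yields dC/dalpha and dC/dbeta,
# which is exactly what back-propagation supplies for a real task cost.
w = np.random.randn(1000)
alpha, beta, lr = 10.0, 0.2, 1e-3
for _ in range(100):
    h, da, db = h_and_grads(w, alpha, beta)
    dC_dh = 2.0 * h / h.size
    alpha -= lr * np.sum(dC_dh * da)         # update first threshold parameter
    beta -= lr * np.sum(dC_dh * db)          # update second threshold parameter
print(f"learned alpha={alpha:.3f}, beta={beta:.3f}")
```
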
  • Patent number: 11250325
    Abstract: A technique to prune weights of a neural network using an analytic threshold function h(w) provides a neural network having weights that have been optimally pruned. The neural network includes a plurality of layers in which each layer includes a set of weights w associated with the layer that enhance a speed performance of the neural network, an accuracy of the neural network, or a combination thereof. Each set of weights is based on a cost function C that has been minimized by back-propagating an output of the neural network in response to input training data. The cost function C is also minimized based on a derivative of the cost function C with respect to a first parameter of the analytic threshold function h(w) and on a derivative of the cost function C with respect to a second parameter of the analytic threshold function h(w).
    Type: Grant
    Filed: February 12, 2018
    Date of Patent: February 15, 2022
    Inventors: Weiran Deng, Georgios Georgiadis
  • Patent number: 11151428
    Abstract: A system and method for pruning. A neural network includes a plurality of long short-term memory cells, each of which includes an input having a weight matrix Wc, an input gate having a weight matrix Wi, a forget gate having a weight matrix Wf, and an output gate having a weight matrix Wo. In some embodiments, after initial training, one or more of the weight matrices Wi, Wf, and Wo are pruned, and the weight matrix Wc is left unchanged. The neural network is then retrained, the pruned weights being constrained to remain zero during retraining.
    Type: Grant
    Filed: April 9, 2020
    Date of Patent: October 19, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Georgios Georgiadis, Weiran Deng
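
A minimal PyTorch sketch of the gate-selective pruning described above: the input, forget, and output gate blocks of the LSTM weight matrix are magnitude-pruned while the cell (candidate) block, playing the role of Wc, is left untouched, and a binary mask is re-applied after each retraining step so pruned weights remain zero. The 50% sparsity target and the toy objective are illustrative assumptions.

```python
import torch

H = 64
lstm = torch.nn.LSTM(input_size=32, hidden_size=H)
# PyTorch stacks the four gate blocks row-wise in weight_ih_l0 as [i, f, g, o];
# the g (cell/candidate) block corresponds to Wc and is left unpruned.
gate_rows = {"i": slice(0, H), "f": slice(H, 2 * H), "o": slice(3 * H, 4 * H)}

masks = {}
with torch.no_grad():
    for name, rows in gate_rows.items():
        block = lstm.weight_ih_l0[rows]
        thresh = block.abs().median()        # ~50% magnitude pruning per gate
        masks[name] = (block.abs() > thresh).float()
        block.mul_(masks[name])              # prune Wi, Wf, Wo; Wc untouched

opt = torch.optim.SGD(lstm.parameters(), lr=0.01)
for _ in range(10):                          # toy retraining loop
    out, _ = lstm(torch.randn(5, 1, 32))
    loss = out.pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                    # pruned weights stay pinned at zero
        for name, rows in gate_rows.items():
            lstm.weight_ih_l0[rows] *= masks[name]
```
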
  • Publication number: 20210133278
    Abstract: A method of quantizing an artificial neural network may include dividing a quantization range for a tensor of the artificial neural network into a first region and a second region, and quantizing values of the tensor in the first region separately from values of the tensor in the second region. Linear or nonlinear quantization may be applied to values of the tensor in the first region and the second region. The method may include locating a breakpoint between the first region and the second region by substantially minimizing an expected quantization error over at least a portion of the quantization range. The expected quantization error may be minimized by solving analytically and/or searching numerically.
    Type: Application
    Filed: March 11, 2020
    Publication date: May 6, 2021
    Inventors: Jun FANG, Joseph H. HASSOUN, Ali SHAFIEE ARDESTANI, Hamzah Ahmed Ali ABDELAZIZ, Georgios GEORGIADIS, Hui CHEN, David Philip Lloyd THORSLEY
  • Publication number: 20200293882
    Abstract: A recurrent neural network that predicts blood glucose level includes a first long short-term memory (LSTM) network and a second LSTM network. The first LSTM network may include an input to receive near-infrared (NIR) radiation data and includes an output. The second LSTM network may include an input to receive the output of the first LSTM network and an output to output blood glucose level data based on the NIR radiation data input to the first LSTM network.
    Type: Application
    Filed: May 2, 2019
    Publication date: September 17, 2020
    Inventors: Liu LIU, Georgios GEORGIADIS, Elham SAKHAEE, Weiran DENG
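
The two-stage recurrent architecture in this abstract maps naturally onto two stacked LSTMs with a regression head. The sketch below is an illustrative PyTorch rendering under assumed sizes; the single-feature NIR input and the hidden width are not specified by the publication.

```python
import torch

class GlucoseNet(torch.nn.Module):
    def __init__(self, nir_features=1, hidden=64):
        super().__init__()
        self.lstm1 = torch.nn.LSTM(nir_features, hidden, batch_first=True)
        self.lstm2 = torch.nn.LSTM(hidden, hidden, batch_first=True)
        self.head = torch.nn.Linear(hidden, 1)  # blood glucose regression output

    def forward(self, nir_seq):                 # (batch, time, nir_features)
        h1, _ = self.lstm1(nir_seq)             # first LSTM over raw NIR data
        h2, _ = self.lstm2(h1)                  # second LSTM over its output
        return self.head(h2[:, -1])             # predict from the last time step

model = GlucoseNet()
nir = torch.randn(8, 100, 1)                    # 8 sequences of 100 NIR samples
print(model(nir).shape)                         # torch.Size([8, 1])
```
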
  • Publication number: 20200293893
    Abstract: A system and a method generate a neural network that includes at least one layer having weights and output feature maps that have been jointly pruned and quantized. The weights of the layer are pruned using an analytic threshold function. Each weight remaining after pruning is quantized based on a weighted average of a quantization and dequantization of the weight for all quantization levels to form quantized weights for the layer. Output feature maps of the layer are generated based on the quantized weights of the layer. Each output feature map of the layer is quantized based on a weighted average of a quantization and dequantization of the output feature map for all quantization levels. Parameters of the analytic threshold function, the weighted average of all quantization levels of the weights and the weighted average of each output feature map of the layer are updated using a cost function.
    Type: Application
    Filed: April 26, 2019
    Publication date: September 17, 2020
    Inventors: Georgios GEORGIADIS, Weiran DENG
  • Publication number: 20200234089
    Abstract: A system and method for pruning. A neural network includes a plurality of long short-term memory cells, each of which includes an input having a weight matrix Wc, an input gate having a weight matrix Wi, a forget gate having a weight matrix Wf, and an output gate having a weight matrix Wo. In some embodiments, after initial training, one or more of the weight matrices Wi, Wf, and Wo are pruned, and the weight matrix Wc is left unchanged. The neural network is then retrained, the pruned weights being constrained to remain zero during retraining.
    Type: Application
    Filed: April 9, 2020
    Publication date: July 23, 2020
    Inventors: Georgios Georgiadis, Weiran Deng
  • Patent number: 10713765
    Abstract: Color trim data is used in an approximation function to approximate one or more non-linear transformations of image data in an image processing pipeline. The color trim data is derived in one embodiment through a back projection on a colorist system, and the color trim data is used at the time of rendering an image on a display management system.
    Type: Grant
    Filed: March 1, 2018
    Date of Patent: July 14, 2020
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Alexander Partin, Kimball Darr Thurston, III, Georgios Georgiadis
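
One way to read this abstract: an expensive non-linear color transform is replaced at render time by a cheap parametric approximation whose parameters (the color trim data) are recovered offline by back projection, i.e., by fitting the approximation to the reference transform's outputs. The sketch below uses an assumed cubic polynomial and a stand-in tone curve; neither is the patent's actual transfer function.

```python
import numpy as np

def reference_transform(x):
    """Stand-in for an expensive non-linear color transformation."""
    return x ** 2.2 / (x ** 2.2 + 0.18)

# "Colorist system" side: derive trim data by back projection (LSQ fit).
grid = np.linspace(0.0, 1.0, 256)
target = reference_transform(grid)
A = np.vander(grid, 4)                     # cubic basis [x^3, x^2, x, 1]
trim, *_ = np.linalg.lstsq(A, target, rcond=None)

# "Display management" side: apply the cheap approximation at render time.
def approx_transform(x, trim):
    return np.polyval(trim, x)

pixels = np.random.rand(1920 * 4)          # toy image data in [0, 1]
err = np.max(np.abs(approx_transform(pixels, trim) - reference_transform(pixels)))
print(f"max approximation error: {err:.4f}")
```
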
  • Patent number: 10657426
    Abstract: A system and method for pruning. A neural network includes a plurality of long short-term memory cells, each of which includes an input having a weight matrix Wc, an input gate having a weight matrix Wi, a forget gate having a weight matrix Wf, and an output gate having a weight matrix Wo. In some embodiments, after initial training, one or more of the weight matrices Wi, Wf, and Wo are pruned, and the weight matrix Wc is left unchanged. The neural network is then retrained, the pruned weights being constrained to remain zero during retraining.
    Type: Grant
    Filed: March 27, 2018
    Date of Patent: May 19, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Georgios Georgiadis, Weiran Deng
  • Publication number: 20200143226
    Abstract: A system and a method provide compression and decompression of an activation map of a layer of a neural network. For compression, the values of the activation map are sparsified and the activation map is configured as a tensor having a tensor size of H×W×C in which H represents a height of the tensor, W represents a width of the tensor, and C represents a number of channels of the tensor. The tensor is formatted into at least one block of values. Each block is encoded independently from other blocks of the tensor using at least one lossless compression mode. For decoding, each block is decoded independently from other blocks using at least one decompression mode corresponding to the at least one compression mode used to compress the block, and deformatted into a tensor having the size of H×W×C.
    Type: Application
    Filed: December 17, 2018
    Publication date: May 7, 2020
    Inventor: Georgios GEORGIADIS
  • Publication number: 20200143249
    Abstract: A system and a method provide compression and decompression of weights of a layer of a neural network. For compression, the values of the weights are pruned and the weights of a layer are configured as a tensor having a tensor size of H×W×C in which H represents a height of the tensor, W represents a width of the tensor, and C represents a number of channels of the tensor. The tensor is formatted into at least one block of values. Each block is encoded independently from other blocks of the tensor using at least one lossless compression mode. For decoding, each block is decoded independently from other blocks using at least one decompression mode corresponding to the at least one compression mode used to compress the block, and deformatted into a tensor having the size of H×W×C.
    Type: Application
    Filed: December 17, 2018
    Publication date: May 7, 2020
    Inventor: Georgios GEORGIADIS
  • Publication number: 20200043149
    Abstract: Color trim data is used in an approximation function to approximate one or more non-linear transformations of image data in an image processing pipeline. The color trim data is derived in one embodiment through a back projection on a colorist system, and the color trim data is used at the time of rendering an image on a display management system.
    Type: Application
    Filed: March 1, 2018
    Publication date: February 6, 2020
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Alexander PARTIN, Kimball Darr THURSTON III, Georgios GEORGIADIS
  • Publication number: 20190370667
    Abstract: A system and a method provide lossless compression of an activation map of a neural network. The system includes a formatter and an encoder. The formatter formats a tensor corresponding to an activation map into at least one block of values in which the tensor has a size of H×W×C and in which H represents a height of the tensor, W represents a width of the tensor, and C represents a number of channels of the tensor. The encoder encodes the at least one block independently from other blocks of the tensor using at least one lossless compression mode. The at least one lossless compression mode selected to encode the at least one block may differ from a lossless compression mode selected to encode another block of the tensor.
    Type: Application
    Filed: July 26, 2018
    Publication date: December 5, 2019
    Inventor: Georgios GEORGIADIS
  • Publication number: 20190228274
    Abstract: A system and method for pruning. A neural network includes a plurality of long short-term memory cells, each of which includes an input having a weight matrix Wc, an input gate having a weight matrix Wi, a forget gate having a weight matrix Wf, and an output gate having a weight matrix Wo. In some embodiments, after initial training, one or more of the weight matrices Wi, Wf, and Wo are pruned, and the weight matrix Wc is left unchanged. The neural network is then retrained, the pruned weights being constrained to remain zero during retraining.
    Type: Application
    Filed: March 27, 2018
    Publication date: July 25, 2019
    Inventors: Georgios Georgiadis, Weiran Deng
  • Publication number: 20190180184
    Abstract: A technique to prune weights of a neural network using an analytic threshold function h(w) provides a neural network having weights that have been optimally pruned. The neural network includes a plurality of layers in which each layer includes a set of weights w associated with the layer that enhance a speed performance of the neural network, an accuracy of the neural network, or a combination thereof. Each set of weights is based on a cost function C that has been minimized by back-propagating an output of the neural network in response to input training data. The cost function C is also minimized based on a derivative of the cost function C with respect to a first parameter of the analytic threshold function h(w) and on a derivative of the cost function C with respect to a second parameter of the analytic threshold function h(w).
    Type: Application
    Filed: February 12, 2018
    Publication date: June 13, 2019
    Inventors: Weiran DENG, Georgios GEORGIADIS
  • Publication number: 20190050735
    Abstract: A method is disclosed to reduce computational load of a deep neural network. A number of multiply-accumulate (MAC) operations is determined for each layer of the deep neural network. A pruning error allowance per weight is determined based on a computational load of each layer. For each layer of the deep neural network: a threshold estimator is initialized, and weights of each layer are pruned based on a standard deviation of all weights within the layer. A pruning error per weight is determined for the layer, and if the pruning error per weight exceeds a predetermined threshold, the threshold estimator is updated for the layer, the weights of the layer are repruned using the updated threshold estimator, and the pruning error per weight is re-determined until the pruning error per weight is less than the threshold. The deep neural network is then retrained.
    Type: Application
    Filed: October 3, 2017
    Publication date: February 14, 2019
    Inventors: Zhengping JI, John Wakefield BROTHERS, Weiran DENG, Georgios GEORGIADIS
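
The iterative, load-aware pruning loop in this abstract can be sketched as follows. The per-layer error allowance proportional to MAC share, the standard-deviation-based initial threshold, and the multiplicative threshold update are illustrative assumptions about details the abstract leaves open.

```python
import numpy as np

layers = [np.random.randn(64, 128), np.random.randn(128, 256)]   # toy weights
macs = np.array([w.size * 32 for w in layers], dtype=float)      # MACs per layer
allowance = 0.02 * macs / macs.sum()      # error allowance per weight, by load

for w, allow in zip(layers, allowance):
    t = 0.5 * np.std(w)                   # threshold estimator init (std-based)
    while True:
        pruned = np.where(np.abs(w) < t, 0.0, w)
        err = np.mean((pruned - w) ** 2)  # pruning error per weight
        if err <= allow:
            break
        t *= 0.9                          # update estimator; reprune and recheck
    w[...] = pruned
    print(f"threshold={t:.4f}  err/weight={err:.5f}  allowance={allow:.5f}")
# ...followed by retraining of the pruned network.
```
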