Patents by Inventor John Wakefield BROTHERS
John Wakefield BROTHERS has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20210334643

Abstract: A processing unit is described that receives an instruction to perform a first operation on a first layer of a neural network, block dependency data, and an instruction to perform a second operation on a second layer of the neural network. The processing unit performs the first operation, which includes dividing the first layer into a plurality of input blocks and operating on the input blocks to generate a plurality of output blocks. The processing unit then performs the second operation after the first operation has generated a set number of output blocks defined by the block dependency data.

Type: Application
Filed: April 27, 2020
Publication date: October 28, 2021
Inventors: Dominic Hugo SYMES, John Wakefield BROTHERS, III, Fredrik Peter STOLT
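The block-dependency idea above can be sketched in software. This is a hedged illustration, not the patented hardware interface: the function names and the single dependency count are assumptions. The second operation starts as soon as the first has produced `block_dependency` output blocks, instead of waiting for the whole first layer to finish.

```python
def run_layers(layer1_blocks, op1, op2, block_dependency):
    """Run op1 over input blocks; start op2 once enough output blocks exist."""
    outputs1 = []
    results2 = []
    started = False
    for block in layer1_blocks:
        outputs1.append(op1(block))
        # Trigger the dependent second operation once the set number of
        # output blocks from the first operation is available.
        if not started and len(outputs1) >= block_dependency:
            results2.append(op2(outputs1[:block_dependency]))
            started = True
    return outputs1, results2

# Toy usage: op1 doubles each block, op2 sums the blocks it depends on.
out1, out2 = run_layers([1, 2, 3, 4], lambda b: 2 * b,
                        lambda blocks: sum(blocks), block_dependency=2)
```

Overlapping the two layer operations this way reduces latency, since the second layer need not wait for every output block of the first.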
-
Publication number: 20210303307

Abstract: Herein described is a method of operating an accumulation process in a data processing apparatus. The accumulation process comprises a plurality of accumulations which output a respective plurality of accumulated values, each based on a stored value and a computed value generated by a data processing operation. The method comprises storing a first accumulated value, the first accumulated value being one of said plurality of accumulated values, into a first storage device comprising a plurality of single-bit storage elements; determining that a predetermined trigger has been satisfied with respect to the accumulation process; and, in response to the determining, storing at least a portion of a second accumulated value, the second accumulated value being one of said plurality of accumulated values, into a second storage device.

Type: Application
Filed: March 30, 2020
Publication date: September 30, 2021
Inventors: Jens OLSON, John Wakefield BROTHERS, III, Jared Corey SMOLENS, Chi-wen CHENG, Daren CROXFORD, Sharjeel SAEED, Dominic Hugo SYMES
-
Publication number: 20210303992

Abstract: When performing neural network processing, the order in which the neural network processing is to be performed is determined, and the order in which weight values to be used for the neural network processing will be used is determined based on the determined order of the neural network processing. The weight values are then provided to the processor that is to perform the neural network processing in the determined order for the weight values, with the processor, when performing the neural network processing, then using the weight values in the determined order that they are provided to the processor.

Type: Application
Filed: March 31, 2020
Publication date: September 30, 2021
Applicant: Arm Limited
Inventor: John Wakefield Brothers, III
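A minimal sketch of the weight-ordering idea, under the assumption that weights are grouped per kernel and that the processing order is known up front; the data layout and names are illustrative, not Arm's implementation:

```python
def order_weights(weights_by_kernel, processing_order):
    """Return weights as a flat stream in the order they will be used."""
    stream = []
    for kernel_id in processing_order:
        stream.extend(weights_by_kernel[kernel_id])
    return stream

# The processor then consumes the stream sequentially, so each weight
# arrives exactly when needed and no on-chip reordering buffer is required.
weights = {"k0": [0.1, 0.2], "k1": [0.3], "k2": [0.4, 0.5]}
stream = order_weights(weights, ["k2", "k0", "k1"])
```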
-
Publication number: 20210295140

Abstract: A neural network processor is disclosed that includes a combined convolution and pooling circuit that can perform both convolution and pooling operations. The circuit can perform a convolution operation by a multiply circuit determining products of corresponding input feature map and convolution kernel weight values, and an add circuit accumulating the products determined by the multiply circuit in storage. The circuit can perform an average pooling operation by the add circuit accumulating input feature map data values in the storage, a divisor circuit determining a divisor value, and a division circuit dividing the data value accumulated in the storage by the determined divisor value. The circuit can perform a maximum pooling operation by a maximum circuit determining a maximum value of input feature map data values, and storing the determined maximum value in the storage.

Type: Application
Filed: March 23, 2020
Publication date: September 23, 2021
Applicant: Arm Limited
Inventors: Rune Holm, John Wakefield Brothers, III
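A software analogue of the three operating modes may clarify how the circuit roles map onto one datapath. This is a sketch over a flat list of window values; the function names and the per-window framing are assumptions, not the disclosed circuit:

```python
def convolve_window(ifm_values, weights):
    """Convolution mode: multiply circuit forms products, add circuit accumulates."""
    acc = 0
    for x, w in zip(ifm_values, weights):
        acc += x * w          # product from the multiply circuit, accumulated
    return acc

def average_pool_window(ifm_values):
    """Average pooling mode: add circuit accumulates raw values, divisor
    circuit supplies the element count, division circuit divides."""
    acc = 0
    for x in ifm_values:
        acc += x              # add circuit reused; multiply circuit bypassed
    divisor = len(ifm_values) # divisor circuit
    return acc / divisor      # division circuit

def max_pool_window(ifm_values):
    """Maximum pooling mode: maximum circuit keeps the running maximum."""
    return max(ifm_values)
```

The design point is the reuse of one accumulator (the add circuit) for both convolution and average pooling, rather than two separate datapaths.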
-
Publication number: 20210133542

Abstract: When performing a matrix-vector multiply operation for neural network processing, a set of one or more input vectors to be multiplied by a matrix of data values is scanned to identify data positions of the input vector(s) for which the data value is non-zero in at least one of the input vectors. For each of the data positions identified as having a non-zero value in at least one of the input vectors, the set of data values from the matrix of data values for that data position is fetched from memory, and the matrix-vector multiply operation is performed using the data values for the input vectors for the data positions identified as being non-zero and the fetched set(s) of data values from the matrix of data values for those data position(s).

Type: Application
Filed: October 31, 2019
Publication date: May 6, 2021
Applicant: Arm Limited
Inventors: Rune Holm, John Wakefield Brothers, III
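The zero-skipping scheme above can be sketched as follows, assuming the matrix is stored column-wise so that one "set of data values per data position" is one column; the dict-of-columns layout and names are illustrative:

```python
def sparse_matvec(matrix_columns, input_vectors):
    """matrix_columns: dict position -> column (list of row values);
    input_vectors: list of equal-length input vectors."""
    n_rows = len(next(iter(matrix_columns.values())))
    results = [[0] * n_rows for _ in input_vectors]
    # Scan the input vectors for positions that are non-zero in at least one.
    active = [p for p in range(len(input_vectors[0]))
              if any(v[p] != 0 for v in input_vectors)]
    for p in active:
        column = matrix_columns[p]   # only these columns are fetched from memory
        for vi, vec in enumerate(input_vectors):
            if vec[p] != 0:
                for r in range(n_rows):
                    results[vi][r] += column[r] * vec[p]
    return results
```

Columns whose input position is zero in every vector are never touched, which is where the memory-bandwidth saving comes from.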
-
Publication number: 20210027148

Abstract: A processor is arranged to compress neural network activation data. The processor comprises an input module for obtaining neural network activation data, a block creation module arranged to split the neural network activation data into a plurality of blocks, and a metadata generation module for generating metadata associated with at least one of the plurality of blocks. Based on the generated metadata, a selection module selects a compression scheme for each of the plurality of blocks, and a compression module applies the selected compression scheme to the corresponding block to produce compressed neural network activation data. An output module is also provided for outputting the compressed neural network activation data.

Type: Application
Filed: July 22, 2019
Publication date: January 28, 2021
Inventors: Lingchuan MENG, John Wakefield BROTHERS, III, Jens OLSON, Jared Corey SMOLENS, Eric KUNZE, Ian Rudolf BRATT
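One way the block/metadata/selection pipeline could fit together is sketched below. The block size, the metadata fields, and the three scheme names ("zero", "delta", "raw") are assumptions for illustration, not the patented design:

```python
def split_blocks(data, block_size):
    """Block creation module: split activations into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def block_metadata(block):
    """Metadata generation module: cheap per-block statistics."""
    return {"all_zero": all(v == 0 for v in block),
            "value_range": max(block) - min(block)}

def select_scheme(meta):
    """Selection module: pick a scheme from the metadata."""
    if meta["all_zero"]:
        return "zero"      # store nothing but the block length
    if meta["value_range"] < 16:
        return "delta"     # small dynamic range: delta-encode
    return "raw"           # incompressible: pass through

def compress(data, block_size=4):
    """Compression module: tag each block with its chosen scheme."""
    return [(select_scheme(block_metadata(b)), b)
            for b in split_blocks(data, block_size)]
```

Activation data is often sparse and locally low-range, so choosing a scheme per block typically compresses better than one scheme for the whole tensor.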
-
Publication number: 20200410357

Abstract: An embodiment includes a method, comprising: pruning a layer of a neural network having multiple layers using a threshold; and repeating the pruning of the layer of the neural network using a different threshold until a pruning error of the pruned layer reaches a pruning error allowance.

Type: Application
Filed: September 9, 2020
Publication date: December 31, 2020
Inventors: Zhengping JI, John Wakefield BROTHERS, Ilia OVSIANNIKOV, Eunsoo SHIM
-
Patent number: 10832135

Abstract: An embodiment includes a method, comprising: pruning a layer of a neural network having multiple layers using a threshold; and repeating the pruning of the layer of the neural network using a different threshold until a pruning error of the pruned layer reaches a pruning error allowance.

Type: Grant
Filed: April 14, 2017
Date of Patent: November 10, 2020
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Zhengping Ji, John Wakefield Brothers, Ilia Ovsiannikov, Eunsoo Shim
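The claimed loop can be sketched as follows; the error metric (sum of the magnitudes of zeroed weights) and the multiplicative threshold update are illustrative assumptions, as the claim only requires repeating with a different threshold until the error reaches the allowance:

```python
def prune_layer(weights, allowance, threshold=0.1, step=0.9):
    """Zero weights below `threshold`; shrink the threshold and re-prune
    until the pruning error is within the allowance."""
    while True:
        pruned = [0.0 if abs(w) < threshold else w for w in weights]
        error = sum(abs(w) for w, p in zip(weights, pruned) if p == 0.0)
        if error <= allowance:
            return pruned, threshold
        threshold *= step  # repeat the pruning with a different threshold

pruned, final_threshold = prune_layer([0.05, -0.02, 0.8, 0.3], allowance=0.06)
```

With these inputs the loop shrinks the threshold until only the -0.02 weight is pruned, since zeroing 0.05 as well would exceed the 0.06 allowance.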
-
Publication number: 20200349432

Abstract: A convolutional layer in a convolutional neural network uses a predetermined horizontal input stride and a predetermined vertical input stride that are greater than 1 while the hardware forming the convolutional layer operates using an input stride of 1. Each original weight kernel of a plurality of sets of original weight kernels is subdivided based on the predetermined horizontal and vertical input strides to form a set of a plurality of sub-kernels for each set of original weight kernels. Each of a plurality of IFMs is subdivided based on the predetermined horizontal and vertical input strides to form a plurality of sub-maps. Each sub-map is convolved by the corresponding sub-kernel for a set of original weight kernels using an input stride of 1. A convolved result of each sub-map and the corresponding sub-kernel is summed to form an output feature map.

Type: Application
Filed: July 14, 2020
Publication date: November 5, 2020
Inventor: John Wakefield BROTHERS
-
Patent number: 10776694

Abstract: A convolutional layer in a convolutional neural network uses a predetermined horizontal input stride and a predetermined vertical input stride that are greater than 1 while the hardware forming the convolutional layer operates using an input stride of 1. Each original weight kernel of a plurality of sets of original weight kernels is subdivided based on the predetermined horizontal and vertical input strides to form a set of a plurality of sub-kernels for each set of original weight kernels. Each of a plurality of IFMs is subdivided based on the predetermined horizontal and vertical input strides to form a plurality of sub-maps. Each sub-map is convolved by the corresponding sub-kernel for a set of original weight kernels using an input stride of 1. A convolved result of each sub-map and the corresponding sub-kernel is summed to form an output feature map.

Type: Grant
Filed: August 8, 2017
Date of Patent: September 15, 2020
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventor: John Wakefield Brothers
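The decomposition can be demonstrated for a single input feature map and kernel with equal horizontal and vertical stride s: a stride-s convolution equals the sum of s*s stride-1 convolutions between sub-maps X[a::s, b::s] and sub-kernels K[a::s, b::s]. This pure-Python sketch assumes valid padding and input/kernel dimensions divisible by s so all partial outputs align; names are illustrative:

```python
def subsample(grid, a, b, s):
    """Take every s-th row starting at a and every s-th column starting at b."""
    return [row[b::s] for row in grid[a::s]]

def conv_stride1(x, k):
    """Plain stride-1 valid cross-correlation, as the hardware provides."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + p][j + q] * k[p][q]
                 for p in range(kh) for q in range(kw))
             for j in range(ow)] for i in range(oh)]

def conv_strided_via_submaps(x, k, s):
    """Equivalent of a stride-s convolution using only stride-1 hardware."""
    out = None
    for a in range(s):
        for b in range(s):
            partial = conv_stride1(subsample(x, a, b, s),
                                   subsample(k, a, b, s))
            if out is None:
                out = partial
            else:  # sum the convolved results to form the output feature map
                out = [[o + p for o, p in zip(orow, prow)]
                       for orow, prow in zip(out, partial)]
    return out
```

Writing the kernel index as u = a + s*p and the input index as i*s + u shows each output element is recovered exactly: the (a, b) sub-kernel only ever meets input samples from the (a, b) sub-map.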
-
Publication number: 20190050735

Abstract: A method is disclosed to reduce the computational load of a deep neural network. A number of multiply-accumulate (MAC) operations is determined for each layer of the deep neural network. A pruning error allowance per weight is determined based on the computational load of each layer. For each layer of the deep neural network: a threshold estimator is initialized, and the weights of the layer are pruned based on a standard deviation of all weights within the layer. A pruning error per weight is determined for the layer, and if the pruning error per weight exceeds a predetermined threshold, the threshold estimator is updated for the layer, the weights of the layer are repruned using the updated threshold estimator, and the pruning error per weight is re-determined until the pruning error per weight is less than the threshold. The deep neural network is then retrained.

Type: Application
Filed: October 3, 2017
Publication date: February 14, 2019
Inventors: Zhengping JI, John Wakefield BROTHERS, Weiran DENG, Georgios GEORGIADIS
-
Publication number: 20180336462

Abstract: A convolutional layer in a convolutional neural network uses a predetermined horizontal input stride and a predetermined vertical input stride that are greater than 1 while the hardware forming the convolutional layer operates using an input stride of 1. Each original weight kernel of a plurality of sets of original weight kernels is subdivided based on the predetermined horizontal and vertical input strides to form a set of a plurality of sub-kernels for each set of original weight kernels. Each of a plurality of IFMs is subdivided based on the predetermined horizontal and vertical input strides to form a plurality of sub-maps. Each sub-map is convolved by the corresponding sub-kernel for a set of original weight kernels using an input stride of 1. A convolved result of each sub-map and the corresponding sub-kernel is summed to form an output feature map.

Type: Application
Filed: August 8, 2017
Publication date: November 22, 2018
Inventor: John Wakefield BROTHERS
-
Publication number: 20180232640

Abstract: An embodiment includes a method, comprising: pruning a layer of a neural network having multiple layers using a threshold; and repeating the pruning of the layer of the neural network using a different threshold until a pruning error of the pruned layer reaches a pruning error allowance.

Type: Application
Filed: April 14, 2017
Publication date: August 16, 2018
Inventors: Zhengping JI, John Wakefield BROTHERS, Ilia OVSIANNIKOV, Eunsoo SHIM
-
Publication number: 20180197081

Abstract: A system and method to reduce weight storage bits for a deep-learning network includes a quantizing module and a cluster-number reduction module. The quantizing module quantizes the neural weights of each quantization layer of the deep-learning network. The cluster-number reduction module reduces the predetermined number of clusters for the layer having a clustering error that is a minimum of the clustering errors of the plurality of quantization layers. The quantizing module requantizes the layer based on the reduced predetermined number of clusters for the layer, and the cluster-number reduction module then determines another layer having a clustering error that is a minimum of the clustering errors of the plurality of quantized layers and reduces the predetermined number of clusters for that layer, until the recognition performance of the deep-learning network has been reduced by a predetermined threshold.

Type: Application
Filed: March 20, 2017
Publication date: July 12, 2018
Inventors: Zhengping JI, John Wakefield BROTHERS
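A sketch of the cluster-count reduction loop: each layer's weights are quantized to k centroids, and the layer whose clustering error is currently smallest (i.e., the layer most tolerant of fewer clusters) has its count reduced. The simple 1-D k-means, the absolute-error metric, and the fixed step count standing in for the recognition-performance check are all simplifying assumptions:

```python
def kmeans_1d(values, k, iters=20):
    """Quantize a list of weights to k centroids; return (centroids, error)."""
    centroids = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iters):
        buckets = [[] for _ in centroids]
        for v in values:
            i = min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))
            buckets[i].append(v)
        centroids = [sum(b) / len(b) if b else c
                     for b, c in zip(buckets, centroids)]
    error = sum(min(abs(v - c) for c in centroids) for v in values)
    return centroids, error

def reduce_clusters(layers, k_init=8, steps=3):
    """layers: dict name -> list of weights. Returns per-layer cluster counts."""
    counts = {name: k_init for name in layers}
    for _ in range(steps):
        errors = {name: kmeans_1d(w, counts[name])[1]
                  for name, w in layers.items()}
        best = min(errors, key=errors.get)  # layer with minimum clustering error
        counts[best] = max(1, counts[best] - 1)
    return counts
```

Fewer clusters per layer means fewer bits per weight index, which is where the weight-storage saving comes from.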
-
Patent number: 9041720

Abstract: A circuit includes memory retiling methods that distribute image information among a plurality of memory channels, producing reconfigured image information distributed among a subset of the plurality of memory channels and allowing memory channels outside of the subset to be placed into a power-save mode to reduce power consumption. Additional methods are disclosed for further reductions in power consumption.

Type: Grant
Filed: December 18, 2009
Date of Patent: May 26, 2015
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Greg Sadowski, Warren Fritz Kruger, John Wakefield Brothers, III, David I.J. Glen, Stephen David Presant
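The retiling idea can be sketched in miniature: data striped across four memory channels is repacked onto two, leaving the other two empty so they can enter a power-save mode. Modeling each channel as a plain list is purely illustrative:

```python
def retile(channels, active_count):
    """Gather interleaved data from all channels and re-stripe it over the
    first `active_count` channels; the rest become empty and can power down."""
    # Preserve the original interleave order when gathering.
    flat = []
    longest = max(len(c) for c in channels)
    for i in range(longest):
        for c in channels:
            if i < len(c):
                flat.append(c[i])
    retiled = [[] for _ in channels]
    for i, word in enumerate(flat):
        retiled[i % active_count].append(word)
    return retiled

# Four channels holding an interleaved image, repacked onto two;
# channels 2 and 3 end up empty and may be placed in power-save mode.
out = retile([[0, 4], [1, 5], [2, 6], [3, 7]], active_count=2)
```

The trade-off is bandwidth for power: fewer active channels means less parallelism, which is acceptable when the display workload is light.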
-
Publication number: 20120066471

Abstract: A method and system are provided for associating one or more memory buffers in a computing system with a plurality of memory channels. The method and apparatus associates one or more memory buffers with a plurality of memory banks based on preferred performance settings, wherein the plurality of memory banks spans over one or more of the plurality of memory channels. Additionally, the method and apparatus accesses the one or more memory buffers based on the preferred performance settings. Further, the method and apparatus can, in response to accessing the one or more memory buffers based on the preferred performance settings, determine whether the preferred performance settings are being satisfied.

Type: Application
Filed: November 22, 2011
Publication date: March 15, 2012
Applicant: ADVANCED MICRO DEVICES, INC.
Inventors: Greg Sadowski, Philip J. Rogers, John Wakefield Brothers, III, W. Fritz Kruger, Konstantine I. Iourcha
-
Publication number: 20110148923

Abstract: A circuit includes a memory retiling circuit. The memory retiling circuit moves image information configured to be distributed among a plurality of memory channels into reconfigured image information configured to be distributed among a subset of the plurality of memory channels.

Type: Application
Filed: December 18, 2009
Publication date: June 23, 2011
Applicant: Advanced Micro Devices, Inc.
Inventors: Greg SADOWSKI, Warren Fritz KRUGER, John Wakefield BROTHERS, III, David I.J. GLEN, Stephen David PRESANT