Patents by Inventor John Wakefield BROTHERS
John Wakefield BROTHERS has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240126602
Abstract: A processor to execute a plurality of tasks comprising a first task and a second task. At least a part of the first task is to be executed simultaneously with at least a part of the second task. The processor comprises a handling unit to: determine an available portion of a storage available during execution of the part of the first task; determine a mapping between at least one logical address associated with data associated with the part of the second task and a corresponding at least one physical address of the storage corresponding to the available portion; and identify, based on the mapping, the at least one physical address corresponding to the at least one logical address associated with the data, for storing the data in the available portion of the storage.
Type: Application
Filed: October 17, 2022
Publication date: April 18, 2024
Inventors: Jens OLSON, John Wakefield BROTHERS, III
-
Patent number: 11948069
Abstract: A processor arranged to compress neural network activation data comprising an input module for obtaining neural network activation data. The processor also comprises a block creation module arranged to split the neural network activation data into a plurality of blocks; and a metadata generation module for generating metadata associated with at least one of the plurality of blocks. Based on the metadata generated a selection module selects a compression scheme for each of the plurality of blocks, and a compression module for applying the selected compression scheme to the corresponding block to produce compressed neural network activation data. An output module is also provided for outputting the compressed neural network activation data.
Type: Grant
Filed: July 22, 2019
Date of Patent: April 2, 2024
Assignee: Arm Limited
Inventors: Lingchuan Meng, John Wakefield Brothers, III, Jens Olson, Jared Corey Smolens, Eric Kunze, Ian Rudolf Bratt
-
Publication number: 20240048152
Abstract: Systems and methods for processing data for a neural network are described. The system comprises non-transitory memory configured to receive data bits defining a kernel of weights, the data bits being suitable for processing input data; and a data processing unit, configured to: receive bits defining a kernel of weights for the neural network, the kernel of weights comprising one or more non-zero value weights and one or more zero-valued weights; generate a set of mask bits, a position of each bit in the set of mask bits corresponds to a position within the kernel of weights and the value of each bit indicates whether a weight in the corresponding position is a zero-valued weight or a non-zero value weight; and transmit the non-zero value weights and the set of mask bits for storage, the non-zero value weights and the set of mask bits represent the kernel of weights.
Type: Application
Filed: August 3, 2022
Publication date: February 8, 2024
Inventor: John Wakefield BROTHERS, III
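Illustrative sketch (not part of the patent record): the mask-bit representation described in this abstract can be pictured with a few lines of Python. The function names `encode_kernel` and `decode_kernel`, the flattened kernel layout, and the choice of 1 to mark a non-zero weight are assumptions made for this example only; the abstract does not specify them.

```python
from typing import List, Tuple


def encode_kernel(weights: List[int]) -> Tuple[List[int], List[int]]:
    """Split a flattened kernel into mask bits and its non-zero weights.

    Assumption: a mask bit of 1 marks a non-zero weight, 0 marks a zero weight.
    """
    mask_bits = [1 if w != 0 else 0 for w in weights]
    nonzero = [w for w in weights if w != 0]
    return mask_bits, nonzero


def decode_kernel(mask_bits: List[int], nonzero: List[int]) -> List[int]:
    """Rebuild the flattened kernel from mask bits and non-zero weights."""
    values = iter(nonzero)
    # Consume one stored weight for every set mask bit, emit 0 otherwise.
    return [next(values) if bit else 0 for bit in mask_bits]


# A hypothetical 3x3 kernel, flattened row by row.
kernel = [0, 3, 0, 0, -2, 0, 0, 0, 7]
mask_bits, nonzero = encode_kernel(kernel)
restored = decode_kernel(mask_bits, nonzero)
```

Only `mask_bits` and `nonzero` would need to be stored; together they are enough to recover the original kernel, which is the property the abstract relies on.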
-
Publication number: 20240036919
Abstract: A method and processor comprising a command processing unit to receive, from a host processor, a sequence of commands to be executed; and generate based on the sequence of commands a plurality of tasks. The processor also comprises a plurality of compute units each having a first processing module for executing tasks of a first task type, a second processing module for executing tasks of a second task type, different from the first task type, and a local cache shared by at least the first processing module and the second processing module. The command processing unit issues the plurality of tasks to at least one of the plurality of compute units, and wherein at least one of the plurality of compute units is to process at least one of the plurality of tasks.
Type: Application
Filed: July 26, 2023
Publication date: February 1, 2024
Applicant: Arm Limited
Inventors: Alexander Eugene Chalfin, John Wakefield Brothers, III, Rune Holm, Samuel James Edward Martin
-
Patent number: 11874793
Abstract: The present disclosure relates generally to multi-processor arrangements and, more particularly, to broadcast hubs for multi-processor arrangements. A processing tile may comprise a broadcast hub to obtain a plurality of parameters applicable in a particular operation from at least one of a plurality of processing tiles and initiate distribution of the plurality of parameters to the plurality of processing tiles, wherein the plurality of processing tiles may execute the particular operation based at least in part on the plurality of distributed parameters.
Type: Grant
Filed: March 30, 2022
Date of Patent: January 16, 2024
Assignee: Arm Limited
Inventors: Erik Persson, Graeme Leslie Ingram, Rune Holm, John Wakefield Brothers, III
-
Patent number: 11842273
Abstract: To perform neural network processing to modify an input data array to generate a corresponding output data array using a filter comprising an array of weight data, at least one of the input data array and the filter are subdivided into a plurality of portions, a plurality of neural network processing passes using the portions are performed, and the output generated by each processing pass is combined to provide the output data array.
Type: Grant
Filed: September 23, 2020
Date of Patent: December 12, 2023
Assignee: Arm Limited
Inventors: John Wakefield Brothers, III, Rune Holm, Elliott Maurice Simon Rosemarine
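Illustrative sketch (not part of the patent record): one way to picture the multi-pass idea in this abstract is a 1-D convolution whose filter is cut into portions, with one pass per portion and the partial outputs summed. The 1-D setting, the names `conv1d` and `conv1d_in_passes`, the `portion_len` parameter, and the use of element-wise summation as the combining step are all assumptions for this example; the abstract only says that the per-pass outputs are combined.

```python
from typing import List


def conv1d(x: List[int], w: List[int]) -> List[int]:
    """Plain 'valid' 1-D convolution (cross-correlation form), stride 1."""
    n_out = len(x) - len(w) + 1
    return [sum(w[k] * x[i + k] for k in range(len(w))) for i in range(n_out)]


def conv1d_in_passes(x: List[int], w: List[int], portion_len: int) -> List[int]:
    """Same result as conv1d, computed one filter portion per pass."""
    n_out = len(x) - len(w) + 1
    out = [0] * n_out
    for start in range(0, len(w), portion_len):
        w_part = w[start:start + portion_len]
        # Input slice that this filter portion touches across all outputs.
        x_part = x[start:start + n_out + len(w_part) - 1]
        partial = conv1d(x_part, w_part)
        # Combine this pass with the running output by element-wise addition.
        out = [a + b for a, b in zip(out, partial)]
    return out


x = [1, 2, 3, 4, 5, 6]
w = [1, 0, -1, 2]
full = conv1d(x, w)
split = conv1d_in_passes(x, w, portion_len=2)
```

Each pass works on a smaller filter portion and a smaller input slice than the full operation, which is the kind of saving that subdividing is meant to give when the whole filter does not fit in local storage.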
-
Publication number: 20230315670
Abstract: The present disclosure relates generally to multi-processor arrangements and, more particularly, to broadcast regions for multi-processor arrangements.
Type: Application
Filed: March 30, 2022
Publication date: October 5, 2023
Inventors: Erik Persson, Graeme Leslie Ingram, Rune Holm, John Wakefield Brothers, III
-
Publication number: 20230315677
Abstract: The present disclosure relates generally to multi-processor arrangements and, more particularly, to broadcast hubs for multi-processor arrangements. A processing tile may comprise a broadcast hub to obtain a plurality of parameters applicable in a particular operation from at least one of a plurality of processing tiles and initiate distribution of the plurality of parameters to the plurality of processing tiles, wherein the plurality of processing tiles may execute the particular operation based at least in part on the plurality of distributed parameters.
Type: Application
Filed: March 30, 2022
Publication date: October 5, 2023
Inventors: Erik Persson, Graeme Leslie Ingram, Rune Holm, John Wakefield Brothers, III
-
Publication number: 20230315669
Abstract: The present disclosure relates generally to multi-processor arrangements and, more particularly, to a point of serialization for broadcast communications within multi-processor arrangements.
Type: Application
Filed: March 30, 2022
Publication date: October 5, 2023
Inventors: Erik Persson, Graeme Leslie Ingram, Rune Holm, John Wakefield Brothers, III
-
Patent number: 11755908
Abstract: A system and method to reduce weight storage bits for a deep-learning network includes a quantizing module and a cluster-number reduction module. The quantizing module quantizes neural weights of each quantization layer of the deep-learning network. The cluster-number reduction module reduces the predetermined number of clusters for a layer having a clustering error that is a minimum of the clustering errors of the plurality of quantization layers. The quantizing module requantizes the layer based on the reduced predetermined number of clusters for the layer and the cluster-number reduction module further determines another layer having a clustering error that is a minimum of the clustering errors of the plurality of quantized layers and reduces the predetermined number of clusters for the another layer until a recognition performance of the deep-learning network has been reduced by a predetermined threshold.
Type: Grant
Filed: May 9, 2022
Date of Patent: September 12, 2023
Inventors: Zhengping Ji, John Wakefield Brothers
-
Patent number: 11741349
Abstract: When performing a matrix-vector multiply operation for neural network processing, a set of one or more input vectors to be multiplied by a matrix of data values is scanned to identify data positions of the input vector(s) for which the data value is non-zero in at least one of the input vectors. For each of the data positions identified as having a non-zero value in at least one of the input vectors, the set of data values from the matrix of data values for that data position is fetched from memory and the matrix-vector multiply operation is performed using the data values for the input vectors for the data positions identified as being non-zero and the fetched set(s) of data values from the matrix of data values for those data position(s).
Type: Grant
Filed: October 31, 2019
Date of Patent: August 29, 2023
Assignee: Arm Limited
Inventors: Rune Holm, John Wakefield Brothers, III
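Illustrative sketch (not part of the patent record): the scan-then-fetch scheme in this abstract can be mimicked in software as below. The function name `matvec_skipping_zero_positions`, the row-major list-of-lists matrix, and treating "fetching from memory" as reading one matrix column are assumptions for this example, not details taken from the patent.

```python
from typing import List, Tuple


def matvec_skipping_zero_positions(
    matrix: List[List[int]], vectors: List[List[int]]
) -> Tuple[List[List[int]], List[int]]:
    """Multiply `matrix` (rows x cols) by each vector, touching only the
    columns whose position is non-zero in at least one input vector.

    Returns the output vectors and the list of positions that were fetched.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Scan step: positions that are non-zero in at least one input vector.
    active = [j for j in range(n_cols) if any(v[j] != 0 for v in vectors)]
    outputs = [[0] * n_rows for _ in vectors]
    for j in active:
        # Stand-in for fetching this position's matrix values from memory.
        column = [matrix[r][j] for r in range(n_rows)]
        for out, v in zip(outputs, vectors):
            for r in range(n_rows):
                out[r] += column[r] * v[j]
    return outputs, active


matrix = [[1, 2, 3, 4],
          [5, 6, 7, 8]]
vectors = [[0, 1, 0, 0],
           [0, 2, 0, 3]]
outputs, active = matvec_skipping_zero_positions(matrix, vectors)
```

In this example only positions 1 and 3 are fetched, so half of the matrix columns are never read, and the results still match a dense multiply because the skipped positions contribute zero for every vector.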
-
Patent number: 11669736
Abstract: When performing neural network processing, the order in which the neural network processing is to be performed is determined, and the order in which weight values to be used for the neural network processing will be used is determined based on the determined order of the neural network processing. The weight values are then provided to the processor that is to perform the neural network processing in the determined order for the weight values, with the processor, when performing the neural network processing, then using the weight values in the determined order that they are provided to the processor.
Type: Grant
Filed: March 31, 2020
Date of Patent: June 6, 2023
Assignee: Arm Limited
Inventor: John Wakefield Brothers, III
-
Patent number: 11561795
Abstract: Herein described is a method of operating an accumulation process in a data processing apparatus. The accumulation process comprises a plurality of accumulations which output a respective plurality of accumulated values, each based on a stored value and a computed value generated by a data processing operation. The method comprises storing a first accumulated value, the first accumulated value being one of said plurality of accumulated values, into a first storage device comprising a plurality of single-bit storage elements; determining that a predetermined trigger has been satisfied with respect to the accumulation process; and in response to the determining, storing at least a portion of a second accumulated value, the second accumulated value being one of said plurality of accumulated values, into a second storage device.
Type: Grant
Filed: March 30, 2020
Date of Patent: January 24, 2023
Assignee: Arm Limited
Inventors: Jens Olson, John Wakefield Brothers, III, Jared Corey Smolens, Chi-wen Cheng, Daren Croxford, Sharjeel Saeed, Dominic Hugo Symes
-
Patent number: 11537860
Abstract: A neural network processor is disclosed that includes a combined convolution and pooling circuit that can perform both convolution and pooling operations. The circuit can perform a convolution operation by a multiply circuit determining products of corresponding input feature map and convolution kernel weight values, and an add circuit accumulating the products determined by the multiply circuit in storage. The circuit can perform an average pooling operation by the add circuit accumulating input feature map data values in the storage, a divisor circuit determining a divisor value, and a division circuit dividing the data value accumulated in the storage by the determined divisor value. The circuit can perform a maximum pooling operation by a maximum circuit determining a maximum value of input feature map data values, and storing the determined maximum value in the storage.
Type: Grant
Filed: March 23, 2020
Date of Patent: December 27, 2022
Assignee: Arm Limited
Inventors: Rune Holm, John Wakefield Brothers, III
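Illustrative sketch (not part of the patent record): the three operating modes in this abstract share one accumulate-in-storage pattern, which the Python below imitates for a single window of input feature map values. The function names, the flat list representation of a window, and using the number of values in the window as the divisor are assumptions for this example; the abstract does not say how the divisor circuit determines its value.

```python
from typing import List


def convolve_window(ifm_values: List[int], weights: List[int]) -> int:
    """Convolution mode: multiply each input by its weight, accumulate."""
    storage = 0
    for x, w in zip(ifm_values, weights):
        product = x * w      # role of the multiply circuit
        storage += product   # role of the add circuit
    return storage


def average_pool_window(ifm_values: List[int]) -> float:
    """Average pooling mode: accumulate inputs, then divide by a divisor."""
    storage = 0
    for x in ifm_values:
        storage += x         # same add-and-accumulate path, no multiply
    divisor = len(ifm_values)  # assumed divisor: number of values in window
    return storage / divisor


def max_pool_window(ifm_values: List[int]) -> int:
    """Maximum pooling mode: keep the running maximum in storage."""
    storage = ifm_values[0]
    for x in ifm_values[1:]:
        storage = max(storage, x)
    return storage


window = [1, 2, 3, 4]
conv_result = convolve_window(window, [1, 0, -1, 2])
avg_result = average_pool_window(window)
max_result = max_pool_window(window)
```

The point of the sketch is only that average pooling can reuse the accumulation step of convolution, with a final division added, while maximum pooling swaps the addition for a comparison.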
-
Publication number: 20220269941
Abstract: A system and method to reduce weight storage bits for a deep-learning network includes a quantizing module and a cluster-number reduction module. The quantizing module quantizes neural weights of each quantization layer of the deep-learning network. The cluster-number reduction module reduces the predetermined number of clusters for a layer having a clustering error that is a minimum of the clustering errors of the plurality of quantization layers. The quantizing module requantizes the layer based on the reduced predetermined number of clusters for the layer and the cluster-number reduction module further determines another layer having a clustering error that is a minimum of the clustering errors of the plurality of quantized layers and reduces the predetermined number of clusters for the another layer until a recognition performance of the deep-learning network has been reduced by a predetermined threshold.
Type: Application
Filed: May 9, 2022
Publication date: August 25, 2022
Inventors: Zhengping JI, John Wakefield BROTHERS
-
Publication number: 20220237461
Abstract: A convolutional layer in a convolutional neural network uses a predetermined horizontal input stride and a predetermined vertical input stride that are greater than 1 while the hardware forming the convolutional layer operates using an input stride of 1. Each original weight kernel of a plurality of sets of original weight kernels is subdivided based on the predetermined horizontal and vertical input strides to form a set of a plurality of sub-kernels for each set of original weight kernels. Each of a plurality of IFMs is subdivided based on the predetermined horizontal and vertical input strides to form a plurality of sub-maps. Each sub-map is convolved by the corresponding sub-kernel for a set of original weight kernels using an input stride of 1. A convolved result of each sub-map and the corresponding sub-kernel is summed to form an output feature map.
Type: Application
Filed: April 15, 2022
Publication date: July 28, 2022
Inventor: John Wakefield BROTHERS
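Illustrative sketch (not part of the patent record): the sub-kernel and sub-map decomposition in this abstract can be checked numerically for a single input feature map and a single kernel. The function names `conv2d` and `strided_conv_via_subkernels`, the "valid" (no padding) convention, and the single-channel setting are assumptions for this example; the abstract covers multiple IFMs and sets of kernels.

```python
from typing import List

Matrix = List[List[int]]


def conv2d(x: Matrix, k: Matrix, stride_y: int = 1, stride_x: int = 1) -> Matrix:
    """Direct 'valid' 2-D convolution (cross-correlation form)."""
    kh, kw = len(k), len(k[0])
    out_h = (len(x) - kh) // stride_y + 1
    out_w = (len(x[0]) - kw) // stride_x + 1
    return [[sum(k[u][v] * x[i * stride_y + u][j * stride_x + v]
                 for u in range(kh) for v in range(kw))
             for j in range(out_w)] for i in range(out_h)]


def strided_conv_via_subkernels(x: Matrix, k: Matrix,
                                stride_y: int, stride_x: int) -> Matrix:
    """Same result as conv2d with strides, using only stride-1 convolutions."""
    kh, kw = len(k), len(k[0])
    out_h = (len(x) - kh) // stride_y + 1
    out_w = (len(x[0]) - kw) // stride_x + 1
    out = [[0] * out_w for _ in range(out_h)]
    for a in range(min(stride_y, kh)):
        for b in range(min(stride_x, kw)):
            # Take every stride-th row and column, starting at offset (a, b).
            sub_kernel = [row[b::stride_x] for row in k[a::stride_y]]
            sub_map = [row[b::stride_x] for row in x[a::stride_y]]
            partial = conv2d(sub_map, sub_kernel)  # input stride of 1
            # Sum the partial results; a partial may be larger than the
            # output, so only its top-left out_h x out_w region is used.
            for i in range(out_h):
                for j in range(out_w):
                    out[i][j] += partial[i][j]
    return out


ifm = [[(3 * i + 5 * j) % 11 for j in range(7)] for i in range(7)]
kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
direct = conv2d(ifm, kernel, stride_y=2, stride_x=2)
decomposed = strided_conv_via_subkernels(ifm, kernel, 2, 2)
```

With strides of 2 in both directions the 3x3 kernel splits into four sub-kernels of sizes 2x2, 2x1, 1x2 and 1x1, and `decomposed` equals `direct`.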
-
Patent number: 11392825
Abstract: A system and method to reduce weight storage bits for a deep-learning network includes a quantizing module and a cluster-number reduction module. The quantizing module quantizes neural weights of each quantization layer of the deep-learning network. The cluster-number reduction module reduces the predetermined number of clusters for a layer having a clustering error that is a minimum of the clustering errors of the plurality of quantization layers. The quantizing module requantizes the layer based on the reduced predetermined number of clusters for the layer and the cluster-number reduction module further determines another layer having a clustering error that is a minimum of the clustering errors of the plurality of quantized layers and reduces the predetermined number of clusters for the another layer until a recognition performance of the deep-learning network has been reduced by a predetermined threshold.
Type: Grant
Filed: March 20, 2017
Date of Patent: July 19, 2022
Inventors: Zhengping Ji, John Wakefield Brothers
-
Patent number: 11373097
Abstract: A convolutional layer in a convolutional neural network uses a predetermined horizontal input stride and a predetermined vertical input stride that are greater than 1 while the hardware forming the convolutional layer operates using an input stride of 1. Each original weight kernel of a plurality of sets of original weight kernels is subdivided based on the predetermined horizontal and vertical input strides to form a set of a plurality of sub-kernels for each set of original weight kernels. Each of a plurality of IFMs is subdivided based on the predetermined horizontal and vertical input strides to form a plurality of sub-maps. Each sub-map is convolved by the corresponding sub-kernel for a set of original weight kernels using an input stride of 1. A convolved result of each sub-map and the corresponding sub-kernel is summed to form an output feature map.
Type: Grant
Filed: July 14, 2020
Date of Patent: June 28, 2022
Inventor: John Wakefield Brothers
-
Publication number: 20220198243
Abstract: A method of processing input data for a given layer of a neural network using a data processing system comprising compute resources for performing convolutional computations is described. The input data comprises a given set of input feature maps, IFMs, and a given set of filters. The method comprises generating a set of part-IFMs including pluralities of part-IFMs which correspond to respective IFMs, of the given set of IFMs. The method further includes grouping part-IFMs in the set of part-IFMs into a set of selections of part-IFMs. The method further includes convolving, by respective compute resources of the data processing system, the set of selections with the given set of filters to compute a set of part-output feature maps. A data processing system for processing input data for a given layer of a neural network is also described.
Type: Application
Filed: December 23, 2020
Publication date: June 23, 2022
Inventors: John Wakefield BROTHERS, III, Kartikeya BHARDWAJ, Alexander Eugene CHALFIN, Danny Daysang LOH
-
Publication number: 20220092409
Abstract: To perform neural network processing to modify an input data array to generate a corresponding output data array using a filter comprising an array of weight data, at least one of the input data array and the filter are subdivided into a plurality of portions, a plurality of neural network processing passes using the portions are performed, and the output generated by each processing pass is combined to provide the output data array.
Type: Application
Filed: September 23, 2020
Publication date: March 24, 2022
Applicant: Arm Limited
Inventors: John Wakefield Brothers, III, Rune Holm, Elliott Maurice Simon Rosemarine