Patents by Inventor Sharjeel SAEED
Sharjeel SAEED has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11995475
Abstract: An information processing apparatus is described for processing a workload. The information processing apparatus comprises a processor and a memory element connected to the processor via a data link. In advance of processing a workload, the information processing apparatus estimates an access time required to transfer an amount of the workload that is to be transferred from the external memory element to the processor, and estimates a processing time for the processor to process the workload. A processing rate characteristic of the processor and/or a data transfer rate between the memory and the processor is set in dependence upon the estimated processing time and estimated access time. Methods for varying a quality of service (QoS) value of requests to the external memory element are also described.
Type: Grant
Filed: October 28, 2020
Date of Patent: May 28, 2024
Assignee: Arm Limited
Inventors: Daren Croxford, Sharjeel Saeed, Jayavarapu Srinivasa Rao, Aaron Debattista
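The estimate-then-set idea in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `Workload` fields, the linear time estimates, and the policy of picking the lowest clock that keeps processing from taking longer than the memory transfer.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    bytes_to_transfer: int  # data that must come from external memory (hypothetical field)
    operations: int         # compute operations the processor must perform (hypothetical field)


def estimate_access_time(w: Workload, bandwidth_bytes_per_s: float) -> float:
    """Estimated seconds needed to move the workload's data over the data link."""
    return w.bytes_to_transfer / bandwidth_bytes_per_s


def estimate_processing_time(w: Workload, ops_per_cycle: int, clock_hz: int) -> float:
    """Estimated seconds needed to process the workload at a given clock."""
    return w.operations / (ops_per_cycle * clock_hz)


def choose_clock(w: Workload, bandwidth_bytes_per_s: float,
                 ops_per_cycle: int, available_clocks_hz: list) -> int:
    """Assumed policy: lowest clock whose processing time fits within the
    access time; if none fits, the workload is compute-bound, so use the highest."""
    access = estimate_access_time(w, bandwidth_bytes_per_s)
    for clock in sorted(available_clocks_hz):
        if estimate_processing_time(w, ops_per_cycle, clock) <= access:
            return clock
    return max(available_clocks_hz)
```

Under this policy a memory-bound workload runs at a reduced clock because a faster processor would only wait on the data link, while a compute-bound workload falls back to the highest available clock.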
-
Patent number: 11928581
Abstract: A method of compressing kernels comprising detecting a plurality of replicated kernels. The plurality of replicated kernels comprise kernels. The method also comprises generating a composite kernel from the replicated kernels. The composite kernel comprises kernel data and meta data indicative of the rotations applied to the composite kernel data. The method also comprises storing a composite kernel.
Type: Grant
Filed: September 14, 2018
Date of Patent: March 12, 2024
Assignee: Arm Limited
Inventors: Daren Croxford, Jayavarapu Srinivasa Rao, Sharjeel Saeed
-
Publication number: 20240036932
Abstract: Disclosed herein is a graphics processor that comprises a programmable execution unit operable to execute programs to perform graphics processing operations. The graphics processor further comprises a dedicated machine learning processing circuit operable to perform processing operations for machine learning processing tasks. The machine learning processing circuit is in communication with the programmable execution unit internally to the graphics processor. In this way, the graphics processor can be configured such that machine learning processing tasks can be performed by the programmable execution unit, the machine learning processing circuit, or a combination of both, with the different units being able to message each other accordingly to control the processing.
Type: Application
Filed: July 26, 2023
Publication date: February 1, 2024
Applicant: Arm Limited
Inventors: Daren Croxford, Sharjeel Saeed, Isidoros Sideris
-
Publication number: 20240037835
Abstract: There is provided an apparatus configured to operate as a shader core, the shader core configured to perform a complex rendering process comprising a rendering process and a machine learning process, the shader core comprising: one or more tile buffers configured to store data locally to the shader core, wherein during the rendering process, the one or more tile buffers are configured to store rendered fragment data relating to a tile; and during the machine learning process, the one or more tile buffers are configured to store an input feature map, kernel weights or an output feature map relating to the machine learning process.
Type: Application
Filed: July 31, 2023
Publication date: February 1, 2024
Inventors: Daren CROXFORD, Sharjeel SAEED, Isidoros SIDERIS
-
Publication number: 20240036949
Abstract: There is provided a processor configured to transfer data to a plurality of processor circuits. The apparatus includes broadcast circuitry that broadcasts first machine learning data to at least a subset of the plurality of processor circuits.
Type: Application
Filed: July 31, 2023
Publication date: February 1, 2024
Inventors: Daren CROXFORD, Sharjeel SAEED, Isidoros SIDERIS
-
Patent number: 11824977
Abstract: A data processing system including storage. The data processing system also includes at least one processor to generate output data using at least a portion of a first neural network layer and generate a key associated with at least the portion of the first neural network layer. The at least one processor is further operable to obtain the key from the storage and obtain a version of the output data for input into a second neural network layer. Using the key, the at least one processor is further operable to determine whether the version of the output data differs from the output data.
Type: Grant
Filed: July 28, 2020
Date of Patent: November 21, 2023
Assignee: Arm Limited
Inventors: Sharjeel Saeed, Daren Croxford, Dominic Hugo Symes
-
Patent number: 11798221
Abstract: In a graphics processing system comprising a graphics processor, a main memory, and a memory management unit, when rendering a frame that represents a view of a scene comprising one or more objects using a ray tracing process and the ray tracing process requires a traversal of a ray tracing acceleration data structure indicative of the distribution of geometry for the scene being rendered to determine geometry for the scene that may be intersected by a ray, at least part of the traversal of the ray tracing acceleration data structure is performed by the memory management unit (MMU).
Type: Grant
Filed: October 27, 2021
Date of Patent: October 24, 2023
Assignee: Arm Limited
Inventors: Daren Croxford, Mathieu Jean Joseph Robart, Sharjeel Saeed
-
Publication number: 20230316063
Abstract: An input data array is subjected to neural network processing to generate a result of the neural network processing for the input data array. A perturbation is applied to a part (but not all of) the input data array, with neural network processing then performed using the so-perturbed version of the input data array. However only some (and not all) of the perturbed version is subjected to neural network processing, based on the part of the input data array to which the perturbation has been applied. The result of the neural network processing of the perturbed version of the input data array is compared with the result of the neural network processing of the input data array without the perturbation, to determine whether the perturbation of the input data array has an effect on the result of the neural network processing.
Type: Application
Filed: March 30, 2022
Publication date: October 5, 2023
Applicant: Arm Limited
Inventors: Rachel Jean Trimble, Sharjeel Saeed, Daren Croxford
-
Publication number: 20230252264
Abstract: When executing a neural network comprising a sequence of plural layers of neural network processing in which at least one of the layers of the sequence of plural layers of the neural network is followed by two or more branches of neural network processing, each branch comprising a different sequence of one or more layers of neural network processing, the branch or branches to use for the neural network processing following the layer of the neural network that is followed by the two or more branches of neural network processing is selected based on a property or properties of the output feature map from the layer that is followed by the two or more branches.
Type: Application
Filed: February 10, 2022
Publication date: August 10, 2023
Applicant: Arm Limited
Inventors: Daren Croxford, Rachel Jean Trimble, Sharjeel Saeed, Roberto Lopez Mendez
-
Publication number: 20230126531
Abstract: In a graphics processing system comprising a graphics processor, a main memory, and a memory management unit, when rendering a frame that represents a view of a scene comprising one or more objects using a ray tracing process and the ray tracing process requires a traversal of a ray tracing acceleration data structure indicative of the distribution of geometry for the scene being rendered to determine geometry for the scene that may be intersected by a ray, at least part of the traversal of the ray tracing acceleration data structure is performed by the memory management unit (MMU).
Type: Application
Filed: October 27, 2021
Publication date: April 27, 2023
Applicant: Arm Limited
Inventors: Daren Croxford, Mathieu Jean Joseph Robart, Sharjeel Saeed
-
Patent number: 11625578
Abstract: A method, apparatus and computer readable medium for processing input data using a neural network comprising at least a first layer and a second layer. The method comprises the steps of applying a partitioning scheme to the input data, to partition the input data into a plurality of blocks, each block representing a portion of the input data. At the first layer of the neural network, the blocks of the input data are processed in a first order to generate intermediary data, wherein the intermediary data is partitioned into a plurality of intermediary blocks. At the second layer of the neural network, the intermediary blocks are processed in a second order, wherein the second order differs from the first order.
Type: Grant
Filed: March 30, 2020
Date of Patent: April 11, 2023
Assignee: ARM Limited
Inventors: Sharjeel Saeed, Aaron DeBattista, Daren Croxford
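The abstract states only that the second layer's processing order differs from the first. As a rough sketch of why a different order might matter, the toy model below assumes a small first-in-first-out local buffer for intermediary blocks and compares repeating the first layer's order with reversing it. The buffer model, the function name `layer2_buffer_hits`, and the choice of reversal are illustrative assumptions, not details from the patent.

```python
from collections import OrderedDict


def layer2_buffer_hits(num_blocks, buffer_slots, second_order):
    """Count how many intermediary blocks the second layer finds in the buffer.

    Assumed model: the first layer writes blocks 0..num_blocks-1 in ascending
    order into a FIFO buffer with `buffer_slots` entries; a second-layer miss
    reloads the block into the buffer, evicting the oldest entry.
    """
    buffer = OrderedDict()  # block index -> present flag, oldest entry first

    def store(block):
        buffer[block] = True
        if len(buffer) > buffer_slots:
            buffer.popitem(last=False)  # evict the oldest block

    # First layer: produce intermediary blocks in ascending order.
    for block in range(num_blocks):
        store(block)

    # Second layer: consume intermediary blocks in the requested order.
    hits = 0
    for block in second_order:
        if block in buffer:
            hits += 1
        else:
            store(block)
    return hits
```

In this model, with 8 blocks and 3 buffer slots, replaying the ascending order finds none of the blocks still buffered, whereas the reversed order starts with the 3 most recently written blocks and hits on each of them.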
-
Publication number: 20230089112
Abstract: There is provided a data processing apparatus for performing machine learning. The data processing apparatus includes convolution circuitry for convolving a plurality of neighbouring regions of input data using a kernel to produce convolution outputs. Max-pooling circuitry determines and selects the largest of the convolution outputs as a pooled output and prediction circuitry performs a size prediction of the convolution outputs based on the neighbouring regions, wherein the size prediction is performed prior to the max-pooling circuitry determining the largest of the convolution outputs and adjusts a behaviour of the convolution circuitry based on the size prediction.
Type: Application
Filed: September 20, 2021
Publication date: March 23, 2023
Inventors: Daren CROXFORD, Sharjeel SAEED, Rachel Jean TRIMBLE
-
Publication number: 20230079975
Abstract: A system-on-chip comprises processing circuitry to process input data to generate output data, and power management circuitry to control power management policy for at least a portion of the system-on-chip. The power management circuitry controls the power management policy depending on metadata indicative of a property of the input data to be processed by the processing circuitry.
Type: Application
Filed: September 10, 2021
Publication date: March 16, 2023
Inventors: Sharjeel SAEED, Daren CROXFORD, Rachel Jean TRIMBLE, Jayavarapu Srinivasa RAO, Sidhartha TANEJA
-
Publication number: 20230040673
Abstract: A method for optimizing machine learning processing is provided. The method comprises retrieving neural network architecture information for a neural network, the neural network architecture information comprising layer information and kernel information for the neural network. The network architecture information is analyzed to identify convolutional layers in the neural network which have associated strided layers. A first kernel for a convolutional layer identified as having an associated strided layer, and a second kernel for the strided layer associated with the convolutional layer are retrieved. A composite kernel is then generated, based on the first and second kernel, that performs the functions of the first and second kernel. Finally, the composite kernel is stored for further use by a neural network.
Type: Application
Filed: July 28, 2021
Publication date: February 9, 2023
Inventors: Daren CROXFORD, Sharjeel SAEED, Rachel Jean TRIMBLE
-
Patent number: 11561795
Abstract: Herein described is a method of operating an accumulation process in a data processing apparatus. The accumulation process comprises a plurality of accumulations which output a respective plurality of accumulated values, each based on a stored value and a computed value generated by a data processing operation. The method comprises storing a first accumulated value, the first accumulated value being one of said plurality of accumulated values, into a first storage device comprising a plurality of single-bit storage elements; determining that a predetermined trigger has been satisfied with respect to the accumulation process; and in response to the determining, storing at least a portion of a second accumulated value, the second accumulated value being one of said plurality of accumulated values, into a second storage device.
Type: Grant
Filed: March 30, 2020
Date of Patent: January 24, 2023
Assignee: Arm Limited
Inventors: Jens Olson, John Wakefield Brothers, III, Jared Corey Smolens, Chi-wen Cheng, Daren Croxford, Sharjeel Saeed, Dominic Hugo Symes
-
Patent number: 11514312
Abstract: Aspects of the present disclosure relate to a computer-implemented method of processing data portion. The method comprises processing a first data portion in a convolutional neural network to generate a first input to an activation function in the convolutional neural network; providing a first output by applying the activation function to the first input; and storing an indicator, representative of the first input to the activation function, for the first data portion. The method further comprises determining whether to provide a second output by applying the activation function to a second input, generated from a second data portion, based at least in part on an evaluation of the indicator for the first data portion.
Type: Grant
Filed: September 3, 2019
Date of Patent: November 29, 2022
Assignee: ARM LIMITED
Inventors: Daren Croxford, Sharjeel Saeed
-
Patent number: 11423117
Abstract: A computer implemented method for performing convolutions between subsets of an input data array and a kernel resulting in subsets of an output data array. The method may include receiving an input data array and using positional data indicating the position of elements of the input data array to determine subsets of the input data array which contains at least one non-zero value data element; performing convolutions between the subsets of the input data array containing at least one non-zero value data element and a kernel to produce output data array subsets; and combining the output data subsets with the positional data to generate output data indicative of a completed output data array.
Type: Grant
Filed: August 27, 2019
Date of Patent: August 23, 2022
Assignees: ARM LIMITED, APICAL LIMITED
Inventors: Sharjeel Saeed, Daren Croxford, Davide Marani, Jayavarapu Srinivasa Rao
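A minimal pure-Python sketch of the general idea, skipping convolution for input subsets that contain no non-zero element, is given below. The tiling scheme, the use of a coordinate set as the "positional data", and the names `conv2d_valid` and `sparse_tiled_conv` are assumptions chosen for illustration; the patent's actual data layout and hardware behaviour are not described in this listing.

```python
def conv2d_valid(x, k):
    """Direct 'valid' 2D cross-correlation (the usual CNN convolution) on lists."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
             for j in range(ow)] for i in range(oh)]


def sparse_tiled_conv(x, k, tile=4):
    """Convolve tile by tile, skipping tiles whose input patch is entirely zero.

    Returns the full output array and the number of tiles skipped.
    """
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    # Assumed form of positional data: coordinates of the non-zero input elements.
    nonzero = {(i, j) for i, row in enumerate(x) for j, v in enumerate(row) if v != 0}
    out = [[0] * ow for _ in range(oh)]  # skipped tiles keep their zero output
    skipped = 0
    for r in range(0, oh, tile):
        for c in range(0, ow, tile):
            r_end, c_end = min(r + tile, oh), min(c + tile, ow)
            # Input patch (tile plus kernel halo) that feeds this output tile.
            in_r_end, in_c_end = r_end + kh - 1, c_end + kw - 1
            if not any(r <= i < in_r_end and c <= j < in_c_end for i, j in nonzero):
                skipped += 1
                continue
            patch = [row[c:in_c_end] for row in x[r:in_r_end]]
            for di, out_row in enumerate(conv2d_valid(patch, k)):
                out[r + di][c:c_end] = out_row
    return out, skipped
```

For a mostly zero input, the result matches a dense convolution while only the tiles whose input patch contains a non-zero element are actually computed.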
-
Publication number: 20220222569
Abstract: A processing unit is provided which comprises volatile storage for storing machine learning data in binary representation, and a data processing engine communicatively coupled to the volatile storage. The processing unit is configured to selectively invert the bit values in binary representations of portions of the machine learning data when performing storage operations using the volatile storage. A computer-implemented method, and non-transitory computer-readable storage medium comprising instructions for executing the method are also provided. The method comprises receiving a request to perform a storage operation on the volatile storage using the machine learning data and performing the storage operation, including, selecting a portion of the machine learning data and inverting bit values in a binary representation of the selected portion. A computer-implemented method comprising receiving a request to store machine learning data on volatile storage and storing the machine learning data is also provided.
Type: Application
Filed: January 11, 2021
Publication date: July 14, 2022
Inventors: Daren CROXFORD, Sharjeel SAEED, Rachel Jean TRIMBLE, Timothy Fawcett MILNER
-
Publication number: 20220129321
Abstract: An information processing apparatus is described for processing a workload. The information processing apparatus comprises a processor and a memory element connected to the processor via a data link. In advance of processing a workload, the information processing apparatus estimates an access time required to transfer an amount of the workload that is to be transferred from the external memory element to the processor, and estimates a processing time for the processor to process the workload. A processing rate characteristic of the processor and/or a data transfer rate between the memory and the processor is set in dependence upon the estimated processing time and estimated access time. Methods for varying a quality of service (QoS) value of requests to the external memory element are also described.
Type: Application
Filed: October 28, 2020
Publication date: April 28, 2022
Inventors: Daren CROXFORD, Sharjeel SAEED, Jayavarapu Srinivasa RAO, Aaron DEBATTISTA
-
Patent number: 11315303
Abstract: When a programmable execution unit of a graphics processor is executing a graphics processing program to render a frame that represents a view of a scene using a ray tracing process, and the ray tracing process requires the determination of geometry that will be intersected by a ray, the programmable execution unit sends a message to a ray tracing acceleration data structure traversal circuit of the graphics processor, for the ray tracing acceleration data structure traversal circuit to perform a traversal of a ray tracing acceleration data structure for the scene to determine geometry for the scene that may be intersected by the ray. The ray tracing acceleration data structure traversal circuit then returns to the programmable execution unit an indication of geometry that may be intersected by the ray, and the programmable execution unit uses the indicated geometry to determine any geometry that is intersected by the ray.
Type: Grant
Filed: March 25, 2020
Date of Patent: April 26, 2022
Assignees: Arm Limited, Apical Limited
Inventors: Sharjeel Saeed, Daren Croxford, Mathieu Jean Joseph Robart