Patents by Inventor Sharjeel SAEED

Sharjeel SAEED has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11995475
    Abstract: An information processing apparatus is described for processing a workload. The information processing apparatus comprises a processor and a memory element connected to the processor via a data link. In advance of processing a workload, the information processing apparatus estimates an access time required to transfer an amount of the workload that is to be transferred from the external memory element to the processor, and estimates a processing time for the processor to process the workload. A processing rate characteristic of the processor and/or a data transfer rate between the memory and the processor is set in dependence upon the estimated processing time and estimated access time. Methods for varying a quality of service (QoS) value of requests to the external memory element are also described.
    Type: Grant
    Filed: October 28, 2020
    Date of Patent: May 28, 2024
    Assignee: Arm Limited
    Inventors: Daren Croxford, Sharjeel Saeed, Jayavarapu Srinivasa Rao, Aaron Debattista
  • Patent number: 11928581
    Abstract: A method of compressing kernels comprising detecting a plurality of replicated kernels. The plurality of replicated kernels comprise kernels. The method also comprises generating a composite kernel from the replicated kernels. The composite kernel comprises kernel data and meta data indicative of the rotations applied to the composite kernel data. The method also comprises storing a composite kernel.
    Type: Grant
    Filed: September 14, 2018
    Date of Patent: March 12, 2024
    Assignee: Arm Limited
    Inventors: Daren Croxford, Jayavarapu Srinivasa Rao, Sharjeel Saeed
  • Publication number: 20240036932
    Abstract: Disclosed herein is a graphics processor that comprises a programmable execution unit operable to execute programs to perform graphics processing operations. The graphics processor further comprises a dedicated machine learning processing circuit operable to perform processing operations for machine learning processing tasks. The machine learning processing circuit is in communication with the programmable execution unit internally to the graphics processor. In this way, the graphics processor can be configured such that machine learning processing tasks can be performed by the programmable execution unit, the machine learning processing circuit, or a combination of both, with the different units being able to message each other accordingly to control the processing.
    Type: Application
    Filed: July 26, 2023
    Publication date: February 1, 2024
    Applicant: Arm Limited
    Inventors: Daren Croxford, Sharjeel Saeed, Isidoros Sideris
  • Publication number: 20240037835
    Abstract: There is provided an apparatus configured to operate as a shader core, the shader core configured to perform a complex rendering process comprising a rendering process and a machine learning process, the shader core comprising: one or more tile buffers configured to store data locally to the shader core, wherein during the rendering process, the one or more tile buffers are configured to store rendered fragment data relating to a tile; and during the machine learning process, the one or more tile buffers are configured to store an input feature map, kernel weights or an output feature map relating to the machine learning process.
    Type: Application
    Filed: July 31, 2023
    Publication date: February 1, 2024
    Inventors: Daren CROXFORD, Sharjeel SAEED, Isidoros SIDERIS
  • Publication number: 20240036949
    Abstract: There is provided a processor configured to transfer data to a plurality of processor circuits. The apparatus includes broadcast circuitry that broadcasts first machine learning data to at least a subset of the plurality of processor circuits.
    Type: Application
    Filed: July 31, 2023
    Publication date: February 1, 2024
    Inventors: Daren CROXFORD, Sharjeel SAEED, Isidoros SIDERIS
  • Patent number: 11824977
    Abstract: A data processing system including storage. The data processing system also includes at least one processor to generate output data using at least a portion of a first neural network layer and generate a key associated with at least the portion of the first neural network layer. The at least one processor is further operable to obtain the key from the storage and obtain a version of the output data for input into a second neural network layer. Using the key, the at least one processor is further operable to determine whether the version of the output data differs from the output data.
    Type: Grant
    Filed: July 28, 2020
    Date of Patent: November 21, 2023
    Assignee: Arm Limited
    Inventors: Sharjeel Saeed, Daren Croxford, Dominic Hugo Symes
  • Patent number: 11798221
    Abstract: In a graphics processing system comprising a graphics processor, a main memory, and a memory management unit, when rendering a frame that represents a view of a scene comprising one or more objects using a ray tracing process and the ray tracing process requires a traversal of a ray tracing acceleration data structure indicative of the distribution of geometry for the scene being rendered to determine geometry for the scene that may be intersected by a ray, at least part of the traversal of the ray tracing acceleration data structure is performed by the memory management unit (MMU).
    Type: Grant
    Filed: October 27, 2021
    Date of Patent: October 24, 2023
    Assignee: Arm Limited
    Inventors: Daren Croxford, Mathieu Jean Joseph Robart, Sharjeel Saeed
  • Publication number: 20230316063
    Abstract: An input data array is subjected to neural network processing to generate a result of the neural network processing for the input data array. A perturbation is applied to a part (but not all of) the input data array, with neural network processing then performed using the so-perturbed version of the input data array. However only some (and not all) of the perturbed version is subjected to neural network processing, based on the part of the input data array to which the perturbation has been applied. The result of the neural network processing of the perturbed version of the input data array is compared with the result of the neural network processing of the input data array without the perturbation, to determine whether the perturbation of the input data array has an effect on the result of the neural network processing.
    Type: Application
    Filed: March 30, 2022
    Publication date: October 5, 2023
    Applicant: Arm Limited
    Inventors: Rachel Jean Trimble, Sharjeel Saeed, Daren Croxford
  • Publication number: 20230252264
    Abstract: When executing a neural network comprising a sequence of plural layers of neural network processing in which at least one of the layers of the sequence of plural layers of the neural network is followed by two or more branches of neural network processing, each branch comprising a different sequence of one or more layers of neural network processing, the branch or branches to use for the neural network processing following the layer of the neural network that is followed by the two or more branches of neural network processing is selected based on a property or properties of the output feature map from the layer that is followed by the two or more branches.
    Type: Application
    Filed: February 10, 2022
    Publication date: August 10, 2023
    Applicant: Arm Limited
    Inventors: Daren Croxford, Rachel Jean Trimble, Sharjeel Saeed, Roberto Lopez Mendez
  • Publication number: 20230126531
    Abstract: In a graphics processing system comprising a graphics processor, a main memory, and a memory management unit, when rendering a frame that represents a view of a scene comprising one or more objects using a ray tracing process and the ray tracing process requires a traversal of a ray tracing acceleration data structure indicative of the distribution of geometry for the scene being rendered to determine geometry for the scene that may be intersected by a ray, at least part of the traversal of the ray tracing acceleration data structure is performed by the memory management unit (MMU).
    Type: Application
    Filed: October 27, 2021
    Publication date: April 27, 2023
    Applicant: Arm Limited
    Inventors: Daren Croxford, Mathieu Jean Joseph Robart, Sharjeel Saeed
  • Patent number: 11625578
    Abstract: A method, apparatus and computer readable medium for processing input data using a neural network comprising at least a first layer and a second layer. The method comprises applying a partitioning scheme to the input data to partition the input data into a plurality of blocks, each block representing a portion of the input data. At the first layer of the neural network, the blocks of the input data are processed in a first order to generate intermediary data, wherein the intermediary data is partitioned into a plurality of intermediary blocks. At the second layer of the neural network, the intermediary blocks are processed in a second order, wherein the second order differs from the first order.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: April 11, 2023
    Assignee: Arm Limited
    Inventors: Sharjeel Saeed, Aaron DeBattista, Daren Croxford
  • Publication number: 20230089112
    Abstract: There is provided a data processing apparatus for performing machine learning. The data processing apparatus includes convolution circuitry for convolving a plurality of neighbouring regions of input data using a kernel to produce convolution outputs. Max-pooling circuitry determines and selects the largest of the convolution outputs as a pooled output and prediction circuitry performs a size prediction of the convolution outputs based on the neighbouring regions, wherein the size prediction is performed prior to the max-pooling circuitry determining the largest of the convolution outputs and adjusts a behaviour of the convolution circuitry based on the size prediction.
    Type: Application
    Filed: September 20, 2021
    Publication date: March 23, 2023
    Inventors: Daren CROXFORD, Sharjeel SAEED, Rachel Jean TRIMBLE
  • Publication number: 20230079975
    Abstract: A system-on-chip comprises processing circuitry to process input data to generate output data, and power management circuitry to control power management policy for at least a portion of the system-on-chip. The power management circuitry controls the power management policy depending on metadata indicative of a property of the input data to be processed by the processing circuitry.
    Type: Application
    Filed: September 10, 2021
    Publication date: March 16, 2023
    Inventors: Sharjeel SAEED, Daren CROXFORD, Rachel Jean TRIMBLE, Jayavarapu Srinivasa RAO, Sidhartha TANEJA
  • Publication number: 20230040673
    Abstract: A method for optimizing machine learning processing is provided. The method comprises retrieving neural network architecture information for a neural network, the neural network architecture information comprising layer information and kernel information for the neural network. The network architecture information is analyzed to identify convolutional layers in the neural network which have associated strided layers. A first kernel for a convolutional layer identified as having an associated strided layer, and a second kernel for the strided layer associated with the convolutional layer, are retrieved. A composite kernel is then generated, based on the first and second kernel, that performs the functions of the first and second kernel. Finally, the composite kernel is stored for further use by a neural network.
    Type: Application
    Filed: July 28, 2021
    Publication date: February 9, 2023
    Inventors: Daren CROXFORD, Sharjeel SAEED, Rachel Jean TRIMBLE
  • Patent number: 11561795
    Abstract: Herein described is a method of operating an accumulation process in a data processing apparatus. The accumulation process comprises a plurality of accumulations which output a respective plurality of accumulated values, each based on a stored value and a computed value generated by a data processing operation. The method comprises storing a first accumulated value, the first accumulated value being one of said plurality of accumulated values, into a first storage device comprising a plurality of single-bit storage elements; determining that a predetermined trigger has been satisfied with respect to the accumulation process; and in response to the determining, storing at least a portion of a second accumulated value, the second accumulated value being one of said plurality of accumulated values, into a second storage device.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: January 24, 2023
    Assignee: Arm Limited
    Inventors: Jens Olson, John Wakefield Brothers, III, Jared Corey Smolens, Chi-wen Cheng, Daren Croxford, Sharjeel Saeed, Dominic Hugo Symes
  • Patent number: 11514312
    Abstract: Aspects of the present disclosure relate to a computer-implemented method of processing a data portion. The method comprises processing a first data portion in a convolutional neural network to generate a first input to an activation function in the convolutional neural network; providing a first output by applying the activation function to the first input; and storing an indicator, representative of the first input to the activation function, for the first data portion. The method further comprises determining whether to provide a second output by applying the activation function to a second input, generated from a second data portion, based at least in part on an evaluation of the indicator for the first data portion.
    Type: Grant
    Filed: September 3, 2019
    Date of Patent: November 29, 2022
    Assignee: Arm Limited
    Inventors: Daren Croxford, Sharjeel Saeed
  • Patent number: 11423117
    Abstract: A computer implemented method for performing convolutions between subsets of an input data array and a kernel resulting in subsets of an output data array. The method may include receiving an input data array and using positional data indicating the position of elements of the input data array to determine subsets of the input data array which contains at least one non-zero value data element; performing convolutions between the subsets of the input data array containing at least one non-zero value data element and a kernel to produce output data array subsets; and combining the output data subsets with the positional data to generate output data indicative of a completed output data array.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: August 23, 2022
    Assignees: Arm Limited, Apical Limited
    Inventors: Sharjeel Saeed, Daren Croxford, Davide Marani, Jayavarapu Srinivasa Rao
  • Publication number: 20220222569
    Abstract: A processing unit is provided which comprises volatile storage for storing machine learning data in binary representation, and a data processing engine communicatively coupled to the volatile storage. The processing unit is configured to selectively invert the bit values in binary representations of portions of the machine learning data when performing storage operations using the volatile storage. A computer-implemented method, and non-transitory computer-readable storage medium comprising instructions for executing the method are also provided. The method comprises receiving a request to perform a storage operation on the volatile storage using the machine learning data and performing the storage operation, including, selecting a portion of the machine learning data and inverting bit values in a binary representation of the selected portion. A computer-implemented method comprising receiving a request to store machine learning data on volatile storage and storing the machine learning data is also provided.
    Type: Application
    Filed: January 11, 2021
    Publication date: July 14, 2022
    Inventors: Daren CROXFORD, Sharjeel SAEED, Rachel Jean TRIMBLE, Timothy Fawcett MILNER
  • Publication number: 20220129321
    Abstract: An information processing apparatus is described for processing a workload. The information processing apparatus comprises a processor and a memory element connected to the processor via a data link. In advance of processing a workload, the information processing apparatus estimates an access time required to transfer an amount of the workload that is to be transferred from the external memory element to the processor, and estimates a processing time for the processor to process the workload. A processing rate characteristic of the processor and/or a data transfer rate between the memory and the processor is set in dependence upon the estimated processing time and estimated access time. Methods for varying a quality of service (QoS) value of requests to the external memory element are also described.
    Type: Application
    Filed: October 28, 2020
    Publication date: April 28, 2022
    Inventors: Daren CROXFORD, Sharjeel SAEED, Jayavarapu Srinivasa RAO, Aaron DEBATTISTA
  • Patent number: 11315303
    Abstract: When a programmable execution unit of a graphics processor is executing a graphics processing program to render a frame that represents a view of a scene using a ray tracing process, and the ray tracing process requires the determination of geometry that will be intersected by a ray, the programmable execution unit sends a message to a ray tracing acceleration data structure traversal circuit of the graphics processor, for the ray tracing acceleration data structure traversal circuit to perform a traversal of a ray tracing acceleration data structure for the scene to determine geometry for the scene that may be intersected by the ray. The ray tracing acceleration data structure traversal circuit then returns to the programmable execution unit an indication of geometry that may be intersected by the ray, and the programmable execution unit uses the indicated geometry to determine any geometry that is intersected by the ray.
    Type: Grant
    Filed: March 25, 2020
    Date of Patent: April 26, 2022
    Assignees: Arm Limited, Apical Limited
    Inventors: Sharjeel Saeed, Daren Croxford, Mathieu Jean Joseph Robart
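
The two-stage flow summarised in the abstract for patent 11315303 (a traversal step that returns geometry that may be intersected, followed by an exact test that decides what is intersected) can be illustrated with a minimal software sketch. This is not the patented hardware design: the names `Sphere`, `traverse`, `ray_hits_box`, `ray_hits_sphere` and `trace` are hypothetical, the "acceleration data structure" is assumed to be a flat list of axis-aligned bounding boxes rather than a real hierarchy, and the geometry is assumed to be spheres.

```python
import math
from dataclasses import dataclass


@dataclass
class Sphere:
    center: tuple
    radius: float

    def bounds(self):
        """Axis-aligned bounding box (lo, hi) enclosing the sphere."""
        lo = tuple(c - self.radius for c in self.center)
        hi = tuple(c + self.radius for c in self.center)
        return lo, hi


def ray_hits_box(origin, direction, lo, hi):
    """Slab test: does the ray (t >= 0) overlap the box [lo, hi]?"""
    t_near, t_far = 0.0, math.inf
    for o, d, l, h in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this slab; it must start inside it.
            if o < l or o > h:
                return False
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def traverse(spheres, origin, direction):
    """Stand-in for the traversal step: indices of geometry that MAY be hit."""
    return [
        i for i, s in enumerate(spheres)
        if ray_hits_box(origin, direction, *s.bounds())
    ]


def ray_hits_sphere(origin, direction, sphere):
    """Exact test: solve |o + t*d - c|^2 = r^2 and require a root with t >= 0."""
    oc = tuple(o - c for o, c in zip(origin, sphere.center))
    a = sum(d * d for d in direction)  # assumes a non-zero direction vector
    b = 2.0 * sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - sphere.radius ** 2
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return False
    return (-b + math.sqrt(disc)) / (2.0 * a) >= 0.0


def trace(spheres, origin, direction):
    """Stand-in for the execution unit: request candidates, then test exactly."""
    candidates = traverse(spheres, origin, direction)
    hits = [i for i in candidates if ray_hits_sphere(origin, direction, spheres[i])]
    return candidates, hits


scene = [
    Sphere((0.0, 0.0, 5.0), 1.0),   # on the ray's path
    Sphere((0.9, 0.9, 10.0), 1.0),  # box is crossed, sphere itself is missed
    Sphere((5.0, 0.0, 5.0), 1.0),   # well off to the side
]
candidates, hits = trace(scene, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
print(candidates, hits)
```

In this sketch the second sphere shows why the abstract distinguishes "may be intersected" from "is intersected": its bounding box is crossed by the ray, so the traversal step reports it, but the exact test rejects it.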