Patents by Inventor Rune Holm
Rune Holm has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250252706
Abstract: A processor, method, and non-transitory computer-readable storage medium for processing template data and search data according to a search window applied to the search data. The search window comprises a set of offset positions. The processing is performed by a block matching engine (BME) that produces a tensor with difference values, and a convolutional engine (CE) that performs a convolutional operation on the tensor. The processing is performed in an iterative interleaved fashion, by dividing the set of offset positions into a plurality of subsets of offset positions. In parallel with processing of the first X offset positions by the CE, the BME generates the next X channels of the tensor, which are subsequently pipelined through to the CE via an internal storage, etc.
Type: Application
Filed: February 2, 2024
Publication date: August 7, 2025
Inventors: John Wakefield BROTHERS, III, Metin Gokhan ÜNAL, Balaji VENU, Rune HOLM
-
Publication number: 20250251908
Abstract: A processor, method, and non-transitory computer-readable storage medium for performing a block matching between first data and second data are provided. The block matching is performed using an iterative process, wherein, for each iteration, a portion of the first data and a corresponding portion of the second data are selected using a sliding window approach. When difference data, used for block matching, is calculated for a specific subset of first and second data, many of the calculations overlap with those needed for a nearby subset. Summed area table (SAT) data used for determining the difference data is continuously stored and updated in a buffer, such that overlapping computations can be avoided.
Type: Application
Filed: February 2, 2024
Publication date: August 7, 2025
Inventors: John Wakefield BROTHERS, III, Metin Gokhan ÜNAL, Balaji VENU, Rune HOLM
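The summed area table idea mentioned in this abstract can be illustrated with a small sketch. This is not the claimed hardware method: it builds the whole table at once rather than updating a buffer continuously, it uses a sum of absolute differences as the block cost, and the names `summed_area_table`, `window_sum` and `block_costs` are hypothetical. It only shows why overlapping windows stop costing repeated work once a SAT exists.

```python
def summed_area_table(values):
    """Return a (rows+1) x (cols+1) table; sat[r][c] is the sum of values[0:r][0:c]."""
    rows, cols = len(values), len(values[0])
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += values[r][c]
            sat[r + 1][c + 1] = sat[r][c + 1] + row_sum
    return sat


def window_sum(sat, top, left, height, width):
    """Sum over one window using four table lookups, regardless of window size."""
    bottom, right = top + height, left + width
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


def block_costs(first, second, block):
    """Sum of absolute differences for every block x block sliding window (assumed cost)."""
    diffs = [[abs(a - b) for a, b in zip(ra, rb)] for ra, rb in zip(first, second)]
    sat = summed_area_table(diffs)
    rows, cols = len(diffs), len(diffs[0])
    return [
        [window_sum(sat, r, c, block, block) for c in range(cols - block + 1)]
        for r in range(rows - block + 1)
    ]


first = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
second = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
costs = block_costs(first, second, 2)  # [[8, 12], [20, 24]]
```

Each window costs four lookups instead of a fresh pass over its elements, which is the overlap saving the abstract describes.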
-
Patent number: 12333626
Abstract: A processor, method and non-transitory computer-readable storage medium for handling data, by obtaining task data describing a task to be executed in the form of a plurality of operations on data, the task data further defining an operation space of said data, analyzing each of the operations to define transformation data comprising transformation instructions representing a transform into an associated operation-specific local space. In case transformation instructions to get to the operation-specific local space for an operation are producing fewer dimensions compared to the operation space, one or more operation-specific arguments are stored in a data field corresponding to a dimension not produced by the transformation instructions in the transformation data corresponding to the operation.
Type: Grant
Filed: March 15, 2023
Date of Patent: June 17, 2025
Assignee: Arm Limited
Inventors: Rune Holm, Elliot Maurice Simon Rosemarine
-
Publication number: 20250181932
Abstract: A data processing system, the data processing system comprising a command processing unit and a processor that is configured to perform processing, the processor comprising: multiple execution units configured to perform processing operations for a type of work; and a control circuit configured to distribute processing tasks to the multiple execution units to cause the multiple execution units to perform processing operations for the type of work in response to asynchronous commands provided to the control circuit by the command processing unit; wherein dependency tracking is compared against an array of counters to indicate dependencies within the array of counters; wherein the indicated dependencies are provided to the control circuit by the command processing unit in the asynchronous commands to indicate for the type of work that dependencies have been resolved or that dependencies exist.
Type: Application
Filed: November 30, 2023
Publication date: June 5, 2025
Inventors: Elliot Maurice Simon Rosemarine, Tobin Deene Ehlis, Rune Holm, Jared Corey Smolens
-
Publication number: 20250165292
Abstract: The present disclosure relates to a data processor for processing data, comprising: a plurality of execution units to execute one or more operations; and a plurality of storage elements to store data for the one or more operations, the data processor being configured to process at least one task, each task to be executed in the form of a directed acyclic graph of operations, wherein each of the operations maps to a corresponding execution unit and each connection between operations in the acyclic graph maps to a corresponding storage element, the data processor further comprising: a plurality of counters; and a control module to control the plurality of counters to: in a first mode, count an operation cycle number associated with each operation of the at least one task, the operation cycle number of an operation being a number of cycles required to complete the operation; and in a second mode, count a unit cycle number associated with one or more execution units, the unit cycle number of an execution unit bei
Type: Application
Filed: November 17, 2023
Publication date: May 22, 2025
Inventors: Dominic Hugo Symes, Rune Holm, Thomas Patrik Andreas Olsson
-
Publication number: 20250077841
Abstract: Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to adapt a neural network structure to a target platform. One or more performance metrics of an execution of the neural network structure may be implemented by one or more target hardware elements. A module from a library of modules may be selected to replace one or more elements of the neural network structure based, at least in part, on the observed one or more performance metrics.
Type: Application
Filed: August 30, 2023
Publication date: March 6, 2025
Inventors: Rune Holm, Anton Kachatkou, Benjamin Klimczak, Ruomei Yan, Diego Russo
-
Publication number: 20240370301
Abstract: The present disclosure relates to a system, method and non-transitory computer-readable storage medium for handling data. From a directed acyclic graph (DAG) of operations on input data, a sub-graph of operations is identified and issued as task data to be executed by a processing module, wherein each of the operations in the sub-graph maps to a corresponding execution unit of the processing module of the system and wherein each connection between operations maps to a corresponding storage element of the processing module. The sub-graph is identified such that a simulation of an execution of the operations of the candidate sub-graph according to a determined size of the processing unit of said input data shows that the processing module can execute the operations of the sub-graph such that memory constraints of the processing module are met and read-write operations to memory external to the processing module are avoided or reduced.
Type: Application
Filed: April 19, 2024
Publication date: November 7, 2024
Applicant: Arm Limited
Inventors: Elliot Maurice Simons Rosemarine, Rune Holm
-
Patent number: 12124935
Abstract: A computer-implemented method, performed in a neural processing system comprising control processor circuitry and arithmetic logic circuitry, of performing a convolution between an input feature map (IFM) and convolutional filter data, resulting in an output feature map (OFM). The method includes obtaining, in the control processor circuitry, dimensional characteristic parameters relating to dimensions of input work batch data arrays and positional characteristic parameters relating to positions of feature map content within the input work batches. The method also includes, in the arithmetic logic circuitry, performing convolutions between the input work batches, generated from the IFM based on the dimensional characteristic parameters and the positional characteristic parameters, and work batch filter data arrays corresponding to the filter to produce a plurality of output work batch data arrays. The plurality of output work batches are combined to generate an OFM.
Type: Grant
Filed: February 21, 2020
Date of Patent: October 22, 2024
Assignee: Arm Limited
Inventors: Suraj Sudhir, Jayavarapu Srinivasa Rao, Rune Holm
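As a rough illustration of splitting a convolution into work batches and recombining the results, the sketch below tiles the output, cuts the matching input patch for each tile, convolves it, and writes the result back. It is a software analogy only: the tile size, the "valid" (no padding) convention, and the function names `conv2d_valid` and `conv2d_batched` are assumptions, and the patent's actual parameters and circuitry are not modelled.

```python
def conv2d_valid(ifm, kernel):
    """Unpadded 2D convolution in the neural-network sense (no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(ifm) - kh + 1, len(ifm[0]) - kw + 1
    return [
        [
            sum(ifm[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def conv2d_batched(ifm, kernel, tile):
    """Compute the same OFM by convolving input work batches and combining the outputs."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(ifm) - kh + 1, len(ifm[0]) - kw + 1
    ofm = [[0] * out_w for _ in range(out_h)]
    for top in range(0, out_h, tile):
        for left in range(0, out_w, tile):
            th, tw = min(tile, out_h - top), min(tile, out_w - left)
            # Positional parameters: (top, left). Dimensional parameters:
            # the input batch needs a halo of kh-1 rows and kw-1 columns.
            batch = [row[left:left + tw + kw - 1] for row in ifm[top:top + th + kh - 1]]
            out_batch = conv2d_valid(batch, kernel)
            for r, row in enumerate(out_batch):
                ofm[top + r][left:left + tw] = row
    return ofm


ifm = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[1, 0], [0, -1]]
assert conv2d_batched(ifm, kernel, 3) == conv2d_valid(ifm, kernel)
```

The final assertion checks that the batched result matches the direct convolution, including the smaller batches at the right and bottom edges.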
-
Publication number: 20240311947
Abstract: A processor, method and non-transitory computer-readable storage medium for handling data, by obtaining task data describing a task to be executed in the form of a plurality of operations on data, the task data further defining an operation space of said data, analyzing each of the operations to define transformation data comprising transformation instructions representing a transform into an associated operation-specific local space. In case transformation instructions to get to the operation-specific local space for an operation are producing fewer dimensions compared to the operation space, one or more operation-specific arguments are stored in a data field corresponding to a dimension not produced by the transformation instructions in the transformation data corresponding to the operation.
Type: Application
Filed: March 15, 2023
Publication date: September 19, 2024
Inventors: Rune HOLM, Elliot Maurice Simon ROSEMARINE
-
Publication number: 20240248753
Abstract: A processor to: receive a task to be executed, the task comprising a task-based parameter associated with the task, for use in determining a position, within an array of data descriptors, of a particular data descriptor of a particular portion of data to be processed in executing the task. Each of the data descriptors in the array of data descriptors has a predetermined size and is indicative of a location in a storage system of a respective portion of data. The processor derives, based on the task, array location data indicative of a location in the storage system of a predetermined data descriptor, and obtains the particular data descriptor, based on the array location data and the task-based parameter. The processor obtains the particular portion of data based on the particular data descriptor and processes the particular portion of data in executing the task.
Type: Application
Filed: January 20, 2023
Publication date: July 25, 2024
Inventors: Elliot Maurice Simon ROSEMARINE, Alexander Eugene CHALFIN, Rune HOLM
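Because every descriptor has the same predetermined size, locating one reduces to base-plus-offset arithmetic. The sketch below assumes the predetermined descriptor is the first in the array, treats the task-based parameter as a plain index, and invents a 12-byte descriptor layout (8-byte address, 4-byte length). None of these details come from the abstract, and `read_descriptor` and `fetch_portion` are hypothetical names.

```python
import struct

# Hypothetical descriptor layout: little-endian 8-byte address + 4-byte length.
DESCRIPTOR_FORMAT = "<QI"
DESCRIPTOR_SIZE = struct.calcsize(DESCRIPTOR_FORMAT)  # 12 bytes, no padding


def read_descriptor(storage, array_base, index):
    """Locate descriptor `index` from the array base and the fixed descriptor size."""
    offset = array_base + index * DESCRIPTOR_SIZE
    return struct.unpack_from(DESCRIPTOR_FORMAT, storage, offset)


def fetch_portion(storage, array_base, index):
    """Follow the descriptor to the portion of data it points at."""
    address, length = read_descriptor(storage, array_base, index)
    return bytes(storage[address:address + length])


# Build a toy storage: two descriptors at offset 16, data further along.
storage = bytearray(128)
storage[64:69] = b"hello"
storage[80:86] = b"world!"
struct.pack_into(DESCRIPTOR_FORMAT, storage, 16, 64, 5)
struct.pack_into(DESCRIPTOR_FORMAT, storage, 16 + DESCRIPTOR_SIZE, 80, 6)

portion = fetch_portion(storage, array_base=16, index=1)  # b"world!"
```

The task only needs to carry the small index; the descriptor supplies the actual location and size of the data.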
-
Publication number: 20240248721
Abstract: A method and apparatus for distributing operations for execution. Input data is received and is subdivided into portions, each comprising a first and second sub-portion. A first operation and a second operation are received. Dependencies between the first and second operations are identified. For each portion the first operation is issued for execution on the first sub-portion to produce a first output sub-portion, and completion is tracked. The first operation is issued for execution on the second sub-portion to produce a second output sub-portion. Depending upon satisfaction of the dependencies in respect of the first sub-portion, either the second operation to be executed on the first output sub-portion is issued, if the dependencies are met; or the second operation, to be executed on the first output sub-portion is stalled, if the dependencies are not met. This is repeated for each subsequent portion.
Type: Application
Filed: January 16, 2024
Publication date: July 25, 2024
Inventors: Rune HOLM, Alexander Eugene CHALFIN, Elliot Maurice Simon ROSEMARINE
-
Publication number: 20240248764
Abstract: A memory unit configured for handling task data, the task data describing a task to be executed as a directed acyclic graph of operations, wherein each operation maps to a corresponding execution unit, and wherein each connection between operations in the acyclic graph maps to a corresponding storage element of the execution unit. The task data defines an operation space representing the dimensions of a multi-dimensional arrangement of the connected operations to be executed represented by the data blocks; the memory unit configured to receive a sequence of processing requests comprising the one or more data blocks with each data block assigned a priority value and comprising a block command. The memory unit is configured to arbitrate between the data blocks based upon the priority value and block command to prioritize the sequence of processing requests and wherein the processing requests include writing data to, or reading data from storage.
Type: Application
Filed: May 12, 2023
Publication date: July 25, 2024
Inventors: Rune HOLM, Jens OLSON, Elliot Maurice Simon ROSEMARINE, Jared SMOLENS
-
Publication number: 20240248754
Abstract: A processor to generate position data indicative of a position within a compressed data stream, wherein, previously, in executing a task, data of the compressed data stream ending at the position has been read by the processor from storage storing the compressed data stream. After reading the data, the processor reads further data of the compressed data stream from the storage, in executing the task, the further data located beyond the position within the compressed data stream. After reading the further data, the processor reads, based on the position data, a portion of the compressed data stream from the storage, in executing the task, starting from the position within the compressed data stream. The processor decompresses the portion of the compressed data stream to generate decompressed data, in executing the task.
Type: Application
Filed: January 20, 2023
Publication date: July 25, 2024
Inventors: Elliot Maurice Simon ROSEMARINE, Jared Corey SMOLENS, Rune HOLM, John Wakefield BROTHERS, III, Jens OLSON
-
Publication number: 20240248755
Abstract: A processor comprising: a handling unit; a plurality of components each configured to execute a function. The handling unit can receive a task comprising operations on data in a coordinate space having N dimensions, receive a data structure describing execution of the task and comprising a partially ordered set of data items each associated with instructions usable by the plurality of components when executing the task, each data item is associated with a component among the plurality of components, each data item indicates dimensions of the coordinate space for which changes of coordinate cause the function of the associated component to execute, and dimensions of the coordinate space for which changes of coordinate cause the function of the associated component to store data ready to be used by another component. The handling unit iterates over the coordinate space and executes the task using the partially ordered set of data items.
Type: Application
Filed: January 20, 2023
Publication date: July 25, 2024
Inventors: Rune HOLM, Jens OLSON, Jared Corey SMOLENS, Dominic Hugo SYMES, Elliot Maurice Simon ROSEMARINE
-
Publication number: 20240231661
Abstract: A processor to obtain mapping data indicative of at least one mapping parameter for a plurality of mapping blocks of a multi-dimensional tensor to be mapped. The at least one mapping parameter is for mapping corresponding elements of each mapping block to the same co-ordinate in at least one selected dimension of the multi-dimensional tensor, such that each mapping block corresponds to the same set of co-ordinates in the at least one selected dimension. A co-ordinate of an element of a block of the multi-dimensional tensor is determined. The element is comprised by a mapping block. A physical address in a storage corresponding to the co-ordinate is determined, based on the co-ordinate. The physical address is utilized in a process comprising an interaction between the block of the multi-dimensional tensor and the storage.
Type: Application
Filed: October 12, 2023
Publication date: July 11, 2024
Applicant: Arm Limited
Inventors: Dominic Hugo Symes, Rune Holm
-
Patent number: 12032506
Abstract: The present disclosure relates generally to multi-processor arrangements and, more particularly, to a point of serialization for broadcast communications within multi-processor arrangements.
Type: Grant
Filed: March 30, 2022
Date of Patent: July 9, 2024
Assignee: Arm Limited
Inventors: Erik Persson, Graeme Leslie Ingram, Rune Holm, John Wakefield Brothers, III
-
Patent number: 12001369
Abstract: The present disclosure relates generally to multi-processor arrangements and, more particularly, to broadcast regions for multi-processor arrangements.
Type: Grant
Filed: March 30, 2022
Date of Patent: June 4, 2024
Assignee: Arm Limited
Inventors: Erik Persson, Graeme Leslie Ingram, Rune Holm, John Wakefield Brothers, III
-
Publication number: 20240134553
Abstract: A processor to obtain mapping data indicative of at least one mapping parameter for a plurality of mapping blocks of a multi-dimensional tensor to be mapped. The at least one mapping parameter is for mapping corresponding elements of each mapping block to the same co-ordinate in at least one selected dimension of the multi-dimensional tensor, such that each mapping block corresponds to the same set of co-ordinates in the at least one selected dimension. A co-ordinate of an element of a block of the multi-dimensional tensor is determined. The element is comprised by a mapping block. A physical address in a storage corresponding to the co-ordinate is determined, based on the co-ordinate. The physical address is utilized in a process comprising an interaction between the block of the multi-dimensional tensor and the storage.
Type: Application
Filed: October 11, 2023
Publication date: April 25, 2024
Applicant: Arm Limited
Inventors: Dominic Hugo Symes, Rune Holm
-
Publication number: 20240036919
Abstract: A method and processor comprising a command processing unit to receive, from a host processor, a sequence of commands to be executed; and generate based on the sequence of commands a plurality of tasks. The processor also comprises a plurality of compute units each having a first processing module for executing tasks of a first task type, a second processing module for executing tasks of a second task type, different from the first task type, and a local cache shared by at least the first processing module and the second processing module. The command processing unit issues the plurality of tasks to at least one of the plurality of compute units, and wherein at least one of the plurality of compute units is to process at least one of the plurality of tasks.
Type: Application
Filed: July 26, 2023
Publication date: February 1, 2024
Applicant: Arm Limited
Inventors: Alexander Eugene Chalfin, John Wakefield Brothers, III, Rune Holm, Samuel James Edward Martin
-
Patent number: 11874793
Abstract: The present disclosure relates generally to multi-processor arrangements and, more particularly, to broadcast hubs for multi-processor arrangements. A processing tile may comprise a broadcast hub to obtain a plurality of parameters applicable in a particular operation from at least one of a plurality of processing tiles and initiate distribution of the plurality of parameters to the plurality of processing tiles, wherein the plurality of processing tiles may execute the particular operation based at least in part on the plurality of distributed parameters.
Type: Grant
Filed: March 30, 2022
Date of Patent: January 16, 2024
Assignee: Arm Limited
Inventors: Erik Persson, Graeme Leslie Ingram, Rune Holm, John Wakefield Brothers, III