Patents by Inventor Rune Holm

Rune Holm has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250252706
    Abstract: A processor, method, and non-transitory computer-readable storage medium for processing template data and search data according to a search window applied to the search data. The search window comprises a set of offset positions. The processing is performed by a block matching engine (BME) that produces a tensor with difference values, and a convolutional engine (CE) that performs a convolutional operation on the tensor. The processing is performed in an iterative interleaved fashion, by dividing the set of offset positions into a plurality of subsets of offset positions. In parallel with processing of the first X offset positions by the CE, the BME generates the next X channels of the tensor, which are subsequently pipelined through to the CE via an internal storage, etc.
    Type: Application
    Filed: February 2, 2024
    Publication date: August 7, 2025
    Inventors: John Wakefield BROTHERS, III, Metin Gokhan ÜNAL, Balaji VENU, Rune HOLM
  • Publication number: 20250251908
    Abstract: A processor, method, and non-transitory computer-readable storage medium for performing a block matching between first data and second data are provided. The block matching is performed using an iterative process, wherein, for each iteration, a portion of the first data and a corresponding portion of the second data are selected using a sliding window approach. When difference data, used for block matching, is calculated for a specific subset of first and second data, many of the calculations overlap with those needed for a nearby subset. Summed area table (SAT) data used for determining the difference data is continuously stored and updated in a buffer, such that overlapping computations can be avoided.
    Type: Application
    Filed: February 2, 2024
    Publication date: August 7, 2025
    Inventors: John Wakefield BROTHERS, III, Metin Gokhan ÜNAL, Balaji VENU, Rune HOLM
  • Patent number: 12333626
    Abstract: A processor, method and non-transitory computer-readable storage medium for handling data, by obtaining task data describing a task to be executed in the form of a plurality of operations on data, the task data further defining an operation space of said data, and analyzing each of the operations to define transformation data comprising transformation instructions representing a transform into an associated operation-specific local space. If the transformation instructions to get to the operation-specific local space for an operation produce fewer dimensions than the operation space, one or more operation-specific arguments are stored in a data field corresponding to a dimension not produced by the transformation instructions in the transformation data corresponding to the operation.
    Type: Grant
    Filed: March 15, 2023
    Date of Patent: June 17, 2025
    Assignee: Arm Limited
    Inventors: Rune Holm, Elliot Maurice Simon Rosemarine
  • Publication number: 20250181932
    Abstract: A data processing system, the data processing system comprising a command processing unit and a processor that is configured to perform processing, the processor comprising: multiple execution units configured to perform processing operations for a type of work; and a control circuit configured to distribute processing tasks to the multiple execution units to cause the multiple execution units to perform processing operations for the type of work in response to asynchronous commands provided to the control circuit by the command processing unit; wherein dependency tracking is compared against an array of counters to indicate dependencies within the array of counters; wherein the indicated dependencies are provided to the control circuit by the command processing unit in the asynchronous commands to indicate for the type of work that dependencies have been resolved or that dependencies exist.
    Type: Application
    Filed: November 30, 2023
    Publication date: June 5, 2025
    Inventors: Elliot Maurice Simon Rosemarine, Tobin Deene Ehlis, Rune Holm, Jared Corey Smolens
  • Publication number: 20250165292
    Abstract: The present disclosure relates to a data processor for processing data, comprising: a plurality of execution units to execute one or more operations; and a plurality of storage elements to store data for the one or more operations, the data processor being configured to process at least one task, each task to be executed in the form of a directed acyclic graph of operations, wherein each of the operations maps to a corresponding execution unit and each connection between operations in the acyclic graph maps to a corresponding storage element, the data processor further comprising: a plurality of counters; and a control module to control the plurality of counters to: in a first mode, count an operation cycle number associated with each operation of the at least one task, the operation cycle number of an operation being a number of cycles required to complete the operation; and in a second mode, count a unit cycle number associated with one or more execution units, the unit cycle number of an execution unit bei
    Type: Application
    Filed: November 17, 2023
    Publication date: May 22, 2025
    Inventors: Dominic Hugo Symes, Rune Holm, Thomas Patrik Andreas Olsson
  • Publication number: 20250077841
    Abstract: Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to adapt a neural network structure to a target platform. One or more performance metrics of an execution of the neural network structure may be implemented by one or more target hardware elements. A module from a library of modules may be selected to replace one or more elements of the neural network structure based, at least in part, on the observed one or more performance metrics.
    Type: Application
    Filed: August 30, 2023
    Publication date: March 6, 2025
    Inventors: Rune Holm, Anton Kachatkou, Benjamin Klimczak, Ruomei Yan, Diego Russo
  • Publication number: 20240370301
    Abstract: The present disclosure relates to a system, method and non-transitory computer-readable storage medium for handling data. From a directed acyclic graph (DAG) of operations on input data, a sub-graph of operations is identified and issued as task data to be executed by a processing module, wherein each of the operations in the sub-graph maps to a corresponding execution unit of the processing module of the system and wherein each connection between operations maps to a corresponding storage element of the processing module. The sub-graph is identified such that a simulation of an execution of the operations of the candidate sub-graph according to a determined size of the processing unit of said input data shows that the processing module can execute the operations of the sub-graph such that memory constraints of the processing module are met and read-write operations to memory external to the processing module are avoided or reduced.
    Type: Application
    Filed: April 19, 2024
    Publication date: November 7, 2024
    Applicant: Arm Limited
    Inventors: Elliot Maurice Simon Rosemarine, Rune Holm
  • Patent number: 12124935
    Abstract: A computer-implemented method, performed in a neural processing system comprising control processor circuitry and arithmetic logic circuitry, of performing a convolution between an input feature map (IFM) and convolutional filter data, resulting in an output feature map (OFM). The method includes obtaining, in the control processor circuitry, dimensional characteristic parameters relating to dimensions of input work batch data arrays and positional characteristic parameters relating to positions of feature map content within the input work batches. The method also includes, in the arithmetic logic circuitry, performing convolutions between the input work batches, generated from the IFM based on the dimensional characteristic parameters and the positional characteristic parameters, and work batch filter data arrays corresponding to the filter to produce a plurality of output work batch data arrays. The plurality of output work batches are combined to generate an OFM.
    Type: Grant
    Filed: February 21, 2020
    Date of Patent: October 22, 2024
    Assignee: Arm Limited
    Inventors: Suraj Sudhir, Jayavarapu Srinivasa Rao, Rune Holm
  • Publication number: 20240311947
    Abstract: A processor, method and non-transitory computer-readable storage medium for handling data, by obtaining task data describing a task to be executed in the form of a plurality of operations on data, the task data further defining an operation space of said data, and analyzing each of the operations to define transformation data comprising transformation instructions representing a transform into an associated operation-specific local space. If the transformation instructions to get to the operation-specific local space for an operation produce fewer dimensions than the operation space, one or more operation-specific arguments are stored in a data field corresponding to a dimension not produced by the transformation instructions in the transformation data corresponding to the operation.
    Type: Application
    Filed: March 15, 2023
    Publication date: September 19, 2024
    Inventors: Rune HOLM, Elliot Maurice Simon ROSEMARINE
  • Publication number: 20240248753
    Abstract: A processor to: receive a task to be executed, the task comprising a task-based parameter associated with the task, for use in determining a position, within an array of data descriptors, of a particular data descriptor of a particular portion of data to be processed in executing the task. Each of the data descriptors in the array of data descriptors has a predetermined size and is indicative of a location in a storage system of a respective portion of data. The processor derives, based on the task, array location data indicative of a location in the storage system of a predetermined data descriptor, and obtains the particular data descriptor, based on the array location data and the task-based parameter. The processor obtains the particular portion of data based on the particular data descriptor and processes the particular portion of data in executing the task.
    Type: Application
    Filed: January 20, 2023
    Publication date: July 25, 2024
    Inventors: Elliot Maurice Simon ROSEMARINE, Alexander Eugene CHALFIN, Rune HOLM
  • Publication number: 20240248721
    Abstract: A method and apparatus for distributing operations for execution. Input data is received and is subdivided into portions, each comprising a first and second sub-portion. A first operation and a second operation are received. Dependencies between the first and second operations are identified. For each portion, the first operation is issued for execution on the first sub-portion to produce a first output sub-portion, and completion is tracked. The first operation is issued for execution on the second sub-portion to produce a second output sub-portion. Depending upon satisfaction of the dependencies in respect of the first sub-portion, either the second operation to be executed on the first output sub-portion is issued, if the dependencies are met, or the second operation to be executed on the first output sub-portion is stalled, if the dependencies are not met. This is repeated for each subsequent portion.
    Type: Application
    Filed: January 16, 2024
    Publication date: July 25, 2024
    Inventors: Rune HOLM, Alexander Eugene CHALFIN, Elliot Maurice Simon ROSEMARINE
  • Publication number: 20240248764
    Abstract: A memory unit configured for handling task data, the task data describing a task to be executed as a directed acyclic graph of operations, wherein each operation maps to a corresponding execution unit, and wherein each connection between operations in the acyclic graph maps to a corresponding storage element of the execution unit. The task data defines an operation space representing the dimensions of a multi-dimensional arrangement of the connected operations to be executed represented by the data blocks. The memory unit is configured to receive a sequence of processing requests comprising the one or more data blocks, with each data block assigned a priority value and comprising a block command. The memory unit is configured to arbitrate between the data blocks based upon the priority value and block command to prioritize the sequence of processing requests, wherein the processing requests include writing data to, or reading data from, storage.
    Type: Application
    Filed: May 12, 2023
    Publication date: July 25, 2024
    Inventors: Rune HOLM, Jens OLSON, Elliot Maurice Simon ROSEMARINE, Jared SMOLENS
  • Publication number: 20240248754
    Abstract: A processor to generate position data indicative of a position within a compressed data stream, wherein, previously, in executing a task, data of the compressed data stream ending at the position has been read by the processor from storage storing the compressed data stream. After reading the data, the processor reads further data of the compressed data stream from the storage, in executing the task, the further data located beyond the position within the compressed data stream. After reading the further data, the processor reads, based on the position data, a portion of the compressed data stream from the storage, in executing the task, starting from the position within the compressed data stream. The processor decompresses the portion of the compressed data stream to generate decompressed data, in executing the task.
    Type: Application
    Filed: January 20, 2023
    Publication date: July 25, 2024
    Inventors: Elliot Maurice Simon ROSEMARINE, Jared Corey SMOLENS, Rune HOLM, John Wakefield BROTHERS, III, Jens OLSON
  • Publication number: 20240248755
    Abstract: A processor comprising: a handling unit; and a plurality of components each configured to execute a function. The handling unit can receive a task comprising operations on data in a coordinate space having N dimensions, and receive a data structure describing execution of the task and comprising a partially ordered set of data items each associated with instructions usable by the plurality of components when executing the task. Each data item is associated with a component among the plurality of components, and each data item indicates dimensions of the coordinate space for which changes of coordinate cause the function of the associated component to execute, and dimensions of the coordinate space for which changes of coordinate cause the function of the associated component to store data ready to be used by another component. The handling unit iterates over the coordinate space and executes the task using the partially ordered set of data items.
    Type: Application
    Filed: January 20, 2023
    Publication date: July 25, 2024
    Inventors: Rune HOLM, Jens OLSON, Jared Corey SMOLENS, Dominic Hugo SYMES, Elliot Maurice Simon ROSEMARINE
  • Publication number: 20240231661
    Abstract: A processor to obtain mapping data indicative of at least one mapping parameter for a plurality of mapping blocks of a multi-dimensional tensor to be mapped. The at least one mapping parameter is for mapping corresponding elements of each mapping block to the same co-ordinate in at least one selected dimension of the multi-dimensional tensor, such that each mapping block corresponds to the same set of co-ordinates in the at least one selected dimension. A co-ordinate of an element of a block of the multi-dimensional tensor is determined. The element is comprised by a mapping block. A physical address in a storage corresponding to the co-ordinate is determined, based on the co-ordinate. The physical address is utilized in a process comprising an interaction between the block of the multi-dimensional tensor and the storage.
    Type: Application
    Filed: October 12, 2023
    Publication date: July 11, 2024
    Applicant: Arm Limited
    Inventors: Dominic Hugo Symes, Rune Holm
  • Patent number: 12032506
    Abstract: The present disclosure relates generally to multi-processor arrangements and, more particularly, to a point of serialization for broadcast communications within multi-processor arrangements.
    Type: Grant
    Filed: March 30, 2022
    Date of Patent: July 9, 2024
    Assignee: Arm Limited
    Inventors: Erik Persson, Graeme Leslie Ingram, Rune Holm, John Wakefield Brothers, III
  • Patent number: 12001369
    Abstract: The present disclosure relates generally to multi-processor arrangements and, more particularly, to broadcast regions for multi-processor arrangements.
    Type: Grant
    Filed: March 30, 2022
    Date of Patent: June 4, 2024
    Assignee: Arm Limited
    Inventors: Erik Persson, Graeme Leslie Ingram, Rune Holm, John Wakefield Brothers, III
  • Publication number: 20240134553
    Abstract: A processor to obtain mapping data indicative of at least one mapping parameter for a plurality of mapping blocks of a multi-dimensional tensor to be mapped. The at least one mapping parameter is for mapping corresponding elements of each mapping block to the same co-ordinate in at least one selected dimension of the multi-dimensional tensor, such that each mapping block corresponds to the same set of co-ordinates in the at least one selected dimension. A co-ordinate of an element of a block of the multi-dimensional tensor is determined. The element is comprised by a mapping block. A physical address in a storage corresponding to the co-ordinate is determined, based on the co-ordinate. The physical address is utilized in a process comprising an interaction between the block of the multi-dimensional tensor and the storage.
    Type: Application
    Filed: October 11, 2023
    Publication date: April 25, 2024
    Applicant: Arm Limited
    Inventors: Dominic Hugo Symes, Rune Holm
  • Publication number: 20240036919
    Abstract: A method and processor comprising a command processing unit to receive, from a host processor, a sequence of commands to be executed; and generate based on the sequence of commands a plurality of tasks. The processor also comprises a plurality of compute units each having a first processing module for executing tasks of a first task type, a second processing module for executing tasks of a second task type, different from the first task type, and a local cache shared by at least the first processing module and the second processing module. The command processing unit issues the plurality of tasks to at least one of the plurality of compute units, and wherein at least one of the plurality of compute units is to process at least one of the plurality of tasks.
    Type: Application
    Filed: July 26, 2023
    Publication date: February 1, 2024
    Applicant: Arm Limited
    Inventors: Alexander Eugene Chalfin, John Wakefield Brothers, III, Rune Holm, Samuel James Edward Martin
  • Patent number: 11874793
    Abstract: The present disclosure relates generally to multi-processor arrangements and, more particularly, to broadcast hubs for multi-processor arrangements. A processing tile may comprise a broadcast hub to obtain a plurality of parameters applicable in a particular operation from at least one of a plurality of processing tiles and initiate distribution of the plurality of parameters to the plurality of processing tiles, wherein the plurality of processing tiles may execute the particular operation based at least in part on the plurality of distributed parameters.
    Type: Grant
    Filed: March 30, 2022
    Date of Patent: January 16, 2024
    Assignee: Arm Limited
    Inventors: Erik Persson, Graeme Leslie Ingram, Rune Holm, John Wakefield Brothers, III
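
Illustrative note on publication 20250251908: the abstract refers to summed area table (SAT) data being reused so that overlapping difference calculations between nearby blocks are avoided. The sketch below shows only the general, textbook SAT technique applied to a sum-of-absolute-differences block score at a single (zero) offset. It is not the claimed method, it does not model the buffer update scheme the abstract mentions, and every function name in it (summed_area_table, block_sum, sad_map) is a hypothetical name chosen for illustration.

```python
def summed_area_table(grid):
    """Build a SAT with an extra zero row and column.

    sat[y][x] holds the sum of grid[0:y][0:x], so any rectangle sum
    can later be read with four lookups.
    """
    h = len(grid)
    w = len(grid[0]) if h else 0
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += grid[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def block_sum(sat, top, left, height, width):
    """Sum of the height x width rectangle whose top-left corner is (top, left)."""
    bottom, right = top + height, left + width
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


def sad_map(first, second, block_h, block_w):
    """Sum of absolute differences for every block position (zero offset only).

    The per-element differences are computed once; each overlapping block
    then costs four SAT lookups instead of block_h * block_w additions.
    """
    diff = [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first, second)]
    sat = summed_area_table(diff)
    h, w = len(diff), len(diff[0])
    return [[block_sum(sat, y, x, block_h, block_w)
             for x in range(w - block_w + 1)]
            for y in range(h - block_h + 1)]


if __name__ == "__main__":
    first = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    second = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    print(sad_map(first, second, 2, 2))
```

With the 3x3 inputs above and 2x2 blocks, the four overlapping blocks share one table, which is the kind of reuse the abstract describes in general terms.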