Patents by Inventor Stefanos Kaxiras

Stefanos Kaxiras has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12001845
    Abstract: An apparatus comprises first instruction execution circuitry, second instruction execution circuitry, and a decoupled access buffer. Instructions of an ordered sequence of instructions are issued to one of the first and second instruction execution circuitry for execution in dependence on whether the instruction has a first type label or a second type label. An instruction with the first type label is an access-related instruction which determines at least one characteristic of a load operation to retrieve a data value from a memory address. Instruction execution by the first instruction execution circuitry of instructions having the first type label is prioritised over instruction execution by the second instruction execution circuitry of instructions having the second type label. Data values retrieved from memory as a result of execution of the first type instructions are stored in the decoupled access buffer.
    Type: Grant
    Filed: October 15, 2020
    Date of Patent: June 4, 2024
    Assignee: Arm Limited
    Inventors: Mbou Eyole, Stefanos Kaxiras
  • Patent number: 11899940
    Abstract: When load requests are generated to support data processing operations, the load requests are buffered in pending load buffer circuitry prior to being carried out. Coalescing circuitry determines for a first load request whether a set of one or more subsequent load requests buffered in the pending load buffer circuitry satisfies an address proximity condition. The address proximity condition is satisfied when all data items identified by the set of one or more subsequent load requests are comprised within a series of data items which will be retrieved from the memory system in response to the first load request. When the address proximity condition is satisfied, forwarding of the set of one or more subsequent load requests is suppressed.
    Type: Grant
    Filed: October 7, 2020
    Date of Patent: February 13, 2024
    Assignee: Arm Limited
    Inventors: Mbou Eyole, Stefanos Kaxiras
  • Patent number: 11886881
    Abstract: Apparatuses and methods are provided, relating to the control of data processing in devices which comprise both decoupled access-execute processing circuitry and prefetch circuitry. Control of the access portion of the decoupled access-execute processing circuitry may be dependent on a performance metric of the prefetch circuitry. Alternatively or in addition, control of the prefetch circuitry may be dependent on a performance metric of the access portion.
    Type: Grant
    Filed: December 21, 2020
    Date of Patent: January 30, 2024
    Assignee: Arm Limited
    Inventors: Mbou Eyole, Michiel Willem Van Tol, Stefanos Kaxiras
  • Publication number: 20230120783
    Abstract: Apparatuses and methods are provided, relating to the control of data processing in devices which comprise both decoupled access-execute processing circuitry and prefetch circuitry. Control of the access portion of the decoupled access-execute processing circuitry may be dependent on a performance metric of the prefetch circuitry. Alternatively or in addition, control of the prefetch circuitry may be dependent on a performance metric of the access portion.
    Type: Application
    Filed: December 21, 2020
    Publication date: April 20, 2023
    Inventors: Mbou EYOLE, Michiel Willem VAN TOL, Stefanos KAXIRAS
  • Publication number: 20220391214
    Abstract: An apparatus comprises first instruction execution circuitry, second instruction execution circuitry, and a decoupled access buffer. Instructions of an ordered sequence of instructions are issued to one of the first and second instruction execution circuitry for execution in dependence on whether the instruction has a first type label or a second type label. An instruction with the first type label is an access-related instruction which determines at least one characteristic of a load operation to retrieve a data value from a memory address. Instruction execution by the first instruction execution circuitry of instructions having the first type label is prioritised over instruction execution by the second instruction execution circuitry of instructions having the second type label. Data values retrieved from memory as a result of execution of the first type instructions are stored in the decoupled access buffer.
    Type: Application
    Filed: October 15, 2020
    Publication date: December 8, 2022
    Inventors: Mbou EYOLE, Stefanos KAXIRAS
  • Publication number: 20220391101
    Abstract: When load requests are generated to support data processing operations, the load requests are buffered in pending load buffer circuitry prior to being carried out. Coalescing circuitry determines for a first load request whether a set of one or more subsequent load requests buffered in the pending load buffer circuitry satisfies an address proximity condition. The address proximity condition is satisfied when all data items identified by the set of one or more subsequent load requests are comprised within a series of data items which will be retrieved from the memory system in response to the first load request. When the address proximity condition is satisfied, forwarding of the set of one or more subsequent load requests is suppressed.
    Type: Application
    Filed: October 7, 2020
    Publication date: December 8, 2022
    Inventors: Mbou EYOLE, Stefanos KAXIRAS
  • Patent number: 11334485
    Abstract: A computer system for dynamic enforcement of store atomicity includes multiple processor cores, local cache memory for each processor core, a shared memory, a separate store buffer for each processor core for executed stores that are not yet performed, and a coherence mechanism. A first processor core load on a first processor core receives a value at a first time from a first processor core store in the store buffer and prevents any other first processor core load younger than the first processor core load in program order from committing until a second time when the first processor core store is performed. Between the first time and the second time, any load younger in program order than the first processor core load and having an address matched by coherence invalidation or an address matched by an eviction is squashed.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: May 17, 2022
    Assignee: ETA SCALE AB
    Inventors: Stefanos Kaxiras, Alberto Ros
  • Patent number: 11237966
    Abstract: Synchronization events associated with cache coherence are monitored without using invalidations. A callback-read is issued to a memory address associated with the synchronization event, which callback-read either reads the last value written in the memory address or blocks until a next write takes place in the memory address and reads a newly written value.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: February 1, 2022
    Assignee: ETA SCALE AB
    Inventors: Stefanos Kaxiras, Alberto Ros
  • Patent number: 11188464
    Abstract: Methods and systems for self-invalidating cachelines in a computer system having a plurality of cores are described. A first one of the plurality of cores requests to load a memory block from a cache memory local to the first one of the plurality of cores, which request results in a cache miss. This results in checking a read-after-write detection structure to determine if a race condition exists for the memory block. If a race condition exists for the memory block, program order is enforced by the first one of the plurality of cores at least between any older loads and any younger loads with respect to the load that detects the prior store in the first one of the plurality of cores that issued the load of the memory block, and one or more cache lines in the local cache memory are self-invalidated.
    Type: Grant
    Filed: December 11, 2019
    Date of Patent: November 30, 2021
    Assignee: ETA SCALE AB
    Inventors: Alberto Ros, Stefanos Kaxiras
  • Publication number: 20210365554
    Abstract: A system and method for mitigating micro-architectural replay attacks in a processing system by delaying speculative execution on the processing system of a set of processor instructions upon detection that the set of processor instructions are part of a micro-architectural replay attack by detecting repeating speculative execution of the set of processor instructions interleaved with misspeculation and squashing of the set of processor instructions.
    Type: Application
    Filed: May 25, 2021
    Publication date: November 25, 2021
    Inventors: Christos SAKALIS, Stefanos KAXIRAS, Magnus SJÄLANDER
  • Patent number: 11163576
    Abstract: A system and method for efficiently preventing visible side-effects in the memory hierarchy during speculative execution is disclosed. Hiding the side-effects of executed instructions in the whole memory hierarchy is both expensive, in terms of performance and energy, and complicated. A system and method is disclosed to hide the side-effects of speculative loads in the cache(s) until the earliest time these speculative loads become non-speculative. A refinement is disclosed where loads that hit in the L1 cache are allowed to proceed by keeping their side-effects on the L1 cache hidden until these loads become non-speculative, and all other speculative loads that miss in the cache(s) are prevented from executing until they become non-speculative. To limit the performance deterioration caused by these delayed loads, a system and method is disclosed that augments the cache(s) with a value predictor or a re-computation engine that supplies predicted or recomputed values to the loads that missed in the cache(s).
    Type: Grant
    Filed: March 20, 2020
    Date of Patent: November 2, 2021
    Assignee: ETA SCALE AB
    Inventors: Christos Sakalis, Stefanos Kaxiras, Alberto Ros, Alexandra Jimborean, Magnus Själander
  • Patent number: 11119920
    Abstract: A method for performing store buffer coalescing in a multiprocessor computer system includes forming, in a coalescing store buffer associated with a core in said multiprocessor system, an atomic group of writes; and performing each individual write in said atomic group in an order which is a function of an address in a memory system to which each of the writes in said atomic group is being written.
    Type: Grant
    Filed: April 18, 2019
    Date of Patent: September 14, 2021
    Assignee: ETA SCALE AB
    Inventors: Alberto Ros, Stefanos Kaxiras
  • Patent number: 11106468
    Abstract: Methods and systems for maintaining validity of a memory model in a multiple core computer system are described. A first core prevents a store instruction from being performed by another core until a condition is met which enables reordered instructions to validly execute.
    Type: Grant
    Filed: May 23, 2018
    Date of Patent: August 31, 2021
    Assignee: ETA SCALE AB
    Inventors: Alberto Ros, Stefanos Kaxiras
  • Patent number: 11068410
    Abstract: According to embodiments described herein, the hierarchical complexity for coherence protocols associated with clustered cache architectures can be encapsulated in a simple function, i.e., that of determining when a data block is shared entirely within a cluster (i.e., a sub-tree of the hierarchy) and is private from the outside. This allows embodiments to eliminate complex recursive coherence operations that span the hierarchy and instead employ simple coherence mechanisms such as self-invalidation and write-through but which are restricted to operate where a data block is shared. Thus embodiments recognize that, in the context of clustered cache hierarchies, data can be shared entirely within one cluster but can be private (unshared) to this cluster when viewed from the perspective of other clusters. This characteristic of the data can be determined and then used to locally simplify coherence protocols.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: July 20, 2021
    Assignee: ETA SCALE AB
    Inventors: Alberto Ros, Stefanos Kaxiras
  • Patent number: 10915466
    Abstract: Caches may be vulnerable to side-channel attacks, such as Spectre and Meltdown, that involve speculative execution of instructions, revealing information about a cache that the attacker is not permitted to access. Access permission may be stored in the cache, such as in an entry of a cache table or in the region information for a cache table. Optionally, the access permission may be re-checked if the access permission changes while a memory instruction is pending. Optionally, a random index value may be stored in a cache and used, at least in part, to identify a memory location of a cacheline. Optionally, cachelines that are involved in speculative loads for memory instructions may be marked as speculative. On condition of resolving the speculative load as non-speculative, the cacheline may be marked as non-speculative; and on condition of resolving the speculative load as mis-speculated, the cacheline may be removed from the cache.
    Type: Grant
    Filed: February 27, 2019
    Date of Patent: February 9, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Erik Ernst Hagersten, David Black-Schaffer, Stefanos Kaxiras
  • Publication number: 20200301712
    Abstract: A system and method for efficiently preventing visible side-effects in the memory hierarchy during speculative execution is disclosed. Hiding the side-effects of executed instructions in the whole memory hierarchy is both expensive, in terms of performance and energy, and complicated. A system and method is disclosed to hide the side-effects of speculative loads in the cache(s) until the earliest time these speculative loads become non-speculative. A refinement is disclosed where loads that hit in the L1 cache are allowed to proceed by keeping their side-effects on the L1 cache hidden until these loads become non-speculative, and all other speculative loads that miss in the cache(s) are prevented from executing until they become non-speculative. To limit the performance deterioration caused by these delayed loads, a system and method is disclosed that augments the cache(s) with a value predictor or a re-computation engine that supplies predicted or recomputed values to the loads that missed in the cache(s).
    Type: Application
    Filed: March 20, 2020
    Publication date: September 24, 2020
    Inventors: Christos SAKALIS, Stefanos KAXIRAS, Alberto ROS, Alexandra JIMBOREAN, Magnus SJÄLANDER
  • Publication number: 20200192801
    Abstract: A computer system for dynamic enforcement of store atomicity includes multiple processor cores, local cache memory for each processor core, a shared memory, a separate store buffer for each processor core for executed stores that are not yet performed, and a coherence mechanism. A first processor core load on a first processor core receives a value at a first time from a first processor core store in the store buffer and prevents any other first processor core load younger than the first processor core load in program order from committing until a second time when the first processor core store is performed. Between the first time and the second time, any load younger in program order than the first processor core load and having an address matched by coherence invalidation or an address matched by an eviction is squashed.
    Type: Application
    Filed: December 16, 2019
    Publication date: June 18, 2020
    Inventors: Stefanos KAXIRAS, Alberto ROS
  • Patent number: 10671543
    Abstract: Methods and systems which, for example, reduce energy usage in cache memories are described. Cache location information regarding the location of cachelines which are stored in a tracked portion of a memory hierarchy is stored in a cache location table. Address tags are stored with corresponding location information in the cache location table to associate the address tag with the cacheline and its cache location information. When a cacheline is moved to a new location in the memory hierarchy, the cache location table is updated so that the cache location information indicates where the cacheline is located within the memory hierarchy.
    Type: Grant
    Filed: November 20, 2014
    Date of Patent: June 2, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Erik Hagersten, Andreas Sembrant, David Black-Schaffer, Stefanos Kaxiras
  • Publication number: 20200110703
    Abstract: Methods and systems for self-invalidating cachelines in a computer system having a plurality of cores are described. A first one of the plurality of cores requests to load a memory block from a cache memory local to the first one of the plurality of cores, which request results in a cache miss. This results in checking a read-after-write detection structure to determine if a race condition exists for the memory block. If a race condition exists for the memory block, program order is enforced by the first one of the plurality of cores at least between any older loads and any younger loads with respect to the load that detects the prior store in the first one of the plurality of cores that issued the load of the memory block, and one or more cache lines in the local cache memory are self-invalidated.
    Type: Application
    Filed: December 11, 2019
    Publication date: April 9, 2020
    Inventors: Alberto ROS, Stefanos KAXIRAS
  • Patent number: 10528471
    Abstract: Methods and systems for self-invalidating cachelines in a computer system having a plurality of cores are described. A first one of the plurality of cores requests to load a memory block from a cache memory local to the first one of the plurality of cores, which request results in a cache miss. This results in checking a read-after-write detection structure to determine if a race condition exists for the memory block. If a race condition exists for the memory block, program order is enforced by the first one of the plurality of cores at least between any older loads and any younger loads with respect to the load that detects the prior store in the first one of the plurality of cores that issued the load of the memory block, and one or more cache lines in the local cache memory are self-invalidated.
    Type: Grant
    Filed: December 27, 2017
    Date of Patent: January 7, 2020
    Assignee: ETA SCALE AB
    Inventors: Alberto Ros, Stefanos Kaxiras
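
Illustrative sketch: the address proximity condition described in the abstract of patent 11899940 (load request coalescing) can be modelled in a few lines of software. This is only a reading aid under stated assumptions, not the patented circuitry. The names LoadRequest, LINE_BYTES, fetched_range, satisfies_proximity and forward are hypothetical, and the 64-byte aligned block stands in for the "series of data items" retrieved in response to the first load request, whose actual size the abstract does not specify.

```python
from dataclasses import dataclass

# Assumption: one memory access returns an aligned 64-byte series of data items.
LINE_BYTES = 64


@dataclass(frozen=True)
class LoadRequest:
    address: int   # byte address of the requested data item
    size: int = 8  # bytes requested (assumed default)


def fetched_range(first):
    """Half-open byte range [lo, hi) retrieved in response to the first request."""
    lo = first.address - first.address % LINE_BYTES
    return lo, lo + LINE_BYTES


def satisfies_proximity(first, subsequent):
    """True when every data item in `subsequent` lies inside the fetched series."""
    lo, hi = fetched_range(first)
    return all(lo <= r.address and r.address + r.size <= hi for r in subsequent)


def forward(pending):
    """Return the requests actually forwarded to memory.

    Simplification: each pending request covered by an earlier forwarded
    request is suppressed individually, in buffer order.
    """
    forwarded = []
    remaining = list(pending)
    while remaining:
        first = remaining.pop(0)
        forwarded.append(first)
        # Suppress later requests whose data the first request already retrieves.
        remaining = [r for r in remaining if not satisfies_proximity(first, [r])]
    return forwarded


pending = [LoadRequest(0x1000), LoadRequest(0x1008),
           LoadRequest(0x1040), LoadRequest(0x1038)]
print([hex(r.address) for r in forward(pending)])
```

In this model the requests at 0x1008 and 0x1038 fall within the block fetched for 0x1000 and are suppressed, while 0x1040 starts a new block and is forwarded on its own.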