Patents by Inventor John Kalamatianos

John Kalamatianos has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210232505
    Abstract: Systems, apparatuses, and methods for implementing flexible dictionary sharing techniques for caches are disclosed. A set-associative cache includes a dictionary for each data array set. When a cache line is to be allocated in the cache, a cache controller determines to which set a base index of the cache line address maps. Then, a selector unit determines which dictionary of a group of dictionaries stored by those sets neighboring this set would achieve the most compression for the cache line. This dictionary is then selected to compress the cache line. An offset is added to the base index of the cache line to generate a full index in order to map the cache line to the set corresponding to this chosen dictionary. The compressed cache line is stored in this set with the chosen dictionary, and the offset is stored in the corresponding tag array entry.
    Type: Application
    Filed: April 15, 2021
    Publication date: July 29, 2021
    Inventors: Alexander D. Breslow, John Kalamatianos
  • Patent number: 11038526
    Abstract: Various energy efficient data encoding schemes and computing devices are disclosed. In one aspect, a method of transmitting data from a transmitter to a receiver connected by plural wires is provided. The method includes sending from the transmitter on at least one but not all of the wires a first wave form that has first and second signal transitions. The receiver receives the first waveform and measures a first duration between the first and second signal transitions using a locally generated clock signal not received from the transmitter. The first duration is indicative of a first particular data value.
    Type: Grant
    Filed: February 17, 2020
    Date of Patent: June 15, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Greg Sadowski, John Kalamatianos
  • Patent number: 11023242
    Abstract: A method and apparatus of asynchronous scheduling in a graphics device includes sending one or more instructions from an instruction scheduler to one or more instruction first-in/first-out (FIFO) devices. An instruction in the one or more FIFO devices is selected for execution by a single-instruction/multiple-data (SIMD) pipeline unit. It is determined whether all operands for the selected instruction are available for execution of the instruction, and if all the operands are available, the selected instruction is executed on the SIMD pipeline unit. The self-timed arithmetic pipeline unit (SIMD pipeline unit) is effectively encapsulated in a synchronous, (e.g., clocked by global clock), scheduler and register file environment.
    Type: Grant
    Filed: January 27, 2017
    Date of Patent: June 1, 2021
    Assignees: ATI TECHNOLOGIES ULC, ADVANCED MICRO DEVICES, INC.
    Inventors: John Kalamatianos, Greg Sadowski, Syed Zohaib M. Gilani
  • Patent number: 11016763
    Abstract: Systems, apparatuses, and methods for compacting multiple groups of micro-operations into individual cache lines of a micro-operation cache are disclosed. A processor includes at least a decode unit and a micro-operation cache. When a new group of micro-operations is decoded and ready to be written to the micro-operation cache, the micro-operation cache determines which set is targeted by the new group of micro-operations. If there is a way in this set that can store the new group without evicting any existing group already stored in the way, then the new group is stored into the way with the existing group(s) of micro-operations. Metadata is then updated to indicate that the new group of micro-operations has been written to the way. Additionally, the micro-operation cache manages eviction and replacement policy at the granularity of micro-operation groups rather than at the granularity of cache lines.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: May 25, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jagadish B. Kotra, John Kalamatianos
  • Publication number: 20210149672
    Abstract: Systems, apparatuses, and methods for virtualizing a micro-operation cache are disclosed. A processor includes at least a micro-operation cache, a conventional cache subsystem, a decode unit, and control logic. The decode unit decodes instructions into micro-operations which are then stored in the micro-operation cache. The micro-operation cache has limited capacity for storing micro-operations. When new micro-operations are decoded from pending instructions, existing micro-operations are evicted from the micro-operation cache to make room for the new micro-operations. Rather than being discarded, micro-operations evicted from the micro-operation cache are stored in the conventional cache subsystem. This prevents the original instruction from having to be decoded again on subsequent executions.
    Type: Application
    Filed: December 17, 2020
    Publication date: May 20, 2021
    Inventors: John Kalamatianos, Jagadish B. Kotra
  • Publication number: 20210141740
    Abstract: A technique for accessing a memory having a high latency portion and a low latency portion is provided. The technique includes detecting a promotion trigger to promote data from the high latency portion to the low latency portion, in response to the promotion trigger, copying cache lines associated with the promotion trigger from the high latency portion to the low latency portion, and in response to a read request, providing data from either or both of the high latency portion or the low latency portion, based on a state associated with data in the high latency portion and the low latency portion.
    Type: Application
    Filed: November 13, 2019
    Publication date: May 13, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Apostolos Kokolis, Shrikanth Ganapathy
  • Patent number: 10990393
    Abstract: Address-based filtering for load/store speculation includes maintaining a filtering table including table entries associated with ranges of addresses; in response to receiving an ordering check triggering transaction, querying the filtering table using a target address of the ordering check triggering transaction to determine if an instruction dependent upon the ordering check triggering transaction has previously been generated a physical address; and in response to determining that the filtering table lacks an indication that the instruction dependent upon the ordering check triggering transaction has previously been generated a physical address, bypassing a lookup operation in an ordering violation memory structure to determine whether the instruction dependent upon the ordering check triggering transaction is currently in-flight.
    Type: Grant
    Filed: October 21, 2019
    Date of Patent: April 27, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: John Kalamatianos, Krishnan V. Ramani, Susumu Mashimo
  • Publication number: 20210117195
    Abstract: Address-based filtering for load/store speculation includes maintaining a filtering table including table entries associated with ranges of addresses; in response to receiving an ordering check triggering transaction, querying the filtering table using a target address of the ordering check triggering transaction to determine if an instruction dependent upon the ordering check triggering transaction has previously been generated a physical address; and in response to determining that the filtering table lacks an indication that the instruction dependent upon the ordering check triggering transaction has previously been generated a physical address, bypassing a lookup operation in an ordering violation memory structure to determine whether the instruction dependent upon the ordering check triggering transaction is currently in-flight.
    Type: Application
    Filed: October 21, 2019
    Publication date: April 22, 2021
    Inventors: JOHN KALAMATIANOS, KRISHNAN V. RAMANI, SUSUMU MASHIMO
  • Publication number: 20210117269
    Abstract: A system and method for protecting memory instructions against faults are described. The system and method include converting the slave instructions to dummy operations, modifying memory arbiter to issue up to N master and N slave global/shared memory instructions per cycle, sending master memory requests to memory system, using slave requests for error checking, entering master requests to the GM/LM FIFO, storing slave requests in a register, and comparing the entered master requests with the stored slave requests.
    Type: Application
    Filed: December 7, 2020
    Publication date: April 22, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Michael Mantor, Sudhanva Gurumurthi
  • Patent number: 10983915
    Abstract: Systems, apparatuses, and methods for implementing flexible dictionary sharing techniques for caches are disclosed. A set-associative cache includes a dictionary for each data array set. When a cache line is to be allocated in the cache, a cache controller determines to which set a base index of the cache line address maps. Then, a selector unit determines which dictionary of a group of dictionaries stored by those sets neighboring this set would achieve the most compression for the cache line. This dictionary is then selected to compress the cache line. An offset is added to the base index of the cache line to generate a full index in order to map the cache line to the set corresponding to this chosen dictionary. The compressed cache line is stored in this set with the chosen dictionary, and the offset is stored in the corresponding tag array entry.
    Type: Grant
    Filed: August 19, 2019
    Date of Patent: April 20, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexander D. Breslow, John Kalamatianos
  • Publication number: 20210089324
    Abstract: An asynchronous pipeline includes a first stage and one or more second stages. A controller provides control signals to the first stage to indicate a modification to an operating speed of the first stage. The modification is determined based on a comparison of a completion status of the first stage to one or more completion statuses of the one or more second stages. In some cases, the controller provides control signals indicating modifications to an operating voltage applied to the first stage and a drive strength of a buffer in the first stage. Modules can be used to determine the completion statuses of the first stage and the one or more second stages based on the monitored output signals generated by the stages, output signals from replica critical paths associated with the stages, or a lookup table that indicates estimated completion times.
    Type: Application
    Filed: June 26, 2020
    Publication date: March 25, 2021
    Inventors: Greg Sadowski, John Kalamatianos, Shomit N. Das
  • Publication number: 20210056036
    Abstract: Systems, apparatuses, and methods for implementing flexible dictionary sharing techniques for caches are disclosed. A set-associative cache includes a dictionary for each data array set. When a cache line is to be allocated in the cache, a cache controller determines to which set a base index of the cache line address maps. Then, a selector unit determines which dictionary of a group of dictionaries stored by those sets neighboring this set would achieve the most compression for the cache line. This dictionary is then selected to compress the cache line. An offset is added to the base index of the cache line to generate a full index in order to map the cache line to the set corresponding to this chosen dictionary. The compressed cache line is stored in this set with the chosen dictionary, and the offset is stored in the corresponding tag array entry.
    Type: Application
    Filed: August 19, 2019
    Publication date: February 25, 2021
    Inventors: Alexander D. Breslow, John Kalamatianos
  • Publication number: 20210050864
    Abstract: A data processing platform, method, and program product perform compression and decompression of a set of data items. Suffix data and a prefix are selected for each respective data item in the set of data items based on data content of the respective data item. The set of data items is sorted based on the prefixes. The prefixes are encoded by querying multiple encoding tables to create a code word containing compressed information representing values of all prefixes for the set of data items. The code word and suffix data for each of the data items are stored in memory. The code word is decompressed to recover the prefixes. The recovered prefixes are paired with their respective suffix data.
    Type: Application
    Filed: August 16, 2019
    Publication date: February 18, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Alexander D. Breslow, Nuwan Jayasena, John Kalamatianos
  • Patent number: 10908991
    Abstract: A computing device having a cache memory that is configured in a write-back mode is described. A cache controller in the cache memory acquires, from a record of bit errors that are present in each of a plurality of portions of the cache memory, a number of bit errors in a portion of the cache memory. The cache controller detects a coherency state of data stored in the portion of the cache memory. Based on the coherency state and the number of bit errors, the cache controller selects an error protection from among a plurality of error protections. The cache controller uses the selected error protection to protect the data stored in the portion of the cache memory from errors.
    Type: Grant
    Filed: September 6, 2018
    Date of Patent: February 2, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: John Kalamatianos, Shrikanth Ganapathy
  • Patent number: 10884751
    Abstract: Systems, apparatuses, and methods for virtualizing a micro-operation cache are disclosed. A processor includes at least a micro-operation cache, a conventional cache subsystem, a decode unit, and control logic. The decode unit decodes instructions into micro-operations which are then stored in the micro-operation cache. The micro-operation cache has limited capacity for storing micro-operations. When new micro-operations are decoded from pending instructions, existing micro-operations are evicted from the micro-operation cache to make room for the new micro-operations. Rather than being discarded, micro-operations evicted from the micro-operation cache are stored in the conventional cache subsystem. This prevents the original instruction from having to be decoded again on subsequent executions.
    Type: Grant
    Filed: July 13, 2018
    Date of Patent: January 5, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Jagadish B. Kotra
  • Patent number: 10884940
    Abstract: A method of operating a cache in a computing device includes, in response to receiving a memory access request at the cache, determining compressibility of data specified by the request, selecting in the cache a destination portion for storing the data based on the compressibility of the data and a persistent fault history of the destination portion, and storing a compressed copy of the data in a non-faulted subportion of the destination portion, wherein the persistent fault history indicates that the non-faulted subportion excludes any persistent faults.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: January 5, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Shrikanth Ganapathy, Shomit Das, Matthew Tomei
  • Publication number: 20200409851
    Abstract: Techniques for generating a model for predicting when different hybrid prefetcher configurations should be used are disclosed. Techniques for using the model to predict when different hybrid prefetcher configurations should be used are also disclosed. The techniques for generating the model include obtaining a set of input data, and generating trees based on the training data. Each tree is associated with a different hybrid prefetcher configuration and the trees output certainty scores for the associated hybrid prefetcher configuration based on hardware feature measurements. To decide on a hybrid prefetcher configuration to use, a prefetcher traverses multiple trees to obtain certainty scores for different hybrid prefetcher configurations and identifies a hybrid prefetcher configuration to used based on a comparison of the certainty scores.
    Type: Application
    Filed: June 26, 2019
    Publication date: December 31, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Paul S. Keltcher, Mayank Chhablani, Alok Garg, Furkan Eris
  • Publication number: 20200401529
    Abstract: Wavefront loading in a processor is managed and includes monitoring a selected wavefront of a set of wavefronts. Reuse of memory access requests for the selected wavefront is counted. A cache hit rate in one or more caches of the processor is determined based on the counted reuse. Based on the cache hit rate, subsequent memory requests of other wavefronts of the set of wavefronts are modified by including a type of reuse of cache lines in requests to the caches. In the caches, storage of data in the caches is based on the type of reuse indicated by the subsequent memory access requests. Reused cache lines are protected by preventing cache line contents from being replaced by another cache line for a duration of processing the set of wavefronts. Caches are bypassed when streaming access requests are made.
    Type: Application
    Filed: June 19, 2019
    Publication date: December 24, 2020
    Inventors: Xianwei ZHANG, John KALAMATIANOS, Bradford BECKMANN
  • Patent number: 10860418
    Abstract: A system and method for protecting memory instructions against faults are described. The system and method include converting the slave instructions to dummy operations, modifying memory arbiter to issue up to N master and N slave global/shared memory instructions per cycle, sending master memory requests to memory system, using slave requests for error checking, entering master requests to the GM/LM FIFO, storing slave requests in a register, and comparing the entered master requests with the stored slave requests.
    Type: Grant
    Filed: April 8, 2019
    Date of Patent: December 8, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: John Kalamatianos, Michael Mantor, Sudhanva Gurumurthi
  • Patent number: 10853075
    Abstract: An electronic device handles accesses of a branch prediction functional block when executing instructions in program code. The electronic device includes a processor having the branch prediction functional block that provides branch prediction information for control transfer instructions (CTIs) in the program code and a minimum predictor use (MPU) functional block. The MPU functional block determines, based on a record associated with a given fetch group of instructions, that a specified number of subsequent fetch groups of instructions that were previously determined to include no CTIs or conditional CTIs that were not taken are to be fetched for execution in sequence following the given fetch group. The MPU functional block then, when each of the specified number of the subsequent fetch groups is fetched and prepared for execution, prevents corresponding accesses of the branch prediction functional block for acquiring branch prediction information for instructions in that subsequent fetch group.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: December 1, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Varun Agrawal, John Kalamatianos, Adithya Yalavarti, Jingjie Qian