Operand Prefetch, E.g., Prefetch Instruction, Address Prediction (epo) Patents (Class 712/E9.047)
  • Patent number: 11972264
    Abstract: Processing circuitry performs processing operations in response to micro-operations. Front end circuitry supplies the micro-operations to be processed by the processing circuitry. Prediction circuitry generates a prediction of a number of loop iterations for which one or more micro-operations per loop iteration are to be supplied by the front end circuitry, where an actual number of loop iterations to be processed by the processing circuitry is resolvable by the processing circuitry based on at least one operand corresponding to a first loop iteration to be processed by the processing circuitry. The front end circuitry varies, based on a level of confidence in the prediction of the number of loop iterations, a supply rate with which the one or more micro-operations for at least a subset of the loop iterations are supplied to the processing circuitry.
    Type: Grant
    Filed: June 13, 2022
    Date of Patent: April 30, 2024
    Inventors: Guillaume Bolbenes, Thibaut Elie Lanois, Houdhaifa Bouzguarrou, Luca Nassi
  • Patent number: 11928524
    Abstract: A computer system includes one or more storage devices and a management computer. The management computer includes an information collection unit, an event detection unit, a plan generation unit, and a plan execution unit. The plan generation unit determines a target volume of a change process of a right of control in a plan, a processor of a change source of the right of control, and a processor of a change destination of the right of control, and estimates an influence of the change process of the right of control in the plan. The plan execution unit determines an execution time of the plan based on the estimation of the influence and the operation information of the storage devices. As a result, the ownership change process is executed with its influence taken into account, while the influence on usage of the computer system is suppressed.
    Type: Grant
    Filed: February 26, 2021
    Date of Patent: March 12, 2024
    Assignee: Hitachi, Ltd.
    Inventors: Tsukasa Shibayama, Kazuei Hironaka, Kenta Sato
  • Patent number: 11861759
    Abstract: Embodiments are generally directed to memory prefetching in a multiple-GPU environment. An embodiment of an apparatus includes multiple processors including a host processor and multiple graphics processing units (GPUs) to process data, each of the GPUs including a prefetcher and a cache; and a memory for storage of data, the memory including a plurality of memory elements, wherein the prefetcher of each of the GPUs is to prefetch data from the memory to the cache of the GPU; and wherein the prefetcher of a GPU is prohibited from prefetching from a page that is not owned by the GPU or by the host processor.
    Type: Grant
    Filed: January 20, 2022
    Date of Patent: January 2, 2024
    Assignee: INTEL CORPORATION
    Inventors: Joydeep Ray, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Nicolas Galoppo von Borries, Varghese George, Altug Koker, Elmoustapha Ould-Ahmed-Vall, Mike Macpherson, Subramaniam Maiyuran
  • Patent number: 11861193
    Abstract: A system and method for updating a configuration of a host system so that the memory sub-system of the host system emulates performance characteristics of a target memory sub-system. An example system includes a memory sub-system; and a processor, operatively coupled with the memory sub-system, to perform operations comprising receiving a request to emulate a characteristic of a target memory sub-system, identifying a candidate configuration that generates a load on a memory sub-system of a host system to decrease a characteristic of the memory sub-system of the host system, and updating a configuration of the host system based at least on the candidate configuration, wherein the updated configuration changes the memory sub-system of the host system to emulate the characteristic of the target memory sub-system.
    Type: Grant
    Filed: January 13, 2023
    Date of Patent: January 2, 2024
    Assignee: Micron Technology, Inc.
    Inventors: Jacob Mulamootil Jacob, John M. Groves, Steven Moyer
  • Patent number: 11847460
    Abstract: Apparatuses and methods for handling load requests are disclosed. In response to a load request specifying a data item to retrieve from memory, a series of data items comprising the data item identified by the load request are retrieved. Load requests are buffered prior to the load requests being carried out. Coalescing circuitry determines for the load request and a set of one or more other load requests buffered in the pending load buffer circuitry whether an address proximity condition is true. The address proximity condition is true when all data items identified by the set of one or more other load requests are comprised within the series of data items. When the address proximity condition is true, the set of one or more other load requests are suppressed. Coalescing prediction circuitry generates a coalescing prediction for each load request based on previous handling of load requests by the coalescing circuitry.
    Type: Grant
    Filed: March 24, 2021
    Date of Patent: December 19, 2023
    Assignee: Arm Limited
    Inventors: Mbou Eyole, Michiel Willem Van Tol
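The address proximity condition in the abstract above can be illustrated with a minimal sketch. It assumes that the "series of data items" is one naturally aligned 64-byte block and that every data item is 8 bytes wide; the function names and both sizes are hypothetical and are not taken from the patent.

```python
def series_bounds(addr, series_bytes=64):
    """Return [start, end) of the aligned series that contains addr (assumed layout)."""
    start = addr - addr % series_bytes
    return start, start + series_bytes


def address_proximity(load_addr, other_addrs, item_bytes=8, series_bytes=64):
    """True when every other buffered load falls entirely inside the series
    that would be fetched for load_addr, so those loads could be suppressed."""
    start, end = series_bounds(load_addr, series_bytes)
    return all(start <= a and a + item_bytes <= end for a in other_addrs)
```

For example, with a load at 0x100, other loads at 0x108 and 0x138 satisfy the condition, while a load at 0x140 starts the next series and does not. The coalescing prediction circuitry described in the abstract is not modeled here.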
  • Patent number: 11797307
    Abstract: In response to an instruction decoder decoding a range prefetch instruction specifying first and second address-range-specifying parameters and a stride parameter, prefetch circuitry controls, depending on the first and second address-range-specifying parameters and the stride parameter, prefetching of data from a plurality of specified ranges of addresses into the at least one cache. A start address and size of each specified range is dependent on the first and second address-range-specifying parameters. The stride parameter specifies an offset between start addresses of successive specified ranges. Use of the range prefetch instruction helps to improve programmability and improve the balance between prefetch coverage and circuit area of the prefetch circuitry.
    Type: Grant
    Filed: June 23, 2021
    Date of Patent: October 24, 2023
    Assignee: Arm Limited
    Inventors: Krishnendra Nathella, David Hennah Mansell, Alejandro Rico Carro, Andrew Mundy
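As a rough illustration of the range prefetch described above, the sketch below lists the cache lines that a sequence of strided ranges would cover. The parameter names, the explicit `count` of ranges, and the 64-byte line size are assumptions made for the example; the abstract does not say how the instruction encodes them.

```python
def range_prefetch_lines(base, size, stride, count, line_size=64):
    """List the cache-line addresses covered by `count` ranges of `size` bytes,
    where successive ranges start `stride` bytes apart (hypothetical encoding)."""
    if size <= 0 or count < 0:
        raise ValueError("size must be positive and count non-negative")
    lines, seen = [], set()
    for i in range(count):
        start = base + i * stride
        first = start // line_size * line_size              # line holding the first byte
        last = (start + size - 1) // line_size * line_size  # line holding the last byte
        for line in range(first, last + 1, line_size):
            if line not in seen:                            # overlapping ranges share lines
                seen.add(line)
                lines.append(line)
    return lines
```

Calling `range_prefetch_lines(0x1000, 128, 0x400, 2)` covers two 128-byte ranges that start 0x400 bytes apart, which is the kind of access pattern a two-dimensional tile in a row-major array produces.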
  • Patent number: 11726917
    Abstract: A method includes recording a first set of consecutive memory access deltas, where each of the consecutive memory access deltas represents a difference between two memory addresses accessed by an application, updating values in a prefetch training table based on the first set of memory access deltas, and predicting one or more memory addresses for prefetching responsive to a second set of consecutive memory access deltas and based on values in the prefetch training table.
    Type: Grant
    Filed: July 13, 2020
    Date of Patent: August 15, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Susumu Mashimo, John Kalamatianos
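The delta-based training described above can be sketched as a small software model. This is only an illustration under assumptions: the class name, the history length of two deltas, and the use of a frequency counter as the "prefetch training table" are hypothetical choices, not details from the patent.

```python
from collections import Counter, defaultdict


class DeltaPrefetcher:
    """Toy model: learn which delta tends to follow a short run of deltas."""

    def __init__(self, history_len=2):
        self.history_len = history_len
        self.last_addr = None
        self.deltas = []                     # most recent consecutive deltas
        self.table = defaultdict(Counter)    # delta history -> counts of next delta

    def access(self, addr):
        """Record one access; return a predicted prefetch address or None."""
        prediction = None
        if self.last_addr is not None:
            delta = addr - self.last_addr
            if len(self.deltas) == self.history_len:
                # Train: the previous history was followed by this delta.
                self.table[tuple(self.deltas)][delta] += 1
            self.deltas = (self.deltas + [delta])[-self.history_len:]
            key = tuple(self.deltas)
            if len(key) == self.history_len and key in self.table:
                next_delta, _ = self.table[key].most_common(1)[0]
                prediction = addr + next_delta
        self.last_addr = addr
        return prediction
```

With a steady 64-byte stride, the model needs four accesses before it has both a full history and a trained entry, after which it predicts the next address in the stream.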
  • Patent number: 11620133
    Abstract: Systems and methods for reusing load instructions by a processor without accessing a data cache include a load store execution unit (LSU) of the processor, the LSU being configured to determine if a prior execution of a first load instruction loaded data from a first cache line of the data cache and determine if a current execution of the second load instruction will load the data from the first cache line of the data cache. Further, the LSU also determines if a reuse of the data from the prior execution of the first load instruction for the current execution of the second load instruction will lead to functional errors. If there are no functional errors, the data from the prior execution of the first load instruction is reused for the current execution of the second load instruction, without accessing the data cache for the current execution of the second load instruction.
    Type: Grant
    Filed: March 28, 2019
    Date of Patent: April 4, 2023
    Assignee: Qualcomm Incorporated
    Inventor: Vignyan Reddy Kothinti Naresh
  • Patent number: 11593113
    Abstract: Unaligned atomic memory operations on a processor using a load-store instruction set architecture (ISA) that requires aligned accesses are performed by widening the memory access to an aligned address by the next larger power of two (e.g., 4-byte access is widened to 8 bytes, and 8-byte access is widened to 16 bytes). Data processing operations supported by the load-store ISA including shift, rotate, and bitfield manipulation are utilized to modify only the bytes in the original unaligned address so that the atomic memory operations are aligned to the widened access address. The aligned atomic memory operations using the widened accesses avoid the faulting exceptions associated with unaligned access for most 4-byte and 8-byte accesses. Exception handling is performed in cases in which memory access spans a 16-byte boundary.
    Type: Grant
    Filed: October 4, 2021
    Date of Patent: February 28, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Darek Mihocka, Arun Upadhyaya Kishan, Pedro Miguel Sequeira De Justo Teixeira
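The widening step can be sketched as follows. The abstract gives the 4-to-8 and 8-to-16 byte examples and says accesses spanning a 16-byte boundary take the exception path; the assumption here is that an access which does not fit the first widened window is retried at the next power of two up to 16 bytes. Function names and the little-endian byte order are hypothetical.

```python
def widen_access(addr, size, max_width=16):
    """Find the smallest aligned power-of-two window (wider than `size`) that
    contains [addr, addr + size). Returns (base, width), or None when the access
    spans a max_width boundary and would need the exception path."""
    width = size * 2
    while width <= max_width:
        base = addr & ~(width - 1)           # align down to the window size
        if addr + size <= base + width:
            return base, width
        width *= 2
    return None


def merge_bytes(wide_value, new_value, addr, size, base):
    """Insert `size` bytes of new_value into the wide word at the unaligned
    offset using shift and mask, leaving the neighbouring bytes unchanged
    (little-endian layout assumed)."""
    shift = (addr - base) * 8
    mask = ((1 << (size * 8)) - 1) << shift
    return (wide_value & ~mask) | ((new_value << shift) & mask)
```

For a 4-byte store at 0x1002, `widen_access` returns the 8-byte window at 0x1000, and `merge_bytes` rewrites only bytes 2 to 5 of that window. In real hardware the merged value would be written back with an aligned atomic operation such as a compare-and-swap loop, which this sketch does not model.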
  • Patent number: 11573739
    Abstract: An information processing apparatus includes: a first memory; a second memory different in processing speed from the first memory; and a processor including: a memory controller that is coupled to the first memory and the second memory and that controls an access to the first memory and an access to the second memory; and a plurality of controllers that access the first memory or the second memory. The processor is configured to suppress a writing frequency of data into the second memory by controlling one or more first controllers that access the second memory among the plurality of controllers in accordance with a result of monitoring a state of writing the data into the second memory.
    Type: Grant
    Filed: January 25, 2021
    Date of Patent: February 7, 2023
    Assignee: FUJITSU LIMITED
    Inventor: Satoshi Imamura
  • Patent number: 11449429
    Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements for the nested loops. A stream head register stores data elements next to be supplied to functional units for use as operands. A stream template register specifies a circular address mode for the loop, first and second block size numbers and a circular address block size selection. For a first circular address block size selection the block size corresponds to the first block size number. For a second circular address block size selection the block size corresponds to a sum of the first block size number and the second block size number.
    Type: Grant
    Filed: April 20, 2021
    Date of Patent: September 20, 2022
    Assignee: Texas Instruments Incorporated
    Inventor: Joseph Zbiciak
  • Patent number: 11403082
    Abstract: Systems and methods are configured to receive code containing an original loop that includes irregular memory accesses. The original loop can be split. A pre-execution loop that contains code to prefetch content of the memory can be generated. Execution of the pre-execution loop can access memory inclusively between a starting location and the starting location plus a prefetch distance. A modified loop that can perform at least one computation based on the content prefetched with execution of the pre-execution loop can be generated. Execution of the modified loop can follow the execution of the pre-execution loop. The original loop can be replaced with the pre-execution loop and the modified loop.
    Type: Grant
    Filed: April 30, 2021
    Date of Patent: August 2, 2022
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: Sanyam Mehta, Gary William Elsesser, Terry D. Greyzck
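The loop transformation above can be mimicked by hand for an indirect sum such as `total += data[index[i]]`. In this sketch the `prefetch` callback is a hypothetical stand-in for a hardware prefetch hint, and treating the prefetch distance as an inclusive strip of loop iterations is an assumption about how the abstract's "starting location plus a prefetch distance" maps onto the loop.

```python
def split_loop_sum(data, index, distance, prefetch=lambda i: None):
    """Sum data[index[i]] strip by strip: a pre-execution loop touches the
    irregular locations of a strip first, then the modified loop computes on them."""
    total = 0
    n = len(index)
    for start in range(0, n, distance + 1):
        last = min(start + distance, n - 1)   # inclusive end of this strip
        # Pre-execution loop: only issues prefetches for the strip.
        for i in range(start, last + 1):
            prefetch(index[i])
        # Modified loop: performs the original computation on the same strip.
        for i in range(start, last + 1):
            total += data[index[i]]
    return total
```

Passing `prefetch=touched.append` with a list `touched` shows that every irregular location is requested before the computation loop for its strip reaches it, while the result matches the unsplit loop.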
  • Patent number: 9811341
    Abstract: Disclosed is an apparatus and method to manage instruction cache prefetching from an instruction cache. A processor may comprise: a prefetch engine; a branch prediction engine to predict the outcome of a branch; and a dynamic optimizer. The dynamic optimizer may be used to control: identifying common instruction cache misses and inserting a prefetch instruction from the prefetch engine to the instruction cache.
    Type: Grant
    Filed: December 29, 2011
    Date of Patent: November 7, 2017
    Assignee: Intel Corporation
    Inventors: Kyriakos A. Stavrou, Enric Gibert Codina, Josep M. Codina, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Christos E. Kotselidis, Fernando Latorre, Pedro Lopez, Marc Lupon, Carlos Madriles Gimeno, Grigorios Magklis, Pedro Marcuello, Alejandro Martinez Vicente, Raul Martinez, Daniel Ortega, Demos Pavlou, Georgios Tournavitis, Polychronis Xekalakis
  • Patent number: 8918626
    Abstract: The disclosed embodiments relate to a system that executes program instructions on a processor. During a normal-execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system speculatively executes subsequent instructions in a lookahead mode to prefetch future loads. When an instruction retires during the lookahead mode, a working register which serves as a destination register for the instruction is not copied to a corresponding architectural register. Instead the architectural register is marked as invalid. Note that by not updating architectural registers during lookahead mode, the system eliminates the need to checkpoint the architectural registers prior to entering lookahead mode.
    Type: Grant
    Filed: November 10, 2011
    Date of Patent: December 23, 2014
    Assignee: Oracle International Corporation
    Inventors: Yuan C. Chou, Eric W. Mahurin
  • Patent number: 8195888
    Abstract: Technologies are generally described for allocating available prefetch bandwidth among processor cores in a multiprocessor computing system. The prefetch bandwidth associated with an off-chip memory interface of the multiprocessor may be determined, partitioned, and allocated across multiple processor cores.
    Type: Grant
    Filed: March 20, 2009
    Date of Patent: June 5, 2012
    Assignee: Empire Technology Development LLC
    Inventor: Yan Solihin
  • Patent number: 8156286
    Abstract: A microprocessor includes a cache memory, a prefetch unit, and detection logic. The prefetch unit may be configured to monitor memory accesses that miss in the cache and to determine whether to prefetch one or more blocks of memory from a system memory based upon previous memory accesses. The prefetch unit may be further configured to use addresses of the memory accesses that miss to calculate each next memory block to prefetch. The detection logic may be configured to provide a notification to the prefetch unit in response to detecting a memory access instruction including a particular hint. In response to receiving the notification, the prefetch unit may be configured to inhibit using an address associated with the memory access instruction including the particular hint, when calculating subsequent memory blocks to prefetch.
    Type: Grant
    Filed: December 30, 2008
    Date of Patent: April 10, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Thomas M. Deneau
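A minimal sketch of the hint mechanism described above, assuming a simple block-stride prefetcher as the training scheme (the abstract does not specify one). The class name, the `no_train_hint` flag, and the 64-byte block size are hypothetical.

```python
class HintAwarePrefetcher:
    """Toy stride prefetcher that skips training on misses carrying a hint."""

    def __init__(self, block=64):
        self.block = block
        self.last_miss_block = None

    def on_miss(self, addr, no_train_hint=False):
        """Handle a cache miss; return the next block address to prefetch or None."""
        if no_train_hint:
            # The hinted access is excluded from the address history, so it
            # cannot disturb the stride learned from the other misses.
            return None
        blk = addr // self.block
        prediction = None
        if self.last_miss_block is not None:
            stride = blk - self.last_miss_block
            if stride != 0:
                prediction = (blk + stride) * self.block
        self.last_miss_block = blk
        return prediction
```

A stream of misses at 0, 64, and 128 keeps predicting the next block even if an unrelated hinted miss at 0x9000 occurs in the middle of the stream.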
  • Patent number: 8032713
    Abstract: A design structure embodied in a machine readable storage medium for designing, manufacturing, and/or testing a design is provided. The design structure generally includes a computer system that includes a CPU, a storage device, circuitry for providing a speculative access threshold corresponding to a selected percentage of the total number of accesses to the storage device that can be speculatively issued, and circuitry for intermixing demand accesses and speculative accesses in accordance with the speculative access threshold.
    Type: Grant
    Filed: May 5, 2008
    Date of Patent: October 4, 2011
    Assignee: International Business Machines Corporation
    Inventors: James J. Allen, Jr., Steven K. Jenkins, James A. Mossman, Michael R. Trombley
  • Patent number: 7949830
    Abstract: A system and method for handling speculative read requests for a memory controller in a computer system are provided. In one example, a method includes the steps of providing a speculative read threshold corresponding to a selected percentage of the total number of reads that can be speculatively issued, and intermixing demand reads and speculative reads in accordance with the speculative read threshold. In another example, a computer system includes a CPU, a memory controller, memory, a bus connecting the CPU, memory controller and memory, circuitry for providing a speculative read threshold corresponding to a selected percentage of the total number of reads that can be speculatively issued, and circuitry for intermixing demand reads and speculative reads in accordance with the speculative read threshold.
    Type: Grant
    Filed: December 10, 2007
    Date of Patent: May 24, 2011
    Assignee: International Business Machines Corporation
    Inventors: James Johnson Allen, Jr., Steven Kenneth Jenkins, James A. Mossman, Michael Raymond Trombley
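One way to picture intermixing under a speculative read threshold is the scheduler sketch below. It assumes the threshold caps the share of speculative reads among all reads issued so far, and that leftover speculative reads are dropped once no demand reads remain; both policies, and all names, are assumptions for illustration rather than the patented method.

```python
from collections import deque


def intermix_reads(demand, speculative, threshold_pct):
    """Issue reads in order, allowing a speculative read only when doing so keeps
    speculative reads at or below threshold_pct percent of all issued reads."""
    demand, speculative = deque(demand), deque(speculative)
    issued, spec_count = [], 0
    while demand or speculative:
        total_after = len(issued) + 1
        spec_ok = bool(speculative) and (spec_count + 1) * 100 <= threshold_pct * total_after
        if spec_ok:
            issued.append(("spec", speculative.popleft()))
            spec_count += 1
        elif demand:
            issued.append(("demand", demand.popleft()))
        else:
            break  # remaining speculative reads would exceed the threshold
    return issued
```

With four demand reads, three speculative candidates, and a 25 percent threshold, the sketch issues three demand reads, one speculative read, and then the last demand read.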
  • Patent number: 7937533
    Abstract: A design structure embodied in a machine readable storage medium for designing, manufacturing, and/or testing a design is provided. The design structure generally includes a computer system that includes a CPU, a memory controller, memory, a bus connecting the CPU, memory controller and memory, circuitry for providing a speculative read threshold corresponding to a selected percentage of the total number of reads that can be speculatively issued, and circuitry for intermixing demand reads and speculative reads in accordance with the speculative read threshold.
    Type: Grant
    Filed: May 4, 2008
    Date of Patent: May 3, 2011
    Assignee: International Business Machines Corporation
    Inventors: James J. Allen, Jr., Steven K. Jenkins, James A. Mossman, Michael R. Trombley