Prefetching Patents (Class 712/207)
  • Publication number: 20140258681
    Abstract: Embodiments relate to prefetching data on a chip having a scout core and a parent core coupled to the scout core. The method includes determining that a program executed by the parent core requires content stored in a location remote from the parent core. The method includes sending a fetch table address determined by the parent core to the scout core. The method includes accessing, by the scout core, a fetch table that is indicated by the fetch table address. The fetch table indicates how many pieces of content are to be fetched by the scout core and a location of the pieces of content. The method includes fetching the pieces of content by the scout core based on the fetch table. The method includes returning the fetched pieces of content to the parent core.
    Type: Application
    Filed: March 5, 2013
    Publication date: September 11, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
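    Illustrative sketch (not taken from the application): the fetch-table flow in the abstract above could be modeled roughly as follows. The names FetchTable and scout_fetch, and the dictionaries standing in for the table store and the remote location, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FetchTable:
    """Hypothetical fetch table: how many pieces to fetch and where they live."""
    count: int
    locations: List[int]


def scout_fetch(fetch_tables: Dict[int, FetchTable],
                table_address: int,
                remote_memory: Dict[int, bytes]) -> List[bytes]:
    """Model of the scout core: look up the table, fetch each piece, return them."""
    table = fetch_tables[table_address]
    # Fetch only as many pieces as the table indicates.
    return [remote_memory[loc] for loc in table.locations[:table.count]]


# The parent core sends a fetch table address; the scout core returns the content.
tables = {0x100: FetchTable(count=2, locations=[0xA0, 0xB0, 0xC0])}
memory = {0xA0: b"alpha", 0xB0: b"beta", 0xC0: b"gamma"}
fetched = scout_fetch(tables, 0x100, memory)
```

    Here the table lists three locations but asks for two pieces, so only the first two are fetched and handed back to the parent.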
  • Patent number: 8832415
    Abstract: A multiprocessor system includes nodes. Each node includes a data path that includes a core, a TLB, and a first level cache implementing disambiguation. The system also includes at least one second level cache and a main memory. For thread memory access requests, the core uses an address associated with an instruction format of the core. The first level cache uses an address format related to the size of the main memory plus an offset corresponding to hardware thread meta data. The second level cache uses a physical main memory address plus software thread meta data to store the memory access request. The second level cache accesses the main memory using the physical address with neither the offset nor the thread meta data after resolving speculation. In short, this system includes mapping of a virtual address to different physical addresses for value disambiguation for different threads.
    Type: Grant
    Filed: January 4, 2011
    Date of Patent: September 9, 2014
    Assignee: International Business Machines Corporation
    Inventors: Alan Gala, Martin Ohmacht
  • Patent number: 8832384
    Abstract: A storage proxy receives different abstracted memory access requests that are abstracted from the original memory access requests from different sources. The storage proxy reconstructs the characteristics of the original memory access requests from the abstracted memory access requests and makes prefetch decisions based on the reconstructed characteristics. An inflight table is configured to identify contiguous address ranges formed by an accumulation of sub-address ranges used by different abstracted memory access requests. An operation table is configured to identify the number of times the contiguous address ranges are formed by the memory access operations. A processor is then configured to prefetch the contiguous address ranges for certain corresponding read requests.
    Type: Grant
    Filed: July 29, 2010
    Date of Patent: September 9, 2014
    Assignee: Violin Memory, Inc.
    Inventor: Erik de la Iglesia
  • Patent number: 8806177
    Abstract: A method and system for prefetching in a computer system are provided. The method in one aspect includes using a prefetch engine to perform prefetch instructions and to translate unmapped data. Misses to address translations during the prefetch are handled and resolved. The method also includes storing the resolved translations in a respective cache translation table. A system for prefetching in one aspect includes a prefetch engine operable to receive instructions to prefetch data from the main memory. The prefetch engine is also operable to search cache address translation for prefetch data and perform address mapping translation, if the prefetch data is unmapped. The prefetch engine is further operable to prefetch the data and store the address mapping in one or more cache memories, if the data is unmapped.
    Type: Grant
    Filed: July 7, 2006
    Date of Patent: August 12, 2014
    Assignee: International Business Machines Corporation
    Inventors: Orran Y. Krieger, Balaram Sinharoy, Robert B. Tremaine, Robert W. Wisniewski
  • Publication number: 20140223141
    Abstract: In some implementations, a processor may include a data structure, such as a translation lookaside buffer, that includes an entry containing first mapping information having a virtual address and a first context associated with a first thread. Control logic may receive a request for second mapping information having the virtual address and a second context associated with a second thread. The control logic may determine whether the second mapping information associated with the second context is equivalent to the first mapping information in the entry of the data structure. If the second mapping information is equivalent to the first mapping information, the control logic may associate the second thread with the first mapping information contained in the entry of the data structure to share the entry between the first thread and the second thread.
    Type: Application
    Filed: December 29, 2011
    Publication date: August 7, 2014
    Inventors: Jonathan D. Combs, Jason W. Brandt, Benjamin C. Chaffin, Julio Gago, Andrew F. Glew
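    Illustrative sketch (not taken from the application): one way to picture sharing a translation entry between threads. The SharedTLB class is hypothetical, and "equivalent mapping information" is modeled here, as an assumption, as the same virtual address translating to the same physical address.

```python
from typing import Dict, List, Set


class SharedTLB:
    """Toy translation buffer whose entries can be shared by several threads."""

    def __init__(self) -> None:
        # Each entry: virtual address, physical address, and the sharing threads.
        self.entries: List[Dict] = []

    def insert(self, thread: int, vaddr: int, paddr: int) -> Dict:
        for entry in self.entries:
            # Assumed equivalence test: identical virtual-to-physical mapping.
            if entry["vaddr"] == vaddr and entry["paddr"] == paddr:
                entry["threads"].add(thread)   # share the existing entry
                return entry
        entry = {"vaddr": vaddr, "paddr": paddr, "threads": {thread}}
        self.entries.append(entry)
        return entry


tlb = SharedTLB()
tlb.insert(thread=0, vaddr=0x1000, paddr=0x8000)
shared = tlb.insert(thread=1, vaddr=0x1000, paddr=0x8000)   # equivalent: shared
tlb.insert(thread=2, vaddr=0x1000, paddr=0x9000)            # different: new entry
```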
  • Patent number: 8799603
    Abstract: Memory is used, including by receiving at a processor an indication that a first piece of metadata associated with a set of backup data is required during a block based backup and/or restore. The processor is used to retrieve from a metadata store a set of metadata that includes the first piece of metadata and one or more additional pieces of metadata included in the metadata store in an adjacent location that is adjacent to a first location in which the first piece of metadata is stored in the metadata store, without first determining whether the one or more additional pieces of metadata are currently required. The retrieved set of metadata is stored in a cache.
    Type: Grant
    Filed: September 12, 2013
    Date of Patent: August 5, 2014
    Assignee: EMC Corporation
    Inventor: Ajay Pratap Singh Kushwah
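    Illustrative sketch (not taken from the patent): retrieving a requested piece of metadata together with its neighbors, without checking whether the neighbors are needed yet. The function name read_metadata, the list-based metadata store, and the window parameter are hypothetical.

```python
from typing import Dict, List


def read_metadata(store: List[str], index: int,
                  cache: Dict[int, str], window: int = 2) -> str:
    """Return store[index], caching adjacent entries on a cache miss."""
    if index in cache:
        return cache[index]
    low = max(0, index - window)
    high = min(len(store), index + window + 1)
    # Pull in the adjacent entries unconditionally, then serve from the cache.
    for i in range(low, high):
        cache[i] = store[i]
    return cache[index]


store = ["m0", "m1", "m2", "m3", "m4", "m5"]
cache: Dict[int, str] = {}
value = read_metadata(store, 2, cache, window=1)
```

    After this call the cache also holds the entries at positions 1 and 3, so a later request for either is served without going back to the store.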
  • Publication number: 20140208075
    Abstract: Apparatuses, systems, and a method for providing a processor architecture with a control speculative load are described. In one embodiment, a computer-implemented method includes determining whether a speculative load instruction encounters a long latency condition, spontaneously deferring the speculative load instruction if the speculative load instruction encounters the long latency condition, and initiating a prefetch of a translation or of data that requires long latency access when the speculative load instruction encounters the long latency condition. The method further includes reaching a check instruction, which resteers to recovery code that executes a non-speculative version of the load.
    Type: Application
    Filed: December 20, 2011
    Publication date: July 24, 2014
    Inventor: James Earl McCormick, Jr.
  • Patent number: 8788795
    Abstract: A wake-and-go mechanism may be a programming idiom accelerator. As a processor fetches instructions, the programming idiom accelerator may look ahead to determine whether a programming idiom is coming up in the instruction stream. If the programming idiom accelerator recognizes a programming idiom, the programming idiom accelerator may perform an action to accelerate execution of the programming idiom. In the case of a wake-and-go programming idiom, the programming idiom accelerator may record an entry in a wake-and-go array, for example.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: July 22, 2014
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Patent number: 8788759
    Abstract: A prefetch unit includes a program prefetch address generator that receives memory read requests and in response to addresses associated with the memory read request generates prefetch addresses and stores the prefetch addresses in slots of the prefetch unit buffer. Each slot includes a buffer for storing a prefetch address, two data buffers for storing data that is prefetched using the prefetch address of the slot, and a data buffer selector for alternating the functionality of the two data buffers. A first buffer is used to hold data that is returned in response to a received memory request, and a second buffer is used to hold data from a subsequent prefetch operation having a subsequent prefetch address, such that the data in the first buffer is not overwritten even when the data in the first buffer is still in the process of being read out.
    Type: Grant
    Filed: August 31, 2011
    Date of Patent: July 22, 2014
    Assignee: Texas Instruments Incorporated
    Inventors: Matthew D Pierson, Joseph R M Zbiciak
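    Illustrative sketch (not taken from the patent): a slot with two alternating data buffers, so that data still being read out is not overwritten by the next prefetch. The PrefetchSlot class and its method names are hypothetical.

```python
from typing import List, Optional


class PrefetchSlot:
    """Toy slot: one prefetch address, two data buffers, one selector."""

    def __init__(self) -> None:
        self.prefetch_address: Optional[int] = None
        self.buffers: List[Optional[str]] = [None, None]
        self.selector = 0   # index of the buffer currently being read out

    def fill_next(self, address: int, data: str) -> None:
        """Store newly prefetched data in the buffer that is NOT being read."""
        self.prefetch_address = address
        self.buffers[1 - self.selector] = data

    def current(self) -> Optional[str]:
        return self.buffers[self.selector]

    def swap(self) -> None:
        """Alternate the roles of the two buffers."""
        self.selector = 1 - self.selector


slot = PrefetchSlot()
slot.fill_next(0x40, "line A")
slot.swap()                       # "line A" is now the buffer being read
slot.fill_next(0x80, "line B")    # lands in the other buffer
still_reading = slot.current()
slot.swap()
next_line = slot.current()
```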
  • Patent number: 8773455
    Abstract: A display controller may include an RGB interface module and a display port module, which may both use a target-master interface, in which the data receiving module pops pixels from the data sourcing module, and generates the HSync, VSync, and VBI timing signals. A dither module may be instantiated between the RGB interface module and display port module to perform dithering. The dither module may use a source-master interface, in which data signals and data valid signals are issued by the data sourcing module. In order to avoid having to use a large storage capacity FIFO with the dither module, a control unit may issue interface signals to the RGB interface module and display port module, and clock-gate the dither module, to allow the data signals and data valid signals to properly interface with the RGB interface module and display port module, and provide data flow from the RGB interface module to the dither module to the display port module.
    Type: Grant
    Filed: August 11, 2011
    Date of Patent: July 8, 2014
    Assignee: Apple Inc.
    Inventors: Brijesh Tripathi, Nitin Bhargava
  • Publication number: 20140181475
    Abstract: A unified architecture for dynamic generation, execution, synchronization and parallelization of complex instruction formats includes a virtual register file, register cache and register file hierarchy. A self-generating and synchronizing dynamic and static threading architecture provides efficient context switching.
    Type: Application
    Filed: February 28, 2014
    Publication date: June 26, 2014
    Applicant: Soft Machines, Inc.
    Inventor: Mohammad A. Abdallah
  • Publication number: 20140173254
    Abstract: In a DFA scanning engine used to match regular expressions or similar rules, instructions to execute DFA state transitions are accessed through an instruction cache. Each DFA instruction may indicate varying numbers of transitions or branches from a current state. The cache pre-fetches a requested number of additional instructions consecutively following an accessed instruction. The DFA engine accesses an instruction from the cache corresponding to a state within a small number of transitions from the root state. When a low-branching instruction is executed to access a next instruction from the root state, or when a low-branching instruction is executed to access a next instruction from the cache, a fixed or configurable pre-fetch length is requested. Some instructions such as low-branching instructions may contain a pre-fetch hint.
    Type: Application
    Filed: December 18, 2012
    Publication date: June 19, 2014
    Applicant: LSI CORPORATION
    Inventor: Michael Ruehle
  • Publication number: 20140143522
    Abstract: An apparatus for processing data includes signature generation circuitry 30, 32 for generating a signature value indicative of the current state of the apparatus in dependence upon a sequence of immediately preceding return addresses generated during execution of a stream of program instructions to reach that state of the apparatus. Prefetch circuitry 10 performs one or more prefetch operations in dependence upon the signature value that is generated. The signature value may be generated by a hashing operation (such as an XOR) performed upon return addresses stored within a return address stack 28.
    Type: Application
    Filed: November 20, 2012
    Publication date: May 22, 2014
    Applicants: THE REGENTS OF THE UNIVERSITY OF MICHIGAN, ARM LIMITED
    Inventors: Ali SAIDI, Thomas Friedrich WENISCH, Aasheesh KOLLI
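    Illustrative sketch (not taken from the application): an XOR hash over the most recent return addresses, used as a key into a table of prefetch candidates. The depth and bits parameters and the prefetch_table dictionary are hypothetical; the abstract only states that a hash such as an XOR of return-stack addresses may be used.

```python
from functools import reduce
from typing import Dict, List


def signature(return_stack: List[int], depth: int = 4, bits: int = 16) -> int:
    """XOR-fold the most recent `depth` return addresses into a `bits`-wide value."""
    recent = return_stack[-depth:]
    folded = reduce(lambda acc, addr: acc ^ addr, recent, 0)
    return folded & ((1 << bits) - 1)


# Hypothetical table mapping a signature to addresses worth prefetching.
prefetch_table: Dict[int, List[int]] = {}

stack = [0x4000, 0x4A10, 0x5B24]
sig = signature(stack)
prefetch_table[sig] = [0x9000, 0x9040]          # learned on an earlier visit
to_prefetch = prefetch_table.get(signature(stack), [])
```

    Reaching the same call context again reproduces the same signature, which is what lets the table entry recorded earlier drive the prefetch.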
  • Patent number: 8732438
    Abstract: Embodiments of the present invention execute an anti-prefetch instruction. These embodiments start by decoding instructions in a decode unit in a processor to prepare the instructions for execution. Upon decoding an anti-prefetch instruction, these embodiments stall the decode unit to prevent decoding subsequent instructions. These embodiments then execute the anti-prefetch instruction, wherein executing the anti-prefetch instruction involves: (1) sending a prefetch request for a cache line in an L1 cache; (2) determining if the prefetch request hits in the L1 cache; (3) if the prefetch request hits in the L1 cache, determining if the cache line contains a predetermined value; and (4) conditionally performing subsequent operations based on whether the prefetch request hits in the L1 cache or the value of the data in the cache line.
    Type: Grant
    Filed: April 16, 2008
    Date of Patent: May 20, 2014
    Assignee: Oracle America, Inc.
    Inventors: Paul Caprioli, Sherman H. Yip, Gideon N. Levinsky
  • Patent number: 8707014
    Abstract: According to an aspect of an embodiment of the invention, an arithmetic processing unit includes a first cache memory unit that holds a part of data stored in a storage device; an address register that holds an address; a flag register that stores flag information; and a decoder that decodes a prefetch instruction for acquiring data stored at the address in the storage device. The arithmetic processing unit further includes an instruction execution unit that executes a cache hit check instruction instead of the prefetch instruction on the basis of a decoded result when the flag information is held, the cache hit check instruction allowing for searching the first cache memory unit with the address to thereby make a first cache hit determination that the first cache memory unit holds the data stored at the address in the storage device.
    Type: Grant
    Filed: December 22, 2010
    Date of Patent: April 22, 2014
    Assignee: Fujitsu Limited
    Inventors: Iwao Yamazaki, Hiroyuki Imai
  • Publication number: 20140101413
    Abstract: A prefetch optimizer tool for an information handling system (IHS) may improve effective memory access time by controlling both hardware prefetch operations and software prefetch operations. The prefetch optimizer tool selectively disables prefetch instructions in an instruction sequence of interest within an application. The tool measures execution times of the instruction sequence of interest when different prefetch instructions are disabled. The tool may hold hardware prefetch depth constant while cycling through disabling different prefetch instructions and taking corresponding execution time measurements. Alternatively, for each disabled prefetch instruction in the instruction sequence of interest, the tool may cycle through different hardware prefetch depths and take corresponding execution time measurements at each hardware prefetch depth.
    Type: Application
    Filed: December 11, 2013
    Publication date: April 10, 2014
    Applicant: International Business Machines Corporation
    Inventor: Randall Ray Heisch
  • Patent number: 8688960
    Abstract: A method, system and computer-usable medium are disclosed for managing prefetch streams in a virtual machine environment. Compiled application code in a first core, which comprises a Special Purpose Register (SPR) and a plurality of first prefetch engines, initiates a prefetch stream request. If the prefetch stream request cannot be initiated due to unavailability of a first prefetch engine, then an indicator bit indicating a Prefetch Stream Dispatch Fault is set in the SPR, causing a Hypervisor to interrupt the execution of the prefetch stream request. The Hypervisor then calls its associated operating system (OS), which determines prefetch engine availability for a second core comprising a plurality of second prefetch engines. If a second prefetch engine is available, then the OS migrates the prefetch stream request from the first core to the second core, where it is initiated on an available second prefetch engine.
    Type: Grant
    Filed: October 15, 2010
    Date of Patent: April 1, 2014
    Assignee: International Business Machines Corporation
    Inventors: Matthew Accapadi, Robert H. Bell, Jr., Hong Lam Hua, Ram Raghavan, Mysore Sathyanarayana Srinivas
  • Patent number: 8688961
    Abstract: A method, system and computer-usable medium are disclosed for managing prefetch streams in a virtual machine environment. Compiled application code in a first core, which comprises a Special Purpose Register (SPR) and a plurality of first prefetch engines, initiates a prefetch stream request. If the prefetch stream request cannot be initiated due to unavailability of a first prefetch engine, then an indicator bit indicating a Prefetch Stream Dispatch Fault is set in the SPR, causing a Hypervisor to interrupt the execution of the prefetch stream request. The Hypervisor then calls its associated operating system (OS), which determines prefetch engine availability for a second core comprising a plurality of second prefetch engines. If a second prefetch engine is available, then the OS migrates the prefetch stream request from the first core to the second core, where it is initiated on an available second prefetch engine.
    Type: Grant
    Filed: March 22, 2012
    Date of Patent: April 1, 2014
    Assignee: International Business Machines Corporation
    Inventors: Matthew Accapadi, Robert H. Bell, Jr., Hong Lam Hua, Ram Raghavan, Mysore Sathyanarayana Srinivas
  • Patent number: 8683138
    Abstract: A prefetch data machine instruction having an M field performs a function on a cache line of data specifying an address of an operand. The operation comprises either prefetching a cache line of data from memory to a cache or reducing the access ownership of store and fetch or fetch only of the cache line in the cache or a combination thereof. The address of the operand is either based on a register value or the program counter value pointing to the prefetch data machine instruction.
    Type: Grant
    Filed: January 6, 2012
    Date of Patent: March 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Dan F Greiner, Timothy J Slegel
  • Patent number: 8667257
    Abstract: Techniques are disclosed relating to improving the performance of branch prediction in processors. In one embodiment, a processor is disclosed that includes a branch prediction unit configured to predict a sequence of instructions to be issued by the processor for execution. The processor also includes a pattern detection unit configured to detect a pattern in the predicted sequence of instructions, where the pattern includes a plurality of predicted instructions. In response to the pattern detection unit detecting the pattern, the processor is configured to switch from issuing instructions predicted by the branch prediction unit to issuing the plurality of instructions. In some embodiments, the processor includes a replay unit that is configured to replay fetch addresses to an instruction fetch unit to cause the plurality of predicted instructions to be issued.
    Type: Grant
    Filed: November 10, 2010
    Date of Patent: March 4, 2014
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ravindra N. Bhargava, David Suggs, Anthony X. Jarvis
  • Patent number: 8661229
    Abstract: A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associated with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching continues when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss.
    Type: Grant
    Filed: May 4, 2009
    Date of Patent: February 25, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Thomas Andrew Sartorius, Victor Roberts Augsburg, James Norris Dieffenderfer, Jeffrey Todd Bridges, Michael Scott McIlvaine, Rodney Wayne Smith
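    Illustrative sketch (not taken from the patent): gating instruction prefetch on prediction strength. A 2-bit saturating counter is assumed here as the source of the weighted prediction; the abstract does not say which predictor is used, and the function names are hypothetical.

```python
from typing import Tuple


def predict(counter: int) -> Tuple[bool, bool]:
    """Return (taken, strong) for a 2-bit saturating counter in the range 0..3."""
    taken = counter >= 2
    strong = counter in (0, 3)      # 1 and 2 are the weakly weighted states
    return taken, strong


def should_prefetch(counter: int, cache_hit: bool,
                    halt_only_on_miss: bool = False) -> bool:
    """Decide whether to keep prefetching past a predicted branch."""
    _, strong = predict(counter)
    if strong:
        return True
    if halt_only_on_miss:
        # Alternative from the abstract: keep fetching out of the cache,
        # and halt only when a weak prediction coincides with a cache miss.
        return cache_hit
    return False
```

    With the default policy a weak prediction always halts prefetching; with halt_only_on_miss=True it halts only when the fetch would miss in the cache.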
  • Patent number: 8656142
    Abstract: An illustrative embodiment provides a computer-implemented process for managing multiple speculative assist threads for data pre-fetching that sends a command from an assist thread of a first processor to a second processor and a memory, wherein parameters of the command specify a processor identifier of the second processor; responsive to receiving the command, replying by the second processor indicating an ability to receive a cache line that is a target of a pre-fetch; responsive to receiving the command, replying by the memory indicating a capability to provide the cache line; responsive to receiving replies from the second processor and the memory, sending, by the first processor, a combined response to the second processor and the memory, wherein the combined response indicates an action; and responsive to the action indicating a transaction can continue, sending the requested cache line, by the memory, to the second processor into a target cache level on the second processor.
    Type: Grant
    Filed: October 13, 2010
    Date of Patent: February 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, Yaoqing Gao
  • Patent number: 8650364
    Abstract: A processing device includes a memory and a processor that generates a plurality of read commands for reading read data from the memory and a plurality of write commands for writing write data to the memory. A prefetch memory interface prefetches prefetch data to a prefetch buffer, retrieves the read data from the prefetch buffer when the read data is included in the prefetch buffer, and retrieves the read data from the memory when the read data is not included in the prefetch buffer, wherein the prefetch buffer is managed via a linked list.
    Type: Grant
    Filed: May 28, 2008
    Date of Patent: February 11, 2014
    Assignee: ViXS Systems, Inc.
    Inventor: Jing Zhang
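    Illustrative sketch (not taken from the patent): a prefetch buffer kept as a linked list, with reads served from the buffer when possible and from memory otherwise. The class names, the drop-the-tail replacement rule, and the next-sequential-address prefetch policy are all hypothetical.

```python
from typing import Dict, Optional, Tuple


class Node:
    """One prefetched entry in the singly linked list."""

    def __init__(self, addr: int, data: int, nxt: Optional["Node"] = None):
        self.addr = addr
        self.data = data
        self.next = nxt


class PrefetchBuffer:
    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.head: Optional[Node] = None
        self.size = 0

    def insert(self, addr: int, data: int) -> None:
        # New entries go at the head; the tail is dropped when over capacity.
        self.head = Node(addr, data, self.head)
        self.size += 1
        if self.size > self.capacity:
            node = self.head
            while node.next.next is not None:
                node = node.next
            node.next = None
            self.size -= 1

    def lookup(self, addr: int) -> Optional[int]:
        node = self.head
        while node is not None:
            if node.addr == addr:
                return node.data
            node = node.next
        return None


def read(addr: int, buf: PrefetchBuffer, memory: Dict[int, int]) -> Tuple[int, str]:
    """Serve a read from the prefetch buffer if present, else from memory."""
    data = buf.lookup(addr)
    if data is not None:
        return data, "buffer"
    data = memory[addr]
    # Assumed policy: after a miss, prefetch the next sequential address.
    if addr + 1 in memory:
        buf.insert(addr + 1, memory[addr + 1])
    return data, "memory"
```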
  • Publication number: 20140032881
    Abstract: Method, apparatus, and program means for performing a dot-product operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources store to a storage location a result value equal to a dot-product of at least two operands.
    Type: Application
    Filed: September 30, 2013
    Publication date: January 30, 2014
    Inventors: Ronen Zohar, Mark Seconi, Rajesh Parthasarathy, Srinivas Chennupaty, Mark Buxton, Chuck Desylva
  • Patent number: 8640133
    Abstract: Fetch operations are assigned to different threads in a multithreaded environment. There are provided a number of different sorting algorithms, from which one is periodically selected on the basis of whether the present algorithm is giving satisfactory results or not. The period is preferably a sub-context interval. The different sorting algorithms preferably include a software/OS priority. A second sorting algorithm may include sorting according to hardware performance measurements. A two-level priority scheme is used to combine both priorities. The judgement of satisfactory performance is preferably based on the difference between a desired number of fetch operations attributed per sub-context switch interval to each thread and a real number of fetch operations attributed per sub-context switch interval to each thread.
    Type: Grant
    Filed: December 18, 2009
    Date of Patent: January 28, 2014
    Assignee: International Business Machines Corporation
    Inventors: Hisham El-Shishiny, Ali El-Moursy
  • Publication number: 20140025931
    Abstract: There is provided a method for controlling fetch-ahead of Fetch Sets into a decoupling First In First Out (FIFO) buffer of a Variable Length Execution Set (VLES) processor architecture, wherein a Fetch Set comprises at least a portion of a VLES group available for dispatch to processing resources within the VLES processor architecture, comprising, for each cycle, determining a number of VLES groups available for dispatch from previously pre-fetched Fetch Sets, and only requesting a fetch-ahead of a next Fetch Set in the next cycle if one of a select set of criteria related to the number of VLES groups available for dispatch is true.
    Type: Application
    Filed: March 30, 2011
    Publication date: January 23, 2014
    Applicant: Freescale Semiconductor, Inc.
    Inventors: Lev Vaskevich, Mark Elnekave, Yuval Peled, Idan Rozenberg
  • Publication number: 20140025932
    Abstract: A processor includes: a first GHR that indicates, in time series, results which have predicted validity or invalidity of branches when instructions have been fetched; a second GHR that indicates, in time series, results which have decided validity or invalidity of branches when computation has been completed; a branch prediction unit that, when the instructions are fetched, executes branch prediction by using a branch validity accuracy which is decided based on not only a branch history (BRHIS) but also the instruction fetch address and the first GHR, and which indicates whether the instruction is a branch direction as expected; an update unit that updates the first GHR with the value of the second GHR when it is decided that the branch prediction has failed based on the result of the branch computation; wherein an execution unit re-executes the instruction fetch.
    Type: Application
    Filed: September 23, 2013
    Publication date: January 23, 2014
    Applicant: FUJITSU LIMITED
    Inventor: Takashi SUZUKI
  • Publication number: 20140019722
    Abstract: Provided are a processor and an instruction processing method of the processor, with which it is possible to increase an instruction execution rate. A processor 1 includes a BTAC 12 that stores branch target information of a branch instruction and boundary information indicating that the branch instruction is on a fetch line boundary, a branch prediction unit 13 that performs branch prediction of a variable-length instruction set including the branch instruction by referring to the BTAC 12, and a fetch unit 14 that fetches an instruction based on the branch prediction result. The branch prediction unit 13 refers to the BTAC 12, and when the boundary information is present in the instruction which the branch prediction unit 13 makes the fetch unit 14 fetch, the branch prediction unit 13 makes the fetch unit 14 fetch the following next fetch line as well and then makes the fetch unit 14 fetch a branch prediction target instruction according to the branch target information.
    Type: Application
    Filed: February 24, 2012
    Publication date: January 16, 2014
    Inventors: Tsuyoshi Nagao, Junichi Sato
  • Publication number: 20140019721
    Abstract: Disclosed is an apparatus and method to manage instruction cache prefetching from an instruction cache. A processor may comprise: a prefetch engine; a branch prediction engine to predict the outcome of a branch; and a dynamic optimizer. The dynamic optimizer may be used to control: identifying common instruction cache misses and inserting a prefetch instruction from the prefetch engine to the instruction cache.
    Type: Application
    Filed: December 29, 2011
    Publication date: January 16, 2014
    Inventors: Kyriakos A. Stavrou, Enric Gibert Codina, Josep M. Codina, Crispin Gomez Requena, Antonio Gonzalez, Mirem Hyuseinova, Christos E. Kotselidis, Fernando Latorre, Pedro Lopez, Marc Lupon, Carlos Madriles Gimeno, Grigorios Magklis, Pedro Marcuello, Alejandro Martinez Vicente, Raul Martinez, Daniel Ortega, Demos Pavlou, Georgios Tournavitis, Polychronis Xekalakis
  • Patent number: 8624906
    Abstract: A method and system for graphics instruction fetching. The method includes executing a plurality of threads in a multithreaded execution environment. A respective plurality of instructions are fetched to support the execution of the threads. During runtime, at least one instruction is prefetched for one of the threads to a prefetch buffer. The at least one instruction is accessed from the prefetch buffer if required by the one thread and discarded if not required by the one thread.
    Type: Grant
    Filed: September 29, 2004
    Date of Patent: January 7, 2014
    Assignee: Nvidia Corporation
    Inventor: Andrew D. Bowen
  • Publication number: 20130339665
    Abstract: Embodiments relate to collision-based alternate hashing. An aspect includes receiving an incoming instruction address. Another aspect includes determining whether an entry for the incoming instruction address exists in a history table based on a hash of the incoming instruction address. Another aspect includes based on determining that the entry for the incoming instruction address exists in the history table, determining whether the incoming instruction address matches an address tag in the determined entry. Another aspect includes based on determining that the incoming instruction address does not match the address tag in the determined entry, determining whether a collision exists for the incoming instruction address. Another aspect includes based on determining that the collision exists for the incoming instruction address, activating alternate hashing for the incoming instruction address using an alternate hash buffer.
    Type: Application
    Filed: June 15, 2012
    Publication date: December 19, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Khary J. Alexander, Ilia Averbouch, Ariel J. Birnbaum, Jonathan T. Hsieh, Chung-Lung K. Shum
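    Illustrative sketch (not taken from the application): a history table that falls back to an alternate hash for an address once a collision is seen. The HistoryTable class, both hash functions, and the layout of the alternate hash buffer are hypothetical. As a simplification, any address-tag mismatch is treated as a collision, whereas the abstract describes a separate collision determination.

```python
from typing import Dict, Tuple


class HistoryTable:
    def __init__(self, size: int = 8) -> None:
        self.size = size
        self.entries: Dict[int, Tuple[int, str]] = {}   # index -> (address tag, payload)
        self.alternate: Dict[int, int] = {}              # address -> alternate index

    def primary(self, addr: int) -> int:
        return addr % self.size

    def secondary(self, addr: int) -> int:
        # Hypothetical alternate hash: uses the next-higher address bits.
        return (addr // self.size) % self.size

    def index_for(self, addr: int) -> int:
        # Addresses with alternate hashing activated use the alternate index.
        return self.alternate.get(addr, self.primary(addr))

    def install(self, addr: int, payload: str) -> None:
        self.entries[self.index_for(addr)] = (addr, payload)

    def lookup(self, addr: int) -> str:
        entry = self.entries.get(self.index_for(addr))
        if entry is None:
            return "miss"
        tag, _ = entry
        if tag == addr:
            return "hit"
        # Tag mismatch: another address owns this slot, so activate
        # alternate hashing for the incoming address.
        self.alternate[addr] = self.secondary(addr)
        return "collision"


table = HistoryTable(size=8)
table.install(3, "first")
first_try = table.lookup(11)      # 11 % 8 == 3, so it collides with address 3
table.install(11, "second")       # now placed through the alternate hash
```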
  • Publication number: 20130326193
    Abstract: Embodiments include processing systems that determine, based on an instruction address range indicator stored in a first register, whether a next instruction fetch address corresponds to a location within a first memory region associated with a current privilege state or within a second memory region associated with a different privilege state. When the next instruction fetch address is not within the first memory region, the next instruction is allowed to be fetched only when a transition to the different privilege state is legal. In a further embodiment, when a data access address is generated for an instruction, a determination is made, based on a data address range indicator stored in a second register, whether access to a memory location corresponding to the data access address is allowed. The access is allowed when the current privilege state is a privilege state in which access to the memory location is allowed.
    Type: Application
    Filed: May 31, 2012
    Publication date: December 5, 2013
    Inventors: DANIEL M. MCCARTHY, Joseph C. Circello, Kristen A. Hausman
  • Patent number: 8578135
    Abstract: A high-performance information processing technique permitting updating of an instruction buffer ready for effective prefetching to branch instructions and returning to the subroutine with a small volume of hardware is to be provided at low cost. It is an information processing apparatus equipped with a CPU, a memory, prefetch means and the like, wherein a prefetch address generator unit in the prefetch means decodes a branching series of instructions including at least one branched address calculating instruction and branching instruction to a branched address out of a current instruction buffer storing the series of instructions currently accessed by the CPU, and thereby looks ahead to the branching destination address. The information processing apparatus further comprises a RTS instruction buffer for storing a series of instructions of the return destinations of RTS instructions, and series of instructions stored in the current instruction buffer are saved into the RTS instruction buffer.
    Type: Grant
    Filed: March 16, 2012
    Date of Patent: November 5, 2013
    Assignee: Renesas Electronics Corporation
    Inventors: Teppei Hirotsu, Yuuichi Abe, Takeshi Kataoka, Yasuhiro Nakatsuka
  • Publication number: 20130290678
    Abstract: Techniques to increase the consumption rate of raw instruction bytes within an instruction fetch unit. An instruction fetch unit according to embodiments of the present invention may include a prefetch buffer, a set of bypass multiplexers, an array of bypass latches, a byte-block multiplexer, an instruction alignment multiplexer, a predecode cache, and an instruction length decoder. Raw instruction bytes may be steered from the bypass latches into macro-instructions for consumption by the instruction length decoder, which may generate micro-instructions from the macro-instructions. Embodiments of the present invention may de-couple a latency for reading raw instruction bytes from the prefetch buffer from consuming raw instruction bytes by the instruction length decoder.
    Type: Application
    Filed: April 26, 2012
    Publication date: October 31, 2013
    Applicant: INTEL CORPORATION
    Inventors: Venkateswara R. MADDURI, Hoichi CHEONG, Jonathan Y. TONG
  • Patent number: 8572356
    Abstract: Techniques and structures are disclosed for a processor supporting checkpointing to operate effectively in scouting mode while a maximum number of supported checkpoints are active. Operation in scouting mode may include using bypass logic and a set of register storage locations to store and/or forward in-flight instruction results that were calculated during scouting mode. These forwarded results may be used during scouting mode to calculate memory load addresses for yet other in-flight instructions, and the processor may accordingly cause data to be prefetched from these calculated memory load addresses. The set of register storage locations may comprise a working register file or an active portion of a multiported register file.
    Type: Grant
    Filed: January 5, 2010
    Date of Patent: October 29, 2013
    Assignee: Oracle America, Inc.
    Inventors: Sherman H. Yip, Paul Caprioli
  • Patent number: 8564604
    Abstract: Systems and methods for improving throughput of a graphics processing unit are disclosed. In one embodiment, a system includes a multithreaded execution unit capable of processing requests to access a constant cache, a vertex attribute cache, at least one common register file, and an execution unit data path substantially simultaneously.
    Type: Grant
    Filed: April 21, 2010
    Date of Patent: October 22, 2013
    Assignee: VIA Technologies, Inc.
    Inventor: Yang (Jeff) Jiao
  • Patent number: 8560786
    Abstract: Memory is used, including by receiving at a processor an indication that a first piece of metadata associated with a set of backup data is required during a block based backup and/or restore. The processor is used to retrieve from a metadata store a set of metadata that includes the first piece of metadata and one or more additional pieces of metadata included in the metadata store in an adjacent location that is adjacent to a first location in which the first piece of metadata is stored in the metadata store, without first determining whether the one or more additional pieces of metadata are currently required. The retrieved set of metadata is stored in a cache.
    Type: Grant
    Filed: February 8, 2010
    Date of Patent: October 15, 2013
    Assignee: EMC Corporation
    Inventor: Ajay Pratap Singh Kushwah
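A minimal sketch of the adjacent-metadata retrieval described above, assuming the metadata store can be modelled as a Python list and that "adjacent" means the next few entries after the requested one. The class name `MetadataCache` and the `window` parameter are hypothetical.

```python
class MetadataCache:
    """Caches a requested metadata entry together with its neighbours."""

    def __init__(self, store, window=4):
        self.store = store      # list standing in for the metadata store
        self.window = window    # assumed number of entries pulled per store read
        self.cache = {}         # index -> metadata entry
        self.store_reads = 0    # counts trips to the (slow) store

    def get(self, index):
        if not 0 <= index < len(self.store):
            raise IndexError(index)
        if index not in self.cache:
            # One store read brings in the requested entry and its neighbours,
            # without checking whether the neighbours are currently required.
            self.store_reads += 1
            end = min(index + self.window, len(self.store))
            for i in range(index, end):
                self.cache[i] = self.store[i]
        return self.cache[index]
```

With a window of 4, a request for entry 2 also caches entries 3 to 5, so a later request for entry 3 is served without another store read.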
  • Publication number: 20130262826
    Abstract: An apparatus and method are described for performing history-based prefetching. For example a method according to one embodiment comprises: determining if a previous access signature exists in memory for a memory page associated with a current stream; if the previous access signature exists, reading the previous access signature from memory; and issuing prefetch operations using the previous access signature.
    Type: Application
    Filed: October 6, 2011
    Publication date: October 3, 2013
    Inventors: Alexander Gendler, Larisa Novakovsky, George Leifman, Dana Rip
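The history-based flow above can be illustrated with a short sketch. The abstract does not say how an access signature is encoded, so this example assumes a bitmask with one bit per cache line in the page. The constants `LINE_SIZE` and `LINES_PER_PAGE` and the function `prefetch_addresses` are hypothetical.

```python
LINE_SIZE = 64        # assumed cache line size in bytes
LINES_PER_PAGE = 64   # assumed 4 KiB page / 64-byte lines


def prefetch_addresses(page_base, signatures):
    """Return the addresses to prefetch for a page, based on its stored signature.

    `signatures` maps a page base address to a bitmask (assumed encoding) in
    which bit i is set if line i of the page was accessed previously.
    """
    signature = signatures.get(page_base)
    if signature is None:
        # No previous access signature exists for this page: nothing to issue.
        return []
    return [
        page_base + i * LINE_SIZE
        for i in range(LINES_PER_PAGE)
        if (signature >> i) & 1
    ]
```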
  • Patent number: 8549255
    Abstract: A microprocessor equipped to provide hardware initiated prefetching, includes at least one architecture for performing: issuance of a prefetch instruction; writing of a prefetch address into a prefetch fetch address register (PFAR); attempting a prefetch according to the address; detecting one of a cache miss and a cache hit; and if there is a cache miss, then sending a miss request to a next cache level and attempting cache access in a non-busy cycle; and if there is a cache hit, then incrementing the address in the PFAR and completing the prefetch. A method and a computer program product are provided.
    Type: Grant
    Filed: February 15, 2008
    Date of Patent: October 1, 2013
    Assignee: International Business Machines Corporation
    Inventors: David A. Schroter, Mark S. Farrell, Jennifer Navarro, Chung-Lung Kevin Shum, Charles F. Webb
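One prefetch attempt driven by the PFAR, as outlined in the abstract above, might be modelled like this. It is a sketch under stated assumptions: the cache is a set of line addresses, the PFAR advances by one cache line on a hit, and the retry "in a non-busy cycle" is represented simply by calling the function again. `LINE_SIZE` and `prefetch_step` are hypothetical names.

```python
LINE_SIZE = 128  # assumed cache line size in bytes


def prefetch_step(pfar, cache, miss_requests, line_size=LINE_SIZE):
    """Attempt one prefetch at the address held in the PFAR.

    Returns (new_pfar, completed). On a hit the PFAR is incremented and the
    prefetch completes. On a miss a request for the line is queued for the
    next cache level and the PFAR is left unchanged so the access can be
    retried later.
    """
    line = pfar - (pfar % line_size)
    if line in cache:
        return pfar + line_size, True
    miss_requests.append(line)
    return pfar, False
```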
  • Patent number: 8533392
    Abstract: A system and method for cache hit management.
    Type: Grant
    Filed: March 4, 2009
    Date of Patent: September 10, 2013
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Adi Grossman, Omri Shacham
  • Patent number: 8533399
    Abstract: In a cache memory, energy and other efficiencies can be realized by saving a result of a cache directory lookup for sequential accesses to a same memory address. Where the cache is a point of coherence for speculative execution in a multiprocessor system, with directory lookups serving as the point of conflict detection, such saving becomes particularly advantageous.
    Type: Grant
    Filed: January 4, 2011
    Date of Patent: September 10, 2013
    Assignee: International Business Machines Corporation
    Inventor: Martin Ohmacht
  • Publication number: 20130232320
    Abstract: A prefetch unit includes a transience register and a length register. The transience register hosts an indication of transient for data stream prefetching. The length register hosts an indication of a stream length for data stream prefetching. The prefetch unit monitors the transience register and the length register. The prefetch unit generates prefetch requests of data streams with a transient property up to the stream length limit when the transience register indicates transient and the length register indicates the stream length limit for data stream prefetching. A cache controller coupled with the prefetch unit implements a cache replacement policy and cache coherence protocols. The cache controller writes data supplied from memory responsive to the prefetch requests into cache with an indication of transient. The cache controller victimizes cache lines with an indication of transient independent of the cache replacement policy.
    Type: Application
    Filed: March 1, 2012
    Publication date: September 5, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: JASON N. DALE, MILES R. DOOLEY, RICHARD J. EICKEMEYER, BRADLY G. FREY, YAOQING GAO, FRANCIS P. O'CONNELL, JEFFREY A. STUECHELI
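The victim selection described above, where lines marked transient are evicted regardless of the normal replacement policy, can be sketched as follows. The example assumes LRU as the underlying policy, which the abstract does not specify. `TransientAwareCache` and its methods are hypothetical names.

```python
from collections import OrderedDict


class TransientAwareCache:
    """LRU cache (assumed policy) that evicts transient lines first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # addr -> transient flag, least recent first

    def _choose_victim(self):
        # Transient lines are victimized independent of the replacement policy.
        for addr, transient in self.lines.items():
            if transient:
                return addr
        # Otherwise fall back to the least recently used line.
        return next(iter(self.lines))

    def fill(self, addr, transient=False):
        """Insert a line and return the evicted address, or None."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return None
        victim = None
        if len(self.lines) >= self.capacity:
            victim = self._choose_victim()
            del self.lines[victim]
        self.lines[addr] = transient
        return victim
```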
  • Patent number: 8516226
    Abstract: A method and system for flexible prefetching of data and/or instructions for applications are described. A prefetching mechanism monitors program instructions and tag information associated with the instructions. The tag information is used to determine when a prefetch operation is desirable. The prefetching mechanism then requests data and/or instructions. Furthermore, the prefetching mechanism determines when entry into a different execution phase of an application program occurs, and executes a different prefetching policy based on the application's program instructions and tag information for that execution phase as well as profile information from previous executions of the application in that execution phase.
    Type: Grant
    Filed: January 23, 2006
    Date of Patent: August 20, 2013
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Jean-Francois Collard, Norman Paul Jouppi
  • Patent number: 8490071
    Abstract: Mechanisms are provided for optimizing code to perform prefetching of data into a shared memory of a computing device that is shared by a plurality of threads that execute on the computing device. A memory stream of a portion of code that is shared by the plurality of threads is identified. A set of prefetch instructions is distributed across the plurality of threads. Prefetch instructions are inserted into the instruction sequences of the plurality of threads such that each instruction sequence has a separate sub-portion of the set of prefetch instructions, thereby generating optimized code. Executable code is generated based on the optimized code and stored in a storage device. The executable code, when executed, performs the prefetches associated with the distributed set of prefetch instructions in a shared manner across the plurality of threads.
    Type: Grant
    Filed: May 4, 2010
    Date of Patent: July 16, 2013
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, John A. Gunnels
  • Publication number: 20130179663
    Abstract: A prefetch optimizer tool for an information handling system (IHS) may improve effective memory access time by controlling both hardware prefetch operations and software prefetch operations. The prefetch optimizer tool selectively disables prefetch instructions in an instruction sequence of interest within an application. The tool measures execution times of the instruction sequence of interest when different prefetch instructions are disabled. The tool may hold hardware prefetch depth constant while cycling through disabling different prefetch instructions and taking corresponding execution time measurements. Alternatively, for each disabled prefetch instruction in the instruction sequence of interest, the tool may cycle through different hardware prefetch depths and take corresponding execution time measurements at each hardware prefetch depth.
    Type: Application
    Filed: January 10, 2012
    Publication date: July 11, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Randall Ray Heisch
  • Patent number: 8484421
    Abstract: Embodiments of the present disclosure provide a system on a chip (SOC) comprising a processing core, and a cache including a cache instruction port, a cache data port, and a port utilization circuitry configured to selectively fetch instructions through the cache instruction port and selectively pre-fetch instructions through the cache data port. Other embodiments are also described and claimed.
    Type: Grant
    Filed: November 23, 2009
    Date of Patent: July 9, 2013
    Assignee: Marvell Israel (M.I.S.L) Ltd.
    Inventors: Tarek Rohana, Adi Habusha, Gil Stoler
  • Patent number: 8484436
    Abstract: A memory controller is configured to receive read requests from a processor and return memory words from memory. The memory controller comprises an address comparator and a loop entry cache. The address comparator is configured to determine a difference between a previous read request address and a current read request address. The address comparator is also configured to determine whether the difference is positive and less than a certain address difference and, if so, indicate a limited backwards jump. The loop entry cache is configured to store a current memory word for the current read request address when the address comparator indicates a limited backwards jump.
    Type: Grant
    Filed: September 2, 2010
    Date of Patent: July 9, 2013
    Assignee: Atmel Corporation
    Inventors: Franck Lunadier, Frédéric Schumacher
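A minimal sketch of the address comparator and loop entry cache described above. It assumes the difference is computed as the previous address minus the current address, so that a backwards jump gives a positive value. `LoopEntryCache`, `max_jump`, and `on_read` are hypothetical names.

```python
class LoopEntryCache:
    """Stores the memory word at the target of each limited backwards jump."""

    def __init__(self, max_jump):
        self.max_jump = max_jump  # the "certain address difference" threshold
        self.prev_addr = None     # address of the previous read request
        self.entries = {}         # loop entry address -> memory word

    def on_read(self, addr, word):
        """Process a read request and return True on a limited backwards jump."""
        backwards = False
        if self.prev_addr is not None:
            diff = self.prev_addr - addr  # assumed sign convention
            backwards = 0 < diff < self.max_jump
        if backwards:
            # The current address is a likely loop entry: keep its word.
            self.entries[addr] = word
        self.prev_addr = addr
        return backwards
```

Sequential reads and long jumps leave the cache untouched. Only a short jump back, as at the end of a loop body, stores an entry.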
  • Patent number: 8484437
    Abstract: A data processing apparatus includes a pre-fetch unit configured to divide and store data, a validation setting unit configured to store information regarding whether or not the data stored in the pre-fetch unit are valid, an address generation unit configured to generate an address for reading/storing the data from/in the pre-fetch unit, and a pre-fetch control unit configured to control a storage position of the data in the pre-fetch unit by using the address and information of the address generation unit and the validation setting unit.
    Type: Grant
    Filed: September 7, 2010
    Date of Patent: July 9, 2013
    Assignee: Hynix Semiconductor
    Inventor: Seok-In Kim
  • Patent number: 8462789
    Abstract: A network processor of an embodiment includes a packet classification engine, a processing pipeline, and a controller. The packet classification engine allows for classifying each of a plurality of packets according to packet type. The processing pipeline has a plurality of stages for processing each of the plurality of packets in a pipelined manner, where each stage includes one or more processors. The controller allows for providing the plurality of packets to the processing pipeline in an order that is based at least partially on: (i) packet types of the plurality of packets as classified by the packet classification engine and (ii) estimates of processing times for processing packets of the packet types at each stage of the plurality of stages of the processing pipeline. A method in a network processor allows for prefetching instructions into a cache for processing a packet based on a packet type of the packet.
    Type: Grant
    Filed: May 9, 2012
    Date of Patent: June 11, 2013
    Inventor: Justin Mark Sobaje
  • Publication number: 20130138922
    Abstract: Systems and methods are disclosed for enhancing the throughput of a processor by minimizing the number of transfers of data associated with data transfer between a register file and a memory stack. The register file used by a processor running an application is partitioned into a number of blocks. A subset of the blocks of the register file is defined in an application binary interface enabling the subset to be pre-allocated and exposed to the application binary interface. Optionally, blocks other than the subset are not exposed to the application binary interface so that the data relating to application function switch or a context switch is not transferred between the unexposed blocks and a memory stack.
    Type: Application
    Filed: November 29, 2011
    Publication date: May 30, 2013
    Applicant: International Business Machines Corporation
    Inventors: Revital Eres, Amit Golander, Nadav Levison, Sagi Manole, Ayal Zaks