Prefetching Patents (Class 712/207)
  • Patent number: 7594096
    Abstract: The present invention allows a microprocessor to identify and speculatively execute future load instructions during a stall condition. This allows forward progress to be made through the instruction stream during the stall condition which would otherwise cause the microprocessor or thread of execution to be idle. The data for such future load instructions can be prefetched from a distant cache or main memory such that when the load instruction is re-executed (non-speculatively) after the stall condition expires, its data will reside either in the L1 cache, or will be en route to the processor, resulting in a reduced execution latency. When an extended stall condition is detected, load lookahead prefetch is started, allowing speculative execution of instructions that would normally have been stalled.
    Type: Grant
    Filed: December 5, 2007
    Date of Patent: September 22, 2009
    Assignee: International Business Machines Corporation
    Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
  • Publication number: 20090235053
    Abstract: A system and method for performing register renaming of source registers in a processor having a variable advance instruction window for storing a group of instructions to be executed by the processor, wherein a new instruction is added to the variable advance instruction window when a location becomes available. A tag is assigned to each instruction in the variable advance instruction window. The tag of each instruction to leave the window is assigned to the next new instruction to be added to it. The results of instructions executed by the processor are stored in a temp buffer according to their corresponding tags to avoid output and anti-dependencies. The temp buffer therefore permits the processor to execute instructions out of order and in parallel. Data dependency checks for input dependencies are performed only for each new instruction added to the variable advance instruction window and register renaming is performed to avoid input dependencies.
    Type: Application
    Filed: May 26, 2009
    Publication date: September 17, 2009
    Applicant: Seiko Epson Corporation
    Inventors: Trevor A. Deosaran, Sanjiv Garg, Kevin R. Iadonato
  • Publication number: 20090228688
    Abstract: A system including a dual function adder is described. In one embodiment, the system includes an adder. The adder is configured, for a first instruction, to determine an address for a hardware prefetch if the first instruction is a hardware prefetch instruction. The adder is further configured, for the first instruction, to determine a value from an arithmetic operation if the first instruction is an arithmetic operation instruction.
    Type: Application
    Filed: March 4, 2008
    Publication date: September 10, 2009
    Applicant: QUALCOMM INCORPORATED
    Inventors: Ajay Anant Ingle, Erich James Plondke, Lucian Codrescu
  • Patent number: 7587580
    Abstract: A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associated with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching continues when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss.
    Type: Grant
    Filed: February 3, 2005
    Date of Patent: September 8, 2009
    Assignee: QUALCOMM Incorporated
    Inventors: Thomas Andrew Sartorius, Victor Roberts Augsburg, James Norris Dieffenderfer, Jeffrey Todd Bridges, Michael Scott McIlvaine, Rodney Wayne Smith
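The gating scheme in the abstract above can be illustrated with a classic 2-bit saturating counter, where the middle states are the "weakly weighted" predictions. This sketch is illustrative only; the patent text does not specify a counter width or encoding.

```python
class WeightedBranchPredictor:
    """Illustrative 2-bit saturating predictor: states 0-1 predict
    not-taken, 2-3 predict taken; states 1 and 2 are 'weak'."""
    def __init__(self):
        self.counter = 2  # start weakly taken

    def predict(self):
        """Return (taken, strong): direction and prediction strength."""
        taken = self.counter >= 2
        strong = self.counter in (0, 3)
        return taken, strong

    def update(self, actual_taken):
        if actual_taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)

def should_prefetch(predictor):
    # Halt prefetching on weakly weighted predictions to save the power
    # of speculatively filling and then flushing the cache.
    _, strong = predictor.predict()
    return strong
```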
  • Publication number: 20090216951
    Abstract: A system, method, and computer program product for handling shared cache lines to allow forward progress among processors in a multi-processor environment is provided. A counter and a threshold are provided to a processor of the multi-processor environment, such that the counter is incremented for every exclusive cross interrogate (XI) reject that is followed by an instruction completion, and reset on an exclusive XI acknowledgement. If the XI reject counter reaches a preset threshold value, the processor's pipeline is drained by blocking instruction issue and prefetching attempts, creating a window for an exclusive XI from another processor to be honored, after which normal instruction processing is resumed. Configuring the preset threshold value as a programmable value allows for fine-tuning of system performance.
    Type: Application
    Filed: February 22, 2008
    Publication date: August 27, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Chung-Lung Kevin Shum, Charles F. Webb
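A minimal sketch of the XI-reject counter described above, assuming a plain integer counter and a hypothetical threshold value (the patent makes the threshold programmable):

```python
class XIRejectMonitor:
    """Illustrative counter: increments on each exclusive cross-interrogate
    (XI) reject followed by an instruction completion, resets on an XI
    acknowledgement, and requests a pipeline drain at the threshold."""
    def __init__(self, threshold=4):  # threshold value is hypothetical
        self.threshold = threshold
        self.count = 0

    def on_xi_reject_then_completion(self):
        self.count += 1
        return self.count >= self.threshold  # True => drain the pipeline

    def on_xi_ack(self):
        self.count = 0
```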
  • Publication number: 20090217004
    Abstract: A prefetch bit (126) is associated with cache block (125) of a cache (120), and the management (130) of cache-prefetch operations is based on the state of this bit (126). Further efficiencies are gained by allowing each application to identify memory areas (115) within which regularly repeating memory accesses are likely, such as frame memory in a video application. For each of these memory areas (115), the application also identifies a likely stride value, such as the line length of the data in the frame memory. Pre-fetching is limited to the identified areas (115), and the prefetch bit (126) is used to identify blocks (125) from these areas and to limit repeated cache hit/miss determinations.
    Type: Application
    Filed: November 15, 2005
    Publication date: August 27, 2009
    Applicant: Koninklijke Philips Electronics N.V.
    Inventors: Jan-Willem Van De Waerdt, Jean-Paul Vanitegem
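The area-limited stride scheme above can be sketched as a lookup that only prefetches inside an application-identified region, advancing by that region's declared stride (e.g. a frame buffer's line length). The region map and values here are illustrative, not from the patent:

```python
def next_prefetch(addr, regions):
    """Return the next prefetch address for `addr`, or None.

    `regions` maps (base, limit) -> stride for each memory area the
    application identified as having regularly repeating accesses.
    Prefetching is limited to those areas."""
    for (base, limit), stride in regions.items():
        if base <= addr < limit:
            target = addr + stride
            return target if target < limit else None
    return None  # outside every identified area: no prefetch
```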
  • Publication number: 20090210663
    Abstract: A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associated with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching continues when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss.
    Type: Application
    Filed: May 4, 2009
    Publication date: August 20, 2009
    Applicant: QUALCOMM INCORPORATED
    Inventors: Thomas Andrew Sartorius, Victor Roberts Augsburg, James Norris Dieffenderfer, Jeffrey Todd Bridges, Michael Scott McIlvaine, Rodney Wayne Smith
  • Publication number: 20090210662
    Abstract: A microprocessor equipped to provide hardware initiated prefetching, includes at least one architecture for performing: issuance of a prefetch instruction; writing of a prefetch address into a prefetch fetch address register (PFAR); attempting a prefetch according to the address; detecting one of a cache miss and a cache hit; and if there is a cache miss, then sending a miss request to a next cache level and attempting cache access in a non-busy cycle; and if there is a cache hit, then incrementing the address in the PFAR and completing the prefetch. A method and a computer program product are provided.
    Type: Application
    Filed: February 15, 2008
    Publication date: August 20, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: David A. Schroter, Mark S. Farrell, Jennifer Navarro, Chung-Lung Kevin Shum, Charles F. Webb
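One step of the PFAR flow described above can be sketched as follows (a simplified, illustrative model: the cache is a set of resident line addresses, and the line size is assumed):

```python
def pfar_step(pfar_addr, cache, line_size=64):
    """One step of the PFAR prefetch flow (illustrative).

    On a cache hit: increment the address in the PFAR and complete the
    prefetch. On a cache miss: keep the PFAR and send a miss request to
    the next cache level (returned as the second tuple element)."""
    if pfar_addr in cache:                    # cache hit
        return pfar_addr + line_size, None    # advance PFAR, no miss request
    return pfar_addr, pfar_addr               # miss: issue request, hold PFAR
```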
  • Patent number: 7577947
    Abstract: Methods and apparatus to dynamically insert prefetch instructions are disclosed. In an example method, one or more samples associated with cache misses are identified from a performance monitoring unit in a processor system. Based on sample information associated with the one or more samples, delinquent information is generated. To dynamically insert one or more prefetch instructions, a prefetch point is identified based on the delinquent information.
    Type: Grant
    Filed: December 19, 2003
    Date of Patent: August 18, 2009
    Assignee: Intel Corporation
    Inventors: Sreenivas Subramoney, Mauricio J. Serrano, Richard L. Hudson, Ali-Reza Adl-Tabatabai
  • Publication number: 20090204791
    Abstract: A method and apparatus for forming compound issue groups containing instructions from multiple cache lines of instructions are provided. By pre-fetching instruction lines containing instructions targeted by a conditional branch statement, if it is predicted that the conditional branch will be taken, a compound issue group may be formed with instructions from the I-line containing the branch statement and the I-line containing instructions targeted by the branch.
    Type: Application
    Filed: February 12, 2008
    Publication date: August 13, 2009
    Inventor: David A. Luick
  • Publication number: 20090198965
    Abstract: According to a method of data processing, a memory controller receives a prefetch load request from a processor core of a data processing system. The prefetch load request specifies a requested line of data. In response to receipt of the prefetch load request, the memory controller determines by reference to a stream of demand requests how much data is to be supplied to the processor core in response to the prefetch load request. In response to the memory controller determining to provide less than all of the requested line of data, the memory controller provides less than all of the requested line of data to the processor core.
    Type: Application
    Filed: February 1, 2008
    Publication date: August 6, 2009
    Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Publication number: 20090198964
    Abstract: A method, system, and computer program product are provided for verifying out of order instruction address (IA) stride prefetch performance in a processor design having more than one level of cache hierarchies. Multiple instruction streams are generated and the instructions loop back to corresponding instruction addresses. The multiple instruction streams are dispatched to a processor and simulation application to process. When a particular instruction is being dispatched, the particular instruction's instruction address and operand address are recorded in the queue. The processor is monitored to determine if the processor executes fetch and prefetch commands in accordance with the simulation application. It is checked to determine if prefetch commands are issued for instructions having three or more strides.
    Type: Application
    Filed: January 31, 2008
    Publication date: August 6, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Wei-Yi Xiao, Dean G. Bair, Christopher A. Krygowski, Chung-Lung K. Shum
  • Patent number: 7562192
    Abstract: An apparatus in a microprocessor for selectively retiring a prefetched cache line is disclosed. The microprocessor includes a prefetch buffer that stores a cache line prefetched from a system memory coupled to the microprocessor. The microprocessor also includes a cache memory, comprising an array of storage elements for storing cache lines, indexed by an index input. One of the storage elements of the array indexed by an index portion of an address of the prefetched cache line stored in the prefetch buffer is storing a replacement candidate line for the prefetched cache line. The microprocessor also includes control logic that determines whether the replacement candidate line in the cache memory is invalid, and if so, replaces the replacement candidate line in the one of the storage elements with the prefetched cache line from the prefetch buffer.
    Type: Grant
    Filed: November 27, 2006
    Date of Patent: July 14, 2009
    Assignee: Centaur Technologies
    Inventors: G. Glenn Henry, Rodney E. Hooker
  • Patent number: 7555633
    Abstract: Various embodiments of methods and systems for implementing a microprocessor that fetches a group of instructions into instruction cache in response to a corresponding trace being evicted from the trace cache are disclosed. In some embodiments, a microprocessor may include an instruction cache, a trace cache, and a prefetch unit. In response to a trace being evicted from trace cache, the prefetch unit may fetch a line of instructions into instruction cache.
    Type: Grant
    Filed: November 3, 2003
    Date of Patent: June 30, 2009
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Gregory William Smaus, Mitchell Alsup
  • Publication number: 20090157967
    Abstract: A prefetch data machine instruction having an M field performs a function on a cache line of data specifying an address of an operand. The operation comprises either prefetching a cache line of data from memory to a cache or reducing the access ownership of store and fetch or fetch only of the cache line in the cache or a combination thereof. The address of the operand is either based on a register value or the program counter value pointing to the prefetch data machine instruction.
    Type: Application
    Filed: December 12, 2007
    Publication date: June 18, 2009
    Applicant: International Business Machines Corporation
    Inventors: Dan F. Greiner, Timothy J. Slegel
  • Publication number: 20090138661
    Abstract: A computer system and method. In one embodiment, a computer system comprises a processor and a cache memory. The processor executes a prefetch instruction to prefetch a block of data words into the cache memory. In one embodiment, the cache memory comprises a plurality of cache levels. The processor selects one of the cache levels based on a value of a prefetch instruction parameter indicating the temporal locality of data to be prefetched. In a further embodiment, individual words are prefetched from non-contiguous memory addresses. A single execution of the prefetch instruction allows the processor to prefetch multiple blocks into the cache memory. The number of data words in each block, the number of blocks, an address interval between each data word of each block, and an address interval between each block to be prefetched are indicated by parameters of the prefetch instruction.
    Type: Application
    Filed: November 26, 2007
    Publication date: May 28, 2009
    Inventor: Gary Lauterbach
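The parameterized block-prefetch instruction above can be illustrated by expanding its parameters into the set of word addresses a single execution would prefetch. Parameter names are illustrative, not taken from the patent:

```python
def prefetch_addresses(base, words_per_block, num_blocks,
                       word_stride, block_stride):
    """Expand the prefetch-instruction parameters into word addresses.

    words_per_block: number of data words in each block
    num_blocks:      number of blocks to prefetch
    word_stride:     address interval between words within a block
    block_stride:    address interval between blocks"""
    addrs = []
    for b in range(num_blocks):
        block_base = base + b * block_stride
        for w in range(words_per_block):
            addrs.append(block_base + w * word_stride)
    return addrs
```

Non-contiguous gather-style prefetching falls out naturally: a `word_stride` larger than the word size skips over intervening memory.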
  • Patent number: 7533247
    Abstract: The present subject matter relates to operation frame filtering, building, and execution. Some embodiments include identifying a frame signature, counting a number of execution occurrences of the frame signature, and building a frame of operations to execute instead of operations identified by the frame signature.
    Type: Grant
    Filed: December 30, 2005
    Date of Patent: May 12, 2009
    Assignee: Intel Corporation
    Inventors: Stephan Jourdan, Per Hammarlund, Alexandre Farcy, John Alan Miller
  • Patent number: 7533220
    Abstract: A microprocessor coupled to a system memory has a memory subsystem with a translation look-aside buffer (TLB) for storing TLB information. The microprocessor also includes an instruction decode unit that decodes an instruction that specifies a data stream in the system memory and an abnormal TLB access policy. The microprocessor also includes a stream prefetch unit that generates a prefetch request to the memory subsystem to prefetch a cache line of the data stream from the system memory into the memory subsystem. If a virtual page address of the prefetch request causes an abnormal TLB access, the memory subsystem selectively aborts the prefetch request based on the abnormal TLB access policy specified in the instruction.
    Type: Grant
    Filed: August 11, 2006
    Date of Patent: May 12, 2009
    Assignee: MIPS Technologies, Inc.
    Inventor: Keith E. Diefendorff
  • Publication number: 20090119488
    Abstract: In one embodiment, a processor comprises a prefetch unit coupled to a data cache. The prefetch unit is configured to concurrently maintain a plurality of separate, active prefetch streams. Each prefetch stream is either software initiated via execution by the processor of a dedicated prefetch instruction or hardware initiated via detection of a data cache miss by one or more load/store memory operations. The prefetch unit is further configured to generate prefetch requests responsive to the plurality of prefetch streams to prefetch data into the data cache.
    Type: Application
    Filed: January 7, 2009
    Publication date: May 7, 2009
    Inventors: Sudarshan Kadambi, Puneet Kumar, Po-Yung Chang
  • Patent number: 7530063
    Abstract: A method and system of modifying instructions forming a loop is provided. The method includes: determining static and dynamic characteristics for the instructions; selecting a modification factor for the instructions based on a number of separate equivalent sections forming a cache in a processor which is processing the instructions; and modifying the instructions to interleave the instructions in the loop according to the modification factor and the static and dynamic characteristics when the instructions satisfy a modification criteria based on the static and dynamic characteristics.
    Type: Grant
    Filed: May 27, 2004
    Date of Patent: May 5, 2009
    Assignee: International Business Machines Corporation
    Inventors: Roch Georges Archambault, Robert James Blainey, Yaoqing Gao, John David McCalpin, Francis Patrick O'Connell, Pascal Vezolle, Steven Wayne White
  • Patent number: 7529911
    Abstract: One embodiment of the present invention provides a system that improves the effectiveness of prefetching during execution of instructions in scout mode. Upon encountering a non-data dependent stall condition, the system performs a checkpoint and commences execution of instructions in scout mode, wherein instructions are speculatively executed to prefetch future memory operations, but wherein results are not committed to the architectural state of a processor. When the system executes a load instruction during scout mode, if the load instruction causes a lower-level cache miss, the system allows the load instruction to access a higher-level cache. Next, the system places the load instruction and subsequent dependent instructions into a deferred queue, and resumes execution of the program in scout mode.
    Type: Grant
    Filed: May 26, 2005
    Date of Patent: May 5, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Lawrence A. Spracklen, Yuan C. Chou, Santosh G. Abraham
  • Patent number: 7519777
    Abstract: Methods, systems and computer program products for concomitant pair prefetching. Exemplary embodiments include a method for concomitant pair prefetching, the method including detecting a stride pattern, detecting an indirect access pattern to define an access window, prefetching candidates within the defined access window, wherein the prefetching comprises obtaining prefetch addresses from a history table, updating a miss stream window, selecting a candidate of a concomitant pair from the miss stream window, producing an index from the candidate pair, accessing an aging filter, updating the history table and selecting another concomitant pair candidate from the miss stream window.
    Type: Grant
    Filed: June 11, 2008
    Date of Patent: April 14, 2009
    Assignee: International Business Machines Corporation
    Inventors: Kattamuri Ekanadham, Il Park, Pratap C. Pattnaik
  • Publication number: 20090094440
    Abstract: A pre-fetch circuit of a semiconductor memory apparatus can carry out a high-frequency operating test through a low-frequency channel of a test equipment. The pre-fetch circuit of a semiconductor memory apparatus can include: a pre-fetch unit for pre-fetching data bits in a first predetermined number; a plurality of registers provided in the first predetermined number, each of which latches a data in order or a data out of order of the pre-fetched data in response to different control signals; and a control unit for selectively activating the different control signals in response to a test mode signal, whereby some of the registers latch the data out of order.
    Type: Application
    Filed: July 18, 2008
    Publication date: April 9, 2009
    Applicant: HYNIX SEMICONDUCTOR, INC.
    Inventor: Young-Ju Kim
  • Patent number: 7516312
    Abstract: An instruction prefetch apparatus includes a branch target buffer (BTB), a presbyopic target buffer (PTB) and a prefetch stream buffer (PSB). The BTB includes records that map branch addresses to branch target addresses, and the PTB includes records that map branch target addresses to subsequent branch target addresses. When a branch instruction is encountered, the BTB can predict the dynamically adjacent subsequent block entry location as the branch target address in the record that also includes the branch instruction address. The PTB can predict multiple subsequent blocks by mapping the branch target address to subsequent dynamic blocks. The PSB holds instructions prefetched from subsequent blocks predicted by the PTB.
    Type: Grant
    Filed: April 2, 2004
    Date of Patent: April 7, 2009
    Assignee: Intel Corporation
    Inventors: Hong Wang, Ralph Kling, Edward T. Grochowski, Kalpana Ramakrishnan
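The BTB/PTB chaining above can be sketched with two plain dictionaries: the BTB maps a branch address to its target, and the PTB then maps each target to the subsequent target, yielding several blocks ahead for the prefetch stream buffer. The table representation and chain depth are illustrative:

```python
def predict_stream(branch_addr, btb, ptb, depth=3):
    """Chain BTB and PTB lookups to predict up to `depth` subsequent
    block entry points for the prefetch stream buffer (PSB).

    btb: branch address -> branch target address
    ptb: branch target address -> subsequent branch target address"""
    targets = []
    target = btb.get(branch_addr)
    while target is not None and len(targets) < depth:
        targets.append(target)
        target = ptb.get(target)
    return targets
```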
  • Patent number: 7516278
    Abstract: A system controller, which executes a speculative fetch from a memory before determining whether data requested for a memory fetch request is in a cache by searching tag information of the cache, includes a consumption determining unit that monitors a consumption status of a hardware resource used in the speculative fetch, and determines whether a consumption of the hardware resource exceeds a predetermined value; and a speculative-fetch issuing unit that stops issuing the speculative fetch when the consumption determining unit determines that the consumption of the hardware resource exceeds the predetermined value.
    Type: Grant
    Filed: December 1, 2004
    Date of Patent: April 7, 2009
    Assignee: Fujitsu Limited
    Inventors: Akira Watanabe, Go Sugizaki, Shigekatsu Sagi, Masahiro Mishima
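The consumption check described above reduces to a simple gate on speculative-fetch issue; this sketch assumes the "predetermined value" is a fraction of the resource's capacity, which is purely illustrative:

```python
def issue_speculative_fetch(resource_used, capacity, limit_fraction=0.75):
    """Return True if a speculative fetch may be issued.

    Stops issuing speculative fetches once consumption of the hardware
    resource used by speculative fetching exceeds the predetermined
    value (here a hypothetical fraction of capacity)."""
    return resource_used <= capacity * limit_fraction
```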
  • Patent number: 7516279
    Abstract: Computer implemented method, system and computer program product for prefetching data in a data processing system. A computer implemented method for prefetching data in a data processing system includes generating attribute information of prior data streams by associating attributes of each prior data stream with a storage access instruction which caused allocation of the data stream, and then recording the generated attribute information. The recorded attribute information is accessed, and a behavior of a new data stream is modified using the accessed recorded attribute information.
    Type: Grant
    Filed: February 28, 2006
    Date of Patent: April 7, 2009
    Assignee: International Business Machines Corporation
    Inventors: John Barry Griswell, Jr., Francis Patrick O'Connell
  • Publication number: 20090089548
    Abstract: A method for preloading data in a CPU pipeline is provided, which includes the following steps. When a hint instruction is executed, allocate and initiate an entry in a preload table. When a load instruction is fetched, load a piece of data from a memory into the entry according to the entry. When a use instruction which uses the data loaded by the load instruction is executed, forward the data for the use instruction from the entry instead of from the memory. When the load instruction is executed, update the entry according to the load instruction.
    Type: Application
    Filed: September 27, 2007
    Publication date: April 2, 2009
    Applicant: FARADAY TECHNOLOGY CORP.
    Inventors: I-Jui Sung, Ming-Chung Kao
  • Patent number: 7512740
    Abstract: A microprocessor coupled to a system memory by a bus includes an instruction decode unit that decodes an instruction that specifies a data stream in the system memory and a stream prefetch priority. The microprocessor also includes a load/store unit that generates load/store requests to transfer data between the system memory and the microprocessor. The microprocessor also includes a stream prefetch unit that generates a plurality of prefetch requests to prefetch the data stream from the system memory into the microprocessor. The prefetch requests specify the stream prefetch priority. The microprocessor also includes a bus interface unit (BIU) that generates transaction requests on the bus to transfer data between the system memory and the microprocessor in response to the load/store requests and the prefetch requests. The BIU prioritizes the bus transaction requests for the prefetch requests relative to the bus transaction requests for the load/store requests based on the stream prefetch priority.
    Type: Grant
    Filed: August 11, 2006
    Date of Patent: March 31, 2009
    Assignee: MIPS Technologies, Inc.
    Inventor: Keith E. Diefendorff
  • Patent number: 7511851
    Abstract: The network printer obtains information from a Web page over the network, and stores this obtained information together with a URL thereof and the time when this information was received into the storage. When a print request from the outside via a console panel is made, the network printer reads data of an assigned URL from the storage, and prints this data using the printer engine.
    Type: Grant
    Filed: December 28, 2005
    Date of Patent: March 31, 2009
    Assignee: Ricoh Company, Ltd.
    Inventor: Hiroyuki Matsushima
  • Patent number: 7509472
    Abstract: Address translation for instruction fetching can be obviated for sequences of instruction instances that reside on a same page. Obviating address translation reduces power consumption and increases pipeline efficiency since accessing of an address translation buffer can be avoided. Certain events, such as branch mis-predictions and exceptions, can be designated as page boundary crossing events. In addition, carry over at a particular bit position when computing a branch target or a next instruction instance fetch target can also be designated as a page boundary crossing event. An address translation buffer is accessed to translate an address representation of a first instruction instance. However, until a page boundary crossing event occurs, the address representations of subsequent instruction instances are not translated. Instead, the translated portion of the address representation for the first instruction instance is recycled for the subsequent instruction instances.
    Type: Grant
    Filed: February 1, 2006
    Date of Patent: March 24, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Paul Caprioli, Shailender Chaudhry
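The translation-recycling idea above can be sketched as follows: the physical page number from the first fetch is reused for sequential fetches, and the TLB is consulted again only on a page-boundary-crossing event (branch misprediction, exception, or carry into the page-number bits). The 4 KiB page size is an assumption:

```python
PAGE_SHIFT = 12  # assuming 4 KiB pages for illustration

def fetch_translation(vaddr, cached_ppn, tlb_lookup, boundary_event):
    """Translate a fetch address, recycling the cached physical page
    number (PPN) until a page-boundary-crossing event occurs.

    Returns (physical address, PPN to cache for the next fetch)."""
    if cached_ppn is None or boundary_event:
        cached_ppn = tlb_lookup(vaddr >> PAGE_SHIFT)  # access the TLB
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    return (cached_ppn << PAGE_SHIFT) | offset, cached_ppn
```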
  • Patent number: 7509459
    Abstract: A microprocessor has a plurality of stream prefetch engines for prefetching a respective data stream from the system memory into the microprocessor cache memory and an instruction decoder that decodes instructions of the microprocessor instruction set. The instruction set includes a stream prefetch instruction that returns an identifier uniquely associating a data stream specified by the instruction with one of the engines. The instruction set also includes an explicit prefetch-triggering load instruction that specifies a stream identifier returned by a previously executed stream prefetch instruction. When the decoder decodes a conventional load instruction it does not prefetch; however, when it decodes an explicit prefetch-triggering load instruction it commences prefetching the specified data stream. In one embodiment, an indicator of the load instruction may explicitly specify non-prefetch-triggering.
    Type: Grant
    Filed: October 13, 2006
    Date of Patent: March 24, 2009
    Assignee: MIPS Technologies, Inc.
    Inventor: Keith E. Diefendorff
  • Publication number: 20090077350
    Abstract: A method, system and computer program for modifying an executing application, comprising monitoring the executing application to identify at least one of a hot load instruction, a hot store instruction and an active prefetch instruction that contributes to cache congestion; where the monitoring identifies a hot load instruction, enabling at least one prefetch associated with the hot load instruction; where the monitoring identifies a hot store instruction, enabling at least one prefetch associated with the hot store instruction; and where the monitoring identifies an active prefetch instruction that contributes to cache congestion, one of disabling the active prefetch instruction and reducing the effectiveness of the active prefetch instructions.
    Type: Application
    Filed: May 29, 2007
    Publication date: March 19, 2009
    Inventors: Sujoy Saraswati, Teresa Johnson
  • Publication number: 20090077321
    Abstract: A microprocessor coupled to a system memory by a bus includes an instruction decode unit that decodes an instruction that specifies a data stream in the system memory and a stream prefetch priority. The microprocessor also includes a load/store unit that generates load/store requests to transfer data between the system memory and the microprocessor. The microprocessor also includes a stream prefetch unit that generates a plurality of prefetch requests to prefetch the data stream from the system memory into the microprocessor. The prefetch requests specify the stream prefetch priority. The microprocessor also includes a bus interface unit (BIU) that generates transaction requests on the bus to transfer data between the system memory and the microprocessor in response to the load/store requests and the prefetch requests. The BIU prioritizes the bus transaction requests for the prefetch requests relative to the bus transaction requests for the load/store requests based on the stream prefetch priority.
    Type: Application
    Filed: August 4, 2008
    Publication date: March 19, 2009
    Applicant: MIPS Technologies, Inc.
    Inventor: Keith E. Diefendorff
  • Patent number: 7506106
    Abstract: A microprocessor has a data stream prefetch unit for processing a data stream prefetch instruction. The instruction specifies a data stream and a speculative stream hit policy indicator. If a load instruction hits in the data stream, then if the load is non-speculative the stream prefetch unit prefetches a portion of the data stream from system memory into cache memory; however, if the load is speculative the stream prefetch unit selectively prefetches a portion of the data stream from the system memory into the cache memory based on the value of the policy indicator. The load instruction is speculative if it is not guaranteed to complete execution, such as if it follows a predicted branch instruction whose outcome has not yet been finally determined to be correct. In one embodiment, the stream prefetch unit performs a similar function for store instructions that hit in the data stream.
    Type: Grant
    Filed: October 13, 2006
    Date of Patent: March 17, 2009
    Assignee: MIPS Technologies, Inc.
    Inventor: Keith E. Diefendorff
  • Patent number: 7506105
    Abstract: Generating a hashed value of the program counter in a data processing system. The hashed value can be used for prefetching in the data processing system. In some examples, the hashed value is used to identify whether a load instruction associated with the hashed value has an address that is part of a strided stream in an address stream. In some examples, the hashed value is a subset of the bits of the program counter. In other examples, the hashed value may be derived in other ways from the program counter.
    Type: Grant
    Filed: May 2, 2005
    Date of Patent: March 17, 2009
    Assignee: Freescale Semiconductor, Inc.
    Inventors: Hassan F. Al-Sukhni, James C. Holt, Matt B. Smittle, Michael D. Snyder, Brian C. Grayson
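The hashed-program-counter stride detection described above can be sketched as a table indexed by a subset of PC bits, where each entry remembers the last address and last stride seen for that hashed PC; a load is flagged as part of a strided stream once the same nonzero stride repeats. All names and the specific bit selection here are assumptions for illustration:

```python
def hash_pc(pc, bits=8):
    """Illustrative hash: keep a subset of program-counter bits
    (the low `bits` above the 2-bit instruction alignment)."""
    return (pc >> 2) & ((1 << bits) - 1)

class StrideDetector:
    """Per-hashed-PC table tracking (last address, last stride)."""

    def __init__(self):
        self.table = {}  # hashed PC -> (last_addr, last_stride)

    def observe(self, pc, addr):
        """Record a load's address; return True when it continues a
        strided stream (same nonzero stride seen twice in a row)."""
        key = hash_pc(pc)
        entry = self.table.get(key)
        strided = False
        if entry is not None:
            last_addr, last_stride = entry
            stride = addr - last_addr
            strided = (stride == last_stride and stride != 0)
            self.table[key] = (addr, stride)
        else:
            self.table[key] = (addr, None)
        return strided

det = StrideDetector()
hits = [det.observe(0x400, a) for a in (100, 116, 132, 148)]
```

The third and fourth observations repeat the stride of 16, so only they are flagged as strided.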
  • Publication number: 20090070556
    Abstract: In a microprocessor having a load/store unit and prefetch hardware, the prefetch hardware includes a prefetch queue containing entries indicative of allocated data streams. A prefetch engine receives an address associated with a store instruction executed by the load/store unit. The prefetch engine determines whether to allocate an entry in the prefetch queue corresponding to the store instruction by comparing entries in the queue to a window of addresses encompassing multiple cache blocks, where the window of addresses is derived from the received address. The prefetch engine compares entries in the prefetch queue to a window of 2M contiguous cache blocks. The prefetch engine suppresses allocation of a new entry when any entry in the prefetch queue is within the address window. The prefetch engine further suppresses allocation of a new entry when the data address of the store instruction is equal to an address in a border area of the address window.
    Type: Application
    Filed: January 4, 2008
    Publication date: March 12, 2009
    Inventors: John Barry Griswell, Jr., Hung Qui Le, Francis Patrick O'Connell, William J. Starke, Jeffrey Adam Stuecheli, Albert Thomas Williams
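The allocation-suppression check described above can be sketched as comparing the store's cache block against a window of blocks around each existing prefetch-queue entry. The window shape below (2·m blocks centred on the store's block) is one plausible reading of the abstract, not the patent's exact geometry; names and the 64-byte block size are assumptions:

```python
def should_allocate(new_addr, queue_addrs, block_size=64, m=2):
    """Return True if a new prefetch-stream entry may be allocated for a
    store at new_addr. Allocation is suppressed when any existing queue
    entry falls within a window of cache blocks around the store's
    block, so overlapping streams are not allocated twice."""
    block = new_addr // block_size
    lo, hi = block - m, block + m
    for addr in queue_addrs:
        if lo <= addr // block_size <= hi:
            return False  # existing stream already covers this window
    return True

near = should_allocate(0x1000, [0x1040])  # neighbouring block: suppress
far = should_allocate(0x1000, [0x5000])   # distant block: allocate
```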
  • Patent number: 7502910
    Abstract: A sideband scout thread processing technique is provided. The sideband scout thread processing technique utilizes sideband information to identify a subset of processor instructions for execution by a scout thread processor. The sideband information identifies instructions that need to be executed to “warm-up” a cache memory that is shared with a main processor executing the whole set of processor instructions. Thus, the main processor has fewer cache misses and reduced latencies. In one embodiment, a system includes a first processor for executing a sequence of processor instructions, a second processor for executing a subset of the sequence of processor instructions, and a cache shared between the first processor and the second processor. The second processor includes sideband circuitry configured to identify the subset of the sequence of processor instructions to execute according to sideband information associated with the sequence of processor instructions.
    Type: Grant
    Filed: January 28, 2003
    Date of Patent: March 10, 2009
    Assignee: Sun Microsystems, Inc.
    Inventor: Peter C. Damron
  • Patent number: 7500062
    Abstract: A circuit arrangement and method selectively reorder speculatively issued memory read requests being communicated to a lower memory level in a multi-level memory architecture. In particular, a memory read request that has been speculatively issued to a lower memory level prior to completion of a cache lookup operation initiated in a cache memory in a higher memory level may be reordered ahead of at least one previously received and pending request awaiting communication to the lower memory level. By doing so, the latency associated with the memory read request is reduced when the request results in a cache miss in the higher level memory, and as a result, system performance is improved.
    Type: Grant
    Filed: November 17, 2005
    Date of Patent: March 3, 2009
    Assignee: International Business Machines Corporation
    Inventors: Bruce Leroy Beukema, Michael Bar-Joshua, Alexander Mesh, Shaul Yifrach
  • Patent number: 7500061
    Abstract: A preload controller for controlling a bus access device that reads out data from a main memory via a bus and transfers the readout data to a temporary memory, including a first acquiring device to acquire access hint information which represents a data access interval to the main memory, a second acquiring device to acquire system information which represents a transfer delay time in transfer of data via the bus by the bus access device, a determining device to determine a preload unit count based on the data access interval represented by the access hint information and the transfer delay time represented by the system information, and a management device to instruct the bus access device to read out data for the preload unit count from the main memory and to transfer the readout data to the temporary memory ahead of a data access of the data.
    Type: Grant
    Filed: June 14, 2005
    Date of Patent: March 3, 2009
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Seiji Maeda, Yusuke Shirota
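The determining device above combines the data access interval (from the access hint) with the bus transfer delay (from the system information). One plausible reading, not the patent's exact formula, is that enough units must be in flight to cover the transfer delay at the given access rate:

```python
import math

def preload_unit_count(access_interval, transfer_delay):
    """Number of preload units to read ahead so data arrives at the
    temporary memory before it is accessed: enough outstanding reads
    to cover the bus transfer delay at the given access interval.
    (A sketch under assumed semantics; units are arbitrary time.)"""
    return max(1, math.ceil(transfer_delay / access_interval))

# Accesses every 10 cycles, 35-cycle bus delay: stay 4 units ahead.
fast_access = preload_unit_count(10, 35)
# Accesses every 50 cycles, 35-cycle bus delay: 1 unit suffices.
slow_access = preload_unit_count(50, 35)
```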
  • Publication number: 20090055333
    Abstract: When a patient enters a medical situation, healthcare professionals can use various amounts of information in evaluating the situation. However, different information can be beneficial depending on the medical situation. Moreover, personnel have historically used specific information types regardless of the situation. An artificial neural network is employed to pre-fetch information that personnel will likely want, prior to a request from the personnel. In addition, the artificial neural network can be trained based on results of presented information.
    Type: Application
    Filed: August 22, 2007
    Publication date: February 26, 2009
    Applicant: MICROSOFT CORPORATION
    Inventor: Gang Wang
  • Publication number: 20090055628
    Abstract: Assigning each of a plurality of memory fetch units to any of a plurality of candidate variables to reduce load-hit-store delays, wherein a total number of required memory fetch units is minimized. A plurality of store/load pairs are identified. A dependency graph is generated by creating a node Nx for each store to variable X and a node Ny for each load of variable Y and, unless X=Y, for each store/load pair, creating an edge between a respective node Nx and a corresponding node Ny; for each created edge, labeling the edge with a heuristic weight; labeling each node Nx with a node weight Wx that combines the respective edge weights ωxj of the edges incident on Nx, such that Wx = Σj ωxj; and determining a color for each of the graph nodes using k distinct colors wherein k is minimized such that no adjacent nodes joined by an edge between a respective node Nx and a corresponding node Ny have an identical color; and assigning a memory fetch unit to each of the k distinct colors.
    Type: Application
    Filed: August 21, 2007
    Publication date: February 26, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Marcel Mitran, Joran S.C. Siu, Alexander Vasilevskiy
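The coloring step in the abstract above can be sketched with a greedy graph coloring: process nodes by descending weight and give each the lowest color unused by its neighbours, so variables joined by a store/load edge land on distinct memory fetch units. Greedy coloring only approximates the minimal k the abstract calls for; all names here are illustrative:

```python
def assign_fetch_units(edges, weights):
    """Greedy sketch of the coloring step: each color corresponds to
    one memory fetch unit, and adjacent nodes (store/load-paired
    variables) must receive different colors."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    colour = {}
    # Heavier nodes (larger combined edge weight Wx) are colored first.
    for node in sorted(weights, key=weights.get, reverse=True):
        used = {colour[n] for n in neighbours.get(node, ()) if n in colour}
        c = 0
        while c in used:
            c += 1
        colour[node] = c
    return colour

edges = [("x", "y"), ("y", "z")]       # store/load dependency edges
weights = {"x": 3, "y": 5, "z": 1}     # node weights Wx
units = assign_fetch_units(edges, weights)
```

Here "x" and "z" share a fetch unit because they are not adjacent, while "y" gets its own, so two units suffice for this graph.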
  • Patent number: 7496732
    Abstract: A method and apparatus for using result-speculative data under run-ahead speculative execution is disclosed. In one embodiment, the uncommitted target data from instructions being run-ahead executed may be saved into an advance data table. This advance data table may be indexed by the lines in the instruction buffer containing the instructions for run-ahead execution. When the instructions are re-executed subsequent to the run-ahead execution, valid target data may be retrieved from the advance data table and supplied as part of a zero-clock bypass to support parallel re-execution. This may achieve parallel execution of dependent instructions. In other embodiments, the advance data table may be content-addressable-memory searchable on target registers and supply target data to general speculative execution.
    Type: Grant
    Filed: December 17, 2003
    Date of Patent: February 24, 2009
    Assignee: Intel Corporation
    Inventors: Sailesh Kottapalli, Richard W. Goe, Youngsoo Choi
  • Publication number: 20090049277
    Abstract: A semiconductor integrated circuit device is provided. The operating frequency generating component generates the operating frequency that serves as the timing reference for synchronizing processing among the circuits of the semiconductor integrated circuit. The extracting component extracts the critical path, the slowest path over which a data signal propagates between predetermined terminals inside the semiconductor integrated circuit. The instruction prefetch executing component prefetches an instruction relating to the critical path extracted by the extracting component. When the data signal traverses the critical path, the processing configuration changing component changes the processing configuration, using the instruction prefetch executing component, so that the data signal is transmitted within a predetermined cycle of the operating frequency.
    Type: Application
    Filed: June 3, 2008
    Publication date: February 19, 2009
    Applicant: OKI ELECTRIC INDUSTRY CO., LTD.
    Inventor: Koji Muranishi
  • Patent number: 7493621
    Abstract: An apparatus, program product and method initiate, in connection with a context switch operation, a prefetch of data likely to be used by a thread prior to resuming execution of that thread. As a result, once it is known that a context switch will be performed to a particular thread, data may be prefetched on behalf of that thread so that when execution of the thread is resumed, more of the working state for the thread is likely to be cached, or at least in the process of being retrieved into cache memory, thus reducing cache-related performance penalties associated with context switching.
    Type: Grant
    Filed: December 18, 2003
    Date of Patent: February 17, 2009
    Assignee: International Business Machines Corporation
    Inventors: Jeffrey Powers Bradford, Harold F. Kossman, Timothy John Mullins
  • Patent number: 7493447
    Abstract: Methods and related computer program products, systems, and devices for using a NAND flash as a program ROM are disclosed.
    Type: Grant
    Filed: May 3, 2006
    Date of Patent: February 17, 2009
    Assignee: Nuvoton Technology Corporation
    Inventor: Yi-Hsien Chuang
  • Patent number: 7490210
    Abstract: A system and method are described for a memory management processor which, using a table of reference addresses embedded in the object code, can open the appropriate memory pages to expedite the retrieval of information from memory referenced by instructions in the execution pipeline. A suitable compiler parses the source code and collects references to branch addresses, calls to other routines, or data references, and creates reference tables listing the addresses for these references at the beginning of each routine. These tables are received by the memory management processor as the instructions of the routine are beginning to be loaded into the execution pipeline, so that the memory management processor can begin opening memory pages where the referenced information is stored. Opening the memory pages where the referenced information is located before the instructions reach the instruction processor helps lessen memory latency delays which can greatly impede processing performance.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: February 10, 2009
    Assignee: Micron Technology, Inc.
    Inventor: Dean A. Klein
  • Patent number: 7487296
    Abstract: A multi-stride prefetcher includes a recurring prefetch table that in turn includes a stream table and an index table. The stream table includes a valid field and a tag field. The stream table also includes a thread number field to help support multi-threaded processor cores. The tag field stores a tag from an address associated with a cache miss. The index table includes fields for storing information characterizing a state machine. The fields include a learning bit. The multi-stride prefetcher prefetches data into a cache for a plurality of streams of cache misses, each stream having a plurality of strides.
    Type: Grant
    Filed: February 17, 2005
    Date of Patent: February 3, 2009
    Assignee: Sun Microsystems, Inc.
    Inventors: Sorin Iacobovici, Sudarshan Kadambi, Yuan C. Chou
  • Publication number: 20090031111
    Abstract: A method, apparatus, and computer program product dynamically select compiled instructions for execution. Static instructions for execution on a first execution unit and dynamic instructions for execution on a second execution unit are received. The throughput performance of the static instructions and the dynamic instructions is evaluated based on current states of the execution units. The static instructions or the dynamic instructions are selected for execution at runtime on the first execution unit or the second execution unit, respectively, based on the throughput performance of the instructions.
    Type: Application
    Filed: July 26, 2007
    Publication date: January 29, 2009
    Inventors: Deanna J. Chou, Jesse E. Craig, John Sargis, Jr., Daneyand J. Singley, Sebastian T. Ventrone
  • Publication number: 20090031112
    Abstract: A system and method are provided for stacking global variables associated with a plurality of tools. The method includes loading a first tool global variable into a memory and executing a first tool of a computer application, the computer application configured to automate human resource processes. The method includes responsive to a call to execute a second tool of the computer application, pushing the first tool global variable onto a stack. The method includes loading a second tool global variable into the memory and executing the second tool. The method includes responsive to completing execution of the second tool, popping the first tool global variable off the stack and loading the first tool global variable back into the memory.
    Type: Application
    Filed: July 27, 2007
    Publication date: January 29, 2009
    Applicant: SAP AG
    Inventors: Christian Behrens, Steffen Rotsch, Martin Scholz
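The push/pop discipline for tool global variables described above can be sketched as a runtime with one "currently loaded" slot backed by a stack: calling a nested tool pushes the caller's globals, and returning pops them back. The class and method names are illustrative, not from the patent:

```python
class ToolRuntime:
    """Sketch of the global-variable stacking above: `memory` holds the
    currently loaded tool's globals; the stack saves a caller's globals
    for the duration of a nested tool call."""

    def __init__(self):
        self.memory = None  # globals of the currently executing tool
        self.stack = []

    def call_tool(self, tool_globals):
        """Begin executing a tool: push the caller's globals (if any),
        then load the callee's globals into memory."""
        if self.memory is not None:
            self.stack.append(self.memory)
        self.memory = tool_globals

    def return_from_tool(self):
        """Finish the current tool: restore the caller's globals."""
        self.memory = self.stack.pop()

rt = ToolRuntime()
rt.call_tool({"tool": "first"})   # first tool's globals loaded
rt.call_tool({"tool": "second"})  # first tool's globals pushed
rt.return_from_tool()             # first tool's globals restored
```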
  • Patent number: 7484041
    Abstract: Systems and methods for improving the performance of a multiprocessor system by enabling a first processor to initiate the retrieval of data and the storage of the data in the cache memory of a second processor. One embodiment comprises a system having a plurality of processors coupled to a bus, where each processor has a corresponding cache memory. The processors are configured so that a first one of the processors can issue a preload command directing a target processor to load data into the target processor's cache memory. The preload command may be issued in response to a preload instruction in program code, or in response to an event. The first processor may include an explicit identifier of the target processor in the preload command, or the selection of the target processor may be left to another agent, such as an arbitrator coupled to the bus.
    Type: Grant
    Filed: April 4, 2005
    Date of Patent: January 27, 2009
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Takashi Yoshikawa