Prefetching Patents (Class 712/207)

GUARANTEED PREFETCH INSTRUCTION

Publication number: 20100306503

Abstract: A microprocessor includes a cache memory, an instruction set having first and second prefetch instructions each configured to instruct the microprocessor to prefetch a cache line of data from a system memory into the cache memory, and a memory subsystem configured to execute the first and second prefetch instructions. For the first prefetch instruction the memory subsystem is configured to forego prefetching the cache line of data from the system memory into the cache memory in response to a predetermined set of conditions. For the second prefetch instruction the memory subsystem is configured to complete prefetching the cache line of data from the system memory into the cache memory in response to the predetermined set of conditions.
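
The contrast between the two prefetch instructions can be illustrated with a small Python model. This is only a sketch, not the patented design: the class `MemorySubsystem`, its methods, and the choice of "fill queue full" as the predetermined condition are all assumptions made for illustration.

```python
class MemorySubsystem:
    """Toy model: a cache plus a bounded queue of in-flight line fills."""

    def __init__(self, fill_queue_slots=1):
        self.cache = set()        # cache lines already present
        self.fill_queue = []      # line fills currently in flight
        self.fill_queue_slots = fill_queue_slots

    def _must_forgo(self):
        # Assumed stand-in for the "predetermined set of conditions":
        # every fill-queue slot is busy.
        return len(self.fill_queue) >= self.fill_queue_slots

    def drain(self):
        """Complete all in-flight fills, moving their lines into the cache."""
        self.cache.update(self.fill_queue)
        self.fill_queue.clear()

    def prefetch(self, line):
        """First instruction: a hint that is dropped under the conditions."""
        if line in self.cache or line in self.fill_queue:
            return True
        if self._must_forgo():
            return False          # prefetch is forgone
        self.fill_queue.append(line)
        return True

    def prefetch_guaranteed(self, line):
        """Second instruction: completes even under the same conditions."""
        if self._must_forgo():
            self.drain()          # models waiting until a slot frees up
        if line not in self.cache and line not in self.fill_queue:
            self.fill_queue.append(line)
        return True
```

With a single fill-queue slot, a second call to `prefetch` is dropped while the first fill is in flight, whereas `prefetch_guaranteed` waits for the slot and still brings its line in.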

Type: Application

Filed: May 17, 2010

Publication date: December 2, 2010

Applicant: VIA TECHNOLOGIES, INC.

Inventors: G. Glenn Henry, Colin Eddy, Rodney E. Hooker
Apparatus and method for supporting execution of prefetch threads

Patent number: 7840761

Abstract: A processor executes one or more prefetch threads and one or more main computing threads. Each prefetch thread executes instructions ahead of a main computing thread to retrieve data for the main computing thread, such as data that the main computing thread may use in the immediate future. Data is retrieved for the prefetch thread and stored in a memory, such as data fetched from an external memory and stored in a buffer. A prefetch controller determines whether the memory is full. If the memory is full, a cache controller stalls at least one prefetch thread. The stall may continue until at least some of the data is transferred from the memory to a cache for use by at least one main computing thread. The stalled prefetch thread or threads are then reactivated.
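
The stall-and-reactivate behavior described above might look roughly like the following Python sketch. The class `PrefetchBuffer` and its method names are hypothetical, and the prefetch and cache controllers are collapsed into one object for brevity.

```python
from collections import deque


class PrefetchBuffer:
    """Toy model of a buffer filled by a prefetch thread and drained to a cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque()   # data fetched from external memory
        self.cache = []         # data made available to the main thread
        self.stalled = False    # True while the prefetch thread is stalled

    def prefetch(self, item):
        """Called by the prefetch thread; returns False if it had to stall."""
        if len(self.buffer) >= self.capacity:
            self.stalled = True
            return False
        self.buffer.append(item)
        return True

    def transfer_to_cache(self, count=1):
        """Move up to `count` buffered items to the cache, then lift the stall."""
        for _ in range(min(count, len(self.buffer))):
            self.cache.append(self.buffer.popleft())
        if len(self.buffer) < self.capacity:
            self.stalled = False
```

In this model a stalled prefetch thread simply retries `prefetch` once `stalled` is cleared by a transfer.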

Type: Grant

Filed: April 1, 2005

Date of Patent: November 23, 2010

Assignee: STMicroelectronics, Inc.

Inventors: Osvaldo M. Colavin, Davide Rizzo
Pre-tracing instructions for CGA coupled processor in inactive mode for execution upon switch to active mode and continuing pre-fetching cache miss instructions

Patent number: 7836277

Abstract: A method of managing an instruction cache and a processor using the method are provided. The processor may comprise a processor core which is operated either during an active mode or during an inactive mode wherein the processor core performs at least one instruction during the active mode, an instruction cache which pre-traces a first instruction and determines, during the inactive mode, whether the processor core will meet a cache miss with regard to the first instruction, wherein the first instruction is to be performed by the processor core during the active mode, a coarse-grained array which performs a second instruction during the inactive mode, and a configuration memory which stores configuration information of the coarse-grained array, wherein the coarse-grained array performs the second instruction using the configuration information.

Type: Grant

Filed: March 5, 2008

Date of Patent: November 16, 2010

Assignee: Samsung Electronics Co., Ltd.

Inventors: Il Hyun Park, Dong-Hoon Yoo, Dong Kwan Suh, Soojung Ryu, Jeongwook Kim
Converting victim writeback to a fill

Patent number: 7836262

Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.

Type: Grant

Filed: June 5, 2007

Date of Patent: November 16, 2010

Assignee: Apple Inc.

Inventors: Ramesh Gunna, Sudarshan Kadambi
Continuing execution in scout mode while a main thread resumes normal execution

Patent number: 7836281

Abstract: A system that facilitates improving performance of a processor during scout mode. During a normal-execution mode, the system executes instructions using a main thread. Upon encountering a stall condition during execution of the main thread, the system generates a checkpoint. The system then enters a scout mode, wherein instructions are speculatively executed by a speculative thread to prefetch future memory references, but results are not committed to the architectural state of the processor. Upon encountering a memory reference during scout mode, the system issues a prefetch for the memory reference. If the stall condition that caused the processor to enter scout mode is resolved, the system uses the checkpoint to resume execution of the main thread from the instruction that caused the stall condition, and simultaneously continues executing instructions in scout mode using the speculative thread from the point where the speculative thread left off.

Type: Grant

Filed: October 6, 2005

Date of Patent: November 16, 2010

Assignee: Oracle America, Inc.

Inventors: Marc Tremblay, Shailender Chaudhry
Computer Memory Architecture for Hybrid Serial and Parallel Computing Systems

Publication number: 20100287357

Abstract: In one embodiment, a serial processor is configured to execute software instructions in a software program in serial. A serial memory is configured to store data for use by the serial processor in executing the software instructions in serial. A plurality of parallel processors are configured to execute software instructions in the software program in parallel. A plurality of partitioned memory modules are provided and configured to store data for use by the plurality of parallel processors in executing software instructions in parallel. Accordingly, a processor/memory structure is provided that allows serial programs to use quick local serial memories and parallel programs to use partitioned parallel memories. The system may switch between a serial mode and a parallel mode. The system may incorporate pre-fetching commands of several varieties.

Type: Application

Filed: March 10, 2010

Publication date: November 11, 2010

Applicant: XMTT INC.

Inventor: Uzi Y. Vishkin
Determining target addresses for instruction flow changing instructions in a data processing apparatus

Patent number: 7831806

Abstract: A data processing apparatus comprises a processor for executing a stream of instructions, and a prefetch unit for prefetching instructions from a memory prior to sending those instructions to the processor for execution. The prefetch unit receives from the memory a plurality of prefetched instructions from sequential addresses in memory, and detects whether any prefetched instructions are an instruction flow changing instruction, and outputs a fetch address for a next instruction to be prefetched by the prefetch unit. Address generation logic is also provided which, for a selected prefetched instruction that is detected to be an instruction flow changing instruction, determines a target address to be output as the fetch address. Address generation logic has a first address generation path and a further generation path for determining the target address. The first address generation path generates the target address more quickly than the further address generation path.

Type: Grant

Filed: February 18, 2004

Date of Patent: November 9, 2010

Assignee: ARM Limited

Inventor: Paul Anthony Gilkerson
Enhanced single threaded execution in a simultaneous multithreaded microprocessor

Patent number: 7827389

Abstract: A method, system, and computer program product are provided for enhancing the execution of independent loads in a processing unit. The processing unit dispatches a first set of instructions in order from a first buffer for execution. The processing unit receives updated results from the execution of the first set of instructions. The processing unit updates, in a first register, at least one register entry associated with each instruction in the first set of instructions, with the updated results. The processing unit determines if the first set of instructions from the first buffer have completed execution. Responsive to the completed execution of the first set of instructions from the first buffer, the processing unit copies the set of entries from the first register to a second register.

Type: Grant

Filed: June 15, 2007

Date of Patent: November 2, 2010

Assignee: International Business Machines Corporation

Inventors: Hung Q. Le, Dung Q. Nguyen
Microprocessor with improved data stream prefetching using multiple transaction look-aside buffers (TLBs)

Patent number: 7822943

Abstract: Systems, methods and computer program products for improving data stream prefetching in a microprocessor are described herein.

Type: Grant

Filed: August 4, 2008

Date of Patent: October 26, 2010

Assignee: MIPS Technologies, Inc.

Inventor: Keith E. Diefendorff
Promoting and appending traces in an instruction processing circuit based upon a bias value

Patent number: 7814298

Abstract: A method, system and computer program product for promoting a trace in an instruction processing circuit is disclosed. They comprise determining if a current trace is promotable and determining if a next trace is appendable to the current trace. They include promoting the current trace and the next trace if the current trace is promotable and the next trace is appendable.

Type: Grant

Filed: November 16, 2007

Date of Patent: October 12, 2010

Assignee: Oracle America, Inc.

Inventors: Richard Thaik, John Gregory Favor, Joseph Rowlands, Leonard Eric Shar, Matthew Ashcraft
Pre-fetch circuit of semiconductor memory apparatus and control method of the same

Patent number: 7814247

Abstract: A pre-fetch circuit of a semiconductor memory apparatus can carry out a high-frequency operating test through a low-frequency channel of a test equipment. The pre-fetch circuit of a semiconductor memory apparatus can include: a pre-fetch unit for pre-fetching data bits in a first predetermined number; a plurality of registers provided in the first predetermined number, each of which latches a data in order or a data out of order of the pre-fetched data in response to different control signals; and a control unit for selectively activating the different control signals in response to a test mode signal, whereby some of the registers latch the data out of order.

Type: Grant

Filed: July 18, 2008

Date of Patent: October 12, 2010

Assignee: Hynix Semiconductor Inc.

Inventor: Young-Ju Kim
METHOD AND SYSTEM FOR DATA PREFETCHING FOR LOOPS BASED ON LINEAR INDUCTION EXPRESSIONS

Publication number: 20100250854

Abstract: An efficient and effective compiler data prefetching technique is disclosed in which memory accesses that may be prefetched are represented in linear induction expressions. Furthermore, indirect memory accesses indexed by other memory accesses of linear induction expressions in scalar loops may be prefetched.
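
As an illustration, an access whose address has the form base + stride * i is a linear induction expression in the loop index i, and an indirect access such as a[b[i]] is indexed by one. The Python sketch below lists the prefetch addresses a compiler could schedule for such a loop. The function names, the fixed prefetch `distance`, and the choice to prefetch the index array two distances ahead are assumptions for illustration, not details from the application.

```python
def linear_address(base, stride, i):
    """Address given by the linear induction expression base + stride * i."""
    return base + stride * i


def prefetch_schedule(n, distance, b_base, b_stride, a_base, a_stride, b_values):
    """Prefetch addresses issued per iteration of a loop that reads a[b[i]].

    The index array b is prefetched 2 * distance iterations ahead so that
    b[i + distance] is already available when the indirect target
    a[b[i + distance]] is prefetched.
    """
    schedule = []
    for i in range(n):
        issued = []
        ahead_b = i + 2 * distance
        if ahead_b < n:
            # Direct access: the address depends only on the loop index.
            issued.append(("b", linear_address(b_base, b_stride, ahead_b)))
        ahead_a = i + distance
        if ahead_a < n:
            # Indirect access: the index comes from another memory access.
            issued.append(("a", linear_address(a_base, a_stride, b_values[ahead_a])))
        schedule.append(issued)
    return schedule
```

Calling `prefetch_schedule(4, 1, 1000, 4, 5000, 8, [3, 0, 2, 1])` issues prefetches for both arrays in early iterations and none in the last one, where nothing remains to fetch ahead.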

Type: Application

Filed: March 16, 2010

Publication date: September 30, 2010

Inventor: Dz-ching Ju
Early resolving instructions

Patent number: 7805592

Abstract: Techniques are disclosed for handling control transfer instructions in pipelined processors. Such instructions may cause the sequence of subsequent instructions to change, and thus may require subsequent instructions to be deleted from the processor's pipeline. Pre-decode means (110) are provided for at least partially decoding control transfer instructions early in the pipeline. Subsequent instructions can then be prevented from progressing through the pipeline. The mechanism required to delete unwanted instructions is thereby simplified.

Type: Grant

Filed: October 7, 2002

Date of Patent: September 28, 2010

Assignee: Altera Corporation

Inventors: Nicholas Paul Joyce, Nigel Peter Topham
Multiprocessor Cache Prefetch With Off-Chip Bandwidth Allocation

Publication number: 20100241811

Abstract: Technologies are generally described for allocating available prefetch bandwidth among processor cores in a multiprocessor computing system. The prefetch bandwidth associated with an off-chip memory interface of the multiprocessor may be determined, partitioned, and allocated across multiple processor cores.
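
One simple way to partition a measured prefetch bandwidth is in proportion to a per-core demand score. The abstract does not specify a policy, so the proportional rule, the function name `allocate_prefetch_bandwidth`, and the demand scores in this Python sketch are assumptions.

```python
def allocate_prefetch_bandwidth(total_bandwidth, demand_by_core):
    """Split total_bandwidth across cores in proportion to their demand scores.

    demand_by_core maps a core identifier to a non-negative demand score.
    If every score is zero, the bandwidth is split evenly.
    """
    if not demand_by_core:
        return {}
    total_demand = sum(demand_by_core.values())
    if total_demand == 0:
        share = total_bandwidth / len(demand_by_core)
        return {core: share for core in demand_by_core}
    return {
        core: total_bandwidth * demand / total_demand
        for core, demand in demand_by_core.items()
    }
```

For example, with 8.0 units of off-chip prefetch bandwidth and demand scores of 3 and 1, the two cores receive 6.0 and 2.0 units respectively.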

Type: Application

Filed: March 20, 2009

Publication date: September 23, 2010

Inventor: Yan Solihin
Trace indexing via trace end addresses

Patent number: 7802077

Abstract: A new class of traces for a processing engine, called “extended blocks,” possess an architecture that permits possibly many entry points but only a single exit point. These extended blocks may be indexed based upon the address of the last instruction therein. Use of the new trace architecture provides several advantages, including reduction of instruction redundancies, dynamic block extension and a sharing of instructions among various extended blocks.

Type: Grant

Filed: June 30, 2000

Date of Patent: September 21, 2010

Assignee: Intel Corporation

Inventors: Stephen J. Jourdan, Lihu Rappoport, Ronny Ronen, Adi Yoaz
Memory control circuit and microprocessor system for pre-fetching instructions

Patent number: 7793085

Abstract: A memory control circuit for providing a small-circuit-size memory control circuit capable of reducing a branch penalty during the execution of a branch instruction in a CPU. A branch-destination buffer caches a branch-destination instruction and a branch-destination-instruction address determined by a branch instruction executed by the CPU. When the CPU executes a branch instruction thereafter, if the branch-destination-instruction address output from the CPU matches an instruction address in the branch-destination buffer, the corresponding branch-destination instruction stored in the branch-destination buffer is sent to the CPU. When a branch instruction is executed, an address comparison circuit compares the branch-destination-instruction address with the branch-source-instruction address.

Type: Grant

Filed: December 30, 2004

Date of Patent: September 7, 2010

Assignee: Fujitsu Semiconductor Limited

Inventor: Kenji Furuya
Graceful degradation in a trace-based processor

Patent number: 7783863

Abstract: A method of handling a trace to be aborted includes receiving an indication of a trace to be aborted and an indication of an abort reason corresponding to an execution of the trace to be aborted. The trace to be aborted has a trace type associated therewith and includes a sequence of the operations, and represents a sequence of at least two of the instructions. The method further includes identifying a corrective action based at least in part on the type of the trace to be aborted and on the abort reason, not taking into account a correspondence between the at least one operation that caused the execution to be aborted and the at least one instruction that the at least one operation at least in part represents. A next trace and its trace type is determined for execution, where the determining is based on the trace to be aborted and on the corrective action.

Type: Grant

Filed: October 24, 2007

Date of Patent: August 24, 2010

Assignee: Oracle America, Inc.

Inventors: Christopher Patrick Nelson, John Gregory Favor, Richard Win Thaik, Matthew William Ashcraft
System and method for implementing a hardware-supported thread assist under load lookahead mechanism for a microprocessor

Patent number: 7779234

Abstract: The present invention includes a system and method for implementing a hardware-supported thread assist under load lookahead mechanism for a microprocessor. According to an embodiment of the present invention, hardware thread-assist mode can be activated when one thread of the microprocessor is in a sleep mode. When load lookahead mode is activated, the fixed point unit copies the content of one or more architected facilities from an active thread to corresponding architected facilities in the first inactive thread. The load-store unit performs at least one speculative load in load lookahead mode and writes the results of the at least one speculative load to a duplicated architected facility in the first inactive thread.

Type: Grant

Filed: October 23, 2007

Date of Patent: August 17, 2010

Assignee: International Business Machines Corporation

Inventors: James W. Bishop, Hung Q. Le, Dung Q. Nguyen, Wolfram Sauer, Benjamin W. Stolt, Michael T. Vaden
Method and apparatus for dynamically managing instruction buffer depths for non-predicted branches

Patent number: 7779232

Abstract: A method and apparatus for dynamically managing instruction buffer depths for non-predicted branches reduces wasted energy and resources associated with low confidence branch prediction conditions. A portion of the instruction buffer for an instruction thread is allocated for storing predicted branch instruction streams and another portion, which may be zero-sized during high prediction confidence conditions, is allocated to the non-predicted branch instruction stream. The size of the buffers is adjusted dynamically in conformity with an on-going prediction confidence that provides a measure of how well branch prediction mechanisms are working for a given instruction thread. An alternate instruction fetch address table can be maintained and multiplexed with the main fetch address register for addressing the instruction cache, so that the instruction stream can be quickly shifted to the non-predicted path when a branch instruction is resolved to the non-predicted path.

Type: Grant

Filed: August 28, 2007

Date of Patent: August 17, 2010

Assignee: International Business Machines Corporation

Inventors: Richard W. Doing, Michael O. Klett, Kevin N. Magill, Brian R. Mestan, David Mui, Balaram Sinharoy, Jeffrey R. Summers
System and method for implementing a software-supported thread assist mechanism for a microprocessor

Patent number: 7779233

Abstract: A system and computer-implementable method for implementing software-supported thread assist within a data processing system, wherein the data processing system supports processing instructions within at least a first thread and a second thread. An instruction dispatch unit (IDU) places the first thread into a sleep mode. The IDU separates an instruction stream for the second thread into at least a first independent instruction stream and a second independent instruction stream. The first independent instruction stream is processed utilizing facilities allocated to the first thread and the second independent instruction stream is processed utilizing facilities allocated to the second thread.

Type: Grant

Filed: October 23, 2007

Date of Patent: August 17, 2010

Assignee: International Business Machines Corporation

Inventors: Hung Q. Le, Dung Q. Nguyen
Data processing system and method for processing data

Patent number: 7769954

Abstract: A data processing system includes: a cache memory comprising a plurality of ways, each of which stores a data line including a data and address information of the data; an analysis module that analyzes whether or not a data requested in a read instruction is to be used in a subsequent instruction to be executed within a predetermined time period after the execution of the read instruction is started; a mode selection module that selects one of a plurality of access modes for accessing the cache memory based on a result of the analysis module; and an access unit that accesses the cache memory in the selected one of the access modes when the read instruction is executed.

Type: Grant

Filed: March 29, 2007

Date of Patent: August 3, 2010

Assignee: Kabushiki Kaisha Toshiba

Inventor: Kenta Yasufuku
Method, apparatus, and program to efficiently calculate cache prefetching patterns for loops

Patent number: 7761667

Abstract: A mechanism is provided that identifies instructions that access storage and may be candidates for cache prefetching. The mechanism augments these instructions so that any given instance of the instruction operates in one of four modes, namely normal, unexecuted, data gathering, and validation. In the normal mode, the instruction merely performs the function specified in the software runtime environment. An instruction in unexecuted mode, upon the next execution, is placed in data gathering mode. When an instruction in the data gathering mode is encountered, the mechanism of the present invention collects data to discover potential fixed storage access patterns. When an instruction is in validation mode, the mechanism of the present invention validates the presumed fixed storage access patterns.
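
The four modes can be read as a small per-instruction state machine. The Python sketch below is one possible interpretation rather than the patented mechanism: the class `TrackedAccess`, the fixed sample count, the use of a constant stride as the "fixed storage access pattern", and the decision to start gathering on the same execution that leaves unexecuted mode are all assumptions.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    UNEXECUTED = "unexecuted"
    DATA_GATHERING = "data gathering"
    VALIDATION = "validation"


class TrackedAccess:
    """Per-instruction state for discovering a fixed-stride access pattern."""

    def __init__(self, samples_needed=3):
        self.mode = Mode.UNEXECUTED
        self.samples_needed = samples_needed
        self.addresses = []      # addresses seen while gathering
        self.stride = None       # presumed fixed stride, once discovered
        self.validated = False   # True once the stride has been confirmed

    def execute(self, address):
        """Record one execution of the instruction at the given address."""
        if self.mode is Mode.UNEXECUTED:
            self.mode = Mode.DATA_GATHERING
        if self.mode is Mode.DATA_GATHERING:
            self.addresses.append(address)
            if len(self.addresses) == self.samples_needed:
                deltas = {b - a for a, b in zip(self.addresses, self.addresses[1:])}
                if len(deltas) == 1:
                    self.stride = deltas.pop()   # one constant stride found
                    self.mode = Mode.VALIDATION
                else:
                    self.mode = Mode.NORMAL      # no fixed pattern
            return
        if self.mode is Mode.VALIDATION:
            # Check the presumed stride against one further access.
            self.validated = (address - self.addresses[-1] == self.stride)
            if not self.validated:
                self.stride = None
            self.mode = Mode.NORMAL
```

An instruction that touches addresses 100, 108 and 116 moves to validation with a presumed stride of 8, and a following access at 124 confirms it.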

Type: Grant

Filed: August 12, 2008

Date of Patent: July 20, 2010

Assignee: International Business Machines Corporation

Inventors: Christopher Michael Donawa, Allan Henry Kielstra
Processor instruction used to determine whether to perform a memory-related trap

Publication number: 20100153689

Abstract: Instruction execution includes fetching an instruction that comprises a first set of one or more bits identifying the instruction, and a second set of one or more bits associated with a first address value. It further includes executing the instruction to determine whether to perform a trap, wherein executing the instruction includes selecting from a plurality of tests at least one test for determining whether to perform a trap and carrying out the at least one test.

Type: Application

Filed: February 12, 2010

Publication date: June 17, 2010

Inventors: Jack Choquette, Gil Tene, Michael A. Wolf
Method and apparatus for multiple load instruction execution

Patent number: 7730288

Abstract: A method and apparatus for executing instructions. The method includes receiving a first load instruction and a second load instruction. The method also includes issuing the first load instruction and the second load instruction to a cascaded delayed execution pipeline unit having at least a first execution pipeline and a second execution pipeline, wherein the second execution pipeline executes an instruction in a common issue group in a delayed manner relative to another instruction in the common issue group executed in the first execution pipeline. The method also includes accessing a cache by executing the first load instruction and the second load instruction. A delay between execution of the first load instruction and the second load instruction allows the cache to complete the access with the first load instruction before beginning the access with the second load instruction.

Type: Grant

Filed: June 27, 2007

Date of Patent: June 1, 2010

Assignee: International Business Machines Corporation

Inventor: David Arnold Luick
Method for preloading data in a CPU pipeline

Patent number: 7730289

Abstract: A method for preloading data in a CPU pipeline is provided, which includes the following steps. When a hint instruction is executed, allocate and initiate an entry in a preload table. When a load instruction is fetched, load a piece of data from a memory into the entry according to the entry. When a use instruction which uses the data loaded by the load instruction is executed, forward the data for the use instruction from the entry instead of from the memory. When the load instruction is executed, update the entry according to the load instruction.

Type: Grant

Filed: September 27, 2007

Date of Patent: June 1, 2010

Assignee: Faraday Technology Corp.

Inventors: I-Jui Sung, Ming-Chung Kao
Alignment of cache fetch return data relative to a thread

Patent number: 7725659

Abstract: A method of obtaining data, comprising at least one sector, for use by at least a first thread wherein each processor cycle is allocated to at least one thread, includes the steps of: requesting data for at least a first thread; upon receipt of at least a first sector of the data, determining whether the at least first sector is aligned with the at least first thread, wherein a given sector is aligned with a given thread when a processor cycle in which the given sector will be written is allocated to the given thread; responsive to a determination that the at least first sector is aligned with the at least first thread, bypassing the at least first sector, wherein bypassing a sector comprises reading the sector while it is being written; and responsive to a determination that the at least first sector is not aligned with the at least first thread, delaying the writing of the at least first sector until the occurrence of a processor cycle allocated to the at least first thread by retaining the at least first s

Type: Grant

Filed: September 5, 2007

Date of Patent: May 25, 2010

Assignee: International Business Machines Corporation

Inventors: Michael Karl Gschwind, Hans Mikael Jacobson, Robert Alan Philhower
High-performance, superscalar-based computer system with out-of-order instruction execution

Patent number: 7721070

Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.

Type: Grant

Filed: September 22, 2008

Date of Patent: May 18, 2010

Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
METHOD FOR INCREASING CONFIGURATION RUNTIME OF TIME-SLICED CONFIGURATIONS

Publication number: 20100122064

Abstract: A device may include a data processing logic cell field and one or more sequential CPUs. The logic cell field and the CPUs may be configured to be coupled to each other for data exchange. The data exchange may be in block form using lines leading to a cache memory. In a method for operating a reconfigurable unit having runtime-limited configurations, the configurations may be able to increase their maximum allowed runtime, e.g., by triggering a parallel counter. An increase in configuration runtime by the configurations may be suppressed in response to an interrupt.

Type: Application

Filed: September 30, 2009

Publication date: May 13, 2010

Inventor: MARTIN VORBACH
Store stream prefetching in a microprocessor

Patent number: 7716427

Abstract: In a microprocessor having a load/store unit and prefetch hardware, the prefetch hardware includes a prefetch queue containing entries indicative of allocated data streams. A prefetch engine receives an address associated with a store instruction executed by the load/store unit. The prefetch engine determines whether to allocate an entry in the prefetch queue corresponding to the store instruction by comparing entries in the queue to a window of addresses encompassing multiple cache blocks, where the window of addresses is derived from the received address. The prefetch engine compares entries in the prefetch queue to a window of 2M contiguous cache blocks. The prefetch engine suppresses allocation of a new entry when any entry in the prefetch queue is within the address window. The prefetch engine further suppresses allocation of a new entry when the data address of the store instruction is equal to an address in a border area of the address window.
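
The allocation filter described here can be sketched in Python as follows. The abstract says only that the window is derived from the store address, so the aligned-window rule, the border width, the 128-byte block size, and the function name `should_allocate` are assumptions for illustration.

```python
def should_allocate(store_addr, queue_blocks, m=2, block_size=128, border=1):
    """Decide whether a store should allocate a new prefetch-queue entry.

    queue_blocks holds the cache-block numbers of the streams already
    allocated in the prefetch queue. The window is assumed to be the
    aligned group of 2 * m contiguous blocks containing the store.
    """
    block = store_addr // block_size
    window_size = 2 * m
    window_start = (block // window_size) * window_size
    window = range(window_start, window_start + window_size)

    # Suppress allocation when an existing entry falls inside the window.
    if any(b in window for b in queue_blocks):
        return False

    # Suppress allocation when the store lands in the window's border area.
    offset = block - window_start
    if offset < border or offset >= window_size - border:
        return False
    return True
```

With the defaults, a store to address 640 (block 5, window of blocks 4 to 7) allocates a new entry only if no queued stream already lies in blocks 4 to 7.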

Type: Grant

Filed: January 4, 2008

Date of Patent: May 11, 2010

Assignee: International Business Machines Corporation

Inventors: John Barry Griswell, Jr., Hung Qui Le, Francis Patrick O'Connell, William J. Starke, Jeffrey Adam Stuecheli, Albert Thomas Williams
VIRTUAL MACHINE CONTROL METHOD AND VIRTUAL MACHINE SYSTEM

Publication number: 20100115513

Abstract: Provided is a virtual machine including a first virtualization module operating on a physical CPU, for providing a first CPU, and a second virtualization module operating on the first CPU, for providing a second CPU. The second virtualization module includes first processor control information holding a state of the first CPU obtained at a time of execution of the user program. The first virtualization module includes second processor control information containing a state of the physical CPU obtained at the time of the execution of the second virtualization module, third processor control information containing a state of the physical CPU obtained at the time of the execution of the user program, and prefetch entry information in which information to be prefetched from the third processor control information is set, and, upon detection of an event, the information set in the prefetch entry information is reflected to the first processor control information.

Type: Application

Filed: October 30, 2009

Publication date: May 6, 2010

Inventors: Toshiomi MORIKI, Naoya Hattori, Yuji Tsushima
System, method and software to preload instructions from an instruction set other than one currently executing

Patent number: 7711927

Abstract: An instruction preload instruction executed in a first processor instruction set operating mode is operative to correctly preload instructions in a different, second instruction set. The instructions are pre-decoded according to the second instruction set encoding in response to an instruction set preload indicator (ISPI). In various embodiments, the ISPI may be set prior to executing the preload instruction, or may comprise part of the preload instruction or the preload target address.

Type: Grant

Filed: March 14, 2007

Date of Patent: May 4, 2010

Assignee: QUALCOMM Incorporated

Inventors: Thomas Andrew Sartorius, Brian Michael Stempel, Rodney Wayne Smith
Computer memory architecture for hybrid serial and parallel computing systems

Patent number: 7707388

Abstract: In one embodiment, a serial processor is configured to execute software instructions in a software program in serial. A serial memory is configured to store data for use by the serial processor in executing the software instructions in serial. A plurality of parallel processors are configured to execute software instructions in the software program in parallel. A plurality of partitioned memory modules are provided and configured to store data for use by the plurality of parallel processors in executing software instructions in parallel. Accordingly, a processor/memory structure is provided that allows serial programs to use quick local serial memories and parallel programs to use partitioned parallel memories. The system may switch between a serial mode and a parallel mode. The system may incorporate pre-fetching commands of several varieties.

Type: Grant

Filed: November 29, 2006

Date of Patent: April 27, 2010

Assignee: XMTT Inc.

Inventor: Uzi Vishkin
Dynamic prefetch distance calculation

Patent number: 7702856

Abstract: The prefetch distance to be used by a prefetch instruction may not always be correctly calculated using compile-time information. In one embodiment, the present invention generates prefetch distance calculation code to dynamically calculate a prefetch distance used by a prefetch instruction at run-time.
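
The abstract does not give the calculation itself. A common textbook heuristic, shown here purely as an assumed example in Python, is to prefetch far enough ahead to cover the memory latency: the latency divided by the time per loop iteration, rounded up and clamped to the trip count. All parameter names are hypothetical run-time inputs.

```python
import math


def prefetch_distance(memory_latency_cycles, cycles_per_iteration, trip_count):
    """Iterations to prefetch ahead, computed from values known at run time."""
    if cycles_per_iteration <= 0:
        raise ValueError("cycles_per_iteration must be positive")
    # Enough iterations ahead to hide one memory access.
    distance = math.ceil(memory_latency_cycles / cycles_per_iteration)
    # Stay at least one iteration ahead, but never beyond the loop itself.
    return max(1, min(distance, trip_count))
```

For a 200-cycle latency and 25 cycles per iteration this yields a distance of 8 iterations, shrinking to 4 if the loop runs only 4 times.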

Type: Grant

Filed: November 9, 2005

Date of Patent: April 20, 2010

Assignee: Intel Corporation

Inventors: Rakesh Krishnaiyer, Somnath Ghosh, Abhay Kanhere
Branch predictor directed prefetch

Patent number: 7702888

Abstract: An apparatus for executing branch predictor directed prefetch operations. During operation, a branch prediction unit may provide an address of a first instruction to the fetch unit. The fetch unit may send a fetch request for the first instruction to the instruction cache to perform a fetch operation. In response to detecting a cache miss corresponding to the first instruction, the fetch unit may execute one or more prefetch operation while the cache miss corresponding to the first instruction is being serviced. The branch prediction unit may provide an address of a predicted next instruction in the instruction stream to the fetch unit. The fetch unit may send a prefetch request for the predicted next instruction to the instruction cache to execute the prefetch operation. The fetch unit may store prefetched instruction data obtained from a next level of memory in the instruction cache or in a prefetch buffer.

Type: Grant

Filed: February 28, 2007

Date of Patent: April 20, 2010

Assignee: GlobalFoundries Inc.

Inventors: Marius Evers, Trivikram Krishnamurthy
CHANNEL COMMAND WORD PRE-FETCHING APPARATUS

Publication number: 20100082948

Abstract: In a CCW fetching section, for each input/output device being a control objective, a result prediction table in which prediction values of status values to be returned from an input/output device as execution results of CCW commands, is referred to. Then, based on the prediction values, commands being pre-fetching objectives are pre-fetched from a CCW program stored in a memory, and transmitted to a CCW executing section. On the other hand, in the CCW executing section, the pre-fetched commands are sequentially executed, and the actual status values as the execution results are received from the input/output device. Then, when the received actual status values are not the same as the predicted status values, success or failure in prediction is notified to the CCW fetching section, and also, the result prediction table is updated in the CCW fetching section.

Type: Application

Filed: June 29, 2009

Publication date: April 1, 2010

Applicant: FUJITSU LIMITED

Inventors: Tsukasa Matsuda, Hideki Yamanaka
System and method for improving the page crossing performance of a data prefetcher

Patent number: 7689774

Abstract: A system and method for improving the page crossing performance of a data prefetcher is presented. A prefetch engine tracks times at which a data stream terminates due to a page boundary. When a certain percentage of data streams terminate at page boundaries, the prefetch engine sets an aggressive profile flag. In turn, when the data prefetch engine receives a real address that corresponds to the beginning/end of a new page, and the aggressive profile flag is set, the prefetch engine uses an aggressive startup profile to generate and schedule prefetches on the assumption that the real address is highly likely to be the continuation of a long data stream. As a result, the system and method minimize latency when crossing real page boundaries when a program is predominately accessing long streams.

Type: Grant

Filed: April 6, 2007

Date of Patent: March 30, 2010

Assignee: International Business Machines Corporation

Inventors: Francis Patrick O'Connell, Jeffrey A. Stuecheli
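
The flag logic described in the abstract for patent 7689774 can be sketched as follows. This is an illustration only: the class name, the 50% threshold, the 4 KiB page size, and the 128-byte line size are assumptions, since the abstract says only "a certain percentage" and gives no sizes:

```python
class PageCrossingTracker:
    """Tracks how often data streams end at a page boundary (hypothetical model)."""

    def __init__(self, threshold=0.5, page_size=4096, line_size=128):
        self.threshold = threshold      # assumed fraction that sets the flag
        self.page_size = page_size      # assumed real page size in bytes
        self.line_size = line_size      # assumed cache line size in bytes
        self.total_streams = 0
        self.boundary_streams = 0

    def record_stream_end(self, ended_at_page_boundary):
        """Count one terminated stream and whether a page boundary ended it."""
        self.total_streams += 1
        if ended_at_page_boundary:
            self.boundary_streams += 1

    @property
    def aggressive_flag(self):
        """True once enough streams have terminated at page boundaries."""
        if self.total_streams == 0:
            return False
        return self.boundary_streams / self.total_streams >= self.threshold

    def startup_profile(self, real_address):
        """Pick a startup profile for a new stream beginning at real_address."""
        offset = real_address % self.page_size
        at_page_edge = offset < self.line_size or offset >= self.page_size - self.line_size
        return "aggressive" if (self.aggressive_flag and at_page_edge) else "normal"
```

An address in the first or last cache line of a page is treated here as "the beginning/end of a new page"; that reading is an assumption as well.
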
System using stream prefetching history to improve data prefetching performance

Patent number: 7689775

Abstract: Computer implemented method, system and computer program product for prefetching data in a data processing system. A computer implemented method for prefetching data in a data processing system includes generating attribute information of prior data streams by associating attributes of each prior data stream with a storage access instruction which caused allocation of the data stream, and then recording the generated attribute information. The recorded attribute information is accessed, and a behavior of a new data stream is modified using the accessed recorded attribute information.

Type: Grant

Filed: March 9, 2009

Date of Patent: March 30, 2010

Assignee: International Business Machines Corporation

Inventors: John Barry Griswell, Jr., Francis Patrick O'Connell
Locked prefetch scheduling in general cyclic regions

Patent number: 7681188

Abstract: One embodiment of the present invention provides a system that facilitates locked prefetch scheduling in general cyclic regions of a computer program. The system operates by first receiving a source code for the computer program and compiling the source code into intermediate code. The system then performs a trace detection on the intermediate code. Next, the system inserts prefetch instructions and corresponding locks into the intermediate code. Finally, the system generates executable code from the intermediate code, wherein a lock for a given prefetch instruction prevents subsequent prefetches from being issued until the data value returns for the given prefetch instruction.

Type: Grant

Filed: April 29, 2005

Date of Patent: March 16, 2010

Assignee: Sun Microsystems, Inc.

Inventors: Partha P. Tirumalai, Spiros Kalogeropulos, Yonghong Song
System, method and software to preload instructions from a variable-length instruction set with proper pre-decoding

Patent number: 7676659

Abstract: In a processor executing instructions from a variable-length instruction set, a preload instruction is operative to retrieve from memory a data block corresponding to an instruction cache line, pre-decode instructions from a variable-length instruction set in the data block, and load the instructions and pre-decode information into the instruction cache. An instruction execution unit indicates to a pre-decoder the position within the data block of a first valid instruction. The pre-decoder successively determines the length of each instruction and hence the instruction boundaries. An instruction cache line offset indicator that identifies the position of the first valid instruction may be generated and provided to the pre-decoder in a variety of ways.

Type: Grant

Filed: April 4, 2007

Date of Patent: March 9, 2010

Assignee: QUALCOMM Incorporated

Inventors: Brian Michael Stempel, Thomas Andrew Sartorius, Rodney Wayne Smith
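
The boundary-finding step in the abstract for patent 7676659 can be illustrated with a short sketch. The encoding used here is hypothetical (a set top bit in the first byte marks a 4-byte instruction, otherwise 2 bytes) and is not taken from the patent or from any real instruction set:

```python
def instruction_length(first_byte):
    """Hypothetical length rule: top bit set means 4 bytes, otherwise 2 bytes."""
    return 4 if first_byte & 0x80 else 2

def predecode_boundaries(block, first_valid_offset):
    """Return the start offset of each instruction in a cache-line-sized block.

    Decoding starts at first_valid_offset, the position supplied to the
    pre-decoder, and walks forward by each instruction's length in turn.
    """
    boundaries = []
    offset = first_valid_offset
    while offset < len(block):
        boundaries.append(offset)
        offset += instruction_length(block[offset])
    return boundaries
```

Because each boundary depends on the one before it, starting at the wrong offset would misidentify every later instruction, which is why the position of the first valid instruction has to be supplied.
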
PROCESSOR AND EARLY-LOAD METHOD THEREOF

Publication number: 20100049947

Abstract: A processor and an early-load method thereof are provided. In the early-load method, an instruction is fetched and determined in an instruction fetch stage to obtain a determination result. Whether to early-load an early-loaded data corresponding to the instruction is determined according to the determination result. A target data is fetched according to the instruction in an instruction execution stage if the early-loaded data is not loaded correctly. The early-loaded data serves as the target data if the early-loaded data is loaded correctly.

Type: Application

Filed: August 22, 2008

Publication date: February 25, 2010

Applicant: FARADAY TECHNOLOGY CORP.

Inventors: Shun-Chieh Chang, Yuan-Hwa Li, Yuan-Jung Kuo, Chin-Ling Huang, Chung-Ping Chung
Fine-grained software-directed data prefetching using integrated high-level and low-level code analysis optimizations

Patent number: 7669194

Abstract: A mechanism for minimizing effective memory latency without unnecessary cost through fine-grained software-directed data prefetching using integrated high-level and low-level code analysis and optimizations is provided. The mechanism identifies and classifies streams, identifies data that is most likely to incur a cache miss, exploits effective hardware prefetching to determine the proper number of streams to be prefetched, exploits effective data prefetching on different types of streams in order to eliminate redundant prefetching and avoid cache pollution, and uses high-level transformations with integrated lower level cost analysis in the instruction scheduler to schedule prefetch instructions effectively.

Type: Grant

Filed: August 26, 2004

Date of Patent: February 23, 2010

Assignee: International Business Machines Corporation

Inventors: Roch Georges Archambault, Robert James Blainey, Yaoqing Gao, Allan Russell Martin, James Lawrence McInnes, Francis Patrick O'Connell
Microprocessor with improved data stream prefetching

Patent number: 7664920

Abstract: A microprocessor includes a hierarchical memory subsystem, an instruction decoder, and a stream prefetch unit. The decoder decodes an instruction that specifies a locality characteristic parameter. In one embodiment, the parameter specifies a relative urgency with which a data stream specified by the instruction is needed rather than specifying exactly which of the cache memories in the hierarchy to prefetch the data stream into. The prefetch unit selects one of the cache memory levels in the hierarchy for prefetching the data stream into based on the memory subsystem configuration and on the relative urgency. In another embodiment, the prefetch unit instructs the memory subsystem to mark the prefetched cache line for early, late, or normal eviction according to its cache line replacement policy based on the parameter value.

Type: Grant

Filed: August 11, 2006

Date of Patent: February 16, 2010

Assignee: MIPS Technologies, Inc.

Inventor: Keith E. Diefendorff
Recovering a subordinate strand from a branch misprediction using state information from a primary strand

Patent number: 7664942

Abstract: Embodiments of the present invention provide a system that executes program code in a processor. The system starts by executing the program code in a normal mode using a primary strand while concurrently executing the program code ahead of the primary strand using a subordinate strand in a scout mode. Upon resolving a branch using the subordinate strand, the system records a resolution for the branch in a speculative branch resolution table. Upon subsequently encountering the branch using the primary strand, the system uses the recorded resolution from the speculative branch resolution table to predict a resolution for the branch for the primary strand. Upon determining that the resolution of the branch was mispredicted for the primary strand, the system determines that the subordinate strand mispredicted the branch. The system then recovers the subordinate strand to the branch and restarts the subordinate strand executing the program code.

Type: Grant

Filed: August 25, 2008

Date of Patent: February 16, 2010

Assignee: Sun Microsystems, Inc.

Inventors: Marc Tremblay, Shailender Chaudhry
Apparatus and Methods for Speculative Interrupt Vector Prefetching

Publication number: 20100036987

Abstract: Techniques for interrupt processing are described. An exceptional condition is detected in one or more stages of an instruction pipeline in a processor. In response to the detected exceptional condition and prior to the processor accepting an interrupt in response to the detected exceptional condition, an instruction cache is checked for the presence of an instruction at a starting address of an interrupt handler. The instruction at the starting address of the interrupt vector table is prefetched from storage above the instruction cache when the instruction is not present in the instruction cache to load the instruction in the instruction cache, whereby the instruction is made available in the instruction cache by the time the processor accepts the interrupt in response to the detected exceptional condition.

Type: Application

Filed: August 8, 2008

Publication date: February 11, 2010

Applicant: QUALCOMM INCORPORATED

Inventors: Daren Eugene Streett, Brian Michael Stempel
System and method for processor with predictive memory retrieval assist

Patent number: 7657723

Abstract: A system and method are described for a memory management processor which, using a table of reference addresses embedded in the object code, can open the appropriate memory pages to expedite the retrieval of information from memory referenced by instructions in the execution pipeline. A suitable compiler parses the source code and collects references to branch addresses, calls to other routines, or data references, and creates reference tables listing the addresses for these references at the beginning of each routine. These tables are received by the memory management processor as the instructions of the routine are beginning to be loaded into the execution pipeline, so that the memory management processor can begin opening memory pages where the referenced information is stored. Opening the memory pages where the referenced information is located before the instructions reach the instruction processor helps lessen memory latency delays which can greatly impede processing performance.

Type: Grant

Filed: January 28, 2009

Date of Patent: February 2, 2010

Assignee: Micron Technology, Inc.

Inventor: Dean A. Klein
Instruction dispatch scheduler employing round-robin apparatus supporting multiple thread priorities for use in multithreading microprocessor

Patent number: 7657883

Abstract: A dispatch scheduler in a multithreading microprocessor is disclosed. Each of N concurrently executing threads has one of P priorities. P N-bit round-robin vectors are generated, each being a 1-bit left-rotated and subsequently sign-extended version of an N-bit 1-hot input vector indicating the last thread selected for dispatching at the priority. N P-input muxes each receive a corresponding one of the N bits of each of the P round-robin vectors and selects the input specified by the thread priority. Selection logic selects an instruction for dispatching from the thread having a dispatch value greater than or equal to any of the threads left thereof in the N-bit input vectors. The dispatch value of each of the threads comprises a least-significant bit equal to the corresponding P-input mux output, a most-significant bit that is true if the instruction is dispatchable, and middle bits comprising the priority of the thread.

Type: Grant

Filed: March 22, 2005

Date of Patent: February 2, 2010

Assignee: MIPS Technologies, Inc.

Inventor: Michael Gottlieb Jensen
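
The dispatch-value scheme in the abstract for patent 7657883 can be modeled in software as a rough sketch. It assumes that bit 0 is the rightmost thread, that "sign-extended" means every bit at or above the rotated position is set, and that ties go to the rightmost thread; the function names are hypothetical:

```python
def round_robin_vector(last_onehot, n):
    """Rotate the 1-hot 'last selected' vector left by one bit, then
    set every bit from the rotated position up to bit n-1."""
    mask = (1 << n) - 1
    rotated = ((last_onehot << 1) | (last_onehot >> (n - 1))) & mask
    return mask & ~(rotated - 1)

def select_thread(priorities, dispatchable, last_selected, priority_bits):
    """Return the index of the thread to dispatch, or None if none can dispatch.

    priorities[t]    : priority of thread t (0 .. P-1)
    dispatchable[t]  : True if thread t has a dispatchable instruction
    last_selected[p] : 1-hot vector of the thread last selected at priority p
    """
    n = len(priorities)
    best, best_value = None, -1
    for t in range(n - 1, -1, -1):  # scan from the leftmost thread to the rightmost
        rr = round_robin_vector(last_selected[priorities[t]], n)
        rr_bit = (rr >> t) & 1
        # MSB: dispatchable, middle bits: priority, LSB: round-robin bit
        value = (int(dispatchable[t]) << (priority_bits + 1)) | (priorities[t] << 1) | rr_bit
        if value >= best_value:     # a thread wins if it is >= every thread to its left
            best, best_value = t, value
    return best if dispatchable[best] else None
```

With four equal-priority threads and thread 1 selected last, threads 2 and 3 receive a round-robin bit of 1, so thread 2 is chosen next; if neither can dispatch, selection wraps around to thread 0.
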
Branch target aware instruction prefetching technique

Patent number: 7647477

Abstract: Inspecting a currently fetched instruction group and determining branching behavior of the currently fetched instruction group allows for intelligent instruction prefetching. A currently fetched instruction group is predecoded and, assuming the currently fetched instruction group includes a branch type instruction, a branch target is characterized in relation to a fetch boundary, which delimits a memory region contiguous with the memory region that hosts the currently fetched instruction group. Instruction prefetching is included based, at least in part, on the predecoded characterization of the branch target.

Type: Grant

Filed: November 23, 2004

Date of Patent: January 12, 2010

Assignee: Sun Microsystems, Inc.

Inventors: Paul Caprioli, Shailender Chaudhry
Memory control circuit and integrated circuit

Publication number: 20100005251

Abstract: The memory unit is compatible with a plurality of operation modes. The plurality of operation modes include the normal mode allowing access and the standby mode consuming a lower power than the normal mode. The branch detection section detects a branch instruction from an instruction fetched from the memory unit by the CPU. The mode control section changes an operation mode of the memory unit according to a detection result by the branch detection section.

Type: Application

Filed: December 23, 2008

Publication date: January 7, 2010

Applicant: NEC ELECTRONICS CORPORATION

Inventor: Kiminari Yamazoe
METHODS AND APPARATUS FOR DYNAMIC PREDICTION BY SOFTWARE

Publication number: 20090313456

Abstract: A method, storage medium, processor instruction and processor for specifying a value in a first portion of a conditional pre-fetch instruction associated with a branch instruction used for effectuating a branch operation, specifying a target instruction address in a second portion of the instruction, evaluating the value to determine whether a condition is met, and pre-fetching one or more instructions starting at the target instruction address into an instruction buffer of the processor when the condition is met, is provided.

Type: Application

Filed: August 13, 2009

Publication date: December 17, 2009

Applicant: SONY COMPUTER ENTERTAINMENT INC.

Inventors: Masahiro Yasue, Akiyuki Hatakeyama
Instruction issue control within a multithreaded processor

Publication number: 20090313455

Abstract: A multithreaded processor is provided with a saturating counter which serves to generate a thread preference signal to steer selection of which program thread operations are taken from for issue into the multiple processor pipelines. The counter is updated based upon the selections made for issue. The counter is a saturating counter and its sign bit may be used as a thread preference signal when discriminating between two threads. The update made to the count value can be weighted depending upon programmable priorities associated with the respective threads as well as a weighting based upon the time taken to execute the type of operation selected.

Type: Application

Filed: December 15, 2005

Publication date: December 17, 2009

Inventors: David Hennah Mansell, Stuart David Biles
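
The saturating counter in the abstract for publication 20090313455 can be sketched for the two-thread case. The counter width, the direction of each update, and the class and method names are assumptions made for illustration; the abstract states only that the counter saturates, that its sign bit can serve as the preference signal, and that updates may be weighted:

```python
class ThreadPreferenceCounter:
    """Signed saturating counter whose sign selects the preferred thread."""

    def __init__(self, bits=4):
        self.min_value = -(1 << (bits - 1))
        self.max_value = (1 << (bits - 1)) - 1
        self.value = 0

    def preferred_thread(self):
        # Assumption: sign bit set (negative count) means prefer thread 1.
        return 1 if self.value < 0 else 0

    def record_issue(self, thread, weight=1):
        """Update the count after an operation from `thread` is issued.

        Assumption: issuing from thread 0 moves the count toward preferring
        thread 1, and vice versa; `weight` stands in for the priority- and
        execution-time-based weighting mentioned in the abstract.
        """
        delta = -weight if thread == 0 else weight
        self.value = max(self.min_value, min(self.max_value, self.value + delta))
```

Saturation keeps a long run of issues from one thread from building up an unbounded debt that the other thread would then take arbitrarily long to repay.
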
