Prefetching Patents (Class 712/207)
  • Patent number: 7484041
    Abstract: Systems and methods for improving the performance of a multiprocessor system by enabling a first processor to initiate the retrieval of data and the storage of the data in the cache memory of a second processor. One embodiment comprises a system having a plurality of processors coupled to a bus, where each processor has a corresponding cache memory. The processors are configured so that a first one of the processors can issue a preload command directing a target processor to load data into the target processor's cache memory. The preload command may be issued in response to a preload instruction in program code, or in response to an event. The first processor may include an explicit identifier of the target processor in the preload command, or the selection of the target processor may be left to another agent, such as an arbitrator coupled to the bus.
    Type: Grant
    Filed: April 4, 2005
    Date of Patent: January 27, 2009
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Takashi Yoshikawa
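    A small C sketch of the preload command this abstract describes; the encoding, the field names, and the TARGET_ANY convention are invented for illustration and are not taken from the patent:

      /* Preload command of patent 7484041 (hypothetical encoding): the first
         processor directs a target processor to load data into the target's
         own cache; the target may be named explicitly or left to another
         agent, such as an arbitrator coupled to the bus. */
      #include <stdint.h>

      #define TARGET_ANY (-1)  /* let the bus arbitrator pick the target */

      struct preload_cmd {
          uint64_t addr;       /* data the target should load into its cache */
          int      target_id;  /* explicit target processor, or TARGET_ANY */
      };

      /* Issued in response to a preload instruction in program code or to
         an event observed by the first processor. */
      struct preload_cmd make_preload(uint64_t addr, int target_id)
      {
          struct preload_cmd c = { addr, target_id };
          return c;
      }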
  • Publication number: 20090024835
    Abstract: A system and method for pre-fetching data from system memory. A multi-core processor accesses a cache hit predictor concurrently with sending a memory request to a cache subsystem. The predictor has two tables. The first table is indexed by a portion of a memory address and provides a hit prediction based on a first counter value. The second table is indexed by a core number and provides a hit prediction based on a second counter value. If neither table predicts a hit, a pre-fetch request is sent to memory. In response to detecting said hit prediction is incorrect, the pre-fetch is cancelled.
    Type: Application
    Filed: July 19, 2007
    Publication date: January 22, 2009
    Inventors: Michael K. Fertig, Patrick Conway, Kevin Michael Lepak, Cissy Xumin Yuan
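    A minimal C sketch of the two-table predictor described above; the table sizes, the line-address index bits, and the 2-bit saturating counters are assumptions for illustration, not details from the filing:

      #include <stdbool.h>
      #include <stdint.h>

      #define ADDR_TABLE_SIZE 1024
      #define MAX_CORES       8

      static uint8_t addr_ctr[ADDR_TABLE_SIZE]; /* first table: indexed by address bits */
      static uint8_t core_ctr[MAX_CORES];       /* second table: indexed by core number */

      static bool predicts_hit(uint8_t c) { return c >= 2; } /* 2-bit counter, hit at >= 2 */

      /* Consulted concurrently with the cache lookup: predict a hit if
         either table does; otherwise a prefetch request goes to memory. */
      bool predict_hit(uint64_t addr, unsigned core)
      {
          unsigned i = (addr >> 6) & (ADDR_TABLE_SIZE - 1); /* line-address bits */
          return predicts_hit(addr_ctr[i]) || predicts_hit(core_ctr[core]);
      }

      /* Train both counters on the real outcome; a wrong hit prediction would
         also cancel the in-flight prefetch. */
      void train(uint64_t addr, unsigned core, bool was_hit)
      {
          unsigned i = (addr >> 6) & (ADDR_TABLE_SIZE - 1);
          if (was_hit) {
              if (addr_ctr[i] < 3)    addr_ctr[i]++;
              if (core_ctr[core] < 3) core_ctr[core]++;
          } else {
              if (addr_ctr[i] > 0)    addr_ctr[i]--;
              if (core_ctr[core] > 0) core_ctr[core]--;
          }
      }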
  • Patent number: 7480769
    Abstract: A microprocessor coupled to a system memory includes a load request signal that requests data be loaded from the system memory into the microprocessor in response to a load instruction. The load request signal includes a load virtual page address. The microprocessor also includes a prefetch request signal that requests a cache line be prefetched from the system memory into the microprocessor in response to a prefetch instruction. The prefetch request signal includes a prefetch virtual page address.
    Type: Grant
    Filed: August 11, 2006
    Date of Patent: January 20, 2009
    Assignee: MIPS Technologies, Inc.
    Inventors: Keith E. Diefendorff, Thomas A. Petersen
  • Patent number: 7480783
    Abstract: Disclosed are systems for loading an unaligned word from a specified unaligned word address in a memory, the unaligned word comprising a plurality of indexed portions crossing a word boundary, a method of operating the system comprising: loading a first aligned word commencing at an aligned word address rounded from the specified unaligned word address; identifying an index representing the location of the unaligned word address relative to the aligned word address; loading a second aligned word commencing at an aligned word address rounded from a second unaligned word address; and combining indexed portions of the first and second aligned words using the identified index to construct the unaligned word.
    Type: Grant
    Filed: August 19, 2004
    Date of Patent: January 20, 2009
    Assignees: STMicroelectronics Limited, Hewlett-Packard Company
    Inventors: Mark O. Homewood, Paolo Faraboschi
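    The combining step lends itself to a short C sketch; this one assumes 32-bit words and little-endian byte order, neither of which is specified by the abstract:

      #include <stdint.h>
      #include <string.h>

      /* Read the aligned 32-bit word at byte offset off (off % 4 == 0). */
      static uint32_t load_aligned32(const uint8_t *mem, size_t off)
      {
          uint32_t w;
          memcpy(&w, mem + off, sizeof w);
          return w;
      }

      uint32_t load_unaligned32(const uint8_t *mem, size_t off)
      {
          size_t   aligned = off & ~(size_t)3; /* round down to a word address */
          unsigned index   = off & 3;          /* location relative to that word */
          uint32_t first   = load_aligned32(mem, aligned);

          if (index == 0)
              return first;                    /* no word boundary is crossed */

          /* second aligned word, then combine the indexed portions */
          uint32_t second = load_aligned32(mem, aligned + 4);
          return (first >> (8 * index)) | (second << (8 * (4 - index)));
      }

    For index 1, the result takes the upper three bytes of the first word and the lowest byte of the second.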
  • Publication number: 20090019261
    Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.
    Type: Application
    Filed: September 22, 2008
    Publication date: January 15, 2009
    Applicant: Seiko Epson Corporation
    Inventors: Le Trong NGUYEN, Derek J. LENTZ, Yoshiyuki MIYAYAMA, Sanjiv GARG, Yasuaki HAGIWARA, Johannes WANG, Te-Li LAU, Sze-Shun WANG, Quang H. TRANG
  • Publication number: 20090019260
    Abstract: Disclosed herein is a mass prefetching method for disk arrays. In order to improve disk read performance for a non-sequential read having spatial locality as well as a sequential read, when a host requests a block to be read, all the blocks of the strip to which the block belongs are read. This is designated as strip prefetching (SP). Throttled Strip Prefetching (TSP), proposed in the present invention, investigates whether SP is beneficial by an online disk simulation, and does not perform SP if it is determined that SP is not beneficial. Since all prefetching operations of TSP are aligned in the strip of the disk array, the disk independence loss is resolved, and thus the performance of disk arrays is improved for concurrent sequential reads of multiple processes. TSP may, however, suffer from the loss of disk parallelism due to the disk independence of SP for a single sequential read. In order to solve this problem, this invention proposes Massive Stripe Prefetching (MSP).
    Type: Application
    Filed: January 2, 2008
    Publication date: January 15, 2009
    Applicant: KOREA ADVANCED INSTITUTE OF SCIENCE AND TECHNOLOGY
    Inventors: Kyu-Ho Park, Sung-Hoon Baek
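    A loose C sketch of the throttling idea; the strip size and the benefit heuristic below stand in for the patent's online disk simulation and are assumptions:

      #include <stdbool.h>
      #include <stdint.h>

      #define BLOCKS_PER_STRIP 16

      /* Online bookkeeping; the cache layer would bump prefetched_used each
         time a demand read hits a block that SP brought in. */
      static long prefetched_used, prefetched_total;

      static bool sp_beneficial(void)
      {
          /* stand-in for the patent's online disk simulation */
          return prefetched_total == 0 || 2 * prefetched_used >= prefetched_total;
      }

      /* Plan the physical read for a demand block: a whole strip-aligned
         strip when SP looks beneficial, otherwise just the requested block. */
      void plan_read(uint64_t block, uint64_t *start, unsigned *count)
      {
          if (sp_beneficial()) {
              *start = block - block % BLOCKS_PER_STRIP; /* strip-aligned */
              *count = BLOCKS_PER_STRIP;
              prefetched_total += BLOCKS_PER_STRIP - 1;
          } else {
              *start = block;                            /* throttled */
              *count = 1;
          }
      }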
  • Publication number: 20090006813
    Abstract: An apparatus, system, and method are disclosed. In one embodiment, the apparatus includes a system memory-side prefetcher that is coupled to a memory controller. The system memory-side prefetcher includes a stride detection unit to identify one or more patterns in a stream. The system memory-side prefetcher also includes a prefetch injection unit to insert prefetches into the memory controller based on the detected one or more patterns. The system memory-side prefetcher also includes a prefetch data forwarding unit to forward the prefetched data to a cache memory coupled to a processor.
    Type: Application
    Filed: June 28, 2007
    Publication date: January 1, 2009
    Inventors: Abhishek Singhal, Hemant G. Rotithor
  • Patent number: 7472256
    Abstract: Profile information can be used to target read operations that cause a substantial portion of misses in a program. A software value prediction technique that utilizes latency and is applied to the targeted read operations facilitates aggressive speculative execution without significant performance impact and without hardware support. A software value predictor issues prefetches for targeted read operations during speculative execution, and utilizes values from these prefetches during subsequent speculative execution, since the earlier prefetches should have completed, to update a software value prediction structure(s). Such a software based value prediction technique allows for aggressive speculative execution without the overhead of a hardware value predictor.
    Type: Grant
    Filed: April 12, 2005
    Date of Patent: December 30, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Sreekumar R. Nair, Santosh G. Abraham
  • Publication number: 20080307202
    Abstract: Provided are a method and system for loading test data into execution units in a graphics card to test the execution units. Test instructions are loaded into a cache in a graphics module comprising multiple execution units coupled to the cache on a bus during a design test mode. The cache instructions are concurrently transferred to an instruction queue of each execution unit to concurrently load the cache instructions into the instruction queues of the execution units. The execution units concurrently execute the cache instructions to fetch test instructions from the cache to load into memories of the execution units and execute during the design test mode.
    Type: Application
    Filed: June 7, 2007
    Publication date: December 11, 2008
    Inventors: Allan WONG, Ke YIN, Naveen MATAM, Anthony BABELLA, Wing Hang WONG
  • Patent number: 7461237
    Abstract: A system that suppresses duplicative prefetches for branch target cache lines. During operation, the system fetches a first cache line into a fetch buffer. The system then prefetches a second cache line, which immediately follows the first cache line, into the fetch buffer. If a control transfer instruction in the first cache line has a target instruction which is located in the second cache line, the system determines if the control transfer instruction is also located at the end of the first cache line so that a corresponding delay slot for the control transfer instruction is located at the beginning of the second cache line. If so, the system suppresses a subsequent prefetch for a target cache line containing the target instruction because the target instruction is located in the second cache line which has already been prefetched.
    Type: Grant
    Filed: April 20, 2005
    Date of Patent: December 2, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Abid Ali, Paul Caprioli, Shailender Chaudhry, Miles Lee
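    The suppression condition reduces to a small predicate; the line and instruction sizes in this C sketch are assumptions:

      #include <stdbool.h>
      #include <stdint.h>

      #define LINE_BYTES 64
      #define INSN_BYTES 4

      /* True when the subsequent prefetch of the target line can be
         suppressed: the control transfer sits in the last slot of the first
         line (so its delay slot opens the next line) and the target also
         lies in that next, already-prefetched line. */
      bool suppress_target_prefetch(uint64_t branch_pc, uint64_t target_pc)
      {
          uint64_t line    = branch_pc / LINE_BYTES;
          bool last_slot   = branch_pc % LINE_BYTES == LINE_BYTES - INSN_BYTES;
          bool target_next = target_pc / LINE_BYTES == line + 1;
          return last_slot && target_next;
      }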
  • Publication number: 20080288751
    Abstract: A processor system (100) includes a central processing unit (102) and a prefetch engine (110). The prefetch engine (110) is coupled to the central processing unit (102). The prefetch engine (110) is configured to detect, when data associated with the central processing unit (102) is read from a memory (114), a stride pattern in an address stream based upon whether sums of a current stride and a previous stride are equal for a number of consecutive reads. The prefetch engine (110) is also configured to prefetch, for the central processing unit (102), data from the memory (114) based on the detected stride pattern.
    Type: Application
    Filed: May 17, 2007
    Publication date: November 20, 2008
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventor: Andrej Kocev
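    A possible C rendering of the stride detector; the confirmation count is an assumption. Comparing sums of adjacent strides also catches alternating patterns such as +8, +24, +8, +24, whose pairwise sums stay constant at +32:

      #include <stdbool.h>
      #include <stdint.h>

      #define CONFIRMS_NEEDED 3 /* "a number of consecutive reads" (assumed) */

      static int64_t prev_addr, prev_stride, prev_sum;
      static int confirms;

      /* Feed each address read on behalf of the CPU; returns true once the
         sum of the current and previous strides has repeated often enough,
         at which point the prefetch engine can start fetching ahead. */
      bool observe_read(int64_t addr)
      {
          int64_t stride = addr - prev_addr;
          int64_t sum    = stride + prev_stride;
          confirms       = (sum == prev_sum) ? confirms + 1 : 0;
          prev_addr      = addr;
          prev_stride    = stride;
          prev_sum       = sum;
          return confirms >= CONFIRMS_NEEDED;
      }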
  • Publication number: 20080276073
    Abstract: An apparatus is provided for buffering instructions. An instruction store has memory locations for storing instructions. Each instruction can be associated with a timer such that an instruction dispatcher causes the instruction to be sent when the timer indicates that the instruction should be sent.
    Type: Application
    Filed: May 2, 2007
    Publication date: November 6, 2008
    Applicant: Analog Devices, Inc.
    Inventors: Joern Soerensen, Dilip Muthukrishnan, William Plumb, Thomas Keller, Morag Clark
  • Patent number: 7447877
    Abstract: A method and apparatus for converting memory instructions to prefetch operations during a thread switch window is disclosed. In one embodiment, memory access instructions that are already inside an instruction pipeline when the current thread is switched out may be decoded and then converted to the complementary prefetch operations. The prefetch operation may place the data into the cache during the execution of the alternate thread.
    Type: Grant
    Filed: June 13, 2002
    Date of Patent: November 4, 2008
    Assignee: Intel Corporation
    Inventors: Bharadwaj Pudipeddi, Udo Walterscheidt
  • Patent number: 7441110
    Abstract: A mechanism is described that predicts the usefulness of a prefetching instruction during the instruction's decode cycle. Prefetching instructions that are predicted as useful (prefetch useful data) are sent to an execution unit of the processor for execution, while instructions that are predicted as not useful are discarded. The prediction regarding the usefulness of a prefetching instruction is performed utilizing a branch prediction mask contained in the branch history mechanism. This mask is compared to information contained in the prefetching instruction that records the branch path between the prefetching instruction and actual use of the data. Both instructions and data can be prefetched using this mechanism.
    Type: Grant
    Filed: December 10, 1999
    Date of Patent: October 21, 2008
    Assignee: International Business Machines Corporation
    Inventors: Thomas R. Puzak, Allan M. Hartstein, Mark Charney, Daniel A. Prener, Peter H. Oden
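    The usefulness test amounts to a masked path comparison; this C sketch invents the mask widths and the valid-bits parameter for illustration:

      #include <stdbool.h>
      #include <stdint.h>

      /* Compare the branch-history mask against the branch path recorded in
         the prefetching instruction, restricted to the branches that lie
         between the prefetch and the actual use of the data; an instruction
         whose paths disagree is discarded at decode. */
      bool prefetch_is_useful(uint16_t predicted_mask, uint16_t recorded_path,
                              uint16_t valid_bits)
      {
          return (predicted_mask & valid_bits) == (recorded_path & valid_bits);
      }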
  • Patent number: 7437542
    Abstract: A conjugate processor includes an instruction set architecture (ISA) visible portion having a main pipeline, and an h-flow portion having an h-flow pipeline. The binary executed on the conjugate processor includes an essential portion that is executed on the main pipeline and a non-essential portion that is executed on the h-flow pipeline. The non-essential portion includes hint calculus that is used to provide hints to the main pipeline. The conjugate processor also includes a conjugate mapping table that maps triggers to h-flow targets. Triggers can be instruction attributes, data attributes, state attributes or event attributes. When a trigger is satisfied, the h-flow code specified by the target is executed in the h-flow pipeline.
    Type: Grant
    Filed: January 13, 2006
    Date of Patent: October 14, 2008
    Assignee: Intel Corporation
    Inventors: Hong Wang, Ralph Kling, Yong-Fong Lee, David A. Berson, Michael A. Kozuch, Konrad Lai
  • Patent number: 7434005
    Abstract: A preload controller for controlling a bus access device that reads out data from a main memory via a bus and transfers the readout data to a temporary memory, including a first acquiring device to acquire access hint information which represents a data access interval to the main memory, a second acquiring device to acquire system information which represents a transfer delay time in transfer of data via the bus by the bus access device, a determining device to determine a preload unit count based on the data access interval represented by the access hint information and the transfer delay time represented by the system information, and a management device to instruct the bus access device to read out data for the preload unit count from the main memory and to transfer the readout data to the temporary memory ahead of a data access of the data.
    Type: Grant
    Filed: June 14, 2005
    Date of Patent: October 7, 2008
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Seiji Maeda, Yusuke Shirota
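    One plausible reading of the determining device is a ceiling division: if the bus needs delay time units to deliver one preload unit and the program touches main memory every interval units, roughly ceil(delay / interval) units must be in flight for data to arrive ahead of use. The formula is an interpretation, not the patent's wording:

      /* Preload unit count from the data access interval (access hint
         information) and the bus transfer delay (system information). */
      unsigned preload_unit_count(unsigned delay, unsigned interval)
      {
          return (delay + interval - 1) / interval; /* ceil(delay / interval) */
      }

    For example, a 400-cycle transfer delay against a 100-cycle access interval gives a preload unit count of 4.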
  • Publication number: 20080243268
    Abstract: According to one example embodiment of the inventive subject matter, there is provided a mechanism that controls which prefetchers are applied to execute an application in a computing system by turning them on and off. In one embodiment, this may be accomplished for example with a software control process that may run in the background. In another example embodiment, this may be accomplished using a hardware control machine, or a combination of hardware and software. The prefetchers are turned on and off in order to increase the performance of the computing system.
    Type: Application
    Filed: March 31, 2007
    Publication date: October 2, 2008
    Inventors: Meenakshi A. Kandaswamy, Simon C. Steely
  • Publication number: 20080244080
    Abstract: A processor includes non-volatile memory into which streamed application components may be pre-fetched from a slower storage medium in order to decrease stall times during execution of the application. Alternatively, the application components pre-fetched into the non-volatile memory may be from a traditionally-loaded application rather than a streamed application. The order in which components of the application are prefetched into the non-volatile memory may be based on load order hints. For at least one embodiment, the load order hints are derived from server-side load ordering logic. For at least one other embodiment, the load order hints are provided by the application itself via a mechanism such as an application programming interface. For at least one other embodiment, the load order hints are generated by the client using profile data. Or, a combination of such approaches may be used. Other embodiments are also described and claimed.
    Type: Application
    Filed: March 29, 2007
    Publication date: October 2, 2008
    Inventors: Thomas H. James, Steven Grobman
  • Publication number: 20080244231
    Abstract: In some embodiments, the invention involves a novel combination of techniques for prefetching data and passing messages between and among cores in a multi-processor/multi-core platform. In an embodiment, a receiving core has a message queue and a message prefetcher. Incoming messages are simultaneously written to the message queue and the message prefetcher. The prefetcher speculatively fetches data referenced in the received message so that the data is available when the message is executed in the execution pipeline, or shortly thereafter. Other embodiments are described and claimed.
    Type: Application
    Filed: March 30, 2007
    Publication date: October 2, 2008
    Inventors: Aaron Kunze, Erik J. Johnson, Hermann Gartler
  • Publication number: 20080244232
    Abstract: Apparatus and computing systems associated with data pre-fetching are described. One embodiment includes a processor that includes a first unit to store data corresponding to a load instruction and an instruction pointer (IP) value associated with the load instruction. The processor also includes a second unit to produce a predicted demand address for a next load instruction, the predicted demand address being based on a constant stride value. The processor also includes a third unit to generate an instruction pointer pre-fetch (IPP) request for the predicted demand address. The processor may also include units to arbitrate between generated IP pre-fetch requests and alternative pre-fetch requests.
    Type: Application
    Filed: April 2, 2007
    Publication date: October 2, 2008
    Inventors: Marina Sherman, Jack Doweck
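    A C sketch of an IP-indexed constant-stride table in the spirit of the first two units; the table size, direct-mapped indexing, and confirmation rule are assumptions:

      #include <stdint.h>

      #define IPP_ENTRIES 256

      struct ipp_entry { uint64_t ip, last_addr; int64_t stride; };
      static struct ipp_entry table[IPP_ENTRIES]; /* first unit: per-IP history */

      /* Second unit: return the predicted demand address for this load, or 0
         when a constant stride has not been confirmed yet; the third unit
         would turn a nonzero result into an IPP request. */
      uint64_t ipp_observe(uint64_t ip, uint64_t addr)
      {
          struct ipp_entry *e = &table[ip % IPP_ENTRIES];
          uint64_t pred = 0;

          if (e->ip == ip) {
              int64_t stride = (int64_t)(addr - e->last_addr);
              if (stride != 0 && stride == e->stride)
                  pred = addr + stride; /* constant stride seen twice in a row */
              e->stride = stride;
          } else {
              e->ip = ip;               /* new load instruction claims the slot */
              e->stride = 0;
          }
          e->last_addr = addr;
          return pred;
      }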
  • Patent number: 7430650
    Abstract: A cache prefetching algorithm uses previously requested address and data patterns to predict future data needs and prefetch such data from memory into cache. A requested address is compared to previously requested addresses and returned data to compute a set of increments, and the set of increments is added to the currently requested address and returned data to generate a set of prefetch candidates. Weight functions are used to prioritize prefetch candidates. The prefetching method requires no changes to application code or operating system (OS) and is transparent to the compiler and the processor. The prefetching method comprises a parallel algorithm well-suited to implementation on an application-specific integrated circuit (ASIC) or field-programmable gate array (FPGA), or to integration into a processor.
    Type: Grant
    Filed: June 13, 2005
    Date of Patent: September 30, 2008
    Inventor: Richard A. Ross
  • Patent number: 7430640
    Abstract: The decision to prefetch inodes is based upon the detecting of access patterns that would benefit from such a prefetch. Once the decision to prefetch is made, a plurality of inodes are prefetched in parallel. Further, the prefetching of inodes is paced, such that the prefetching substantially matches the speed at which an application requests inodes.
    Type: Grant
    Filed: November 8, 2005
    Date of Patent: September 30, 2008
    Assignee: International Business Machines Corporation
    Inventors: Frank B. Schmuck, James C. Wyllie
  • Publication number: 20080229069
    Abstract: An instruction preload instruction executed in a first processor instruction set operating mode is operative to correctly preload instructions in a different, second instruction set. The instructions are pre-decoded according to the second instruction set encoding in response to an instruction set preload indicator (ISPI). In various embodiments, the ISPI may be set prior to executing the preload instruction, or may comprise part of the preload instruction or the preload target address.
    Type: Application
    Filed: March 14, 2007
    Publication date: September 18, 2008
    Applicant: QUALCOMM INCORPORATED
    Inventors: Thomas Andrew Sartorius, Brian Michael Stempel, Rodney Wayne Smith
  • Publication number: 20080229072
    Abstract: A prefetch processing apparatus includes a central-processing-unit monitor unit that monitors processing states of the central processing unit in association with time elapsed from start time of executing a program. A cache-miss-data address obtaining unit obtains cache-miss-data addresses in association with the time elapsed from the start time of executing the program, and a cycle determining unit determines a cycle of time required for executing the program. An identifying unit identifies a prefetch position in a cycle in which a prefetch-target address is to be prefetched by associating the cycle determined by the cycle determining unit with the cache-miss data addresses obtained by the cache-miss-data address obtaining unit. The prefetch-target address is an address of data on which prefetch processing is to be performed.
    Type: Application
    Filed: March 5, 2008
    Publication date: September 18, 2008
    Applicant: FUJITSU LIMITED
    Inventors: Shuji Yamamura, Takashi Aoki
  • Publication number: 20080229071
    Abstract: A prefetch control apparatus includes a prefetch controller for controlling prefetch of read data into a cache memory caching data to be transferred between a computer apparatus and a storage device, and which enhances a read efficiency of the read data from the storage device, a sequentiality decider for deciding whether the read data that are read from the storage device toward the computer apparatus are sequential access data, a locality decider for deciding whether the read data have locality of data arrangement in the predetermined storage area, in a case where the read data that are read from the storage device toward the computer apparatus have been decided not to be sequential access data, and a prefetcher for prefetching the read data in a case where the read data has the locality of the data arrangement.
    Type: Application
    Filed: March 5, 2008
    Publication date: September 18, 2008
    Applicant: Fujitsu Limited
    Inventors: Katsuhiko SHIOYA, Eiichi YAMANAKA
  • Publication number: 20080229070
    Abstract: Cache circuitry, a data processing apparatus including such cache circuitry, and a method for prefetching data into such cache circuitry, are provided. The cache circuitry has a cache storage comprising a plurality of cache lines for storing data values, and control circuitry which is responsive to an access request issued by a device of the data processing apparatus identifying a memory address of a data value to be accessed, to cause a lookup operation to be performed to determine whether the data value for that memory address is stored within the cache storage. If not, a linefill operation is initiated to retrieve the data value from memory.
    Type: Application
    Filed: March 12, 2007
    Publication date: September 18, 2008
    Applicant: ARM Limited
    Inventors: Elodie Charra, Philippe Jean-Pierre Raphalen, Frederic Claude Marie Piry, Philippe Luc, Gilles Eric Grandou
  • Patent number: 7421540
    Abstract: A mechanism is provided that identifies instructions that access storage and may be candidates for cache prefetching. The mechanism augments these instructions so that any given instance of the instruction operates in one of four modes, namely normal, unexecuted, data gathering, and validation. In the normal mode, the instruction merely performs the function specified in the software runtime environment. An instruction in unexecuted mode, upon the next execution, is placed in data gathering mode. When an instruction in the data gathering mode is encountered, the mechanism of the present invention collects data to discover potential fixed storage access patterns. When an instruction is in validation mode, the mechanism of the present invention validates the presumed fixed storage access patterns.
    Type: Grant
    Filed: May 3, 2005
    Date of Patent: September 2, 2008
    Assignee: International Business Machines Corporation
    Inventors: Christopher Michael Donawa, Allan Henry Kielstra
  • Publication number: 20080209173
    Abstract: An apparatus for executing branch predictor directed prefetch operations. During operation, a branch prediction unit may provide an address of a first instruction to the fetch unit. The fetch unit may send a fetch request for the first instruction to the instruction cache to perform a fetch operation. In response to detecting a cache miss corresponding to the first instruction, the fetch unit may execute one or more prefetch operations while the cache miss corresponding to the first instruction is being serviced. The branch prediction unit may provide an address of a predicted next instruction in the instruction stream to the fetch unit. The fetch unit may send a prefetch request for the predicted next instruction to the instruction cache to execute the prefetch operation. The fetch unit may store prefetched instruction data obtained from a next level of memory in the instruction cache or in a prefetch buffer.
    Type: Application
    Filed: February 28, 2007
    Publication date: August 28, 2008
    Inventors: Marius Evers, Trivikram Krishnamurthy
  • Publication number: 20080201529
    Abstract: An apparatus, program product and method initiate, in connection with a context switch operation, a prefetch of data likely to be used by a thread prior to resuming execution of that thread. As a result, once it is known that a context switch will be performed to a particular thread, data may be prefetched on behalf of that thread so that when execution of the thread is resumed, more of the working state for the thread is likely to be cached, or at least in the process of being retrieved into cache memory, thus reducing cache-related performance penalties associated with context switching.
    Type: Application
    Filed: April 24, 2008
    Publication date: August 21, 2008
    Applicant: International Business Machines Corporation
    Inventors: Jeffrey Powers Bradford, Harold F. Kossman, Timothy John Mullins
  • Publication number: 20080189518
    Abstract: A processor includes a cache memory that has an array, word lines, and bit lines. A control module accesses cells of the array during access cycles to access instructions stored in the cache memory. The control module performs one of a first discrete read and a first sequential read to access instructions in a first set of cells of the array that are connected to a first word line and selectively performs one of a second discrete read and a second sequential read based on a branch instruction to access instructions in a second set of cells of the array that are connected to a second word line. The second word line is different than the first word line.
    Type: Application
    Filed: April 2, 2008
    Publication date: August 7, 2008
    Inventors: Sehat Sutardja, Jason T. Su, Hong-Yi Chen, Jason Sheu, Jensen Tjeng
  • Patent number: 7409486
    Abstract: A protocol chip and a bridge are connected to a first bus, while the bridge and a micro processor (MP) are connected to a second bus. The MP generates parameter information and writes it into a local memory (LM), and issues a write command which includes access destination information to this parameter information to a protocol chip. The bridge pre-fetches the parameter information from the LM using the access destination information within the write command which is transferred to the protocol chip via itself, and when receiving a read command from the protocol chip, transmits the parameter information which has been pre-fetched to the protocol chip via the first bus, without passing the read command through to the MP.
    Type: Grant
    Filed: March 27, 2006
    Date of Patent: August 5, 2008
    Assignee: Hitachi, Ltd.
    Inventors: Osamu Torigoe, Hideaki Shima, Shouji Katoh
  • Publication number: 20080184010
    Abstract: According to the present invention, there is provided an instruction cache prefetch control apparatus having an external memory, a CPU and an instruction cache unit, the instruction cache unit having: an instruction cache data memory which receives and stores the instruction sequence; a prefetch buffer which prefetches and stores an instruction sequence next to the instruction sequence as a target of a fetch request from the CPU when the next instruction sequence is not stored in the instruction cache data memory; an instruction cache write control unit which selectively outputs, to the instruction cache data memory, one of the instruction sequence output from the external memory and the instruction sequence stored in the prefetch buffer; and a hit or miss determination access control unit which, upon receiving, from the CPU, a fetch request for the instruction sequence stored in the prefetch buffer, transfers the instruction sequence from the prefetch buffer to the instruction cache data memory and stores th
    Type: Application
    Filed: December 31, 2007
    Publication date: July 31, 2008
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Masato Uchiyama
  • Patent number: 7404042
    Abstract: A fetch section of a processor comprises an instruction cache and a pipeline of several stages for obtaining instructions. Instructions may cross cache line boundaries. The pipeline stages process two addresses to recover a complete boundary crossing instruction. During such processing, if the second piece of the instruction is not in the cache, the fetch with regard to the first line is invalidated and recycled. On this first pass, processing of the address for the second part of the instruction is treated as a pre-fetch request to load instruction data to the cache from higher level memory, without passing any of that data to the later stages of the processor. When the first line address passes through the fetch stages again, the second line address follows in the normal order, and both pieces of the instruction can be fetched from the cache and combined in the normal manner.
    Type: Grant
    Filed: May 18, 2005
    Date of Patent: July 22, 2008
    Assignee: QUALCOMM Incorporated
    Inventors: Brian Michael Stempel, Jeffrey Todd Bridges, Rodney Wayne Smith, Thomas Andrew Sartorius
  • Publication number: 20080168259
    Abstract: A DMA device prefetches descriptors into a descriptor prefetch buffer. The size of descriptor prefetch buffer holds an appropriate number of descriptors for a given latency environment. To support a linked list of descriptors, the DMA engine prefetches descriptors based on the assumption that they are sequential in memory and discards any descriptors that are found to violate this assumption. The DMA engine seeks to keep the descriptor prefetch buffer full by requesting multiple descriptors per transaction whenever possible. The bus engine fetches these descriptors from system memory and writes them to the prefetch buffer. The DMA engine may also use an aggressive prefetch where the bus engine requests the maximum number of descriptors that the buffer will support whenever there is any space in the descriptor prefetch buffer. The DMA device discards any remaining descriptors that cannot be stored.
    Type: Application
    Filed: January 10, 2007
    Publication date: July 10, 2008
    Inventors: Giora Biran, Luis E. De la Torre, Bernard C. Drerup, Jyoti Gupta, Richard Nicholas
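    A rough C model of the sequential-descriptor assumption; the descriptor layout and the use of indices rather than addresses for the link field are simplifications:

      #include <stdint.h>
      #include <string.h>

      #define PREFETCH_DEPTH 8 /* buffer sized for the latency environment */

      struct descriptor {
          uint64_t src, dst, len;
          uint64_t next; /* index of the next descriptor (simplified link) */
      };

      static struct descriptor buf[PREFETCH_DEPTH]; /* descriptor prefetch buffer */

      /* Bus engine: burst-read up to `room` descriptors starting at
         first_idx, assuming they are sequential in memory, then keep only
         the prefix whose links really are sequential. */
      unsigned prefetch_descriptors(const struct descriptor *mem,
                                    uint64_t first_idx, unsigned room)
      {
          unsigned n = room < PREFETCH_DEPTH ? room : PREFETCH_DEPTH;
          memcpy(buf, &mem[first_idx], n * sizeof buf[0]);

          for (unsigned i = 0; i + 1 < n; i++)
              if (buf[i].next != first_idx + i + 1) /* assumption violated */
                  return i + 1; /* discard the remaining descriptors */
          return n;
      }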
  • Publication number: 20080162819
    Abstract: A design structure for prefetching instruction lines is provided. The design structure is embodied in a machine readable storage medium for designing, manufacturing, and/or testing a design. The design structure comprises a processor having a level 2 cache, and a level 1 cache configured to receive instruction lines from the level 2 cache is described, wherein each instruction line comprises one or more instructions. The processor also includes a processor core configured to execute instructions retrieved from the level 1 cache, and circuitry configured to fetch a first instruction line from a level 2 cache, identify, in the first instruction line, an address identifying a first data line containing data targeted by a data access instruction contained in the first instruction line or a different instruction line, and prefetch, from the level 2 cache, the first data line using the extracted address.
    Type: Application
    Filed: March 13, 2008
    Publication date: July 3, 2008
    Inventor: DAVID A. LUICK
  • Patent number: 7389405
    Abstract: A method and architecture accesses a unified memory in a micro-processing system having a two-phase clock. The unified memory is accessed during a first instruction cycle. When a program code discontinuity is encountered, the unified memory is accessed a first time during an instruction cycle with a dummy access. The unified memory is accessed a second time during the instruction cycle when a program code discontinuity is encountered with either a data access, as in the case of a last instruction of a loop, or an instruction access, as in the case of a jump instruction.
    Type: Grant
    Filed: November 17, 2003
    Date of Patent: June 17, 2008
    Assignee: Mediatek, Inc.
    Inventor: Frederic Boutaud
  • Publication number: 20080140997
    Abstract: Embodiments of the present invention relate to a data processing system and method for using metadata associated with data to be retrieved from storage to identify further data, and to retrieve at least a portion of that further data from the storage in accordance with a prefetch policy.
    Type: Application
    Filed: February 4, 2005
    Publication date: June 12, 2008
    Inventor: Shailendra Tripathi
  • Publication number: 20080140996
    Abstract: When misses occur in an instruction cache, prefetching techniques are used that minimize miss rates, memory access bandwidth, and power use. One of the prefetching techniques operates when a miss occurs. A notification that a fetch address missed in an instruction cache is received. The fetch address that caused the miss is analyzed to determine an attribute of the fetch address and based on the attribute a line of instructions is prefetched. The attribute may indicate that the fetch address is a target address of a non-sequential operation. Another attribute may indicate that the fetch address is a target address of a non-sequential operation and the target address is more than X % into a cache line. A further attribute may indicate that the fetch address is an even address in the instruction cache. Such attributes may be combined to determine whether to prefetch.
    Type: Application
    Filed: December 8, 2006
    Publication date: June 12, 2008
    Inventors: Michael William Morrow, James Norris Dieffenderfer
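    The attribute checks combine into a small predicate; this C sketch assumes a 64-byte line, reads "X%" as 50%, and interprets "even address" as an even line index:

      #include <stdbool.h>
      #include <stdint.h>

      #define LINE_BYTES 64

      /* Decide, on an instruction-cache miss, whether to prefetch the next
         line, based on attributes of the missing fetch address. */
      bool should_prefetch_next(uint64_t fetch_addr, bool is_branch_target)
      {
          unsigned offset   = fetch_addr % LINE_BYTES;
          bool deep_in_line = offset > LINE_BYTES / 2;            /* > X% in */
          bool even_line    = ((fetch_addr / LINE_BYTES) & 1) == 0;
          return (is_branch_target && deep_in_line) || even_line;
      }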
  • Patent number: 7383417
    Abstract: The efficient performance of prefetching of data prior to the reading of the data by a program. A prefetching apparatus, for prefetching data from a file to a buffer before the data is read by a program, includes: a history recorder, for recording a history for a plurality of data readings issued by the program while performing data reading; a prefetching generator, for generating a plurality of prefetchings that correspond to the plurality of data readings recorded in the history; a prefetching process determination unit, for determining, based on the history, the performance order for the plurality of prefetchings; and a prefetching unit, for performing, when following the determination of the performance order the program is executed, the plurality of prefetchings in the performance order.
    Type: Grant
    Filed: March 15, 2006
    Date of Patent: June 3, 2008
    Assignee: International Business Machines Corporation
    Inventors: Toshiaki Yasue, Hideaki Komatsu
  • Patent number: 7373482
    Abstract: One embodiment of the present invention provides a system that improves the effectiveness of prefetching during execution of instructions in scout mode. During operation, the system executes program instructions in a normal-execution mode. Upon encountering a condition which causes the processor to enter scout mode, the system performs a checkpoint and commences execution of instructions in scout mode, wherein the instructions are speculatively executed to prefetch future memory operations, but wherein results are not committed to the architectural state of a processor. During execution of a load instruction during scout mode, if the load instruction is a special load instruction and if the load instruction causes a lower-level cache miss, the system waits for data to be returned from a higher-level cache before resuming execution of subsequent instructions in scout mode, instead of disregarding the result of the load instruction and immediately resuming execution in scout mode.
    Type: Grant
    Filed: May 26, 2005
    Date of Patent: May 13, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Lawrence A. Spracklen, Yuan C. Chou, Santosh G. Abraham
  • Patent number: 7370153
    Abstract: Method and apparatus for implementing controlled pre-fetching of data. An extended data structure can be used to specify where and when data is to be pre-fetched, and how much pre-fetching is to be performed, if any. The extended data structure has a pre-fetch flag that signals a host controller if pre-fetching is to be done. If the pre-fetch flag is set, pre-fetching is performed, otherwise pre-fetching is not performed. The host controller parses the extended data structure and formulates a data request that is sent to the disk drive. Pre-fetched data can be stored in a buffer memory for future use.
    Type: Grant
    Filed: August 6, 2004
    Date of Patent: May 6, 2008
    Assignee: NVIDIA Corporation
    Inventor: Radoslav Danilak
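    A guess in C at the shape of the extended data structure; every field name here is invented for illustration:

      #include <stdbool.h>
      #include <stdint.h>

      struct ext_request {
          uint64_t lba;          /* starting disk address of the demand read */
          uint32_t length;       /* bytes requested by the host */
          bool     prefetch;     /* pre-fetch flag checked by the host controller */
          uint64_t prefetch_lba; /* where pre-fetching starts */
          uint32_t prefetch_len; /* how much to pre-fetch into buffer memory */
      };

      /* Host controller: widen the disk request only when the flag is set. */
      uint32_t total_transfer(const struct ext_request *r)
      {
          return r->length + (r->prefetch ? r->prefetch_len : 0);
      }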
  • Patent number: 7363625
    Abstract: An SMT system is designed to allow software alteration of thread priority. In one case, the system signals a change in a thread priority based on the state of instruction execution and in particular when the instruction has completed execution. To alter the priority of a thread, the software uses a special form of a “no operation” (NOP) instruction (hereafter termed thread priority NOP). When the thread priority NOP is dispatched, its special NOP is decoded in the decode unit of the IDU into an operation that writes a special code into the completion table for the thread priority NOP. A “trouble” bit is also set in the completion table that indicates which instruction group contains the thread priority NOP. The trouble bit indicates that special processing is required after instruction completion. The thread priority instruction is processed after completion using the special code to change a thread's priority.
    Type: Grant
    Filed: April 24, 2003
    Date of Patent: April 22, 2008
    Assignee: International Business Machines Corporation
    Inventors: William E. Burky, Ronald N. Kalla, David A. Schroter, Balaram Sinharoy
  • Publication number: 20080091921
    Abstract: Systems and methods for prefetching data in a microprocessor environment are provided. The method comprises decoding a first instruction; determining if the first instruction comprises both a load instruction and embedded prefetch data; processing the load instruction; and processing the prefetch data, in response to determining that the first instruction comprises the prefetch data, wherein processing the prefetch data comprises determining a prefetch multiple, a prefetch address and the number of elements to prefetch, based on the prefetch data.
    Type: Application
    Filed: October 12, 2006
    Publication date: April 17, 2008
    Inventors: Diab Abuaiadh, Daniel Citron
  • Patent number: 7360059
    Abstract: In one embodiment, a digital signal processor includes look ahead logic to decrease the number of bubbles inserted in the processing pipeline. The processor receives data containing instructions in a plurality of buffers and decodes the size of a first instruction. The beginning of a second instruction is determined based on the size of the first instruction. The size of the second instruction is decoded and the processor determines whether loading the second instruction will deplete one of the plurality of buffers.
    Type: Grant
    Filed: February 3, 2006
    Date of Patent: April 15, 2008
    Assignee: Analog Devices, Inc.
    Inventors: Thomas Tomazin, William C. Anderson, Charles P. Roth, Kayla Chalmers, Juan G. Revilla, Ravi P. Singh
  • Publication number: 20080082790
    Abstract: An accelerator system supplements standard computer memory management units specifically in the case of sparse data. The accelerator processes requests for data from an analysis application running on the processor system by pre-fetching a subset of the irregularly ordered data and forming that data into a dense, sequentially-ordered array, which is then placed directly into the processor's main memory, for example. In one example, the memory controller is implemented as a separate, add-on coprocessor so that actions of the memory controller will take place simultaneously with the calculations of the processor system. This system addresses the problems caused by a lack of sequential and spatial locality in sparse data. In effect, the complicated data access characteristic of irregular structures, which are a characteristic of sparse matrices, is transferred from the code level to the hardware level.
    Type: Application
    Filed: August 16, 2007
    Publication date: April 3, 2008
    Inventors: Oleg Vladimirovich Diyankov, Yuri Ivanovich Konotop, John Victor Batson
  • Patent number: 7346762
    Abstract: A method of executing program instructions may include receiving, in a processor, an instruction that causes the processor to read data from or write data to a portion of memory that is shared by one or more processes, at least one process of which manipulates data in a format that is different than a format of data in the shared portion of memory. The method may further include executing alternate instructions in place of the received instruction. The alternate instructions may effect transformation of data associated with the shared portion of memory from a first data format to a second data format.
    Type: Grant
    Filed: January 6, 2006
    Date of Patent: March 18, 2008
    Assignee: Apple Inc.
    Inventors: Ronnie G. Misra, Joshua H. Shaffer
  • Patent number: 7346741
    Abstract: A method and apparatus for retrieving instructions to be processed by a microprocessor is provided. By pre-fetching instructions in anticipation of being requested, instead of waiting for the instructions to be requested, the latency involved in requesting instructions from higher levels of memory may be avoided. A pre-fetched line of instruction may be stored into a pre-fetch buffer residing on a microprocessor. The pre-fetch buffer may be used by the microprocessor as an alternate source from which to retrieve a requested instruction when the requested instruction is not stored within the first level cache. The particular line of instruction being pre-fetched may be identified based on a configurable stride value. The configurable stride value may be adjusted to maximize the likelihood that a requested instruction, not present in the first level cache, is present in the pre-fetch buffer. The configurable stride value may be updated manually or automatically.
    Type: Grant
    Filed: May 10, 2005
    Date of Patent: March 18, 2008
    Assignee: Sun Microsystems, Inc.
    Inventors: Brian F. Keish, Quinn Jacobson, Lakshminarasim Varadadesikan
  • Patent number: 7343481
    Abstract: A data processing system incorporates an instruction prefetch unit 8 including a static branch predictor 12. A static branch prediction cache 30, 32, 34 is provided for storing a most recently encountered static branch prediction such that a subsequent request to fetch the already encountered branch instruction can be identified before the opcode for that branch instruction is returned. The cached static branch prediction can thus redirect the prefetching to the branch target address sooner than the static predictor 12.
    Type: Grant
    Filed: March 19, 2003
    Date of Patent: March 11, 2008
    Assignee: ARM Limited
    Inventor: David James Williamson
  • Patent number: 7334088
    Abstract: A computer system and a method for enhancing the cache prefetch behavior. A computer system including a processor, a main memory, a prefetch controller, a cache memory, and a prefetch buffer, wherein each page in the main memory has associated with it a tag, which is used for controlling the prefetching of a variable subset of lines from this page as well as lines from at least one other page. And, coupled to the processor is a prefetch controller, wherein the prefetch controller responds to the processor determining a fault (or miss) occurred to a line of data by fetching a corresponding line of data with the corresponding tag, with the corresponding tag to be stored in the prefetch buffer, and sending the corresponding line of data to the cache memory.
    Type: Grant
    Filed: December 20, 2002
    Date of Patent: February 19, 2008
    Assignee: International Business Machines Corporation
    Inventor: Peter Franaszek
  • Patent number: 7328433
    Abstract: Methods and apparatus for reducing memory latency in a software application are disclosed. A disclosed system uses one or more helper threads to prefetch variables for a main thread to reduce performance bottlenecks due to memory latency and/or a cache miss. A performance analysis tool is used to profile the software application's resource usage and identifies areas in the software application experiencing performance bottlenecks. Compiler-runtime instructions are generated into the software application to create and manage the helper thread. The helper thread prefetches data in the identified areas of the software application experiencing performance bottlenecks. A counting mechanism is inserted into the helper thread and a counting mechanism is inserted into the main thread to coordinate the execution of the helper thread with the main thread and to help ensure the prefetched data is not removed from the cache before the main thread is able to take advantage of the prefetched data.
    Type: Grant
    Filed: October 2, 2003
    Date of Patent: February 5, 2008
    Assignee: Intel Corporation
    Inventors: Xinmin Tian, Shih-wei Liao, Hong Wang, Milind Girkar, John Shen, Perry Wang, Grant Haab, Gerolf Hoflehner, Daniel Lavery, Hideki Saito, Sanjiv Shah, Dongkeun Kim
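    A thumbnail of the helper-thread scheme in C with pthreads; the two atomic counters play the role of the inserted counting mechanisms, the 64-element lead bound is an assumption, and __builtin_prefetch is a GCC/Clang builtin:

      #include <pthread.h>
      #include <stdatomic.h>

      #define N        1000000
      #define MAX_LEAD 64 /* how far the helper may run ahead (assumed) */

      static int data[N];
      static _Atomic long main_pos;   /* counting mechanism in the main thread */
      static _Atomic long helper_pos; /* counting mechanism in the helper */

      static void *helper(void *arg)
      {
          (void)arg;
          for (long i = 0; i < N; i++) {
              while (i - atomic_load(&main_pos) > MAX_LEAD)
                  ; /* throttle so prefetched lines are not evicted early */
              __builtin_prefetch(&data[i]); /* warm the cache for the main thread */
              atomic_store(&helper_pos, i);
          }
          return 0;
      }

      long run_with_helper(void)
      {
          pthread_t t;
          long sum = 0;
          pthread_create(&t, 0, helper, 0);
          for (long i = 0; i < N; i++) {
              sum += data[i]; /* likely hits lines the helper touched */
              atomic_store(&main_pos, i);
          }
          pthread_join(t, 0);
          return sum;
      }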