Prefetching Patents (Class 712/207)
-
Patent number: 6708266
Abstract: The central processing unit is provided with an instruction queue storage section. This storage section is made of a memory, such as a FIFO memory, that adopts the first-in first-out method. A counter counts each time an instruction datum is stored in the instruction queue storage section. When the value of the counter is 0 or 1 and instruction fetch is not suppressed, a fetch request is issued.
Type: Grant
Filed: January 25, 2001
Date of Patent: March 16, 2004
Assignee: Fujitsu Limited
Inventor: Seiji Suetake
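The fetch-request rule in this abstract (issue a request whenever the queue holds at most one instruction and fetching is not suppressed) can be sketched as a small simulation. The class and method names below are illustrative, not taken from the patent:

```python
from collections import deque

class InstructionQueue:
    """Toy model of the FIFO instruction queue with a fill counter."""
    def __init__(self):
        self.fifo = deque()      # first-in first-out instruction storage
        self.suppressed = False  # external fetch-suppress signal

    def should_fetch(self):
        # Issue a fetch request when the counter (queue depth) is 0 or 1
        # and instruction fetch is not suppressed.
        return len(self.fifo) <= 1 and not self.suppressed

    def store(self, instruction):
        self.fifo.append(instruction)   # counter increments implicitly

    def dispatch(self):
        return self.fifo.popleft() if self.fifo else None

q = InstructionQueue()
assert q.should_fetch()           # empty queue -> request a fetch
q.store("insn0"); q.store("insn1")
assert not q.should_fetch()       # two entries queued -> no request
```

Dispatching an instruction drops the count back to 1, so the model immediately re-enables fetch requests, matching the 0-or-1 threshold described above.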
-
Patent number: 6708255
Abstract: A variable input/output control device in a synchronous semiconductor memory device, including a plurality of first prefetch units to prefetch data from an input buffer, a plurality of second prefetch units to prefetch data from a memory core, and a control signal generator for generating a control signal in response to command signals to select one of the plurality of first prefetch units and one of the plurality of second prefetch units.
Type: Grant
Filed: December 31, 2001
Date of Patent: March 16, 2004
Assignee: Hynix Semiconductor Inc.
Inventor: Seung-Hyun Yi
-
Patent number: 6697917
Abstract: The present invention prevents, at high speed, a malfunction from occurring at the time of changing a mode in a processor, in which information to be decoded varies with modes. The processor is provided with a circuit for referring to a result of decoding information or issuing an instruction when a write operation is performed on a register for storing data containing a bit that indicates a current mode, and for outputting a purge signal if the result of decoding information or issuing an instruction is information represented by a mode switching signal. Thus, when a mode switching signal is written to the register, a purge signal is outputted to a cache memory. Consequently, the valid bit of prefetched cache data is turned off. This prevents prefetched data from being decoded in a different mode. As a result, operations are performed normally after the switching of the mode. Alternatively, the purge signal is outputted by detecting a change in the value of the bit indicating the current mode.
Type: Grant
Filed: March 29, 2000
Date of Patent: February 24, 2004
Assignee: Fujitsu Limited
Inventors: Junya Matsushima, Takumi Takeno, Kenichi Nabeya, Daisuke Ban
-
Patent number: 6687807
Abstract: Additional memory hardware in a computer system which is distinct in function from the main memory system architecture permits the storage and retrieval of prefetch addresses and allows the compiler to more efficiently generate prefetch instructions for execution while traversing pointer-based or recursive data structures. The additional memory hardware makes up a content addressable memory (CAM) or a hash table/array memory that is relatively close in cycle time to the CPU and relatively small when compared to the main memory system. The additional CAM hardware permits the compiler to write data access loops which remember the addresses for each node visited while traversing the linked data structure by providing storage space to hold a prefetch address or a set of prefetch addresses.
Type: Grant
Filed: April 18, 2000
Date of Patent: February 3, 2004
Assignee: Sun Microsystems, Inc.
Inventor: Peter C. Damron
-
Patent number: 6684319
Abstract: The present invention minimizes power consumption and processing time in a very long instruction word digital signal processor by identifying certain blocks of instructions and placing them in a small, fast buffer for subsequent retrieval and execution. A decoder unit decodes a prefetch instruction flag bit that indicates when instructions are to be prefetched and placed into the buffer. The decoder unit signals a control unit, which sends the instruction code from a memory unit to the buffer and maintains an address mapping table and a program counter. The control unit also sets a select input on a multiplexer to indicate that the multiplexer is to output the prefetch instructions it receives from the buffer. The multiplexer outputs the prefetch instructions to an instruction register that sends the prefetch instructions to appropriate functional units for execution.
Type: Grant
Filed: June 30, 2000
Date of Patent: January 27, 2004
Assignee: Conexant Systems, Inc.
Inventors: Moataz A. Mohamed, Keith M. Bindloss
-
Patent number: 6681317
Abstract: An apparatus and method to provide ordering when an advanced load address table is used for advanced loads. An advanced load address table (ALAT) is used to retain an entry associated with a location accessed by an advanced load instruction. The entry is utilized to determine if an intervening access to the location is performed by another instruction prior to the execution of a corresponding checking instruction. Ordering is maintained to ensure validity of the entry in the ALAT when the advanced load instruction is boosted past an ordering setting boundary.
Type: Grant
Filed: September 29, 2000
Date of Patent: January 20, 2004
Assignee: Intel Corporation
Inventor: Gregory S. Mathews
-
Patent number: 6681318
Abstract: One embodiment of the present invention provides a system that prefetches instructions by using an assist processor to perform prefetch operations in advance of a primary processor. The system operates by executing executable code on the primary processor, and simultaneously executing a reduced version of the executable code on the assist processor. This reduced version of the executable code executes more quickly than the executable code, and performs prefetch operations for the primary processor in advance of when the primary processor requires the instructions. The system also stores the prefetched instructions into a cache that is accessible by the primary processor, so that the primary processor is able to access the prefetched instructions without having to retrieve them from a main memory. In one embodiment of the present invention, prior to executing the executable code, the system compiles source code into executable code for the primary processor.
Type: Grant
Filed: January 16, 2001
Date of Patent: January 20, 2004
Assignee: Sun Microsystems, Inc.
Inventors: Shailender Chaudhry, Marc Tremblay
-
Patent number: 6678796
Abstract: A method and apparatus are disclosed for scheduling instructions, during compilation of program code into a program, to provide adequate prefetch latency. The prefetch scheduler component of the present invention selects a memory operation within the program code as a “martyr load” and removes the prefetch associated with the martyr load, if any. The prefetch scheduler takes advantage of the latency associated with the martyr load to schedule prefetches for memory operations which follow the martyr load. The prefetches are scheduled “behind” (i.e., prior to) the martyr load to allow the prefetches to complete before the associated memory operations are carried out. The prefetch scheduler component continues this process throughout the program code to optimize prefetch scheduling and overall program operation.
Type: Grant
Filed: October 3, 2000
Date of Patent: January 13, 2004
Assignee: Sun Microsystems, Inc.
Inventors: Nicolai Kosche, Peter C. Damron, Joseph Chamdani, Partha Pal Tirumalai
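The martyr-load idea can be illustrated with a toy scheduling pass. The operation representation and the first-load selection heuristic below are assumptions made for illustration, not the patent's actual algorithm:

```python
def schedule_with_martyr(ops):
    """Toy martyr-load pass over 'ops', an ordered list of dicts such as
    {'op': 'load', 'addr': 'a'} or {'op': 'prefetch', 'addr': 'a'}.
    The first load is chosen as the martyr: its own prefetch (if any)
    is dropped, and prefetches for the loads that follow it are placed
    just before it, so they complete under the martyr's miss latency."""
    loads = [o for o in ops if o['op'] == 'load']
    if not loads:
        return ops
    martyr = loads[0]
    hoisted = [{'op': 'prefetch', 'addr': o['addr']} for o in loads[1:]]
    out = []
    for o in ops:
        if o['op'] == 'prefetch' and o.get('addr') == martyr['addr']:
            continue                     # remove the martyr's own prefetch
        if o is martyr:
            out.extend(hoisted)          # schedule prefetches "behind" it
        out.append(o)
    return out
```

Running the pass on `[prefetch a, load a, load b]` yields `[prefetch b, load a, load b]`: load `a` stalls anyway, and the stall hides the prefetch of `b`.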
-
Patent number: 6675279
Abstract: A behavioral memory mechanism for performing fetch prediction within a data processing system is disclosed. The data processing system includes a processor, a real memory, an address converter, a fetch prediction means, and an address translator. The real memory has multiple real address locations, and each of the real address locations is associated with a corresponding one of many virtual address locations. The virtual address locations are divided into two non-overlapping regions, namely, an architecturally visible virtual memory region and a behavioral virtual memory region. The address converter converts an effective address to an architecturally visible virtual address and a behavioral virtual address. The architecturally visible virtual address is utilized to access the architecturally visible virtual memory region of the virtual memory, and the behavioral virtual address is utilized to access the behavioral virtual memory region of the virtual memory.
Type: Grant
Filed: October 16, 2001
Date of Patent: January 6, 2004
Assignee: International Business Machines Corporation
Inventors: Ravi K. Arimilli, William J. Starke
-
Publication number: 20030233531
Abstract: In a method for fetching instructions in an embedded system, a predicted one of a set of the instructions stored in a memory device is fetched and is subsequently stored in an instruction buffer when a system bus is in a data access phase. When a processor generates an access request for the memory device, the predicted one of the instructions stored in the instruction buffer is provided to the system bus for receipt by the processor upon determining that the predicted one of the instructions stored in the instruction buffer hits the access request from the processor. An embedded system with an instruction prefetching device is also disclosed.
Type: Application
Filed: June 10, 2003
Publication date: December 18, 2003
Inventor: Chang-Fu Lin
-
Publication number: 20030233530
Abstract: A system and method for prefetching instructions from a slower memory for storing them in a faster memory includes the following: prefetching the instructions from a slower memory; recognizing an opcode corresponding to an unconditional branch instruction; continuing to prefetch at a target address of the unconditional branch instruction, responsive to recognizing the opcode corresponding to the unconditional branch instruction; recognizing an opcode corresponding to a conditional branch instruction; prefetching along each of the possible branches for the conditional branch instruction, responsive to recognizing the opcode corresponding to the conditional branch instruction; taking a branch from the possible branches of the conditional branch; and canceling prefetching of other possible branches not taken.
Type: Application
Filed: June 14, 2002
Publication date: December 18, 2003
Applicant: International Business Machines Corporation
Inventors: Richard Harold Boivie, Jun Tung Fong
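The branch-following part of this scheme can be sketched as a short worklist walk: follow unconditional branch targets, fork on conditional branches, and stop at a prefetch budget. This is an illustrative model only (opcode names and the dict-based program encoding are assumptions), and it omits the later cancellation of the path not taken:

```python
def prefetch_path(program, start, limit=8):
    """Toy prefetch walk over 'program', a dict mapping address ->
    (opcode, target).  'jmp' is an unconditional branch, 'br' a
    conditional one; any other opcode falls through sequentially.
    Returns the list of prefetched addresses in order."""
    fetched, worklist = [], [start]
    while worklist and len(fetched) < limit:
        addr = worklist.pop(0)
        if addr not in program or addr in fetched:
            continue
        fetched.append(addr)
        op, target = program[addr]
        if op == 'jmp':                 # unconditional: continue at target
            worklist.append(target)
        elif op == 'br':                # conditional: prefetch both paths
            worklist.append(addr + 1)
            worklist.append(target)
        else:
            worklist.append(addr + 1)   # sequential flow
    return fetched
```

For example, with a `jmp` at address 1 to address 5 and a `br` at 5 targeting 9, the walk prefetches 0, 1, 5, and then both 6 (fall-through) and 9 (target).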
-
Patent number: 6665776
Abstract: A microprocessor is configured to continue execution in a special Speculative Prefetching After Data Cache Miss (SPAM) mode after a data cache miss is encountered. The microprocessor includes additional registers and an additional program counter, and optionally additional cache memory, for use during the special SPAM mode. By continuing execution during the SPAM mode, multiple outstanding and overlapping cache fill requests may be issued, thus improving performance of the microprocessor.
Type: Grant
Filed: January 4, 2001
Date of Patent: December 16, 2003
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Norman Paul Jouppi, Keith Istvan Farkas
-
Patent number: 6665774
Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses, wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operates on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a.) an address in the cache array to which a read-data value will be placed when the read-data value is returned from the memory, and (b.
Type: Grant
Filed: October 16, 2001
Date of Patent: December 16, 2003
Assignee: Cray, Inc.
Inventors: Gregory J. Faanes, Eric P. Lundberg
-
Publication number: 20030225996
Abstract: Method and apparatus for inserting prefetch instructions in an executable computer program. Profile data are generated for executed load instructions and store instructions. The profile data include instruction addresses, target addresses, data loaded and stored, and execution counts. From the profile data, recurring patterns of instructions resulting in cache-miss conditions are identified. Prefetch instructions are inserted prior to the instructions that result in cache-miss conditions for patterns of instructions recurring more than a selected frequency.
Type: Application
Filed: May 30, 2002
Publication date: December 4, 2003
Applicant: Hewlett-Packard Company
Inventor: Carol L. Thompson
-
Patent number: 6658550
Abstract: An asynchronous processor having pipelined instruction fetching and execution to implement concurrent execution of instructions by two or more execution units. A writeback unit is connected to the execution units and memory units to control information updates and to handle precise exceptions. A pipelined completion mechanism can be implemented to improve the throughput.
Type: Grant
Filed: April 30, 2002
Date of Patent: December 2, 2003
Assignee: California Institute of Technology
Inventors: Alain J. Martin, Andrew Lines, Rajit Manohar, Uri Cummings, Mika Nystroem
-
Patent number: 6658535
Abstract: Upon receiving a read command, a disk drive moves a read head to target data and reads the data into a read buffer. In an action called “prefetching”, the drive continues to read nearby data into the read buffer, which doubles as a data cache. When another I/O command is present and must be serviced, prefetching is preempted, thereby reducing the data read into the cache. Moving the head from the current I/O command to the next I/O command creates a delay comprising two components: seek time and rotational latency. Based on the relative values of these components, a time period, less than the entire delay period, is calculated in which prefetching will continue. By continuing prefetching instead of preempting it, the likelihood of cache hits is increased because more data is available in the read buffer. Furthermore, by performing prefetching during part of the otherwise unused delay period, no performance penalty is introduced.
Type: Grant
Filed: January 19, 2000
Date of Patent: December 2, 2003
Assignee: International Business Machines Corporation
Inventors: Nimrod Megiddo, Spencer Ng
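The timing argument can be made concrete with back-of-the-envelope arithmetic. The function below is a toy illustration, not the patent's actual formula: it assumes, purely for illustration, that the drive keeps prefetching through the rotational-latency portion of the delay before the next command:

```python
def extra_prefetch_sectors(seek_ms, rotational_ms, sectors_per_ms):
    """Toy estimate of how much extra data continued prefetching yields.
    The delay before the next command is seek time plus rotational
    latency; the patent derives a prefetch period shorter than that
    whole delay from the relative sizes of the two components.  Here
    we simply treat the rotational-latency portion as usable time."""
    total_delay_ms = seek_ms + rotational_ms   # delay to the next command
    usable_ms = min(rotational_ms, total_delay_ms)
    return usable_ms * sectors_per_ms          # sectors added to the cache
```

With an (assumed) 4 ms seek, 3 ms rotational latency, and 2 sectors/ms transfer rate, roughly 6 extra sectors land in the read buffer during time that would otherwise be idle.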
-
Patent number: 6658534
Abstract: A mechanism to reduce instruction cache miss penalties by initiating an early cache line prefetch is implemented. The mechanism provides for an early prefetch of the next succeeding cache line before an instruction cache miss is detected during a fetch. The prefetch is initiated when it is guaranteed that instructions in the subsequent cache line will be referenced. This occurs when the current instruction is either a non-branch instruction, so instructions will execute sequentially, or a branch instruction whose forward branch is sufficiently short. If the current instruction is a branch, but the branch forward is to the next sequential cache line, a prefetch of the next sequential cache line may be performed. In this way, cache miss latencies may be reduced without generating cache pollution due to the prefetch of cache lines which are subsequently unreferenced.
Type: Grant
Filed: March 31, 1998
Date of Patent: December 2, 2003
Assignee: International Business Machines Corporation
Inventors: Steven Wayne White, Hung Qui Le, Kurt Alan Feiste, Paul Joseph Jordan
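The "guaranteed reference" condition reduces to a small predicate. The instruction encoding and the 32-byte line size below are assumptions for illustration:

```python
def should_prefetch_next_line(insn, pc, line_size=32):
    """Toy version of the guarantee check: prefetch the next sequential
    cache line early when the current instruction is not a branch, or
    is a short forward branch whose target still falls within the next
    sequential line.  'insn' is a dict like {'branch': True,
    'target': 40}; this encoding is illustrative."""
    next_line = (pc // line_size + 1) * line_size
    if not insn.get('branch'):
        return True                          # sequential execution
    target = insn['target']
    # forward branch landing inside the next sequential line
    return pc < target < next_line + line_size
```

A non-branch at pc 28 qualifies, as does a branch at 28 targeting 40 (inside the next 32-byte line); a branch targeting 200 does not, so no speculative line is pulled in and pollution is avoided.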
-
Patent number: 6654873
Abstract: A processor apparatus which reduces an overhead at the time of switching processing modules and efficiently performs desired processing at a high speed, wherein desired processing is performed by prefetching a series of instructions by a main program prefetcher, pre-decoding the same by a pre-decoder, and supplying the same to a decoder and execution unit via a multiplexer. When an instruction to execute a macro command is detected in the pre-decoder, the instructions of the macro command are prefetched by a macro program prefetcher and pre-decoded in the pre-decoder. As a result, when branching to a macro command, the instructions of the macro command can be immediately supplied to an execution unit only by switching the multiplexer.
Type: Grant
Filed: January 7, 2000
Date of Patent: November 25, 2003
Assignee: Sony Corporation
Inventor: Tomohiko Kadowaki
-
Patent number: 6651162
Abstract: A method of prefetching addresses includes the step of accessing a stored instruction using a current address. During the access using the current address, a target address is accessed in a branch target address cache. A stored instruction associated with the target address accessed from the branch target address cache is prefetched, and the branch target address is indexed with selected bits from the address accessed from the branch target address cache.
Type: Grant
Filed: November 4, 1999
Date of Patent: November 18, 2003
Assignee: International Business Machines Corporation
Inventors: David Stephen Levitan, Shashank Nemawarkar, Balaram Sinharoy, William John Starke
-
Patent number: 6647487
Abstract: An apparatus and methods for optimizing prefetch performance. Logical ones are shifted into the bits of a shift register from the left for each instruction address prefetched. As instruction addresses are fetched by the processor, logical zeros are shifted into the bit positions of the shift register from the right. Once initiated, prefetching continues until a logical one is stored in the n-th bit of the shift register. Detection of this logical one in the n-th bit causes prefetching to cease until a prefetched instruction address is removed from the prefetched instruction address register and a logical zero is shifted back into the n-th bit of the shift register. Thus, autonomous prefetch agents are prevented from prefetching too far ahead of the current instruction pointer, which would result in wasted memory bandwidth and the replacement of useful instructions in the instruction cache.
Type: Grant
Filed: February 18, 2000
Date of Patent: November 11, 2003
Assignee: Hewlett-Packard Development Company, LP.
Inventors: Stephen R. Undy, James E. McCormick, Jr.
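The shift-register throttle lends itself to a short simulation. The class below is an illustrative model of the behavior described, not the patented circuit:

```python
class PrefetchThrottle:
    """Toy model of the shift-register throttle.  bits[0] is the
    leftmost bit, bits[-1] the n-th bit.  A 1 shifts in from the left
    per prefetch, a 0 shifts in from the right per fetch consumed, so
    the number of 1s tracks how far prefetch has run ahead; prefetch
    stops while the n-th bit holds a 1."""
    def __init__(self, n):
        self.bits = [0] * n

    def on_prefetch(self):
        self.bits = [1] + self.bits[:-1]   # shift a 1 in from the left

    def on_fetch(self):
        self.bits = self.bits[1:] + [0]    # shift a 0 in from the right

    def may_prefetch(self):
        return self.bits[-1] == 0          # cease when n-th bit is 1
```

With `n = 2`, two back-to-back prefetches fill the register and stall further prefetching; one consumed fetch shifts a zero back into the n-th bit and re-enables it, bounding the prefetch distance at n addresses.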
-
Patent number: 6647467
Abstract: A pipelined processor includes a branch acceleration technique which is based on an improved branch cache architecture. In one exemplary embodiment, the present invention has an instruction pipeline comprising a plurality of stages, each stage initially containing first data. A branch cache module stores at least a portion of the first data from one or more of the pipeline stages, and a branch cache control unit causes the branch cache module to store at least a portion of the first data from one or more of the pipeline stages in response to execution of a cacheable branch instruction which triggers a cache miss, and causes the branch cache module to restore second data to one or more of the pipeline stages in response to a cache hit. The present invention also discloses methods for controlling the branch cache and for reducing pipeline stalls caused by branching.
Type: Grant
Filed: October 9, 2000
Date of Patent: November 11, 2003
Assignee: Micron Technology, Inc.
Inventor: Eric M. Dowling
-
Patent number: 6647485
Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.
Type: Grant
Filed: May 10, 2001
Date of Patent: November 11, 2003
Assignee: Seiko Epson Corporation
Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
-
Patent number: 6643766
Abstract: Speculative pre-fetching and pre-flushing of additional cache lines minimize cache miss latency and coherency check latency of an out-of-order instruction execution processor. A pre-fetch/pre-flush slot (DPRESLOT) is provided in a memory queue (MQUEUE) of the out-of-order execution processor. The DPRESLOT monitors the transactions between a system interface, e.g., the system bus, and an address reorder buffer slot (ARBSLOT) and/or between the system interface and a cache coherency check slot (CCCSLOT). When a cache miss is detected, the DPRESLOT causes one or more cache lines, in addition to the data line which caused the current cache miss, to be pre-fetched from the memory hierarchy into the cache memory (DCACHE) in anticipation that the additional data will be required in the near future.
Type: Grant
Filed: May 4, 2000
Date of Patent: November 4, 2003
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Gregg B. Lesartre, David Jerome Johnson
-
Publication number: 20030204705
Abstract: The present invention provides a data processing apparatus and method for predicting branch instructions in a data processing apparatus. The data processing apparatus comprises a processor for executing instructions, a prefetch unit for prefetching instructions from a memory prior to sending those instructions to the processor for execution, and branch prediction logic for predicting which instruction should be prefetched by the prefetch unit. The branch prediction logic is arranged to predict whether a prefetched instruction specifies a branch operation that will cause a change in instruction flow, and if so to indicate to the prefetch unit a target address within the memory from which a next instruction should be retrieved.
Type: Application
Filed: April 30, 2002
Publication date: October 30, 2003
Inventors: William H. Oldfield, Alexander E. Nancekievill
-
Patent number: 6636945
Abstract: The data-transfer latency of a cache-miss load instruction is shortened in a processor having a cache memory. A load history table, wherein a transfer address of the cache-miss load instruction is registered, is provided between the processor and a memory system. When access addresses are sequential, a request for hardware prefetch to a successive address is issued and the address is registered into a prefetch buffer. Further, when a cache-miss load request to the successive address is issued, the data are transferred from the prefetch buffer directly to the processor. The system may include multiple simultaneous prefetches and a prefetch variable optimized using software.
Type: Grant
Filed: July 19, 2001
Date of Patent: October 21, 2003
Assignee: Hitachi, Ltd.
Inventor: Tomohiro Nakamura
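The load-history mechanism can be sketched as follows. The 64-byte line size, the set/dict structures, and the method names are assumptions for illustration, not the patented hardware:

```python
class LoadHistoryTable:
    """Toy model of the table between the processor and memory: record
    miss addresses and, when a new miss is sequential to a recorded
    one, issue a hardware prefetch for the next line into a prefetch
    buffer.  A later miss that hits the buffer is served from it
    directly instead of going to memory."""
    def __init__(self, line=64):
        self.line = line
        self.history = set()       # previously registered miss addresses
        self.prefetch_buffer = {}  # address -> prefetched data

    def on_cache_miss(self, addr, memory):
        if addr in self.prefetch_buffer:        # serve from the buffer
            return self.prefetch_buffer.pop(addr)
        if addr - self.line in self.history:    # sequential pattern seen
            nxt = addr + self.line
            self.prefetch_buffer[nxt] = memory.get(nxt)
        self.history.add(addr)
        return memory.get(addr)
```

Missing on addresses 0 and then 64 establishes a sequential pattern, so line 128 is prefetched; the subsequent miss on 128 is then filled straight from the prefetch buffer.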
-
Patent number: 6634024
Abstract: The present invention integrates data prefetching into a modulo scheduling technique to provide for the generation of assembly code having improved performance. Modulo scheduling can produce optimal steady state code for many important cases by sufficiently separating defining instructions (producers) from using instructions (consumers), thereby avoiding machine stall cycles and simultaneously maximizing processor utilization. Integrating data prefetching within modulo scheduling yields high performance assembly code by prefetching data from memory while at the same time using modulo scheduling to efficiently schedule the remaining operations. The invention integrates data prefetching into modulo scheduling by postponing prefetch insertion until after modulo scheduling is complete. Actual insertion of the prefetch instructions occurs in a postpass after the generation of appropriate prologue-kernel-epilogue code.
Type: Grant
Filed: June 27, 2001
Date of Patent: October 14, 2003
Assignee: Sun Microsystems, Inc.
Inventors: Partha Pal Tirumalai, Rajagopalan Mahadevan
-
Patent number: 6631459
Abstract: An apparatus includes an instruction word storage for storing a plurality of general instruction words and extended instruction words; a temporary storage unit including a plurality of buffers for pre-fetching and storing the plurality of instruction words from the instruction word storage; an instruction word search unit for receiving and decoding the plurality of instruction words pre-fetched and outputting a position signal of a general instruction word and the positions of one or more successive extended instruction words stored in the temporary storage; a selector for selecting a buffer in which a general instruction word is stored and outputting the general instruction word sequentially, according to the position signal; a general instruction word parser for receiving a general instruction word from the selector and outputting a plurality of control signals for executing the general instruction word simultaneously; and an extended data parser for performing an operational processing of operands of
Type: Grant
Filed: August 24, 2000
Date of Patent: October 7, 2003
Assignee: Asia Design Co., Ltd.
Inventors: Kyung Youn Cho, Jong Yoon Lim, Geun Taek Lee, Hyeong Cheol Oh, Hyun Gyu Kim, Byung Gueon Min, Heui Lee
-
Patent number: 6631464
Abstract: An instruction fetch control system prefetches a branch instruction in a pipeline system and fetches a branch target instruction of the branch instruction. The control system comprises a first branch judgement circuit for conducting a branch condition judgement in a stage prior to the branch judgement stage in which a second and original branch judgement of the branch instruction is conducted, and a circuit for starting a prefetch of instructions following said branch target instruction, without waiting for the branch judgement stage, where the first branch judgement circuit judges that the branch is successful.
Type: Grant
Filed: June 10, 1993
Date of Patent: October 7, 2003
Assignee: Fujitsu Limited
Inventors: Tsuyoshi Mori, Seishi Okada
-
Publication number: 20030182535
Abstract: A processor apparatus which reduces an overhead at the time of switching processing modules and efficiently performs desired processing at a high speed, wherein desired processing is performed by prefetching a series of instructions by a main program prefetcher, pre-decoding the same by a pre-decoder, and supplying the same to a decoder and execution unit via a multiplexer. When an instruction to execute a macro command is detected in the pre-decoder, the instructions of the macro command are prefetched by a macro program prefetcher and pre-decoded in the pre-decoder. As a result, when branching to a macro command, the instructions of the macro command can be immediately supplied to an execution unit only by switching the multiplexer.
Type: Application
Filed: January 7, 2000
Publication date: September 25, 2003
Inventor: Tomohiko Kadowaki
-
Publication number: 20030177315
Abstract: A microprocessor apparatus is provided that enables exclusive prefetch of a cache line from memory. The apparatus includes translation logic and execution logic. The translation logic translates an extended prefetch instruction into a micro instruction sequence that directs a microprocessor to prefetch a cache line in an exclusive state. The execution logic is coupled to the translation logic. The execution logic receives the micro instruction sequence, and issues a transaction over a memory bus that requests the cache line in the exclusive state.
Type: Application
Filed: February 11, 2003
Publication date: September 18, 2003
Applicant: IP-First LLC
Inventor: Rodney Hooker
-
Publication number: 20030159019
Abstract: The present invention provides a data processing apparatus and method for predicting instructions in a data processing apparatus. The data processing apparatus comprises a processor core for executing instructions from any of a plurality of instruction sets, and a prefetch unit for prefetching instructions from a memory prior to sending those instructions to the processor core for execution. Further, prediction logic is used to predict which instructions should be prefetched by the prefetch unit, the prediction logic being arranged to review a prefetched instruction to predict whether execution of that prefetched instruction will cause a change in instruction flow, and if so to indicate to the prefetch unit an address within the memory from which a next instruction should be retrieved.
Type: Application
Filed: February 20, 2002
Publication date: August 21, 2003
Inventors: William Henry Oldfield, David Vivian Jaggar
-
Patent number: 6606617
Abstract: A method, apparatus, and article of manufacture for a computer implemented technique for prefetching pages. Pages are prefetched from a database stored on a data storage device connected to a computer. Pages to be retrieved are identified. Identifiers for the identified pages are stored in multiple prefetch page lists. Concurrently, the retrieved pages are processed and prefetch commands are issued to alternating multiple prefetch page lists.
Type: Grant
Filed: May 28, 1999
Date of Patent: August 12, 2003
Assignee: International Business Machines Corporation
Inventors: Charles Roy Bonner, Robert William Lyle
-
Patent number: 6606701
Abstract: There is provided a micro-processor including (a) a pre-fetch cue FIFO which fetches and stores therein a command code, (b) a pre-fetch cue valid signal indicating that an effective command code is stored in the pre-fetch cue FIFO, (c) an access priority judging circuit receiving a pre-fetch request signal indicating that there is vacancy in the pre-fetch cue FIFO, a cue empty signal indicating that the pre-fetch cue FIFO is entirely empty, and an operand data request signal indicating that there has been generated an operand data access, and determining a kind of next bus access, (d) a bus state control circuit transmitting a bus interface signal, based on the kind of next bus access having been determined by the access priority judging circuit, and also transmitting a burst transfer signal indicating that a memory is in a condition for carrying out burst transfer, and (e) an access register storing data about the previous bus access.
Type: Grant
Filed: November 24, 1999
Date of Patent: August 12, 2003
Assignee: NEC Electronics Corporation
Inventor: Masashi Tsubota
-
Patent number: 6606689
Abstract: A video game system includes an audio digital signal processor, a main memory, and an audio memory separate from the main memory and storing audio-related data for processing by the audio digital signal processor. Memory access circuitry reads non-audio-related data stored on a mass storage device and writes the non-audio-related data to the audio memory. The non-audio-related data is later read from the audio memory and written to the main memory.
Type: Grant
Filed: November 28, 2000
Date of Patent: August 12, 2003
Assignee: Nintendo Co., Ltd.
Inventors: Howard H. Cheng, Dan Shimizu, Genyo Takeda
-
Patent number: 6604191
Abstract: An instruction fetching system (and/or architecture) which may be utilized by a high-frequency short-pipeline microprocessor for efficient fetching of both in-line and target instructions. The system contains an instruction fetching unit (IFU), having control logic and associated components for controlling a specially designed instruction cache (I-cache). The I-cache is a sum-address cache, i.e., it receives two address inputs, which are combined by a decoder to provide the address of the line of instructions desired to be fetched. The I-cache is designed with an array of cache lines that can contain 32 instructions, and three buffers that each have a capacity of 32 instructions.
Type: Grant
Filed: February 4, 2000
Date of Patent: August 5, 2003
Assignee: International Business Machines Corporation
Inventors: Brian King Flacks, David Meltzer, Joel Abraham Silberman
-
Patent number: 6604190Abstract: A data address prediction structure for a superscalar microprocessor is provided. The data address prediction structure predicts a data address that a group of instructions is going to access while that group of instructions is being fetched from the instruction cache. The data bytes associated with the predicted address are placed in a relatively small, fast buffer. The decode stages of instruction processing pipelines in the microprocessor access the buffer with addresses generated from the instructions, and if the associated data bytes are found in the buffer they are conveyed to the reservation station associated with the requesting decode stage. Therefore, the implicit memory read associated with an instruction is performed prior to the instruction arriving in a functional unit. The functional unit is occupied by the instruction for a fewer number of clock cycles, since it need not perform the implicit memory operation.Type: GrantFiled: June 7, 1995Date of Patent: August 5, 2003Assignee: Advanced Micro Devices, Inc.Inventor: Thang M. Tran
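The predict-while-fetching scheme in this abstract can be modeled with a toy class. The table, buffer, and method names below are invented for illustration; the patented structure operates at the microarchitectural level, not in software.

```python
class DataAddressPredictor:
    """Toy model: predict a data address while instructions are
    fetched, and stage the data in a small fast buffer that the
    decode stages probe before the functional units run."""

    def __init__(self, memory):
        self.memory = memory      # address -> data bytes (backing store)
        self.predictions = {}     # instruction-fetch address -> data address
        self.buffer = {}          # small, fast data buffer

    def train(self, fetch_addr, data_addr):
        # Record which data address this instruction group accessed.
        self.predictions[fetch_addr] = data_addr

    def on_fetch(self, fetch_addr):
        # While the group is fetched from the I-cache, prefetch the
        # predicted data bytes into the buffer.
        data_addr = self.predictions.get(fetch_addr)
        if data_addr is not None and data_addr in self.memory:
            self.buffer[data_addr] = self.memory[data_addr]

    def on_decode(self, data_addr):
        # Decode stage probes the buffer; a hit means the implicit
        # memory read is already done.
        return self.buffer.get(data_addr)
```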
-
Patent number: 6594730Abstract: An embodiment of the present invention provides a memory controller that includes a plurality of transaction queues and an arbiter, a prefetch cache in communication with the arbiter, and a prefetch queue in communication with the prefetch cache. The prefetch queue also may be provided in communication with each of the transaction queues for the purpose of determining whether the transaction queues are operating in a congested state.Type: GrantFiled: August 3, 1999Date of Patent: July 15, 2003Assignee: Intel CorporationInventors: Herbert H J Hum, Andrew V. Anderson
-
Publication number: 20030110365Abstract: A central processing system can maintain an efficient information reading operation even when a program executed by a central processing unit contains many branch commands. A prefetch queue of the central processing unit reads ahead information expected to be processed next by the central processing unit from a main memory. The function of the prefetch queue is deactivated in accordance with a control signal provided from a prefetch queue control unit. A block transfer function of a cache memory is also deactivated when unnecessary information would otherwise be read from the main memory by the block transfer function.Type: ApplicationFiled: October 20, 1999Publication date: June 12, 2003Inventor: SEIJI SUETAKE
-
Publication number: 20030105939Abstract: A method and apparatus for issuing one or more next-line prefetch requests from a predicted memory address. The first issued next-line prefetch request corresponds to a cache line having a memory address contiguous with the predicted memory address. Any subsequent next-line prefetch request corresponds to a cache line having a memory address contiguous with a memory address associated with a preceding next-line prefetch request.Type: ApplicationFiled: June 5, 2002Publication date: June 5, 2003Inventors: Robert N. Cooksey, Stephan J. Jourdan
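The request generation described in this abstract reduces to address arithmetic: the first prefetch targets the line contiguous with the predicted address, and each later one is contiguous with its predecessor. A minimal sketch, assuming a 64-byte cache line (the line size is not stated in the abstract):

```python
LINE_SIZE = 64  # cache-line size in bytes; an assumption for illustration

def next_line_prefetches(predicted_addr, count):
    """Return the addresses of `count` next-line prefetch requests
    issued from a predicted memory address."""
    base = (predicted_addr // LINE_SIZE) * LINE_SIZE  # align to the line
    # Line base+1*LINE_SIZE is contiguous with the predicted line;
    # every subsequent line is contiguous with the previous request.
    return [base + LINE_SIZE * i for i in range(1, count + 1)]
```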
-
Publication number: 20030105942Abstract: Operations including inserted prefetch operations that correspond to addressing chains may be scheduled above memory access operations that are likely-to-miss, thereby exploiting latency of the “martyred” likely-to-miss operations and improving execution performance of resulting code. More generally, certain pre-executable counterparts of likely-to-stall operations that form dependency chains may be scheduled above operations that are themselves likely-to-stall. Techniques have been developed to perform such scheduling. In particular, techniques have been developed that allow scheduled pre-executable operations (including prefetch operations and speculative loads) to be hoisted above intervening speculation boundaries. Speculative copies of dependency chains are employed in some realizations. Aggressive insertion of prefetch operations (including some used as markers) is employed in some realizations. Techniques for scheduling operations (e.g., in a compiler implementation) are described.Type: ApplicationFiled: November 28, 2001Publication date: June 5, 2003Applicant: Sun Microsystems, Inc.Inventors: Peter C. Damron, Nicolai Kosche
-
Publication number: 20030105940Abstract: A content prefetcher having a prefetch chain reinforcement mechanism. In response to a prefetch hit at a cache line within a prefetch chain, a request depth of the hit cache line is promoted and the hit cache line is scanned for candidate virtual addresses in order to reinforce the prefetch chain.Type: ApplicationFiled: June 5, 2002Publication date: June 5, 2003Inventors: Robert N. Cooksey, Stephan J. Jourdan
-
Publication number: 20030105937Abstract: A content prefetcher including a virtual address predictor. The virtual address predictor identifies candidate virtual addresses in a cache line without reference to an external address source.Type: ApplicationFiled: November 30, 2001Publication date: June 5, 2003Inventors: Robert N. Cooksey, Stephan J. Jourdan
-
Patent number: 6574712Abstract: A data processing system includes a processor having a first level cache and a prefetch engine. Coupled to the processor are a second level cache and a third level cache and a system memory. Prefetching of cache lines is performed into each of the first, second, and third level caches by the prefetch engine. Prefetch requests from the prefetch engine to the second and third level caches are sent over a private prefetch request bus, which is separate from the bus system that transfers data from the various cache levels to the processor. A software instruction is used to accelerate the prefetch process by overriding the normal functionality of the hardware prefetch engine. The instruction also limits the amount of data to be prefetched.Type: GrantFiled: April 14, 2000Date of Patent: June 3, 2003Assignee: International Business Machines CorporationInventors: James Allan Kahle, Michael John Mayfield, Francis Patrick O'Connell, David Scott Ray, Edward John Silha, Joel M. Tendler
-
Patent number: 6571329Abstract: The present invention aims at improving the performance of an information processing apparatus that includes an instruction fetch port. The apparatus can detect the possibility that an instruction fetched from the instruction fetch port will be overwritten. It does so by correctly detecting the length of the instruction sequence already stored in an instruction buffer, which holds instructions fetched before execution as well as instructions still to be determined among those being or already executed, and by correctly detecting the possibility that the contents of an instruction fetched from one instruction port are overwritten. The information processing apparatus comprises: an instruction fetch counter unit for counting the length of an instruction sequence containing all instructions fetched before the last fetched instruction.Type: GrantFiled: March 21, 2000Date of Patent: May 27, 2003Assignee: Fujitsu LimitedInventors: Masaki Ukai, Aiichiro Inoue
-
Patent number: 6567901Abstract: A processor of a system initiates memory read transactions on a bus and provides information regarding the speculative nature of the transaction. A bus device, such as a memory controller, then receives and processes the transaction, placing the request in a queue to be serviced in an order dependent upon the relative speculative nature of the request. In addition, the processor, upon receipt of an appropriate signal, cancels a speculative read that is no longer needed or upgrades a speculative read that has become non-speculative.Type: GrantFiled: February 29, 2000Date of Patent: May 20, 2003Assignee: Hewlett Packard Development Company, L.P.Inventor: E. David Neufeld
-
Patent number: 6564313Abstract: The invention contemplates a system and method for efficient instruction prefetching based on the termination of loops. A computer system contemplated herein may include a semiconductor memory device, a cache memory device and a prefetch unit. The system may also include a memory bus to couple the semiconductor memory device to the prefetch unit. The system may further include a circuit coupled to the memory bus. The circuit may detect a branch instruction within the sequence of instructions, such that the branch instruction may target a loop construct. A circuit may also be contemplated herein. The circuit may include a detector coupled to detect a loop within a sequence of instructions. The circuit may also include one or more counting devices coupled to the detector. A first counting device may count a number of clock cycles associated with a set of instructions within a loop construct.Type: GrantFiled: December 20, 2001Date of Patent: May 13, 2003Assignee: LSI Logic CorporationInventor: Asheesh Kashyap
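The payoff of the cycle counters in this abstract is timing: once the cycles per loop iteration are known, the prefetch unit can start fetching the fall-through (post-loop) instructions early enough that they arrive as the loop terminates. The scheduling arithmetic below is an illustrative sketch under that assumption, not the patented circuit.

```python
import math

def prefetch_trigger_iteration(total_iters, cycles_per_iter, mem_latency):
    """Return the loop iteration at which to start prefetching the
    post-loop instructions, assuming cycles_per_iter was measured by
    the counting device and total_iters is known or predicted.

    The idea: the remaining iterations after the trigger point must
    together take at least mem_latency cycles, hiding the fetch."""
    iters_to_cover_latency = math.ceil(mem_latency / cycles_per_iter)
    return max(0, total_iters - iters_to_cover_latency)
```

For a 100-iteration loop at 10 cycles per iteration and a 35-cycle memory latency, the trigger fires at iteration 96, leaving 40 cycles of loop execution to overlap the fetch.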
-
Publication number: 20030088864Abstract: One embodiment of the present invention provides a system that generates code to perform anticipatory prefetching for data references. During operation, the system receives code to be executed on a computer system. Next, the system analyzes the code to identify data references to be prefetched. This analysis can involve: using a two-phase marking process in which blocks that are certain to execute are considered before other blocks; and analyzing complex array subscripts. Next, the system inserts prefetch instructions into the code in advance of the identified data references. This insertion can involve: dealing with non-constant or unknown stride values; moving prefetch instructions into preceding basic blocks; and issuing multiple prefetches for the same data reference.Type: ApplicationFiled: November 2, 2001Publication date: May 8, 2003Inventors: Partha P. Tirumalai, Spiros Kalogeropulos, Mahadevan Rajagopalan, Yonghong Song, Vikram Rao
-
Publication number: 20030088863Abstract: One embodiment of the present invention provides a system that generates code to perform anticipatory prefetching for data references. During operation, the system receives code to be executed on a computer system. Next, the system analyzes the code to identify data references to be prefetched. This analysis can involve: using a two-phase marking process in which blocks that are certain to execute are considered before other blocks; and analyzing complex array subscripts. Next, the system inserts prefetch instructions into the code in advance of the identified data references. This insertion can involve: dealing with non-constant or unknown stride values; moving prefetch instructions into preceding basic blocks; and issuing multiple prefetches for the same data reference.Type: ApplicationFiled: November 2, 2001Publication date: May 8, 2003Inventors: Partha P. Tirumalai, Spiros Kalogeropulos, Mahadevan Rajagopalan, Yonghong Song, Vikram Rao
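The insertion step shared by the two applications above (placing prefetch instructions in the code ahead of identified data references) can be sketched on a toy intermediate representation. The tuple IR, the `PREFETCH_DISTANCE` constant, and the fixed-stride assumption are all invented for illustration; the applications describe the analysis (two-phase marking, complex subscripts, unknown strides) at the compiler's real IR level.

```python
PREFETCH_DISTANCE = 8  # loop iterations ahead; a tuning assumption

def insert_prefetches(loop_body, stride):
    """Rewrite a toy loop body, a list of (op, array, byte_offset)
    tuples, so each load is preceded by a prefetch of the address the
    loop will touch PREFETCH_DISTANCE iterations later."""
    out = []
    for op, array, offset in loop_body:
        if op == "load":
            # Prefetch ahead of the identified data reference.
            out.append(("prefetch", array, offset + stride * PREFETCH_DISTANCE))
        out.append((op, array, offset))
    return out
```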
-
Patent number: 6560693Abstract: A mechanism is described that prefetches instructions and data into the cache using a branch instruction as a prefetch trigger. The prefetch is initiated if the predicted execution path after the branch instruction matches the previously seen execution path. This match of the execution paths is determined using a branch history queue that records the branch outcomes (taken/not taken) of the branches in the program. For each branch in this queue, a branch history mask records the outcomes of the next N branches and serves as an encoding of the execution path following the branch instruction. The branch instruction along with the mask is associated with a prefetch address (instruction or data address) and is used for triggering prefetches in the future when the branch is executed again. A mechanism is also described to improve the timeliness of a prefetch by suitably adjusting the value of N after observing the usefulness of the prefetched instructions or data.Type: GrantFiled: December 10, 1999Date of Patent: May 6, 2003Assignee: International Business Machines CorporationInventors: Thomas R. Puzak, Allan M. Hartstein, Mark Charney, Daniel A. Prener, Peter H. Oden, Vijayalakshmi Srinivasan
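The trigger condition in this abstract, matching the predicted execution path against the path seen when the prefetch was recorded, can be modeled with bitmasks. A minimal sketch, with N (the number of following branch outcomes encoded in the mask) fixed at 4 for illustration; the patent adjusts N dynamically:

```python
N = 4  # branch outcomes per history mask; fixed here for simplicity

def _mask(outcomes):
    """Pack the taken/not-taken outcomes of the next N branches
    into an integer mask (taken = 1)."""
    m = 0
    for taken in outcomes[:N]:
        m = (m << 1) | int(taken)
    return m

class PathTriggeredPrefetcher:
    """Associate a trigger branch with a history mask and a prefetch
    address; fire the prefetch only when the predicted path matches."""

    def __init__(self):
        self.table = {}  # trigger branch PC -> (mask, prefetch address)

    def record(self, branch_pc, outcomes, prefetch_addr):
        self.table[branch_pc] = (_mask(outcomes), prefetch_addr)

    def on_branch(self, branch_pc, predicted_outcomes):
        entry = self.table.get(branch_pc)
        if entry is None:
            return None
        mask, prefetch_addr = entry
        # Trigger only if the predicted path repeats the recorded path.
        return prefetch_addr if _mask(predicted_outcomes) == mask else None
```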
-
Patent number: 6557095Abstract: A method and apparatus for scheduling jump and store operations using a dependency matrix and for scheduling operations in-order using a dependency matrix. A child operation, such as a jump or store micro-operation, is received for scheduling. The child operation is dependent on the completion of a parent operation, such as when all jump operations in an instruction stream must be executed in-order. An entry corresponding to the child operation is placed in a scheduling queue and the child operation is compared with other entries in the scheduling queue. The result of this comparison is stored in a dependency matrix. Each row in the dependency matrix corresponds to an entry in the scheduling queue, and each column corresponds to a dependency on an entry in the scheduling queue. Entries in the scheduling queue can then be scheduled based on the information in the dependency matrix, such as when the entire row associated with an entry is clear.Type: GrantFiled: December 27, 1999Date of Patent: April 29, 2003Assignee: Intel CorporationInventor: Alexander Henstrom
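The matrix mechanics in this abstract (one row per queue entry, one column per potential parent, schedule when the whole row is clear) can be shown in a few lines. The column-clearing step on schedule is a natural completion assumed here; details beyond the abstract are illustrative.

```python
class DependencyMatrixScheduler:
    """Minimal model of a dependency-matrix scheduling queue: bit j of
    row i is set when entry i depends on entry j still in the queue."""

    def __init__(self, size):
        self.rows = [0] * size        # row i: dependency bits of entry i
        self.valid = [False] * size   # which slots hold queued entries

    def insert(self, slot, depends_on):
        """Place an entry; record its comparison against current entries."""
        self.valid[slot] = True
        row = 0
        for j in depends_on:
            if self.valid[j]:         # only entries still queued matter
                row |= 1 << j
        self.rows[slot] = row

    def schedule_one(self):
        """Schedule one entry whose entire row is clear, then clear its
        column so dependent (child) entries become ready."""
        for i, row in enumerate(self.rows):
            if self.valid[i] and row == 0:
                self.valid[i] = False
                for k in range(len(self.rows)):
                    self.rows[k] &= ~(1 << i)
                return i
        return None                   # nothing ready this cycle
```

For an in-order chain such as a sequence of jumps, each entry simply lists the previous one as its parent, and the scheduler releases them strictly in order.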