Prefetching Patents (Class 712/207)
-
Patent number: 7594096
Abstract: The present invention allows a microprocessor to identify and speculatively execute future load instructions during a stall condition. This allows forward progress to be made through the instruction stream during the stall condition, which would otherwise cause the microprocessor or thread of execution to be idle. The data for such future load instructions can be prefetched from a distant cache or main memory such that when the load instruction is re-executed (non-speculatively) after the stall condition expires, its data will either reside in the L1 cache or be en route to the processor, resulting in a reduced execution latency. When an extended stall condition is detected, load lookahead prefetch is started, allowing speculative execution of instructions that would normally have been stalled.
Type: Grant
Filed: December 5, 2007
Date of Patent: September 22, 2009
Assignee: International Business Machines Corporation
Inventors: Richard James Eickemeyer, Hung Qui Le, Dung Quoc Nguyen, Benjamin Walter Stolt, Brian William Thompto
-
Publication number: 20090235053
Abstract: A system and method for performing register renaming of source registers in a processor having a variable advance instruction window for storing a group of instructions to be executed by the processor, wherein a new instruction is added to the variable advance instruction window when a location becomes available. A tag is assigned to each instruction in the variable advance instruction window. The tag of each instruction to leave the window is assigned to the next new instruction to be added to it. The results of instructions executed by the processor are stored in a temp buffer according to their corresponding tags to avoid output and anti-dependencies. The temp buffer therefore permits the processor to execute instructions out of order and in parallel. Data dependency checks for input dependencies are performed only for each new instruction added to the variable advance instruction window, and register renaming is performed to avoid input dependencies.
Type: Application
Filed: May 26, 2009
Publication date: September 17, 2009
Applicant: Seiko Epson Corporation
Inventors: Trevor A. Deosaran, Sanjiv Garg, Kevin R. Iadonato
-
Publication number: 20090228688
Abstract: A system including a dual function adder is described. In one embodiment, the system includes an adder. The adder is configured, for a first instruction, to determine an address for a hardware prefetch if the first instruction is a hardware prefetch instruction. The adder is further configured, for the first instruction, to determine a value from an arithmetic operation if the first instruction is an arithmetic operation instruction.
Type: Application
Filed: March 4, 2008
Publication date: September 10, 2009
Applicant: QUALCOMM INCORPORATED
Inventors: Ajay Anant Ingle, Erich James Plondke, Lucian Codrescu
-
Patent number: 7587580
Abstract: A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associated with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching continues when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss.
Type: Grant
Filed: February 3, 2005
Date of Patent: September 8, 2009
Assignee: QUALCOMM Incorporated
Inventors: Thomas Andrew Sartorius, Victor Roberts Augsburg, James Norris Dieffenderfer, Jeffrey Todd Bridges, Michael Scott McIlvaine, Rodney Wayne Smith
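The gating policy this abstract describes can be illustrated with a small simulation. A standard 2-bit saturating counter stands in for the weighted predictor; the counter scheme, class name, and method names are assumptions, since the abstract only distinguishes weakly from strongly weighted predictions:

```python
# Sketch of prefetch gating on branch-prediction confidence.
# The 2-bit saturating counter is an assumed predictor model; the
# abstract only requires "weakly" vs "strongly" weighted predictions.

STRONG_NT, WEAK_NT, WEAK_T, STRONG_T = 0, 1, 2, 3

class GatedPrefetcher:
    def __init__(self):
        self.counter = WEAK_NT  # per-branch state, simplified to one branch

    def update(self, taken):
        # Saturating increment/decrement when the branch resolves.
        if taken:
            self.counter = min(self.counter + 1, STRONG_T)
        else:
            self.counter = max(self.counter - 1, STRONG_NT)

    def should_prefetch(self, cache_hit):
        # Strongly weighted predictions always allow prefetching.
        if self.counter in (STRONG_NT, STRONG_T):
            return True
        # Weak prediction: prefetch only if it can be served from the
        # cache, so a misprediction cannot displace good cache data.
        return cache_hit

p = GatedPrefetcher()
print(p.should_prefetch(cache_hit=False))  # False: weak prediction + miss
p.update(taken=False)                      # strengthens to STRONG_NT
print(p.should_prefetch(cache_hit=False))  # True: strong prediction
```

The point of the structure is that fetching never stops for confident predictions; only the low-confidence, cache-missing case pays the stall to save fill-and-flush power.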
-
Publication number: 20090216951
Abstract: A system, method, and computer program product for handling shared cache lines to allow forward progress among processors in a multi-processor environment is provided. A counter and a threshold are provided to a processor of the multi-processor environment, such that the counter is incremented for every exclusive cross interrogate (XI) reject that is followed by an instruction completion, and reset on an exclusive XI acknowledgement. If the XI reject counter reaches a preset threshold value, the processor's pipeline is drained by blocking instruction issue and prefetching attempts, creating a window for an exclusive XI from another processor to be honored, after which normal instruction processing is resumed. Configuring the preset threshold value as a programmable value allows for fine-tuning of system performance.
Type: Application
Filed: February 22, 2008
Publication date: August 27, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Chung-Lung Kevin Shum, Charles F. Webb
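The counter-and-threshold protocol is simple enough to sketch directly. The class and method names below are illustrative, not from the patent:

```python
# Sketch of the XI-reject throttle: count exclusive cross-interrogate
# (XI) rejects that are each followed by an instruction completion, and
# drain the pipeline once a programmable threshold is reached.

class XIRejectThrottle:
    def __init__(self, threshold=3):
        self.threshold = threshold  # programmable, for performance tuning
        self.rejects = 0
        self.draining = False

    def on_xi_reject_then_completion(self):
        self.rejects += 1
        if self.rejects >= self.threshold:
            # Block instruction issue and prefetching: this opens a
            # window in which another processor's exclusive XI is honored.
            self.draining = True

    def on_xi_acknowledge(self):
        # An exclusive XI was honored; reset and resume normal processing.
        self.rejects = 0
        self.draining = False

t = XIRejectThrottle(threshold=2)
t.on_xi_reject_then_completion()
print(t.draining)  # False: still below threshold
t.on_xi_reject_then_completion()
print(t.draining)  # True: pipeline drains, issue and prefetch blocked
t.on_xi_acknowledge()
print(t.draining)  # False: counter reset, normal processing resumes
```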
-
Publication number: 20090217004
Abstract: A prefetch bit (126) is associated with a cache block (125) of a cache (120), and the management (130) of cache-prefetch operations is based on the state of this bit (126). Further efficiencies are gained by allowing each application to identify memory areas (115) within which regularly repeating memory accesses are likely, such as frame memory in a video application. For each of these memory areas (115), the application also identifies a likely stride value, such as the line length of the data in the frame memory. Prefetching is limited to the identified areas (115), and the prefetch bit (126) is used to identify blocks (125) from these areas and to limit repeated cache hit/miss determinations.
Type: Application
Filed: November 15, 2005
Publication date: August 27, 2009
Applicant: Koninklijke Philips Electronics N.V.
Inventors: Jan-Willem Van De Waerdt, Jean-Paul Vanitegem
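The area-plus-stride idea can be sketched as follows. The data structures and names are assumptions; the abstract specifies only that prefetching is confined to application-registered areas, each with its own stride:

```python
# Sketch of application-identified prefetch areas with per-area strides.

class AreaStridePrefetcher:
    def __init__(self):
        self.areas = []  # (start, end, stride) registered by the application

    def register_area(self, start, end, stride):
        # e.g. a video frame buffer, with stride = line length in bytes
        self.areas.append((start, end, stride))

    def prefetch_target(self, addr):
        # Prefetch only inside an identified area; elsewhere do nothing.
        for start, end, stride in self.areas:
            if start <= addr < end:
                target = addr + stride
                return target if target < end else None
        return None

p = AreaStridePrefetcher()
p.register_area(0x1000, 0x2000, 0x100)   # frame memory, 256-byte lines
print(hex(p.prefetch_target(0x1400)))    # 0x1500: next line of the frame
print(p.prefetch_target(0x8000))         # None: outside all areas
```

The per-block prefetch bit in the abstract would then mark blocks already handled, so repeated hit/miss checks for the same block are skipped; that bookkeeping is omitted here.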
-
Publication number: 20090210663
Abstract: A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associated with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching continues when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss.
Type: Application
Filed: May 4, 2009
Publication date: August 20, 2009
Applicant: QUALCOMM INCORPORATED
Inventors: Thomas Andrew Sartorius, Victor Roberts Augsburg, James Norris Dieffenderfer, Jeffrey Todd Bridges, Michael Scott McIlvaine, Rodney Wayne Smith
-
Publication number: 20090210662
Abstract: A microprocessor equipped to provide hardware initiated prefetching includes at least one architecture for performing: issuance of a prefetch instruction; writing of a prefetch address into a prefetch fetch address register (PFAR); attempting a prefetch according to the address; detecting one of a cache miss and a cache hit; if there is a cache miss, sending a miss request to a next cache level and attempting cache access in a non-busy cycle; and if there is a cache hit, incrementing the address in the PFAR and completing the prefetch. A method and a computer program product are provided.
Type: Application
Filed: February 15, 2008
Publication date: August 20, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: David A. Schroter, Mark S. Farrell, Jennifer Navarro, Chung-Lung Kevin Shum, Charles F. Webb
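The PFAR-driven sequence can be sketched as one step function. The line size and dictionary representation are assumptions:

```python
# Sketch of one PFAR-driven prefetch attempt: probe the cache, send a
# miss request to the next cache level on a miss, or advance the PFAR
# address on a hit. LINE (the cache-line size) is an assumed value.

LINE = 128

def pfar_step(pfar, cache, miss_requests):
    """One prefetch attempt against the address held in the PFAR."""
    addr = pfar["addr"]
    if addr in cache:                  # cache hit
        pfar["addr"] = addr + LINE     # increment PFAR; prefetch completes
        return "hit"
    miss_requests.append(addr)         # miss: request the next cache level
    return "miss"

pfar = {"addr": 0x4000}
cache = {0x4000}
misses = []
print(pfar_step(pfar, cache, misses))  # hit: PFAR advances to 0x4080
print(pfar_step(pfar, cache, misses))  # miss: 0x4080 requested from L2
```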
-
Patent number: 7577947
Abstract: Methods and apparatus to dynamically insert prefetch instructions are disclosed. In an example method, one or more samples associated with cache misses are identified from a performance monitoring unit in a processor system. Based on sample information associated with the one or more samples, delinquent information is generated. To dynamically insert one or more prefetch instructions, a prefetch point is identified based on the delinquent information.
Type: Grant
Filed: December 19, 2003
Date of Patent: August 18, 2009
Assignee: Intel Corporation
Inventors: Sreenivas Subramoney, Mauricio J. Serrano, Richard L. Hudson, Ali-Reza Adl-Tabatabai
-
Publication number: 20090204791
Abstract: A method and apparatus for forming compound issue groups containing instructions from multiple cache lines of instructions are provided. By pre-fetching instruction lines containing instructions targeted by a conditional branch statement, if it is predicted that the conditional branch will be taken, a compound issue group may be formed with instructions from the I-line containing the branch statement and the I-line containing instructions targeted by the branch.
Type: Application
Filed: February 12, 2008
Publication date: August 13, 2009
Inventor: David A. Luick
-
Publication number: 20090198965
Abstract: According to a method of data processing, a memory controller receives a prefetch load request from a processor core of a data processing system. The prefetch load request specifies a requested line of data. In response to receipt of the prefetch load request, the memory controller determines by reference to a stream of demand requests how much data is to be supplied to the processor core in response to the prefetch load request. In response to the memory controller determining to provide less than all of the requested line of data, the memory controller provides less than all of the requested line of data to the processor core.
Type: Application
Filed: February 1, 2008
Publication date: August 6, 2009
Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
-
Publication number: 20090198964
Abstract: A method, system, and computer program product are provided for verifying out of order instruction address (IA) stride prefetch performance in a processor design having more than one level of cache hierarchy. Multiple instruction streams are generated, and the instructions loop back to corresponding instruction addresses. The multiple instruction streams are dispatched to a processor and a simulation application to process. When a particular instruction is being dispatched, that instruction's instruction address and operand address are recorded in the queue. The processor is monitored to determine if it executes fetch and prefetch commands in accordance with the simulation application. A check determines whether prefetch commands are issued for instructions having three or more strides.
Type: Application
Filed: January 31, 2008
Publication date: August 6, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Wei-Yi Xiao, Dean G. Bair, Christopher A. Krygowski, Chung-Lung K. Shum
-
Patent number: 7562192
Abstract: An apparatus in a microprocessor for selectively retiring a prefetched cache line is disclosed. The microprocessor includes a prefetch buffer that stores a cache line prefetched from a system memory coupled to the microprocessor. The microprocessor also includes a cache memory, comprising an array of storage elements for storing cache lines, indexed by an index input. One of the storage elements of the array indexed by an index portion of an address of the prefetched cache line stored in the prefetch buffer is storing a replacement candidate line for the prefetched cache line. The microprocessor also includes control logic that determines whether the replacement candidate line in the cache memory is invalid, and if so, replaces the replacement candidate line in the one of the storage elements with the prefetched cache line from the prefetch buffer.
Type: Grant
Filed: November 27, 2006
Date of Patent: July 14, 2009
Assignee: Centaur Technologies
Inventors: G. Glenn Henry, Rodney E. Hooker
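The selective-retirement check reduces to a few lines once the indexing is fixed. A direct-mapped cache and the 64-byte line size below are assumptions made for the sketch:

```python
# Sketch of selective retirement: a prefetched line moves from the
# prefetch buffer into the cache set only if the replacement candidate
# at that index is invalid, so no valid data is ever displaced.

NUM_SETS = 4  # assumed tiny direct-mapped cache

class Cache:
    def __init__(self):
        # Each set holds (valid, tag); everything starts invalid.
        self.sets = [(False, None)] * NUM_SETS

def retire_prefetch(cache, addr):
    index = (addr // 64) % NUM_SETS        # index portion of the address
    valid, _ = cache.sets[index]
    if not valid:
        cache.sets[index] = (True, addr)   # fill without displacing data
        return True
    return False                           # line stays in prefetch buffer

c = Cache()
print(retire_prefetch(c, 0x100))  # True: candidate invalid, line retired
print(retire_prefetch(c, 0x500))  # False: same set now valid, line kept
```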
-
Patent number: 7555633
Abstract: Various embodiments of methods and systems for implementing a microprocessor that fetches a group of instructions into instruction cache in response to a corresponding trace being evicted from the trace cache are disclosed. In some embodiments, a microprocessor may include an instruction cache, a trace cache, and a prefetch unit. In response to a trace being evicted from trace cache, the prefetch unit may fetch a line of instructions into instruction cache.
Type: Grant
Filed: November 3, 2003
Date of Patent: June 30, 2009
Assignee: Advanced Micro Devices, Inc.
Inventors: Gregory William Smaus, Mitchell Alsup
-
Publication number: 20090157967
Abstract: A prefetch data machine instruction having an M field performs a function on a cache line of data at an address specified by an operand. The operation comprises either prefetching a cache line of data from memory to a cache, or reducing the access ownership of the cache line in the cache to store-and-fetch or fetch-only, or a combination thereof. The address of the operand is either based on a register value or on the program counter value pointing to the prefetch data machine instruction.
Type: Application
Filed: December 12, 2007
Publication date: June 18, 2009
Applicant: International Business Machines Corporation
Inventors: Dan F. Greiner, Timothy J. Slegel
-
Publication number: 20090138661
Abstract: A computer system and method. In one embodiment, a computer system comprises a processor and a cache memory. The processor executes a prefetch instruction to prefetch a block of data words into the cache memory. In one embodiment, the cache memory comprises a plurality of cache levels. The processor selects one of the cache levels based on a value of a prefetch instruction parameter indicating the temporal locality of data to be prefetched. In a further embodiment, individual words are prefetched from non-contiguous memory addresses. A single execution of the prefetch instruction allows the processor to prefetch multiple blocks into the cache memory. The number of data words in each block, the number of blocks, an address interval between each data word of each block, and an address interval between each block to be prefetched are indicated by parameters of the prefetch instruction.
Type: Application
Filed: November 26, 2007
Publication date: May 28, 2009
Inventor: Gary Lauterbach
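The four parameters named in the abstract fully determine the set of prefetched addresses, which makes the address generation easy to sketch. Parameter names are illustrative:

```python
# Sketch of the address generation implied by the prefetch instruction's
# parameters: words per block, number of blocks, the interval between
# words within a block, and the interval between blocks.

def prefetch_addresses(base, words_per_block, num_blocks,
                       word_interval, block_interval):
    """All addresses touched by one execution of the prefetch instruction."""
    addrs = []
    for b in range(num_blocks):
        block_base = base + b * block_interval
        for w in range(words_per_block):
            addrs.append(block_base + w * word_interval)
    return addrs

# Two blocks of three words each, from non-contiguous addresses:
print([hex(a) for a in prefetch_addresses(0x1000, 3, 2, 8, 0x100)])
# ['0x1000', '0x1008', '0x1010', '0x1100', '0x1108', '0x1110']
```

A single such call corresponds to one instruction execution touching multiple non-contiguous blocks, which is the abstract's key claim.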
-
Patent number: 7533247
Abstract: The present subject matter relates to operation frame filtering, building, and execution. Some embodiments include identifying a frame signature, counting a number of execution occurrences of the frame signature, and building a frame of operations to execute instead of operations identified by the frame signature.
Type: Grant
Filed: December 30, 2005
Date of Patent: May 12, 2009
Assignee: Intel Corporation
Inventors: Stephan Jourdan, Per Hammarlund, Alexandre Farcy, John Alan Miller
-
Patent number: 7533220
Abstract: A microprocessor coupled to a system memory has a memory subsystem with a translation look-aside buffer (TLB) for storing TLB information. The microprocessor also includes an instruction decode unit that decodes an instruction that specifies a data stream in the system memory and an abnormal TLB access policy. The microprocessor also includes a stream prefetch unit that generates a prefetch request to the memory subsystem to prefetch a cache line of the data stream from the system memory into the memory subsystem. If a virtual page address of the prefetch request causes an abnormal TLB access, the memory subsystem selectively aborts the prefetch request based on the abnormal TLB access policy specified in the instruction.
Type: Grant
Filed: August 11, 2006
Date of Patent: May 12, 2009
Assignee: MIPS Technologies, Inc.
Inventor: Keith E. Diefendorff
-
Publication number: 20090119488
Abstract: In one embodiment, a processor comprises a prefetch unit coupled to a data cache. The prefetch unit is configured to concurrently maintain a plurality of separate, active prefetch streams. Each prefetch stream is either software initiated via execution by the processor of a dedicated prefetch instruction or hardware initiated via detection of a data cache miss by one or more load/store memory operations. The prefetch unit is further configured to generate prefetch requests responsive to the plurality of prefetch streams to prefetch data into the data cache.
Type: Application
Filed: January 7, 2009
Publication date: May 7, 2009
Inventors: Sudarshan Kadambi, Puneet Kumar, Po-Yung Chang
-
Patent number: 7530063
Abstract: A method and system of modifying instructions forming a loop is provided. The method includes: determining static and dynamic characteristics for the instructions; selecting a modification factor for the instructions based on a number of separate equivalent sections forming a cache in a processor which is processing the instructions; and modifying the instructions to interleave the instructions in the loop according to the modification factor and the static and dynamic characteristics when the instructions satisfy a modification criterion based on the static and dynamic characteristics.
Type: Grant
Filed: May 27, 2004
Date of Patent: May 5, 2009
Assignee: International Business Machines Corporation
Inventors: Roch Georges Archambault, Robert James Blainey, Yaoqing Gao, John David McCalpin, Francis Patrick O'Connell, Pascal Vezolle, Steven Wayne White
-
Patent number: 7529911
Abstract: One embodiment of the present invention provides a system that improves the effectiveness of prefetching during execution of instructions in scout mode. Upon encountering a non-data dependent stall condition, the system performs a checkpoint and commences execution of instructions in scout mode, wherein instructions are speculatively executed to prefetch future memory operations, but wherein results are not committed to the architectural state of a processor. When the system executes a load instruction during scout mode, if the load instruction causes a lower-level cache miss, the system allows the load instruction to access a higher-level cache. Next, the system places the load instruction and subsequent dependent instructions into a deferred queue, and resumes execution of the program in scout mode.
Type: Grant
Filed: May 26, 2005
Date of Patent: May 5, 2009
Assignee: Sun Microsystems, Inc.
Inventors: Lawrence A. Spracklen, Yuan C. Chou, Santosh G. Abraham
-
Patent number: 7519777
Abstract: Methods, systems and computer program products for concomitant pair prefetching. Exemplary embodiments include a method for concomitant pair prefetching, the method including detecting a stride pattern, detecting an indirect access pattern to define an access window, and prefetching candidates within the defined access window, wherein the prefetching comprises obtaining prefetch addresses from a history table, updating a miss stream window, selecting a candidate of a concomitant pair from the miss stream window, producing an index from the candidate pair, accessing an aging filter, updating the history table, and selecting another concomitant pair candidate from the miss stream window.
Type: Grant
Filed: June 11, 2008
Date of Patent: April 14, 2009
Assignee: International Business Machines Corporation
Inventors: Kattamuri Ekanadham, Il Park, Pratap C. Pattnaik
-
Publication number: 20090094440
Abstract: A pre-fetch circuit of a semiconductor memory apparatus can carry out a high-frequency operating test through a low-frequency channel of test equipment. The pre-fetch circuit of a semiconductor memory apparatus can include: a pre-fetch unit for pre-fetching data bits in a first predetermined number; a plurality of registers provided in the first predetermined number, each of which latches a data in order or a data out of order of the pre-fetched data in response to different control signals; and a control unit for selectively activating the different control signals in response to a test mode signal, whereby some of the registers latch the data out of order.
Type: Application
Filed: July 18, 2008
Publication date: April 9, 2009
Applicant: HYNIX SEMICONDUCTOR, INC.
Inventor: Young-Ju Kim
-
Patent number: 7516312
Abstract: An instruction prefetch apparatus includes a branch target buffer (BTB), a presbyopic target buffer (PTB) and a prefetch stream buffer (PSB). The BTB includes records that map branch addresses to branch target addresses, and the PTB includes records that map branch target addresses to subsequent branch target addresses. When a branch instruction is encountered, the BTB can predict the dynamically adjacent subsequent block entry location as the branch target address in the record that also includes the branch instruction address. The PTB can predict multiple subsequent blocks by mapping the branch target address to subsequent dynamic blocks. The PSB holds instructions prefetched from subsequent blocks predicted by the PTB.
Type: Grant
Filed: April 2, 2004
Date of Patent: April 7, 2009
Assignee: Intel Corporation
Inventors: Hong Wang, Ralph Kling, Edward T. Grochowski, Kalpana Ramakrishnan
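The chaining of the two tables can be sketched with plain dictionaries. The table shapes and the `depth` limit are assumptions for illustration:

```python
# Sketch of chaining the BTB and PTB: the BTB maps a branch address to
# its target, and the PTB maps that target to subsequent expected
# targets, letting the prefetch stream buffer run several blocks ahead.

btb = {0x400: 0x900}           # branch address -> branch target
ptb = {0x900: [0xC00, 0xF00]}  # branch target  -> subsequent targets

def predict_blocks(branch_addr, depth=2):
    """Return the predicted chain of upcoming block addresses."""
    target = btb.get(branch_addr)
    if target is None:
        return []                       # no BTB record for this branch
    chain = [target]
    chain += ptb.get(target, [])[:depth]  # PTB extends the prediction
    return chain

print([hex(a) for a in predict_blocks(0x400)])
# ['0x900', '0xc00', '0xf00']: blocks for the prefetch stream buffer
```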
-
Patent number: 7516278
Abstract: A system controller, which executes a speculative fetch from a memory before determining whether data requested for a memory fetch request is in a cache by searching tag information of the cache, includes a consumption determining unit that monitors a consumption status of a hardware resource used in the speculative fetch, and determines whether a consumption of the hardware resource exceeds a predetermined value; and a speculative-fetch issuing unit that stops issuing the speculative fetch when the consumption determining unit determines that the consumption of the hardware resource exceeds the predetermined value.
Type: Grant
Filed: December 1, 2004
Date of Patent: April 7, 2009
Assignee: Fujitsu Limited
Inventors: Akira Watanabe, Go Sugizaki, Shigekatsu Sagi, Masahiro Mishima
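The throttle logic can be sketched as follows; the resource model (a count of in-flight speculative requests) and all names are assumptions, since the abstract leaves the monitored resource abstract:

```python
# Sketch of consumption-gated speculative fetch: the memory fetch is
# issued in parallel with the cache tag search only while a monitored
# hardware resource stays under a predetermined limit.

class SpeculativeFetcher:
    def __init__(self, limit):
        self.limit = limit     # predetermined consumption threshold
        self.in_flight = 0     # resource consumed by speculative fetches

    def request(self):
        # The tag search proceeds regardless; the only question is
        # whether to start the memory fetch before it resolves.
        if self.in_flight >= self.limit:
            return "tag-search-only"       # speculation suppressed
        self.in_flight += 1
        return "speculative-fetch-issued"

f = SpeculativeFetcher(limit=2)
print(f.request())  # speculative-fetch-issued
print(f.request())  # speculative-fetch-issued
print(f.request())  # tag-search-only: consumption exceeded the limit
```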
-
Patent number: 7516279
Abstract: Computer implemented method, system and computer program product for prefetching data in a data processing system. A computer implemented method for prefetching data in a data processing system includes generating attribute information of prior data streams by associating attributes of each prior data stream with a storage access instruction which caused allocation of the data stream, and then recording the generated attribute information. The recorded attribute information is accessed, and a behavior of a new data stream is modified using the accessed recorded attribute information.
Type: Grant
Filed: February 28, 2006
Date of Patent: April 7, 2009
Assignee: International Business Machines Corporation
Inventors: John Barry Griswell, Jr., Francis Patrick O'Connell
-
Publication number: 20090089548
Abstract: A method for preloading data in a CPU pipeline is provided, which includes the following steps. When a hint instruction is executed, allocate and initiate an entry in a preload table. When a load instruction is fetched, load a piece of data from a memory into the entry according to the entry. When a use instruction which uses the data loaded by the load instruction is executed, forward the data for the use instruction from the entry instead of from the memory. When the load instruction is executed, update the entry according to the load instruction.
Type: Application
Filed: September 27, 2007
Publication date: April 2, 2009
Applicant: FARADAY TECHNOLOGY CORP.
Inventors: I-Jui Sung, Ming-Chung Kao
-
Patent number: 7512740
Abstract: A microprocessor coupled to a system memory by a bus includes an instruction decode unit that decodes an instruction that specifies a data stream in the system memory and a stream prefetch priority. The microprocessor also includes a load/store unit that generates load/store requests to transfer data between the system memory and the microprocessor. The microprocessor also includes a stream prefetch unit that generates a plurality of prefetch requests to prefetch the data stream from the system memory into the microprocessor. The prefetch requests specify the stream prefetch priority. The microprocessor also includes a bus interface unit (BIU) that generates transaction requests on the bus to transfer data between the system memory and the microprocessor in response to the load/store requests and the prefetch requests. The BIU prioritizes the bus transaction requests for the prefetch requests relative to the bus transaction requests for the load/store requests based on the stream prefetch priority.
Type: Grant
Filed: August 11, 2006
Date of Patent: March 31, 2009
Assignee: MIPS Technologies, Inc.
Inventor: Keith E. Diefendorff
-
Patent number: 7511851
Abstract: The network printer obtains information from a Web page over the network, and stores this obtained information together with a URL thereof and the time when this information was received into the storage. When a print request from the outside via a console panel is made, the network printer reads data of an assigned URL from the storage, and prints this data using the printer engine.
Type: Grant
Filed: December 28, 2005
Date of Patent: March 31, 2009
Assignee: Ricoh Company, Ltd.
Inventor: Hiroyuki Matsushima
-
Patent number: 7509472
Abstract: Address translation for instruction fetching can be obviated for sequences of instruction instances that reside on a same page. Obviating address translation reduces power consumption and increases pipeline efficiency since accessing of an address translation buffer can be avoided. Certain events, such as branch mis-predictions and exceptions, can be designated as page boundary crossing events. In addition, carry over at a particular bit position when computing a branch target or a next instruction instance fetch target can also be designated as a page boundary crossing event. An address translation buffer is accessed to translate an address representation of a first instruction instance. However, until a page boundary crossing event occurs, the address representations of subsequent instruction instances are not translated. Instead, the translated portion of the address representation for the first instruction instance is recycled for the subsequent instruction instances.
Type: Grant
Filed: February 1, 2006
Date of Patent: March 24, 2009
Assignee: Sun Microsystems, Inc.
Inventors: Paul Caprioli, Shailender Chaudhry
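The carry-based boundary check can be illustrated with a short simulation. The 4 KiB page size is an assumption, and only the sequential-fetch case is modeled (branch mispredictions and exceptions, which the patent also designates as crossing events, are omitted):

```python
# Sketch of recycling a translation until a page-boundary-crossing
# event: the translation buffer is consulted once, and sequential
# fetches reuse that translation until the next fetch address leaves
# the page. 4 KiB pages (12 offset bits) are assumed.

PAGE_BITS = 12

def crosses_page(addr, next_addr):
    # A carry past bit 11 means the fetch left the translated page.
    return (addr >> PAGE_BITS) != (next_addr >> PAGE_BITS)

tlb_lookups = 0
prev = None
for fetch in (0x1FF0, 0x1FF8, 0x2000):   # sequential 8-byte fetches
    if prev is None or crosses_page(prev, fetch):
        tlb_lookups += 1                 # translate only on a crossing
    prev = fetch
print(tlb_lookups)  # 2: one initial lookup, one at the 0x2000 boundary
```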
-
Patent number: 7509459
Abstract: A microprocessor has a plurality of stream prefetch engines for prefetching a respective data stream from the system memory into the microprocessor cache memory and an instruction decoder that decodes instructions of the microprocessor instruction set. The instruction set includes a stream prefetch instruction that returns an identifier uniquely associating a data stream specified by the instruction with one of the engines. The instruction set also includes an explicit prefetch-triggering load instruction that specifies a stream identifier returned by a previously executed stream prefetch instruction. When the decoder decodes a conventional load instruction it does not prefetch; however, when it decodes an explicit prefetch-triggering load instruction it commences prefetching the specified data stream. In one embodiment, an indicator of the load instruction may explicitly specify non-prefetch-triggering.
Type: Grant
Filed: October 13, 2006
Date of Patent: March 24, 2009
Assignee: MIPS Technologies, Inc.
Inventor: Keith E. Diefendorff
-
Publication number: 20090077350
Abstract: A method, system and computer program for modifying an executing application, comprising monitoring the executing application to identify at least one of a hot load instruction, a hot store instruction and an active prefetch instruction that contributes to cache congestion; where the monitoring identifies a hot load instruction, enabling at least one prefetch associated with the hot load instruction; where the monitoring identifies a hot store instruction, enabling at least one prefetch associated with the hot store instruction; and where the monitoring identifies an active prefetch instruction that contributes to cache congestion, one of disabling the active prefetch instruction and reducing the effectiveness of the active prefetch instruction.
Type: Application
Filed: May 29, 2007
Publication date: March 19, 2009
Inventors: Sujoy Saraswati, Teresa Johnson
-
Publication number: 20090077321
Abstract: A microprocessor coupled to a system memory by a bus includes an instruction decode unit that decodes an instruction that specifies a data stream in the system memory and a stream prefetch priority. The microprocessor also includes a load/store unit that generates load/store requests to transfer data between the system memory and the microprocessor. The microprocessor also includes a stream prefetch unit that generates a plurality of prefetch requests to prefetch the data stream from the system memory into the microprocessor. The prefetch requests specify the stream prefetch priority. The microprocessor also includes a bus interface unit (BIU) that generates transaction requests on the bus to transfer data between the system memory and the microprocessor in response to the load/store requests and the prefetch requests. The BIU prioritizes the bus transaction requests for the prefetch requests relative to the bus transaction requests for the load/store requests based on the stream prefetch priority.
Type: Application
Filed: August 4, 2008
Publication date: March 19, 2009
Applicant: MIPS Technologies, Inc.
Inventor: Keith E. Diefendorff
-
Patent number: 7506106
Abstract: A microprocessor has a data stream prefetch unit for processing a data stream prefetch instruction. The instruction specifies a data stream and a speculative stream hit policy indicator. If a load instruction hits in the data stream, then if the load is non-speculative the stream prefetch unit prefetches a portion of the data stream from system memory into cache memory; however, if the load is speculative the stream prefetch unit selectively prefetches a portion of the data stream from the system memory into the cache memory based on the value of the policy indicator. The load instruction is speculative if it is not guaranteed to complete execution, such as if it follows a predicted branch instruction whose outcome has not yet been finally determined to be correct. In one embodiment, the stream prefetch unit performs a similar function for store instructions that hit in the data stream.
Type: Grant
Filed: October 13, 2006
Date of Patent: March 17, 2009
Assignee: MIPS Technologies, Inc.
Inventor: Keith E. Diefendorff
-
Patent number: 7506105
Abstract: Generating a hashed value of the program counter in a data processing system. The hashed value can be used for prefetching in the data processing system. In some examples, the hashed value is used to identify whether a load instruction associated with the hashed value has an address that is part of a strided stream in an address stream. In some examples, the hashed value is a subset of bits of the bits of the program counter. In other examples, the hashed value may be derived in other ways from the program counter.
Type: Grant
Filed: May 2, 2005
Date of Patent: March 17, 2009
Assignee: Freescale Semiconductor, Inc.
Inventors: Hassan F. Al-Sukhni, James C. Holt, Matt B. Smittle, Michael D. Snyder, Brian C. Grayson
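A minimal sketch of PC-hashed stride detection follows. The table size, the bit-subset hash, and the confirm-on-repeat rule are all assumptions; the patent covers other derivations of the hash as well:

```python
# Sketch of PC-hashed stride detection: a subset of program-counter
# bits indexes a small table of (last address, last stride) entries,
# and a repeated non-zero delta marks the load as part of a strided
# stream, which a prefetcher could then follow.

TABLE_SIZE = 16

def hash_pc(pc):
    return (pc >> 2) & (TABLE_SIZE - 1)   # subset of the PC's bits

table = {}  # hash -> (last_addr, last_stride)

def observe_load(pc, addr):
    """Return True when this load repeats its previous stride."""
    h = hash_pc(pc)
    if h not in table:
        table[h] = (addr, 0)
        return False
    last_addr, last_stride = table[h]
    stride = addr - last_addr
    table[h] = (addr, stride)
    return stride == last_stride and stride != 0

pc = 0x1004
print(observe_load(pc, 0x8000))  # False: first sighting of this load
print(observe_load(pc, 0x8040))  # False: stride 0x40 recorded
print(observe_load(pc, 0x8080))  # True: stride 0x40 repeats -> strided
```

Using only a hash of the PC, rather than the full PC, keeps the table small at the cost of occasional aliasing between different loads.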
-
Publication number: 20090070556
Abstract: In a microprocessor having a load/store unit and prefetch hardware, the prefetch hardware includes a prefetch queue containing entries indicative of allocated data streams. A prefetch engine receives an address associated with a store instruction executed by the load/store unit. The prefetch engine determines whether to allocate an entry in the prefetch queue corresponding to the store instruction by comparing entries in the queue to a window of addresses encompassing multiple cache blocks, where the window of addresses is derived from the received address. The prefetch engine compares entries in the prefetch queue to a window of 2M contiguous cache blocks. The prefetch engine suppresses allocation of a new entry when any entry in the prefetch queue is within the address window. The prefetch engine further suppresses allocation of a new entry when the data address of the store instruction is equal to an address in a border area of the address window.
Type: Application
Filed: January 4, 2008
Publication date: March 12, 2009
Inventors: John Barry Griswell, Jr., Hung Qui Le, Francis Patrick O'Connell, William J. Starke, Jeffrey Adam Stuecheli, Albert Thomas Williams
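The window comparison that gates allocation can be sketched as below. The block size, the value of M, and anchoring the window at the existing entry are assumptions; the abstract's separate border-area rule is omitted:

```python
# Sketch of the allocation filter: a store's address is compared against
# a window of 2**M contiguous cache blocks around each existing stream
# entry, and allocation of a new stream is suppressed on any overlap.

BLOCK = 128   # assumed cache-block size in bytes
M = 2         # window spans 2**M = 4 contiguous cache blocks

def in_window(entry_addr, store_addr):
    entry_blk = entry_addr // BLOCK
    store_blk = store_addr // BLOCK
    return entry_blk <= store_blk < entry_blk + 2**M

def should_allocate(prefetch_queue, store_addr):
    # Allocate only when the store falls outside every entry's window.
    return not any(in_window(e, store_addr) for e in prefetch_queue)

queue = [0x8000]                       # one already-allocated stream
print(should_allocate(queue, 0x8100))  # False: inside the 4-block window
print(should_allocate(queue, 0x9000))  # True: outside every window
```

Widening the comparison from a single block to a multi-block window is what prevents one logical stream of stores from spawning several redundant queue entries.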
-
Patent number: 7502910
Abstract: A sideband scout thread processing technique is provided. The sideband scout thread processing technique utilizes sideband information to identify a subset of processor instructions for execution by a scout thread processor. The sideband information identifies instructions that need to be executed to “warm-up” a cache memory that is shared with a main processor executing the whole set of processor instructions. Thus, the main processor has fewer cache misses and reduced latencies. In one embodiment, a system includes a first processor for executing a sequence of processor instructions, a second processor for executing a subset of the sequence of processor instructions, and a cache shared between the first processor and the second processor. The second processor includes sideband circuitry configured to identify the subset of the sequence of processor instructions to execute according to sideband information associated with the sequence of processor instructions.
Type: Grant
Filed: January 28, 2003
Date of Patent: March 10, 2009
Assignee: Sun Microsystems, Inc.
Inventor: Peter C. Damron
-
Patent number: 7500062
Abstract: A circuit arrangement and method selectively reorder speculatively issued memory read requests being communicated to a lower memory level in a multi-level memory architecture. In particular, a memory read request that has been speculatively issued to a lower memory level prior to completion of a cache lookup operation initiated in a cache memory in a higher memory level may be reordered ahead of at least one previously received and pending request awaiting communication to the lower memory level. By doing so, the latency associated with the memory read request is reduced when the request results in a cache miss in the higher level memory, and as a result, system performance is improved.
Type: Grant
Filed: November 17, 2005
Date of Patent: March 3, 2009
Assignee: International Business Machines Corporation
Inventors: Bruce Leroy Beukema, Michael Bar-Joshua, Alexander Mesh, Shaul Yifrach
-
Patent number: 7500061
Abstract: A preload controller for controlling a bus access device that reads out data from a main memory via a bus and transfers the readout data to a temporary memory, including a first acquiring device to acquire access hint information which represents a data access interval to the main memory, a second acquiring device to acquire system information which represents a transfer delay time in transfer of data via the bus by the bus access device, a determining device to determine a preload unit count based on the data access interval represented by the access hint information and the transfer delay time represented by the system information, and a management device to instruct the bus access device to read out data for the preload unit count from the main memory and to transfer the readout data to the temporary memory ahead of a data access of the data.
Type: Grant
Filed: June 14, 2005
Date of Patent: March 3, 2009
Assignee: Kabushiki Kaisha Toshiba
Inventors: Seiji Maeda, Yusuke Shirota
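One plausible reading of how the determining device could combine its two inputs is to keep enough preload units in flight to cover the bus transfer delay. The formula below is an assumption for illustration, not taken from the patent.

```python
import math

def preload_unit_count(access_interval_us: float,
                       transfer_delay_us: float) -> int:
    """Hypothetical rule: the number of units to preload ahead is the
    transfer delay divided by the data access interval, rounded up,
    so preloaded data arrives before it is accessed."""
    return max(1, math.ceil(transfer_delay_us / access_interval_us))
```

For example, with data accessed every 10 µs and a 35 µs bus delay, the controller would stay 4 units ahead.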
-
Publication number: 20090055333
Abstract: When a patient enters a medical situation, healthcare professionals can use various amounts of information in evaluating the situation. However, different information can be beneficial dependent on the medical situation. Moreover, personnel can historically use specific information types regardless of the situation. An artificial neuron network is employed to pre-fetch information that personnel likely will want prior to a request from the personnel. In addition, the artificial neuron network can be trained based on results of presented information.
Type: Application
Filed: August 22, 2007
Publication date: February 26, 2009
Applicant: Microsoft Corporation
Inventor: Gang Wang
-
Publication number: 20090055628
Abstract: Assigning each of a plurality of memory fetch units to any of a plurality of candidate variables to reduce load-hit-store delays, wherein a total number of required memory fetch units is minimized. A plurality of store/load pairs are identified. A dependency graph is generated by creating a node Nx for each store to variable X and a node Ny for each load of variable Y and, unless X=Y, for each store/load pair, creating an edge between a respective node Nx and a corresponding node Ny; for each created edge, labeling the edge with a heuristic weight; labeling each node Nx with a node weight Wx that combines a plurality of respective edge weights ωxj of a plurality of corresponding nodes Nx such that Wx = Σ ωxj; and determining a color for each of the graph nodes using k distinct colors wherein k is minimized such that no adjacent nodes joined by an edge between a respective node Nx and a corresponding node Ny have an identical color; and assigning a memory fetch unit to each of the k distinct colors.
Type: Application
Filed: August 21, 2007
Publication date: February 26, 2009
Applicant: International Business Machines Corporation
Inventors: Marcel Mitran, Joran S.C. Siu, Alexander Vasilevskiy
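The coloring step can be illustrated with a standard greedy graph coloring: adjacent nodes (a store/load pair) never share a color, and each resulting color maps to one memory fetch unit. The abstract does not specify the coloring algorithm or the edge weighting, so this is only a sketch; node names are illustrative.

```python
from collections import defaultdict

def color_nodes(edges):
    """Greedy coloring of the dependency graph: returns a node→color map
    in which no two nodes joined by an edge share a color."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    colors = {}
    for node in sorted(adj):               # deterministic visit order
        used = {colors[n] for n in adj[node] if n in colors}
        c = 0
        while c in used:                   # smallest color not used by a neighbor
            c += 1
        colors[node] = c
    return colors
```

Greedy coloring does not guarantee the minimum k in general; the patent's claim of minimizing k presumably relies on the heuristic weights to order the nodes.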
-
Patent number: 7496732
Abstract: A method and apparatus for using result-speculative data under run-ahead speculative execution is disclosed. In one embodiment, the uncommitted target data from instructions being run-ahead executed may be saved into an advance data table. This advance data table may be indexed by the lines in the instruction buffer containing the instructions for run-ahead execution. When the instructions are re-executed subsequent to the run-ahead execution, valid target data may be retrieved from the advance data table and supplied as part of a zero-clock bypass to support parallel re-execution. This may achieve parallel execution of dependent instructions. In other embodiments, the advance data table may be content-addressable-memory searchable on target registers and supply target data to general speculative execution.
Type: Grant
Filed: December 17, 2003
Date of Patent: February 24, 2009
Assignee: Intel Corporation
Inventors: Sailesh Kottapalli, Richard W. Goe, Youngsoo Choi
-
Publication number: 20090049277
Abstract: A semiconductor integrated circuit device is provided. The operating frequency generating component generates an operating frequency that serves as the reference timing for synchronizing processing among the circuits when the semiconductor integrated circuit operates. The extracting component extracts a critical path, that is, the slowest path along which a data signal propagates between predetermined terminals inside the semiconductor integrated circuit. The instruction prefetch executing component prefetches an instruction relating to the critical path that has been extracted by the extracting component. The processing configuration changing component changes the processing configuration, using the instruction prefetch executing component, so as to complete transmission of the data signal within a predetermined cycle of the operating frequency when the data signal passes through the critical path.
Type: Application
Filed: June 3, 2008
Publication date: February 19, 2009
Applicant: OKI ELECTRIC INDUSTRY CO., LTD.
Inventor: Koji Muranishi
-
Patent number: 7493621
Abstract: An apparatus, program product and method initiate, in connection with a context switch operation, a prefetch of data likely to be used by a thread prior to resuming execution of that thread. As a result, once it is known that a context switch will be performed to a particular thread, data may be prefetched on behalf of that thread so that when execution of the thread is resumed, more of the working state for the thread is likely to be cached, or at least in the process of being retrieved into cache memory, thus reducing cache-related performance penalties associated with context switching.
Type: Grant
Filed: December 18, 2003
Date of Patent: February 17, 2009
Assignee: International Business Machines Corporation
Inventors: Jeffrey Powers Bradford, Harold F. Kossman, Timothy John Mullins
-
Patent number: 7493447
Abstract: Methods and related computer program products, systems, and devices for using a NAND flash as a program ROM are disclosed.
Type: Grant
Filed: May 3, 2006
Date of Patent: February 17, 2009
Assignee: Nuvoton Technology Corporation
Inventor: Yi-Hsien Chuang
-
Patent number: 7490210
Abstract: A system and method are described for a memory management processor which, using a table of reference addresses embedded in the object code, can open the appropriate memory pages to expedite the retrieval of information from memory referenced by instructions in the execution pipeline. A suitable compiler parses the source code and collects references to branch addresses, calls to other routines, or data references, and creates reference tables listing the addresses for these references at the beginning of each routine. These tables are received by the memory management processor as the instructions of the routine are beginning to be loaded into the execution pipeline, so that the memory management processor can begin opening memory pages where the referenced information is stored. Opening the memory pages where the referenced information is located before the instructions reach the instruction processor helps lessen memory latency delays which can greatly impede processing performance.
Type: Grant
Filed: September 30, 2005
Date of Patent: February 10, 2009
Assignee: Micron Technology, Inc.
Inventor: Dean A. Klein
-
Patent number: 7487296
Abstract: A multi-stride prefetcher includes a recurring prefetch table that in turn includes a stream table and an index table. The stream table includes a valid field and a tag field. The stream table also includes a thread number field to help support multi-threaded processor cores. The tag field stores a tag from an address associated with a cache miss. The index table includes fields for storing information characterizing a state machine. The fields include a learning bit. The multi-stride prefetcher prefetches data into a cache for a plurality of streams of cache misses, each stream having a plurality of strides.
Type: Grant
Filed: February 17, 2005
Date of Patent: February 3, 2009
Assignee: Sun Microsystems, Inc.
Inventors: Sorin Iacobovici, Sudarshan Kadambi, Yuan C. Chou
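A toy model of one per-stream entry with a learning bit: while learning, it records the strides between miss addresses; once a short pattern is captured, it clears the learning bit and predicts the next miss address. The fixed pattern length and all names here are invented for illustration; the patent's state machine and table layout are more elaborate.

```python
class MultiStrideEntry:
    """Hypothetical per-stream state machine with a learning bit."""
    def __init__(self, pattern_len: int = 2):
        self.last_addr = None
        self.strides = []            # strides observed so far
        self.pattern_len = pattern_len
        self.learning = True         # the abstract's learning bit
        self.pos = 0                 # position within the learned pattern

    def miss(self, addr: int):
        """Record a cache-miss address; return a predicted prefetch
        address once the stride pattern is learned, else None."""
        if self.last_addr is not None:
            self.strides.append(addr - self.last_addr)
            if self.learning and len(self.strides) >= self.pattern_len:
                self.learning = False
        self.last_addr = addr
        if self.learning:
            return None
        pattern = self.strides[:self.pattern_len]
        nxt = pattern[self.pos % self.pattern_len]
        self.pos += 1
        return addr + nxt
```

For a stream alternating strides of 8 and 56 bytes (e.g. walking a field through an array of structs), the entry predicts correctly after two observed strides.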
-
Publication number: 20090031111
Abstract: A method, apparatus, and computer program product dynamically select compiled instructions for execution. Static instructions for execution on a first execution unit and dynamic instructions for execution on a second execution unit are received. The throughput performance of the static instructions and the dynamic instructions is evaluated based on current states of the execution units. The static instructions or the dynamic instructions are selected for execution at runtime on the first execution unit or the second execution unit, respectively, based on the throughput performance of the instructions.
Type: Application
Filed: July 26, 2007
Publication date: January 29, 2009
Inventors: Deanna J. Chou, Jesse E. Craig, John Sargis, JR., Daneyand J. Singley, Sebastian T. Ventrone
-
Publication number: 20090031112
Abstract: A system and method are provided for stacking global variables associated with a plurality of tools. The method includes loading a first tool global variable into a memory and executing a first tool of a computer application, the computer application configured to automate human resource processes. The method includes responsive to a call to execute a second tool of the computer application, pushing the first tool global variable onto a stack. The method includes loading a second tool global variable into the memory and executing the second tool. The method includes responsive to completing execution of the second tool, popping the first tool global variable off the stack and loading the first tool global variable back into the memory.
Type: Application
Filed: July 27, 2007
Publication date: January 29, 2009
Applicant: SAP AG
Inventors: Christian Behrens, Steffen Rotsch, Martin Scholz
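The push/pop flow the abstract walks through can be sketched with an ordinary stack; the class and method names below are hypothetical, not from the publication.

```python
class ToolRunner:
    """Sketch: the current tool's globals occupy 'memory'; calling
    another tool pushes the caller's globals onto a stack, and
    completion pops them back."""
    def __init__(self):
        self.memory = None   # globals of the currently executing tool
        self.stack = []

    def run_tool(self, tool_globals, body=lambda: None):
        if self.memory is not None:
            self.stack.append(self.memory)    # push caller's globals
        self.memory = tool_globals            # load this tool's globals
        body()                                # execute the tool
        if self.stack:
            self.memory = self.stack.pop()    # restore caller's globals
        else:
            self.memory = None

# Usage: tool 1 calls tool 2, then sees its own globals restored.
seen = []
r = ToolRunner()
r.run_tool({"tool": 1},
           body=lambda: (seen.append(r.memory["tool"]),
                         r.run_tool({"tool": 2},
                                    body=lambda: seen.append(r.memory["tool"])),
                         seen.append(r.memory["tool"])))
# seen == [1, 2, 1]
```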
-
Patent number: 7484041
Abstract: Systems and methods for improving the performance of a multiprocessor system by enabling a first processor to initiate the retrieval of data and the storage of the data in the cache memory of a second processor. One embodiment comprises a system having a plurality of processors coupled to a bus, where each processor has a corresponding cache memory. The processors are configured so that a first one of the processors can issue a preload command directing a target processor to load data into the target processor's cache memory. The preload command may be issued in response to a preload instruction in program code, or in response to an event. The first processor may include an explicit identifier of the target processor in the preload command, or the selection of the target processor may be left to another agent, such as an arbitrator coupled to the bus.
Type: Grant
Filed: April 4, 2005
Date of Patent: January 27, 2009
Assignee: Kabushiki Kaisha Toshiba
Inventor: Takashi Yoshikawa