Prefetching Patents (Class 712/207)
  • Patent number: 8484437
    Abstract: A data processing apparatus includes a pre-fetch unit configured to divide and store data, a validation setting unit configured to store information regarding whether or not the data stored in the pre-fetch unit are valid, an address generation unit configured to generate an address for reading/storing the data from/in the pre-fetch unit, and a pre-fetch control unit configured to control a storage position of the data in the pre-fetch unit by using the address and information of the address generation unit and the validation setting unit.
    Type: Grant
    Filed: September 7, 2010
    Date of Patent: July 9, 2013
    Assignee: Hynix Semiconductor
    Inventor: Seok-In Kim
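As an aside, a minimal C sketch of the mechanism this abstract describes: a divided prefetch buffer whose per-slot valid bits gate reads. The slot-selection rule, sizes, and all names below are illustrative assumptions, not details from the patent.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SLOTS 8  /* hypothetical number of divided storage slots */

typedef struct {
    uint32_t data[SLOTS];  /* pre-fetch unit: divided data storage */
    bool     valid[SLOTS]; /* validation setting unit: per-slot valid bits */
} prefetch_buffer;

/* Hypothetical address generation: low-order address bits pick the slot. */
static unsigned slot_for(uint32_t addr) { return (addr / 4) % SLOTS; }

/* Prefetch control: store the data and mark the slot valid. */
static void pb_store(prefetch_buffer *pb, uint32_t addr, uint32_t value) {
    unsigned s = slot_for(addr);
    pb->data[s] = value;
    pb->valid[s] = true;
}

/* A read succeeds only when the validation unit says the slot is valid. */
static bool pb_read(const prefetch_buffer *pb, uint32_t addr, uint32_t *out) {
    unsigned s = slot_for(addr);
    if (!pb->valid[s]) return false;
    *out = pb->data[s];
    return true;
}

int main(void) {
    prefetch_buffer pb = {0};
    uint32_t v;
    pb_store(&pb, 0x100, 0xDEADBEEF);
    printf("hit=%d\n", pb_read(&pb, 0x100, &v)); /* hit=1 */
    printf("hit=%d\n", pb_read(&pb, 0x104, &v)); /* hit=0: slot not valid */
    return 0;
}
```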
  • Patent number: 8484436
    Abstract: A memory controller is configured to receive read requests from a processor and return memory words from memory. The memory controller comprises an address comparator and a loop entry cache. The address comparator is configured to determine a difference between a previous read request address and a current read request address. The address comparator is also configured to determine whether the difference is positive and less than a certain address difference and, if so, indicate a limited backwards jump. The loop entry cache is configured to store a current memory word for the current read request address when the address comparator indicates a limited backwards jump.
    Type: Grant
    Filed: September 2, 2010
    Date of Patent: July 9, 2013
    Assignee: Atmel Corporation
    Inventors: Franck Lunadier, Frédéric Schumacher
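The address-comparator test in this abstract is compact enough to sketch directly in C; the jump threshold and names below are assumptions, not values from the patent.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_BACK_JUMP 64  /* hypothetical "certain address difference" */

/* prev - curr positive and small => limited backwards jump, i.e. the
 * processor likely branched back to the top of a short loop, so the
 * loop entry cache should store the current memory word. */
static bool limited_backwards_jump(uint32_t prev, uint32_t curr) {
    int64_t diff = (int64_t)prev - (int64_t)curr;
    return diff > 0 && diff < MAX_BACK_JUMP;
}

int main(void) {
    /* Small backwards jump: cache the word for the loop entry. */
    printf("loop entry: %d\n", limited_backwards_jump(0x1020, 0x1000));
    /* Sequential access: no caching. */
    printf("sequential: %d\n", limited_backwards_jump(0x1000, 0x1004));
    return 0;
}
```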
  • Patent number: 8462789
    Abstract: A network processor of an embodiment includes a packet classification engine, a processing pipeline, and a controller. The packet classification engine allows for classifying each of a plurality of packets according to packet type. The processing pipeline has a plurality of stages for processing each of the plurality of packets in a pipelined manner, where each stage includes one or more processors. The controller allows for providing the plurality of packets to the processing pipeline in an order that is based at least partially on: (i) packet types of the plurality of packets as classified by the packet classification engine and (ii) estimates of processing times for processing packets of the packet types at each stage of the plurality of stages of the processing pipeline. A method in a network processor allows for prefetching instructions into a cache for processing a packet based on a packet type of the packet.
    Type: Grant
    Filed: May 9, 2012
    Date of Patent: June 11, 2013
    Inventor: Justin Mark Sobaje
  • Publication number: 20130138922
    Abstract: Systems and methods are disclosed for enhancing the throughput of a processor by minimizing the number of transfers of data associated with data transfer between a register file and a memory stack. The register file used by a processor running an application is partitioned into a number of blocks. A subset of the blocks of the register file is defined in an application binary interface enabling the subset to be pre-allocated and exposed to the application binary interface. Optionally, blocks other than the subset are not exposed to the application binary interface so that the data relating to application function switch or a context switch is not transferred between the unexposed blocks and a memory stack.
    Type: Application
    Filed: November 29, 2011
    Publication date: May 30, 2013
    Applicant: International Business Machines Corporation
    Inventors: Revital Eres, Amit Golander, Nadav Levison, Sagi Manole, Ayal Zaks
  • Patent number: 8453161
    Abstract: This disclosure describes a method and system that may enable fast, hardware-assisted, producer-consumer style communication of values between threads. The method, in one aspect, uses a dedicated hardware buffer as an intermediary storage for transferring values from registers in one thread to registers in another thread. The method may provide a generic, programmable solution that can transfer any subset of register values between threads in any given order, where the source and target registers may or may not be correlated. The method also may allow for determinate access times, since it completely bypasses the memory hierarchy. Also, the method is designed to be lightweight, focusing on communication, and keeping synchronization facilities orthogonal to the communication mechanism. It may be used by a helper thread that performs data prefetching for an application thread, for example, to initialize the upward-exposed reads in the address computation slice of the helper thread code.
    Type: Grant
    Filed: May 25, 2010
    Date of Patent: May 28, 2013
    Assignee: International Business Machines Corporation
    Inventors: Michael K. Gschwind, John K. O'Brien, Valentina Salapura, Zehra N. Sura
  • Publication number: 20130124828
    Abstract: The disclosed embodiments relate to a system that executes program instructions on a processor. During a normal-execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system speculatively executes subsequent instructions in a lookahead mode to prefetch future loads. When an instruction retires during the lookahead mode, a working register which serves as a destination register for the instruction is not copied to a corresponding architectural register. Instead the architectural register is marked as invalid. Note that by not updating architectural registers during lookahead mode, the system eliminates the need to checkpoint the architectural registers prior to entering lookahead mode.
    Type: Application
    Filed: November 10, 2011
    Publication date: May 16, 2013
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Yuan C. Chou, Eric W. Mahurin
  • Publication number: 20130124829
    Abstract: The disclosed embodiments relate to a system that executes program instructions on a processor. During a normal-execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system speculatively executes subsequent instructions in a lookahead mode to prefetch future loads. While executing in the lookahead mode, if the processor determines that the lookahead mode is unlikely to uncover any additional outer-level cache misses, the system terminates the lookahead mode. Then, after the unresolved data dependency is resolved, the system recommences execution in the normal-execution mode from the instruction that triggered the lookahead mode.
    Type: Application
    Filed: November 10, 2011
    Publication date: May 16, 2013
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Yuan C. Chou, Eric W. Mahurin
  • Patent number: 8443171
    Abstract: The present invention provides a system and method for runtime updating of hints in program instructions. The invention also provides for programs of instructions that include hint performance data. Also, the invention provides an instruction cache that modifies hints and writes them back. As runtime hint updates are stored in instructions, the impact of the updates is not limited by the limited memory capacity local to a processor. Also, there is no conflict between hardware and software hints, as they can share a common encoding in the program instructions.
    Type: Grant
    Filed: July 30, 2004
    Date of Patent: May 14, 2013
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Dale Morris, James E. McCormick
  • Patent number: 8443151
    Abstract: An apparatus and method are described herein for optimizing prefetch throttling, which potentially enhances performance, reduces power consumption, and maintains positive gain for workloads that benefit from prefetching. More specifically, the optimizations described herein allow bandwidth congestion and prefetch accuracy to be taken into account as feedback for throttling at the source of prefetch generation. As a result, when congestion is low, full prefetch generation is allowed, even if the prefetches are inaccurate, since bandwidth is available. However, when congestion is high, the throttling decision falls to prefetch accuracy: if accuracy is high (the miss rate is low), less throttling is needed, because the prefetches are being utilized and performance is being enhanced.
    Type: Grant
    Filed: November 9, 2009
    Date of Patent: May 14, 2013
    Assignee: Intel Corporation
    Inventors: Puqi P. Tang, Hemant G. Rotithor, Ryan L. Carlson, Nagi Aboulenein
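A short C sketch of the two-feedback throttling policy the abstract describes; the thresholds and function names are hypothetical, not values from the patent.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical thresholds; the patent does not publish concrete values. */
#define CONGESTION_HIGH 0.75 /* fraction of memory bandwidth in use */
#define ACCURACY_HIGH   0.50 /* fraction of prefetches that were used */

/* Throttle at the source of prefetch generation: with spare bandwidth,
 * allow full prefetching even if inaccurate; under congestion, let
 * prefetch accuracy decide whether to keep generating. */
static bool allow_prefetch(double congestion, double accuracy) {
    if (congestion < CONGESTION_HIGH)
        return true;                  /* low congestion: always generate */
    return accuracy >= ACCURACY_HIGH; /* high congestion: only if useful */
}

int main(void) {
    printf("%d\n", allow_prefetch(0.30, 0.10)); /* 1: bandwidth to spare */
    printf("%d\n", allow_prefetch(0.90, 0.80)); /* 1: congested but useful */
    printf("%d\n", allow_prefetch(0.90, 0.10)); /* 0: congested and wasted */
    return 0;
}
```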
  • Patent number: 8423720
    Abstract: A computer system having a main unit and an expansion unit connected by an interface arrangement. The expansion unit includes at least one connector for receiving an input/output component, so that additional input/output components can be added to the computer system. The interface arrangement includes at least one cache controller and at least one cache memory for monitoring and predicting requests exchanged between the main unit and the expansion unit. A method of caching and processing input/output requests and a storage medium are also provided.
    Type: Grant
    Filed: May 5, 2008
    Date of Patent: April 16, 2013
    Assignee: International Business Machines Corporation
    Inventor: Andreas Christian Döring
  • Patent number: 8413127
    Abstract: A mechanism for minimizing effective memory latency without unnecessary cost through fine-grained software-directed data prefetching using integrated high-level and low-level code analysis and optimizations is provided. The mechanism identifies and classifies streams, identifies data that is most likely to incur a cache miss, exploits effective hardware prefetching to determine the proper number of streams to be prefetched, exploits effective data prefetching on different types of streams in order to eliminate redundant prefetching and avoid cache pollution, and uses high-level transformations with integrated lower level cost analysis in the instruction scheduler to schedule prefetch instructions effectively.
    Type: Grant
    Filed: December 22, 2009
    Date of Patent: April 2, 2013
    Assignee: International Business Machines Corporation
    Inventors: Roch G. Archambault, Robert J. Blainey, Yaoqing Gao, Allan R. Martin, James L. McInnes, Francis Patrick O'Connell
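The flavor of code such a compiler pass might emit can be sketched with the GCC/Clang __builtin_prefetch intrinsic; the prefetch distance below is an illustrative assumption, and the patent's actual stream classification and scheduling are far more involved.

```c
#include <stdio.h>

#define N 1024
#define PREFETCH_DISTANCE 16 /* hypothetical: chosen by the cost analysis */

/* The kind of loop a compiler might emit for a stream it classified as
 * likely to miss: a software prefetch runs a fixed distance ahead of the
 * demand loads, using the GCC/Clang __builtin_prefetch intrinsic. */
static long sum_stream(const long *a, int n) {
    long s = 0;
    for (int i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 1);
        s += a[i];
    }
    return s;
}

int main(void) {
    static long a[N];
    for (int i = 0; i < N; i++) a[i] = i;
    printf("%ld\n", sum_stream(a, N));
    return 0;
}
```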
  • Patent number: 8387053
    Abstract: A method of performing operations in a computer system, a computer system, and a related method of compilation are disclosed. In one embodiment, the method includes providing compiled code having at least one thread, where each thread includes a respective plurality of blocks and each block includes a respective pre-fetch component and a respective execute component. The method also includes performing a first pre-fetch component from a first block of a first thread, performing a first additional component after the first pre-fetch component has been performed, and performing a first execute component from the first block of the first thread. The first execute component is performed after the first additional component has been performed, and the first additional component is from either a second thread or another block of the first thread other than the first block.
    Type: Grant
    Filed: January 25, 2007
    Date of Patent: February 26, 2013
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Blaine D. Gaither, Verna Knapp, Jerome Huck, Benjamin D. Osecky
  • Patent number: 8364907
    Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.
    Type: Grant
    Filed: January 27, 2012
    Date of Patent: January 29, 2013
    Assignee: Apple Inc.
    Inventors: Ramesh Gunna, Sudarshan Kadambi
  • Patent number: 8364902
    Abstract: A microprocessor includes an instruction decoder for decoding a repeat prefetch indirect instruction that includes address operands used to calculate an address of a first entry in a prefetch table having a plurality of entries, each including a prefetch address. The repeat prefetch indirect instruction also includes a count specifying a number of cache lines to be prefetched. The memory address of each of the cache lines is specified by the prefetch address in one of the entries in the prefetch table. A count register, initially loaded with the count specified in the prefetch instruction, stores a remaining count of the cache lines to be prefetched. Control logic fetches the prefetch addresses of the cache lines from the table into the microprocessor and prefetches the cache lines from the system memory into a cache memory of the microprocessor using the count register and the prefetch addresses fetched from the table.
    Type: Grant
    Filed: October 15, 2009
    Date of Patent: January 29, 2013
    Assignee: VIA Technologies, Inc.
    Inventors: Rodney E. Hooker, John Michael Greer
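A software model of the repeat prefetch indirect semantics: walk a table of prefetch addresses while a count register decrements, prefetching the line named by each entry. The GCC/Clang __builtin_prefetch intrinsic stands in for the hardware prefetch, and the names are hypothetical.

```c
#include <stdio.h>

/* Walk the prefetch table, decrementing the count register and issuing
 * one cache-line prefetch per table entry. */
static void repeat_prefetch_indirect(const void *const *table,
                                     unsigned count) {
    while (count-- > 0) {
        __builtin_prefetch(*table, 0, 3); /* prefetch line at *table */
        table++;
    }
}

int main(void) {
    static int block_a[64], block_b[64], block_c[64];
    const void *table[] = { block_a, block_b, block_c };
    repeat_prefetch_indirect(table, 3); /* count = 3 lines */
    printf("issued 3 indirect prefetches\n");
    return 0;
}
```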
  • Patent number: 8341357
    Abstract: One embodiment provides a system that pre-fetches into a sibling cache. During operation, a first thread executes in a first processor core associated with a first cache, while a second thread associated with the first thread simultaneously executes in a second processor core associated with a second cache. During execution, the second thread encounters an instruction that triggers a request to a lower-level cache which is shared by the first cache and the second cache. The system responds by directing the load fill that returns from the lower-level cache to the first cache, thereby reducing cache misses for the first thread.
    Type: Grant
    Filed: March 16, 2010
    Date of Patent: December 25, 2012
    Assignee: Oracle America, Inc.
    Inventors: Martin R. Karlsson, Shailender Chaudhry, Robert E. Cypher
  • Publication number: 20120297169
    Abstract: A data processing apparatus which sequentially executes a verification process so as to recognize a target object, comprising: an obtaining unit configured to obtain dictionary data to be referred to in the verification process; a holding unit configured to hold a plurality of dictionary data; a verification unit configured to execute the verification process for the input data by referring to one dictionary data; a history holding unit configured to hold a verification result; and a prefetch determination unit configured to determine based on the verification result whether to execute prefetch processing in which the obtaining unit obtains in advance dictionary data to be referred to by the verification unit in a succeeding verification process, and holds the dictionary data in the holding unit before the succeeding verification process.
    Type: Application
    Filed: May 10, 2012
    Publication date: November 22, 2012
    Applicant: CANON KABUSHIKI KAISHA
    Inventor: Akiyoshi Momoi
  • Patent number: 8291202
    Abstract: Techniques for interrupt processing are described. An exceptional condition is detected in one or more stages of an instruction pipeline in a processor. In response to the detected exceptional condition, and prior to the processor accepting an interrupt in response to it, an instruction cache is checked for the presence of the instruction at the starting address of the interrupt handler. When that instruction is not present in the instruction cache, it is prefetched from storage above the instruction cache and loaded into the instruction cache, whereby the instruction is made available in the instruction cache by the time the processor accepts the interrupt.
    Type: Grant
    Filed: August 8, 2008
    Date of Patent: October 16, 2012
    Inventors: Daren Eugene Streett, Brian Michael Stempel
  • Patent number: 8271750
    Abstract: A data processing system includes a data store having storage locations storing entries which can be used for a variety of purposes, such as operand value prediction, branch prediction, etc. An entry profile store stores profile data for more candidate entries than there are storage locations within the data store. The profile data is used to determine replacement policy for entries within the data store. The profile data can include hash values used to determine whether predictions associated with candidate entries were correct without having to store the full predictions within the profile data.
    Type: Grant
    Filed: January 18, 2008
    Date of Patent: September 18, 2012
    Assignee: ARM Limited
    Inventors: Sami Yehia, Marios Kleanthous
  • Publication number: 20120233442
    Abstract: Techniques and structures are disclosed relating to predicting return addresses in multithreaded processors. In one embodiment, a processor is disclosed that includes a return address prediction unit. The return address prediction unit is configured to store return addresses for different ones of a plurality of threads executable on the processor. The return address prediction unit is configured to receive a request for a predicted return address for one of the plurality of threads. The first request includes an identification of the requesting thread. The return address prediction unit is configured to provide the predicted return address to the requesting thread. In some embodiments, the return address prediction unit is configured to store the return addresses in a memory that has a plurality of dedicated portions. In some embodiments, the return address prediction unit is configured to store the return addresses in a memory that has dynamically allocable entries.
    Type: Application
    Filed: March 11, 2011
    Publication date: September 13, 2012
    Inventors: Manish K. Shah, Gregory F. Grohoski, Zeid H. Samoail
  • Patent number: 8266381
    Abstract: In at least one embodiment, a processor detects during execution of program code whether a load instruction within the program code is associated with a hint. In response to detecting that the load instruction is not associated with a hint, the processor retrieves a full cache line of data from the memory hierarchy into the processor in response to the load instruction. In response to detecting that the load instruction is associated with a hint, a processor retrieves a partial cache line of data into the processor from the memory hierarchy in response to the load instruction.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: September 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
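The fetch-size decision here reduces to a one-line policy; the line and granule sizes below are illustrative assumptions, not values from the patent.

```c
#include <stdbool.h>
#include <stdio.h>

#define FULL_LINE    128 /* hypothetical cache line size in bytes */
#define PARTIAL_LINE 32  /* hypothetical partial-line granule */

/* Hint-directed fetch size: an unhinted load pulls in a full cache line;
 * a hinted load pulls in only a partial line, saving memory bandwidth
 * when the rest of the line would go unused. */
static unsigned bytes_to_fetch(bool has_hint) {
    return has_hint ? PARTIAL_LINE : FULL_LINE;
}

int main(void) {
    printf("unhinted load fetches %u bytes\n", bytes_to_fetch(false));
    printf("hinted load fetches %u bytes\n", bytes_to_fetch(true));
    return 0;
}
```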
  • Publication number: 20120226892
    Abstract: One embodiment of the present invention provides a system that generates code for a scout thread to prefetch data values for a main thread. During operation, the system compiles source code for a program to produce executable code for the program. This compilation process involves performing reuse analysis to identify prefetch candidates which are likely to be touched during execution of the program. Additionally, this compilation process produces executable code for the scout thread which contains prefetch instructions to prefetch the identified prefetch candidates for the main thread. In this way, the scout thread can subsequently be executed in parallel with the main thread in advance of where the main thread is executing to prefetch data items for the main thread.
    Type: Application
    Filed: March 16, 2005
    Publication date: September 6, 2012
    Inventors: Partha P. Tirumalai, Yonghong Song, Spiros Kalogeropulos
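A rough pthread sketch of the scout-thread pattern, with the scout prefetching ahead of the main thread via __builtin_prefetch; in the patent the scout code is compiler-generated and prefetches only the identified candidates. Compile with -pthread.

```c
#include <pthread.h>
#include <stdio.h>

#define N (1 << 20)
static int data[N];

/* Scout thread: runs ahead of the main thread, touching the data the
 * main thread will need so its demand loads hit in cache. */
static void *scout(void *arg) {
    (void)arg;
    for (int i = 0; i < N; i += 16)  /* roughly one touch per 64B line */
        __builtin_prefetch(&data[i], 0, 1);
    return NULL;
}

int main(void) {
    for (int i = 0; i < N; i++) data[i] = 1;
    pthread_t t;
    pthread_create(&t, NULL, scout, NULL); /* scout runs in parallel */
    long long sum = 0;                     /* main thread's real work */
    for (int i = 0; i < N; i++) sum += data[i];
    pthread_join(t, NULL);
    printf("%lld\n", sum);                 /* prints 1048576 */
    return 0;
}
```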
  • Patent number: 8260990
    Abstract: A system and method for selective preclusion of bus access requests are disclosed. In an embodiment, a method includes determining a bus unit access setting at a logic circuit of a processor. The method also includes selectively precluding a bus unit access request based on the bus unit access setting.
    Type: Grant
    Filed: November 19, 2007
    Date of Patent: September 4, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Lucian Codrescu, Ajay Anant Ingle, Christopher Edward Koob, Erich James Plondke
  • Patent number: 8250307
    Abstract: According to a method of data processing, a memory controller receives a prefetch load request from a processor core of a data processing system. The prefetch load request specifies a requested line of data. In response to receipt of the prefetch load request, the memory controller determines by reference to a stream of demand requests how much data is to be supplied to the processor core in response to the prefetch load request. In response to the memory controller determining to provide less than all of the requested line of data, the memory controller provides less than all of the requested line of data to the processor core.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: August 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Patent number: 8250231
    Abstract: A method to reduce buffer capacity in a processor includes giving the data packets admittance to the processor through at least one interface, storing the data packets in at least one input buffer, and using a packet rate shaper outside of a processing pipeline to control flow of the data packets to the pipeline before the data packets enter the pipeline. First and second data packets are given admittance to the pipeline in dependence on cost information per packet that is dependent upon an expected time period of residence of the first data packet in the pipeline. Cost information dependent upon an expected time period of residence of the second data packet in the pipeline differs from said cost information dependent upon the expected time period of residence of the first data packet in the pipeline.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: August 21, 2012
    Assignee: Marvell International Ltd.
    Inventors: Thomas Bodén, Jakob Carlström
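One plausible reading of the cost-based shaper, sketched as a credit scheme in C: a packet is admitted to the pipeline only when accumulated credit covers its cost, where cost models the packet's expected residence time in the pipeline. The credit mechanism and all names are assumptions, not details from the patent.

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    long credit;          /* accumulated admission credit */
    long credit_per_tick; /* refill rate */
} shaper;

/* Admit a packet only if the shaper can pay its residence-time cost;
 * otherwise it waits in the input buffer. */
static bool admit(shaper *s, long cost) {
    if (s->credit < cost) return false;
    s->credit -= cost;
    return true;
}

static void tick(shaper *s) { s->credit += s->credit_per_tick; }

int main(void) {
    shaper s = { 100, 50 };
    printf("pkt A (cost 80): %s\n", admit(&s, 80) ? "enter" : "wait");
    printf("pkt B (cost 80): %s\n", admit(&s, 80) ? "enter" : "wait");
    tick(&s); tick(&s); /* credit accrues while pkt B waits */
    printf("pkt B retry:     %s\n", admit(&s, 80) ? "enter" : "wait");
    return 0;
}
```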
  • Patent number: 8250344
    Abstract: A method, storage medium, processor instruction, and processor are provided for specifying a value in a first portion of a conditional pre-fetch instruction associated with a branch instruction used for effectuating a branch operation, specifying a target instruction address in a second portion of the instruction, evaluating the value to determine whether a condition is met, and pre-fetching one or more instructions starting at the target instruction address into an instruction buffer of the processor when the condition is met.
    Type: Grant
    Filed: August 13, 2009
    Date of Patent: August 21, 2012
    Assignee: Sony Computer Entertainment Inc.
    Inventors: Masahiro Yasue, Akiyuki Hatakeyama
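A C sketch of the instruction's semantics as the abstract states them, with a hypothetical encoding and an assumed condition test:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical encoding: a condition value in one field, a target
 * instruction address in another, plus a count of instructions to pull
 * into the instruction buffer. */
typedef struct {
    int      cond_value;  /* first portion: value to evaluate */
    uint32_t target_addr; /* second portion: target instruction address */
    unsigned count;       /* number of instructions to pre-fetch */
} cond_prefetch;

static bool condition_met(int value) { return value != 0; } /* assumed */

static void execute_cond_prefetch(const cond_prefetch *cp) {
    if (!condition_met(cp->cond_value))
        return; /* condition not met: no prefetch */
    for (unsigned i = 0; i < cp->count; i++) /* fill instruction buffer */
        printf("prefetch insn @ 0x%x\n", cp->target_addr + 4 * i);
}

int main(void) {
    cond_prefetch cp = { 1, 0x2000, 4 };
    execute_cond_prefetch(&cp); /* condition met: four instructions */
    return 0;
}
```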
  • Publication number: 20120173850
    Abstract: A high-performance information processing technique is provided at low cost, permitting an instruction buffer to be updated for effective prefetching of branch targets and subroutine returns with a small amount of hardware. An information processing apparatus is equipped with a CPU, a memory, prefetch means, and the like, wherein a prefetch address generator unit in the prefetch means decodes, out of a current instruction buffer storing the series of instructions currently accessed by the CPU, a branching series of instructions including at least one branch-address calculating instruction and a branch instruction to the branched address, and thereby looks ahead to the branch destination address. The information processing apparatus further comprises an RTS instruction buffer for storing the series of instructions at the return destinations of RTS instructions; the series of instructions stored in the current instruction buffer is saved into the RTS instruction buffer.
    Type: Application
    Filed: March 16, 2012
    Publication date: July 5, 2012
    Applicant: RENESAS ELECTRONICS CORPORATION
    Inventors: Teppei HIROTSU, Yuuichi ABE, Takeshi KATAOKA, Yasuhiro NAKATSUKA
  • Patent number: 8209520
    Abstract: An apparatus for executing fixed width instructions in a multiple execution unit system has a device for fetching instructions from a memory, and a decoder for decoding each fetched instruction in turn. A determination is made as to whether each decoded instruction includes a portion to fetch a locally stored instruction from a local store. If it does, the locally stored instruction is fetched and executed.
    Type: Grant
    Filed: March 20, 2007
    Date of Patent: June 26, 2012
    Assignee: Imagination Technologies Limited
    Inventor: Andrew Webber
  • Publication number: 20120159126
    Abstract: A programming language may include hint instructions that may notify a programming idiom accelerator that a programming idiom is coming. An idiom begin hint exposes the programming idiom to the programming idiom accelerator. Thus, the programming idiom accelerator need not perform pattern matching or other forms of analysis to recognize a sequence of instructions. Rather, the programmer may insert idiom hint instructions, such as an idiom begin hint, to expose the idiom to the programming idiom accelerator. Similarly, an idiom end hint may mark the end of the programming idiom.
    Type: Application
    Filed: February 1, 2008
    Publication date: June 21, 2012
    Inventors: Ravi K Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Patent number: 8195888
    Abstract: Technologies are generally described for allocating available prefetch bandwidth among processor cores in a multiprocessor computing system. The prefetch bandwidth associated with an off-chip memory interface of the multiprocessor may be determined, partitioned, and allocated across multiple processor cores.
    Type: Grant
    Filed: March 20, 2009
    Date of Patent: June 5, 2012
    Assignee: Empire Technology Development LLC
    Inventor: Yan Solihin
  • Publication number: 20120131311
    Abstract: The disclosed embodiments provide a system that facilitates prefetching an instruction cache line in a processor. During execution of the processor, the system performs a current instruction cache access which is directed to a current cache line. If the current instruction cache access causes a cache miss or is a first demand fetch for a previously prefetched cache line, the system determines whether the current instruction cache access is discontinuous with a preceding instruction cache access. If so, the system completes the current instruction cache access by performing a cache access to service the cache miss or the first demand fetch, and also prefetching a predicted cache line associated with a discontinuous instruction cache access which is predicted to follow the current instruction cache access.
    Type: Application
    Filed: November 23, 2010
    Publication date: May 24, 2012
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventor: Yuan C. Chou
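A C sketch of one way the discontinuity test and next-line prediction could fit together; the table organization and sizes are assumptions, not details from the disclosure.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define LINE  64u  /* hypothetical instruction cache line size */
#define TABLE 256u /* hypothetical prediction table size */

/* Maps a cache line to the discontinuous line that followed it last
 * time; trained on discontinuous accesses, consulted on a miss. */
static uint64_t next_line[TABLE];

static uint64_t line_of(uint64_t pc) { return pc / LINE; }

/* Discontinuous: not in the same line or the sequentially next line
 * relative to the preceding instruction cache access. */
static bool discontinuous(uint64_t prev_pc, uint64_t cur_pc) {
    uint64_t p = line_of(prev_pc), c = line_of(cur_pc);
    return c != p && c != p + 1;
}

static void on_icache_miss(uint64_t prev_pc, uint64_t cur_pc) {
    if (!discontinuous(prev_pc, cur_pc))
        return;
    next_line[line_of(prev_pc) % TABLE] = line_of(cur_pc); /* train */
    uint64_t pred = next_line[line_of(cur_pc) % TABLE];    /* predict */
    if (pred)
        printf("prefetch predicted discontinuous line %llu\n",
               (unsigned long long)pred);
}

int main(void) {
    on_icache_miss(0x1000, 0x8000); /* first trip: train only */
    on_icache_miss(0x8000, 0x1000); /* back edge: train, then predict */
    return 0;
}
```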
  • Patent number: 8185721
    Abstract: A system including a dual function adder is described. In one embodiment, the system includes an adder. The adder is configured for a first instruction to determine an address for a hardware prefetch if the first instruction is a hardware prefetch instruction. The adder is further configured for the first instruction to determine a value from an arithmetic operation if the first instruction is an arithmetic operation instruction.
    Type: Grant
    Filed: March 4, 2008
    Date of Patent: May 22, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Ajay Anant Ingle, Erich James Plondke, Lucian Codrescu
  • Publication number: 20120124336
    Abstract: A signal processing system comprising at least one master device, at least one memory element, and a prefetch module arranged to perform prefetching from the at least one memory element upon a memory access request to the at least one memory element from the at least one master device. Upon receiving a memory access request from the at least one master device, the prefetch module is arranged to configure the enabling of prefetching of at least one of instruction information and data information in relation to that memory access request based at least partly on an address to which the memory access request relates.
    Type: Application
    Filed: July 20, 2009
    Publication date: May 17, 2012
    Applicant: Freescale Semiconductor, Inc.
    Inventors: Alistair Robertson, Joseph Circello, Mark Maiolani
  • Patent number: 8166259
    Abstract: A memory control apparatus, a memory control method and an information processing system are disclosed. Fetch response data retrieved from a main storage unit is received, while bypassing a storage unit, by a first port in which the received fetch response data can be set. The fetch response data retrieved from the main storage unit, if it cannot be set in the first port, is set in a second port through the storage unit. A transmission control unit performs priority control to send out, in accordance with a predetermined priority, the fetch response data set in the first port or the second port to the processor. As a result, the latency from the time the fetch response data arrives until it is sent out toward the processor in response to a fetch request from the processor is shortened.
    Type: Grant
    Filed: March 26, 2009
    Date of Patent: April 24, 2012
    Assignee: Fujitsu Limited
    Inventor: Souta Kusachi
  • Publication number: 20120096240
    Abstract: A method, system and computer-usable medium are disclosed for managing prefetch streams in a virtual machine environment. Compiled application code in a first core, which comprises a Special Purpose Register (SPR) and a plurality of first prefetch engines, initiates a prefetch stream request. If the prefetch stream request cannot be initiated due to unavailability of a first prefetch engine, then an indicator bit indicating a Prefetch Stream Dispatch Fault is set in the SPR, causing a Hypervisor to interrupt the execution of the prefetch stream request. The Hypervisor then calls its associated operating system (OS), which determines prefetch engine availability for a second core comprising a plurality of second prefetch engines. If a second prefetch engine is available, then the OS migrates the prefetch stream request from the first core to the second core, where it is initiated on an available second prefetch engine.
    Type: Application
    Filed: October 15, 2010
    Publication date: April 19, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Matthew Accapadi, Robert H. Bell, JR., Hong L. Hua, Ram Raghavan, Mysore S. Srinivas
  • Publication number: 20120096241
    Abstract: A method, system and computer-usable medium are disclosed for managing transient instruction streams. Transient flags are defined in Branch-and-Link (BRL) instructions that are known to be infrequently executed. A bit is likewise set in a Special Purpose Register (SPR) of the hardware (e.g., a core) that is executing an instruction request thread. Subsequent fetches or prefetches in the request thread are treated as transient and are not written to lower-level caches. If an instruction is non-transient, and if a lower-level cache is non-inclusive of the L1 instruction cache, a fetch or prefetch miss that is obtained from memory may be written in both the L1 and the lower-level cache. If it is not inclusive, a cast-out from the L1 instruction cache may be written in the lower-level cache.
    Type: Application
    Filed: October 15, 2010
    Publication date: April 19, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Robert H. Bell, JR., Hong L. Hua, Ram Raghavan, Mysore S. Srinivas
  • Publication number: 20120096039
    Abstract: Database elements are inserted into a database object by processing each of a plurality of operations in a sequential order within a first processing round to insert the database elements into the database objects, where processing for at least one operation in the order becomes suspended due to a resource request, and where at least one successive operation is initiated in response to suspension of one or more prior operations to enable prefetching of information for processing the operations. Each suspended operation is re-processed with the prefetched information in one or more additional processing rounds until processing of the operations is completed.
    Type: Application
    Filed: October 18, 2010
    Publication date: April 19, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Robert W. Lyle, Ping Wang
  • Patent number: 8161245
    Abstract: A method and apparatus for performing data prefetch in a multiprocessor system are disclosed. The multiprocessor system includes multiple processors, each having a cache memory. The cache memory is subdivided into multiple slices. A group of prefetch requests is initially issued by a requesting processor in the multiprocessor system. Each prefetch request is intended for one of the respective slices of the cache memory of the requesting processor. In response to the prefetch requests being missed in the cache memory of the requesting processor, the prefetch requests are merged into one combined prefetch request. The combined prefetch request is then sent to the cache memories of all the non-requesting processors within the multiprocessor system. In response to a combined clean response from the cache memories of all the non-requesting processors, data are then obtained for the combined prefetch request from a system memory.
    Type: Grant
    Filed: February 9, 2005
    Date of Patent: April 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: James S. Fields, Jr., Benjiman L. Goodman, Guy L. Guthrie, Jeffrey A. Stuecheli
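The merge step can be sketched as collapsing per-slice misses into a single request carrying a slice bitmask; this representation is an assumption, not the patent's encoding.

```c
#include <stdint.h>
#include <stdio.h>

#define SLICES 4 /* hypothetical number of cache slices */

/* Combined prefetch request: one base address plus a bitmask of the
 * slices whose individual prefetch requests missed locally. */
typedef struct {
    uint64_t base_addr;
    uint8_t  slice_mask; /* bit i set => slice i still needs data */
} combined_prefetch;

static combined_prefetch merge(uint64_t base, const int missed[SLICES]) {
    combined_prefetch c = { base, 0 };
    for (int i = 0; i < SLICES; i++)
        if (missed[i]) c.slice_mask |= (uint8_t)(1u << i);
    return c;
}

int main(void) {
    int missed[SLICES] = { 1, 0, 1, 1 }; /* three slices missed */
    combined_prefetch c = merge(0x40000, missed);
    printf("one bus request: base=0x%llx mask=0x%x\n",
           (unsigned long long)c.base_addr, c.slice_mask);
    return 0;
}
```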
  • Publication number: 20120084511
    Abstract: A processor of an information handling system (IHS) initiates an L3 cache prefetch operation in response to a demand load during instruction processing. The processor selects an L3 cache prefetch at random for tracking as a target prefetched instruction. The processor initiates an L1 cache target prefetch operation and stores the resultant target prefetched instruction in the L1 cache. If a demand load arrives, the processor analyzes the target prefetched instruction for effectiveness and determines the source of the prefetch data. If a demand does not arrive, the processor tests to determine whether the particular prefetched instruction timed out in the cache and identifies the ineffectiveness of the prefetch operation. The processor samples multiple prefetch operations at random and generates a history of prefetch effectiveness and other useful prefetch information. The processor stores the prefetch effectiveness information to enable reduction or removal of ineffective prefetch operations.
    Type: Application
    Filed: October 4, 2010
    Publication date: April 5, 2012
    Applicant: International Business Machines Corporation
    Inventors: Miles R. Dooley, Venkat R. Indukuru, Alex E. Mericas, Francis P. O'Connell
  • Publication number: 20120084532
    Abstract: A microcontroller using an optimized buffer replacement strategy comprises a memory configured to store instructions, a processor configured to execute said program instructions, and a memory accelerator operatively coupled between the processor and the memory. The memory accelerator is configured to receive an information request and overwrite the buffer from which the prefetch was initiated with the requested information when the request is fulfilled by a previously initiated prefetch operation.
    Type: Application
    Filed: September 30, 2010
    Publication date: April 5, 2012
    Applicant: NXP B.V.
    Inventors: Craig MacKenna, Richard N. Varney, Gregory K. Goodhue
  • Publication number: 20120079242
    Abstract: A processor and method are disclosed. In one embodiment the processor includes a prefetch buffer that stores macro instructions. The processor also includes a clock circuit that can provide a clock signal for at least some of the functional units within the processor. The processor additionally includes macro instruction decode logic that can determine a class of each macro instruction. The processor also includes a clock management unit that can cause the clock signal to remain in a steady state entering at least one of the units in the processor that do not operate on a current macro instruction being decoded. Finally, the processor also includes at least one instruction decoder unit that can decode the first macro instruction into one or more opcodes.
    Type: Application
    Filed: September 24, 2010
    Publication date: March 29, 2012
    Inventors: Venkateswara R. Madduri, Jonathan Y. Tong, Hoichi Cheong
  • Publication number: 20120079243
    Abstract: A graphics processing unit core 26 includes a plurality of processing pipelines 38, 40, 42, 44. A program instruction of a thread of program instructions being executed by a processing pipeline includes a next-instruction-type field 36 indicating the instruction type of the next program instruction following the current program instruction within the processing thread concerned. This next-instruction-type field is used to select the processing pipeline to which the next instruction is issued, before that next instruction has been fetched and decoded. The next-instruction-type field may be passed along the processing pipeline as the least significant four bits within a program counter value associated with a current program instruction 32. The next-instruction-type field may also be used to control the forwarding of thread state variables between processing pipelines when a thread migrates between processing pipelines prior to the next program instruction being fetched or decoded.
    Type: Application
    Filed: September 1, 2011
    Publication date: March 29, 2012
    Applicant: ARM Limited
    Inventor: Jorn Nystad
  • Patent number: 8145887
    Abstract: A method, system, and computer program product are provided for enhancing the execution of independent loads in a processing unit. A processing unit detects if a long-latency miss associated with a load instruction has been encountered. Responsive to a long-latency miss, the processing unit enters a load lookahead mode. Responsive to entering the load lookahead mode, the processing unit dispatches each instruction from a first set of instructions from a first buffer with an associated vector. The processing unit determines if the first set of instructions from the first buffer have completed execution. Responsive to completed execution of the first set of instructions from the first buffer, the processing unit copies the set of vectors from a first vector array to a second vector array. Then the processing unit dispatches a second set of instructions from a second buffer with an associated vector from the second vector array.
    Type: Grant
    Filed: June 15, 2007
    Date of Patent: March 27, 2012
    Assignee: International Business Machines Corporation
    Inventors: Hung Q. Le, Dung Q. Nguyen
  • Patent number: 8145849
    Abstract: A wake-and-go mechanism is provided for a data processing system. The wake-and-go mechanism is configured to issue a look-ahead load command on a system bus to read a data value from a target address and perform a comparison operation to determine whether the data value at the target address indicates that an event for which a thread is waiting has occurred. In response to the comparison resulting in a determination that the event has not occurred, the wake-and-go engine populates a wake-and-go storage array with the target address and snooping the target address on the system bus without data exclusivity. In response to the comparison resulting in a determination that the event has occurred, the wake-and-go engine issues a load command on the system bus to read the data value from the target address with data exclusivity.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: March 27, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Patent number: 8145883
    Abstract: A preload instruction in a first instruction set is executed at a processor. The preload instruction causes the processor to preload one or more instructions into an instruction cache. The pre-loaded instructions are pre-decoded according to a second instruction set that is different from the first instruction set. The preloaded instructions are pre-decoded according to the second instruction set in response to an instruction set preload indicator (ISPI).
    Type: Grant
    Filed: March 12, 2010
    Date of Patent: March 27, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Thomas Andrew Sartorius, Brian Michael Stempel, Rodney Wayne Smith
  • Patent number: 8145879
    Abstract: In one embodiment, a serial processor is configured to execute software instructions in a software program in serial. A serial memory is configured to store data for use by the serial processor in executing the software instructions in serial. A plurality of parallel processors are configured to execute software instructions in the software program in parallel. A plurality of partitioned memory modules are provided and configured to store data for use by the plurality of parallel processors in executing software instructions in parallel. Accordingly, a processor/memory structure is provided that allows serial programs to use quick local serial memories and parallel programs to use partitioned parallel memories. The system may switch between a serial mode and a parallel mode. The system may incorporate pre-fetching commands of several varieties.
    Type: Grant
    Filed: March 10, 2010
    Date of Patent: March 27, 2012
    Assignee: XMTT Inc.
    Inventor: Uzi Y. Vishkin
  • Publication number: 20120072702
    Abstract: A prefetch cancelation arbiter improves access to a shared memory resource by arbitrarily canceling speculative prefetches. The prefetch cancelation arbiter applies a set of arbitrary policies to speculative prefetches to select one or more of the received speculative prefetches to cancel. The selected speculative prefetches are canceled and a cancelation notification of each canceled speculative prefetch is sent to a higher-level memory component such as a prefetch unit or a local memory arbiter that is local to the processor associated with the canceled speculative prefetch. The set of arbitrary policies is used to reduce memory accesses to the shared memory resource.
    Type: Application
    Filed: September 15, 2011
    Publication date: March 22, 2012
    Inventors: Matthew D. Pierson, Joseph R.M. Zbiciak, Kai Chirca, Amitabh Menon, Timothy D. Anderson
  • Patent number: 8141098
    Abstract: An apparatus initiates, in connection with a context switch operation, a prefetch of data likely to be used by a thread prior to resuming execution of that thread. As a result, once it is known that a context switch will be performed to a particular thread, data may be prefetched on behalf of that thread so that when execution of the thread is resumed, more of the working state for the thread is likely to be cached, or at least in the process of being retrieved into cache memory, thus reducing cache-related performance penalties associated with context switching.
    Type: Grant
    Filed: January 16, 2009
    Date of Patent: March 20, 2012
    Assignee: International Business Machines Corporation
    Inventors: Jeffrey Powers Bradford, Harold F. Kossman, Timothy John Mullins
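A C sketch of the idea: once the scheduler knows which thread runs next, prefetch that thread's recorded working set before execution resumes. The per-thread working-set record and all names are illustrative assumptions, not details from the patent.

```c
#include <stdio.h>

#define WSET 4 /* hypothetical tracked working-set entries per thread */

/* Per-thread record of addresses the thread was using when it was
 * switched out. */
typedef struct {
    int         id;
    const void *working_set[WSET];
} thread_ctx;

/* Called once the next thread is known: warm the cache before the
 * thread actually resumes, hiding part of the miss latency. */
static void prefetch_for_resume(const thread_ctx *next) {
    for (int i = 0; i < WSET; i++)
        if (next->working_set[i])
            __builtin_prefetch(next->working_set[i], 0, 3);
}

int main(void) {
    static long heap_a[64], heap_b[64];
    thread_ctx t = { 7, { heap_a, heap_b, NULL, NULL } };
    prefetch_for_resume(&t); /* issued before the switch completes */
    printf("thread %d resumed with warmed cache\n", t.id);
    return 0;
}
```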
  • Patent number: 8131946
    Abstract: In one embodiment, a processor may be configured to write ECC granular stores into the data cache, while non-ECC granular stores may be merged with cache data in a memory request buffer. In one embodiment, a processor may be configured to detect that a victim block writeback hits one or more stores in a memory request buffer (or vice versa) and may convert the victim block writeback to a fill. In one embodiment, a processor may speculatively issue stores that are subsequent to a load from a load/store queue, but prevent the update for the stores in response to a snoop hit on the load.
    Type: Grant
    Filed: October 20, 2010
    Date of Patent: March 6, 2012
    Assignee: Apple Inc.
    Inventors: Ramesh Gunna, Sudarshan Kadambi
  • Patent number: 8127080
    Abstract: A wake-and-go mechanism is provided for a data processing system. The wake-and-go mechanism is configured to issue a look-ahead load command on a system bus to read a data value from a target address and perform a comparison operation to determine whether the data value at the target address indicates that an event for which a thread is waiting has occurred. In response to the comparison resulting in a determination that the event has not occurred, the wake-and-go engine populates the wake-and-go storage array with the target address and snoops the target address on the system bus.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: February 28, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Patent number: 8108652
    Abstract: The claimed invention is an efficient, high-performance vector processor. By minimizing the use of multiple banks of memory and/or multi-ported memory blocks to reduce implementation cost, the vector memory 450 provides abundant memory bandwidth and enables sustained low-delay memory operations for a large number of SIMD (Single Instruction, Multiple Data) or vector operators simultaneously.
    Type: Grant
    Filed: September 10, 2008
    Date of Patent: January 31, 2012
    Inventor: Ronald Chi-Chun Hui