With Multilevel Cache Hierarchies (epo) Patents (Class 711/E12.024)
-
Publication number: 20120166729
Abstract: A method and apparatus for controlling affinity of subcaches is disclosed. When a core compute unit evicts a line of victim data, a prioritized search for space allocation on available subcaches is executed, in order of proximity between the subcache and the compute unit. The victim data may be injected into an adjacent subcache if space is available. Otherwise, a line may be evicted from the adjacent subcache to make room for the victim data or the victim data may be sent to the next closest subcache. To retrieve data, a core compute unit sends a Tag Lookup Request message directly to the nearest subcache as well as to a cache controller, which controls routing of messages to all of the subcaches. A Tag Lookup Response message is sent back to the cache controller to indicate if the requested data is located in the nearest sub-cache.
Type: Application
Filed: December 22, 2010
Publication date: June 28, 2012
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Greggory D. Donley
-
Publication number: 20120159075
Abstract: A method of providing history based done logic includes receiving a cache line in a L2 cache; determining if the cache line has a history of access at least three times on a previous call into the L2 cache; providing the cache line directly to a processor if the history of access was less than the at least three times; and loading the cache line into an L1 cache if the history of access was the at least three times.
Type: Application
Filed: February 27, 2012
Publication date: June 21, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: David A. Luick
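The routing rule in this abstract can be illustrated with a minimal Python sketch. It is not the patented logic: the function name `route_line`, the dictionary-based history, and the dictionary standing in for the L1 cache are all hypothetical; only the threshold of three accesses comes from the abstract.

```python
# Hypothetical sketch of the history-based routing rule; all names are
# illustrative assumptions, not taken from the patent.
ACCESS_THRESHOLD = 3  # "at least three times", per the abstract


def route_line(line_addr, access_history, l1_cache, line_data):
    """Install the line in L1 if it was accessed often enough last time.

    Returns "l1" when the line is loaded into the L1 cache and "direct"
    when it is handed straight to the processor without an L1 fill.
    """
    if access_history.get(line_addr, 0) >= ACCESS_THRESHOLD:
        l1_cache[line_addr] = line_data
        return "l1"
    return "direct"
```

Lines with a short access history bypass L1 in this sketch, so they do not displace lines that are more likely to be reused.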
-
Publication number: 20120159074
Abstract: Embodiments of the invention relate to increased energy efficiency and conservation by reducing and increasing an amount of cache available for use by a processor, and an amount of power supplied to the cache and to the processor, based on the amount of cache actually being used by the processor to process data. For example, a power control unit (PCU) may monitor a last level cache (LLC) to identify the size or amount of the cache being used by a processor to process data and to determine heuristics based on that amount. Based on the monitored amount of cache being used and the heuristics, the PCU causes a corresponding decrease or increase in an amount of the cache available for use by the processor, and a corresponding decrease or increase in an amount of power supplied to the cache and to the processor.
Type: Application
Filed: December 23, 2011
Publication date: June 21, 2012
Inventors: Inder M. Sodhi, Satish K. Damaraju, Sanjeev S. Jahagirdar, Ryan D. Wells
-
Publication number: 20120159073
Abstract: An apparatus and method for improving cache performance in a computer system having a multi-level cache hierarchy. For example, one embodiment of a method comprises: selecting a first line in a cache at level N for potential eviction; querying a cache at level M in the hierarchy to determine whether the first cache line is resident in the cache at level M, wherein M<N; in response to receiving an indication that the first cache line is not resident at level M, then evicting the first cache line from the cache at level N; in response to receiving an indication that the first cache line is resident at level M, then retaining the first cache line and choosing a second cache line for potential eviction.
Type: Application
Filed: December 20, 2010
Publication date: June 21, 2012
Inventors: Aamer Jaleel, Simon C. Steely, JR., Eric R. Borch, Malini K. Bhandaru, Joel S. Emer
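The victim-selection loop described in this abstract might look roughly like the following Python sketch. The function name, the list of candidates, and the set standing in for the level-M cache are assumptions for illustration. The abstract does not say what happens when every candidate is resident at level M, so the fallback here is also an assumption.

```python
# Hypothetical sketch: pick an eviction victim at level N that is not
# resident in the smaller level-M cache (M < N). Names are illustrative.
def choose_victim(candidates, level_m_lines):
    """Return the first candidate line absent from the level-M cache.

    `candidates` is an ordered list of level-N lines considered for
    eviction; `level_m_lines` is a set modelling the level-M query.
    """
    for line in candidates:
        if line not in level_m_lines:
            return line  # not resident at level M: evict it
        # resident at level M: retain it and try the next candidate
    # Assumption: if every candidate is resident at level M, fall back
    # to the first candidate (the abstract leaves this case open).
    return candidates[0] if candidates else None
```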
-
Publication number: 20120151144
Abstract: A method and computer device for determining the cache memory configuration. The method includes allocating an amount of cache memory from a first memory level of the cache memory, and determining a read transfer time for the allocated amount of cache memory. The allocated amount of cache memory then is increased and the read transfer time for the increased allocated amount of cache memory is determined. The allocated amount of cache memory continues to be increased and the read transfer time determined for each allocated amount until all of the cache memory in all of the cache memory levels has been allocated. The cache memory configuration is determined based on the read transfer times from the allocated portions of the cache memory. The determined cache memory configuration includes the number of cache memory levels and the respective capacities of each cache memory level.
Type: Application
Filed: December 8, 2010
Publication date: June 14, 2012
Inventor: William Judge Yohn
-
Patent number: 8200897
Abstract: The present invention comprises a CHA 110 which transmits/receives data to/from an external device, a DKA 140 which transmits/receives data to/from an HDD unit 200, a primary cache unit 120 which has a primary cache memory 124, a secondary cache unit 130 which is installed between the primary cache unit 120 and the DKA 140 and has a secondary cache memory 134, a CCP 121 which stores write target data received by the CHA 110 in the primary cache memory 124, and a CCP 131 which stores the write target data in the secondary cache memory 134, and transfers the write target data stored in the secondary cache memory 134 to the DKA 140.
Type: Grant
Filed: July 8, 2011
Date of Patent: June 12, 2012
Assignee: Hitachi, Ltd.
Inventors: Tatsuya Ninomiya, Kazuo Tanaka
-
Publication number: 20120137075
Abstract: The invention relates to a multi-core processor system, in particular a single-package multi-core processor system, comprising at least two processor cores, preferably at least four processor cores, each of said at least two cores, preferably at least four processor cores, having a local LEVEL-1 cache, a tree communication structure combining the multiple LEVEL-1 caches, the tree having at least one node, preferably at least three nodes for a four processor core multi-core processor, and TAG information is associated to data managed within the tree, usable in the treatment of the data.
Type: Application
Filed: June 9, 2010
Publication date: May 31, 2012
Applicant: HYPERION CORE, INC.
Inventor: Martin Vorbach
-
Publication number: 20120137074
Abstract: A method and system to perform stream buffer management instructions in a processor. The stream buffer management instructions facilitate the creation and usage of a dedicated memory space or stream buffer of the processor in one embodiment of the invention. The dedicated memory space is a contiguous memory space and has a sequential or linear addressing scheme in one embodiment of the invention. The processor has logic to execute a stream buffer management instruction to copy data from a source memory address to a destination memory address that is specified with a desired level of memory hierarchy.
Type: Application
Filed: November 29, 2010
Publication date: May 31, 2012
Inventors: Daehyun Kim, Changkyu Kim, Victor W. Lee, Jatin Chhugani, Nadathur Rajagopalan Satish
-
Publication number: 20120131265
Abstract: A method of writing data units to a storage device. The data units are cached in a first level cache sorted by logical address. A group (Gj) of sorted data units is transferred from the first level cache to a second level cache embodied in a solid state memory device. Data units of multiple groups (Gj) are sorted in the second level cache by logical address. The sorted data units stemming from the multiple groups are written to the storage device.
Type: Application
Filed: October 28, 2011
Publication date: May 24, 2012
Applicant: International Business Machines Corporation
Inventors: Ioannis Koltsidas, Roman Pletka
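The two-stage sorting in this abstract can be pictured with a short Python sketch. The function names, the dictionary used as the first level cache, and the list used as the storage device are hypothetical; merging the already-sorted groups with `heapq.merge` is one possible way to sort across groups, not necessarily the method claimed.

```python
import heapq


# Hypothetical sketch of the two-level sorted write cache.
def flush_first_level(first_level, second_level_groups):
    """Move the first level cache contents to the second level as one
    group Gj, sorted by logical address."""
    group = sorted(first_level.items())  # (logical_address, data) pairs
    second_level_groups.append(group)
    first_level.clear()


def write_back(second_level_groups, storage):
    """Merge all sorted groups by logical address and write them out."""
    merged = heapq.merge(*second_level_groups, key=lambda unit: unit[0])
    for logical_address, data in merged:
        storage.append((logical_address, data))
    second_level_groups.clear()
```

Because each group is already sorted, the merge yields the data units in ascending logical-address order, which lets the backing device be written mostly sequentially.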
-
Publication number: 20120117326
Abstract: The present invention relates to an apparatus and a method for accessing a cache memory. The cache memory comprises a level-one memory and a level-two memory. The apparatus for accessing the cache memory according to the present invention comprises a register unit and a control unit. The control unit receives a first read command and a reject datum of the level-one memory and stores the reject datum of the level-one memory to the register unit. Then the control unit reads and stores a stored datum of the level-two memory to the level-one memory according to the first read command.
Type: Application
Filed: November 3, 2011
Publication date: May 10, 2012
Applicant: REALTEK SEMICONDUCTOR CORP.
Inventors: YEN-JU LU, JUI-YUAN LIN
-
Publication number: 20120110266
Abstract: Methods and apparatus relating to disabling one or more cache portions during low voltage operations are described. In some embodiments, one or more extra bits may be used for a portion of a cache that indicate whether the portion of the cache is capable of operating at or below Vccmin levels. Other embodiments are also described and claimed.
Type: Application
Filed: December 31, 2011
Publication date: May 3, 2012
Inventors: Christopher Wilkerson, M. Muhammad Khellah, Vivek De, Ming Y. Zhang, Jaume Abella, Javier Carretero Casado, Pedro Chaparro Monferrer, Xavier Vera, Antonio Gonzalez
-
Patent number: 8171224
Abstract: A method of providing history based done logic for a D-cache includes receiving a D-cache line in an L2 cache; determining if the D-cache line is unprefetchable; aging the D-cache line without a delay if the D-cache line is prefetchable; and aging the D-cache line with a delay if the D-cache line is unprefetchable.
Type: Grant
Filed: May 28, 2009
Date of Patent: May 1, 2012
Assignee: International Business Machines Corporation
Inventor: David A. Luick
-
Publication number: 20120102269
Abstract: The disclosed embodiments provide a system that uses speculative cache requests to reduce cache miss delays for a cache in a multi-level memory hierarchy. During operation, the system receives a memory reference which is directed to a cache line in the cache. Next, while determining whether the cache line is available in the cache, the system determines whether the memory reference is likely to miss in the cache, and if so, simultaneously sends a speculative request for the cache line to a lower level of the multi-level memory hierarchy.
Type: Application
Filed: October 21, 2010
Publication date: April 26, 2012
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventors: Tarik Ono, Mark R. Greenstreet
-
Publication number: 20120089782
Abstract: A method for managing data movement in a multi-level cache system having a primary cache and a secondary cache. The method includes determining whether an unallocated space of the primary cache has reached a minimum threshold; selecting at least one outgoing data block from the primary cache when the primary cache has reached the minimum threshold; initiating a de-stage process for de-staging the outgoing data block from the primary cache; and terminating the de-stage process when the unallocated space of the primary cache has reached an upper threshold. The de-stage process further includes determining whether a cache hit has occurred in the secondary cache before; storing the outgoing data block in the secondary cache when the cache hit has occurred in the secondary cache before; generating and storing metadata regarding the outgoing data block; and deleting the outgoing data block from the primary cache.
Type: Application
Filed: October 7, 2010
Publication date: April 12, 2012
Applicant: LSI CORPORATION
Inventors: Brian D. McKean, Donald R. Humlicek, Timothy R. Snider
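The two-threshold de-stage loop in this abstract can be sketched in Python as below. Several details are assumptions made only for illustration: space is counted in whole blocks, the oldest block is selected first, the "cache hit has occurred before" test is interpreted per block via a hypothetical `secondary_hit_history` set, and the metadata is a simple list of dictionaries.

```python
from collections import OrderedDict


# Hypothetical sketch of a de-stage pass with a low and a high watermark.
def destage(primary, secondary, secondary_hit_history,
            capacity, min_free, upper_free):
    """De-stage blocks from `primary` (an OrderedDict, oldest first).

    Starts only when free space is at or below `min_free` and stops once
    free space reaches `upper_free`. Returns the generated metadata.
    """
    metadata = []
    free = capacity - len(primary)
    if free > min_free:
        return metadata  # minimum threshold not reached: nothing to do
    while free < upper_free and primary:
        # popitem removes the outgoing block from the primary cache
        block_id, data = primary.popitem(last=False)
        kept = block_id in secondary_hit_history
        if kept:
            secondary[block_id] = data  # block has hit in secondary before
        metadata.append({"block": block_id, "in_secondary": kept})
        free += 1
    return metadata
```

Using two thresholds gives the loop hysteresis: once triggered, it frees a batch of blocks rather than restarting after every single allocation.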
-
Publication number: 20120084511
Abstract: A processor of an information handling system (IHS) initiates an L3 cache prefetch operation in response to a demand load during instruction processing. The processor selects an L3 cache prefetch at random for tracking as a target prefetched instruction. The processor initiates an L1 cache target prefetch operation and stores the resultant target prefetched instruction in the L1 cache. If a demand load arrives, the processor analyses the target prefetched instruction for effectiveness and determines the source of the prefetch data. If a demand does not arrive, the processor tests to determine if the particular prefetched instruction timed out in the cache and identifies the ineffectiveness of the prefetch operation. The processor samples multiple prefetch operations at random and generates a history of prefetch effectiveness and other useful prefetch information. The processor stores the prefetch effectiveness information to enable reduction or removal of ineffective prefetch operations.
Type: Application
Filed: October 4, 2010
Publication date: April 5, 2012
Applicant: International Business Machines Corporation
Inventors: Miles R. Dooley, Venkat R. Indukuru, Alex E. Mericas, Francis P. O'Connell
-
Publication number: 20120084497
Abstract: An apparatus of an aspect includes a prefetch cache line address predictor to receive a cache line address and to predict a next cache line address to be prefetched. The next cache line address may indicate a cache line having at least 64 bytes of instructions. The prefetch cache line address predictor may have a cache line target history storage to store a cache line target history for each of multiple most recent corresponding cache lines. Each cache line target history may indicate whether the corresponding cache line had a sequential cache line target or a non-sequential cache line target. The cache line address predictor may also have a cache line target history predictor. The cache line target history predictor may predict whether the next cache line address is a sequential cache line address or a non-sequential cache line address, based on the cache line target history for the most recent cache lines.
Type: Application
Filed: September 30, 2010
Publication date: April 5, 2012
Inventors: Samantika Subramaniam, Aamer Jaleel, Simon C. Steely, JR.
-
Publication number: 20120079202
Abstract: A prefetching system receives a memory read request having an associated address. In response to a determination that a most significant portion of the associated address is not present within slots of an array for storing the most significant portion of predicted addresses, a prefetch FIFO (First In-First Out) counter is modified to point to a next slot of the array and a new predicted address is generated in response to the received most significant portion of the associated address and is placed in the next slot of the array. The prefetch FIFO counter cycles through the slots of the array before wrapping around to a first slot of the array for storing the most significant portion of predicted addresses.
Type: Application
Filed: July 4, 2011
Publication date: March 29, 2012
Inventor: Kai Chirca
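The wrap-around slot allocation in this abstract can be illustrated with a small Python sketch. The class name `PrefetchSlots`, the method name, and the `predict` callback are hypothetical; how a predicted address is actually generated is not stated in the abstract, so the caller supplies that function here.

```python
# Hypothetical sketch of FIFO slot allocation for predicted addresses.
class PrefetchSlots:
    def __init__(self, num_slots):
        self.slots = [None] * num_slots  # MSBs of predicted addresses
        self.fifo = num_slots - 1        # so the first allocation uses slot 0

    def lookup_or_allocate(self, addr_msb, predict):
        """Return (slot_index, hit).

        On a miss, advance the FIFO counter (wrapping to the first slot
        after the last one) and store a new predicted address there.
        """
        if addr_msb in self.slots:
            return self.slots.index(addr_msb), True
        self.fifo = (self.fifo + 1) % len(self.slots)
        self.slots[self.fifo] = predict(addr_msb)
        return self.fifo, False
```

For example, with `predict = lambda msb: msb + 1` (an assumed next-line predictor), a request for region 10 allocates a slot holding 11, and a later request for region 11 hits that slot.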
-
Publication number: 20120079203
Abstract: A shared resource within a module may be accessed by a request from an external requester. An external transaction request may be received from an external requester outside the module for access to the shared resource that includes control information, not all of which is needed to access the shared resource. The external transaction request may be modified to form a modified request by removing a portion of the locally unneeded control information and storing the unneeded portion of control information as an entry in a bypass buffer. A reply received from the shared resource may be modified by appending the stored portion of control information from the entry in the bypass buffer before sending the modified reply to the external requester.
Type: Application
Filed: September 22, 2011
Publication date: March 29, 2012
Inventors: Dheera Balasubramanian, Raguram Damodaran
-
Publication number: 20120072668
Abstract: A prefetch unit generates a prefetch address in response to an address associated with a memory read request received from the first or second cache. The prefetch unit includes a prefetch buffer that is arranged to store the prefetch address in an address buffer of a selected slot of the prefetch buffer, where each slot of the prefetch unit includes a buffer for storing a prefetch address, and two sub-slots. Each sub-slot includes a data buffer for storing data that is prefetched using the prefetch address stored in the slot, and one of the two sub-slots of the slot is selected in response to a portion of the generated prefetch address. Subsequent hits on the prefetcher result in returning prefetched data to the requestor in response to a subsequent memory read request received after the initial received memory read request.
Type: Application
Filed: September 15, 2011
Publication date: March 22, 2012
Inventors: Kai Chirca, Joseph R. M. Zbiciak, Matthew D. Pierson
-
Publication number: 20120072667
Abstract: A prefetch unit generates prefetch addresses in response to an initial received memory read request, an address associated with the initial received memory read request, a line length of the requestor of the initial received memory read request, and a request type width of the initial received memory read request. Prefetch operations are generated using the generated prefetch addresses, wherein each generated prefetch address is stored in a prefetch buffer slot that is selected by a prefetch FIFO (First In First Out) prefetch counter. Subsequent hits on the prefetcher result in returning prefetched data to the requestor in response to a subsequent memory read request received after the initial received memory read request.
Type: Application
Filed: August 25, 2011
Publication date: March 22, 2012
Inventors: Timothy D. Anderson, Kai Chirca
-
Patent number: 8140760
Abstract: A method of providing history based done logic for an I-cache includes receiving an I-cache line in an L2 cache; determining if the I-cache line is unprefetchable; aging the I-cache line without a delay if the I-cache line is prefetchable; and aging the I-cache line with a delay if the I-cache line is unprefetchable.
Type: Grant
Filed: May 28, 2009
Date of Patent: March 20, 2012
Assignee: International Business Machines Corporation
Inventor: David A. Luick
-
Publication number: 20120066455
Abstract: A hybrid prefetch method and apparatus is disclosed. A processor includes a hybrid prefetch unit configured to generate addresses for accessing data from a system memory. The hybrid prefetch unit includes a first prediction unit configured to generate a first memory address according to a first prefetch algorithm and a second prediction unit configured to generate a second memory address according to a second prefetch algorithm. The hybrid prefetcher further includes an arbitration unit configured to select one of the first and second memory addresses and further configured to provide the selected one of the first and second memory addresses during a prefetch operation.
Type: Application
Filed: September 9, 2010
Publication date: March 15, 2012
Inventors: Swamy Punyamurtula, Bharath Narashima Swamy
-
Publication number: 20120059996
Abstract: A mechanism is provided for avoiding cross-interrogates for a streaming data optimized level one cache. The mechanism adds a set of dedicated registers, referred to as “copex registers,” to track ownership of the cache lines that the co-processor's L1 cache holds exclusive. The mechanism extends the cache directory of the L2 cache by a bit that identifies exclusive ownership of a cache line in the co-processor cache. The co-processor continuously provides an indication of which copex registers are valid. On any action that requires a directory lookup in the L2 cache, the mechanism compares the valid copex registers against the lookup address in parallel to the directory lookup. The mechanism considers the “exclusive ownership in co-processor” bit in the directory valid only if the cache line is also currently in a valid copex register.
Type: Application
Filed: September 7, 2010
Publication date: March 8, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Christian Habermann, Christian Jacobi, Martin Recktenwald, Hans-Werner Tast
-
Publication number: 20120059995
Abstract: Techniques and methods are used to reduce allocations to a higher level cache of cache lines displaced from a lower level cache. The allocations of the displaced cache lines are prevented for displaced cache lines that are determined to be redundant in the next level cache, whereby castouts are reduced. To such ends, a line is selected to be displaced in a lower level cache. Information associated with the selected line is identified which indicates that the selected line is present in a higher level cache. An allocation of the selected line in the higher level cache is prevented based on the identified information. Preventing an allocation of the selected line saves power that would be associated with the allocation.
Type: Application
Filed: November 9, 2011
Publication date: March 8, 2012
Applicant: QUALCOMM INCORPORATED
Inventors: Thomas Philip Speier, James Norris Dieffenderfer, Thomas Andrew Sartorius
-
Publication number: 20120054439
Abstract: The present invention provides a method and apparatus for allocating cache bandwidth to multiple processors. One embodiment of the method includes delaying, at a local device associated with a local cache, a first cache probe from a non-local device to the local cache following a second cache probe from the non-local device that matches a third cache probe from the local device.
Type: Application
Filed: August 24, 2010
Publication date: March 1, 2012
Inventor: William L. Walker
-
Publication number: 20120054440
Abstract: The present invention is related to a method for determining duplicate clicks via a multi-layered cache. The method includes establishing, by a cache manager executing on a device, a cache comprising a hierarchy of a plurality of cache layers. The cache manager may establish a first cache layer of the plurality of cache layers as a size bounded cache layer. The cache manager may further establish a second cache layer of the plurality of cache layers as a time bounded cache layer. In some embodiments, the second cache layer may encapsulate the first cache layer. The cache manager may receive a request to determine whether a click or an ad view is stored in the cache. The cache manager may determine whether the click or the ad view is stored in one of the first cache layer or the second cache layer.
Type: Application
Filed: August 31, 2010
Publication date: March 1, 2012
Inventors: Toby Doig, Dominic Davis
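One way to picture the layered duplicate check is the Python sketch below, in which a time bounded layer wraps a size bounded layer. The class names, the least-recently-used eviction in the inner layer, the explicit `now` argument, and the choice to refresh timestamps on every request are all illustrative assumptions that the abstract does not specify.

```python
from collections import OrderedDict


# Hypothetical sketch of a two-layer duplicate-click cache.
class SizeBoundedLayer:
    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()

    def contains(self, key):
        return key in self.entries

    def add(self, key):
        self.entries[key] = True
        self.entries.move_to_end(key)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # drop least recently added


class TimeBoundedLayer:
    """Outer layer that encapsulates the size bounded layer."""

    def __init__(self, inner, ttl_seconds):
        self.inner = inner
        self.ttl = ttl_seconds
        self.seen_at = {}

    def seen_before(self, key, now):
        """Return True if `key` is stored in either layer, then record it."""
        stamp = self.seen_at.get(key)
        if stamp is not None and now - stamp >= self.ttl:
            del self.seen_at[key]  # entry expired in the time bounded layer
            stamp = None
        duplicate = self.inner.contains(key) or stamp is not None
        self.inner.add(key)
        self.seen_at[key] = now
        return duplicate
```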
-
Publication number: 20120042126
Abstract: The present invention provides a method and apparatus for use with a hierarchical cache system. The method may include concurrently flushing one or more first caches and a second cache of a multi-level cache. Each first cache is smaller and at a lower level in the multi-level cache than the second level cache.
Type: Application
Filed: August 11, 2010
Publication date: February 16, 2012
Inventors: Robert KRICK, David Kaplan
-
Publication number: 20120030429
Abstract: A computer method and system of caching. In a multi-threaded application, different threads execute respective transactions accessing a data store (e.g. database) from a single server. The method and system represent status of datastore transactions using respective certain (e.g. Future) parameters. Results of the said transactions are cached based on transaction status as represented by the certain parameters and on data store determination of a subject transaction. The caching employs a two stage commit and effectively forms a two level cache. One level maps from datastore keys to entries in the cache. Each entry stores a respective last known commit value. The second level provides an optional mapping from a respective transaction as represented by the corresponding certain parameter to an updated value.
Type: Application
Filed: October 4, 2011
Publication date: February 2, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: James M. Synge
-
Publication number: 20110320720
Abstract: Cache line replacement in a symmetric multiprocessing computer, the computer having a plurality of processors, a main memory that is shared among the processors, a plurality of cache levels including at least one high level of private caches and a low level shared cache, and a cache controller that controls the shared cache, including receiving in the cache controller a memory instruction that requires replacement of a cache line in the low level shared cache; and selecting for replacement by the cache controller a least recently used cache line in the low level shared cache that has no copy stored in any higher level cache.
Type: Application
Filed: June 23, 2010
Publication date: December 29, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Craig Walters, Vijayalakshmi Srinivasan
-
Publication number: 20110320721
Abstract: A computer-implemented method for managing data transfer in a multi-level memory hierarchy that includes receiving a fetch request for allocation of data in a higher level memory, determining whether a data bus between the higher level memory and a lower level memory is available, bypassing an intervening memory between the higher level memory and the lower level memory when it is determined that the data bus is available, and transferring the requested data directly from the higher level memory to the lower level memory.
Type: Application
Filed: June 24, 2010
Publication date: December 29, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Deanna Postles Dunn Berger, Michael Fee, Arthur J. O'Neill, JR., Robert J. Sonnelitter, III
-
Publication number: 20110314202
Abstract: Embodiments of the invention provide techniques for managing cache metadata providing a mapping between addresses on a storage medium (e.g., disk storage) and corresponding addresses on a cache device at which data items are stored. In some embodiments, cache metadata may be stored in a hierarchical data structure comprising a plurality of hierarchy levels. When a reboot of the computer is initiated, only a subset of the plurality of hierarchy levels may be loaded to memory, thereby expediting the process of restoring the cache metadata and thus startup operations. Startup may be further expedited by using cache metadata to perform operations associated with reboot. Thereafter, as requests to read data items on the storage medium are processed using cache metadata to identify addresses at which the data items are stored in cache, the identified addresses may be stored in memory.
Type: Application
Filed: August 30, 2011
Publication date: December 22, 2011
Applicant: Microsoft Corporation
Inventors: Mehmet Iyigun, Yevgeniy Bak, Michael Fortin, David Fields, Cenk Ergan, Alexander Kirshenbaum
-
Publication number: 20110302561
Abstract: A data layout optimization may utilize affinity estimation between pairs of fields of a record in a computer program. The affinity estimation may be determined based on a trace of an execution and in view of actual processing entities performing each access to the fields. The disclosed subject matter may be configured to be aware of a specific architecture of a target computer having a plurality of processing entities, executing the program so as to provide an improved affinity estimation which may take into account both false sharing issues, spatial locality improvement and the like.
Type: Application
Filed: June 8, 2010
Publication date: December 8, 2011
Applicant: International Business Machines Corporation
Inventors: Alon Dayan, David Joel Edelsohn, Olga Golovanevsky, Ayal Zaks
-
Publication number: 20110296093
Abstract: Methods for programming and sensing in a memory device, a data cache, and a memory device are disclosed. In one such method, all of the bit lines of a memory block are programmed or sensed during the same program or sense operation by alternately multiplexing the odd or even page bit lines to the dynamic data cache. The dynamic data cache comprises dual SDC, PDC, DDC1, and DDC2 circuits such that one set of circuits is coupled to the odd page bit lines and the other set of circuits is coupled to the even page bit lines.
Type: Application
Filed: August 11, 2011
Publication date: December 1, 2011
Inventor: Chang Wan HA
-
Publication number: 20110276762
Abstract: A data processing system includes a processor core and a cache memory hierarchy coupled to the processor core. The cache memory hierarchy includes at least one upper level cache and a lowest level cache. A memory controller is coupled to the lowest level cache and to a system memory and includes a physical write queue from which the memory controller writes data to the system memory. The memory controller initiates accesses to the lowest level cache to place into the physical write queue selected cachelines having spatial locality with data present in the physical write queue.
Type: Application
Filed: May 7, 2010
Publication date: November 10, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: DAVID M. DALY, BENJIMAN L. GOODMAN, HILLERY C. HUNTER, WILLIAM J. STARKE, JEFFREY A. STUECHELI
-
Publication number: 20110276763
Abstract: A data processing system includes a multi-level cache hierarchy including a lowest level cache, a processor core coupled to the multi-level cache hierarchy, and a memory controller coupled to the lowest level cache and to a memory bus of a system memory. The memory controller includes a physical read queue that buffers data read from the system memory via the memory bus and a physical write queue that buffers data to be written to the system memory via the memory bus. The memory controller grants priority to write operations over read operations on the memory bus based upon a number of dirty cachelines in the lowest level cache memory.
Type: Application
Filed: May 7, 2010
Publication date: November 10, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: DAVID M. DALY, BENJIMAN L. GOODMAN, HILLERY C. HUNTER, WILLIAM J. STARKE, JEFFREY A. STUECHELI
-
Publication number: 20110271057
Abstract: The disclosed embodiments provide a system that filters duplicate requests from an L1 cache for a cache line. During operation, the system receives at an L2 cache a first request and a second request for the same cache line, and stores identifying information for these requests. The system then performs a cache array look-up for the first request that, in the process of creating a load fill packet for the first request, loads the cache line into a fill buffer. After sending the load fill packet for the first request to the L1 cache, the system uses the cache line data still stored in the fill buffer and stored identifying information for the second fill request to send a subsequent load fill packet for the second request to the L1 cache without performing an additional cache array look-up.
Type: Application
Filed: May 3, 2010
Publication date: November 3, 2011
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Martin R. Karlsson
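The fill-buffer reuse described in this abstract can be sketched in Python as follows. The class name `L2Cache`, the tuple layout of requests and packets, and the `array_lookups` counter (added only to make the saving visible) are hypothetical and not taken from the patent.

```python
# Hypothetical sketch: serve duplicate L1 requests from the fill buffer
# so that only one cache array look-up is performed per cache line.
class L2Cache:
    def __init__(self, array):
        self.array = array      # line address -> line data
        self.array_lookups = 0  # counts cache array look-ups

    def serve(self, requests):
        """Build load fill packets for a batch of (request_id, line_addr)."""
        fill_buffer = {}
        packets = []
        for request_id, line_addr in requests:
            if line_addr not in fill_buffer:
                self.array_lookups += 1
                fill_buffer[line_addr] = self.array[line_addr]
            # A duplicate request reuses the data still in the fill buffer.
            packets.append((request_id, line_addr, fill_buffer[line_addr]))
        return packets
```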
-
Publication number: 20110264860
Abstract: A microprocessor includes first and second cache memories occupying distinct hierarchy levels, the second backing the first. A prefetcher monitors load operations and maintains a recent history of the load operations from a cache line and determines whether the recent history indicates a clear direction. The prefetcher prefetches one or more cache lines into the first cache memory when the recent history indicates a clear direction and otherwise prefetches the one or more cache lines into the second cache memory. The prefetcher also determines whether the recent history indicates the load operations are large and, other things being equal, prefetches a greater number of cache lines when large than small. The prefetcher also determines whether the recent history indicates the load operations are received on consecutive clock cycles and, other things being equal, prefetches a greater number of cache lines when on consecutive clock cycles than not.
Type: Application
Filed: August 26, 2010
Publication date: October 27, 2011
Applicant: VIA Technologies, Inc.
Inventors: Rodney E. Hooker, Colin Eddy
-
Publication number: 20110264861
Abstract: Execution of code in a multitenant runtime environment. A request to execute code corresponding to a tenant identifier (ID) is received in a multitenant environment. The multitenant database stores data for multiple client entities each identified by a tenant ID having one of one or more users associated with the tenant ID. Users of each of multiple client entities can only access data identified by a tenant ID associated with the respective client entity. The multitenant database is a hosted database provided by an entity separate from the client entities, and provides on-demand database service to the client entities. Source code corresponding to the code to be executed is retrieved from a multitenant database. The retrieved source code is compiled. The compiled code is executed in the multitenant runtime environment. The memory used by the compiled code is freed in response to completion of the execution of the compiled code.
Type: Application
Filed: April 21, 2011
Publication date: October 27, 2011
Applicant: SALESFORCE.COM
Inventors: Gregory D. Fee, William J. Gallagher
-
Patent number: 8041894
Abstract: Method and system for a multi-level virtual/real cache system with synonym resolution. An exemplary embodiment includes a multi-level cache hierarchy, including a set of L1 caches associated with one or more processor cores and a set of L2 caches, wherein the set of L1 caches are a subset of the set of L2 caches, wherein the set of L1 caches underneath a given L2 cache are associated with one or more of the processor cores.
Type: Grant
Filed: February 25, 2008
Date of Patent: October 18, 2011
Assignee: International Business Machines Corporation
Inventors: Barry W. Krumm, Christian Jacobi, Chung-Lung Kevin Shum, Hans-Werner Tast, Aaron Tsai, Ching-Farn E. Wu
-
Publication number: 20110219190
Abstract: A method and apparatus for repopulating a cache are disclosed. At least a portion of the contents of the cache are stored in a location separate from the cache. Power is removed from the cache and is restored some time later. After power has been restored to the cache, it is repopulated with the portion of the contents of the cache that were stored separately from the cache.
Type: Application
Filed: March 3, 2010
Publication date: September 8, 2011
Applicant: ATI Technologies ULC
Inventors: Philip Ng, Jimshed B. Mirza, Anthony Asaro
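The save, power-down, and repopulate sequence in the abstract of publication 20110219190 can be sketched as follows. This is a toy Python model, not the disclosed apparatus: the class `PowerGatedCache`, the `keep` set that selects which lines are saved, and the dictionary used as the separate storage location are all hypothetical.

```python
class PowerGatedCache:
    """Toy cache whose contents are lost when power is removed."""

    def __init__(self):
        self.lines = {}      # line address -> data
        self.powered = True

    def power_down(self, keep):
        """Copy the selected lines to separate storage, then drop power."""
        saved = {addr: data for addr, data in self.lines.items() if addr in keep}
        self.lines.clear()   # contents do not survive the power removal
        self.powered = False
        return saved

    def power_up(self, saved):
        """Restore power and repopulate the cache from the saved copy."""
        self.powered = True
        self.lines.update(saved)


cache = PowerGatedCache()
cache.lines = {0x00: "a", 0x40: "b", 0x80: "c"}
saved = cache.power_down(keep={0x00, 0x80})
lines_while_off = dict(cache.lines)
cache.power_up(saved)
```

Only the portion that was saved before the power-down is available after power is restored; the line at `0x40` would have to be refetched on demand.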
-
Publication number: 20110213947
Abstract: A technique for reducing the power consumption required to execute processing operations. A processing complex, such as a CPU or a GPU, includes a first set of cores comprising one or more fast cores and second set of cores comprising one or more slow cores. A processing mode of the processing complex can switch between a first mode of operation and a second mode of operation based on one or more of the workload characteristics, performance characteristics of the first and second sets of cores, power characteristics of the first and second sets of cores, and operating conditions of the processing complex. A controller causes the processing operations to be executed by either the first set of cores or the second set of cores to achieve the lowest total power consumption.
Type: Application
Filed: May 25, 2010
Publication date: September 1, 2011
Inventors: John George Mathieson, Phil Carmack, Brian Smith
-
Publication number: 20110208915
Abstract: In an embodiment, a processor may be configured to detect a store exclusive operation followed by a memory barrier operation in a speculative instruction stream being executed by the processor. The processor may fuse the store exclusive operation and the memory barrier operation, creating a fused operation. The fused operation may be transmitted and globally ordered, and the processor may complete both the store exclusive operation and the memory barrier operation in response to the fused operation. As the fused operation progresses through the processor and one or more other components (e.g. caches in the cache hierarchy) to the ordering point in the system, the fused operation may push previous memory operations to effect the memory barrier operation. In some embodiments, the latency for completing the store exclusive operation and the subsequent data memory barrier operation may be reduced if the store exclusive operation is successful at the ordering point.
Type: Application
Filed: February 24, 2010
Publication date: August 25, 2011
Inventors: Peter J. Bannon, Po-Yung Chang
-
Patent number: 8006036
Abstract: The present invention comprises a CHA 110 which transmits/receives data to/from an external device, a DKA 140 which transmits/receives data to/from an HDD unit 200, a primary cache unit 120 which has a primary cache memory 124, a secondary cache unit 130 which is installed between the primary cache unit 120 and the DKA 140 and has a secondary cache memory 134, a CCP 121 which stores write target data received by the CHA 110 in the primary cache memory 124, and a CCP 131 which stores the write target data in the secondary cache memory 134, and transfers the write target data stored in the secondary cache memory 134 to the DKA 140.
Type: Grant
Filed: January 24, 2008
Date of Patent: August 23, 2011
Assignee: Hitachi, Ltd.
Inventors: Tatsuya Ninomiya, Kazuo Tanaka
-
Publication number: 20110202727
Abstract: Techniques and methods are used to reduce allocations to a higher level cache of cache lines displaced from a lower level cache. The allocations of the displaced cache lines are prevented for displaced cache lines that are determined to be redundant in the next level cache, whereby castouts are reduced. To such ends, a line is selected to be displaced in a lower level cache. Information associated with the selected line is identified which indicates that the selected line is present in a higher level cache or the selected line is a write-through line. An allocation of the selected line in the higher level cache is prevented based on the identified information. Preventing an allocation of the selected line saves power that would be associated with the allocation.
Type: Application
Filed: February 18, 2010
Publication date: August 18, 2011
Applicant: QUALCOMM INCORPORATED
Inventors: Thomas Philip Speier, James Norris Dieffenderfer, Thomas Andrew Sartorius
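The castout filtering rule in the abstract of publication 20110202727 reduces to a simple predicate, sketched here in Python. The `Line` record and its two flags are hypothetical stand-ins for the "information associated with the selected line"; how that information is actually stored and maintained is not stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Line:
    address: int
    in_next_level: bool = False   # assumed flag: next-level cache already holds the line
    write_through: bool = False   # assumed flag: line is a write-through line


def should_allocate_on_castout(line):
    """Allocate in the next-level cache only if the copy would not be redundant."""
    return not (line.in_next_level or line.write_through)


def castouts(displaced):
    """Return the addresses that still need an allocation in the next level."""
    return [line.address for line in displaced if should_allocate_on_castout(line)]


displaced = [
    Line(0x00),                       # not present elsewhere: must be allocated
    Line(0x40, in_next_level=True),   # redundant: allocation is prevented
    Line(0x80, write_through=True),   # write-through: allocation is prevented
]
needed = castouts(displaced)
```

Of the three displaced lines in this example, only the first results in an allocation, which is where the abstract's power saving would come from.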
-
Publication number: 20110202726
Abstract: A data processing apparatus for forming a portion of a coherent cache system comprises at least one master device for performing data processing operations, and a cache coupled to the at least one master device and arranged to store data values for access by that at least one master device when performing the data processing operations. Cache coherency circuitry is responsive to a coherency request from another portion of the coherent cache system to cause a coherency action to be taken in respect of at least one data value stored in the cache. Responsive to an indication that the coherency action has resulted in invalidation of that at least one data value in the cache, refetch control circuitry is used to initiate a refetch of that at least one data value into the cache.
Type: Application
Filed: February 12, 2010
Publication date: August 18, 2011
Applicant: ARM Limited
Inventors: Christopher William Laycock, Antony John Harris, Bruce James Mathewson, Andrew Christopher Rose, Richard Roy Grisenthwaite
-
Publication number: 20110197030
Abstract: In one embodiment, a system comprises a plurality of agents coupled to an interconnect and a cache coupled to the interconnect. The plurality of agents are configured to cache data. A first agent of the plurality of agents is configured to initiate a transaction on the interconnect by transmitting a memory request, and other agents of the plurality of agents are configured to snoop the memory request from the interconnect. The other agents provide a response in a response phase of the transaction on the interconnect. The cache is configured to detect a hit for the memory request and to provide data for the transaction to the first agent prior to the response phase and independent of the response.
Type: Application
Filed: April 18, 2011
Publication date: August 11, 2011
Inventors: Brian P. Lilly, Sridhar P. Subramanian, Ramesh Gunna
-
Publication number: 20110185125
Abstract: A processor may include several processor cores, each including a respective higher-level cache; a lower-level cache including several tag units each including several controllers, where each controller corresponds to a respective cache bank configured to store data, and where the controllers are concurrently operable to access their respective cache banks; and an interconnect network configured to convey data between the cores and the lower-level cache. The controllers may share access to an interconnect egress port coupled to the interconnect network, and may generate multiple concurrent requests to convey data via the shared port, where each of the requests is destined for a corresponding core, and where a datapath width of the port is less than a combined width of the multiple requests. The given tag unit may arbitrate among the controllers for access to the shared port, such that the requests are transmitted to corresponding cores serially rather than concurrently.
Type: Application
Filed: January 27, 2010
Publication date: July 28, 2011
Inventors: Prashant Jain, Yoganand Chillarige, Sandip Das, Shukur Moulali Pathan, Srinivasan R. Iyengar, Sanjay Patel
-
Publication number: 20110167224
Abstract: A cache memory according to an aspect of the present invention including entries each of which includes a tag address, line data, and a dirty flag, the cache memory includes: a command execution unit which rewrites, when a first command is instructed by a processor, a tag address included in at least one entry specified by the processor among the entries to a tag address corresponding to an address specified by the processor, and to set a dirty flag corresponding to the entry; and a write-back unit which writes, back to a main memory, the line data included in the entry in which the dirty flag is set.
Type: Application
Filed: March 15, 2011
Publication date: July 7, 2011
Applicant: PANASONIC CORPORATION
Inventor: Takanori ISONO
-
Publication number: 20110161586
Abstract: Technologies are described herein related to multi-core processors that are adapted to share processor resources. An example multi-core processor can include a plurality of processor cores. The multi-core processor further can include a shared register file selectively coupled to two or more of the plurality of processor cores, where the shared register file is adapted to serve as a shared resource among the selected processor cores.
Type: Application
Filed: December 29, 2009
Publication date: June 30, 2011
Inventors: Miodrag Potkonjak, Nathan Zachary Beckmann
-
Publication number: 20110161589
Abstract: A data processing system includes first and second processing units and a system memory. The first processing unit has first upper and first lower level caches, and the second processing unit has second upper and lower level caches. In response to a data request, a victim cache line to be castout from the first lower level cache is selected, and the first lower level cache selects between performing a lateral castout (LCO) of the victim cache line to the second lower level cache and a castout of the victim cache line to the system memory based upon a confidence indicator associated with the victim cache line. In response to selecting an LCO, the first processing unit issues an LCO command on the interconnect fabric and removes the victim cache line from the first lower level cache, and the second lower level cache holds the victim cache line.
Type: Application
Filed: December 30, 2009
Publication date: June 30, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Guy L. Guthrie, William J. Starke, Jeffrey A. Stuecheli, Derek E. Williams, Thomas R. Puzak
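The choice between a lateral castout (LCO) and a castout to system memory in the abstract of publication 20110161589 can be sketched in Python. The abstract says only that the choice is based on a confidence indicator; this sketch assumes, purely for illustration, that the indicator is a small integer and that values at or above a hypothetical `threshold` favor the LCO. The caches and memory are modeled as plain dictionaries.

```python
def choose_castout(confidence, threshold=2):
    """Assumed policy: high confidence keeps the line on chip via an LCO."""
    return "LCO" if confidence >= threshold else "MEMORY"


def castout(victim, confidence, local_cache, peer_cache, memory, threshold=2):
    """Remove the victim from the local cache and place it per the chosen policy."""
    data = local_cache.pop(victim)   # the victim always leaves the first cache
    if choose_castout(confidence, threshold) == "LCO":
        peer_cache[victim] = data    # the second lower level cache holds the line
        return "LCO"
    memory[victim] = data            # otherwise the line goes to system memory
    return "MEMORY"


local = {0x100: "hot", 0x200: "cold"}
peer, memory = {}, {}
first = castout(0x100, confidence=3, local_cache=local, peer_cache=peer, memory=memory)
second = castout(0x200, confidence=0, local_cache=local, peer_cache=peer, memory=memory)
```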