With Multilevel Cache Hierarchies (epo) Patents (Class 711/E12.024)
  • Publication number: 20120166729
    Abstract: A method and apparatus for controlling affinity of subcaches is disclosed. When a core compute unit evicts a line of victim data, a prioritized search for space allocation on available subcaches is executed, in order of proximity between the subcache and the compute unit. The victim data may be injected into an adjacent subcache if space is available. Otherwise, a line may be evicted from the adjacent subcache to make room for the victim data, or the victim data may be sent to the next closest subcache. To retrieve data, a core compute unit sends a Tag Lookup Request message directly to the nearest subcache as well as to a cache controller, which controls routing of messages to all of the subcaches. A Tag Lookup Response message is sent back to the cache controller to indicate whether the requested data is located in the nearest subcache. (A minimal sketch follows this entry.)
    Type: Application
    Filed: December 22, 2010
    Publication date: June 28, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventor: Greggory D. Donley
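    A minimal sketch of the prioritized placement search in 20120166729, assuming a simple capacity model and an oldest-line eviction fallback; both are illustrative, not details from the filing.
    ```python
    # Illustrative subcache model: each has a fixed capacity and a list of lines.
    def place_victim(victim_line, subcaches_by_proximity):
        """Search subcaches in order of proximity to the evicting compute unit."""
        for sub in subcaches_by_proximity:
            if len(sub["lines"]) < sub["capacity"]:
                sub["lines"].append(victim_line)   # space available: inject here
                return sub["name"]
        # Every subcache is full: evict from the adjacent (nearest) one to make room.
        adjacent = subcaches_by_proximity[0]
        adjacent["lines"].pop(0)                   # drop its oldest line (assumed policy)
        adjacent["lines"].append(victim_line)
        return adjacent["name"]

    subcaches = [
        {"name": "adjacent", "capacity": 2, "lines": ["a", "b"]},
        {"name": "next",     "capacity": 2, "lines": ["c"]},
    ]
    print(place_victim("victim", subcaches))  # -> "next": the adjacent subcache is full
    ```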
  • Publication number: 20120159075
    Abstract: A method of providing history based done logic includes receiving a cache line in an L2 cache; determining if the cache line has a history of access at least three times on a previous call into the L2 cache; providing the cache line directly to a processor if the history of access was less than the at least three times; and loading the cache line into an L1 cache if the history of access was the at least three times. (A minimal sketch follows this entry.)
    Type: Application
    Filed: February 27, 2012
    Publication date: June 21, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: David A. Luick
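    A toy rendering of the three-access rule in 20120159075; the per-line counter and its persistence across calls are assumptions about bookkeeping the abstract leaves open.
    ```python
    from collections import defaultdict

    ACCESS_THRESHOLD = 3          # "at least three times", per the abstract
    history = defaultdict(int)    # per-line access counts across calls (assumed)

    def on_l2_line(line_addr, l1_cache, core_queue):
        """Route an L2 cache line based on its access history."""
        if history[line_addr] >= ACCESS_THRESHOLD:
            l1_cache.add(line_addr)         # hot line: worth installing in L1
        else:
            core_queue.append(line_addr)    # cold line: bypass L1, feed the core
        history[line_addr] += 1

    l1, core = set(), []
    for _ in range(4):
        on_l2_line(0x40, l1, core)
    print(sorted(l1), core)   # the line reaches L1 only once its history hits 3
    ```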
  • Publication number: 20120159074
    Abstract: Embodiments of the invention relate to increased energy efficiency and conservation by reducing and increasing an amount of cache available for use by a processor, and an amount of power supplied to the cache and to the processor, based on the amount of cache actually being used by the processor to process data. For example, a power control unit (PCU) may monitor a last level cache (LLC) to identify the size or amount of the cache being used by a processor to process data and to determine heuristics based on that amount. Based on the monitored amount of cache being used and the heuristics, the PCU causes a corresponding decrease or increase in an amount of the cache available for use by the processor, and a corresponding decrease or increase in an amount of power supplied to the cache and to the processor.
    Type: Application
    Filed: December 23, 2011
    Publication date: June 21, 2012
    Inventors: Inder M. Sodhi, Satish K. Damaraju, Sanjeev S. Jahagirdar, Ryan D. Wells
  • Publication number: 20120159073
    Abstract: An apparatus and method for improving cache performance in a computer system having a multi-level cache hierarchy. For example, one embodiment of a method comprises: selecting a first line in a cache at level N for potential eviction; querying a cache at level M in the hierarchy to determine whether the first cache line is resident in the cache at level M, wherein M<N; in response to receiving an indication that the first cache line is not resident at level M, then evicting the first cache line from the cache at level N; in response to receiving an indication that the first cache line is resident at level M, then retaining the first cache line and choosing a second cache line for potential eviction. (A minimal sketch follows this entry.)
    Type: Application
    Filed: December 20, 2010
    Publication date: June 21, 2012
    Inventors: Aamer Jaleel, Simon C. Steely, JR., Eric R. Borch, Malini K. Bhandaru, Joel S. Emer
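    The query-before-evict flow of 20120159073 in miniature: candidates are tried in replacement order, and a line still resident at the inner level M is retained. The LRU fallback when every candidate is resident is an assumption.
    ```python
    def choose_victim(outer_lru_order, inner_resident):
        """Pick an L(N) victim, retaining lines still resident at level M (M < N)."""
        for line in outer_lru_order:            # candidates in replacement (LRU) order
            if line not in inner_resident:      # query level M for residency
                return line                     # not resident inner: safe to evict
        return outer_lru_order[0]               # all resident: fall back to plain LRU

    l3_lru = [0x100, 0x140, 0x180]              # least- to most-recently used at level N
    l1_lines = {0x100}                          # level-M (inner) residency
    print(hex(choose_victim(l3_lru, l1_lines))) # -> 0x140: 0x100 is retained
    ```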
  • Publication number: 20120151144
    Abstract: A method and computer device for determining the cache memory configuration. The method includes allocating an amount of cache memory from a first memory level of the cache memory, and determining a read transfer time for the allocated amount of cache memory. The allocated amount of cache memory then is increased and the read transfer time for the increased allocated amount of cache memory is determined. The allocated amount of cache memory continues to be increased and the read transfer time determined for each allocated amount until all of the cache memory in all of the cache memory levels has been allocated. The cache memory configuration is determined based on the read transfer times from the allocated portions of the cache memory. The determined cache memory configuration includes the number of cache memory levels and the respective capacities of each cache memory level. (A minimal sketch follows this entry.)
    Type: Application
    Filed: December 8, 2010
    Publication date: June 14, 2012
    Inventor: William Judge Yohn
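    The probe loop in 20120151144 can be imitated by timing reads over progressively larger allocations and looking for latency steps. This sketch only shows the shape of the procedure: the stride, sizes, and jump threshold are assumptions, and a serious probe would use pointer-chasing in C to defeat prefetchers and timer noise.
    ```python
    import time

    def read_transfer_time(size_bytes, reps=5, stride=64):
        """Average per-access read time over a buffer of the given size (a crude
        proxy for the patent's read transfer time per allocated amount)."""
        buf = bytearray(size_bytes)
        start = time.perf_counter()
        for _ in range(reps):
            total = 0
            for i in range(0, size_bytes, stride):
                total += buf[i]
        return (time.perf_counter() - start) / (reps * (size_bytes // stride))

    def level_boundaries(times, jump=1.3):
        """A latency jump between successive sizes suggests a new cache level."""
        sizes = sorted(times)
        return [sizes[i] for i in range(1, len(sizes))
                if times[sizes[i]] > jump * times[sizes[i - 1]]]

    samples = {2 ** k: read_transfer_time(2 ** k) for k in range(12, 22)}  # 4KiB..2MiB
    print(level_boundaries(samples))   # candidate capacities of the cache levels
    ```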
  • Patent number: 8200897
    Abstract: The present invention comprises a CHA 110 which transmits/receives data to/from an external device, a DKA 140 which transmits/receives data to/from an HDD unit 200, a primary cache unit 120 which has a primary cache memory 124, a secondary cache unit 130 which is installed between the primary cache unit 120 and the DKA 140 and has a secondary cache memory 134, a CCP 121 which stores write target data received by the CHA 110 in the primary cache memory 124, and a CCP 131 which stores the write target data in the secondary cache memory 134, and transfers the write target data stored in the secondary cache memory 134 to the DKA 140.
    Type: Grant
    Filed: July 8, 2011
    Date of Patent: June 12, 2012
    Assignee: Hitachi, Ltd.
    Inventors: Tatsuya Ninomiya, Kazuo Tanaka
  • Publication number: 20120137075
    Abstract: The invention relates to a multi-core processor system, in particular a single-package multi-core processor system, comprising at least two processor cores, preferably at least four processor cores, each of said at least two cores, preferably at least four processor cores, having a local LEVEL-1 cache, a tree communication structure combining the multiple LEVEL-1 caches, the tree having at least one node, preferably at least three nodes for a four processor core multi-core processor, and TAG information associated with data managed within the tree, usable in the treatment of the data.
    Type: Application
    Filed: June 9, 2010
    Publication date: May 31, 2012
    Applicant: HYPERION CORE, INC.
    Inventor: Martin Vorbach
  • Publication number: 20120137074
    Abstract: A method and system to perform stream buffer management instructions in a processor. The stream buffer management instructions facilitate the creation and usage of a dedicated memory space or stream buffer of the processor in one embodiment of the invention. The dedicated memory space is a contiguous memory space and has a sequential or linear addressing scheme in one embodiment of the invention. The processor has logic to execute a stream buffer management instruction to copy data from a source memory address to a destination memory address that is specified with a desired level of memory hierarchy.
    Type: Application
    Filed: November 29, 2010
    Publication date: May 31, 2012
    Inventors: Daehyun Kim, Changkyu Kim, Victor W. Lee, Jatin Chhugani, Nadathur Rajagopalan Satish
  • Publication number: 20120131265
    Abstract: A method of writing data units to a storage device. The data units are cached in a first level cache sorted by logical address. A group (Gj) of sorted data units is transferred from the first level cache to a second level cache embodied in a solid state memory device. Data units of multiple groups (Gj) are sorted in the second level cache by logical address. The sorted data units stemming from the multiple groups are written to the storage device. (A minimal sketch follows this entry.)
    Type: Application
    Filed: October 28, 2011
    Publication date: May 24, 2012
    Applicant: International Business Machines Corporation
    Inventors: Ioannis Koltsidas, Roman Pletka
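    A minimal staging sketch for 20120131265. The group size, the destage threshold, and the dict-based caches are assumptions, with a list standing in for the storage device.
    ```python
    GROUP_SIZE = 4   # size of a sorted group Gj handed to the second level (assumed)

    def write_unit(l1, l2, storage, lba, data):
        """Stage writes through two cache levels, kept sorted by logical address."""
        l1[lba] = data
        if len(l1) >= GROUP_SIZE:            # transfer one group Gj to the SSD level
            l2.update(l1)
            l1.clear()
        if len(l2) >= 2 * GROUP_SIZE:        # destage: merge groups, sorted by LBA
            for addr in sorted(l2):
                storage.append((addr, l2[addr]))   # near-sequential device writes
            l2.clear()

    l1, l2, disk = {}, {}, []
    for lba in [9, 3, 7, 1, 8, 2, 6, 0]:
        write_unit(l1, l2, disk, lba, f"d{lba}")
    print(disk)   # destaged in ascending logical-address order
    ```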
  • Publication number: 20120117326
    Abstract: The present invention relates to an apparatus and a method for accessing a cache memory. The cache memory comprises a level-one memory and a level-two memory. The apparatus for accessing the cache memory according to the present invention comprises a register unit and a control unit. The control unit receives a first read command and a reject datum of the level-one memory and stores the reject datum of the level-one memory to the register unit. Then the control unit reads and stores a stored datum of the level-two memory to the level-one memory according to the first read command.
    Type: Application
    Filed: November 3, 2011
    Publication date: May 10, 2012
    Applicant: REALTEK SEMICONDUCTOR CORP.
    Inventors: YEN-JU LU, JUI-YUAN LIN
  • Publication number: 20120110266
    Abstract: Methods and apparatus relating to disabling one or more cache portions during low voltage operations are described. In some embodiments, one or more extra bits may be used for a portion of a cache to indicate whether the portion of the cache is capable of operating at or below Vccmin levels. Other embodiments are also described and claimed.
    Type: Application
    Filed: December 31, 2011
    Publication date: May 3, 2012
    Inventors: Christopher Wilkerson, M. Muhammad Khellah, Vivek De, Ming Y. Zhang, Jaume Abella, Javier Carretero Casado, Pedro Chaparro Monferrer, Xavier Vera, Antonio Gonzalez
  • Patent number: 8171224
    Abstract: A method of providing history based done logic for a D-cache includes receiving a D-cache line in an L2 cache; determining if the D-cache line is unprefetchable; aging the D-cache line without a delay if the D-cache line is prefetchable; and aging the D-cache line with a delay if the D-cache line is unprefetchable.
    Type: Grant
    Filed: May 28, 2009
    Date of Patent: May 1, 2012
    Assignee: International Business Machines Corporation
    Inventor: David A. Luick
  • Publication number: 20120102269
    Abstract: The disclosed embodiments provide a system that uses speculative cache requests to reduce cache miss delays for a cache in a multi-level memory hierarchy. During operation, the system receives a memory reference which is directed to a cache line in the cache. Next, while determining whether the cache line is available in the cache, the system determines whether the memory reference is likely to miss in the cache, and if so, simultaneously sends a speculative request for the cache line to a lower level of the multi-level memory hierarchy. (A minimal sketch follows this entry.)
    Type: Application
    Filed: October 21, 2010
    Publication date: April 26, 2012
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Tarik Ono, Mark R. Greenstreet
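    The overlap in 20120102269 is parallel in hardware; the sequential sketch below shows only the control flow. The predictor interface and dict-backed caches are assumptions.
    ```python
    def lookup(addr, l1, l2, predict_miss):
        """Overlap the L1 tag check with a speculative next-level request whenever
        the predictor says the reference is likely to miss."""
        speculative = l2.get(addr) if predict_miss(addr) else None   # fire early
        if addr in l1:                   # tag check resolves: it was a hit after all
            return l1[addr]              # the speculative result is simply dropped
        return speculative if speculative is not None else l2.get(addr)

    l1 = {0x10: "hot"}
    l2 = {0x10: "hot", 0x20: "cold"}
    predictor = lambda addr: addr not in l1   # stand-in for a real miss predictor
    print(lookup(0x20, l1, l2, predictor))    # predicted miss: L2 fetch in flight
    ```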
  • Publication number: 20120089782
    Abstract: A method for managing data movement in a multi-level cache system having a primary cache and a secondary cache. The method includes determining whether an unallocated space of the primary cache has reached a minimum threshold; selecting at least one outgoing data block from the primary cache when the primary cache has reached the minimum threshold; initiating a de-stage process for de-staging the outgoing data block from the primary cache; and terminating the de-stage process when the unallocated space of the primary cache has reached an upper threshold. The de-stage process further includes determining whether a cache hit has occurred in the secondary cache before; storing the outgoing data block in the secondary cache when the cache hit has occurred in the secondary cache before; generating and storing metadata regarding the outgoing data block; and deleting the outgoing data block from the primary cache. (A minimal sketch follows this entry.)
    Type: Application
    Filed: October 7, 2010
    Publication date: April 12, 2012
    Applicant: LSI CORPORATION
    Inventors: Brian D. McKean, Donald R. Humlicek, Timothy R. Snider
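    A watermark-driven rendering of the de-stage loop in 20120089782. The thresholds, the oldest-first victim choice, and the hit-history set are illustrative assumptions.
    ```python
    MIN_FREE, UPPER_FREE = 2, 5   # low/high free-space watermarks, in blocks (assumed)

    def maybe_destage(primary, secondary, capacity, secondary_hits, metadata):
        """De-stage outgoing blocks from the primary cache until space recovers."""
        while capacity - len(primary) < MIN_FREE:        # below the minimum threshold
            block, data = next(iter(primary.items()))    # outgoing block (e.g. oldest)
            if block in secondary_hits:                  # hit in the secondary before:
                secondary[block] = data                  #   worth keeping it there
                metadata[block] = {"len": len(data)}     # record placement metadata
            del primary[block]                           # always freed from primary
            if capacity - len(primary) >= UPPER_FREE:    # upper threshold: stop
                break

    primary = {b: f"d{b}" for b in range(8)}
    secondary, meta = {}, {}
    maybe_destage(primary, secondary, capacity=9, secondary_hits={0, 2}, metadata=meta)
    print(len(primary), sorted(secondary), meta)   # 4 kept, two moved to secondary
    ```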
  • Publication number: 20120084511
    Abstract: A processor of an information handling system (IHS) initiates an L3 cache prefetch operation in response to a demand load during instruction processing. The processor selects an L3 cache prefetch at random for tracking as a target prefetched instruction. The processor initiates an L1 cache target prefetch operation and stores the resultant target prefetched instruction in the L1 cache. If a demand load arrives, the processor analyzes the target prefetched instruction for effectiveness and determines the source of the prefetch data. If a demand does not arrive, the processor tests to determine if the particular prefetched instruction timed out in the cache and identifies the ineffectiveness of the prefetch operation. The processor samples multiple prefetch operations at random and generates a history of prefetch effectiveness and other useful prefetch information. The processor stores the prefetch effectiveness information to enable reduction or removal of ineffective prefetch operations.
    Type: Application
    Filed: October 4, 2010
    Publication date: April 5, 2012
    Applicant: International Business Machines Corporation
    Inventors: Miles R. Dooley, Venkat R. Indukuru, Alex E. Mericas, Francis P. O'Connell
  • Publication number: 20120084497
    Abstract: An apparatus of an aspect includes a prefetch cache line address predictor to receive a cache line address and to predict a next cache line address to be prefetched. The next cache line address may indicate a cache line having at least 64 bytes of instructions. The prefetch cache line address predictor may have a cache line target history storage to store a cache line target history for each of multiple most recent corresponding cache lines. Each cache line target history may indicate whether the corresponding cache line had a sequential cache line target or a non-sequential cache line target. The cache line address predictor may also have a cache line target history predictor. The cache line target history predictor may predict whether the next cache line address is a sequential cache line address or a non-sequential cache line address, based on the cache line target history for the most recent cache lines. (A minimal sketch follows this entry.)
    Type: Application
    Filed: September 30, 2010
    Publication date: April 5, 2012
    Inventors: Samantika Subramaniam, Aamer Jaleel, Simon C. Steely, JR.
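    A compact model of the per-line target history in 20120084497, assuming 64-byte lines, a one-bit history per line, and a stored last non-sequential target; the history depth and encoding are assumptions.
    ```python
    LINE = 64   # cache line size in bytes (the abstract suggests at least 64)

    class LineTargetPredictor:
        """Track, per recent cache line, whether its target was sequential."""
        def __init__(self):
            self.was_sequential = {}    # line -> True/False, last observation
            self.jump_target = {}       # line -> last non-sequential target address

        def record(self, line, next_line):
            seq = (next_line == line + LINE)
            self.was_sequential[line] = seq
            if not seq:
                self.jump_target[line] = next_line

        def predict(self, line):
            if self.was_sequential.get(line, True):   # default: assume sequential
                return line + LINE
            return self.jump_target[line]             # replay the recorded jump

    p = LineTargetPredictor()
    p.record(0x000, 0x040)     # sequential target
    p.record(0x040, 0x400)     # non-sequential jump
    print(hex(p.predict(0x000)), hex(p.predict(0x040)))   # 0x40 0x400
    ```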
  • Publication number: 20120079202
    Abstract: A prefetching system receives a memory read request having an associated address. In response to a determination that a most significant portion of the associated address is not present within slots of an array for storing the most significant portion of predicted addresses, a prefetch FIFO (First In-First Out) counter is modified to point to a next slot of the array and a new predicted address is generated in response to the received most significant portion of the associated address and is placed in the next slot of the array. The prefetch FIFO counter cycles through the slots of the array before wrapping around to a first slot of the array for storing the most significant portion of predicted addresses.
    Type: Application
    Filed: July 4, 2011
    Publication date: March 29, 2012
    Inventor: Kai Chirca
  • Publication number: 20120079203
    Abstract: A shared resource within a module may be accessed by a request from an external requester. An external transaction request may be received from an external requester outside the module for access to the shared resource that includes control information, not all of which is needed to access the shared resource. The external transaction request may be modified to form a modified request by removing a portion of the locally unneeded control information and storing the unneeded portion of control information as an entry in a bypass buffer. A reply received from the shared resource may be modified by appending the stored portion of control information from the entry in the bypass buffer before sending the modified reply to the external requester.
    Type: Application
    Filed: September 22, 2011
    Publication date: March 29, 2012
    Inventors: Dheera Balasubramanian, Raguram Damodaran
  • Publication number: 20120072668
    Abstract: A prefetch unit generates a prefetch address in response to an address associated with a memory read request received from the first or second cache. The prefetch unit includes a prefetch buffer that is arranged to store the prefetch address in an address buffer of a selected slot of the prefetch buffer, where each slot of the prefetch buffer includes a buffer for storing a prefetch address, and two sub-slots. Each sub-slot includes a data buffer for storing data that is prefetched using the prefetch address stored in the slot, and one of the two sub-slots of the slot is selected in response to a portion of the generated prefetch address. Subsequent hits on the prefetcher result in returning prefetched data to the requestor in response to a subsequent memory read request received after the initial received memory read request.
    Type: Application
    Filed: September 15, 2011
    Publication date: March 22, 2012
    Inventors: Kai Chirca, Joseph R. M. Zbiciak, Matthew D. Pierson
  • Publication number: 20120072667
    Abstract: A prefetch unit generates prefetch addresses in response to an initial received memory read request, an address associated with the initial received memory read request, a line length of the requestor of the initial received memory read request, and a request type width of the initial received memory read request. Prefetch operations are generated using the generated prefetch addresses, wherein each generated prefetch address is stored in a prefetch buffer slot that is selected by a prefetch FIFO (First In First Out) prefetch counter. Subsequent hits on the prefetcher result in returning prefetched data to the requestor in response to a subsequent memory read request received after the initial received memory read request.
    Type: Application
    Filed: August 25, 2011
    Publication date: March 22, 2012
    Inventors: Timothy D. Anderson, Kai Chirca
  • Patent number: 8140760
    Abstract: A method of providing history based done logic for an I-cache includes receiving an I-cache line in an L2 cache; determining if the I-cache line is unprefetchable; aging the I-cache line without a delay if the I-cache line is prefetchable; and aging the I-cache line with a delay if the I-cache line is unprefetchable.
    Type: Grant
    Filed: May 28, 2009
    Date of Patent: March 20, 2012
    Assignee: International Business Machines Corporation
    Inventor: David A. Luick
  • Publication number: 20120066455
    Abstract: A hybrid prefetch method and apparatus is disclosed. A processor includes a hybrid prefetch unit configured to generate addresses for accessing data from a system memory. The hybrid prefetch unit includes a first prediction unit configured to generate a first memory address according to a first prefetch algorithm and a second prediction unit configured to generate a second memory address according to a second prefetch algorithm. The hybrid prefetcher further includes an arbitration unit configured to select one of the first and second memory addresses and further configured to provide the selected one of the first and second memory addresses during a prefetch operation.
    Type: Application
    Filed: September 9, 2010
    Publication date: March 15, 2012
    Inventors: Swamy Punyamurtula, Bharath Narashima Swamy
  • Publication number: 20120059996
    Abstract: A mechanism is provided for avoiding cross-interrogates for a streaming data optimized level one cache. The mechanism adds a set of dedicated registers, referred to as “copex registers,” to track ownership of the cache lines that the co-processor's L1 cache holds exclusive. The mechanism extends the cache directory of the L2 cache by a bit that identifies exclusive ownership of a cache line in the co-processor cache. The co-processor continuously provides an indication of which copex registers are valid. On any action that requires a directory lookup in the L2 cache, the mechanism compares the valid copex registers against the lookup address in parallel to the directory lookup. The mechanism considers the “exclusive ownership in co-processor” bit in the directory valid only if the cache line is also currently in a valid copex register.
    Type: Application
    Filed: September 7, 2010
    Publication date: March 8, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Christian Habermann, Christian Jacobi, Martin Recktenwald, Hans-Werner Tast
  • Publication number: 20120059995
    Abstract: Techniques and methods are used to reduce allocations to a higher level cache of cache lines displaced from a lower level cache. The allocations of the displaced cache lines are prevented for displaced cache lines that are determined to be redundant in the next level cache, whereby castouts are reduced. To such ends, a line is selected to be displaced in a lower level cache. Information associated with the selected line is identified which indicates that the selected line is present in a higher level cache. An allocation of the selected line in the higher level cache is prevented based on the identified information. Preventing an allocation of the selected line saves power that would be associated with the allocation. (A minimal sketch follows this entry.)
    Type: Application
    Filed: November 9, 2011
    Publication date: March 8, 2012
    Applicant: QUALCOMM INCORPORATED
    Inventors: Thomas Philip Speier, James Norris Dieffenderfer, Thomas Andrew Sartorius
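    The suppression test in 20120059995 hinges on state carried with the displaced line; the sketch assumes that state is a per-line "present above" bit, with sets standing in for the caches.
    ```python
    def displace(line, lower, higher, present_above):
        """Skip the higher-level allocation when the line is redundant there."""
        lower.discard(line)                    # the line leaves the lower-level cache
        if present_above.get(line, False):     # per-line state: already resident above
            return "castout suppressed"        # no allocation, so no allocation power
        higher.add(line)                       # normal victim allocation otherwise
        return "castout performed"

    l1, l2 = {0xA0, 0xB0}, {0xA0}
    state = {0xA0: True, 0xB0: False}          # assumed "in higher cache" bits
    print(displace(0xA0, l1, l2, state))       # suppressed: L2 already holds the line
    print(displace(0xB0, l1, l2, state))       # performed
    ```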
  • Publication number: 20120054439
    Abstract: The present invention provides a method and apparatus for allocating cache bandwidth to multiple processors. One embodiment of the method includes delaying, at a local device associated with a local cache, a first cache probe from a non-local device to the local cache following a second cache probe from the non-local device that matches a third cache probe from the local device.
    Type: Application
    Filed: August 24, 2010
    Publication date: March 1, 2012
    Inventor: William L. Walker
  • Publication number: 20120054440
    Abstract: The present invention is related to a method for determining duplicate clicks via a multi-layered cache. The method includes establishing, by a cache manager executing on a device, a cache comprising a hierarchy of a plurality of cache layers. The cache manager may establish a first cache layer of the plurality of cache layers as a size bounded cache layer. The cache manager may further establish a second cache layer of the plurality of cache layers as a time bounded cache layer. In some embodiments, the second cache layer may encapsulate the first cache layer. The cache manager may receive a request to determine whether a click or an ad view is stored in the cache. The cache manager may determine whether the click or the ad view is stored in one of the first cache layer or the second cache layer. (A minimal sketch follows this entry.)
    Type: Application
    Filed: August 31, 2010
    Publication date: March 1, 2012
    Inventors: Toby Doig, Dominic Davis
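    A minimal two-layer duplicate detector in the spirit of 20120054440: a size-bounded LRU layer wrapped by a time bound. The capacity, TTL, and single-structure implementation are assumptions.
    ```python
    import time
    from collections import OrderedDict

    class LayeredClickCache:
        """First layer: size-bounded LRU; second layer: a time (TTL) bound that
        encapsulates it, as in the abstract."""
        def __init__(self, max_size=1000, ttl_seconds=60.0):
            self.lru = OrderedDict()    # click id -> time it was recorded
            self.max_size = max_size
            self.ttl = ttl_seconds

        def is_duplicate(self, click_id, now=None):
            now = time.time() if now is None else now
            ts = self.lru.get(click_id)
            if ts is not None and now - ts <= self.ttl:   # within both bounds
                return True
            self.lru[click_id] = now                      # record the new click
            self.lru.move_to_end(click_id)
            while len(self.lru) > self.max_size:          # enforce the size bound
                self.lru.popitem(last=False)
            return False

    c = LayeredClickCache(max_size=2, ttl_seconds=10)
    print(c.is_duplicate("click-1", now=0.0))    # False: first sighting
    print(c.is_duplicate("click-1", now=5.0))    # True: within size and time bounds
    print(c.is_duplicate("click-1", now=20.0))   # False: expired from the time layer
    ```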
  • Publication number: 20120042126
    Abstract: The present invention provides a method and apparatus for use with a hierarchical cache system. The method may include concurrently flushing one or more first caches and a second cache of a multi-level cache. Each first cache is smaller and at a lower level in the multi-level cache than the second level cache.
    Type: Application
    Filed: August 11, 2010
    Publication date: February 16, 2012
    Inventors: Robert KRICK, David Kaplan
  • Publication number: 20120030429
    Abstract: A computer method and system of caching. In a multi-threaded application, different threads execute respective transactions accessing a data store (e.g. database) from a single server. The method and system represent status of datastore transactions using respective certain (e.g. Future) parameters. Results of the said transactions are cached based on transaction status as represented by the certain parameters and on data store determination of a subject transaction. The caching employs a two-stage commit and effectively forms a two-level cache. One level maps from datastore keys to entries in the cache. Each entry stores a respective last known commit value. The second level provides an optional mapping from a respective transaction as represented by the corresponding certain parameter to an updated value.
    Type: Application
    Filed: October 4, 2011
    Publication date: February 2, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: James M. Synge
  • Publication number: 20110320720
    Abstract: Cache line replacement in a symmetric multiprocessing computer, the computer having a plurality of processors, a main memory that is shared among the processors, a plurality of cache levels including at least one high level of private caches and a low level shared cache, and a cache controller that controls the shared cache, including receiving in the cache controller a memory instruction that requires replacement of a cache line in the low level shared cache; and selecting for replacement by the cache controller a least recently used cache line in the low level shared cache that has no copy stored in any higher level cache. (A minimal sketch follows this entry.)
    Type: Application
    Filed: June 23, 2010
    Publication date: December 29, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Craig Walters, Vijayalakshmi Srinivasan
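    The selection rule of 20110320720, with sets standing in for per-core private caches; in hardware the residency check would come from inclusion or snoop-filter state, and the all-shared fallback is an assumption.
    ```python
    def select_replacement(shared_lru_order, private_caches):
        """Pick the LRU shared-cache line that no private (higher-level) cache holds."""
        for line in shared_lru_order:                 # least- to most-recently used
            if not any(line in pc for pc in private_caches):
                return line                           # safe: no private copy exists
        return shared_lru_order[0]                    # fallback if all lines are shared

    shared_lru = [0x100, 0x140, 0x180]                # 0x100 is least recently used
    privates = [{0x100}, {0x140, 0x200}]              # per-core private cache contents
    print(hex(select_replacement(shared_lru, privates)))   # -> 0x180
    ```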
  • Publication number: 20110320721
    Abstract: A computer-implemented method for managing data transfer in a multi-level memory hierarchy that includes receiving a fetch request for allocation of data in a higher level memory, determining whether a data bus between the higher level memory and a lower level memory is available, bypassing an intervening memory between the higher level memory and the lower level memory when it is determined that the data bus is available, and transferring the requested data directly from the higher level memory to the lower level memory.
    Type: Application
    Filed: June 24, 2010
    Publication date: December 29, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Deanna Postles Dunn Berger, Michael Fee, Arthur J. O'Neill, JR., Robert J. Sonnelitter, III
  • Publication number: 20110314202
    Abstract: Embodiments of the invention provide techniques for managing cache metadata providing a mapping between addresses on a storage medium (e.g., disk storage) and corresponding addresses on a cache device at which data items are stored. In some embodiments, cache metadata may be stored in a hierarchical data structure comprising a plurality of hierarchy levels. When a reboot of the computer is initiated, only a subset of the plurality of hierarchy levels may be loaded to memory, thereby expediting the process of restoring the cache metadata and thus startup operations. Startup may be further expedited by using cache metadata to perform operations associated with reboot. Thereafter, as requests to read data items on the storage medium are processed using cache metadata to identify addresses at which the data items are stored in cache, the identified addresses may be stored in memory.
    Type: Application
    Filed: August 30, 2011
    Publication date: December 22, 2011
    Applicant: Microsoft Corporation
    Inventors: Mehmet Iyigun, Yevgeniy Bak, Michael Fortin, David Fields, Cenk Ergan, Alexander Kirshenbaum
  • Publication number: 20110302561
    Abstract: A data layout optimization may utilize affinity estimation between pairs of fields of a record in a computer program. The affinity estimation may be determined based on a trace of an execution and in view of the actual processing entities performing each access to the fields. The disclosed subject matter may be configured to be aware of a specific architecture of a target computer having a plurality of processing entities executing the program, so as to provide an improved affinity estimation which may take into account false sharing issues, spatial locality improvement, and the like.
    Type: Application
    Filed: June 8, 2010
    Publication date: December 8, 2011
    Applicant: International Business Machines Corporation
    Inventors: Alon Dayan, David Joel Edelsohn, Olga Golovanevsky, Ayal Zaks
  • Publication number: 20110296093
    Abstract: Methods for programming and sensing in a memory device, a data cache, and a memory device are disclosed. In one such method, all of the bit lines of a memory block are programmed or sensed during the same program or sense operation by alternately multiplexing the odd or even page bit lines to the dynamic data cache. The dynamic data cache comprises dual SDC, PDC, DDC1, and DDC2 circuits such that one set of circuits is coupled to the odd page bit lines and the other set of circuits is coupled to the even page bit lines.
    Type: Application
    Filed: August 11, 2011
    Publication date: December 1, 2011
    Inventor: Chang Wan HA
  • Publication number: 20110276762
    Abstract: A data processing system includes a processor core and a cache memory hierarchy coupled to the processor core. The cache memory hierarchy includes at least one upper level cache and a lowest level cache. A memory controller is coupled to the lowest level cache and to a system memory and includes a physical write queue from which the memory controller writes data to the system memory. The memory controller initiates accesses to the lowest level cache to place into the physical write queue selected cachelines having spatial locality with data present in the physical write queue.
    Type: Application
    Filed: May 7, 2010
    Publication date: November 10, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: DAVID M. DALY, BENJIMAN L. GOODMAN, HILLERY C. HUNTER, WILLIAM J. STARKE, JEFFREY A. STUECHELI
  • Publication number: 20110276763
    Abstract: A data processing system includes a multi-level cache hierarchy including a lowest level cache, a processor core coupled to the multi-level cache hierarchy, and a memory controller coupled to the lowest level cache and to a memory bus of a system memory. The memory controller includes a physical read queue that buffers data read from the system memory via the memory bus and a physical write queue that buffers data to be written to the system memory via the memory bus. The memory controller grants priority to write operations over read operations on the memory bus based upon a number of dirty cachelines in the lowest level cache memory.
    Type: Application
    Filed: May 7, 2010
    Publication date: November 10, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: DAVID M. DALY, BENJIMAN L. GOODMAN, HILLERY C. HUNTER, WILLIAM J. STARKE, JEFFREY A. STUECHELI
  • Publication number: 20110271057
    Abstract: The disclosed embodiments provide a system that filters duplicate requests from an L1 cache for a cache line. During operation, the system receives at an L2 cache a first request and a second request for the same cache line, and stores identifying information for these requests. The system then performs a cache array look-up for the first request that, in the process of creating a load fill packet for the first request, loads the cache line into a fill buffer. After sending the load fill packet for the first request to the L1 cache, the system uses the cache line data still stored in the fill buffer and stored identifying information for the second fill request to send a subsequent load fill packet for the second request to the L1 cache without performing an additional cache array look-up. (A minimal sketch follows this entry.)
    Type: Application
    Filed: May 3, 2010
    Publication date: November 3, 2011
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventor: Martin R. Karlsson
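    The reuse path in 20110271057, collapsed into a batch: one array lookup per line feeds every duplicate request from the fill buffer. The batching and the dict-backed L2 are assumptions.
    ```python
    def handle_requests(requests, l2_array, l1_fills):
        """Serve same-line requests from one array lookup via the fill buffer."""
        pending = {}                                  # line -> ids of duplicate requests
        for req_id, line in requests:
            pending.setdefault(line, []).append(req_id)
        for line, req_ids in pending.items():
            fill_buffer = l2_array[line]              # single cache-array lookup
            for req_id in req_ids:                    # later fills reuse buffered data
                l1_fills.append((req_id, line, fill_buffer))

    fills = []
    handle_requests([(1, 0x80), (2, 0x80)], {0x80: "payload"}, fills)
    print(fills)   # two load-fill packets from one array lookup
    ```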
  • Publication number: 20110264860
    Abstract: A microprocessor includes first and second cache memories occupying distinct hierarchy levels, the second backing the first. A prefetcher monitors load operations and maintains a recent history of the load operations from a cache line and determines whether the recent history indicates a clear direction. The prefetcher prefetches one or more cache lines into the first cache memory when the recent history indicates a clear direction and otherwise prefetches the one or more cache lines into the second cache memory. The prefetcher also determines whether the recent history indicates the load operations are large and, other things being equal, prefetches a greater number of cache lines for large operations than for small ones. The prefetcher also determines whether the recent history indicates the load operations are received on consecutive clock cycles and, other things being equal, prefetches a greater number of cache lines when they arrive on consecutive clock cycles than when they do not.
    Type: Application
    Filed: August 26, 2010
    Publication date: October 27, 2011
    Applicant: VIA Technologies, Inc.
    Inventors: Rodney E. Hooker, Colin Eddy
  • Publication number: 20110264861
    Abstract: Execution of code in a multitenant runtime environment. A request to execute code corresponding to a tenant identifier (ID) is received in a multitenant environment. The multitenant database stores data for multiple client entities each identified by a tenant ID having one of one or more users associated with the tenant ID. Users of each of multiple client entities can only access data identified by a tenant ID associated with the respective client entity. The multitenant database is a hosted database provided by an entity separate from the client entities, and provides on-demand database service to the client entities. Source code corresponding to the code to be executed is retrieved from a multitenant database. The retrieved source code is compiled. The compiled code is executed in the multitenant runtime environment. The memory used by the compiled code is freed in response to completion of the execution of the compiled code.
    Type: Application
    Filed: April 21, 2011
    Publication date: October 27, 2011
    Applicant: SALESFORCE.COM
    Inventors: Gregory D. Fee, William J. Gallagher
  • Patent number: 8041894
    Abstract: Method and system for a multi-level virtual/real cache system with synonym resolution. An exemplary embodiment includes a multi-level cache hierarchy, including a set of L1 caches associated with one or more processor cores and a set of L2 caches, wherein the set of L1 caches are a subset of the set of L2 caches, wherein the set of L1 caches underneath a given L2 cache are associated with one or more of the processor cores.
    Type: Grant
    Filed: February 25, 2008
    Date of Patent: October 18, 2011
    Assignee: International Business Machines Corporation
    Inventors: Barry W. Krumm, Christian Jacobi, Chung-Lung Kevin Shum, Hans-Werner Tast, Aaron Tsai, Ching-Farn E. Wu
  • Publication number: 20110219190
    Abstract: A method and apparatus for repopulating a cache are disclosed. At least a portion of the contents of the cache are stored in a location separate from the cache. Power is removed from the cache and is restored some time later. After power has been restored to the cache, it is repopulated with the portion of the contents of the cache that were stored separately from the cache.
    Type: Application
    Filed: March 3, 2010
    Publication date: September 8, 2011
    Applicant: ATI Technologies ULC
    Inventors: Philip Ng, Jimshed B. Mirza, Anthony Asaro
  • Publication number: 20110213947
    Abstract: A technique for reducing the power consumption required to execute processing operations. A processing complex, such as a CPU or a GPU, includes a first set of cores comprising one or more fast cores and second set of cores comprising one or more slow cores. A processing mode of the processing complex can switch between a first mode of operation and a second mode of operation based on one or more of the workload characteristics, performance characteristics of the first and second sets of cores, power characteristics of the first and second sets of cores, and operating conditions of the processing complex. A controller causes the processing operations to be executed by either the first set of cores or the second set of cores to achieve the lowest total power consumption.
    Type: Application
    Filed: May 25, 2010
    Publication date: September 1, 2011
    Inventors: John George Mathieson, Phil Carmack, Brian Smith
  • Publication number: 20110208915
    Abstract: In an embodiment, a processor may be configured to detect a store exclusive operation followed by a memory barrier operation in a speculative instruction stream being executed by the processor. The processor may fuse the store exclusive operation and the memory barrier operation, creating a fused operation. The fused operation may be transmitted and globally ordered, and the processor may complete both the store exclusive operation and the memory barrier operation in response to the fused operation. As the fused operation progresses through the processor and one or more other components (e.g. caches in the cache hierarchy) to the ordering point in the system, the fused operation may push previous memory operations to effect the memory barrier operation. In some embodiments, the latency for completing the store exclusive operation and the subsequent data memory barrier operation may be reduced if the store exclusive operation is successful at the ordering point.
    Type: Application
    Filed: February 24, 2010
    Publication date: August 25, 2011
    Inventors: Peter J. Bannon, Po-Yung Chang
  • Patent number: 8006036
    Abstract: The present invention comprises a CHA 110 which transmits/receives data to/from an external device, a DKA 140 which transmits/receives data to/from an HDD unit 200, a primary cache unit 120 which has a primary cache memory 124, a secondary cache unit 130 which is installed between the primary cache unit 120 and the DKA 140 and has a secondary cache memory 134, a CCP 121 which stores write target data received by the CHA 110 in the primary cache memory 124, and a CCP 131 which stores the write target data in the secondary cache memory 134, and transfers the write target data stored in the secondary cache memory 134 to the DKA 140.
    Type: Grant
    Filed: January 24, 2008
    Date of Patent: August 23, 2011
    Assignee: Hitachi, Ltd.
    Inventors: Tatsuya Ninomiya, Kazuo Tanaka
  • Publication number: 20110202727
    Abstract: Techniques and methods are used to reduce allocations to a higher level cache of cache lines displaced from a lower level cache. The allocations of the displaced cache lines are prevented for displaced cache lines that are determined to be redundant in the next level cache, whereby castouts are reduced. To such ends, a line is selected to be displaced in a lower level cache. Information associated with the selected line is identified which indicates that the selected line is present in a higher level cache or the selected line is a write-through line. An allocation of the selected line in the higher level cache is prevented based on the identified information. Preventing an allocation of the selected line saves power that would be associated with the allocation.
    Type: Application
    Filed: February 18, 2010
    Publication date: August 18, 2011
    Applicant: QUALCOMM INCORPORATED
    Inventors: Thomas Philip Speier, James Norris Dieffenderfer, Thomas Andrew Sartorius
  • Publication number: 20110202726
    Abstract: A data processing apparatus for forming a portion of a coherent cache system comprises at least one master device for performing data processing operations, and a cache coupled to the at least one master device and arranged to store data values for access by that at least one master device when performing the data processing operations. Cache coherency circuitry is responsive to a coherency request from another portion of the coherent cache system to cause a coherency action to be taken in respect of at least one data value stored in the cache. Responsive to an indication that the coherency action has resulted in invalidation of that at least one data value in the cache, refetch control circuitry is used to initiate a refetch of that at least one data value into the cache.
    Type: Application
    Filed: February 12, 2010
    Publication date: August 18, 2011
    Applicant: ARM Limited
    Inventors: Christopher William Laycock, Antony John Harris, Bruce James Mathewson, Andrew Christopher Rose, Richard Roy Grisenthwaite
  • Publication number: 20110197030
    Abstract: In one embodiment, a system comprises a plurality of agents coupled to an interconnect and a cache coupled to the interconnect. The plurality of agents are configured to cache data. A first agent of the plurality of agents is configured to initiate a transaction on the interconnect by transmitting a memory request, and other agents of the plurality of agents are configured to snoop the memory request from the interconnect. The other agents provide a response in a response phase of the transaction on the interconnect. The cache is configured to detect a hit for the memory request and to provide data for the transaction to the first agent prior to the response phase and independent of the response.
    Type: Application
    Filed: April 18, 2011
    Publication date: August 11, 2011
    Inventors: Brian P. Lilly, Sridhar P. Subramanian, Ramesh Gunna
  • Publication number: 20110185125
    Abstract: A processor may include several processor cores, each including a respective higher-level cache; a lower-level cache including several tag units each including several controllers, where each controller corresponds to a respective cache bank configured to store data, and where the controllers are concurrently operable to access their respective cache banks; and an interconnect network configured to convey data between the cores and the lower-level cache. The controllers may share access to an interconnect egress port coupled to the interconnect network, and may generate multiple concurrent requests to convey data via the shared port, where each of the requests is destined for a corresponding core, and where a datapath width of the port is less than a combined width of the multiple requests. The given tag unit may arbitrate among the controllers for access to the shared port, such that the requests are transmitted to corresponding cores serially rather than concurrently.
    Type: Application
    Filed: January 27, 2010
    Publication date: July 28, 2011
    Inventors: Prashant Jain, Yoganand Chillarige, Sandip Das, Shukur Moulali Pathan, Srinivasan R. Iyengar, Sanjay Patel
  • Publication number: 20110167224
    Abstract: A cache memory according to an aspect of the present invention includes entries, each of which includes a tag address, line data, and a dirty flag. The cache memory includes: a command execution unit which, when a first command is issued by a processor, rewrites a tag address included in at least one entry specified by the processor among the entries to a tag address corresponding to an address specified by the processor, and sets a dirty flag corresponding to the entry; and a write-back unit which writes, back to a main memory, the line data included in the entry in which the dirty flag is set.
    Type: Application
    Filed: March 15, 2011
    Publication date: July 7, 2011
    Applicant: PANASONIC CORPORATION
    Inventor: Takanori ISONO
  • Publication number: 20110161586
    Abstract: Technologies are described herein related to multi-core processors that are adapted to share processor resources. An example multi-core processor can include a plurality of processor cores. The multi-core processor further can include a shared register file selectively coupled to two or more of the plurality of processor cores, where the shared register file is adapted to serve as a shared resource among the selected processor cores.
    Type: Application
    Filed: December 29, 2009
    Publication date: June 30, 2011
    Inventors: Miodrag Potkonjak, Nathan Zachary Beckmann
  • Publication number: 20110161589
    Abstract: A data processing system includes first and second processing units and a system memory. The first processing unit has first upper and first lower level caches, and the second processing unit has second upper and lower level caches. In response to a data request, a victim cache line to be castout from the first lower level cache is selected, and the first lower level cache selects between performing a lateral castout (LCO) of the victim cache line to the second lower level cache and a castout of the victim cache line to the system memory based upon a confidence indicator associated with the victim cache line. In response to selecting an LCO, the first processing unit issues an LCO command on the interconnect fabric and removes the victim cache line from the first lower level cache, and the second lower level cache holds the victim cache line.
    Type: Application
    Filed: December 30, 2009
    Publication date: June 30, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Guy L. Guthrie, William J. Starke, Jeffrey A. Stuecheli, Derek E. Williams, Thomas R. Puzak