Cache Consistency Protocols (epo) Patents (Class 711/E12.026)
  • Publication number: 20130061003
    Abstract: A system, apparatus, and method for routing traffic in a SoC from I/O devices to memory. A coherence switch routes coherent traffic through a coherency port on a processor complex to a real-time port of a memory controller. The coherence switch routes non-coherent traffic to a non-real time port of the memory controller. The coherence switch can also dynamically switch traffic between the two paths. The routing of traffic can be configured via a configuration register, and while software can initiate an update to the configuration register, the actual coherence switch hardware will implement the update. Software can write to a software-writeable copy of the configuration register to initiate an update to the flow path to memory for a transaction identifier. The coherence switch detects the update to the software-writeable copy, and then the coherence switch updates the working copy of the configuration register and implements the new routing.
    Type: Application
    Filed: September 7, 2011
    Publication date: March 7, 2013
    Inventors: Timothy J. Millet, Muditha Kanchana, Shailendra S. Desai
  • Patent number: 8392656
    Abstract: A parameter copying method is applied to a duplex system in which an MPU and a main memory are duplicated and duplex operations on a hot standby system are performed. The parameter copying method includes cache reading data in the main memory corresponding to one MPU, cache writing the data read in the cache reading step on an as-is basis, and writing the data into the main memory corresponding to the one MPU by a block write that is produced by a cache replace caused due to the cache writing step, and also writing the same data into the main memory corresponding to the other MPU by the block write on a basis of a mirrored write.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: March 5, 2013
    Assignee: Yokogawa Electric Corporation
    Inventor: Hideharu Yajima
  • Patent number: 8392663
    Abstract: A multiprocessor system maintains cache coherence among processors in a coherent domain. Within the coherent domain, a first processor can receive a command to perform a cache maintenance operation. The first processor can determine whether the cache maintenance operation is a coherent operation. For coherent operations, the first processor sends a coherent request message for distribution to other processors in the coherent domain and can cancel execution of the cache maintenance operation pending receipt of intervention messages corresponding to the coherent request. The intervention messages can reflect a global ordering of coherence traffic in the multiprocessor system and can include instructions for maintaining a data cache and an instruction cache of the first processor. Cache maintenance operations that are determined to be non-coherent can be executed at the first processor without sending the coherent request.
    Type: Grant
    Filed: December 10, 2008
    Date of Patent: March 5, 2013
    Assignee: MIPS Technologies, Inc.
    Inventors: Ryan C. Kinter, Darren M. Jones, Matthias Knoth
  • Publication number: 20130054900
    Abstract: A method and an apparatus for increasing capacity of cache directory in multi-processor systems, the apparatus comprising a plurality of processor nodes and a plurality of cache memory nodes and a plurality of main memory nodes.
    Type: Application
    Filed: August 22, 2012
    Publication date: February 28, 2013
    Inventor: Conor Santifort
  • Publication number: 20130046924
    Abstract: In one embodiment, the present invention includes a method for executing a transactional memory (TM) transaction in a first thread, buffering a block of data in a first buffer of a cache memory of a processor, and acquiring a write monitor on the block to obtain ownership of the block at an encounter time in which data at a location of the block in the first buffer is updated. Other embodiments are described and claimed.
    Type: Application
    Filed: October 23, 2012
    Publication date: February 21, 2013
    Inventors: Ali-Reza Adl-Tabatabai, Yang Ni, Bratin Saha, Vadim Bassin, Gad Sheaffer, David Callahan, Jan Gray
  • Patent number: 8380933
    Abstract: A multiprocessor system includes cache memories each of which is provided in correspondence with one of processor cores and includes a tag storage unit configured to store validity information representing whether a cache line as a unit to store data is valid, update information representing whether data in the cache line has been rewritten, and address information of the data in the cache line, a shared memory shared by the processor cores, and an arbitration circuit configured to arbitrate access requests from the processor cores to the shared memory and send the arbitrated access request to the cache memories. Each cache memory includes a violation detection circuit configured to detect a violation access by comparing the information in the tag storage unit with the access request from the arbitration circuit.
    Type: Grant
    Filed: March 24, 2008
    Date of Patent: February 19, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Masato Uchiyama
  • Patent number: 8380934
    Abstract: A cache device interposed between a processor and a memory device, including: a cache memory storing data from the memory device; a buffer holding output data output from the processor; a control circuit determining, on the basis of a request to access the memory device, whether a cache hit has occurred or not and, if a cache miss has occurred, storing the output data in the buffer in response to the access request, outputting a read request for reading the data in a line containing data requested by the access request from the memory device, storing data output from the line of the memory device into the cache memory, and storing the output data from the buffer into the cache memory.
    Type: Grant
    Filed: February 16, 2010
    Date of Patent: February 19, 2013
    Assignee: Fujitsu Semiconductor Limited
    Inventor: Gen Tsukishiro
  • Publication number: 20130031314
    Abstract: A number of coherence domains are maintained among the multitude of processing cores disposed in a microprocessor. A cache coherency manager defines the coherency relationships such that coherence traffic flows only among the processing cores that are defined as having a coherency relationship. The data defining the coherency relationships between the processing cores is optionally stored in a programmable register. For each source of a coherent request, the processing core targets of the request are identified in the programmable register. In response to a coherent request, an intervention message is forwarded only to the cores that are defined to be in the same coherence domain as the requesting core. If a cache hit occurs in response to a coherent read request and the coherence state of the cache line resulting in the hit satisfies a condition, the requested data is made available to the requesting core from that cache line.
    Type: Application
    Filed: January 30, 2012
    Publication date: January 31, 2013
    Applicant: MIPS Technologies, Inc.
    Inventor: Ryan C. Kinter
  • Publication number: 20130019047
    Abstract: An apparatus having a memory and circuit is disclosed. The memory may (i) assert a first signal in response to detecting a conflict between at least two addresses requesting access to a block at a first time, (ii) generate a second signal in response to a cache miss caused by an address requesting access to the block at a second time and (iii) store a line fetched in response to the cache miss in another block by adjusting the first address by an offset. The second time is generally after the first time. The circuit may (i) generate the offset in response to the assertion of the first signal and (ii) present the offset in a third signal to the memory in response to the assertion of the second signal corresponding to reception of the first address at the second time. The offset is generally associated with the first address.
    Type: Application
    Filed: July 11, 2011
    Publication date: January 17, 2013
    Inventors: Dmitry Podvalny, Alex Shinkar, Assaf Rachlevski
  • Patent number: 8352681
    Abstract: Proposed are a highly reliable storage system and its control method capable of accelerating the processing speed of the copy processing seen from the host device. With the storage system and its control method, which store a command issued from a host device in a command queue and execute the commands stored in the command queue in the order that they were stored in the command queue, a copy queue is set in the memory for temporarily retaining a copy command among the commands issued from the host device, the copy command among the commands from the host device stored in the command queue is moved to the copy queue and an execution completion reply of copy processing according to the command is sent to the host device as the sender of the command, and the copy command that was moved to the copy queue is executed in the background in the order that the copy command was stored in the copy queue.
    Type: Grant
    Filed: July 17, 2009
    Date of Patent: January 8, 2013
    Assignee: Hitachi, Ltd.
    Inventors: Kosuke Sakai, Koji Nagata, Yoshiyuki Noborikawa
  • Publication number: 20130007375
    Abstract: A device with an interconnect having a plurality of memory controllers for connecting the plurality of memory controllers. Each memory controller of the plurality of memory controllers is coupled to an allocated memory for storing data. Further, each memory controller of the plurality of memory controllers has one accelerator of a plurality of accelerators for mutually exchanging data over the interconnect.
    Type: Application
    Filed: June 27, 2012
    Publication date: January 3, 2013
    Applicant: International Business Machines Corporation
    Inventors: Florian Alexander Auernhammer, Victoria Caparros Cabezas, Andreas Christian Doering, Patricia Maria Sagmeister
  • Publication number: 20120331233
    Abstract: A mechanism is provided for detecting false sharing misses. Responsive to performing either an eviction or an invalidation of a cache line in a cache memory of the data processing system, a determination is made as to whether there is an entry associated with the cache line in a false sharing detection table. Responsive to the entry associated with the cache line existing in the false sharing detection table, a determination is made as to whether an overlap field associated with the entry is set. Responsive to the overlap field failing to be set, identification is made that a false sharing coherence miss has occurred. A first signal is then sent to a performance monitoring unit indicating the false sharing coherence miss.
    Type: Application
    Filed: June 24, 2011
    Publication date: December 27, 2012
    Applicant: International Business Machines Corporation
    Inventors: Harold W. Cain, III, Hubertus Franke
  • Publication number: 20120317366
    Abstract: The present invention measures an actual utilization frequency of data and controls a location of this data in a storage apparatus in a case where a host computer makes joint use of a storage apparatus and a cache apparatus. A portion of data used by an application program 1A is stored in a storage apparatus 2 and a cache apparatus 3. A management apparatus 4 detects an I/O load of a page (4A), and detects an I/O load of cache data (4B). The management apparatus 4 determines a corresponding relationship between the page and the cache data (4C), and adds the I/O load of the cache data to the I/O load of the page.
    Type: Application
    Filed: May 30, 2011
    Publication date: December 13, 2012
    Inventors: Takanori Sato, Takato Kusama
  • Patent number: 8332568
    Abstract: A memory access determination circuit includes a counter that outputs a first value counted by using a first reference value, and a control unit that makes a cache determination of an address corresponding to an output of the counter, wherein, when a cache miss occurs for the address, the counter outputs a second value by using a second reference value.
    Type: Grant
    Filed: February 15, 2010
    Date of Patent: December 11, 2012
    Assignee: Fujitsu Semiconductor Limited
    Inventor: Kazuhiko Okada
  • Publication number: 20120311021
    Abstract: A method of a transaction-based system is applicable to a data deduplication system. In the system, pointers of same data point to a same position, so that when one piece of data is changed, all associated pointers need to be changed. In this method, a server first sets a flag to a false value, and after the server receives a request for backing up a data element from a client, the server reads a fingerprinting of the data element and determines whether the fingerprinting is the same as a temporary fingerprinting in a meta cache of the client, writes the data element and the fingerprinting into a corresponding temporary storage data block when the fingerprinting is not the same as the temporary fingerprinting, and writes the data element and the fingerprinting into a main meta cache and resets the flag when the flag is a true value.
    Type: Application
    Filed: September 23, 2011
    Publication date: December 6, 2012
    Applicant: INVENTEC CORPORATION
    Inventors: Ming-Sheng Zhu, Chih-Feng Chen
  • Publication number: 20120311271
    Abstract: A read cache device for accelerating execution of read commands in a storage area network (SAN) in a data path between frontend servers and a backend storage. The device includes a cache memory unit for maintaining portions of data that reside in the backend storage and are mapped to at least one accelerated virtual volume; a cache management unit for maintaining data consistency between the cache memory unit and the at least one accelerated virtual volume; a descriptor memory unit for maintaining a plurality of descriptors; and a processor for receiving each command and each command response that travels in the data path, and for serving each received read command directed to the at least one accelerated virtual volume by returning requested data stored in the cache memory unit and writing data to the cache memory unit according to a caching policy.
    Type: Application
    Filed: June 6, 2011
    Publication date: December 6, 2012
    Applicant: SANRAD, Ltd.
    Inventors: Yaron Klein, Allon Cohen
  • Patent number: 8327081
    Abstract: A processor module having a cache device and a system controller having a copy TAG2 of a tag of the cache device configure a system to which a protocol representing the states of a data block of the cache device by six states, that is, an invalid state I, a shared state S, an exclusive state E, a modified state M, a shared modified state O, and a writable modified state W can be applied. In order to implement the concept, information about a new state in a cache device of a requester is included in a reply packet from the cache device for transmitting the data block. After the completion of the snooping process of the TAG2 until the reception of the reply packet from the cache device for transmitting the data block and the determination of the next state, an object data block is locked in the TAG2.
    Type: Grant
    Filed: August 27, 2008
    Date of Patent: December 4, 2012
    Assignee: Fujitsu Limited
    Inventor: Gou Sugizaki
  • Publication number: 20120303906
    Abstract: Embodiments are provided for cache memory systems. In one general embodiment, a method includes receiving a host write request from a host computer, creating a sequential log file in a storage device, and copying data received during the host write request to a storage buffer. The method further includes determining if a selected quantity of data has been accumulated in the storage buffer and executing a write through of data to sequentially write the data accumulated in the storage buffer to the sequential log file and to a storage class memory device if the selected quantity of data has been accumulated in the storage buffer.
    Type: Application
    Filed: August 6, 2012
    Publication date: November 29, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Binny S. Gill
  • Patent number: 8316194
    Abstract: In one embodiment, the present invention includes a method for executing a transactional memory (TM) transaction in a first thread, buffering a block of data in a first buffer of a cache memory of a processor, and acquiring a write monitor on the block to obtain ownership of the block at an encounter time in which data at a location of the block in the first buffer is updated. Other embodiments are described and claimed.
    Type: Grant
    Filed: December 15, 2009
    Date of Patent: November 20, 2012
    Assignee: Intel Corporation
    Inventors: Ali-Reza Adl-Tabatabai, Yang Ni, Bratin Saha, Vadim Bassin, Gad Sheaffer, David Callahan, Jan Gray
  • Publication number: 20120290794
    Abstract: A method including: receiving multiple local requests to access the cache line; inserting, into an address chain, multiple entries corresponding to the multiple local requests; identifying a first entry at a head of the address chain; initiating, in response to identifying the first entry and in response to the first entry corresponding to a request to own the cache line, a traversal of the address chain; setting, during the traversal of the address chain, a state element identified in a second entry; receiving a foreign request to access the cache line; inserting, in response to setting the state element, a third entry corresponding to the foreign request into the address chain after the second entry; and relinquishing, in response to inserting the third entry after the second entry in the address chain, the cache line to a foreign thread after executing the multiple local requests.
    Type: Application
    Filed: August 29, 2011
    Publication date: November 15, 2012
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Connie Wai Mun Cheung, Madhavi Kondapaneni, Joann Yin Lam, Ramaswamy Sivaramakrishnan
  • Publication number: 20120290774
    Abstract: A method and system to allow power fail-safe write-back or write-through caching of data in a persistent storage device into one or more cache lines of a caching device. No metadata associated with any of the cache lines is written atomically into the caching device when the data in the storage device is cached. As such, specialized cache hardware to allow atomic writing of metadata during the caching of data is not required.
    Type: Application
    Filed: May 16, 2012
    Publication date: November 15, 2012
    Inventor: Sanjeev N. Trika
  • Publication number: 20120254551
    Abstract: Provided is a method of generating code by a compiler, including the steps of: analyzing a program executed by a processor; analyzing data necessary to execute respective tasks included in the program; determining whether a boundary of the data used by divided tasks is consistent with a management unit of a cache memory based on results of the analyzing; and generating a code for providing a non-cacheable area from which the data to be stored in the management unit including the boundary is not temporarily stored into the cache memory and a code for storing an arithmetic processing result stored in the management unit including the boundary into a non-cacheable area in a case where it is determined that the boundary of the data used by the divided tasks is not consistent with the management unit of the cache memory.
    Type: Application
    Filed: December 14, 2010
    Publication date: October 4, 2012
    Applicant: WASEDA UNIVERSITY
    Inventors: Hironori Kasahara, Keiji Kimura, Masayoshi Mase
  • Publication number: 20120254541
    Abstract: Methods and apparatus for updating data in passive variable resistive memory (PVRM) are provided. In one example, a method for updating data stored in PVRM is disclosed. The method includes updating a memory block of a plurality of memory blocks in a cache hierarchy without invalidating the memory block. The updated memory block may be copied from the cache hierarchy to a write through buffer. Additionally, the method includes writing the updated memory block to the PVRM, thereby updating the data in the PVRM.
    Type: Application
    Filed: April 4, 2011
    Publication date: October 4, 2012
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Brad Beckmann, Lisa Hsu
  • Publication number: 20120246410
    Abstract: A cache memory has one or a plurality of ways having a plurality of cache lines including a tag memory which stores a tag address, a first dirty bit memory which stores a first dirty bit, a valid bit memory which stores a valid bit, and a data memory which stores data. The cache memory has a line index memory which stores a line index for identifying the cache line. The cache memory has a DBLB management unit having a plurality of lines including a row memory which stores first bit data identifying the way and second bit data identifying the line index, a second dirty bit memory which stores a second dirty bit of bit unit corresponding to writing of a predetermined unit into the data memory, and a FIFO memory which stores FIFO information prescribing a registered order. Data in a cache line of a corresponding way is written back on the basis of the second dirty bit.
    Type: Application
    Filed: June 9, 2011
    Publication date: September 27, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Hui Xu
  • Patent number: 8271736
    Abstract: A method for increasing the performance and utilization of cache memory by combining the data block frequency map generated by a data de-duplication mechanism with page prefetching and eviction algorithms such as the Least Recently Used (LRU) policy. The data block frequency map provides a weight directly proportional to the frequency count of the block in the dataset. This weight is used to influence caching algorithms such as LRU. Data blocks that have a lower frequency count in the dataset are evicted before those with higher frequencies, even though they may not have been the topmost blocks for page eviction by the caching algorithms. The method effectively combines the weight of the block in the frequency map and its eviction status under caching algorithms such as LRU to obtain improved performance and utilization of the cache memory.
    Type: Grant
    Filed: February 7, 2008
    Date of Patent: September 18, 2012
    Assignee: International Business Machines Corporation
    Inventors: Karan Gupta, Tarun Thakur
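The eviction idea in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the class name, the `frequency_map` argument, and the rule for combining the two signals (lowest frequency first, ties broken by least-recent use) are assumptions chosen for illustration.

```python
from collections import OrderedDict


class FrequencyWeightedLRU:
    """Toy cache that consults a dedup frequency map before evicting.

    Hypothetical illustration: the victim is the block with the lowest
    frequency count; among equals, the least recently used one goes first.
    """

    def __init__(self, capacity, frequency_map):
        self.capacity = capacity
        self.frequency_map = frequency_map  # block id -> count in the dataset
        self.blocks = OrderedDict()         # least recently used entry first

    def access(self, block_id, data=None):
        if block_id in self.blocks:
            # Hit: mark the block as most recently used.
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id]
        if len(self.blocks) >= self.capacity:
            self._evict()
        self.blocks[block_id] = data
        return data

    def _evict(self):
        # enumerate() yields the recency rank: 0 is the least recently used.
        _, victim = min(
            enumerate(self.blocks),
            key=lambda item: (self.frequency_map.get(item[1], 0), item[0]),
        )
        del self.blocks[victim]
        return victim
```

With a capacity of two and frequencies `{"a": 5, "b": 1, "c": 3}`, accessing `a`, `b`, then `c` evicts `b`, whereas plain LRU would have evicted `a`.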
  • Publication number: 20120233410
    Abstract: The present invention discloses a shared-variable-based (SVB) approach for fast and accurate multi-core cache coherence simulation. While the intuitive, conventional approach of synchronizing at either every cycle or every memory access gives accurate simulation results, it performs poorly due to heavy simulation overhead. In the proposed shared-variable-based approach, timing synchronization is needed only before shared variable accesses, which maintains accuracy while improving efficiency.
    Type: Application
    Filed: March 13, 2011
    Publication date: September 13, 2012
    Applicant: National Tsing Hua University
    Inventors: Cheng-Yang Fu, Meng-Huan Wu, Ren-Song Tsay
  • Publication number: 20120221798
    Abstract: Methods for selecting a line to evict from a data storage system are provided. A computer system implementing a method for selecting a line to evict from a data storage system is also provided. The methods include selecting an uncached class line for eviction prior to selecting a cached class line for eviction.
    Type: Application
    Filed: May 5, 2012
    Publication date: August 30, 2012
    Inventor: Blaine D. Gaither
  • Patent number: 8255633
    Abstract: A list prefetch engine improves the performance of a parallel computing system. The list prefetch engine receives a current cache miss address. The list prefetch engine evaluates whether the current cache miss address is valid. If the current cache miss address is valid, the list prefetch engine compares the current cache miss address and a list address. A list address represents an address in a list. A list describes an arbitrary sequence of prior cache miss addresses. The prefetch engine prefetches data according to the list if there is a match between the current cache miss address and the list address.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: August 28, 2012
    Assignee: International Business Machines Corporation
    Inventors: Peter Boyle, Norman Christ, Alan Gara, Changhoan Kim, Robert Mawhinney, Martin Ohmacht, Krishnan Sugavanam
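The match-then-prefetch flow described in the abstract above might look roughly like the following sketch. The class name, the `depth` and `window` parameters, and the validity rule (a non-negative integer) are all assumptions made for illustration; the abstract does not specify them.

```python
class ListPrefetchEngine:
    """Toy list prefetcher driven by a recorded sequence of miss addresses."""

    def __init__(self, miss_list, depth=2, window=4):
        self.miss_list = list(miss_list)  # prior cache miss addresses, in order
        self.depth = depth                # how many list entries to prefetch
        self.window = window              # how far ahead to look for a match
        self.cursor = 0                   # current position in the list

    def on_miss(self, address):
        """Return the addresses to prefetch for this miss (possibly none)."""
        # Validity check (assumed rule): ignore anything that is not a
        # non-negative integer address.
        if not isinstance(address, int) or address < 0:
            return []
        end = min(self.cursor + self.window, len(self.miss_list))
        for index in range(self.cursor, end):
            if self.miss_list[index] == address:
                # Match: advance past it and prefetch what the list says
                # comes next.
                self.cursor = index + 1
                return self.miss_list[self.cursor:self.cursor + self.depth]
        return []  # no match: do not prefetch
```

For a recorded list `[0x100, 0x340, 0x80, 0x900]` and `depth=2`, a miss on `0x100` returns `[0x340, 0x80]`, and an address that is not in the list returns an empty list.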
  • Patent number: 8255635
    Abstract: According to a method of data processing in a multiprocessor data processing system, in response to a processor request to modify a target granule of a target cache line of data containing multiple granules, a processing unit originates on an interconnect of the multiprocessor data processing system a data-claim-partial request that requests permission to promote only the target granule of the target cache line to a unique copy with an intent to modify the target granule. In response to a combined response to the data-claim-partial request indicating success (the combined response representing a system-wide response to the data-claim-partial request), the processing unit promotes only the target granule of the target cache line to a unique copy by updating a coherency state of the target granule and retaining a coherency state of at least one other granule of the target cache line.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: August 28, 2012
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Jerry D. Lewis, Warren E. Maule
  • Publication number: 20120215988
    Abstract: Administering non-cacheable memory load instructions in a computing environment where cacheable data is produced and consumed in a coherent manner without harming performance of a producer, the environment including a hierarchy of computer memory that includes one or more caches backed by main memory, the caches controlled by a cache controller, at least one of the caches configured as a write-back cache. Embodiments of the present invention include receiving, by the cache controller, a non-cacheable memory load instruction for data stored at a memory address, the data treated by the producer as cacheable; determining by the cache controller from a cache directory whether the data is cached; if the data is cached, returning the data in the memory address from the write-back cache without affecting the write-back cache's state; and if the data is not cached, returning the data from main memory without affecting the write-back cache's state.
    Type: Application
    Filed: May 2, 2012
    Publication date: August 23, 2012
    Applicant: International Business Machines Corporation
    Inventors: Jon K. Kriegel, Jamie R. Kuesel
  • Publication number: 20120210073
    Abstract: An apparatus, method and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon updating a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and subsequently updating a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller.
    Type: Application
    Filed: February 11, 2011
    Publication date: August 16, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alexandre E. Eichenberger, Alan Gara, Martin Ohmacht, Vijayalakshmi Srinivasan
  • Publication number: 20120210072
    Abstract: A method of processing store requests in a data processing system includes enqueuing a store request in a store queue of a cache memory of the data processing system. The store request identifies a target memory block by a target address and specifies store data. While the store request and a barrier request older than the store request are enqueued in the store queue, a read-claim machine of the cache memory is dispatched to acquire coherence ownership of target memory block of the store request. After coherence ownership of the target memory block is acquired and the barrier request has been retired from the store queue, a cache array of the cache memory is updated with the store data.
    Type: Application
    Filed: April 26, 2012
    Publication date: August 16, 2012
    Applicant: International Business Machines Corporation
    Inventors: Guy L. Guthrie, William J. Starke, Derek E. Williams
  • Publication number: 20120210071
    Abstract: A multi-core processor with a shared physical memory is described. In an embodiment a sending core sends a memory write request to a destination core so that the request may be acted upon by the destination core as if it originated from the destination core. In an example, a data structure is configured in the shared physical memory and mapped to be accessible to the sending and destination cores. In an example, the shared data structure is used as a message channel between the sending and destination cores to carry data using the memory write request. In an embodiment a notification mechanism is enabled using the shared physical memory in order to notify the destination core of events by updating a notification data structure. In an example, the notification mechanism triggers a notification process at the destination core to inform a receiving process of a notification.
    Type: Application
    Filed: February 11, 2011
    Publication date: August 16, 2012
    Applicant: Microsoft Corporation
    Inventors: Richard John Black, Timothy Harris, Ross Cameron Mcilroy, Karin Strauss
  • Patent number: 8244985
    Abstract: Apparatus and methods relating to store operations are disclosed. In one embodiment, a first storage unit is to store data. A second storage unit is to store the data only after it has become detectable by a bus agent. Moreover, the second storage unit may store an index field for each data value to be stored within the second storage unit. Other embodiments are also disclosed.
    Type: Grant
    Filed: January 27, 2009
    Date of Patent: August 14, 2012
    Assignee: Intel Corporation
    Inventors: Vladimir Pentkovski, Ling Cen, Vivek Garg, Deep Buch, David Zhao
  • Patent number: 8239633
    Abstract: A coherence controller in hardware of an apparatus in an example detects conflicts on coherence requests through direct, non-broadcast employment of signatures that: summarize read-sets and write-sets of memory transactions; and provide false positives but no false negatives for the conflicts on the coherence requests. The signatures comprise fixed-size representations of a substantially arbitrary set of addresses for the read-sets and the write-sets of the memory transactions.
    Type: Grant
    Filed: July 9, 2008
    Date of Patent: August 7, 2012
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: David A. Wood, Mark D. Hill, Michael M. Swift, Michael R. Marty, Luke Yen, Kevin E. Moore, Jayaram Bobba, Haris Volos
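A fixed-size signature that may report false positives but never false negatives behaves like a Bloom filter. The sketch below illustrates that property only; the bit width, the hash construction, and the `conflicts` rule are assumptions and are not taken from the patent.

```python
import hashlib


class Signature:
    """Fixed-size summary of a set of addresses (Bloom-filter style)."""

    def __init__(self, bits=256, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.field = 0  # bit vector stored in a Python int

    def _positions(self, address):
        # Derive `hashes` bit positions from the address (assumed scheme).
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{address}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, address):
        for pos in self._positions(address):
            self.field |= 1 << pos

    def may_contain(self, address):
        # True for every added address; may also be True for others.
        return all((self.field >> pos) & 1 for pos in self._positions(address))


def conflicts(read_sig, write_sig, address, is_write):
    """Assumed rule: any request conflicts with the write-set, and a
    write request also conflicts with the read-set."""
    if write_sig.may_contain(address):
        return True
    return is_write and read_sig.may_contain(address)
```

Because bits are only ever set, an address that was added always tests positive, so a real conflict cannot be missed. Unrelated addresses can alias onto the same bits, which is the source of the false positives.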
  • Publication number: 20120198164
    Abstract: This invention is a cache system with a memory attribute register having plural entries. Each entry stores a write-through or a write-back indication for a corresponding memory address range. On a write to cached data, the cache consults the memory attribute register for the corresponding address range. Writes to addresses in regions marked as write-through always update all levels of the memory hierarchy. Writes to addresses in regions marked as write-back update only the first cache level that can service the write. The memory attribute register is preferably a memory mapped control register writable by the central processing unit.
    Type: Application
    Filed: September 28, 2011
    Publication date: August 2, 2012
    Applicant: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Raguram Damodaran, Abhijeet Ashok Chachad, Naveen Bhoria
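    Illustration (not taken from the patent): the per-region write policy can be modeled in a few lines. This is a rough sketch, assuming a three-level hierarchy whose last level stands in for main memory; `Region`, `Hierarchy`, and the default of write-back for unlisted addresses are hypothetical choices for the example.

```python
from dataclasses import dataclass


@dataclass
class Region:
    start: int            # first address of the range
    end: int              # one past the last address of the range
    write_through: bool   # True = write-through, False = write-back


class Hierarchy:
    def __init__(self, regions, levels=3):
        self.regions = regions  # stands in for the memory attribute register
        # levels[0] is the first cache level; levels[-1] models main memory.
        self.levels = [dict() for _ in range(levels)]

    def _is_write_through(self, addr):
        for region in self.regions:
            if region.start <= addr < region.end:
                return region.write_through
        return False  # assumption: unlisted addresses behave as write-back

    def write(self, addr, value):
        if self._is_write_through(addr):
            # Write-through: update every level of the hierarchy.
            for level in self.levels:
                level[addr] = value
            return
        # Write-back: update only the first level that holds the address.
        for level in self.levels:
            if addr in level:
                level[addr] = value
                return
        self.levels[-1][addr] = value  # nothing cached: memory services it
```

    For example, with `Region(0x0000, 0x1000, True)` and `Region(0x1000, 0x2000, False)`, a write to `0x10` reaches all three levels, while a write to `0x1004` stops at the first level that already holds that address.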
  • Publication number: 20120198177
    Abstract: A data processor is disclosed that definitively determines an effective address being calculated and decoded will be associated with an address range that includes a memory local to a data processor unit, and will disable a cache access based upon a comparison between a portion of a base address and a corresponding portion of an effective address input operand. Access to the local memory can be accomplished through a first port of the local memory when it is definitively determined that the effective address will be associated with an address range. Access to the local memory cannot be accomplished through the first port of the local memory when it is not definitively determined that the effective address will be associated with the address range.
    Type: Application
    Filed: January 28, 2011
    Publication date: August 2, 2012
    Applicant: FREESCALE SEMICONDUCTOR, INC.
    Inventor: William C. Moyer
  • Patent number: 8230177
    Abstract: Systems and methods for efficient handling of store misses. A processor comprises a store queue that stores data for committed store instructions. Coupled to the store queue is a cache responsible for ensuring consistent ordering of store operations for all consumers, which may be accomplished by maintaining a corresponding cache line in an exclusive state before executing a store operation. In response to a first committed store instruction missing in the cache, the store queue is configured to convey to the cache a second entry of the plurality of queue entries as a speculative prefetch instruction. This second entry corresponds to a committed store instruction that follows in program order the first committed store instruction of a given thread. If the prefetch instruction misses in the cache, the latency for acquiring a corresponding cache line overlaps with the latency of the first store instruction.
    Type: Grant
    Filed: May 28, 2009
    Date of Patent: July 24, 2012
    Assignee: Oracle America, Inc.
    Inventor: Mark A. Luttrell
  • Patent number: 8225297
    Abstract: Various technologies and techniques are disclosed for providing software accessible metadata on a cache of a central processing unit. A multiprocessor has at least one central processing unit. The central processing unit has a cache with cache lines that are augmented by cache metadata. The cache metadata includes software-controlled metadata identifiers that allow multiple logical processors to share the cache metadata. The metadata identifiers and cache metadata can then be used to accelerate various operations. For example, parallel computations can be accelerated using cache metadata and metadata identifiers. As another example, nested computations can be accelerated using metadata identifiers and cache metadata. As yet another example, transactional memory applications that include parallelism within transactions or that include nested transactions can be also accelerated using cache metadata and metadata identifiers.
    Type: Grant
    Filed: August 6, 2007
    Date of Patent: July 17, 2012
    Assignee: Microsoft Corporation
    Inventors: Jan Gray, Timothy L. Harris, James Larus, Burton Smith
  • Publication number: 20120179875
    Abstract: A method and apparatus for fine-grained filtering in a hardware accelerated software transactional memory system is herein described. A data object, which may have an arbitrary size, is associated with a filter word. The filter word is in a first default state when no access, such as a read, from the data object has occurred during a pendency of a transaction. Upon encountering a first access, such as a first read, from the data object, access barrier operations including an ephemeral/private store operation to set the filter word to a second state are performed. Upon a subsequent/redundant access, such as a second read, the access barrier operations are elided to accelerate the subsequent access, based on the filter word being set to the second state to indicate a previous access occurred.
    Type: Application
    Filed: January 10, 2012
    Publication date: July 12, 2012
    Inventors: Bratin Saha, Ali-Reza Adl-Tabatabai, Gad Sheaffer, Quinn Jacobson
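    Illustration (not taken from the patent): the filtering idea, performing the read barrier once per object per transaction and skipping it on redundant reads, can be sketched in software. The names `TrackedObject` and `Transaction` are hypothetical, and resetting the filter word explicitly in `end()` is an assumption standing in for the ephemeral/private store the abstract describes.

```python
class TrackedObject:
    def __init__(self, value):
        self.value = value
        self.filter_word = 0  # first (default) state: not yet accessed


class Transaction:
    def __init__(self):
        self.read_log = []   # objects whose read barrier has run
        self.touched = []    # objects whose filter word this transaction set

    def read(self, obj):
        if obj.filter_word == 0:
            # First read in this transaction: run the barrier (here, logging)
            # and move the filter word to the second state.
            self.read_log.append(obj)
            obj.filter_word = 1
            self.touched.append(obj)
        # Redundant reads fall through with no barrier work.
        return obj.value

    def end(self):
        # Assumption: filter words return to the default state when the
        # transaction finishes, so the next transaction starts clean.
        for obj in self.touched:
            obj.filter_word = 0
        self.touched.clear()
```

    Reading the same object three times inside one transaction leaves a single entry in `read_log`; the second and third reads cost only the filter-word check.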
  • Publication number: 20120173824
    Abstract: Embodiments of the invention provide techniques for managing cache metadata providing a mapping between addresses on a storage medium (e.g., disk storage) and corresponding addresses on a cache device at which data items are stored. In some embodiments, cache metadata may be stored in a hierarchical data structure comprising a plurality of hierarchy levels. When a reboot of the computer is initiated, only a subset of the plurality of hierarchy levels may be loaded to memory, thereby expediting the process of restoring the cache metadata and thus startup operations. Startup may be further expedited by using cache metadata to perform operations associated with reboot.
    Type: Application
    Filed: February 2, 2012
    Publication date: July 5, 2012
    Applicant: Microsoft Corporation
    Inventors: Mehmet Iyigun, Yevgeniy Bak, Michael Fortin, David Fields, Cenk Ergan, Alexander Kirshenbaum
  • Patent number: 8214602
    Abstract: In one embodiment, a processor comprises a data cache and a load/store unit (LSU). The LSU comprises a queue and a control unit, and each entry in the queue is assigned to a different load that has accessed the data cache but has not retired. The control unit is configured to update the data cache hit status of each load represented in the queue as a content of the data cache changes. The control unit is configured to detect a snoop hit on a first load in a first entry of the queue responsive to: the snoop index matching a load index stored in the first entry, the data cache hit status of the first load indicating hit, the data cache detecting a snoop hit for the snoop operation, and a load way stored in the first entry matching a first way of the data cache in which the snoop operation is a hit.
    Type: Grant
    Filed: June 23, 2008
    Date of Patent: July 3, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ashutosh S. Dhodapkar, Michael G. Butler
  • Publication number: 20120166735
    Abstract: Technologies are generally described for a system for sending a data block stored in a cache. In some examples described herein, a system may comprise a first processor in a first tile. The first processor is effective to generate a request for a data block, the request including a destination identifier identifying a destination tile for the data block, the destination tile being distinct from the first tile. Some example systems may further comprise a second tile effective to receive the request, the second tile effective to determine a data tile including the data block, the second tile further effective to send the request to the data tile. Some example systems may still further comprise a data tile effective to receive the request from the second tile, the data tile effective to send the data block to the destination tile.
    Type: Application
    Filed: March 2, 2012
    Publication date: June 28, 2012
    Applicant: EMPIRE TECHNOLOGY DEVELOPMENT LLC
    Inventor: Yan Solihin
  • Publication number: 20120159080
    Abstract: A method and apparatus for utilizing a higher-level cache as a neighbor cache directory in a multi-processor system are provided. In the method and apparatus, when the data field of a portion or all of the cache is unused, a remaining portion of the cache is repurposed for usage as a neighbor cache directory. The neighbor cache directory provides a pointer to another cache in the multi-processor system storing memory data. The neighbor cache directory can be searched in the same manner as a data cache.
    Type: Application
    Filed: December 15, 2010
    Publication date: June 21, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Greggory D. Donley, William A. Hughes, Kevin M. Lepak, Vydhyanathan Kalyanasundharam, Benjamin Tsien
  • Publication number: 20120159083
    Abstract: Systems and methods for performing memory transactions are described. In an embodiment, a system comprises a processor configured to perform an action in response to a transaction indicative of a request originated by a hardware subsystem. A logic circuit is configured to receive the transaction. In response to identifying a specific characteristic of the transaction, the logic circuit splits the transaction into two or more other transactions. The two or more other transactions enable the processor to satisfy the request without performing the action. The system also includes an interface circuit configured to receive the request originated by the hardware subsystem and provide the transaction to the logic circuit. In some embodiments, a system may be implemented as a system-on-a-chip (SoC). Devices suitable for using these systems include, for example, desktop and laptop computers, tablets, network appliances, mobile phones, personal digital assistants, e-book readers, televisions, and game consoles.
    Type: Application
    Filed: December 17, 2010
    Publication date: June 21, 2012
    Inventors: Deniz Balkan, Gurjeet S. Saund
  • Patent number: 8205046
    Abstract: A system and method for maintaining coherency in a symmetric multiprocessing (SMP) system are disclosed. Briefly described, in architecture, one exemplary embodiment comprises a first crossbar coupled to a plurality of local processors; a second crossbar coupled to at least one remote processor; and at least one crossbar directory that tracks access of information by a remote processor in a symmetric multiprocessing (SMP) system, the remote processor in communication with at least one of the local processors via the crossbars, such that a most current location of the information can be determined by the crossbar directory.
    Type: Grant
    Filed: January 31, 2005
    Date of Patent: June 19, 2012
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Mark Shaw, Stuart Allen Berke
  • Patent number: 8200909
    Abstract: A method and apparatus for accelerating a software transactional memory (STM) system is described herein. Annotation fields are associated with lines of a transactional memory. An annotation field associated with a line of the transactional memory is initialized to a first value upon starting a transaction. In response to encountering a read operation in the transaction, the annotation field is checked. If the annotation field includes the first value, the read is serviced from the line of the transactional memory without having to search an additional write space. A second and third value in the annotation field potentially indicate whether a read operation missed the transactional memory or a tentative value is stored in a write space. Additionally, an additional bit in the annotation field may be utilized to indicate whether previous read operations have been logged, allowing for subsequent redundant read logging to be reduced.
    Type: Grant
    Filed: April 26, 2011
    Date of Patent: June 12, 2012
    Inventors: Bratin Saha, Ali-Reza Adl-Tabatabai, Quinn Jacobson
  • Patent number: 8195890
    Abstract: The present invention is a protocol for maintaining cache consistency between multiprocessors within a tightly coupled system. A distributed directory is maintained within the data-sharing processors, so that copies can be invalidated when modified. All transfers are event driven, rather than polled, to reduce bus-bandwidth consumption. Deadlocks are avoided by placing to-be-executed command codes in the returned response packets, when the request-forwarding queues are full or not present.
    Type: Grant
    Filed: August 22, 2007
    Date of Patent: June 5, 2012
    Assignee: Sawyer Law Group, P.C.
    Inventor: David Vernon James
  • Publication number: 20120137081
    Abstract: Systems and methods for management of a cache are disclosed. In general, embodiments described herein store access counts in file system metadata associated with files in the cache. By encoding access counts in the file system metadata, file I/O operations are reduced. Preferably, the access count is encoded in an access count timestamp in the file system metadata. The access counts can be decoded based on the difference between the access count timestamp and a base time value, with larger differences reflecting a larger access count. The cache can be aged by advancing the base time value, thereby causing the access count for a file to drop. The base time value can also be stored in file system metadata, thereby reducing file I/O operations when performing aging.
    Type: Application
    Filed: November 30, 2010
    Publication date: May 31, 2012
    Inventor: James C. Shea
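    Illustration (not taken from the patent): the arithmetic behind a count stored as a timestamp offset can be shown with plain integers. This is a minimal sketch, assuming one timestamp unit per access; `UNIT`, `encode`, `decode`, and `record_access` are hypothetical names, and real file system timestamps and their update calls are deliberately left out.

```python
UNIT = 1  # timestamp units added per recorded access (assumed value)


def encode(base_time, count):
    """Return the timestamp that represents `count` accesses."""
    return base_time + count * UNIT


def decode(base_time, stamp):
    """Recover the access count from a timestamp; never negative."""
    return max(0, (stamp - base_time) // UNIT)


def record_access(base_time, stamp):
    """Return the new timestamp after one more access.

    If aging has moved the base past the stamp, counting restarts
    from the base so the new access decodes as a count of 1.
    """
    return max(stamp, base_time) + UNIT
```

    With a base of 1000, three recorded accesses give a timestamp of 1003, which decodes to 3. Advancing the base to 1002 ages every file at once: the same timestamp now decodes to 1, with no per-file write.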
  • Patent number: 8190820
    Abstract: In one embodiment, the present invention includes a directory to aid in maintaining control of a cache coherency protocol. The directory can be coupled to multiple caching agents via an interconnect, and be configured to store entries associated with cache lines. The directory also includes logic to determine a time delay before the directory can send a concurrent snoop request. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 13, 2008
    Date of Patent: May 29, 2012
    Assignee: Intel Corporation
    Inventors: Hariharan Thantry, Akhilesh Kumar, Seungjoon Park