Cache Consistency Protocols (EPO) Patents (Class 711/E12.026)
E Subclasses
- Copy directories (EPO) (Class 711/E12.028)
- Associative directories (EPO) (Class 711/E12.029)
- Distributed directories, e.g., linked lists of caches, etc. (EPO) (Class 711/E12.03)
- Limited pointers directories; state-only directories without pointers (EPO) (Class 711/E12.031)
- With concurrent directory accessing, i.e., handling multiple concurrent coherency transactions (EPO) (Class 711/E12.032)
- Publication number: 20130061003
  Abstract: A system, apparatus, and method for routing traffic in a SoC from I/O devices to memory. A coherence switch routes coherent traffic through a coherency port on a processor complex to a real-time port of a memory controller. The coherence switch routes non-coherent traffic to a non-real-time port of the memory controller. The coherence switch can also dynamically switch traffic between the two paths. The routing of traffic can be configured via a configuration register, and while software can initiate an update to the configuration register, the actual coherence switch hardware will implement the update. Software can write to a software-writeable copy of the configuration register to initiate an update to the flow path to memory for a transaction identifier. The coherence switch detects the update to the software-writeable copy, and then the coherence switch updates the working copy of the configuration register and implements the new routing.
  Type: Application
  Filed: September 7, 2011
  Publication date: March 7, 2013
  Inventors: Timothy J. Millet, Muditha Kanchana, Shailendra S. Desai
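The double-buffered configuration-register scheme in this abstract (software writes a shadow copy; the switch hardware notices the change and promotes it to the working copy) can be sketched as follows. This is a toy model, not the patented implementation; all class and method names are illustrative.

```python
# Hypothetical sketch: software-writeable shadow register vs. hardware-owned
# working register, selecting a flow path per transaction identifier.

COHERENT, NON_COHERENT = "coherent", "non_coherent"

class CoherenceSwitch:
    """Routes traffic per transaction ID via a working config register."""

    def __init__(self):
        self.working = {}   # transaction id -> path (hardware-owned copy)
        self.shadow = {}    # software-writeable copy

    def software_update(self, txn_id, path):
        # Software only ever touches the shadow copy.
        self.shadow[txn_id] = path

    def hardware_tick(self):
        # Hardware detects shadow/working divergence and applies the update.
        for txn_id, path in self.shadow.items():
            if self.working.get(txn_id) != path:
                self.working[txn_id] = path

    def route(self, txn_id):
        # Unconfigured traffic defaults to the non-coherent path in this sketch.
        return self.working.get(txn_id, NON_COHERENT)

switch = CoherenceSwitch()
switch.software_update(7, COHERENT)
assert switch.route(7) == NON_COHERENT   # hardware has not applied it yet
switch.hardware_tick()
assert switch.route(7) == COHERENT       # working copy now updated
```

The point of the split is that software never races the routing hardware: it only publishes an intent, and the hardware decides when the working copy changes.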
- Patent number: 8392656
  Abstract: A parameter copying method is applied to a duplex system in which an MPU and a main memory are duplicated and duplex operations on a hot standby system are performed. The parameter copying method includes cache reading data in the main memory corresponding to one MPU, cache writing the data read in the cache reading step on an as-is basis, and writing the data into the main memory corresponding to the one MPU by a block write that is produced by a cache replace caused by the cache writing step, while also writing the same data into the main memory corresponding to the other MPU by the block write on the basis of a mirrored write.
  Type: Grant
  Filed: January 29, 2010
  Date of Patent: March 5, 2013
  Assignee: Yokogawa Electric Corporation
  Inventor: Hideharu Yajima
- Patent number: 8392663
  Abstract: A multiprocessor system maintains cache coherence among processors in a coherent domain. Within the coherent domain, a first processor can receive a command to perform a cache maintenance operation. The first processor can determine whether the cache maintenance operation is a coherent operation. For coherent operations, the first processor sends a coherent request message for distribution to other processors in the coherent domain and can cancel execution of the cache maintenance operation pending receipt of intervention messages corresponding to the coherent request. The intervention messages can reflect a global ordering of coherence traffic in the multiprocessor system and can include instructions for maintaining a data cache and an instruction cache of the first processor. Cache maintenance operations that are determined to be non-coherent can be executed at the first processor without sending the coherent request.
  Type: Grant
  Filed: December 10, 2008
  Date of Patent: March 5, 2013
  Assignee: MIPS Technologies, Inc.
  Inventors: Ryan C. Kinter, Darren M. Jones, Matthias Knoth
- Publication number: 20130054900
  Abstract: A method and an apparatus for increasing the capacity of a cache directory in multi-processor systems, the apparatus comprising a plurality of processor nodes, a plurality of cache memory nodes, and a plurality of main memory nodes.
  Type: Application
  Filed: August 22, 2012
  Publication date: February 28, 2013
  Inventor: Conor Santifort
- Publication number: 20130046924
  Abstract: In one embodiment, the present invention includes a method for executing a transactional memory (TM) transaction in a first thread, buffering a block of data in a first buffer of a cache memory of a processor, and acquiring a write monitor on the block to obtain ownership of the block at an encounter time in which data at a location of the block in the first buffer is updated. Other embodiments are described and claimed.
  Type: Application
  Filed: October 23, 2012
  Publication date: February 21, 2013
  Inventors: Ali-Reza Adl-Tabatabai, Yang Ni, Bratin Saha, Vadim Bassin, Gad Sheaffer, David Callahan, Jan Gray
- Patent number: 8380933
  Abstract: A multiprocessor system includes cache memories each of which is provided in correspondence with one of the processor cores and includes a tag storage unit configured to store validity information representing whether a cache line as a unit to store data is valid, update information representing whether data in the cache line has been rewritten, and address information of the data in the cache line; a shared memory shared by the processor cores; and an arbitration circuit configured to arbitrate access requests from the processor cores to the shared memory and send the arbitrated access request to the cache memories. Each cache memory includes a violation detection circuit configured to detect a violation access by comparing the information in the tag storage unit with the access request from the arbitration circuit.
  Type: Grant
  Filed: March 24, 2008
  Date of Patent: February 19, 2013
  Assignee: Kabushiki Kaisha Toshiba
  Inventor: Masato Uchiyama
- Patent number: 8380934
  Abstract: A cache device interposed between a processor and a memory device, including: a cache memory storing data from the memory device; a buffer holding output data output from the processor; and a control circuit determining, on the basis of a request to access the memory device, whether a cache hit has occurred or not and, if a cache miss has occurred, storing the output data in the buffer in response to the access request, outputting a read request for reading the data in a line containing data requested by the access request from the memory device, storing data output from the line of the memory device into the cache memory, and storing the output data from the buffer into the cache memory.
  Type: Grant
  Filed: February 16, 2010
  Date of Patent: February 19, 2013
  Assignee: Fujitsu Semiconductor Limited
  Inventor: Gen Tsukishiro
- Publication number: 20130031314
  Abstract: A number of coherence domains are maintained among the multitude of processing cores disposed in a microprocessor. A cache coherency manager defines the coherency relationships such that coherence traffic flows only among the processing cores that are defined as having a coherency relationship. The data defining the coherency relationships between the processing cores is optionally stored in a programmable register. For each source of a coherent request, the processing core targets of the request are identified in the programmable register. In response to a coherent request, an intervention message is forwarded only to the cores that are defined to be in the same coherence domain as the requesting core. If a cache hit occurs in response to a coherent read request and the coherence state of the cache line resulting in the hit satisfies a condition, the requested data is made available to the requesting core from that cache line.
  Type: Application
  Filed: January 30, 2012
  Publication date: January 31, 2013
  Applicant: MIPS Technologies, Inc.
  Inventor: Ryan C. Kinter
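The domain-filtered intervention scheme described here (a programmable register maps each request source to its coherence-domain targets, and interventions go only to those cores) can be modeled in a few lines. The class and method names below are illustrative assumptions, not the patent's terminology.

```python
# Hypothetical sketch: a programmable coherence-domain register that limits
# intervention messages to cores sharing a domain with the requester.

class CoherencyManager:
    def __init__(self):
        # Programmable register: requesting core -> set of cores in its domain.
        self.domains = {}

    def program(self, source_core, target_cores):
        """Software programs the domain membership for one request source."""
        self.domains[source_core] = set(target_cores)

    def intervention_targets(self, requester):
        # Forward interventions only to other cores in the requester's domain;
        # cores outside the domain never see this coherence traffic.
        return sorted(self.domains.get(requester, set()) - {requester})

mgr = CoherencyManager()
mgr.program(0, {0, 1, 2})   # cores 0-2 form one coherence domain
mgr.program(3, {3})         # core 3 is coherent only with itself
assert mgr.intervention_targets(0) == [1, 2]
assert mgr.intervention_targets(3) == []    # no interventions leave core 3
```

Keeping the mapping in a programmable register is what lets software repartition the cores into domains without any change to the coherence hardware itself.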
- Publication number: 20130019047
  Abstract: An apparatus having a memory and circuit is disclosed. The memory may (i) assert a first signal in response to detecting a conflict between at least two addresses requesting access to a block at a first time, (ii) generate a second signal in response to a cache miss caused by an address requesting access to the block at a second time, and (iii) store a line fetched in response to the cache miss in another block by adjusting the first address by an offset. The second time is generally after the first time. The circuit may (i) generate the offset in response to the assertion of the first signal and (ii) present the offset in a third signal to the memory in response to the assertion of the second signal corresponding to reception of the first address at the second time. The offset is generally associated with the first address.
  Type: Application
  Filed: July 11, 2011
  Publication date: January 17, 2013
  Inventors: Dmitry Podvalny, Alex Shinkar, Assaf Rachlevski
- Patent number: 8352681
  Abstract: Proposed are a highly reliable storage system and its control method capable of accelerating the processing speed of copy processing as seen from the host device. With the storage system and its control method, which store a command issued from a host device in a command queue and execute the commands stored in the command queue in the order in which they were stored, a copy queue is set in the memory for temporarily retaining copy commands among the commands issued from the host device. A copy command issued from the host device and stored in the command queue is moved to the copy queue, an execution completion reply for the copy processing according to the command is sent to the host device as the sender of the command, and the copy command that was moved to the copy queue is executed in the background in the order in which it was stored in the copy queue.
  Type: Grant
  Filed: July 17, 2009
  Date of Patent: January 8, 2013
  Assignee: Hitachi, Ltd.
  Inventors: Kosuke Sakai, Koji Nagata, Yoshiyuki Noborikawa
- Publication number: 20130007375
  Abstract: A device with an interconnect connecting a plurality of memory controllers. Each memory controller of the plurality of memory controllers is coupled to an allocated memory for storing data. Further, each memory controller of the plurality of memory controllers has one accelerator of a plurality of accelerators for mutually exchanging data over the interconnect.
  Type: Application
  Filed: June 27, 2012
  Publication date: January 3, 2013
  Applicant: International Business Machines Corporation
  Inventors: Florian Alexander Auernhammer, Victoria Caparros Cabezas, Andreas Christian Doering, Patricia Maria Sagmeister
- Publication number: 20120331233
  Abstract: A mechanism is provided for detecting false sharing misses. Responsive to performing either an eviction or an invalidation of a cache line in a cache memory of the data processing system, a determination is made as to whether there is an entry associated with the cache line in a false sharing detection table. Responsive to the entry associated with the cache line existing in the false sharing detection table, a determination is made as to whether an overlap field associated with the entry is set. Responsive to the overlap field failing to be set, identification is made that a false sharing coherence miss has occurred. A first signal is then sent to a performance monitoring unit indicating the false sharing coherence miss.
  Type: Application
  Filed: June 24, 2011
  Publication date: December 27, 2012
  Applicant: International Business Machines Corporation
  Inventors: Harold W. Cain, III, Hubertus Franke
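The overlap-field test in this abstract distinguishes false sharing (a remote writer touched different words of the line) from true sharing (the same word was contended). A minimal sketch of that classification, under the assumption that the table tracks which words each side touched, could look like this; the names are illustrative.

```python
# Hypothetical sketch: a false-sharing detection table keyed by line address.
# The 'overlap' bit is set only when a remote write hits a word the local
# core actually accessed; a clear bit at invalidation time means the miss
# was a false sharing coherence miss.

class FalseSharingDetector:
    def __init__(self):
        self.table = {}   # line address -> {"words": set, "overlap": bool}

    def local_access(self, line, word):
        entry = self.table.setdefault(line, {"words": set(), "overlap": False})
        entry["words"].add(word)

    def remote_write(self, line, word):
        entry = self.table.get(line)
        if entry and word in entry["words"]:
            entry["overlap"] = True   # true sharing: same word is contended

    def on_invalidate(self, line):
        entry = self.table.pop(line, None)
        if entry is None:
            return None
        return "true_sharing" if entry["overlap"] else "false_sharing"

d = FalseSharingDetector()
d.local_access(0x100, word=0)
d.remote_write(0x100, word=7)    # remote core writes a different word
assert d.on_invalidate(0x100) == "false_sharing"
```

In the patent's setting the classification result is reported to a performance monitoring unit; here it is simply returned to the caller.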
- Publication number: 20120317366
  Abstract: The present invention measures an actual utilization frequency of data and controls a location of this data in a storage apparatus in a case where a host computer makes joint use of a storage apparatus and a cache apparatus. A portion of data used by an application program 1A is stored in a storage apparatus 2 and a cache apparatus 3. A management apparatus 4 detects an I/O load of a page (4A), and detects an I/O load of cache data (4B). The management apparatus 4 determines a corresponding relationship between the page and the cache data (4C), and adds the I/O load of the cache data to the I/O load of the page.
  Type: Application
  Filed: May 30, 2011
  Publication date: December 13, 2012
  Inventors: Takanori Sato, Takato Kusama
- Patent number: 8332568
  Abstract: A memory access determination circuit includes a counter that outputs a first value counted by using a first reference value, and a control unit that makes a cache determination of an address corresponding to an output of the counter, wherein, when a cache miss occurs for the address, the counter outputs a second value by using a second reference value.
  Type: Grant
  Filed: February 15, 2010
  Date of Patent: December 11, 2012
  Assignee: Fujitsu Semiconductor Limited
  Inventor: Kazuhiko Okada
- Publication number: 20120311021
  Abstract: A method of a transaction-based system is applicable to a data deduplication system. In the system, pointers of same data point to a same position, so that when one piece of data is changed, all associated pointers need to be changed. In this method, a server first sets a flag to a false value, and after the server receives a request for backing up a data element from a client, the server reads a fingerprinting of the data element and determines whether the fingerprinting is the same as a temporary fingerprinting in a meta cache of the client, writes the data element and the fingerprinting into a corresponding temporary storage data block when the fingerprinting is not the same as the temporary fingerprinting, and writes the data element and the fingerprinting into a main meta cache and resets the flag when the flag is a true value.
  Type: Application
  Filed: September 23, 2011
  Publication date: December 6, 2012
  Applicant: INVENTEC CORPORATION
  Inventors: Ming-Sheng Zhu, Chih-Feng Chen
- Publication number: 20120311271
  Abstract: A read cache device for accelerating execution of read commands in a storage area network (SAN) in a data path between frontend servers and a backend storage. The device includes a cache memory unit for maintaining portions of data that reside in the backend storage and are mapped to at least one accelerated virtual volume; a cache management unit for maintaining data consistency between the cache memory unit and the at least one accelerated virtual volume; a descriptor memory unit for maintaining a plurality of descriptors; and a processor for receiving each command and each command response that travels in the data path, serving each received read command directed to the at least one accelerated virtual volume by returning requested data stored in the cache memory unit, and writing data to the cache memory unit according to a caching policy.
  Type: Application
  Filed: June 6, 2011
  Publication date: December 6, 2012
  Applicant: SANRAD, Ltd.
  Inventors: Yaron Klein, Allon Cohen
- Patent number: 8327081
  Abstract: A processor module having a cache device and a system controller having a copy TAG2 of a tag of the cache device configure a system to which a protocol representing the states of a data block of the cache device by six states, that is, an invalid state I, a shared state S, an exclusive state E, a modified state M, a shared modified state O, and a writable modified state W, can be applied. In order to implement the concept, information about a new state in a cache device of a requester is included in a reply packet from the cache device transmitting the data block. From the completion of the snooping process of the TAG2 until the reception of the reply packet from the cache device transmitting the data block and the determination of the next state, an object data block is locked in the TAG2.
  Type: Grant
  Filed: August 27, 2008
  Date of Patent: December 4, 2012
  Assignee: Fujitsu Limited
  Inventor: Gou Sugizaki
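The six-state protocol named in this abstract extends the familiar MOESI set with a writable modified state W. A small sketch of the state set and two groupings a controller would care about follows; the groupings are assumptions of this sketch, not the patent's transition tables.

```python
# Hypothetical sketch of the six-state protocol (I, S, E, M, O, W) from the
# abstract. The DIRTY / WRITABLE groupings below are illustrative: they mirror
# common MOESI conventions and are not taken from the patent itself.
from enum import Enum

class LineState(Enum):
    I = "invalid"
    S = "shared"
    E = "exclusive"
    M = "modified"
    O = "shared modified"
    W = "writable modified"

# States holding data newer than memory: must be written back before dropping.
DIRTY = {LineState.M, LineState.O, LineState.W}

# States in which a core may write without first requesting ownership.
WRITABLE_WITHOUT_REQUEST = {LineState.E, LineState.M, LineState.W}

assert LineState.O in DIRTY                       # shared modified is dirty
assert LineState.S not in WRITABLE_WITHOUT_REQUEST
assert LineState.W in WRITABLE_WITHOUT_REQUEST
```

The lock described in the abstract (holding the block in the TAG2 copy tag between the snoop and the reply carrying the requester's new state) exists precisely because the next state cannot be committed until that reply arrives.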
- Publication number: 20120303906
  Abstract: Embodiments are provided for cache memory systems. In one general embodiment, a method includes receiving a host write request from a host computer, creating a sequential log file in a storage device, and copying data received during the host write request to a storage buffer. The method further includes determining if a selected quantity of data has been accumulated in the storage buffer and executing a write-through of data to sequentially write the data accumulated in the storage buffer to the sequential log file and to a storage class memory device if the selected quantity of data has been accumulated in the storage buffer.
  Type: Application
  Filed: August 6, 2012
  Publication date: November 29, 2012
  Applicant: International Business Machines Corporation
  Inventor: Binny S. Gill
- Patent number: 8316194
  Abstract: In one embodiment, the present invention includes a method for executing a transactional memory (TM) transaction in a first thread, buffering a block of data in a first buffer of a cache memory of a processor, and acquiring a write monitor on the block to obtain ownership of the block at an encounter time in which data at a location of the block in the first buffer is updated. Other embodiments are described and claimed.
  Type: Grant
  Filed: December 15, 2009
  Date of Patent: November 20, 2012
  Assignee: Intel Corporation
  Inventors: Ali-Reza Adl-Tabatabai, Yang Ni, Bratin Saha, Vadim Bassin, Gad Sheaffer, David Callahan, Jan Gray
- Publication number: 20120290794
  Abstract: A method including: receiving multiple local requests to access the cache line; inserting, into an address chain, multiple entries corresponding to the multiple local requests; identifying a first entry at a head of the address chain; initiating, in response to identifying the first entry and in response to the first entry corresponding to a request to own the cache line, a traversal of the address chain; setting, during the traversal of the address chain, a state element identified in a second entry; receiving a foreign request to access the cache line; inserting, in response to setting the state element, a third entry corresponding to the foreign request into the address chain after the second entry; and relinquishing, in response to inserting the third entry after the second entry in the address chain, the cache line to a foreign thread after executing the multiple local requests.
  Type: Application
  Filed: August 29, 2011
  Publication date: November 15, 2012
  Applicant: Oracle International Corporation
  Inventors: Connie Wai Mun Cheung, Madhavi Kondapaneni, Joann Yin Lam, Ramaswamy Sivaramakrishnan
- Publication number: 20120290774
  Abstract: A method and system to allow power fail-safe write-back or write-through caching of data in a persistent storage device into one or more cache lines of a caching device. No metadata associated with any of the cache lines is written atomically into the caching device when the data in the storage device is cached. As such, specialized cache hardware to allow atomic writing of metadata during the caching of data is not required.
  Type: Application
  Filed: May 16, 2012
  Publication date: November 15, 2012
  Inventor: Sanjeev N. Trika
- Publication number: 20120254551
  Abstract: Provided is a method of generating a code by a compiler, including the steps of: analyzing a program executed by a processor; analyzing data necessary to execute respective tasks included in the program; determining whether a boundary of the data used by divided tasks is consistent with a management unit of a cache memory based on results of the analyzing; and, in a case where it is determined that the boundary of the data used by the divided tasks is not consistent with the management unit of the cache memory, generating a code for providing a non-cacheable area so that the data to be stored in the management unit including the boundary is not temporarily stored into the cache memory, and a code for storing an arithmetic processing result stored in the management unit including the boundary into a non-cacheable area.
  Type: Application
  Filed: December 14, 2010
  Publication date: October 4, 2012
  Applicant: WASEDA UNIVERSITY
  Inventors: Hironori Kasahara, Keiji Kimura, Masayoshi Mase
- Publication number: 20120254541
  Abstract: Methods and apparatus for updating data in passive variable resistive memory (PVRM) are provided. In one example, a method for updating data stored in PVRM is disclosed. The method includes updating a memory block of a plurality of memory blocks in a cache hierarchy without invalidating the memory block. The updated memory block may be copied from the cache hierarchy to a write through buffer. Additionally, the method includes writing the updated memory block to the PVRM, thereby updating the data in the PVRM.
  Type: Application
  Filed: April 4, 2011
  Publication date: October 4, 2012
  Applicant: Advanced Micro Devices, Inc.
  Inventors: Brad Beckmann, Lisa Hsu
- Publication number: 20120246410
  Abstract: A cache memory has one or a plurality of ways having a plurality of cache lines including a tag memory which stores a tag address, a first dirty bit memory which stores a first dirty bit, a valid bit memory which stores a valid bit, and a data memory which stores data. The cache memory has a line index memory which stores a line index for identifying the cache line. The cache memory has a DBLB management unit having a plurality of lines including a row memory which stores first bit data identifying the way and second bit data identifying the line index, a second dirty bit memory which stores a second dirty bit of bit unit corresponding to writing of a predetermined unit into the data memory, and a FIFO memory which stores FIFO information prescribing a registered order. Data in a cache line of a corresponding way is written back on the basis of the second dirty bit.
  Type: Application
  Filed: June 9, 2011
  Publication date: September 27, 2012
  Applicant: Kabushiki Kaisha Toshiba
  Inventor: Hui Xu
- Patent number: 8271736
  Abstract: A method for increasing the performance and utilization of cache memory by combining the data block frequency map generated by a data de-duplication mechanism with page prefetching and eviction algorithms like the Least Recently Used (LRU) policy. The data block frequency map provides a weight directly proportional to the frequency count of the block in the dataset. This weight is used to influence caching algorithms like LRU. Data blocks that have a lesser frequency count in the dataset are evicted before those with higher frequencies, even though they may not have been the topmost blocks for page eviction by caching algorithms. The method effectively combines the weight of the block in the frequency map and its eviction status by caching algorithms like LRU to get improved performance and utilization of the cache memory.
  Type: Grant
  Filed: February 7, 2008
  Date of Patent: September 18, 2012
  Assignee: International Business Machines Corporation
  Inventors: Karan Gupta, Tarun Thakur
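The idea of biasing LRU eviction with a de-duplication frequency map (blocks that many files share get a higher weight and survive longer) is easy to prototype. This is an illustrative sketch of the general technique, not the patented algorithm; class names and the exact victim-ranking rule are assumptions.

```python
# Hypothetical sketch: LRU cache whose eviction choice is biased by a
# dedup frequency map. Among cached blocks, the victim is the block with
# the lowest duplicate count, with age breaking ties (oldest first).
from collections import OrderedDict

class DedupAwareLRU:
    def __init__(self, capacity, freq_map):
        self.capacity = capacity
        self.freq = freq_map        # block id -> duplicate count in dataset
        self.lru = OrderedDict()    # block id -> data, oldest entry first

    def access(self, block, data=None):
        if block in self.lru:
            self.lru.move_to_end(block)   # plain LRU recency update on a hit
            return self.lru[block]
        if len(self.lru) >= self.capacity:
            # Rank candidate victims by (frequency weight, age): blocks with
            # few duplicates go first, and among equals the oldest goes first.
            victim = min(enumerate(self.lru),
                         key=lambda p: (self.freq.get(p[1], 1), p[0]))[1]
            del self.lru[victim]
        self.lru[block] = data
        return data

cache = DedupAwareLRU(2, freq_map={"a": 10, "b": 1, "c": 3})
cache.access("a"); cache.access("b"); cache.access("c")
assert "b" not in cache.lru   # lowest duplicate count evicted first
assert "a" in cache.lru       # heavily duplicated block retained
```

Plain LRU would have evicted "a" (the oldest block); the frequency weight overrides recency because "a" backs ten duplicates in the dataset.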
- Publication number: 20120233410
  Abstract: The present invention discloses a shared-variable-based (SVB) approach for fast and accurate multi-core cache coherence simulation. While the intuitive, conventional approach, synchronizing at either every cycle or every memory access, gives accurate simulation results, it has poor performance due to huge simulation overhead. In the present invention, timing synchronization is only needed before shared variable accesses in order to maintain accuracy while improving the efficiency of the proposed shared-variable-based approach.
  Type: Application
  Filed: March 13, 2011
  Publication date: September 13, 2012
  Applicant: National Tsing Hua University
  Inventors: Cheng-Yang Fu, Meng-Huan Wu, Ren-Song Tsay
- Publication number: 20120221798
  Abstract: Methods for selecting a line to evict from a data storage system are provided. A computer system implementing a method for selecting a line to evict from a data storage system is also provided. The methods include selecting an uncached class line for eviction prior to selecting a cached class line for eviction.
  Type: Application
  Filed: May 5, 2012
  Publication date: August 30, 2012
  Inventor: Blaine D. Gaither
- Patent number: 8255633
  Abstract: A list prefetch engine improves the performance of a parallel computing system. The list prefetch engine receives a current cache miss address. The list prefetch engine evaluates whether the current cache miss address is valid. If the current cache miss address is valid, the list prefetch engine compares the current cache miss address and a list address. A list address represents an address in a list. A list describes an arbitrary sequence of prior cache miss addresses. The prefetch engine prefetches data according to the list if there is a match between the current cache miss address and the list address.
  Type: Grant
  Filed: January 29, 2010
  Date of Patent: August 28, 2012
  Assignee: International Business Machines Corporation
  Inventors: Peter Boyle, Norman Christ, Alan Gara, Changhoan Kim, Robert Mawhinney, Martin Ohmacht, Krishnan Sugavanam
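List prefetching replays a recorded sequence of prior miss addresses: when the current miss matches an entry in the list, the engine prefetches the entries that followed it last time. A toy sketch of that matching step, with illustrative names and a hypothetical prefetch depth parameter:

```python
# Hypothetical sketch of a list prefetch engine: on a valid miss that
# matches an address in the recorded list, prefetch the next few entries.

class ListPrefetchEngine:
    def __init__(self, miss_list, depth=2):
        self.miss_list = miss_list   # recorded sequence of prior miss addresses
        self.depth = depth           # how far ahead to prefetch (assumption)

    def on_miss(self, addr):
        if addr is None:
            return []                # engine first checks the miss is valid
        try:
            i = self.miss_list.index(addr)
        except ValueError:
            return []                # no match against the list: no prefetch
        # Match: prefetch the addresses that followed this one previously.
        return self.miss_list[i + 1 : i + 1 + self.depth]

engine = ListPrefetchEngine([0x10, 0x48, 0x20, 0x90])
assert engine.on_miss(0x48) == [0x20, 0x90]   # replay the recorded sequence
assert engine.on_miss(0x999) == []            # unknown miss: nothing to do
```

This works for arbitrary (non-strided) access patterns precisely because the list is a literal recording of history, not a formula over addresses.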
- Patent number: 8255635
  Abstract: According to a method of data processing in a multiprocessor data processing system, in response to a processor request to modify a target granule of a target cache line of data containing multiple granules, a processing unit originates on an interconnect of the multiprocessor data processing system a data-claim-partial request that requests permission to promote only the target granule of the target cache line to a unique copy with an intent to modify the target granule. In response to a combined response to the data-claim-partial request indicating success (the combined response representing a system-wide response to the data-claim-partial request), the processing unit promotes only the target granule of the target cache line to a unique copy by updating a coherency state of the target granule and retaining a coherency state of at least one other granule of the target cache line.
  Type: Grant
  Filed: February 1, 2008
  Date of Patent: August 28, 2012
  Assignee: International Business Machines Corporation
  Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Jerry D. Lewis, Warren E. Maule
- Publication number: 20120215988
  Abstract: Administering non-cacheable memory load instructions in a computing environment where cacheable data is produced and consumed in a coherent manner without harming performance of a producer, the environment including a hierarchy of computer memory that includes one or more caches backed by main memory, the caches controlled by a cache controller, at least one of the caches configured as a write-back cache. Embodiments of the present invention include receiving, by the cache controller, a non-cacheable memory load instruction for data stored at a memory address, the data treated by the producer as cacheable; determining by the cache controller from a cache directory whether the data is cached; if the data is cached, returning the data in the memory address from the write-back cache without affecting the write-back cache's state; and if the data is not cached, returning the data from main memory without affecting the write-back cache's state.
  Type: Application
  Filed: May 2, 2012
  Publication date: August 23, 2012
  Applicant: International Business Machines Corporation
  Inventors: Jon K. Kriegel, Jamie R. Kuesel
- Publication number: 20120210073
  Abstract: An apparatus, method, and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon updating a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and subsequently updating a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller.
  Type: Application
  Filed: February 11, 2011
  Publication date: August 16, 2012
  Applicant: International Business Machines Corporation
  Inventors: Alexandre E. Eichenberger, Alan Gara, Martin Ohmacht, Vijayalakshmi Srinivasan
- Publication number: 20120210072
  Abstract: A method of processing store requests in a data processing system includes enqueuing a store request in a store queue of a cache memory of the data processing system. The store request identifies a target memory block by a target address and specifies store data. While the store request and a barrier request older than the store request are enqueued in the store queue, a read-claim machine of the cache memory is dispatched to acquire coherence ownership of the target memory block of the store request. After coherence ownership of the target memory block is acquired and the barrier request has been retired from the store queue, a cache array of the cache memory is updated with the store data.
  Type: Application
  Filed: April 26, 2012
  Publication date: August 16, 2012
  Applicant: International Business Machines Corporation
  Inventors: Guy L. Guthrie, William J. Starke, Derek E. Williams
- Publication number: 20120210071
  Abstract: A multi-core processor with a shared physical memory is described. In an embodiment, a sending core sends a memory write request to a destination core so that the request may be acted upon by the destination core as if it originated from the destination core. In an example, a data structure is configured in the shared physical memory and mapped to be accessible to the sending and destination cores. In an example, the shared data structure is used as a message channel between the sending and destination cores to carry data using the memory write request. In an embodiment, a notification mechanism is enabled using the shared physical memory in order to notify the destination core of events by updating a notification data structure. In an example, the notification mechanism triggers a notification process at the destination core to inform a receiving process of a notification.
  Type: Application
  Filed: February 11, 2011
  Publication date: August 16, 2012
  Applicant: Microsoft Corporation
  Inventors: Richard John Black, Timothy Harris, Ross Cameron Mcilroy, Karin Strauss
- Patent number: 8244985
  Abstract: Apparatus and methods relating to store operations are disclosed. In one embodiment, a first storage unit is to store data. A second storage unit is to store the data only after it has become detectable by a bus agent. Moreover, the second storage unit may store an index field for each data value to be stored within the second storage unit. Other embodiments are also disclosed.
  Type: Grant
  Filed: January 27, 2009
  Date of Patent: August 14, 2012
  Assignee: Intel Corporation
  Inventors: Vladimir Pentkovksi, Ling Cen, Vivek Garg, Deep Buch, David Zhao
- Patent number: 8239633
  Abstract: A coherence controller in hardware of an apparatus in an example detects conflicts on coherence requests through direct, non-broadcast employment of signatures that: summarize read-sets and write-sets of memory transactions; and provide false positives but no false negatives for the conflicts on the coherence requests. The signatures comprise fixed-size representations of a substantially arbitrary set of addresses for the read-sets and the write-sets of the memory transactions.
  Type: Grant
  Filed: July 9, 2008
  Date of Patent: August 7, 2012
  Assignee: Wisconsin Alumni Research Foundation
  Inventors: David A. Wood, Mark D. Hill, Michael M. Swift, Michael R. Marty, Luke Yen, Kevin E. Moore, Jayaram Bobba, Haris Volos
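A fixed-size signature with "false positives but no false negatives" is essentially a Bloom filter over addresses: a set bit can never be un-set, so a clear membership test is proof of absence, while a positive test may be spurious. A minimal sketch of the idea, with an illustrative 64-bit signature and hypothetical hash functions:

```python
# Hypothetical sketch: a Bloom-filter-style signature summarizing the
# write-set of a transaction. Sizes and hash functions are illustrative.

class Signature:
    BITS = 64   # fixed-size representation, regardless of set size

    def __init__(self):
        self.bits = 0

    def _hashes(self, addr):
        # Two cheap hash positions; real hardware would use XOR trees.
        return (addr % self.BITS, (addr * 2654435761) % self.BITS)

    def add(self, addr):
        for h in self._hashes(addr):
            self.bits |= 1 << h

    def may_contain(self, addr):
        # True means "possible conflict"; False is a guarantee of no conflict.
        return all((self.bits >> h) & 1 for h in self._hashes(addr))

write_set = Signature()
write_set.add(0x1000)
assert write_set.may_contain(0x1000)       # never a false negative
assert not write_set.may_contain(0x1)      # this address maps to clear bits
```

A coherence request is flagged as a conflict only when it tests positive against a transaction's read- or write-set signature; the one-sided error means spurious aborts are possible but missed conflicts are not.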
- Publication number: 20120198164
  Abstract: This invention is a cache system with a memory attribute register having plural entries. Each entry stores a write-through or a write-back indication for a corresponding memory address range. On a write to cached data, the cache consults the memory attribute register for the corresponding address range. Writes to addresses in regions marked as write-through always update all levels of the memory hierarchy. Writes to addresses in regions marked as write-back update only the first cache level that can service the write. The memory attribute register is preferably a memory-mapped control register writable by the central processing unit.
  Type: Application
  Filed: September 28, 2011
  Publication date: August 2, 2012
  Applicant: Texas Instruments Incorporated
  Inventors: Raguram Damodaran, Abhijeet Ashok Chachad, Naveen Bhoria
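The lookup this abstract describes (map a write address to its range's write-through or write-back attribute) reduces to a range match per register entry. A sketch under assumed names and an assumed linear range layout; a real implementation would match on address bits, not a list scan:

```python
# Hypothetical sketch of a memory attribute register: each entry maps an
# address range to a write policy. WT writes update every level of the
# hierarchy; WB writes stop at the first cache level that can service them.

WRITE_THROUGH, WRITE_BACK = "WT", "WB"

class MemoryAttributeRegister:
    def __init__(self):
        self.entries = []   # (start, end, policy) per register entry

    def set_range(self, start, end, policy):
        """Program one entry (modeled as software writing the MMIO register)."""
        self.entries.append((start, end, policy))

    def policy_for(self, addr, default=WRITE_BACK):
        for start, end, policy in self.entries:
            if start <= addr < end:
                return policy
        return default   # unmatched addresses fall back to a default policy

mar = MemoryAttributeRegister()
mar.set_range(0x0000, 0x4000, WRITE_THROUGH)
mar.set_range(0x4000, 0x8000, WRITE_BACK)
assert mar.policy_for(0x1234) == WRITE_THROUGH   # propagate to all levels
assert mar.policy_for(0x5678) == WRITE_BACK      # stop at first servicing cache
```

Making the register memory-mapped, as the abstract prefers, lets ordinary CPU stores reprogram the policy per region without a dedicated instruction.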
-
Publication number: 20120198177Abstract: A data processor is disclosed that definitively determines an effective address being calculated and decoded will be associated with an address range that includes a memory local to a data processor unit, and will disable a cache access based upon a comparison between a portion of a base address and a corresponding portion of an effective address input operand. Access to the local memory can be accomplished through a first port of the local memory when it is definitively determined that the effective address will be associated with an address range. Access to the local memory cannot be accomplished through the first port of the local memory when it is not definitively determined that the effective address will be associated with the address range.Type: ApplicationFiled: January 28, 2011Publication date: August 2, 2012Applicant: FREESCALE SEMICONDUCTOR, INC.Inventor: William C. Moyer
-
Patent number: 8230177Abstract: Systems and methods for efficient handling of store misses. A processor comprises a store queue that stores data for committed store instructions. Coupled to the store queue is a cache responsible for ensuring consistent ordering of store operations for all consumers, which may be accomplished by maintaining a corresponding cache line in an exclusive state before executing a store operation. In response to a first committed store instruction missing in the cache, the store queue is configured to convey to the cache a second entry of the plurality of queue entries as a speculative prefetch instruction. This second entry corresponds to a committed store instruction that follows, in program order, the first committed store instruction of a given thread. If the prefetch instruction misses in the cache, the latency for acquiring a corresponding cache line overlaps with the latency of the first store instruction.Type: GrantFiled: May 28, 2009Date of Patent: July 24, 2012Assignee: Oracle America, Inc.Inventor: Mark A. Luttrell
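The miss-overlap idea can be sketched with a toy store queue. Class and method names here are invented for illustration; the real mechanism operates on hardware queue entries, not Python lists:

```python
class StoreQueue:
    """Committed stores in program order; on a miss by the oldest store,
    the next entry's line is conveyed as a speculative prefetch."""
    def __init__(self, exclusive_lines):
        self.entries = []                 # (addr, value) in program order
        self.cache = exclusive_lines      # addresses held in exclusive state

    def commit(self, addr, value):
        self.entries.append((addr, value))

    def issue_oldest(self):
        addr, _ = self.entries[0]
        requests = [("store", addr)]
        if addr not in self.cache:        # oldest committed store misses
            # Overlap latencies: prefetch the line of the following store
            # so its fill proceeds concurrently with the first miss.
            for next_addr, _ in self.entries[1:2]:
                if next_addr not in self.cache:
                    requests.append(("prefetch", next_addr))
        return requests
```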
-
Patent number: 8225297Abstract: Various technologies and techniques are disclosed for providing software accessible metadata on a cache of a central processing unit. A multiprocessor has at least one central processing unit. The central processing unit has a cache with cache lines that are augmented by cache metadata. The cache metadata includes software-controlled metadata identifiers that allow multiple logical processors to share the cache metadata. The metadata identifiers and cache metadata can then be used to accelerate various operations. For example, parallel computations can be accelerated using cache metadata and metadata identifiers. As another example, nested computations can be accelerated using metadata identifiers and cache metadata. As yet another example, transactional memory applications that include parallelism within transactions or that include nested transactions can be also accelerated using cache metadata and metadata identifiers.Type: GrantFiled: August 6, 2007Date of Patent: July 17, 2012Assignee: Microsoft CorporationInventors: Jan Gray, Timothy L. Harris, James Larus, Burton Smith
-
Publication number: 20120179875Abstract: A method and apparatus for fine-grained filtering in a hardware accelerated software transactional memory system is herein described. A data object, which may have an arbitrary size, is associated with a filter word. The filter word is in a first default state when no access, such as a read, from the data object has occurred during a pendency of a transaction. Upon encountering a first access, such as a first read, from the data object, access barrier operations including an ephemeral/private store operation to set the filter word to a second state are performed. Upon a subsequent/redundant access, such as a second read, the access barrier operations are elided to accelerate the subsequent access, based on the filter word being set to the second state to indicate a previous access occurred.Type: ApplicationFiled: January 10, 2012Publication date: July 12, 2012Inventors: Bratin Saha, Ali-Reza Adl-Tabatabai, Gad Sheaffer, Quinn Jacobson
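The filter-word fast path can be sketched as follows. The two-state encoding and the per-object dictionary are illustrative stand-ins for the hardware filter words; in the patent the second-state store is ephemeral/private, modeled here by clearing the words at transaction end:

```python
UNACCESSED, ACCESSED = 0, 1           # the two filter-word states
filter_words = {}                      # data object id -> filter word
barrier_invocations = []               # records when the full barrier runs

def transactional_read(obj_id, heap):
    """First read of an object runs the full STM read barrier; later
    reads see the filter word already set and elide the barrier."""
    if filter_words.get(obj_id, UNACCESSED) == UNACCESSED:
        barrier_invocations.append(obj_id)    # full barrier: log/validate
        filter_words[obj_id] = ACCESSED       # ephemeral store sets the filter
    return heap[obj_id]

def end_transaction():
    filter_words.clear()               # filter words revert to the default state
```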
-
Publication number: 20120173824Abstract: Embodiments of the invention provide techniques for managing cache metadata providing a mapping between addresses on a storage medium (e.g., disk storage) and corresponding addresses on a cache device at which data items are stored. In some embodiments, cache metadata may be stored in a hierarchical data structure comprising a plurality of hierarchy levels. When a reboot of the computer is initiated, only a subset of the plurality of hierarchy levels may be loaded to memory, thereby expediting the process of restoring the cache metadata and thus startup operations. Startup may be further expedited by using cache metadata to perform operations associated with reboot.Type: ApplicationFiled: February 2, 2012Publication date: July 5, 2012Applicant: Microsoft CorporationInventors: Mehmet Iyigun, Yevgeniy Bak, Michael Fortin, David Fields, Cenk Ergan, Alexander Kirshenbaum
-
Patent number: 8214602Abstract: In one embodiment, a processor comprises a data cache and a load/store unit (LSU). The LSU comprises a queue and a control unit, and each entry in the queue is assigned to a different load that has accessed the data cache but has not retired. The control unit is configured to update the data cache hit status of each load represented in the queue as a content of the data cache changes. The control unit is configured to detect a snoop hit on a first load in a first entry of the queue responsive to: the snoop index matching a load index stored in the first entry, the data cache hit status of the first load indicating hit, the data cache detecting a snoop hit for the snoop operation, and a load way stored in the first entry matching a first way of the data cache in which the snoop operation is a hit.Type: GrantFiled: June 23, 2008Date of Patent: July 3, 2012Assignee: Advanced Micro Devices, Inc.Inventors: Ashutosh S. Dhodapkar, Michael G. Butler
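The four-way conjunction the abstract enumerates for detecting a snoop hit on a queued load can be written directly. The dictionary keys are illustrative names for the queue-entry fields:

```python
def snoop_hits_load(entry, snoop_index, snoop_hit_way, cache_detects_hit):
    """True only when all four conditions from the abstract hold for a
    queued load that has accessed the cache but not yet retired."""
    return (entry["load_index"] == snoop_index       # snoop index matches
            and entry["hit_status"] == "hit"         # load's cached status is hit
            and cache_detects_hit                    # snoop hits the data cache
            and entry["load_way"] == snoop_hit_way)  # same way of the cache
```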
-
Publication number: 20120166735Abstract: Technologies are generally described for a system for sending a data block stored in a cache. In some examples described herein, a system may comprise a first processor in a first tile. The first processor is effective to generate a request for a data block, the request including a destination identifier identifying a destination tile for the data block, the destination tile being distinct from the first tile. Some example systems may further comprise a second tile effective to receive the request, the second tile effective to determine a data tile including the data block, the second tile further effective to send the request to the data tile. Some example systems may still further comprise a data tile effective to receive the request from the second tile, the data tile effective to send the data block to the destination tile.Type: ApplicationFiled: March 2, 2012Publication date: June 28, 2012Applicant: EMPIRE TECHNOLOGY DEVELOPMENT LLCInventor: Yan Solihin
-
Publication number: 20120159080Abstract: A method and apparatus for utilizing a higher-level cache as a neighbor cache directory in a multi-processor system are provided. In the method and apparatus, when the data field of a portion or all of the cache is unused, a remaining portion of the cache is repurposed for usage as neighbor cache directory. The neighbor cache provides a pointer to another cache in the multi-processor system storing memory data. The neighbor cache directory can be searched in the same manner as a data cache.Type: ApplicationFiled: December 15, 2010Publication date: June 21, 2012Applicant: ADVANCED MICRO DEVICES, INC.Inventors: Greggory D. Donley, William A. Hughes, Kevin M. Lepak, Vydhyanathan Kalyanasundharam, Benjamin Tsien
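The repurposing idea above can be sketched as a cache whose unused entries hold pointers to a neighboring cache instead of data, searchable the same way as a data lookup. The class and the string cache identifiers are invented for illustration:

```python
class RepurposedCache:
    """When a line's data field is unused, the entry is repurposed as a
    neighbor-cache directory record: a pointer naming which other cache
    in the multi-processor system holds the memory data."""
    def __init__(self):
        self.data = {}        # addr -> value (normal cached data)
        self.neighbors = {}   # addr -> id of the cache holding the data

    def lookup(self, addr):
        # Searched in the same manner as a data cache: one tag probe
        # yields either data, a neighbor pointer, or a miss.
        if addr in self.data:
            return ("data", self.data[addr])
        if addr in self.neighbors:
            return ("neighbor", self.neighbors[addr])
        return ("miss", None)
```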
-
Publication number: 20120159083Abstract: Systems and methods for performing memory transactions are described. In an embodiment, a system comprises a processor configured to perform an action in response to a transaction indicative of a request originated by a hardware subsystem. A logic circuit is configured to receive the transaction. In response to identifying a specific characteristic of the transaction, the logic circuit splits the transaction into two or more other transactions. The two or more other transactions enable the processor to satisfy the request without performing the action. The system also includes an interface circuit configured to receive the request originated by the hardware subsystem and provide the transaction to the logic circuit. In some embodiments, a system may be implemented as a system-on-a-chip (SoC). Devices suitable for using these systems include, for example, desktop and laptop computers, tablets, network appliances, mobile phones, personal digital assistants, e-book readers, televisions, and game consoles.Type: ApplicationFiled: December 17, 2010Publication date: June 21, 2012Inventors: Deniz Balkan, Gurjeet S. Saund
-
Patent number: 8205046Abstract: A system and method for maintaining coherency in a symmetric multiprocessing (SMP) system are disclosed. Briefly described, in architecture, one exemplary embodiment comprises a first crossbar coupled to a plurality of local processors; a second crossbar coupled to at least one remote processor; and at least one crossbar directory that tracks access of information by a remote processor in a symmetric multiprocessing (SMP) system, the remote processor in communication with at least one of the local processors via the crossbars, such that a most current location of the information can be determined by the crossbar directory.Type: GrantFiled: January 31, 2005Date of Patent: June 19, 2012Assignee: Hewlett-Packard Development Company, L.P.Inventors: Mark Shaw, Stuart Allen Berke
-
Patent number: 8200909Abstract: A method and apparatus for accelerating a software transactional memory (STM) system is described herein. Annotation fields are associated with lines of a transactional memory. An annotation field associated with a line of the transactional memory is initialized to a first value upon starting a transaction. In response to encountering a read operation in the transaction, the annotation field is checked. If the annotation field includes the first value, the read is serviced from the line of the transactional memory without having to search an additional write space. A second and third value in the annotation field potentially indicate whether a read operation missed the transactional memory or a tentative value is stored in a write space. Additionally, an extra bit in the annotation field may be utilized to indicate whether previous read operations have been logged, allowing subsequent redundant read logging to be reduced.Type: GrantFiled: April 26, 2011Date of Patent: June 12, 2012Inventors: Bratin Saha, Ali-Reza Adl-Tabatabai, Quinn Jacobson
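The three annotation states can be sketched as a dispatch on read. The integer encodings and the fallback path are assumptions for illustration; the patent's slow path would invoke the full STM machinery rather than a plain memory read:

```python
IN_CACHE, MISSED, TENTATIVE = 0, 1, 2     # assumed annotation encodings

def stm_read(line, annotations, memory, write_space):
    """Service a transactional read using the line's annotation field."""
    a = annotations.get(line, MISSED)
    if a == IN_CACHE:
        return memory[line]        # fast path: skip the write-space search
    if a == TENTATIVE:
        return write_space[line]   # transaction's own buffered value
    # Slow path: the read missed the transactional memory; fall back to
    # the full STM lookup (sketched here as a plain memory read).
    return memory[line]
```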
-
Patent number: 8195890Abstract: The present invention is a protocol for maintaining cache consistency between multiprocessors within a tightly coupled system. A distributed directory is maintained within the data-sharing processors, so that copies can be invalidated when modified. All transfers are event driven, rather than polled, to reduce bus-bandwidth consumption. Deadlocks are avoided by placing to-be-executed command codes in the returned response packets when the request-forwarding queues are full or not present.Type: GrantFiled: August 22, 2007Date of Patent: June 5, 2012Assignee: Sawyer Law Group, P.C.Inventor: David Vernon James
-
Publication number: 20120137081Abstract: Systems and methods for management of a cache are disclosed. In general, embodiments described herein store access counts in file system metadata associated with files in the cache. By encoding access counts in the file system metadata, file I/O operations are reduced. Preferably, the access count is encoded in an access count timestamp in the file system metadata. The access counts can be decoded based on the difference between the access count timestamp and a base time value, with larger differences reflecting a larger access count. The cache can be aged by advancing the base time value, thereby causing the access count for a file to drop. The base time value can also be stored in file system metadata, thereby reducing file I/O operations when performing aging.Type: ApplicationFiled: November 30, 2010Publication date: May 31, 2012Inventor: James C. Shea
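The encode/decode/aging cycle can be sketched as follows. A simple linear encoding (base time plus count) is assumed here for illustration; the patent requires only that larger differences from the base time reflect larger counts:

```python
def encode_count(count, base_time):
    """Store an access count as a timestamp: base time plus the count
    (linear encoding assumed for illustration)."""
    return base_time + count

def decode_count(stamp, base_time):
    """Recover the count as the distance from the base time; advancing
    the base time (aging) makes every file's decoded count drop."""
    return max(0, stamp - base_time)

base = 1_000_000          # base time value, also kept in file-system metadata
stamp = encode_count(5, base)
```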
-
Patent number: 8190820Abstract: In one embodiment, the present invention includes a directory to aid in maintaining control of a cache coherency protocol. The directory can be coupled to multiple caching agents via an interconnect, and be configured to store entries associated with cache lines. The directory also includes logic to determine a time delay before the directory can send a concurrent snoop request. Other embodiments are described and claimed.Type: GrantFiled: June 13, 2008Date of Patent: May 29, 2012Assignee: Intel CorporationInventors: Hariharan Thantry, Akhilesh Kumar, Seungjoon Park