Cache Consistency Protocols (epo) Patents (Class 711/E12.026)
  • Patent number: 11509531
    Abstract: A command and response messaging mechanism for use between nodes of a homogeneous data grid can allow a configuration state to be quickly provisioned to the nodes of a cluster at run time for an application running on the data grid. For example, a processing device of a node can receive a global configuration state from a peer node in the grid network. The processing device can apply common values for symmetrical attributes from the global configuration state to a local configuration. The processing device can also apply individual node values for asymmetrical attributes from the global configuration state to the local configuration. The processing device can then run the application on the local node using the local configuration.
    Type: Grant
    Filed: October 30, 2018
    Date of Patent: November 22, 2022
    Assignee: RED HAT, INC.
    Inventor: Tristan Tarrant
  • Patent number: 10949251
    Abstract: Embodiments described herein provide a system, method, and apparatus to accelerate reduce operations in a graphics processor. One embodiment provides an apparatus including one or more processors, the one or more processors including a first logic unit to perform a merged write, barrier, and read operation in response to a barrier synchronization request from a set of threads in a work group, synchronize the set of threads, and report a result of an operation specified in association with the barrier synchronization request.
    Type: Grant
    Filed: April 1, 2016
    Date of Patent: March 16, 2021
    Assignee: INTEL CORPORATION
    Inventors: Yong Jiang, Yuanyuan Li, Jianghong Du, Kuilin Chen, Thomas A. Tetzlaff
  • Patent number: 10817361
    Abstract: A technique includes receiving an alert indicator in a distributed computer system that includes a plurality of computing nodes coupled together by cluster interconnection fabric. The alert indicator indicates detection of a fault in a first computing node of the plurality of computing nodes. The technique indicates regulating communication between the first computing node and at least one of the other computing nodes in response to the alert indicator to contain error propagation due to the fault within the first computing node.
    Type: Grant
    Filed: May 7, 2018
    Date of Patent: October 27, 2020
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: Gregg B Lesartre, Dale C Morris, Russ W Herrell, Blaine D Gaither
  • Patent number: 10572252
    Abstract: A vector processor is disclosed including a variety of variable-length instructions. Computer-implemented methods are disclosed for efficiently carrying out a variety of operations in a time-conscious, memory-efficient, and power-efficient manner. Methods for more efficiently managing a buffer by controlling the threshold based on the length of delay line instructions are disclosed. Methods for disposing multi-type and multi-size operations in hardware are disclosed. Methods for condensing look-up tables are disclosed. Methods for in-line alteration of variables are disclosed.
    Type: Grant
    Filed: December 16, 2016
    Date of Patent: February 25, 2020
    Assignee: Movidius Limited
    Inventors: Brendan Barry, Fergal Connor, Martin O'Riordan, David Moloney, Sean Power
  • Patent number: 10552352
    Abstract: Methods and apparatus for a synchronized multi-directional transfer on an inter-processor communication (IPC) link. In one embodiment, the synchronized multi-directional transfer utilizes one or more buffers which are configured to accumulate data during a first state. The one or more buffers are further configured to transfer the accumulated data during a second state. Data is accumulated during a low power state where one or more processors are inactive, and the data transfer occurs during an operational state where the processors are active. Additionally, in some variants, the data transfer may be performed for currently available transfer resources, and halted until additional transfer resources are made available. In still other variants, one or more of the independently operable processors may execute traffic monitoring processes so as to optimize data throughput of the IPC link.
    Type: Grant
    Filed: August 6, 2018
    Date of Patent: February 4, 2020
    Assignee: Apple Inc.
    Inventors: Karan Sanghi, Vladislav Petkov, Radha Kumar Pulyala, Saurabh Garg, Haining Zhang
  • Patent number: 10331532
    Abstract: Aspects disclosed herein relate to periodic non-intrusive diagnosis of lockstep systems. An exemplary method includes comparing execution of a program on a first processing system of the plurality of processing systems and execution of the program on a second processing system of the plurality of processing systems using a first comparator circuit, comparing the execution of the program on the first processing system and the execution of the program on the second processing system using a second comparator circuit, and running a diagnosis program on the second comparator circuit while the comparing using the first comparator circuit is ongoing.
    Type: Grant
    Filed: January 19, 2017
    Date of Patent: June 25, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Kapil Bansal, Kailash Digari, Rahul Gulati
  • Patent number: 9712396
    Abstract: In some aspects, the disclosure is directed to methods and systems for topology configuration of an array of packet processing elements via a topology configuration packet. Each processing element may include input packet busses from a first plurality of neighboring processing elements and output packet busses to a second plurality of neighboring processing elements. Each processing element may receive the configuration packet from one of the first plurality of neighboring elements, set its own topology configuration register according to predetermined values within the packet, and forward the packet out all of its outputs, in the same manner as a standard packet.
    Type: Grant
    Filed: May 22, 2015
    Date of Patent: July 18, 2017
    Assignee: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD.
    Inventors: Michael Assa, Daniel Shterman
  • Patent number: 9575676
    Abstract: In accordance with one example, a method for comparing data units is disclosed comprising generating a first digest representing a first data unit stored in a first memory. A first encoded value is generated based, at least in part, on the first digest and a predetermined value. A second digest representing a second data unit stored in a second memory different from the first memory, is generated. A second encoded value is derived based, at least in part, on the second digest and the predetermined value. It is determined whether the first data unit and the second data unit are the same based, at least in part, on the first digest, the first predetermined value, the first encoded value, and the second digest, by first processor. If the second data unit is not the same as the first data unit, the first data unit is stored in the second memory.
    Type: Grant
    Filed: March 7, 2016
    Date of Patent: February 21, 2017
    Assignee: FalconStor, Inc.
    Inventors: Wai Lam, Ronald S. Niles, Xiaowei Li
  • Patent number: 9465742
    Abstract: The barrier-aware bridge tracks all outstanding transactions from the attached master. When a barrier transaction is sent from the master, it is tracked by the bridge, along with a snapshot of the current list of outstanding transactions, in a separate barrier tracking FIFO. Each barrier is separately tracked with whatever transactions that are outstanding at that time. As outstanding transaction responses are sent back to the master, their tracking information is simultaneously cleared from every barrier FIFO entry.
    Type: Grant
    Filed: October 17, 2013
    Date of Patent: October 11, 2016
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Daniel B Wu, Kai Chirca
  • Patent number: 9015446
    Abstract: A method for providing a first processor access to a memory associated with a second processor. The method includes receiving a first address map from the first processor that includes an MMIO aperture for a NUMA device, receiving a second address map from a second processor that includes MMIO apertures for hardware devices that the second processor is configured to access, and generating a global address map by combining the first and second address maps. The method further includes receiving an access request transmitted from the first processor to the NUMA device, generating a memory access request based on the first access request and a translation table that maps a first address associated with the first access request into a second address associated with the memory associated with the second processor, and routing the memory access request to the memory based on the global address map.
    Type: Grant
    Filed: December 10, 2008
    Date of Patent: April 21, 2015
    Assignee: NVIDIA Corporation
    Inventors: Michael Brian Cox, Brad W. Simeral
  • Patent number: 9009416
    Abstract: A method, computer program product, and computing system for reclassifying a first assigned cache portion associated with a first machine as a public cache portion associated with the first machine and at least one additional machine after the occurrence of a reclassifying event. The public cache portion includes a plurality of pieces of content received by the first machine. A content identifier for each of the plurality of pieces of content included within the public cache portion is compared with content identifiers for pieces of content included within a portion of a data array associated with the at least one additional machine to generate a list of matching data portions. The list of matching data portions is provided to at least one additional assigned cache portion within the cache system that is associated with the at least one additional machine.
    Type: Grant
    Filed: December 30, 2011
    Date of Patent: April 14, 2015
    Assignee: EMC Corporation
    Inventors: Philip Derbeko, Anat Eyal, Roy E. Clark
  • Patent number: 8972663
    Abstract: A method for cache coherence, including: broadcasting, by a requester cache (RC) over a partially-ordered request network (RN), a peer-to-peer (P2P) request for a cacheline to a plurality of slave caches; receiving, by the RC and over the RN while the P2P request is pending, a forwarded request for the cacheline from a gateway; receiving, by the RC and after receiving the forwarded request, a plurality of responses to the P2P request from the plurality of slave caches; setting an intra-processor state of the cacheline in the RC, wherein the intra-processor state also specifies an inter-processor state of the cacheline; and issuing, by the RC, a response to the forwarded request after setting the intra-processor state and after the P2P request is complete; and modifying, by the RC, the intra-processor state in response to issuing the response to the forwarded request.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: March 3, 2015
    Assignee: Oracle International Corporation
    Inventors: Paul N. Loewenstein, Stephen E. Phillips, David Richard Smentek, Connie Wai Mun Cheung, Serena Wing Yee Leung, Damien Walker, Ramaswamy Sivaramakrishnan
  • Patent number: 8972667
    Abstract: A device with an interconnect having a plurality of memory controllers for connecting the plurality of memory controllers. Each memory controller of the plurality of memory controllers is coupled to an allocated memory for storing data. Further, each memory controller of the plurality of memory controllers has one accelerator of a plurality of accelerators for mutually exchanging data over the interconnect.
    Type: Grant
    Filed: June 27, 2012
    Date of Patent: March 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Florian Alexander Auernhammer, Victoria Caparros Cabezas, Andreas Christian Doering, Patricia Maria Sagmeister
  • Patent number: 8949545
    Abstract: A data processing device includes a load/store module to provide an interface between a processor device and a bus. In response to receiving a load or store instruction from the processor device, the load/store module determines a predicted coherency state of a cache line associated with the load or store instruction. Based on the predicted coherency state, the load/store module selects a bus transaction and communicates it to the bus. By selecting the bus transaction based on the predicted cache state, the load/store module does not have to wait for all pending bus transactions to be serviced, providing for greater predictability as to when bus transactions will be communicated to the bus, and allowing the bus behavior to be more easily simulated.
    Type: Grant
    Filed: December 4, 2008
    Date of Patent: February 3, 2015
    Assignee: Freescale Semiconductor, Inc.
    Inventor: John D. Pape
  • Patent number: 8918590
    Abstract: The present invention provides a ring bus type multicore system including one memory, a main memory controller for connecting the memory to a ring bus; and multiple cores connected in the shape of the ring bus, wherein each of the cores further includes a cache interface and a cache controller for controlling or managing the interface, and the cache controller of each of the cores connected in the shape of the ring bus executes a step of snooping data on the request through the cache interface; and when the cache of the core holds the data, a step of controlling the core to receive the request and return the data to the requester core, or, when the cache of the core does not hold the data, the main memory controller executes a step of reading the data from the memory and sending the data to the requester core.
    Type: Grant
    Filed: December 5, 2011
    Date of Patent: December 23, 2014
    Assignee: International Business Machines Corporation
    Inventors: Aya Minami, Yohichi Miwa
  • Patent number: 8904114
    Abstract: Various implementations of shared upper level cache architectures for multi-core processors including a first subset of processor cores and a second subset of processor cores and a module configured to copy data from a first shared upper level cache memory to a second shared upper level cache memory are generally disclosed.
    Type: Grant
    Filed: November 24, 2009
    Date of Patent: December 2, 2014
    Assignee: Empire Technology Development LLC
    Inventor: Ezekiel Kruglick
  • Patent number: 8886889
    Abstract: Methods and apparatus are provided for reusing snoop responses and data phase results in a bus controller. A bus controller receives an incoming bus transaction BTR1 corresponding to an incoming cache transaction CTR1 for an entry in at least one cache; issues a snoop request with a cache line address of the incoming bus transaction BTR1 for the entry to a plurality of cache controllers; collects at least one snoop response from the plurality of cache controllers; broadcasts a combined snoop response to the plurality of cache controllers, wherein the combined snoop response is a combination of the snoop responses from the plurality of cache controllers; and broadcasts cache line data from a source cache for the entry during a data phase to the plurality of cache controllers, wherein a subsequent cache transaction CTR2 for the entry is processed based on the broadcast combined snoop response and the broadcast cache line data.
    Type: Grant
    Filed: February 21, 2012
    Date of Patent: November 11, 2014
    Assignee: LSI Corporation
    Inventors: Vidyalakshmi Rajagopalan, Archna Rai, Sharath Kashyap, Anuj Soni
  • Patent number: 8886894
    Abstract: In one embodiment, the present invention includes a method for executing a transactional memory (TM) transaction in a first thread, buffering a block of data in a first buffer of a cache memory of a processor, and acquiring a write monitor on the block to obtain ownership of the block at an encounter time in which data at a location of the block in the first buffer is updated. Other embodiments are described and claimed.
    Type: Grant
    Filed: October 23, 2012
    Date of Patent: November 11, 2014
    Assignee: Intel Corporation
    Inventors: Ali-Reza Adl-Tabatabai, Yang Ni, Bratin Saha, Vadim Bassin, Gad Sheaffer, David Callahan, Jan Gray
  • Patent number: 8874856
    Abstract: A false sharing detecting apparatus for analyzing a multi-thread application, the false sharing detecting apparatus includes an operation set detecting unit configured to detect an operation set having a chance of causing performance degradation due to false sharing, and a probability calculation unit configured to calculate a first probability defined as a probability that the detected operation set is to be executed according to an execution pattern causing performance degradation due to false sharing, and calculate a second probability based on the calculated first probability. The second probability is defined as a probability that performance degradation due to false sharing occurs with respect to an operation included in the detected operation set.
    Type: Grant
    Filed: June 17, 2011
    Date of Patent: October 28, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Dae-Hyun Cho, Sung-Do Moon
  • Patent number: 8868846
    Abstract: Disclosed is a coherent storage system. A network interface device (NIC) receives network storage commands from a host. The NIC may cache the data to/from the storage commands in a solid-state disk. The NIC may respond to future network storage command by supplying the data from the solid-state disk rather than initiating a network transaction. Other NIC's on other hosts may also cache network storage data. These NICs may respond to transactions from the first NIC by supplying data, or changing the state of data in their caches.
    Type: Grant
    Filed: December 29, 2010
    Date of Patent: October 21, 2014
    Assignee: Netapp, Inc.
    Inventor: Robert E. Ober
  • Patent number: 8868837
    Abstract: In a multiprocessor system, with conflict checking implemented in a directory lookup of a shared cache memory, a reader set encoding permits dynamic recordation of read accesses. The reader set encoding includes an indication of a portion of a line read, for instance by indicating boundaries of read accesses. Different encodings may apply to different types of speculative execution.
    Type: Grant
    Filed: January 18, 2011
    Date of Patent: October 21, 2014
    Assignee: International Business Machines Corporation
    Inventors: Alan Gara, Martin Ohmacht
  • Patent number: 8856448
    Abstract: Efficient techniques are described for tracking a potential invalidation of a data cache entry in a data cache for which coherency is required. Coherency information is received that indicates a potential invalidation of a data cache entry. The coherency information in association with the data cache entry is retained to track the potential invalidation to the data cache entry. The retained coherency information is kept separate from state bits that are utilized in cache access operations. An invalidate bit, associated with a data cache entry, may be utilized to represents a potential invalidation of the data cache entry. The invalidate bit is set in response to the coherency information, to track the potential invalidation of the data cache entry. A valid bit associated with the data cache entry is set in response to the active invalidate bit and a memory synchronization command. The set invalidate bit is cleared after the valid bit has been cleared.
    Type: Grant
    Filed: February 19, 2009
    Date of Patent: October 7, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Michael W. Morrow, James Norris Dieffenderfer
  • Patent number: 8856466
    Abstract: In one embodiment, the present invention includes a method for executing a transactional memory (TM) transaction in a first thread, buffering a block of data in a first buffer of a cache memory of a processor, and acquiring a write monitor on the block to obtain ownership of the block at an encounter time in which data at a location of the block in the first buffer is updated. Other embodiments are described and claimed.
    Type: Grant
    Filed: October 23, 2012
    Date of Patent: October 7, 2014
    Assignee: Intel Corporation
    Inventors: Ali-Reza Adl-Tabatabai, Yang Ni, Bratin Saha, Vadim Bassin, Gad Sheaffer, David Callahan, Jan Gray
  • Patent number: 8806144
    Abstract: A flash storage device includes a first memory, a flash memory comprising a plurality of physical blocks, each of the plurality of physical blocks comprising a plurality of physical pages, and a controller. The controller is configured to store, in the first memory, copies of data read from the flash memory, map a logical address in a read request received from a host system to a virtual unit address and a virtual page address, and check a virtual unit cache tag table stored in the first memory based on the virtual unit address. If a hit is found in the virtual unit cache tag table, a virtual page cache tag sub-table stored in the first memory is checked based on the virtual page address, wherein the virtual page cache tag sub-table is associated with the virtual unit address. If a hit is found in the virtual page cache tag sub-table, data stored in the first memory mapped to the hit in the virtual page cache tag sub-table is read in response to the read request received from the host system.
    Type: Grant
    Filed: May 12, 2010
    Date of Patent: August 12, 2014
    Assignee: STEC, Inc.
    Inventors: Po-Jen Hsueh, Richard A. Mataya, Mark Moshayedi
  • Patent number: 8799587
    Abstract: A Region Coherence Array (RCA) having subregions and subregion prefetching for shared-memory multiprocessor systems having a single-level, or a multi-level interconnect hierarchy architecture.
    Type: Grant
    Filed: January 26, 2009
    Date of Patent: August 5, 2014
    Assignee: International Business Machines Corporation
    Inventor: Jason F. Cantin
  • Patent number: 8799582
    Abstract: A method and apparatus for extending cache coherency to hold buffered data to support transactional execution is herein described. A transactional store operation referencing an address associated with a data item is performed in a buffered manner. Here, the coherency state associated with cache lines to hold the data item are transitioned to a buffered state. In response to local requests for the buffered data item, the data item is provided to ensure internal transactional sequential ordering. However, in response to external access requests, a miss response is provided to ensure the transactionally updated data item is not made globally visible until commit. Upon commit, the buffered lines are transitioned to a modified state to make the data item globally visible.
    Type: Grant
    Filed: December 30, 2008
    Date of Patent: August 5, 2014
    Assignee: Intel Corporation
    Inventors: Gad Sheaffer, Shlomo Raikin, Vadim Bassin, Raanan Sade, Ehud Cohen, Oleg Margulis
  • Patent number: 8788761
    Abstract: One embodiment of the present invention sets forth am extension to a cache coherence protocol with two explicit control states, P (private), and R (read-only), that provide explicit program control of cache lines for which the program logic can guarantee correct behavior. In the private state, only the owner of a cache line can access the cache line for read or write operations. In the read-only state, only read operations can be performed on the cache line, thereby disallowing write operations to be performed.
    Type: Grant
    Filed: September 23, 2011
    Date of Patent: July 22, 2014
    Assignee: NVIDIA Corporation
    Inventor: William James Dally
  • Patent number: 8775743
    Abstract: Systems and methods for implementing a distributed shared memory (DSM) in a computer cluster in which an unreliable underlying message passing technology is used, such that the DSM efficiently maintains coherency and reliability. DSM agents residing on different nodes of the cluster process access permission requests of local and remote users on specified data segments via handling procedures, which provide for recovering of lost ownership of a data segment while ensuring exclusive ownership of a data segment among the DSM agents detecting and resolving a no-owner messaging deadlock, pruning of obsolete messages, and recovery of the latest contents of a data segment whose ownership has been lost.
    Type: Grant
    Filed: July 2, 2012
    Date of Patent: July 8, 2014
    Assignee: International Business Machines Corporation
    Inventors: Lior Aronovich, Ron Asher
  • Patent number: 8762641
    Abstract: A method is described for use when a cache is accessed. Before all valid array entries are validated, a valid array entry is read when a data array entry is accessed. If the valid array entry is a first array value, access to the cache is treated as being invalid and the data array entry is reloaded. If the valid array entry is a second array value, a tag array entry is compared with an address to determine if the data array entry is valid or invalid. A valid control register contains a first control value before all valid array entries are validated and a second control value after all valid array entries are validated. After the second control value is established, reads of the valid array are disabled and the tag array entry is compared with the address to determine if a data array entry is valid or invalid.
    Type: Grant
    Filed: March 12, 2009
    Date of Patent: June 24, 2014
    Assignee: Qualcomm Incorporated
    Inventor: Arthur Joseph Hoane, Jr.
  • Patent number: 8762651
    Abstract: Maintaining cache coherence in a multi-node, symmetric multiprocessing computer, the computer composed of a plurality of compute nodes, including, broadcasting upon a cache miss by the first compute node to other compute nodes a request for the cache line; if at least two of the compute nodes has a correct copy of the cache line, selecting which compute node is to transmit the correct copy of the cache line to the first node, and transmitting from the selected compute node to the first node the correct copy of the cache line; and updating by each node the state of the cache line in each node, in dependence upon one or more of the states of the cache line in all the nodes.
    Type: Grant
    Filed: June 23, 2010
    Date of Patent: June 24, 2014
    Assignee: International Business Machines Corporation
    Inventors: Michael A. Blake, Garrett M. Drapala, Pak-Kin Mak, Vesselina K. Papazova, Craig R. Walters
  • Patent number: 8756377
    Abstract: An apparatus for storing data that is being processed is disclosed. The apparatus comprises: a cache associated with a processor and for storing a local copy of data items stored in a memory for use by the processor, monitoring circuitry associated with the cache for monitoring write transaction requests to the memory initiated by a further device, the further device being configured not to store data in the cache. The monitoring circuitry is responsive to detecting a write transaction request to write a data item, a local copy of which is stored in the cache, to block a write acknowledge signal transmitted from the memory to the further device indicating the write has completed and to invalidate the stored local copy in the cache and on completion of the invalidation to send the write acknowledge signal to the further device.
    Type: Grant
    Filed: February 2, 2010
    Date of Patent: June 17, 2014
    Assignee: ARM Limited
    Inventors: Simon John Craske, Antony John Penton, Loic Pierron, Andrew Christopher Rose
  • Patent number: 8751748
    Abstract: In a parallel processing system with speculative execution, conflict checking occurs in a directory lookup of a cache memory that is shared by all processors. In each case, the same physical memory address will map to the same set of that cache, no matter which processor originated that access. The directory includes a dynamic reader set encoding, indicating what speculative threads have read a particular line. This reader set encoding is used in conflict checking. A bitset encoding is used to specify particular threads that have read the line.
    Type: Grant
    Filed: January 18, 2011
    Date of Patent: June 10, 2014
    Assignee: International Business Machines Corporation
    Inventors: Daniel Ahn, Luis H. Ceze, Alan Gara, Martin Ohmacht, Zhuang Xiaotong
  • Patent number: 8732408
    Abstract: A circuit contains a shared memory (12), that is used by a plurality of processing elements (10) that contain cache circuits (102) for caching data from the shared memory (12). The processing elements perform a plurality of cooperating tasks, each task involving caching data from the shared memory (12) and sending cache message traffic. Consistency between cached data for different tasks is maintained by transmission of cache coherence requests via a communication network. Information from cache coherence requests generated for all of said tasks is buffered. One of the processing elements provides an indication signal indicating a current task stage of at least one of the processing elements. Additional cache message traffic is generated adapted dependent on the indication signal and the buffered information from the cache coherence requests. Thus conditions of cache traffic stress may be created to verify operability of the circuit, or cache message traffic may be delayed to avoid stress.
    Type: Grant
    Filed: October 16, 2008
    Date of Patent: May 20, 2014
    Assignee: Nytell Software LLC
    Inventors: Sainath Karlapalem, Andrei Sergeevich Terechko
  • Patent number: 8732412
    Abstract: Systems and methods for implementing a distributed shared memory (DSM) in a computer cluster in which an unreliable underlying message passing technology is used, such that the DSM efficiently maintains coherency and reliability. DSM agents residing on different nodes of the cluster process access permission requests of local and remote users on specified data segments via handling procedures, which provide for recovering of lost ownership of a data segment while ensuring exclusive ownership of a data segment among the DSM agents detecting and resolving a no-owner messaging deadlock, pruning of obsolete messages, and recovery of the latest contents of a data segment whose ownership has been lost.
    Type: Grant
    Filed: July 2, 2012
    Date of Patent: May 20, 2014
    Assignee: International Business Machines Corporation
    Inventors: Lior Aronovich, Ron Asher
  • Patent number: 8725958
    Abstract: The present invention provides a data processor capable of reducing power consumption at the time of execution of a spin wait loop for a spinlock. A CPU executes a weighted load instruction at the time of performing a spinlock process and outputs a spin wait request to a corresponding cache memory. When the spin wait request is received from the CPU, the cache memory temporarily stops outputting an acknowledge response to a read request from the CPU until a predetermined condition (snoop write hit, interrupt request, or lapse of predetermined time) is satisfied. Therefore, pipeline execution of the CPU is stalled and the operation of the CPU and the cache memory can be temporarily stopped, and power consumption at the time of executing a spin wait loop can be reduced.
    Type: Grant
    Filed: January 19, 2011
    Date of Patent: May 13, 2014
    Assignee: Renesas Electronics Corporation
    Inventor: Hirokazu Takata
  • Publication number: 20140129782
    Abstract: The invention provides a system with storage cache with high bandwidth and low latency to the server, and coherence for the contents of multiple memory caches, wherein locally managing a storage cache situated on a server is combined with a means for globally managing the coherency of storage caches of a number of servers. The local cache manager delivers very high performance and low latency for write transactions that hit the local cache in the Modified or Exclusive state and for read transactions that hit the local cache in the Modified, Exclusive or Shared states. The global coherency manager enables many servers connected via a network to share the contents of their local caches, providing application transparency by maintaining a directory with an entry for each storage block that indicates which servers have that block in the shared state or which server has that block in the modified state.
    Type: Application
    Filed: November 4, 2012
    Publication date: May 8, 2014
    Inventor: Robert Quinn
  • Publication number: 20140122805
    Abstract: Embodiments related to selecting a runahead poison policy from a plurality of runahead poison policies during microprocessor operation are provided. The example method includes causing the microprocessor to enter runahead upon detection of a runahead event and implementing a first runahead poison policy selected from a plurality of runahead poison policies operative to manage runahead poison injection during runahead. The example method also includes during microprocessor operation, selecting a second runahead poison policy operative to manage runahead poison injection differently from the first runahead poison policy.
    Type: Application
    Filed: October 26, 2012
    Publication date: May 1, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Magnus Ekman, James van Zoeren, Paul Serris
  • Patent number: 8713251
    Abstract: A disk array device that can detect the successful completion of data overwrite/update at high speed only by checking a UDT is provided. When a DIF is used as a verification code appended to data, check information that detects the successful completion of overwrite is defined in the UDT, in addition to address information that detects positional errors. Upon request of overwrite/update of data stored in a cache, a check bit of the data in the cache is changed to a value different from a check bit to be appended to new data by a host adapter. Then, data transfer is initiated. Upon completion of the data overwrite, the check bit is changed back to the original value, whereby it is possible to detect the successful completion of overwrite/update (FIG. 8).
    Type: Grant
    Filed: May 27, 2009
    Date of Patent: April 29, 2014
    Assignee: Hitachi, Ltd.
    Inventors: Hideyuki Koseki, Yusuke Nonaka
  • Patent number: 8706970
    Abstract: An apparatus for controlling operation of a cache includes a first command queue, a second command queue and an input controller configured to receive requests having a first command type and a second command type and to assign a first request having the first command type to the first command queue and a second command having the first command type to the second command queue in the event that the first command queue has not received an indication that a first dedicated buffer is available.
    Type: Grant
    Filed: November 7, 2012
    Date of Patent: April 22, 2014
    Assignee: International Business Machines Corporation
    Inventors: Diana L. Orf, Robert J. Sonnelitter, III
  • Publication number: 20140108737
    Abstract: A method to eliminate the delay of a block invalidate operation in a multi CPU environment by overlapping the block invalidate operation with normal CPU accesses, thus making the delay transparent. A range check is performed on each CPU access while a block invalidate operation is in progress, and an access that maps to within the address range of the block invalidate operation will be trated as a cache miss to ensure that the requesting CPU will receive valid data.
    Type: Application
    Filed: October 11, 2012
    Publication date: April 17, 2014
    Applicant: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Naveen Bhoria, Raguram Damodaran, Abhijeet Ashok Chachad
  • Patent number: 8688917
    Abstract: A method and apparatus for monitoring memory accesses in hardware to support transactional execution is herein described. Attributes are monitor accesses to data items without regard for detection at physical storage structure granularity, but rather ensuring monitoring at least at data items granularity. As an example, attributes are added to state bits of a cache to enable new cache coherency states. Upon a monitored memory access to a data item, which may be selectively determined, coherency states associated with the data item are updated to a monitored state. As a result, invalidating requests to the data item are detected through combination of the request type and the monitored coherency state of the data item.
    Type: Grant
    Filed: January 20, 2012
    Date of Patent: April 1, 2014
    Assignee: Intel Corporation
    Inventors: Gad Sheaffer, Shlomo Raikin, Vadim Bassin, Raanan Sade, Ehud Cohen, Oleg Margulis
  • Patent number: 8688921
    Abstract: A software transactional memory system is provided with multiple global version counters. The system assigns an affinity to one of the global version counters for each thread that executes transactions. Each thread maintains a local copy of the global version counters for use in validating read accesses of transactions. Each thread uses a corresponding affinitized global version counter to store version numbers of write accesses of executed transactions. The system adaptively changes the affinities of threads when data conflict or global version counter conflict is detected between threads.
    Type: Grant
    Filed: March 3, 2009
    Date of Patent: April 1, 2014
    Assignee: Microsoft Corporation
    Inventor: Yosseff Levanoni
  • Publication number: 20140089600
    Abstract: Methods and apparatuses for utilizing a data pending state for cache misses in a system cache. To reduce the size of a miss queue that is searched by subsequent misses, a cache line storage location is allocated in the system cache for a miss and the state of the cache line storage location is set to data pending. A subsequent request that hits to the cache line storage location will detect the data pending state and as a result, the subsequent request will be sent to a replay buffer. When the fill for the original miss comes back from external memory, the state of the cache line storage location is updated to a clean state. Then, the request stored in the replay buffer is reactivated and allowed to complete its access to the cache line storage location.
    Type: Application
    Filed: September 27, 2012
    Publication date: March 27, 2014
    Applicant: APPLE INC.
    Inventors: Sukalpa Biswas, Shinye Shiu, James B. Keller
  • Patent number: 8667226
    Abstract: A data processing system (10) includes a first master (14) and a second master (16 or 22). The first master includes a cache (28) and snoop queue circuitry (44, 52, 54) having a snoop request queue (44) which stores snoop requests. The snoop queue circuitry receives snoop requests for storage into the snoop request queue and provides snoop requests from the snoop request queue to the cache, and the snoop queue circuitry provides a ready indicator indicating whether the snoop request queue can store more snoop requests. The second master includes outgoing transaction control circuitry (72) which controls initiation of outgoing transactions to a system interconnect.
    Type: Grant
    Filed: March 24, 2008
    Date of Patent: March 4, 2014
    Assignee: Freescale Semiconductor, Inc.
    Inventor: William C. Moyer
  • Publication number: 20140059298
    Abstract: In one embodiment, a method performed by one or more computing devices includes receiving at a host cache, a first request to prepare a volume of the host cache for creating a snapshot of a cached logical unit number (LUN), the request indicating that a snapshot of the cached LUN will be taken, preparing, in response to the first request, the volume of the host cache for creating the snapshot of the cached LUN depending on a mode of the host cache, receiving, at the host cache, a second request to create the snapshot of the cached LUN, and in response to the second request, creating, at the host cache, the snapshot of the cached LUN.
    Type: Application
    Filed: August 24, 2012
    Publication date: February 27, 2014
    Applicant: DELL PRODUCTS L.P.
    Inventors: Marc David Olin, Michael James Klemm
  • Publication number: 20140052931
    Abstract: A method for controlling a memory scrubbing rate based on content of the status bit of a tag array of a cache memory. More specifically, the tag array of a cache memory is scrubbed at smaller interval than the scrubbing rate of the storage arrays of the cache. This increased scrubbing rate is in appreciation for the importance of maintaining integrity of tag data. Based on the content of the status bit of the tag array which indicates modified, the corresponding data entry in the cache storage array is scrubbed accordingly. If the modified bit is set, then the entry in the storage array is scrubbed after processing the tag entry. If the modified bit is not set, then the storage array is scrubbed at a predetermined scrubbing interval.
    Type: Application
    Filed: August 17, 2012
    Publication date: February 20, 2014
    Inventors: Ravindraraj Ramaraju, William C. Moyer, Andrew C. Russell
  • Publication number: 20140040562
    Abstract: The disclosed embodiments provide a system that uses broadcast-based TLB-sharing techniques to reduce address-translation latency in a shared-memory multiprocessor system with two or more nodes that are connected by an electrical interconnect. During operation, a first node receives a memory operation that includes a virtual address. Upon determining that one or more TLB levels of the first node will miss for the virtual address, the first node uses the electrical interconnect to broadcast a TLB request to one or more additional nodes of the shared-memory multiprocessor in parallel with scheduling a speculative page-table walk for the virtual address. If the first node receives a TLB entry from another node of the shared-memory multiprocessor via the electrical interconnect in response to the TLB request, the first node cancels the speculative page-table walk. Otherwise, if no response is received, the first node instead waits for the completion of the page-table walk.
    Type: Application
    Filed: August 2, 2012
    Publication date: February 6, 2014
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Pranay Koka, David A. Munday, Michael O. McCracken, Herbert D. Schwetman, JR.
  • Patent number: 8645632
    Abstract: Embodiments of the present invention provide a system that performs a speculative writestream transaction. The system starts by receiving, at a home node, a writestream ordered (WSO) request to start a WSO transaction from a processing subsystem. The WSO request identifies a cache line to be written during the WSO transaction. The system then sends an acknowledge signal to the processing subsystem to enable the processing subsystem to proceed with the WSO transaction. During the WSO transaction, the system receives a second WSO request to start a WSO transaction. The second WSO request identifies the same cache line as to be written during the subsequent WSO transaction. In response to receiving the second WSO request, the system sends an abort signal to cause the processing subsystem to abort the WSO transaction.
    Type: Grant
    Filed: February 4, 2009
    Date of Patent: February 4, 2014
    Assignee: Oracle America, Inc.
    Inventors: Robert E. Cypher, Haakan E. Zeffer, Anders Landin
  • Publication number: 20140032858
    Abstract: Methods and apparatus are provided for cache line sharing among cache controllers. A cache comprises a plurality of cache lines; and a cache controller for sharing at least one of the cache lines with one or more additional caches, wherein a given cache line shared by a plurality of caches corresponds to a given set of physical addresses in a main memory. The cache controller optionally maintains an ownership control signal indicating which portions of the at least one cache line are controlled by the cache and a validity control signal indicating whether each portion of the at least one cache line is valid. Each cache line can be in one of a plurality of cache coherence states, including a modified partial state and a shared partial state.
    Type: Application
    Filed: July 25, 2012
    Publication date: January 30, 2014
    Inventors: Vidyalakshmi Rajagopalan, Archna Rai, Anuj Soni, Sharath Kashyap
  • Patent number: 8639887
    Abstract: A mechanism for dynamically altering a request received at a hardware component is provided. The request is received at the hardware component, and the request includes a mode option. It is determined whether an action of the request requires an unavailable resource and it is determined whether the mode option is for the action requiring the unavailable resource. In response to the mode option being for the action requiring the unavailable resource, the action is automatically removed from the request. The request is passed for pipeline arbitration without the action requiring the unavailable resource.
    Type: Grant
    Filed: June 23, 2010
    Date of Patent: January 28, 2014
    Assignee: International Business Machines Corporation
    Inventors: Deanna Postles Dunn Berger, Michael Fee, Kenneth D. Klapproth, Robert J. Sonnelitter, III