Cache Consistency Protocols (EPO) Patents (Class 711/E12.026)
E Subclasses
- Copy directories (EPO) (Class 711/E12.028)
- Associative directories (EPO) (Class 711/E12.029)
- Distributed directories, e.g., linked lists of caches, etc. (EPO) (Class 711/E12.03)
- Limited pointers directories; state-only directories without pointers (EPO) (Class 711/E12.031)
- With concurrent directory accessing, i.e., handling multiple concurrent coherency transactions (EPO) (Class 711/E12.032)
- Patent number: 11509531
  Abstract: A command and response messaging mechanism for use between nodes of a homogeneous data grid can allow a configuration state to be quickly provisioned to the nodes of a cluster at run time for an application running on the data grid. For example, a processing device of a node can receive a global configuration state from a peer node in the grid network. The processing device can apply common values for symmetrical attributes from the global configuration state to a local configuration. The processing device can also apply individual node values for asymmetrical attributes from the global configuration state to the local configuration. The processing device can then run the application on the local node using the local configuration.
  Type: Grant
  Filed: October 30, 2018
  Date of Patent: November 22, 2022
  Assignee: RED HAT, INC.
  Inventor: Tristan Tarrant
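The provisioning step above is essentially a merge of a cluster-wide state into a per-node configuration: common values for symmetrical attributes, node-specific values for asymmetrical ones. The C++ sketch below illustrates only that merge; the type and field names (`GlobalConfigState`, `LocalConfig`, `applyGlobalState`) are invented for illustration and are not the patented implementation.

```cpp
// Minimal sketch (not the patented implementation): a node applies a global
// configuration state received from a peer, taking common values for
// symmetrical attributes and its own per-node values for asymmetrical ones.
#include <iostream>
#include <map>
#include <string>

struct GlobalConfigState {
    std::map<std::string, std::string> symmetrical;                  // same value on every node
    std::map<std::string, std::map<int, std::string>> asymmetrical;  // attribute -> (nodeId -> value)
};

struct LocalConfig {
    std::map<std::string, std::string> values;
};

LocalConfig applyGlobalState(const GlobalConfigState& global, int nodeId) {
    LocalConfig local;
    // Symmetrical attributes: copy the cluster-wide value verbatim.
    for (const auto& [attr, value] : global.symmetrical)
        local.values[attr] = value;
    // Asymmetrical attributes: pick the value recorded for this node, if any.
    for (const auto& [attr, perNode] : global.asymmetrical) {
        auto it = perNode.find(nodeId);
        if (it != perNode.end())
            local.values[attr] = it->second;
    }
    return local;
}

int main() {
    GlobalConfigState g;
    g.symmetrical["cluster-name"] = "grid-a";
    g.asymmetrical["listen-port"] = {{1, "11222"}, {2, "11223"}};
    LocalConfig cfg = applyGlobalState(g, /*nodeId=*/2);
    for (const auto& [k, v] : cfg.values) std::cout << k << " = " << v << "\n";
}
```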
- Patent number: 10949251
  Abstract: Embodiments described herein provide a system, method, and apparatus to accelerate reduce operations in a graphics processor. One embodiment provides an apparatus including one or more processors, the one or more processors including a first logic unit to perform a merged write, barrier, and read operation in response to a barrier synchronization request from a set of threads in a work group, synchronize the set of threads, and report a result of an operation specified in association with the barrier synchronization request.
  Type: Grant
  Filed: April 1, 2016
  Date of Patent: March 16, 2021
  Assignee: INTEL CORPORATION
  Inventors: Yong Jiang, Yuanyuan Li, Jianghong Du, Kuilin Chen, Thomas A. Tetzlaff
- Patent number: 10817361
  Abstract: A technique includes receiving an alert indicator in a distributed computer system that includes a plurality of computing nodes coupled together by cluster interconnection fabric. The alert indicator indicates detection of a fault in a first computing node of the plurality of computing nodes. The technique includes regulating communication between the first computing node and at least one of the other computing nodes in response to the alert indicator to contain error propagation due to the fault within the first computing node.
  Type: Grant
  Filed: May 7, 2018
  Date of Patent: October 27, 2020
  Assignee: Hewlett Packard Enterprise Development LP
  Inventors: Gregg B Lesartre, Dale C Morris, Russ W Herrell, Blaine D Gaither
- Patent number: 10572252
  Abstract: A vector processor is disclosed including a variety of variable-length instructions. Computer-implemented methods are disclosed for efficiently carrying out a variety of operations in a time-conscious, memory-efficient, and power-efficient manner. Methods for more efficiently managing a buffer by controlling the threshold based on the length of delay line instructions are disclosed. Methods for disposing multi-type and multi-size operations in hardware are disclosed. Methods for condensing look-up tables are disclosed. Methods for in-line alteration of variables are disclosed.
  Type: Grant
  Filed: December 16, 2016
  Date of Patent: February 25, 2020
  Assignee: Movidius Limited
  Inventors: Brendan Barry, Fergal Connor, Martin O'Riordan, David Moloney, Sean Power
- Patent number: 10552352
  Abstract: Methods and apparatus for a synchronized multi-directional transfer on an inter-processor communication (IPC) link. In one embodiment, the synchronized multi-directional transfer utilizes one or more buffers which are configured to accumulate data during a first state. The one or more buffers are further configured to transfer the accumulated data during a second state. Data is accumulated during a low power state where one or more processors are inactive, and the data transfer occurs during an operational state where the processors are active. Additionally, in some variants, the data transfer may be performed for currently available transfer resources, and halted until additional transfer resources are made available. In still other variants, one or more of the independently operable processors may execute traffic monitoring processes so as to optimize data throughput of the IPC link.
  Type: Grant
  Filed: August 6, 2018
  Date of Patent: February 4, 2020
  Assignee: Apple Inc.
  Inventors: Karan Sanghi, Vladislav Petkov, Radha Kumar Pulyala, Saurabh Garg, Haining Zhang
- Patent number: 10331532
  Abstract: Aspects disclosed herein relate to periodic non-intrusive diagnosis of lockstep systems. An exemplary method includes comparing execution of a program on a first processing system of the plurality of processing systems and execution of the program on a second processing system of the plurality of processing systems using a first comparator circuit, comparing the execution of the program on the first processing system and the execution of the program on the second processing system using a second comparator circuit, and running a diagnosis program on the second comparator circuit while the comparing using the first comparator circuit is ongoing.
  Type: Grant
  Filed: January 19, 2017
  Date of Patent: June 25, 2019
  Assignee: QUALCOMM Incorporated
  Inventors: Kapil Bansal, Kailash Digari, Rahul Gulati
- Patent number: 9712396
  Abstract: In some aspects, the disclosure is directed to methods and systems for topology configuration of an array of packet processing elements via a topology configuration packet. Each processing element may include input packet busses from a first plurality of neighboring processing elements and output packet busses to a second plurality of neighboring processing elements. Each processing element may receive the configuration packet from one of the first plurality of neighboring elements, set its own topology configuration register according to predetermined values within the packet, and forward the packet out all of its outputs, in the same manner as a standard packet.
  Type: Grant
  Filed: May 22, 2015
  Date of Patent: July 18, 2017
  Assignee: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD.
  Inventors: Michael Assa, Daniel Shterman
- Patent number: 9575676
  Abstract: In accordance with one example, a method for comparing data units is disclosed comprising generating a first digest representing a first data unit stored in a first memory. A first encoded value is generated based, at least in part, on the first digest and a predetermined value. A second digest representing a second data unit stored in a second memory different from the first memory, is generated. A second encoded value is derived based, at least in part, on the second digest and the predetermined value. It is determined whether the first data unit and the second data unit are the same based, at least in part, on the first digest, the first predetermined value, the first encoded value, and the second digest, by a first processor. If the second data unit is not the same as the first data unit, the first data unit is stored in the second memory.
  Type: Grant
  Filed: March 7, 2016
  Date of Patent: February 21, 2017
  Assignee: FalconStor, Inc.
  Inventors: Wai Lam, Ronald S. Niles, Xiaowei Li
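The comparison flow above (digest each data unit, derive an encoded value from a predetermined value, copy the first unit into the second memory if they differ) can be pictured with a small sketch. Here `std::hash` and an XOR stand in for the real digest and encoding, which the abstract does not specify.

```cpp
// Illustrative sketch only: compare two data units by digest and copy the
// first into the second memory if they differ. The digest and "encoding"
// below are toy stand-ins chosen purely for illustration.
#include <cstdint>
#include <functional>
#include <iostream>
#include <string>

using DataUnit = std::string;

std::uint64_t digest(const DataUnit& d) { return std::hash<DataUnit>{}(d); }
std::uint64_t encode(std::uint64_t digestValue, std::uint64_t predetermined) {
    return digestValue ^ predetermined;   // toy encoding
}

// Returns true if the second memory had to be updated.
bool syncIfDifferent(const DataUnit& first, DataUnit& second, std::uint64_t predetermined) {
    std::uint64_t d1 = digest(first);
    std::uint64_t e1 = encode(d1, predetermined);
    std::uint64_t d2 = digest(second);
    std::uint64_t e2 = encode(d2, predetermined);
    bool same = (d1 == d2) && (e1 == e2);
    if (!same) second = first;            // store the first data unit in the second memory
    return !same;
}

int main() {
    DataUnit primary = "block-contents-v2";
    DataUnit replica = "block-contents-v1";
    bool copied = syncIfDifferent(primary, replica, /*predetermined=*/0x5a5a5a5aULL);
    std::cout << (copied ? "replica updated" : "replica already current") << "\n";
}
```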
- Patent number: 9465742
  Abstract: The barrier-aware bridge tracks all outstanding transactions from the attached master. When a barrier transaction is sent from the master, it is tracked by the bridge, along with a snapshot of the current list of outstanding transactions, in a separate barrier tracking FIFO. Each barrier is separately tracked with whatever transactions are outstanding at that time. As outstanding transaction responses are sent back to the master, their tracking information is simultaneously cleared from every barrier FIFO entry.
  Type: Grant
  Filed: October 17, 2013
  Date of Patent: October 11, 2016
  Assignee: TEXAS INSTRUMENTS INCORPORATED
  Inventors: Daniel B Wu, Kai Chirca
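The barrier tracking FIFO above is easy to picture in code: each barrier snapshots the set of transactions outstanding when it arrives, and responses are cleared from every snapshot. The sketch below is a software model of that bookkeeping only; class and method names are assumptions.

```cpp
// Sketch of the idea (names are assumptions, not the patented design): a
// bridge tracks outstanding transaction IDs; each barrier gets a FIFO entry
// holding a snapshot of the IDs outstanding when the barrier arrived. As
// responses come back, the ID is cleared from every barrier entry; a barrier
// at the head of the FIFO can be retired once its snapshot is empty.
#include <deque>
#include <iostream>
#include <set>

class BarrierAwareBridge {
    std::set<int> outstanding_;            // all transactions awaiting a response
    std::deque<std::set<int>> barriers_;   // one snapshot per pending barrier, FIFO order
public:
    void issueTransaction(int id) { outstanding_.insert(id); }
    void issueBarrier()           { barriers_.push_back(outstanding_); }
    void receiveResponse(int id) {
        outstanding_.erase(id);
        for (auto& snapshot : barriers_) snapshot.erase(id);
    }
    // A barrier response can be sent once the oldest snapshot has drained.
    bool headBarrierComplete() const { return !barriers_.empty() && barriers_.front().empty(); }
    void retireHeadBarrier()         { if (headBarrierComplete()) barriers_.pop_front(); }
};

int main() {
    BarrierAwareBridge b;
    b.issueTransaction(1);
    b.issueTransaction(2);
    b.issueBarrier();            // must wait for 1 and 2, not for later traffic
    b.issueTransaction(3);
    b.receiveResponse(1);
    b.receiveResponse(2);
    std::cout << std::boolalpha << b.headBarrierComplete() << "\n";  // true
}
```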
- Patent number: 9015446
  Abstract: A method for providing a first processor access to a memory associated with a second processor. The method includes receiving a first address map from the first processor that includes an MMIO aperture for a NUMA device, receiving a second address map from a second processor that includes MMIO apertures for hardware devices that the second processor is configured to access, and generating a global address map by combining the first and second address maps. The method further includes receiving an access request transmitted from the first processor to the NUMA device, generating a memory access request based on the first access request and a translation table that maps a first address associated with the first access request into a second address associated with the memory associated with the second processor, and routing the memory access request to the memory based on the global address map.
  Type: Grant
  Filed: December 10, 2008
  Date of Patent: April 21, 2015
  Assignee: NVIDIA Corporation
  Inventors: Michael Brian Cox, Brad W. Simeral
- Patent number: 9009416
  Abstract: A method, computer program product, and computing system for reclassifying a first assigned cache portion associated with a first machine as a public cache portion associated with the first machine and at least one additional machine after the occurrence of a reclassifying event. The public cache portion includes a plurality of pieces of content received by the first machine. A content identifier for each of the plurality of pieces of content included within the public cache portion is compared with content identifiers for pieces of content included within a portion of a data array associated with the at least one additional machine to generate a list of matching data portions. The list of matching data portions is provided to at least one additional assigned cache portion within the cache system that is associated with the at least one additional machine.
  Type: Grant
  Filed: December 30, 2011
  Date of Patent: April 14, 2015
  Assignee: EMC Corporation
  Inventors: Philip Derbeko, Anat Eyal, Roy E. Clark
- Patent number: 8972663
  Abstract: A method for cache coherence, including: broadcasting, by a requester cache (RC) over a partially-ordered request network (RN), a peer-to-peer (P2P) request for a cacheline to a plurality of slave caches; receiving, by the RC and over the RN while the P2P request is pending, a forwarded request for the cacheline from a gateway; receiving, by the RC and after receiving the forwarded request, a plurality of responses to the P2P request from the plurality of slave caches; setting an intra-processor state of the cacheline in the RC, wherein the intra-processor state also specifies an inter-processor state of the cacheline; issuing, by the RC, a response to the forwarded request after setting the intra-processor state and after the P2P request is complete; and modifying, by the RC, the intra-processor state in response to issuing the response to the forwarded request.
  Type: Grant
  Filed: March 14, 2013
  Date of Patent: March 3, 2015
  Assignee: Oracle International Corporation
  Inventors: Paul N. Loewenstein, Stephen E. Phillips, David Richard Smentek, Connie Wai Mun Cheung, Serena Wing Yee Leung, Damien Walker, Ramaswamy Sivaramakrishnan
- Patent number: 8972667
  Abstract: A device with an interconnect connecting a plurality of memory controllers. Each memory controller of the plurality of memory controllers is coupled to an allocated memory for storing data. Further, each memory controller of the plurality of memory controllers has one accelerator of a plurality of accelerators for mutually exchanging data over the interconnect.
  Type: Grant
  Filed: June 27, 2012
  Date of Patent: March 3, 2015
  Assignee: International Business Machines Corporation
  Inventors: Florian Alexander Auernhammer, Victoria Caparros Cabezas, Andreas Christian Doering, Patricia Maria Sagmeister
- Patent number: 8949545
  Abstract: A data processing device includes a load/store module to provide an interface between a processor device and a bus. In response to receiving a load or store instruction from the processor device, the load/store module determines a predicted coherency state of a cache line associated with the load or store instruction. Based on the predicted coherency state, the load/store module selects a bus transaction and communicates it to the bus. By selecting the bus transaction based on the predicted cache state, the load/store module does not have to wait for all pending bus transactions to be serviced, providing for greater predictability as to when bus transactions will be communicated to the bus, and allowing the bus behavior to be more easily simulated.
  Type: Grant
  Filed: December 4, 2008
  Date of Patent: February 3, 2015
  Assignee: Freescale Semiconductor, Inc.
  Inventor: John D. Pape
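A sketch of the selection step described above, assuming conventional MESI-style state names and generic bus transaction types (the abstract names neither): the predicted coherency state alone decides which bus transaction to issue.

```cpp
// Rough sketch with assumed MESI states and transaction names: pick a bus
// transaction from a *predicted* coherency state instead of waiting for all
// pending bus activity to resolve the line's actual state.
#include <iostream>

enum class CoherencyState { Modified, Exclusive, Shared, Invalid };
enum class BusTransaction { None, ReadShared, ReadForOwnership, Upgrade };

BusTransaction selectTransaction(bool isStore, CoherencyState predicted) {
    if (!isStore)   // load
        return predicted == CoherencyState::Invalid ? BusTransaction::ReadShared
                                                    : BusTransaction::None;
    switch (predicted) {                           // store
        case CoherencyState::Modified:
        case CoherencyState::Exclusive: return BusTransaction::None;              // already writable
        case CoherencyState::Shared:    return BusTransaction::Upgrade;           // gain ownership
        case CoherencyState::Invalid:   return BusTransaction::ReadForOwnership;  // fetch + own
    }
    return BusTransaction::None;
}

int main() {
    auto t = selectTransaction(/*isStore=*/true, CoherencyState::Shared);
    std::cout << "store to Shared line -> transaction " << static_cast<int>(t) << "\n";
}
```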
- Patent number: 8918590
  Abstract: The present invention provides a ring bus type multicore system including one memory, a main memory controller for connecting the memory to a ring bus, and multiple cores connected in the shape of the ring bus, wherein each of the cores further includes a cache interface and a cache controller for controlling or managing the interface, and the cache controller of each of the cores connected in the shape of the ring bus executes a step of snooping data on the request through the cache interface; and when the cache of the core holds the data, a step of controlling the core to receive the request and return the data to the requester core, or, when the cache of the core does not hold the data, the main memory controller executes a step of reading the data from the memory and sending the data to the requester core.
  Type: Grant
  Filed: December 5, 2011
  Date of Patent: December 23, 2014
  Assignee: International Business Machines Corporation
  Inventors: Aya Minami, Yohichi Miwa
- Patent number: 8904114
  Abstract: Various implementations of shared upper level cache architectures for multi-core processors including a first subset of processor cores and a second subset of processor cores and a module configured to copy data from a first shared upper level cache memory to a second shared upper level cache memory are generally disclosed.
  Type: Grant
  Filed: November 24, 2009
  Date of Patent: December 2, 2014
  Assignee: Empire Technology Development LLC
  Inventor: Ezekiel Kruglick
- Patent number: 8886889
  Abstract: Methods and apparatus are provided for reusing snoop responses and data phase results in a bus controller. A bus controller receives an incoming bus transaction BTR1 corresponding to an incoming cache transaction CTR1 for an entry in at least one cache; issues a snoop request with a cache line address of the incoming bus transaction BTR1 for the entry to a plurality of cache controllers; collects at least one snoop response from the plurality of cache controllers; broadcasts a combined snoop response to the plurality of cache controllers, wherein the combined snoop response is a combination of the snoop responses from the plurality of cache controllers; and broadcasts cache line data from a source cache for the entry during a data phase to the plurality of cache controllers, wherein a subsequent cache transaction CTR2 for the entry is processed based on the broadcast combined snoop response and the broadcast cache line data.
  Type: Grant
  Filed: February 21, 2012
  Date of Patent: November 11, 2014
  Assignee: LSI Corporation
  Inventors: Vidyalakshmi Rajagopalan, Archna Rai, Sharath Kashyap, Anuj Soni
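The combined snoop response can be modeled as a simple reduction over the per-controller responses. The sketch below assumes a three-value response encoding and a precedence rule, neither of which is specified in the abstract.

```cpp
// Sketch under assumed response encodings: a bus controller collects snoop
// responses for a cache-line address from each cache controller and combines
// them (here, by precedence Modified > Shared > Invalid) into one combined
// response broadcast back to all controllers.
#include <algorithm>
#include <iostream>
#include <vector>

enum class SnoopResponse { Invalid = 0, Shared = 1, Modified = 2 };

SnoopResponse combine(const std::vector<SnoopResponse>& responses) {
    SnoopResponse combined = SnoopResponse::Invalid;
    for (SnoopResponse r : responses)
        combined = std::max(combined, r);   // highest-precedence response wins
    return combined;
}

int main() {
    std::vector<SnoopResponse> perController = {
        SnoopResponse::Invalid, SnoopResponse::Shared, SnoopResponse::Invalid };
    SnoopResponse broadcast = combine(perController);
    std::cout << "combined snoop response: " << static_cast<int>(broadcast) << "\n";
    // A later transaction to the same line could reuse 'broadcast' and the
    // broadcast data phase instead of re-snooping every controller.
}
```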
- Patent number: 8886894
  Abstract: In one embodiment, the present invention includes a method for executing a transactional memory (TM) transaction in a first thread, buffering a block of data in a first buffer of a cache memory of a processor, and acquiring a write monitor on the block to obtain ownership of the block at an encounter time in which data at a location of the block in the first buffer is updated. Other embodiments are described and claimed.
  Type: Grant
  Filed: October 23, 2012
  Date of Patent: November 11, 2014
  Assignee: Intel Corporation
  Inventors: Ali-Reza Adl-Tabatabai, Yang Ni, Bratin Saha, Vadim Bassin, Gad Sheaffer, David Callahan, Jan Gray
- Patent number: 8874856
  Abstract: A false sharing detecting apparatus for analyzing a multi-thread application includes an operation set detecting unit configured to detect an operation set having a chance of causing performance degradation due to false sharing, and a probability calculation unit configured to calculate a first probability defined as a probability that the detected operation set is to be executed according to an execution pattern causing performance degradation due to false sharing, and calculate a second probability based on the calculated first probability. The second probability is defined as a probability that performance degradation due to false sharing occurs with respect to an operation included in the detected operation set.
  Type: Grant
  Filed: June 17, 2011
  Date of Patent: October 28, 2014
  Assignee: Samsung Electronics Co., Ltd.
  Inventors: Dae-Hyun Cho, Sung-Do Moon
- Patent number: 8868846
  Abstract: Disclosed is a coherent storage system. A network interface device (NIC) receives network storage commands from a host. The NIC may cache the data to/from the storage commands in a solid-state disk. The NIC may respond to future network storage commands by supplying the data from the solid-state disk rather than initiating a network transaction. Other NICs on other hosts may also cache network storage data. These NICs may respond to transactions from the first NIC by supplying data, or changing the state of data in their caches.
  Type: Grant
  Filed: December 29, 2010
  Date of Patent: October 21, 2014
  Assignee: Netapp, Inc.
  Inventor: Robert E. Ober
- Patent number: 8868837
  Abstract: In a multiprocessor system, with conflict checking implemented in a directory lookup of a shared cache memory, a reader set encoding permits dynamic recordation of read accesses. The reader set encoding includes an indication of a portion of a line read, for instance by indicating boundaries of read accesses. Different encodings may apply to different types of speculative execution.
  Type: Grant
  Filed: January 18, 2011
  Date of Patent: October 21, 2014
  Assignee: International Business Machines Corporation
  Inventors: Alan Gara, Martin Ohmacht
- Patent number: 8856448
  Abstract: Efficient techniques are described for tracking a potential invalidation of a data cache entry in a data cache for which coherency is required. Coherency information is received that indicates a potential invalidation of a data cache entry. The coherency information in association with the data cache entry is retained to track the potential invalidation to the data cache entry. The retained coherency information is kept separate from state bits that are utilized in cache access operations. An invalidate bit, associated with a data cache entry, may be utilized to represent a potential invalidation of the data cache entry. The invalidate bit is set in response to the coherency information, to track the potential invalidation of the data cache entry. A valid bit associated with the data cache entry is set in response to the active invalidate bit and a memory synchronization command. The set invalidate bit is cleared after the valid bit has been cleared.
  Type: Grant
  Filed: February 19, 2009
  Date of Patent: October 7, 2014
  Assignee: QUALCOMM Incorporated
  Inventors: Michael W. Morrow, James Norris Dieffenderfer
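One plausible reading of the invalidate-bit scheme above, modeled in software (the exact set/clear ordering of the valid and invalidate bits in the patent may differ): coherency hints mark entries as potentially invalid without disturbing normal hits, and a memory synchronization later commits those pending invalidations.

```cpp
// Sketch of the general idea, not the exact patented sequence: a per-entry
// invalidate bit records a *potential* invalidation separately from the valid
// bit used on normal lookups; only a later memory-synchronization command
// turns pending invalidations into real ones.
#include <array>
#include <cstdint>
#include <iostream>

struct CacheEntry {
    std::uint64_t tag = 0;
    bool valid = false;       // consulted on normal cache accesses
    bool invalidate = false;  // pending (potential) invalidation, kept separate
};

class DataCache {
    std::array<CacheEntry, 256> entries_{};
public:
    void fill(std::size_t idx, std::uint64_t tag) { entries_[idx] = {tag, true, false}; }
    bool hit(std::size_t idx, std::uint64_t tag) const {
        return entries_[idx].valid && entries_[idx].tag == tag;
    }
    // Coherency information arrives: only mark the entry, don't disturb hits yet.
    void noteCoherencyHint(std::size_t idx) { entries_[idx].invalidate = true; }
    // Memory synchronization: commit pending invalidations.
    void synchronize() {
        for (auto& e : entries_)
            if (e.invalidate) { e.valid = false; e.invalidate = false; }
    }
};

int main() {
    DataCache c;
    c.fill(3, 0xABC);
    c.noteCoherencyHint(3);
    std::cout << c.hit(3, 0xABC) << "\n";  // 1: still hits before the sync
    c.synchronize();
    std::cout << c.hit(3, 0xABC) << "\n";  // 0: invalidation took effect
}
```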
- Patent number: 8856466
  Abstract: In one embodiment, the present invention includes a method for executing a transactional memory (TM) transaction in a first thread, buffering a block of data in a first buffer of a cache memory of a processor, and acquiring a write monitor on the block to obtain ownership of the block at an encounter time in which data at a location of the block in the first buffer is updated. Other embodiments are described and claimed.
  Type: Grant
  Filed: October 23, 2012
  Date of Patent: October 7, 2014
  Assignee: Intel Corporation
  Inventors: Ali-Reza Adl-Tabatabai, Yang Ni, Bratin Saha, Vadim Bassin, Gad Sheaffer, David Callahan, Jan Gray
- Patent number: 8806144
  Abstract: A flash storage device includes a first memory, a flash memory comprising a plurality of physical blocks, each of the plurality of physical blocks comprising a plurality of physical pages, and a controller. The controller is configured to store, in the first memory, copies of data read from the flash memory, map a logical address in a read request received from a host system to a virtual unit address and a virtual page address, and check a virtual unit cache tag table stored in the first memory based on the virtual unit address. If a hit is found in the virtual unit cache tag table, a virtual page cache tag sub-table stored in the first memory is checked based on the virtual page address, wherein the virtual page cache tag sub-table is associated with the virtual unit address. If a hit is found in the virtual page cache tag sub-table, data stored in the first memory mapped to the hit in the virtual page cache tag sub-table is read in response to the read request received from the host system.
  Type: Grant
  Filed: May 12, 2010
  Date of Patent: August 12, 2014
  Assignee: STEC, Inc.
  Inventors: Po-Jen Hsueh, Richard A. Mataya, Mark Moshayedi
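The two-level tag check above (virtual unit table, then a per-unit virtual page sub-table) maps naturally onto nested lookups. The sketch below models that lookup path only; the geometry and container choices are assumptions, not STEC's design.

```cpp
// Simplified sketch: a read's logical address maps to a virtual unit and a
// virtual page; a hit in the virtual-unit tag table selects a per-unit
// virtual-page tag sub-table, and a hit there yields the location of the
// cached copy in the first memory.
#include <cstdint>
#include <iostream>
#include <optional>
#include <unordered_map>

struct CachedLocation { std::uint32_t bufferIndex; };

class FlashReadCache {
    // virtual unit address -> (virtual page address -> cached location)
    std::unordered_map<std::uint32_t,
                       std::unordered_map<std::uint32_t, CachedLocation>> unitTagTable_;
    static constexpr std::uint32_t kPagesPerUnit = 256;   // assumed geometry
public:
    void insert(std::uint64_t logical, CachedLocation loc) {
        unitTagTable_[unit(logical)][page(logical)] = loc;
    }
    std::optional<CachedLocation> lookup(std::uint64_t logical) const {
        auto u = unitTagTable_.find(unit(logical));          // virtual unit tag check
        if (u == unitTagTable_.end()) return std::nullopt;
        auto p = u->second.find(page(logical));              // virtual page tag check
        if (p == u->second.end()) return std::nullopt;
        return p->second;                                    // serve read from first memory
    }
private:
    static std::uint32_t unit(std::uint64_t logical) { return static_cast<std::uint32_t>(logical / kPagesPerUnit); }
    static std::uint32_t page(std::uint64_t logical) { return static_cast<std::uint32_t>(logical % kPagesPerUnit); }
};

int main() {
    FlashReadCache cache;
    cache.insert(/*logical=*/1000, {42});
    if (auto hit = cache.lookup(1000)) std::cout << "hit in buffer slot " << hit->bufferIndex << "\n";
    if (!cache.lookup(2000)) std::cout << "miss -> read from flash\n";
}
```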
- Patent number: 8799582
  Abstract: A method and apparatus for extending cache coherency to hold buffered data to support transactional execution is herein described. A transactional store operation referencing an address associated with a data item is performed in a buffered manner. Here, the coherency state associated with cache lines to hold the data item is transitioned to a buffered state. In response to local requests for the buffered data item, the data item is provided to ensure internal transactional sequential ordering. However, in response to external access requests, a miss response is provided to ensure the transactionally updated data item is not made globally visible until commit. Upon commit, the buffered lines are transitioned to a modified state to make the data item globally visible.
  Type: Grant
  Filed: December 30, 2008
  Date of Patent: August 5, 2014
  Assignee: Intel Corporation
  Inventors: Gad Sheaffer, Shlomo Raikin, Vadim Bassin, Raanan Sade, Ehud Cohen, Oleg Margulis
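A compact way to picture the buffered state above is as an extra MESI-like state with different answers for local reads, external snoops, commit, and abort. The sketch below uses assumed state and function names; it is a behavioral illustration, not Intel's protocol.

```cpp
// Sketch with assumed state names: a transactional store moves a line to a
// Buffered state; local reads see the buffered data, external snoops get a
// miss so the update stays invisible, and commit promotes the line to
// Modified (abort drops it to Invalid).
#include <iostream>

enum class LineState { Invalid, Shared, Exclusive, Modified, Buffered };

struct Line { LineState state = LineState::Invalid; int data = 0; };

void transactionalStore(Line& l, int value) { l.data = value; l.state = LineState::Buffered; }

// Local (same-thread) request: buffered data is visible to keep ordering.
bool localRead(const Line& l, int& out) {
    if (l.state == LineState::Invalid) return false;
    out = l.data;
    return true;
}

// External snoop: report a miss so the uncommitted value is not observed.
bool externalSnoopHit(const Line& l) {
    return l.state != LineState::Invalid && l.state != LineState::Buffered;
}

void commitTx(Line& l) { if (l.state == LineState::Buffered) l.state = LineState::Modified; }
void abortTx(Line& l)  { if (l.state == LineState::Buffered) l.state = LineState::Invalid;  }

int main() {
    Line l;
    transactionalStore(l, 7);
    int v;
    std::cout << "local sees buffered value: " << (localRead(l, v) ? v : -1) << "\n";
    std::cout << "external snoop hit? " << externalSnoopHit(l) << "\n";               // 0 until commit
    commitTx(l);
    std::cout << "external snoop hit after commit? " << externalSnoopHit(l) << "\n";  // 1
}
```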
- Patent number: 8799587
  Abstract: A Region Coherence Array (RCA) having subregions and subregion prefetching for shared-memory multiprocessor systems having a single-level, or a multi-level interconnect hierarchy architecture.
  Type: Grant
  Filed: January 26, 2009
  Date of Patent: August 5, 2014
  Assignee: International Business Machines Corporation
  Inventor: Jason F. Cantin
- Patent number: 8788761
  Abstract: One embodiment of the present invention sets forth an extension to a cache coherence protocol with two explicit control states, P (private) and R (read-only), that provide explicit program control of cache lines for which the program logic can guarantee correct behavior. In the private state, only the owner of a cache line can access the cache line for read or write operations. In the read-only state, only read operations can be performed on the cache line, thereby disallowing write operations to be performed.
  Type: Grant
  Filed: September 23, 2011
  Date of Patent: July 22, 2014
  Assignee: NVIDIA Corporation
  Inventor: William James Dally
- Patent number: 8775743
  Abstract: Systems and methods for implementing a distributed shared memory (DSM) in a computer cluster in which an unreliable underlying message passing technology is used, such that the DSM efficiently maintains coherency and reliability. DSM agents residing on different nodes of the cluster process access permission requests of local and remote users on specified data segments via handling procedures, which provide for recovering of lost ownership of a data segment while ensuring exclusive ownership of a data segment among the DSM agents, detecting and resolving a no-owner messaging deadlock, pruning of obsolete messages, and recovery of the latest contents of a data segment whose ownership has been lost.
  Type: Grant
  Filed: July 2, 2012
  Date of Patent: July 8, 2014
  Assignee: International Business Machines Corporation
  Inventors: Lior Aronovich, Ron Asher
- Patent number: 8762641
  Abstract: A method is described for use when a cache is accessed. Before all valid array entries are validated, a valid array entry is read when a data array entry is accessed. If the valid array entry is a first array value, access to the cache is treated as being invalid and the data array entry is reloaded. If the valid array entry is a second array value, a tag array entry is compared with an address to determine if the data array entry is valid or invalid. A valid control register contains a first control value before all valid array entries are validated and a second control value after all valid array entries are validated. After the second control value is established, reads of the valid array are disabled and the tag array entry is compared with the address to determine if a data array entry is valid or invalid.
  Type: Grant
  Filed: March 12, 2009
  Date of Patent: June 24, 2014
  Assignee: Qualcomm Incorporated
  Inventor: Arthur Joseph Hoane, Jr.
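A software model of the valid-array scheme above, with assumed structure names: while the valid control register holds its first value, every access consults the per-entry valid array; once all entries have been validated it flips, and lookups reduce to a tag compare.

```cpp
// Sketch of one reading of the abstract (field names assumed): the control
// register gates whether the valid array is consulted at all.
#include <array>
#include <cstdint>
#include <iostream>

class Cache {
    static constexpr std::size_t kEntries = 4;
    std::array<std::uint32_t, kEntries> tags_{};
    std::array<bool, kEntries> validArray_{};   // first value = invalid, second = valid
    bool allEntriesValidated_ = false;          // the "valid control register"
public:
    void reload(std::size_t idx, std::uint32_t tag) {
        tags_[idx] = tag;
        validArray_[idx] = true;
        allEntriesValidated_ = true;
        for (bool v : validArray_) allEntriesValidated_ &= v;
    }
    bool access(std::size_t idx, std::uint32_t tag) {
        if (!allEntriesValidated_) {            // first control value: read the valid array
            if (!validArray_[idx]) { reload(idx, tag); return false; }  // treat as invalid
        }
        return tags_[idx] == tag;               // second control value: tag compare only
    }
};

int main() {
    Cache c;
    std::cout << c.access(0, 0x10) << "\n";  // 0: entry not yet validated, reloaded
    std::cout << c.access(0, 0x10) << "\n";  // 1: tag compare now decides
}
```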
- Patent number: 8762651
  Abstract: Maintaining cache coherence in a multi-node, symmetric multiprocessing computer, the computer composed of a plurality of compute nodes, including: broadcasting upon a cache miss by the first compute node to other compute nodes a request for the cache line; if at least two of the compute nodes have a correct copy of the cache line, selecting which compute node is to transmit the correct copy of the cache line to the first node, and transmitting from the selected compute node to the first node the correct copy of the cache line; and updating by each node the state of the cache line in each node, in dependence upon one or more of the states of the cache line in all the nodes.
  Type: Grant
  Filed: June 23, 2010
  Date of Patent: June 24, 2014
  Assignee: International Business Machines Corporation
  Inventors: Michael A. Blake, Garrett M. Drapala, Pak-Kin Mak, Vesselina K. Papazova, Craig R. Walters
- Patent number: 8756377
  Abstract: An apparatus for storing data that is being processed is disclosed. The apparatus comprises a cache associated with a processor for storing a local copy of data items stored in a memory for use by the processor, and monitoring circuitry associated with the cache for monitoring write transaction requests to the memory initiated by a further device, the further device being configured not to store data in the cache. The monitoring circuitry is responsive to detecting a write transaction request to write a data item, a local copy of which is stored in the cache, to block a write acknowledge signal transmitted from the memory to the further device indicating the write has completed and to invalidate the stored local copy in the cache, and on completion of the invalidation to send the write acknowledge signal to the further device.
  Type: Grant
  Filed: February 2, 2010
  Date of Patent: June 17, 2014
  Assignee: ARM Limited
  Inventors: Simon John Craske, Antony John Penton, Loic Pierron, Andrew Christopher Rose
- Patent number: 8751748
  Abstract: In a parallel processing system with speculative execution, conflict checking occurs in a directory lookup of a cache memory that is shared by all processors. In each case, the same physical memory address will map to the same set of that cache, no matter which processor originated that access. The directory includes a dynamic reader set encoding, indicating what speculative threads have read a particular line. This reader set encoding is used in conflict checking. A bitset encoding is used to specify particular threads that have read the line.
  Type: Grant
  Filed: January 18, 2011
  Date of Patent: June 10, 2014
  Assignee: International Business Machines Corporation
  Inventors: Daniel Ahn, Luis H. Ceze, Alan Gara, Martin Ohmacht, Zhuang Xiaotong
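The bitset reader-set encoding described above is straightforward to sketch: one bit per speculative thread in each directory entry, with a write conflicting whenever any other thread's bit is set. The sizes and names below are assumptions, not the actual directory layout.

```cpp
// Sketch: each shared-cache directory entry keeps a bitset of speculative
// thread IDs that have read the line; a write by one thread conflicts with
// any *other* thread recorded in the reader set.
#include <bitset>
#include <cstddef>
#include <iostream>

constexpr std::size_t kMaxSpeculativeThreads = 128;

struct DirectoryEntry {
    std::bitset<kMaxSpeculativeThreads> readers;   // dynamic reader-set encoding

    void recordRead(std::size_t threadId) { readers.set(threadId); }

    // Conflict check on a speculative write: any reader other than the writer?
    bool writeConflicts(std::size_t writerId) const {
        auto others = readers;
        others.reset(writerId);
        return others.any();
    }
    void clear() { readers.reset(); }
};

int main() {
    DirectoryEntry line;
    line.recordRead(3);
    line.recordRead(7);
    std::cout << "write by thread 3 conflicts? " << line.writeConflicts(3) << "\n";  // 1 (thread 7 read it)
    std::cout << "write by thread 9 conflicts? " << line.writeConflicts(9) << "\n";  // 1
    line.clear();
    std::cout << "after commit/clear: " << line.writeConflicts(3) << "\n";           // 0
}
```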
- Patent number: 8732408
  Abstract: A circuit contains a shared memory (12), that is used by a plurality of processing elements (10) that contain cache circuits (102) for caching data from the shared memory (12). The processing elements perform a plurality of cooperating tasks, each task involving caching data from the shared memory (12) and sending cache message traffic. Consistency between cached data for different tasks is maintained by transmission of cache coherence requests via a communication network. Information from cache coherence requests generated for all of said tasks is buffered. One of the processing elements provides an indication signal indicating a current task stage of at least one of the processing elements. Additional cache message traffic is generated adapted dependent on the indication signal and the buffered information from the cache coherence requests. Thus conditions of cache traffic stress may be created to verify operability of the circuit, or cache message traffic may be delayed to avoid stress.
  Type: Grant
  Filed: October 16, 2008
  Date of Patent: May 20, 2014
  Assignee: Nytell Software LLC
  Inventors: Sainath Karlapalem, Andrei Sergeevich Terechko
- Patent number: 8732412
  Abstract: Systems and methods for implementing a distributed shared memory (DSM) in a computer cluster in which an unreliable underlying message passing technology is used, such that the DSM efficiently maintains coherency and reliability. DSM agents residing on different nodes of the cluster process access permission requests of local and remote users on specified data segments via handling procedures, which provide for recovering of lost ownership of a data segment while ensuring exclusive ownership of a data segment among the DSM agents, detecting and resolving a no-owner messaging deadlock, pruning of obsolete messages, and recovery of the latest contents of a data segment whose ownership has been lost.
  Type: Grant
  Filed: July 2, 2012
  Date of Patent: May 20, 2014
  Assignee: International Business Machines Corporation
  Inventors: Lior Aronovich, Ron Asher
- Patent number: 8725958
  Abstract: The present invention provides a data processor capable of reducing power consumption at the time of execution of a spin wait loop for a spinlock. A CPU executes a weighted load instruction at the time of performing a spinlock process and outputs a spin wait request to a corresponding cache memory. When the spin wait request is received from the CPU, the cache memory temporarily stops outputting an acknowledge response to a read request from the CPU until a predetermined condition (snoop write hit, interrupt request, or lapse of predetermined time) is satisfied. Therefore, pipeline execution of the CPU is stalled and the operation of the CPU and the cache memory can be temporarily stopped, and power consumption at the time of executing a spin wait loop can be reduced.
  Type: Grant
  Filed: January 19, 2011
  Date of Patent: May 13, 2014
  Assignee: Renesas Electronics Corporation
  Inventor: Hirokazu Takata
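The weighted load above is a hardware mechanism that stalls the spinning CPU inside the cache until a snoop write hit, interrupt, or timeout; software cannot reproduce that directly. The sketch below is only a rough software analogue of the same intent, backing off between polls of the lock word instead of hammering it.

```cpp
// Software analogue only (not the patented hardware mechanism): a
// test-and-test-and-set spinlock that yields between polls to reduce the
// cost of the spin wait loop.
#include <atomic>
#include <iostream>
#include <thread>

std::atomic<bool> lockTaken{false};

void spinLock() {
    // Poll, but back off between reads instead of hammering the cache line.
    while (lockTaken.exchange(true, std::memory_order_acquire)) {
        while (lockTaken.load(std::memory_order_relaxed))
            std::this_thread::yield();   // stand-in for the weighted-load stall
    }
}

void spinUnlock() { lockTaken.store(false, std::memory_order_release); }

int main() {
    int counter = 0;
    auto worker = [&] {
        for (int i = 0; i < 1000; ++i) { spinLock(); ++counter; spinUnlock(); }
    };
    std::thread t1(worker), t2(worker);
    t1.join(); t2.join();
    std::cout << counter << "\n";   // 2000
}
```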
- Publication number: 20140129782
  Abstract: The invention provides a system with storage cache with high bandwidth and low latency to the server, and coherence for the contents of multiple memory caches, wherein locally managing a storage cache situated on a server is combined with a means for globally managing the coherency of storage caches of a number of servers. The local cache manager delivers very high performance and low latency for write transactions that hit the local cache in the Modified or Exclusive state and for read transactions that hit the local cache in the Modified, Exclusive or Shared states. The global coherency manager enables many servers connected via a network to share the contents of their local caches, providing application transparency by maintaining a directory with an entry for each storage block that indicates which servers have that block in the shared state or which server has that block in the modified state.
  Type: Application
  Filed: November 4, 2012
  Publication date: May 8, 2014
  Inventor: Robert Quinn
- Publication number: 20140122805
  Abstract: Embodiments related to selecting a runahead poison policy from a plurality of runahead poison policies during microprocessor operation are provided. The example method includes causing the microprocessor to enter runahead upon detection of a runahead event and implementing a first runahead poison policy selected from a plurality of runahead poison policies operative to manage runahead poison injection during runahead. The example method also includes, during microprocessor operation, selecting a second runahead poison policy operative to manage runahead poison injection differently from the first runahead poison policy.
  Type: Application
  Filed: October 26, 2012
  Publication date: May 1, 2014
  Applicant: NVIDIA CORPORATION
  Inventors: Magnus Ekman, James van Zoeren, Paul Serris
- Patent number: 8713251
  Abstract: A disk array device that can detect the successful completion of data overwrite/update at high speed only by checking a UDT is provided. When a DIF is used as a verification code appended to data, check information that detects the successful completion of overwrite is defined in the UDT, in addition to address information that detects positional errors. Upon request of overwrite/update of data stored in a cache, a check bit of the data in the cache is changed to a value different from a check bit to be appended to new data by a host adapter. Then, data transfer is initiated. Upon completion of the data overwrite, the check bit is changed back to the original value, whereby it is possible to detect the successful completion of overwrite/update (FIG. 8).
  Type: Grant
  Filed: May 27, 2009
  Date of Patent: April 29, 2014
  Assignee: Hitachi, Ltd.
  Inventors: Hideyuki Koseki, Yusuke Nonaka
- Patent number: 8706970
  Abstract: An apparatus for controlling operation of a cache includes a first command queue, a second command queue and an input controller configured to receive requests having a first command type and a second command type and to assign a first request having the first command type to the first command queue and a second request having the first command type to the second command queue in the event that the first command queue has not received an indication that a first dedicated buffer is available.
  Type: Grant
  Filed: November 7, 2012
  Date of Patent: April 22, 2014
  Assignee: International Business Machines Corporation
  Inventors: Diana L. Orf, Robert J. Sonnelitter, III
- Publication number: 20140108737
  Abstract: A method to eliminate the delay of a block invalidate operation in a multi-CPU environment by overlapping the block invalidate operation with normal CPU accesses, thus making the delay transparent. A range check is performed on each CPU access while a block invalidate operation is in progress, and an access that maps to within the address range of the block invalidate operation will be treated as a cache miss to ensure that the requesting CPU will receive valid data.
  Type: Application
  Filed: October 11, 2012
  Publication date: April 17, 2014
  Applicant: TEXAS INSTRUMENTS INCORPORATED
  Inventors: Naveen Bhoria, Raguram Damodaran, Abhijeet Ashok Chachad
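The range check described above can be modeled directly: while a block invalidate over some address range is in progress, any access falling inside that range is forced to miss. The structure names below are assumptions.

```cpp
// Sketch of the range-check idea: accesses that overlap an in-progress block
// invalidate behave as misses so the requester always gets valid data, and
// the invalidation proceeds in the background.
#include <cstdint>
#include <iostream>

struct BlockInvalidate {
    bool inProgress = false;
    std::uint64_t start = 0;
    std::uint64_t end = 0;       // exclusive
    bool covers(std::uint64_t addr) const {
        return inProgress && addr >= start && addr < end;
    }
};

enum class LookupResult { Hit, Miss };

LookupResult cpuAccess(std::uint64_t addr, bool tagMatches, const BlockInvalidate& bi) {
    if (bi.covers(addr))            // overlaps the pending block invalidate
        return LookupResult::Miss;  // treat as a miss regardless of the tag state
    return tagMatches ? LookupResult::Hit : LookupResult::Miss;
}

int main() {
    BlockInvalidate bi{true, 0x1000, 0x2000};
    std::cout << (cpuAccess(0x1800, /*tagMatches=*/true, bi) == LookupResult::Miss) << "\n";  // 1
    std::cout << (cpuAccess(0x3000, /*tagMatches=*/true, bi) == LookupResult::Hit)  << "\n";  // 1
}
```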
- Patent number: 8688921
  Abstract: A software transactional memory system is provided with multiple global version counters. The system assigns an affinity to one of the global version counters for each thread that executes transactions. Each thread maintains a local copy of the global version counters for use in validating read accesses of transactions. Each thread uses a corresponding affinitized global version counter to store version numbers of write accesses of executed transactions. The system adaptively changes the affinities of threads when data conflict or global version counter conflict is detected between threads.
  Type: Grant
  Filed: March 3, 2009
  Date of Patent: April 1, 2014
  Assignee: Microsoft Corporation
  Inventor: Yosseff Levanoni
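A very reduced model of the multiple-counter idea above (not Microsoft's STM): each writer bumps only the counter it is affinitized to, and readers validate against a locally cached snapshot of all counters, refreshing it when validation fails. The adaptive affinity changes are omitted.

```cpp
// Minimal sketch: several global version counters, per-thread affinity for
// writes, and a per-thread local copy of all counters used to validate reads.
#include <array>
#include <atomic>
#include <cstdint>
#include <iostream>

constexpr std::size_t kCounters = 4;
std::array<std::atomic<std::uint64_t>, kCounters> globalVersions{};

struct ThreadContext {
    std::size_t affinity;                          // which global counter this thread bumps
    std::array<std::uint64_t, kCounters> local{};  // local copy used to validate reads

    void refreshLocalCopy() {
        for (std::size_t i = 0; i < kCounters; ++i)
            local[i] = globalVersions[i].load(std::memory_order_acquire);
    }
    // Commit of a writing transaction: advance only the affinitized counter.
    std::uint64_t commitWriteVersion() {
        return globalVersions[affinity].fetch_add(1, std::memory_order_acq_rel) + 1;
    }
    // Read validation: a location stamped (counterIdx, version) is consistent
    // if its version does not exceed what this thread has already observed.
    bool readIsValid(std::size_t counterIdx, std::uint64_t version) const {
        return version <= local[counterIdx];
    }
};

int main() {
    ThreadContext t0{0}, t1{1};
    t0.refreshLocalCopy(); t1.refreshLocalCopy();
    std::uint64_t v = t1.commitWriteVersion();      // t1 writes, bumps counter 1
    std::cout << "t0 sees t1's write as valid? " << t0.readIsValid(1, v) << "\n";  // 0 -> revalidate
    t0.refreshLocalCopy();
    std::cout << "after refresh: " << t0.readIsValid(1, v) << "\n";                // 1
}
```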
- Patent number: 8688917
  Abstract: A method and apparatus for monitoring memory accesses in hardware to support transactional execution is herein described. Attributes are used to monitor accesses to data items without regard for detection at physical storage structure granularity, but rather to ensure monitoring at least at data item granularity. As an example, attributes are added to state bits of a cache to enable new cache coherency states. Upon a monitored memory access to a data item, which may be selectively determined, coherency states associated with the data item are updated to a monitored state. As a result, invalidating requests to the data item are detected through a combination of the request type and the monitored coherency state of the data item.
  Type: Grant
  Filed: January 20, 2012
  Date of Patent: April 1, 2014
  Assignee: Intel Corporation
  Inventors: Gad Sheaffer, Shlomo Raikin, Vadim Bassin, Raanan Sade, Ehud Cohen, Oleg Margulis
- Publication number: 20140089600
  Abstract: Methods and apparatuses for utilizing a data pending state for cache misses in a system cache. To reduce the size of a miss queue that is searched by subsequent misses, a cache line storage location is allocated in the system cache for a miss and the state of the cache line storage location is set to data pending. A subsequent request that hits to the cache line storage location will detect the data pending state and as a result, the subsequent request will be sent to a replay buffer. When the fill for the original miss comes back from external memory, the state of the cache line storage location is updated to a clean state. Then, the request stored in the replay buffer is reactivated and allowed to complete its access to the cache line storage location.
  Type: Application
  Filed: September 27, 2012
  Publication date: March 27, 2014
  Applicant: APPLE INC.
  Inventors: Sukalpa Biswas, Shinye Shiu, James B. Keller
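The data-pending flow above can be sketched as a toy system cache: a miss allocates the line in a data-pending state, later requests that hit it are parked in a replay buffer rather than a searchable miss queue, and the fill flips the line to clean and replays them. The names and structures are assumptions.

```cpp
// Sketch (state names follow the abstract; everything else is assumed).
#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <vector>

enum class LineState { DataPending, Clean };
struct Request { int requester; std::uint64_t addr; };

class SystemCache {
    std::unordered_map<std::uint64_t, LineState> lines_;
    std::vector<Request> replayBuffer_;
public:
    void handleRequest(const Request& r) {
        auto it = lines_.find(r.addr);
        if (it == lines_.end()) {                       // miss: allocate in DataPending
            lines_[r.addr] = LineState::DataPending;
            std::cout << "req " << r.requester << ": miss, fill requested\n";
        } else if (it->second == LineState::DataPending) {
            replayBuffer_.push_back(r);                 // park until the fill returns
            std::cout << "req " << r.requester << ": data pending, sent to replay buffer\n";
        } else {
            std::cout << "req " << r.requester << ": hit\n";
        }
    }
    void fillArrived(std::uint64_t addr) {
        lines_[addr] = LineState::Clean;                // fill completes the original miss
        std::vector<Request> toReplay;
        for (auto it = replayBuffer_.begin(); it != replayBuffer_.end();) {
            if (it->addr == addr) { toReplay.push_back(*it); it = replayBuffer_.erase(it); }
            else ++it;
        }
        for (const auto& r : toReplay) handleRequest(r);  // reactivate parked requests
    }
};

int main() {
    SystemCache c;
    c.handleRequest({1, 0x40});   // miss
    c.handleRequest({2, 0x40});   // hits the pending line -> replay buffer
    c.fillArrived(0x40);          // req 2 replays and now hits
}
```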
- Patent number: 8667226
  Abstract: A data processing system (10) includes a first master (14) and a second master (16 or 22). The first master includes a cache (28) and snoop queue circuitry (44, 52, 54) having a snoop request queue (44) which stores snoop requests. The snoop queue circuitry receives snoop requests for storage into the snoop request queue and provides snoop requests from the snoop request queue to the cache, and the snoop queue circuitry provides a ready indicator indicating whether the snoop request queue can store more snoop requests. The second master includes outgoing transaction control circuitry (72) which controls initiation of outgoing transactions to a system interconnect.
  Type: Grant
  Filed: March 24, 2008
  Date of Patent: March 4, 2014
  Assignee: Freescale Semiconductor, Inc.
  Inventor: William C. Moyer
- Publication number: 20140059298
  Abstract: In one embodiment, a method performed by one or more computing devices includes receiving, at a host cache, a first request to prepare a volume of the host cache for creating a snapshot of a cached logical unit number (LUN), the request indicating that a snapshot of the cached LUN will be taken; preparing, in response to the first request, the volume of the host cache for creating the snapshot of the cached LUN depending on a mode of the host cache; receiving, at the host cache, a second request to create the snapshot of the cached LUN; and, in response to the second request, creating, at the host cache, the snapshot of the cached LUN.
  Type: Application
  Filed: August 24, 2012
  Publication date: February 27, 2014
  Applicant: DELL PRODUCTS L.P.
  Inventors: Marc David Olin, Michael James Klemm
- Publication number: 20140052931
  Abstract: A method for controlling a memory scrubbing rate based on the content of the status bit of a tag array of a cache memory. More specifically, the tag array of a cache memory is scrubbed at a smaller interval than the scrubbing rate of the storage arrays of the cache. This increased scrubbing rate reflects the importance of maintaining the integrity of tag data. Based on the content of the status bit of the tag array, which indicates modified, the corresponding data entry in the cache storage array is scrubbed accordingly. If the modified bit is set, then the entry in the storage array is scrubbed after processing the tag entry. If the modified bit is not set, then the storage array is scrubbed at a predetermined scrubbing interval.
  Type: Application
  Filed: August 17, 2012
  Publication date: February 20, 2014
  Inventors: Ravindraraj Ramaraju, William C. Moyer, Andrew C. Russell
- Publication number: 20140040562
  Abstract: The disclosed embodiments provide a system that uses broadcast-based TLB-sharing techniques to reduce address-translation latency in a shared-memory multiprocessor system with two or more nodes that are connected by an electrical interconnect. During operation, a first node receives a memory operation that includes a virtual address. Upon determining that one or more TLB levels of the first node will miss for the virtual address, the first node uses the electrical interconnect to broadcast a TLB request to one or more additional nodes of the shared-memory multiprocessor in parallel with scheduling a speculative page-table walk for the virtual address. If the first node receives a TLB entry from another node of the shared-memory multiprocessor via the electrical interconnect in response to the TLB request, the first node cancels the speculative page-table walk. Otherwise, if no response is received, the first node instead waits for the completion of the page-table walk.
  Type: Application
  Filed: August 2, 2012
  Publication date: February 6, 2014
  Applicant: ORACLE INTERNATIONAL CORPORATION
  Inventors: Pranay Koka, David A. Munday, Michael O. McCracken, Herbert D. Schwetman, JR.
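The control flow above can be sketched in a single-threaded way (so the parallel walk and its cancellation are only indicated in comments): on a local TLB miss, ask the peers; a peer hit stands in for cancelling the speculative walk, otherwise fall back to the page-table walk. The node and walker interfaces are invented for illustration.

```cpp
// Highly simplified sketch of the broadcast-TLB-sharing control flow.
#include <cstdint>
#include <iostream>
#include <optional>
#include <unordered_map>
#include <vector>

using VirtAddr = std::uint64_t;
using PhysAddr = std::uint64_t;

struct Node {
    std::unordered_map<VirtAddr, PhysAddr> tlb;
    std::optional<PhysAddr> lookup(VirtAddr va) const {
        auto it = tlb.find(va);
        if (it == tlb.end()) return std::nullopt;
        return it->second;
    }
};

PhysAddr pageTableWalk(VirtAddr va) { return va + 0x100000; }   // stand-in for the slow walk

PhysAddr translate(Node& self, const std::vector<Node*>& peers, VirtAddr va) {
    if (auto hit = self.lookup(va)) return *hit;       // local TLB hit
    // Miss: broadcast to peers; a real design would also start the speculative
    // page-table walk here in parallel and cancel it if a peer answers first.
    for (Node* peer : peers) {
        if (auto remote = peer->lookup(va)) {          // peer TLB hit: walk would be cancelled
            self.tlb[va] = *remote;
            return *remote;
        }
    }
    PhysAddr pa = pageTableWalk(va);                   // no peer response: wait for the walk
    self.tlb[va] = pa;
    return pa;
}

int main() {
    Node a, b;
    b.tlb[0x1000] = 0xAB000;
    std::vector<Node*> peers = {&b};
    std::cout << std::hex << translate(a, peers, 0x1000) << "\n";  // from peer b
    std::cout << std::hex << translate(a, peers, 0x2000) << "\n";  // via page-table walk
}
```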
- Patent number: 8645632
  Abstract: Embodiments of the present invention provide a system that performs a speculative writestream transaction. The system starts by receiving, at a home node, a writestream ordered (WSO) request to start a WSO transaction from a processing subsystem. The WSO request identifies a cache line to be written during the WSO transaction. The system then sends an acknowledge signal to the processing subsystem to enable the processing subsystem to proceed with the WSO transaction. During the WSO transaction, the system receives a second WSO request to start a WSO transaction. The second WSO request identifies the same cache line to be written during the subsequent WSO transaction. In response to receiving the second WSO request, the system sends an abort signal to cause the processing subsystem to abort the WSO transaction.
  Type: Grant
  Filed: February 4, 2009
  Date of Patent: February 4, 2014
  Assignee: Oracle America, Inc.
  Inventors: Robert E. Cypher, Haakan E. Zeffer, Anders Landin
- Publication number: 20140032858
  Abstract: Methods and apparatus are provided for cache line sharing among cache controllers. A cache comprises a plurality of cache lines, and a cache controller for sharing at least one of the cache lines with one or more additional caches, wherein a given cache line shared by a plurality of caches corresponds to a given set of physical addresses in a main memory. The cache controller optionally maintains an ownership control signal indicating which portions of the at least one cache line are controlled by the cache and a validity control signal indicating whether each portion of the at least one cache line is valid. Each cache line can be in one of a plurality of cache coherence states, including a modified partial state and a shared partial state.
  Type: Application
  Filed: July 25, 2012
  Publication date: January 30, 2014
  Inventors: Vidyalakshmi Rajagopalan, Archna Rai, Anuj Soni, Sharath Kashyap
- Patent number: 8639890
  Abstract: Systems and methods for implementing a distributed shared memory (DSM) in a computer cluster in which an unreliable underlying message passing technology is used, such that the DSM efficiently maintains coherency and reliability. DSM agents residing on different nodes of the cluster process access permission requests of local and remote users on specified data segments via handling procedures, which provide for recovering of lost ownership of a data segment while ensuring exclusive ownership of a data segment among the DSM agents, detecting and resolving a no-owner messaging deadlock, pruning of obsolete messages, and recovery of the latest contents of a data segment whose ownership has been lost.
  Type: Grant
  Filed: July 2, 2012
  Date of Patent: January 28, 2014
  Assignee: International Business Machines Corporation
  Inventors: Lior Aronovich, Ron Asher