Coherency Patents (Class 711/141)
-
Patent number: 8996829Abstract: Aspects of the subject matter described herein relate to maintaining consistency in a storage system. In aspects, one or more objects may be updated in the context of a transaction. In conjunction with updating the objects, logical copies of the objects may be obtained and modified. A request to write the updated logical copies is sent to a storage controller. The logical copies do not overwrite the original copies. In conjunction with sending the request, a data structure is provided for the storage controller to store on the disk. The data structure indicates the one or more objects that were supposed to be written to disk and may include verification data to indicate the content that was supposed to be written to disk. During recovery, this data structure may be used to determine whether all of the object(s) were correctly written to disk.Type: GrantFiled: April 29, 2013Date of Patent: March 31, 2015Assignee: Microsoft Technology Licensing, LLCInventors: Thomas J. Miller, Jonathan M. Cargille, William R. Tipton, Surendra Verma
-
Patent number: 8996820Abstract: A multi-core processor system includes a processor configured to establish coherency of shared data values stored in a cache memory accessed by a multiple cores; detect a first thread executed by a first core among the cores; identify upon detecting the first thread, a second thread under execution by a second core other than the first core and among the cores; determine whether shared data commonly accessed by the first thread and the second thread is present; and stop establishment of coherency for a first cache memory corresponding to the first core and a second cache memory corresponding to the second core, upon determining that no shared data commonly accessed is present.Type: GrantFiled: December 12, 2012Date of Patent: March 31, 2015Assignee: Fujitsu LimitedInventors: Takahisa Suzuki, Koichiro Yamashita, Hiromasa Yamauchi, Koji Kurihara
-
Publication number: 20150089155Abstract: Cache lines in a multi-processor computing environment are configurable with a coherency mode. Cache lines in full-line coherency mode are operated or managed with full-line granularity. Cache lines in sub-line coherency mode are operated or managed as sub-cache line portions of a full cache line. Communications detected on a coherence interconnect may indicate that a cache line is associated with performance-reducing events. A high-contention cache line may be placed in sub-line coherency mode. Caches accessing the cache line are notified that the cache line is in sub-line coherency mode. The cache line may be associated with a counter in a centralized detection table that is incremented based on detecting the communications. The cache line may be a high-contention cache line when the counter satisfies a high-contention criterion, such as reaching a threshold value. The cache line may be returned to full-line coherency mode when a reset criterion is satisfied.Type: ApplicationFiled: September 26, 2013Publication date: March 26, 2015Applicant: International Business Machines CorporationInventors: Fadi Y. Busaba, Harold W. Cain, III, Michael K. Gschwind, Maged M. Michael, Valentina Salapura, Eric M. Schwarz, Chung-Lung K. Shum
-
Publication number: 20150089157Abstract: A cache coherence manager, disposed in a multi-core microprocessor, includes a request unit, an intervention unit, a response unit and an interface unit. The request unit receives coherent requests and selectively issues speculative requests in response. The interface unit selectively forwards the speculative requests to a memory. The interface unit includes at least three tables. Each entry in the first table represents an index to the second table. Each entry in the second table represents an index to the third table. The entry in the first table is allocated when a response to an associated intervention message is stored in the first table but before the speculative request is received by the interface unit. The entry in the second table is allocated when the speculative request is stored in the interface unit. The entry in the third table is allocated when the speculative request is issued to the memory.Type: ApplicationFiled: December 2, 2014Publication date: March 26, 2015Inventors: William Lee, Thomas Benjamin Berg
-
Publication number: 20150089152Abstract: Cache lines in a computing environment with transactional memory are configurable with a coherency mode. Cache lines in full-line coherency mode are operated or managed with full-line granularity. Cache lines in sub-line coherency mode are operated or managed as sub-cache line portions of a full cache line. When a transaction accessing a cache line in full-line coherency mode results in a transactional abort, the cache line may be placed in sub-line coherency mode if the cache line is a high-conflict cache line. The cache line may be associated with a counter in a conflict address detection table that is incremented whenever a transaction conflict is detected for the cache line. The cache line may be a high-conflict cache line when the counter satisfies a high-conflict criterion, such as reaching a threshold value. The cache line may be returned to full-line coherency mode when a reset criterion is satisfied.Type: ApplicationFiled: September 26, 2013Publication date: March 26, 2015Applicant: International Business Machines CorporationInventors: Fadi Y. Busaba, Harold W. Cain, III, Michael K. Gschwind, Maged M. Michael, Valentina Salapura, Eric M. Schwarz, Chung-Lung K. Shum
-
Publication number: 20150089154Abstract: Cache lines in a multi-processor computing environment are configurable with a coherency mode. Cache lines in full-line coherency mode are operated or managed with full-line granularity. Cache lines in sub-line coherency mode are operated or managed as sub-cache line portions of a full cache line. A high-coherence-miss cache line may be placed in sub-line coherency mode. A cache line may be associated with a counter in a coherence miss detection table that is incremented whenever an access of the cache line results in a coherence request. The cache line may be a high-coherence-miss cache line when the counter satisfies a high-coherence-miss criterion, such as reaching a threshold value. The cache line may be returned to full-line coherency mode when a reset criterion is satisfied.Type: ApplicationFiled: September 26, 2013Publication date: March 26, 2015Applicant: International Business Machines CorporationInventors: Fadi Y. Busaba, Harold W. Cain, III, Michael K. Gschwind, Maged M. Michael, Valentina Salapura, Eric M. Schwarz, Chung-Lung K. Shum
-
Publication number: 20150089156Abstract: In an aspect, an update unit can evaluate condition(s) in an update request and update one or more memory locations based on the condition evaluation. The update unit can operate atomically to determine whether to effect the update and to make the update. Updates can include one or more of incrementing and swapping values. An update request may specify one of a pre-determined set of update types. Some update types may be conditional and others unconditional. The update unit can be coupled to receive update requests from a plurality of computation units. The computation units may not have privileges to directly generate write requests to be effected on at least some of the locations in memory. The computation units can be fixed function circuitry operating on inputs received from programmable computation elements. The update unit may include a buffer to hold received update requests.Type: ApplicationFiled: September 23, 2014Publication date: March 26, 2015Inventors: Steven J. Clohset, Jason R. Redgrave, Luke T. Peterson
-
Publication number: 20150089153Abstract: Cache lines in a computing environment with transactional memory are configurable with a coherency mode and are associated with a high-conflict indicator. Cache lines in full-line coherency mode are operated or managed with full-line granularity. Cache lines in sub-line coherency mode are operated or managed as sub-cache line portions of a full cache line. A cache line is placed in sub-line coherency mode based on examining the high-conflict indicator. A transaction accessing a memory address in a cache line in sub-line coherency mode marks only the sub-cache line portion associated with the memory address as transactionally accessed. The high-conflict indicator may be included in a set of descriptive bits associated with the cache line. A copy of the high-conflict indicator for a cache line in a first cache may be updated with the high-conflict indicator for the cache line in a second cache.Type: ApplicationFiled: September 26, 2013Publication date: March 26, 2015Applicant: International Business Machines CorporationInventors: Fadi Y. Busaba, Harold W. Cain, III, Michael K. Gschwind, Maged M. Michael, Valentina Salapura, Eric M. Schwarz, Chung-Lung K. Shum
-
Publication number: 20150089159Abstract: Cache lines in a multi-processor computing environment are configurable with a coherency mode. Cache lines in full-line coherency mode are operated or managed with full-line granularity. Cache lines in sub-line coherency mode are operated or managed as sub-cache line portions of a full cache line. Each cache is associated with a directory having a number of directory entries and with a side table having a smaller number of entries. The directory entry for a cache line associates the cache line with a tag and a set of full-line descriptive bits. Creating a side table entry for the cache line places the cache line in sub-line coherency mode. The side table entry associates each of the sub-cache line portions of the cache line with a set of sub-line descriptive bits. Removing the side table entry may return the cache line to full-line coherency mode.Type: ApplicationFiled: September 26, 2013Publication date: March 26, 2015Applicant: International Business Machines CorporationInventors: Fadi Y. Busaba, Harold W. Cain, III, Michael K. Gschwind, Maged M. Michael, Valentina Salapura, Eric M. Schwarz, Chung-Lung K. Shum
-
Publication number: 20150089151Abstract: Techniques are disclosed for performing memory access operations. A texture unit receives a memory access operation that includes a tuple associated with a first view in a plurality of views. The texture unit retrieves a first hash value associated with a first texture header in a plurality of texture headers, where the first texture header is related to the first view. The texture unit retrieves a second hash value associated with a second texture header in the plurality of texture headers, where the second texture header is related to a second view. The texture unit determines whether the first view is potentially aliased with the second view, based on the first and second hash values. If so, then the texture unit invalidates a cache entry in a cache memory associated with the second texture header. Otherwise, the texture unit maintains the cache entry.Type: ApplicationFiled: September 25, 2013Publication date: March 26, 2015Applicant: NVIDIA CORPORATIONInventors: Jeffrey Bolz, Patrick R. BROWN, Steven J. HEINRICH, Dale L. KIRKLAND, Joel MCCORMACK
-
Patent number: 8990501Abstract: A multiple processor system is disclosed. The processor system includes a first cluster including a first plurality of processors is associated with a first cluster cache, a second cluster including a second plurality of processors associated with a second cluster cache, and a cluster communication network between the first cluster and the second cluster for sharing data between the first cluster and the second cluster. The first cluster includes a first unshared connection to the cluster communication network and the second cluster includes a second unshared connection to the cluster communication network.Type: GrantFiled: October 12, 2005Date of Patent: March 24, 2015Assignee: Azul Systems, Inc.Inventors: Scott Sellers, Gil Tene
-
Patent number: 8990511Abstract: There is provided a cache synchronization control method by which contents of a plurality of caches can be synchronized without a programmer explicitly setting a synchronization point, and the contents of the caches can be synchronized without scanning all cache blocks. A cache synchronization control method for a multiprocessor that has a plurality of processors having a cache, and a storage device shared by the plurality of processors, the method comprises: before a task is executed, a first step of writing back input data of the task to the storage device by a processor that manages the task and deleting data corresponding to the input data from its own cache by a processor other than the processor; and after the task is executed, a second step of writing back output data of the task to the storage device by a processor that has executed the task and deleting data corresponding to the output data from its own cache by a processor other than the processor.Type: GrantFiled: October 31, 2008Date of Patent: March 24, 2015Assignee: NEC CorporationInventor: Takahiro Kumura
-
Patent number: 8990510Abstract: A method, system and computer program product for managing requests for deferred updates to shared data elements while minimizing grace period detection overhead associated with determining whether pre-existing references to the data elements have been removed. Plural update requests that are eligible for grace period detection are buffered without performing grace period detection processing. One or more conditions that could warrant commencement of grace period detection processing are monitored while the update requests are buffered. If warranted by such a condition, grace period detection is performed relative to the update requests so that they can be processed. In this way, grace period detection overhead can be amortized over plural update requests while being sensitive to conditions warranting prompt grace period detection.Type: GrantFiled: August 15, 2008Date of Patent: March 24, 2015Assignee: International Business Machines CorporationInventors: Paul E. McKenney, Orran Y. Krieger, Jonathan Appavoo, Dipankar Sarma
-
Patent number: 8984233Abstract: Aspects of the subject matter described herein relate to error detection for files. In aspects, before allowing updates to a clean file, a flag marking the file as dirty is written to non-volatile storage. Thereafter, the file may be updated as long as desired. Periodically or at some other time, the file may be marked as clean after all outstanding updates to the file and error codes associated with the file are written to storage. While waiting for outstanding updates and error codes to be written to storage, if additional requests to update the file are received, the file may be marked as dirty again prior to allowing the additional requests to update the file. The request to write a clean flag regarding the file may be done lazily.Type: GrantFiled: June 20, 2014Date of Patent: March 17, 2015Assignee: Microsoft CorporationInventors: Thomas J. Miller, Jonathan M. Cargille, William R. Tipton, Surendra Verma
-
Patent number: 8984218Abstract: A system (and associated method) comprises a storage drive and a central processing unit (“CPU”). The storage drive is adapted to accommodate a removable storage medium. The CPU is configured to execute code to cause the CPU to write data to an addressable unit of the storage medium and also to write an identifying value to the addressable unit. The identifying value is indicative of an identity of the system or the storage drive.Type: GrantFiled: June 24, 2005Date of Patent: March 17, 2015Assignee: Hewlett-Packard Development Company, L.P.Inventor: Charles R. Weirauch
-
Publication number: 20150074325Abstract: A multi-processor computer system is described in which transaction processing is distributed among multiple protocol engines. The system includes a plurality of local nodes and an interconnection controller interconnected by a local point-to-point architecture. The interconnection controller comprises a plurality of protocol engines for processing transactions. Transactions are distributed among the protocol engines using destination information associated with the transactions.Type: ApplicationFiled: November 3, 2014Publication date: March 12, 2015Applicant: MEMORY INTEGRITY, LLCInventors: Charles Edward Watson, JR., Rajesh Kota, David Brian Glasco
-
Patent number: 8977820Abstract: A data processing apparatus and method are provided for handling hard errors occurring in a cache of the data processing apparatus. Cache location avoid storage is provided having at least one record, with each record being used to store a cache line identifier identifying a specific cache line. On detection of an error condition, one of the records in the cache location avoid storage is allocated to store the cache line identifier for the specific cache line associated with the entry for which the error condition was detected. A clean and invalidate operation is performed in respect of the specific cache line, and the access request is then re-performed. Cache access circuitry is arranged to exclude any specific cache line identified in the cache location avoid storage from a lookup procedure.Type: GrantFiled: December 21, 2007Date of Patent: March 10, 2015Assignee: ARM LimitedInventors: Antony John Penton, Alex James Waugh, Andrew Christopher Rose, Paul Stanley Hughes
-
Publication number: 20150067267Abstract: A method and an apparatus for concurrent accessing of dynamically type objects based on inline cache code are described. Inline cache initialization in a single thread may be off loaded to an interpreter without incurring unnecessary synchronization overhead. A thread bias mechanism may be provided to detect whether a code block is executed in a single thread. Further, the number of inline cache initializations performed via a compiler, such as baseline JIT compiler, can be reduced to improve processing performance.Type: ApplicationFiled: December 4, 2013Publication date: March 5, 2015Applicant: Apple Inc.Inventor: Filip J. Pizlo
-
Publication number: 20150067270Abstract: Managing a cache includes determining from metadata of a received service request whether a cache data response may satisfy the request as a function of recognizing a cacheable method name specification within request metadata by a service provider associated with the request, and determining whether the request is an inquiry in order to decide if the request may be satisfied by the cached data. Aspects also include searching the cache for the data response if determined the data is cacheable and the request is an inquiry, and sending the request on to a service provider if the data response is not a cacheable response, or the request is an update request.Type: ApplicationFiled: November 7, 2014Publication date: March 5, 2015Inventors: Hiroyuki Miyajima, Masaru Yamamoto
-
Publication number: 20150067268Abstract: A computer processor collects information for a dominant data access loop and reference code patterns based on data reference pattern analysis, and for pointer aliasing and data shape based on pointer escape analysis. The computer processor selects a candidate array for data splitting wherein the candidate array is referenced by a dominant data access loop. The computer processor determines a data splitting mode by which to split the data of the candidate array, based on the reference code patterns, the pointer aliasing, and the data shape information, and splits the data into two or more split arrays. The computer processor creates a software cache that includes a portion of the data of the two or more split arrays in a transposed format, and maintains the portion of the transposed data within the software cache and consults the software cache during an access of the split arrays.Type: ApplicationFiled: June 13, 2014Publication date: March 5, 2015Inventors: Christopher M. Barton, Shimin Cui, Satish K. Sadasivam, Raul E. Silvera, Madhavi G. Valluri, Steven W. White
-
Publication number: 20150067269Abstract: A method for building a multi-processor system with nodes having multiple cache coherency domains. In this system, a directory built in anode controller needs to include processor domain attribute information, and the information can be acquired by configuring cache coherency domain attributes of ports of the node controller connected to processors. In the disclosure herein, the node ca roller can support the multiple physical cache coherency domains in a node.Type: ApplicationFiled: November 6, 2014Publication date: March 5, 2015Inventors: Endong WANG, Leijun HU, Jicheng CHEN, Dong ZHANG, Weifeng GONG, Feng ZHANG
-
Patent number: 8972663Abstract: A method for cache coherence, including: broadcasting, by a requester cache (RC) over a partially-ordered request network (RN), a peer-to-peer (P2P) request for a cacheline to a plurality of slave caches; receiving, by the RC and over the RN while the P2P request is pending, a forwarded request for the cacheline from a gateway; receiving, by the RC and after receiving the forwarded request, a plurality of responses to the P2P request from the plurality of slave caches; setting an intra-processor state of the cacheline in the RC, wherein the intra-processor state also specifies an inter-processor state of the cacheline; and issuing, by the RC, a response to the forwarded request after setting the intra-processor state and after the P2P request is complete; and modifying, by the RC, the intra-processor state in response to issuing the response to the forwarded request.Type: GrantFiled: March 14, 2013Date of Patent: March 3, 2015Assignee: Oracle International CorporationInventors: Paul N. Loewenstein, Stephen E. Phillips, David Richard Smentek, Connie Wai Mun Cheung, Serena Wing Yee Leung, Damien Walker, Ramaswamy Sivaramakrishnan
-
Patent number: 8972667Abstract: A device with an interconnect having a plurality of memory controllers for connecting the plurality of memory controllers. Each memory controller of the plurality of memory controllers is coupled to an allocated memory for storing data. Further, each memory controller of the plurality of memory controllers has one accelerator of a plurality of accelerators for mutually exchanging data over the interconnect.Type: GrantFiled: June 27, 2012Date of Patent: March 3, 2015Assignee: International Business Machines CorporationInventors: Florian Alexander Auernhammer, Victoria Caparros Cabezas, Andreas Christian Doering, Patricia Maria Sagmeister
-
Publication number: 20150058579Abstract: A method for memory utilization by an electronic device is described. The method includes transferring a first portion of a first decision tree and a second portion of a second decision tree from a first memory to a cache memory. The first portion and second portion of each decision tree are stored contiguously in the first memory. The first decision tree and second decision tree are each associated with a different feature of an object detection algorithm. The method also includes reducing cache misses by traversing the first portion of the first decision tree and the second portion of the second decision tree in the cache memory based on an order of execution of the object detection algorithm.Type: ApplicationFiled: August 25, 2014Publication date: February 26, 2015Inventors: Lei Xu, Bo Zhou, Michael Warren Castelloe, Shuxue Quan, Xinping Zhang, Junchen Du, Ashwath Harthattu, Feng Guo, Yingyong Qi
-
Patent number: 8966461Abstract: A medium, method, and apparatus are disclosed for eliding superfluous function invocations in a vector-processing environment. A compiler receives program code comprising a width-contingent invocation of a function. The compiler creates a width-specific executable version of the program code by determining a vector width of a target computer system and omitting the function from the width-specific executable if the vector width meets one or more criteria. For example, the compiler may omit the function call if the vector width is greater than a minimum size.Type: GrantFiled: September 29, 2011Date of Patent: February 24, 2015Assignee: Advanced Micro Devices, Inc.Inventors: Benedict R. Gaster, Lee W. Howes, Mark D. Hummel
-
Populating a first stride of tracks from a first cache to write to a second stride in a second cache
Patent number: 8966178Abstract: Provided are a computer program product, system, and method for managing data in a cache system comprising a first cache, a second cache, and a storage system. A determination is made of tracks stored in the storage system to demote from the first cache. A first stride is formed including the determined tracks to demote. A determination is made of a second stride in the second cache in which to include the tracks in the first stride. The tracks from the first stride are added to the second stride in the second cache. A determination is made of tracks in strides in the second cache to demote from the second cache. The determined tracks to demote from the second cache are demoted.Type: GrantFiled: January 17, 2012Date of Patent: February 24, 2015Assignee: International Business Machines CorporationInventors: Kevin J. Ash, Michael T. Benhase, Lokesh M. Gupta, Matthew J. Kalos, Karl A. Nielsen -
Patent number: 8966187Abstract: For a flexible replication with skewed mapping in a multi-core chip, a request for a cache line is received, at a receiver core in the multi-core chip from a requester core in the multi-core chip. The receiver and requester cores comprise electronic circuits. The multi-core chip comprises a set of cores including the receiver and the requester cores. A target core is identified from the request to which the request is targeted. A determination is made whether the target core includes the requester core in a neighborhood of the target core, the neighborhood including a first subset of cores mapped to the target core according to a skewed mapping. The cache line is replicated, responsive to the determining being negative, from the target core to a replication core. The cache line is provided from the replication core to the requester core.Type: GrantFiled: December 1, 2011Date of Patent: February 24, 2015Assignee: International Business Machines CorporationInventors: Jian Li, William Evan Speight
-
Patent number: 8966180Abstract: A scatter/gather technique optimizes unstructured streaming memory accesses, providing off-chip bandwidth efficiency by accessing only useful data at a fine granularity, and off-loading memory access overhead by supporting address calculation, data shuffling, and format conversion.Type: GrantFiled: March 1, 2013Date of Patent: February 24, 2015Assignee: Intel CorporationInventors: Daehyun Kim, Christopher J. Hughes, Yen-Kuang Chen, Partha Kundu
-
Patent number: 8965995Abstract: A wireless storage management system adapted for being used in an electronic product for wirelessly communicating with a plurality of wireless storage devices includes an identity module assigning master and slave roles to the wireless storage devices, a hard disk manage module controlling the master device to obtain disk information about the wireless storage devices and further set an archive order for the slave devices, and a file manage module managing file access according to the archive order. The electronic product is only directly connected with the master device to make the hard disk manage module and the file manage module manage the slave devices via the master device. So the electronic product can conveniently view the disk information of all of the wireless storage devices and further realize the file access to all of the devices only by being directly connected with the master device.Type: GrantFiled: August 13, 2012Date of Patent: February 24, 2015Assignee: Cheng Uei Precision Industry Co., Ltd.Inventor: Chih-Jen Kuo
-
Patent number: 8959289Abstract: A data processing system includes a processor core supported by upper and lower level caches. In response to executing a deallocate instruction in the processor core, a deallocation request is sent from the processor core to the lower level cache, the deallocation request specifying a target address associated with a target cache line. In response to receipt of the deallocation request at the lower level cache, a determination is made if the target address hits in the lower level cache. In response to determining that the target address hits in the lower level cache, the target cache line is retained in a data array of the lower level cache and a replacement order field in a directory of the lower level cache is updated such that the target cache line is more likely to be evicted from the lower level cache in response to a subsequent cache miss.Type: GrantFiled: October 19, 2012Date of Patent: February 17, 2015Assignee: International Business Machines CorporationInventors: Sanjeev Ghai, Guy L. Guthrie, William J. Starke, Jeff A. Stuecheli, Derek E. Williams, Phillip G. Williams
-
Patent number: 8959290Abstract: Methods and apparatus are provided for reusing snoop responses and data phase results in a cache controller. A cache controller receives a broadcast combined snoop response from a bus controller, wherein the broadcast combined snoop response corresponds to an incoming bus transaction BTR1 corresponding to a cache transaction CTR1 for an entry in at least one cache and wherein the combined snoop response is a combination of at least one snoop response from a plurality of cache controllers; receives broadcast cache line data from a source cache as instructed by the bus controller for the entry during a data phase; and processes a subsequent cache transaction CTR2 for the entry based on one or more of the broadcast combined snoop response and the broadcast cache line data.Type: GrantFiled: February 21, 2012Date of Patent: February 17, 2015Assignee: LSI CorporationInventors: Vidyalakshmi Rajagopalan, Archna Rai, Sharath Kashyap, Anuj Soni
-
Populating a first stride of tracks from a first cache to write to a second stride in a second cache
Patent number: 8959279Abstract: Provided are a computer program product, system, and method for managing data in a cache system comprising a first cache, a second cache, and a storage system. A determination is made of tracks stored in the storage system to demote from the first cache. A first stride is formed including the determined tracks to demote. A determination is made of a second stride in the second cache in which to include the tracks in the first stride. The tracks from the first stride are added to the second stride in the second cache. A determination is made of tracks in strides in the second cache to demote from the second cache. The determined tracks to demote from the second cache are demoted.Type: GrantFiled: May 4, 2012Date of Patent: February 17, 2015Assignee: International Business Machines CorporationInventors: Kevin J. Ash, Michael T. Benhase, Lokesh M. Gupta, Matthew J. Kalos, Karl A. Nielsen -
Patent number: 8954474Abstract: A method of maintaining data described in a plurality of data models. An ontology is used to describe the data models. The data models are managed using the ontology and using a validation schema to validate object(s) governed by the ontology and derived from data-centric component(s) of content that has a semantically independent structure. Management of the data models is neutral relative to implementation of the content.Type: GrantFiled: April 21, 2008Date of Patent: February 10, 2015Assignee: The Boeing CompanyInventors: Mark A. Dahl, Edward J. Levinskas, Patrick L. Walsh, Russell G. Gianni, James G. Tanner, Roberto Aaron Vergaray
-
Patent number: 8954674Abstract: A scatter/gather technique optimizes unstructured streaming memory accesses, providing off-chip bandwidth efficiency by accessing only useful data at a fine granularity, and off-loading memory access overhead by supporting address calculation, data shuffling, and format conversion.Type: GrantFiled: October 8, 2013Date of Patent: February 10, 2015Assignee: Intel CorporationInventors: Daehyun Kim, Christopher J. Hughes, Yen-Kuang Chen, Partha Kundu
-
Patent number: 8954682Abstract: The present invention measures an actual utilization frequency of data and controls a location of this data in a storage apparatus in a case where a host computer makes joint use of a storage apparatus and a cache apparatus. A portion of data used by an application program 1A is stored in a storage apparatus 2 and a cache apparatus 3. A management apparatus 4 detects an I/O load of a page (4A), and detects an I/O load of cache data (4B). The management apparatus 4 determines a corresponding relationship between the page and the cache data (4C), and adds the I/O load of the cache data to the I/O load of the page.Type: GrantFiled: February 19, 2014Date of Patent: February 10, 2015Assignee: Hitachi, Ltd.Inventors: Takanori Sato, Takato Kusama
-
Patent number: 8954672Abstract: The present disclosure relates to a method and system for mapping cache lines to a row-based cache. In particular, a method includes, in response to a plurality of memory access requests each including an address associated with a cache line of a main memory, mapping sequentially addressed cache lines of the main memory to a row of the row-based cache. A disclosed system includes row index computation logic operative to map sequentially addressed cache lines of a main memory to a row of a row-based cache in response to a plurality of memory access requests each including an address associated with a cache line of the main memory.Type: GrantFiled: March 12, 2012Date of Patent: February 10, 2015Assignee: Advanced Micro Devices, Inc.Inventors: Gabriel H. Loh, Mark D. Hill
-
Patent number: 8949545Abstract: A data processing device includes a load/store module to provide an interface between a processor device and a bus. In response to receiving a load or store instruction from the processor device, the load/store module determines a predicted coherency state of a cache line associated with the load or store instruction. Based on the predicted coherency state, the load/store module selects a bus transaction and communicates it to the bus. By selecting the bus transaction based on the predicted cache state, the load/store module does not have to wait for all pending bus transactions to be serviced, providing for greater predictability as to when bus transactions will be communicated to the bus, and allowing the bus behavior to be more easily simulated.Type: GrantFiled: December 4, 2008Date of Patent: February 3, 2015Assignee: Freescale Semiconductor, Inc.Inventor: John D. Pape
-
Patent number: 8949540Abstract: A victim cache line having a data-invalid coherence state is selected for castout from a first lower level cache of a first processing unit. The first processing unit issues on an interconnect fabric a lateral castout (LCO) command identifying the victim cache line to be castout from the first lower level cache, indicating the data-invalid coherence state, and indicating that a lower level cache is an intended destination of the victim cache line. In response to a coherence response to the LCO command indicating success of the LCO command, the victim cache line is removed from the first lower level cache and held in a second lower level cache of a second processing unit in the data-invalid coherence state.Type: GrantFiled: March 11, 2009Date of Patent: February 3, 2015Assignee: International Business Machines CorporationInventors: Guy L. Guthrie, Hien M. Le, Alvan W. Ng, Michael S. Siegel, Derek E. Williams, Phillip G. Williams
-
Patent number: 8949547Abstract: A data processing system that manages data hazards at a coherency controller and not at an initiator device is disclosed. Write requests are processed in a two part form, such that a first part is transmitted and when the coherency controller has space to accept data, the data and a state of the data prior to a write are sent as a second part of a write request. When there are copending reads and writes to the same address, writes are stalled by not responding to the first part of a write request and snoop requests received to the address are processed regardless of the fact that the write is pending. When the pending read has completed, the coherency controller will respond to the first part of the write request and the initiator device will complete the write by sending the data and a state indicator following the snoop.Type: GrantFiled: August 8, 2011Date of Patent: February 3, 2015Assignee: ARM LimitedInventors: Phanindra Kumar Mannava, Jamshed Jalal, Ramamoorthy Guru Prasadh, Michael Alan Filippo
-
Patent number: 8949546Abstract: Embodiments include a local cache management system that is configured to be coupled to a local cache and that includes an index engine configured to store fingerprints of message segments stored in the local cache and a redundancy management engine coupled to the index engine. The redundancy management engine includes an adaptive emitter configured to receive a message segment to be transmitted to a remote device, determine expected latency costs of a plurality of transmission algorithms, and select a transmission algorithm, such as by selecting the lowest expected latency cost. The adaptive emitter is also configured to determine whether the message segment is stored within a remote cache management system associated with the remote device, and transmit the message segment through a network to the remote cache management system using the selected transmission algorithm upon a determination that the message segment is not stored within the remote cache management system.Type: GrantFiled: May 31, 2012Date of Patent: February 3, 2015Assignee: VMware, Inc.Inventors: Liang Cui, Chengzhong Liu, Zhifeng Xia
-
Publication number: 20150032970Abstract: A processing apparatus comprising: several processors for processing data; a hierarchical memory system comprising a memory accessible to all the processors, and several caches corresponding to each of the processors, each of the caches being accessible to the corresponding processor and comprising storage locations and corresponding indicators. There is also cache coherency control circuitry for maintaining coherency of data stored in the hierarchical memory system. The processors are configured to respond to receipt of a predefined request to perform an operation on a data item to determine if the cache corresponding to the processor receiving the request has a storage location allocated to the data item. If not, the processing apparatus is configured to: allocate a storage location within the cache to the data item, set the indicator corresponding to the storage location to indicate that the storage location is storing a delta value, set data in the allocated storage location to an initial value.Type: ApplicationFiled: July 24, 2013Publication date: January 29, 2015Applicant: Arm LimitedInventors: Hedley James Francis, Robert Martin Elliott, Ian Victor Devereux, Daren Croxford
-
Publication number: 20150032971Abstract: In one embodiment, a method for predicting false sharing includes running code on a plurality of cores and tracking potential false sharing in the code while running the code to produce tracked potential false sharing, where tracking the potential false sharing includes determining whether there is potential false sharing between a first cache line and a second cache line, and where the first cache line is adjacent to the second cache line. The method also includes reporting potential false sharing in accordance with the tracked potential false sharing to produce a false sharing report.Type: ApplicationFiled: July 25, 2014Publication date: January 29, 2015Inventors: Chen Tian, Tongping Liu, Ziang Hu
-
Patent number: 8943276Abstract: A plurality of tracks is examined for meeting criteria for a discard scan. In lieu of waiting for a completion of a track access operation, at least one of the plurality of tracks is marked for demotion. An additional discard scan may be subsequently performed for tracks not previously demoted. The discard and additional discard scans may proceed in two phases.Type: GrantFiled: March 14, 2013Date of Patent: January 27, 2015Assignee: International Business Machines CorporationInventors: Michael T. Benhase, Lokesh M. Gupta, Carol S. Mellgren, Kenneth W. Todd
-
Publication number: 20150026416Abstract: Methods for dynamic memory cache size adjustment, enabling dynamic memory cache size adjustment, memory devices, and memory systems are disclosed. One such method for dynamic memory cache size adjustment determines available memory space in a memory array and adjusts a size of a memory cache in the memory array responsive to the available memory space.Type: ApplicationFiled: October 9, 2014Publication date: January 22, 2015Applicant: MICRON TECHNOLOGY, INC.Inventors: Siamack Nemazie, Farshid Tabrizi, Berhanu Iman, Ruchir Shah, William E. Benson, Michael George
-
Patent number: 8938587Abstract: A coherent attached processor proxy (CAPP) that participates in coherence communication in a primary coherent system on behalf of an attached processor external to the primary coherent system tracks delivery of data to destinations in the primary coherent system via one or more entries in a data structure. Each of the one or more entries specifies with a destination tag a destination in the primary coherent system to which data is to be delivered from the attached processor. In response to initiation of recovery operations for the CAPP, the CAPP performs data recovery operations, including transmitting, to at least one destination indicated by the destination tag of one or more entries, an indication of a data error in data to be delivered to that destination from the attached processor.Type: GrantFiled: January 11, 2013Date of Patent: January 20, 2015Assignee: International Business Machines CorporationInventors: Bartholomew Blaner, Kenneth A. Lauricella, Joseph G. McDonald, Michael S. Siegel, Jeff A. Stuecheli
-
Patent number: 8935483Abstract: Described embodiments provide a packet classifier of a network processor having a plurality of processing modules. A scheduler generates a thread of contexts for each tasks generated by the network processor corresponding to each received packet. The thread corresponds to an order of instructions applied to the corresponding packet. A multi-thread instruction engine processes the threads of instructions. A state engine operates on instructions received from the multi-thread instruction engine, the instruction including a cache access request to a local cache of the state engine. A cache line entry manager of the state engine translates between a logical index value of data corresponding to the cache access request and a physical address of data stored in the local cache. The cache line entry manager manages data coherency of the local cache and allows one or more concurrent cache access requests to a given cache data line for non-overlapping data units.Type: GrantFiled: December 22, 2010Date of Patent: January 13, 2015Assignee: LSI CorporationInventor: Jerry Pirog
-
Patent number: 8930635Abstract: Processing within a multiprocessor computer system is facilitated by: setting, in association with invalidate page table entry processing, a storage key at a matching location in central storage of a multiprocessor computer system to a predefined value; and subsequently executing a request to update the storage key to a new storage key, the subsequently executing including determining whether the predefined value is an allowed stale value, and if so, replacing in central storage the storage key of predefined value with the new storage key without requiring purging or updating of the storage key in any local processor cache of the multiprocessor computer system, thus minimizing interprocessor communication pursuant to processing of the request to update the storage key to the new storage key.Type: GrantFiled: December 14, 2009Date of Patent: January 6, 2015Assignee: International Business Machines CorporationInventor: Gary A. Woffinden
-
Patent number: 8930528Abstract: A method of partitioning directory. Accesses, e.g., shared/exclusive, and/or waiting requests, e.g., shared/exclusive, to access one or more files with a directory are monitored, e.g., incrementing/decrementing respective counters. The waiting requests are queued to be granted at a later time. The directory is determined to be primed for partitioning if a number of waiting requests to access the directory is greater than a threshold value of a plurality of heuristics and optionally further based on satisfying the condition for at least a programmable time threshold period. A trigger signal is automatically generated if the directory is primed for partitioning. The trigger signal causes a file system to partition the directory. It is appreciated that the plurality of heuristics is user programmable.Type: GrantFiled: August 16, 2010Date of Patent: January 6, 2015Assignee: Symantec CorporationInventors: Rahul Ravindra Borade, Anindya Banerjee, Kedar Patwardhan
-
Patent number: 8930518Abstract: An application server of a server cluster may store a payload of a write request in a local cache and thereafter serve read requests based on payloads in the local cache if the corresponding data is present when such read requests are received. The payloads are however later propagated to respective data stores at a later suitable time. Each application server in the server cluster retrieves data from the data stores if the required payload is unavailable in the respective local cache. According to another aspect, an application server signals to other application servers of the server cluster if a required payload is unavailable in the local cache. In response, the application server having the specified payload (in local cache) propagates the payload with a higher priority to the corresponding data store, such that the payload is available to the requesting application server.Type: GrantFiled: October 3, 2012Date of Patent: January 6, 2015Assignee: Oracle International CorporationInventors: Krishnamoorthy Dharmalingam, Sankar Mani
-
Patent number: 8930636Abstract: One embodiment sets forth a technique for ensuring relaxed coherency between different caches. Two different execution units may be configured to access different caches that may store one or more cache lines corresponding to the same memory address. During time periods between memory barrier instructions relaxed coherency is maintained between the different caches. More specifically, writes to a cache line in a first cache that corresponds to a particular memory address are not necessarily propagated to a cache line in a second cache before the second cache receives a read or write request that also corresponds to the particular memory address. Therefore, the first cache and the second are not necessarily coherent during time periods of relaxed coherency. Execution of a memory barrier instruction ensures that the different caches will be coherent before a new period of relaxed coherency begins.Type: GrantFiled: July 20, 2012Date of Patent: January 6, 2015Assignee: NVIDIA CorporationInventors: Joel James McCormack, Rajesh Kota, Olivier Giroux, Emmett M. Kilgariff