Shared Cache Patents (Class 711/130)
  • Publication number: 20110191542
    Abstract: Methods and apparatus relating to system-wide quiescence and per-thread transaction fence in a distributed caching agent are described. Some embodiments utilize messages, counters, and/or state machines that support system-wide quiescence and per-thread transaction fence flows. Other embodiments are also disclosed.
    Type: Application
    Filed: December 26, 2010
    Publication date: August 4, 2011
    Inventors: James R. Vash, Bongjin Jung, Rishan Tan
  • Patent number: 7991960
    Abstract: Data store access circuitry is disclosed that comprises: a data store for storing values; comparator circuitry coupled to said data store and responsive to receipt of a data access request comprising an address to compare at least a portion of said address with at least a portion of one or more of said values stored in said data store so as to identify a stored value matching said address; a base value register coupled to said comparator circuitry and storing a base value corresponding to at least a portion of at least one of said stored values; and comparator control circuitry coupled to said comparator circuitry to control: (i) which portion of said address is processed as a non-shared portion and compared by said comparator circuitry with non-shared portions of said one or more stored values stored in said data store; and (ii) which portion of said address is processed as a shared portion and compared by said comparator circuitry with a shared portion of said base value stored in said base value register.
    Type: Grant
    Filed: August 27, 2008
    Date of Patent: August 2, 2011
    Assignee: ARM Limited
    Inventors: Daren Croxford, Timothy Fawcett Milner
  • Publication number: 20110185125
    Abstract: A processor may include several processor cores, each including a respective higher-level cache; a lower-level cache including several tag units each including several controllers, where each controller corresponds to a respective cache bank configured to store data, and where the controllers are concurrently operable to access their respective cache banks; and an interconnect network configured to convey data between the cores and the lower-level cache. The controllers may share access to an interconnect egress port coupled to the interconnect network, and may generate multiple concurrent requests to convey data via the shared port, where each of the requests is destined for a corresponding core, and where a datapath width of the port is less than a combined width of the multiple requests. A given tag unit may arbitrate among its controllers for access to the shared port, such that the requests are transmitted to corresponding cores serially rather than concurrently.
    Type: Application
    Filed: January 27, 2010
    Publication date: July 28, 2011
    Inventors: Prashant Jain, Yoganand Chillarige, Sandip Das, Shukur Moulali Pathan, Srinivasan R. Iyengar, Sanjay Patel
  • Publication number: 20110185126
    Abstract: When a processor has transitioned to an operation stop state, it is possible to reduce the power consumption of a cache memory while maintaining the consistency of cache data. A multiprocessor system includes first and second processors, a shared memory, first and second cache memories, a consistency management circuit for managing consistency of data stored in the first and second cache memories, a request signal line for transmitting a request signal for a data update request from the consistency management circuit to the first and second cache memories, an information signal line for transmitting an information signal for informing completion of the data update from the first and second cache memories to the consistency management circuit, and a cache power control circuit for controlling supply of a clock signal and power to the first and second cache memories in accordance with the request signal and the information signal.
    Type: Application
    Filed: January 24, 2011
    Publication date: July 28, 2011
    Applicant: RENESAS ELECTRONICS CORPORATION
    Inventors: Tsuneki SASAKI, Shuichi KUNIE, Tatsuya KAWASAKI
  • Publication number: 20110185117
    Abstract: According to one embodiment, a system includes a virtual tape library having a cache, a virtual tape controller (VTC) coupled to the virtual tape library, and an interface for coupling at least one host to the VTC. The cache is shared by all the hosts, and a common view of a cache state, a virtual library state, and a number of write requests pending is provided to all the hosts by the VTC. In another embodiment, a method includes receiving data from at least one host using a VTC, storing data received from all the hosts to a cache using the VTC, sending an alert to all the hosts when free space is low and entering into a warning state, sending another alert to all the hosts when free space is critically low and entering into a critical state while allowing previously mounted virtual drives to continue normally.
    Type: Application
    Filed: January 25, 2010
    Publication date: July 28, 2011
    Applicant: International Business Machines Corporation
    Inventors: Ralph T. Beeston, Erika M. Dawson, Duke A. Lee, David Luciani, Joel K. Lyman
  • Patent number: 7987217
    Abstract: Techniques are provided for performing transaction-aware caching of metadata in an electronic file system. A mechanism is described for providing transaction-aware caching that uses a cache hierarchy, where the cache hierarchy includes uncommitted caches associated with sessions in an application and a committed cache that is shared among the sessions in that application. Techniques are described for caching document metadata, access control metadata and folder path metadata. Also described is a technique for using negative cache entries to avoid unnecessary communications with a server when applications repeatedly request non-existent data.
    Type: Grant
    Filed: May 30, 2003
    Date of Patent: July 26, 2011
    Assignee: Oracle International Corporation
    Inventors: David J. Long, David B. Pitfield
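The negative-cache idea in 7987217 above is easy to model in a few lines. The sketch below is a behavioral illustration under our own naming (MetadataCache, _ABSENT, and the layering are assumptions, not Oracle's API): per-session uncommitted caches sit above a shared committed cache, and a sentinel records keys the server has confirmed absent, so repeated requests for non-existent data avoid another server round trip.

```python
# Hypothetical sketch of transaction-aware metadata caching with
# negative entries; all names here are illustrative, not Oracle's.
_ABSENT = object()  # sentinel: server confirmed this key does not exist

class MetadataCache:
    def __init__(self, fetch_from_server):
        self.committed = {}    # shared among sessions in the application
        self.uncommitted = {}  # session -> {key: value}, private
        self.fetch = fetch_from_server

    def get(self, session, key):
        # 1. Session-private uncommitted changes win.
        if key in self.uncommitted.get(session, {}):
            return self.uncommitted[session][key]
        # 2. Shared committed cache, including negative entries.
        if key in self.committed:
            val = self.committed[key]
            return None if val is _ABSENT else val
        # 3. Miss: ask the server once; remember absence as well.
        val = self.fetch(key)
        self.committed[key] = _ABSENT if val is None else val
        return val

    def put_uncommitted(self, session, key, value):
        self.uncommitted.setdefault(session, {})[key] = value

    def commit(self, session):
        # Promote the session's changes into the shared committed cache.
        self.committed.update(self.uncommitted.pop(session, {}))
```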
  • Publication number: 20110179416
    Abstract: A method is provided for use in a system that includes a host machine that includes multiple physical CPUs (PCPUs) and at least two cache nodes that are shared by different sets of the PCPUs, comprising: creating in a memory device multiple sets of lanes each lane set associated with a respective PCPU set; tracking levels of processing activity of the PCPUs of each PCPU set; using an MSIX vector value to associate lanes with PCPUs; receiving a IO request from any given PCPU from among the multiple PCPUs; and assigning the IO request to a respective lane based at least in part upon the PCPU set associated with the lane and PCPU processing activity levels.
    Type: Application
    Filed: January 21, 2010
    Publication date: July 21, 2011
    Applicant: VMWARE, INC.
    Inventors: Vibhor Patale, Rupesh Bajaj, Edward Goggin, Hariharan Subramanian
  • Patent number: 7984241
    Abstract: A plurality of bits are added to virtual and physical memory addresses to specify the level at which data is stored in a multi-level cache hierarchy. When data is to be written to cache, each cache level determines whether it is permitted to store the data. Storing data at the appropriate cache level addresses the problem of cache thrashing.
    Type: Grant
    Filed: July 26, 2006
    Date of Patent: July 19, 2011
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Sudheer Kurichiyath
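A rough behavioral model of 7984241 above: a few address bits name the cache level permitted to hold the data, and each level consults those bits before caching. The two-level hierarchy and bit positions below are our assumptions for illustration.

```python
# Assumed layout: the top 2 bits of a 32-bit address encode the cache
# level permitted to store the data (0 = any, 1 = L1 only, 2 = L2 only).
LEVEL_SHIFT, LEVEL_MASK = 30, 0x3

def permitted_level(addr):
    return (addr >> LEVEL_SHIFT) & LEVEL_MASK

def may_store(cache_level, addr):
    """Each cache level checks whether it is permitted to hold this line."""
    lvl = permitted_level(addr)
    return lvl == 0 or lvl == cache_level

# A streaming buffer tagged "L2 only" never thrashes the L1:
addr = (2 << LEVEL_SHIFT) | 0xBEEF
assert not may_store(1, addr) and may_store(2, addr)
```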
  • Patent number: 7975018
    Abstract: A plurality of access nodes sharing access to data on a storage network implement a directory based cache ownership scheme. One node, designated as a global coordinator, maintains a directory (e.g., table or other data structure) storing information about I/O operations by the access nodes. The other nodes send requests to the global coordinator when an I/O operation is to be performed on identified data. Ownership of that data in the directory is given to the first requesting node. Ownership may transfer to another node if the directory entry is unused or quiescent. The distributed directory-based cache coherency allows for reducing bandwidth requirements between geographically separated access nodes by allowing localized (cached) access to remote data.
    Type: Grant
    Filed: July 7, 2005
    Date of Patent: July 5, 2011
    Assignee: EMC Corporation
    Inventors: Ron Unrau, Steven Bromling, Wayne Karpoff
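The directory scheme of 7975018 above reduces to a small amount of state at the global coordinator. A minimal sketch, with quiescence modeled as a simple flag rather than real activity tracking:

```python
# Toy model of the coordinator's directory: first requester owns the
# data; ownership transfers only when the entry is unused/quiescent.
class OwnershipDirectory:
    def __init__(self):
        self.owner = {}      # data_id -> owning node
        self.quiescent = {}  # data_id -> True if entry is unused

    def request(self, node_id, data_id):
        cur = self.owner.get(data_id)
        if cur is None or self.quiescent.get(data_id, False):
            self.owner[data_id] = node_id       # grant or transfer
            self.quiescent[data_id] = False
            return node_id
        return cur   # requester must defer to the current owner

d = OwnershipDirectory()
assert d.request("node_A", "lun0:blk42") == "node_A"   # first wins
assert d.request("node_B", "lun0:blk42") == "node_A"   # still owned
d.quiescent["lun0:blk42"] = True                       # entry goes idle
assert d.request("node_B", "lun0:blk42") == "node_B"   # transfers
```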
  • Publication number: 20110161596
    Abstract: Techniques are generally described for methods, systems, data processing devices and computer readable media related to multi-core parallel processing directory-based cache coherence. Example systems may include one multi-core processor or multiple multi-core processors. An example multi-core processor includes a plurality of processor cores, each of the processor cores having a respective cache. The system may further include a main memory coupled to each multi-core processor. A directory descriptor cache may be associated with the plurality of the processor cores, where the directory descriptor cache may be configured to store a plurality of directory descriptors. Each of the directory descriptors may provide an indication of the cache sharing status of a respective cache-line-sized row of the main memory.
    Type: Application
    Filed: December 28, 2009
    Publication date: June 30, 2011
    Inventor: Tom Conte
  • Publication number: 20110161586
    Abstract: Technologies are described herein related to multi-core processors that are adapted to share processor resources. An example multi-core processor can include a plurality of processor cores. The multi-core processor further can include a shared register file selectively coupled to two or more of the plurality of processor cores, where the shared register file is adapted to serve as a shared resource among the selected processor cores.
    Type: Application
    Filed: December 29, 2009
    Publication date: June 30, 2011
    Inventors: Miodrag Potkonjak, Nathan Zachary Beckmann
  • Publication number: 20110161630
    Abstract: An apparatus and method are described herein for replacing faulty core components. General purpose hardware is provided to replace core pipeline components, such as execution units. In the embodiment of execution unit replacement, a proxy unit is provided, such that mapping logic is able to map instructions/operations that correspond to faulty execution units to the proxy unit. As a result, the proxy unit is able to receive the operations, send them to general purpose hardware for execution, and subsequently write back the execution results to a register file; it essentially replaces the defective execution unit, allowing a processor with defective units to be sold or to continue operating.
    Type: Application
    Filed: December 28, 2009
    Publication date: June 30, 2011
    Inventors: Steven E. Raasch, Michael D. Powell, Shubhendu S. Mukherjee, Arijit Biswas
  • Publication number: 20110161590
    Abstract: A processing unit includes a store-in lower level cache having reservation logic that determines presence or absence of a reservation and a processor core including a store-through upper level cache, an instruction execution unit, a load unit that, responsive to a hit in the upper level cache on a load-reserve operation generated through execution of a load-reserve instruction by the instruction execution unit, temporarily buffers a load target address of the load-reserve operation, and a flag indicating that the load-reserve operation bound to a value in the upper level cache. If a storage-modifying operation is received that conflicts with the load target address of the load-reserve operation, the processor core sets the flag to a particular state, and, responsive to execution of a store-conditional instruction, transmits an associated store-conditional operation to the lower level cache with a fail indication if the flag is set to the particular state.
    Type: Application
    Filed: December 31, 2009
    Publication date: June 30, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Guy L. Guthrie, William J. Starke, Derek E. Williams
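The flag protocol of 20110161590 above can be stated as a small behavioral model. This sketch is ours, not IBM's design: a load-reserve buffers its target address and notes that it bound to an upper-level value; a conflicting store clears that note, and the later store-conditional carries a fail indication if the note is gone.

```python
# Behavioral model of the buffered load-reserve address and flag.
class Core:
    def __init__(self):
        self.reserve_addr = None
        self.flag_valid = False     # load-reserve bound to an L1 value

    def load_reserve(self, addr, upper_cache):
        self.reserve_addr = addr
        self.flag_valid = addr in upper_cache
        return upper_cache.get(addr)

    def observe_store(self, addr):
        # A conflicting storage-modifying operation clears the flag.
        if addr == self.reserve_addr:
            self.flag_valid = False

    def store_conditional(self, addr, value, lower_cache):
        if not self.flag_valid:
            return False            # sent down with a fail indication
        lower_cache[addr] = value   # lower-level reservation logic decides
        return True

core, l1, l2 = Core(), {0x40: 7}, {}
core.load_reserve(0x40, l1)
core.observe_store(0x40)            # conflicting store arrives first
assert core.store_conditional(0x40, 8, l2) is False
```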
  • Patent number: 7971000
    Abstract: The invention concerns a method and a system for maintaining consistency of a cache memory accessible by multiple independent processes. The processes can share common data. The processes perform simultaneous data searching operations, optionally followed by providing the data to the processes, removing it, or inserting new data. The searching, removal and insertion operations are executed to completion once they have been initiated by the independent processes. They are executed to the exclusion of one another when they must operate on common data. The removal and insertion operations are each completely reversible. In that context, the invention provides that the operations for providing, removing or inserting the data have a finite or bounded duration of execution so as to prevent any locking.
    Type: Grant
    Filed: March 8, 2006
    Date of Patent: June 28, 2011
    Assignee: Amadeus s.a.s.
    Inventors: Frédérick Ros, Rudy Daniello, Luc Isnardy, Claudine Reynaud, Wayne Rubenstein
  • Patent number: 7970999
    Abstract: An information distribution system includes an interconnect and multiple data processing nodes coupled to the interconnect. Each data processing node includes mass storage and a cache. Each data processing node also includes interface logic configured to receive signals from the interconnect and to apply the signals from the interconnect to affect the content of the cache, and to receive signals from the mass storage and to apply the signals from the mass storage to affect the content of the cache. The content of the mass storage and cache of a particular node may also be provided to other nodes of the system, via the interconnect.
    Type: Grant
    Filed: January 22, 2008
    Date of Patent: June 28, 2011
    Assignee: ARRIS Group
    Inventor: Robert C Duzett
  • Publication number: 20110153948
    Abstract: Systems, apparatuses, and methods of monitoring synchronization in a distributed cache are described. In an exemplary embodiment, first and second processing cores process first and second threads respectively. First and second distributed cache slices store data for either or both of the processing cores. First and second core interfaces, co-located with the first and second processing cores respectively, each maintain a finite state machine (FSM) to be executed in response to receiving a request from a thread of the co-located processing core to monitor a cache line in the distributed cache.
    Type: Application
    Filed: December 22, 2009
    Publication date: June 23, 2011
    Inventors: James R. Vash, Bongjin Jung, Rishan Tan
  • Publication number: 20110154345
    Abstract: Implementations and techniques for multicore processors having a domain interconnection network configured to associate a first collision domain network with a second collision domain network in communication are generally disclosed.
    Type: Application
    Filed: December 21, 2009
    Publication date: June 23, 2011
    Inventor: Ezekiel Kruglick
  • Patent number: 7966453
    Abstract: Software indicates to hardware of a processing system that its storage modification to a particular cache line is done, and that it will not be doing any modification for the time being. With this indication, the processor actively releases its exclusive ownership by updating its line ownership from exclusive to read-only (or shared) in its own cache directory and in the storage controller (SC). By actively giving up the exclusive rights, another processor can immediately be given exclusive ownership of that cache line without waiting on any processor's explicit cross-invalidate acknowledgement. The invention also describes the hardware design needed to provide this support.
    Type: Grant
    Filed: December 12, 2007
    Date of Patent: June 21, 2011
    Assignee: International Business Machines Corporation
    Inventors: Chung-Lung Kevin Shum, Kathryn Marie Jackson, Charles Franklin Webb
  • Publication number: 20110145505
    Abstract: Mechanisms are provided, for implementation in a data processing system having at least one physical processor and at least one associated cache memory, for allocating cache resources of the at least one cache memory to virtual processors of the data processing system. The mechanisms identify a plurality of high priority virtual processors in the data processing system. The mechanisms further determine a percentage of cache lines of the at least one cache memory to be assigned to high priority virtual processors. Moreover, the mechanisms mark a portion of the cache lines in the at least one cache memory as being evictable by only high priority virtual processors based on the determined percentage of cache lines to be assigned to high priority virtual processors. The marked portion of the cache lines cannot be evicted by lower priority virtual processors having a priority lower than the high priority virtual processors.
    Type: Application
    Filed: December 15, 2009
    Publication date: June 16, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Vaijayanthimala K. Anand, Diane G. Flemming, William A. Maron, Mysore S. Srinivas
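The marking scheme of 20110145505 above amounts to restricting the victim-selection pool by priority. A minimal sketch with an assumed flat array of lines and a random replacement policy standing in for the real one:

```python
import random

# A percentage of lines is marked evictable only by high-priority
# virtual processors; low-priority VPs choose victims elsewhere.
class PartitionedCache:
    def __init__(self, n_lines, high_prio_pct):
        n_marked = n_lines * high_prio_pct // 100
        self.marked = set(range(n_marked))   # reserved for high priority
        self.n_lines = n_lines

    def pick_victim(self, vp_is_high_priority):
        if vp_is_high_priority:
            candidates = range(self.n_lines)     # may evict anything
        else:
            candidates = [i for i in range(self.n_lines)
                          if i not in self.marked]
        return random.choice(list(candidates))

cache = PartitionedCache(n_lines=8, high_prio_pct=50)
assert cache.pick_victim(vp_is_high_priority=False) >= 4
```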
  • Publication number: 20110145506
    Abstract: In one embodiment, the present invention includes a cache memory including cache lines that each have a tag field including a state portion to store a cache coherency state of data stored in the line and a weight portion to store a weight corresponding to a relative importance of the data. In various implementations, the weight can be based on the cache coherency state and a recency of usage of the data. Other embodiments are described and claimed.
    Type: Application
    Filed: December 16, 2009
    Publication date: June 16, 2011
    Inventors: Naveen Cherukuri, Dennis W. Brzezinski, Ioannis T. Schoinas, Anahita Shayesteh, Akhilesh Kumar, Mani Azimi
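The weighted replacement of 20110145506 above can be sketched as a scoring function over the tag's state portion and an age counter; the particular state weights and the linear decay below are our assumptions, not the patent's.

```python
# Weight = f(coherency state, recency); evict the minimum-weight line.
STATE_WEIGHT = {"M": 3, "E": 2, "S": 1, "I": 0}   # dirtier lines cost
                                                  # more to refetch
def line_weight(state, age):
    # Recency decays the weight; important states resist eviction longer.
    return max(STATE_WEIGHT[state] - age, 0)

def pick_victim(lines):
    """lines: list of (tag, state, age); returns the tag to evict."""
    return min(lines, key=lambda ln: line_weight(ln[1], ln[2]))[0]

victim = pick_victim([("A", "M", 0), ("B", "S", 0), ("C", "E", 5)])
assert victim == "C"   # aged out despite its exclusive state
```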
  • Patent number: 7962693
    Abstract: A cache management system providing improved page latching methodology. A method providing access to data in a multi-threaded computing system comprises: providing a cache containing data pages and a mapping to pages in memory of the multi-threaded computing system; associating a latch with each page in cache to regulate access, the latch allowing multiple threads to share access to the page for reads and a single thread to obtain exclusive access to the page for writes; in response to a request from a first thread to read a particular page, determining whether the particular page is in cache without acquiring any synchronization object regulating access and without blocking access by other threads; if the particular page is in cache, reading the particular page unless another thread has exclusively latched the particular page; and otherwise, if the particular page is not in cache, bringing the page into cache.
    Type: Grant
    Filed: May 17, 2008
    Date of Patent: June 14, 2011
    Assignee: iAnywhere Solutions, Inc.
    Inventor: Peter Bumbulis
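The latching protocol of 7962693 above hinges on probing the cache without any synchronization object and latching only the page itself. A rough sketch, with a plain mutual-exclusion lock standing in for the patent's shared/exclusive latch (Python's standard library has no readers-writer lock):

```python
import threading

class PageCache:
    def __init__(self):
        self.pages = {}   # page_id -> (data, latch)

    def read(self, page_id, load_from_disk):
        entry = self.pages.get(page_id)   # lock-free presence check:
        if entry is None:                 # no sync object acquired
            data = load_from_disk(page_id)
            # setdefault keeps the first insertion if two threads race.
            entry = self.pages.setdefault(
                page_id, (data, threading.Lock()))
        data, latch = entry
        with latch:   # blocks only while a writer holds the page
            return data

pc = PageCache()
print(pc.read(7, lambda p: f"page-{p}"))   # brings page 7 into cache
```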
  • Publication number: 20110138128
    Abstract: A technique to track shared information in a multi-core processor or multi-processor system. In one embodiment, core identification information (“core IDs”) are used to track shared information among multiple cores in a multi-core processor or multiple processors in a multi-processor system.
    Type: Application
    Filed: December 7, 2009
    Publication date: June 9, 2011
    Inventors: Yen-Kuang Chen, Christopher J. Hughes, Changkyu Kim
  • Patent number: 7958317
    Abstract: A technique for performing stream detection and prefetching within a cache memory simplifies stream detection and prefetching. A bit in a cache directory or cache entry indicates that a cache line has not been accessed since being prefetched and another bit indicates the direction of a stream associated with the cache line. A next cache line is prefetched when a previously prefetched cache line is accessed, so that the cache always attempts to prefetch one cache line ahead of accesses, in the direction of a detected stream. Stream detection is performed in response to load misses tracked in the load miss queue (LMQ). The LMQ stores an offset indicating a first miss at the offset within a cache line. A next miss to the line sets a direction bit based on the difference between the first and second offsets and causes prefetch of the next line for the stream.
    Type: Grant
    Filed: August 4, 2008
    Date of Patent: June 7, 2011
    Assignee: International Business Machines Corporation
    Inventors: William E. Speight, Lixin Zhang
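The stream-detection step of 7958317 above is compact enough to state directly. A behavioral sketch, with the line size assumed and the LMQ reduced to a dict of first-miss offsets:

```python
LINE = 128   # assumed cache-line size in bytes

def on_load_miss(lmq, addr, prefetch):
    line, off = addr // LINE, addr % LINE
    if line not in lmq:
        lmq[line] = off          # first miss: record offset in the LMQ
        return
    direction = 1 if off > lmq[line] else -1    # set the direction bit
    prefetch((line + direction) * LINE)         # one line ahead

issued, lmq = [], {}
on_load_miss(lmq, 0x1000, issued.append)   # first miss at offset 0
on_load_miss(lmq, 0x1040, issued.append)   # second miss, ascending
assert issued == [0x1080]                  # next line prefetched
```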
  • Publication number: 20110131377
    Abstract: A multi-core processor chip comprises at least one shared cache having a plurality of ports and a plurality of address spaces and a plurality of processor cores. Each processor core is coupled to one of the plurality of ports such that each processor core is able to access the at least one shared cache simultaneously with another of the plurality of processor cores. Each processor core is assigned one of a unique application or a unique application task and the multi-core processor is operable to execute a partitioning operating system that temporally and spatially isolates each unique application and each unique application task such that each of the plurality of processor cores does not attempt to write to the same address space of the at least one shared cache at the same time as another of the plurality of processor cores.
    Type: Application
    Filed: December 2, 2009
    Publication date: June 2, 2011
    Applicant: HONEYWELL INTERNATIONAL INC.
    Inventors: Scott Gray, Nicholas Wilt
  • Publication number: 20110125971
    Abstract: Various implementations of shared upper level cache architectures are disclosed.
    Type: Application
    Filed: November 24, 2009
    Publication date: May 26, 2011
    Applicants: Empire Technology Development LLC, Glitter Technology LLP
    Inventor: Ezekiel Kruglick
  • Publication number: 20110119446
    Abstract: A method, system and computer program product are disclosed for implementing load-reserve and store-conditional instructions in a multi-processor computing system. The computing system includes a multitude of processor units and a shared memory cache, and each of the processor units has access to the memory cache. In one embodiment, the method comprises providing the memory cache with a series of reservation registers, and storing in these registers addresses reserved in the memory cache for the processor units as a result of issuing load-reserve requests. In this embodiment, when one of the processor units makes a request to store data in the memory cache using a store-conditional request, the reservation registers are checked to determine if an address in the memory cache is reserved for that one of the processor units. If an address in the memory cache is reserved for that one of the processors, the data are stored at this reserved address.
    Type: Application
    Filed: February 1, 2010
    Publication date: May 19, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Matthias A. Blumrich, Martin Ohmacht
  • Publication number: 20110119447
    Abstract: According to embodiments described in the specification, a method and apparatus for managing memory in a mobile electronic device are provided. The method comprises: receiving a request to install an application; receiving at least one indication of data intended to be maintained in a shared cache; determining, based on the at least one indication, whether data corresponding to the intended data exists in the shared cache; upon a negative determination, writing the intended data to the shared cache; and repeating the receiving at least one indication, the determining and the writing for at least one additional application.
    Type: Application
    Filed: November 18, 2009
    Publication date: May 19, 2011
    Applicant: Research in Motion Limited
    Inventor: Ankur AGGARWAL
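The install-time flow of 20110119447 above is essentially deduplication against the shared cache. A minimal sketch (the manifest shape and keys are assumptions):

```python
# For each indication of data the new application intends to keep in
# the shared cache, write it only if no corresponding entry exists.
def install(app_manifest, shared_cache):
    """app_manifest: iterable of (key, data) the app wants cached."""
    for key, data in app_manifest:
        if key not in shared_cache:    # negative determination
            shared_cache[key] = data   # write the intended data

cache = {"font:Roboto": b"..."}
install([("font:Roboto", b"..."), ("icons:v2", b"....")], cache)
assert set(cache) == {"font:Roboto", "icons:v2"}   # no duplicate write
```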
  • Publication number: 20110113199
    Abstract: An apparatus and method are described herein for optimizing prefetch throttling, which potentially enhances performance, reduces power consumption, and maintains positive gain for workloads that benefit from prefetching. More specifically, the optimizations described herein allow bandwidth congestion and prefetch accuracy to be taken into account as feedback for throttling at the source of prefetch generation. As a result, when there is low congestion, full prefetch generation is allowed, even if the prefetch is inaccurate, since there is available bandwidth. However, when congestion is high, the throttling determination falls to prefetch accuracy. If accuracy is high (that is, the miss rate is low), less throttling is needed, because the prefetches are being utilized and performance is being enhanced.
    Type: Application
    Filed: November 9, 2009
    Publication date: May 12, 2011
    Inventors: Puqi P. Tang, Hemant G. Rotithor, Ryan L. Carlson, Nagi Aboulenein
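The two-feedback throttle of 20110113199 above reduces to a small decision function; the thresholds and the linear accuracy scaling below are illustrative assumptions.

```python
# Low congestion: generate freely regardless of accuracy.
# High congestion: throttle in proportion to prefetch inaccuracy.
def prefetch_degree(congestion, accuracy, max_degree=4):
    if congestion < 0.5:            # bandwidth to spare
        return max_degree
    return round(max_degree * accuracy)

assert prefetch_degree(congestion=0.2, accuracy=0.1) == 4  # cheap anyway
assert prefetch_degree(congestion=0.9, accuracy=0.9) == 4  # useful: keep
assert prefetch_degree(congestion=0.9, accuracy=0.2) == 1  # back off
```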
  • Publication number: 20110113200
    Abstract: Embodiments of an apparatus for controlling cache occupancy rates are presented. In one embodiment, an apparatus comprises a controller and monitor logic. The monitor logic determines a monitored occupancy rate associated with a first program class. The controller regulates a first allocation probability corresponding to the first program class, based at least on the difference between a requested occupancy rate and the monitored occupancy rate.
    Type: Application
    Filed: November 10, 2009
    Publication date: May 12, 2011
    Inventors: Jaideep Moses, Rameshkumar G. Illikkal, Donald K. Newell, Ravishankar Iyer, Kostantinos Aisopos, Li Zhao
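The occupancy controller of 20110113200 above behaves like a simple feedback loop: the allocation probability is nudged by the gap between the requested and monitored occupancy rates. A sketch with an assumed proportional gain:

```python
def update_allocation_probability(p, requested, monitored, gain=0.5):
    # Raise p when the class is under its requested occupancy,
    # lower it when the class is over; clamp to a valid probability.
    p += gain * (requested - monitored)
    return min(max(p, 0.0), 1.0)

p = 0.5
for measured in (0.10, 0.15, 0.22, 0.28):   # class creeps toward target
    p = update_allocation_probability(p, requested=0.30,
                                      monitored=measured)
print(f"allocation probability settles near {p:.2f}")
```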
  • Patent number: 7941603
    Abstract: An advanced processor comprises a plurality of multithreaded processor cores each having a data cache and instruction cache. A data switch interconnect is coupled to each of the processor cores and configured to pass information among the processor cores. A messaging network is coupled to each of the processor cores and a plurality of communication ports. In one aspect of an embodiment of the invention, the data switch interconnect is coupled to each of the processor cores by its respective data cache, and the messaging network is coupled to each of the processor cores by its respective message station. Advantages of the invention include the ability to provide high bandwidth communications between computer systems and memory in an efficient and cost-effective manner.
    Type: Grant
    Filed: November 30, 2009
    Date of Patent: May 10, 2011
    Assignee: NetLogic Microsystems, Inc.
    Inventor: David T. Hass
  • Patent number: 7941585
    Abstract: A RISC-type processor includes a main register file and a data cache. The data cache can be partitioned to include a local memory, the size of which can be dynamically changed on a cache block basis while the processor is executing instructions that use the main register file. The local memory can emulate as an additional register file to the processor and can reside at a virtual address. The local memory can be further partitioned for prefetching data from a non-cacheable address to be stored/loaded into the main register file.
    Type: Grant
    Filed: December 17, 2004
    Date of Patent: May 10, 2011
    Assignee: Cavium Networks, Inc.
    Inventors: David H. Asher, David A. Carlson, Richard E. Kessler
  • Patent number: 7937532
    Abstract: In some embodiments, the invention involves a novel combination of techniques for prefetching data and passing messages between and among cores in a multi-processor/multi-core platform. In an embodiment, a receiving core has a message queue and a message prefetcher. Incoming messages are simultaneously written to the message queue and the message prefetcher. The prefetcher speculatively fetches data referenced in the received message so that the data is available when the message is executed in the execution pipeline, or shortly thereafter. Other embodiments are described and claimed.
    Type: Grant
    Filed: March 30, 2007
    Date of Patent: May 3, 2011
    Assignee: Intel Corporation
    Inventors: Aaron Kunze, Erik J. Johnson, Hermann Gartler
  • Publication number: 20110087843
    Abstract: An apparatus, method, and system are disclosed. In one embodiment, the apparatus includes a cache memory having a number of sets, each of which has several cache lines. The apparatus also includes at least one process resource table. The process resource table maintains a cache line occupancy count for a number of cache lines; specifically, the occupancy count describes the number of cache lines in the cache storing information utilized by a process running on a computer system. Additionally, the process resource table stores the occupancy count of fewer cache lines than the total number of cache lines in the cache memory.
    Type: Application
    Filed: October 9, 2009
    Publication date: April 14, 2011
    Inventors: Li Zhao, Ravishankar Iyer, Rameshkumar G. Illikkal, Erik G. Hallnor, Martin G. Dixon, Donald K. Newell
  • Patent number: 7921407
    Abstract: Transaction code written by the programmer may be translated, replaced or transformed into code that is configured to implement transactions according to any of various techniques. A compiler may replace programmer-written transaction code with code allowing multiple compatible transaction implementation techniques to be used in the same program at the same time. A programmer may write transaction code once using familiar coding styles, with the transaction effected according to one of a number of compatible alternative implementation techniques. The compiler may enable the implementation of multiple, alternative transactional memory schemes. The particular technique implemented for each transaction may not be decided until runtime. At runtime, any of the various implemented techniques may be used to effect the transaction, and if a first technique fails or is inappropriate for a particular transaction, one or more other techniques may be attempted.
    Type: Grant
    Filed: November 2, 2006
    Date of Patent: April 5, 2011
    Assignee: Oracle America, Inc.
    Inventors: Peter C. Damron, Yosef Lev, Mark S. Moir
  • Publication number: 20110072217
    Abstract: A plurality of mid-tier databases form a single, consistent cache grid for data in one or more backend data sources, such as a database system. The mid-tier databases may be standard relational databases. Cache agents at each mid-tier database swap in data from the backend database as needed. Consistency in the cache grid is maintained by ownership locks. Cache agents prevent database operations that will modify cached data in a mid-tier database unless and until ownership of the cached data can be acquired for the mid-tier database. Cache groups define what backend data may be cached, as well as a general structure in which the backend data is to be cached. Metadata for cache groups is shared to ensure that data is cached in the same form throughout the entire grid. Ownership of cached data can then be tracked through a mapping of cached instances of data to particular mid-tier databases.
    Type: Application
    Filed: September 18, 2009
    Publication date: March 24, 2011
    Inventors: Chi Hoang, Tirthankar Lahiri, Marie-Anne Neimat, Chih-Ping Wang, John Miller, Dilys Thomas, Nagender Bandi, Susan Cheng
  • Publication number: 20110060880
    Abstract: A multiprocessor according to an embodiment of the present invention comprises a provisional determination unit that provisionally determines one transfer source for each transfer destination by performing predetermined prediction processing based on monitoring the transfer of cache data among cache memories. After the provisional determination result is obtained, a data transfer unit activates only the tag cache corresponding to the provisionally determined transfer source when the cache data is transferred, and determines whether the cache data corresponding to a refill request is cached by referring only to the activated tag cache.
    Type: Application
    Filed: July 29, 2010
    Publication date: March 10, 2011
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Soichiro Hosoda
  • Publication number: 20110055482
    Abstract: Various example embodiments are disclosed. According to an example embodiment, a shared cache may be configured to determine whether a word requested by one of the L1 caches is currently stored in the L2 shared cache, read the requested word from the main memory based on determining that the requested word is not currently stored in the L2 shared cache, determine whether at least one line in a way reserved for the requesting L1 cache is unused, store the requested word in the at least one line based on determining that the at least one line in the reserved way is unused, and store the requested word in a line of the L2 shared cache outside the reserved way based on determining that the at least one line in the reserved way is not unused.
    Type: Application
    Filed: November 25, 2009
    Publication date: March 3, 2011
    Applicant: Broadcom Corporation
    Inventors: Kimming So, Binh Truong
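The fill policy of 20110055482 above tries the requesting L1's reserved way first and spills outside it when full. A toy model (the geometry, 4 ways with way 0 reserved, is an assumption):

```python
class SharedL2:
    def __init__(self, n_ways=4, reserved_way=0):
        self.lines = {}   # (set_index, way) -> cached word
        self.n_ways, self.reserved = n_ways, reserved_way

    def fill(self, set_idx, word):
        slot = (set_idx, self.reserved)
        if self.lines.get(slot) is None:     # reserved line unused
            self.lines[slot] = word
            return slot
        for way in range(self.n_ways):       # spill outside the way
            if (way != self.reserved
                    and self.lines.get((set_idx, way)) is None):
                self.lines[(set_idx, way)] = word
                return (set_idx, way)
        return None   # set full: a normal replacement would run here

l2 = SharedL2()
assert l2.fill(2, "w0") == (2, 0)   # lands in the reserved way
assert l2.fill(2, "w1") == (2, 1)   # reserved busy: goes outside it
```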
  • Publication number: 20110055487
    Abstract: In one embodiment, the present invention includes a method to obtain topology information regarding a system including at least one multicore processor, provide the topology information to a plurality of parallel processes, generate a topological map based on the topology information, access the topological map to determine a topological relationship between a sender process and a receiver process, and select a given memory copy routine to pass a message from the sender process to the receiver process based at least in part on the topological relationship. Other embodiments are described and claimed.
    Type: Application
    Filed: March 31, 2008
    Publication date: March 3, 2011
    Inventors: Sergey I. Sapronov, Alexey V. Bayduraev, Alexander V. Supalov, Vladimir D. Truschin, Igor Ermolaev, Dmitry Mishura
  • Publication number: 20110055827
    Abstract: A mechanism is provided in a virtual machine monitor for providing cache partitioning in virtualized environments. The mechanism assigns a virtual identification (ID) to each virtual machine in the virtualized environment. The processing core stores the virtual ID of the virtual machine in a special register. The mechanism also creates an entry for the virtual machine in a partition table. The mechanism may partition a shared cache using a vertical (way) partition and/or a horizontal partition. The entry in the partition table includes a vertical partition control and a horizontal partition control. For each cache access, the virtual machine passes the virtual ID along with the address to the shared cache. If the cache access results in a miss, the shared cache uses the partition table to select a victim cache line for replacement.
    Type: Application
    Filed: August 25, 2009
    Publication date: March 3, 2011
    Applicant: International Business Machines Corporation
    Inventors: Jiang Lin, Lixin Zhang
  • Patent number: 7899995
    Abstract: An array of streaming multiprocessors shares data via a shared memory. A flushing mechanism is used to guarantee that data required for dependent computations is available in the shared memory.
    Type: Grant
    Filed: March 3, 2009
    Date of Patent: March 1, 2011
    Assignee: NVIDIA Corporation
    Inventor: Radoslav Danilak
  • Patent number: 7899663
    Abstract: Memory consistency is provided in an emulated processing environment. A processor architected with a weak memory consistency emulates an architecture having a firm memory consistency. This memory consistency is provided without requiring serialization instructions or special hardware.
    Type: Grant
    Filed: March 30, 2007
    Date of Patent: March 1, 2011
    Assignee: International Business Machines Corporation
    Inventors: Theodore J. Bohizic, Mark H. Decker, Viktor S. Gyuris
  • Patent number: 7895399
    Abstract: A processor reads a program including a prefetch command and a load command and data from a main memory, and executes the program. The processor includes: a processor core that executes the program; a L2 cache that stores data on the main memory for each predetermined unit of data storage; and a prefetch unit that pre-reads the data into the L2 cache from the main memory on the basis of a request for prefetch from the processor core. The prefetch unit includes: a L2 cache management table including an area in which a storage state is held for each position in the unit of data storage of the L2 cache and an area in which a request for prefetch is reserved; and a prefetch control unit that instructs, the L2 cache to perform the request for prefetch reserved or the request for prefetch from the processor core.
    Type: Grant
    Filed: February 13, 2007
    Date of Patent: February 22, 2011
    Assignee: Hitachi, Ltd.
    Inventors: Aki Tomita, Naonobu Sukegawa
  • Patent number: 7895396
    Abstract: Provided is a storage system having improved access performance. The storage system includes: a hard disk drive, and a storage controller for reading/writing data from/to the hard disk drive, the storage controller including: at least one interface connected to a host computer through a network; and a plurality of processors connected to the interface through an internal network. The storage system is characterized in that: the processor provides at least one logical access port to the host computer; and the interface stores routing information including a processor which processes an access request addressed to the logical access port, extracts an address from the received access request upon reception of the access request from the host computer, specifies the processor which processes the received access request based on the routing information and the extracted address, and transfers the received access request to the specified processor.
    Type: Grant
    Filed: August 3, 2009
    Date of Patent: February 22, 2011
    Assignee: Hitachi, Ltd.
    Inventors: Akira Fujibayashi, Shuji Nakamura, Mutsumi Hosoya
  • Patent number: 7890701
    Abstract: A method and system for dynamic distributed data caching includes providing a cache community of peer members and a master member. When the master member volunteers to leave the cache community, a peer member is selected to become the new master member. Each peer member has an associated first content portion indicating content to be cached by the respective peer. A client may be allowed to join the cache community. A peer list associated with the cache community, indicating the peers in the community, is updated to include the client. A respective second content portion is associated with each peer based on the addition of the client.
    Type: Grant
    Filed: June 1, 2010
    Date of Patent: February 15, 2011
    Assignee: Parallel Networks, LLC
    Inventors: Keith A. Lowery, Bryan S. Chin, David A. Consolver, Gregg A. DeMasters
  • Publication number: 20110029736
    Abstract: The storage controller of the present invention is able to reduce the amount of purge-message communication and increase the processing performance of the storage controller. Each microprocessor creates and saves a purge message every time control information in the shared memory is updated. After a series of update processes is complete, the saved purge messages are transmitted to each microprocessor. An attribute corresponding to its characteristics is assigned to the control information, and cache control and purge control are executed depending on that attribute.
    Type: Application
    Filed: February 17, 2009
    Publication date: February 3, 2011
    Inventors: Kei Sato, Takeo Fujimoto, Osamu Sakaguchi
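The batching scheme of 20110029736 above trades one purge transmission per update for one per update series. A minimal sketch of that accumulation:

```python
class PurgeBatcher:
    def __init__(self, send):
        self.pending, self.send = [], send

    def on_update(self, region):
        self.pending.append(region)          # save, don't transmit yet

    def end_of_series(self):
        if self.pending:
            self.send(tuple(self.pending))   # one combined transmission
            self.pending.clear()

sent = []
b = PurgeBatcher(sent.append)
for r in ("ctl:0x100", "ctl:0x180", "ctl:0x200"):
    b.on_update(r)                 # three updates, zero transmissions
b.end_of_series()
assert sent == [("ctl:0x100", "ctl:0x180", "ctl:0x200")]   # one, not three
```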
  • Publication number: 20110022773
    Abstract: A mechanism is provided in a virtual machine monitor for fine grained cache allocation in a shared cache. The mechanism partitions a cache tag into a most significant bit (MSB) portion and a least significant bit (LSB) portion. The MSB portion of the tags is shared among the cache lines in a set. The LSB portion of the tags is private, one per cache line. The mechanism allows software to set the MSB portion of tags in a cache to allocate sets of cache lines. The cache controller determines whether a cache line is locked based on the MSB portion of the tag.
    Type: Application
    Filed: July 27, 2009
    Publication date: January 27, 2011
    Applicant: International Business Machines Corporation
    Inventors: Ramakrishnan Rajamony, William E. Speight, Lixin Zhang
  • Publication number: 20110022803
    Abstract: An approach is provided to identify a disabled processing core and an active processing core from a set of processing cores included in a processing node. Each of the processing cores is assigned a cache memory. The approach extends a memory map of the cache memory assigned to the active processing core to include the cache memory assigned to the disabled processing core. A first amount of data that is used by a first process is stored by the active processing core to the cache memory assigned to the active processing core. A second amount of data is stored by the active processing core to the cache memory assigned to the disabled processing core using the extended memory map.
    Type: Application
    Filed: July 24, 2009
    Publication date: January 27, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Diane Garza Flemming, William A. Maron, Ram Raghavan, Mysore Sathyanarayana Srinivas, Basu Vaidyanathan
  • Publication number: 20110010504
    Abstract: In one embodiment, a memory that is delineated into transparent and non-transparent portions. The transparent portion may be controlled by a control unit coupled to the memory, along with a corresponding tag memory. The non-transparent portion may be software controlled by directly accessing the non-transparent portion via an input address. In an embodiment, the memory may include a decoder configured to decode the address and select a location in either the transparent or non-transparent portion. Each request may include a non-transparent attribute identifying the request as either transparent or non-transparent. In an embodiment, the size of the transparent portion may be programmable. Based on the non-transparent attribute indicating transparent, the decoder may selectively mask bits of the address based on the size to ensure that the decoder only selects a location in the transparent portion.
    Type: Application
    Filed: July 10, 2009
    Publication date: January 13, 2011
    Inventors: James Wang, Zongjian Chen, James B. Keller, Timothy J. Millet
  • Publication number: 20110004729
    Abstract: Methods, apparatuses, and systems directed to the caching of blocks of lines of memory in a cache-coherent, distributed shared memory system. Block caches used in conjunction with line caches can be used to store more data with less tag memory space compared to the use of line caches alone and can therefore reduce memory requirements. In one particular embodiment, the present invention manages this caching using a DSM-management chip, after the allocation of the blocks by software, such as a hypervisor. An example embodiment provides processing relating to block caches in cache-coherent distributed shared memory.
    Type: Application
    Filed: December 19, 2007
    Publication date: January 6, 2011
    Applicant: 3Leaf Systems, Inc.
    Inventors: Isam Akkawi, Najeeb Imran Ansari, Bryan Chin, Chetana Nagendra Keltcher, Krishnan Subramani, Janakiramanan Vaidyanathan
  • Publication number: 20100332763
    Abstract: An apparatus, system, and method are disclosed for improving cache coherency processing. The method includes determining that a first processor in a multiprocessor system receives a cache miss. The method also includes determining whether an application associated with the cache miss is running on a single processor core and/or whether the application is running on two or more processor cores that share a cache. A cache coherency algorithm is executed in response to determining that the application associated with the cache miss is running on two or more processor cores that do not share a cache, and is skipped in response to determining that the application is running on either a single processor core or on two or more processor cores that share a cache.
    Type: Application
    Filed: June 30, 2009
    Publication date: December 30, 2010
    Applicant: International Business Machines Corporation
    Inventors: Marcus L. Kornegay, Ngan N. Pham
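The decision in 20100332763 above is a set comparison: run the coherency algorithm only when the missing application spans cores that do not share a cache. A sketch, with cache sharing modeled as groups of core IDs:

```python
def needs_coherency(app_cores, shared_cache_groups):
    if len(app_cores) <= 1:
        return False                        # single core: skip it
    return not any(app_cores <= group       # all cores in one group?
                   for group in shared_cache_groups)

groups = [{0, 1}, {2, 3}]                   # cores sharing an L2, say
assert needs_coherency({0}, groups) is False
assert needs_coherency({0, 1}, groups) is False   # shared cache: skip
assert needs_coherency({1, 2}, groups) is True    # crosses caches: run
```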