With Multilevel Cache Hierarchies (epo) Patents (Class 711/E12.024)

Cache Streaming System

Publication number: 20120317360

Abstract: A system, having a stream cache and a storage. The stream cache includes a stream cache controller adapted to control or mediate input data transmitted through the stream cache; and a stream cache memory. The stream cache memory is adapted to both store at least first portions of the input data, as determined by the stream cache controller, and to further output the stored first portions of the input data to a processor. The storage is adapted to receive and store second portions of the input data, as determined by the stream cache controller, and to further transmit the stored second portions of the input data for output to the processor.

Type: Application

Filed: May 16, 2012

Publication date: December 13, 2012

Applicant: LANTIQ DEUTSCHLAND GMBH

Inventors: Thomas Zettler, Gunther Fenzl, Olaf Wachendorf, Raimar Thudt, Ritesh Banerjee
EXTERNAL CACHE OPERATION BASED ON CLEAN CASTOUT MESSAGES

Publication number: 20120311267

Abstract: A processor transmits clean castout messages indicating that a cache line is not dirty and is no longer being stored by a lowest level cache of the processor. An external cache receives the clean castout messages and manages cache lines based in part on the clean castout messages.

Type: Application

Filed: May 31, 2011

Publication date: December 6, 2012

Inventors: Blaine D. Gaither, David A. Plettner
Synchronizing access to data in shared memory via upper level cache queuing

Patent number: 8327074

Abstract: A processing unit includes a store-in lower level cache having reservation logic that determines presence or absence of a reservation and a processor core including a store-through upper level cache, an instruction execution unit, a load unit that, responsive to a hit in the upper level cache on a load-reserve operation generated through execution of a load-reserve instruction by the instruction execution unit, temporarily buffers a load target address of the load-reserve operation, and a flag indicating that the load-reserve operation bound to a value in the upper level cache. If a storage-modifying operation is received that conflicts with the load target address of the load-reserve operation, the processor core sets the flag to a particular state, and, responsive to execution of a store-conditional instruction, transmits an associated store-conditional operation to the lower level cache with a fail indication if the flag is set to the particular state.

Type: Grant

Filed: April 12, 2012

Date of Patent: December 4, 2012

Assignee: International Business Machines Corporation

Inventors: Guy L. Guthrie, William J. Starke, Derek E. Williams
Empirically based dynamic control of acceptance of victim cache lateral castouts

Patent number: 8327073

Abstract: A second lower level cache receives an LCO command issued by a first lower level cache on an interconnect fabric. The LCO command indicates an address of a victim cache line to be castout from the first lower level cache and indicates that the second lower level cache is an intended destination of the victim cache line. The second lower level cache determines whether to accept the victim cache line from the first lower level cache based at least in part on the address of the victim cache line indicated by the LCO command. In response to determining not to accept the victim cache line, the second lower level cache provides a coherence response to the LCO command refusing the identified victim cache line. In response to determining to accept the victim cache line, the second lower level cache updates an entry corresponding to the identified victim cache line.

Type: Grant

Filed: April 9, 2009

Date of Patent: December 4, 2012

Assignee: International Business Machines Corporation

Inventors: Guy L. Guthrie, Harmony L. Helterhoff, Thomas L. Jeremiah, Alvan W. Ng, William J. Starke, Jeffrey A. Stuecheli, Philip G. Williams
MEMORY MANAGEMENT UNIT, APPARATUSES INCLUDING THE SAME, AND METHOD OF OPERATING THE SAME

Publication number: 20120297139

Abstract: A method of operating a memory management unit includes accessing a translation lookaside buffer (TLB), translating a page number of a virtual address into a frame number of a physical address when there is a match for the page number of the virtual address in the TLB, and executing a miss process when there is no match for the page number of the virtual address in the TLB. The miss process includes accessing a page table translation (PTT) cache, checking whether access information of a k-th level page table corresponding to a k-th page number that will be accessed in the virtual address is in the PTT cache, acquiring a base address of a physical page using the access information, and determining the frame number of the physical address corresponding to the page number of the virtual address using a page offset in the physical page.

Type: Application

Filed: May 17, 2012

Publication date: November 22, 2012

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventor: JIN HYUCK CHOI
SAVING LOG DATA USING A DISK SYSTEM AS PRIMARY CACHE AND A TAPE LIBRARY AS SECONDARY CACHE

Publication number: 20120272005

Abstract: Various embodiments save a plurality of log data in a hierarchical storage management system using a disk system as a primary cache with a tape library as a secondary cache. The user data is stored in the primary cache and written into the secondary cache at a subsequent period of time. The plurality of blank tapes in the secondary cache is prepared for storing the user data and the plurality of log data based on priorities. At least one of the plurality of blank tapes is selected for copying the plurality of log data and the user data from the primary cache to the secondary cache based on priorities. The plurality of log data is stored in the primary cache. The selection of at least one of the plurality of blank tapes completely filled with the plurality of log data is delayed for writing additional amounts of user data.

Type: Application

Filed: June 11, 2012

Publication date: October 25, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Norie IWASAKI, Koichi MASUDA, Tadaaki MINOURA, Tomokazu NAKAMURA, Takeshi SOHDA, Takahiro TSUDA
EFFICIENT DATA PREFETCHING IN THE PRESENCE OF LOAD HITS

Publication number: 20120272003

Abstract: A microprocessor configured to access an external memory includes a first-level cache, a second-level cache, and a bus interface unit (BIU) configured to interface the first-level and second-level caches to a bus used to access the external memory. The BIU is configured to prioritize requests from the first-level cache above requests from the second-level cache. The second-level cache is configured to generate a first request to the BIU to fetch a cache line from the external memory. The second-level cache is also configured to detect that the first-level cache has subsequently generated a second request to the second-level cache for the same cache line. The second-level cache is also configured to request the BIU to refrain from performing a transaction on the bus to fulfill the first request if the BIU has not yet been granted ownership of the bus to fulfill the first request.

Type: Application

Filed: June 27, 2012

Publication date: October 25, 2012

Applicant: VIA TECHNOLOGIES, INC.

Inventors: Clinton Thomas Glover, Colin Eddy, Rodney E. Hooker, Albert J. Loper
EFFICIENT DATA PREFETCHING IN THE PRESENCE OF LOAD HITS

Publication number: 20120272004

Abstract: A memory subsystem in a microprocessor includes a first-level cache, a second-level cache, and a prefetch cache configured to speculatively prefetch cache lines from a memory external to the microprocessor. The second-level cache and the prefetch cache are configured to allow the same cache line to be simultaneously present in both. If a request by the first-level cache for a cache line hits in both the second-level cache and in the prefetch cache, the prefetch cache invalidates its copy of the cache line and the second-level cache provides the cache line to the first-level cache.

Type: Application

Filed: June 27, 2012

Publication date: October 25, 2012

Applicant: VIA TECHNOLOGIES, INC.

Inventors: Clinton Thomas Glover, Colin Eddy, Rodney E. Hooker, Albert J. Loper
Synchronizing access to data in shared memory via upper level cache queuing

Patent number: 8296519

Abstract: A processing unit includes a store-in lower level cache having reservation logic that determines presence or absence of a reservation and a processor core including a store-through upper level cache, an instruction execution unit, a load unit that, responsive to a hit in the upper level cache on a load-reserve operation generated through execution of a load-reserve instruction by the instruction execution unit, temporarily buffers a load target address of the load-reserve operation, and a flag indicating that the load-reserve operation bound to a value in the upper level cache. If a storage-modifying operation is received that conflicts with the load target address of the load-reserve operation, the processor core sets the flag to a particular state, and, responsive to execution of a store-conditional instruction, transmits an associated store-conditional operation to the lower level cache with a fail indication if the flag is set to the particular state.

Type: Grant

Filed: December 31, 2009

Date of Patent: October 23, 2012

Assignee: International Business Machines Corporation

Inventors: Guy L. Guthrie, William J. Starke, Derek E. Williams
PERFORMING A PARTIAL CACHE LINE STORAGE-MODIFYING OPERATION BASED UPON A HINT

Publication number: 20120265938

Abstract: Analyzing pre-processed code includes identifying at least one storage-modifying construct specifying a storage-modifying memory access to a memory hierarchy of a data processing system and determining if more than one granule of a cache line of data containing multiple granules that is targeted by the storage-modifying construct is subsequently referenced by said pre-processed code. Post-processed code including a storage-modifying instruction corresponding to the at least one storage-modifying construct in the pre-processed code is generated and stored.

Type: Application

Filed: January 12, 2012

Publication date: October 18, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ravi K. Arimilli, Guy L. Guthrie, William J. Starke, Derek E. Williams
Cache line use history based done bit modification to D-cache replacement scheme

Patent number: 8291169

Abstract: A method of providing history based done logic includes receiving a cache line in an L2 cache; determining if the cache line has a history of access at least three times on a previous call into the L2 cache; providing the cache line directly to a processor if the history of access was less than the at least three times; and loading the cache line into an L1 cache if the history of access was the at least three times.

Type: Grant

Filed: May 28, 2009

Date of Patent: October 16, 2012

Assignee: International Business Machines Corporation

Inventor: David A. Luick
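
The decision described in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `L2Line` record, its `prior_access_count` field, and the `route_line` function are hypothetical names introduced here, and the sketch assumes the access history is available as a simple counter.

```python
from dataclasses import dataclass

# "At least three times", as stated in the abstract.
ACCESS_THRESHOLD = 3


@dataclass
class L2Line:
    """Hypothetical record for a cache line arriving in the L2 cache."""
    address: int
    prior_access_count: int  # assumed: accesses seen on the previous call into L2


def route_line(line: L2Line) -> str:
    """Return "L1" to load the line into the L1 cache, or "bypass" to
    provide it directly to the processor without allocating in L1."""
    if line.prior_access_count >= ACCESS_THRESHOLD:
        return "L1"
    return "bypass"
```

Under these assumptions, a line with three recorded accesses is loaded into L1, while a line with two is forwarded to the processor only.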
SIMULTANEOUS EVICTION AND CLEANING OPERATIONS IN A CACHE

Publication number: 20120260041

Abstract: Embodiments provide a method comprising receiving, at a cache associated with a central processing unit that is disposed on an integrated circuit, a request to perform a cache operation on the cache; in response to receiving and processing the request, determining that first data cached in a first cache line of the cache is to be written to a memory that is coupled to the integrated circuit; identifying a second cache line in the cache, the second cache line being complementary to the first cache line; transmitting a single memory instruction from the cache to the memory to write to the memory (i) the first data from the first cache line and (ii) second data from the second cache line; and invalidating the first data in the first cache line, without invalidating the second data in the second cache line.

Type: Application

Filed: April 4, 2012

Publication date: October 11, 2012

Inventors: Adi Habusha, Eitan Joshua, Shaul Chapman
ENHANCED PIPELINING AND MULTI-BUFFER ARCHITECTURE FOR LEVEL TWO CACHE CONTROLLER TO MINIMIZE HAZARD STALLS AND OPTIMIZE PERFORMANCE

Publication number: 20120260031

Abstract: This invention is a data processing system including a central processing unit, an external interface, a level one cache, level two memory including level two unified cache and directly addressable memory. A level two memory controller includes a directly addressable memory read pipeline, a central processing unit write pipeline, an external cacheable pipeline and an external non-cacheable pipeline.

Type: Application

Filed: September 26, 2011

Publication date: October 11, 2012

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Abhijeet Ashok Chachad, Raguram Damodaran
Fused store exclusive/memory barrier operation

Patent number: 8285937

Abstract: In an embodiment, a processor may be configured to detect a store exclusive operation followed by a memory barrier operation in a speculative instruction stream being executed by the processor. The processor may fuse the store exclusive operation and the memory barrier operation, creating a fused operation. The fused operation may be transmitted and globally ordered, and the processor may complete both the store exclusive operation and the memory barrier operation in response to the fused operation. As the fused operation progresses through the processor and one or more other components (e.g. caches in the cache hierarchy) to the ordering point in the system, the fused operation may push previous memory operations to effect the memory barrier operation. In some embodiments, the latency for completing the store exclusive operation and the subsequent data memory barrier operation may be reduced if the store exclusive operation is successful at the ordering point.

Type: Grant

Filed: February 24, 2010

Date of Patent: October 9, 2012

Assignee: Apple Inc.

Inventors: Peter J. Bannon, Po-Yung Chang
Cache access filtering for processors without secondary miss detection

Patent number: 8285926

Abstract: The disclosed embodiments provide a system that filters duplicate requests from an L1 cache for a cache line. During operation, the system receives at an L2 cache a first request and a second request for the same cache line, and stores identifying information for these requests. The system then performs a cache array look-up for the first request that, in the process of creating a load fill packet for the first request, loads the cache line into a fill buffer. After sending the load fill packet for the first request to the L1 cache, the system uses the cache line data still stored in the fill buffer and stored identifying information for the second fill request to send a subsequent load fill packet for the second request to the L1 cache without performing an additional cache array look-up.

Type: Grant

Filed: May 3, 2010

Date of Patent: October 9, 2012

Assignee: Oracle America, Inc.

Inventor: Martin R. Karlsson
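
The duplicate-request filtering in the abstract above can be sketched roughly as follows. This is an illustrative model only: the `L2Cache` class, its `serve` method, the single-entry fill buffer, and the `array_lookups` counter are assumptions made for the example, and every request is assumed to hit in the L2 array.

```python
class L2Cache:
    """Toy L2 model that reuses fill-buffer data for back-to-back
    requests to the same cache line (all names are hypothetical)."""

    def __init__(self, array):
        self.array = array        # maps line address -> line data
        self.fill_buffer = None   # (line address, line data) of the last fill
        self.array_lookups = 0    # counts cache array look-ups performed

    def serve(self, requests):
        """requests: list of (request_id, line_address) pairs.
        Returns one load fill packet (request_id, data) per request."""
        packets = []
        for request_id, address in requests:
            # Only look up the array when the fill buffer holds another line.
            if self.fill_buffer is None or self.fill_buffer[0] != address:
                self.array_lookups += 1
                self.fill_buffer = (address, self.array[address])
            packets.append((request_id, self.fill_buffer[1]))
        return packets
```

With two requests for the same line, the sketch produces two fill packets but performs a single array look-up, which is the saving the abstract describes.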
METHODS AND APPARATUS FOR UPDATING DATA IN PASSIVE VARIABLE RESISTIVE MEMORY

Publication number: 20120254541

Abstract: Methods and apparatus for updating data in passive variable resistive memory (PVRM) are provided. In one example, a method for updating data stored in PVRM is disclosed. The method includes updating a memory block of a plurality of memory blocks in a cache hierarchy without invalidating the memory block. The updated memory block may be copied from the cache hierarchy to a write through buffer. Additionally, the method includes writing the updated memory block to the PVRM, thereby updating the data in the PVRM.

Type: Application

Filed: April 4, 2011

Publication date: October 4, 2012

Applicant: Advanced Micro Devices, Inc.

Inventors: Brad Beckmann, Lisa Hsu
METHOD AND SYSTEM TO IMPROVE UNALIGNED CACHE MEMORY ACCESSES

Publication number: 20120246407

Abstract: A method and system to improve unaligned cache memory accesses. In one embodiment of the invention, a processing unit has logic to facilitate access of at least two cache memory lines of a cache memory in a single read operation. By doing so, it avoids additional read operations or cycles to read the required data that is cached in more than one cache memory line. Embodiments of the invention facilitate the streaming of unaligned vector loads that do not require substantially more power than streaming aligned vector loads. For example, in one embodiment of the invention, the streaming of unaligned vector loads consumes less than two times the power requirements of streaming aligned vector loads.

Type: Application

Filed: March 21, 2011

Publication date: September 27, 2012

Inventors: WILLIAM C. HASENPLAUGH, Tryggve Fossum
RESOURCE SHARING TO REDUCE IMPLEMENTATION COSTS IN A MULTICORE PROCESSOR

Publication number: 20120239883

Abstract: A processor may include several processor cores, each including a respective higher-level cache; a lower-level cache including several tag units each including several controllers, where each controller corresponds to a respective cache bank configured to store data, and where the controllers are concurrently operable to access their respective cache banks; and an interconnect network configured to convey data between the cores and the lower-level cache. The controllers in a given tag unit may share access to a resource that may include one or more of an interconnect egress port coupled to the interconnect network, an interconnect ingress port coupled to the interconnect network, a test controller, or a data storage structure.

Type: Application

Filed: June 1, 2012

Publication date: September 20, 2012

Inventors: Prashant Jain, Yoganand Chillarige, Sandip Das, Shukur Moulali Pathan, Srinivasan R. Iyengar, Sanjay Patel
Scheduling Workloads Based On Cache Asymmetry

Publication number: 20120233393

Abstract: In one embodiment, a processor includes a first cache and a second cache, a first core associated with the first cache and a second core associated with the second cache. The caches are of asymmetric sizes, and a scheduler can intelligently schedule threads to the cores based at least in part on awareness of this asymmetry and resulting cache performance information obtained during a training phase of at least one of the threads.

Type: Application

Filed: March 8, 2011

Publication date: September 13, 2012

Inventors: Xiaowei Jiang, Li Zhao, Ravishankar Iyer
CACHE PHASE DETECTOR AND PROCESSOR CORE

Publication number: 20120233407

Abstract: A cache phase detector included in a processor core according to example embodiments includes a counting unit and a signal generating unit. The counting unit generates a critical section miscount by counting a request from the processor core resulting in a tag miss and a valid cache line based on a tag miss signal and a cache line valid signal. The signal generating unit compares the critical section miscount from the counting unit with a reference value, and generates a cache phase change signal if the critical section miscount is greater than the reference value.

Type: Application

Filed: March 5, 2012

Publication date: September 13, 2012

Inventors: Ju-Hee CHOI, Hoi-Jin Lee
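
As a rough software analogy for the counting unit and signal generating unit in the abstract above, the following Python sketch counts requests that produce both a tag miss and a valid cache line, then compares the count with a reference value. The class and method names are hypothetical, and the explicit `reset` step is an assumption that the abstract does not mention.

```python
class CachePhaseDetector:
    """Software analogy of the detector; names are illustrative only."""

    def __init__(self, reference: int):
        self.reference = reference  # reference value for the comparison
        self.miss_count = 0         # counted tag misses on valid lines

    def observe(self, tag_miss: bool, line_valid: bool) -> bool:
        """Record one request; return True when the phase change signal
        would be asserted (count strictly greater than the reference)."""
        if tag_miss and line_valid:
            self.miss_count += 1
        return self.miss_count > self.reference

    def reset(self) -> None:
        """Assumed housekeeping step: clear the count for a new interval."""
        self.miss_count = 0
```

The signal is raised only once the count exceeds the reference value, so a reference of 2 requires three qualifying requests.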
NETWORK-ON-CHIP SYSTEM INCLUDING ACTIVE MEMORY PROCESSOR

Publication number: 20120226865

Abstract: Disclosed is a network-on-chip system including an active memory processor for addressing the increased communication latency caused by multiple processors and memories. The network-on-chip system includes a plurality of processing elements that request an active memory operation for a predetermined operation from a shared memory to reduce access latency of the shared memory, and an active memory processor connected to the processing elements through a network, storing codes for processing a custom transaction in response to the request for the active memory operation, performing an operation on addresses or data stored in a shared cache memory or the shared memory based on the codes, and transmitting the result of the performed operation to the processing elements.

Type: Application

Filed: December 9, 2009

Publication date: September 6, 2012

Applicant: SNU R&DB FOUNDATION

Inventors: Ki-Young Choi, Jun-Hee Yoo, Sung-Joo Yoo, Hyun-Chul Shin
DYNAMIC MIGRATION OF VIRTUAL MACHINES BASED ON WORKLOAD CACHE DEMAND PROFILING

Publication number: 20120226866

Abstract: A computer-implemented method comprises obtaining a cache hit ratio for each of a plurality of virtual machines, and identifying, from among the plurality of virtual machines, a first virtual machine having a cache hit ratio that is less than a threshold ratio. The identified first virtual machine is then migrated from the first physical server having a first cache size to a second physical server having a second cache size that is greater than the first cache size. Optionally, a virtual machine having a cache hit ratio that is less than a threshold ratio is identified on a class-specific basis, such as for L1 cache, L2 cache and L3 cache.

Type: Application

Filed: March 2, 2011

Publication date: September 6, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: James J. Bozek, Nils Peter Joachim Hansson, Edward S. Suffern, James L. Wooldridge
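
The selection step in the abstract above can be sketched as below. The `Server` and `VM` records and the `plan_migrations` function are hypothetical, and choosing the smallest server whose cache is larger than the current host's is an assumption made here; the abstract only requires that the destination cache be larger.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    cache_size_kb: int


@dataclass
class VM:
    name: str
    host: Server
    cache_hit_ratio: float


def plan_migrations(vms, servers, threshold):
    """Return (vm name, target server name) pairs for every VM whose
    cache hit ratio is below the threshold and for which a server with
    a larger cache than its current host exists."""
    plans = []
    for vm in vms:
        if vm.cache_hit_ratio >= threshold:
            continue
        larger = [s for s in servers if s.cache_size_kb > vm.host.cache_size_kb]
        if larger:
            # Assumed tie-break: take the smallest cache that is still larger.
            target = min(larger, key=lambda s: s.cache_size_kb)
            plans.append((vm.name, target.name))
    return plans
```

A VM that already meets the threshold is left in place, and a VM on the server with the largest cache has no valid destination in this sketch.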
Binary tree based multilevel cache system for multicore processors

Publication number: 20120226867

Abstract: A binary tree based multi-level cache system for multi-core processors and its two possible implementations LogN and LogN+1 models maintaining a true pyramid is described.

Type: Application

Filed: March 4, 2011

Publication date: September 6, 2012

Inventor: Muhammad Ali Ismail
Computer Cache System With Stratified Replacement

Publication number: 20120221794

Abstract: Methods for selecting a line to evict from a data storage system are provided. A computer system implementing a method for selecting a line to evict from a data storage system is also provided. The methods include selecting an uncached class line for eviction prior to selecting a cached class line for eviction.

Type: Application

Filed: May 5, 2012

Publication date: August 30, 2012

Inventor: Blaine D. Gaither
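
A minimal sketch of the stratified selection in the abstract above follows. The list-of-tuples representation, the class labels, and the least-recently-used ordering within each class are assumptions for illustration; the abstract states only that uncached class lines are selected before cached class lines.

```python
def select_victim(lines):
    """lines: list of (tag, line_class) pairs, assumed to be ordered
    least recently used first; line_class is "uncached" or "cached".
    Returns the tag to evict, or None if there are no lines."""
    # First stratum: prefer any line in the uncached class.
    for tag, line_class in lines:
        if line_class == "uncached":
            return tag
    # Second stratum: fall back to the oldest cached class line.
    return lines[0][0] if lines else None
```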
MULTI-PORT CACHE MEMORY APPARATUS AND METHOD

Publication number: 20120221797

Abstract: Provided are a multi-port cache memory apparatus and a method of operating the multi-port cache memory apparatus. The multi-port cache memory apparatus may divide an address space into address regions and allocate the divided address regions to cache banks, thereby preventing the concentration of access to a particular cache bank.

Type: Application

Filed: January 31, 2012

Publication date: August 30, 2012

Inventors: Moo-Kyoung Chung, Soo-Jung Ryu, Ho-Young Kim, Woong Seo, Young-Chul Cho
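
One simple way to picture the region-to-bank allocation in the abstract above is a round-robin mapping, sketched below. The function name, the fixed region size, and the interleaved assignment are all assumptions; the abstract does not say how regions are assigned to banks.

```python
def bank_for_address(address: int, region_bytes: int = 4096, num_banks: int = 4) -> int:
    """Map an address to a cache bank by splitting the address space into
    fixed-size regions and assigning regions to banks in round-robin order
    (an assumed policy chosen for illustration)."""
    region_index = address // region_bytes
    return region_index % num_banks
```

Under this policy, consecutive regions land in different banks, so accesses that sweep through memory are spread across the ports rather than concentrated on one bank.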
INCREASED NAND FLASH MEMORY READ THROUGHPUT

Publication number: 20120221780

Abstract: A method of reading sequential pages of flash memory from alternating memory blocks comprises loading data from a first page into a first primary data cache and a second page into a second primary data cache simultaneously, the first and second pages loaded from different blocks of flash memory. Data from the first primary data cache is stored in a first secondary data cache, and data from the second primary data cache is stored in a second secondary data cache. Data is sequentially provided from the first and second secondary data caches by a multiplexer coupled to the first and second data caches.

Type: Application

Filed: May 7, 2012

Publication date: August 30, 2012

Applicant: Round Rock Research, LLC

Inventors: Dzung H. Nguyen, Frankie F. Roohparvar
SYSTEMS AND METHODS FOR RECONFIGURING CACHE MEMORY

Publication number: 20120221793

Abstract: A microprocessor system is disclosed that includes a first data cache that is shared by a first group of one or more program threads in a multi-thread mode and used by one program thread in a single-thread mode. A second data cache is shared by a second group of one or more program threads in the multi-thread mode and is used as a victim cache for the first data cache in the single-thread mode.

Type: Application

Filed: February 28, 2011

Publication date: August 30, 2012

Inventor: THANG M. TRAN
DATA CACHING METHOD

Publication number: 20120215983

Abstract: Data caching for use in a computer system including a lower cache memory and a higher cache memory. The higher cache memory receives a fetch request. The higher cache memory then determines the state of the entry to be replaced next. If that state indicates that the entry is exclusively owned or modified, the state is changed such that a following cache access is processed at a higher speed than it would be if the state stayed unchanged.

Type: Application

Filed: April 28, 2012

Publication date: August 23, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Christian Habermann, Martin Recktenwald, Hans-Werner Tast, Ralf Winkelmann
SHARED CACHE FOR A TIGHTLY-COUPLED MULTIPROCESSOR

Publication number: 20120210069

Abstract: Computing apparatus (11) includes a plurality of processor cores (12) and a cache (10), which is shared by and accessible simultaneously to the plurality of the processor cores. The cache includes a shared memory (16), including multiple block frames of data imported from a level-two (L2) memory (14) in response to requests by the processor cores, and a shared tag table (18), which is separate from the shared memory and includes table entries that correspond to the block frames and contain respective information regarding the data contained in the block frames.

Type: Application

Filed: October 24, 2010

Publication date: August 16, 2012

Applicant: PLURALITY LTD.

Inventors: Nimrod Bayer, Peleg Aviely, Shareef Hakeem, Shmuel Shem-Zion
ADDRESS TRANSLATION CACHING AND I/O CACHE PERFORMANCE IMPROVEMENT IN VIRTUALIZED ENVIRONMENTS

Publication number: 20120203950

Abstract: Methods and apparatus relating to improving address translation caching and/or input/output (I/O) cache performance in virtualized environments are described. In one embodiment, a hint provided by an endpoint device may be utilized to update information stored in an I/O cache. Such information may be utilized for implementation of a more efficient replacement policy in an embodiment. Other embodiments are also disclosed.

Type: Application

Filed: April 17, 2012

Publication date: August 9, 2012

Inventors: Mahesh Wagh, Jasmin Ajanovic
COORDINATED WRITEBACK OF DIRTY CACHELINES

Publication number: 20120203968

Abstract: A data processing system includes a processor core and a cache memory hierarchy coupled to the processor core. The cache memory hierarchy includes at least one upper level cache and a lowest level cache. A memory controller is coupled to the lowest level cache and to a system memory and includes a physical write queue from which the memory controller writes data to the system memory. The memory controller initiates accesses to the lowest level cache to place into the physical write queue selected cachelines having spatial locality with data present in the physical write queue.

Type: Application

Filed: April 16, 2012

Publication date: August 9, 2012

Applicant: International Business Machines Corporation

Inventors: David M. DALY, Benjiman L. GOODMAN, Hillery C. HUNTER, William J. STARKE, Jeffrey A. STUECHELI
MEMORY BUS WRITE PRIORITIZATION

Publication number: 20120203969

Abstract: A data processing system includes a multi-level cache hierarchy including a lowest level cache, a processor core coupled to the multi-level cache hierarchy, and a memory controller coupled to the lowest level cache and to a memory bus of a system memory. The memory controller includes a physical read queue that buffers data read from the system memory via the memory bus and a physical write queue that buffers data to be written to the system memory via the memory bus. The memory controller grants priority to write operations over read operations on the memory bus based upon a number of dirty cachelines in the lowest level cache memory.

Type: Application

Filed: April 16, 2012

Publication date: August 9, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: DAVID M. DALY, BENJIMAN L. GOODMAN, HILLERY C. HUNTER, WILLIAM J. STARKE, JEFFREY A. STUECHELI
SYNCHRONIZING ACCESS TO DATA IN SHARED MEMORY VIA UPPER LEVEL CACHE QUEUING

Publication number: 20120198167

Abstract: A processing unit includes a store-in lower level cache having reservation logic that determines presence or absence of a reservation and a processor core including a store-through upper level cache, an instruction execution unit, a load unit that, responsive to a hit in the upper level cache on a load-reserve operation generated through execution of a load-reserve instruction by the instruction execution unit, temporarily buffers a load target address of the load-reserve operation, and a flag indicating that the load-reserve operation bound to a value in the upper level cache. If a storage-modifying operation is received that conflicts with the load target address of the load-reserve operation, the processor core sets the flag to a particular state, and, responsive to execution of a store-conditional instruction, transmits an associated store-conditional operation to the lower level cache with a fail indication if the flag is set to the particular state.

Type: Application

Filed: April 12, 2012

Publication date: August 2, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Guy L. Guthrie, William J. Starke, Derek E. Williams
Hazard Prevention for Data Conflicts Between Level One Data Cache Line Allocates and Snoop Writes

Publication number: 20120198162

Abstract: A comparator compares the address of DMA writes in the final entry of the FIFO stack to all pending read addresses in a monitor memory. If there is no match, then the DMA access is permitted to proceed. If the DMA write is to a cache line with a pending read, the DMA write access is stalled together with any DMA accesses behind the DMA write in the FIFO stack. DMA read accesses are not compared but may stall behind a stalled DMA write access. These stalls occur if the cache read was potentially cacheable. This is possible for some monitored accesses but not all. If a DMA write is stalled, the comparator releases it to complete once there are no pending reads to the same cache line.

Type: Application

Filed: September 28, 2011

Publication date: August 2, 2012

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Abhijeet Ashok Chachad, Jonathan (Son) Hung Tran, Raguram Damodaran, Krishna Chaithanya Gurram
Level One Data Cache Line Lock and Enhanced Snoop Protocol During Cache Victims and Writebacks to Maintain Level One Data Cache and Level Two Cache Coherence

Publication number: 20120198163

Abstract: This invention assures cache coherence in a multi-level cache system upon eviction of a higher level cache line. A victim buffer stores data on evicted lines. On a DMA access that may be cached in the higher level cache, the lower level cache sends a snoop write. The address of this snoop write is compared with the victim buffer. On a hit in the victim buffer, the write completes in the victim buffer. When the victim data passes to the next cache level, it is written into a second victim buffer to be retired when the data is committed to cache. DMA write addresses are compared to addresses in this second victim buffer. On a match, the write takes place in the second victim buffer. On a failure to match, the controller sends a snoop write.

Type: Application

Filed: September 28, 2011

Publication date: August 2, 2012

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Raguram Damodaran, Abhijeet Ashok Chachad, Jonathan (Son) Hung Tran, David Matthew Thompson
Efficient Cache Allocation by Optimizing Size and Order of Allocate Commands Based on Bytes Required by CPU

Publication number: 20120198160

Abstract: This invention is a data processing system having a multi-level cache system. The multi-level cache system includes at least one first level cache and a second level cache. Upon a cache miss in both the at least one first level cache and the second level cache, the data processing system evicts and allocates a cache line within the second level cache. The data processing system determines from the miss address whether the request falls within a low half or a high half of the allocated cache line. The data processing system first requests data from external memory for the half of the cache line containing the miss. Upon receipt, the data is supplied to the at least one first level cache and the CPU. The data processing system then requests data from external memory for the other half of the second level cache line.

Type: Application

Filed: September 23, 2011

Publication date: August 2, 2012

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Abhijeet Ashok Chachad, Roger Kyle Castille, Joseph Raymond Michael Zbiciak, Dheera Balasubramanian
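
The request ordering described in the abstract of publication 20120198160 amounts to a small piece of address arithmetic. The sketch below is illustrative only; the 128-byte line size and the function name `half_line_requests` are assumptions and are not taken from the application.

```python
L2_LINE_BYTES = 128  # assumed second level line size, for illustration only
HALF_BYTES = L2_LINE_BYTES // 2


def half_line_requests(miss_addr: int) -> list:
    """Return the base addresses of the two half-line requests in issue order.

    The half that contains the miss address is requested first so that the
    data the CPU is waiting for arrives before the rest of the line.
    """
    offset = miss_addr % L2_LINE_BYTES
    line_base = miss_addr - offset
    low, high = line_base, line_base + HALF_BYTES
    if offset < HALF_BYTES:
        return [low, high]
    return [high, low]
```

For a miss in the upper half of a line, the upper half is fetched first and the lower half second.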
Mechanism to Update the Status of In-Flight Cache Coherence In a Multi-Level Cache Hierarchy

Publication number: 20120198165

Abstract: Separate buffers store snoop writes and direct memory access writes. A multiplexer selects one of these for input to a FIFO buffer. The FIFO buffer is split into multiple FIFOs including: a command FIFO; an address FIFO; and a write data FIFO. Each snoop command is compared with an allocated line set and way and deleted on a match to avoid data corruption. Each snoop command is also compared with a victim address. If the snoop address matches the victim address, logic redirects the snoop command to a victim buffer and the snoop write is completed in the victim buffer.

Type: Application

Filed: September 28, 2011

Publication date: August 2, 2012

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Raguram Damodaran, Naveen Bhoria, Krishna C. Gurram
GUEST TO NATIVE BLOCK ADDRESS MAPPINGS AND MANAGEMENT OF NATIVE CODE STORAGE

Publication number: 20120198122

Abstract: A method for managing mappings of storage on a code cache for a processor. The method includes storing a plurality of guest address to native address mappings as entries in a conversion look aside buffer, wherein the entries indicate guest addresses that have corresponding converted native addresses stored within a code cache memory, and receiving a subsequent request for a guest address at the conversion look aside buffer. The conversion look aside buffer is indexed to determine whether there exists an entry that corresponds to the index, wherein the index comprises a tag and an offset that is used to identify the entry that corresponds to the index. Upon a hit on the tag, the corresponding entry is accessed to retrieve a pointer to the code cache memory corresponding block of converted native instructions. The corresponding block of converted native instructions are fetched from the code cache memory for execution.

Type: Application

Filed: January 27, 2012

Publication date: August 2, 2012

Applicant: SOFT MACHINES, INC.

Inventor: Mohammad Abdallah
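
The lookup path in the abstract of publication 20120198122 (index by tag and offset, then follow a pointer into the code cache on a tag hit) might be modelled as below. This is a minimal sketch under stated assumptions: the direct-mapped organisation, the 4-bit offset width, and the class and method names are hypothetical and are not described in the application.

```python
class ConversionLookasideBuffer:
    """Toy direct-mapped table from guest addresses to code-cache pointers."""

    def __init__(self, offset_bits: int = 4):
        self.offset_bits = offset_bits
        # Each slot holds None or a (tag, native_pointer) pair.
        self.entries = [None] * (1 << offset_bits)

    def _split(self, guest_addr: int):
        """Split a guest address into (tag, offset); offset selects the slot."""
        offset = guest_addr & ((1 << self.offset_bits) - 1)
        tag = guest_addr >> self.offset_bits
        return tag, offset

    def insert(self, guest_addr: int, native_pointer: int) -> None:
        tag, offset = self._split(guest_addr)
        self.entries[offset] = (tag, native_pointer)

    def lookup(self, guest_addr: int):
        """Return the code-cache pointer on a tag hit, otherwise None."""
        tag, offset = self._split(guest_addr)
        entry = self.entries[offset]
        if entry is not None and entry[0] == tag:
            return entry[1]
        return None
```

On a hit, the returned pointer would be used to fetch the block of converted native instructions from the code cache; a miss would fall back to whatever conversion path the system provides.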
NON-BLOCKING, PIPELINED WRITE ALLOCATES WITH ALLOCATE DATA MERGING IN A MULTI-LEVEL CACHE SYSTEM

Publication number: 20120198161

Abstract: This invention handles write request cache misses. The cache controller stores write data, sends a read request to external memory for a corresponding cache line, merges the write data with data returned from the external memory and stores merged data in the cache. The cache controller includes buffers with plural entries storing the write address, the write data, the position of the write data within a cache line and unique identification number. This stored data enables the cache controller to proceed to servicing other access requests while waiting for response from the external memory.

Type: Application

Filed: September 26, 2011

Publication date: August 2, 2012

Applicant: Texas Instruments Incorporated

Inventors: Abhijeet Ashok Chachad, Raguram Damodaran, David Matthew Thompson
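
The buffering and merge step in the abstract of publication 20120198161 can be illustrated with a short sketch. It assumes a 64-byte line and a write that does not cross a line boundary; the class name `WriteMissBuffer` and its methods are hypothetical names chosen for the example.

```python
LINE_BYTES = 64  # assumed cache line size, for illustration only


class WriteMissBuffer:
    """Holds write data for missed lines until memory returns the line."""

    def __init__(self):
        # request id -> (line address, offset within line, write data)
        self.entries = {}
        self.next_id = 0

    def record_miss(self, addr: int, data: bytes) -> int:
        """Buffer a write that missed and return its identification number."""
        offset = addr % LINE_BYTES
        if offset + len(data) > LINE_BYTES:
            raise ValueError("write crosses a cache line boundary")
        req_id = self.next_id
        self.next_id += 1
        self.entries[req_id] = (addr - offset, offset, data)
        return req_id

    def merge(self, req_id: int, line_from_memory: bytes):
        """Overlay the buffered write on the returned line and retire the entry."""
        line_addr, offset, data = self.entries.pop(req_id)
        merged = bytearray(line_from_memory)
        merged[offset:offset + len(data)] = data
        return line_addr, bytes(merged)
```

Because each entry carries its own identification number, responses from memory can be merged in any order while the controller continues to accept other requests.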
Memory Attribute Sharing Between Differing Cache Levels of Multilevel Cache

Publication number: 20120198166

Abstract: The level one memory controller maintains a local copy of the cacheability bit of each memory attribute register. The level two memory controller is the initiator of all configuration read/write requests from the CPU. Whenever a configuration write is made to a memory attribute register, the level one memory controller updates its local copy of the memory attribute register.

Type: Application

Filed: September 28, 2011

Publication date: August 2, 2012

Applicant: Texas Instruments Incorporated

Inventors: Raguram Damodaran, Joseph Raymond Michael Zbiciak, Naveen Bhoria
Programmable Address-Based Write-Through Cache Control

Publication number: 20120198164

Abstract: This invention is a cache system with a memory attribute register having plural entries. Each entry stores a write-through or a write-back indication for a corresponding memory address range. On a write to cached data, the cache consults the memory attribute register for the corresponding address range. Writes to addresses in regions marked as write-through always update all levels of the memory hierarchy. Writes to addresses in regions marked as write-back update only the first cache level that can service the write. The memory attribute register is preferably a memory mapped control register writable by the central processing unit.

Type: Application

Filed: September 28, 2011

Publication date: August 2, 2012

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Raguram Damodaran, Abhijeet Ashok Chachad, Naveen Bhoria
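
The policy in the abstract of publication 20120198164 can be sketched as a per-region lookup followed by a choice of which levels to update. The equal-sized regions, the level names, and the identifiers `MemoryAttributeRegister` and `levels_updated` are assumptions made for this illustration.

```python
class MemoryAttributeRegister:
    """One write-through flag per fixed-size address region (assumed layout)."""

    def __init__(self, region_bytes: int, num_regions: int):
        self.region_bytes = region_bytes
        self.write_through = [False] * num_regions  # default: write-back

    def set_write_through(self, region: int, enabled: bool) -> None:
        self.write_through[region] = enabled

    def is_write_through(self, addr: int) -> bool:
        return self.write_through[addr // self.region_bytes]


def levels_updated(mar, addr: int, levels: list, first_hit: int) -> list:
    """Return the memory levels a write to addr updates.

    levels is ordered from closest to the CPU to farthest, and first_hit is
    the index of the first level able to service the write.
    """
    if mar.is_write_through(addr):
        return list(levels)       # write-through: every level is updated
    return [levels[first_hit]]    # write-back: only the servicing level
```

For example, with region 2 marked write-through, a write into that region updates every level, while a write into an unmarked region updates only the level that hits.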
ASSIST THREAD FOR INJECTING CACHE MEMORY IN A MICROPROCESSOR

Publication number: 20120198459

Abstract: A data processing system includes a microprocessor having access to multiple levels of cache memories. The microprocessor executes a main thread compiled from a source code object. The system includes a processor for executing an assist thread also derived from the source code object. The assist thread includes memory reference instructions of the main thread and only those arithmetic instructions required to resolve the memory reference instructions. A scheduler configured to schedule the assist thread in conjunction with the corresponding execution thread is configured to execute the assist thread ahead of the execution thread by a determinable threshold such as the number of main processor cycles or the number of code instructions. The assist thread may execute in the main processor or in a dedicated assist processor that makes direct memory accesses to one of the lower level cache memory elements.

Type: Application

Filed: March 29, 2012

Publication date: August 2, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Patrick Joseph Bohrer, Orran Yaakov Krieger, Ramakrishnan Rajamony, Michael Rosenfield, Hazim Shafi, Balaram Sinharoy, Robert Brett Tremaine
Efficient data prefetching in the presence of load hits

Patent number: 8234450

Abstract: A BIU prioritizes L1 requests above L2 requests. The L2 generates a first request to the BIU and detects the generation of a snoop request and L1 request to the same cache line. The L2 determines whether a bus transaction to fulfill the first request may be retried and, if so, generates a miss, and otherwise generates a hit. Alternatively, the L2 detects the L1 generated a request to the L2 for the same line and responsively requests the BIU to refrain from performing a transaction on the bus to fulfill the first request if the BIU has not yet been granted the bus. Alternatively, a prefetch cache and the L2 allow the same line to be simultaneously present. If an L1 request hits in both the L2 and in the prefetch cache, the prefetch cache invalidates its copy of the line and the L2 provides the line to the L1.

Type: Grant

Filed: April 20, 2010

Date of Patent: July 31, 2012

Assignee: VIA Technologies, Inc.

Inventors: Clinton Thomas Glover, Colin Eddy, Rodney E. Hooker, Albert J. Loper
Distributed User Controlled Multilevel Block and Global Cache Coherence with Accurate Completion Status

Publication number: 20120191913

Abstract: This invention permits user controlled cache coherence operations with the flexibility to do these operations on all levels of cache together or each level independently. In the case of an all level operation, the user does not have to monitor and sequence each phase of the operation. This invention also provides a way for users to track completion of these operations. This is critical for multi-core/multi-processor devices. Multiple cores may be accessing the end point and the user/application needs to be able to identify when the operation from one core is complete, before permitting other cores access that data or code.

Type: Application

Filed: September 22, 2011

Publication date: July 26, 2012

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Raguram Damodaran, Abhijeet A. Chachad
EFFICIENT LEVEL TWO MEMORY BANKING TO IMPROVE PERFORMANCE FOR MULTIPLE SOURCE TRAFFIC AND ENABLE DEEPER PIPELINING OF ACCESSES BY REDUCING BANK STALLS

Publication number: 20120191915

Abstract: The level two memory of this invention supports coherency data transfers with level one cache and DMA data transfers. The width of DMA transfers is 16 bytes. The width of level one instruction cache transfers is 32 bytes. The width of level one data transfers is 64 bytes. The width of level two allocates is 128 bytes. DMA transfers are interspersed with CPU traffic and have similar requirements of efficient throughput and reduced latency. An additional challenge is that these two data streams (CPU and DMA) require access to the level two memory at the same time. This invention is a banking technique for the level two memory to facilitate efficient data transfers.

Type: Application

Filed: September 26, 2011

Publication date: July 26, 2012

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Abhijeet Ashok Chachad, Ramakrishnan Venkatasubramanian
OPTIMIZING TAG FORWARDING IN A TWO LEVEL CACHE SYSTEM FROM LEVEL ONE TO LEVER TWO CONTROLLERS FOR CACHE COHERENCE PROTOCOL FOR DIRECT MEMORY ACCESS TRANSFERS

Publication number: 20120191916

Abstract: A second level memory controller uses shadow tags to implement snoop read and write coherence. These shadow tags are generally used only for snoops intending to keep L2 SRAM coherent with the level one data cache. Thus updates for all external cache lines are ignored. The shadow tags are updated on all level one cache allocates and all dirty and invalidate modifications to data stored in L2 SRAM. These interactions happen on different interfaces, but the traffic on that interface includes level one data cache accesses to both external and level two directly addressable lines. These interactions create extra traffic on these interfaces and create extra stalls to the CPU. Thus in this invention shadow tags are updated only on a subset of less than all updates of the level one tags.

Type: Application

Filed: September 26, 2011

Publication date: July 26, 2012

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Abhijeet Ashok Chachad, Roger Kyle Castille, Joseph Raymond Michael Zbiciak, Dheera Balasubramanian
PERFORMANCE AND POWER IMPROVEMENT ON DMA WRITES TO LEVEL TWO COMBINED CACHE/SRAM THAT IS CAUSED IN LEVEL ONE DATA CACHE AND LINE IS VALID AND DIRTY

Publication number: 20120191914

Abstract: This invention optimizes DMA writes to directly addressable level two memory that is cached in level one and the line is valid and dirty. When the level two controller detects that a line is valid and dirty in level one, the level two memory need not update its copy of the data. Level one memory will replace the level two copy with a victim writeback at a future time. Thus the level two memory need not write a copy. This limits the number of DMA writes to level two directly addressable memory and thus improves performance and minimizes dynamic power. This also frees the level two memory for other master/requestors.

Type: Application

Filed: September 26, 2011

Publication date: July 26, 2012

Applicant: TEXAS INSTRUMENTS INCORPORATED

Inventors: Jonathan (Son) Hung Tran, Raguram Damodaran, Abhijeet Ashok Chachad, Joseph Raymond Michael Zbiciak
METHOD FOR MANAGING MULTI-LAYERED DATA STRUCTURES IN A PIPELINED MEMORY ARCHITECTURE

Publication number: 20120191919

Abstract: A method for managing multi-layered data structures in a pipelined memory architecture, comprising the steps of: providing a multi-level data structure where each level corresponds to a memory access; and storing each level in a separate memory block with respect to the other levels. In this way, a more efficient usage of memory is achieved.

Type: Application

Filed: July 28, 2010

Publication date: July 26, 2012

Inventor: Mikael Sundström
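
The idea in the abstract of publication 20120191919, one memory block per level so that each memory access touches a different block, can be illustrated with a two-level table over 8-bit keys. The structure below is a hypothetical example chosen for brevity; the stride, the key width, and the name `TwoLevelTable` are assumptions and do not come from the application.

```python
STRIDE = 4            # bits consumed per level (assumed)
FANOUT = 1 << STRIDE  # entries per node


class TwoLevelTable:
    """8-bit keyed table whose two levels live in separate arrays."""

    def __init__(self):
        self.level0 = [None] * FANOUT  # block 0: row indices into level1
        self.level1 = []               # block 1: rows of leaf values

    def insert(self, key: int, value) -> None:
        if not 0 <= key < FANOUT * FANOUT:
            raise ValueError("key must fit in 8 bits")
        hi, lo = key >> STRIDE, key & (FANOUT - 1)
        if self.level0[hi] is None:
            self.level0[hi] = len(self.level1)
            self.level1.append([None] * FANOUT)
        self.level1[self.level0[hi]][lo] = value

    def lookup(self, key: int):
        hi, lo = key >> STRIDE, key & (FANOUT - 1)
        row = self.level0[hi]         # access 1 reads only block 0
        if row is None:
            return None
        return self.level1[row][lo]   # access 2 reads only block 1
```

In a pipelined memory, a second lookup could read block 0 while the first lookup is reading block 1, since the two stages never contend for the same block.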
Cache memory system, data processing apparatus, and storage apparatus

Patent number: 8230173

Abstract: A cache memory system includes a plurality of first storage hierarchical units provided individually to a plurality of processors. A second storage hierarchical unit is provided commonly to the plurality of processors. A control unit controls data transfer between the plurality of first storage hierarchical units and the second storage hierarchical unit. Each of the plurality of processors is capable of executing a no-data transfer store command as a store command that does not require data transfer from the second storage hierarchical unit to the corresponding first storage hierarchical unit, and each of the plurality of first storage hierarchical units outputs a transfer-control signal in response to occurrence of a cache miss hit when executing the no-data transfer store command by the corresponding processor.

Type: Grant

Filed: March 11, 2009

Date of Patent: July 24, 2012

Assignee: Fujitsu Semiconductor Limited

Inventor: Masayuki Tsuji
GLOBAL INSTRUCTIONS FOR SPIRAL CACHE MANAGEMENT

Publication number: 20120179872

Abstract: A method of operation of a pipelined cache memory supports global operations within the cache. The cache may be a spiral cache, with a move-to-front M2F network for moving values from a backing store to a front-most tile coupled to a processor or lower-order level of a memory hierarchy and a spiral push-back network for pushing out modified values to the backing-store. The cache controller manages application of global commands by propagating individual commands to the tiles. The global commands may provide zeroing, flushing and reconciling of the given tiles. Commands for interrupting and resuming interrupted global commands may be implemented, to reduce halting or slowing of processing while other global operations are in process. A line detector within each tile supports reconcile and flush operations, and a line patcher in the controller provides for initializing address ranges with no processor intervention.

Type: Application

Filed: March 13, 2012

Publication date: July 12, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Volker Strumpen
