Patents by Inventor Hazim Shafi
Hazim Shafi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20070083717
Abstract: A method and data processing system for sequentially coupling successive, homogeneous processor requests for a cache line in a chain before the data is received in the cache of a first processor within the chain. Chained intermediate coherency states are assigned to track the chain of processor requests and subsequent access permission provided, prior to receipt of the data at the first processor starting the chain. The chained intermediate coherency state assigned identifies the processor operation, and a directional identifier identifies the processor to which the cache line is to be forwarded. When the data is received at the cache of the first processor within the chain, the first processor completes its operation on (or with) the data and then forwards the data to the next processor in the chain. The chain is immediately stopped when a non-homogeneous operation is snooped by the last-in-chain processor.
Type: Application
Filed: October 6, 2005
Publication date: April 12, 2007
Inventors: Ramakrishnan Rajamony, Hazim Shafi, Derek Williams, Kenneth Wright
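A minimal sketch of the chaining idea described above, heavily simplified: homogeneous requests for a line are linked in order before the data arrives, a non-homogeneous snooped operation stops the chain, and delivery serves the head first and then forwards down the chain. All names here are illustrative, not from the patent.

```python
class ChainedLine:
    """Toy model of request chaining for one cache line (illustrative only)."""
    def __init__(self):
        self.chain = []        # processors waiting for the line, in order
        self.kind = None       # operation kind of the chain (e.g. "read")
        self.stopped = False   # set once a non-homogeneous op is snooped

    def snoop(self, proc, kind):
        if self.stopped:
            return False
        if self.kind is None:
            self.kind = kind
        if kind != self.kind:
            self.stopped = True    # non-homogeneous op: chain ends here
            return False
        self.chain.append(proc)    # chained coherency state would record proc
        return True

    def deliver(self, data):
        # Head receives the data first, then forwards to each successor.
        served = [(proc, data) for proc in self.chain]
        self.chain.clear()
        return served
```
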
-
Publication number: 20070067604
Abstract: A method of assigning virtual memory to physical memory in a data processing system allocates a set of contiguous physical memory pages for a new page mapping, instructs the memory controller to move the virtual memory pages according to the new page mapping, and then allows access to the virtual memory pages using the new page mapping while the memory controller is still copying the virtual memory pages to the set of physical memory pages. The memory controller can use a mapping table which temporarily stores entries of the old and new page addresses, and releases the entries as copying for each entry is completed. The translation lookaside buffer (TLB) entries in the processor cores are updated for the new page addresses prior to completion of copying of the memory pages by the memory controller. The invention can be extended to non-uniform memory access (NUMA) systems.
Type: Application
Filed: October 19, 2006
Publication date: March 22, 2007
Inventors: Elmootazbellah Elnozahy, James Peterson, Ramakrishnan Rajamony, Hazim Shafi
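A minimal Python sketch of the mapping-table idea in the abstract above: while pages are copied in the background, reads whose copy is still pending are redirected to the old physical location, and each mapping entry is released once its copy completes. All class and method names are illustrative, not from the patent.

```python
class PageMigrator:
    """Toy model of background page migration with a pending-copy table."""
    def __init__(self, memory):
        self.memory = memory     # physical page address -> contents
        self.mapping = {}        # old page addr -> new page addr (copy pending)

    def start_migration(self, old_new_pairs):
        for old, new in old_new_pairs:
            self.mapping[old] = new    # entry held until copy finishes

    def copy_one(self, old):
        new = self.mapping[old]
        self.memory[new] = self.memory[old]
        del self.mapping[old]          # release the entry once copied

    def read(self, page):
        # TLBs already point at the new address; if the copy is still
        # pending, redirect the access to the old physical location.
        for old, new in self.mapping.items():
            if new == page:
                return self.memory[old]
        return self.memory[page]
```
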
-
Publication number: 20060277366
Abstract: A system and method of managing cache hierarchies with adaptive mechanisms. A preferred embodiment of the present invention includes, in response to selecting a data block for eviction from a memory cache (the source cache) out of a collection of memory caches, examining a data structure to determine whether an entry exists that indicates that the data block has been evicted from the source memory cache, or another peer cache, to a slower cache or memory and subsequently retrieved from the slower cache or memory into the source memory cache or other peer cache. Also, a preferred embodiment of the present invention includes, in response to determining the entry exists in the data structure, selecting a peer memory cache out of the collection of memory caches at the same level in the hierarchy to receive the data block from the source memory cache upon eviction.
Type: Application
Filed: June 2, 2005
Publication date: December 7, 2006
Inventors: Ramakrishnan Rajamony, Hazim Shafi, William Speight, Lixin Zhang
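The eviction decision above can be sketched in a few lines: a history structure records blocks that have previously "bounced" (evicted, then re-fetched from slower storage); on eviction, such blocks are moved laterally to a peer cache at the same level instead of being cast out downward. The names and the least-loaded-peer choice are illustrative assumptions, not the patent's mechanism.

```python
class AdaptiveEvictor:
    """Toy model of history-guided lateral eviction (illustrative only)."""
    def __init__(self, peers):
        self.peers = peers            # peer caches at the same level (dicts)
        self.bounce_history = set()   # blocks evicted and later re-fetched

    def record_refetch(self, block):
        self.bounce_history.add(block)

    def evict(self, source, block):
        value = source.pop(block)
        if block in self.bounce_history:
            # Block bounced before: keep it at this level in a peer cache.
            target = min(self.peers, key=len)   # e.g. least-loaded peer
            target[block] = value
            return "peer"
        return "lower"                # normal castout to the slower level
```
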
-
Publication number: 20060179236
Abstract: A system and method for improving hardware-controlled pre-fetching within a data processing system. A collection of address translation entries are pre-fetched and placed in an address translation cache. This translation pre-fetch mechanism cooperates with the data and/or instruction hardware-controlled pre-fetch mechanism to avoid stalls at page boundaries, which improves the latter's effectiveness at hiding memory latency.
Type: Application
Filed: January 13, 2005
Publication date: August 10, 2006
Inventor: Hazim Shafi
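A small sketch of the cooperation described above, under the assumption that a sequential prefetch stream benefits from having the next few page translations resident: alongside the data stream, upcoming translations are pulled into the TLB so the stream does not stall at a page boundary. Function and parameter names are illustrative.

```python
PAGE_SIZE = 4096

def prefetch_translations(tlb, page_table, addr, depth=2):
    """Pull translations for the next `depth` pages after `addr` into the
    TLB, so a hardware prefetch stream crossing a page boundary does not
    stall on a translation miss. Illustrative sketch only."""
    page = addr // PAGE_SIZE
    for p in range(page + 1, page + 1 + depth):
        if p in page_table and p not in tlb:
            tlb[p] = page_table[p]    # install the translation ahead of use
```
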
-
Patent number: 7082514
Abstract: A method and memory controller for adaptive row management within a memory subsystem provides metrics for evaluating row access behavior and dynamically adjusting the row management policy of the memory subsystem in conformity with measured metrics to reduce the average latency of the memory subsystem. Counters provided within the memory controller track the number of consecutive row accesses and optionally the number of total accesses over a measurement interval. The number of counted consecutive row accesses can be used to control the closing of rows for subsequent accesses, reducing memory latency. The count may be validated using a second counter or storage for improved accuracy and alternatively the row close count may be set via program or logic control in conformity with a count of consecutive row hits in ratio with a total access count.
Type: Grant
Filed: September 18, 2003
Date of Patent: July 25, 2006
Assignee: International Business Machines Corporation
Inventors: Ramakrishnan Rajamony, Hazim Shafi, Robert B. Tremaine
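The counter-based policy above can be sketched as follows: the controller tracks the run of consecutive accesses to the open row and closes (precharges) the row once the run reaches a threshold, on the assumption that further hits to that row are then unlikely. The threshold here is a fixed illustrative value; in the patent it is adjusted from measured metrics.

```python
class RowPolicy:
    """Toy model of adaptive row closing in a memory controller."""
    def __init__(self, close_after=4):
        self.open_row = None
        self.run = 0                   # consecutive same-row accesses
        self.close_after = close_after # would be tuned from measured metrics

    def access(self, row):
        if row == self.open_row:
            self.run += 1
            hit = True
        else:
            # Row miss: activate the new row (latency paid here).
            self.open_row, self.run, hit = row, 1, False
        if self.run >= self.close_after:
            self.open_row = None       # proactively close (precharge) the row
            self.run = 0
        return hit
```
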
-
Patent number: 7080214
Abstract: A multiprocessor system includes a plurality of data processing nodes. Each node has a processor coupled to a system memory, a cache memory, and a cache directory. The cache directory contains cache coherency information for a predetermined range of system memory addresses. An interconnection enables the nodes to exchange messages. A node initiating a function shipping request identifies an intermediate destination directory based on a list of the function's operands and sends a message indicating the function and its corresponding operands to the identified destination directory. The destination cache directory determines a target node based, at least in part, on its cache coherency status information to reduce memory access latency by selecting a target node where all or some of the operands are valid in the local cache memory. The destination directory then ships the function to the target node over the interconnection.
Type: Grant
Filed: October 16, 2003
Date of Patent: July 18, 2006
Assignee: International Business Machines Corporation
Inventors: James Lyle Peterson, Ramakrishnan Rajamony, Hazim Shafi
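The target-node selection step above reduces to a scoring problem: rank nodes by how many of the function's operands are valid in their local caches, and ship the function to the best-scoring node. A minimal sketch, with an assumed directory shape (operand to the set of nodes holding a valid copy):

```python
def pick_target_node(operands, directory):
    """Score each node by how many operands are valid in its local cache
    and return the best node, or None if no operand is cached anywhere.
    `directory` maps operand -> set of node ids with a valid cached copy.
    Illustrative sketch, not the patent's directory structure."""
    scores = {}
    for op in operands:
        for node in directory.get(op, ()):
            scores[node] = scores.get(node, 0) + 1
    return max(scores, key=scores.get) if scores else None
```
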
-
Publication number: 20060155963
Abstract: A data processing system includes a microprocessor having access to multiple levels of cache memories. The microprocessor executes a main thread compiled from a source code object. The system includes a processor for executing an assist thread also derived from the source code object. The assist thread includes memory reference instructions of the main thread and only those arithmetic instructions required to resolve the memory reference instructions. A scheduler, configured to schedule the assist thread in conjunction with the corresponding execution thread, executes the assist thread ahead of the execution thread by a determinable threshold, such as a number of main processor cycles or a number of code instructions. The assist thread may execute in the main processor or in a dedicated assist processor that makes direct memory accesses to one of the lower level cache memory elements.
Type: Application
Filed: January 13, 2005
Publication date: July 13, 2006
Inventors: Patrick Bohrer, Orran Krieger, Ramakrishnan Rajamony, Michael Rosenfield, Hazim Shafi, Balaram Sinharoy, Robert Tremaine
-
Publication number: 20060155886
Abstract: Methods, systems, and media for reducing memory latency seen by processors by providing a measure of control over on-chip memory (OCM) management to software applications, implicitly and/or explicitly, via an operating system are contemplated. Many embodiments allow part of the OCM to be managed by software applications via an application program interface (API), and part managed by hardware. Thus, the software applications can provide guidance regarding address ranges to maintain close to the processor to reduce unnecessary latencies typically encountered when dependent upon cache controller policies. Several embodiments utilize a memory internal to the processor or on a processor node so the memory block used for this technique is referred to as OCM.
Type: Application
Filed: January 11, 2005
Publication date: July 13, 2006
Inventors: Dilma da Silva, Elmootazbellah Elnozahy, Orran Krieger, Hazim Shafi, Xiaowei Shen, Balaram Sinharoy, Robert Tremaine
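A sketch of the split-management idea above: an application pins address ranges it wants kept in the software-managed part of the OCM, and anything that does not fit falls back to hardware-managed caching. The API names (`pin_range`, `is_pinned`) are hypothetical, invented for this illustration.

```python
class OnChipMemory:
    """Toy model of software-guided OCM management (illustrative only)."""
    def __init__(self, size):
        self.size = size
        self.pinned = {}    # app-managed ranges: start address -> length
        self.used = 0

    def pin_range(self, start, length):
        """Application asks for a range to be kept close to the processor."""
        if self.used + length > self.size:
            return False    # no room: fall back to hardware-managed caching
        self.pinned[start] = length
        self.used += length
        return True

    def is_pinned(self, addr):
        return any(s <= addr < s + n for s, n in self.pinned.items())
```
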
-
Publication number: 20060155934
Abstract: A system and method for cache management in a data processing system. The data processing system includes a processor and a memory hierarchy. The memory hierarchy includes at least an upper memory cache, at least a lower memory cache, and a write-back data structure. In response to replacing data from the upper memory cache, the upper memory cache examines the write-back data structure to determine whether or not the data is present in the lower memory cache. If the data is present in the lower memory cache, the data is replaced in the upper memory cache without casting out the data to the lower memory cache.
Type: Application
Filed: January 11, 2005
Publication date: July 13, 2006
Inventors: Ramakrishnan Rajamony, Hazim Shafi, William Speight, Lixin Zhang
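The replacement rule above fits in a few lines: on eviction, consult a presence structure for the lower cache and skip the castout when the block is already there. A minimal sketch with sets standing in for the caches; the presence structure here is simply the lower cache's contents, an illustrative simplification.

```python
def replace_block(upper, lower_presence, block):
    """Evict `block` from the upper cache; cast out to the lower cache only
    if the write-back structure says it is not already present there.
    Illustrative sketch, not the patent's hardware structure."""
    upper.discard(block)
    if block in lower_presence:
        return "dropped"         # duplicate exists below: no castout traffic
    lower_presence.add(block)    # castout: block now lives in the lower cache
    return "cast_out"
```
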
-
Publication number: 20060064518
Abstract: A method and apparatus for managing cache injection in a multiprocessor system reduces processing time associated with direct memory access transfers in a symmetrical multiprocessor (SMP) or a non-uniform memory access (NUMA) multiprocessor environment. The method and apparatus either detect the target processor for DMA completion or direct processing of DMA completion to a particular processor, thereby enabling cache injection into a cache that is coupled with the processor that executes the DMA completion routine, which processes the data injected into the cache. The target processor may be identified by determining the processor handling the interrupt that occurs on completion of the DMA transfer. Alternatively or in conjunction with target processor identification, an interrupt handler may queue a deferred procedure call to the target processor to process the transferred data.
Type: Application
Filed: September 23, 2004
Publication date: March 23, 2006
Applicant: International Business Machines Corporation
Inventors: Patrick Bohrer, Ahmed Gheith, Peter Hochschild, Ramakrishnan Rajamony, Hazim Shafi, Balaram Sinharoy
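The first identification path above (find the processor that handles the DMA-completion interrupt, then inject there) can be sketched as follows. The interrupt-affinity table and dictionary caches are illustrative stand-ins for the hardware involved.

```python
def inject_dma_data(caches, irq_affinity, dma_channel, data):
    """Identify the CPU that will run the DMA-completion interrupt handler
    and inject the transferred lines into that CPU's cache, so the
    completion routine hits rather than misses. Illustrative sketch."""
    target_cpu = irq_affinity[dma_channel]   # CPU handling this channel's IRQ
    caches[target_cpu].update(data)          # "inject" lines into its cache
    return target_cpu
```
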
-
Patent number: 6938146
Abstract: A system and method for improving memory performance and decreasing memory power requirements is described. To accomplish the improvements, a prefetch buffer is added to a memory controller with accompanying prefetch logic. The memory controller first attempts to satisfy memory requests from the prefetch buffer, allowing the main memory to stay in a reduced power state until accessing it is required. If the memory controller is unable to satisfy a memory request from the prefetch buffer, the main memory is changed to an active power state and the prefetch logic is invoked. The prefetch logic loads the requested memory, returns the requested memory to the requester, and loads memory likely to be requested in the near future into the prefetch buffer. Concurrent with the execution of the prefetch logic, the memory controller returns the requested data.
Type: Grant
Filed: December 19, 2002
Date of Patent: August 30, 2005
Assignee: International Business Machines Corporation
Inventors: Hazim Shafi, Sivakumar Velusamy
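A minimal sketch of the controller behavior above: buffer hits leave main memory in a low-power state, while a miss wakes memory, serves the request, and prefetches likely-next lines into the buffer. Sequential next-line prefetching and the class names are illustrative assumptions.

```python
class PowerAwareController:
    """Toy model of a prefetch buffer in front of low-power main memory."""
    def __init__(self, memory, prefetch_depth=2):
        self.memory = memory      # address -> value (main memory)
        self.buffer = {}          # prefetch buffer
        self.depth = prefetch_depth
        self.power = "low"

    def read(self, addr):
        if addr in self.buffer:
            return self.buffer[addr]    # hit: memory stays in low power
        self.power = "active"           # miss: wake main memory
        value = self.memory[addr]
        for a in range(addr + 1, addr + 1 + self.depth):
            if a in self.memory:
                self.buffer[a] = self.memory[a]   # prefetch likely-next lines
        self.power = "low"              # memory can power down again
        return value
```
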
-
Publication number: 20050108496
Abstract: A method of assigning virtual memory to physical memory in a data processing system allocates a set of contiguous physical memory pages for a new page mapping, instructs the memory controller to move the virtual memory pages according to the new page mapping, and then allows access to the virtual memory pages using the new page mapping while the memory controller is still copying the virtual memory pages to the set of physical memory pages. The memory controller can use a mapping table which temporarily stores entries of the old and new page addresses, and releases the entries as copying for each entry is completed. The translation lookaside buffer (TLB) entries in the processor cores are updated for the new page addresses prior to completion of copying of the memory pages by the memory controller. The invention can be extended to non-uniform memory access (NUMA) systems.
Type: Application
Filed: November 13, 2003
Publication date: May 19, 2005
Applicant: International Business Machines Corporation
Inventors: Elmootazbellah Elnozahy, James Peterson, Ramakrishnan Rajamony, Hazim Shafi
-
Publication number: 20050086438
Abstract: A multiprocessor system includes a plurality of data processing nodes. Each node has a processor coupled to a system memory, a cache memory, and a cache directory. The cache directory contains cache coherency information for a predetermined range of system memory addresses. An interconnection enables the nodes to exchange messages. A node initiating a function shipping request identifies an intermediate destination directory based on a list of the function's operands and sends a message indicating the function and its corresponding operands to the identified destination directory. The destination cache directory determines a target node based, at least in part, on its cache coherency status information to reduce memory access latency by selecting a target node where all or some of the operands are valid in the local cache memory. The destination directory then ships the function to the target node over the interconnection.
Type: Application
Filed: October 16, 2003
Publication date: April 21, 2005
Applicant: International Business Machines Corporation
Inventors: James Peterson, Ramakrishnan Rajamony, Hazim Shafi
-
Publication number: 20050066113
Abstract: A method and memory controller for adaptive row management within a memory subsystem provides metrics for evaluating row access behavior and dynamically adjusting the row management policy of the memory subsystem in conformity with measured metrics to reduce the average latency of the memory subsystem. Counters provided within the memory controller track the number of consecutive row accesses and optionally the number of total accesses over a measurement interval. The number of counted consecutive row accesses can be used to control the closing of rows for subsequent accesses, reducing memory latency. The count may be validated using a second counter or storage for improved accuracy and alternatively the row close count may be set via program or logic control in conformity with a count of consecutive row hits in ratio with a total access count.
Type: Application
Filed: September 18, 2003
Publication date: March 24, 2005
Applicant: International Business Machines Corporation
Inventors: Ramakrishnan Rajamony, Hazim Shafi, Robert Tremaine
-
Publication number: 20040123042
Abstract: A system and method for improving memory performance and decreasing memory power requirements is described. To accomplish the improvements, a prefetch buffer is added to a memory controller with accompanying prefetch logic. The memory controller first attempts to satisfy memory requests from the prefetch buffer, allowing the main memory to stay in a reduced power state until accessing it is required. If the memory controller is unable to satisfy a memory request from the prefetch buffer, the main memory is changed to an active power state and the prefetch logic is invoked. The prefetch logic loads the requested memory, returns the requested memory to the requester, and loads memory likely to be requested in the near future into the prefetch buffer. Concurrent with the execution of the prefetch logic, the memory controller returns the requested data.
Type: Application
Filed: December 19, 2002
Publication date: June 24, 2004
Applicant: International Business Machines Corp.
Inventors: Hazim Shafi, Sivakumar Velusamy
-
Patent number: 6711650
Abstract: A method for accelerating input/output operations within a data processing system is disclosed. Initially, a determination is made in a cache controller as to whether or not a bus operation is a data transfer from a first memory to a second memory without intervening communications through a processor, such as a direct memory access (DMA) transfer. If the bus operation is such a data transfer, a determination is made in a cache memory as to whether or not the cache memory includes a copy of data from the data transfer. If the cache memory does not include a copy of data from the data transfer, a cache line is allocated within the cache memory to store a copy of data from the data transfer.
Type: Grant
Filed: November 7, 2002
Date of Patent: March 23, 2004
Assignee: International Business Machines Corporation
Inventors: Patrick Joseph Bohrer, Ramakrishnan Rajamony, Hazim Shafi
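The two-step check above (is this a memory-to-memory transfer? do we already hold a copy?) can be sketched as a snoop handler. The operation record, the dictionary cache, and the capacity limit are all illustrative simplifications of the hardware described.

```python
def snoop_bus_operation(cache, op, capacity=4):
    """On snooping a memory-to-memory (DMA-style) transfer, keep or allocate
    a cached copy of the transferred data so a later processor read hits.
    Returns True if the cache holds a copy afterward. Illustrative sketch."""
    if op["kind"] != "dma_transfer":
        return False                 # not a memory-to-memory transfer
    addr = op["addr"]
    if addr in cache:
        cache[addr] = op["data"]     # refresh the existing copy
        return True
    if len(cache) < capacity:
        cache[addr] = op["data"]     # allocate a line for the new copy
        return True
    return False                     # no line available for allocation
```
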
-
Publication number: 20040039855
Abstract: A networked device and system are disclosed wherein the networked device includes a network communication adapter and location information means. The adapter is configured to communicate with an external agent via a network to which the networked device is connected. The location information means is a part of or accessible to the network communication adapter and is configured to provide location information to the adapter where the location information is indicative of the adapter's geographic location. The adapter responds to a predetermined request from the external agent by providing the location information to the external agent. The location information means may include a global positioning system (GPS) receiver that is able to receive GPS signals and determine geographic information from the signals. In another embodiment, the location information means includes an ultra wideband (UWB) receiver that is able to receive UWB signals.
Type: Application
Filed: August 22, 2002
Publication date: February 26, 2004
Applicant: International Business Machines Corporation
Inventors: Patrick Joseph Bohrer, Ramakrishnan Rajamony, Hazim Shafi