Patents by Inventor William E. Speight

William E. Speight has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Scalable Space-Optimized and Energy-Efficient Computing System

Publication number: 20110292594

Abstract: A scalable space-optimized and energy-efficient computing system is provided. The computing system comprises a plurality of modular compartments in at least one level of a frame configured in a hexadron configuration. The computing system also comprises an air inlet, an air mixing plenum, and at least one fan. In the computing system the plurality of modular compartments are affixed above the air inlet, the air mixing plenum is affixed above the plurality of modular compartments, and the at least one fan is affixed above the air mixing plenum. When at least one module is inserted into one of the plurality of modular compartments, the module couples to a backplane within the frame.

Type: Application

Filed: May 28, 2010

Publication date: December 1, 2011

Applicant: International Business Machines Corporation

Inventors: John B. Carter, Wael R. El-Essawy, Elmootazbellah N. Elnozahy, Madhusudan K. Iyengar, Thomas W. Keller, Jr., Jian Li, Karthick Rajamani, Juan C. Rubio, William E. Speight, Lixin Zhang
Reducing Energy Consumption of Set Associative Caches by Reducing Checked Ways of the Set Association

Publication number: 20110296112

Abstract: Mechanisms for accessing a set associative cache of a data processing system are provided. A set of cache lines, in the set associative cache, associated with an address of a request are identified. Based on a determined mode of operation for the set, the following may be performed: determining if a cache hit occurs in a preferred cache line without accessing other cache lines in the set of cache lines; retrieving data from the preferred cache line without accessing the other cache lines in the set of cache lines, if it is determined that there is a cache hit in the preferred cache line; and accessing each of the other cache lines in the set of cache lines to determine if there is a cache hit in any of these other cache lines only in response to there being a cache miss in the preferred cache line(s).
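
The lookup order in this abstract can be illustrated with a minimal sketch. The names `Line`, `lookup`, and `preferred_mode` are hypothetical, and the count of ways accessed is only a stand-in for the energy argument; the abstract does not describe an implementation at this level.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Line:
    tag: Optional[int] = None
    data: Optional[str] = None


def lookup(cache_set, tag, preferred=0, preferred_mode=True):
    """Return (data, ways_accessed); data is None on a miss.

    In preferred mode only the preferred way is read first, and the
    remaining ways are read only after a miss in that way.
    """
    if preferred_mode:
        if cache_set[preferred].tag == tag:
            return cache_set[preferred].data, 1
        rest = [line for i, line in enumerate(cache_set) if i != preferred]
    else:
        rest = list(cache_set)
    # From this point every way in the set has been accessed.
    ways_accessed = len(cache_set)
    for line in rest:
        if line.tag == tag:
            return line.data, ways_accessed
    return None, ways_accessed


four_way_set = [Line(10, "a"), Line(11, "b"), Line(12, "c"), Line(13, "d")]
print(lookup(four_way_set, 10))  # hit in the preferred way
print(lookup(four_way_set, 12))  # hit found only after the fallback
```

A hit in the preferred way touches one way instead of four, which is where the energy saving in this sketch would come from.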

Type: Application

Filed: May 25, 2010

Publication date: December 1, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jian Li, William E. Speight, Lixin Zhang
Stackable Module for Energy-Efficient Computing Systems

Publication number: 20110292597

Abstract: A modular processing module is provided. The modular processing module comprises a set of processing module sides. Each processing module side comprises a circuit board, a plurality of connectors coupled to the circuit board, and a plurality of processing nodes coupled to the circuit board. Each processing module side in the set of processing module sides couples to another processing module side using at least one connector in the plurality of connectors such that, when all of the set of processing module sides are coupled together, the modular processing module is formed. The modular processing module comprises an exterior connection to a power source and a communication system.

Type: Application

Filed: May 28, 2010

Publication date: December 1, 2011

Applicant: International Business Machines Corporation

Inventors: John B. Carter, Wael R. El-Essawy, Elmootazbellah N. Elnozahy, Wesley M. Felter, Madhusudan K. Iyengar, Thomas W. Keller, Jr., Karthick Rajamani, Juan C. Rubio, William E. Speight, Lixin Zhang
Mechanisms for Reducing DRAM Power Consumption

Publication number: 20110296097

Abstract: Mechanisms are provided for inhibiting precharging of memory cells of a dynamic random access memory (DRAM) structure. The mechanisms receive a command for accessing memory cells of the DRAM structure. The mechanisms further determine, based on the command, if precharging the memory cells following accessing the memory cells is to be inhibited. Moreover, the mechanisms send, in response to the determination indicating that precharging the memory cells is to be inhibited, a command to blocking logic of the DRAM structure to block precharging of the memory cells following accessing the memory cells.

Type: Application

Filed: May 27, 2010

Publication date: December 1, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Elmootazbellah N. Elnozahy, Karthick Rajamani, William E. Speight, Lixin Zhang
Latency-Tolerant 3D On-Chip Memory Organization

Publication number: 20110296107

Abstract: A mechanism is provided within a 3D stacked memory organization to spread or stripe cache lines across multiple layers. In an example organization, a 128B cache line takes eight cycles on a 16B-wide bus. Each layer may provide 32B. The first layer uses the first two of the eight transfer cycles to send the first 32B. The next layer sends the next 32B using the next two cycles of the eight transfer cycles, and so forth. The mechanism provides a uniform memory access.
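
The arithmetic in this example organization can be worked through in a few lines. The function name `stripe_schedule` is hypothetical, and the count of four layers is derived here from 128B / 32B rather than stated in the abstract.

```python
def stripe_schedule(line_bytes=128, bus_bytes=16, layer_bytes=32):
    """Return (total_cycles, [(layer, cycles, (first_byte, last_byte)), ...])."""
    total_cycles = line_bytes // bus_bytes
    cycles_per_layer = layer_bytes // bus_bytes
    schedule = []
    for layer in range(line_bytes // layer_bytes):
        first_cycle = layer * cycles_per_layer
        cycles = list(range(first_cycle, first_cycle + cycles_per_layer))
        byte_range = (layer * layer_bytes, (layer + 1) * layer_bytes - 1)
        schedule.append((layer, cycles, byte_range))
    return total_cycles, schedule


total, schedule = stripe_schedule()
print(total)  # 8 transfer cycles for the whole line
for layer, cycles, byte_range in schedule:
    print(layer, cycles, byte_range)
```

With the default parameters, layer 0 drives cycles 0 and 1, layer 1 drives cycles 2 and 3, and so on, so a slower (deeper) layer has until its own slot to deliver its 32B.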

Type: Application

Filed: May 26, 2010

Publication date: December 1, 2011

Applicant: International Business Machines Corporation

Inventors: Jian Li, William E. Speight, Lixin Zhang
Assigning Memory to On-Chip Coherence Domains

Publication number: 20110296115

Abstract: A mechanism is provided for assigning memory to on-chip cache coherence domains. The mechanism assigns caches within a processing unit to coherence domains. The mechanism then assigns chunks of memory to the coherence domains. The mechanism monitors applications running on cores within the processing unit to identify needs of the applications. The mechanism may then reassign memory chunks to the cache coherence domains based on the needs of the applications running in the coherence domains. When a memory controller receives a cache miss, the memory controller may look up the address in a lookup table that maps memory chunks to cache coherence domains. Snoop requests are sent to caches within the coherence domain. If a cache line is found in a cache within the coherence domain, the cache line is returned to the originating cache by the cache containing the cache line either directly or through the memory controller.
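
The miss-handling path in this abstract can be sketched as a table lookup followed by snoops limited to one domain. Everything named here (`MemoryController`, `CHUNK_SIZE`, modelling each cache as a dict) is an assumption made for illustration, not the patented design.

```python
CHUNK_SIZE = 1 << 20  # hypothetical chunk granularity (1 MiB)


class MemoryController:
    def __init__(self, domains):
        # domain id -> list of caches; each cache is modelled as {address: data}
        self.domains = domains
        self.chunk_to_domain = {}

    def assign(self, chunk, domain):
        """Assign (or reassign) a memory chunk to a coherence domain."""
        self.chunk_to_domain[chunk] = domain

    def handle_miss(self, address, requester):
        """Snoop only the caches in the domain that owns the address."""
        domain = self.chunk_to_domain[address // CHUNK_SIZE]
        for cache in self.domains[domain]:
            if cache is not requester and address in cache:
                return domain, cache[address]
        return domain, None  # not cached in the domain; go to memory


c0, c1, c2 = {}, {0x10: "x"}, {0x100010: "y"}
mc = MemoryController({0: [c0, c1], 1: [c2]})
mc.assign(0, 0)
mc.assign(1, 1)
print(mc.handle_miss(0x10, c0))      # found by snooping domain 0
mc.assign(1, 0)                      # reassign chunk 1 to domain 0
print(mc.handle_miss(0x100010, c0))  # domain 0 caches do not hold it
```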

Type: Application

Filed: May 26, 2010

Publication date: December 1, 2011

Applicant: International Business Machines Corporation

Inventors: William E. Speight, Lixin Zhang
Data Reorganization through Hardware-Supported Intermediate Addresses

Publication number: 20110238946

Abstract: A virtual address scheme for improving performance and efficiency of memory accesses of sparsely-stored data items in a cached memory system is disclosed. In a preferred embodiment of the present invention, a special address translation unit is used to translate sets of non-contiguous addresses in real memory into contiguous blocks of addresses in an “intermediate address space.” This intermediate address space is a fictitious or “virtual” address space, but is distinguishable from the virtual address space visible to application programs, and in user-level memory operations, effective addresses seen/manipulated by application programs are translated into intermediate addresses by an additional address translation unit for memory caching purposes. This scheme allows non-contiguous data items in memory to be assembled into contiguous cache lines for more efficient caching/access (due to the perceived spatial proximity of the data from the perspective of the processor).
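
A small sketch of the idea: a translation table maps consecutive intermediate addresses onto scattered real addresses, so one contiguous "line" gathers non-contiguous items. The names `make_translation`, `fetch_line`, and the 8-byte item size are hypothetical.

```python
ITEM = 8  # hypothetical item size in bytes


def make_translation(real_addrs, intermediate_base):
    """Map consecutive intermediate addresses onto scattered real addresses."""
    return {intermediate_base + i * ITEM: ra for i, ra in enumerate(real_addrs)}


def fetch_line(translation, memory, intermediate_base, line_bytes=32):
    """Assemble one contiguous intermediate-space line from real memory."""
    return [
        memory[translation[ia]]
        for ia in range(intermediate_base, intermediate_base + line_bytes, ITEM)
    ]


# Four items that sit 4 KiB apart in real memory (e.g. one column of a matrix).
memory = {0: "a", 4096: "b", 8192: "c", 12288: "d"}
table = make_translation([0, 4096, 8192, 12288], intermediate_base=0x1000)
print(fetch_line(table, memory, 0x1000))
```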

Type: Application

Filed: March 24, 2010

Publication date: September 29, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ramakrishnan Rajamony, William E. Speight, Lixin Zhang
Remote asynchronous data mover

Patent number: 7996564

Abstract: A distributed data processing system executes multiple tasks within a parallel job, including a first local task on a local node and at least one task executing on a remote node, with a remote memory having real address (RA) locations mapped to one or more of the source effective addresses (EA) and destination EA of a data move operation initiated by a task executing on the local node. On initiation of the data move operation, remote asynchronous data move (RADM) logic identifies that the operation moves data to/from a first EA that is memory mapped to an RA of the remote memory. The local processor/RADM logic initiates a RADM operation that moves a copy of the data directly from/to the first remote memory by completing the RADM operation using the network interface cards (NICs) of the source and destination processing nodes, determined by accessing a data center for the node IDs of remote memory.

Type: Grant

Filed: April 16, 2009

Date of Patent: August 9, 2011

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ronald N. Kalla, Ramakrishnan Rajamony, Balaram Sinharoy, William E. Speight, William J. Starke
CACHE DIRECTED SEQUENTIAL PREFETCH

Publication number: 20110145509

Abstract: A technique for performing stream detection and prefetching within a cache memory simplifies stream detection and prefetching. A bit in a cache directory or cache entry indicates that a cache line has not been accessed since being prefetched and another bit indicates the direction of a stream associated with the cache line. A next cache line is prefetched when a previously prefetched cache line is accessed, so that the cache always attempts to prefetch one cache line ahead of accesses, in the direction of a detected stream. Stream detection is performed in response to load misses tracked in the load miss queue (LMQ). The LMQ stores an offset indicating a first miss at the offset within a cache line. A next miss to the line sets a direction bit based on the difference between the first and second offsets and causes prefetch of the next line for the stream.
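
The detection and run-ahead behaviour described here can be modelled in software as follows. This is a sketch under simplifying assumptions: `StreamPrefetcher` is a hypothetical name, the load miss queue is a plain dict, and timing (for example, whether the first miss is still outstanding) is ignored.

```python
class StreamPrefetcher:
    def __init__(self):
        self.lmq = {}         # line -> offset of the first miss in that line
        self.prefetched = {}  # line -> direction, while "not yet accessed" is set
        self.issued = []      # log of prefetched line numbers

    def _prefetch(self, line, direction):
        self.prefetched[line] = direction
        self.issued.append(line)

    def access(self, line, offset):
        if line in self.prefetched:
            # First access to a prefetched line: clear its bit, stay one line ahead.
            direction = self.prefetched.pop(line)
            self._prefetch(line + direction, direction)
        elif line in self.lmq:
            first = self.lmq[line]
            if offset != first:
                # Second miss to the line: the offset difference gives the direction.
                del self.lmq[line]
                direction = 1 if offset > first else -1
                self._prefetch(line + direction, direction)
        else:
            self.lmq[line] = offset  # first miss: remember where it hit the line


p = StreamPrefetcher()
p.access(100, 8)    # first miss
p.access(100, 16)   # second miss, ascending
p.access(101, 0)    # access to the prefetched line
print(p.issued)
```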

Type: Application

Filed: February 9, 2011

Publication date: June 16, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: William E. Speight, Lixin Zhang
Performing collective operations using software setup and partial software execution at leaf nodes in a multi-tiered full-graph interconnect architecture

Patent number: 7958183

Abstract: A mechanism for performing collective operations. In software executing on a parent processor in a first processor book, a number of other processors are determined in a same or different processor book of the data processing system that is needed to execute the collective operation, thereby establishing a plurality of processors comprising the parent processor and the other processors. In software executing on the parent processor, the plurality of processors are logically arranged as a plurality of nodes in a hierarchical structure. The collective operation is transmitted to the plurality of processors based on the hierarchical structure. In hardware of the parent processor, results are received from the execution of the collective operation from the other processors, a final result is generated of the collective operation based on the received results, and the final result is output.

Type: Grant

Filed: August 27, 2007

Date of Patent: June 7, 2011

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, William E. Speight
Providing full hardware support of collective operations in a multi-tiered full-graph interconnect architecture

Patent number: 7958182

Abstract: A mechanism is provided for performing collective operations. In hardware of a parent processor in a first processor book, a number of other processors are determined in a same or different processor book of the data processing system that is needed to execute the collective operation, thereby establishing a plurality of processors comprising the parent processor and the other processors. In hardware of the parent processor, the plurality of processors are logically arranged as a plurality of nodes in a hierarchical structure. The collective operation is transmitted to the plurality of processors based on the hierarchical structure. In hardware of the parent processor, results are received from the execution of the collective operation from the other processors, a final result is generated of the collective operation based on the received results, and the final result is output.

Type: Grant

Filed: August 27, 2007

Date of Patent: June 7, 2011

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, William E. Speight
Cache directed sequential prefetch

Patent number: 7958317

Abstract: A technique for performing stream detection and prefetching within a cache memory simplifies stream detection and prefetching. A bit in a cache directory or cache entry indicates that a cache line has not been accessed since being prefetched and another bit indicates the direction of a stream associated with the cache line. A next cache line is prefetched when a previously prefetched cache line is accessed, so that the cache always attempts to prefetch one cache line ahead of accesses, in the direction of a detected stream. Stream detection is performed in response to load misses tracked in the load miss queue (LMQ). The LMQ stores an offset indicating a first miss at the offset within a cache line. A next miss to the line sets a direction bit based on the difference between the first and second offsets and causes prefetch of the next line for the stream.

Type: Grant

Filed: August 4, 2008

Date of Patent: June 7, 2011

Assignee: International Business Machines Corporation

Inventors: William E. Speight, Lixin Zhang
Dynamic adjustment of prefetch stream priority

Patent number: 7958316

Abstract: A method, processor, and data processing system for dynamically adjusting a prefetch stream priority based on the consumption rate of the data by the processor. The method includes a prefetch engine issuing a prefetch request of a first prefetch stream to fetch one or more data from the memory subsystem. The first prefetch stream has a first assigned priority that determines a relative order for scheduling prefetch requests of the first prefetch stream relative to other prefetch requests of other prefetch streams. Based on the receipt of a processor demand for the data before the data returns to the cache or return of the data a long time before receiving the processor demand, logic of the prefetch engine dynamically changes the first assigned priority to a second higher or lower priority, which priority is subsequently utilized to schedule and issue a next prefetch request of the first prefetch stream.
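
One way to read the adjustment rule is as a comparison of two timestamps. In the sketch below, the function name, the eight priority levels, and the `early_threshold` of 100 cycles are all assumptions; the abstract gives no concrete values.

```python
def adjust_priority(priority, data_return, demand,
                    early_threshold=100, lowest=0, highest=7):
    """Return the stream's next priority given two event times (in cycles)."""
    if demand < data_return:
        # The processor asked before the data arrived: the prefetch was late.
        return min(priority + 1, highest)
    if demand - data_return > early_threshold:
        # The data sat in the cache a long time before it was needed.
        return max(priority - 1, lowest)
    return priority


print(adjust_priority(3, data_return=50, demand=40))   # late prefetch
print(adjust_priority(3, data_return=50, demand=500))  # very early prefetch
print(adjust_priority(3, data_return=50, demand=60))   # about right
```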

Type: Grant

Filed: February 1, 2008

Date of Patent: June 7, 2011

Assignee: International Business Machines Corporation

Inventors: William E. Speight, Lixin Zhang
Read and Write Aware Cache

Publication number: 20110072214

Abstract: A mechanism is provided in a cache for providing a read and write aware cache. The mechanism partitions a large cache into a read-often region and a write-often region. The mechanism considers read/write frequency in a non-uniform cache architecture replacement policy. A frequently written cache line is placed in one of the farther banks. A frequently read cache line is placed in one of the closer banks. The size ratio between read-often and write-often regions may be static or dynamic. The boundary between the read-often region and the write-often region may be distinct or fuzzy.

Type: Application

Filed: September 18, 2009

Publication date: March 24, 2011

Applicant: International Business Machines Corporation

Inventors: Jian Li, Ramakrishnan Rajamony, William E. Speight, Lixin Zhang
Routing information through a data processing system implementing a multi-tiered full-graph interconnect architecture

Patent number: 7904590

Abstract: A mechanism is provided for routing information through the data processing system. Data is received at a source processor within a set of processors that is to be transmitted to a destination processor, where the data includes address information. A first determination is performed as to whether the destination processor is within a same processor book as the source processor based on the address information. A second determination is performed as to whether the destination processor is within a same supernode as the source processor based on the address information if the destination processor is not within the same processor book. A routing path is identified for the data based on results of the first determination, the second determination, and one or more routing table data structures. The data is then transmitted from the source processor along the identified routing path toward the destination processor.
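
The two determinations in this abstract amount to classifying the destination relative to the source before consulting a routing table. The sketch below assumes a hypothetical address layout of (supernode, book, chip) and hypothetical scope labels; the routing-table lookup itself is left out.

```python
from typing import NamedTuple


class Addr(NamedTuple):
    supernode: int
    book: int
    chip: int


def classify_route(src, dst):
    """Apply the first (same book?) and second (same supernode?) determinations."""
    if (src.supernode, src.book) == (dst.supernode, dst.book):
        return "same-book"
    if src.supernode == dst.supernode:
        return "same-supernode"
    return "other-supernode"


src = Addr(supernode=0, book=1, chip=2)
print(classify_route(src, Addr(0, 1, 5)))
print(classify_route(src, Addr(0, 3, 0)))
print(classify_route(src, Addr(4, 1, 2)))
```

The returned scope would then select which routing table data structure to consult when building the path.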

Type: Grant

Filed: August 27, 2007

Date of Patent: March 8, 2011

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, William E. Speight
Fine Grained Cache Allocation

Publication number: 20110022773

Abstract: A mechanism is provided in a virtual machine monitor for fine grained cache allocation in a shared cache. The mechanism partitions a cache tag into a most significant bit (MSB) portion and a least significant bit (LSB) portion. The MSB portion of the tags is shared among the cache lines in a set. The LSB portion of the tags is private, one per cache line. The mechanism allows software to set the MSB portion of tags in a cache to allocate sets of cache lines. The cache controller determines whether a cache line is locked based on the MSB portion of the tag.

Type: Application

Filed: July 27, 2009

Publication date: January 27, 2011

Applicant: International Business Machines Corporation

Inventors: Ramakrishnan Rajamony, William E. Speight, Lixin Zhang
Method and System for Performance Isolation in Virtualized Environments

Publication number: 20110004875

Abstract: A method, a system, an apparatus, and a computer program product for allocating resources of one or more shared devices to one or more partitions of a virtualization environment within a data processing system. At least one user defined resource assignment is received for one or more devices associated with the data processing system. One or more registers, associated with the one or more partitions are dynamically set to execute the at least one resource assignment, whereby the at least one resource assignment enables a user defined quantitative measure (number and/or percentage) of devices to operate when the one or more transactions are executed via the partition. The system enables the one or more devices to execute one or more transactions at a bandwidth/capacity that is less than or equal to the user defined resource assignment and minimizes performance interference among partitions.

Type: Application

Filed: July 1, 2009

Publication date: January 6, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Elmootazbellah N. Elnozahy, Ramakrishnan Rajamony, William E. Speight, Lixin Zhang
DATA PROCESSING SYSTEM, PROCESSOR AND METHOD FOR VARYING A DATA PREFETCH SIZE BASED UPON DATA USAGE

Publication number: 20100293339

Abstract: A method of data processing in a processor includes maintaining a usage history indicating demand usage of prefetched data retrieved into cache memory. An amount of data to prefetch by a data prefetch request is selected based upon the usage history. The data prefetch request is transmitted to a memory hierarchy to prefetch the selected amount of data into cache memory.
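
The abstract leaves the selection policy open. As one possible illustration, the sketch below keeps a short history of how much of each prefetch was actually demanded and scales the next request accordingly. The class name, the candidate sizes, and the averaging policy are all assumptions.

```python
from collections import deque


class PrefetchSizer:
    def __init__(self, sizes=(32, 64, 128), window=4):
        self.sizes = sizes
        self.history = deque(maxlen=window)  # fraction of each prefetch demanded

    def record(self, bytes_prefetched, bytes_used):
        self.history.append(bytes_used / bytes_prefetched)

    def next_size(self):
        """Pick a larger prefetch size when more of the prefetched data is used."""
        if not self.history:
            return self.sizes[0]
        usage = sum(self.history) / len(self.history)
        index = min(int(usage * len(self.sizes)), len(self.sizes) - 1)
        return self.sizes[index]


sizer = PrefetchSizer()
print(sizer.next_size())  # no history yet: smallest size
sizer.record(bytes_prefetched=128, bytes_used=128)
print(sizer.next_size())  # fully used: largest size
```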

Type: Application

Filed: February 1, 2008

Publication date: November 18, 2010

Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
Remote Asynchronous Data Mover

Publication number: 20100268788

Abstract: A distributed data processing system executes multiple tasks within a parallel job, including a first local task on a local node and at least one task executing on a remote node, with a remote memory having real address (RA) locations mapped to one or more of the source effective addresses (EA) and destination EA of a data move operation initiated by a task executing on the local node. On initiation of the data move operation, remote asynchronous data move (RADM) logic identifies that the operation moves data to/from a first EA that is memory mapped to an RA of the remote memory. The local processor/RADM logic initiates a RADM operation that moves a copy of the data directly from/to the first remote memory by completing the RADM operation using the network interface cards (NICs) of the source and destination processing nodes, determined by accessing a data center for the node IDs of remote memory.

Type: Application

Filed: April 16, 2009

Publication date: October 21, 2010

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ronald N. Kalla, Ramakrishnan Rajamony, Balaram Sinharoy, William E. Speight, William J. Starke
Branch target address cache

Patent number: 7783870

Abstract: A processor includes an execution unit and instruction sequencing logic that fetches instructions from a memory system for execution. The instruction sequencing logic includes branch logic that outputs predicted branch target addresses for use as instruction fetch addresses. The branch logic includes a level one branch target address cache (BTAC) and a level two BTAC each having a respective plurality of entries each associating at least a tag with a predicted branch target address. The branch logic accesses the level one and level two BTACs in parallel with a tag portion of a first instruction fetch address to obtain a first predicted branch target address from the level one BTAC for use as a second instruction fetch address in a first processor clock cycle and a second predicted branch target address from the level two BTAC for use as a third instruction fetch address in a later second processor clock cycle.

Type: Grant

Filed: August 13, 2007

Date of Patent: August 24, 2010

Assignee: International Business Machines Corporation

Inventors: David S. Levitan, William E. Speight, Lixin Zhang
