Patents by Inventor William E. Speight

William E. Speight has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Fast and accurate proton therapy dose calculations

Patent number: 9999788

Abstract: Simulating particle beam interactions includes identifying a set of n functions F1, F2, . . . , Fn corresponding to a plurality of different physical aspects of a particle beam, performing simulations of each Fi using a full physics model, selecting for each Fi a distribution function fi that models relevant behavior and reducing computation of the full physics model for each Fi by replacing Fi with a distribution function fi. The computation reduction includes comparing a set of simulations wherein each fi replaces its respective Fi to determine if relevant behavior is accurately modeled and selecting one of fi or Fi for each n, for a Monte Carlo simulation based on a runtime and accuracy criteria.
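
The selection step at the end of this abstract can be sketched in a few lines. This is an illustration only, not the patented method: the function name `choose_models`, the aspect names, and every number are hypothetical, and the rule shown (accept the distribution function when it is both within an error tolerance and faster) is just one plausible reading of "runtime and accuracy criteria".

```python
def choose_models(candidates, max_error):
    """Pick 'distribution' (fi) or 'full_physics' (Fi) for each physical aspect."""
    choice = {}
    for name, stats in candidates.items():
        # Assumed criterion: the replacement must be accurate enough AND cheaper.
        accurate = stats["fast_error"] <= max_error
        faster = stats["fast_runtime"] < stats["full_runtime"]
        choice[name] = "distribution" if (accurate and faster) else "full_physics"
    return choice


# Hypothetical measurements, invented for this example.
candidates = {
    "energy_loss": {"full_runtime": 10.0, "fast_runtime": 1.0, "fast_error": 0.01},
    "scattering": {"full_runtime": 8.0, "fast_runtime": 2.0, "fast_error": 0.20},
}
selection = choose_models(candidates, max_error=0.05)
```

With these made-up inputs, the first aspect would use its distribution function and the second would keep the full physics model, because its replacement exceeds the error tolerance.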

Type: Grant

Filed: January 14, 2015

Date of Patent: June 19, 2018

Assignee: International Business Machines Corporation

Inventors: Anne E. Gattiker, Damir A. Jamsek, Sani R. Nassif, Thomas H. Osiecki, William E. Speight, Chin Ngai Sze, Min-Yu Tsai

Techniques for predicated execution in an out-of-order processor

Patent number: 9946550

Abstract: A technique for handling predicated code in an out-of-order processor includes detecting a predicate defining instruction associated with a predicated code region. Renaming of predicated instructions, within the predicated code region, is then stalled until a predicate of the predicate defining instruction is resolved.

Type: Grant

Filed: September 17, 2007

Date of Patent: April 17, 2018

Assignee: International Business Machines Corporation

Inventors: Ram Rangan, William E. Speight, Mark W. Stephenson, Lixin Zhang

Fast and accurate proton therapy dose calculations

Publication number: 20150352374

Abstract: Simulating particle beam interactions includes identifying a set of n functions F1, F2, . . . , Fn corresponding to a plurality of different physical aspects of a particle beam, performing simulations of each Fi using a full physics model, selecting for each Fi a distribution function fi that models relevant behavior and reducing computation of the full physics model for each Fi by replacing Fi with a distribution function fi. The computation reduction includes comparing a set of simulations wherein each fi replaces its respective Fi to determine if relevant behavior is accurately modeled and selecting one of fi or Fi for each n, for a Monte Carlo simulation based on a runtime and accuracy criteria.

Type: Application

Filed: January 14, 2015

Publication date: December 10, 2015

Inventors: Anne E. Gattiker, Damir A. Jamsek, Sani R. Nassif, Thomas H. Osiecki, William E. Speight, Chin Ngai Sze, Min-Yu Tsai

Non-uniform cache architecture (NUCA)

Patent number: 9152569

Abstract: In one embodiment, a cache memory includes a cache array including a plurality of entries for caching cache lines of data, where the plurality of entries are distributed between a first region implemented in a first memory technology and a second region implemented in a second memory technology. The cache memory further includes a cache directory of the contents of the cache array and a cache controller that controls operation of the cache memory.

Type: Grant

Filed: November 4, 2008

Date of Patent: October 6, 2015

Assignee: International Business Machines Corporation

Inventors: Jian Li, Ramakrishnan Rajamony, William E. Speight, Xiaoxia Wu, Lixin Zhang

Address translation through an intermediate address space

Patent number: 8966219

Abstract: In a data processing system capable of concurrently executing multiple hardware threads of execution, an intermediate address translation unit in a processing unit translates an effective address for a memory access into an intermediate address. A cache memory is accessed utilizing the intermediate address. In response to a miss in cache memory, the intermediate address is translated into a real address by a real address translation unit that performs address translation for multiple hardware threads of execution. The system memory is accessed with the real address.
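
The two-stage flow in this abstract (effective address to intermediate address, then intermediate to real only on a cache miss) can be sketched as follows. Everything here is an assumption made for illustration: the class name, the page size, the dictionary-based page maps, and the cache that is modeled as a plain dictionary keyed by intermediate address.

```python
PAGE = 4096  # assumed page size, not stated in the abstract


class TwoStageTranslator:
    def __init__(self, ea_to_ia, ia_to_ra):
        self.ea_to_ia = ea_to_ia  # {thread: {effective_page: intermediate_page}}
        self.ia_to_ra = ia_to_ra  # shared by all threads: {intermediate_page: real_page}
        self.cache = {}           # intermediate address -> data
        self.real_translations = 0

    def load(self, thread, ea, memory):
        # Stage 1: effective -> intermediate, done for every access.
        page, offset = divmod(ea, PAGE)
        ia = self.ea_to_ia[thread][page] * PAGE + offset
        if ia in self.cache:
            return self.cache[ia]  # hit: no real-address translation needed
        # Stage 2: intermediate -> real, only on a cache miss.
        self.real_translations += 1
        ia_page, offset = divmod(ia, PAGE)
        ra = self.ia_to_ra[ia_page] * PAGE + offset
        data = memory[ra]
        self.cache[ia] = data
        return data
```

In this sketch a repeated load of the same address increments `real_translations` only once, which is the property the abstract describes: the real address is computed only after a miss in the intermediate-addressed cache.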

Type: Grant

Filed: October 30, 2007

Date of Patent: February 24, 2015

Assignee: International Business Machines Corporation

Inventors: Ramakrishnan Rajamony, William E. Speight, Lixin Zhang

Performing setup operations for receiving different amounts of data while processors are performing message passing interface tasks

Patent number: 8893148

Abstract: A system and method are provided for performing setup operations for receiving a different amount of data while processors are performing message passing interface (MPI) tasks. Mechanisms for adjusting the balance of processing workloads of the processors are provided so as to minimize wait periods for waiting for all of the processors to call a synchronization operation. An MPI load balancing controller maintains a history that provides a profile of the tasks with regard to their calls to synchronization operations. From this information, it can be determined which processors should have their processing loads lightened and which processors are able to handle additional processing loads without significantly negatively affecting the overall operation of the parallel execution system. As a result, setup operations may be performed while processors are performing MPI tasks to prepare for receiving different sized portions of data in a subsequent computation cycle based on the history.

Type: Grant

Filed: June 15, 2012

Date of Patent: November 18, 2014

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, William E. Speight

Read and write aware cache with a read portion and a write portion of a tag and status array

Patent number: 8843705

Abstract: A mechanism is provided in a cache for providing a read and write aware cache. The mechanism partitions a large cache into a read-often region and a write-often region. The mechanism considers read/write frequency in a non-uniform cache architecture replacement policy. A frequently written cache line is placed in one of the farther banks. A frequently read cache line is placed in one of the closer banks. The size ratio between read-often and write-often regions may be static or dynamic. The boundary between the read-often region and the write-often region may be distinct or fuzzy.

Type: Grant

Filed: August 13, 2012

Date of Patent: September 23, 2014

Assignee: International Business Machines Corporation

Inventors: Jian Li, Ramakrishnan Rajamony, William E. Speight, Lixin Zhang

Method and system for performance isolation in virtualized environments

Patent number: 8756608

Abstract: A method, a system, an apparatus, and a computer program product for allocating resources of one or more shared devices to one or more partitions of a virtualization environment within a data processing system. At least one user defined resource assignment is received for one or more devices associated with the data processing system. One or more registers, associated with the one or more partitions are dynamically set to execute the at least one resource assignment, whereby the at least one resource assignment enables a user defined quantitative measure (number and/or percentage) of devices to operate when the one or more transactions are executed via the partition. The system enables the one or more devices to execute one or more transactions at a bandwidth/capacity that is less than or equal to the user defined resource assignment and minimizes performance interference among partitions.

Type: Grant

Filed: July 1, 2009

Date of Patent: June 17, 2014

Assignee: International Business Machines Corporation

Inventors: Elmootazbellah N. Elnozahy, Ramakrishnan Rajamony, William E. Speight, Lixin Zhang

Assigning memory to on-chip coherence domains

Patent number: 8612691

Abstract: A mechanism for assigning memory to on-chip cache coherence domains assigns caches within a processing unit to coherence domains. The mechanism assigns chunks of memory to the coherence domains. The mechanism monitors applications running on cores within the processing unit to identify needs of the applications. The mechanism may then reassign memory chunks to the cache coherence domains based on the needs of the applications running in the coherence domains. When a memory controller receives the cache miss, the memory controller may look up the address in a lookup table that maps memory chunks to cache coherence domains. Snoop requests are sent to caches within the coherence domain. If a cache line is found in a cache within the coherence domain, the cache line is returned to the originating cache by the cache containing the cache line either directly or through the memory controller.
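
The miss-handling path in this abstract (look up the chunk's coherence domain, then snoop only the caches in that domain) might look roughly like the sketch below. The chunk size, the class and method names, and the use of dictionaries to stand in for caches are all assumptions made for the example.

```python
CHUNK = 1 << 20  # assumed chunk size (1 MiB); the abstract does not give one


class MemoryController:
    def __init__(self, chunk_to_domain, domain_caches):
        self.chunk_to_domain = chunk_to_domain  # lookup table: chunk index -> domain id
        self.domain_caches = domain_caches      # domain id -> list of caches (dicts)

    def reassign(self, chunk, domain):
        # Memory chunks may be moved between domains as application needs change.
        self.chunk_to_domain[chunk] = domain

    def handle_miss(self, address, requester):
        domain = self.chunk_to_domain[address // CHUNK]
        # Snoop only the caches inside the owning coherence domain.
        for cache in self.domain_caches[domain]:
            if cache is not requester and address in cache:
                return cache[address]
        return None  # not cached in the domain; would be served from memory
```

The point of the lookup table is that a miss never has to snoop caches outside the domain that owns the address.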

Type: Grant

Filed: April 24, 2012

Date of Patent: December 17, 2013

Assignee: International Business Machines Corporation

Inventors: William E. Speight, Lixin Zhang

Latency-tolerant 3D on-chip memory organization

Patent number: 8612687

Abstract: A mechanism is provided within a 3D stacked memory organization to spread or stripe cache lines across multiple layers. In an example organization, a 128B cache line takes eight cycles on a 16B-wide bus. Each layer may provide 32B. The first layer uses the first two of the eight transfer cycles to send the first 32B. The next layer sends the next 32B using the next two cycles of the eight transfer cycles, and so forth. The mechanism provides a uniform memory access.
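
The arithmetic in this example organization can be checked with a short sketch. The 128B line, 16B bus, and 32B per layer come from the abstract; the function name and the resulting count of four layers (128 / 32) are inferred for illustration.

```python
LINE_BYTES = 128   # cache line size from the abstract
BUS_BYTES = 16     # bus width from the abstract
LAYER_BYTES = 32   # bytes supplied by each layer, from the abstract


def stripe_schedule(line_bytes=LINE_BYTES, bus_bytes=BUS_BYTES, layer_bytes=LAYER_BYTES):
    """Return (cycle, layer, byte_offset) for each transfer cycle of one cache line."""
    cycles_per_layer = layer_bytes // bus_bytes  # 2 cycles per layer
    total_cycles = line_bytes // bus_bytes       # 8 cycles per line
    schedule = []
    for cycle in range(total_cycles):
        layer = cycle // cycles_per_layer
        schedule.append((cycle, layer, cycle * bus_bytes))
    return schedule


schedule = stripe_schedule()
```

Cycles 0 and 1 are served by the first layer, cycles 2 and 3 by the next, and so on, so a slower layer has until its own pair of cycles to deliver its 32B.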

Type: Grant

Filed: May 26, 2010

Date of Patent: December 17, 2013

Assignee: International Business Machines Corporation

Inventors: Jian Li, William E. Speight, Lixin Zhang

Varying a data prefetch size based upon data usage

Patent number: 8595443

Abstract: A method of data processing in a processor includes maintaining a usage history indicating demand usage of prefetched data retrieved into cache memory. An amount of data to prefetch by a data prefetch request is selected based upon the usage history. The data prefetch request is transmitted to a memory hierarchy to prefetch the selected amount of data into cache memory.
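
One way to picture "selecting an amount of data to prefetch based on usage history" is the sketch below. The policy shown (scale the prefetch size by the average fraction of prefetched lines that were actually used) is an assumption chosen for simplicity, as are the class name, the window length, and the maximum size.

```python
from collections import deque


class AdaptivePrefetcher:
    def __init__(self, max_lines=4, window=8):
        # Usage history: fraction of prefetched lines later hit by demand accesses.
        self.history = deque(maxlen=window)
        self.max_lines = max_lines

    def record(self, lines_prefetched, lines_used):
        if lines_prefetched > 0:
            self.history.append(lines_used / lines_prefetched)

    def next_prefetch_size(self):
        if not self.history:
            return 1  # no history yet: prefetch conservatively
        usefulness = sum(self.history) / len(self.history)
        return max(1, round(usefulness * self.max_lines))
```

A stream whose prefetched lines are always consumed drifts toward `max_lines`, while one that rarely touches them falls back to a single line per request.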

Type: Grant

Filed: February 1, 2008

Date of Patent: November 26, 2013

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang

Instruction set architecture extensions for performing power versus performance tradeoffs

Patent number: 8589665

Abstract: Mechanisms are provided for processing an instruction in a processor of a data processing system. The mechanisms operate to receive, in a processor of the data processing system, an instruction, the instruction including power/performance tradeoff information associated with the instruction. The mechanisms further operate to determine power/performance tradeoff priorities or criteria, specifying whether power conservation or performance is prioritized with regard to execution of the instruction, based on the power/performance tradeoff information. Moreover, the mechanisms process the instruction in accordance with the power/performance tradeoff priorities or criteria identified based on the power/performance tradeoff information of the instruction.

Type: Grant

Filed: May 27, 2010

Date of Patent: November 19, 2013

Assignee: International Business Machines Corporation

Inventors: John B. Carter, Jian Li, Karthick Rajamani, William E. Speight, Lixin Zhang

Fine grained cache allocation

Patent number: 8543769

Abstract: A mechanism is provided in a virtual machine monitor for fine grained cache allocation in a shared cache. The mechanism partitions a cache tag into a most significant bit (MSB) portion and a least significant bit (LSB) portion. The MSB portion of the tags is shared among the cache lines in a set. The LSB portion of the tags is private, one per cache line. The mechanism allows software to set the MSB portion of tags in a cache to allocate sets of cache lines. The cache controller determines whether a cache line is locked based on the MSB portion of the tag.

Type: Grant

Filed: July 27, 2009

Date of Patent: September 24, 2013

Assignee: International Business Machines Corporation

Inventors: Ramakrishnan Rajamony, William E. Speight, Lixin Zhang

Assigning memory to on-chip coherence domains

Patent number: 8543770

Abstract: A mechanism is provided for assigning memory to on-chip cache coherence domains. The mechanism assigns caches within a processing unit to coherence domains. The mechanism then assigns chunks of memory to the coherence domains. The mechanism monitors applications running on cores within the processing unit to identify needs of the applications. The mechanism may then reassign memory chunks to the cache coherence domains based on the needs of the applications running in the coherence domains. When a memory controller receives the cache miss, the memory controller may look up the address in a lookup table that maps memory chunks to cache coherence domains. Snoop requests are sent to caches within the coherence domain. If a cache line is found in a cache within the coherence domain, the cache line is returned to the originating cache by the cache containing the cache line either directly or through the memory controller.

Type: Grant

Filed: May 26, 2010

Date of Patent: September 24, 2013

Assignee: International Business Machines Corporation

Inventors: William E. Speight, Lixin Zhang

Cache directed sequential prefetch

Patent number: 8458408

Abstract: A technique for performing stream detection and prefetching within a cache memory simplifies stream detection and prefetching. A bit in a cache directory or cache entry indicates that a cache line has not been accessed since being prefetched and another bit indicates the direction of a stream associated with the cache line. A next cache line is prefetched when a previously prefetched cache line is accessed, so that the cache always attempts to prefetch one cache line ahead of accesses, in the direction of a detected stream. Stream detection is performed in response to load misses tracked in the load miss queue (LMQ). The LMQ stores an offset indicating a first miss at the offset within a cache line. A next miss to the line sets a direction bit based on the difference between the first and second offsets and causes prefetch of the next line for the stream.
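
The interplay of the two directory bits and the load miss queue (LMQ) can be sketched as a small simulation. This is a simplified reading of the abstract, not the patented design: the class and field names, the 128-byte line size, and the explicit `fill` call that models miss data returning are all assumptions.

```python
LINE = 128  # assumed cache line size in bytes


class StreamPrefetchCache:
    def __init__(self):
        self.directory = {}   # line -> {"untouched": bool, "direction": +1 or -1 (0 if none)}
        self.lmq = {}         # line -> offset of the first outstanding miss
        self.prefetches = []  # lines prefetched so far, in order

    def _prefetch(self, line, direction):
        if line not in self.directory:
            self.directory[line] = {"untouched": True, "direction": direction}
            self.prefetches.append(line)

    def fill(self, line):
        """Model the data for a demand miss arriving in the cache."""
        self.lmq.pop(line, None)
        self.directory.setdefault(line, {"untouched": False, "direction": 0})

    def access(self, addr):
        line, offset = divmod(addr, LINE)
        entry = self.directory.get(line)
        if entry is not None:
            if entry["untouched"]:
                # First touch of a prefetched line: stay one line ahead of the stream.
                entry["untouched"] = False
                self._prefetch(line + entry["direction"], entry["direction"])
            return "hit"
        if line in self.lmq:
            # Second miss to the same line: offset difference gives the direction.
            first = self.lmq[line]
            if offset != first:
                direction = 1 if offset > first else -1
                self._prefetch(line + direction, direction)
            return "miss"
        self.lmq[line] = offset  # first miss: remember its offset
        return "miss"
```

For example, misses at byte offsets 0 and then 16 of line 0 establish an ascending stream and prefetch line 1; the first access to line 1 then triggers a prefetch of line 2.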

Type: Grant

Filed: February 9, 2011

Date of Patent: June 4, 2013

Assignee: International Business Machines Corporation

Inventors: William E. Speight, Lixin Zhang

Stackable module for energy-efficient computing systems

Patent number: 8358503

Abstract: A modular processing module is provided. The modular processing module comprises a set of processing module sides. Each processing module side comprises a circuit board, a plurality of connectors coupled to the circuit board, and a plurality of processing nodes coupled to the circuit board. Each processing module side in the set of processing module sides couples to another processing module side using at least one connector in the plurality of connectors such that, when all of the set of processing module sides are coupled together, the modular processing module is formed. The modular processing module comprises an exterior connection to a power source and a communication system.

Type: Grant

Filed: May 28, 2010

Date of Patent: January 22, 2013

Assignee: International Business Machines Corporation

Inventors: John B. Carter, Wael R. El-Essawy, Elmootazbellah N. Elnozahy, Wesley M. Felter, Madhusudan K. Iyengar, Thomas W. Keller, Jr., Karthick Rajamani, Juan C. Rubio, William E. Speight, Lixin Zhang

Techniques for dynamically sharing a fabric to facilitate off-chip communication for multiple on-chip units

Patent number: 8346988

Abstract: A technique for sharing a fabric to facilitate off-chip communication for on-chip units includes dynamically assigning a first unit that implements a first communication protocol to a first portion of the fabric when private fabrics are indicated for the on-chip units. The technique also includes dynamically assigning a second unit that implements a second communication protocol to a second portion of the fabric when the private fabrics are indicated for the on-chip units. In this case, the first and second units are integrated in a same chip and the first and second protocols are different. The technique further includes dynamically assigning, based on off-chip traffic requirements of the first and second units, the first unit or the second unit to the first and second portions of the fabric when the private fabrics are not indicated for the on-chip units.

Type: Grant

Filed: May 25, 2010

Date of Patent: January 1, 2013

Assignee: International Business Machines Corporation

Inventors: Jian Li, William E. Speight, Lixin Zhang

Reducing energy consumption of set associative caches by reducing checked ways of the set association

Patent number: 8341355

Abstract: Mechanisms for accessing a set associative cache of a data processing system are provided. A set of cache lines, in the set associative cache, associated with an address of a request are identified. Based on a determined mode of operation for the set, the following may be performed: determining if a cache hit occurs in a preferred cache line without accessing other cache lines in the set of cache lines; retrieving data from the preferred cache line without accessing the other cache lines in the set of cache lines, if it is determined that there is a cache hit in the preferred cache line; and accessing each of the other cache lines in the set of cache lines to determine if there is a cache hit in any of these other cache lines only in response to there being a cache miss in the preferred cache line(s).
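
The preferred-line lookup order in this abstract can be sketched by counting how many ways are probed per access. The function name, the representation of a set as a list of integer tags, and the idea of using the probe count as a stand-in for energy are assumptions for illustration.

```python
def lookup(cache_set, tag, preferred_way=0, preferred_first=True):
    """Return (hit_way or None, ways_probed) for one set of integer tags."""
    n = len(cache_set)
    if not preferred_first:
        # Conventional mode: every way in the set is checked.
        for way, stored in enumerate(cache_set):
            if stored == tag:
                return way, n
        return None, n
    # Reduced-energy mode: check only the preferred way first.
    if cache_set[preferred_way] == tag:
        return preferred_way, 1
    # Miss in the preferred way: only now access the remaining ways.
    for way, stored in enumerate(cache_set):
        if way != preferred_way and stored == tag:
            return way, n
    return None, n
```

A hit in the preferred way costs one probe instead of one per way, which is where the energy saving would come from when most hits land there.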

Type: Grant

Filed: May 25, 2010

Date of Patent: December 25, 2012

Assignee: International Business Machines Corporation

Inventors: Jian Li, William E. Speight, Lixin Zhang

Read and Write Aware Cache

Publication number: 20120311265

Abstract: A mechanism is provided in a cache for providing a read and write aware cache. The mechanism partitions a large cache into a read-often region and a write-often region. The mechanism considers read/write frequency in a non-uniform cache architecture replacement policy. A frequently written cache line is placed in one of the farther banks. A frequently read cache line is placed in one of the closer banks. The size ratio between read-often and write-often regions may be static or dynamic. The boundary between the read-often region and the write-often region may be distinct or fuzzy.

Type: Application

Filed: August 13, 2012

Publication date: December 6, 2012

Applicant: International Business Machines Corporation

Inventors: Jian Li, Ramakrishnan Rajamony, William E. Speight, Lixin Zhang

Hardware based dynamic load balancing of message passing interface tasks by modifying tasks

Patent number: 8312464

Abstract: Mechanisms are provided for providing hardware based dynamic load balancing of message passing interface (MPI) tasks by modifying tasks. Mechanisms for adjusting the balance of processing workloads of the processors executing tasks of an MPI job are provided so as to minimize wait periods for waiting for all of the processors to call a synchronization operation. Each processor has an associated hardware implemented MPI load balancing controller. The MPI load balancing controller maintains a history that provides a profile of the tasks with regard to their calls to synchronization operations. From this information, it can be determined which processors should have their processing loads lightened and which processors are able to handle additional processing loads without significantly negatively affecting the overall operation of the parallel execution system. Thus, operations may be performed to shift workloads from the slowest processor to one or more of the faster processors.
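
The final step of this abstract, shifting work from the slowest processor toward a faster one based on the synchronization history, can be illustrated with a software sketch. The patent describes a hardware controller; the function below, its name, the use of barrier arrival times as the history, and the fixed 10% step are all assumptions made for the example.

```python
def rebalance(work, arrival_times, step=0.1):
    """Move a fraction of the slowest processor's work to the fastest one.

    work: {processor: amount of work assigned}
    arrival_times: {processor: time at which it called the synchronization operation}
    """
    slowest = max(arrival_times, key=arrival_times.get)  # arrives last
    fastest = min(arrival_times, key=arrival_times.get)  # waits the longest
    new_work = dict(work)
    if slowest == fastest:
        return new_work  # single processor: nothing to shift
    moved = work[slowest] * step
    new_work[slowest] -= moved
    new_work[fastest] += moved
    return new_work
```

Repeating this after each computation cycle would shrink the gap between the first and last arrival at the synchronization point while keeping the total work constant.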

Type: Grant

Filed: August 28, 2007

Date of Patent: November 13, 2012

Assignee: International Business Machines Corporation

Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, William E. Speight
