Patents by Inventor William E. Speight

William E. Speight has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9999788
    Abstract: Simulating particle beam interactions includes identifying a set of n functions F1, F2, . . . , Fn corresponding to a plurality of different physical aspects of a particle beam, performing simulations of each Fi using a full physics model, selecting for each Fi a distribution function fi that models relevant behavior and reducing computation of the full physics model for each Fi by replacing Fi with a distribution function fi. The computation reduction includes comparing a set of simulations wherein each fi replaces its respective Fi to determine if relevant behavior is accurately modeled and selecting one of fi or Fi for each n, for a Monte Carlo simulation based on a runtime and accuracy criteria.
    Type: Grant
    Filed: January 14, 2015
    Date of Patent: June 19, 2018
    Assignee: International Business Machines Corporation
    Inventors: Anne E. Gattiker, Damir A. Jamsek, Sani R. Nassif, Thomas H. Osiecki, William E. Speight, Chin Ngai Sze, Min-Yu Tsai
  • Patent number: 9946550
    Abstract: A technique for handling predicated code in an out-of-order processor includes detecting a predicate defining instruction associated with a predicated code region. Renaming of predicated instructions, within the predicated code region, is then stalled until a predicate of the predicate defining instruction is resolved.
    Type: Grant
    Filed: September 17, 2007
    Date of Patent: April 17, 2018
    Assignee: International Business Machines Corporation
    Inventors: Ram Rangan, William E. Speight, Mark W. Stephenson, Lixin Zhang
  • Publication number: 20150352374
    Abstract: Simulating particle beam interactions includes identifying a set of n functions F1, F2, . . . , Fn corresponding to a plurality of different physical aspects of a particle beam, performing simulations of each Fi using a full physics model, selecting for each Fi a distribution function fi that models relevant behavior and reducing computation of the full physics model for each Fi by replacing Fi with a distribution function fi. The computation reduction includes comparing a set of simulations wherein each fi replaces its respective Fi to determine if relevant behavior is accurately modeled and selecting one of fi or Fi for each n, for a Monte Carlo simulation based on a runtime and accuracy criteria.
    Type: Application
    Filed: January 14, 2015
    Publication date: December 10, 2015
    Inventors: Anne E. Gattiker, Damir A. Jamsek, Sani R. Nassif, Thomas H. Osiecki, William E. Speight, Chin Ngai Sze, Min-Yu Tsai
  • Patent number: 9152569
    Abstract: In one embodiment, a cache memory includes a cache array including a plurality of entries for caching cache lines of data, where the plurality of entries are distributed between a first region implemented in a first memory technology and a second region implemented in a second memory technology. The cache memory further includes a cache directory of the contents of the cache array and a cache controller that controls operation of the cache memory.
    Type: Grant
    Filed: November 4, 2008
    Date of Patent: October 6, 2015
    Assignee: International Business Machines Corporation
    Inventors: Jian Li, Ramakrishnan Rajamony, William E. Speight, Xiaoxia Wu, Lixin Zhang
  • Patent number: 8966219
    Abstract: In a data processing system capable of concurrently executing multiple hardware threads of execution, an intermediate address translation unit in a processing unit translates an effective address for a memory access into an intermediate address. A cache memory is accessed utilizing the intermediate address. In response to a miss in cache memory, the intermediate address is translated into a real address by a real address translation unit that performs address translation for multiple hardware threads of execution. The system memory is accessed with the real address.
    Type: Grant
    Filed: October 30, 2007
    Date of Patent: February 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Ramakrishnan Rajamony, William E. Speight, Lixin Zhang
  • Patent number: 8893148
    Abstract: A system and method are provided for performing setup operations for receiving a different amount of data while processors are performing message passing interface (MPI) tasks. Mechanisms for adjusting the balance of processing workloads of the processors are provided so as to minimize wait periods for waiting for all of the processors to call a synchronization operation. An MPI load balancing controller maintains a history that provides a profile of the tasks with regard to their calls to synchronization operations. From this information, it can be determined which processors should have their processing loads lightened and which processors are able to handle additional processing loads without significantly negatively affecting the overall operation of the parallel execution system. As a result, setup operations may be performed while processors are performing MPI tasks to prepare for receiving different sized portions of data in a subsequent computation cycle based on the history.
    Type: Grant
    Filed: June 15, 2012
    Date of Patent: November 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, William E. Speight
  • Patent number: 8843705
    Abstract: A mechanism is provided in a cache for providing a read and write aware cache. The mechanism partitions a large cache into a read-often region and a write-often region. The mechanism considers read/write frequency in a non-uniform cache architecture replacement policy. A frequently written cache line is placed in one of the farther banks. A frequently read cache line is placed in one of the closer banks. The size ratio between read-often and write-often regions may be static or dynamic. The boundary between the read-often region and the write-often region may be distinct or fuzzy.
    Type: Grant
    Filed: August 13, 2012
    Date of Patent: September 23, 2014
    Assignee: International Business Machines Corporation
    Inventors: Jian Li, Ramakrishnan Rajamony, William E. Speight, Lixin Zhang
  • Patent number: 8756608
    Abstract: A method, a system, an apparatus, and a computer program product for allocating resources of one or more shared devices to one or more partitions of a virtualization environment within a data processing system. At least one user defined resource assignment is received for one or more devices associated with the data processing system. One or more registers, associated with the one or more partitions are dynamically set to execute the at least one resource assignment, whereby the at least one resource assignment enables a user defined quantitative measure (number and/or percentage) of devices to operate when the one or more transactions are executed via the partition. The system enables the one or more devices to execute one or more transactions at a bandwidth/capacity that is less than or equal to the user defined resource assignment and minimizes performance interference among partitions.
    Type: Grant
    Filed: July 1, 2009
    Date of Patent: June 17, 2014
    Assignee: International Business Machines Corporation
    Inventors: Elmootazbellah N. Elnozahy, Ramakrishnan Rajamony, William E. Speight, Lixin Zhang
  • Patent number: 8612691
    Abstract: A mechanism for assigning memory to on-chip cache coherence domains assigns caches within a processing unit to coherence domains. The mechanism assigns chunks of memory to the coherence domains. The mechanism monitors applications running on cores within the processing unit to identify needs of the applications. The mechanism may then reassign memory chunks to the cache coherence domains based on the needs of the applications running in the coherence domains. When a memory controller receives the cache miss, the memory controller may look up the address in a lookup table that maps memory chunks to cache coherence domains. Snoop requests are sent to caches within the coherence domain. If a cache line is found in a cache within the coherence domain, the cache line is returned to the originating cache by the cache containing the cache line either directly or through the memory controller.
    Type: Grant
    Filed: April 24, 2012
    Date of Patent: December 17, 2013
    Assignee: International Business Machines Corporation
    Inventors: William E. Speight, Lixin Zhang
  • Patent number: 8612687
    Abstract: A mechanism is provided within a 3D stacked memory organization to spread or stripe cache lines across multiple layers. In an example organization, a 128B cache line takes eight cycles on a 16B-wide bus. Each layer may provide 32B. The first layer uses the first two of the eight transfer cycles to send the first 32B. The next layer sends the next 32B using the next two cycles of the eight transfer cycles, and so forth. The mechanism provides a uniform memory access.
    Type: Grant
    Filed: May 26, 2010
    Date of Patent: December 17, 2013
    Assignee: International Business Machines Corporation
    Inventors: Jian Li, William E. Speight, Lixin Zhang
  • Patent number: 8595443
    Abstract: A method of data processing in a processor includes maintaining a usage history indicating demand usage of prefetched data retrieved into cache memory. An amount of data to prefetch by a data prefetch request is selected based upon the usage history. The data prefetch request is transmitted to a memory hierarchy to prefetch the selected amount of data into cache memory.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: November 26, 2013
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Gheorghe C. Cascaval, Balaram Sinharoy, William E. Speight, Lixin Zhang
  • Patent number: 8589665
    Abstract: Mechanisms are provided for processing an instruction in a processor of a data processing system. The mechanisms operate to receive, in a processor of the data processing system, an instruction, the instruction including power/performance tradeoff information associated with the instruction. The mechanisms further operate to determine power/performance tradeoff priorities or criteria, specifying whether power conservation or performance is prioritized with regard to execution of the instruction, based on the power/performance tradeoff information. Moreover, the mechanisms process the instruction in accordance with the power/performance tradeoff priorities or criteria identified based on the power/performance tradeoff information of the instruction.
    Type: Grant
    Filed: May 27, 2010
    Date of Patent: November 19, 2013
    Assignee: International Business Machines Corporation
    Inventors: John B. Carter, Jian Li, Karthick Rajamani, William E. Speight, Lixin Zhang
  • Patent number: 8543769
    Abstract: A mechanism is provided in a virtual machine monitor for fine grained cache allocation in a shared cache. The mechanism partitions a cache tag into a most significant bit (MSB) portion and a least significant bit (LSB) portion. The MSB portion of the tags is shared among the cache lines in a set. The LSB portion of the tags is private, one per cache line. The mechanism allows software to set the MSB portion of tags in a cache to allocate sets of cache lines. The cache controller determines whether a cache line is locked based on the MSB portion of the tag.
    Type: Grant
    Filed: July 27, 2009
    Date of Patent: September 24, 2013
    Assignee: International Business Machines Corporation
    Inventors: Ramakrishnan Rajamony, William E. Speight, Lixin Zhang
  • Patent number: 8543770
    Abstract: A mechanism is provided for assigning memory to on-chip cache coherence domains. The mechanism assigns caches within a processing unit to coherence domains. The mechanism then assigns chunks of memory to the coherence domains. The mechanism monitors applications running on cores within the processing unit to identify needs of the applications. The mechanism may then reassign memory chunks to the cache coherence domains based on the needs of the applications running in the coherence domains. When a memory controller receives the cache miss, the memory controller may look up the address in a lookup table that maps memory chunks to cache coherence domains. Snoop requests are sent to caches within the coherence domain. If a cache line is found in a cache within the coherence domain, the cache line is returned to the originating cache by the cache containing the cache line either directly or through the memory controller.
    Type: Grant
    Filed: May 26, 2010
    Date of Patent: September 24, 2013
    Assignee: International Business Machines Corporation
    Inventors: William E. Speight, Lixin Zhang
  • Patent number: 8458408
    Abstract: A technique for performing stream detection and prefetching within a cache memory simplifies stream detection and prefetching. A bit in a cache directory or cache entry indicates that a cache line has not been accessed since being prefetched and another bit indicates the direction of a stream associated with the cache line. A next cache line is prefetched when a previously prefetched cache line is accessed, so that the cache always attempts to prefetch one cache line ahead of accesses, in the direction of a detected stream. Stream detection is performed in response to load misses tracked in the load miss queue (LMQ). The LMQ stores an offset indicating a first miss at the offset within a cache line. A next miss to the line sets a direction bit based on the difference between the first and second offsets and causes prefetch of the next line for the stream.
    Type: Grant
    Filed: February 9, 2011
    Date of Patent: June 4, 2013
    Assignee: International Business Machines Corporation
    Inventors: William E. Speight, Lixin Zhang
  • Patent number: 8358503
    Abstract: A modular processing module is provided. The modular processing module comprises a set of processing module sides. Each processing module side comprises a circuit board, a plurality of connectors coupled to the circuit board, and a plurality of processing nodes coupled to the circuit board. Each processing module side in the set of processing module sides couples to another processing module side using at least one connector in the plurality of connectors such that, when all of the set of processing module sides are coupled together, the modular processing module is formed. The modular processing module comprises an exterior connection to a power source and a communication system.
    Type: Grant
    Filed: May 28, 2010
    Date of Patent: January 22, 2013
    Assignee: International Business Machines Corporation
    Inventors: John B. Carter, Wael R. El-Essawy, Elmootazbellah N. Elnozahy, Wesley M. Felter, Madhusudan K. Iyengar, Thomas W. Keller, Jr., Karthick Rajamani, Juan C. Rubio, William E. Speight, Lixin Zhang
  • Patent number: 8346988
    Abstract: A technique for sharing a fabric to facilitate off-chip communication for on-chip units includes dynamically assigning a first unit that implements a first communication protocol to a first portion of the fabric when private fabrics are indicated for the on-chip units. The technique also includes dynamically assigning a second unit that implements a second communication protocol to a second portion of the fabric when the private fabrics are indicated for the on-chip units. In this case, the first and second units are integrated in a same chip and the first and second protocols are different. The technique further includes dynamically assigning, based on off-chip traffic requirements of the first and second units, the first unit or the second unit to the first and second portions of the fabric when the private fabrics are not indicated for the on-chip units.
    Type: Grant
    Filed: May 25, 2010
    Date of Patent: January 1, 2013
    Assignee: International Business Machines Corporation
    Inventors: Jian Li, William E. Speight, Lixin Zhang
  • Patent number: 8341355
    Abstract: Mechanisms for accessing a set associative cache of a data processing system are provided. A set of cache lines, in the set associative cache, associated with an address of a request are identified. Based on a determined mode of operation for the set, the following may be performed: determining if a cache hit occurs in a preferred cache line without accessing other cache lines in the set of cache lines; retrieving data from the preferred cache line without accessing the other cache lines in the set of cache lines, if it is determined that there is a cache hit in the preferred cache line; and accessing each of the other cache lines in the set of cache lines to determine if there is a cache hit in any of these other cache lines only in response to there being a cache miss in the preferred cache line(s).
    Type: Grant
    Filed: May 25, 2010
    Date of Patent: December 25, 2012
    Assignee: International Business Machines Corporation
    Inventors: Jian Li, William E. Speight, Lixin Zhang
  • Publication number: 20120311265
    Abstract: A mechanism is provided in a cache for providing a read and write aware cache. The mechanism partitions a large cache into a read-often region and a write-often region. The mechanism considers read/write frequency in a non-uniform cache architecture replacement polity. A frequently written cache line is placed in one of the farther banks. A frequently read cache line is place in one of the closer banks. The size ration between read-often and write-often regions may be static or dynamic. The boundary between the read-often region and the write-often region may be distinct or fuzzy.
    Type: Application
    Filed: August 13, 2012
    Publication date: December 6, 2012
    Applicant: International Business Machines Corporation
    Inventors: Jian Li, Ramakrishnan Rajamony, William E. Speight, Lixin Zhang
  • Patent number: 8312464
    Abstract: Mechanisms are provided for providing hardware based dynamic load balancing of message passing interface (MPI) tasks by modifying tasks. Mechanisms for adjusting the balance of processing workloads of the processors executing tasks of an MPI job are provided so as to minimize wait periods for waiting for all of the processors to call a synchronization operation. Each processor has an associated hardware implemented MPI load balancing controller. The MPI load balancing controller maintains a history that provides a profile of the tasks with regard to their calls to synchronization operations. From this information, it can be determined which processors should have their processing loads lightened and which processors are able to handle additional processing loads without significantly negatively affecting the overall operation of the parallel execution system. Thus, operations may be performed to shift workloads from the slowest processor to one or more of the faster processors.
    Type: Grant
    Filed: August 28, 2007
    Date of Patent: November 13, 2012
    Assignee: International Business Machines Corporation
    Inventors: Lakshminarayana B. Arimilli, Ravi K. Arimilli, Ramakrishnan Rajamony, William E. Speight