Patents by Inventor Alan Gara

Alan Gara has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8412974
    Abstract: A circuit generates a global clock signal with a pulse width modification to synchronize processors in a parallel computing system. The circuit may include a hardware module and a clock splitter. The hardware module may generate a clock signal and performs a pulse width modification on the clock signal. The pulse width modification changes a pulse width within a clock period in the clock signal. The clock splitter may distribute the pulse width modified clock signal to a plurality of processors in the parallel computing system.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: April 2, 2013
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Matthew R. Ellavsky, Ross L. Franke, Alan Gara, Thomas M. Gooding, Rudolf A. Haring, Mark J. Jeanson, Gerard V. Kopcsay, Thomas A. Liebsch, Daniel Littrell, Martin Ohmacht, Don D. Reed, Brandon E. Schenck, Richard A. Swetz
  • Patent number: 8397052
    Abstract: Mechanisms are provided for controlling version pressure on a speculative versioning cache. Raw version pressure data is collected based on one or more threads accessing cache lines of the speculative versioning cache. One or more statistical measures of version pressure are generated based on the collected raw version pressure data. A determination is made as to whether one or more modifications to an operation of a data processing system are to be performed based on the one or more statistical measures of version pressure, the one or more modifications affecting version pressure exerted on the speculative versioning cache. An operation of the data processing system is modified based on the one or more determined modifications, in response to a determination that one or more modifications to the operation of the data processing system are to be performed, to affect the version pressure exerted on the speculative versioning cache.
    Type: Grant
    Filed: August 19, 2009
    Date of Patent: March 12, 2013
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Alan Gara, Kathryn M. O'Brien, Martin Ohmacht, Xiaotong Zhuang
  • Publication number: 20130024648
    Abstract: A system and method for accessing memory are provided. The system comprises a lookup buffer for storing one or more page table entries, wherein each of the one or more page table entries comprises at least a virtual page number and a physical page number; a logic circuit for receiving a virtual address from said processor, said logic circuit for matching the virtual address to the virtual page number in one of the page table entries to select the physical page number in the same page table entry, said page table entry having one or more bits set to exclude a memory range from a page.
    Type: Application
    Filed: September 14, 2012
    Publication date: January 24, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Jon K. Kriegel, Martin Ohmacht, Burkhard Steinmacher-Burow
  • Patent number: 8347039
    Abstract: A stream prefetch engine performs data retrieval in a parallel computing system. The engine receives a load request from at least one processor. The engine evaluates whether a first memory address requested in the load request is present and valid in a table. The engine checks whether there exists valid data corresponding to the first memory address in an array if the first memory address is present and valid in the table. The engine increments a prefetching depth of a first stream that the first memory address belongs to and fetching a cache line associated with the first memory address from the at least one cache memory device if there is not yet valid data corresponding to the first memory address in the array. The engine determines whether prefetching of additional data is needed for the first stream within its prefetching depth. The engine prefetches the additional data if the prefetching is needed.
    Type: Grant
    Filed: January 8, 2010
    Date of Patent: January 1, 2013
    Assignee: International Business Machines Corporation
    Inventors: Peter Boyle, Norman Christ, Alan Gara, Robert Mawhinney, Martin Ohmacht, Krishnan Sugavanam
  • Publication number: 20120331232
    Abstract: An apparatus and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon updating a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and subsequent updating a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller.
    Type: Application
    Filed: September 5, 2012
    Publication date: December 27, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alexandre E. Eichenberger, Alan Gara, Martin Ohmacht, Vijayalakshmi Srinivasan
  • Publication number: 20120324142
    Abstract: A list prefetch engine improves a performance of a parallel computing system. The list prefetch engine receives a current cache miss address. The list prefetch engine evaluates whether the current cache miss address is valid. If the current cache miss address is valid, the list prefetch engine compares the current cache miss address and a list address. A list address represents an address in a list. A list describes an arbitrary sequence of prior cache miss addresses. The prefetch engine prefetches data according to the list, if there is a match between the current cache miss address and the list address.
    Type: Application
    Filed: August 24, 2012
    Publication date: December 20, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Peter Boyle, Norman Christ, Alan Gara, Changhoan Kim, Robert Mawhinney, Martin Ohmacht, Krishnan Sugavanam
  • Patent number: 8327077
    Abstract: A prefetch system improves a performance of a parallel computing system. The parallel computing system includes a plurality of computing nodes. A computing node includes at least one processor and at least one memory device. The prefetch system includes at least one stream prefetch engine and at least one list prefetch engine. The prefetch system operates those engines simultaneously. After the at least one processor issues a command, the prefetch system passes the command to a stream prefetch engine and a list prefetch engine. The prefetch system operates the stream prefetch engine and the list prefetch engine to prefetch data to be needed in subsequent clock cycles in the processor in response to the passed command.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: December 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: Peter A. Boyle, Norman H. Christ, Alan Gara, Robert D. Mawhinney, Martin Ohmacht, Krishnan Sugavanam
  • Publication number: 20120266008
    Abstract: An apparatus, method and computer program product for automatically controlling power dissipation of a parallel computing system that includes a plurality of processors. A computing device issues a command to the parallel computing system. A clock pulse-width modulator encodes the command in a system clock signal to be distributed to the plurality of processors. The plurality of processors in the parallel computing system receive the system clock signal including the encoded command, and adjusts power dissipation according to the encoded command.
    Type: Application
    Filed: April 13, 2011
    Publication date: October 18, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Paul W. Coteus, Alan Gara, Thomas M. Gooding, Rudolf A. Haring, Gerard V. Kopcsay, Thomas A. Liebsch, Don D. Reed
  • Patent number: 8275954
    Abstract: A device for copying performance counter data includes hardware path that connects a direct memory access (DMA) unit to a plurality of hardware performance counters and a memory device. Software prepares an injection packet for the DMA unit to perform copying, while the software can perform other tasks. In one aspect, the software that prepares the injection packet runs on a processing core other than the core that gathers the hardware performance counter data.
    Type: Grant
    Filed: January 8, 2010
    Date of Patent: September 25, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alan Gara, Valentina Salapura, Robert W. Wisniewski
  • Patent number: 8275964
    Abstract: Hardware support for collecting performance counters directly to memory, in one aspect, may include a plurality of performance counters operable to collect one or more counts of one or more selected activities. A first storage element may be operable to store an address of a memory location. A second storage element may be operable to store a value indicating whether the hardware should begin copying. A state machine may be operable to detect the value in the second storage element and trigger hardware copying of data in selected one or more of the plurality of performance counters to the memory location whose address is stored in the first storage element.
    Type: Grant
    Filed: January 8, 2010
    Date of Patent: September 25, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alan Gara, Valentina Salapura, Robert W. Wisniewski
  • Patent number: 8255633
    Abstract: A list prefetch engine improves a performance of a parallel computing system. The list prefetch engine receives a current cache miss address. The list prefetch engine evaluates whether the current cache miss address is valid. If the current cache miss address is valid, the list prefetch engine compares the current cache miss address and a list address. A list address represents an address in a list. A list describes an arbitrary sequence of prior cache miss addresses. The prefetch engine prefetches data according to the list, if there is a match between the current cache miss address and the list address.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: August 28, 2012
    Assignee: International Business Machines Corporation
    Inventors: Peter Boyle, Norman Christ, Alan Gara, Changhoan Kim, Robert Mawhinney, Martin Ohmacht, Krishnan Sugavanam
  • Publication number: 20120210172
    Abstract: System, method and computer program product for a multiprocessing system to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). Each paired microprocessor or processor cores that provide one highly reliable thread for high-reliability connect with a system components such as a memory “nest” (or memory hierarchy), an optional system controller, and optional interrupt controller, optional I/O or peripheral devices, etc. The memory nest is attached to a selective pairing facility via a switch or a bus.
    Type: Application
    Filed: February 15, 2011
    Publication date: August 16, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alan Gara, Michael Karl Gschwind, Valentina Salapura
  • Publication number: 20120210164
    Abstract: System, method and computer program product for scheduling threads in a multiprocessing system with selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). The method configures the selective pairing facility to use checking provide one highly reliable thread for high-reliability and allocate threads to corresponding processor cores indicating need for hardware checking. The method configures the selective pairing facility to provide multiple independent cores and allocate threads to corresponding processor cores indicating inherent resilience.
    Type: Application
    Filed: February 15, 2011
    Publication date: August 16, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alan Gara, Michael Karl Gschwind, Valentina Salapura
  • Publication number: 20120210073
    Abstract: An apparatus, method and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon updating a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and subsequent updating a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller.
    Type: Application
    Filed: February 11, 2011
    Publication date: August 16, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alexandre E. Eichenberger, Alan Gara, Martin Ohmacht, Vijayalakshmi Srinivasan
  • Publication number: 20120210162
    Abstract: System, method and computer program product for a multiprocessing system to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). Each paired microprocessor or processor cores that provide one highly reliable thread for high-reliability connect with a system components such as a memory “nest” (or memory hierarchy), an optional system controller, and optional interrupt controller, optional I/O or peripheral devices, etc. The memory nest is attached to a selective pairing facility via a switch or a bus.
    Type: Application
    Filed: February 15, 2011
    Publication date: August 16, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alan Gara, Michael Karl Gschwind, Valentina Salapura
  • Publication number: 20120198118
    Abstract: A device for copying performance counter data includes hardware path that connects a direct memory access (DMA) unit to a plurality of hardware performance counters and a memory device. Software prepares an injection packet for the DMA unit to perform copying, while the software can perform other tasks. In one aspect, the software that prepares the injection packet runs on a processing core other than the core that gathers the hardware performance counter data.
    Type: Application
    Filed: April 13, 2012
    Publication date: August 2, 2012
    Applicant: International Business Machines Corporation
    Inventors: Alan Gara, Valentina Salapura, Robert W. Wisniewski
  • Publication number: 20120185672
    Abstract: Performing a series of successive synchronizing operations by a core on data shared by a plurality of cores may include a first core indicating an upcoming synchronizing operation on shared data. A second memory layer stores the shared data and tracks the first core's ownership of the shared data. The second memory layer is shared via coherency operations among the first core and one or more second cores. The first core may perform one or more synchronization operations on the shared data without requiring interaction from the second memory layer.
    Type: Application
    Filed: January 18, 2011
    Publication date: July 19, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alan Gara, Martin Ohmacht, Burkhard Steinmacher-Burow, Robert W. Wisniewski
  • Patent number: 8117288
    Abstract: A general computer-implement method and apparatus to optimize problem layout on a massively parallel supercomputer is described. The method takes as input the communication matrix of an arbitrary problem in the form of an array whose entries C(i, j) are the amount to data communicated from domain i to domain j. Given C(i, j), first implement a heuristic map is implemented which attempts sequentially to map a domain and its communications neighbors either to the same supercomputer node or to near-neighbor nodes on the supercomputer torus while keeping the number of domains mapped to a supercomputer node constant (as much as possible). Next a Markov Chain of maps is generated from the initial map using Monte Carlo simulation with Free Energy (cost function) F=?i,jC(i,j)H(i,j)? where H(i,j) is the smallest number of hops on the supercomputer torus between domain i and domain j.
    Type: Grant
    Filed: October 12, 2004
    Date of Patent: February 14, 2012
    Assignee: International Business Machines Corporation
    Inventors: Gyan V. Bhanot, Alan Gara, Philip Heidelberger, Eoin M. Lawless, James C. Sexton, Robert E. Walkup
  • Patent number: 8103832
    Abstract: Method and apparatus of prefetching streams of varying prefetch depth dynamically changes the depth of prefetching so that the number of multiple streams as well as the hit rate of a single stream are optimized. The method and apparatus in one aspect monitor a plurality of load requests from a processing unit for data in a prefetch buffer, determine an access pattern associated with the plurality of load requests and adjust a prefetch depth according to the access pattern.
    Type: Grant
    Filed: June 26, 2007
    Date of Patent: January 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alan Gara, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Dirk Hoenicke
  • Patent number: 8103910
    Abstract: A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: January 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Matthias A. Blumrich, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Martin Ohmacht, Burkhard Steinmacher-Burow, Krishnan Sugavanam