Patents by Inventor Joseph D. Ratterman

Joseph D. Ratterman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8117502
    Abstract: An apparatus, program product and method logically divide a group of nodes and cause node pairs comprising a node from each section to communicate. Results from the communications may be analyzed to determine performance characteristics, such as bandwidth and proper connectivity.
    Type: Grant
    Filed: August 22, 2008
    Date of Patent: February 14, 2012
    Assignee: International Business Machines Corporation
    Inventors: Charles Jens Archer, Kurt Walter Pinnow, Joseph D. Ratterman, Brian Edward Smith
  • Publication number: 20120036384
    Abstract: Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
    Type: Application
    Filed: October 20, 2011
    Publication date: February 9, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael A. Blocksome, Amanda E. Peters, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 8112658
    Abstract: An apparatus, program product and method check for nodal faults in a row of nodes by causing each node in the row to concurrently communicate with its adjacent neighbor nodes in the row. The communications are analyzed to determine a presence of a faulty node or connection.
    Type: Grant
    Filed: August 25, 2008
    Date of Patent: February 7, 2012
    Assignee: International Business Machines Corporation
    Inventors: Charles Jens Archer, Kurt Walter Pinnow, Joseph D. Ratterman, Brian Edward Smith
  • Patent number: 8095811
    Abstract: Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
    Type: Grant
    Filed: May 29, 2008
    Date of Patent: January 10, 2012
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Amanda E. Peters, Joseph D. Ratterman, Brian E. Smith
  • Publication number: 20110296139
    Abstract: Performing a deterministic reduction operation in a parallel computer that includes compute nodes, each of which includes computer processors and a CAU (Collectives Acceleration Unit) that couples computer processors to one another for data communications, including organizing processors and a CAU into a branched tree topology in which the CAU is a root and the processors are children; receiving, from each of the processors in any order, dummy contribution data, where each processor is restricted from sending any other data to the root CAU prior to receiving an acknowledgement of receipt from the root CAU; sending, by the root CAU to the processors in the branched tree topology, in a predefined order, acknowledgements of receipt of the dummy contribution data; receiving, by the root CAU from the processors in the predefined order, the processors' contribution data to the reduction operation; and reducing, by the root CAU, the processors' contribution data.
    Type: Application
    Filed: May 28, 2010
    Publication date: December 1, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Publication number: 20110296137
    Abstract: A parallel computer that includes compute nodes having computer processors and a CAU (Collectives Acceleration Unit) that couples processors to one another for data communications. In embodiments of the present invention, a deterministic reduction operation includes: organizing processors of the parallel computer and a CAU into a branched tree topology, where the CAU is a root of the branched tree topology and the processors are children of the root CAU; establishing a receive buffer that includes receive elements associated with processors and configured to store the associated processor's contribution data; receiving, in any order from the processors, each processor's contribution data; tracking receipt of each processor's contribution data; and reducing the contribution data in a predefined order, only after receipt of contribution data from all processors in the branched tree topology.
    Type: Application
    Filed: May 28, 2010
    Publication date: December 1, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Publication number: 20110289177
    Abstract: Compute nodes of a parallel computer organized for collective operations via a network, each compute node having a receive buffer and establishing a topology for the network; selecting a schedule for a broadcast operation; depositing, by a root node of the topology, broadcast data in a target node's receive buffer, including performing a DMA operation with a well-known memory location for the target node's receive buffer; depositing, by the root node in a memory region designated for storing broadcast data length, a length of the broadcast data, including performing a DMA operation with a well-known memory location of the broadcast data length memory region; and triggering, by the root node, the target node to perform a next DMA operation, including depositing, in a memory region designated for receiving injection instructions for the target node, an instruction to inject the broadcast data into the receive buffer of a subsequent target node.
    Type: Application
    Filed: May 19, 2010
    Publication date: November 24, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Publication number: 20110288848
    Abstract: Embodiments of the invention provide a method of calculating performance counter data for a computer simulator, while minimizing the performance costs associated with cycle-accurate simulation. A callback may be associated with the instructions of a user program and, when the instructions are executed, the associated callbacks may be executed as well. Upon execution, the callbacks may calculate performance counter data related to the associated instruction.
    Type: Application
    Filed: May 21, 2010
    Publication date: November 24, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Publication number: 20110270942
    Abstract: Systems, methods and articles of manufacture are disclosed for performing a collective operation on a parallel computing system that includes multiple compute nodes and multiple networks connecting the compute nodes. Each of the networks may have different characteristics. A source node may broadcast a DMA descriptor over a first network to a target node, to initialize the collective operation. The target node may perform the collective operation over a second network and using the broadcast DMA descriptor.
    Type: Application
    Filed: April 28, 2010
    Publication date: November 3, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Publication number: 20110271006
    Abstract: Systems, methods and articles of manufacture are disclosed for effecting a desired collective operation on a parallel computing system that includes multiple compute nodes. The compute nodes may pipeline multiple collective operations to effect the desired collective operation. To select protocols suitable for the multiple collective operations, the compute nodes may also perform additional collective operations. The compute nodes may pipeline the multiple collective operations and/or the additional collective operations to effect the desired collective operation more efficiently.
    Type: Application
    Filed: April 29, 2010
    Publication date: November 3, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael Blocksome, Bob R. Cernohous, Joseph D. Ratterman, Brian E. Smith
  • Publication number: 20110271263
    Abstract: Compiling software for a hierarchical distributed processing system including providing to one or more compiling nodes software to be compiled, wherein at least a portion of the software to be compiled is to be executed by one or more other nodes; compiling, by the compiling node, the software; maintaining, by the compiling node, any compiled software to be executed on the compiling node; selecting, by the compiling node, one or more nodes in a next tier of the hierarchy of the distributed processing system in dependence upon whether any compiled software is for the selected node or the selected node's descendants; sending to the selected node only the compiled software to be executed by the selected node or selected node's descendant.
    Type: Application
    Filed: April 29, 2010
    Publication date: November 3, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Publication number: 20110265098
    Abstract: In an embodiment, a reception thread receives a source node identifier, a type, and a data pointer from an application and, in response, creates a receive request. If the source node identifier specifies a source node, the reception thread adds the receive request to a fast-post queue. If a message received from a network does not match a receive request on a posted queue, a polling thread adds a receive request that represents the message to an unexpected queue. If the fast-post queue contains the receive request, the polling thread removes the receive request from the fast-post queue. If the receive request that was removed from the fast-post queue does not match the receive request on the unexpected queue, the polling thread adds the receive request that was removed from the fast-post queue to the posted queue. The reception thread and the polling thread execute asynchronously from each other.
    Type: Application
    Filed: April 21, 2010
    Publication date: October 27, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gabor J. Dozsa, Philip Heidelberger, Sameer Kumar, Joseph D. Ratterman, Burkhard Steinmacher-Burow
  • Publication number: 20110258281
    Abstract: Embodiments of the invention provide a method for querying performance counter data on a massively parallel computing system, while minimizing the costs associated with interrupting computer processors and limited memory resources. DMA descriptors may be inserted into an injection FIFO of a remote compute node in the massively parallel computing system. Upon executing the DMA operations described by the DMA descriptors, performance counter data may be transferred from the remote compute node to a destination node.
    Type: Application
    Filed: April 15, 2010
    Publication date: October 20, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 8041969
    Abstract: Methods, apparatus, and products are disclosed for reducing power consumption while performing collective operations on a plurality of compute nodes that include: receiving, by each compute node, instructions to perform a type of collective operation; selecting, by each compute node from a plurality of collective operations for the collective operation type, a particular collective operation in dependence upon power consumption characteristics for each of the plurality of collective operations; and executing, by each compute node, the selected collective operation.
    Type: Grant
    Filed: May 27, 2008
    Date of Patent: October 18, 2011
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Amanda E. Peters, Joseph D. Ratterman, Brian E. Smith
  • Publication number: 20110246582
    Abstract: In an embodiment, a send thread receives an identifier that identifies a destination node and a pointer to data. The send thread creates a first send request in response to the receipt of the identifier and the data pointer. The send thread selects a selected channel from among a plurality of channels. The selected channel comprises a selected hand-off queue and an identification of a selected message unit. Each of the channels identifies a different message unit. The selected hand-off queue is randomly accessible. If the selected hand-off queue contains an available entry, the send thread adds the first send request to the selected hand-off queue. If the selected hand-off queue does not contain an available entry, the send thread removes a second send request from the selected hand-off queue and sends the second send request to the selected message unit.
    Type: Application
    Filed: March 30, 2010
    Publication date: October 6, 2011
    Applicant: International Business Machines Corporation
    Inventors: Gabor J. Dozsa, Philip Heidelberger, Sameer Kumar, Joseph D. Ratterman, Burkhard Steinmacher-Burow, Robert W. Wisniewski
  • Patent number: 8032899
    Abstract: Methods, apparatus, and products are disclosed for providing policy-based operating system services in a hypervisor on a computing system. The computing system includes at least one compute node. The compute node includes an operating system and a hypervisor. The operating system includes a kernel. The hypervisor comprises a kernel proxy and a plurality of operating system services of a service type. Providing policy-based operating system services in a hypervisor on a computing system includes establishing, on the compute node, a kernel policy specifying one of the operating system services of the service type for use by the kernel proxy, and accessing, by the kernel proxy, the specified operating system service. The computing system may also be implemented as a distributed computing system that includes one or more operating system service nodes. One or more of the operating system services may be distributed among the operating system service nodes.
    Type: Grant
    Filed: October 26, 2006
    Date of Patent: October 4, 2011
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Albert Sidelnik, Brian E. Smith
  • Publication number: 20110239003
    Abstract: Direct injection of data to be transferred in a hybrid computing environment that includes a host computer and a plurality of accelerators, the host computer and the accelerators adapted to one another for data communications by a system level message passing module. Each accelerator includes a Power Processing Element (‘PPE’) and a plurality of Synergistic Processing Elements (‘SPEs’). Direct injection includes reserving, by each SPE, a slot in a shared memory region accessible by the host computer; loading, by each SPE into local memory of the SPE, a portion of data to be transferred to the host computer; executing, by each SPE in parallel, a data processing operation on the portion of the data loaded in local memory of each SPE; and writing, by each SPE, the processed data to the SPE's reserved slot in the shared memory region accessible by the host computer.
    Type: Application
    Filed: March 29, 2010
    Publication date: September 29, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Gary R. Ricard, Brian E. Smith
  • Publication number: 20110238950
    Abstract: Performing a scatterv operation on a hierarchical tree network optimized for collective operations including receiving, by the scatterv module installed on the node, from a nearest neighbor parent above the node a chunk of data having at least a portion of data for the node; maintaining, by the scatterv module installed on the node, the portion of the data for the node; determining, by the scatterv module installed on the node, whether any portions of the data are for a particular nearest neighbor child below the node or one or more other nodes below the particular nearest neighbor child; and sending, by the scatterv module installed on the node, those portions of data to the nearest neighbor child if any portions of the data are for a particular nearest neighbor child below the node or one or more other nodes below the particular nearest neighbor child.
    Type: Application
    Filed: March 29, 2010
    Publication date: September 29, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Publication number: 20110238949
    Abstract: Distributed administration of a lock for an operational group of compute nodes in a hierarchical tree structured network including assigning the root node of the operational group to send acknowledgments for lock requests, the root lock administration module comprising a module of automated computing machinery; receiving a lock request assigned to a particular node from a child node; determining whether another request from another child is directly ahead in an acknowledgement queue; if a request from another child is directly ahead in the acknowledgement queue, putting the lock request for the particular node in the acknowledgement queue until the lock request directly ahead in the acknowledgement queue is satisfied and when the lock request ahead in the queue is satisfied, sending the particular node for whom the lock request is assigned a message acknowledging the particular node has the lock; and if a request from another child is not directly ahead in a queue, sending to the particular node for whom the
    Type: Application
    Filed: March 29, 2010
    Publication date: September 29, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Publication number: 20110219208
    Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC).
    Type: Application
    Filed: January 10, 2011
    Publication date: September 8, 2011
    Applicant: International Business Machines Corporation
    Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
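
Illustrative sketch: the abstract of publication 20110296137 in this listing describes a deterministic reduction in which a root receives contributions in any order, tracks their receipt, and reduces them in a predefined order only after all have arrived. The Python below is a minimal single-process model of that idea, not the patented implementation. The names `RootReducer`, `receive`, `reduce`, and `run` are hypothetical, and child rank order is assumed to stand in for the "predefined order".

```python
from typing import Callable, List, Optional, Sequence


class RootReducer:
    """Hypothetical model of a root that reduces children's data deterministically."""

    def __init__(self, num_children: int, op: Callable[[float, float], float]):
        self.op = op
        # One receive element per child, indexed by child rank (an assumption).
        self.buffer: List[Optional[float]] = [None] * num_children
        self.received = 0  # how many contributions have arrived so far

    def receive(self, rank: int, value: float) -> None:
        """Store one child's contribution; arrival order is unconstrained."""
        if self.buffer[rank] is not None:
            raise ValueError(f"duplicate contribution from rank {rank}")
        self.buffer[rank] = value
        self.received += 1

    def reduce(self) -> float:
        """Combine contributions in rank order, only once all have arrived."""
        if self.received != len(self.buffer):
            raise RuntimeError("not all contributions have arrived")
        result = self.buffer[0]
        for value in self.buffer[1:]:
            result = self.op(result, value)
        return result


def run(arrival_order: Sequence[int], values: Sequence[float]) -> float:
    """Deliver values to the root in the given arrival order, then reduce."""
    root = RootReducer(len(values), lambda a, b: a + b)
    for rank in arrival_order:
        root.receive(rank, values[rank])
    return root.reduce()


if __name__ == "__main__":
    # Floating-point addition is not associative, so a fixed reduction
    # order is what makes the result repeatable across arrival orders.
    values = [1e16, 1.0, -1e16, 1.0]
    print(run([0, 1, 2, 3], values) == run([3, 1, 0, 2], values))
```

Because the reduction always walks the receive buffer in rank order, the result does not depend on which child happened to deliver its data first.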