Patents by Inventor Joseph D. Ratterman
Joseph D. Ratterman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8117502Abstract: An apparatus, program product and method logically divide a group of nodes and causes node pairs comprising a node from each section to communicate. Results from the communications may be analyzed to determine performance characteristics, such as bandwidth and proper connectivity.Type: GrantFiled: August 22, 2008Date of Patent: February 14, 2012Assignee: International Business Machines CorporationInventors: Charles Jens Archer, Kurt Walter Pinnow, Joseph D. Ratterman, Brian Edward Smith
-
Publication number: 20120036384Abstract: Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.Type: ApplicationFiled: October 20, 2011Publication date: February 9, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Charles J. Archer, Michael A. Blocksome, Amanda E. Peters, Joseph D. Ratterman, Brian E. Smith
-
Patent number: 8112658Abstract: An apparatus, program product and method check for nodal faults in a row of nodes by causing each node in the row to concurrently communicate with its adjacent neighbor nodes in the row. The communications are analyzed to determine a presence of a faulty node or connection.Type: GrantFiled: August 25, 2008Date of Patent: February 7, 2012Assignee: International Business Machines CorporationInventors: Charles Jens Archer, Kurt Walter Pinnow, Joseph D. Ratterman, Brian Edward Smith
-
Patent number: 8095811Abstract: Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.Type: GrantFiled: May 29, 2008Date of Patent: January 10, 2012Assignee: International Business Machines CorporationInventors: Charles J. Archer, Michael A. Blocksome, Amanda A. Peters, Joseph D. Ratterman, Brian E. Smith
-
Publication number: 20110296139Abstract: Performing a deterministic reduction operation in a parallel computer that includes compute nodes, each of which includes computer processors and a CAU (Collectives Acceleration Unit) that couples computer processors to one another for data communications, including organizing processors and a CAU into a branched tree topology in which the CAU is a root and the processors are children; receiving, from each of the processors in any order, dummy contribution data, where each processor is restricted from sending any other data to the root CAU prior to receiving an acknowledgement of receipt from the root CAU; sending, by the root CAU to the processors in the branched tree topology, in a predefined order, acknowledgements of receipt of the dummy contribution data; receiving, by the root CAU from the processors in the predefined order, the processors' contribution data to the reduction operation; and reducing, by the root CAU, the processors' contribution data.Type: ApplicationFiled: May 28, 2010Publication date: December 1, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
-
Publication number: 20110296137Abstract: A parallel computer that includes compute nodes having computer processors and a CAU (Collectives Acceleration Unit) that couples processors to one another for data communications. In embodiments of the present invention, deterministic reduction operation include: organizing processors of the parallel computer and a CAU into a branched tree topology, where the CAU is a root of the branched tree topology and the processors are children of the root CAU; establishing a receive buffer that includes receive elements associated with processors and configured to store the associated processor's contribution data; receiving, in any order from the processors, each processor's contribution data; tracking receipt of each processor's contribution data; and reducing, the contribution data in a predefined order, only after receipt of contribution data from all processors in the branched tree topology.Type: ApplicationFiled: May 28, 2010Publication date: December 1, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
-
Publication number: 20110289177Abstract: Compute nodes of a parallel computer organized for collective operations via a network, each compute node having a receive buffer and establishing a topology for the network; selecting a schedule for a broadcast operation; depositing, by a root node of the topology, broadcast data in a target node's receive buffer, including performing a DMA operation with a well-known memory location for the target node's receive buffer; depositing, by the root node in a memory region designated for storing broadcast data length, a length of the broadcast data, including performing a DMA operation with a well-known memory location of the broadcast data length memory region; and triggering, by the root node, the target node to perform a next DMA operation, including depositing, in a memory region designated for receiving injection instructions for the target node, an instruction to inject the broadcast data into the receive buffer of a subsequent target node.Type: ApplicationFiled: May 19, 2010Publication date: November 24, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
-
Publication number: 20110288848Abstract: Embodiments of the invention provide a method of calculating performance counter data for a computer simulator, while minimizing the performance costs associated with cycle-accurate simulation. A callback may be associated with the instructions of a user program and, when the instructions are executed, the associated callbacks may be executed as well. Upon execution, the callbacks may calculate performance counter data related to the associated instruction.Type: ApplicationFiled: May 21, 2010Publication date: November 24, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Charles J. ARCHER, Michael BLOCKSOME, Joseph D. RATTERMAN, Brian E. SMITH
-
Publication number: 20110270942Abstract: Systems, methods and articles of manufacture are disclosed for performing a collective operation on a parallel computing system that includes multiple compute nodes and multiple networks connecting the compute nodes. Each of the networks may have different characteristics. A source node may broadcast a DMA descriptor over a first network to a target node, to initialize the collective operation. The target node may perform the collective operation over a second network and using the broadcast DMA descriptor.Type: ApplicationFiled: April 28, 2010Publication date: November 3, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: CHARLES J. ARCHER, MICHAEL BLOCKSOME, JOSEPH D. RATTERMAN, BRIAN E. SMITH
-
Publication number: 20110271006Abstract: Systems, methods and articles of manufacture are disclosed for effecting a desired collective operation on a parallel computing system that includes multiple compute nodes. The compute nodes may pipeline multiple collective operations to effect the desired collective operation. To select protocols suitable for the multiple collective operations, the compute nodes may also perform additional collective operations. The compute nodes may pipeline the multiple collective operations and/or the additional collective operations to effect the desired collective operation more efficiently.Type: ApplicationFiled: April 29, 2010Publication date: November 3, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Charles J. Archer, Michael Blocksome, Bob R. Cernohous, Joseph D. Ratterman, Brian E. Smith
-
Publication number: 20110271263Abstract: Compiling software for a hierarchical distributed processing system including providing to one or more compiling nodes software to be compiled, wherein at least a portion of the software to be compiled is to be executed by one or more other nodes; compiling, by the compiling node, the software; maintaining, by the compiling node, any compiled software to be executed on the compiling node; selecting, by the compiling node, one or more nodes in a next tier of the hierarchy of the distributed processing system in dependence upon whether any compiled software is for the selected node or the selected node's descendants; sending to the selected node only the compiled software to be executed by the selected node or selected node's descendant.Type: ApplicationFiled: April 29, 2010Publication date: November 3, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
-
Publication number: 20110265098Abstract: In an embodiment, a reception thread receives a source node identifier, a type, and a data pointer from an application and, in response, creates a receive request. If the source node identifier specifies a source node, the reception thread adds the receive request to a fast-post queue. If a message received from a network does not match a receive request on a posted queue, a polling thread adds a receive request that represents the message to an unexpected queue. If the fast-post queue contains the receive request, the polling thread removes the receive request from the fast-post queue. If the receive request that was removed from the fast-post queue does not match the receive request on the unexpected queue, the polling thread adds the receive request that was removed from the fast-post queue to the posted queue. The reception thread and the polling thread execute asynchronously from each other.Type: ApplicationFiled: April 21, 2010Publication date: October 27, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Gabor J. Dozsa, Philip Heidelberger, Sameer Kumar, Joseph D. Ratterman, Burkhard Steinmacher-Burow
-
Publication number: 20110258281Abstract: Embodiments of the invention provide a method for querying performance counter data on a massively parallel computing system, while minimizing the costs associated with interrupting computer processors and limited memory resources. DMA descriptors may be inserted into an injection FIFO of a remote compute node in the massively parallel computing system. Upon executing the DMA operations described by the DMA descriptors, performance counter data may be transferred from the remote compute node to a destination node.Type: ApplicationFiled: April 15, 2010Publication date: October 20, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
-
Patent number: 8041969Abstract: Methods, apparatus, and products are disclosed for reducing power consumption while performing collective operations on a plurality of compute nodes that include: receiving, by each compute node, instructions to perform a type of collective operation; selecting, by each compute node from a plurality of collective operations for the collective operation type, a particular collective operation in dependence upon power consumption characteristics for each of the plurality of collective operations; and executing, by each compute node, the selected collective operation.Type: GrantFiled: May 27, 2008Date of Patent: October 18, 2011Assignee: International Business Machines CorporationInventors: Charles J. Archer, Michael A. Blocksome, Amanda E. Peters, Joseph D. Ratterman, Brian E. Smith
-
Publication number: 20110246582Abstract: In an embodiment, a send thread receives an identifier that identifies a destination node and a pointer to data. The send thread creates a first send request in response to the receipt of the identifier and the data pointer. The send thread selects a selected channel from among a plurality of channels. The selected channel comprises a selected hand-off queue and an identification of a selected message unit. Each of the channels identifies a different message unit. The selected hand-off queue is randomly accessible. If the selected hand-off queue contains an available entry, the send thread adds the first send request to the selected hand-off queue. If the selected hand-off queue does not contain an available entry, the send thread removes a second send request from the selected hand-off queue and sends the second send request to the selected message unit.Type: ApplicationFiled: March 30, 2010Publication date: October 6, 2011Applicant: International Business Machines CorporationInventors: Gabor J. Dozsa, Philip Heidelberger, Sameer Kumar, Joseph D. Ratterman, Burkhard Steinmacher-Burow, Robert W. Wisniewski
-
Patent number: 8032899Abstract: Methods, apparatus, and products are disclosed for providing policy-based operating system services in a hypervisor on a computing system. The computing system includes at least one compute node. The compute node includes an operating system and a hypervisor. The operating system includes a kernel. The hypervisor comprising a kernel proxy and a plurality of operating system services of a service type. Providing policy-based operating system services in a hypervisor on a computing system includes establishing, on the compute node, a kernel policy specifying one of the operating system services of the service type for use by the kernel proxy, and accessing, by the kernel proxy, the specified operating system service. The computing system may also be implemented as a distributed computing system that includes one or more operating system service nodes. One or more of the operating system services may be distributed among the operating system service nodes.Type: GrantFiled: October 26, 2006Date of Patent: October 4, 2011Assignee: International Business Machines CorporationInventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Albert Sidelnik, Brian E. Smith
-
Publication number: 20110239003Abstract: Direct injection of a data to be transferred in a hybrid computing environment that includes a host computer and a plurality of accelerators, the host computer and the accelerators adapted to one another for data communications by a system level message passing module. Each accelerator includes a Power Processing Element (‘PPE’) and a plurality of Synergistic Processing Elements (‘SPEs’). Direct injection includes reserving, by each SPE, a slot in a shared memory region accessible by the host computer; loading, by each SPE into local memory of the SPE, a portion of data to be transferred to the host computer; executing, by each SPE in parallel, a data processing operation on the portion of the data loaded in local memory of each SPE; and writing, by each SPE, the processed data to the SPE's reserved slot in the shared memory region accessible by the host computer.Type: ApplicationFiled: March 29, 2010Publication date: September 29, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Gary R. Ricard, Brian E. Smith
-
Publication number: 20110238950Abstract: Performing a scattery operation on a hierarchical tree network optimized for collective operations including receiving, by the scattery module installed on the node, from a nearest neighbor parent above the node a chunk of data having at least a portion of data for the node; maintaining, by the scattery module installed on the node, the portion of the data for the node; determining, by the scattery module installed on the node, whether any portions of the data are for a particular nearest neighbor child below the node or one or more other nodes below the particular nearest neighbor child; and sending, by the scattery module installed on the node, those portions of data to the nearest neighbor child if any portions of the data are for a particular nearest neighbor child below the node or one or more other nodes below the particular nearest neighbor child.Type: ApplicationFiled: March 29, 2010Publication date: September 29, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
-
Publication number: 20110238949Abstract: Distributed administration of a lock for an operational group of compute nodes in a hierarchical tree structured network including assigning the root node of the operational group to send acknowledgments for lock requests, the root lock administration module comprising a module of automated computing machinery; receiving a lock request assigned to a particular node from a child node; determining whether another request from another child is directly ahead in an acknowledgement queue; if a request from another child is directly ahead in the acknowledgement queue, putting the lock request for the particular node in the acknowledgement queue until the lock request directly ahead in the acknowledgement queue is satisfied and when the lock request ahead in the queue is satisfied, sending the particular node for whom the lock request is assigned a message acknowledging the particular node has the lock; and if a request from another child is not directly ahead in a queue, sending to the particular node for whom theType: ApplicationFiled: March 29, 2010Publication date: September 29, 2011Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
-
Publication number: 20110219208Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC).Type: ApplicationFiled: January 10, 2011Publication date: September 8, 2011Applicant: International Business Machines CorporationInventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu