Patents by Inventor Brian E. Smith

Brian E. Smith has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9160622
    Abstract: Determining a system configuration for performing a collective operation on a parallel computer that includes a plurality of compute nodes, the compute nodes coupled for data communications over a data communications network, including: selecting a system configuration on the parallel computer for executing the collective operation; executing the collective operation on the selected system configuration on the parallel computer; determining performance metrics associated with executing the collective operation on the selected system configuration on the parallel computer; selecting, using a simulated annealing algorithm, a plurality of test system configurations on the parallel computer for executing the collective operation, wherein the simulated annealing algorithm specifies a similarity threshold between a plurality of system configurations; executing, the collective operation on each of the test system configurations; and determining performance metrics associated with executing the collective operation o
    Type: Grant
    Filed: February 7, 2013
    Date of Patent: October 13, 2015
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, James E. Carey, Philip J. Sanders, Brian E. Smith
  • Patent number: 9158602
    Abstract: Processing posted receive commands in a parallel computer, including: posting, by a parallel process of a compute node, a receive command, the receive command including a set of parameters excluding the receive command from being directed among parallel posted receive queues; flattening the parallel unexpected message queues into a single unexpected message queue; determining whether the posted receive command is satisfied by an entry in the single unexpected message queue; if the posted receive command is satisfied by an entry in the single unexpected message queue, processing the posted receive command; if the posted receive command is not satisfied by an entry in the single unexpected message queue: flattening the parallel posted receive queues into a single posted receive queue; and storing the posted receive command in the single posted receive queue.
    Type: Grant
    Filed: May 21, 2012
    Date of Patent: October 13, 2015
    Assignee: International Business Machines Corporation
    Inventors: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 9152481
    Abstract: Processing posted receive commands in a parallel computer, including: posting, by a parallel process of a compute node, a receive command, the receive command including a set of parameters excluding the receive command from being directed among parallel posted receive queues; flattening the parallel unexpected message queues into a single unexpected message queue; determining whether the posted receive command is satisfied by an entry in the single unexpected message queue; if the posted receive command is satisfied by an entry in the single unexpected message queue, processing the posted receive command; if the posted receive command is not satisfied by an entry in the single unexpected message queue: flattening the parallel posted receive queues into a single posted receive queue; and storing the posted receive command in the single posted receive queue.
    Type: Grant
    Filed: November 16, 2012
    Date of Patent: October 6, 2015
    Assignee: International Business Machines Corporation
    Inventors: Sameer Kumar, Amith R. Mamidala, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 9141183
    Abstract: Methods, apparatuses, and computer program products for collective operation management in a parallel computer are provided. Embodiments include a parallel computer having a first compute node operatively coupled for data communications over a tree data communications network with a plurality of child compute nodes. Embodiments also include each child compute node performing a first collective operation. The first compute node, for each child compute node, receives from the child compute node, a result of the first collective operation performed by the child compute node. In response to receiving at least one result, the first compute node reduces a power consumption level of the child compute node.
    Type: Grant
    Filed: March 11, 2013
    Date of Patent: September 22, 2015
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, James E. Carey, Philip J. Sanders, Brian E. Smith
  • Patent number: 9116750
    Abstract: Methods, apparatuses, and computer program products for optimizing collective communications within a parallel computer comprising a plurality of hardware threads for executing software threads of a parallel application are provided. Embodiments include a processor of a parallel computer determining for each software thread, an affinity of the software thread to a particular hardware thread. Each affinity indicates an assignment of a software thread to a particular hardware thread. The processor also generates one or more affinity domains based on the affinities of the software threads. Embodiments also include a processor generating, for each affinity domain, a topology of the affinity domain based on the affinities of the software threads to the hardware threads. According to embodiments of the present application, a processor also performs, based on the generated topologies of the affinity domains, a collective operation on one or more software threads.
    Type: Grant
    Filed: August 8, 2012
    Date of Patent: August 25, 2015
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 9088582
    Abstract: Token-based flow control of messages in a parallel computer, the parallel computer including a plurality of compute nodes, each compute node including one or more computer processors, including: allocating, by a token administration module to a plurality of the computer processors in the parallel computer, a number of data communications tokens; identifying all communicators executing on each computer processor, where each communicator is participating in a distinct parallel operation executing on the parallel computer; allocating, to the communicators, the data communications tokens; determining, by a communicator attempting to send data to the destination, whether the communicator has enough available data communications tokens to send the data to the destination; and responsive to determining that the communicator has enough available data communications tokens to send the data, sending, by the communicator, the data to the destination.
    Type: Grant
    Filed: February 13, 2013
    Date of Patent: July 21, 2015
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, James E. Carey, Philip J. Sanders, Brian E. Smith
  • Patent number: 9055078
    Abstract: Token-based flow control of messages in a parallel computer, the parallel computer including a plurality of compute nodes, each compute node including one or more computer processors, including: allocating, by a token administration module to a plurality of the computer processors in the parallel computer, a number of data communications tokens; identifying all communicators executing on each computer processor, where each communicator is participating in a distinct parallel operation executing on the parallel computer; allocating, to the communicators, the data communications tokens; determining, by a communicator attempting to send data to the destination, whether the communicator has enough available data communications tokens to send the data to the destination; and responsive to determining that the communicator has enough available data communications tokens to send the data, sending, by the communicator, the data to the destination.
    Type: Grant
    Filed: January 10, 2013
    Date of Patent: June 9, 2015
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, James E. Carey, Philip J. Sanders, Brian E. Smith
  • Patent number: 9053226
    Abstract: Administering connection identifiers for collective operations in a parallel computer, including prior to calling a collective operation, determining, by a first compute node of a communicator to receive an instruction to execute the collective operation, whether a value stored in a global connection identifier utilization buffer exceeds a predetermined threshold; if the value stored in the global ConnID utilization buffer does not exceed the predetermined threshold: calling the collective operation with a next available ConnID including retrieving, from an element of a ConnID buffer, the next available ConnID and locking the element of the ConnID buffer from access by other compute nodes; and if the value stored in the global ConnID utilization buffer exceeds the predetermined threshold: repeatedly determining whether the value stored in the global ConnID utilization buffer exceeds the predetermined threshold until the value stored in the global ConnID utilization buffer does not exceed the predetermined thr
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: June 9, 2015
    Assignee: International Business Machines Corporation
    Inventors: Daniel A. Faraj, Brian E. Smith
  • Patent number: 9047091
    Abstract: Collective operation protocol selection in a parallel computer that includes compute nodes may be carried out by calling a collective operation with operating parameters; selecting a protocol for executing the operation and executing the operation with the selected protocol. Selecting a protocol includes: iteratively, until a prospective protocol meets predetermined performance criteria: providing, to a protocol performance function for the prospective protocol, the operating parameters; determining whether the prospective protocol meets predefined performance criteria by evaluating a predefined performance fit equation, calculating a measure of performance of the protocol for the operating parameters; determining that the prospective protocol meets predetermined performance criteria and selecting the protocol for executing the operation only if the calculated measure of performance is greater than a predefined minimum performance threshold.
    Type: Grant
    Filed: November 21, 2012
    Date of Patent: June 2, 2015
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 9009350
    Abstract: Determining a path for network traffic between a source compute node and a destination compute node in a parallel computer including: beginning with an identified group of compute nodes that includes the source compute node and iteratively until an identified group of compute nodes includes the destination compute node: identifying a group of compute nodes, the group of compute nodes having topological network locations included in a predefined topological shape; selecting a path for network traffic between compute nodes having topological network locations included in the predefined topological shape, and when an identified group of compute nodes includes the destination compute node: selecting a final path for network traffic; and sending a data communications message along the path for network traffic between the source compute node and the destination compute node, the path including, in order of selection, the selected paths and the selected final path.
    Type: Grant
    Filed: April 1, 2008
    Date of Patent: April 14, 2015
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Amanda Peters, Brian E. Smith, Brent A. Swartz
  • Patent number: 8990450
    Abstract: Managing a direct memory access (‘DMA’) injection first-in-first-out (‘FIFO’) messaging queue in a parallel computer, including: inserting, by a messaging unit management module, a DMA message descriptor into the injection FIFO messaging queue; determining, by the messaging unit management module, the number of extra slots in an immediate messaging queue required to store DMA message data associated with the DMA message descriptor; and responsive to determining that the number of extra slots in the immediate message queue required to store the DMA message data is greater than one, inserting, by the messaging unit management module, a number of DMA dummy message descriptors into the injection FIFO messaging queue, wherein the number of DMA dummy message descriptors is at least as many as the number of extra slots in the immediate messaging queue that are required to store the DMA message data.
    Type: Grant
    Filed: May 14, 2012
    Date of Patent: March 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Michael A. Blocksome, Todd A. Inglett, Patrick J. McCarthy, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 8966224
    Abstract: A parallel computer that includes compute nodes having computer processors and a CAU (Collectives Acceleration Unit) that couples processors to one another for data communications. In embodiments of the present invention, deterministic reduction operation include: organizing processors of the parallel computer and a CAU into a branched tree topology, where the CAU is a root of the branched tree topology and the processors are children of the root CAU; establishing a receive buffer that includes receive elements associated with processors and configured to store the associated processor's contribution data; receiving, in any order from the processors, each processor's contribution data; tracking receipt of each processor's contribution data; and reducing, the contribution data in a predefined order, only after receipt of contribution data from all processors in the branched tree topology.
    Type: Grant
    Filed: November 1, 2012
    Date of Patent: February 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 8949577
    Abstract: A parallel computer that includes compute nodes having computer processors and a CAU (Collectives Acceleration Unit) that couples processors to one another for data communications. In embodiments of the present invention, deterministic reduction operation include: organizing processors of the parallel computer and a CAU into a branched tree topology, where the CAU is a root of the branched tree topology and the processors are children of the root CAU; establishing a receive buffer that includes receive elements associated with processors and configured to store the associated processor's contribution data; receiving, in any order from the processors, each processor's contribution data; tracking receipt of each processor's contribution data; and reducing, the contribution data in a predefined order, only after receipt of contribution data from all processors in the branched tree topology.
    Type: Grant
    Filed: May 28, 2010
    Date of Patent: February 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 8949453
    Abstract: Data communications in a parallel active messaging interface (‘PAMI’) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a SEND instruction, the SEND instruction specifying a transmission of transfer data from the origin endpoint to a first target endpoint; transmitting from the origin endpoint to the first target endpoint a Request-To-Send (‘RTS’) message advising the first target endpoint of the location and size of the transfer data; assigning by the first target endpoint to each of a plurality of target endpoints separate portions of the transfer data; and receiving by the plurality of target endpoints the transfer data.
    Type: Grant
    Filed: November 30, 2010
    Date of Patent: February 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 8938713
    Abstract: Developing a collective operation for execution in a parallel computer that includes compute nodes coupled for data communications, including: receiving, by a collective development tool, a specification of a target collective operation to develop; receiving, by the collective development tool, a specification of computer hardware characteristics of the parallel computer within which the target collective operation will be executed; selecting, by the collective development tool automatically without user interaction, iteratively for each stage of the target collective operation, a collective primitive in dependence upon the specification of computer hardware characteristics and a predefined set of rules specifying selection criteria of collective primitives based on computer hardware characteristics; and generating, by the collective development tool, the target collective operation in dependence upon the selected collective primitives.
    Type: Grant
    Filed: February 9, 2012
    Date of Patent: January 20, 2015
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, James E. Carey, Philip J. Sanders, Brian E. Smith
  • Patent number: 8930956
    Abstract: Methods, apparatuses, and computer program products for utilizing a kernel administration hardware thread of a multi-threaded, multi-core compute node of a parallel computer are provided. Embodiments include a kernel assigning a memory space of a hardware thread of an application processing core to a kernel administration hardware thread of a kernel processing core. A kernel administration hardware thread is configured to advance the hardware thread to a next memory space associated with the hardware thread in response to the assignment of the kernel administration hardware thread to the memory space of the hardware thread. Embodiments also include the kernel administration hardware thread executing an instruction within the assigned memory space.
    Type: Grant
    Filed: August 8, 2012
    Date of Patent: January 6, 2015
    Assignee: International Business Machines Corporation
    Inventors: Michael A. Blocksome, Todd A. Inglett, Patrick J. McCarthy, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 8930962
    Abstract: Methods, apparatuses, and computer program products for processing unexpected messages at a compute node of a parallel computer are provided. Embodiments include receiving, by the compute node, a portion of a message from another compute node of the parallel computer, the message comprising a plurality of separate portions; in response to receiving the portion of the message, determining, by the compute node, whether one of the applications executing on the compute node, has indicated that the message is expected; if one of the applications executing on the compute node has not indicated that the message is expected, storing, by the compute node, the portion of the message in an unexpected message buffer within the compute node; and if one of the applications executing on the compute node has indicated that the message is expected, storing the portion of the message at a storage destination indicated by the message.
    Type: Grant
    Filed: February 22, 2012
    Date of Patent: January 6, 2015
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, James E. Carey, Philip J. Sanders, Brian E. Smith
  • Patent number: 8909716
    Abstract: Administering truncated receive functions in a parallel messaging interface (‘PMI’) of a parallel computer comprising a plurality of compute nodes coupled for data communications through the PMI and through a data communications network, including: sending, through the PMI on a source compute node, a quantity of data from the source compute node to a destination compute node; specifying, by an application on the destination compute node, a portion of the quantity of data to be received by the application on the destination compute node and a portion of the quantity of data to be discarded; receiving, by the PMI on the destination compute node, all of the quantity of data; providing, by the PMI on the destination compute node to the application on the destination compute node, only the portion of the quantity of data to be received by the application; and discarding, by the PMI on the destination compute node, the portion of the quantity of data to be discarded.
    Type: Grant
    Filed: September 28, 2010
    Date of Patent: December 9, 2014
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 8910178
    Abstract: Executing computing tasks on a parallel computer that includes compute nodes coupled for data communications, where each compute node executes tasks, with one task on each compute node designated as a master task, including: for each task on each compute node until all master tasks have joined a global barrier: determining whether the task is a master task; if the task is not a master task, joining a single local barrier; if the task is a master task, joining the global barrier and the single local barrier only after all other tasks on the compute node have joined the single local barrier.
    Type: Grant
    Filed: August 10, 2011
    Date of Patent: December 9, 2014
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
  • Patent number: 8898678
    Abstract: Identifying a collective operation within an application executing on a parallel computer; identifying a call site of the collective operation; determining whether the collective operation is root-based; if the collective operation is not root-based: establishing a tuning session and executing the collective operation in the tuning session; if the collective operation is root-based, determining whether all compute nodes executing the application identified the collective operation at the same call site; if all compute nodes identified the collective operation at the same call site, establishing a tuning session and executing the collective operation in the tuning session; and if all compute nodes executing the application did not identify the collective operation at the same call site, executing the collective operation without establishing a tuning session.
    Type: Grant
    Filed: October 30, 2012
    Date of Patent: November 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Daniel A. Faraj, Brian E. Smith
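
Illustrative note on patent 8898678: the abstract describes a branching decision about when a collective operation is run inside a tuning session. The following is only a minimal sketch of that decision as worded in the abstract, not the patented implementation. The function name, its parameters, and the idea of representing each compute node's call site as a string are assumptions made for illustration.

```python
from typing import Hashable, Sequence


def should_establish_tuning_session(
    root_based: bool, call_sites: Sequence[Hashable]
) -> bool:
    """Decide whether a collective operation runs inside a tuning session.

    root_based: whether the collective operation is root-based.
    call_sites: one hypothetical call-site identifier per compute node,
                naming where that node identified the collective operation.
    """
    if not root_based:
        # Per the abstract: a collective that is not root-based is
        # executed in a tuning session.
        return True
    # Per the abstract: a root-based collective is tuned only if all
    # compute nodes identified it at the same call site.
    return len(set(call_sites)) == 1


if __name__ == "__main__":
    # Hypothetical call-site labels, chosen only for this example.
    print(should_establish_tuning_session(False, ["a.c:10", "b.c:20"]))
    print(should_establish_tuning_session(True, ["a.c:10", "a.c:10"]))
    print(should_establish_tuning_session(True, ["a.c:10", "b.c:20"]))
```

In this sketch a root-based collective with an empty call-site list is treated as not tunable; the abstract does not address that case, so this is a design choice of the example.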