Patents by Inventor Ahmad A. Faraj
Ahmad A. Faraj has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9246792
Abstract: Methods, apparatus, and products are disclosed for providing point to point data communications among compute nodes in a global combining network of a parallel computer that include: determining a class route identifier available for all of the nodes along a communications path from an origin node to a target node; configuring network hardware of each node along the communications path with routing instructions in dependence upon the available class route identifier and the network's topology; transmitting, by the origin node along the communications path, a network packet to the target node, including encoding the available class route identifier in the network packet; and routing, by the network hardware of each node along the communications path, the network packet to the target node in dependence upon the routing instructions for each node and the available class route identifier.
Type: Grant
Filed: April 5, 2012
Date of Patent: January 26, 2016
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Ahmad A. Faraj, Todd A. Inglett
-
Patent number: 8484440
Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: establishing, for each node, a plurality of logical rings, each ring including a different set of at least one core on that node, each ring including the cores on at least two of the nodes; iteratively for each node: assigning each core of that node to one of the rings established for that node to which the core has not previously been assigned, and performing, for each ring for that node, a global allreduce operation using contribution data for the cores assigned to that ring or any global allreduce results from previous global allreduce operations, yielding current global allreduce results for each core; and performing, for each node, a local allreduce operation using the global allreduce results.
Type: Grant
Filed: May 21, 2008
Date of Patent: July 9, 2013
Assignee: International Business Machines Corporation
Inventor: Ahmad Faraj
-
Publication number: 20130151713
Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: establishing, for each node, a plurality of logical rings, each ring including a different set of at least one core on that node, each ring including the cores on at least two of the nodes; iteratively for each node: assigning each core of that node to one of the rings established for that node to which the core has not previously been assigned, and performing, for each ring for that node, a global allreduce operation using contribution data for the cores assigned to that ring or any global allreduce results from previous global allreduce operations, yielding current global allreduce results for each core; and performing, for each node, a local allreduce operation using the global allreduce results.
Type: Application
Filed: May 21, 2008
Publication date: June 13, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Ahmad Faraj
-
Patent number: 8422402
Abstract: Methods, systems, and products are disclosed for broadcasting a message in a parallel computer that includes: transmitting, by the logical root to all of the nodes directly connected to the logical root, a message; and for each node except the logical root: receiving the message; if that node is the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received; if that node received the message from a parent node and if that node is not a leaf node, then transmitting the message to all of the child nodes; and if that node received the message from a child node and if that node is not the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received and transmitting the message to the parent node.
Type: Grant
Filed: April 1, 2008
Date of Patent: April 16, 2013
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Ahmad A. Faraj
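The forwarding rules in the abstract above can be simulated as follows. This is a minimal sketch, not the patented implementation: the tree is assumed to be given as a hypothetical `parent` mapping (the physical root maps to `None`), and message passing is modelled with an in-memory queue rather than real network links.

```python
from collections import deque

def tree_broadcast(parent, logical_root):
    """Simulate the broadcast; return {node: node it received the message from}."""
    # Build child lists from the (assumed) parent mapping.
    children = {node: [] for node in parent}
    for node, p in parent.items():
        if p is not None:
            children[p].append(node)

    # The logical root transmits to every directly connected node.
    neighbors = list(children[logical_root])
    if parent[logical_root] is not None:
        neighbors.append(parent[logical_root])
    queue = deque((logical_root, n) for n in neighbors)

    received_from = {}
    while queue:
        sender, node = queue.popleft()
        received_from[node] = sender
        if sender == parent[node]:
            # Received from the parent: forward to all children (none for a leaf).
            targets = list(children[node])
        else:
            # Received from a child: forward to the other children, and to the
            # parent unless this node is the physical root.
            targets = [c for c in children[node] if c != sender]
            if parent[node] is not None:
                targets.append(parent[node])
        queue.extend((node, t) for t in targets)
    return received_from
```

For example, with `parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}` and node 3 as the logical root, the message climbs from 3 to 1 to 0 and then fans out down the remaining branches, so every other node receives it exactly once.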
-
Patent number: 8423663
Abstract: Methods, apparatus, and products are disclosed for providing full point-to-point communications among compute nodes of an operational group in a global combining network of a parallel computer, each compute node connected to each adjacent compute node in the global combining network through a link, that include: receiving a network packet in a compute node, the network packet specifying a destination compute node; selecting, in dependence upon the destination compute node, at least one of the links for the compute node along which to forward the network packet toward the destination compute node; and forwarding the network packet along the selected link to the adjacent compute node connected to the compute node through the selected link.
Type: Grant
Filed: August 6, 2007
Date of Patent: April 16, 2013
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Ahmad A. Faraj, Todd A. Inglett, Joseph D. Ratterman
-
Patent number: 8375197
Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: performing, for each node, a local reduction operation using allreduce contribution data for the cores of that node, yielding, for each node, a local reduction result for one or more representative cores for that node; establishing one or more logical rings among the nodes, each logical ring including only one of the representative cores from each node; performing, for each logical ring, a global allreduce operation using the local reduction result for the representative cores included in that logical ring, yielding a global allreduce result for each representative core included in that logical ring; and performing, for each node, a local broadcast operation using the global allreduce results for each representative core on that node.
Type: Grant
Filed: May 21, 2008
Date of Patent: February 12, 2013
Assignee: International Business Machines Corporation
Inventor: Ahmad Faraj
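The three stages named in the abstract above (local reduction, global allreduce over a ring of representatives, local broadcast) can be illustrated with a small data-flow simulation. This is only a sketch under simplifying assumptions: one representative core per node, a single logical ring, and an associative, commutative operator `op`. The function name and the nested-list layout of `contributions` are hypothetical, and no real message passing takes place.

```python
from functools import reduce

def hierarchical_allreduce(contributions, op):
    """contributions[n][c] is the value held by core c of node n."""
    # Stage 1: each node reduces its cores' values onto one representative.
    local = [reduce(op, cores) for cores in contributions]

    # Stage 2: global allreduce over the ring of representatives, modelled
    # as one pass that accumulates the value around the ring, after which
    # every representative holds the combined result.
    combined = reduce(op, local)
    global_result = [combined for _ in local]

    # Stage 3: each representative broadcasts the result to its node's cores.
    return [[global_result[n]] * len(cores)
            for n, cores in enumerate(contributions)]
```

With addition as the operator, `hierarchical_allreduce([[1, 2, 3], [4, 5], [6]], lambda a, b: a + b)` leaves every core holding 21.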
-
Patent number: 8296457
Abstract: Methods, apparatus, and products are disclosed for providing nearest neighbor point-to-point communications among compute nodes of an operational group in a global combining network of a parallel computer, each compute node connected to each adjacent compute node in the global combining network through a link, that include: identifying each link in the global combining network for each compute node of the operational group; designating one of a plurality of point-to-point class routing identifiers for each link such that no compute node in the operational group is connected to two adjacent compute nodes in the operational group with links designated for the same class routing identifiers; and configuring each compute node of the operational group for point-to-point communications with each adjacent compute node in the global combining network through the link between that compute node and that adjacent compute node using that link's designated class routing identifier.
Type: Grant
Filed: August 2, 2007
Date of Patent: October 23, 2012
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Ahmad A. Faraj, Todd A. Inglett, Joseph D. Ratterman
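The designation constraint in the abstract above (no node touches two links carrying the same identifier) amounts to an edge coloring of the network. The sketch below shows one way to satisfy it, assuming the network is a tree described by a hypothetical `children` mapping; the greedy numbering is an illustrative choice and is not taken from the patent.

```python
def assign_link_ids(children, root):
    """Return {(parent, child): identifier} with distinct ids at every node."""
    link_id = {}
    stack = [(root, None)]  # (node, identifier of the link to its parent)
    while stack:
        node, parent_link = stack.pop()
        next_id = 0
        for child in children.get(node, []):
            # Skip the identifier already used on the link to the parent.
            if next_id == parent_link:
                next_id += 1
            link_id[(node, child)] = next_id
            stack.append((child, next_id))
            next_id += 1
    return link_id
```

For a binary tree, where each node has at most three links, this numbering never needs more than three identifiers.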
-
Publication number: 20120189012
Abstract: Methods, apparatus, and products are disclosed for providing point to point data communications among compute nodes in a global combining network of a parallel computer that include: determining a class route identifier available for all of the nodes along a communications path from an origin node to a target node; configuring network hardware of each node along the communications path with routing instructions in dependence upon the available class route identifier and the network's topology; transmitting, by the origin node along the communications path, a network packet to the target node, including encoding the available class route identifier in the network packet; and routing, by the network hardware of each node along the communications path, the network packet to the target node in dependence upon the routing instructions for each node and the available class route identifier.
Type: Application
Filed: April 5, 2012
Publication date: July 26, 2012
Applicant: International Business Machines Corporation
Inventors: Charles J. Archer, Ahmad A. Faraj, Todd A. Inglett
-
Patent number: 8194678
Abstract: Methods, apparatus, and products are disclosed for providing point to point data communications among compute nodes in a global combining network of a parallel computer that include: determining a class route identifier available for all of the nodes along a communications path from an origin node to a target node; configuring network hardware of each node along the communications path with routing instructions in dependence upon the available class route identifier and the network's topology; transmitting, by the origin node along the communications path, a network packet to the target node, including encoding the available class route identifier in the network packet; and routing, by the network hardware of each node along the communications path, the network packet to the target node in dependence upon the routing instructions for each node and the available class route identifier.
Type: Grant
Filed: July 21, 2008
Date of Patent: June 5, 2012
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Ahmad A. Faraj, Todd A. Inglett
-
Patent number: 8161268
Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer. Each compute node includes at least two processing cores. Each processing core has contribution data for the allreduce operation. Performing an allreduce operation on a plurality of compute nodes of a parallel computer includes: establishing one or more logical rings among the compute nodes, each logical ring including at least one processing core from each compute node; performing, for each logical ring, a global allreduce operation using the contribution data for the processing cores included in that logical ring, yielding a global allreduce result for each processing core included in that logical ring; and performing, for each compute node, a local allreduce operation using the global allreduce results for each processing core on that compute node.
Type: Grant
Filed: May 21, 2008
Date of Patent: April 17, 2012
Assignee: International Business Machines Corporation
Inventor: Ahmad Faraj
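The two stages in the abstract above (a global allreduce per logical ring, then a local allreduce on each node) can be shown as a data-flow simulation. The sketch assumes every node has the same number of cores, that ring `k` consists of core `k` from each node, and that `op` is associative and commutative; these choices and all names are illustrative assumptions, and no actual ring communication is modelled.

```python
def two_stage_allreduce(contributions, op):
    """contributions[n][k] is the value held by core k of node n."""
    num_cores = len(contributions[0])
    if any(len(node) != num_cores for node in contributions):
        raise ValueError("this sketch assumes equal core counts per node")

    # Stage 1: one global allreduce per ring; ring k links core k of every node.
    ring_results = []
    for k in range(num_cores):
        result = contributions[0][k]
        for node in contributions[1:]:
            result = op(result, node[k])
        ring_results.append(result)

    # Stage 2: on each node, a local allreduce combines the ring results
    # held by that node's cores, so every core ends with the full result.
    total = ring_results[0]
    for partial in ring_results[1:]:
        total = op(total, partial)
    return [[total] * num_cores for _ in contributions]
```

With addition, three nodes holding `[1, 2]`, `[3, 4]`, and `[5, 6]` produce ring results 9 and 12, and every core finishes with 21.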
-
Patent number: 8122228
Abstract: Methods, systems, and products are disclosed for broadcasting collective operation contributions throughout a parallel computer. The parallel computer includes a plurality of compute nodes connected together through a data communications network. Each compute node has a plurality of processors for use in collective parallel operations on the parallel computer. Broadcasting collective operation contributions throughout a parallel computer according to embodiments of the present invention includes: transmitting, by each processor on each compute node, that processor's collective operation contribution to the other processors on that compute node using intra-node communications; and transmitting on a designated network link, by each processor on each compute node according to a serial processor transmission sequence, that processor's collective operation contribution to the other processors on the other compute nodes using inter-node communications.
Type: Grant
Filed: March 24, 2008
Date of Patent: February 21, 2012
Assignee: International Business Machines Corporation
Inventor: Ahmad Faraj
-
Patent number: 7991857
Abstract: Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network is optimized for point to point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group is assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
Type: Grant
Filed: March 24, 2008
Date of Patent: August 2, 2011
Assignee: International Business Machines Corporation
Inventors: Jeremy E. Berg, Ahmad A. Faraj
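To make the idea of a Hamiltonian path through one plane of the network concrete, the sketch below builds a serpentine path over a two-dimensional mesh and derives the order in which nodes on either side of the logical root would receive the message. The abstract does not say how the path is constructed, so the serpentine ordering, the mesh model, and both function names are assumptions made for illustration.

```python
def snake_path(rows, cols):
    """Visit every (row, col) of a mesh plane once, moving only between neighbors."""
    path = []
    for r in range(rows):
        # Alternate direction on each row so consecutive nodes stay adjacent.
        columns = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        path.extend((r, c) for c in columns)
    return path

def broadcast_order(path, logical_root):
    """Return the two hop sequences leading away from the root along the path."""
    i = path.index(logical_root)
    forward = path[i + 1:]       # nodes after the root, in path order
    backward = path[:i][::-1]    # nodes before the root, nearest first
    return forward, backward
```

In a 2 by 3 plane with the logical root at `(0, 2)`, the message would travel forward through `(1, 2)`, `(1, 1)`, `(1, 0)` and backward through `(0, 1)`, `(0, 0)`.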
-
Patent number: 7653716
Abstract: Methods, systems, and products are disclosed for determining a bisection bandwidth for a multi-node data communications network that include: partitioning nodes in the network into a first sub-network and a second sub-network in dependence upon a topology of the network; sending, by each node in the first sub-network to a destination node in the second sub-network, a first message having a predetermined message size; receiving, by each node in the first sub-network from a source node in the second sub-network, a second message; measuring, by each node in the first sub-network, the elapsed communications time between the sending of the first message and the receiving of the second message; selecting the longest elapsed communications time; and calculating the bisection bandwidth for the network in dependence upon the number of the nodes in the first sub-network, the predetermined message size of the first test message, and the longest elapsed communications time.
Type: Grant
Filed: August 15, 2007
Date of Patent: January 26, 2010
Assignee: International Business Machines Corporation
Inventor: Ahmad A. Faraj
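The final calculation step in the abstract above depends on three inputs: the node count of the first sub-network, the message size, and the longest elapsed time. The abstract does not give the formula, so the sketch below assumes the simplest plausible estimate, total bytes sent by the first sub-network divided by the longest time; the function and parameter names are hypothetical.

```python
def bisection_bandwidth(num_nodes, message_bytes, elapsed_seconds):
    """Estimate bisection bandwidth in bytes per second.

    Assumption (not stated in the abstract): each node in the first
    sub-network moves message_bytes across the bisection, and the slowest
    exchange bounds the time needed by all of them.
    """
    if num_nodes <= 0 or message_bytes <= 0:
        raise ValueError("num_nodes and message_bytes must be positive")
    # Corresponds to "selecting the longest elapsed communications time".
    longest = max(elapsed_seconds)
    if longest <= 0:
        raise ValueError("elapsed times must be positive")
    return num_nodes * message_bytes / longest
```

For four nodes sending 1000-byte messages with a slowest exchange of 2 seconds, this estimate gives 2000 bytes per second. If the measured time is treated as a round trip, an implementation might count the returning message as well; that choice is left open here.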
-
Publication number: 20090307467
Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer. Each compute node includes at least two processing cores. Each processing core has contribution data for the allreduce operation. Performing an allreduce operation on a plurality of compute nodes of a parallel computer includes: establishing one or more logical rings among the compute nodes, each logical ring including at least one processing core from each compute node; performing, for each logical ring, a global allreduce operation using the contribution data for the processing cores included in that logical ring, yielding a global allreduce result for each processing core included in that logical ring; and performing, for each compute node, a local allreduce operation using the global allreduce results for each processing core on that compute node.
Type: Application
Filed: May 21, 2008
Publication date: December 10, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Ahmad Faraj
-
Publication number: 20090292905
Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: performing, for each node, a local reduction operation using allreduce contribution data for the cores of that node, yielding, for each node, a local reduction result for one or more representative cores for that node; establishing one or more logical rings among the nodes, each logical ring including only one of the representative cores from each node; performing, for each logical ring, a global allreduce operation using the local reduction result for the representative cores included in that logical ring, yielding a global allreduce result for each representative core included in that logical ring; and performing, for each node, a local broadcast operation using the global allreduce results for each representative core on that node.
Type: Application
Filed: May 21, 2008
Publication date: November 26, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Ahmad Faraj
-
Publication number: 20090245134
Abstract: Methods, systems, and products are disclosed for broadcasting a message in a parallel computer that includes: transmitting, by the logical root to all of the nodes directly connected to the logical root, a message; and for each node except the logical root: receiving the message; if that node is the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received; if that node received the message from a parent node and if that node is not a leaf node, then transmitting the message to all of the child nodes; and if that node received the message from a child node and if that node is not the physical root, then transmitting the message to all of the child nodes except the child node from which the message was received and transmitting the message to the parent node.
Type: Application
Filed: April 1, 2008
Publication date: October 1, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Charles J. Archer, Ahmad A. Faraj
-
Publication number: 20090240838
Abstract: Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network is optimized for point to point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group is assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
Type: Application
Filed: March 24, 2008
Publication date: September 24, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Jeremy E. Berg, Ahmad A. Faraj
-
Publication number: 20090240915
Abstract: Methods, systems, and products are disclosed for broadcasting collective operation contributions throughout a parallel computer. The parallel computer includes a plurality of compute nodes connected together through a data communications network. Each compute node has a plurality of processors for use in collective parallel operations on the parallel computer. Broadcasting collective operation contributions throughout a parallel computer according to embodiments of the present invention includes: transmitting, by each processor on each compute node, that processor's collective operation contribution to the other processors on that compute node using intra-node communications; and transmitting on a designated network link, by each processor on each compute node according to a serial processor transmission sequence, that processor's collective operation contribution to the other processors on the other compute nodes using inter-node communications.
Type: Application
Filed: March 24, 2008
Publication date: September 24, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Ahmad Faraj
-
Publication number: 20090049114
Abstract: Methods, systems, and products are disclosed for determining a bisection bandwidth for a multi-node data communications network that include: partitioning nodes in the network into a first sub-network and a second sub-network in dependence upon a topology of the network; sending, by each node in the first sub-network to a destination node in the second sub-network, a first message having a predetermined message size; receiving, by each node in the first sub-network from a source node in the second sub-network, a second message; measuring, by each node in the first sub-network, the elapsed communications time between the sending of the first message and the receiving of the second message; selecting the longest elapsed communications time; and calculating the bisection bandwidth for the network in dependence upon the number of the nodes in the first sub-network, the predetermined message size of the first test message, and the longest elapsed communications time.
Type: Application
Filed: August 15, 2007
Publication date: February 19, 2009
Inventor: Ahmad A. Faraj
-
Publication number: 20090046585
Abstract: Methods, systems, and apparatus are disclosed for determining communications latency for transmissions between nodes in a data communications network that include: preparing, by an origin node, to receive an acknowledgement message from a target node, the acknowledgement message indicating that the target node is ready to receive a test message from the origin node; receiving, by the origin node from the target node, the acknowledgement message; sending, by the origin node to the target node in response to receiving the acknowledgement message, the test message; preparing, by the origin node, to receive an echo message from the target node; receiving, by the origin node from the target node, the echo message; and determining, by the origin node, a round-trip communications latency between the origin node and the target node in dependence upon the sending of the test message and the receiving of the echo message.
Type: Application
Filed: August 15, 2007
Publication date: February 19, 2009
Inventor: Ahmad A. Faraj
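The message sequence in the abstract above (acknowledgement, test message, echo) can be mimicked on a single machine. The sketch below is an illustration only: it stands in for the two nodes with a connected socket pair and a thread, uses one-byte messages, and folds each "preparing to receive" step into a blocking `recv` call. All names and the choice of transport are assumptions, not details from the application.

```python
import socket
import threading
import time

ACK = b"A"   # hypothetical one-byte acknowledgement message
TEST = b"T"  # hypothetical one-byte test message

def target_node(sock):
    sock.sendall(ACK)        # signal readiness to receive the test message
    message = sock.recv(1)   # receive the test message
    sock.sendall(message)    # echo it back to the origin

def measure_round_trip(sock):
    """Run the origin node's side and return the round-trip time in seconds."""
    if sock.recv(1) != ACK:
        raise RuntimeError("expected acknowledgement from target")
    start = time.perf_counter()   # timing starts when the test message is sent
    sock.sendall(TEST)
    echo = sock.recv(1)
    elapsed = time.perf_counter() - start
    if echo != TEST:
        raise RuntimeError("echo did not match test message")
    return elapsed

def demo():
    origin_sock, target_sock = socket.socketpair()
    worker = threading.Thread(target=target_node, args=(target_sock,))
    worker.start()
    try:
        return measure_round_trip(origin_sock)
    finally:
        origin_sock.close()
        worker.join()
        target_sock.close()
```

Waiting for the acknowledgement before starting the timer keeps the target's startup delay out of the measurement, which appears to be the purpose of the handshake described in the abstract.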