Patents by Inventor Philip Heidelberger
Philip Heidelberger has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11556450Abstract: The embodiments herein describe hybrid parallelism techniques where a mix of data and model parallelism techniques are used to split the workload of a layer across an array of processors. When configuring the array, the bandwidth of the processors in one direction may be greater than the bandwidth in the other direction. Each layer is characterized according to whether they are more feature heavy or weight heavy. Depending on this characterization, the workload of an NN layer can be assigned to the array using a hybrid parallelism technique rather than using solely the data parallelism technique or solely the model parallelism technique. For example, if an NN layer is more weight heavy than feature heavy, data parallelism is used in the direction with the greater bandwidth (to minimize the negative impact of weight reduction) while model parallelism is used in the direction with the smaller bandwidth.Type: GrantFiled: October 11, 2019Date of Patent: January 17, 2023Assignee: International Business Machines CorporationInventors: Swagath Venkataramani, Vijayalakshmi Srinivasan, Philip Heidelberger
-
Patent number: 11165719Abstract: A computer network architecture includes a plurality N of first nodes, each first node having kC ports to a cluster network, where N and kC are integers greater than 0; and a local network switch connected to each of the plurality of first nodes, but not to the cluster network. Each first node has kL ports to the local network switch, where kL, is an integer greater than 0, and any two first nodes in the plurality of first nodes communicate with each other via the local network switch or via the cluster network.Type: GrantFiled: June 12, 2019Date of Patent: November 2, 2021Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Philip Heidelberger, Craig Brian Stunkel
-
Patent number: 11165686Abstract: A switch-connected dragonfly network and method of operating. A plurality of groups of row switches is organized according to multiple rows and columns, each row including multiple groups of row switches connected to form a two-level dragonfly network. A plurality of column switches interconnect groups of row switches along respective columns, a column switch associated with a corresponding group of row switches in a row. A switch port with a same logical port on a row switch at a same location in each group along the respective column connects to a same column switch. The switch-connected dragonfly network is expandable by adding additional rows, an added row comprising a two-level dragonfly network. A switch group of said added row associated with a column being connects to an available port at an existing column switch of said column by corresponding added S path link with no re-cabling of the switched network required.Type: GrantFiled: August 7, 2018Date of Patent: November 2, 2021Assignee: International Business Machines CorporationInventors: Philip Heidelberger, Dong Chen, Yutaka Sugawara, Paul W. Coteus
-
Publication number: 20210110247Abstract: The embodiments herein describe hybrid parallelism techniques where a mix of data and model parallelism techniques are used to split the workload of a layer across an array of processors. When configuring the array, the bandwidth of the processors in one direction may be greater than the bandwidth in the other direction. Each layer is characterized according to whether they are more feature heavy or weight heavy. Depending on this characterization, the workload of an NN layer can be assigned to the array using a hybrid parallelism technique rather than using solely the data parallelism technique or solely the model parallelism technique. For example, if an NN layer is more weight heavy than feature heavy, data parallelism is used in the direction with the greater bandwidth (to minimize the negative impact of weight reduction) while model parallelism is used in the direction with the smaller bandwidth.Type: ApplicationFiled: October 11, 2019Publication date: April 15, 2021Inventors: Swagath Venkataramani, Vijayalakshmi Srinivasan, Philip Heidelberger
-
Patent number: 10979337Abstract: A method, system and computer program product are disclosed for routing data packet in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.Type: GrantFiled: January 29, 2020Date of Patent: April 13, 2021Assignee: International Business Machines CorporationInventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
-
Patent number: 10958588Abstract: Methods and systems for monitoring remote transmissions of messages among a plurality of nodes are described. A processing element in a first node may allocate a sequence number to a request to read and/or update data in a second node. The processing element may be different from main processors of the first node. The processing element may send the message and the sequence number to the second node. The processing element may modify a status of the sequence number to an active state, indicating a transmission of the message is pending. The processing element may, in response to a response from the second node, modify the status of the sequence number to an inactive state, indicating a completed transmission of the message. The processing element may, in response to no response from the second node within a time period, resend the message and the sequence number to the second node.Type: GrantFiled: February 5, 2018Date of Patent: March 23, 2021Assignee: International Business Machines CorporationInventors: Sameer Kumar, Philip Heidelberger, Yutaka Sugawara, Dong Chen, Robert M. Senger
-
Publication number: 20200396175Abstract: A computer network architecture includes a plurality N of first nodes, each first node having kC ports to a cluster network, where N and kC are integers greater than 0; and a local network switch connected to each of the plurality of first nodes, but not to the cluster network. Each first node has kL ports to the local network switch, where kL, is an integer greater than 0, and any two first nodes in the plurality of first nodes communicate with each other via the local network switch or via the cluster network.Type: ApplicationFiled: June 12, 2019Publication date: December 17, 2020Inventors: Philip Heidelberger, Craig Brian Stunkel
-
Patent number: 10834672Abstract: A first method includes determining a total length of pending packets for a network link, determining a currently preferred power mode for the network link based on the total length of pending packets for the network link, and changing a current power mode for the network link to the currently preferred power mode. A corresponding apparatus is also disclosed herein. A second method includes determining a utilization for a network link, determining a currently preferred power mode for the network link based on the utilization for the network link, and changing a current power mode for the network link to the currently preferred power mode. A corresponding apparatus is also disclosed herein.Type: GrantFiled: September 23, 2015Date of Patent: November 10, 2020Assignee: International Business Machines CorporationInventors: Dong Chen, Paul W. Coteus, Noel A. Eisley, Philip Heidelberger, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara
-
Patent number: 10812416Abstract: A shared memory maintained by sender processes stores a sequence number counter per destination process. A sender process increments the sequence number counter in the shared memory in sending a message to a destination process. The sender process sends a data packet comprising the message and at least a sequence number specified by the sequence number counter. All of the sender processes share a sequence number counter per destination process, each of the sender processes incrementing the sequence number counter in sending a respective message. Receiver processes run on the hardware processor, each of the receiver processes maintaining a local memory counter on the memory, the local memory counter associated with a sending node. The local memory counter stores a sequence number of a message received from the sending node. The receiver process delivers incoming data packets ordered by sequence numbers of the data packets.Type: GrantFiled: December 27, 2017Date of Patent: October 20, 2020Assignee: International Business Machines CorporationInventors: Sameer Kumar, Philip Heidelberger, Dong Chen, Yutaka Sugawara, Robert M. Senger, Burkhard Steinmacher-Burow
-
Patent number: 10740097Abstract: Embodiments of the invention provide a method, system and computer program product for embedding a global barrier and global interrupt network in a parallel computer system organized as a torus network. The computer system includes a multitude of nodes. In one embodiment, the method comprises taking inputs from a set of receivers of the nodes, dividing the inputs from the receivers into a plurality of classes, combining the inputs of each of the classes to obtain a result, and sending said result to a set of senders of the nodes. Embodiments of the invention provide a method, system and computer program product for embedding a collective network in a parallel computer system organized as a torus network. In one embodiment, the method comprises adding to a torus network a central collective logic to route messages among at least a group of nodes in a tree structure.Type: GrantFiled: May 20, 2016Date of Patent: August 11, 2020Assignee: International Business Machines CorporationInventors: Dong Chen, Paul W. Coteus, Noel A. Eisley, Alan Gara, Philip Heidelberger, Robert M. Senger, Valentina Salapura, Burkhard Steinmacher-Burow, Yutaka Sugawara, Todd E. Takken
-
Publication number: 20200228435Abstract: A method, system and computer program product are disclosed for routing data packet in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.Type: ApplicationFiled: January 29, 2020Publication date: July 16, 2020Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
-
Patent number: 10601697Abstract: A method, system and computer program product are disclosed for routing data packet in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.Type: GrantFiled: May 6, 2019Date of Patent: March 24, 2020Assignee: International Business Machines CorporationInventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
-
Publication number: 20200053002Abstract: A switch-connected dragonfly network and method of operating. A plurality of groups of row switches is organized according to multiple rows and columns, each row including multiple groups of row switches connected to form a two-level dragonfly network. A plurality of column switches interconnect groups of row switches along respective columns, a column switch associated with a corresponding group of row switches in a row. A switch port with a same logical port on a row switch at a same location in each group along the respective column connects to a same column switch. The switch-connected dragonfly network is expandable by adding additional rows, an added row comprising a two-level dragonfly network. A switch group of said added row associated with a column being connects to an available port at an existing column switch of said column by corresponding added S path link with no re-cabling of the switched network required.Type: ApplicationFiled: August 7, 2018Publication date: February 13, 2020Inventors: Philip Heidelberger, Dong Chen, Yutaka Sugawara, Paul W. Coteus
-
Patent number: 10425358Abstract: An apparatus includes a collective switch hardware architecture, including an input arrangement circuit including multiple input ports and multiple outputs. The input arrangement circuit routes its multiple input ports to selected ones of its outputs. The collective switch hardware architecture includes collective reduction logic coupled to the multiple outputs of the input arrangement circuit and having multiple outputs. The collective reduction logic includes ALU(s) and arbitration and control circuitry. The ALU(s) and arbitration and control circuitry support multiple simultaneous collective operations from different collective classes, and support arbitrary input port and output port mapping to different collective classes. The collective switch hardware architecture further includes an output arrangement circuit including a multiple inputs coupled to the multiple outputs of the collective reduction logic and including multiple output ports.Type: GrantFiled: September 29, 2016Date of Patent: September 24, 2019Assignee: International Business Machines CorporationInventors: Dong Chen, Philip Heidelberger, Craig Stunkel
-
Publication number: 20190260666Abstract: A method, system and computer program product are disclosed for routing data packet in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.Type: ApplicationFiled: May 6, 2019Publication date: August 22, 2019Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
-
Publication number: 20190245799Abstract: Methods and systems for monitoring remote transmissions of messages among a plurality of nodes are described. A processing element in a first node may allocate a sequence number to a request to read and/or update data in a second node. The processing element may be different from main processors of the first node. The processing element may send the message and the sequence number to the second node. The processing element may modify a status of the sequence number to an active state, indicating a transmission of the message is pending. The processing element may, in response to a response from the second node, modify the status of the sequence number to an inactive state, indicating a completed transmission of the message. The processing element may, in response to no response from the second node within a time period, resend the message and the sequence number to the second node.Type: ApplicationFiled: February 5, 2018Publication date: August 8, 2019Inventors: Sameer Kumar, Philip Heidelberger, Yutaka Sugawara, Dong Chen, Robert M. Senger
-
Patent number: 10348609Abstract: A method, system and computer program product are disclosed for routing data packet in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.Type: GrantFiled: April 18, 2018Date of Patent: July 9, 2019Assignee: International Business Machines CorporationInventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
-
Publication number: 20190199653Abstract: A shared memory maintained by sender processes stores a sequence number counter per destination process. A sender process increments the sequence number counter in the shared memory in sending a message to a destination process. The sender process sends a data packet comprising the message and at least a sequence number specified by the sequence number counter. All of the sender processes share a sequence number counter per destination process, each of the sender processes incrementing the sequence number counter in sending a respective message. Receiver processes run on the hardware processor, each of the receiver processes maintaining a local memory counter on the memory, the local memory counter associated with a sending node. The local memory counter stores a sequence number of a message received from the sending node. The receiver process delivers incoming data packets ordered by sequence numbers of the data packets.Type: ApplicationFiled: December 27, 2017Publication date: June 27, 2019Inventors: Sameer Kumar, Philip Heidelberger, Dong Chen, Yutaka Sugawara, Robert M. Senger, Burkhard Steinmacher-Burow
-
Patent number: 10152450Abstract: According to one embodiment of the present invention, a system for operating memory includes a first node coupled to a second node by a network, the system configured to perform a method including receiving the remote transaction message from the second node in a processing element in the first node via the network, wherein the remote transaction message bypasses a main processor in the first node as it is transmitted to the processing element. In addition, the method includes accessing, by the processing element, data from a location in a memory in the first node based on the remote transaction message, and performing, by the processing element, computations based on the data and the remote transaction message.Type: GrantFiled: August 13, 2012Date of Patent: December 11, 2018Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Dong Chen, Noel A. Eisley, Philip Heidelberger, James A. Kahle, Fabrizio Petrini, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara
-
Patent number: 10140179Abstract: A method and system are disclosed for providing combined error code protection and subgroup parity protection for a given group of n bits. The method comprises the steps of identifying a number, m, of redundant bits for said error protection; and constructing a matrix P, wherein multiplying said given group of n bits with P produces m redundant error correction code (ECC) protection bits, and two columns of P provide parity protection for subgroups of said given group of n bits. In the preferred embodiment of the invention, the matrix P is constructed by generating permutations of m bit wide vectors with three or more, but an odd number of, elements with value one and the other elements with value zero; and assigning said vectors to rows of the matrix P.Type: GrantFiled: December 17, 2015Date of Patent: November 27, 2018Assignee: International Business Machines CorporationInventors: Alan Gara, Dong Chen, Philip Heidelberger, Martin Ohmacht