Patents by Inventor Philip Heidelberger

Philip Heidelberger has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11556450
    Abstract: The embodiments herein describe hybrid parallelism techniques where a mix of data and model parallelism techniques are used to split the workload of a layer across an array of processors. When configuring the array, the bandwidth of the processors in one direction may be greater than the bandwidth in the other direction. Each layer is characterized according to whether it is more feature heavy or weight heavy. Depending on this characterization, the workload of a neural network (NN) layer can be assigned to the array using a hybrid parallelism technique rather than using solely the data parallelism technique or solely the model parallelism technique. For example, if an NN layer is more weight heavy than feature heavy, data parallelism is used in the direction with the greater bandwidth (to minimize the negative impact of weight reduction) while model parallelism is used in the direction with the smaller bandwidth. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: October 11, 2019
    Date of Patent: January 17, 2023
    Assignee: International Business Machines Corporation
    Inventors: Swagath Venkataramani, Vijayalakshmi Srinivasan, Philip Heidelberger
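    A minimal Python sketch of the layer-assignment rule described in this entry, assuming a 2-D processor array whose X direction has the higher bandwidth. The names (Layer, assign_parallelism) and the weight-vs-feature comparison used to label a layer are illustrative assumptions, not taken from the patent.

    ```python
    from dataclasses import dataclass

    @dataclass
    class Layer:
        name: str
        weight_bytes: int    # size of the layer's weights
        feature_bytes: int   # size of the layer's activations (features)

    def assign_parallelism(layer: Layer) -> dict:
        """Choose a hybrid data/model split for a 2-D processor array.

        Weight-heavy layers place data parallelism on the high-bandwidth (X)
        direction so the weight-gradient reduction traffic sees the larger
        bandwidth; feature-heavy layers do the opposite.
        """
        weight_heavy = layer.weight_bytes > layer.feature_bytes   # assumed heuristic
        if weight_heavy:
            return {"X (high bandwidth)": "data parallelism",
                    "Y (low bandwidth)": "model parallelism"}
        return {"X (high bandwidth)": "model parallelism",
                "Y (low bandwidth)": "data parallelism"}

    # A fully connected layer is typically weight heavy; an early
    # convolutional layer is typically feature heavy.
    print(assign_parallelism(Layer("fc1", weight_bytes=64 << 20, feature_bytes=2 << 20)))
    print(assign_parallelism(Layer("conv1", weight_bytes=1 << 20, feature_bytes=32 << 20)))
    ```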
  • Patent number: 11165719
    Abstract: A computer network architecture includes a plurality N of first nodes, each first node having kC ports to a cluster network, where N and kC are integers greater than 0; and a local network switch connected to each of the plurality of first nodes, but not to the cluster network. Each first node has kL ports to the local network switch, where kL is an integer greater than 0, and any two first nodes in the plurality of first nodes communicate with each other via the local network switch or via the cluster network. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: June 12, 2019
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Philip Heidelberger, Craig Brian Stunkel
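    A minimal sketch of the two-fabric topology described in this entry, assuming hypothetical names (FirstNode, choose_path) and a simple "prefer the local switch" policy; it only illustrates that two first nodes can reach each other either over their shared local switch or over the cluster network.

    ```python
    from dataclasses import dataclass

    @dataclass
    class FirstNode:
        node_id: int
        cluster_ports: int = 2   # kC ports into the cluster network
        local_ports: int = 1     # kL ports into the local network switch

    def choose_path(src: FirstNode, dst: FirstNode, local_members: set,
                    prefer_local: bool = True) -> str:
        """Return which fabric carries traffic between two first nodes.

        The local switch is not connected to the cluster network, so traffic
        between nodes on the same local switch can stay local; the preference
        policy is an assumption of this sketch.
        """
        both_local = src.node_id in local_members and dst.node_id in local_members
        if both_local and prefer_local:
            return "local network switch"
        return "cluster network"

    nodes = [FirstNode(i) for i in range(4)]
    members = {n.node_id for n in nodes}          # all N first nodes share the local switch
    print(choose_path(nodes[0], nodes[3], members))                       # local network switch
    print(choose_path(nodes[0], nodes[3], members, prefer_local=False))   # cluster network
    ```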
  • Patent number: 11165686
    Abstract: A switch-connected dragonfly network and method of operating. A plurality of groups of row switches is organized according to multiple rows and columns, each row including multiple groups of row switches connected to form a two-level dragonfly network. A plurality of column switches interconnect groups of row switches along respective columns, a column switch associated with a corresponding group of row switches in a row. A switch port with a same logical port on a row switch at a same location in each group along the respective column connects to a same column switch. The switch-connected dragonfly network is expandable by adding additional rows, an added row comprising a two-level dragonfly network. A switch group of said added row associated with a column connects to an available port at an existing column switch of said column by a corresponding added S path link, with no re-cabling of the switched network required. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: August 7, 2018
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Philip Heidelberger, Dong Chen, Yutaka Sugawara, Paul W. Coteus
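    A small, hypothetical sketch of the column wiring rule in this entry: the same (switch position, logical port) pair in every group along a column reaches the same column switch, so an added row needs no re-cabling. The indexing scheme is an assumption.

    ```python
    def column_switch_for(column: int, row: int, switch_pos: int,
                          logical_port: int, ports_per_switch: int) -> tuple:
        """Identify the column switch reached by a given row-switch port.

        The row (group) index does not enter the result: the same position and
        logical port in every group of the column land on the same column switch.
        """
        del row  # deliberately unused
        return (column, switch_pos * ports_per_switch + logical_port)

    # Rows 0 and 5 of column 3: the switch at position 2, logical port 1,
    # connects to the identical column switch, so adding row 5 later only
    # consumes an available port on that existing column switch.
    assert column_switch_for(3, 0, 2, 1, ports_per_switch=4) == \
           column_switch_for(3, 5, 2, 1, ports_per_switch=4)
    print(column_switch_for(3, 0, 2, 1, ports_per_switch=4))   # (3, 9)
    ```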
  • Publication number: 20210110247
    Abstract: The embodiments herein describe hybrid parallelism techniques where a mix of data and model parallelism techniques are used to split the workload of a layer across an array of processors. When configuring the array, the bandwidth of the processors in one direction may be greater than the bandwidth in the other direction. Each layer is characterized according to whether it is more feature heavy or weight heavy. Depending on this characterization, the workload of a neural network (NN) layer can be assigned to the array using a hybrid parallelism technique rather than using solely the data parallelism technique or solely the model parallelism technique. For example, if an NN layer is more weight heavy than feature heavy, data parallelism is used in the direction with the greater bandwidth (to minimize the negative impact of weight reduction) while model parallelism is used in the direction with the smaller bandwidth.
    Type: Application
    Filed: October 11, 2019
    Publication date: April 15, 2021
    Inventors: Swagath Venkataramani, Vijayalakshmi Srinivasan, Philip Heidelberger
  • Patent number: 10979337
    Abstract: A method, system and computer program product are disclosed for routing data packets in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: January 29, 2020
    Date of Patent: April 13, 2021
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
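    A minimal sketch of the toio/ioreturn routing decision in this entry; the packet fields and the deliver helper are assumptions, and hop-by-hop torus routing is collapsed into a single check.

    ```python
    from dataclasses import dataclass

    @dataclass
    class Packet:
        destination: int      # compute-node destination address
        toio: int = 0         # non-zero: forward to an I/O node on arrival
        ioreturn: int = 0     # used to route back through the compute network
        payload: bytes = b""

    def deliver(packet: Packet, current_node: int, attached_io_node: int) -> str:
        """Decide what a compute node does with a packet it has received.

        The packet first travels the torus to its compute-node destination;
        only there is the toio value inspected and, if it is the specified
        value, the packet is forwarded to the attached I/O node.
        """
        if current_node != packet.destination:
            return f"forward on torus toward node {packet.destination}"
        if packet.toio:
            return f"forward to I/O node {attached_io_node}"
        return "consume locally"

    print(deliver(Packet(destination=7, toio=1), current_node=3, attached_io_node=100))
    print(deliver(Packet(destination=7, toio=1), current_node=7, attached_io_node=100))
    print(deliver(Packet(destination=7, toio=0), current_node=7, attached_io_node=100))
    ```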
  • Patent number: 10958588
    Abstract: Methods and systems for monitoring remote transmissions of messages among a plurality of nodes are described. A processing element in a first node may allocate a sequence number to a message requesting to read and/or update data in a second node. The processing element may be different from main processors of the first node. The processing element may send the message and the sequence number to the second node. The processing element may modify a status of the sequence number to an active state, indicating a transmission of the message is pending. The processing element may, in response to a response from the second node, modify the status of the sequence number to an inactive state, indicating a completed transmission of the message. The processing element may, in response to no response from the second node within a time period, resend the message and the sequence number to the second node. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: February 5, 2018
    Date of Patent: March 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Sameer Kumar, Philip Heidelberger, Yutaka Sugawara, Dong Chen, Robert M. Senger
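    A sketch of the sequence-number state machine in this entry: allocate, mark active, mark inactive on response, resend on timeout. A real processing element would do this beside the network interface; the class name, the polling loop, and the timeout value are assumptions.

    ```python
    import time

    class RemoteMessenger:
        def __init__(self, resend_timeout: float = 0.5):
            self.next_seq = 0
            self.active = {}                  # seq -> (dest, message, send_time)
            self.resend_timeout = resend_timeout

        def send(self, message: bytes, dest_node: int) -> int:
            seq = self.next_seq
            self.next_seq += 1
            self._transmit(dest_node, seq, message)
            self.active[seq] = (dest_node, message, time.monotonic())  # active state
            return seq

        def on_response(self, seq: int) -> None:
            self.active.pop(seq, None)                                 # inactive state

        def poll(self) -> None:
            """Resend any message whose response has not arrived in time."""
            now = time.monotonic()
            for seq, (dest, msg, sent) in list(self.active.items()):
                if now - sent > self.resend_timeout:
                    self._transmit(dest, seq, msg)
                    self.active[seq] = (dest, msg, now)

        def _transmit(self, dest_node: int, seq: int, message: bytes) -> None:
            print(f"-> node {dest_node}: seq={seq}, {len(message)} bytes")

    m = RemoteMessenger()
    s = m.send(b"read-and-update 0x1000", dest_node=2)
    m.poll()            # response not yet overdue, nothing resent
    m.on_response(s)    # response arrived: the sequence number goes inactive
    ```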
  • Publication number: 20200396175
    Abstract: A computer network architecture includes a plurality N of first nodes, each first node having kC ports to a cluster network, where N and kC are integers greater than 0; and a local network switch connected to each of the plurality of first nodes, but not to the cluster network. Each first node has kL ports to the local network switch, where kL is an integer greater than 0, and any two first nodes in the plurality of first nodes communicate with each other via the local network switch or via the cluster network.
    Type: Application
    Filed: June 12, 2019
    Publication date: December 17, 2020
    Inventors: Philip Heidelberger, Craig Brian Stunkel
  • Patent number: 10834672
    Abstract: A first method includes determining a total length of pending packets for a network link, determining a currently preferred power mode for the network link based on the total length of pending packets for the network link, and changing a current power mode for the network link to the currently preferred power mode. A corresponding apparatus is also disclosed herein. A second method includes determining a utilization for a network link, determining a currently preferred power mode for the network link based on the utilization for the network link, and changing a current power mode for the network link to the currently preferred power mode. A corresponding apparatus is also disclosed herein. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: September 23, 2015
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Paul W. Coteus, Noel A. Eisley, Philip Heidelberger, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara
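    A minimal sketch of the first method in this entry (power mode driven by the total length of pending packets). The mode names and thresholds are assumptions; only the decision structure is taken from the abstract.

    ```python
    def preferred_power_mode(pending_bytes: int,
                             low_threshold: int = 4 * 1024,
                             high_threshold: int = 64 * 1024) -> str:
        """Pick a link power mode from the total length of pending packets."""
        if pending_bytes == 0:
            return "idle / lowest power"
        if pending_bytes < low_threshold:
            return "reduced width"
        if pending_bytes < high_threshold:
            return "normal"
        return "full bandwidth"

    def update_link_power(link_state: dict, pending_bytes: int) -> dict:
        """Change the current power mode only when the preference changes."""
        preferred = preferred_power_mode(pending_bytes)
        if link_state.get("mode") != preferred:
            link_state["mode"] = preferred
        return link_state

    link = {"mode": "idle / lowest power"}
    for backlog in (0, 1024, 16 * 1024, 1 << 20):
        print(backlog, "->", update_link_power(link, backlog)["mode"])
    ```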
  • Patent number: 10812416
    Abstract: A shared memory maintained by sender processes stores a sequence number counter per destination process. A sender process increments the sequence number counter in the shared memory in sending a message to a destination process. The sender process sends a data packet comprising the message and at least a sequence number specified by the sequence number counter. All of the sender processes share a sequence number counter per destination process, each of the sender processes incrementing the sequence number counter in sending a respective message. Receiver processes run on a hardware processor, each of the receiver processes maintaining a local memory counter in memory, the local memory counter associated with a sending node. The local memory counter stores a sequence number of a message received from the sending node. The receiver process delivers incoming data packets ordered by sequence numbers of the data packets. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: December 27, 2017
    Date of Patent: October 20, 2020
    Assignee: International Business Machines Corporation
    Inventors: Sameer Kumar, Philip Heidelberger, Dong Chen, Yutaka Sugawara, Robert M. Senger, Burkhard Steinmacher-Burow
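    A sketch of the shared per-destination counters and in-order delivery described in this entry. itertools.count stands in for a counter held in shared memory and incremented atomically, and the class names are assumptions.

    ```python
    import itertools
    from collections import defaultdict

    class SharedSenders:
        """All sender processes on a node share one counter per destination."""
        def __init__(self):
            self.counters = defaultdict(itertools.count)   # destination -> shared counter

        def send(self, destination: int, message: str) -> tuple:
            seq = next(self.counters[destination])
            return (seq, message)                           # packet = (sequence number, message)

    class OrderedReceiver:
        """Deliver packets from one sending node in sequence-number order."""
        def __init__(self):
            self.expected = 0       # local memory counter for this sending node
            self.pending = {}

        def receive(self, packet: tuple) -> list:
            seq, message = packet
            self.pending[seq] = message
            delivered = []
            while self.expected in self.pending:            # drain any in-order run
                delivered.append(self.pending.pop(self.expected))
                self.expected += 1
            return delivered

    senders = SharedSenders()
    p0 = senders.send(9, "first")     # sequence number 0 for destination process 9
    p1 = senders.send(9, "second")    # sequence number 1 for destination process 9
    rx = OrderedReceiver()
    print(rx.receive(p1))             # [] -- held until sequence number 0 arrives
    print(rx.receive(p0))             # ['first', 'second']
    ```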
  • Patent number: 10740097
    Abstract: Embodiments of the invention provide a method, system and computer program product for embedding a global barrier and global interrupt network in a parallel computer system organized as a torus network. The computer system includes a multitude of nodes. In one embodiment, the method comprises taking inputs from a set of receivers of the nodes, dividing the inputs from the receivers into a plurality of classes, combining the inputs of each of the classes to obtain a result, and sending said result to a set of senders of the nodes. Embodiments of the invention provide a method, system and computer program product for embedding a collective network in a parallel computer system organized as a torus network. In one embodiment, the method comprises adding to a torus network a central collective logic to route messages among at least a group of nodes in a tree structure. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: May 20, 2016
    Date of Patent: August 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Paul W. Coteus, Noel A. Eisley, Alan Gara, Philip Heidelberger, Robert M. Senger, Valentina Salapura, Burkhard Steinmacher-Burow, Yutaka Sugawara, Todd E. Takken
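    A minimal sketch of the class-based combining step in this entry, assuming a logical AND per class (the natural reduction for a barrier; an OR would model a global interrupt). The dict-based plumbing and names are assumptions; the real logic is embedded in the torus network hardware.

    ```python
    def combine_barrier_inputs(receiver_inputs: dict, node_classes: dict) -> dict:
        """Combine per-node barrier inputs class by class, then fan results out.

        receiver_inputs maps node -> True once that node has entered the barrier;
        node_classes maps node -> class id.  The combined result for a class is
        sent back to the senders of every node in that class.
        """
        by_class = {}
        for node, arrived in receiver_inputs.items():
            cls = node_classes[node]
            by_class[cls] = by_class.get(cls, True) and arrived
        return {node: by_class[node_classes[node]] for node in receiver_inputs}

    inputs = {0: True, 1: True, 2: False, 3: True}
    classes = {0: "A", 1: "A", 2: "B", 3: "B"}
    print(combine_barrier_inputs(inputs, classes))
    # class A has completed ({0: True, 1: True}); class B still waits on node 2
    ```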
  • Publication number: 20200228435
    Abstract: A method, system and computer program product are disclosed for routing data packets in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.
    Type: Application
    Filed: January 29, 2020
    Publication date: July 16, 2020
    Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
  • Patent number: 10601697
    Abstract: A method, system and computer program product are disclosed for routing data packets in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.
    Type: Grant
    Filed: May 6, 2019
    Date of Patent: March 24, 2020
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
  • Publication number: 20200053002
    Abstract: A switch-connected dragonfly network and method of operating. A plurality of groups of row switches is organized according to multiple rows and columns, each row including multiple groups of row switches connected to form a two-level dragonfly network. A plurality of column switches interconnect groups of row switches along respective columns, a column switch associated with a corresponding group of row switches in a row. A switch port with a same logical port on a row switch at a same location in each group along the respective column connects to a same column switch. The switch-connected dragonfly network is expandable by adding additional rows, an added row comprising a two-level dragonfly network. A switch group of said added row associated with a column connects to an available port at an existing column switch of said column by a corresponding added S path link, with no re-cabling of the switched network required.
    Type: Application
    Filed: August 7, 2018
    Publication date: February 13, 2020
    Inventors: Philip Heidelberger, Dong Chen, Yutaka Sugawara, Paul W. Coteus
  • Patent number: 10425358
    Abstract: An apparatus includes a collective switch hardware architecture, including an input arrangement circuit including multiple input ports and multiple outputs. The input arrangement circuit routes its multiple input ports to selected ones of its outputs. The collective switch hardware architecture includes collective reduction logic coupled to the multiple outputs of the input arrangement circuit and having multiple outputs. The collective reduction logic includes ALU(s) and arbitration and control circuitry. The ALU(s) and arbitration and control circuitry support multiple simultaneous collective operations from different collective classes, and support arbitrary input port and output port mapping to different collective classes. The collective switch hardware architecture further includes an output arrangement circuit including multiple inputs coupled to the multiple outputs of the collective reduction logic and including multiple output ports. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: September 29, 2016
    Date of Patent: September 24, 2019
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Philip Heidelberger, Craig Stunkel
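    A software sketch of the reduction path in this entry: each collective class owns its own input ports, output ports, and ALU operation, so several collectives proceed through the switch at once. The class description format and operation names are assumptions; arbitration and pipelining are omitted.

    ```python
    from functools import reduce
    import operator

    ALU_OPS = {"sum": operator.add, "max": max, "min": min}   # assumed ALU operations

    def collective_switch(port_values: dict, classes: dict) -> dict:
        """Apply one reduction per collective class and drive its output ports."""
        results = {}
        for name, spec in classes.items():
            values = [port_values[p] for p in spec["in"]]
            combined = reduce(ALU_OPS[spec["op"]], values)
            for out_port in spec["out"]:
                results[out_port] = combined
        return results

    ports = {0: 3, 1: 5, 2: 7, 3: 2, 4: 9}
    classes = {"classA": {"in": [0, 1, 2], "out": [10], "op": "sum"},
               "classB": {"in": [3, 4], "out": [11, 12], "op": "max"}}
    print(collective_switch(ports, classes))   # {10: 15, 11: 9, 12: 9}
    ```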
  • Publication number: 20190260666
    Abstract: A method, system and computer program product are disclosed for routing data packets in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.
    Type: Application
    Filed: May 6, 2019
    Publication date: August 22, 2019
    Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
  • Publication number: 20190245799
    Abstract: Methods and systems for monitoring remote transmissions of messages among a plurality of nodes are described. A processing element in a first node may allocate a sequence number to a message requesting to read and/or update data in a second node. The processing element may be different from main processors of the first node. The processing element may send the message and the sequence number to the second node. The processing element may modify a status of the sequence number to an active state, indicating a transmission of the message is pending. The processing element may, in response to a response from the second node, modify the status of the sequence number to an inactive state, indicating a completed transmission of the message. The processing element may, in response to no response from the second node within a time period, resend the message and the sequence number to the second node.
    Type: Application
    Filed: February 5, 2018
    Publication date: August 8, 2019
    Inventors: Sameer Kumar, Philip Heidelberger, Yutaka Sugawara, Dong Chen, Robert M. Senger
  • Patent number: 10348609
    Abstract: A method, system and computer program product are disclosed for routing data packets in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.
    Type: Grant
    Filed: April 18, 2018
    Date of Patent: July 9, 2019
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
  • Publication number: 20190199653
    Abstract: A shared memory maintained by sender processes stores a sequence number counter per destination process. A sender process increments the sequence number counter in the shared memory in sending a message to a destination process. The sender process sends a data packet comprising the message and at least a sequence number specified by the sequence number counter. All of the sender processes share a sequence number counter per destination process, each of the sender processes incrementing the sequence number counter in sending a respective message. Receiver processes run on a hardware processor, each of the receiver processes maintaining a local memory counter in memory, the local memory counter associated with a sending node. The local memory counter stores a sequence number of a message received from the sending node. The receiver process delivers incoming data packets ordered by sequence numbers of the data packets.
    Type: Application
    Filed: December 27, 2017
    Publication date: June 27, 2019
    Inventors: Sameer Kumar, Philip Heidelberger, Dong Chen, Yutaka Sugawara, Robert M. Senger, Burkhard Steinmacher-Burow
  • Patent number: 10152450
    Abstract: According to one embodiment of the present invention, a system for operating memory includes a first node coupled to a second node by a network, the system configured to perform a method including receiving a remote transaction message from the second node in a processing element in the first node via the network, wherein the remote transaction message bypasses a main processor in the first node as it is transmitted to the processing element. In addition, the method includes accessing, by the processing element, data from a location in a memory in the first node based on the remote transaction message, and performing, by the processing element, computations based on the data and the remote transaction message. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: August 13, 2012
    Date of Patent: December 11, 2018
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger, James A. Kahle, Fabrizio Petrini, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara
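    A hypothetical sketch of a processing element handling a remote transaction without involving the node's main processor, as this entry describes. The opcode set, message format, and dict-backed memory are assumptions for illustration.

    ```python
    class ProcessingElement:
        def __init__(self, memory: dict):
            self.memory = memory          # stands in for the first node's memory

        def handle(self, message: dict):
            """Access local memory and compute, driven only by the message."""
            addr = message["address"]
            if message["op"] == "read":
                return self.memory.get(addr, 0)
            if message["op"] == "fetch_and_add":      # read-modify-write example
                old = self.memory.get(addr, 0)
                self.memory[addr] = old + message["value"]
                return old
            raise ValueError(f"unknown op {message['op']!r}")

    pe = ProcessingElement(memory={0x100: 41})
    print(pe.handle({"op": "read", "address": 0x100}))                       # 41
    print(pe.handle({"op": "fetch_and_add", "address": 0x100, "value": 1}))  # 41
    print(pe.handle({"op": "read", "address": 0x100}))                       # 42
    ```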
  • Patent number: 10140179
    Abstract: A method and system are disclosed for providing combined error code protection and subgroup parity protection for a given group of n bits. The method comprises the steps of identifying a number, m, of redundant bits for said error protection; and constructing a matrix P, wherein multiplying said given group of n bits with P produces m redundant error correction code (ECC) protection bits, and two columns of P provide parity protection for subgroups of said given group of n bits. In the preferred embodiment of the invention, the matrix P is constructed by generating permutations of m-bit-wide vectors with three or more, but an odd number of, elements with value one and the other elements with value zero; and assigning said vectors to rows of the matrix P. (An illustrative code sketch follows this entry.)
    Type: Grant
    Filed: December 17, 2015
    Date of Patent: November 27, 2018
    Assignee: International Business Machines Corporation
    Inventors: Alan Gara, Dong Chen, Philip Heidelberger, Martin Ohmacht
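    A small sketch of the matrix construction in this entry: rows of P are m-bit vectors with an odd number (three or more) of ones, and the m check bits are the GF(2) product of the n data bits with P. The row enumeration order and the example sizes are assumptions.

    ```python
    from itertools import combinations

    def build_P(n: int, m: int) -> list:
        """Build an n x m matrix whose rows have odd weight of at least three."""
        rows = []
        weight = 3
        while len(rows) < n:
            if weight > m:
                raise ValueError("not enough distinct odd-weight vectors; increase m")
            for ones in combinations(range(m), weight):
                rows.append([1 if j in ones else 0 for j in range(m)])
                if len(rows) == n:
                    break
            weight += 2                      # keep the row weight odd
        return rows

    def check_bits(data_bits: list, P: list) -> list:
        """m redundant ECC bits = data (1 x n) times P (n x m) over GF(2).

        Check bit j is the parity of the data bits whose row of P has a one in
        column j, which is how chosen columns of P double as subgroup parity.
        """
        m = len(P[0])
        return [sum(d & P[i][j] for i, d in enumerate(data_bits)) % 2 for j in range(m)]

    P = build_P(n=8, m=5)
    print(check_bits([1, 0, 1, 1, 0, 0, 1, 0], P))
    ```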