Patents by Inventor Philip Heidelberger

Philip Heidelberger has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10120810
    Abstract: A method, system and memory controller for implementing memory hierarchy placement decisions in a memory system including direct routing of arriving data into a main memory system and selective injection of the data or computed results into a processor cache in a computer system. A memory controller, or a processing element in a memory system, selectively drives placement of data into other levels of the memory hierarchy. The decision to inject into the hierarchy can be triggered by the arrival of data from an input output (IO) device, from computation, or from a directive of an in-memory processing element.
    Type: Grant
    Filed: December 1, 2017
    Date of Patent: November 6, 2018
    Assignee: International Business Machines Corporation
    Inventors: Philip Heidelberger, Hillery C. Hunter, James A. Kahle, Ravi Nair
  • Patent number: 10069599
    Abstract: A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to needs of a processing algorithm.
    Type: Grant
    Filed: December 17, 2015
    Date of Patent: September 4, 2018
    Assignee: International Business Machines Corporation
    Inventors: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
  • Publication number: 20180241661
    Abstract: A method, system and computer program product are disclosed for routing data packet in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.
    Type: Application
    Filed: April 18, 2018
    Publication date: August 23, 2018
    Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
  • Patent number: 9971713
    Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five dimensional torus network that optimally maximize the throughput of packet communications between nodes and minimize latency. The network implements collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design include a list-based prefetcher. The memory system implements transaction memory, thread level speculation, and multiversioning cache that improves soft error rate at the same time and supports DMA functionality allowing for parallel processing message-passing.
    Type: Grant
    Filed: April 30, 2015
    Date of Patent: May 15, 2018
    Assignee: GLOBALFOUNDRIES INC.
    Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu
  • Patent number: 9954760
    Abstract: A method, system and computer program product are disclosed for routing data packet in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.
    Type: Grant
    Filed: January 31, 2017
    Date of Patent: April 24, 2018
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
  • Patent number: 9948543
    Abstract: A method (and structure) for improving efficiency in a multiprocessor system including a plurality of processor nodes interconnected in a multidimensional array, each processor node including a processor, an associated memory device, and an associated inter-nodal interface device for exchange of data with other nodes. Each processor can implement a broadcast procedure as an initiator node, using a format that permits inter-nodal interface devices at each node receiving a broadcast instruction packet to process the received broadcast instruction packet without using processing resources of the processor at the receiving node. Each inter-nodal interface device in each node can implement the broadcast procedure without using processing resources of the processor associated with the receiving node.
    Type: Grant
    Filed: December 30, 2015
    Date of Patent: April 17, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dong Chen, Philip Heidelberger, Sameer Kumar
  • Publication number: 20180091442
    Abstract: An apparatus includes a collective switch hardware architecture, including an input arrangement circuit including multiple input ports and multiple outputs. The input arrangement circuit routes its multiple input ports to selected ones of its outputs. The collective switch hardware architecture includes collective reduction logic coupled to the multiple outputs of the input arrangement circuit and having multiple outputs. The collective reduction logic includes ALU(s) and arbitration and control circuity. The ALU(s) and arbitration and control circuitry support multiple simultaneous collective operations from different collective classes, and support arbitrary input port and output port mapping to different collective classes. The collective switch hardware architecture further includes an output arrangement circuit including a multiple inputs coupled to the multiple outputs of the collective reduction logic and including multiple output ports.
    Type: Application
    Filed: September 29, 2016
    Publication date: March 29, 2018
    Inventors: Dong Chen, PHILIP HEIDELBERGER, CRAIG STUNKEL
  • Publication number: 20180089093
    Abstract: A method, system and memory controller for implementing memory hierarchy placement decisions in a memory system including direct routing of arriving data into a main memory system and selective injection of the data or computed results into a processor cache in a computer system. A memory controller, or a processing element in a memory system, selectively drives placement of data into other levels of the memory hierarchy. The decision to inject into the hierarchy can be triggered by the arrival of data from an input output (IO) device, from computation, or from a directive of an in-memory processing element.
    Type: Application
    Filed: December 1, 2017
    Publication date: March 29, 2018
    Inventors: Philip Heidelberger, Hillery C. Hunter, James A. Kahle, Ravi Nair
  • Patent number: 9910783
    Abstract: A method, system and memory controller for implementing memory hierarchy placement decisions in a memory system including direct routing of arriving data into a main memory system and selective injection of the data or computed results into a processor cache in a computer system. A memory controller, or a processing element in a memory system, selectively drives placement of data into other levels of the memory hierarchy. The decision to inject into the hierarchy can be triggered by the arrival of data from an input output (IO) device, from computation, or from a directive of an in-memory processing element.
    Type: Grant
    Filed: February 3, 2017
    Date of Patent: March 6, 2018
    Assignee: International Business Machines Corporation
    Inventors: Philip Heidelberger, Hillery C. Hunter, James A. Kahle, Ravi Nair
  • Publication number: 20180048556
    Abstract: A method, system and computer program product are disclosed for routing data packet in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.
    Type: Application
    Filed: January 31, 2017
    Publication date: February 15, 2018
    Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger
  • Patent number: 9893950
    Abstract: A network system includes a plurality of sub-network planes and global switches. The sub-network planes have a same network topology as each other. Each of the sub-network planes includes edge switches. Each of the edge switches has N ports. Each of the global switches is configured to connect a group of edge switches at a same location in the sub-network planes. In each of the sub-network planes, some of the N ports of each of the edge switches are connected to end nodes, and others of the N ports are connected to other edge switches in the same sub-network plane, other of the N ports are connected to at least one of the global switches.
    Type: Grant
    Filed: January 27, 2016
    Date of Patent: February 13, 2018
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Philip Heidelberger
  • Publication number: 20170214579
    Abstract: A network system includes a plurality of sub-network planes and global switches. The sub-network planes have a same network topology as each other. Each of the sub-network planes includes edge switches. Each of the edge switches has N ports. Each of the global switches is configured to connect a group of edge switches at a same location in the sub-network planes. In each of the sub-network planes, some of the N ports of each of the edge switches are connected to end nodes, and others of the N ports are connected to other edge switches in the same sub-network plane, other of the N ports are connected to at least one of the global switches.
    Type: Application
    Filed: January 27, 2016
    Publication date: July 27, 2017
    Inventors: Dong Chen, Philip Heidelberger
  • Publication number: 20170195212
    Abstract: A method (and structure) for improving efficiency in a multiprocessor system including a plurality of processor nodes interconnected in a multidimensional array, each processor node including a processor, an associated memory device, and an associated inter-nodal interface device for exchange of data with other nodes. Each processor can implement a broadcast procedure as an initiator node, using a format that permits inter-nodal interface devices at each node receiving a broadcast instruction packet to process the received broadcast instruction packet without using processing resources of the processor at the receiving node. Each inter-nodal interface device in each node can implement the broadcast procedure without using processing resources of the processor associated with the receiving node.
    Type: Application
    Filed: December 30, 2015
    Publication date: July 6, 2017
    Inventors: Dong Chen, Philip Heidelberger, Sameer Kumar
  • Patent number: 9699078
    Abstract: An apparatus and method for extending the scalability and improving the partitionability of networks that contain all-to-all links for transporting packet traffic from a source endpoint to a destination endpoint with low per-endpoint (per-server) cost and a small number of hops. An all-to-all wiring in the baseline topology is decomposed into smaller all-to-all components in which each smaller all-to-all connection is replaced with star topology by using global switches. Stacking multiple copies of the star topology baseline network creates a multi-planed switching topology for transporting packet traffic. Point-to-point unified stacking method using global switch wiring methods connects multiple planes of a baseline topology by using the global switches to create a large network size with a low number of hops, i.e., low network latency. Grouped unified stacking method increases the scalability (network size) of a stacked topology.
    Type: Grant
    Filed: December 29, 2015
    Date of Patent: July 4, 2017
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Philip Heidelberger, Yutaka Sugawara
  • Publication number: 20170187616
    Abstract: An apparatus and method for extending the scalability and improving the partitionability of networks that contain all-to-all links for transporting packet traffic from a source endpoint to a destination endpoint with low per-endpoint (per-server) cost and a small number of hops. An all-to-all wiring in the baseline topology is decomposed into smaller all-to-all components in which each smaller all-to-all connection is replaced with star topology by using global switches. Stacking multiple copies of the star topology baseline network creates a multi-planed switching topology for transporting packet traffic. Point-to-point unified stacking method using global switch wiring methods connects multiple planes of a baseline topology by using the global switches to create a large network size with a low number of hops, i.e., low network latency. Grouped unified stacking method increases the scalability (network size) of a stacked topology.
    Type: Application
    Filed: December 29, 2015
    Publication date: June 29, 2017
    Inventors: Dong Chen, Philip Heidelberger, Yutaka Sugawara
  • Publication number: 20170161200
    Abstract: A method, system and memory controller for implementing memory hierarchy placement decisions in a memory system including direct routing of arriving data into a main memory system and selective injection of the data or computed results into a processor cache in a computer system. A memory controller, or a processing element in a memory system, selectively drives placement of data into other levels of the memory hierarchy. The decision to inject into the hierarchy can be triggered by the arrival of data from an input output (IO) device, from computation, or from a directive of an in-memory processing element.
    Type: Application
    Filed: February 3, 2017
    Publication date: June 8, 2017
    Inventors: Philip Heidelberger, Hillery C. Hunter, James A. Kahle, Ravi Nair
  • Patent number: 9626322
    Abstract: A multiprocessor computer system includes a plurality of processor nodes and at least a three-tier hierarchical network interconnecting the processor nodes. The hierarchical network includes a plurality of routers interconnected such that each router is connected to a subset of the plurality of processor nodes; the plurality of routers are arranged in a hierarchy of n?3 tiers (T1, . . . , Tn); the plurality of routers are partitioned into disjoint groups at the first tier T1, the groups at tier Ti being partitioned into disjoint groups (of complete Ti groups) at the next tier Ti+1 and a top tier Tn including a single group containing all of the plurality of routers; and for all tiers 1?i?n, each tier-Ti?1 subgroup within a tier Ti group is connected by at least one link to all other tier-Ti?1 subgroups within the same tier Ti group.
    Type: Grant
    Filed: September 15, 2014
    Date of Patent: April 18, 2017
    Assignee: International Business Machines Corporation
    Inventors: Baba L. Arimilli, Wolfgang Denzel, Philip Heidelberger, German Rodriguez Herrera, Christopher J. Johnson, Lonny Lambrecht, Cyriel Minkenberg, Bogdan Prisacari
  • Publication number: 20170086151
    Abstract: A first method includes determining a total length of pending packets for a network link, determining a currently preferred power mode for the network link based on the total length of pending packets for the network link, and changing a current power mode for the network link to the currently preferred power mode. A corresponding apparatus is also disclosed herein. A second method includes determining a utilization for a network link, determining a currently preferred power mode for the network link based on the utilization for the network link, and changing a current power mode for the network link to the currently preferred power mode. A corresponding apparatus is also disclosed herein.
    Type: Application
    Filed: September 23, 2015
    Publication date: March 23, 2017
    Inventors: Dong Chen, Paul W. Coteus, Noel A. Eisley, Philip Heidelberger, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara
  • Patent number: 9582427
    Abstract: A method, system and memory controller for implementing memory hierarchy placement decisions in a memory system including direct routing of arriving data into a main memory system and selective injection of the data or computed results into a processor cache in a computer system. A memory controller, or a processing element in a memory system, selectively drives placement of data into other levels of the memory hierarchy. The decision to inject into the hierarchy can be triggered by the arrival of data from an input output (IO) device, from computation, or from a directive of an in-memory processing element.
    Type: Grant
    Filed: August 31, 2015
    Date of Patent: February 28, 2017
    Assignee: International Business Machines Corporation
    Inventors: Philip Heidelberger, Hillery C. Hunter, James A. Kahle, Ravi Nair
  • Patent number: 9565094
    Abstract: A method, system and computer program product are disclosed for routing data packet in a computing system comprising a multidimensional torus compute node network including a multitude of compute nodes, and an I/O node network including a plurality of I/O nodes. In one embodiment, the method comprises assigning to each of the data packets a destination address identifying one of the compute nodes; providing each of the data packets with a toio value; routing the data packets through the compute node network to the destination addresses of the data packets; and when each of the data packets reaches the destination address assigned to said each data packet, routing said each data packet to one of the I/O nodes if the toio value of said each data packet is a specified value. In one embodiment, each of the data packets is also provided with an ioreturn value used to route the data packets through the compute node network.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: February 7, 2017
    Assignee: International Business Machines Corporation
    Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger