Patents by Inventor Dirk Hoenicke

Dirk Hoenicke has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

METHOD AND APPARATUS OF PREFETCHING STREAMS OF VARYING PREFETCH DEPTH

Publication number: 20090006762

Abstract: Method and apparatus of prefetching streams of varying prefetch depth dynamically changes the depth of prefetching so that the number of multiple streams as well as the hit rate of a single stream are optimized. The method and apparatus in one aspect monitor a plurality of load requests from a processing unit for data in a prefetch buffer, determine an access pattern associated with the plurality of load requests and adjust a prefetch depth according to the access pattern.

Type: Application

Filed: June 26, 2007

Publication date: January 1, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alan G. Gara, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Dirk Hoenicke
LOW LATENCY MEMORY ACCESS AND SYNCHRONIZATION

Publication number: 20080313408

Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Bach processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed.

Type: Application

Filed: August 22, 2008

Publication date: December 18, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
COLLECTIVE NETWORK ROUTING

Publication number: 20080298368

Abstract: Disclosed are a unified method and apparatus to classify, route, and process injected data packets into a network so as to belong to a plurality of logical networks, each implementing a specific flow of data on top of a common physical network. The method allows to locally identify collectives of packets for local processing, such as the computation of the sum, difference, maximum, minimum, or other logical operations among the identified packet collective. Packets are injected together with a class-attribute and an opcode attribute. Network routers, employing the described method, use the packet attributes to look-up the class-specific route information from a local route table, which contains the local incoming and outgoing directions as part of the specifically implemented global data flow of the particular virtual network.

Type: Application

Filed: July 15, 2008

Publication date: December 4, 2008

Applicant: International Business Machines Corporation

Inventor: Dirk Hoenicke
Collective Network For Computer Structures

Publication number: 20080104367

Abstract: A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices ate included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to needs of a processing algorithm.

Type: Application

Filed: July 18, 2005

Publication date: May 1, 2008

Inventors: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
LOW LATENCY MEMORY ACCESS AND SYNCHRONIZATION

Publication number: 20070204112

Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed.

Type: Application

Filed: December 28, 2006

Publication date: August 30, 2007

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias Blumrich, Dong Chen, Paul Coteus, Alan Gara, Mark Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard Steinmacher-Burow, Todd Takken, Pavlos Vranas
Managing coherence via put/get windows

Publication number: 20070055825

Abstract: A method and apparatus for managing coherence between two processors of a two processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activated required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.

Type: Application

Filed: February 25, 2002

Publication date: March 8, 2007

Inventors: Matthias Blumrich, Dong Chan, Paul Coteus, Alan Gata, Mark Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht
Low latency memory access and synchronization

Patent number: 7174434

Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed.

Type: Grant

Filed: February 25, 2002

Date of Patent: February 6, 2007

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
Methods and apparatus using commutative error detection values for fault isolation in multiple node computers

Publication number: 20060248370

Abstract: The present invention concerns methods and apparatus for performing fault isolation in multiple node computing systems using commutative error detection values—for example, checksums—to identify and to isolate faulty nodes. In the present invention nodes forming the multiple node computing system are networked together and during program execution communicate with one another by transmitting information through the network. When information associated with a reproducible portion of a computer program is injected into the network by a node, a commutative error detection value is calculated and stored in commutative error detection apparatus associated with the node. At intervals, node fault detection apparatus associated with the multiple node computer system retrieve commutative error detection values saved in the commutative error detection apparatus associated with the node and stores them in memory.

Type: Application

Filed: April 14, 2005

Publication date: November 2, 2006

Inventors: Gheorghe Almasi, Matthias Blumrich, Dong Chen, Paul Coteus, Alan Gara, Mark Giampapa, Philip Heidelberger, Dirk Hoenicke, Sarabjeet Singh, Burkhard Steinmacher-Burow, Todd Takken, Pavlos Vranas
Collective network routing

Publication number: 20060227774

Abstract: Disclosed are a unified method and apparatus to classify, route, and process injected data packets into a network so as to belong to a plurality of logical networks, each implementing a specific flow of data on top of a common physical network. The method allows to locally identify collectives of packets for local processing, such as the computation of the sum, difference, maximum, minimum, or other logical operations among the identified packet collective. Packets are injected together with a class-attribute and an opcode attribute. Network routers, employing the described method, use the packet attributes to look-up the class-specific route information from a local route table, which contains the local incoming and outgoing directions as part of the specifically implemented global data flow of the particular virtual network.

Type: Application

Filed: April 6, 2005

Publication date: October 12, 2006

Applicant: International Business Machines Corporation

Inventor: Dirk Hoenicke
Novel snoop filter for filtering snoop requests

Publication number: 20060224838

Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each snoop filter device includes a plurality of parallel operating port snoop filters in correspondence with the plurality of dedicated input ports, each port snoop filter implementing one or more parallel operating sub-filter elements that are adapted to concurrently filter snoop requests received from respective dedicated memory writing sources and forward a subset of those requests to its associated processing unit.

Type: Application

Filed: March 29, 2005

Publication date: October 5, 2006

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias Blumrich, Dong Chen, Alan Gara, Mark Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos Vranas
Snoop filtering system in a multiprocessor system

Publication number: 20060224835

Abstract: A system and method for supporting cache coherency in a computing environment having multiple processing units, each unit having an associated cache memory system operatively coupled therewith. The system includes a plurality of interconnected snoop filter units, each snoop filter unit corresponding to and in communication with a respective processing unit, with each snoop filter unit comprising a plurality of devices for receiving asynchronous snoop requests from respective memory writing sources in the computing environment; and a point-to-point interconnect comprising communication links for directly connecting memory writing sources to corresponding receiving devices; and, a plurality of parallel operating filter devices coupled in one-to-one correspondence with each receiving device for processing snoop requests received thereat and one of forwarding requests or preventing forwarding of requests to its associated processing unit.

Type: Application

Filed: March 29, 2005

Publication date: October 5, 2006

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias Blumrich, Dong Chen, Alan Gara, Mark Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos Vranas
Method and apparatus for filtering snoop requests in a point-to-point interconnect architecture

Publication number: 20060224837

Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each of the memory writing sources is directly connected to the dedicated input ports of all other snoop filter devices associated with all other processing units in a point-to-point interconnect fashion.

Type: Application

Filed: March 29, 2005

Publication date: October 5, 2006

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias Blumrich, Dong Chen, Alan Gara, Mark Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos Vranas
Multidimensional switch network

Publication number: 20050195808

Abstract: Multidimensional switch data networks are disclosed, such as are used by a distributed-memory parallel computer, as applied for example to computations in the field of life sciences. A distributed memory parallel computing system comprises a number of parallel compute nodes and a message passing data network connecting the compute nodes together. The data network connecting the compute nodes comprises a multidimensional switch data network of compute nodes having N dimensions, and a number/array of compute nodes Ln in each of the N dimensions. Each compute node includes an N port routing element having a port for each of the N dimensions. Each compute node of an array of Ln compute nodes in each of the N dimensions connects through a port of its routing element to an Ln port crossbar switch having Ln ports. Several embodiments are disclosed of a 4 dimensional computing system having 65,536 compute nodes.

Type: Application

Filed: March 4, 2004

Publication date: September 8, 2005

Applicant: International Business Machines Corporation

Inventors: Dong Chen, Alan Gara, Mark Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard Steinmacher-Burow, Pavlos Vranas, Matthias Blumrich
Hardware/software based indirect time stamping methodology for proactive hardware/software event detection and control

Publication number: 20050144532

Abstract: An improved method and apparatus for time stamping events occurring on a large scale distributed network uses a local counter associated with each processor of the distributed network. Each counter resets at the same time globally so that all events are recorded with respect to a particular time. The counter is stopped when a critical event is detected. The events are masked or filtered in an online or offline fashion to eliminate non-critical events from triggering a collection by the system monitor or service/host processor. The masking can be done dynamically through the use of an event history logger. The central system may poll the remote processor periodically to receive the accurate counter value from the local counter and device control register. Remedial action can be taken when conditional probability calculations performed on the historical information indicate that a critical event is about to occur.

Type: Application

Filed: December 12, 2003

Publication date: June 30, 2005

Inventors: Marc Dombrowa, Dirk Hoenicke, Ramendra Sahoo, Krishnan Sugavanam
Deterministic error recovery protocol

Publication number: 20050081078

Abstract: Disclosed are an error recovery method and system for use with a communication system having first and second nodes, each of said nodes having a receiver and a sender, the sender of the first node being connected to the receiver of the second node by a first cable, and the sender of the second node being connected to the receiver of the first node by a second cable. The method comprising the step of after one of the nodes detects an error, both of the nodes entering the same defined state. In particular, the receiver of the first node enters an error state, stays in the error state for a defined period of time T, and, after said defined period of time T, enters a wait state. Also, the sender of the first node sends to the receiver of the second node an error message for a defined period of time Te, and after the defined period of time Te, the sender of the first node enters an idle state.

Type: Application

Filed: September 30, 2003

Publication date: April 14, 2005

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias Blumrich, Dong Chen, Alan Gara, Philip Heidelberger, Dirk Hoenicke, Burkhard Steinmacher-Burow, Pavlos Vranas
Global tree network for computing structures

Publication number: 20040078493

Abstract: A system and method for enabling high-speed, low-latency global tree communications among processing nodes interconnected according to a tree network structure. The global tree network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the tree via links to facilitate performance of low-latency global processing operations at nodes of the virtual tree and sub-tree structures. The global operations include one or more of: global broadcast operations downstream from a root node to leaf nodes of a virtual tree, global reduction operations upstream from leaf nodes to the root node in the virtual tree, and point-to-point message passing from any node to the root node in the virtual tree.

Type: Application

Filed: August 22, 2003

Publication date: April 22, 2004

Inventors: Matthias A Blumrich, Dong Chen, Paul W Coteus, Alan G Gara, Mark E Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D Steinmacher-Burow, Todd E Takken, Pavlos M Vranas
Low latency memoray system access

Publication number: 20040073758

Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed.

Type: Application

Filed: August 22, 2003

Publication date: April 15, 2004

Inventors: Matthias A. Blumrich, Dong Chen, Paul W Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

prev 1 2