Patents by Inventor Matthias A. Blumrich

Matthias A. Blumrich has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PROGRAMMABLE PARTITIONING FOR HIGH-PERFORMANCE COHERENCE DOMAINS IN A MULTIPROCESSOR SYSTEM

Publication number: 20090006769

Abstract: A multiprocessor computing system and a method of logically partitioning a multiprocessor computing system are disclosed. The multiprocessor computing system comprises a multitude of processing units, and a multitude of snoop units. Each of the processing units includes a local cache, and the snoop units are provided for supporting cache coherency in the multiprocessor system. Each of the snoop units is connected to a respective one of the processing units and to all of the other snoop units. The multiprocessor computing system further includes a partitioning system for using the snoop units to partition the multitude of processing units into a plurality of independent, memory-consistent, adjustable-size processing groups. Preferably, when the processor units are partitioned into these processing groups, the partitioning system also configures the snoop units to maintain cache coherency within each of said groups.

Type: Application

Filed: June 26, 2007

Publication date: January 1, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Valentina Salapura
LOW LATENCY MEMORY ACCESS AND SYNCHRONIZATION

Publication number: 20080313408

Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Bach processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed.

Type: Application

Filed: August 22, 2008

Publication date: December 18, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
METHOD AND APPARATUS FOR FILTERING SNOOP REQUESTS USING A SCOREBOARD

Publication number: 20080294850

Abstract: An apparatus for implementing snooping cache coherence that locally reduces the number of snoop requests presented to each cache in a multiprocessor system. A snoop filter device associated with a single processor includes one or more “scoreboard” data structures that make snoop determinations, i.e., for each snoop request from another processor, to determine if a request is to be forwarded to the processor or, discarded. At least one scoreboard is active, and at least one scoreboard is determined to be historic at any point in time. A snoop determination of the queue indicates that an entry may be in the cache, but does not indicate its actual residence status. In addition, the snoop filter block implementing scoreboard data structures is operatively coupled with a cache wrap detection logic means whereby, upon detection of a cache wrap condition, the content of the active scoreboard is copied into a historic scoreboard and the content of at least one active scoreboard is reset.

Type: Application

Filed: May 29, 2008

Publication date: November 27, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Alan G. Gara, Thomas R. Puzak, Valentina Salapura
One-bounce network

Patent number: 7457303

Abstract: A one-bounce data network comprises a plurality of nodes interconnected to each other via communication links, the network including a plurality of interconnected switch devices, said switch devices interconnected such that a message is communicated between any two switches passes over a single link from a source switch to a destination switch; and, the source switch concurrently sends a message to an arbitrary bounce switch which then sends the message to the destination switch.

Type: Grant

Filed: September 30, 2003

Date of Patent: November 25, 2008

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
Global interrupt and barrier networks

Patent number: 7444385

Abstract: A system and method for generating global asynchronous signals in a computing structure. Particularly, a global interrupt and barrier network is implemented that implements logic for generating global interrupt and barrier signals for controlling global asynchronous operations performed by processing elements at selected processing nodes of a computing structure in accordance with a processing algorithm; and includes the physical interconnecting of the processing nodes for communicating the global interrupt and barrier signals to the elements via low-latency paths. The global asynchronous signals respectively initiate interrupt and barrier operations at the processing nodes at times selected for optimizing performance of the processing algorithms.

Type: Grant

Filed: February 25, 2002

Date of Patent: October 28, 2008

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Burkhard D. Steinmacher-Burow, Todd E. Takken
METHOD AND APARATHUS FOR FILTERING SNOOP REQUESTS USING STREAM REGISTERS

Publication number: 20080244194

Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having a local cache memory associated therewith. A snoop filter device is associated with each processing unit and includes at least one snoop filter primitive implementing filtering method based on usage of stream registers sets and associated stream register comparison logic. From the plurality of stream registers sets, at least one stream register set is active, and at least one stream register set is labeled historic at any point in time. In addition, the snoop filter block is operatively coupled with cache wrap detection logic whereby the content of the active stream register set is switched into a historic stream register set upon the cache wrap condition detection, and the content of at least one active stream register set is reset.

Type: Application

Filed: June 11, 2008

Publication date: October 2, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Alan G. Gara, Valentina Salapura
SNOOP FILTERING SYSTEM IN A MULTIPROCESSOR SYSTEM

Publication number: 20080222364

Abstract: A system and method for supporting cache coherency in a computing environment having multiple processing units, each unit having an associated cache memory system operatively coupled therewith. The system includes a plurality of interconnected snoop filter units, each snoop filter unit corresponding to and in communication with a respective processing unit, with each snoop filter unit comprising a plurality of devices for receiving asynchronous snoop requests from respective memory writing sources in the computing environment; and a point-to-point interconnect comprising communication links for directly connecting memory writing sources to corresponding receiving devices; and, a plurality of parallel operating filter devices coupled in one-to-one correspondence with each receiving device for processing snoop requests received thereat and one of forwarding requests or preventing forwarding of requests to its associated processing unit.

Type: Application

Filed: May 23, 2008

Publication date: September 11, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
METHOD AND APPARATUS FOR DETECTING A CACHE WRAP CONDITION

Publication number: 20080209128

Abstract: A method and apparatus for detecting a cache wrap condition in a computing environment having a processor and a cache. A cache wrap condition is detected when the entire contents of a cache have been replaced, relative to a particular starting state. A set-associative cache is considered to have wrapped when all of the sets within the cache have been replaced. The starting point for cache wrap detection is the state of the cache sets at the time of the previous cache wrap. The method and apparatus is preferably implemented in a snoop filter having filter mechanisms that rely upon detecting the cache wrap condition. These snoop filter mechanisms requiring this information are operatively coupled with cache wrap detection logic adapted to detect the cache wrap event, and perform an indication step to the snoop filter mechanisms. In the various embodiments, cache wrap detection logic is implemented using registers and comparators, loadable counters, or a scoreboard data structure.

Type: Application

Filed: May 1, 2008

Publication date: August 28, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Alan G. Gara, Mark E. Giampapa, Martin Ohmacht, Valentina Salapura
METHOD AND APPARATUS FOR FILTERING SNOOP REQUESTS USING MULTIPLE SNOOP CACHES

Publication number: 20080155201

Abstract: A method and apparatus for implementing a snoop filter unit associated with a single processor in a multiprocessor system. The snoop filter unit has a plurality of ports, each port receiving snoop requests from exactly one dedicated source. Associated with each port is a snoop cache filter that processes each snoop cache request and records addresses of the most recent snoop requests for exactly one source. The snoop cache filter uses vector encoding to record the occurrence of snoop requests for a sequence of consecutive cache lines. All addresses of snoop requests are added to the snoop cache unless a received snoop cache request matches an entry present in the associated snoop cache, in which case the snoop request is discarded. Otherwise, the associated snoop cache request is enqueued for forwarding to the single processor.

Type: Application

Filed: March 5, 2008

Publication date: June 26, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Alan G. Gara, Valentina Salapura
Method and apparatus for filtering snoop requests using stream registers

Patent number: 7392351

Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having a local cache memory associated therewith. A snoop filter device is associated with each processing unit and includes at least one snoop filter primitive implementing filtering method based on usage of stream registers sets and associated stream register comparison logic. From the plurality of stream registers sets, at least one stream register set is active, and at least one stream register set is labeled historic at any point in time. In addition, the snoop filter block is operatively coupled with cache wrap detection logic whereby the content of the active stream register set is switched into a historic stream register set upon the cache wrap condition detection, and the content of at least one active stream register set is reset.

Type: Grant

Filed: March 29, 2005

Date of Patent: June 24, 2008

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Alan G. Gara, Valentina Salapura
Method and apparatus for filtering snoop requests using multiple snoop caches

Patent number: 7386685

Abstract: A method and apparatus for implementing a snoop filter unit associated with a single processor in a multiprocessor system. The snoop filter unit has a plurality of ports, each port receiving snoop requests from exactly one dedicated source. Associated with each port is a snoop cache filter that processes each snoop cache request and records addresses of the most recent snoop requests for exactly one source. The snoop cache filter uses vector encoding to record the occurrence of snoop requests for a sequence of consecutive cache lines. All addresses of snoop requests are added to the snoop cache unless a received snoop cache request matches an entry present in the associated snoop cache, in which case the snoop request is discarded. Otherwise, the associated snoop cache request is enqueued for forwarding to the single processor.

Type: Grant

Filed: March 29, 2005

Date of Patent: June 10, 2008

Assignee: International Busniess Machines Corporation

Inventors: Matthias A. Blumrich, Alan G. Gara, Valentina Salapura
Method and apparatus for detecting a cache wrap condition

Patent number: 7386684

Abstract: A method and apparatus for detecting a cache wrap condition in a computing environment having a processor and a cache. A cache wrap condition is detected when the entire contents of a cache have been replaced, relative to a particular starting state. A set-associative cache is considered to have wrapped when all of the sets within the cache have been replaced. The starting point for cache wrap detection is the state of the cache sets at the time of the previous cache wrap. The method and apparatus is preferably implemented in a snoop filter having filter mechanisms that rely upon detecting the cache wrap condition. These snoop filter mechanisms requiring this information are operatively coupled with cache wrap detection logic adapted to detect the cache wrap event, and perform an indication step to the snoop filter mechanisms. In the various embodiments, cache wrap detection logic is implemented using registers and comparators, loadable counters, or a scoreboard data structure.

Type: Grant

Filed: March 29, 2005

Date of Patent: June 10, 2008

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Alan G. Gara, Mark E. Giampapa, Martin Ohmacht, Valentina Salapura
Method and apparatus for filtering snoop requests in a point-to-point interconnect architecture

Patent number: 7386683

Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each of the memory writing sources is directly connected to the dedicated input ports of all other snoop filter devices associated with all other processing units in a point-to-point interconnect fashion.

Type: Grant

Filed: March 29, 2005

Date of Patent: June 10, 2008

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
METHOD AND APPARATUS FOR FILTERING SNOOP REQUESTS IN A POINT-TO-POINT INTERCONNECT ARCHITECTURE

Publication number: 20080133845

Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each of the memory writing sources is directly connected to the dedicated input ports of all other snoop filter devices associated with all other processing units in a point-to-point interconnect fashion.

Type: Application

Filed: February 21, 2008

Publication date: June 5, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
Method and apparatus for filtering snoop requests using a scoreboard

Patent number: 7383397

Abstract: An apparatus for implementing snooping cache coherence that locally reduces the number of snoop requests presented to each cache in a multiprocessor system. A snoop filter device associated with a single processor includes one or more “scoreboard” data structures that make snoop determinations, i.e., for each snoop request from another processor, to determine if a request is to be forwarded to the processor or, discarded. At least one scoreboard is active, and at least one scoreboard is determined to be historic at any point in time. A snoop determination of the queue indicates that an entry may be in the cache, but does not indicate its actual residence status. In addition, the snoop filter block implementing scoreboard data structures is operatively coupled with a cache wrap detection logic means whereby, upon detection of a cache wrap condition, the content of the active scoreboard is copied into a historic scoreboard and the content of at least one active scoreboard is reset.

Type: Grant

Filed: March 29, 2005

Date of Patent: June 3, 2008

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Alan G. Gara, Thomas R. Puzak, Valentina Salapura
Snoop filtering system in a multiprocessor system

Patent number: 7380071

Abstract: A system and method for supporting cache coherency in a computing environment having multiple processing units, each unit having an associated cache memory system operatively coupled therewith. The system includes a plurality of interconnected snoop filter units, each snoop filter unit corresponding to and in communication with a respective processing unit, with each snoop filter unit comprising a plurality of devices for receiving asynchronous snoop requests from respective memory writing sources in the computing environment; and a point-to-point interconnect comprising communication links for directly connecting memory writing sources to corresponding receiving devices; and, a plurality of parallel operating filter devices coupled in one-to-one correspondence with each receiving device for processing snoop requests received thereat and one of forwarding requests or preventing forwarding of requests to its associated processing unit.

Type: Grant

Filed: March 29, 2005

Date of Patent: May 27, 2008

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
Snoop filter for filtering snoop requests

Patent number: 7373462

Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each snoop filter device includes a plurality of parallel operating port snoop filters in correspondence with the plurality of dedicated input ports, each port snoop filter implementing one or more parallel operating sub-filter elements that are adapted to concurrently filter snoop requests received from respective dedicated memory writing sources and forward a subset of those requests to its associated processing unit.

Type: Grant

Filed: March 29, 2005

Date of Patent: May 13, 2008

Assignee: International Business Machines Corporation

Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
Collective Network For Computer Structures

Publication number: 20080104367

Abstract: A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices ate included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to needs of a processing algorithm.

Type: Application

Filed: July 18, 2005

Publication date: May 1, 2008

Inventors: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
OPTIMIZED SCALABLE NETWORK SWITCH

Publication number: 20080091842

Abstract: In a massively parallel computing system having a plurality of nodes configured in m multi-dimensions, each node including a computing device, a method for routing packets towards their destination nodes is provided which includes generating at least one of a 2 m plurality of compact bit vectors containing information derived from downstream nodes. A multilevel arbitration process in which downstream information stored in the compact vectors, such as link status information and fullness of downstream buffers, is used to determine a preferred direction and virtual channel for packet transmission. Preferred direction ranges are encoded and virtual channels are selected by examining the plurality of compact bit vectors. This dynamic routing method eliminates the necessity of routing tables, thus enhancing scalability of the switch.

Type: Application

Filed: October 5, 2007

Publication date: April 17, 2008

Applicant: International Business Machines Corporation

Inventors: Matthias Blumrich, Dong Chen, Paul Coteus
Arithmetic functions in torus and tree networks

Patent number: 7313582

Abstract: Methods and systems for performing arithmetic functions. In accordance with a first aspect of the invention, methods and apparatus are provided, working in conjunction of software algorithms and hardware implementation of class network routing, to achieve a very significant reduction in the time required for global arithmetic operation on the torus. Therefore, it leads to greater scalability of applications running on large parallel machines. The invention involves three steps in improving the efficiency and accuracy of global operations: (1) Ensuring, when necessary, that all the nodes do the global operation on the data in the same order and so obtain a unique answer, independent of roundoff error; (2) Using the topology of the torus to minimize the number of hops and the bidirectional capabilities of the network to reduce the number of time steps in the data transfer operation to an absolute minimum; and (3) Using class function routing to reduce latency in the data transfer.

Type: Grant

Filed: February 25, 2002

Date of Patent: December 25, 2007

Assignee: International Business Machines Corporation

Inventors: Gyan Bhanot, Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas

prev 1 2 3 4 5 6 next