Patents by Inventor Alan G. Gara
Alan G. Gara has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7426253Abstract: A hybrid counter array device for counting events with interrupt indication includes a first counter portion comprising N counter devices, each for counting signals representing event occurrences and providing a first count value representing lower order bits. An overflow bit device associated with each respective counter device is additionally set in response to an overflow condition. The hybrid counter array includes a second counter portion comprising a memory array device having N addressable memory locations in correspondence with the N counter devices, each addressable memory location for storing a second count value representing higher order bits. An operatively coupled control device monitors each associated overflow bit device and initiates incrementing a second count value stored at a corresponding memory location in response to a respective overflow bit being set.Type: GrantFiled: August 21, 2006Date of Patent: September 16, 2008Assignee: International Business Machines CorporationInventors: Alan G. Gara, Valentina Salapura
-
Publication number: 20080222364Abstract: A system and method for supporting cache coherency in a computing environment having multiple processing units, each unit having an associated cache memory system operatively coupled therewith. The system includes a plurality of interconnected snoop filter units, each snoop filter unit corresponding to and in communication with a respective processing unit, with each snoop filter unit comprising a plurality of devices for receiving asynchronous snoop requests from respective memory writing sources in the computing environment; and a point-to-point interconnect comprising communication links for directly connecting memory writing sources to corresponding receiving devices; and, a plurality of parallel operating filter devices coupled in one-to-one correspondence with each receiving device for processing snoop requests received thereat and one of forwarding requests or preventing forwarding of requests to its associated processing unit.Type: ApplicationFiled: May 23, 2008Publication date: September 11, 2008Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
-
Publication number: 20080209128Abstract: A method and apparatus for detecting a cache wrap condition in a computing environment having a processor and a cache. A cache wrap condition is detected when the entire contents of a cache have been replaced, relative to a particular starting state. A set-associative cache is considered to have wrapped when all of the sets within the cache have been replaced. The starting point for cache wrap detection is the state of the cache sets at the time of the previous cache wrap. The method and apparatus is preferably implemented in a snoop filter having filter mechanisms that rely upon detecting the cache wrap condition. These snoop filter mechanisms requiring this information are operatively coupled with cache wrap detection logic adapted to detect the cache wrap event, and perform an indication step to the snoop filter mechanisms. In the various embodiments, cache wrap detection logic is implemented using registers and comparators, loadable counters, or a scoreboard data structure.Type: ApplicationFiled: May 1, 2008Publication date: August 28, 2008Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Matthias A. Blumrich, Alan G. Gara, Mark E. Giampapa, Martin Ohmacht, Valentina Salapura
-
Patent number: 7404041Abstract: A system, method and computer program product for supporting thread level speculative execution in a computing environment having multiple processing units adapted for concurrent execution of threads in speculative and non-speculative modes. Each processing unit includes a cache memory hierarchy of caches operatively connected therewith. The apparatus includes an additional cache level local to each processing unit for use only in a thread level speculation mode, each additional cache for storing speculative results and status associated with its associated processor when handling speculative threads. The additional local cache level at each processing unit are interconnected so that speculative values and control data may be forwarded between parallel executing threads. A control implementation is provided that enables speculative coherence between speculative threads executing in the computing environment.Type: GrantFiled: February 10, 2006Date of Patent: July 22, 2008Assignee: International Business Machines CorporationInventors: Alan G. Gara, Michael K. Gschwind, Valentina Salapura
-
Publication number: 20080155201Abstract: A method and apparatus for implementing a snoop filter unit associated with a single processor in a multiprocessor system. The snoop filter unit has a plurality of ports, each port receiving snoop requests from exactly one dedicated source. Associated with each port is a snoop cache filter that processes each snoop cache request and records addresses of the most recent snoop requests for exactly one source. The snoop cache filter uses vector encoding to record the occurrence of snoop requests for a sequence of consecutive cache lines. All addresses of snoop requests are added to the snoop cache unless a received snoop cache request matches an entry present in the associated snoop cache, in which case the snoop request is discarded. Otherwise, the associated snoop cache request is enqueued for forwarding to the single processor.Type: ApplicationFiled: March 5, 2008Publication date: June 26, 2008Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Matthias A. Blumrich, Alan G. Gara, Valentina Salapura
-
Patent number: 7392351Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having a local cache memory associated therewith. A snoop filter device is associated with each processing unit and includes at least one snoop filter primitive implementing filtering method based on usage of stream registers sets and associated stream register comparison logic. From the plurality of stream registers sets, at least one stream register set is active, and at least one stream register set is labeled historic at any point in time. In addition, the snoop filter block is operatively coupled with cache wrap detection logic whereby the content of the active stream register set is switched into a historic stream register set upon the cache wrap condition detection, and the content of at least one active stream register set is reset.Type: GrantFiled: March 29, 2005Date of Patent: June 24, 2008Assignee: International Business Machines CorporationInventors: Matthias A. Blumrich, Alan G. Gara, Valentina Salapura
-
Patent number: 7386685Abstract: A method and apparatus for implementing a snoop filter unit associated with a single processor in a multiprocessor system. The snoop filter unit has a plurality of ports, each port receiving snoop requests from exactly one dedicated source. Associated with each port is a snoop cache filter that processes each snoop cache request and records addresses of the most recent snoop requests for exactly one source. The snoop cache filter uses vector encoding to record the occurrence of snoop requests for a sequence of consecutive cache lines. All addresses of snoop requests are added to the snoop cache unless a received snoop cache request matches an entry present in the associated snoop cache, in which case the snoop request is discarded. Otherwise, the associated snoop cache request is enqueued for forwarding to the single processor.Type: GrantFiled: March 29, 2005Date of Patent: June 10, 2008Assignee: International Busniess Machines CorporationInventors: Matthias A. Blumrich, Alan G. Gara, Valentina Salapura
-
Patent number: 7386683Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each of the memory writing sources is directly connected to the dedicated input ports of all other snoop filter devices associated with all other processing units in a point-to-point interconnect fashion.Type: GrantFiled: March 29, 2005Date of Patent: June 10, 2008Assignee: International Business Machines CorporationInventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
-
Patent number: 7386684Abstract: A method and apparatus for detecting a cache wrap condition in a computing environment having a processor and a cache. A cache wrap condition is detected when the entire contents of a cache have been replaced, relative to a particular starting state. A set-associative cache is considered to have wrapped when all of the sets within the cache have been replaced. The starting point for cache wrap detection is the state of the cache sets at the time of the previous cache wrap. The method and apparatus is preferably implemented in a snoop filter having filter mechanisms that rely upon detecting the cache wrap condition. These snoop filter mechanisms requiring this information are operatively coupled with cache wrap detection logic adapted to detect the cache wrap event, and perform an indication step to the snoop filter mechanisms. In the various embodiments, cache wrap detection logic is implemented using registers and comparators, loadable counters, or a scoreboard data structure.Type: GrantFiled: March 29, 2005Date of Patent: June 10, 2008Assignee: International Business Machines CorporationInventors: Matthias A. Blumrich, Alan G. Gara, Mark E. Giampapa, Martin Ohmacht, Valentina Salapura
-
Publication number: 20080133633Abstract: The present in invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via “all-to-all” distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates effType: ApplicationFiled: October 31, 2007Publication date: June 5, 2008Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Gyan V. Bhanot, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
-
Publication number: 20080133845Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each of the memory writing sources is directly connected to the dedicated input ports of all other snoop filter devices associated with all other processing units in a point-to-point interconnect fashion.Type: ApplicationFiled: February 21, 2008Publication date: June 5, 2008Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
-
Patent number: 7383397Abstract: An apparatus for implementing snooping cache coherence that locally reduces the number of snoop requests presented to each cache in a multiprocessor system. A snoop filter device associated with a single processor includes one or more “scoreboard” data structures that make snoop determinations, i.e., for each snoop request from another processor, to determine if a request is to be forwarded to the processor or, discarded. At least one scoreboard is active, and at least one scoreboard is determined to be historic at any point in time. A snoop determination of the queue indicates that an entry may be in the cache, but does not indicate its actual residence status. In addition, the snoop filter block implementing scoreboard data structures is operatively coupled with a cache wrap detection logic means whereby, upon detection of a cache wrap condition, the content of the active scoreboard is copied into a historic scoreboard and the content of at least one active scoreboard is reset.Type: GrantFiled: March 29, 2005Date of Patent: June 3, 2008Assignee: International Business Machines CorporationInventors: Matthias A. Blumrich, Alan G. Gara, Thomas R. Puzak, Valentina Salapura
-
Publication number: 20080126634Abstract: A method and apparatus for transferring wide data (e.g., n bits) from a narrow bus (m bits, where m<n) for updating a wide data storage array. The apparatus includes: a staging latch accommodating m bits, e.g., 32 bits; control circuitry for depositing the m bits of data from a data bus port into the staging latch addressed using a specific register address; and control circuitry adapted to merging the m bit data contained in the staging latch with m bit data from a data bus port, to generate the n bit wide data, for example, 64 bits, that is written atomically to a storage array specified by an address corresponding to a storage array location.Type: ApplicationFiled: August 21, 2006Publication date: May 29, 2008Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Alan G. Gara, Michael K. Gschwind, Valentina Salapura
-
Patent number: 7380071Abstract: A system and method for supporting cache coherency in a computing environment having multiple processing units, each unit having an associated cache memory system operatively coupled therewith. The system includes a plurality of interconnected snoop filter units, each snoop filter unit corresponding to and in communication with a respective processing unit, with each snoop filter unit comprising a plurality of devices for receiving asynchronous snoop requests from respective memory writing sources in the computing environment; and a point-to-point interconnect comprising communication links for directly connecting memory writing sources to corresponding receiving devices; and, a plurality of parallel operating filter devices coupled in one-to-one correspondence with each receiving device for processing snoop requests received thereat and one of forwarding requests or preventing forwarding of requests to its associated processing unit.Type: GrantFiled: March 29, 2005Date of Patent: May 27, 2008Assignee: International Business Machines CorporationInventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
-
Patent number: 7373462Abstract: A method and apparatus for supporting cache coherency in a multiprocessor computing environment having multiple processing units, each processing unit having one or more local cache memories associated and operatively connected therewith. The method comprises providing a snoop filter device associated with each processing unit, each snoop filter device having a plurality of dedicated input ports for receiving snoop requests from dedicated memory writing sources in the multiprocessor computing environment. Each snoop filter device includes a plurality of parallel operating port snoop filters in correspondence with the plurality of dedicated input ports, each port snoop filter implementing one or more parallel operating sub-filter elements that are adapted to concurrently filter snoop requests received from respective dedicated memory writing sources and forward a subset of those requests to its associated processing unit.Type: GrantFiled: March 29, 2005Date of Patent: May 13, 2008Assignee: International Business Machines CorporationInventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk I. Hoenicke, Martin Ohmacht, Valentina Salapura, Pavlos M. Vranas
-
Patent number: 7350027Abstract: A method and apparatus for hardware support of the thread level speculation for existing processor cores without having to change the existing processor core, processor core's interface, or existing caches on the L1, L2 or L3 level. Architecture support for thread speculative execution by adding a new cache level for storing speculative values and a dedicated bus for forwarding speculative values and control. The cache level is hierarchically positioned between the cache levels L1 and L2 cache levels.Type: GrantFiled: February 10, 2006Date of Patent: March 25, 2008Assignee: International Business Machines CorporationInventors: Alan G. Gara, Valentina Salapura
-
Publication number: 20080043899Abstract: A hybrid counter array device for counting events. The hybrid counter array includes a first counter portion comprising N counter devices, each counter device for receiving signals representing occurrences of events from an event source and providing a first count value corresponding to a lower order bits of the hybrid counter array. The hybrid counter array includes a second counter portion comprising a memory array device having N addressable memory locations in correspondence with the N counter devices, each addressable memory location for storing a second count value representing higher order bits of the hybrid counter array. A control device monitors each of the N counter devices of the first counter portion and initiates updating a value of a corresponding second count value stored at the corresponding addressable memory location in the second counter portion. Thus, a combination of the first and second count values provide an instantaneous measure of number of events received.Type: ApplicationFiled: August 21, 2006Publication date: February 21, 2008Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Alan G. Gara, Valentina Salapura
-
Publication number: 20080046700Abstract: A system for monitoring a large number of simultaneous events implements a hybrid counter array device having a first counter portion comprising counter devices, each counter device for receiving signals representing occurrences of events from an event source and providing a first count value corresponding to a lower order bits of the hybrid counter array. A second counter portion comprises a memory array device having addressable memory locations in correspondence with the counter devices, each addressable memory location for storing a second count value representing higher order bits. A control device monitors each of the counter devices and initiates updating a value of a corresponding second count value stored at the corresponding addressable memory location. The system includes interrupt pre-indication for providing fast interrupt trigger to a processor device when a count value related to an event equals a threshold value.Type: ApplicationFiled: August 21, 2006Publication date: February 21, 2008Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Alan G. Gara, Michael K. Gschwind, Valentina Salapura
-
Publication number: 20080043897Abstract: A hybrid counter array device for counting events with interrupt indication includes a first counter portion comprising N counter devices, each for counting signals representing event occurrences and providing a first count value representing lower order bits. An overflow bit device associated with each respective counter device is additionally set in response to an overflow condition. The hybrid counter array includes a second counter portion comprising a memory array device having N addressable memory locations in correspondence with the N counter devices, each addressable memory location for storing a second count value representing higher order bits. An operatively coupled control device monitors each associated overflow bit device and initiates incrementing a second count value stored at a corresponding memory location in response to a respective overflow bit being set.Type: ApplicationFiled: August 21, 2006Publication date: February 21, 2008Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Alan G. Gara, Valentina Salapura
-
Patent number: 7330996Abstract: A method for maintaining full performance of a file system in the presence of a failure is provided. The file system having N storage devices, where N is an integer greater than zero and N primary file servers where each file server is operatively connected to a corresponding storage device for accessing files therein. The file system further having a secondary file server operatively connected to at least one of the N storage devices. The method including: switching the connection of one of the N storage devices to the secondary file server upon a failure of one of the N primary file servers; and switching the connections of one or more of the remaining storage devices to a primary file server other than the failed file server as necessary so as to prevent a loss in performance and to provide each storage device with an operating file server.Type: GrantFiled: February 25, 2002Date of Patent: February 12, 2008Assignee: International Business Machines CorporationInventors: Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow