Patents by Inventor Thomas R. Puzak

Thomas R. Puzak has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7383397
    Abstract: An apparatus for implementing snooping cache coherence that locally reduces the number of snoop requests presented to each cache in a multiprocessor system. A snoop filter device associated with a single processor includes one or more “scoreboard” data structures that make snoop determinations, i.e., for each snoop request from another processor, to determine if a request is to be forwarded to the processor or, discarded. At least one scoreboard is active, and at least one scoreboard is determined to be historic at any point in time. A snoop determination of the queue indicates that an entry may be in the cache, but does not indicate its actual residence status. In addition, the snoop filter block implementing scoreboard data structures is operatively coupled with a cache wrap detection logic means whereby, upon detection of a cache wrap condition, the content of the active scoreboard is copied into a historic scoreboard and the content of at least one active scoreboard is reset.
    Type: Grant
    Filed: March 29, 2005
    Date of Patent: June 3, 2008
    Assignee: International Business Machines Corporation
    Inventors: Matthias A. Blumrich, Alan G. Gara, Thomas R. Puzak, Valentina Salapura
  • Publication number: 20080010555
    Abstract: A hardware monitor is used to monitor the sequence of instructions executed during a miss cluster. The monitor groups each cache miss into a miss cluster, and the miss penalty associated with each cluster is determined by identifying a set of instructions that were executed during the miss cluster. The finite cache running time is then calculated for the set of instructions that occurred during the miss cluster. Additionally, an infinite cache running time is determined for the same set of instructions that occurred during the miss cluster, where the infinite cache running time is the time needed to execute this set of instructions in the absence of any miss. The cost of the miss cluster is then calculated by measuring the difference between the finite cache running time and the infinite cache running time.
    Type: Application
    Filed: June 16, 2006
    Publication date: January 10, 2008
    Inventors: Phillip Emma, Alian Hartstein, Daniel N. Lynch, Thomas R. Puzak, Vijayalakshmi Srinivasan
  • Publication number: 20040015683
    Abstract: A two level branch history table (TLBHT) is substantially improved by providing a mechanism to prefetch entries from the very large second level branch history table (L2 BHT) into the active (very fast) first level branch history table (L1 BHT) before the processor uses them in the branch prediction process and at the same time prefetch cache misses into the instruction cache. The mechanism prefetches entries from the very large L2 BHT into the very fast L1 BHT before the processor uses them in the branch prediction process. A TLBHT is successful because it can prefetch branch entries into the L1 BHT sufficiently ahead of the time the entry is needed. This feature of the TLBHT is also used to prefetch instructions into the cache ahead of their use. In fact, the timeliness of the prefetches produced by the TLBHT can be used to remove most of the cycle time penalty incurred by cache misses.
    Type: Application
    Filed: July 18, 2002
    Publication date: January 22, 2004
    Applicant: International Business Machines Corporation
    Inventors: Philip G. Emma, Klaus J. Getzlaff, Allan M. Hartstein, Thomas Pflueger, Thomas R. Puzak, Eric Mark Schwarz, Vijayalakshmi Srinivasan
  • Patent number: 6560693
    Abstract: A mechanism is described that prefetches instructions and data into the cache using a branch instruction as a prefetch trigger. The prefetch is initiated if the predicted execution path after the branch instruction matches the previously seen execution path. This match of the execution paths is determined using a branch history queue that records the branch outcomes (taken/not taken) of the branches in the program. For each branch in this queue, a branch history mask records the outcomes of the next N branches and serves as an encoding of the execution path following the branch instruction. The branch instruction along with the mask is associated with a prefetch address (instruction or data address) and is used for triggering prefetches in the future when the branch is executed again. A mechanism is also described to improve the timeliness of a prefetch by suitably adjusting the value of N after observing the usefulness of the prefetched instructions or data.
    Type: Grant
    Filed: December 10, 1999
    Date of Patent: May 6, 2003
    Assignee: International Business Machines Corporation
    Inventors: Thomas R. Puzak, Allan M. Hartstein, Mark Charney, Daniel A. Prener, Peter H. Oden, Vijayalakshmi Srinivasan
  • Patent number: 6418525
    Abstract: A method and apparatus for storing and utilizing set prediction information regarding which set of a set-associative memory will be accessed for enhancing performance of the set-associative memory and reducing power consumption. The set prediction information is stored in various locations including a branch target buffer, instruction cache and operand history table to decrease latency for accesses to set-associative instruction and data caches.
    Type: Grant
    Filed: January 29, 1999
    Date of Patent: July 9, 2002
    Assignee: International Business Machines Corporation
    Inventors: Mark J. Charney, Philip G. Emma, Daniel A. Prener, Thomas R. Puzak
  • Patent number: 6311253
    Abstract: A method for storing information in a computer memory system includes maintaining an Mth level storage system including an Mth level data store for storing data, an Mth level full directory for storing a set of tags corresponding to the data, and an Mth level partial directory for storing a subset of the tags. The partial directory is accessible faster than the full directory. Upon an M-1 level miss corresponding to a request for data, a congruence class corresponding to the request is fetched from the partial directory when it is present therein; otherwise, it is fetched from the full directory. The requested data is retrieved from the data store when it is present in the congruence class; otherwise, it is retrieved from a next level of the memory system. The tags in the partial directory may be full tags, partial tags, or a combination thereof.
    Type: Grant
    Filed: June 21, 1999
    Date of Patent: October 30, 2001
    Assignee: International Business Machines Corporation
    Inventors: Albert Chang, Mark Charney, Robert K. Montoye, Thomas R. Puzak
  • Patent number: 5636364
    Abstract: In a cache-to-memory interface, a means and method for timesharing a single bus to allow the concurrent processing of multiple misses. The multiplicity of misses can arise from a single processor if that processor has a nonblocking cache and/or does speculative prefetching, or it can arise from a multiplicity of processors in a shared-bus configuration.
    Type: Grant
    Filed: December 1, 1994
    Date of Patent: June 3, 1997
    Assignee: International Business Machines Corporation
    Inventors: Philip G. Emma, Joshua W. Knight, III, Thomas R. Puzak
  • Patent number: 5634119
    Abstract: A computer processing system includes a first memory that stores instructions belonging to a first instruction set architecture and a second memory that stores instructions belonging to a second instruction set architecture. An instruction buffer is coupled to the first and second memories, for storing instructions that are executed by a processor unit. The system operates in one of two modes. In a first mode, instructions are fetched from the first memory into the instruction buffer according to data stored in a first branch history table. In the second mode, instructions are fetched from the second memory into the instruction buffer according to data stored in a second branch history table.The first instruction set architecture may be system level instructions and the second instruction set architecture may be millicode instructions that, for example, define a complex system level instruction and/or emulate a third instruction set architecture.
    Type: Grant
    Filed: January 6, 1995
    Date of Patent: May 27, 1997
    Assignee: International Business Machines Corporation
    Inventors: Philip G. Emma, Joshua W. Knight, III, Thomas R. Puzak, James R. Robinson, A. James Van Norstrand, Jr.
  • Patent number: 5584002
    Abstract: A method for addressing data in a cache unit which has a plurality of congruence classes, following a failure which disables one or more of the congruence classes in the cache unit. A plurality of synonym classes are established. A subset of the congruence classes is assigned to each of the synonym classes. Any disabled congruence classes are identified. The synonym class to which the disabled congruence class belongs is identified. An alternate congruence class is selected which belongs to the same synonym class as the disabled congruence class. When a request is received by the cache to store a line of data into the disabled congruence class, the line is stored into the alternate congruence class in response to the request.
    Type: Grant
    Filed: February 22, 1993
    Date of Patent: December 10, 1996
    Assignee: International Business Machines Corporation
    Inventors: Philip G. Emma, Joshua W. Knight, Keith N. Langston, James H. Pomerene, Thomas R. Puzak
  • Patent number: 5434985
    Abstract: System and method for predicting a multiplicity of future branches simultaneously (parallel) from an executing program, to enable the simultaneous fetching of multiple disjoint program segments. Additionally, the present invention detects divergence of incorrect branch predictions and provides correction for such divergence without penalty. By predicting an entire sequence of branches in parallel, the present invention removes restrictions that decoding of multiple instructions in a superscalar environment must be limited to a single branch group. As a result, the speed of today's superscalar processors can be significantly increased.
    Type: Grant
    Filed: August 11, 1992
    Date of Patent: July 18, 1995
    Assignee: International Business Machines Corporation
    Inventors: Philip G. Emma, Joshua W. Knight, James H. Pomerene, Thomas R. Puzak
  • Patent number: 5353421
    Abstract: A multi-prediction branch prediction mechanism predicts each conditional branch at least twice, first during the instruction-fetch phase of the pipeline and then again during the decode phase of the pipeline. The mechanism uses at least two different branch prediction mechanisms, each a separate and independent mechanism from the other. A set of rules are used to resolve those instances as to when the predictions differ.
    Type: Grant
    Filed: July 13, 1993
    Date of Patent: October 4, 1994
    Assignee: International Business Machines Corporation
    Inventors: Philip G. Emma, Joshua W. Knight, James H. Po merene, Thomas R. Puzak, Rudolph N. Rechtschaffen, James R. Robinson
  • Patent number: 5197139
    Abstract: A store through cache environment managed exclusively grants exclusivity on a large granularity basis. A cross-invalidate is realized for all changed lines via a single transmission when exclusivity is released. A dynamic table that operates in conjunction with a directory look-aside table (DLAT) determines a number of pages that can be held exclusive simultaneously. For adequate operating speed, the special table must be either fully associative or at least set associative. Alternatively, the table can be incorporated into the DLAT. Each DLAT entry is also extended to include a set of "resident" bits and a "valid nonresident" bit. When exclusively is released, the set of local change bits is broadcast to all processors. Upon receipt of such broadcast, the appropriate action is to change the "valid nonresident" indication to read-only and to clear residence bits whose corresponding local change bit is set.
    Type: Grant
    Filed: April 5, 1990
    Date of Patent: March 23, 1993
    Assignee: International Business Machines Corporation
    Inventors: Philip G. Emma, Joshua W. Knight, James H. Pomerene, Thomas R. Puzak, Rudolph N. Rechtschaffen, Frank J. Sparacio
  • Patent number: 4903196
    Abstract: A method and apparatus for controlling access to its general purpose registers (GPRs) by a high end machine configuration including a plurality of execution units within a single CPU. The invention allows up to "N" execution units to be concurrently executing up to "N" instructions using the GPR sequentially or different GPR's concurrently as either SINK or SOURCE while at the same time preserving the logical integrity of the data supplied to the execution units. The use of the invention allows a higher degree of parallelism in the execution of the instructions than would otherwise be possible if only sequential operations were performed.A series of special purpose tags are associated with each GPR and execution unit. These tags are used together with control circuitry both within the GPR's, within the individual execution units and within the instruction decode unit, which permit the multiple use of the registers to be accomplished while maintaining the requisite logical integrity.
    Type: Grant
    Filed: May 2, 1986
    Date of Patent: February 20, 1990
    Assignee: International Business Machines Corporation
    Inventors: James H. Pomerene, Thomas R. Puzak, Rudolph N. Rechtschaffen, Frank J. Sparacio
  • Patent number: 4823259
    Abstract: A high speed buffer store arrangement for use in a data processing system having multiple cache buffer storage units in a hierarchial arrangement permits fast transfer of wide data blocks. On each cache chip, input and output latches are integrated thus avoiding separate intermediate buffering. Input and output latches are interconnected by 64-byte wide data buses so that data blocks can be shifted rapidly from one cache hierarchy level to another and back. Chip-internal feedback connections from output to input latches allow data blocks to be selectively reentered into a cache after reading. An additional register array is provided so that data blocks can be furnished again after transfer from cache to main memory or CPU without accessing the respective cache. Wide data blocks can be transferred within one cycle, thus tying up caches much less in transfer operations, so that they have increased availability.
    Type: Grant
    Filed: June 23, 1988
    Date of Patent: April 18, 1989
    Assignee: International Business Machines Corporation
    Inventors: Frederick J. Aichelmann, Jr., Rex H. Blumberg, David Meltzer, James H. Pomerene, Thomas R. Puzak, Rudolph N. Rechtschaffen, Frank J. Sparacio
  • Patent number: 4807110
    Abstract: A prefetching mechanism for a system having a cache has, in addition to the normal cache directory, a two-level shadow directory. When an information block is accessed, a parent identifier derived from the block address is stored in a first level of the shadow directory. The address of a subsequently accessed block is stored in the second level of the shadow directory, in a position associated with the first-level position of the respective parent identifier.With each access to an information block, a check is made whether the respective parent identifier is already stored in the first level of the shadow directory. If it is found, then a descendant address from the associated second-level position is used to prefetch an information block to the cache if it is not already resident therein. This mechanism avoids, with a high probability, the occurrence of cache misses.
    Type: Grant
    Filed: April 6, 1984
    Date of Patent: February 21, 1989
    Assignee: International Business Machines Corporation
    Inventors: James H. Pomerene, Thomas R. Puzak, Rudolph N. Rechtschaffen, Frank J. Sparacio
  • Patent number: 4774654
    Abstract: A prefetching mechanism for a memory hierarchy which includes at least two levels of storage, with L1 being a high-speed low-capacity memory, and L2 being a low-speed high-capacity memory, with the units of L2 and L1 being blocks and sub-blocks respectively, with each block containing several sub-blocks in consecutive addresses. Each sub-block is provided an additional bit, called a r-bit, which indicates that the sub-block has been previously stored in L1 when the bit is 1, and has not been previously stored in L1 when the bit is 0. Initially when a block is loaded into L2 each of the r-bits in the sub-block are set to 0. When a sub-block is transferred from L1 to L2, its r-bit is then set to 1 in the L2 block, to indicate its previous storage in L1. When the CPU references a given sub-block which is not present in L1, and has to be fetched from L2 to L1, the remaining sub-blocks in this block having r-bits set to 1 are prefetched to L1.
    Type: Grant
    Filed: December 24, 1984
    Date of Patent: September 27, 1988
    Assignee: International Business Machines Corporation
    Inventors: James H. Pomerene, Thomas R. Puzak, Rudolph N. Rechtschaffen, Kimming So
  • Patent number: 4679141
    Abstract: A branch history table (BHT) is substantially improved by dividing it into two parts: an active area, and a backup area. The active area contains entries for a small number of branches which the processor can encounter in the near future and the backup area contains all other branch entries. Means are provided to bring entries from the backup area into the active area ahead of when the processor will use those entries. When entries are no longer needed they are removed from the active area and put into the backup area if not already there. New entries for the near future are brought in, so that the active area, though small, will almost always contain the branch information needed by the processor.The small size of the active area allows it to be fast and to be optimally located in the processor layout. The backup area can be located outside the critical part of the layout and can therefore be made larger than would be practicable for a standard BHT.
    Type: Grant
    Filed: April 29, 1985
    Date of Patent: July 7, 1987
    Assignee: International Business Machines Corporation
    Inventors: James H. Pomerene, Thomas R. Puzak, Rudolph N. Rechtschaffen, Philip L. Rosenfeld, Frank J. Sparacio