Patents by Inventor Simon C. Steely, Jr.
Simon C. Steely, Jr. has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 6088771
Abstract: A technique reduces the latency of a memory barrier (MB) operation used to impose an inter-reference order between sets of memory reference operations issued by a processor to a multiprocessor system having a shared memory. The technique comprises issuing the MB operation immediately after issuing a first set of memory reference operations (i.e., the pre-MB operations) without waiting for responses to those pre-MB operations. Issuance of the MB operation to the system results in serialization of that operation and generation of a MB Acknowledgment (MB-Ack) command. The MB-Ack is loaded into a probe queue of the issuing processor and, according to the invention, functions to pull-in all previously ordered invalidate and probe commands in that queue. By ensuring that the probes and invalidates are ordered before the MB-Ack is received at the issuing processor, the inventive technique provides the appearance that all pre-MB references have completed.
Type: Grant
Filed: October 24, 1997
Date of Patent: July 11, 2000
Assignee: Digital Equipment Corporation
Inventors: Simon C. Steely, Jr., Madhumitra Sharma, Kourosh Gharachorloo, Stephen R. Van Doren
-
Patent number: 6085263
Abstract: An improved I/O processor (IOP) delivers high I/O performance while maintaining inter-reference ordering among memory reference operations issued by an I/O device as specified by a consistency model in a shared memory multiprocessor system. The IOP comprises a retire controller which imposes inter-reference ordering among the operations based on receipt of a commit signal for each operation, wherein the commit signal for a memory reference operation indicates the apparent completion of the operation rather than actual completion of the operation. In addition, the IOP comprises a prefetch controller coupled to an I/O cache for prefetching data into cache without any ordering constraints (or out-of-order). The ordered retirement functions of the IOP are separated from its prefetching operations, which enables the latter operations to be performed in an arbitrary manner so as to improve the overall performance of the system.
Type: Grant
Filed: October 24, 1997
Date of Patent: July 4, 2000
Assignee: Compaq Computer Corp.
Inventors: Madhumitra Sharma, Chester Pawlowski, Kourosh Gharachorloo, Stephen R. Van Doren, Simon C. Steely, Jr.
-
Patent number: 6081887
Abstract: A technique for predicting the result of a conditional branch instruction for use with a processor having an instruction pipeline. A stored predictor is connected to the front end of the pipeline and is trained from a truth based predictor connected to the back end of the pipeline. The stored predictor is accessible in one instruction cycle, and therefore provides minimum predictor latency. Update latency is minimized by storing multiple predictions in the front end stored predictor which are indexed by an index counter. The multiple predictions, as provided by the back end, are indexed by the index counter to select a particular one as the current prediction on a given instruction pipeline cycle. The front end stored predictor also passes along to the back end predictor, such as through the instruction pipeline, a position value used to generate the predictions. This further structure accommodates ghost branch instructions that turn out to be flushed out of the pipeline when it must be backed up.
Type: Grant
Filed: November 12, 1998
Date of Patent: June 27, 2000
Assignee: Compaq Computer Corporation
Inventors: Simon C. Steely, Jr., Edward J. McLellan, Joel S. Emer
-
Patent number: 6061765
Abstract: In accordance with the present invention, a method and apparatus are provided for storing victim data evicted from a cache and for satisfying pending requests or probe messages that target victim data, using a set of victim data buffers coupled to a central processing unit of a computer system. Storage locations referred to as a "victim valid bit" and a "probe valid bit" are associated with each victim data buffer in the computer system to indicate a release condition for the coupled victim data buffer. With such an arrangement, the victim data buffer can be deallocated when the victim valid bit and the probe valid bit have both been cleared.
Type: Grant
Filed: October 24, 1997
Date of Patent: May 9, 2000
Assignee: Compaq Computer Corporation
Inventors: Stephen Van Doren, Simon C. Steely, Jr., Robert Eugene Stewart, James Bernard Keller
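The release condition in this abstract can be illustrated with a minimal Python sketch. The class and method names are hypothetical, and the meaning given to each bit (write-back pending, probe pending) is an assumption; the abstract only says that the buffer can be deallocated once both bits are cleared.

```python
from dataclasses import dataclass


@dataclass
class VictimDataBuffer:
    """Hypothetical model of one victim data buffer and its two bits."""
    data: bytes = b""
    victim_valid: bool = False  # assumed: evicted data still awaits write-back
    probe_valid: bool = False   # assumed: a probe still needs to read the data

    def allocate(self, data: bytes, probe_pending: bool) -> None:
        # Store the evicted block and mark the conditions that hold it in place.
        self.data = data
        self.victim_valid = True
        self.probe_valid = probe_pending

    def can_deallocate(self) -> bool:
        # Release condition from the abstract: both bits have been cleared.
        return not self.victim_valid and not self.probe_valid


buf = VictimDataBuffer()
buf.allocate(b"evicted block", probe_pending=True)
buf.victim_valid = False      # write-back finished, probe still outstanding
still_held = buf.can_deallocate()
buf.probe_valid = False       # probe serviced
released = buf.can_deallocate()
```

Keeping the two bits separate lets the write-back and the probe complete in either order without either one freeing the buffer early.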
-
Patent number: 6055605
Abstract: A technique reduces the latency of inter-reference ordering between sets of memory reference operations in a multiprocessor system having a shared memory that is distributed among a plurality of processors that share a cache. According to the technique, each processor sharing a cache inherits a commit-signal that is generated by control logic of the multiprocessor system in response to a memory reference operation issued by another processor sharing that cache. The commit-signal facilitates serialization among the processors and shared memory entities of the multiprocessor system by indicating the apparent completion of the memory reference operation to those entities of the system.
Type: Grant
Filed: October 24, 1997
Date of Patent: April 25, 2000
Assignee: Compaq Computer Corporation
Inventors: Madhumitra Sharma, Simon C. Steely, Jr., Kourosh Gharachorloo, Stephen R. Van Doren
-
Patent number: 6049889
Abstract: A multi-node computer network includes a plurality of nodes coupled together via a data link. Each of the nodes includes a local memory, which further comprises a shared memory. Certain items of data that are to be shared by the nodes are stored in the shared portion of memory. Associated with each of the shared data items is a data structure. When a node sharing data with other nodes in the system seeks to modify the data, it transmits the modifications over the data link to the other nodes in the network. Each update is received in order by each node in the cluster. As part of the last transmission by the modifying node, an acknowledgement request is sent to the receiving nodes in the cluster. Each node that receives the acknowledgement request returns an acknowledgement to the sending node. The returned acknowledgement is written to the data structure associated with the shared data item.
Type: Grant
Filed: January 13, 1998
Date of Patent: April 11, 2000
Assignee: Digital Equipment Corporation
Inventors: Simon C. Steely, Jr., Glenn P. Garvey, Richard B. Gillett, Jr.
-
Patent number: 5953747
Abstract: A prediction mechanism for improving direct-mapped cache performance is shown to include a direct-mapped cache, partitioned into a plurality of pseudo-banks. Prediction means are employed to provide a prediction index which is appended to the cache index to provide the entire address for addressing the direct-mapped cache. One embodiment of the prediction means includes a prediction cache which is advantageously larger than the pseudo-banks of the direct-mapped cache and is used to store the prediction index for each cache location. A second embodiment includes a plurality of partial tag stores, each including a predetermined number of tag bits for the data in each bank. A comparison of the tags generates a match in one of the plurality of tag stores, and is used in turn to generate a prediction index.
Type: Grant
Filed: June 26, 1996
Date of Patent: September 14, 1999
Assignee: Digital Equipment Corporation
Inventors: Simon C. Steely, Jr., Joseph Dominic Macri
-
Patent number: 5829051
Abstract: An apparatus for allocating data to and retrieving data from a cache includes a memory subsystem coupled between a processor and a memory to provide quick access of memory data to the processor. The memory subsystem includes a cache memory. The address provided to the memory subsystem is divided into a cache index and a tag, and the cache index is hashed to provide a plurality of alternative addresses for accessing the cache. During a cache read, each of the alternative addresses is selected to search for the data responsive to an indicator of the validity of the data at the locations. The selection of the alternative address may be done through a mask having a number of bits corresponding to the number of alternative addresses. Each bit indicates whether the alternative address at that location should be used during the access of the cache in search of the data.
Type: Grant
Filed: April 4, 1994
Date of Patent: October 27, 1998
Assignee: Digital Equipment Corporation
Inventors: Simon C. Steely, Jr., Richard B. Gillett, Jr., Tryggve Fossum
-
Patent number: 5828874
Abstract: A branch prediction apparatus includes a predicted past history device and a branch prediction device. The predicted past history device is operable to receive an indication of a branch instruction and to output a pattern of past predictions of branch directions for the indicated branch instruction. The pattern of past predictions includes at least one prediction of a branch direction for which the correctness of the prediction has not been determined. The branch prediction device is operable to receive the pattern of past predictions of branch directions for the indicated branch instruction and to output a predicted branch direction for the indicated branch instruction based on the received pattern of past predictions.
Type: Grant
Filed: June 5, 1996
Date of Patent: October 27, 1998
Assignee: Digital Equipment Corporation
Inventors: Simon C. Steely, Jr., David J. Sager
-
Patent number: 5758142
Abstract: A predictor which chooses between two or more predictors is described. The predictor includes a first component predictor which operates according to a first algorithm to produce a prediction of an action and a second component predictor which operates according to a second algorithm to produce a prediction of said action. The predictor also includes means, coupled to each of said first and second predictors, for choosing between predictions provided from said predictors to provide a prediction of the action from the predictor. The predictor can be used to predict outcomes of branches, cache hits, prefetched instruction sequences, and so forth.
Type: Grant
Filed: May 31, 1994
Date of Patent: May 26, 1998
Assignee: Digital Equipment Corporation
Inventors: Scott McFarling, Simon C. Steely, Jr., Joel Emer, Edward McLellan
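As a rough illustration of choosing between two component predictors, the following Python sketch uses a table of 2-bit saturating counters as the choosing means. That mechanism, the two toy component predictors, and all class names are assumptions made for the example; the abstract does not specify how the choice is made.

```python
class AlwaysTaken:
    """Toy first component predictor (hypothetical): always predicts taken."""
    def predict(self, pc):
        return True

    def update(self, pc, taken):
        pass


class LastOutcome:
    """Toy second component predictor (hypothetical): repeats the last outcome."""
    def __init__(self):
        self.last = {}

    def predict(self, pc):
        return self.last.get(pc, False)

    def update(self, pc, taken):
        self.last[pc] = taken


class ChoosingPredictor:
    """Selects one component prediction per branch (assumed chooser design)."""
    def __init__(self, first, second, table_size=16):
        self.first = first
        self.second = second
        # 2-bit counters: 0-1 favour the first predictor, 2-3 the second.
        self.chooser = [1] * table_size

    def predict(self, pc):
        use_second = self.chooser[pc % len(self.chooser)] >= 2
        return (self.second if use_second else self.first).predict(pc)

    def update(self, pc, taken):
        slot = pc % len(self.chooser)
        first_ok = self.first.predict(pc) == taken
        second_ok = self.second.predict(pc) == taken
        # Train the chooser only when exactly one component was right.
        if second_ok and not first_ok:
            self.chooser[slot] = min(3, self.chooser[slot] + 1)
        elif first_ok and not second_ok:
            self.chooser[slot] = max(0, self.chooser[slot] - 1)
        self.first.update(pc, taken)
        self.second.update(pc, taken)


predictor = ChoosingPredictor(AlwaysTaken(), LastOutcome())
before = predictor.predict(4)     # chooser starts out favouring AlwaysTaken
predictor.update(4, False)        # the branch was actually not taken
after = predictor.predict(4)      # chooser now favours LastOutcome
```

Updating the chooser only on disagreement keeps it from drifting when both components are equally right or wrong.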
-
Patent number: 5619662
Abstract: A pipelined processor includes an instruction box including a register mapper, to map register operand fields of a set of instructions and an instruction scheduler, fed by the set of instructions, to reorder the issuance of the set of instructions from the instruction processor. The mapped register operand fields are associated with the corresponding instructions of the reordered set of instructions prior to issuance of the instructions. The processor further includes a branch prediction table which maps a stored pattern of past histories associated with a branch instruction to a more likely prediction direction of the branch instruction. The processor further includes a memory reference tagging store associated with the instruction scheduler so that the scheduler can reorder memory reference instructions without knowing the actual memory location addressed by the memory reference instruction.
Type: Grant
Filed: August 12, 1994
Date of Patent: April 8, 1997
Assignee: Digital Equipment Corporation
Inventors: Simon C. Steely, Jr., David J. Sager, David B. Fite, Jr.
-
Patent number: 5581719
Abstract: A pipelined processor includes an instruction box including a register mapper, to map register operand fields of a set of instructions and an instruction scheduler, fed by said set of instructions, to reorder the issuance of said set of instructions from said instruction processor. The mapped register operand fields are associated with the corresponding instructions of said reordered set of instructions prior to issuance of the instructions. The processor further includes a branch prediction table which maps a stored pattern of past histories associated with a branch instruction to a more likely prediction direction of the branch instruction. The processor further includes a memory reference tagging store associated with the instruction scheduler so that the scheduler can reorder memory reference instructions without knowing the actual memory location addressed by the memory reference instruction.
Type: Grant
Filed: March 10, 1995
Date of Patent: December 3, 1996
Assignee: Digital Equipment Corporation
Inventors: Simon C. Steely, Jr., David J. Sager
-
Patent number: 5564118
Abstract: A pipelined processor includes an instruction box including a register mapper, to map register operand fields of a set of instructions and an instruction scheduler, fed by said set of instructions, to reorder the issuance of said set of instructions from said instruction processor. The mapped register operand fields are associated with the corresponding instructions of said reordered set of instructions prior to issuance of the instructions. The processor further includes a branch prediction table which maps a stored pattern of past histories associated with a branch instruction to a more likely prediction direction of the branch instruction. The processor further includes a memory reference tagging store associated with the instruction scheduler so that the scheduler can reorder memory reference instructions without knowing the actual memory location addressed by the memory reference instruction.
Type: Grant
Filed: December 8, 1994
Date of Patent: October 8, 1996
Assignee: Digital Equipment Corporation
Inventors: Simon C. Steely, Jr., David J. Sager, William B. Noyce
-
Patent number: 5551048
Abstract: A method for providing communication between a plurality of nodes coupled in a ring arrangement, wherein a plurality of the nodes comprise processors each having a cache memory for storing a subset of shared data. Each of the nodes on the ring deposits data into a data slot during a given time period. The data deposited by each node may comprise an address field and a node field. To ensure data coherency between the caches, each processor on the ring includes a queue for saving a plurality of received data representative of the latest bus data transmitted on the bus. As each processor receives new data, the new data is compared against the plurality of saved data in the queue to determine if the address field of the new data matches the address field of any of the saved data of the queue. In the event that the new data matches one of the plurality of saved data, it is determined whether the new data represents updated data from the memory device.
Type: Grant
Filed: June 3, 1994
Date of Patent: August 27, 1996
Assignee: Digital Equipment Corporation
Inventor: Simon C. Steely, Jr.
-
Patent number: 5519841
Abstract: A pipelined processor includes an instruction unit including a register mapper, to map register operand fields of a set of instructions and an instruction scheduler, fed by the set of instructions, to reorder the issuance of the set of instructions from the processor. The mapped register operand fields are associated with the corresponding instructions of the reordered set of instructions prior to issuance of the instructions. The processor further includes a branch prediction table which maps a stored pattern of past histories associated with a branch instruction to a more likely prediction direction of the branch instruction. The processor further includes a memory reference tagging store associated with the instruction scheduler so that the scheduler can reorder memory reference instructions without knowing the actual memory location addressed by the memory reference instruction.
Type: Grant
Filed: November 12, 1992
Date of Patent: May 21, 1996
Assignee: Digital Equipment Corporation
Inventors: David J. Sager, Simon C. Steely, Jr., David B. Fite, Jr.
-
Patent number: 5509135
Abstract: A plurality of indexes are provided for a multi-way set-associative cache of a computer system. The cache is organized as a plurality of blocks for storing data which are copies of main memory data. Each block has an associated tag for uniquely identifying the block. The blocks and the tags are addressed by indexes. The indexes are generated by a Boolean hashing function which converts a memory address to cache indexes by combining the bits of the memory address using an exclusive OR function. Different combinations of bits are used to generate a plurality of different indexes to address the tags and the associated blocks to transfer data between the cache and the central processing unit of the computer system.
Type: Grant
Filed: September 25, 1992
Date of Patent: April 16, 1996
Assignee: Digital Equipment Corporation
Inventor: Simon C. Steely, Jr.
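The idea of deriving several cache indexes from one address by XOR-combining different groups of bits can be sketched in Python as follows. The particular bit groupings, the parameter defaults, and the function names are illustrative assumptions, not the hashing function claimed in the patent.

```python
def xor_fold(value: int, width: int) -> int:
    """XOR successive width-bit groups of value into one width-bit result."""
    mask = (1 << width) - 1
    result = 0
    while value:
        result ^= value & mask
        value >>= width
    return result


def cache_indexes(address: int, index_bits: int = 4, ways: int = 2,
                  block_bits: int = 6) -> list[int]:
    """Return one index per way, each built from a different bit combination."""
    block = address >> block_bits  # drop the offset within the cache block
    # Assumed scheme: shifting by the way number regroups the address bits,
    # so each way folds a different combination of them.
    return [xor_fold(block >> way, index_bits) for way in range(ways)]


indexes = cache_indexes(0x12345)  # one index per way for this address
```

Because each way hashes the address differently, two addresses that collide in one way are less likely to collide in the other, which is the usual motivation for using more than one index function.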
-
Patent number: 5283873
Abstract: A next line prediction mechanism for predicting a next instruction index to an instruction cache of a computer pipeline has a latency equal to the cycle time of the instruction cache to maximize the instruction bandwidth out of the instruction cache. The instruction cache outputs a block of instructions with each fetch initiated by a next instruction index provided by the line prediction mechanism. The instructions of the block are processed in parallel for instruction decode and branch prediction to maintain a high rate of instruction flow through the pipeline.
Type: Grant
Filed: June 29, 1990
Date of Patent: February 1, 1994
Assignee: Digital Equipment Corporation
Inventors: Simon C. Steely, Jr., David J. Sager
-
Patent number: 5235697
Abstract: The set-prediction cache memory system comprises an extension of a set-associative cache memory system which operates in parallel to the set-associative structure to increase the overall speed of the cache memory while maintaining its performance. The set prediction cache memory system includes a plurality of data RAMs and a plurality of tag RAMs to store data and data tags, respectively. Also included in the system are tag store comparators to compare the tag data contained in a specific tag RAM location with a second index comprising a predetermined second portion of a main memory address.
Type: Grant
Filed: October 5, 1992
Date of Patent: August 10, 1993
Assignee: Digital Equipment
Inventors: Simon C. Steely, Jr., John H. Zurawski
-
Patent number: 5214770
Abstract: A method and apparatus for optimizing the performance of a multiple cache system computer having separate caches for data and instructions in which all writes to the data cache are monitored. If the address tag of the item being written matches one of a list of tags representing valid instructions currently stored in the instruction cache, a flag called I_FLUSH_ON_REI is set. Until this flag is set, REI (Return from Exception or Interrupt) instructions will not flush the instruction cache. When the flag is set, an REI command will also flush or clear the instruction cache. Thus, the instruction cache is only flushed when an address referenced by an instruction is modified, so as to reduce the number of times the cache is flushed and optimize the computer's speed of operation.
Type: Grant
Filed: June 21, 1990
Date of Patent: May 25, 1993
Assignee: Digital Equipment Corporation
Inventors: Raj K. Ramanujan, Peter J. Bannon, Simon C. Steely, Jr.
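The flag-gated flush described in this abstract can be modelled with a short Python sketch. The class, its method names, and the flush counter are hypothetical, and clearing the flag after a flush is an assumption the abstract does not state.

```python
class SplitCacheModel:
    """Hypothetical model of the I_FLUSH_ON_REI mechanism."""
    def __init__(self):
        self.icache_tags = set()     # tags of valid instructions in the I-cache
        self.i_flush_on_rei = False  # the flag named in the abstract
        self.flush_count = 0         # bookkeeping for this example only

    def fetch_instruction(self, tag):
        # An instruction fetch fills the I-cache with the block for this tag.
        self.icache_tags.add(tag)

    def write_data(self, tag):
        # Every data-cache write is checked against the instruction tag list.
        if tag in self.icache_tags:
            self.i_flush_on_rei = True

    def rei(self):
        # REI flushes the instruction cache only when the flag is set.
        if self.i_flush_on_rei:
            self.icache_tags.clear()
            self.i_flush_on_rei = False  # assumption: flag resets after a flush
            self.flush_count += 1


model = SplitCacheModel()
model.fetch_instruction(0x10)
model.write_data(0x20)   # unrelated data write, flag stays clear
model.rei()              # no flush needed
model.write_data(0x10)   # write hits an address held in the I-cache
model.rei()              # this REI flushes the instruction cache
```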
-
Patent number: 5197132
Abstract: A register map having a free list of available physical locations in a register file, a log containing a sequential listing of logical registers changed during a predetermined number of cycles, a back-up map associating the logical registers with corresponding physical homes at a back-up point in a computer pipeline operation and a predicted map associating the logical registers with corresponding physical homes at a current point in the computer pipeline operation. A set of valid bits is associated with the maps to indicate whether a particular logical register is to be taken from the back-up map or the predicted map indication of a corresponding physical home. The valid bits can be "flash cleared" in a single cycle to back up the computer pipeline to the back-up point during a trap event.
Type: Grant
Filed: June 29, 1990
Date of Patent: March 23, 1993
Assignee: Digital Equipment Corporation
Inventors: Simon C. Steely, Jr., David J. Sager
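A minimal Python sketch of the two-map arrangement with valid bits is given below. The class and method names are hypothetical, the initial identity mapping is an assumption, and recycling physical registers to the free list and advancing the back-up point are deliberately omitted.

```python
class RegisterMap:
    """Hypothetical model of a back-up map, predicted map, and valid bits."""
    def __init__(self, num_logical: int, num_physical: int):
        # Assumed starting state: logical register i lives in physical home i.
        self.backup = list(range(num_logical))
        self.predicted = [None] * num_logical
        # valid[i] is True when the predicted map holds the current home.
        self.valid = [False] * num_logical
        self.free_list = list(range(num_logical, num_physical))
        self.log = []  # logical registers changed since the back-up point

    def lookup(self, logical: int) -> int:
        # The valid bit selects which map supplies the physical home.
        if self.valid[logical]:
            return self.predicted[logical]
        return self.backup[logical]

    def rename(self, logical: int) -> int:
        # Give the logical register a new physical home from the free list.
        physical = self.free_list.pop(0)
        self.predicted[logical] = physical
        self.valid[logical] = True
        self.log.append(logical)
        return physical

    def trap(self) -> None:
        # "Flash clear": dropping every valid bit restores the back-up mapping.
        self.valid = [False] * len(self.valid)
        self.log.clear()


regs = RegisterMap(num_logical=4, num_physical=8)
new_home = regs.rename(2)        # speculative mapping for logical register 2
speculative = regs.lookup(2)     # served from the predicted map
regs.trap()                      # back up to the back-up point
restored = regs.lookup(2)        # served from the back-up map again
```

Recovery here costs one step regardless of how many registers were renamed, which mirrors the single-cycle flash clear the abstract describes.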