Patents by Inventor Simon C. Steely, Jr.

Simon C. Steely, Jr. has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7240165
    Abstract: A multi-processor system includes a requesting node that provides a first request for data to a home node. The requesting node being operative to provide a second request for the data to at least one predicted node in parallel with first request. The requesting node receives at least one coherent copy of the data from at least one of the home node and the at least one predicted node.
    Type: Grant
    Filed: January 15, 2004
    Date of Patent: July 3, 2007
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Gregory Edward Tierney, Simon C. Steely, Jr.
  • Patent number: 7237067
    Abstract: Methods for storing replacement data in a multi-way associative cache are disclosed. One method comprises logically dividing the cache's cache sets into segments of at least one cache way; searching a cache set in accordance with a segment search sequence for a segment currently comprising a way which has not yet been accessed during a current cycle of the segment search sequence; searching the current segment in accordance with a way search sequence for a way which has not yet been accessed during a current way search cycle; and storing the replacement data in a first way which has not yet been accessed during a current cycle of the way search sequence. A cache controller that performs such methods is also disclosed.
    Type: Grant
    Filed: April 22, 2004
    Date of Patent: June 26, 2007
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Simon C. Steely, Jr.
  • Patent number: 7177987
    Abstract: Systems and method are disclosed for providing responses for different cache coherency protocols. One embodiment may comprise a system that includes a first node employing a first cache coherency protocol. A detector associated with the first node detects a condition based on responses provided by the first node to requests provided according to a second cache coherency protocol, the second cache coherency protocol being different from the first cache coherency protocol. The first node provides a response to a given one of the requests to the first node that varies based on the condition detected by the detector.
    Type: Grant
    Filed: January 20, 2004
    Date of Patent: February 13, 2007
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Stephen R. Van Doren, Gregory Edward Tierney, Simon C. Steely, Jr.
  • Patent number: 7149852
    Abstract: Systems and methods are disclosed for blocking data responses. One system includes a target node that, in response to a source broadcast request for requested data, provides a response that includes a copy of the requested data. The target node also provides a blocking message to a home node associated with the requested data. The blocking message being operative cause the home node to provide a non-data response to the source broadcast request if the blocking message is matched with the source broadcast request at the home node.
    Type: Grant
    Filed: January 20, 2004
    Date of Patent: December 12, 2006
    Assignee: Hewlett Packard Development Company, LP.
    Inventors: Stephen R. Van Doren, Gregory Edward Tierney, Simon C. Steely, Jr.
  • Patent number: 7143245
    Abstract: A system comprises a first node including data having an associated D-state and a second node operative to provide a source broadcast requesting the data. The first node is operative in response to the source broadcast to provide the data to the second node and transition the state associated with the data at the first node from the D-state to an O-state without concurrently updating memory. An S-state is associated with the data at the second node.
    Type: Grant
    Filed: January 20, 2004
    Date of Patent: November 28, 2006
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Gregory Edward Tierney, Stephen R. Van Doren, Simon C. Steely, Jr.
  • Patent number: 6961825
    Abstract: A distributed processing system includes a cache coherency mechanism that essentially encodes network routing information into sectored presence bits. The mechanism organizes the sectored presence bits as one or more arbitration masks that system switches decode and use directly to route invalidate messages through one or more higher levels of the system. The lower level or levels of the system use local routing mechanisms, such as local directories, to direct the invalidate messages to the individual processors that are holding the data of interest.
    Type: Grant
    Filed: January 24, 2001
    Date of Patent: November 1, 2005
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Simon C. Steely, Jr., Stephen Van Doren, Madhumitra Sharma
  • Patent number: 6904465
    Abstract: A multiple-processor system in which a commit message is returned to a source processor that requests a memory access operation so as to indicate the apparent completion of the operation includes a multiple-level switch unit linking nodes that contain the processors. The switch unit includes multiple input switches each of which receives messages from multiple nodes, and a set of output switches whose inputs are the outputs of the input switches and whose outputs are the inputs of the nodes. Each switch processes messages in the order in which they are received by the switch and each output switch follows the same rule as the other output switches.
    Type: Grant
    Filed: April 26, 2001
    Date of Patent: June 7, 2005
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Simon C. Steely, Jr., Madhumitra Sharma, Stephen R. Van Doren
  • Patent number: 6801986
    Abstract: A method, for executing a load locked and a store conditional instruction in a processor, achieves an atomic read-write operation to a memory block. First the load locked instruction is executed to read a memory block, and the processor in response to executing the load locked instruction issues a read modify system command to read the block and to take ownership of the block by the processor, and also sets a lock flag for the address of the memory block, and writes a value of the memory block into a cache of the processor as a cache copy of the memory block. The lock flag, upon receipt of an invalidate message by the processor for the cache copy of the memory block, is reset if any invalidate messages for the memory block are received by the processor. The processor waits for a selected time interval before the processor surrenders ownership of the memory block upon receipt of an ownership request message, if any is received by the processor after execution of the load locked instruction.
    Type: Grant
    Filed: August 20, 2001
    Date of Patent: October 5, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Simon C. Steely, Jr., Stephen R. Van Doren, Madhumitra Sharma
  • Patent number: 6769057
    Abstract: A “data verified”, or DV, bit is included in an instruction to indicate if the instruction or a dependent instruction may be associated with the retrieved data as soon as the data is available or should instead be associated with the data after verification. If the DV bit is in a first state, e.g., not set, the system may issue instructions that use associated data as soon as the data is available. If the DV bit is in a second state, e.g., set, the system does not issue the instructions that use the data until the data is verified. The system or user sets the DV bit based on an analysis of an instruction set that includes the instruction and/or accumulated profile data from previous use or uses of the software. The DV bit is set in a LOAD instruction if the dependent user instruction is close enough in the instruction set that the user instruction is likely to issue before the data is verified and/or if the LOAD instruction is part of a relatively long chain of instructions.
    Type: Grant
    Filed: January 22, 2001
    Date of Patent: July 27, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Simon C. Steely, Jr.
  • Patent number: 6647466
    Abstract: A system for adaptively bypassing one or more higher cache levels following a miss in a lower level of a cache hierarchy is described. Each cache level preferably includes a tag store containing address and state information for each cache line resident in the respective cache. When an invalidate request is received at a given cache hierarchy, each cache level is searched for the address specified by the invalidate request. When an address match is detected, the state of the respective cache line is changed to the invalid state, although the address of the cache line is left in the tag store. Thereafter, if the processor or entity associated with this cache hierarchy issues its own request for this same cache line, the cache hierarchy begins searching the tag store of each level starting with the lowest cache level.
    Type: Grant
    Filed: January 25, 2001
    Date of Patent: November 11, 2003
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Simon C. Steely, Jr.
  • Patent number: 6636948
    Abstract: A performance enhancing change-to-dirty operation (CTD) is disclosed wherein contention among several processors trying to gain ownership of a block of data is obviated by arranging the CTD to always succeed. A method and a system are disclosed where a processor in a multiprocessor system having a copy of data gains assured ownership of data that the processor may then write. The method provides for the possibilities of conditions that may exist and provides a scenario that the requesting processor may have to wait for the ownership. Conditions are handled where the memory is the “owner” of the data and where other processor are requesting ownership, and where copies of the data exist at other processors. The method provides for messages to other processor having copies of the data informing them that the data is now invalid.
    Type: Grant
    Filed: April 13, 2001
    Date of Patent: October 21, 2003
    Assignee: Hewlett-Packard Development Company, LP.
    Inventors: Simon C. Steely, Jr., Stephen R. Van Doren, Madhu Sharna
  • Patent number: 6493801
    Abstract: An adaptive cache coherent purging protocol includes recognizing system performance, especially latency, is affected by when cache is purged. The occurrence of performance enhancing and degrading events regarding a cache are counted and compared to a threshold. When the threshold is triggered the cache becomes a candidate for purging. In an embodiment, a time out delay is implemented before actual purging occurs. When the threshold is not triggered but a cache event occurs, a fake time out delay is triggered and the count is adaptively either raised, lowered or set to zero in response to performance enhancing and/or degrading events. The effect is to make the actual purging more likely if the history of cache events indicates that the performance would be enhanced thereby or less likely if the history indicates that the performance would be degraded thereby.
    Type: Grant
    Filed: January 26, 2001
    Date of Patent: December 10, 2002
    Assignee: Compaq Computer Corporation
    Inventors: Simon C. Steely, Jr., Nikolaos Hardavellas
  • Patent number: 6295585
    Abstract: A multi-node computer network includes a plurality of nodes coupled together via a data link. Each of the nodes includes a local memory, which further comprises a shared memory. Certain items of data that are to be shared by the nodes are stored in the shared portion of memory. Associated with each of the shared data items is a data structure. When a node sharing data with other nodes in the system seeks to modify the data, it transmits the modifications over the data link to the other nodes in the network. Each update is received in order by each node in the cluster. As part of the last transmission by the modifying node, an acknowledgement request is sent to the receiving nodes in the cluster. Each node that receives the acknowledgment request returns an acknowledgement to the sending node. The returned acknowledgement is written to the data structure associated with the shared data item.
    Type: Grant
    Filed: June 7, 1995
    Date of Patent: September 25, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Richard B. Gillett, Jr., Glenn P. Garvey, Simon C. Steely, Jr.
  • Patent number: 6286090
    Abstract: A technique selectively imposes inter-reference ordering between memory reference operations issued by a processor of a multiprocessor system to addresses within a page pertaining to a page table entry (PTE) that is affected by a translation buffer (TB) miss flow routine. The TB miss flow is used to retrieve information contained in the PTE for mapping a virtual address to a physical address and, subsequently, to allow retrieval of data at the mapped physical address. The PTE that is retrieved in response to a memory reference (read) operation is not loaded into the TB until a commit-signal associated with that read operation is returned to the processor. Once the PTE and associated commit-signal are returned, the processor loads the PTE into the TB so that it can be used for a subsequent read operation directed to the data at the physical address.
    Type: Grant
    Filed: May 26, 1998
    Date of Patent: September 4, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Simon C. Steely, Jr., Madhumitra Sharma, Stephen R. Van Doren, Kourosh Gharachorloo
  • Patent number: 6249520
    Abstract: An architecture and coherency protocol for use in a large SMP computer system includes a hierarchical switch structure which allows for a number of multi-processor nodes to be coupled to the switch to operate at an optimum performance. Within each multi-processor node, a simultaneous buffering system is provided that allows all of the processors of the multi-processor node to operate at peak performance. A memory is shared among the nodes, with a portion of the memory resident at each of the multi-processor nodes. Each of the multi-processor nodes includes a number of elements for maintaining memory coherency, including a victim cache, a directory and a transaction tracking table. The victim cache allows for selective updates of victim data destined for memory stored at a remote multi-processing node, thereby improving the overall performance of memory.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: June 19, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Simon C. Steely, Jr., Stephen R. VanDoren, Madhumitra Sharma, Craig D. Keefer, David W. Davis
  • Patent number: 6209065
    Abstract: A mechanism optimizes the generation of a commit-signal by control logic of the multiprocessor system in response to a memory reference operation issued by a processor to a local node of a multiprocessor system having a hierarchical switch for interconnecting a plurality of nodes. The mechanism generally comprises a structure that indicates whether the memory reference operation affects other processors of other nodes of the multiprocessor system. An ordering point of the local node generates an optimized commit-signal when the structure indicates that the memory reference operation does not affect the other processors.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: March 27, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Stephen R. Van Doren, Simon C. Steely, Jr., Kourosh Gharachorloo, Madhumitra Sharma
  • Patent number: 6202126
    Abstract: A method for preventing inadvertent invalidation of data elements in a system having a separate probe queue and fill queue for each central processing unit, is provided wherein a central processing unit stores a clean data element, that would otherwise have been discarded, in a victim data buffer when it is evicted from cache. The central processing unit subsequently issues a clean-victim command to the system control logic when the readmiss or read-miss-modify command, targeting the data element that maps to the same location in cache as the clean data element, is issued. The clean-victim command causes the duplicate tag store to indicate that the clean data element is no longer stored in that central processing unit's cache. While the data is stored therein, the central processing unit cannot issue a probe message that targets that data until the victim data buffer has been deallocated.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: March 13, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Stephen Van Doren, Simon C. Steely, Jr., Madhumitra Sharma
  • Patent number: 6108737
    Abstract: A mechanism reduces the latency of inter-reference ordering between sets of memory reference operations in a multiprocessor system having a shared memory. The mechanism comprises a commit-signal that is generated by control logic of the multiprocessor system in response to an issued memory reference operation. The commit-signal facilitates inter-reference ordering; moreover, the commit signal indicates the apparent completion of the memory reference operation, rather than actual completion of the operation. The apparent completion of an operation occurs substantially sooner than the actual completion of an operation, thereby improving performance of the multiprocessor system.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: August 22, 2000
    Assignee: Compaq Computer Corporation
    Inventors: Madhumitra Sharma, Stephen R. Van Doren, Kourosh Gharachorloo, Simon C. Steely, Jr.
  • Patent number: 6105108
    Abstract: A multiprocessor computer system releases a victim data buffer storing victim data, when system control logic determines that a count of the number of probe messages pending at a specified time equals the number of such probe messages that have had an address comparison performed after the specified time. The specified time occurs when a command to write the victim data element to main memory passes a serialization point of the computer system.The address comparison compares a target address of a probe message with addresses of data stored in the victim data buffer and the associated cache of a CPU of the computer system.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: August 15, 2000
    Assignee: Compaq Computer Corporation
    Inventors: Simon C. Steely, Jr., Stephen Van Doren
  • Patent number: 6101581
    Abstract: In accordance with the present invention, a method and apparatus is provided for maintaining the coherency of victim data from a time when the data is stored in a victim data buffer until a time when the data is written into a main memory. Alternatively, the coherency of the victim data is preserved until a determination is made that pending probe messages do not target the victim data. At that time the victim data buffer can be deallocated.With both arrangements, a central processing unit can release a victim data buffer at a point in time other than when the data that is stored therein is read from the buffer. Thus, the central processor unit can perform the release or deallocation of the buffer when it is most efficient and when no further access to the data is required.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: August 8, 2000
    Inventors: Stephen Van Doren, Simon C. Steely, Jr., Robert Eugene Stewart, James Bernard Keller