Patents by Inventor Steven Dodson
Steven Dodson has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20030009640
Abstract: A non-uniform memory access (NUMA) data processing system includes a plurality of nodes coupled to a node interconnect. The plurality of nodes contain a plurality of processing units and at least one system memory having a table (e.g., a page table) resident therein. The table includes at least one entry for translating a group of non-physical addresses to physical addresses that individually specifies control information pertaining to the group of non-physical addresses for each of the plurality of nodes. The control information may include one or more data storage control fields, which may include a plurality of write through indicators that are each associated with a respective one of the plurality of nodes. When a write through indicator is set, processing units in the associated node write modified data back to system memory in a home node rather than caching the data.
Type: Application
Filed: June 21, 2001
Publication date: January 9, 2003
Applicant: International Business Machines Corp.
Inventors: Ravi Kumar Arimilli, John Steven Dodson, James Stephen Fields
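As a rough sketch only (not text from the application), the per-node control information could be pictured as a page-table entry carrying one write-through indicator per node; every name and size below is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 8   /* assumed node count for this sketch */

/* Hypothetical page-table entry: one translation plus per-node control bits. */
struct pte {
    uint64_t virtual_page;              /* non-physical (e.g., virtual) page number */
    uint64_t physical_page;             /* translated physical page number */
    uint8_t  write_through[MAX_NODES];  /* 1 = node writes modified data back to home memory */
};

/* A processing unit in node `node_id` consults its node's indicator on a store. */
static bool must_write_through(const struct pte *entry, unsigned node_id)
{
    return node_id < MAX_NODES && entry->write_through[node_id] != 0;
}
```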
-
Publication number: 20030009643
Abstract: A non-uniform memory access (NUMA) computer system includes a remote node coupled by a node interconnect to a home node having a home system memory. The remote node includes a local interconnect, a processing unit and at least one cache coupled to the local interconnect, and a node controller coupled between the local interconnect and the node interconnect. The processing unit first issues, on the local interconnect, a read-type request targeting data resident in the home system memory with a flag in the read-type request set to a first state to indicate only local servicing of the read-type request. In response to inability to service the read-type request locally in the remote node, the processing unit reissues the read-type request with the flag set to a second state to instruct the node controller to transmit the read-type request to the home node.
Type: Application
Filed: June 21, 2001
Publication date: January 9, 2003
Applicant: International Business Machines Corp.
Inventors: Ravi Kumar Arimilli, John Steven Dodson, James Stephen Fields
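A hedged illustration of the two-phase request flow described above; the names and trivially stubbed helpers (standing in for the local interconnect and node controller) are invented for the sketch, not taken from the application.

```c
#include <stdbool.h>
#include <stdint.h>

enum scope { SCOPE_LOCAL_ONLY, SCOPE_GLOBAL };

struct read_request {
    uint64_t   address;  /* data whose home is another node */
    enum scope scope;    /* first state: service locally; second state: go to the home node */
};

/* Placeholder for the local interconnect: would return true if a local cache
 * could supply the line. Always misses in this sketch. */
static bool serviced_locally(const struct read_request *req)
{
    (void)req;
    return false;
}

/* Placeholder for the node controller forwarding the request across the node interconnect. */
static void forward_to_home_node(const struct read_request *req)
{
    (void)req;
}

static void read_remote_line(uint64_t address)
{
    struct read_request req = { address, SCOPE_LOCAL_ONLY };
    if (!serviced_locally(&req)) {
        req.scope = SCOPE_GLOBAL;        /* reissue with the flag in its second state */
        forward_to_home_node(&req);
    }
}
```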
-
Publication number: 20030009632
Abstract: A computer system includes a processing unit, a system memory, and a memory controller coupled to the processing unit and the system memory. According to the present invention, the memory controller accesses the system memory to obtain prefetch data and transmits the prefetch data to the processing unit in a prefetch write operation specifying the processing unit in a destination field. In one embodiment, the memory controller transmits the prefetch write operation in response to receipt of a prefetch hint from the processing unit, which may accompany a read-type request by the processing unit. This prefetch methodology may advantageously be implemented imprecisely, with the memory controller responding to the prefetch hint only if a prefetch queue is available and ignoring the prefetch hint otherwise. The processing unit may similarly ignore the prefetch write operation if no snoop queue is available.
Type: Application
Filed: June 21, 2001
Publication date: January 9, 2003
Applicant: International Business Machines Corp.
Inventors: Ravi Kumar Arimilli, John Steven Dodson, James Stephen Fields
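A minimal sketch of the "imprecise" hint handling on the memory-controller side, assuming an invented fixed pool of prefetch queues; none of the names below come from the application.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PREFETCH_QUEUES 4   /* assumed queue count for this sketch */

struct prefetch_queue { bool busy; uint64_t address; };

static struct prefetch_queue queues[PREFETCH_QUEUES];

/* Honor the hint only if a prefetch queue is free; otherwise silently drop it.
 * Dropping is safe because the hint is an optimization, not a correctness requirement. */
static bool accept_prefetch_hint(uint64_t address)
{
    for (size_t i = 0; i < PREFETCH_QUEUES; i++) {
        if (!queues[i].busy) {
            queues[i].busy    = true;
            queues[i].address = address;  /* data will later be pushed to the requester
                                             in a prefetch write naming it in the destination field */
            return true;
        }
    }
    return false;   /* no queue available: hint ignored */
}
```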
-
Publication number: 20030009623
Abstract: A non-uniform memory access (NUMA) computer system and associated method of operation are disclosed. The NUMA computer system includes at least a remote node and a home node coupled to an interconnect. The remote node contains at least one processing unit coupled to a remote system memory, and the home node contains at least a home system memory. To reduce access latency for data from other nodes, a portion of the remote system memory is allocated as a remote memory cache containing data corresponding to data resident in the home system memory. In one embodiment, access bandwidth to the remote memory cache is increased by distributing the remote memory cache across multiple system memories in the remote node.
Type: Application
Filed: June 21, 2001
Publication date: January 9, 2003
Applicant: International Business Machines Corp.
Inventors: Ravi Kumar Arimilli, John Steven Dodson, James Stephen Fields
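One simple way to picture the bandwidth-spreading idea (purely illustrative, with made-up constants) is to interleave home-node line addresses across the remote node's system memories:

```c
#include <stdint.h>

#define REMOTE_MEMORIES 4        /* system memories in the remote node (assumed) */
#define LINE_BYTES      128ULL   /* cache-line size chosen for this sketch */

/* Distribute the remote memory cache across the node's system memories by
 * interleaving home-node line addresses, so lookups can use all memory ports. */
static unsigned remote_cache_memory(uint64_t home_address)
{
    return (unsigned)((home_address / LINE_BYTES) % REMOTE_MEMORIES);
}
```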
-
Publication number: 20030009639
Abstract: A non-uniform memory access (NUMA) computer system includes a remote node coupled by a node interconnect to a home node including a home system memory. The remote node includes a plurality of snoopers coupled to a local interconnect. The plurality of snoopers includes a cache that caches a cache line corresponding to but modified with respect to data resident in the home system memory. The cache has a cache controller that issues a deallocate operation on the local interconnect in response to deallocating the modified cache line. The remote node further includes a node controller, coupled between the local interconnect and the node interconnect, that transmits the deallocate operation to the home node with an indication of whether or not a copy of the cache line remains in the remote node following the deallocation. In this manner, the local memory directory associated with the home system memory can be updated to precisely reflect which nodes hold a copy of the cache line.
Type: Application
Filed: June 21, 2001
Publication date: January 9, 2003
Applicant: International Business Machines Corp.
Inventors: Ravi Kumar Arimilli, John Steven Dodson, James Stephen Fields
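A compact sketch of the notification and the home-directory update, assuming an invented per-node presence-bit directory; the structures are illustrative, not the patented format.

```c
#include <stdbool.h>
#include <stdint.h>

/* Message the node controller forwards to the home node when a modified
 * line is deallocated in the remote node (names are hypothetical). */
struct deallocate_notice {
    uint64_t line_address;   /* home-node address of the deallocated line */
    bool     copy_remains;   /* true if some cache in the remote node still holds it */
};

/* Home-node memory directory entry: one presence bit per node (assumed layout). */
struct mem_dir_entry {
    uint32_t presence_bits;
};

static void update_home_directory(struct mem_dir_entry *dir, unsigned remote_node,
                                  const struct deallocate_notice *msg)
{
    if (!msg->copy_remains)
        dir->presence_bits &= ~(1u << remote_node);   /* node no longer holds the line */
}
```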
-
Publication number: 20030009641
Abstract: A non-uniform memory access (NUMA) computer system includes at least one remote node and a home node coupled by a node interconnect. The home node contains a home system memory and a memory controller. In response to receipt of a data request from a remote node, the memory controller determines whether to grant exclusive or non-exclusive ownership of requested data specified in the data request by reference to history information indicative of prior data accesses originating in the remote node. The memory controller then transmits the requested data and an indication of exclusive or non-exclusive ownership to the remote node.
Type: Application
Filed: June 21, 2001
Publication date: January 9, 2003
Applicant: International Business Machines Corp.
Inventors: Ravi Kumar Arimilli, John Steven Dodson, James Stephen Fields
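One hypothetical form such history information could take is a per-node saturating counter of reads that were later followed by writes; the counter, threshold, and names below are assumptions made for this sketch only.

```c
#include <stdbool.h>
#include <stdint.h>

#define NODES 8   /* assumed node count */

/* Per-node access history kept by the home memory controller (illustrative). */
struct node_history { uint8_t write_after_read; };

static struct node_history history[NODES];

/* Home node records that `node` modified data it had earlier requested. */
static void note_write_after_read(unsigned node)
{
    if (history[node].write_after_read < 255)
        history[node].write_after_read++;
}

/* Grant exclusive ownership when the requester's history suggests it will
 * modify the data; otherwise grant non-exclusive (shared) ownership. */
static bool grant_exclusive(unsigned requesting_node)
{
    return history[requesting_node].write_after_read >= 2;   /* threshold is arbitrary */
}
```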
-
Publication number: 20030009637
Abstract: A non-uniform memory access (NUMA) computer system includes a first node and a second node coupled by a node interconnect. The second node includes a local interconnect, a node controller coupled between the local interconnect and the node interconnect, and a controller coupled to the local interconnect. In response to snooping an operation from the first node issued on the local interconnect by the node controller, the controller signals acceptance of responsibility for coherency management activities related to the operation in the second node, performs coherency management activities in the second node required by the operation, and thereafter provides notification of performance of the coherency management activities.
Type: Application
Filed: June 21, 2001
Publication date: January 9, 2003
Applicant: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, John Steven Dodson, James Stephen Fields
-
Patent number: 6505277
Abstract: A method for ordering the time of issue of a load instruction from a lower level (L2) intervening cache, interlinked by a system bus to a first L2 cache. The method comprises the steps of (i) appending a cycle of dependency (CoD) value to said load instruction, where the CoD value corresponds to a specified time, measured in cycles, on a synchronized timer (ST) at which the data is required by a downstream dependency, (ii) monitoring when a search of the first cache by said load instruction results in a miss, (iii) searching the load instruction at the second cache when the miss is detected, and (iv) providing the data requested by the load instruction from the intervening cache to a pipeline of a system resource of said first L2 cache at the specified time.
Type: Grant
Filed: June 25, 1999
Date of Patent: January 7, 2003
Assignee: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, Lakshminarayana Baba Arimilli, John Steven Dodson, Jerry Don Lewis
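As a sketch under assumed names (not the patented logic), the cycle-of-dependency value can be thought of as a deadline on a timer shared by both caches, which the intervening cache compares against before delivering the data:

```c
#include <stdbool.h>
#include <stdint.h>

/* Load request carrying a cycle-of-dependency (CoD) value: the synchronized-timer
 * cycle at which the downstream dependency actually needs the data (illustrative). */
struct timed_load {
    uint64_t address;
    uint64_t cod_cycle;   /* deliver the data by this ST cycle */
};

/* Intervening cache side: decide whether the current synchronized-timer value
 * means the data must be forwarded now to meet the requester's deadline. */
static bool deliver_now(const struct timed_load *req, uint64_t current_st_cycle)
{
    return current_st_cycle >= req->cod_cycle;
}
```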
-
Publication number: 20030005236
Abstract: A method, system, and processor cache configuration that enables efficient retrieval of valid data in response to an invalidate cache miss at a local processor cache. A cache directory is enhanced by appending a set of directional bits in addition to the coherency state bits and the address tag. The directional bits provide information that includes the processor cache identification (ID) and routing method. The processor cache ID indicates which processor operation resulted in the cache line of the local processor changing to the invalidate (I) coherency state. The processor operation may be issued by a local processor or by a processor from another group or node of processors if the multiprocessor system comprises multiple nodes of processors. The routing method indicates what transmission method to utilize to forward a request for the cache line. The request may be forwarded to a local system bus or directly to another processor group via a switch or broadcast mechanism.
Type: Application
Filed: June 29, 2001
Publication date: January 2, 2003
Applicant: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, John Steven Dodson, Guy Lynn Guthrie, Jerry Don Lewis
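A minimal way to visualize the enhanced directory entry (field names and widths are assumptions for this sketch, not the patented layout):

```c
#include <stdint.h>

enum coherency_state { STATE_M, STATE_E, STATE_S, STATE_I };
enum routing_method  { ROUTE_LOCAL_BUS, ROUTE_SWITCH, ROUTE_BROADCAST };

/* Cache-directory entry extended with "directional" information recorded when
 * the line went invalid: which processor cache's operation caused the
 * invalidation, and how a later request for the line should be routed. */
struct dir_entry_with_direction {
    uint32_t             tag;             /* address tag */
    enum coherency_state state;           /* coherency state bits */
    uint8_t              invalidator_id;  /* processor cache ID that invalidated the line */
    enum routing_method  route;           /* where to send the follow-up request */
};
```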
-
Publication number: 20030005232
Abstract: A cache controller for a processor in a remote node of a system bus in a multiway multiprocessor link sends out a cache deallocate address transaction (CDAT) for a given cache line when that cache line is flushed and information from memory in a home node is no longer deemed valid for that cache line of that remote node processor. A local snoop of that CDAT transaction is then performed as a background function by other processors in the same remote node. If the snoop results indicate that same information is valid in another cache, and that cache decides it better to keep it valid in that remote node, then the information remains there. If the snoop results indicate that the information is not valid among caches in that remote node, or will be flushed due to the CDAT, the system memory directory in the home node of the multiprocessor link is notified and changes state in response to this.
Type: Application
Filed: June 29, 2001
Publication date: January 2, 2003
Applicant: International Business Machines Corp.
Inventors: Guy Lynn Guthrie, Ravi Kumar Arimilli, James Stephen Fields, Jr., John Steven Dodson
-
Patent number: 6502171
Abstract: In cancelling the cast out portion of a combined operation including a data access related to the cast out, the combined response logic explicitly directs a horizontal storage device at the same level as the storage device initiating the combined operation to allocate and store either the cast out or target data. A horizontal storage device having available space (i.e., an invalid or modified data element in a congruence class for the victim) stores either the target or the cast out data for subsequent access by an intervention. Cancellation of the cast out thus defers any latency associated with writing the cast out victim to system memory while maximizing utilization of available storage with acceptable tradeoffs in data access latency.
Type: Grant
Filed: August 4, 1999
Date of Patent: December 31, 2002
Assignee: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, John Steven Dodson, Guy Lynn Guthrie, Jody B. Joyner, Jerry Don Lewis
-
Patent number: 6502168
Abstract: According to the present invention, a data processing system includes a cache having a cache directory. A status indication indicative of the status of at least one of a plurality of data entries in the cache is stored in the cache directory. In response to receipt of a cache operation request, a determination is made whether to update the status indication. In response to the determination that the status indication is to be updated, the status indication is copied into a shadow register and updated. The status indication is then written back into the cache directory at a later time. The shadow register thus serves as a virtual cache controller queue that dynamically mimics a cache directory entry without functional latency.
Type: Grant
Filed: September 23, 1999
Date of Patent: December 31, 2002
Assignee: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, John Steven Dodson, Jerry Don Lewis
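A hedged sketch of the copy-update-then-write-back flow, treating the directory as a plain array of status words; everything here is an assumption made for illustration, not the patented design.

```c
#include <stdint.h>

/* Shadow register mimicking one directory entry while its update is pending. */
struct shadow_register {
    uint32_t index;    /* which directory entry is being mimicked */
    uint32_t status;   /* working copy of that entry's status indication */
    int      dirty;    /* 1 = must still be written back to the directory */
};

/* On a cache operation that changes status: snapshot the entry into the shadow
 * register, apply the update there, and defer the directory write. */
static void update_via_shadow(const uint32_t *directory, uint32_t index,
                              uint32_t set_bits, struct shadow_register *shadow)
{
    shadow->index  = index;
    shadow->status = directory[index] | set_bits;  /* updated copy lives in the shadow */
    shadow->dirty  = 1;
}

/* Later, when the directory port is free, the shadow contents are written back. */
static void flush_shadow(uint32_t *directory, struct shadow_register *shadow)
{
    if (shadow->dirty) {
        directory[shadow->index] = shadow->status;
        shadow->dirty = 0;
    }
}
```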
-
Patent number: 6496921
Abstract: A method of operating a processing unit of a computer system, by issuing an instruction having an explicit prefetch request directly from an instruction sequence unit to a prefetch unit of the processing unit. The invention applies to values that are either operand data or instructions. In a preferred embodiment, two prefetch units are used, the first prefetch unit being hardware independent and dynamically monitoring one or more active streams associated with operations carried out by a core of the processing unit, and the second prefetch unit being aware of the lower level storage subsystem and sending with the prefetch request an indication that a prefetch value is to be loaded into a lower level cache of the processing unit. The invention may advantageously associate each prefetch request with a stream ID of an associated processor stream, or a processor ID of the requesting processing unit (the latter feature is particularly useful for caches which are shared by a processing unit cluster).
Type: Grant
Filed: June 30, 1999
Date of Patent: December 17, 2002
Assignee: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, Lakshminarayana Baba Arimilli, Leo James Clark, John Steven Dodson, Guy Lynn Guthrie, James Stephen Fields, Jr.
-
Patent number: 6487637
Abstract: A method of operating a multi-level memory hierarchy of a computer system and apparatus embodying the method, wherein instructions issue having an explicit prefetch request directly from an instruction sequence unit to a prefetch unit of the processing unit. The invention applies to values that are either operand data or instructions. These prefetch requests can be demand load requests, where the processing unit will need the operand data or instructions, or speculative load requests, where the processing unit may or may not need the operand data or instructions, but a branch prediction or stream association predicts that they might be needed. Further branch predictions or stream associations that were made based on an earlier speculative choice are linked by using a tag pool which assigns bit fields in the tag pool entries to the level of speculation depth. Each entry shares in common the bit field values associated with earlier branches or stream associations.
Type: Grant
Filed: July 30, 1999
Date of Patent: November 26, 2002
Assignee: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, Leo James Clark, John Steven Dodson, Guy Lynn Guthrie, William John Starke
-
Patent number: 6484241
Abstract: A method of maintaining coherency in a multiprocessor computer system wherein each processing unit's cache has sectored cache lines. A first cache coherency state is assigned to one of the sectors of a particular cache line, and a second cache coherency state, different from the first cache coherency state, is assigned to the overall cache line while maintaining the first cache coherency state for the first sector. The first cache coherency state may provide an indication that the first sector contains a valid value which is not shared with any other cache (i.e., an exclusive or modified state), and the second cache coherency state may provide an indication that at least one of the sectors in the cache line contains a valid value which is shared with at least one other cache (a shared, recently-read, or tagged state). Other coherency states may be applied to other sectors in the same cache line.
Type: Grant
Filed: December 28, 2000
Date of Patent: November 19, 2002
Assignee: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, John Steven Dodson, Guy Lynn Guthrie
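A minimal sketch of what a sectored line carrying both a per-sector state and an overall line state might look like; the sector count and state set are assumptions for illustration only.

```c
#define SECTORS_PER_LINE 2   /* assumed sector count for this sketch */

/* A subset of the coherency states mentioned above. */
enum coherency_state { STATE_I, STATE_S, STATE_E, STATE_M, STATE_T /* tagged */ };

/* Sectored cache line: each sector has its own state, and the line as a whole
 * carries a second, possibly different, state (illustrative layout). */
struct sectored_line {
    unsigned             tag;
    enum coherency_state line_state;                      /* e.g., shared: some sector is shared */
    enum coherency_state sector_state[SECTORS_PER_LINE];  /* e.g., sector 0 exclusive or modified */
};
```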
-
Patent number: 6480915
Abstract: Serialization of global operations within a multiprocessor system is achieved utilizing a single token, requiring a bus master to acquire the token for completion of each individual global operation initiated by that bus master. A combined token and operation request, in which a token request and an operation request are transmitted in a single bus transaction, is employed once for a global operation, to initiate the global operation for the first time. A token manager determines whether the token is available or checked out and responds to the token portion of the combined request. Snoopers respond to the operation portion of the combined request depending on whether they are busy. If the entire combined request is retried, a token request (only) is employed to request the token and, when the token is acquired, an operation request (only) is employed to request the operation.
Type: Grant
Filed: November 9, 1999
Date of Patent: November 12, 2002
Assignee: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, John Steven Dodson, Jody B. Joyner, Jerry Don Lewis
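The token manager's role can be sketched as a simple check-out record (assumed names and structure, not the patented hardware); the operation portion, which the snoopers answer separately, is omitted.

```c
#include <stdbool.h>

/* Token manager state: the single token is either free or checked out. */
struct token_manager {
    bool checked_out;
    int  holder;          /* bus master currently holding the token, -1 if none */
};

/* Token portion of a combined token-and-operation request: grant the token only
 * if it is free; otherwise the requester falls back to a token-only retry later. */
static bool request_token(struct token_manager *tm, int bus_master)
{
    if (tm->checked_out)
        return false;
    tm->checked_out = true;
    tm->holder = bus_master;
    return true;
}

static void release_token(struct token_manager *tm)
{
    tm->checked_out = false;
    tm->holder = -1;
}
```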
-
Patent number: 6480975
Abstract: A method of checking for errors in a set associative cache array, by comparing a requested value to values loaded in the cache blocks and determining, concurrently with this comparison, whether the cache blocks collectively contain at least one error (such as a soft error caused by stray radiation). Separate parity checks are performed on each cache block and if a parity error occurs, an error correction code (ECC) is executed for the entire congruence class, i.e., only one set of ECC bits is used for the combined cache blocks forming the congruence class. The cache operation is retried after ECC execution. The present invention can be applied to a cache directory containing address tags, or to a cache entry array containing the actual instruction and data values. This novel method allows the ECC to perform double-bit error detection as well, but a smaller number of error checking bits is required as compared with the prior art, due to the provision of a single ECC field for the entire congruence class.
Type: Grant
Filed: February 17, 1998
Date of Patent: November 12, 2002
Assignee: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, John Steven Dodson, Jerry Don Lewis
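A rough software model of the checking side only (the ECC correction step itself is not shown): per-block parity is computed and, if any block disagrees with its stored parity bit, the single ECC field for the whole congruence class would be consulted and the access retried. Associativity and parity scheme are assumptions for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WAYS 4   /* blocks per congruence class (assumed associativity) */

/* Even parity over one cache block (a real array computes this in hardware). */
static uint8_t block_parity(const uint8_t *block, size_t len)
{
    uint8_t acc = 0;
    for (size_t i = 0; i < len; i++)
        acc ^= block[i];
    acc ^= acc >> 4;   /* fold the accumulated byte down to a single parity bit */
    acc ^= acc >> 2;
    acc ^= acc >> 1;
    return acc & 1u;
}

/* Check every block of the class alongside the tag compare. Returning true
 * means: run ECC over the combined blocks, then retry the cache operation. */
static bool class_needs_ecc(const uint8_t *blocks[WAYS], size_t len,
                            const uint8_t stored_parity[WAYS])
{
    for (int w = 0; w < WAYS; w++)
        if (block_parity(blocks[w], len) != stored_parity[w])
            return true;
    return false;
}
```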
-
Patent number: 6477613
Abstract: Following a cache miss by an operation, the address for the operation is transmitted on the bus coupling the cache to lower levels of the storage hierarchy. A portion of the address including the index field is transmitted during a first bus cycle, and may be employed to begin directory lookups in lower level storage devices before the address tag is received. The remainder of the address is transmitted during subsequent bus cycles, which should be in time for address tag comparisons with the congruence class elements. To allow multiple directory lookups to be occurring concurrently in a pipelined directory, a portion of multiple addresses for several data access operations, each portion including the index field for the respective address, may be transmitted during the first bus cycle or staged in consecutive bus cycles, with the remainders of each address, including the cache tags, transmitted during the subsequent bus cycles.
Type: Grant
Filed: June 30, 1999
Date of Patent: November 5, 2002
Assignee: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, John Steven Dodson, Guy Lynn Guthrie, Jody B. Joyner, Jerry Don Lewis
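As a simplified model of the split transmission (invented names, not the patented bus protocol): the index arrives first so the lower-level directory read can start, and the tag arrives in a later beat, in time for the compare against the congruence class.

```c
#include <stdint.h>

/* An address split the way it is sent on the bus in this sketch. */
struct split_address {
    uint32_t index;   /* selects the congruence class; transmitted in the first bus cycle */
    uint32_t tag;     /* compared against the class's address tags; transmitted later */
};

/* Lower-level device: begin the directory lookup as soon as the index arrives. */
static void on_first_beat(uint32_t index)
{
    (void)index;      /* start reading the directory row for this congruence class */
}

/* Later beats: compare the arriving tag against the tags already read out. */
static int on_later_beats(uint32_t tag, const uint32_t class_tags[], int ways)
{
    for (int w = 0; w < ways; w++)
        if (class_tags[w] == tag)
            return w;   /* hit in way w */
    return -1;          /* miss */
}
```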
-
Patent number: 6470427
Abstract: A programmable agent and method for managing prefetch queues provide dynamically configurable handling of priorities in a prefetching subsystem for providing look-ahead memory loads in a computer system. When its queues are at capacity, an agent handling prefetches from memory either ignores new requests, forces the new requests to retry, or cancels a pending request in order to perform the new request. The behavior can be adjusted under program control by programming a register, or the control may be coupled to a load pattern analyzer. In addition, the behavior with respect to new requests can be set to different types depending on a phase of a pending request.
Type: Grant
Filed: November 9, 1999
Date of Patent: October 22, 2002
Assignee: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, John Steven Dodson, James Stephen Fields, Jr., Guy Lynn Guthrie
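A small sketch of the three selectable full-queue behaviors, with a made-up policy enum standing in for the control register; the actual eviction of a pending request is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Behaviors selectable by the control register when all prefetch queues are full. */
enum full_policy { POLICY_IGNORE, POLICY_RETRY, POLICY_CANCEL_PENDING };

struct prefetch_agent {
    enum full_policy policy;       /* programmed by software or driven by a load-pattern analyzer */
    int              queues_free;
};

/* Returns true if the new prefetch request was accepted (illustrative only). */
static bool handle_prefetch(struct prefetch_agent *agent, uint64_t address)
{
    (void)address;
    if (agent->queues_free > 0) {
        agent->queues_free--;
        return true;
    }
    switch (agent->policy) {
    case POLICY_IGNORE:         return false;  /* drop the new request silently */
    case POLICY_RETRY:          return false;  /* requester is told to retry later */
    case POLICY_CANCEL_PENDING: return true;   /* a pending entry would be cancelled to make room */
    }
    return false;
}
```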
-
Patent number: 6463507
Abstract: A method of improving memory access for a computer system, by sending load requests to a lower level storage subsystem along with associated information pertaining to intended use of the requested information by the requesting processor, without using a high level load queue. Returning the requested information to the processor along with the associated use information allows the information to be placed immediately without using reload buffers. A register load bus separate from the cache load bus (and having a smaller granularity) is used to return the information. An upper level (L1) cache may then be imprecisely reloaded (the upper level cache can also be imprecisely reloaded with store instructions). The lower level (L2) cache can monitor L1 and L2 cache activity, which can be used to select a victim cache block in the L1 cache (based on the additional L2 information), or to select a victim cache block in the L2 cache (based on the additional L1 information).
Type: Grant
Filed: June 25, 1999
Date of Patent: October 8, 2002
Assignee: International Business Machines Corporation
Inventors: Ravi Kumar Arimilli, Leo James Clark, John Steven Dodson, Guy Lynn Guthrie
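One way to picture the "use information" travelling with the request and back with the data (so the value can be placed directly, without a reload buffer) is the following struct; the fields and enum values are assumptions for the sketch, not the patented encoding.

```c
#include <stdint.h>

/* Intended-use tag sent with the load request and echoed back with the data. */
enum intended_use { USE_FIXED_POINT_REG, USE_FLOAT_REG, USE_INSTRUCTION_FETCH };

/* Return beat on the register load bus: register-width data plus placement info. */
struct load_return {
    uint64_t          data;        /* smaller granule than a cache-load-bus beat */
    uint8_t           target_reg;  /* destination register supplied with the request */
    enum intended_use use;         /* tells the core where to place the data immediately */
};
```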