Patents by Inventor Madhumitra Sharma

Madhumitra Sharma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Cache coherency mechanism using arbitration masks

Patent number: 6961825

Abstract: A distributed processing system includes a cache coherency mechanism that essentially encodes network routing information into sectored presence bits. The mechanism organizes the sectored presence bits as one or more arbitration masks that system switches decode and use directly to route invalidate messages through one or more higher levels of the system. The lower level or levels of the system use local routing mechanisms, such as local directories, to direct the invalidate messages to the individual processors that are holding the data of interest.

Type: Grant

Filed: January 24, 2001

Date of Patent: November 1, 2005

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Simon C. Steely, Jr., Stephen Van Doren, Madhumitra Sharma
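
As a rough illustration of the abstract above, the following sketch treats the sectored presence bits as a bit mask with one bit per group of nodes. A switch decodes the mask directly into the output ports that should receive an invalidate, and a per-sector local directory then picks out the individual holders. The sector size, the port numbering, and every function name here are assumptions made for illustration; none of them is taken from the patent.

```python
# Hypothetical sketch: sectored presence bits used as a routing mask.
# SECTOR_SIZE and the node/port numbering are illustrative assumptions.
SECTOR_SIZE = 4  # number of nodes covered by one presence bit (assumed)


def encode_mask(sharer_nodes):
    """Set one bit per sector that contains at least one sharing node."""
    mask = 0
    for node in sharer_nodes:
        mask |= 1 << (node // SECTOR_SIZE)
    return mask


def decode_ports(mask):
    """Decode the mask directly: bit i set means forward on output port i."""
    ports = []
    bit = 0
    while mask:
        if mask & 1:
            ports.append(bit)
        mask >>= 1
        bit += 1
    return ports


def route_invalidate(mask, local_directories):
    """Forward by mask at the upper level, then let each selected sector's
    local directory name the processors that actually hold the block."""
    targets = []
    for port in decode_ports(mask):
        targets.extend(local_directories.get(port, []))
    return sorted(targets)
```

For example, sharers on nodes 1, 2 and 9 fall into sectors 0 and 2, so only those two ports see the invalidate; sector 1 is never consulted.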
Low latency inter-reference ordering in a multiple processor system employing a multiple-level inter-node switch

Patent number: 6904465

Abstract: A multiple-processor system in which a commit message is returned to a source processor that requests a memory access operation so as to indicate the apparent completion of the operation includes a multiple-level switch unit linking nodes that contain the processors. The switch unit includes multiple input switches each of which receives messages from multiple nodes, and a set of output switches whose inputs are the outputs of the input switches and whose outputs are the inputs of the nodes. Each switch processes messages in the order in which they are received by the switch and each output switch follows the same rule as the other output switches.

Type: Grant

Filed: April 26, 2001

Date of Patent: June 7, 2005

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Simon C. Steely, Jr., Madhumitra Sharma, Stephen R. Van Doren

Livelock prevention by delaying surrender of ownership upon intervening ownership request during load locked / store conditional atomic memory operation

Patent number: 6801986

Abstract: A method, for executing a load locked and a store conditional instruction in a processor, achieves an atomic read-write operation to a memory block. First the load locked instruction is executed to read a memory block, and the processor in response to executing the load locked instruction issues a read modify system command to read the block and to take ownership of the block by the processor, and also sets a lock flag for the address of the memory block, and writes a value of the memory block into a cache of the processor as a cache copy of the memory block. The lock flag, upon receipt of an invalidate message by the processor for the cache copy of the memory block, is reset if any invalidate messages for the memory block are received by the processor. The processor waits for a selected time interval before the processor surrenders ownership of the memory block upon receipt of an ownership request message, if any is received by the processor after execution of the load locked instruction.

Type: Grant

Filed: August 20, 2001

Date of Patent: October 5, 2004

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Simon C. Steely, Jr., Stephen R. Van Doren, Madhumitra Sharma
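
The behavior described in the abstract above can be modeled loosely as follows. This is a minimal single-threaded sketch, not the patented implementation: the class name, the tick-based clock, the value of `HOLD_INTERVAL`, and the choice to measure the interval from receipt of the ownership request are all assumptions.

```python
# Hypothetical model of load locked / store conditional with delayed
# surrender of ownership. All names and constants are illustrative.
HOLD_INTERVAL = 10  # the "selected time interval", in arbitrary ticks (assumed)


class Processor:
    def __init__(self, memory):
        self.memory = memory    # shared dict: address -> value
        self.cache = {}         # address -> cached copy
        self.lock_flag = None   # address currently locked, or None

    def load_locked(self, addr):
        # Read the block with ownership, cache it, and set the lock flag.
        self.cache[addr] = self.memory[addr]
        self.lock_flag = addr
        return self.cache[addr]

    def on_invalidate(self, addr):
        # An invalidate for the locked block resets the lock flag.
        self.cache.pop(addr, None)
        if self.lock_flag == addr:
            self.lock_flag = None

    def surrender_time(self, addr, now):
        # While the lock flag is set, an intervening ownership request is
        # honored only after the hold interval; otherwise immediately.
        if self.lock_flag == addr:
            return now + HOLD_INTERVAL
        return now

    def store_conditional(self, addr, value):
        # Succeeds only if the lock flag is still set for this address.
        if self.lock_flag != addr:
            return False
        self.cache[addr] = value
        self.memory[addr] = value
        self.lock_flag = None
        return True
```

Holding ownership briefly gives the store conditional a chance to complete before a competing processor takes the block, which is the livelock the title refers to.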
Mechanism for packet component merging and channel assignment, and packet decomposition and channel reassignment in a multiprocessor system

Publication number: 20030076831

Abstract: A technique efficiently combines data and ordered transactions in a multiprocessor system having a plurality of nodes interconnected by a hierarchical switch. The technique further enables an ordered channel of the system to make progress in the presence of a blocked interface within the hierarchical switch. Specifically, the technique combines ordered components and unordered data components into common packets that are transmitted over an ordered channel of the system in the event that ordered and unordered components are generated simultaneously. The technique further allows, in the event that a combined packet in the ordered channel is stalled due to a data buffer dependency, the packet to be decomposed into an ordered component and an unordered data component wherein the ordered component remains in the ordered channel and the unordered data component is reassigned to the unordered data channel.

Type: Application

Filed: March 21, 2001

Publication date: April 24, 2003

Inventors: Stephen R. Van Doren, Simon C. Steely, Madhumitra Sharma
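
The merge and decompose steps in the abstract above can be sketched as below. The `Packet` type, the channel names, and the function signatures are hypothetical; the abstract does not specify a packet layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    channel: str                   # "ordered" or "unordered" (assumed names)
    ordered: Optional[str] = None  # ordered component, e.g. a command
    data: Optional[bytes] = None   # unordered data component


def merge(ordered: Optional[str], data: Optional[bytes]) -> List[Packet]:
    """Components generated together share one packet on the ordered channel."""
    if ordered is not None and data is not None:
        return [Packet("ordered", ordered, data)]
    if ordered is not None:
        return [Packet("ordered", ordered=ordered)]
    if data is not None:
        return [Packet("unordered", data=data)]
    return []


def decompose(packet: Packet) -> List[Packet]:
    """Split a stalled combined packet: the ordered part stays on the ordered
    channel and the data part is reassigned to the unordered channel."""
    if packet.ordered is None or packet.data is None:
        return [packet]  # nothing to split
    return [
        Packet("ordered", ordered=packet.ordered),
        Packet("unordered", data=packet.data),
    ]
```

In this sketch `decompose` would be called only when the combined packet is blocked on a data buffer, so the ordered channel can keep making progress.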
Apparatus and method for ownership load locked misses for atomic lock acquisition in a multiprocessor computer system

Publication number: 20030037223

Abstract: A method, for executing a load locked and a store conditional instruction in a processor, achieves an atomic read-write operation to a memory block. First the load locked instruction is executed to read a memory block, and the processor in response to executing the load locked instruction issues a read modify system command to read the block and to take ownership of the block by the processor, and also sets a lock flag for the address of the memory block, and writes a value of the memory block into a cache of the processor as a cache copy of the memory block. The lock flag, upon receipt of an invalidate message by the processor for the cache copy of the memory block, is reset if any invalidate messages for the memory block are received by the processor. The processor waits for a selected time interval before the processor surrenders ownership of the memory block upon receipt of an ownership request message, if any is received by the processor after execution of the load locked instruction.

Type: Application

Filed: August 20, 2001

Publication date: February 20, 2003

Inventors: Simon C. Steely, Stephen R. Van Doren, Madhumitra Sharma

Low latency inter-reference ordering in a multiple processor system employing a multiple-level inter-node switch

Publication number: 20020194290

Abstract: A multiple-processor system in which a commit message is returned to a source processor that requests a memory access operation so as to indicate the apparent completion of the operation includes a multiple-level switch unit linking nodes that contain the processors. The switch unit includes multiple input switches each of which receives messages from multiple nodes, and a set of output switches whose inputs are the outputs of the input switches and whose outputs are the inputs of the nodes. Each switch processes messages in the order in which they are received by the switch and each output switch follows the same rule as the other output switches.

Type: Application

Filed: April 26, 2001

Publication date: December 19, 2002

Inventors: Simon C. Steely, Madhumitra Sharma, Stephen R. Van Doren

Always succeeding change to dirty method

Publication number: 20020152358

Abstract: A performance enhancing change-to-dirty operation (CTD) is disclosed wherein contention among several processors trying to gain ownership of a block of data is obviated by arranging the CTD to always succeed. A method and a system are disclosed where a processor in a multiprocessor system having a copy of data gains assured ownership of data that the processor may then write. The method provides for the possibilities of conditions that may exist and provides for a scenario in which the requesting processor may have to wait for the ownership. Conditions are handled where the memory is the “owner” of the data and where other processors are requesting ownership, and where copies of the data exist at other processors. The method provides for messages to other processors having copies of the data informing them that the data is now invalid.

Type: Application

Filed: April 13, 2001

Publication date: October 17, 2002

Inventors: Simon C. Steely, Stephen R. Van Doren, Madhumitra Sharma

Credit-based flow control technique in a modular multiprocessor system

Publication number: 20020146022

Abstract: A credit-based, flow control technique utilizes a plurality of counters to conserve resources of a switch fabric within a modular multiprocessor system while ensuring that transaction packets pending in virtual channel queues of the fabric efficiently progress through those resources. The multiprocessor system includes a plurality of nodes interconnected by the switch fabric that extends from a global input port of a node through a hierarchical switch to a global output port of the same or another node. The resources include shared buffers within the global ports and hierarchical switch. Each counter is associated with a virtual channel queue and the flow control technique uses the counters to essentially create the structure of the shared buffers.

Type: Application

Filed: April 9, 2001

Publication date: October 10, 2002

Inventors: Stephen R. Van Doren, Simon C. Steely, Madhumitra Sharma, Gregory E. Tierney
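
One way to picture per-channel counters imposing structure on a shared buffer pool is sketched below. The class, the fixed per-channel limits, and the channel names are assumptions for illustration; the abstract does not give the actual counter policy.

```python
# Hypothetical sketch: one counter per virtual channel logically partitions
# a shared buffer. Limits and channel names are illustrative assumptions.
class CreditedSharedBuffer:
    def __init__(self, limits):
        self.limits = dict(limits)              # channel -> max buffers it may hold
        self.in_use = {ch: 0 for ch in limits}  # one counter per virtual channel

    def try_send(self, channel):
        """Consume a credit if the channel is below its limit."""
        if self.in_use[channel] >= self.limits[channel]:
            return False  # no credit: the packet waits in its channel queue
        self.in_use[channel] += 1
        return True

    def release(self, channel):
        """Return a credit when a buffered packet leaves the shared buffer."""
        if self.in_use[channel] == 0:
            raise ValueError("no outstanding packet on this channel")
        self.in_use[channel] -= 1
```

Because each channel is capped independently, a burst on one channel cannot consume the buffers another channel needs to make progress.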
Cache coherency mechanism using arbitration masks

Publication number: 20020099833

Abstract: A distributed processing system includes a cache coherency mechanism that essentially encodes network routing information into sectored presence bits. The mechanism organizes the sectored presence bits as one or more arbitration masks that system switches decode and use directly to route invalidate messages through one or more higher levels of the system. The lower level or levels of the system use local routing mechanisms, such as local directories, to direct the invalidate messages to the individual processors that are holding the data of interest.

Type: Application

Filed: January 24, 2001

Publication date: July 25, 2002

Inventors: Simon C. Steely, Stephen Van Doren, Madhumitra Sharma

Multicast decomposition mechanism in a hierarchically order distributed shared memory multiprocessor computer system

Publication number: 20020009095

Abstract: A technique decomposes a multicast transaction issued by one of a plurality of nodes of a distributed shared memory multiprocessor system into a series of multicast packets, each of which may further “spawn” multicast messages directed to a subset of the nodes. A central switch fabric interconnects the nodes, each of which includes a global port coupled to the switch, a plurality of processors and memory. The central switch includes a central ordering point that maintains an order of packets issued by, e.g., a source processor of a remote node when requesting data resident in a memory of a home node. The multicast messages spawned from a multicast packet passing the central ordering point are generated according to multicast decomposition and ordering rules of the inventive technique.

Type: Application

Filed: May 31, 2001

Publication date: January 24, 2002

Inventors: Stephen R. Van Doren, Simon C. Steely, Madhumitra Sharma

Initiate flow control mechanism of a modular multiprocessor system

Publication number: 20010055277

Abstract: An initiate flow control mechanism prevents interconnect resources within a switch fabric of a modular multiprocessor system from being dominated with initiate transactions. The multiprocessor system comprises a plurality of nodes interconnected by a switch fabric that extends from a global input port of a node through a hierarchical switch to a global output port of the same or another node. The interconnect resources include shared buffers within the global ports and hierarchical switch. The initiate flow control mechanism manages these shared buffers to reserve bandwidth for complete transactions when extensive global initiate traffic to one or more nodes of the system may create a bottleneck in the switch fabric.

Type: Application

Filed: May 11, 2001

Publication date: December 27, 2001

Inventors: Simon C. Steely, Madhumitra Sharma, Stephen R. Van Doren, Gregory E. Tierney

Mechanism for selectively imposing interference order between page-table fetches and corresponding data fetches

Patent number: 6286090

Abstract: A technique selectively imposes inter-reference ordering between memory reference operations issued by a processor of a multiprocessor system to addresses within a page pertaining to a page table entry (PTE) that is affected by a translation buffer (TB) miss flow routine. The TB miss flow is used to retrieve information contained in the PTE for mapping a virtual address to a physical address and, subsequently, to allow retrieval of data at the mapped physical address. The PTE that is retrieved in response to a memory reference (read) operation is not loaded into the TB until a commit-signal associated with that read operation is returned to the processor. Once the PTE and associated commit-signal are returned, the processor loads the PTE into the TB so that it can be used for a subsequent read operation directed to the data at the physical address.

Type: Grant

Filed: May 26, 1998

Date of Patent: September 4, 2001

Assignee: Compaq Computer Corporation

Inventors: Simon C. Steely, Jr., Madhumitra Sharma, Stephen R. Van Doren, Kourosh Gharachorloo

Shadow commands to optimize sequencing of requests in a switch-based multi-processor system

Patent number: 6279084

Abstract: The invention pertains to serializing local and remote references to a portion of a shared memory to optimize sequencing of requests in a switch-based, multi-processor system in which the local and remote references can occur concurrently. Local accesses are typically much faster than remote accesses. Thus, in the interest of performance, both local and remote accesses are permitted to occur concurrently in the multiprocessing system. However, in one instance a local access can cause deadlock problems for a remote access. In addition, problems associated with coherency of the shared memory can also arise. Thus, in order to prevent deadlock problems and to maintain coherency of a shared memory, if a local reference to an address of memory has been forwarded to a switch, in this instance a hierarchical switch, then all subsequent references to that address of memory are forwarded to the hierarchical switch. The hierarchical switch has ordering properties that maintain the received order of inputs.

Type: Grant

Filed: October 24, 1997

Date of Patent: August 21, 2001

Assignee: Compaq Computer Corporation

Inventors: Stephen R. VanDoren, Simon C. Steely, Madhumitra Sharma, Hari Krishnan Nagpal
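
The forwarding rule stated in the abstract above (once a local reference to an address has gone to the hierarchical switch, later references to that address go there too) can be sketched as follows. The class name, the `needs_switch` flag, and the fact that an address is never cleared from the set are simplifying assumptions, not details from the patent.

```python
# Hypothetical sketch of the "forward all later references" rule.
class NodeRouter:
    def __init__(self):
        self.forwarded = set()  # addresses already sent to the switch

    def route(self, addr, needs_switch):
        """Return "switch" or "local" for a reference to addr.

        needs_switch is an assumed input meaning this reference cannot be
        satisfied locally and must be forwarded to the hierarchical switch.
        """
        if needs_switch or addr in self.forwarded:
            self.forwarded.add(addr)
            return "switch"
        return "local"
```

Sending every later reference through the switch lets the switch's ordering properties serialize local and remote references to the same address.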
High-performance non-blocking switch with multiple channel ordering constraints

Patent number: 6249520

Abstract: An architecture and coherency protocol for use in a large SMP computer system includes a hierarchical switch structure which allows for a number of multi-processor nodes to be coupled to the switch to operate at an optimum performance. Within each multi-processor node, a simultaneous buffering system is provided that allows all of the processors of the multi-processor node to operate at peak performance. A memory is shared among the nodes, with a portion of the memory resident at each of the multi-processor nodes. Each of the multi-processor nodes includes a number of elements for maintaining memory coherency, including a victim cache, a directory and a transaction tracking table. The victim cache allows for selective updates of victim data destined for memory stored at a remote multi-processing node, thereby improving the overall performance of memory.

Type: Grant

Filed: October 24, 1997

Date of Patent: June 19, 2001

Assignee: Compaq Computer Corporation

Inventors: Simon C. Steely, Jr., Stephen R. VanDoren, Madhumitra Sharma, Craig D. Keefer, David W. Davis

Mechanism for optimizing generation of commit-signals in a distributed shared-memory system

Patent number: 6209065

Abstract: A mechanism optimizes the generation of a commit-signal by control logic of the multiprocessor system in response to a memory reference operation issued by a processor to a local node of a multiprocessor system having a hierarchical switch for interconnecting a plurality of nodes. The mechanism generally comprises a structure that indicates whether the memory reference operation affects other processors of other nodes of the multiprocessor system. An ordering point of the local node generates an optimized commit-signal when the structure indicates that the memory reference operation does not affect the other processors.

Type: Grant

Filed: October 24, 1997

Date of Patent: March 27, 2001

Assignee: Compaq Computer Corporation

Inventors: Stephen R. Van Doren, Simon C. Steely, Jr., Kourosh Gharachorloo, Madhumitra Sharma

Victimization of clean data blocks

Patent number: 6202126

Abstract: A method for preventing inadvertent invalidation of data elements in a system having a separate probe queue and fill queue for each central processing unit is provided, wherein a central processing unit stores a clean data element, that would otherwise have been discarded, in a victim data buffer when it is evicted from cache. The central processing unit subsequently issues a clean-victim command to the system control logic when the read-miss or read-miss-modify command, targeting the data element that maps to the same location in cache as the clean data element, is issued. The clean-victim command causes the duplicate tag store to indicate that the clean data element is no longer stored in that central processing unit's cache. While the data is stored therein, the central processing unit cannot issue a probe message that targets that data until the victim data buffer has been deallocated.

Type: Grant

Filed: October 24, 1997

Date of Patent: March 13, 2001

Assignee: Compaq Computer Corporation

Inventors: Stephen Van Doren, Simon C. Steely, Jr., Madhumitra Sharma

Low occupancy protocol for managing concurrent transactions with dependencies

Patent number: 6154816

Abstract: An architecture and coherency protocol for use in a large SMP computer system includes a hierarchical switch structure which allows for a number of multi-processor nodes to be coupled to the switch to operate at an optimum performance. Within each multi-processor node, a simultaneous buffering system is provided that allows all of the processors of the multi-processor node to operate at peak performance. A memory is shared among the nodes, with a portion of the memory resident at each of the multi-processor nodes. Each of the multi-processor nodes includes a number of elements for maintaining memory coherency, including a victim cache, a directory and a transaction tracking table. The victim cache allows for selective updates of victim data destined for memory stored at a remote multi-processing node, thereby improving the overall performance of memory.

Type: Grant

Filed: October 24, 1997

Date of Patent: November 28, 2000

Assignee: Compaq Computer Corp.

Inventors: Simon C. Steely, Madhumitra Sharma, Stephen R. VanDoren

Order supporting mechanisms for use in a switch-based multi-processor system

Patent number: 6122714

Abstract: An architecture and coherency protocol for use in a large SMP computer system includes a hierarchical switch structure which allows for a number of multi-processor nodes to be coupled to the switch to operate at an optimum performance. Within each multi-processor node, a simultaneous buffering system is provided that allows all of the processors of the multi-processor node to operate at peak performance. A memory is shared among the nodes, with a portion of the memory resident at each of the multi-processor nodes. Each of the multi-processor nodes includes a number of elements for maintaining memory coherency, including a victim cache, a directory and a transaction tracking table. The victim cache allows for selective updates of victim data destined for memory stored at a remote multi-processing node, thereby improving the overall performance of memory.

Type: Grant

Filed: October 24, 1997

Date of Patent: September 19, 2000

Assignee: Compaq Computer Corp.

Inventors: Stephen R. VanDoren, Simon C. Steely, Madhumitra Sharma, David M. Fenwick

Method and apparatus for reducing latency of inter-reference ordering in a multiprocessor system

Patent number: 6108737

Abstract: A mechanism reduces the latency of inter-reference ordering between sets of memory reference operations in a multiprocessor system having a shared memory. The mechanism comprises a commit-signal that is generated by control logic of the multiprocessor system in response to an issued memory reference operation. The commit-signal facilitates inter-reference ordering; moreover, the commit signal indicates the apparent completion of the memory reference operation, rather than actual completion of the operation. The apparent completion of an operation occurs substantially sooner than the actual completion of an operation, thereby improving performance of the multiprocessor system.

Type: Grant

Filed: October 24, 1997

Date of Patent: August 22, 2000

Assignee: Compaq Computer Corporation

Inventors: Madhumitra Sharma, Stephen R. Van Doren, Kourosh Gharachorloo, Simon C. Steely, Jr.

Method and apparatus for disambiguating change-to-dirty commands in a switch based multi-processing system with coarse directories

Patent number: 6101420

Abstract: An architecture and coherency protocol for use in a large SMP computer system includes a hierarchical switch structure which allows for a number of multi-processor nodes to be coupled to the switch to operate at an optimum performance. Within each multi-processor node, a simultaneous buffering system is provided that allows all of the processors of the multi-processor node to operate at peak performance. A memory is shared among the nodes, with a portion of the memory resident at each of the multi-processor nodes. Each of the multi-processor nodes includes a number of elements for maintaining memory coherency, including a victim cache, a directory and a transaction tracking table. The victim cache allows for selective updates of victim data destined for memory stored at a remote multi-processing node, thereby improving the overall performance of memory.

Type: Grant

Filed: October 24, 1997

Date of Patent: August 8, 2000

Assignee: Compaq Computer Corporation

Inventors: Stephen R. VanDoren, Simon C. Steely, Madhumitra Sharma, Kourosh Gharachorloo
