Patents by Inventor Madhumitra Sharma

Madhumitra Sharma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 6961825
    Abstract: A distributed processing system includes a cache coherency mechanism that essentially encodes network routing information into sectored presence bits. The mechanism organizes the sectored presence bits as one or more arbitration masks that system switches decode and use directly to route invalidate messages through one or more higher levels of the system. The lower level or levels of the system use local routing mechanisms, such as local directories, to direct the invalidate messages to the individual processors that are holding the data of interest.
    Type: Grant
    Filed: January 24, 2001
    Date of Patent: November 1, 2005
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Simon C. Steely, Jr., Stephen Van Doren, Madhumitra Sharma
  • Patent number: 6904465
    Abstract: A multiple-processor system in which a commit message is returned to a source processor that requests a memory access operation so as to indicate the apparent completion of the operation includes a multiple-level switch unit linking nodes that contain the processors. The switch unit includes multiple input switches each of which receives messages from multiple nodes, and a set of output switches whose inputs are the outputs of the input switches and whose outputs are the inputs of the nodes. Each switch processes messages in the order in which they are received by the switch and each output switch follows the same rule as the other output switches.
    Type: Grant
    Filed: April 26, 2001
    Date of Patent: June 7, 2005
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Simon C. Steely, Jr., Madhumitra Sharma, Stephen R. Van Doren
  • Patent number: 6801986
    Abstract: A method, for executing a load locked and a store conditional instruction in a processor, achieves an atomic read-write operation to a memory block. First the load locked instruction is executed to read a memory block, and the processor in response to executing the load locked instruction issues a read modify system command to read the block and to take ownership of the block by the processor, and also sets a lock flag for the address of the memory block, and writes a value of the memory block into a cache of the processor as a cache copy of the memory block. The lock flag, upon receipt of an invalidate message by the processor for the cache copy of the memory block, is reset if any invalidate messages for the memory block are received by the processor. The processor waits for a selected time interval before the processor surrenders ownership of the memory block upon receipt of an ownership request message, if any is received by the processor after execution of the load locked instruction.
    Type: Grant
    Filed: August 20, 2001
    Date of Patent: October 5, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Simon C. Steely, Jr., Stephen R. Van Doren, Madhumitra Sharma
  • Publication number: 20030076831
    Abstract: A technique efficiently combines data and ordered transactions in a multiprocessor system having a plurality of nodes interconnected by a hierarchical switch. The technique further enables an ordered channel of the system to make progress in the presence of a blocked interface within the hierarchical switch. Specifically, the technique combines ordered components and unordered data components into common packets that are transmitted over an ordered channel of the system in the event that ordered and unordered components are generated simultaneously. The technique further allows, in the event that a combined packet in the ordered channel is stalled due to a data buffer dependency, the packet to be decomposed into an ordered component and an unordered data component wherein the ordered component remains in the ordered channel and the unordered data component is reassigned to the unordered data channel.
    Type: Application
    Filed: March 21, 2001
    Publication date: April 24, 2003
    Inventors: Stephen R. Van Doren, Simon C. Steely, Madhumitra Sharma
  • Publication number: 20030037223
    Abstract: A method, for executing a load locked and a store conditional instruction in a processor, achieves an atomic read-write operation to a memory block. First the load locked instruction is executed to read a memory block, and the processor in response to executing the load locked instruction issues a read modify system command to read the block and to take ownership of the block by the processor, and also sets a lock flag for the address of the memory block, and writes a value of the memory block into a cache of the processor as a cache copy of the memory block. The lock flag, upon receipt of an invalidate message by the processor for the cache copy of the memory block, is reset if any invalidate messages for the memory block are received by the processor. The processor waits for a selected time interval before the processor surrenders ownership of the memory block upon receipt of an ownership request message, if any is received by the processor after execution of the load locked instruction.
    Type: Application
    Filed: August 20, 2001
    Publication date: February 20, 2003
    Inventors: Simon C. Steely, Stephen R. Van Doren, Madhumitra Sharma
  • Publication number: 20020194290
    Abstract: A multiple-processor system in which a commit message is returned to a source processor that requests a memory access operation so as to indicate the apparent completion of the operation includes a multiple-level switch unit linking nodes that contain the processors. The switch unit includes multiple input switches each of which receives messages from multiple nodes, and a set of output switches whose inputs are the outputs of the input switches and whose outputs are the inputs of the nodes. Each switch processes messages in the order in which they are received by the switch and each output switch follows the same rule as the other output switches.
    Type: Application
    Filed: April 26, 2001
    Publication date: December 19, 2002
    Inventors: Simon C. Steely, Madhumitra Sharma, Stephen R. Van Doren
  • Publication number: 20020152358
    Abstract: A performance enhancing change-to-dirty operation (CTD) is disclosed wherein contention among several processors trying to gain ownership of a block of data is obviated by arranging the CTD to always succeed. A method and a system are disclosed where a processor in a multiprocessor system having a copy of data gains assured ownership of data that the processor may then write. The method provides for the possibilities of conditions that may exist and provides a scenario that the requesting processor may have to wait for the ownership. Conditions are handled where the memory is the “owner” of the data and where other processor are requesting ownership, and where copies of the data exist at other processors. The method provides for messages to other processor having copies of the data informing them that the data is now invalid.
    Type: Application
    Filed: April 13, 2001
    Publication date: October 17, 2002
    Inventors: Simon C. Steely, Stephen R. Van Doren, Madhumitra Sharma
  • Publication number: 20020146022
    Abstract: A credit-based, flow control technique utilizes a plurality of counters to conserve resources of a switch fabric within a modular multiprocessor system while ensuring that transaction packets pending in virtual channel queues of the fabric efficiently progress through those resources. The multiprocessor system includes a plurality of nodes interconnected by the switch fabric that extends from a global input port of a node through a hierarchical switch to a global output port of the same or another node. The resources include shared buffers within the global ports and hierarchical switch. Each counter is associated with a virtual channel queue and the flow control technique uses the counters to essentially create the structure of the shared buffers.
    Type: Application
    Filed: April 9, 2001
    Publication date: October 10, 2002
    Inventors: Stephen R. Van Doren, Simon C. Steely, Madhumitra Sharma, Gregory E. Tierney
  • Publication number: 20020099833
    Abstract: A distributed processing system includes a cache coherency mechanism that essentially encodes network routing information into sectored presence bits. The mechanism organizes the sectored presence bits as one or more arbitration masks that system switches decode and use directly to route invalidate messages through one or more higher levels of the system. The lower level or levels of the system use local routing mechanisms, such as local directories, to direct the invalidate messages to the individual processors that are holding the data of interest.
    Type: Application
    Filed: January 24, 2001
    Publication date: July 25, 2002
    Inventors: Simon C. Steely, Stephen Van Doren, Madhumitra Sharma
  • Publication number: 20020009095
    Abstract: A technique decomposes a multicast transaction issued by one of a plurality of nodes of a distributed shared memory multiprocessor system into a series of multicast packets, each of which may further “spawn” multicast messages directed to a subset of the nodes. A central switch fabric interconnects the nodes, each of which includes a global port coupled to the switch, a plurality of processors and memory. The central switch includes a central ordering point that maintains an order of packets issued by, e.g., a source processor of a remote node when requesting data resident in a memory of a home node. The multicast messages spawned from a multicast packet passing the central ordering point are generated according to multicast decomposition and ordering rules of the inventive technique.
    Type: Application
    Filed: May 31, 2001
    Publication date: January 24, 2002
    Inventors: Stephen R. Van Doren, Simon C. Steely, Madhumitra Sharma
  • Publication number: 20010055277
    Abstract: An initiate flow control mechanism prevents interconnect resources within a switch fabric of a modular multiprocessor system from being dominated with initiate transactions. The multiprocessor system comprises a plurality of nodes interconnected by a switch fabric that extends from a global input port of a node through a hierarchical switch to a global output port of the same or another node. The interconnect resources include shared buffers within the global ports and hierarchical switch. The initiate flow control mechanism manages these shared buffers to reserve bandwidth for complete transactions when extensive global initiate traffic to one or more nodes of the system may create a bottleneck in the switch fabric.
    Type: Application
    Filed: May 11, 2001
    Publication date: December 27, 2001
    Inventors: Simon C. Steely, Madhumitra Sharma, Stephen R. Van Doren, Gregory E. Tierney
  • Patent number: 6286090
    Abstract: A technique selectively imposes inter-reference ordering between memory reference operations issued by a processor of a multiprocessor system to addresses within a page pertaining to a page table entry (PTE) that is affected by a translation buffer (TB) miss flow routine. The TB miss flow is used to retrieve information contained in the PTE for mapping a virtual address to a physical address and, subsequently, to allow retrieval of data at the mapped physical address. The PTE that is retrieved in response to a memory reference (read) operation is not loaded into the TB until a commit-signal associated with that read operation is returned to the processor. Once the PTE and associated commit-signal are returned, the processor loads the PTE into the TB so that it can be used for a subsequent read operation directed to the data at the physical address.
    Type: Grant
    Filed: May 26, 1998
    Date of Patent: September 4, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Simon C. Steely, Jr., Madhumitra Sharma, Stephen R. Van Doren, Kourosh Gharachorloo
  • Patent number: 6279084
    Abstract: The invention pertains to serializing local and remote references to a portion of a shared memory to optimize sequencing of requests in a switch-based, multi-processor system in which the local and remote references can occur concurrently. Usually, local accesses are typically much faster than remote accesses. Thus, in the interest of performance, both local and remote accesses are permitted to occur concurrently in the multiprocessing system. However, in one instance a local access can cause deadlock problems for a remote access. In addition, problems associated with coherency of the shared memory can also arise. Thus, in order to prevent deadlock problems and to maintain coherency of a shared memory, if a local reference to an address of memory has been forwarded to a switch, in this instance a hierarchical switch, then all subsequent references to that address of memory are forwarded to the hierarchical switch. The hierarchical switch has ordering properties that maintain the received order of inputs.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: August 21, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Stephen R. VanDoren, Simon C. Steely, Madhumitra Sharma, Hari Krishnan Nagpal
  • Patent number: 6249520
    Abstract: An architecture and coherency protocol for use in a large SMP computer system includes a hierarchical switch structure which allows for a number of multi-processor nodes to be coupled to the switch to operate at an optimum performance. Within each multi-processor node, a simultaneous buffering system is provided that allows all of the processors of the multi-processor node to operate at peak performance. A memory is shared among the nodes, with a portion of the memory resident at each of the multi-processor nodes. Each of the multi-processor nodes includes a number of elements for maintaining memory coherency, including a victim cache, a directory and a transaction tracking table. The victim cache allows for selective updates of victim data destined for memory stored at a remote multi-processing node, thereby improving the overall performance of memory.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: June 19, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Simon C. Steely, Jr., Stephen R. VanDoren, Madhumitra Sharma, Craig D. Keefer, David W. Davis
  • Patent number: 6209065
    Abstract: A mechanism optimizes the generation of a commit-signal by control logic of the multiprocessor system in response to a memory reference operation issued by a processor to a local node of a multiprocessor system having a hierarchical switch for interconnecting a plurality of nodes. The mechanism generally comprises a structure that indicates whether the memory reference operation affects other processors of other nodes of the multiprocessor system. An ordering point of the local node generates an optimized commit-signal when the structure indicates that the memory reference operation does not affect the other processors.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: March 27, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Stephen R. Van Doren, Simon C. Steely, Jr., Kourosh Gharachorloo, Madhumitra Sharma
  • Patent number: 6202126
    Abstract: A method for preventing inadvertent invalidation of data elements in a system having a separate probe queue and fill queue for each central processing unit, is provided wherein a central processing unit stores a clean data element, that would otherwise have been discarded, in a victim data buffer when it is evicted from cache. The central processing unit subsequently issues a clean-victim command to the system control logic when the readmiss or read-miss-modify command, targeting the data element that maps to the same location in cache as the clean data element, is issued. The clean-victim command causes the duplicate tag store to indicate that the clean data element is no longer stored in that central processing unit's cache. While the data is stored therein, the central processing unit cannot issue a probe message that targets that data until the victim data buffer has been deallocated.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: March 13, 2001
    Assignee: Compaq Computer Corporation
    Inventors: Stephen Van Doren, Simon C. Steely, Jr., Madhumitra Sharma
  • Patent number: 6154816
    Abstract: An architecture and coherency protocol for use in a large SMP computer system includes a hierarchical switch structure which allows for a number of multi-processor nodes to be coupled to the switch to operate at an optimum performance. Within each multi-processor node, a simultaneous buffering system is provided that allows all of the processors of the multi-processor node to operate at peak performance. A memory is shared among the nodes, with a portion of the memory resident at each of the multi-processor nodes. Each of the multi-processor nodes includes a number of elements for maintaining memory coherency, including a victim cache, a directory and a transaction tracking table. The victim cache allows for selective updates of victim data destined for memory stored at a remote multi-processing node, thereby improving the overall performance of memory.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: November 28, 2000
    Assignee: Compaq Computer Corp.
    Inventors: Simon C. Steely, Madhumitra Sharma, Stephen R. VanDoren
  • Patent number: 6122714
    Abstract: An architecture and coherency protocol for use in a large SMP computer system includes a hierarchical switch structure which allows for a number of multi-processor nodes to be coupled to the switch to operate at an optimum performance. Within each multi-processor node, a simultaneous buffering system is provided that allows all of the processors of the multi-processor node to operate at peak performance. A memory is shared among the nodes, with a portion of the memory resident at each of the multi-processor nodes. Each of the multi-processor nodes includes a number of elements for maintaining memory coherency, including a victim cache, a directory and a transaction tracking table. The victim cache allows for selective updates of victim data destined for memory stored at a remote multi-processing node, thereby improving the overall performance of memory.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: September 19, 2000
    Assignee: Compaq Computer Corp.
    Inventors: Stephen R. VanDoren, Simon C. Steely, Madhumitra Sharma, David M. Fenwick
  • Patent number: 6108737
    Abstract: A mechanism reduces the latency of inter-reference ordering between sets of memory reference operations in a multiprocessor system having a shared memory. The mechanism comprises a commit-signal that is generated by control logic of the multiprocessor system in response to an issued memory reference operation. The commit-signal facilitates inter-reference ordering; moreover, the commit signal indicates the apparent completion of the memory reference operation, rather than actual completion of the operation. The apparent completion of an operation occurs substantially sooner than the actual completion of an operation, thereby improving performance of the multiprocessor system.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: August 22, 2000
    Assignee: Compaq Computer Corporation
    Inventors: Madhumitra Sharma, Stephen R. Van Doren, Kourosh Gharachorloo, Simon C. Steely, Jr.
  • Patent number: 6101420
    Abstract: An architecture and coherency protocol for use in a large SMP computer system includes a hierarchical switch structure which allows for a number of multi-processor nodes to be coupled to the switch to operate at an optimum performance. Within each multi-processor node, a simultaneous buffering system is provided that allows all of the processors of the multi-processor node to operate at peak performance. A memory is shared among the nodes, with a portion of the memory resident at each of the multi-processor nodes. Each of the multi-processor nodes includes a number of elements for maintaining memory coherency, including a victim cache, a directory and a transaction tracking table. The victim cache allows for selective updates of victim data destined for memory stored at a remote multi-processing node, thereby improving the overall performance of memory.
    Type: Grant
    Filed: October 24, 1997
    Date of Patent: August 8, 2000
    Assignee: Compaq Computer Corporation
    Inventors: Stephen R. VanDoren, Simon C. Steely, Madhumitra Sharma, Kourosh Gharachorloo