Patents by Inventor Paul N. Loewenstein
Paul N. Loewenstein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20140281237Abstract: A method for cache coherence, including: broadcasting, by a requester cache (RC) over a partially-ordered request network (RN), a peer-to-peer (P2P) request for a cacheline to a plurality of slave caches; receiving, by the RC and over the RN while the P2P request is pending, a forwarded request for the cacheline from a gateway; receiving, by the RC and after receiving the forwarded request, a plurality of responses to the P2P request from the plurality of slave caches; setting an intra-processor state of the cacheline in the RC, wherein the intra-processor state also specifies an inter-processor state of the cacheline; and issuing, by the RC, a response to the forwarded request after setting the intra-processor state and after the P2P request is complete; and modifying, by the RC, the intra-processor state in response to issuing the response to the forwarded request.Type: ApplicationFiled: March 14, 2013Publication date: September 18, 2014Applicant: ORACLE INTERNATIONAL CORPORATIONInventors: Paul N. Loewenstein, Stephen E. Phillips, David Richard Smentek, Connie Wai Mun Cheung, Serena Wing Yee Leung, Damien Walker, Ramaswamy Sivaramakrishnan
-
Publication number: 20140279894Abstract: A method and apparatus are disclosed for enabling nodes in a distributed system to share one or more memory portions. A home node makes a portion of its main memory available for sharing, and one or more sharer nodes mirrors that shared portion of the home node's main memory in its own main memory. To maintain memory coherency, a memory coherence protocol is implemented. Under this protocol, a special data value is used to indicate that data in a mirrored memory location is not valid. This enables a sharer node to know when to obtain valid data from a home node. With this protocol, valid data is obtained from the home node and updates are propagated to the home node. Thus, no “dirty” data is transferred between sharer nodes. Consequently, the failure of one node will not cause the failure of another node or the failure of the entire system.Type: ApplicationFiled: March 14, 2013Publication date: September 18, 2014Applicant: Oracle International CorporationInventor: PAUL N. LOEWENSTEIN
-
Patent number: 8812935Abstract: A system for detecting an address or data error in a memory system. During operation, the system stores a data block to an address by: calculating a hash of the address; using the calculated hash and data bits from the data block to compute ECC check bits; and storing the data block containing the data bits and the ECC check bits at the address. During a subsequent retrieval operation, the memory system uses the address to retrieve the data block containing the data bits and ECC check bits. Next, the system calculates a hash of the address and uses the calculated hash and the data bits to compute ECC check bits. Finally, the system compares the computed ECC check bits with the retrieved ECC check bits to determine whether an error exists in the address or data bits, or if a data corruption indicator is set.Type: GrantFiled: August 2, 2012Date of Patent: August 19, 2014Assignee: Oracle International CorporationInventor: Paul N. Loewenstein
-
Publication number: 20140215157Abstract: A system and method for supporting targeted stores in a shared-memory multiprocessor. A targeted store enables a first processor to push a cache line to be stored in a cache memory of a second processor. This eliminates the need for multiple cache-coherence operations to transfer the cache line from the first processor to the second processor. More specifically, the disclosed embodiments provide a system that notifies a waiting thread when a targeted store is directed to monitored memory locations. During operation, the system receives a targeted store which is directed to a specific cache in a shared-memory multiprocessor system. In response, the system examines a destination address for the targeted store to determine whether the targeted store is directed to a monitored memory location which is being monitored for a thread associated with the specific cache. If so, the system informs the thread about the targeted store.Type: ApplicationFiled: January 30, 2013Publication date: July 31, 2014Applicant: ORACLE INTERNATIONAL CORPORATIONInventors: Mark S. Moir, Paul N. Loewenstein, David Dice
-
Publication number: 20140181420Abstract: A system includes a number of processors with each processor including a cache memory. The system also includes a number of directory controllers coupled to the processors. Each directory controller may be configured to administer a corresponding cache coherency directory. Each cache coherency directory may be configured to track a corresponding set of memory addresses. Each processor may be configured with information indicating the corresponding set of memory addresses tracked by each cache coherency directory. Directory redundancy operations in such a system may include identifying a failure of one of the cache coherency directories; reassigning the memory address set previously tracked by the failed cache coherency directory among the non-failed cache coherency directories; and reconfiguring each processor with information describing the reassignment of the memory address set among the non-failed cache coherency directories.Type: ApplicationFiled: February 4, 2013Publication date: June 26, 2014Applicant: ORACLE INTERNATIONAL CORPORATIONInventors: Thomas M. Wicki, Stephen E. Phillips, Nicholas E. Aneshansley, Ramaswamy Sivaramakrishnan, Paul N. Loewenstein
-
Patent number: 8756378Abstract: A method for managing caches, including: broadcasting, by a first cache agent operatively connected to a first cache and using a first physical network, a first peer-to-peer (P2P) request for a memory address; issuing, by a second cache agent operatively connected to a second cache and using a second physical network, a first response to the first P2P request based on a type of the first P2P request and a state of a cacheline in the second cache corresponding to the memory address; issuing, by a third cache agent operatively connected to a third cache, a second response to the first P2P request; and upgrading, by the first cache agent and based on the first response and the second response, a state of a cacheline in the first cache corresponding to the memory address.Type: GrantFiled: February 17, 2011Date of Patent: June 17, 2014Assignee: Oracle International CorporationInventor: Paul N. Loewenstein
-
Publication number: 20140095810Abstract: A method and apparatus are disclosed for enabling nodes in a distributed system to share one or more memory portions. A home node makes a portion of its main memory available for sharing, and one or more sharer nodes mirrors that shared portion of the home node's main memory in its own main memory. To maintain memory coherency, a memory coherence protocol is implemented. Under this protocol, load and store instructions that target the mirrored memory portion of a sharer node are trapped, and store instructions that target the shared memory portion of a home node are trapped. With this protocol, valid data is obtained from the home node and updates are propagated to the home node. Thus, no “dirty” data is transferred between sharer nodes. As a result, the failure of one node will not cause the failure of another node or the failure of the entire system.Type: ApplicationFiled: March 14, 2013Publication date: April 3, 2014Applicant: Oracle International CorporationInventors: PAUL N. LOEWENSTEIN, John G. Johnson, Kathirgamar Aingaran, Zoran Radovic
-
Publication number: 20140089591Abstract: The present embodiments provide a system for supporting targeted stores in a shared-memory multiprocessor. A targeted store enables a first processor to push a cache line to be stored in a cache memory of a second processor in the shared-memory multiprocessor. This eliminates the need for multiple cache-coherence operations to transfer the cache line from the first processor to the second processor. The system includes an interface, such as an application programming interface (API), and a system call interface or an instruction-set architecture (ISA) that provides access to a number of mechanisms for supporting targeted stores. These mechanisms include a thread-location mechanism that determines a location near where a thread is executing in the shared-memory multiprocessor, and a targeted-store mechanism that targets a store to a location (e.g., cache memory) in the shared-memory multiprocessor.Type: ApplicationFiled: September 24, 2012Publication date: March 27, 2014Applicant: ORACLE INTERNATIONAL CORPORATIONInventors: Mark S. Moir, David Dice, Paul N. Loewenstein
-
Publication number: 20140075163Abstract: Techniques are disclosed relating to suspending execution of a processor thread while monitoring for a write to a specified memory location. An execution subsystem may be configured to perform a load instruction that causes the processor to retrieve data from a specified memory location and atomically begin monitoring for a write to the specified location. The load instruction may be a load-monitor instruction. The execution subsystem may be further configured to perform a wait instruction that causes the processor to suspend execution of a processor thread during at least a portion of an interval specified by the wait instruction and to resume execution of the processor thread at the end of the interval. The wait instruction may be a monitor-wait instruction. The processor may be further configured to resume execution of the processor thread in response to detecting a write to a memory location specified by a previous monitor instruction.Type: ApplicationFiled: September 7, 2012Publication date: March 13, 2014Inventors: Paul N. Loewenstein, Mark A. Luttrell, Paul J. Jordan
-
Publication number: 20140040526Abstract: Systems and methods for efficient data transport across multiple processors when link utilization is congested. In a multi-node system, each of the nodes measures a congestion level for each of the one or more links connected to it. A source node indicates when each of one or more links to a destination node is congested or each non-congested link is unable to send a particular packet type. In response, the source node sets an indication that it is a candidate for seeking a data forwarding path to send a packet of the particular packet type to the destination node. The source node uses measured congestion levels received from other nodes to search for one or more intermediate nodes. An intermediate node in a data forwarding path has non-congested links for data transport. The source node reroutes data to the destination node through the data forwarding path.Type: ApplicationFiled: July 31, 2012Publication date: February 6, 2014Inventors: Bruce J. Chang, Sebastian Turullols, Brian F. Keish, Damien Walker, Ramaswamy Sivaramakrishnan, Paul N. Loewenstein
-
Publication number: 20140040697Abstract: A system for detecting an address or data error in a memory system. During operation, the system stores a data block to an address by: calculating a hash of the address; using the calculated hash and data bits from the data block to compute ECC check bits; and storing the data block containing the data bits and the ECC check bits at the address. During a subsequent retrieval operation, the memory system uses the address to retrieve the data block containing the data bits and ECC check bits. Next, the system calculates a hash of the address and uses the calculated hash and the data bits to compute ECC check bits. Finally, the system compares the computed ECC check bits with the retrieved ECC check bits to determine whether an error exists in the address or data bits, or if a data corruption indicator is set.Type: ApplicationFiled: August 2, 2012Publication date: February 6, 2014Applicant: ORACLE INTERNATIONAL CORPORATIONInventor: Paul N. Loewenstein
-
Publication number: 20120215987Abstract: A method for managing caches, including: broadcasting, by a first cache agent operatively connected to a first cache and using a first physical network, a first peer-to-peer (P2P) request for a memory address; issuing, by a second cache agent operatively connected to a second cache and using a second physical network, a first response to the first P2P request based on a type of the first P2P request and a state of a cacheline in the second cache corresponding to the memory address; issuing, by a third cache agent operatively connected to a third cache, a second response to the first P2P request; and upgrading, by the first cache agent and based on the first response and the second response, a state of a cacheline in the first cache corresponding to the memory address.Type: ApplicationFiled: February 17, 2011Publication date: August 23, 2012Applicant: ORACLE INTERNATIONAL CORPORATIONInventor: Paul N. Loewenstein
-
Patent number: 7680989Abstract: We propose a class of mechanisms to support a new style of synchronization that offers simple and efficient solutions to several existing problems for which existing solutions are complicated, expensive, and/or otherwise inadequate. In general, the proposed mechanisms allow a program to read from a first memory location (called the “flagged” location), and to then continue execution, storing values to zero or more other memory locations such that these stores take effect (i.e., become visible in the memory system) only while the flagged memory location does not change. In some embodiments, the mechanisms further allow the program to determine when the first memory location has changed. We call the proposed mechanisms conditional multi-store synchronization mechanisms and define aspects of an instruction set architecture consistent therewith.Type: GrantFiled: August 17, 2006Date of Patent: March 16, 2010Assignee: Sun Microsystems, Inc.Inventors: Mark S. Moir, Robert E. Cypher, Paul N. Loewenstein
-
Patent number: 7657710Abstract: A system may include a processor node, and may also include an input/output (I/O) node including a processor and an I/O device. The processor and I/O nodes may each include a respective cache memory configured to cache a system memory and a respective cache coherence controller. The system may further include interconnect through which the nodes may communicate. In response to detecting a request for the I/O device to perform a DMA write operation to a coherence unit of the I/O node's respective cache memory, and in response to determining that the coherence unit is not modified with respect to the system memory and no other cache memory within the system has read or write permission corresponding to a copy of the coherence unit, the I/O node's respective cache coherence controller may grant write permission but not read permission for the coherence unit to the I/O node's respective cache memory.Type: GrantFiled: November 17, 2006Date of Patent: February 2, 2010Assignee: Sun Microsystems, Inc.Inventor: Paul N. Loewenstein
-
Patent number: 7480771Abstract: We propose a class of mechanisms to support a new style of synchronization that offers simple and efficient solutions to several existing problems for which existing solutions are complicated, expensive, and/or otherwise inadequate. In general, the proposed mechanisms allow a program to read from a first memory location (called the “flagged” location), and to then continue execution, storing values to zero or more other memory locations such that these stores take effect (i.e., become visible in the memory system) only while the flagged memory location does not change. In some embodiments, the mechanisms further allow the program to determine when the first memory location has changed. We call the proposed mechanisms conditional multi-store synchronization mechanisms.Type: GrantFiled: August 17, 2006Date of Patent: January 20, 2009Assignee: Sun Microsystems, Inc.Inventors: Mark S. Moir, Robert E. Cypher, Paul N. Loewenstein
-
Patent number: 7412567Abstract: In one embodiment, a processor comprises a coherence trap unit and a trap logic coupled to the coherence trap unit. The coherence trap unit is also coupled to receive data accessed in response to the processor executing a memory operation. The coherence trap unit is configured to detect that the data matches a designated value indicating that a coherence trap is to be initiated to coherently perform the memory operation. The trap logic is configured to trap to a designated software routine responsive to the coherence trap unit detecting the designated value. In some embodiments, a cache tag in a cache may track whether or not the corresponding cache line has the designated value, and the cache tag may be used to trigger a trap in response to an access to the corresponding cache line.Type: GrantFiled: April 28, 2006Date of Patent: August 12, 2008Assignee: Sun Microsystems, Inc.Inventors: Håkan E. Zeffer, Erik E. Hagersten, Anders Landin, Shailender Chaudhry, Paul N. Loewenstein, Robert E. Cypher, Zoran Radovic
-
Publication number: 20080120441Abstract: A system may include a processor node, and may also include an input/output (I/O) node including a processor and an I/O device. The processor and I/O nodes may each include a respective cache memory configured to cache a system memory and a respective cache coherence controller. The system may further include interconnect through which the nodes may communicate. In response to detecting a request for the I/O device to perform a DMA write operation to a coherence unit of the I/O node's respective cache memory, and in response to determining that the coherence unit is not modified with respect to the system memory and no other cache memory within the system has read or write permission corresponding to a copy of the coherence unit, the I/O node's respective cache coherence controller may grant write permission but not read permission for the coherence unit to the I/O node's respective cache memory.Type: ApplicationFiled: November 17, 2006Publication date: May 22, 2008Inventor: Paul N. Loewenstein
-
Patent number: 7325101Abstract: Cache lines stored in an on-chip cache memory are associated with one or more state bits that indicate whether data stored in the cache lines was sourced from an off-chip cache memory or a main memory. By keeping track of the source of cache lines in the on-chip cache memory and by designing the replacement algorithm of the on-chip cache memory such that only one line in a given set maps into an off-cache memory cache line, the frequency of off-chip cache memory accesses may be greatly reduced, thereby improving performance and efficiency.Type: GrantFiled: May 31, 2005Date of Patent: January 29, 2008Assignee: Sun Microsystems, Inc.Inventors: Sorin Iacobovici, Paul N. Loewenstein
-
Patent number: 5983326Abstract: A multiprocessing system having a plurality of processing nodes interconnected by an interconnect network. A home agent is configured to service multiple requests simultaneously. A transaction blocking unit is coupled to a home agent control unit for preventing the servicing of a pending coherent transaction request if another transaction request corresponding to the same coherency unit is already being serviced by the home agent control unit. The transaction blocking unit is further configured such that read-to-share transaction requests in a NUMA mode do not block other read-to-share transaction requests in the NUMA mode.Type: GrantFiled: July 1, 1996Date of Patent: November 9, 1999Assignee: Sun Microsystems, Inc.Inventors: Erik E. Hagersten, Paul N. Loewenstein
-
Patent number: 5950226Abstract: A multiprocessing computer system employing a three-hop communications protocol. When a request is sent by a requesting node to a home node, the home node sends read and/or invalidate demands to any slave nodes holding cached copies of the requested data. The demands from the home node to the slave nodes may each advantageously include a value indicative of the number of replies the requesting agent should expect to receive. The slaves reply back to the requesting node with either data or an acknowledge. Each reply may further include the number of replies the requester should expect. Upon receiving all expected replies, the requesting node may send a completion message back to the home and may treat the transaction as completed and proceed with subsequent processing.Type: GrantFiled: July 1, 1996Date of Patent: September 7, 1999Assignee: Sun Microsystems, Inc.Inventors: Erik E. Hagersten, Paul N. Loewenstein