Patents by Inventor Mark S. Moir

Mark S. Moir has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20140215157
    Abstract: A system and method for supporting targeted stores in a shared-memory multiprocessor. A targeted store enables a first processor to push a cache line to be stored in a cache memory of a second processor. This eliminates the need for multiple cache-coherence operations to transfer the cache line from the first processor to the second processor. More specifically, the disclosed embodiments provide a system that notifies a waiting thread when a targeted store is directed to monitored memory locations. During operation, the system receives a targeted store which is directed to a specific cache in a shared-memory multiprocessor system. In response, the system examines a destination address for the targeted store to determine whether the targeted store is directed to a monitored memory location which is being monitored for a thread associated with the specific cache. If so, the system informs the thread about the targeted store.
    Type: Application
    Filed: January 30, 2013
    Publication date: July 31, 2014
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Mark S. Moir, Paul N. Loewenstein, David Dice
  • Patent number: 8789057
    Abstract: Transactional Lock Elision (TLE) may allow multiple threads to concurrently execute critical sections as speculative transactions. Transactions may abort due to various reasons. To avoid starvation, transactions may revert to execution using mutual exclusion when transactional execution fails. Because threads may revert to mutual exclusion in response to the mutual exclusion of other threads, a positive feedback loop may form in times of high congestion, causing a “lemming effect”. To regain the benefits of concurrent transactional execution, the system may allow one or more threads awaiting a given lock to be released from the wait queue and instead attempt transactional execution. A gang release may allow a subset of waiting threads to be released simultaneously. The subset may be chosen dependent on the number of waiting threads, historical abort relationships between threads, analysis of transactions of each thread, sensitivity of each thread to abort, and/or other thread-local or global criteria.
    Type: Grant
    Filed: December 3, 2008
    Date of Patent: July 22, 2014
    Assignee: Oracle America, Inc.
    Inventors: David Dice, Mark S. Moir
  • Patent number: 8776063
    Abstract: Multi-threaded, transactional memory systems may allow concurrent execution of critical sections as speculative transactions. These transactions may abort due to contention among threads. Hardware feedback mechanisms may detect information about aborts and provide that information to software, hardware, or hybrid software/hardware contention management mechanisms. For example, they may detect occurrences of transactional aborts or conditions that may result in transactional aborts, and may update local readable registers or other storage entities (e.g., performance counters) with relevant contention information. This information may include identifying data (e.g., information outlining abort relationships between the processor and other specific physical or logical processors) and/or tallied data (e.g., values of event counters reflecting the number of aborted attempts by the current thread or the resources consumed by those attempts).
    Type: Grant
    Filed: November 26, 2008
    Date of Patent: July 8, 2014
    Assignee: Oracle America, Inc.
    Inventors: David Dice, Kevin E. Moore, Mark S. Moir
  • Publication number: 20140181423
    Abstract: The systems and methods described herein may be used to implement scalable statistics counters suitable for use in systems that employ a NUMA style memory architecture. The counters may be implemented as data structures that include a count value portion and a node identifier portion. The counters may be accessible within transactions. The node identifier portion may identify a node on which a thread that most recently incremented the counter was executing or one on which a thread that has requested priority to increment the shared counter was executing. Threads executing on identified nodes may have higher priority to increment the counter than other threads. Threads executing on other nodes may delay their attempts to increment the counter, thus encouraging consecutive updates from threads on a single node. Impatient threads may attempt to update the node identifier portion or may update an anti-starvation variable to indicate a request for priority.
    Type: Application
    Filed: December 20, 2012
    Publication date: June 26, 2014
    Applicant: Oracle International Corporation
    Inventors: David Dice, Yosef Lev, Mark S. Moir
  • Publication number: 20140181473
    Abstract: The systems and methods described herein may implement probabilistic counters and/or update mechanisms for those counters such that they are dependent on the value of a configurable accuracy parameter. The accuracy parameter value may be adjusted to provide fine-grained control over the tradeoff between the accuracy of the counters and the performance of applications that access them. The counters may be implemented as data structures that include a mantissa portion and an exponent portion that collectively represent an update probability value. When updating the counters, the value of the configurable accuracy parameter may affect whether, when, how often, or by what amount the mantissa portion and/or the exponent portion are updated. Updating a probabilistic counter may include multiplying its value by a constant that is dependent on the value of a configurable accuracy parameter. The counters may be accessible within transactions. The counters may have deterministic update policies.
    Type: Application
    Filed: December 20, 2012
    Publication date: June 26, 2014
    Applicant: Oracle International Corporation
    Inventors: David Dice, Yosef Lev, Mark S. Moir
  • Publication number: 20140181827
    Abstract: The systems and methods described herein may implement scalable statistics counters that are adaptive to the amount of contention for the counters. The counters may be accessible within transactions. Methods for determining whether or when to increment the counters in response to initiation of an increment operation and/or methods for updating the counters may be selected dependent on current, recent, or historical amounts of contention. Various contention management policies or retry conditions may be applied to select between multiple methods. One counter may include a precise counter portion that is incremented under low contention and a probabilistic counter portion that is updated under high contention. Amounts by which probabilistic counters are incremented may be contention-dependent. Another counter may include a node identifier portion that encourages consecutive increments by threads on a single node only when under contention.
    Type: Application
    Filed: December 20, 2012
    Publication date: June 26, 2014
    Applicant: Oracle International Corporation
    Inventors: David Dice, Yosef Lev, Mark S. Moir
  • Publication number: 20140089591
    Abstract: The present embodiments provide a system for supporting targeted stores in a shared-memory multiprocessor. A targeted store enables a first processor to push a cache line to be stored in a cache memory of a second processor in the shared-memory multiprocessor. This eliminates the need for multiple cache-coherence operations to transfer the cache line from the first processor to the second processor. The system includes an interface, such as an application programming interface (API), and a system call interface or an instruction-set architecture (ISA) that provides access to a number of mechanisms for supporting targeted stores. These mechanisms include a thread-location mechanism that determines a location near where a thread is executing in the shared-memory multiprocessor, and a targeted-store mechanism that targets a store to a location (e.g., cache memory) in the shared-memory multiprocessor.
    Type: Application
    Filed: September 24, 2012
    Publication date: March 27, 2014
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Mark S. Moir, David Dice, Paul N. Loewenstein
  • Patent number: 8595446
    Abstract: The transactional memory system described herein may apply a mix of read validation techniques to validate read operations (e.g., invisible reads and/or semi-visible reads) in different transactions, or to validate different read operations within a single transaction (including reads of the same location). The system may include mechanisms to dynamically determine that a read validation technique should be replaced by a different technique for reads of particular locations or for all subsequent reads, and/or to dynamically adjust the balance between different read validation techniques to manage costs. Some of the read validation techniques may be supported by hardware transactional memory (HTM). The system may delay acquisition of ownership records for reading, and may acquire two or more ownership records back-to-back (e.g., within a single hardware transaction). The user code of a software transaction may be divided into multiple segments, some of which may be executed within a hardware transaction.
    Type: Grant
    Filed: November 25, 2009
    Date of Patent: November 26, 2013
    Assignee: Oracle America, Inc.
    Inventors: Yosef Lev, Marek K. Olszewski, Mark S. Moir
  • Patent number: 8560816
    Abstract: Systems and methods described herein for performing incremental register checkpointing may employ a special register to indicate which registers have already been checkpointed. This register may include one bit per register. These systems may also include a special pointer register whose value identifies a location in user memory or in dedicated on-chip storage at which a copy of a register's value should be saved by a checkpointing operation. Only registers modified during speculative execution or execution of a transaction may be checkpointed (e.g., when register modifying instructions are encountered) and subsequently restored (e.g., due to misspeculation or transaction abort), rather than all of the registers of the processor. Each register may be checkpointed at most once for a given speculative episode or atomic transaction. Setting a bit in the special register may prevent checkpointing of the corresponding register. Setting all of the bits in the special register may disable checkpointing.
    Type: Grant
    Filed: June 30, 2010
    Date of Patent: October 15, 2013
    Assignee: Oracle International Corporation
    Inventors: Mark S. Moir, David Dice, Daniel S. Nussbaum, James R. Goodman
  • Patent number: 8533663
    Abstract: Systems and methods for managing divergence of best effort transactional support mechanisms in various transactional memory implementations using a portable transaction interface are described. This interface may be implemented by various combinations of best effort hardware features, including none at all. Because the features offered by this interface may be best effort, a default (e.g., software) implementation may always be possible without the need for special hardware support. Software may be written to the interface, and may be executable on a variety of platforms, taking advantage of best effort hardware features included on each one, while not depending on any particular mechanism. Multiple implementations of each operation defined by the interface may be included in one or more portable transaction interface libraries.
    Type: Grant
    Filed: October 13, 2008
    Date of Patent: September 10, 2013
    Assignee: Oracle America, Inc.
    Inventors: Mark S. Moir, David Dice
  • Patent number: 8533699
    Abstract: Systems and methods for optimizing code may use transactional memory to optimize one code section by forcing another code section to execute atomically. Application source code may be analyzed to identify instructions in one code section that only need to be executed if there exists the possibility that another code section (e.g., a critical section) could be partially executed or that its results could be affected by interference. In response to identifying such instructions, alternate code may be generated that forces the critical section to be executed as an atomic transaction, e.g., using best-effort hardware transactional memory. This alternate code may replace the original code or may be included in an alternate execution path that can be conditionally selected for execution at runtime. The alternate code may elide the identified instructions (which are rendered unnecessary by the transaction) by removing them, or by including them in the alternate execution path.
    Type: Grant
    Filed: March 31, 2011
    Date of Patent: September 10, 2013
    Assignee: Oracle International Corporation
    Inventors: Mark S. Moir, David Dice, Srikanta N. Tirthapura
  • Publication number: 20130198499
    Abstract: A computer system may recognize a busy-wait loop in program instructions at compile time and/or may recognize busy-wait looping behavior during execution of program instructions. The system may recognize that an exit condition for a busy-wait loop is specified by a conditional branch type instruction in the program instructions. In response to identifying the loop and the conditional branch type instruction that specifies its exit condition, the system may influence or override a prediction made by a dynamic branch predictor, resulting in a prediction that the exit condition will be met and that the loop will be exited regardless of any observed branch behavior for the conditional branch type instruction. The looping instructions may implement waiting for an inter-thread communication event to occur or for a lock to become available. When the exit condition is met, the loop may be exited without incurring a misprediction delay.
    Type: Application
    Filed: January 31, 2012
    Publication date: August 1, 2013
    Inventors: David Dice, Mark S. Moir
  • Patent number: 8464261
    Abstract: The transactional memory system described herein may implement parallel co-transactions that access a shared memory such that at most one of the co-transactions in a set will succeed and all others will fail (e.g., be aborted). Co-transactions may improve the performance of programs that use transactional memory by attempting to perform the same high-level operation using multiple algorithmic approaches, transactional memory implementations and/or speculation options in parallel, and allowing only the first to complete to commit its results. If none of the co-transactions succeeds, one or more may be retried, possibly using a different approach and/or transactional memory implementation. The at-most-one property may be managed through the use of a shared “done” flag. Conflicts between co-transactions in a set and accesses made by transactions or activities outside the set may be managed using lazy write ownership acquisition and/or a priority-based approach.
    Type: Grant
    Filed: March 31, 2010
    Date of Patent: June 11, 2013
    Assignee: Oracle International Corporation
    Inventors: Mark S. Moir, Robert E. Cypher, Daniel S. Nussbaum
  • Patent number: 8417897
    Abstract: The system and methods described herein may reduce read/write fence latencies and cache pressure related to STM metadata accesses. These techniques may leverage locality information (as reflected by the value of a respective locale guard) associated with each of a plurality of data partitions (locales) in a shared memory to elide various operations in transactional read/write fences when transactions access data in locales owned by their threads. The locale state may be disabled, free, exclusive, or shared. For a given memory access operation of an atomic transaction targeting an object in the shared memory, the system may implement the memory access operation using a contention mediation mechanism selected based on the value of the locale guard associated with the locale in which the target object resides. For example, a traditional read/write fence may be employed in some memory access operations, while other access operations may employ an optimized read/write fence.
    Type: Grant
    Filed: March 31, 2010
    Date of Patent: April 9, 2013
    Assignee: Oracle International Corporation
    Inventors: Virendra J. Marathe, Mark S. Moir
  • Patent number: 8412894
    Abstract: Solutions to a value recycling problem facilitate implementations of computer programs that may execute as multithreaded computations in multiprocessor computers, as well as implementations of related shared data structures. Some exploitations allow non-blocking, shared data structures to be implemented using standard dynamic allocation mechanisms (such as malloc and free). Some exploitations allow non-blocking, indeed even lock-free or wait-free, implementations of dynamic storage allocation for shared data structures. In some exploitations, our techniques provide a way to manage dynamically allocated memory in a non-blocking manner without depending on garbage collection. While exploitations of solutions to the value recycling problem that we propose include management of dynamic storage allocation wherein values managed and recycled tend to include values that encode pointers, they are not limited thereto.
    Type: Grant
    Filed: February 22, 2011
    Date of Patent: April 2, 2013
    Assignee: Oracle International Corporation
    Inventors: Mark S. Moir, Victor Luchangco, Maurice Herlihy
  • Patent number: 8402227
    Abstract: The system and methods described herein may exploit hardware transactional memory to improve the performance of a software or hybrid transactional memory implementation, even when an entire user transaction cannot be executed within a hardware transaction. The user code of an atomic transaction may be executed within a software transaction, which may collect read and write sets and/or other information about the atomic transaction. A single hardware transaction may be used to commit the atomic transaction by validating the transaction's read set and applying the effects of the user code to memory, reducing the overhead associated with commitment of software transactions. Because the hardware transaction code is carefully controlled, it may be less likely to fail to commit. Various remedial actions may be taken before retrying hardware transactions following some failures. If a transaction exceeds the constraints of the hardware, it may be committed by the software transactional memory alone.
    Type: Grant
    Filed: March 31, 2010
    Date of Patent: March 19, 2013
    Assignee: Oracle International Corporation
    Inventors: Mark S. Moir, Yosef Lev, Daniel S. Nussbaum
  • Patent number: 8402464
    Abstract: Transactional Lock Elision (TLE) may allow threads in a multi-threaded system to concurrently execute critical sections as speculative transactions. Such speculative transactions may abort due to contention among threads. Systems and methods for managing contention among threads may increase overall performance by considering both local and global execution data in reducing, resolving, and/or mitigating such contention. Global data may include aggregated and/or derived data representing thread-local data of remote thread(s), including transactional abort history, abort causal history, resource consumption history, performance history, synchronization history, and/or transactional delay history. Local and/or global data may be used in determining the mode by which critical sections are executed, including TLE and mutual exclusion, and/or to inform concurrency throttling mechanisms. Local and/or global data may also be used in determining concurrency throttling parameters (e.g.
    Type: Grant
    Filed: December 1, 2008
    Date of Patent: March 19, 2013
    Assignee: Oracle America, Inc.
    Inventors: David Dice, Mark S. Moir
  • Publication number: 20120310987
    Abstract: The systems and methods described herein may be used to implement a shared dynamic-sized data structure using hardware transactional memory to simplify and/or improve memory management of the data structure. An application (or thread thereof) may indicate (or register) the intended use of an element of the data structure and may initialize the value of the data structure element. Thereafter, another thread or application may use hardware transactions to access the data structure element while confirming that the data structure element is still part of the dynamic data structure and/or that memory allocated to the data structure element has not been freed. Various indicators may be used determine whether memory allocated to the element element can be freed.
    Type: Application
    Filed: June 23, 2011
    Publication date: December 6, 2012
    Inventors: Aleksandar Dragojevic, Maurice Herlihy, Yosef Lev, Mark S. Moir
  • Publication number: 20120278576
    Abstract: The design of nonblocking linked data structures using single-location synchronization primitives such as compare-and-swap (CAS) is a complex affair that often requires severe restrictions on the way pointers are used. One way to address this problem is to provide stronger synchronization operations, for example, ones that atomically modify one memory location while simultaneously verifying the contents of others. We provide a simple and highly efficient nonblocking implementation of such an operation: an atomic k-word-compare single-swap operation (KCSS). Our implementation is obstruction-free. As a result, it is highly efficient in the uncontended case and relies on contention management mechanisms in the contended cases. It allows linked data structure manipulation without the complexity and restrictions of other solutions.
    Type: Application
    Filed: July 6, 2012
    Publication date: November 1, 2012
    Inventors: Nir N. Shavit, Mark S. Moir, Victor M. Luchangco
  • Publication number: 20120254846
    Abstract: Systems and methods for optimizing code may use transactional memory to optimize one code section by forcing another code section to execute atomically. Application source code may be analyzed to identify instructions in one code section that only need to be executed if there exists the possibility that another code section (e.g., a critical section) could be partially executed or that its results could be affected by interference. In response to identifying such instructions, alternate code may be generated that forces the critical section to be executed as an atomic transaction, e.g., using best-effort hardware transactional memory. This alternate code may replace the original code or may be included in an alternate execution path that can be conditionally selected for execution at runtime. The alternate code may elide the identified instructions (which are rendered unnecessary by the transaction) by removing them, or by including them in the alternate execution path.
    Type: Application
    Filed: March 31, 2011
    Publication date: October 4, 2012
    Inventors: Mark S. Moir, David Dice, Srikanta N. Tirthapura