Patents by Inventor Mark Luttrell
Mark Luttrell has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20200218683
Abstract: A reconfigurable data processor comprises an array of configurable units and a bus system configurable to define virtual machines. The system can partition the array of configurable units into a plurality of sets of configurable units, and block communications via the bus system between configurable units within a particular set and configurable units outside the particular set. A memory access controller can be connected to the bus system, configurable to confine access to memory outside the array of configurable units originating from within the particular set to memory space allocated to the particular set.
Type: Application
Filed: January 3, 2019
Publication date: July 9, 2020
Applicant: SambaNova Systems, Inc.
Inventors: Gregory Frederick Grohoski, Sumti Jairath, Mark Luttrell, Raghu Prabhakar, Ram Sivaramakrishnan, Manish K. Shah
-
Patent number: 10698853
Abstract: A reconfigurable data processor comprises an array of configurable units and a bus system configurable to define virtual machines. The system can partition the array of configurable units into a plurality of sets of configurable units, and block communications via the bus system between configurable units within a particular set and configurable units outside the particular set. A memory access controller can be connected to the bus system, configurable to confine access to memory outside the array of configurable units originating from within the particular set to memory space allocated to the particular set.
Type: Grant
Filed: January 3, 2019
Date of Patent: June 30, 2020
Assignee: SambaNova Systems, Inc.
Inventors: Gregory Frederick Grohoski, Sumti Jairath, Mark Luttrell, Raghu Prabhakar, Ram Sivaramakrishnan, Manish K. Shah
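As a rough illustration of the isolation this abstract describes, the Python sketch below models the partitioning as a map from configurable unit to set, plus one memory window per set. The class name, method names, and half-open address ranges are hypothetical and do not come from the patent. It is a toy model under those assumptions, not the claimed hardware.

```python
class PartitionedArray:
    """Toy model of an array of units split into isolated sets (hypothetical names)."""

    def __init__(self, unit_to_set, memory_ranges):
        # unit_to_set: unit id -> set id (the partition of the array)
        # memory_ranges: set id -> (base, limit), a half-open window of external memory
        self.unit_to_set = unit_to_set
        self.memory_ranges = memory_ranges

    def can_communicate(self, src, dst):
        # Bus traffic is allowed only between units in the same set.
        return self.unit_to_set[src] == self.unit_to_set[dst]

    def can_access(self, unit, address):
        # External memory access is confined to the window allocated to the unit's set.
        base, limit = self.memory_ranges[self.unit_to_set[unit]]
        return base <= address < limit
```

In this model, two units assigned to set "A" can exchange messages, while a unit in set "A" can neither reach a unit in set "B" nor address memory outside the window given to "A".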
-
Publication number: 20200159692
Abstract: A reconfigurable data processor comprises a bus system, and an array of configurable units connected to the bus system, configurable units in the array including configuration data stores to store unit files comprising a plurality of sub-files of configuration data particular to the corresponding configurable units. Configurable units in the plurality of configurable units each include logic to execute a unit configuration load process, including receiving via the bus system, sub-files of a unit file particular to the configurable unit, and loading the received sub-files into the configuration store of the configurable unit. A configuration load controller connected to the bus system, including logic to execute an array configuration load process, including distributing a configuration file comprising unit files for a plurality of the configurable units in the array.
Type: Application
Filed: November 21, 2018
Publication date: May 21, 2020
Applicant: SambaNova Systems, Inc.
Inventors: Manish K. Shah, Ram Sivaramakrishnan, Mark Luttrell, David Brian Jackson, Raghu Prabhakar, Sumti Jairath, Gregory Frederick Grohoski, Pramod Nataraja
-
Publication number: 20200159544
Abstract: A reconfigurable data processor comprises a bus system, and an array of configurable units connected to the bus system, configurable units in the array including configuration data stores to store unit files comprising a plurality of sub-files of configuration data particular to the corresponding configurable units. Configurable units in the plurality of configurable units each include logic to execute a unit configuration load process, including receiving via the bus system, sub-files of a unit file particular to the configurable unit, and loading the received sub-files into the configuration store of the configurable unit. A configuration load controller connected to the bus system, including logic to execute an array configuration load process, including distributing a configuration file comprising unit files for a plurality of the configurable units in the array.
Type: Application
Filed: November 21, 2018
Publication date: May 21, 2020
Applicant: SambaNova Systems, Inc.
Inventors: Manish K. Shah, Ram Sivaramakrishnan, Mark Luttrell, David Brian Jackson, Raghu Prabhakar, Sumti Jairath, Gregory Frederick Grohoski, Pramod Nataraja
-
Publication number: 20200073835
Abstract: A system is disclosed, including a plurality of access units, a plurality of circuit nodes each coupled to a respective access unit, and a plurality of data processing nodes each coupled to a respective access unit. A particular data processing node may be configured to generate a plurality of data transactions. The particular data processing node may also be configured to determine an availability of a coupled access unit. In response to a determination that the coupled access unit is unavailable, the particular data processing node may be configured to halt a transfer of the plurality of data transactions to the coupled access unit and assert a halt indicator signal. In response to a determination that the coupled access unit is available, the particular data processing node may be configured to transfer the particular data transaction to the coupled access unit.
Type: Application
Filed: November 8, 2019
Publication date: March 5, 2020
Inventors: Robert Golla, Manish Shah, Mark Luttrell
-
Patent number: 10474601
Abstract: A system is disclosed, including a plurality of access units, a plurality of circuit nodes each coupled to a respective access unit, and a plurality of data processing nodes each coupled to a respective access unit. A particular data processing node may be configured to generate a plurality of data transactions. The particular data processing node may also be configured to determine an availability of a coupled access unit. In response to a determination that the coupled access unit is unavailable, the particular data processing node may be configured to halt a transfer of the plurality of data transactions to the coupled access unit and assert a halt indicator signal. In response to a determination that the coupled access unit is available, the particular data processing node may be configured to transfer the particular data transaction to the coupled access unit.
Type: Grant
Filed: February 6, 2017
Date of Patent: November 12, 2019
Assignee: Oracle International Corporation
Inventors: Robert Golla, Manish Shah, Mark Luttrell
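The halt behavior in this abstract can be sketched as a simple software model. In the Python below, a processing node checks whether its access unit is available before each transfer and raises a halt flag when it is not. The class names, the `capacity` parameter, and the one-transaction-per-step policy are assumptions made for illustration; the patent describes hardware signals, not this API.

```python
from collections import deque


class AccessUnit:
    """Hypothetical access unit that can hold a fixed number of transactions."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.accepted = []

    def available(self):
        return len(self.accepted) < self.capacity


class ProcessingNode:
    """Hypothetical data processing node coupled to one access unit."""

    def __init__(self, access_unit):
        self.access_unit = access_unit
        self.pending = deque()  # generated transactions not yet transferred
        self.halt = False       # models the halt indicator signal

    def generate(self, transactions):
        self.pending.extend(transactions)

    def step(self):
        # Attempt to transfer at most one pending transaction.
        if not self.pending:
            return
        if self.access_unit.available():
            self.halt = False
            self.access_unit.accepted.append(self.pending.popleft())
        else:
            # Unit unavailable: keep the transaction and assert the halt indicator.
            self.halt = True
```

A node with two pending transactions and a one-entry access unit transfers the first, halts on the second, and resumes once the unit drains.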
-
Publication number: 20180225239
Abstract: A system is disclosed, including a plurality of access units, a plurality of circuit nodes each coupled to a respective access unit, and a plurality of data processing nodes each coupled to a respective access unit. A particular data processing node may be configured to generate a plurality of data transactions. The particular data processing node may also be configured to determine an availability of a coupled access unit. In response to a determination that the coupled access unit is unavailable, the particular data processing node may be configured to halt a transfer of the plurality of data transactions to the coupled access unit and assert a halt indicator signal. In response to a determination that the coupled access unit is available, the particular data processing node may be configured to transfer the particular data transaction to the coupled access unit.
Type: Application
Filed: February 6, 2017
Publication date: August 9, 2018
Inventors: Robert Golla, Manish Shah, Mark Luttrell
-
Patent number: 9971565
Abstract: Random numbers within a processor may be scarce, especially when multiple hardware threads are consuming them. A local random number buffer can be used by an execution core to better manage allocation and consumption of random numbers. The buffer may operate in a number of modes, and allow any hardware thread to use a random number under some conditions. In other conditions, only certain hardware threads may be allowed to consume a random number. The local random number buffer may have a dynamic pool of entries usable by any hardware thread, as well as reserved entries usable by only particular hardware threads. Further, a user-level instruction is disclosed that can be stored in a wait queue in response to a random number being unavailable, rather than having the instruction's request for a random number simply be denied. The random number buffer may also boost performance and reduce latency.
Type: Grant
Filed: May 7, 2015
Date of Patent: May 15, 2018
Assignee: Oracle International Corporation
Inventors: John Pape, Mark Luttrell, Paul Jordan, Michael Snyder
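One way to picture the buffer in this abstract is a shared pool plus per-thread reserved entries, with a wait queue for requests that cannot be served. The Python sketch below is a toy model only. The class and method names are hypothetical, and the policy choices (reserved entries are consumed before shared ones, and a refill goes to the oldest waiter first) are assumptions the abstract does not specify.

```python
from collections import deque


class RandomNumberBuffer:
    """Toy model of a local random number buffer (hypothetical names and policy)."""

    def __init__(self, thread_ids):
        self.shared = deque()                              # dynamic pool, any thread
        self.reserved = {t: deque() for t in thread_ids}   # entries for one thread only
        self.waiting = deque()                             # threads with queued requests

    def reserve(self, thread_id, value):
        # Place a random number in an entry reserved for one thread.
        self.reserved[thread_id].append(value)

    def refill(self, value):
        # Assumed policy: hand a new number to the oldest waiter, else pool it.
        if self.waiting:
            return (self.waiting.popleft(), value)
        self.shared.append(value)
        return None

    def request(self, thread_id):
        # Assumed policy: reserved entries first, then the shared pool.
        if self.reserved[thread_id]:
            return self.reserved[thread_id].popleft()
        if self.shared:
            return self.shared.popleft()
        # Nothing available: queue the request instead of denying it.
        self.waiting.append(thread_id)
        return None
```

A request that finds both pools empty returns `None` and leaves the thread in `waiting`; the next `refill` call returns a `(thread_id, value)` pair for that thread instead of adding to the shared pool.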
-
Patent number: 9940132
Abstract: Techniques are disclosed relating to suspending execution of a processor thread while monitoring for a write to a specified memory location. An execution subsystem may be configured to perform a load instruction that causes the processor to retrieve data from a specified memory location and atomically begin monitoring for a write to the specified location. The load instruction may be a load-monitor instruction. The execution subsystem may be further configured to perform a wait instruction that causes the processor to suspend execution of a processor thread during at least a portion of an interval specified by the wait instruction and to resume execution of the processor thread at the end of the interval. The wait instruction may be a monitor-wait instruction. The processor may be further configured to resume execution of the processor thread in response to detecting a write to a memory location specified by a previous monitor instruction.
Type: Grant
Filed: December 14, 2015
Date of Patent: April 10, 2018
Assignee: Oracle International Corporation
Inventors: Paul N. Loewenstein, Mark A. Luttrell, Paul J. Jordan
-
Patent number: 9892039
Abstract: A method and apparatus for performing non-temporal write combining using existing cache resources is disclosed. In one embodiment, a method includes executing a first thread on a processor core, the first thread including a first block initialization store (BIS) instruction. A cache query may be performed responsive to the BIS instruction, and if the query results in a cache miss, a cache line may be installed in a cache in an unordered dirty state in which it is exclusively owned by the first thread. The first BIS instruction and one or more additional BIS instructions may write data from the first processor core into the first cache line. After a cache coherence response is received, the state of the first cache line may be changed to an ordered dirty state in which it is no longer exclusive to the first thread.
Type: Grant
Filed: April 21, 2015
Date of Patent: February 13, 2018
Assignee: Oracle International Corporation
Inventors: Mark Luttrell, David Smentek, Ramaswamy Sivaramakrishnan, Serena Leung
-
Patent number: 9672298
Abstract: Techniques for executing versioned memory access instructions. In one embodiment, a processor is configured to execute versioned store instructions of a first thread within a first mode of operation. In this embodiment, in the first mode of operation, the processor is configured to retire a versioned store instruction only after a version comparison has been performed for the versioned store instruction. In this embodiment the processor is configured to suppress retirement of instructions in the first thread that are younger than an oldest versioned store instruction until the oldest versioned store instruction has retired. In some embodiments, the processor is configured to execute versioned store instructions of a given thread within a second mode of operation, in which the processor is configured to retire outstanding versioned store instructions before a version comparison has been performed.
Type: Grant
Filed: May 1, 2014
Date of Patent: June 6, 2017
Assignee: Oracle International Corporation
Inventors: Zoran Radovic, Jared C. Smolens, Robert T. Golla, Paul J. Jordan, Mark A. Luttrell
-
Patent number: 9665375
Abstract: Systems and methods for efficient thread arbitration in a threaded processor with dynamic resource allocation. A processor includes a resource shared by multiple threads. The resource includes an array with multiple entries, each of which may be allocated for use by any thread. Control logic detects a load miss to memory, wherein the miss is associated with a latency greater than a given threshold. The load instruction or an immediately younger instruction is selected for replay for an associated thread. A pipeline flush and replay for the associated thread begins with the selected instruction. Instructions younger than the load instruction are held at a given pipeline stage until the load instruction completes. During replay, this hold prevents resources from being allocated to the associated thread while the load instruction is being serviced.
Type: Grant
Filed: April 26, 2012
Date of Patent: May 30, 2017
Assignee: Oracle International Corporation
Inventors: Yuan C. Chou, Robert T. Golla, Mark A. Luttrell
-
Publication number: 20160328209
Abstract: Random numbers within a processor may be scarce, especially when multiple hardware threads are consuming them. A local random number buffer can be used by an execution core to better manage allocation and consumption of random numbers. The buffer may operate in a number of modes, and allow any hardware thread to use a random number under some conditions. In other conditions, only certain hardware threads may be allowed to consume a random number. The local random number buffer may have a dynamic pool of entries usable by any hardware thread, as well as reserved entries usable by only particular hardware threads. Further, a user-level instruction is disclosed that can be stored in a wait queue in response to a random number being unavailable, rather than having the instruction's request for a random number simply be denied. The random number buffer may also boost performance and reduce latency.
Type: Application
Filed: May 7, 2015
Publication date: November 10, 2016
Inventors: John Pape, Mark Luttrell, Paul Jordan, Michael Snyder
-
Publication number: 20160314069
Abstract: A method and apparatus for performing non-temporal write combining using existing cache resources is disclosed. In one embodiment, a method includes executing a first thread on a processor core, the first thread including a first block initialization store (BIS) instruction. A cache query may be performed responsive to the BIS instruction, and if the query results in a cache miss, a cache line may be installed in a cache in an unordered dirty state in which it is exclusively owned by the first thread. The first BIS instruction and one or more additional BIS instructions may write data from the first processor core into the first cache line. After a cache coherence response is received, the state of the first cache line may be changed to an ordered dirty state in which it is no longer exclusive to the first thread.
Type: Application
Filed: April 21, 2015
Publication date: October 27, 2016
Inventors: Mark Luttrell, David Smentek, Ramaswamy Sivaramakrishnan, Serena Leung
-
Patent number: 9405690
Abstract: A processor may include a cache configured to store instructions and memory data for the processor. The cache may store instructions in which a relative address, such as for a branch instruction has been calculated, such that the instruction stored in the cache is modified from how the instruction is stored in main memory. The cache may include additional information in the tag to identify an instruction entry versus a memory data entry. When receiving a cache request, the cache may look at a type tag in addition to an address tag to determine if the request is a hit or a miss based upon the request being for an instruction from an instruction fetch unit or for memory data from a memory management unit. A cache entry may be invalidated and evicted if the address matches but the data type does not match.
Type: Grant
Filed: August 7, 2013
Date of Patent: August 2, 2016
Assignee: Oracle International Corporation
Inventor: Mark A. Luttrell
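The hit, miss, and evict rule in this abstract can be shown with a small dictionary-backed model. In the Python sketch below, each entry carries a type tag next to its address, and a lookup with the right address but the wrong type invalidates the entry. The class name, the string type tags, and the one-entry-per-address layout are hypothetical simplifications, not details from the patent.

```python
class TypedCache:
    """Toy unified cache whose entries carry a type tag (hypothetical names)."""

    def __init__(self):
        self.lines = {}  # address -> (kind, payload), kind is "instruction" or "data"

    def fill(self, address, kind, payload):
        self.lines[address] = (kind, payload)

    def lookup(self, address, kind):
        entry = self.lines.get(address)
        if entry is None:
            return None  # miss: no entry for this address
        stored_kind, payload = entry
        if stored_kind != kind:
            # Address matches but type does not: invalidate and evict, report a miss.
            del self.lines[address]
            return None
        return payload  # hit: both address tag and type tag match
```

Under this model, an instruction fetch finds a line filled as "instruction", while a data request to the same address misses and removes the line, so a later instruction fetch to that address also misses until the line is refilled.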
-
Patent number: 9367472
Abstract: Systems and methods for reliably using data storage media. Multiple processors are configured to access a persistent memory. For a given data block corresponding to a write access request from a first processor to the persistent memory, a cache controller prevents any read access of a copy of the given data block in an associated cache. The cache controller prevents any read access while detecting an acknowledgment that the given data block is stored in the persistent memory is not yet received. Until the acknowledgment is received, the cache controller allows write access of the copy of the given data block in the associated cache only for a thread in the first processor that originally sent the write access request. The cache controller invalidates any copy of the given data block in any cache levels below the associated cache.
Type: Grant
Filed: June 10, 2013
Date of Patent: June 14, 2016
Assignee: Oracle International Corporation
Inventors: William H. Bridge, Jr., Paul Loewenstein, Mark A. Luttrell
-
Publication number: 20160098274
Abstract: Techniques are disclosed relating to suspending execution of a processor thread while monitoring for a write to a specified memory location. An execution subsystem may be configured to perform a load instruction that causes the processor to retrieve data from a specified memory location and atomically begin monitoring for a write to the specified location. The load instruction may be a load-monitor instruction. The execution subsystem may be further configured to perform a wait instruction that causes the processor to suspend execution of a processor thread during at least a portion of an interval specified by the wait instruction and to resume execution of the processor thread at the end of the interval. The wait instruction may be a monitor-wait instruction. The processor may be further configured to resume execution of the processor thread in response to detecting a write to a memory location specified by a previous monitor instruction.
Type: Application
Filed: December 14, 2015
Publication date: April 7, 2016
Inventors: Paul N. Loewenstein, Mark A. Luttrell, Paul J. Jordan
-
Publication number: 20150317338
Abstract: Techniques for executing versioned memory access instructions. In one embodiment, a processor is configured to execute versioned store instructions of a first thread within a first mode of operation. In this embodiment, in the first mode of operation, the processor is configured to retire a versioned store instruction only after a version comparison has been performed for the versioned store instruction. In this embodiment the processor is configured to suppress retirement of instructions in the first thread that are younger than an oldest versioned store instruction until the oldest versioned store instruction has retired. In some embodiments, the processor is configured to execute versioned store instructions of a given thread within a second mode of operation, in which the processor is configured to retire outstanding versioned store instructions before a version comparison has been performed.
Type: Application
Filed: May 1, 2014
Publication date: November 5, 2015
Applicant: Oracle International Corporation
Inventors: Zoran Radovic, Jared C. Smolens, Robert T. Golla, Paul J. Jordan, Mark A. Luttrell
-
Patent number: 9058180
Abstract: Systems and methods for efficient picking of instructions for out-of-order issue and execution in a processor. In one embodiment, a processor comprises a unified pick queue that is dynamically allocated. Each entry is configured to store age and dependency information relative to other decoded instructions. Also, each entry stores a picked field, which when asserted indicates the decoded instruction has already been picked for out-of-order issue and execution. When asserted, a trigger field indicates a result of a corresponding decoded instruction will be available a predetermined number of clock cycles afterward. A younger instruction dependent on a result of an older instruction is ready to be picked before the result of the older instruction is available. In this case, the older instruction has asserted picked and trigger fields.
Type: Grant
Filed: June 29, 2009
Date of Patent: June 16, 2015
Assignee: Oracle America, Inc.
Inventors: Robert T. Golla, Matthew B. Smittle, Mark A. Luttrell, Xiang Shan Li
-
Publication number: 20150046651
Abstract: A processor may include a cache configured to store instructions and memory data for the processor. The cache may store instructions in which a relative address, such as for a branch instruction has been calculated, such that the instruction stored in the cache is modified from how the instruction is stored in main memory. The cache may include additional information in the tag to identify an instruction entry versus a memory data entry. When receiving a cache request, the cache may look at a type tag in addition to an address tag to determine if the request is a hit or a miss based upon the request being for an instruction from an instruction fetch unit or for memory data from a memory management unit. A cache entry may be invalidated and evicted if the address matches but the data type does not match.
Type: Application
Filed: August 7, 2013
Publication date: February 12, 2015
Applicant: Oracle International Corporation
Inventor: Mark A. Luttrell