Patents by Inventor Steven L. Scott

Steven L. Scott has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8307194
    Abstract: A method and apparatus to provide specifiable ordering between and among vector and scalar operations within a single streaming processor (SSP) via a local synchronization (Lsync) instruction that operates within a relaxed memory consistency model. Various aspects of that relaxed memory consistency model are described. Further, a combined memory synchronization and barrier synchronization (Msync) for a multistreaming processor (MSP) system is described. Also described is a global synchronization (Gsync) instruction that provides synchronization even outside a single MSP system. Advantageously, the pipeline or queue of pending memory requests does not need to be drained before the synchronization operation, nor is it required to refrain from determining addresses for and inserting subsequent memory accesses into the pipeline.
    Type: Grant
    Filed: August 18, 2003
    Date of Patent: November 6, 2012
    Assignee: Cray Inc.
    Inventors: Steven L. Scott, Gregory J. Faanes, Brick Stephenson, William T. Moore, Jr., James R. Kohn
  • Publication number: 20120272247
    Abstract: A method and system for software emulation of hardware support for multi-threaded processing using virtual hardware threads is provided. A software threading system executes on a node that has one or more processors, each with one or more hardware threads. The node has access to local memory and access to remote memory. The software threading system manages the execution of tasks of a user program. The software threading system switches between the virtual hardware threads representing the tasks as the tasks issue remote memory access requests while in user privilege mode. Thus, the software threading system emulates more hardware threads than the underlying hardware supports and switches the virtual hardware threads without the overhead of a context switch to the operating system or change in privilege mode.
    Type: Application
    Filed: April 22, 2011
    Publication date: October 25, 2012
    Inventors: Steven L. Scott, Gregory B. Titus, Sung-Eun Choi, Troy A. Johnson, David Mizell, Michael F. Ringenburg, Karlon West
  • Patent number: 8261134
    Abstract: A multiprocessor computer system comprises one or more watchdog timers operable to detect failure of a memory operation based on passage of a certain timing period from a memory operation being issued without a valid response. An error handler is operable to take corrective action regarding the failed memory operation, such as to provide at least one of hardware state management and application state management.
    Type: Grant
    Filed: January 28, 2010
    Date of Patent: September 4, 2012
    Assignee: Cray Inc.
    Inventors: Dennis C. Abts, Steven L. Scott, Aaron F. Godfrey
  • Publication number: 20120221830
    Abstract: A processor core comprises one or more vector units operable to change between a fine-grained vector mode having a shorter maximum vector length and a coarse-grained vector mode having a longer maximum vector length. Changing vector modes comprises halting all instruction stream execution in the core, flushing one or more registers in a register space, reconfiguring one or more vector registers in the register space, and restarting instruction execution in the core.
    Type: Application
    Filed: February 29, 2012
    Publication date: August 30, 2012
    Applicant: CRAY INC.
    Inventors: Gregory J. Faanes, Eric P. Lundberg, Abdulla Bataineh, Timothy J. Johnson, Michael Parker, James Robert Kohn, Steven L. Scott, Robert Alverson
  • Patent number: 8239704
    Abstract: In some embodiments, the present invention relates to a method of maintaining a global clock within a multiprocessor system having a plurality of nodes that are connected in a network via links. A virtual spanning tree is mapped onto the network and the nodes and the links are configured such that each node is in a parent-child relationship with one or more other nodes in the virtual spanning tree. A global clock is generated in a root of the virtual spanning tree and global clock signals are communicated down the virtual spanning tree to each of the nodes.
    Type: Grant
    Filed: June 12, 2009
    Date of Patent: August 7, 2012
    Assignee: Cray Inc.
    Inventors: Steven L. Scott, Dennis C. Abts, Aaron F. Godfrey
  • Patent number: 8223778
    Abstract: A system and method for routing a packet between ports for use in a router having a plurality of ports, including a first and a second port, wherein each port includes a plurality of look-up tables (LUTs) and a look-up table select connected to the LUTs. Routing information is loaded into each of the plurality of LUTs while LUT selection information is loaded in the look-up table select. A packet having a plurality of destination bits is received at the first port and a destination port selected within the router to receive the packet. The destination port is selected by applying two or more of the destination bits to the plurality of LUTs in the first port and selecting an output of the plurality of LUTs as a function of one or more of the destination bits, wherein the selected output indicates the port selected to receive the packet. The packet is then routed to the output of the selected port.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: July 17, 2012
    Assignee: Intel Corporation
    Inventors: Steven L. Scott, Robert Alverson
  • Patent number: 8184626
    Abstract: A high-radix interprocessor communications system and method having a plurality of processor nodes, a plurality of first routers and a plurality of second routers. Each first router is connected to a processor node and to two or more second routers. Each first router includes input ports, output ports, row busses, column channels and a plurality of subswitches arranged in an n x p matrix. Each row bus receives data from one of the plurality of input ports and distributes the data to two or more of the plurality of subswitches. Each column distributes data from one or more subswitches to one or more output ports. Each row bus includes a route selector, wherein the route selector includes a routing table which selects an output port for each packet and which routes the packet through one of the row busses to the selected output port.
    Type: Grant
    Filed: January 12, 2009
    Date of Patent: May 22, 2012
    Assignees: Cray Inc., The Board of Trustees of the Leland Stanford Junior University
    Inventors: Steven L. Scott, Dennis C. Abts, William J. Dally
  • Patent number: 8095759
    Abstract: A multiprocessor computer system comprises a plurality of processors and a plurality of nodes, each node comprising one or more processors. A local memory in each of the plurality of nodes is coupled to the processors in each node, and a hardware firewall comprising a part of one or more of the nodes is operable to prevent an unauthorized processor from writing to the local memory.
    Type: Grant
    Filed: May 29, 2009
    Date of Patent: January 10, 2012
    Assignee: Cray Inc.
    Inventors: Dennis C. Abts, Steven L. Scott, Aaron F. Godfrey
  • Publication number: 20110051724
    Abstract: A system and method for routing in a high-radix network. A packet is received and examined to determine if the packet can be routed adaptively. If the packet can be routed adaptively, the packet is routed adaptively, wherein routing adaptively includes selecting a column, computing a column mask, routing the packet to the column, and selecting an output port as a function of the column mask. If the packet can be routed deterministically, the packet is routed deterministically, wherein routing deterministically includes accessing a routing table to obtain an output port and routing the packet to the output port from the routing table.
    Type: Application
    Filed: November 9, 2010
    Publication date: March 3, 2011
    Applicant: Cray Inc.
    Inventors: Steven L. Scott, Gregory Hubbard, Dennis C. Abts
  • Patent number: 7864792
    Abstract: In a system having N output ports, wherein N is an integer greater than one, a method of distributing packets across the plurality of output ports. A packet having two or more fields is received and a first number is computed as a function of one or more of the plurality of fields. A second number is computed as the first number modulo N and an output port is selected as a function of the second number.
    Type: Grant
    Filed: April 21, 2008
    Date of Patent: January 4, 2011
    Assignee: Cray Inc.
    Inventors: Steven L. Scott, Dennis C. Abts, William J. Dally
  • Publication number: 20100318747
    Abstract: An atomic memory operation cache comprises a cache memory operable to cache atomic memory operation data, a write timer, and a cache controller. The cache controller is operable to update main memory with one or more dirty atomic memory operation cache entries stored in the cache memory upon expiration of the write timer, and is further operable to update main memory with one or more dirty atomic memory operation cache entries stored in the cache memory upon eviction of the one or more dirty atomic memory operation cache entries from the cache memory.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventors: Dennis C. Abts, Steven L. Scott
  • Publication number: 20100318626
    Abstract: A multiprocessor computer system comprises a first node operable to access memory local to a remote node by receiving a virtual memory address from a requesting entity in node logic in the first node. The first node creates a network address from the virtual address received in the node logic, where the network address is in a larger address space than the virtual memory address, and sends a fast memory access request from the first node to a network node identified in the network address.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventors: Dennis C. Abts, Robert Alverson, Edwin Froese, Howard Pritchard, Steven L. Scott
  • Publication number: 20100318831
    Abstract: In some embodiments, the present invention relates to a method of maintaining a global clock within a multiprocessor system having a plurality of nodes that are connected in a network via links. A virtual spanning tree is mapped onto the network and the nodes and the links are configured such that each node is in a parent-child relationship with one or more other nodes in the virtual spanning tree. A global clock is generated in a root of the virtual spanning tree and global clock signals are communicated down the virtual spanning tree to each of the nodes.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventors: Steven L. Scott, Dennis C. Abts, Aaron F. Godfrey
  • Publication number: 20100318741
    Abstract: A multiprocessor computer system comprises a processing node having a plurality of processors and a local memory shared among processors in the node. An L1 data cache is local to each of the plurality of processors, and an L2 cache is local to each of the plurality of processors. An L3 cache is local to the node but shared among the plurality of processors, and the L3 cache is a subset of data stored in the local memory. The L2 caches are subsets of the L3 cache, and the L1 caches are a subset of the L2 caches in the respective processors.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventors: Steven L. Scott, Gregory J. Faanes, Abdulla Bataineh, Michael Bye, Gerald A. Schwoerer, Dennis C. Abts
  • Patent number: 7852836
    Abstract: A system and method for routing packets from one node to another node in a system having a plurality of nodes connected by a network. A node router is provided in each node, wherein the node router includes a plurality of network ports, including a first and a second network port, wherein each network port includes a communications channel for communicating with one of the other network nodes, a plurality of virtual channel input buffers and a plurality of virtual channel staging buffers, wherein each of the virtual channel staging buffers receives data from one of the plurality of input buffers.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: December 14, 2010
    Assignee: Cray Inc.
    Inventors: Steven L. Scott, Dennis C. Abts, Gregory Hubbard
  • Publication number: 20100306489
    Abstract: A multiprocessor computer system comprises a plurality of processors and a plurality of nodes, each node comprising one or more processors. A local memory in each of the plurality of nodes is coupled to the processors in each node, and a hardware firewall comprising a part of one or more of the nodes is operable to prevent an unauthorized processor from writing to the local memory.
    Type: Application
    Filed: May 29, 2009
    Publication date: December 2, 2010
    Applicant: Cray Inc.
    Inventors: Dennis C. Abts, Steven L. Scott, Aaron F. Godfrey
  • Patent number: 7843929
    Abstract: A system and method for routing in a high-radix network. A packet is received and examined to determine if the packet can be routed adaptively. If the packet can be routed adaptively, the packet is routed adaptively, wherein routing adaptively includes selecting a column, computing a column mask, routing the packet to the column, and selecting an output port as a function of the column mask. If the packet can be routed deterministically, the packet is routed deterministically, wherein routing deterministically includes accessing a routing table to obtain an output port and routing the packet to the output port from the routing table.
    Type: Grant
    Filed: April 21, 2008
    Date of Patent: November 30, 2010
    Assignee: Cray Inc.
    Inventors: Steven L. Scott, Gregory Hubbard, Dennis C. Abts
  • Patent number: 7830905
    Abstract: A system and method for speculative forwarding of packets received by a router, wherein each packet includes phits and wherein one or more phits include a cyclic redundancy code (CRC). A packet is received and phits of the packet are forwarded to router logic. A cyclic redundancy code for the packet is calculated and compared to the packet's cyclic redundancy code. An error is generated if the cyclic redundancy codes don't match. If the cyclic redundancy codes don't match, a phit of the packet is modified to reflect the error, the CRC is corrected and the corrected CRC is forwarded to the router logic along with the phit reflecting the CRC error. At the router logic, a check is made to see if the packet is still within the router logic. If the packet is still within the router logic and there was a CRC error, the packet is discarded. If, however, the packet is no longer within the router logic and there was a CRC error, the packet is modified so that the next router discards the packet.
    Type: Grant
    Filed: April 21, 2008
    Date of Patent: November 9, 2010
    Assignee: Cray Inc.
    Inventors: Steven L. Scott, Gregory Hubbard, Kelly Marquardt, Roger A. Bethard, Dennis C. Abts
  • Publication number: 20100199121
    Abstract: A multiprocessor computer system comprises one or more watchdog timers operable to detect failure of a memory operation based on passage of a certain timing period from a memory operation being issued without a valid response. An error handler is operable to take corrective action regarding the failed memory operation, such as to provide at least one of hardware state management and application state management.
    Type: Application
    Filed: January 28, 2010
    Publication date: August 5, 2010
    Applicant: Cray Inc.
    Inventors: Dennis C. Abts, Steven L. Scott, Aaron F. Godfrey
  • Patent number: 7743223
    Abstract: In a computer system having a plurality of processors connected to a shared memory, a system and method of decoupling an address from write data in a store to the shared memory. A write request address is generated for a memory write, wherein the write request address points to a memory location in shared memory. A write request is issued to the shared memory, wherein the write request includes the write request address. The write request address is noted in the shared memory and addresses in subsequent load and store requests are compared in shared memory to the write request address. The write data is transferred to the shared memory and matched, within the shared memory, to the write request address. The write data is then stored into the shared memory as a function of the write request address.
    Type: Grant
    Filed: August 18, 2003
    Date of Patent: June 22, 2010
    Assignee: Cray Inc.
    Inventors: Steven L. Scott, Gregory J. Faanes
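
The decoupled store described in the abstract of patent 7743223 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the class name SharedMemory, its method names, the first-in-first-out matching of data to noted addresses, and the policy of returning None for a conflicting load are all hypothetical choices made for illustration.

```python
from collections import deque


class SharedMemory:
    """Toy model of a shared memory that accepts store addresses ahead of data.

    Hypothetical illustration only; names and policies are assumptions.
    """

    def __init__(self):
        self.cells = {}         # address -> committed value
        self.pending = deque()  # noted store addresses whose data has not arrived

    def write_request(self, address):
        """Note a store address before its write data is available."""
        self.pending.append(address)

    def write_data(self, value):
        """Match arriving data to the oldest noted address (assumed FIFO order)
        and commit it. Returns the address that was written."""
        address = self.pending.popleft()
        self.cells[address] = value
        return address

    def load(self, address):
        """Compare the load address against the noted store addresses.

        Returns None when the address conflicts with a pending store
        (assumed policy: the requester retries later); otherwise returns
        the committed value, defaulting to 0 for untouched cells.
        """
        if address in self.pending:
            return None
        return self.cells.get(address, 0)


mem = SharedMemory()
mem.write_request(0x10)       # address sent early, data still being computed
blocked = mem.load(0x10)      # conflicts with the pending store
unrelated = mem.load(0x20)    # no conflict, proceeds immediately
mem.write_data(7)             # data arrives and is matched to address 0x10
after = mem.load(0x10)        # now sees the stored value
```

In this model, sending the address first lets the memory order later loads and stores against the store before its data exists, which is the decoupling the abstract describes.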