Patents by Inventor Steven L. Scott

Steven L. Scott has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20100115228
    Abstract: A multiprocessor computer system has a plurality of first processors having a first addressable memory space, and a plurality of second processors having a second addressable memory space. The second addressable memory space is of a different size than the first addressable memory space, and the first addressable memory space and second addressable memory space comprise a part of the same common address space.
    Type: Application
    Filed: October 31, 2008
    Publication date: May 6, 2010
    Applicant: CRAY INC.
    Inventors: Michael Parker, Timothy J. Johnson, Laurence S. Kaplan, Steven L. Scott, Robert Alverson, Skef Iterum
  • Publication number: 20100115234
    Abstract: A processor core, comprises one or more vector units operable to change between a fine-grained vector mode having a shorter maximum vector length and a coarse-grained vector mode having a longer maximum vector length. Changing vector modes comprises halting all instruction stream execution in the core, flushing one or more registers in a register space, reconfiguring one or more vector registers in the register space, and restarting instruction execution in the core.
    Type: Application
    Filed: October 31, 2008
    Publication date: May 6, 2010
    Applicant: CRAY INC.
    Inventors: Gregory J. Faanes, Eric P. Lundberg, Abdulla Bataineh, Timothy J. Johnson, Michael Parker, James Robert Kohn, Steven L. Scott, Robert Alverson
  • Publication number: 20100115236
    Abstract: A multiprocessor computer system having a plurality of processing elements comprises one or more core-level hierarchical shared semaphore registers, wherein each core-level hierarchical shared semaphore register is coupled to a different processor core. Each hierarchical shared semaphore register is writable to each of a plurality of streams executing on the coupled processor core. One or more chip-level hierarchical shared semaphore registers are also coupled to plurality of processor cores, each chip-level hierarchical shared semaphore register writable to each of the plurality of processor cores.
    Type: Application
    Filed: October 31, 2008
    Publication date: May 6, 2010
    Applicant: Cray inc.
    Inventors: Abdulla Bataineh, James Robert Kohn, Eric P. Lundberg, Timothy J. Johnson, Thomas L. Court, Gregory J. Faanes, Steven L. Scott
  • Publication number: 20100049942
    Abstract: A multiprocessor computer system comprises a dragonfly processor interconnect network that comprises a plurality of processor nodes, a plurality of routers, each router directly coupled to a plurality of terminal nodes, the routers coupled to one another and arranged into a group, and a plurality of groups of routers, such that each group is connected to each other group via at least one direct connection.
    Type: Application
    Filed: August 20, 2008
    Publication date: February 25, 2010
    Inventors: John Kim, Dennis C. Abts, Steven L. Scott, William J. Dally
  • Publication number: 20090292855
    Abstract: A high-radix interprocessor communications system and method having a plurality of processor nodes, a plurality of first routers and a plurality of second routers. Each first router is connected to a processor node and to two or more second routers. Each first router includes input ports, output ports, row busses, columns channels and a plurality of subswitches arranged in a n×p matrix. Each row bus receives data from one of the plurality of input ports and distributes the data to two or more of the plurality of subswitches. Each column distributes data from one or more subswitches to one or more output ports. Each row bus includes a route selector, wherein the route selector includes a routing table which selects an output port for each packet and which routes the packet through one of the row busses to the selected output port.
    Type: Application
    Filed: January 12, 2009
    Publication date: November 26, 2009
    Inventors: Steven L. Scott, Dennis C. Abts, William J. Dally
  • Patent number: 7543133
    Abstract: A computer system having low memory access latency. In one embodiment, the computer system includes a network and one or more processing nodes connected via the network, wherein each processing node includes a plurality of processors and a shared memory connected to each of the processors. The shared memory includes a cache. Each processor includes a scalar processing unit, a vector processing unit and means for operating the scalar processing unit independently of the vector processing unit. Processors on one node can load data directly from and store data directly to shared memory on another processing node via the network.
    Type: Grant
    Filed: August 18, 2003
    Date of Patent: June 2, 2009
    Assignee: Cray Inc.
    Inventor: Steven L. Scott
  • Patent number: 7519771
    Abstract: A novel system and method for processing memory instructions. One embodiment of the invention provides a method for processing a memory instruction. In this embodiment, the method includes obtaining a memory request; storing the memory request in an Initial Request Queue (IRQ); and processing the memory request from the IRQ by a cache controller, wherein processing includes: identifying a type of the memory request, and processing the memory request in both a local cache and an Force Order Queue (FOQ), wherein processing includes determining if a portion of an address associated with the memory request matches one or more partial addresses in the FOQ and, if the memory request misses in the cache and the address does not match one or more partial addresses in the FOQ, adding the memory request to the FOQ and allocating a cache line in the local cache corresponding to the local cache miss.
    Type: Grant
    Filed: August 18, 2003
    Date of Patent: April 14, 2009
    Assignee: Cray Inc.
    Inventors: Gregory J. Faanes, Eric P. Lundberg, Steven L. Scott, Robert J. Baird
  • Publication number: 20090041049
    Abstract: In a system having a N output ports, wherein N is an integer greater than one, a method of distributing packets across the plurality of output ports. A packet having two or more fields is received and a first number is computed as a function of one or more of the plurality of fields. A second number is computed that is modulo base N of the first number and an output port is selected as a function of the second number.
    Type: Application
    Filed: April 21, 2008
    Publication date: February 12, 2009
    Applicant: Cray Inc.
    Inventors: Steven L. Scott, Dennis C. Abts, William J. Dally
  • Publication number: 20090028172
    Abstract: A system and method for speculative forwarding of packets received by a router, wherein each packet includes phits and wherein one or more phits include a cyclic redundancy code (CRC). A packet is received and phits of the packet are forwarded to router logic. A cyclic redundancy code for the packet is calculated and compared to the packet's cyclic redundancy code. An error is generated if the cyclic redundancy codes don't match. If the cyclic redundancy codes don't match, a phit of the packet is modified to reflect the error, the CRC is corrected and the corrected CRC is forwarded to the router logic along with the phit reflecting the CRC error. At the router logic, a check is made to see if the packet is still within the router logic. If the packet is still within the router logic and there was a CRC error, the packet is discarded. If, however, the packet is no longer within the router logic and there was a CRC error, the packet is modified so that the next router discards the packet.
    Type: Application
    Filed: April 21, 2008
    Publication date: January 29, 2009
    Applicant: Cray Inc.
    Inventors: Steven L. Scott, Gregory Hubbard, Kelly Marquardt, Roger A. Bethard, Dennis C. Abts
  • Publication number: 20080304479
    Abstract: A multiprocessor computer system comprises a sending processor node and a receiving processor node. The sending processor node is operable to send packets comprising part of a message to a receiver, and to send a message complete packet after all packets in the message are sent. The message complete packet includes an indicator of the number of packets in the message, and the message is recognized as complete in the receiver once the number of packets indicated in the message complete packet have been received for the message. The sender tracks acknowledgment from the receiver of receipt of the sent packets; and notifies the receiver when it has received all packets comprising a part of the message.
    Type: Application
    Filed: June 7, 2007
    Publication date: December 11, 2008
    Inventors: Steven L. Scott, Dennis C. Abts, Robert Alverson, Edwin Froese
  • Publication number: 20080304491
    Abstract: A multiprocessor computer system comprises a sending processor node and a receiving processor node. The sending processor node is operable to send packets comprising part of a message to a receiver, to maintain a message buffer entry in the sender comprising the sent packets, to track acknowledgment from the receiver that sent packets have been received; to maintain a timer indicating the time since message data has been sent, and to resend packets not acknowledged upon the timer reaching a timeout state. The receiving processor node is operable to send acknowledgement to the sender that received packets have been received, to track packets using a received message table to track which packets comprising part of the message have been received and whether all packets in the message have been received, and to process packets once all packets in a message are received to reassemble the received message.
    Type: Application
    Filed: June 7, 2007
    Publication date: December 11, 2008
    Inventors: Steven L. Scott, Dennis C. Abts, Robert Alverson, Edwin Froese
  • Publication number: 20080285562
    Abstract: A system and method for routing in a high-radix network. A packet is received and examined to determine if the packet can be routed adaptively. If the packet can be routed adaptively, the packet is routed adaptively, wherein routing adaptively includes selecting a column, computing a column mask, routing the packet to the column; and selecting an output port as a function of the column mask. If the packet can be routed deterministically, routing deterministically, wherein routing deterministically includes accessing a routing table to obtain an output port and routing the packet to the output port from the routing table.
    Type: Application
    Filed: April 21, 2008
    Publication date: November 20, 2008
    Applicant: Cray Inc.
    Inventors: Steven L. Scott, Gregory Hubbard, Dennis C. Abts
  • Patent number: 7437521
    Abstract: A method and apparatus to provide specifiable ordering between and among vector and scalar operations within a single streaming processor (SSP) via a local synchronization (Lsync) instruction that operates within a relaxed memory consistency model. Various aspects of that relaxed memory consistency model are described. Further, a combined memory synchronization and barrier synchronization (Msync) for a multistreaming processor (MSP) system is described. Also, a global synchronization (Gsync) instruction provides synchronization even outside a single MSP system is described. Advantageously, the pipeline or queue of pending memory requests does not need to be drained before the synchronization operation, nor is it required to refrain from determining addresses for and inserting subsequent memory accesses into the pipeline.
    Type: Grant
    Filed: August 18, 2003
    Date of Patent: October 14, 2008
    Assignee: Cray Inc.
    Inventors: Steven L. Scott, Gregory J. Faanes, Brick Stephenson, William T. Moore, Jr., James R. Kohn
  • Patent number: 7409505
    Abstract: A method and apparatus for a coherence mechanism that supports a distributed memory programming model in which processors each maintain their own memory area, and communicate data between them. A hierarchical programming model is supported, which uses distributed memory semantics on top of shared memory nodes. Coherence is maintained globally, but caching is restricted to a local region of the machine (a “node” or “caching domain”). A directory cache is held in an on-chip cache and is multi-banked, allowing very high transaction throughput. Directory associativity allows the directory cache to map contents of all caches concurrently. References off node are converted to non-allocating references, allowing the same access mechanism (a regular load or store) to be used for both for intra-node and extra-node references. Stores (Puts) to remote caches automatically update the caches instead of invalidating the caches, allowing producer/consumer data sharing to occur through cache instead of through main memory.
    Type: Grant
    Filed: July 11, 2006
    Date of Patent: August 5, 2008
    Assignee: Cray, Inc.
    Inventors: Steven L. Scott, Abdulla Bataineh
  • Publication number: 20080151909
    Abstract: A system and method for routing packets from one node to another node in a system having a plurality of nodes connected by a network. A node router is provided in each node, wherein the node router includes a plurality of network ports, including a first and a second network port, wherein each network port includes a communications channel for communicating with one of the other network nodes, a plurality of virtual channel input buffers and a plurality of virtual channel staging buffers, wherein each of the virtual channel staging buffers receives data from one of the plurality of input buffers.
    Type: Application
    Filed: October 31, 2007
    Publication date: June 26, 2008
    Applicant: Cray Inc.
    Inventors: Steven L. Scott, Dennis C. Abts, Gregory Hubbard
  • Publication number: 20080123679
    Abstract: A system and method for routing a packet between ports for use in a router having a plurality of ports, including a first and a second port, wherein each port includes a plurality of look-up tables (LUTs) and a look-up table select connected to the LUTs. Routing information is loaded into each of the plurality of LUTs while LUT selection information is loaded in the look-up table select. A packet having a plurality of destination bits is received at the first port and a destination port selected within the router to receive the packet. The destination port is selected by applying two or more of the destination bits to the plurality of LUTs in the first port and selecting an output of the plurality of LUTs as a function of one or more of the destination bits, wherein the selected output indicates the port selected to receive the packet. The packet is then routed to the output of the selected port.
    Type: Application
    Filed: October 31, 2007
    Publication date: May 29, 2008
    Applicant: Cray Inc.
    Inventors: Steven L. Scott, Robert Alverson
  • Patent number: 7356026
    Abstract: A method of node translation for communicating over virtual channels in a clustered multiprocessor system using connection descriptors (CDs), which specify the endpoint nodes for virtual connections. The system includes a local processing element node, a remote processing element node and a network interconnect therebetween for sending communications between the processing element nodes. The method includes assigning a CD to specify an endpoint node for a virtual connection, defining a local connection table (LCT) to be accessed with the CD to produce a system node identifier (SNID) of the endpoint node, generating a communication request including the CD, accessing the LCT using the CD of that communication request to produce the SNID for the endpoint node of the connection in response to that request, and sending a memory request to the endpoint node.
    Type: Grant
    Filed: December 14, 2001
    Date of Patent: April 8, 2008
    Assignee: Silicon Graphics, Inc.
    Inventors: Steven L. Scott, Chris Dickson, Steve Reinhardt
  • Patent number: 7334110
    Abstract: In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit, a system and method of decoupling operation of the scalar processing unit from that of the vector processing unit, the method comprising sending a vector instruction from the scalar processing unit to the vector dispatch unit, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands, reading a scalar operand, wherein reading includes transferring the scalar operand from the scalar processing unit to the vector dispatch unit, predispatching the vector instruction within the vector dispatch unit if the vector instruction is scalar committed, dispatching the predispatched vector instruction if all required operands are ready, and executing the dispatched vector instruction as a function of the scalar operand.
    Type: Grant
    Filed: August 18, 2003
    Date of Patent: February 19, 2008
    Assignee: Cray Inc.
    Inventors: Gregory J. Faanes, Steven L. Scott, Eric P. Lundberg, William T. Moore, Jr., Timothy J. Johnson
  • Patent number: 7082500
    Abstract: A method and apparatus for a coherence mechanism that supports a distributed memory programming model in which processors each maintain their own memory area, and communicate data between them. A hierarchical programming model is supported, which uses distributed memory semantics on top of shared memory nodes. Coherence is maintained globally, but caching is restricted to a local region of the machine (a “node” or “caching domain”). A directory cache is held in an on-chip cache and is multi-banked, allowing very high transaction throughput. Directory associativity allows the directory cache to map contents of all caches concurrently. References off node are converted to non-allocating references, allowing the same access mechanism (a regular load or store) to be used for both for intra-node and extra-node references. Stores (Puts) to remote caches automatically update the caches instead of invalidating the caches, allowing producer/consumer data sharing to occur through cache instead of through main memory.
    Type: Grant
    Filed: February 18, 2003
    Date of Patent: July 25, 2006
    Assignee: Cray, Inc.
    Inventors: Steven L. Scott, Abdulla Bataineh
  • Patent number: 6925547
    Abstract: A method of performing remote address translation in a multiprocessor system includes determining a connection descriptor and a virtual address at a local node, accessing a local connection table at the local node using the connection descriptor to produce a system node identifier for a remote node and a remote address space number, communicating the virtual address and remote address space number to the remote node, and translating the virtual address to a physical address at the remote node (qualified by the remote address space number). A user process running at the local node provides the connection descriptor and virtual address. The translation is performed by matching the virtual address and remote address space number with an entry of a translation-lookaside buffer (TLB) at the remote node. Performing the translation at the remote node reduces the amount of translation information needed at the local node for remote memory accesses.
    Type: Grant
    Filed: December 14, 2001
    Date of Patent: August 2, 2005
    Assignee: Silicon Graphics, Inc.
    Inventors: Steven L. Scott, Chris Dickson, Eric C. Fromm, Michael L. Anderson