Patents Assigned to Cray Inc.
  • Publication number: 20100324854
    Abstract: A memory daughter card (MDC) is described, having a very high-speed serial interface and an on-card MDC test engine that allows one MDC to be directly connected to another MDC for testing purposes. In some embodiments, a control interface allows the test engine to be programmed and controlled by a test controller on a test fixture that allows simultaneous testing of a single MDC or one or more pairs of MDCs, one MDC in a pair (e.g., the “golden” MDC) testing the other MDC of that pair. Other methods are also described, wherein one MDC executes a series of reads and writes and other commands to another MDC to test at least some of the other card's functions, or wherein one port executes a series of test commands to another port on the same MDC to test at least some of the card's functions.
    Type: Application
    Filed: August 27, 2010
    Publication date: December 23, 2010
    Applicant: Cray Inc.
    Inventors: David R. Resnick, Gerald A. Schwoerer, Kelly J. Marquardt, Alan M. Grossmeier, Michael L. Steinberger, Van L. Snyder, Roger A. Bethard
  • Publication number: 20100318764
    Abstract: A system and method of compiling program code, wherein the program code includes an operation on an array of data elements stored in memory of a computer system. The program code is scanned for operations that are vectorizable. The vectorizable operations are examined to determine whether they should be executed at least in part in a vector atomic memory operation (AMO) functional unit attached to memory. If so, the compiled code includes vector AMO instructions.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventor: Terry D. Greyzck
  • Publication number: 20100318591
    Abstract: A computer system is operable to identify subfields that differ in two data elements using a bit matrix compare function between a first matrix filled with pattern elements and a reference pattern.
    Type: Application
    Filed: June 11, 2010
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventor: William F. Long
  • Publication number: 20100318774
    Abstract: A multiprocessor computer system comprises a plurality of processors distributed across a plurality of nodes coupled by a processor interconnect network. One or more of the processors is operable to manage hung processor instructions by setting a graduation timeout counter after a first program instruction graduates, resetting the graduation timeout counter if a subsequent program instruction graduates before the graduation timeout counter expires, and resetting the processor if the graduation timeout counter expires before the subsequent program instruction graduates.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventors: Dennis C. Abts, Aaron F. Godfrey
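
The graduation timeout behavior in the abstract above can be illustrated with a small cycle-by-cycle model. This is only a sketch under stated assumptions: the class name `GraduationWatchdog`, the `tick` method, and the idea of counting in whole cycles are hypothetical and are not taken from the application.

```python
class GraduationWatchdog:
    """Toy model of a graduation timeout counter (all names are hypothetical)."""

    def __init__(self, timeout_cycles):
        self.timeout_cycles = timeout_cycles
        self.counter = None   # not armed until a first instruction graduates
        self.resets = 0       # how many times the modeled processor was reset

    def tick(self, graduated):
        """Advance one cycle. Return True if the processor is reset this cycle."""
        if graduated:
            # A graduating instruction sets (or resets) the counter.
            self.counter = self.timeout_cycles
            return False
        if self.counter is None:
            return False
        self.counter -= 1
        if self.counter <= 0:
            # No instruction graduated before the counter expired: reset.
            self.resets += 1
            self.counter = None
            return True
        return False
```

For example, with `timeout_cycles=3`, one graduation followed by three idle cycles triggers a reset on the third idle cycle, while a graduation on any of those cycles re-arms the counter instead.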
  • Publication number: 20100318626
    Abstract: A multiprocessor computer system comprises a first node operable to access memory local to a remote node by receiving a virtual memory address from a requesting entity in node logic in the first node. The first node creates a network address from the virtual address received in the node logic, where the network address is in a larger address space than the virtual memory address, and sends a fast memory access request from the first node to a network node identified in the network address.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventors: Dennis C. Abts, Robert Alverson, Edwin Froese, Howard Pritchard, Steven L. Scott
  • Publication number: 20100318979
    Abstract: A system and method of compiling program code, wherein the program code includes an operation on an array of data elements stored in memory of a computer system. The program code is scanned for an equation which may have recurring data points. The equation is then replaced with vectorized machine executable code, wherein the machine executable code comprises a nested loop and wherein the nested loop comprises an exterior loop and a virtual interior loop. The exterior loop decomposes the equation into a plurality of loops of length N, wherein N is an integer greater than one. The virtual interior loop executes vector operations corresponding to the N length loop to form a result vector resident in memory, wherein the virtual interior loop includes a vector atomic memory operation (AMO) instruction.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventor: Terry D. Greyzck
  • Publication number: 20100318747
    Abstract: An atomic memory operation cache comprises a cache memory operable to cache atomic memory operation data, a write timer, and a cache controller. The cache controller is operable to update main memory with one or more dirty atomic memory operation cache entries stored in the cache memory upon expiration of the write timer, and is further operable to update main memory with one or more dirty atomic memory operation cache entries stored in the cache memory upon eviction of the one or more dirty atomic memory operation cache entries from the cache memory.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventors: Dennis C. Abts, Steven L. Scott
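
The two write-back triggers named in the abstract above (write timer expiration and eviction of a dirty entry) can be sketched as follows. This is a minimal software model, not the claimed hardware: the class `AmoCache`, the least-recently-used eviction policy, the fetch-and-add operation, and the dictionary standing in for main memory are all assumptions made for illustration.

```python
from collections import OrderedDict


class AmoCache:
    """Toy atomic-memory-operation cache (hypothetical names and policy)."""

    def __init__(self, memory, capacity, write_interval):
        self.memory = memory              # dict: address -> value (main memory)
        self.capacity = capacity
        self.write_interval = write_interval
        self.timer = write_interval
        self.lines = OrderedDict()        # address -> [value, dirty flag]

    def _evict_if_full(self):
        if len(self.lines) >= self.capacity:
            # Assumed policy: evict the least recently used entry.
            addr, (value, dirty) = self.lines.popitem(last=False)
            if dirty:
                self.memory[addr] = value  # write back on eviction

    def fetch_add(self, addr, delta):
        """Perform a fetch-and-add in the cache; return the old value."""
        if addr not in self.lines:
            self._evict_if_full()
            self.lines[addr] = [self.memory.get(addr, 0), False]
        self.lines.move_to_end(addr)
        line = self.lines[addr]
        old = line[0]
        line[0] = old + delta
        line[1] = True
        return old

    def tick(self):
        """Advance the write timer; flush dirty entries when it expires."""
        self.timer -= 1
        if self.timer == 0:
            for addr, line in self.lines.items():
                if line[1]:
                    self.memory[addr] = line[0]
                    line[1] = False
            self.timer = self.write_interval
```

In this model, main memory lags the cache by at most `write_interval` ticks unless an eviction forces an earlier update.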
  • Publication number: 20100318751
    Abstract: An error message handling buffer comprises a first buffer and a second buffer. A first index is associated with the first buffer and a second index is associated with the second buffer. A buffer controller is operable to write and read messages in the buffer, such that messages are written to the buffer of the first and second buffers that has a buffer index value less than the buffer size, and read from the other of the first and second buffers, the other buffer having an index value greater than or equal to the buffer size.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventor: Clayton D. Andreasen
  • Publication number: 20100318769
    Abstract: A system and method of compiling program code, wherein the program code includes an operation on an array of data elements stored in memory of a computer system. The program code is scanned for an equation which operates on data of lengths other than the limited number of vector supported data lengths. The equation is then replaced with vectorized machine executable code, wherein the machine executable code comprises a nested loop and wherein the nested loop comprises an exterior loop and a virtual interior loop. The exterior loop decomposes the equation into a plurality of loops of length N, wherein N is an integer greater than one. The virtual interior loop executes vector operations corresponding to the N length loop to form a result vector of length N, wherein the virtual interior loop includes one or more vector atomic memory operation (AMO) instructions, used to resolve false conflicts.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventor: Terry D. Greyzck
  • Publication number: 20100318831
    Abstract: In some embodiments, the present invention relates to a method of maintaining a global clock within a multiprocessor system having a plurality of nodes that are connected in a network via links. A virtual spanning tree is mapped onto the network and the nodes and the links are configured such that each node is in a parent-child relationship with one or more other nodes in the virtual spanning tree. A global clock is generated in a root of the virtual spanning tree and global clock signals are communicated down the virtual spanning tree to each of the nodes.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventors: Steven L. Scott, Dennis C. Abts, Aaron F. Godfrey
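
The spanning-tree clock distribution in the abstract above can be sketched in software. The example below is an illustration only: it assumes the virtual spanning tree is built by breadth-first search from the root, which the abstract does not specify, and the function names `build_spanning_tree` and `broadcast_clock` are hypothetical.

```python
from collections import deque


def build_spanning_tree(links, root):
    """Return {node: [children]} for a BFS spanning tree of an undirected network.

    `links` is an iterable of (node_a, node_b) pairs. BFS is an assumed
    construction; any spanning tree would serve the same purpose.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    children = {root: []}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in children:        # first visit fixes the parent
                children[nxt] = []
                children[node].append(nxt)
                queue.append(nxt)
    return children


def broadcast_clock(children, root, clock_value):
    """Send the root's clock value down the tree; return {node: clock}."""
    clocks = {}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        clocks[node] = clock_value
        queue.extend(children[node])       # parent forwards to its children
    return clocks
```

With links `[(0, 1), (1, 2), (2, 0), (2, 3)]` and root `0`, the redundant link between nodes 1 and 2 is left out of the tree, and every node still receives the root's clock value exactly once.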
  • Publication number: 20100318773
    Abstract: A computer system is operable to identify index elements in a vector index array that cannot be processed in parallel by calculating a complement modified bit matrix compare function between a first matrix filled with elements from the vector index array and a second matrix filled with the same elements from the vector index array.
    Type: Application
    Filed: June 11, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventors: Terry D. Greyzck, William F. Long, Peter M. Klausler, Matthew F. Taylor
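
As a rough software analogue of the abstract above, comparing an index vector against itself exposes the positions that repeat an earlier index and therefore cannot safely be processed in parallel with it. The sketch below uses plain pairwise comparison rather than the bit matrix compare function named in the abstract, and the function name `conflicting_positions` is hypothetical.

```python
def conflicting_positions(index_vector):
    """Return positions whose index value also appears at an earlier position.

    Position i is compared against every earlier position j < i, which mirrors
    comparing the vector with a copy of itself below the diagonal.
    """
    n = len(index_vector)
    return [
        i for i in range(n)
        if any(index_vector[i] == index_vector[j] for j in range(i))
    ]


print(conflicting_positions([3, 7, 3, 1, 7, 3]))  # prints [2, 4, 5]
```

Elements at positions outside the returned list have distinct indices among themselves, so a scatter to those indices cannot collide.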
  • Publication number: 20100318741
    Abstract: A multiprocessor computer system comprises a processing node having a plurality of processors and a local memory shared among processors in the node. An L1 data cache is local to each of the plurality of processors, and an L2 cache is local to each of the plurality of processors. An L3 cache is local to the node but shared among the plurality of processors, and the L3 cache is a subset of data stored in the local memory. The L2 caches are subsets of the L3 cache, and the L1 caches are a subset of the L2 caches in the respective processors.
    Type: Application
    Filed: June 12, 2009
    Publication date: December 16, 2010
    Applicant: Cray Inc.
    Inventors: Steven L. Scott, Gregory J. Faanes, Abdulla Bataineh, Michael Bye, Gerald A. Schwoerer, Dennis C. Abts
  • Patent number: 7852836
    Abstract: A system and method for routing packets from one node to another node in a system having a plurality of nodes connected by a network. A node router is provided in each node, wherein the node router includes a plurality of network ports, including a first and a second network port, wherein each network port includes a communications channel for communicating with one of the other network nodes, a plurality of virtual channel input buffers and a plurality of virtual channel staging buffers, wherein each of the virtual channel staging buffers receives data from one of the plurality of input buffers.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: December 14, 2010
    Assignee: Cray Inc.
    Inventors: Steven L. Scott, Dennis C. Abts, Gregory Hubbard
  • Publication number: 20100306489
    Abstract: A multiprocessor computer system comprises a plurality of processors and a plurality of nodes, each node comprising one or more processors. A local memory in each of the plurality of nodes is coupled to the processors in each node, and a hardware firewall comprising a part of one or more of the nodes is operable to prevent a write from an unauthorized processor from writing to the local memory.
    Type: Application
    Filed: May 29, 2009
    Publication date: December 2, 2010
    Applicant: Cray Inc.
    Inventors: Dennis C. Abts, Steven L. Scott, Aaron F. Godfrey
  • Patent number: 7843929
    Abstract: A system and method for routing in a high-radix network. A packet is received and examined to determine if the packet can be routed adaptively. If the packet can be routed adaptively, the packet is routed adaptively, wherein routing adaptively includes selecting a column, computing a column mask, routing the packet to the column, and selecting an output port as a function of the column mask. If the packet can be routed deterministically, the packet is routed deterministically, wherein routing deterministically includes accessing a routing table to obtain an output port and routing the packet to the output port from the routing table.
    Type: Grant
    Filed: April 21, 2008
    Date of Patent: November 30, 2010
    Assignee: Cray Inc.
    Inventors: Steven L. Scott, Gregory Hubbard, Dennis C. Abts
  • Patent number: 7830905
    Abstract: A system and method for speculative forwarding of packets received by a router, wherein each packet includes phits and wherein one or more phits include a cyclic redundancy code (CRC). A packet is received and phits of the packet are forwarded to router logic. A cyclic redundancy code for the packet is calculated and compared to the packet's cyclic redundancy code. An error is generated if the cyclic redundancy codes don't match. If the cyclic redundancy codes don't match, a phit of the packet is modified to reflect the error, the CRC is corrected and the corrected CRC is forwarded to the router logic along with the phit reflecting the CRC error. At the router logic, a check is made to see if the packet is still within the router logic. If the packet is still within the router logic and there was a CRC error, the packet is discarded. If, however, the packet is no longer within the router logic and there was a CRC error, the packet is modified so that the next router discards the packet.
    Type: Grant
    Filed: April 21, 2008
    Date of Patent: November 9, 2010
    Assignee: Cray Inc.
    Inventors: Steven L. Scott, Gregory Hubbard, Kelly Marquardt, Roger A. Bethard, Dennis C. Abts
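
The decision made once the CRC check completes, as described in the abstract above, can be sketched as a small function. This is an illustration under assumptions: the abstract does not name a CRC polynomial, so CRC-32 from Python's `zlib` module stands in for it, and the function name `crc_outcome` and its string results are hypothetical.

```python
import zlib


def crc_outcome(payload: bytes, received_crc: int, still_in_router: bool) -> str:
    """Decide what happens to a speculatively forwarded packet.

    CRC-32 is an assumed stand-in for the packet's actual CRC.
    """
    if zlib.crc32(payload) == received_crc:
        return "forward"                   # CRC matched, nothing to undo
    if still_in_router:
        return "discard"                   # caught before leaving this router
    return "mark_for_next_router"          # already sent on; downstream discards


good = b"phit0phit1"
crc = zlib.crc32(good)
print(crc_outcome(good, crc, True))            # prints forward
print(crc_outcome(b"phit0phit2", crc, True))   # prints discard
print(crc_outcome(b"phit0phit2", crc, False))  # prints mark_for_next_router
```

The point of the speculation is that forwarding starts before the check finishes, so the error path has to cover both the case where the packet is still local and the case where it has already left.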
  • Patent number: 7826996
    Abstract: A memory daughter card (MDC) is described, having a very high-speed serial interface and an on-card MDC test engine that allows one MDC to be directly connected to another MDC for testing purposes. In some embodiments, a control interface allows the test engine to be programmed and controlled by a test controller on a test fixture that allows simultaneous testing of a single MDC or one or more pairs of MDCs, one MDC in a pair (e.g., the “golden” MDC) testing the other MDC of that pair. Other methods are also described, wherein one MDC executes a series of reads and writes and other commands to another MDC to test at least some of the other card's functions, or wherein one port executes a series of test commands to another port on the same MDC to test at least some of the card's functions.
    Type: Grant
    Filed: February 26, 2007
    Date of Patent: November 2, 2010
    Assignee: Cray Inc.
    Inventors: David R. Resnick, Gerald A. Schwoerer, Kelly J. Marquardt, Alan M. Grossmeier, Michael L. Steinberger, Van L. Snyder, Roger A. Bethard
  • Patent number: 7793073
    Abstract: A method and apparatus to correctly compute a vector-gather, vector-operate (e.g., vector add), and vector-scatter sequence, particularly when elements of the vector may be redundantly presented, as with indirectly addressed vector operations. For an add operation, one vector register is loaded with the “add-in” values, and another vector register is loaded with address values of “add to” elements to be gathered from memory into a third vector register. If the vector of address values has a plurality of elements that point to the same memory address, the algorithm should add all the “add in” values from elements corresponding to the elements having the duplicated addresses. An indirectly addressed load performs the “gather” operation to load the “add to” values. A vector add operation then adds corresponding elements from the “add in” vector to the “add to” vector. An indirectly addressed store then performs the “scatter” operation to store the results.
    Type: Grant
    Filed: June 29, 2007
    Date of Patent: September 7, 2010
    Assignee: Cray Inc.
    Inventor: James R. Kohn
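
The duplicate-address problem in the abstract above is easy to reproduce in a few lines. The sketch below contrasts a naive gather, add, scatter sequence with a version that sums all "add in" values aimed at the same address. It is a scalar software illustration, not the patented vector method, and both function names are hypothetical.

```python
def naive_gather_add_scatter(memory, addresses, add_in):
    """Gather, add, and scatter as three whole-vector steps (loses duplicates)."""
    gathered = [memory[a] for a in addresses]            # gather "add to" values
    summed = [g + x for g, x in zip(gathered, add_in)]   # vector add
    for a, s in zip(addresses, summed):                  # scatter results
        memory[a] = s                                    # last duplicate wins


def duplicate_safe_add(memory, addresses, add_in):
    """Accumulate every "add in" value, including repeated addresses."""
    totals = {}
    for a, x in zip(addresses, add_in):
        totals[a] = totals.get(a, 0) + x                 # combine duplicates first
    for a, t in totals.items():
        memory[a] += t


mem_naive = [0, 0, 0, 0]
naive_gather_add_scatter(mem_naive, [1, 1, 2], [10, 20, 5])
print(mem_naive)   # prints [0, 20, 5, 0]

mem_safe = [0, 0, 0, 0]
duplicate_safe_add(mem_safe, [1, 1, 2], [10, 20, 5])
print(mem_safe)    # prints [0, 30, 5, 0]
```

In the naive version, both elements aimed at address 1 gather the same original value, so the update of 10 is overwritten by the update of 20.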
  • Publication number: 20100199121
    Abstract: A multiprocessor computer system comprises one or more watchdog timers operable to detect failure of a memory operation based on passage of a certain timing period from a memory operation being issued without a valid response. An error handler is operable to take corrective action regarding the failed memory operation, such as to provide at least one of hardware state management and application state management.
    Type: Application
    Filed: January 28, 2010
    Publication date: August 5, 2010
    Applicant: Cray Inc.
    Inventors: Dennis C. Abts, Steven L. Scott, Aaron F. Godfrey
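
The watchdog behavior in the abstract above, detecting a memory operation that has gone a set period without a valid response, can be modeled with a table of outstanding operations. This is a hedged sketch: the class `MemoryOpWatchdog`, its methods, and the use of integer timestamps are assumptions, and the error handler's corrective action is left to the caller.

```python
class MemoryOpWatchdog:
    """Toy watchdog for outstanding memory operations (hypothetical names)."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.pending = {}                  # operation id -> time it was issued

    def issue(self, op_id, now):
        self.pending[op_id] = now

    def respond(self, op_id):
        """Record a valid response, clearing the operation."""
        self.pending.pop(op_id, None)

    def expired(self, now):
        """Return and clear operations that waited at least `timeout`."""
        timed_out = [
            op for op, issued in self.pending.items()
            if now - issued >= self.timeout
        ]
        for op in timed_out:
            del self.pending[op]
        return timed_out
```

A caller would poll `expired` periodically and pass each returned operation id to whatever error handling it implements.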
  • Patent number: 7764629
    Abstract: A method and system for finding connected components of a graph using a parallel algorithm is provided. The connected nodes system performs a search algorithm in parallel to identify subgraphs of the graph in which the nodes of the subgraph are connected. The connected nodes system also identifies which subgraphs have at least one edge between their nodes. Thus, the connected nodes system effectively generates a hyper-graph with the subgraphs as hyper-nodes that are connected when subgraphs have at least one edge between their nodes. The connected nodes system may then perform a conventional connected component algorithm on the hyper-graph to identify the connected hyper-nodes, which effectively identifies the connected nodes of the underlying graphs.
    Type: Grant
    Filed: August 11, 2005
    Date of Patent: July 27, 2010
    Assignee: Cray Inc.
    Inventor: Simon H. Kahan
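
The two-level approach in the abstract above can be sketched sequentially. In the example below, searches grow from several seed nodes in round-robin fashion as a stand-in for the parallel searches, each resulting subgraph becomes a hyper-node, and a union-find pass over edges that cross subgraphs plays the role of the conventional connected component algorithm on the hyper-graph. The function names, the round-robin scheduling, and the choice of union-find are assumptions, not details from the patent.

```python
from collections import deque


def grow_regions(adj, seeds):
    """Grow one connected subgraph per seed, one search step per seed per round."""
    region = {s: s for s in seeds}
    frontiers = {s: deque([s]) for s in seeds}
    active = True
    while active:
        active = False
        for seed, frontier in frontiers.items():
            if not frontier:
                continue
            active = True
            node = frontier.popleft()
            for nxt in adj[node]:
                if nxt not in region:      # first search to arrive claims the node
                    region[nxt] = seed
                    frontier.append(nxt)
    return region


def find_components(nodes, edges, seeds):
    """Return {node: component label} using subgraphs as hyper-nodes."""
    adj = {n: [] for n in nodes}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    region = grow_regions(adj, seeds)
    for n in nodes:                        # unreached nodes become singleton subgraphs
        region.setdefault(n, n)

    parent = {r: r for r in set(region.values())}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:                     # an edge between subgraphs links hyper-nodes
        ra, rb = find(region[a]), find(region[b])
        if ra != rb:
            parent[ra] = rb

    return {n: find(region[n]) for n in nodes}
```

Nodes that no search reaches are treated as one-node subgraphs, so the result is correct for any choice of seeds; the seeds only affect how much work is left for the hyper-graph pass.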