Patents by Inventor Gregory J. Faanes

Gregory J. Faanes has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Configurable vector length computer processor

Patent number: 8601236

Abstract: A processor core comprises one or more vector units operable to change between a fine-grained vector mode having a shorter maximum vector length and a coarse-grained vector mode having a longer maximum vector length. Changing vector modes comprises halting all instruction stream execution in the core, flushing one or more registers in a register space, reconfiguring one or more vector registers in the register space, and restarting instruction execution in the core.

Type: Grant

Filed: February 29, 2012

Date of Patent: December 3, 2013

Assignee: Cray Inc.

Inventors: Gregory J. Faanes, Eric P. Lundberg, Abdulla Bataineh, Timothy J. Johnson, Michael Parker, James Robert Kohn, Steven L. Scott, Robert Alverson
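The mode-change sequence named in the abstract above (halt, flush, reconfigure, restart) can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the class name `ToyCore`, the register-space size, the two maximum vector lengths, and the idea that a fixed register space is split into fewer, longer registers in coarse-grained mode are assumptions made for illustration, not details from the patent. Flushing is modeled as clearing the registers; real hardware may do something different.

```python
from dataclasses import dataclass, field

# Hypothetical sizes chosen only for illustration; the abstract does not
# state register counts or vector lengths.
REGISTER_SPACE_ELEMENTS = 1024
MAX_VL = {"fine": 16, "coarse": 64}


@dataclass
class ToyCore:
    mode: str = "fine"
    running: bool = True
    vregs: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def __post_init__(self):
        self._configure(self.mode)

    def _configure(self, mode):
        # Assumption: the register space is fixed, so a longer maximum
        # vector length leaves room for fewer vector registers.
        vl = MAX_VL[mode]
        self.vregs = [[0] * vl for _ in range(REGISTER_SPACE_ELEMENTS // vl)]
        self.mode = mode

    def change_mode(self, new_mode):
        self.running = False           # 1. halt all instruction streams
        self.log.append("halt")
        for reg in self.vregs:         # 2. flush registers (modeled as clearing)
            for i in range(len(reg)):
                reg[i] = 0
        self.log.append("flush")
        self._configure(new_mode)      # 3. reconfigure the vector registers
        self.log.append("reconfigure")
        self.running = True            # 4. restart instruction execution
        self.log.append("restart")
```

Calling `ToyCore().change_mode("coarse")` walks through the four steps in order and leaves the toy core with fewer, longer vector registers.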
Relaxed memory consistency model

Patent number: 8307194

Abstract: A method and apparatus to provide specifiable ordering between and among vector and scalar operations within a single streaming processor (SSP) via a local synchronization (Lsync) instruction that operates within a relaxed memory consistency model. Various aspects of that relaxed memory consistency model are described. Further, a combined memory synchronization and barrier synchronization (Msync) for a multistreaming processor (MSP) system is described. Also, a global synchronization (Gsync) instruction that provides synchronization even outside a single MSP system is described. Advantageously, the pipeline or queue of pending memory requests does not need to be drained before the synchronization operation, nor is it required to refrain from determining addresses for and inserting subsequent memory accesses into the pipeline.

Type: Grant

Filed: August 18, 2003

Date of Patent: November 6, 2012

Assignee: Cray Inc.

Inventors: Steven L. Scott, Gregory J. Faanes, Brick Stephenson, William T. Moore, Jr., James R. Kohn
CONFIGURABLE VECTOR LENGTH COMPUTER PROCESSOR

Publication number: 20120221830

Abstract: A processor core comprises one or more vector units operable to change between a fine-grained vector mode having a shorter maximum vector length and a coarse-grained vector mode having a longer maximum vector length. Changing vector modes comprises halting all instruction stream execution in the core, flushing one or more registers in a register space, reconfiguring one or more vector registers in the register space, and restarting instruction execution in the core.

Type: Application

Filed: February 29, 2012

Publication date: August 30, 2012

Applicant: CRAY INC.

Inventors: Gregory J. Faanes, Eric P. Lundberg, Abdulla Bataineh, Timothy J. Johnson, Michael Parker, James Robert Kohn, Steven L. Scott, Robert Alverson
"OR" BIT MATRIX MULTIPLY VECTOR INSTRUCTION

Publication number: 20120072704

Abstract: A processor is operable to execute a bit matrix multiply instruction. In further examples, the processor is operable to perform a vector bit matrix multiply instruction, and is a part of a computerized system.

Type: Application

Filed: February 3, 2011

Publication date: March 22, 2012

Applicant: Cray Inc.

Inventors: Timothy J. Johnson, Gregory J. Faanes
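The abstract above does not define the operation itself. As a rough illustration only, the Python sketch below assumes that an "OR" bit matrix multiply forms each result bit by ANDing a row of one bit matrix with a column of the other and reducing the products with OR, instead of the XOR used in a GF(2) matrix product. That reading is an assumption based on the title, and the function name `or_bit_matmul` is hypothetical.

```python
def or_bit_matmul(a, b):
    """Boolean matrix product: c[i][j] = OR over k of (a[i][k] AND b[k][j]).

    `a` and `b` are lists of rows containing 0/1 integers.
    """
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions must match")
    cols = len(b[0])
    return [
        [int(any(row[k] & b[k][j] for k in range(inner))) for j in range(cols)]
        for row in a
    ]
```

With `a = [[1, 1], [0, 1]]` and `b = [[1, 0], [1, 1]]`, the top-left result bit is 1 under OR reduction, whereas XOR reduction would give 0 because two products are set.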
MULTIPROCESSOR COMPUTER CACHE COHERENCE PROTOCOL

Publication number: 20100318741

Abstract: A multiprocessor computer system comprises a processing node having a plurality of processors and a local memory shared among processors in the node. An L1 data cache is local to each of the plurality of processors, and an L2 cache is local to each of the plurality of processors. An L3 cache is local to the node but shared among the plurality of processors, and the L3 cache is a subset of data stored in the local memory. The L2 caches are subsets of the L3 cache, and the L1 caches are a subset of the L2 caches in the respective processors.

Type: Application

Filed: June 12, 2009

Publication date: December 16, 2010

Applicant: Cray Inc.

Inventors: Steven L. Scott, Gregory J. Faanes, Abdulla Bataineh, Michael Bye, Gerald A. Schwoerer, Dennis C. Abts
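The subset relationships in the abstract above (each L1 within its processor's L2, each L2 within the node's L3, and the L3 within local memory) can be written as a small invariant check. This is a sketch under simplifying assumptions: caches are modeled as plain sets of addresses, and the function name `inclusion_holds` is hypothetical.

```python
def inclusion_holds(l1s, l2s, l3, memory):
    """Return True if every cache level is a subset of the level below it.

    l1s and l2s are lists of address sets, one entry per processor;
    l3 and memory are address sets for the whole node.
    """
    if len(l1s) != len(l2s):
        raise ValueError("need one L1 and one L2 per processor")
    # Chained set comparison: l1 is a subset of l2, and l2 is a subset of l3.
    per_processor_ok = all(l1 <= l2 <= l3 for l1, l2 in zip(l1s, l2s))
    return per_processor_ok and l3 <= memory
```

A configuration such as `l1s=[{0}, {4}]`, `l2s=[{0, 1, 2}, {3, 4}]`, `l3={0, 1, 2, 3, 4}` passes the check; placing an address in an L1 that its L2 does not hold makes it fail.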
Decoupling of write address from its associated write data in a store to a shared memory in a multiprocessor system

Patent number: 7743223

Abstract: In a computer system having a plurality of processors connected to a shared memory, a system and method of decoupling an address from write data in a store to the shared memory. A write request address is generated for a memory write, wherein the write request address points to a memory location in shared memory. A write request is issued to the shared memory, wherein the write request includes the write request address. The write request address is noted in the shared memory and addresses in subsequent load and store requests are compared in shared memory to the write request address. The write data is transferred to the shared memory and matched, within the shared memory, to the write request address. The write data is then stored into the shared memory as a function of the write request address.

Type: Grant

Filed: August 18, 2003

Date of Patent: June 22, 2010

Assignee: Cray Inc.

Inventors: Steven L. Scott, Gregory J. Faanes
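The flow in the abstract above (note the write address first, compare later requests against it, then match and store the data when it arrives) can be sketched as a toy model. The class `ToySharedMemory`, the use of a request id to match data to its address, and the choice to refuse a conflicting load are all assumptions of this sketch; the abstract does not say how matching is done or what happens on an address match.

```python
class ToySharedMemory:
    def __init__(self):
        self.cells = {}      # address -> stored value
        self.pending = {}    # request id -> write address still awaiting data
        self._next_id = 0

    def write_request(self, address):
        """Note a write address before its data is available; return an id."""
        req_id = self._next_id
        self._next_id += 1
        self.pending[req_id] = address
        return req_id

    def conflicts(self, address):
        """Compare a later load/store address against noted write addresses."""
        return address in self.pending.values()

    def load(self, address):
        if self.conflicts(address):
            # Assumption: a conflicting load is refused so that it cannot
            # observe the old value ahead of the pending write.
            raise RuntimeError("load must wait for pending write data")
        return self.cells.get(address)

    def write_data(self, req_id, value):
        """Match arriving data to its noted address, then store it."""
        address = self.pending.pop(req_id)
        self.cells[address] = value
```

In use, `write_request(0x10)` makes a following `load(0x10)` raise until `write_data` delivers the value, while loads to unrelated addresses proceed.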
CONFIGURABLE VECTOR LENGTH COMPUTER PROCESSOR

Publication number: 20100115234

Abstract: A processor core comprises one or more vector units operable to change between a fine-grained vector mode having a shorter maximum vector length and a coarse-grained vector mode having a longer maximum vector length. Changing vector modes comprises halting all instruction stream execution in the core, flushing one or more registers in a register space, reconfiguring one or more vector registers in the register space, and restarting instruction execution in the core.

Type: Application

Filed: October 31, 2008

Publication date: May 6, 2010

Applicant: CRAY INC.

Inventors: Gregory J. Faanes, Eric P. Lundberg, Abdulla Bataineh, Timothy J. Johnson, Michael Parker, James Robert Kohn, Steven L. Scott, Robert Alverson
HIERARCHICAL SHARED SEMAPHORE REGISTERS

Publication number: 20100115236

Abstract: A multiprocessor computer system having a plurality of processing elements comprises one or more core-level hierarchical shared semaphore registers, wherein each core-level hierarchical shared semaphore register is coupled to a different processor core. Each hierarchical shared semaphore register is writable to each of a plurality of streams executing on the coupled processor core. One or more chip-level hierarchical shared semaphore registers are also coupled to a plurality of processor cores, each chip-level hierarchical shared semaphore register writable to each of the plurality of processor cores.

Type: Application

Filed: October 31, 2008

Publication date: May 6, 2010

Applicant: Cray Inc.

Inventors: Abdulla Bataineh, James Robert Kohn, Eric P. Lundberg, Timothy J. Johnson, Thomas L. Court, Gregory J. Faanes, Steven L. Scott
LARGE INTEGER SUPPORT IN VECTOR OPERATIONS

Publication number: 20100115232

Abstract: A vector processor or vector processing computer has a first vector register operable to store two or more vector elements that together comprise a single first large integer and a second vector register operable to store two or more vector elements that together comprise a single second large integer. An adder having a carry-in bit is operable to add the large integer in the first vector register to the large integer in the second vector register by using the carry-in bit to add sequential elements of the vector registers.

Type: Application

Filed: October 31, 2008

Publication date: May 6, 2010

Inventors: Timothy J. Johnson, Eric P. Lundberg, Michael Parker, Gregory J. Faanes
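The carry-chained addition described in the abstract above can be sketched in Python: a large integer is split across several vector elements, and the carry out of each element addition feeds the carry-in of the next. The 64-bit element width and the helper names (`to_elements`, `from_elements`, `add_large`) are assumptions for illustration; the abstract does not specify them.

```python
ELEMENT_BITS = 64                 # assumed element width
MASK = (1 << ELEMENT_BITS) - 1


def to_elements(value, count):
    """Split a non-negative integer into `count` elements, least significant first."""
    return [(value >> (ELEMENT_BITS * i)) & MASK for i in range(count)]


def from_elements(elements):
    """Reassemble an integer from elements stored least significant first."""
    return sum(e << (ELEMENT_BITS * i) for i, e in enumerate(elements))


def add_large(va, vb):
    """Add two element vectors, passing the carry from one element to the next."""
    if len(va) != len(vb):
        raise ValueError("vectors must have the same number of elements")
    carry = 0
    out = []
    for a, b in zip(va, vb):
        total = a + b + carry          # carry-in comes from the previous element
        out.append(total & MASK)
        carry = total >> ELEMENT_BITS  # carry-out for the next element
    return out, carry
```

For example, adding 1 to a two-element value whose low element is all ones clears the low element and carries into the high element; the final carry is returned so the caller can detect overflow of the whole vector.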
VECTOR ATOMIC MEMORY OPERATIONS

Publication number: 20090138680

Abstract: A processor is operable to execute one or more vector atomic memory operations. A further embodiment provides support for atomic memory operations in a memory manager, which is operable to process atomic memory operations and to return a completion notification or a result.

Type: Application

Filed: November 28, 2007

Publication date: May 28, 2009

Inventors: Timothy J. Johnson, Gregory J. Faanes
System and method for processing memory instructions using a forced order queue

Patent number: 7519771

Abstract: A novel system and method for processing memory instructions. One embodiment of the invention provides a method for processing a memory instruction. In this embodiment, the method includes obtaining a memory request; storing the memory request in an Initial Request Queue (IRQ); and processing the memory request from the IRQ by a cache controller, wherein processing includes: identifying a type of the memory request, and processing the memory request in both a local cache and a Force Order Queue (FOQ), wherein processing includes determining if a portion of an address associated with the memory request matches one or more partial addresses in the FOQ and, if the memory request misses in the cache and the address does not match one or more partial addresses in the FOQ, adding the memory request to the FOQ and allocating a cache line in the local cache corresponding to the local cache miss.

Type: Grant

Filed: August 18, 2003

Date of Patent: April 14, 2009

Assignee: Cray Inc.

Inventors: Gregory J. Faanes, Eric P. Lundberg, Steven L. Scott, Robert J. Baird
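The one path that the abstract above spells out (a cache miss with no partial-address match in the FOQ) can be sketched as follows. This is a toy model under stated assumptions: the class `ToyCacheController`, the 8-bit partial address, and the treatment of each address as its own cache line are hypothetical, and every case the abstract does not describe is deliberately left unhandled.

```python
from collections import deque

PARTIAL_BITS = 8  # assumed width of the partial address compared in the FOQ


def partial(address):
    """Return the low-order bits used for the partial address comparison."""
    return address & ((1 << PARTIAL_BITS) - 1)


class ToyCacheController:
    def __init__(self):
        self.irq = deque()   # Initial Request Queue
        self.foq = deque()   # Force Order Queue
        self.cache = set()   # addresses that have an allocated cache line

    def submit(self, kind, address):
        """Store an incoming memory request in the IRQ."""
        self.irq.append((kind, address))

    def step(self):
        """Process one request from the IRQ and report which path it took."""
        kind, address = self.irq.popleft()   # `kind` is the request type
        hit = address in self.cache
        foq_match = any(partial(a) == partial(address) for _, a in self.foq)
        if not hit and not foq_match:
            # The case described in the abstract: add the request to the
            # FOQ and allocate a cache line for the miss.
            self.foq.append((kind, address))
            self.cache.add(address)
            return "miss: added to FOQ, line allocated"
        # Hits and FOQ matches are not described in the abstract.
        return "not covered by abstract"
```

Submitting two requests whose addresses share the same low 8 bits shows the difference: the first takes the miss path, while the second matches a partial address in the FOQ and falls outside what the abstract describes.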
"OR" BIT MATRIX MULTIPLY VECTOR INSTRUCTION

Publication number: 20080288756

Abstract: A processor is operable to execute a bit matrix multiply instruction. In further examples, the processor is operable to perform a vector bit matrix multiply instruction, and is a part of a computerized system.

Type: Application

Filed: May 18, 2007

Publication date: November 20, 2008

Inventors: Timothy J. Johnson, Gregory J. Faanes
Multistream processing memory-and barrier-synchronization method and apparatus

Patent number: 7437521

Abstract: A method and apparatus to provide specifiable ordering between and among vector and scalar operations within a single streaming processor (SSP) via a local synchronization (Lsync) instruction that operates within a relaxed memory consistency model. Various aspects of that relaxed memory consistency model are described. Further, a combined memory synchronization and barrier synchronization (Msync) for a multistreaming processor (MSP) system is described. Also, a global synchronization (Gsync) instruction that provides synchronization even outside a single MSP system is described. Advantageously, the pipeline or queue of pending memory requests does not need to be drained before the synchronization operation, nor is it required to refrain from determining addresses for and inserting subsequent memory accesses into the pipeline.

Type: Grant

Filed: August 18, 2003

Date of Patent: October 14, 2008

Assignee: Cray Inc.

Inventors: Steven L. Scott, Gregory J. Faanes, Brick Stephenson, William T. Moore, Jr., James R. Kohn
Decoupled scalar/vector computer architecture system and method

Patent number: 7334110

Abstract: In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit, a system and method of decoupling operation of the scalar processing unit from that of the vector processing unit, the method comprising sending a vector instruction from the scalar processing unit to the vector dispatch unit, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands, reading a scalar operand, wherein reading includes transferring the scalar operand from the scalar processing unit to the vector dispatch unit, predispatching the vector instruction within the vector dispatch unit if the vector instruction is scalar committed, dispatching the predispatched vector instruction if all required operands are ready, and executing the dispatched vector instruction as a function of the scalar operand.

Type: Grant

Filed: August 18, 2003

Date of Patent: February 19, 2008

Assignee: Cray Inc.

Inventors: Gregory J. Faanes, Steven L. Scott, Eric P. Lundberg, William T. Moore, Jr., Timothy J. Johnson
Vector and scalar data cache for a vector multiprocessor

Patent number: 6665774

Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operating on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a.) an address in the cache array to which a read-data value will be placed when the read-data value is returned from the memory, and (b.

Type: Grant

Filed: October 16, 2001

Date of Patent: December 16, 2003

Assignee: Cray, Inc.

Inventors: Gregory J. Faanes, Eric P. Lundberg
Vector and scalar data cache for a vector multiprocessor

Patent number: 6496902

Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operating on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a.) an address in the cache array to which a read-data value will be placed when the read-data value is returned from the memory, and (b.

Type: Grant

Filed: December 31, 1998

Date of Patent: December 17, 2002

Assignee: Cray Inc.

Inventors: Gregory J. Faanes, Eric P. Lundberg
Vector and scalar data cache for a vector multiprocessor

Publication number: 20020144061

Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operating on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a.) an address in the cache array to which a read-data value will be placed when the read-data value is returned from the memory, and (b.

Type: Application

Filed: October 16, 2001

Publication date: October 3, 2002

Applicant: Cray Inc.

Inventors: Gregory J. Faanes, Eric P. Lundberg