Scalar/vector Processor Interface Patents (Class 712/3)

System and method of processing data using scalar/vector instructions

Patent number: 7676647

Abstract: A processor device is disclosed that includes a register file with a combined condition code register for scalar and vector operations. The processor device utilizes the combined condition code register for scalar and vector operations. Further, a compare operation can store resulting bits in the combined condition code register and a conditional operation can utilize the combined condition code register bits for evaluating a condition.
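As a rough illustration of the idea in this abstract, the sketch below models one condition-code register that both scalar and vector compares write and both scalar and vector conditional operations read. The class name, the four-lane width, and the convention that a scalar compare uses only bit 0 are assumptions made for illustration; they are not taken from the patent.

```python
class CombinedCCR:
    """Toy model of one condition-code register shared by scalar and vector ops."""

    def __init__(self, lanes=4):
        self.lanes = lanes  # assumed lane count, illustrative only
        self.bits = 0       # one result bit per lane

    def scalar_cmp_gt(self, a, b):
        # Assumption: a scalar compare writes its single result into bit 0.
        self.bits = 1 if a > b else 0

    def vector_cmp_gt(self, va, vb):
        # A vector compare writes one result bit per lane.
        self.bits = 0
        for lane, (a, b) in enumerate(zip(va, vb)):
            if a > b:
                self.bits |= 1 << lane

    def scalar_select(self, if_true, if_false):
        # A scalar conditional operation tests bit 0.
        return if_true if self.bits & 1 else if_false

    def vector_select(self, vt, vf):
        # A vector conditional operation tests the bit belonging to each lane.
        return [t if (self.bits >> lane) & 1 else f
                for lane, (t, f) in enumerate(zip(vt, vf))]


ccr = CombinedCCR()
ccr.vector_cmp_gt([5, 1, 7, 2], [3, 4, 6, 9])
lane_max = ccr.vector_select([5, 1, 7, 2], [3, 4, 6, 9])  # per-lane maximum
ccr.scalar_cmp_gt(2, 10)
scalar_max = ccr.scalar_select(2, 10)                      # scalar maximum
```

The same register state serves both selects, which is the point of combining the condition codes; a real design would define the lane count and the scalar bit convention in hardware.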

Type: Grant

Filed: August 18, 2006

Date of Patent: March 9, 2010

Assignee: QUALCOMM Incorporated

Inventors: Lucian Codrescu, Erich Plondke, Taylor Simpson

Method and apparatus for self-healing symmetric multi-processor system interconnects

Patent number: 7661006

Abstract: A computer implemented method, apparatus, and computer program product for managing symmetric multiprocessor interconnects. The process identifies functional communication connections between each processor in a plurality of processors on a multiprocessor to form identified functional communication connections. The process maps every functional communication connection between any two processors in the plurality of processors, based on the identified functional communication connections, to form an interconnect matrix. The process creates a path map using the interconnect matrix. The path map comprises a sequence of communication connections between the plurality of processors. The process initializes the plurality of processors using the path map.
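The steps in this abstract (identify working links, build an interconnect matrix, derive a path map) can be sketched as follows. This is a minimal illustration that assumes a breadth-first traversal is an acceptable way to order the links; the abstract does not say how the path map is derived, and the function names and the four-processor topology are hypothetical.

```python
from collections import deque


def build_interconnect_matrix(n, working_links):
    """Return an n x n boolean matrix with True wherever a link is functional."""
    matrix = [[False] * n for _ in range(n)]
    for a, b in working_links:
        matrix[a][b] = matrix[b][a] = True  # links are treated as bidirectional
    return matrix


def build_path_map(matrix, start=0):
    """Return (from, to) links in the order processors are reached from start.

    Assumption: breadth-first order stands in for the patent's path map.
    """
    visited = {start}
    queue = deque([start])
    path = []
    while queue:
        cur = queue.popleft()
        for nxt in range(len(matrix)):
            if matrix[cur][nxt] and nxt not in visited:
                visited.add(nxt)
                path.append((cur, nxt))
                queue.append(nxt)
    return path


# Hypothetical 4-processor system in which the 0-2 link is not functional.
links = [(0, 1), (1, 2), (2, 3), (0, 3)]
matrix = build_interconnect_matrix(4, links)
path_map = build_path_map(matrix)
```

Because only functional links enter the matrix, a failed link is simply routed around when the path map is rebuilt.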

Type: Grant

Filed: January 9, 2007

Date of Patent: February 9, 2010

Assignee: International Business Machines Corporation

Inventors: Luai A. Abou-Emara, Mark David McLaughlin, Jorge N. Yanez

Apparatus, processor, cache memory and method of processing vector data

Publication number: 20090228657

Abstract: An apparatus includes a vector unit to process vector data, a cache memory which includes a plurality of cache lines to store a plurality of divisional data being sent from a main memory, each of the divisional data of vector data having been divided according to a capacity of a cache line, and a cache controller to send all of the divisional data as the vector data to the vector unit after the cache lines have stored all of the divisional data including the vector data.

Type: Application

Filed: February 6, 2009

Publication date: September 10, 2009

Applicant: NEC Corporation

Inventor: Takashi Hagiwara

Processing Unit Incorporating Vectorizable Execution Unit

Publication number: 20090150647

Abstract: A vectorizable execution unit is capable of being operated in a plurality of modes, with the processing lanes in the vectorizable execution unit grouped into different combinations of logical execution units in different modes. By doing so, processing lanes can be selectively grouped together to operate as different types of vector execution units and/or scalar execution units, and if desired, dynamically switched during runtime to process various types of instruction streams in a manner that is best suited for each type of instruction stream. As a consequence, a single vectorizable execution unit may be configurable, e.g., via software control, to operate either as a vector execution unit or a plurality of scalar execution units.

Type: Application

Filed: December 7, 2007

Publication date: June 11, 2009

Inventors: Eric Oliver Mejdrich, Adam James Muff, Matthew Ray Tubbs

Staggered execution stack for vector processing

Patent number: 7457938

Abstract: In one embodiment, the present invention includes a method for executing an operation on low order portions of first and second source operands using a first execution stack of a processor and executing the operation on high order portions of the first and second source operands using a second execution stack of the processor, where the operation in the second execution stack is staggered by one or more cycles from the operation in the first execution stack. Other embodiments are described and claimed.

Type: Grant

Filed: September 30, 2005

Date of Patent: November 25, 2008

Assignee: Intel Corporation

Inventors: Stephan Jourdan, Avinash Sodani, Michael Fetterman, Per Hammarlund, Ronak Singhal, Glenn Hinton

Multistream processing memory- and barrier-synchronization method and apparatus

Patent number: 7437521

Abstract: A method and apparatus to provide specifiable ordering between and among vector and scalar operations within a single streaming processor (SSP) via a local synchronization (Lsync) instruction that operates within a relaxed memory consistency model. Various aspects of that relaxed memory consistency model are described. Further, a combined memory synchronization and barrier synchronization (Msync) for a multistreaming processor (MSP) system is described. Also, a global synchronization (Gsync) instruction that provides synchronization even outside a single MSP system is described. Advantageously, the pipeline or queue of pending memory requests does not need to be drained before the synchronization operation, nor is it required to refrain from determining addresses for and inserting subsequent memory accesses into the pipeline.

Type: Grant

Filed: August 18, 2003

Date of Patent: October 14, 2008

Assignee: Cray Inc.

Inventors: Steven L. Scott, Gregory J. Faanes, Brick Stephenson, William T. Moore, Jr., James R. Kohn

Simulation of processor status flags

Publication number: 20080222388

Abstract: The dynamic, efficient, and accurate simulation of processor status flags is described. One exemplary embodiment includes simulation of processor status flags of a first CPU type on a second CPU type using simple arithmetic operations to calculate status flags in parallel, and by keeping an intermediate state that allows efficient calculation of status flags when they are needed. In this way, sufficient intermediate state exists to generate desired status flags either directly or with a simple operation.
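The "intermediate state" idea can be illustrated with a generic lazy-flags sketch: the simulator records the operands and the unmasked result of the last operation and derives each flag only when it is read. The 8-bit width, the class name, and the particular flag formulas are assumptions for illustration and are not drawn from the application itself.

```python
class LazyFlags:
    """Toy 8-bit adder that stores intermediate state and derives flags on demand."""

    MASK = 0xFF
    SIGN = 0x80

    def __init__(self):
        self.a = self.b = self.result = 0

    def add(self, a, b):
        # Store only the operands and the unmasked (9-bit) sum; no flags yet.
        self.a, self.b = a & self.MASK, b & self.MASK
        self.result = self.a + self.b
        return self.result & self.MASK

    @property
    def zero(self):
        return (self.result & self.MASK) == 0

    @property
    def carry(self):
        # A carry out shows up as bit 8 of the unmasked sum.
        return self.result > self.MASK

    @property
    def sign(self):
        return bool(self.result & self.SIGN)

    @property
    def overflow(self):
        # Signed overflow: both operands differ in sign from the result.
        r = self.result & self.MASK
        return bool((self.a ^ r) & (self.b ^ r) & self.SIGN)


flags = LazyFlags()
first = flags.add(0x7F, 0x01)   # 127 + 1 overflows the signed range
first_state = (flags.zero, flags.carry, flags.sign, flags.overflow)
second = flags.add(0xFF, 0x01)  # 255 + 1 wraps to 0 with a carry out
second_state = (flags.zero, flags.carry, flags.sign, flags.overflow)
```

If guest code never reads a flag before the next arithmetic instruction overwrites the state, no flag computation is performed at all, which is where the saving comes from.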

Type: Application

Filed: March 5, 2007

Publication date: September 11, 2008

Applicant: Microsoft Corporation

Inventor: Darek Mihocka

Scalar result producing method in vector/scalar system by vector unit from vector results according to modifier in vector instruction

Patent number: 7350057

Abstract: Described herein is a method and system for executing instructions. The system comprises a scalar unit for executing scalar instructions each defining a single value pair; a vector unit for executing vector instructions each defining multiple value pairs; and an instruction decoder for receiving a single stream of instructions including scalar instructions and vector instructions and operable to direct scalar instructions to the scalar unit and vector instructions to the vector unit. The vector unit can comprise a plurality of value processing units and a scalar result unit. The scalar unit can comprise a scalar register file. Communication between the vector unit and the scalar unit is enabled by allowing the vector unit to access the scalar register file and allowing the scalar unit to access output from the scalar result unit. The output of the scalar result unit may be based on the relative magnitudes of outputs from the plurality of value processing units.

Type: Grant

Filed: November 6, 2006

Date of Patent: March 25, 2008

Assignee: Broadcom Corporation

Inventors: Stephen Barlow, Neil Bailey, Timothy Ramsdale, David Plowman, Robert Swann

Decoupled scalar/vector computer architecture system and method

Patent number: 7334110

Abstract: In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit, a system and method of decoupling operation of the scalar processing unit from that of the vector processing unit, the method comprising sending a vector instruction from the scalar processing unit to the vector dispatch unit, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands, reading a scalar operand, wherein reading includes transferring the scalar operand from the scalar processing unit to the vector dispatch unit, predispatching the vector instruction within the vector dispatch unit if the vector instruction is scalar committed, dispatching the predispatched vector instruction if all required operands are ready, and executing the dispatched vector instruction as a function of the scalar operand.

Type: Grant

Filed: August 18, 2003

Date of Patent: February 19, 2008

Assignee: Cray Inc.

Inventors: Gregory J. Faanes, Steven L. Scott, Eric P. Lundberg, William T. Moore, Jr., Timothy J. Johnson

Method and apparatus for data processing

Patent number: 7305540

Abstract: Methods and apparatuses for a data processing system are described herein. In one aspect of the invention, an exemplary apparatus includes a chip interconnect, a memory controller for controlling the host memory comprising DRAM memory, the memory controller coupled to the chip interconnect, a scalar processing unit coupled to the chip interconnect wherein the scalar processing unit is capable of executing instructions to perform scalar data processing, a vector processing unit coupled to the chip interconnect wherein the vector processing unit is capable of executing instructions to perform vector data processing, and an input/output (I/O) interface coupled to the chip interconnect wherein the I/O interface receives/transmits data from/to the scalar and/or vector processing units.

Type: Grant

Filed: December 31, 2001

Date of Patent: December 4, 2007

Assignee: Apple Inc.

Inventors: Sushma Shrikant Trivedi, Joseph P. Bratt, Jack Benkual, Vaughn Todd Arnold, Derek Fujio Iwamoto

Functional-level instruction-set computer architecture for processing application-layer content-service requests such as file-access requests

Patent number: 7254696

Abstract: A functional-level instruction-set computing (FLIC) architecture executes higher-level functional instructions such as lookups and bit-compares of variable-length operands. Each FLIC processing-engine slice has specialized processing units including a lookup unit that searches for a matching entry in a lookup cache. Variable-length operands are stored in execution buffers. The operand length and location in the execution buffer are stored in fixed-length general-purpose registers (GPRs) that also store fixed-length operands. A copy/move unit moves data between input and output buffers and one or more FLIC processing-engine slices. Multiple contexts can each have a set of GPRs and execution buffers. An expansion buffer in a FLIC slice can be allocated to a context to expand that context's execution buffer for storing longer operands.

Type: Grant

Filed: December 12, 2002

Date of Patent: August 7, 2007

Assignee: Alacritech, Inc.

Inventors: Millind Mittal, Mehul Kharidia, Tarun Kumar Tripathy, J. Sukarno Mertoguno

Method and apparatus for prioritized instruction issue queue in a processor

Patent number: 7032101

Abstract: An apparatus and method in a high performance processor for issuing instructions, comprising: a classification logic for sorting instructions into a number of priority categories, a plurality of instruction queues storing the instructions of differing priorities, and an issue logic selecting from which queue to dispatch instructions for execution. This apparatus and method can be implemented in both in-order and out-of-order execution processor architectures. The invention also involves instruction cloning and the use of various predictive techniques.

Type: Grant

Filed: February 26, 2002

Date of Patent: April 18, 2006

Assignee: International Business Machines Corporation

Inventors: Michael Karl Gschwind, Valentina Salapura

Protocol processor intended for the execution of a collection of instructions in a reduced number of operations

Patent number: 7028145

Abstract: Protocol processor intended to be associated with at least one main processor of a system with a view to the execution of tasks to which the main processor is not suited. The protocol processor comprises a program part (30) including an incrementation register (31), a program memory (33) connected to the incrementation register (31) in order to receive addresses thereof, a decoding part (35) intended to receive instructions from the program memory (33) of the program part (30) with a view to executing an instruction in two cycles, and a data part (36) for executing the instruction.

Type: Grant

Filed: July 10, 1997

Date of Patent: April 11, 2006

Inventors: Gerard Chauvel, Francis Aussedat, Pierre Calippe

Fast and flexible scan conversion and matrix transpose in a SIMD processor

Patent number: 6963341

Abstract: The present invention provides efficient ways to implement scan conversion and matrix transpose operations using vector multiplex operations in a SIMD processor. The present method provides a very fast and flexible way to implement different scan conversions, such as zigzag conversion, and matrix transpose for 2×2, 4×4, 8×8 blocks commonly used by all video compression and decompression algorithms.
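Both operations named in this abstract amount to fixed permutations of a block's elements, which is what makes them natural targets for a vector multiplex (element-select) operation. The sketch below builds the index tables in plain Python; the helper names are hypothetical and the code models only the permutations, not the patent's SIMD instructions.

```python
def zigzag_order(n):
    """Return flat source indices of an n x n block in zigzag scan order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Order by anti-diagonal; odd diagonals run top to bottom,
    # even diagonals run bottom to top.
    cells.sort(key=lambda rc: (rc[0] + rc[1],
                               rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [r * n + c for r, c in cells]


def transpose_order(n):
    """Return flat source indices that transpose an n x n row-major block."""
    return [c * n + r for r in range(n) for c in range(n)]


def permute(block, order):
    """Select elements of block by index table, like a software multiplex."""
    return [block[i] for i in order]


zz4 = zigzag_order(4)
transposed = permute([1, 2, 3, 4], transpose_order(2))
```

Changing the scan pattern then means loading a different index table rather than writing new code, which matches the flexibility the abstract claims.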

Type: Grant

Filed: May 20, 2003

Date of Patent: November 8, 2005

Inventor: Tibet Mimar

Method and apparatus for obtaining a scalar value directly from a vector register

Patent number: 6857061

Abstract: A method and apparatus for obtaining a scalar value from a vector register for use in a mixed vector and scalar instruction, including providing a vector in a vector register file, and embedding a location identifier of the scalar value within the vector in the bits defining the mixed vector and scalar instruction. The scalar value can be used directly from the vector register without the need to load the scalar to a scalar register prior to executing the instruction. The scalar location identifier may be embedded in the secondary op code of the instruction, or the instruction may have dedicated bits for providing the location of the scalar within the vector.

Type: Grant

Filed: April 7, 2000

Date of Patent: February 15, 2005

Assignee: Nintendo Co., Ltd.

Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng

Mixed hardware/software architecture and method for processing xDSL communications

Patent number: 6839889

Abstract: A method of implementing a scaleable architecture for a communications system is disclosed, based on minimizing a total gate count for the communications system to reduce cost, complexity, etc. The method considers the requirements of a particular communications transmission process that is dividable into individual transmission tasks. A computational complexity for each of said N individual transmission tasks respectively, said computational complexity being based on a number of instructions per second (MIPs) required by a computational circuit to perform each of said N individual transmission tasks; a number of gates and/or transistors required to implement each of individual transmission task using a hardware based or software based computing circuit, etc. After determining an effective number of MIPs achievable by such circuits, the N tasks are allocated in a gate efficient manner for a final design architecture, or for a working implementation in the field.

Type: Grant

Filed: March 1, 2001

Date of Patent: January 4, 2005

Assignee: Realtek Semiconductor Corp.

Inventor: Ming-Kang Liu

SIMD datapath coupled to scalar/vector/address/conditional data register file with selective subpath scalar processing mode

Patent number: 6839828

Abstract: There is provided a processor designed to operate in a plurality of modes for processing vector and scalar instructions. Register files are each for storing scalar and vector data and address information. A parallel vector unit, coupled to the register files, includes functional units configurable to operate in a vector operation mode and a scalar operation mode. The vector unit includes an apparatus for tightly coupling the functional units to perform an operation specified by a current instruction. Under a vector operation mode, the vector unit performs, in parallel, a single vector operation on a plurality of data elements. The operations performed on the plurality of data elements are each performed by a different functional unit of the vector unit. Under a scalar operation mode, the vector unit performs a scalar operation on a data element received from the register files in a functional unit within the vector unit.

Type: Grant

Filed: August 14, 2001

Date of Patent: January 4, 2005

Assignee: International Business Machines Corporation

Inventors: Michael Karl Gschwind, Harm Peter Hofstee, Martin Edward Hopkins

CPU datapaths and local memory that executes either vector or superscalar instructions

Publication number: 20040193837

Abstract: A data processing system includes left and right data path processors coupled to an instruction cache. The left and right data path processors, respectively, are configured to execute left and right instruction words received in a single clock cycle from the instruction cache. The left and right data path processors are also configured to operate in a scalar mode and a vector mode. The processors (a) execute the left and right instruction words as two separate instructions in the scalar mode, and (b) execute the left and right instruction words as one instruction in the vector mode.

Type: Application

Filed: March 31, 2003

Publication date: September 30, 2004

Inventors: Patrick Devaney, David M. Keaton, Katsumi Murai

Vector instructions composed from scalar instructions

Publication number: 20040193838

Abstract: A processing system includes left and right data path processors configured to execute instructions issued from an instruction cache. A vector instruction includes a first word configured for execution by the left data path processor and a second word configured for execution by the right data path processor. The first and second words are issued in the same clock cycle from the instruction cache, and are interlocked to jointly specify a single vector instruction. The first and second words include code for vector operation and code for vector control. The first and second words are concurrently executed to complete the vector operation, free of any other instructions issued from the instruction cache.

Type: Application

Filed: March 31, 2003

Publication date: September 30, 2004

Inventors: Patrick Devaney, David M. Keaton, Katsumi Murai

Graphics processing unit with transform module capable of handling scalars and vectors

Patent number: 6734874

Abstract: A method, apparatus and article of manufacture are provided for handling both scalar and vector components during graphics processing. To accomplish this, vertex data is received in the form of vectors after which vector operations are performed on the vector vertex data. Next, scalar operations may be executed on an output of the vector operations, thereby rendering vertex data in the form of scalars. Such scalar vertex data may then be converted to vector vertex data for performing vector operations thereon.

Type: Grant

Filed: January 31, 2001

Date of Patent: May 11, 2004

Assignee: nVidia Corporation

Inventors: John Erik Lindholm, Simon Moy, David B. Kirk, Paolo E. Sabella

Global tree network for computing structures

Publication number: 20040078493

Abstract: A system and method for enabling high-speed, low-latency global tree communications among processing nodes interconnected according to a tree network structure. The global tree network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the tree via links to facilitate performance of low-latency global processing operations at nodes of the virtual tree and sub-tree structures. The global operations include one or more of: global broadcast operations downstream from a root node to leaf nodes of a virtual tree, global reduction operations upstream from leaf nodes to the root node in the virtual tree, and point-to-point message passing from any node to the root node in the virtual tree.

Type: Application

Filed: August 22, 2003

Publication date: April 22, 2004

Inventors: Matthias A Blumrich, Dong Chen, Paul W Coteus, Alan G Gara, Mark E Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D Steinmacher-Burow, Todd E Takken, Pavlos M Vranas

Vector and scalar data cache for a vector multiprocessor

Patent number: 6665774

Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operating on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a.) an address in the cache array to which a read-data value will be placed when the read-data value is returned from the memory, and (b.

Type: Grant

Filed: October 16, 2001

Date of Patent: December 16, 2003

Assignee: Cray, Inc.

Inventors: Gregory J. Faanes, Eric P. Lundberg

Cycle segmented prefix circuits

Patent number: 6609189

Abstract: The poor scalability of existing superscalar processors has been of great concern to the computer engineering community. In particular, the critical-path delays of many components in existing implementations grow quadratically with the issue width and the window size. This patent presents a novel way to reimplement these components and reduce their critical-path delay growth. It then describes an entire processor microarchitecture, called the Ultrascalar processor, that has better critical-path delay growth than existing superscalars. Most of our scalable designs are based on a single circuit, a cyclic segmented parallel prefix (cspp). We observe that processor components typically operate on a wrap-around sequence of instructions, computing some associative property of that sequence. For example, to assign an ALU to the oldest requesting instruction, each instruction in the instruction sequence must be told whether any preceding instructions are requesting an ALU.

Type: Grant

Filed: March 12, 1999

Date of Patent: August 19, 2003

Assignee: Yale University

Inventors: Bradley C. Kuszmaul, Dana Sue Henry-Kuszmaul

Method and apparatus for vector register with scalar values

Patent number: 6530011

Abstract: A method and an apparatus for implementing mixed scalar and vector values in a digital processing system. In one embodiment, a digital processing system, which contains processing unit and memories, is capable of identifying a first data in a first scalar register and a second data in a vector register. Upon fetching the first data as a first operand and the second data as a second operand, the processing unit performs an operation between the first and second operands in response to an operator. After operations, the result is subsequently stored in a second scalar register.

Type: Grant

Filed: October 20, 1999

Date of Patent: March 4, 2003

Assignee: SandCraft, Inc.

Inventor: Jack H. Choquette

Processor implementation having unified scalar and SIMD datapath

Publication number: 20030037221

Abstract: An improved processor implementation is described in which scalar and vector processing components are merged to reduce complexity. In particular, the implementation includes a scalar-vector register file for storing scalar and vector instructions, as well as a parallel vector unit comprising functional units that can process vector or scalar instructions as required. A further aspect of the invention provides the ability to disable unused functional units in the parallel vector unit, such as during a scalar operation, to achieve significant power savings.

Type: Application

Filed: August 14, 2001

Publication date: February 20, 2003

Applicant: International Business Machines Corporation

Inventors: Michael Karl Gschwind, Harm Peter Hofstee, Martin Edward Hopkins

Vector and scalar data cache for a vector multiprocessor

Patent number: 6496902

Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operating on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a.) an address in the cache array to which a read-data value will be placed when the read-data value is returned from the memory, and (b.

Type: Grant

Filed: December 31, 1998

Date of Patent: December 17, 2002

Assignee: Cray Inc.

Inventors: Gregory J. Faanes, Eric P. Lundberg

Data processing apparatus including a plurality of pipeline processing mechanisms in which memory access instructions are carried out in a memory access pipeline

Publication number: 20020099922

Abstract: The memory access arithmetic operation instruction is executed in the data processing apparatus including a memory access pipeline and arithmetic operation pipeline. The decoding and development of the memory access arithmetic operation are carried out after the memory access arithmetic operation instruction is input to the memory access pipeline and the memory access results and the memory access arithmetic instruction are output to the arithmetic operation pipeline.

Type: Application

Filed: January 13, 1999

Publication date: July 25, 2002

Applicant: Fujitsu Limited

Inventor: Mariko Sakamoto

Emptying packed data state during execution of packed data instructions

Patent number: 6266686

Abstract: A method in a computer system which includes receiving a first instruction which indicates termination of execution of instructions which operate upon packed data stored in a first storage area. The first storage area is used for modifying data responsive to execution of floating point instructions. A plurality of tags is associated with the first storage area indicating that locations in the first storage area are either empty or non-empty responsive to the execution of the floating point instructions which modify data contained in the first storage area. Responsive to the receiving of the first instruction which indicates termination of execution of instructions which operate upon the packed data stored in the first storage area, the method sets only the plurality of tags to an empty state. In different embodiments, setting of the plurality of tags to a non-empty state occurs responsive to receiving a second instruction.

Type: Grant

Filed: March 4, 1999

Date of Patent: July 24, 2001

Assignee: Intel Corporation

Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan

Massively parallel computer including auxiliary vector processor

Patent number: 6219775

Abstract: A massively-parallel computer includes a plurality of processing nodes and at least one control node interconnected by a network. The network facilitates the transfer of data among the processing nodes and of commands from the control node to the processing nodes. Each processing node includes an interface for transmitting data over, and receiving data and commands from, the network, at least one memory module for storing data, a node processor and an auxiliary processor. The node processor receives commands received by the interface and processes data in response thereto, in the process generating memory access requests for facilitating the retrieval of data from or storage of data in the memory module. The node processor further controls the transfer of data over the network by the interface. The auxiliary processor is connected to the memory module and the node processor.

Type: Grant

Filed: March 18, 1998

Date of Patent: April 17, 2001

Assignee: Thinking Machines Corporation

Inventors: Jon P. Wade, Daniel R. Cassiday, Robert D. Lordi, Guy Lewis Steele, Jr., Margaret A. St. Pierre, Monica C. Wong-Chan, Zahi S. Abuhamdeh, David C. Douglas, Mahesh N. Ganmukhi, Jeffrey V. Hill, W. Daniel Hillis, Scott J. Smith, Shaw-Wen Yang, Robert C. Zak, Jr.

Parallel processing method and system using a lazy parallel data type to reduce inter-processor communication

Patent number: 6212617

Abstract: A parallel programming system provides a lazy collection oriented data type that reduces inter-processor communication in programs executed on parallel computers. The lazy collection oriented data type is provided as a data type in a parallel programming language. The parallel language supports both data-parallel and control-parallel operations. These operations take advantage of the lazy collection oriented data type to defer or reduce inter-processor communication until an operation on the data type requires that it be balanced across a set of processors.

Type: Grant

Filed: June 30, 1998

Date of Patent: April 3, 2001

Assignee: Microsoft Corporation

Inventor: Jonathan C. Hardwick

Microprocessor modified to perform inverse discrete cosine transform operations on a one-dimensional matrix of numbers within a minimal number of instructions

Patent number: 6141673

Abstract: A multimedia extension unit (MEU) is provided for performing various multimedia-type operations. The MEU can be coupled either through a coprocessor bus or a local central processing unit (CPU) bus to a conventional processor. The MEU employs vector registers, a vector arithmetic logic unit (ALU), and an operand routing unit (ORU) to perform a maximum number of the multimedia operations within as few instruction cycles as possible. Complex algorithms are readily performed by arranging operands upon the vector ALU in accordance with the desired algorithm flowgraph. The ORU aligns the operands within partitioned slots or sub-slots of the vector registers using vector instructions unique to the MEU. At the output of the ORU, operand pairs from vector source or destination registers can be easily routed and combined at the vector ALU.

Type: Grant

Filed: May 25, 1999

Date of Patent: October 31, 2000

Assignees: Advanced Micro Devices, Inc., Compaq Computer Corp.

Inventors: John S. Thayer, John Gregory Favor, Frederick D. Weber

Apparatus and method for tracing microprocessor instructions

Patent number: 6106573

Abstract: A microprocessor implements an instruction tracing mechanism that saves the state of the microprocessor without special hardware. Prior to the execution of a traced instruction, a trace microcode routine is implemented that saves the state of the microprocessor to external memory. The state information saved by the trace microcode routine varies depending upon the amount of data needed by the end user. After the state of the processor has been saved, the trace instruction is executed. State information that changed during the execution of the trace instruction is saved to memory prior to a subsequent instruction. The trace instruction mechanism advantageously requires minimal special hardware and expedites the saving of the processor state information.

Type: Grant

Filed: May 14, 1999

Date of Patent: August 22, 2000

Assignee: Advanced Micro Devices, Inc.

Inventors: Rupaka Mahalingaiah, James K. Pickett
System and method for processing multiple received signal sources

Patent number: 6073158

Abstract: A system and method for time slicing multiple received data streams utilizing multiple processors in such a manner as to ensure that all processors are running at full capability and are efficiently timesharing a global memory storage area. The received data streams are each divided into fixed portions called spans. The invention is operable for sequencing the movement of the time-sliced spans between the processors, adjusting the scheduling of particular ones of the time-sliced spans as a function of either processor availability or maintenance of real-time transmission of the received real-time time-sliced data streams.

Type: Grant

Filed: July 29, 1993

Date of Patent: June 6, 2000

Assignee: Cirrus Logic, Inc.

Inventors: Robert Marshall Nally, John Charles Schafer
Multiple processor, distributed memory computer with out-of-order processing

Patent number: 6061776

Abstract: A distributed memory computer architecture associates separate memory blocks with their own processors, each of which executes the same program. A processor fetching data or instructions from its local memory also broadcasts that fetched data or instruction to the other processors to cut the time required for them to request this data. Runs of instructions and data local to one processor provide improved performance, which the system as a whole captures through the ability of the other processors, which are not executing local data or instructions, to execute instructions out of order and return to find the data ready in a buffer for rapid use.

Type: Grant

Filed: April 21, 1999

Date of Patent: May 9, 2000

Assignee: Wisconsin Alumni Research Foundation

Inventors: Douglas C. Burger, Stefanos Kaxiras, James R. Goodman
Apparatus and method for system control using a self-timed asynchronous control structure

Patent number: 6055620

Abstract: A control apparatus and method is provided for controlling operations of functional units in systems. The control apparatus and method implement a set of operations that can include dependencies between the functional units of a system to complete each operation. For example, in an asynchronous digital processor, self-timing and inter-block communication are used to implement a self-timed scheduler. The self-timed scheduler and method implement an instruction set using a plurality of functional units of the asynchronous digital processor. A scheduler can include a scheduler decoder that decodes each instruction to generate functional unit schedule and control information, a communication device and a plurality of scheduler functional unit controllers, wherein each of the scheduler functional unit controllers corresponds to one of the plurality of functional units of a system.

Type: Grant

Filed: September 18, 1997

Date of Patent: April 25, 2000

Assignees: LG Semicon Co., Ltd., Cogency Technology Incorporated

Inventors: Nigel C. Paver, Paul Day
Method and apparatus for moving select non-contiguous bytes of packed data in a single instruction

Patent number: 6052769

Abstract: A method comprises decoding a single instruction having a first operand identifying a plurality of bytes of packed data and a second operand identifying a corresponding plurality of byte masks. Each of the plurality of byte masks identified by the second operand of the single decoded instruction are analyzed, wherein select bytes of the plurality of bytes identified by the first operand are moved to an implicitly defined location based, at least in part, on the analysis of the individual byte masks identified by the second operand of the single decoded instruction.
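
The masked move described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `masked_byte_move`, the use of the high bit of each mask byte as the select bit, and the explicit `dest` buffer (standing in for the implicitly defined location) are all assumptions made for illustration.

```python
def masked_byte_move(src: bytes, masks: bytes, dest: bytearray, offset: int = 0) -> None:
    """Copy src[i] into dest[offset + i] only where masks[i] selects it.

    Assumption: a byte is selected when the high bit (0x80) of its mask is set.
    The abstract only says the masks are analyzed; it does not name the bit.
    """
    if len(src) != len(masks):
        raise ValueError("src and masks must have the same length")
    for i, (value, mask) in enumerate(zip(src, masks)):
        if mask & 0x80:
            dest[offset + i] = value  # selected byte is moved
        # unselected bytes leave the destination untouched


# Hypothetical 8-byte packed operand and its byte masks.
packed = bytes([1, 2, 3, 4, 5, 6, 7, 8])
masks = bytes([0x80, 0x00, 0x80, 0x00, 0x00, 0xFF, 0x00, 0x7F])
memory = bytearray(8)
masked_byte_move(packed, masks, memory)
```

Only bytes 0, 2 and 5 are written, so non-contiguous bytes are moved in one call while the rest of the destination is left as it was.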

Type: Grant

Filed: March 31, 1998

Date of Patent: April 18, 2000

Assignee: Intel Corporation

Inventors: Thomas R. Huff, Shreekant Thakkar, Nathaniel Hoffman
User programmable circuit and method for data processing apparatus using a self-timed asynchronous control structure

Patent number: 6044453

Abstract: A programmable circuit and method for a data processing apparatus is provided that allows an entire instruction or instruction set to be modified. According to the present invention, the instruction can be modified, for example, during initialization or execution. The programmable circuit for a data processing apparatus can include a plurality of functional units, each functional unit performing a set of prescribed operations; a programmable circuit that is capable of modifying an entire instruction; a controller that decodes a current instruction to perform a corresponding instruction task using the plurality of functional units; and a communications device coupling the functional units, the programmable circuit and the controller.

Type: Grant

Filed: September 30, 1997

Date of Patent: March 28, 2000

Assignees: LG Semicon Co., Ltd., Cogency Technology Incorporated

Inventor: Nigel C. Paver
High performance, superscalar-based computer system with out-of-order instruction execution

Patent number: 6038654

Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instructions in-order.
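
The in-order retirement described in this abstract can be sketched in Python as follows. This is a simplified model rather than the patented design: the function `retirement_timeline` and its inputs are hypothetical, and the set `done` stands in for the temporary data registers that hold results until all prior instructions have executed.

```python
def retirement_timeline(completion_order):
    """Return, for each completion event, the instructions retired at that point.

    completion_order lists program-order instruction indices (0, 1, 2, ...)
    in the order the functional units finish them (possibly out of order).
    """
    done = set()          # results waiting in temporary registers (assumed model)
    next_to_retire = 0    # oldest instruction that has not yet retired
    timeline = []
    for index in completion_order:
        done.add(index)
        retired_now = []
        # Retire strictly in program order: stop at the first unfinished one.
        while next_to_retire in done:
            retired_now.append(next_to_retire)
            next_to_retire += 1
        timeline.append(retired_now)
    return timeline


# Instruction 2 finishes first but must wait for 0 and 1.
timeline = retirement_timeline([2, 0, 1, 3])
```

In this run nothing retires when instruction 2 completes; it retires together with instruction 1 once every earlier instruction has finished.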

Type: Grant

Filed: June 23, 1999

Date of Patent: March 14, 2000

Assignee: Seiko Epson Corporation

Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
MPEG motion compensation using operand routing and performing add and divide in a single instruction

Patent number: 5991865

Abstract: A routable operand and selectable operation processor multimedia extension unit is employed to motion compensate MPEG video using improved vector processing. A vector processing unit executes an add and divide instruction that adds two vector registers and divides the result in a single instruction. This is implemented through loading a first vector register with a first plurality of elements from a source block. A second vector register is then loaded with a second plurality of elements that are adjacent to the first plurality of elements. The add and divide instruction is then executed on the first and second vector registers, yielding an interpolated source element that is stored in a resultant vector register.
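
The add-and-divide step described in this abstract can be illustrated with a short Python sketch. Here plain lists stand in for vector registers. The divisor of 2, the truncating division, and the name `add_and_divide` are assumptions, since the abstract does not state the divisor or rounding mode.

```python
def add_and_divide(first, second, divisor=2):
    """Element-wise (a + b) // divisor, modeling one combined vector operation.

    Assumptions: integer elements, truncating division, default divisor of 2.
    """
    if len(first) != len(second):
        raise ValueError("vector registers must have the same length")
    return [(a + b) // divisor for a, b in zip(first, second)]


# Hypothetical row of source-block pixels.
row = [10, 20, 30, 40, 50]
first_register = row[:-1]    # first plurality of elements
second_register = row[1:]    # elements adjacent to the first plurality
interpolated = add_and_divide(first_register, second_register)
```

Each result element is the average of two neighboring source elements, which is the kind of interpolation that motion compensation between pixel positions relies on.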

Type: Grant

Filed: December 31, 1996

Date of Patent: November 23, 1999

Assignee: Compaq Computer Corporation

Inventors: Brian E. Longhenry, Gary W. Thome, John S. Thayer
Multiprocessor arrangement including bus arbitration scheme involving plural CPU clusters that address each other as "phantom" CPUs

Patent number: 5935230

Abstract: At least two clusters of CPUs are present in a multiprocessor computer system. Each CPU cluster has a given number of CPUs, each CPU having an associated ID such as an ID number. An additional ID number, not associated with a CPU in the same cluster, is associated with the opposite CPU cluster that appears to the original cluster as a "phantom" processor. A round-robin bus arbitration scheme allows ordered ownership of a common bus within a first cluster until the ID reaches the "phantom" processor, at which time bus ownership passes to a CPU in the second cluster. This arrangement is preferably symmetric, so that when a CPU from the first cluster requests ownership of the bus, it is granted bus ownership by virtue of the first cluster's appearance to the second cluster as a "phantom" CPU.
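
The "phantom" CPU handoff described in this abstract can be sketched in Python as follows. The model is deliberately simplified and is not the patented arbiter: it assumes exactly two clusters of equal size, assumes every CPU requests the bus on every turn, and uses the hypothetical name `bus_owners`.

```python
from itertools import islice


def bus_owners(cpus_per_cluster: int):
    """Yield (cluster, cpu_id) bus owners in round-robin order.

    Assumption: IDs 0..cpus_per_cluster-1 are real CPUs, and the extra ID
    cpus_per_cluster is the phantom that represents the opposite cluster.
    """
    phantom_id = cpus_per_cluster
    cluster = 0
    while True:
        for cpu_id in range(cpus_per_cluster + 1):
            if cpu_id == phantom_id:
                cluster = 1 - cluster  # reaching the phantom hands the bus over
            else:
                yield (cluster, cpu_id)


# First six grants with two CPUs per cluster.
grants = list(islice(bus_owners(2), 6))
```

The arbiter in each cluster needs only one extra ID to treat the whole opposite cluster as a single participant, and the symmetric handoff returns ownership to the first cluster in the same way.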

Type: Grant

Filed: July 9, 1997

Date of Patent: August 10, 1999

Assignee: Amiga Development, LLC

Inventors: Felix Pinai, Manhtien Phan
