Distributing Of Vector Data To Vector Registers Patents (Class 712/4)

Masking to control an access to data in vector register (Class 712/5)

Safety net paradigm for managing two computer execution modes

Patent number: 6789181

Abstract: A method and computer for executing the method. A source program is translated into an object program, in a manner in which the translated object program has a different execution behavior than the source program. The translated object program is executed under a monitor capable of detecting any deviation from fully-correct interpretation before any side-effect of the different execution behavior is irreversibly committed. When the monitor detects the deviation, or when an interrupt occurs during execution of the object program, a state of the program is established corresponding to a state that would have occurred during an execution of the source program, and from which execution can continue. Execution of the source program continues primarily in a hardware emulator designed to execute instructions of an instruction set non-native to the computer.

Type: Grant

Filed: November 3, 1999

Date of Patent: September 7, 2004

Assignee: ATI International, SRL

Inventors: John S. Yates, David L. Reese, Korbin S. Van Dyke, Paul H. Hohensee
Synchronous periodical orthogonal data converter

Publication number: 20040172517

Abstract: An orthogonal data converter for converting the components of a sequential vector component flow to a parallel vector component flow. The data converter has an input rotator configured to rotate corresponding vector components of the sequential vector component flow by a prescribed amount, and a bank of register files configured to store the rotated vector components. The converter also has an output rotator configured to rotate the position of the vector components read from the bank of register files by a prescribed amount. A controller of the converter is operative to control the addressing of the bank of register files and the rotating of the vector components. In this regard, the controller is operative to write the vector components to the bank of register files in a prescribed order and read the vector components in a prescribed order to generate the parallel vector component flow.

Type: Application

Filed: September 19, 2003

Publication date: September 2, 2004

Inventors: Boris Prokopenko, Timour Paltashev
Operand queues for streaming data: A processor register file extension

Patent number: 6782470

Abstract: The register file of a processor includes embedded operand queues. The configuration of the register file into registers and operand queues is defined dynamically by a computer program. The programmer determines the trade-off between the number and size of the operand queue(s) versus the number of registers used for the program. The programmer partitions a portion of the registers into one or more operand queues. A given queue occupies a consecutive set of registers, although multiple queues need not occupy consecutive registers. An additional address bit is included to distinguish operand queue addresses from register addresses. Queue state logic tracks status information for each queue, including a header pointer, tail pointer, start address, end address and number of vacancies value. The program sets the locations and depth of a given operand queue within the register file.

Type: Grant

Filed: November 6, 2000

Date of Patent: August 24, 2004

Assignee: University of Washington

Inventors: Stefan G. Berg, Michael S. Grow, Weiyun Sun, Donglok Kim, Yongmin Kim
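
The queue bookkeeping described in this abstract (a consecutive register range per queue, head and tail pointers, a vacancy count, and an extra address bit selecting queue versus register access) can be sketched roughly as follows. This is only an illustrative model: the class name `RegisterFile`, the method names, and the `is_queue` flag standing in for the extra address bit are assumptions, not taken from the patent.

```python
class RegisterFile:
    """Toy register file in which a register range can act as an operand queue."""

    def __init__(self, num_regs=32):
        self.regs = [0] * num_regs
        self.queues = {}  # queue id -> state (hypothetical layout)

    def define_queue(self, qid, start, end):
        # A queue occupies the consecutive registers start..end (inclusive).
        self.queues[qid] = {
            "start": start, "end": end,
            "head": start, "tail": start,
            "vacancies": end - start + 1,
        }

    @staticmethod
    def _advance(q, ptr):
        # Pointers wrap around inside the queue's own register range.
        return q["start"] if ptr == q["end"] else ptr + 1

    def write(self, is_queue, addr, value):
        # is_queue models the additional address bit from the abstract.
        if not is_queue:
            self.regs[addr] = value
            return
        q = self.queues[addr]
        if q["vacancies"] == 0:
            raise OverflowError("operand queue is full")
        self.regs[q["tail"]] = value
        q["tail"] = self._advance(q, q["tail"])
        q["vacancies"] -= 1

    def read(self, is_queue, addr):
        if not is_queue:
            return self.regs[addr]
        q = self.queues[addr]
        depth = q["end"] - q["start"] + 1
        if q["vacancies"] == depth:
            raise IndexError("operand queue is empty")
        value = self.regs[q["head"]]
        q["head"] = self._advance(q, q["head"])
        q["vacancies"] += 1
        return value
```

In this sketch a program would call `define_queue(0, 4, 6)` to turn registers 4 to 6 into a three-entry queue while the remaining registers stay directly addressable.
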
High availability synchronization architecture

Publication number: 20040153624

Abstract: A “high availability” system comprises one or more switches under the control of multiple control processors (“CPs”). One of the CPs is deemed to be “active,” while the other CP is kept in a “standby” mode. Each CP generally has the same software load including a fabric state synchronization (“FSS”) facility. The FSSs of each CP communicate with each other. The state information pertaining to an active “image” is continuously provided to a standby copy of the image (“standby image”). The CPs' FSSs perform the function of synchronizing the standby image to the active image. The state information generally includes configuration and operational parameters and other information regarding the active image. By keeping the standby image synchronized to the active image, the standby image can be rapidly transitioned to the active mode if the active image experiences a fault and continue where the previous active image left off.

Type: Application

Filed: October 29, 2002

Publication date: August 5, 2004

Applicant: Brocade Communications Systems, Inc.

Inventors: Bill J. Zhou, Richard L. Hammons
Method for fusing instructions in a vector processor

Publication number: 20040128485

Abstract: According to one embodiment, a microprocessor is described. The microprocessor includes a scalar processor and a vector processor. The vector processor fuses multiple instructions that are to be processed. The fused instructions enable a single source register to simultaneously transmit its data contents to multiple math units.

Type: Application

Filed: December 27, 2002

Publication date: July 1, 2004

Inventor: Scott R. Nelson
Information processing apparatus and method of controlling memory thereof

Publication number: 20040128472

Abstract: A vector information processing apparatus has a CPU comprising a plurality of asynchronously operating units, a main memory for storing data, and a main memory controller for controlling the writing of data in the main memory. The main memory controller has a VSC address buffer for holding a storage address in the main memory for each element designated by a vector scatter instruction. The main memory controller is arranged to inhibit the outputting of a writing permission signal for the main memory which is generated according to a writing request for writing an element having a smaller element number, which has the same storage address as the storage address and which has not been processed in a sequence of element numbers, of writing requests for writing elements in the main memory which are issued respectively from the asynchronously operating units according to a vector scatter instruction.

Type: Application

Filed: July 22, 2003

Publication date: July 1, 2004

Applicant: NEC CORPORATION

Inventor: Hisao Koyanagi
Method and data processing system for microprocessor communication using a processor interconnect in a multi-processor system

Publication number: 20040117510

Abstract: Processor communication registers (PCRs) contained in each processor within a multiprocessor system and interconnected by a specialized bus provide enhanced processor communication. Each PCR stores identical processor communication information that is useful in pipelined or parallel multi-processing. Each processor has exclusive rights to store to a sector within each PCR and has continuous access to read the contents of its own PCR. Each processor updates its exclusive sector within all of the PCRs utilizing communication over the specialized bus, instantly allowing all of the other processors to see the change within the PCR data, and bypassing the cache subsystem.

Type: Application

Filed: December 12, 2002

Publication date: June 17, 2004

Applicant: International Business Machines Corporation

Inventors: Ravi Kumar Arimilli, Robert Alan Cargnoni, Derek Edward Williams, Kenneth Lee Wright
Partitioned vector processing

Publication number: 20040117595

Abstract: A system and method for calculating memory addresses in a partitioned memory in a processing system having a processing unit, input and output units, a program sequencer and an external interface. An address calculator includes a set of storage elements, such as registers, and an arithmetic unit for calculating a memory address of a vector element dependent upon values stored in the storage elements and the address of a previous vector element. The storage elements hold STRIDE, SKIP and SPAN values and optionally a TYPE value, relating to the spacing between elements in the same partition, the spacing between elements in the consecutive partitions, the number of elements in a partition and the size of a vector element, respectively.

Type: Application

Filed: September 8, 2003

Publication date: June 17, 2004

Inventors: James M. Norris, Philip E. May, Kent D. Moat, Raymond B. Essick, Brian G. Lucas
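
One possible reading of the STRIDE, SKIP, SPAN and TYPE values in this abstract is sketched below: each address is derived from the previous one, stepping by STRIDE inside a partition and by SKIP when crossing into the next partition. The function name and the choice to measure STRIDE and SKIP in elements (scaled by the element size) are assumptions made for illustration.

```python
def partitioned_addresses(base, count, stride, skip, span, elem_size=1):
    """Return the addresses of `count` vector elements in a partitioned memory.

    stride    -- spacing between elements in the same partition (in elements)
    skip      -- spacing from the last element of one partition to the first
                 element of the next (in elements)
    span      -- number of elements in a partition
    elem_size -- size of one vector element in bytes (the TYPE value)
    """
    addresses = []
    addr = base
    for i in range(count):
        addresses.append(addr)
        # Each address depends only on the previous one plus a stored value.
        if (i + 1) % span == 0:
            addr += skip * elem_size    # step into the next partition
        else:
            addr += stride * elem_size  # stay inside the current partition
    return addresses
```

For example, `partitioned_addresses(0, 6, stride=1, skip=5, span=3, elem_size=4)` walks three consecutive 4-byte elements, jumps 20 bytes, and walks three more.
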
Methods and apparatuses to clear state for operation of a stack

Patent number: 6751725

Abstract: Methods and apparatuses to clear state for operation of a stack. According to one embodiment of the invention, a processor comprises a set of one or more storage areas and a decode unit. The set of one or more storage areas are to store a plurality of tags and a top of stack indication, where each of the plurality of tags is to indicate if a register is in an empty or non-empty state. The decode unit is to decode scalar floating point instructions and packed data instructions, where at least certain of said scalar floating point instructions specify registers in a stack referenced manner and at least certain of said packed data instructions specify registers in a non-stack referenced manner. In addition, the packed data instructions include an instruction to mark the end of blocks of the packed data instructions in programs. The processor also comprises circuitry to cause the plurality of tags to indicate the empty state responsive to execution of the instruction.

Type: Grant

Filed: February 16, 2001

Date of Patent: June 15, 2004

Assignee: Intel Corporation

Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
Vector register file with arbitrary vector addressing

Publication number: 20040103262

Abstract: A system and method for processing operations that use data vectors each comprising a plurality of data elements, in accordance with the present invention, includes a vector data file comprising a plurality of storage elements for storing data elements of the data vectors. A pointer array is coupled by a bus to the vector data file. The pointer array includes a plurality of entries wherein each entry identifies at least one storage element in the vector data file. The at least one storage element stores at least one data element of the data vectors, wherein for at least one particular entry in the pointer array, the at least one storage element identified by the particular entry has an arbitrary starting address in the vector data file.

Type: Application

Filed: November 15, 2003

Publication date: May 27, 2004

Applicant: International Business Machines Corporation

Inventors: Clair John Glossner, Erdem Hokenek, David Meltzer, Mayan Moudgill
Vector transfer system generating address error exception when vector to be transferred does not start and end on same memory page

Patent number: 6742106

Abstract: A vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. Vector data transfer instructions are posted to an instruction queue in the vector transfer unit. Program instructions for performing a burst transfer include determining the starting address of the vector data to be transferred, the ending address of the vector data to be transferred, and whether the ending address of the vector data to be transferred is within the same virtual memory page as the starting address. The ending address of the vector data to be transferred is determined based on the number of data elements to be transferred, the stride of the vector data to be transferred, and the width of the vector data elements to be transferred. When the amount of data to be transferred is divisible by a factor of two, the multiplication of the stride and width of the data elements is carried out by shifting.

Type: Grant

Filed: January 28, 2003

Date of Patent: May 25, 2004

Assignee: NEC Electronics, Inc.

Inventor: Ahmad R. Ansari
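
The end-address and same-page test described in this abstract can be illustrated with a short sketch. The 4 KiB page size, the function names, the use of `ValueError` to stand in for the address error exception, and the decision to measure the stride in elements are all assumptions; the shift is applied here when the element width is a power of two, which is one way to read the abstract's remark about shifting.

```python
PAGE_SIZE = 4096  # assumed page size, not stated in the abstract


def last_byte_address(start, count, stride, width):
    """Address of the final byte touched by a strided vector transfer."""
    if width & (width - 1) == 0:
        # Power-of-two width: replace the multiply with a left shift.
        step = stride << (width.bit_length() - 1)
    else:
        step = stride * width
    return start + (count - 1) * step + width - 1


def same_page(start, count, stride, width, page_size=PAGE_SIZE):
    """True if the transfer starts and ends within one memory page."""
    end = last_byte_address(start, count, stride, width)
    return start // page_size == end // page_size


def check_transfer(start, count, stride, width):
    """Raise an error (modelling the address error exception) on a page crossing."""
    if not same_page(start, count, stride, width):
        raise ValueError("vector transfer crosses a page boundary")
```
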
System and method for processing vectorized data

Patent number: 6728874

Abstract: A method and system for correctly processing both big endian and little endian vector data. If the vector has a little endian data order, each piece of data (such as a byte) within the vector is processed in order. If the vector has a big endian data order, each vector element is processed in order, but each piece of data within each vector element is processed in reverse order.

Type: Grant

Filed: October 10, 2000

Date of Patent: April 27, 2004

Assignee: Koninklijke Philips Electronics N.V.

Inventors: Frans W. Sijstermans, Evert-Jan D. Pol
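
The processing order described in this abstract can be shown with a small sketch that lists the byte offsets in the order they would be visited. The function name and the flat byte-offset representation are illustrative assumptions.

```python
def byte_processing_order(num_elements, elem_size, big_endian):
    """List byte offsets of a vector in the order they are processed.

    Little endian: every byte is visited in order.
    Big endian: elements are visited in order, but the bytes inside each
    element are visited in reverse.
    """
    order = []
    for element in range(num_elements):
        offsets = range(elem_size)
        if big_endian:
            offsets = reversed(offsets)
        order.extend(element * elem_size + offset for offset in offsets)
    return order
```

For a vector of two 4-byte elements, the little-endian order is simply offsets 0 through 7, while the big-endian order visits 3, 2, 1, 0 and then 7, 6, 5, 4.
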
Method and apparatus for efficient loading and storing of vectors

Patent number: 6701424

Abstract: A method and apparatus for loading and storing vectors from and to memory, including embedding a location identifier in bits comprising a vector load and store instruction, wherein the location identifier indicates a location in the vector where useful data ends. The vector load instruction further includes a value field that indicates a particular constant for use by the load/store unit to set locations in the vector register beyond the useful data with the constant. By embedding the ending location of the useful data in the instruction, bandwidth and memory are saved by only requiring that the useful data in the vector be loaded and stored.

Type: Grant

Filed: April 7, 2000

Date of Patent: March 2, 2004

Assignee: Nintendo Co., Ltd.

Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng
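
A rough model of the partial load described in this abstract: only the useful elements are fetched from memory and the rest of the vector register is filled with a constant carried by the instruction. The function and parameter names are hypothetical, and treating the location identifier as a count of useful elements is an assumption.

```python
def load_partial_vector(memory, address, vector_length, useful_count, fill_value):
    """Load `useful_count` elements from memory and pad the rest of the register.

    useful_count -- models the location identifier embedded in the instruction
    fill_value   -- models the constant from the instruction's value field
    """
    if not 0 <= useful_count <= vector_length:
        raise ValueError("useful_count must lie within the vector length")
    # Only the useful elements are read, which is where bandwidth is saved.
    useful = list(memory[address:address + useful_count])
    return useful + [fill_value] * (vector_length - useful_count)


def store_partial_vector(memory, address, register, useful_count):
    """Store only the useful leading elements of a vector register."""
    memory[address:address + useful_count] = register[:useful_count]
```
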
Method and apparatus for bit vector array

Patent number: 6681315

Abstract: A bit vector array apparatus provides a high speed method for processing network transmission controls. Complex data structures for controlling network access are represented in the simplest possible form as single bit vector elements. The bit vector elements are combined into bit vectors comprised of 32 single bit vector elements. The bit vectors are processed in parallel in the bit vector array apparatus, which is comprised of special-purpose bit manipulation functions to expedite the processing.

Type: Grant

Filed: November 26, 1997

Date of Patent: January 20, 2004

Assignee: International Business Machines Corporation

Inventors: Paul John Hilts, Brian Alan Youngman
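
The idea of packing 32 single-bit elements into one bit vector and processing them together can be illustrated in software with integer bit operations. This is only an analogy to the hardware in the abstract; the function names and the permission-mask example are assumptions.

```python
BITS_PER_VECTOR = 32


def pack_bits(flags):
    """Pack 32 boolean elements into one 32-bit bit vector (bit i = flags[i])."""
    if len(flags) != BITS_PER_VECTOR:
        raise ValueError("a bit vector holds exactly 32 single-bit elements")
    word = 0
    for i, flag in enumerate(flags):
        if flag:
            word |= 1 << i
    return word


def unpack_bits(word):
    """Expand a 32-bit bit vector back into a list of booleans."""
    return [bool((word >> i) & 1) for i in range(BITS_PER_VECTOR)]


def granted(requests, allowed):
    """Evaluate all 32 request/permission pairs with a single AND."""
    return requests & allowed
```
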
Viterbi decoding for SIMD vector processors with indirect vector element access

Publication number: 20040006681

Abstract: A configuration of vector units, digital circuitry and associated instructions is disclosed for the parallel processing of multiple Viterbi decoder butterflies on a programmable digital signal processor (DSP) that is based on single-instruction-multiple-data (SIMD) principles and provides indirect access to vector elements. The disclosed configuration uses a processor with two vector units and associated registers, where the vector units are connected back to back for processing Viterbi decoder state metrics. Viterbi add instructions increment vectors of state metrics from a first register, performing a desired permutation of state metrics while reading them indirectly through vector pointers, and writing intermediate result vectors to a second register.

Type: Application

Filed: September 13, 2002

Publication date: January 8, 2004

Inventors: Jaime Humberto Moreno, Fredy Daniel Neeser
Interconnection device with integrated storage

Publication number: 20040003200

Abstract: An interconnection device (300) with a number of links (306, 308, 310, 312 and 314), each link having a number of link input ports (302), link output ports (304) and storage registers (316). An input selection switch (402) is coupled to a selected link input port to receive an input data token. The storage registers (316) may be used to store input data tokens. A storage access switch (404) is coupled to the input selection switch (402) and to the storage registers (316) and may be used to select the current input data token or a token from the storage registers as an output data token. An output selection switch (406) receives the output data token and provides it to a selected link output port. The interconnection device may, for example, be used to connect the inputs and outputs of the processing elements of a vector processor or digital signal processor.

Type: Application

Filed: June 28, 2002

Publication date: January 1, 2004

Inventors: Philip E. May, Kent Donald Moat, Raymond B. Essick, Silviu Chiricescu, Brian Geoffrey Lucas, James M. Norris, Michael Allen Schuette, Ali Saidi
Vector and scalar data cache for a vector multiprocessor

Patent number: 6665774

Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operating on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a.) an address in the cache array to which a read-data value will be placed when the read-data value is returned from the memory, and (b.

Type: Grant

Filed: October 16, 2001

Date of Patent: December 16, 2003

Assignee: Cray, Inc.

Inventors: Gregory J. Faanes, Eric P. Lundberg
Vector register file with arbitrary vector addressing

Patent number: 6665790

Abstract: A system and method for processing operations that use data vectors each comprising a plurality of data elements, in accordance with the present invention, includes a vector data file comprising a plurality of storage elements for storing data elements of the data vectors. A pointer array is coupled by a bus to the vector data file. The pointer array includes a plurality of entries wherein each entry identifies at least one storage element in the vector data file. The at least one storage element stores at least one data element of the data vectors, wherein for at least one particular entry in the pointer array, the at least one storage element identified by the particular entry has an arbitrary starting address in the vector data file.

Type: Grant

Filed: February 29, 2000

Date of Patent: December 16, 2003

Assignee: International Business Machines Corporation

Inventors: Clair John Glossner, III, Erdem Hokenek, David Meltzer, Mayan Moudgill
Configurable stream processor apparatus and methods

Publication number: 20030221086

Abstract: Data processing apparatus and methods capable of executing vector instructions. Such apparatus preferably include a number of data buffers whose sizes are configurable in hardware and/or in software; a number of buffer control units adapted to control access to the data buffers, at least one buffer control unit including at least one programmable write pointer register, read pointer register, read stride register and vector length register; a number of execution units for executing vector instructions using input operands stored in data buffers and storing produced results to data buffers; and at least one Direct Memory Access channel transferring data to and from said buffers. Preferably, at least some of the data buffers are implemented in dual-ported fashion in order to allow at least two simultaneous accesses per buffer, including at least one read access and one write access.

Type: Application

Filed: February 13, 2003

Publication date: November 27, 2003

Inventors: Slobodan A. Simovich, Ivan P. Radivojevic, Erik Ramberg
System for posting vector synchronization instructions to vector instruction queue to separate vector instructions from different application programs

Patent number: 6625720

Abstract: A vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. Vector instructions are used for transferring the vector data between memory and registers used to perform calculations on the vector data. The transfers of portions of the vector data required in a calculation are scheduled so that calculations on a portion of the vector data are performed while a subsequent portion of the vector data is transferred. A vector buffer pool is partitioned into one or more vector buffers based on configuration information including the number of vector buffers required by an application program and the size required for each vector buffer. The vector buffers are allocated for exclusive use by an application program that is executing in the data processor. Vector data transfer instructions are posted in a vector transfer instruction queue and are executed in the order they are posted to the instruction queue.

Type: Grant

Filed: August 17, 1999

Date of Patent: September 23, 2003

Assignee: NEC Electronics, Inc.

Inventor: Ahmad R. Ansari
Vector transfer system generating address error exception when vector to be transferred does not start and end on same memory page

Publication number: 20030167387

Abstract: A vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. Vector data transfer instructions are posted to an instruction queue in the vector transfer unit. Program instructions for performing a burst transfer include determining the starting address of the vector data to be transferred, the ending address of the vector data to be transferred, and whether the ending address of the vector data to be transferred is within the same virtual memory page as the starting address. The ending address of the vector data to be transferred is determined based on the number of data elements to be transferred, the stride of the vector data to be transferred, and the width of the vector data elements to be transferred. When the amount of data to be transferred is divisible by a factor of two, the multiplication of the stride and width of the data elements is carried out by shifting.

Type: Application

Filed: January 28, 2003

Publication date: September 4, 2003

Applicant: NEC Electronics, Inc.

Inventor: Ahmad R. Ansari
Data access in a processor

Publication number: 20030159016

Abstract: A data processor comprising: a register memory comprising an array of memory cells extending in two dimensions, the cells being located on rows in the first dimension and columns in the second dimension, each cell being addressable by means of an instruction specifying a pair of coordinates that identify the row and column of the cell in the array; and a processing unit capable of executing instructions that operate on a plurality of memory cells in the register, the instructions identifying the plurality of cells by means of a first instruction part specifying a pair of coordinates that identify a first cell in the array, and a second instruction part that identifies the configuration of the plurality of cells relative to the first cell; the data processor being arranged to interpret a first form of second instruction part as specifying a first group of cells all of which are located in the same row but in different columns, and to interpret a second form of second instruction part as specifying a first grou

Type: Application

Filed: October 31, 2002

Publication date: August 21, 2003

Applicant: ALPHAMOSAIC LIMITED

Inventors: Stephen Barlow, Neil Bailey, Timothy Ramsdale, David Plowman, Robert Swann
Data access in a processor

Publication number: 20030159017

Abstract: A data processor comprising: a first processor unit having a first register memory addressable in a first format; a second processor unit having a second register memory addressable in a second format, and being capable of retrieving data from the first processor unit; the second processor unit being capable of executing an instruction including an operand specified by means of a reference to data in the first register memory.

Type: Application

Filed: October 31, 2002

Publication date: August 21, 2003

Applicant: ALPHAMOSAIC LIMITED

Inventors: Stephen Barlow, Neil Bailey, Timothy Ramsdale, David Plowman, Robert Swann
Method and apparatus for obtaining a scalar value directly from a vector register

Patent number: 6571328

Abstract: A method and apparatus for obtaining a scalar value from a vector register for use in a mixed vector and scalar instruction, including providing a vector in a vector register file, and embedding a location identifier of the scalar value within the vector in the bits defining the mixed vector and scalar instruction. The scalar value can be used directly from the vector register without the need to load the scalar to a scalar register prior to executing the instruction. The scalar location identifier may be embedded in the secondary op code of the instruction, or the instruction may have dedicated bits for providing the location of the scalar within the vector.

Type: Grant

Filed: August 1, 2001

Date of Patent: May 27, 2003

Assignee: Nintendo Co., Ltd.

Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng, Timothy J. Van Hook
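
The mixed vector and scalar operation in this abstract can be pictured as an instruction that carries the index of the scalar inside a vector register, so that no separate scalar load is needed. The sketch below uses a multiply as the example operation; the function name and the choice of operation are assumptions.

```python
def vector_times_element(vec_a, vec_b, scalar_index):
    """Multiply every element of vec_a by one element taken from vec_b.

    scalar_index models the location identifier embedded in the instruction:
    it selects the scalar directly from the vector register vec_b.
    """
    scalar = vec_b[scalar_index]  # read in place, no copy to a scalar register
    return [a * scalar for a in vec_a]
```
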
Data processor changing an alignment of loaded data

Patent number: 6553474

Abstract: A data processor in which a read operation, including misaligned data as operand data, can be performed in a single cycle. An alignment buffer having a register to hold data stored at one address in data memory is provided between the data memory and a data path unit. The alignment buffer outputs misaligned data by selecting misaligned data from data held in the register and data read from the data memory. The data held in the register is updated as word-aligned data is read out.

Type: Grant

Filed: January 24, 2001

Date of Patent: April 22, 2003

Assignee: Mitsubishi Denki Kabushiki Kaisha

Inventors: Hironobu Ito, Hisakazu Sato
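
A software model of the alignment buffer may help make the abstract concrete: a register keeps the most recently read aligned word, so a misaligned read in a sequential stream needs only one new memory access. The 4-byte word size, the class and attribute names, and the access counter are assumptions added for illustration.

```python
WORD = 4  # assumed word size in bytes


class AlignmentBuffer:
    """Toy alignment buffer sitting between data memory and the data path."""

    def __init__(self, memory):
        self.memory = memory
        self.held = b""        # word currently held in the register
        self.held_addr = None  # aligned address of the held word
        self.accesses = 0      # number of memory reads, for illustration

    def _read_word(self, word_addr):
        self.accesses += 1
        return self.memory[word_addr:word_addr + WORD]

    def read(self, addr):
        """Return WORD bytes starting at addr, which may be misaligned."""
        offset = addr % WORD
        base = addr - offset
        if offset == 0:
            self.held, self.held_addr = self._read_word(base), base
            return self.held
        if self.held_addr != base:
            # Cold start: the register must first be primed with the low word.
            self.held, self.held_addr = self._read_word(base), base
        upper = self._read_word(base + WORD)
        # Select bytes from the held word and from the newly read word.
        result = self.held[offset:] + upper[:offset]
        # The register is updated as the word-aligned data is read out.
        self.held, self.held_addr = upper, base + WORD
        return result
```

In this sketch the first misaligned read costs two memory accesses, and each following read at the next word offset costs one.
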
Automatic instruction set architecture generation

Publication number: 20030074654

Abstract: A digital computer system automatically creates an Instruction Set Architecture (ISA) that potentially exploits VLIW instructions, vector operations, fused operations, and specialized operations with the goal of increasing the performance of a set of applications while keeping hardware cost below a designer specified limit, or with the goal of minimizing hardware cost given a required level of performance.

Type: Application

Filed: October 16, 2001

Publication date: April 17, 2003

Inventors: David William Goodwin, Dror Maydan, Ding-Kai Chen, Darin Stamenov Petkov, Steven Weng-Kiang Tjiang, Peng Tu, Christopher Rowen
System and method for controlling free space distribution by key range within a database

Publication number: 20030056082

Abstract: An improved method and system for controlling free space distribution by key range within a database. In one embodiment, a data structure including key ranges of a plurality of database tables and indexes, and a plurality of key range free space parameters is created. The plurality of database tables and indexes may include a plurality of page sets, which may include rows of data and keys. Time values may be associated with the plurality of free space parameters. The key range free space parameters may have values assigned to them. The key range free space parameters may be user-defined or automatically generated using growth trend analysis, based on key range growth statistics. The rows of data and keys within the plurality of page sets may be redistributed by a reorganization process. The redistributing may reference the key ranges of the data structure and the key range free space parameters.

Type: Application

Filed: December 27, 2001

Publication date: March 20, 2003

Inventor: John D. Maxfield
Method and apparatus for vector register with scalar values

Patent number: 6530011

Abstract: A method and an apparatus for implementing mixed scalar and vector values in a digital processing system. In one embodiment, a digital processing system, which contains a processing unit and memories, is capable of identifying a first data in a first scalar register and a second data in a vector register. Upon fetching the first data as a first operand and the second data as a second operand, the processing unit performs an operation between the first and second operands in response to an operator. After the operation, the result is subsequently stored in a second scalar register.

Type: Grant

Filed: October 20, 1999

Date of Patent: March 4, 2003

Assignee: SandCraft, Inc.

Inventor: Jack H. Choquette
Vector transfer system generating address error exception when vector to be transferred does not start and end on same memory page

Patent number: 6513107

Abstract: A vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. Vector data transfer instructions are posted to an instruction queue in the vector transfer unit. Program instructions for performing a burst transfer include determining the starting address of the vector data to be transferred, the ending address of the vector data to be transferred, and whether the ending address of the vector data to be transferred is within the same virtual memory page as the starting address. The ending address of the vector data to be transferred is determined based on the number of data elements to be transferred, the stride of the vector data to be transferred, and the width of the vector data elements to be transferred. When the amount of data to be transferred is divisible by a factor of two, the multiplication of the stride and width of the data elements is carried out by shifting.

Type: Grant

Filed: August 17, 1999

Date of Patent: January 28, 2003

Assignee: NEC Electronics, Inc.

Inventor: Ahmad R. Ansari
Vector and scalar data cache for a vector multiprocessor

Patent number: 6496902

Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operating on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a.) an address in the cache array to which a read-data value will be placed when the read-data value is returned from the memory, and (b.

Type: Grant

Filed: December 31, 1998

Date of Patent: December 17, 2002

Assignee: Cray Inc.

Inventors: Gregory J. Faanes, Eric P. Lundberg
Vector and scalar data cache for a vector multiprocessor

Publication number: 20020144061

Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operating on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a.) an address in the cache array to which a read-data value will be placed when the read-data value is returned from the memory, and (b.

Type: Application

Filed: October 16, 2001

Publication date: October 3, 2002

Applicant: Cray Inc.

Inventors: Gregory J. Faanes, Eric P. Lundberg
Method and apparatus for single cycle processing of data associated with separate accumulators in a dual multiply-accumulate architecture

Patent number: 6446193

Abstract: A method and apparatus for reducing instruction cycles in a digital signal processor wherein the processor includes a multiplier unit, an adder, a memory, and at least one pair of first and second accumulators. The accumulators include respective guard, high and low parts. The method and apparatus enable vectoring the respective first and second high parts from the accumulators to define a single vectored register responsive to a single instruction cycle and processing the data in the vectored register.

Type: Grant

Filed: September 8, 1997

Date of Patent: September 3, 2002

Assignee: Agere Systems Guardian Corp.

Inventors: Mazhar M. Alidina, Sivanand Simanapalli, Larry R. Tate
Execution unit for processing a data stream independently and in parallel

Patent number: 6401194

Abstract: A vector processor provides a data path divided into smaller slices of data, with each slice processed in parallel with the other slices. Furthermore, an execution unit provides smaller arithmetic and functional units chained together to execute more complex microprocessor instructions requiring multiple cycles by sharing single-cycle operations, thereby reducing both costs and size of the microprocessor. One embodiment handles 288-bit data widths using 36-bit data path slices. Another embodiment executes integer multiply and multiply-and-accumulate and floating point add/subtract and multiply operations using single-cycle arithmetic logic units. Other embodiments support 8-bit, 9-bit, 16-bit, and 32-bit integer data types and 32-bit floating data types.

Type: Grant

Filed: January 28, 1997

Date of Patent: June 4, 2002

Assignee: Samsung Electronics Co., Ltd.

Inventors: Le Trong Nguyen, Heonchul Park, Roney S. Wong, Ted Nguyen, Edward H. Yu
Method and apparatus for obtaining a scalar value directly from a vector register

Publication number: 20020032848

Abstract: A method and apparatus for obtaining a scalar value from a vector register for use in a mixed vector and scalar instruction, including providing a vector in a vector register file, and embedding a location identifier of the scalar value within the vector in the bits defining the mixed vector and scalar instruction. The scalar value can be used directly from the vector register without the need to load the scalar to a scalar register prior to executing the instruction. The scalar location identifier may be embedded in the secondary op code of the instruction, or the instruction may have dedicated bits for providing the location of the scalar within the vector.

Type: Application

Filed: August 1, 2001

Publication date: March 14, 2002

Applicant: Nintendo Co., Ltd.

Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng, Timothy J. Van Hook
Processing architecture having a matrix-transpose capability

Publication number: 20020032710

Abstract: According to the invention, a matrix of elements is processed in a processor. A first subset of matrix elements is loaded from a first location and a second subset of matrix elements is loaded from a second location. A third subset of matrix elements is stored in a first destination and a fourth subset of matrix elements is stored in a second destination. The loading and storing steps result from the same instruction issue.

Type: Application

Filed: March 8, 2001

Publication date: March 14, 2002

Inventors: Ashley Saulsbury, Daniel S. Rice, Michael W. Parkin, Nyles Nettleton
Method and apparatus for efficient loading and storing of vectors

Publication number: 20020026569

Abstract: A method and apparatus for loading and storing vectors from and to memory, including embedding a location identifier in bits comprising a vector load and store instruction, wherein the location identifier indicates a location in the vector where useful data ends. The vector load instruction further includes a value field that indicates a particular constant for use by the load/store unit to set locations in the vector register beyond the useful data with the constant. By embedding the ending location of the useful data in the instruction, bandwidth and memory are saved by only requiring that the useful data in the vector be loaded and stored.

Type: Application

Filed: August 1, 2001

Publication date: February 28, 2002

Applicant: Nintendo Co., Ltd.

Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng
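The load/store behavior described in the abstract of publication 20020026569 can be illustrated with a small software model. This is only a sketch under stated assumptions: the function names, the 8-lane register width, and the reading of the location identifier as a count of useful elements are hypothetical and are not taken from the application.

```python
def vector_load(memory, base, end, fill, width=8):
    """Model a partial vector load.

    Only the first `end` elements are read from memory; the remaining
    lanes of the register are set to the constant `fill`.
    `end` is assumed here to be a count of useful elements.
    """
    if not 0 <= end <= width:
        raise ValueError("end must lie within the vector register")
    if base + end > len(memory):
        raise ValueError("load would run past the end of memory")
    useful = list(memory[base:base + end])
    return useful + [fill] * (width - end)


def vector_store(memory, base, vreg, end):
    """Model a partial vector store: write back only the useful lanes."""
    if not 0 <= end <= len(vreg):
        raise ValueError("end must lie within the vector register")
    if base + end > len(memory):
        raise ValueError("store would run past the end of memory")
    memory[base:base + end] = vreg[:end]
```

In this model, a load with `end=3` touches three memory words instead of eight, which is the bandwidth saving the abstract refers to.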
Vector scatter instruction control circuit and vector architecture information processing equipment

Publication number: 20020007449

Abstract: A vector architecture processing unit according to the present invention comprises a vector scatter (VSC) address coincidence detection unit 3 that comprises registers in which an area start address and an area end address of an area specified by an area-specified vector scatter instruction are stored; and a circuit that checks if the addresses specified by the area-specified vector scatter instruction overlap with an address to be accessed by a memory access instruction following the area-specified vector scatter instruction, wherein an instruction issue control unit 1 comprises a hold control circuit that holds the following memory access instruction in response to an address conflict signal from the VSC address conflict detector.

Type: Application

Filed: July 10, 2001

Publication date: January 17, 2002

Applicant: NEC CORPORATION

Inventor: Hisao Koyanagi
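The overlap check described in the abstract of publication 20020007449 amounts to a range comparison between the scatter area and the following memory access. The sketch below models that check in software. The class and function names, the inclusive end address, and the byte-length parameter are assumptions made for illustration, not details from the application.

```python
from dataclasses import dataclass


@dataclass
class ScatterArea:
    """Area registers of a pending area-specified vector scatter."""
    start: int  # area start address
    end: int    # area end address, assumed inclusive


def conflicts(area, addr, length=1):
    """Return True if [addr, addr + length - 1] overlaps the scatter area."""
    last = addr + length - 1
    return addr <= area.end and last >= area.start


def issue_decision(area, addr, length=1):
    """Hold the following memory access while it overlaps a pending scatter."""
    if area is not None and conflicts(area, addr, length):
        return "hold"
    return "issue"
```

Accesses outside the registered area are issued without waiting, so only the accesses that could observe stale data are delayed.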
Method and apparatus for generating an alignment control vector

Patent number: 6334176

Abstract: The data processing system loads three input operands, including two input vectors and a control vector, into vector registers and performs a permutation of the two input vectors as specified by the control vector, and further stores the result of the operation as the output operand in an output register. The control vector consists of sixteen indices, each uniquely identifying a single byte of input data in either of the input registers, and can be specified in the operational code or be the result of a computation previously performed within the vector registers. The control vector is specified by calculating the offset of a selected vector element of the input vector relative to a base address of the input vector and loading each element with an index equal to the relative offset. Alternatively, the generation of the alignment vector is made by performing a look-up within a look-up table.

Type: Grant

Filed: April 17, 1998

Date of Patent: December 25, 2001

Assignees: Motorola, Inc., International Business Machines Corporation, Apple Computer, Inc.

Inventors: Hunter Ledbetter Scales, III, Keith Everett Diefendorff, Brett Olsson, Pradeep Kumar Dubey, Ronald Ray Hochsprung
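The permutation and the alignment control vector described in the abstract of patent 6334176 can be modelled on byte lists. This is a minimal sketch, not the patented circuit: `vperm` and `alignment_control` are hypothetical names, and the rule that element i of the control vector holds the offset plus i is one reading of the abstract, assumed here.

```python
VECTOR_BYTES = 16


def vperm(a, b, control):
    """Select 16 bytes from the 32-byte concatenation of a and b.

    Each control entry is an index 0..31 naming one byte of input data.
    """
    if len(a) != VECTOR_BYTES or len(b) != VECTOR_BYTES:
        raise ValueError("input vectors must hold 16 bytes each")
    source = list(a) + list(b)
    return [source[index & 0x1F] for index in control]


def alignment_control(address):
    """Build a control vector from an address's offset within a 16-byte block.

    Assumption: element i holds offset + i, so the permute extracts
    16 consecutive bytes starting at the unaligned address.
    """
    offset = address % VECTOR_BYTES
    return [offset + i for i in range(VECTOR_BYTES)]
```

With two aligned 16-byte loads that straddle an unaligned address, `vperm(a, b, alignment_control(address))` yields the 16 bytes that start at that address.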
Apparatus and method for processing data having a mixed vector/scalar register file

Patent number: 6282634

Abstract: A floating point unit is provided with a register bank comprising 32 registers that may be used as either vector registers or scalar registers. A data processing instruction includes at least one register specifying field pointing to a register containing a data value to be used in that operation. An increase in the instruction bit space available to encode more opcodes or to allow for more registers is provided by encoding whether a register is to be treated as a vector or a scalar within the register field itself. Further, the register field for one register of the instruction may encode whether another register is a vector or a scalar. The registers can be initially accessed using the values within the register fields of the instruction independently of the opcode allowing for easier decode.

Type: Grant

Filed: May 27, 1998

Date of Patent: August 28, 2001

Assignee: ARM Limited

Inventors: Christopher Neal Hinds, David Vivian Jaggar, David Terrence Matheny, David James Seal
MEMORY STRUCTURE

Publication number: 20010014930

Abstract: The invention relates to a new memory structure specially adapted for the storage of memory vectors. Each of the storage positions (#1, Mi-#M, Mi) of the memory has a length adapted to the length of large vectors and is parallelly arranged extending from an input and/or output for information and deeper into the memory. In this way each vector is stored undivided in a sequential order with the beginning of the vector at the input and/or output of the memory (memory field F1 in memory plane Mi). Addressing is made to the input and/or output of the memory. There are means (1IB-MIB, 1UB-MUB) acting like shift registers for the inputting and outputting of information in undivided sequence to/from the storage positions in the memory.

Type: Application

Filed: December 8, 1997

Publication date: August 16, 2001

Inventor: INGEMAR SODERQUIST
Alignment and ordering of vector elements for single instruction multiple data processing

Patent number: 6266758

Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register.

Type: Grant

Filed: March 5, 1999

Date of Patent: July 24, 2001

Assignee: MIPS Technologies, Inc.

Inventors: Timothy J. van Hook, Peter Hsu, William A. Huffman, Henry P. Moreton, Earl A. Killian
Parallel processing method and system using a lazy parallel data type to reduce inter-processor communication

Patent number: 6212617

Abstract: A parallel programming system provides a lazy collection oriented data type that reduces inter-processor communication in programs executed on parallel computers. The lazy collection oriented data type is provided as a data type in a parallel programming language. The parallel language supports both data-parallel and control-parallel operations. These operations take advantage of the lazy collection oriented data type to defer or reduce inter-processor communication until an operation on the data type requires that it be balanced across a set of processors.

Type: Grant

Filed: June 30, 1998

Date of Patent: April 3, 2001

Assignee: Microsoft Corporation

Inventor: Jonathan C. Hardwick
Recirculating register file

Patent number: 6189094

Abstract: A floating point unit having a register bank containing a plurality of registers supports vector operations that execute a specified operation a plurality of times upon a sequence of data values from different registers. The register bank is divided into subsets, with the sequence of registers used in a vector operation wrapping within a subset. The subsets comprise disjoint, contiguous ranges of register numbers. The wrapping within ranges allows compact and efficient code to be provided for performing DSP operations, such as FIR filtering and matrix transformations.

Type: Grant

Filed: May 27, 1998

Date of Patent: February 13, 2001

Assignee: Arm Limited

Inventors: Christopher Neal Hinds, David Vivian Jaggar, David Terrence Matheny, David James Seal
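The wrapping behavior described in the abstract of patent 6189094 can be shown with a short register-numbering model. The sketch assumes subsets of 8 consecutive registers and a unit step between elements; the function name and both of those parameters are illustrative assumptions rather than details from the patent.

```python
def wrapped_registers(start, length, subset_size=8):
    """List the registers touched by a vector operation.

    The sequence begins at `start` and wraps around inside the subset
    (a disjoint, contiguous range of register numbers) containing it.
    """
    if length < 0 or subset_size <= 0:
        raise ValueError("length must be >= 0 and subset_size > 0")
    subset_base = (start // subset_size) * subset_size
    first = start - subset_base
    return [subset_base + (first + i) % subset_size for i in range(length)]
```

For example, a 4-element operation starting at register 6 uses registers 6, 7, 0, 1 under these assumptions. A FIR loop can therefore step its starting register through a subset that serves as a circular buffer of samples.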
Load and store instructions which perform unpacking and packing of data bits in separate vector and integer cache storage

Patent number: 6173366

Abstract: A multimedia extension unit (MEU) is provided for performing various multimedia-type operations. The MEU can be coupled either through a coprocessor bus or a local CPU bus to a conventional processor. The MEU employs vector registers, a vector ALU, and an operand routing unit (ORU) to perform a maximum number of the multimedia operations within as few instruction cycles as possible. Complex algorithms are readily performed by arranging operands upon the vector ALU in accordance with the desired algorithm flowgraph. The ORU aligns the operands within partitioned slots or sub-slots of the vector registers using vector instructions unique to the MEU. At the output of the ORU, operand pairs from vector source or destination registers can be easily routed and combined at the vector ALU. The vector instructions employ special load/store instructions in combination with numerous operational instructions to carry out concurrent multimedia operations on the aligned operands.

Type: Grant

Filed: December 2, 1996

Date of Patent: January 9, 2001

Assignees: Compaq Computer Corp., Advanced Micro Devices, Inc.

Inventors: John S. Thayer, John G. Favor, Frederick D. Weber
Microprocessor modified to perform inverse discrete cosine transform operations on a one-dimensional matrix of numbers within a minimal number of instructions

Patent number: 6141673

Abstract: A multimedia extension unit (MEU) is provided for performing various multimedia-type operations. The MEU can be coupled either through a coprocessor bus or a local central processing unit (CPU) bus to a conventional processor. The MEU employs vector registers, a vector arithmetic logic unit (ALU), and an operand routing unit (ORU) to perform a maximum number of the multimedia operations within as few instruction cycles as possible. Complex algorithms are readily performed by arranging operands upon the vector ALU in accordance with the desired algorithm flowgraph. The ORU aligns the operands within partitioned slots or sub-slots of the vector registers using vector instructions unique to the MEU. At the output of the ORU, operand pairs from vector source or destination registers can be easily routed and combined at the vector ALU.

Type: Grant

Filed: May 25, 1999

Date of Patent: October 31, 2000

Assignees: Advanced Micro Devices, Inc., Compaq Computer Corp.

Inventors: John S. Thayer, John Gregory Favor, Frederick D. Weber
Vector shift functional unit for successively shifting operands stored in a vector register by corresponding shift counts stored in another vector register

Patent number: 6098162

Abstract: Vector shifting elements of a vector register by varying amounts in a single process is achieved in a vector supercomputer processor. A first vector register contains a set of operands, and a second vector register contains a set of shift counts, one shift count for each operand. Operands and shift counts are successively transferred to a vector shift functional unit, which shifts the operand by an amount equal to the value of the shift count. The shifted operands are stored in a third vector register. The vector shift functional unit also achieves word shifting of a predetermined number of vector register elements to different word locations of another vector register.

Type: Grant

Filed: August 24, 1998

Date of Patent: August 1, 2000

Assignee: Cray Research, Inc.

Inventors: Alan J. Schiffleger, Ram K. Gupta, Christopher C. Hsiung
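The per-element shift described in the abstract of patent 6098162 pairs each operand with its own shift count. The sketch below models a left shift only. The function name, the 64-bit word size, and the choice of shift direction are assumptions for illustration.

```python
def vector_shift_left(operands, counts, word_bits=64):
    """Shift each operand left by the matching shift count.

    Results are truncated to `word_bits`, as a fixed-width register would,
    and returned as a new list standing in for the third vector register.
    """
    if len(operands) != len(counts):
        raise ValueError("need one shift count per operand")
    if any(count < 0 for count in counts):
        raise ValueError("shift counts must be non-negative")
    mask = (1 << word_bits) - 1
    return [(operand << count) & mask
            for operand, count in zip(operands, counts)]
```

A scalar shift instruction would need one pass per distinct count; here a single pass over the two source registers produces all the shifted operands.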
Method and apparatus for moving data in a parallel processor using source and destination vector registers

Patent number: 6088782

Abstract: A Single Instruction Multiple Data processor apparatus for implementing algorithms using sliding window type data is shown. The implementation shifts the elements of a Destination Vector Register (406, 606) either automatically every time the destination register value is read or in response to a specific instruction (800). The shifting of the Destination Vector Register (406, 606) allows each processing element to operate on new data. As the destination vector (406, 606) elements are shifted, a new element is provided to the vector from a Source Vector Register (404, 604).

Type: Grant

Filed: July 10, 1997

Date of Patent: July 11, 2000

Assignee: Motorola Inc.

Inventors: De-Lei Lee, L. Rodney Goke, William Carroll Anderson
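The sliding-window update described in the abstract of patent 6088782 can be sketched as a one-lane shift that pulls its new element from a source vector. The function name, the shift direction, and the explicit `position` counter are hypothetical choices made for this illustration.

```python
def slide(dest, source, position):
    """Shift the destination vector by one lane.

    The oldest element drops out at the front and source[position]
    enters at the back. Returns the new destination vector and the
    next source position.
    """
    if not 0 <= position < len(source):
        raise IndexError("source vector is exhausted")
    return dest[1:] + [source[position]], position + 1


def windows(dest, source):
    """Yield every window seen by the processing elements as data slides in."""
    position = 0
    yield list(dest)
    while position < len(source):
        dest, position = slide(dest, source, position)
        yield list(dest)
```

Each yielded window gives every processing element a new datum without reloading the whole vector, which is the property sliding-window algorithms rely on.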
Apparatus for precise architectural update in an out-of-order processor

Patent number: 6085305

Abstract: A processor including at least one execution unit generating out-of-order results and out-of-order condition codes. Precise architectural state of the processor is maintained by providing a results buffer having a number of slots and providing a condition code buffer having the same number of slots as the results buffer, each slot in the condition code buffer in one-to-one correspondence with a slot in the results buffer. Each live instruction in the processor is assigned a slot in the results buffer and the condition code buffer. Each speculative result produced by the execution units is stored in the assigned slot in the results buffer. When an instruction is retired, the results for that instruction are transferred to an architectural result register and any condition codes generated by that instruction are transferred to an architectural condition code register.

Type: Grant

Filed: June 25, 1997

Date of Patent: July 4, 2000

Assignee: Sun Microsystems, Inc.

Inventors: Ramesh Panwar, Arjun Prabhu
System and method for processing multiple received signal sources

Patent number: 6073158

Abstract: A system and method for time slicing multiple received data streams utilizing multiple processors in such a manner as to ensure that all processors are running at full capability and are efficiently timesharing a global memory storage area. The received data streams are each divided into fixed portions called spans. The invention is operable for sequencing the movement of the time-sliced spans between the processors, adjusting the scheduling of particular ones of the time-sliced spans as a function of either processor availability or maintenance of real-time transmission of the received real-time time-sliced data streams.

Type: Grant

Filed: July 29, 1993

Date of Patent: June 6, 2000

Assignee: Cirrus Logic, Inc.

Inventors: Robert Marshall Nally, John Charles Schafer
Cover instruction and asynchronous backing store switch

Patent number: 6065114

Abstract: A computer-implemented method of switching contexts in a processor is provided. The processor includes a register stack (RS) that has first and second portions. The processor includes a register stack engine (RSE) to exchange information, in one of instruction execution dependent and independent modes between the second portion and the storage area. The computer implemented method of switching contexts includes the following steps: It is determined whether an interrupt occurred; a first register (IFM) configured to store a content of a second register (CFM) is invalidated, the CFM is configured to store control information related to the first portion; it is determined whether an interrupt handler needs to access the RS; and if so, the IFM is validated, the content of the CFM is copied to the IFM, and RSE is caused to exchange information between both the first and second portions of the RS and the storage area.

Type: Grant

Filed: April 21, 1998

Date of Patent: May 16, 2000

Assignee: Idea Corporation

Inventors: Achmed Rumi Zahir, Jonathan K. Ross, Carol Thompson, Cary Coutant, Prasad Raje, Sunil Saxena
