Distributing Of Vector Data To Vector Registers Patents (Class 712/4)
  • Patent number: 6782470
    Abstract: The register file of a processor includes embedded operand queues. The configuration of the register file into registers and operand queues is defined dynamically by a computer program. The programmer determines the trade-off between the number and size of the operand queue(s) versus the number of registers used for the program. The programmer partitions a portion of the registers into one or more operand queues. A given queue occupies a consecutive set of registers, although multiple queues need not occupy consecutive registers. An additional address bit is included to distinguish operand queue addresses from register addresses. Queue state logic tracks status information for each queue, including a header pointer, tail pointer, start address, end address and number of vacancies value. The program sets the locations and depth of a given operand queue within the register file.
    Type: Grant
    Filed: November 6, 2000
    Date of Patent: August 24, 2004
    Assignee: University of Washington
    Inventors: Stefan G. Berg, Michael S. Grow, Weiyun Sun, Donglok Kim, Yongmin Kim
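
The queue state described in the abstract above (head pointer, tail pointer, start address, end address, vacancy count) can be sketched in software as a circular buffer laid over a slice of a register file. This is only an illustrative model, not the patented hardware: the class name `OperandQueue`, the 32-entry register file, and the chosen register range are all assumptions made for the example.

```python
class OperandQueue:
    """Hypothetical model: a circular queue occupying regfile[start..end]."""

    def __init__(self, regfile, start, end):
        self.regfile = regfile          # shared register file (a plain list here)
        self.start, self.end = start, end
        self.head = self.tail = start   # next register to read / to write
        self.vacancies = end - start + 1

    def _advance(self, ptr):
        # Wrap from the end address back to the start address.
        return self.start if ptr == self.end else ptr + 1

    def push(self, value):
        if self.vacancies == 0:
            raise OverflowError("operand queue full")
        self.regfile[self.tail] = value
        self.tail = self._advance(self.tail)
        self.vacancies -= 1

    def pop(self):
        if self.vacancies == self.end - self.start + 1:
            raise IndexError("operand queue empty")
        value = self.regfile[self.head]
        self.head = self._advance(self.head)
        self.vacancies += 1
        return value


# Assumed layout: 32 registers, with registers 8..11 configured as one queue.
regfile = [0] * 32
queue = OperandQueue(regfile, start=8, end=11)
for v in (10, 20, 30, 40):
    queue.push(v)
first = queue.pop()   # frees register 8
queue.push(50)        # tail has wrapped, so 50 lands in register 8
```

Registers outside the 8..11 range stay available as ordinary registers, which mirrors the trade-off between queue depth and register count that the abstract leaves to the programmer.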
  • Publication number: 20040153624
    Abstract: A “high availability” system comprises one or more switches under the control of multiple control processors (“CPs”). One of the CPs is deemed to be “active,” while the other CP is kept in a “standby” mode. Each CP generally has the same software load including a fabric state synchronization (“FSS”) facility. The FSSs of each CP communicate with each other. The state information pertaining to an active “image” is continuously provided to a standby copy of the image (“standby image”). The CPs' FSSs perform the function of synchronizing the standby image to the active image. The state information generally includes configuration and operational parameters and other information regarding the active image. By keeping the standby image synchronized to the active image, the standby image can be rapidly transitioned to the active mode if the active image experiences a fault and continue where the previous active image left off.
    Type: Application
    Filed: October 29, 2002
    Publication date: August 5, 2004
    Applicant: Brocade Communications Systems, Inc.
    Inventors: Bill J. Zhou, Richard L. Hammons
  • Publication number: 20040128485
    Abstract: According to one embodiment, a microprocessor is described. The microprocessor includes a scalar processor and a vector processor. The vector processor fuses multiple instructions that are to be processed. The fused instructions enable a single source register to simultaneously transmit its data contents to multiple math units.
    Type: Application
    Filed: December 27, 2002
    Publication date: July 1, 2004
    Inventor: Scott R. Nelson
  • Publication number: 20040128472
    Abstract: A vector information processing apparatus has a CPU comprising a plurality of asynchronously operating units, a main memory for storing data, and a main memory controller for controlling the writing of data in the main memory. The main memory controller has a VSC address buffer for holding a storage address in the main memory for each element designated by a vector scatter instruction. The main memory controller is arranged to inhibit the outputting of a writing permission signal for the main memory which is generated according to a writing request for writing an element having a smaller element number, which has the same storage address as the storage address and which has not been processed in a sequence of element numbers, of writing requests for writing elements in the main memory which are issued respectively from the asynchronously operating units according to a vector scatter instruction.
    Type: Application
    Filed: July 22, 2003
    Publication date: July 1, 2004
    Applicant: NEC CORPORATION
    Inventor: Hisao Koyanagi
  • Publication number: 20040117510
    Abstract: Processor communication registers (PCRs) contained in each processor within a multiprocessor system and interconnected by a specialized bus provide enhanced processor communication. Each PCR stores identical processor communication information that is useful in pipelined or parallel multi-processing. Each processor has exclusive rights to store to a sector within each PCR and has continuous access to read the contents of its own PCR. Each processor updates its exclusive sector within all of the PCRs utilizing communication over the specialized bus, instantly allowing all of the other processors to see the change within the PCR data, and bypassing the cache subsystem.
    Type: Application
    Filed: December 12, 2002
    Publication date: June 17, 2004
    Applicant: International Business Machines Corporation
    Inventors: Ravi Kumar Arimilli, Robert Alan Cargnoni, Derek Edward Williams, Kenneth Lee Wright
  • Publication number: 20040117595
    Abstract: A system and method for calculating memory addresses in a partitioned memory in a processing system having a processing unit, input and output units, a program sequencer and an external interface. An address calculator includes a set of storage elements, such as registers, and an arithmetic unit for calculating a memory address of a vector element dependent upon values stored in the storage elements and the address of a previous vector element. The storage elements hold STRIDE, SKIP and SPAN values and optionally a TYPE value, relating to the spacing between elements in the same partition, the spacing between elements in the consecutive partitions, the number of elements in a partition and the size of a vector element, respectively.
    Type: Application
    Filed: September 8, 2003
    Publication date: June 17, 2004
    Inventors: James M. Norris, Philip E. May, Kent D. Moat, Raymond B. Essick, Brian G. Lucas
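
As a rough illustration of the address calculation in the abstract above, the sketch below derives each element address from the previous one using STRIDE, SKIP, SPAN and TYPE values. It rests on one reading of the abstract, which is an assumption: STRIDE and SKIP are counted in elements, and SKIP is the distance from the last element of one partition to the first element of the next. The function name `vector_addresses` is hypothetical.

```python
def vector_addresses(base, count, stride, skip, span, type_size=1):
    """Return `count` element addresses, each computed from the previous one.

    stride:    spacing between elements in the same partition (in elements)
    skip:      spacing used when crossing into the next partition (in elements)
    span:      number of elements in a partition
    type_size: size of one vector element in bytes
    """
    addresses = []
    addr = base
    left_in_partition = span
    for _ in range(count):
        addresses.append(addr)
        left_in_partition -= 1
        if left_in_partition == 0:
            # Last element of this partition: jump to the next partition.
            addr += skip * type_size
            left_in_partition = span
        else:
            addr += stride * type_size
    return addresses


# Six 4-byte elements, three per partition, unit stride, skip of five elements.
addrs = vector_addresses(base=0, count=6, stride=1, skip=5, span=3, type_size=4)
# addrs == [0, 4, 8, 28, 32, 36]
```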
  • Patent number: 6751725
    Abstract: Methods and apparatuses to clear state for operation of a stack. According to one embodiment of the invention, a processor comprises a set of one or more storage areas and a decode unit. The set of one or more storage areas are to store a plurality of tags and a top of stack indication, where each of the plurality of tags is to indicate if a register is in an empty or non-empty state. The decode unit is to decode scalar floating point instructions and packed data instructions, where at least certain of said scalar floating point instructions specify registers in a stack referenced manner and at least certain of said packed data instructions specify registers in a non-stack referenced manner. In addition, the packed data instructions include an instruction to mark the end of blocks of the packed data instructions in programs. The processor also comprises circuitry to cause the plurality of tags to indicate the empty state responsive to execution of the instruction.
    Type: Grant
    Filed: February 16, 2001
    Date of Patent: June 15, 2004
    Assignee: Intel Corporation
    Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
  • Publication number: 20040103262
    Abstract: A system and method for processing operations that use data vectors each comprising a plurality of data elements, in accordance with the present invention, includes a vector data file comprising a plurality of storage elements for storing data elements of the data vectors. A pointer array is coupled by a bus to the vector data file. The pointer array includes a plurality of entries wherein each entry identifies at least one storage element in the vector data file. The at least one storage element stores at least one data element of the data vectors, wherein for at least one particular entry in the pointer array, the at least one storage element identified by the particular entry has an arbitrary starting address in the vector data file.
    Type: Application
    Filed: November 15, 2003
    Publication date: May 27, 2004
    Applicant: International Business Machines Corporation
    Inventors: Clair John Glossner, Erdem Hokenek, David Meltzer, Mayan Moudgill
  • Patent number: 6742106
    Abstract: A vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. Vector data transfer instructions are posted to an instruction queue in the vector transfer unit. Program instructions for performing a burst transfer include determining the starting address of the vector data to be transferred, the ending address of the vector data to be transferred, and whether the ending address of the vector data to be transferred is within the same virtual memory page as the starting address. The ending address of the vector data to be transferred is determined based on the number of data elements to be transferred, the stride of the vector data to be transferred, and the width of the vector data elements to be transferred. When the amount of data to be transferred is divisible by a factor of two, the multiplication of the stride and width of the data elements is carried out by shifting.
    Type: Grant
    Filed: January 28, 2003
    Date of Patent: May 25, 2004
    Assignee: NEC Electronics, Inc.
    Inventor: Ahmad R. Ansari
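
The ending-address and page checks described in the abstract above can be sketched as follows. The formula, the 4 KiB page size, and the function names are assumptions for illustration. In particular, the stride is taken to be in units of elements, and the shift is applied when the element width is a power of two, which is one plausible reading of the abstract's "divisible by a factor of two".

```python
PAGE_SHIFT = 12  # assumed 4 KiB virtual memory pages


def burst_end_address(start, num_elements, stride, width):
    """Address of the last byte touched by a strided vector transfer."""
    if width > 0 and width & (width - 1) == 0:
        # Power-of-two width: multiply stride by width with a shift.
        step = stride << (width.bit_length() - 1)
    else:
        step = stride * width
    return start + (num_elements - 1) * step + width - 1


def same_page(start, end):
    """True when both addresses fall in the same virtual memory page."""
    return (start >> PAGE_SHIFT) == (end >> PAGE_SHIFT)


# 16 elements of 8 bytes, every second element, starting at 0x1F00.
end_short = burst_end_address(0x1F00, 16, stride=2, width=8)   # 0x1FF7
# Doubling the element count pushes the end address into the next page.
end_long = burst_end_address(0x1F00, 32, stride=2, width=8)    # 0x20F7
```

In this model a transfer whose end address lands on a different page than its start would need to be handled differently from a single burst; how the patented unit handles that case is not stated in the abstract.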
  • Patent number: 6728874
    Abstract: A method and system for correctly processing both big endian and little endian vector data. If the vector has a little endian data order, each piece of data (such as a byte) within the vector is processed in order. If the vector has a big endian data order, each vector element is processed in order, but each piece of data within each vector element is processed in reverse order.
    Type: Grant
    Filed: October 10, 2000
    Date of Patent: April 27, 2004
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Frans W. Sijstermans, Evert-Jan D. Pol
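
The processing order in the abstract above can be made concrete with a small sketch that lists the byte offsets in the order they would be visited. The function name `byte_processing_order` and the choice of a byte as the "piece of data" are assumptions for the example.

```python
def byte_processing_order(num_elements, element_size, big_endian):
    """Byte offsets of a vector in the order they are processed.

    Little endian: every byte in order.
    Big endian: elements in order, bytes within each element reversed.
    """
    order = []
    for element in range(num_elements):
        offsets = range(element_size)
        if big_endian:
            offsets = reversed(offsets)
        order.extend(element * element_size + o for o in offsets)
    return order


little = byte_processing_order(2, 4, big_endian=False)  # [0, 1, 2, 3, 4, 5, 6, 7]
big = byte_processing_order(2, 4, big_endian=True)      # [3, 2, 1, 0, 7, 6, 5, 4]
```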
  • Patent number: 6701424
    Abstract: A method and apparatus for loading and storing vectors from and to memory, including embedding a location identifier in bits comprising a vector load and store instruction, wherein the location identifier indicates a location in the vector where useful data ends. The vector load instruction further includes a value field that indicates a particular constant for use by the load/store unit to set locations in the vector register beyond the useful data with the constant. By embedding the ending location of the useful data in the instruction, bandwidth and memory are saved by only requiring that the useful data in the vector be loaded and stored.
    Type: Grant
    Filed: April 7, 2000
    Date of Patent: March 2, 2004
    Assignee: Nintendo Co., Ltd.
    Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng
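
A minimal sketch of the load behaviour in the abstract above: only the useful elements are fetched from memory, and the rest of the vector register is filled with a constant carried by the instruction. The function name `vector_load`, the list-based memory, and the convention that `end` counts the useful elements are all assumptions for illustration.

```python
def vector_load(memory, addr, vector_length, end, fill):
    """Load `end` useful elements from memory, pad the register with `fill`."""
    if not 0 <= end <= vector_length:
        raise ValueError("end must lie within the vector")
    useful = memory[addr:addr + end]            # only these elements are read
    return useful + [fill] * (vector_length - end)


memory = [1, 2, 3, 4, 5, 6, 7, 8]
register = vector_load(memory, addr=2, vector_length=4, end=3, fill=0)
# register == [3, 4, 5, 0]
```

Only three elements cross the memory interface here even though the register holds four, which is the bandwidth saving the abstract points to.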
  • Patent number: 6681315
    Abstract: A bit vector array apparatus provides a high speed method for processing network transmission controls. Complex data structures for controlling network access are represented in the simplest possible form as single bit vector elements. The bit vector elements are combined into bit vectors comprised of 32 single bit vector elements. The bit vectors are processed in parallel in the bit vector array apparatus, which is comprised of special-purpose bit manipulation functions to expedite the processing.
    Type: Grant
    Filed: November 26, 1997
    Date of Patent: January 20, 2004
    Assignee: International Business Machines Corporation
    Inventors: Paul John Hilts, Brian Alan Youngman
  • Publication number: 20040006681
    Abstract: A configuration of vector units, digital circuitry and associated instructions is disclosed for the parallel processing of multiple Viterbi decoder butterflies on a programmable digital signal processor (DSP) that is based on single-instruction-multiple-data (SIMD) principles and provides indirect access to vector elements. The disclosed configuration uses a processor with two vector units and associated registers, where the vector units are connected back to back for processing Viterbi decoder state metrics. Viterbi add instructions increment vectors of state metrics from a first register, performing a desired permutation of state metrics while reading them indirectly through vector pointers, and writing intermediate result vectors to a second register.
    Type: Application
    Filed: September 13, 2002
    Publication date: January 8, 2004
    Inventors: Jaime Humberto Moreno, Fredy Daniel Neeser
  • Publication number: 20040003200
    Abstract: An interconnection device (300) with a number of links (306, 308, 310, 312 and 314), each link having a number of link input ports (302), link output ports (304) and storage registers (316). An input selection switch (402) is coupled to a selected link input port to receive an input data token. The storage registers (316) may be used to store input data tokens. A storage access switch (404) is coupled to the input selection switch (402) and to the storage registers (316) and may be used to select the current input data token or a token from the storage registers as an output data token. An output selection switch (406) receives the output data token and provides it to a selected link output port. The interconnection device may, for example, be used to connect the inputs and outputs of the processing elements of a vector processor or digital signal processor.
    Type: Application
    Filed: June 28, 2002
    Publication date: January 1, 2004
    Inventors: Philip E. May, Kent Donald Moat, Raymond B. Essick, Silviu Chiricescu, Brian Geoffrey Lucas, James M. Norris, Michael Allen Schuette, Ali Saidi
  • Patent number: 6665790
    Abstract: A system and method for processing operations that use data vectors each comprising a plurality of data elements, in accordance with the present invention, includes a vector data file comprising a plurality of storage elements for storing data elements of the data vectors. A pointer array is coupled by a bus to the vector data file. The pointer array includes a plurality of entries wherein each entry identifies at least one storage element in the vector data file. The at least one storage element stores at least one data element of the data vectors, wherein for at least one particular entry in the pointer array, the at least one storage element identified by the particular entry has an arbitrary starting address in the vector data file.
    Type: Grant
    Filed: February 29, 2000
    Date of Patent: December 16, 2003
    Assignee: International Business Machines Corporation
    Inventors: Clair John Glossner, III, Erdem Hokenek, David Meltzer, Mayan Moudgill
  • Patent number: 6665774
    Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operating on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a.) an address in the cache array to which a read-data value will be placed when the read-data value is returned from the memory, and (b.
    Type: Grant
    Filed: October 16, 2001
    Date of Patent: December 16, 2003
    Assignee: Cray, Inc.
    Inventors: Gregory J. Faanes, Eric P. Lundberg
  • Publication number: 20030221086
    Abstract: Data processing apparatus and methods capable of executing vector instructions. Such apparatus preferably include a number of data buffers whose sizes are configurable in hardware and/or in software; a number of buffer control units adapted to control access to the data buffers, at least one buffer control unit including at least one programmable write pointer register, read pointer register, read stride register and vector length register; a number of execution units for executing vector instructions using input operands stored in data buffers and storing produced results to data buffers; and at least one Direct Memory Access channel transferring data to and from said buffers. Preferably, at least some of the data buffers are implemented in dual-ported fashion in order to allow at least two simultaneous accesses per buffer, including at least one read access and one write access.
    Type: Application
    Filed: February 13, 2003
    Publication date: November 27, 2003
    Inventors: Slobodan A. Simovich, Ivan P. Radivojevic, Erik Ramberg
  • Patent number: 6625720
    Abstract: A vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. Vector instructions are used for transferring the vector data between memory and registers used to perform calculations on the vector data. The transfers of portions of the vector data required in a calculation are scheduled so that calculations on a portion of the vector data are performed while a subsequent portion of the vector data is transferred. A vector buffer pool is partitioned into one or more vector buffers based on configuration information including the number of vectors buffers required by an application program and the size required for each vector buffer. The vector buffers are allocated for exclusive use by an application program that is executing in the data processor. Vector data transfer instructions are posted in a vector transfer instruction queue and are executed in the order they are posted to the instruction queue.
    Type: Grant
    Filed: August 17, 1999
    Date of Patent: September 23, 2003
    Assignee: NEC Electronics, Inc.
    Inventor: Ahmad R. Ansari
  • Publication number: 20030167387
    Abstract: A vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. Vector data transfer instructions are posted to an instruction queue in the vector transfer unit. Program instructions for performing a burst transfer include determining the starting address of the vector data to be transferred, the ending address of the vector data to be transferred, and whether the ending address of the vector data to be transferred is within the same virtual memory page as the starting address. The ending address of the vector data to be transferred is determined based on the number of data elements to be transferred, the stride of the vector data to be transferred, and the width of the vector data elements to be transferred. When the amount of data to be transferred is divisible by a factor of two, the multiplication of the stride and width of the data elements is carried out by shifting.
    Type: Application
    Filed: January 28, 2003
    Publication date: September 4, 2003
    Applicant: NEC Electronics, Inc.
    Inventor: Ahmad R. Ansari
  • Publication number: 20030159016
    Abstract: A data processor comprising: a register memory comprising an array of memory cells extending in two dimensions, the cells being located on rows in the first dimension and columns in the second dimension, each cell being addressable by means of an instruction specifying a pair of coordinates that identify the row and column of the cell in the array; and a processing unit capable of executing instructions that operate on a plurality of memory cells in the register, the instructions identifying the plurality of cells by means of a first instruction part specifying a pair of coordinates that identify a first cell in the array, and a second instruction part that identifies the configuration of the plurality of cells relative to the first cell; the data processor being arranged to interpret a first form of second instruction part as specifying a first group of cells all of which are located in the same row but in different columns, and to interpret a second form of second instruction part as specifying a first grou
    Type: Application
    Filed: October 31, 2002
    Publication date: August 21, 2003
    Applicant: ALPHAMOSAIC LIMITED
    Inventors: Stephen Barlow, Neil Bailey, Timothy Ramsdale, David Plowman, Robert Swann
  • Publication number: 20030159017
    Abstract: A data processor comprising: a first processor unit having a first register memory addressable in a first format; a second processor unit having a second register memory addressable in a second format, and being capable of retrieving data from the first processor unit; the second processor unit being capable of executing an instruction including an operand specified by means of a reference to data in the first register memory.
    Type: Application
    Filed: October 31, 2002
    Publication date: August 21, 2003
    Applicant: ALPHAMOSAIC LIMITED
    Inventors: Stephen Barlow, Neil Bailey, Timothy Ramsdale, David Plowman, Robert Swann
  • Patent number: 6571328
    Abstract: A method and apparatus for obtaining a scalar value from a vector register for use in a mixed vector and scalar instruction, including providing a vector in a vector register file, and embedding a location identifier of the scalar value within the vector in the bits defining the mixed vector and scalar instruction. The scalar value can be used directly from the vector register without the need to load the scalar to a scalar register prior to executing the instruction. The scalar location identifier may be embedded in the secondary op code of the instruction, or the instruction may have dedicated bits for providing the location of the scalar within the vector.
    Type: Grant
    Filed: August 1, 2001
    Date of Patent: May 27, 2003
    Assignee: Nintendo Co., Ltd.
    Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng, Timothy J. Van Hook
  • Patent number: 6553474
    Abstract: A data processor in which a read operation, including misaligned data as operand data, can be performed in a single cycle. An alignment buffer having a register to hold data stored at one address in data memory is provided between the data memory and a data path unit. The alignment buffer outputs misaligned data by selecting misaligned data from data held in the register and data read from the data memory. The data held in the register is updated as word-aligned data is read out.
    Type: Grant
    Filed: January 24, 2001
    Date of Patent: April 22, 2003
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventors: Hironobu Ito, Hisakazu Sato
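
The alignment buffer in the abstract above can be modelled as a register that keeps the last aligned word, so that each misaligned read needs only one new memory access. This is a software sketch under assumed details: a 4-byte word, sequential reads at a fixed misalignment, and the hypothetical names `AlignmentBuffer`, `prime` and `read`.

```python
WORD = 4  # assumed word size in bytes


class AlignmentBuffer:
    """Hypothetical model of a register holding the last aligned word read."""

    def __init__(self, memory):
        self.memory = memory      # a bytes object standing in for data memory
        self.held = None          # contents of the holding register
        self.held_addr = None     # aligned address the held word came from

    def _read_word(self, aligned_addr):
        return self.memory[aligned_addr:aligned_addr + WORD]

    def prime(self, addr):
        """Fill the register with the aligned word that contains `addr`."""
        base = addr - addr % WORD
        self.held, self.held_addr = self._read_word(base), base

    def read(self, addr):
        """Return WORD bytes starting at `addr` using one memory access."""
        offset = addr % WORD
        base = addr - offset
        if self.held_addr != base:
            raise ValueError("register does not hold the word containing addr")
        nxt = self._read_word(base + WORD)        # the single memory read
        data = self.held[offset:] + nxt[:offset]  # select from register + memory
        self.held, self.held_addr = nxt, base + WORD
        return data


buf = AlignmentBuffer(bytes(range(16)))
buf.prime(1)
first = buf.read(1)    # bytes 1..4
second = buf.read(5)   # bytes 5..8, again with one memory access
```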
  • Publication number: 20030074654
    Abstract: A digital computer system automatically creates an Instruction Set Architecture (ISA) that potentially exploits VLIW instructions, vector operations, fused operations, and specialized operations with the goal of increasing the performance of a set of applications while keeping hardware cost below a designer specified limit, or with the goal of minimizing hardware cost given a required level of performance.
    Type: Application
    Filed: October 16, 2001
    Publication date: April 17, 2003
    Inventors: David William Goodwin, Dror Maydan, Ding-Kai Chen, Darin Stamenov Petkov, Steven Weng-Kiang Tjiang, Peng Tu, Christopher Rowen
  • Publication number: 20030056082
    Abstract: An improved method and system for controlling free space distribution by key range within a database. In one embodiment, a data structure including key ranges of a plurality of database tables and indexes, and a plurality of key range free space parameters is created. The plurality of database tables and indexes may include a plurality of page sets, which may include rows of data and keys. Time values may be associated with the plurality of free space parameters. The key range free space parameters may have values assigned to them. The key range free space parameters may be user-defined or automatically generated using growth trend analysis, based on key range growth statistics. The rows of data and keys within the plurality of page sets may be redistributed by a reorganization process. The redistributing may reference the key ranges of the data structure and the key range free space parameters.
    Type: Application
    Filed: December 27, 2001
    Publication date: March 20, 2003
    Inventor: John D. Maxfield
  • Patent number: 6530011
    Abstract: A method and an apparatus for implementing mixed scalar and vector values in a digital processing system. In one embodiment, a digital processing system, which contains a processing unit and memories, is capable of identifying a first data in a first scalar register and a second data in a vector register. Upon fetching the first data as a first operand and the second data as a second operand, the processing unit performs an operation between the first and second operands in response to an operator. After operations, the result is subsequently stored in a second scalar register.
    Type: Grant
    Filed: October 20, 1999
    Date of Patent: March 4, 2003
    Assignee: SandCraft, Inc.
    Inventor: Jack H. Choquette
  • Patent number: 6513107
    Abstract: A vector transfer unit for handling transfers of vector data between a memory and a data processor in a computer system. Vector data transfer instructions are posted to an instruction queue in the vector transfer unit. Program instructions for performing a burst transfer include determining the starting address of the vector data to be transferred, the ending address of the vector data to be transferred, and whether the ending address of the vector data to be transferred is within the same virtual memory page as the starting address. The ending address of the vector data to be transferred is determined based on the number of data elements to be transferred, the stride of the vector data to be transferred, and the width of the vector data elements to be transferred. When the amount of data to be transferred is divisible by a factor of two, the multiplication of the stride and width of the data elements is carried out by shifting.
    Type: Grant
    Filed: August 17, 1999
    Date of Patent: January 28, 2003
    Assignee: NEC Electronics, Inc.
    Inventor: Ahmad R. Ansari
  • Patent number: 6496902
    Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operating on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a.) an address in the cache array to which a read-data value will be placed when the read-data value is returned from the memory, and (b.
    Type: Grant
    Filed: December 31, 1998
    Date of Patent: December 17, 2002
    Assignee: Cray Inc.
    Inventors: Gregory J. Faanes, Eric P. Lundberg
  • Publication number: 20020144061
    Abstract: A common scalar/vector data cache apparatus and method for a scalar/vector computer. One aspect of the present invention provides a computer system including a memory. The memory includes a plurality of sections. The computer system also includes a scalar/vector processor coupled to the memory using a plurality of separate address busses and a plurality of separate read-data busses wherein at least one of the sections of the memory is associated with each address bus and at least one of the sections of the memory is associated with each read-data bus. The processor further includes a plurality of scalar registers and a plurality of vector registers and operating on instructions which provide a reference address to a data word. The processor includes a scalar/vector cache unit that includes a cache array, and a FIFO unit that tracks (a.) an address in the cache array to which a read-data value will be placed when the read-data value is returned from the memory, and (b.
    Type: Application
    Filed: October 16, 2001
    Publication date: October 3, 2002
    Applicant: Cray Inc.
    Inventors: Gregory J. Faanes, Eric P. Lundberg
  • Patent number: 6446193
    Abstract: A method and apparatus for reducing instruction cycles in a digital signal processor wherein the processor includes a multiplier unit, an adder, a memory, and at least one pair of first and second accumulators. The accumulators include respective guard, high and low parts. The method and apparatus enable vectoring the respective first and second high parts from the accumulators to define a single vectored register responsive to a single instruction cycle and processing the data in the vectored register.
    Type: Grant
    Filed: September 8, 1997
    Date of Patent: September 3, 2002
    Assignee: Agere Systems Guardian Corp.
    Inventors: Mazhar M. Alidina, Sivanand Simanapalli, Larry R. Tate
  • Patent number: 6401194
    Abstract: A vector processor provides a data path divided into smaller slices of data, with each slice processed in parallel with the other slices. Furthermore, an execution unit provides smaller arithmetic and functional units chained together to execute more complex microprocessor instructions requiring multiple cycles by sharing single-cycle operations, thereby reducing both costs and size of the microprocessor. One embodiment handles 288-bit data widths using 36-bit data path slices. Another embodiment executes integer multiply and multiply-and-accumulate and floating point add/subtract and multiply operations using single-cycle arithmetic logic units. Other embodiments support 8-bit, 9-bit, 16-bit, and 32-bit integer data types and 32-bit floating data types.
    Type: Grant
    Filed: January 28, 1997
    Date of Patent: June 4, 2002
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Le Trong Nguyen, Heonchul Park, Roney S. Wong, Ted Nguyen, Edward H. Yu
  • Publication number: 20020032710
    Abstract: According to the invention, a matrix of elements is processed in a processor. A first subset of matrix elements is loaded from a first location and a second subset of matrix elements is loaded from a second location. A third subset of matrix elements is stored in a first destination and a fourth subset of matrix elements is stored in a second destination. The loading and storing steps result from the same instruction issue.
    Type: Application
    Filed: March 8, 2001
    Publication date: March 14, 2002
    Inventors: Ashley Saulsbury, Daniel S. Rice, Michael W. Parkin, Nyles Nettleton
  • Publication number: 20020032848
    Abstract: A method and apparatus for obtaining a scalar value from a vector register for use in a mixed vector and scalar instruction, including providing a vector in a vector register file, and embedding a location identifier of the scalar value within the vector in the bits defining the mixed vector and scalar instruction. The scalar value can be used directly from the vector register without the need to load the scalar to a scalar register prior to executing the instruction. The scalar location identifier may be embedded in the secondary op code of the instruction, or the instruction may have dedicated bits for providing the location of the scalar within the vector.
    Type: Application
    Filed: August 1, 2001
    Publication date: March 14, 2002
    Applicant: Nintendo Co., Ltd.
    Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng, Timothy J. Van Hook
  • Publication number: 20020026569
    Abstract: A method and apparatus for loading and storing vectors from and to memory, including embedding a location identifier in bits comprising a vector load and store instruction, wherein the location identifier indicates a location in the vector where useful data ends. The vector load instruction further includes a value field that indicates a particular constant for use by the load/store unit to set locations in the vector register beyond the useful data with the constant. By embedding the ending location of the useful data in the instruction, bandwidth and memory are saved by only requiring that the useful data in the vector be loaded and stored.
    Type: Application
    Filed: August 1, 2001
    Publication date: February 28, 2002
    Applicant: Nintendo Co., Ltd.
    Inventors: Yu-Chung C. Liao, Peter A. Sandon, Howard Cheng
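
The load behavior described in this abstract can be sketched in Python. This is a minimal illustration, not the patented implementation: the function name, the 8-element register width, and the reading of the location identifier as the index of the last useful element are all assumptions.

```python
def vector_load_with_fill(memory, base, end, fill, width=8):
    """Model a vector load that reads only the useful data.

    Assumptions (not from the abstract): `end` is the index of the last
    useful element, and the register holds `width` elements.
    """
    if not 0 <= end < width:
        raise ValueError("end must index an element of the vector register")
    # Only elements 0..end are fetched from memory.
    useful = list(memory[base:base + end + 1])
    # Every register position beyond the useful data receives the constant.
    return useful + [fill] * (width - len(useful))


memory = list(range(100, 120))
register = vector_load_with_fill(memory, base=2, end=2, fill=0)
```

Here three elements are read from memory and the remaining five register positions are set to the fill constant, so no memory bandwidth is spent on padding.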
  • Publication number: 20020007449
    Abstract: A vector architecture processing unit according to the present invention comprises a vector scatter (VSC) address coincidence detection unit 3 that comprises registers in which an area start address and an area end address of an area specified by an area-specified vector scatter instruction are stored; and a circuit that checks if the addresses specified by the area-specified vector scatter instruction overlap with an address to be accessed by a memory access instruction following the area-specified vector scatter instruction, wherein an instruction issue control unit 1 comprises a hold control circuit that holds the following memory access instruction in response to an address conflict signal from the VSC address conflict detector.
    Type: Application
    Filed: July 10, 2001
    Publication date: January 17, 2002
    Applicant: NEC CORPORATION
    Inventor: Hisao Koyanagi
  • Patent number: 6334176
    Abstract: The data processing system loads three input operands, including two input vectors and a control vector, into vector registers and performs a permutation of the two input vectors as specified by the control vector, and further stores the result of the operation as the output operand in an output register. The control vector consists of sixteen indices, each uniquely identifying a single byte of input data in either of the input registers, and can be specified in the operational code or be the result of a computation previously performed within the vector registers. The control vector is specified by calculating the offset of a selected vector element of the input vector relative to a base address of the input vector and loading each element with an index equal to the relative offset. Alternatively, the generation of the alignment vector is made by performing a look-up within a look-up table.
    Type: Grant
    Filed: April 17, 1998
    Date of Patent: December 25, 2001
    Assignees: Motorola, Inc., International Business Machines Corporation, Apple Computer, Inc.
    Inventors: Hunter Ledbetter Scales, III, Keith Everett Diefendorff, Brett Olsson, Pradeep Kumar Dubey, Ronald Ray Hochsprung
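
The permutation in this abstract can be sketched as follows. It is a rough model only: the function names are hypothetical, masking each index to five bits is an assumption (the abstract says only that each index identifies one byte of either input register), and `alignment_control` is one possible reading of the offset-based control vector.

```python
def vector_permute(a, b, control):
    """Select 16 bytes from the 32 bytes of a and b, as directed by control."""
    if len(a) != 16 or len(b) != 16 or len(control) != 16:
        raise ValueError("a, b and control must each hold 16 entries")
    source = bytes(a) + bytes(b)
    # Assumption: only the low five bits of each index are used (0..31).
    return bytes(source[index & 0x1F] for index in control)


def alignment_control(offset):
    """Hypothetical helper: control vector whose elements are offset, offset+1, ..."""
    return [offset + i for i in range(16)]


a = bytes(range(16))
b = bytes(range(16, 32))
aligned = vector_permute(a, b, alignment_control(3))
```

With an offset of 3, the result holds bytes 3 through 18 of the concatenated inputs, which is how a permute can realign data that straddles two registers.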
  • Patent number: 6282634
    Abstract: A floating point unit is provided with a register bank comprising 32 registers that may be used as either vector registers or scalar registers. A data processing instruction includes at least one register specifying field pointing to a register containing a data value to be used in that operation. An increase in the instruction bit space available to encode more opcodes or to allow for more registers is provided by encoding whether a register is to be treated as a vector or a scalar within the register field itself. Further, the register field for one register of the instruction may encode whether another register is a vector or a scalar. The registers can be initially accessed using the values within the register fields of the instruction independently of the opcode allowing for easier decode.
    Type: Grant
    Filed: May 27, 1998
    Date of Patent: August 28, 2001
    Assignee: ARM Limited
    Inventors: Christopher Neal Hinds, David Vivian Jaggar, David Terrence Matheny, David James Seal
  • Publication number: 20010014930
    Abstract: The invention relates to a new memory structure specially adapted for the storage of memory vectors. Each of the storage positions (#1, Mi-#M, Mi) of the memory has a length adapted to the length of large vectors and is parallelly arranged extending from an input and/or output for information and deeper into the memory. In this way each vector is stored undivided in a sequential order with the beginning of the vector at the input and/or output of the memory (memory field F1 in memory plane Mi). Addressing is made to the input and/or output of the memory. There are means (1IB-MIB, 1UB-MUB) acting like shift registers for the inputting and outputting of information in undivided sequence to/from the storage positions in the memory.
    Type: Application
    Filed: December 8, 1997
    Publication date: August 16, 2001
    Inventor: Ingemar Soderquist
  • Patent number: 6266758
    Abstract: The present invention provides alignment and ordering of vector elements for SIMD processing. In the alignment of vector elements for SIMD processing, one vector is loaded from a memory unit into a first register and another vector is loaded from the memory unit into a second register. The first vector contains a first byte of an aligned vector to be generated. Then, a starting byte specifying the first byte of an aligned vector is determined. Next, a vector is extracted from the first register and the second register beginning from the first bit in the first byte of the first register continuing through the bits in the second register. Finally, the extracted vector is replicated into a third register such that the third register contains a plurality of elements aligned for SIMD processing. In the ordering of vector elements for SIMD processing, a first vector is loaded from a memory unit into a first register and a second vector is loaded from the memory unit into a second register.
    Type: Grant
    Filed: March 5, 1999
    Date of Patent: July 24, 2001
    Assignee: MIPS Technologies, Inc.
    Inventors: Timothy J. van Hook, Peter Hsu, William A. Huffman, Henry P. Moreton, Earl A. Killian
  • Patent number: 6212617
    Abstract: A parallel programming system provides a lazy collection oriented data type that reduces inter-processor communication in programs executed on parallel computers. The lazy collection oriented data type is provided as a data type in a parallel programming language. The parallel language supports both data-parallel and control-parallel operations. These operations take advantage of the lazy collection oriented data type to defer or reduce inter-processor communication until an operation on the data type requires that it be balanced across a set of processors.
    Type: Grant
    Filed: June 30, 1998
    Date of Patent: April 3, 2001
    Assignee: Microsoft Corporation
    Inventor: Jonathan C. Hardwick
  • Patent number: 6189094
    Abstract: A floating point unit having a register bank containing a plurality of registers supports vector operations that execute a specified operation a plurality of times upon a sequence of data values from different registers. The register bank is divided into subsets, with the sequence of registers used in a vector operation wrapping within a subset. The subsets comprise disjoint, contiguous ranges of register numbers. The wrapping within ranges allows compact and efficient code to be provided for performing DSP operations, such as FIR filtering and matrix transformations.
    Type: Grant
    Filed: May 27, 1998
    Date of Patent: February 13, 2001
    Assignee: Arm Limited
    Inventors: Christopher Neal Hinds, David Vivian Jaggar, David Terrence Matheny, David James Seal
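
The wrapping of a register sequence within a subset can be sketched as below. The function name and the subset size of 8 registers are assumptions made for illustration; the abstract states only that the subsets are disjoint, contiguous ranges of register numbers.

```python
def vector_register_sequence(start, length, subset_size=8):
    """Return the register numbers touched by a vector operation.

    Assumption: subsets are aligned blocks of `subset_size` registers.
    The sequence advances one register at a time and wraps inside the
    subset that contains `start`.
    """
    subset_base = (start // subset_size) * subset_size
    offset = start - subset_base
    return [subset_base + (offset + i) % subset_size for i in range(length)]


registers = vector_register_sequence(start=14, length=4)
```

Starting at register 14 with four elements, the sequence runs 14, 15 and then wraps to 8, 9 rather than spilling into the next subset.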
  • Patent number: 6173366
    Abstract: A multimedia extension unit (MEU) is provided for performing various multimedia-type operations. The MEU can be coupled either through a coprocessor bus or a local CPU bus to a conventional processor. The MEU employs vector registers, a vector ALU, and an operand routing unit (ORU) to perform a maximum number of the multimedia operations within as few instruction cycles as possible. Complex algorithms are readily performed by arranging operands upon the vector ALU in accordance with the desired algorithm flowgraph. The ORU aligns the operands within partitioned slots or sub-slots of the vector registers using vector instructions unique to the MEU. At the output of the ORU, operand pairs from vector source or destination registers can be easily routed and combined at the vector ALU. The vector instructions employ special load/store instructions in combination with numerous operational instructions to carry out concurrent multimedia operations on the aligned operands.
    Type: Grant
    Filed: December 2, 1996
    Date of Patent: January 9, 2001
    Assignees: Compaq Computer Corp., Advanced Micro Devices, Inc.
    Inventors: John S. Thayer, John G. Favor, Frederick D. Weber
  • Patent number: 6141673
    Abstract: A multimedia extension unit (MEU) is provided for performing various multimedia-type operations. The MEU can be coupled either through a coprocessor bus or a local central processing unit (CPU) bus to a conventional processor. The MEU employs vector registers, a vector arithmetic logic unit (ALU), and an operand routing unit (ORU) to perform a maximum number of the multimedia operations within as few instruction cycles as possible. Complex algorithms are readily performed by arranging operands upon the vector ALU in accordance with the desired algorithm flowgraph. The ORU aligns the operands within partitioned slots or sub-slots of the vector registers using vector instructions unique to the MEU. At the output of the ORU, operand pairs from vector source or destination registers can be easily routed and combined at the vector ALU.
    Type: Grant
    Filed: May 25, 1999
    Date of Patent: October 31, 2000
    Assignees: Advanced Micro Devices, Inc., Compaq Computer Corp.
    Inventors: John S. Thayer, John Gregory Favor, Frederick D. Weber
  • Patent number: 6098162
    Abstract: Vector shifting elements of a vector register by varying amounts in a single process is achieved in a vector supercomputer processor. A first vector register contains a set of operands, and a second vector register contains a set of shift counts, one shift count for each operand. Operands and shift counts are successively transferred to a vector shift functional unit, which shifts the operand by an amount equal to the value of the shift count. The shifted operands are stored in a third vector register. The vector shift functional unit also achieves word shifting of a predetermined number of vector register elements to different word locations of another vector register.
    Type: Grant
    Filed: August 24, 1998
    Date of Patent: August 1, 2000
    Assignee: Cray Research, Inc.
    Inventors: Alan J. Schiffleger, Ram K. Gupta, Christopher C. Hsiung
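
The per-element shift in this abstract can be modeled in a few lines. This is a sketch under stated assumptions: the function name, the left shift direction, and the 64-bit word size are not taken from the abstract.

```python
def vector_shift_left(operands, shift_counts, word_bits=64):
    """Shift each operand by its own count, one shift count per operand."""
    if len(operands) != len(shift_counts):
        raise ValueError("one shift count is required per operand")
    mask = (1 << word_bits) - 1  # keep results within the assumed word size
    # Operands and counts are consumed pairwise, as if streamed to a shift unit.
    return [(value << count) & mask for value, count in zip(operands, shift_counts)]


shifted = vector_shift_left([1, 2, 3], [0, 1, 4], word_bits=8)
```

The list returned plays the role of the third vector register that receives the shifted operands.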
  • Patent number: 6088782
    Abstract: A Single Instruction Multiple Data processor apparatus for implementing algorithms using sliding window type data is shown. The implementation shifts the elements of a Destination Vector Register (406, 606) either automatically every time the destination register value is read or in response to a specific instruction (800). The shifting of the Destination Vector Register (406, 606) allows each processing element to operate on new data. As the destination vector (406, 606) elements are shifted, a new element is provided to the vector from a Source Vector Register (404, 604).
    Type: Grant
    Filed: July 10, 1997
    Date of Patent: July 11, 2000
    Assignee: Motorola Inc.
    Inventors: De-Lei Lee, L. Rodney Goke, William Carroll Anderson
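
The sliding-window behavior can be illustrated with a small generator. It is only a sketch: the function name and the shift direction (oldest element dropped from the front, new element appended at the end) are assumptions.

```python
def sliding_windows(destination, source):
    """Yield the destination vector after each one-element shift.

    On every step the destination drops its first element and takes the
    next element of the source vector, so each lane sees new data.
    """
    window = list(destination)
    for incoming in source:
        window = window[1:] + [incoming]
        yield list(window)


windows = list(sliding_windows([1, 2, 3], [4, 5]))
```

Each yielded list corresponds to one read of the destination register in the automatic-shift mode described above.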
  • Patent number: 6085305
    Abstract: A processor including at least one execution unit generating out-of-order results and out-of-order condition codes. Precise architectural state of the processor is maintained by providing a results buffer having a number of slots and providing a condition code buffer having the same number of slots as the results buffer, each slot in the condition code buffer in one-to-one correspondence with a slot in the results buffer. Each live instruction in the processor is assigned a slot in the results buffer and the condition code buffer. Each speculative result produced by the execution units is stored in the assigned slot in the results buffer. When an instruction is retired, the results for that instruction are transferred to an architectural result register and any condition codes generated by that instruction are transferred to an architectural condition code register.
    Type: Grant
    Filed: June 25, 1997
    Date of Patent: July 4, 2000
    Assignee: Sun Microsystems, Inc.
    Inventors: Ramesh Panwar, Arjun Prabhu
  • Patent number: 6073158
    Abstract: A system and method for time slicing multiple received data streams utilizing multiple processors in such a manner as to ensure that all processors are running at full capability and are efficiently timesharing a global memory storage area. The received data streams are each divided into fixed portions called spans. The invention is operable for sequencing the movement of the time-sliced spans between the processors, adjusting the scheduling of particular ones of the time-sliced spans as a function of either processor availability or maintenance of real-time transmission of the received real-time time-sliced data streams.
    Type: Grant
    Filed: July 29, 1993
    Date of Patent: June 6, 2000
    Assignee: Cirrus Logic, Inc.
    Inventors: Robert Marshall Nally, John Charles Schafer
  • Patent number: 6065114
    Abstract: A computer-implemented method of switching contexts in a processor is provided. The processor includes a register stack (RS) that has first and second portions. The processor includes a register stack engine (RSE) to exchange information, in one of instruction execution dependent and independent modes between the second portion and the storage area. The computer implemented method of switching contexts includes the following steps: It is determined whether an interrupt occurred; a first register (IFM) configured to store a content of a second register (CFM) is invalidated, the CFM is configured to store control information related to the first portion; it is determined whether an interrupt handler needs to access the RS; and if so, the IFM is validated, the content of the CFM is copied to the IFM, and RSE is caused to exchange information between both the first and second portions of the RS and the storage area.
    Type: Grant
    Filed: April 21, 1998
    Date of Patent: May 16, 2000
    Assignee: Idea Corporation
    Inventors: Achmed Rumi Zahir, Jonathan K. Ross, Carol Thompson, Cary Coutant, Prasad Raje, Sunil Saxena
  • Patent number: 6058465
    Abstract: A vector processor architecture provides vector registers of fixed size having data elements of programmable size and type. The type and size for data elements are defined by instructions which manipulate operands associated with the vector registers. The data size defined by an instruction determines the number of the data elements in a vector register and the number of parallel operations performed to complete the instruction. One embodiment of the invention supports 8-bit, 9-bit, 16-bit, and 32-bit data element sizes of integer type for all sizes and floating point data type for the 32-bit data elements.
    Type: Grant
    Filed: August 19, 1996
    Date of Patent: May 2, 2000
    Inventor: Le Trong Nguyen
  • Patent number: 6006315
    Abstract: A method is provided for writing a scalar value to a vector V1 without reading the vector from a storage device. A scalar value to be written into the vector at a specified position and a scalar value (index) representing such position are read from a storage device into an Arithmetic Logic Unit (ALU) of a vector processor. The ALU then generates another vector V2 having multiple copies of the scalar value to be written into V1. ALU also generates a mask representing the index. The vector V2 is then delivered to the storage storing V1, but the mask is applied so that only one or more, but not all, copies of the scalar value are written from V2 to the storage. The rest of the vector V1 remains unchanged. The invention reduces register file read contention. Furthermore, if the updated V1 (i.e. V1 having the scalar value) is to be used in the next instruction, a copy of V1 is read from the storage and is updated from V2 and the mask, simultaneously with V1 being updated in the storage.
    Type: Grant
    Filed: October 18, 1996
    Date of Patent: December 21, 1999
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Heonchul Park
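
The masked scalar write can be sketched as follows. The function name is hypothetical, and Python lists stand in for the vector storage; the point is that V1 is never read, only selectively overwritten.

```python
def write_scalar(v1, scalar, index):
    """Write scalar into v1[index] using a broadcast vector and a mask."""
    if not 0 <= index < len(v1):
        raise ValueError("index must address an element of v1")
    v2 = [scalar] * len(v1)                      # copies of the scalar
    mask = [i == index for i in range(len(v1))]  # write-enable per element
    for i, enabled in enumerate(mask):
        if enabled:
            v1[i] = v2[i]                        # masked-off elements stay untouched
    return v1


vector = [1, 2, 3, 4]
write_scalar(vector, scalar=9, index=2)
```

Only the enabled position changes, so the remaining elements of V1 keep their previous values without a register file read.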