Arithmetic Operation Instruction Processing Patents (Class 712/221)
  • Publication number: 20080148012
    Abstract: A mathematical operation processing apparatus is disclosed by which the supply of an operand which is performed based on condition codes by a plurality of mathematical operations can be performed at a high speed. The mathematical operation processing apparatus includes a plurality of computing elements configured to perform mathematical operations different from one another and produce mathematical operation results of the mathematical operations and condition codes. A condition code set register retains the condition codes produced simultaneously by the computing elements as a condition code set. A condition code conversion section performs a predetermined conversion for the condition code set and outputs a result of the conversion as a conversion condition code set. An operand supplying section supplies an operand for the mathematical operations in the computing elements based on the conversion condition code set.
    Type: Application
    Filed: December 4, 2007
    Publication date: June 19, 2008
    Applicant: Sony Corporation
    Inventors: Yasuhiro Iizuka, Takahiro Sato, Takayasu Kon, Kenichi Sanpei, Eiichiro Morinaga
  • Publication number: 20080148024
    Abstract: The present disclosure provides a method for instruction processing. The method may include adding a first operand from a first register, a second operand from a second register and a carry input bit to generate a sum and a carry out bit. The method may further include loading the sum into a third register and loading the carry out bit into a most significant bit position of the third register to generate a third operand. The method may also include performing a single bit shift on the third operand via a shifter unit to produce a shifted operand and loading the shifted operand into the fourth register. The method may further include loading a least significant bit from the sum into the most significant bit position of the fourth register to generate a fourth operand. The method may additionally include generating a greatest common divisor (GCD) of the first and second operands via the fourth operand and generating a public key based on, at least in part, the GCD.
    Type: Application
    Filed: December 14, 2006
    Publication date: June 19, 2008
    Applicant: INTEL CORPORATION
    Inventors: Gilbert M. Wolrich, William Hasenplaugh, Wajdi Feghali, Daniel Cutter, Vinodh Gopal, Gunnar Gaubatz
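The add-then-shift data path in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware: the name `add_then_halve`, the default 8-bit width, and the reading of the single-bit shift as a right shift are all assumptions made for illustration. It shows why placing the carry-out bit in the most significant position keeps the halved sum exact even when the add overflows the register width.

```python
def add_then_halve(a, b, carry_in=0, width=8):
    """Model a width-bit add followed by a one-bit right shift.

    Hypothetical helper: the carry-out of the add is placed in the most
    significant bit of the shifted value, so the result equals
    (a + b + carry_in) // 2 even when the add overflows the register.
    """
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask) + (carry_in & 1)
    sum_bits = total & mask       # the part of the sum that fits in the register
    carry_out = total >> width    # the bit that overflowed (0 or 1)
    shifted = sum_bits >> 1       # single-bit right shift of the sum
    return (carry_out << (width - 1)) | shifted
```

Halving steps of this kind are the basic move in binary GCD algorithms, which is presumably why the abstract ties the shifted operand to a GCD computation.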
  • Patent number: 7389406
    Abstract: A partial execution unit of a splittable execution unit performs an operation on a portion of one or more arguments of a micro-operation to generate a first partial execution result of the micro-operation. A complementary portion of one of the arguments is passed through a bypass execution unit instead of through the splittable execution unit to generate a second partial execution result of the micro-operation. The first partial execution result and second partial execution result are concatenated into a full execution result.
    Type: Grant
    Filed: September 28, 2004
    Date of Patent: June 17, 2008
    Assignee: Intel Corporation
    Inventors: Zeev Sperber, Guillermo Savransky, Sagi Lahav
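One way to picture the split described above is an operation that touches only the low bits of a wide register while the remaining bits pass through untouched. The sketch below assumes a 16-bit partial unit inside a 64-bit register; the function name `split_execute` and both widths are hypothetical and are not taken from the patent.

```python
def split_execute(dest, src, op, part_bits=16, full_bits=64):
    """Apply op to the low part_bits of dest and src, bypass the rest of dest."""
    part_mask = (1 << part_bits) - 1
    full_mask = (1 << full_bits) - 1
    # First partial result: the operation on the low portion only.
    partial = op(dest & part_mask, src & part_mask) & part_mask
    # Second partial result: the complementary portion, passed through unchanged.
    bypassed = (dest & full_mask) >> part_bits
    # Concatenate the two partial results into the full result.
    return (bypassed << part_bits) | partial
```

For example, `split_execute(0x1234FFFF, 0x1, lambda x, y: x + y)` wraps the low half to zero and leaves the upper bits at `0x1234`.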
  • Publication number: 20080140994
    Abstract: A method of operation within an integrated-circuit processing device having a plurality of execution lanes. Upon receiving an instruction to exchange data between the execution lanes, respective requests from the execution lanes are examined to determine a set of the execution lanes that may send data to one or more others of the execution lanes during a first interval. Each execution lane within the set of the execution lanes is signaled to indicate that the execution lane may send data to the one or more others of the execution lanes.
    Type: Application
    Filed: October 9, 2007
    Publication date: June 12, 2008
    Inventors: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin, Raghunath Rao, DeForest Tovey, Mark Rygh, Jung-Ho Ahn
  • Publication number: 20080141006
    Abstract: A system and method for implementing arithmetic logic unit (ALU) support for value-based control dependence sequences. According to a first embodiment of the present invention, an ALU generates a carry-out signal designating one of a first and second value as a larger value. In response to the carry-out signal, the ALU updates a storage location with a third value, which is the larger value. According to a second embodiment of the present invention, an ALU generates a carry-out signal designating one of a first and second value as a larger value. In response to the carry-out signal, the ALU updates a storage location with a third value. The third value is a fourth value, if the carry-out signal designates the first value as the larger value or the third value is a fifth value, if the carry-out signal designates the second value as the larger value.
    Type: Application
    Filed: December 11, 2006
    Publication date: June 12, 2008
    Inventors: Lei Chen, Hung C. Ngo, Kevin J. Nowka
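Both embodiments above hinge on a carry-out that says which of two values is larger. A minimal sketch follows, assuming unsigned operands and a carry-out taken from the two's-complement subtraction `a + ~b + 1`; the function names and the 32-bit default width are illustrative assumptions, not the patent's design.

```python
def subtract_carry_out(a, b, width=32):
    """Carry-out of a + (~b) + 1; it is 1 exactly when a >= b (unsigned)."""
    mask = (1 << width) - 1
    total = (a & mask) + (~b & mask) + 1
    return total >> width

def select_larger(a, b, width=32):
    """First embodiment: the stored value is the larger of the two inputs."""
    return a if subtract_carry_out(a, b, width) else b

def select_by_larger(a, b, if_a_larger, if_b_larger, width=32):
    """Second embodiment: the carry-out picks between two other values."""
    return if_a_larger if subtract_carry_out(a, b, width) else if_b_larger
```

In this sketch a tie counts as the first value being the larger one, because the carry-out is 1 when the operands are equal.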
  • Publication number: 20080126758
    Abstract: A digital signal processing apparatus and method for MAC operation are disclosed. The DSP apparatus includes: a first memory for storing a plurality of first operands; a second memory for storing a plurality of second operands; a MAC processor including a plurality of parallel MAC blocks disposed in parallel for performing a parallel MAC operation on a first operand outputted from the first memory in parallel and a second operand outputted from the second memory in parallel using the parallel MAC blocks, wherein the first memory and the second memory include dual port memories for outputting the plurality of the first operands and the second operands to the plurality of parallel MAC blocks in parallel.
    Type: Application
    Filed: December 22, 2006
    Publication date: May 29, 2008
    Inventors: Young-Su Kwon, Bon-Tae Koo, Nak-Woong Eum
  • Publication number: 20080114967
    Abstract: There is provided a semiconductor integrated circuit device which consumes less power and enables real-time processing. The semiconductor integrated circuit device comprises: thermal sensors which can detect temperature, determine whether the detection result exceeds each of the above reference values and output the result; and a control block capable of controlling the operations of arithmetic blocks based on the output signals of the thermal sensors, wherein the control block returns to an operation state from a suspended state with an interrupt signal based on the output signals of the thermal sensors and determines the operation conditions of the arithmetic blocks to ensure that the temperature conditions of the arithmetic blocks are satisfied. Thereby, power consumption is reduced and real-time processing efficiency is improved.
    Type: Application
    Filed: November 6, 2007
    Publication date: May 15, 2008
    Inventors: Makoto Saen, Kenichi Osada, Tetsuya Yamada, Yusuke Kanno, Satoshi Misaka
  • Patent number: 7366882
    Abstract: A processor is provided with an address calculation unit so as to generate addresses for elements of object oriented data structures in one processor clock cycle.
    Type: Grant
    Filed: May 10, 2002
    Date of Patent: April 29, 2008
    Inventors: Zohair Sahraoui, Gary Ciambella
  • Patent number: 7363471
    Abstract: A method may translate a set of source instructions into a set of target instructions, execute the set of target instructions, and unmask a denormal input control bit if the set of source instructions uses a denormal input handling mechanism. A method may detect at least one denormal exception of a faulty target instruction by executing the set of target instructions; assign a predetermined value to one or more denormal operands of the faulty target instruction; and execute the faulty target instruction with the predetermined value for the one or more denormal operands. An apparatus, system, and machine-readable medium may perform such methods.
    Type: Grant
    Filed: June 27, 2005
    Date of Patent: April 22, 2008
    Assignee: Intel Corporation
    Inventors: Sion Berkowits, Orna Etzion, Li Jianhui
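The second method in this abstract, replacing denormal operands with a predetermined value and re-executing, can be sketched in software. The example below assumes IEEE 754 double precision and a replacement value of zero; the helper names are hypothetical and nothing here reflects the translator's actual implementation.

```python
import math
import sys

def is_denormal(x):
    """True for nonzero finite doubles smaller in magnitude than the smallest normal."""
    return x != 0.0 and math.isfinite(x) and abs(x) < sys.float_info.min

def execute_with_denormal_fixup(op, operands, replacement=0.0):
    """Substitute a predetermined value for each denormal operand, then run op."""
    fixed = [replacement if is_denormal(x) else x for x in operands]
    return op(*fixed)
```

Using zero as the replacement mirrors a "denormals are zero" policy; any other predetermined value can be passed through `replacement`.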
  • Patent number: 7353516
    Abstract: The present invention concerns data flow control in adaptive integrated circuitry which utilizes a data flow model for data processing. The present invention controls task initiation and execution based upon data consumption measured in data buffer units. In the various embodiments, when a first task of a plurality of tasks is initiated, a buffer parameter is determined and a buffer count is initialized for the first task. For each iteration of the first task using a data buffer unit of input data, the buffer count is correspondingly adjusted, such as incremented or decremented. When the buffer count meets the buffer parameter requirements, the state of the first task is changed, which may include stopping the first task, and a next action is determined, such as initiating a second task. The various apparatus embodiments include a hardware task manager, a node sequencer, a programmable node, and use of a monitoring task within an adaptive execution unit.
    Type: Grant
    Filed: August 14, 2003
    Date of Patent: April 1, 2008
    Assignee: NVIDIA Corporation
    Inventors: Ghobad Heidari-Bateni, Sharad D. Sambhwani
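The buffer-count scheme above can be sketched as a small scheduler loop. This is a simplified software model under assumptions: tasks run one after another, the count is incremented per consumed data buffer unit, and a task stops when the count reaches its buffer parameter. The function name `run_tasks` and the tuple layout are hypothetical.

```python
def run_tasks(tasks, data_buffer_units):
    """Run (name, buffer_parameter) tasks in order over a stream of buffer units.

    Returns a log of (task name, consumed unit) pairs.
    """
    log = []
    units = iter(data_buffer_units)
    for name, buffer_parameter in tasks:
        buffer_count = 0                       # initialized when the task starts
        while buffer_count < buffer_parameter:
            unit = next(units, None)
            if unit is None:                   # input exhausted before the task finished
                return log
            log.append((name, unit))           # one iteration of the task on one unit
            buffer_count += 1                  # adjust the count per unit consumed
        # The count met the parameter: this task stops and the next one starts.
    return log
```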
  • Publication number: 20080077779
    Abstract: In one embodiment, the present invention includes a method for receiving a rounding instruction and an immediate value in a processor, determining if a rounding mode override indicator of the immediate value is active, and if so executing a rounding operation on a source operand in a floating point unit of the processor responsive to the rounding instruction and according to a rounding mode set forth in the immediate operand. Other embodiments are described and claimed.
    Type: Application
    Filed: September 22, 2006
    Publication date: March 27, 2008
    Inventors: Ronen Zohar, Shane Story
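A software analogy may help with the override mechanism described above. The immediate layout used here is purely hypothetical (bit 2 as the override indicator, bits 1:0 as the mode) and is not taken from the application; the sketch also handles only finite inputs.

```python
import math

# Hypothetical mode numbering, chosen for illustration only.
ROUNDERS = {
    0: round,        # nearest, ties to even (Python's built-in behavior)
    1: math.floor,   # toward negative infinity
    2: math.ceil,    # toward positive infinity
    3: math.trunc,   # toward zero
}
OVERRIDE = 0b100     # hypothetical override-indicator bit in the immediate

def round_instruction(value, imm, default_mode=0):
    """Round value using the immediate's mode if the override bit is set,
    otherwise using the processor's default mode."""
    mode = (imm & 0b11) if (imm & OVERRIDE) else default_mode
    return float(ROUNDERS[mode](value))
```

With the override bit clear, the low bits of the immediate are ignored and `default_mode` stands in for whatever mode the floating point unit is already configured to use.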
  • Publication number: 20080072025
    Abstract: A novel and useful apparatus for and method of software based phase locked loop (PLL). The software based PLL incorporates a reconfigurable calculation unit (RCU) that is optimized and programmed to sequentially perform all the atomic operations of a PLL or any other desired task in a time sharing manner. An application specific instruction-set processor (ASIP) incorporating the RCU includes an instruction set whose instructions are optimized to perform the atomic operations of a PLL. The RCU is clocked at a fast enough processor clock rate to ensure that all PLL atomic operations are performed within a single PLL reference clock cycle.
    Type: Application
    Filed: September 11, 2007
    Publication date: March 20, 2008
    Inventors: Roman Staszewski, Robert B. Staszewski, Fuqiang Shi
  • Patent number: 7346761
    Abstract: An arithmetic and logic device as an integral part of a processing unit is provided to achieve code size and overhead reduction. The arithmetic and logic device contains several auxiliary computing units, each of which is capable of simple arithmetic and logical operation, under the control of a control unit. By configuring the auxiliary computing units along the data path, additional processing to the operands could be carried out within the same instruction cycle. As such, a processing unit incorporating such an arithmetic and logic device is able to achieve significant performance improvement both in terms of code size and memory access overhead.
    Type: Grant
    Filed: October 8, 2005
    Date of Patent: March 18, 2008
    Assignee: National Chung Cheng University
    Inventors: Tien-Fu Chen, Chih-Heng Kang, Chen-Neng Win
  • Patent number: 7343472
    Abstract: A processor includes an instruction memory, arithmetic logic unit, finite field arithmetic unit, at least one digital storage device, and an instruction decoder. The instruction memory temporarily stores an instruction that includes at least one of: an operational code, destination information, and source information. The instruction decoder is operably coupled to interpret the instruction to identify the arithmetic logic unit and/or the finite field arithmetic unit to perform the operational code of the corresponding instruction. The instruction decoder then identifies at least one destination location within the digital storage device based on the destination information contained within the corresponding instruction. The instruction decoder then identifies at least one source location within the digital storage device based on the source information of the corresponding instruction.
    Type: Grant
    Filed: June 11, 2003
    Date of Patent: March 11, 2008
    Assignee: Broadcom Corporation
    Inventors: Joshua Porten, Won Kim, Scott D. Johnson, John R. Nickolls
  • Patent number: 7308560
    Abstract: A digital signal processing unit includes a control unit and a data computing unit. An R/L register for distinguishing independent data is disposed in the control unit. An R/L select signal for indicating independent data is supplied to the data computing unit. A data processing instruction signal for distinguishing a data processing instruction from other instructions is issued from an instruction decoder. The R/L register for distinguishing independent data is controlled by the data processing instruction signal. In the data computing unit, the portion related to storing independent data is multiplexed according to the number of independent data to be processed, and this multiplexed portion is controlled by the R/L select signal supplied from the control unit.
    Type: Grant
    Filed: February 3, 2005
    Date of Patent: December 11, 2007
    Assignee: Oki Electric Industry Co., Ltd.
    Inventors: Danya Sugai, Teruaki Uehara
  • Patent number: 7290121
    Abstract: A data processor (200) has a pipelined execution unit (120). Whether a first instruction is one of a class of instructions wherein as a result of execution of the first instruction the contents of an operand register will be stored in a destination register is determined. A second instruction that references the destination register is received before a completion of execution of the first instruction. The second instruction is executed using the contents of the operand register without stalling the second instruction in the pipelined execution unit (120).
    Type: Grant
    Filed: June 12, 2003
    Date of Patent: October 30, 2007
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Stephen Charles Kromer
  • Patent number: 7284117
    Abstract: A processor includes a prediction circuit and a floating point unit. The prediction circuit is configured to predict an execution latency of a floating point operation. The floating point unit is coupled to receive the floating point operation for execution, and is configured to detect a misprediction of the execution latency. In some embodiments, an exception may be taken in response to the misprediction. In other embodiments, the floating point operation may be rescheduled with the corrected execution latency.
    Type: Grant
    Filed: November 4, 2003
    Date of Patent: October 16, 2007
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Arun Radhakrishnan, Kelvin D. Goveas
  • Patent number: 7269718
    Abstract: A method, apparatus, and computer instructions in a processor for performing arithmetic operations. A data type associated with a particular memory location is used to determine if an operation about to be performed on the data in that location is legal. If the operation requires the data to have a required data type, a determination is made as to whether the operation is a legal operation based on the identified data type and the required data type. If the operation is not legal on the identified type, a determination is made as to whether data can be cast to change the identified data type to the required data type. The data is cast to the required data type if the data can be cast to form modified data, and the arithmetic operation is performed on the modified data. If the data cannot be cast to the required type, an exception or interrupt is generated.
    Type: Grant
    Filed: April 29, 2004
    Date of Patent: September 11, 2007
    Assignee: International Business Machines Corporation
    Inventors: William Preston Alexander, III, Robert Tod Dimpsey, Frank Eliot Levine, Robert John Urquhart
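The check, cast, then operate-or-trap flow in this abstract maps naturally onto a few lines of code. The sketch below models memory as a dictionary of (type tag, data) pairs; the cast table, the exception class, and the function name are all hypothetical illustrations rather than the patented mechanism.

```python
class IllegalOperation(Exception):
    """Stands in for the exception or interrupt raised when no cast exists."""

# Hypothetical table of permitted casts: (identified type, required type) -> converter.
CASTS = {
    ("int", "float"): float,
    ("float", "int"): int,
}

def typed_add(memory, address, other, required_type):
    """Add other to the value at address, casting it first if needed and allowed."""
    identified_type, data = memory[address]
    if identified_type != required_type:
        cast = CASTS.get((identified_type, required_type))
        if cast is None:
            raise IllegalOperation(
                f"cannot cast {identified_type} to {required_type}")
        data = cast(data)          # modified data in the required type
    return data + other
```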
  • Publication number: 20070186082
    Abstract: Included are embodiments of a stream processor configured to process data in any of a plurality of different formats. At least one embodiment of the stream processor includes a first scalar arithmetic logic unit (ALU), configured to process a plurality of sets of short data in response to a received short format control signal from an instruction set and process a set of long data in response to a received long format control signal from the instruction set. Embodiments of the processor also include a second arithmetic logic unit (ALU), configured to receive the processed data from the first arithmetic logic unit (ALU) and process the input data and the processed data according to a control signal from the instruction set. Still other embodiments include a special function unit (SFU) configured to provide additional computational functionality to the first ALU and the second ALU.
    Type: Application
    Filed: February 6, 2007
    Publication date: August 9, 2007
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
  • Patent number: 7249243
    Abstract: Techniques for control word prediction and speculative execution. In one embodiment, an apparatus includes a control word predictor, execution resources, and a comparison module. The control word predictor of this embodiment predicts a predicted control word for execution of operations in response to a control word changing operation. The execution resources of this embodiment speculatively execute the plurality of operations utilizing the predicted control word, and the comparison module determines if the predicted control word matches an actual control word set by the control word changing operation or a plurality of other control words, and causes re-execution of said plurality of operations if said actual control word matches any of the plurality of other control words.
    Type: Grant
    Filed: August 6, 2003
    Date of Patent: July 24, 2007
    Assignee: Intel Corporation
    Inventors: Mohammad A. Abadallah, Mitchell Diamond, David B. Jackson, Kip A. Baumann, Ki W. Yoon, Rafi M. Saied, Robert L. Farrell
  • Patent number: 7240184
    Abstract: A multipurpose functional unit is configurable to support a number of operations including multiply-add and comparison testing operations, as well as other integer and/or floating-point arithmetic operations, Boolean operations, and format conversion operations.
    Type: Grant
    Filed: November 10, 2004
    Date of Patent: July 3, 2007
    Assignee: NVIDIA Corporation
    Inventors: Ming Y. Siu, Stuart F. Oberman
  • Patent number: 7237089
    Abstract: An operation method has processing for applying a same type of operation in parallel to N M-bit operands to obtain N M-bit operation results executed on a computer. Here, N is an integer equal to or greater than 2 and M is an integer equal to or greater than 1. The operation method includes: an operation step of applying the type of operation to an N*M-bit provisional operand that is formed by concatenating the N M-bit operands, to obtain one N*M-bit provisional operation result, and generating correction information based on an effect had, by applying the operation, on each M bits of the provisional operation result from a bit that neighbors the M bits; and a correction step of correcting the provisional operation result in M-bit units with use of the correction information, to obtain the N M-bit operation results.
    Type: Grant
    Filed: November 26, 2002
    Date of Patent: June 26, 2007
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Masato Suzuki
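The provisional-result-plus-correction idea can be demonstrated for addition. In the sketch below, the N M-bit operands are concatenated, added as one wide number, and each M-bit field is then corrected by removing the carry that leaked in from its lower neighbor. The function names are hypothetical, and addition is only one instance of the "same type of operation" the abstract covers.

```python
def pack(values, m):
    """Concatenate M-bit values into one integer, element 0 in the low bits."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & ((1 << m) - 1)) << (i * m)
    return word

def packed_add(a_values, b_values, m):
    """Add N M-bit lanes with one wide add followed by a per-lane correction."""
    n = len(a_values)
    lane_mask = (1 << m) - 1
    a, b = pack(a_values, m), pack(b_values, m)
    provisional = (a + b) & ((1 << (n * m)) - 1)   # one N*M-bit operation
    # Correction information: the carry into bit k of a sum is a_k ^ b_k ^ sum_k,
    # so the lowest bit of each lane shows whether a neighbor's carry leaked in.
    carries_in = a ^ b ^ provisional
    results = []
    for i in range(n):
        lane = (provisional >> (i * m)) & lane_mask
        carry_in = (carries_in >> (i * m)) & 1
        results.append((lane - carry_in) & lane_mask)  # correct in M-bit units
    return results
```

Each lane wraps modulo 2^M independently, which is the usual behavior for non-saturating packed addition.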
  • Patent number: 7237055
    Abstract: A system, apparatus and a method for routing data over fewer switches and interconnections among reconfigurable logic elements, and for adapting routing resources to dynamically perform complex bit-level permutations, such as shifting and bit reversal operations. In one embodiment, an exemplary silo routing circuit is formed upon a semiconductor substrate and routes data among a number of reconfigurable computational elements. The silo routing circuit comprises a plurality of input terminals and a plurality of output terminals. Further, the silo routing circuit includes a multi-stage interconnection network (“MIN”) of switches configurable to form data paths from any input terminal to any output terminal.
    Type: Grant
    Filed: August 31, 2004
    Date of Patent: June 26, 2007
    Assignee: Stretch, Inc.
    Inventor: Charle' R. Rupp
  • Patent number: 7231510
    Abstract: A mechanism for, and method of, processing multiply-accumulate instructions with out-of-order completion in a pipeline, for use in a processor having an at least four-wide instruction issue architecture, and a digital signal processor (DSP) incorporating the mechanism or the method. In one embodiment, the mechanism includes: (1) a multiply-accumulate unit (MAC) having an initial multiply stage and a subsequent accumulate stage and (2) out-of-order completion logic, associated with the MAC, that causes interim results produced by the multiply stage to be stored when the accumulate stage is unavailable and allows younger instructions to complete before the multiply-accumulate instructions.
    Type: Grant
    Filed: November 13, 2001
    Date of Patent: June 12, 2007
    Assignee: VeriSilicon Holdings (Cayman Islands) Co. Ltd.
    Inventors: Hung T. Nguyen, Shannon A. Wichman
  • Patent number: 7216217
    Abstract: A programmable processor that comprises a general purpose processor architecture, capable of operation independent of another host processor, having a virtual memory addressing unit, an instruction path and a data path; an external interface; a cache operable to retain data communicated between the external interface and the data path; at least one register file configurable to receive and store data from the data path and to communicate the stored data to the data path; and a multi-precision execution unit coupled to the data path. The multi-precision execution unit is configurable to dynamically partition data received from the data path to account for an elemental width of the data and is capable of performing group floating-point operations on multiple operands in partitioned fields of operand registers and returning catenated results. In other embodiments the multi-precision execution unit is additionally configurable to execute group integer and/or group data handling operations.
    Type: Grant
    Filed: August 25, 2003
    Date of Patent: May 8, 2007
    Assignee: Microunity Systems Engineering, Inc.
    Inventors: Craig Hansen, John Moussouris
  • Patent number: 7216138
    Abstract: A method and apparatus are described for converting a number from a floating point format to an integer format or from an integer format to a floating point format responsive to a control signal of a control signal format. Numbers are stored in the floating point format in a register of a first set of architectural registers in a packed format. One or more numbers in the floating point format are converted to the integer format and placed in a register of a second set of architectural registers in a packed format. Conversion from integer format to floating point format is performed in a similar manner. A floating point arithmetic apparatus is described that provides for converting a plurality of numbers between integer formats and floating point formats, further providing for conversion operations that require a greater data path width than floating-point arithmetic operations.
    Type: Grant
    Filed: February 14, 2001
    Date of Patent: May 8, 2007
    Assignee: Intel Corporation
    Inventors: Mohammad Abdallah, Prasad Modali, Chien-Yu Huang, legal representative, Thomas R. Huff, Vladimir Pentkovski, Patrice Roussel, Shreekant S. Thakkar, Hsien-Cheng E. Hsieh, deceased
  • Patent number: 7212959
    Abstract: A method and apparatus for accumulating arbitrary length strings of input values, such as floating point values, in a layered tree structure such that the order of adds at each layer is maintained. The accumulating utilizes a shared adder, and includes means for directing initial inputs and intermediate result values.
    Type: Grant
    Filed: August 8, 2001
    Date of Patent: May 1, 2007
    Inventors: Stephen Clark Purcell, Scott Kimura, Mark L. Wood Patrick
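A layered, order-preserving accumulation can be sketched as a pairwise reduction. The code below is an illustration only: it adds neighbors left to right within each layer and carries an unpaired trailing value up to the next layer, which is one plausible reading of "the order of adds at each layer is maintained". The function name is hypothetical, and the shared-adder scheduling from the abstract is not modeled.

```python
def tree_accumulate(values):
    """Sum values layer by layer, pairing neighbors in their original order."""
    layer = list(values)
    if not layer:
        return 0.0
    while len(layer) > 1:
        # Intermediate results for the next layer, in the same left-to-right order.
        next_layer = [layer[i] + layer[i + 1] for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2:
            next_layer.append(layer[-1])   # odd element passes through unchanged
        layer = next_layer
    return layer[0]
```

Fixing the pairing order makes the floating point result reproducible for a given input length, since floating point addition is not associative.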
  • Patent number: 7206927
    Abstract: A method of executing an instruction stream in a pipelined execution unit of depth, p, comprises loading the instruction stream; detecting an iteration of an instruction in the loaded instruction stream; interleaving p streams of instances of the instruction in the pipeline; detecting an end of the iteration; and combining results obtained from the p streams after all programmed iterations have completed. A computational circuit comprises a register which can hold a value representing both an operand and result of an iterative operation; a multiplexer having a first input connected to receive the operand from the register, a second input connected to a source of an identity value for the iterative operation, and an output; and an operator circuit having an input connected to receive a value from the multiplexer output, and an output connected to return the result to the register.
    Type: Grant
    Filed: November 19, 2002
    Date of Patent: April 17, 2007
    Assignee: Analog Devices, Inc.
    Inventor: Abhijit Giri
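The interleaving described above has a familiar software counterpart: keep p independent partial results, each started from the operation's identity value, and combine them at the end. The sketch below assumes the operation is associative and commutative (for example integer addition or multiplication); the function name and argument order are hypothetical.

```python
def interleaved_reduce(values, op, identity, depth):
    """Reduce values using depth interleaved partial streams, then combine them."""
    partial = [identity] * depth           # each stream starts at the identity value
    for i, value in enumerate(values):
        stream = i % depth                 # successive iterations rotate through streams
        partial[stream] = op(partial[stream], value)
    result = identity
    for p in partial:                      # combine after all iterations complete
        result = op(result, p)
    return result
```

Because consecutive iterations land in different streams, no iteration has to wait for the result of the one immediately before it, which is the point of matching the stream count to the pipeline depth.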
  • Patent number: 7191316
    Abstract: A system for handling a plurality of single precision floating point instructions and a plurality of double precision floating point instructions that both index a same set of registers is provided. The system comprises a decode unit arranged to decode, stall, and forward at least one of the plurality of single precision and at least one of the plurality of double precision floating point instructions in a fetch group. The decode unit includes a first counter arranged to increment for each of the plurality of single precision floating point instructions forwarded down a pipeline; a second counter arranged to increment for each of the plurality of double precision floating point instructions forwarded down the pipeline; a first mask register and a second mask register. The first mask register is updated by each of the single precision floating point instructions forwarded and the second mask register is updated by each of the double precision floating point instructions forwarded.
    Type: Grant
    Filed: January 29, 2003
    Date of Patent: March 13, 2007
    Assignee: Sun Microsystems, Inc.
    Inventors: Rabin A. Sugumar, Sorin Iacobovici, Robert Nuckolls, Chandra M. R. Thimmannagari
  • Patent number: 7159100
    Abstract: The present invention provides extended precision in SIMD arithmetic operations in a processor having a register file and an accumulator. A first set of data elements and a second set of data elements are loaded into a first vector register and a second vector register, respectively. Each data element comprises N bits. Next, an arithmetic instruction is fetched from memory. The arithmetic instruction is decoded. Then, a first vector register and a second vector register are read from the register file. The present invention then executes the arithmetic instruction on corresponding data elements in the first and second vector registers. The result of the execution is then written into the accumulator. Then, each element in the accumulator is transformed into an N-bit width element and stored into the memory.
    Type: Grant
    Filed: December 30, 1998
    Date of Patent: January 2, 2007
    Assignee: MIPS Technologies, Inc.
    Inventors: Timothy van Hook, Peter Hsu, William A. Huffman, Henry P. Moreton, Earl A. Killian
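The value of a wide accumulator shows up when intermediate results exceed the element width but the final result does not. The sketch below assumes signed 16-bit elements, a multiply-accumulate as the arithmetic instruction, and saturation as the transform back to N bits; all three are illustrative assumptions, since the abstract does not fix them.

```python
def vector_mac(accumulator, a, b):
    """Element-wise multiply-accumulate into an accumulator wider than the inputs."""
    return [acc + x * y for acc, x, y in zip(accumulator, a, b)]

def narrow(accumulator, n_bits=16):
    """Transform each accumulator element to a signed N-bit value by saturating."""
    low, high = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    return [max(low, min(high, acc)) for acc in accumulator]
```

For example, accumulating 30000 * 2 and then -30000 * 2 passes through 60000, which does not fit in 16 bits, yet the final narrowed element is exactly 0 because no precision was lost along the way.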
  • Patent number: 7149882
    Abstract: A processor with instructions to operate on different data types stored in a single logical register file. According to one embodiment of the invention, a processor includes a number of physical registers, a memory unit, and a decode/execution unit. The memory unit is to make the number of physical registers appear to software as a single software-visible register file. The decode/execution unit is to execute on the contents of the single software-visible register file instructions of a first instruction type and of a second instruction type, wherein the single software-visible register file is to be operated as a flat register file during execution of instructions of the second instruction type and as a stack referenced register file during execution of instructions of the first instruction type.
    Type: Grant
    Filed: May 11, 2004
    Date of Patent: December 12, 2006
    Assignee: Intel Corporation
    Inventors: Andrew F. Glew, Larry M. Mennemeier, Alexander D. Peleg, David Bistry, Millind Mittal, Carole Dulong, Eiichi Kowashi, Benny Eitan, Derrik Lin, Romamohan R. Vakkalagadda
  • Patent number: 7149877
    Abstract: A disclosed byte execution unit receives byte instruction information and two operands, and performs an operation specified by the byte instruction information upon one or both of the operands, thereby producing a result. The byte instruction specifies either a count ones in bytes operation, an average bytes operation, an absolute differences of bytes operation, or a sum bytes into halfwords operation. In one embodiment, the byte execution unit includes multiple byte units. Each byte unit includes multiple population counters, two compressor units, adder input multiplexer logic, adder logic, and result multiplexer logic. A data processing system is described including a processor coupled to a memory system. The processor includes the byte execution unit. The memory system includes a byte instruction, wherein the byte instruction specifies either the count ones in bytes operation, the average bytes operation, the absolute differences of bytes operation, or the sum bytes into halfwords operation.
    Type: Grant
    Filed: July 17, 2003
    Date of Patent: December 12, 2006
    Assignee: International Business Machines Corporation
    Inventors: Sang Hoo Dhong, Hwa-Joon Oh, Brad William Michael, Silvia Melitta Mueller, Kevin D. Tran
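The four byte operations named in this abstract can be modeled on 32-bit words. The semantics below are assumptions for illustration: bytes are numbered from the least significant end, the average rounds up, and each halfword sum adds two adjacent bytes of a single operand. The real instruction definitions may differ on all three points.

```python
def split_bytes(word, count=4):
    """Return the bytes of word, least significant first."""
    return [(word >> (8 * i)) & 0xFF for i in range(count)]

def pack_fields(fields, bits):
    """Pack equal-width fields into one integer, field 0 in the low bits."""
    word = 0
    for i, field in enumerate(fields):
        word |= (field & ((1 << bits) - 1)) << (bits * i)
    return word

def count_ones_in_bytes(a):
    return pack_fields([bin(x).count("1") for x in split_bytes(a)], 8)

def average_bytes(a, b):
    # Rounded-up average of corresponding bytes (rounding direction is assumed).
    return pack_fields([(x + y + 1) >> 1
                        for x, y in zip(split_bytes(a), split_bytes(b))], 8)

def absolute_differences_of_bytes(a, b):
    return pack_fields([abs(x - y)
                        for x, y in zip(split_bytes(a), split_bytes(b))], 8)

def sum_bytes_into_halfwords(a):
    b = split_bytes(a)
    return pack_fields([b[0] + b[1], b[2] + b[3]], 16)
```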
  • Patent number: 7146491
    Abstract: A data processing apparatus and method for generating constant values is provided. The data processing apparatus comprises a data processing unit operable in response to an instruction to perform a data processing operation on one or more data values. Shift logic is operable to selectively apply a shift operation to data to produce one of the data values for the data processing operation. Further, a plurality of registers are provided for storing data. The instruction has a register specifier field for identifying a register and a shift specifier field for specifying a shift to be applied to that register's data in order to produce one of the data values for the data processing operation.
    Type: Grant
    Filed: October 26, 2004
    Date of Patent: December 5, 2006
    Assignee: ARM Limited
    Inventors: Jonathan Sean Callan, David Hennah Mansell, Christopher Pedley, David James Seal
  • Patent number: 7139900
    Abstract: New instruction definitions for a packet add (PADD) operation and for a single instruction multiple add (SMAD) operation are disclosed. In addition, a new dedicated PADD logic device that performs the PADD operation in about one to two processor clock cycles is disclosed. Also, a new dedicated SMAD logic device that performs a single instruction multiple data add (SMAD) operation in about one to two clock cycles is disclosed.
    Type: Grant
    Filed: June 23, 2003
    Date of Patent: November 21, 2006
    Assignee: Intel Corporation
    Inventors: Corey Gee, Bapiraju Vinnakota, Saleem Mohammadali, Carl A. Alberola
  • Patent number: 7139901
    Abstract: A software program extension for a dynamic multi-streaming processor is disclosed. The extension comprising an instruction set enabling coordinated interaction between a packet management component and a core processing component of the processor. The software program comprises, a portion thereof for managing packet uploads and downloads into and out of memory, a portion thereof for managing specific memory allocations and de-allocations associated with enqueueing and dequeuing data packets, a portion thereof for managing the use of multiple contexts dedicated to the processing of a single data packet; and a portion thereof for managing selection and utilization of arithmetic and other context memory functions associated with data packet processing. The extension complements standard data packet processing program architecture for specific use for processors having a packet management unit that functions independently from a streaming processor unit.
    Type: Grant
    Filed: September 7, 2001
    Date of Patent: November 21, 2006
    Assignee: MIPS Technologies, Inc.
    Inventors: Enrique Musoll, Mario Nemirovsky, Stephen Melvin
  • Patent number: 7111155
    Abstract: A computation core includes a computation block, an addressing block and an instruction sequencer, which are coupled to a memory through a memory interface. The computation block includes a register file and dual execution units. The execution units include features for enhanced performance in executing digital signal computations. The computation core is configured for executing digital signal processor instructions and microcontroller instructions, while achieving efficient digital signal processor computation and high code density. A finite impulse response filter algorithm achieves high performance on the dual execution units.
    Type: Grant
    Filed: May 12, 2000
    Date of Patent: September 19, 2006
    Assignee: Analog Devices, Inc.
    Inventors: William C. Anderson, John Edmondson, Jose Fridman, Marc Hoffman, Russell L. Rivin
  • Patent number: 7100025
    Abstract: An apparatus and method for performing single-instruction multiple-data instructions using a single multiply-accumulate unit while minimizing operational latency. The multiply-accumulate unit generates a first half and a second half of a data result. A register stores the first half of the data result. A miscellaneous-logic unit determines when to release the first half of the data result from the register to synchronize the first half and the second half of the data result.
    Type: Grant
    Filed: January 28, 2000
    Date of Patent: August 29, 2006
    Assignees: Hewlett-Packard Development Company, L.P., Intel Corporation
    Inventor: Thomas Justin Sullivan
  • Patent number: 7062633
    Abstract: It is decided whether a first source data from the memory 101 is a data which is to be subjected to arithmetic or not by a state flag detection means 150, the result of the decision is retained as a state flag, and it is decided by a condition decision means 109 whether or not the state flag satisfies a condition for performing the arithmetic. A control means 110 controls whether an ALU 100 should perform the arithmetic or not on the basis of the condition satisfaction/dissatisfaction information.
    Type: Grant
    Filed: December 15, 1999
    Date of Patent: June 13, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Mana Hamada, Shunichi Kuromaru, Tomonori Yonezawa, Tsuyoshi Nakamura
  • Patent number: 7062637
    Abstract: Executing digital signal processing (DSP) instructions in a digital signal processor integrated circuit comprising receiving a DSP instruction in a digital signal processor integrated circuit to process one or more complex number operands; fetching a first operand with a first data type, the first operand having real and imaginary values with a complex data type; fetching a second operand with a second data type; prior to executing a DSP operation, determining a permutation of the first operand, the second operand, or both the first operand and the second operand, and permuting instances of the first operand, the second operand, or both the first operand and the second operand to execute the DSP operation; and executing the DSP operation in the digital signal processor integrated circuit using the first operand and the second operand to obtain a result, the result having real and imaginary values with a complex data type.
    Type: Grant
    Filed: March 6, 2003
    Date of Patent: June 13, 2006
    Assignee: Intel Corporation
    Inventors: Kumar Ganapathy, Ruban Kanapathipillai
  • Patent number: 7051193
    Abstract: Instruction-level parallelism in software pipelined loops is exploited by predicting future register rotations. A processor includes an architected current frame marker register and at least one unarchitected frame marker register. Register rotation prediction is achieved by setting the register rotation of future iterations of a software loop to be a function of the unarchitected frame marker registers. True data dependencies remain, but the dependencies caused solely by register renaming are removed. Dynamic predication is used to predicate instructions from future iterations, allowing them to be squashed if dependencies are later found. The register renaming that results from the prediction can be included in instructions in a buffer, or a renaming stage in an execution pipeline can perform the renaming.
    Type: Grant
    Filed: March 28, 2001
    Date of Patent: May 23, 2006
    Assignee: Intel Corporation
    Inventors: Hong Wang, Christopher J. Hughes, Ralph Kling, Yong-Fong Lee, Daniel M. Lavery, John Shen, Jamison Collins
  • Patent number: 7047397
    Abstract: A method for executing an instruction with a semi-fast operation in a staggered ALU. The method of one embodiment comprises generating a first operation and a second operation from a micro-instruction. The first and second operations are scheduled for execution in a staggered arithmetic logic unit (ALU). The first and second operations are separated by N clock cycles. Data from the first operation is communicated to the second operation for use with execution of the second operation.
    Type: Grant
    Filed: September 13, 2002
    Date of Patent: May 16, 2006
    Assignee: Intel Corporation
    Inventor: Ross A. Segelken
  • Patent number: 7047396
    Abstract: A method and system for fixed-length memory-to-memory processing of fixed-length instructions. Further, the present invention is a method and system for implementing a memory operand width independent of the ALU width. The arithmetic and register data are 32 bits, but the memory operand is variable in size. The size of the memory operand is specified by the instruction. Instructions in accordance with the present invention allow for multiple memory operands in a single fixed-length instruction. The instruction set is small and simple, so the implementation is lower cost than traditional processors. More addressing modes are provided for, thus creating a more efficient code. Semaphores are implemented using a single bit. Shift-and-merge instructions are used to access data across word boundaries.
    Type: Grant
    Filed: June 22, 2001
    Date of Patent: May 16, 2006
    Assignee: Ubicom, Inc.
    Inventors: David A. Fotland, Roger D. Arnold, Tibet Mimaroglu
  • Patent number: 7028168
    Abstract: A system for performing matrix operations utilizes a processor, memory, and a matrix operation manager. The processor has a memory cache. The memory is external to the processor and stores first and second matrices. The matrix operation manager is configured to mathematically combine the first matrix with the second matrix utilizing a hoisted matrix algorithm for hoisting values of the first matrix, and the hoisted matrix algorithm has an outer loop and an inner loop that is performed to completion for each iteration of the outer loop. The matrix operation manager, for each iteration of the outer loop, is configured to load to the cache and to write to a contiguous portion of the memory, before performing the inner loop, values from the first matrix that are to be combined, via performance of the inner loop, with values from the second matrix.
    Type: Grant
    Filed: December 5, 2002
    Date of Patent: April 11, 2006
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Kevin R. Wadleigh
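    Illustrative sketch (not part of the patent record): one way to picture the hoisting step described in the abstract above is a matrix multiply in which each outer iteration first copies the needed values of the first matrix into a contiguous buffer, and only then runs the inner loop. The column-major storage of the first matrix, the choice of multiplication as the combining operation, and the name hoisted_matmul are assumptions for illustration.

```python
# Hypothetical model of a "hoisted" matrix multiply.
# Assumptions (not from the patent): the first matrix is stored column-major
# in a flat list, so one of its rows is strided in memory; the combining
# operation is ordinary matrix multiplication.

def hoisted_matmul(a_flat, b, n_rows, n_inner):
    """Multiply an n_rows x n_inner matrix (column-major flat list a_flat)
    by a matrix b given as a list of n_inner rows."""
    n_cols = len(b[0])
    result = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):  # outer loop
        # Hoist: gather the strided row i into a contiguous buffer
        # before the inner loop starts.
        row = [a_flat[k * n_rows + i] for k in range(n_inner)]
        for j in range(n_cols):  # inner loop, run to completion per outer step
            result[i][j] = sum(row[k] * b[k][j] for k in range(n_inner))
    return result
```

    The inner loop then reads only the contiguous buffer, which is the access pattern the abstract associates with loading the values to the cache ahead of time.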
  • Patent number: 6996701
    Abstract: A computer system that can be operated by a clock frequency higher than the clock frequency by which the critical path instruction is executed correctly. The pipeline is driven at a high clock frequency higher than the clock frequency by which the critical path instruction can be executed correctly. The computer system includes a high frequency ALU being operated by the pipeline clock frequency, and at least two low frequency ALUs being operated by the low clock frequency by which the critical path instruction is executed correctly. Each instruction of the execution stage is inputted to the low frequency ALUs alternately and each executes the critical path instruction in two machine cycles.
    Type: Grant
    Filed: September 25, 2001
    Date of Patent: February 7, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Akimitsu Shimamura
  • Patent number: 6996702
    Abstract: A processing system includes an arithmetic logic unit (ALU) sub-system that allows data associated with a prior instruction to be preserved for use with a next instruction or subsequent instruction without having to reload the value using an intermediate register. The ALU sub-system includes a pair of ALUs communicatively cross-coupled with a pair of accumulators. The processing system also includes a data selector coupled to the ALU sub-system for use with memory contention prediction. The data selector includes a constant generator that controls storage of data associated with a previous instruction in a bypass element, and a selector to choose between data from a databus element and data stored in the bypass element.
    Type: Grant
    Filed: July 30, 2002
    Date of Patent: February 7, 2006
    Assignee: WIS Technologies, Inc.
    Inventors: Shuhua Xiang, Li Sha, Ping Zhu, Hongjun Yuan, Wei Ni
  • Patent number: 6988184
    Abstract: Methods of performing dyadic digital signal processing (DSP) instructions. In one embodiment of the invention, the method includes fetching a dyadic DSP instruction having a main operation and a sub operation; predecoding the dyadic DSP instruction to generate predecoded instruction signals; and decoding the predecoded instruction signals to generate select signals to selectively couple data from a first plurality of buses coupled to inputs of multiplexers of a first plurality of DSP functional blocks to execute the main operation of the dyadic DSP instruction in one processor cycle and to selectively couple data from a second plurality of buses coupled to inputs of multiplexers of a second plurality of DSP functional blocks to execute the sub operation of the dyadic DSP instruction in the one processor cycle.
    Type: Grant
    Filed: August 2, 2002
    Date of Patent: January 17, 2006
    Assignee: Intel Corporation
    Inventors: Kumar Ganapathy, Ruban Kanapathipillai
  • Patent number: 6986023
    Abstract: A processor-based system may include a main processor and a coprocessor. The coprocessor handles instructions that include opcodes specifying a data processing operation to be performed by the coprocessor and a coprocessor identification field for identifying a target coprocessor for coprocessor instructions. Two bits indicate one of four data sizes including a byte (8 bits), a half word (16 bits), a word (32 bits), and a double word (64 bits). Two other bits indicate a saturation type.
    Type: Grant
    Filed: August 9, 2002
    Date of Patent: January 10, 2006
    Assignee: Intel Corporation
    Inventors: Nigel C. Paver, William T. Maghielse, Wing K. Yu, Jianwei Liu, Anthony Jebson, Kailesh B. Bavaria, Rupal M. Parikh, Deli Deng, Mukesh Patel, Mark Fullerton, Murli Ganeshan, Stephen J. Strazdus
  • Patent number: 6973551
    Abstract: A method and system for enabling a director to perform an atomic read-modify-write operation on plural bit read data stored in a selected one of a plurality of memory locations. The method includes providing a plurality of successive full adders, each one of the full adders being associated with a corresponding one of the bits of the plural bit read data. Each one of the full adders has a summation output, a carry bit input and a carry bit output. The method includes adding in each one of the full adders: (a) a corresponding bit of plural bit input data provided by the director; (b) the corresponding one of the bits of the plural bit read data; and, (c) a carry bit fed the carry bit input from a preceding full adder. Each one of the full adders provides: (a) a carry bit on the carry output thereof representative of the most significant bit produced by the full adder; and, (b) a bit on the summation output representative of a least significant bit produced by the full adder.
    Type: Grant
    Filed: December 30, 2002
    Date of Patent: December 6, 2005
    Assignee: EMC Corporation
    Inventor: John K. Walton
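    Illustrative sketch (not part of the patent record): the chain of full adders described in the abstract above behaves like a ripple-carry adder, where each stage adds one bit of the read data, one bit of the input data, and the carry from the preceding stage. The 8-bit default width and the function names are assumptions for illustration.

```python
# Hypothetical bit-level model of a chain of full adders.
# Assumption (not from the patent): an 8-bit default data width.

def full_adder(read_bit, input_bit, carry_in):
    """Add three bits; return (sum bit, carry out).

    The two-bit total has the sum as its least significant bit and
    the carry as its most significant bit.
    """
    total = read_bit + input_bit + carry_in  # 0..3
    return total & 1, total >> 1

def ripple_add(read_data, input_data, width=8):
    """Add two width-bit values one bit at a time; return (result, carry out)."""
    carry = 0
    result = 0
    for i in range(width):
        sum_bit, carry = full_adder((read_data >> i) & 1,
                                    (input_data >> i) & 1,
                                    carry)
        result |= sum_bit << i
    return result, carry
```

    This models only the addition; the atomicity of the read-modify-write operation comes from the memory system described in the patent, not from the adder itself.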
  • Patent number: 6970994
    Abstract: A method and apparatus for executing partial-width packed data instructions are discussed. The processor may include a plurality of registers, a register renaming unit, a decoder, and a partial-width execution unit. The register renaming unit provides an architectural register file to store packed data operands each of which include a plurality of data elements. The decoder is to decode a first and second set of instructions that each specify one or more registers in the architectural register file. The first set of instructions specify operations to be performed on all of the data elements stored in the one or more specified registers. In contrast, the second set of instructions specify operations to be performed on only a subset of the data elements. The partial-width execution unit is to execute operations specified by either of the first or the second set of instructions.
    Type: Grant
    Filed: May 8, 2001
    Date of Patent: November 29, 2005
    Assignee: Intel Corporation
    Inventors: Mohammad Abdallah, James Coke, Vladimir Pentkovski, Patrice Roussel, Shreekant S. Thakkar
  • Patent number: RE39121
    Abstract: A processor which executes positive conversion processing, which converts coded data into uncoded data, and saturation calculation processing, which rounds a value to an appropriate number of bits, at high speed. When a positive conversion saturation calculation instruction “MCSST D1” is decoded, the sum-product result register 6 outputs its held value to the path P1. The comparator 22 compares the magnitude of the held value of the sum-product result register 6 with the coded 32-bit integer “0x0000_00FF”. The polarity judging unit 23 judges whether the eighth bit of the value held by the sum-product result register 6 is “ON”. The multiplexer 24 outputs one of the maximum value “0x0000_00FF” generated by the constant generator 21, the zero value “0x0000_0000” generated by the zero generator 25, and the held value of the sum-product result register 6 to the data bus 18.
    Type: Grant
    Filed: February 13, 2003
    Date of Patent: June 6, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Toru Morikawa, Nobuo Higaki, Akira Miyoshi, Keizo Sumida
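    Illustrative sketch (not part of the patent record): the selection among the maximum value, the zero value, and the held value described in the abstract above resembles clamping a signed 32-bit value to the range 0x00 to 0xFF. Treating the polarity check as a test of the 32-bit sign, and the names below, are assumptions for illustration.

```python
# Hypothetical model of a positive-conversion saturation step.
# Assumption (not from the patent): polarity means the sign of the
# held value interpreted as a signed 32-bit integer.

MAX_VALUE = 0x000000FF   # constant named in the abstract
ZERO_VALUE = 0x00000000  # constant named in the abstract

def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def positive_saturate(held_value):
    """Clamp a signed 32-bit value to the unsigned range 0..0xFF."""
    signed = to_signed32(held_value)
    if signed < 0:
        return ZERO_VALUE   # negative input: output the zero value
    if signed > MAX_VALUE:
        return MAX_VALUE    # too large: output the maximum value
    return signed           # in range: pass the held value through
```

    In the hardware described, the comparison and the polarity check run in parallel and a multiplexer picks the output; the sequential branches here only mirror the selection logic.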