Arithmetic Operation Instruction Processing Patents (Class 712/221)
-
Publication number: 20080148012
Abstract: A mathematical operation processing apparatus is disclosed by which the supply of an operand which is performed based on condition codes by a plurality of mathematical operations can be performed at a high speed. The mathematical operation processing apparatus includes a plurality of computing elements configured to perform mathematical operations different from one another and produce mathematical operation results of the mathematical operations and condition codes. A condition code set register retains the condition codes produced simultaneously by the computing elements as a condition code set. A condition code conversion section performs a predetermined conversion for the condition code set and outputs a result of the conversion as a conversion condition code set. An operand supplying section supplies an operand for the mathematical operations in the computing elements based on the conversion condition code set.
Type: Application
Filed: December 4, 2007
Publication date: June 19, 2008
Applicant: Sony Corporation
Inventors: Yasuhiro Iizuka, Takahiro Sato, Takayasu Kon, Kenichi Sanpei, Eiichiro Morinaga
-
Publication number: 20080148024
Abstract: The present disclosure provides a method for instruction processing. The method may include adding a first operand from a first register, a second operand from a second register and a carry input bit to generate a sum and a carry out bit. The method may further include loading the sum into a third register and loading the carry out bit into a most significant bit position of the third register to generate a third operand. The method may also include performing a single bit shift on the third operand via a shifter unit to produce a shifted operand and loading the shifted operand into the fourth register. The method may further include loading a least significant bit from the sum into the most significant bit position of the fourth register to generate a fourth operand. The method may additionally include generating a greatest common divisor (GCD) of the first and second operands via the fourth operand and generating a public key based on, at least in part, the GCD.
Type: Application
Filed: December 14, 2006
Publication date: June 19, 2008
Applicant: INTEL CORPORATION
Inventors: Gilbert M. Wolrich, William Hasenplaugh, Wajdi Feghali, Daniel Cutter, Vinodh Gopal, Gunnar Gaubatz
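The add-then-shift step in the abstract above can be illustrated with a minimal Python sketch. It only shows how keeping the carry-out as the top bit lets a one-bit right shift halve a sum that overflowed the register. The 32-bit width and the function names are assumptions for illustration, and the sketch does not reproduce the register-by-register sequence or the GCD and key-generation steps of the disclosure.

```python
WIDTH = 32                      # assumed register width, not stated in the abstract
MASK = (1 << WIDTH) - 1


def add_with_carry(a, b, carry_in=0):
    """Return (sum, carry_out) of a WIDTH-bit add with a carry input bit."""
    total = (a & MASK) + (b & MASK) + (carry_in & 1)
    return total & MASK, total >> WIDTH


def add_then_halve(a, b, carry_in=0):
    """Add, then shift right by one bit with the carry-out entering at the top.

    Returns the shifted value and the bit that was shifted out (the sum's LSB).
    """
    s, carry_out = add_with_carry(a, b, carry_in)
    shifted = (s >> 1) | (carry_out << (WIDTH - 1))
    return shifted, s & 1
```

For example, `add_then_halve(0xFFFFFFFF, 0xFFFFFFFF)` yields `(0xFFFFFFFF, 0)`, which is the exact half of the 33-bit sum even though the sum itself does not fit in 32 bits.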
-
Patent number: 7389406
Abstract: A partial execution unit of a splittable execution unit performs an operation on a portion of one or more arguments of a micro-operation to generate a first partial execution result of the micro-operation. A complementary portion of one of the arguments is passed through a bypass execution unit instead of through the splittable execution unit to generate a second partial execution result of the micro-operation. The first partial execution result and second partial execution result are concatenated into a full execution result.
Type: Grant
Filed: September 28, 2004
Date of Patent: June 17, 2008
Assignee: Intel Corporation
Inventors: Zeev Sperber, Guillermo Savransky, Sagi Lahav
-
Publication number: 20080140994
Abstract: A method of operation within an integrated-circuit processing device having a plurality of execution lanes. Upon receiving an instruction to exchange data between the execution lanes, respective requests from the execution lanes are examined to determine a set of the execution lanes that may send data to one or more others of the execution lanes during a first interval. Each execution lane within the set of the execution lanes is signaled to indicate that the execution lane may send data to the one or more others of the execution lanes.
Type: Application
Filed: October 9, 2007
Publication date: June 12, 2008
Inventors: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin, Raghunath Rao, DeForest Tovey, Mark Rygh, Jung-Ho Ahn
-
Publication number: 20080141006
Abstract: A system and method for implementing arithmetic logic unit (ALU) support for value-based control dependence sequences. According to a first embodiment of the present invention, an ALU generates a carry-out signal designating one of a first and second value as a larger value. In response to the carry-out signal, the ALU updates a storage location with a third value, which is the larger value. According to a second embodiment of the present invention, an ALU generates a carry-out signal designating one of a first and second value as a larger value. In response to the carry-out signal, the ALU updates a storage location with a third value. The third value is a fourth value, if the carry-out signal designates the first value as the larger value or the third value is a fifth value, if the carry-out signal designates the second value as the larger value.
Type: Application
Filed: December 11, 2006
Publication date: June 12, 2008
Inventors: Lei Chen, Hung C. Ngo, Kevin J. Nowka
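As a rough illustration of the two embodiments above, the following Python sketch derives a carry-out from an unsigned subtraction and uses it to select a value without a branch on the comparison result itself. The 32-bit width, the use of two's-complement subtraction, the function names, and the treatment of equal inputs are all assumptions; the abstract does not say how the carry-out is produced.

```python
WIDTH = 32                      # assumed operand width
MASK = (1 << WIDTH) - 1


def subtract_carry_out(a, b):
    """Carry-out of a - b computed as a + ~b + 1; it is 1 when a >= b (unsigned)."""
    total = (a & MASK) + (~b & MASK) + 1
    return total >> WIDTH


def select_larger(a, b):
    """First embodiment, sketched: store the larger of the two values."""
    return a if subtract_carry_out(a, b) else b


def select_by_larger(a, b, if_first_larger, if_second_larger):
    """Second embodiment, sketched: store one of two other values instead."""
    return if_first_larger if subtract_carry_out(a, b) else if_second_larger
```

In this sketch equal inputs count as "first value larger", which is harmless for `select_larger` but is a choice the caller of `select_by_larger` needs to be aware of.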
-
Publication number: 20080126758
Abstract: A digital signal processing apparatus and method for MAC operation are disclosed. The DSP apparatus includes: a first memory for storing a plurality of first operands; a second memory for storing a plurality of second operands; a MAC processor including a plurality of MAC blocks disposed in parallel for performing a parallel MAC operation on a first operand outputted from the first memory in parallel and a second operand outputted from the second memory in parallel using the parallel MAC blocks, wherein the first memory and the second memory include dual port memories for outputting the plurality of the first operands and the second operands to the plurality of parallel MAC blocks in parallel.
Type: Application
Filed: December 22, 2006
Publication date: May 29, 2008
Inventors: Young-Su Kwon, Bon-Tae Koo, Nak-Woong Eum
-
Publication number: 20080114967
Abstract: There is provided a semiconductor integrated circuit device which consumes less power and enables real-time processing. The semiconductor integrated circuit device comprises: thermal sensors which can detect temperature, determine whether the detection result exceeds each of a set of reference values and output the result; and a control block capable of controlling the operations of arithmetic blocks based on the output signals of the thermal sensors, wherein the control block returns to an operation state from a suspended state with an interrupt signal based on the output signals of the thermal sensors and determines the operation conditions of the arithmetic blocks to ensure that the temperature conditions of the arithmetic blocks are satisfied. Thereby, power consumption is reduced and real-time processing efficiency is improved.
Type: Application
Filed: November 6, 2007
Publication date: May 15, 2008
Inventors: Makoto Saen, Kenichi Osada, Tetsuya Yamada, Yusuke Kanno, Satoshi Misaka
-
Patent number: 7366882
Abstract: A processor is provided with an address calculation unit so as to generate addresses for elements of object-oriented data structures in one processor clock cycle.
Type: Grant
Filed: May 10, 2002
Date of Patent: April 29, 2008
Inventors: Zohair Sahraoui, Gary Ciambella
-
Patent number: 7363471
Abstract: A method may translate a set of source instructions into a set of target instructions, execute the set of target instructions, and unmask a denormal input control bit if the set of source instructions uses a denormal input handling mechanism. A method may detect at least one denormal exception of a faulty target instruction by executing the set of target instructions; assign a predetermined value to one or more denormal operands of the faulty target instruction; and execute the faulty target instruction with the predetermined value for the one or more denormal operands. An apparatus, system, and machine-readable medium may perform such methods.
Type: Grant
Filed: June 27, 2005
Date of Patent: April 22, 2008
Assignee: Intel Corporation
Inventors: Sion Berkowits, Orna Etzion, Li Jianhui
-
Patent number: 7353516
Abstract: The present invention concerns data flow control in adaptive integrated circuitry which utilizes a data flow model for data processing. The present invention controls task initiation and execution based upon data consumption measured in data buffer units. In the various embodiments, when a first task of a plurality of tasks is initiated, a buffer parameter is determined and a buffer count is initialized for the first task. For each iteration of the first task using a data buffer unit of input data, the buffer count is correspondingly adjusted, such as incremented or decremented. When the buffer count meets the buffer parameter requirements, the state of the first task is changed, which may include stopping the first task, and a next action is determined, such as initiating a second task. The various apparatus embodiments include a hardware task manager, a node sequencer, a programmable node, and use of a monitoring task within an adaptive execution unit.
Type: Grant
Filed: August 14, 2003
Date of Patent: April 1, 2008
Assignee: NVIDIA Corporation
Inventors: Ghobad Heidari-Bateni, Sharad D. Sambhwani
-
Publication number: 20080077779
Abstract: In one embodiment, the present invention includes a method for receiving a rounding instruction and an immediate value in a processor, determining if a rounding mode override indicator of the immediate value is active, and if so executing a rounding operation on a source operand in a floating point unit of the processor responsive to the rounding instruction and according to a rounding mode set forth in the immediate operand. Other embodiments are described and claimed.
Type: Application
Filed: September 22, 2006
Publication date: March 27, 2008
Inventors: Ronen Zohar, Shane Story
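The override mechanism in the abstract above might be modeled roughly as follows in Python. The bit layout of the immediate (two mode bits plus one override bit), the mode numbering, and the `default_mode` parameter standing in for a processor-wide rounding setting are all hypothetical; they are not taken from the application or from any real instruction encoding.

```python
import math

# Hypothetical mode numbering for the low two bits of the immediate.
ROUNDERS = {
    0: round,        # round to nearest, ties to even (Python 3 behavior)
    1: math.floor,   # toward negative infinity
    2: math.ceil,    # toward positive infinity
    3: math.trunc,   # toward zero
}
OVERRIDE_BIT = 0b100  # hypothetical position of the override indicator


def round_instruction(source, immediate, default_mode=0):
    """Round `source`, taking the mode from the immediate only when overridden."""
    if immediate & OVERRIDE_BIT:
        mode = immediate & 0b11      # mode set forth in the immediate
    else:
        mode = default_mode          # fall back to the ambient rounding mode
    return float(ROUNDERS[mode](source))
```

With these assumed encodings, `round_instruction(2.5, 0b110)` rounds up to `3.0`, while `round_instruction(2.5, 0b010, default_mode=1)` ignores the immediate's mode bits and rounds down to `2.0`.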
-
Publication number: 20080072025
Abstract: A novel and useful apparatus for and method of software based phase locked loop (PLL). The software based PLL incorporates a reconfigurable calculation unit (RCU) that is optimized and programmed to sequentially perform all the atomic operations of a PLL or any other desired task in a time sharing manner. An application specific instruction-set processor (ASIP) incorporating the RCU includes an instruction set whose instructions are optimized to perform the atomic operations of a PLL. The RCU is clocked at a fast enough processor clock rate to ensure that all PLL atomic operations are performed within a single PLL reference clock cycle.
Type: Application
Filed: September 11, 2007
Publication date: March 20, 2008
Inventors: Roman Staszewski, Robert B. Staszewski, Fuqiang Shi
-
Patent number: 7346761
Abstract: An arithmetic and logic device as an integral part of a processing unit is provided to achieve code size and overhead reduction. The arithmetic and logic device contains several auxiliary computing units, each of which is capable of simple arithmetic and logical operations, under the control of a control unit. By configuring the auxiliary computing units along the data path, additional processing of the operands could be carried out within the same instruction cycle. As such, a processing unit incorporating such an arithmetic and logic device is able to achieve significant performance improvement both in terms of code size and memory access overhead.
Type: Grant
Filed: October 8, 2005
Date of Patent: March 18, 2008
Assignee: National Chung Cheng University
Inventors: Tien-Fu Chen, Chih-Heng Kang, Chen-Neng Win
-
Patent number: 7343472
Abstract: A processor includes an instruction memory, arithmetic logic unit, finite field arithmetic unit, at least one digital storage device, and an instruction decoder. The instruction memory temporarily stores an instruction that includes at least one of: an operational code, destination information, and source information. The instruction decoder is operably coupled to interpret the instruction to identify the arithmetic logic unit and/or the finite field arithmetic unit to perform the operational code of the corresponding instruction. The instruction decoder then identifies at least one destination location within the digital storage device based on the destination information contained within the corresponding instruction. The instruction decoder then identifies at least one source location within the digital storage device based on the source information of the corresponding instruction.
Type: Grant
Filed: June 11, 2003
Date of Patent: March 11, 2008
Assignee: Broadcom Corporation
Inventors: Joshua Porten, Won Kim, Scott D. Johnson, John R. Nickolls
-
Patent number: 7308560
Abstract: A digital signal processing unit includes a control unit and a data computing unit. An R/L register for distinguishing independent data is disposed in the control unit. An R/L select signal for indicating independent data is supplied to the data computing unit. A data processing instruction signal for distinguishing a data processing instruction from other instructions is issued from an instruction decoder. The R/L register for distinguishing independent data is controlled by the data processing instruction signal. In the data computing unit, the portion related to storing independent data is multiplexed according to the number of independent data to be processed, and this multiplexed portion is controlled by the R/L select signal supplied from the control unit.
Type: Grant
Filed: February 3, 2005
Date of Patent: December 11, 2007
Assignee: Oki Electric Industry Co., Ltd.
Inventors: Danya Sugai, Teruaki Uehara
-
Patent number: 7290121
Abstract: A data processor (200) has a pipelined execution unit (120). Whether a first instruction is one of a class of instructions wherein as a result of execution of the first instruction the contents of an operand register will be stored in a destination register is determined. A second instruction that references the destination register is received before a completion of execution of the first instruction. The second instruction is executed using the contents of the operand register without stalling the second instruction in the pipelined execution unit (120).
Type: Grant
Filed: June 12, 2003
Date of Patent: October 30, 2007
Assignee: Advanced Micro Devices, Inc.
Inventor: Stephen Charles Kromer
-
Patent number: 7284117
Abstract: A processor includes a prediction circuit and a floating point unit. The prediction circuit is configured to predict an execution latency of a floating point operation. The floating point unit is coupled to receive the floating point operation for execution, and is configured to detect a misprediction of the execution latency. In some embodiments, an exception may be taken in response to the misprediction. In other embodiments, the floating point operation may be rescheduled with the corrected execution latency.
Type: Grant
Filed: November 4, 2003
Date of Patent: October 16, 2007
Assignee: Advanced Micro Devices, Inc.
Inventors: Arun Radhakrishnan, Kelvin D. Goveas
-
Patent number: 7269718
Abstract: A method, apparatus, and computer instructions in a processor for performing arithmetic operations. A data type associated with a particular memory location is used to determine if an operation about to be performed on the data in that location is legal. If the operation requires the data to have a required data type, a determination is made as to whether the operation is a legal operation based on the identified data type and the required data type. If the operation is not legal on the identified type, a determination is made as to whether data can be cast to change the identified data type to the required data type. The data is cast to the required data type if the data can be cast to form modified data, and the arithmetic operation is performed on the modified data. If the data cannot be cast to the required type, an exception or interrupt is generated.
Type: Grant
Filed: April 29, 2004
Date of Patent: September 11, 2007
Assignee: International Business Machines Corporation
Inventors: William Preston Alexander, III, Robert Tod Dimpsey, Frank Eliot Levine, Robert John Urquhart
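The check, cast, or fault sequence described above can be sketched in software terms. In this Python illustration a dictionary stands in for memory, the Python type of each stored value stands in for the data type associated with the location, and `IllegalOperation` stands in for the exception or interrupt. All of these names are assumptions, and the patent concerns a processor mechanism rather than library code like this.

```python
class IllegalOperation(Exception):
    """Stands in for the exception or interrupt mentioned in the abstract."""


def typed_add(memory, address, addend, required_type=int):
    """Add `addend` to the value at `address`, casting it first if needed."""
    value = memory[address]
    if not isinstance(value, required_type):
        # The operation is not legal on the identified type: try a cast.
        try:
            value = required_type(value)
        except (TypeError, ValueError) as err:
            raise IllegalOperation(
                f"cannot cast {value!r} to {required_type.__name__}"
            ) from err
    return value + addend
```

Here `typed_add({0: "12"}, 0, 5)` succeeds after casting the string to an integer, while a stored value such as `"abc"` raises `IllegalOperation`.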
-
Publication number: 20070186082
Abstract: Included are embodiments of a stream processor configured to process data in any of a plurality of different formats. At least one embodiment of the stream processor includes a first scalar arithmetic logic unit (ALU), configured to process a plurality of sets of short data in response to a received short format control signal from an instruction set and process a set of long data in response to a received long format control signal from the instruction set. Embodiments of the processor also include a second arithmetic logic unit (ALU), configured to receive the processed data from the first arithmetic logic unit (ALU) and process the input data and the processed data according to a control signal from the instruction set. Still other embodiments include a special function unit (SFU) configured to provide additional computational functionality to the first ALU and the second ALU.
Type: Application
Filed: February 6, 2007
Publication date: August 9, 2007
Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
-
Patent number: 7249243
Abstract: Techniques for control word prediction and speculative execution. In one embodiment, an apparatus includes a control word predictor, execution resources, and a comparison module. The control word predictor of this embodiment predicts a predicted control word for execution of operations in response to a control word changing operation. The execution resources of this embodiment speculatively execute the plurality of operations utilizing the predicted control word, and the comparison module determines if the predicted control word matches an actual control word set by the control word changing operation or a plurality of other control words, and causes re-execution of said plurality of operations if said actual control word matches any of the plurality of other control words.
Type: Grant
Filed: August 6, 2003
Date of Patent: July 24, 2007
Assignee: Intel Corporation
Inventors: Mohammad A. Abadallah, Mitchell Diamond, David B. Jackson, Kip A. Baumann, Ki W. Yoon, Rafi M. Saied, Robert L. Farrell
-
Patent number: 7240184
Abstract: A multipurpose functional unit is configurable to support a number of operations including multiply-add and comparison testing operations, as well as other integer and/or floating-point arithmetic operations, Boolean operations, and format conversion operations.
Type: Grant
Filed: November 10, 2004
Date of Patent: July 3, 2007
Assignee: NVIDIA Corporation
Inventors: Ming Y. Siu, Stuart F. Oberman
-
Patent number: 7237089
Abstract: An operation method has processing for applying a same type of operation in parallel to N M-bit operands to obtain N M-bit operation results executed on a computer. Here, N is an integer equal to or greater than 2 and M is an integer equal to or greater than 1. The operation method includes: an operation step of applying the type of operation to an N*M-bit provisional operand that is formed by concatenating the N M-bit operands, to obtain one N*M-bit provisional operation result, and generating correction information based on an effect had, by applying the operation, on each M bits of the provisional operation result from a bit that neighbors the M bits; and a correction step of correcting the provisional operation result in M-bit units with use of the correction information, to obtain the N M-bit operation results.
Type: Grant
Filed: November 26, 2002
Date of Patent: June 26, 2007
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventor: Masato Suzuki
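For addition, the operate-then-correct idea in the abstract above resembles the familiar packed-add technique, and a minimal Python sketch of that reading follows. It performs one wide add on the concatenated operands, recovers which lane boundaries received a carry from the neighboring lane, and removes that carry from each affected lane. The function name and the choice of addition as the operation are assumptions, and the patent's own correction scheme may differ in detail.

```python
def packed_add(a, b, lanes, bits):
    """Add `lanes` packed `bits`-wide fields of a and b with wraparound per lane."""
    lane_mask = (1 << bits) - 1
    full_mask = (1 << (lanes * bits)) - 1

    # Operation step: one wide add over the concatenated operands.
    provisional = (a + b) & full_mask

    # Correction information: bit i is set when a carry entered bit i,
    # so the lowest bit of each lane shows a carry from the lane below.
    carries = a ^ b ^ provisional

    # Correction step: undo the incoming carry in each M-bit unit.
    result = 0
    for k in range(lanes):
        shift = k * bits
        carry_in = (carries >> shift) & 1
        lane = ((provisional >> shift) - carry_in) & lane_mask
        result |= lane << shift
    return result
```

For four 8-bit lanes, `packed_add(0x01FF80FF, 0x01010101, 4, 8)` gives `0x02008100`: the two `0xFF` lanes wrap to zero without disturbing their neighbors.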
-
Patent number: 7237055
Abstract: A system, apparatus and a method for routing data over fewer switches and interconnections among reconfigurable logic elements, and for adapting routing resources to dynamically perform complex bit-level permutations, such as shifting and bit reversal operations. In one embodiment, an exemplary silo routing circuit is formed upon a semiconductor substrate and routes data among a number of reconfigurable computational elements. The silo routing circuit comprises a plurality of input terminals and a plurality of output terminals. Further, the silo routing circuit includes a multi-stage interconnection network (“MIN”) of switches configurable to form data paths from any input terminal to any output terminal.
Type: Grant
Filed: August 31, 2004
Date of Patent: June 26, 2007
Assignee: Stretch, Inc.
Inventor: Charle' R. Rupp
-
Patent number: 7231510
Abstract: A mechanism for, and method of, processing multiply-accumulate instructions with out-of-order completion in a pipeline, for use in a processor having an at least four-wide instruction issue architecture, and a digital signal processor (DSP) incorporating the mechanism or the method. In one embodiment, the mechanism includes: (1) a multiply-accumulate unit (MAC) having an initial multiply stage and a subsequent accumulate stage and (2) out-of-order completion logic, associated with the MAC, that causes interim results produced by the multiply stage to be stored when the accumulate stage is unavailable and allows younger instructions to complete before the multiply-accumulate instructions.
Type: Grant
Filed: November 13, 2001
Date of Patent: June 12, 2007
Assignee: VeriSilicon Holdings (Cayman Islands) Co. Ltd.
Inventors: Hung T. Nguyen, Shannon A. Wichman
-
Patent number: 7216217
Abstract: A programmable processor that comprises a general purpose processor architecture, capable of operation independent of another host processor, having a virtual memory addressing unit, an instruction path and a data path; an external interface; a cache operable to retain data communicated between the external interface and the data path; at least one register file configurable to receive and store data from the data path and to communicate the stored data to the data path; and a multi-precision execution unit coupled to the data path. The multi-precision execution unit is configurable to dynamically partition data received from the data path to account for an elemental width of the data and is capable of performing group floating-point operations on multiple operands in partitioned fields of operand registers and returning catenated results. In other embodiments the multi-precision execution unit is additionally configurable to execute group integer and/or group data handling operations.
Type: Grant
Filed: August 25, 2003
Date of Patent: May 8, 2007
Assignee: Microunity Systems Engineering, Inc.
Inventors: Craig Hansen, John Moussouris
-
Patent number: 7216138
Abstract: A method and apparatus are described for converting a number from a floating point format to an integer format or from an integer format to a floating point format responsive to a control signal of a control signal format. Numbers are stored in the floating point format in a register of a first set of architectural registers in a packed format. One or more numbers in the floating point format are converted to the integer format and placed in a register of a second set of architectural registers in a packed format. Conversion from integer format to floating point format is performed in a similar manner. A floating point arithmetic apparatus is described that provides for converting a plurality of numbers between integer formats and floating point formats, further providing for conversion operations that require a greater data path width than floating-point arithmetic operations.
Type: Grant
Filed: February 14, 2001
Date of Patent: May 8, 2007
Assignee: Intel Corporation
Inventors: Mohammad Abdallah, Prasad Modali, Chien-Yu Huang, legal representative, Thomas R. Huff, Vladimir Pentkovski, Patrice Roussel, Shreekant S. Thakkar, Hsien-Cheng E. Hsieh, deceased
-
Patent number: 7212959
Abstract: A method and apparatus for accumulating arbitrary length strings of input values, such as floating point values, in a layered tree structure such that the order of adds at each layer is maintained. The accumulating utilizes a shared adder, and includes means for directing initial inputs and intermediate result values.
Type: Grant
Filed: August 8, 2001
Date of Patent: May 1, 2007
Inventors: Stephen Clark Purcell, Scott Kimura, Mark L. Wood Patrick
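A software analogue of the layered accumulation above is pairwise (tree) summation, sketched below in Python. Each layer adds adjacent values in their original order and passes an unpaired trailing value up unchanged, so the sequence of additions is fixed for a given input length. The function name is hypothetical, and the sketch does not model the shared adder or the routing of intermediate results that the patent describes.

```python
def tree_accumulate(values):
    """Sum values layer by layer, adding neighbors in their original order."""
    layer = list(values)
    if not layer:
        return 0.0
    while len(layer) > 1:
        next_layer = []
        # Add adjacent pairs, left to right, within this layer.
        for i in range(0, len(layer) - 1, 2):
            next_layer.append(layer[i] + layer[i + 1])
        # An odd element out is carried to the next layer as-is.
        if len(layer) % 2:
            next_layer.append(layer[-1])
        layer = next_layer
    return layer[0]
```

Because floating point addition is not associative, fixing the order of adds at each layer makes the result reproducible for a given input sequence.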
-
Patent number: 7206927
Abstract: A method of executing an instruction stream in a pipelined execution unit of depth, p, comprises loading the instruction stream; detecting an iteration of an instruction in the loaded instruction stream; interleaving p streams of instances of the instruction in the pipeline; detecting an end of the iteration; and combining results obtained from the p streams after all programmed iterations have completed. A computational circuit comprises a register which can hold a value representing both an operand and result of an iterative operation; a multiplexer having a first input connected to receive the operand from the register, a second input connected to a source of an identity value for the iterative operation, and an output; and an operator circuit having an input connected to receive a value from the multiplexer output, and an output connected to return the result to the register.
Type: Grant
Filed: November 19, 2002
Date of Patent: April 17, 2007
Assignee: Analog Devices, Inc.
Inventor: Abhijit Giri
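The interleaving described above can be pictured as keeping p independent partial results, each seeded with the operation's identity value, and merging them once the iteration ends. The Python sketch below models only that arithmetic, not the pipeline or the multiplexer circuit. The function and parameter names are assumptions, and the result matches a plain sequential reduction only when the operation is associative and commutative.

```python
from functools import reduce
import operator


def interleaved_reduce(values, op=operator.add, identity=0, depth=4):
    """Reduce `values` using `depth` interleaved streams, then combine them."""
    # Each stream starts from the identity value of the operation.
    partial = [identity] * depth
    for i, value in enumerate(values):
        slot = i % depth                  # stream that owns this instance
        partial[slot] = op(partial[slot], value)
    # Combine the per-stream results after the iteration has finished.
    return reduce(op, partial, identity)
```

For instance, `interleaved_reduce(range(1, 11))` returns `55`, and `interleaved_reduce([2, 3, 4], operator.mul, 1, 2)` returns `24`.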
-
Patent number: 7191316
Abstract: A system for handling a plurality of single precision floating point instructions and a plurality of double precision floating point instructions that both index a same set of registers is provided. The system comprises a decode unit arranged to decode, stall, and forward at least one of the plurality of single precision and at least one of the plurality of double precision floating point instructions in a fetch group. The decode unit includes a first counter arranged to increment for each of the plurality of single precision floating point instructions forwarded down a pipeline; a second counter arranged to increment for each of the plurality of double precision floating point instructions forwarded down the pipeline; a first mask register and a second mask register. The first mask register is updated by each of the single precision floating point instructions forwarded and the second mask register is updated by each of the double precision floating point instructions forwarded.
Type: Grant
Filed: January 29, 2003
Date of Patent: March 13, 2007
Assignee: Sun Microsystems, Inc.
Inventors: Rabin A. Sugumar, Sorin Iacobovici, Robert Nuckolls, Chandra M. R. Thimmannagari
-
Patent number: 7159100
Abstract: The present invention provides extended precision in SIMD arithmetic operations in a processor having a register file and an accumulator. A first set of data elements and a second set of data elements are loaded into a first vector register and a second vector register, respectively. Each data element comprises N bits. Next, an arithmetic instruction is fetched from memory. The arithmetic instruction is decoded. Then, a first vector register and a second vector register are read from the register file. The present invention then executes the arithmetic instruction on corresponding data elements in the first and second vector registers. The result of the execution is then written into the accumulator. Then, each element in the accumulator is transformed into an N-bit width element and stored into the memory.
Type: Grant
Filed: December 30, 1998
Date of Patent: January 2, 2007
Assignee: MIPS Technologies, Inc.
Inventors: Timothy van Hook, Peter Hsu, William A. Huffman, Henry P. Moreton, Earl A. Killian
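The benefit of a wide accumulator, as described above, is that intermediate results can exceed the N-bit element range and are narrowed only once at the end. A minimal Python sketch follows, with unbounded Python integers standing in for the extended-precision accumulator. The multiply-accumulate operation, N = 16, the function names, and signed saturation as the narrowing transform are assumptions; the abstract does not specify which transform is used.

```python
def saturate(value, bits):
    """Clamp value to the signed range representable in `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def simd_multiply_accumulate(acc, vec_a, vec_b):
    """Element-wise acc + a * b, kept at full precision in the accumulator."""
    return [s + a * b for s, a, b in zip(acc, vec_a, vec_b)]


def narrow(acc, bits=16):
    """Transform each accumulator element back to an N-bit element."""
    return [saturate(s, bits) for s in acc]
```

Accumulating `[30000, 100] * [2, 3]` and then `[-30000, 1] * [1, 1]` passes through 60000 in the first lane, which would not fit in 16 bits, yet the narrowed result `[30000, 301]` is exact.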
-
Patent number: 7149882
Abstract: A processor with instructions to operate on different data types stored in a single logical register file. According to one embodiment of the invention, a processor includes a number of physical registers, a memory unit, and a decode/execution unit. The memory unit is to make the number of physical registers appear to software as a single software-visible register file. The decode/execution unit is to execute on the contents of the single software-visible register file instructions of a first instruction type and of a second instruction type, wherein the single software-visible register file is to be operated as a flat register file during execution of instructions of the second instruction type and as a stack referenced register file during execution of instructions of the first instruction type.
Type: Grant
Filed: May 11, 2004
Date of Patent: December 12, 2006
Assignee: Intel Corporation
Inventors: Andrew F. Glew, Larry M. Mennemeier, Alexander D. Peleg, David Bistry, Millind Mittal, Carole Dulong, Eiichi Kowashi, Benny Eitan, Derrik Lin, Romamohan R. Vakkalagadda
-
Patent number: 7149877
Abstract: A disclosed byte execution unit receives byte instruction information and two operands, and performs an operation specified by the byte instruction information upon one or both of the operands, thereby producing a result. The byte instruction specifies either a count ones in bytes operation, an average bytes operation, an absolute differences of bytes operation, or a sum bytes into halfwords operation. In one embodiment, the byte execution unit includes multiple byte units. Each byte unit includes multiple population counters, two compressor units, adder input multiplexer logic, adder logic, and result multiplexer logic. A data processing system is described including a processor coupled to a memory system. The processor includes the byte execution unit. The memory system includes a byte instruction, wherein the byte instruction specifies either the count ones in bytes operation, the average bytes operation, the absolute differences of bytes operation, or the sum bytes into halfwords operation.
Type: Grant
Filed: July 17, 2003
Date of Patent: December 12, 2006
Assignee: International Business Machines Corporation
Inventors: Sang Hoo Dhong, Hwa-Joon Oh, Brad William Michael, Silvia Melitta Mueller, Kevin D. Tran
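The four byte operations named above can be sketched functionally on 32-bit words in Python. The word size, the rounding-up behavior of the average, the halfword layout of the byte sums, and all function names are assumptions made for illustration; the abstract names the operations but does not define these details, and the sketch says nothing about the population counters or compressors of the hardware.

```python
def split_bytes(word, count=4):
    """Return the bytes of `word`, least significant byte first."""
    return [(word >> (8 * i)) & 0xFF for i in range(count)]


def join_bytes(values):
    """Pack byte values (least significant first) back into one word."""
    return sum((v & 0xFF) << (8 * i) for i, v in enumerate(values))


def count_ones_in_bytes(a):
    """Each result byte holds the number of set bits in that byte of a."""
    return join_bytes([bin(x).count("1") for x in split_bytes(a)])


def average_bytes(a, b):
    """Per-byte average, rounding up on ties (an assumed rounding rule)."""
    pairs = zip(split_bytes(a), split_bytes(b))
    return join_bytes([(x + y + 1) >> 1 for x, y in pairs])


def absolute_differences_of_bytes(a, b):
    """Per-byte |a - b|."""
    pairs = zip(split_bytes(a), split_bytes(b))
    return join_bytes([abs(x - y) for x, y in pairs])


def sum_bytes_into_halfwords(a, b):
    """Assumed layout: low halfword sums a's bytes, high halfword sums b's."""
    return (sum(split_bytes(b)) << 16) | sum(split_bytes(a))
```

As a quick check of the first operation, `count_ones_in_bytes(0xFF0F0301)` returns `0x08040201`, one population count per byte.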
-
Patent number: 7146491
Abstract: A data processing apparatus and method for generating constant values is provided. The data processing apparatus comprises a data processing unit operable in response to an instruction to perform a data processing operation on one or more data values. Shift logic is operable to selectively apply a shift operation to data to produce one of the data values for the data processing operation. Further, a plurality of registers are provided for storing data. The instruction has a register specifier field for identifying a register and a shift specifier field for specifying a shift to be applied to that register's data in order to produce one of the data values for the data processing operation.
Type: Grant
Filed: October 26, 2004
Date of Patent: December 5, 2006
Assignee: ARM Limited
Inventors: Jonathan Sean Callan, David Hennah Mansell, Christopher Pedley, David James Seal
-
Patent number: 7139900
Abstract: New instruction definitions for a packet add (PADD) operation and for a single instruction multiple add (SMAD) operation are disclosed. In addition, a new dedicated PADD logic device that performs the PADD operation in about one to two processor clock cycles is disclosed. Also, a new dedicated SMAD logic device that performs a single instruction multiple data add (SMAD) operation in about one to two clock cycles is disclosed.
Type: Grant
Filed: June 23, 2003
Date of Patent: November 21, 2006
Assignee: Intel Corporation
Inventors: Corey Gee, Bapiraju Vinnakota, Saleem Mohammadali, Carl A. Alberola
-
Patent number: 7139901
Abstract: A software program extension for a dynamic multi-streaming processor is disclosed. The extension comprises an instruction set enabling coordinated interaction between a packet management component and a core processing component of the processor. The software program comprises a portion thereof for managing packet uploads and downloads into and out of memory, a portion thereof for managing specific memory allocations and de-allocations associated with enqueueing and dequeuing data packets, a portion thereof for managing the use of multiple contexts dedicated to the processing of a single data packet; and a portion thereof for managing selection and utilization of arithmetic and other context memory functions associated with data packet processing. The extension complements standard data packet processing program architecture for specific use for processors having a packet management unit that functions independently from a streaming processor unit.
Type: Grant
Filed: September 7, 2001
Date of Patent: November 21, 2006
Assignee: MIPS Technologies, Inc.
Inventors: Enrique Musoll, Mario Nemirovsky, Stephen Melvin
-
Patent number: 7111155Abstract: A computation core includes a computation block, an addressing block and an instruction sequencer, which are coupled to a memory through a memory interface. The computation block includes a register file and dual execution units. The execution units include features for enhanced performance in executing digital signal computations. The computation core is configured for executing digital signal processor instructions and microcontroller instructions, while achieving efficient digital signal processor computation and high code density. A finite impulse response filter algorithm achieves high performance on the dual execution units.Type: GrantFiled: May 12, 2000Date of Patent: September 19, 2006Assignee: Analog Devices, Inc.Inventors: William C. Anderson, John Edmondson, Jose Fridman, Marc Hoffman, Russell L. Rivin
-
Patent number: 7100025Abstract: An apparatus and method for performing single-instruction multiple-data instructions using a single multiply-accumulate unit while minimizing operational latency. The multiply-accumulate unit generates a first half and a second half of a data result. A register stores the first half of the data result. A miscellaneous-logic unit determines when to release the first half of the data result from the register to synchronize the first half and the second half of the data result.Type: GrantFiled: January 28, 2000Date of Patent: August 29, 2006Assignees: Hewlett-Packard Development Company, L.P., Intel CorporationInventor: Thomas Justin Sullivan
-
Patent number: 7062633Abstract: It is decided whether a first source data from the memory 101 is a data which is to be subjected to arithmetic or not by a state flag detection means 150, the result of the decision is retained as a state flag, and it is decided by a condition decision means 109 whether or not the state flag satisfies a condition for performing the arithmetic. A control means 110 controls whether an ALU 100 should perform the arithmetic or not on the basis of the condition satisfaction/dissatisfaction information.Type: GrantFiled: December 15, 1999Date of Patent: June 13, 2006Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Mana Hamada, Shunichi Kuromaru, Tomonori Yonezawa, Tsuyoshi Nakamura
-
Patent number: 7062637Abstract: Executing digital signal processing (DSP) instructions in a digital signal processor integrated circuit comprising receiving a DSP instruction in digital signal processor integrated circuit to process one or more complex number operands; fetching a first operand with a first data type, the first operand having real and imaginary values with a complex data type; fetching a second operand with a second data type; prior to executing a DSP operation, determining a permutation of the first operand, the second operand, or both the first operand and the second operand, and permuting instances of the first operand, the second operand, or both the first operand and the second operand to execute the DSP operation; and executing the DSP operation in the digital signal processor integrated circuit using the first operand and the second operand to obtain a result, the result having real and imaginary values with a complex data type.Type: GrantFiled: March 6, 2003Date of Patent: June 13, 2006Assignee: Intel CorporationInventors: Kumar Ganapathy, Ruban Kanapathipillai
-
Patent number: 7051193Abstract: Instruction-level parallelism in software pipelined loops is exploited by predicting future register rotations. A processor includes an architected current frame marker register and at least one unarchitected frame marker register. Register rotation prediction is achieved by setting the register rotation of future iterations of a software loop to be a function of the unarchitected frame marker registers. True data dependencies remain, but the dependencies caused solely by register renaming are removed. Dynamic predication is used to predicate instructions from future iterations, allowing them to be squashed if dependencies are later found. The register renaming that results from the prediction can be included in instructions in a buffer, or a renaming stage in an execution pipeline can perform the renaming.Type: GrantFiled: March 28, 2001Date of Patent: May 23, 2006Assignee: Intel CorporationInventors: Hong Wang, Christopher J. Hughes, Ralph Kling, Yong-Fong Lee, Daniel M. Lavery, John Shen, Jamison Collins
-
Patent number: 7047397Abstract: A method for executing an instruction with a semi-fast operation in a staggered ALU. The method of one embodiment comprises generating a first operation and a second operation from a micro-instruction. The first and second operations are scheduled for execution in a staggered arithmetic logic unit (ALU). The first and second operations are separated by N clock cycles. Data from the first operation is communicated to the second operation for use with execution of the second operation.Type: GrantFiled: September 13, 2002Date of Patent: May 16, 2006Assignee: Intel CorporationInventor: Ross A. Segelken
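
The staggered execution described above can be pictured with a split addition. This is only an illustrative sketch: the 16-bit half width, the choice of addition as the operation, and the name `staggered_add` are assumptions, and the carry is assumed to be the data passed from the first operation to the second.

```python
def staggered_add(a, b, half_bits=16):
    """Add two values as two half-width operations (hypothetical helper).

    The first operation handles the low half; its carry is the data
    forwarded to the second operation, which handles the high half.
    """
    mask = (1 << half_bits) - 1

    # First operation: low halves only.
    low = (a & mask) + (b & mask)
    carry = low >> half_bits

    # Second operation (would issue N cycles later): high halves plus carry.
    high = ((a >> half_bits) & mask) + ((b >> half_bits) & mask) + carry

    # Recombine; a carry out of the high half is discarded (wrap-around).
    return ((high & mask) << half_bits) | (low & mask)
```

For example, `staggered_add(0x0001FFFF, 0x00000001)` needs the carry from the low half to produce the correct high half.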
-
Patent number: 7047396Abstract: A method and system for fixed-length memory-to-memory processing of fixed-length instructions. Further, the present invention is a method and system for implementing a memory operand width independent of the ALU width. The arithmetic and register data are 32 bits, but the memory operand is variable in size. The size of the memory operand is specified by the instruction. Instructions in accordance with the present invention allow for multiple memory operands in a single fixed-length instruction. The instruction set is small and simple, so the implementation is lower cost than traditional processors. More addressing modes are provided for, thus creating a more efficient code. Semaphores are implemented using a single bit. Shift-and-merge instructions are used to access data across word boundaries.Type: GrantFiled: June 22, 2001Date of Patent: May 16, 2006Assignee: Ubicom, Inc.Inventors: David A. Fotland, Roger D. Arnold, Tibet Mimaroglu
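
As a rough illustration of the shift-and-merge idea mentioned above, the following sketch reads a 32-bit value that straddles two aligned words. The little-endian byte order, the 32-bit word size, and the function name `read_unaligned` are assumptions made for the example and are not taken from the patent.

```python
def read_unaligned(words, byte_offset):
    """Read a 32-bit value starting at byte_offset (hypothetical helper).

    words is a list of aligned 32-bit words, assumed little-endian.
    """
    index, byte_in_word = divmod(byte_offset, 4)
    shift = byte_in_word * 8

    # Shift: drop the bytes of the first word that precede the offset.
    low_part = words[index] >> shift
    if shift == 0:
        return low_part & 0xFFFFFFFF  # aligned access, no merge needed

    # Merge: bring in the leading bytes of the next word above the low part.
    high_part = words[index + 1] << (32 - shift)
    return (low_part | high_part) & 0xFFFFFFFF
```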
-
Patent number: 7028168Abstract: A system for performing matrix operations utilizes a processor, memory, and a matrix operation manager. The processor has a memory cache. The memory is external to the processor and stores first and second matrices. The matrix operation manager is configured to mathematically combine the first matrix with the second matrix utilizing a hoisted matrix algorithm for hoisting values of the first matrix, and the hoisted matrix algorithm has an outer loop and an inner loop that is performed to completion for each iteration of the outer loop. The matrix operation manager, for each iteration of the outer loop, is configured to load to the cache and to write to a contiguous portion of the memory, before performing the inner loop, values from the first matrix that are to be combined, via performance of the inner loop, with values from the second matrix.Type: GrantFiled: December 5, 2002Date of Patent: April 11, 2006Assignee: Hewlett-Packard Development Company, L.P.Inventor: Kevin R. Wadleigh
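
A minimal sketch of the hoisting pattern described above, using matrix multiplication as the combining operation. The column-major layout of the first matrix, the flat-list storage, and the name `hoisted_matmul` are assumptions chosen so that the hoisted values are strided in memory before the copy; the patent abstract does not specify them.

```python
def hoisted_matmul(a, b, n, m, p):
    """Multiply an n x m matrix by an m x p matrix (hypothetical helper).

    a is a flat list in column-major order (assumed), so one row of a
    is strided in memory. b and the result are flat, row-major lists.
    """
    c = [0] * (n * p)
    for i in range(n):  # outer loop
        # Hoist: copy the strided row i of a into a contiguous buffer
        # before the inner loops start.
        row = [a[k * n + i] for k in range(m)]

        # Inner loops: run to completion using only the contiguous copy.
        for j in range(p):
            acc = 0
            for k in range(m):
                acc += row[k] * b[k * p + j]
            c[i * p + j] = acc
    return c
```

The point of the copy is that the inner loops touch the first matrix through one contiguous buffer instead of through a strided access pattern.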
-
Patent number: 6996701Abstract: A computer system that can be operated by a clock frequency higher than the clock frequency by which the critical path instruction is executed correctly. The pipeline is driven at a high clock frequency higher than the clock frequency by which critical path instruction can be executed correctly. The computer system includes a high frequency ALU being operated by the pipeline clock frequency, and at least two low frequency ALUs being operated by the low clock frequency by which the critical path instruction is executed correctly. Each instruction of the execution stage is inputted to the low frequency ALUs alternately and each executes the critical path instruction in two machine cycles.Type: GrantFiled: September 25, 2001Date of Patent: February 7, 2006Assignee: Matsushita Electric Industrial Co., Ltd.Inventor: Akimitsu Shimamura
-
Patent number: 6996702Abstract: A processing system includes an arithmetic logic unit (ALU) sub-system that allows data associated with a prior instruction to be preserved for use with a next instruction or subsequent instruction without having to reload the value using an intermediate register. The ALU sub-system includes a pair of ALUs communicatively cross-coupled with a pair of accumulators. The processing system also includes a data selector coupled to the ALU sub-system for use with memory contention prediction. The data selector includes a constant generator that controls storage of data associated with a previous instruction in a bypass element, and a selector to choose between data from a databus element and data stored in the bypass element.Type: GrantFiled: July 30, 2002Date of Patent: February 7, 2006Assignee: WIS Technologies, Inc.Inventors: Shuhua Xiang, Li Sha, Ping Zhu, Hongjun Yuan, Wei Ni
-
Patent number: 6988184Abstract: Methods of performing dyadic digital signal processing (DSP) instructions. In one embodiment of the invention, the method includes fetching a dyadic DSP instruction having a main operation and a sub operation; predecoding the dyadic DSP instruction to generate predecoded instruction signals; and decoding the predecoded instruction signals to generate select signals to selectively couple data from a first plurality of buses coupled to inputs of multiplexers of a first plurality of DSP functional blocks to execute the main operation of the dyadic DSP instruction in one processor cycle and to selectively couple data from a second plurality of buses coupled to inputs of multiplexers of a second plurality of DSP functional blocks to execute the sub operation of the dyadic DSP instruction in the one processor cycle.Type: GrantFiled: August 2, 2002Date of Patent: January 17, 2006Assignee: Intel CorporationInventors: Kumar Ganapathy, Ruban Kanapathipillai
-
Patent number: 6986023Abstract: A processor-based system may include a main processor and a coprocessor. The coprocessor handles instructions that include opcodes specifying a data processing operation to be performed by the coprocessor and a coprocessor identification field for identifying a target coprocessor for coprocessor instructions. Two bits indicate one of four data sizes including a byte (8 bits), a half word (16 bits), a word (32 bits), and a double word (64 bits). Two other bits indicate a saturation type.Type: GrantFiled: August 9, 2002Date of Patent: January 10, 2006Assignee: Intel CorporationInventors: Nigel C. Paver, William T. Maghielse, Wing K. Yu, Jianwei Liu, Anthony Jebson, Kailesh B. Bavaria, Rupal M. Parikh, Deli Deng, Mukesh Patel, Mark Fullerton, Murli Ganeshan, Stephen J. Strazdus
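
The two-bit fields described above could be decoded along these lines. The bit positions, the order in which the four size codes map to widths, and the name `decode_fields` are assumptions for illustration; the abstract gives only the field widths and the four data sizes, and the saturation code is returned as a raw number because its encoding is not stated.

```python
# Assumed mapping from the two-bit size code to the data width in bits.
SIZE_BITS = {0: 8, 1: 16, 2: 32, 3: 64}


def decode_fields(instruction, size_shift=0, sat_shift=2):
    """Extract the size and saturation fields (hypothetical layout).

    Returns (data width in bits, raw two-bit saturation code).
    """
    size_code = (instruction >> size_shift) & 0b11
    sat_code = (instruction >> sat_shift) & 0b11
    return SIZE_BITS[size_code], sat_code
```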
-
Patent number: 6973551Abstract: A method and system for enabling a director to perform an atomic read-modify-write operation on plural bit read data stored in a selected one of a plurality of memory locations. The method includes providing a plurality of successive full adders, each one of the full adders being associated with a corresponding one of the bits of the plural bit read data. Each one of the full adders has a summation output, a carry bit input and a carry bit output. The method includes adding in each one of the full adders: (a) a corresponding bit of plural bit input data provided by the director; (b) the corresponding one of the bits of the plural bit read data; and, (c) a carry bit fed the carry bit input from a preceding full adder. Each one of the full adders provides: (a) a carry bit on the carry output thereof representative of the most significant bit produced by the full adder; and, (b) a bit on the summation output representative of a least significant bit produced by the full adder.Type: GrantFiled: December 30, 2002Date of Patent: December 6, 2005Assignee: EMC CorporationInventor: John K. Walton
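
The chain of full adders described above can be modeled in software as follows. This is a behavioral sketch only: the function names and the `width` parameter are illustrative, and the atomicity of the read-modify-write operation is not modeled.

```python
def full_adder(a_bit, b_bit, carry_in):
    """Add three bits; return (sum bit, carry bit).

    The sum bit is the least significant bit of the total and the
    carry bit is the most significant bit, as in the abstract.
    """
    total = a_bit + b_bit + carry_in
    return total & 1, total >> 1


def ripple_add(read_data, input_data, width):
    """Add input_data to read_data with one full adder per bit."""
    carry = 0
    result = 0
    for i in range(width):
        sum_bit, carry = full_adder(
            (input_data >> i) & 1,  # bit supplied by the director
            (read_data >> i) & 1,   # bit read from memory
            carry,                  # carry from the preceding adder
        )
        result |= sum_bit << i
    return result, carry
```

For instance, `ripple_add(0b1011, 0b0110, 4)` returns the 4-bit sum together with the carry out of the last adder.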
-
Patent number: 6970994Abstract: A method and apparatus for executing partial-width packed data instructions are discussed. The processor may include a plurality of registers, a register renaming unit, a decoder, and a partial-width execution unit. The register renaming unit provides an architectural register file to store packed data operands each of which include a plurality of data elements. The decoder is to decode a first and second set of instructions that each specify one or more registers in the architectural register file. The first set of instructions specify operations to be performed on all of the data elements stored in the one or more specified registers. In contrast, the second set of instructions specify operations to be performed on only a subset of the data elements. The partial-width execution unit is to execute operations specified by either of the first or the second set of instructions.Type: GrantFiled: May 8, 2001Date of Patent: November 29, 2005Assignee: Intel CorporationInventors: Mohammad Abdallah, James Coke, Vladimir Pentkovski, Patrice Roussel, Shreekant S. Thakkar
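
The difference between the two instruction sets described above can be sketched with lists standing in for packed registers. The function names, the choice of the lowest elements as the subset, and the pass-through of untouched elements from the first operand are assumptions, not details from the abstract.

```python
def packed_add(a, b):
    """Full-width operation: add every pair of data elements."""
    return [x + y for x, y in zip(a, b)]


def partial_add(a, b, count=1):
    """Partial-width operation: add only the first `count` elements.

    Remaining elements are copied from the first operand (assumed).
    """
    return [x + y if i < count else x
            for i, (x, y) in enumerate(zip(a, b))]
```

With `a = [1, 2, 3, 4]` and `b = [10, 20, 30, 40]`, the full-width form updates all four elements while the partial-width form with `count=1` changes only the first.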
-
Patent number: RE39121Abstract: A processor which executes positive conversion processing, which converts coded data into uncoded data, and saturation calculation processing, which rounds a value to an appropriate number of bits, at high speed. When a positive conversion saturation calculation instruction “MCSST D1” is decoded, the sum-product result register 6 outputs its held value to the path P1. The comparator 22 compares the magnitude of the held value of the sum-product result register 6 with the coded 32-bit integer “0x0000_00FF”. The polarity judging unit 23 judges whether the eighth bit of the value held by the sum-product result register 6 is “ON”. The multiplexer 24 outputs one of the maximum value “0x0000_00FF” generated by the constant generator 21, the zero value “0x0000_0000” generated by the zero generator 25, and the held value of the sum-product result register 6 to the data bus 18.Type: GrantFiled: February 13, 2003Date of Patent: June 6, 2006Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Toru Morikawa, Nobuo Higaki, Akira Miyoshi, Keizo Sumida
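
The three-way selection performed by the multiplexer above (maximum value, zero, or the held value) behaves like a clamp. The sketch below is a software approximation under stated assumptions: the held value is treated as a signed 32-bit word, the function names are hypothetical, and the way the comparator and polarity judging unit reach their decision is not reproduced.

```python
def to_signed32(word):
    """Interpret a 32-bit word as a two's complement signed integer."""
    word &= 0xFFFFFFFF
    return word - (1 << 32) if word & 0x80000000 else word


def positive_saturate(word, max_value=0x000000FF):
    """Clamp a signed 32-bit word into the range 0..max_value."""
    value = to_signed32(word)
    if value < 0:
        return 0x00000000  # zero generator output
    if value > max_value:
        return max_value   # constant generator output
    return value           # held value passes through unchanged
```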