Decoding By Plural Parallel Decoders Patents (Class 712/212)
  • Publication number: 20100241791
    Abstract: A controller includes an instruction table memory, a program counter, a first decoder, and a first executing unit. The instruction table memory stores an instruction code obtained by coding a sequence to access a nonvolatile semiconductor memory. A read address in the instruction table memory is set to the program counter. The first decoder decodes the instruction code read from the instruction table memory to output a first decode signal. The first executing unit executes access to the nonvolatile semiconductor memory on the basis of the first decode signal output from the first decoder.
    Type: Application
    Filed: September 4, 2009
    Publication date: September 23, 2010
    Inventors: Tarou Iwashiro, Takahide Nishiyama, Seiichi Tomita
  • Patent number: 7793080
    Abstract: One or more processor cores of a multiple-core processing device each can utilize a processing pipeline having a plurality of execution units (e.g., integer execution units or floating point units) that together share a pre-execution front-end having instruction fetch, decode and dispatch resources. Further, one or more of the processor cores each can implement dispatch resources configured to dispatch multiple instructions in parallel to multiple corresponding execution units via separate dispatch buses. The dispatch resources further can opportunistically decode and dispatch instruction operations from multiple threads in parallel so as to increase the dispatch bandwidth. Moreover, some or all of the stages of the processing pipelines of one or more of the processor cores can be configured to implement independent thread selection for the corresponding stage.
    Type: Grant
    Filed: December 31, 2007
    Date of Patent: September 7, 2010
    Inventors: Gene Shen, Sean Lie
  • Publication number: 20100223431
    Abstract: In a multi-core processor of a shared-memory type, deterioration in the data processing capability caused by competitions of memory accesses from a plurality of processors is suppressed effectively. In a memory access controlling system for controlling accesses to a cache memory in a data read-ahead process when the multi-core processor of a shared-memory type performs a task including a data read-ahead thread for executing data read-ahead and a parallel execution thread for performing an execution process in parallel with the data read-ahead, the system includes a data read-ahead controller which controls an interval between data read-ahead processes in the data read-ahead thread adaptive to a data flow which varies corresponding to an input value of the parallel process in the parallel execution thread. By controlling the interval between the data read-ahead processes, competitions of memory accesses in the multi-core processor are suppressed.
    Type: Application
    Filed: February 4, 2008
    Publication date: September 2, 2010
    Inventor: Kosuke Nishihara
  • Publication number: 20100199073
    Abstract: A multithreaded processor comprises a plurality of hardware thread units, an instruction decoder coupled to the thread units for decoding instructions received therefrom, and a plurality of execution units for executing the decoded instructions. The multithreaded processor is configured for controlling an instruction issuance sequence for threads associated with respective ones of the hardware thread units. On a given processor clock cycle, only a designated one of the threads is permitted to issue one or more instructions, but the designated thread that is permitted to issue instructions varies over a plurality of clock cycles in accordance with the instruction issuance sequence. The instructions are pipelined in a manner which permits at least a given one of the threads to support multiple concurrent instruction pipelines.
    Type: Application
    Filed: October 15, 2009
    Publication date: August 5, 2010
    Inventors: Erdem Hokenek, Mayan Moudgill, Michael J. Schulte, C. John Glossner
  • Publication number: 20100169614
    Abstract: A 32-bit instruction 50 is composed of a 4-bit format field 51, a 4-bit operation field 52, and two 12-bit operation fields 59 and 60. The 4-bit operation field 52 can only include (1) an operation code “cc” that indicates a branch operation which uses a stored value of the implicitly indicated constant register 36 as the branch address, or (2) a constant “const”. The content of the 4-bit operation field 52 is specified by a format code provided in the format field 51.
    Type: Application
    Filed: February 12, 2010
    Publication date: July 1, 2010
    Applicant: Panasonic Corporation
    Inventors: Shuichi TAKAYAMA, Nobuo Higaki
  • Patent number: 7747020
    Abstract: Performing a hash algorithm in a processor architecture to alleviate performance bottlenecks and improve overall algorithm performance. In one embodiment of the invention, the hash algorithm is pipelined within the processor architecture.
    Type: Grant
    Filed: December 4, 2003
    Date of Patent: June 29, 2010
    Assignee: Intel Corporation
    Inventor: Wajdi K. Feghali
  • Patent number: 7747993
    Abstract: A method of ordering instructions. The method can include placing a first instruction that consumes a value of an object before a second instruction that produces the value of the object such that the first instruction is processed before the second instruction and a physical location is allocated to the value of the object upon processing the first instruction.
    Type: Grant
    Filed: December 30, 2004
    Date of Patent: June 29, 2010
    Assignee: Michigan Technological University
    Inventor: Soner Onder
  • Patent number: 7724963
    Abstract: A method and apparatus for determining a closest match of N input patterns relative to R reference patterns using K processing units. Each of a set of input patterns are loaded into the K processing units. One of the Reference patterns is sequentially loaded into each of the processing units and a distance defining the similarity between the reference pattern and each of the input patterns is calculated. A present calculated distance replaces its corresponding stored present minimum distance if it is has a smaller value. After the R reference patterns have been processed the minimum distance and its corresponding identification for all N input patterns is determined without merging outputs. The minimum distances and the identifications may be read either in parallel or serially. The apparatus is easily scalable by adding processors. The number of reference patterns may be easily increased without altering system configuration.
    Type: Grant
    Filed: February 22, 2008
    Date of Patent: May 25, 2010
    Assignee: International Business Machines Corporation
    Inventors: Kerry A. Kravec, Ali G. Saidi, Jan M. Slyfield, Pascal R. Tannhof
  • Patent number: 7721070
    Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: May 18, 2010
    Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
  • Patent number: 7496196
    Abstract: Embodiments of the present invention provide a method and apparatus of performing on one or more bytes of an input data block at least one predetermined encryption or decryption operation.
    Type: Grant
    Filed: June 30, 2004
    Date of Patent: February 24, 2009
    Assignee: Intel Corporation
    Inventors: Marc Jalfon, Boris E. Ginzburg
  • Patent number: 7478224
    Abstract: A combined native (RISC or CISC) microprocessor and stack (Java™) machine are constructed so that Java™ VM instructions can be executed in hardware. Most Java™ instructions are executed directly, while more complex Java™ instructions, such as those manipulating Java™ objects, are executed as native microcode. In order for native microcode instructions to access the Java™ operand stack, a Java™ operand stack pointer points to the register file location that is the current top of the stack, while a remap bit in the status register indicates that registers specified in native instructions are remapped as the maximum Java™ operand stack pointer value minus the present value of the Java™ operand stack pointer.
    Type: Grant
    Filed: April 15, 2005
    Date of Patent: January 13, 2009
    Assignee: Atmel Corporation
    Inventors: Oyvind Strom, Erik K. Renno, Kristian Monsen
  • Publication number: 20080263328
    Abstract: Embodiments of the invention relate to a method and system for accessing a set of parallel registers orthogonally. A decoder may be used to select a particular row or column of the set of parallel registers to perform register operations in a parallel fashion corresponding to the selected row or in an orthogonal fashion corresponding to the selected column. Thus, when a particular row is selected, a register operation may be carried out for each bit of the selected row to produce a parallel register output, such as by reading/writing each bit of the selected row to a parallel register. On the other hand, when a particular column is selected, a register operation may be carried out for each bit of the selected column, such as by reading/writing each bit of the selected column to an orthogonal register. The orthogonal register access allows for fast and efficient access to a particular bit in the set of parallel registers.
    Type: Application
    Filed: September 21, 2007
    Publication date: October 23, 2008
    Applicant: CYPRESS SEMICONDUCTOR CORPORATION
    Inventors: Timothy Williams, Gregory John Verge, Dennis Seguine
  • Patent number: 7424594
    Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.
    Type: Grant
    Filed: June 3, 2004
    Date of Patent: September 9, 2008
    Assignee: Altera Corporation
    Inventors: Nikos P. Pitsianis, Gerald George Pechanek, Ricardo Rodriguez
  • Publication number: 20080177982
    Abstract: The present invention provides a memory device and a method for accessing the memory device thereof. The memory device comprises an address encoding selector for selecting one of plurality of encoding circuits which encodes a first address into a second address, and a data decoding selector for selecting one of plurality of decoding circuits which decodes a first data corresponding to the second address into a second data and a non-volatile memory, coupled to address encoding selector and the data decoding selector, for storing the first data. The method for accessing the memory device comprises encoding a first address into a second address by an address encoding selector, and decoding a first data corresponding to the second address into a second data by a data decoding selector, wherein the first data being stored in the non-volatile memory.
    Type: Application
    Filed: June 4, 2007
    Publication date: July 24, 2008
    Applicant: HOLTEK SEMICONDUCTOR INC.
    Inventors: Chuen-An Lin, Ching-Tsung Tung
  • Patent number: 7366352
    Abstract: A method and apparatus for determining a closest match of N input patterns relative to R reference patterns using K processing units. Each of a set of input patterns are loaded into the K processing units. One of the Reference patterns is sequentially loaded into each of the processing units and a distance defining the similarity between the reference pattern and each of the input patterns is calculated. A present calculated distance replaces its corresponding stored present minimum distance if it is has a smaller value. After the R reference patterns have been processed the minimum distance and its corresponding identification for all N input patterns is determined without merging outputs. The minimum distances and the identifications may be read either in parallel or serially. The apparatus is easily scalable by adding processors. The number of reference patterns may be easily increased without altering system configuration.
    Type: Grant
    Filed: March 20, 2003
    Date of Patent: April 29, 2008
    Assignee: International Business Machines Corporation
    Inventors: Kerry A. Kravec, Ali G. Saidi, Jan M. Slyfield, Pascal R. Tannhof
  • Publication number: 20080077774
    Abstract: A technique includes using multiple processing cores of a semiconductor package to perform functions directed to booting up a computer system.
    Type: Application
    Filed: September 26, 2006
    Publication date: March 27, 2008
    Inventors: Lyle E. Cool, Vincent J. Zimmer
  • Patent number: 7340589
    Abstract: The data processing device and electronic equipment of the present invention perform pipeline control and include a fetch circuit which fetches instruction codes of a plurality of instructions in instruction queues. A prefix instruction decoder circuit performs a decode processing only on a prefix instruction. The prefix instruction decoder circuit receives the instruction code before decoding, judges whether or not the instruction is a given prefix instruction, and causes a target instruction to modify an information register to store information necessary for decoding a target instruction when the instruction is the given prefix instruction. A decoder circuit receives each of the instruction codes of the instructions other than the prefix instruction as a decode instruction and decodes the decode instruction. When the decode instruction is a target instruction, the target instruction modified by the prefix instruction is decoded based on the target instruction modifying information.
    Type: Grant
    Filed: June 20, 2003
    Date of Patent: March 4, 2008
    Assignee: Seiko Epson Corporation
    Inventor: Makoto Kudo
  • Publication number: 20070294514
    Abstract: To provide a technique to reduce power consumption when carrying out image processing by processors. For the purpose of this, for example, a means for specifying a two-dimensional source register and destination register is provided in an operand of an instruction, and the processor includes a means which executes calculation using a plurality of source registers in a plurality of cycles and obtains a plurality of destinations. Moreover, in an instruction to obtain a destination using a plurality of source registers and consuming a plurality of cycles, a data rounding processing part is connected to a final stage of a pipeline. With such configurations, the power consumed when reading an instruction memory is reduced by reducing the access frequency to the instruction memory, for example.
    Type: Application
    Filed: March 21, 2007
    Publication date: December 20, 2007
    Inventors: Koji Hosogi, Masakazu Ehama, Hiroaki Nakata, Kenichi Iwata, Seiji Mochizuki, Takafumi Yuasa, Yukifumi Kobayashi, Tetsuya Shibayama, Hiroshi Ueda, Masaki Nobori
  • Patent number: 7305542
    Abstract: Speculatively decoding instruction lengths in order to increase instruction throughput. Instructions are speculatively decoded within a pipelined microprocessor architecture such that up to four instruction lengths may be decoded within a maximum of two processor clock cycles.
    Type: Grant
    Filed: June 25, 2002
    Date of Patent: December 4, 2007
    Assignee: Intel Corporation
    Inventor: Venkateswara Rao Madduri
  • Patent number: 7289142
    Abstract: A monolithic integrated circuit includes programmable processing circuitry. The processing circuitry includes four programmable circuitry elements. Switching circuitry is operatively connected to the programmable circuitry elements and is configured to provide data communication between the circuitry elements. An image sensor interface is connected to the processing circuitry and is configured to receive signals from an image sensor and to pass data representing the signals to the programmable processing circuitry.
    Type: Grant
    Filed: October 14, 2003
    Date of Patent: October 30, 2007
    Assignee: Silverbrook Research Pty Ltd
    Inventor: Kia Silverbrook
  • Patent number: 7243214
    Abstract: According to some embodiments, a method determining a number of stages associated with an instruction to be executed via a processor pipeline, determining a number of stages associated with a subsequent instruction, and stalling the pipeline based on the number of stages associated with the instruction to be executed and the number of stages associated with the subsequent instruction is provided.
    Type: Grant
    Filed: April 21, 2003
    Date of Patent: July 10, 2007
    Assignee: Intel Corporation
    Inventors: Niall D. McDonnell, John Wishneusky
  • Patent number: 7206921
    Abstract: A processor may include an instruction decoder to decode macroinstructions into micro-operations. In some embodiments, the instruction decoder may include a first decoder and a second decoder. The first decoder may decode a macroinstruction having SSE data type operands into a laminated micro-operation, and may generate unlamination information for the laminated micro-operation. The second decoder may generate from the laminated micro-operation and the unlamination information two or more micro-operations, where operands of the two or more micro-operations each correspond to a half of one of the SSE operands of the macroinstruction.
    Type: Grant
    Filed: April 7, 2003
    Date of Patent: April 17, 2007
    Assignee: Intel Corporation
    Inventors: Zeev Sperber, Robert Valentine, Simcha Gochman
  • Patent number: 7180893
    Abstract: A packet header processing engine includes a level 2 (L2) header generation unit and a level 3 (L3) header generation unit. The L2 and L3 header generation units are implemented in parallel with one another. The L2 generation unit writes L2 header information to a first buffer and the L3 generation unit writes L3 header information to a second buffer. When both the L2 and L3 generation units complete their operations for a particular packet, a build component combines the generated L2 and L3 header information from the buffers to form a complete packet header.
    Type: Grant
    Filed: March 22, 2002
    Date of Patent: February 20, 2007
    Assignee: Juniper Networks, Inc.
    Inventors: Pradeep Sindhu, Raymond M. Lim, Jeffrey G. Libby
  • Patent number: 7143200
    Abstract: A semiconductor integrated circuit to be connected to a PCI bus, having a configuration register. The size of an address space mapped to the semiconductor integrated circuit depends on the size of readable and writable region (Fv) of a base address register (30) that the configuration register has. The size of the readable and writable region of the base address register can be changed by a mask circuit (31). The size of a local address space can be set to be variable according to the number of mask bits specified by a mask signal. For example, also in the case where a plurality of PCI devices are used, a memory space mapped to each device can be selectively reduced in size and as such, it is also possible to cope with the case where finite resources are mapped to many PCI devices to construct a system.
    Type: Grant
    Filed: May 20, 2004
    Date of Patent: November 28, 2006
    Assignee: Hitachi, Ltd.
    Inventor: Katsuichi Tomobe
  • Patent number: 7111151
    Abstract: A microprocessor, method and signal-bearing medium for storing a program for executing the method, includes a microcode unit for outputting control signals, for each of a plurality of instructions, required by the microprocessor for executing the instructions. The microcode unit includes an instruction address input for receiving an instruction address, a control variable input for receiving a control variable corresponding to a current state of the microprocessor, a control signal input for receiving all of the control signals output by the microcode unit for an immediately preceding instruction, and a plurality of embedded logic circuits each dedicated for evaluating one unique type of instruction received by the microcode unit.
    Type: Grant
    Filed: March 14, 2001
    Date of Patent: September 19, 2006
    Assignee: International Business Machines Corporation
    Inventors: William P. Moore, Sebastian T. Ventrone
  • Patent number: 7103755
    Abstract: An apparatus for efficient parallel executing instruction avoiding the usage of cross bypasses, the apparatus including an instruction buffer for storing instructions, of decoders for decoding, in parallel, the instructions which simultaneously issue from the instruction buffer, executing units for executing the instructions decoded in the decoders, and an instruction-issuing controlling means for controlling the issuing of the instructions in such a way that, when the instructions are executed, one of the plural executing units executes instructions more frequently than the rest of the plural executing units. The apparatus is preferably incorporated in an information processor to superscalar or out-of-order instruction execution.
    Type: Grant
    Filed: January 10, 2003
    Date of Patent: September 5, 2006
    Assignee: Fujitsu Limited
    Inventors: Susumu Akiu, Masaki Ukai, Toshio Yoshida
  • Patent number: 7069555
    Abstract: Systems and methods perform super-region instruction scheduling that increases the instruction level parallelism for computer programs. A compiler performs data flow analysis and memory interference analysis on the code to determine data dependencies between entities such as registers and memory locations. A region tree is constructed, where the region tree contains a single entry block and a single exit block, with potential intervening blocks representing different control flows through the region. Instructions within blocks are moved to predecessor blocks when there are no dependencies on the instruction to be moved, and when the move results in greater opportunity for instruction level parallelism. Redundant instructions from multiple paths can be merged into a single instruction during the process of scheduling. In addition, if a dependency can be removed the method transforms the instruction into an instruction that can be moved to a block having available resources.
    Type: Grant
    Filed: September 29, 2000
    Date of Patent: June 27, 2006
    Assignee: Microsoft Corporation
    Inventor: Ten H. Tzen
  • Patent number: 7007031
    Abstract: System and method of data unit management in a decoding system employing a decoding pipeline. Each incoming data unit is assigned a memory element and is stored in the assigned memory element. Each decoding module gets the data to be operated on, as well as the control data, for a given data unit from the assigned memory element. Each decoding module, after performing its decoding operations on the data unit, deposits the newly processed data back into the same memory element. In one embodiment, the assigned memory locations comprise a header portion for holding the control data corresponding to the data unit and a data portion for holding the substantive data of the data unit. The header information is written to the header portion of the assigned memory element once and accessed by the various decoding modules throughout the decoding pipeline as needed. The data portion of memory is used/shared by multiple decoding modules.
    Type: Grant
    Filed: April 1, 2002
    Date of Patent: February 28, 2006
    Assignee: Broadcom Corporation
    Inventors: Alexander G. MacInnis, Jose′ R. Alvarez, Sheng Zhong, Xiaodong Xie, Vivian Hsiun
  • Patent number: 6970998
    Abstract: In an embodiment, a method comprises receiving a first instruction and a second instruction, where the second instruction specifies that a destination address of the first instruction should be replaced with a destination address of the second instruction. The method also includes decoding the first instruction and the second instruction. The decoding comprises replacing a destination address of the first instruction with an address provided by the second instruction, upon determining that the second instruction is a suffix instructions to the first instruction.
    Type: Grant
    Filed: October 7, 2002
    Date of Patent: November 29, 2005
    Assignee: Redback Networks Inc.
    Inventor: John G. Favor
  • Patent number: 6968444
    Abstract: A microprocessor employing a fixed position dispatch unit. The microprocessor includes a plurality of execution units each corresponding to an issue position and configured to execute a common subset of instructions. At least a first one of the execution units includes extended logic for executing a designated instruction that others of the execution units may be incapable of executing. The microprocessor also includes a plurality of decoders coupled to the plurality of execution units. The plurality of decoders may provide positional information to cause the designated instruction to be routed to the first execution unit. Further, the microprocessor includes a dispatch control unit configured to dispatch during a dispatch cycle, the designated instruction for execution by the first execution unit based upon the positional information. The dispatch control unit may further dispatch one or more instructions within the common subset of instructions during the same dispatch cycle.
    Type: Grant
    Filed: November 4, 2002
    Date of Patent: November 22, 2005
    Assignee: Advanced Micro Devices, Inc.
    Inventors: David E. Kroesche, Michael T. Clark
  • Patent number: 6959375
    Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.
    Type: Grant
    Filed: October 29, 2002
    Date of Patent: October 25, 2005
    Assignee: Seiko Epson Corporation
    Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
  • Patent number: 6944749
    Abstract: A method for decoding instructions in an execution package with a processor includes using an assembler to assemble instructions into different execution packages. Each instruction has an identification segment and an instruction segment. The method also includes using the assembler to reorder the instructions by separating identification segments from instruction segments, grouping all identification segments of the execution package together, and grouping all instruction segments of the execution package together. The method uses the processor to decode identification segments of the instructions at the same time, and adds a length of each identification segment together to calculate a total length of the execution package.
    Type: Grant
    Filed: July 29, 2002
    Date of Patent: September 13, 2005
    Assignee: Faraday Technology Corp.
    Inventor: Shan-Chyun Ku
  • Patent number: 6907518
    Abstract: For use in a processor having a first number of decode units for decoding an ordered stream of floating point instructions, a floating point unit (FPU) for receiving decoded ones of the floating point instructions and a method of processing the decoded ones of the floating point instructions. In one embodiment, the FPU includes: (1) a second number of floating point pipelines that execute the floating point instructions, the second number being at least one and less than the first number, the floating point pipeline having a load unit, an execution core and a store unit, (2) a floating point checkpoint buffer, coupled to the decode units, that queues the decoded ones of the floating point instructions for allocation to the floating point pipelines and (3) a floating point register file, coupled to and cooperable with the floating point checkpoint buffer, that preserves states of the execution core to allow the floating point pipelines to execute the floating point instructions out of order.
    Type: Grant
    Filed: June 16, 2003
    Date of Patent: June 14, 2005
    Assignee: National Semiconductor Corporation
    Inventors: Jeffrey Lohman, Nicholas Samra, Ram Gummadi
  • Patent number: 6904514
    Abstract: An instruction set is provided which has a first field for describing an execution instruction for designating content of an operation or data processing that is executed in at least one processing unit forming a data processing system, and a second field for describing preparation information for setting the processing unit to such a state that is ready to execute an operation or data processing that is executed according to the execution instruction, thereby making it possible to provide a control program having the instruction set in which preparation information independent of the execution instruction described in the first field is described in the second field. Accordingly preparation for execution of the subsequent execution instruction is made based on the preparation information. In the instruction set, since destination of branch instruction is described in the second field and is known in advance, the problems that cannot be solved with a conventional instruction set can be solved.
    Type: Grant
    Filed: August 30, 2000
    Date of Patent: June 7, 2005
    Assignee: IPFlex Inc.
    Inventor: Tomoyoshi Sato
  • Patent number: 6889313
    Abstract: A decode unit comprises first and second decoders respectively connected to receive bit sequences of first and second predetermined lengths. The first and second decoders operate in parallel to generate respective outputs. A switch selects one of the outputs in dependence on an instruction mode of the processor which governs the length of the bit sequence which is actually required to be decoded.
    Type: Grant
    Filed: May 2, 2000
    Date of Patent: May 3, 2005
    Assignee: STMicroelectronics S.A.
    Inventors: Andrew Cofler, Stéphane Bouvier, Laurent Wojcieszak
  • Publication number: 20040230773
    Abstract: In a computer system, a method and apparatus for dispatching and executing multi-cycle and complex instructions. The method results in maximum performance for such without impacting other areas in the processor such as decode, grouping or dispatch units. This invention allows multi-cycle and complex instructions to be dispatched to one port but executed in multiple execution pipes without cracking the instruction and without limiting it to a single execution pipe. Some control signals are generated in the dispatch unit and dispatched with the instruction to the Fixed Point Unit (FXU). The FXU logic then execute these instructions on the available FXU pipes. This method results in optimum performance with little or no other complications. The presented technique places the flexibility of how these instructions will be executed in the FXU, where the actual execution takes place, instead of in the instruction decode or dispatch units or cracking by the compiler.
    Type: Application
    Filed: May 12, 2003
    Publication date: November 18, 2004
    Applicant: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, John G. Rell, Timothy J. Slegel
  • Patent number: 6820191
    Abstract: An apparatus and method for executing an instruction with a register bit mask for transferring data between a plurality of registers and memory inside a processor is provided. The method includes adding the N bits in the N-bit decode information together to form an initial count value, and generating a plurality of register identification (ID) numbers equivalent in number to the initial count value. The register ID numbers correspond to the positions in the N-bit decode information that has a bit value ‘1’. According to the register ID number, a link is created between the plurality of registers corresponding to the register ID numbers and a memory unit so that the memory unit and the registers are free to exchange stored data.
    Type: Grant
    Filed: December 28, 2000
    Date of Patent: November 16, 2004
    Assignee: Faraday Technology Corp.
    Inventors: Calvin Guey, Shyh-An Chi, Yu-Min Wang
  • Patent number: 6766438
    Abstract: The present invention provides a RISC processor with a debug interface unit that enables the external replication of the data processing sequence within a RISC processor for debug purposes. The data exchanged between the sequence controller and the instruction decoder are intermediately stored and forwarded via a free bus line to an interface unit. In the interface unit, the data pending at its inputs are forwarded to defined outputs of the interface. This allows the register contents to be co-read in real time. Accordingly, all the required information to perform an efficient error search are displayed for an outside operator who may then monitor the data processing sequence and conduct an error search.
    Type: Grant
    Filed: October 30, 2000
    Date of Patent: July 20, 2004
    Assignee: Siemens Aktiengesellschaft
    Inventor: Peter Haas
  • Patent number: 6745302
    Abstract: A semiconductor memory enabling a read modify write operation of data, comprising: a memory cell array including a plurality of memory cells arranged in a matrix and able to be written with and read out data; a read address decoding means for independently decoding an address of a read memory cell in response to a read address; a write address decoding means for independently decoding an address of a write memory cell in response to a write address; a data reading means for reading data of a memory cell addressed by the read address decoding means; a data writing means for writing data to a memory cell addressed by the write address decoding means; and an address delay means by which a write address decoded by the write address decoding means is delayed by a predetermined time from a read address decoded by the read address decoding means, wherein the predetermined time is set as a predetermined plurality of times of basic synchronization pulse periods so that the data read modify write operation is accomplis
    Type: Grant
    Filed: September 13, 1999
    Date of Patent: June 1, 2004
    Assignee: Sony Corporation
    Inventors: Kazuo Taniguchi, Masaharu Yoshimori
  • Patent number: 6745384
    Abstract: A method and system for anticipatory optimization of computer programs. The system generates code for a program that is specified using programming-language-defined computational constructs and user-defined, domain-specific computational constructs, the computational constructs including high-level operands that are domain-specific composites of low-level computational constructs. The system generates an abstract syntax tree (AST) representation of the program in a loop merging process. The AST has nodes representing the computational constructs of the program and abstract optimization tags for folding of the composites. A composite folding process is applied to the AST according to the optimization tags to generate optimized code for the program.
    Type: Grant
    Filed: September 21, 2000
    Date of Patent: June 1, 2004
    Assignee: Microsoft Corporation
    Inventor: Ted J. Biggerstaff
  • Patent number: 6745319
    Abstract: A data processing system is provided with a digital signal processor (DSP) which has a shuffle instruction for shuffling a source operand (600) and storing the shuffled result in a selected destination register (610). A shuffled result is formed by interleaving bits from a first source operand portion with bits from a second operand portion. A de-interleave and pack (DEAL) instruction is provided for de-interleaving a source operand. The shuffle instruction and the DEAL instruction have an exactly inverse effect. The DSP includes swizzle circuitry that performs interleaving or de-interleaving in a single execution phase.
    Type: Grant
    Filed: October 31, 2000
    Date of Patent: June 1, 2004
    Assignee: Texas Instruments Incorporated
    Inventors: Keith Balmer, David Hoyle, Lewis Nardini
  • Patent number: 6742109
    Abstract: One embodiment of the present invention provides a system for executing variable-size computer instructions, wherein a variable-size computer instruction includes an action component that specifies an operation to be performed and a data component of variable size that specifies data associated with the operation. The system operates by first retrieving the variable-size computer instruction from a computing device's memory. The system then decodes the variable-size computer instruction by separating the variable-size computer instruction into the action component and the data component. Next, the system stores the action component in a first store and the data component in a second store so they can be reused without repeated decoding. Finally, the system provides a first flow path for the action component and a second flow path for the data component.
    Type: Grant
    Filed: November 30, 2000
    Date of Patent: May 25, 2004
    Assignee: Sun Microsystems, Inc.
    Inventors: Stepan Sokolov, David Wallman
  • Patent number: 6742110
    Abstract: A processing engine 10 for executing instructions in parallel comprises an instruction buffer 600 for holding at least two instructions, with the first instruction 602 in a first position and the second instruction 604 in a second position. A first decoder 612 provides decoding of the first instruction and generates first control signals. The first control signals include first resource control signals, first address generation control signals, and a first validity signal indicative of the validity of the first instruction in the first position. A second decoder 614 provides decoding of the second instruction and generates second control signals. The second control signals include second resource control signals, second address generation control signals, and a second validity signal indicative of the validity of the second instruction in the second position.
    Type: Grant
    Filed: October 1, 1999
    Date of Patent: May 25, 2004
    Assignee: Texas Instruments Incorporated
    Inventors: Karim Djafarian, Gilbert Laurenti, Vincent Gillet
  • Publication number: 20040093483
    Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.
    Type: Application
    Filed: October 31, 2003
    Publication date: May 13, 2004
    Applicant: Seiko Epson Corporation
    Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
  • Patent number: 6735685
    Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load/store unit is provided whose main purpose is to make load requests out-of-order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out-of-order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.
    Type: Grant
    Filed: June 21, 1999
    Date of Patent: May 11, 2004
    Assignee: Seiko Epson Corporation
    Inventors: Cheryl D. Senter, Johannes Wang
  • Patent number: 6718457
    Abstract: A processor has an improved architecture for multiple-thread operation on the basis of a highly parallel structure including multiple independent parallel execution paths for executing in parallel across threads and a multiple-instruction parallel pathway within a thread. The multiple independent parallel execution paths include functional units that execute an instruction set including special data-handling instructions that are advantageous in a multiple-thread environment.
    Type: Grant
    Filed: December 3, 1998
    Date of Patent: April 6, 2004
    Assignee: Sun Microsystems, Inc.
    Inventors: Marc Tremblay, William Joy
  • Patent number: 6694426
    Abstract: A method and apparatus are disclosed for staggering execution of an instruction. According to one embodiment of the invention, a single macro instruction is received wherein the single macro instruction specifies at least two logical registers and wherein the two logical registers respectively store a first and second packed data operands having corresponding data elements. An operation specified by the single macro instruction is then performed independently on a first and second plurality of the corresponding data elements from said first and second packed data operands at different times using the same circuit to independently generate a first and second plurality of resulting data elements. The first and second plurality of resulting data elements are stored in a single logical register as a third packed data operand.
    Type: Grant
    Filed: June 6, 2002
    Date of Patent: February 17, 2004
    Assignee: Intel Corporation
    Inventors: Patrice Roussel, Glenn J. Hinton, Shreekant S. Thakkar, Brent R. Boswell, Karol F. Menezes
  • Patent number: 6684322
    Abstract: A system and method for decoding the length of a macro instruction is described. In one embodiment, the system comprises an opcode-plus-immediate logic unit to generate a first length value, the first length value comprising a length of an opcode plus a length of intermediate data. A memory-length logic unit generates a second length value, the second length value comprising a potential length of a memory displacement, the opcode-plus-immediate logic unit and memory-length logic unit operating in parallel. In addition, the system comprises a length-summation logic unit to sum the first length value and the second length value if the second length value is present.
    Type: Grant
    Filed: August 30, 1999
    Date of Patent: January 27, 2004
    Assignee: Intel Corporation
    Inventors: Fred Gruner, Mike Morrison, Kushagra Vaid
  • Patent number: RE38679
    Abstract: A second decoder (114) of an instruction decode unit (119) decodes an operation code for a multiply-add operation, and a second operation unit (117) receives two data stored in a register file (115) to perform the multiply-add operation. In parallel with the operations of the second decoder (114) and the second operation unit (117), a first decoder (113) of the instruction decode unit (119) decodes an operation code for 2 data load, and an operand access unit (104) causes two data (e.g., n bits each) stored in an internal data memory (105) to be transferred in parallel in the form of combined 2n-bit data to a first operation unit (116). Then, two predetermined registers of the register file (115) store the respective n-bit data from the first operation unit (116).
    Type: Grant
    Filed: May 4, 2001
    Date of Patent: December 28, 2004
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventors: Masahito Matsuo, Toyohiko Yoshida
  • Patent number: RE41751
    Abstract: A processor can decode short instructions with a word length equal to one unit field and long instructions with a word length equal to two unit fields. An opcode of each kind of instruction is arranged into the first unit field assigned to the instruction. The number of instructions to be executed by the processor in parallel is s. When the ratio of short to long instructions is s-1:1, the s-1 short instructions are assigned to the first unit field to the s-1tA unit field in the parallel execution code, and the long instruction is assigned to the sth unit field to the (s+k?1)th unit field in the same parallel execution code.
    Type: Grant
    Filed: November 24, 2003
    Date of Patent: September 21, 2010
    Assignee: Panasonic Corporation
    Inventors: Taketo Heishi, Tetsuya Tanaka, Nobuo Higaki, Shuichi Takayama, Kensuke Odani