Decoding By Plural Parallel Decoders Patents (Class 712/212)

CONTROLLER WHICH CONTROLS OPERATION OF NONVOLATILE SEMICONDUCTOR MEMORY AND SEMICONDUCTOR MEMORY DEVICE INCLUDING NONVOLATILE SEMICONDUCTOR MEMORY AND CONTROLLER THEREFORE

Publication number: 20100241791

Abstract: A controller includes an instruction table memory, a program counter, a first decoder, and a first executing unit. The instruction table memory stores an instruction code obtained by coding a sequence to access a nonvolatile semiconductor memory. A read address in the instruction table memory is set to the program counter. The first decoder decodes the instruction code read from the instruction table memory to output a first decode signal. The first executing unit executes access to the nonvolatile semiconductor memory on the basis of the first decode signal output from the first decoder.

Type: Application

Filed: September 4, 2009

Publication date: September 23, 2010

Inventors: Tarou Iwashiro, Takahide Nishiyama, Seiichi Tomita

Processing pipeline having parallel dispatch and method thereof

Patent number: 7793080

Abstract: One or more processor cores of a multiple-core processing device each can utilize a processing pipeline having a plurality of execution units (e.g., integer execution units or floating point units) that together share a pre-execution front-end having instruction fetch, decode and dispatch resources. Further, one or more of the processor cores each can implement dispatch resources configured to dispatch multiple instructions in parallel to multiple corresponding execution units via separate dispatch buses. The dispatch resources further can opportunistically decode and dispatch instruction operations from multiple threads in parallel so as to increase the dispatch bandwidth. Moreover, some or all of the stages of the processing pipelines of one or more of the processor cores can be configured to implement independent thread selection for the corresponding stage.

Type: Grant

Filed: December 31, 2007

Date of Patent: September 7, 2010

Inventors: Gene Shen, Sean Lie

MEMORY ACCESS CONTROL SYSTEM, MEMORY ACCESS CONTROL METHOD, AND PROGRAM THEREOF

Publication number: 20100223431

Abstract: In a multi-core processor of a shared-memory type, deterioration in the data processing capability caused by competitions of memory accesses from a plurality of processors is suppressed effectively. In a memory access controlling system for controlling accesses to a cache memory in a data read-ahead process when the multi-core processor of a shared-memory type performs a task including a data read-ahead thread for executing data read-ahead and a parallel execution thread for performing an execution process in parallel with the data read-ahead, the system includes a data read-ahead controller which controls an interval between data read-ahead processes in the data read-ahead thread adaptive to a data flow which varies corresponding to an input value of the parallel process in the parallel execution thread. By controlling the interval between the data read-ahead processes, competitions of memory accesses in the multi-core processor are suppressed.

Type: Application

Filed: February 4, 2008

Publication date: September 2, 2010

Inventor: Kosuke Nishihara

MULTITHREADED PROCESSOR WITH MULTIPLE CONCURRENT PIPELINES PER THREAD

Publication number: 20100199073

Abstract: A multithreaded processor comprises a plurality of hardware thread units, an instruction decoder coupled to the thread units for decoding instructions received therefrom, and a plurality of execution units for executing the decoded instructions. The multithreaded processor is configured for controlling an instruction issuance sequence for threads associated with respective ones of the hardware thread units. On a given processor clock cycle, only a designated one of the threads is permitted to issue one or more instructions, but the designated thread that is permitted to issue instructions varies over a plurality of clock cycles in accordance with the instruction issuance sequence. The instructions are pipelined in a manner which permits at least a given one of the threads to support multiple concurrent instruction pipelines.

Type: Application

Filed: October 15, 2009

Publication date: August 5, 2010

Inventors: Erdem Hokenek, Mayan Moudgill, Michael J. Schulte, C. John Glossner
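
The issuance rule in the abstract above (one designated thread per clock cycle, with the designation changing according to an issuance sequence) can be sketched as follows. This is a minimal illustration only: the round-robin order, the thread indices, and the name `issue_schedule` are assumptions, not details taken from the application.

```python
from itertools import cycle, islice


def issue_schedule(sequence, num_cycles):
    """Return the thread permitted to issue on each clock cycle.

    `sequence` is the repeating instruction issuance sequence. A fixed
    round-robin order is an assumption made for this sketch.
    """
    return list(islice(cycle(sequence), num_cycles))


# Four hypothetical hardware thread units, identified by index.
schedule = issue_schedule([0, 1, 2, 3], num_cycles=6)
print(schedule)  # [0, 1, 2, 3, 0, 1]
```

Each entry names the only thread allowed to issue on that cycle; the designated thread may issue one or more instructions, which this sketch does not model.
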

PROCESSOR FOR EXECUTING HIGHLY EFFICIENT VLIW

Publication number: 20100169614

Abstract: A 32-bit instruction 50 is composed of a 4-bit format field 51, a 4-bit operation field 52, and two 12-bit operation fields 59 and 60. The 4-bit operation field 52 can only include (1) an operation code “cc” that indicates a branch operation which uses a stored value of the implicitly indicated constant register 36 as the branch address, or (2) a constant “const”. The content of the 4-bit operation field 52 is specified by a format code provided in the format field 51.

Type: Application

Filed: February 12, 2010

Publication date: July 1, 2010

Applicant: Panasonic Corporation

Inventors: Shuichi TAKAYAMA, Nobuo Higaki
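
The field split described in the abstract above (4 + 4 + 12 + 12 = 32 bits) can be illustrated with a small decoder. The abstract gives only the field widths, so the bit positions used here, the dictionary keys, and the name `split_instruction` are assumptions for illustration.

```python
def split_instruction(word):
    """Split a 32-bit instruction into the four fields named in the abstract.

    Assumed layout (not stated in the abstract), most significant bits first:
    [31:28] format field, [27:24] 4-bit operation field,
    [23:12] and [11:0] the two 12-bit operation fields.
    """
    if not 0 <= word < (1 << 32):
        raise ValueError("instruction must fit in 32 bits")
    return {
        "format": (word >> 28) & 0xF,
        "op4": (word >> 24) & 0xF,
        "op12_first": (word >> 12) & 0xFFF,
        "op12_second": word & 0xFFF,
    }


fields = split_instruction(0x1ABCDEF0)
print({name: hex(value) for name, value in fields.items()})
# {'format': '0x1', 'op4': '0xa', 'op12_first': '0xbcd', 'op12_second': '0xef0'}
```

Whether the 4-bit field is then read as the "cc" branch operation code or as a constant depends on the format code; the abstract does not list the code values, so the sketch stops at field extraction.
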

Technique for implementing a security algorithm

Patent number: 7747020

Abstract: Performing a hash algorithm in a processor architecture to alleviate performance bottlenecks and improve overall algorithm performance. In one embodiment of the invention, the hash algorithm is pipelined within the processor architecture.

Type: Grant

Filed: December 4, 2003

Date of Patent: June 29, 2010

Assignee: Intel Corporation

Inventor: Wajdi K. Feghali

Methods and systems for ordering instructions using future values

Patent number: 7747993

Abstract: A method of ordering instructions. The method can include placing a first instruction that consumes a value of an object before a second instruction that produces the value of the object such that the first instruction is processed before the second instruction and a physical location is allocated to the value of the object upon processing the first instruction.

Type: Grant

Filed: December 30, 2004

Date of Patent: June 29, 2010

Assignee: Michigan Technological University

Inventor: Soner Onder

Apparatus for performing fast closest match in pattern recognition

Patent number: 7724963

Abstract: A method and apparatus for determining a closest match of N input patterns relative to R reference patterns using K processing units. Each of a set of input patterns is loaded into the K processing units. One of the reference patterns is sequentially loaded into each of the processing units and a distance defining the similarity between the reference pattern and each of the input patterns is calculated. A present calculated distance replaces its corresponding stored present minimum distance if it has a smaller value. After the R reference patterns have been processed, the minimum distance and its corresponding identification for all N input patterns are determined without merging outputs. The minimum distances and the identifications may be read either in parallel or serially. The apparatus is easily scalable by adding processors. The number of reference patterns may be easily increased without altering system configuration.

Type: Grant

Filed: February 22, 2008

Date of Patent: May 25, 2010

Assignee: International Business Machines Corporation

Inventors: Kerry A. Kravec, Ali G. Saidi, Jan M. Slyfield, Pascal R. Tannhof
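
The scheme in the abstract above can be sketched in software: each processing unit holds one input pattern, every reference pattern is presented to all units in turn, and each unit keeps only its own running minimum. The Manhattan distance, the one-input-per-unit mapping (K = N), and the names `ProcessingUnit` and `closest_matches` are assumptions for this sketch; the abstract does not specify them.

```python
def manhattan(a, b):
    """Distance between two equal-length patterns (the metric is an assumption)."""
    return sum(abs(x - y) for x, y in zip(a, b))


class ProcessingUnit:
    """Holds one input pattern and its running minimum distance."""

    def __init__(self, pattern):
        self.pattern = pattern
        self.min_distance = float("inf")
        self.best_id = None

    def compare(self, ref_id, reference):
        # Replace the stored minimum only when the new distance is smaller.
        distance = manhattan(self.pattern, reference)
        if distance < self.min_distance:
            self.min_distance = distance
            self.best_id = ref_id


def closest_matches(inputs, references):
    units = [ProcessingUnit(p) for p in inputs]
    for ref_id, reference in enumerate(references):
        for unit in units:  # in hardware, all units would compare at once
            unit.compare(ref_id, reference)
    # Each unit already holds its own answer, so no merge step is needed.
    return [(u.best_id, u.min_distance) for u in units]


inputs = [(1, 2, 3), (9, 9, 9)]
references = [(0, 0, 0), (1, 2, 4), (8, 9, 9)]
print(closest_matches(inputs, references))  # [(1, 1), (2, 1)]
```

Adding a reference pattern only lengthens the outer loop, and adding an input pattern only adds a unit, which mirrors the scalability claims in the abstract.
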

High-performance, superscalar-based computer system with out-of-order instruction execution

Patent number: 7721070

Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.

Type: Grant

Filed: September 22, 2008

Date of Patent: May 18, 2010

Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang

Method apparatus and system of performing one or more encryption and/or decryption operations

Patent number: 7496196

Abstract: Embodiments of the present invention provide a method and apparatus of performing on one or more bytes of an input data block at least one predetermined encryption or decryption operation.

Type: Grant

Filed: June 30, 2004

Date of Patent: February 24, 2009

Assignee: Intel Corporation

Inventors: Marc Jalfon, Boris E. Ginzburg

Microprocessor access of operand stack as a register file using native instructions

Patent number: 7478224

Abstract: A combined native (RISC or CISC) microprocessor and stack (Java™) machine are constructed so that Java™ VM instructions can be executed in hardware. Most Java™ instructions are executed directly, while more complex Java™ instructions, such as those manipulating Java™ objects, are executed as native microcode. In order for native microcode instructions to access the Java™ operand stack, a Java™ operand stack pointer points to the register file location that is the current top of the stack, while a remap bit in the status register indicates that registers specified in native instructions are remapped as the maximum Java™ operand stack pointer value minus the present value of the Java™ operand stack pointer.

Type: Grant

Filed: April 15, 2005

Date of Patent: January 13, 2009

Assignee: Atmel Corporation

Inventors: Oyvind Strom, Erik K. Renno, Kristian Monsen

ORTHOGONAL REGISTER ACCESS

Publication number: 20080263328

Abstract: Embodiments of the invention relate to a method and system for accessing a set of parallel registers orthogonally. A decoder may be used to select a particular row or column of the set of parallel registers to perform register operations in a parallel fashion corresponding to the selected row or in an orthogonal fashion corresponding to the selected column. Thus, when a particular row is selected, a register operation may be carried out for each bit of the selected row to produce a parallel register output, such as by reading/writing each bit of the selected row to a parallel register. On the other hand, when a particular column is selected, a register operation may be carried out for each bit of the selected column, such as by reading/writing each bit of the selected column to an orthogonal register. The orthogonal register access allows for fast and efficient access to a particular bit in the set of parallel registers.

Type: Application

Filed: September 21, 2007

Publication date: October 23, 2008

Applicant: CYPRESS SEMICONDUCTOR CORPORATION

Inventors: Timothy Williams, Gregory John Verge, Dennis Seguine
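
The row and column access described in the abstract above can be modeled by treating the set of parallel registers as a bit matrix. This is a software sketch under assumptions: the class name `RegisterBank`, its method names, and the choice to pack register i into bit i of the orthogonal result are all illustrative, not taken from the application.

```python
class RegisterBank:
    """A set of parallel registers viewed as a bit matrix (rows = registers)."""

    def __init__(self, num_registers, width):
        self.width = width
        self.rows = [0] * num_registers

    def read_row(self, row):
        """Parallel access: return the whole selected register."""
        return self.rows[row]

    def write_row(self, row, value):
        self.rows[row] = value & ((1 << self.width) - 1)

    def read_column(self, col):
        """Orthogonal access: gather bit `col` of every register.

        Register i contributes bit i of the result (an assumed packing).
        """
        return sum(((r >> col) & 1) << i for i, r in enumerate(self.rows))

    def write_column(self, col, value):
        """Orthogonal access: scatter bit i of `value` into register i."""
        for i in range(len(self.rows)):
            bit = (value >> i) & 1
            self.rows[i] = (self.rows[i] & ~(1 << col)) | (bit << col)


bank = RegisterBank(num_registers=4, width=8)
bank.write_row(0, 0b00000001)
bank.write_row(2, 0b00000001)
print(bin(bank.read_column(0)))  # 0b101
```

A single column read replaces four row reads followed by masking, which is the kind of access to a particular bit position that the abstract describes as fast and efficient.
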

Efficient complex multiplication and fast fourier transform (FFT) implementation on the ManArray architecture

Patent number: 7424594

Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.

Type: Grant

Filed: June 3, 2004

Date of Patent: September 9, 2008

Assignee: Altera Corporation

Inventors: Nikos P. Pitsianis, Gerald George Pechanek, Ricardo Rodriguez

Memory And Accessing Method Thereof

Publication number: 20080177982

Abstract: The present invention provides a memory device and a method for accessing the memory device. The memory device comprises an address encoding selector for selecting one of a plurality of encoding circuits which encodes a first address into a second address, a data decoding selector for selecting one of a plurality of decoding circuits which decodes first data corresponding to the second address into second data, and a non-volatile memory, coupled to the address encoding selector and the data decoding selector, for storing the first data. The method for accessing the memory device comprises encoding a first address into a second address by an address encoding selector, and decoding first data corresponding to the second address into second data by a data decoding selector, wherein the first data is stored in the non-volatile memory.

Type: Application

Filed: June 4, 2007

Publication date: July 24, 2008

Applicant: HOLTEK SEMICONDUCTOR INC.

Inventors: Chuen-An Lin, Ching-Tsung Tung

Method and apparatus for performing fast closest match in pattern recognition

Patent number: 7366352

Abstract: A method and apparatus for determining a closest match of N input patterns relative to R reference patterns using K processing units. Each of a set of input patterns is loaded into the K processing units. One of the reference patterns is sequentially loaded into each of the processing units and a distance defining the similarity between the reference pattern and each of the input patterns is calculated. A present calculated distance replaces its corresponding stored present minimum distance if it has a smaller value. After the R reference patterns have been processed, the minimum distance and its corresponding identification for all N input patterns are determined without merging outputs. The minimum distances and the identifications may be read either in parallel or serially. The apparatus is easily scalable by adding processors. The number of reference patterns may be easily increased without altering system configuration.

Type: Grant

Filed: March 20, 2003

Date of Patent: April 29, 2008

Assignee: International Business Machines Corporation

Inventors: Kerry A. Kravec, Ali G. Saidi, Jan M. Slyfield, Pascal R. Tannhof

Hierarchical parallelism for system initialization

Publication number: 20080077774

Abstract: A technique includes using multiple processing cores of a semiconductor package to perform functions directed to booting up a computer system.

Type: Application

Filed: September 26, 2006

Publication date: March 27, 2008

Inventors: Lyle E. Cool, Vincent J. Zimmer

Shift prefix instruction decoder for modifying register information necessary for decoding the target instruction

Patent number: 7340589

Abstract: The data processing device and electronic equipment of the present invention perform pipeline control and include a fetch circuit which fetches instruction codes of a plurality of instructions in instruction queues. A prefix instruction decoder circuit performs a decode processing only on a prefix instruction. The prefix instruction decoder circuit receives the instruction code before decoding, judges whether or not the instruction is a given prefix instruction, and causes a target instruction to modify an information register to store information necessary for decoding a target instruction when the instruction is the given prefix instruction. A decoder circuit receives each of the instruction codes of the instructions other than the prefix instruction as a decode instruction and decodes the decode instruction. When the decode instruction is a target instruction, the target instruction modified by the prefix instruction is decoded based on the target instruction modifying information.

Type: Grant

Filed: June 20, 2003

Date of Patent: March 4, 2008

Assignee: Seiko Epson Corporation

Inventor: Makoto Kudo

Picture Processing Engine and Picture Processing System

Publication number: 20070294514

Abstract: A technique is provided to reduce power consumption when processors carry out image processing. To this end, for example, a means for specifying a two-dimensional source register and destination register is provided in an operand of an instruction, and the processor includes a means which executes calculation using a plurality of source registers in a plurality of cycles and obtains a plurality of destinations. Moreover, in an instruction to obtain a destination using a plurality of source registers and consuming a plurality of cycles, a data rounding processing part is connected to a final stage of a pipeline. With such configurations, the power consumed when reading an instruction memory is reduced by reducing the access frequency to the instruction memory, for example.

Type: Application

Filed: March 21, 2007

Publication date: December 20, 2007

Inventors: Koji Hosogi, Masakazu Ehama, Hiroaki Nakata, Kenichi Iwata, Seiji Mochizuki, Takafumi Yuasa, Yukifumi Kobayashi, Tetsuya Shibayama, Hiroshi Ueda, Masaki Nobori

Instruction length decoder

Patent number: 7305542

Abstract: Speculatively decoding instruction lengths in order to increase instruction throughput. Instructions are speculatively decoded within a pipelined microprocessor architecture such that up to four instruction lengths may be decoded within a maximum of two processor clock cycles.

Type: Grant

Filed: June 25, 2002

Date of Patent: December 4, 2007

Assignee: Intel Corporation

Inventor: Venkateswara Rao Madduri

Monolithic integrated circuit having a number of programmable processing elements

Patent number: 7289142

Abstract: A monolithic integrated circuit includes programmable processing circuitry. The processing circuitry includes four programmable circuitry elements. Switching circuitry is operatively connected to the programmable circuitry elements and is configured to provide data communication between the circuitry elements. An image sensor interface is connected to the processing circuitry and is configured to receive signals from an image sensor and to pass data representing the signals to the programmable processing circuitry.

Type: Grant

Filed: October 14, 2003

Date of Patent: October 30, 2007

Assignee: Silverbrook Research Pty Ltd

Inventor: Kia Silverbrook

Stall optimization for an in-order, multi-stage processor pipeline which analyzes current and next instructions to determine if a stall is necessary

Patent number: 7243214

Abstract: According to some embodiments, a method is provided that includes determining a number of stages associated with an instruction to be executed via a processor pipeline, determining a number of stages associated with a subsequent instruction, and stalling the pipeline based on the number of stages associated with the instruction to be executed and the number of stages associated with the subsequent instruction.

Type: Grant

Filed: April 21, 2003

Date of Patent: July 10, 2007

Assignee: Intel Corporation

Inventors: Niall D. McDonnell, John Wishneusky

Micro-operation un-lamination

Patent number: 7206921

Abstract: A processor may include an instruction decoder to decode macroinstructions into micro-operations. In some embodiments, the instruction decoder may include a first decoder and a second decoder. The first decoder may decode a macroinstruction having SSE data type operands into a laminated micro-operation, and may generate unlamination information for the laminated micro-operation. The second decoder may generate from the laminated micro-operation and the unlamination information two or more micro-operations, where operands of the two or more micro-operations each correspond to a half of one of the SSE operands of the macroinstruction.

Type: Grant

Filed: April 7, 2003

Date of Patent: April 17, 2007

Assignee: Intel Corporation

Inventors: Zeev Sperber, Robert Valentine, Simcha Gochman

Parallel layer 2 and layer 3 processing components in a network router

Patent number: 7180893

Abstract: A packet header processing engine includes a level 2 (L2) header generation unit and a level 3 (L3) header generation unit. The L2 and L3 header generation units are implemented in parallel with one another. The L2 generation unit writes L2 header information to a first buffer and the L3 generation unit writes L3 header information to a second buffer. When both the L2 and L3 generation units complete their operations for a particular packet, a build component combines the generated L2 and L3 header information from the buffers to form a complete packet header.

Type: Grant

Filed: March 22, 2002

Date of Patent: February 20, 2007

Assignee: Juniper Networks, Inc.

Inventors: Pradeep Sindhu, Raymond M. Lim, Jeffrey G. Libby
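
The structure in the abstract above (two generation units running in parallel, each writing to its own buffer, and a build component that combines the results once both finish) can be sketched with two worker threads. The header contents below are placeholders, not real L2 or L3 header layouts, and the packet field names and function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_l2(packet):
    """Hypothetical L2 header generation: destination and source MAC."""
    return bytes.fromhex(packet["dst_mac"]) + bytes.fromhex(packet["src_mac"])


def generate_l3(packet):
    """Hypothetical L3 header generation: source and destination IPv4."""
    return bytes(packet["src_ip"]) + bytes(packet["dst_ip"])


def build_header(packet):
    """Run both generators in parallel, then combine their buffers."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        l2_future = pool.submit(generate_l2, packet)
        l3_future = pool.submit(generate_l3, packet)
        # The build step waits until both units have completed.
        l2_buffer = l2_future.result()
        l3_buffer = l3_future.result()
    return l2_buffer + l3_buffer


packet = {
    "dst_mac": "aabbccddeeff",
    "src_mac": "112233445566",
    "src_ip": (10, 0, 0, 1),
    "dst_ip": (10, 0, 0, 2),
}
print(build_header(packet).hex())
```

Keeping the two results in separate buffers means neither unit has to wait for the other before writing; only the final concatenation is ordered.
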

Semiconductor integrated circuit

Patent number: 7143200

Abstract: A semiconductor integrated circuit to be connected to a PCI bus, having a configuration register. The size of an address space mapped to the semiconductor integrated circuit depends on the size of readable and writable region (Fv) of a base address register (30) that the configuration register has. The size of the readable and writable region of the base address register can be changed by a mask circuit (31). The size of a local address space can be set to be variable according to the number of mask bits specified by a mask signal. For example, also in the case where a plurality of PCI devices are used, a memory space mapped to each device can be selectively reduced in size and as such, it is also possible to cope with the case where finite resources are mapped to many PCI devices to construct a system.

Type: Grant

Filed: May 20, 2004

Date of Patent: November 28, 2006

Assignee: Hitachi, Ltd.

Inventor: Katsuichi Tomobe
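
The relationship in the abstract above between the writable region of the base address register and the size of the mapped address space can be shown with the usual PCI sizing probe: write all ones, read back, and see which low bits stayed zero. The model below is a sketch under assumptions: a 32-bit memory BAR, a mask expressed as a count of hardwired low bits, and the names `BaseAddressRegister` and `probe_size` are all illustrative rather than taken from the patent.

```python
class BaseAddressRegister:
    """32-bit memory BAR whose low `mask_bits` bits always read as zero."""

    def __init__(self, mask_bits):
        if not 4 <= mask_bits < 32:
            raise ValueError("mask_bits must be in 4..31 for this sketch")
        # Only the bits above the masked region are readable and writable.
        self.writable = (0xFFFFFFFF << mask_bits) & 0xFFFFFFFF
        self.value = 0

    def write(self, data):
        self.value = data & self.writable

    def read(self):
        return self.value


def probe_size(bar):
    """Size the mapped region the way configuration software typically does."""
    bar.write(0xFFFFFFFF)
    readback = bar.read() & 0xFFFFFFF0  # ignore the low type bits
    return ((~readback) & 0xFFFFFFFF) + 1


print(probe_size(BaseAddressRegister(mask_bits=12)))  # 4096
print(probe_size(BaseAddressRegister(mask_bits=20)))  # 1048576
```

Raising or lowering the number of masked bits changes the size that the probe reports, which is how a variable mask would let many devices share a limited address space.
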

Microprocessor including microcode unit that only changes the value of control signals required for the current cycle operation for reduced power consumption and method therefor

Patent number: 7111151

Abstract: A microprocessor, method and signal-bearing medium for storing a program for executing the method, includes a microcode unit for outputting control signals, for each of a plurality of instructions, required by the microprocessor for executing the instructions. The microcode unit includes an instruction address input for receiving an instruction address, a control variable input for receiving a control variable corresponding to a current state of the microprocessor, a control signal input for receiving all of the control signals output by the microcode unit for an immediately preceding instruction, and a plurality of embedded logic circuits each dedicated for evaluating one unique type of instruction received by the microcode unit.

Type: Grant

Filed: March 14, 2001

Date of Patent: September 19, 2006

Assignee: International Business Machines Corporation

Inventors: William P. Moore, Sebastian T. Ventrone

Apparatus and method for realizing effective parallel execution of instructions in an information processor

Patent number: 7103755

Abstract: An apparatus for efficient parallel execution of instructions that avoids the use of cross bypasses, the apparatus including an instruction buffer for storing instructions, a plurality of decoders for decoding, in parallel, the instructions which simultaneously issue from the instruction buffer, executing units for executing the instructions decoded in the decoders, and an instruction-issuing controlling means for controlling the issuing of the instructions in such a way that, when the instructions are executed, one of the plural executing units executes instructions more frequently than the rest of the plural executing units. The apparatus is preferably incorporated in an information processor that performs superscalar or out-of-order instruction execution.

Type: Grant

Filed: January 10, 2003

Date of Patent: September 5, 2006

Assignee: Fujitsu Limited

Inventors: Susumu Akiu, Masaki Ukai, Toshio Yoshida

Super-region instruction scheduling and code generation for merging identical instruction into the ready-to-schedule instruction

Patent number: 7069555

Abstract: Systems and methods perform super-region instruction scheduling that increases the instruction level parallelism for computer programs. A compiler performs data flow analysis and memory interference analysis on the code to determine data dependencies between entities such as registers and memory locations. A region tree is constructed, where the region tree contains a single entry block and a single exit block, with potential intervening blocks representing different control flows through the region. Instructions within blocks are moved to predecessor blocks when there are no dependencies on the instruction to be moved, and when the move results in greater opportunity for instruction level parallelism. Redundant instructions from multiple paths can be merged into a single instruction during the process of scheduling. In addition, if a dependency can be removed the method transforms the instruction into an instruction that can be moved to a block having available resources.

Type: Grant

Filed: September 29, 2000

Date of Patent: June 27, 2006

Assignee: Microsoft Corporation

Inventor: Ten H. Tzen

Memory system for video decoding system

Patent number: 7007031

Abstract: System and method of data unit management in a decoding system employing a decoding pipeline. Each incoming data unit is assigned a memory element and is stored in the assigned memory element. Each decoding module gets the data to be operated on, as well as the control data, for a given data unit from the assigned memory element. Each decoding module, after performing its decoding operations on the data unit, deposits the newly processed data back into the same memory element. In one embodiment, the assigned memory locations comprise a header portion for holding the control data corresponding to the data unit and a data portion for holding the substantive data of the data unit. The header information is written to the header portion of the assigned memory element once and accessed by the various decoding modules throughout the decoding pipeline as needed. The data portion of memory is used/shared by multiple decoding modules.

Type: Grant

Filed: April 1, 2002

Date of Patent: February 28, 2006

Assignee: Broadcom Corporation

Inventors: Alexander G. MacInnis, José R. Alvarez, Sheng Zhong, Xiaodong Xie, Vivian Hsiun

Decoding suffix instruction specifying replacement destination for primary instruction

Patent number: 6970998

Abstract: In an embodiment, a method comprises receiving a first instruction and a second instruction, where the second instruction specifies that a destination address of the first instruction should be replaced with a destination address of the second instruction. The method also includes decoding the first instruction and the second instruction. The decoding comprises replacing a destination address of the first instruction with an address provided by the second instruction, upon determining that the second instruction is a suffix instruction to the first instruction.

Type: Grant

Filed: October 7, 2002

Date of Patent: November 29, 2005

Assignee: Redback Networks Inc.

Inventor: John G. Favor

Microprocessor employing a fixed position dispatch unit

Patent number: 6968444

Abstract: A microprocessor employing a fixed position dispatch unit. The microprocessor includes a plurality of execution units each corresponding to an issue position and configured to execute a common subset of instructions. At least a first one of the execution units includes extended logic for executing a designated instruction that others of the execution units may be incapable of executing. The microprocessor also includes a plurality of decoders coupled to the plurality of execution units. The plurality of decoders may provide positional information to cause the designated instruction to be routed to the first execution unit. Further, the microprocessor includes a dispatch control unit configured to dispatch during a dispatch cycle, the designated instruction for execution by the first execution unit based upon the positional information. The dispatch control unit may further dispatch one or more instructions within the common subset of instructions during the same dispatch cycle.

Type: Grant

Filed: November 4, 2002

Date of Patent: November 22, 2005

Assignee: Advanced Micro Devices, Inc.

Inventors: David E. Kroesche, Michael T. Clark

High-performance, superscalar-based computer system with out-of-order instruction execution

Patent number: 6959375

Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.

Type: Grant

Filed: October 29, 2002

Date of Patent: October 25, 2005

Assignee: Seiko Epson Corporation

Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang

Method for quickly determining length of an execution package

Patent number: 6944749

Abstract: A method for decoding instructions in an execution package with a processor includes using an assembler to assemble instructions into different execution packages. Each instruction has an identification segment and an instruction segment. The method also includes using the assembler to reorder the instructions by separating identification segments from instruction segments, grouping all identification segments of the execution package together, and grouping all instruction segments of the execution package together. The method uses the processor to decode identification segments of the instructions at the same time, and adds a length of each identification segment together to calculate a total length of the execution package.

Type: Grant

Filed: July 29, 2002

Date of Patent: September 13, 2005

Assignee: Faraday Technology Corp.

Inventor: Shan-Chyun Ku

Pipelined, superscalar floating point unit having out-of-order execution capability and processor employing the same

Patent number: 6907518

Abstract: For use in a processor having a first number of decode units for decoding an ordered stream of floating point instructions, a floating point unit (FPU) for receiving decoded ones of the floating point instructions and a method of processing the decoded ones of the floating point instructions. In one embodiment, the FPU includes: (1) a second number of floating point pipelines that execute the floating point instructions, the second number being at least one and less than the first number, the floating point pipeline having a load unit, an execution core and a store unit, (2) a floating point checkpoint buffer, coupled to the decode units, that queues the decoded ones of the floating point instructions for allocation to the floating point pipelines and (3) a floating point register file, coupled to and cooperable with the floating point checkpoint buffer, that preserves states of the execution core to allow the floating point pipelines to execute the floating point instructions out of order.

Type: Grant

Filed: June 16, 2003

Date of Patent: June 14, 2005

Assignee: National Semiconductor Corporation

Inventors: Jeffrey Lohman, Nicholas Samra, Ram Gummadi
Data processor

Patent number: 6904514

Abstract: An instruction set is provided which has a first field for describing an execution instruction for designating content of an operation or data processing that is executed in at least one processing unit forming a data processing system, and a second field for describing preparation information for setting the processing unit to such a state that is ready to execute an operation or data processing that is executed according to the execution instruction, thereby making it possible to provide a control program having the instruction set in which preparation information independent of the execution instruction described in the first field is described in the second field. Accordingly preparation for execution of the subsequent execution instruction is made based on the preparation information. In the instruction set, since destination of branch instruction is described in the second field and is known in advance, the problems that cannot be solved with a conventional instruction set can be solved.

Type: Grant

Filed: August 30, 2000

Date of Patent: June 7, 2005

Assignee: IPFlex Inc.

Inventor: Tomoyoshi Sato
Selection of decoder output from two different length instruction decoders

Patent number: 6889313

Abstract: A decode unit comprises first and second decoders respectively connected to receive bit sequences of first and second predetermined lengths. The first and second decoders operate in parallel to generate respective outputs. A switch selects one of the outputs in dependence on an instruction mode of the processor which governs the length of the bit sequence which is actually required to be decoded.

Type: Grant

Filed: May 2, 2000

Date of Patent: May 3, 2005

Assignee: STMicroelectronics S.A.

Inventors: Andrew Cofler, Stéphane Bouvier, Laurent Wojcieszak
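
A rough software model of the decode unit in the abstract above: both decoders always run, and a switch picks one output according to the instruction mode. The 16-bit and 32-bit widths, the choice of the low-order bits for the short decoder, and all function names are assumptions made for illustration.

```python
def decode_short(bits: int) -> str:
    """Hypothetical decoder for a 16-bit sequence (low-order bits of the fetch)."""
    return f"short:{bits & 0xFFFF:04x}"

def decode_long(bits: int) -> str:
    """Hypothetical decoder for a 32-bit sequence."""
    return f"long:{bits & 0xFFFFFFFF:08x}"

def decode_unit(fetched: int, long_mode: bool) -> str:
    # Both decoders see the fetched bits on every call, modelling parallel hardware.
    out_short = decode_short(fetched)
    out_long = decode_long(fetched)
    # The switch selects one output based on the processor's instruction mode.
    return out_long if long_mode else out_short
```

Running both decoders unconditionally keeps the mode signal off the decode path; it is only needed at the final selection.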
Multi-pipe dispatch and execution of complex instructions in a superscalar processor

Publication number: 20040230773

Abstract: In a computer system, a method and apparatus for dispatching and executing multi-cycle and complex instructions. The method results in maximum performance for such instructions without impacting other areas in the processor such as decode, grouping or dispatch units. This invention allows multi-cycle and complex instructions to be dispatched to one port but executed in multiple execution pipes without cracking the instruction and without limiting it to a single execution pipe. Some control signals are generated in the dispatch unit and dispatched with the instruction to the Fixed Point Unit (FXU). The FXU logic then executes these instructions on the available FXU pipes. This method results in optimum performance with little or no other complications. The presented technique places the flexibility of how these instructions will be executed in the FXU, where the actual execution takes place, instead of in the instruction decode or dispatch units or cracking by the compiler.

Type: Application

Filed: May 12, 2003

Publication date: November 18, 2004

Applicant: International Business Machines Corporation

Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, John G. Rell, Timothy J. Slegel
Apparatus and method for executing an instruction with a register bit mask for transferring data between a plurality of registers and memory inside a processor

Patent number: 6820191

Abstract: An apparatus and method for executing an instruction with a register bit mask for transferring data between a plurality of registers and memory inside a processor is provided. The method includes adding the N bits in the N-bit decode information together to form an initial count value, and generating a plurality of register identification (ID) numbers equivalent in number to the initial count value. The register ID numbers correspond to the positions in the N-bit decode information that have a bit value ‘1’. According to the register ID numbers, a link is created between the plurality of registers corresponding to the register ID numbers and a memory unit so that the memory unit and the registers are free to exchange stored data.

Type: Grant

Filed: December 28, 2000

Date of Patent: November 16, 2004

Assignee: Faraday Technology Corp.

Inventors: Calvin Guey, Shyh-An Chi, Yu-Min Wang
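
The mask decoding in the abstract above amounts to a population count plus a list of set-bit positions. The sketch below assumes bit position i selects register i and that registers are transferred to consecutive memory slots; `decode_register_mask` and `store_multiple` are hypothetical names.

```python
def decode_register_mask(mask: int, n: int = 16):
    """Return (count, register IDs) for an n-bit register mask."""
    bits = [(mask >> i) & 1 for i in range(n)]
    count = sum(bits)                                # add the N bits together
    reg_ids = [i for i, b in enumerate(bits) if b]   # positions holding a 1
    return count, reg_ids

def store_multiple(mask: int, registers: list, memory: list, base: int) -> int:
    """Copy each selected register to consecutive memory slots starting at base."""
    count, reg_ids = decode_register_mask(mask, len(registers))
    for offset, rid in enumerate(reg_ids):
        memory[base + offset] = registers[rid]
    return count
```

For example, the 8-bit mask `0b10100001` yields a count of 3 and register IDs 0, 5, and 7.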
RISC processor with a debug interface unit

Patent number: 6766438

Abstract: The present invention provides a RISC processor with a debug interface unit that enables the external replication of the data processing sequence within a RISC processor for debug purposes. The data exchanged between the sequence controller and the instruction decoder are intermediately stored and forwarded via a free bus line to an interface unit. In the interface unit, the data pending at its inputs are forwarded to defined outputs of the interface. This allows the register contents to be co-read in real time. Accordingly, all the required information to perform an efficient error search are displayed for an outside operator who may then monitor the data processing sequence and conduct an error search.

Type: Grant

Filed: October 30, 2000

Date of Patent: July 20, 2004

Assignee: Siemens Aktiengesellschaft

Inventor: Peter Haas
Method and circuit for enabling a clock-synchronized read-modify-write operation on a memory array

Patent number: 6745302

Abstract: A semiconductor memory enabling a read modify write operation of data, comprising: a memory cell array including a plurality of memory cells arranged in a matrix and able to be written with and read out data; a read address decoding means for independently decoding an address of a read memory cell in response to a read address; a write address decoding means for independently decoding an address of a write memory cell in response to a write address; a data reading means for reading data of a memory cell addressed by the read address decoding means; a data writing means for writing data to a memory cell addressed by the write address decoding means; and an address delay means by which a write address decoded by the write address decoding means is delayed by a predetermined time from a read address decoded by the read address decoding means, wherein the predetermined time is set as a predetermined plurality of times of basic synchronization pulse periods so that the data read modify write operation is accomplis

Type: Grant

Filed: September 13, 1999

Date of Patent: June 1, 2004

Assignee: Sony Corporation

Inventors: Kazuo Taniguchi, Masaharu Yoshimori
Anticipatory optimization with composite folding

Patent number: 6745384

Abstract: A method and system for anticipatory optimization of computer programs. The system generates code for a program that is specified using programming-language-defined computational constructs and user-defined, domain-specific computational constructs, the computational constructs including high-level operands that are domain-specific composites of low-level computational constructs. The system generates an abstract syntax tree (AST) representation of the program in a loop merging process. The AST has nodes representing the computational constructs of the program and abstract optimization tags for folding of the composites. A composite folding process is applied to the AST according to the optimization tags to generate optimized code for the program.

Type: Grant

Filed: September 21, 2000

Date of Patent: June 1, 2004

Assignee: Microsoft Corporation

Inventor: Ted J. Biggerstaff
Microprocessor with instructions for shuffling and dealing data

Patent number: 6745319

Abstract: A data processing system is provided with a digital signal processor (DSP) which has a shuffle instruction for shuffling a source operand (600) and storing the shuffled result in a selected destination register (610). A shuffled result is formed by interleaving bits from a first source operand portion with bits from a second operand portion. A de-interleave and pack (DEAL) instruction is provided for de-interleaving a source operand. The shuffle instruction and the DEAL instruction have an exactly inverse effect. The DSP includes swizzle circuitry that performs interleaving or de-interleaving in a single execution phase.

Type: Grant

Filed: October 31, 2000

Date of Patent: June 1, 2004

Assignee: Texas Instruments Incorporated

Inventors: Keith Balmer, David Hoyle, Lewis Nardini
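
The inverse relationship between the shuffle and DEAL instructions in the abstract above can be checked with a small bit-level model. It assumes a 32-bit operand whose lower half supplies the even result bits and whose upper half supplies the odd result bits; the abstract does not state the operand width or bit order, so treat both as illustrative choices.

```python
def shuffle(x: int) -> int:
    """Interleave the bits of the two 16-bit halves of a 32-bit operand."""
    lo, hi = x & 0xFFFF, (x >> 16) & 0xFFFF
    out = 0
    for i in range(16):
        out |= ((lo >> i) & 1) << (2 * i)       # lower half -> even positions
        out |= ((hi >> i) & 1) << (2 * i + 1)   # upper half -> odd positions
    return out

def deal(x: int) -> int:
    """De-interleave: pack even bits into the lower half, odd bits into the upper."""
    lo = hi = 0
    for i in range(16):
        lo |= ((x >> (2 * i)) & 1) << i
        hi |= ((x >> (2 * i + 1)) & 1) << i
    return (hi << 16) | lo
```

In this model `deal(shuffle(v)) == v` holds for every 32-bit `v`, which is the "exactly inverse effect" the abstract describes. The loops stand in for what the abstract calls swizzle circuitry, which would do the same rewiring in a single execution phase.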
Method and apparatus for representing variable-size computer instructions

Patent number: 6742109

Abstract: One embodiment of the present invention provides a system for executing variable-size computer instructions, wherein a variable-size computer instruction includes an action component that specifies an operation to be performed and a data component of variable size that specifies data associated with the operation. The system operates by first retrieving the variable-size computer instruction from a computing device's memory. The system then decodes the variable-size computer instruction by separating the variable-size computer instruction into the action component and the data component. Next, the system stores the action component in a first store and the data component in a second store so they can be reused without repeated decoding. Finally, the system provides a first flow path for the action component and a second flow path for the data component.

Type: Grant

Filed: November 30, 2000

Date of Patent: May 25, 2004

Assignee: Sun Microsystems, Inc.

Inventors: Stepan Sokolov, David Wallman
Preventing the execution of a set of instructions in parallel based on an indication that the instructions were erroneously pre-coded for parallel execution

Patent number: 6742110

Abstract: A processing engine 10 for executing instructions in parallel comprises an instruction buffer 600 for holding at least two instructions, with the first instruction 602 in a first position and the second instruction 604 in a second position. A first decoder 612 provides decoding of the first instruction and generates first control signals. The first control signals include first resource control signals, first address generation control signals, and a first validity signal indicative of the validity of the first instruction in the first position. A second decoder 614 provides decoding of the second instruction and generates second control signals. The second control signals include second resource control signals, second address generation control signals, and a second validity signal indicative of the validity of the second instruction in the second position.

Type: Grant

Filed: October 1, 1999

Date of Patent: May 25, 2004

Assignee: Texas Instruments Incorporated

Inventors: Karim Djafarian, Gilbert Laurenti, Vincent Gillet
High performance, superscalar-based computer system with out-of-order instruction execution

Publication number: 20040093483

Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.

Type: Application

Filed: October 31, 2003

Publication date: May 13, 2004

Applicant: Seiko Epson Corporation

Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
System and method for handling load and/or store operations in a superscalar microprocessor

Patent number: 6735685

Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load/store unit is provided whose main purpose is to make load requests out-of-order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out-of-order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.

Type: Grant

Filed: June 21, 1999

Date of Patent: May 11, 2004

Assignee: Seiko Epson Corporation

Inventors: Cheryl D. Senter, Johannes Wang
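
The two conditions in the abstract above that block an out-of-order load can be written as a simple check over the older stores. This sketch compares exact addresses only and ignores partially overlapping accesses, which real hardware would have to handle; `PendingStore` and `can_issue_load_early` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class PendingStore:
    address: Optional[int]  # None until the store address has been calculated

def can_issue_load_early(load_addr: int, older_stores: List[PendingStore]) -> bool:
    """Return True if the load may bypass every older, not yet completed store."""
    for store in older_stores:
        if store.address is None:
            return False    # write pending: store address not yet known
        if store.address == load_addr:
            return False    # address collision with an older write
    return True
```

A load with no older stores, or only older stores to other known addresses, may be issued ahead of program order in this model.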
Multiple-thread processor for threaded software applications

Patent number: 6718457

Abstract: A processor has an improved architecture for multiple-thread operation on the basis of a highly parallel structure including multiple independent parallel execution paths for executing in parallel across threads and a multiple-instruction parallel pathway within a thread. The multiple independent parallel execution paths include functional units that execute an instruction set including special data-handling instructions that are advantageous in a multiple-thread environment.

Type: Grant

Filed: December 3, 1998

Date of Patent: April 6, 2004

Assignee: Sun Microsystems, Inc.

Inventors: Marc Tremblay, William Joy
Method and apparatus for staggering execution of a single packed data instruction using the same circuit

Patent number: 6694426

Abstract: A method and apparatus are disclosed for staggering execution of an instruction. According to one embodiment of the invention, a single macro instruction is received wherein the single macro instruction specifies at least two logical registers and wherein the two logical registers respectively store a first and second packed data operands having corresponding data elements. An operation specified by the single macro instruction is then performed independently on a first and second plurality of the corresponding data elements from said first and second packed data operands at different times using the same circuit to independently generate a first and second plurality of resulting data elements. The first and second plurality of resulting data elements are stored in a single logical register as a third packed data operand.

Type: Grant

Filed: June 6, 2002

Date of Patent: February 17, 2004

Assignee: Intel Corporation

Inventors: Patrice Roussel, Glenn J. Hinton, Shreekant S. Thakkar, Brent R. Boswell, Karol F. Menezes
Method and system for instruction length decode

Patent number: 6684322

Abstract: A system and method for decoding the length of a macro instruction is described. In one embodiment, the system comprises an opcode-plus-immediate logic unit to generate a first length value, the first length value comprising a length of an opcode plus a length of immediate data. A memory-length logic unit generates a second length value, the second length value comprising a potential length of a memory displacement, the opcode-plus-immediate logic unit and memory-length logic unit operating in parallel. In addition, the system comprises a length-summation logic unit to sum the first length value and the second length value if the second length value is present.

Type: Grant

Filed: August 30, 1999

Date of Patent: January 27, 2004

Assignee: Intel Corporation

Inventors: Fred Gruner, Mike Morrison, Kushagra Vaid
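
The three logic units named in the abstract above map onto three small functions. Lengths are given in bytes and supplied directly as arguments, which is an assumption made to keep the sketch short; a real decoder would derive them from the instruction bytes. All function names are hypothetical.

```python
from typing import Optional

def opcode_plus_immediate_length(opcode_bytes: int, immediate_bytes: int) -> int:
    """First length value: opcode length plus the length of its immediate field."""
    return opcode_bytes + immediate_bytes

def memory_length(displacement_bytes: Optional[int]) -> Optional[int]:
    """Second length value: memory displacement length, or None if absent."""
    return displacement_bytes

def instruction_length(opcode_bytes: int, immediate_bytes: int,
                       displacement_bytes: Optional[int] = None) -> int:
    # The two values do not depend on each other, so hardware can compute
    # them in parallel; only the final summation needs both.
    first = opcode_plus_immediate_length(opcode_bytes, immediate_bytes)
    second = memory_length(displacement_bytes)
    return first if second is None else first + second
```

With a 1-byte opcode, a 4-byte immediate field, and a 2-byte displacement, this model gives a total length of 7 bytes; without the displacement it gives 5.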
Data processor and method of processing data

Patent number: RE38679

Abstract: A second decoder (114) of an instruction decode unit (119) decodes an operation code for a multiply-add operation, and a second operation unit (117) receives two data stored in a register file (115) to perform the multiply-add operation. In parallel with the operations of the second decoder (114) and the second operation unit (117), a first decoder (113) of the instruction decode unit (119) decodes an operation code for 2 data load, and an operand access unit (104) causes two data (e.g., n bits each) stored in an internal data memory (105) to be transferred in parallel in the form of combined 2n-bit data to a first operation unit (116). Then, two predetermined registers of the register file (115) store the respective n-bit data from the first operation unit (116).

Type: Grant

Filed: May 4, 2001

Date of Patent: December 28, 2004

Assignee: Mitsubishi Denki Kabushiki Kaisha

Inventors: Masahito Matsuo, Toyohiko Yoshida
Instruction converting apparatus using parallel execution code

Patent number: RE41751

Abstract: A processor can decode short instructions with a word length equal to one unit field and long instructions with a word length equal to two unit fields. An opcode of each kind of instruction is arranged into the first unit field assigned to the instruction. The number of instructions to be executed by the processor in parallel is s. When the ratio of short to long instructions is s-1:1, the s-1 short instructions are assigned to the first unit field to the (s-1)th unit field in the parallel execution code, and the long instruction is assigned to the sth unit field to the (s+k-1)th unit field in the same parallel execution code.

Type: Grant

Filed: November 24, 2003

Date of Patent: September 21, 2010

Assignee: Panasonic Corporation

Inventors: Taketo Heishi, Tetsuya Tanaka, Nobuo Higaki, Shuichi Takayama, Kensuke Odani
