Superscalar Patents (Class 712/23)

RUN-TIME INSTRUMENTATION HANDLING IN A SUPERSCALAR PROCESSOR

Publication number: 20140281375

Abstract: A method and a computer program for a processor that handles multiple instructions simultaneously. The method includes labeling the instruction that ends a relevant sample interval from a plurality of such instructions. The method uses a buffer that stores N more entries than actually required, where N is the number of run-time instrumentation (RI) instructions younger than the instruction ending the sample interval. The method also includes recording instrumentation data corresponding to the sample interval and providing the instrumentation data in response to identification of the sample interval.

Type: Application

Filed: March 15, 2013

Publication date: September 18, 2014

Applicant: International Business Machines Corporation

Inventors: Gregory W. Alexander, Mark S. Farrell, Wolfgang Fischer, Guenter Gerwig, Frank Lehnert, Chung-Lung Shum
Scheduling instructions in a cascaded delayed execution pipeline to minimize pipeline stalls caused by a cache miss

Patent number: 8812822

Abstract: A design structure embodied in a machine readable storage medium for designing, manufacturing, and/or testing a design for minimizing unscheduled D-cache miss pipeline stalls is provided. The design structure includes an integrated circuit device, which includes a cascaded delayed execution pipeline unit having two or more execution pipelines that begin execution of instructions in a common issue group in a delayed manner relative to each other, and circuitry. The circuitry is configured to receive an issue group of instructions, determine whether the issue group includes a load instruction, and if so, schedule the load instruction in a first pipeline of the two or more execution pipelines, and schedule each remaining instruction in the issue group to be executed in remaining pipelines of the two or more pipelines, wherein execution of the load instruction in the first pipeline begins prior to beginning execution of the remaining instructions in the remaining pipelines.

Type: Grant

Filed: March 13, 2008

Date of Patent: August 19, 2014

Assignee: International Business Machines Corporation

Inventor: David A. Luick
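
As a rough illustration of the scheduling rule in the abstract above (patent 8812822), the following Python sketch moves the first load in an issue group to the least-delayed pipeline and staggers the remaining instructions behind it. The function name `schedule_issue_group`, the string instruction format, and the one-cycle `delay` between pipelines are assumptions made for this example, not details taken from the patent.

```python
def schedule_issue_group(group, delay=1):
    """Assign each instruction of an issue group to a pipeline.

    Returns a list of (instruction, pipeline_index, start_cycle) tuples.
    Pipeline 0 starts first; pipeline p starts p * delay cycles later.
    """
    ordered = list(group)
    for i, ins in enumerate(ordered):
        # Hypothetical convention: the mnemonic is the first word.
        if ins.split()[0] == "load":
            # Move the first load to the front so it begins executing first.
            ordered.insert(0, ordered.pop(i))
            break
    return [(ins, p, p * delay) for p, ins in enumerate(ordered)]


schedule = schedule_issue_group(["add r1", "load r2", "sub r3"])
```

Starting the load early gives a possible cache miss more time to resolve before the dependent, later-starting instructions need the data.
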
Algorithm and architecture for multi-argument associative operations that minimizes the number of components using a latency of the components

Patent number: 8775147

Abstract: An algorithm and architecture are disclosed for performing multi-argument associative operations. The algorithm and architecture can be used to schedule operations on multiple facilities for computations or can be used in the development of a model in a modeling environment. The algorithm and architecture resulting from the algorithm use the latency of the components that are used to process the associative operations. The algorithm minimizes the number of components necessary to produce an output of multi-argument associative operations and also can minimize the number of inputs each component receives.

Type: Grant

Filed: May 31, 2006

Date of Patent: July 8, 2014

Assignee: The MathWorks, Inc.

Inventors: Alireza Pakyari, Brian K. Ogilvie
System and method for implementing elliptic curve scalar multiplication in cryptography

Patent number: 8649508

Abstract: A system and method for implementing the Elliptic Curve scalar multiplication method in cryptography, in which the scalar is expressed in the Double Base Number System in decreasing order of exponents, and this representation is then used to compute Elliptic Curve scalar multiplication over a finite elliptic curve.

Type: Grant

Filed: September 29, 2008

Date of Patent: February 11, 2014

Assignee: Tata Consultancy Services Ltd.

Inventor: Natarajan Vijayarangan
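
To make the Double Base Number System in the abstract above (patent 8649508) more concrete, here is a minimal Python sketch that writes an integer scalar as a sum of terms of the form 2^a * 3^b. It uses the well-known greedy decomposition (repeatedly subtract the largest such term), which is a stand-in chosen for illustration and not necessarily the patent's method. The function names are hypothetical, and the greedy approach does not guarantee that exponents decrease in every case.

```python
def largest_2_3_term(n):
    """Return (term, a, b) with term = 2**a * 3**b the largest such value <= n."""
    best = (0, 0, 0)
    power_of_3, b = 1, 0
    while power_of_3 <= n:
        # Largest power of two that still fits alongside this power of three.
        a = (n // power_of_3).bit_length() - 1
        term = power_of_3 << a
        if term > best[0]:
            best = (term, a, b)
        power_of_3 *= 3
        b += 1
    return best


def dbns_greedy(n):
    """Greedy double-base decomposition of n as a list of (a, b) exponent pairs."""
    terms = []
    while n > 0:
        term, a, b = largest_2_3_term(n)
        terms.append((a, b))
        n -= term
    return terms


terms_127 = dbns_greedy(127)  # 127 = 108 + 18 + 1
```

In a scalar multiplication, each (a, b) pair would correspond to a run of point doublings and triplings; the sketch only covers the integer representation.
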
Branch target address cache for predicting instruction decryption keys in a microprocessor that fetches and decrypts encrypted instructions

Patent number: 8645714

Abstract: A branch target address cache (BTAC) caches history information associated with branch and switch key instructions previously executed by a microprocessor. The history information includes a target address and an identifier (index into a register file) for identifying key values associated with each of the previous branch and switch key instructions. A fetch unit receives from the BTAC a prediction that the fetch unit fetched a previous branch and switch key instruction and receives the target address and identifier associated with the fetched branch and switch key instruction. The fetch unit also fetches encrypted instruction data at the associated target address and decrypts (via XOR) the fetched encrypted instruction data based on the key values identified by the identifier, in response to receiving the prediction. If the BTAC predicts correctly, a pipeline flush normally associated with the branch and switch key instruction is avoided.

Type: Grant

Filed: April 21, 2011

Date of Patent: February 4, 2014

Assignee: VIA Technologies, Inc.

Inventors: G. Glenn Henry, Terry Parks, Brent Bean, Thomas A. Crispin
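
The lookup-then-decrypt flow in the abstract above (patent 8645714) can be sketched in Python as follows. Everything named here is an assumption for illustration: the `BTAC` dictionary, the `KEY_REGISTERS` list standing in for the register file, the key values, and the repeating-key XOR. The patent states only that decryption is done via XOR with key values selected by an identifier.

```python
# Hypothetical key register file; the BTAC stores an index into it.
KEY_REGISTERS = [bytes([0x5A, 0xC3]), bytes([0x0F, 0xF0])]

# Hypothetical BTAC contents: fetch address -> (target address, key index).
BTAC = {0x1000: (0x2000, 1)}


def xor_bytes(data, key):
    """XOR data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def fetch_after_branch(fetch_addr, memory):
    """On a BTAC hit, fetch at the predicted target and decrypt with the predicted key.

    memory maps an address to the encrypted bytes stored there.
    Returns (target, decrypted_bytes), or None when the BTAC has no prediction.
    """
    prediction = BTAC.get(fetch_addr)
    if prediction is None:
        return None
    target, key_index = prediction
    return target, xor_bytes(memory[target], KEY_REGISTERS[key_index])


plain = b"\x01\x02\x03"
memory = {0x2000: xor_bytes(plain, KEY_REGISTERS[1])}
result = fetch_after_branch(0x1000, memory)
```
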
Distributed multi-core memory initialization

Patent number: 8566570

Abstract: In a system having a plurality of processing nodes, a control node divides a task into a plurality of sub-tasks, and assigns the sub-tasks to one or more additional processing nodes which execute the assigned sub-tasks and return the results to the control node, thereby enabling a plurality of processing nodes to efficiently and quickly perform memory initialization and test of all assigned sub-tasks.

Type: Grant

Filed: October 10, 2012

Date of Patent: October 22, 2013

Assignee: Advanced Micro Devices, Inc.

Inventor: Oswin E. Housty
ENERGY EFFICIENT MICROPROCESSOR PLATFORM BASED ON INSTRUCTIONAL LEVEL PARALLELISM

Publication number: 20130232359

Abstract: Embodiments of a processing architecture are described. The architecture includes a fetch unit for fetching instructions from a data bus. A scheduler receives data from the fetch unit, creates a schedule, and allocates the data and schedule to a plurality of computational units. The scheduler also modifies voltage and frequency settings of the processing architecture to optimize power consumption and throughput of the system. The computational units include control units and execute units. The control units receive and decode the instructions and send the decoded instructions to execute units. The execute units then execute the instructions according to relevant software.

Type: Application

Filed: March 1, 2012

Publication date: September 5, 2013

Applicant: NXP B.V.

Inventors: Hamed Fatemi, Ajay Kapoor, J. Pineda de Gyvez
Dispatching instruction from reservation station to vacant instruction queue of alternate arithmetic unit

Patent number: 8516223

Abstract: A priority circuit is connected to a reservation station and a plurality of arithmetic units that processes different operations and dispatches, when it is determined that an executable flag indicating that an instruction can be executed by only a specific arithmetic unit is on, an instruction to an arithmetic unit that is different from the specific arithmetic unit and of which a queue is vacant in accordance with the input performed by an instruction decoder and the reservation station.

Type: Grant

Filed: June 29, 2010

Date of Patent: August 20, 2013

Assignee: Fujitsu Limited

Inventors: Atsushi Fusejima, Yasunobu Akizuki, Toshio Yoshida
Interrupt and exception handling for multi-streaming digital processors

Patent number: 8468540

Abstract: A multi-streaming processor has a plurality of streams for streaming one or more instruction threads, a set of functional resources for processing instructions from streams, and interrupt handler logic. The logic detects and maps interrupts and exceptions to one or more specific streams. In some embodiments, one interrupt or exception may be mapped to two or more streams, and in others two or more interrupts or exceptions may be mapped to one stream. Mapping may be static and determined at processor design, programmable, with data stored and amendable, or conditional and dynamic, the interrupt logic executing an algorithm sensitive to variables to determine the mapping. Interrupts may be external interrupts generated by devices external to the processor, software (internal) interrupts generated by active streams, or conditional, based on variables. After interrupts are acknowledged, streams to which interrupts or exceptions are mapped are vectored to appropriate service routines.

Type: Grant

Filed: March 7, 2011

Date of Patent: June 18, 2013

Assignee: Bridge Crossing, LLC

Inventors: Mario D. Nemirovsky, Adolfo M. Nemirovsky, Narendra Sankar
Tracing apparatus and tracing system

Patent number: 8464089

Abstract: A tracing apparatus for tracing operational information that is output from a plurality of processing units in relation to data processing operations, the tracing apparatus comprising for each of the processing units: a counting unit configured to obtain and output a counter value for the corresponding processing unit, the counter value obtained by counting clock signals that are input to the processing unit at an operating frequency thereof; a counter value conversion unit configured to obtain and output a converted counter value for the corresponding processing unit, the converted counter value obtained by converting the counter value based on the assumption that the processing unit has a given reference operating frequency; and an adding unit configured to acquire an operational information set from the corresponding processing unit, and to add the converted counter value to the operational information set.

Type: Grant

Filed: June 3, 2010

Date of Patent: June 11, 2013

Assignee: Panasonic Corporation

Inventors: Kazuhiro Watanabe, Takashi Hashimoto
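
The counter conversion in the abstract above (patent 8464089) amounts to rescaling a clock count from the unit's operating frequency to a shared reference frequency, so that trace records from units running at different speeds share one time base. The Python sketch below shows that arithmetic; the function names, the `timestamp` field, and the use of integer division are assumptions for illustration.

```python
def convert_counter(counter_value, operating_hz, reference_hz):
    """Rescale a clock count taken at operating_hz to the reference frequency."""
    return counter_value * reference_hz // operating_hz


def tag_record(record, counter_value, operating_hz, reference_hz):
    """Return a copy of an operational-information record with a converted counter added."""
    tagged = dict(record)
    tagged["timestamp"] = convert_counter(counter_value, operating_hz, reference_hz)
    return tagged


# 1000 clocks at 200 MHz and 250 clocks at 50 MHz both span 5 microseconds,
# so both map to the same count at a 100 MHz reference.
fast = convert_counter(1000, 200_000_000, 100_000_000)
slow = convert_counter(250, 50_000_000, 100_000_000)
```
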
Processing prefix code in instruction queue storing fetched sets of plural instructions in superscalar processor

Patent number: 8402256

Abstract: The present invention is directed to realize efficient issue of a superscalar instruction in an instruction set including an instruction with a prefix. A circuit is employed which retrieves an instruction of each instruction code type other than a prefix on the basis of a determination result of decoders for determining an instruction code type, adds the immediately preceding instruction to the retrieved instruction, and outputs the resultant to instruction executing means. When an instruction of a target instruction code type is detected in a plurality of instruction units to be searched, the circuit outputs the detected instruction code and the immediately preceding instruction other than the target instruction code type as prefix code candidates. When an instruction of a target instruction code type cannot be detected at the rear end of the instruction units to be searched, the circuit outputs the instruction at the rear end as a prefix code candidate.

Type: Grant

Filed: August 25, 2009

Date of Patent: March 19, 2013

Assignee: Renesas Electronics Corporation

Inventor: Fumio Arakawa
Distributed multi-core memory initialization

Patent number: 8307198

Abstract: In a system having a plurality of processing nodes, a control node divides a task into a plurality of sub-tasks, and assigns the sub-tasks to one or more additional processing nodes which execute the assigned sub-tasks and return the results to the control node, thereby enabling a plurality of processing nodes to efficiently and quickly perform memory initialization and test of all assigned sub-tasks.

Type: Grant

Filed: November 24, 2009

Date of Patent: November 6, 2012

Assignee: Advanced Micro Devices, Inc.

Inventor: Oswin E. Housty
PROCESSOR WITH INCREASED EFFICIENCY VIA CONTROL WORD PREDICTION

Publication number: 20120226891

Abstract: Methods and apparatuses are provided for increased efficiency in a processor via control word prediction. The apparatus comprises an operational unit capable of determining whether an instruction will change a first control word to a second control word for processing dependent instructions. Execution units process the dependent instructions using a predicted control word and compare the second control word to the predicted control word. A scheduling unit causes the execution units to reprocess the dependent instructions when the predicted control word does not match the second control word. The method comprises determining that an instruction will change a first control word to a second control word and processing the dependent instructions using a predicted control word. The second control word is compared to the predicted control word and the dependent instructions are reprocessed using the second control word when the predicted control word does not match the second control word.

Type: Application

Filed: March 1, 2011

Publication date: September 6, 2012

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: Michael D. ESTLICK, Jay FLEISCHMAN, Debjit Das Sarma, Emil TALPES, Krishnan V. RAMANI, Chun LIU
Methods, apparatus and systems to improve security in computer systems

Patent number: 8261085

Abstract: According to some implementations, methods, apparatus and systems are provided involving the use of processors having at least one core with a security component, the security component adapted to read and verify data within data blocks stored in a L1 instruction cache memory and to allow the execution of data block instructions in the core only upon the instructions being verified by the use of a cryptographic algorithm.

Type: Grant

Filed: September 26, 2011

Date of Patent: September 4, 2012

Assignee: Media Patents, S.L.

Inventor: Álvaro Fernández Gutiérrez
Dynamic voltage and frequency scaling (DVFS) control for simultaneous multi-threading (SMT) processors

Patent number: 8250395

Abstract: A mechanism is provided for controlling operational parameters associated with a plurality of processors. A control system in the data processing system determines a utilization slack value of the data processing system. The utilization slack value is determined using one or more active core count values and one or more slack core count values. The control system computes a new utilization metric to be a difference between a full utilization value and the utilization slack value. The control system determines whether the new utilization metric is below a predetermined utilization threshold. Responsive to the new utilization metric being below the predetermined utilization threshold, the control system decreases a frequency of the plurality of processors.

Type: Grant

Filed: November 12, 2009

Date of Patent: August 21, 2012

Assignee: International Business Machines Corporation

Inventors: John B. Carter, Heather L. Hanson, Karthick Rajamani, Freeman L. Rawson, III, Todd J. Rosedahl, Malcolm S. Ware
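
The control decision in the abstract above (patent 8250395) can be sketched in a few lines of Python. The abstract does not say how the slack value is derived from the active and slack core counts, so the ratio used below is purely an assumption, as are the function name, the 0.8 threshold, the 100 MHz step, and the 800 MHz floor.

```python
def next_frequency(active_counts, slack_counts, freq_mhz,
                   threshold=0.8, step_mhz=100, min_mhz=800, full=1.0):
    """Return the frequency to use for the next control interval.

    Assumption: slack is the fraction of active cores that were slack,
    scaled by the full utilization value.
    """
    total_active = sum(active_counts)
    if total_active == 0:
        return freq_mhz  # nothing measured; leave the frequency alone
    slack = full * sum(slack_counts) / total_active
    utilization = full - slack  # new utilization metric
    if utilization < threshold:
        return max(min_mhz, freq_mhz - step_mhz)
    return freq_mhz


lowered = next_frequency([4, 4], [2, 2], 2000)    # utilization 0.5, below threshold
unchanged = next_frequency([4, 4], [0, 1], 2000)  # utilization 0.875
```
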
Processor monitoring execution of a synchronization instruction issued to execution sections to detect completion of execution of preceding instructions in an identified thread

Patent number: 8245015

Abstract: A processor includes a plurality of executing sections configured to simultaneously execute instructions for a plurality of threads, an instruction issuing section configured to issue instructions to the plurality of executing sections, and an instruction sync monitoring section configured to, when an instruction-synchronizing instruction is issued to one or more of the plurality of executing sections from the instruction issuing section, monitor completion of execution of the instruction-synchronizing instruction for each of the executing sections, to which the instruction-synchronizing instruction has been issued, thus detecting completion of execution of preceding instructions for the thread to which the instruction-synchronizing instruction belongs.

Type: Grant

Filed: July 7, 2009

Date of Patent: August 14, 2012

Assignee: Sony Corporation

Inventor: Masaaki Ishii
Assigning and pre-decoding group ID and tag ID prior to dispatching instructions in out-of-order processor

Patent number: 8219784

Abstract: A computer-implemented method and apparatus for managing an out of order dispatched instruction queue in a microprocessor. In one embodiment, the method and apparatus include assigning a group identification number and a target identification number to an instruction in an instruction stream. The group identification number and the target identification number are labeled inside an instruction fetcher unit. The group identification number and the target identification number are pre-decoded. The instruction is sent to an instruction queue. The instruction is re-ordered in the instruction stream after executing the instruction utilizing information from the pre-decoding of the group identification number and the target identification number.

Type: Grant

Filed: December 9, 2008

Date of Patent: July 10, 2012

Assignee: International Business Machines Corporation

Inventors: Oliver Keren Ban, Xiangang Cheng, Liang Huang Lee, Katherine June Pearsall
Sharing pipeline by inserting NOP to accommodate memory access request received from other processors

Patent number: 8200950

Abstract: A pipeline operation processor comprises a pipeline processing unit and an instruction insertion controller which inserts an instruction when access to an operation memory is requested, and corrects control information by reference to control information of stages. When a control program is in execution, on receiving an access request instruction requesting access to the operation memory, the instruction insertion controller inserts an NOP instruction from the instruction decoding unit in place of the access request instruction. The access request instruction is executed while the pipeline processing unit executes no operation, and subsequently, the pipeline processing is continued.

Type: Grant

Filed: June 4, 2009

Date of Patent: June 12, 2012

Assignee: Kabushiki Kaisha Toshiba

Inventor: Motohiko Okabe
Performing externally assisted calls in a heterogeneous processing complex

Patent number: 8195759

Abstract: A mechanism is provided for accessing, by an application running on a first processor, operating system services from an operating system running on a second processor by performing an assisted call. A data plane processor first constructs a parameter area based on the input and output parameters for the function that requires control processor assistance. The current values for the input parameters are copied into the parameter area. An assisted call message is generated based on a combination of a pointer to the parameter area and a specific library function opcode for the library function that is being called. The assisted call message is placed into the processor's stack immediately following a stop-and-signal instruction. The control plane processor is signaled to perform the library function corresponding to the opcode on behalf of the data plane processor by executing a stop and signal instruction.

Type: Grant

Filed: May 29, 2008

Date of Patent: June 5, 2012

Assignee: International Business Machines Corporation

Inventors: Daniel A. Brokenshire, Mark R. Nutter
Image forming apparatus and management system utilizing counter and job log information for usage tracking

Patent number: 8179540

Abstract: An image forming apparatus is provided that holds counter information obtained by integrating a consumption of a consumable that depends on usage of a service provided by the image forming apparatus. A log corresponding to the usage of the service is set in job log information with a synchronization flag set off. The synchronization flag of each log in the job log information for which the flag is set off is then set on. The counter information and the job log information are output after the synchronization flag for the log having the synchronization flag set off has been set on.

Type: Grant

Filed: October 29, 2008

Date of Patent: May 15, 2012

Assignee: Canon Kabushiki Kaisha

Inventors: Junichi Hiruma, Nobuyuki Tonegawa
Routing instructions in a processor

Patent number: 8140831

Abstract: Disclosed are a method and system for reducing complexity of routing of instructions from an instruction issue queue to appropriate execution pipelines in a superscalar processor. In one or more embodiments, an instruction steering unit of the superscalar processor receives ordered instructions. The steering unit determines that a first instruction and a subsequent second instruction of the ordered instructions are non-branching instructions, and the steering unit stores the first and second instructions in two non-branching instruction issue queue entries of a shadow queue. The steering unit determines whether or not a third instruction of the ordered instructions is a branch instruction, where the third instruction is subsequent to the second instruction. If the third instruction is a branch instruction, the steering unit stores the third instruction in a branch entry of the shadow queue; otherwise, the steering unit stores a no operation instruction in the branch entry of the shadow queue.

Type: Grant

Filed: March 27, 2009

Date of Patent: March 20, 2012

Assignee: International Business Machines Corporation

Inventors: Anthony J. Bybell, Kenichi Tsuchiya
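
The steering rule in the abstract above (patent 8140831) can be sketched in Python as filling fixed-shape shadow queue rows: two non-branch entries followed by one branch entry, with a no-operation instruction in the branch entry when the third instruction is not a branch. The abstract covers only the case where the first two instructions are non-branching; padding the non-branch entries with NOPs when a branch arrives early is an assumption of this sketch, as are the mnemonics and the function names.

```python
NOP = "nop"
BRANCH_MNEMONICS = {"b", "beq", "bne"}  # hypothetical branch opcodes


def is_branch(instruction):
    return instruction.split()[0] in BRANCH_MNEMONICS


def steer(instructions):
    """Pack ordered instructions into rows of [non-branch, non-branch, branch]."""
    rows = []
    i, n = 0, len(instructions)
    while i < n:
        row = []
        # Fill up to two non-branch entries, preserving program order.
        while len(row) < 2 and i < n and not is_branch(instructions[i]):
            row.append(instructions[i])
            i += 1
        row += [NOP] * (2 - len(row))  # assumption: pad if a branch came early
        # Branch entry: the next instruction if it is a branch, otherwise a NOP.
        if i < n and is_branch(instructions[i]):
            row.append(instructions[i])
            i += 1
        else:
            row.append(NOP)
        rows.append(row)
    return rows


rows = steer(["add", "sub", "beq L1", "mul", "or", "and"])
```

Because every row has the same shape, each entry position can be wired to a fixed pipeline type, which is the routing simplification the abstract describes.
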
Out-of-order execution microprocessor that selectively initiates instruction retirement early

Patent number: 8074060

Abstract: A microprocessor for improving out-of-order superscalar execution unit utilization with a relatively small in-order instruction retirement buffer. A plurality of execution units each calculate an instruction result. The instruction is either an excepting type instruction or a non-excepting type instruction. The excepting type instruction is capable of causing the microprocessor to take an exception after being issued to the execution unit, wherein the non-excepting type instruction is incapable of causing the microprocessor to take an exception after being issued. A retire unit makes a determination that an instruction is the oldest instruction in the microprocessor and that the instruction is ready to update the architectural state of the microprocessor with its result.

Type: Grant

Filed: November 25, 2008

Date of Patent: December 6, 2011

Assignee: VIA Technologies, Inc.

Inventors: Gerard M. Col, Brent Bean, Bryan Wayne Pogor
System and method for assigning tags to control instruction processing in a superscalar processor

Patent number: 8074052

Abstract: A tag monitoring system for assigning tags to instructions embodied in software on a tangible computer-readable storage medium. A source supplies instructions to be executed by a functional unit. A queue has a plurality of slots containing tags which are used for tagging instructions. A register file stores information required for the execution of each instruction at a location in the register file defined by the tag assigned to that instruction. A control unit monitors the completion of executed instructions and advances the tags in the queue upon completion of an executed instruction. The register file also contains a plurality of read address enable ports and corresponding read output ports. Each of the slots from the queue is coupled to a corresponding one of the read address enable ports. Thus, the information for each instruction can be read out of the register file in program order.

Type: Grant

Filed: September 15, 2008

Date of Patent: December 6, 2011

Assignee: Seiko Epson Corporation

Inventors: Kevin R. Iadonato, Trevor A. Deosaran, Sanjiv Garg
Processor task and data management

Patent number: 8068109

Abstract: Task and data management systems, methods, and apparatus are disclosed. A processor event that requires more memory space than is available in a local storage of a co-processor is divided into two or more segments. Each segment has a segment size that is less than or the same as an amount of memory space available in the local storage. The segments are processed with one or more co-processors to produce two or more corresponding outputs. The two or more outputs are associated into one or more groups. Each group is less than or equal to a target data size associated with a subsequent process.

Type: Grant

Filed: June 8, 2010

Date of Patent: November 29, 2011

Assignee: Sony Computer Entertainment Inc.

Inventors: Richard B. Stenson, John P. Bates
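
The two size constraints in the abstract above (patent 8068109) can be illustrated with a short Python sketch: segments must fit in the co-processor's local storage, and grouped outputs must fit a target size for the next stage. The function names and the simple first-fit grouping are assumptions for illustration, and the sketch assumes each individual output is no larger than the target size.

```python
def split_into_segments(total_size, local_store_size):
    """Split a workload into segment sizes that each fit in local storage."""
    if local_store_size <= 0:
        raise ValueError("local_store_size must be positive")
    sizes = []
    remaining = total_size
    while remaining > 0:
        size = min(remaining, local_store_size)
        sizes.append(size)
        remaining -= size
    return sizes


def group_outputs(output_sizes, target_size):
    """Group consecutive outputs so each group's total stays within target_size."""
    groups, current, used = [], [], 0
    for size in output_sizes:
        if current and used + size > target_size:
            groups.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        groups.append(current)
    return groups


segments = split_into_segments(10, 4)
groups = group_outputs(segments, 6)
```
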
System and method for handling load and/or store operations in a superscalar microprocessor

Patent number: 8019975

Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load store unit is provided whose main purpose is to make load requests out of order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out of order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.

Type: Grant

Filed: April 25, 2005

Date of Patent: September 13, 2011

Assignee: Seiko Epson Corporation

Inventors: Cheryl Senter Brashears, Johannes Wang, Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
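
The two conditions that block an out-of-order load in the abstract above (patent 8019975) can be written as a small Python check. Representing older stores as a list of addresses, with `None` for a store whose address has not yet been calculated, is an assumption of this sketch, as is the function name.

```python
def can_issue_load_early(load_addr, older_store_addrs):
    """Return True if a load may be issued ahead of the older stores.

    older_store_addrs holds one entry per older, not yet completed store:
    an address, or None if the store address is not yet calculated.
    """
    for store_addr in older_store_addrs:
        if store_addr is None:
            return False  # write pending: the store might alias the load
        if store_addr == load_addr:
            return False  # address collision: the load must see the older store's data
    return True


ok = can_issue_load_early(0x100, [0x200, 0x300])
collision = can_issue_load_early(0x100, [0x200, 0x100])
pending = can_issue_load_early(0x100, [0x200, None])
```
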
SUPERCONDUCTING CIRCUIT FOR HIGH-SPEED LOOKUP TABLE

Publication number: 20110167241

Abstract: A high-speed lookup table is designed using Rapid Single Flux Quantum (RSFQ) logic elements and fabricated using superconducting integrated circuits. The lookup table is composed of an address decoder and a programmable read-only memory array (PROM). The memory array has rapid parallel pipelined readout and slower serial reprogramming of memory contents. The memory cells are constructed using standard non-destructive reset-set flip-flops (RSN cells) and data flip-flops (DFF cells). An n-bit address decoder is implemented in the same technology and closely integrated with the memory array to achieve high-speed operation as a lookup table. The circuit architecture is scalable to large two-dimensional data arrays.

Type: Application

Filed: March 8, 2011

Publication date: July 7, 2011

Applicant: HYPRES, INC.

Inventors: Alex F. Kirichenko, Timur V. Filippov, Deepnarayan Gupta
Limiting entries in load reorder queue searched for snoop check to between snoop peril and tail pointers

Patent number: 7966478

Abstract: A method for reducing entries searched in a load reorder queue (LRQ) when snoop instructions are executed by a processor, including checking load reorder queue (LRQ) entries located between a load_peril_snoop register and a lrq_tail register for addresses matching the address of the snoop; and setting a snooped bit in the LRQ entry for any matches found.

Type: Grant

Filed: July 14, 2008

Date of Patent: June 21, 2011

Assignee: International Business Machines Corporation

Inventors: Erik R. Altman, Vijayalakshmi Srinivasan
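
The restricted search in the abstract above (patent 7966478) can be sketched in Python as a walk over a circular queue from the `load_peril_snoop` index up to the `lrq_tail` index. Treating the queue as a list of dictionaries, treating the tail as exclusive, and the function name are all assumptions for illustration.

```python
def snoop_lrq(entries, load_peril_snoop, lrq_tail, snoop_addr):
    """Set the snooped bit on matching entries between the two pointers.

    entries is a circular buffer of dicts with 'addr' and 'snooped' keys.
    Only indices from load_peril_snoop up to (not including) lrq_tail are checked.
    Returns the number of entries that matched.
    """
    size = len(entries)
    hits = 0
    i = load_peril_snoop
    while i != lrq_tail:
        entry = entries[i]
        if entry["addr"] == snoop_addr:
            entry["snooped"] = True
            hits += 1
        i = (i + 1) % size  # wrap around the circular queue
    return hits


lrq = [{"addr": a, "snooped": False} for a in (0x10, 0x20, 0x10, 0x10)]
matched = snoop_lrq(lrq, load_peril_snoop=2, lrq_tail=0, snoop_addr=0x10)
```

Entry 0 holds the snooped address but lies outside the searched window, so it is never examined; skipping such entries is the saving the abstract describes.
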
RISC microprocessor architecture implementing multiple typed register sets

Patent number: 7941636

Abstract: Disclosed herein is an apparatus that implements multiple typed register sets, and applications thereof. The apparatus includes an execution unit and a register file. The execution unit is configured to execute instructions including one or more fields. The register file is configured to store operands defined by the one or more fields and is configured to store results of execution of the instructions in a destination defined by the one or more fields. The register file includes (i) a first register set having a register configured to store data of a single data type and (ii) a second register set having a register configured to store data of a plurality of data types.

Type: Grant

Filed: December 31, 2009

Date of Patent: May 10, 2011

Assignee: Intellectual Venture Funding LLC

Inventors: Sanjiv Garg, Derek J. Lentz, Le Trong Nguyen, Sho Long Chen
High-performance superscalar-based computer system with out-of-order instruction execution and concurrent results distribution

Patent number: 7941635

Abstract: The high-performance, RISC core based microprocessor architecture includes an instruction fetch unit for fetching instruction sets from an instruction store and an execution unit that implements the concurrent execution of a plurality of instructions through a parallel array of functional units. The fetch unit generally maintains a predetermined number of instructions in an instruction buffer. The execution unit includes an instruction selection unit, coupled to the instruction buffer, for selecting instructions for execution, and a plurality of functional units for performing instruction specified functional operations. A unified instruction scheduler, within the instruction selection unit, initiates the processing of instructions through the functional units when instructions are determined to be available for execution and for which at least one of the functional units implementing a necessary computational function is available.

Type: Grant

Filed: December 19, 2006

Date of Patent: May 10, 2011

Assignee: Seiko Epson Corporation

Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
Interrupt and exception handling for multi-streaming digital processors

Patent number: 7926062

Abstract: A multi-streaming processor has a plurality of streams for streaming one or more instruction threads, a set of functional resources for processing instructions from streams, and interrupt handler logic. The logic detects and maps interrupts and exceptions to one or more specific streams. In some embodiments, one interrupt or exception may be mapped to two or more streams, and in others two or more interrupts or exceptions may be mapped to one stream. Mapping may be static and determined at processor design, programmable, with data stored and amendable, or conditional and dynamic, the interrupt logic executing an algorithm sensitive to variables to determine the mapping. Interrupts may be external interrupts generated by devices external to the processor, software (internal) interrupts generated by active streams, or conditional, based on variables. After interrupts are acknowledged, streams to which interrupts or exceptions are mapped are vectored to appropriate service routines.

Type: Grant

Filed: April 29, 2009

Date of Patent: April 12, 2011

Assignee: MIPS Technologies, Inc.

Inventors: Mario D. Nemirovsky, Adolfo M. Nemirovsky, Narendra Sankar
Interrupt and exception handling for multi-streaming digital processors

Patent number: 7900207

Abstract: A multi-streaming processor has a plurality of streams for streaming one or more instruction threads, a set of functional resources for processing instructions from streams, and interrupt handler logic. The logic detects and maps interrupts and exceptions to one or more specific streams. In some embodiments, one interrupt or exception may be mapped to two or more streams, and in others two or more interrupts or exceptions may be mapped to one stream. Mapping may be static and determined at processor design, programmable, with data stored and amendable, or conditional and dynamic, the interrupt logic executing an algorithm sensitive to variables to determine the mapping. Interrupts may be external interrupts generated by devices external to the processor, software (internal) interrupts generated by active streams, or conditional, based on variables. After interrupts are acknowledged, streams to which interrupts or exceptions are mapped are vectored to appropriate service routines.

Type: Grant

Filed: November 19, 2008

Date of Patent: March 1, 2011

Assignee: MIPS Technologies, Inc.

Inventors: Mario D. Nemirovsky, Adolfo M. Nemirovsky, Narendra Sankar
MICROPROCESSOR WITH ALU INTEGRATED INTO LOAD UNIT

Publication number: 20110035569

Abstract: A superscalar pipelined microprocessor includes a register set defined by its instruction set architecture, a cache memory, execution units, and a load unit, coupled to the cache memory and distinct from the other execution units. The load unit comprises an ALU. The load unit receives an instruction that specifies a memory address of a source operand, an operation to be performed on the source operand to generate a result, and a destination register of the register set to which the result is to be stored. The load unit reads the source operand from the cache memory. The ALU performs the operation on the source operand to generate the result, rather than forwarding the source operand to any of the other execution units of the microprocessor to perform the operation on the source operand to generate the result. The load unit outputs the result for subsequent retirement to the destination register.

Type: Application

Filed: October 30, 2009

Publication date: February 10, 2011

Inventors: Gerard M. Col, Colin Eddy, Rodney E. Hooker
MICROPROCESSOR WITH ALU INTEGRATED INTO STORE UNIT

Publication number: 20110035570

Abstract: A superscalar pipelined microprocessor includes a register set defined by an instruction set architecture of the microprocessor, execution units, and a store unit, coupled to the cache memory and distinct from the other execution units of the microprocessor. The store unit comprises an ALU. The store unit receives an instruction that specifies a source register of the register set and an operation to be performed on a source operand to generate a result. The store unit reads the source operand from the source register. The ALU performs the operation on the source operand to generate the result, rather than forwarding the source operand to any of the other execution units of the microprocessor to perform the operation on the source operand to generate the result. The store unit operatively writes the result to the cache memory.

Type: Application

Filed: October 30, 2009

Publication date: February 10, 2011

Inventors: Gerard M. Col, Colin Eddy, Rodney E. Hooker
SUPERSCALAR REGISTER-RENAMING FOR A STACK-ADDRESSED ARCHITECTURE

Publication number: 20100318772

Abstract: A system and method for increasing processor throughput by decreasing a loop critical path. In one embodiment, a table comprises multiple stack entries, each comprising an x87 floating-point (FP) stack specifier. The combinatorial logic for operand translation of N FP instructions per clock cycle may require N instantiated copies of a combinatorial logic block. Each instantiated copy may determine a new ordering of the stack entries. Control logic may receive necessary information from the corresponding N FP instructions and determine a corresponding combined computational effect, or stack reordering, on entries within the table based on two or more instructions. Resulting control signals are conveyed to the N instantiated copies. A resulting accumulative delay from an input of the first copy to the output of the Nth copy may be less than or equal to (N-1)*time_delay versus a longer N*time_delay.

Type: Application

Filed: June 11, 2009

Publication date: December 16, 2010

Inventors: Ranganathan Sudhakar, Daryl Lieu, Debjit Das Sarma
System and method for handling load and/or store operations in a superscalar microprocessor

Patent number: 7844797

Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load store unit is provided whose main purpose is to make load requests out of order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out of order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.

Type: Grant

Filed: May 6, 2009

Date of Patent: November 30, 2010

Assignee: Seiko Epson Corporation

Inventors: Cheryl D. Senter, Johannes Wang
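
The out-of-order load condition described in the abstract of patent 7844797 above (no address collision and no write pending against any older store) can be illustrated with a minimal sketch. This is not the patented load store unit: the names PendingStore and can_issue_load_early are hypothetical, and treating a collision as any byte-range overlap is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PendingStore:
    """An older store still in flight (hypothetical model).

    address is None while the store address has not yet been calculated,
    which corresponds to the "write pending" case in the abstract.
    """
    address: Optional[int]
    size: int = 8


def can_issue_load_early(load_addr: int, load_size: int,
                         older_stores: Sequence[PendingStore]) -> bool:
    """Return True if the load may bypass every older store."""
    for store in older_stores:
        if store.address is None:
            # Write pending: the target of an older store is unknown.
            return False
        # Assumption: a collision is any overlap of the two byte ranges.
        overlaps = (load_addr < store.address + store.size
                    and store.address < load_addr + load_size)
        if overlaps:
            # Address collision: an older instruction will write here.
            return False
    return True
```

A load with no older stores in flight always passes the check; a single older store with an uncalculated address is enough to hold the load back.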
Checking for exception by floating point instruction reordered across branch by comparing current status in FP status register against last status copied in shadow register

Patent number: 7840788

Abstract: A process which automatically inserts commands that test for and raise exceptions indicating floating point status exceptions into a sequence of instructions to be executed, re-orders pipelined instructions by moving a floating point instruction from after a branch instruction to before the branch instruction, and responds to exceptions in execution of the sequence of instructions by returning execution to a point in the sequence of instructions at which correct state is known and then executing each instruction in the sequence singly to completion so that exceptions in pipelined floating point instructions can be automatically detected and handled precisely.

Type: Grant

Filed: February 26, 2008

Date of Patent: November 23, 2010

Inventors: Guillermo J. Rozas, David Dunn, Robert F. Cmelik
Process for automatic dynamic reloading of data flow processors (DFPs) and units with two- or three-dimensional programmable cell architectures (FPGAs, DPGAs, and the like)

Patent number: 7822881

Abstract: In a data-processing method, first result data may be obtained using a plurality of configurable coarse-granular elements, the first result data may be written into a memory that includes spatially separate first and second memory areas and that is connected via a bus to the plurality of configurable coarse-granular elements, the first result data may be subsequently read out from the memory, and the first result data may be subsequently processed using the plurality of configurable coarse-granular elements. In a first configuration, the first memory area may be configured as a write memory, and the second memory area may be configured as a read memory. Subsequent to writing to and reading from the memory in accordance with the first configuration, the first memory area may be configured as a read memory, and the second memory area may be configured as a write memory.

Type: Grant

Filed: October 7, 2005

Date of Patent: October 26, 2010

Inventors: Martin Vorbach, Robert Münch
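
The alternating write/read memory configuration in the abstract of patent 7822881 above resembles a double-buffer (ping-pong) scheme, which can be sketched in software as follows. This is only an illustration under that assumption: the class PingPongMemory and its methods are hypothetical, and Python lists stand in for the two spatially separate memory areas.

```python
class PingPongMemory:
    """Two memory areas whose write and read roles are exchanged on swap()."""

    def __init__(self):
        self._areas = [[], []]   # stand-ins for the first and second memory areas
        self._write_area = 0     # first configuration: area 0 is the write memory

    def write(self, value):
        """Store a result in the area currently configured as write memory."""
        self._areas[self._write_area].append(value)

    def read_all(self):
        """Return a copy of the area currently configured as read memory."""
        return list(self._areas[1 - self._write_area])

    def swap(self):
        """Reconfigure: the old write area becomes readable and the old read
        area is emptied and becomes the new write area."""
        self._write_area = 1 - self._write_area
        self._areas[self._write_area].clear()


mem = PingPongMemory()
mem.write(1)
mem.write(2)
mem.swap()                 # results 1 and 2 are now in the read memory
first_pass = mem.read_all()
```

Clearing the new write area on swap is a design choice of this sketch, not something the abstract specifies.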
Superscalar RISC instruction scheduling

Patent number: 7802074

Abstract: A register renaming system for out-of-order execution of a set of reduced instruction set computer instructions having addressable source and destination register fields, adapted for use in a computer having an instruction execution unit with a register file accessed by read address ports and for storing instruction operands. A data dependance check circuit is included for determining data dependencies between the instructions. A tag assignment circuit generates one or more tags to specify the location of operands, based on the data dependencies determined by the data dependance check circuit. A set of register file port multiplexers select the tags generated by the tag assignment circuit and pass the tags onto the read address ports of the register file for storing execution results.

Type: Grant

Filed: April 2, 2007

Date of Patent: September 21, 2010

Inventors: Sanjiv Garg, Kevin Ray Iadonato, Le Trong Nguyen, Johannes Wang
High-performance, superscalar-based computer system with out-of-order instruction execution

Patent number: 7739482

Abstract: A high-performance, superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput. The computer system fetches a plurality of fixed length instructions with a specified, sequential program order (in-order). The computer system includes an instruction execution unit including a register file, a plurality of functional units, and an instruction control unit for examining the instructions and scheduling the instructions for out-of-order execution by the functional units. The register file includes a set of temporary data registers that are utilized by the instruction execution control unit to receive data results generated by the functional units. The data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.

Type: Grant

Filed: December 21, 2006

Date of Patent: June 15, 2010

Assignee: Seiko Epson Corporation

Inventors: Le Trong Nguyen, Derek J. Lentz, Yoshiyuki Miyayama, Sanjiv Garg, Yasuaki Hagiwara, Johannes Wang, Te-Li Lau, Sze-Shun Wang, Quang H. Trang
Allocation of memory access operations to memory access capable pipelines in a superscalar data processing apparatus and method having a plurality of execution threads

Patent number: 7734897

Abstract: A superscalar data processing apparatus and method are provided for processing operations, the apparatus having a plurality of execution threads and each execution thread being operable to process a sequence of operations including at least one memory access operation. The superscalar data processing apparatus comprises a plurality of execution pipelines for executing the operations, and issue logic for allocating each operation to one of the execution pipelines for execution by that execution pipeline. At least two of the execution pipelines are memory access capable pipelines which can execute memory access operations, and each memory access capable pipeline is associated with a subset of the plurality of execution threads. The issue logic is arranged, for each execution thread, to allocate any memory access operations of that execution thread to an associated memory access capable pipeline.

Type: Grant

Filed: December 21, 2005

Date of Patent: June 8, 2010

Assignee: ARM Limited

Inventor: David Hennah Mansell
Operation of cell processors

Patent number: 7734827

Abstract: Secure operation of cell processors is disclosed. A cell processor receives a secure file image from a client device at a cell processor of a host device (host cell processor), wherein the secure file image includes an encrypted SPU image.

Type: Grant

Filed: October 24, 2005

Date of Patent: June 8, 2010

Assignee: Sony Computer Entertainment, Inc.

Inventor: Tatsuya Iwamoto
Systems and methods for parallel distributed programming

Patent number: 7712080

Abstract: The present invention relates generally to computer programming, and more particularly to systems and methods for parallel distributed programming. Generally, a parallel distributed program is configured to operate across multiple processors and multiple memories. In one aspect of the invention, a parallel distributed program includes a distributed shared variable located across the multiple memories and distributed programs capable of operating across multiple processors.

Type: Grant

Filed: May 21, 2004

Date of Patent: May 4, 2010

Assignee: The Regents of the University of California

Inventors: Lei Pan, Lubomir R. Bic, Michael B. Dillencourt
Processor core and method for managing branch misprediction in an out-of-order processor pipeline

Patent number: 7711934

Abstract: A processor core and method for managing branch misprediction in an out-of-order processor pipeline. In one embodiment, the pipeline of the processor core includes a front-end instruction fetch portion, a back-end instruction execution portion, and pipeline control logic. Operation of the instruction fetch portion is decoupled from operation of the instruction execution portion. Following detection of a control transfer misprediction, operation of the instruction fetch portion is halted and instructions residing in the instruction fetch portion are invalidated. When the instruction associated with the misprediction reaches a selected pipeline stage, instructions residing in the instruction execution portion of the pipeline are invalidated and the flow of instructions from the instruction fetch portion to the instruction execution portion of the processor pipeline is restarted.

Type: Grant

Filed: October 31, 2005

Date of Patent: May 4, 2010

Assignee: MIPS Technologies, Inc.

Inventors: Karagada Ramarao Kishore, Kjeld Svendsen, Vidya Rajagopalan
Multiplexing output from second execution unit add/saturation processing portion of wider width intermediate result of first primitive execution unit for compound computation

Patent number: 7694112

Abstract: A method for executing multiple computational primitives is provided in accordance with exemplary embodiments. A first computational unit and at least a second computational unit cooperate to execute multiple computational primitives. The first computational unit independently computes other computational primitives. By virtue of arbitration for shared source operand buses or shared result buses, availability of the first and second computational units needed to execute cooperatively the multiple computational primitives is assured by a process of reservation as used for a computational primitive executed on a dedicated computational unit.

Type: Grant

Filed: January 31, 2008

Date of Patent: April 6, 2010

Assignee: International Business Machines Corporation

Inventors: Harry S. Barowski, J. Adam Butts, Stephen V. Kosonocky, Silvia M. Mueller, Jochen Preiss
RISC microprocessor architecture implementing multiple typed register sets

Patent number: 7685402

Abstract: A register system for a data processor which operates in a plurality of modes. The register system provides multiple, identical banks of register sets, the data processor controlling access such that instructions and processes need not specify any given bank. An integer register set includes first (RA[23:0]) and second (RA[31:24]) subsets, and a shadow subset (RT[31:24]). While the data processor is in a first mode, instructions access the first and second subsets. While the data processor is in a second mode, instructions may access the first subset, but any attempts to access the second subset are re-routed to the shadow subset instead, transparently to the instructions, allowing system routines to seemingly use the second subset without having to save and restore data which user routines have written to the second subset. A re-typable register set provides integer width data and floating point width data in response to integer instructions and floating point instructions, respectively.

Type: Grant

Filed: January 9, 2007

Date of Patent: March 23, 2010

Inventors: Sanjiv Garg, Derek J. Lentz, Le Trong Nguyen, Sho Long Chen
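
The shadow-subset routing in the abstract of patent 7685402 above can be modeled with a short sketch. The index ranges RA[23:0], RA[31:24], and RT[31:24] come from the abstract; the class name, the system_mode flag (standing in for the "second mode"), and the list-based storage are assumptions made for illustration.

```python
class IntegerRegisterFile:
    """Integer register set with a shadow subset for registers 24..31."""

    def __init__(self):
        self.ra = [0] * 32          # RA[31:0]: first and second subsets
        self.rt = [0] * 8           # RT[31:24]: shadow subset
        self.system_mode = False    # assumption: True models the second mode

    def _slot(self, index):
        """Map an architectural register index to (storage, offset)."""
        if not 0 <= index < 32:
            raise IndexError("register index out of range")
        if self.system_mode and index >= 24:
            # Second mode: accesses to RA[31:24] are re-routed to RT[31:24].
            return self.rt, index - 24
        return self.ra, index

    def read(self, index):
        bank, offset = self._slot(index)
        return bank[offset]

    def write(self, index, value):
        bank, offset = self._slot(index)
        bank[offset] = value
```

In this model a value written to register 31 in the first mode is untouched by writes to register 31 in the second mode, so no save and restore is needed, while registers 0 through 23 remain shared between the two modes.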
System and method for translating non-native instructions to native instructions for processing on a host processor

Patent number: 7664935

Abstract: A system and method for extracting complex, variable length computer instructions from a stream of complex instructions each subdivided into a variable number of instructions bytes, and aligning instruction bytes of individual ones of the complex instructions. The system receives a portion of the stream of complex instructions and extracts a first set of instruction bytes starting with the first instruction bytes, using an extract shifter. The set of instruction bytes are then passed to an align latch where they are aligned and output to a next instruction detector. The next instruction detector determines the end of the first instruction based on said set of instruction bytes. An extract shifter is used to extract and provide the next set of instruction bytes to an align shifter which aligns and outputs the next instruction. The process is then repeated for the remaining instruction bytes in the stream of complex instructions.

Type: Grant

Filed: March 11, 2008

Date of Patent: February 16, 2010

Inventors: Brett Coon, Yoshiyuki Miyayama, Le Trong Nguyen, Johannes Wang
Resource sharing in multiple parallel pipelines

Patent number: 7653804

Abstract: A signal processing network and method for generating code for such a signal processing network are described. Pipeline blocks are each coupled to receive control signaling and associated information signaling from a scheduler. Each of the pipeline blocks respectively includes an allocation unit, a pipeline, and section controllers. The allocation unit is configured to provide a lock signal and sequence information to the section controllers in each of the pipeline blocks. The section controllers are configured to maintain in order inter-pipeline execution of the sequence responsive to the sequence information and the lock signal.

Type: Grant

Filed: January 26, 2006

Date of Patent: January 26, 2010

Assignee: Xilinx, Inc.

Inventors: Thomas A. Lenart, Jorn W. Janneck
Mixed superscalar and VLIW instruction issuing and processing method and system

Patent number: 7590824

Abstract: Techniques for processing transmissions in a communications (e.g., CDMA) system. A method and system for issuing and executing mixed architecture instructions in a multiple-issue digital signal processor receives in a mixed instruction listing a plurality of digital signal processor instructions. The plurality of digital signal processor instructions includes a plurality of parallel executable instructions (e.g., VLIW instructions or instruction packets) mixed among a plurality of series executable instructions (e.g., superscalar instructions). The series executable instructions are associated by various instruction dependencies. The method and system further identify in the mixed instruction listing the plurality of parallel executable instructions. Once identified, the parallel executable instructions are first executed in parallel irrespective of any such instruction's relative order in the mixed instruction listing.

Type: Grant

Filed: March 29, 2005

Date of Patent: September 15, 2009

Assignee: QUALCOMM Incorporated

Inventors: Muhammad Ahmed, Erich Plondke, Lucian Codrescu, William C. Anderson
Selective Power-Down For High Performance CPU/System

Publication number: 20090228729

Abstract: A microelectronic device according to the present invention is made up of two or more functional units, which are all disposed on a single chip, or die. The present invention works on the strategy that all of the functional units on the die are not, and do not need to be, operational at a given time in the execution of a computer program that is controlling the microelectronic device. The present invention on a very rapid basis (typically a half clock cycle), therefore, turns on and off the functional units of the microelectronic device in accordance with the requirements of the program being executed. This power down can be achieved by one of three techniques: turning off clock inputs to the functional units, interrupting the supply of power to the functional units, or deactivating input signals to the functional units.

Type: Application

Filed: February 19, 2009

Publication date: September 10, 2009

Applicant: Seiko Epson Corporation

Inventor: Chong Ming LIN
Multiprocessor computer having configurable hardware system domains

Patent number: RE41293

Abstract: Global address and data routers interconnect individual system units each having its own processors, memory, and I/O. A domain filter coupled to the routers dynamically defines groups of system units as domains and clusters of domains which have both software and hardware isolation from each other. Clusters can share dynamically definable ranges of memory with each other. The domain filter has software-loadable registers on the system units and in the global routers to set the parameters of the domains and clusters. The registers label individual inter-system transactions on the routers as invalid for system units not in the same domain or cluster as the originating unit.

Type: Grant

Filed: August 1, 2001

Date of Patent: April 27, 2010

Assignee: Sun Microsystems, Inc.

Inventors: Daniel P. Drogichen, Andrew J. McCrocklin, Nicholas E. Aneshansley
