Patents Examined by Corey Faherty
-
Patent number: 7831812
Abstract: A processor includes a processor core with a core interface unit that includes an age queue and a request queue. The core interface unit receives load requests from the processor core. The request queue stores the requests in respective slots of the request queue. The age queue stores ID tags in respective age queue slots. Each ID tag in the age queue corresponds to a respective address of a load instruction in the request queue. In one embodiment, ID tags propagate through the age queue at a fixed rate of two at a time from a tail of the age queue to a head of the age queue. Arbitration control circuitry generates an enable bit vector that identifies the oldest ID tag in the age queue corresponding to the oldest load request in the request queue. The arbitration circuitry selects the identified oldest instruction in the request queue as the next to dispatch. In one embodiment, the core interface unit exhibits an input frequency that is a multiple of an internal operating frequency of the core interface unit.
Type: Grant
Filed: August 31, 2007
Date of Patent: November 9, 2010
Assignee: International Business Machines Corporation
Inventors: Alvan Wing Ng, Takuya Kano
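The arbitration step described in this abstract can be illustrated with a minimal software sketch. It is not the patented circuit: the function name `oldest_enable_vector`, the list-based queues, and the `valid` flags are all assumptions made for illustration, and the two-at-a-time propagation through the age queue is not modeled.

```python
from typing import List, Optional


def oldest_enable_vector(age_queue: List[Optional[int]],
                         valid: List[bool]) -> List[int]:
    """Return a one-hot enable vector over request-queue slots.

    age_queue -- ID tags ordered from head (index 0, oldest) to tail;
                 each tag is a request-queue slot index, or None if the
                 age-queue slot is empty (an assumed encoding).
    valid     -- valid[i] is True when request-queue slot i holds a
                 pending load request (an assumed encoding).
    """
    enable = [0] * len(valid)
    for tag in age_queue:  # walk from the head, i.e. oldest first
        if tag is not None and valid[tag]:
            enable[tag] = 1  # oldest pending request is selected
            break
    return enable


# Slot 2 holds the oldest tag and is pending, so it is chosen.
print(oldest_enable_vector([2, 0, 3, 1], [True, False, True, True]))
```

If the slot named by the head tag is no longer pending, the walk simply continues to the next-oldest tag.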
-
Patent number: 7827386
Abstract: A first set of instructions and incoming data are provided to a first processing unit of a data driven processor, to operate upon the incoming data. The first processing unit, in response to recognizing that the first set of instructions will require either reading from or writing to external memory, sets up a logical channel between a second processing unit of the processor and the external memory, to transfer additional data between the external memory and the second processing unit. This capability may be implemented by the addition of a control port, separate from data ports, to the first processing unit, where the control port allows the first processing unit to write addressing information and mode information (including the location of the additional data) for reading or writing the additional data via a memory access unit data channel of the processor.
Type: Grant
Filed: June 30, 2003
Date of Patent: November 2, 2010
Assignee: Intel Corporation
Inventors: Louis A. Lippincott, Chin Hong Cheah
-
Patent number: 7822945
Abstract: A semiconductor device including a multi-layer interconnection substrate having a signal distribution interconnection and a power supply line and semiconductor circuit blocks installed on the multi-layer interconnection substrate for performing required operations. The multi-layer substrate includes a third interconnection layer having interconnections extending in a first direction, a second interconnection layer having interconnections extending in a second direction which is different to the first direction, and a first interconnection layer having interconnections extending in a direction orthogonal to the first direction.
Type: Grant
Filed: February 2, 2007
Date of Patent: October 26, 2010
Assignee: NEC Corporation
Inventor: Takeo Hayashi
-
Patent number: 7818552
Abstract: A VLIW processor is provided with an architecture which includes fetching and executing circuitry which when combined with operation, compare, branch (OCB) instructions realizes no processing branch penalties. The OCB instructions are provided with two direct branch fields or with two indirect branch fields.
Type: Grant
Filed: December 20, 2007
Date of Patent: October 19, 2010
Assignee: The United States of America as represented by the Secretary of the Army
Inventor: Patrick W. Jungwirth
-
Patent number: 7814302
Abstract: A data processing system 2 is provided including an instruction decoder 34 responsive to program instructions within an instruction register 32 to generate control signals for controlling data processing circuitry 36. The instructions supported include an address calculation instruction which splits an input address value at a position dependent upon a size value into a first portion and second portion, adds a non-zero offset value to the first portion, sets the second portion to a value and then concatenates the result of the processing on the first portion and the second portion to form the output address value. Another type of instruction supported is a select-and-insert instruction. This instruction takes a first input value and shifts it by N bit positions to form a shifted value, selects N bits from within a second input value in dependence upon the first input value and then concatenates the shifted value with the N bits to form an output value.
Type: Grant
Filed: February 13, 2008
Date of Patent: October 12, 2010
Assignee: ARM Limited
Inventors: Dominic Hugo Symes, Daniel Kershaw, Mladen Wilder
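The address calculation instruction in this abstract can be sketched in software as follows. This is only one possible reading: the abstract does not say which portion is the upper one, so the sketch assumes the first portion is the high-order bits and the second portion is the low-order bits below a split at `size_log2`. The function name, the 32-bit default width, and the wrap-around of the first portion are likewise assumptions.

```python
def address_calc(addr: int, size_log2: int, offset: int,
                 low_value: int = 0, width: int = 32) -> int:
    """Split addr at bit size_log2, add offset to the upper part,
    overwrite the lower part with low_value, and rejoin the two parts.

    Assumption: "first portion" = upper bits, "second portion" = lower
    bits. The abstract requires the offset to be non-zero.
    """
    if offset == 0:
        raise ValueError("offset must be non-zero")
    full_mask = (1 << width) - 1
    low_mask = (1 << size_log2) - 1
    high_mask = full_mask >> size_log2

    high = (addr & full_mask) >> size_log2   # first portion
    high = (high + offset) & high_mask       # add offset, wrap in width
    low = low_value & low_mask               # second portion set to a value
    return (high << size_log2) | low         # concatenate


# Step to the next 256-byte block and clear the in-block offset.
print(hex(address_calc(0x1234, 8, 1)))
```

Under these assumptions the instruction behaves like a block-stepping address update: the block index moves by `offset` while the position within the block is forced to a chosen value.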
-
Patent number: 7809932
Abstract: Processor pipeline controlling techniques are described which take advantage of the variation in critical path lengths of different instructions to achieve increased performance. By examining a processor's instruction set and execution unit implementation's critical timing paths, instructions are classified into speed classes. Based on these speed classes, one pipeline is presented where hold signals are used to dynamically control the pipeline based on the instruction class in execution. An alternative pipeline supporting multiple classes of instructions is presented where the pipeline clocking is dynamically changed as a result of decoded instruction class signals. A single pass synthesis methodology for multi-class execution stage logic is also described. For dynamic class variable pipeline processors, the mix of instructions can have a great effect on processor performance and power utilization since both can vary by the program mix of instruction classes.
Type: Grant
Filed: March 22, 2004
Date of Patent: October 5, 2010
Assignee: Altera Corporation
Inventors: Edwin Franklin Barry, Gerald George Pechanek, Patrick R. Marchand
-
Patent number: 7809933
Abstract: A system and method for optimizing the branch logic of a processor to improve handling of hard to predict indirect branches are provided. The system and method leverage the observation that there will generally be only one move to the count register (mtctr) instruction that will be executed while a branch on count register (bcctr) instruction has been fetched and not executed. With the mechanisms of the illustrative embodiments, fetch logic detects that it has encountered a bcctr instruction that is hard to predict and, in response to this detection, blocks the target fetch from entering the instruction buffer of the processor. At this point, the fetch logic has fetched all the instructions up to and including the bcctr instruction but no target instructions. When the next mtctr instruction is executed, the branch logic of the processor grabs the data and starts fetching using that target address.
Type: Grant
Filed: June 7, 2007
Date of Patent: October 5, 2010
Assignee: International Business Machines Corporation
Inventors: David S. Levitan, Wolfram Sauer
-
Patent number: 7793083
Abstract: A processor (10) manages, in an instruction management unit (103) and a data attribute management unit (105), secure attributes indicating whether instruction code and data stored in an instruction cache (102) and a data cache (104) of the processor (10) are confidential information. When the instruction code and the data are confidential information, the processor (10) also manages secure processing identification information for indicating in which secure process the confidential information is to be used. When the operating mode is switched from the secure mode to the normal mode, only the confidential information is disabled by a memory disabling unit (108). This prevents confidential information from being analyzed by the processor in the normal mode.
Type: Grant
Filed: November 24, 2005
Date of Patent: September 7, 2010
Assignee: Panasonic Corporation
Inventors: Masaaki Harada, Tsutomu Sekibe
-
Patent number: 7793084
Abstract: The present invention provides an efficient method to implement nested if-then-else conditional statements in a SIMD processor, which requires only one vector compare instruction for both if and else parts of the conditional construct. No stack and stack-handling instructions are needed for vector condition codes. Two condition code flag bits representing if and else parts of testing per element provide for nesting of multiple if-then-else. All SIMD instructions are conditional including the vector compare instruction, and this provides a method for aggregating multiple conditions in nested if-then-else statements. M full levels of if-then-else nesting requires (2M-1) nodes or vector test instructions and 2M+1 condition code flags per vector element. Also, capability to compare any element of first source vector register with any element of second source vector is provided.
Type: Grant
Filed: May 20, 2003
Date of Patent: September 7, 2010
Inventor: Tibet Mimar
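The idea of one conditional vector compare producing both an "if" flag and an "else" flag per element can be sketched with Python lists standing in for vector registers. This is an illustrative model under assumptions, not the patented instruction set: the helper names `vcompare_gt` and `vmove`, and the use of an `enable` mask to represent conditional execution, are hypothetical.

```python
from typing import List, Tuple


def vcompare_gt(a: List[int], b: List[int],
                enable: List[bool]) -> Tuple[List[bool], List[bool]]:
    """Conditional vector compare: one call yields both flag sets.

    Elements whose enable flag is False get neither flag, which is how
    an inner compare inherits the outer condition without a stack.
    """
    if_flags = [en and (x > y) for x, y, en in zip(a, b, enable)]
    else_flags = [en and not (x > y) for x, y, en in zip(a, b, enable)]
    return if_flags, else_flags


def vmove(dest: List[int], src: List[int], enable: List[bool]) -> List[int]:
    """Conditional vector move: write src only where enable is True."""
    return [s if en else d for d, s, en in zip(dest, src, enable)]


# Per element:  if a > 3: (if c > 5: out = 1 else: out = 2) else: out = 3
a, c = [5, 1, 7, 2], [10, 10, 0, 0]
outer_if, outer_else = vcompare_gt(a, [3] * 4, [True] * 4)
inner_if, inner_else = vcompare_gt(c, [5] * 4, outer_if)  # nested under "if"
out = vmove([0] * 4, [1] * 4, inner_if)
out = vmove(out, [2] * 4, inner_else)
out = vmove(out, [3] * 4, outer_else)
print(out)
```

Each nesting level here costs one compare and one pair of flag vectors, and no flags are pushed or popped.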
-
Patent number: 7793074
Abstract: An apparatus comprises a plurality of processor cores, and an interconnection network to route data among the processor cores based on destination information in the data. The processor cores are configured to forward the data to a final destination if the destination information indicates that a destination processor core has been reached, or to forward the data to other processor cores if the destination information indicates that a destination processor core has not been reached. The final destination is one of a plurality of destinations indicated by the destination information, the destinations including a plurality of portions of the destination processor core.
Type: Grant
Filed: April 14, 2006
Date of Patent: September 7, 2010
Assignee: Tilera Corporation
Inventors: David Wentzlaff, Anant Agarwal
-
Patent number: 7788468
Abstract: A “cooperative thread array,” or “CTA,” is a group of multiple threads that concurrently execute the same program on an input data set to produce an output data set. Each thread in a CTA has a unique thread identifier assigned at thread launch time that controls various aspects of the thread's processing behavior such as the portion of the input data set to be processed by each thread, the portion of an output data set to be produced by each thread, and/or sharing of intermediate results among threads. Different threads of the CTA are advantageously synchronized at appropriate points during CTA execution using a barrier synchronization technique in which barrier instructions in the CTA program are detected and used to suspend execution of some threads until a specified number of other threads also reaches the barrier point.
Type: Grant
Filed: December 15, 2005
Date of Patent: August 31, 2010
Assignee: NVIDIA Corporation
Inventors: John R. Nickolls, Stephen D. Lew, Brett W. Coon, Peter C. Mills
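The programming model in this abstract (same program, unique thread ID selecting each thread's data, barrier before sharing intermediate results) can be imitated on a CPU with Python's standard `threading.Barrier`. The sketch below is an analogy only, not GPU code; `run_cta`, the chunking rule, and the "add your neighbour's partial sum" step are assumptions chosen to make the barrier necessary.

```python
import threading
from typing import List


def run_cta(data: List[int], num_threads: int) -> List[int]:
    """Run num_threads copies of one worker; the thread ID picks the slice."""
    chunk = (len(data) + num_threads - 1) // num_threads
    partial = [0] * num_threads          # shared intermediate results
    out = [0] * num_threads              # one output element per thread
    barrier = threading.Barrier(num_threads)

    def worker(tid: int) -> None:
        # Phase 1: the thread ID selects this thread's input portion.
        partial[tid] = sum(data[tid * chunk:(tid + 1) * chunk])
        # Suspend until all num_threads threads have reached this point.
        barrier.wait()
        # Phase 2: safe to read a neighbour's intermediate result now.
        out[tid] = partial[tid] + partial[(tid + 1) % num_threads]

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


print(run_cta(list(range(8)), 4))
```

Without the `barrier.wait()` call, a thread could read a neighbour's `partial` entry before that neighbour had written it.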
-
Patent number: 7779239
Abstract: A processor includes a feature control unit to enable or disable one or more processor features individually in response to a user selectable setting. The feature control unit is adapted to disable the processor feature(s) if the user setting has not been updated in accordance with an input regardless of the value of the user setting prior to the update and to enable or disable the processor feature(s) in accordance with the updated user setting after it has been updated. The feature control unit may also include a lock unit to prevent changes to the updated user setting and a software feature selection unit to enable or disable processor features in response to a software feature selection setting and, optionally, only enable or disable processor features whose corresponding updated user setting is user enabled. The feature control unit may also include mechanisms to detect illegal feature selection conditions.
Type: Grant
Filed: July 28, 2004
Date of Patent: August 17, 2010
Assignee: Intel Corporation
Inventors: Stephen A. Fischer, Dion Rodgers, James A. Sutton
-
Patent number: 7779231
Abstract: A processor and a method for executing VLIW instructions using pipeline execution wherein each VLIW instruction includes a plurality of instructions and wherein the pipeline includes at least the following stages: first and second instruction fetch stages, a pre-decode stage, an instruction dispatch stage, first and second decoding stages, an execution stage and a write-back stage. During the first instruction fetch stage the number of outstanding instructions is determined where these outstanding instructions are from previous VLIW instructions that have not yet been issued for execution. During the second instruction fetch stage a comparison is performed on whether the number of outstanding instructions is less than the number of instructions in a VLIW instruction where if the number of outstanding instructions is less than the number of instructions in an instruction packet then the next VLIW instruction is fetched and the outstanding instructions are shifted and aligned with the fetched VLIW instruction.
Type: Grant
Filed: May 23, 2003
Date of Patent: August 17, 2010
Assignee: STMicroelectronics (R&D) Ltd.
Inventor: Zahid Hussain
-
Patent number: 7769988
Abstract: A method of integrating a personal computing system and apparatus thereof include processing that begins by integrating a central processing unit with a North bridge on a single substrate such that the central processing unit is directly coupled to the North bridge via an internal bus. The processing then continues by providing memory access requests from the central processing unit to the North bridge at a rate of the central processing unit. The processing continues by having the North bridge buffer the memory access requests and subsequently process the memory access requests at a rate of the memory. The method may be expanded by integrating a South bridge onto the same substrate as well as integrating system memory onto the same substrate.
Type: Grant
Filed: December 23, 1999
Date of Patent: August 3, 2010
Assignee: ATI Technologies ULC
Inventors: Adrian Sfarti, Korbin Van Dyke, Michael Frank, Arkadi Avrukin
-
Patent number: 7769984
Abstract: A dual-issue instruction is decoded to determine a plurality of LSU dependencies needed by an LSU part of the dual-issue instruction and a plurality of non-LSU dependencies needed by a non-LSU part of the dual-issue instruction. During dispatch of the dual-issue instruction by the microprocessor, the dual dependency matrices are employed as follows: a Load-Store Unit (LSU) dependency matrix is written with the plurality of LSU dependencies and a non-LSU dependency matrix is written with the plurality of non-LSU dependencies; an LSU issue valid (LSU IV) indicator is set as valid to issue; an LSU portion of the dual-issue instruction is issued once the plurality of LSU dependencies of the dual issue instruction are satisfied; a non-LSU issue valid (non-LSU IV) indicator is set as valid to issue; and a non-LSU portion of the dual-issue instruction is issued once the plurality of non-LSU dependencies of the dual issue instruction are satisfied.
Type: Grant
Filed: September 11, 2008
Date of Patent: August 3, 2010
Assignee: International Business Machines Corporation
Inventors: Gregory W. Alexander, Brian D. Barrick, Lee E. Eisen, John W. Ward, III
-
Patent number: 7765387
Abstract: A program counter control method controls instructions by an out-of-order method using a branch prediction mechanism and controls an architecture having delay instructions for branching. The method includes the steps of simultaneously committing a plurality of instructions including a branch instruction, when a branch prediction is successful and the branch instruction branches, and simultaneously updating a program counter and a next program counter depending on a number of committed instructions.
Type: Grant
Filed: January 28, 2003
Date of Patent: July 27, 2010
Assignee: Fujitsu Limited
Inventors: Ryuichi Sunayama, Kuniki Morita, Aiichiro Inoue
-
Patent number: 7730290
Abstract: A method is disclosed for executing a load instruction. Address information of the load instruction is used to generate an address of needed data, and the address is used to search a cache memory for the needed data. If the needed data is found in the cache memory, a cache hit signal is generated. At least a portion of the address is used to search a queue for a previous load instruction specifying the same address. If a previous load instruction specifying the same address is found, the cache hit signal is ignored and the load instruction is stored in the queue. A load/store unit, and a processor implementing the method, are also described.
Type: Grant
Filed: February 25, 2008
Date of Patent: June 1, 2010
Assignee: International Business Machines Corporation
Inventors: Brian David Barrick, Kimberly Marie Fernsler, Dwain A. Hicks, Takeki Osanai, David Scott Ray
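The decision flow in this abstract can be modelled with a dictionary as the cache and a list as the queue of outstanding loads. The sketch is a simplification under assumptions: `execute_load` and its return values are hypothetical, the queue is searched on the full address rather than a portion of it, and the handling of a plain cache miss (queue the load) is not stated in the abstract.

```python
from typing import Dict, List, Optional, Tuple


def execute_load(addr: int, cache: Dict[int, int],
                 load_queue: List[int]) -> Tuple[str, Optional[int]]:
    """Model one load: probe the cache, then check the queue for an
    earlier load to the same address before trusting the hit."""
    cache_hit = addr in cache              # cache hit signal
    earlier_load = addr in load_queue      # previous load, same address

    if earlier_load:
        # The hit signal is ignored; the load waits behind the earlier one.
        load_queue.append(addr)
        return ("queued", None)
    if cache_hit:
        return ("hit", cache[addr])
    # Assumed miss handling: the load is queued until data arrives.
    load_queue.append(addr)
    return ("miss", None)


cache = {0x40: 7}
print(execute_load(0x40, cache, []))        # no earlier load: hit is used
print(execute_load(0x40, cache, [0x40]))    # earlier load pending: queued
```

Queuing behind the earlier load keeps loads to the same address in order even when the later one would have hit in the cache.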
-
Patent number: 7730292
Abstract: In the context of a microprocessor and a program, the invention provides parallel subword compare instructions that store results in a selectable intra-register subword location. In a targeting approach, an instruction permits the location to be specified; alternatively, there can be plural instructions, each associated with one of the locations. In a replicating approach, plural replicas are stored in the alternative locations. In a shifting approach, the instruction moves prior results, so that the number of subsequent iterations of the instruction determines the location of a result. The invention provides for overwriting and content-preserving instructions, and for overlapping and separate locations. The invention allows results from multiple parallel subword compare operations with relatively few instructions. The invention also provides for other parallel subword instructions.
Type: Grant
Filed: March 31, 2003
Date of Patent: June 1, 2010
Assignee: Hewlett-Packard Development Company, L.P.
Inventor: Ruby B. Lee
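The "targeting approach" mentioned in this abstract can be sketched as a compare that packs one result bit per subword and writes the packed field at a caller-chosen position in the destination register. Everything specific below is an assumption for illustration: the name `subword_compare_eq`, equality as the comparison, 8-bit subwords in a 64-bit register, and a content-preserving write that leaves other destination bits unchanged.

```python
def subword_compare_eq(a: int, b: int, dest: int = 0, location: int = 0,
                       subword_bits: int = 8, word_bits: int = 64) -> int:
    """Compare all subwords of a and b for equality and store the
    result bits in dest at field number `location`."""
    n = word_bits // subword_bits            # subwords per register
    if (location + 1) * n > word_bits:
        raise ValueError("result field does not fit in the register")
    sub_mask = (1 << subword_bits) - 1

    bits = 0
    for i in range(n):                       # hardware would do this in parallel
        shift = i * subword_bits
        if ((a >> shift) & sub_mask) == ((b >> shift) & sub_mask):
            bits |= 1 << i                   # one result bit per subword

    field_shift = location * n
    field_mask = ((1 << n) - 1) << field_shift
    # Content-preserving: only the targeted field of dest is replaced.
    return (dest & ~field_mask) | (bits << field_shift)


a, b = 0x1122334455667788, 0x1100330055007700
r = subword_compare_eq(a, b, location=0)           # first result in field 0
r = subword_compare_eq(a, b, dest=r, location=1)   # second result in field 1
print(hex(r))
```

Placing successive results in different fields lets several compares accumulate in one register before a single test or branch consumes them.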
-
Patent number: 7725686
Abstract: Systems and methods for determining whether to retire a data entry from a buffer using multiple retirement logic units. In one embodiment, each retirement unit concurrently evaluates retirement conditions for one of the buffer entries in an associated subset (e.g., even or odd) of the buffer. Selection logic coupled to the retirement units alternately selects the first or second retirement unit for retirement of one of the entries in the associated subset. Because the aggregate number of entries retired by the combined retirement logic units is divided by the number of retirement logic units, each retirement logic unit has more time to process the retirement conditions for corresponding queue entries. The buffer may be any of a variety of different types of buffers and may comprise a single buffer, or multiple buffers.
Type: Grant
Filed: July 24, 2006
Date of Patent: May 25, 2010
Assignees: Kabushiki Kaisha Toshiba, International Business Machines Corporation
Inventors: Takeki Osanai, Brian D. Barrick
-
Patent number: 7725693
Abstract: Embodiments include a device and a method. In an embodiment, a device provides a resource manager operable to select a resource management policy likely to provide a substantially optimum execution of an instruction group by comparing an execution of the instruction group pursuant to a first resource management policy applied to a hardware resource and an execution of the instruction group pursuant to a second resource management policy applied to the hardware resource.
Type: Grant
Filed: September 22, 2006
Date of Patent: May 25, 2010
Assignee: Searete, LLC
Inventors: Bran Ferren, W. Daniel Hillis, Nathan P. Myhrvold, Clarence T. Tegreene, Lowell L. Wood, Jr.