Loop Execution Patents (Class 712/241)

LOOP DETECTION APPARATUS, LOOP DETECTION METHOD, AND LOOP DETECTION PROGRAM

Publication number: 20120137111

Abstract: A loop detection method, system, and article of manufacture for determining whether a sequence of unit processes continuously executed among unit processes in a program is a loop by means of computational processing performed by a computer. The method includes: reading address information on the sequence of unit processes; comparing an address of a unit process as a loop starting point candidate with an address of a last unit process in the sequence of unit processes; reading call stack information on the sequence of unit processes; comparing a call stack upon execution of the unit process as the loop starting point candidate with a call stack upon execution of the last unit process; outputting a determination result indicating that the sequence of unit processes forms a loop if the respective comparison results of the addresses and the call stacks match with each other.

Type: Application

Filed: November 21, 2011

Publication date: May 31, 2012

Applicant: International Business Machines Corporation

Inventor: Hiroshige Hayashizaki
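
The comparison described in the abstract of the loop detection application above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `UnitProcess` record, its field names, and the `forms_loop` function are hypothetical, and a call stack is assumed to be representable as a tuple of return addresses.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class UnitProcess:
    """One executed unit process (hypothetical record layout)."""
    address: int                 # address of the unit process
    call_stack: Tuple[int, ...]  # return addresses live when it executed


def forms_loop(trace: Sequence[UnitProcess], start_index: int) -> bool:
    """Return True if trace[start_index:] closes a loop.

    The candidate starting point and the last unit process must agree on
    both the address and the call stack.
    """
    candidate = trace[start_index]
    last = trace[-1]
    same_address = candidate.address == last.address
    same_stack = candidate.call_stack == last.call_stack
    return same_address and same_stack
```

Comparing call stacks as well as addresses is what separates a true loop from a recursive call that happens to reach the same address at a deeper stack depth.
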
Systems, methods, and computer products for compiler support for aggressive safe load speculation

Patent number: 8191057

Abstract: Systems, methods and computer products for compiler support for aggressive safe load speculation. Exemplary embodiments include a method for aggressive safe load speculation for a compiler in a computer system, the method including building a control flow graph, identifying both countable and non-countable loops, gathering a set of candidate loops for load speculation, and for each candidate loop in the set of candidate loops gathered for load speculation, computing an estimate of the iteration count, delay cycles, and code size, performing a profitability analysis and determining an unroll factor based on the delay cycles and the code size, transforming the loop by generating a prologue loop to achieve data alignment and an unrolled main loop with loop directives, indicating which loads can safely be executed speculatively and performing low-level instruction scheduling on the generated unrolled main loop.

Type: Grant

Filed: August 27, 2007

Date of Patent: May 29, 2012

Assignee: International Business Machines Corporation

Inventors: Roch G. Archambault, Geoffrey O. Blandy, Roland Froese, Yaoqing Gao, Liangxiao Hu, James L. McInnes, Raul E. Silvera
METHOD AND APPARATUS FOR IMPROVED SECURE COMPUTING AND COMMUNICATIONS

Publication number: 20120131316

Abstract: A method and apparatus are disclosed that may comprise applying compact markup notation to a general recursive computing system including hardware and software components, the compact markup notation defining things, places, paths, actions and causes within at least one of the hardware and the software of the general recursive computing system, to establish a set of data comprising a definitive description of the general recursive computing system in the compact notation; and synthesizing a self-aware and self-monitoring primitive recursive computing system utilizing the definitive description in the compact markup notation.

Type: Application

Filed: November 17, 2011

Publication date: May 24, 2012

Inventors: Joseph Mitola, III, Yu-Dong Yao, Yingying Chen, Hong Man
TABLE-DRIVEN SOAKER TOOL FOR INFORMATION HANDLING SYSTEMS

Publication number: 20120124350

Abstract: A soaker tool for an information handling system (IHS) exercises the IHS to provide a predetermined amount of utilization that a user may specify. The soaker tool schedules wait times following respective utilization times in alternating fashion to achieve a desired utilization value for a predetermined time period. The soaker tool monitors for a dispatch interrupt during the utilization times. Should a dispatch interrupt occur during a utilization time, the soaker tool accounts for the dispatch interrupt by determining a remainder utilization time to maintain utilization accuracy. The soaker tool may employ a parameter table that specifies utilization times, wait times, loop counts and adjustment cycles indexed to the respective utilization values that a user may select. The soaker tool may employ adjustment cycles to compensate for cumulative timing errors that may occur when running the tool for extended time periods.

Type: Application

Filed: November 12, 2010

Publication date: May 17, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Meik Neubauer
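
The alternation of utilization times and wait times in the soaker tool abstract above implies a simple relationship between the two. The sketch below assumes utilization is defined as busy time divided by busy plus wait time; that definition and both function names are assumptions for illustration, and the parameter table and adjustment cycles mentioned in the abstract are not modeled.

```python
def wait_time_ms(busy_ms: float, utilization: float) -> float:
    """Wait time that follows a busy period so that
    busy / (busy + wait) equals the requested utilization (0 < u <= 1)."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return busy_ms * (1.0 - utilization) / utilization


def remainder_busy_ms(planned_busy_ms: float,
                      busy_before_interrupt_ms: float) -> float:
    """Busy time still owed after a dispatch interrupt cut a busy period short."""
    return max(0.0, planned_busy_ms - busy_before_interrupt_ms)
```

For example, a 25 ms busy period at a 25% target calls for a 75 ms wait under this definition.
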
APPARATUS AND METHOD FOR DYNAMICALLY DETERMINING EXECUTION MODE OF RECONFIGURABLE ARRAY

Publication number: 20120124351

Abstract: An apparatus and method for dynamically determining the execution mode of a reconfigurable array are provided. Performance information of a loop may be obtained before and/or during the execution of the loop. The performance information may be used to determine whether to operate the apparatus in a very long instruction word (VLIW) mode or in a coarse grained array (CGA) mode.

Type: Application

Filed: August 25, 2011

Publication date: May 17, 2012

Inventors: Bernhard Egger, Dong-Hoon Yoo, Tai-Song Jin, Won-Sub Kim, Min-Wook Ahn, Jin-Seok Lee, Hee-Jin Ahn
Efficient code generation using loop peeling for SIMD loop code with multiple misaligned statements

Patent number: 8171464

Abstract: An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In this framework, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirements of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residual iteration counts, and multiple statements with arbitrary alignment combinations. Loop peeling is used to reduce the computational overhead associated with misaligned data. A loop prologue and epilogue are peeled from individual iterations in the simdized loop, and vector-splicing instructions are applied to the peeled iterations, while the steady-state loop body incurs no additional computational overhead.

Type: Grant

Filed: May 16, 2008

Date of Patent: May 1, 2012

Assignee: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu
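
The prologue, steady-state, and epilogue split mentioned in the loop peeling abstract above can be illustrated with plain index arithmetic. The sketch below only partitions an iteration range around vector-aligned boundaries; it does not model the data reorganization graph or the vector-splicing instructions the patent describes. The function name and the `width` parameter (elements per vector) are hypothetical.

```python
from typing import Tuple


def peel_ranges(start: int, stop: int, width: int) -> Tuple[range, range, range]:
    """Split element indices [start, stop) into prologue, steady state, epilogue.

    The steady-state range begins and ends on multiples of `width`, so every
    vector-wide chunk in it is aligned. Assumes start <= stop.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    first_aligned = min(stop, -(-start // width) * width)       # round start up
    last_aligned = max(first_aligned, (stop // width) * width)  # round stop down
    prologue = range(start, first_aligned)                # scalar, before alignment
    steady = range(first_aligned, last_aligned, width)    # one entry per vector
    epilogue = range(last_aligned, stop)                  # scalar, leftover tail
    return prologue, steady, epilogue
```

With `start=3`, `stop=22`, and `width=4`, element 3 is peeled into the prologue, vectors start at 4, 8, 12, and 16, and elements 20 and 21 fall into the epilogue.
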
RECONFIGURABLE PROCESSOR AND METHOD FOR PROCESSING A NESTED LOOP

Publication number: 20120102496

Abstract: A reconfigurable processor which merges an inner loop and an outer loop which are included in a nested loop and allocates the merged loop to processing elements in parallel, thereby reducing processing time to process the nested loop. The reconfigurable processor may extract loop execution frequency information from the inner loop and the outer loop of the nested loop, and may merge the inner loop and the outer loop based on the extracted loop execution frequency information.

Type: Application

Filed: April 14, 2011

Publication date: April 26, 2012

Applicant: Samsung Electronics Co., Ltd.

Inventors: Min-Wook Ahn, Dong-Hoon Yoo, Jin-Seok Lee, Bernhard Egger, Tai-Song Jin, Won-Sub Kim, Hee-Jin Ahn
RECONFIGURABLE PROCESSOR AND METHOD FOR PROCESSING LOOP HAVING MEMORY DEPENDENCY

Publication number: 20120096247

Abstract: Provided are a reconfigurable processor, which is capable of reducing the probability of an incorrect computation by analyzing the dependence between memory access instructions and allocating the memory access instructions between a plurality of processing elements (PEs) based on the results of the analysis, and a method of controlling the reconfigurable processor. The reconfigurable processor extracts an execution trace from simulation results, and analyzes the memory dependence between instructions included in different iterations based on parts of the execution trace of memory access instructions.

Type: Application

Filed: October 13, 2011

Publication date: April 19, 2012

Inventors: Hee-Jin Ahn, Dong-Hoon Yoo, Bernhard Egger, Min-Wook Ahn, Jin-Seok Lee, Tai-Song Jin, Won-Sub Kim
APPARATUS, METHOD, AND SYSTEM FOR PROVIDING A DECISION MECHANISM FOR CONDITIONAL COMMITS IN AN ATOMIC REGION

Publication number: 20120079246

Abstract: An apparatus and method is described herein for conditionally committing and/or speculative checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to more aggressively optimize code. And the conditional commit enables efficient execution of the dynamic optimization code, while attempting to prevent transactions from running out of hardware resources. While the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. And processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.

Type: Application

Filed: September 25, 2010

Publication date: March 29, 2012

Inventors: Mauricio Breternitz, Jr., Youfeng Wu, Cheng Wang, Edson Borin, Shiliang Hu, Craig B. Zilles
MACROSCALAR PROCESSOR ARCHITECTURE

Publication number: 20120066472

Abstract: A macroscalar processor architecture is described herein. In one embodiment, a processor receives instructions of a program loop having a vector block and a sequence block intended to be executed after the vector block, where the processor includes multiple slices and each of the slices is capable of executing an instruction of an iteration of the program loop substantially in parallel. For each iteration of the program loop, the processor executes an instruction of the sequence block using one of the slices while executing instructions of the vector block using a remainder of the slices substantially in parallel. Other methods and apparatuses are also described.

Type: Application

Filed: November 17, 2011

Publication date: March 15, 2012

Inventor: Jeffry E. Gonion
Computation spreading utilizing dithering for spur reduction in a digital phase lock loop

Patent number: 8134411

Abstract: A novel and useful apparatus for and method of spur reduction using computation spreading with dithering in a digital phase locked loop (DPLL) architecture. A software based PLL incorporates a reconfigurable calculation unit (RCU) that is optimized and programmed to sequentially perform all the atomic operations of a PLL or any other desired task in a time sharing manner. An application specific instruction-set processor (ASIP) incorporating the RCU is adapted to spread the computation of the atomic operations out over a PLL reference clock period wherein each computation is performed at a much higher processor clock frequency than the PLL reference clock rate. This significantly reduces the per cycle current transient generated by the computations. The frequency content of the current transients is at the higher processor clock frequency which results in a significant reduction in spurs within sensitive portions of the output spectrum.

Type: Grant

Filed: April 17, 2008

Date of Patent: March 13, 2012

Assignee: Texas Instruments Incorporated

Inventors: Fuqiang Shi, Roman Staszewski, Robert B. Staszewski
Check-hazard instructions for processing vectors

Patent number: 8131979

Abstract: The described embodiments provide a system that determines data dependencies between two vector memory operations or two memory operations that use vectors of memory addresses. During operation, the system receives a first input vector and a second input vector. The first input vector includes a number of elements containing memory addresses for a first memory operation, while the second input vector includes a number of elements containing memory addresses for a second memory operation, wherein the first memory operation occurs before the second memory operation in program order. The system then determines elements in the first and second input vectors where the memory addresses indicate that a dependency exists between the memory operations. The system next generates a result vector, wherein the result vector indicates the elements where dependencies exist between the memory operations.

Type: Grant

Filed: April 7, 2009

Date of Patent: March 6, 2012

Assignee: Apple Inc.

Inventors: Jeffry E. Gonion, Keith E. Diefendorff, Jr.
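
A much-simplified software analogue of the check-hazard operation in the abstract above is sketched here. It assumes that any address shared between the two input vectors counts as a dependency and returns a boolean result vector; the actual instruction semantics in the patent may differ, and the function name is hypothetical.

```python
from typing import List, Sequence


def check_hazard(first: Sequence[int], second: Sequence[int]) -> List[bool]:
    """Flag elements of `second` whose address also appears in `first`.

    first  -- addresses used by the earlier memory operation
    second -- addresses used by the later memory operation
    Simplification: any address match counts as a dependency.
    """
    if len(first) != len(second):
        raise ValueError("input vectors must have the same length")
    earlier = set(first)
    return [addr in earlier for addr in second]
```

A caller could use the result vector to decide which elements are safe to process together and which must wait for the earlier operation.
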
Method and apparatus for initializing a system configured in a programmable logic device

Patent number: 8122239

Abstract: Method and apparatus for initializing a system configured in a programmable logic device (PLD) is described. In some examples, the method includes: initializing memory elements in the system with first data; executing a first iteration of the system to process the first data; partially reconfiguring the PLD, during execution of the first iteration, to initialize shadow memory elements in the PLD with second data, the shadow memory elements respectively shadowing the memory elements in the system; transferring the second data from the shadow memory elements to the memory elements; and executing a second iteration of the system to process the second data.

Type: Grant

Filed: September 11, 2008

Date of Patent: February 21, 2012

Assignee: Xilinx, Inc.

Inventors: Philip B. James-Roxby, Stephen A. Neuendorffer, Henry E. Styles
PARALLEL LOOP MANAGEMENT

Publication number: 20120023316

Abstract: The illustrative embodiments comprise a method, data processing system, and computer program product having a processor unit for processing instructions with loops. A processor unit creates a first group of instructions having a first set of loops and second group of instructions having a second set of loops from the instructions. The first set of loops have a different order of parallel processing from the second set of loops. A processor unit processes the first group. The processor unit monitors terminations in the first set of loops during processing of the first group. The processor unit determines whether a number of terminations being monitored in the first set of loops is greater than a selectable number of terminations. In response to a determination that the number of terminations is greater than the selectable number of terminations, the processor unit ceases processing the first group and processes the second group.

Type: Application

Filed: July 26, 2010

Publication date: January 26, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Brian Flachs, Charles R. Johns, Ulrich Weigand
Method and Apparatus for Improved Secure Computing and Communications

Publication number: 20110302397

Abstract: A computing and communications system and method may comprise a primitive recursive function computing engine including an instruction set architecture prohibiting loop operations that continue for an indefinite time. The system and method may further comprise the instruction set architecture comprising system identifiers selected from a group comprising things, places, paths, actions and causes. The instruction set architecture may comprise organizing at least one data thing into a processing path to be acted upon by an action. The instruction set architecture may comprise defining a processing element as comprising an input interface configured to receive a data thing into the processing path; a processor in the processing path configured to perform the action on the data thing; and an output interface configured to receive a result of performing of the action on the data thing configured to provide the result as an output of the processing element.

Type: Application

Filed: April 12, 2011

Publication date: December 8, 2011

Inventor: Joseph Mitola, III
Flexible parallel/serial reconfigurable array configuration scheme

Patent number: 8058896

Abstract: A programming interface device for a programmable logic circuit comprises a series of parallel logic block chains each having first and second connection means, the first and second connection means being disposed at opposite ends of each chain. The programming interface device comprises first and second interfacing means for interfacing with the first and second connection means of each logic block chain, respectively and at least one programming circuit, each programming circuit arranged to configure a plurality of serially connected logic blocks. Finally, the programming interface comprises programmable connection means for connecting the connection means of each logic block chain to either the connection means of another logic block chain or directly to one of the at least one programming circuits, such that the parallel logic block chains can be configured in parallel, series or in any combination thereof.

Type: Grant

Filed: October 8, 2009

Date of Patent: November 15, 2011

Assignee: Panasonic Corporation

Inventors: Simon Deeley, Anthony Stansfield
Processor micro-architecture for compute, save or restore multiple registers and responsive to first instruction for repeated issue of second instruction

Patent number: 8055886

Abstract: An electronic circuit (4000) includes a bias value generator circuit (3900) operable to supply a varying bias value in a programmable range, and an instruction circuit (3625, 4010) responsive to a first instruction to program the range of said bias value generator circuit (3900) and further responsive to a second instruction having an operand to repeatedly issue said second instruction with said operand varied in an operand value range determined as a function of the varying bias value.

Type: Grant

Filed: May 22, 2008

Date of Patent: November 8, 2011

Assignee: Texas Instruments Incorporated

Inventors: Kenichi Tashiro, Hiroyuki Mizuno, Yuji Umemoto
SOFTWARE CONVERSION PROGRAM PRODUCT AND COMPUTER SYSTEM

Publication number: 20110238957

Abstract: According to one embodiment, a software conversion program product having a computer readable medium including programmed instructions, wherein the instructions, when executed by a computer system including a host processor and one or more accelerator processors, causes the computer system to perform: analyzing input software and obtaining a compute intensity calculated by dividing the number of arithmetic processing times in a loop by the size of data accessed in the loop and a data reference area size that is a total size of areas where data is referred to; determining a processor that executes loops on the basis of obtained values and a preliminarily prepared win-loss table in which wins and losses of execution times between the host processor and the accelerator processor are defined; and converting the input software so that the determined processor executes the loops.

Type: Application

Filed: September 14, 2010

Publication date: September 29, 2011

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Yusuke SHIROTA, Osamu Torii
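
The compute intensity defined in the abstract above (arithmetic operations in a loop divided by the size of the data the loop accesses) and the table lookup that follows can be sketched as below. The table contents, the two thresholds, and all names are illustrative assumptions; the abstract does not give concrete values.

```python
def compute_intensity(arithmetic_ops: int, bytes_accessed: int) -> float:
    """Arithmetic operations in a loop divided by the data size it accesses."""
    if bytes_accessed <= 0:
        raise ValueError("bytes_accessed must be positive")
    return arithmetic_ops / bytes_accessed


# Hypothetical win-loss table: keys pair an intensity class with a
# reference-area class; each value names the processor expected to win.
WIN_LOSS = {
    ("low", "small"): "host",
    ("low", "large"): "host",
    ("high", "small"): "host",
    ("high", "large"): "accelerator",
}


def choose_processor(arithmetic_ops: int, bytes_accessed: int,
                     reference_area_bytes: int,
                     intensity_threshold: float = 1.0,
                     area_threshold: int = 1 << 20) -> str:
    """Classify the loop and look up the table.

    Both thresholds are illustrative values, not taken from the abstract.
    """
    intensity = compute_intensity(arithmetic_ops, bytes_accessed)
    i_class = "high" if intensity >= intensity_threshold else "low"
    a_class = "large" if reference_area_bytes >= area_threshold else "small"
    return WIN_LOSS[(i_class, a_class)]
```

Keeping the decision in a table rather than in code means the win-loss data can be measured once per host and accelerator pair and swapped without changing the conversion logic.
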
Loop data processing system and method for dividing a loop into phases

Patent number: 8019982

Abstract: A data processing system and method. The data processing system includes a processor core that executes a program; a loop accelerator that has an array consisting of a plurality of data processing cells and executes a loop in a program by configuring the array according to a set of configuration bits; and a centralized register file which allows data used in the program execution to be shared by the processor core and the loop accelerator. The loop accelerator divides the configuration of the array into at least three phases according to whether data exchange with the central register file is conducted during the loop execution. Thus, unnecessary occupation of the routing resource, which is used for the data exchange between the loop accelerator and the central register file during the loop execution, can be avoided.

Type: Grant

Filed: October 4, 2006

Date of Patent: September 13, 2011

Assignee: Samsung Electronics Co., Ltd.

Inventors: Hong-seok Kim, Suk-jin Kim, Jeong-wook Kim, Soo-jung Ryu
Loop instruction execution using a register identifier

Patent number: 8019981

Abstract: Methods and apparatus are provided for performing loop execution. Modifier registers are used to hold loop counter values. Modifier register information and program memory address information are included in the loop instruction. When a processor executes a loop instruction, it decodes the instruction, identifies the modifier register, and accesses the register value to determine if the processor will jump back based on the memory address information. The loop execution can incur no clock cycle penalties.

Type: Grant

Filed: August 12, 2004

Date of Patent: September 13, 2011

Assignee: Altera Corporation

Inventor: Paul Metzgen
Building Approximate Data Dependences with a Moving Window

Publication number: 20110219222

Abstract: Mechanisms for building approximate data dependences using a moving look-back window are provided. The mechanisms track dependence information for memory accesses over iterations of execution of a portion of code. The mechanisms receive a memory access of an iteration of the portion of code, the memory access having an address for accessing the memory and an access type indicating at least one of a read or a write access type. An entry in a moving look-back window data structure is generated corresponding to a memory location accessed by the memory access. The entry comprises at least an identification of the address, the access type, and an iteration number corresponding to the iteration of the memory access. The moving look-back window data structure is utilized to determine dependence information for memory accesses over a plurality of iterations of the portion of code.

Type: Application

Filed: March 5, 2010

Publication date: September 8, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alexandre E. Eichenberger, John K.P. O'Brien, Kathryn M. O'Brien, Kai-Ting A. Wang, Xiaotong Zhuang
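
The moving look-back window in the abstract above stores an address, an access type, and an iteration number per entry. One way to picture it is a bounded queue, as in the sketch below. The class and field names are hypothetical, and the rule used here (same address, different iterations, at least one write) is an assumed definition of a dependence rather than one stated in the abstract.

```python
from collections import deque
from typing import Deque, List, NamedTuple, Tuple


class Access(NamedTuple):
    address: int
    kind: str        # "read" or "write"
    iteration: int


class LookBackWindow:
    """Keeps the most recent `size` accesses and reports cross-iteration
    dependences against them. Older accesses fall out of the window, which
    is what makes the result approximate."""

    def __init__(self, size: int) -> None:
        self.entries: Deque[Access] = deque(maxlen=size)

    def record(self, access: Access) -> List[Tuple[int, int]]:
        """Add an access; return (earlier_iteration, this_iteration) pairs
        for same-address accesses where at least one side is a write."""
        found = []
        for old in self.entries:
            if (old.address == access.address
                    and old.iteration != access.iteration
                    and "write" in (old.kind, access.kind)):
                found.append((old.iteration, access.iteration))
        self.entries.append(access)
        return found
```

A small window bounds memory use at the cost of missing dependences on accesses that have already been evicted.
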
Simultaneous multiple thread processor increasing number of instructions issued for thread detected to be processing loop

Patent number: 8015391

Abstract: A processor simultaneously issues instructions to multiple threads in a same instruction execution cycle. An instruction issuer controls issuance of an instruction for each of the multiple threads. A detector detects, for each of the multiple threads, whether a loop processing is currently being executed. A unit causes the instruction issuer to increase a number of instructions to be issued when the detector detects that the loop processing is currently being executed.

Type: Grant

Filed: October 8, 2010

Date of Patent: September 6, 2011

Assignee: Panasonic Corporation

Inventor: Takenobu Tani
Loop processing counter with automatic start time set or trigger modes in context reconfigurable PE array

Patent number: 7996661

Abstract: A dynamic reconfigurable circuit that implements optional processing by dynamically switching a processing content of a reconfigurable processing element (PE) and a connection content between the PEs in accordance with a context, includes: a configuration register section for setting a content of loop processing on the basis of the context, the loop processing content including an output source of an output signal from each of a set of the reconfigured PEs, an output destination of the output signal, and a condition for outputting the output signal to the output destination; and at least one counter circuit including a loop control section and an output register section that implement the set loop processing, that count the number of implementations of the loop processing implemented by the loop control section, and that output the output signal to the output destination based on the counted number of implementations and the condition.

Type: Grant

Filed: September 17, 2008

Date of Patent: August 9, 2011

Assignee: Fujitsu Semiconductor Limited

Inventors: Takashi Hanai, Shinichi Sutou, Masaki Arai, Mitsuharu Wakayoshi
System and method for executing loops in a processor

Patent number: 7991984

Abstract: A loop control system comprises at least one loop flag in an instruction word, at least one loop counter associated with the at least one loop flag operable to store and compute a number of times a program loop is to be executed, at least one start address register associated with the at least one loop flag operable to store a program loop starting address, and at least one end address register associated with the at least one loop flag operable to store a program loop ending address.

Type: Grant

Filed: December 23, 2005

Date of Patent: August 2, 2011

Assignee: Samsung Electronics Co., Ltd.

Inventor: Eran Pisek
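
The start address register, end address register, and loop counter named in the abstract above can be simulated in a few lines. This sketch models a single loop over a linear program and leaves out the loop flag in the instruction word; the function name and all parameters are illustrative, and `count` is assumed to be at least 1.

```python
from typing import List


def run_with_loop_registers(program_length: int, start_addr: int,
                            end_addr: int, count: int) -> List[int]:
    """Return the sequence of program-counter values executed.

    When the instruction at end_addr completes and the loop counter shows
    iterations remaining, the next fetch address is start_addr; no explicit
    branch instruction is executed.
    """
    trace = []
    pc = 0
    remaining = count          # loop counter register
    while pc < program_length:
        trace.append(pc)
        if pc == end_addr and remaining > 1:
            remaining -= 1
            pc = start_addr    # fetch is redirected to the loop start
        else:
            pc += 1
    return trace
```

For a five-instruction program with the loop body at addresses 1 and 2 and a count of 3, the body runs three times before execution falls through to address 3.
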
System and method for implementing and utilizing a zero overhead loop

Patent number: 7991985

Abstract: Systems and methods for implementing a zero overhead loop in a microprocessor or microprocessor based system/chip are disclosed. The systems and methods include the use of a breakpoint mechanism, and modification of parameters at runtime, with the breakpoint mechanism being additionally used in debugging, in order to provide some of the looping functionality.

Type: Grant

Filed: December 22, 2006

Date of Patent: August 2, 2011

Assignee: Broadcom Corporation

Inventors: Timothy Dobson, Mark Taunton
System and method for implementing a zero overhead loop

Patent number: 7987347

Abstract: Systems and methods for implementing a zero overhead loop in a microprocessor or microprocessor based system/chip are disclosed. The systems and methods include the use of a breakpoint mechanism which is additionally used in debugging in order to provide some of the looping functionality.

Type: Grant

Filed: December 22, 2006

Date of Patent: July 26, 2011

Assignee: Broadcom Corporation

Inventors: Sophie Mary Wilson, Timothy Martin Dobson
Microcontroller

Patent number: 7978750

Abstract: A microcontroller is disposed on a receiving part of a wireless system in order to process a demodulation signal generated by a receiver circuit, and includes a memory and a CPU. The memory stores a control program of the microcontroller. The control program thereof includes a dual loop routine for an operation in reception standby mode. The dual loop routine has a first loop and a second loop included in the first loop. The CPU has an instruction set consisting of a plurality of instructions, and executes the instructions according to the program stored in the memory. The CPU executes an instruction irrelevant to an operation when the microcontroller is in reception mode during the second loop a number of times. The number of times is at least such that noise caused by the repetition of the second loop is lowered below a desired level.

Type: Grant

Filed: June 29, 2005

Date of Patent: July 12, 2011

Assignee: Fujitsu Semiconductor Limited

Inventors: Hideo Nunokawa, Miki Suzuki, Hiroyuki Abe, Shinichi Okamoto, Shunichi Ko, Hiroshi Haibara, Nobuhiko Akasaka
Macroscalar processor architecture

Patent number: 7975134

Abstract: A macroscalar processor architecture is described herein. In one embodiment, an exemplary processor includes one or more execution units to execute instructions and one or more iteration units coupled to the execution units. The one or more iteration units receive one or more primary instructions of a program loop that comprise a machine executable program. For each of the primary instructions received, at least one of the iteration units generates multiple secondary instructions that correspond to multiple loop iterations of the task of the respective primary instruction when executed by the one or more execution units. Other methods and apparatuses are also described.

Type: Grant

Filed: May 26, 2010

Date of Patent: July 5, 2011

Assignee: Apple Inc.

Inventor: Jeffry E. Gonion
Image processing with highly threaded texture fragment generation

Patent number: 7973804

Abstract: A circuit arrangement and method support a multithreaded rendering architecture capable of dynamically routing pixel fragments from a pixel fragment generator to any pixel shader from among a pool of pixel shaders. The pixel fragment generator is therefore not tied to a specific pixel shader, but is instead able to utilize multiple pixel shaders in a pool of pixel shaders to minimize bottlenecks and improve overall hardware utilization and performance during image processing.

Type: Grant

Filed: March 11, 2008

Date of Patent: July 5, 2011

Assignee: International Business Machines Corporation

Inventors: Eric Oliver Mejdrich, Paul Emery Schardt, Robert Allen Shearer
Runtime Extraction of Data Parallelism

Publication number: 20110161643

Abstract: Mechanisms for extracting data dependencies during runtime are provided. The mechanisms execute a portion of code having a loop and generate, for the loop, a first parallel execution group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The mechanisms further execute the first parallel execution group and determine, for each iteration in the subset of iterations, whether the iteration has a data dependence. Moreover, the mechanisms commit store data to system memory only for stores performed by iterations in the subset of iterations for which no data dependence is determined. Store data of stores performed by iterations in the subset of iterations for which a data dependence is determined is not committed to the system memory.

Type: Application

Filed: December 30, 2009

Publication date: June 30, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
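
The commit rule in the abstract above (stores are committed only for iterations with no detected data dependence) can be sketched in software. The sketch below runs the iterations of one group sequentially against a memory snapshot to stand in for parallel execution. The `Body` signature, the function name, and the dependence test (touching an address written by an earlier iteration of the group) are assumptions for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple

# An iteration body returns (addresses_read, {address: value_to_store}).
Body = Callable[[int, Dict[int, int]], Tuple[Set[int], Dict[int, int]]]


def run_group(body: Body, iterations: List[int],
              memory: Dict[int, int]) -> List[int]:
    """Run one parallel execution group against a snapshot of `memory`.

    Stores are buffered per iteration. An iteration is treated as dependent
    if it reads or writes an address written by an earlier iteration of the
    group; its stores are discarded. Returns the iterations that need to
    be re-executed.
    """
    snapshot = dict(memory)
    written_so_far: Set[int] = set()
    retry = []
    for i in iterations:                       # conceptually concurrent
        reads, stores = body(i, snapshot)
        touched = reads | set(stores)
        if touched & written_so_far:
            retry.append(i)                    # dependence: do not commit
        else:
            memory.update(stores)              # commit buffered stores
        written_so_far.update(stores)          # record addresses written
    return retry
```

Iterations returned for retry would go into a later group, so the loop still completes with sequential semantics.
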
Parallel Execution Unit that Extracts Data Parallelism at Runtime

Publication number: 20110161642

Abstract: Mechanisms for extracting data dependencies during runtime are provided. With these mechanisms, a portion of code having a loop is executed. A first parallel execution group is generated for the loop, the group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The first parallel execution group is executed by executing each iteration in parallel. Store data for iterations are stored in corresponding store caches of the processor. Dependency checking logic of the processor determines, for each iteration, whether the iteration has a data dependence. Only the store data for stores where there was no data dependence determined are committed to memory.

Type: Application

Filed: December 30, 2009

Publication date: June 30, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
Method and apparatus for nested instruction looping using implicit predicates

Patent number: 7945768

Abstract: A method and apparatus for executing a nested program loop on a vector processor, the loop comprising outer-pre, inner and outer-post portions. An input stream unit of the vector processor provides a data value to a data path and sets an associated data validity tag to ‘valid’ once per outer loop iteration, as indicated by an inner counter of the input stream unit. The tag is set to ‘invalid’ in other iterations. Functional units of the vector processor operate on data values in the data path, each functional unit producing a valid result if the data validity tags associated with input data values are set to ‘valid’. An output stream unit of the vector processor sinks a data value from the data path once per outer loop iteration if an associated data validity tag indicates that the data value is valid.

Type: Grant

Filed: June 5, 2008

Date of Patent: May 17, 2011

Assignee: Motorola Mobility, Inc.

Inventors: Raymond B. Essick, IV, Kent D. Moat, Michael A. Schuette
SYSTEM AND METHOD FOR USING A BRANCH MIS-PREDICTION BUFFER

Publication number: 20110107071

Abstract: A system and method is provided for executing a conditional branch instruction. The system and method may include a branch predictor to predict one or more instructions that depend on the conditional branch instruction and a branch mis-prediction buffer to store correct instructions that were not predicted by the branch predictor during a branch mis-prediction.

Type: Application

Filed: November 4, 2009

Publication date: May 5, 2011

Inventor: Jeffrey Allan (Alon) JACOB (YAAKOV)
Precise counter hardware for microcode loops

Patent number: 7937574

Abstract: In an embodiment, a microcode unit for a processor is contemplated. The microcode unit comprises a microcode memory storing a plurality of microcode routines executable by the processor, wherein each microcode routine comprises two or more microcode operations. Coupled to the microcode memory, the sequence control unit is configured to control reading microcode operations from the microcode memory to be issued for execution by the processor. The sequence control unit is configured to stall issuance of microcode operations forming a body of a loop in a first routine of the plurality of microcode routines until a loop counter value that indicates a number of iterations of the loop is received by the sequence control unit.

Type: Grant

Filed: July 17, 2007

Date of Patent: May 3, 2011

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael T. Clark, Jelena Ilic, Syed Faisal Ahmed, Michael T. DiBrino
Computation spreading for spur reduction in a digital phase lock loop

Patent number: 7936221

Abstract: A novel and useful apparatus for and method of spur reduction using computation spreading in a digital phase locked loop (DPLL) architecture. A software based PLL incorporates a reconfigurable calculation unit (RCU) that is optimized and programmed to sequentially perform all the atomic operations of a PLL or any other desired task in a time sharing manner. An application specific instruction-set processor (ASIP) incorporating the RCU is adapted to spread the computation of the atomic operations out over and completed within an entire PLL reference clock period. Each computation being performed at a much higher processor clock frequency than the PLL reference clock rate. This functions to significantly reduce the per cycle current transient generated by the computations. Further, the frequency content of the current transients is at the higher processor clock frequency. This results in a significant reduction in spurs within sensitive portions of the output spectrum.

Type: Grant

Filed: September 11, 2007

Date of Patent: May 3, 2011

Assignee: Texas Instruments Incorporated

Inventors: Roman Staszewski, Robert B. Staszewski, Fuqiang Shi
Storage medium storing calculation processing visualization program, calculation processing visualization apparatus, and calculation processing visualization method

Patent number: 7917739

Abstract: The execution status of pipeline processing is highly visualized by appropriately displaying processes forming loops in a simplified manner. A loop-information storage unit stores loop-defining information specifying the address of an instruction that causes a pipeline process forming a loop. An operation-information storage unit stores operation information that includes the address of an instruction input into a pipeline and information indicating the execution status of a pipeline process caused by the instruction. A loop determination unit determines whether each pipeline process indicated by the operation information forms a loop by referring to the loop-defining information. An output unit outputs visualization information indicating, in a visually comprehensible manner, the execution status of a pipeline process that has been determined to form a loop for a predetermined number of executions of the loop and the execution status of a pipeline process that has been determined to form no loop.

Type: Grant

Filed: June 19, 2008

Date of Patent: March 29, 2011

Assignee: Fujitsu Limited

Inventors: Shuji Yamamura, Takashi Aoki
Cache system and control method of way prediction for cache memory

Publication number: 20110072215

Abstract: A cache device according to an exemplary aspect of the present invention includes a way information buffer that stores way information that is a result of selecting a way in an instruction that accesses a cache memory; and a control unit that controls a storage processing and a read processing, while a series of instruction groups are repeatedly executed, the storage processing being for storing the way information in the instruction groups to the way information memory, the read processing being for reading the way information from the way information memory.

Type: Application

Filed: September 17, 2010

Publication date: March 24, 2011

Applicant: Renesas Electronics Corporation

Inventor: Daisuke Takahashi
PILE PROCESSING SYSTEM AND METHOD FOR PARALLEL PROCESSORS

Publication number: 20110072251

Abstract: A system, method and computer program product are provided for processing exceptions. Initially, computational operations are processed in a loop. Moreover, exceptions are identified and stored while processing the computational operations. Such exceptions are then processed separate from the loop.

Type: Application

Filed: April 22, 2010

Publication date: March 24, 2011

Applicant: DROPLET TECHNOLOGY, INC.

Inventors: William C. Lynch, Krasimir D. Kolarov, Steven E. Saunders
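
The idea in the abstract of publication 20110072251 (record exceptions during the main loop, then handle them in a separate pass) can be sketched as follows. The names `process_with_pile`, `is_exceptional`, `fast_op`, and `slow_op` are hypothetical, and treating negative inputs as the "exceptions" is an assumption chosen only to make the example concrete.

```python
def process_with_pile(values, is_exceptional, fast_op, slow_op):
    """Process values in a main loop, deferring exceptional items to a pile."""
    results = [None] * len(values)
    pile = []                        # indices of items that need special handling

    # Main loop: only the common-case operation runs here.
    for i, v in enumerate(values):
        if is_exceptional(v):
            pile.append(i)           # identify and store the exception
        else:
            results[i] = fast_op(v)

    # Separate loop: handle the stored exceptions outside the main loop.
    for i in pile:
        results[i] = slow_op(values[i])
    return results, pile


values = [4, -9, 16, -1]
results, pile = process_with_pile(
    values,
    is_exceptional=lambda v: v < 0,  # assumed exception condition
    fast_op=lambda v: v * 2,
    slow_op=lambda v: 0,
)
```

Keeping the main loop free of special-case handling is what makes it suitable for parallel or vectorized execution; the pile is typically much shorter than the input.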
Processor and method for executing a program loop within an instruction word

Patent number: 7913069

Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. Instruction words (48) can include a micro-loop (100) which is capable of performing a series of operations repeatedly. In a particular example, the series of operations are included in a single instruction word (48). The micro-loop (100) in combination with the ability of the computers (12) to send instruction words (48) to a neighboring computer (12) provides a powerful tool for allowing a computer (12) to utilize the resources of a neighboring computer (12).

Type: Grant

Filed: May 26, 2006

Date of Patent: March 22, 2011

Assignee: VNS Portfolio LLC

Inventors: Charles H. Moore, Jeffrey Arthur Fox, John W. Rible
Digital Signal Processing Systems

Publication number: 20110055445

Abstract: A signal processing system may include a multiply-accumulate (MAC) unit to generate output data by performing multiply-accumulate operations on first and second input data in response to a stream of MAC instruction words, where the MAC unit is pipelined to enable it to perform a multiply-accumulate operation in response to each MAC instruction word. The system may also include an instruction generator to generate the stream of MAC instruction words by performing loop expansion on a stream of intermediate instruction words, where one intermediate instruction word may comprise a group of fields to set up the MAC unit to execute in response to the one intermediate instruction word.

Type: Application

Filed: March 15, 2010

Publication date: March 3, 2011

Applicant: AZURAY TECHNOLOGIES, INC.

Inventors: Edward Gee, Keith Slavin, Robert Batten, Vincenzo DiTommaso, Ravindranath Naiknaware, Triet Tu Le, Adam Heiberg, Dennis Morel
Loop iteration prediction by supplying pseudo branch instruction for execution at first iteration and storing history information in branch prediction unit

Patent number: 7886134

Abstract: This invention combines a loop support mechanism and a branch prediction mechanism. After an instruction execution unit executes an end block instruction of a block repeat, the loop control unit branches to the first instruction in the loop and sends a pseudo branch instruction to the instruction execution unit. The instruction execution unit acts as if the last instruction in the block is an instruction for branching to the start address of the block. This is stored in the branch prediction unit and branch prediction is performed thereafter.

Type: Grant

Filed: December 5, 2008

Date of Patent: February 8, 2011

Assignee: Texas Instruments Incorporated

Inventor: Hiroyuki Mizumo
BRANCH PREDICTOR FOR SETTING PREDICATE FLAG TO SKIP PREDICATED BRANCH INSTRUCTION EXECUTION IN LAST ITERATION OF LOOP PROCESSING

Publication number: 20110029763

Abstract: A processor simultaneously issues instructions to multiple threads in a same instruction execution cycle. An instruction issuer controls issuance of an instruction for each of the multiple threads. A detector detects, for each of the multiple threads, whether a loop processing is currently being executed. A unit causes the instruction issuer to increase a number of instructions to be issued when the detector detects that the loop processing is currently being executed.

Type: Application

Filed: October 8, 2010

Publication date: February 3, 2011

Applicant: PANASONIC CORPORATION

Inventor: Takenobu TANI
Managing wasted active power in processors based on loop iterations and number of instructions executed since last loop

Patent number: 7882381

Abstract: Methods of managing wasted active power of processors in computer systems and other electronic devices are disclosed. In one aspect, a method may include counting a number of times that a processor has performed an instruction loop. Then, a determination may be made whether the number of times that the processor has performed the loop is greater than one or more thresholds or other given values. Next, a limit may be imposed on the power of the processor if the number of times that the processor has performed the loop is greater than at least one of the thresholds or given values. Logic to perform such methods is also disclosed, as are systems suitable for incorporating such logic.

Type: Grant

Filed: June 29, 2006

Date of Patent: February 1, 2011

Assignee: Intel Corporation

Inventor: David Anthony Wyatt
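
The threshold comparison described in the abstract of patent 7882381 can be sketched as a small function. The function name `power_limit_for`, the `(threshold, cap)` pairs, and the wattage values are assumptions for illustration; the abstract does not specify how thresholds map to power limits.

```python
def power_limit_for(loop_count, thresholds):
    """Return the power cap for the highest threshold exceeded, or None.

    loop_count -- number of times the processor has performed the loop
    thresholds -- list of (iteration_threshold, power_cap_watts) pairs
    """
    limit = None
    for threshold, cap in sorted(thresholds):
        if loop_count > threshold:
            limit = cap              # a higher threshold overrides a lower one
    return limit


# Hypothetical policy: longer-running loops get a tighter power cap.
thresholds = [(1000, 20.0), (10000, 10.0)]
no_limit = power_limit_for(500, thresholds)
mild_limit = power_limit_for(5000, thresholds)
strict_limit = power_limit_for(50000, thresholds)
```

With these assumed values, a loop that has run 500 times is not limited, one that has run 5000 times is capped at 20.0, and one that has run 50000 times is capped at 10.0.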
Processor utilizing a loop buffer to reduce power consumption

Patent number: 7873820

Abstract: The present invention provides processing systems, apparatuses, and methods that reduce power consumption with the use of a loop buffer. In an embodiment, an instruction fetch unit of a processor initially provides instructions from an instruction cache to an execution unit of the processor. While instructions are provided from the instruction cache to the execution unit, instructions forming a loop are stored in a loop buffer. When a loop stored in the loop buffer is being iterated, the instruction cache is disabled to reduce power consumption and instructions are provided to the execution unit from the loop buffer. When the loop is exited, the instruction cache is re-enabled and instructions are provided to the execution unit from the instruction cache.

Type: Grant

Filed: November 15, 2005

Date of Patent: January 18, 2011

Assignee: MIPS Technologies, Inc.

Inventor: Matthias Knoth
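
The fetch behavior described in the abstract of patent 7873820 can be modeled with a short simulation. This is a sketch under simplifying assumptions: the class `FetchUnit` and its fields are hypothetical, and the loop's start and end addresses are given up front rather than detected by hardware.

```python
class FetchUnit:
    """Toy model of an instruction fetch unit with a loop buffer."""

    def __init__(self, icache, loop_start, loop_end):
        self.icache = icache                       # address -> instruction
        self.loop_range = range(loop_start, loop_end + 1)
        self.loop_buffer = {}                      # filled during the first iteration
        self.icache_enabled = True
        self.icache_reads = 0

    def fetch(self, addr):
        in_loop = addr in self.loop_range
        captured = len(self.loop_buffer) == len(self.loop_range)
        if in_loop and captured:
            self.icache_enabled = False            # cache disabled while iterating
            return self.loop_buffer[addr]
        self.icache_enabled = True                 # cache (re-)enabled outside the loop
        self.icache_reads += 1
        instr = self.icache[addr]
        if in_loop:
            self.loop_buffer[addr] = instr         # capture loop body as it streams by
        return instr


icache = {0: "init", 1: "load", 2: "add", 3: "branch 1", 4: "store"}
unit = FetchUnit(icache, loop_start=1, loop_end=3)
trace = [0] + [1, 2, 3] * 4 + [4]                  # four iterations of the loop
executed = [unit.fetch(addr) for addr in trace]
```

In this trace the instruction cache is read for the first iteration only; the remaining three iterations are served from the loop buffer, and the cache is enabled again once execution leaves the loop.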
INFORMATION PROCESSING APPARATUS AND BRANCH PREDICTION METHOD

Publication number: 20100306516

Abstract: An information processor includes a first recording unit which stores first information indicating correspondence between an instruction address and a branch destination address of a most recent branch instruction, a computation of the most recent branch instruction having been completed and a branch for the most recent branch instruction having been taken, a second recording unit which stores second information indicating correspondence between an instruction address and a branch destination address of each of past branch instructions including the most recent branch instruction, computations of the past branch instructions having been completed and branches for the past branch instructions having been taken, and a control unit which makes a branch prediction based on the first information or the second information, and stops supply of a clock to the second recording unit and makes a branch prediction based on the first information when an instruction sequence enters a loop.

Type: Application

Filed: May 14, 2010

Publication date: December 2, 2010

Applicant: FUJITSU LIMITED

Inventor: Takashi SUZUKI
SIMULATION SYSTEM, METHOD AND PROGRAM

Publication number: 20100299509

Abstract: A computer-implemented pipeline execution system, method, and program product for executing loop processing in a multi-core or a multiprocessor computing environment, where the loop processing includes multiple function blocks in a multiple-stage pipeline manner. The system includes: a pipelining unit for pipelining the loop processing and assigning the loop processing to a computer processor or core; a calculating unit for calculating a first-order gradient term from a value calculated with the use of a predicted value of the input to a pipeline; and a correcting unit for correcting an output value of the pipeline with the value of the first-order gradient term.

Type: Application

Filed: May 18, 2010

Publication date: November 25, 2010

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jun Doi, Shuichi Shimizu, Takeo Yoshizawa
Branch predictor for setting predicate flag to skip predicated branch instruction execution in last iteration of loop processing

Patent number: 7836289

Abstract: A program execution control device which controls execution of a program by a processor having a predicate function for conditional execution of an instruction, wherein the program includes a branch instruction to control iterations in loop processing, the branch instruction is further an instruction to generate an execute-or-not condition indicating whether or not the branch instruction is to be executed at an iteration in the loop processing after a current iteration, and to reflect the execute-or-not condition on a predicate flag used for conditional execution of the branch instruction, the program execution control device comprises a processor status changing unit configured to change, before an execution cycle of the branch instruction, a status of the processor in advance for execution of an instruction following the branch instruction, the status being changed based on the execute-or-not condition reflected on the predicate flag.

Type: Grant

Filed: August 20, 2008

Date of Patent: November 16, 2010

Assignee: Panasonic Corporation

Inventor: Takenobu Tani
Runtime Dependence-Aware Scheduling Using Assist Thread

Publication number: 20100287550

Abstract: A runtime dependence-aware scheduling of dependent iterations mechanism is provided. Computation is performed for one or more iterations of computer executable code by a main thread. Dependence information is determined for a plurality of memory accesses within the computer executable code using modified executable code using a set of dependence threads. Using the dependence information, a determination is made as to whether a subset of a set of uncompleted iterations in the plurality of iterations is capable of being executed ahead-of-time by the one or more available threads in the data processing system. If the subset of the set of uncompleted iterations in the plurality of iterations is capable of being executed ahead-of-time, the main thread is signaled to skip the subset of the set of uncompleted iterations and the set of assist threads is signaled to execute the subset of the set of uncompleted iterations.

Type: Application

Filed: May 5, 2009

Publication date: November 11, 2010

Applicant: International Business Machines Corporation

Inventors: Alexandre E. Eichenberger, Kathryn M. O'Brien, Xiaotong Zhuang
Program Code Simulator

Publication number: 20100281240

Abstract: A system and method for facilitating simulation of a computer program. A program representation is generated from a computer program. A simulation of the program is performed. Simulation may include applying heuristics to determine program flow for selected instructions, such as a branch instruction or a loop instruction. Simulation may also include creating imaginary objects as surrogates for real objects, when program code to create real objects is restricted, or fields of the objects are unavailable or uncertain, or for other reasons. Data descriptive of the simulation is inserted into the program representation. A visualizer may retrieve the program representation and generate a visualization that shows sequence flows resulting from the simulation.

Type: Application

Filed: May 1, 2009

Publication date: November 4, 2010

Applicant: Microsoft Corporation

Inventors: Deon Brewis, Durham Goode, John Joseph Jordan, Sadi Khan
Command supply device that supplies a command read out from a main memory to a central processing unit

Patent number: 7822949

Abstract: A command supply device supplies a command sequence that forms a loop. A loop command buffer accumulates a first partial command sequence. The first partial command sequence is a head part of a first command sequence repeatedly supplied to a CPU from among command sequences stored in a main memory, and is accumulated before the first command sequence is supplied to the CPU again. A linking command buffer accumulates a second partial command sequence. The second partial command sequence follows the first partial command sequence in the first command sequence, and is accumulated while the accumulated first partial command sequence in the loop command buffer is supplied to the CPU. A selection circuit supplies, to the CPU, a command from the accumulated second partial command sequence in the linking command buffer when the entirety of the first partial command sequence has been supplied to the CPU.

Type: Grant

Filed: May 9, 2005

Date of Patent: October 26, 2010

Assignee: Panasonic Corporation

Inventor: Satoshi Ogura
