Loop Execution Patents (Class 712/241)
  • Publication number: 20120137111
    Abstract: A loop detection method, system, and article of manufacture for determining whether a sequence of unit processes continuously executed among unit processes in a program is a loop by means of computational processing performed by a computer. The method includes: reading address information on the sequence of unit processes; comparing an address of a unit process as a loop starting point candidate with an address of a last unit process in the sequence of unit processes; reading call stack information on the sequence of unit processes; comparing a call stack upon execution of the unit process as the loop starting point candidate with a call stack upon execution of the last unit process; and outputting a determination result indicating that the sequence of unit processes forms a loop if the respective comparison results of the addresses and the call stacks match with each other.
    Type: Application
    Filed: November 21, 2011
    Publication date: May 31, 2012
    Applicant: International Business Machines Corporation
    Inventor: Hiroshige Hayashizaki
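    Sketch (not part of the patent record): the two comparisons described in the abstract above can be illustrated roughly as follows. The trace layout, the `TraceEntry` type, and the function name `forms_loop` are assumptions made for this example, not details taken from the application.

```python
from typing import List, Tuple

# Hypothetical trace record: (instruction address, call stack as a tuple of
# return addresses) captured when each unit process was executed.
TraceEntry = Tuple[int, Tuple[int, ...]]


def forms_loop(trace: List[TraceEntry], start_index: int) -> bool:
    """Report whether trace[start_index:] closes a loop.

    The sequence is treated as a loop only if BOTH the address and the
    call stack of the starting-point candidate match those of the last
    unit process in the sequence.
    """
    if not 0 <= start_index < len(trace) - 1:
        return False
    start_addr, start_stack = trace[start_index]
    last_addr, last_stack = trace[-1]
    return start_addr == last_addr and start_stack == last_stack
```

    Comparing the call stacks as well as the addresses lets the check tell a genuine loop apart from a recursive call that merely revisits the same address at a deeper stack level.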
  • Patent number: 8191057
    Abstract: Systems, methods and computer products for compiler support for aggressive safe load speculation. Exemplary embodiments include a method for aggressive safe load speculation for a compiler in a computer system, the method including building a control flow graph, identifying both countable and non-countable loops, gathering a set of candidate loops for load speculation, and for each candidate loop in the set of candidate loops gathered for load speculation, computing an estimate of the iteration count, delay cycles, and code size, performing a profitability analysis and determining an unroll factor based on the delay cycles and the code size, transforming the loop by generating a prologue loop to achieve data alignment and an unrolled main loop with loop directives, indicating which loads can safely be executed speculatively and performing low-level instruction scheduling on the generated unrolled main loop.
    Type: Grant
    Filed: August 27, 2007
    Date of Patent: May 29, 2012
    Assignee: International Business Machines Corporation
    Inventors: Roch G. Archambault, Geoffrey O. Blandy, Roland Froese, Yaoqing Gao, Liangxiao Hu, James L. McInnes, Raul E. Silvera
  • Publication number: 20120131316
    Abstract: A method and apparatus are disclosed that may comprise applying compact markup notation to a general recursive computing system including hardware and software components, the compact markup notation defining things, places, paths, actions and causes within at least one of the hardware and the software of the general recursive computing system, to establish a set of data comprising a definitive description of the general recursive computing system in the compact notation; and synthesizing a self-aware and self-monitoring primitive recursive computing system utilizing the definitive description in the compact markup notation.
    Type: Application
    Filed: November 17, 2011
    Publication date: May 24, 2012
    Inventors: Joseph Mitola, III, Yu-Dong Yao, Yingying Chen, Hong Man
  • Publication number: 20120124350
    Abstract: A soaker tool for an information handling system (IHS) exercises the IHS to provide a predetermined amount of utilization that a user may specify. The soaker tool schedules wait times following respective utilization times in alternating fashion to achieve a desired utilization value for a predetermined time period. The soaker tool monitors for a dispatch interrupt during the utilization times. Should a dispatch interrupt occur during a utilization time, the soaker tool accounts for the dispatch interrupt by determining a remainder utilization time to maintain utilization accuracy. The soaker tool may employ a parameter table that specifies utilization times, wait times, loop counts and adjustment cycles indexed to the respective utilization values that a user may select. The soaker tool may employ adjustment cycles to compensate for cumulative timing errors that may occur when running the tool for extended time periods.
    Type: Application
    Filed: November 12, 2010
    Publication date: May 17, 2012
    Applicant: International Business Machines Corporation
    Inventor: Meik Neubauer
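    Sketch (not part of the patent record): the alternation of utilization times and wait times in the abstract above can be illustrated with simple arithmetic. The abstract describes a parameter table; the closed-form formula and both function names below are assumptions used only for this example.

```python
def wait_time_ms(busy_ms: float, utilization: float) -> float:
    """Wait time that must follow a busy period to hit a target utilization.

    Assumes utilization = busy / (busy + wait), so
    wait = busy * (1 - utilization) / utilization.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in the interval (0, 1]")
    return busy_ms * (1.0 - utilization) / utilization


def remainder_busy_ms(busy_ms: float, elapsed_before_interrupt_ms: float) -> float:
    """Busy time still owed after an interrupt cut a utilization period short."""
    return max(0.0, busy_ms - elapsed_before_interrupt_ms)
```

    For instance, under this assumption a 30 ms busy period at a 75% target would be followed by a 10 ms wait.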
  • Publication number: 20120124351
    Abstract: An apparatus and method for dynamically determining the execution mode of a reconfigurable array are provided. Performance information of a loop may be obtained before and/or during the execution of the loop. The performance information may be used to determine whether to operate the apparatus in a very long instruction word (VLIW) mode or in a coarse grained array (CGA) mode.
    Type: Application
    Filed: August 25, 2011
    Publication date: May 17, 2012
    Inventors: Bernhard Egger, Dong-Hoon Yoo, Tai-Song Jin, Won-Sub Kim, Min-Wook Ahn, Jin-Seok Lee, Hee-Jin Ahn
  • Patent number: 8171464
    Abstract: An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In this framework, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirements of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residual iteration counts, and multiple statements with arbitrary alignment combinations. Loop peeling is used to reduce the computational overhead associated with misaligned data. A loop prologue and epilogue are peeled from individual iterations in the simdized loop, and vector-splicing instructions are applied to the peeled iterations, while the steady-state loop body incurs no additional computational overhead.
    Type: Grant
    Filed: May 16, 2008
    Date of Patent: May 1, 2012
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Kai-Ting Amy Wang, Peng Wu
  • Publication number: 20120102496
    Abstract: A reconfigurable processor which merges an inner loop and an outer loop which are included in a nested loop and allocates the merged loop to processing elements in parallel, thereby reducing processing time to process the nested loop. The reconfigurable processor may extract loop execution frequency information from the inner loop and the outer loop of the nested loop, and may merge the inner loop and the outer loop based on the extracted loop execution frequency information.
    Type: Application
    Filed: April 14, 2011
    Publication date: April 26, 2012
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Min-Wook Ahn, Dong-Hoon Yoo, Jin-Seok Lee, Bernhard Egger, Tai-Song Jin, Won-Sub Kim, Hee-Jin Ahn
  • Publication number: 20120096247
    Abstract: Provided are a reconfigurable processor, which is capable of reducing the probability of an incorrect computation by analyzing the dependence between memory access instructions and allocating the memory access instructions between a plurality of processing elements (PEs) based on the results of the analysis, and a method of controlling the reconfigurable processor. The reconfigurable processor extracts an execution trace from simulation results, and analyzes the memory dependence between instructions included in different iterations based on parts of the execution trace of memory access instructions.
    Type: Application
    Filed: October 13, 2011
    Publication date: April 19, 2012
    Inventors: Hee-Jin Ahn, Dong-Hoon Yoo, Bernhard Egger, Min-Wook Ahn, Jin-Seok Lee, Tai-Song Jin, Won-Sub Kim
  • Publication number: 20120079246
    Abstract: An apparatus and method are described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to more aggressively optimize code. The conditional commit enables efficient execution of the dynamic optimization code while attempting to prevent transactions from running out of hardware resources, and the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. Processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.
    Type: Application
    Filed: September 25, 2010
    Publication date: March 29, 2012
    Inventors: Mauricio Breternitz, Jr., Youfeng Wu, Cheng Wang, Edson Borin, Shiliang Hu, Craig B. Zilles
  • Publication number: 20120066472
    Abstract: A macroscalar processor architecture is described herein. In one embodiment, a processor receives instructions of a program loop having a vector block and a sequence block intended to be executed after the vector block, where the processor includes multiple slices and each of the slices is capable of executing an instruction of an iteration of the program loop substantially in parallel. For each iteration of the program loop, the processor executes an instruction of the sequence block using one of the slices while executing instructions of the vector block using a remainder of the slices substantially in parallel. Other methods and apparatuses are also described.
    Type: Application
    Filed: November 17, 2011
    Publication date: March 15, 2012
    Inventor: Jeffry E. Gonion
  • Patent number: 8134411
    Abstract: A novel and useful apparatus for and method of spur reduction using computation spreading with dithering in a digital phase locked loop (DPLL) architecture. A software based PLL incorporates a reconfigurable calculation unit (RCU) that is optimized and programmed to sequentially perform all the atomic operations of a PLL or any other desired task in a time sharing manner. An application specific instruction-set processor (ASIP) incorporating the RCU is adapted to spread the computation of the atomic operations out over a PLL reference clock period wherein each computation is performed at a much higher processor clock frequency than the PLL reference clock rate. This significantly reduces the per cycle current transient generated by the computations. The frequency content of the current transients is at the higher processor clock frequency which results in a significant reduction in spurs within sensitive portions of the output spectrum.
    Type: Grant
    Filed: April 17, 2008
    Date of Patent: March 13, 2012
    Assignee: Texas Instruments Incorporated
    Inventors: Fuqiang Shi, Roman Staszewski, Robert B. Staszewski
  • Patent number: 8131979
    Abstract: The described embodiments provide a system that determines data dependencies between two vector memory operations or two memory operations that use vectors of memory addresses. During operation, the system receives a first input vector and a second input vector. The first input vector includes a number of elements containing memory addresses for a first memory operation, while the second input vector includes a number of elements containing memory addresses for a second memory operation, wherein the first memory operation occurs before the second memory operation in program order. The system then determines elements in the first and second input vectors where the memory addresses indicate that a dependency exists between the memory operations. The system next generates a result vector, wherein the result vector indicates the elements where dependencies exist between the memory operations.
    Type: Grant
    Filed: April 7, 2009
    Date of Patent: March 6, 2012
    Assignee: Apple Inc.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff, Jr.
  • Patent number: 8122239
    Abstract: A method and apparatus for initializing a system configured in a programmable logic device (PLD) are described. In some examples, the method includes: initializing memory elements in the system with first data; executing a first iteration of the system to process the first data; partially reconfiguring the PLD, during execution of the first iteration, to initialize shadow memory elements in the PLD with second data, the shadow memory elements respectively shadowing the memory elements in the system; transferring the second data from the shadow memory elements to the memory elements; and executing a second iteration of the system to process the second data.
    Type: Grant
    Filed: September 11, 2008
    Date of Patent: February 21, 2012
    Assignee: Xilinx, Inc.
    Inventors: Philip B. James-Roxby, Stephen A. Neuendorffer, Henry E. Styles
  • Publication number: 20120023316
    Abstract: The illustrative embodiments comprise a method, data processing system, and computer program product having a processor unit for processing instructions with loops. A processor unit creates a first group of instructions having a first set of loops and a second group of instructions having a second set of loops from the instructions. The first set of loops has a different order of parallel processing from the second set of loops. A processor unit processes the first group. The processor unit monitors terminations in the first set of loops during processing of the first group. The processor unit determines whether a number of terminations being monitored in the first set of loops is greater than a selectable number of terminations. In response to a determination that the number of terminations is greater than the selectable number of terminations, the processor unit ceases processing the first group and processes the second group.
    Type: Application
    Filed: July 26, 2010
    Publication date: January 26, 2012
    Applicant: International Business Machines Corporation
    Inventors: Brian Flachs, Charles R. Johns, Ulrich Weigand
  • Publication number: 20110302397
    Abstract: A computing and communications system and method may comprise a primitive recursive function computing engine including an instruction set architecture prohibiting loop operations that continue for an indefinite time. The system and method may further comprise the instruction set architecture comprising system identifiers selected from a group comprising things, places, paths, actions and causes. The instruction set architecture may comprise organizing at least one data thing into a processing path to be acted upon by an action. The instruction set architecture may comprise defining a processing element as comprising an input interface configured to receive a data thing into the processing path; a processor in the processing path configured to perform the action on the data thing; and an output interface configured to receive a result of performing the action on the data thing and to provide the result as an output of the processing element.
    Type: Application
    Filed: April 12, 2011
    Publication date: December 8, 2011
    Inventor: Joseph Mitola, III
  • Patent number: 8058896
    Abstract: A programming interface device for a programmable logic circuit comprises a series of parallel logic block chains each having first and second connection means, the first and second connection means being disposed at opposite ends of each chain. The programming interface device comprises first and second interfacing means for interfacing with the first and second connection means of each logic block chain, respectively and at least one programming circuit, each programming circuit arranged to configure a plurality of serially connected logic blocks. Finally, the programming interface comprises programmable connection means for connecting the connection means of each logic block chain to either the connection means of another logic block chain or directly to one of the at least one programming circuits, such that the parallel logic block chains can be configured in parallel, series or in any combination thereof.
    Type: Grant
    Filed: October 8, 2009
    Date of Patent: November 15, 2011
    Assignee: Panasonic Corporation
    Inventors: Simon Deeley, Anthony Stansfield
  • Patent number: 8055886
    Abstract: An electronic circuit (4000) includes a bias value generator circuit (3900) operable to supply a varying bias value in a programmable range, and an instruction circuit (3625, 4010) responsive to a first instruction to program the range of said bias value generator circuit (3900) and further responsive to a second instruction having an operand to repeatedly issue said second instruction with said operand varied in an operand value range determined as a function of the varying bias value.
    Type: Grant
    Filed: May 22, 2008
    Date of Patent: November 8, 2011
    Assignee: Texas Instruments Incorporated
    Inventors: Kenichi Tashiro, Hiroyuki Mizuno, Yuji Umemoto
  • Publication number: 20110238957
    Abstract: According to one embodiment, a software conversion program product having a computer readable medium including programmed instructions, wherein the instructions, when executed by a computer system including a host processor and one or more accelerator processors, cause the computer system to perform: analyzing input software and obtaining a compute intensity calculated by dividing the number of arithmetic processing times in a loop by the size of data accessed in the loop and a data reference area size that is a total size of areas where data is referred to; determining a processor that executes loops on the basis of obtained values and a preliminarily prepared win-loss table in which wins and losses of execution times between the host processor and the accelerator processor are defined; and converting the input software so that the determined processor executes the loops.
    Type: Application
    Filed: September 14, 2010
    Publication date: September 29, 2011
    Applicant: Kabushiki Kaisha Toshiba
    Inventors: Yusuke Shirota, Osamu Torii
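    Sketch (not part of the patent record): the compute-intensity calculation and table lookup in the abstract above could look roughly like this. The table layout, the threshold values, and the names `WIN_LOSS_TABLE` and `choose_processor` are hypothetical; the application does not specify them in the abstract.

```python
from typing import List, Tuple

# Hypothetical win-loss table. Each row reads: "the accelerator wins when the
# compute intensity is at least min_intensity and the data reference area is
# at most max_area_bytes". The numbers are placeholders for illustration.
WIN_LOSS_TABLE: List[Tuple[float, int, str]] = [
    (4.0, 1 << 20, "accelerator"),
]


def compute_intensity(arithmetic_ops: int, bytes_accessed: int) -> float:
    """Arithmetic operations in the loop divided by the data size it accesses."""
    return arithmetic_ops / bytes_accessed


def choose_processor(arithmetic_ops: int, bytes_accessed: int,
                     reference_area_bytes: int,
                     table: List[Tuple[float, int, str]] = WIN_LOSS_TABLE) -> str:
    """Pick the processor for a loop; fall back to the host if no row matches."""
    intensity = compute_intensity(arithmetic_ops, bytes_accessed)
    for min_intensity, max_area_bytes, winner in table:
        if intensity >= min_intensity and reference_area_bytes <= max_area_bytes:
            return winner
    return "host"
```

    A loop with 8000 arithmetic operations over 1000 bytes has an intensity of 8 and, with a small reference area, would be routed to the accelerator under these placeholder thresholds.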
  • Patent number: 8019982
    Abstract: A data processing system and method. The data processing system includes a processor core that executes a program; a loop accelerator that has an array consisting of a plurality of data processing cells and executes a loop in a program by configuring the array according to a set of configuration bits; and a centralized register file which allows data used in the program execution to be shared by the processor core and the loop accelerator. The loop accelerator divides the configuration of the array into at least three phases according to whether data exchange with the central register file is conducted during the loop execution. Thus, unnecessary occupation of the routing resource, which is used for the data exchange between the loop accelerator and the central register file during the loop execution, can be avoided.
    Type: Grant
    Filed: October 4, 2006
    Date of Patent: September 13, 2011
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hong-seok Kim, Suk-jin Kim, Jeong-wook Kim, Soo-jung Ryu
  • Patent number: 8019981
    Abstract: Methods and apparatus are provided for performing loop execution. Modifier registers are used to hold loop counter values. Modifier register information and program memory address information are included in the loop instruction. When a processor executes a loop instruction, it decodes the instruction, identifies the modifier register, and accesses the register value to determine if the processor will jump back based on the memory address information. The loop execution can incur no clock cycle penalties.
    Type: Grant
    Filed: August 12, 2004
    Date of Patent: September 13, 2011
    Assignee: Altera Corporation
    Inventor: Paul Metzgen
  • Publication number: 20110219222
    Abstract: Mechanisms for building approximate data dependences using a moving look-back window are provided. The mechanisms track dependence information for memory accesses over iterations of execution of a portion of code. The mechanisms receive a memory access of an iteration of the portion of code, the memory access having an address for accessing the memory and an access type indicating at least one of a read or a write access type. An entry in a moving look-back window data structure is generated corresponding to a memory location accessed by the memory access. The entry comprises at least an identification of the address, the access type, and an iteration number corresponding to the iteration of the memory access. The moving look-back window data structure is utilized to determine dependence information for memory accesses over a plurality of iterations of the portion of code.
    Type: Application
    Filed: March 5, 2010
    Publication date: September 8, 2011
    Applicant: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, John K.P. O'Brien, Kathryn M. O'Brien, Kai-Ting A. Wang, Xiaotong Zhuang
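    Sketch (not part of the patent record): a bounded window of (address, access type, iteration) entries, as described in the abstract above, might be modeled as follows. The class name, the fixed capacity, and the rule that only read/write conflicts across different iterations count as dependences are assumptions for this example.

```python
from collections import deque
from typing import Deque, List, Tuple


class LookBackWindow:
    """Keeps only the most recent accesses; older ones slide out of view."""

    def __init__(self, capacity: int) -> None:
        # Each entry: (address, access_type, iteration); access_type is "R" or "W".
        self.entries: Deque[Tuple[int, str, int]] = deque(maxlen=capacity)

    def record(self, address: int, access_type: str,
               iteration: int) -> List[Tuple[int, int]]:
        """Add an access and return (earlier_iteration, this_iteration) pairs
        for every cross-iteration conflict still visible in the window."""
        dependences = []
        for addr, kind, earlier in self.entries:
            same_location = addr == address
            involves_write = "W" in (kind, access_type)
            if same_location and involves_write and earlier != iteration:
                dependences.append((earlier, iteration))
        self.entries.append((address, access_type, iteration))
        return dependences
```

    Because the window has a fixed capacity, a conflict with an access that has already slid out is not reported, which is why the resulting dependence information is approximate.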
  • Patent number: 8015391
    Abstract: A processor simultaneously issues instructions to multiple threads in a same instruction execution cycle. An instruction issuer controls issuance of an instruction for each of the multiple threads. A detector detects, for each of the multiple threads, whether a loop processing is currently being executed. A unit causes the instruction issuer to increase a number of instructions to be issued when the detector detects that the loop processing is currently being executed.
    Type: Grant
    Filed: October 8, 2010
    Date of Patent: September 6, 2011
    Assignee: Panasonic Corporation
    Inventor: Takenobu Tani
  • Patent number: 7996661
    Abstract: A dynamic reconfigurable circuit that implements optional processing by dynamically switching a processing content of a reconfigurable processing element (PE) and a connection content between the PEs in accordance with a context, includes: a configuration register section for setting a content of loop processing on the basis of the context, the loop processing content including an output source of an output signal from each of a set of the reconfigured PEs, an output destination of the output signal, and a condition for outputting the output signal to the output destination; and at least one counter circuit including a loop control section and an output register section that implement the set loop processing, that count the number of implementations of the loop processing implemented by the loop control section, and that output the output signal to the output destination based on the counted number of implementations and the condition.
    Type: Grant
    Filed: September 17, 2008
    Date of Patent: August 9, 2011
    Assignee: Fujitsu Semiconductor Limited
    Inventors: Takashi Hanai, Shinichi Sutou, Masaki Arai, Mitsuharu Wakayoshi
  • Patent number: 7991984
    Abstract: A loop control system comprises at least one loop flag in an instruction word, at least one loop counter associated with the at least one loop flag operable to store and compute a number of times a program loop is to be executed, at least one start address register associated with the at least one loop flag operable to store a program loop starting address, and at least one end address register associated with the at least one loop flag operable to store a program loop ending address.
    Type: Grant
    Filed: December 23, 2005
    Date of Patent: August 2, 2011
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Eran Pisek
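    Sketch (not part of the patent record): the interaction of the loop counter, start address register, and end address register in the abstract above can be simulated in software. The function name and the straight-line program model are assumptions, the loop flag in the instruction word is not modeled, and the count is assumed to be at least 1.

```python
from typing import List


def run_hardware_loop(start_addr: int, end_addr: int, count: int,
                      program_len: int) -> List[int]:
    """Return the sequence of program-counter values that get executed.

    When the program counter reaches end_addr and iterations remain, it is
    redirected to start_addr without any explicit branch instruction.
    """
    pc = 0
    remaining = count  # loop counter register
    executed = []
    while pc < program_len:
        executed.append(pc)
        if pc == end_addr and remaining > 1:
            remaining -= 1
            pc = start_addr  # jump back using the start address register
        else:
            pc += 1
    return executed
```

    With a body spanning addresses 1 to 2 and a count of 3 in a four-instruction program, the body runs three times before execution falls through to address 3.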
  • Patent number: 7991985
    Abstract: Systems and methods for implementing a zero overhead loop in a microprocessor or microprocessor based system/chip are disclosed. The systems and methods include the use of a breakpoint mechanism, and modification of parameters at runtime, with the breakpoint mechanism being additionally used in debugging, in order to provide some of the looping functionality.
    Type: Grant
    Filed: December 22, 2006
    Date of Patent: August 2, 2011
    Assignee: Broadcom Corporation
    Inventors: Timothy Dobson, Mark Taunton
  • Patent number: 7987347
    Abstract: Systems and methods for implementing a zero overhead loop in a microprocessor or microprocessor based system/chip are disclosed. The systems and methods include the use of a breakpoint mechanism which is additionally used in debugging in order to provide some of the looping functionality.
    Type: Grant
    Filed: December 22, 2006
    Date of Patent: July 26, 2011
    Assignee: Broadcom Corporation
    Inventors: Sophie Mary Wilson, Timothy Martin Dobson
  • Patent number: 7978750
    Abstract: A microcontroller is disposed on a receiving part of a wireless system in order to process a demodulation signal generated by a receiver circuit, and includes a memory and a CPU. The memory stores a control program of the microcontroller. The control program thereof includes a dual loop routine for an operation in reception standby mode. The dual loop routine has a first loop and a second loop included in the first loop. The CPU has an instruction set consisting of a plurality of instructions, and executes the instructions according to the program stored in the memory. The CPU executes an instruction irrelevant to an operation when the microcontroller is in reception mode during the second loop a number of times. The number of times is at least such that noise caused by the repetition of the second loop is lowered below a desired level.
    Type: Grant
    Filed: June 29, 2005
    Date of Patent: July 12, 2011
    Assignee: Fujitsu Semiconductor Limited
    Inventors: Hideo Nunokawa, Miki Suzuki, Hiroyuki Abe, Shinichi Okamoto, Shunichi Ko, Hiroshi Haibara, Nobuhiko Akasaka
  • Patent number: 7975134
    Abstract: A macroscalar processor architecture is described herein. In one embodiment, an exemplary processor includes one or more execution units to execute instructions and one or more iteration units coupled to the execution units. The one or more iteration units receive one or more primary instructions of a program loop that comprise a machine executable program. For each of the primary instructions received, at least one of the iteration units generates multiple secondary instructions that correspond to multiple loop iterations of the task of the respective primary instruction when executed by the one or more execution units. Other methods and apparatuses are also described.
    Type: Grant
    Filed: May 26, 2010
    Date of Patent: July 5, 2011
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Patent number: 7973804
    Abstract: A circuit arrangement and method support a multithreaded rendering architecture capable of dynamically routing pixel fragments from a pixel fragment generator to any pixel shader from among a pool of pixel shaders. The pixel fragment generator is therefore not tied to a specific pixel shader, but is instead able to utilize multiple pixel shaders in a pool of pixel shaders to minimize bottlenecks and improve overall hardware utilization and performance during image processing.
    Type: Grant
    Filed: March 11, 2008
    Date of Patent: July 5, 2011
    Assignee: International Business Machines Corporation
    Inventors: Eric Oliver Mejdrich, Paul Emery Schardt, Robert Allen Shearer
  • Publication number: 20110161643
    Abstract: Mechanisms for extracting data dependencies during runtime are provided. The mechanisms execute a portion of code having a loop and generate, for the loop, a first parallel execution group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The mechanisms further execute the first parallel execution group and determine, for each iteration in the subset of iterations, whether the iteration has a data dependence. Moreover, the mechanisms commit store data to system memory only for stores performed by iterations in the subset of iterations for which no data dependence is determined. Store data of stores performed by iterations in the subset of iterations for which a data dependence is determined is not committed to the system memory.
    Type: Application
    Filed: December 30, 2009
    Publication date: June 30, 2011
    Applicant: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
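    Sketch (not part of the patent record): the commit rule in the abstract above, where only iterations without a data dependence have their stores committed, can be illustrated in software. The dependence test used here (an iteration reads an address stored by an earlier iteration of the same group) and the name `commit_independent` are assumptions; the application describes hardware mechanisms.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical per-iteration record: (addresses read, {address: value} stored).
Iteration = Tuple[Set[int], Dict[int, int]]


def commit_independent(group: List[Iteration],
                       memory: Dict[int, int]) -> List[int]:
    """Commit stores of dependence-free iterations; return the indices of
    iterations whose stores were withheld."""
    stored_so_far: Set[int] = set()
    withheld = []
    for index, (reads, stores) in enumerate(group):
        if stored_so_far & set(reads):
            withheld.append(index)   # dependence found: do not commit
        else:
            memory.update(stores)    # no dependence: commit to memory
        # Track all stores, committed or not, as a conservative choice.
        stored_so_far |= set(stores)
    return withheld
```

    The withheld iterations would then need to be re-executed, for example in a later group, once the values they depend on are available.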
  • Publication number: 20110161642
    Abstract: Mechanisms for extracting data dependencies during runtime are provided. With these mechanisms, a portion of code having a loop is executed. A first parallel execution group is generated for the loop, the group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The first parallel execution group is executed by executing each iteration in parallel. Store data for iterations are stored in corresponding store caches of the processor. Dependency checking logic of the processor determines, for each iteration, whether the iteration has a data dependence. Only the store data for stores where there was no data dependence determined are committed to memory.
    Type: Application
    Filed: December 30, 2009
    Publication date: June 30, 2011
    Applicant: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
  • Patent number: 7945768
    Abstract: A method and apparatus for executing a nested program loop on a vector processor, the loop comprising outer-pre, inner and outer-post portions. An input stream unit of the vector processor provides a data value to a data path and sets an associated data validity tag to ‘valid’ once per outer loop iteration, as indicated by an inner counter of the input stream unit. The tag is set to ‘invalid’ in other iterations. Functional units of the vector processor operate on data values in the data path, each functional unit producing a valid result if the data validity tags associated with input data values are set to ‘valid’. An output stream unit of the vector processor sinks a data value from the data path once per outer loop iteration if an associated data validity tag indicates that the data value is valid.
    Type: Grant
    Filed: June 5, 2008
    Date of Patent: May 17, 2011
    Assignee: Motorola Mobility, Inc.
    Inventors: Raymond B. Essick, IV, Kent D. Moat, Michael A. Schuette
  • Publication number: 20110107071
    Abstract: A system and method are provided for executing a conditional branch instruction. The system and method may include a branch predictor to predict one or more instructions that depend on the conditional branch instruction and a branch mis-prediction buffer to store correct instructions that were not predicted by the branch predictor during a branch mis-prediction.
    Type: Application
    Filed: November 4, 2009
    Publication date: May 5, 2011
    Inventor: Jeffrey Allan (Alon) JACOB (YAAKOV)
  • Patent number: 7937574
    Abstract: In an embodiment, a microcode unit for a processor is contemplated. The microcode unit comprises a microcode memory storing a plurality of microcode routines executable by the processor, wherein each microcode routine comprises two or more microcode operations. Coupled to the microcode memory, the sequence control unit is configured to control reading microcode operations from the microcode memory to be issued for execution by the processor. The sequence control unit is configured to stall issuance of microcode operations forming a body of a loop in a first routine of the plurality of microcode routines until a loop counter value that indicates a number of iterations of the loop is received by the sequence control unit.
    Type: Grant
    Filed: July 17, 2007
    Date of Patent: May 3, 2011
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael T. Clark, Jelena Ilic, Syed Faisal Ahmed, Michael T. DiBrino
  • Patent number: 7936221
    Abstract: A novel and useful apparatus for and method of spur reduction using computation spreading in a digital phase locked loop (DPLL) architecture. A software based PLL incorporates a reconfigurable calculation unit (RCU) that is optimized and programmed to sequentially perform all the atomic operations of a PLL or any other desired task in a time sharing manner. An application specific instruction-set processor (ASIP) incorporating the RCU is adapted to spread the computation of the atomic operations over an entire PLL reference clock period, completing it within that period. Each computation is performed at a much higher processor clock frequency than the PLL reference clock rate. This functions to significantly reduce the per cycle current transient generated by the computations. Further, the frequency content of the current transients is at the higher processor clock frequency. This results in a significant reduction in spurs within sensitive portions of the output spectrum.
    Type: Grant
    Filed: September 11, 2007
    Date of Patent: May 3, 2011
    Assignee: Texas Instruments Incorporated
    Inventors: Roman Staszewski, Robert B. Staszewski, Fuqiang Shi
  • Patent number: 7917739
    Abstract: The execution status of pipeline processing is highly visualized by appropriately displaying processes forming loops in a simplified manner. A loop-information storage unit stores loop-defining information specifying the address of an instruction that causes a pipeline process forming a loop. An operation-information storage unit stores operation information that includes the address of an instruction input into a pipeline and information indicating the execution status of a pipeline process caused by the instruction. A loop determination unit determines whether each pipeline process indicated by the operation information forms a loop by referring to the loop-defining information. An output unit outputs visualization information indicating, in a visually comprehensible manner, the execution status of a pipeline process that has been determined to form a loop for a predetermined number of executions of the loop and the execution status of a pipeline process that has been determined to form no loop.
    Type: Grant
    Filed: June 19, 2008
    Date of Patent: March 29, 2011
    Assignee: Fujitsu Limited
    Inventors: Shuji Yamamura, Takashi Aoki
  • Publication number: 20110072215
    Abstract: A cache device according to an exemplary aspect of the present invention includes a way information buffer that stores way information that is a result of selecting a way in an instruction that accesses a cache memory; and a control unit that controls a storage processing and a read processing, while a series of instruction groups are repeatedly executed, the storage processing being for storing the way information in the instruction groups to the way information buffer, the read processing being for reading the way information from the way information buffer.
    Type: Application
    Filed: September 17, 2010
    Publication date: March 24, 2011
    Applicant: Renesas Electronics Corporation
    Inventor: Daisuke Takahashi
  • Publication number: 20110072251
    Abstract: A system, method and computer program product are provided for processing exceptions. Initially, computational operations are processed in a loop. Moreover, exceptions are identified and stored while processing the computational operations. Such exceptions are then processed separate from the loop.
    Type: Application
    Filed: April 22, 2010
    Publication date: March 24, 2011
    Applicant: DROPLET TECHNOLOGY, INC.
    Inventors: William C. Lynch, Krasimir D. Kolarov, Steven E. Saunders
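The pattern described in the abstract above (record exceptions during the loop, handle them afterwards) can be sketched in Python as follows. This is only a minimal illustration: the function name `process_with_deferred_exceptions` and the `fast_op` / `slow_fix` callbacks are hypothetical, and using Python's `ArithmeticError` to stand in for an "exception" is an assumption of this sketch.

```python
from typing import Callable, List, Tuple


def process_with_deferred_exceptions(
    items: List[float],
    fast_op: Callable[[float], float],
    slow_fix: Callable[[float], float],
) -> List[float]:
    """Apply fast_op in a tight loop; repair failing items after the loop."""
    results: List[float] = [0.0] * len(items)
    pending: List[Tuple[int, float]] = []  # exceptions stored during the loop

    # Main loop: exceptional cases are only recorded, not handled.
    for i, x in enumerate(items):
        try:
            results[i] = fast_op(x)
        except ArithmeticError:
            pending.append((i, x))

    # Exceptions are processed separately from the loop.
    for i, x in pending:
        results[i] = slow_fix(x)
    return results
```

Keeping the handling out of the main loop means the common path stays short and uniform; the cost of the rare cases is paid once, in a second pass over only the recorded items.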
  • Patent number: 7913069
    Abstract: A computer array (10) has a plurality of computers (12). The computers (12) communicate with each other asynchronously, and the computers (12) themselves operate in a generally asynchronous manner internally. Instruction words (48) can include a micro-loop (100) which is capable of performing a series of operations repeatedly. In a particular example, the series of operations are included in a single instruction word (48). The micro-loop (100) in combination with the ability of the computers (12) to send instruction words (48) to a neighboring computer (12) provides a powerful tool for allowing a computer (12) to utilize the resources of a neighboring computer (12).
    Type: Grant
    Filed: May 26, 2006
    Date of Patent: March 22, 2011
    Assignee: VNS Portfolio LLC
    Inventors: Charles H. Moore, Jeffrey Arthur Fox, John W. Rible
  • Publication number: 20110055445
    Abstract: A signal processing system may include a multiply-accumulate (MAC) unit to generate output data by performing multiply-accumulate operations on first and second input data in response to a stream of MAC instruction words, where the MAC unit is pipelined to enable it to perform a multiply-accumulate operation in response to each MAC instruction word. The system may also include an instruction generator to generate the stream of MAC instruction words by performing loop expansion on a stream of intermediate instruction words, where one intermediate instruction word may comprise a group of fields to set up the MAC unit to execute in response to the one intermediate instruction word.
    Type: Application
    Filed: March 15, 2010
    Publication date: March 3, 2011
    Applicant: AZURAY TECHNOLOGIES, INC.
    Inventors: Edward Gee, Keith Slavin, Robert Batten, Vincenzo DiTommaso, Ravindranath Naiknaware, Triet Tu Le, Adam Heiberg, Dennis Morel
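To make the notion of loop expansion in the abstract above more concrete, here is a small Python sketch in which one compact intermediate word is expanded into a stream of per-cycle MAC words. The field names (`a_base`, `b_base`, `count`, `stride`, `clear_acc`) are invented for illustration and do not come from the application; the real instruction formats are not described in the abstract.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class IntermediateWord:
    """Hypothetical compact word describing a whole multiply-accumulate loop."""
    a_base: int
    b_base: int
    count: int
    stride: int = 1


@dataclass
class MacWord:
    """Hypothetical expanded word: one multiply-accumulate operation."""
    a_addr: int
    b_addr: int
    clear_acc: bool  # True on the first operation of each expanded loop


def expand(words: Iterable[IntermediateWord]) -> Iterator[MacWord]:
    """Loop expansion: one intermediate word becomes `count` MAC words."""
    for w in words:
        for i in range(w.count):
            yield MacWord(w.a_base + i * w.stride, w.b_base + i * w.stride, i == 0)


def run_mac(stream: Iterable[MacWord], a_mem: List[int], b_mem: List[int]) -> List[int]:
    """Execute one multiply-accumulate per MAC word; emit one sum per loop."""
    outputs: List[int] = []
    acc = 0
    started = False
    for op in stream:
        if op.clear_acc:
            if started:
                outputs.append(acc)  # previous loop finished
            acc = 0
        acc += a_mem[op.a_addr] * b_mem[op.b_addr]
        started = True
    if started:
        outputs.append(acc)
    return outputs
```

For example, two intermediate words of length 2 over `a_mem = [1, 2, 3, 4]` and `b_mem = [5, 6, 7, 8]` expand to four MAC words and yield two dot products.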
  • Patent number: 7886134
    Abstract: This invention combines a loop support mechanism and a branch prediction mechanism. After an instruction execution unit executes an end block instruction of a block repeat, the loop control unit branches to the first instruction in the loop and sends a pseudo branch instruction to the instruction execution unit. The instruction execution unit acts as if the last instruction in the block is an instruction for branching to the start address of the block. This is stored in the branch prediction unit and branch prediction is performed thereafter.
    Type: Grant
    Filed: December 5, 2008
    Date of Patent: February 8, 2011
    Assignee: Texas Instruments Incorporated
    Inventor: Hiroyuki Mizumo
  • Publication number: 20110029763
    Abstract: A processor simultaneously issues instructions to multiple threads in a same instruction execution cycle. An instruction issuer controls issuance of an instruction for each of the multiple threads. A detector detects, for each of the multiple threads, whether a loop processing is currently being executed. A unit causes the instruction issuer to increase a number of instructions to be issued when the detector detects that the loop processing is currently being executed.
    Type: Application
    Filed: October 8, 2010
    Publication date: February 3, 2011
    Applicant: PANASONIC CORPORATION
    Inventor: Takenobu TANI
  • Patent number: 7882381
    Abstract: Methods of managing wasted active power of processors in computer systems and other electronic devices are disclosed. In one aspect, a method may include counting a number of times that a processor has performed an instruction loop. Then, a determination may be made whether the number of times that the processor has performed the loop is greater than one or more thresholds or other given values. Next, a limit may be imposed on the power of the processor if the number of times that the processor has performed the loop is greater than at least one of the thresholds or given values. Logic to perform such methods is also disclosed, as are systems suitable for incorporating such logic.
    Type: Grant
    Filed: June 29, 2006
    Date of Patent: February 1, 2011
    Assignee: Intel Corporation
    Inventor: David Anthony Wyatt
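The counting-and-threshold method in the abstract above could be modelled along the following lines. This is a sketch under stated assumptions: the class `LoopPowerLimiter`, the use of a repeated branch target to recognise "the same loop", and the mapping from thresholds to power caps in watts are all hypothetical choices made for illustration.

```python
from typing import Dict, Optional


class LoopPowerLimiter:
    """Count iterations of one loop and pick a power cap from thresholds."""

    def __init__(self, thresholds: Dict[int, float]) -> None:
        # Maps an iteration-count threshold to a power cap (assumed: watts),
        # kept in ascending threshold order.
        self.thresholds = dict(sorted(thresholds.items()))
        self.loop_address: Optional[int] = None
        self.count = 0

    def observe_backward_branch(self, target: int) -> Optional[float]:
        """Record one loop iteration; return the cap to impose, or None."""
        if target == self.loop_address:
            self.count += 1
        else:
            # A different loop: restart the count.
            self.loop_address = target
            self.count = 1
        cap: Optional[float] = None
        for threshold, limit in self.thresholds.items():
            if self.count > threshold:
                cap = limit  # the highest exceeded threshold wins
        return cap
```

With thresholds `{3: 10.0, 5: 5.0}`, no cap applies for the first three iterations, the 10.0 cap applies on the fourth and fifth, and the 5.0 cap applies from the sixth onward.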
  • Patent number: 7873820
    Abstract: The present invention provides processing systems, apparatuses, and methods that reduce power consumption with the use of a loop buffer. In an embodiment, an instruction fetch unit of a processor initially provides instructions from an instruction cache to an execution unit of the processor. While instructions are provided from the instruction cache to the execution unit, instructions forming a loop are stored in a loop buffer. When a loop stored in the loop buffer is being iterated, the instruction cache is disabled to reduce power consumption and instructions are provided to the execution unit from the loop buffer. When the loop is exited, the instruction cache is re-enabled and instructions are provided to the execution unit from the instruction cache.
    Type: Grant
    Filed: November 15, 2005
    Date of Patent: January 18, 2011
    Assignee: MIPS Technologies, Inc.
    Inventor: Matthias Knoth
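A simple way to see the effect of the loop buffer described above is to replay a trace of program-counter values and count where each instruction is fetched from. The Python sketch below is an illustrative model only: the function `simulate_fetch`, detecting a loop by a backward jump, and capturing the loop body on the iteration after that jump are assumptions of this sketch rather than details from the patent.

```python
from typing import List, Optional, Set, Tuple


def simulate_fetch(trace: List[int], capacity: int) -> Tuple[int, int]:
    """Return (icache_fetches, loop_buffer_fetches) for a PC trace."""
    buffer: Set[int] = set()  # addresses currently held in the loop buffer
    capturing = False         # True while a loop body is being recorded
    icache_fetches = 0
    buffer_fetches = 0
    prev_pc: Optional[int] = None

    for pc in trace:
        if pc in buffer:
            # Loop is being iterated from the buffer; the instruction
            # cache would be disabled for this fetch.
            capturing = False
            buffer_fetches += 1
        else:
            icache_fetches += 1
            backward = prev_pc is not None and pc < prev_pc
            if backward:
                # Backward jump: assume a loop starts here and record it.
                buffer = set()
                capturing = True
            elif not capturing:
                # Straight-line code or loop exit: buffer contents are stale.
                buffer = set()
            if capturing and len(buffer) < capacity:
                buffer.add(pc)
        prev_pc = pc
    return icache_fetches, buffer_fetches
```

In this model the first pass through a loop and the capturing pass both use the instruction cache; only later iterations are served from the buffer, so the saving grows with the iteration count.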
  • Publication number: 20100306516
    Abstract: An information processor includes a first recording unit which stores first information indicating correspondence between an instruction address and a branch destination address of a most recent branch instruction, a computation of the most recent branch instruction having been completed and a branch for the most recent branch instruction having been taken, a second recording unit which stores second information indicating correspondence between an instruction address and a branch destination address of each of past branch instructions including the most recent branch instruction, computations of the past branch instructions having been completed and branches for the past branch instructions having been taken, and a control unit which makes a branch prediction based on the first information or the second information, and stops supply of a clock to the second recording unit and makes a branch prediction based on the first information when an instruction sequence enters a loop.
    Type: Application
    Filed: May 14, 2010
    Publication date: December 2, 2010
    Applicant: FUJITSU LIMITED
    Inventor: Takashi SUZUKI
  • Publication number: 20100299509
    Abstract: A computer-implemented pipeline execution system, method, and program product for executing loop processing in a multi-core or a multiprocessor computing environment, where the loop processing includes multiple function blocks in a multiple-stage pipeline manner. The system includes: a pipelining unit for pipelining the loop processing and assigning the loop processing to a computer processor or core; a calculating unit for calculating a first-order gradient term from a value calculated with the use of a predicted value of the input to a pipeline; and a correcting unit for correcting an output value of the pipeline with the value of the first-order gradient term.
    Type: Application
    Filed: May 18, 2010
    Publication date: November 25, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jun Doi, Shuichi Shimizu, Takeo Yoshizawa
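One plausible reading of the correction step in the abstract above is a first-order Taylor expansion: a pipeline stage runs early on a predicted input, and its output is later adjusted by the gradient times the prediction error. The Python sketch below follows that reading; it is an interpretation, not the application's stated method. The function `corrected_output` and the finite-difference gradient estimate are assumptions made for illustration.

```python
from typing import Callable


def corrected_output(
    stage: Callable[[float], float],
    predicted_input: float,
    actual_input: float,
    eps: float = 1e-6,
) -> float:
    """Correct a speculatively computed stage output to first order."""
    # Output computed from the predicted input, before the real one is known.
    speculative = stage(predicted_input)
    # Central finite-difference estimate of d(stage)/d(input) at the prediction
    # (an assumed way of obtaining the first-order gradient term).
    gradient = (stage(predicted_input + eps) - stage(predicted_input - eps)) / (2 * eps)
    # First-order correction: f(x) is approximately f(x_pred) + f'(x_pred) * (x - x_pred).
    return speculative + gradient * (actual_input - predicted_input)
```

The correction is exact for a linear stage and an approximation otherwise, with an error that shrinks as the prediction gets closer to the actual input.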
  • Patent number: 7836289
    Abstract: A program execution control device which controls execution of a program by a processor having a predicate function for conditional execution of an instruction, wherein the program includes a branch instruction to control iterations in loop processing, the branch instruction is further an instruction to generate an execute-or-not condition indicating whether or not the branch instruction is to be executed at an iteration in the loop processing after a current iteration, and to reflect the execute-or-not condition on a predicate flag used for conditional execution of the branch instruction, the program execution control device comprises a processor status changing unit configured to change, before an execution cycle of the branch instruction, a status of the processor in advance for execution of an instruction following the branch instruction, the status being changed based on the execute-or-not condition reflected on the predicate flag.
    Type: Grant
    Filed: August 20, 2008
    Date of Patent: November 16, 2010
    Assignee: Panasonic Corporation
    Inventor: Takenobu Tani
  • Publication number: 20100287550
    Abstract: A runtime dependence-aware scheduling of dependent iterations mechanism is provided. Computation is performed for one or more iterations of computer executable code by a main thread. Dependence information is determined for a plurality of memory accesses within the computer executable code using modified executable code using a set of dependence threads. Using the dependence information, a determination is made as to whether a subset of a set of uncompleted iterations in the plurality of iterations is capable of being executed ahead-of-time by the one or more available threads in the data processing system. If the subset of the set of uncompleted iterations in the plurality of iterations is capable of being executed ahead-of-time, the main thread is signaled to skip the subset of the set of uncompleted iterations and the set of assist threads is signaled to execute the subset of the set of uncompleted iterations.
    Type: Application
    Filed: May 5, 2009
    Publication date: November 11, 2010
    Applicant: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Kathryn M. O'Brien, Xiaotong Zhuang
  • Publication number: 20100281240
    Abstract: A system and method for facilitating simulation of a computer program. A program representation is generated from a computer program. A simulation of the program is performed. Simulation may include applying heuristics to determine program flow for selected instructions, such as a branch instruction or a loop instruction. Simulation may also include creating imaginary objects as surrogates for real objects, when program code to create real objects is restricted, or fields of the objects are unavailable or uncertain, or for other reasons. Data descriptive of the simulation is inserted into the program representation. A visualizer may retrieve the program representation and generate a visualization that shows sequence flows resulting from the simulation.
    Type: Application
    Filed: May 1, 2009
    Publication date: November 4, 2010
    Applicant: Microsoft Corporation
    Inventors: Deon Brewis, Durham Goode, John Joseph Jordan, Sadi Khan
  • Patent number: 7822949
    Abstract: A command supply device supplies a command sequence that forms a loop. A loop command buffer accumulates a first partial command sequence. The first partial command sequence is a head part of a first command sequence repeatedly supplied to a CPU from among command sequences stored in a main memory, and is accumulated before the first command sequence is supplied to the CPU again. A linking command buffer accumulates a second partial command sequence. The second partial command sequence follows the first partial command sequence in the first command sequence, and is accumulated while the accumulated first partial command sequence in the loop command buffer is supplied to the CPU. A selection circuit supplies, to the CPU, a command from the accumulated second partial command sequence in the linking command buffer when the entirety of the first partial command sequence has been supplied to the CPU.
    Type: Grant
    Filed: May 9, 2005
    Date of Patent: October 26, 2010
    Assignee: Panasonic Corporation
    Inventor: Satoshi Ogura