Branching (e.g., Delayed Branch, Loop Control, Branch Predict, Interrupt) Patents (Class 712/233)
  • Patent number: 8656400
    Abstract: Method and apparatus are provided for synchronizing execution of a plurality of threads on a multi-threaded processor. Each thread is provided with a number of synchronization points corresponding to points where it is advantageous or preferable that execution should be synchronized with another thread. Execution of a thread is paused when it reaches a synchronization point until at least one other thread with which it is intended to be synchronized reaches a corresponding synchronization point. Execution is subsequently resumed. Where an executing thread branches over a section of code which included a synchronization point then execution is paused at the end of the branch until the at least one other thread reaches the synchronization point of the end of the corresponding branch.
    Type: Grant
    Filed: May 30, 2012
    Date of Patent: February 18, 2014
    Assignee: Imagination Technologies, Ltd.
    Inventor: Yoong Chert Foo
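    The pausing behaviour described in the abstract for patent 8656400 can be illustrated with a software barrier. This is only a minimal sketch, not the patented hardware mechanism: it assumes Python's standard `threading.Barrier` as a stand-in for a synchronization point, and the names `run_threads`, `worker`, and `skip_flags` are hypothetical. A thread that branches over the section still waits at the end of the branch.

```python
import threading


def run_threads(skip_flags):
    """Run one thread per flag; True means the thread branches over the section."""
    barrier = threading.Barrier(len(skip_flags))  # stand-in for a synchronization point
    log = []
    lock = threading.Lock()

    def record(event, tid):
        with lock:
            log.append((event, tid))

    def worker(tid, skips_section):
        if not skips_section:
            record("before", tid)
            barrier.wait()   # pause at the synchronization point inside the section
            record("after", tid)
        else:
            record("skipped", tid)
            barrier.wait()   # pause at the end of the branch instead
            record("resumed", tid)

    threads = [threading.Thread(target=worker, args=(i, flag))
               for i, flag in enumerate(skip_flags)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    for event in run_threads([False, False, True]):
        print(event)
```

    No thread logs "after" or "resumed" until every thread has reached its waiting point, which is the ordering property the abstract describes.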
  • Patent number: 8645668
    Abstract: A sub-processor different from the main processor executing control in the operating system (OS) is designated to control a device driver corresponding to a communication unit and thus, the communication control is executed by the sub-processor in response to an interrupt originating from a network card functioning as the communication unit in an information processing apparatus equipped with a plurality of processors and engaged in communication via a network. The structure enables the main processor to execute data processing with a high level of efficiency without a time lag in the data processing.
    Type: Grant
    Filed: January 9, 2008
    Date of Patent: February 4, 2014
    Assignees: Sony Corporation, Sony Computer Entertainment Inc.
    Inventors: Hiroshi Kyusojin, Yuji Kawamura
  • Patent number: 8645714
    Abstract: A branch target address cache (BTAC) caches history information associated with branch and switch key instructions previously executed by a microprocessor. The history information includes a target address and an identifier (index into a register file) for identifying key values associated with each of the previous branch and switch key instructions. A fetch unit receives from the BTAC a prediction that the fetch unit fetched a previous branch and switch key instruction and receives the target address and identifier associated with the fetched branch and switch key instruction. The fetch unit also fetches encrypted instruction data at the associated target address and decrypts (via XOR) the fetched encrypted instruction data based on the key values identified by the identifier, in response to receiving the prediction. If the BTAC predicts correctly, a pipeline flush normally associated with the branch and switch key instruction is avoided.
    Type: Grant
    Filed: April 21, 2011
    Date of Patent: February 4, 2014
    Assignee: VIA Technologies, Inc.
    Inventors: G. Glenn Henry, Terry Parks, Brent Bean, Thomas A. Crispin
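    The lookup-then-decrypt flow in the abstract for patent 8645714 can be sketched in software. This is a rough illustration under stated assumptions: the table layout, the key values, the repeating-key XOR, and the names `btac`, `key_register_file`, `xor_bytes`, and `fetch_decrypted` are all hypothetical and are not taken from the patent.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical register file of key values, indexed by identifier.
key_register_file = {0: b"\x5a\xa5", 1: b"\x3c\xc3\x0f"}

# Hypothetical BTAC contents: branch address -> (target address, key identifier).
btac = {0x1000: (0x2000, 1)}


def fetch_decrypted(branch_addr, memory):
    """On a BTAC hit, fetch the encrypted bytes at the target and decrypt them."""
    entry = btac.get(branch_addr)
    if entry is None:
        return None  # no prediction for this address
    target, key_id = entry
    return xor_bytes(memory[target], key_register_file[key_id])


if __name__ == "__main__":
    plaintext = b"\x90\x90\xc3"
    memory = {0x2000: xor_bytes(plaintext, key_register_file[1])}
    print(fetch_decrypted(0x1000, memory) == plaintext)
```

    Because XOR is its own inverse, the same helper serves for both encryption and decryption in this sketch.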
  • Patent number: 8634982
    Abstract: To improve the scheduling and tasking of sensors, the present disclosure describes an improved planning system and method for the allocation and management of sensors. In one embodiment, the planning system uses a branch and bound approach of tasking sensors using a heuristic to expedite arrival at a deterministic solution. In another embodiment, a progressive lower bound is applied to the branch and bound approach. Also, in another embodiment, a hybrid branch and bound approach is used where both local and global planning are employed in a tiered fashion.
    Type: Grant
    Filed: August 19, 2009
    Date of Patent: January 21, 2014
    Assignee: Raytheon Company
    Inventors: Deepak Khosla, David Fuciarelli, David L Ii
  • Patent number: 8635437
    Abstract: A microprocessor includes a memory that stores an exception handler to handle an exception condition. The exception handler is a non-user program private to the microprocessor and includes a conditional branch instruction. A first fetch unit fetches instructions of a user program that includes a user program instruction that causes the exception condition. An execution unit executes the user program instructions fetched by the first fetch unit and executes instructions of the exception handler. The execution unit also saves a state in response to detecting the exception condition caused by the user program instruction. A second fetch unit fetches the exception handler instructions from the memory and resolves the conditional branch instruction based on the saved state without sending the conditional branch instruction to the execution unit to resolve the conditional branch instruction.
    Type: Grant
    Filed: June 9, 2009
    Date of Patent: January 21, 2014
    Assignee: VIA Technologies, Inc.
    Inventors: G. Glenn Henry, Terry Parks, Brent Bean
  • Patent number: 8631225
    Abstract: Mechanisms are provided for dynamically rewriting branch instructions in a portion of code. The mechanisms execute a branch instruction in the portion of code. The mechanisms determine if a target instruction of the branch instruction, to which the branch instruction branches, is present in an instruction cache associated with the processor. Moreover, the mechanisms directly branch execution of the portion of code to the target instruction in the instruction cache, without intervention from an instruction cache runtime system, in response to a determination that the target instruction is present in the instruction cache. In addition, the mechanisms redirect execution of the portion of code to the instruction cache runtime system in response to a determination that the target instruction cannot be determined to be present in the instruction cache.
    Type: Grant
    Filed: June 25, 2010
    Date of Patent: January 14, 2014
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, Brian Flachs, Brad W. Michael, Mark R. Nutter, John K. P. O'Brien, Kathryn M. O'Brien, Tao Zhang
  • Patent number: 8627051
    Abstract: Mechanisms are provided for dynamically rewriting branch instructions in a portion of code. The mechanisms execute a branch instruction in the portion of code. The mechanisms determine if a target instruction of the branch instruction, to which the branch instruction branches, is present in an instruction cache associated with the processor. Moreover, the mechanisms directly branch execution of the portion of code to the target instruction in the instruction cache, without intervention from an instruction cache runtime system, in response to a determination that the target instruction is present in the instruction cache. In addition, the mechanisms redirect execution of the portion of code to the instruction cache runtime system in response to a determination that the target instruction cannot be determined to be present in the instruction cache.
    Type: Grant
    Filed: April 10, 2012
    Date of Patent: January 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, Brian Flachs, Brad W. Michael, Mark R. Nutter, John K. P. O'Brien, Kathryn M. O'Brien, Tao Zhang
  • Publication number: 20130339689
    Abstract: In some implementations, a register file has a plurality of read ports for providing data to a micro-operation during execution of the micro-operation. For example, the micro-operation may utilize at least two data sources, with at least one first data source being utilized at least one pipeline stage earlier than at least one second data source. A number of register file read ports may be allocated for executing the micro-operation. A bypass calculation is performed during a first pipeline stage to detect whether the at least one second data source is available from a bypass network. During a subsequent second pipeline stage, when the at least one second data source is detected to be available from the bypass network, the number of the read ports allocated to the micro-operation may be reduced.
    Type: Application
    Filed: December 29, 2011
    Publication date: December 19, 2013
    Inventors: Srikanth T. Srinivasan, Chia Yin Kevin Lai, Bambang Sutanto, Chad D. Hancock
  • Publication number: 20130339688
    Abstract: Embodiments relate to implementing processor management of transactions. An aspect includes receiving an instruction from a thread. The instruction includes an instruction type, and executes within a transaction. The transaction effectively delays committing stores to memory until the transaction has completed. A processor manages transaction nesting for the instruction based on the instruction type of the instruction. The transaction nesting includes a maximum processor capacity. The transaction nesting management enables executing a sequence of nested transactions within a transaction, supports multiple nested transactions in a processor pipeline, or generates and maintains a set of effective controls for controlling a pipeline. The processor prevents the transaction nesting from exceeding the maximum processor capacity.
    Type: Application
    Filed: June 15, 2012
    Publication date: December 19, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Fadi Y. Busaba, Brian W. Thompto
  • Patent number: 8607209
    Abstract: A processor framework includes a compiler to add control information to an instruction sequence at compile time. The control information is added in the instruction sequence prior to a control-flow changing instruction. Microarchitecture is configured to use the control information at runtime to predict an outcome of the control-flow changing instruction prior to fetching the control-flow changing instruction.
    Type: Grant
    Filed: January 18, 2005
    Date of Patent: December 10, 2013
    Assignee: BlueRISC Inc.
    Inventors: Saurabh Chheda, Kristopher Carver, Raksit Ashok
  • Patent number: 8607034
    Abstract: An apparatus including a microprocessor, a system memory, and a secure non-volatile memory. The microprocessor executes non-secure application programs and a secure application program. The secure application program is executed in a secure execution mode. The microprocessor has secure watchdog logic that monitors environmental attributes corresponding to the microprocessor and to the secure application program, and that transfers program control to one of a plurality of event handlers within the secure application program. The system memory has non-secure application programs stored therein. The secure non-volatile memory is coupled to the microprocessor via a private bus.
    Type: Grant
    Filed: October 31, 2008
    Date of Patent: December 10, 2013
    Assignee: VIA Technologies, Inc.
    Inventors: G. Glenn Henry, Terry Parks
  • Patent number: 8601177
    Abstract: A method may include distributing ranges of addresses in a memory among a first set of functions in a first pipeline. The first set of the functions in the first pipeline may operate on data using the ranges of addresses. Different ranges of addresses in the memory may be redistributed among a second set of functions in a second pipeline without waiting for the first set of functions to be flushed of data.
    Type: Grant
    Filed: June 27, 2012
    Date of Patent: December 3, 2013
    Assignee: Intel Corporation
    Inventor: Thomas A. Piazza
  • Publication number: 20130311758
    Abstract: A hardware profiling mechanism implemented by performance monitoring hardware enables page level automatic binary translation. The hardware during runtime identifies a code page in memory containing potentially optimizable instructions. The hardware requests allocation of a new page in memory associated with the code page, where the new page contains a collection of counters and each of the counters corresponds to one of the instructions in the code page. When the hardware detects a branch instruction having a branch target within the code page, it increments one of the counters that has the same position in the new page as the branch target in the code page. The execution of the code page is repeated and the counters are incremented when branch targets fall within the code page. The hardware then provides the counter values in the new page to a binary translator for binary translation.
    Type: Application
    Filed: March 30, 2012
    Publication date: November 21, 2013
    Inventors: Paul Caprioli, Matthew C. Merten, Muawya M. Al-Otoom, Omar M. Shaikh, Abhay S. Kanhere, Suresh Srinivas, Koichi Yamada, Vivek Thakkar, Pawel Osciak
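    The counter-page idea in the abstract for publication 20130311758 can be sketched as follows. This is a simplified software model, not the described hardware: it assumes a 4096-byte page with one counter per byte offset, and the names `PAGE_SIZE` and `profile_branches` are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size; the abstract does not state one


def profile_branches(code_page_base, branch_targets):
    """Count branch targets that fall inside the code page.

    The counter for a target sits at the same offset in the counter page
    as the target occupies in the code page.
    """
    counters = [0] * PAGE_SIZE
    for target in branch_targets:
        offset = target - code_page_base
        if 0 <= offset < PAGE_SIZE:  # ignore targets outside the page
            counters[offset] += 1
    return counters


if __name__ == "__main__":
    counts = profile_branches(0x4000, [0x4010, 0x4010, 0x5000, 0x4FFF])
    hot = {hex(0x4000 + i): c for i, c in enumerate(counts) if c}
    print(hot)
```

    A binary translator could then read the non-zero counters to find frequently reached targets; that step is outside this sketch.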
  • Patent number: 8577769
    Abstract: A system for optimization of variables is provided. The system includes a buyer finance system for receiving asset data and buyer finance data. A seller variable system receives the asset data and the buyer finance data and applies a seller variable distribution to generate seller transaction state data. A finance variable system receives the asset data and the buyer finance data and applies a finance variable distribution to generate finance transaction state data. A variable optimization system receives the seller transaction state data and the finance transaction state data and generates transaction approval data.
    Type: Grant
    Filed: June 12, 2012
    Date of Patent: November 5, 2013
    Assignee: Skopos Financial Group, LLC
    Inventors: A. John Fineout, Craig M. Allen, Thomas R. Brower
  • Publication number: 20130232323
    Abstract: Methods, media and systems that obfuscate control flow in software programs. The obfuscation can impede or prevent static flow analysis of a software program's control flow. In one embodiment, a method, performed by a data processing system, identifies each branch point in a set of branch points in a first version of software and replaces, in each branch point in the set, a representation of a target of the branch point with a computed value that depends upon at least one prior computed value in a stream of instructions in the first version of software. Other embodiments are also described.
    Type: Application
    Filed: October 19, 2012
    Publication date: September 5, 2013
    Applicant: APPLE INC
    Inventors: Julien Lerouge, Jonathan Gregory McLachlan, Daniel F. Reynaud
  • Patent number: 8527707
    Abstract: A digital system is provided for high-performance cache systems. The digital system includes a processor core and a cache control unit. The processor core is capable of being coupled to a first memory containing executable instructions and a second memory with a faster speed than the first memory. Further, the processor core is configured to execute one or more instructions of the executable instructions from the second memory. The cache control unit is configured to be coupled to the first memory, the second memory, and the processor core to fill at least the one or more instructions from the first memory to the second memory before the processor core executes the one or more instructions.
    Type: Grant
    Filed: December 22, 2010
    Date of Patent: September 3, 2013
    Assignee: Shanghai Xin Hao Micro Electronics Co. Ltd.
    Inventors: Kenneth Chenghao Lin, Haoqi Ren
  • Patent number: 8521996
    Abstract: A microprocessor includes a pipeline of stages for processing instructions and first and second types of conditional branch instruction includable by a program. The microprocessor makes a prediction of conditional branch instructions of the first type and flushes the pipeline of instructions if the prediction is subsequently determined to be incorrect, thereby incurring a branch misprediction penalty related to processing of conditional branch instructions of the first type. The microprocessor always correctly resolves conditional branch instructions of the second type without making a prediction of conditional branch instructions of the second type, thereby avoiding ever incurring a branch misprediction penalty related to processing of conditional branch instructions of the second type.
    Type: Grant
    Filed: June 9, 2009
    Date of Patent: August 27, 2013
    Assignee: VIA Technologies, Inc.
    Inventors: G. Glenn Henry, Terry Parks, Brent Bean
  • Patent number: 8516230
    Abstract: An application thread executes a direct branch instruction that is stored in an instruction cache line. Upon execution, the direct branch instruction branches to a branch descriptor that is also stored in the instruction cache line. The branch descriptor includes a trampoline branch instruction and a target instruction space address. Next, the trampoline branch instruction sends a branch descriptor pointer, which points to the branch descriptor, to an instruction cache manager. The instruction cache manager extracts the target instruction space address from the branch descriptor, and executes a target instruction corresponding to the target instruction space address. In one embodiment, the instruction cache manager generates a target local store address by masking off a portion of bits included in the target instruction space address. In turn, the application thread executes the target instruction located at the target local store address accordingly.
    Type: Grant
    Filed: December 29, 2009
    Date of Patent: August 20, 2013
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, Brian Flachs, Brad William Michael, Mark Richard Nutter, Kathryn M. O'Brien, John Kevin Patrick O'Brien
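    The address masking mentioned in the abstract for patent 8516230 amounts to keeping only the low-order bits of the target instruction space address. A minimal sketch, assuming an 18-bit (256 KiB) local store; the bit width and the names `LOCAL_STORE_BITS`, `LOCAL_STORE_MASK`, and `to_local_store_address` are assumptions, not values from the patent.

```python
LOCAL_STORE_BITS = 18                       # assumed local store size of 256 KiB
LOCAL_STORE_MASK = (1 << LOCAL_STORE_BITS) - 1


def to_local_store_address(instruction_space_address: int) -> int:
    """Mask off the high bits so the address falls inside the local store."""
    return instruction_space_address & LOCAL_STORE_MASK


if __name__ == "__main__":
    print(hex(to_local_store_address(0x12345678)))
```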
  • Publication number: 20130198496
    Abstract: Major branch instructions are provided that enable execution of a computer program to branch from one segment of code to another segment of code. These instructions also create a new stream of processing at the other segment of code enabling execution of the other segment of code to be performed in parallel with the segment of code from which the branch was taken. In one example, the other stream of processing starts a transaction for processing instructions of the other stream of processing.
    Type: Application
    Filed: November 15, 2012
    Publication date: August 1, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
  • Publication number: 20130198492
    Abstract: Major branch instructions are provided that enable execution of a computer program to branch from one segment of code to another segment of code. These instructions also create a new stream of processing at the other segment of code enabling execution of the other segment of code to be performed in parallel with the segment of code from which the branch was taken. In one example, the other stream of processing starts a transaction for processing instructions of the other stream of processing.
    Type: Application
    Filed: January 31, 2012
    Publication date: August 1, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, Brian R. Prasky, Chung-Lung K. Shum
  • Publication number: 20130198497
    Abstract: Major branch instructions are provided that enable execution of a computer program to branch from one segment of code to another segment of code. These instructions also create a new stream of processing at the other segment of code enabling execution of the other segment of code to be performed in parallel with the segment of code from which the branch was taken. In one example, the other stream of processing starts a transaction for processing instructions of the other stream of processing.
    Type: Application
    Filed: November 15, 2012
    Publication date: August 1, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
  • Publication number: 20130198491
    Abstract: Major branch instructions are provided that enable execution of a computer program to branch from one segment of code to another segment of code. These instructions also create a new stream of processing at the other segment of code enabling execution of the other segment of code to be performed in parallel with the segment of code from which the branch was taken. In one example, the other stream of processing starts a transaction for processing instructions of the other stream of processing.
    Type: Application
    Filed: January 31, 2012
    Publication date: August 1, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Fadi Y. Busaba, Steven R. Carlough, Christopher A. Krygowski, Brian R. Prasky, Chung-Lung K. Shum
  • Patent number: 8479053
    Abstract: In one embodiment, a processor includes an execution unit and at least one last branch record (LBR) register to store address information of a branch taken during program execution. This register may further store a transaction indicator to indicate whether the branch was taken during a transactional memory (TM) transaction. This register may further store an abort indicator to indicate whether the branch was caused by a transaction abort. Other embodiments are described and claimed.
    Type: Grant
    Filed: July 28, 2010
    Date of Patent: July 2, 2013
    Assignee: Intel Corporation
    Inventors: Ravi Rajwar, Peter Lachner, Laura A. Knauth, Konrad K. Lai
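    A last branch record that carries both an address and the two indicators from the abstract for patent 8479053 could be packed into a single word. The bit layout below is purely hypothetical (48 address bits, bit 62 for the transaction indicator, bit 63 for the abort indicator) and is not claimed to match any real processor's register encoding; `pack_lbr` and `unpack_lbr` are illustrative names.

```python
ADDR_MASK = (1 << 48) - 1   # hypothetical: low 48 bits hold the branch address
IN_TX_BIT = 1 << 62         # hypothetical: branch taken inside a TM transaction
ABORT_BIT = 1 << 63         # hypothetical: branch caused by a transaction abort


def pack_lbr(address: int, in_transaction: bool, abort: bool) -> int:
    """Pack address and indicators into one 64-bit record."""
    record = address & ADDR_MASK
    if in_transaction:
        record |= IN_TX_BIT
    if abort:
        record |= ABORT_BIT
    return record


def unpack_lbr(record: int):
    """Return (address, in_transaction, abort) from a packed record."""
    return (record & ADDR_MASK,
            bool(record & IN_TX_BIT),
            bool(record & ABORT_BIT))


if __name__ == "__main__":
    rec = pack_lbr(0x7FFF12345678, in_transaction=True, abort=False)
    print(hex(rec), unpack_lbr(rec))
```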
  • Publication number: 20130159684
    Abstract: One embodiment of the present invention sets forth an optimized way to execute replay operations for divergent operations in a parallel processing subsystem. Specifically, the streaming multiprocessor (SM) includes a multistage pipeline configured to batch two or more replay operations for processing via replay loop. A logic element within the multistage pipeline detects whether the current pipeline stage is accessing a shared resource, such as loading data from a shared memory. If the threads are accessing data which are distributed across multiple cache lines, then the multistage pipeline batches two or more replay operations, where the replay operations are inserted into the pipeline back-to-back. Advantageously, divergent operations requiring two or more replay operations operate with reduced latency. Where memory access operations require transfer of more than two cache lines to service all threads, the number of clock cycles required to complete all replay operations is reduced.
    Type: Application
    Filed: December 16, 2011
    Publication date: June 20, 2013
    Inventors: Michael Fetterman, Jack Hilaire Choquette, Omkar Paranjape, Anjana Rajendran, Eric Lyell Hill, Stewart Glenn Carlton, Rajeshwaran Selvanesan, Douglas J. Hahn, Steven James Heinrich
  • Patent number: 8443171
    Abstract: The present invention provides a system and method for runtime updating of hints in program instructions. The invention also provides for programs of instructions that include hint performance data. Also, the invention provides an instruction cache that modifies hints and writes them back. As runtime hint updates are stored in instructions, the impact of the updates is not limited by the limited memory capacity local to a processor. Also, there is no conflict between hardware and software hints, as they can share a common encoding in the program instructions.
    Type: Grant
    Filed: July 30, 2004
    Date of Patent: May 14, 2013
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Dale Morris, James E. McCormick
  • Patent number: 8423751
    Abstract: A microprocessor includes an instruction set architecture, comprising a call instruction type, a return instruction type, and other instruction types. Execution units correctly execute program instructions of the other instruction types. A call/return stack has a plurality of entries arranged in a last-in-first-out manner. The call/return stack is architectural state of the microprocessor not modifiable by program instructions of the other instruction types. The call/return stack is architectural state of the microprocessor indirectly modifiable by program instructions of the call and return instruction types. The microprocessor also includes a fetch unit that fetches program instructions and sends the program instructions of the other instruction types to the execution units to be correctly executed.
    Type: Grant
    Filed: June 9, 2009
    Date of Patent: April 16, 2013
    Assignee: VIA Technologies, Inc.
    Inventors: G. Glenn Henry, Terry Parks, Brent Bean
  • Patent number: 8402541
    Abstract: Malware detection systems and methods for determining whether a collection of data not expected to include executable code is suspected of containing malicious executable code. In some embodiments, a malware detection system may disassemble a collection of data to obtain a sequence of possible instructions and determine whether the collection of data is suspected of containing malicious executable code based, at least partially, on an analysis of the sequence of possible instructions. In one embodiment, the analysis of the sequence of possible instructions may comprise determining whether the sequence of possible instructions comprises an execution loop. In a further embodiment, a control flow of the sequence of possible instructions may be analyzed. In a further embodiment, the analysis of the sequence of possible instructions may comprise assigning a weight that is indicative of a level of suspiciousness of the sequence of possible instructions.
    Type: Grant
    Filed: March 12, 2009
    Date of Patent: March 19, 2013
    Assignee: Microsoft Corporation
    Inventors: Cristian Craioveanu, Ying Lin, Peter Ferrie, Bruce Dang
  • Patent number: 8392893
    Abstract: The computer system of the present invention emulates target instructions. The computer system includes a processing unit for branching to collective emulation coding, created beforehand for emulating a plurality of target instructions collectively, thereby processing those instructions collectively according to the coding when those target instructions are combined so as to be processed collectively, and a memory for storing the collective emulation coding.
    Type: Grant
    Filed: May 11, 2007
    Date of Patent: March 5, 2013
    Assignee: NEC Computertechno, Ltd.
    Inventor: Tsutomu Fujihara
  • Patent number: 8387053
    Abstract: A method of performing operations in a computer system, computer system, and related method of compilation, are disclosed. In one embodiment, the method of performing includes providing compiled code having at least one thread, where each of the at least one thread includes a respective plurality of blocks and each respective block includes a respective pre-fetch component and a respective execute component. The method also includes performing a first pre-fetch component from a first block of a first thread of the at least one thread, performing a first additional component after the first pre-fetch component has been performed, and performing a first execute component from the first block of the first thread. The first execute component is performed after the first additional component has been performed, and the first additional component is from either a second thread or another block of the first thread that is not the first block.
    Type: Grant
    Filed: January 25, 2007
    Date of Patent: February 26, 2013
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Blaine D. Gaither, Verna Knapp, Jerome Huck, Benjamin D. Osecky
  • Patent number: 8379644
    Abstract: A system and method of processing management frames implement a switching strategy that supports an interface between a generic device and a distributed switching architecture enabled switch. Control or management frames may be identified and processed independent of ordinary network traffic.
    Type: Grant
    Filed: June 6, 2007
    Date of Patent: February 19, 2013
    Assignee: Marvell International Ltd.
    Inventor: Donald Pannell
  • Patent number: 8381037
    Abstract: A method, an apparatus, and a computer program product in a data processing system are presented for using hardware assistance for gathering performance information that significantly reduces the overhead in gathering such information. Performance indicators are associated with instructions or memory locations, and processing of the performance indicators enables counting of events associated with execution of those instructions or events associated with accesses to those memory locations. The performance information that has been dynamically gathered from the assisting hardware is available to the software application during runtime in order to autonomically affect the behavior of the software application, particularly to enhance its performance. For example, the counted events may be used to autonomically control an execution-path selection within the software application.
    Type: Grant
    Filed: October 9, 2003
    Date of Patent: February 19, 2013
    Assignee: International Business Machines Corporation
    Inventors: Jimmie Earl DeWitt, Jr., Frank Eliot Levine, Christopher Michael Richardson, Robert John Urquhart
  • Patent number: 8381192
    Abstract: Some embodiments of the present invention provide a system that tests a software program. During operation, the system traces a flow of tainted data through the software program during execution of the software program. Next, the system alters the flow by modifying an instruction within the software program. The system then monitors the behavior of the software program after modifying the instruction. Finally, the system analyzes a correctness of the software program based on the monitored behavior.
    Type: Grant
    Filed: August 1, 2008
    Date of Patent: February 19, 2013
    Assignee: Google Inc.
    Inventors: William A. Drewry, Tavis Ormandy
  • Publication number: 20130036473
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for obfuscating branches in computer code. A compiler or a post-compilation tool can obfuscate branches by receiving source code, and compiling the source code to yield computer-executable code. The compiler identifies branches in the computer-executable code, and determines a return address and a destination value for each branch. Then, based on the return address and the destination value for each branch, the compiler constructs a binary tree with nodes and leaf nodes, each node storing a balanced value, and each leaf node storing a destination value. The non-leaf nodes are arranged such that searching the binary tree by return address leads to a corresponding destination value. Then the compiler inserts the binary tree in the computer-executable code and replaces each branch with instructions in the computer-executable code for performing a branching operation based on the binary tree.
    Type: Application
    Filed: August 1, 2011
    Publication date: February 7, 2013
    Applicant: Apple Inc.
    Inventors: Gideon M. Myles, Julien Lerouge, Jon McLachlan, Ganna Zaks, Augustin J. Farrugia
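    The tree lookup in the abstract for publication 20130036473 can be sketched as a search keyed by return address. This is an illustrative model only: the tuple-based node layout, the choice of split values, and the names `build_tree` and `lookup` are assumptions, and the sketch does not attempt the obfuscation itself. It assumes a non-empty table with unique return addresses.

```python
def build_tree(table):
    """Build a balanced search tree from {return_address: destination}.

    Internal nodes are ("node", split, left, right); leaves are ("leaf", destination).
    """
    pairs = sorted(table.items())

    def build(items):
        if len(items) == 1:
            return ("leaf", items[0][1])
        mid = len(items) // 2
        split = items[mid][0]          # keys below split go left, the rest go right
        return ("node", split, build(items[:mid]), build(items[mid:]))

    return build(pairs)


def lookup(tree, return_address):
    """Walk from the root to a leaf, comparing against each node's split value."""
    while tree[0] == "node":
        _, split, left, right = tree
        tree = left if return_address < split else right
    return tree[1]


if __name__ == "__main__":
    table = {0x100: 0xA00, 0x140: 0xB00, 0x1C0: 0xC00, 0x200: 0xD00}
    tree = build_tree(table)
    print(hex(lookup(tree, 0x1C0)))
```

    In this model each replaced branch would supply its own return address to `lookup` and jump to the result; an address that is not in the table still reaches some leaf, so the sketch is only meaningful for known return addresses.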
  • Patent number: 8370797
    Abstract: A data processing apparatus includes a host processing apparatus that can cooperatively verify, using generated Timed software, hardware and software of a semiconductor device mounted with a target processing device and an operating system (OS), wherein the host processing apparatus analyzes an assembler of the target processing device and recognizes a Basic Block, which is a basic unit for calculating information concerning time, and generates Timed software for the cooperative verification with reference to the Basic Block.
    Type: Grant
    Filed: June 19, 2009
    Date of Patent: February 5, 2013
    Assignee: Sony Corporation
    Inventors: Md. Ashfaquzzaman Khan, Yasushi Fukuda
  • Publication number: 20130024675
    Abstract: A dynamic code translator with isoblocking uses a return trampoline having branch instructions conditioned on different isostates to optimize return address translation, by allowing the hardware to predict that the address of a future return will be the address of the trampoline. An IP relative call is inserted into translated code to write the trampoline address to a target link register and a target return address stack used by the native machine to predict return addresses. If a computed subject return address matches a subject return address register value, the current isostate of the isoblock is written to an isostate register. The isostate value in the isostate register is then used to select the branch instruction in the trampoline for the true subject return address. Sufficient code area in the trampoline instruction set can be reserved for a number of compare/branch pairs which is equal to the number of available isostates.
    Type: Application
    Filed: May 23, 2012
    Publication date: January 24, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: William O. Lovett, Alexander B. Brown
  • Publication number: 20130019085
    Abstract: A mechanism is provided for reducing a penalty for executing a correct branch of a branch instruction. An execution unit in a processor of a data processing system executes a first branch of the branch instruction from a main thread of a processor and executes a second branch of the branch instruction from an assist thread of the processor. The execution unit determines whether the main thread is a correct branch of the branch instruction or the assist thread is the correct branch of the branch instruction. Responsive to the assist thread being the correct branch of the branch instruction, the execution unit pauses execution of the branch instruction on both the main thread and the assist thread. The execution unit then properly inherits a context of the main thread in order that execution of the second branch may continue.
    Type: Application
    Filed: July 12, 2011
    Publication date: January 17, 2013
    Applicant: International Business Machines Corporation
    Inventors: Harold W. Cain, III, David M. Daly, Michael C. Huang, Jose E. Moreira, IL Park
  • Patent number: 8347272
    Abstract: A method of analyzing program source code prepared for a multithreading platform comprises analyzing a targeted source code set to extract a set of characteristic information for each wait operation; analyzing the targeted source code set to extract a set of characteristic information for each notification call to an application programming interface of the multithreading platform; identifying a one-way branching correspondence with a wait operation for each notification call by comparing the extracted set of characteristic information for the notification operation and the extracted set of characteristic information for each wait operation with a set of predefined asynchronous operation correspondence pattern information for notification and wait functions implemented by the application programming interface; extracting a set of information for each identified one-way branching correspondence; and storing the extracted set of information for each identified one-way branching correspondence in a data store.
    Type: Grant
    Filed: July 23, 2008
    Date of Patent: January 1, 2013
    Assignee: International Business Machines Corporation
    Inventors: Naoki Sugawara, Tadashi Yamamoto
  • Publication number: 20120290820
    Abstract: Techniques are disclosed relating to a processor that is configured to execute control transfer instructions (CTIs). In some embodiments, the processor includes a mechanism that suppresses results of mispredicted younger CTIs on a speculative execution path. This mechanism permits the branch predictor to maintain its fidelity, and eliminates spurious flushes of the pipeline. In one embodiment, a misprediction bit is used to indicate that a misprediction has occurred, and younger CTIs than the CTI that was mispredicted are suppressed. In some embodiments, the processor may be configured to execute instruction streams from multiple threads. Each thread may include a misprediction indication. CTIs in each thread may execute in program order with respect to other CTIs of the thread, while instructions other than CTIs may execute out of program order.
    Type: Application
    Filed: September 8, 2011
    Publication date: November 15, 2012
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventors: Christopher H. Olson, Manish K. Shah
  • Publication number: 20120265966
    Abstract: Methods and apparatuses are provided for increased efficiency in a processor via early instruction completion. An apparatus is provided for increased efficiency in a processor via early instruction completion. The apparatus comprises an execution unit for processing instructions and determining whether a later issued instruction is ready for completion or an earlier issued instruction is ready for completion, and a retire unit for retiring the later issued instruction when the later issued instruction is ready for completion, or for retiring the earlier issued instruction when the later issued instruction is not ready for completion and the earlier issued instruction has a known good completion status. A method is provided for increased efficiency in a processor via early instruction completion. The method comprises completing an earlier issued instruction having a known good completion status ahead of a later issued instruction when the later issued instruction is not ready for completion.
    Type: Application
    Filed: April 15, 2011
    Publication date: October 18, 2012
    Applicant: ADVANCED MICRO DEVICES, INC.
    Inventors: Michael D. ESTLICK, Kevin HURD, Jay FLEISCHMAN
  • Publication number: 20120254597
    Abstract: A distributed data-parallel execution (DDPE) system splits a computational problem into a plurality of sub-problems using a branch-and-bound algorithm, designates a synchronous stop time for a “plurality of processors” (for example, a cluster) for each round of execution, processes the search tree by recursively using a branch-and-bound algorithm in multiple rounds (without inter-processor communications), determines if further processing is required based on the processing round state data, and terminates processing on the processors when processing is completed.
    Type: Application
    Filed: March 29, 2011
    Publication date: October 4, 2012
    Applicant: Microsoft Corporation
    Inventors: Daniel Delling, Mihai Budiu, Renato F. Werneck
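    The round-based processing described in the abstract above can be illustrated with a small, sequentially simulated sketch. Everything here is an assumption made for illustration: the 0/1 knapsack problem, the per-round node `budget` standing in for the synchronous stop time, the redistribution of pending sub-problems between rounds, and the names `children`, `bound`, and `solve_in_rounds`.

```python
from typing import List, Tuple

Item = Tuple[int, int]         # (weight, value)
State = Tuple[int, int, int]   # (next item index, weight so far, value so far)


def children(items: List[Item], capacity: int, state: State) -> List[State]:
    """Branch: skip the next item, or take it if it still fits."""
    idx, weight, value = state
    if idx == len(items):
        return []
    item_weight, item_value = items[idx]
    kids = [(idx + 1, weight, value)]
    if weight + item_weight <= capacity:
        kids.append((idx + 1, weight + item_weight, value + item_value))
    return kids


def bound(items: List[Item], state: State) -> int:
    """Optimistic bound: current value plus every remaining item's value."""
    idx, _, value = state
    return value + sum(v for _, v in items[idx:])


def solve_in_rounds(items: List[Item], capacity: int,
                    workers: int = 2, budget: int = 3) -> Tuple[int, int]:
    """Return (best value, number of rounds executed)."""
    best = 0
    queues: List[List[State]] = [[] for _ in range(workers)]
    queues[0].append((0, 0, 0))
    rounds = 0
    while any(queues):                 # round state decides whether to continue
        rounds += 1
        incumbent = best               # shared only at the start of a round
        round_results = []
        for queue in queues:           # workers do not communicate in a round
            local_best = incumbent
            for _ in range(budget):    # stop after a fixed amount of work
                if not queue:
                    break
                state = queue.pop()
                local_best = max(local_best, state[2])
                if bound(items, state) <= local_best:
                    continue           # prune: this subtree cannot do better
                queue.extend(children(items, capacity, state))
            round_results.append(local_best)
        best = max(round_results)
        pending = [s for queue in queues for s in queue]
        queues = [pending[i::workers] for i in range(workers)]
    return best, rounds


best_value, rounds_used = solve_in_rounds(
    [(2, 3), (3, 4), (4, 5), (5, 6)], capacity=5)
```

    Processing stops once a round ends with no pending sub-problems on any worker.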
  • Publication number: 20120254593
    Abstract: Embodiments of systems, apparatuses, and methods for performing a jump instruction in a computer processor are described. In some embodiments, the execution of a blend instruction causes a conditional jump to an address of a target instruction when all of the bits of a writemask are zero, wherein the address of the target instruction is calculated using an instruction pointer of the instruction and the relative offset.
    Type: Application
    Filed: April 1, 2011
    Publication date: October 4, 2012
    Inventors: Jesus Corbal San Adrian, Bret Toll, Robert C. Valentine, Milind Baburao Girkar, Andrew Thomas Foryth, George Z. Chrysos, Edward Thomas Grochowski, Dennis R. Bradford
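    The jump condition in the abstract above can be modeled in a few lines. This is a sketch under stated assumptions, not the claimed hardware: the function name `jump_if_mask_zero`, the 16-bit mask width, and the convention that the relative offset is applied to the address following the instruction are all assumptions of the example.

```python
def jump_if_mask_zero(ip: int, instr_len: int, writemask: int,
                      rel_offset: int, mask_bits: int = 16) -> int:
    """Return the next instruction pointer.

    The jump is taken only when every bit of the writemask is zero.
    """
    mask = writemask & ((1 << mask_bits) - 1)   # keep only the mask bits
    fallthrough = ip + instr_len                # address of the next instruction
    if mask == 0:
        return fallthrough + rel_offset         # target = pointer + offset
    return fallthrough
```

    For example, with `ip=0x100`, `instr_len=4`, and `rel_offset=0x20`, an all-zero mask yields `0x124`, while any set bit yields the fall-through address `0x104`.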
  • Patent number: 8281110
    Abstract: An out-of-order execution in-order retire microprocessor includes a branch information table comprising N entries. Each of the N entries stores information associated with a branch instruction. The microprocessor also includes a reorder buffer comprising M entries. Each of the M entries stores information associated with an unretired instruction within the microprocessor. Each of the M entries includes a field that indicates whether the unretired instruction is a branch instruction and, if so, a tag identifying one of the N entries in the branch information table storing information associated with the branch instruction. N is significantly less than M such that the overall die space and power consumption is reduced over a processor in which each reorder buffer entry stores the branch information.
    Type: Grant
    Filed: October 16, 2009
    Date of Patent: October 2, 2012
    Assignee: VIA Technologies, Inc.
    Inventors: Thomas C. McDonald, Brent Bean
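    The relationship between the reorder buffer and the smaller branch information table in the abstract above might be modeled as follows. The class and field names (`Core`, `RobEntry`, `BranchInfo`, `bit_tag`) and the stall-on-full behavior are assumptions made for this sketch; the abstract only states that each reorder buffer entry carries a branch flag and a tag into the table.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class BranchInfo:
    predicted_target: int
    predicted_taken: bool


@dataclass
class RobEntry:
    opcode: str
    is_branch: bool = False
    bit_tag: Optional[int] = None   # index into the branch information table


class Core:
    def __init__(self, rob_size: int = 8, bit_size: int = 2) -> None:
        self.rob_size = rob_size                       # M entries
        self.rob: Deque[RobEntry] = deque()
        self.bit: List[Optional[BranchInfo]] = [None] * bit_size  # N entries

    def allocate(self, opcode: str,
                 branch_info: Optional[BranchInfo] = None) -> bool:
        """Allocate an instruction; return False if it must stall."""
        if len(self.rob) >= self.rob_size:
            return False
        tag = None
        if branch_info is not None:
            free = next((i for i, slot in enumerate(self.bit)
                         if slot is None), None)
            if free is None:
                return False        # no free table entry for this branch
            self.bit[free] = branch_info
            tag = free
        self.rob.append(RobEntry(opcode, branch_info is not None, tag))
        return True

    def retire(self) -> RobEntry:
        """Retire the oldest instruction in order, freeing its table entry."""
        entry = self.rob.popleft()
        if entry.is_branch:
            self.bit[entry.bit_tag] = None
        return entry
```

    Because only branches consume a table entry, the wide branch fields are stored N times rather than M times, which is the saving the abstract describes.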
  • Patent number: 8266597
    Abstract: A first section of executable computer code of a computer program is dynamically patched by performing the following. A breakpoint is inserted at the first section of executable computer code. During execution of the computer program, an instruction counter is incremented on an instruction-by-instruction basis through the computer program. The instruction counter indicates a current instruction of the computer program being executed. The breakpoint where the instruction counter points to the first section of executable computer code is encountered, which results in a breakpoint handler being called. The breakpoint handler changes the instruction pointer to instead point to a second section of executable computer code. The second section of executable computer code is a patched version of the first section of executable computer code. Upon the breakpoint handler returning, the second section of executable computer code is executed in lieu of the first section of executable computer code.
    Type: Grant
    Filed: June 16, 2008
    Date of Patent: September 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Prasanna S. Panchamukhi, Balbir Singh
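    The breakpoint-based redirection in the abstract above can be illustrated with a toy interpreter. This is a simplified model, not the patented mechanism: the three-opcode instruction set, the `run` function, and the handler signature are invented for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple

Instruction = Tuple[str, object]


def run(program: List[Instruction],
        breakpoints: Optional[Dict[int, Callable[[int], int]]] = None,
        max_steps: int = 1000) -> List[object]:
    """Execute a toy program, letting breakpoint handlers redirect execution."""
    breakpoints = breakpoints or {}
    pc = 0                      # instruction pointer
    output: List[object] = []
    for _ in range(max_steps):
        if pc >= len(program):
            break
        if pc in breakpoints:
            # The handler returns a new instruction pointer, for example
            # the start of the patched section.
            pc = breakpoints[pc](pc)
        op, arg = program[pc]
        if op == "emit":
            output.append(arg)
            pc += 1
        elif op == "jump":
            pc = int(arg)
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return output


program: List[Instruction] = [
    ("emit", "start"),
    ("emit", "buggy"),   # index 1: first section, to be replaced
    ("emit", "end"),
    ("halt", None),
    ("emit", "fixed"),   # index 4: second (patched) section
    ("jump", 2),         # return to the code after the first section
]

patched_output = run(program, breakpoints={1: lambda pc: 4})
```

    The original instructions are never overwritten in this sketch; only the breakpoint table changes, so removing the entry restores the old behavior.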
  • Patent number: 8266414
    Abstract: A method for managing a hardware instruction loop, the method includes: (i) detecting, by a branch prediction unit, an instruction loop; wherein a size of the instruction loop exceeds a size of a storage space allocated in a fetch unit for storing fetched instructions; (ii) requesting from the fetch unit to fetch instructions of the instruction loop that follow the first instructions of the instruction loop; and (iii) selecting, during iterations of the instruction loop, whether to provide to a dispatch unit one of the first instructions of the instruction loop or another instruction that is fetched by the fetch unit; wherein the first instructions of the instruction loop are stored at the dispatch unit.
    Type: Grant
    Filed: August 19, 2008
    Date of Patent: September 11, 2012
    Assignee: Freescale Semiconductor, Inc.
    Inventors: Lev Vaskevich, Itzhak Barak, Amir Paran, Yuval Peled, Idan Rozenberg, Doron Schupper
  • Patent number: 8266450
    Abstract: It is possible to achieve the protection of software with reduced overhead. For example, a memory for storing an encrypted code prepared in advance and a decryptor module for decrypting the code are provided. The decryptor module includes, for example, a three-stage pipeline and a selector for selecting one output from the outputs of each stage of the pipeline. When a branch instruction is issued and subsequent inputs of the pipeline are in the order of CD′1, CD′2, . . . , the decryptor module outputs a first decrypted code by performing a one-stage pipeline process to CD′1. Next, the decryptor module outputs a second decrypted code by performing a two-stage pipeline process to CD′2, and the decryptor module outputs a third decrypted code by performing a three-stage pipeline process to CD′3 (and subsequent codes). Therefore, in particular, the overhead to CD′1 can be reduced.
    Type: Grant
    Filed: April 3, 2009
    Date of Patent: September 11, 2012
    Assignee: Renesas Electronics Corporation
    Inventors: Takashi Endo, Toshio Okochi, Shunsuke Ota, Tatsuya Kameyama
  • Patent number: 8261276
    Abstract: A mechanism for controlling instruction fetch and dispatch thread priority settings in a thread switch control register for reducing the occurrence of balance flushes and dispatch flushes for increased power performance of a simultaneous multi-threading data processing system. To achieve a target power efficiency mode of a processor, the illustrative embodiments receive an instruction or command from a higher-level system control to set a current power consumption of the processor. The illustrative embodiments determine a target power efficiency mode for the processor. Once the target power mode is determined, the illustrative embodiments update thread priority settings in a thread switch control register for an executing thread to control balance flush speculation and dispatch flush speculation to achieve the target power efficiency mode.
    Type: Grant
    Filed: March 31, 2008
    Date of Patent: September 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Richard James Eickemeyer, Susan Elizabeth Eisen, Michael Stephen Floyd, Hans Mikael Jacobson, Jeffrey R. Summers
  • Publication number: 20120210106
    Abstract: The present invention discloses a method of processing instructions in a pipeline-based central processing unit, wherein the pipeline is partitioned into base pipeline stages and enhanced pipeline stages according to functions, the base pipeline stages being activated all the while, and the enhanced pipeline stages being activated or shut down according to requirements for performance of a workload. The present invention further discloses a method of processing instructions in a pipeline-based central processing unit, wherein the pipeline is partitioned into base pipeline stages and enhanced pipeline stages according to functions, each pipeline stage being partitioned into a base module and at least one enhanced module, the base module being activated all the while, and the enhanced module being activated or shut down according to requirements for performance of a workload.
    Type: Application
    Filed: April 26, 2012
    Publication date: August 16, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Wen Bo Shen, Peng Shao, Yu Li, Xiao Tao Chang, Yi Ge, Huayong Wang, Huan Hao Zou
  • Patent number: 8245017
    Abstract: A microprocessor includes a first branch condition state and a second branch condition state. The microprocessor also includes a conditional branch instruction of a first type that instructs the microprocessor to wait to correctly resolve the conditional branch instruction of the first type based on the first branch condition state until other instructions within the microprocessor that update the first branch condition state and that are older than the conditional branch instruction of the first type have updated the first branch condition state. A conditional branch instruction of a second type instructs the microprocessor to correctly resolve the conditional branch instruction of the second type based on the second branch condition state without regard to whether other instructions within the microprocessor that update the second branch condition state and that are older than the conditional branch instruction of the second type have yet updated the second branch condition state.
    Type: Grant
    Filed: June 9, 2009
    Date of Patent: August 14, 2012
    Assignee: VIA Technologies, Inc.
    Inventors: G. Glenn Henry, Terry Parks, Brent Bean
  • Patent number: 8225012
    Abstract: A method may include distributing ranges of addresses in a memory among a first set of functions in a first pipeline. The first set of the functions in the first pipeline may operate on data using the ranges of addresses. Different ranges of addresses in the memory may be redistributed among a second set of functions in a second pipeline without waiting for the first set of functions to be flushed of data.
    Type: Grant
    Filed: September 3, 2009
    Date of Patent: July 17, 2012
    Assignee: Intel Corporation
    Inventor: Thomas A. Piazza
  • Patent number: 8225077
    Abstract: An obfuscation device includes a first instruction generating unit, for each of a first process and a second process, which generates an initialization instruction for securing a management area for managing identification information indicating an instruction block that should be executed next so as to proceed with the process. Further, a second instruction generating unit generates a selection instruction (i) to make a first selection selecting a process that should be proceeded out of the first process and the second process, (ii) to make a second selection selecting an instruction block indicated by the identification information managed in the management area as an instruction block that should be executed for proceeding with the process selected by the first selection, and (iii) to cause the execution device to execute the instruction block selected by the second selection, and stores the selection instruction in a storage unit.
    Type: Grant
    Filed: March 24, 2009
    Date of Patent: July 17, 2012
    Assignee: Panasonic Corporation
    Inventors: Taichi Sato, Tomoyuki Haga, Kenichi Matsumoto, Akito Monden, Haruaki Tamada