Conditional Branching Patents (Class 712/234)
  • Patent number: 11861365
    Abstract: Systems and methods are disclosed for macro-op fusion. Sequences of macro-ops that include a control-flow instruction are fused into single micro-ops for execution. The fused micro-ops may avoid the use of control-flow instructions, which may improve performance. A fusion predictor may be used to facilitate macro-op fusion.
    Type: Grant
    Filed: May 3, 2021
    Date of Patent: January 2, 2024
    Assignee: SiFive, Inc.
    Inventors: Krste Asanovic, Andrew Waterman
  • Patent number: 11687440
    Abstract: Protection of a first software application to be executed on an execution platform by adding at least one check module to the software application, wherein the check module, when being executed, checks at least a part of the code of the protected software application loaded in the memory and carries out a predefined tamper response in case the check module detects that the checked code was changed or ensures that the protected software application continues to function correctly in case the check module detects that the checked code was not changed; selecting a first code region of the first software application, said first code region provides a first functionality when being executed; amending the selected first code region of the first software application such that an amended first code region is generated to provide the protected software application; wherein the amended first code region, when being executed, still provides the first functionality but carries out an access to at least a part of the code
    Type: Grant
    Filed: February 2, 2021
    Date of Patent: June 27, 2023
    Assignee: THALES DIS CPL USA, INC.
    Inventors: Andreas Weber, David Andreas Lange, Michael Zunke
  • Patent number: 11567776
    Abstract: In one embodiment, a microprocessor, comprising: first logic configured to dynamically adjust a maximum prefetch count based on a total count of predicted taken branches over a predetermined quantity of cache lines; and second logic configured to prefetch instructions based on the adjusted maximum prefetch count.
    Type: Grant
    Filed: November 3, 2020
    Date of Patent: January 31, 2023
    Assignee: CENTAUR TECHNOLOGY, INC.
    Inventors: Thomas C. McDonald, Brent Bean
  • Patent number: 11507475
    Abstract: A data processing apparatus (2) has scalar processing circuitry (32-42) and vector processing circuitry (38, 40, 42). When executing main scalar processing on the scalar processing circuitry (32-42), or main vector processing using a subset of said plurality of lanes on the vector processing circuitry (38, 40, 42), checker processing is executed using at least one lane of the plurality of lanes on the vector processing circuitry (38, 40, 42), the checker processing comprising operations corresponding to at least part of the main scalar/vector processing. Errors can then be detected based on a comparison of an outcome of the main processing and an outcome of the checker processing. This provides a technique for achieving functional safety in a high end processor with better performance and reduced hardware cost compared to a dual/triple core lockstep approach.
    Type: Grant
    Filed: December 12, 2017
    Date of Patent: November 22, 2022
    Assignee: Arm Limited
    Inventors: Matthias Lothar Boettcher, Mbou Eyole, Nathanael Premillieu
  • Patent number: 11501143
    Abstract: A reconfigurable neural circuit includes an array of processing nodes. Each processing node includes a single physical neuron circuit having only one input and an output, a single physical synapse circuit having a presynaptic input, and a single physical output coupled to the input of the neuron circuit, a weight memory for storing N synaptic conductance values or weights having an output coupled to the single physical synapse circuit, a single physical spike timing dependent plasticity (STDP) circuit having an output coupled to the weight memory, a first input coupled to the output of the neuron circuit, and a second input coupled to the presynaptic input, and interconnect circuitry connected to the presynaptic input and connected to the output of the single physical neuron circuit. The synapse circuit and the STDP circuit are each time multiplexed circuits. The interconnect circuitry in each respective processing node is coupled to the interconnect circuitry in each other processing node.
    Type: Grant
    Filed: June 20, 2019
    Date of Patent: November 15, 2022
    Assignee: HRL LABORATORIES, LLC
    Inventors: Jose Cruz-Albrecht, Timothy Derosier, Narayan Srinivasa
  • Patent number: 11314512
    Abstract: An aspect includes generating a data result and a special case indicator based on an instruction and at least one input data operand. Outputting the data result to a processor core. Outputting the first condition code to the processor core prior to outputting the data result to the processor core. Generating a second condition code based on the data result and the special case indicator. Performing a check by comparing the first condition code and the second condition code and flagging an error to the processor core upon the first condition code being different from the second condition code.
    Type: Grant
    Filed: August 9, 2019
    Date of Patent: April 26, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Petra Leber, Kerstin Claudia Schelm, Cedric Lichtenau, Michael Klein
  • Patent number: 11309023
    Abstract: A memory system may include multiple memory cells to store logical data and cycle tracking circuitry to track a number of cycles associated with the memory cells. The cycles may be representative of one or more past accesses of the memory cells. The memory system may also include control circuitry to access the memory cells. Accessing of the memory cell may include a read operation, a write operation, or both. During the accessing of the memory cell, the control circuitry may determine a voltage parameter of the access based at least in part on the tracked number of cycles.
    Type: Grant
    Filed: November 6, 2020
    Date of Patent: April 19, 2022
    Assignee: Micron Technology, Inc.
    Inventor: Hari Giduturi
  • Patent number: 11210103
    Abstract: A processor includes a plurality of execution units. At least one of the execution units is configured to determine, based on a field of a first instruction, a number of additional instructions to execute in conjunction with the first instruction and prior to execution of the first instruction.
    Type: Grant
    Filed: September 14, 2016
    Date of Patent: December 28, 2021
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Horst Diewald, Johann Zipperer
  • Patent number: 11163577
    Abstract: A processor reads at least one instruction comprising at least one of a branch instruction and a non-branch instruction. In response to the branch instruction comprising a conditional branch instruction and set in dynamic mode, the processor dynamically predicts a branch path as taken or not taken. The processor, in response to the instruction fetch unit set in static mode for a conditional branch instruction and static branch prediction setting bits received with the conditional branch instruction specifying static branch prediction, statically sets the branch path as taken or not taken according to the static branch prediction setting bits received with the branch instruction. The processor selectively sets the operation of the processor temporarily from the dynamic mode to the static mode only in response to detecting a type of the at least one instruction matches a type of instruction qualifying to trigger static branch prediction.
    Type: Grant
    Filed: November 26, 2018
    Date of Patent: November 2, 2021
    Assignee: International Business Machines Corporation
    Inventors: Sheldon Levenstein, Brian W. Thompto, David S. Levitan
  • Patent number: 11144288
    Abstract: Embodiments of the present disclosure are directed to a system, methods, and computer-readable media for compiling source code into bytecode using a compiler. Using a rules set as input, a compiler de-duplicates action codes in the rules and assigns a unique identifier to each action code. The compiler generates a cascading hierarchy of switches that process discrete portions of the unique identifiers in order to invoke methods. The methods are assigned to classes using a method-per-class limit, and bytecode is generated from the class-assigned methods.
    Type: Grant
    Filed: May 15, 2020
    Date of Patent: October 12, 2021
    Assignee: Adobe Inc.
    Inventor: Sandeep Nawathe
  • Patent number: 11126432
    Abstract: A computer processor is provided which hides jump instructions, in particular conditional jump instructions, from side-channels. The processor comprises a forward jump detector for detecting a forward jump instruction having a jump target location which lies ahead and a jump inhibitor for inhibiting an execution of the forward jump instruction. The computer processor is configured for executing at least one intermediate computer instruction located between the inhibited forward jump instruction and the jump target location. The processor further comprises a storage destination modifier for modifying the storage destination determined by the at least one intermediate computer instruction to suppress the effects of execution of intermediate instructions. Since the intermediate instruction is executed regardless of the forward jump instruction, the jump is hidden in a side-channel. Secret information, such as cryptographic keys, on which the forward jump may depend, is also hidden.
    Type: Grant
    Filed: February 4, 2011
    Date of Patent: September 21, 2021
    Assignee: NXP B.V.
    Inventor: Jan Hoogerbrugge
  • Patent number: 11080063
    Abstract: A processing device includes an instruction extractor that extracts target instructions intended for a loop process that is repeatedly performed, from instructions decoded by an instruction decoder, and a loop buffer including entries where each of the target instructions extracted by the instruction extractor is stored. An instruction processor stores a target instruction into one of the entries of the loop buffer, and combines target instructions into one target instruction in a case where resources of an instruction execution circuit used by the target instructions do not overlap, to store the one instruction in one of the entries of the loop buffer, and a selector selects the instruction output from the instruction decoder or the target instruction output from the loop buffer, and outputs the selected instruction to the instruction execution circuit.
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: August 3, 2021
    Assignee: FUJITSU LIMITED
    Inventor: Ryohei Okazaki
  • Patent number: 11029950
    Abstract: A move data instruction to move data from one location to another location is obtained. Based on obtaining the move data instruction, a determination is made as to whether the data to be moved is located in a buffer. The buffer is configured to maintain the data for use by multiple move data instructions. The buffer is used to move the data from the one location to the other location, based on determining that the data to be moved is in the buffer.
    Type: Grant
    Filed: July 3, 2019
    Date of Patent: June 8, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Yossi Shapira, Yair Fried, Eyal Naor, Amir Turi
  • Patent number: 10996952
    Abstract: Systems and methods are disclosed for macro-op fusion. Sequences of macro-ops that include a control-flow instruction are fused into single micro-ops for execution. The fused micro-ops may avoid the use of control-flow instructions, which may improve performance. A fusion predictor may be used to facilitate macro-op fusion.
    Type: Grant
    Filed: December 10, 2018
    Date of Patent: May 4, 2021
    Assignee: SiFive, Inc.
    Inventors: Krste Asanovic, Andrew Waterman
  • Patent number: 10901878
    Abstract: Embodiments of the present invention are directed to a computer-implemented method for building and executing test cases. A non-limiting example of the computer-implemented method includes building, using a processor, a master test case instruction stream including a plurality of instructions including a replaceable instruction. The computer-implemented method builds, using the processor, a test case instruction stream derivative including the plurality of instructions including a replacement instruction in lieu of the replaceable instruction, and predicts, using the processor, a predicted result of executing the test case instruction stream derivative in a test case environment. The computer-implemented method executes, using the processor, the test case instruction stream derivative on the test case environment to generate an actual test case result and compares, using the processor, the actual test case result with the predicted result to determine proper operation of the test case environment.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: January 26, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ali Y. Duale, Dennis Wittig
  • Patent number: 10884735
    Abstract: A processor includes a front end to receive an instruction. The processor also includes a core to execute the instruction. The core includes logic to execute a base function of the instruction to yield a result, generate a predicate value of a comparison of the result based upon a predication setting in the instruction, and set the predicate value in a register. The processor also includes a retirement unit to retire the instruction.
    Type: Grant
    Filed: February 26, 2018
    Date of Patent: January 5, 2021
    Assignee: Intel Corporation
    Inventors: Jayesh Iyer, Jamison Collins, Sebastian Winkel, Howard Chen
  • Patent number: 10747536
    Abstract: A data processing system provides a loop-end instruction for use at the end of a program loop body specifying an address of a beginning instruction of said program loop body. Loop control circuitry (1000) serves to control repeated execution of the program loop body upon second and subsequent passes through the program loop body using loop control data provided by the loop-end instruction without requiring the loop-end instruction to be explicitly executed upon each pass.
    Type: Grant
    Filed: March 21, 2017
    Date of Patent: August 18, 2020
    Assignee: ARM Limited
    Inventors: Alasdair Grant, Thomas Christopher Grocutt, Simon John Craske
  • Patent number: 10698670
    Abstract: There is provided a parallel program generating method capable of generating a static scheduling enabled parallel program without undermining the possibility of extracting parallelism. The parallel program generating method executed by the parallelization compiling apparatus 100 includes a fusion step (FIG. 2/STEP026) of fusing, as a new task, a task group including a reference task as a task having a conditional branch, and subsequent tasks as tasks control dependent, extended-control dependent, or indirect control dependent on respective ones of all branch directions of the conditional branch included in the reference task.
    Type: Grant
    Filed: December 28, 2017
    Date of Patent: June 30, 2020
    Assignee: WASEDA UNIVERSITY
    Inventors: Hironori Kasahara, Keiji Kimura, Dan Umeda, Hiroki Mikami
  • Patent number: 10684828
    Abstract: A method and apparatus for modifying a user interface. The method comprises receiving user interface data at a client from a first server, receiving modification computer program code at said client, and executing said modification computer program code at said client to modify said user interface data to generate modified user interface data. The modification computer program code can be received from said first server or from a further server.
    Type: Grant
    Filed: July 15, 2016
    Date of Patent: June 16, 2020
    Assignee: VERSATA FZ-LLC
    Inventor: Plamen Ivanov Valtchev
  • Patent number: 10628159
    Abstract: A processor includes: a processor core, a register selectively controlled by either external hardware during a first operation mode or the processor core during a second operation mode, and a selection circuit receiving first data provided by the external hardware to the register during the first operation mode and second data provided by the processor core to the register during the second operation mode.
    Type: Grant
    Filed: September 28, 2017
    Date of Patent: April 21, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Ji Yong Yoon
  • Patent number: 10606667
    Abstract: Computational tasks are mapped with computational locations in a distributed system such as a cloud computing environment. Mapping does not rely on workload estimates. Instead, tasks whose prerequisite tasks or other preconditions are determined to be mutually exclusive are co-located, while other tasks are mapped to different locations than one another. Locations are servers, processor cores, virtual machines, applications, or computational processes, for example. Mutual exclusivity may be determined by detecting that preconditions require different values of a shared variable in order to be satisfied, for example, or determining that preconditions correspond to different branches of a conditional programming statement. A satisfiability engine may also provide a satisfiability determination. Co-located tasks may also be batched, for improved execution performance.
    Type: Grant
    Filed: October 2, 2017
    Date of Patent: March 31, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ilya Grebnov, Stephen Siciliano, Charles Lamanna
  • Patent number: 10606595
    Abstract: A data processor in which execution threads may be grouped together into thread groups in which the plural threads of a thread group can each execute a set of instructions in lockstep, one instruction at a time. The data processor comprises a plurality of execution lanes for executing respective execution threads of a thread group. For each thread group in a pool (51) of thread groups available to be issued to the execution lanes, an indication (54) of the active threads of the thread group is stored, and sets of at least one thread group from the pool (51) of available thread groups to issue (73) to the execution lanes for execution are selected (72) based on the indications of the active threads for the thread groups in the thread group pool.
    Type: Grant
    Filed: March 23, 2018
    Date of Patent: March 31, 2020
    Assignee: Arm Limited
    Inventor: Kenneth Edvard Ostby
  • Patent number: 10360034
    Abstract: A graphics processing unit may include a register file memory, a processing element (PE) and a load-store unit (LSU). The register file memory includes a plurality of registers. The PE is coupled to the register file memory and processes at least one thread of a vector of threads of a graphical application. Each thread in the vector of threads is processed in a non-stalling manner. The PE stores data in a first predetermined set of the plurality of registers in the register file memory that has been generated by processing the at least one thread and that is to be routed to a first stallable logic unit that is external to the PE. The LSU is coupled to the register file memory, and the LSU accesses the data in the first predetermined set of the plurality of registers and routes to the first stallable logic unit.
    Type: Grant
    Filed: June 26, 2017
    Date of Patent: July 23, 2019
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: David C. Tannenbaum, Srinivasan S. Iyer, Mitchell K. Alsup
  • Patent number: 10089212
    Abstract: An embodiment provides a memory system connectable to a host device. The memory system includes a host interface configured to receive a read command and a write command and a first non-volatile memory. In addition, the memory system includes a debug unit configured to collect debugging information when a processor executes firmware. The debug unit is capable of outputting the debugging information to a buffer area of the host device through the host interface.
    Type: Grant
    Filed: March 10, 2016
    Date of Patent: October 2, 2018
    Assignee: TOSHIBA MEMORY CORPORATION
    Inventor: Daisuke Iwai
  • Patent number: 9959095
    Abstract: An adder-subtractor includes a first XOR circuit that inverts or does not invert data from a second input line; first and second operand registers that hold outputs of first and second input selectors; a result register that holds the operation result in response to the clock; and an adder that outputs an operation result of first and second input data in the first and second operand registers to the result register and also to inputs of the first and second input selectors via the first bypass line. The adder includes a second XOR circuit for the first and second input data, a carry calculation unit that calculates carry data of the first and second input data, a fourth XOR circuit that inverts or does not invert an output of the second XOR circuit, and a third XOR circuit for outputs of the carry calculation unit and outputs the operation result.
    Type: Grant
    Filed: April 8, 2016
    Date of Patent: May 1, 2018
    Assignee: FUJITSU LIMITED
    Inventors: Kouji Kimura, Ryuji Kan
  • Patent number: 9952864
    Abstract: An apparatus is described having decode circuitry to decode a first instruction, wherein the first instruction indicates that a copy of a plurality of condition code bits is to be copied from a first register to a second register. The apparatus also has first execution circuitry to copy a plurality of condition code bits from a first register to a second register.
    Type: Grant
    Filed: December 23, 2009
    Date of Patent: April 24, 2018
    Assignee: INTEL CORPORATION
    Inventors: Guilherme D. Ottoni, Hong Wang, Christopher T. Weaver, Thomas A. Hartin, Wei Li, Jason W. Brandt
  • Patent number: 9892016
    Abstract: A method for securing a first program, the first program including a finite number of program points and evolution rules associated to program points and defining the passage of a program point to another, the method including defining a plurality of exit cases and, when a second program is used in the definition of the first program, for each exit case, definition of a branching toward a specific program point of the first program or a declaration of branching impossibility, defining a set of properties to be proven, each associated with one of the constitutive elements of the first program, said set of properties comprising the branching impossibility as a particular property and establishment of the formal proof of the set of properties.
    Type: Grant
    Filed: November 3, 2016
    Date of Patent: February 13, 2018
    Inventor: Dominique Bolignano
  • Patent number: 9830157
    Abstract: A system and method of parallelizing programs employs runtime instructions to identify data accessed by program portions and to assign those program portions to particular processors based on potential overlap between the access data. Data dependence between different program portions may be identified and used to look for pending “predicate” program portions that could create data dependencies and to postpone program portions that may be dependent while permitting parallel execution of other program portions.
    Type: Grant
    Filed: August 18, 2010
    Date of Patent: November 28, 2017
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: Gagan Gupta, Gurindar S. Sohi, Srinath Sridharan
  • Patent number: 9798593
    Abstract: A system for determining a toggle value includes an input interface and a processor. The input interface is to receive a request for the toggle value associated with a toggle. The processor is to determine an indicated toggle value associated with the toggle; determine the toggle value associated with the toggle based at least in part on the indicated toggle value and a set of dependencies; and provide the toggle value associated with the toggle.
    Type: Grant
    Filed: July 6, 2016
    Date of Patent: October 24, 2017
    Assignee: Workday, Inc.
    Inventors: Salvador Maiorano Quiroga, Saul Arjona Polo, Andrew Jacob Malin, Daniel Duan Ho
  • Patent number: 9766892
    Abstract: An apparatus and method for executing nested control flow instructions on a graphics processing unit (GPU). For example, one embodiment of a processor comprises: an execution unit having a plurality of channels to execute control flow instructions including fused control flow instructions comprising two or more consecutive control flow instructions fused into a single fused control flow instruction; and a branch unit to process the control flow instructions and to maintain a global counter indicating a nesting level of the control flow instructions, wherein to process a fused control flow instruction, the branch unit is to store a value N in a stack indicating a number of control flow instructions fused into the fused control flow instruction, the branch unit to subsequently read the value N from the stack upon execution of the fused control flow instruction and decrement the global counter by a value of N responsive to execution of the fused control flow instruction.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: September 19, 2017
    Assignee: Intel Corporation
    Inventors: Wei-Yu Chen, Guei-Yuan Lueh, Subramaniam Maiyuran
  • Patent number: 9652242
    Abstract: An apparatus and method for calculating flag bits is disclosed. The flag bits may be used in a processor utilizing branch predication. More particularly, the apparatus and method may be used to calculate a predicate that can be used by a branch unit to evaluate whether a branch is to be taken. In one embodiment, the apparatus is coupled to receive a condition code associated with an instruction, and flag bits generated responsive to execution of the instruction. The condition code is indicative of a condition to be checked resulting from execution of the instruction. The apparatus may then provide an indication of whether the condition is true.
    Type: Grant
    Filed: May 2, 2012
    Date of Patent: May 16, 2017
    Assignee: Apple Inc.
    Inventors: Rajat Goel, Sandeep Gupta, Yamini Modukuru
  • Patent number: 9479431
    Abstract: Communicating among nodes in a network includes: sending a packet from an origin node to a destination node over a route including plural nodes. At each node in the route, routing of the packet is initiated according to a predicted path concurrently with verifying the correctness of the predicted path based on analyzing route information in the packet. In response to results of verifying the correctness of the predicted path, the routing of the packet is completed according to the predicted path, or a routing of the packet is initiated according to an actual path based on the route information in the packet.
    Type: Grant
    Filed: September 15, 2015
    Date of Patent: October 25, 2016
    Assignee: EZChip Technologies Ltd.
    Inventors: Ian Rudolf Bratt, Carl G. Ramey, Matthew Mattina
  • Patent number: 9424041
    Abstract: A method and apparatus for simultaneously canceling a dependent instruction and a nested dependent instruction when a cancel timer of a source of the dependent instruction and a cancel timer of a source of the nested dependent instruction expire and a producer instruction speculatively waking up the dependent instruction is canceled.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: August 23, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ravi Iyengar, Bradley Gene Burgess, Sandeep Kumar Dubey
  • Patent number: 9323551
    Abstract: A technique of modifying a code sequence for a processor includes identifying a set of one or more target instructions in the code sequence. A replacement instruction is selected that includes a set of replacement instruction parts. A length of each of the replacement instruction parts corresponds to a minimum instruction length for an instruction set of the processor. The replacement instruction parts include a first instruction type and one or more second instruction types that are each configured as exception instructions if processed in isolation from the first instruction type. The replacement instruction is then substituted for the set of one or more target instructions in the code sequence for processing by the processor.
    Type: Grant
    Filed: January 6, 2012
    Date of Patent: April 26, 2016
    Assignee: International Business Machines Corporation
    Inventor: Neil A. Campbell
  • Patent number: 9305167
    Abstract: Described systems and methods allow protecting a host computer system from malware, such as return-oriented programming (ROP) and jump-oriented programming (JOP) exploits. In some embodiments, a processor of the host system is endowed with two counters configured to store a count of branch instructions and a count of inter-branch instructions, respectively, occurring within a stream of instructions fetched by the processor for execution. Exemplary counted branch instructions include indirect JMP, indirect CALL, and RET on x86 platforms, while inter-branch instructions consist of instructions executed between two consecutive counted branch instructions. The processor may be further configured to generate a processor event, such as an exception, when a value stored in a counter exceeds a predetermined threshold. Such events may be used as triggers for launching a malware analysis to determine whether the host system is subject to a code reuse attack.
    Type: Grant
    Filed: May 21, 2014
    Date of Patent: April 5, 2016
    Assignee: Bitdefender IPR Management Ltd.
    Inventors: Andrei V. Lutas, Sandor Lukacs
  • Patent number: 9298456
    Abstract: A mechanism for executing speculative predicated instructions may include initiating execution of a vector instruction when one or more operands upon which the vector instruction depends are available for use, even if a predicate vector upon which the vector instruction also depends is not available. If the predicate vector was not available, the results of the execution of the vector instruction may be temporarily held until the predicate vector becomes available, at which time, a destination vector may be updated with the results.
    Type: Grant
    Filed: August 21, 2012
    Date of Patent: March 29, 2016
    Assignee: Apple Inc.
    Inventor: Jeffry E. Gonion
  • Patent number: 9256429
    Abstract: This disclosure describes techniques for selectively activating a resume check operation in a single instruction, multiple data (SIMD) processing system. A processor is described that is configured to selectively enable or disable a resume check operation for a particular instruction based on information included in the instruction that indicates whether a resume check operation is to be performed for the instruction. A compiler is also described that is configured to generate compiled code which, when executed, causes a resume check operation to be selectively enabled or disabled for particular instructions. The compiled code may include one or more instructions that each specify whether a resume check operation is to be performed for the respective instruction. The techniques of this disclosure may be used to reduce the power consumption of and/or improve the performance of a SIMD system that utilizes a resume check operation to manage the reactivation of deactivated threads.
    Type: Grant
    Filed: September 21, 2012
    Date of Patent: February 9, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Lin Chen, Yun Du, Andrew Gruber
  • Patent number: 9229698
    Abstract: A method for processing a function with a plurality of execution spaces is disclosed. The method comprises creating an internal compiler representation for the function. Creating the internal compiler representation comprises copying substantially all lexical tokens corresponding to a body of the function. Further, the creating comprises inserting the lexical tokens into a plurality of conditional if-statements, wherein a conditional if-statement is generated for each corresponding execution space of said plurality of execution spaces, and wherein each conditional if-statement determines which execution space the function is executing in. During compilation, the method finally comprises performing overload resolution at a call site of an overloaded function by checking for compatibility with a first execution space specified by one of the plurality of conditional if-statements, wherein the overloaded function is called within the body of the function.
    Type: Grant
    Filed: November 25, 2013
    Date of Patent: January 5, 2016
    Assignee: NVIDIA CORPORATION
    Inventor: Jaydeep Marathe
  • Patent number: 9223572
    Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.
    Type: Grant
    Filed: December 29, 2012
    Date of Patent: December 29, 2015
    Assignee: Intel Corporation
    Inventors: Alexander D. Peleg, Yaakov Yaari, Millind Mittal, Larry M. Mennemeier, Benny Eitan
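The element ordering described in the abstract above amounts to interleaving the two source registers. A minimal sketch follows, assuming registers are modeled as Python lists with index 0 as the lowest element; the function name `unpack_interleave` is hypothetical and the element labels are placeholders.

```python
def unpack_interleave(src1, src2):
    """Interleave two equally sized packed registers, element by element."""
    if len(src1) != len(src2):
        raise ValueError("source registers must hold the same number of elements")
    dest = []
    for from_first, from_second in zip(src1, src2):
        # Each element of the first source is followed by the matching
        # element of the second source.
        dest.extend((from_first, from_second))
    return dest


# First and third elements come from source 1, second and fourth from source 2.
source_one = ["first", "third"]
source_two = ["second", "fourth"]
destination = unpack_interleave(source_one, source_two)
```

With these inputs the destination holds the four elements in the adjacent order the abstract describes: first, second, third, fourth.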
  • Patent number: 9195460
    Abstract: Systems and methods for compiling programs using condition codes and executing those programs when non-numeric values are present allow for explicit handling of non-numeric values. In addition to the conventional condition code values of positive, negative, and zero, a fourth value may be encoded, not a number (NaN) representing a non-numeric value. New condition tests are defined that explicitly account for condition code values of NaN. A compiler may produce code using the new condition tests to represent if and if-else statements. The code including the new condition tests generates deterministic results during execution when non-numeric values are present.
    Type: Grant
    Filed: May 2, 2006
    Date of Patent: November 24, 2015
    Assignee: NVIDIA CORPORATION
    Inventors: Robert Steven Glanville, John Erik Lindholm, Ming Y. Siu
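A four-valued condition code of the kind described above can be sketched in software. The names `CC`, `condition_code`, `is_lt`, and `is_not_lt_or_nan` are hypothetical; the abstract does not name the new condition tests, so the pair below is only one plausible choice for compiling an if-else statement.

```python
import math
from enum import Enum


class CC(Enum):
    NEGATIVE = "negative"
    ZERO = "zero"
    POSITIVE = "positive"
    NAN = "nan"  # the fourth value: result was not a number


def condition_code(value):
    """Classify a floating-point result into one of four condition codes."""
    if math.isnan(value):
        return CC.NAN
    if value < 0:
        return CC.NEGATIVE
    if value > 0:
        return CC.POSITIVE
    return CC.ZERO


def is_lt(cc):
    """Conventional 'less than' test: true only for a negative result."""
    return cc is CC.NEGATIVE


def is_not_lt_or_nan(cc):
    """Hypothetical new test: the exact complement of is_lt, NaN included."""
    return cc is not CC.NEGATIVE


def branch_if_else(a, b):
    """Model 'if (a < b) then-branch else else-branch' via the code of a - b."""
    cc = condition_code(a - b)
    if is_lt(cc):
        return "then"
    if is_not_lt_or_nan(cc):
        return "else"
    raise AssertionError("unreachable: the two tests are complementary")
```

Because the second test explicitly covers NaN, exactly one branch is taken for every input, which is the deterministic behavior the abstract refers to.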
  • Patent number: 9182983
    Abstract: A processor of an aspect includes a register file including a first register to hold a first packed data including a first low data element and a first high data element, a second register to hold a second packed data including a second low data element and a second high data element, and a third register. The processor also includes a decoder to decode an unpack instruction. The processor also includes a functional unit coupled with the decoder and the register file. The functional unit, in response to the decoder decoding the unpack instruction, is to transfer the first low data element to a high position of the third register and the second low data element to a low position of the third register.
    Type: Grant
    Filed: December 29, 2012
    Date of Patent: November 10, 2015
    Assignee: Intel Corporation
    Inventors: Alexander D. Peleg, Yaakov Yaari, Millind Mittal, Larry M. Mennemeier, Benny Eitan
  • Patent number: 9122797
    Abstract: Devices, systems, and methods are provided for providing a deterministic remote interface unit (RIU) based on a finite state machine. The RIU emulator uses a sequence controller that is configured to receive a synchronization input and to execute a fixed list of unconditional commands in an invariable order of execution based solely upon the synchronization input. The RIU emulator also uses pre-defined or pre-certified data structures that are specific to one or more interface devices to successfully execute the at least one unconditional command of the plurality when encountered in the invariable order. As such, peripheral devices may be added, removed or updated without recertification by merely inserting pre-certified data structures into memory or deleting them.
    Type: Grant
    Filed: September 29, 2009
    Date of Patent: September 1, 2015
    Assignee: HONEYWELL INTERNATIONAL INC.
    Inventors: Mitch Fletcher, Thom Kreider, John Dawson, Julee Clelland
  • Patent number: 9117020
    Abstract: An embodiment is directed to a method for analyzing a computer program that includes receiving an instruction specifying a first variable of the program. The first variable has a first value at a first location during program execution. The instruction further specifies a second value for the first variable at the first location. The method includes determining that a second location during program execution includes a conditional control flow instruction that includes the first variable. In addition, the method includes evaluating the conditional control flow instruction using the first and second values of the first variable at the second location. It may be determined whether control flow diverges at the second location based on the evaluating of the conditional control flow instruction using the first and second values at the second location.
    Type: Grant
    Filed: September 17, 2014
    Date of Patent: August 25, 2015
    Assignee: International Business Machines Corporation
    Inventors: Krzysztof Anton, Michal Bodziony, Pawel K. Koperek, Rafal Korczyk
  • Patent number: 9104633
    Abstract: Hardware for performing sequences of arithmetic operations. The hardware comprises a scheduler operable to generate a schedule of instructions from a bitmap denoting whether an entry in a matrix is zero or not. An arithmetic circuit is provided which is configured to perform arithmetic operations on the matrix in accordance with the schedule.
    Type: Grant
    Filed: January 7, 2011
    Date of Patent: August 11, 2015
    Assignee: LINEAR ALGEBRA TECHNOLOGIES LIMITED
    Inventor: David Moloney
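The scheduling idea in the abstract above can be modeled in software as follows. This is a sketch under assumptions: the abstract does not say which arithmetic operations are scheduled, so a matrix-vector product is used as the example, and the names `build_schedule` and `run_schedule` are hypothetical.

```python
def build_schedule(bitmap):
    """List (row, column) positions whose bitmap bit marks a nonzero entry."""
    return [
        (row_index, col_index)
        for row_index, row in enumerate(bitmap)
        for col_index, bit in enumerate(row)
        if bit
    ]


def run_schedule(schedule, matrix, vector):
    """Perform one multiply-accumulate per scheduled position."""
    result = [0] * len(matrix)
    for row_index, col_index in schedule:
        result[row_index] += matrix[row_index][col_index] * vector[col_index]
    return result


matrix = [
    [2, 0, 0],
    [0, 0, 3],
    [0, 4, 0],
]
# One bit per entry: 1 where the entry is nonzero, 0 where it is zero.
bitmap = [[1 if entry != 0 else 0 for entry in row] for row in matrix]
schedule = build_schedule(bitmap)
product = run_schedule(schedule, matrix, [1, 2, 3])
```

Only three of the nine entries are scheduled here, so the zero entries cost no arithmetic at all.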
  • Patent number: 9069565
    Abstract: A processor includes: first selectors that select instruction addresses of instructions of a plurality of threads or a branch target address of a branch instruction to be predicted and that output addresses of the plurality of threads; a second selector that selects one of the addresses of the plurality of threads output by the first selectors; a branch prediction circuit that predicts and outputs a branch direction, which indicates whether the branch instruction of the address selected by the second selector is branched, based on the selected address in a first cycle stage and that predicts and outputs the branch target address of the branch instruction to be predicted based on the selected address in a second cycle stage later than the first cycle stage; and a thread arbitration circuit that controls selection of the addresses of the threads by the first selectors and the second selector.
    Type: Grant
    Filed: August 29, 2012
    Date of Patent: June 30, 2015
    Assignee: FUJITSU LIMITED
    Inventors: Toshiro Ito, Takashi Suzuki
  • Publication number: 20150134939
    Abstract: An information processing system is provided. The information processing system includes a processor used to obtain information, a memory used to store the information and output an information block based on a received address; and a scanner used to generate an address based on the current information block and to provide the address to the memory, where the current information block is the information block currently outputted from the memory. Thus, the speed for obtaining the information block by the processor (information block requested device) is further improved, and the execution speed of the processor and the information processing system is improved.
    Type: Application
    Filed: June 14, 2013
    Publication date: May 14, 2015
    Inventor: Chenghao Kenneth Lin
  • Publication number: 20150127929
    Abstract: A data processing device includes an instruction executing part executing a normal task and a management task that schedules an execution order of the normal task, while switching between the normal task and the management task, a counter measuring an execution state of the normal task being executed in the instruction executing part, and a state controller controlling the counter based on the normal task being executed in the instruction executing part. The instruction executing part determines whether the normal task to be executed next of a plurality of normal tasks scheduled by the management task is a measurement object or not, and outputs an operation signal notifying the state controller of the determination result. The state controller operates the counter in accordance with the branch operation.
    Type: Application
    Filed: January 14, 2015
    Publication date: May 7, 2015
    Inventors: Hitoshi Suzuki, Yukihiko Akaike
  • Publication number: 20150106602
    Abstract: A system and method for efficiently performing program instrumentation. A processor processes instructions stored in a memory. The processor allocates a memory region for the purpose of creating “random branches” in the computer code utilizing existing memory access instructions. When the processor processes a given instruction, the processor both accesses a first location in the memory region and may determine a condition is satisfied. In response, the processor generates an interrupt. The corresponding interrupt handler may transfer control flow from the computer program to instrumentation code. The condition may include a pointer storing an address pointing to locations within the memory region equals a given address after the pointer is updated. Alternatively, the condition may include an updated data value stored in a location pointed to by the given address equals a threshold value.
    Type: Application
    Filed: October 15, 2013
    Publication date: April 16, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Joseph L. Greathouse, David S. Christie
  • Publication number: 20150095628
    Abstract: Various embodiments are generally directed to techniques to detect a return-oriented programming (ROP) attack by verifying target addresses of branch instructions during execution. An apparatus includes a processor component, and a comparison component for execution by the processor component to determine whether there is a matching valid target address for a target address of a branch instruction associated with a translated portion of a routine in a table comprising valid target addresses. Other embodiments are described and claimed.
    Type: Application
    Filed: May 23, 2013
    Publication date: April 2, 2015
    Inventors: Koichi Yamada, Palanivelra Shanmugavelayutham, Arvind Krishnaswamy, Jason M. Agron, Jiwei Lu
  • Publication number: 20150058605
    Abstract: Exemplary methods, apparatuses, and systems assign a plurality of branch instructions within a computer program to a plurality of prime numbers. Each branch instruction is assigned a unique prime number within the plurality of prime numbers. A run-time branch trace value is determined to be divisible, without a remainder, by a first prime number of the plurality of prime numbers. The run-time branch trace value was generated during execution of the computer program. An output is generated indicating that a first branch instruction assigned to the first prime number was executed.
    Type: Application
    Filed: August 21, 2013
    Publication date: February 26, 2015
    Applicant: VMware, Inc.
    Inventor: Rajiv MADAMPATH
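The divisibility check in the abstract above is easy to demonstrate. In this sketch the way the trace value is built (multiplying by a branch's prime the first time that branch executes) is an assumption, since the abstract says only that the value is generated during execution. The function names and branch labels are hypothetical.

```python
def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % prime for prime in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def assign_primes(branch_ids):
    """Give every branch instruction its own unique prime."""
    return dict(zip(branch_ids, first_primes(len(branch_ids))))


def record_branch(trace, prime):
    """Assumed trace update: fold the prime in once, on first execution."""
    return trace if trace % prime == 0 else trace * prime


def executed_branches(trace, assignment):
    """A branch was executed if its prime divides the trace value."""
    return [branch for branch, prime in assignment.items() if trace % prime == 0]


assignment = assign_primes(["b0", "b1", "b2", "b3"])
trace = 1
for branch in ["b1", "b3", "b1"]:  # b1 runs twice, b0 and b2 never run
    trace = record_branch(trace, assignment[branch])
covered = executed_branches(trace, assignment)
```

Unique factorization is what makes the decoding unambiguous: a prime divides the trace value only if the branch assigned to it contributed that factor.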