Conditional Branching Patents (Class 712/234)
-
Patent number: 11861365
Abstract: Systems and methods are disclosed for macro-op fusion. Sequences of macro-ops that include a control-flow instruction are fused into single micro-ops for execution. The fused micro-ops may avoid the use of control-flow instructions, which may improve performance. A fusion predictor may be used to facilitate macro-op fusion.
Type: Grant
Filed: May 3, 2021
Date of Patent: January 2, 2024
Assignee: SiFive, Inc.
Inventors: Krste Asanovic, Andrew Waterman
-
Patent number: 11687440
Abstract: Protection of a first software application to be executed on an execution platform by adding at least one check module to the software application, wherein the check module, when being executed, checks at least a part of the code of the protected software application loaded in the memory and carries out a predefined tamper response in case the check module detects that the checked code was changed or ensures that the protected software application continues to function correctly in case the check module detects that the checked code was not changed; selecting a first code region of the first software application, said first code region provides a first functionality when being executed; amending the selected first code region of the first software application such that an amended first code region is generated to provide the protected software application; wherein the amended first code region, when being executed, still provides the first functionality but carries out an access to at least a part of the code
Type: Grant
Filed: February 2, 2021
Date of Patent: June 27, 2023
Assignee: THALES DIS CPL USA, INC.
Inventors: Andreas Weber, David Andreas Lange, Michael Zunke
-
Patent number: 11567776
Abstract: In one embodiment, a microprocessor, comprising: first logic configured to dynamically adjust a maximum prefetch count based on a total count of predicted taken branches over a predetermined quantity of cache lines; and second logic configured to prefetch instructions based on the adjusted maximum prefetch count.
Type: Grant
Filed: November 3, 2020
Date of Patent: January 31, 2023
Assignee: CENTAUR TECHNOLOGY, INC.
Inventors: Thomas C. McDonald, Brent Bean
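
As a rough illustration of the idea in this abstract, the following Python sketch models a prefetcher whose maximum prefetch count depends on how many predicted-taken branches were seen over a fixed window of cache lines. The class name, the window size, the threshold, and the choice to lower the limit when taken branches are frequent are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import deque


class PrefetchThrottle:
    """Hypothetical model, not the patented logic.

    Keeps the predicted-taken branch count for each of the last `window`
    cache lines and derives a maximum prefetch count from their total.
    """

    def __init__(self, window=8, base_max=4, threshold=4, reduced_max=1):
        self.history = deque(maxlen=window)  # oldest entries fall off automatically
        self.base_max = base_max              # limit used when few branches are taken
        self.threshold = threshold            # assumed cut-over point
        self.reduced_max = reduced_max        # assumed limit for branch-heavy code

    def record_line(self, predicted_taken_branches):
        """Record the number of predicted-taken branches in one fetched cache line."""
        self.history.append(predicted_taken_branches)

    def max_prefetch_count(self):
        """Return the adjusted maximum prefetch count for the current window."""
        total = sum(self.history)
        return self.reduced_max if total >= self.threshold else self.base_max
```

The intuition behind the assumed direction is that sequential prefetching past many taken branches is likely to fetch lines that are never executed, so a lower limit may waste less bandwidth.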
-
Patent number: 11507475
Abstract: A data processing apparatus (2) has scalar processing circuitry (32-42) and vector processing circuitry (38, 40, 42). When executing main scalar processing on the scalar processing circuitry (32-42), or main vector processing using a subset of said plurality of lanes on the vector processing circuitry (38, 40, 42), checker processing is executed using at least one lane of the plurality of lanes on the vector processing circuitry (38, 40, 42), the checker processing comprising operations corresponding to at least part of the main scalar/vector processing. Errors can then be detected based on a comparison of an outcome of the main processing and an outcome of the checker processing. This provides a technique for achieving functional safety in a high end processor with better performance and reduced hardware cost compared to a dual/triple core lockstep approach.
Type: Grant
Filed: December 12, 2017
Date of Patent: November 22, 2022
Assignee: Arm Limited
Inventors: Matthias Lothar Boettcher, Mbou Eyole, Nathanael Premillieu
-
Patent number: 11501143
Abstract: A reconfigurable neural circuit includes an array of processing nodes. Each processing node includes a single physical neuron circuit having only one input and an output, a single physical synapse circuit having a presynaptic input, and a single physical output coupled to the input of the neuron circuit, a weight memory for storing N synaptic conductance value or weights having an output coupled to the single physical synapse circuit, a single physical spike timing dependent plasticity (STDP) circuit having an output coupled to the weight memory, a first input coupled to the output of the neuron circuit, and a second input coupled to the presynaptic input, and interconnect circuitry connected to the presynaptic input and connected to the output of the single physical neuron circuit. The synapse circuit and the STDP circuit are each time multiplexed circuits. The interconnect circuitry in each respective processing node is coupled to the interconnect circuitry in each other processing node.
Type: Grant
Filed: June 20, 2019
Date of Patent: November 15, 2022
Assignee: HRL LABORATORIES, LLC
Inventors: Jose Cruz-Albrecht, Timothy Derosier, Narayan Srinivasa
-
Patent number: 11314512
Abstract: An aspect includes generating a data result and a special case indicator based on an instruction and at least one input data operand. Outputting the data result to a processor core. Outputting the first condition code to the processor core prior to outputting the data result to the processor core. Generating a second condition code based on the data result and the special case indicator. Performing a check by comparing the first condition code and the second condition code and flagging an error to the processor core upon the first condition code being different from the second condition code.
Type: Grant
Filed: August 9, 2019
Date of Patent: April 26, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Petra Leber, Kerstin Claudia Schelm, Cedric Lichtenau, Michael Klein
-
Patent number: 11309023
Abstract: A memory system may include multiple memory cells to store logical data and cycle tracking circuitry to track a number of cycles associated the memory cells. The cycles may be representative of one or more past accesses of the memory cells. The memory system may also include control circuitry to access the memory cells. Accessing of the memory cell may include a read operation, a write operation, or both. During the accessing of the memory cell, the control circuitry may determine a voltage parameter of the access based at least in part on the tracked number of cycles.
Type: Grant
Filed: November 6, 2020
Date of Patent: April 19, 2022
Assignee: Micron Technology, Inc.
Inventor: Hari Giduturi
-
Patent number: 11210103
Abstract: A processor includes a plurality of execution units. At least one of the execution units is configured to determine, based on a field of a first instruction, a number of additional instructions to execute in conjunction with the first instruction and prior to execution of the first instruction.
Type: Grant
Filed: September 14, 2016
Date of Patent: December 28, 2021
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Horst Diewald, Johann Zipperer
-
Patent number: 11163577
Abstract: A processor reads at least one instruction comprising at least one of a branch instruction and a non-branch instruction. In response to the branch instruction comprising a conditional branch instruction and set in dynamic mode, the processor dynamically predicts a branch path as taken or not taken. The processor, in response to the instruction fetch unit set in static mode for a conditional branch instruction and static branch prediction setting bits received with the conditional branch instruction specifying static branch prediction, statically sets the branch path as taken or not taken according to the static branch prediction setting bits received with the branch instruction. The processor selectively sets the operation of the processor temporarily from the dynamic mode to the static mode only in response to detecting a type of the at least one instruction matches a type of instruction qualifying to trigger static branch prediction.
Type: Grant
Filed: November 26, 2018
Date of Patent: November 2, 2021
Assignee: International Business Machines Corporation
Inventors: Sheldon Levenstein, Brian W. Thompto, David S. Levitan
-
Patent number: 11144288
Abstract: Embodiments of the present disclosure are directed to a system, methods, and computer-readable media for compiling source code into bytecode using a compiler. Using a rules set as input, a compiler de-duplicates action codes in the rules and assigns a unique identifier to each action code. The compiler generates a cascading hierarchy of switches that process discrete portions of the unique identifiers in order to invoke methods. The methods are assigned to classes using a method-per-class limit, and bytecode is generated from the class-assigned methods.
Type: Grant
Filed: May 15, 2020
Date of Patent: October 12, 2021
Assignee: Adobe Inc.
Inventor: Sandeep Nawathe
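
The steps named in this abstract (de-duplicating action codes, assigning identifiers, and dispatching through a hierarchy keyed on portions of each identifier) can be sketched loosely in Python. This is only a two-level model using nested dictionaries in place of generated switch statements and bytecode; the function names and the use of `divmod` to split an identifier into a high and a low portion are assumptions, not details from the patent.

```python
def assign_ids(rule_actions):
    """De-duplicate action codes (first-seen order) and give each a unique integer ID."""
    ids = {}
    for action in rule_actions:
        ids.setdefault(action, len(ids))
    return ids


def build_dispatch(ids, methods_per_class):
    """Group IDs so that no group ("class") holds more than methods_per_class entries.

    The outer key is the high portion of the ID and the inner key is the low
    portion, which mimics an outer switch selecting a class and an inner
    switch selecting a method within it.
    """
    table = {}
    for action, uid in ids.items():
        high, low = divmod(uid, methods_per_class)
        table.setdefault(high, {})[low] = action
    return table


def dispatch(table, uid, methods_per_class):
    """Resolve an ID by examining its high portion first, then its low portion."""
    high, low = divmod(uid, methods_per_class)
    return table[high][low]
```

For example, five distinct actions with a limit of two methods per class land in three groups, and ID 3 resolves through group 1, slot 1.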
-
Patent number: 11126432
Abstract: A computer processor is provided which hides jump instructions, in particular condition jump instructions, from side-channels. The processor comprises a forward jump detector for detecting a forward jump instruction having a jump target location which lies ahead and a jump inhibitor for inhibiting an execution of the forward jump instruction. The computer processor is configured for executing at least one intermediate computer instruction located between the inhibited forward jump instruction and the jump target location. The processor further comprises a storage destination modifier for modifying the storage destination determined by the at least one intermediate computer instruction to suppress the effects of execution of intermediate instructions. Since the intermediate instruction is executed regardless of the forward jump instruction, the jump is hidden in a side-channel. Secret information, such as cryptographic keys, on which the forward jump may depend, is also hidden.
Type: Grant
Filed: February 4, 2011
Date of Patent: September 21, 2021
Assignee: NXP B.V.
Inventor: Jan Hoogerbrugge
-
Patent number: 11080063
Abstract: A processing device includes an instruction extractor that extracts target instructions intended for a loop process that is repeatedly performed, from instructions decoded by an instruction decoder, and a loop buffer including entries where each of the target instructions extracted by an instruction extractor are stored. An instruction processor stores a target instruction into one of the entries of the loop buffer, and combines target instructions into one target instruction in a case where resources of an instruction execution circuit used by the target instructions do not overlap, to store the one instruction in one of the entries of the loop buffer, and a selector selects the instruction output from the instruction decoder or the target instruction output from the loop buffer, and outputs the selected instruction to the instruction execution circuit.
Type: Grant
Filed: October 28, 2019
Date of Patent: August 3, 2021
Assignee: FUJITSU LIMITED
Inventor: Ryohei Okazaki
-
Patent number: 11029950
Abstract: A move data instruction to move data from one location to another location is obtained. Based on obtaining the move data instruction, a determination is made as to whether the data to be moved is located in a buffer. The buffer is configured to maintain the data for use by multiple move data instructions. The buffer is used to move the data from the one location to the other location, based on determining that the data to be moved is in the buffer.
Type: Grant
Filed: July 3, 2019
Date of Patent: June 8, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Yossi Shapira, Yair Fried, Eyal Naor, Amir Turi
-
Patent number: 10996952
Abstract: Systems and methods are disclosed for macro-op fusion. Sequences of macro-ops that include a control-flow instruction are fused into single micro-ops for execution. The fused micro-ops may avoid the use of control-flow instructions, which may improve performance. A fusion predictor may be used to facilitate macro-op fusion.
Type: Grant
Filed: December 10, 2018
Date of Patent: May 4, 2021
Assignee: SiFive, Inc.
Inventors: Krste Asanovic, Andrew Waterman
-
Patent number: 10901878
Abstract: Embodiments of the present invention are directed to a computer-implemented method for building and executing test cases. A non-limiting example of the computer-implemented method includes building, using a processor, a master test case instruction stream including a plurality of instructions including a replaceable instruction. The computer-implemented method builds, using the processor, a test case instruction stream derivative including the plurality of instructions including a replacement instruction in lieu of the replaceable instruction, and predicts, using the processor, a predicted result of executing the test case instruction stream derivative in a test case environment. The computer-implemented method executes, using the processor, the test case instruction stream derivative on the test case environment to generate an actual test case result and compares, using the processor, the actual test case result with the predicted result to determine proper operation of the test case environment.
Type: Grant
Filed: December 19, 2018
Date of Patent: January 26, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ali Y. Duale, Dennis Wittig
-
Patent number: 10884735
Abstract: A processor includes a front end to receive an instruction. The processor also includes a core to execute the instruction. The core includes logic to execute a base function of the instruction to yield a result, generate a predicate value of a comparison of the result based upon a predication setting in the instruction, and set the predicate value in a register. The processor also includes a retirement unit to retire the instruction.
Type: Grant
Filed: February 26, 2018
Date of Patent: January 5, 2021
Assignee: Intel Corporation
Inventors: Jayesh Iyer, Jamison Collins, Sebastian Winkel, Howard Chen
-
Patent number: 10747536
Abstract: A data processing system provides a loop-end instruction for use at the end of a program loop body specifying an address of a beginning instruction of said program loop body. Loop control circuitry (1000) serves to control repeated execution of the program loop body upon second and subsequent passes through the program loop body using loop control data provided by the loop-end instruction without requiring the loop-end instruction to be explicitly executed upon each pass.
Type: Grant
Filed: March 21, 2017
Date of Patent: August 18, 2020
Assignee: ARM Limited
Inventors: Alasdair Grant, Thomas Christopher Grocutt, Simon John Craske
-
Patent number: 10698670
Abstract: There is provided a parallel program generating method capable of generating a static scheduling enabled parallel program without undermining the possibility of extracting parallelism. The parallel program generating method executed by the parallelization compiling apparatus 100 includes a fusion step (FIG. 2/STEP026) of fusing, as a new task, a task group including a reference task as a task having a conditional branch, and subsequent tasks as tasks control dependent, extended-control dependent, or indirect control dependent on respective of all branch directions of the conditional branch included in the reference task.
Type: Grant
Filed: December 28, 2017
Date of Patent: June 30, 2020
Assignee: WASEDA UNIVERSITY
Inventors: Hironori Kasahara, Keiji Kimura, Dan Umeda, Hiroki Mikami
-
Patent number: 10684828
Abstract: A method and apparatus for modifying a user interface. The method comprises receiving user interface data at a client from a first server, receiving modification computer program code at said client, and executing said modification computer program code at said client to modify said user interface data to generate modified user interface data. The modification computer program code can be received from said first server or from a further server.
Type: Grant
Filed: July 15, 2016
Date of Patent: June 16, 2020
Assignee: VERSATA FZ-LLC
Inventor: Plamen Ivanov Valtchev
-
Patent number: 10628159
Abstract: A processor includes; a processor core, a register selectively controlled by either external hardware during a first operation mode or the processor core during a second operation mode, and a selection circuit receiving first data provided by the external hardware to the register during the first operation mode and second data provided by the processor core to the register during the second operation mode.
Type: Grant
Filed: September 28, 2017
Date of Patent: April 21, 2020
Assignee: Samsung Electronics Co., Ltd.
Inventor: Ji Yong Yoon
-
Patent number: 10606667
Abstract: Computational tasks are mapped with computational locations in a distributed system such as a cloud computing environment. Mapping does not rely on workload estimates. Instead, tasks whose prerequisite tasks or other preconditions are determined to be mutually exclusive are co-located, while other tasks are mapped to different locations than one another. Locations are servers, processor cores, virtual machines, applications, or computational processes, for example. Mutual exclusivity may be determined by detecting that preconditions require different values of a shared variable in order to be satisfied, for example, or determining that preconditions correspond to different branches of a conditional programming statement. A satisfiability engine may also provide a satisfiability determination. Co-located tasks may also be batched, for improved execution performance.
Type: Grant
Filed: October 2, 2017
Date of Patent: March 31, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Ilya Grebnov, Stephen Siciliano, Charles Lamanna
-
Patent number: 10606595
Abstract: A data processor in which execution threads may be grouped together into thread groups in which the plural threads of a thread group can each execute a set of instructions in lockstep, one instruction at a time. The data processor comprises a plurality of execution lanes for executing respective execution threads of a thread group. For each thread group in a pool 51 of thread groups available to be issued to the execution lanes, an indication 54 of the active threads of the thread group is stored, and sets of at least one thread group from the pool 51 of available thread groups to issue 73 to the execution lanes for execution are selected 72 based on the indications of the active threads for the thread groups in the thread group pool.
Type: Grant
Filed: March 23, 2018
Date of Patent: March 31, 2020
Assignee: Arm Limited
Inventor: Kenneth Edvard Ostby
-
Patent number: 10360034
Abstract: A graphics processing unit may include a register file memory, a processing element (PE) and a load-store unit (LSU). The register file memory includes a plurality of registers. The PE is coupled to the register file memory and processes at least one thread of a vector of threads of a graphical application. Each thread in the vector of threads are processed in a non-stalling manner. The PE stores data in a first predetermined set of the plurality of registers in the register file memory that has been generated by processing the at least one thread and that is to be routed to a first stallable logic unit that is external to the PE. The LSU is coupled to the register file memory, and the LSU accesses the data in the first predetermined set of the plurality of registers and routes to the first stallable logic unit.
Type: Grant
Filed: June 26, 2017
Date of Patent: July 23, 2019
Assignee: Samsung Electronics Co., Ltd.
Inventors: David C. Tannenbaum, Srinivasan S. Iyer, Mitchell K. Alsup
-
Patent number: 10089212
Abstract: An embodiment provides a memory system connectable to a host device. The memory system includes a host interface configured to receive a read command and a write command and a first non-volatile memory. In addition, the memory system includes a debug unit configured to collect debugging information when a processor executes firmware. The debug unit is capable of outputting the debugging information to a buffer area of the host device through the host interface.
Type: Grant
Filed: March 10, 2016
Date of Patent: October 2, 2018
Assignee: TOSHIBA MEMORY CORPORATION
Inventor: Daisuke Iwai
-
Patent number: 9959095
Abstract: An adder-subtractor includes a first XOR circuit that inverts or non-inverts data from a second input line; first and second operand registers that hold outputs of first and second input selector; a result register that holds the operation result in response to the clock; and an adder that outputs an operation result of first and second input data in the first and second operand registers to the result register and also to inputs of the first and second input selectors via the first bypass line. The adder includes a second XOR circuit for the first and second input data, a carry calculation unit that calculates carry data of the first and second input data, a fourth XOR circuit that inverts or not an output of the second XOR circuit, and a third XOR circuit for outputs of the carry calculation unit and outputs the operation result.
Type: Grant
Filed: April 8, 2016
Date of Patent: May 1, 2018
Assignee: FUJITSU LIMITED
Inventors: Kouji Kimura, Ryuji Kan
-
Patent number: 9952864
Abstract: An apparatus is described having decode circuitry to decode a first instruction, wherein the first instruction indicates that a copy of a plurality of condition codes bits is to be copied from a first register to a second register. The apparatus also has first execution circuitry to copy a plurality of condition code bits from a first register to a second register.
Type: Grant
Filed: December 23, 2009
Date of Patent: April 24, 2018
Assignee: INTEL CORPORATION
Inventors: Guilherme D. Ottoni, Hong Wang, Christopher T. Weaver, Thomas A. Hartin, Wei Li, Jason W. Brandt
-
Patent number: 9892016
Abstract: A method for securing a first program, the first program including a finite number of program points and evolution rules associated to program points and defining the passage of a program point to another, the method including defining a plurality of exit cases and, when a second program is used in the definition of the first program, for each exit case, definition of a branching toward a specific program point of the first program or a declaration of branching impossibility, defining a set of properties to be proven, each associated with one of the constitutive elements of the first program, said set of properties comprising the branching impossibility as a particular property and establishment of the formal proof of the set of properties.
Type: Grant
Filed: November 3, 2016
Date of Patent: February 13, 2018
Inventor: Dominique Bolignano
-
Patent number: 9830157
Abstract: A system and method of parallelizing programs employs runtime instructions to identify data accessed by program portions and to assign those program portions to particular processors based on potential overlap between the access data. Data dependence between different program portions may be identified and used to look for pending “predicate” program portions that could create data dependencies and to postpone program portions that may be dependent while permitting parallel execution of other program portions.
Type: Grant
Filed: August 18, 2010
Date of Patent: November 28, 2017
Assignee: Wisconsin Alumni Research Foundation
Inventors: Gagan Gupta, Gurindar S. Sohi, Srinath Sridharan
-
Patent number: 9798593
Abstract: A system for determining a toggle value includes an input interface and a processor. The input interface is to receive a request for the toggle value associated with a toggle. The processor is to determine an indicated toggle value associated with the toggle; determine the toggle value associated with the toggle based at least in part on the indicated toggle value and a set of dependencies; and provide the toggle value associated with the toggle.
Type: Grant
Filed: July 6, 2016
Date of Patent: October 24, 2017
Assignee: Workday, Inc.
Inventors: Salvador Maiorano Quiroga, Saul Arjona Polo, Andrew Jacob Malin, Daniel Duan Ho
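
One plausible reading of "indicated toggle value and a set of dependencies" is sketched below in Python: a toggle resolves to on only if its own indicated value is on and every toggle it depends on also resolves to on. The AND semantics, the function name `toggle_value`, and the dictionary shapes are assumptions for illustration; the abstract does not say how the dependencies are combined.

```python
def toggle_value(name, indicated, dependencies, _visiting=frozenset()):
    """Resolve a toggle under the assumed rule: on only if indicated on
    and all of its dependencies resolve to on.

    indicated:    dict mapping toggle name -> bool (missing names count as off)
    dependencies: dict mapping toggle name -> iterable of toggle names
    """
    if name in _visiting:
        # Guard against dependency cycles instead of recursing forever.
        raise ValueError(f"cyclic dependency involving {name!r}")
    if not indicated.get(name, False):
        return False
    return all(
        toggle_value(dep, indicated, dependencies, _visiting | {name})
        for dep in dependencies.get(name, ())
    )
```

Under this rule, switching off a deeply nested prerequisite switches off everything that transitively depends on it, without changing any stored indicated value.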
-
Patent number: 9766892
Abstract: An apparatus and method for executing nested control flow instructions on a graphics processing unit (GPU). For example, one embodiment of a processor comprises: an execution unit having a plurality of channels to execute control flow instructions including fused control flow instructions comprising two or more consecutive control flow instructions fused into a single fused control flow instruction; and a branch unit to process the control flow instructions and to maintain a global counter indicating a nesting level of the control flow instructions, wherein to process a fused control flow instruction, the branch unit is to store a value N in a stack indicating a number of control flow instructions fused into the fused control flow instruction, the branch unit to subsequently read the value N from the stack upon execution of the fused control flow instruction and decrement the global counter by a value of N responsive to execution of the fused control flow instruction.
Type: Grant
Filed: December 23, 2014
Date of Patent: September 19, 2017
Assignee: Intel Corporation
Inventors: Wei-Yu Chen, Guei-Yuan Lueh, Subramaniam Maiyuran
-
Patent number: 9652242
Abstract: An apparatus and method for calculating flag bits is disclosed. The flag bits may be used in a processor utilizing branch predication. More particularly, the apparatus and method may be used to calculate a predicate that can be used by a branch unit to evaluate whether a branch is to be taken. In one embodiment, the apparatus is coupled to receive a condition code associated with an instruction, and flag bits generated responsive to execution of the instruction. The condition code is indicative of a condition to be checked resulting from execution of the instruction. The apparatus may then provide an indication of whether the condition is true.
Type: Grant
Filed: May 2, 2012
Date of Patent: May 16, 2017
Assignee: Apple Inc.
Inventors: Rajat Goel, Sandeep Gupta, Yamini Modukuru
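
To make the "condition code plus flag bits yields a predicate" idea concrete, here is a small Python sketch that evaluates a condition against four flag bits. It assumes ARM-style N, Z, C, V flags and ARM-style condition mnemonics; the abstract names neither, so treat the table and the function name `predicate` as illustrative only.

```python
# Assumed ARM-style condition table over N (negative), Z (zero),
# C (carry) and V (overflow). Not taken from the patent.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z,                # equal
    "NE": lambda n, z, c, v: not z,            # not equal
    "CS": lambda n, z, c, v: c,                # carry set / unsigned >=
    "CC": lambda n, z, c, v: not c,            # carry clear / unsigned <
    "MI": lambda n, z, c, v: n,                # negative
    "PL": lambda n, z, c, v: not n,            # positive or zero
    "VS": lambda n, z, c, v: v,                # overflow
    "VC": lambda n, z, c, v: not v,            # no overflow
    "HI": lambda n, z, c, v: c and not z,      # unsigned >
    "LS": lambda n, z, c, v: (not c) or z,     # unsigned <=
    "GE": lambda n, z, c, v: n == v,           # signed >=
    "LT": lambda n, z, c, v: n != v,           # signed <
    "GT": lambda n, z, c, v: (not z) and n == v,  # signed >
    "LE": lambda n, z, c, v: z or n != v,      # signed <=
}


def predicate(condition_code, n, z, c, v):
    """Return True if the condition holds for the given flag bits."""
    return bool(CONDITIONS[condition_code](n, z, c, v))
```

With the flags an ARM-style compare of 3 against 5 would produce (N set, Z, C and V clear), `predicate("LT", True, False, False, False)` is true and `predicate("GE", True, False, False, False)` is false.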
-
Patent number: 9479431
Abstract: Communicating among nodes in a network includes: sending a packet from an origin node to a destination node over a route including plural nodes. At each node in the route, routing of the packet is initiated according to a predicted path concurrently with verifying the correctness of the predicted path based on analyzing route information in the packet. In response to results of verifying the correctness of the predicted path, the routing of the packet is completed according to the predicted path or initiating a routing of the packet according to an actual path based on the route information in the packet.
Type: Grant
Filed: September 15, 2015
Date of Patent: October 25, 2016
Assignee: EZChip Technologies Ltd.
Inventors: Ian Rudolf Bratt, Carl G. Ramey, Matthew Mattina
-
Patent number: 9424041
Abstract: A method and apparatus for simultaneously canceling a dependent instruction and a nested dependent instruction when a cancel timer of a source of the dependent instruction and a cancel timer of a source of the nested dependent instruction expire and a producer instruction speculatively waking up the dependent instruction is canceled.
Type: Grant
Filed: March 15, 2013
Date of Patent: August 23, 2016
Assignee: Samsung Electronics Co., Ltd.
Inventors: Ravi Iyengar, Bradley Gene Burgess, Sandeep Kumar Dubey
-
Patent number: 9323551
Abstract: A technique of modifying a code sequence for a processor includes identifying a set of one or more target instructions in the code sequence. A replacement instruction is selected that includes a set of replacement instruction parts. A length of each of the replacement instruction parts corresponds to a minimum instruction length for an instruction set of the processor. The replacement instruction parts include a first instruction type and one or more second instruction types that are each configured as exception instructions if processed in isolation from the first instruction type. The replacement instruction is then substituted for the set of one or more target instructions in the code sequence for processing by the processor.
Type: Grant
Filed: January 6, 2012
Date of Patent: April 26, 2016
Assignee: International Business Machines Corporation
Inventor: Neil A. Campbell
-
Patent number: 9305167
Abstract: Described systems and methods allow protecting a host computer system from malware, such as return-oriented programming (ROP) and jump-oriented programming (JOP) exploits. In some embodiments, a processor of the host system is endowed with two counters configured to store a count of branch instructions and a count of inter-branch instructions, respectively, occurring within a stream of instructions fetched by the processor for execution. Exemplary counted branch instructions include indirect JMP, indirect CALL, and RET on x86 platforms, while inter-branch instructions consist of instructions executed between two consecutive counted branch instructions. The processor may be further configured to generate a processor event, such as an exception, when a value stored in a counter exceeds a predetermined threshold. Such events may be used as triggers for launching a malware analysis to determine whether the host system is subject to a code reuse attack.
Type: Grant
Filed: May 21, 2014
Date of Patent: April 5, 2016
Assignee: Bitdefender IPR Management Ltd.
Inventors: Andrei V. Lutas, Sandor Lukacs
-
Patent number: 9298456
Abstract: A mechanism for executing speculative predicated instructions may include execution of initiating execution of a vector instruction when one or more operands upon which the vector instruction depends are available for use, even if a predicate vector that the vector instruction also depends is not available. If the predicate vector was not available, the results of the execution of the vector instruction may be temporarily held until the predicate vector becomes available, at which time, a destination vector may be updated with the results.
Type: Grant
Filed: August 21, 2012
Date of Patent: March 29, 2016
Assignee: Apple Inc.
Inventor: Jeffry E. Gonion
-
Patent number: 9256429
Abstract: This disclosure describes techniques for selectively activating a resume check operation in a single instruction, multiple data (SIMD) processing system. A processor is described that is configured to selectively enable or disable a resume check operation for a particular instruction based on information included in the instruction that indicates whether a resume check operation is to be performed for the instruction. A compiler is also described that is configured to generate compiled code which, when executed, causes a resume check operation to be selectively enabled or disabled for particular instructions. The compiled code may include one or more instructions that each specify whether a resume check operation is to be performed for the respective instruction. The techniques of this disclosure may be used to reduce the power consumption of and/or improve the performance of a SIMD system that utilizes a resume check operation to manage the reactivation of deactivated threads.
Type: Grant
Filed: September 21, 2012
Date of Patent: February 9, 2016
Assignee: QUALCOMM Incorporated
Inventors: Lin Chen, Yun Du, Andrew Gruber
-
Patent number: 9229698
Abstract: A method for processing a function with a plurality of execution spaces is disclosed. The method comprises creating an internal compiler representation for the function. Creating the internal compiler representation comprises copying substantially all lexical tokens corresponding to a body of the function. Further, the creating comprises inserting the lexical tokens into a plurality of conditional if-statements, wherein a conditional if-statement is generated for each corresponding execution space of said plurality of execution spaces, and wherein each conditional if-statement determines which execution space the function is executing in. During compilation, the method finally comprises performing overload resolution at a call site of an overloaded function by checking for compatibility with a first execution space specified by one of the plurality of conditional if-statements, wherein the overloaded function is called within the body of the function.
Type: Grant
Filed: November 25, 2013
Date of Patent: January 5, 2016
Assignee: NVIDIA CORPORATION
Inventor: Jaydeep Marathe
-
Patent number: 9223572Abstract: An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.Type: GrantFiled: December 29, 2012Date of Patent: December 29, 2015Assignee: Intel CorporationInventors: Alexander D. Peleg, Yaakov Yaari, Millind Mittal, Larry M. Mennemeier, Benny Eitan
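The data movement in this abstract can be sketched in Python, with lists standing in for packed registers. This is only an illustration: the function name `unpack_interleave` is hypothetical and is not an instruction mnemonic taken from the patent.

```python
def unpack_interleave(src1, src2):
    """Interleave elements of two equally sized 'packed registers'.

    src1 holds the first and third elements, src2 the second and fourth,
    so the result places them in the order first, second, third, fourth.
    """
    if len(src1) != len(src2):
        raise ValueError("source registers must hold the same number of elements")
    dest = []
    for a, b in zip(src1, src2):
        dest.append(a)  # element taken from the first source register
        dest.append(b)  # element taken from the second source register
    return dest


if __name__ == "__main__":
    print(unpack_interleave(["first", "third"], ["second", "fourth"]))
```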
-
Patent number: 9195460Abstract: Systems and methods for compiling programs using condition codes and executing those programs when non-numeric values are present allow for explicit handling of non-numeric values. In addition to the conventional condition code values of positive, negative, and zero, a fourth value may be encoded, not a number (NaN) representing a non-numeric value. New condition tests are defined that explicitly account for condition code values of NaN. A compiler may produce code using the new condition tests to represent if and if-else statements. The code including the new condition tests generates deterministic results during execution when non-numeric values are present.Type: GrantFiled: May 2, 2006Date of Patent: November 24, 2015Assignee: NVIDIA CORPORATIONInventors: Robert Steven Glanville, John Erik Lindholm, Ming Y. Siu
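A minimal Python sketch of a four-valued condition code follows. The enum `CC` and the test names `is_lt` and `is_lt_or_nan` are assumptions made for illustration; the abstract does not name the actual condition tests or their encoding.

```python
import math
from enum import Enum


class CC(Enum):
    """Condition code with a fourth value for non-numeric results."""
    POSITIVE = 0
    NEGATIVE = 1
    ZERO = 2
    NAN = 3


def condition_code(x: float) -> CC:
    """Classify a value; NaN is checked first because every comparison with NaN is false."""
    if math.isnan(x):
        return CC.NAN
    if x > 0:
        return CC.POSITIVE
    if x < 0:
        return CC.NEGATIVE
    return CC.ZERO


def is_lt(cc: CC) -> bool:
    """Hypothetical test: true only for a negative result, false for NaN."""
    return cc is CC.NEGATIVE


def is_lt_or_nan(cc: CC) -> bool:
    """Hypothetical test: true for a negative result or for NaN."""
    return cc in (CC.NEGATIVE, CC.NAN)


def classify(x: float) -> str:
    """An if / else-if / else chain whose outcome is well defined even for NaN."""
    cc = condition_code(x)
    if is_lt(cc):
        return "negative"
    if cc is CC.NAN:
        return "nan"
    return "non-negative"
```

Because each test states explicitly how it treats `CC.NAN`, the branch taken for a NaN input is fixed by the choice of test rather than by how a comparison happens to evaluate.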
-
Patent number: 9182983Abstract: A processor of an aspect includes a register file including a first register to hold a first packed data including a first low data element and a first high data element, a second register to hold a second packed data including a second low data element and a second high data element, and a third register. The processor also includes a decoder to decode an unpack instruction. The processor also includes a functional unit coupled with the decoder and the register file. The functional unit, in response to the decoder decoding the unpack instruction, is to transfer the first low data element to a high position of the third register and the second low data element to a low position of the third register.Type: GrantFiled: December 29, 2012Date of Patent: November 10, 2015Assignee: Intel CorporationInventors: Alexander D. Peleg, Yaakov Yaari, Millind Mittal, Larry M. Mennemeier, Benny Eitan
-
Patent number: 9122797Abstract: Devices, systems, and methods are provided for a deterministic remote interface unit (RIU) based on a finite state machine. The RIU emulator uses a sequence controller that is configured to receive a synchronization input and to execute a fixed list of unconditional commands in an invariable order of execution based solely upon the synchronization input. The RIU emulator also uses pre-defined or pre-certified data structures that are specific to one or more interface devices to successfully execute the at least one unconditional command of the plurality when encountered in the invariable order. As such, peripheral devices may be added, removed or updated without recertification by merely inserting pre-certified data structures into memory or deleting them.Type: GrantFiled: September 29, 2009Date of Patent: September 1, 2015Assignee: HONEYWELL INTERNATIONAL INC.Inventors: Mitch Fletcher, Thom Kreider, John Dawson, Julee Clelland
-
Patent number: 9117020Abstract: An embodiment is directed to a method for analyzing a computer program that includes receiving an instruction specifying a first variable of the program. The first variable has a first value at a first location during program execution. The instruction further specifies a second value for the first variable at the first location. The method includes determining that a second location during program execution includes a conditional control flow instruction that includes the first variable. In addition, the method includes evaluating the conditional control flow instruction using the first and second values of the first variable at the second location. It may be determined whether control flow diverges at the second location based on the evaluating of the conditional control flow instruction using the first and second values at the second location.Type: GrantFiled: September 17, 2014Date of Patent: August 25, 2015Assignee: International Business Machines CorporationInventors: Krzysztof Anton, Michal Bodziony, Pawel K. Koperek, Rafal Korczyk
-
Patent number: 9104633Abstract: Hardware for performing sequences of arithmetic operations. The hardware comprises a scheduler operable to generate a schedule of instructions from a bitmap denoting whether an entry in a matrix is zero or not. An arithmetic circuit is provided which is configured to perform arithmetic operations on the matrix in accordance with the schedule.Type: GrantFiled: January 7, 2011Date of Patent: August 11, 2015Assignee: LINEAR ALGEBRA TECHNOLOGIES LIMITEDInventor: David Moloney
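As a rough software analogy for the scheduler described here, the Python sketch below builds a schedule of multiply-accumulate steps from a zero/non-zero bitmap and then runs it for a matrix-vector product. The function names and the choice of matrix-vector multiplication are assumptions for illustration; the abstract does not specify which arithmetic operations the hardware schedules.

```python
def make_bitmap(matrix):
    """Return a bitmap with 1 where the matrix entry is non-zero, else 0."""
    return [[1 if value != 0 else 0 for value in row] for row in matrix]


def build_schedule(bitmap):
    """List the (row, column) positions that need an arithmetic operation."""
    return [
        (r, c)
        for r, row in enumerate(bitmap)
        for c, bit in enumerate(row)
        if bit
    ]


def run_schedule(schedule, matrix, vector):
    """Compute matrix @ vector using only the scheduled multiply-accumulates."""
    result = [0] * len(matrix)
    for r, c in schedule:
        result[r] += matrix[r][c] * vector[c]
    return result


if __name__ == "__main__":
    m = [[2, 0, 0], [0, 0, 3], [0, 4, 0]]
    schedule = build_schedule(make_bitmap(m))
    print(schedule)
    print(run_schedule(schedule, m, [1, 2, 3]))
```

Zero entries never appear in the schedule, so no multiply is issued for them.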
-
Patent number: 9069565Abstract: A processor includes: first selectors that select instruction addresses of instructions of a plurality of threads or a branch target address of a branch instruction to be predicted and that output addresses of the plurality of threads; a second selector that selects one of the addresses of the plurality of threads output by the first selectors; a branch prediction circuit that predicts and outputs a branch direction, which indicates whether the branch instruction of the address selected by the second selector is branched, based on the selected address in a first cycle stage and that predicts and outputs the branch target address of the branch instruction to be predicted based on the selected address in a second cycle stage later than the first cycle stage; and a thread arbitration circuit that controls selection of the addresses of the threads by the first selectors and the second selector.Type: GrantFiled: August 29, 2012Date of Patent: June 30, 2015Assignee: FUJITSU LIMITEDInventors: Toshiro Ito, Takashi Suzuki
-
Publication number: 20150134939Abstract: An information processing system is provided. The information processing system includes a processor used to obtain information, a memory used to store the information and output an information block based on a received address; and a scanner used to generate an address based on the current information block and to provide the address to the memory, where the current information block is the information block currently outputted from the memory. Thus, the speed for obtaining the information block by the processor (information block requested device) is further improved, and the execution speed of the processor and the information processing system is improved.Type: ApplicationFiled: June 14, 2013Publication date: May 14, 2015Inventor: Chenghao Kenneth Lin
-
Publication number: 20150127929Abstract: A data processing device includes an instruction executing part that executes a normal task and a management task, which schedules an execution order of the normal task, while switching between the normal task and the management task; a counter measuring an execution state of the normal task being executed in the instruction executing part; and a state controller controlling the counter based on the normal task being executed in the instruction executing part. The instruction executing part determines whether the normal task to be executed next of a plurality of normal tasks scheduled by the management task is a measurement object or not, and outputs an operation signal notifying the state controller of the determination result. The state controller operates the counter in accordance with the branch operation.Type: ApplicationFiled: January 14, 2015Publication date: May 7, 2015Inventors: Hitoshi Suzuki, Yukihiko Akaike
-
Publication number: 20150106602Abstract: A system and method for efficiently performing program instrumentation. A processor processes instructions stored in a memory. The processor allocates a memory region for the purpose of creating “random branches” in the computer code utilizing existing memory access instructions. When the processor processes a given instruction, the processor both accesses a first location in the memory region and may determine a condition is satisfied. In response, the processor generates an interrupt. The corresponding interrupt handler may transfer control flow from the computer program to instrumentation code. The condition may be that a pointer, which stores an address pointing to locations within the memory region, equals a given address after the pointer is updated. Alternatively, the condition may be that an updated data value stored in a location pointed to by the given address equals a threshold value.Type: ApplicationFiled: October 15, 2013Publication date: April 16, 2015Applicant: Advanced Micro Devices, Inc.Inventors: Joseph L. Greathouse, David S. Christie
-
Publication number: 20150095628Abstract: Various embodiments are generally directed to techniques to detect a return-oriented programming (ROP) attack by verifying target addresses of branch instructions during execution. An apparatus includes a processor component, and a comparison component for execution by the processor component to determine whether there is a matching valid target address for a target address of a branch instruction associated with a translated portion of a routine in a table comprising valid target addresses. Other embodiments are described and claimed.Type: ApplicationFiled: May 23, 2013Publication date: April 2, 2015Inventors: Koichi Yamada, Palanivelra Shanmugavelayutham, Arvind Krishnaswamy, Jason M. Agron, Jiwei Lu
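The table lookup at the core of this abstract can be sketched in Python as a set-membership check. Everything named below (`InvalidBranchTarget`, `verify_branch_target`, `take_branch`, and the example addresses) is hypothetical; the abstract does not describe how the table is built or what happens when no match is found.

```python
class InvalidBranchTarget(Exception):
    """Raised in this sketch when a branch target has no matching table entry."""


def verify_branch_target(target, valid_targets):
    """Return True if the target address matches an entry in the valid-target table."""
    return target in valid_targets


def take_branch(target, valid_targets):
    """Follow the branch only when its target is found in the table."""
    if not verify_branch_target(target, valid_targets):
        raise InvalidBranchTarget(hex(target))
    return target


if __name__ == "__main__":
    table = {0x1000, 0x1040, 0x2000}  # made-up addresses for illustration
    print(hex(take_branch(0x1040, table)))
```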
-
Publication number: 20150058605Abstract: Exemplary methods, apparatuses, and systems assign a plurality of branch instructions within a computer program to a plurality of prime numbers. Each branch instruction is assigned a unique prime number within the plurality of prime numbers. A run-time branch trace value is determined to be divisible, without a remainder, by a first prime number of the plurality of prime numbers. The run-time branch trace value was generated during execution of the computer program. An output is generated indicating that a first branch instruction assigned to the first prime number was executed.Type: ApplicationFiled: August 21, 2013Publication date: February 26, 2015Applicant: VMware, Inc.Inventor: Rajiv MADAMPATH
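The divisibility test in this abstract can be sketched in Python. One assumption is made loudly here: the run-time trace value is taken to be a product of the primes of the branches that executed, which the abstract does not state. The helper names `assign_primes`, `record_branch`, and `executed_branches` are likewise hypothetical.

```python
def assign_primes(branch_ids):
    """Assign each branch a unique prime, in order, starting from 2."""
    primes = []
    candidate = 2
    while len(primes) < len(branch_ids):
        # candidate is prime if no smaller prime divides it
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return dict(zip(branch_ids, primes))


def record_branch(trace, prime):
    """Assumed trace update: multiply in the branch's prime once."""
    return trace if trace % prime == 0 else trace * prime


def executed_branches(trace, assignment):
    """A branch executed if its prime divides the trace without a remainder."""
    return [branch for branch, prime in assignment.items() if trace % prime == 0]


if __name__ == "__main__":
    assignment = assign_primes(["b0", "b1", "b2"])
    trace = 1
    for branch in ("b0", "b2"):
        trace = record_branch(trace, assignment[branch])
    print(trace, executed_branches(trace, assignment))
```

Because each branch has a distinct prime, unique factorization guarantees that a prime divides the trace only if its branch was recorded.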