Instruction Modification Based On Condition Patents (Class 712/226)
-
Publication number: 20140149724
Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.
Type: Application
Filed: January 31, 2014
Publication date: May 29, 2014
Inventors: Robert C. Valentine, Jesus Corbal San Adrian, Roger Espasa Sans, Robert D. Cavin, Bret L. Toll, Santiago Galan Duran, Jeffrey G. Wiedemeier, Sridhar Samudrala, Milind Baburao Girkar, Edward Thomas Grochowski, Jonathan Cannon Hall, Dennis R. Bradford, Elmoustapha Ould-Ahmed-Vall, James C. Abel, Mark Charney, Seth Abraham, Suleyman Sair, Andrew Thomas Forsyth, Lisa Wu, Charles Yount
-
Patent number: 8738892
Abstract: A Very Long Instruction Word (VLIW) processor having an instruction set with a reduced size resulting in a small number of bits being necessary to specify registers. The VLIW processor includes a register file, and first through third operation units, and executes a very long instruction word. Further, the very long instruction word includes a register specifying field which specifies at least one of the registers in the register file and a plurality of instructions. The operand of each instruction includes bits src1, src2, and dst, which indicate whether or not the registers specified by the register specifying field are to be used as the source register and the destination register.
Type: Grant
Filed: April 15, 2008
Date of Patent: May 27, 2014
Assignee: Panasonic Corporation
Inventors: Takahiro Kageyama, Hideshi Nishida, Takeshi Tanaka, Kouji Nakajima
-
Patent number: 8732442
Abstract: A method for managing data, including obtaining a first instruction for moving a first data item from a first source to a first destination, determining a data type of the first data item, determining a data type supported by the first destination, comparing the data type of the first data item with the data type supported by the first destination to test a validity of the first instruction, and moving the first data item from the first source to the first destination based on the validity of the first instruction.
Type: Grant
Filed: June 25, 2008
Date of Patent: May 20, 2014
Assignee: Oracle America, Inc.
Inventors: Mario I. Wolczko, Gregory M. Wright, Matthew L. Seidl
-
Publication number: 20140129809
Abstract: Exemplary embodiments of the present invention disclose a method and system for executing data permute and data shift instructions. In a step, an exemplary embodiment encodes a control index value using the recoding logic into a 1-hot-of-n control for at least one of a plurality of datum positions in the one or more target registers. In another step, an exemplary embodiment conditions the 1-hot-of-n control by a gate-free logic configured for at least one of the plurality of datum positions in the one or more target registers for each of the data permute instructions and the at least one data shift instruction. In another step, an exemplary embodiment selects the 1-hot-of-n control or the conditioned 1-hot-of-n control based on a current instruction mode. In another step, an exemplary embodiment transforms the selected 1-hot-of-n control into a format applicable for the crossbar switch.
Type: Application
Filed: January 7, 2014
Publication date: May 8, 2014
Applicant: International Business Machines Corporation
Inventors: Markus Kaltenbach, Jens Leenstra, Philipp Panitz, Christoph Wandel
-
Publication number: 20140129205
Abstract: Methods for encoding a program. Each program instruction in a program has one or more possible encodings, and each instruction encoding may have a different length. The instruction encodings are selected such that the resulting encoding of the program as a whole minimizes the number of program cycles used in a decoding stage of a processor. Instruction padding or program padding may be used to create instruction encodings of lengths.
Type: Application
Filed: November 2, 2012
Publication date: May 8, 2014
Inventors: Michael Rolle, Stanley Goldberg
-
Patent number: 8717586
Abstract: An image processing apparatus which makes it possible to select a plurality of instructions at a time, and connect a plurality of documents together so that they can be processed as one document. The image processing apparatus has a reading unit, which reads an image on an original to generate image data, and performs processing according to an instruction defining reading processing to be performed, as well as processing on the generated image data. The selected plurality of instructions are analyzed, and based on the analysis result, the selected plurality of instructions are connected together to create a new instruction.
Type: Grant
Filed: May 31, 2011
Date of Patent: May 6, 2014
Assignee: Canon Kabushiki Kaisha
Inventor: Shinichi Takano
-
Patent number: 8713292
Abstract: A data processing system is used to evaluate a data processing function by executing a sequence of program instructions including an intermediate value generating instruction and an intermediate value consuming instruction. In dependence upon one or more input operands to the evaluation, an embedded opcode within the intermediate value passed between the intermediate value generating instruction and the intermediate value consuming instruction may be set to have a value indicating that a substitute instruction should be used in place of the intermediate value consuming instruction. The instructions may be floating point instructions, such as a floating point power instruction evaluating the data processing function a^b.
Type: Grant
Filed: February 7, 2011
Date of Patent: April 29, 2014
Assignee: ARM Limited
Inventor: Jorn Nystad
-
Publication number: 20140115304
Abstract: Computer implemented techniques are disclosed for identification of repeated binary strings and for storing those binary strings in order to compress code. The binary strings can be longer instructions, data, or addresses. A table of binary strings is generated based on repeated occurrences, and a reference index is provided for accessing specific entries within the table. An opcode uses a shorter string as an index through which to access the table. The longer string is executed when the longer string is an instruction. When the longer string is an address or data, the appropriate address or data are accessed.
Type: Application
Filed: October 18, 2012
Publication date: April 24, 2014
Applicant: SYNOPSYS, INC.
Inventor: Marcus J. Mauro
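The table-based compression described in this abstract can be pictured with a small sketch. It is an illustration only, not the applicant's implementation: the function names (`build_table`, `compress`, `expand`), the `("ref", index)` / `("raw", word)` tagging, and the repeat threshold are all assumptions introduced here.

```python
from collections import Counter


def build_table(words, min_repeats=2):
    """Collect the words (long binary strings) that repeat often enough to be tabled."""
    counts = Counter(words)
    # Counter keeps first-seen order, so table indices are deterministic.
    return [word for word, n in counts.items() if n >= min_repeats]


def compress(words, table):
    """Replace each tabled word with a short reference index; keep the rest raw."""
    index = {word: i for i, word in enumerate(table)}
    return [("ref", index[w]) if w in index else ("raw", w) for w in words]


def expand(stream, table):
    """Recover the original words by looking references up in the table."""
    return [table[value] if kind == "ref" else value for kind, value in stream]
```

In this sketch a repeated 32-bit word costs one table entry plus a short index per occurrence, which is where the size saving would come from.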
-
Publication number: 20140108771
Abstract: Two computer machine instructions are fetched for execution, but replaced by a single optimized instruction to be executed, wherein a temporary register used by the two instructions is identified as a last-use register, where a last-use register has a value that is not to be accessed by later instructions, whereby the two computer machine instructions are replaced by a single optimized internal instruction for execution, the single optimized instruction not including the last-use register.
Type: Application
Filed: December 6, 2013
Publication date: April 17, 2014
Applicant: International Business Machines Corporation
Inventors: Michael K Gschwind, Valentina Salapura
-
Publication number: 20140108772
Abstract: A pool of available physical registers is provided for architected registers, wherein operations are performed that activate and deactivate selected architected registers, such that the deactivated selected architected registers need not retain values, and physical registers can be deallocated to the pool, wherein deallocation of physical registers is performed after a last-use by a designated last-use instruction, wherein the last-use information is provided either by the last-use instruction or a prefix instruction, wherein reads to deallocated architecture registers return an architected default value.
Type: Application
Filed: December 23, 2013
Publication date: April 17, 2014
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael K. Gschwind, Valentina Salapura
-
Publication number: 20140095834
Abstract: Instructions and logic provide extended vector suffix comparisons for Boyer-Moore searches. Some embodiments, responsive to an instruction specifying: a pattern source operand and a target source operand, compare each of m data elements of the pattern operand with each data element of the target operand. A first and second equal ordered aggregation operation are performed from the comparisons according to the m data elements of the pattern source operand. A result of the first and second aggregation operations indicating whether or not a possible match exists between the m data elements of the pattern source operand and d data element positions relative to data elements of the target source operand is stored. Ordering of the data elements of the pattern and the target operands may be reversed for the second aggregation operation, and d may be a sum of m−1 and the quantity of target operand elements in some embodiments.
Type: Application
Filed: September 30, 2012
Publication date: April 3, 2014
Inventor: Shih J. Kuo
-
Publication number: 20140089638
Abstract: Various techniques for processing instructions that specify multiple destinations. A first portion of a processor pipeline is configured to split a multi-destination instruction into a plurality of single-destination operations. A second portion of the pipeline is configured to process the plurality of single-destination operations. A third portion of the pipeline is configured to merge the plurality of single-destination operations into one or more multi-destination operations. The one or more multi-destination operations may be performed. The first portion of the pipeline may include a decode unit. The second portion of the pipeline may include a map unit, which may in turn include circuitry configured to maintain a list of free architectural registers and a mapping table that maps physical registers to architectural registers. The third portion of the pipeline may comprise a dispatch unit. In some embodiments, this may provide certain advantages such as reduced area and/or power consumption.
Type: Application
Filed: September 26, 2012
Publication date: March 27, 2014
Applicant: APPLE INC.
Inventors: John H. Mylius, Gerard R. Williams III, James B. Keller, Fang Liu, Shyam Sundar
-
Publication number: 20140082334
Abstract: A conventional instruction set architecture, such as the x86 instruction set architecture, may be reencoded to reduce the amount of memory used by the instructions. This may be particularly useful in applications that are memory size limited, as is the case with microcontrollers. With a reencoded instruction set that is more dense, more functions can be implemented or a smaller memory size may be used. The encoded instructions are then naturally decoded at run time in the predecoder and decoder of the core pipeline.
Type: Application
Filed: December 30, 2011
Publication date: March 20, 2014
Inventors: Steven R. King, Sergey Kochuguev, Alexander Redkin, Srihari Makineni
-
Patent number: 8677106
Abstract: One embodiment of the present invention sets forth a mechanism for managing thread divergence in a thread group executing on a multithreaded processor. A unanimous branch instruction, when executed, causes all the active threads in the thread group to branch only when each thread in the thread group agrees to take the branch. In such a manner, thread divergence is eliminated. A branch-any instruction, when executed, causes all the active threads in the thread group to branch when at least one thread in the thread group agrees to take the branch.
Type: Grant
Filed: June 14, 2010
Date of Patent: March 18, 2014
Assignee: Nvidia Corporation
Inventors: John R. Nickolls, Richard Craig Johnson, Robert Steven Glanville, Guillermo Juan Rozas
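The two voting rules in this abstract reduce to an "all" and an "any" over the active threads. The following is a minimal sketch under assumed names (`branch_unanimous`, `branch_any`, `next_pc`) and an assumed fall-through of `pc + 1`; it models only the group decision, not the patented hardware.

```python
def branch_unanimous(active_threads, votes):
    """Group branches only if every active thread votes to take the branch."""
    return bool(active_threads) and all(votes[t] for t in active_threads)


def branch_any(active_threads, votes):
    """Group branches if at least one active thread votes to take the branch."""
    return any(votes[t] for t in active_threads)


def next_pc(pc, target, taken):
    """All active threads share one next PC, so the group never diverges."""
    return target if taken else pc + 1
```

Because the decision is made once for the whole group, every active thread ends up at the same program counter in either case.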
-
Publication number: 20140068229
Abstract: Coding circuitry comprises at least an encoder configured to encode an instruction address for transmission to a decoder. The encoder is operative to identify the instruction address as belonging to a particular one of a plurality of groups of instruction addresses associated with respective distinct program constructs, and to encode the instruction address based on the identified group. The decoder is operative to identify the encoded instruction address as belonging to the particular one of a plurality of groups of instruction addresses associated with respective distinct program constructs, and to decode the encoded instruction address based on the identified group. The coding circuitry may be implemented as part of an integrated circuit or other processing device that includes associated processor and memory elements. In such an arrangement, the processor may generate the instruction address for delivery over a bus to the memory.
Type: Application
Filed: August 28, 2012
Publication date: March 6, 2014
Applicant: LSI Corporation
Inventors: Prakash Krishnamoorthy, Ramesh C. Tekumalla, Parag Madhani
-
Publication number: 20140047222
Abstract: A method for recombining runtime instructions, comprising: an instruction running environment is buffered; the machine instruction segment to be scheduled is obtained; the second jump instruction which directs an entry address of an instruction recombining platform is inserted before the last instruction of the obtained machine instruction segment to generate the recombined instruction segment comprising the address A′; the value A of the address register of the buffered instruction running environment is modified to the address A′; the instruction running environment is recovered.
Type: Application
Filed: April 29, 2011
Publication date: February 13, 2014
Applicant: Beijing Zhongtian Antai Technology Co., Ltd.
Inventor: Jiaxiang Wang
-
Publication number: 20140047221
Abstract: Fusing flag-producing and flag-consuming instructions in instruction processing circuits and related processor systems, methods, and computer-readable media are disclosed. In one embodiment, a flag-producing instruction indicating a first operation generating a first flag result is detected in an instruction stream by an instruction processing circuit. The instruction processing circuit also detects a flag-consuming instruction in the instruction stream indicating a second operation consuming the first flag result as an input. The instruction processing circuit generates a fused instruction indicating the first operation generating the first flag result and indicating the second operation consuming the first flag result as the input. In this manner, as a non-limiting example, the fused instruction eliminates a potential for a read-after-write hazard between the flag-producing instruction and the flag-consuming instruction.
Type: Application
Filed: March 7, 2013
Publication date: February 13, 2014
Applicant: QUALCOMM INCORPORATED
Inventors: Andrew S. Irwin, James Norris Dieffenderfer, Melinda J. Brown, Jeffery M. Schottmiller, Brian Michael Stempel, Michael Scott McIlvaine, Rodney Wayne Smith, Michael William Morrow
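As a rough software analogy for the fusion step described here, the sketch below scans an instruction stream and merges an adjacent flag-producing and flag-consuming pair into one fused entry. The mnemonics (`cmp`, `beq`, `bne`), the tuple encoding, and the restriction to adjacent pairs are assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical opcode sets; a real design would derive these from the ISA.
FLAG_PRODUCERS = {"cmp"}
FLAG_CONSUMERS = {"beq", "bne"}


def fuse(stream):
    """Merge each adjacent (flag producer, flag consumer) pair into one fused entry."""
    fused = []
    i = 0
    while i < len(stream):
        current = stream[i]
        following = stream[i + 1] if i + 1 < len(stream) else None
        if (following is not None
                and current[0] in FLAG_PRODUCERS
                and following[0] in FLAG_CONSUMERS):
            # One entry now carries both operations, so no separate
            # flag write followed by a flag read remains in the stream.
            fused.append(("fused", current, following))
            i += 2
        else:
            fused.append(current)
            i += 1
    return fused
```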
-
Publication number: 20140032882
Abstract: In an example embodiment, an instruction set is accessed. An instruction modifier is associated with the instruction set. Thereafter, the instruction set is transformed into a modified instruction set based on the instruction modifier. After the transformation, the modified instruction set is executed.
Type: Application
Filed: July 13, 2007
Publication date: January 30, 2014
Inventors: Les G. Woolsey, Ken Beaton
-
Patent number: 8635434
Abstract: A mathematical operation processing apparatus is disclosed by which the supply of an operand which is performed based on condition codes by a plurality of mathematical operations can be performed at a high speed. The mathematical operation processing apparatus includes a plurality of computing elements configured to perform mathematical operations different from one another and produce mathematical operation results of the mathematical operations and condition codes. A condition code set register retains the condition codes produced simultaneously by the computing elements as a condition code set. A condition code conversion section performs a predetermined conversion for the condition code set and outputs a result of the conversion as a conversion condition code set. An operand supplying section supplies an operand for the mathematical operations in the computing elements based on the conversion condition code set.
Type: Grant
Filed: December 4, 2007
Date of Patent: January 21, 2014
Assignee: Sony Corporation
Inventors: Yasuhiro Iizuka, Takahiro Sato, Takayasu Kon, Kenichi Sanpei, Eiichiro Morinaga
-
Publication number: 20140019731
Abstract: Methods of bit manipulation within a computer processor are disclosed. Improved flexibility in bit manipulation proves helpful in computing elementary functions critical to the performance of many programs and for other applications. In one embodiment, a unit of input data is shifted/rotated and multiple non-contiguous bit fields from the unit of input data are inserted in an output register. In another embodiment, one of two units of input data is optionally shifted or rotated, the two units of input data are partitioned into a plurality of bit fields, bitwise operations are performed on each bit field, and pairs of bit fields are combined with either an AND or an OR bitwise operation. Embodiments are also disclosed to simultaneously perform these processes on multiple units and pairs of units of input data in a Single Instruction, Multiple Data processing environment capable of performing logical operations on floating point data.
Type: Application
Filed: March 15, 2013
Publication date: January 16, 2014
Applicant: International Business Machines Corporation
Inventors: Christopher Kumar Anand, Simon Christopher Broadhead, Robert Frederick Enenkel
-
Publication number: 20140019732
Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor mask bit compression in response to a single mask bit compression instruction that includes a source writemask register operand, a destination writemask register operand, and an opcode are described.
Type: Application
Filed: December 23, 2011
Publication date: January 16, 2014
Inventors: Bret L. Toll, Robert Valentine, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney
-
Publication number: 20140013089
Abstract: A microprocessor instruction translator translates a conditional load instruction into at least two microinstructions. An out-of-order execution pipeline executes the microinstructions. To execute a first microinstruction, an execution unit receives source operands from the source registers of a register file and responsively generates a first result using the source operands. To execute a second microinstruction, an execution unit receives a previous value of the destination register and the first result and responsively reads data from a memory location specified by the first result and provides a second result that is the data if a condition is satisfied and that is the previous destination register value if not. The previous value of the destination register comprises a result produced by execution of a microinstruction that is the most recent in-order previous writer of the destination register with respect to the second microinstruction.
Type: Application
Filed: April 6, 2012
Publication date: January 9, 2014
Applicant: VIA Technologies, Inc.
Inventors: G. Glenn Henry, Terry Parks, Rodney E. Hooker, Gerard M. Col, Colin Eddy
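The two-microinstruction split described in this abstract can be sketched as two small functions. This is a simplified model under stated assumptions: the names `uop_address` and `uop_cond_load` are hypothetical, the first result is assumed to be a base-plus-offset address, and memory is modeled as a dictionary.

```python
def uop_address(base, offset):
    """First microinstruction: combine the source operands into a first result (an address)."""
    return base + offset


def uop_cond_load(memory, address, condition, old_dest):
    """Second microinstruction: read memory, then pick the loaded data or the old destination value."""
    data = memory[address]
    # Passing the previous destination value in as an operand is what lets
    # the instruction leave the register unchanged when the condition fails.
    return data if condition else old_dest
```

A conditional load would then be the composition `uop_cond_load(memory, uop_address(base, offset), condition, old_dest)`.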
-
Publication number: 20140013088
Abstract: An integrated circuit device comprising at least one instruction processing module arranged to receive a bit-manipulation instruction, and in response to receiving the bit-manipulation instruction to select at least one bit from at least one source data register in accordance with a value of at least one control bit, select from candidate values a manipulation value for the at least one selected bit in accordance with a value of at least one further control bit, and store the selected manipulation value for the at least one selected bit in at least one output data register.
Type: Application
Filed: March 30, 2011
Publication date: January 9, 2014
Applicant: Freescale Semiconductor, Inc.
Inventors: Noam Eshel-Goldman, Aviram Amir, Itzhak Barak, Amir Kleen
-
Publication number: 20140006757
Abstract: Key lookup operations are broken into two instructions: a Key Dispatch Instruction (KDI), and a Return Result Instruction (RRI). The thread uses KDI to dispatch key information to a selected coprocessor to initiate a key lookup operation. Upon dispatch of the key value to the coprocessor, the KDI is retired to enable the thread to continue to dispatch and retire additional instructions in the pipeline and does not go idle. Subsequently, the thread will issue a RRI to obtain the key lookup result from the coprocessor. While a thread is executing, it maintains, as part of its context, a busy flag per coprocessor in a scoreboard register and a return result register per coprocessor. KDI causes the corresponding busy flag in the scoreboard register to be set. When the key lookup operation is complete, the busy flag is cleared and the result is stored in the return result register.
Type: Application
Filed: June 29, 2012
Publication date: January 2, 2014
Applicant: Avaya, Inc.
Inventor: Hamid Assarpour
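The scoreboard behavior in this abstract (busy flag set on dispatch, cleared on completion, result read later) can be modeled in a few lines. The class and method names below are hypothetical, and returning `None` from `rri` while the coprocessor is busy is an assumption; the abstract does not say how an early RRI is handled.

```python
class ThreadContext:
    """Per-thread context: one busy flag and one result register per coprocessor."""

    def __init__(self, num_coprocessors):
        self.busy = [False] * num_coprocessors      # scoreboard register
        self.result = [None] * num_coprocessors     # return result registers
        self.pending_key = [None] * num_coprocessors

    def kdi(self, coprocessor, key):
        """Key Dispatch Instruction: hand the key off, mark busy, and retire at once."""
        self.pending_key[coprocessor] = key
        self.busy[coprocessor] = True

    def complete(self, coprocessor, value):
        """Coprocessor finishes the lookup: store the result and clear the busy flag."""
        self.result[coprocessor] = value
        self.busy[coprocessor] = False

    def rri(self, coprocessor):
        """Return Result Instruction: yield the result, or None while still busy (assumed)."""
        if self.busy[coprocessor]:
            return None
        return self.result[coprocessor]
```

Between `kdi` and `rri` the thread is free to run unrelated instructions, which is the point of splitting the lookup in two.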
-
Patent number: 8615646
Abstract: One embodiment of the present invention sets forth a mechanism for managing thread divergence in a thread group executing on a multithreaded processor. A unanimous branch instruction, when executed, causes all the active threads in the thread group to branch only when each thread in the thread group agrees to take the branch. In such a manner, thread divergence is eliminated. A branch-any instruction, when executed, causes all the active threads in the thread group to branch when at least one thread in the thread group agrees to take the branch.
Type: Grant
Filed: June 14, 2010
Date of Patent: December 24, 2013
Assignee: Nvidia Corporation
Inventors: John R. Nickolls, Richard Craig Johnson, Robert Steven Glanville, Guillermo Juan Rozas
-
Publication number: 20130339682
Abstract: According to one embodiment, a code optimizer is configured to receive first code having a program loop implemented with scalar instructions to store values of a first array to a second array based on values of a third array. The code optimizer is configured to generate second code representing the program loop with vector instructions including a shuffle instruction and a store instruction, the store instruction to shuffle using a shuffle table elements of the first array based on the second array in a vector manner, the store instruction to store using a mask store table the shuffled elements in the third array in a vector manner.
Type: Application
Filed: December 15, 2011
Publication date: December 19, 2013
Inventors: Tal Uliel, Elmoustapha Ould-Ahmed-Vall, Bret T. Toll
-
Publication number: 20130332710
Abstract: Technologies and implementations for modulating dynamic optimizations of a computer program during execution are generally disclosed.
Type: Application
Filed: June 11, 2012
Publication date: December 12, 2013
Applicant: EMPIRE TECHNOLOGY DEVELOPMENT LLC
Inventor: Ezekiel Kruglick
-
Patent number: 8601244
Abstract: An apparatus and method for generating a very long instruction word (VLIW) command that supports predicated execution, and a VLIW processor and method for processing a VLIW are provided herein. The VLIW command includes an instruction bundle formed of a plurality of instructions to be executed in parallel and a single value indicating predicated execution, and is generated using the apparatus and method for generating a VLIW command. The VLIW processor decodes the instruction bundle and executes the instructions, which are included in the decoded instruction bundle, in parallel, according to the value indicating predicated execution.
Type: Grant
Filed: February 16, 2010
Date of Patent: December 3, 2013
Assignee: Samsung Electronics Co., Ltd.
Inventors: Bernhard Egger, Soo-jung Ryu, Dong-hoon Yoo, Il-hyun Park
-
Patent number: 8601245
Abstract: A scalable processing system includes a memory device having a plurality of executable program instructions, wherein each of the executable program instructions includes a timetag data field indicative of the nominal sequential order of the associated executable program instructions. The system also includes a plurality of processing elements, which are configured and arranged to receive executable program instructions from the memory device, wherein each of the processing elements executes executable instructions having the highest priority as indicated by the state of the timetag data field.
Type: Grant
Filed: July 15, 2011
Date of Patent: December 3, 2013
Assignee: Board of Governors for Higher Education, State of Rhode Island and Providence Plantations
Inventors: Augustus K. Uht, David Morano, David Kaeli
-
Patent number: 8595712
Abstract: A method, system, and computer readable article of manufacture to enable parallel execution of a divided source code in a multiprocessor system. The method includes the steps of: inputting an original source code by an input device into the computing apparatus; finding a critical path in the original source code by a critical path cut module; cutting the critical path in the original source code into a plurality of process block groups by the critical path cut module; and dividing the plurality of process block groups among a plurality of processors in the multiprocessor system by a CPU assignment code generation module to produce the divided source code. The system includes an input device; a critical path cut module; and a CPU assignment code generation unit to produce the divided source code. The computer readable article of manufacture includes instructions to implement the method.
Type: Grant
Filed: January 25, 2013
Date of Patent: November 26, 2013
Assignee: International Business Machines Corporation
Inventors: Hideaki Komatsu, Takeo Yoshizawa
-
Publication number: 20130311756
Abstract: A method of one aspect may include receiving a rotate instruction. The rotate instruction may indicate a source operand and a rotate amount. A result may be stored in a destination operand indicated by the rotate instruction. The result may have the source operand rotated by the rotate amount. Execution of the rotate instruction may complete without reading a carry flag.
Type: Application
Filed: July 22, 2013
Publication date: November 21, 2013
Inventors: VINODH GOPAL, JAMES D. GULILFORD, GILBERT M. WOLRICH, WAIDI K. FEGHALI, ERDINC OZTURK, MARTIN G. DIXON, SEAN P. MIRKES, BRET L. TOLL, MAXIM LOKTYUKHIN, MARK C. DAVIS, ALEXANDRE J. FARCY
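A rotate that depends only on its source operand and rotate amount, with no flag input, can be written as a pure function. The sketch below is an illustration only: the name `rotate_right`, the right-rotate direction, and the 32-bit default width are assumptions, since the abstract fixes none of them.

```python
def rotate_right(value, amount, width=32):
    """Rotate `value` right by `amount` bits within `width` bits; no carry flag is read or written."""
    mask = (1 << width) - 1
    value &= mask
    amount %= width
    # Bits shifted out on the right re-enter on the left.
    return ((value >> amount) | (value << (width - amount))) & mask
```

For example, `rotate_right(0x12345678, 8)` moves the low byte `0x78` to the top and yields `0x78123456`.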
-
Publication number: 20130311755
Abstract: A microprocessor includes functional units and control registers writeable to cause the functional units to institute actions that reduce the instructions-per-clock rate of the microprocessor to reduce power consumption when the microprocessor is operating in its lowest performance running state. Examples of the actions include in-order vs. out-of-order execution, serial vs. parallel cache access and single vs. multiple instruction issue, retire, translation and/or formatting per clock cycle. The actions may be instituted only if additional conditions exist, such as residing in the lowest performance running state for a minimum time, not running in a higher performance state for more than a maximum time, a user did not disable the feature, the microprocessor supports multiple running states and the operating system supports multiple running states.
Type: Application
Filed: February 26, 2013
Publication date: November 21, 2013
Applicant: VIA TECHNOLOGIES, INC.
Inventor: VIA TECHNOLOGIES, INC.
-
Patent number: 8589664
Abstract: A data processing apparatus includes a data engine 6 having an instruction decoder 18 for generating one or more control signals 24 for controlling processing circuitry 20 to perform data processing operations specified by the program instructions decoded. The instruction decoder 18 is responsive to a marker instruction to read a programmable flow control value from a flow control register 38. The programmable flow control value specifies the action to be taken upon completion of execution of a current sequence of program instructions. The action taken may be jumping to a target program instruction at the start of a target sequence of program instructions or entry into an idle state awaiting a new processing task to be initiated.
Type: Grant
Filed: September 7, 2010
Date of Patent: November 19, 2013
Assignee: u-blox AG
Inventors: Merlijn Aurich, Jef Verdonck
-
Patent number: 8584109
Abstract: A computer-implementable method includes providing an instruction set architecture that comprises features to generate diverse copies of a program, using the instruction set architecture to generate diverse copies of a program and providing a virtual machine for execution of one of the diverse copies of the program. Various exemplary methods, devices, systems, etc., use virtualization for diversifying code and/or virtual machines to thereby enhance software security.
Type: Grant
Filed: October 27, 2006
Date of Patent: November 12, 2013
Assignee: Microsoft Corporation
Inventors: Bertrand Anckaert, Mariusz H. Jakubowski, Ramarathnam Venkatesan
-
Patent number: 8572357
Abstract: A monitoring facility that is operable in two modes allowing compatibility with prior existing monitoring facilities. In one mode, in response to encountering a monitored event, an interrupt is generated. In another mode, in response to encountering a monitored event, one or more associated counters are incremented without causing an interrupt.
Type: Grant
Filed: September 29, 2009
Date of Patent: October 29, 2013
Assignee: International Business Machines Corporation
Inventors: Dan Greiner, James H. Mulder, Robert R. Rogers, Robert W. StJohn
-
Publication number: 20130283021
Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group.
Type: Application
Filed: December 23, 2011
Publication date: October 24, 2013
Applicant: LURGI GmbH
Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein
-
Publication number: 20130283020
Abstract: A program trace data compression mechanism in which execution of a variable length execution set (VLES) including multiple non-branch conditional instructions is traced in real-time in a manner that allows the instruction execution to be reconstructed completely by correlating the trace data with the traced binary code.
Type: Application
Filed: April 18, 2012
Publication date: October 24, 2013
Applicant: FREESCALE SEMICONDUCTOR, INC.
Inventors: Robert N. Ehrlich, Petru Lauric, Robert A. McGowan
-
Publication number: 20130283022
Abstract: Vector translation instructions are used to demarcate the beginning and the end of a code region to be translated. The code region includes a first set of vector instructions defined in an instruction set of a source processor. A processor receives the vector translation instructions and the demarcated code region, and translates the code region into translated code. The translated code includes a second set of vector instructions defined in an instruction set of a target processor. The translated code is executed by the target processor to produce a result value, the result value being the same as an original result value produced by the source processor executing the code region. The target processor stores the result value at a location that is not a vector register, the location being the same as an original location used by the source processor to store the original result value.
Type: Application
Filed: December 6, 2011
Publication date: October 24, 2013
Inventor: Ruchira Sasanka
-
Publication number: 20130275734
Abstract: A method of an aspect includes receiving a packed data operation mask concatenation instruction. The packed data operation mask concatenation instruction indicates a first source having a first packed data operation mask, indicates a second source having a second packed data operation mask, and indicates a destination. A result is stored in the destination in response to the packed data operation mask concatenation instruction. The result includes the first packed data operation mask concatenated with the second packed data operation mask. Other methods, apparatus, systems, and instructions are disclosed.
Type: Application
Filed: December 22, 2011
Publication date: October 17, 2013
Inventors: Bret L. Toll, Robert Valentine, Jesus Corbal San Adrian, Elmoustapha Ould-Ahmed-Vall, Mark Charney
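Illustrative sketch (not taken from the application): concatenating two operation masks amounts to a shift and an OR. The abstract does not state the mask width or which mask lands in the low-order bits, so the 8-bit default and the choice to place the first mask in the low half are assumptions, as is the function name `concat_masks`.

```python
def concat_masks(first_mask: int, second_mask: int, width: int = 8) -> int:
    """Concatenate two `width`-bit operation masks into one 2*width-bit result.

    Assumption: the first mask occupies the low-order bits of the result.
    """
    limit = (1 << width) - 1
    if not (0 <= first_mask <= limit and 0 <= second_mask <= limit):
        raise ValueError("each mask must fit in `width` bits")
    return (second_mask << width) | first_mask
```

For example, `concat_masks(0x0F, 0xA5)` yields `0xA50F` under these assumptions.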
-
Publication number: 20130268742
Abstract: An asymmetric multiprocessor system (ASMP) may comprise computational cores implementing different instruction set architectures and having different power requirements. Program code executing on the ASMP is analyzed by a binary analysis unit to determine what functions are called by the program code and select which of the cores are to execute the program code, or a code segment thereof. Selection may be made to provide for native execution of the program code, to minimize power consumption, and so forth. Control operations based on this selection may then be inserted into the program code, forming instrumented program code. The instrumented program code is then executed by the ASMP.
Type: Application
Filed: December 29, 2011
Publication date: October 10, 2013
Inventors: Koichi Yamada, Boris Ginzburg, Wei Li, Ronny Ronen, Esfir Natanzon, Konstantin Levit-Gurevich, Gadi Haber, Alon Naveh, Eliezer Weissmann, Michael Mishaeli
-
Publication number: 20130262823
Abstract: A computer system for optimizing instructions includes a processor including an instruction execution unit configured to execute instructions and an instruction optimization unit configured to optimize instructions and memory to store machine instructions to be executed by the instruction execution unit. The computer system is configured to perform a method including analyzing machine instructions from among a stream of instructions to be executed by the instruction execution unit, the machine instructions including a memory load instruction and a data processing instruction to perform a data processing function based on the memory load instruction, identifying the machine instructions as being eligible for optimization, merging the machine instructions into a single optimized internal instruction, and executing the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction.
Type: Application
Filed: March 28, 2012
Publication date: October 3, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael K. Gschwind, Valentina Salapura
-
Publication number: 20130262829
Abstract: A technique is provided for replacing an atomic sequence. A processing circuit receives the atomic sequence. The processing circuit detects the atomic sequence. The processing circuit generates an internal atomic operation to replace the atomic sequence.
Type: Application
Filed: March 28, 2012
Publication date: October 3, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Michael K. Gschwind
-
Publication number: 20130262841
Abstract: A computer-implemented method includes determining that two or more instructions of an instruction stream are eligible for optimization, where the two or more instructions include a memory load instruction and a data processing instruction to process data based on the memory load instruction. The method includes merging, by a processor, the two or more instructions into a single optimized internal instruction and executing the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction.
Type: Application
Filed: March 8, 2013
Publication date: October 3, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael K. Gschwind, Valentina Salapura
-
Publication number: 20130262839
Abstract: A computer system for optimizing instructions is configured to identify two or more machine instructions as being eligible for optimization, to merge the two or more machine instructions into a single optimized internal instruction that is configured to perform functions of the two or more machine instructions, and to execute the single optimized internal instruction to perform the functions of the two or more machine instructions. Being eligible includes determining that the two or more machine instructions include a first instruction specifying a first target register and a second instruction specifying the first target register as a source register and a target register. The second instruction is a next sequential instruction of the first instruction in program order, wherein the first instruction specifies a first function to be performed, and the second instruction specifies a second function to be performed.
Type: Application
Filed: March 28, 2012
Publication date: October 3, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael K. Gschwind, Valentina Salapura
-
Publication number: 20130262830
Abstract: A technique is provided for replacing an atomic sequence. A processing circuit receives the atomic sequence. The processing circuit detects the atomic sequence. The processing circuit generates an internal atomic operation to replace the atomic sequence.
Type: Application
Filed: March 4, 2013
Publication date: October 3, 2013
Applicant: International Business Machines Corporation
Inventor: Michael K. Gschwind
-
Publication number: 20130262840
Abstract: A computer-implemented method includes determining that two or more instructions of an instruction stream are eligible for optimization. Eligibility is based on a first instruction specifying a first target register and a second instruction specifying the first target register as a source register and a target register. The method includes merging the two or more machine instructions into a single optimized internal instruction that is configured to perform first and second functions of two or more machine instructions employing operands specified by the two or more machine instructions. The single optimized internal instruction specifies the first target register only as a single target register and the single optimized internal instruction specifies the first and second functions to be performed. The method includes executing the single optimized internal instruction to perform the first and second functions of the two or more instructions.
Type: Application
Filed: March 8, 2013
Publication date: October 3, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael K. Gschwind, Valentina Salapura
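Illustrative sketch (not taken from the application): the eligibility rule in the abstract above, where the second instruction uses the first instruction's target register as both a source and a target, can be expressed as a small check followed by a merge. The `Instr` record, the string opcodes, and the `op1+op2` naming of the fused operation are hypothetical; real hardware would operate on decoded instruction fields rather than Python objects.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    op: str                    # function to perform (hypothetical mnemonic)
    target: str                # target register
    sources: Tuple[str, ...]   # source registers


def eligible(first: Instr, second: Instr) -> bool:
    """True if `second` reads and overwrites the register `first` writes."""
    return second.target == first.target and first.target in second.sources


def fuse(first: Instr, second: Instr) -> Optional[Instr]:
    """Merge two adjacent instructions into one internal instruction, or None."""
    if not eligible(first, second):
        return None
    # The shared register is produced and consumed inside the fused operation,
    # so it appears only once, as the single target.
    extra = tuple(s for s in second.sources if s != first.target)
    return Instr(op=f"{first.op}+{second.op}",
                 target=first.target,
                 sources=first.sources + extra)
```

The caller is assumed to pass instructions that are adjacent in program order, since the sketch does not model the instruction stream itself.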
-
Publication number: 20130262842
Abstract: An information processing apparatus generates first and second trees representing a dependency relationship among instructions from first code. The information processing apparatus then adjusts the height of the shorter one of the first and second trees by inserting pseudo instructions that do not cause any difference in data before and after operation in the shorter tree, and also shuffles the order of instructions existing at the same depth from the root, according to operation types in at least one of the first and second trees. The information processing apparatus compares the first and second trees subjected to the height adjustment and the order shuffling with each other to determine combinations of an instruction of the first tree and an instruction of the second tree.
Type: Application
Filed: March 26, 2013
Publication date: October 3, 2013
Applicant: FUJITSU LIMITED
Inventors: Shuichi CHIBA, Takashi ARAKAWA
-
Patent number: 8549266
Abstract: A method and system of instruction modification. A first machine language instruction, which may comprise a plurality of discrete instructions, is fetched. Responsive to a trigger pattern in the first machine language instruction, a segment of the first machine language instruction is modified. Information can be substituted into the segment based on specifics outlined in the trigger pattern. Alternatively, information can be combined with the segment via logical and/or arithmetic operations. Modification of the segment produces a second machine language instruction that is executed by units of the processor. In one embodiment, information may be taken from a queue and used to replace data from the segment. How information is taken from the queue and how the information so taken is used to replace fields of the segment are defined by the trigger pattern.
Type: Grant
Filed: June 7, 2011
Date of Patent: October 1, 2013
Inventors: John P. Banning, Eric Hao, Brett Coon
-
Publication number: 20130246766
Abstract: Emulation of instructions that include non-contiguous specifiers is facilitated. A non-contiguous specifier specifies a resource of an instruction, such as a register, using multiple fields of the instruction. For example, multiple fields of the instruction (e.g., two fields) include bits that together designate a particular register to be used by the instruction. Non-contiguous specifiers of instructions defined in one computer system architecture are transformed to contiguous specifiers usable by instructions defined in another computer system architecture. The instructions defined in the another computer system architecture emulate the instructions defined for the one computer system architecture.
Type: Application
Filed: March 3, 2013
Publication date: September 19, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: INTERNATIONAL BUSINESS MACHINES CORPORATION
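Illustrative sketch (not taken from the application): turning a non-contiguous specifier into a contiguous one comes down to extracting the separate bit fields and joining them into a single register number. The field positions and widths below are hypothetical parameters, as is the function name `contiguous_register`; the abstract does not give a concrete encoding.

```python
def contiguous_register(instr: int,
                        low_shift: int, low_bits: int,
                        high_shift: int, high_bits: int) -> int:
    """Join two separate instruction fields into one register number.

    The field at `low_shift` supplies the low-order bits of the register
    number; the field at `high_shift` supplies the high-order extension bits.
    """
    low = (instr >> low_shift) & ((1 << low_bits) - 1)
    high = (instr >> high_shift) & ((1 << high_bits) - 1)
    return (high << low_bits) | low
```

With a 4-bit field at bits 4 to 7 and a 1-bit extension at bit 0, the word `0x51` designates register 21 under this hypothetical layout.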
-
Publication number: 20130246765
Abstract: For efficient issue of a superscalar instruction a circuit is employed which retrieves an instruction of each instruction code type other than a prefix based on a determination result of decoders for determining instruction code type, adds the immediately preceding instruction to the retrieved instruction, and outputs the resultant. When an instruction of a target code type is detected in a plurality of instruction units to be searched, the circuit outputs the detected instruction code and the immediately preceding instruction other than the target code type as prefix code candidates. When an instruction of a target code type cannot be detected at the rear end of the instruction units, the circuit outputs the instruction at the rear end as a prefix code candidate. When an instruction of a target code type is detected at the head in the instruction code search, the circuit outputs the instruction code at the head.
Type: Application
Filed: March 1, 2013
Publication date: September 19, 2013
Applicant: RENESAS ELECTRONICS CORPORATION
Inventor: FUMIO ARAKAWA