Decoding Instruction To Accommodate Variable Length Instruction Or Operand Patents (Class 712/210)
  • Patent number: 10915323
    Abstract: Provided is a data processing method including the operations of storing, in a register, a first immediate portion included in a first instruction, from among the first immediate portion and a second immediate portion that constitute an immediate value, which is an operand; determining the immediate value by catenating the second immediate portion included in a second instruction with the stored first immediate portion; and performing an operation by using a value indicated by the second instruction and the determined immediate value.
    Type: Grant
    Filed: October 14, 2015
    Date of Patent: February 9, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ki-seok Kwon, Min-wook Ahn, Suk-jin Kim, Young-hwan Park
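The two-instruction immediate scheme described in the abstract above can be sketched in Python. This is a minimal illustration, not the patented implementation: the 16-bit portion width, the ordering (first portion as the upper bits), and the names `ImmediateUnit`, `store_first`, and `combine` are all assumptions introduced here.

```python
class ImmediateUnit:
    """Toy model: holds the first immediate portion between two instructions."""

    def __init__(self, portion_bits=16):  # portion width is an assumption
        self.portion_bits = portion_bits
        self.mask = (1 << portion_bits) - 1
        self.register = 0

    def store_first(self, first_portion):
        # First instruction: latch the first immediate portion in a register.
        self.register = first_portion & self.mask

    def combine(self, second_portion):
        # Second instruction: concatenate the stored portion (upper bits,
        # by assumption) with the second portion (lower bits).
        return (self.register << self.portion_bits) | (second_portion & self.mask)


unit = ImmediateUnit()
unit.store_first(0x1234)          # carried by the first instruction
immediate = unit.combine(0x5678)  # carried by the second instruction
result = 1 + immediate            # hypothetical operation using the full immediate
```

Splitting the immediate this way lets a fixed-width instruction carry an operand wider than its own immediate field, at the cost of one extra instruction.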
  • Patent number: 10877759
    Abstract: Managing the capture of information. A plurality of instruction units of an instruction stream are received in parallel by a plurality of instruction decode units of a processor. One instruction decode unit of the plurality of instruction decode units receives a prefix instruction and another instruction decode unit of the plurality of instruction decode units receives a prefixed instruction. The prefixed instruction is an instruction to be modified by the prefix instruction. Information associated with processing of the plurality of instruction units is captured, and the capturing includes modifying the information to be captured to manage the prefix instruction and the prefixed instruction separately received by the instruction decode units as a single instruction.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: December 29, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 10853069
    Abstract: Aspects for vector comparison in neural network are described herein. The aspects may include a direct memory access unit configured to receive a first vector and a second vector from a storage device. The first vector may include one or more first elements and the second vector may include one or more second elements. The aspects may further include a computation module that includes one or more comparers respectively configured to generate a comparison result by comparing one of the one or more first elements to a corresponding one of the one or more second elements in accordance with an instruction.
    Type: Grant
    Filed: January 14, 2019
    Date of Patent: December 1, 2020
    Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
    Inventors: Dong Han, Xiao Zhang, Shaoli Liu, Tianshi Chen, Yunji Chen
  • Patent number: 10831480
    Abstract: A single architected instruction is obtained to perform multiple functions. The instruction is executed, and the executing includes performing a first function of the multiple functions and a second function of the multiple functions. The first function includes moving a block of data from one location to another location, and the second function includes setting a storage key. The storage key is associated with the block of data at the other location and controls access to the block of data. The first function and the second function are performed as part of the single architected instruction.
    Type: Grant
    Filed: February 25, 2019
    Date of Patent: November 10, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Timothy Slegel, Elpida Tzortzatos
  • Patent number: 10810011
    Abstract: A method of implementing a processor architecture and corresponding system includes operands of a first size and a datapath of a second size. The second size is different from the first size. Given a first array of registers and a second array of registers, each register of the first and second arrays being of the second size, selecting a first register and corresponding second register from the first array and the second array, respectively, to perform operations of the first size. Advantageously, this allows a user, who is interfacing with the hardware processor through software, to provide data to the processor agnostic to the size of the registers and datapath bit-width of the processor.
    Type: Grant
    Filed: November 13, 2015
    Date of Patent: October 20, 2020
    Assignee: Marvell Asia Pte, Ltd.
    Inventors: David Kravitz, Manan Salvi, David A. Carlson
  • Patent number: 10783082
    Abstract: Implementations of the present specification provide a method for deploying a smart contract. According to the method in the implementations, in a phase of deploying a smart contract, a bytecode included in a contract module corresponding to the contract is obtained; and then the bytecode is parsed into executable instruction codes, and the executable instruction codes are stored in a cache memory. Further, a function index table is determined for import and export functions in the bytecode, where the function index table is used to indicate a memory address of an instruction code corresponding to each of the import and export functions; and the function index table is stored in the cache memory.
    Type: Grant
    Filed: January 31, 2020
    Date of Patent: September 22, 2020
    Assignee: Alibaba Group Holding Limited
    Inventor: Zhongxiao Yao
  • Patent number: 10776126
    Abstract: An apparatus includes a scheduler circuit and a processing circuit. The scheduler circuit may be configured to (i) parse a directed acyclic graph into one or more operators and (ii) schedule the one or more operators in one or more data paths. The processing circuit generally comprises one or more hardware engines configured as the one or more data paths. The one or more hardware engines are generally configured to generate one or more output vectors in response to zero or more input vectors using the operators. At least one of the one or more hardware engines may support input vector dimensions ranging from zero to at least four dimensions. At least one of the one or more hardware engines is implemented solely in hardware.
    Type: Grant
    Filed: April 29, 2019
    Date of Patent: September 15, 2020
    Assignee: Ambarella International LP
    Inventors: Leslie D. Kohn, Robert C. Kunz
  • Patent number: 10747698
    Abstract: A control or test system for a field device includes: a communication unit for bidirectionally communicating with the field device via a fieldbus protocol; a command memory for receiving commands that are transmittable to the field device via the fieldbus protocol; and a masking memory, which masking memory receives the commands contained in the command memory that are not supported by the field device, and/or receives an error message returned by the field device in response to such command.
    Type: Grant
    Filed: September 17, 2018
    Date of Patent: August 18, 2020
    Assignee: ABB SCHWEIZ AG
    Inventors: Dirk Wagener, Christoph Welte, Marcus Heege, Wolfgang Mahnke, Marko Schlueter
  • Patent number: 10678545
    Abstract: A streaming engine employed in a digital signal processor specifies a fixed data stream. Once started, the data stream is read only and cannot be written. Once fetched, the data stream is stored in a first-in-first-out buffer for presentation to functional units in the fixed order. Data use by the functional unit is controlled using the input operand fields of the corresponding instruction. A read only operand coding supplies the data to an input of the functional unit. A read/advance operand coding supplies the data and also advances the stream to the next sequential data elements. The read only operand coding permits reuse of data without requiring a register of the register file for temporary storage.
    Type: Grant
    Filed: July 7, 2016
    Date of Patent: June 9, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventor: Joseph Zbiciak
  • Patent number: 10579381
    Abstract: Methods of encoding and decoding are described which use a variable number of instruction words to encode instructions from an instruction set, such that different instructions within the instruction set may be encoded using different numbers of instruction words. To encode an instruction, the bits within the instruction are re-ordered and formed into instruction words based upon their variance as determined using empirical or simulation data. The bits in the instruction words are compared to corresponding predicted values and some or all of the instruction words that match the predicted values are omitted from the encoded instruction.
    Type: Grant
    Filed: November 24, 2017
    Date of Patent: March 3, 2020
    Assignee: Imagination Technologies Limited
    Inventors: Simon Thomas Nield, James McCarthy
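The omission step in the abstract above (dropping instruction words that match predicted values) can be sketched in Python. This is a simplified sketch under stated assumptions: the presence bitmask that tells the decoder which words were kept is a device introduced here, since the abstract does not say how omitted words are signalled, and the variance-based bit re-ordering is not modelled. The names `encode` and `decode` are hypothetical.

```python
def encode(words, predicted):
    """Return (mask, kept): bit i of mask is set when word i is stored explicitly."""
    mask = 0
    kept = []
    for i, (word, guess) in enumerate(zip(words, predicted)):
        if word != guess:
            mask |= 1 << i       # word differs from its prediction: keep it
            kept.append(word)
    return mask, kept


def decode(mask, kept, predicted):
    """Rebuild the full word list, filling omitted words from the predictions."""
    stored = iter(kept)
    return [next(stored) if (mask >> i) & 1 else guess
            for i, guess in enumerate(predicted)]


predicted = [0x0, 0x0, 0x7, 0x0]
words = [0xA, 0x0, 0x7, 0x0]
mask, kept = encode(words, predicted)
restored = decode(mask, kept, predicted)
```

In this sketch an instruction whose words all match the predictions shrinks to the mask alone, which is how different instructions end up occupying different numbers of instruction words.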
  • Patent number: 10571901
    Abstract: Module-based systems and methods are described for controlled roll-out of module classes for configuring a process plant. In various aspects the module-based systems and methods generate a second version of a module class based on a modification to a first version of the module class, where the module class is associated with one or more module instances that are each associated with a process control element of the process plant. The module-based systems and methods execute a roll-out instruction to update an upgraded process control element, where the upgraded process control element is associated with a new module instance based on the second version of the module class. The roll-out instruction is also designed to ignore or skip a non-upgraded process control element, where the non-upgraded process control element remains associated with a previous module instance based on the first version of the module class.
    Type: Grant
    Filed: August 8, 2017
    Date of Patent: February 25, 2020
    Assignee: FISHER-ROSEMOUNT SYSTEMS, INC.
    Inventors: Julian K. Naidoo, Daniel R. Strinden, Cristopher Ian Sarmiento Uy, Prashant Joshi
  • Patent number: 10564971
    Abstract: A processor includes: at least one operator; and at least one macro instruction processing unit configured to share the at least one operator, wherein the at least one macro instruction processing unit is configured to execute a macro instruction with respect to input data by using the at least one operator to output result data, and to control the at least one operator to perform an operation included in the macro instruction, and the at least one macro instruction processing unit comprises: a scheduler configured to manage schedules of the at least one operator and output input data and a control signal to the at least one operator; and a controller configured to control the scheduler to execute the macro instruction and to receive the result data from the scheduler.
    Type: Grant
    Filed: October 15, 2015
    Date of Patent: February 18, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Doo-hyun Kim, Jae-hyun Kim, Joon-ho Song
  • Patent number: 10540179
    Abstract: A processor is configured to identify a branch instruction immediately followed by an architectural delay slot. A single bonded instruction comprising the branch instruction immediately followed by the architectural delay slot is created. The single bonded instruction is loaded into an instruction buffer.
    Type: Grant
    Filed: March 7, 2013
    Date of Patent: January 21, 2020
    Assignee: MIPS Tech, LLC
    Inventors: Ranganathan Sudhakar, Parthiv Pota
  • Patent number: 10514924
    Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: December 24, 2019
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Mark Charney, Robert Valentine, Jesus Corbal, Binwei Yang
  • Patent number: 10481946
    Abstract: An information processing device for reducing the number of times of interrupt notification for notifying completion of execution of input/output instruction and lightening a load of interrupt processing is described. The information processing device prescribes that a driver checks a completion state of a preceding input/output instruction after issuance of the input/output instruction. An issuing timing of the input/output instruction is considered to be a polling timing for checking the completion state of the preceding input/output instruction. Before the input/output device transmits interrupt notification to a CPU, the input/output device sets a timer to stand by for a prescribed time. A processing unit which resets the timer and extends the standby time by a prescribed time in a case where notification that a subsequent input/output instruction is issued arrives from a driver to the input/output device during the time is additionally provided to the input/output device.
    Type: Grant
    Filed: May 12, 2014
    Date of Patent: November 19, 2019
    Assignee: HITACHI, LTD.
    Inventors: Katsuto Sato, Yuki Kondo
  • Patent number: 10394568
    Abstract: Managing exception handling. A plurality of instruction units of an instruction stream are selected to be decoded in parallel by a plurality of instruction decode units of a processor. The plurality of instruction units includes a prefix instruction and a prefixed instruction. The prefixed instruction is an instruction to be modified by the prefix instruction. An exception condition associated with the prefixed instruction is determined. Exception handling is performed for the prefixed instruction, in which the performing includes determining an address at which to restart execution of the instruction stream. The determining the address includes adjusting the address at which to restart execution based on the prefix instruction to be separately decoded by an instruction decode unit.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: August 27, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 10394569
    Abstract: Managing exception handling. A plurality of instruction units of an instruction stream are selected to be decoded in parallel by a plurality of instruction decode units of a processor. The plurality of instruction units includes a prefix instruction and a prefixed instruction. The prefixed instruction is an instruction to be modified by the prefix instruction. An exception condition associated with the prefixed instruction is determined. Exception handling is performed for the prefixed instruction, in which the performing includes determining an address at which to restart execution of the instruction stream. The determining the address includes adjusting the address at which to restart execution based on the prefix instruction to be separately decoded by an instruction decode unit.
    Type: Grant
    Filed: November 14, 2015
    Date of Patent: August 27, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 10360029
    Abstract: Provided is a signal processing circuit occupying a small circuit area. A common arithmetic operation element is shared between a plurality of arithmetic operation sequence control units. An arbitration circuit selects, when the plurality of arithmetic operation sequence control units simultaneously generate requests for arithmetic operations to use the common arithmetic operation element, the predetermined sequence control unit based on priority information about the plurality of arithmetic operation sequence control units, causes the common arithmetic operation element to execute the arithmetic operation requested from the selected arithmetic operation sequence control unit, and returns the result of the arithmetic operation to the selected arithmetic operation sequence control unit.
    Type: Grant
    Filed: April 2, 2018
    Date of Patent: July 23, 2019
    Assignee: RENESAS ELECTRONICS CORPORATION
    Inventors: Hiroyuki Yamasaki, Hideyuki Noda, Kan Murata
  • Patent number: 10331449
    Abstract: Various encoding schemes are discussed for more efficiently encoding instructions which identify first and second architectural register numbers. In the first example, by constraining the first architectural register number to be greater than the second architectural register number, this frees up encodings for use in encoding other operations. In a second example, the first and second architectural register numbers may take any value but one of a first type of processing operation and a second type of processing operation is selected depending on a comparison of the first and second architectural register numbers.
    Type: Grant
    Filed: January 22, 2016
    Date of Patent: June 25, 2019
    Assignee: ARM Limited
    Inventors: Simon Hosie, Jørn Nystad
  • Patent number: 10331454
    Abstract: A processor includes a back end to execute decoded instructions and a front end. The front end includes two decode clusters and circuitry to receive data elements representing undecoded instructions, in program order, and to direct subsets of the data elements to the decode clusters. An IP generator directs one subset of data elements to the first cluster, detects a condition indicating that a load balancing action should be taken, and directs a subset of data elements immediately following the first subset in program order to the first or second decode cluster dependent on the action taken. The action may include annotating a BTB entry, inserting a fake branch in the BTB, forcing a cluster switch, or suppressing a cluster switch. The detected condition may be a predicted taken branch or an annotation thereof, or a heuristic based on a queue state, a count of uops, or a latency value.
    Type: Grant
    Filed: September 29, 2016
    Date of Patent: June 25, 2019
    Assignee: Intel Corporation
    Inventor: Jonathan D. Combs
  • Patent number: 10318306
    Abstract: An apparatus includes a scheduler circuit and a plurality of hardware engines. The scheduler circuit may be configured to (i) store a directed acyclic graph, (ii) parse the directed acyclic graph into one or more operators and (iii) schedule the one or more operators in one or more data paths. The hardware engines may be (i) configured as a plurality of the data paths and (ii) configured to generate one or more output vectors by processing zero or more input vectors using the operators. One or more of the hardware engines supports a range of multiple dimensions of the input vectors from zero dimensions to at least four dimensions.
    Type: Grant
    Filed: May 18, 2017
    Date of Patent: June 11, 2019
    Assignee: Ambarella, Inc.
    Inventors: Leslie D. Kohn, Robert C. Kunz
  • Patent number: 10303476
    Abstract: An arithmetic processor of an embodiment comprises a program counter, a program memory, registers, and a decoder. The arithmetic processor also comprises an arithmetic unit that carries out an operation using the operand and operator acquired from the registers based on a decode result by the decoder, a data memory that stores constant data and an address in association with the data, and a load unit that comprises a load data address storing unit that stores a load data address indicating an address where the constant data is stored; and an increment unit that updates the load data address stored in the load data address storing unit. The load unit loads, from the data memory, constant data corresponding to an address specified by an operand of a load instruction from the decoder, and stores the constant data in a specific one of the registers.
    Type: Grant
    Filed: June 24, 2015
    Date of Patent: May 28, 2019
    Assignee: SANKEN ELECTRIC CO., LTD.
    Inventors: Kazuhiro Mima, Hiroki Yukiyama, Takanaga Yamazaki
  • Patent number: 10187208
    Abstract: A processor includes a decode unit to decode an instruction. The instruction indicates a first 64-bit source operand having a first 64-bit value, indicates a second 64-bit source operand having a second 64-bit value, indicates a third 64-bit source operand having a third 64-bit value, and indicates a fourth 64-bit source operand having a fourth 64-bit value. An execution unit is coupled with the decode unit. The execution unit is operable, in response to the instruction, to store a result. The result includes the first 64-bit value multiplied by the second 64-bit value added to the third 64-bit value added to the fourth 64-bit value. The execution unit may store a 64-bit least significant half of the result in a first 64-bit destination operand indicated by the instruction, and store a 64-bit most significant half of the result in a second 64-bit destination operand indicated by the instruction.
    Type: Grant
    Filed: December 28, 2013
    Date of Patent: January 22, 2019
    Assignee: Intel Corporation
    Inventors: Yang Lu, Xiangzheng Sun, Nan Qiao
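The arithmetic in the abstract above (a 64-bit multiply with two 64-bit additions, split into two 64-bit halves) can be checked with a short Python sketch. It assumes unsigned operands, which the abstract does not state, and the name `mul_add_add` is hypothetical.

```python
MASK64 = (1 << 64) - 1


def mul_add_add(a, b, c, d):
    """Return (low, high) 64-bit halves of a*b + c + d for unsigned 64-bit inputs."""
    full = (a & MASK64) * (b & MASK64) + (c & MASK64) + (d & MASK64)
    low = full & MASK64    # least significant half -> first destination operand
    high = full >> 64      # most significant half -> second destination operand
    return low, high


# Largest possible inputs: the sum is exactly 2**128 - 1, so it still fits in 128 bits.
low, high = mul_add_add(MASK64, MASK64, MASK64, MASK64)
```

The result never overflows 128 bits because, for x = 2^64 - 1, x*x + 2x = (x + 1)^2 - 1 = 2^128 - 1, which is why two 64-bit destinations suffice.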
  • Patent number: 10180840
    Abstract: Apparatus and methods are disclosed for dynamic nullification of memory access instructions, such as memory store instructions. In some examples of the disclosed technology, an apparatus can include memory and one or more block-based processor cores. One of the cores can include an execution unit configured to execute memory access instructions comprising a plurality of memory load and/or memory store instructions contained in an instruction block. The core can also include a hardware structure storing data for at least one predicate instruction in the instruction block, the data identifying whether one or more of the memory store instructions will issue if a condition of the predicate instruction is satisfied. The core may further include a control unit configured to control issuing of the memory access instructions to the execution unit based at least in part on the hardware structure data.
    Type: Grant
    Filed: December 23, 2015
    Date of Patent: January 15, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Douglas C. Burger, Aaron L. Smith
  • Patent number: 10002010
    Abstract: Multi-byte compressed string representation embodiments define a String class control field identifying compression as enabled/disabled, and another control field, identifying a decompressed string created when compression enabled. Tests are noped based on null setting of the compression flag. When arguments to a String class constructor are not compressible, a decompressed String is created and stringCompressionFlag initialized. Endian-aware helper methods for reading/writing byte and character values are defined. Enhanced String class constructors, when characters are not compressible, create a decompressed String, and initialize stringCompressionFlag triggering class load assumptions, overwriting all nopable patch points. A String object sign bit is set to one for decompressed strings when compression enabled, and masking/testing this flag bit is noped. Alternative package protected string constructors and operations are provided.
    Type: Grant
    Filed: May 13, 2016
    Date of Patent: June 19, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Andrew J. Craik, Filip Jeremic, Vijay Sundaresan
  • Patent number: 9996345
    Abstract: In an aspect, a pipelined execution resource can produce an intermediate result for use in an iterative approximation algorithm in an odd number of clock cycles. The pipelined execution resource executes SIMD requests by staggering commencement of execution of the requests from a SIMD instruction. When executing one or more operations for a SIMD iterative approximation algorithm, and an operation for another SIMD iterative approximation algorithm is ready to begin execution, control logic causes intermediate results completed by the pipelined execution resource to pass through a wait state, before being used in a subsequent computation. This wait state presents two open scheduling cycles in which both parts of the next SIMD instruction can begin execution. Although the wait state increases latency to complete an in-progress algorithm, a total throughput of execution on the pipeline increases.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: June 12, 2018
    Assignee: Imagination Technologies Limited
    Inventors: Kristie Veith, Leonard Rarick, Manouk Manoukian
  • Patent number: 9959119
    Abstract: A computer processor including an instruction buffer configured to store at least one variable-length instruction having a bit bundle bounded by a head end and a tail end with a plurality of slots each defining a corresponding operation, wherein the plurality of slots and corresponding operations are logically partitioned into a plurality of distinct blocks with a first group of blocks extending from the head end of the bit bundle toward the tail end of the bit bundle and a second group of blocks extending from the tail end of the bit bundle toward the head end of the bit bundle, wherein the second group of blocks includes a tail end block disposed adjacent the tail end of the bit bundle. A decode stage is operably coupled to the instruction buffer and configured to process a given variable-length instruction stored by the instruction buffer by decoding at least one operation of a particular block belonging to the first group of blocks in parallel with decoding at least one operation of the tail end block.
    Type: Grant
    Filed: May 29, 2014
    Date of Patent: May 1, 2018
    Assignee: MILL COMPUTING, INC.
    Inventors: Roger Rawson Godard, Arthur David Kahlich, David Arthur Yost
  • Patent number: 9928123
    Abstract: Processes from a set of processes are divided for use at a second triggering event and which are part of a single application programming interface (API). The set of processes including a subset of the set of processes including at least one process in the set and a remainder of the set of processes including at least one process in the set of processes and outside of the subset of the set of processes. A first triggering event is identified. The subset of the set of processes are performed using a processor and in response to the first triggering event to obtain a first result for use at the second triggering event. A state and the first result of the subset of the set of processes is saved. The remainder of the set of processes are performed using the processor in response to the second triggering event occurring after the first triggering event, and using the state and first result, to obtain a second result.
    Type: Grant
    Filed: January 13, 2016
    Date of Patent: March 27, 2018
    Assignee: Amazon Technologies, Inc.
    Inventor: Ajit Ashok Varangaonkar
  • Patent number: 9910787
    Abstract: The present disclosure includes apparatuses and methods related to virtual address tables. An example method comprises generating an object file that comprises: an instruction comprising a number of arguments; and an address table comprising a number of indexed address elements. Each one of the number of indexed address elements can correspond to a virtual address of a respective one of the number of arguments, wherein the address table can serve as a target for the number of arguments. The method can include storing the object file in a memory.
    Type: Grant
    Filed: May 15, 2015
    Date of Patent: March 6, 2018
    Assignee: Micron Technology, Inc.
    Inventors: John D. Leidel, Kyle B. Wheeler
  • Patent number: 9898293
    Abstract: Methods and apparatus are provided for decoding instructions in a computer program wherein the instructions include one or more base instructions that are subject to modification by one or more other instructions. A decoder determines whether a first received instruction was arrived at by a non-incremental change to a program counter (i.e. a jump in the program). If the first instruction was arrived at by a non-incremental change to the program counter the decoder decodes the immediately preceding instruction to determine if the original instruction is a base instruction subject to modification by one or more other instructions. If the preceding instruction indicates that the original instruction is a base instruction an error has occurred and exception handling code is invoked.
    Type: Grant
    Filed: May 27, 2015
    Date of Patent: February 20, 2018
    Assignee: MIPS Tech, LLC
    Inventor: James Robert Whittaker
  • Patent number: 9898286
    Abstract: A processor includes a decode unit to decode a packed finite impulse response (FIR) filter instruction that indicates one or more source packed data operands, a plurality of FIR filter coefficients, and a destination storage location. The source operand(s) include a first number of data elements and a second number of additional data elements. The second number is one less than a number of FIR filter taps. An execution unit, in response to the packed FIR filter instruction being decoded, is to store a result packed data operand. The result packed data operand includes the first number of FIR filtered data elements that each is to be based on a combination of products of the plurality of FIR filter coefficients and a different corresponding set of data elements from the one or more source packed data operands, which is equal in number to the number of FIR filter taps.
    Type: Grant
    Filed: May 5, 2015
    Date of Patent: February 20, 2018
    Assignee: Intel Corporation
    Inventors: Edwin Jan Van Dalen, Martinus C. Wezelenburg, Steven Roos, Edward T. Grochowski, Moshe Maor
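The element counts in the abstract above (a first number of results, plus one fewer additional source elements than there are taps) can be illustrated with a small Python sketch. The sliding-window indexing, the coefficient order, and the name `packed_fir` are assumptions made for illustration and are not taken from the patent.

```python
def packed_fir(src, coeffs, n_results):
    """Compute n_results filtered elements from n_results + taps - 1 source elements."""
    taps = len(coeffs)
    if len(src) != n_results + taps - 1:
        raise ValueError("source must hold n_results + taps - 1 elements")
    # Each result combines the products of all coefficients with a different
    # window of `taps` consecutive source elements.
    return [sum(coeffs[k] * src[i + k] for k in range(taps))
            for i in range(n_results)]


# 4 results with a 3-tap filter need 4 + (3 - 1) = 6 source elements.
filtered = packed_fir([1, 2, 3, 4, 5, 6], [1, 1, 1], 4)
```

The extra taps - 1 elements are what let the last result see a full window without reading past the end of the source operands.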
  • Patent number: 9870305
    Abstract: A debugging capability that enables the efficient debugging of code that has prefixes, referred to herein as prefixed code. To debug application code, in which the application code includes a prefixed instruction to be modified by a prefix, a trap is provided. The trap is configured to report a presence of the prefix, but to otherwise perform the trap functions absent the prefix; i.e., the prefix is otherwise ignored in the processing of the trap.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: January 16, 2018
    Assignee: International Business Machines Corporation
    Inventor: Michael K. Gschwind
  • Patent number: 9870225
    Abstract: A processor comprises a decoder for decoding an instruction based both on an explicit opcode identifier and on metadata encoded in the instruction. For example, a relative order of source register names may be used to decode the instruction. As an example, an instruction set may have a Branch Equal (BEQ) specifying two registers (r1 and r2) that store values that are compared for equality. An instruction set can provide a single opcode identifier for BEQ and a processor can determine whether to decode a particular instance of that opcode identifier as BEQ or another instruction, in dependence on an order of appearance of the source registers in that instance. For example, the BEQ opcode can be interpreted as a branch not equal, if a higher numbered register appears before a lower numbered register. Additional forms of metadata can include interpreting a constant included in an instruction, as well as determining equality of source registers, among other forms of metadata.
    Type: Grant
    Filed: December 16, 2014
    Date of Patent: January 16, 2018
    Assignee: MIPS Tech, LLC
    Inventor: Ranganathan Sudhakar
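The register-order decoding in the abstract above can be sketched in Python. This is a toy model under stated assumptions: treating equal register numbers as plain BEQ is a choice made here (the abstract lists register equality as a separate form of metadata), and the names `decode_branch` and `branch_taken` are hypothetical.

```python
def decode_branch(rs, rt):
    """Decode one BEQ opcode as BEQ or BNE from the order of its source registers."""
    # A higher-numbered register appearing first selects branch-not-equal.
    return "BNE" if rs > rt else "BEQ"


def branch_taken(rs, rt, regs):
    """Evaluate the decoded branch against a register file given as a dict."""
    equal = regs[rs] == regs[rt]
    return equal if decode_branch(rs, rt) == "BEQ" else not equal


regs = {1: 5, 2: 5}
taken_beq = branch_taken(1, 2, regs)  # r1 before r2: behaves as BEQ
taken_bne = branch_taken(2, 1, regs)  # r2 before r1: behaves as BNE
```

The trick works because equality is symmetric: comparing r1 with r2 gives the same answer as comparing r2 with r1, so the operand order is free to carry one extra bit of opcode information.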
  • Patent number: 9870308
    Abstract: A debugging capability that enables the efficient debugging of code that has prefixes, referred to herein as prefixed code. To debug application code, in which the application code includes a prefixed instruction to be modified by a prefix, a trap is provided. The trap is configured to report a presence of the prefix, but to otherwise perform the trap functions absent the prefix; i.e., the prefix is otherwise ignored in the processing of the trap.
    Type: Grant
    Filed: June 28, 2016
    Date of Patent: January 16, 2018
    Assignee: International Business Machines Corporation
    Inventor: Michael K. Gschwind
  • Patent number: 9804853
    Abstract: Provided are an instruction compression apparatus and method for a very long instruction word (VLIW) processor, and an instruction fetching apparatus and method. The instruction compression apparatus includes: an indicator generator configured to generate an indicator code that indicates an issue width of an instruction bundle to be executed in the VLIW processor, and a number of No-Operation (NOP) instruction bundles following the instruction bundle; an instruction compressor configured to compress the instruction bundle by removing at least one of NOP instructions from the instruction bundle and the NOP instruction bundles following the instruction bundle; and an instruction converter configured to include the generated indicator code in the compressed instruction bundle.
    Type: Grant
    Filed: April 22, 2014
    Date of Patent: October 31, 2017
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jae-Un Park, Suk-jin Kim
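    A rough sketch of the compression step in this abstract, under stated assumptions: bundles are modeled as lists of mnemonic strings, the indicator code is modeled as a pair (issue width, number of following NOP bundles), and "issue width" is taken to mean the count of non-NOP instructions. The `compress` function and `NOP` constant are hypothetical, and the sketch does not preserve slot positions, so it shows compression only.

```python
NOP = "nop"


def compress(bundles):
    """Return a list of (indicator, instructions) pairs.

    indicator = (issue_width, trailing_nop_bundles), where issue_width is
    assumed to be the number of non-NOP instructions kept in the bundle.
    """
    compressed = []
    i = 0
    while i < len(bundles):
        # Remove NOP instructions from the current bundle.
        kept = [op for op in bundles[i] if op != NOP]
        # Count the all-NOP bundles that immediately follow it.
        j = i + 1
        while j < len(bundles) and all(op == NOP for op in bundles[j]):
            j += 1
        compressed.append(((len(kept), j - i - 1), kept))
        i = j
    return compressed
```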
  • Patent number: 9753730
    Abstract: A data processing apparatus, method and computer program are described that are capable of decoding instructions from different instruction sets. The method comprises: receiving an instruction; if an operation code of said instruction is an operation code of an instruction from a base set of instructions, decoding said instruction according to decode rules for said base set of instructions; and if said operation code of said instruction is an operation code of an instruction from at least one further set of instructions, decoding said instruction according to a set of decode rules determined by an indicator value indicating which of said at least one further set of instructions is currently to be decoded.
    Type: Grant
    Filed: September 23, 2011
    Date of Patent: September 5, 2017
    Assignee: ARM Limited
    Inventor: Simon John Craske
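    The two-level decode in this abstract can be illustrated with a small sketch. All opcode values, mnemonics, and table names below are made up for illustration; the only point taken from the abstract is that base-set opcodes decode the same way regardless of the indicator, while other opcodes are decoded by the rules the indicator currently selects.

```python
# Hypothetical decode tables; none of these encodings come from the patent.
BASE_SET = {0x01: "ADD", 0x02: "SUB"}
FURTHER_SETS = {
    0: {0x80: "EXT_A_OP0", 0x81: "EXT_A_OP1"},
    1: {0x80: "EXT_B_OP0", 0x81: "EXT_B_OP1"},
}


def decode(opcode: int, indicator: int) -> str:
    """Decode an opcode using base rules or the indicator-selected rules."""
    if opcode in BASE_SET:
        # Base-set opcodes ignore the indicator value entirely.
        return BASE_SET[opcode]
    rules = FURTHER_SETS.get(indicator, {})
    if opcode in rules:
        return rules[opcode]
    raise ValueError(f"undefined opcode {opcode:#x} for indicator {indicator}")
```

    In this model the same opcode value 0x80 yields different instructions depending on the indicator, which lets several further instruction sets share one region of the encoding space.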
  • Patent number: 9747112
    Abstract: A graph-based program specification includes components, at least one having at least one input port for receiving a collection of data elements, or at least one collection type output port for providing a collection of data elements. Executing a program specified by the graph-based program specification at a computing node, includes: receiving data elements of a first collection into a first storage in a first order via a link connected to a collection type output port of a first component and an input port of a second component, and invoking a plurality of instances of a task corresponding to the second component to process data elements of the first collection, including retrieving the data elements from the first storage in a second order, without blocking invocation of any of the instances until after any particular instance completes processing one or more data elements.
    Type: Grant
    Filed: September 2, 2015
    Date of Patent: August 29, 2017
    Assignee: Ab Initio Technology, LLC
    Inventors: Craig W. Stanfill, Richard Shapiro, Stephen A. Kukolich, Joseph Skeffington Wholey, III
  • Patent number: 9696992
    Abstract: An apparatus and method for performing a check on inputs to a mathematical instruction and selecting a default sequence efficiently managing the architectural state of a processor. For example, one embodiment of a processor comprises: an arithmetic logic unit (ALU) to perform a plurality of mathematical operations using one or more source operands; instruction check logic to evaluate the source operands for a current mathematical instruction and to determine, based on the evaluation, whether to execute a default sequence of operations including executing the current mathematical instruction by the ALU or to jump to an alternate sequence of operations adapted to provide a result for the mathematical instruction having particular types of source operands more efficiently than the default sequence of operations.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: July 4, 2017
    Assignee: Intel Corporation
    Inventors: Jesus Corbal San Adrian, Robert N. Hanek, Warren E. Ferguson, Taraneh Bahrami, Avi A. Tevet, Dennis R. Bradford, Michael Ferry, Jingwei Zhang
  • Patent number: 9684632
    Abstract: Systems, internal processors, and methods of parallel data processing in an internal processor are provided. In one embodiment, an external controller sends instructions to a memory device, and the internal processor on the memory device executes the instructions on the data. The internal processor may include one or more arithmetic logic units (ALUs), and each ALU may perform an operation on an entire operand, such that one or more operands may be processed in parallel by one or more ALUs in the internal processor. The operations may be completed on each operand in one or more cycles through the circuitry of the ALU, and the path of the operands through the ALU may be based on the width of the ALU, the size of the operands, or the type of operation to be performed.
    Type: Grant
    Filed: June 4, 2009
    Date of Patent: June 20, 2017
    Assignee: Micron Technology, Inc.
    Inventor: Robert Walker
  • Patent number: 9678754
    Abstract: A system and method of processing a hierarchical very long instruction word (VLIW) packet is disclosed. In a particular embodiment, a method of processing instructions is disclosed. The method includes receiving a hierarchical VLIW packet of instructions and decoding an instruction from the packet to determine whether the instruction is a single instruction or whether the instruction includes a subpacket that includes a plurality of sub-instructions. The method also includes, in response to determining that the instruction includes the subpacket, executing each of the sub-instructions.
    Type: Grant
    Filed: March 3, 2010
    Date of Patent: June 13, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Lucian Codrescu, Erich James Plondke, Ajay Anant Ingle, Suresh K. Venkumahanti, Charles Joseph Tabony
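    As a minimal sketch of the hierarchical packet handling described here, assume a packet is a Python list whose entries are either a single instruction (a string) or a subpacket (a nested list of sub-instructions). The `execute_packet` function and the callback-style `execute` parameter are hypothetical; real hardware would make this distinction from the instruction encoding during decode.

```python
def execute_packet(packet, execute):
    """Walk a hierarchical packet, executing every (sub-)instruction.

    `execute` is a caller-supplied callable applied to each instruction.
    """
    for instr in packet:
        if isinstance(instr, (list, tuple)):
            # The entry is a subpacket: execute each of its sub-instructions.
            for sub in instr:
                execute(sub)
        else:
            # The entry is a single instruction.
            execute(instr)
```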
  • Patent number: 9658853
    Abstract: A technique for operating a processor includes storing a first result to a writeback buffer, in response to a first execution unit of the processor attempting to write the first result of a first completed instruction to a register file of the processor at a same processor time as a second execution unit of the processor is attempting to write a second result of a second completed instruction to the register file. The writeback buffer is positioned in a dataflow between the first execution unit and the register file. A buffer full indicator logic is used to detect that the writeback buffer is unavailable. A buffer unavailable signal is transmitted, from the buffer full indicator logic, in response to detecting the writeback buffer is unavailable. In response to receiving the buffer unavailable signal, a buffer retrieving logic writes the first result from the writeback buffer to the register file.
    Type: Grant
    Filed: July 31, 2014
    Date of Patent: May 23, 2017
    Assignee: GLOBALFOUNDRIES INC
    Inventors: Harry Barowski, Tim Niggemeier
  • Patent number: 9652231
    Abstract: Mechanisms are provided for dynamic data driven alignment and data formatting in a floating point SIMD architecture. At least two operand inputs are input to a permute unit of a processor. Each operand input contains at least one floating point value upon which a permute operation is to be performed by the permute unit. A control vector input, having a plurality of floating point values that together constitute the control vector input, is input to the permute unit of the processor for controlling the permute operation of the permute unit. The permute unit performs a permute operation on the at least two operand inputs according to a permutation pattern specified by the plurality of floating point values that constitute the control vector input. Moreover, a result output of the permute operation is output from the permute unit to a result vector register of the processor.
    Type: Grant
    Filed: October 14, 2008
    Date of Patent: May 16, 2017
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Bruce M. Fleischer, Michael K. Gschwind
  • Patent number: 9639369
    Abstract: In an embodiment, a processor includes a register file having multiple widths corresponding to different operands sizes of a given data type implemented by the processor. For example, the integer register file may have 32 bit and 64 bit widths for 32 and 64 bit operand sizes. The register file may have a section of registers for each operand size, and the map unit may allocate registers from the appropriate section for each instruction operation based on the operand size of that instruction operation. The register file may consume less integrated circuit area than another register file having the same number of registers, all of which are implemented at the largest operand size. In some embodiments, only the register file and the map unit (specifically the free list management logic in the map unit) are changed to implement the multiple-width register file.
    Type: Grant
    Filed: November 11, 2013
    Date of Patent: May 2, 2017
    Assignee: Apple Inc.
    Inventor: Conrado Blasco
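    The section-based allocation in this abstract can be sketched with one free list per operand size. This is an assumption-laden model, not the patented map unit: the class name, the numbering of physical registers (32-bit section first), and the first-in-first-out free-list policy are all hypothetical.

```python
class MultiWidthRegisterFile:
    """Toy model of a register file with 32-bit and 64-bit sections."""

    def __init__(self, num_32: int, num_64: int):
        self.num_32 = num_32
        # One free list per section; 32-bit registers are numbered first.
        self.free = {
            32: list(range(num_32)),
            64: list(range(num_32, num_32 + num_64)),
        }

    def allocate(self, operand_bits: int) -> int:
        """Allocate a physical register from the section for this size."""
        free_list = self.free[operand_bits]
        if not free_list:
            raise RuntimeError(f"no free {operand_bits}-bit registers")
        return free_list.pop(0)

    def release(self, reg: int) -> None:
        """Return a register to the free list of the section it belongs to."""
        section = 32 if reg < self.num_32 else 64
        self.free[section].append(reg)
```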
  • Patent number: 9639503
    Abstract: An example method for placing one or more element data values into an output vector includes identifying a vertical permute control vector including a plurality of elements, each element of the plurality of elements including a register address. The method also includes for each element of the plurality of elements, reading a register address from the vertical permute control vector. The method further includes retrieving a plurality of element data values based on the register address. The method also includes identifying a horizontal permute control vector including a set of addresses corresponding to an output vector. The method further includes placing at least some of the retrieved element data values of the plurality of element data values into the output vector based on the set of addresses in the horizontal permute control vector.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: May 2, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Ajay Anant Ingle, David J. Hoyle, Marc M. Hoffman
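    The two permute stages in this abstract can be sketched as a gather followed by a scatter. The sketch assumes that lane i of the vertical control vector selects a register and that element i of that register is retrieved; the abstract does not state this lane rule, so it is an assumption, as are the function and parameter names.

```python
def two_stage_permute(registers, vertical, horizontal, width):
    """Vertical stage picks a register per lane; horizontal stage places
    each retrieved value at its output position.

    registers:  mapping of register address -> list of element values
    vertical:   register address to read for each lane
    horizontal: output position for each retrieved value
    """
    # Vertical stage: for each lane, read the register address and
    # retrieve that register's element in the same lane (assumed rule).
    gathered = [registers[reg][lane] for lane, reg in enumerate(vertical)]
    # Horizontal stage: place the retrieved values into the output vector.
    output = [None] * width
    for value, dest in zip(gathered, horizontal):
        output[dest] = value
    return output
```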
  • Patent number: 9639371
    Abstract: A system and method for efficiently processing instructions in hardware parallel execution lanes within a processor. In response to a given divergent point within an identified loop, a compiler generates code that, when executed, determines a size of a next very long instruction word (VLIW) to process and determines multiple pointer values to store in multiple corresponding PC registers in a target processor. The updated PC registers point to instructions intermingled from different basic blocks between the given divergence point and a corresponding convergence point. The target processor includes a single instruction multiple data (SIMD) micro-architecture. The assignment for a given lane is based on branch direction found at runtime for the given lane at the given divergent point. The processor includes a vector register for mapping PC registers to execution lanes.
    Type: Grant
    Filed: January 29, 2013
    Date of Patent: May 2, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Reza Yazdani
  • Patent number: 9639357
    Abstract: A processor, apparatus and method to use a multiple store instruction based on physical addresses of registers are provided. The processor is configured to execute an instruction to store data of a plurality of registers in a memory, the instruction including a first area in which a physical address of each of the registers is written. An instruction generating apparatus is configured to generate an instruction to store data of a plurality of registers in a memory, the instruction including a first area in which a physical address of each of the registers is written. An instruction generating method includes detecting a code area that instructs to store data of a plurality of registers in a memory, from a program code. The instruction generating method further includes generating an instruction corresponding to the code area by mapping physical addresses of the registers to a first area of the instruction.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: May 2, 2017
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ki-Seok Kwon, Jae-Un Park, Suk-Jin Kim
  • Patent number: 9612834
    Abstract: A processor includes a plurality of execution units. At least one of the execution units is configured to execute a complex instruction that requires multiple instruction cycles to execute, and to enforce atomic execution of the complex instruction during a first portion of the multiple instruction cycles required to execute the complex instruction. The at least one of the execution units is further configured to enable execution of the complex instruction to be interrupted for execution of a different instruction by the at least one execution unit during execution of a second portion of the multiple instruction cycles. The first portion and the second portion are non-overlapping.
    Type: Grant
    Filed: September 27, 2012
    Date of Patent: April 4, 2017
    Assignee: TEXAS INSTRUMENTS DEUTSCHLAND GMBH
    Inventors: Horst Diewald, Johann Zipperer
  • Patent number: 9613667
    Abstract: A data storage device includes a memory device suitable to perform an internal operation; a processor suitable to generate command generation information to command performance of the internal operation; and a command set processing block suitable to generate a command set, which is provided to the memory device, based on the command generation information, wherein the command set processing block generates a final sequence which configures a pattern included in the command set.
    Type: Grant
    Filed: October 2, 2014
    Date of Patent: April 4, 2017
    Assignee: SK Hynix Inc.
    Inventors: Dong Yeob Chun, Re Sen Ahn
  • Patent number: 9606960
    Abstract: An example method for placing one or more element data values into an output vector includes identifying a vertical permute control vector including a plurality of elements, each element of the plurality of elements including a register address. The method also includes for each element of the plurality of elements, reading a register address from the vertical permute control vector. The method further includes retrieving a plurality of element data values based on the register address. The method also includes identifying a horizontal permute control vector including a set of addresses corresponding to an output vector. The method further includes placing at least some of the retrieved element data values of the plurality of element data values into the output vector based on the set of addresses in the horizontal permute control vector.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: March 28, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Ajay Anant Ingle, David J. Hoyle, Marc M. Hoffman
  • Patent number: 9606931
    Abstract: Some implementations disclosed herein provide techniques and arrangements for indicating a length of an instruction from an instruction set that has variable length instructions. A plurality of bytes that include an instruction may be read from an instruction cache based on a logical instruction pointer. A determination is made whether a first byte of the plurality of bytes identifies a length of the instruction. In response to detecting that the first byte of the plurality of bytes identifies the length of the instruction, the instruction is read from the plurality of bytes based on the length of the instruction.
    Type: Grant
    Filed: December 29, 2011
    Date of Patent: March 28, 2017
    Assignee: Intel Corporation
    Inventors: Santiago Galan, Roger Espasa, Julio Gago, Jose Gonzalez
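    A minimal sketch of the length check described in this abstract, using an invented encoding: a first byte with its top bit set is assumed to be a length marker whose low four bits give the number of instruction bytes that follow. The marker layout, the constant names, and the `read_instruction` function are hypothetical and are not taken from the patent.

```python
LENGTH_MARKER = 0x80  # assumed: top bit set means "this byte gives a length"
LENGTH_MASK = 0x0F    # assumed: low four bits hold the length in bytes


def read_instruction(buf: bytes, ip: int):
    """Return (instruction_bytes, next_ip) if the first byte identifies the
    instruction length; return None so the caller can fall back to a full
    variable-length decode otherwise.
    """
    first = buf[ip]
    if first & LENGTH_MARKER:
        length = first & LENGTH_MASK
        start = ip + 1
        # Read the instruction directly, based on the encoded length.
        return buf[start:start + length], start + length
    return None
```

    The benefit in this model is that the fetch logic can slice out the instruction and compute the next instruction pointer without first decoding the instruction itself.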