Predecoding Of Instruction Component Patents (Class 712/213)
  • Patent number: 11972263
    Abstract: Aspects of the disclosure are directed to methods, systems, and apparatuses using an instruction prefetch pipeline architecture that provides good performance without the complexity of a full cache coherent solution deployed in conventional CPUs. The architecture can include components which can be used to construct an instruction prefetch pipeline, including instruction memory (TiMem), instruction buffer (iBuf), a prefetch unit, and an instruction router.
    Type: Grant
    Filed: October 25, 2022
    Date of Patent: April 30, 2024
    Assignee: Google LLC
    Inventors: Rahul Nagarajan, Christopher Leary, Thejasvi Magudilu Vijayaraj, Thomas James Norrie
  • Patent number: 11882141
    Abstract: In some embodiments, a data platform receives information associated with activities within a network environment, generates a logical graph based on the information, stores data representative of the logical graph in a database, receives, in response to a user interaction with an interface of the data platform, a request to filter the information, in response to the request generates a query using a graph-based schema, and performs the generated query against the database.
    Type: Grant
    Filed: March 8, 2023
    Date of Patent: January 23, 2024
    Assignee: Lacework Inc.
    Inventors: Yijou Chen, Sanjay Kalra, Vikram Kapoor
  • Patent number: 11789736
    Abstract: A method for executing new instructions is provided. The method is used in a processor and includes: receiving an instruction; generating an unknown instruction exception when the received instruction is an unknown instruction; in response to the unknown instruction exception, executing the following steps through a conversion program: determining whether the received instruction is a new instruction; and converting the received instruction into at least one old instruction when the received instruction is a new instruction; and executing the at least one old instruction in the same execution mode as the received instruction.
    Type: Grant
    Filed: September 10, 2021
    Date of Patent: October 17, 2023
    Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.
    Inventors: Weilin Wang, Mengchen Yang, Yingbing Guan
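    Illustrative sketch (not from the patent): the trap-and-convert flow in the abstract above can be modeled in a few lines. The toy opcodes, the `NEW_TO_OLD` table, and the function names are all hypothetical; only the control flow (unknown-instruction exception, conversion check, execution of old instructions in the original mode) follows the abstract.

```python
# Hypothetical toy ISA; none of these mnemonics come from the patent.
OLD_ISA = {"ADD", "MUL"}
# Assumed conversion table: each "new" instruction maps to old instructions.
NEW_TO_OLD = {"FMA": ["MUL", "ADD"]}


class UnknownInstruction(Exception):
    """Raised by the decoder for any opcode outside OLD_ISA."""


def decode(opcode):
    if opcode not in OLD_ISA:
        raise UnknownInstruction(opcode)
    return [opcode]


def execute(opcode, mode="user"):
    """Return the (opcode, mode) pairs that would actually run."""
    try:
        ops = decode(opcode)
    except UnknownInstruction:
        # Conversion program: is the unknown opcode a supported new instruction?
        if opcode not in NEW_TO_OLD:
            raise  # genuinely invalid, so the exception propagates
        ops = NEW_TO_OLD[opcode]
    # The old instructions run in the same execution mode as the original.
    return [(op, mode) for op in ops]
```

    For example, `execute("FMA", "kernel")` yields the `MUL` and `ADD` steps, both tagged with the `kernel` mode.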
  • Patent number: 11775308
    Abstract: A cache system having cache sets, registers associated with the cache sets respectively, and a logic circuit coupled to a processor to control the cache sets according to the registers. When a connection to an address bus of the system receives a memory address from the processor, the logic circuit can be configured to: generate an extended tag from at least the memory address; and determine whether the generated extended tag matches with a first extended tag for a first cache set or a second extended tag for a second cache set of the system. The logic circuit can also be configured to implement a command received from the processor via the first cache set in response to the generated extended tag matching with the first extended tag and via the second cache set in response to the generated extended tag matching with the second extended tag.
    Type: Grant
    Filed: May 11, 2022
    Date of Patent: October 3, 2023
    Assignee: Micron Technology, Inc.
    Inventor: Steven Jeffrey Wallach
  • Patent number: 11748097
    Abstract: Extending fused multiply-add instructions, including: receiving an extended fused multiply-add (FMA) instruction comprising a first subset of bits indicating a corresponding register for each operand of a fused multiply-add (FMA) operation and a second subset of bits indicating a different register storing data describing one or more transformations applicable to one or more operands of the FMA operation; and performing, based on the extended FMA instruction.
    Type: Grant
    Filed: March 2, 2022
    Date of Patent: September 5, 2023
    Assignee: GHOST AUTONOMY INC.
    Inventors: John Hayes, Volkmar Uhlig
  • Patent number: 11693664
    Abstract: Methods and systems for distributing instructions amongst processing units in a processing pipeline are disclosed. A method includes compiling a set of instructions for a stage of a multistage programmable processing pipeline in which the stage of the multistage programmable processing pipeline includes multiple processing units configured to process instructions in parallel, wherein compiling the set of instructions includes identifying first and second subsets of instructions within the set of instructions that can be executed independent of each other, assigning the first subset of instructions to a first processing unit of the stage, assigning the second subset of instructions to a second processing unit of the stage, and executing the first and second subsets of instructions in parallel at the first and second processing units, respectively.
    Type: Grant
    Filed: July 2, 2021
    Date of Patent: July 4, 2023
    Assignee: PENSANDO SYSTEMS INC.
    Inventor: Jan Civlin
  • Patent number: 11593380
    Abstract: Techniques for generating a dataflow graph include generating a first dataflow graph with a plurality of first nodes representing first computer operations in processing data, with at least one of the first computer operations being a declarative operation that specifies one or more characteristics of one or more results of processing of data, and transforming the first dataflow graph into a second dataflow graph for processing data in accordance with the first computer operations, the second dataflow graph including a plurality of second nodes representing second computer operations, with at least one of the second nodes representing one or more imperative operations that implement the logic specified by the declarative operation, where the one or more imperative operations are unrepresented by the first nodes in the first dataflow graph.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: February 28, 2023
    Assignee: Ab Initio Technology LLC
    Inventors: Ian Schechter, Garth Dickie
  • Patent number: 11573910
    Abstract: An apparatus of a computing system, a computer-readable medium, a method and a system. The apparatus comprises processing circuitry including a core, and a communication controller coupled to the core to communicate with a memory of the computing system, wherein the memory is to define a leak zone corresponding to a plurality of memory addresses including data therein, the leak zone having an identifier; and the processing circuitry is to: decode instructions including a starting leak barrier, an ending leak barrier, and a sequence of code between the starting and ending leak barriers, the sequence of code including the identifier for the leak zone, the identifier to indicate the sequence of code is to be executed only on the data within the leak zone; and execute the sequence of code only on the data within the leak zone based on the leak barriers and on the identifier.
    Type: Grant
    Filed: August 22, 2019
    Date of Patent: February 7, 2023
    Assignee: Intel Corporation
    Inventor: Rodrigo Branco
  • Patent number: 11550587
    Abstract: An instruction processing device and an instruction processing method are disclosed. The instruction processing device includes: an instruction boundary prediction unit including circuitry configured to acquire an instruction packet of a variable-length instruction set and to add instruction prediction information to a plurality of instruction meta-fields in the instruction packet; and an instruction pipeline structure comprising an instruction fetch unit including an instruction boundary determination unit including circuitry configured to determine instruction boundary information according to the instruction prediction information to obtain one or more instructions in the instruction packet.
    Type: Grant
    Filed: July 17, 2020
    Date of Patent: January 10, 2023
    Assignee: C-SKY Microsystems Co., Ltd.
    Inventors: Chen Chen, Dongqi Liu, Tao Jiang, Chaojun Zhao
  • Patent number: 11467838
    Abstract: Systems, apparatuses, and methods for implementing a fastpath microcode sequencer are disclosed. A processor includes at least an instruction decode unit and first and second microcode units. For each received instruction, the instruction decode unit forwards the instruction to the first microcode unit if the instruction satisfies at least a first condition. In one implementation, the first condition is the instruction being classified as a frequently executed instruction. If a received instruction satisfies at least a second condition, the instruction decode unit forwards the received instruction to a second microcode unit. In one implementation, the first microcode unit is a smaller, faster structure than the second microcode unit. In one implementation, the second condition is the instruction being classified as an infrequently executed instruction.
    Type: Grant
    Filed: May 22, 2018
    Date of Patent: October 11, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Kai Troester, Magiting Talisayon, Hongwen Gao, Benjamin Floering, Emil Talpes
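    Illustrative sketch (not from the patent): the routing decision in the abstract above, under the "one implementation" reading in which frequently executed instructions go to the smaller, faster microcode unit. The capacity, the execution counts, and the names `build_fast_set` and `route` are assumptions made for this example.

```python
FAST_CAPACITY = 2  # assumed number of entries in the small, fast microcode unit


def build_fast_set(exec_counts, capacity=FAST_CAPACITY):
    """Pick the most frequently executed opcodes for the fast unit."""
    # Sort by descending count, breaking ties by name for determinism.
    ranked = sorted(exec_counts, key=lambda op: (-exec_counts[op], op))
    return set(ranked[:capacity])


def route(opcode, fast_set):
    """Decide which microcode unit the decode stage forwards to."""
    return "fast_ucode" if opcode in fast_set else "slow_ucode"
```

    With hypothetical counts `{"PUSHA": 900, "CPUID": 3, "REP_MOVS": 500}`, the two hot opcodes land in the fast unit and `CPUID` falls through to the larger one.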
  • Patent number: 11372648
    Abstract: A cache system having cache sets, registers associated with the cache sets respectively, and a logic circuit coupled to a processor to control the cache sets according to the registers. When a connection to an address bus of the system receives a memory address from the processor, the logic circuit can be configured to: generate an extended tag from at least the memory address; and determine whether the generated extended tag matches with a first extended tag for a first cache set or a second extended tag for a second cache set of the system. The logic circuit can also be configured to implement a command received from the processor via the first cache set in response to the generated extended tag matching with the first extended tag and via the second cache set in response to the generated extended tag matching with the second extended tag.
    Type: Grant
    Filed: January 26, 2021
    Date of Patent: June 28, 2022
    Assignee: Micron Technology, Inc.
    Inventor: Steven Jeffrey Wallach
  • Patent number: 11334491
    Abstract: In one embodiment, a microprocessor, comprising: an instruction cache configured to receive an instruction fetch comprising a first byte portion and a second byte portion; a side cache tag array configured to signal further processing of the second byte portion in addition to the first byte portion based on a hit of the side cache tag array; and a side cache data array configured to store instruction data for the second byte portion.
    Type: Grant
    Filed: November 18, 2020
    Date of Patent: May 17, 2022
    Assignee: CENTAUR TECHNOLOGY, INC.
    Inventors: Thomas C. McDonald, John Duncan
  • Patent number: 11093827
    Abstract: Using a processor and a memory at a worker machine, a gradient vector is computed corresponding to a set of weights associated with a set of nodes of a neural network instance being trained in the worker machine. In an ISA vector corresponding to the gradient vector, an ISA instruction is constructed corresponding to a gradient in a set of gradients in the gradient vector, wherein a data transmission of the ISA instruction is smaller as compared to a data transmission of the gradient. The ISA vector is transmitted from the worker machine to a parameter server, the ISA vector being responsive to one iteration of a training of the neural network instance, the ISA vector being transmitted instead of the gradient vector to reduce an amount of data transmitted from the worker machine to the parameter server for the one iteration of the training.
    Type: Grant
    Filed: September 20, 2017
    Date of Patent: August 17, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Minsik Cho, Ulrich A. Finkler
  • Patent number: 10996953
    Abstract: A computer processing system is provided. The computer processing system includes a processor configured to execute a record form instruction cracked into two internal instructions. A first one of the two internal instructions executes out-of-order to compute a target register and a second one of the two internal instructions executes in-order to compute a condition register (CR) to improve a processing speed of the record form instruction.
    Type: Grant
    Filed: September 5, 2019
    Date of Patent: May 4, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Brian J. D. Barrick, Maarten J. Boersma, Niels Fricke, Michael J. Genden
  • Patent number: 10824938
    Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to perform one or more machine learning operations, wherein the decode unit, based on parameters of the one or more machine learning operations, is to request a scheduler to schedule the one or more machine learning operations to one of an array of programmable compute units and a fixed function compute unit.
    Type: Grant
    Filed: April 24, 2017
    Date of Patent: November 3, 2020
    Assignee: Intel Corporation
    Inventors: Rajkishore Barik, Elmoustapha Ould-Ahmed-Vall, Xiaoming Chen, Dhawal Srivastava, Anbang Yao, Kevin Nealis, Eriko Nurvitadhi, Sara S. Baghsorkhi, Balaji Vembu, Tatiana Shpeisman, Ping T. Tang
  • Patent number: 10671913
    Abstract: The present disclosure provides a computation device and method, which are capable of using a single instruction to complete a transpose computation of a matrix of any size within constant time. Compared with conventional methods for performing a matrix transpose computation, the device and method may reduce the time complexity of a matrix transpose computation as well as making the usage of the computation simpler and more efficient.
    Type: Grant
    Filed: July 24, 2019
    Date of Patent: June 2, 2020
    Assignee: SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD
    Inventors: Shaoli Liu, Wei Li, Tian Zhi, Tianshi Chen
  • Patent number: 10402203
    Abstract: An apparatus comprises prediction circuitry (40, 100, 80) for determining, based on current prediction policy information (43, 82, 104), a predicted behavior to be used for processing instructions. The current prediction policy information is updated based on an outcome of processing of instructions. A storage structure (50) stores at least one entry identifying previous prediction policy information (60) for a corresponding block of instructions. In response to an instruction from a block having a corresponding entry in the storage structure (50) which identifies the previous prediction policy information (60), the current prediction policy information (43, 82, 104) can be reset based on the previous prediction policy information (60) identified in the corresponding entry of the storage structure (50).
    Type: Grant
    Filed: March 31, 2016
    Date of Patent: September 3, 2019
    Assignee: ARM Limited
    Inventors: Max John Batley, Simon John Craske, Ian Michael Caulfield, Peter Richard Greenhalgh, Allan John Skillman, Antony John Penton
  • Patent number: 9983873
    Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor mask bit compression in response to a single mask bit compression instruction that includes a source writemask register operand, a destination writemask register operand, and an opcode are described.
    Type: Grant
    Filed: May 31, 2016
    Date of Patent: May 29, 2018
    Assignee: Intel Corporation
    Inventors: Bret L. Toll, Robert Valentine, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney
  • Patent number: 9836254
    Abstract: An image forming apparatus that facilitates management of information security policy even for an extended application installed from exterior. A scanning unit scans an original to generate image data of the original. A printing unit prints an image based on image data. A management unit manages applications dynamically installed. At least one of the applications executes a job using at least one of the scanning unit and the printing unit. A setting unit sets an operation mode for the image forming apparatus, based on security settings that are received from an external apparatus. A determination unit determines whether each of the applications supports the security settings. A control unit restricts an operation of an application that the determination unit determines that the application does not support the security settings.
    Type: Grant
    Filed: June 26, 2015
    Date of Patent: December 5, 2017
    Assignee: CANON KABUSHIKI KAISHA
    Inventors: Naoki Tsuchitoi, Akari Yasukawa, Shota Shimizu
  • Patent number: 9747216
    Abstract: A computer processor including a first memory structure that operates over multiple cycles to temporarily store operands referenced by at least one instruction. A plurality of functional units performs operations that produce and access operands stored in the first memory structure. A second memory structure is provided, separate from the first memory structure. The second memory structure is configured as a dedicated memory for storage of operands copied from the first memory structure. The second memory structure is organized with a byte-addressable memory space and each operand stored in the second memory structure is accessed by a given byte address into the byte-addressable memory space.
    Type: Grant
    Filed: June 23, 2014
    Date of Patent: August 29, 2017
    Assignee: Mill Computing, Inc.
    Inventors: Roger Rawson Godard, Arthur David Kahlich, Sebastien Paul Maurice Mirolo, David Arthur Yost
  • Patent number: 9672033
    Abstract: Techniques are described for loading decoded instructions and super-set instructions in a memory for later access. For loading a decoded instruction, the decoded instruction is a transformed form of an original instruction that was stored in the program memory. The transformation is from an encoded assembly level format to a binary machine level format. In one technique, the transformation mechanism is invoked by a transform and load instruction that causes an instruction retrieved from program memory to be transformed into a new language format and then loaded into a transformed instruction memory. The format of the transformed instruction may be optimized to the implementation requirements, such as improving critical path timing. The transformation of instructions may extend to other needs beyond timing path improvement, for example, requiring super-set instructions for increased functionality and improvements to instruction level parallelism.
    Type: Grant
    Filed: January 8, 2009
    Date of Patent: June 6, 2017
    Assignee: Altera Corporation
    Inventors: Gerald George Pechanek, Larry D. Larsen
  • Patent number: 9495001
    Abstract: In an embodiment, a processor includes a plurality of cores each to independently execute instructions, a power delivery logic coupled to the plurality of cores, and a power controller including a first logic to cause a first core to enter into a first low power state of an operating system power management scheme independently of the OS, during execution of at least one thread on the first core. Other embodiments are described and claimed.
    Type: Grant
    Filed: August 21, 2013
    Date of Patent: November 15, 2016
    Assignee: Intel Corporation
    Inventors: Ankush Varma, Krishnakanth V. Sistla, Allen W. Chu, Ian M. Steiner
  • Patent number: 9459893
    Abstract: A computer-implementable method includes providing an instruction set architecture that comprises features to generate diverse copies of a program, using the instruction set architecture to generate diverse copies of a program and providing a virtual machine for execution of one of the diverse copies of the program. Various exemplary methods, devices, systems, etc., use virtualization for diversifying code and/or virtual machines to thereby enhance software security.
    Type: Grant
    Filed: November 11, 2013
    Date of Patent: October 4, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Bertrand Anckaert, Mariusz H. Jakubowski, Ramarathnam Venkatesan
  • Patent number: 9430237
    Abstract: A central processing unit includes a register file having a plurality of read ports, a first execution unit having a first plurality of input ports, and logic operable to selectively couple different arrangements of the read ports to the input ports. A method for reading operands from a register file having a plurality of read ports by a first execution unit having a first plurality of input ports includes scheduling an instruction for execution by the first execution unit and selectively coupling a particular arrangement of the read ports to the input ports based on a type of the instruction.
    Type: Grant
    Filed: September 29, 2011
    Date of Patent: August 30, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jeffrey P. Rupley, Dimitri Tan
  • Patent number: 9411594
    Abstract: A processor includes: an arithmetic unit configured to execute instructions; an instruction decode part configured to decode the instructions executed in the arithmetic unit and to output opcodes; and an interrupt register configured to receive interrupt signals, wherein the instruction decode part includes an instruction code map that stores the opcodes in correspondence to instructions and outputs the opcodes in accordance with the instructions inputted, and the instruction code map stores a plurality of sets of opcodes to be output as switch opcodes corresponding to additional instructions, the additional instructions are a part of the instructions, and switches the sets of the switch opcodes in accordance with the interrupt signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: August 9, 2016
    Assignee: Cypress Semiconductor Corporation
    Inventor: Masayuki Tsuji
  • Patent number: 9378022
    Abstract: A method for performing predecode-time optimized instructions in conjunction with predecode time optimized instruction sequence caching. The method includes receiving a first instruction of an instruction sequence and a second instruction of the instruction sequence and determining if the first instruction and the second instruction can be optimized. In response to the determining that the first instruction and second instruction can be optimized, the method includes performing a pre-decode optimization on the instruction sequence and generating a new second instruction, wherein the new second instruction is not dependent on a target operand of the first instruction and storing a pre-decoded first instruction and a pre-decoded new second instruction in an instruction cache. In response to determining that the first instruction and second instruction cannot be optimized, the method includes storing the pre-decoded first instruction and a pre-decoded second instruction in the instruction cache.
    Type: Grant
    Filed: December 9, 2013
    Date of Patent: June 28, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
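    Illustrative sketch (not from the patent): one way a predecode stage could rewrite the second instruction of a pair so that it no longer depends on the first instruction's target. The tuple encoding `(opcode, target, source, immediate)`, the toy `addi` opcode, and the function name are assumptions; immediate-width limits are ignored for brevity.

```python
def predecode_pair(first, second):
    """Each instruction is (opcode, target_reg, source_reg, immediate)."""
    can_optimize = (
        first[0] == "addi"
        and second[0] == "addi"
        and second[2] == first[1]   # second reads the first's target
        and first[2] != first[1]    # first does not overwrite its own source
    )
    if can_optimize:
        # Fold both immediates so the second reads the first's source directly.
        new_second = ("addi", second[1], first[2], first[3] + second[3])
        return first, new_second
    # Not optimizable: both instructions are cached unchanged.
    return first, second
```

    Given `addi r3, r1, 4` followed by `addi r4, r3, 8`, the sketch keeps the first instruction and emits `addi r4, r1, 12`, which can issue without waiting for `r3`.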
  • Patent number: 9354888
    Abstract: A method for performing predecode-time optimized instructions in conjunction with predecode time optimized instruction sequence caching. The method includes receiving a first instruction of an instruction sequence and a second instruction of the instruction sequence and determining if the first instruction and the second instruction can be optimized. In response to the determining that the first instruction and second instruction can be optimized, the method includes performing a pre-decode optimization on the instruction sequence and generating a new second instruction, wherein the new second instruction is not dependent on a target operand of the first instruction and storing a pre-decoded first instruction and a pre-decoded new second instruction in an instruction cache. In response to determining that the first instruction and second instruction cannot be optimized, the method includes storing the pre-decoded first instruction and a pre-decoded second instruction in the instruction cache.
    Type: Grant
    Filed: March 28, 2012
    Date of Patent: May 31, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michael K. Gschwind, Valentina Salapura
  • Patent number: 9354850
    Abstract: A method for scheduling loop processing of a reconfigurable processor includes generating a dependence graph of instructions for the loop processing; mapping a first register file of the reconfigurable processor on an arrow indicating inter-iteration dependence on the dependence graph; and searching for schedules of the instructions based on the mapping result.
    Type: Grant
    Filed: October 6, 2014
    Date of Patent: May 31, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Min-wook Ahn, Won-sub Kim, Tai-song Jin, Seung-won Lee, Jin-seok Lee
  • Patent number: 9250870
    Abstract: A computing device determines a first set of functions in a first file having a first bit length. The computing device performs a first test, which includes executing a first test program capable of testing each function in the first set of functions. The computing device determines whether the first test was successful, and based on determining that the first test was successful, creates a shim program based on the first set of functions. The computing device creates a second file based on the first set of functions, the second file having a second bit length. The computing device performs a second test, which includes executing a second test program capable of testing each function in the second file. The computing device determines whether the second test was successful, and based on determining that the second test was successful, publishes one or more of the shim program and second file.
    Type: Grant
    Filed: March 14, 2014
    Date of Patent: February 2, 2016
    Assignee: International Business Machines Corporation
    Inventors: John E. Moore, Jr., Jeffrey K. Price, Robert R. Wentworth, Stanley C. Wood
  • Patent number: 9218286
    Abstract: Methods and apparatuses for processing partial write requests in a system cache within a memory controller. When a write request that updates a portion of a cache line misses in the system cache, the write request writes the data to the system cache without first reading the corresponding cache line from memory. The system cache includes error correction code bits which are redefined as word mask bits when a cache line is in a partial dirty state. When a read request hits on a partial dirty cache line, the partial data is written to memory using a word mask. Then, the corresponding full cache line is retrieved from memory and stored in the system cache.
    Type: Grant
    Filed: September 27, 2012
    Date of Patent: December 22, 2015
    Assignee: Apple Inc.
    Inventors: Sukalpa Biswas, Shinye Shiu
  • Patent number: 9189241
    Abstract: A method is provided for dynamically determining which instructions from a plurality of available instructions to issue in each clock cycle in a multithreaded processor capable of issuing a plurality of instructions in each clock cycle. The method includes the steps of: determining a highest priority instruction from the plurality of available instructions; determining the compatibility of the highest priority instruction with each of the remaining available instructions; and issuing the highest priority instruction together with other instructions compatible with the highest priority instruction in the same clock cycle. The highest priority instruction cannot be a speculative instruction. The effect of this method is that speculative instructions are only ever issued together with at least one non-speculative instruction.
    Type: Grant
    Filed: September 11, 2009
    Date of Patent: November 17, 2015
    Assignee: Imagination Technologies Limited
    Inventor: Andrew Webber
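    Illustrative sketch (not from the patent): the three steps in the abstract above, with "compatibility" modeled as needing a different execution unit. That compatibility rule, the `Instr` fields, and the extra check that co-issued instructions do not clash with each other are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    priority: int            # higher value issues first (assumed convention)
    unit: str                # execution unit the instruction needs
    speculative: bool = False


def select_issue(available):
    """Choose the instructions to issue in one clock cycle."""
    # Step 1: the lead instruction must be non-speculative.
    candidates = [i for i in available if not i.speculative]
    if not candidates:
        return []
    lead = max(candidates, key=lambda i: i.priority)
    issued = [lead]
    used_units = {lead.unit}
    # Steps 2 and 3: add compatible instructions, highest priority first.
    for instr in sorted(available, key=lambda i: -i.priority):
        if instr is lead or instr.unit in used_units:
            continue
        issued.append(instr)
        used_units.add(instr.unit)
    return issued
```

    Because the lead instruction is always non-speculative, any speculative instruction in the returned list is issued alongside at least one non-speculative one, which is the property the abstract highlights.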
  • Patent number: 9104426
    Abstract: A LIW processor comprises multiple execution units. The multiple execution units of the processor are divided into groups, and an input instruction word can contain instructions for one execution unit in each of the groups. The processor is optimized for use in signal processing operations, in that the multiple execution units of the processor are divided into groups which do not place significant restrictions on the desirable uses of the processor, because it has been determined that, in signal processing applications, it is not usually necessary for certain execution units to operate simultaneously. These execution units can therefore be grouped together, in such a way that only one of them can operate at a particular time, without significantly impacting on the operation of the device. An array is formed from multiple interconnected processors of this type.
    Type: Grant
    Filed: November 1, 2007
    Date of Patent: August 11, 2015
    Assignee: Intel Corporation
    Inventors: Andrew Duller, Gajinder Singh Panesar, Peter Claydon, William Robbins, Andrew Kuligowski, Olfat Younis
  • Patent number: 9098221
    Abstract: An image forming apparatus that facilitates management of information security policy even for an extended application installed from exterior. A scanning unit scans an original to generate image data of the original. A printing unit prints an image based on image data. A management unit manages applications dynamically installed. At least one of the applications executes a job using at least one of the scanning unit and the printing unit. A setting unit sets an operation mode for the image forming apparatus, based on security settings that are received from an external apparatus. A determination unit determines whether each of the applications supports the security settings. A control unit restricts an operation of an application that the determination unit determines that the application does not support the security settings.
    Type: Grant
    Filed: January 24, 2014
    Date of Patent: August 4, 2015
    Assignee: CANON KABUSHIKI KAISHA
    Inventors: Naoki Tsuchitoi, Akari Yasukawa, Shota Shimizu
  • Patent number: 9043576
    Abstract: System and method for conversion of virtual machine files without requiring copying of the virtual machine payload (data) from one location to another location. By eliminating this step, applicant's invention significantly enhances the efficiency of the conversion process. In one embodiment, a file system or storage system provides indirections to locations of data elements stored on a persistent storage media. A source virtual machine file includes hypervisor metadata (HM) data elements in one hypervisor file format, and virtual machine payload (VMP) data elements.
    Type: Grant
    Filed: August 21, 2013
    Date of Patent: May 26, 2015
    Assignee: SimpliVity Corporation
    Inventors: Jesse St. Laurent, James E. King, III
  • Publication number: 20150082009
    Abstract: A processing system and method includes a predecoder configured to identify instructions that are combinable to form a single executable internal instruction. Instruction storage is configured to merge instructions that are combinable. An instruction execution unit is configured to execute the single, executable internal instruction on a hardware wide datapath.
    Type: Application
    Filed: November 18, 2014
    Publication date: March 19, 2015
    Inventors: Michael Gschwind, Balaram Sinharoy
  • Patent number: 8984260
    Abstract: A circuit arrangement, method, and program product for substituting a plurality of scalar instructions in an instruction stream with a functionally equivalent vector instruction for execution by a vector execution unit. Predecode logic is coupled to an instruction buffer which stores instructions in an instruction stream to be executed by the vector execution unit. The predecode logic analyzes the instructions passing through the instruction buffer to identify a plurality of scalar instructions that may be replaced by a vector instruction in the instruction stream. The predecode logic may generate the functionally equivalent vector instruction based on the plurality of scalar instructions, and the functionally equivalent vector instruction may be substituted into the instruction stream, such that the vector execution unit executes the vector instruction in lieu of the plurality of scalar instructions.
    Type: Grant
    Filed: December 20, 2011
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
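    Example sketch: The substitution described in the abstract above can be illustrated in software, under several assumptions that are not taken from the patent: instructions are modelled as (opcode, dst, src1, src2) tuples, the vector unit has four lanes (VECTOR_WIDTH), and a group is only fused when all opcodes match, all destinations differ, and no source register is also a destination in the group. That last rule is deliberately conservative. The names fuse_scalars and can_fuse are hypothetical.

```python
VECTOR_WIDTH = 4  # assumed number of vector lanes; not stated in the abstract


def can_fuse(group):
    """Return True if the scalar instructions can run as one vector op.

    Conservative rule: same opcode, distinct destinations, and no source
    register that is also written inside the group.
    """
    opcodes = {op for op, _, _, _ in group}
    dsts = [dst for _, dst, _, _ in group]
    srcs = {s for _, _, a, b in group for s in (a, b)}
    return (
        len(opcodes) == 1
        and len(set(dsts)) == len(dsts)
        and srcs.isdisjoint(dsts)
    )


def fuse_scalars(stream):
    """Scan an instruction buffer and replace fusable groups with a vector op."""
    out = []
    i = 0
    while i < len(stream):
        group = stream[i:i + VECTOR_WIDTH]
        if len(group) == VECTOR_WIDTH and can_fuse(group):
            op = group[0][0]
            out.append((
                "v" + op,                      # hypothetical vector mnemonic
                tuple(g[1] for g in group),    # destination lanes
                tuple(g[2] for g in group),    # first-source lanes
                tuple(g[3] for g in group),    # second-source lanes
            ))
            i += VECTOR_WIDTH
        else:
            out.append(stream[i])
            i += 1
    return out
```

    Four independent scalar "add" tuples in a row come out as a single "vadd" tuple, while a group in which one instruction reads another's result is passed through unchanged.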
  • Patent number: 8943300
    Abstract: An apparatus for emulating the branch prediction behavior of an explicit subroutine call is disclosed. The apparatus includes a first input which is configured to receive an instruction address and a second input. The second input is configured to receive predecode information which describes the instruction address as being related to an implicit subroutine call to a subroutine. The apparatus also includes an adder configured, in response to the predecode information, to add a constant to the instruction address defining a return address, causing the return address to be stored to an explicit subroutine resource, thus facilitating subsequent branch prediction of a return call instruction.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: January 27, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Brian Michael Stempel, James Norris Dieffenderfer, Thomas Andrew Sartorius, Rodney Wayne Smith
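    Example sketch: The adder-plus-storage behaviour described in the abstract above can be modelled roughly as follows. This is an illustration under assumptions, not the patented circuit: the "explicit subroutine resource" is modelled as a small return-address stack (LinkStack), the added constant RETURN_OFFSET is set to 8 arbitrarily, and the predecode information is reduced to a single boolean flag. All names are hypothetical.

```python
RETURN_OFFSET = 8  # assumed constant added by the adder; not from the abstract


class LinkStack:
    """Hypothetical return-address stack used by the branch predictor."""

    def __init__(self, depth=8):
        self.depth = depth
        self.entries = []

    def push(self, addr):
        # Drop the oldest entry when full, as a bounded hardware stack would.
        if len(self.entries) == self.depth:
            self.entries.pop(0)
        self.entries.append(addr)

    def pop(self):
        # Predicted target for a return; None when nothing was recorded.
        return self.entries.pop() if self.entries else None


def on_fetch(addr, is_implicit_call, link_stack):
    """If predecode flags an implicit call, record addr + constant as the return."""
    if is_implicit_call:
        link_stack.push(addr + RETURN_OFFSET)
```

    A later return instruction would then call pop() to obtain its predicted target, the same way it would after an explicit call.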
  • Patent number: 8935512
    Abstract: The efficiency of processor instruction set design can be increased and the workload on designers investigating an instruction set reduced. An instruction operation code generation system includes an operation code bit width decision means, an instruction sorting means, and an operation code value decision means. The operation code bit width decision means decides a bit width that can be assigned for an operation code of each instruction according to specification data associated with a processor instruction set. The instruction sorting means sorts the instructions according to the operation code bit width. The operation code value decision means decides the value of the operation code of each instruction.
    Type: Grant
    Filed: November 19, 2007
    Date of Patent: January 13, 2015
    Assignee: NEC Corporation
    Inventor: Takahiro Kumura
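    Example sketch: The three steps named in the abstract above (decide a width, sort by width, decide a value) can be illustrated with a canonical prefix-free code assignment. This is a swapped-in textbook technique chosen for illustration, not necessarily the method claimed in the patent. Assumptions: a fixed 8-bit instruction word (INSTRUCTION_BITS), an opcode width equal to the bits left over after the operand fields, and hypothetical function names.

```python
INSTRUCTION_BITS = 8  # assumed fixed instruction width


def opcode_widths(operand_bits):
    """Step 1: opcode width is whatever the operand fields leave free."""
    return {name: INSTRUCTION_BITS - bits for name, bits in operand_bits.items()}


def assign_opcodes(widths):
    """Steps 2 and 3: sort by width, then hand out prefix-free opcode values."""
    ordered = sorted(widths.items(), key=lambda kv: (kv[1], kv[0]))
    max_w = ordered[-1][1]
    # Kraft inequality: the requested widths must fit in the opcode space.
    if sum(2 ** (max_w - w) for _, w in ordered) > 2 ** max_w:
        raise ValueError("opcode space too small for these widths")
    codes = {}
    code = 0
    prev = ordered[0][1]
    for name, w in ordered:
        code <<= (w - prev)          # widen the running value when width grows
        codes[name] = format(code, "0{}b".format(w))
        code += 1
        prev = w
    return codes
```

    For instance, operand fields of 6, 6, 5, 5 and 5 bits for add, sub, ld, nop and st give opcode widths of 2, 2, 3, 3 and 3, and no assigned opcode is a prefix of another.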
  • Patent number: 8924689
    Abstract: The present invention realizes efficient superscalar instruction issue and low power consumption for an instruction set that includes instructions with prefixes. An instruction fetch unit determines whether an instruction code is a prefix code or some other instruction code, and outputs the result of the determination together with the 16-bit instruction code. Decoders that decode the instruction code based on the result of the determination and decoders that decode the prefix code are disposed separately. Further, a prefix is supplied to each decoder before the fixed-length (for example, 16-bit) instruction code that it modifies. A fixed-length instruction code following the prefix code is supplied to the decoder in the same pipeline as the decoder for the prefix code.
    Type: Grant
    Filed: October 29, 2010
    Date of Patent: December 30, 2014
    Assignee: Renesas Electronics Corporation
    Inventors: Hiroaki Nakaya, Yuki Kondoh, Makoto Ishikawa
  • Patent number: 8898437
    Abstract: A predecode repair cache is described in a processor capable of fetching and executing variable length instructions having instructions of at least two lengths which may be mixed in a program. An instruction cache is operable to store in an instruction cache line instructions having at least a first length and a second length, the second length longer than the first length. A predecoder is operable to predecode instructions fetched from the instruction cache that have invalid predecode information to form repaired predecode information. A predecode repair cache is operable to store the repaired predecode information associated with instructions of the second length that span across two cache lines in the instruction cache. Methods for filling the predecode repair cache and for executing an instruction that spans across two cache lines are also described.
    Type: Grant
    Filed: November 2, 2007
    Date of Patent: November 25, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Rodney Wayne Smith, Brian Michael Stempel, David John Mandzak, James Norris Dieffenderfer
  • Patent number: 8880851
    Abstract: A microprocessor includes a hardware instruction translator that translates x86 ISA and ARM ISA machine language program instructions into microinstructions, which are encoded in a distinct manner from the x86 and ARM instructions. An execution pipeline executes the microinstructions to generate x86/ARM-defined results. The microinstructions are distinct from the results generated by the execution of the microinstructions by the execution pipeline. The translator directly provides the microinstructions to the execution pipeline for execution. Each time the microprocessor performs one of the x86 ISA and ARM ISA instructions, the translator translates it into the microinstructions. An indicator indicates either x86 or ARM as a boot ISA. After reset, the microprocessor initializes its architectural state, fetches its first instructions from a reset address, and translates them all as defined by the boot ISA. An instruction cache caches the x86 and ARM instructions and provides them to the translator.
    Type: Grant
    Filed: September 1, 2011
    Date of Patent: November 4, 2014
    Assignee: VIA Technologies, Inc.
    Inventors: G. Glenn Henry, Terry Parks, Rodney E. Hooker
  • Publication number: 20140317384
    Abstract: A hierarchical cache with at least a unified cache is used to store both instructions and data values, and a further cache coupled between processing circuitry and a unified cache. The unified cache has a plurality of cache lines identified as an instruction cache line or a data cache line. Each data cache line stores at least one data value and the associated information. Pre-decode circuitry is associated with the unified cache and performs a first pre-decode operation on a received instruction for that instruction cache line in order to generate a corresponding partially pre-decoded instruction for storing in the instruction cache line. Further pre-decode circuitry is associated with the further cache, and, when a partially pre-decoded instruction is routed to the further cache, performs a further pre-decode operation on the partially pre-decoded instruction to generate a corresponding pre-decoded instruction for storage in the further cache.
    Type: Application
    Filed: April 23, 2013
    Publication date: October 23, 2014
    Applicant: ARM Limited
    Inventor: Peter Richard GREENHALGH
  • Patent number: 8850410
    Abstract: A system and method for improving software maintainability, performance, and/or security by associating a unique marker with each software code-block; the system comprising a plurality of processors, a plurality of code-blocks, and a marker associated with each code-block. The system may also include a special hardware register (code-block marker hardware register) in each processor for identifying the markers of the code-blocks executed by the processor, without changing any of the plurality of code-blocks.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: September 30, 2014
    Assignee: International Business Machines Corporation
    Inventors: Ramanjaneya S. Burugula, Joefon Jann, Pratap C. Pattnaik
  • Publication number: 20140244976
    Abstract: Various techniques for processing and pre-decoding branches within an IT instruction block. Instructions are fetched and cached in an instruction cache, and pre-decode bits are generated to indicate the presence of an IT instruction and the likely boundaries of the IT instruction block. If an unconditional branch is detected within the likely boundaries of an IT instruction block, the unconditional branch is treated as if it were a conditional branch. The unconditional branch is sent to the branch direction predictor and the predictor generates a branch direction prediction for the unconditional branch.
    Type: Application
    Filed: February 22, 2013
    Publication date: August 28, 2014
    Applicant: APPLE INC.
    Inventors: Shyam Sundar, Ian D. Kountanis, Conrado Blasco-Allue, Gerard R. Williams, III, Wei-Han Lien, Ramesh B. Gunna
  • Patent number: 8793470
    Abstract: A method, apparatus and system are disclosed for decoding an instruction in a variable-length instruction set. The instruction is one of a set of new types of instructions that uses a new escape code value, which is two bytes in length, to indicate that a third opcode byte includes the instruction-specific opcode for a new instruction. The new instructions are defined such that the length of each instruction in the opcode map for one of the new escape opcode values may be determined using the same set of inputs, where each of the inputs is relevant to determining the length of each instruction in the new opcode map. For at least one embodiment, the length of one of the new instructions is determined without evaluating the instruction-specific opcode.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: July 29, 2014
    Assignee: Intel Corporation
    Inventors: James S. Coke, Peter J. Ruscito, Masood Tahir, David B. Jackson, Ves A. Naydenov, Scott D. Rodgers, Bret L. Toll, Frank Binns
  • Patent number: 8788795
    Abstract: A wake-and-go mechanism may be a programming idiom accelerator. As a processor fetches instructions, the programming idiom accelerator may look ahead to determine whether a programming idiom is coming up in the instruction stream. If the programming idiom accelerator recognizes a programming idiom, the programming idiom accelerator may perform an action to accelerate execution of the programming idiom. In the case of a wake-and-go programming idiom, the programming idiom accelerator may record an entry in a wake-and-go array, for example.
    Type: Grant
    Filed: February 1, 2008
    Date of Patent: July 22, 2014
    Assignee: International Business Machines Corporation
    Inventors: Ravi K. Arimilli, Satya P. Sharma, Randal C. Swanberg
  • Patent number: 8782435
    Abstract: A processor comprising: an instruction processing pipeline, configured to receive a sequence of instructions for execution, said sequence comprising at least one instruction including a flow control instruction which terminates the sequence; a hash generator, configured to generate a hash associated with execution of the sequence of instructions; a memory configured to securely receive a reference signature corresponding to a hash of a verified corresponding sequence of instructions; verification logic configured to determine a correspondence between the hash and the reference signature; and authorization logic configured to selectively produce a signal, in dependence on a degree of correspondence of the hash with the reference signature.
    Type: Grant
    Filed: July 15, 2011
    Date of Patent: July 15, 2014
    Assignee: The Research Foundation for The State University of New York
    Inventor: Kanad Ghose
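    Example sketch: The hash-and-compare flow in the abstract above can be illustrated in software. The following is a sketch under assumptions rather than the patented hardware: SHA-256 stands in for the hash generator, instructions are plain text strings, the set of flow control mnemonics is invented for the example, and authorization requires an exact match (the abstract speaks more generally of a "degree of correspondence"). All function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical mnemonics treated as sequence-terminating flow control.
FLOW_CONTROL = {"jmp", "call", "ret", "beq"}


def split_sequences(instructions):
    """Yield instruction sequences, each ending at a flow control instruction."""
    seq = []
    for ins in instructions:
        seq.append(ins)
        if ins.split()[0] in FLOW_CONTROL:
            yield seq
            seq = []
    if seq:
        yield seq  # trailing instructions with no terminating branch


def sequence_hash(seq):
    """Hash the sequence; SHA-256 is an assumption, not taken from the patent."""
    return hashlib.sha256("\n".join(seq).encode()).digest()


def authorize(seq, reference_signature):
    """Signal authorization only if the hash matches the reference exactly."""
    return hmac.compare_digest(sequence_hash(seq), reference_signature)
```

    Here the reference signature would be computed ahead of time from a verified copy of the same sequence and held in protected storage; a sequence that has been altered in any byte fails the comparison.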
  • Patent number: 8782434
    Abstract: A pipelined processor comprising a cache memory system, fetching instructions for execution from a portion of said cache memory system, an instruction commencing processing before a digital signature of the cache line that contained the instruction is verified against a reference signature of the cache line, the verification being done at the point of decoding, dispatching, or committing execution of the instruction, the reference signature being stored in an encrypted form in the processor's memory, and the key for decrypting said reference signature being stored in a secure storage location. The instruction processing proceeds when the two signatures exactly match, and further instruction processing is suspended or modified on a mismatch of the two signatures.
    Type: Grant
    Filed: July 15, 2011
    Date of Patent: July 15, 2014
    Assignee: The Research Foundation for the State University of New York
    Inventor: Kanad Ghose
  • Publication number: 20140095835
    Abstract: A method for performing predecode-time optimized instructions in conjunction with predecode-time optimized instruction sequence caching. The method includes receiving a first instruction of an instruction sequence and a second instruction of the instruction sequence and determining if the first instruction and the second instruction can be optimized. In response to determining that the first instruction and second instruction can be optimized, the method includes performing a pre-decode optimization on the instruction sequence and generating a new second instruction, wherein the new second instruction is not dependent on a target operand of the first instruction, and storing a pre-decoded first instruction and a pre-decoded new second instruction in an instruction cache. In response to determining that the first instruction and second instruction cannot be optimized, the method includes storing the pre-decoded first instruction and a pre-decoded second instruction in the instruction cache.
    Type: Application
    Filed: December 9, 2013
    Publication date: April 3, 2014
    Applicant: International Business Machines Corporation
    Inventors: Michael K. Gschwind, Valentina Salapura
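    Example sketch: One simple instance of the pair optimization described in the abstract above is forwarding the source of a register move into the instruction that follows it, so that the second instruction no longer waits on the first instruction's target. The choice of a register move as the optimizable pattern, the tuple encoding (opcode, dst, sources) and the name optimize_pair are all assumptions made for this illustration; the patent may cover other patterns.

```python
def optimize_pair(first, second):
    """Return the pair to store in the instruction cache.

    If `first` is a register move ("mr") and `second` reads the move's
    target, rewrite `second` to read the move's source instead, removing
    its dependence on the target operand of `first`.
    """
    op1, dst1, srcs1 = first
    op2, dst2, srcs2 = second
    if op1 != "mr" or dst1 not in srcs2:
        return first, second  # not optimizable: store both unchanged
    new_srcs = tuple(srcs1[0] if s == dst1 else s for s in srcs2)
    return first, (op2, dst2, new_srcs)
```

    With first = ("mr", "r4", ("r3",)) and second = ("add", "r5", ("r4", "r6")), the new second instruction reads r3 and r6 directly. The first instruction is still kept, because later code may read r4.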
  • Patent number: 8650386
    Abstract: A data processor includes a first register file including registers, a second register file including registers, a number of which is larger than that of the registers of the first register file, an instruction decoder and an operation unit. The instruction decoder decodes an instruction described in first and second instruction formats. The first instruction format includes a first register-addressing field for designating the first register file. The second instruction format includes a second register-addressing field for designating the second register file, a size of which is larger than that of the first register-addressing field. The operation unit executes an instruction described in the first and second instruction formats using operand data stored in the first and second register files, respectively, based on the instruction decoder, and executes operations in parallel, a number of which is determined by a certain field included in the second instruction format.
    Type: Grant
    Filed: April 12, 2013
    Date of Patent: February 11, 2014
    Assignee: Panasonic Corporation
    Inventors: Takeshi Kishida, Masaitsu Nakajima