Patents Examined by Aimee Li
  • Patent number: 10409612
    Abstract: An apparatus and method is described herein for providing robust speculative code section abort control mechanisms. Hardware is able to track speculative code region abort events, conditions, and/or scenarios, such as an explicit abort instruction, a data conflict, a speculative timer expiration, a disallowed instruction attribute or type, etc. And hardware, firmware, software, or a combination thereof makes an abort determination based on the tracked abort events. As an example, hardware may make an initial abort determination based on one or more predefined events or choose to pass the event information up to a firmware or software handler to make such an abort determination. Upon determining an abort of a speculative code region is to be performed, hardware, firmware, software, or a combination thereof performs the abort, which may include following a fallback path specified by hardware or software.
    Type: Grant
    Filed: December 26, 2015
    Date of Patent: September 10, 2019
    Assignee: Intel Corporation
    Inventors: Martin G. Dixon, Ravi Rajwar, Konrad K. Lai, Robert S. Chappell, Rajesh S. Parthasarathy, Alexandre J. Farcy, Ilhyun Kim, Prakash Math, Matthew Merten, Vijaykumar Kadgi
  • Patent number: 10409599
    Abstract: A method including fetching a group of instructions, where the group of instructions is configured to execute atomically by a processor is provided. The method further includes decoding at least one of a first instruction or a second instruction, where: (1) decoding the first instruction results in a processing of information about a group of instructions, including information about a size of the group of instructions, and (2) decoding the second instruction results in a processing of at least one of: (a) a reference to a memory location having the information about the group of instructions, including information about the size of the group of instructions or (b) a processor status word having information about the group of instructions, including information about the size of the group of instructions.
    Type: Grant
    Filed: June 26, 2015
    Date of Patent: September 10, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jan Gray, Doug Burger, Aaron Smith
  • Patent number: 10402199
    Abstract: One embodiment of this invention provides two conditional execution auxiliary instructions directed to disparate subsets of the plural functional units. Depending on the conditional execution desired, only one of the two conditional execution auxiliary instructions may be required for a particular execute packet. Another embodiment of this invention employs only one of two possible register files for the condition registers. In a VLIW processor it may be advantageous to split the functional units into separate sets with corresponding register files. This limits the number of functional units that may simultaneously access the register files. In the preferred embodiment of this invention the functional units are divided into a scalar set which access scalar registers and a vector set which access vector registers. The data registers storing the conditions for both scalar and vector instructions are in the scalar data register file.
    Type: Grant
    Filed: October 22, 2015
    Date of Patent: September 3, 2019
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Timothy David Anderson, Duc Quang Bui, Joseph Raymond Michael Zbiciak
  • Patent number: 10402196
    Abstract: A logic circuit in a processor including a plurality of input registers, each for storing a vector containing data elements, a coefficient register for storing a vector containing N coefficients, an output register for storing a result vector, and an arithmetic unit configured to: obtain a pattern for selecting N data elements from the plurality of input registers, select a plurality of groups of N data elements from the plurality of input registers in parallel, wherein each group is selected in accordance with the pattern, and wherein each group is shifted with respect to a previous selected group, perform an arithmetic operation between each of the selected groups and the coefficients in parallel, and store results of the arithmetic operations in the output register.
    Type: Grant
    Filed: May 11, 2015
    Date of Patent: September 3, 2019
    Assignee: Ceva D.S.P. Ltd.
    Inventors: Roni M. Sadeh, Noam Dvoretzki
  • Patent number: 10402200
    Abstract: Embodiments include a micro BTB, which can predict up to two branches per cycle, every cycle, with zero bubble insertion on either a taken or not taken prediction, thereby significantly improving performance and reducing power consumption of a microprocessor. A front end of a microprocessor can include a main front end logic section having a main BTB, a micro BTB to produce prediction information, and a decoupling queue. The micro BTB can include a graph having multiple entries, and a CAM having multiple items. Each of the entries of the graph can include a link pointer to a next branch in a taken direction, and a link pointer to a next branch in a not-taken direction. The micro BTB can insert a hot branch into the graph as a new seed.
    Type: Grant
    Filed: February 18, 2016
    Date of Patent: September 3, 2019
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: James David Dundas, Gerald David Zuraski, Jr., Timothy Russell Snyder
  • Patent number: 10394569
    Abstract: Managing exception handling. A plurality of instruction units of an instruction stream are selected to be decoded in parallel by a plurality of instruction decode units of a processor. The plurality of instruction units includes a prefix instruction and a prefixed instruction. The prefixed instruction is an instruction to be modified by the prefix instruction. An exception condition associated with the prefixed instruction is determined. Exception handling is performed for the prefixed instruction, in which the performing includes determining an address at which to restart execution of the instruction stream. The determining the address includes adjusting the address at which to restart execution based on the prefix instruction to be separately decoded by an instruction decode unit.
    Type: Grant
    Filed: November 14, 2015
    Date of Patent: August 27, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 10394568
    Abstract: Managing exception handling. A plurality of instruction units of an instruction stream are selected to be decoded in parallel by a plurality of instruction decode units of a processor. The plurality of instruction units includes a prefix instruction and a prefixed instruction. The prefixed instruction is an instruction to be modified by the prefix instruction. An exception condition associated with the prefixed instruction is determined. Exception handling is performed for the prefixed instruction, in which the performing includes determining an address at which to restart execution of the instruction stream. The determining the address includes adjusting the address at which to restart execution based on the prefix instruction to be separately decoded by an instruction decode unit.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: August 27, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 10387157
    Abstract: An instruction set conversion system and method is provided, which can convert guest instructions to host instructions for processor core execution. Through configuration, instruction sets supported by the processor core are easily expanded. A method for real-time conversion between host instruction addresses and guest instruction addresses is also provided, such that the processor core can directly read out the host instructions from a higher level cache, reducing the depth of a pipeline.
    Type: Grant
    Filed: November 26, 2014
    Date of Patent: August 20, 2019
    Assignee: SHANGHAI XINHAO MICROELECTRONICS CO. LTD.
    Inventor: Kenneth Chenghao Lin
  • Patent number: 10387159
    Abstract: Methods and apparatuses relate to emulating architectural performance monitoring in a binary translation system. In one embodiment, a processor includes an architectural performance counter to maintain an architectural value associated with instruction execution, a register to store the architectural value of the architectural performance counter, binary translation logic to embed an architectural value from the architectural performance counter into a stream of translated instructions having a transactional code region and to store the architectural value into the register, and an execution unit to execute the transactional code region of the stream of translated instructions. The binary translation logic is configured to add the architectural value from the register to the architectural performance counter upon completion of the transactional code region of the stream of translated instructions.
    Type: Grant
    Filed: February 4, 2015
    Date of Patent: August 20, 2019
    Assignee: Intel Corporation
    Inventors: Jason M Agron, Polychronis Xekalakis, Paul Caprioli, Jiwei Oliver Lu, Koichi Yamada
  • Patent number: 10379865
    Abstract: A circuit includes an instruction scheduling circuit and an instruction buffer including entries. The entries each include an instruction, a validity indication, and an attribute. The instruction scheduling circuit partitions the entries into first sets, determines second sets by reordering the entries of each first set according to their attributes, determines a set ordering for the first sets according to a function of their attributes, and selects, based on the set ordering, instructions from the second sets. A process for selecting instructions to issue receives entries, each entry including an instruction, a validity indications, and an attribute. The process partitions the entries into first sets, determines second sets by reordering the entries of each first set according to their attributes, determines a set ordering for first sets according to a function of the attributes of their entries, and selects, based on the set ordering, instructions from the second sets.
    Type: Grant
    Filed: May 13, 2016
    Date of Patent: August 13, 2019
    Assignee: Marvell International Ltd.
    Inventors: Warren Menezes, Joshua Smith
  • Patent number: 10360033
    Abstract: A Conditional Transaction End (CTEND) instruction is provided that allows a program executing in a nonconstrained transactional execution mode to inspect a storage location that is modified by either another central processing unit or the Input/Output subsystem. Based on the inspected data, transactional execution may be ended or aborted, or the decision to end/abort may be delayed, e.g., until a predefined event occurs. For instance, when the instruction executes, the processor is in a nonconstrained transaction execution mode, and the transaction nesting depth is one at the beginning of the instruction, a second operand of the instruction is inspected, and based on the inspected data, transaction execution may be ended or aborted, or the decision to end/abort may be delayed, e.g., until a predefined event occurs, such as the value of the second operand becomes a prespecified value or a time interval is exceeded.
    Type: Grant
    Filed: June 26, 2018
    Date of Patent: July 23, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Christian Jacobi, Marcel Mitran, Donald W. Schmidt, Timothy J. Slegel
  • Patent number: 10360036
    Abstract: A computer processing system is provided. The computer processing system includes a processor configured to crack a Move-To-FPSCR instruction into two internal instructions. A first one of the two internal instructions executes out-of-order to update a control field and a second one of the two internal instructions executes in-order to compute a trap decision.
    Type: Grant
    Filed: July 12, 2017
    Date of Patent: July 23, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Brian J. D. Barrick, Maarten J. Boersma, Niels Fricke, Michael J. Genden
  • Patent number: 10346172
    Abstract: Embodiments include a technique for caching of perceptron branch patterns using ternary content addressable memory. The technique includes defining a table of perceptrons, each perceptron having a plurality of weights with each weight being associated with a bit location in a history vector, and defining a TCAM, the TCAM having a number of entries, wherein each entry includes a number of bit pairs, the number of bit pairs being equal to a number of weights for each associated perceptron. The technique also includes associating the TCAM with an array of x-bit saturating counters, and performing a branch prediction for a history vector of a given branch, the branch prediction indicating a perceptron prediction. The technique includes determining a most influential bit location in the history vector, the most influential bit location having a greatest weight of an associated perceptron.
    Type: Grant
    Filed: January 13, 2017
    Date of Patent: July 9, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James J. Bonanno, Brian R. Prasky
  • Patent number: 10338920
    Abstract: A processor includes an execution unit to execute instructions to get data elements of the same type from multiple data structures packed in vector registers. The execution unit includes logic to extract data elements from specific positions within each data structure dependent on an instruction encoding. A vector GET3 instruction encoding specifies that data elements be extracted from the first, second, or third position in each XYZ-type data structure. A vector GET4 instruction encoding specifies that data elements be extracted from the first, second, third, or fourth position in each XYZW-type data structure and that the extracted data elements be placed in the upper or lower half of a destination vector. The execution unit includes logic to place the extracted data elements in contiguous locations in the destination vector. The execution unit includes logic to store the destination vector to a destination vector register specified in the instruction.
    Type: Grant
    Filed: December 18, 2015
    Date of Patent: July 2, 2019
    Assignee: Intel Corporation
    Inventor: Elmoustapha Ould-Ahmed-Vall
  • Patent number: 10331455
    Abstract: An electronic apparatus generating compiled data used in a very long instruction word (VLIW) processor including a plurality of function units is provided. The electronic apparatus includes a storage and a processor configured to control the storage to store the compiled data in which a plurality of VLIW instructions are compiled, identify a VLIW instruction from the compiled data; and update, if a multi-cycle no operation (nop) instruction for the plurality of function units is identified within a cycle corresponding to a latency of the identified VLIW instruction and if an end cycle of another VLIW instruction is within the cycle corresponding to the latency of the identified VLIW instruction, the compiled data by including information on a cycle difference between an end cycle of the identified VLIW instruction and the end cycle of the another VLIW instruction in the multi-cycle nop instruction.
    Type: Grant
    Filed: July 19, 2017
    Date of Patent: June 25, 2019
    Assignee: Samsung Electronics Co., Ltd
    Inventors: Jong-hun Lee, Jae-un Park, Si-hoon Song, Myung-sun Kim
  • Patent number: 10324725
    Abstract: The disclosure provides a method and a system for identifying and replacing code translations that generate spurious fault events. In one embodiment the method includes executing a first set and a second set of native instructions, performing a third translation of a target instruction to form a third set of native instructions in response to a determination that a fault occurrence is attributed to a first translation, wherein the third set of native instructions is not the same as the second set of native instructions, and the third set of native instructions is not the same as the first set of native instructions, and executing the third set of native instructions.
    Type: Grant
    Filed: March 8, 2018
    Date of Patent: June 18, 2019
    Assignee: Nvidia Corporation
    Inventors: Nathan Tuck, David Dunn, Ross Segelken, Madhu Swarna
  • Patent number: 10324727
    Abstract: A data processing apparatus executes a stream of instructions. Memory access circuitry accesses a memory in response to control signals associated with a memory access instruction that is executed in the stream of instructions. Branch prediction circuitry predicts the outcome of branch instructions in the stream of instructions based on a branch prediction table. Processing circuitry performs a determination of whether out-of-order execution of memory access instructions is to be performed based on memory prediction data, and selectively enables out-of-order execution of the memory access instructions in dependence on the determination. The memory prediction data is stored in the branch prediction table.
    Type: Grant
    Filed: August 17, 2016
    Date of Patent: June 18, 2019
    Assignee: ARM Limited
    Inventors: Curtis Glenn Dunham, Mitchell Bryan Hayenga
  • Patent number: 10318294
    Abstract: Operation of a multi-slice processor that includes a plurality of execution slices. Operation of such a multi-slice processor includes: receiving a first instruction indicating a first target register; receiving a second instruction indicating the first target register as a source operand; responsive to the second instruction indicating the first target register as a source operand, updating a dependent count corresponding to the first instruction; and issuing, in dependence upon the dependent count for the first instruction being greater than a dependent count for another instruction, the first instruction to an execution slice of the plurality of execution slices.
    Type: Grant
    Filed: June 20, 2016
    Date of Patent: June 11, 2019
    Assignee: International Business Machines Corporation
    Inventors: Khandker N. Adeeb, Joshua W. Bowman, Jeffrey C. Brownscheidle, Brandon R. Goddard, Dung Q. Nguyen, Tu-An T. Nguyen, Brian D. Victor, Brendan M. Wong
  • Patent number: 10318291
    Abstract: A processor includes a vector register including data fields to store values of vector elements of data, a decoder to decode a single instruction multiple data (SIMD) instruction specifying a source operand and a mask to identify a masked portion of the data fields. An execution unit is to read a plurality of values from unmasked data fields of the plurality of data fields of the vector register; compare, within the vector register, each of the plurality of values from the unmasked data fields for equality with all other values of the plurality of values; and responsive to a detection of an inequality of any two values of the plurality of values, set a mask field, corresponding to a detected unequal value, to a masked state with a flip of a bit value of the mask field, to signal the detection of the inequality.
    Type: Grant
    Filed: May 3, 2017
    Date of Patent: June 11, 2019
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Charles R. Yount, Suleyman Sair, Kshitij A. Doshi
  • Patent number: 10310740
    Abstract: Aligning memory access operations to a geometry of a storage device, including: receiving, by a storage array controller, information describing the layout of memory in the storage device; determining, by the storage array controller, a write size in dependence upon the layout of memory in the storage device; and sending, by the storage array controller, a write request addressed to a location within the memory unit in dependence upon the layout of memory in the storage device.
    Type: Grant
    Filed: June 23, 2015
    Date of Patent: June 4, 2019
    Assignee: Pure Storage, Inc.
    Inventors: John Colgrove, Peter E. Kirkpatrick