Patents Examined by Eric Coleman
  • Patent number: 10838725
    Abstract: Techniques are disclosed relating to fetching items from a compute command stream that includes compute kernels. In some embodiments, stream fetch circuitry sequentially pre-fetches items from the stream and stores them in a buffer. In some embodiments, fetch parse circuitry iterates through items in the buffer using a fetch parse pointer to detect indirect-data-access items and/or redirect items in the buffer. The fetch parse circuitry may send detected indirect data accesses to indirect-fetch circuitry, which may buffer requests. In some embodiments, execute parse circuitry iterates through items in the buffer using an execute parse pointer (e.g., which may trail the fetch parse pointer) and outputs both item data from the buffer and indirect-fetch results from the indirect-fetch circuitry for execution. In various embodiments, the disclosed techniques may reduce fetch latency for compute kernels.
    Type: Grant
    Filed: September 26, 2018
    Date of Patent: November 17, 2020
    Assignee: Apple Inc.
    Inventors: Andrew M. Havlir, Jeffrey T. Brady
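    A minimal software sketch of the two-pointer scheme described in the abstract above: a leading fetch-parse pointer spots indirect-data-access items and queues their fetches early, while a trailing execute-parse pointer emits buffer items together with the buffered indirect results. The item format, the class name, and the synchronous model of indirect fetching are assumptions for illustration, not details from the patent.
    ```python
    from collections import deque

    DIRECT, INDIRECT = "direct", "indirect"   # hypothetical item kinds

    class StreamFetcher:
        """Prefetch buffer walked by two pointers: fetch-parse leads, execute-parse trails."""

        def __init__(self, stream, memory):
            self.buffer = list(stream)          # sequentially pre-fetched items
            self.memory = memory                # backing store for indirect accesses
            self.fetch_parse = 0                # leading pointer
            self.execute_parse = 0              # trailing pointer
            self.indirect_results = deque()     # buffered indirect-fetch results

        def fetch_parse_step(self):
            if self.fetch_parse < len(self.buffer):
                kind, payload = self.buffer[self.fetch_parse]
                if kind == INDIRECT:
                    # start the indirect fetch ahead of execution to hide its latency
                    self.indirect_results.append(self.memory[payload])
                self.fetch_parse += 1

        def execute_parse_step(self):
            if self.execute_parse < self.fetch_parse:
                kind, payload = self.buffer[self.execute_parse]
                self.execute_parse += 1
                if kind == INDIRECT:
                    return ("execute", self.indirect_results.popleft())
                return ("execute", payload)
            return None

    memory = {0x40: "kernel_args_A", 0x80: "kernel_args_B"}
    stream = [(DIRECT, "kernel_0"), (INDIRECT, 0x40), (DIRECT, "kernel_1"), (INDIRECT, 0x80)]
    f = StreamFetcher(stream, memory)
    for _ in range(len(stream)):
        f.fetch_parse_step()                    # let the fetch-parse pointer run ahead
    print([f.execute_parse_step() for _ in range(len(stream))])
    ```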
  • Patent number: 10824434
    Abstract: Examples described herein relate to dynamically structured single instruction, multiple data (SIMD) instructions, and systems and circuits implementing such dynamically structured SIMD instructions. An example is a method for processing data. A first SIMD structure is determined by a processor. A characteristic of the first SIMD structure is altered by the processor to obtain a second SIMD structure. An indication of the second SIMD structure is communicated from the processor to a numerical engine. Data is packed by the numerical engine into an SIMD instruction according to the second SIMD structure. The SIMD instruction is transmitted from the numerical engine.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: November 3, 2020
    Assignee: XILINX, INC.
    Inventors: Sean Settle, Ehsan Ghasemi, Ashish Sirasao, Ralph D. Wittig
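    A minimal sketch of packing data into a SIMD word under a dynamically chosen structure, as in the abstract above. The 64-bit word size and the "halve the lane width" adjustment are illustrative assumptions, not details taken from the patent.
    ```python
    def pack_simd(values, lane_bits, word_bits=64):
        """Pack `values` into fixed-width lanes of one word (the SIMD payload), low lane first."""
        lanes = word_bits // lane_bits
        assert len(values) <= lanes, "too many elements for this SIMD structure"
        word = 0
        for i, v in enumerate(values):
            assert 0 <= v < (1 << lane_bits), "value does not fit the lane"
            word |= v << (i * lane_bits)
        return word

    # First SIMD structure chosen by the "processor": 16-bit lanes.
    first_structure = {"lane_bits": 16}
    # Alter a characteristic (here, the lane width) to obtain a second structure.
    second_structure = {"lane_bits": 8}

    data = [1, 2, 3, 4, 5, 6, 7, 8]
    packed = pack_simd(data, **second_structure)   # numerical engine packs per the new structure
    print(hex(packed))                             # 0x807060504030201
    ```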
  • Patent number: 10824426
    Abstract: Embodiments of the present invention are directed to a computer-implemented method for generating and verifying hardware instruction traces including memory data contents. The method includes initiating an in-memory trace (IMT) data capture for a processor, the IMT data being an instruction trace collected while instructions flow through an execution pipeline of the processor. The method further includes capturing contents of architected registers of the processor by: storing the contents of the architected registers to a predetermined memory location, and causing a load-store unit (LSU) to read contents of the predetermined memory location.
    Type: Grant
    Filed: April 12, 2019
    Date of Patent: November 3, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jane H. Bartik, Christian Jacobi, David Lee, Jang-Soo Lee, Anthony Saporito, Christian Zoellin
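    A toy model of the register-capture idea in the abstract above: architected register contents are spilled to a reserved memory region and then read back through the load path, so the values appear in the trace. The capture address, register names, and trace record format are illustrative assumptions.
    ```python
    CAPTURE_BASE = 0x1000          # predetermined memory location (assumed)

    def capture_registers(regs, memory, trace):
        # store each architected register to the reserved region
        for i, (name, value) in enumerate(sorted(regs.items())):
            addr = CAPTURE_BASE + 8 * i
            memory[addr] = value
            trace.append(("store", name, addr, value))
        # have the load path read the region so the data contents enter the trace
        for i, (name, _) in enumerate(sorted(regs.items())):
            addr = CAPTURE_BASE + 8 * i
            trace.append(("load", addr, memory[addr]))

    regs = {"r0": 0xDEAD, "r1": 0xBEEF, "r2": 42}
    memory, trace = {}, []
    capture_registers(regs, memory, trace)
    for record in trace:
        print(record)
    ```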
  • Patent number: 10817299
    Abstract: A data processing apparatus is provided that includes a plurality of control flow execution circuits to simultaneously execute a first control flow instruction having a first type and a second control flow instruction having a second type from a plurality of instructions. A control flow prediction update circuit updates at most one of: a prediction of the first control flow instruction based on a result of the first control flow instruction, and a prediction of the second control flow instruction based on a result of the second control flow instruction.
    Type: Grant
    Filed: September 7, 2018
    Date of Patent: October 27, 2020
    Assignee: Arm Limited
    Inventors: Yasuo Ishii, Chris Abernathy
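    A small sketch of the "update at most one prediction" idea above: when two control flow instructions of different types resolve simultaneously, an arbiter forwards at most one result to the predictor update port. The priority rule (mispredictions first, then the older instruction) is an assumption for illustration.
    ```python
    def select_update(first_result, second_result):
        """Each result is a dict with 'type', 'predicted', 'actual', 'age'.
        Returns the single result chosen for predictor update, or None."""
        candidates = [r for r in (first_result, second_result) if r is not None]
        if not candidates:
            return None
        mispredicted = [r for r in candidates if r["predicted"] != r["actual"]]
        pool = mispredicted or candidates
        return min(pool, key=lambda r: r["age"])   # prefer the older instruction

    a = {"type": "conditional", "predicted": True, "actual": False, "age": 3}
    b = {"type": "indirect",    "predicted": 0x40, "actual": 0x40,  "age": 1}
    print(select_update(a, b))   # the mispredicted conditional branch wins the single update slot
    ```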
  • Patent number: 10810156
    Abstract: An apparatus includes multiple parallel computing cores, where each computing core is configured to perform one or more processing operations and generate input data. The apparatus also includes multiple sets of parallel coprocessors, where each computing core is associated with a different one of the sets of parallel coprocessors. The coprocessors in each set of parallel coprocessors are configured to process the input data and generate output data. Each of the computing cores is configured to generate additional input data based on the output data generated by the associated set of parallel coprocessors.
    Type: Grant
    Filed: September 21, 2018
    Date of Patent: October 20, 2020
    Assignee: Goldman Sachs & Co. LLC
    Inventors: Paul Burchard, Ulrich Drepper
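    A small sketch of the core/coprocessor feedback loop described above: each computing core produces input data, its dedicated set of parallel coprocessors transforms it, and the core derives the next input from the combined outputs. The arithmetic inside the loop is an arbitrary stand-in, assumed only to make the structure concrete.
    ```python
    def run_iteration(core_inputs, coprocessors_per_set):
        next_inputs = []
        for core_id, x in enumerate(core_inputs):
            # each coprocessor in this core's set processes the same input in parallel
            outputs = [(x + k) * (core_id + 1) for k in range(coprocessors_per_set)]
            # the core generates additional input data from its set's output data
            next_inputs.append(sum(outputs) / len(outputs))
        return next_inputs

    inputs = [1.0, 2.0, 3.0]            # one input per computing core
    for step in range(3):
        inputs = run_iteration(inputs, coprocessors_per_set=4)
        print(f"iteration {step}: {inputs}")
    ```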
  • Patent number: 10802830
    Abstract: A computer data processing system includes a plurality of logical registers, each including multiple storage sections. A processor writes data to a storage section based on a dispatched first instruction, and sets a valid bit corresponding to the storage section that receives the data. In response to each subsequent instruction, the processor sets an evictor valid bit indicating a subsequent instruction has written new data to a storage section written by the first instruction, and updates the valid bit to indicate the storage section containing the newly written data. A register combination unit generates a combined evictor tag to identify a most recent subsequent instruction. The processor determines the most recent subsequent instruction based on the combined evictor tag in response to a flush event, and unsets all the evictor tag valid bits set by the most recent subsequent instruction along with all previous subsequent instructions.
    Type: Grant
    Filed: March 5, 2019
    Date of Patent: October 13, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jonathan Hsieh, Gregory William Alexander, Tu-An Nguyen
  • Patent number: 10795712
    Abstract: A method for processing the virtualization of computers that are part of a group into virtual computers is provided. The method includes obtaining relationship data from the computers, where the relationship data identifies parameters used to communicate within the group. The method then analyzes utilization parameters for each of the computers of the group. A visual model for the proposed virtualization of the group of computers is then generated. The visual model identifies hosting machines designated to define a virtual computer for each of the computers, and provides a graphical illustration of the group of computers once converted to virtual computers. The method enables adjustment of the proposed virtualization of the group of computers. If execution of the proposed virtualization is triggered, an execution sequence of the virtualization operations to be carried out is generated; the execution sequence is saved to storage and accessed upon execution.
    Type: Grant
    Filed: May 6, 2018
    Date of Patent: October 6, 2020
    Assignee: VMware, Inc.
    Inventor: Abhinav Katiyar
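    A high-level sketch of the planning flow described above: collect utilization data, build a proposed placement (the data behind the visual model), allow adjustment, then emit an ordered execution sequence. All field names and the placement-by-CPU-headroom rule are assumptions for illustration.
    ```python
    def propose_virtualization(computers, hosts):
        """computers: list of {'name', 'cpu_util'}; hosts: list of {'name', 'cpu_capacity'}."""
        plan, used = {}, {h["name"]: 0.0 for h in hosts}
        for c in sorted(computers, key=lambda c: c["cpu_util"], reverse=True):
            host = min(hosts, key=lambda h: used[h["name"]] / h["cpu_capacity"])
            plan[c["name"]] = host["name"]
            used[host["name"]] += c["cpu_util"]
        return plan

    def execution_sequence(plan):
        # one conversion operation per computer, ordered by target host then name
        return [f"convert {vm} -> virtual machine on {host}"
                for vm, host in sorted(plan.items(), key=lambda kv: (kv[1], kv[0]))]

    computers = [{"name": "db01", "cpu_util": 0.7}, {"name": "web01", "cpu_util": 0.3},
                 {"name": "web02", "cpu_util": 0.2}]
    hosts = [{"name": "esx-a", "cpu_capacity": 1.0}, {"name": "esx-b", "cpu_capacity": 1.0}]
    plan = propose_virtualization(computers, hosts)     # adjustable before execution is triggered
    print(plan)
    print(execution_sequence(plan))
    ```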
  • Patent number: 10782972
    Abstract: An apparatus comprises processing circuitry (4) and an instruction decoder (6) which supports vector instructions for which multiple lanes of processing are performed on respective data elements of a vector value. In response to a vector predication instruction, the instruction decoder (6) controls the processing circuitry (4) to set control information based on the outcome of a number of element comparison operations each for determining whether a corresponding element passes or fails a test condition. The control information controls processing of a predetermined number of subsequent vector instructions after the vector predication instruction. The predetermined number is hard-wired or identified by the vector predication instruction. For one of the subsequent vector instructions, an operation for a given portion of a given lane of vector processing is masked based on the outcome indicated by the control information for a corresponding data element.
    Type: Grant
    Filed: March 17, 2017
    Date of Patent: September 22, 2020
    Assignee: ARM Limited
    Inventor: Thomas Christopher Grocutt
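    A minimal model of the predication scheme in the abstract above: an element-wise comparison sets per-lane control information, and that mask gates the lanes of a fixed number of subsequent vector operations. The vector length, the comparison, and the "next two instructions" count are illustrative assumptions.
    ```python
    def vector_predicate(a, b, test):
        """Element comparison step: returns a per-lane pass/fail mask."""
        return [test(x, y) for x, y in zip(a, b)]

    def masked_vector_op(op, dst, src1, src2, mask):
        """Lanes whose mask bit is clear keep their old destination value."""
        return [op(x, y) if m else d for d, x, y, m in zip(dst, src1, src2, mask)]

    a   = [1, 5, 3, 9]
    b   = [4, 4, 4, 4]
    dst = [0, 0, 0, 0]

    mask = vector_predicate(a, b, lambda x, y: x > y)     # control information
    # the mask applies to a predetermined number of subsequent vector instructions
    dst = masked_vector_op(lambda x, y: x + y, dst, a, b, mask)
    dst = masked_vector_op(lambda x, y: x * y, dst, a, b, mask)
    print(mask, dst)   # [False, True, False, True] [0, 20, 0, 36]
    ```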
  • Patent number: 10782980
    Abstract: Examples of the present disclosure provide apparatuses and methods related to generating and executing a control flow. An example apparatus can include a first device configured to generate control flow instructions, and a second device including an array of memory cells, an execution unit to execute the control flow instructions, and a controller configured to control an execution of the control flow instructions on data stored in the array.
    Type: Grant
    Filed: August 24, 2018
    Date of Patent: September 22, 2020
    Assignee: Micron Technology, Inc.
    Inventors: Kyle B. Wheeler, Richard C. Murphy, Troy A. Manning, Dean A. Klein
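    A toy model of the host/memory-device split described above: the first device builds a small control flow program, and a controller on the memory device executes it against data held in the array. The instruction set (ADD_SCALAR, BRANCH_IF_MAX_BELOW, DONE) is purely an illustrative assumption.
    ```python
    class MemoryDevice:
        def __init__(self, array):
            self.array = list(array)        # data stored in the array of memory cells

        def execute(self, program):
            pc = 0
            while program[pc][0] != "DONE":
                op = program[pc]
                if op[0] == "ADD_SCALAR":              # add a constant to every cell
                    self.array = [x + op[1] for x in self.array]
                    pc += 1
                elif op[0] == "BRANCH_IF_MAX_BELOW":   # loop until a threshold is met
                    pc = op[2] if max(self.array) < op[1] else pc + 1
            return self.array

    # first device: generate the control flow instructions
    program = [("ADD_SCALAR", 3),
               ("BRANCH_IF_MAX_BELOW", 20, 0),
               ("DONE",)]
    device = MemoryDevice([1, 4, 7])
    print(device.execute(program))   # [16, 19, 22] -> loop exits once the maximum reaches 20
    ```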
  • Patent number: 10776207
    Abstract: A method, computer program product, and a computer system are disclosed for processing information using hardware instructions in a processor of a computer system by performing a hardware reduction instruction using an input to calculate at least one range reduction factor of the input; performing a hardware restoration instruction using the input to calculate at least one range restoration factor of the input; and performing a final fused multiply add (FMA) type of hardware instruction or a multiply (FM) hardware instruction by combining an approximation based on a value reduced by the at least one range reduction factor with the at least one range restoration factor.
    Type: Grant
    Filed: September 6, 2018
    Date of Patent: September 15, 2020
    Assignee: International Business Machines Corporation
    Inventors: Robert F. Enenkel, Christopher Anand, Lucas Dutton, Adele Olejarz
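    A software sketch of the reduce/approximate/restore pattern the abstract above describes, using exp() as a worked example. The abstract does not name a particular function, so the choice of exp, ln 2 as the reduction constant, and the polynomial degree are all assumptions.
    ```python
    import math

    def exp_via_range_reduction(x):
        # "reduction instruction": find n so that x = n*ln2 + r with small r
        n = round(x / math.log(2))
        r = x - n * math.log(2)                 # range-reduced value, |r| <= ln2/2
        # "restoration instruction": restoration factor 2**n
        restore = math.ldexp(1.0, n)
        # approximation based on the reduced value (truncated Taylor series for exp(r))
        approx = 1.0 + r + r*r/2 + r**3/6 + r**4/24 + r**5/120 + r**6/720
        # "final multiply": combine the approximation with the restoration factor
        return approx * restore

    for x in (0.5, 3.7, -8.0):
        print(x, exp_via_range_reduction(x), math.exp(x))
    ```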
  • Patent number: 10776126
    Abstract: An apparatus includes a scheduler circuit and a processing circuit. The scheduler circuit may be configured to (i) parse a directed acyclic graph into one or more operators and (ii) schedule the one or more operators in one or more data paths. The processing circuit generally comprises one or more hardware engines configured as the one or more data paths. The one or more hardware engines are generally configured to generate one or more output vectors in response to zero or more input vectors using the operators. At least one of the one or more hardware engines may support input vector dimensions ranging from zero to at least four dimensions. At least one of the one or more hardware engines is implemented solely in hardware.
    Type: Grant
    Filed: April 29, 2019
    Date of Patent: September 15, 2020
    Assignee: Ambarella International LP
    Inventors: Leslie D. Kohn, Robert C. Kunz
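    A small sketch of the scheduler side of the abstract above: a directed acyclic graph is parsed into operators, which are then assigned to data paths in dependency order. The round-robin mapping to two engines and the example graph are illustrative assumptions.
    ```python
    from collections import deque

    def schedule(dag, num_engines=2):
        """dag: {operator: [operators it depends on]}. Returns (order, assignment)."""
        indeg = {op: len(deps) for op, deps in dag.items()}
        users = {op: [u for u, deps in dag.items() if op in deps] for op in dag}
        ready = deque(op for op, d in indeg.items() if d == 0)
        order, assignment = [], {}
        while ready:
            op = ready.popleft()
            assignment[op] = f"engine{len(order) % num_engines}"
            order.append(op)
            for u in users[op]:
                indeg[u] -= 1
                if indeg[u] == 0:
                    ready.append(u)
        return order, assignment

    dag = {"load": [], "conv": ["load"], "pool": ["conv"], "scale": ["load"],
           "add": ["pool", "scale"]}
    order, assignment = schedule(dag)
    print(order)        # ['load', 'conv', 'scale', 'pool', 'add']
    print(assignment)
    ```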
  • Patent number: 10768989
    Abstract: Methods and apparatus to provide virtualized vector processing are described. In one embodiment, one or more operations corresponding to a virtual vector request are distributed to one or more processor cores for execution.
    Type: Grant
    Filed: January 16, 2018
    Date of Patent: September 8, 2020
    Assignee: Intel Corporation
    Inventors: Anthony Nguyen, Engin Ipek, Victor Lee, Daehyun Kim, Mikhail Smelyanskiy
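    A minimal sketch of the idea above: a virtual vector request larger than any single unit is split into per-core operations and the partial results are recombined. Using a thread pool to stand in for processor cores is an illustrative assumption.
    ```python
    from concurrent.futures import ThreadPoolExecutor

    def virtual_vector_add(a, b, num_cores=4):
        """Element-wise add of two long vectors, distributed in chunks."""
        chunk = (len(a) + num_cores - 1) // num_cores
        spans = [(i, min(i + chunk, len(a))) for i in range(0, len(a), chunk)]

        def worker(span):
            lo, hi = span
            return [a[i] + b[i] for i in range(lo, hi)]

        with ThreadPoolExecutor(max_workers=num_cores) as pool:
            parts = list(pool.map(worker, spans))    # one operation per "core"
        return [x for part in parts for x in part]

    a = list(range(10))
    b = list(range(10, 20))
    print(virtual_vector_add(a, b))   # [10, 12, 14, ..., 28]
    ```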
  • Patent number: 10768937
    Abstract: Overhead associated with verifying function return addresses to protect against security exploits is reduced by taking advantage of branch prediction mechanisms for predicting return addresses. More specifically, returning from a function includes popping a return address from a data stack. Well-known security exploits overwrite the return address on the data stack to hijack control flow. In some processors, a separate data structure referred to as a control stack is used to verify the data stack. When a return instruction is executed, the processor issues an exception if the return addresses on the control stack and the data stack are not identical. This overhead can be avoided by taking advantage of the return address stack, which is a data structure used by the branch predictor to predict return addresses. In most situations, if this prediction is correct, the above check does not need to occur, thus reducing the associated overhead.
    Type: Grant
    Filed: July 26, 2018
    Date of Patent: September 8, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Marius Evers, David A. Kaplan, Debjit Das Sarma
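    A toy model of the optimization in the abstract above: returns are normally checked against a control stack, but when the branch predictor's return address stack (RAS) already predicted the return target correctly, the explicit check can be skipped. The three-stack model below is a simplification for illustration.
    ```python
    class ReturnChecker:
        def __init__(self):
            self.data_stack, self.control_stack, self.ras = [], [], []

        def call(self, return_addr):
            self.data_stack.append(return_addr)     # architectural return address
            self.control_stack.append(return_addr)  # protected copy for verification
            self.ras.append(return_addr)            # branch predictor's copy

        def ret(self):
            target = self.data_stack.pop()
            predicted = self.ras.pop() if self.ras else None
            shadow = self.control_stack.pop()
            if predicted == target:
                return target, "check skipped (RAS prediction correct)"
            if shadow != target:                    # only verified on a misprediction
                raise RuntimeError("control-flow hijack detected")
            return target, "checked against control stack"

    c = ReturnChecker()
    c.call(0x400100)
    print(c.ret())                      # fast path: prediction correct, check skipped

    c.call(0x400200)
    c.data_stack[-1] = 0x666            # simulate an exploit overwriting the return address
    try:
        c.ret()
    except RuntimeError as e:
        print(e)
    ```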
  • Patent number: 10761847
    Abstract: An apparatus in a configurable logic block may include a configurable logic unit (CLU) configured to receive first and second operands and to perform an operand operation and generate an operation value. The apparatus may also include: a random value generator for generating a random value; an adder coupled to the CLU and the random value generator and configured to generate a sum of the operation value and the random value; and a shift register coupled to the adder and configured to shift the sum by a number of bits to generate shifted data at an output. The random value generator may be a linear feedback shift register. The output may be coupled to an additional CLU so that the shifted data may be used for subsequent operand operations. The apparatus may be implemented in a digital signal processor slice in a configurable logic block.
    Type: Grant
    Filed: August 17, 2018
    Date of Patent: September 1, 2020
    Assignee: Micron Technology, Inc.
    Inventor: David Hulton
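    A minimal sketch of the datapath in the abstract above: a CLU produces an operation value, a linear feedback shift register supplies a random value, the two are summed, and the sum is shifted before being handed to the next CLU. The 16-bit width, the LFSR polynomial (0xB400), and the AND used as the CLU operation are illustrative assumptions.
    ```python
    def lfsr16(state):
        """One step of a 16-bit Galois LFSR (maximal-length polynomial 0xB400)."""
        lsb = state & 1
        state >>= 1
        return state ^ 0xB400 if lsb else state

    def clu_stage(op_a, op_b, lfsr_state, shift_bits=3, width=16):
        mask = (1 << width) - 1
        operation_value = (op_a & op_b) & mask        # the CLU's operand operation
        lfsr_state = lfsr16(lfsr_state)               # random value generator
        total = (operation_value + lfsr_state) & mask # adder
        shifted = (total << shift_bits) & mask        # shift register output
        return shifted, lfsr_state                    # shifted data feeds the next CLU

    state = 0xACE1
    value, state = clu_stage(0x0F0F, 0x00FF, state)
    print(hex(value), hex(state))
    ```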
  • Patent number: 10761855
    Abstract: A method performed in a processor includes: receiving, in the processor, a branch instruction; determining, by the processor, an address of an instruction after the branch instruction as a candidate for speculative execution, the address including an object identification and an offset; and determining, by the processor, whether or not to perform speculative execution of the instruction after the branch instruction based on the object identification of the address.
    Type: Grant
    Filed: July 6, 2018
    Date of Patent: September 1, 2020
    Assignee: Micron Technology, Inc.
    Inventor: Steven Jeffrey Wallach
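    A small sketch of the decision described above: an address carries an object identification plus an offset, and the processor speculates past a branch only when the candidate instruction stays within the same object as the branch itself. The address encoding (high bits = object ID) and the same-object policy are illustrative assumptions.
    ```python
    OFFSET_BITS = 32

    def object_id(addr):
        return addr >> OFFSET_BITS

    def allow_speculation(branch_addr, candidate_addr):
        """Speculatively execute the instruction after the branch only if it lies
        in the same object as the branch instruction."""
        return object_id(candidate_addr) == object_id(branch_addr)

    branch    = (7 << OFFSET_BITS) | 0x1000      # object 7, offset 0x1000
    same_obj  = (7 << OFFSET_BITS) | 0x1004
    other_obj = (9 << OFFSET_BITS) | 0x0040
    print(allow_speculation(branch, same_obj))    # True  -> speculation proceeds
    print(allow_speculation(branch, other_obj))   # False -> speculation suppressed
    ```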
  • Patent number: 10754654
    Abstract: An apparatus that includes a program controller to fetch and issue instructions is described. The apparatus includes an execution lane having at least one execution unit to execute the instructions. The execution lane is part of an execution lane array that is coupled to a two-dimensional shift register array structure, wherein execution lanes of the execution lane array are located at respective array locations and are coupled to dedicated registers at the same respective array locations in the two-dimensional shift register array.
    Type: Grant
    Filed: March 28, 2019
    Date of Patent: August 25, 2020
    Assignee: Google LLC
    Inventors: Albert Meixner, Jason Rupert Redgrave, Ofer Shacham, Daniel Frederic Finchelstein, Qiuling Zhu
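    A minimal model of the structure described above: execution lanes sit on a 2-D grid, each coupled to the register at the same grid location, and the whole register plane can be shifted so neighbouring data slides under each lane. The 4x4 grid size, the wrap-around shift, and the add-your-right-neighbour example are assumptions.
    ```python
    def shift_left(grid):
        """Shift every row of the 2-D register array one position to the left,
        wrapping at the edge (a common choice; the abstract does not mandate wrap)."""
        return [row[1:] + row[:1] for row in grid]

    def lanes_add(grid_a, grid_b):
        """Each execution lane adds the register values at its own array location."""
        return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(grid_a, grid_b)]

    registers = [[r * 4 + c for c in range(4)] for r in range(4)]
    shifted = shift_left(registers)                 # each lane now sees its right neighbour
    result = lanes_add(registers, shifted)          # lane-local compute, no gather needed
    for row in result:
        print(row)
    ```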
  • Patent number: 10747691
    Abstract: Examples provide a memory device, a dual inline memory module, a storage device, an apparatus for storing, a method for storing, a computer program, a machine readable storage, and a machine readable medium. A memory device is configured to store data and comprises one or more interfaces configured to receive and to provide data. The memory device further comprises a memory module configured to store the data, and a memory logic component configured to control the one or more interfaces and the memory module. The memory logic component is further configured to receive information on a specific memory region with one or more model identifications, to receive information on an instruction to perform an acceleration function for one or more certain model identifications, and to perform the acceleration function on data in a specific memory region with the one or more certain model identifications.
    Type: Grant
    Filed: April 10, 2018
    Date of Patent: August 18, 2020
    Assignee: Intel Corporation
    Inventors: Mark Schmisseur, Thomas Willhalm, Francesc Guim Bernat, Karthik Kumar
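    A toy model of the control flow described above: the memory logic is told which region belongs to which model identification, later receives an instruction naming certain model IDs and an acceleration function, and applies that function to the matching regions' data near memory. The region layout and the sum-of-squares "acceleration function" are illustrative assumptions.
    ```python
    class MemoryLogic:
        def __init__(self):
            self.regions = {}            # region name -> {"model_ids": set, "data": list}

        def register_region(self, name, model_ids, data):
            self.regions[name] = {"model_ids": set(model_ids), "data": list(data)}

        def accelerate(self, target_model_ids, func):
            """Run `func` on every region tagged with one of the target model IDs."""
            results = {}
            for name, region in self.regions.items():
                if region["model_ids"] & set(target_model_ids):
                    results[name] = func(region["data"])
            return results

    logic = MemoryLogic()
    logic.register_region("weights_a", {"model_1"}, [0.5, 1.5, 2.0])
    logic.register_region("weights_b", {"model_2"}, [3.0, 4.0])
    print(logic.accelerate({"model_1"}, lambda xs: sum(x * x for x in xs)))  # {'weights_a': 6.5}
    ```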
  • Patent number: 10740281
    Abstract: A method is described that entails operating enabled cores of a multi-core processor such that both cores support respective software routines with a same instruction set, a first core being higher performance and consuming more power than a second core under a same set of applied supply voltage and operating frequency.
    Type: Grant
    Filed: August 14, 2018
    Date of Patent: August 11, 2020
    Assignee: INTEL CORPORATION
    Inventors: Varghese George, Sanjeev S. Jahagirdar, Deborah T. Marr
  • Patent number: 10725954
    Abstract: Embodiments of the present invention are directed to a microcontroller device having a microprocessor, programmable memory components, and programmable analog and digital blocks. The programmable analog and digital blocks are configurable based on programming information stored in the memory components. Programmable interconnect logic, also programmable from the memory components, is used to couple the programmable analog and digital blocks as needed. The advanced microcontroller design also includes programmable input/output blocks for coupling selected signals to external pins. The memory components also include user programs that the embedded microprocessor executes. These programs may include instructions for programming the digital and analog blocks “on-the-fly,” e.g., dynamically. In one implementation, there are a plurality of programmable digital blocks and a plurality of programmable analog blocks.
    Type: Grant
    Filed: June 1, 2018
    Date of Patent: July 28, 2020
    Assignee: Monterey Research, LLC
    Inventors: Warren S. Snyder, Monte Mar
  • Patent number: 10719329
    Abstract: An apparatus and method are provided for using predicted result values. The apparatus has a processing unit that comprises processing circuitry for executing a sequence of instructions, and value prediction circuitry for identifying a predicted result value for at least one instruction. A result producing structure is provided that is responsive to a request issued from the processing unit when the processing circuitry is executing a first instruction, to produce a result value for the first instruction and return that result value to the processing unit. While waiting for the result value from the result producing structure, the processing circuitry can be arranged to speculatively execute at least one dependent instruction using a predicted result value for the first instruction as obtained from the value prediction circuitry.
    Type: Grant
    Filed: June 28, 2018
    Date of Patent: July 21, 2020
    Assignee: Arm Limited
    Inventors: Vladimir Vasekin, David Michael Bull, Chiloda Ashan Senarath Pathirane, Alexei Fedorov
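    A minimal sketch of the value-prediction flow in the abstract above: while the result producing structure is still servicing the request for a first instruction, a dependent instruction runs speculatively using a predicted result value; when the real value arrives the speculation is either committed or replayed. The last-value predictor and the two-instruction example are assumptions.
    ```python
    class ValuePredictor:
        def __init__(self):
            self.last = {}                       # pc -> last observed result value

        def predict(self, pc):
            return self.last.get(pc)

        def train(self, pc, value):
            self.last[pc] = value

    def run(load_pc, slow_load, dependent, predictor):
        predicted = predictor.predict(load_pc)
        # speculatively execute the dependent instruction while the result is in flight
        spec_result = dependent(predicted) if predicted is not None else None
        actual = slow_load()                     # result producing structure responds
        predictor.train(load_pc, actual)
        if predicted is not None and predicted == actual:
            return spec_result, "speculation committed"
        return dependent(actual), "no usable prediction or mispredicted; dependent instruction replayed"

    vp = ValuePredictor()
    vp.train(0x40, 100)                          # a previous execution produced the value 100
    print(run(0x40, lambda: 100, lambda v: v + 1, vp))   # (101, 'speculation committed')
    print(run(0x40, lambda: 7,   lambda v: v + 1, vp))   # (8, '... replayed')
    ```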