Patents Examined by Eric Coleman

Low latency fetch circuitry for compute kernels

Patent number: 10838725

Abstract: Techniques are disclosed relating to fetching items from a compute command stream that includes compute kernels. In some embodiments, stream fetch circuitry sequentially pre-fetches items from the stream and stores them in a buffer. In some embodiments, fetch parse circuitry iterates through items in the buffer using a fetch parse pointer to detect indirect-data-access items and/or redirect items in the buffer. The fetch parse circuitry may send detected indirect data accesses to indirect-fetch circuitry, which may buffer requests. In some embodiments, execute parse circuitry iterates through items in the buffer using an execute parse pointer (e.g., which may trail the fetch parse pointer) and outputs both item data from the buffer and indirect-fetch results from indirect-fetch circuitry for execution. In various embodiments, the disclosed techniques may reduce fetch latency for compute kernels.

Type: Grant

Filed: September 26, 2018

Date of Patent: November 17, 2020

Assignee: Apple Inc.

Inventors: Andrew M. Havlir, Jeffrey T. Brady
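
The two-pointer arrangement in the abstract above (patent 10838725) can be pictured with a small software model. This is only an illustrative sketch, not the patented circuitry: the item encoding (`"data"` / `"indirect"` tuples), the `memory` dictionary, the `lookahead` parameter, and the function name `run_stream` are all assumptions made for this example, and redirect items are left out.

```python
def run_stream(items, memory, lookahead=2):
    """Toy model of a fetch parse pointer running ahead of an execute parse pointer.

    items  : list of ("data", value) or ("indirect", address) tuples (hypothetical encoding)
    memory : dict mapping address -> data, standing in for indirect-fetch circuitry
    """
    buffer = list(items)      # stands in for the sequentially pre-fetched buffer
    indirect_results = {}     # buffer index -> data returned by the indirect fetch
    fetch_ptr = 0             # fetch parse pointer
    exec_ptr = 0              # execute parse pointer (trails fetch_ptr)
    output = []

    while exec_ptr < len(buffer):
        # The fetch parse pointer scans up to `lookahead` items past the
        # execute pointer and starts any indirect fetches it finds.
        while fetch_ptr < len(buffer) and fetch_ptr <= exec_ptr + lookahead:
            kind, payload = buffer[fetch_ptr]
            if kind == "indirect":
                indirect_results[fetch_ptr] = memory[payload]
            fetch_ptr += 1

        # The execute parse pointer emits item data plus any indirect result.
        kind, payload = buffer[exec_ptr]
        if kind == "indirect":
            output.append((payload, indirect_results.pop(exec_ptr)))
        else:
            output.append(payload)
        exec_ptr += 1

    return output
```

In this model the indirect data for an item has already been requested by the time the execute pointer reaches it, which is the latency benefit the abstract alludes to.
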
Dynamically structured single instruction, multiple data (SIMD) instructions

Patent number: 10824434

Abstract: Examples described herein relate to dynamically structured single instruction, multiple data (SIMD) instructions, and systems and circuits implementing such dynamically structured SIMD instructions. An example is a method for processing data. A first SIMD structure is determined by a processor. A characteristic of the first SIMD structure is altered by the processor to obtain a second SIMD structure. An indication of the second SIMD structure is communicated from the processor to a numerical engine. Data is packed by the numerical engine into an SIMD instruction according to the second SIMD structure. The SIMD instruction is transmitted from the numerical engine.

Type: Grant

Filed: November 29, 2018

Date of Patent: November 3, 2020

Assignee: XILINX, INC.

Inventors: Sean Settle, Ehsan Ghasemi, Ashish Sirasao, Ralph D. Wittig
Generating and verifying hardware instruction traces including memory data contents

Patent number: 10824426

Abstract: Embodiments of the present invention are directed to a computer-implemented method for generating and verifying hardware instruction traces including memory data contents. The method includes initiating an in-memory trace (IMT) data capture for a processor, the IMT data being an instruction trace collected while instructions flow through an execution pipeline of the processor. The method further includes capturing contents of architected registers of the processor by: storing the contents of the architected registers to a predetermined memory location, and causing a load-store unit (LSU) to read contents of the predetermined memory location.

Type: Grant

Filed: April 12, 2019

Date of Patent: November 3, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jane H. Bartik, Christian Jacobi, David Lee, Jang-Soo Lee, Anthony Saporito, Christian Zoellin
Handling multiple control flow instructions

Patent number: 10817299

Abstract: A data processing apparatus is provided that includes a plurality of control flow execution circuits to simultaneously execute a first control flow instruction having a first type and a second control flow instruction having a second type from a plurality of instructions. A control flow prediction update circuit updates at most one of: a prediction of the first control flow instruction based on a result of the first control flow instruction, and a prediction of the second control flow instruction based on a result of the second control flow instruction.

Type: Grant

Filed: September 7, 2018

Date of Patent: October 27, 2020

Assignee: Arm Limited

Inventors: Yasuo Ishii, Chris Abernathy
General-purpose parallel computing architecture

Patent number: 10810156

Abstract: An apparatus includes multiple parallel computing cores, where each computing core is configured to perform one or more processing operations and generate input data. The apparatus also includes multiple sets of parallel coprocessors, where each computing core is associated with a different one of the sets of parallel coprocessors. The coprocessors in each set of parallel coprocessors are configured to process the input data and generate output data. Each of the computing cores is configured to generate additional input data based on the output data generated by the associated set of parallel coprocessors.

Type: Grant

Filed: September 21, 2018

Date of Patent: October 20, 2020

Assignee: Goldman Sachs & Co. LLC

Inventors: Paul Burchard, Ulrich Drepper
Imprecise register dependency tracking

Patent number: 10802830

Abstract: A computer data processing system includes a plurality of logical registers, each including multiple storage sections. A processor writes data to a storage section based on a dispatched first instruction, and sets a valid bit corresponding to the storage section that receives the data. In response to each subsequent instruction, the processor sets an evictor valid bit indicating a subsequent instruction has written new data to a storage section written by the first instruction, and updates the valid bit to indicate the storage section containing the new written data. A register combination unit generates a combined evictor tag to identify a most recent subsequent instruction. The processor determines the most recent subsequent instruction based on the combined evictor tag in response to a flush event, and unsets all the evictor tag valid bits set by the most recent subsequent instruction along with all previous subsequent instructions.

Type: Grant

Filed: March 5, 2019

Date of Patent: October 13, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan Hsieh, Gregory William Alexander, Tu-An Nguyen
Methods and systems for converting a related group of physical machines to virtual machines

Patent number: 10795712

Abstract: A method for processing virtualization of computers that are part of a group into virtual computers is provided. The method includes obtaining relationship data from the computers, where the relationship data identifies parameters used to communicate within the group. Then, the method analyzes utilization parameters for each of the computers of the group. A visual model for proposed virtualization of the group of computers is then generated. The visual model identifies hosting machines designated to define a virtual computer for each of the computers, where the visual model provides a graphical illustration of the group of computers once converted to virtual computers. The method enables adjustment of the proposed virtualization of the group of computers. Then, an execution sequence of virtualization operations to be carried out is generated, if execution of the proposed virtualization is triggered, and the execution sequence is saved to storage and accessed upon execution.

Type: Grant

Filed: May 6, 2018

Date of Patent: October 6, 2020

Assignee: VMware, Inc.

Inventor: Abhinav Katiyar
Vector predication instruction

Patent number: 10782972

Abstract: An apparatus comprises processing circuitry (4) and an instruction decoder (6) which supports vector instructions for which multiple lanes of processing are performed on respective data elements of a vector value. In response to a vector predication instruction, the instruction decoder (6) controls the processing circuitry (4) to set control information based on the outcome of a number of element comparison operations each for determining whether a corresponding element passes or fails a test condition. The control information controls processing of a predetermined number of subsequent vector instructions after the vector predication instruction. The predetermined number is hard-wired or identified by the vector predication instruction. For one of the subsequent vector instructions, an operation for a given portion of a given lane of vector processing is masked based on the outcome indicated by the control information for a corresponding data element.

Type: Grant

Filed: March 17, 2017

Date of Patent: September 22, 2020

Assignee: ARM Limited

Inventor: Thomas Christopher Grocutt
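
The predication scheme in the abstract above (patent 10782972) can be modeled in a few lines. This is a minimal sketch under stated assumptions: the class and method names (`VectorPredicationSketch`, `vpt`, `vop`) are hypothetical, predication is applied per whole lane rather than per portion of a lane, and a masked lane is assumed to keep its previous destination value, which the abstract does not specify.

```python
import operator


class VectorPredicationSketch:
    """Toy model: a predication instruction sets per-lane control information
    that masks a fixed number of subsequent vector instructions."""

    def __init__(self):
        self.mask = []       # per-lane pass/fail outcome of the test condition
        self.remaining = 0   # how many later vector instructions are still predicated

    def vpt(self, a, b, test, count):
        # Element comparison operations: one pass/fail result per lane.
        self.mask = [bool(test(x, y)) for x, y in zip(a, b)]
        self.remaining = count

    def vop(self, dest, a, b, op):
        # Assumption: a masked lane leaves the old destination element unchanged.
        active = self.remaining > 0
        result = [
            op(x, y) if (not active or self.mask[i]) else dest[i]
            for i, (x, y) in enumerate(zip(a, b))
        ]
        if active:
            self.remaining -= 1
        return result


vp = VectorPredicationSketch()
vp.vpt([1, 5, 3, 7], [4, 4, 4, 4], operator.gt, count=1)
predicated = vp.vop([0, 0, 0, 0], [1, 2, 3, 4], [10, 10, 10, 10], operator.add)
unpredicated = vp.vop([0, 0, 0, 0], [1, 2, 3, 4], [10, 10, 10, 10], operator.add)
```

Here only lanes 1 and 3 pass the greater-than test, so the first add updates just those lanes; the second add falls outside the predicated window and updates every lane.
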
Generating and executing a control flow

Patent number: 10782980

Abstract: Examples of the present disclosure provide apparatuses and methods related to generating and executing a control flow. An example apparatus can include a first device configured to generate control flow instructions, and a second device including an array of memory cells, an execution unit to execute the control flow instructions, and a controller configured to control an execution of the control flow instructions on data stored in the array.

Type: Grant

Filed: August 24, 2018

Date of Patent: September 22, 2020

Assignee: Micron Technology, Inc.

Inventors: Kyle B. Wheeler, Richard C. Murphy, Troy A. Manning, Dean A. Klein
Load exploitation and improved pipelineability of hardware instructions

Patent number: 10776207

Abstract: A method, computer program product, and a computer system are disclosed for processing information using hardware instructions in a processor of a computer system by performing a hardware reduction instruction using an input to calculate at least one range reduction factor of the input; performing a hardware restoration instruction using the input to calculate at least one range restoration factor of the input; and performing a final fused multiply add (FMA) type of hardware instruction or a multiply (FM) hardware instruction by combining an approximation based on a value reduced by the at least one range reduction factor with the at least one range restoration factor.

Type: Grant

Filed: September 6, 2018

Date of Patent: September 15, 2020

Assignee: International Business Machines Corporation

Inventors: Robert F. Enenkel, Christopher Anand, Lucas Dutton, Adele Olejarz
Flexible hardware engines for handling operating on multidimensional vectors in a video processor

Patent number: 10776126

Abstract: An apparatus includes a scheduler circuit and a processing circuit. The scheduler circuit may be configured to (i) parse a directed acyclic graph into one or more operators and (ii) schedule the one or more operators in one or more data paths. The processing circuit generally comprises one or more hardware engines configured as the one or more data paths. The one or more hardware engines are generally configured to generate one or more output vectors in response to zero or more input vectors using the operators. At least one of the one or more hardware engines may support input vector dimensions ranging from zero to at least four dimensions. At least one of the one or more hardware engines is implemented solely in hardware.

Type: Grant

Filed: April 29, 2019

Date of Patent: September 15, 2020

Assignee: Ambarella International LP

Inventors: Leslie D. Kohn, Robert C. Kunz
Virtual vector processing

Patent number: 10768989

Abstract: Methods and apparatus to provide virtualized vector processing are described. In one embodiment, one or more operations corresponding to a virtual vector request are distributed to one or more processor cores for execution.

Type: Grant

Filed: January 16, 2018

Date of Patent: September 8, 2020

Assignee: Intel Corporation

Inventors: Anthony Nguyen, Engin Ipek, Victor Lee, Daehyun Kim, Mikhail Smelyanskiy
Using return address predictor to speed up control stack return address verification

Patent number: 10768937

Abstract: Overhead associated with verifying function return addresses to protect against security exploits is reduced by taking advantage of branch prediction mechanisms for predicting return addresses. More specifically, returning from a function includes popping a return address from a data stack. Well-known security exploits overwrite the return address on the data stack to hijack control flow. In some processors, a separate data structure referred to as a control stack is used to verify the data stack. When a return instruction is executed, the processor issues an exception if the return addresses on the control stack and the data stack are not identical. This overhead can be avoided by taking advantage of the return address stack, which is a data structure used by the branch predictor to predict return addresses. In most situations, if this prediction is correct, the above check does not need to occur, thus reducing the associated overhead.

Type: Grant

Filed: July 26, 2018

Date of Patent: September 8, 2020

Assignee: Advanced Micro Devices, Inc.

Inventors: Marius Evers, David A. Kaplan, Debjit Das Sarma
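
The shortcut described in the abstract above (patent 10768937) can be sketched as a toy software model. Everything here is an assumption for illustration: the class name `ReturnCheckSketch`, the `ras_depth` parameter, the use of `RuntimeError` in place of a processor exception, and the simplification that the predictor's return address stack is filled directly by calls.

```python
class ReturnCheckSketch:
    """Toy model of a data stack, a control stack, and a predictor
    return address stack (RAS) of limited depth."""

    def __init__(self, ras_depth=2):
        self.data_stack = []      # software-visible stack; may be overwritten
        self.control_stack = []   # separate structure used for verification
        self.ras = []             # branch predictor's return address stack
        self.ras_depth = ras_depth
        self.full_checks = 0      # number of explicit control-stack comparisons

    def call(self, return_addr):
        self.data_stack.append(return_addr)
        self.control_stack.append(return_addr)
        self.ras.append(return_addr)
        if len(self.ras) > self.ras_depth:
            self.ras.pop(0)       # oldest prediction is lost on overflow

    def ret(self):
        actual = self.data_stack.pop()
        expected = self.control_stack.pop()
        predicted = self.ras.pop() if self.ras else None
        if predicted == actual:
            # Prediction was correct: skip the explicit comparison.
            return actual
        # Misprediction or no prediction: fall back to the full check.
        self.full_checks += 1
        if actual != expected:
            raise RuntimeError("return address mismatch")
        return actual
```

With `ras_depth=2`, three nested calls followed by three returns take the fast path twice and the full check once, because the oldest prediction was dropped; an overwritten return address never matches the prediction, so it always reaches the full check.
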
Linear feedback shift register for a reconfigurable logic unit

Patent number: 10761847

Abstract: An apparatus in a configurable logic unit may include a configurable logic unit (CLU) configured to receive first and second operands and to perform an operand operation and generate an operation value. The apparatus may also include: a random value generator for generating a random value; an adder coupled to the CLU and the random value generator and configured to generate a sum of the operation value and the random value; and a shift register coupled to the adder and configured to shift the sum by a number of bits to generate shifted data at an output. The random value generator may be a linear feedback shift register. The output may be coupled to an additional CLU so that the shifted data may be used for subsequent operand operations. The apparatus may be implemented in a digital signal processor slice in a configurable logic block.

Type: Grant

Filed: August 17, 2018

Date of Patent: September 1, 2020

Assignee: Micron Technology, Inc.

Inventor: David Hulton
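
The adder-plus-shift datapath in the abstract above (patent 10761847) resembles adding a random offset before discarding low bits. The sketch below is one possible reading, not the patented design: the 16-bit register width, the tap positions (a commonly cited 16-bit Fibonacci LFSR tap set), the choice to use only the low `shift_bits` bits of the LFSR state as the random value, and both function names are assumptions.

```python
def lfsr16_step(state):
    """Advance a 16-bit Fibonacci LFSR by one step (taps at bits 0, 2, 3, 5)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def add_random_and_shift(operation_value, state, shift_bits):
    """Add an LFSR-derived random value to operation_value, then shift right.

    Returns (shifted_data, new_lfsr_state).
    """
    state = lfsr16_step(state)
    # Assumption: the random value spans exactly the bits about to be shifted out.
    random_value = state & ((1 << shift_bits) - 1)
    total = operation_value + random_value   # the adder
    return total >> shift_bits, state        # the shift register output
```

Under these assumptions, shifting 1000 right by 4 bits yields either 62 or 63 depending on the random value, so the truncated result is rounded up part of the time instead of always being rounded down.
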
Securing conditional speculative instruction execution

Patent number: 10761855

Abstract: A method performed in a processor includes: receiving, in the processor, a branch instruction in the processing; determining, by the processor, an address of an instruction after the branch instruction as a candidate for speculative execution, the address including an object identification and an offset; and determining, by the processor, whether or not to perform speculative execution of the instruction after the branch instruction based on the object identification of the address.

Type: Grant

Filed: July 6, 2018

Date of Patent: September 1, 2020

Assignee: Micron Technology, Inc.

Inventor: Steven Jeffrey Wallach
Energy efficient processor core architecture for image processor

Patent number: 10754654

Abstract: An apparatus that includes a program controller to fetch and issue instructions is described. The apparatus includes an execution lane having at least one execution unit to execute the instructions. The execution lane is part of an execution lane array that is coupled to a two-dimensional shift register array structure, wherein execution lanes of the execution lane array are located at respective array locations and are coupled to dedicated registers at same respective array locations in the two-dimensional shift register array.

Type: Grant

Filed: March 28, 2019

Date of Patent: August 25, 2020

Assignee: Google LLC

Inventors: Albert Meixner, Jason Rupert Redgrave, Ofer Shacham, Daniel Frederic Finchelstein, Qiuling Zhu
Memory device, a dual inline memory module, a storage device, an apparatus for storing, a method for storing, a computer program, a machine readable storage, and a machine readable medium

Patent number: 10747691

Abstract: Examples provide a memory device, a dual inline memory module, a storage device, an apparatus for storing, a method for storing, a computer program, a machine readable storage, and a machine readable medium. A memory device is configured to store data and comprises one or more interfaces configured to receive and to provide data. The memory device further comprises a memory module configured to store the data, and a memory logic component configured to control the one or more interfaces and the memory module. The memory logic component is further configured to receive information on a specific memory region with one or more model identifications, to receive information on an instruction to perform an acceleration function for one or more certain model identifications, and to perform the acceleration function on data in a specific memory region with the one or more certain model identifications.

Type: Grant

Filed: April 10, 2018

Date of Patent: August 18, 2020

Assignee: Intel Corporation

Inventors: Mark Schmisseur, Thomas Willhalm, Francesc Guim Bernat, Karthik Kumar
Asymmetric performance multicore architecture with same instruction set architecture

Patent number: 10740281

Abstract: A method is described that entails operating enabled cores of a multi-core processor such that both cores support respective software routines with a same instruction set, a first core being higher performance and consuming more power than a second core under a same set of applied supply voltage and operating frequency.

Type: Grant

Filed: August 14, 2018

Date of Patent: August 11, 2020

Assignee: INTEL CORPORATION

Inventors: Varghese George, Sanjeev S. Jahagirdar, Deborah T. Marr
Microcontroller programmable system on a chip

Patent number: 10725954

Abstract: Embodiments of the present invention are directed to a microcontroller device having a microprocessor, programmable memory components, and programmable analog and digital blocks. The programmable analog and digital blocks are configurable based on programming information stored in the memory components. Programmable interconnect logic, also programmable from the memory components, is used to couple the programmable analog and digital blocks as needed. The advanced microcontroller design also includes programmable input/output blocks for coupling selected signals to external pins. The memory components also include user programs that the embedded microprocessor executes. These programs may include instructions for programming the digital and analog blocks “on-the-fly,” e.g., dynamically. In one implementation, there are a plurality of programmable digital blocks and a plurality of programmable analog blocks.

Type: Grant

Filed: June 1, 2018

Date of Patent: July 28, 2020

Assignee: Monterey Research, LLC

Inventors: Warren S. Snyder, Monte Mar
Apparatus and method for using predicted result values

Patent number: 10719329

Abstract: An apparatus and method are provided for using predicted result values. The apparatus has a processing unit that comprises processing circuitry for executing a sequence of instructions, and value prediction circuitry for identifying a predicted result value for at least one instruction. A result producing structure is provided that is responsive to a request issued from the processing unit when the processing circuitry is executing a first instruction, to produce a result value for the first instruction and return that result value to the processing unit. While waiting for the result value from the result producing structure, the processing circuitry can be arranged to speculatively execute at least one dependent instruction using a predicted result value for the first instruction as obtained from the value prediction circuitry.

Type: Grant

Filed: June 28, 2018

Date of Patent: July 21, 2020

Assignee: Arm Limited

Inventors: Vladimir Vasekin, David Michael Bull, Chiloda Ashan Senarath Pathirane, Alexei Fedorov
