Patents Examined by Courtney P Carmichael-Moody

Multi-modal gather operation

Patent number: 11842200

Abstract: An apparatus includes a plurality of load buses and a load store unit that includes a plurality of load ports to access the plurality of load buses. The load store unit performs a gather operation to concurrently gather a plurality of subsets of data from a memory via the plurality of load buses in a first mode. The apparatus also includes a register that is partitioned into a plurality of portions to hold the plurality of subsets of data provided by the load store unit. The load store unit ignores exceptions or faults while performing the gather operation in the first mode and transitions to a second mode in response to an exception or fault. Two lanes are dispatched to concurrently perform the gather operation per clock cycle in the first mode and a single lane is dispatched to perform the gather operation per clock cycle in the second mode.
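
The two-mode behaviour described in this abstract can be sketched as a toy model. The sketch below assumes a list stands in for memory, an out-of-range index stands in for a fault, and the fault is only reported once the unit has dropped to single-lane mode; the function name `gather` and the cycle accounting are illustrative, not taken from the patent.

```python
def gather(memory, indices):
    """Return (values, cycles) for a hypothetical two-mode gather.

    Fast mode handles two lanes per cycle; after the first fault the
    model handles one lane per cycle and reports faults precisely.
    """
    values = [None] * len(indices)
    cycles = 0
    lane = 0
    fast = True
    while lane < len(indices):
        width = 2 if fast else 1
        group = range(lane, min(lane + width, len(indices)))
        cycles += 1
        if fast and any(not 0 <= indices[i] < len(memory) for i in group):
            # Fault seen in fast mode: do not report it, switch to the
            # single-lane mode and retry the same lanes.
            fast = False
            continue
        for i in group:
            idx = indices[i]
            if not 0 <= idx < len(memory):
                raise IndexError(f"lane {i}: address {idx} faults")
            values[i] = memory[idx]
        lane += len(group)
    return values, cycles
```

With four valid indices the model finishes in two cycles; a faulting index costs one aborted fast cycle before the fault is raised in single-lane mode.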

Type: Grant

Filed: September 27, 2019

Date of Patent: December 12, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: John M. King, Magiting Talisayon, Michael Estlick
Efficient inter-thread communication between hardware processing threads of a hardware multithreaded processor by selective aliasing of register blocks

Patent number: 11816486

Abstract: A hardware multithreaded processor including a register file, a thread controller, and aliasing circuitry. The thread controller is configured to assign each of multiple hardware processing threads to a corresponding one of multiple register block sets in which each register block set includes at least two of multiple register blocks and in which each register block includes at least two registers. The aliasing circuitry is programmable to redirect a reference provided by a first hardware processing thread to a register of a register block assigned to a second hardware processing thread. The reference may be a register number in an instruction issued by the first hardware processing thread. The register number is converted by the aliasing circuitry to a register file address locating a register of the register block assigned to the second hardware processing thread. The aliasing circuitry may include a programmable register for one or more threads.
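
As a rough illustration of the register-number translation described here, the sketch below maps a thread-local register number to a flat register file address and lets an alias table redirect one block to a block owned by another thread. The block sizes, the layout of the register file, and the `alias` dictionary are assumptions made for the example.

```python
REGS_PER_BLOCK = 4     # assumed; the abstract only requires at least two
BLOCKS_PER_THREAD = 2  # assumed; each block set holds at least two blocks


def register_file_address(thread, reg_num, alias):
    """Translate a thread-local register number to a register file address.

    alias maps (thread, local block) -> (owner thread, owner block);
    a missing entry means the reference is not redirected.
    """
    block, offset = divmod(reg_num, REGS_PER_BLOCK)
    owner, owner_block = alias.get((thread, block), (thread, block))
    return (owner * BLOCKS_PER_THREAD + owner_block) * REGS_PER_BLOCK + offset
```

With `alias = {(0, 1): (1, 0)}`, register 5 of thread 0 and register 1 of thread 1 resolve to the same address, which is how the two threads would share a value in this model.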

Type: Grant

Filed: January 18, 2022

Date of Patent: November 14, 2023

Assignee: NXP B.V.

Inventor: Michael Andrew Fischer
Computer-readable recording medium storing program for converting first single instruction multiple data (SIMD) command using first mask register into second SIMD command using second mask register, command conversion method for converting first SIMD command using first mask register into second SIMD command using second mask register, and command conversion apparatus for converting first SIMD command using first mask register into second SIMD command using second mask register

Patent number: 11803384

Abstract: A recording medium stores a program for causing a computer to execute a process including: converting, in a first source code corresponding to a first-type processor, a first load command for a first mask register included in the first-type processor into a second load command for a second mask register included in a second-type processor; and converting, when a first SIMD command for performing an arithmetic operation using the first mask register exists after the first load command in the first source code and a state of a value of the first mask register does not coincide with a state of a value of the first mask register, the first SIMD command into a second SIMD command corresponding to the second-type processor and a change command for changing a state of a value of the second mask register to a state of a value of the second mask register.

Type: Grant

Filed: May 31, 2022

Date of Patent: October 31, 2023

Assignee: FUJITSU LIMITED

Inventors: Koji Kurihara, Kentaro Kawakami
Prediction class determination

Patent number: 11803390

Abstract: There is provided an apparatus, method and medium. The apparatus comprises processing circuitry to perform data processing in response to decoded instructions and prediction circuitry to generate a prediction of a number of iterations of a fetching process. The fetching process is used to control fetching of data or instructions to be used in processing operations that are predicted to be performed by the processing circuitry. The processing circuitry is configured to tolerate performing one or more unnecessary iterations of the fetching process following an over-prediction of the number of iterations and, for at least one prediction, to determine a class of a plurality of prediction classes, each of which corresponds to a range of numbers of iterations. The prediction circuitry is also arranged to signal a predetermined number of iterations associated with the class to the processing circuitry to trigger at least the predetermined number of iterations of the fetching process.
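
One way to picture the class-based prediction is a lookup that rounds a predicted iteration count up to a per-class value. The class boundaries below and the choice to signal the upper bound of each range are assumptions for illustration; the abstract does not specify them.

```python
import bisect

# Hypothetical classes: class k covers predicted counts up to UPPER_BOUNDS[k].
UPPER_BOUNDS = [1, 2, 4, 8, 16]


def signalled_iterations(predicted):
    """Map a predicted iteration count to the count signalled for its class."""
    k = bisect.bisect_left(UPPER_BOUNDS, predicted)
    k = min(k, len(UPPER_BOUNDS) - 1)  # clamp large predictions to the last class
    return UPPER_BOUNDS[k]
```

Rounding up means the fetching process may run a few unnecessary iterations, which the abstract says the processing circuitry tolerates.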

Type: Grant

Filed: July 1, 2022

Date of Patent: October 31, 2023

Assignee: Arm Limited

Inventors: Houdhaifa Bouzguarrou, Guillaume Bolbenes, Thibaut Elie Lanois
Fetch stage handling of indirect jumps in a processor pipeline

Patent number: 11797308

Abstract: Systems and methods are disclosed for fetch stage handling of indirect jumps in a processor pipeline. For example, a method includes detecting a sequence of instructions fetched by a processor core, wherein the sequence of instructions includes a first instruction, with a result that depends on an immediate field of the first instruction and a program counter value, followed by a second instruction that is an indirect jump instruction; responsive to detection of the sequence of instructions, preventing an indirect jump target predictor circuit from generating a target address prediction for the second instruction; and, responsive to detection of the sequence of instructions, determining a target address for the second instruction before the first instruction is issued to an execution stage of a pipeline of the processor core.
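
The target computation for such an instruction pair can be sketched using the RV32 AUIPC and JALR encodings as a stand-in. The abstract does not name an instruction set, so the choice of RISC-V, the 32-bit width, and the function names are assumptions.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def fused_jump_target(pc, auipc_imm20, jalr_imm12):
    """Target of AUIPC rd, imm20 followed by JALR that uses rd as its base."""
    base = (pc + (auipc_imm20 << 12)) & MASK32  # AUIPC result
    # JALR adds its sign-extended immediate and clears bit 0 of the target.
    return (base + sign_extend(jalr_imm12, 12)) & MASK32 & ~1
```

Because both immediates and the program counter are known at fetch time, this sum can be formed without waiting for the first instruction to execute.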

Type: Grant

Filed: April 11, 2022

Date of Patent: October 24, 2023

Assignee: SiFive, Inc.

Inventors: Joshua Smith, Krste Asanovic, Andrew Waterman
Asynchronous pipeline merging using long vector arbitration

Patent number: 11797311

Abstract: Devices and techniques for asynchronous pipeline merging are described herein. An apparatus includes a memory controller, which includes merge circuitry; where the memory controller chiplet is configured to perform operations including those to: perform a bitwise logical operation on a first logging bit vector and a second logging bit vector to obtain a result vector, wherein the first logging bit vector is associated with a first pipeline and the second logging bit vector is associated with a second pipeline, and wherein bits in respective index positions of the first and second logging bit vectors represent transactions; select a completed transaction from the result vector using a round-robin technique; and forward the completed transaction from the set of completed transactions to an output pipeline.
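
The merge-and-select step might look like the following sketch, which packs each logging bit vector into a Python integer. The abstract only says "a bitwise logical operation", so the use of OR here is an assumption, as are the function names and the `last` pointer that records the previously selected index.

```python
def select_round_robin(result, width, last):
    """Return the next set bit index after `last`, wrapping around.

    Returns None when no bit in the result vector is set.
    """
    for step in range(1, width + 1):
        idx = (last + step) % width
        if (result >> idx) & 1:
            return idx
    return None


def merge_step(log_a, log_b, width, last):
    """Combine two logging bit vectors and pick one completed transaction."""
    result = log_a | log_b  # assumed operation: bitwise OR
    return select_round_robin(result, width, last)
```

Starting the search one position past the last selection keeps any single transaction index from being served repeatedly while others wait.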

Type: Grant

Filed: September 8, 2022

Date of Patent: October 24, 2023

Assignee: Micron Technology, Inc.

Inventor: Michael Grassi
Reconfigurable multi-thread processor for simultaneous operations on split instructions and operands

Patent number: 11782719

Abstract: A superscalar processor has a thread mode of operation for supporting multiple instruction execution threads which are full data path wide instructions, and a micro-thread mode of operation where each thread supports two micro-threads which independently execute instructions. An executed instruction sets a micro-thread mode and an executed instruction sets the thread mode.

Type: Grant

Filed: March 27, 2021

Date of Patent: October 10, 2023

Assignee: Ceremorphic, Inc.

Inventor: Heonchul Park
Reusing fetched, flushed instructions after an instruction pipeline flush in response to a hazard in a processor to reduce instruction re-fetching

Patent number: 11726787

Abstract: Reusing fetched, flushed instructions after an instruction pipeline flush in response to a hazard in a processor to reduce instruction re-fetching is disclosed. An instruction processing circuit is configured to detect fetched performance degrading instructions (PDIs) in a pre-execution stage in an instruction pipeline that may cause a precise interrupt that would cause flushing of the instruction pipeline. In response to detecting a PDI in an instruction pipeline, the instruction processing circuit is configured to capture the fetched PDI and/or its successor, younger fetched instructions that are processed in the instruction pipeline behind the PDI, in a pipeline refill circuit.

Type: Grant

Filed: May 27, 2022

Date of Patent: August 15, 2023

Assignee: Microsoft Technology Licensing LLC

Inventors: Rami Mohammad Al Sheikh, Michael Scott McIlvaine
Arithmetic processing apparatus using either simple or complex instruction decoder

Patent number: 11720366

Abstract: An arithmetic processing apparatus includes two instruction decoders. A first decoder processes instructions in a single cycle, while a second decoder processes instructions in a plurality of cycles. The apparatus further includes a determination circuit that causes the first decoder to process an instruction to be processed when the instruction to be processed is a specific instruction and there is no previous instruction being processed, and causes the second decoder to process the instruction to be processed when the instruction to be processed is not the specific instruction or there is a previous instruction being processed.
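
The determination circuit's rule reduces to a two-input decision, sketched below. The string labels and parameter names are illustrative only.

```python
def choose_decoder(is_specific, previous_in_flight):
    """Pick the decoder for the next instruction.

    The single-cycle (first) decoder is used only when the instruction is
    a specific instruction and no earlier instruction is still being
    processed; every other case goes to the multi-cycle (second) decoder.
    """
    if is_specific and not previous_in_flight:
        return "first"
    return "second"
```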

Type: Grant

Filed: February 23, 2021

Date of Patent: August 8, 2023

Assignee: FUJITSU LIMITED

Inventor: Ryohei Okazaki
Zero operand instruction conversion for accelerating sparse computations in a central processing unit pipeline

Patent number: 11714652

Abstract: A processing device includes a zero detection circuit to determine that an operand of a first instruction is zero and instruction conversion logic coupled with the zero detection circuit to, in response to the zero detection circuit determining that the operand is zero, convert the first instruction to a register move instruction executable by the processing device.
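
A minimal sketch of the conversion, assuming integer `add` and `mul` instructions and a hypothetical always-zero source named `zero`; the instruction format and register names are invented for the example.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    op: str
    dst: str
    src1: str
    src2: str = ""


def convert_if_zero(instr, regs):
    """Rewrite an integer add/mul with a zero operand as a register move."""
    if instr.op not in ("add", "mul"):
        return instr
    a_zero = regs[instr.src1] == 0
    b_zero = regs[instr.src2] == 0
    if instr.op == "mul" and (a_zero or b_zero):
        return Instr("mov", instr.dst, "zero")      # x * 0 == 0
    if instr.op == "add" and b_zero:
        return Instr("mov", instr.dst, instr.src1)  # x + 0 == x
    if instr.op == "add" and a_zero:
        return Instr("mov", instr.dst, instr.src2)  # 0 + y == y
    return instr
```

The rewritten move avoids occupying a multiplier or adder, which is where the benefit for sparse data would come from in this model.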

Type: Grant

Filed: July 23, 2021

Date of Patent: August 1, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: John Kalamatianos, Ganesh Dasika
Streaming engine with early exit from loop levels supporting early exit loops and irregular loops

Patent number: 11714646

Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements. A stream head register stores data elements next to be supplied to functional units for use as operands. Upon a stream break instruction specifying one of the nested loops, the stream engine ends a current iteration of the loop. If the specified loop was not the outermost loop, the streaming engine begins an iteration of a next outer loop. If the specified loop was the outermost nested loop, the streaming engine ends the stream. The streaming engine places a vector of data elements in order in lanes within a stream head register. A stream break instruction is operable upon a vector break.

Type: Grant

Filed: February 1, 2021

Date of Patent: August 1, 2023

Assignee: Texas Instruments Incorporated

Inventor: Joseph Zbiciak
Data processing apparatus and method for providing candidate prediction entries

Patent number: 11687343

Abstract: A data processing apparatus and a method are disclosed.

Type: Grant

Filed: September 29, 2020

Date of Patent: June 27, 2023

Assignee: Arm Limited

Inventors: Yasuo Ishii, Chang Joo Lee, James David Dundas, Muhammed Umar Farooq
Method for forming constant extensions in the same execute packet in a VLIW processor

Patent number: 11681532

Abstract: In a very long instruction word (VLIW) central processing unit instructions are grouped into execute packets that execute in parallel. A constant may be specified or extended by bits in a constant extension instruction in the same execute packet. If an instruction includes an indication of constant extension, the decoder employs bits of a constant extension instruction to extend the constant of an immediate field. Two or more constant extension slots are permitted in each execute packet, each extending constants for a different predetermined subset of functional unit instructions. In an alternative embodiment, more than one functional unit may have constants extended from the same constant extension instruction employing the same extended bits. A long extended constant may be formed using the extension bits of two constant extension instructions.
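
The constant-forming step can be sketched as simple bit concatenation. The field widths below, and the choice to place the extension bits above the immediate field, are assumptions for illustration rather than details from the abstract.

```python
IMM_BITS = 5   # assumed width of the instruction's immediate field
EXT_BITS = 27  # assumed number of bits supplied by the extension slot


def extend_constant(imm, ext, use_extension):
    """Form the constant seen by the functional unit.

    When the instruction flags constant extension, the extension bits
    become the upper bits and the immediate field the lower bits.
    """
    imm &= (1 << IMM_BITS) - 1
    if not use_extension:
        return imm
    return ((ext & ((1 << EXT_BITS) - 1)) << IMM_BITS) | imm
```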

Type: Grant

Filed: April 13, 2020

Date of Patent: June 20, 2023

Assignee: Texas Instruments Incorporated

Inventors: Timothy David Anderson, Duc Quang Bui, Joseph Raymond Michael Zbiciak
Computational array microprocessor system using non-consecutive data formatting

Patent number: 11681649

Abstract: A microprocessor system comprises a computational array and a hardware data formatter. The computational array includes a plurality of computation units that each operates on a corresponding value addressed from memory. The values operated by the computation units are synchronously provided together to the computational array as a group of values to be processed in parallel. The hardware data formatter is configured to gather the group of values, wherein the group of values includes a first subset of values located consecutively in memory and a second subset of values located consecutively in memory. The first subset of values is not required to be located consecutively in the memory from the second subset of values.
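
The gathering behaviour can be modelled as collecting several runs of consecutive values into one group. The `(start, length)` run description and the function name are hypothetical; a list stands in for memory.

```python
def gather_group(memory, runs):
    """Collect one group of values from non-adjacent runs of memory.

    runs is a list of (start, length) pairs. Each run is consecutive in
    memory, but the runs need not be adjacent to one another.
    """
    group = []
    for start, length in runs:
        group.extend(memory[start:start + length])
    return group
```

For example, runs `[(0, 4), (8, 4)]` produce an eight-value group that skips the four values in between.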

Type: Grant

Filed: October 22, 2021

Date of Patent: June 20, 2023

Assignee: Tesla, Inc.

Inventors: Emil Talpes, William McGee, Peter Joseph Bannon
Control of branch prediction for zero-overhead loop

Patent number: 11663007

Abstract: In response to decoding a zero-overhead loop control instruction of an instruction set architecture, processing circuitry sets at least one loop control parameter for controlling execution of one or more iterations of a program loop body of a zero-overhead loop. Based on the at least one loop control parameter, loop control circuitry controls execution of the one or more iterations of the program loop body of the zero-overhead loop, the program loop body excluding the zero-overhead loop control instruction. Branch prediction disabling circuitry detects whether the processing circuitry is executing the program loop body of the zero-overhead loop associated with the zero-overhead loop control instruction, and dependent on detecting that the processing circuitry is executing the program loop body of the zero-overhead loop, disables branch prediction circuitry. This reduces power consumption during a zero-overhead loop when the branch prediction circuitry is unlikely to provide a benefit.

Type: Grant

Filed: October 1, 2021

Date of Patent: May 30, 2023

Assignee: Arm Limited

Inventors: Thomas Christopher Grocutt, François Christopher Jacques Botman
Program flow prediction for loops

Patent number: 11650822

Abstract: Instruction processing circuitry comprises fetch circuitry to fetch instructions for execution; instruction decoder circuitry to decode fetched instructions; execution circuitry to execute decoded instructions; and program flow prediction circuitry to predict a next instruction to be fetched; in which the instruction decoder circuitry is configured to decode a loop control instruction in respect of a given program loop and to derive information from the loop control instruction for use by the program flow prediction circuitry to predict program flow for one or more iterations of the given program loop.

Type: Grant

Filed: October 25, 2021

Date of Patent: May 16, 2023

Assignee: Arm Limited

Inventor: Vijay Chavan
Constrained carries on speculative counters

Patent number: 11620134

Abstract: A computer-implemented method of constrained carries on speculative counters includes providing one or more speculative counters having an upper portion of most significant bits partially embedded in a random-access memory (RAM) array, and a pre-counter portion external to the RAM array having a plurality of least significant bits. The one or more speculative counters are configured to count a plurality of events of interest during a processor core instruction execution. A carry output from the pre-counter portion to the RAM array is suppressed for a duration of a speculative event period.

Type: Grant

Filed: June 30, 2021

Date of Patent: April 4, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dan Buerkle, James W. Bishop, Maria Lorena Pesantez, David Henry Wilde
Processor and pipeline processing method for processing multiple threads including wait instruction processing

Patent number: 11586444

Abstract: A pipeline processing unit includes a fetch unit that fetches the instruction for the thread having an execution right, a decoding unit that decodes the instruction fetched by the fetch unit, and a computation execution unit that executes the instruction decoded by the decoding unit. When the WAIT instruction for the thread having the execution right is executed, an instruction holding unit holds instruction fetch information on a processing target instruction to be processed immediately after the WAIT instruction. An execution target thread selection unit selects a thread to be executed based on a wait command and, in response to a wait state started from the execution of the WAIT instruction being canceled, processes the processing target instruction from decoding thereof based on the instruction fetch information on the processing target instruction held in the instruction holding unit.

Type: Grant

Filed: June 10, 2021

Date of Patent: February 21, 2023

Assignee: SANKEN ELECTRIC CO., LTD.

Inventors: Kazuhiro Mima, Hitomi Shishido
Memory-network processor with programmable optimizations

Patent number: 11544072

Abstract: Various embodiments are disclosed of a multiprocessor system with processing elements optimized for high performance and low power dissipation and an associated method of programming the processing elements. Each processing element may comprise a fetch unit and a plurality of address generator units and a plurality of pipelined datapaths. The fetch unit may be configured to receive a multi-part instruction, wherein the multi-part instruction includes a plurality of fields. First and second address generator units may generate, based on different fields of the multi-part instruction, addresses from which to retrieve first and second data for use by an execution unit for the multi-part instruction or a subsequent multi-part instruction. The execution units may perform operations using a single pipeline or multiple pipelines based on third and fourth fields of the multi-part instruction.

Type: Grant

Filed: March 16, 2021

Date of Patent: January 3, 2023

Assignee: Coherent Logix, Inc.

Inventors: Michael B. Doerr, Carl S. Dobbs, Michael B. Solka, Michael R. Trocino, Kenneth R. Faulkner, Keith M. Bindloss, Sumeer Arya, John Mark Beardslee, David A. Gibson
Control flow mechanism for execution of graphics processor instructions using active channel packing

Patent number: 11537403

Abstract: An apparatus to facilitate control flow in a graphics processing system is disclosed. The apparatus includes a plurality of execution units to execute single instruction, multiple data (SIMD) instructions, and flow control logic to detect a diverging control flow in a plurality of SIMD channels and reduce the execution of the control flow to a subset of the SIMD channels.

Type: Grant

Filed: March 26, 2021

Date of Patent: December 27, 2022

Assignee: INTEL CORPORATION

Inventors: Subramaniam M. Maiyuran, Guei-Yuan Lueh, Supratim Pal, Gang Chen, Ananda V. Kommaraju, Joy Chandra, Altug Koker, Prasoonkumar Surti, David Puffer, Hong Bin Liao, Joydeep Ray, Abhishek R. Appu, Ankur N. Shah, Travis T. Schluessler, Jonathan Kennedy, Devan Burke
