Abstract: A first processor processes an instruction configured to perform a plurality of functions. The plurality of functions includes one or more functions to operate on one or more tensors. A determination is made of a function of the plurality of functions to be performed. The first processor provides to a second processor information related to the function. The second processor is to perform the function. The first processor and the second processor share memory providing memory coherence.
Type: Grant
Filed: June 17, 2021
Date of Patent: June 6, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Laith M. AlBarakat, Jonathan D. Bradbury, Timothy Slegel, Cedric Lichtenau, Simon Weishaupt, Anthony Saporito
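To make the division of labor concrete, here is a minimal Python sketch of the flow the abstract describes: a first processor decodes an instruction that can select among several functions, determines the requested function, and provides the second processor the information needed to perform it on tensors held in shared memory. The function encoding, class names, and shared-memory model are all assumptions for illustration, not the patented design.

```python
# Illustrative sketch only: function codes, names, and the shared-memory
# model are assumptions, not the patented design.

FUNC_ADD, FUNC_RELU = 0, 1

class SharedMemory:
    """Stands in for memory that both processors see coherently."""
    def __init__(self):
        self.tensors = {}

class SecondProcessor:
    def __init__(self, shared):
        self.shared = shared

    def perform(self, func, src_keys, dst_key):
        srcs = [self.shared.tensors[k] for k in src_keys]
        if func == FUNC_ADD:
            result = [a + b for a, b in zip(*srcs)]
        elif func == FUNC_RELU:
            result = [max(0, x) for x in srcs[0]]
        else:
            raise NotImplementedError(func)
        self.shared.tensors[dst_key] = result   # visible to both processors

class FirstProcessor:
    def __init__(self, shared, second):
        self.shared, self.second = shared, second

    def execute(self, instruction):
        # Determine which of the instruction's functions is requested and
        # provide the second processor the information it needs to run it.
        func, src_keys, dst_key = instruction
        self.second.perform(func, src_keys, dst_key)

shared = SharedMemory()
shared.tensors["a"], shared.tensors["b"] = [1, -2, 3], [4, 5, -6]
cpu = FirstProcessor(shared, SecondProcessor(shared))
cpu.execute((FUNC_ADD, ["a", "b"], "c"))
print(shared.tensors["c"])  # [5, 3, -3]
```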
Abstract: An integrated circuit including configurable multiplier-accumulator circuitry, wherein, during processing operations, a plurality of the multiplier-accumulator circuits are serially connected into pipelines to perform concatenated multiply and accumulate operations. The integrated circuit includes a first memory and a second memory, and a switch interconnect network, including configurable multiplexers arranged in a plurality of switch matrices. The first and second memories are configurable as either a dedicated read memory or a dedicated write memory and connected to a given pipeline, via the switch interconnect network, during a processing operation performed thereby; wherein, during a first processing operation, the first memory is dedicated to write data to a first pipeline and the second memory is dedicated to read data therefrom and, during a second processing operation, the first memory is dedicated to read data from a second pipeline and the second memory is dedicated to write data thereto.
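A functional Python sketch of the concatenated multiply-accumulate idea: each serially connected stage multiplies one operand pair and adds the running sum handed on by the previous stage, while one buffer is dedicated to reads and another to writes for a given pass (roles that can swap between operations). Everything here is an illustrative stand-in for the hardware.

```python
# Minimal functional sketch of serially connected multiplier-accumulator
# stages: each stage multiplies its operand pair and adds the running sum
# passed from the previous stage (a concatenated multiply-accumulate).

def mac_pipeline(a_vec, b_vec):
    acc = 0
    for a, b in zip(a_vec, b_vec):   # each iteration = one MAC stage
        acc = acc + a * b            # multiply, then accumulate
    return acc

read_mem = ([1, 2, 3], [4, 5, 6])    # memory dedicated to reads this pass
write_mem = []                       # memory dedicated to writes this pass
write_mem.append(mac_pipeline(*read_mem))
print(write_mem)  # [32]
```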
Abstract: Examples of the present disclosure provide apparatuses and methods for determining a vector population count in a memory. An example method comprises determining, using sensing circuitry, a vector population count of a number of fixed length elements of a vector stored in a memory array.
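The claimed operation reduces to counting set bits per fixed-length element of a stored vector; a short Python model, with the element width and bit values chosen arbitrarily for illustration:

```python
# Sketch of vector population count semantics: split a stored vector into
# fixed-length elements and count the set bits in each one.

def vector_popcount(vector_bits, element_width):
    counts = []
    for i in range(0, len(vector_bits), element_width):
        element = vector_bits[i:i + element_width]
        counts.append(sum(element))  # number of 1 bits in this element
    return counts

# 16-bit vector as four 4-bit elements: 0b1011, 0b0000, 0b1111, 0b0101
bits = [1,0,1,1, 0,0,0,0, 1,1,1,1, 0,1,0,1]
print(vector_popcount(bits, 4))  # [3, 0, 4, 2]
```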
Abstract: A memory device includes a memory having a memory bank, a processor in memory (PIM) circuit, and control logic. The PIM circuit includes instruction memory storing at least one instruction provided from a host. The PIM circuit is configured to process an operation using data provided by the host or data read from the memory bank and to store at least one instruction provided by the host. The control logic is configured to decode a command/address received from the host to generate a decoding result and to perform a control operation so that, based on the decoding result, either i) a memory operation on the memory bank is performed or ii) the PIM circuit performs a processing operation. A counting value of a program counter indicating a position in the instruction memory is controlled in response to the command/address instructing that the processing operation be performed.
Type: Grant
Filed: March 10, 2020
Date of Patent: May 30, 2023
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Sukhan Lee, Shinhaeng Kang, Namsung Kim, Seongil O, Hak-Soo Yu
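A rough Python model of the control flow described (the structures and opcode names are assumptions): the decoded command/address either services an ordinary bank access or triggers the PIM circuit, whose program counter steps through host-loaded instruction memory.

```python
# Rough model with assumed names: a command either performs a normal bank
# access or drives the PIM circuit, whose program counter walks the
# on-chip instruction memory that the host loaded.

MEMORY_OP, PIM_OP = "mem", "pim"

class PIMDevice:
    def __init__(self, instructions):
        self.bank = {}
        self.instruction_memory = instructions  # loaded by the host
        self.program_counter = 0
        self.acc = 0

    def handle(self, kind, addr=None, data=None):
        if kind == MEMORY_OP:                   # ordinary bank read/write
            if data is not None:
                self.bank[addr] = data
            return self.bank.get(addr)
        # PIM_OP: execute the instruction at the current counter position,
        # then advance the counter as the command/address directs.
        op, operand = self.instruction_memory[self.program_counter]
        if op == "load":
            self.acc = self.bank[operand]
        elif op == "add":
            self.acc += self.bank[operand]
        self.program_counter = (self.program_counter + 1) % len(self.instruction_memory)
        return self.acc

dev = PIMDevice([("load", 0), ("add", 1)])
dev.handle(MEMORY_OP, addr=0, data=10)
dev.handle(MEMORY_OP, addr=1, data=32)
dev.handle(PIM_OP)
print(dev.handle(PIM_OP))  # 42
```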
Abstract: A memory device includes a memory having a memory bank, a processor in memory (PIM) circuit, and control logic. The PIM circuit includes instruction memory storing at least one instruction provided from a host. The PIM circuit is configured to process an operation using data provided by the host or data read from the memory bank and to store at least one instruction provided by the host. The control logic is configured to decode a command/address received from the host to generate a decoding result and to perform a control operation so that, based on the decoding result, either i) a memory operation on the memory bank is performed or ii) the PIM circuit performs a processing operation. A counting value of a program counter indicating a position in the instruction memory is controlled in response to the command/address instructing that the processing operation be performed.
Type: Grant
Filed: March 10, 2020
Date of Patent: April 25, 2023
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Sukhan Lee, Shinhaeng Kang, Namsung Kim, Seongil O, Hak-Soo Yu
Abstract: A fully pipelined convertToBinaryFromDecimalCharacter hardware operator logic circuit configured to convert one or more human-readable decimal character sequence floating-point representations to IEEE 754-2008 binary floating-point representations every clock cycle. The circuit converts decimal character sequence floating-point representations up to 28 decimal digits in length to IEEE 754 binary64, binary32, or binary16 floating-point format representations.
Abstract: A universal floating-point Instruction Set Architecture (ISA) compute engine implemented entirely in hardware. The ISA compute engine computes directly with human-readable decimal character sequence floating-point representation operands without first having to explicitly perform a conversion-to-binary-format process in software. A fully pipelined convertToBinaryFromDecimalCharacter hardware operator logic circuit converts one or more human-readable decimal character sequence floating-point representations to IEEE 754-2008 binary floating-point representations every clock cycle. Following computations by at least one hardware floating-point operator, a convertToDecimalCharacterFromBinary hardware conversion circuit converts the result back to a human-readable decimal character sequence floating-point representation.
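A software stand-in for the conversion path these two abstracts describe, with Python's float() and repr() modeling the convertToBinaryFromDecimalCharacter and convertToDecimalCharacterFromBinary circuits (which, per the abstracts, are fully pipelined single-cycle hardware, unlike this sketch):

```python
# Software stand-in only: float() models the pipelined hardware converter
# (the abstracts cite decimal character sequences up to 28 digits), and
# struct exposes the IEEE 754 binary64 bit pattern of the result.

import struct

def convert_to_binary_from_decimal_character(text: str) -> float:
    return float(text)       # hardware does this once per clock cycle

def convert_to_decimal_character_from_binary(value: float) -> str:
    return repr(value)

x = convert_to_binary_from_decimal_character("3.14159")
y = convert_to_binary_from_decimal_character("2.0")
result = x * y               # the intervening floating-point operation
print(struct.pack(">d", result).hex())                    # binary64 bits
print(convert_to_decimal_character_from_binary(result))   # 6.28318
```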
Abstract: Systems, apparatus, and methods for thread-based scheduling within a multicore processor. Neural networking uses a network of connected nodes (aka neurons) to loosely model the neuro-biological functionality found in the human brain. Various embodiments of the present disclosure use thread dependency graphs analysis to decouple scheduling across many distributed cores. Rather than using thread dependency graphs to generate a sequential ordering for a centralized scheduler, the individual thread dependencies define a count value for each thread at compile-time. Threads and their thread dependency count are distributed to each core at run-time. Thereafter, each core can dynamically determine which threads to execute based on fulfilled thread dependencies without requiring a centralized scheduler.
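A compact Python illustration of the decentralized scheduling scheme: dependency counts are fixed from the thread dependency graph at "compile time", and at run time any core can decrement counts as predecessors finish and launch whichever threads reach zero, with no centralized scheduler. The dependency graph is hypothetical.

```python
# Illustrative sketch: dependency counts come from the graph at compile
# time; at run time threads become eligible once their count hits zero.

from collections import deque

# thread -> threads that depend on it (a hypothetical dependency graph)
successors = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
# compile-time dependency counts: number of unfinished predecessors
dep_count = {"A": 0, "B": 0, "C": 2, "D": 1}

ready = deque(t for t, c in dep_count.items() if c == 0)
order = []
while ready:
    thread = ready.popleft()       # any core may pick this up independently
    order.append(thread)
    for succ in successors[thread]:
        dep_count[succ] -= 1
        if dep_count[succ] == 0:   # all dependencies fulfilled
            ready.append(succ)

print(order)  # ['A', 'B', 'C', 'D']
```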
Abstract: An apparatus for hardware acceleration for use in operating a computational network is configured to determine that a loop structure including one or more loops is to be executed by a first processor. Each of the one or more loops includes a set of operations. The loop structure may be configured as a nested loop, a cascaded loop, or a combination of the two. A second processor may be configured to decouple overhead operations of the loop structure from compute operations of the loop structure. The apparatus accelerates processing of the loop structure by processing the overhead operations on the second processor simultaneously with, but separately from, the compute operations, based on the configuration to operate the computational network.
Type: Grant
Filed: March 30, 2018
Date of Patent: March 28, 2023
Assignee: QUALCOMM Incorporated
Inventors: Amrit Panda, Francisco Perez, Karamvir Chatha
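A hedged sketch of the decoupling idea, using a Python generator as the overhead unit that produces only loop bookkeeping (index arithmetic for a nested structure) while a separate function performs only the compute operations:

```python
# Sketch only: one unit produces the loop "overhead" (indices and bounds
# tests) while a separate unit consumes indices and does pure compute.

def overhead_unit(rows, cols):
    """Second processor: handles loop bookkeeping only."""
    for i in range(rows):          # nested-loop structure
        for j in range(cols):
            yield i, j             # hand indices to the compute side

def compute_unit(indices, a, b, out):
    """First processor: pure compute, no loop-control overhead."""
    for i, j in indices:
        out[i][j] = a[i][j] + b[i][j]

a = [[1, 2], [3, 4]]
b = [[10, 20], [30, 40]]
out = [[0, 0], [0, 0]]
compute_unit(overhead_unit(2, 2), a, b, out)
print(out)  # [[11, 22], [33, 44]]
```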
Abstract: A method is provided that includes performing, by a processor in response to a vector sort instruction, sorting of values stored in lanes of the vector to generate a sorted vector, wherein the values in a first portion of the lanes are sorted in a first order indicated by the vector sort instruction and the values in a second portion of the lanes are sorted in a second order indicated by the vector sort instruction; and storing the sorted vector in a storage location.
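The instruction semantics can be modeled in a few lines of Python; the split point between the two lane portions and the two sort orders are assumptions for illustration:

```python
# Sketch of the vector-sort semantics: one instruction sorts the first
# portion of the lanes in one order and the second portion in another.

def vsort(lanes, split, first_descending=False, second_descending=True):
    lo = sorted(lanes[:split], reverse=first_descending)
    hi = sorted(lanes[split:], reverse=second_descending)
    return lo + hi  # the "sorted vector" written to the storage location

vector = [7, 2, 9, 1, 5, 8, 3, 6]
print(vsort(vector, split=4))  # [1, 2, 7, 9, 8, 6, 5, 3]
```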
Abstract: A processor in a data processing system includes a master-shadow physical register file and a renaming unit. The master-shadow physical register file has a master storage coupled to shadow storage. The renaming unit is coupled to the master-shadow physical register file. Based on an occurrence of shadow transfer activation conditions verified by the renaming unit, data in the master storage is transferred from the master storage to the shadow storage for storage. Data is transferred from the shadow storage back to the master storage based on the occurrence of a shadow-to-master transfer event, which includes, for example, a flush of the master storage by the processor.
Type: Grant
Filed: May 18, 2020
Date of Patent: March 7, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Arun A. Nair, Ashok T. Venkatachar, Emil Talpes, Srikanth Arekapudi, Rajesh Kumar Arunachalam
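A simplified Python model of the master-shadow transfers: an activation condition checkpoints the master contents into shadow storage, and a later flush (a shadow-to-master transfer event) restores them. The trigger conditions here are assumed, not the renaming unit's actual checks.

```python
# Simplified model: checkpoint master -> shadow on an activation
# condition; restore shadow -> master on a flush of the master storage.

class MasterShadowRegFile:
    def __init__(self, n):
        self.master = [0] * n
        self.shadow = [0] * n

    def checkpoint(self):           # shadow transfer activation
        self.shadow = list(self.master)

    def flush_recover(self):        # shadow-to-master transfer event
        self.master = list(self.shadow)

rf = MasterShadowRegFile(4)
rf.master[0] = 42
rf.checkpoint()                     # verified condition: save master state
rf.master[0] = 99                   # speculative update
rf.flush_recover()                  # flush: restore pre-speculation values
print(rf.master[0])  # 42
```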
Abstract: A state machine engine having a program buffer. The program buffer is configured to receive configuration data via a bus interface for configuring a state machine lattice. The state machine engine also includes a repair map buffer configured to provide repair map data to an external device via the bus interface. The state machine lattice includes multiple programmable elements. Each programmable element includes multiple memory cells configured to analyze data and to output a result of the analysis.
Abstract: Techniques for performing instruction fetch operations are provided. The techniques include determining instruction addresses for a primary branch prediction path; requesting that a level 0 translation lookaside buffer (“TLB”) caches address translations for the primary branch prediction path; determining either or both of alternate control flow path instruction addresses and lookahead control flow path instruction addresses; and requesting that either the level 0 TLB or an alternative level TLB caches address translations for either or both of the alternate control flow path instruction addresses and the lookahead control flow path instruction addresses.
Type: Grant
Filed: June 26, 2020
Date of Patent: February 14, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Ashok Tirupathy Venkatachar, Steven R. Havlir, Robert B. Cohen
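A loose Python model of the fetch-side idea (page size, TLB structure, and addresses are all illustrative): translations for the predicted path are requested into a level 0 TLB, while alternate and lookahead paths go to an alternative-level TLB.

```python
# Loose model only: translations along the primary branch prediction path
# go to the L0 TLB; alternate/lookahead paths go to an alternative TLB.

PAGE = 4096

class TLB(dict):
    def prefetch(self, vaddr):
        page = vaddr // PAGE
        self[page] = page + 0x1000   # fake translation for the sketch

l0_tlb, alt_tlb = TLB(), TLB()

primary_path = [0x1000, 0x1004, 0x2008]   # predicted instruction addresses
alternate_path = [0x9000]                 # not-taken direction
lookahead_path = [0x5000]                 # beyond the predicted branch

for addr in primary_path:
    l0_tlb.prefetch(addr)                 # primary path -> level 0 TLB
for addr in alternate_path + lookahead_path:
    alt_tlb.prefetch(addr)                # other paths -> alternative TLB

print(sorted(l0_tlb), sorted(alt_tlb))    # cached page numbers
```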
Abstract: A processing system 2 includes a processing pipeline 12, 14, 16, 18, 28 which includes fetch circuitry 12 for fetching instructions to be executed from a memory 6, 8. Buffer control circuitry 34 is responsive to a programmable trigger, such as explicit hint instructions delimiting an instruction burst, or predetermined configuration data specifying parameters of a burst together with a synchronising instruction, to stall a stallable portion of the processing pipeline (e.g. issue circuitry 16), to accumulate within one or more buffers 30, 32 fetched instructions starting from a predetermined starting instruction, and, when those instructions have been accumulated, to restart the stallable portion of the pipeline.
Type: Grant
Filed: November 18, 2020
Date of Patent: February 14, 2023
Assignee: ARM LIMITED
Inventors: Jatin Bhartia, Kauser Yakub Johar, Antony John Penton
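A minimal Python sketch of the burst mechanism: on the programmable trigger, the stallable stage stalls, fetched instructions accumulate in a buffer starting from the designated instruction, and issue restarts once the whole burst is buffered. The burst length and trigger encoding are assumptions.

```python
# Minimal sketch: stall on trigger, accumulate the burst in a buffer,
# then restart the stallable stage and issue the burst together.

def run_burst(instruction_stream, start, burst_len):
    buffer, issued, stalled = [], [], False
    for pc, insn in enumerate(instruction_stream):
        if pc == start:
            stalled = True                  # trigger: stall issue stage
        if stalled:
            buffer.append(insn)             # accumulate the burst
            if len(buffer) == burst_len:
                issued.extend(buffer)       # restart: issue burst together
                buffer.clear()
                stalled = False
        else:
            issued.append(insn)
    return issued

print(run_burst(["i0", "i1", "i2", "i3", "i4"], start=1, burst_len=3))
# ['i0', 'i1', 'i2', 'i3', 'i4']
```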
Abstract: A method of processing partitions of a tensor in a target order includes receiving, by a reorder unit and from two or more producer units, a plurality of partitions of a tensor in a first order that is different from the target order, storing the plurality of partitions in the reorder unit, and providing, from the reorder unit, the plurality of partitions in the target order to one or more consumer units. In an example, the one or more consumer units process the plurality of partitions in the target order.
Type: Grant
Filed: September 16, 2021
Date of Patent: January 24, 2023
Assignee: SambaNova Systems, Inc.
Inventors: Raghu Prabhakar, Nathan Francis Sheeley, Matheen Musaddiq, Scott Layson Burson, Sitanshu Gupta, Sumti Jairath, Pramod Nataraja, Ajit Punj
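The reorder unit's behavior can be sketched as a small buffering loop in Python: partitions arrive in production order, are keyed by their position in the target order, and are released to consumers strictly in target order. Partition IDs and payloads are hypothetical.

```python
# Sketch of the reorder unit: buffer partitions as they arrive in a first
# order, release them to consumers strictly in the target order.

def reorder(arriving, target_order):
    buffered, next_idx, out = {}, 0, []
    want = {pid: i for i, pid in enumerate(target_order)}
    for pid, data in arriving:                # first order: as produced
        buffered[want[pid]] = (pid, data)
        while next_idx in buffered:           # release any ready prefix
            out.append(buffered.pop(next_idx))
            next_idx += 1
    return out                                # target order: as consumed

arrivals = [("p2", "B"), ("p0", "A"), ("p3", "D"), ("p1", "C")]
print(reorder(arrivals, target_order=["p0", "p1", "p2", "p3"]))
# [('p0', 'A'), ('p1', 'C'), ('p2', 'B'), ('p3', 'D')]
```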
Abstract: A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets. The computer comprises: a plurality of processing units each having an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by its associated processing unit; the processing units arranged in columns, each column having a base processing unit proximate the switching fabric and multiple processing units one adjacent the other in respective positions in the direction of the column.
Abstract: A device architecture includes a spatially reconfigurable array of processors, such as configurable units of a CGRA, having spare homogeneous subarrays, and a parameter store on the device which stores parameters that tag one or more elements as unusable. Configuration data is distributed using a statically reconfigurable bus system, to implement the pattern of placement of configuration data, in dependence on the tagged elements. As a result, a spatially reconfigurable array having unusable elements can be repaired.
Type: Grant
Filed: July 16, 2021
Date of Patent: January 17, 2023
Assignee: SambaNova Systems, Inc.
Inventors: Gregory F. Grohoski, Manish K. Shah, Kin Hing Leung
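A toy Python model of the repair scheme: the parameter store tags unusable subarrays, and placement maps each logical subarray of the configuration onto the next usable physical subarray, drawing on the spares. The data structures are assumptions.

```python
# Toy model: skip subarrays tagged unusable in the parameter store and
# place configuration data on the remaining (including spare) subarrays.

def place(config_units, physical_count, unusable):
    usable = [p for p in range(physical_count) if p not in unusable]
    if len(config_units) > len(usable):
        raise RuntimeError("not enough spare subarrays to repair")
    return {logical: usable[logical] for logical in range(len(config_units))}

# 6 physical subarrays (2 spare); subarray 1 tagged unusable in the store
placement = place(config_units=["u0", "u1", "u2", "u3"],
                  physical_count=6, unusable={1})
print(placement)  # {0: 0, 1: 2, 2: 3, 3: 4}
```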
Abstract: Systems and methods related to implementing vector registers in memory. A memory system for implementing vector registers in memory can include an array of memory cells, where a plurality of rows in the array serve as a plurality of vector registers as defined by an instruction set architecture. The memory system can also include a processing resource configured to, responsive to receiving a command to perform a particular vector operation on a particular vector register, access a particular row of the array serving as the particular vector register to perform the vector operation.
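A small Python model of rows-as-vector-registers (row width and command names assumed): each row of the cell array backs one architectural vector register, and a vector command operates on that row in place.

```python
# Model of rows-as-vector-registers: each row of the cell array backs one
# architectural vector register; commands operate on the row in place.

class InMemoryVectorRegisters:
    def __init__(self, num_registers, lanes):
        # each row of the array serves as one vector register
        self.array = [[0] * lanes for _ in range(num_registers)]

    def execute(self, op, reg, operand=None):
        row = self.array[reg]        # access the row backing this register
        if op == "vadd":
            self.array[reg] = [a + b for a, b in zip(row, operand)]
        elif op == "vread":
            return list(row)

m = InMemoryVectorRegisters(num_registers=4, lanes=4)
m.execute("vadd", reg=2, operand=[1, 2, 3, 4])
print(m.execute("vread", reg=2))  # [1, 2, 3, 4]
```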
Abstract: A system and corresponding method enforce strong load ordering in a processor. The system comprises an ordering ring that stores entries corresponding to in-flight memory instructions associated with a program order, scanning logic, and recovery logic. The scanning logic scans the ordering ring in response to execution or completion of a given load instruction of the in-flight memory instructions and detects an ordering violation in the event that at least one of the entries indicates that a younger load instruction has completed and is associated with an invalidated cache line. In response to the ordering violation, the recovery logic allows the given load instruction to complete, flushes the younger load instruction, and restarts execution of the processor after the given load instruction in the program order, causing data returned by the given and younger load instructions to be returned consistent with execution according to the program order to satisfy strong load ordering.
Type: Grant
Filed: January 28, 2022
Date of Patent: January 10, 2023
Assignee: Marvell Asia Pte, Ltd.
Inventors: David A. Carlson, Shubhendu S. Mukherjee, Wilson P. Snyder, II
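A condensed Python model of the violation check (entry fields are inferred from the abstract, not the actual ordering-ring format): when a load executes, the ring is scanned for a younger load that already completed against a since-invalidated cache line; a hit means the younger load is flushed and execution restarts after the older one.

```python
# Condensed model of the scan: find a younger, already-completed load
# whose cache line was invalidated -- that is the ordering violation.

def check_ordering(ring, executing_age):
    """ring: in-flight load entries in program order (age = index)."""
    for age, entry in enumerate(ring):
        if (age > executing_age and entry["completed"]
                and entry["line_invalidated"]):
            return age                  # ordering violation: flush this load
    return None

ring = [
    {"completed": False, "line_invalidated": False},  # older load, executing
    {"completed": True,  "line_invalidated": True},   # younger, completed early
]
violator = check_ordering(ring, executing_age=0)
print("flush and restart from younger load" if violator is not None else "ok")
```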
Abstract: A computer processor comprising a vector unit is disclosed. The vector unit may comprise a vector register file comprising at least one register to hold a varying number of elements. The vector unit may further comprise a vector length register file comprising at least one register to specify the number of operations of a vector instruction to be performed on the varying number of elements in the at least one register of the vector register file. The computer processor may be implemented as a monolithic integrated circuit.
Type: Grant
Filed: May 12, 2015
Date of Patent: January 3, 2023
Assignee: Optimum Semiconductor Technologies, Inc.
Inventors: Mayan Moudgill, Gary J. Nacer, C. John Glossner, Arthur Joseph Hoane, Paul Hurtley, Murugappan Senthilvelan, Pablo Balzola, Vitaly Kalashnikov, Sitij Agrawal
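A brief Python sketch of the mechanism: an entry in a vector length register file tells the instruction how many of a vector register's elements to operate on, so the same register can hold a varying number of live elements. The register-file sizes are arbitrary here.

```python
# Sketch: a vector length register specifies the number of operations a
# vector instruction performs on the elements of a vector register.

class VectorUnit:
    def __init__(self):
        self.vreg = [[0] * 8 for _ in range(4)]   # vector register file
        self.vlen = [0] * 4                       # vector length register file

    def vadd(self, dst, src_a, src_b, vl_reg):
        n = self.vlen[vl_reg]                     # operations to perform
        for i in range(n):
            self.vreg[dst][i] = self.vreg[src_a][i] + self.vreg[src_b][i]

vu = VectorUnit()
vu.vreg[0][:3] = [1, 2, 3]
vu.vreg[1][:3] = [10, 20, 30]
vu.vlen[0] = 3                                    # only 3 elements are live
vu.vadd(dst=2, src_a=0, src_b=1, vl_reg=0)
print(vu.vreg[2])  # [11, 22, 33, 0, 0, 0, 0, 0]
```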