Patents Examined by Shawn Doman

Vector registers implemented in memory

Patent number: 11556339

Abstract: Systems and methods related to implementing vector registers in memory. A memory system for implementing vector registers in memory can include an array of memory cells, where a plurality of rows in the array serve as a plurality of vector registers as defined by an instruction set architecture. The memory system for implementing vector registers in memory can also include a processing resource configured to, responsive to receiving a command to perform a particular vector operation on a particular vector register, access a particular row of the array serving as the particular register to perform the vector operation.

Type: Grant

Filed: November 9, 2021

Date of Patent: January 17, 2023

Assignee: Micron Technology, Inc.

Inventors: Timothy P. Finkbeiner, Troy D. Larsen

Defect repair for a reconfigurable data processor for homogeneous subarrays

Patent number: 11556494

Abstract: A device architecture includes a spatially reconfigurable array of processors, such as configurable units of a CGRA, having spare homogenous subarrays, and a parameter store on the device which stores parameters that tag one or more elements as unusable. Configuration data is distributed using a statically reconfigurable bus system, to implement the pattern of placement of configuration data, in dependence on the tagged elements. As a result, a spatially reconfigurable array having unusable elements can be repaired.

Type: Grant

Filed: July 16, 2021

Date of Patent: January 17, 2023

Assignee: SambaNova Systems, Inc.

Inventors: Gregory F. Grohoski, Manish K. Shah, Kin Hing Leung

System and method for implementing strong load ordering in a processor using a circular ordering ring

Patent number: 11550590

Abstract: A system and corresponding method enforce strong load ordering in a processor. The system comprises an ordering ring that stores entries corresponding to in-flight memory instructions associated with a program order, scanning logic, and recovery logic. The scanning logic scans the ordering ring in response to execution or completion of a given load instruction of the in-flight memory instructions and detects an ordering violation in an event at least one entry of the entries indicates that a younger load instruction has completed and is associated with an invalidated cache line. In response to the ordering violation, the recovery logic allows the given load instruction to complete, flushes the younger load instruction, and restarts execution of the processor after the given load instruction in the program order, causing data returned by the given and younger load instructions to be returned consistent with execution according to the program order to satisfy strong load ordering.

Type: Grant

Filed: January 28, 2022

Date of Patent: January 10, 2023

Assignee: Marvell Asia Pte, Ltd.

Inventors: David A. Carlson, Shubhendu S. Mukherjee, Wilson P. Snyder, II

Monolithic vector processor configured to operate on variable length vectors using a vector length register

Patent number: 11544214

Abstract: A computer processor comprising a vector unit is disclosed. The vector unit may comprise a vector register file comprising at least one register to hold a varying number of elements. The vector unit may further comprise a vector length register file comprising at least one register to specify the number of operations of a vector instruction to be performed on the varying number of elements in the at least one register of the vector register file. The computer processor may be implemented as a monolithic integrated circuit.
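
The role of a vector length register can be illustrated with a minimal sketch. The function name `vector_add`, the list-based registers, and the eight-element register width are hypothetical and chosen only for illustration; the abstract does not specify an instruction encoding or register size.

```python
def vector_add(dst, src_a, src_b, vl):
    """Add the first `vl` elements of src_a and src_b into dst.

    `vl` stands in for the value held in a vector length register:
    it sets how many element operations the instruction performs.
    Elements of dst at index vl and beyond are left untouched
    (an assumption of this sketch, not a claim about the patent).
    """
    if not 0 <= vl <= min(len(dst), len(src_a), len(src_b)):
        raise ValueError("vector length exceeds register capacity")
    for i in range(vl):
        dst[i] = src_a[i] + src_b[i]
    return dst


# Hypothetical 8-element vector registers modeled as Python lists.
v1 = list(range(8))   # [0, 1, 2, ..., 7]
v2 = [10] * 8
v0 = vector_add([0] * 8, v1, v2, vl=3)
print(v0)
```

Only the first three elements are computed here because `vl` is 3; the same instruction with a different `vl` would process a different number of elements without any change to the code.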

Type: Grant

Filed: May 12, 2015

Date of Patent: January 3, 2023

Assignee: Optimum Semiconductor Technologies, Inc.

Inventors: Mayan Moudgill, Gary J. Nacer, C. John Glossner, Arthur Joseph Hoane, Paul Hurtley, Murugappan Senthilvelan, Pablo Balzola, Vitaly Kalashnikov, Sitij Agrawal

Systems and methods for controlling machine operations within a multi-dimensional memory space

Patent number: 11526357

Abstract: Systems and methods for controlling machine operations are provided. A number of data entries are organized into a stack. Each data entry includes a type, a flag, a length, and a value or pointer entry. For each data entry in the stack, the type of data is determined from the type entry, the presence of an address or value is determined by the respective flag entry, and a length of the address or value is determined from the respective length entry. The data to be utilized or an address for the same at a particular electronic storage area is provided at the respective value or pointer entry, which may be specified by a space definition pushed onto the stack.

Type: Grant

Filed: January 25, 2021

Date of Patent: December 13, 2022

Assignee: Rankin Labs, LLC

Inventor: John Rankin

Flushing of instructions based upon a finish ratio and/or moving a flush point in a processor

Patent number: 11520591

Abstract: Processing data in an information handling system is disclosed that includes: in response to an event that triggers a flushing operation, calculate a finish ratio, wherein the finish ratio is a number of finished operations to a number of at least one of the group consisting of in-flight instructions, instructions pending in a processor pipeline, instructions issued to an issue queue, and instructions being processed in a processor execution unit; compare the calculated finish ratio to a threshold; and if the finish ratio is greater than the threshold, then do not perform the flushing operation. Also disclosed is moving the flush point.
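
The threshold comparison described in this abstract can be sketched as follows. The function name `should_flush` and its parameters are hypothetical, and the sketch assumes that the flush proceeds whenever the ratio does not exceed the threshold, which the abstract does not state explicitly.

```python
def should_flush(finished, in_flight, threshold):
    """Decide whether a triggered flush should go ahead.

    finished  -- number of finished operations
    in_flight -- number of in-flight instructions (the abstract also
                 allows other denominators, such as instructions
                 pending in the pipeline or issued to an issue queue)
    threshold -- ratio above which the flush is skipped
    """
    if in_flight <= 0:
        raise ValueError("in_flight must be positive")
    finish_ratio = finished / in_flight
    # Per the abstract: a ratio greater than the threshold means
    # the flushing operation is not performed.
    # Assumption of this sketch: otherwise the flush goes ahead.
    return finish_ratio <= threshold


print(should_flush(finished=9, in_flight=10, threshold=0.5))
print(should_flush(finished=2, in_flight=10, threshold=0.5))
```

In the first call most of the work has already finished, so the flush is skipped; in the second call little has finished, so under the stated assumption the flush proceeds.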

Type: Grant

Filed: March 27, 2020

Date of Patent: December 6, 2022

Assignee: International Business Machines Corporation

Inventors: Ehsan Fatehi, Richard J. Eickemeyer, John B. Griswell, Jr.

Computer architecture with synergistic heterogeneous processors

Patent number: 11513805

Abstract: A computer architecture employs multiple special-purpose processors having different affinities for program execution to execute substantial portions of general-purpose programs to provide improved performance with respect to a general-purpose processor executing the general-purpose program alone.

Type: Grant

Filed: August 19, 2016

Date of Patent: November 29, 2022

Assignee: Wisconsin Alumni Research Foundation

Inventors: Karthikeyan Sankaralingam, Anthony Nowatzki

True/false vector index registers and methods of populating thereof

Patent number: 11507374

Abstract: Disclosed herein are vector index registers for storing or loading indexes of true and/or false results of comparison operations in vector processors. Each of the vector index registers stores multiple addresses for accessing multiple positions in operand vectors.
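
A minimal software sketch of the idea, assuming one index register collects the positions where a comparison is true and another collects the positions where it is false. The names `populate_index_registers`, `vir_true`, and `vir_false` are hypothetical, and Python lists stand in for hardware registers.

```python
import operator


def populate_index_registers(operand_a, operand_b, compare):
    """Compare two operand vectors element by element.

    Returns two lists of positions: one for elements where the
    comparison was true and one for elements where it was false.
    """
    vir_true, vir_false = [], []
    for i, (a, b) in enumerate(zip(operand_a, operand_b)):
        if compare(a, b):
            vir_true.append(i)
        else:
            vir_false.append(i)
    return vir_true, vir_false


a = [5, 1, 7, 3]
b = [4, 2, 6, 8]
vir_true, vir_false = populate_index_registers(a, b, operator.gt)

# The stored indexes can then address positions in an operand vector.
selected = [a[i] for i in vir_true]
print(vir_true, vir_false, selected)
```

Keeping the true and false positions as separate index lists lets later operations visit only the elements that took a given branch of the comparison.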

Type: Grant

Filed: May 20, 2019

Date of Patent: November 22, 2022

Assignee: Micron Technology, Inc.

Inventor: Steven Jeffrey Wallach

Arithmetic processing apparatus and control method using ordering property

Patent number: 11500639

Abstract: An arithmetic processing apparatus includes a memory, a first processor coupled to the memory, and a second processor coupled to the memory. The first processor is configured to consecutively issue a plurality of load instructions for reading respective data with respect to the memory. The first processor is configured to determine whether an ordering property is guaranteed, based on values included in the data loaded from the memory. The second processor is configured to issue a store instruction during an execution of the plurality of load instructions with respect to the memory.

Type: Grant

Filed: May 22, 2019

Date of Patent: November 15, 2022

Assignee: FUJITSU LIMITED

Inventor: Hideyuki Takano

Systolic array-friendly data placement and control based on masked write

Patent number: 11500680

Abstract: The present disclosure relates to an accelerator for systolic array-friendly data placement. The accelerator may include: a systolic array comprising a plurality of operation units, wherein the systolic array is configured to receive staged input data and perform operations using the staged input to generate staged output data, the staged output data comprising a number of segments; a controller configured to execute one or more instructions to generate a pattern generation signal; a data mask generator; and a memory configured to store the staged output data using the generated masks. The data mask generator may include circuitry configured to: receive the pattern generation signal from the controller, and, based on the received signal, generate a mask corresponding to each segment of the staged output data.

Type: Grant

Filed: April 24, 2020

Date of Patent: November 15, 2022

Assignee: Alibaba Group Holding Limited

Inventors: Yuhao Wang, Xiaoxin Fan, Dimin Niu, Chunsheng Liu, Wei Han

Compute unit having independent data paths

Patent number: 11461107

Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a streaming multiprocessor having a single instruction, multiple thread (SIMT) architecture including hardware multithreading. The streaming multiprocessor comprises multiple processing blocks including multiple processing cores. The processing cores include independent integer and floating-point data paths that are configurable to concurrently execute multiple independent instructions. A memory is coupled with the multiple processing blocks.

Type: Grant

Filed: December 20, 2018

Date of Patent: October 4, 2022

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Barath Lakshmanan, Tatiana Shpeisman, Joydeep Ray, Ping T. Tang, Michael Strickland, Xiaoming Chen, Anbang Yao, Ben J. Ashbaugh, Linda L. Hurd, Liwei Ma

Deferred system error exception handling in a data processing apparatus

Patent number: 11461104

Abstract: Apparatus for data processing and a method of data processing are provided. Data processing operations are performed in response to data processing instructions. An error exception condition is set if a data processing operation has not been successful. It is determined if an error memory barrier condition exists and an error memory barrier procedure is performed in dependence on whether the error memory barrier condition exists. The error memory barrier procedure comprises, if the error exception condition is set and if an error mask condition is set: setting a deferred error exception condition and clearing the error exception condition.

Type: Grant

Filed: November 25, 2015

Date of Patent: October 4, 2022

Assignee: ARM LIMITED

Inventors: Michael John Williams, Richard Roy Grisenthwaite, Simon John Craske

Multiported parity scoreboard circuit

Patent number: 11455171

Abstract: A fast and frugal item-state tracking scoreboard circuit is disclosed. The scoreboard maintains per-item partial states across multiple memory circuits, enabling multiple lookups per clock cycle and multiple state updates per clock cycle. In an embodiment a scoreboard is used to schedule instructions in an out-of-order processor. Each clock cycle the scoreboard indicates the busy state of an instruction's registers and may update the busy state of the destination registers of issuing instructions and completing instructions. Applications include register tracking, function-unit tracking, and cache-line state tracking, in embodiments including processor cores (including superscalar, superpipelined, and multithreaded processors), accelerators, memory systems, and networks. In an embodiment, a register-busy scoreboard circuit is implemented using FPGA LUT RAM memory.

Type: Grant

Filed: May 29, 2020

Date of Patent: September 27, 2022

Assignee: Gray Research LLC

Inventor: Jan Stephen Gray

Data processing systems including an intermediate buffer with controlled data value eviction

Patent number: 11442731

Abstract: A data processor includes an execution unit that executes instructions to perform data processing operations, a register file operable to store data values for use by and produced by the execution unit, and a buffer intermediate between the register file and the execution unit for providing data values from the register file to the execution unit for use when executing an instruction, and to receive output data values from the execution unit for writing to the register file. Instructions to be executed by the execution unit of the data processor have associated buffer eviction priority indications representative of a priority for eviction from the buffer of an output data value that will be generated when executing the instruction. The buffer eviction priority indications are then used when selecting data values to evict from the buffer.

Type: Grant

Filed: October 17, 2019

Date of Patent: September 13, 2022

Assignee: Arm Limited

Inventors: John David Robson, Sean Tristram LeGuay Ellis, William Robert Stoye

Data selection for a processor pipeline using multiple supply lines

Patent number: 11429389

Abstract: A method for a plurality of pipelines, each having a processing element having first and second inputs and first and second lines, wherein at least one of the pipelines includes first and second logic operable to select a respective line so that data is received at the first and second inputs respectively. A first mode is selected and for the at least one pipeline, the first and second lines of that pipeline are selected such that the processing element of that pipeline receives data via the first and second lines of that pipeline, the first line being capable of supplying data that is different to the second line. A second mode is selected and for the at least one pipeline a line of another pipeline is selected, the second line of the at least one pipeline is selected and the same data at the second line is supplied as the first line.

Type: Grant

Filed: November 25, 2020

Date of Patent: August 30, 2022

Assignee: Imagination Technologies Limited

Inventors: Simon Nield, Thomas Rose

Method of debugging a processor that executes vertices of an application, each vertex being assigned to a programming thread of the processor

Patent number: 11416258

Abstract: A method for debugging a processor which is executing vertices of a software application is described. Each vertex is assigned to a programming thread of the processor. The processor has debug hardware for raising exceptions in certain break conditions. The method comprises inspecting a vertex identifier, comparing the vertex identifier and raising an instruction exception event for the programming thread if the vertex identifier assigned to the thread matches the vertex break identifier in the debug hardware. Exceptions are raised based on identified vertices, rather than just individual instructions or instruction addresses.

Type: Grant

Filed: May 22, 2019

Date of Patent: August 16, 2022

Assignee: Graphcore Limited

Inventors: Alan Graham Alexander, Richard Luke Southwell Osborne, Matthew David Fyles

Mixed inference using low and high precision

Patent number: 11409537

Abstract: One embodiment provides for a graphics processing unit (GPU) to accelerate machine learning operations, the GPU comprising an instruction cache to store a first instruction and a second instruction, the first instruction to cause the GPU to perform a floating-point operation, including a multi-dimensional floating-point operation, and the second instruction to cause the GPU to perform an integer operation; and a general-purpose graphics compute unit having a single instruction, multiple thread (SIMT) architecture, the general-purpose graphics compute unit to simultaneously execute the first instruction and the second instruction, wherein the integer operation corresponds to a memory address calculation.

Type: Grant

Filed: November 21, 2017

Date of Patent: August 9, 2022

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Barath Lakshmanan, Tatiana Shpeisman, Joydeep Ray, Ping T. Tang, Michael Strickland, Xiaoming Chen, Anbang Yao, Ben J. Ashbaugh, Linda L. Hurd, Liwei Ma

Microprocessor with multi-step ahead branch predictor and having a fetch-target queue between the branch predictor and instruction cache

Patent number: 11403103

Abstract: A microprocessor is shown, in which a branch predictor and an instruction cache are decoupled by a fetch-target queue (FTQ). The branch predictor performs branch prediction for N instruction addresses in parallel in the same cycle, wherein N is an integer greater than 1. In the current cycle, the branch predictor finishes branch prediction for N instruction addresses in parallel and, among the N instruction addresses with finished branch prediction, those that are not bypassed and do not overlap previously-predicted instruction addresses are pushed into the fetch-target queue, to be read out later as an instruction-fetching address for the instruction cache. The previously-predicted instruction addresses are pushed into the fetch-target queue in a previous cycle.

Type: Grant

Filed: October 13, 2020

Date of Patent: August 2, 2022

Assignee: SHANGHAI ZHAOXIN SEMICONDUCTOR CO., LTD.

Inventors: Fangong Gong, Mengchen Yang

Memory device and computing in memory method thereof

Patent number: 11354123

Abstract: A computing in memory method for a memory device is provided. The computing in memory method includes: based on a stride parameter, unfolding a kernel into a plurality of sub-kernels and a plurality of complement sub-kernels; based on the sub-kernels and the complement sub-kernels, writing a plurality of weights into a plurality of target memory cells of a memory array of the memory device; inputting an input data into a selected word line of the memory array; performing a stride operation in the memory array; temporarily storing a plurality of partial sums; and summing the stored partial sums into a stride operation result when all operation cycles are completed.

Type: Grant

Filed: September 21, 2020

Date of Patent: June 7, 2022

Assignee: MACRONIX INTERNATIONAL CO., LTD.

Inventors: Hung-Sheng Chang, Han-Wen Hu, Yueh-Han Wu, Tse-Yuan Wang, Yuan-Hao Chang, Tei-Wei Kuo

IC including logic tile, having reconfigurable MAC pipeline, and reconfigurable memory

Patent number: 11288076

Abstract: An integrated circuit including configurable multiplier-accumulator circuitry, wherein, during processing operations, a plurality of the multiplier-accumulator circuits are serially connected into pipelines to perform concatenated multiply and accumulate operations. The integrated circuit includes a first memory and a second memory, and a switch interconnect network, including configurable multiplexers arranged in a plurality of switch matrices. The first and second memories are configurable as either a dedicated read memory or a dedicated write memory and connected to a given pipeline, via the switch interconnect network, during a processing operation performed thereby; wherein, during a first processing operation, the first memory is dedicated to write data to a first pipeline and the second memory is dedicated to read data therefrom and, during a second processing operation, the first memory is dedicated to read data from a second pipeline and the second memory is dedicated to write data thereto.

Type: Grant

Filed: September 12, 2020

Date of Patent: March 29, 2022

Assignee: Flex Logix Technologies, Inc.

Inventor: Cheng C. Wang