Patents Examined by Eric Coleman
-
Patent number: 12293184
Abstract: An illegal address mask method for cores of a DSP includes: S1, initializing a core of a DSP; S2, configuring a start address register and an end address register, and taking an address range defined by the start address register and the end address register as a masked address range; configuring a first comparator and a second comparator to send out illegal address decision signals for instructions within the masked address range; S3, acquiring a PC pointer, and determining whether the PC pointer is located in the masked address range; if so, sending out an illegal address decision signal to stop an operation; if not, performing pre-decoding to obtain a memory access instruction; and S4, determining whether an address of the memory access instruction is located in the masked address range; if so, sending out an illegal address decision signal to stop an operation; otherwise, completing a memory access operation.
Type: Grant
Filed: December 27, 2024
Date of Patent: May 6, 2025
Assignee: Jiangsu Huachuang Microsystem Company Limited
Inventors: Haibin Zhou, Guoqiang He, Wenjun Han, Ming Hao
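The check in steps S2–S4 reduces to two range comparisons. A minimal software sketch of that logic (register values and function names are invented for illustration; the patent describes a hardware mechanism):

```python
from typing import Optional

# Hypothetical register contents for the demo (the real values are whatever
# software writes into the start/end address registers).
MASK_START = 0x1000
MASK_END = 0x2000

def in_masked_range(addr: int) -> bool:
    """Model the two comparators: flag any address inside [start, end]."""
    return MASK_START <= addr <= MASK_END

def check_instruction(pc: int, mem_addr: Optional[int] = None) -> str:
    """S3/S4 as one function: reject a PC or memory-access address in the mask."""
    if in_masked_range(pc):
        return "illegal"   # S3: PC in masked range -> illegal address signal
    if mem_addr is not None and in_masked_range(mem_addr):
        return "illegal"   # S4: access address in masked range -> illegal
    return "ok"            # otherwise, complete the memory access
```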
-
Patent number: 12287756
Abstract: A systolic array cell is described, the cell including two general-purpose arithmetic logic units (ALUs) and a register file. A plurality of the cells may be configured in a matrix or array, such that the output of the first ALU in a first cell is provided to a second cell to the right of the first cell, and the output of the second ALU in the first cell is provided to a third cell below the first cell. The two ALUs in each cell of the array allow for processing of a different instruction in each cycle.
Type: Grant
Filed: October 4, 2023
Date of Patent: April 29, 2025
Assignee: GOOGLE LLC
Inventors: Reginald Clifford Young, Trevor Gale, Sushma Honnavara-Prasad, Paolo Mantovani
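As a rough illustration of the dataflow only (not GOOGLE's design; the cell semantics below are assumed), each cell can be modeled as producing one value for the cell to its right and one for the cell below it in the same cycle:

```python
def cell_step(left_in: int, top_in: int, weight: int):
    """Toy model of one cycle in a two-ALU cell.
    The first ALU multiplies the left input by a stored weight; its result is
    forwarded to the cell on the right. The second ALU adds that product to
    the top input; its result is forwarded to the cell below."""
    right_out = left_in * weight   # first ALU -> neighbor to the right
    down_out = top_in + right_out  # second ALU -> neighbor below
    return right_out, down_out
```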
-
Patent number: 12288068
Abstract: An instruction simulation device and a method thereof are provided. The instruction simulation device includes a processor. The processor includes an instruction decoder which generates format information of a ready-for-execution instruction. The processor determines whether the ready-for-execution instruction currently executed by the processor is a compatible instruction or an extended instruction based on the format information of the ready-for-execution instruction. If the ready-for-execution instruction is an extended instruction under a new instruction set or an extended instruction set, the processor converts the ready-for-execution instruction into a simulation program corresponding to the extended instruction, and simulates an execution result of the ready-for-execution instruction by executing the simulation program. The simulation program is composed of at least one compatible instruction of the processor.
Type: Grant
Filed: September 12, 2023
Date of Patent: April 29, 2025
Assignee: Shanghai Zhaoxin Semiconductor Co., Ltd.
Inventors: Weilin Wang, Yingbing Guan, Mengchen Yang
-
Patent number: 12282773
Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile is a set of 2-dimensional registers.
Type: Grant
Filed: December 8, 2023
Date of Patent: April 22, 2025
Assignee: Intel Corporation
Inventors: Menachem Adelman, Robert Valentine, Zeev Sperber, Mark J. Charney, Bret L. Toll, Rinat Rappoport, Jesus Corbal, Dan Baum, Alexander F. Heinecke, Elmoustapha Ould-Ahmed-Vall, Yuri Gebil, Raanan Sade
-
Patent number: 12271339
Abstract: Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
Type: Grant
Filed: October 9, 2023
Date of Patent: April 8, 2025
Assignee: Groq, Inc.
Inventors: Dennis Charles Abts, Jonathan Alexander Ross, John Thompson, Gregory Michael Thorson
-
Patent number: 12260215
Abstract: In-memory computing circuits can be used to determine distances between vectors. Such circuits can be used for machine learning applications. Examples include obtaining at least one dimension of a query vector wherein the dimension includes one or more bits and comparing respective bits of the dimension to corresponding bits of at least one dimension of a reference vector. This obtains a control signal dependent upon whether the bits of the dimension of the query vector are the same as corresponding bits of the dimension of the reference vector. The control signal can then be used to control a pulse modifying circuit such that a modification applied to a pulse signal is dependent upon whether the bits of the dimension of the query vector are the same as corresponding bits of the dimension of the reference vector.
Type: Grant
Filed: May 22, 2023
Date of Patent: March 25, 2025
Assignee: Nokia Technologies Oy
Inventor: Marijan Herceg
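The per-bit comparison can be sketched in software; treating the accumulated mismatch count as the quantity that drives the pulse modification is an assumption made purely for this demo:

```python
def bit_match_signals(query: int, reference: int, width: int):
    """Per-bit control signals: True where the query and reference bits agree."""
    return [((query >> i) & 1) == ((reference >> i) & 1) for i in range(width)]

def mismatch_count(query: int, reference: int, width: int) -> int:
    """A Hamming-distance-style measure driven by the control signals
    (standing in for the pulse modifications the circuit would apply)."""
    return sum(not same for same in bit_match_signals(query, reference, width))
```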
-
Patent number: 12248429
Abstract: A computer comprising a plurality of interconnected processing nodes arranged in a configuration in which multiple layers of interconnected nodes are arranged along an axis, each layer comprising at least four processing nodes connected in a non-axial ring by at least a respective intralayer link between each pair of neighbouring processing nodes, wherein each of the at least four processing nodes in each layer is connected to a respective corresponding node in one or more adjacent layers by a respective interlayer link, the computer being programmed to provide in the configuration two embedded one-dimensional paths and to transmit data around each of the two embedded one-dimensional paths, each embedded one-dimensional path using all processing nodes of the computer in such a manner that the two embedded one-dimensional paths operate simultaneously without sharing links.
Type: Grant
Filed: March 17, 2023
Date of Patent: March 11, 2025
Assignee: GRAPHCORE LIMITED
Inventor: Simon Knowles
-
Patent number: 12242894
Abstract: A device can be used to implement a neural network in hardware. The device can include a processor, a memory, and a neural network accelerator. The neural network accelerator can be configured to implement, in hardware, a neural network by using a residue number system (RNS). At least one function of the neural network can have a corresponding approximation in the RNS, and the at least one function can be provided by implementing the corresponding approximation in hardware.
Type: Grant
Filed: March 31, 2023
Date of Patent: March 4, 2025
Assignee: Khalifa University of Science and Technology
Inventors: Athanasios Stouraitis, Sakellariou Vasileios, Vasileios Paliouras, Ioannis Kouretas, Hani Saleh
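In an RNS, a value is stored as its residues modulo a set of pairwise-coprime moduli, and addition and multiplication act channel-wise on the residues. A tiny software model (the moduli are a textbook choice, not necessarily the patent's):

```python
MODULI = (3, 5, 7)  # pairwise coprime; dynamic range = 3 * 5 * 7 = 105

def to_rns(x: int):
    """Encode an integer as its tuple of residues."""
    return tuple(x % m for m in MODULI)

def rns_mul(a, b):
    """Multiply two RNS values channel-wise: no carries between channels."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))

def from_rns(r):
    """Decode via the Chinese Remainder Theorem (search form, fine for a demo)."""
    return next(x for x in range(3 * 5 * 7) if to_rns(x) == tuple(r))
```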
-
Patent number: 12236238
Abstract: An apparatus to facilitate large integer multiplication enhancements in a graphics environment is disclosed. The apparatus includes a processor comprising processing resources, the processing resources comprising multiplier circuitry to: receive operands for a multiplication operation, wherein the multiplication operation is part of a chain of multiplication operations for a large integer multiplication; and issue a multiply and add (MAD) instruction for the multiplication operation utilizing at least one of a double precision multiplier or a 48-bit output, wherein the MAD instruction is to generate an output in a single clock cycle of the processor.
Type: Grant
Filed: June 25, 2021
Date of Patent: February 25, 2025
Assignee: INTEL CORPORATION
Inventors: Supratim Pal, Li-An Tang, Changwon Rhee, Timothy R. Bauer, Alexander Lyashevsky, Jiasheng Chen
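The multiplication chain the abstract refers to is the classic limb-by-limb schoolbook pattern, where each inner step is a multiply-and-add. A sketch with assumed 16-bit limbs (the patent's operand widths may differ):

```python
LIMB_BITS = 16
MASK = (1 << LIMB_BITS) - 1

def to_limbs(x: int, n: int):
    """Split a non-negative integer into n fixed-width limbs, little-endian."""
    return [(x >> (LIMB_BITS * i)) & MASK for i in range(n)]

def limb_multiply(a_limbs, b_limbs):
    """Schoolbook multiply; each inner step is one MAD: acc = out + a*b + carry."""
    out = [0] * (len(a_limbs) + len(b_limbs))
    for i, a in enumerate(a_limbs):
        carry = 0
        for j, b in enumerate(b_limbs):
            acc = out[i + j] + a * b + carry   # the multiply-and-add step
            out[i + j] = acc & MASK
            carry = acc >> LIMB_BITS
        out[i + len(b_limbs)] += carry
    return out

def from_limbs(limbs):
    """Reassemble the limbs into a single integer."""
    return sum(l << (LIMB_BITS * i) for i, l in enumerate(limbs))
```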
-
Patent number: 12229558
Abstract: A processor includes a front end, an execution unit, a retirement stage, a counter, and a performance monitoring unit. The front end includes logic to receive an event instruction to enable supervision of a front end event that will delay execution of instructions. The execution unit includes logic to set a register with parameters for supervision of the front end event. The front end further includes logic to receive a candidate instruction and match the candidate instruction to the front end event. The counter includes logic to generate the front end event upon retirement of the candidate instruction.
Type: Grant
Filed: September 22, 2023
Date of Patent: February 18, 2025
Assignee: Intel Corporation
Inventor: Ahmad Yasin
-
Patent number: 12222894
Abstract: Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
Type: Grant
Filed: July 13, 2023
Date of Patent: February 11, 2025
Assignee: GROQ, INC.
Inventors: Dennis Charles Abts, Jonathan Alexander Ross, John Thompson, Gregory Michael Thorson
-
Patent number: 12223328
Abstract: Examples of the present disclosure provide apparatuses and methods related to generating and executing a control flow. An example apparatus can include a first device configured to generate control flow instructions, and a second device including an array of memory cells, an execution unit to execute the control flow instructions, and a controller configured to control an execution of the control flow instructions on data stored in the array.
Type: Grant
Filed: August 10, 2023
Date of Patent: February 11, 2025
Inventors: Kyle B. Wheeler, Richard C. Murphy, Troy A. Manning, Dean A. Klein
-
Patent number: 12217056
Abstract: A method for processing a tensor is described, including obtaining a first register for a number of items in the tensor. One or more second registers for a number of items in a first and a second axis of the tensor are obtained. A stride in the first and the second axis is obtained. A next item in the tensor is obtained using the stride in the first axis and a first offset register, when the first register indicates the tensor has additional items to process and the second registers indicate the next item resides in the first axis. A next item in the tensor is obtained using the stride in the first axis and the second axis, the first offset register, and a second offset register. The first register and a second register are modified. The first and the second offset registers are modified.
Type: Grant
Filed: January 25, 2024
Date of Patent: February 4, 2025
Assignee: Celestial AI Inc.
Inventor: Philip Winterbottom
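The register-driven walk can be modeled with per-axis counters, strides, and offset registers. A sketch of a 2-D traversal (all names and the exact update order are invented for illustration):

```python
def walk_2d(items_total: int, axis0_count: int, stride0: int, stride1: int):
    """Yield the flat offset of each successive tensor item.
    stride0 advances within the first axis; stride1 jumps to the next
    position along the second axis when the first axis is exhausted."""
    offset0 = 0   # first offset register (position within the first axis)
    offset1 = 0   # second offset register (position along the second axis)
    in_axis = 0   # counter mirroring the per-axis item register
    for _ in range(items_total):
        yield offset1 + offset0
        in_axis += 1
        if in_axis < axis0_count:   # next item still resides in the first axis
            offset0 += stride0
        else:                       # wrap: reset and advance the second axis
            in_axis = 0
            offset0 = 0
            offset1 += stride1
```

For a 2x3 tensor stored with a row stride of 4 (i.e. padded rows), the walk skips the padding element at offset 3.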
-
Patent number: 12210871
Abstract: A method includes: adding a first conversion operator to the model, to convert data input into the first conversion operator into a general format, where the general format is a data format supported by all the plurality of calculation circuits; and modifying another operator that is in the model and that is after the first conversion operator, to cause formats of both input data and output data of the another operator to be the general format.
Type: Grant
Filed: June 29, 2023
Date of Patent: January 28, 2025
Assignee: Huawei Technologies Co., Ltd.
Inventors: Wenlong Xie, Linmu Wang, Xiaopeng Du
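A minimal sketch of that graph rewrite, treating the model as a linear list of operators (operator and format names are invented; real models are graphs, not lists):

```python
GENERAL = "NCHW"  # hypothetical general format supported by every unit

def insert_conversion(ops):
    """ops: list of {"name": ..., "fmt": ...} dicts in execution order.
    Prepend a conversion operator, then retag every downstream operator so
    its input and output formats are the general format."""
    converted = [{"name": "convert_to_general", "fmt": GENERAL}]
    for op in ops:
        converted.append({"name": op["name"], "fmt": GENERAL})
    return converted
```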
-
Patent number: 12204897
Abstract: Apparatuses, systems, and techniques to perform computational operations in response to one or more compute uniform device architecture (CUDA) programs. In at least one embodiment, one or more computational operations are to cause one or more other computational operations to wait until a portion of matrix multiply-accumulate (MMA) operations have been performed.
Type: Grant
Filed: November 30, 2022
Date of Patent: January 21, 2025
Assignee: NVIDIA CORPORATION
Inventors: Harold Carter Edwards, Kyrylo Perelygin, Maciej Tyrlik, Gokul Ramaswamy Hirisave Chandra Shekhara, Balaji Krishna Yugandhar Atukuri, Rishkul Kulkarni, Konstantinos Kyriakopoulos, Edward H. Gornish, David Allan Berson, Bageshri Sathe, James Player, Aman Arora, Alan Kaatz, Andrew Kerr, Haicheng Wu, Cris Cecka, Vijay Thakkar, Sean Treichler, Jack H. Choquette, Aditya Avinash Atluri, Apoorv Parle, Ronny Meir Krashinsky, Cody Addison, Girish Bhaskarrao Bharambe
-
Patent number: 12205025
Abstract: The present application discloses a processor video memory optimization method and apparatus for deep learning training tasks, and relates to the technical field of artificial intelligence. In the method, an optimal path for transferring a computing result is determined, and the computing result of a first computing unit is transferred to a second computing unit over that path. This avoids occupying video memory and avoids the low GPU computing-unit utilization caused by video memory swaps, so the training speed of most tasks is hardly reduced.
Type: Grant
Filed: March 24, 2021
Date of Patent: January 21, 2025
Assignee: Beijing Baidu Netcom Science Technology Co., Ltd.
Inventors: Haifeng Wang, Xiaoguang Hu, Dianhai Yu
-
Patent number: 12204898
Abstract: A processor of an aspect includes a decode unit to decode a matrix multiplication instruction. The matrix multiplication instruction is to indicate a first memory location of a first source matrix, is to indicate a second memory location of a second source matrix, and is to indicate a third memory location where a result matrix is to be stored. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the matrix multiplication instruction, is to multiply a portion of the first and second source matrices prior to an interruption, and store a completion progress indicator in response to the interruption. The completion progress indicator is to indicate an amount of progress in multiplying the first and second source matrices, and storing corresponding result data to the third memory location, that is to have been completed prior to the interruption.
Type: Grant
Filed: August 30, 2023
Date of Patent: January 21, 2025
Assignee: Intel Corporation
Inventors: Edward T. Grochowski, Asit K. Mishra, Robert Valentine, Mark J. Charney, Simon C. Steely, Jr.
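The resume-after-interruption idea can be sketched in software, using the number of completed result rows as the assumed form of the completion progress indicator (the patent does not specify its encoding):

```python
def matmul_resumable(a, b, result, progress, max_rows=None):
    """Multiply rows of `a` by `b` into `result`, starting at row `progress`.
    `max_rows` simulates an interruption after that many rows. Returns the
    updated progress indicator; it equals len(a) when the multiply is done."""
    n, cols, inner = len(a), len(b[0]), len(b)
    stop = n if max_rows is None else min(n, progress + max_rows)
    for i in range(progress, stop):   # resume where the last run stopped
        for j in range(cols):
            result[i][j] = sum(a[i][k] * b[k][j] for k in range(inner))
    return stop                       # the stored completion progress indicator
```

A run that is "interrupted" after one row can later be resumed from the stored indicator without redoing the finished rows.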
-
Patent number: 12189571
Abstract: A processing apparatus described herein includes a general-purpose parallel processing engine comprising a systolic array having multiple pipelines, each of the multiple pipelines including multiple pipeline stages, wherein the multiple pipelines include a first pipeline, a second pipeline, and a common input shared between the first pipeline and the second pipeline.
Type: Grant
Filed: June 25, 2021
Date of Patent: January 7, 2025
Assignee: Intel Corporation
Inventors: Jorge Parra, Jiasheng Chen, Supratim Pal, Fangwen Fu, Sabareesh Ganapathy, Chandra Gurram, Chunhui Mei, Yue Qi
-
Patent number: 12182064
Abstract: Systems and methods are provided to enable parallelized multiply-accumulate operations in a systolic array. Each column of the systolic array can include multiple busses enabling independent transmission of input partial sums along the respective bus. Each processing element of a given columnar bus can receive an input partial sum from a prior element of the given columnar bus, and perform arithmetic operations on the input partial sum. Each processing element can generate an output partial sum based on the arithmetic operations, provide the output partial sum to a next processing element of the given columnar bus, without the output partial sum being processed by a processing element of the column located between the two processing elements that uses a different columnar bus. Use of columnar busses can enable parallelization to increase speed or enable increased latency at individual processing elements.
Type: Grant
Filed: August 8, 2023
Date of Patent: December 31, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Thomas A Volpe, Sundeep Amirineni, Thomas Elmer
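Functionally, splitting a column across buses interleaves the accumulation: each element adds only to the running sum of its own bus and bypasses elements on the other buses. A toy model with an assumed round-robin bus assignment:

```python
def column_partial_sums(weights, activation, num_buses=2):
    """Accumulate weight * activation down one column, keeping one running
    partial sum per bus. Element i contributes only to bus i % num_buses,
    so the per-bus chains proceed independently and in parallel."""
    sums = [0] * num_buses
    for i, w in enumerate(weights):
        sums[i % num_buses] += w * activation  # MAC on this element's bus only
    return sums  # per-bus partial sums; the column total is their sum
```

Combining the per-bus results at the bottom of the column recovers the same total a single-bus column would produce.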
-
Patent number: 12182570
Abstract: Systems, methods, and apparatuses to support packed data convolution instructions with shift control and width control are described.
Type: Grant
Filed: June 25, 2021
Date of Patent: December 31, 2024
Assignee: Intel Corporation
Inventors: Deepti Aggarwal, Michael Espig, Robert Valentine, Sumit Mohan, Prakaram Joshi, Richard Winterton