Patents Examined by Cheng-Yuan Tseng
  • Patent number: 12367046
    Abstract: A method of activating scheduling instructions within a parallel processing unit is described. The method comprises decoding, in an instruction decoder, an instruction in a scheduled task in an active state and checking, by an instruction controller, if a swap flag is set in the decoded instruction. If the swap flag in the decoded instruction is set, a scheduler is triggered to de-activate the scheduled task by changing the scheduled task from the active state to a non-active state.
    Type: Grant
    Filed: December 5, 2022
    Date of Patent: July 22, 2025
    Assignee: Imagination Technologies Limited
    Inventors: Simon Nield, Yoong-Chert Foo, Adam de Grasse, Luca Iuliano
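The swap-flag mechanism in the abstract above can be modeled in a few lines. This is a hypothetical sketch, not the patent's actual hardware interface: the class and field names (`DecodedInstruction`, `swap_flag`, `ScheduledTask`) are illustrative assumptions.

```python
# Illustrative model: when a decoded instruction carries a set swap flag,
# the scheduler moves the scheduled task from ACTIVE to a non-active state.
from dataclasses import dataclass

ACTIVE, NOT_ACTIVE = "active", "not_active"

@dataclass
class DecodedInstruction:
    opcode: str
    swap_flag: bool = False   # checked by the instruction controller

@dataclass
class ScheduledTask:
    task_id: int
    state: str = ACTIVE

def check_and_swap(task: ScheduledTask, instr: DecodedInstruction) -> ScheduledTask:
    """Instruction-controller check: trigger de-activation if the swap flag is set."""
    if instr.swap_flag and task.state == ACTIVE:
        task.state = NOT_ACTIVE   # scheduler de-activates the task
    return task

task = ScheduledTask(task_id=7)
check_and_swap(task, DecodedInstruction("ADD"))           # flag clear: stays active
check_and_swap(task, DecodedInstruction("YIELD", True))   # flag set: de-activated
```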
  • Patent number: 12367047
    Abstract: Systems and methods are disclosed for debug path profiling. For example, a processor pipeline may execute instructions. A debug trace circuitry may, responsive to an indication of a non-sequential execution of an instruction by the processor pipeline, generate a record including an address pair and one or more counter values. The address pair may include a first address corresponding to a first instruction before the non-sequential execution and a second address corresponding to a second instruction resulting in the non-sequential execution. The one or more counter values may indicate, for example, a count of instructions executed, a type of instruction executed, cache misses, cycles consumed by cache misses, translation lookaside buffer misses, cycles consumed by translation lookaside buffer misses, and/or processor stalls.
    Type: Grant
    Filed: November 6, 2023
    Date of Patent: July 22, 2025
    Assignee: SiFive, Inc.
    Inventor: Bruce Ableidinger
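The trace record described above pairs two addresses with counter snapshots. A minimal sketch, assuming illustrative field names (`addr_before`, `addr_target`, counter keys) that are not taken from the patent:

```python
# Illustrative model of a debug-trace record: on a non-sequential control
# transfer, capture the (address before, branch-target address) pair plus
# one or more counter values.
from dataclasses import dataclass, field

@dataclass
class TraceRecord:
    addr_before: int   # first instruction before the non-sequential execution
    addr_target: int   # instruction resulting in the non-sequential execution
    counters: dict = field(default_factory=dict)

def emit_record(pc_before, pc_target, instret, cache_misses, tlb_misses):
    """Build one record as the debug trace circuitry might on a taken branch."""
    return TraceRecord(pc_before, pc_target,
                       {"instret": instret,
                        "cache_misses": cache_misses,
                        "tlb_misses": tlb_misses})

rec = emit_record(0x1000, 0x2400, instret=37, cache_misses=2, tlb_misses=0)
```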
  • Patent number: 12360767
    Abstract: A data processing apparatus comprises processing circuitry to execute processing instructions, the processing circuitry comprising: a set of physical registers; instruction decoder circuitry to decode processing instructions; detector circuitry to detect groups of instructions which comply with a conflict condition, in which a group of instructions complies with the conflict condition at least when a given storage element is written to by a maximum of one instruction of that group of instructions; instruction issue circuitry to issue decoded instructions for execution; and instruction execution circuitry to execute instructions decoded by the instruction decoder circuitry.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: July 15, 2025
    Assignee: Arm Limited
    Inventors: Michael Jean Sole, Cedric Denis Robert Airaud
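The conflict condition above (each storage element written by at most one instruction in the group) is easy to express as a predicate. A sketch under the assumption that each instruction is modeled as an (opcode, destination, sources) tuple; the representation is illustrative only:

```python
# Illustrative detector: a group complies with the conflict condition when
# no destination register is written by more than one instruction.
from collections import Counter

def complies_with_conflict_condition(group):
    """group: list of (opcode, dest_reg, src_regs) tuples."""
    writes = Counter(dest for _, dest, _ in group)
    return all(count <= 1 for count in writes.values())

ok_group  = [("add", "r1", ("r2", "r3")), ("mul", "r4", ("r1", "r5"))]
bad_group = [("add", "r1", ("r2", "r3")), ("sub", "r1", ("r4", "r5"))]  # r1 written twice
```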
  • Patent number: 12360805
    Abstract: Processors, systems and methods are provided for thread level parallel processing. A processor may include a sequencer and a plurality of columns of vector processing units coupled to the sequencer. The sequencer may include a scalar instruction decoder, a vector instruction decoder, and a plurality of scalar processors configured to concurrently execute one or more scalar instructions decoded by the scalar instruction decoder to generate one or more vectors of parameters. The vector instruction decoder may be configured to decode one or more vector instructions to generate a set of configurations with the one or more vectors of parameters embedded as one or more vectors of immediate values and send the configurations to a target column. Each column of vector processing units may be configured to repeatedly execute the vector operations in the configurations using one or more respective elements of one or more vectors of immediate values per repetition.
    Type: Grant
    Filed: July 10, 2023
    Date of Patent: July 15, 2025
    Assignee: AzurEngine Technologies Zhuhai Inc.
    Inventors: Toshio Nagata, Yuan Li, Jianbin Zhu
  • Patent number: 12346797
    Abstract: This specification describes methods and systems for accessing attribute data in graph neural network (GNN) processing. An example system includes: a plurality of cores, each of the plurality of cores comprises a key-value fetcher and a filter, and is programmable using a software interface to support a plurality of data formats of the GNN attribute data, wherein: the key-value fetcher is programmable using the software interface to perform key-value fetching associated with accessing the GNN attribute data, and the filter of at least one of the plurality of cores is programmable using the software interface to sample node identifiers associated with accessing the GNN attribute data; and a first memory communicatively coupled with the plurality of cores, wherein the first memory is configured to store data shared by the plurality of cores.
    Type: Grant
    Filed: January 12, 2022
    Date of Patent: July 1, 2025
    Assignee: Alibaba Damo (Hangzhou) Technology Co., Ltd.
    Inventors: Heng Liu, Shuangchen Li, Tianchan Guan, Hongzhong Zheng
  • Patent number: 12333418
    Abstract: A neural network device including an on-chip buffer memory that stores an input feature map of a first layer of a neural network, a computational circuit that receives the input feature map of the first layer through a single port of the on-chip buffer memory and performs a neural network operation on the input feature map of the first layer to output an output feature map of the first layer corresponding to the input feature map of the first layer, and a controller that transmits the output feature map of the first layer to the on-chip buffer memory through the single port to store the output feature map of the first layer and the input feature map of the first layer together in the on-chip buffer memory.
    Type: Grant
    Filed: October 18, 2023
    Date of Patent: June 17, 2025
    Assignees: Samsung Electronics Co., Ltd., UNIST (ULSAN NATIONAL INSTITUTE OF SCIENCE AND TECHNOLOGY)
    Inventors: Hyeongseok Yu, Hyeonuk Sim, Jongeun Lee
  • Patent number: 12321744
    Abstract: A computer-implemented method for hardware gather optimization can include identifying, by at least one processor, one or more gather instructions that retrieve data from contiguous memory locations. The method can additionally include converting, by the at least one processor, the one or more gather instructions into one or more strided load instructions in response to the identification. The method can also include loading, by the at least one processor, data retrieved using the one or more strided load instructions into one or more vector registers. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Grant
    Filed: June 27, 2023
    Date of Patent: June 3, 2025
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Ashish Jha
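The gather-to-strided-load conversion above can be sketched in plain Python: if the gather's indices address contiguous (unit-stride) locations, a single contiguous load replaces the element-by-element gather. Function names here are illustrative, not AMD's implementation:

```python
# Illustrative model of the hardware gather optimization: detect gathers
# over contiguous memory locations and convert them to a contiguous load.
def is_contiguous(indices):
    return all(b - a == 1 for a, b in zip(indices, indices[1:]))

def gather(memory, base, indices):
    """Element-by-element gather from arbitrary offsets."""
    return [memory[base + i] for i in indices]

def load_vector(memory, base, indices):
    """Use one contiguous load when the indices are unit-stride; else gather."""
    if is_contiguous(indices):
        start = base + indices[0]
        return memory[start:start + len(indices)]
    return gather(memory, base, indices)

mem = list(range(100))
```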
  • Patent number: 12321298
    Abstract: A physical layer module and a network module are provided. The network module includes the physical layer module and a media access control module. The physical layer module includes a group decoder, an input selection module, and a device module. The group decoder decodes a common input data signal generated according to a management data input/output signal to generate a group selection signal. The input selection module includes X input circuits being classified into M groups. The X input circuits generate X device input data according to the common input data signal and the group selection signal. The device module includes K physical layer devices classified into M groups. The K physical layer devices receive the X device input data from the X input circuits. An m-th group corresponds to at least one input circuit and N[m] physical layer devices.
    Type: Grant
    Filed: February 5, 2024
    Date of Patent: June 3, 2025
    Assignee: FARADAY TECHNOLOGY CORPORATION
    Inventor: Chun-Yuan Lai
  • Patent number: 12321747
    Abstract: A method is described herein. The method generally includes fetching a set of data from a memory coupled to a memory controller. The method generally includes determining a first subset of data from the set of data. The method generally includes determining a second subset of data from the set of data. The method generally includes determining a first element from the set of data. The method generally includes providing a vector including the first subset, the first element, and the second subset, wherein each element of the first subset is disposed in one portion of the vector and each element of the second subset is disposed in another portion of the vector. The method generally includes storing the vector into a register of the memory controller.
    Type: Grant
    Filed: February 6, 2023
    Date of Patent: June 3, 2025
    Assignee: Texas Instruments Incorporated
    Inventors: Asheesh Bhardwaj, Burton Adrik Copeland, Tim Anderson
  • Patent number: 12314715
    Abstract: Apparatus and methods for tracking sub-micro-operations and groups thereof are described. An integrated circuit includes a load store unit configured to receive store micro-operations cracked from a vector store instruction. The load store unit is configured to unroll multiple store sub-micro-operations from each of the store micro-operations. The load store unit includes an issue status vector to track issuance of each sub-micro-operation, an unroll status vector to track unrolling of each sub-micro-operation associated with a group of sub-micro-operations, and a replay status vector to track a replayability of sub-micro-operations associated with the group of sub-micro-operations.
    Type: Grant
    Filed: June 15, 2023
    Date of Patent: May 27, 2025
    Assignee: SiFive, Inc.
    Inventors: Yueh Chi Wu, Yohann Rabefarihy
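The three status vectors above map naturally onto per-group bit vectors. A minimal sketch; the class, method names, and replay policy are assumptions for illustration:

```python
# Illustrative bit-vector tracker for unrolled store sub-micro-operations:
# issue_status tracks issuance, unroll_status tracks unrolling, and
# replay_status tracks replayability within one group.
class SubUopTracker:
    def __init__(self, num_sub_uops):
        self.n = num_sub_uops
        self.issue_status = 0    # bit i set -> sub-uop i issued
        self.unroll_status = 0   # bit i set -> sub-uop i unrolled
        self.replay_status = 0   # bit i set -> sub-uop i replayable

    def unroll(self, i):
        self.unroll_status |= 1 << i

    def issue(self, i, replayable=False):
        self.issue_status |= 1 << i
        if replayable:
            self.replay_status |= 1 << i

    def all_issued(self):
        return self.issue_status == (1 << self.n) - 1
```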
  • Patent number: 12314201
    Abstract: Disclosed herein is a method for distributed training of an AI model in a channel-sharing network environment. The method includes determining whether data parallel processing is applied, calculating a computation time and a communication time when input data is evenly distributed across multiple computation devices, and unevenly distributing the input data across the multiple computation devices based on the computation time and the communication time.
    Type: Grant
    Filed: June 30, 2023
    Date of Patent: May 27, 2025
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Ki-Dong Kang, Hong-Yeon Kim, Baik-Song An, Myung-Hoon Cha
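One simple policy consistent with the abstract above is to give each device a share inversely proportional to its per-sample time (computation plus communication). This is an illustrative assumption, not the patent's exact formula:

```python
# Illustrative uneven distribution: faster devices (lower per-sample time)
# receive proportionally more of the input data.
def uneven_split(total_samples, per_sample_times):
    """per_sample_times: seconds per sample (compute + comm) for each device."""
    speeds = [1.0 / t for t in per_sample_times]
    total_speed = sum(speeds)
    shares = [int(total_samples * s / total_speed) for s in speeds]
    shares[0] += total_samples - sum(shares)   # absorb rounding remainder
    return shares
```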
  • Patent number: 12293186
    Abstract: Embodiments detailed herein relate to systems and methods to store a tile register pair to memory. In one example, a processor includes: decode circuitry to decode a store matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded store matrix pair instruction to store every element of left and right tiles of the identified source matrix to corresponding element positions of left and right tiles of the identified destination matrix, respectively, wherein the executing stores a chunk of C elements of one row of the identified source matrix at a time.
    Type: Grant
    Filed: November 2, 2023
    Date of Patent: May 6, 2025
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
  • Patent number: 12292842
    Abstract: Examples described herein relate to network layer 7 (L7) offload to an infrastructure processing unit (IPU) for a service mesh. An apparatus described herein includes an IPU comprising an IPU memory to store a routing table for a service mesh, the routing table to map shared memory address spaces of the IPU and a host device executing one or more microservices, wherein the service mesh provides an infrastructure layer for the one or more microservices executing on the host device; and one or more IPU cores communicably coupled to the IPU memory, the one or more IPU cores to: host a network L7 proxy endpoint for the service mesh, and communicate messages between the network L7 proxy endpoint and an L7 interface device of the one or more microservices by copying data between the shared memory address spaces of the IPU and the host device based on the routing table.
    Type: Grant
    Filed: September 27, 2021
    Date of Patent: May 6, 2025
    Assignee: Intel Corporation
    Inventors: Mrittika Ganguli, Anjali Jain, Reshma Lal, Edwin Verplanke, Priya Autee, Chih-Jen Chang, Abhirupa Layek, Nupur Jain
  • Patent number: 12282525
    Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address, and execution circuitry to execute the decoded instruction to store configuration information about usage of storage for two-dimensional data structures at the memory address.
    Type: Grant
    Filed: November 3, 2023
    Date of Patent: April 22, 2025
    Assignee: Intel Corporation
    Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
  • Patent number: 12284122
    Abstract: A circuit and corresponding method perform resource arbitration. The circuit comprises a pending arbiter (PA) that outputs a PA selection for accessing a resource. The PA selection is based on PA input. The PA input represents respective pending-state of requesters of the resource. The circuit further comprises a valid arbiter (VA) that outputs a VA selection for accessing the resource. The VA selection is based on VA input. The VA input represents respective valid-state of the requesters. The circuit performs a validity check on the PA selection output. The circuit outputs a final selection for accessing the resource by selecting, based on the validity check performed, the PA selection output or VA selection output. The circuit addresses arbitration fairness issues that may result when multiple requesters are arbitrating to be selected for access to a shared resource and such requesters require a credit (token) to be eligible for arbitration.
    Type: Grant
    Filed: February 6, 2024
    Date of Patent: April 22, 2025
    Assignee: Marvell Asia Pte Ltd
    Inventors: Joseph Featherston, Aadeetya Shreedhar
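The two-arbiter scheme above can be modeled with a small fallback rule: use the pending arbiter's pick when it passes the validity (credit) check, else fall back to the valid arbiter's pick. Round-robin selection is an assumed policy for illustration:

```python
# Illustrative resource arbitration: a pending arbiter (PA) and a valid
# arbiter (VA) each pick a requester; the PA pick wins only if it passes
# the validity check.
def round_robin_pick(mask, last):
    """Pick the next set bit in mask after index `last`, wrapping around."""
    n = len(mask)
    for i in range(1, n + 1):
        idx = (last + i) % n
        if mask[idx]:
            return idx
    return None

def arbitrate(pending, valid, last=0):
    pa_sel = round_robin_pick(pending, last)   # pending arbiter selection
    va_sel = round_robin_pick(valid, last)     # valid arbiter selection
    if pa_sel is not None and valid[pa_sel]:   # validity check on PA output
        return pa_sel
    return va_sel                              # final selection falls back to VA
```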
  • Patent number: 12277074
    Abstract: Techniques are disclosed pertaining to utilizing a communication fabric via multiple ports. An agent circuit includes a plurality of command-and-data ports that couple the agent circuit to a communication fabric coupled to a plurality of hardware components that includes a plurality of memory controller circuits that facilitate access to a memory. The agent circuit can execute an instruction that involves issuing a command for data stored at the memory. The agent circuit may perform a hash operation on a memory address associated with the command to determine which one of the plurality of memory controller circuits to which to issue the command. The agent circuit issues the command to the determined memory controller circuit on a particular one of the plurality of command-and-data ports that is designated to the memory controller circuit. The agent circuit may issue all commands destined to that memory controller circuit on that port.
    Type: Grant
    Filed: September 25, 2023
    Date of Patent: April 15, 2025
    Assignee: Apple Inc.
    Inventors: Sergio Kolor, Sandeep Gupta, James Vash
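The hash-based routing above amounts to mapping a memory address onto one of N memory controllers, then issuing on that controller's designated port. The XOR-fold hash below is an illustrative choice, not the actual function:

```python
# Illustrative address-hash routing: fold high address bits into low bits,
# then reduce modulo the number of memory controllers; each controller has
# one designated command-and-data port.
def controller_for_address(addr: int, num_controllers: int) -> int:
    folded = addr ^ (addr >> 12) ^ (addr >> 24)   # fold higher bits in
    return folded % num_controllers

def port_for_command(addr, num_controllers, port_map):
    """port_map: controller index -> designated command-and-data port."""
    return port_map[controller_for_address(addr, num_controllers)]

ports = {0: "p0", 1: "p1", 2: "p2", 3: "p3"}
```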
  • Patent number: 12265488
    Abstract: An apparatus includes a first die connected to a second die through a die-to-die (D2D) interface. The first die includes a first interconnect configured to provide first lanes communicating with the second die to the D2D interface, the first interconnect includes a first logic circuit configured to indicate a correlation between a number of chiplet dies connected to the first lanes and connected signal pins from among a plurality of signal pins of the connected chiplet dies. The second die includes the number of connected chiplet dies each including a second interconnect configured to provide second lanes to the D2D interface from each of the connected chiplet dies. The second lanes are configured to be set according to a number of the connected signal pins of the connected chiplet dies.
    Type: Grant
    Filed: October 31, 2023
    Date of Patent: April 1, 2025
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Wangyong Im, Byoungkon Jo, Gyesik Oh, Duksung Kim, Jangseok Choi
  • Patent number: 12260222
    Abstract: This application discloses an exception handling method, which may be applied to a processor. The method includes: The processor calls a second function according to a call instruction of a first function, where the first function is a high-level language function, and the second function is a runtime function. When an exception occurs in a process of executing the second function, the processor executes a return operation of the second function, where the return operation of the second function includes restoring a status of a first register used when the second function is executed to a status before the first function calls the second function. The processor performs exception handling based on the status of the first register. The method can improve running performance of the processor.
    Type: Grant
    Filed: August 24, 2023
    Date of Patent: March 25, 2025
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventor: Ning Chu
  • Patent number: 12260906
    Abstract: A hardware/software co-compressed computing method for a static random access memory (SRAM) computing-in-memory-based (CIM-based) processing unit includes performing a data dividing step, a sparsity step, an address assigning step and a hardware decoding and calculating step. The data dividing step is performed to divide a plurality of kernels into a plurality of weight groups. The sparsity step includes performing a weight setting step. The weight setting step is performed to set each of the weight groups to one of a zero weight group and a non-zero weight group. The address assigning step is performed to assign a plurality of index codes to a plurality of the non-zero weight groups, respectively. The hardware decoding and calculating step is performed to execute an inner product on the non-zero weight groups and their corresponding input feature data group to generate the output feature data group.
    Type: Grant
    Filed: September 3, 2021
    Date of Patent: March 25, 2025
    Assignee: NATIONAL TSING HUA UNIVERSITY
    Inventors: Kea-Tiong Tang, Syuan-Hao Sie, Jye-Luen Lee
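The sparsity scheme above can be sketched as two steps: assign index codes only to non-zero weight groups, then run the inner product over just those groups. This is a hypothetical software model; the data layout and function names are assumptions:

```python
# Illustrative co-compression: all-zero weight groups are skipped, non-zero
# groups receive consecutive index codes, and the inner product runs only
# over the indexed (non-zero) groups.
def partition_and_index(weight_groups):
    """Assign index codes to non-zero weight groups only."""
    indexed = {}
    for group in weight_groups:
        if any(w != 0 for w in group):
            indexed[len(indexed)] = group   # next free index code
    return indexed

def sparse_inner_product(indexed_groups, inputs_by_code):
    """Inner product of each indexed weight group with its input feature data."""
    total = 0
    for code, group in indexed_groups.items():
        x = inputs_by_code[code]
        total += sum(w * v for w, v in zip(group, x))
    return total

groups = [[0, 0], [1, 2], [0, 3]]   # first group is all-zero, gets no code
```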
  • Patent number: 12248814
    Abstract: Provided is an apparatus for accelerating graph neural network (GNN) pre-processing, the apparatus including a set-partitioning accelerator configured to sort each edge of an original graph stored in a coordinate list (COO) format by a node number, perform radix sorting based on a vertex identification (VID) to generate a COO array of a preset length, and perform uniform random sampling on some nodes of a given node array, a merger configured to merge the COO array of the preset length to generate one sorted COO array, a re-indexer configured to assign new consecutive VIDs respectively to the nodes selected through the uniform random sampling, and a compressed sparse row (CSR) converter configured to convert the edges sorted by the node number into a CSR format.
    Type: Grant
    Filed: August 22, 2023
    Date of Patent: March 11, 2025
    Assignee: KOREA ADVANCED INSTITUTE OF SCIENCE AND TECHNOLOGY
    Inventors: Myoungsoo Jung, Seungkwan Kang, Donghyun Gouk, Miryeong Kwon, Hyunkyu Choi, Junhyeok Jang
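The sort-then-convert pipeline above ends in a standard COO-to-CSR conversion. A plain-Python sketch standing in for the hardware sorter and converter (the sort here plays the role of the radix sort by VID):

```python
# Illustrative COO -> CSR conversion: sort edges by source vertex ID, count
# edges per row, prefix-sum the counts into row pointers.
def coo_to_csr(edges, num_nodes):
    """edges: list of (src, dst) pairs in COO format."""
    edges = sorted(edges)                # hardware would radix-sort by VID
    row_ptr = [0] * (num_nodes + 1)
    col_idx = []
    for src, dst in edges:
        row_ptr[src + 1] += 1            # count edges per source node
        col_idx.append(dst)
    for i in range(num_nodes):           # prefix sum -> row pointers
        row_ptr[i + 1] += row_ptr[i]
    return row_ptr, col_idx
```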