Patents by Inventor Hongzhong Zheng

Hongzhong Zheng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ERROR DETECTION, PREDICTION AND HANDLING TECHNIQUES FOR SYSTEM-IN-PACKAGE MEMORY ARCHITECTURES

Publication number: 20240020194

Abstract: A system-in-package including a logic die and one or more memory dice can include a reliability, availability, and serviceability (RAS) memory management unit (MMU) for memory error detection, memory error prediction and memory error handling. The RAS MMU can receive memory health information, on-die memory error information, system error information and read address information for the one or more memory dice. The RAS MMU can manage the memory blocks of the one or more memory dice based on the memory health information, on-die memory error type, system error type and read address. The RAS MMU can further manage the memory blocks based on received on-die memory temperature information and/or system temperature information.

Type: Application

Filed: November 4, 2020

Publication date: January 18, 2024

Inventors: Dimin NIU, Tianchan GUAN, Hongzhong ZHENG, Shuangchen LI
COMPUTER-IMPLEMENTED MEMORY ALLOCATION METHOD FOR SPARSE MATRIX MULTIPLICATION APPLICATIONS

Publication number: 20240004955

Abstract: This application describes an accelerator, a computer system, and a method for memory optimization in sparse matrix-matrix multiplications (spGEMM). The memory optimization includes accurate memory pre-allocation for a to-be-generated output matrix of spGEMM between two sparse matrices. An exemplary method may include: sampling a plurality of first rows in the first sparse matrix; identifying, based on indices of non-zero data in the plurality of first rows, a plurality of second rows in a second sparse matrix; performing symbolic multiplication operations between the non-zero data in the plurality of first and second rows; determining an estimated compression ratio of the output matrix; determining an estimated mean row size for each row in the output matrix based on the estimated compression ratio; and allocating, according to the estimated mean row size and a total number of rows of the output matrix, a memory space in a hardware memory.

Type: Application

Filed: November 9, 2022

Publication date: January 4, 2024

Inventors: Zhaoyang DU, Yijin GUAN, Dimin NIU, Hongzhong ZHENG
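
The sampling-based pre-allocation described in the abstract for publication 20240004955 can be illustrated with a small sketch. This is not the patented implementation: the function name `estimate_allocation`, the row-of-column-indices matrix layout, and the definition of the compression ratio as "intermediate products per surviving output non-zero" are all assumptions made for illustration.

```python
import math
import random
from fractions import Fraction


def estimate_allocation(a_rows, b_rows, sample_size, seed=0):
    """Estimate how many non-zero slots to allocate for C = A @ B.

    a_rows[i] and b_rows[k] list the column indices of the non-zeros in
    row i of A and row k of B (values are not needed for a symbolic pass).
    """
    rng = random.Random(seed)
    sampled = rng.sample(range(len(a_rows)), min(sample_size, len(a_rows)))

    products = 0  # intermediate products generated by the sampled rows
    nonzeros = 0  # distinct output columns produced by the sampled rows
    for i in sampled:
        out_cols = set()
        for k in a_rows[i]:  # non-zero A[i, k] selects row k of B
            products += len(b_rows[k])
            out_cols.update(b_rows[k])  # symbolic multiply: indices only
        nonzeros += len(out_cols)

    # Assumed definition: products per output non-zero in the sample.
    ratio = Fraction(products, nonzeros) if nonzeros else Fraction(1)

    # Total intermediate products is cheap to count exactly for all rows.
    total_products = sum(len(b_rows[k]) for row in a_rows for k in row)
    mean_row_size = Fraction(total_products, len(a_rows)) / ratio
    slots = math.ceil(mean_row_size * len(a_rows))
    return ratio, mean_row_size, slots
```

Exact fractions are used so that the rounding up to whole slots is not affected by floating-point error; a hardware design would more likely use fixed-point arithmetic.
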
HARDWARE ACCELERATION FRAMEWORK FOR GRAPH NEURAL NETWORK QUANTIZATION

Publication number: 20240005133

Abstract: This application describes a hardware and software design for quantization in GNN computation. An exemplary method may include: receiving a graph comprising a plurality of nodes respectively represented by a plurality of feature vectors; segmenting the plurality of feature vectors into a plurality of sub-vectors and grouping the plurality of sub-vectors into a plurality of groups of sub-vectors; performing vector clustering on each of the plurality of groups of sub-vectors to generate a plurality of centroids as a codebook; encoding each of the plurality of feature vectors to obtain a plurality of index maps by quantizing sub-vectors within each feature vector based on the codebook, wherein each index map occupies a smaller storage space than the corresponding feature vector does; and storing the plurality of index maps as an assignment table instead of the plurality of feature vectors to represent the plurality of nodes for GNN computation.

Type: Application

Filed: August 30, 2022

Publication date: January 4, 2024

Inventors: Linyong HUANG, Zhe ZHANG, Shuangchen LI, Hongzhong ZHENG
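
The codebook-and-index-map scheme in the abstract for publication 20240005133 resembles product quantization. The sketch below shows one way the steps could fit together, assuming sub-vectors are grouped by their position in the feature vector and clustered with a plain k-means. The helper names (`kmeans`, `build_codebooks`, `encode`, `decode`) are hypothetical and not taken from the application.

```python
import random


def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
    )


def kmeans(points, k, iters=10, seed=0):
    """Tiny k-means; an empty cluster keeps its previous centroid."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            buckets[nearest(p, centroids)].append(p)
        for c, bucket in enumerate(buckets):
            if bucket:
                centroids[c] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


def build_codebooks(features, num_sub, k):
    """Cluster each group of sub-vectors; one list of k centroids per group."""
    width = len(features[0]) // num_sub
    codebooks = []
    for s in range(num_sub):
        group = [tuple(f[s * width:(s + 1) * width]) for f in features]
        codebooks.append(kmeans(group, k))
    return codebooks


def encode(feature, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    width = len(feature) // len(codebooks)
    return [
        nearest(feature[s * width:(s + 1) * width], cb)
        for s, cb in enumerate(codebooks)
    ]


def decode(index_map, codebooks):
    """Approximate the original feature vector from its index map."""
    return [x for idx, cb in zip(index_map, codebooks) for x in cb[idx]]
```

Each index map stores `num_sub` small integers instead of a full feature vector, which is where the storage saving comes from; the codebooks are shared by all nodes.
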
SMART MEMORY EXTENSION TO PROCESSORS

Publication number: 20240005127

Abstract: This application describes systems and methods for facilitating memory access for graph neural network (GNN) processing. An example system includes a plurality of processing units, each configured to perform graph neural network (GNN) processing; and a plurality of memory extension cards, each configured to store graph data for the GNN processing, wherein: each of the plurality of processing units is communicatively coupled with three other processing units via one or more interconnects respectively; the plurality of processing units are communicatively coupled with the plurality of memory extension cards respectively; and each of the plurality of memory extension cards includes a graphic access engine circuitry configured to accelerate GNN memory access.

Type: Application

Filed: November 28, 2022

Publication date: January 4, 2024

Inventors: Yijin GUAN, Dimin NIU, Shengcheng WANG, Shuangchen LI, Hongzhong ZHENG
COMPUTER-IMPLEMENTED ACCUMULATION METHOD FOR SPARSE MATRIX MULTIPLICATION APPLICATIONS

Publication number: 20240004954

Abstract: This application describes a hardware acceleration design for improving SpGEMM efficiency. An exemplary method may include: obtaining a first sparse matrix and a second sparse matrix for performing SpGEMM; allocating a pair of buffers respectively pointed to by a first pointer and a second pointer; for each first row in the first sparse matrix that comprises a plurality of non-zero elements, identifying a plurality of second rows in the second sparse matrix that correspond to the plurality of non-zero elements; obtaining a plurality of intermediate lists computed based on each of the plurality of non-zero elements in the first row and one of the plurality of second rows that corresponds to the non-zero element; performing accumulation of the intermediate lists using the pair of buffers; and migrating the final merged list to a system memory as a row of an output matrix of the SpGEMM.

Type: Application

Filed: November 1, 2022

Publication date: January 4, 2024

Inventors: Zhaoyang DU, Yijin GUAN, Dimin NIU, Hongzhong ZHENG
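
The two-buffer accumulation in the abstract for publication 20240004954 can be read as a ping-pong merge: the running result sits in one buffer, each new intermediate list is merged with it into the other buffer, and the two pointers are swapped. The sketch below follows that reading. The names `merge_sorted` and `spgemm_row` and the `(column, value)` list layout are assumptions, not details from the application.

```python
def merge_sorted(acc, new, out):
    """Merge two column-sorted (col, val) lists into `out`, summing equal columns."""
    out.clear()
    i = j = 0
    while i < len(acc) and j < len(new):
        if acc[i][0] < new[j][0]:
            out.append(acc[i])
            i += 1
        elif acc[i][0] > new[j][0]:
            out.append(new[j])
            j += 1
        else:
            out.append((acc[i][0], acc[i][1] + new[j][1]))
            i += 1
            j += 1
    out.extend(acc[i:])
    out.extend(new[j:])


def spgemm_row(a_row, b_rows):
    """Compute one output row of A @ B.

    a_row is a list of (k, value) pairs for the non-zeros of one row of A;
    b_rows[k] is the column-sorted (col, value) list for row k of B.
    """
    src, dst = [], []  # the pair of buffers; `src` holds the running result
    for k, a_val in a_row:
        # Intermediate list: row k of B scaled by the non-zero A[i, k].
        partial = [(col, a_val * b_val) for col, b_val in b_rows[k]]
        merge_sorted(src, partial, dst)
        src, dst = dst, src  # swap the pointers instead of copying data
    return list(src)  # "migrate" the final merged list out of the buffers
```

Swapping the two references avoids copying the merged result back after every step, which is the main appeal of the two-pointer arrangement.
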
GRAPH ACCELERATION SOLUTION WITH CLOUD FPGA

Publication number: 20240004824

Abstract: This application describes systems and methods for facilitating memory access for graph neural network (GNN) processing. An example method includes fetching, by an access engine circuitry implemented on a circuitry board, a portion of structure data of a graph from a pinned memory in a host memory of a host via a first peripheral component interconnect express (PCIe) connection; performing node sampling using the fetched portion of the structure data of the graph to select one or more sampled nodes; fetching, by the access engine circuitry, a portion of attribute data of the graph from the pinned memory via the first PCIe connection; sending the fetched portion of the attribute data of the graph to one or more processors; and performing, by the one or more processors, GNN processing for the graph using the fetched portion of the attribute data of the graph.

Type: Application

Filed: November 30, 2022

Publication date: January 4, 2024

Inventors: Shuangchen LI, Dimin NIU, Hongzhong ZHENG, Zhe ZHANG, Yuhao WANG
GRAPHIC NEURAL NETWORK ACCELERATION SOLUTION WITH CUSTOMIZED BOARD FOR SOLID-STATE DRIVES

Publication number: 20240005075

Abstract: This application describes systems and methods for facilitating memory access for graph neural network (GNN) processing. An example method includes fetching, by an access engine circuitry implemented on a circuitry board, a portion of structure data of a graph from one or more of a plurality of flash memory drives implemented on the circuitry board; performing node sampling using the fetched portion of the structure data of the graph to select one or more sampled nodes; fetching a portion of attribute data of the graph from two or more of the plurality of memory drives in parallel according to the selected one or more sampled nodes; sending the fetched portion of the attribute data of the graph to a host outside of the circuitry board; and performing, by the host, GNN processing for the graph using the fetched portion of the attribute data of the graph.

Type: Application

Filed: November 30, 2022

Publication date: January 4, 2024

Inventors: Shuangchen LI, Dimin NIU, Hongzhong ZHENG
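
The flow in the abstract for publication 20240005075 (sample nodes from the structure data, then fetch only their attributes from several drives in parallel) can be sketched in software as follows. The drives are modeled as plain dictionaries, and the placement rule "node n lives on drive n modulo the drive count" is an assumption made for illustration, as are the function names.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def sample_neighbors(structure, seeds, fanout, seed=0):
    """Pick up to `fanout` random neighbors of each seed node."""
    rng = random.Random(seed)
    sampled = set(seeds)
    for node in seeds:
        neighbors = structure.get(node, [])
        sampled.update(rng.sample(neighbors, min(fanout, len(neighbors))))
    return sorted(sampled)


def fetch_attributes(drives, nodes):
    """Read attributes of `nodes`, issuing one request per drive in parallel."""
    per_drive = {}
    for n in nodes:
        # Assumed placement: node n is stored on drive n % len(drives).
        per_drive.setdefault(n % len(drives), []).append(n)

    def read(drive_id):
        return {n: drives[drive_id][n] for n in per_drive[drive_id]}

    result = {}
    with ThreadPoolExecutor(max_workers=len(per_drive) or 1) as pool:
        for chunk in pool.map(read, per_drive):
            result.update(chunk)
    return result
```

Only the attributes of sampled nodes cross the link to the host, which is the point of doing the sampling close to the storage.
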
3D-STACKED MEMORY WITH RECONFIGURABLE COMPUTE LOGIC

Publication number: 20240004547

Abstract: A 3D-stacked memory device including: a base die including a plurality of switches to direct data flow and a plurality of arithmetic logic units (ALUs) to compute data; a plurality of memory dies stacked on the base die; and an interface to transfer signals to control the base die.

Type: Application

Filed: September 15, 2023

Publication date: January 4, 2024

Inventors: Mu-Tien Chang, Prasun Gera, Dimin Niu, Hongzhong Zheng
Processing system that increases the memory capacity of a GPGPU

Patent number: 11847049

Abstract: The total memory space that is logically available to a processor in a general-purpose graphics processing unit (GPGPU) module is increased to accommodate terabyte-sized amounts of data by utilizing the memory space in an external memory module, and by further utilizing a portion of the memory space in a number of other external memory modules.

Type: Grant

Filed: January 21, 2022

Date of Patent: December 19, 2023

Assignee: Alibaba Damo (Hangzhou) Technology Co., Ltd

Inventors: Yuhao Wang, Dimin Niu, Yijin Guan, Shengcheng Wang, Shuangchen Li, Hongzhong Zheng
Graph neural network accelerator with attribute caching

Patent number: 11841799

Abstract: This application describes a hardware accelerator, a computer system, and a method for accelerating Graph Neural Network (GNN) node attribute fetching. The hardware accelerator comprises a GNN attribute processor; and a first memory, wherein the GNN attribute processor is configured to: receive a graph node identifier; determine a target memory address within the first memory based on the graph node identifier; determine, based on the received graph node identifier, whether attribute data corresponding to the received graph node identifier is cached in the first memory at the target memory address; and in response to determining that the attribute data is not cached in the first memory: fetch the attribute data from a second memory, and write the fetched attribute data into the first memory at the target memory address.

Type: Grant

Filed: January 21, 2022

Date of Patent: December 12, 2023

Assignee: T-Head (Shanghai) Semiconductor Co., Ltd.

Inventors: Tianchan Guan, Heng Liu, Shuangchen Li, Hongzhong Zheng
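
The lookup sequence in the abstract for patent 11841799 (derive a target address from the node identifier, check whether the attribute is cached there, fetch and fill on a miss) matches the behavior of a direct-mapped cache. The sketch below assumes that mapping; the class name `AttributeCache`, the modulo address function, and the dictionary standing in for the second memory are illustrative choices only.

```python
class AttributeCache:
    """Direct-mapped cache of node attributes in front of a slower store."""

    def __init__(self, num_slots, second_memory):
        self.tags = [None] * num_slots   # which node id occupies each slot
        self.data = [None] * num_slots   # cached attribute data per slot
        self.second_memory = second_memory
        self.misses = 0

    def get(self, node_id):
        # Assumed address function: slot index is node id modulo slot count.
        slot = node_id % len(self.tags)
        if self.tags[slot] != node_id:
            # Miss: fetch from the second memory and fill the target slot.
            self.misses += 1
            self.data[slot] = self.second_memory[node_id]
            self.tags[slot] = node_id
        return self.data[slot]
```

Two node identifiers that map to the same slot evict each other, so repeated alternating accesses to such a pair miss every time in this simple scheme.
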
PROCESSING SYSTEM WITH INTEGRATED DOMAIN SPECIFIC ACCELERATORS

Publication number: 20230393851

Abstract: A number of domain specific accelerators (DSA1-DSAn) are integrated into a conventional processing system (100) to operate on the same chip by adding additional instructions to a conventional instruction set architecture (ISA), and further adding an accelerator interface unit (130) to the processing system (100) to respond to the additional instructions and interact with the DSAs.

Type: Application

Filed: June 20, 2023

Publication date: December 7, 2023

Inventors: Yuhao WANG, Zhaoyang DU, Yen-kuang CHEN, Wei HAN, Shuangchen LI, Fei XUE, Hongzhong ZHENG
Devices and methods for accessing and retrieving data in a graph

Patent number: 11836188

Abstract: A programmable device receives commands from a processor and, based on the commands: identifies a root node in a graph; identifies nodes in the graph that are neighbors of the root node; identifies nodes in the graph that are neighbors of the neighbors; retrieves data associated with the root node; retrieves data associated with at least a subset of the nodes that are neighbors of the root node and that are neighbors of the neighbor nodes; and writes the data that is retrieved into a memory.

Type: Grant

Filed: January 21, 2022

Date of Patent: December 5, 2023

Assignee: Alibaba Damo (Hangzhou) Technology Co., Ltd

Inventors: Shuangchen Li, Tianchan Guan, Zhe Zhang, Heng Liu, Wei Han, Dimin Niu, Hongzhong Zheng
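
The traversal in the abstract for patent 11836188 amounts to gathering data for a root node, its neighbors, and the neighbors of those neighbors. A minimal software sketch follows, assuming an adjacency dictionary and an optional per-node limit to model the "at least a subset" wording. The function name `gather_two_hop` and the `limit` parameter are hypothetical.

```python
def gather_two_hop(adjacency, attributes, root, limit=None):
    """Collect attribute data for the root, its neighbors, and their neighbors.

    `limit` caps how many neighbors are followed per node (None means all).
    """
    hop1 = adjacency.get(root, [])[:limit]
    hop2 = [n for v in hop1 for n in adjacency.get(v, [])[:limit]]

    # dict.fromkeys removes duplicates while keeping first-seen order.
    order = dict.fromkeys([root, *hop1, *hop2])
    return {n: attributes[n] for n in order}  # the data written to memory
```
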
3D-stacked memory with reconfigurable compute logic

Patent number: 11789610

Abstract: A 3D-stacked memory device including: a base die including a plurality of switches to direct data flow and a plurality of arithmetic logic units (ALUs) to compute data; a plurality of memory dies stacked on the base die; and an interface to transfer signals to control the base die.

Type: Grant

Filed: June 21, 2021

Date of Patent: October 17, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Mu-Tien Chang, Prasun Gera, Dimin Niu, Hongzhong Zheng
THREE-DIMENSIONAL STACKED PROCESSING SYSTEMS

Publication number: 20230326905

Abstract: Aspects of the present technology are directed toward three-dimensional (3D) stacked processing systems characterized by high memory capacity, high memory bandwidth, low power consumption and small form factor. The 3D stacked processing systems include a plurality of processor chiplets and input/output circuits directly coupled to each of the plurality of processor chiplets.

Type: Application

Filed: September 17, 2020

Publication date: October 12, 2023

Inventors: Dimin NIU, Wei HAN, Tianchan GUAN, Yuhao WANG, Shuangchen LI, Hongzhong ZHENG
Memory lookup computing mechanisms

Patent number: 11775294

Abstract: According to some example embodiments of the present disclosure, in a method for a memory lookup mechanism in a high-bandwidth memory system, the method includes: using a memory die to conduct a multiplication operation using a lookup table (LUT) methodology by accessing a LUT, which includes floating point operation results, stored on the memory die; sending, by the memory die, a result of the multiplication operation to a logic die including a processor and a buffer; and conducting, by the logic die, a matrix multiplication operation using computation units.

Type: Grant

Filed: November 30, 2021

Date of Patent: October 3, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Peng Gu, Krishna T. Malladi, Hongzhong Zheng
Cache access method and associated graph neural network system

Patent number: 11762776

Abstract: The present application discloses a cache access method and an associated graph neural network system. The graph neural network processor is used for performing computation upon a graph neural network. The graph neural network is stored in the memory in compressed sparse row format. The method includes: receiving an address corresponding to a node of the graph neural network and a type of the address; in response to the type being one of a first type or a second type, performing lookup by comparing the address with a tag field of a degree lookup table to at least obtain a degree of the node; determining whether the degree is greater than a predetermined value to obtain a determination result; and determining whether to perform lookup on a region of the cache corresponding to the type according to the determination result.

Type: Grant

Filed: January 25, 2022

Date of Patent: September 19, 2023

Assignee: T-HEAD (SHANGHAI) SEMICONDUCTOR CO., LTD.

Inventors: Zhe Zhang, Shuangchen Li, Hongzhong Zheng
HBM BASED MEMORY LOOKUP ENGINE FOR DEEP LEARNING ACCELERATOR

Publication number: 20230289081

Abstract: A storage device and method of controlling a storage device are disclosed. The storage device includes a host, a logic die, and a high bandwidth memory stack including a memory die. A computation lookup table is stored on a memory array of the memory die. The host sends a command to perform an operation utilizing a kernel and a plurality of input feature maps, which includes finding the product of a weight of the kernel and values of multiple input feature maps. The computation lookup table includes a row corresponding to a weight of the kernel, and a column corresponding to a value of the input feature maps. A result value stored at a position corresponding to a row and a column is the product of the weight corresponding to the row and the value corresponding to the column.

Type: Application

Filed: May 11, 2023

Publication date: September 14, 2023

Inventors: Peng Gu, Krishna T. Malladi, Hongzhong Zheng
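
The computation lookup table in the abstract for publication 20230289081 stores, at row r and column c, the product of the weight for row r and the input value for column c, so a multiplication becomes a table read. The sketch below shows that layout with hypothetical helper names (`build_lut`, `lut_multiply`) and small integer value sets chosen only for illustration.

```python
def build_lut(weights, values):
    """Row r, column c holds weights[r] * values[c]."""
    return [[w * v for v in values] for w in weights]


def lut_multiply(lut, weight_index, value_indices):
    """Multiply one kernel weight by several input values via table reads."""
    row = lut[weight_index]  # one row serves every input value
    return [row[c] for c in value_indices]
```

Because one weight is applied to values from multiple input feature maps, a single row access can serve several products, which is the reuse the abstract points at.
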
Memory module threading with staggered data transfers

Patent number: 11755507

Abstract: A method of transferring data between a memory controller and at least one memory module via a primary data bus having a primary data bus width is disclosed. The method includes accessing a first one of a memory device group via a corresponding data bus path in response to a threaded memory request from the memory controller. The accessing results in data groups collectively forming a first data thread transferred across a corresponding secondary data bus path. Transfer of the first data thread across the primary data bus width is carried out over a first time interval, while using less than the primary data transfer continuous throughput during that first time interval. During the first time interval, at least one data group from a second data thread is transferred on the primary data bus.

Type: Grant

Filed: May 13, 2022

Date of Patent: September 12, 2023

Assignee: Rambus Inc.

Inventors: Hongzhong Zheng, Frederick A Ware
MEMORY CONTROLLER

Publication number: 20230281124

Abstract: Apparatus, method, and system provided herein are directed to prioritizing cache line writing of compressed data. The memory controller comprises a cache line compression engine that receives raw data, compresses the raw data, determines a compression rate between the raw data and the compressed data, determines whether the compression rate is greater than a predetermined rate, and outputs the compressed data as data-to-be-written if the compression rate is greater than the predetermined rate. In response to determining that the compression rate is greater than the predetermined rate, the cache line compression engine generates a compression signal indicating the data-to-be-written is the compressed data and sends the compression signal to a scheduler of a command queue in the memory controller where writing of compressed data is prioritized.

Type: Application

Filed: August 6, 2020

Publication date: September 7, 2023

Inventors: Dimin Niu, Tianchan Guan, Lide Duan, Hongzhong Zheng
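
The decision flow in the abstract for publication 20230281124 (compress, compare the compression rate with a threshold, then flag and prioritize compressed writes) can be sketched in software. Here `zlib` stands in for the cache line compression engine and a heap stands in for the command queue scheduler; both substitutions, the threshold of 2.0, and the names `prepare_write` and `WriteQueue` are assumptions.

```python
import heapq
import zlib


def prepare_write(raw, min_rate=2.0):
    """Return (data_to_write, is_compressed) for one cache line."""
    compressed = zlib.compress(raw)
    rate = len(raw) / len(compressed)  # higher means better compression
    if rate > min_rate:
        return compressed, True        # compression signal is raised
    return raw, False


class WriteQueue:
    """Write queue in which compressed lines are scheduled first."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker that keeps arrival order within a class

    def push(self, address, raw):
        data, is_compressed = prepare_write(raw)
        priority = 0 if is_compressed else 1
        heapq.heappush(self._heap, (priority, self._seq, address, data, is_compressed))
        self._seq += 1

    def pop(self):
        _, _, address, data, is_compressed = heapq.heappop(self._heap)
        return address, data, is_compressed
```
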
Computer-implemented method, system, and storage medium for prefetching in a distributed graph architecture

Patent number: 11729268

Abstract: Various embodiments of the present disclosure relate to a computer-implemented method, a system, and a storage medium, where a graph stored in a computing system is logically divided into subgraphs, the subgraphs are stored on different interconnected (or coupled) devices in the computing system, and nodes of the subgraphs include hub nodes connected to adjacent subgraphs. Each device stores attributes and node structure information of the hub nodes of the subgraphs into other devices, and a software or hardware prefetch engine on the device prefetches attributes and node structure information associated with a sampled node. A prefetcher on a device interfacing with the interconnected (or coupled) devices may further prefetch attributes and node structure information of nodes of the subgraphs on other devices. A traffic monitor is provided on an interface device to monitor traffic. When the traffic is small, the interface device prefetches node attributes and node structure information.

Type: Grant

Filed: June 8, 2022

Date of Patent: August 15, 2023

Assignee: Alibaba (China) Co., Ltd.

Inventors: Wei Han, Shuangchen Li, Hongzhong Zheng, Yawen Zhang, Heng Liu, Dimin Niu
