Patents by Inventor Shuangchen Li
Shuangchen Li has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11954093. Abstract: Embodiments of the disclosure provide devices and methods for performing a top-k function. The device can include: a memory comprising a plurality of register files for storing the data elements, the plurality of register files comprising a parent register file and a first child register file associated with the parent register file, wherein the parent register file is associated with: first interface circuitry configured for reading a first parent data element from the parent register file and receiving a first child data element and a second child data element from the first child register file; and first comparison circuitry configured for updating the parent register file and the first child register file based on the first parent data element, the first child data element, and the second child data element according to a given principle. Type: Grant. Filed: June 4, 2020. Date of Patent: April 9, 2024. Assignee: Alibaba Group Holding Limited. Inventors: Fei Sun, Shuangchen Li, Dimin Niu, Fei Xue, Yuanwei Fang
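The register-file tree described above behaves like a heap: a parent is repeatedly compared against its two children and updated so the structure retains the top-k elements. A minimal software sketch of that selection logic (not the patented circuit) using Python's standard min-heap:

```python
import heapq

def top_k(data, k):
    """Software analogue of the hardware top-k: keep the k largest
    elements in a min-heap, so the root (the 'parent') always holds
    the smallest retained value and loses comparisons first."""
    heap = []  # min-heap of at most k elements
    for x in data:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # new element beats the current minimum: replace the root,
            # then sift down -- the parent/child comparison step
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)

# top_k([5, 1, 9, 3, 7, 8], 3) → [9, 8, 7]
```

Each incoming element triggers at most one root replacement followed by parent/child comparisons down the tree, which is the comparison-circuitry step the abstract describes in hardware terms.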
-
Publication number: 20240104360. Abstract: Near-memory processing systems for graph neural network processing can include a central core coupled to one or more memory units. The memory units can include one or more controllers and a plurality of memory devices. The system can be configured for offloading aggregation, combination and the like operations from the central core to the controllers of the one or more memory units. The central core can sample the graph neural network and schedule memory accesses for execution by the one or more memory units. The central core can also schedule aggregation, combination or the like operations associated with one or more memory accesses for execution by the controller. The controller can access data in accordance with the data access requests from the central core. One or more computation units of the controller can also execute the aggregation, combination or the like operations associated with one or more memory accesses. Type: Application. Filed: December 2, 2020. Publication date: March 28, 2024. Inventors: Tianchan Guan, Dimin Niu, Hongzhong Zheng, Shuangchen Li
-
Publication number: 20240095179. Abstract: A memory management method of a data processing system is provided. The memory management method includes: creating a first memory zone and a second memory zone related to a first node of a first server, wherein the first server is located in the data processing system, and the first node includes a processor and a first memory; mapping the first memory zone to the first memory; and mapping the second memory zone to a second memory of a second server, wherein the second server is located in the data processing system, and the processor is configured to access the second memory of the second server through an interface circuit of the first server and through an interface circuit of the second server. Type: Application. Filed: December 13, 2022. Publication date: March 21, 2024. Inventors: Dimin Niu, Yijin Guan, Tianchan Guan, Shuangchen Li, Hongzhong Zheng
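The two-zone mapping above amounts to address-range routing: one zone resolves to the node's local memory, the other to a remote server reached through interface circuits. A hypothetical sketch of that resolution step (all names, bases, and sizes are illustrative, not from the patent):

```python
class MemoryZone:
    """One contiguous region of the node's address space and the
    backend (local or remote memory) that serves it."""
    def __init__(self, base, size, backend):
        self.base, self.size, self.backend = base, size, backend

    def contains(self, addr):
        return self.base <= addr < self.base + self.size

def resolve(zones, addr):
    """Return which backend serves the given address."""
    for z in zones:
        if z.contains(addr):
            return z.backend
    raise ValueError("unmapped address")

zones = [
    MemoryZone(0x0000_0000, 0x4000_0000, "local"),        # first zone -> first memory
    MemoryZone(0x4000_0000, 0x4000_0000, "remote:srv2"),  # second zone -> second server
]
# resolve(zones, 0x1000) → "local"
# resolve(zones, 0x5000_0000) → "remote:srv2"
```

In the described system the "remote" branch would traverse the interface circuits of both servers rather than a simple lookup; the sketch only shows the zone-to-backend mapping.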
-
Publication number: 20240094922. Abstract: A data processing system includes a first server and a second server. The first server includes a first processor group, a first memory space and a first interface circuit. The second server includes a second processor group, a second memory space and a second interface circuit. The first memory space and the second memory space are allocated to the first processor group. The first processor group is configured to perform memory error detection to generate an error log corresponding to a memory error. When the memory error occurs in the second memory space, the first interface circuit is configured to send the error log to the second interface circuit, and the second processor group is configured to log the memory error according to the error log received by the second interface circuit. The data processing system is capable of realizing a memory reliability architecture that supports operations across different servers. Type: Application. Filed: December 12, 2022. Publication date: March 21, 2024. Inventors: Dimin Niu, Tianchan Guan, Yijin Guan, Shuangchen Li, Hongzhong Zheng
-
Patent number: 11934669. Abstract: A processor includes a plurality of memory units, each of the memory units including a plurality of memory cells, wherein each of the memory units is configurable to operate as memory, as a computation unit, or as a hybrid memory-computation unit. Type: Grant. Filed: July 29, 2020. Date of Patent: March 19, 2024. Assignee: Samsung Electronics Co., Ltd. Inventors: Dimin Niu, Shuangchen Li, Bob Brennan, Krishna T. Malladi, Hongzhong Zheng
-
Publication number: 20240078036. Abstract: A memory module can include a hybrid media controller coupled to a volatile memory, a non-volatile memory, a non-volatile memory buffer and a set of memory mapped input/output (MMIO) registers. The hybrid media controller can be configured for reading and writing data to a volatile memory of a memory mapped space of a memory module. The hybrid media controller can also be configured for reading and writing bulk data to a non-volatile memory of the memory mapped space. The hybrid media controller can also be configured for reading and writing data of a random-access granularity to the non-volatile memory of the memory mapped space. The hybrid media controller can also be configured for self-indexed movement of data between the non-volatile memory and the volatile memory of the memory module. Type: Application. Filed: December 24, 2020. Publication date: March 7, 2024. Inventors: Dimin Niu, Tianchan Guan, Hongzhong Zheng, Shuangchen Li
-
Publication number: 20240063200. Abstract: The present disclosure relates to a hybrid bonding based integrated circuit (HBIC) device and its manufacturing method. In some embodiments, an exemplary HBIC device includes: a first die stack comprising one or more dies; and a second die stack integrated above the first die stack. The second die stack includes at least two memory dies communicatively connected to the first die stack by wire bonding. Type: Application. Filed: February 6, 2020. Publication date: February 22, 2024. Inventors: Dimin Niu, Shuangchen Li, Tianchan Guan, Hongzhong Zheng
-
Publication number: 20240054096. Abstract: The present disclosure describes a processor. The processor is used to perform parallel computation and includes a logic die and a memory die. The logic die includes a plurality of processor cores and a plurality of networks on chip, wherein each processor core is programmable. The plurality of networks on chip are correspondingly connected to the plurality of processor cores, so that the plurality of processor cores form a two-dimensional mesh network. The memory die and the processor cores are stacked vertically, wherein the memory die includes a plurality of memory tiles, and when the processor performs the parallel computation, the plurality of memory tiles do not maintain cache coherency; the plurality of memory tiles correspond to the plurality of processor cores in a one-to-one or one-to-many manner. Type: Application. Filed: December 12, 2022. Publication date: February 15, 2024. Inventors: Shuangchen Li, Zhe Zhang, Dimin Niu, Hongzhong Zheng
-
Patent number: 11900239. Abstract: Systems and methods for dynamically executing sparse neural networks are provided. In one implementation, a system for providing dynamic sparsity in a neural network may include at least one memory storing instructions and at least one processor configured to execute the instructions to: reduce an input vector and a set of weights of the neural network; execute an input layer of the neural network using the reduced input vector and set of weights to generate a reduced output vector; expand the reduced output vector to a full output vector using first predictable output neurons (PONs); using a PON map, reduce a dimension of the full output vector; execute subsequent layers of the neural network using the reduced full output vector to produce a second reduced output vector; and expand the second reduced output vector to a second full output vector using second PONs. Type: Grant. Filed: September 5, 2019. Date of Patent: February 13, 2024. Assignee: Alibaba Group Holding Limited. Inventors: Zhenyu Gu, Liu Liu, Shuangchen Li, Yuan Xie
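The reduce/execute/expand cycle above can be illustrated with a single layer: compute only the neurons the PON map does not predict, then fill the predictable positions back in. A hedged NumPy sketch (the mask-based formulation and all names are illustrative, not the patented method):

```python
import numpy as np

def reduce_expand_layer(x, W, pon_mask, pon_values):
    """One dynamically sparse layer: skip rows of W whose outputs are
    predictable, then expand the reduced result to full dimension.
    pon_mask[i] is True where neuron i is a predictable output neuron;
    pon_values supplies the predicted values for those positions."""
    active = ~pon_mask                 # neurons that must be computed
    y_reduced = W[active] @ x          # reduced matrix-vector product
    y_full = np.empty(pon_mask.shape[0])
    y_full[active] = y_reduced         # computed outputs
    y_full[pon_mask] = pon_values      # expansion via the PON map
    return y_full

x = np.array([1.0, 2.0])
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
pon_mask = np.array([False, True, False])   # neuron 1 is predictable
y = reduce_expand_layer(x, W, pon_mask, np.array([0.0]))
# y == [1.0, 0.0, 3.0]: rows 0 and 2 computed, position 1 filled by its PON
```

The saving comes from the matrix product shrinking from 3x2 to 2x2; across many layers the reduced vectors chain together as the abstract describes.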
-
Publication number: 20240045975. Abstract: The present disclosure describes a processor and a multi-core processor. The processor includes a processor core and a memory. The processor core includes a homomorphic encryption instruction execution module and a general-purpose instruction execution module; the homomorphic encryption instruction execution module is configured to perform homomorphic encryption operations and includes a plurality of instruction set architecture extension components, wherein the plurality of instruction set architecture extension components are respectively configured to perform a sub-operation related to the homomorphic encryption; the general-purpose instruction execution module is configured to perform non-homomorphic encryption operations. The memory is vertically stacked with the processor core and is used as a cache or scratchpad memory of the processor core. Type: Application. Filed: December 14, 2022. Publication date: February 8, 2024. Inventors: Shuangchen Li, Zhe Zhang, Linyong Huang, Dimin Niu, Xuanle Ren, Hongzhong Zheng
-
Patent number: 11886352. Abstract: This specification describes methods and systems for accelerating attribute data access for graph neural network (GNN) processing. An example method includes: receiving a root node identifier corresponding to a node in a graph for GNN processing; determining one or more candidate node identifiers according to the root node identifier, wherein attribute data corresponding to the one or more candidate node identifiers are sequentially stored in a memory; and sampling one or more graph node identifiers at least from the one or more candidate node identifiers for the GNN processing. Type: Grant. Filed: January 21, 2022. Date of Patent: January 30, 2024. Assignee: T-Head (Shanghai) Semiconductor Co., Ltd. Inventors: Heng Liu, Tianchan Guan, Shuangchen Li, Hongzhong Zheng
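The point of the method above is that sampling is restricted to candidates whose attributes sit contiguously in memory, so the resulting attribute reads are sequential. An illustrative sketch (the candidate table and its contents are hypothetical, not the patented derivation):

```python
import random

def sample_neighbors(root_id, candidate_table, num_samples, seed=None):
    """Sample neighbor IDs only from candidates whose attribute data
    is stored back-to-back, so the later attribute fetch is a
    sequential (DRAM-friendly) read rather than random access."""
    candidates = candidate_table[root_id]
    rng = random.Random(seed)
    k = min(num_samples, len(candidates))
    return sorted(rng.sample(candidates, k))  # sorted -> ascending addresses

# Hypothetical table: for root 7, IDs 100..103 are stored contiguously.
table = {7: [100, 101, 102, 103]}
picked = sample_neighbors(7, table, 2, seed=0)  # two of the contiguous IDs
```

Sorting the sample keeps the subsequent attribute reads in address order, which is where the acceleration claimed by the abstract comes from.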
-
Publication number: 20240028554. Abstract: A configurable processing unit including a core processing element and a plurality of assist processing elements can be coupled together by one or more networks. The core processing element can include a large processing logic, large non-volatile memory, input/output interfaces and multiple memory channels. The plurality of assist processing elements can each include smaller processing logic, smaller non-volatile memory and multiple memory channels. One or more bitstreams can be utilized to configure and reconfigure computation resources of the core processing element and memory management of the plurality of assist processing elements. Type: Application. Filed: September 18, 2020. Publication date: January 25, 2024. Inventors: Dimin Niu, Tianchan Guan, Hongzhong Zheng, Shuangchen Li
-
Publication number: 20240020194. Abstract: A system-in-package including a logic die and one or more memory dice can include a reliability, availability and serviceability (RAS) memory management unit (MMU) for memory error detection, memory error prediction and memory error handling. The RAS MMU can receive memory health information, on-die memory error information, system error information and read address information for the one or more memory dice. The RAS MMU can manage the memory blocks of the one or more memory dice based on the memory health information, on-die memory error type, system error type and read address. The RAS MMU can also further manage the memory blocks based on received on-die memory temperature information and/or system temperature information. Type: Application. Filed: November 4, 2020. Publication date: January 18, 2024. Inventors: Dimin Niu, Tianchan Guan, Hongzhong Zheng, Shuangchen Li
-
Publication number: 20240005127. Abstract: This application describes systems and methods for facilitating memory access for graph neural network (GNN) processing. An example system includes a plurality of processing units, each configured to perform graph neural network (GNN) processing; and a plurality of memory extension cards, each configured to store graph data for the GNN processing, wherein: each of the plurality of processing units is communicatively coupled with three other processing units via one or more interconnects respectively; the plurality of processing units are communicatively coupled with the plurality of memory extension cards respectively; and each of the plurality of memory extension cards includes graphic access engine circuitry configured to accelerate GNN memory access. Type: Application. Filed: November 28, 2022. Publication date: January 4, 2024. Inventors: Yijin Guan, Dimin Niu, Shengcheng Wang, Shuangchen Li, Hongzhong Zheng
-
Publication number: 20240005133. Abstract: This application describes a hardware and a software design for quantization in GNN computation. An exemplary method may include: receiving a graph comprising a plurality of nodes respectively represented by a plurality of feature vectors; segmenting the plurality of feature vectors into a plurality of sub-vectors and grouping the plurality of sub-vectors into a plurality of groups of sub-vectors; performing vector clustering on each of the plurality of groups of sub-vectors to generate a plurality of centroids as a codebook; encoding each of the plurality of feature vectors to obtain a plurality of index maps by quantizing sub-vectors within each feature vector based on the codebook, wherein each index map occupies a smaller storage space than the corresponding feature vector does; and storing the plurality of index maps as an assignment table instead of the plurality of feature vectors to represent the plurality of nodes for GNN computation. Type: Application. Filed: August 30, 2022. Publication date: January 4, 2024. Inventors: Linyong Huang, Zhe Zhang, Shuangchen Li, Hongzhong Zheng
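The segment/cluster/encode pipeline above is essentially product quantization. A hedged sketch, assuming a plain k-means as the "vector clustering" step (function names, the clustering choice, and parameters are illustrative, not the patented design):

```python
import numpy as np

def build_codebook_and_index(features, num_sub, num_centroids, iters=10):
    """Split each feature vector into num_sub sub-vectors, cluster each
    sub-vector group with a small k-means, and return a per-group
    codebook plus per-node index maps that replace the full vectors."""
    n, d = features.shape
    sub_dim = d // num_sub
    rng = np.random.default_rng(0)
    codebook = []  # one centroid table per sub-vector group
    index_maps = np.empty((n, num_sub), dtype=np.int32)
    for g in range(num_sub):
        sub = features[:, g * sub_dim:(g + 1) * sub_dim]
        cent = sub[rng.choice(n, num_centroids, replace=False)]
        for _ in range(iters):  # plain k-means on this group
            assign = np.argmin(((sub[:, None] - cent) ** 2).sum(-1), axis=1)
            for c in range(num_centroids):
                if (assign == c).any():
                    cent[c] = sub[assign == c].mean(axis=0)
        codebook.append(cent)
        index_maps[:, g] = assign  # num_sub small ints per node
    return codebook, index_maps

def decode(codebook, index_maps):
    """Reconstruct approximate feature vectors from the index maps."""
    return np.hstack([codebook[g][index_maps[:, g]]
                      for g in range(len(codebook))])

feats = np.arange(32, dtype=float).reshape(8, 4)  # 8 nodes, 4-dim features
codebook, index_maps = build_codebook_and_index(feats, num_sub=2, num_centroids=2)
approx = decode(codebook, index_maps)             # (8, 4) reconstruction
```

Each node is now represented by `num_sub` small integers instead of `d` floats, which is the storage reduction the abstract claims for the assignment table.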
-
Publication number: 20240005075. Abstract: This application describes systems and methods for facilitating memory access for graph neural network (GNN) processing. An example method includes fetching, by an access engine circuitry implemented on a circuitry board, a portion of structure data of a graph from one or more of a plurality of flash memory drives implemented on the circuitry board; performing node sampling using the fetched portion of the structure data of the graph to select one or more sampled nodes; fetching a portion of attribute data of the graph from two or more of the plurality of memory drives in parallel according to the selected one or more sampled nodes; sending the fetched portion of the attribute data of the graph to a host outside of the circuitry board; and performing, by the host, GNN processing for the graph using the fetched portion of the attribute data of the graph. Type: Application. Filed: November 30, 2022. Publication date: January 4, 2024. Inventors: Shuangchen Li, Dimin Niu, Hongzhong Zheng
-
Publication number: 20240004824. Abstract: This application describes systems and methods for facilitating memory access for graph neural network (GNN) processing. An example method includes fetching, by an access engine circuitry implemented on a circuitry board, a portion of structure data of a graph from a pinned memory in a host memory of a host via a first peripheral component interconnect express (PCIe) connection; performing node sampling using the fetched portion of the structure data of the graph to select one or more sampled nodes; fetching, by the access engine circuitry, a portion of attribute data of the graph from the pinned memory via the first PCIe connection; sending the fetched portion of the attribute data of the graph to one or more processors; and performing, by the one or more processors, GNN processing for the graph using the fetched portion of the attribute data of the graph. Type: Application. Filed: November 30, 2022. Publication date: January 4, 2024. Inventors: Shuangchen Li, Dimin Niu, Hongzhong Zheng, Zhe Zhang, Yuhao Wang
-
Patent number: 11847049. Abstract: The total memory space that is logically available to a processor in a general-purpose graphics processing unit (GPGPU) module is increased to accommodate terabyte-sized amounts of data by utilizing the memory space in an external memory module, and by further utilizing a portion of the memory space in a number of other external memory modules. Type: Grant. Filed: January 21, 2022. Date of Patent: December 19, 2023. Assignee: Alibaba Damo (Hangzhou) Technology Co., Ltd. Inventors: Yuhao Wang, Dimin Niu, Yijin Guan, Shengcheng Wang, Shuangchen Li, Hongzhong Zheng
-
Patent number: 11841799. Abstract: This application describes a hardware accelerator, a computer system, and a method for accelerating Graph Neural Network (GNN) node attribute fetching. The hardware accelerator comprises a GNN attribute processor and a first memory, wherein the GNN attribute processor is configured to: receive a graph node identifier; determine a target memory address within the first memory based on the graph node identifier; determine, based on the received graph node identifier, whether attribute data corresponding to the received graph node identifier is cached in the first memory at the target memory address; and in response to determining that the attribute data is not cached in the first memory: fetch the attribute data from a second memory, and write the fetched attribute data into the first memory at the target memory address. Type: Grant. Filed: January 21, 2022. Date of Patent: December 12, 2023. Assignee: T-Head (Shanghai) Semiconductor Co., Ltd. Inventors: Tianchan Guan, Heng Liu, Shuangchen Li, Hongzhong Zheng
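The lookup/miss/fill flow above behaves like a direct-mapped cache keyed by node ID: derive a slot from the identifier, check for a hit, and on a miss fetch from the larger second memory and fill the slot. A minimal sketch under that reading (the modulo addressing and all names are illustrative, not the patented mapping):

```python
class AttributeCache:
    """Direct-mapped attribute cache: a fast 'first memory' of fixed
    slots backed by a larger 'second memory' (here a plain dict)."""
    def __init__(self, num_slots, backing_store):
        self.num_slots = num_slots
        self.backing = backing_store        # second memory
        self.slots = [None] * num_slots     # (node_id, attributes) or None

    def fetch(self, node_id):
        slot = node_id % self.num_slots     # target memory address
        entry = self.slots[slot]
        if entry is not None and entry[0] == node_id:
            return entry[1]                 # hit: serve from first memory
        attrs = self.backing[node_id]       # miss: fetch from second memory
        self.slots[slot] = (node_id, attrs) # write into first memory
        return attrs

store = {0: "attrs-of-node-0", 4: "attrs-of-node-4"}
cache = AttributeCache(4, store)
# cache.fetch(0) → "attrs-of-node-0" (miss, then filled)
# cache.fetch(4) maps to the same slot and evicts node 0's entry
```

Checking the stored node ID against the requested one is the step the abstract phrases as determining "whether attribute data corresponding to the received graph node identifier is cached ... at the target memory address".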
-
Publication number: 20230393851. Abstract: A number of domain specific accelerators (DSA1-DSAn) are integrated into a conventional processing system (100) to operate on the same chip by adding additional instructions to a conventional instruction set architecture (ISA), and further adding an accelerator interface unit (130) to the processing system (100) to respond to the additional instructions and interact with the DSAs. Type: Application. Filed: June 20, 2023. Publication date: December 7, 2023. Inventors: Yuhao Wang, Zhaoyang Du, Yen-kuang Chen, Wei Han, Shuangchen Li, Fei Xue, Hongzhong Zheng