Patents by Inventor Shuangchen Li

Shuangchen Li has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11954093
    Abstract: Embodiments of the disclosure provide devices and methods for performing a top-k function. The device can include: a memory comprising a plurality of register files for storing the data elements, the plurality of register files comprising a parent register file and a first child register file associated with the parent register file, wherein the parent register file is associated with: first interface circuitry configured for reading a first parent data element from the parent register file and receiving a first child data element and a second child data element from the first child register file; and first comparison circuitry configured for updating the parent register file and the first child register file based on the first parent data element, the first child data element, and the second child data element according to a given principle.
    Type: Grant
    Filed: June 4, 2020
    Date of Patent: April 9, 2024
    Assignee: Alibaba Group Holding Limited
    Inventors: Fei Sun, Shuangchen Li, Dimin Niu, Fei Xue, Yuanwei Fang
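
The device in patent 11954093 above is essentially a hardwired heap: comparison circuitry repeatedly orders a parent register file against its two child elements. Below is a minimal Python sketch of the same top-k behavior, with a software size-k min-heap standing in for the register files; all names are illustrative and not taken from the patent.

```python
import heapq

def top_k(stream, k):
    """Keep the k largest elements of a stream with a min-heap.

    Each heap sift compares a parent slot against its two child
    slots and swaps by a fixed rule, mirroring the parent/child
    register files and comparison circuitry in the abstract.
    """
    heap = []
    for x in stream:
        if len(heap) < k:
            heapq.heappush(heap, x)      # fill the register files
        elif x > heap[0]:
            heapq.heapreplace(heap, x)   # evict the current minimum
    return sorted(heap, reverse=True)

print(top_k([7, 3, 9, 1, 12, 5, 8], k=3))   # [12, 9, 8]
```
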
  • Publication number: 20240104360
    Abstract: Near memory processing systems for graph neural network processing can include a central core coupled to one or more memory units. The memory units can include one or more controllers and a plurality of memory devices. The system can be configured for offloading aggregation, combination, and the like operations from the central core to the controllers of the one or more memory units. The central core can sample the graph neural network and schedule memory accesses for execution by the one or more memory units. The central core can also schedule aggregation, combination or the like operations associated with one or more memory accesses for execution by the controller. The controller can access data in accordance with the data access requests from the central core. One or more computation units of the controller can also execute the aggregation, combination or the like operations associated with one or more memory accesses.
    Type: Application
    Filed: December 2, 2020
    Publication date: March 28, 2024
    Inventors: Tianchan GUAN, Dimin NIU, Hongzhong ZHENG, Shuangchen LI
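
The point of publication 20240104360 above is to move reduction work to where the data lives: the central core samples and schedules, while the memory-side controller executes the aggregation. A toy Python model under those assumptions; the MemoryController class and sum-aggregation are hypothetical stand-ins, not the patented design.

```python
import numpy as np

class MemoryController:
    """Hypothetical near-memory controller: serves reads and runs
    offloaded aggregation beside its own memory devices."""
    def __init__(self, features):
        self.features = features               # node id -> feature row

    def aggregate(self, node_ids):
        # Sum-aggregation executed by the controller's compute unit,
        # so one reduced vector (not N rows) returns to the core.
        return self.features[node_ids].sum(axis=0)

# Central core: sample the graph, then offload the aggregation.
features = np.random.rand(100, 16)
ctrl = MemoryController(features)
neighbors = [3, 17, 42, 58]                    # sampled by the core
h = ctrl.aggregate(neighbors)                  # executed near memory
print(h.shape)                                 # (16,)
```
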
  • Publication number: 20240095179
    Abstract: A memory management method of a data processing system is provided. The memory management method includes: creating a first memory zone and a second memory zone related to a first node of a first server, wherein the first server is located in the data processing system, and the first node includes a processor and a first memory; mapping the first memory zone to the first memory; and mapping the second memory zone to a second memory of a second server, wherein the second server is located in the data processing system, and the processor is configured to access the second memory of the second server through an interface circuit of the first server and through an interface circuit of the second server.
    Type: Application
    Filed: December 13, 2022
    Publication date: March 21, 2024
    Inventors: DIMIN NIU, YIJIN GUAN, TIANCHAN GUAN, SHUANGCHEN LI, HONGZHONG ZHENG
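
A rough sketch of the two-zone mapping that publication 20240095179 above describes, with a made-up zone table routing one address range to local memory and another to a remote server's memory; addresses, sizes, and names are invented for illustration.

```python
# Invented zone table: zone 1 backs onto the node's local memory,
# zone 2 onto a remote server's memory reached via the interface
# circuits of both servers.
ZONES = [
    {"name": "zone1", "base": 0x0000_0000, "size": 0x4000_0000,
     "backing": ("local", "server1/mem0")},
    {"name": "zone2", "base": 0x4000_0000, "size": 0x4000_0000,
     "backing": ("remote", "server2/mem0")},
]

def resolve(addr):
    """Translate a processor address to its backing memory."""
    for z in ZONES:
        if z["base"] <= addr < z["base"] + z["size"]:
            kind, target = z["backing"]
            return kind, target, addr - z["base"]
    raise ValueError("unmapped address")

print(resolve(0x4000_1000))   # ('remote', 'server2/mem0', 4096)
```
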
  • Publication number: 20240094922
    Abstract: A data processing system includes a first server and a second server. The first server includes a first processor group, a first memory space and a first interface circuit. The second server includes a second processor group, a second memory space and a second interface circuit. The first memory space and the second memory space are allocated to the first processor group. The first processor group is configured to perform memory error detection to generate an error log corresponding to a memory error. When the memory error occurs in the second memory space, the first interface circuit is configured to send the error log to the second interface circuit, and the second processor group is configured to log the memory error according to the error log received by the second interface circuit. The data processing system is thus capable of realizing a memory reliability architecture that supports operations across different servers.
    Type: Application
    Filed: December 12, 2022
    Publication date: March 21, 2024
    Inventors: DIMIN NIU, TIANCHAN GUAN, YIJIN GUAN, SHUANGCHEN LI, HONGZHONG ZHENG
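
A toy model of the cross-server flow in publication 20240094922 above: the server that detects the error may not own the failing address, in which case the error log is handed to the owning server to record. Classes and the address split here are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class ErrorLog:
    address: int
    kind: str                      # e.g. "correctable-ECC"

class Server:
    def __init__(self, name, lo, hi):
        self.name, self.lo, self.hi = name, lo, hi
        self.logged = []           # this server's error records

    def owns(self, addr):
        return self.lo <= addr < self.hi

def report(servers, err):
    """Whichever server owns the failing address records the log,
    even when a different server detected the error (the hand-off
    models the interface circuits in the abstract)."""
    owner = next(s for s in servers if s.owns(err.address))
    owner.logged.append(err)
    return owner.name

s1 = Server("server1", 0, 2**30)
s2 = Server("server2", 2**30, 2**31)
print(report([s1, s2], ErrorLog(2**30 + 64, "correctable-ECC")))
```
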
  • Patent number: 11934669
    Abstract: A processor includes a plurality of memory units, each of the memory units including a plurality of memory cells, wherein each of the memory units is configurable to operate as memory, as a computation unit, or as a hybrid memory-computation unit.
    Type: Grant
    Filed: July 29, 2020
    Date of Patent: March 19, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Dimin Niu, Shuangchen Li, Bob Brennan, Krishna T. Malladi, Hongzhong Zheng
  • Publication number: 20240078036
    Abstract: A memory module can include a hybrid media controller coupled to a volatile memory, a non-volatile memory, a non-volatile memory buffer and a set of memory mapped input/output (MMIO) registers. The hybrid media controller can be configured for reading and writing data to a volatile memory of a memory mapped space of a memory module. The hybrid media controller can also be configured for reading and writing bulk data to a non-volatile memory of the memory mapped space. The hybrid media controller can also be configured for reading and writing data of a random-access granularity to the non-volatile memory of the memory mapped space. The hybrid media controller can also be configured for self-indexed movement of data between the non-volatile memory and the volatile memory of the memory module.
    Type: Application
    Filed: December 24, 2020
    Publication date: March 7, 2024
    Inventors: Dimin NIU, Tianchan GUAN, Hongzhong ZHENG, Shuangchen LI
  • Publication number: 20240063200
    Abstract: The present disclosure relates to a hybrid bonding based integrated circuit (HBIC) device and its manufacturing method. In some embodiments, an exemplary HBIC device includes: a first die stack comprising one or more dies; and a second die stack integrated above the first die stack. The second die stack includes at least two memory dies communicatively connected to the first die stack by wire bonding.
    Type: Application
    Filed: February 6, 2020
    Publication date: February 22, 2024
    Inventors: Dimin NIU, Shuangchen LI, Tianchan GUAN, Hongzhong ZHENG
  • Publication number: 20240054096
    Abstract: The present disclosure discloses a processor. The processor is used to perform parallel computation and includes a logic die and a memory die. The logic die includes a plurality of processor cores and a plurality of networks on chip, wherein each processor core is programmable. The plurality of networks on chip are correspondingly connected to the plurality of processor cores, so that the plurality of processor cores form a two-dimensional mesh network. The memory die and the processor core are stacked vertically, wherein the memory die includes a plurality of memory tiles, and when the processor performs the parallel computation, the plurality of memory tiles do not have cache coherency; wherein, the plurality of memory tiles correspond to the plurality of processor cores in a one-to-one or one-to-many manner.
    Type: Application
    Filed: December 12, 2022
    Publication date: February 15, 2024
    Inventors: SHUANGCHEN LI, ZHE ZHANG, DIMIN NIU, HONGZHONG ZHENG
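
A small sketch of the layout that publication 20240054096 above describes: a 2D mesh of programmable cores, each owning the memory tile stacked above it, with no cache coherency, so remote tiles cost explicit network hops. The mesh size and routing rule are assumptions, not taken from the publication.

```python
# Assumed 4x4 mesh of programmable cores; each core (x, y) owns the
# memory tile stacked directly above it. With no cache coherency,
# remote tiles are reached by explicit network-on-chip messages.
MESH = 4

def owner(tile_x, tile_y):
    """One-to-one tile-to-core mapping from the abstract."""
    assert 0 <= tile_x < MESH and 0 <= tile_y < MESH
    return (tile_x, tile_y)

def xy_hops(src, dst):
    """Dimension-ordered (X-then-Y) routing distance on the mesh."""
    return abs(dst[0] - src[0]) + abs(dst[1] - src[1])

print(xy_hops(owner(0, 0), owner(3, 2)))   # 5 hops
```
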
  • Patent number: 11900239
    Abstract: Systems and methods for dynamically executing sparse neural networks are provided. In one implementation, a system for providing dynamic sparsity in a neural network may include at least one memory storing instructions and at least one processor configured to execute the instructions to: reduce an input vector and a set of weights of the neural network; execute an input layer of the neural network using the reduced input vector and set of weights to generate a reduced output vector; expand the reduced output vector to a full output vector using first predictable output neurons (PONs); using a PON map, reduce a dimension of the full output vector; execute subsequent layers of the neural network using the reduced full output vector to produce a second reduced output vector; and expand the second reduced output vector to a second full output vector using second PONs.
    Type: Grant
    Filed: September 5, 2019
    Date of Patent: February 13, 2024
    Assignee: Alibaba Group Holding Limited
    Inventors: Zhenyu Gu, Liu Liu, Shuangchen Li, Yuan Xie
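
A minimal numpy sketch of the reduce-execute-expand cycle that patent 11900239 above describes, assuming a boolean mask marks the predictable output neurons (PONs) whose values are filled in without computation. The mask, shapes, and zero-valued predictions are invented for illustration.

```python
import numpy as np

def run_sparse_layer(x, W, pon_mask, pon_values):
    """One reduce-execute-expand step: compute only the rows whose
    outputs are not predictable, then expand back to full width by
    filling the predictable output neurons from pon_values."""
    active = ~pon_mask
    y_reduced = W[active] @ x        # execute the reduced layer
    y = np.empty(W.shape[0])
    y[active] = y_reduced            # computed outputs
    y[pon_mask] = pon_values         # expansion via the PON map
    return y

W = np.random.rand(6, 4)
x = np.random.rand(4)
mask = np.array([False, True, False, True, False, False])
print(run_sparse_layer(x, W, mask, np.zeros(mask.sum())))
```
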
  • Publication number: 20240045975
    Abstract: The present disclosure discloses a processor and a multi-core processor. The processor includes a processor core and a memory. The processor core includes a homomorphic encryption instruction execution module and a general-purpose instruction execution module; the homomorphic encryption instruction execution module is configured to perform homomorphic encryption operations and includes a plurality of instruction set architecture extension components, wherein the plurality of instruction set architecture extension components are respectively configured to perform a sub-operation related to the homomorphic encryption; the general-purpose instruction execution module is configured to perform non-homomorphic encryption operations. The memory is vertically stacked with the processor core and is used as a cache or scratchpad memory of the processor core.
    Type: Application
    Filed: December 14, 2022
    Publication date: February 8, 2024
    Inventors: SHUANGCHEN LI, ZHE ZHANG, LINYONG HUANG, DIMIN NIU, XUANLE REN, HONGZHONG ZHENG
  • Patent number: 11886352
    Abstract: This specification describes methods and systems for accelerating attribute data access for graph neural network (GNN) processing. An example method includes: receiving a root node identifier corresponding to a node in a graph for GNN processing; determining one or more candidate node identifiers according to the root node identifier, wherein attribute data corresponding to the one or more candidate node identifiers are sequentially stored in a memory; and sampling one or more graph node identifiers at least from the one or more candidate node identifiers for the GNN processing.
    Type: Grant
    Filed: January 21, 2022
    Date of Patent: January 30, 2024
    Assignee: T-Head (Shanghai) Semiconductor Co., Ltd.
    Inventors: Heng Liu, Tianchan Guan, Shuangchen Li, Hongzhong Zheng
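
The point of patent 11886352 above is that sampling from candidate nodes whose attributes sit contiguously turns random reads into streaming ones. A small sketch under that assumption; the candidate table and layout are hypothetical.

```python
import random

# Hypothetical layout: candidate neighbor ids for each root are the
# ones whose attribute rows were stored back-to-back in memory.
candidates = {
    0: [10, 11, 12, 13],   # attributes of 10..13 are one contiguous run
}

def sample_neighbors(root, fanout):
    """Sample only from sequentially stored candidates, so the
    attribute fetch that follows is one streaming read."""
    cand = candidates.get(root, [])
    return random.sample(cand, min(fanout, len(cand)))

print(sample_neighbors(0, 2))   # e.g. [11, 13]
```
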
  • Publication number: 20240028554
    Abstract: A configurable processing unit including a core processing element and a plurality of assist processing elements can be coupled together by one or more networks. The core processing element can include a large processing logic, large non-volatile memory, input/output interfaces and multiple memory channels. The plurality of assist processing elements can each include smaller processing logic, smaller non-volatile memory and multiple memory channels. One or more bitstreams can be utilized to configure and reconfigure computation resources of the core processing element and memory management of the plurality of assist processing elements.
    Type: Application
    Filed: September 18, 2020
    Publication date: January 25, 2024
    Inventors: Dimin NIU, Tianchan GUAN, Hongzhong ZHENG, Shuangchen LI
  • Publication number: 20240020194
    Abstract: A system-in-package including a logic die and one or more memory dice can include a reliability availability serviceability (RAS) memory management unit (MMU) for memory error detection, memory error prediction and memory error handling. The RAS MMU can receive memory health information, on-die memory error information, system error information and read address information for the one or more memory dice. The RAS MMU can manage the memory blocks of the one or more memory dice based on the memory health information, on-die memory error type, system error type and read address. The RAS MMU can also further manage the memory blocks based on received on-die memory temperature information and/or system temperature information.
    Type: Application
    Filed: November 4, 2020
    Publication date: January 18, 2024
    Inventors: Dimin NIU, Tianchan GUAN, Hongzhong ZHENG, Shuangchen LI
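
A toy policy loop in the spirit of the RAS MMU in publication 20240020194 above: fold error type, health, and temperature signals into a per-block score and retire a memory block once it crosses a threshold. The publication does not give a formula; the weights and threshold below are invented.

```python
class RasMmu:
    """Toy retirement policy: fold error type and temperature into
    a per-block score and retire the block past a threshold."""
    RETIRE_AT = 8   # invented threshold

    def __init__(self):
        self.score = {}          # block id -> accumulated score
        self.retired = set()     # blocks no longer handed out

    def report(self, block, kind, hot=False):
        weight = {"correctable": 1, "uncorrectable": 8}[kind]
        if hot:
            weight *= 2          # high temperature raises urgency
        self.score[block] = self.score.get(block, 0) + weight
        if self.score[block] >= self.RETIRE_AT:
            self.retired.add(block)

mmu = RasMmu()
mmu.report(3, "correctable", hot=True)
mmu.report(3, "uncorrectable")
print(mmu.retired)   # {3}
```
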
  • Publication number: 20240005127
    Abstract: This application describes systems and methods for facilitating memory access for graph neural network (GNN) processing. An example system includes a plurality of processing units, each configured to perform graph neural network (GNN) processing; and a plurality of memory extension cards, each configured to store graph data for the GNN processing, wherein: each of the plurality of processing units is communicatively coupled with three other processing units via one or more interconnects respectively; the plurality of processing units are communicatively coupled with the plurality of memory extension cards respectively; and each of the plurality of memory extension cards includes a graph access engine circuitry configured to accelerate GNN memory access.
    Type: Application
    Filed: November 28, 2022
    Publication date: January 4, 2024
    Inventors: Yijin GUAN, Dimin NIU, Shengcheng WANG, Shuangchen LI, Hongzhong ZHENG
  • Publication number: 20240005133
    Abstract: This application describes a hardware and software design for quantization in GNN computation. An exemplary method may include: receiving a graph comprising a plurality of nodes respectively represented by a plurality of feature vectors; segmenting the plurality of feature vectors into a plurality of sub-vectors and grouping the plurality of sub-vectors into a plurality of groups of sub-vectors; performing vector clustering on each of the plurality of groups of sub-vectors to generate a plurality of centroids as a codebook; encoding each of the plurality of feature vectors to obtain a plurality of index maps by quantizing sub-vectors within the each feature vector based on the codebook, wherein each index map occupies a smaller storage space than each feature vector does; and storing the plurality of index maps as an assignment table instead of the plurality of feature vectors to represent the plurality of nodes for GNN computation.
    Type: Application
    Filed: August 30, 2022
    Publication date: January 4, 2024
    Inventors: Linyong HUANG, Zhe ZHANG, Shuangchen LI, Hongzhong ZHENG
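
Publication 20240005133 above describes what is essentially product quantization: split each feature vector into sub-vectors, cluster each group to get a codebook of centroids, and store per-node index maps instead of the full vectors. A short sketch of that flow; the sizes and the use of scipy's k-means are choices made for illustration, not the publication's design.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)
feats = rng.random((1000, 64)).astype(np.float32)   # node features
m, k = 4, 16                                        # groups, centroids

codebook, index_maps = [], []
for group in np.split(feats, m, axis=1):            # 4 sub-vector groups
    centroids, labels = kmeans2(group, k, minit="points", seed=0)
    codebook.append(centroids)                      # k x 16 floats each
    index_maps.append(labels.astype(np.uint8))      # one byte per node

# 64 floats per node shrink to m = 4 one-byte indices per node;
# decoding looks each index back up in the codebook.
approx = np.hstack([codebook[i][index_maps[i]] for i in range(m)])
print(approx.shape)   # (1000, 64)
```
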
  • Publication number: 20240005075
    Abstract: This application describes systems and methods for facilitating memory access for graph neural network (GNN) processing. An example method includes fetching, by an access engine circuitry implemented on a circuit board, a portion of structure data of a graph from one or more of a plurality of flash memory drives implemented on the circuit board; performing node sampling using the fetched portion of the structure data of the graph to select one or more sampled nodes; fetching a portion of attribute data of the graph from two or more of the plurality of memory drives in parallel according to the selected one or more sampled nodes; sending the fetched portion of the attribute data of the graph to a host outside of the circuit board; and performing, by the host, GNN processing for the graph using the fetched portion of the attribute data of the graph.
    Type: Application
    Filed: November 30, 2022
    Publication date: January 4, 2024
    Inventors: Shuangchen LI, Dimin NIU, Hongzhong ZHENG
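
A toy model of the parallel attribute fetch in publication 20240005075 above, with Python dicts standing in for the on-board flash drives and a thread pool standing in for per-drive parallelism; the striping rule is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

DRIVES = [dict() for _ in range(4)]   # stand-ins for on-board flash

def place(node, vec):
    DRIVES[node % len(DRIVES)][node] = vec        # simple striping

def fetch_attrs(nodes):
    """Group requested nodes by drive and read the drives in
    parallel, then gather the rows for the host."""
    by_drive = {}
    for n in nodes:
        by_drive.setdefault(n % len(DRIVES), []).append(n)

    def read(item):
        drive, ids = item
        return {n: DRIVES[drive][n] for n in ids}

    out = {}
    with ThreadPoolExecutor() as pool:
        for part in pool.map(read, by_drive.items()):
            out.update(part)
    return out

for n in range(16):
    place(n, [float(n)])
print(fetch_attrs([1, 2, 5, 6]))   # {1: [1.0], 2: [2.0], 5: [5.0], 6: [6.0]}
```
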
  • Publication number: 20240004824
    Abstract: This application describes systems and methods for facilitating memory access for graph neural network (GNN) processing. An example method includes fetching, by an access engine circuitry implemented on a circuit board, a portion of structure data of a graph from a pinned memory in a host memory of a host via a first peripheral component interconnect express (PCIe) connection; performing node sampling using the fetched portion of the structure data of the graph to select one or more sampled nodes; fetching, by the access engine circuitry, a portion of attribute data of the graph from the pinned memory via the first PCIe connection; sending the fetched portion of the attribute data of the graph to one or more processors; and performing, by the one or more processors, GNN processing for the graph using the fetched portion of the attribute data of the graph.
    Type: Application
    Filed: November 30, 2022
    Publication date: January 4, 2024
    Inventors: Shuangchen LI, Dimin NIU, Hongzhong ZHENG, Zhe ZHANG, Yuhao WANG
  • Patent number: 11847049
    Abstract: The total memory space that is logically available to a processor in a general-purpose graphics processing unit (GPGPU) module is increased to accommodate terabyte-sized amounts of data by utilizing the memory space in an external memory module, and by further utilizing a portion of the memory space in a number of other external memory modules.
    Type: Grant
    Filed: January 21, 2022
    Date of Patent: December 19, 2023
    Assignee: Alibaba Damo (Hangzhou) Technology Co., Ltd
    Inventors: Yuhao Wang, Dimin Niu, Yijin Guan, Shengcheng Wang, Shuangchen Li, Hongzhong Zheng
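
A sketch of the address-space arithmetic implied by patent 11847049 above: the GPGPU's logical space is the concatenation of its own external module plus borrowed slices of peer modules, so a flat address resolves to a (module, offset) pair. Segment names and sizes are made up.

```python
GIB = 1 << 30

# Invented segment map: the GPGPU's own external module first, then
# borrowed slices of its peers' modules.
SEGMENTS = [
    ("local-module", 64 * GIB),
    ("peer-module-1", 16 * GIB),
    ("peer-module-2", 16 * GIB),
]

def locate(addr):
    """Resolve a flat logical address to (module, offset)."""
    base = 0
    for name, size in SEGMENTS:
        if addr < base + size:
            return name, addr - base
        base += size
    raise ValueError("beyond the logical space")

total = sum(size for _, size in SEGMENTS)
print(total // GIB, "GiB logical;", locate(70 * GIB))
```
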
  • Patent number: 11841799
    Abstract: This application describes a hardware accelerator, a computer system, and a method for accelerating Graph Neural Network (GNN) node attribute fetching. The hardware accelerator comprises a GNN attribute processor; and a first memory, wherein the GNN attribute processor is configured to: receive a graph node identifier; determine a target memory address within the first memory based on the graph node identifier; determine, based on the received graph node identifier, whether attribute data corresponding to the received graph node identifier is cached in the first memory at the target memory address; and in response to determining that the attribute data is not cached in the first memory: fetch the attribute data from a second memory, and write the fetched attribute data into the first memory at the target memory address.
    Type: Grant
    Filed: January 21, 2022
    Date of Patent: December 12, 2023
    Assignee: T-Head (Shanghai) Semiconductor Co., Ltd.
    Inventors: Tianchan Guan, Heng Liu, Shuangchen Li, Hongzhong Zheng
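
The abstract of patent 11841799 above reads like a direct-mapped cache: a node id maps to a fixed target address in the first memory, a tag check decides hit or miss, and a miss fetches from the second memory and installs in place. A minimal sketch under that reading; the modulo placement is an assumption.

```python
class AttributeCache:
    """Toy direct-mapped attribute cache: a node id maps to one
    fixed slot (the target memory address); a tag miss fetches the
    row from the larger second memory and installs it in place."""
    def __init__(self, slots, backing):
        self.slots = [None] * slots    # (node_id, attrs) per slot
        self.backing = backing         # the second memory

    def get(self, node_id):
        i = node_id % len(self.slots)  # target address (assumption)
        entry = self.slots[i]
        if entry is not None and entry[0] == node_id:
            return entry[1]            # hit in the first memory
        attrs = self.backing[node_id]  # miss: fetch from second memory
        self.slots[i] = (node_id, attrs)
        return attrs

backing = {n: [n * 0.5] for n in range(100)}
cache = AttributeCache(8, backing)
print(cache.get(42), cache.get(42))    # miss then hit: [21.0] [21.0]
```
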
  • Publication number: 20230393851
    Abstract: A number of domain specific accelerators (DSA1-DSAn) are integrated into a conventional processing system (100) to operate on the same chip by adding additional instructions to a conventional instruction set architecture (ISA), and further adding an accelerator interface unit (130) to the processing system (100) to respond to the additional instructions and interact with the DSAs.
    Type: Application
    Filed: June 20, 2023
    Publication date: December 7, 2023
    Inventors: Yuhao WANG, Zhaoyang DU, Yen-kuang CHEN, Wei HAN, Shuangchen LI, Fei XUE, Hongzhong ZHENG