Patents by Inventor Dimin Niu

Dimin Niu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11954093
    Abstract: Embodiments of the disclosure provide devices and methods for performing a top-k function. The device can include: a memory comprising a plurality of register files for storing the data elements, the plurality of register files comprising a parent register file and a first child register file associated with the parent register file, wherein the parent register file is associated with: first interface circuitry configured for reading a first parent data element from the parent register file and receiving a first child data element and a second child data element from the first child register file; and first comparison circuitry configured for updating the parent register file and the first child register file based on the first parent data element, the first child data element, and the second child data element according to a given principle.
    Type: Grant
    Filed: June 4, 2020
    Date of Patent: April 9, 2024
    Assignee: Alibaba Group Holding Limited
    Inventors: Fei Sun, Shuangchen Li, Dimin Niu, Fei Xue, Yuanwei Fang
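The parent/child register-file arrangement the abstract describes is, in software terms, a hardware heap: the comparison circuitry keeps each parent ordered with respect to its children so the extreme element is always at the root. A minimal software analogue of the top-k operation (illustrative only; the function and variable names are not from the patent):

```python
import heapq

def top_k(stream, k):
    """Keep the k largest elements seen so far using a size-k min-heap.

    The heap root (smallest of the k kept elements) plays the role of the
    parent register file: each incoming element is compared against it,
    and the heap is updated only when the newcomer is larger.
    """
    heap = []  # min-heap of at most k elements
    for x in stream:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            heapq.heapreplace(heap, x)  # evict the current minimum
    return sorted(heap, reverse=True)

# top_k([5, 1, 9, 3, 7, 8], 3) → [9, 8, 7]
```

The hardware version parallelizes these compare-and-swap steps across register files; the "given principle" in the abstract corresponds to the heap-ordering rule applied by the comparison circuitry.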
  • Publication number: 20240104360
    Abstract: Near memory processing systems for graph neural network processing can include a central core coupled to one or more memory units. The memory units can include one or more controllers and a plurality of memory devices. The system can be configured for offloading aggregation, combination and the like operations from the central core to the controllers of the one or more memory units. The central core can sample the graph neural network and schedule memory accesses for execution by the one or more memory units. The central core can also schedule aggregation, combination or the like operations associated with one or more memory accesses for execution by the controller. The controller can access data in accordance with the data access requests from the central core. One or more computation units of the controller can also execute the aggregation, combination or the like operations associated with one or more memory accesses.
    Type: Application
    Filed: December 2, 2020
    Publication date: March 28, 2024
    Inventors: Tianchan GUAN, Dimin NIU, Hongzhong ZHENG, Shuangchen LI
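The offloading idea can be modeled in a few lines: the memory-unit controller gathers the sampled neighbors' feature rows and reduces them near the data, so only the aggregated vector crosses back to the central core. A hedged software sketch (names are illustrative, not from the patent):

```python
def aggregate_neighbors(features, neighbor_ids):
    """Element-wise sum over sampled neighbor feature vectors.

    features: dict mapping node id -> feature vector (list of floats).
    In the patented system this reduction runs on the memory-unit
    controller's computation units rather than on the central core.
    """
    rows = [features[i] for i in neighbor_ids]   # memory accesses
    return [sum(col) for col in zip(*rows)]      # near-memory aggregation

feats = {0: [1.0, 2.0], 1: [3.0, 4.0], 2: [5.0, 6.0]}
# aggregate_neighbors(feats, [0, 2]) → [6.0, 8.0]
```

The bandwidth saving comes from returning one aggregated vector instead of every sampled neighbor's row.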
  • Patent number: 11940922
    Abstract: A method of processing in-memory commands in a high-bandwidth memory (HBM) system includes sending a function-in-HBM (FIM) instruction to the HBM by an HBM memory controller of a GPU. A logic component of the HBM receives the FIM instruction and coordinates its execution using the controller, an ALU, and an SRAM located on the logic component.
    Type: Grant
    Filed: December 14, 2022
    Date of Patent: March 26, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Mu-Tien Chang, Krishna T. Malladi, Dimin Niu, Hongzhong Zheng
  • Publication number: 20240094922
    Abstract: A data processing system includes a first server and a second server. The first server includes a first processor group, a first memory space and a first interface circuit. The second server includes a second processor group, a second memory space and a second interface circuit. The first memory space and the second memory space are allocated to the first processor group. The first processor group is configured to perform memory error detection to generate an error log corresponding to a memory error. When the memory error occurs in the second memory space, the first interface circuit is configured to send the error log to the second interface circuit, and the second processor group is configured to log the memory error according to the error log received by the second interface circuit. The data processing system is capable of realizing memory reliability architecture supporting operations across different servers.
    Type: Application
    Filed: December 12, 2022
    Publication date: March 21, 2024
    Inventors: DIMIN NIU, TIANCHAN GUAN, YIJIN GUAN, SHUANGCHEN LI, HONGZHONG ZHENG
  • Publication number: 20240095179
    Abstract: A memory management method of a data processing system is provided. The memory management method includes: creating a first memory zone and a second memory zone related to a first node of a first server, wherein the first server is located in the data processing system, and the first node includes a processor and a first memory; mapping the first memory zone to the first memory; and mapping the second memory zone to a second memory of a second server, wherein the second server is located in the data processing system, and the processor is configured to access the second memory of the second server through an interface circuit of the first server and through an interface circuit of the second server.
    Type: Application
    Filed: December 13, 2022
    Publication date: March 21, 2024
    Inventors: DIMIN NIU, YIJIN GUAN, TIANCHAN GUAN, SHUANGCHEN LI, HONGZHONG ZHENG
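The two-zone mapping in this abstract amounts to splitting a node's flat address space so that one range resolves locally and the other is routed through the servers' interface circuits to remote memory. A minimal sketch of that address routing, under assumed zone sizes (all names and constants are illustrative):

```python
LOCAL_ZONE_SIZE = 1 << 30  # assumed 1 GiB first memory zone

def route(addr):
    """Map a flat physical address to a (target, offset) pair.

    Addresses below LOCAL_ZONE_SIZE fall in the first memory zone
    (the node's own memory); the rest fall in the second zone, which
    is backed by a second server's memory via the interface circuits.
    """
    if addr < LOCAL_ZONE_SIZE:
        return ("local_memory", addr)
    return ("remote_server", addr - LOCAL_ZONE_SIZE)

# route(100)                 → ("local_memory", 100)
# route(LOCAL_ZONE_SIZE + 4) → ("remote_server", 4)
```

The point of the method is that the processor issues ordinary accesses against one contiguous space; the zone-to-backing-memory mapping decides which go over the inter-server interface.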
  • Patent number: 11934669
    Abstract: A processor includes a plurality of memory units, each of the memory units including a plurality of memory cells, wherein each of the memory units is configurable to operate as memory, as a computation unit, or as a hybrid memory-computation unit.
    Type: Grant
    Filed: July 29, 2020
    Date of Patent: March 19, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Dimin Niu, Shuangchen Li, Bob Brennan, Krishna T. Malladi, Hongzhong Zheng
  • Publication number: 20240078036
    Abstract: A memory module can include a hybrid media controller coupled to a volatile memory, a non-volatile memory, a non-volatile memory buffer and a set of memory mapped input/output (MMIO) registers. The hybrid media controller can be configured for reading and writing data to a volatile memory of a memory mapped space of a memory module. The hybrid media controller can also be configured for reading and writing bulk data to a non-volatile memory of the memory mapped space. The hybrid media controller can also be configured for reading and writing data of a random-access granularity to the non-volatile memory of the memory mapped space. The hybrid media controller can also be configured for self-indexed movement of data between the non-volatile memory and the volatile memory of the memory module.
    Type: Application
    Filed: December 24, 2020
    Publication date: March 7, 2024
    Inventors: Dimin NIU, Tianchan GUAN, Hongzhong ZHENG, Shuangchen LI
  • Publication number: 20240069755
    Abstract: The present application provides a computer system, a memory expansion device and a method for use in the computer system. The computer system includes multiple hosts and multiple memory expansion devices; the memory expansion devices correspond to the hosts in a one-to-one manner. Each host includes a CPU and a memory; each memory expansion device includes a first interface and multiple second interfaces. The first interface is configured to allow each memory expansion device to communicate with the corresponding CPU via a first coherence interconnection protocol, and the second interface is configured to allow each memory expansion device to communicate with a portion of memory expansion devices via a second coherence interconnection protocol. Any two memory expansion devices communicate with each other via at least two different paths, and the number of memory expansion devices that at least one of the two paths passes through is not more than one.
    Type: Application
    Filed: December 12, 2022
    Publication date: February 29, 2024
    Inventors: YIJIN GUAN, TIANCHAN GUAN, DIMIN NIU, HONGZHONG ZHENG
  • Publication number: 20240069754
    Abstract: The present application discloses a computing system and an associated method. The computing system includes a first host, a second host, a first memory extension device and a second memory extension device. The first host includes a first memory, and the second host includes a second memory. The first host has a plurality of first memory addresses corresponding to a plurality of memory spaces of the first memory, and a plurality of second memory addresses corresponding to a plurality of memory spaces of the second memory. The first memory extension device is coupled to the first host. The second memory extension device is coupled to the second host and the first memory extension device. The first host accesses the plurality of memory spaces of the second memory through the first memory extension device and the second memory extension device.
    Type: Application
    Filed: December 12, 2022
    Publication date: February 29, 2024
    Inventors: TIANCHAN GUAN, YIJIN GUAN, DIMIN NIU, HONGZHONG ZHENG
  • Publication number: 20240069954
    Abstract: The present application discloses a computing system and a memory sharing method for a computing system. The computing system includes a first host, a second host, a first memory extension device and a second memory extension device. The first memory extension device is coupled to the first host. The second memory extension device is coupled to the second host and the first memory extension device. The first host accesses a memory space of a memory of the second host through the first memory extension device and the second memory extension device according to a first physical address of the first host.
    Type: Application
    Filed: May 24, 2023
    Publication date: February 29, 2024
    Inventors: TIANCHAN GUAN, DIMIN NIU, YIJIN GUAN, HONGZHONG ZHENG
  • Publication number: 20240063200
    Abstract: The present disclosure relates to a hybrid bonding based integrated circuit (HBIC) device and its manufacturing method. In some embodiments, an exemplary HBIC device includes: a first die stack comprising one or more dies; and a second die stack integrated above the first die stack. The second die stack includes at least two memory dies communicatively connected to the first die stack by wire bonding.
    Type: Application
    Filed: February 6, 2020
    Publication date: February 22, 2024
    Inventors: Dimin NIU, Shuangchen LI, Tianchan GUAN, Hongzhong ZHENG
  • Publication number: 20240054096
    Abstract: The present disclosure discloses a processor. The processor is used to perform parallel computation and includes a logic die and a memory die. The logic die includes a plurality of processor cores and a plurality of networks on chip, wherein each processor core is programmable. The plurality of networks on chip are correspondingly connected to the plurality of processor cores, so that the plurality of processor cores form a two-dimensional mesh network. The memory die and the processor core are stacked vertically, wherein the memory die includes a plurality of memory tiles, and when the processor performs the parallel computation, the plurality of memory tiles do not have cache coherency; wherein, the plurality of memory tiles correspond to the plurality of processor cores in a one-to-one or one-to-many manner.
    Type: Application
    Filed: December 12, 2022
    Publication date: February 15, 2024
    Inventors: SHUANGCHEN LI, ZHE ZHANG, DIMIN NIU, HONGZHONG ZHENG
  • Publication number: 20240045975
    Abstract: The present disclosure discloses a processor and a multi-core processor. The processor includes a processor core and a memory. The processor core includes a homomorphic encryption instruction execution module and a general-purpose instruction execution module; the homomorphic encryption instruction execution module is configured to perform homomorphic encryption operation and includes a plurality of instruction set architecture extension components, wherein the plurality of instruction set architecture extension components are respectively configured to perform a sub-operation related to the homomorphic encryption; the general-purpose instruction execution module is configured to perform non-homomorphic encryption operation. The memory is vertically stacked with the processor core and is used as a cache or scratchpad memory of the processor core.
    Type: Application
    Filed: December 14, 2022
    Publication date: February 8, 2024
    Inventors: SHUANGCHEN LI, ZHE ZHANG, LINYONG HUANG, DIMIN NIU, XUANLE REN, HONGZHONG ZHENG
  • Publication number: 20240028554
    Abstract: A configurable processing unit including a core processing element and a plurality of assist processing elements can be coupled together by one or more networks. The core processing element can include a large processing logic, large non-volatile memory, input/output interfaces and multiple memory channels. The plurality of assist processing elements can each include smaller processing logic, smaller non-volatile memory and multiple memory channels. One or more bitstreams can be utilized to configure and reconfigure computation resources of the core processing element and memory management of the plurality of assist processing elements.
    Type: Application
    Filed: September 18, 2020
    Publication date: January 25, 2024
    Inventors: Dimin NIU, Tianchan GUAN, Hongzhong ZHENG, Shuangchen LI
  • Publication number: 20240020194
    Abstract: A system-in-package including a logic die and one or more memory dice can include a reliability availability serviceability (RAS) memory management unit (MMU) for memory error detection, memory error prediction and memory error handling. The RAS MMU can receive memory health information, on-die memory error information, system error information and read address information for the one or more memory dice. The RAS MMU can manage the memory blocks of the one or more memory dice based on the memory health information, on-die memory error type, system error type and read address. The RAS MMU can also further manage the memory blocks based on received on-die memory temperature information and or system temperature information.
    Type: Application
    Filed: November 4, 2020
    Publication date: January 18, 2024
    Inventors: Dimin NIU, Tianchan GUAN, Hongzhong ZHENG, Shuangchen LI
  • Publication number: 20240004954
    Abstract: This application describes a hardware acceleration design for improving SpGEMM efficiency. An exemplary method may include: obtaining a first sparse matrix and a second sparse matrix for performing SpGEMM; allocating a pair of buffers respectively pointed to by a first pointer and a second pointer; for each first row in the first sparse matrix that comprises a plurality of non-zero elements, identifying a plurality of second rows in the second sparse matrix that correspond to the plurality of non-zero elements; obtaining a plurality of intermediate lists computed based on each of the plurality of non-zero elements in the first row and one of the plurality of second rows that corresponds to the non-zero element; performing accumulation of the intermediate lists using the pair of buffers; and migrating the final merged list to a system memory as a row of an output matrix of the SpGEMM.
    Type: Application
    Filed: November 1, 2022
    Publication date: January 4, 2024
    Inventors: Zhaoyang DU, Yijin GUAN, Dimin NIU, Hongzhong ZHENG
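The row-wise flow in the abstract follows the classic Gustavson formulation of SpGEMM: each nonzero A[i][j] selects row j of B, and the resulting partial (column, value) lists are merged into one output row. A software sketch of that accumulation (a dict stands in for the hardware merge buffers; names are illustrative):

```python
def spgemm_row(a_row, B):
    """Compute one output row of SpGEMM, Gustavson-style.

    a_row: dict column -> value for one row of sparse matrix A.
    B:     dict row index -> (dict column -> value) for sparse matrix B.
    The accumulator dict plays the role of the patent's buffer pair,
    merging intermediate lists into one final list for the row.
    """
    acc = {}
    for j, a_val in a_row.items():            # nonzeros of A's row
        for k, b_val in B.get(j, {}).items():  # matching row of B
            acc[k] = acc.get(k, 0) + a_val * b_val
    return acc  # final merged list, migrated out as one output row

A_row = {0: 2, 2: 3}               # nonzeros of one row of A
B = {0: {1: 4}, 2: {1: 1, 3: 5}}   # rows of B keyed by row index
# spgemm_row(A_row, B) → {1: 11, 3: 15}
```

The hardware contribution is in how the intermediate lists are double-buffered and merged; the arithmetic per output row is exactly this accumulation.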
  • Publication number: 20240004547
    Abstract: A 3D-stacked memory device including: a base die including a plurality of switches to direct data flow and a plurality of arithmetic logic units (ALUs) to compute data; a plurality of memory dies stacked on the base die; and an interface to transfer signals to control the base die.
    Type: Application
    Filed: September 15, 2023
    Publication date: January 4, 2024
    Inventors: Mu-Tien Chang, Prasun Gera, Dimin Niu, Hongzhong Zheng
  • Publication number: 20240005127
    Abstract: This application describes systems and methods for facilitating memory access for graph neural network (GNN) processing. An example system includes a plurality of processing units, each configured to perform graph neural network (GNN) processing; and a plurality of memory extension cards, each configured to store graph data for the GNN processing, wherein: each of the plurality of processing units is communicatively coupled with three other processing units via one or more interconnects respectively; the plurality of processing units are communicatively coupled with the plurality of memory extension cards respectively; and each of the plurality of memory extension cards includes a graphic access engine circuitry configured to accelerate GNN memory access.
    Type: Application
    Filed: November 28, 2022
    Publication date: January 4, 2024
    Inventors: Yijin GUAN, Dimin NIU, Shengcheng WANG, Shuangchen LI, Hongzhong ZHENG
  • Publication number: 20240004824
    Abstract: This application describes systems and methods for facilitating memory access for graph neural network (GNN) processing. An example method includes fetching, by an access engine circuitry implemented on a circuitry board, a portion of structure data of a graph from a pinned memory in a host memory of a host via a first peripheral component interconnect express (PCIe) connection; performing node sampling using the fetched portion of the structure data of the graph to select one or more sampled nodes; fetching, by the access engine circuitry, a portion of attribute data of the graph from the pinned memory via the first PCIe connection; sending the fetched portion of the attribute data of the graph to one or more processors; and performing, by the one or more processors, GNN processing for the graph using the fetched portion of the attribute data of the graph.
    Type: Application
    Filed: November 30, 2022
    Publication date: January 4, 2024
    Inventors: Shuangchen LI, Dimin NIU, Hongzhong ZHENG, Zhe ZHANG, Yuhao WANG
  • Publication number: 20240004955
    Abstract: This application describes an accelerator, a computer system, and a method for memory optimization in sparse matrix-matrix multiplications (spGEMM). The memory optimization includes accurate memory pre-allocation for a to-be-generated output matrix of spGEMM between two sparse matrices. An exemplary method may include: sampling a plurality of first rows in the first sparse matrix; identifying, based on indices of non-zero data in the plurality of first rows, a plurality of second rows in a second sparse matrix; performing symbolic multiplication operations between the non-zero data in the plurality of first and second rows; determining an estimated compression ratio of the output matrix; determining an estimated mean row size for each row in the output matrix based on the estimated compression ratio; and allocating, according to the estimated mean row size and a total number of rows of the output matrix, a memory space in a hardware memory.
    Type: Application
    Filed: November 9, 2022
    Publication date: January 4, 2024
    Inventors: Zhaoyang DU, Yijin GUAN, Dimin NIU, Hongzhong ZHENG
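The estimation flow in this last abstract can be mimicked in software: sample a few rows of A, run symbolic multiplication (count column indices hit, with and without duplicates), derive a compression ratio, and scale to a mean output-row size for pre-allocation. A hedged sketch under assumed data layouts (this mirrors the abstract's steps, not the patented circuitry):

```python
import random

def estimate_mean_row_size(A, B, sample=4, seed=0):
    """Estimate the mean output-row size of SpGEMM by row sampling.

    A: list of rows, each a list of nonzero column indices.
    B: dict row index -> list of nonzero column indices.
    Symbolic multiplication counts index hits only; no values needed.
    """
    rng = random.Random(seed)
    rows = rng.sample(range(len(A)), min(sample, len(A)))
    raw = merged = 0
    for i in rows:
        cols = set()
        for j in A[i]:
            hits = B.get(j, ())
            raw += len(hits)       # intermediate products before merging
            cols.update(hits)      # unique output columns (symbolic)
        merged += len(cols)
    ratio = merged / raw if raw else 0.0   # estimated compression ratio
    return ratio * (raw / len(rows))       # estimated mean output row size
```

Multiplying the estimated mean row size by the total row count of the output matrix gives the memory to pre-allocate, trading a small sampling pass for avoiding costly reallocation mid-computation.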