Patents by Inventor Lide Duan
Lide Duan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12248406
Abstract: The present application discloses a computing system and an associated method. The computing system includes a memory, a master computing device, and a slave computing device. The master computing device includes a memory controller and an input-output memory management unit (IOMMU). When the slave computing device accesses a first virtual address and a first translation lookaside buffer (TLB) of the slave computing device does not store the first virtual address, the first TLB sends a translation request to the IOMMU. The IOMMU traverses page tables of the memory controller to obtain a first physical address corresponding to the first virtual address, then selects and clears an entry from a second TLB of the computing system, according to the recent use time and dependent workload of each entry, to store the first virtual address and the first physical address.
Type: Grant
Filed: December 13, 2022
Date of Patent: March 11, 2025
Assignee: Alibaba (China) Co., Ltd.
Inventors: Lide Duan, Qichen Zhang, Shijian Zhang, Yen-Kuang Chen
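A minimal C sketch of the eviction step this abstract describes, under stated assumptions: the entry fields last_use and workload_refs and the weighted score are illustrative, since the abstract only says that recent use time and dependent workload are both considered.

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical second-level TLB entry: valid bit, translation, plus the
     * two eviction criteria named in the abstract. Field names are
     * illustrative, not from the patent. */
    typedef struct {
        int      valid;
        uint64_t vpn;           /* virtual page number  */
        uint64_t ppn;           /* physical page number */
        uint64_t last_use;      /* timestamp of most recent access */
        uint32_t workload_refs; /* outstanding dependent-workload references */
    } tlb_entry_t;

    /* Select a victim: prefer entries that are least recently used and have
     * the fewest dependent references. The weighting is an assumption. */
    size_t select_victim(const tlb_entry_t *tlb, size_t n, uint64_t now)
    {
        size_t best = 0;
        uint64_t best_score = 0;
        for (size_t i = 0; i < n; i++) {
            if (!tlb[i].valid)
                return i;                 /* free slot: no eviction needed */
            uint64_t age = now - tlb[i].last_use;
            uint64_t score = age / (1 + tlb[i].workload_refs);
            if (score > best_score) {
                best_score = score;
                best = i;
            }
        }
        return best;
    }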
-
Patent number: 12248400
Abstract: A computer-implemented method for allocating memory bandwidth of multiple CPU cores in a server includes: receiving an access request to a last level cache (LLC) shared by the multiple CPU cores in the server, the access request being sent from a core with a private cache holding copies of frequently accessed data from a memory; determining whether the access request is an LLC hit or an LLC miss; and controlling a memory bandwidth controller based on the determination. The memory bandwidth controller performs memory bandwidth throttling to control the request rate between the private cache and the last level cache. An LLC hit causes the throttling to be disabled, and an LLC miss causes it to be enabled.
Type: Grant
Filed: August 16, 2023
Date of Patent: March 11, 2025
Assignee: Alibaba (China) Co., Ltd.
Inventors: Lide Duan, Bowen Huang, Qichen Zhang, Shengcheng Wang, Yen-Kuang Chen, Hongzhong Zheng
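The control rule in this abstract reduces to a simple toggle. A minimal sketch, assuming a hypothetical bw_controller_t state and an on_llc_access hook; the real controller also enforces the request rate itself, which is omitted here.

    #include <stdbool.h>

    /* Hypothetical hook called by the LLC on every access from a core's
     * private cache. The abstract's rule: an LLC hit disables bandwidth
     * throttling, an LLC miss enables it. Names are illustrative. */
    typedef struct {
        bool     throttle_enabled; /* memory-bandwidth throttling state */
        unsigned request_rate;     /* permitted private-cache -> LLC rate */
    } bw_controller_t;

    void on_llc_access(bw_controller_t *ctrl, bool llc_hit)
    {
        if (llc_hit) {
            /* Hit: the request is served on-chip, so do not throttle. */
            ctrl->throttle_enabled = false;
        } else {
            /* Miss: the request consumes memory bandwidth, so throttle. */
            ctrl->throttle_enabled = true;
        }
    }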
-
Patent number: 12147341
Abstract: Apparatus, method, and system provided herein are directed to prioritizing cache line writing of compressed data. The memory controller comprises a cache line compression engine that receives raw data, compresses the raw data, determines a compression rate between the raw data and the compressed data, determines whether the compression rate is greater than a predetermined rate, and outputs the compressed data as data-to-be-written if the compression rate is greater than the predetermined rate. In response to determining that the compression rate is greater than the predetermined rate, the cache line compression engine generates a compression signal indicating the data-to-be-written is the compressed data and sends the compression signal to a scheduler of a command queue in the memory controller where writing of compressed data is prioritized.
Type: Grant
Filed: August 6, 2020
Date of Patent: November 19, 2024
Assignee: Alibaba Group Holding Limited
Inventors: Dimin Niu, Tianchan Guan, Lide Duan, Hongzhong Zheng
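A minimal sketch of the compression-rate gate in this abstract. The types, the ratio-style compression rate, and the min_rate threshold are assumptions; the scheduler-side prioritization is represented only by the compressed flag.

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical decision made by the cache line compression engine:
     * emit compressed data (and a compression signal for the scheduler)
     * only when the achieved rate beats a configured threshold. */
    typedef struct {
        const void *data;       /* data-to-be-written */
        size_t      len;
        bool        compressed; /* the "compression signal" */
    } write_request_t;

    write_request_t gate_compression(const void *raw, size_t raw_len,
                                     const void *comp, size_t comp_len,
                                     double min_rate)
    {
        write_request_t req;
        /* Rate as raw/compressed size, e.g. 2.0 means halved; guard a
         * degenerate zero-length result. */
        double rate = comp_len ? (double)raw_len / (double)comp_len : 0.0;
        if (rate > min_rate) {
            req.data = comp; req.len = comp_len; req.compressed = true;
        } else {
            req.data = raw;  req.len = raw_len;  req.compressed = false;
        }
        return req; /* scheduler prioritizes requests with compressed set */
    }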
-
Patent number: 12141438
Abstract: Zero skipping sparsity techniques for reduced data movement between memory and accelerators and reduced computational workload of accelerators. The techniques include detection of zero and near-zero values in memory. The non-zero values are transferred to the accelerator for computation. The zero and near-zero values are written back within the memory as zero values.
Type: Grant
Filed: February 25, 2021
Date of Patent: November 12, 2024
Assignee: Alibaba Group Holding Limited
Inventors: Fei Xue, Fei Sun, Yangjie Zhou, Lide Duan, Hongzhong Zheng
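A minimal sketch of the zero-skipping filter as described: near-zero values are written back as exact zeros while only non-zeros (with their indices) move to the accelerator. The eps threshold and the flat-array layout are assumptions.

    #include <math.h>
    #include <stddef.h>

    /* Hypothetical memory-side filter. Returns the number of values the
     * accelerator actually has to compute on. */
    size_t zero_skip(float *buf, size_t n, float eps,
                     float *out_vals, size_t *out_idx)
    {
        size_t kept = 0;
        for (size_t i = 0; i < n; i++) {
            if (fabsf(buf[i]) <= eps) {
                buf[i] = 0.0f;           /* write near-zero back as zero */
            } else {
                out_vals[kept] = buf[i]; /* transfer only non-zeros */
                out_idx[kept]  = i;
                kept++;
            }
        }
        return kept; /* accelerator works on kept values instead of n */
    }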
-
Publication number: 20240303090
Abstract: A data processing method, applicable to an accelerator that is communicatively coupled to a processor core, includes obtaining a service data processing request from a first queue; obtaining to-be-processed service data corresponding to the service data processing request from the processor core via a service interface; generating result service data based on the to-be-processed service data; and writing the result service data into a second queue for providing to the processor core.
Type: Application
Filed: March 7, 2024
Publication date: September 12, 2024
Inventors: Shijian Zhang, Lide Duan, Hongzhong Zheng
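A minimal model of the two-queue protocol in this abstract, with the queues flattened to arrays and a placeholder computation; all types and names are illustrative.

    #include <stddef.h>

    /* Hypothetical model: the core places requests in a first queue, the
     * accelerator pulls service data through a service interface,
     * computes, and writes results to a second queue. */
    typedef struct { int req_id; int payload; } request_t;
    typedef struct { int req_id; int value;   } result_t;

    size_t accelerator_drain(const request_t *first_q, size_t n,
                             result_t *second_q)
    {
        for (size_t i = 0; i < n; i++) {
            /* "service interface": payload travels with the request here */
            second_q[i].req_id = first_q[i].req_id;
            second_q[i].value  = first_q[i].payload * 2; /* placeholder */
        }
        return n; /* results now visible to the processor core */
    }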
-
Publication number: 20240303135
Abstract: Embodiments of the present disclosure provide a data transmission method. The data transmission method is applied to an operation chip. The operation chip includes a plurality of nodes of a network on chip (NoC), and the method includes: receiving a data processing instruction of target service data, where the data processing instruction carries information about a receiving node and a processing node set; determining a relay processing node in the processing node set based on the receiving node; and transmitting the target service data from the receiving node to the relay processing node, and transmitting the target service data from the relay processing node to another processing node in the processing node set.
Type: Application
Filed: March 8, 2024
Publication date: September 12, 2024
Inventors: Huatao Zhao, Shengcheng Wang, Yunfan Li, Lide Duan
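A minimal sketch of one plausible relay choice for the scheme in this abstract; the abstract does not say how the relay is determined, so nearest-to-the-receiving-node on a mesh (Manhattan hops) is assumed here.

    #include <stdlib.h>
    #include <limits.h>

    /* Hypothetical mesh coordinates for NoC nodes. */
    typedef struct { int x, y; } node_t;

    static int hops(node_t a, node_t b)  /* Manhattan distance on a mesh */
    {
        return abs(a.x - b.x) + abs(a.y - b.y);
    }

    /* Pick the processing node nearest the receiving node as the relay. */
    int select_relay(node_t recv, const node_t *set, int n)
    {
        int best = 0, best_d = INT_MAX;
        for (int i = 0; i < n; i++) {
            int d = hops(recv, set[i]);
            if (d < best_d) { best_d = d; best = i; }
        }
        return best; /* data goes recv -> set[best] -> rest of the set */
    }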
-
Patent number: 12056374
Abstract: A dynamic bias coherency configuration engine can include control logic, a host threshold register, a device threshold register, and a plurality of memory region monitoring units. The memory region monitoring units can include a starting page number register, an ending page number register, a host access register, and a device access register. The memory region monitoring units can be utilized by the dynamic bias coherency configuration engine to configure corresponding portions of a memory space in a device bias mode or a host bias mode.
Type: Grant
Filed: February 3, 2021
Date of Patent: August 6, 2024
Assignee: Alibaba Group Holding Limited
Inventors: Lide Duan, Dimin Niu, Hongzhong Zheng
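A minimal sketch of the per-region decision the engine could make from its registers. The flip-on-threshold policy and the counter reset are assumptions; the abstract only names the registers and the two bias modes.

    #include <stdint.h>

    typedef enum { HOST_BIAS, DEVICE_BIAS } bias_mode_t;

    /* Hypothetical memory region monitoring unit state; field names map
     * onto the registers listed in the abstract. */
    typedef struct {
        uint64_t start_page, end_page; /* monitored page-number range */
        uint64_t host_accesses;        /* host access register  */
        uint64_t device_accesses;      /* device access register */
        bias_mode_t mode;
    } region_monitor_t;

    void update_bias(region_monitor_t *r,
                     uint64_t host_threshold, uint64_t device_threshold)
    {
        if (r->mode == DEVICE_BIAS && r->host_accesses >= host_threshold) {
            r->mode = HOST_BIAS;       /* host dominates: flip to host bias */
            r->host_accesses = r->device_accesses = 0;
        } else if (r->mode == HOST_BIAS &&
                   r->device_accesses >= device_threshold) {
            r->mode = DEVICE_BIAS;     /* device dominates: flip */
            r->host_accesses = r->device_accesses = 0;
        }
    }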
-
Publication number: 20240244013
Abstract: Embodiments of this disclosure provide a data packet transmission method, a scheduling management unit, a chip, and a graphics card. The data packet transmission method includes: determining a source node and a destination node of a data packet to be transmitted; determining at least one intermediate routing node corresponding to the data packet based on the source node, the destination node, and a data transmission state of each node in a network on chip (NoC); and transmitting identification information of the at least one intermediate routing node to the source node.
Type: Application
Filed: January 16, 2024
Publication date: July 18, 2024
Inventors: Yunfang Li, Jiayi Huang, Lide Duan, Dimin Niu
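A minimal sketch of a congestion-aware choice consistent with this abstract, assuming the "data transmission state" of a node is summarized by a queued-flit count; the actual selection metric is not specified.

    #include <limits.h>

    /* Hypothetical per-node state tracked by the scheduling unit. */
    typedef struct {
        int id;
        int queued_flits; /* proxy for the node's transmission state */
    } noc_node_t;

    /* Among candidate intermediate routing nodes between source and
     * destination, pick the least loaded one. */
    int pick_intermediate(const noc_node_t *candidates, int n)
    {
        int best_id = -1, best_load = INT_MAX;
        for (int i = 0; i < n; i++) {
            if (candidates[i].queued_flits < best_load) {
                best_load = candidates[i].queued_flits;
                best_id   = candidates[i].id;
            }
        }
        return best_id; /* sent to the source node as identification info */
    }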
-
Publication number: 20240061780
Abstract: A computer-implemented method for allocating memory bandwidth of multiple CPU cores in a server includes: receiving an access request to a last level cache (LLC) shared by the multiple CPU cores in the server, the access request being sent from a core with a private cache holding copies of frequently accessed data from a memory; determining whether the access request is an LLC hit or an LLC miss; and controlling a memory bandwidth controller based on the determination. The memory bandwidth controller performs memory bandwidth throttling to control the request rate between the private cache and the last level cache. An LLC hit causes the throttling to be disabled, and an LLC miss causes it to be enabled.
Type: Application
Filed: August 16, 2023
Publication date: February 22, 2024
Inventors: Lide Duan, Bowen Huang, Qichen Zhang, Shengcheng Wang, Yen-Kuang Chen, Hongzhong Zheng
-
Publication number: 20240045960
Abstract: A computing device includes a processor, at least one storage block, and an access detection unit. The processor includes a load/store unit (LSU). When the processor switches from one program to another, the LSU stores a return address of the other program to the at least one storage block. The access detection unit includes a store-once stack and a comparison logic circuit. The store-once stack stores the storage address of the return address in the at least one storage block when the at least one storage block stores the return address. Before the LSU performs a storage operation on the at least one storage block, the comparison logic circuit compares the write address of the storage operation with the storage addresses of return addresses stored in the store-once stack to determine whether a return address will be modified.
Type: Application
Filed: December 9, 2022
Publication date: February 8, 2024
Inventors: Shijian Zhang, Lide Duan
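A minimal sketch of the store-once stack and comparison logic described here; the fixed depth, and the linear scan standing in for the comparison circuit, are assumptions.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define STACK_DEPTH 64

    /* Hypothetical store-once stack: remembers where return addresses
     * were spilled so later stores to those slots can be flagged. */
    typedef struct {
        uint64_t slots[STACK_DEPTH]; /* addresses holding return addrs */
        size_t   top;
    } store_once_stack_t;

    void record_return_spill(store_once_stack_t *s, uint64_t addr)
    {
        if (s->top < STACK_DEPTH)
            s->slots[s->top++] = addr; /* LSU spilled a return addr here */
    }

    /* Comparison logic: does this store modify a saved return address? */
    bool store_modifies_return(const store_once_stack_t *s,
                               uint64_t write_addr)
    {
        for (size_t i = 0; i < s->top; i++)
            if (s->slots[i] == write_addr)
                return true;           /* flag for detection/alert */
        return false;
    }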
-
Publication number: 20240045809
Abstract: The present application discloses a computing system and an associated method. The computing system includes a memory, a master computing device, and a slave computing device. The master computing device includes a memory controller and an input-output memory management unit (IOMMU). When the slave computing device accesses a first virtual address and a first translation lookaside buffer (TLB) of the slave computing device does not store the first virtual address, the first TLB sends a translation request to the IOMMU. The IOMMU traverses page tables of the memory controller to obtain a first physical address corresponding to the first virtual address, then selects and clears an entry from a second TLB of the computing system, according to the recent use time and dependent workload of each entry, to store the first virtual address and the first physical address.
Type: Application
Filed: December 13, 2022
Publication date: February 8, 2024
Inventors: Lide Duan, Qichen Zhang, Shijian Zhang, Yen-Kuang Chen
-
Publication number: 20240045599
Abstract: The present application discloses a processing unit and an access detection method thereof. The processing unit includes an execution circuit. The execution circuit connects to a memory and is configured to: execute an access request, wherein the access request is for accessing at least one part of a first physical memory section corresponding to a first access base address; determine whether a first tag of the access request is equal to a second tag corresponding to the first access base address and whether the at least one part of the first physical memory section matches a first legal access section corresponding to the first access base address; and determine whether to send an alert message according to the determination result.
Type: Application
Filed: December 13, 2022
Publication date: February 8, 2024
Inventors: Shijian Zhang, Lide Duan
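A minimal sketch of the tag-plus-section check this abstract describes, assuming the legal access section is an offset range; the tag width and encoding are illustrative.

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical descriptor bound to an access base address. */
    typedef struct {
        uint64_t base;        /* access base address */
        uint16_t tag;         /* tag bound to this base address */
        uint64_t legal_start; /* legal access section, inclusive */
        uint64_t legal_end;   /* legal access section, exclusive */
    } section_desc_t;

    bool access_ok(const section_desc_t *d, uint16_t req_tag,
                   uint64_t addr, uint64_t len)
    {
        if (req_tag != d->tag)
            return false;     /* tag mismatch: send alert */
        if (addr < d->legal_start || addr + len > d->legal_end)
            return false;     /* outside legal section: send alert */
        return true;          /* access proceeds silently */
    }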
-
Publication number: 20240045805
Abstract: Core-aware caching systems and methods for non-inclusive non-exclusive shared caching based on core sharing behaviors of the data and/or instructions. In one implementation, the caching between a shared cache level and a core specific cache level can be based on physical page number (PPN) and core identifier sets for previous accesses to the respective physical page numbers. In another implementation, the caching between a shared cache level and a core specific cache level can be based on physical page number and core valid bit vector sets for previous accesses to the respective physical page numbers by each of the plurality of cores.
Type: Application
Filed: January 20, 2021
Publication date: February 8, 2024
Inventors: Lide Duan, Guocai Zhu, Yen-Kuang Chen, Hongzhong Zheng
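A minimal sketch of the second implementation (core valid bit vectors): a line whose page has been touched by more than one core is kept at the shared level. The 32-core limit and the placement rule are assumptions.

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical tracking entry: one bit per core (up to 32 cores). */
    typedef struct {
        uint64_t ppn;        /* physical page number */
        uint32_t core_valid; /* bit i set = core i accessed this page */
    } ppn_track_t;

    static int popcount32(uint32_t v)
    {
        int c = 0;
        while (v) { v &= v - 1; c++; }
        return c;
    }

    /* Returns true if the line should live in the shared cache level. */
    bool cache_in_shared(ppn_track_t *t, int core_id)
    {
        t->core_valid |= 1u << core_id;       /* record this access */
        return popcount32(t->core_valid) > 1; /* shared across cores? */
    }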
-
Publication number: 20240037221
Abstract: The present application discloses a processor and an attack detection method thereof. The processor includes a first register and an execution unit. The execution unit is configured to: execute a first jump-related instruction under a first privilege mode; set a first field of the first register to a first jump status parameter according to execution of the first jump-related instruction; jump to a first corresponding instruction in a specified register of the first jump-related instruction; determine whether the first corresponding instruction is a legal instruction and whether a first parameter of the first corresponding instruction is equal to the first jump status parameter to obtain a first determination; and determine whether to send an alert message according to the first determination.
Type: Application
Filed: December 13, 2022
Publication date: February 1, 2024
Inventors: Shijian Zhang, Lide Duan
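A minimal sketch of the target-side check, loosely modeled on landing-pad style control-flow checks; the encoding of the jump status parameter and the legality test are assumptions.

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical decoded view of the instruction at a jump target. */
    typedef struct {
        bool     is_landing; /* legal instruction for a jump target? */
        uint16_t param;      /* parameter embedded in the instruction */
    } target_insn_t;

    /* The jump recorded jump_status in a register field; the target must
     * be a legal landing instruction with a matching parameter. */
    bool jump_target_ok(uint16_t jump_status, target_insn_t insn)
    {
        if (!insn.is_landing)
            return false;                 /* illegal target: alert */
        return insn.param == jump_status; /* parameters must match */
    }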
-
Publication number: 20240004830
Abstract: Embodiments of the present disclosure include a processor. The processor may include a systolic array of processing elements; a first group of buffers coupled to the systolic array, wherein the first group comprises one or more first buffers; a second group of buffers coupled to the systolic array, wherein the second group comprises one or more second buffers; an accumulator coupled to the systolic array; and a third group of buffers coupled to the accumulator, wherein the third group comprises one or more third buffers.
Type: Application
Filed: November 7, 2022
Publication date: January 4, 2024
Inventors: Qichen Zhang, Lide Duan, Shengcheng Wang
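A toy stand-in for the dataflow through the three buffer groups named in this abstract: activations from the first group, weights from the second, accumulator results into the third. A plain 2x2 multiply replaces the actual systolic timing.

    #include <stdint.h>

    #define N 2

    void systolic_matmul(const int32_t a_buf[N][N],  /* first buffer group  */
                         const int32_t w_buf[N][N],  /* second buffer group */
                         int32_t out_buf[N][N])      /* third buffer group  */
    {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++) {
                int32_t acc = 0;                      /* accumulator for PE (i,j) */
                for (int k = 0; k < N; k++)
                    acc += a_buf[i][k] * w_buf[k][j]; /* MACs streamed over k */
                out_buf[i][j] = acc;
            }
    }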
-
Publication number: 20230281124
Abstract: Apparatus, method, and system provided herein are directed to prioritizing cache line writing of compressed data. The memory controller comprises a cache line compression engine that receives raw data, compresses the raw data, determines a compression rate between the raw data and the compressed data, determines whether the compression rate is greater than a predetermined rate, and outputs the compressed data as data-to-be-written if the compression rate is greater than the predetermined rate. In response to determining that the compression rate is greater than the predetermined rate, the cache line compression engine generates a compression signal indicating the data-to-be-written is the compressed data and sends the compression signal to a scheduler of a command queue in the memory controller where writing of compressed data is prioritized.
Type: Application
Filed: August 6, 2020
Publication date: September 7, 2023
Inventors: Dimin Niu, Tianchan Guan, Lide Duan, Hongzhong Zheng
-
Patent number: 11704271
Abstract: A system-in-package architecture in accordance with aspects includes a logic die and one or more memory dice coupled together in a three-dimensional stack. The logic die can include one or more global building blocks and a plurality of local building blocks. The local building blocks can include a plurality of engines and memory controllers, and the memory controllers can be configured to directly couple one or more of the engines to the one or more memory dice. The number and type of local building blocks, and the number and types of engines and memory controllers, can be scalable.
Type: Grant
Filed: August 20, 2020
Date of Patent: July 18, 2023
Assignee: Alibaba Group Holding Limited
Inventors: Lide Duan, Wei Han, Yuhao Wang, Fei Xue, Yuanwei Fang, Hongzhong Zheng
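A configuration-struct view of the scalability claims in this abstract; every type and field name is illustrative.

    #include <stddef.h>

    typedef struct { int engine_id; } engine_t;
    typedef struct { int channel;   } mem_ctrl_t; /* couples engines to a memory die */

    /* A local building block: a scalable set of engines paired with
     * memory controllers. */
    typedef struct {
        engine_t   *engines;
        mem_ctrl_t *mem_ctrls;
        size_t      n_engines, n_mem_ctrls;
    } local_block_t;

    /* The logic die: global building blocks plus a scalable number of
     * local building blocks, stacked with one or more memory dice. */
    typedef struct {
        int            n_global_blocks;
        local_block_t *locals;
        size_t         n_locals;
        int            n_memory_dice;
    } sip_config_t;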
-
Patent number: 11604744
Abstract: A dual-mode memory interface of a computing system is provided, configurable to present memory interfaces having differently-graded bandwidth capacity to different processors of the computing system. A mode switch controller of the memory interface controller, based on at least an arbitration rule written to a configuration register, switches the memory interface controller between a narrow-band mode and a wide-band mode. In each mode, the memory interface controller disables either a plurality of narrow-band memory interfaces of the memory interface controller according to a first bus standard, or a wide-band memory interface of the memory interface controller according to a second bus standard. The memory interface controller virtualizes a plurality of system memory units of the computing system as a virtual wide-band memory unit according to the second bus standard, or virtualizes a system memory unit of the computing system as a virtual narrow-band memory unit according to the first bus standard.
Type: Grant
Filed: October 16, 2020
Date of Patent: March 14, 2023
Assignee: Alibaba Group Holding Limited
Inventors: Yuhao Wang, Wei Han, Dimin Niu, Lide Duan, Shuangchen Li, Fei Xue, Hongzhong Zheng
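A minimal sketch of the mode switch: the arbitration rule from the configuration register enables one interface type and disables the other. The enum and struct are illustrative.

    #include <stdbool.h>

    typedef enum { NARROW_BAND, WIDE_BAND } if_mode_t;

    /* Hypothetical memory interface controller state. */
    typedef struct {
        if_mode_t mode;
        bool narrow_ifs_enabled; /* plurality of narrow-band interfaces */
        bool wide_if_enabled;    /* single wide-band interface */
    } mem_if_ctrl_t;

    void apply_arbitration(mem_if_ctrl_t *c, if_mode_t cfg_reg_rule)
    {
        c->mode = cfg_reg_rule;
        c->narrow_ifs_enabled = (cfg_reg_rule == NARROW_BAND);
        c->wide_if_enabled    = (cfg_reg_rule == WIDE_BAND);
        /* In wide-band mode, several physical memory units are presented
         * as one virtual wide-band unit; in narrow-band mode, a unit is
         * presented as a virtual narrow-band unit. */
    }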
-
Publication number: 20230026824
Abstract: A memory system for accelerating graph neural network processing can include an on-host chip memory to cache data needed for processing a current root node. The system can also include a volatile memory interface between the host and non-volatile memory. The volatile memory can be configured to save one or more sets of next root nodes, neighbor nodes and corresponding attributes. The non-volatile memory can have sufficient capacity to store the entire graph data. The non-volatile memory can also be configured to pre-arrange the sets of next root nodes, neighbor nodes and corresponding attributes for storage in the volatile memory.
Type: Application
Filed: July 15, 2022
Publication date: January 26, 2023
Inventors: Fei Xue, Yangjie Zhou, Lide Duan, Hongzhong Zheng
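A minimal double-buffered sketch of the three-level flow this abstract implies: non-volatile memory pre-arranges each root's working set, and volatile memory stages the next set while the current one is processed. Function pointers stand in for the NVM fetch and the GNN computation; all names are illustrative.

    #include <stddef.h>

    /* Hypothetical per-root working set: neighbors plus attributes. */
    typedef struct {
        int    root;
        int   *neighbors;
        float *attributes;
        int    n_neighbors;
    } root_set_t;

    void process_roots(const int *roots, int n,
                       root_set_t dram_stage[2],                /* volatile mem */
                       root_set_t (*nvm_gather)(int root),      /* NVM fetch   */
                       void (*gnn_compute)(const root_set_t *)) /* accelerator */
    {
        if (n <= 0) return;
        dram_stage[0] = nvm_gather(roots[0]);   /* prime the pipeline */
        for (int i = 0; i < n; i++) {
            root_set_t cur = dram_stage[i % 2]; /* acts as on-chip copy */
            if (i + 1 < n)                      /* overlap: stage next set */
                dram_stage[(i + 1) % 2] = nvm_gather(roots[i + 1]);
            gnn_compute(&cur);                  /* process current root */
        }
    }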
-
Publication number: 20220244870
Abstract: A dynamic bias coherency configuration engine can include control logic, a host threshold register, a device threshold register, and a plurality of memory region monitoring units. The memory region monitoring units can include a starting page number register, an ending page number register, a host access register, and a device access register. The memory region monitoring units can be utilized by the dynamic bias coherency configuration engine to configure corresponding portions of a memory space in a device bias mode or a host bias mode.
Type: Application
Filed: February 3, 2021
Publication date: August 4, 2022
Inventors: Lide Duan, Dimin Niu, Hongzhong Zheng