Patents by Inventor Yongliu Wang

Yongliu Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11487989
    Abstract: A data reuse method based on a convolutional neural network (CNN) accelerator. A tile scanning module receives command information from a command module; the command information comprises the size of a CNN job to be divided into tile blocks. The tile scanning module generates the coordinates of each tile block according to the tile size and sends them to a memory request module. The memory request module generates memory read requests and sends them to a memory module. The memory module sequentially returns the tile block data to an input activation weight buffer unit, which saves the received tile block data to implement data reuse and transmits it to the calculation processing unit (PE).
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: November 1, 2022
    Assignee: Shanghai Iluvatar CoreX Semiconductor Co., Ltd.
    Inventors: Yile Sun, Pingping Shao, Yongliu Wang, Jinshan Zheng, Yunxiao Zou, Haihua Zhai
  • Patent number: 10671288
    Abstract: A hierarchical sparse tensor compression method for artificial intelligence devices. In DRAM, the method not only saves the storage space of the neuron surface but also adds a meta-surface to the mask block. When reading data, the mask is read first, the size of the non-zero data is then calculated, and only the non-zero data is read, which saves DRAM bandwidth. In the cache, only non-zero data is stored, so the required storage space is reduced. When processing data, only non-zero data is used. The method uses a bit mask to determine whether data is zero. The hierarchical compression scheme has three levels: tiles, lines, and points. Bit masks and non-zero data are read from DRAM, and bandwidth is saved by not reading zero data. When processing data, tile data whose bit mask is zero can be easily removed.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: June 2, 2020
    Assignee: Nanjing Iluvatar CoreX Technology Co., Ltd.
    Inventors: Pingping Shao, Jiejun Chen, Yongliu Wang
  • Publication number: 20200057722
    Abstract: A data reading and writing method based on a variable-length cache line. A lookup table stores the cache line information of each request. When a read task arrives at the cache, the cache line information is obtained according to the request index. If the request hits, the data in the cache is read and sent to the requester over multiple cycles. Otherwise the request is not in the cache, and read requests are created and sent: the offset, tag, and cache line size are recorded in the lookup table record, and the request is sent to the DRAM. Once all the data has been returned and written to the cache, the corresponding lookup table record is set to valid.
    Type: Application
    Filed: December 31, 2018
    Publication date: February 20, 2020
    Applicant: Nanjing Iluvatar CoreX Technology Co., Ltd. (DBA “Iluvatar CoreX Inc. Nanjing”)
    Inventors: Yongliu Wang, Pingping Shao, Chenggen Zheng, Jinshan Zheng
  • Publication number: 20200042189
    Abstract: A hierarchical sparse tensor compression method for artificial intelligence devices. In DRAM, the method not only saves the storage space of the neuron surface but also adds a meta-surface to the mask block. When reading data, the mask is read first, the size of the non-zero data is then calculated, and only the non-zero data is read, which saves DRAM bandwidth. In the cache, only non-zero data is stored, so the required storage space is reduced. When processing data, only non-zero data is used. The method uses a bit mask to determine whether data is zero. The hierarchical compression scheme has three levels: tiles, lines, and points. Bit masks and non-zero data are read from DRAM, and bandwidth is saved by not reading zero data. When processing data, tile data whose bit mask is zero can be easily removed.
    Type: Application
    Filed: December 31, 2018
    Publication date: February 6, 2020
    Applicant: Nanjing Iluvatar CoreX Technology Co., Ltd. (DBA “Iluvatar CoreX Inc. Nanjing”)
    Inventors: Pingping Shao, Jiejun Chen, Yongliu Wang
  • Publication number: 20200042860
    Abstract: A data reuse method based on a convolutional neural network (CNN) accelerator. A tile scanning module receives command information from a command module; the command information comprises the size of a CNN job to be divided into tile blocks. The tile scanning module generates the coordinates of each tile block according to the tile size and sends them to a memory request module. The memory request module generates memory read requests and sends them to a memory module. The memory module sequentially returns the tile block data to an input activation weight buffer unit, which saves the received tile block data to implement data reuse and transmits it to the calculation processing unit (PE).
    Type: Application
    Filed: December 31, 2018
    Publication date: February 6, 2020
    Applicant: Nanjing Iluvatar CoreX Technology Co., Ltd. (DBA “Iluvatar CoreX Inc. Nanjing”)
    Inventors: Yile Sun, Pingping Shao, Yongliu Wang, Jinshan Zheng, Yunxiao Zou, Haihua Zhai
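
The tile-scanning flow in the abstract for patent 11487989 (publication 20200042860) can be illustrated in software. The following is a minimal sketch, not the patented hardware design: the names `scan_tiles` and `TileBuffer`, the row-major scan order, and the use of a 2D list as the memory module are all assumptions made for illustration.

```python
from typing import Dict, Iterator, List, Tuple

Tile = Tuple[int, int, int, int]  # (row, col, height, width)

def scan_tiles(job_h: int, job_w: int, tile_h: int, tile_w: int) -> Iterator[Tile]:
    """Yield tile block coordinates for a job, in row-major order (assumed)."""
    for r in range(0, job_h, tile_h):
        for c in range(0, job_w, tile_w):
            # Edge tiles are clipped to the job boundary.
            yield r, c, min(tile_h, job_h - r), min(tile_w, job_w - c)

class TileBuffer:
    """Hypothetical stand-in for the input activation weight buffer unit."""

    def __init__(self, memory: List[List[int]]):
        self.memory = memory          # 2D list standing in for the memory module
        self.store: Dict[Tile, List[List[int]]] = {}
        self.reads = 0                # number of memory read requests issued

    def fetch(self, tile: Tile) -> List[List[int]]:
        # Issue a memory read only the first time a tile is requested;
        # later requests reuse the buffered copy.
        if tile not in self.store:
            r, c, h, w = tile
            self.store[tile] = [row[c:c + w] for row in self.memory[r:r + h]]
            self.reads += 1
        return self.store[tile]

memory = [[r * 7 + c for c in range(7)] for r in range(5)]
tiles = list(scan_tiles(5, 7, 2, 4))
buffer = TileBuffer(memory)
first = buffer.fetch(tiles[0])
again = buffer.fetch(tiles[0])   # served from the buffer, no new read
```

In this sketch a 5 x 7 job with 2 x 4 tiles produces six tile blocks, and fetching the same tile twice issues a single memory read.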
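
The bit-mask scheme in the abstract for patent 10671288 (publication 20200042189) can be sketched in a few lines. This is an illustrative assumption of how such masks might look in software, showing only two of the three levels (lines and points); the function names and the bit ordering (bit i corresponds to element i) are hypothetical and not taken from the patent.

```python
from typing import List, Tuple

def compress_line(values: List[int]) -> Tuple[int, List[int]]:
    """Return (point mask, non-zero values) for one line."""
    mask = 0
    nonzero = []
    for i, v in enumerate(values):
        if v != 0:
            mask |= 1 << i       # bit i set means element i is non-zero
            nonzero.append(v)
    return mask, nonzero

def nonzero_count(mask: int) -> int:
    """Size of the non-zero payload, computed from the mask alone."""
    return bin(mask).count("1")

def decompress_line(mask: int, nonzero: List[int], length: int) -> List[int]:
    it = iter(nonzero)
    return [next(it) if (mask >> i) & 1 else 0 for i in range(length)]

def compress_tile(tile: List[List[int]]) -> Tuple[int, List[Tuple[int, List[int]]]]:
    """Line-level mask plus point-level data for the non-zero lines only."""
    line_mask = 0
    lines = []
    for i, line in enumerate(tile):
        mask, nonzero = compress_line(line)
        if mask:                 # all-zero lines store nothing
            line_mask |= 1 << i
            lines.append((mask, nonzero))
    return line_mask, lines

line = [0, 3, 0, 0, 7, 0, 0, 1]
mask, payload = compress_line(line)
restored = decompress_line(mask, payload, len(line))
line_mask, packed = compress_tile([[0, 0, 0, 0], [0, 5, 0, 2], [0, 0, 0, 0]])
```

A reader of this layout reads the mask first, derives the payload size with `nonzero_count`, and then fetches only that many values; a tile or line whose mask is zero is skipped entirely.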
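
The lookup-table behavior in the abstract for publication 20200057722 can be modeled as follows. This is a simplified sketch under stated assumptions: the class and field names are hypothetical, the tag doubles as the DRAM address, the DRAM fetch completes immediately, `beat` (bytes delivered per cycle) is an invented parameter, and eviction is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Record:
    tag: int             # identifies the cached data (here: DRAM address, assumed)
    offset: int          # start of the line inside cache storage
    size: int            # cache line size in bytes, variable per request
    valid: bool = False  # set once all data has been written to the cache

class VariableLineCache:
    def __init__(self, dram: bytes, beat: int = 4):
        self.dram = dram
        self.beat = beat                     # bytes returned per cycle (assumed)
        self.storage = bytearray()
        self.table: Dict[int, Record] = {}   # request index -> record
        self.dram_requests = 0

    def read(self, index: int, tag: int, size: int) -> List[bytes]:
        rec = self.table.get(index)
        if not (rec and rec.valid and rec.tag == tag):
            # Miss: record offset, tag, and size, then request the data.
            rec = Record(tag=tag, offset=len(self.storage), size=size)
            self.table[index] = rec
            self.dram_requests += 1
            self.storage += self.dram[tag:tag + size]
            rec.valid = True                 # all data returned and written
        # Hit path: deliver the line over multiple cycles, one beat each.
        line = bytes(self.storage[rec.offset:rec.offset + rec.size])
        return [line[i:i + self.beat] for i in range(0, rec.size, self.beat)]

dram = bytes(range(32))
cache = VariableLineCache(dram)
beats = cache.read(index=0, tag=8, size=10)     # miss, fetched from DRAM
repeat = cache.read(index=0, tag=8, size=10)    # hit, served from the cache
```

Here a 10-byte line is returned in three beats (4, 4, and 2 bytes), and the second read of the same request index does not generate another DRAM request.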