Patents Assigned to Advanced Institute of Big Data, Beijing
  • Patent number: 11789899
    Abstract: The present disclosure provides a high-performance data lake system and a data storage method. The data storage method includes the following steps: S1: converting a file into a file stream; S2: converting the file stream into an array in which multiple subarrays are nested; and S3: converting the array into a resilient distributed dataset (RDD), and storing the RDD to a storage layer of a data lake. The present disclosure provides a nested field structure, which lays the foundation for parallel processing in reading, and effectively improves read performance. Furthermore, the present disclosure flexibly generates a number of nested subarrays according to hardware cores, such that the data lake achieves better extension performance, and can keep optimal writing efficiency for different users.
    Type: Grant
    Filed: November 17, 2022
    Date of Patent: October 17, 2023
    Assignees: Nanhu Laboratory, Advanced Institute of Big Data, Beijing
    Inventors: Hao Liu, Zhiling Chen, Tao Zhang, Peng Wang, Qiuye Wang, Chenxi Yu, Wei Chen, Yinlong Liu, Zhefeng Liu, Yonggang Tu
  • Publication number: 20230153267
    Abstract: The present disclosure provides a high-performance data lake system and a data storage method. The data storage method includes the following steps: S1: converting a file into a file stream; S2: converting the file stream into an array in which multiple subarrays are nested; and S3: converting the array into a resilient distributed dataset (RDD), and storing the RDD to a storage layer of a data lake. The present disclosure provides a nested field structure, which lays the foundation for parallel processing in reading, and effectively improves read performance. Furthermore, the present disclosure flexibly generates a number of nested subarrays according to hardware cores, such that the data lake achieves better extension performance, and can keep optimal writing efficiency for different users.
    Type: Application
    Filed: November 17, 2022
    Publication date: May 18, 2023
    Applicants: Nanhu Laboratory, Advanced Institute of Big Data, Beijing
    Inventors: Hao LIU, Zhiling CHEN, Tao ZHANG, Peng WANG, Qiuye WANG, Chenxi YU, Wei CHEN, Yinlong LIU, Zhefeng LIU, Yonggang TU