Patents by Inventor Hefei Zhu

Hefei Zhu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11847501
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using a policy that determines a static partition of resources in each DPA in the cluster communicatively coupled to a host device. Each DPA has sensitive (secure) and non-sensitive (non-secure) resources. The host device and a DPA can access all resources of the DPA. Other DPAs can only access non-sensitive resources of a DPA. The partition of resources within a DPA is static and may be implemented in hardware or firmware. Resources include memory, one or more processing modules such as key generators and cryptographic modules, caches, registers, and storage.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: December 19, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
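
The static-partition scheme above amounts to a simple access-control rule. Below is a minimal, hypothetical Python sketch; the resource names and the Requester/DPA classes are illustrative assumptions, not the patented hardware/firmware implementation.

```python
# A minimal sketch of the static secure/non-secure partition in patent
# 11847501 (assumed names throughout; the real partition lives in
# hardware or firmware, not in software).
from dataclasses import dataclass
from enum import Enum, auto

class Requester(Enum):
    HOST = auto()        # the host device can access everything
    SELF = auto()        # a DPA can access all of its own resources
    OTHER_DPA = auto()   # peer DPAs see only non-sensitive resources

@dataclass(frozen=True)
class DPA:
    # Static partition: fixed at design time, never changed at runtime.
    sensitive: frozenset = frozenset({"key_generator", "crypto_module", "secure_memory"})
    non_sensitive: frozenset = frozenset({"shared_memory", "registers", "storage"})

    def may_access(self, requester: Requester, resource: str) -> bool:
        if resource in self.non_sensitive:
            return True   # non-sensitive resources are open to all parties
        if resource in self.sensitive:
            return requester in (Requester.HOST, Requester.SELF)
        return False      # unknown resource: deny by default

dpa = DPA()
assert dpa.may_access(Requester.HOST, "key_generator")
assert dpa.may_access(Requester.OTHER_DPA, "shared_memory")
assert not dpa.may_access(Requester.OTHER_DPA, "key_generator")
```
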
  • Patent number: 11687376
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using dynamic partitioning of DPAs into, or out of, one or more groups of DPAs in the cluster. A host device instructs each DPA in the cluster to link, or unlink, with one or more DPAs in the cluster to establish groups of DPAs in the cluster. A DPA that is not linked to any DPA is set to a low-power mode. Once grouped, the host device and a DPA can access all resources of the DPA. DPAs in the same group as a first DPA can access non-secure resources, but not secure resources, of the first DPA. DPAs in a different group from the first DPA cannot access any resources of the first DPA. A scheduler in the host device allocates processing tasks for one application or user to a group.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: June 27, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
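
To illustrate the link/unlink grouping and the low-power rule described above, here is a hedged sketch; the Cluster and DPA classes and their method names are invented for illustration, not taken from the patent.

```python
# A sketch of the dynamic link/unlink grouping in patent 11687376;
# all class and method names here are assumptions.
class DPA:
    def __init__(self, dpa_id):
        self.dpa_id = dpa_id
        self.links = set()      # peer DPA ids this DPA is linked with
        self.low_power = True   # an unlinked DPA idles in low-power mode

class Cluster:
    def __init__(self, n):
        self.dpas = {i: DPA(i) for i in range(n)}

    def link(self, a, b):
        # Host instruction: join a and b into the same group.
        self.dpas[a].links.add(b)
        self.dpas[b].links.add(a)
        self.dpas[a].low_power = self.dpas[b].low_power = False

    def unlink(self, a, b):
        self.dpas[a].links.discard(b)
        self.dpas[b].links.discard(a)
        for i in (a, b):  # a DPA with no remaining links drops to low power
            self.dpas[i].low_power = not self.dpas[i].links

cluster = Cluster(4)
cluster.link(0, 1)      # group {0, 1}; DPAs 2 and 3 stay in low power
cluster.link(2, 3)      # group {2, 3}
cluster.unlink(2, 3)    # 2 and 3 return to low-power mode
assert cluster.dpas[3].low_power and not cluster.dpas[0].low_power
```
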
  • Patent number: 11687629
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs). The cluster may include DPAs from a third party that may not be trusted. To ensure data protection in the cluster, a first DPA that receives a request from a second DPA to access a resource of the first DPA authenticates the second DPA. If the second DPA passes authentication, it is permitted to access the non-sensitive resources of the first DPA; otherwise, it is not permitted to access any resources of the first DPA, and the first DPA breaks the communication link with the second DPA. Authentication is premised on a shared secret function between DPAs and a random number generated by the first DPA. The shared secret function is updateable, e.g., by a patch from the manufacturer of the DPA.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: June 27, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
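
The challenge/response flow above can be sketched end to end. The patent leaves the shared secret function unspecified, so this illustration substitutes an HMAC over a fresh nonce; the key value and all names are invented placeholders.

```python
# A minimal sketch of the challenge/response authentication in patent
# 11687629; an HMAC stands in for the unspecified shared secret function.
import hashlib
import hmac
import os

SHARED_SECRET = b"updateable-via-manufacturer-patch"   # illustrative only

def secret_fn(nonce: bytes) -> bytes:
    # The "shared secret function" known to all genuine DPAs.
    return hmac.new(SHARED_SECRET, nonce, hashlib.sha256).digest()

class FirstDPA:
    def challenge(self) -> bytes:
        # Random number generated by the first DPA for each request.
        self.nonce = os.urandom(16)
        return self.nonce

    def verify(self, response: bytes) -> bool:
        ok = hmac.compare_digest(response, secret_fn(self.nonce))
        if not ok:
            print("authentication failed: breaking the communication link")
        return ok

first = FirstDPA()
nonce = first.challenge()                 # first DPA issues the challenge
assert first.verify(secret_fn(nonce))     # genuine second DPA: access granted
assert not first.verify(os.urandom(32))   # impostor: link is broken
```
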
  • Patent number: 11657332
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of the AI model, the plurality of layers organized in a first ordered list. The method further includes randomizing the plurality of layers by reorganizing the first ordered list into a second ordered list, and transferring the plurality of layers of the AI model to a data processing accelerator in an order defined by the second ordered list.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: May 23, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
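
A minimal sketch of the layer-order randomization: shuffle the first ordered list into a second one, transfer the layers in the shuffled order, and restore the original order on the receiving side. The index-carrying wire format is an assumption made for illustration.

```python
# A sketch of the layer-shuffling transfer in patent 11657332; the wire
# format and layer names are invented.
import random

layers = ["conv1", "conv2", "pool", "fc1", "softmax"]  # first ordered list

def send_model(layers):
    order = list(range(len(layers)))
    random.shuffle(order)                               # second ordered list
    # Each layer travels with its original index so it can be reassembled.
    return [(i, layers[i]) for i in order]

def receive_model(transferred):
    restored = [None] * len(transferred)
    for original_index, layer in transferred:
        restored[original_index] = layer
    return restored

wire = send_model(layers)   # an observer of the transfer sees a random order
assert receive_model(wire) == layers
```
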
  • Patent number: 11615295
    Abstract: A data processing system includes a central processing unit (CPU) and accelerator cards coupled to the CPU over a bus, each of the accelerator cards having a plurality of data processing (DP) accelerators to receive DP tasks from the CPU and to perform the received DP tasks. At least two of the accelerator cards are coupled to each other via an inter-card connection, and at least two of the DP accelerators are coupled to each other via an inter-chip connection. Each of the inter-card connection and the inter-chip connection is capable of being dynamically activated or deactivated, such that in response to a request received from the CPU, any one of the accelerator cards or any one of the DP accelerators within any one of the accelerator cards can be enabled or disabled to process any one of the DP tasks received from the CPU.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: March 28, 2023
    Assignees: BAIDU USA LLC, BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD., KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Hefei Zhu, Jian Ouyang, Zhibiao Zhao, Xiaozhang Gong, Qingshu Chen
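
One way to picture the dynamically activated inter-card and inter-chip connections is the toy software model below; the Link and AcceleratorCard classes and their methods are invented, not the patent's bus-level mechanism.

```python
# A toy software view of patent 11615295's dynamically activated
# connections; all names here are assumptions.
class Link:
    """An inter-card or inter-chip connection that can be toggled."""
    def __init__(self):
        self.active = False
    def activate(self):
        self.active = True
    def deactivate(self):
        self.active = False

class AcceleratorCard:
    def __init__(self, card_id, num_dpas):
        self.card_id = card_id
        self.enabled_dpas = set(range(num_dpas))  # all DPAs usable at first

    def disable_dpa(self, dpa):
        self.enabled_dpas.discard(dpa)

    def enable_dpa(self, dpa):
        self.enabled_dpas.add(dpa)

# Two cards on the CPU's bus, joined by one inter-card connection.
cards = [AcceleratorCard(0, 4), AcceleratorCard(1, 4)]
inter_card = Link()

inter_card.activate()     # CPU request: allow tasks to span both cards
cards[1].disable_dpa(3)   # CPU request: take one DP accelerator offline
assert inter_card.active and 3 not in cards[1].enabled_dpas
```
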
  • Patent number: 11588796
    Abstract: According to one embodiment, a host communicates with a data processing (DP) accelerator using an obfuscation scheme. The DP accelerator receives an obfuscation kernel algorithm (or obfuscation algorithm), where the obfuscation kernel algorithm is used to obfuscate and de-obfuscate data in communication with a host. The DP accelerator de-obfuscates, using the obfuscation kernel algorithm, obfuscated data received from the host for a prediction request to obtain one or more AI models. The DP accelerator generates prediction results by applying the one or more AI models to a prediction input. The DP accelerator obfuscates, using the obfuscation kernel algorithm, the prediction results. The DP accelerator sends the obfuscated prediction results to the host, where the host retrieves the prediction results by de-obfuscating the obfuscated prediction results.
    Type: Grant
    Filed: September 11, 2019
    Date of Patent: February 21, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
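
The obfuscation round trip above can be sketched end to end. The patent does not specify the obfuscation kernel algorithm, so this illustration substitutes a keyed XOR stream (not secure, purely illustrative); the key and all names are assumptions.

```python
# A sketch of the host/accelerator obfuscation round trip in patent
# 11588796, with a toy XOR stream standing in for the obfuscation kernel.
import hashlib
from itertools import count

def keystream(key: bytes):
    # Expand the key into an unbounded pseudo-random byte stream.
    for block in count():
        yield from hashlib.sha256(key + block.to_bytes(8, "big")).digest()

def obfuscate(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ k for b, k in zip(data, keystream(key)))

de_obfuscate = obfuscate   # XOR with the same stream is its own inverse

KEY = b"obfuscation-kernel-key"   # illustrative shared key

# Host: obfuscate the AI model bytes before sending a prediction request.
model_blob = b"serialized AI model bytes"
wire_model = obfuscate(model_blob, KEY)

# Accelerator: de-obfuscate the model, run the prediction, obfuscate it.
assert de_obfuscate(wire_model, KEY) == model_blob
prediction = b"prediction results"
wire_result = obfuscate(prediction, KEY)

# Host: retrieve the prediction results by de-obfuscating them.
assert de_obfuscate(wire_result, KEY) == prediction
```
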
  • Patent number: 11563745
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using a policy that partitions the DPAs into one or more groups of DPAs in the cluster. A host device instructs the DPAs to organize themselves into non-overlapping groups according to a policy for each DPA in the cluster. The policy indicates, for each DPA, the one or more other DPAs it is to establish a communication link with in order to implement the grouping. Once grouped, the host device and a DPA can access all resources of the DPA. DPAs in the same group as a first DPA can access non-secure resources, but not secure resources, of the first DPA. DPAs in a different group from the first DPA cannot access any resources of the first DPA. A scheduler in the host device can allocate processing tasks to any group in the cluster.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: January 24, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
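
Since the policy tells each DPA which peers to link with, the resulting non-overlapping groups are the connected components of the link graph. A minimal sketch, with an invented policy table:

```python
# A sketch of policy-driven grouping in patent 11563745; the policy
# table and DPA ids are invented for illustration.
policy = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3], 5: []}

def groups_from_policy(policy):
    seen, groups = set(), []
    for start in policy:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:                 # walk the links to collect one group
            dpa = stack.pop()
            if dpa in group:
                continue
            group.add(dpa)
            stack.extend(policy[dpa])
        seen |= group
        groups.append(sorted(group))
    return groups

# Non-overlapping groups; DPA 5, linked to nobody, is a singleton group.
assert groups_from_policy(policy) == [[0, 1, 2], [3, 4], [5]]
```
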
  • Patent number: 11556859
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of the AI model, the plurality of layers organized in a first ordered list. The method further includes randomizing the plurality of layers by reorganizing the first ordered list into a second ordered list, and transferring the plurality of layers of the AI model to a data processing accelerator in an order defined by the second ordered list.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: January 17, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
  • Patent number: 11544067
    Abstract: According to various embodiments, methods and systems are provided to accelerate artificial intelligence (AI) model training with advanced interconnect communication technologies and systematic zero-value compression over a distributed training system. According to an exemplary method, during each iteration of a Scatter-Reduce process performed on a cluster of processors arranged in a logical ring to train a neural network model, a processor receives a compressed data block from a prior processor in the logical ring, performs an operation on the received compressed data block and a compressed data block generated on the processor to obtain a calculated data block, and sends the calculated data block to a following processor in the logical ring. A compressed data block calculated from corresponding data blocks from the processors can be identified on each processor and distributed to each other processor and decompressed therein for use in the AI model training.
    Type: Grant
    Filed: October 12, 2019
    Date of Patent: January 3, 2023
    Assignees: BAIDU USA LLC, BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD., KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Zhibiao Zhao, Jian Ouyang, Hefei Zhu, Qingshu Chen, Wei Qi
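
Below is a minimal sketch of one compressed Scatter-Reduce pass on a logical ring. Sparse dictionaries of nonzero entries stand in for the patent's systematic zero-value compression format, and the ring indexing follows the standard scatter-reduce schedule; the gradient values are invented.

```python
# A sketch of zero-value-compressed Scatter-Reduce on a logical ring, in
# the spirit of patent 11544067 (data and format are assumptions).
def compress(block):
    # Systematic zero-value compression: keep only nonzero entries.
    return {i: v for i, v in enumerate(block) if v != 0.0}

def decompress(sparse, length):
    out = [0.0] * length
    for i, v in sparse.items():
        out[i] = v
    return out

def add_compressed(a, b):
    # Reduce two compressed blocks without decompressing them.
    out = dict(a)
    for i, v in b.items():
        out[i] = out.get(i, 0.0) + v
    return out

# Three processors in a ring, each holding gradients split into 3 blocks.
grads = [
    [[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]],
    [[0.0, 1.0], [3.0, 0.0], [0.0, 4.0]],
    [[2.0, 0.0], [0.0, 0.0], [5.0, 0.0]],
]
n = len(grads)
blocks = [[compress(b) for b in g] for g in grads]

# In step s, processor p receives block (p - 1 - s) mod n from the prior
# processor in the ring and reduces it into its own copy of that block.
for step in range(n - 1):
    incoming = [blocks[(p - 1) % n][(p - 1 - step) % n] for p in range(n)]
    for p in range(n):
        idx = (p - 1 - step) % n
        blocks[p][idx] = add_compressed(blocks[p][idx], incoming[p])

# After n - 1 steps, processor p holds the fully reduced block (p + 1) % n.
assert decompress(blocks[0][1], 2) == [3.0, 2.0]  # block 1 summed over all
```
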
  • Patent number: 11485376
    Abstract: An automatic driving processing system, a system on chip, and a method for monitoring a processing module are described herein. The automatic driving processing system comprises: an automatic driving processing module configured to receive an input data stream and process it based on a deep learning model so as to generate a processing result; a fault detection module configured to generate a control signal and a fault-detection stimulating data stream, and to receive the processing result from the automatic driving processing module; and a multi-way selection module configured to receive an automatic driving data stream as well as the control signal and the fault-detection stimulating data stream, and to selectively output either the automatic driving data stream or the fault-detection stimulating data stream to the automatic driving processing module, based on the control signal, as the input data stream.
    Type: Grant
    Filed: December 11, 2019
    Date of Patent: November 1, 2022
    Assignees: Beijing Baidu Netcom Science And Technology Co., Ltd., Kunlunxin Technology (Beijing) Company Limited
    Inventors: Chongqin Wang, Zhibiao Zhao, Hefei Zhu, Ningyi Xu, Jian Ouyang
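
The multi-way selection can be pictured as a software multiplexer: a control signal decides whether the processing module consumes the real driving stream or a stimulus stream whose expected ("golden") result is known in advance. All names and values below are invented for illustration.

```python
# A software picture of the fault-detection multiplexing in patent
# 11485376; the stimulus, golden value, and function names are assumptions.
def processing_module(frame):
    # Stand-in for the deep-learning processing of one input frame.
    return sum(frame) % 256

STIMULUS = [7, 7, 7]                  # fault-detection stimulating data
GOLDEN = processing_module(STIMULUS)  # expected result for the stimulus

def mux(control, driving_frame, stimulus_frame):
    # control == 0: pass the driving stream; control == 1: inject stimulus.
    return stimulus_frame if control else driving_frame

def fault_check():
    # Feed the stimulus through the processing module and compare the
    # result with the known-good value; a mismatch would signal a fault.
    return processing_module(mux(1, None, STIMULUS)) == GOLDEN

assert fault_check()
```
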
  • Patent number: 11409653
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of an AI model, wherein each layer of the plurality of layers is associated with a memory address. The method further includes randomizing the memory address associated with each layer of the plurality of layers, and transferring the plurality of layers with the randomized memory addresses to a data processing accelerator to execute the AI model.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: August 9, 2022
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
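
A minimal sketch of the per-layer address randomization before transfer; the address range, alignment, and layer names are invented values.

```python
# A sketch of randomized per-layer memory addresses in the spirit of
# patent 11409653 (all constants here are assumptions).
import random

layers = ["conv1", "conv2", "fc1", "softmax"]

def randomize_addresses(layers, base=0x1000_0000, span=0x0100_0000, align=0x1000):
    # Draw a distinct, aligned, randomized base address for every layer.
    slots = random.sample(range(base, base + span, align), len(layers))
    return dict(zip(layers, slots))

# The accelerator receives (layer, address) pairs and uses this map to
# locate each layer when executing the AI model.
address_map = randomize_addresses(layers)
for layer, addr in address_map.items():
    print(f"{layer}: {addr:#010x}")
```
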
  • Patent number: 11301255
    Abstract: Methods, apparatuses, devices, and storage media for performing a processing task are provided. One portion of the processing task can include a group of operations to be performed at one processing unit of a plurality of processing units. The group of operations can include operations of a first type and operations of a second type. In the method, a first queue for performing the operations of the first type and a second queue for performing the operations of the second type are built. Based on a definition of the processing task, a dependency relationship between the group of operations to be performed at the processing unit and groups of operations to be performed at other processing units in the plurality of processing units can be obtained. Operations in the first queue and operations in the second queue are then performed based on the dependency relationship.
    Type: Grant
    Filed: December 30, 2019
    Date of Patent: April 12, 2022
    Assignee: Kunlunxin Technology (Beijing) Company Limited
    Inventors: Qingshu Chen, Zhibiao Zhao, Hefei Zhu, Xiaozhang Gong, Yong Wang, Jian Ouyang
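
A hedged sketch of the dual-queue idea: one queue per operation type on a processing unit, with each operation held until its cross-queue (or cross-unit) dependencies have completed. Operation types, names, and the dependency table are all assumptions.

```python
# A sketch of dual-queue, dependency-aware scheduling in the spirit of
# patent 11301255 (queue contents and dependencies are invented).
from collections import deque

done = set()   # globally visible set of completed operations

def run(unit, queues, deps):
    """queues: type -> deque of operations; deps: op -> prerequisite ops."""
    pending = {t: deque(q) for t, q in queues.items()}
    while any(pending.values()):
        progressed = False
        for t, q in pending.items():
            # Run the head of this queue only if its dependencies are done.
            if q and deps.get(q[0], set()) <= done:
                op = q.popleft()
                done.add(op)
                print(f"unit {unit} ran {t} op {op}")
                progressed = True
        if not progressed:
            raise RuntimeError("blocked on operations from other units")

queues = {"compute": deque(["k0", "k1"]), "comm": deque(["send0"])}
deps = {"send0": {"k0"}}   # the communication op must wait for kernel k0
run(0, queues, deps)       # runs k0, then send0, then k1
```
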
  • Publication number: 20210390463
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of the AI model, the plurality of layers organized in a first ordered list. The method further includes randomizing the plurality of layers by reorganizing the first ordered list into a second ordered list, and transferring the plurality of layers of the AI model to a data processing accelerator in an order defined by the second ordered list.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210390163
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs). The cluster may include DPAs from a third party that may not be trusted. To ensure data protection in the cluster, a first DPA that receives a request from a second DPA to access a resource of the first DPA authenticates the second DPA. If the second DPA passes authentication, it is permitted to access the non-sensitive resources of the first DPA; otherwise, it is not permitted to access any resources of the first DPA, and the first DPA breaks the communication link with the second DPA. Authentication is premised on a shared secret function between DPAs and a random number generated by the first DPA. The shared secret function is updateable, e.g., by a patch from the manufacturer of the DPA.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210390462
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of the AI model, the plurality of layers organized in a first ordered list. The method further includes randomizing the plurality of layers by reorganizing the first ordered list into a second ordered list, and transferring the plurality of layers of the AI model to a data processing accelerator in an order defined by the second ordered list.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210390047
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of an AI model, wherein each layer of the plurality of layers is associated with a memory address. The method further includes randomizing the memory address associated with each layer of the plurality of layers, and transferring the plurality of layers with the randomized memory addresses to a data processing accelerator to execute the AI model.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210389992
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using a policy that determines a static partition of resources in each DPA in the cluster communicatively coupled to a host device. Each DPA has sensitive (secure) and non-sensitive (non-secure) resources. The host device and a DPA can access all resources of the DPA. Other DPAs can only access non-sensitive resources of a DPA. The partition of resources within a DPA is static and may be implemented in hardware or firmware. Resources include memory, one or more processing modules such as key generators and cryptographic modules, caches, registers, and storage.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210389993
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using dynamic partitioning of DPAs into, or out of, one or more groups of DPAs in the cluster. A host device instructs each DPA in the cluster to link, or unlink, with one or more DPAs in the cluster to establish groups of DPAs in the cluster. A DPA that is not linked to any DPA is set to a low-power mode. Once grouped, the host device and a DPA can access all resources of the DPA. DPAs in the same group as a first DPA can access non-secure resources, but not secure resources, of the first DPA. DPAs in a different group from the first DPA cannot access any resources of the first DPA. A scheduler in the host device allocates processing tasks for one application or user to a group.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210392143
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using a policy that partitions the DPAs into one or more groups of DPAs in the cluster. A host device instructs the DPAs to organize themselves into non-overlapping groups according to a policy for each DPA in the cluster. The policy indicates, for each DPA, the one or more other DPAs it is to establish a communication link with in order to implement the grouping. Once grouped, the host device and a DPA can access all resources of the DPA. DPAs in the same group as a first DPA can access non-secure resources, but not secure resources, of the first DPA. DPAs in a different group from the first DPA cannot access any resources of the first DPA. A scheduler in the host device can allocate processing tasks to any group in the cluster.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210350264
    Abstract: Embodiments of the disclosure describe a method to obfuscate AI models. In one embodiment, a host communicates with a data processing (DP) accelerator to request AI training by the DP accelerator. The DP accelerator (or system) receives an AI model training request from the host, where the request includes one or more model-obfuscation kernel algorithms, one or more AI models, and/or training input data. In response to receiving the request, the system trains the one or more AI models based on the training input data. In some embodiments, the AI accelerator already has a copy of the AI model. After the AI models are trained, the system obfuscates the one or more trained AI models using the one or more model-obfuscation kernel algorithms. The system then sends the obfuscated trained AI models to the host.
    Type: Application
    Filed: May 7, 2020
    Publication date: November 11, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU