Patents by Inventor Hefei Zhu

Hefei Zhu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11847501
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using a policy that determines a static partition of resources in each DPA in the cluster communicatively coupled to a host device. Each DPA has sensitive (secure) and non-sensitive (non-secure) resources. The host device and a DPA can access all resources of the DPA. Other DPAs can only access non-sensitive resources of a DPA. The partition of resources within a DPA is static and may be implemented in hardware or firmware. Resources include memory, one or more processing modules such as key generators and cryptographic modules, caches, registers, and storage.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: December 19, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
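
The static-partition scheme above amounts to a simple access-control rule. Below is a minimal, hypothetical Python sketch; the resource names and the Requester/DPA classes are illustrative assumptions, not the patented hardware/firmware implementation.

```python
# A minimal sketch of the static secure/non-secure partition in patent
# 11847501 (assumed names throughout; the real partition lives in
# hardware or firmware, not in software).
from dataclasses import dataclass
from enum import Enum, auto

class Requester(Enum):
    HOST = auto()        # the host device can access everything
    SELF = auto()        # a DPA can access all of its own resources
    OTHER_DPA = auto()   # peer DPAs see only non-sensitive resources

@dataclass(frozen=True)
class DPA:
    # Static partition: fixed at design time, never changed at runtime.
    sensitive: frozenset = frozenset({"key_generator", "crypto_module", "secure_memory"})
    non_sensitive: frozenset = frozenset({"shared_memory", "registers", "storage"})

    def may_access(self, requester: Requester, resource: str) -> bool:
        if resource in self.non_sensitive:
            return True   # non-sensitive resources are open to all parties
        if resource in self.sensitive:
            return requester in (Requester.HOST, Requester.SELF)
        return False      # unknown resource: deny by default

dpa = DPA()
assert dpa.may_access(Requester.HOST, "key_generator")
assert dpa.may_access(Requester.OTHER_DPA, "shared_memory")
assert not dpa.may_access(Requester.OTHER_DPA, "key_generator")
```
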
  • Patent number: 11687376
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using dynamic partitioning of DPAs into, or out of, one or more groups of DPAs in the cluster. A host device instructs each DPA in the cluster to link, or unlink, with one or more DPAs in the cluster to establish groups of DPAs in the cluster. A DPA that is not linked to any DPA is set to a low-power mode. Once grouped, the host device and a DPA can access all resources of the DPA. DPAs in the same group as a first DPA can access non-secure resources, but not secure resources, of the first DPA. DPAs in a different group from the first DPA cannot access any resources of the first DPA. A scheduler in the host device allocates processing tasks for one application or user to a group.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: June 27, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
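
To illustrate the link/unlink grouping and the low-power rule described above, here is a hedged sketch; the Cluster and DPA classes and their method names are invented for illustration, not taken from the patent.

```python
# A sketch of the dynamic link/unlink grouping in patent 11687376;
# all class and method names here are assumptions.
class DPA:
    def __init__(self, dpa_id):
        self.dpa_id = dpa_id
        self.links = set()      # peer DPA ids this DPA is linked with
        self.low_power = True   # an unlinked DPA idles in low-power mode

class Cluster:
    def __init__(self, n):
        self.dpas = {i: DPA(i) for i in range(n)}

    def link(self, a, b):
        # Host instruction: join a and b into the same group.
        self.dpas[a].links.add(b)
        self.dpas[b].links.add(a)
        self.dpas[a].low_power = self.dpas[b].low_power = False

    def unlink(self, a, b):
        self.dpas[a].links.discard(b)
        self.dpas[b].links.discard(a)
        for i in (a, b):  # a DPA with no remaining links drops to low power
            self.dpas[i].low_power = not self.dpas[i].links

cluster = Cluster(4)
cluster.link(0, 1)      # group {0, 1}; DPAs 2 and 3 stay in low power
cluster.link(2, 3)      # group {2, 3}
cluster.unlink(2, 3)    # 2 and 3 return to low-power mode
assert cluster.dpas[3].low_power and not cluster.dpas[0].low_power
```
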
  • Patent number: 11687629
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs). The cluster may include DPAs from a third party that may not be trusted. To ensure data protection in the cluster, a first DPA that receives a request from a second DPA to access a resource of the first DPA authenticates the second DPA. If the second DPA passes authentication, it is permitted to access the non-sensitive resources of the first DPA; otherwise, it is not permitted to access any resources of the first DPA, and the first DPA breaks the communication link with the second DPA. Authentication is premised on a shared secret function between DPAs and a random number generated by the first DPA. The shared secret function is updateable, e.g., by a patch from the manufacturer of the DPA.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: June 27, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
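
The challenge/response flow above can be sketched end to end. The patent leaves the shared secret function unspecified, so this illustration substitutes an HMAC over a fresh nonce; the key value and all names are invented placeholders.

```python
# A minimal sketch of the challenge/response authentication in patent
# 11687629; an HMAC stands in for the unspecified shared secret function.
import hashlib
import hmac
import os

SHARED_SECRET = b"updateable-via-manufacturer-patch"   # illustrative only

def secret_fn(nonce: bytes) -> bytes:
    # The "shared secret function" known to all genuine DPAs.
    return hmac.new(SHARED_SECRET, nonce, hashlib.sha256).digest()

class FirstDPA:
    def challenge(self) -> bytes:
        # Random number generated by the first DPA for each request.
        self.nonce = os.urandom(16)
        return self.nonce

    def verify(self, response: bytes) -> bool:
        ok = hmac.compare_digest(response, secret_fn(self.nonce))
        if not ok:
            print("authentication failed: breaking the communication link")
        return ok

first = FirstDPA()
nonce = first.challenge()                 # first DPA issues the challenge
assert first.verify(secret_fn(nonce))     # genuine second DPA: access granted
assert not first.verify(os.urandom(32))   # impostor: link is broken
```
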
  • Patent number: 11657332
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of the AI model, the plurality of layers organized in a first ordered list. The method further includes randomizing the plurality of layers by reorganizing the first ordered list into a second ordered list, and transferring the plurality of layers of the AI model to a data processing accelerator in an order defined by the second ordered list.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: May 23, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
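
A minimal sketch of the layer-order randomization: shuffle the first ordered list into a second one, transfer the layers in the shuffled order, and restore the original order on the receiving side. The index-carrying wire format is an assumption made for illustration.

```python
# A sketch of the layer-shuffling transfer in patent 11657332; the wire
# format and layer names are invented.
import random

layers = ["conv1", "conv2", "pool", "fc1", "softmax"]  # first ordered list

def send_model(layers):
    order = list(range(len(layers)))
    random.shuffle(order)                               # second ordered list
    # Each layer travels with its original index so it can be reassembled.
    return [(i, layers[i]) for i in order]

def receive_model(transferred):
    restored = [None] * len(transferred)
    for original_index, layer in transferred:
        restored[original_index] = layer
    return restored

wire = send_model(layers)   # an observer of the transfer sees a random order
assert receive_model(wire) == layers
```
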
  • Patent number: 11615295
    Abstract: A data processing system includes a central processing unit (CPU) and accelerator cards coupled to the CPU over a bus, each of the accelerator cards having a plurality of data processing (DP) accelerators to receive DP tasks from the CPU and to perform the received DP tasks. At least two of the accelerator cards are coupled to each other via an inter-card connection, and at least two of the DP accelerators are coupled to each other via an inter-chip connection. Each of the inter-card connection and the inter-chip connection is capable of being dynamically activated or deactivated, such that in response to a request received from the CPU, any one of the accelerator cards or any one of the DP accelerators within any one of the accelerator cards can be enabled or disabled to process any one of the DP tasks received from the CPU.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: March 28, 2023
    Assignees: BAIDU USA LLC, BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD., KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Hefei Zhu, Jian Ouyang, Zhibiao Zhao, Xiaozhang Gong, Qingshu Chen
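
One way to picture the dynamically activated inter-card and inter-chip connections is the toy software model below; the Link and AcceleratorCard classes and their methods are invented, not the patent's bus-level mechanism.

```python
# A toy software view of patent 11615295's dynamically activated
# connections; all names here are assumptions.
class Link:
    """An inter-card or inter-chip connection that can be toggled."""
    def __init__(self):
        self.active = False
    def activate(self):
        self.active = True
    def deactivate(self):
        self.active = False

class AcceleratorCard:
    def __init__(self, card_id, num_dpas):
        self.card_id = card_id
        self.enabled_dpas = set(range(num_dpas))  # all DPAs usable at first

    def disable_dpa(self, dpa):
        self.enabled_dpas.discard(dpa)

    def enable_dpa(self, dpa):
        self.enabled_dpas.add(dpa)

# Two cards on the CPU's bus, joined by one inter-card connection.
cards = [AcceleratorCard(0, 4), AcceleratorCard(1, 4)]
inter_card = Link()

inter_card.activate()     # CPU request: allow tasks to span both cards
cards[1].disable_dpa(3)   # CPU request: take one DP accelerator offline
assert inter_card.active and 3 not in cards[1].enabled_dpas
```
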
  • Patent number: 11588796
    Abstract: According to one embodiment, a host communicates with a data processing (DP) accelerator using an obfuscation scheme. The DP accelerator receives an obfuscation kernel algorithm (or obfuscation algorithm), where the obfuscation kernel algorithm is used to obfuscate and de-obfuscate data in communication with a host. The DP accelerator de-obfuscates, using the obfuscation kernel algorithm, obfuscated data received from the host for a prediction request to obtain one or more AI models. The DP accelerator generates prediction results by applying the one or more AI models to a prediction input. The DP accelerator obfuscates, using the obfuscation kernel algorithm, the prediction results. The DP accelerator sends the obfuscated prediction results to the host, where the host retrieves the prediction results by de-obfuscating the obfuscated prediction results.
    Type: Grant
    Filed: September 11, 2019
    Date of Patent: February 21, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
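
The obfuscation round trip above can be sketched end to end. The patent does not specify the obfuscation kernel algorithm, so this illustration substitutes a keyed XOR stream (not secure, purely illustrative); the key and all names are assumptions.

```python
# A sketch of the host/accelerator obfuscation round trip in patent
# 11588796, with a toy XOR stream standing in for the obfuscation kernel.
import hashlib
from itertools import count

def keystream(key: bytes):
    # Expand the key into an unbounded pseudo-random byte stream.
    for block in count():
        yield from hashlib.sha256(key + block.to_bytes(8, "big")).digest()

def obfuscate(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ k for b, k in zip(data, keystream(key)))

de_obfuscate = obfuscate   # XOR with the same stream is its own inverse

KEY = b"obfuscation-kernel-key"   # illustrative shared key

# Host: obfuscate the AI model bytes before sending a prediction request.
model_blob = b"serialized AI model bytes"
wire_model = obfuscate(model_blob, KEY)

# Accelerator: de-obfuscate the model, run the prediction, obfuscate it.
assert de_obfuscate(wire_model, KEY) == model_blob
prediction = b"prediction results"
wire_result = obfuscate(prediction, KEY)

# Host: retrieve the prediction results by de-obfuscating them.
assert de_obfuscate(wire_result, KEY) == prediction
```
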
  • Patent number: 11563745
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using a policy that partitions the DPAs into one or more groups of DPAs in the cluster. A host device instructs the DPAs to organize themselves into non-overlapping groups according to a policy for each DPA in the cluster. The policy indicates, for each DPA, the one or more other DPAs it is to establish a communication link with in order to implement the grouping. Once grouped, the host device and a DPA can access all resources of the DPA. DPAs in the same group as a first DPA can access non-secure resources, but not secure resources, of the first DPA. DPAs in a different group from the first DPA cannot access any resources of the first DPA. A scheduler in the host device can allocate processing tasks to any group in the cluster.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: January 24, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
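
Since the policy tells each DPA which peers to link with, the resulting non-overlapping groups are the connected components of the link graph. A minimal sketch, with an invented policy table:

```python
# A sketch of policy-driven grouping in patent 11563745; the policy
# table and DPA ids are invented for illustration.
policy = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3], 5: []}

def groups_from_policy(policy):
    seen, groups = set(), []
    for start in policy:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:                 # walk the links to collect one group
            dpa = stack.pop()
            if dpa in group:
                continue
            group.add(dpa)
            stack.extend(policy[dpa])
        seen |= group
        groups.append(sorted(group))
    return groups

# Non-overlapping groups; DPA 5, linked to nobody, is a singleton group.
assert groups_from_policy(policy) == [[0, 1, 2], [3, 4], [5]]
```
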
  • Patent number: 11556859
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of the AI model, the plurality of layers organized in a first ordered list. The method further includes randomizing the plurality of layers by reorganizing the first ordered list into a second ordered list, and transferring the plurality of layers of the AI model to a data processing accelerator in an order defined by the second ordered list.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: January 17, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
  • Patent number: 11544067
    Abstract: According to various embodiments, methods and systems are provided to accelerate artificial intelligence (AI) model training with advanced interconnect communication technologies and systematic zero-value compression over a distributed training system. According to an exemplary method, during each iteration of a Scatter-Reduce process performed on a cluster of processors arranged in a logical ring to train a neural network model, a processor receives a compressed data block from a prior processor in the logical ring, performs an operation on the received compressed data block and a compressed data block generated on the processor to obtain a calculated data block, and sends the calculated data block to a following processor in the logical ring. A compressed data block calculated from corresponding data blocks from the processors can be identified on each processor and distributed to each other processor and decompressed therein for use in the AI model training.
    Type: Grant
    Filed: October 12, 2019
    Date of Patent: January 3, 2023
    Assignees: BAIDU USA LLC, BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD., KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Zhibiao Zhao, Jian Ouyang, Hefei Zhu, Qingshu Chen, Wei Qi
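
Below is a minimal sketch of one compressed Scatter-Reduce pass on a logical ring. Sparse dictionaries of nonzero entries stand in for the patent's systematic zero-value compression format, and the ring indexing follows the standard scatter-reduce schedule; the gradient values are invented.

```python
# A sketch of zero-value-compressed Scatter-Reduce on a logical ring, in
# the spirit of patent 11544067 (data and format are assumptions).
def compress(block):
    # Systematic zero-value compression: keep only nonzero entries.
    return {i: v for i, v in enumerate(block) if v != 0.0}

def decompress(sparse, length):
    out = [0.0] * length
    for i, v in sparse.items():
        out[i] = v
    return out

def add_compressed(a, b):
    # Reduce two compressed blocks without decompressing them.
    out = dict(a)
    for i, v in b.items():
        out[i] = out.get(i, 0.0) + v
    return out

# Three processors in a ring, each holding gradients split into 3 blocks.
grads = [
    [[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]],
    [[0.0, 1.0], [3.0, 0.0], [0.0, 4.0]],
    [[2.0, 0.0], [0.0, 0.0], [5.0, 0.0]],
]
n = len(grads)
blocks = [[compress(b) for b in g] for g in grads]

# In step s, processor p receives block (p - 1 - s) mod n from the prior
# processor in the ring and reduces it into its own copy of that block.
for step in range(n - 1):
    incoming = [blocks[(p - 1) % n][(p - 1 - step) % n] for p in range(n)]
    for p in range(n):
        idx = (p - 1 - step) % n
        blocks[p][idx] = add_compressed(blocks[p][idx], incoming[p])

# After n - 1 steps, processor p holds the fully reduced block (p + 1) % n.
assert decompress(blocks[0][1], 2) == [3.0, 2.0]  # block 1 summed over all
```
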
  • Patent number: 11485376
    Abstract: An automatic driving processing system, a system on chip, and a method for monitoring a processing module are described herein. The automatic driving processing system comprises: an automatic driving processing module configured to receive an input data stream and process it based on a deep learning model so as to generate a processing result; a fault detection module configured to generate a control signal and a fault-detection stimulating data stream, and to receive the processing result from the automatic driving processing module; and a multi-way selection module configured to receive an automatic driving data stream as well as the control signal and the fault-detection stimulating data stream, and to selectively output either the automatic driving data stream or the fault-detection stimulating data stream to the automatic driving processing module, based on the control signal, as the input data stream.
    Type: Grant
    Filed: December 11, 2019
    Date of Patent: November 1, 2022
    Assignees: Beijing Baidu Netcom Science And Technology Co., Ltd., Kunlunxin Technology (Beijing) Company Limited
    Inventors: Chongqin Wang, Zhibiao Zhao, Hefei Zhu, Ningyi Xu, Jian Ouyang
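
The multi-way selection can be pictured as a software multiplexer: a control signal decides whether the processing module consumes the real driving stream or a stimulus stream whose expected ("golden") result is known in advance. All names and values below are invented for illustration.

```python
# A software picture of the fault-detection multiplexing in patent
# 11485376; the stimulus, golden value, and function names are assumptions.
def processing_module(frame):
    # Stand-in for the deep-learning processing of one input frame.
    return sum(frame) % 256

STIMULUS = [7, 7, 7]                  # fault-detection stimulating data
GOLDEN = processing_module(STIMULUS)  # expected result for the stimulus

def mux(control, driving_frame, stimulus_frame):
    # control == 0: pass the driving stream; control == 1: inject stimulus.
    return stimulus_frame if control else driving_frame

def fault_check():
    # Feed the stimulus through the processing module and compare the
    # result with the known-good value; a mismatch would signal a fault.
    return processing_module(mux(1, None, STIMULUS)) == GOLDEN

assert fault_check()
```
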
  • Patent number: 11409653
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of an AI model, wherein each layer of the plurality of layers is associated with a memory address. The method further includes randomizing the memory address associated with each layer of the plurality of layers, and transferring the plurality of layers with the randomized memory addresses to a data processing accelerator to execute the AI model.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: August 9, 2022
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
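
A minimal sketch of the per-layer address randomization before transfer; the address range, alignment, and layer names are invented values.

```python
# A sketch of randomized per-layer memory addresses in the spirit of
# patent 11409653 (all constants here are assumptions).
import random

layers = ["conv1", "conv2", "fc1", "softmax"]

def randomize_addresses(layers, base=0x1000_0000, span=0x0100_0000, align=0x1000):
    # Draw a distinct, aligned, randomized base address for every layer.
    slots = random.sample(range(base, base + span, align), len(layers))
    return dict(zip(layers, slots))

# The accelerator receives (layer, address) pairs and uses this map to
# locate each layer when executing the AI model.
address_map = randomize_addresses(layers)
for layer, addr in address_map.items():
    print(f"{layer}: {addr:#010x}")
```
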
  • Patent number: 11301255
    Abstract: Methods, apparatuses, devices, and storage media for performing a processing task are provided. One portion of the processing task can include a group of operations to be performed at one processing unit of a plurality of processing units. The group of operations can include operations of a first type and operations of a second type. In the method, a first queue for performing the operations of the first type and a second queue for performing the operations of the second type are built. Based on a definition of the processing task, a dependency relationship between the group of operations to be performed at the processing unit and groups of operations to be performed at other processing units in the plurality of processing units can be obtained. Operations in the first queue and operations in the second queue are then performed based on the dependency relationship.
    Type: Grant
    Filed: December 30, 2019
    Date of Patent: April 12, 2022
    Assignee: Kunlunxin Technology (Beijing) Company Limited
    Inventors: Qingshu Chen, Zhibiao Zhao, Hefei Zhu, Xiaozhang Gong, Yong Wang, Jian Ouyang
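
A hedged sketch of the dual-queue idea: one queue per operation type on a processing unit, with each operation held until its cross-queue (or cross-unit) dependencies have completed. Operation types, names, and the dependency table are all assumptions.

```python
# A sketch of dual-queue, dependency-aware scheduling in the spirit of
# patent 11301255 (queue contents and dependencies are invented).
from collections import deque

done = set()   # globally visible set of completed operations

def run(unit, queues, deps):
    """queues: type -> deque of operations; deps: op -> prerequisite ops."""
    pending = {t: deque(q) for t, q in queues.items()}
    while any(pending.values()):
        progressed = False
        for t, q in pending.items():
            # Run the head of this queue only if its dependencies are done.
            if q and deps.get(q[0], set()) <= done:
                op = q.popleft()
                done.add(op)
                print(f"unit {unit} ran {t} op {op}")
                progressed = True
        if not progressed:
            raise RuntimeError("blocked on operations from other units")

queues = {"compute": deque(["k0", "k1"]), "comm": deque(["send0"])}
deps = {"send0": {"k0"}}   # the communication op must wait for kernel k0
run(0, queues, deps)       # runs k0, then send0, then k1
```
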
  • Publication number: 20210390463
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of the AI model, the plurality of layers organized in a first ordered list. The method further includes randomizing the plurality of layers by reorganizing the first ordered list into a second ordered list, and transferring the plurality of layers of the AI model to a data processing accelerator in an order defined by the second ordered list.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210390163
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs). The cluster may include DPAs from a third party that may not be trusted. To ensure data protection in the cluster, a first DPA that receives a request from a second DPA to access a resource of the first DPA authenticates the second DPA. If the second DPA passes authentication, it is permitted to access the non-sensitive resources of the first DPA; otherwise, it is not permitted to access any resources of the first DPA, and the first DPA breaks the communication link with the second DPA. Authentication is premised on a shared secret function between DPAs and a random number generated by the first DPA. The shared secret function is updateable, e.g., by a patch from the manufacturer of the DPA.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210390462
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of the AI model, the plurality of layers organized in a first ordered list. The method further includes randomizing the plurality of layers by reorganizing the first ordered list into a second ordered list, and transferring the plurality of layers of the AI model to a data processing accelerator in an order defined by the second ordered list.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210390047
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of an AI model, wherein each layer of the plurality of layers is associated with a memory address. The method further includes randomizing the memory address associated with each layer of the plurality of layers, and transferring the plurality of layers with the randomized memory addresses to a data processing accelerator to execute the AI model.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210389992
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using a policy that determines a static partition of resources in each DPA in the cluster communicatively coupled to a host device. Each DPA has sensitive (secure) and non-sensitive (non-secure) resources. The host device and a DPA can access all resources of the DPA. Other DPAs can only access non-sensitive resources of a DPA. The partition of resources within a DPA is static and may be implemented in hardware or firmware. Resources include memory, one or more processing modules such as key generators and cryptographic modules, caches, registers, and storage.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210389993
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using dynamic partitioning of DPAs into, or out of, one or more groups of DPAs in the cluster. A host device instructs each DPA in the cluster to link, or unlink, with one or more DPAs in the cluster to establish groups of DPAs in the cluster. A DPA that is not linked to any DPA is set to a low-power mode. Once grouped, the host device and a DPA can access all resources of the DPA. DPAs in the same group as a first DPA can access non-secure resources, but not secure resources, of the first DPA. DPAs in a different group from the first DPA cannot access any resources of the first DPA. A scheduler in the host device allocates processing tasks for one application or user to a group.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210392143
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using a policy that partitions the DPAs into one or more groups of DPAs in the cluster. A host device instructs the DPAs to organize themselves into non-overlapping groups according to a policy for each DPA in the cluster. The policy indicates, for each DPA, the one or more other DPAs it is to establish a communication link with in order to implement the grouping. Once grouped, the host device and a DPA can access all resources of the DPA. DPAs in the same group as a first DPA can access non-secure resources, but not secure resources, of the first DPA. DPAs in a different group from the first DPA cannot access any resources of the first DPA. A scheduler in the host device can allocate processing tasks to any group in the cluster.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU
  • Publication number: 20210350264
    Abstract: Embodiments of the disclosure describe a method to obfuscate AI models. In one embodiment, a host communicates with a data processing (DP) accelerator to request AI training by the DP accelerator. The DP accelerator (or system) receives an AI model training request from the host, where the request includes one or more model-obfuscation kernel algorithms, one or more AI models, and/or training input data. In response to receiving the request, the system trains the one or more AI models based on the training input data. In some embodiments, the AI accelerator already has a copy of the AI model. After the AI models are trained, the system obfuscates the one or more trained AI models using the one or more model-obfuscation kernel algorithms. The system then sends the obfuscated trained AI models to the host.
    Type: Application
    Filed: May 7, 2020
    Publication date: November 11, 2021
    Inventors: Yueqiang CHENG, Hefei ZHU