Patents Assigned to Kunlunxin Technology (Beijing) Company Limited
  • Publication number: 20240134532
    Abstract: An electronic device, a method of determining a memory access efficiency for a memory, and a storage medium are provided, which relate to the field of computer technology, and in particular to the fields of chip, memory, and processor technologies. The electronic device includes: a memory configured to store executable instructions and data to be processed; and a processor configured to execute the executable instructions so as to at least: read a data block to be tested from the memory; determine memory access description information according to size information of the data block to be tested; and determine, according to the memory access description information and channel description information, a memory access efficiency of the processor in reading the data block to be tested, where the channel description information describes a plurality of channels through which the processor reads the data to be processed from the memory.
    Type: Application
    Filed: November 27, 2023
    Publication date: April 25, 2024
    Applicant: KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Daheng GAO, Liang SHAN, Yupeng LI, Chen FENG
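The abstract's idea of deriving an efficiency figure from a block's size information and a channel description can be illustrated with a toy model. The transaction size, channel count, and efficiency formula below are assumptions for illustration only, not the patented method.

```python
# Toy model: estimate how efficiently a processor reads a data block
# from memory, given the block's size and a description of the channels
# used to read it. All constants here are illustrative assumptions.

TRANSACTION_BYTES = 64   # assumed per-channel transaction granularity
NUM_CHANNELS = 4         # assumed number of memory channels

def access_efficiency(block_size: int) -> float:
    """Useful bytes divided by bytes actually moved over the channels."""
    # Each channel moves whole transactions, so a block is padded up to
    # a multiple of NUM_CHANNELS * TRANSACTION_BYTES.
    stripe = NUM_CHANNELS * TRANSACTION_BYTES
    transferred = -(-block_size // stripe) * stripe  # ceiling division
    return block_size / transferred

print(access_efficiency(256))  # block exactly fills one stripe -> 1.0
print(access_efficiency(300))  # padding lowers the efficiency
```

The second call shows why size information matters: a block that does not fill a whole stripe still pays for the full transfer.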
  • Publication number: 20240126610
    Abstract: An apparatus and a method of processing data, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence, and in particular to the fields of chip and multi-thread parallel technologies. The apparatus includes: a first target storage unit; and a processor configured to: determine an initial number of threads according to a data amount of target data and a capacity of the first target storage unit in response to determining that the data amount is less than or equal to the capacity of the first target storage unit, where the target data includes input data to be processed, weight data to be processed, and output data; and determine a first number of executable tasks according to the initial number of threads in response to determining that the initial number of threads is greater than or equal to a predetermined number of threads.
    Type: Application
    Filed: November 28, 2023
    Publication date: April 18, 2024
    Applicant: Kunlunxin Technology (Beijing) Company Limited
    Inventors: Runze LI, Shiyu ZHU, Baoyu ZHOU
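The two conditions in the abstract (data fits in storage; initial thread count reaches a threshold) can be sketched as a small planning function. The specific formulas for the initial thread count and the task count are hypothetical stand-ins, since the abstract does not give them.

```python
# Illustrative sketch of the scheduling rule in the abstract: pick an
# initial thread count from the data amount and the storage capacity,
# then derive a task count once the thread count reaches a threshold.
# The formulas are assumptions for illustration only.

PREDETERMINED_THREADS = 4

def plan(data_amount: int, capacity: int, threads_max: int = 32):
    if data_amount > capacity:
        return None  # the abstract only covers data that fits in storage
    # Assumption: more free capacity per unit of data allows more threads.
    initial_threads = min(threads_max, max(1, capacity // max(1, data_amount)))
    if initial_threads >= PREDETERMINED_THREADS:
        # Assumption: one executable task per thread.
        return initial_threads, initial_threads
    return initial_threads, None

print(plan(data_amount=1024, capacity=16384))  # (16, 16)
```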
  • Patent number: 11847501
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using a policy that determines a static partition of resources in each DPA in the cluster communicatively coupled to a host device. Each DPA has sensitive (secure) and non-sensitive (non-secure) resources. The host device and a DPA can access all resources of the DPA. Other DPAs can only access non-sensitive resources of a DPA. The partition of resources within a DPA is static and may be implemented in hardware or firmware. Resources include memory, one or more processing modules such as key generators and cryptographic modules, caches, registers, and storage.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: December 19, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
  • Patent number: 11822964
    Abstract: Embodiments of the disclosure discloses a method and system for a virtualization environment for a data processing (DP) accelerator. In one embodiment, a data processing (DP) accelerator includes one or more statically partitioned resources and one or more virtual functions (VFs) each associated with one of the one or more statically partitioned resources. A virtual machine (VM) of a host is assigned one of the one or more VFs to access the statically partitioned resources associated with the assigned VF. The VM has no access to the rest of the one or more statically partitioned resources of the DP accelerator.
    Type: Grant
    Filed: June 3, 2020
    Date of Patent: November 21, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Zhibiao Zhao
  • Publication number: 20230367548
    Abstract: A computing method is provided. The computing method includes: obtaining a plurality of first fixed point numbers and a plurality of first exponents that correspond to a plurality of first floating point numbers of a first vector, and a plurality of second fixed point numbers and a plurality of second exponents that correspond to a plurality of second floating point numbers of a second vector; obtaining a fixed point product of each of the plurality of first fixed point numbers and the second fixed point number corresponding to that first fixed point number, together with a corresponding fixed point product exponent; obtaining a fixed point inner product calculation result of the first vector and the second vector; and obtaining, based on the fixed point inner product calculation result, a floating point inner product calculation result in a floating point data format corresponding to the fixed point inner product calculation result.
    Type: Application
    Filed: May 19, 2023
    Publication date: November 16, 2023
    Applicant: KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Peng WU, Jian OUYANG
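The scheme the abstract outlines — split each float into a fixed-point mantissa and an exponent, multiply and accumulate in integer arithmetic, then convert back — can be sketched in a few lines. The 16-bit mantissa width and the exponent-alignment strategy are assumptions, not details from the patent.

```python
# Minimal sketch of a fixed-point inner product over floating point
# inputs: each float x becomes (m, e) with x ~= m * 2**e, products are
# accumulated as integers after aligning exponents, and the sum is
# converted back to floating point at the end.

import math

MANT_BITS = 16  # assumed mantissa width

def to_fixed(x: float):
    """Return (fixed_point_mantissa, exponent) with x ~= m * 2**e."""
    if x == 0.0:
        return 0, 0
    m, e = math.frexp(x)               # x = m * 2**e, 0.5 <= |m| < 1
    return round(m * (1 << MANT_BITS)), e - MANT_BITS

def fixed_dot(xs, ys):
    pairs = [(to_fixed(a), to_fixed(b)) for a, b in zip(xs, ys)]
    # Each product's exponent is the sum of the factors' exponents;
    # align every product to the smallest one, then sum as integers.
    exps = [ea + eb for (_, ea), (_, eb) in pairs]
    base = min(exps)
    acc = sum((ma * mb) << (e - base)
              for ((ma, ea), (mb, eb)), e in zip(pairs, exps))
    return math.ldexp(acc, base)       # back to floating point

print(fixed_dot([1.5, 2.0], [4.0, 0.25]))  # 6.5
```

Because the mantissas here are exact for these inputs, the result matches the floating point dot product exactly; in general the mantissa width bounds the rounding error.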
  • Patent number: 11799651
    Abstract: According to one embodiment, a DP accelerator includes one or more execution units (EUs) configured to perform data processing operations in response to an instruction received from a host system coupled over a bus. The DP accelerator includes a security unit (SU) configured to establish and maintain a secure channel with the host system to exchange commands and data associated with the data processing operations, and a time unit (TU) coupled to the security unit to provide timestamp services. The security unit includes a secure storage area to store a private root key associated with the DP accelerator, where the private root key is utilized for authentication. The SU also includes a random number generator to generate a random number, and a cryptographic engine to perform cryptographic operations on data exchanged with the host system over the bus using a session key derived based on the random number.
    Type: Grant
    Filed: January 4, 2019
    Date of Patent: October 24, 2023
    Assignees: BAIDU USA LLC, BAIDU.COM TIMES TECHNOLOGY (BEIJING) CO., LTD., KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yong Liu, Yueqiang Cheng, Jian Ouyang, Tao Wei
  • Patent number: 11782722
    Abstract: A complex computing device, a complex computing method, an artificial intelligence chip, and an electronic apparatus are provided. An input interface receives complex computing instructions and arbitrates each complex computing instruction to a corresponding computing component according to the computing type in the respective complex computing instruction. Each computing component is connected to the input interface, acquires a source operand from a complex computing instruction to perform complex computing, and generates a computing result instruction to feed back to an output interface. The output interface arbitrates the computing result in each computing result instruction to the corresponding instruction source according to the instruction source identifier in each computing result instruction.
    Type: Grant
    Filed: January 14, 2021
    Date of Patent: October 10, 2023
    Assignees: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD., KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Baofu Zhao, Xueliang Du, Kang An, Yingnan Xu, Chao Tang
  • Patent number: 11775347
    Abstract: In one embodiment, a computer-implemented method performed by a data processing (DP) accelerator includes receiving, at the DP accelerator, first data from a host processor representing an artificial intelligence (AI) model that has been previously trained; receiving, at the DP accelerator, a request from the host processor to implant a watermark in the AI model; and implanting, by the DP accelerator, the watermark within the AI model. The DP accelerator then transmits second data representing the AI model having the watermark implanted therein to the host processor. In an embodiment, the method further includes extracting, at the DP accelerator, a watermark algorithm identifier (ID) from the request to implant a watermark; and generating the watermark using a watermark algorithm identified by the watermark algorithm ID.
    Type: Grant
    Filed: October 10, 2019
    Date of Patent: October 3, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Yong Liu
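The dispatch step described in the abstract — extract an algorithm ID from the request, then generate the watermark with the identified algorithm — maps naturally onto a registry of functions. The registry, the two hash-based watermark functions, and the append-as-marker implanting step are all hypothetical illustrations, not the patented algorithms.

```python
# Sketch of watermark-algorithm dispatch: the request carries a
# watermark algorithm ID, and the accelerator generates the watermark
# with the algorithm that ID identifies. Everything below is an
# illustrative assumption.

import hashlib

def checksum_watermark(model_bytes: bytes) -> bytes:
    return hashlib.sha256(model_bytes).digest()[:8]

def signature_watermark(model_bytes: bytes) -> bytes:
    return hashlib.blake2b(model_bytes, digest_size=8).digest()

WATERMARK_ALGORITHMS = {          # hypothetical algorithm-ID registry
    1: checksum_watermark,
    2: signature_watermark,
}

def implant_watermark(model_bytes: bytes, algorithm_id: int) -> bytes:
    algo = WATERMARK_ALGORITHMS[algorithm_id]   # ID from the request
    return model_bytes + algo(model_bytes)      # appended as a marker

marked = implant_watermark(b"model-weights", algorithm_id=1)
print(len(marked) - len(b"model-weights"))  # 8 extra watermark bytes
```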
  • Patent number: 11775692
    Abstract: In one embodiment, a computer-implemented method of a data processing (DP) accelerator encrypting or decrypting input data can include receiving, from a host device, a command, the input data, and a kernel. The kernel can be an encryption kernel, or a decryption kernel, and the DP accelerator need not know which kernel it has received. The DP accelerator runs the received kernel. In response to the DP accelerator receiving the command, the DP accelerator performs encrypting of the input data using the kernel, if the received kernel is an encryption kernel, otherwise, decrypting the input data using the kernel. The encrypted, or decrypted, input data is then provided to the host device.
    Type: Grant
    Filed: October 10, 2019
    Date of Patent: October 3, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yong Liu, Yueqiang Cheng
  • Patent number: 11748108
    Abstract: Example embodiments of the present application provide an instruction executing method and apparatus, an electronic device, and a computer-readable storage medium that may be applied in the field of artificial intelligence. The instruction executing method may include: executing an instruction sequence that includes memory instructions and non-memory instructions, the instructions in the sequence starting to be executed in order; determining that execution of a first memory instruction needs to be completed before a second memory instruction starts to be executed, the second memory instruction being the next memory instruction following the first memory instruction in the instruction sequence; and executing the non-memory instructions between the first memory instruction and the second memory instruction, without executing the second memory instruction, during a cycle of executing the first memory instruction.
    Type: Grant
    Filed: March 24, 2021
    Date of Patent: September 5, 2023
    Assignees: Beijing Baidu Netcom Science and Technology Co., LTD., Kunlunxin Technology (Beijing) Company Limited
    Inventors: Yingnan Xu, Jian Ouyang, Xueliang Du, Kang An
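The ordering rule in the abstract — later non-memory instructions may run while a memory instruction is in flight, but the next memory instruction must wait for it — can be modeled with a toy scheduler. The string-based instruction encoding is an assumption for illustration.

```python
# Toy scheduler for the rule in the abstract: non-memory work overlaps
# a pending memory instruction; the next memory instruction waits for
# the pending one to complete. 'M...' = memory, anything else = not.

def schedule(instructions):
    """Return (op, instr) pairs describing the issue order."""
    order, pending_mem = [], None
    for instr in instructions:
        if instr.startswith("M"):
            if pending_mem is not None:
                order.append(("wait-for", pending_mem))
            pending_mem = instr
            order.append(("issue", instr))
        else:
            # Runs during the cycle of the pending memory instruction.
            order.append(("overlap" if pending_mem else "issue", instr))
    return order

for step in schedule(["M1", "A1", "A2", "M2", "A3"]):
    print(step)
```

Here `A1` and `A2` overlap `M1`'s cycle, while `M2` is held back until `M1` completes.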
  • Patent number: 11740940
    Abstract: In one embodiment, a computer-implemented method performed by a data processing (DP) accelerator, includes receiving, at the DP accelerator, an artificial intelligence (AI) model that has been previously trained and a set of input data from a host processor; receiving, at the DP accelerator, a watermark kernel from the host processor; executing the watermark kernel within the DP accelerator on the AI model and the set of input data. The watermark kernel, when executed, is configured to: generate a new watermark by inheriting an existing watermark from a data object of the set of input data or the AI model, perform an AI inference using the AI model based on the input data to generate output data, and implant the new watermark within the output data. The DP accelerator then transmits output data having the new watermark implanted therein to the host processor.
    Type: Grant
    Filed: October 10, 2019
    Date of Patent: August 29, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Yong Liu
  • Patent number: 11728996
    Abstract: Embodiments disclose systems and methods to broadcast a message among virtual DP accelerators (DPAs). In one embodiment, in response to receiving a broadcast instruction from an application via a communication switch, the broadcast instruction designating one or more virtual DP accelerators of a plurality of virtual DP accelerators to receive a broadcast message, a system encrypts the broadcast message based on a broadcast session key for a broadcast communication session. The system determines one or more public keys of one or more security key pairs, each associated with one of the designated virtual DP accelerators. The system encrypts a copy of the broadcast session key with each of the determined public keys. The system then broadcasts the encrypted broadcast message and the one or more encrypted broadcast session keys to the virtual DP accelerators.
    Type: Grant
    Filed: December 10, 2019
    Date of Patent: August 15, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yong Liu, Yueqiang Cheng
  • Patent number: 11709712
    Abstract: In one embodiment, a computer-implemented method performed by a data processing (DP) accelerator, includes receiving, at the DP accelerator, first data representing a set of training data from a host processor; receiving, at the DP accelerator, a watermark kernel from the host processor; and executing the watermark kernel within the DP accelerator on an artificial intelligence (AI) model. The watermark kernel, when executed, is configured to: generate a new watermark by inheriting an existing watermark from a data object of the set of training data, train the AI model using the set of training data, and implant the new watermark within the AI model during training of the AI model. The DP accelerator then transmits second data representing the trained AI model having the new watermark implanted therein to the host processor.
    Type: Grant
    Filed: October 10, 2019
    Date of Patent: July 25, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Yong Liu
  • Patent number: 11704390
    Abstract: In one embodiment, a computer-implemented method of a data processing (DP) accelerator obtaining a watermark of a watermark-enabled artificial intelligence (AI) model includes receiving, by the DP accelerator, input data that causes the watermark-enabled AI model to extract the watermark from the watermark-enabled AI model; and providing the watermark of the watermark-enabled AI model to the host device. The DP accelerator can receive the model from the host device. The DP accelerator can further receive a command to digitally sign the watermark and call a security unit of the DP accelerator to digitally sign the watermark.
    Type: Grant
    Filed: October 10, 2019
    Date of Patent: July 18, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yong Liu, Yueqiang Cheng
  • Patent number: 11687629
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs). The cluster of accelerators may include DPAs of a third party accelerator that may not be trusted. To ensure data protection in the cluster, a first DPA that receives a request from a second DPA to access a resource of the first DPA authenticates the second DPA. If the second DPA passes authentication, the second DPA is permitted to access non-sensitive resources of the first DPA, otherwise the second DPA is not permitted access to any resources of the first DPA and the first DPA breaks a communication link with the second DPA. Authentication is premised on a shared secret function between DPAs and a random number generated by the first DPA. The shared secret function is updateable by, e.g., a patch from a manufacturer of the DPA.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: June 27, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
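The handshake the abstract premises on "a shared secret function between DPAs and a random number generated by the first DPA" is a classic challenge–response pattern. In the sketch below, HMAC-SHA256 with a shared key stands in for the (updateable) shared secret function; that choice and the key value are assumptions, not the patented scheme.

```python
# Challenge-response sketch: the first DPA sends a random number (the
# challenge); the second DPA must answer with the shared secret
# function applied to it. HMAC-SHA256 stands in for that function.

import hashlib
import hmac
import secrets

SHARED_SECRET = b"factory-provisioned-secret"   # hypothetical value

def secret_function(nonce: bytes) -> bytes:
    return hmac.new(SHARED_SECRET, nonce, hashlib.sha256).digest()

def authenticate(respond) -> bool:
    """First DPA's side: issue a challenge, then verify the response."""
    nonce = secrets.token_bytes(16)             # the random number
    return hmac.compare_digest(respond(nonce), secret_function(nonce))

# A genuine second DPA knows the secret function; an impostor does not.
print(authenticate(secret_function))             # True
print(authenticate(lambda nonce: b"\x00" * 32))  # False
```

On failure, per the abstract, the first DPA would refuse all resource access and break the communication link.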
  • Patent number: 11687376
    Abstract: Systems and methods are disclosed for data protection in a cluster of data processing accelerators (DPAs) using dynamic partitioning of DPAs into, or out of, one or more groups of DPAs in the cluster. A host device instructs each DPA in the cluster to link, or unlink, with one or more DPAs in the cluster to establish groups of DPAs in the cluster. A DPA that is not linked to any DPA is set to a low-power mode. Once grouped, the host device and a DPA can access all resources of the DPA. DPAs in the same group as a first DPA can access non-secure resources, but not secure resources, of the first DPA. DPAs in a different group from the first DPA cannot access any resources of the first DPA. A scheduler in the host device allocates processing tasks for one application or user to a group.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: June 27, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
  • Patent number: 11657332
    Abstract: A method to transfer an artificial intelligence (AI) model includes identifying a plurality of layers of the AI model, the plurality of layers organized in a first ordered list. The method further includes randomizing the plurality of layers by reorganizing the first ordered list into a second ordered list, and transferring the plurality of layers of the AI model to a data processing accelerator in an order defined by the second ordered list.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: May 23, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Hefei Zhu
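The transfer scheme in the abstract — reorganize the first ordered list of layers into a random second ordered list, then transfer in that order — is a few lines of code. Tagging each layer with its original index so the receiver can reassemble the model is an added assumption the abstract does not spell out.

```python
# Sketch of randomized layer transfer: shuffle the order in which the
# model's layers are sent, so the wire order reveals nothing about the
# model's structure. The index tag is an illustrative assumption.

import random

def transfer_randomized(layers, send):
    order = list(range(len(layers)))   # first ordered list (indices)
    random.shuffle(order)              # second ordered list
    for idx in order:
        send(idx, layers[idx])         # index lets the receiver reorder

received = {}
transfer_randomized(["conv1", "conv2", "fc"],
                    lambda i, layer: received.__setitem__(i, layer))
print([received[i] for i in sorted(received)])  # original order restored
```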
  • Patent number: 11650754
    Abstract: Embodiments of the present disclosure provide a data accessing method, a device, and a storage medium. The method includes: obtaining a first accessing request and a second accessing request for a storage device; loading first data associated with the first accessing request from a source device into a pre-allocated buffer area whose size is the same as the size of a single physical storage block of the storage device; determining a first part of second data associated with the second accessing request when a first size of the second data is greater than or equal to a second size of the available space of the buffer area, the size of the first part being the same as the second size; and providing the first data and the first part to a target device associated with the first accessing request and the second accessing request.
    Type: Grant
    Filed: November 20, 2019
    Date of Patent: May 16, 2023
    Assignee: KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Zihao Liang, Jian Ouyang
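The splitting rule in the abstract — fill a block-sized buffer with the first request's data, then take just enough of the second request's data to fill the remaining space — can be sketched directly. The block size and byte payloads below are illustrative assumptions.

```python
# Sketch of the buffer-packing rule: the buffer matches one physical
# storage block; the second request's data is split so its first part
# exactly fills the buffer's available space.

BLOCK_SIZE = 8                     # assumed physical storage block size

def pack(first_data: bytes, second_data: bytes):
    buffer = bytearray(first_data)     # first data goes in whole
    available = BLOCK_SIZE - len(buffer)
    if len(second_data) >= available:
        first_part = second_data[:available]   # same size as the space
        buffer += first_part
        remainder = second_data[available:]    # left for the next block
    else:
        buffer += second_data
        remainder = b""
    return bytes(buffer), remainder

filled, rest = pack(b"AAAAA", b"BBBBBB")
print(filled, rest)  # b'AAAAABBB' b'BBB'
```

The filled buffer is what gets provided to the target device; the remainder would seed the next block-sized transfer.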
  • Patent number: 11645116
    Abstract: In one embodiment, a computer-implemented method performed by a data processing (DP) accelerator includes receiving, at the DP accelerator, from a host processor, first data representing a previously trained artificial intelligence (AI) model and a set of input data; receiving, at the DP accelerator, a watermark kernel from the host processor; and executing the watermark kernel within the DP accelerator on the AI model. The watermark kernel, when executed, is configured to: perform inference operations of the AI model based on the input data to generate output data, and implant the watermark within the output data. The DP accelerator then transmits the output data having the watermark implanted therein to the host processor.
    Type: Grant
    Filed: October 10, 2019
    Date of Patent: May 9, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Yong Liu
  • Patent number: 11645586
    Abstract: In one embodiment, a computer-implemented method performed by a data processing (DP) accelerator includes receiving, at the DP accelerator, first data representing a set of training data from a host processor and performing training of an artificial intelligence (AI) model based on the set of training data within the DP accelerator. The method further includes implanting, by the DP accelerator, a watermark within the trained AI model and transmitting second data representing the trained AI model having the watermark implanted therein to the host processor. In an embodiment, the method further includes receiving a pre-trained AI model; and performing training of the pre-trained AI model based on the set of training data within the DP accelerator.
    Type: Grant
    Filed: October 10, 2019
    Date of Patent: May 9, 2023
    Assignees: BAIDU USA LLC, KUNLUNXIN TECHNOLOGY (BEIJING) COMPANY LIMITED
    Inventors: Yueqiang Cheng, Yong Liu