Patents by Inventor Jian OUYANG

Jian OUYANG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200218821
    Abstract: According to one embodiment, a system establishes a secure connection between a host system and a data processing (DP) accelerator over a bus, the secure connection including one or more data channels. The system transmits a first instruction from the host system to the DP accelerator over a command channel, the first instruction requesting the DP accelerator to perform a data preparation operation. The system receives, from the DP accelerator over one of the data channels, a first request to read first data from a first memory location of the host system. In response to the request, the system transmits the first data to the DP accelerator over the data channel, where the first data is utilized for a computation or a configuration operation. The system transmits a second instruction from the host system to the DP accelerator over the command channel to perform the computation or the configuration operation.
    Type: Application
    Filed: January 24, 2020
    Publication date: July 9, 2020
    Inventors: Yong LIU, Yueqiang CHENG, Jian OUYANG, Tao WEI
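The command-channel/data-channel exchange described in this abstract can be sketched as a toy message model. The `Host` and `Accelerator` classes and the `"prepare"`/`"compute"` instruction names below are invented for illustration; this is not the patented protocol, only the shape of the exchange.

```python
# Hypothetical message-level model of the exchange: instructions travel over a
# command channel, while reads are served over a data channel (names invented).
class Host:
    def __init__(self, memory):
        self.memory = memory          # host memory: address -> data

    def handle_read(self, address):
        return self.memory[address]   # serve the accelerator's read request

class Accelerator:
    def __init__(self, host):
        self.host, self.received = host, {}

    def on_command(self, instruction, address=None):
        if instruction == "prepare":
            # the prepare instruction triggers a data-channel read from the host
            self.received[address] = self.host.handle_read(address)
        elif instruction == "compute":
            return sum(self.received.values())   # stand-in for the computation
```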
  • Publication number: 20200159461
    Abstract: Embodiments of the present disclosure provide a data accessing method, a device and a storage medium. The method includes: obtaining a first accessing request and a second accessing request for a storage device; loading first data associated with the first accessing request from a source device into a pre-allocated buffer area whose size equals that of a single physical storage block of the storage device; determining a first part of the second data when a first size of second data associated with the second accessing request is greater than or equal to a second size of the available space of the buffer area, a size of the first part being the same as the second size; and providing the first data and the first part to a target device associated with the first accessing request and the second accessing request.
    Type: Application
    Filed: November 20, 2019
    Publication date: May 21, 2020
    Inventors: Zihao LIANG, Jian OUYANG
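The buffer-packing step in the abstract above can be sketched as follows; `fill_buffer`, `BLOCK_SIZE`, and the list-based "data" are all invented for illustration and assume one block-sized buffer shared by the two requests.

```python
# Hypothetical sketch: the first data goes into the buffer whole; if the second
# data is at least as large as the remaining space, only a part sized to that
# space ("the second size") is taken, and the rest is left for a later round.
BLOCK_SIZE = 8  # assumed size of one physical storage block

def fill_buffer(first_data, second_data, block_size=BLOCK_SIZE):
    """Pack first_data plus a leading part of second_data into one buffer."""
    buffer = list(first_data)                 # first data goes in whole
    available = block_size - len(buffer)      # remaining space in the buffer
    if len(second_data) >= available:
        first_part = second_data[:available]  # part sized to the available space
    else:
        first_part = second_data              # fits entirely
    buffer.extend(first_part)
    return buffer, second_data[len(first_part):]
```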
  • Patent number: 10607668
    Abstract: The present application discloses a data processing method and apparatus. A specific embodiment of the method includes: preprocessing received to-be-processed input data; obtaining a storage address of configuration parameters of the to-be-processed input data based on a result of the preprocessing and a result obtained by linearly fitting an activation function, the configuration parameters being preset according to curve characteristics of the activation function; acquiring the configuration parameters of the to-be-processed input data according to the storage address; and processing the result of the preprocessing of the to-be-processed input data based on the configuration parameters of the to-be-processed input data and a preset circuit structure, to obtain a processing result.
    Type: Grant
    Filed: September 30, 2016
    Date of Patent: March 31, 2020
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Jian Ouyang, Wei Qi, Yong Wang
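The linear fitting of an activation function described above can be sketched in software. The example assumes tanh as the activation; `SEGMENTS` stands in for the preset "configuration parameters", and the 8-segment, 0.5-step layout is an invented choice, not taken from the patent.

```python
import math

# Hypothetical piecewise-linear fit of tanh: per-segment (slope, intercept)
# pairs are precomputed from the curve, then looked up at evaluation time.
STEP = 0.5
SEGMENTS = []  # (x_lo, slope, intercept) per segment
for i in range(8):
    x0, x1 = i * STEP, (i + 1) * STEP
    slope = (math.tanh(x1) - math.tanh(x0)) / STEP
    SEGMENTS.append((x0, slope, math.tanh(x0) - slope * x0))

def approx_tanh(x):
    """Pick the segment for |x| and evaluate slope*x + intercept on it."""
    sign = -1.0 if x < 0 else 1.0
    a = min(abs(x), 8 * STEP - 1e-9)            # clamp into the fitted range
    _, slope, intercept = SEGMENTS[int(a / STEP)]
    return sign * (slope * a + intercept)
```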
  • Publication number: 20200050557
    Abstract: Disclosed are an apparatus for data processing, an artificial intelligence chip, and an electronic device. The apparatus for data processing includes: at least one input memory, at least one data conveying component, at least one multiplexed arbitration component, and at least one output memory. The input memory is connected to the data conveying component, the data conveying component is connected to the multiplexed arbitration component, and the multiplexed arbitration component is connected to the output memory.
    Type: Application
    Filed: July 9, 2019
    Publication date: February 13, 2020
    Inventors: Peng Wu, Jian Ouyang, Canghai Gu, Wei Qi, Ningyi Xu
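The datapath above (input memory, conveying component, multiplexed arbitration component, output memory) can be modelled with queues. The round-robin policy and the `arbitrate` name are assumptions for illustration; the patent does not specify the arbitration scheme here.

```python
from collections import deque

# Hypothetical model: each conveying component feeds the multiplexed
# arbitration component, which picks sources round-robin and writes the
# winning items into the output memory in order.
def arbitrate(input_queues):
    output = []
    while any(input_queues):
        for q in input_queues:        # round-robin over the source queues
            if q:
                output.append(q.popleft())
    return output
```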
  • Publication number: 20200050481
    Abstract: Disclosed are a computing method applied to an artificial intelligence chip and the artificial intelligence chip.
    Type: Application
    Filed: July 9, 2019
    Publication date: February 13, 2020
    Inventors: Jian Ouyang, Xueliang Du, Yingnan Xu, Huimin Li
  • Publication number: 20200050456
    Abstract: Embodiments of the present disclosure relate to a method for processing information, and a processor. The processor includes an arithmetic and logic unit, a bypass unit, a queue unit, a multiplexer, and a register file. The bypass unit includes a data processing subunit; the data processing subunit is configured to acquire at least one valid processing result outputted by the arithmetic and logic unit, determine a processing result from the at least one valid processing result, output the determined processing result to the multiplexer, and output the remaining valid processing results to the queue unit; and the multiplexer is configured to sequentially output the valid processing results to the register file.
    Type: Application
    Filed: July 3, 2019
    Publication date: February 13, 2020
    Inventor: Jian Ouyang
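The bypass-and-queue behaviour above can be sketched as a simple cycle model. The one-result-per-cycle commit rate and the `commit_results` name are invented assumptions, chosen only to show how surplus ALU results wait in the queue unit.

```python
from collections import deque

# Hypothetical cycle model: each cycle the ALU may produce several valid
# results, but the multiplexer forwards only one to the register file; the
# bypass unit defers the surplus into the queue unit.
def commit_results(cycles_of_results):
    queue, committed = deque(), []
    for results in cycles_of_results:          # valid results this cycle
        queue.extend(results)                  # surplus waits in the queue
        if queue:
            committed.append(queue.popleft())  # one commit per cycle
    return committed, list(queue)
```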
  • Publication number: 20190164254
    Abstract: A processor and method for scaling an image are disclosed. A specific embodiment of the processor includes: an off-chip memory, a communication circuit, a control circuit, and an array processor, wherein: the off-chip memory is configured for storing a to-be-scaled original image; the communication circuit is configured for receiving an image scaling instruction; the control circuit is configured for executing the image scaling instruction, and sending a calculation control signal to the array processor; and the array processor is configured for calculating in parallel channel values of N channels in a target pixel using N processing elements in the array processor under the control of the calculation control signal based on a width scaling factor, a height scaling factor, and channel values of N channels in extracted pixel data. The embodiment improves the processing speed of the image scaling operation.
    Type: Application
    Filed: February 1, 2019
    Publication date: May 30, 2019
    Inventors: Yichen Tu, Jian Ouyang, Wei Qi, Yong Wang
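A software analogue of the per-channel parallel scaling above can be sketched with nearest-neighbour sampling; the `scale_image` function and the choice of nearest-neighbour interpolation are assumptions for illustration, since each of the N channel values is independent and could be handled by one processing element.

```python
# Hypothetical nearest-neighbour scaling: the N channel values of each target
# pixel are independent, so in hardware each could go to its own PE.
def scale_image(pixels, w, h, w_factor, h_factor):
    """pixels: row-major list of channel tuples; factors > 0 scale width/height."""
    out_w, out_h = int(w * w_factor), int(h * h_factor)
    out = []
    for y in range(out_h):
        sy = min(int(y / h_factor), h - 1)      # source row
        for x in range(out_w):
            sx = min(int(x / w_factor), w - 1)  # source column
            src = pixels[sy * w + sx]
            # per-channel copy: independent work across the N channels
            out.append(tuple(src[c] for c in range(len(src))))
    return out, out_w, out_h
```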
  • Publication number: 20190114202
    Abstract: The present disclosure provides a task scheduling method and apparatus of artificial intelligence heterogeneous hardware, a device and a readable medium. The method comprises: receiving a task execution request for a corresponding function sent from an API, the task execution request carrying attribute information of the task; obtaining a priority of the task according to the attribute information of the task, wherein a priority of an online service is higher than a priority of an offline task; inserting the corresponding task into a scheduling queue of a corresponding function according to the priority of the task, tasks in the scheduling queue being arranged in descending order of priority; and controlling in turn a free computing unit among a plurality of computing units of the corresponding function to execute the corresponding task, in descending order of the priorities of the tasks in the scheduling queue.
    Type: Application
    Filed: October 12, 2018
    Publication date: April 18, 2019
    Inventors: Yong WANG, Jian OUYANG, Wei QI
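The priority policy above (online service tasks outrank offline tasks, dispatch in descending priority) can be sketched with a heap. The `Scheduler` class and the FIFO tie-breaking counter are invented details, not taken from the patent.

```python
import heapq
import itertools

# Hypothetical sketch of the scheduling queue: online tasks outrank offline
# tasks; a monotonically increasing counter keeps equal-priority tasks FIFO.
ONLINE, OFFLINE = 0, 1  # lower value = higher priority

class Scheduler:
    def __init__(self):
        self._queue = []
        self._order = itertools.count()

    def submit(self, task, kind):
        heapq.heappush(self._queue, (kind, next(self._order), task))

    def dispatch(self):
        """Hand the highest-priority task to a free computing unit."""
        if not self._queue:
            return None
        _, _, task = heapq.heappop(self._queue)
        return task
```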
  • Patent number: 10261796
    Abstract: A processor and a method for executing an instruction on a processor are provided. In the method, a to-be-executed instruction is fetched, the instruction including a source address field, a destination address field, an operation type field, and an operation parameter field; in at least one execution unit, an execution unit controlled by a to-be-generated control signal according to the operation type field is determined, a source address and a destination address of data operated by the execution unit are determined according to the source address field and the destination address field, and a data amount of the data operated by the execution unit controlled by the to-be-generated control signal is determined according to the operation parameter field; the control signal is generated; and the execution unit in the at least one execution unit is controlled by using the control signal.
    Type: Grant
    Filed: November 23, 2016
    Date of Patent: April 16, 2019
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Jian Ouyang, Wei Qi, Yong Wang
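The four instruction fields named above can be illustrated with a toy decoder. The field names, the `"copy"`/`"add"` operation types, and the unit mapping are all invented for illustration; the patent does not define concrete encodings here.

```python
# Hypothetical decode of the four fields into a control signal (all names and
# the op-type -> execution-unit mapping are invented).
def decode(instr):
    """instr: dict with the four fields; returns control-signal values."""
    unit = {"copy": "dma", "add": "alu"}[instr["op_type"]]  # pick execution unit
    return {
        "unit": unit,
        "src": instr["src_addr"],
        "dst": instr["dst_addr"],
        "count": instr["op_param"],  # data amount operated by the unit
    }
```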
  • Patent number: 10189426
    Abstract: The present application discloses a method and apparatus for operating a field-programmable gate array (FPGA) board in a driverless vehicle. The method according to a specific embodiment includes: collecting driving scenario information on a driving scenario of the driverless vehicle; determining, based on the driving scenario information, a speed at which the driverless vehicle executes a computing operation in the driving scenario; comparing the speed with a speed threshold; switching a working mode of the FPGA board in the driverless vehicle executing the computing operation to reduce power consumption of the FPGA board, in response to the speed being lower than the speed threshold. This embodiment implements the adaptive adjustment of the working mode of the FPGA board, thereby reducing the overall power consumption.
    Type: Grant
    Filed: January 20, 2017
    Date of Patent: January 29, 2019
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Zhao Zhang, Jian Ouyang, Jing Wang, Peng Wu, Liang Gao, Yupeng Li
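The threshold comparison at the heart of this method is simple to state in code; the mode names and the numeric threshold below are assumptions, since the patent leaves both to the embodiment.

```python
# Hypothetical sketch of the working-mode switch: below the speed threshold
# the FPGA board drops to a low-power mode, otherwise it stays in its normal
# performance mode (mode names and threshold value are invented).
SPEED_THRESHOLD = 30.0  # assumed units

def select_mode(speed, threshold=SPEED_THRESHOLD):
    return "low-power" if speed < threshold else "performance"
```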
  • Patent number: 10140251
    Abstract: A processor and a method for executing a matrix multiplication operation on a processor. A specific implementation of the processor includes a data bus and an array processor having k processing units. The data bus is configured to sequentially read n columns of row vectors from an M×N multiplicand matrix and input same to each processing unit in the array processor, read an n×k submatrix from an N×K multiplier matrix and input each column vector of the submatrix to a corresponding processing unit in the array processor, and output a result obtained by each processing unit after executing a multiplication operation. Each processing unit in the array processor is configured to execute in parallel a vector multiplication operation on the input row and column vectors. Each processing unit includes a Wallace tree multiplier having n multipliers and n−1 adders. This implementation improves the processing efficiency of a matrix multiplication operation.
    Type: Grant
    Filed: May 9, 2017
    Date of Patent: November 27, 2018
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Ni Zhou, Wei Qi, Yong Wang, Jian Ouyang
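The blocked data movement above can be modelled in software: an n-column slice of a row of A is broadcast, while each of the k processing units holds one column of the matching n×k slice of B and accumulates a dot product. The `blocked_matmul` function is an invented software analogue, not the hardware design itself.

```python
# Hypothetical software model of the blocked multiply: for each n-row slice of
# B, an n-column row vector of A is "broadcast", and each output column plays
# the role of one processing unit accumulating a dot product.
def blocked_matmul(A, B, n):
    M, N, K = len(A), len(B), len(B[0])
    C = [[0] * K for _ in range(M)]
    for j0 in range(0, N, n):                 # walk n-row slices of B
        for i in range(M):
            row = A[i][j0:j0 + n]             # n-column row vector of A
            for k in range(K):                # one "PE" per output column
                col = [B[j][k] for j in range(j0, min(j0 + n, N))]
                C[i][k] += sum(a * b for a, b in zip(row, col))
    return C
```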
  • Patent number: 10127040
    Abstract: The present application discloses a processor and a method for executing an instruction on a processor. A specific implementation of the processor includes: a host interaction device, an instruction control device, an off-chip memory, an on-chip cache and an array processing device, wherein the host interaction device is configured to exchange data and instructions with a host connected with the processor, wherein the exchanged data has a granularity of a matrix; the off-chip memory is configured to store a matrix received from the host, on which a matrix operation is to be performed; and the instruction control device is configured to convert an external instruction received from the host to a series of memory access instructions and a series of computing instructions and execute the converted instructions. The implementation can improve the execution efficiency of a deep learning algorithm.
    Type: Grant
    Filed: September 28, 2016
    Date of Patent: November 13, 2018
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Wei Qi, Jian Ouyang, Yong Wang
  • Publication number: 20180129933
    Abstract: The present application discloses a method and apparatus for processing a data sequence. A specific implementation of the method includes: receiving an inputted to-be-processed data sequence; copying a weight matrix in a recurrent neural network model to an embedded block random access memory (RAM) of a field-programmable gate array (FPGA); processing sequentially each piece of to-be-processed data in the to-be-processed data sequence by using an activation function in the recurrent neural network model and the weight matrix stored in the embedded block RAM; and outputting a processed data sequence corresponding to the to-be-processed data sequence. This implementation improves the data sequence processing efficiency of the recurrent neural network model.
    Type: Application
    Filed: June 9, 2017
    Publication date: May 10, 2018
    Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Yong Wang, Jian Ouyang, Wei Qi, Sizhong Li
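The key idea above is that the weight matrix is copied once into fast on-chip memory and then reused for every element of the sequence. A minimal software analogue, with scalar weights and the `TinyRNN` name invented for brevity:

```python
import math

# Hypothetical analogue of the FPGA scheme: the weights are "loaded" once at
# construction (standing in for the block-RAM copy) and reused per element.
class TinyRNN:
    def __init__(self, w, u, activation=math.tanh):
        self.w, self.u, self.act = w, u, activation  # scalar weights for brevity

    def run(self, sequence):
        h, out = 0.0, []
        for x in sequence:
            h = self.act(self.w * x + self.u * h)  # reuse the cached weights
            out.append(h)
        return out
```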
  • Publication number: 20180124023
    Abstract: The present application discloses a method, system and apparatus for storing a website private key plaintext. A specific implementation of the method includes: receiving a public key sent from a terminal configured to perform encryption and decryption, wherein the public key is generated at random by the terminal; encrypting a website private key plaintext by using the public key to generate a website private key ciphertext, wherein the website private key plaintext is pre-acquired; and sending the website private key ciphertext to the terminal, so that the terminal decrypts the website private key ciphertext by using the private key to generate the website private key plaintext and store the website private key plaintext in the terminal. This implementation improves the security of storage of the website private key plaintext.
    Type: Application
    Filed: June 9, 2017
    Publication date: May 3, 2018
    Inventors: Wei QI, Jian OUYANG, Yong WANG, Yichen TU, Sijie YANG
  • Publication number: 20180121789
    Abstract: The present application discloses a data processing method and apparatus. A specific implementation of the method includes: receiving floating point data sent from an electronic device; converting the received floating point data into fixed point data according to a data length and a value range of the received floating point data; performing calculation on the obtained fixed point data according to a preset algorithm to obtain result data in a fixed point form; and converting the obtained result data in the fixed point form into result data in a floating point form and sending the result data in the floating point form to the electronic device. This implementation improves the data processing efficiency.
    Type: Application
    Filed: June 9, 2017
    Publication date: May 3, 2018
    Inventors: Jian OUYANG, Wei QI, Yong WANG, Lin LIU
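The float-to-fixed round trip described above can be sketched with integer scaling; the Q-format choice (`frac_bits = 16`) and the function names are assumptions for illustration.

```python
# Hypothetical fixed-point round trip: scale floats into integers, compute in
# the integer domain, then scale back (frac_bits is an assumed format choice).
def to_fixed(x, frac_bits=16):
    return round(x * (1 << frac_bits))

def to_float(q, frac_bits=16):
    return q / (1 << frac_bits)

def fixed_dot(xs, ys, frac_bits=16):
    """Dot product computed entirely on fixed-point values."""
    qx = [to_fixed(v, frac_bits) for v in xs]
    qy = [to_fixed(v, frac_bits) for v in ys]
    acc = sum(a * b for a, b in zip(qx, qy))   # product has 2*frac_bits fraction
    return to_float(acc, 2 * frac_bits)
```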
  • Publication number: 20180107630
    Abstract: A processor and a method for executing a matrix multiplication operation on a processor. A specific implementation of the processor includes a data bus and an array processor having k processing units. The data bus is configured to sequentially read n columns of row vectors from an M×N multiplicand matrix and input same to each processing unit in the array processor, read an n×k submatrix from an N×K multiplier matrix and input each column vector of the submatrix to a corresponding processing unit in the array processor, and output a result obtained by each processing unit after executing a multiplication operation. Each processing unit in the array processor is configured to execute in parallel a vector multiplication operation on the input row and column vectors. Each processing unit includes a Wallace tree multiplier having n multipliers and n-1 adders. This implementation improves the processing efficiency of a matrix multiplication operation.
    Type: Application
    Filed: May 9, 2017
    Publication date: April 19, 2018
    Inventors: Ni Zhou, Wei Qi, Yong Wang, Jian Ouyang
  • Publication number: 20180072251
    Abstract: The present application discloses a method and apparatus for operating a field-programmable gate array (FPGA) board in a driverless vehicle. The method according to a specific embodiment includes: collecting driving scenario information on a driving scenario of the driverless vehicle; determining, based on the driving scenario information, a speed at which the driverless vehicle executes a computing operation in the driving scenario; comparing the speed with a speed threshold; switching a working mode of the FPGA board in the driverless vehicle executing the computing operation to reduce power consumption of the FPGA board, in response to the speed being lower than the speed threshold. This embodiment implements the adaptive adjustment of the working mode of the FPGA board, thereby reducing the overall power consumption.
    Type: Application
    Filed: January 20, 2017
    Publication date: March 15, 2018
    Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Zhao ZHANG, Jian OUYANG, Jing WANG, Peng WU, Liang GAO, Yupeng LI
  • Patent number: 9912349
    Abstract: The present disclosure provides a method, an apparatus and a computer readable storage medium for processing a floating point number matrix. In embodiments of the present disclosure, the minimum value of the floating point number model matrix and the maximum value of the floating point number model matrix are obtained according to a floating point number model matrix to be compressed, and then, compression processing is performed for the floating point number model matrix to obtain the fixed point number model matrix according to the bit width, the minimum value of the floating point number model matrix and the maximum value of the floating point number model matrix. The compression processing is performed for the floating point number model matrix of the deep learning model by a fixed point method, to obtain the fixed point number model matrix and reduce the storage space and amount of operation of the deep learning model.
    Type: Grant
    Filed: June 20, 2017
    Date of Patent: March 6, 2018
    Assignee: Beijing Baidu Netcom Science And Technology Co., Ltd.
    Inventors: Jian Ouyang, Ni Zhou, Yong Wang, Wei Qi
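The min/max-based compression above corresponds to a linear quantization scheme; the `quantize`/`dequantize` functions below are an invented software analogue, assuming unsigned integers of the given bit width.

```python
# Hypothetical min/max quantization of a float matrix: the matrix minimum and
# maximum define a linear scale into [0, 2^bits - 1] (names invented).
def quantize(matrix, bits=8):
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    scale = (hi - lo) / ((1 << bits) - 1) or 1.0   # guard against hi == lo
    q = [[round((v - lo) / scale) for v in row] for row in matrix]
    return q, lo, scale

def dequantize(q, lo, scale):
    return [[lo + v * scale for v in row] for row in q]
```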
  • Publication number: 20180052685
    Abstract: The present application discloses a processor and a method for executing an instruction on a processor. The method includes: fetching a to-be-executed instruction, the instruction comprising a source address field, a destination address field, an operation type field, and an operation parameter field; determining, in at least one execution unit, an execution unit controlled by a to-be-generated control signal according to the operation type field, determining a source address and a destination address of data operated by the execution unit controlled by the to-be-generated control signal according to the source address field and the destination address field, and determining a data amount of the data operated by the execution unit controlled by the to-be-generated control signal according to the operation parameter field; generating the control signal; and controlling, by using the control signal, the execution unit in the at least one execution unit to execute an operation.
    Type: Application
    Filed: November 23, 2016
    Publication date: February 22, 2018
    Inventors: Jian Ouyang, Wei Qi, Yong Wang
  • Publication number: 20180032336
    Abstract: The present application discloses a processor and a method for executing an instruction on a processor. A specific implementation of the processor includes: a host interaction device, an instruction control device, an off-chip memory, an on-chip cache and an array processing device, wherein the host interaction device is configured to exchange data and instructions with a host connected with the processor, wherein the exchanged data has a granularity of a matrix; the off-chip memory is configured to store a matrix received from the host, on which a matrix operation is to be performed; and the instruction control device is configured to convert an external instruction received from the host to a series of memory access instructions and a series of computing instructions and execute the converted instructions. The implementation can improve the execution efficiency of a deep learning algorithm.
    Type: Application
    Filed: September 28, 2016
    Publication date: February 1, 2018
    Inventors: Wei QI, Jian OUYANG, Yong WANG