Patents by Inventor Xiaoqian Zhang

Xiaoqian Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12292852
    Abstract: This application describes a hardware accelerator and a device for accelerating neural network computations. An example accelerator may include multiple cores and a central processing unit (CPU) respectively associated with DDRs, a data exchange interface connecting a host device to the accelerator, and a three-layer NoC architecture. The three-layer NoC architecture includes an outer-layer NoC configured to transfer data between the host device and the DDRs, a middle-layer NoC configured to transfer data among the cores, and an inner-layer NoC within each core that includes a cross-bar network for broadcasting weights and activations of neural networks from a global buffer of the core to a plurality of processing entity (PE) clusters within the core.
    Type: Grant
    Filed: October 23, 2023
    Date of Patent: May 6, 2025
    Assignee: Moffett International Co., Limited
    Inventors: Xiaoqian Zhang, Zhibin Xiao
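
The abstract above describes a three-layer NoC hierarchy: host-to-DDR, DDR-to-core, and an in-core crossbar broadcast. As a rough illustration only (class and method names are hypothetical, not taken from the patent), a toy simulator of that layering might look like:

```python
# Hypothetical sketch of the three-layer NoC hierarchy described above.
# All names and structure are illustrative, not from the patent text.

class InnerNoC:
    """Per-core crossbar: broadcasts one value to every PE cluster."""
    def __init__(self, num_pe_clusters):
        self.clusters = [[] for _ in range(num_pe_clusters)]

    def broadcast(self, value):
        for cluster in self.clusters:  # crossbar fan-out to all clusters
            cluster.append(value)

class Core:
    def __init__(self, num_pe_clusters=4):
        self.global_buffer = []
        self.inner_noc = InnerNoC(num_pe_clusters)

    def distribute(self):
        # Inner-layer NoC: global buffer -> all PE clusters.
        for value in self.global_buffer:
            self.inner_noc.broadcast(value)

class Accelerator:
    def __init__(self, num_cores=4):
        self.cores = [Core() for _ in range(num_cores)]
        self.ddrs = [[] for _ in range(num_cores)]  # one DDR per core

    def host_write(self, ddr_idx, data):
        # Outer-layer NoC: host device -> DDR.
        self.ddrs[ddr_idx].extend(data)

    def load_core(self, core_idx):
        # Middle-layer NoC: DDR -> the core's global buffer.
        self.cores[core_idx].global_buffer = list(self.ddrs[core_idx])
        self.cores[core_idx].distribute()
```

The point of the sketch is the separation of concerns: each layer moves data one level closer to the PE clusters, and only the inner layer performs the one-to-many broadcast.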
  • Publication number: 20250028673
    Abstract: This application describes a network-on-chip system on a hardware accelerator for accelerating neural network computations. An example NoC system in the NN accelerator may include interconnected routers with routing control circuits and cores respectively coupled to the routers. The cores are arranged into a matrix. Each row of cores is connected with a first uni-directional ring-shaped data link, and every two adjacent data links are in opposite directions. Each column of cores is connected with a second uni-directional ring-shaped data link, and every two adjacent data links are in opposite directions. In a given router of the plurality of routers, the routing control circuit is configured to: receive a data package; convert physical addresses of the given router and a target router into logical addresses; determine a routing port of the given router based on the logical addresses; and output the data package through the routing port.
    Type: Application
    Filed: October 7, 2024
    Publication date: January 23, 2025
    Inventors: Xiaoqian ZHANG, Zhibin XIAO
  • Patent number: 12117961
    Abstract: This application describes a network-on-chip system on a hardware accelerator for accelerating neural network computations. An example NoC system in the NN accelerator may include interconnected routers with routing control circuits and cores respectively coupled to the routers. The cores are arranged into a matrix. Each row of cores is connected with a first uni-directional ring-shaped data link, and every two adjacent data links are in opposite directions. Each column of cores is connected with a second uni-directional ring-shaped data link, and every two adjacent data links are in opposite directions. In a given router of the plurality of routers, the routing control circuit is configured to: receive a data package; convert physical addresses of the given router and a target router into logical addresses; determine a routing port of the given router based on the logical addresses; and output the data package through the routing port.
    Type: Grant
    Filed: May 15, 2023
    Date of Patent: October 15, 2024
    Assignee: Moffett International Co., Limited
    Inventors: Xiaoqian Zhang, Zhibin Xiao
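
The physical-to-logical address conversion in the abstract above addresses the fact that adjacent ring links run in opposite directions. A plausible reading (the details here are an assumption, not the patent's actual scheme) is that flipping the coordinate on reversed rings lets one routing rule serve every ring:

```python
# Illustrative sketch of logical-address routing on alternating
# uni-directional row rings. N, the flip rule, and the hop formula are
# hypothetical details, not taken from the patent.

N = 4  # cores per row (and per column)

def to_logical(row, col):
    # Odd-numbered rows circulate in the opposite direction; flipping the
    # column index makes every ring appear to flow in increasing order.
    return (row, col if row % 2 == 0 else N - 1 - col)

def hops_on_row_ring(src, dst):
    """Forward hops from src to dst on the (uni-directional) row ring."""
    (_, sc), (_, dc) = to_logical(*src), to_logical(*dst)
    return (dc - sc) % N  # travel is only possible in the ring direction
```

With the conversion in place, the routing control logic never needs to know which physical direction a given ring runs in.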
  • Publication number: 20240338339
    Abstract: This application describes a hardware accelerator and a device for accelerating neural network computations. An example accelerator may include multiple cores and a central processing unit (CPU) respectively associated with DDRs, a data exchange interface connecting a host device to the accelerator, and a three-layer NoC architecture. The three-layer NoC architecture includes an outer-layer NoC configured to transfer data between the host device and the DDRs, a middle-layer NoC configured to transfer data among the cores, and an inner-layer NoC within each core that includes a cross-bar network for broadcasting weights and activations of neural networks from a global buffer of the core to a plurality of processing entity (PE) clusters within the core.
    Type: Application
    Filed: October 23, 2023
    Publication date: October 10, 2024
    Inventors: Xiaoqian ZHANG, Zhibin XIAO
  • Publication number: 20240338338
    Abstract: This application describes a network-on-chip system on a hardware accelerator for accelerating neural network computations. An example NoC system in the NN accelerator may include interconnected routers with routing control circuits and cores respectively coupled to the routers. The cores are arranged into a matrix. Each row of cores is connected with a first uni-directional ring-shaped data link, and every two adjacent data links are in opposite directions. Each column of cores is connected with a second uni-directional ring-shaped data link, and every two adjacent data links are in opposite directions. In a given router of the plurality of routers, the routing control circuit is configured to: receive a data package; convert physical addresses of the given router and a target router into logical addresses; determine a routing port of the given router based on the logical addresses; and output the data package through the routing port.
    Type: Application
    Filed: May 15, 2023
    Publication date: October 10, 2024
    Inventors: Xiaoqian ZHANG, Zhibin XIAO
  • Patent number: 12072834
    Abstract: This application describes a hardware accelerator and a device for accelerating neural network computations. An example accelerator may include multiple cores and a central processing unit (CPU) respectively associated with DDRs, a data exchange interface connecting a host device to the accelerator, and a three-layer NoC architecture. The three-layer NoC architecture includes an outer-layer NoC configured to transfer data between the host device and the DDRs, a middle-layer NoC configured to transfer data among the cores, and an inner-layer NoC within each core that includes a cross-bar network for broadcasting weights and activations of neural networks from a global buffer of the core to a plurality of processing entity (PE) clusters within the core.
    Type: Grant
    Filed: May 15, 2023
    Date of Patent: August 27, 2024
    Assignee: Moffett International Co., Limited
    Inventors: Xiaoqian Zhang, Zhibin Xiao
  • Publication number: 20240279298
    Abstract: A glucagon analog and the medical use thereof. Specifically, the glucagon analog has significantly improved in vitro activity, excellent physical/chemical stability, and high solubility, and can be used to treat metabolic diseases such as hypoglycemia, obesity, and diabetes.
    Type: Application
    Filed: June 17, 2022
    Publication date: August 22, 2024
    Inventors: Weibing LIU, Xuchao HUANG, Xiaoqian ZHANG, Fangzhou WU, Lei WANG, Liang QU
  • Publication number: 20240264802
    Abstract: This application describes hybrid hardware accelerators, systems, and apparatus for performing various computations in neural network applications using the same set of hardware resources. An example accelerator may include weight selectors, activation input interfaces, and a plurality of Multiplier-Accumulation (MAC) circuits organized as a plurality of MAC lanes. Each of the plurality of MAC lanes may be configured to: receive a control signal indicating whether to perform convolution or vector operations; receive one or more weights according to the control signal; receive one or more activations according to the control signal; and generate output data based on the one or more weights and the one or more activations according to the control signal and feed the output data into an output buffer. Each of the plurality of MAC lanes includes a plurality of multiplier circuits and a plurality of adder-subtractor circuits.
    Type: Application
    Filed: April 17, 2024
    Publication date: August 8, 2024
    Inventors: Xiaoqian ZHANG, Zhibin XIAO, Changxu ZHANG, Renjie CHEN
  • Patent number: 12020001
    Abstract: This application describes hybrid hardware accelerators, systems, and apparatus for performing various computations in neural network applications using the same set of hardware resources. An example accelerator may include weight selectors, activation input interfaces, and a plurality of Multiplier-Accumulation (MAC) circuits organized as a plurality of MAC lanes. Each of the plurality of MAC lanes may be configured to: receive a control signal indicating whether to perform convolution or vector operations; receive one or more weights according to the control signal; receive one or more activations according to the control signal; and generate output data based on the one or more weights and the one or more activations according to the control signal and feed the output data into an output buffer. Each of the plurality of MAC lanes includes a plurality of multiplier circuits and a plurality of adder-subtractor circuits.
    Type: Grant
    Filed: April 3, 2023
    Date of Patent: June 25, 2024
    Assignee: Moffett International Co., Limited
    Inventors: Xiaoqian Zhang, Zhibin Xiao, Changxu Zhang, Renjie Chen
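
The hybrid MAC-lane idea above is that the same multipliers serve either a convolution (products reduced to a partial sum) or a vector operation, selected by the control signal. A behavioral sketch of that mode switch (mode names and behavior are illustrative assumptions, not the patent's definitions):

```python
# Hypothetical model of one MAC lane reusing its multipliers for either
# convolution or vector operations, chosen by a control signal.

def mac_lane(weights, activations, mode):
    # Both modes share the same multiplier stage.
    products = [w * a for w, a in zip(weights, activations)]
    if mode == "conv":
        # Convolution: the adder stage reduces products to one partial sum.
        return sum(products)
    if mode == "vector":
        # Vector op: element-wise products pass through without reduction.
        return products
    raise ValueError(f"unknown mode: {mode}")
```

The hardware benefit the abstract points at is that only the reduction path differs between modes, so the multiplier array is never duplicated.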
  • Publication number: 20240086151
    Abstract: This application describes hybrid hardware accelerators, systems, and apparatus for performing various computations in neural network applications using the same set of hardware resources. An example accelerator may include weight selectors, activation input interfaces, and a plurality of Multiplier-Accumulation (MAC) circuits organized as a plurality of MAC lanes. Each of the plurality of MAC lanes may be configured to: receive a control signal indicating whether to perform convolution or vector operations; receive one or more weights according to the control signal; receive one or more activations according to the control signal; and generate output data based on the one or more weights and the one or more activations according to the control signal and feed the output data into an output buffer. Each of the plurality of MAC lanes includes a plurality of multiplier circuits and a plurality of adder-subtractor circuits.
    Type: Application
    Filed: April 3, 2023
    Publication date: March 14, 2024
    Inventors: Xiaoqian ZHANG, Zhibin XIAO, Changxu ZHANG, Renjie CHEN
  • Patent number: 11868307
    Abstract: This application describes a hardware accelerator and a device for accelerating neural network computations. An example accelerator may include multiple cores and a central processing unit (CPU) respectively associated with DDRs, a data exchange interface connecting a host device to the accelerator, and a three-layer NoC architecture. The three-layer NoC architecture includes an outer-layer NoC configured to transfer data between the host device and the DDRs, a middle-layer NoC configured to transfer data among the cores, and an inner-layer NoC within each core that includes a cross-bar network for broadcasting weights and activations of neural networks from a global buffer of the core to a plurality of processing entity (PE) clusters within the core.
    Type: Grant
    Filed: May 15, 2023
    Date of Patent: January 9, 2024
    Assignee: Moffett International Co., Limited
    Inventors: Xiaoqian Zhang, Zhibin Xiao
  • Publication number: 20230407928
    Abstract: A brake dust filtering apparatus (100), comprising: a base (10) which is provided with multiple first through holes (101); a scribing sheet (20) which is provided with multiple second through holes (201) and can be slidably connected to the base (10); a filter screen (30) which is disposed on the side of the base (10) close to a brake caliper (200) and covers the first through holes (101); and a drive member (40) which is used for driving the scribing sheet (20) to slide with respect to the base (10), so as to form an off state in which each of the first through holes (101) and each of the second through holes (201) are staggered and an on state in which the multiple first through holes (101) at least partially overlap the multiple second through holes (201). Further disclosed is a vehicle.
    Type: Application
    Filed: November 23, 2020
    Publication date: December 21, 2023
    Applicant: Wuhan Lotus Cars Co., Ltd.
    Inventors: Bowen ZHENG, Xiaoqian ZHANG
  • Publication number: 20230259758
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improving efficiency of neural network computations using adaptive tensor compute kernels. First, the adaptive tensor compute kernels may adjust shapes according to the different shapes of input/weight tensors for distributing the weights and input values to a processing element (PE) array for parallel processing. Depending on the shape of the tensor compute kernels, additional inter-cluster or intra-cluster adders may be needed to perform convolution computations. Second, the adaptive tensor compute kernels may support two different tensor operation modes, i.e., 1×1 tensor operation mode and 3×3 tensor operation mode, to cover all types of convolution computations. Third, the underlying PE array may configure each PE-internal buffer (e.g., a register file) differently to support different compression ratios and sparsity granularities of sparse neural networks.
    Type: Application
    Filed: February 16, 2022
    Publication date: August 17, 2023
    Inventors: Xiaoqian ZHANG, Enxu YAN, Zhibin XIAO
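
The abstract above mentions two tensor operation modes, 1×1 and 3×3, covering all convolution types. One way that can work (purely an assumption for illustration; the patent's actual dispatch and decomposition rules may differ) is to run 1×1 kernels in the 1×1 mode and decompose larger kernels into 3×3 tiles:

```python
# Hypothetical dispatch between the two tensor operation modes named in
# the abstract. The tiling rule below is an illustrative assumption.

import math

def pick_mode(kh, kw):
    """Choose the tensor operation mode for a kh x kw convolution kernel."""
    return "1x1" if (kh, kw) == (1, 1) else "3x3"

def num_3x3_tiles(kh, kw):
    # A kh x kw kernel zero-padded up to multiples of 3 in each dimension,
    # then processed as independent 3x3 tiles whose results are summed.
    return math.ceil(kh / 3) * math.ceil(kw / 3)
```

Under this reading, a 5×5 convolution would occupy four 3×3 tile passes rather than requiring a dedicated 5×5 datapath.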
  • Patent number: 11726746
    Abstract: This application describes hybrid hardware accelerators, systems, and apparatus for performing various computations in neural network applications using the same set of hardware resources. An example accelerator may include weight selectors, activation input interfaces, and a plurality of Multiplier-Accumulation (MAC) circuits organized as a plurality of MAC lanes Each of the plurality of MAC lanes may be configured to: receive a control signal indicating whether to perform convolution or vector operations; receive one or more weights according to the control signal; receive one or more activations according to the control signal; and generate output data based on the one or more weights and the one or more input activations according to the control signal and feed the output data into an output buffer. Each of the plurality of MAC lanes includes a plurality of multiplier circuits and a plurality of adder-subtractor circuits.
    Type: Grant
    Filed: September 14, 2022
    Date of Patent: August 15, 2023
    Assignee: Moffett International Co., Limited
    Inventors: Xiaoqian Zhang, Zhibin Xiao, Changxu Zhang, Renjie Chen
  • Patent number: 11531869
    Abstract: Embodiments herein describe circuitry with improved efficiency when executing layers in a nested neural network. As mentioned above, a nested neural network has at least one split operation where a tensor generated by a first layer is transmitted to, and processed by, several branches in the neural network. Each of these branches can have several layers that have data dependencies which result in a multiply-add array sitting idly. In one embodiment, the circuitry can include a dedicated pre-pooler for performing a pre-pooling operation. Thus, the pre-pooling operation can be performed in parallel with other operations (e.g., the convolution performed by another layer). Once the multiply-add array is idle, the pre-pooling operation has already completed (or at least, has already started), which means the time the multiply-add array must wait before it can perform the next operation is reduced or eliminated.
    Type: Grant
    Filed: March 28, 2019
    Date of Patent: December 20, 2022
    Assignee: XILINX, INC.
    Inventors: Ephrem C. Wu, David Berman, Xiaoqian Zhang
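
The win described in the abstract above is a scheduling one: with a dedicated pre-pooler, the pooling branch starts while the multiply-add array is still busy with a convolution, instead of waiting for it. A toy timing model makes the effect concrete (cycle counts are made up for illustration):

```python
# Toy timing model of overlapping a pre-pooling operation with a
# convolution, per the abstract above. Durations are hypothetical.

def serial_time(conv_cycles, pool_cycles):
    # Single shared engine: pooling waits for the convolution to finish.
    return conv_cycles + pool_cycles

def overlapped_time(conv_cycles, pool_cycles):
    # Dedicated pre-pooler: both operations start on the split tensor at
    # once, so total latency is bounded by the longer of the two.
    return max(conv_cycles, pool_cycles)
```

Whenever pooling is shorter than the convolution it overlaps with, its cost disappears entirely from the critical path, which matches the abstract's "reduced or eliminated" wait.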
  • Patent number: 11429851
    Abstract: Disclosed circuits and methods involve a first register configured to store a first convolutional neural network (CNN) instruction during processing of the first CNN instruction and a second register configured to store a second CNN instruction during processing of the second CNN instruction. Each of a plurality of address generation circuits is configured to generate one or more addresses in response to an input CNN instruction. Control circuitry is configured to select one of the first CNN instruction or the second CNN instruction as input to the address generation circuits.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: August 30, 2022
    Assignee: XILINX, INC.
    Inventors: Xiaoqian Zhang, Ephrem C. Wu, David Berman
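
The two instruction registers in the abstract above act as a double buffer: while one instruction drives the address generators, the other slot can be refilled. A behavioral sketch of that selection scheme (the instruction encoding and ping-pong policy here are illustrative assumptions, not the patent's):

```python
# Hypothetical model of two instruction registers feeding address
# generation circuits, with control logic selecting the active one.

class AddressGenerator:
    def generate(self, instr):
        # An "instruction" is modeled as a (base, count, stride) tuple.
        base, count, stride = instr
        return [base + i * stride for i in range(count)]

class InstructionUnit:
    def __init__(self):
        self.regs = [None, None]  # first and second instruction registers
        self.select = 0           # control: which register drives the AGs
        self.agen = AddressGenerator()

    def load(self, slot, instr):
        self.regs[slot] = instr

    def step(self):
        addrs = self.agen.generate(self.regs[self.select])
        self.select ^= 1          # ping-pong to the other instruction
        return addrs
```

Holding both instructions in registers means the address generators never stall waiting for the next instruction to be fetched.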
  • Patent number: 11429850
    Abstract: A circuit arrangement includes an array of MAC circuits, wherein each MAC circuit includes a cache configured for storage of a plurality of kernels. The MAC circuits are configured to receive a first set of data elements of an IFM at a first rate. The MAC circuits are configured to perform first MAC operations on the first set of the data elements and a first one of the kernels associated with a first OFM depth index during a first MAC cycle, wherein a rate of MAC cycles is faster than the first rate. The MAC circuits are configured to perform second MAC operations on the first set of the data elements and a second one of the kernels associated with a second OFM depth index during a second MAC cycle that consecutively follows the first MAC cycle.
    Type: Grant
    Filed: July 19, 2018
    Date of Patent: August 30, 2022
    Assignee: XILINX, INC.
    Inventors: Xiaoqian Zhang, Ephrem C. Wu, David Berman
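
In the scheme described above, MAC cycles run faster than IFM data arrives, so each MAC circuit can apply several cached kernels (one per OFM depth index) to the same held IFM elements on consecutive cycles. A minimal behavioral sketch of that reuse pattern (the flattened-kernel representation is an illustrative simplification):

```python
# Hypothetical model of one MAC circuit applying its cached kernels to a
# single set of IFM elements across consecutive MAC cycles.

def mac_cycles_for_ifm_batch(ifm_elems, kernels):
    """Return one partial sum per OFM depth index for the same IFM data."""
    outputs = []
    for kernel in kernels:  # one cached kernel per consecutive MAC cycle
        acc = sum(x * w for x, w in zip(ifm_elems, kernel))
        outputs.append(acc)
    return outputs
```

The design choice this illustrates is bandwidth amortization: fetching the IFM once and cycling through cached kernels multiplies the useful work per byte of input traffic.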
  • Patent number: 11132296
    Abstract: The embodiments herein store tabulated values representing a linear or non-linear function in separate memory banks to reduce the size of memory used to store the tabulated values while being able to provide upper and lower values for performing linear interpolation in parallel (e.g., the same cycle). To do so, a linear interpolation system includes a first memory bank that stores the even indexed tabulated values while a second memory bank stores the odd indexed tabulated values. During each clock cycle, the first and second memory banks can output upper and lower values for linear interpolation (although which memory bank outputs the upper value and which outputs the lower value can vary). Using the upper and lower values, the linear interpolation system performs linear interpolation to approximate the value of a non-linear function that is between the upper and lower values.
    Type: Grant
    Filed: July 12, 2018
    Date of Patent: September 28, 2021
    Assignee: XILINX, INC.
    Inventors: Ephrem C. Wu, Xiaoqian Zhang
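
The bank split described above works because linear interpolation always needs two consecutive table entries, and consecutive indices are always one even and one odd, so the two reads never collide on the same bank. A sketch of the layout and lookup (indexing details are illustrative, not taken from the patent):

```python
# Sketch of the even/odd memory-bank layout for single-cycle linear
# interpolation described above. Details are illustrative assumptions.

def split_banks(table):
    """Even-indexed entries go to one bank, odd-indexed to the other."""
    return table[0::2], table[1::2]

def interp(even_bank, odd_bank, x):
    i = int(x)        # lower table index
    frac = x - i
    if i % 2 == 0:    # lower value sits in the even bank...
        lo, hi = even_bank[i // 2], odd_bank[i // 2]
    else:             # ...and the banks swap roles for an odd lower index
        lo, hi = odd_bank[i // 2], even_bank[i // 2 + 1]
    # Both reads came from different banks, so in hardware they can be
    # issued in the same cycle before this interpolation step.
    return lo + frac * (hi - lo)
```

Note how which bank supplies the upper versus the lower value alternates with the index parity, exactly as the abstract says ("which memory bank outputs the upper value ... can vary").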
  • Patent number: D1021901
    Type: Grant
    Filed: February 21, 2022
    Date of Patent: April 9, 2024
    Inventor: Xiaoqian Zhang
  • Patent number: D1039527
    Type: Grant
    Filed: July 22, 2022
    Date of Patent: August 20, 2024
    Inventor: Xiaoqian Zhang