Patents by Inventor Zhibin Xiao

Zhibin Xiao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240086151
    Abstract: This application describes hybrid hardware accelerators, systems, and apparatus for performing various computations in neural network applications using the same set of hardware resources. An example accelerator may include weight selectors, activation input interfaces, and a plurality of Multiplier-Accumulation (MAC) circuits organized as a plurality of MAC lanes. Each of the plurality of MAC lanes may be configured to: receive a control signal indicating whether to perform convolution or vector operations; receive one or more weights according to the control signal; receive one or more activations according to the control signal; and generate output data based on the one or more weights and the one or more activations according to the control signal and feed the output data into an output buffer. Each of the plurality of MAC lanes includes a plurality of multiplier circuits and a plurality of adder-subtractor circuits.
    Type: Application
    Filed: April 3, 2023
    Publication date: March 14, 2024
    Inventors: Xiaoqian Zhang, Zhibin Xiao, Changxu Zhang, Renjie Chen
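    The following Python sketch is a rough behavioral model, not the patented circuit, of a single MAC lane switching between the two operation modes on a control signal; the function name, shapes, and mode strings are illustrative assumptions.

    ```python
    import numpy as np

    def mac_lane(weights, activations, mode):
        """Behavioral model of one MAC lane.

        mode == "conv":   multiply-accumulate (dot product) of weights and activations.
        mode == "vector": elementwise multiply (vector operation), no accumulation.
        """
        if mode == "conv":
            # The multipliers feed an adder tree that reduces to one partial sum.
            return np.dot(weights, activations)
        elif mode == "vector":
            # The same multipliers are reused; the accumulation stage is bypassed.
            return weights * activations
        raise ValueError(f"unknown mode: {mode}")

    w = np.array([1.0, 2.0, 3.0, 4.0])
    a = np.array([0.5, 0.5, 0.5, 0.5])
    print(mac_lane(w, a, "conv"))    # 5.0 -> one accumulated output
    print(mac_lane(w, a, "vector"))  # [0.5 1.  1.5 2. ] -> elementwise outputs
    ```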
  • Patent number: 11868307
    Abstract: This application describes a hardware accelerator and a device for accelerating neural network computations. An example accelerator may include multiple cores and a central processing unit (CPU) respectively associated with double data rate (DDR) memories, a data exchange interface connecting a host device to the accelerator, and a three-layer network-on-chip (NoC) architecture. The three-layer NoC architecture includes an outer-layer NoC configured to transfer data between the host device and the DDR memories, a middle-layer NoC configured to transfer data among the plurality of cores, and an inner-layer NoC within each core that includes a cross-bar network for broadcasting weights and activations of neural networks from a global buffer of the core to a plurality of processing entity (PE) clusters within the core.
    Type: Grant
    Filed: May 15, 2023
    Date of Patent: January 9, 2024
    Assignee: Moffett International Co., Limited
    Inventors: Xiaoqian Zhang, Zhibin Xiao
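    As a purely illustrative toy model (not the patented hardware), the sketch below mimics the three NoC layers as Python data movements; all class and method names here are assumptions.

    ```python
    # Toy behavioral model of the three NoC layers; all names are assumptions.

    class Core:
        def __init__(self, n_clusters):
            self.ddr = []                           # DDR memory attached to this core
            self.global_buffer = []                 # on-chip staging buffer
            self.clusters = [[] for _ in range(n_clusters)]

        def inner_noc_broadcast(self):
            # Inner layer: cross-bar broadcast from the global buffer to every PE cluster.
            for cluster in self.clusters:
                cluster.extend(self.global_buffer)

    class Accelerator:
        def __init__(self, n_cores, n_clusters):
            self.cores = [Core(n_clusters) for _ in range(n_cores)]

        def outer_noc_host_to_ddr(self, core_id, data):
            # Outer layer: host device -> DDR of the selected core.
            self.cores[core_id].ddr.extend(data)

        def middle_noc_core_to_core(self, src, dst):
            # Middle layer: move staged data between cores.
            self.cores[dst].global_buffer.extend(self.cores[src].global_buffer)

    acc = Accelerator(n_cores=2, n_clusters=4)
    acc.outer_noc_host_to_ddr(0, ["weights", "activations"])
    acc.cores[0].global_buffer = list(acc.cores[0].ddr)   # DDR -> global buffer
    acc.middle_noc_core_to_core(src=0, dst=1)
    acc.cores[1].inner_noc_broadcast()
    print(acc.cores[1].clusters)   # all 4 PE clusters received the broadcast data
    ```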
  • Patent number: 11763150
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for balanced-weight sparse convolution processing. An exemplary method comprises: obtaining an input tensor and a plurality of filters at a layer within a neural network; segmenting the input tensor into a plurality of sub-tensors; dividing a channel dimension of each of the plurality of filters into a plurality of channel groups; pruning each of the plurality of filters so that each of the plurality of channel groups of each filter comprises a same number of non-zero weights; segmenting each of the plurality of filters into a plurality of sub-filters according to the plurality of channel groups; and assigning the plurality of sub-tensors and the plurality of sub-filters to a plurality of processors for parallel convolution processing.
    Type: Grant
    Filed: August 2, 2021
    Date of Patent: September 19, 2023
    Assignee: Moffett International Co., Limited
    Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
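    A minimal numpy sketch of the balanced pruning step described above, assuming a filter flattened along its channel dimension; the function name and the keep-largest-magnitude criterion are illustrative choices, not the patented method.

    ```python
    import numpy as np

    def balanced_prune(filt, num_groups, nonzeros_per_group):
        """Prune one filter (flattened along its channel dimension) so every
        channel group keeps exactly `nonzeros_per_group` non-zero weights."""
        channels = filt.shape[0]
        assert channels % num_groups == 0
        group_size = channels // num_groups
        pruned = np.zeros_like(filt)
        for g in range(num_groups):
            group = filt[g * group_size:(g + 1) * group_size]
            # Keep the top-k magnitudes within this group, zero out the rest.
            keep = np.argsort(np.abs(group))[-nonzeros_per_group:]
            pruned[g * group_size + keep] = group[keep]
        return pruned

    filt = np.array([0.9, -0.1, 0.3, 0.05, -0.8, 0.2, 0.02, 0.6])
    print(balanced_prune(filt, num_groups=2, nonzeros_per_group=1))
    # [ 0.9  0.   0.   0.  -0.8  0.   0.   0. ]  -- one non-zero per channel group
    ```

    Because every group ends up with the same number of non-zeros, each processor assigned a sub-filter performs the same amount of work, which is what makes the parallel convolution balanced.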
  • Publication number: 20230259758
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improving efficiency of neural network computations using adaptive tensor compute kernels. First, the adaptive tensor compute kernels may adjust their shapes according to the different shapes of the input/weight tensors when distributing the weights and input values to a processing element (PE) array for parallel processing. Depending on the shape of the tensor compute kernels, additional inter-cluster or intra-cluster adders may be needed to perform convolution computations. Second, the adaptive tensor compute kernels may support two different tensor operation modes, i.e., a 1×1 tensor operation mode and a 3×3 tensor operation mode, to cover all types of convolution computations. Third, the underlying PE array may configure each PE-internal buffer (e.g., a register file) differently to support different compression ratios and sparsity granularities of sparse neural networks.
    Type: Application
    Filed: February 16, 2022
    Publication date: August 17, 2023
    Inventors: Xiaoqian Zhang, Enxu Yan, Zhibin Xiao
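    The dispatch between the two tensor operation modes might look like the following toy numpy sketch; the direct 3×3 loop and the 1×1 channel-mixing matrix multiply are standard textbook formulations, not the patented kernels.

    ```python
    import numpy as np

    def conv2d_1x1(x, w):
        # 1x1 mode: a pointwise convolution is a channel-mixing matrix multiply
        # applied at every spatial position.  x: (H, W, Cin), w: (Cin, Cout)
        return x @ w

    def conv2d_3x3(x, w):
        # 3x3 mode: direct convolution, valid padding. x: (H, W, Cin), w: (3, 3, Cin, Cout)
        H, W, Cin = x.shape
        out = np.zeros((H - 2, W - 2, w.shape[-1]))
        for i in range(H - 2):
            for j in range(W - 2):
                patch = x[i:i + 3, j:j + 3, :]            # (3, 3, Cin)
                out[i, j] = np.tensordot(patch, w, axes=3)
        return out

    def adaptive_conv(x, w):
        # Dispatch on the kernel shape, mimicking the two tensor operation modes.
        if w.ndim == 2:
            return conv2d_1x1(x, w)
        if w.shape[:2] == (3, 3):
            return conv2d_3x3(x, w)
        raise NotImplementedError("other shapes would be mapped onto these two modes")

    x = np.random.rand(5, 5, 8)
    print(adaptive_conv(x, np.random.rand(8, 16)).shape)        # (5, 5, 16)
    print(adaptive_conv(x, np.random.rand(3, 3, 8, 16)).shape)  # (3, 3, 16)
    ```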
  • Patent number: 11726746
    Abstract: This application describes hybrid hardware accelerators, systems, and apparatus for performing various computations in neural network applications using the same set of hardware resources. An example accelerator may include weight selectors, activation input interfaces, and a plurality of Multiplier-Accumulation (MAC) circuits organized as a plurality of MAC lanes. Each of the plurality of MAC lanes may be configured to: receive a control signal indicating whether to perform convolution or vector operations; receive one or more weights according to the control signal; receive one or more activations according to the control signal; and generate output data based on the one or more weights and the one or more activations according to the control signal and feed the output data into an output buffer. Each of the plurality of MAC lanes includes a plurality of multiplier circuits and a plurality of adder-subtractor circuits.
    Type: Grant
    Filed: September 14, 2022
    Date of Patent: August 15, 2023
    Assignee: Moffett International Co., Limited
    Inventors: Xiaoqian Zhang, Zhibin Xiao, Changxu Zhang, Renjie Chen
  • Publication number: 20230111362
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for parallelizing convolution processing. An exemplary method comprises: segmenting an input tensor into a plurality of sub-tensors and a plurality of filters into a plurality of sub-filter groups; respectively assigning a plurality of combinations of the sub-tensors and the sub-filter groups to a plurality of processors; storing, by each of the plurality of processors, nonzero values of the sub-tensor and the sub-filter group in the assigned combination as index-value pairs; performing in parallel, by the plurality of processors and for a plurality of iterations, multiply-and-accumulate (MAC) operations based on the index-value pairs to obtain a plurality of outputs, where the index-value pairs of the sub-filter groups are rotated among the plurality of processors across the plurality of iterations; and aggregating the plurality of outputs as an output tensor.
    Type: Application
    Filed: December 12, 2022
    Publication date: April 13, 2023
    Inventors: Enxu Yan, Yong Lu, Wei Wang, Zhibin Xiao, Jiachao Liu, Hengchang Xiong
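    The rotation schedule can be illustrated with a small Python sketch (sequential here, parallel on real hardware); the names and the one-step rotation are assumptions.

    ```python
    # Each processor keeps its sub-tensor in place while the sub-filter groups
    # rotate among processors, so after P iterations every sub-tensor has been
    # paired with every sub-filter group without re-reading it from memory.

    def rotate_schedule(sub_tensors, sub_filter_groups):
        P = len(sub_tensors)
        pairs = []
        filters = list(sub_filter_groups)
        for it in range(P):
            # All P processors work in parallel in a real device; here we loop.
            for p in range(P):
                pairs.append((sub_tensors[p], filters[p], it))
            filters = filters[1:] + filters[:1]   # rotate the groups one step
        return pairs

    for tensor, group, it in rotate_schedule(["T0", "T1", "T2"], ["F0", "F1", "F2"]):
        print(f"iteration {it}: {tensor} x {group}")
    # every (Ti, Fj) combination appears exactly once across the 3 iterations
    ```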
  • Patent number: 11586601
    Abstract: The present disclosure relates to a method and an apparatus for representation of a sparse matrix in a neural network. In some embodiments, an exemplary operation unit includes a buffer for storing a representation of a sparse matrix in a neural network, a sparse engine communicatively coupled with the buffer, and a processing array communicatively coupled with the sparse engine. The sparse engine includes circuitry to: read the representation of the sparse matrix from the buffer, the representation comprising a first level bitmap, a second level bitmap, and an element array; decompress the first level bitmap to determine whether a block of the sparse matrix comprises a non-zero element; and in response to the block comprising a non-zero element, decompress the second level bitmap using the element array to obtain the block of the sparse matrix. The processing array includes circuitry to execute the neural network with the sparse matrix.
    Type: Grant
    Filed: February 5, 2020
    Date of Patent: February 21, 2023
    Assignee: Alibaba Group Holding Limited
    Inventors: Zhibin Xiao, Xiaoxin Fan, Minghai Qin
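    A minimal numpy sketch of decompressing such a two-level representation, assuming row-major blocks and Python bit lists standing in for packed hardware bitmaps; the names and layout are illustrative.

    ```python
    import numpy as np

    def decompress(first_bitmap, second_bitmaps, elements, block_shape, grid_shape):
        """Rebuild a sparse matrix from a two-level bitmap representation.

        first_bitmap:   one bit per block; 0 means the block is all zeros.
        second_bitmaps: per non-zero block, one bit per element inside the block.
        elements:       the non-zero values, in block order then row-major order.
        """
        bh, bw = block_shape
        gh, gw = grid_shape
        matrix = np.zeros((gh * bh, gw * bw))
        e = 0                                   # cursor into the element array
        s = 0                                   # cursor into the second-level bitmaps
        for b, present in enumerate(first_bitmap):
            if not present:
                continue                        # level 1 says: whole block is zero
            block = np.zeros(bh * bw)
            for i, bit in enumerate(second_bitmaps[s]):
                if bit:                         # level 2 says: this element is non-zero
                    block[i] = elements[e]
                    e += 1
            r, c = divmod(b, gw)
            matrix[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw] = block.reshape(bh, bw)
            s += 1
        return matrix

    # 2x2 grid of 2x2 blocks; only block 0 and block 3 contain non-zeros.
    m = decompress(
        first_bitmap=[1, 0, 0, 1],
        second_bitmaps=[[1, 0, 0, 1], [0, 1, 1, 0]],
        elements=[5.0, 6.0, 7.0, 8.0],
        block_shape=(2, 2), grid_shape=(2, 2),
    )
    print(m)
    ```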
  • Patent number: 11568021
    Abstract: Vector-vector multiplication or matrix-matrix multiplication computation on computing systems can include computing a first portion of a vector-vector multiplication product based on a most-significant-bit set of a first vector and a most-significant-bit set of a second vector, and determining if the first portion of the vector-vector multiplication product is less than a threshold. If the first portion of the vector-vector multiplication product is not less than the threshold, a remaining portion of the vector-vector multiplication product can be computed, and a rectified linear vector-vector multiplication product can be determined for the sum of the first portion of the vector-vector multiplication product and the remaining portion of the vector-vector multiplication product.
    Type: Grant
    Filed: February 21, 2020
    Date of Patent: January 31, 2023
    Assignee: Alibaba Group Holding Limited
    Inventors: Minghai Qin, Zhibin Xiao, Chunsheng Liu
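    The early-termination idea can be sketched in a few lines of Python for integer vectors; the bit split, function name, and zero threshold are assumptions, not the patented datapath.

    ```python
    import numpy as np

    def relu_dot_msb_first(a, b, low_bits=4, threshold=0):
        """Approximate relu(dot(a, b)) by computing the most-significant-bit
        portion of the product first, and skipping the remaining work when it
        is already below the threshold (the result would be rectified to zero)."""
        a, b = np.asarray(a, dtype=np.int64), np.asarray(b, dtype=np.int64)
        a_hi, b_hi = (a >> low_bits) << low_bits, (b >> low_bits) << low_bits
        a_lo, b_lo = a - a_hi, b - b_hi

        first_portion = int(a_hi @ b_hi)         # MSB-set x MSB-set partial product
        if first_portion < threshold:
            return 0                             # predicted negative: early exit
        # Otherwise finish the remaining cross terms and rectify the exact sum.
        remaining = int(a_hi @ b_lo + a_lo @ b_hi + a_lo @ b_lo)
        return max(first_portion + remaining, 0)

    a = [100, -120, 37, -5]
    b = [90, 110, -64, 3]
    print(relu_dot_msb_first(a, b))   # early-exits with 0: MSB partial sum is negative
    print(max(np.dot(a, b), 0))       # exact reference value for comparison
    ```

    The payoff is that activations feeding a ReLU which would be clamped to zero anyway never pay for the full-precision multiplication.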
  • Publication number: 20230021511
    Abstract: Disclosed are systems and methods that determine whether instances of data (e.g., forward activations, backward derivatives of activations) that are used to train deep neural networks are to be stored on-chip or off-chip. The disclosed systems and methods are also used to prune the data (discard or delete selected instances of data). A system includes a hierarchical arrangement of on-chip and off-chip memories, and also includes a hierarchical arrangement of data selector devices that are used to decide whether to discard data and where in the system the data is to be discarded.
    Type: Application
    Filed: October 4, 2022
    Publication date: January 26, 2023
    Inventors: Minghai Qin, Chunsheng Liu, Zhibin Xiao, Tianchan Guan, Yuan Gao
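    A toy Python decision sketch of the idea (the patent describes hardware selector devices, not software); the budget, threshold, and two-level policy are illustrative assumptions.

    ```python
    import numpy as np

    ON_CHIP_BUDGET = 2          # how many tensors fit on-chip (assumed)
    KEEP_THRESHOLD = 0.1        # magnitude below which a tensor is pruned (assumed)

    on_chip, off_chip = [], []

    def select(tensor):
        """First-level selector: prune small tensors; second-level: spill when full."""
        if np.abs(tensor).mean() < KEEP_THRESHOLD:
            return "discarded"              # pruned: never stored anywhere
        if len(on_chip) < ON_CHIP_BUDGET:
            on_chip.append(tensor)
            return "on-chip"
        off_chip.append(tensor)             # demoted down the memory hierarchy
        return "off-chip"

    for scale in (0.01, 1.0, 2.0, 3.0):
        print(select(np.full(4, scale)))
    # -> discarded, on-chip, on-chip, off-chip
    ```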
  • Patent number: 11455222
    Abstract: Systems and methods are provided for testing many-core processors consisting of processing element cores. The systems and methods can include grouping the processing elements according to the dataflow of the many-core processor. Each group can include a processing element that only receives inputs from other processing elements in the group. After grouping the processing elements, test information can be provided in parallel to each group. The test information can be configured to ensure a desired degree of test coverage for the processing element that only receives inputs from other processing elements in the group. Each group can perform testing operations in parallel to generate test results. The test results can be read out of each group. The processing elements can then be regrouped according to the dataflow of the many-core processor and the testing can be repeated to achieve a target test coverage.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: September 27, 2022
    Assignee: Alibaba Group Holding Limited
    Inventors: Chunsheng Meon Liu, Arjun Chaudhuri, Zhibin Xiao
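    A small Python sketch of the group-test-regroup loop, with a linear chain of PEs and a shift-based regrouping standing in for real dataflow analysis; all names are assumptions.

    ```python
    def make_groups(num_pes, group_size, offset):
        # Rotate the PE indices, then cut them into fixed-size groups.
        pes = list(range(num_pes))
        shifted = pes[offset:] + pes[:offset]
        return [shifted[i:i + group_size] for i in range(0, num_pes, group_size)]

    def run_test_round(groups):
        # Every group is driven with test patterns in parallel on real hardware.
        return {tuple(g): "pass" for g in groups}

    covered = set()
    for offset in (0, 1):                    # regroup and repeat the test
        groups = make_groups(num_pes=8, group_size=2, offset=offset)
        results = run_test_round(groups)
        covered.update(pe for g in groups for pe in g)
    print(sorted(covered))                   # all 8 PEs were exercised
    ```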
  • Publication number: 20220268019
    Abstract: The present disclosure relates to a notched steel beam and a floor slab structure for a flange embedded floor slab, and a construction method. The notched steel beam comprises a web (1), wherein an upper flange (2) and a lower flange (3) are respectively arranged on the upper end and the lower end of the web (1). The flange embedded floor slab comprises four rectangularly distributed floor slab stand columns (7); a steel beam (8) is arranged between adjacent floor slab stand columns (7); laminated slab bottom slabs (9) are arranged between the two symmetrically distributed steel beams (8); floor slab reinforcing steel bars (10) are arranged above the laminated slab bottom slabs (9); a concrete layer (11) is arranged on the floor slab reinforcing steel bars (10); and the steel beam (8) is the notched steel beam of the flange embedded floor slab.
    Type: Application
    Filed: January 6, 2022
    Publication date: August 25, 2022
    Applicant: The Architectural Design & Research Institute of Zhejiang University Co., Ltd.
    Inventors: Quanbiao Xu, Benyue Li, Mingshan Zhang, Jiawei Zhou, Zhibin Xiao, Shunfeng Gong, Liang Xia, Kepeng Chen, Jiayin Yang, Yuxuan Wang
  • Publication number: 20220228381
    Abstract: The present disclosure provides a combined prefabricated reinforced concrete stair mold and a splicing method. The combined prefabricated reinforced concrete stair mold comprises a bottom mold platform (1), wherein an upper platform module (2), a tread module (3) and a lower platform module (4), which are spliced with one another, are sequentially arranged on the upper surface of the bottom mold platform (1), and corner modules (5) are arranged between the upper platform module (2) and the tread module (3) and between the tread module (3) and the lower platform module (4). The tread module (3) comprises a tread upper surface mold (6) and a tread lower surface mold (7); the tread lower surface mold (7) comprises a plurality of mutually spliced combined templates (8), and the tread upper surface mold (6) comprises a plurality of mutually spliced tread splicing pieces (9).
    Type: Application
    Filed: January 5, 2022
    Publication date: July 21, 2022
    Applicants: Zhejiang University, The Architectural Design & Research Institute of Zhejiang University Co., Ltd.
    Inventors: Benyue Li, Quanbiao Xu, Mingshan Zhang, Zhibin Xiao, Jiayin Yang, Liang Xia, Kepeng Chen, Tao Hong, Minwei Chen
  • Patent number: 11366690
    Abstract: A method and an apparatus for scheduling commands in a virtual computing environment include picking a command. It is determined whether the command is a synchronization command or a conditional command. A synchronization command is an independent command. A conditional command is a dependent command that depends on a synchronization command. In response to the command being determined to be a synchronization command, a waiting queue is enabled for the command, the waiting queue storing conditional commands dependent on a running synchronization command. The command is dispatched to a processing engine.
    Type: Grant
    Filed: December 2, 2019
    Date of Patent: June 21, 2022
    Assignee: Alibaba Group Holding Limited
    Inventors: Zhibin Xiao, Chunsheng Liu, Yuan Xie
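    A minimal Python sketch of this scheduling policy, with a print standing in for dispatch to a processing engine; the class and method names are illustrative assumptions.

    ```python
    from collections import deque

    class Scheduler:
        def __init__(self):
            self.waiting = {}                     # sync command -> queue of dependents

        def submit(self, cmd, depends_on=None):
            if depends_on is None:                # synchronization command: independent
                self.waiting[cmd] = deque()       # enable a waiting queue for it
                self.dispatch(cmd)
            else:                                 # conditional command: park it
                self.waiting[depends_on].append(cmd)

        def dispatch(self, cmd):
            print(f"dispatch {cmd} to processing engine")

        def complete(self, sync_cmd):
            # The sync command finished: drain and dispatch its waiting queue.
            for dependent in self.waiting.pop(sync_cmd):
                self.dispatch(dependent)

    s = Scheduler()
    s.submit("sync_A")                        # dispatched immediately
    s.submit("cond_B", depends_on="sync_A")   # parked in sync_A's waiting queue
    s.submit("cond_C", depends_on="sync_A")
    s.complete("sync_A")                      # cond_B and cond_C dispatched now
    ```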
  • Publication number: 20220147826
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for convolution with workload-balanced activation sparsity are described. An exemplary method comprises: assigning an input tensor and a weight tensor at a convolution layer into a plurality of processors to perform Multiply-Accumulate (MAC) operations in parallel based on the input tensor and the weight tensor; obtaining a plurality of output values based on results of the MAC operations; constructing one or more banks of output values based on the plurality of output values; for each of the banks, performing a top-K sorting on the one or more output values in the bank to obtain K output values; pruning each of the banks by setting the one or more output values other than the obtained K output values in the each bank as zeros; and constructing an output tensor of the convolution layer based on the pruned banks.
    Type: Application
    Filed: November 6, 2020
    Publication date: May 12, 2022
    Inventors: Zhibin Xiao, Enxu Yan, Yong Lu, Wei Wang
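    A minimal numpy sketch of the per-bank top-K pruning, assuming banks are fixed-size slices of the flattened outputs and that values are kept by magnitude; the names are illustrative.

    ```python
    import numpy as np

    def topk_bank_prune(outputs, bank_size, k):
        """Group outputs into fixed-size banks and keep only the K largest-
        magnitude values per bank, zeroing the rest, so every bank carries
        the same workload (exactly K non-zeros)."""
        flat = outputs.reshape(-1, bank_size)
        pruned = np.zeros_like(flat)
        for i, bank in enumerate(flat):
            keep = np.argsort(np.abs(bank))[-k:]    # top-K sort within the bank
            pruned[i, keep] = bank[keep]
        return pruned.reshape(outputs.shape)

    y = np.array([3.0, -0.2, 0.1, -4.0, 0.5, 2.0, -0.1, 0.3])
    print(topk_bank_prune(y, bank_size=4, k=2))   # exactly 2 non-zeros per bank
    ```

    Fixing K non-zeros per bank is what makes the resulting activation sparsity workload-balanced: every downstream processor receives the same number of non-zero operands.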
  • Publication number: 20210406686
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for balanced-weight sparse convolution processing. An exemplary method comprises: obtaining an input tensor and a plurality of filters at a layer within a neural network; segmenting the input tensor into a plurality of sub-tensors; dividing a channel dimension of each of the plurality of filters into a plurality of channel groups; pruning each of the plurality of filters so that each of the plurality of channel groups of each filter comprises a same number of non-zero weights; segmenting each of the plurality of filters into a plurality of sub-filters according to the plurality of channel groups; and assigning the plurality of sub-tensors and the plurality of sub-filters to a plurality of processors for parallel convolution processing.
    Type: Application
    Filed: August 2, 2021
    Publication date: December 30, 2021
    Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
  • Patent number: 11200497
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for knowledge-preserving sparse pruning on neural networks are described. An exemplary method includes obtaining a pre-trained machine learning model trained based on a plurality of general-purpose training data; training a task-specific machine learning model by tuning the pre-trained machine learning model based on a plurality of task-specific training data corresponding to a task; constructing a student network based on the task-specific machine learning model; simultaneously performing (1) knowledge distillation from the trained task-specific machine learning model as a teacher network to the student network and (2) network pruning on the student network; and obtaining the trained student network for serving the task.
    Type: Grant
    Filed: March 16, 2021
    Date of Patent: December 14, 2021
    Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
    Inventors: Enxu Yan, Dongkuan Xu, Zhibin Xiao
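    A heavily simplified numpy sketch of the combined objective, with linear models and a mean-squared distillation loss standing in for the real teacher/student networks; every hyperparameter here is an assumption.

    ```python
    import numpy as np

    # The student is trained to match the teacher's outputs (distillation)
    # while a magnitude mask is reapplied every step (pruning), so the two
    # happen simultaneously rather than one after the other.

    rng = np.random.default_rng(0)
    X = rng.normal(size=(256, 16))
    W_teacher = rng.normal(size=(16, 4))          # the tuned task-specific model
    W_student = rng.normal(size=(16, 4))
    keep_ratio, lr = 0.25, 0.01

    for step in range(200):
        teacher_out = X @ W_teacher
        student_out = X @ W_student
        # Distillation: gradient of the squared error to the teacher's outputs.
        grad = 2 * X.T @ (student_out - teacher_out) / len(X)
        W_student -= lr * grad
        # Pruning: keep only the largest-magnitude fraction of student weights.
        k = int(W_student.size * keep_ratio)
        threshold = np.sort(np.abs(W_student), axis=None)[-k]
        W_student *= (np.abs(W_student) >= threshold)

    print("sparsity:", np.mean(W_student == 0))   # roughly 75% of weights are zero
    print("distill mse:", np.mean((X @ W_student - X @ W_teacher) ** 2))
    ```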
  • Patent number: 11144823
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for hierarchical weight-sparse convolution processing are described. An exemplary method comprises: obtaining an input tensor and a filter at a convolution layer of a neural network; segmenting the filter into a plurality of sub-filters; generating a hierarchical bit representation of the filter representing a plurality of non-zero weights in the filter, wherein the hierarchical bit representation comprises a first layer, the first layer comprising a plurality of bits respectively corresponding to the plurality of sub-filters in the filter, each of the plurality of bits indicating whether the corresponding sub-filter includes at least one non-zero weight; and performing multiply-and-accumulate (MAC) operations based on the hierarchical bit representation of the filter and the input tensor.
    Type: Grant
    Filed: April 5, 2021
    Date of Patent: October 12, 2021
    Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
    Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
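    A minimal numpy sketch of building the first layer of such a bit representation for a 2-D filter; the sub-filter tiling and names are illustrative assumptions.

    ```python
    import numpy as np

    def first_layer_bitmap(filt, sub_filter_shape):
        """Build the first layer of the hierarchical bit representation: one bit
        per sub-filter, set when that sub-filter has at least one non-zero weight.
        MAC hardware can then skip whole sub-filters whose bit is 0."""
        sh, sw = sub_filter_shape
        H, W = filt.shape
        bits = []
        for i in range(0, H, sh):
            for j in range(0, W, sw):
                bits.append(int(np.any(filt[i:i + sh, j:j + sw])))
        return bits

    filt = np.array([[0, 0, 1, 0],
                     [0, 0, 0, 2],
                     [0, 0, 0, 0],
                     [0, 0, 0, 0]])
    print(first_layer_bitmap(filt, (2, 2)))   # [0, 1, 0, 0]: one non-zero sub-filter
    ```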
  • Publication number: 20210303426
    Abstract: Systems and methods are provided for testing many-core processors consisting of processing element cores. The systems and methods can include grouping the processing elements according to the dataflow of the many-core processor. Each group can include a processing element that only receives inputs from other processing elements in the group. After grouping the processing elements, test information can be provided in parallel to each group. The test information can be configured to ensure a desired degree of test coverage for the processing element that only receives inputs from other processing elements in the group. Each group can perform testing operations in parallel to generate test results. The test results can be read out of each group. The processing elements can then be regrouped according to the dataflow of the many-core processor and the testing can be repeated to achieve a target test coverage.
    Type: Application
    Filed: March 30, 2020
    Publication date: September 30, 2021
    Inventors: Chunsheng Meon Liu, Arjun Chaudhuri, Zhibin Xiao
  • Patent number: 11113601
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for balanced-weight sparse convolution processing.
    Type: Grant
    Filed: June 30, 2020
    Date of Patent: September 7, 2021
    Assignee: MOFFETT TECHNOLOGIES CO., LIMITED
    Inventors: Zhibin Xiao, Enxu Yan, Wei Wang, Yong Lu
  • Publication number: 20210263992
    Abstract: Vector-vector multiplication or matrix-matrix multiplication computation on computing systems can include computing a first portion of a vector-vector multiplication product based on a most-significant-bit set of a first vector and a most-significant-bit set of a second vector, and determining if the first portion of the vector-vector multiplication product is less than a threshold. If the first portion of the vector-vector multiplication product is not less than the threshold, a remaining portion of the vector-vector multiplication product can be computed, and a rectified linear vector-vector multiplication product can be determined for the sum of the first portion of the vector-vector multiplication product and the remaining portion of the vector-vector multiplication product.
    Type: Application
    Filed: February 21, 2020
    Publication date: August 26, 2021
    Inventors: Minghai Qin, Zhibin Xiao, Chunsheng Liu