Patents by Inventor Lingzhi Liu
Lingzhi Liu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11928446
Abstract: A method, apparatus, and a non-transitory computer-readable storage medium for generating heterogenous platform code. The method may obtain a neural network model. The neural network model may be programed to run on at least one platform. The method may also obtain an initial intermediate representation (IR) code by encoding the neural network model, and obtain a target IR code by adding decorations to the initial IR code based on a target platform. The method may also output an executable code optimized to run on the target platform by decoding the target IR code.
Type: Grant
Filed: November 11, 2021
Date of Patent: March 12, 2024
Assignee: KWAI INC.
Inventors: Zhen Peng, Yang Liu, Hanxian Huang, Yongxiong Ren, Jishen Yang, Lingzhi Liu, Xin Chen
-
Patent number: 11830480
Abstract: Systems and methods are provided for automatic speech recognition. In the method, the system obtains a padded sequence by processing a plurality of acoustic signals. The system compresses the padded sequence by reducing the size of the padded sequence to obtain a compressed sequence. The system inputs the compressed sequence into a pre-trained encoder neural network to obtain an encoded sequence and then decompresses the encoded sequence by recovering the encoded sequence to an original sequential ordering. The system inputs the encoded sequence to a decoding module to obtain recognition texts.
Type: Grant
Filed: February 17, 2021
Date of Patent: November 28, 2023
Assignee: KWAI INC.
Inventors: Yongxiong Ren, Yang Liu, Heng Liu, Lingzhi Liu
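The compress and decompress steps in this abstract can be illustrated with a minimal Python sketch. It assumes that "reducing the size" means dropping padding frames and packing the valid frames of a batch back to back; the abstract does not confirm this, and the names `compress`, `decompress`, and `PAD` are hypothetical.

```python
from typing import Any, List

PAD = None  # hypothetical padding marker, not taken from the patent


def compress(padded: List[List[Any]], lengths: List[int]) -> List[Any]:
    """Pack only the valid frames of each padded row into one flat list."""
    packed: List[Any] = []
    for row, n in zip(padded, lengths):
        packed.extend(row[:n])
    return packed


def decompress(packed: List[Any], lengths: List[int], max_len: int) -> List[List[Any]]:
    """Restore the original row ordering and re-insert the padding."""
    rows, start = [], 0
    for n in lengths:
        rows.append(packed[start:start + n] + [PAD] * (max_len - n))
        start += n
    return rows


batch = [[1, 2, PAD], [3, PAD, PAD], [4, 5, 6]]
lengths = [2, 1, 3]
packed = compress(batch, lengths)
restored = decompress(packed, lengths, max_len=3)
```

In this sketch the encoder would run on `packed`, so no work is spent on padding positions, and `decompress` recovers the per-utterance layout afterwards.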
-
Publication number: 20230362374
Abstract: A decoding method is disclosed, including: parsing a bitstream to obtain a first flag, wherein the first flag specifies whether a current coding block is required to be partitioned; when the first flag specifies that the current coding block is required to be partitioned, parsing the bitstream to obtain a second flag, wherein the second flag specifies whether the current coding block is partitioned in a horizontal direction or a vertical direction; partitioning the current coding block into four first rectangular subblocks in the horizontal direction or four second rectangular subblocks in the vertical direction; and reconstructing the current coding block based on the four first rectangular subblocks or the four second rectangular subblocks.
Type: Application
Filed: July 21, 2023
Publication date: November 9, 2023
Inventors: Changcai Lai, Xiaoran Cao, Yongbing Lin, Lingzhi Liu, Yun He
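The two-flag parsing order in this abstract can be sketched in Python as follows. The bit values (1 for "partition", 0 for "horizontal") and the reading of "horizontal" as four full-width strips are assumptions made for illustration, and `parse_partition` is a hypothetical name; the real bitstream syntax is not given in this listing.

```python
from typing import Iterable, List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, width, height)


def parse_partition(bits: Iterable[int], x: int, y: int, w: int, h: int) -> List[Block]:
    """Read the first flag, then (only if needed) the second flag, and split the block."""
    stream = iter(bits)
    if next(stream) == 0:          # first flag: block is not partitioned
        return [(x, y, w, h)]
    if next(stream) == 0:          # second flag: assumed 0 = horizontal
        step = h // 4              # four full-width strips
        return [(x, y + i * step, w, step) for i in range(4)]
    step = w // 4                  # otherwise four full-height strips
    return [(x + i * step, y, step, h) for i in range(4)]


horizontal = parse_partition([1, 0], 0, 0, 16, 16)
vertical = parse_partition([1, 1], 0, 0, 16, 16)
unsplit = parse_partition([0], 0, 0, 16, 16)
```

The second flag is read only when the first flag signals a split, which mirrors the conditional parsing described in the abstract.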
-
Patent number: 11750809
Abstract: An encoding method with multiple image block division manners is disclosed, including: determining a division manner and a division direction of an image block; dividing the image block to obtain image subblocks sequentially arranged horizontally or vertically; determining whether the image subblocks need subdivision, and if subdivision is not needed, predicting the encoding object in the frame according to the image subblocks, to obtain residual data; performing transformation, quantization, and entropy encoding for the residual data so as to obtain coded residual data; and writing the division manner of the image block, the division direction of the image block, an identifier indicating whether the image subblocks need subdivision, and the coded residual data into a bitstream. By applying the encoding method, better prediction accuracy can be achieved when the image block presents a small change of pixel value in the horizontal or vertical direction.
Type: Grant
Filed: February 16, 2022
Date of Patent: September 5, 2023
Assignees: Huawei Technologies Co., Ltd., Tsinghua University
Inventors: Changcai Lai, Xiaoran Cao, Yongbing Lin, Lingzhi Liu, Yun He
-
Patent number: 11741967
Abstract: An automatic speech recognition system and a method thereof are provided. The system includes an encoder and a decoder. The encoder comprises a plurality of encoder layers. At least one encoder layer includes a plurality of encoder sublayers fused into one or more encoder kernels. The system further comprises a first pair of ping-pong buffers communicating with the one or more encoder kernels. The decoder comprises a plurality of decoder layers. At least one decoder layer includes a plurality of decoder sublayers fused into one or more decoder kernels. The decoder receives a decoder input related to the encoder output and generates a decoder output. The decoder sends the decoder output to a beam search kernel.
Type: Grant
Filed: January 4, 2021
Date of Patent: August 29, 2023
Assignee: KWAI INC.
Inventors: Yongxiong Ren, Heng Liu, Yang Liu, Lingzhi Liu, Jie Li, Yuanyuan Zhao, Xiaorui Wang
-
Publication number: 20230153381
Abstract: A method and an apparatus for length-aware local tiling in a sparse attention module in a transformer in heterogeneous devices are provided. The method includes that a heterogeneous device including one or more GPUs: divides a transformed sparsity mask into a plurality of first tiles and obtains one or more effective first tiles from the plurality of first tiles, where each effective first tile includes at least one non-zero element; loads the one or more effective first tiles into a shared memory in the one or more GPUs and loads a plurality of elements in a first matrix corresponding to the one or more effective first tiles into the shared memory; and performs multiplication by a first sampled dense-dense matrix multiplication (SDDMM) kernel in the sparse attention module in the transformer by fetching the one or more effective first tiles and the plurality of elements from the shared memory.
Type: Application
Filed: November 17, 2021
Publication date: May 18, 2023
Applicant: KWAI INC.
Inventors: Zhendong WANG, Yongxiong REN, Yang LIU, Lingzhi LIU
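The tile-selection step in this abstract (keep only tiles that contain at least one non-zero mask element) can be sketched on the CPU in plain Python. Square tiles and the name `effective_tiles` are assumptions for illustration; the shared-memory loading and the GPU kernel itself are not modeled.

```python
from typing import List, Tuple


def effective_tiles(mask: List[List[int]], tile: int) -> List[Tuple[int, int]]:
    """Return the top-left (row, col) of every tile holding a non-zero mask entry."""
    rows, cols = len(mask), len(mask[0])
    kept = []
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            if any(
                mask[i][j]
                for i in range(r, min(r + tile, rows))
                for j in range(c, min(c + tile, cols))
            ):
                kept.append((r, c))
    return kept


mask = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
tiles = effective_tiles(mask, tile=2)
```

Only the tiles returned here would need to be fetched for the multiplication, so all-zero regions of the mask cost nothing.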
-
Publication number: 20230143291
Abstract: A method, apparatus, and a non-transitory computer-readable storage medium for generating heterogenous platform code. The method may obtain a neural network model. The neural network model may be programed to run on at least one platform. The method may also obtain an initial intermediate representation (IR) code by encoding the neural network model, and obtain a target IR code by adding decorations to the initial IR code based on a target platform. The method may also output an executable code optimized to run on the target platform by decoding the target IR code.
Type: Application
Filed: November 11, 2021
Publication date: May 11, 2023
Applicant: KWAI INC.
Inventors: Zhen PENG, Yang LIU, Hanxian HUANG, Yongxiong REN, Jishen YANG, Lingzhi LIU, Xin CHEN
-
Publication number: 20230133305
Abstract: A method and an apparatus for accelerating a transformer with a sparse attention pattern are provided. The method includes that a heterogeneous device including one or more GPUs loads a first matrix, a second matrix, and a transformed sparsity mask into a first sampled dense-dense matrix multiplication (SDDMM) kernel in a sparse attention module in the transformer and generates a first output based on the first matrix, the second matrix, and the transformed sparsity mask by the first SDDMM kernel, generates a second output by a softmax kernel in the sparse attention module based on the first output, loads the second output, a third matrix, and the transformed sparsity mask into a matrix multiplication kernel in the sparse attention module, and generates an output of the sparse attention module.
Type: Application
Filed: October 28, 2021
Publication date: May 4, 2023
Applicant: KWAI INC.
Inventors: Zhendong WANG, Yongxiong REN, Yang LIU, Lingzhi LIU
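The three-stage pipeline in this abstract (SDDMM, softmax, matrix multiplication) can be sketched as a CPU reference in plain Python. It assumes the first, second, and third matrices play the roles of query, key, and value, and it omits the usual attention scaling factor; the function names are hypothetical and no GPU kernel is modeled.

```python
import math
from typing import List, Optional

Matrix = List[List[float]]


def sddmm(q: Matrix, k: Matrix, mask: List[List[int]]) -> List[List[Optional[float]]]:
    """Compute dot(q[i], k[j]) only where the mask is non-zero; masked entries are None."""
    return [
        [sum(a * b for a, b in zip(q[i], k[j])) if mask[i][j] else None
         for j in range(len(k))]
        for i in range(len(q))
    ]


def masked_softmax(scores: List[List[Optional[float]]]) -> Matrix:
    """Row-wise softmax over the kept entries; masked entries get probability 0."""
    out = []
    for row in scores:
        kept = [s for s in row if s is not None]
        if not kept:
            out.append([0.0] * len(row))
            continue
        peak = max(kept)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - peak) if s is not None else 0.0 for s in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def sparse_attention(q: Matrix, k: Matrix, v: Matrix, mask: List[List[int]]) -> Matrix:
    """Chain the three stages; zero probabilities are skipped in the final product."""
    probs = masked_softmax(sddmm(q, k, mask))
    return [
        [sum(p * v[j][d] for j, p in enumerate(row) if p) for d in range(len(v[0]))]
        for row in probs
    ]


q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = sparse_attention(q, k, v, mask=[[1, 0]])
```

With the mask `[[1, 0]]` only the first key is attended to, so the output row equals the first row of `v`.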
-
Publication number: 20230105436
Abstract: A method and an apparatus for video processing are provided. The method includes that a decoding terminal receives a plurality of coded video frames coded using one or more generative adversarial networks (GANs), receives network parameters related to the one or more GANs, and decodes the plurality of coded video frames using GANs based on the network parameters. Further, the one or more GANs respectively implement one or more video coding functions including reference-frame coding, motion-compensated frame prediction, and residue-frame coding.
Type: Application
Filed: October 6, 2021
Publication date: April 6, 2023
Applicants: KWAI INC., SANTA CLARA UNIVERSITY
Inventors: Pengli DU, Ying LIU, Nam LING, Lingzhi LIU, Yongxiong REN, Ming Kai HSU
-
Publication number: 20230084000
Abstract: Methods and apparatuses are provided for temporal profiling for neural network quantization. The method includes: obtaining a neural network that comprises a node connected to different paths at different time periods; obtaining node outputs for the node at the different time periods; determining statistic properties of the node outputs at the different time periods; and determining activation ranges of the node outputs based on the statistic properties.
Type: Application
Filed: September 15, 2021
Publication date: March 16, 2023
Applicant: KWAI INC.
Inventors: Ming Kai HSU, Chao YANG, Yue MA, Sikai WANG, Sitong FENG, Wenhui CAO, Danqing LI, Hui ZHONG, Lingzhi LIU
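One possible reading of the profiling steps in this abstract is sketched below in Python. The abstract does not say which statistics are used or how they map to a range, so the choice here (per-period mean and standard deviation, clipped to the observed extremes, then merged across periods) is purely an assumption, as are the name `activation_range` and the parameter `k`.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def activation_range(outputs_by_period: List[List[float]], k: float = 3.0) -> Tuple[float, float]:
    """Derive one activation range from node outputs recorded per time period."""
    lows, highs = [], []
    for values in outputs_by_period:
        mu, sigma = mean(values), pstdev(values)
        # Clip the mean +/- k*sigma window to what was actually observed.
        lows.append(max(min(values), mu - k * sigma))
        highs.append(min(max(values), mu + k * sigma))
    # Merge the per-period windows so the range covers every period.
    return min(lows), max(highs)


low, high = activation_range([[0.0, 1.0], [2.0, 4.0]])
```

A quantizer could then map `[low, high]` onto its integer grid; how the patent actually combines the periods may differ from this sketch.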
-
Publication number: 20230075264
Abstract: Methods and devices are provided for an efficient general deconvolution implementation on a hardware accelerator.
Type: Application
Filed: September 7, 2021
Publication date: March 9, 2023
Applicant: KWAI INC.
Inventors: Shiya LIU, Ming Kai HSU, Quan LIN, Lingzhi LIU
-
Patent number: 11562734
Abstract: The present disclosure relates to an automatic speech recognition system and a method thereof. The system includes a conformer encoder and a pair of ping-pong buffers. The encoder includes a plurality of encoder layers sequentially executed by one or more graphic processing units. At least one encoder layer includes a first feed forward module, a multi-head self-attention module, a convolution module, and a second feed forward module. The convolution module and the multi-head self-attention module are sandwiched between the first feed forward module and the second feed forward module. The four modules respectively include a plurality of encoder sublayers fused into one or more encoder kernels. The one or more encoder kernels respectively read from one of the pair of ping-pong buffers and write into the other of the pair of ping-pong buffers.
Type: Grant
Filed: January 4, 2021
Date of Patent: January 24, 2023
Assignee: KWAI INC.
Inventors: Yongxiong Ren, Yang Liu, Heng Liu, Lingzhi Liu, Jie Li, Kaituo Xu, Xiaorui Wang
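The ping-pong buffering described in this abstract, where each kernel reads from one buffer and writes into the other, can be sketched in Python. The sketch assumes every kernel produces an output of the same length as its input; `run_kernels`, `scale2`, and `add1` are hypothetical stand-ins for fused encoder kernels.

```python
from typing import Callable, List, Sequence

Kernel = Callable[[List[float], List[float]], None]  # (source, destination)


def run_kernels(kernels: Sequence[Kernel], x: List[float]) -> List[float]:
    """Run kernels in order, alternating the roles of two preallocated buffers."""
    buffers = [list(x), [0.0] * len(x)]
    src = 0
    for kernel in kernels:
        dst = 1 - src
        kernel(buffers[src], buffers[dst])  # read from one buffer, write the other
        src = dst                           # the written buffer becomes the next input
    return buffers[src]


def scale2(src: List[float], dst: List[float]) -> None:
    for i, value in enumerate(src):
        dst[i] = 2.0 * value


def add1(src: List[float], dst: List[float]) -> None:
    for i, value in enumerate(src):
        dst[i] = value + 1.0


result = run_kernels([scale2, add1], [1.0, 2.0])
```

Because the two buffers are allocated once and only swap roles, no intermediate allocation happens between kernels, which is the usual motivation for this pattern.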
-
Publication number: 20220310068
Abstract: Methods and apparatuses for automatic speech recognition are provided. The method includes: generating a weight matrix for a layer of a plurality of layers in a neural network; dividing the weight matrix into a plurality of blocks, each block including a plurality of weights; selecting a set of blocks from the plurality of blocks for block-wise pruning by minimizing a cost function subject to a pre-determined block-wise constraint; and generating a block-wise pruned weight matrix by setting one or more weights in the set of blocks to zero. The weight matrix includes a set of weights associated with the layer, the plurality of layers includes a first layer receiving a first input associated with one or more audio feature sequences, and the plurality of layers are executed on one or more processors.
Type: Application
Filed: March 25, 2021
Publication date: September 29, 2022
Applicant: KWAI INC.
Inventors: Yongxiong REN, Bingbing LI, Yang LIU, Lingzhi LIU
-
Publication number: 20220310069
Abstract: A method and an apparatus for automatic speech recognition are provided. The method includes: generating a weight matrix for a layer of a plurality of layers in a neural network; dividing the weight matrix into a plurality of blocks, each block including a plurality of weights; selecting a pre-determined percentage of weights from at least one block for block-wise pruning; and generating a block-wise pruned weight matrix by setting the pre-determined percentage of weights selected from the at least one block to zero. The weight matrix includes a set of weights associated with the layer, the plurality of layers includes a first layer receiving a first input associated with one or more audio feature sequences, and the plurality of layers are executed on one or more processors.
Type: Application
Filed: March 25, 2021
Publication date: September 29, 2022
Applicant: KWAI INC.
Inventors: Yongxiong REN, Bingbing LI, Yang LIU, Lingzhi LIU
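The block-wise pruning steps in this abstract can be sketched in Python. The abstract does not state how the weights within a block are selected, so picking the smallest-magnitude weights is an assumption (a common pruning heuristic), and the names `prune_blocks`, `block`, and `fraction` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def prune_blocks(weights: Matrix, block: int, fraction: float) -> Matrix:
    """Zero the given fraction of weights in each block, smallest magnitudes first."""
    rows, cols = len(weights), len(weights[0])
    pruned = [row[:] for row in weights]  # leave the input matrix untouched
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            cells = [
                (i, j)
                for i in range(r, min(r + block, rows))
                for j in range(c, min(c + block, cols))
            ]
            cells.sort(key=lambda ij: abs(weights[ij[0]][ij[1]]))
            for i, j in cells[: int(len(cells) * fraction)]:
                pruned[i][j] = 0.0
    return pruned


w = [[1.0, -4.0], [3.0, 2.0]]
pruned = prune_blocks(w, block=2, fraction=0.5)
```

Applying the same fraction inside every block keeps the sparsity evenly distributed across the matrix, which tends to suit block-oriented hardware better than unstructured pruning.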
-
Publication number: 20220292727
Abstract: A class-specific neural network for video compressed sensing and methods for training and testing the class-specific neural network are provided. The class-specific neural network includes a Gaussian-mixture model (GMM) and a plurality of encoders, where the GMM classifies video frame blocks with a plurality of clusters and assigns the video frame blocks to the plurality of clusters. Further, the plurality of encoders receive the video frame blocks and generate a plurality of compressed-sensed frame block vectors, where the plurality of encoders correspond to the plurality of clusters.
Type: Application
Filed: March 15, 2022
Publication date: September 15, 2022
Applicants: KWAI INC., SANTA CLARA UNIVERSITY
Inventors: Yifei PEI, Ying LIU, Nam LING, Lingzhi LIU, Yongxiong REN, Ming Kai HSU
-
Publication number: 20220262349
Abstract: Systems and methods are provided for automatic speech recognition. In the method, the system obtains a padded sequence by processing a plurality of acoustic signals. The system compresses the padded sequence by reducing the size of the padded sequence to obtain a compressed sequence. The system inputs the compressed sequence into a pre-trained encoder neural network to obtain an encoded sequence and then decompresses the encoded sequence by recovering the encoded sequence to an original sequential ordering. The system inputs the encoded sequence to a decoding module to obtain recognition texts.
Type: Application
Filed: February 17, 2021
Publication date: August 18, 2022
Applicant: KWAI INC.
Inventors: Yongxiong REN, Yang LIU, Heng LIU, Lingzhi LIU
-
Publication number: 20220245447
Abstract: Systems and methods are provided for quantization aware training of a neural network for a heterogeneous hardware platform. In the method, the system acquires hardware profiles with respect to a plurality of hardware components of a heterogeneous hardware platform. The system determines a plurality of hardware configurations based on the hardware profiles. The system acquires a set of training data and performs a quantization aware training using the training data on a network model based on the hardware configurations. The system obtains the network model with model weights for the heterogeneous hardware platform.
Type: Application
Filed: February 2, 2021
Publication date: August 4, 2022
Applicant: KWAI INC.
Inventors: Yang LIU, Yongxiong REN, Lingzhi LIU
-
Publication number: 20220215832
Abstract: The present disclosure relates to an automatic speech recognition system and a method thereof. The system includes a conformer encoder and a pair of ping-pong buffers. The encoder includes a plurality of encoder layers sequentially executed by one or more graphic processing units. At least one encoder layer includes a first feed forward module, a multi-head self-attention module, a convolution module, and a second feed forward module. The convolution module and the multi-head self-attention module are sandwiched between the first feed forward module and the second feed forward module. The four modules respectively include a plurality of encoder sublayers fused into one or more encoder kernels. The one or more encoder kernels respectively read from one of the pair of ping-pong buffers and write into the other of the pair of ping-pong buffers.
Type: Application
Filed: January 4, 2021
Publication date: July 7, 2022
Applicant: KWAI INC.
Inventors: Yongxiong REN, Yang LIU, Heng LIU, Lingzhi LIU, Jie LI, Kaituo XU, Xiaorui WANG
-
Publication number: 20220215843
Abstract: An automatic speech recognition system and a method thereof are provided. The system includes an encoder and a decoder. The encoder comprises a plurality of encoder layers. At least one encoder layer includes a plurality of encoder sublayers fused into one or more encoder kernels. The system further comprises a first pair of ping-pong buffers communicating with the one or more encoder kernels. The decoder comprises a plurality of decoder layers. At least one decoder layer includes a plurality of decoder sublayers fused into one or more decoder kernels. The decoder receives a decoder input related to the encoder output and generates a decoder output. The decoder sends the decoder output to a beam search kernel.
Type: Application
Filed: January 4, 2021
Publication date: July 7, 2022
Applicant: KWAI INC.
Inventors: Yongxiong REN, Heng LIU, Yang LIU, Lingzhi LIU, Jie LI, Yuanyuan ZHAO, Xiaorui WANG
-
Publication number: 20220174279
Abstract: A decoding method is disclosed, including: parsing a bitstream to obtain a first flag, wherein the first flag specifies whether a current coding block is required to be partitioned; when the first flag specifies that the current coding block is required to be partitioned, parsing the bitstream to obtain a second flag, wherein the second flag specifies whether the current coding block is partitioned in a horizontal direction or a vertical direction; partitioning the current coding block into four first rectangular subblocks in the horizontal direction or four second rectangular subblocks in the vertical direction; and reconstructing the current coding block based on the four first rectangular subblocks or the four second rectangular subblocks.
Type: Application
Filed: February 16, 2022
Publication date: June 2, 2022
Inventors: Changcai LAI, Xiaoran CAO, Yongbing LIN, Lingzhi LIU, Yun HE