Patents Assigned to KWAI INC.

METHODS AND DEVICES FOR IMAGE RESTORATION USING SUB-BAND SPECIFIC TRANSFORM DOMAIN LEARNING

Publication number: 20230099539

Abstract: A method, apparatus, and a non-transitory computer-readable storage medium for sub-band image reconstruction. The method may include obtaining an image captured by a camera. The method may also obtain a transform image based on the image captured by the camera. The transform image may be in a transform domain. The method may further obtain decomposed image components of the transform image. The decomposed image components may include a low frequency component and at least one high frequency component.

Type: Application

Filed: September 30, 2021

Publication date: March 30, 2023

Applicant: KWAI INC.

Inventors: Paras MAHARJAN, Ning XU, Xuan XU, Yuyan SONG
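
The decomposition step in the abstract above can be illustrated with a one-level 2D Haar split. This is a minimal sketch only: the abstract does not name the transform, so the Haar choice, the function name `haar_decompose`, and the band labels are assumptions made for illustration.

```python
def haar_decompose(image):
    """Split an even-sized grayscale image into four half-size sub-bands."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    bands = {"low": [], "high_lr": [], "high_tb": [], "high_diag": []}
    for r in range(0, rows, 2):
        low, lr, tb, diag = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]          # top-left, top-right
            d, e = image[r + 1][c], image[r + 1][c + 1]  # bottom-left, bottom-right
            low.append((a + b + d + e) / 4.0)    # 2x2 average: low frequency
            lr.append((a - b + d - e) / 4.0)     # left column minus right column
            tb.append((a + b - d - e) / 4.0)     # top row minus bottom row
            diag.append((a - b - d + e) / 4.0)   # diagonal difference
        bands["low"].append(low)
        bands["high_lr"].append(lr)
        bands["high_tb"].append(tb)
        bands["high_diag"].append(diag)
    return bands


bands = haar_decompose([[1, 3], [1, 3]])
print(bands["low"], bands["high_lr"])
```

A flat image yields zeros in all three high-frequency bands, which is why a restoration model could plausibly treat the low and high bands differently.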
METHODS AND APPARATUSES FOR PHOTOREALISTIC RENDERING OF IMAGES USING MACHINE LEARNING

Publication number: 20230087476

Abstract: A neural network training method, an image processing method, and apparatuses thereof are provided. The neural network training method includes obtaining a first domain image and a second domain image, where the first domain image and the second domain image are unpaired images in different domains; obtaining a scaled first domain image by scaling, at an iteration, the first domain image; obtaining a training patch by cropping the scaled first domain image, where each training patch has a same number of pixels with different contents; inputting the training patch into the neural network at the iteration, and outputting an output patch; calculating a contrastive loss based on a query sub-patch and negative sub-patches selected from the training patch and a corresponding positive sub-patch selected from the output patch; and updating model parameters of the neural network based on the contrastive loss and a generative adversarial network loss.

Type: Application

Filed: September 17, 2021

Publication date: March 23, 2023

Applicant: KWAI INC.

Inventors: Oliver Dayun LIU, Mengtian LI, Yi ZHENG, Haibin HUANG, Chongyang MA
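
The contrastive loss named in the abstract above is commonly computed in an InfoNCE form. The sketch below assumes that form; the feature vectors, the cosine similarity, and the `temperature` value are hypothetical stand-ins, not details taken from the application.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def contrastive_loss(query, positive, negatives, temperature=0.07):
    """InfoNCE-style loss: low when the query is closest to the positive."""
    q = normalize(query)
    # Index 0 holds the positive pair; the rest are negative pairs.
    logits = [dot(q, normalize(positive)) / temperature]
    logits += [dot(q, normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


good = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
bad = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(good < bad)
```

In training, this term would be added to the generative adversarial network loss before the parameter update, as the abstract describes.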
METHODS AND DEVICES FOR NEURAL NETWORK QUANTIZATION USING TEMPORAL PROFILING

Publication number: 20230084000

Abstract: Methods and apparatuses are provided for temporal profiling for neural network quantization. The method includes: obtaining a neural network that comprises a node connected to different paths at different time periods; obtaining node outputs for the node at the different time periods; determining statistical properties of the node outputs at the different time periods; and determining activation ranges of the node outputs based on the statistical properties.

Type: Application

Filed: September 15, 2021

Publication date: March 16, 2023

Applicant: KWAI INC.

Inventors: Ming Kai HSU, Chao YANG, Yue MA, Sikai WANG, Sitong FENG, Wenhui CAO, Danqing LI, Hui ZHONG, Lingzhi LIU
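
The profiling steps in the abstract above can be sketched as follows. This is an illustration under assumptions: the statistics used (mean, standard deviation, minimum, maximum), the `num_std` clipping rule, and the function name are hypothetical and not taken from the application.

```python
import statistics


def profile_activation_ranges(outputs_by_period, num_std=3.0):
    """Derive one activation range per time period from observed node outputs."""
    ranges = {}
    for period, values in outputs_by_period.items():
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)
        # Clip the observed min/max to mean +/- num_std standard deviations.
        low = max(min(values), mean - num_std * std)
        high = min(max(values), mean + num_std * std)
        ranges[period] = (low, high)
    return ranges


ranges = profile_activation_ranges({"t0": [0.0, 1.0, 2.0], "t1": [10.0, 10.0]})
print(ranges)
```

Keeping a separate range per time period avoids one wide range that would waste quantization levels when the node sees very different inputs on different paths.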
METHODS AND DEVICES FOR EFFICIENT GENERAL DECONVOLUTION IMPLEMENTATION ON HARDWARE ACCELERATOR

Publication number: 20230075264

Abstract: Methods and devices are provided for an efficient general deconvolution implementation on a hardware accelerator.

Type: Application

Filed: September 7, 2021

Publication date: March 9, 2023

Applicant: KWAI INC.

Inventors: Shiya LIU, Ming Kai HSU, Quan LIN, Lingzhi LIU
TRANSFERABLE VISION TRANSFORMER FOR UNSUPERVISED DOMAIN ADAPTATION

Publication number: 20230062151

Abstract: A method and an apparatus for training a transferable vision transformer (TVT) for unsupervised domain adaptation (UDA) in heterogeneous devices are provided. The method includes that a heterogeneous device including one or more graphic processing units (GPUs) loads multiple patches into the TVT which includes a transferability adaption module (TAM). Furthermore, a patch-level domain discriminator in the TAM assigns weights to the multiple patches and determines one or more transferable patches based on the weights. Moreover, the heterogeneous device generates a transferable attention output for an attention module in the TAM based on the one or more transferable patches.

Type: Application

Filed: September 24, 2021

Publication date: March 2, 2023

Applicant: KWAI INC.

Inventors: Ning XU, Jingjing LIU, Jinyu YANG
METHODS AND APPARATUSES FOR GENERATING STYLE PICTURES

Publication number: 20230054283

Abstract: A style picture generating method, an apparatus and a non-transitory computer readable storage medium thereof are provided. The method includes: obtaining one or more models by training a neural network; obtaining a plurality of interpolated models based on the one or more models; generating a plurality of pictures by the plurality of interpolated models; and generating the style picture by combining two or more pictures in the plurality of pictures using one or more model-specific alpha masks.

Type: Application

Filed: August 20, 2021

Publication date: February 23, 2023

Applicant: KWAI INC.

Inventors: Jiayi LIU, Shen WANG, Zhenyu LIAO, Huayan WANG
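
The interpolation and mask-based combination in the abstract above can be sketched on flat lists of numbers. This is a simplified illustration: linear interpolation of weights, the per-pixel blending rule, and both function names are assumptions rather than the application's stated method.

```python
def interpolate_models(weights_a, weights_b, t):
    """Linearly interpolate two weight lists; t=0 gives A, t=1 gives B."""
    return [(1.0 - t) * a + t * b for a, b in zip(weights_a, weights_b)]


def blend_with_mask(picture_a, picture_b, alpha_mask):
    """Per-pixel blend: mask value 1 keeps picture A, 0 keeps picture B."""
    return [a * m + b * (1.0 - m)
            for a, b, m in zip(picture_a, picture_b, alpha_mask)]


halfway = interpolate_models([0.0, 2.0], [4.0, 6.0], 0.5)
blended = blend_with_mask([10.0, 10.0], [20.0, 20.0], [1.0, 0.0])
print(halfway, blended)
```

A mask tied to a specific model lets different image regions come from different interpolated models in the final style picture.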
Systems and methods for automatic speech recognition based on graphics processing units

Patent number: 11562734

Abstract: The present disclosure relates to an automatic speech recognition system and a method thereof. The system includes a conformer encoder and a pair of ping-pong buffers. The encoder includes a plurality of encoder layers sequentially executed by one or more graphic processing units. At least one encoder layer includes a first feed forward module, a multi-head self-attention module, a convolution module, and a second feed forward module. The convolution module and the multi-head self-attention module are sandwiched between the first feed forward module and the second feed forward module. The four modules respectively include a plurality of encoder sublayers fused into one or more encoder kernels. The one or more encoder kernels respectively read from one of the pair of ping-pong buffers and write into the other of the pair of ping-pong buffers.

Type: Grant

Filed: January 4, 2021

Date of Patent: January 24, 2023

Assignee: KWAI INC.

Inventors: Yongxiong Ren, Yang Liu, Heng Liu, Lingzhi Liu, Jie Li, Kaituo Xu, Xiaorui Wang
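
The ping-pong buffer pattern described in the abstract above can be sketched in plain Python. The kernels here are hypothetical element-wise functions standing in for fused encoder kernels; on a GPU the two buffers would be device memory allocated once and reused.

```python
def run_with_ping_pong(kernels, inputs):
    """Run kernels in sequence, alternating between two preallocated buffers."""
    buffers = [list(inputs), [0.0] * len(inputs)]
    src = 0  # index of the buffer the next kernel reads from
    for kernel in kernels:
        dst = 1 - src  # the other buffer receives the output
        for i, value in enumerate(buffers[src]):
            buffers[dst][i] = kernel(value)
        src = dst  # swap roles for the next kernel
    return buffers[src]


result = run_with_ping_pong(
    [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3], [1.0, 2.0])
print(result)
```

Because each kernel reads one buffer and writes the other, no per-layer allocation is needed and no kernel overwrites data it is still reading.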
METHODS AND APPARATUSES FOR HIGH PERFORMANCE AND ACCURACY FIXED-POINT BATCHNORM IMPLEMENTATION

Publication number: 20230010197

Abstract: A method to implement a fixed-point batchnorm layer in a neural network for data processing is provided in the present disclosure. The method includes: receiving floating-point input data over a channel of a standalone floating-point batchnorm layer, and converting the floating-point input data into fixed-point input data of the standalone floating-point batchnorm layer; obtaining fixed-point quantization parameters in each channel based on the input data and the floating-point parameters in each channel; converting the standalone floating-point batchnorm layer based on the fixed-point quantization parameters into a fixed-point batchnorm layer for processing the fixed-point input data to generate fixed-point output data; and mapping the fixed-point batchnorm layer to a fixed-point convolution layer, where the convolution is computed by matrix multiplication that can be executed on a GEMM engine.

Type: Application

Filed: July 6, 2021

Publication date: January 12, 2023

Applicant: KWAI INC.

Inventors: Ming Kai HSU, Sikai WANG
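
One common way to convert a batchnorm layer to fixed point is to fold it into a per-channel scale and bias and then quantize both. The sketch below assumes that approach; the parameter names `gamma`, `beta`, `mean`, and `var`, the shared `frac_bits` format, and all function names are assumptions of this sketch, not the application's notation.

```python
import math


def fold_batchnorm(gamma, beta, mean, var, eps=1e-5):
    """Rewrite gamma * (x - mean) / sqrt(var + eps) + beta as scale * x + bias."""
    scale = gamma / math.sqrt(var + eps)
    return scale, beta - mean * scale


def to_fixed(value, frac_bits):
    """Quantize a float to an integer with frac_bits fractional bits."""
    return round(value * (1 << frac_bits))


def fixed_batchnorm(x_q, scale_q, bias_q, frac_bits):
    """Integer-only scale * x + bias; all operands share frac_bits."""
    # The product carries 2 * frac_bits fractional bits, so align the bias first.
    acc = x_q * scale_q + (bias_q << frac_bits)
    return acc >> frac_bits  # back to frac_bits fractional bits


scale, bias = fold_batchnorm(gamma=2.0, beta=1.0, mean=0.5, var=4.0, eps=0.0)
y_q = fixed_batchnorm(to_fixed(1.5, 8), to_fixed(scale, 8), to_fixed(bias, 8), 8)
print(y_q / 256)
```

The folded form is a per-channel multiply-add, which is what a 1x1 convolution computes, so it can be expressed as a matrix multiplication.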
Methods and apparatuses for high performance and accuracy fixed-point scale implementation

Publication number: 20230010981

Abstract: A method to implement a fixed-point scale layer in a neural network for data processing is provided in the present disclosure. The method includes: receiving floating-point input data over a channel of a standalone floating-point scale layer, and converting the floating-point input data into fixed-point input data of the standalone floating-point scale layer; obtaining fixed-point quantization parameters in each channel based on the input data and the floating-point parameters in each channel; converting the standalone floating-point scale layer based on the fixed-point quantization parameters into a fixed-point scale layer for processing the fixed-point input data to generate fixed-point output data; and mapping the fixed-point scale layer to a fixed-point convolution layer, where the convolution is computed by matrix multiplication that can be executed on a GEMM engine.

Type: Application

Filed: July 6, 2021

Publication date: January 12, 2023

Applicant: KWAI INC.

Inventors: Ming Kai HSU, Sitong FENG
METHODS AND APPARATUSES FOR FINE-GRAINED STYLE-BASED GENERATIVE NEURAL NETWORKS

Publication number: 20220335250

Abstract: A method and an apparatus for training a generative adversarial network (GAN) and a method and an apparatus for processing an image are provided. The method for training the GAN includes: obtaining a fine-grained style label (FGSL) associated with the image and inputting the FGSL and a latent vector into a style-based generator in the GAN; the style-based generator generating a first output image based on the FGSL and the latent vector; a projection discriminator determining whether the first output image matches the image based on the FGSL; and adjusting one or more parameters of the GAN and regenerating, by the style-based generator, a second output image based on the FGSL, the latent vector, and the adjusted GAN in response to determining that the first output image does not match the image based on the FGSL.

Type: Application

Filed: April 19, 2021

Publication date: October 20, 2022

Applicant: KWAI INC.

Inventors: Xin MIAO, Huayan WANG
METHODS AND DEVICES FOR IRREGULAR PRUNING FOR AUTOMATIC SPEECH RECOGNITION

Publication number: 20220310069

Abstract: A method and an apparatus for automatic speech recognition are provided. The method includes: generating a weight matrix for a layer of a plurality of layers in a neural network; dividing the weight matrix into a plurality of blocks, each block including a plurality of weights; selecting a pre-determined percentage of weights from at least one block for block-wise pruning; and generating a block-wise pruned weight matrix by setting the pre-determined percentage of weights selected from the at least one block to zero. The weight matrix includes a set of weights associated with the layer, the plurality of layers includes a first layer receiving a first input associated with one or more audio feature sequences, and the plurality of layers are executed on one or more processors.

Type: Application

Filed: March 25, 2021

Publication date: September 29, 2022

Applicant: KWAI INC.

Inventors: Yongxiong REN, Bingbing LI, Yang LIU, Lingzhi LIU
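
The block-wise steps in the abstract above can be sketched as follows. The abstract does not say which weights in a block are selected; choosing the smallest magnitudes is an assumption of this sketch, as are the function and parameter names.

```python
def prune_within_blocks(matrix, block_rows, block_cols, fraction):
    """Zero the given fraction of weights inside every block of the matrix."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    pruned = [row[:] for row in matrix]
    for r0 in range(0, n_rows, block_rows):
        for c0 in range(0, n_cols, block_cols):
            cells = [(r, c)
                     for r in range(r0, min(r0 + block_rows, n_rows))
                     for c in range(c0, min(c0 + block_cols, n_cols))]
            # Assumed selection rule: prune the smallest-magnitude weights.
            cells.sort(key=lambda rc: abs(matrix[rc[0]][rc[1]]))
            for r, c in cells[:int(len(cells) * fraction)]:
                pruned[r][c] = 0.0
    return pruned


weights = [[1, -5, 2, 8],
           [3, 4, -7, 6]]
print(prune_within_blocks(weights, 2, 2, 0.5))
```

Every block loses the same share of weights, so the zeros are spread evenly across the matrix even though their positions inside each block are irregular.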
METHODS AND DEVICES FOR STRUCTURED PRUNING FOR AUTOMATIC SPEECH RECOGNITION

Publication number: 20220310068

Abstract: Methods and apparatuses for automatic speech recognition are provided. The method includes: generating a weight matrix for a layer of a plurality of layers in a neural network; dividing the weight matrix into a plurality of blocks, each block including a plurality of weights; selecting a set of blocks from the plurality of blocks for block-wise pruning by minimizing a cost function subject to a pre-determined block-wise constraint; and generating a block-wise pruned weight matrix by setting one or more weights in the set of blocks to zero. The weight matrix includes a set of weights associated with the layer, the plurality of layers includes a first layer receiving a first input associated with one or more audio feature sequences, and the plurality of layers are executed on one or more processors.

Type: Application

Filed: March 25, 2021

Publication date: September 29, 2022

Applicant: KWAI INC.

Inventors: Yongxiong REN, Bingbing LI, Yang LIU, Lingzhi LIU
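
For the structured variant in the abstract above, whole blocks are selected rather than individual weights. The sketch below assumes a simple cost, the total squared weight removed, and a simple constraint, a fixed number of pruned blocks; both are illustrative choices, since the abstract does not specify the cost function or the constraint.

```python
def prune_blocks(matrix, block_rows, block_cols, num_blocks_to_prune):
    """Zero the blocks whose removal costs the least squared weight."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    blocks = []
    for r0 in range(0, n_rows, block_rows):
        for c0 in range(0, n_cols, block_cols):
            cells = [(r, c)
                     for r in range(r0, min(r0 + block_rows, n_rows))
                     for c in range(c0, min(c0 + block_cols, n_cols))]
            energy = sum(matrix[r][c] ** 2 for r, c in cells)
            blocks.append((energy, cells))
    # Under the assumed cost, the cheapest blocks are those with least energy.
    blocks.sort(key=lambda block: block[0])
    pruned = [row[:] for row in matrix]
    for _, cells in blocks[:num_blocks_to_prune]:
        for r, c in cells:
            pruned[r][c] = 0.0
    return pruned


weights = [[1, 1, 9, 9],
           [1, 1, 9, 9]]
print(prune_blocks(weights, 2, 2, 1))
```

Zeroing entire blocks leaves a regular sparsity pattern that hardware can skip as a unit.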
CLASS-SPECIFIC NEURAL NETWORK FOR VIDEO COMPRESSED SENSING

Publication number: 20220292727

Abstract: A class-specific neural network for video compressed sensing and methods for training and testing the class-specific neural network are provided. The class-specific neural network includes a Gaussian-mixture model (GMM) and a plurality of encoders, where the GMM classifies video frame blocks with a plurality of clusters and assigns the video frame blocks to the plurality of clusters. Further, the plurality of encoders receive the video frame blocks and generate a plurality of compressed-sensed frame block vectors, where the plurality of encoders correspond to the plurality of clusters.

Type: Application

Filed: March 15, 2022

Publication date: September 15, 2022

Applicants: KWAI INC., SANTA CLARA UNIVERSITY

Inventors: Yifei PEI, Ying LIU, Nam LING, Lingzhi LIU, Yongxiong REN, Ming Kai HSU
SYSTEMS AND METHODS FOR ACCELERATING AUTOMATIC SPEECH RECOGNITION BASED ON COMPRESSION AND DECOMPRESSION

Publication number: 20220262349

Abstract: Systems and methods are provided for automatic speech recognition. In the method, the system obtains a padded sequence by processing a plurality of acoustic signals. The system compresses the padded sequence by reducing the size of the padded sequence to obtain a compressed sequence. The system inputs the compressed sequence into a pre-trained encoder neural network to obtain an encoded sequence and then decompresses the encoded sequence by recovering the encoded sequence to an original sequential ordering. The system inputs the encoded sequence to a decoding module to obtain recognition texts.

Type: Application

Filed: February 17, 2021

Publication date: August 18, 2022

Applicant: KWAI INC.

Inventors: Yongxiong REN, Yang LIU, Heng LIU, Lingzhi LIU
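
One plausible reading of the compression and decompression steps in the abstract above is that padding frames are dropped before the encoder and restored afterwards. The sketch below assumes that reading; the per-utterance `lengths` list, the pad value, and the stand-in encoder are all hypothetical.

```python
def compress(padded_batch, lengths):
    """Drop padding frames and pack the valid frames into one flat sequence."""
    return [frame
            for seq, n in zip(padded_batch, lengths)
            for frame in seq[:n]]


def decompress(flat, lengths, max_len, pad_value=0.0):
    """Restore the packed frames to their original padded batch layout."""
    batch, start = [], 0
    for n in lengths:
        batch.append(flat[start:start + n] + [pad_value] * (max_len - n))
        start += n
    return batch


padded = [[1.0, 2.0, 0.0],
          [3.0, 0.0, 0.0]]
lengths = [2, 1]
packed = compress(padded, lengths)
encoded = [x * 10 for x in packed]  # stand-in for the encoder network
print(decompress(encoded, lengths, max_len=3))
```

The encoder then processes only the valid frames, which is where the time saving would come from when utterance lengths vary widely.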
SYSTEMS AND METHODS FOR QUANTIZATION AWARE TRAINING OF A NEURAL NETWORK FOR HETEROGENEOUS HARDWARE PLATFORM

Publication number: 20220245447

Abstract: Systems and methods are provided for quantization aware training of a neural network for a heterogeneous hardware platform. In the method, the system acquires hardware profiles with respect to a plurality of hardware components of a heterogeneous hardware platform. The system determines a plurality of hardware configurations based on the hardware profiles. The system acquires a set of training data and performs quantization aware training using the training data on a network model based on the hardware configurations. The system obtains the network model with model weights for the heterogeneous hardware platform.

Type: Application

Filed: February 2, 2021

Publication date: August 4, 2022

Applicant: KWAI INC.

Inventors: Yang LIU, Yongxiong REN, Lingzhi LIU
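
Quantization aware training usually simulates the target precision in the forward pass. The sketch below shows that idea with a bit width chosen per hardware component; the `HARDWARE_BITS` table, the symmetric rounding scheme, and the function names are hypothetical and not taken from the application.

```python
# Hypothetical hardware configurations: component name -> weight bit width.
HARDWARE_BITS = {"gpu": 8, "npu": 4}


def fake_quantize(value, num_bits, max_abs):
    """Round value to a signed num_bits grid over [-max_abs, max_abs]."""
    levels = (1 << (num_bits - 1)) - 1
    step = max_abs / levels
    q = max(-levels, min(levels, round(value / step)))
    return q * step


def quantized_forward(x, weight, component):
    """Forward pass of one multiply using the component's simulated precision."""
    w_q = fake_quantize(weight, HARDWARE_BITS[component], max_abs=1.0)
    return x * w_q


for name in HARDWARE_BITS:
    print(name, quantized_forward(2.0, 0.3, name))
```

Training against the rounded weights lets the model adapt to the precision of each component it will run on.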
SYSTEMS AND METHODS FOR AUTOMATIC SPEECH RECOGNITION BASED ON GRAPHICS PROCESSING UNITS

Publication number: 20220215832

Abstract: The present disclosure relates to an automatic speech recognition system and a method thereof. The system includes a conformer encoder and a pair of ping-pong buffers. The encoder includes a plurality of encoder layers sequentially executed by one or more graphic processing units. At least one encoder layer includes a first feed forward module, a multi-head self-attention module, a convolution module, and a second feed forward module. The convolution module and the multi-head self-attention module are sandwiched between the first feed forward module and the second feed forward module. The four modules respectively include a plurality of encoder sublayers fused into one or more encoder kernels. The one or more encoder kernels respectively read from one of the pair of ping-pong buffers and write into the other of the pair of ping-pong buffers.

Type: Application

Filed: January 4, 2021

Publication date: July 7, 2022

Applicant: KWAI INC.

Inventors: Yongxiong REN, Yang LIU, Heng LIU, Lingzhi LIU, Jie LI, Kaituo XU, Xiaorui WANG
SYSTEMS AND METHODS FOR AUTOMATIC SPEECH RECOGNITION BASED ON GRAPHICS PROCESSING UNITS

Publication number: 20220215843

Abstract: An automatic speech recognition system and a method thereof are provided. The system includes an encoder and a decoder. The encoder comprises a plurality of encoder layers. At least one encoder layer includes a plurality of encoder sublayers fused into one or more encoder kernels. The system further comprises a first pair of ping-pong buffers communicating with the one or more encoder kernels. The decoder comprises a plurality of decoder layers. At least one decoder layer includes a plurality of decoder sublayers fused into one or more decoder kernels. The decoder receives a decoder input related to the encoder output and generates a decoder output. The decoder sends the decoder output to a beam search kernel.

Type: Application

Filed: January 4, 2021

Publication date: July 7, 2022

Applicant: KWAI INC.

Inventors: Yongxiong REN, Heng LIU, Yang LIU, Lingzhi LIU, Jie LI, Yuanyuan ZHAO, Xiaorui WANG
3D SEPARABLE DEEP CONVOLUTIONAL NEURAL NETWORK FOR MOVING OBJECT DETECTION

Publication number: 20220164630

Abstract: A method for detecting moving objects in video frames, an apparatus and a non-transitory computer-readable storage medium thereof are provided. The method includes that: an encoder in a 3-dimensional (3D) separable convolutional neural network with multi-input multi-output (3DS_MM) receives a first input including multiple video frames, where the encoder includes a plurality of encoder layers including 3D separable convolutional neural network (CNN) layers; the encoder generates a first encoder output; and a decoder in the 3DS_MM receives the first encoder output and generates a first output including multiple first binary masks related to the first input, where the decoder includes a plurality of decoder layers comprising 3D separable transposed CNN layers.

Type: Application

Filed: November 22, 2021

Publication date: May 26, 2022

Applicants: KWAI INC., SANTA CLARA UNIVERSITY

Inventors: Bingxin HOU, Ying LIU, Nam LING, Lingzhi LIU, Yongxiong REN, Ming Kai HSU
