Patents by Inventor Weiran Wang

Weiran Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240153495
    Abstract: A method includes receiving a training dataset that includes one or more spoken training utterances for training an automatic speech recognition (ASR) model. Each spoken training utterance in the training dataset is paired with a corresponding transcription and a corresponding target sequence of auxiliary tokens. For each spoken training utterance, the method includes generating a speech recognition hypothesis for a corresponding spoken training utterance, determining a speech recognition loss based on the speech recognition hypothesis and the corresponding transcription, generating a predicted auxiliary token for the corresponding spoken training utterance, and determining an auxiliary task loss based on the predicted auxiliary token and the corresponding target sequence of auxiliary tokens. The method also includes training the ASR model jointly on the speech recognition loss and the auxiliary task loss determined for each spoken training utterance.
    Type: Application
    Filed: October 26, 2023
    Publication date: May 9, 2024
    Applicant: Google LLC
    Inventors: Weiran Wang, Ding Zhao, Shaojin Ding, Hao Zhang, Shuo-yiin Chang, David Johannes Rybach, Tara N. Sainath, Yanzhang He, Ian McGraw, Shankar Kumar
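    The training recipe above reduces to a per-utterance loss that mixes the speech recognition loss with the auxiliary task loss. A minimal sketch, assuming placeholder PyTorch modules, per-frame cross-entropy stand-ins for both losses, and an arbitrary mixing weight (none of which the publication specifies):
    ```python
    # Hypothetical sketch of joint ASR + auxiliary-task training (placeholder model and losses).
    import torch
    import torch.nn as nn

    class JointASRModel(nn.Module):
        """Stand-in model emitting both ASR logits and auxiliary-token logits."""
        def __init__(self, feat_dim=80, vocab=100, aux_vocab=10):
            super().__init__()
            self.encoder = nn.Linear(feat_dim, 256)
            self.asr_head = nn.Linear(256, vocab)      # speech recognition hypotheses
            self.aux_head = nn.Linear(256, aux_vocab)  # predicted auxiliary tokens

        def forward(self, frames):
            h = torch.relu(self.encoder(frames))
            return self.asr_head(h), self.aux_head(h)

    model = JointASRModel()
    ce = nn.CrossEntropyLoss()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    aux_weight = 0.3  # assumed mixing weight

    # One toy training utterance: 20 frames paired with a transcription and auxiliary targets.
    frames = torch.randn(20, 80)
    transcript = torch.randint(0, 100, (20,))
    aux_targets = torch.randint(0, 10, (20,))

    asr_logits, aux_logits = model(frames)
    loss = ce(asr_logits, transcript) + aux_weight * ce(aux_logits, aux_targets)  # joint loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    ```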
  • Patent number: 11949911
    Abstract: A video processing method includes obtaining motion information of a neighboring block of a current image block, dividing the current image block into a plurality of sub-blocks in response to the neighboring block satisfying a preset condition, determining, in a time-domain reference image of the current image block, related blocks of the plurality of sub-blocks according to a motion vector of the neighboring block, and performing prediction on the current image block according to motion vectors of the related blocks.
    Type: Grant
    Filed: May 23, 2022
    Date of Patent: April 2, 2024
    Assignee: SZ DJI TECHNOLOGY CO., LTD.
    Inventors: Xiaozhen Zheng, Suhong Wang, Shanshe Wang, Siwei Ma, Weiran Li
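    A rough, assumption-laden illustration of the flow in the abstract (not the patented coding-tool syntax): split the current block into sub-blocks, offset each sub-block by the neighboring block's motion vector to locate its related block in the time-domain reference image, and reuse the motion vectors found there. The block sizes, motion-field accessor, and names below are invented for the sketch:
    ```python
    # Hypothetical sketch: sub-block related-block lookup via a neighboring block's motion vector.
    from typing import Callable, List, Tuple

    MV = Tuple[int, int]  # (dx, dy) displacement in pixels

    def sub_block_origins(x: int, y: int, w: int, h: int, sub: int = 4) -> List[Tuple[int, int]]:
        """Top-left corners of the sub x sub sub-blocks inside the current image block."""
        return [(x + i, y + j) for j in range(0, h, sub) for i in range(0, w, sub)]

    def related_block_mvs(origins, neighbor_mv: MV, ref_mv_field: Callable[[Tuple[int, int]], MV]) -> List[MV]:
        """Follow the neighbor's MV into the reference image and read the MV stored there."""
        dx, dy = neighbor_mv
        return [ref_mv_field((ox + dx, oy + dy)) for ox, oy in origins]

    # Toy reference-image motion field: a constant MV everywhere.
    ref_field = lambda pos: (2, -1)
    origins = sub_block_origins(x=16, y=16, w=16, h=16)
    print(related_block_mvs(origins, neighbor_mv=(-3, 4), ref_mv_field=ref_field))
    ```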
  • Patent number: 11949912
    Abstract: A video image processing method includes determining a related block of a current image block according to a motion vector of a target neighboring block, the current image block, and a collocated frame of the current image block, decoding the current image block according to a motion vector of the related block of the current image block, constructing motion vectors of part of control points of the current image block according to neighboring blocks of the part of the control points of the current image block, and adding the motion vectors of the part of the control points of the current image block to a motion vector candidate list of the current image block.
    Type: Grant
    Filed: June 20, 2022
    Date of Patent: April 2, 2024
    Assignee: SZ DJI TECHNOLOGY CO., LTD.
    Inventors: Xiaozhen Zheng, Tianliang Fu, Shanshe Wang, Siwei Ma, Weiran Li, Suhong Wang
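    A toy illustration only (the patented candidate-list construction is far more involved): take the motion vectors of the neighboring blocks of two control points and append them as one entry in the current block's motion vector candidate list. The control-point layout and data shapes are assumptions:
    ```python
    # Hypothetical sketch: control-point MVs from neighbors added to a candidate list.
    from typing import Dict, List, Tuple

    MV = Tuple[int, int]

    def control_point_candidate(neighbor_mvs: Dict[str, MV]) -> Dict[str, MV]:
        """One candidate holding MVs for the top-left and top-right control points."""
        return {
            "top_left": neighbor_mvs["above_left"],    # neighbor of the top-left corner
            "top_right": neighbor_mvs["above_right"],  # neighbor of the top-right corner
        }

    mv_candidate_list: List[Dict[str, MV]] = []
    mv_candidate_list.append(control_point_candidate({"above_left": (1, -2), "above_right": (0, -2)}))
    print(mv_candidate_list)
    ```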
  • Publication number: 20240028829
    Abstract: A method includes receiving training data that includes a set of unspoken textual utterances. For each respective unspoken textual utterance, the method includes tokenizing the respective textual utterance into a sequence of sub-word units, generating a first higher order textual feature representation for a corresponding sub-word unit tokenized from the respective unspoken textual utterance, receiving the first higher order textual feature representation generated by a text encoder, and generating a first probability distribution over possible text units. The method also includes training an encoder based on the first probability distribution over possible text units generated by a first-pass decoder for each respective unspoken textual utterance in the set of unspoken textual utterances.
    Type: Application
    Filed: July 1, 2023
    Publication date: January 25, 2024
    Applicant: Google LLC
    Inventors: Tara N. Sainath, Zhouyuan Huo, Zhehuai Chen, Yu Zhang, Weiran Wang, Trevor Strohman, Rohit Prakash Prabhavalkar, Bo Li, Ankur Bapna
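    A minimal sketch of the text-only training path described above, with a toy character-level tokenizer standing in for sub-word units and placeholder encoder/decoder modules (the publication does not specify these choices):
    ```python
    # Hypothetical sketch: train a text encoder on a first-pass decoder's distribution over text units.
    import torch
    import torch.nn as nn

    def tokenize(text: str, vocab: dict) -> torch.Tensor:
        """Toy tokenizer: characters stand in for sub-word units."""
        return torch.tensor([vocab.setdefault(ch, len(vocab)) for ch in text])

    vocab: dict = {}
    units = tokenize("hello world", vocab)               # sequence of sub-word units

    embed = nn.Embedding(256, 128)
    text_encoder = nn.GRU(128, 128, batch_first=True)    # first higher order textual features
    first_pass_decoder = nn.Linear(128, 256)             # distribution over possible text units
    params = list(embed.parameters()) + list(text_encoder.parameters()) + list(first_pass_decoder.parameters())
    optimizer = torch.optim.Adam(params, lr=1e-3)

    features, _ = text_encoder(embed(units).unsqueeze(0))   # (1, T, 128)
    logits = first_pass_decoder(features).squeeze(0)        # (T, 256)
    loss = nn.functional.cross_entropy(logits, units)       # train the encoder on the text units
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ```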
  • Publication number: 20230326461
    Abstract: An automated speech recognition (ASR) model includes a first encoder, a first decoder, a second encoder, and a second decoder. The first encoder receives, as input, a sequence of acoustic frames, and generates, at each of a plurality of output steps, a first higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The first decoder receives, as input, the first higher order feature representation generated by the first encoder, and generates a first probability distribution over possible speech recognition hypotheses. The second encoder receives, as input, the first higher order feature representation generated by the first encoder, and generates a second higher order feature representation for a corresponding first higher order feature frame. The second decoder receives, as input, the second higher order feature representation generated by the second encoder, and generates a second probability distribution over possible speech recognition hypotheses.
    Type: Application
    Filed: March 13, 2023
    Publication date: October 12, 2023
    Applicant: Google LLC
    Inventors: Shaojin Ding, Yanzhang He, Xin Wang, Weiran Wang, Trevor Strohman, Tara N. Sainath, Rohit Prakash Prabhavalkar, Robert David, Rina Panigrahy, Rami Botros, Qiao Liang, Ian McGraw, Ding Zhao, Dongseong Hwang
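    The cascaded layout in the abstract can be pictured as below; the layer types, sizes, and log-softmax outputs are placeholder assumptions, not the patented architecture:
    ```python
    # Hypothetical sketch of a cascaded ASR model: encoder -> decoder, re-encoder -> second decoder.
    import torch
    import torch.nn as nn

    class CascadedASR(nn.Module):
        def __init__(self, feat_dim=80, hidden=256, vocab=128):
            super().__init__()
            self.first_encoder = nn.GRU(feat_dim, hidden, batch_first=True)
            self.first_decoder = nn.Linear(hidden, vocab)   # first distribution over hypotheses
            self.second_encoder = nn.GRU(hidden, hidden, batch_first=True)
            self.second_decoder = nn.Linear(hidden, vocab)  # second distribution over hypotheses

        def forward(self, acoustic_frames):
            h1, _ = self.first_encoder(acoustic_frames)     # first higher order representations
            p1 = self.first_decoder(h1).log_softmax(-1)
            h2, _ = self.second_encoder(h1)                 # second higher order representations
            p2 = self.second_decoder(h2).log_softmax(-1)
            return p1, p2

    model = CascadedASR()
    p1, p2 = model(torch.randn(1, 50, 80))   # one utterance of 50 acoustic frames
    print(p1.shape, p2.shape)
    ```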
  • Publication number: 20230298570
    Abstract: A method includes generating, using an audio encoder, a higher-order feature representation for each acoustic frame in a sequence of acoustic frames; generating, using a decoder, based on the higher-order feature representation, a plurality of speech recognition hypotheses, each hypothesis corresponding to a candidate transcription of an utterance and having an associated first likelihood score; generating, using an external language model, for each speech recognition hypothesis, a second likelihood score; determining, using a learnable fusion module, for each speech recognition hypothesis, a set of fusion weights based on the higher-order feature representation and the speech recognition hypothesis; and generating, using the learnable fusion module, for each speech recognition hypothesis, a third likelihood score based on the first likelihood score, the second likelihood score, and the set of fusion weights, the audio encoder and decoder trained using minimum additive error rate training in the presence of the external language model.
    Type: Application
    Filed: March 21, 2023
    Publication date: September 21, 2023
    Applicant: Google LLC
    Inventors: Weiran Wang, Tongzhou Chen, Tara N. Sainath, Ehsan Variani, Rohit Prakash Prabhavalkar, Ronny Huang, Bhuvana Ramabhadran, Neeraj Gaur, Sepand Mavandadi, Charles Caleb Peyser, Trevor Strohman, Yanzhang He, David Rybach
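    A minimal sketch of the fusion step, assuming a single pooled feature vector stands in for the higher-order feature representation and hypothesis inputs, and a single linear layer produces the fusion weights (both assumptions):
    ```python
    # Hypothetical sketch: learnable fusion of an ASR score and an external LM score.
    import torch
    import torch.nn as nn

    class LearnableFusion(nn.Module):
        def __init__(self, feat_dim=256):
            super().__init__()
            self.weight_net = nn.Linear(feat_dim, 2)  # one fusion weight per score source

        def forward(self, pooled_features, first_score, second_score):
            w = torch.softmax(self.weight_net(pooled_features), dim=-1)
            return w[0] * first_score + w[1] * second_score  # third likelihood score

    fusion = LearnableFusion()
    pooled = torch.randn(256)          # pooled higher-order feature representation (assumed)
    asr_score = torch.tensor(-4.2)     # first likelihood score from the decoder
    lm_score = torch.tensor(-3.1)      # second likelihood score from the external language model
    print(fusion(pooled, asr_score, lm_score))
    ```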
  • Publication number: 20230298563
    Abstract: A method of text-only and semi-supervised training for deliberation includes receiving training data including unspoken textual utterances that are each not paired with any corresponding spoken utterance of non-synthetic speech, and training a deliberation model that includes a text encoder and a deliberation decoder on the unspoken textual utterances. The method also includes receiving, at the trained deliberation model, first-pass hypotheses and non-causal acoustic embeddings. The first-pass hypotheses are generated by a recurrent neural network-transducer (RNN-T) decoder for the non-causal acoustic embeddings encoded by a non-causal encoder. The method also includes encoding, using the text encoder, the first-pass hypotheses generated by the RNN-T decoder, and generating, using the deliberation decoder attending to both the first-pass hypotheses and the non-causal acoustic embeddings, second-pass hypotheses.
    Type: Application
    Filed: March 18, 2023
    Publication date: September 21, 2023
    Applicant: Google LLC
    Inventors: Ke Hu, Tara N. Sainath, Yanzhang He, Rohit Prabhavalkar, Sepand Mavandadi, Weiran Wang, Trevor Strohman
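    A minimal sketch of the second-pass deliberation step, assuming generic GRU/attention modules, arbitrary dimensions, and a placeholder query; none of these reflect the actual model:
    ```python
    # Hypothetical sketch: a deliberation decoder attending to first-pass hypotheses and acoustics.
    import torch
    import torch.nn as nn

    d = 128
    text_encoder = nn.GRU(d, d, batch_first=True)
    attend_hyp = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
    attend_audio = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
    output_layer = nn.Linear(2 * d, 1000)   # second-pass distribution over word pieces (assumed size)

    first_pass_hyp = torch.randn(1, 12, d)  # embedded first-pass hypothesis tokens (from the RNN-T)
    acoustic_emb = torch.randn(1, 80, d)    # non-causal acoustic embeddings
    query = torch.randn(1, 12, d)           # decoder state used as the attention query (assumed)

    encoded_hyp, _ = text_encoder(first_pass_hyp)                   # text-encode the first-pass hypotheses
    ctx_hyp, _ = attend_hyp(query, encoded_hyp, encoded_hyp)        # attend to the hypotheses
    ctx_audio, _ = attend_audio(query, acoustic_emb, acoustic_emb)  # attend to the acoustics
    second_pass_logits = output_layer(torch.cat([ctx_hyp, ctx_audio], dim=-1))
    print(second_pass_logits.shape)  # torch.Size([1, 12, 1000])
    ```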
  • Publication number: 20230130634
    Abstract: A computer-implemented method includes receiving a sequence of acoustic frames as input to an automatic speech recognition (ASR) model. Here, the ASR model includes a causal encoder and a decoder. The method also includes generating, by the causal encoder, a first higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The method also includes generating, by the decoder, a first probability distribution over possible speech recognition hypotheses. Here, the causal encoder includes a stack of causal encoder layers each including a Recurrent Neural Network (RNN) Attention-Performer module that applies linear attention.
    Type: Application
    Filed: September 29, 2022
    Publication date: April 27, 2023
    Applicant: Google LLC
    Inventors: Tara N. Sainath, Rami Botros, Anmol Gulati, Krzysztof Choromanski, Ruoming Pang, Trevor Strohman, Weiran Wang, Jiahui Yu
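    The RNN Attention-Performer module mentioned above applies linear attention. Below is a sketch of causal linear attention using the elu+1 feature map; Performer itself uses random-feature maps, so this is an illustrative stand-in rather than the patented module:
    ```python
    # Sketch of causal (prefix-sum) linear attention; the feature map is an assumption.
    import torch
    import torch.nn.functional as F

    def causal_linear_attention(q, k, v):
        """q, k: (T, d_k); v: (T, d_v). Linear-time causal attention via running sums."""
        phi_q, phi_k = F.elu(q) + 1, F.elu(k) + 1
        kv = torch.einsum("td,te->tde", phi_k, v).cumsum(dim=0)  # running sum of phi(k) v^T
        z = phi_k.cumsum(dim=0)                                  # running sum of phi(k)
        num = torch.einsum("td,tde->te", phi_q, kv)
        den = torch.einsum("td,td->t", phi_q, z).clamp_min(1e-6).unsqueeze(-1)
        return num / den

    out = causal_linear_attention(torch.randn(50, 64), torch.randn(50, 64), torch.randn(50, 64))
    print(out.shape)  # torch.Size([50, 64])
    ```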
  • Publication number: 20230107248
    Abstract: A method includes receiving an initial alignment for a candidate hypothesis generated by a transducer decoder model during a first pass. Here, the candidate hypothesis corresponds to a candidate transcription for an utterance and the initial alignment for the candidate hypothesis includes a sequence of output labels. Each output label corresponds to a blank symbol or a hypothesized sub-word unit. The method also includes receiving a subsequent sequence of audio encodings characterizing the utterance. During an initial refinement step, the method also includes generating a new alignment for a rescored sequence of output labels using a non-autoregressive decoder. The non-autoregressive decoder is configured to receive the initial alignment for the candidate hypothesis and the subsequent sequence of audio encodings.
    Type: Application
    Filed: September 16, 2022
    Publication date: April 6, 2023
    Applicant: Google LLC
    Inventors: Weiran Wang, Ke Hu, Tara N. Sainath
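    A minimal sketch of one non-autoregressive refinement step over an initial alignment, with an assumed vocabulary size and stand-in decoder layers:
    ```python
    # Hypothetical sketch: re-emit every output label in parallel from the alignment and audio encodings.
    import torch
    import torch.nn as nn

    vocab, d = 64, 128
    label_embed = nn.Embedding(vocab, d)
    audio_proj = nn.Linear(d, d)
    refiner = nn.Linear(d, vocab)   # stand-in for the non-autoregressive decoder head

    initial_alignment = torch.randint(0, vocab, (40,))  # one blank or sub-word label per encoding
    audio_encodings = torch.randn(40, d)                # subsequent sequence of audio encodings

    hidden = torch.relu(label_embed(initial_alignment) + audio_proj(audio_encodings))
    new_alignment = refiner(hidden).argmax(dim=-1)      # rescored sequence of output labels
    print(new_alignment.shape)  # torch.Size([40])
    ```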
  • Publication number: 20230099386
    Abstract: A system for recommending an item, a method of recommending an item using such a system, a computer system, and a computer-readable storage medium are provided. The system includes: an item expansion module configured to expand content input by a user, in response to the content input by the user, to generate an item set of interest to the user, the item set including one or more items; a price radar module configured to monitor discount information of the items in the item set; and a price monitoring module configured to calculate an actual price of an item based on the item's discount information, maintain a price change record of the items in the item set, and determine whether to push prompt information to the user according to the calculated actual price of the item and the price change record.
    Type: Application
    Filed: February 24, 2021
    Publication date: March 30, 2023
    Inventors: Wei Zhang, Xin Shang, Guangming Zhu, Fan Yang, Xiaoting Si, Hongguang Liu, Weiran Wang, Jiang Lan, Yijun Huang, Hongkai Jiang, Xuedi Qian
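    A toy sketch of the price-radar and price-monitoring behavior described above; the discount structure, push rule, and data layout are all assumptions:
    ```python
    # Hypothetical sketch: compute actual prices from discounts and decide whether to push a prompt.
    from typing import Dict, List

    def actual_price(list_price: float, discount: Dict[str, float]) -> float:
        """Apply a percentage discount and a flat coupon (assumed discount structure)."""
        return max(0.0, list_price * (1 - discount.get("percent_off", 0.0)) - discount.get("coupon", 0.0))

    def should_push(item_id: str, price: float, history: Dict[str, List[float]]) -> bool:
        """Push a prompt if the new actual price undercuts the item's price change record."""
        record = history.setdefault(item_id, [])
        push = bool(record) and price < min(record)
        record.append(price)
        return push

    history: Dict[str, List[float]] = {}
    print(should_push("sku-1", actual_price(100.0, {"percent_off": 0.10}), history))                 # False
    print(should_push("sku-1", actual_price(100.0, {"percent_off": 0.20, "coupon": 5.0}), history))  # True
    ```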
  • Patent number: 11328731
    Abstract: System and methods for identifying a text word from a spoken utterance are provided. An ensemble BPE system that includes a phone BPE system and a character BPE system receives a spoken utterance. Both BPE systems include a multi-level language model (LM) and an acoustic model. The phone BPE system identifies first words from the spoken utterance and determines a first score for each first word. The first words are converted into character sequences. The character BPE system converts the character sequences into second words and determines a second score for each second word. For each word from the first words that matches a word in the second words, the first and second scores are combined. The text word is the word with the highest score.
    Type: Grant
    Filed: June 17, 2020
    Date of Patent: May 10, 2022
    Assignee: salesforce.com, inc.
    Inventors: Weiran Wang, Yingbo Zhou, Caiming Xiong
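    A minimal sketch of the score-combination step, assuming log-domain scores and equal weighting of the two BPE systems (details the patent does not fix):
    ```python
    # Hypothetical sketch: combine phone-BPE and character-BPE word scores and pick the best word.
    from typing import Dict

    def combine_bpe_scores(phone_words: Dict[str, float], char_words: Dict[str, float]) -> str:
        combined = dict(phone_words)
        for word, second_score in char_words.items():
            if word in combined:                # word identified by both BPE systems
                combined[word] += second_score  # combine the first and second scores
        return max(combined, key=combined.get)  # text word with the highest score

    phone_words = {"speech": -1.2, "speed": -2.5}  # first words with first scores (log-probs)
    char_words = {"speech": -0.9, "speak": -3.0}   # second words with second scores
    print(combine_bpe_scores(phone_words, char_words))  # "speech"
    ```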
  • Publication number: 20220067534
    Abstract: Embodiments described herein combine masked reconstruction and predictive coding. Specifically, unlike contrastive learning, the mutual information between past states and future states is directly estimated. The context information can also be directly captured via shifted masked reconstruction: unlike standard masked reconstruction, the target reconstructed observations are shifted slightly towards the future to incorporate more predictability. The estimated mutual information and shifted masked reconstruction loss can then be combined as the loss function to update the neural model.
    Type: Application
    Filed: August 28, 2020
    Publication date: March 3, 2022
    Inventors: Junwen Bai, Weiran Wang, Yingbo Zhou, Caiming Xiong
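    A minimal sketch of the combined objective, with a placeholder mutual-information estimator and assumed shift size, mask rate, and weighting (none of which the publication prescribes):
    ```python
    # Hypothetical sketch: shifted masked reconstruction loss plus a (negated) MI estimate.
    import torch

    def shifted_masked_reconstruction_loss(pred, x, mask, shift=2):
        """MSE at masked positions, with targets shifted `shift` frames toward the future."""
        target = x[shift:]                  # future-shifted reconstruction targets
        pred, mask = pred[:-shift], mask[:-shift]
        return ((pred - target)[mask] ** 2).mean()

    def mutual_information_estimate(past, future):
        """Placeholder MI estimate between past and future states (not the actual estimator)."""
        return torch.cosine_similarity(past.mean(0), future.mean(0), dim=0)

    T, d = 100, 40
    x = torch.randn(T, d)                         # input feature frames
    mask = torch.rand(T) < 0.15                   # masked time steps
    pred = torch.randn(T, d, requires_grad=True)  # stand-in for the model's reconstructions
    states = torch.randn(T, d)                    # stand-in for encoder states

    loss = shifted_masked_reconstruction_loss(pred, x, mask) \
        - 0.1 * mutual_information_estimate(states[: T // 2], states[T // 2:])
    loss.backward()
    ```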
  • Publication number: 20210319796
    Abstract: System and methods for identifying a text word from a spoken utterance are provided. An ensemble BPE system that includes a phone BPE system and a character BPE system receives a spoken utterance. Both BPE systems include a multi-level language model (LM) and an acoustic model. The phone BPE system identifies first words from the spoken utterance and determines a first score for each first word. The first words are converted into character sequences. The character BPE system converts the character sequences into second words and determines a second score for each second word. For each word from the first words that matches a word in the second words, the first and second scores are combined. The text word is the word with the highest score.
    Type: Application
    Filed: June 17, 2020
    Publication date: October 14, 2021
    Inventors: Weiran Wang, Yingbo Zhou, Caiming Xiong
  • Patent number: 10803885
    Abstract: An audio event detection system that processes audio data into audio feature data and processes the audio feature data using pre-configured candidate interval lengths to identify top candidate regions of the feature data that may include an audio event. The feature data from the top candidate regions are then scored by a classifier, where the score indicates a likelihood that the candidate region corresponds to a desired audio event. The scores are compared to a threshold, and if the threshold is satisfied, the top scoring candidate region is determined to include an audio event.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: October 13, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Chieh-Chi Kao, Chao Wang, Weiran Wang, Ming Sun
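    A toy sketch of the candidate-region scoring and thresholding flow; the classifier, interval lengths, stride, and threshold are stand-ins, not the patented detector:
    ```python
    # Hypothetical sketch: score fixed-length candidate regions and threshold the best one.
    import numpy as np

    def detect_event(features: np.ndarray, interval_lengths=(10, 20, 40), threshold=0.5):
        """features: (T, d) audio feature frames. Returns (event_found, best_score, best_region)."""
        weights = np.random.default_rng(0).normal(size=features.shape[1])  # stand-in classifier

        def score(region: np.ndarray) -> float:
            return float(1 / (1 + np.exp(-region.mean(axis=0) @ weights)))  # sigmoid score

        candidates = [
            (score(features[start:start + length]), (start, start + length))
            for length in interval_lengths
            for start in range(0, len(features) - length + 1, length // 2)
        ]
        best_score, best_region = max(candidates)
        return best_score >= threshold, best_score, best_region

    print(detect_event(np.random.default_rng(1).normal(size=(100, 16))))
    ```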
  • Patent number: 10418957
    Abstract: An audio event detection system that subsamples input audio data using a series of recurrent neural networks to create data of a coarser time scale than the audio data. Data frames corresponding to the coarser time scale may then be upsampled to data frames that match the finer time scale of the original audio data frames. The resulting data frames are then scored with a classifier to determine a likelihood that the individual frames correspond to an audio event. Each frame is then weighted by its score and a composite weighted frame is created by summing the weighted frames and dividing by the cumulative score. The composite weighted frame is then scored by the classifier. The resulting score is taken as an overall score indicating a likelihood that the input audio data includes an audio event.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: September 17, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Weiran Wang, Chao Wang, Chieh-Chi Kao
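    A toy sketch of the composite weighted frame computation described above, with a stand-in classifier; only the weighting-and-renormalization step mirrors the abstract:
    ```python
    # Hypothetical sketch: composite frame = sum(score_i * frame_i) / sum(score_i), then rescore it.
    import numpy as np

    rng = np.random.default_rng(0)
    frames = rng.normal(size=(30, 16))   # upsampled data frames (T, d)
    w = rng.normal(size=16)              # stand-in classifier weights

    def classify(frame: np.ndarray) -> float:
        return float(1 / (1 + np.exp(-frame @ w)))   # per-frame event likelihood

    scores = np.array([classify(f) for f in frames])
    composite = (scores[:, None] * frames).sum(axis=0) / scores.sum()  # composite weighted frame
    overall_score = classify(composite)   # overall likelihood that the audio includes an event
    print(round(overall_score, 3))
    ```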
  • Patent number: 10246817
    Abstract: The present application relates to a method of treating a fabric article so that it has a characteristic smell normally associated with garments that have been exposed to natural sunlight, the method comprising positioning a fabric article (1) to be treated in an enclosure (8) and irradiating said fabric article positioned in said enclosure with ultraviolet light with a wavelength of between 280 nm and 400 nm so that it is subjected to a predetermined radiant exposure. A device for treating a fabric article to replicate the characteristic effect of exposing said fabric article to natural sunlight is also disclosed.
    Type: Grant
    Filed: October 25, 2013
    Date of Patent: April 2, 2019
    Assignee: KONINKLIJKE PHILIPS N.V.
    Inventors: Yuqi Wang, Weiran Wang, Yong Jiang, Boon Teck Tan, Jiuyu Zhou
  • Publication number: 20180172649
    Abstract: A pH monitoring device includes a chamber for containing a solution, and a polymer immersed in the solution. The physical state of the polymer changes depending on whether the pH of the solution exceeds a threshold value. The pH monitoring device further includes a detector configured to detect the change in the physical state of the polymer and thus determine the pH of the solution.
    Type: Application
    Filed: December 27, 2017
    Publication date: June 21, 2018
    Inventors: Jun Shi, Weiran Wang
  • Patent number: 9739500
    Abstract: The invention relates to an air purification apparatus. The apparatus is disposed in a window for separating an indoor space and an outdoor space. The apparatus comprises: an inlet chamber having a first inlet, a second inlet and an outlet, wherein the first inlet is operatively open to the outdoor space and the second inlet is operatively open to the indoor space; an air pumping unit for pumping air from the inlet chamber to the indoor space through the outlet, wherein the air is pumped into the inlet chamber through the first inlet or through the second inlet; and a filtering unit disposed upstream of the air pumping unit, for filtering pollutants in the air.
    Type: Grant
    Filed: November 21, 2013
    Date of Patent: August 22, 2017
    Assignee: KONINKLIJKE PHILIPS N.V.
    Inventor: Weiran Wang
  • Publication number: 20150300677
    Abstract: The invention relates to an air purification apparatus. The apparatus is disposed in a window for separating an indoor space and an outdoor space. The apparatus comprises: an inlet chamber having a first inlet, a second inlet and an outlet, wherein the first inlet is operatively open to the outdoor space and the second inlet is operatively open to the indoor space; an air pumping unit for pumping air from the inlet chamber to the indoor space through the outlet, wherein the air is pumped into the inlet chamber through the first inlet or through the second inlet; and a filtering unit disposed upstream of the air pumping unit, for filtering pollutants in the air.
    Type: Application
    Filed: November 21, 2013
    Publication date: October 22, 2015
    Inventor: Weiran Wang
  • Publication number: 20150292143
    Abstract: The present application relates to a method of treating a fabric article so that it has a characteristic smell normally associated with garments that have been exposed to natural sunlight, the method comprising positioning a fabric article (1) to be treated in an enclosure (8) and irradiating said fabric article positioned in said enclosure with ultraviolet light with a wavelength of between 280 nm and 400 nm so that it is subjected to a predetermined radiant exposure. A device for treating a fabric article to replicate the characteristic effect of exposing said fabric article to natural sunlight is also disclosed.
    Type: Application
    Filed: October 25, 2013
    Publication date: October 15, 2015
    Inventors: Yuqi Wang, Weiran Wang, Yong Jiang, Boon Teck Tan, Jiuyu Zhou