Patents by Inventor Yingbo Zhou
Yingbo Zhou has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11056099
Abstract: The disclosed technology teaches a deep end-to-end speech recognition model, trained with multi-objective learning criteria on training data comprising speech samples temporally labeled with ground truth transcriptions.
Type: Grant
Filed: September 5, 2019
Date of Patent: July 6, 2021
Assignee: salesforce.com, inc.
Inventors: Yingbo Zhou, Caiming Xiong
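A multi-objective training criterion like the one described above boils down to combining several loss terms into one scalar that is then minimized. A minimal sketch, assuming a simple weighted-sum combination (the loss values and weights below are illustrative, not from the patent):

```python
def multi_objective_loss(losses, weights):
    """Combine several training objectives into a single scalar loss
    as a weighted sum, one common form of multi-objective criterion."""
    if len(losses) != len(weights):
        raise ValueError("one weight is required per objective")
    return sum(w * l for w, l in zip(weights, losses))

# Example: a transcription loss combined with a down-weighted auxiliary loss.
total = multi_objective_loss([2.5, 0.8], [1.0, 0.3])  # 2.5 + 0.24 = 2.74
```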
-
Publication number: 20210089882
Abstract: Systems and methods are provided for a near-zero-cost (NZC) query framework for differentially private deep learning. To protect the privacy of training data during learning, the near-zero-cost query framework transfers knowledge from an ensemble of teacher models, trained on partitions of the data, to a student model. Privacy guarantees may be understood intuitively and expressed rigorously in terms of differential privacy. Other features are also provided.
Type: Application
Filed: October 21, 2019
Publication date: March 25, 2021
Inventors: Lichao Sun, Jia Li, Caiming Xiong, Yingbo Zhou
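The teacher-to-student knowledge transfer above is typically done by having the student learn from a privacy-protected aggregate of the teachers' predictions. A toy sketch of one standard aggregation step, assuming Laplace-noised vote counts (the function and its noise mechanism are illustrative, not the patent's exact NZC query):

```python
import random

def noisy_teacher_vote(teacher_labels, num_classes, noise_scale, rng=None):
    """Aggregate the labels predicted by an ensemble of teacher models
    with a Laplace-noised vote count; the student trains on the noisy
    winner, which is what yields a differential-privacy guarantee."""
    rng = rng or random.Random()
    counts = [0] * num_classes
    for label in teacher_labels:
        counts[label] += 1
    # Laplace(0, b) noise sampled as the difference of two Exp(1/b) draws.
    noisy = [c + rng.expovariate(1.0 / noise_scale)
               - rng.expovariate(1.0 / noise_scale) for c in counts]
    return max(range(num_classes), key=lambda k: noisy[k])
```

With a strong teacher majority, the noisy winner almost surely matches the true majority vote, so utility is preserved while individual teachers' influence is masked.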
-
Patent number: 10958925
Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is back-propagated to the decoder stack, the encoder stack, and the proposal decoder.
Type: Grant
Filed: November 18, 2019
Date of Patent: March 23, 2021
Assignee: salesforce.com, inc.
Inventors: Yingbo Zhou, Luowei Zhou, Caiming Xiong, Richard Socher
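The masking unit above gates the encoder outputs with the proposal decoder's differentiable mask before the caption decoder sees them. A minimal per-position sketch, assuming the mask is applied by elementwise multiplication (this toy function is an assumption for illustration, not the patented mechanism):

```python
def apply_proposal_mask(encoder_outputs, proposal_mask):
    """Elementwise gating of encoder outputs by the proposal decoder's
    (differentiable) mask: a mask value of 0 hides a position, 1 passes
    it through unchanged, and fractional values attenuate it."""
    return [e * m for e, m in zip(encoder_outputs, proposal_mask)]

# Example: hide the first position, keep the second, attenuate the third.
masked = apply_proposal_mask([1.0, 2.0, 3.0], [0.0, 1.0, 0.5])
```

Because multiplication is differentiable, training error can flow back through the mask into the proposal decoder, which is what lets all three components be trained jointly.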
-
Patent number: 10783875
Abstract: A system for domain adaptation includes a domain adaptation model configured to adapt a representation of a signal in a first domain to a second domain to generate an adapted representation, and a plurality of discriminators corresponding to a plurality of bands of values of a domain variable. Each of the plurality of discriminators is configured to discriminate between the adapted representation and representations of one or more other signals in the second domain.
Type: Grant
Filed: July 3, 2018
Date of Patent: September 22, 2020
Assignee: salesforce.com, inc.
Inventors: Ehsan Hosseini-Asl, Caiming Xiong, Yingbo Zhou, Richard Socher
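With one discriminator per band of the domain variable, each adapted sample must be routed to the discriminator whose band it falls into. A minimal sketch of that routing step, assuming ascending band boundaries (the function and its interface are illustrative assumptions):

```python
def discriminator_band(value, band_edges):
    """Pick which discriminator judges a sample: discriminator i covers
    values below band_edges[i], and the final discriminator covers
    everything at or above the last edge. band_edges must be ascending."""
    for i, edge in enumerate(band_edges):
        if value < edge:
            return i
    return len(band_edges)
```

For example, with edges [0.3, 0.7] there are three discriminators covering the domain-variable ranges below 0.3, from 0.3 to 0.7, and above 0.7.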
-
Publication number: 20200104699
Abstract: Embodiments for training a neural network using sequential tasks are provided. A plurality of sequential tasks are received. For each task in the plurality of tasks, a copy of the neural network that includes a plurality of layers is generated. From the copy of the neural network, a task-specific neural network is generated by performing an architectural search on the plurality of layers in the copy of the neural network. The architectural search identifies a plurality of candidate choices in the layers of the task-specific neural network. Parameters in the task-specific neural network that correspond to the plurality of candidate choices and that maximize architectural weights at each layer are identified. The parameters are retrained and merged with the neural network. The neural network trained on the plurality of sequential tasks is a trained neural network.
Type: Application
Filed: October 31, 2018
Publication date: April 2, 2020
Inventors: Yingbo Zhou, Xilai Li, Caiming Xiong
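The selection step above, choosing per layer the candidate that maximizes the architectural weight, can be sketched as a simple argmax over each layer's learned weights (a toy stand-in, assuming weights are stored as a list of lists; not the patent's exact search procedure):

```python
def select_architecture(arch_weights):
    """For each layer, keep the candidate choice whose architectural
    weight is largest. arch_weights[layer][candidate] holds the learned
    weight for that candidate operation at that layer."""
    return [max(range(len(layer)), key=lambda i: layer[i])
            for layer in arch_weights]

# Example: two layers, with 2 and 3 candidate operations respectively.
chosen = select_architecture([[0.1, 0.9], [0.7, 0.2, 0.1]])
```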
-
Publication number: 20200084465
Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is back-propagated to the decoder stack, the encoder stack, and the proposal decoder.
Type: Application
Filed: November 18, 2019
Publication date: March 12, 2020
Inventors: Yingbo Zhou, Luowei Zhou, Caiming Xiong, Richard Socher
-
Patent number: 10573295
Abstract: The disclosed technology teaches a deep end-to-end speech recognition model, trained with multi-objective learning criteria on training data comprising speech samples temporally labeled with ground truth transcriptions.
Type: Grant
Filed: January 23, 2018
Date of Patent: February 25, 2020
Assignee: salesforce.com, inc.
Inventors: Yingbo Zhou, Caiming Xiong
-
Patent number: 10542270
Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is back-propagated to the decoder stack, the encoder stack, and the proposal decoder.
Type: Grant
Filed: January 18, 2018
Date of Patent: January 21, 2020
Assignee: salesforce.com, inc.
Inventors: Yingbo Zhou, Luowei Zhou, Caiming Xiong, Richard Socher
-
Publication number: 20200005765
Abstract: The disclosed technology teaches a deep end-to-end speech recognition model, trained with multi-objective learning criteria on training data comprising speech samples temporally labeled with ground truth transcriptions.
Type: Application
Filed: September 5, 2019
Publication date: January 2, 2020
Inventors: Yingbo Zhou, Caiming Xiong
-
Publication number: 20190295530
Abstract: A system for domain adaptation includes a domain adaptation model configured to adapt a representation of a signal in a first domain to a second domain to generate an adapted representation, and a plurality of discriminators corresponding to a plurality of bands of values of a domain variable. Each of the plurality of discriminators is configured to discriminate between the adapted representation and representations of one or more other signals in the second domain.
Type: Application
Filed: July 3, 2018
Publication date: September 26, 2019
Applicant: salesforce.com, inc.
Inventors: Ehsan Hosseini-Asl, Caiming Xiong, Yingbo Zhou, Richard Socher
-
Publication number: 20190286073
Abstract: A method for training parameters of a first domain adaptation model includes evaluating a cycle consistency objective using a first task-specific model associated with a first domain and a second task-specific model associated with a second domain. The evaluation of the cycle consistency objective is based on one or more first training representations adapted from the first domain to the second domain by the first domain adaptation model and from the second domain to the first domain by a second domain adaptation model, and one or more second training representations adapted from the second domain to the first domain by the second domain adaptation model and from the first domain to the second domain by the first domain adaptation model. The method further includes evaluating a learning objective based on the cycle consistency objective, and updating parameters of the first domain adaptation model based on the learning objective.
Type: Application
Filed: August 3, 2018
Publication date: September 19, 2019
Inventors: Ehsan Hosseini-Asl, Caiming Xiong, Yingbo Zhou, Richard Socher
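The round trips described above (first domain to second and back, and second domain to first and back) can be condensed into a scalar toy objective that penalizes round-trip reconstruction error. A minimal sketch, assuming scalar "representations" and an absolute-error penalty (both simplifying assumptions for illustration):

```python
def cycle_consistency_loss(x, y, a_to_b, b_to_a):
    """Toy cycle-consistency objective: adapt x from domain A to B and
    back, and y from B to A and back, then penalize how far each
    round trip lands from its starting point."""
    return abs(b_to_a(a_to_b(x)) - x) + abs(a_to_b(b_to_a(y)) - y)

# Perfectly inverse adapters incur zero cycle loss.
loss = cycle_consistency_loss(2.0, 5.0, lambda v: v + 1.0, lambda v: v - 1.0)
```

Minimizing this term pushes the two adaptation models toward being inverses of each other, which is what keeps adapted representations faithful to their sources.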
-
Publication number: 20190149834
Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is back-propagated to the decoder stack, the encoder stack, and the proposal decoder.
Type: Application
Filed: January 18, 2018
Publication date: May 16, 2019
Inventors: Yingbo Zhou, Luowei Zhou, Caiming Xiong, Richard Socher
-
Publication number: 20190130897
Abstract: The disclosed technology teaches a deep end-to-end speech recognition model, trained with multi-objective learning criteria on training data comprising speech samples temporally labeled with ground truth transcriptions.
Type: Application
Filed: January 23, 2018
Publication date: May 2, 2019
Applicant: salesforce.com, inc.
Inventors: Yingbo Zhou, Caiming Xiong
-
Publication number: 20190130896
Abstract: The disclosed technology teaches regularizing a deep end-to-end speech recognition model to reduce overfitting and improve generalization: synthesizing sample speech variations on original speech samples labelled with text transcriptions, and modifying a particular original speech sample to independently vary the tempo and pitch of the original speech sample while retaining its labelled text transcription, thereby producing multiple sample speech variations with multiple degrees of variation from the original speech sample. The disclosed technology includes training a deep end-to-end speech recognition model on thousands to millions of original speech samples and their sample speech variations, such that the model outputs recognized text transcriptions corresponding to speech detected in the original speech samples and the sample speech variations.
Type: Application
Filed: December 21, 2017
Publication date: May 2, 2019
Applicant: salesforce.com, inc.
Inventors: Yingbo Zhou, Caiming Xiong, Richard Socher
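Because tempo and pitch are varied independently while the transcription label is kept, the augmentation above amounts to enumerating a grid of (tempo, pitch) settings per sample. A minimal sketch of that enumeration, assuming a simple dict-based sample spec (the field names and function are illustrative, not the patent's data format):

```python
import itertools

def synthesize_variations(sample_id, transcript, tempo_factors, pitch_shifts):
    """Enumerate augmented (sample-spec, label) pairs that independently
    vary tempo and pitch while retaining the labelled text transcription.
    Actual audio resynthesis would happen downstream from these specs."""
    return [({"audio": sample_id, "tempo": t, "pitch": p}, transcript)
            for t, p in itertools.product(tempo_factors, pitch_shifts)]

# Example: 3 tempo factors x 3 pitch shifts -> 9 labelled variations.
variations = synthesize_variations("s1", "hello", [0.9, 1.0, 1.1], [-2, 0, 2])
```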
-
Patent number: 9432671
Abstract: A method, non-transitory computer readable medium, and apparatus for classifying machine printed text and handwritten text in an input are disclosed. For example, the method defines a perspective for an auto-encoder; receives the input for the auto-encoder, a document comprising the machine printed text and the handwritten text; performs an encoding on the input using the auto-encoder to generate a classifier; applies the classifier to the input; and generates an output that separates the machine printed text and the handwritten text in the input based on the classifier, in accordance with the perspective.
Type: Grant
Filed: May 22, 2014
Date of Patent: August 30, 2016
Assignee: Xerox Corporation
Inventors: Michael Robert Campanelli, Safwan R. Wshah, Yingbo Zhou
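One common way an auto-encoder trained from a single "perspective" can act as a classifier is through its reconstruction error: regions resembling the perspective it was trained on (say, machine printed text) reconstruct well, while others do not. A toy decision rule under that assumption (the threshold rule is an illustrative stand-in, not the patented classifier):

```python
def classify_by_reconstruction(reconstruction_error, threshold):
    """Toy decision rule: low auto-encoder reconstruction error maps a
    text region to 'printed', high error to 'handwritten'. A fixed
    threshold stands in for the learned decision boundary."""
    return "printed" if reconstruction_error <= threshold else "handwritten"
```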
-
Publication number: 20150339543
Abstract: A method, non-transitory computer readable medium, and apparatus for classifying machine printed text and handwritten text in an input are disclosed. For example, the method defines a perspective for an auto-encoder; receives the input for the auto-encoder, a document comprising the machine printed text and the handwritten text; performs an encoding on the input using the auto-encoder to generate a classifier; applies the classifier to the input; and generates an output that separates the machine printed text and the handwritten text in the input based on the classifier, in accordance with the perspective.
Type: Application
Filed: May 22, 2014
Publication date: November 26, 2015
Applicant: Xerox Corporation
Inventors: Michael Robert Campanelli, Safwan R. Wshah, Yingbo Zhou
-
Patent number: 8872909
Abstract: A system for extracting finger vein and finger texture images from a finger of a person at the same time, the system including an image capture device configured to capture at least one image of at least one finger in a contactless manner, a feature extraction module configured to extract unique finger vein features and finger texture features from the at least one captured image, and a processing module configured to normalize the at least one captured image and integrate the extracted finger vein features and finger texture features.
Type: Grant
Filed: June 1, 2011
Date of Patent: October 28, 2014
Assignee: The Hong Kong Polytechnic University
Inventors: Ajay Kumar, Yingbo Zhou
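The normalize-and-integrate step above is often realized in biometric systems as score-level fusion: normalize each modality's match scores to a common range, then combine them. A minimal sketch under that assumption (min-max normalization and a weighted sum are illustrative choices, not necessarily the patent's method):

```python
def min_max_normalize(scores):
    """Rescale a list of match scores to [0, 1], a common normalization
    step before fusing scores from different modalities."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def fuse_scores(vein_score, texture_score, vein_weight=0.5):
    """Weighted-sum fusion of normalized finger vein and finger texture
    match scores, a simple stand-in for the integration module."""
    return vein_weight * vein_score + (1.0 - vein_weight) * texture_score
```

Combining two independent modalities this way generally improves verification accuracy over either modality alone, which is the motivation for extracting both from one contactless capture.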
-
Publication number: 20110304720Abstract: A system for extracting finger vein and finger texture images from a finger of a person at the same time, the device including an image capture device configured to capture at least one image of at least one finger in a contactless manner, a feature extraction module configured to extract unique finger vein features and finger texture features from the at least one captured image, and a processing module configured to normalize the at least one captured image and integrate the extracted finger vein features and finger texture features.Type: ApplicationFiled: June 1, 2011Publication date: December 15, 2011Applicant: The Hong Kong Polytechnic UniversityInventors: Ajay Kumar, Yingbo Zhou