Patents by Inventor Yingbo Zhou

Yingbo Zhou has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11056099
    Abstract: The disclosed technology teaches a deep end-to-end speech recognition model, including using multi-objective learning criteria to train the model on training data comprising speech samples temporally labeled with ground truth transcriptions.
    Type: Grant
    Filed: September 5, 2019
    Date of Patent: July 6, 2021
    Assignee: salesforce.com, inc.
    Inventors: Yingbo Zhou, Caiming Xiong
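
The abstract above does not name the specific learning criteria. Below is a minimal sketch of one common multi-objective setup, assuming a weighted combination of a CTC loss and a frame-level cross-entropy loss; the choice of objectives, the tensor shapes, and the weight alpha are all assumptions for illustration.

```python
import torch.nn as nn

# Hypothetical multi-objective criterion: a weighted sum of CTC and
# frame-level cross-entropy losses. The patent abstract does not name
# the criteria; both losses and the weight alpha are assumed here.
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)
ce_loss = nn.CrossEntropyLoss()

def multi_objective_loss(log_probs, frame_logits, targets, frame_labels,
                         input_lengths, target_lengths, alpha=0.7):
    """log_probs: (T, N, C) for CTC; frame_logits: (N, T, C) per frame."""
    l_ctc = ctc_loss(log_probs, targets, input_lengths, target_lengths)
    l_ce = ce_loss(frame_logits.reshape(-1, frame_logits.size(-1)),
                   frame_labels.reshape(-1))
    return alpha * l_ctc + (1.0 - alpha) * l_ce
```
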
  • Publication number: 20210089882
    Abstract: Systems and methods are provided for a near-zero-cost (NZC) query framework, or approach, for differentially private deep learning. To protect the privacy of training data during learning, the near-zero-cost query framework transfers knowledge from an ensemble of teacher models trained on partitions of the data to a student model. Privacy guarantees may be understood intuitively and expressed rigorously in terms of differential privacy. Other features are also provided.
    Type: Application
    Filed: October 21, 2019
    Publication date: March 25, 2021
    Inventors: Lichao SUN, Jia LI, Caiming XIONG, Yingbo ZHOU
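
The NZC framework builds on the general idea of transferring knowledge from a teacher ensemble under differential privacy. Below is a rough sketch of the underlying PATE-style noisy vote aggregation; the Laplace mechanism and noise scale are illustrative assumptions, and the NZC-specific refinements are not reproduced here.

```python
import numpy as np

# PATE-style noisy vote aggregation: the general mechanism behind private
# knowledge transfer from a teacher ensemble to a student. The Laplace
# noise scale and voting details are assumptions, not the NZC specifics.
def noisy_teacher_label(teacher_predictions, num_classes, noise_scale=1.0,
                        rng=None):
    """teacher_predictions: one predicted class index per teacher."""
    if rng is None:
        rng = np.random.default_rng()
    votes = np.bincount(teacher_predictions, minlength=num_classes).astype(float)
    votes += rng.laplace(0.0, noise_scale, size=num_classes)  # privacy noise
    return int(np.argmax(votes))  # noisy plurality label for the student

label = noisy_teacher_label([2, 2, 1, 2, 0], num_classes=3)
```
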
  • Patent number: 10958925
    Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is back propagated to the decoder stack, the encoder stack, and the proposal decoder.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: March 23, 2021
    Assignee: salesforce.com, inc.
    Inventors: Yingbo Zhou, Luowei Zhou, Caiming Xiong, Richard Socher
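
A small sketch of the masking step as described above: a differentiable soft mask derived from proposal-decoder scores gates the encoder outputs, so caption error can back-propagate to the encoder and the proposal decoder alike. The tensor shapes and the sigmoid-based mask construction are assumptions, not the patented design.

```python
import torch

def mask_encoder_outputs(encoder_out, proposal_scores):
    """encoder_out: (T, D); proposal_scores: (T,) raw logits per time step."""
    soft_mask = torch.sigmoid(proposal_scores)     # differentiable in [0, 1]
    return encoder_out * soft_mask.unsqueeze(-1)   # gate each time step

encoder_out = torch.randn(100, 512, requires_grad=True)
proposal_scores = torch.randn(100, requires_grad=True)
masked = mask_encoder_outputs(encoder_out, proposal_scores)
masked.sum().backward()  # gradients reach both encoder and proposal branch
```
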
  • Patent number: 10783875
    Abstract: A system for domain adaptation includes a domain adaptation model configured to adapt a representation of a signal in a first domain to a second domain to generate an adapted representation and a plurality of discriminators corresponding to a plurality of bands of values of a domain variable. Each of the plurality of discriminators is configured to discriminate between the adapted representation and representations of one or more other signals in the second domain.
    Type: Grant
    Filed: July 3, 2018
    Date of Patent: September 22, 2020
    Assignee: salesforce.com, inc.
    Inventors: Ehsan Hosseini-Asl, Caiming Xiong, Yingbo Zhou, Richard Socher
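
One plausible reading of "a plurality of discriminators corresponding to a plurality of bands of values of a domain variable" is sketched below: each adapted representation is scored by the discriminator whose band contains its domain-variable value. The band edges and the small MLP discriminators are assumptions.

```python
import torch
import torch.nn as nn

# Band edges for the domain variable and one small MLP discriminator per
# band; both are illustrative assumptions, not the patented design.
band_edges = [0.0, 0.25, 0.5, 0.75, 1.0]
discriminators = nn.ModuleList(
    nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))
    for _ in range(len(band_edges) - 1)
)

def discriminate(adapted_repr, domain_value):
    """Score a representation with the discriminator for its band."""
    for i in range(len(band_edges) - 1):
        if band_edges[i] <= domain_value < band_edges[i + 1]:
            return discriminators[i](adapted_repr)
    return discriminators[-1](adapted_repr)  # value on the upper boundary

score = discriminate(torch.randn(128), domain_value=0.6)
```
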
  • Publication number: 20200104699
    Abstract: Embodiments for training a neural network using sequential tasks are provided. A plurality of sequential tasks are received. For each task in the plurality of tasks, a copy of the neural network that includes a plurality of layers is generated. From the copy of the neural network, a task specific neural network is generated by performing an architectural search on the plurality of layers in the copy of the neural network. The architectural search identifies a plurality of candidate choices in the layers of the task specific neural network. Parameters in the task specific neural network that correspond to the plurality of candidate choices and that maximize architectural weights at each layer are identified. The parameters are retrained and merged into the neural network, yielding a single neural network trained on the plurality of sequential tasks.
    Type: Application
    Filed: October 31, 2018
    Publication date: April 2, 2020
    Inventors: Yingbo ZHOU, Xilai LI, Caiming XIONG
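
A much-simplified sketch of the per-task procedure the abstract describes: copy the network, pick a candidate choice per layer by its architectural weight, then retrain and merge. The real search space, weight learning, and merge rule are not specified in the abstract; everything below is an assumed illustration.

```python
import copy
import torch
import torch.nn as nn

# Assumed base network: a ModuleList of linear layers stands in for the
# "plurality of layers" in the abstract.
base_net = nn.Module()
base_net.layers = nn.ModuleList(nn.Linear(32, 32) for _ in range(3))

def adapt_to_task(net):
    task_net = copy.deepcopy(net)                        # per-task copy
    for i, layer in enumerate(task_net.layers):
        candidates = {                                   # candidate choices
            "reuse": layer,
            "new": nn.Linear(layer.in_features, layer.out_features),
        }
        # Architectural weights would be learned during the search;
        # random values stand in for them here.
        arch_weights = {name: torch.rand(()).item() for name in candidates}
        best = max(arch_weights, key=arch_weights.get)   # maximize the weight
        task_net.layers[i] = candidates[best]
    # Retraining on the task's data and merging the retrained parameters
    # back into the base network are omitted (the merge rule is unspecified).
    return task_net
```
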
  • Publication number: 20200084465
    Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is back propagated to the decoder stack, the encoder stack, and the proposal decoder.
    Type: Application
    Filed: November 18, 2019
    Publication date: March 12, 2020
    Inventors: Yingbo ZHOU, Luowei ZHOU, Caiming XIONG, Richard SOCHER
  • Patent number: 10573295
    Abstract: The disclosed technology teaches a deep end-to-end speech recognition model, including using multi-objective learning criteria to train the model on training data comprising speech samples temporally labeled with ground truth transcriptions.
    Type: Grant
    Filed: January 23, 2018
    Date of Patent: February 25, 2020
    Assignee: salesforce.com, inc.
    Inventors: Yingbo Zhou, Caiming Xiong
  • Patent number: 10542270
    Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is back propagated to the decoder stack, the encoder stack, and the proposal decoder.
    Type: Grant
    Filed: January 18, 2018
    Date of Patent: January 21, 2020
    Assignee: salesforce.com, inc.
    Inventors: Yingbo Zhou, Luowei Zhou, Caiming Xiong, Richard Socher
  • Publication number: 20200005765
    Abstract: The disclosed technology teaches a deep end-to-end speech recognition model, including using multi-objective learning criteria to train the model on training data comprising speech samples temporally labeled with ground truth transcriptions.
    Type: Application
    Filed: September 5, 2019
    Publication date: January 2, 2020
    Inventors: Yingbo ZHOU, Caiming XIONG
  • Publication number: 20190295530
    Abstract: A system for domain adaptation includes a domain adaptation model configured to adapt a representation of a signal in a first domain to a second domain to generate an adapted representation and a plurality of discriminators corresponding to a plurality of bands of values of a domain variable. Each of the plurality of discriminators is configured to discriminate between the adapted representation and representations of one or more other signals in the second domain.
    Type: Application
    Filed: July 3, 2018
    Publication date: September 26, 2019
    Applicant: salesforce.com, inc.
    Inventors: Ehsan Hosseini-Asl, Caiming Xiong, Yingbo Zhou, Richard Socher
  • Publication number: 20190286073
    Abstract: A method for training parameters of a first domain adaptation model includes evaluating a cycle consistency objective using a first task specific model associated with a first domain and a second task specific model associated with a second domain. The evaluating of the cycle consistency objective is based on one or more first training representations adapted from the first domain to the second domain by a first domain adaptation model and from the second domain to the first domain by a second domain adaptation model, and one or more second training representations adapted from the second domain to the first domain by the second domain adaptation model and from the first domain to the second domain by the first domain adaptation model. The method further includes evaluating a learning objective based on the cycle consistency objective, and updating the parameters of the first domain adaptation model based on the learning objective.
    Type: Application
    Filed: August 3, 2018
    Publication date: September 19, 2019
    Inventors: Ehsan Hosseini-Asl, Caiming Xiong, Yingbo Zhou, Richard Socher
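
The round-trip idea behind the cycle consistency objective, as a sketch. The abstract evaluates cycle consistency through the task specific models; this simplified version uses a direct L1 reconstruction penalty on the representations instead, an assumption made for brevity.

```python
import torch
import torch.nn as nn

def cycle_consistency_loss(x_a, x_b, g_ab, g_ba):
    """g_ab adapts domain A to B; g_ba adapts B to A (assumed callables)."""
    loss_a = torch.mean(torch.abs(g_ba(g_ab(x_a)) - x_a))  # A -> B -> A
    loss_b = torch.mean(torch.abs(g_ab(g_ba(x_b)) - x_b))  # B -> A -> B
    return loss_a + loss_b

# Toy usage with linear adapters standing in for the two adaptation models.
g_ab, g_ba = nn.Linear(16, 16), nn.Linear(16, 16)
loss = cycle_consistency_loss(torch.randn(4, 16), torch.randn(4, 16), g_ab, g_ba)
```
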
  • Publication number: 20190149834
    Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is back propagated to the decoder stack, the encoder stack, and the proposal decoder.
    Type: Application
    Filed: January 18, 2018
    Publication date: May 16, 2019
    Inventors: Yingbo ZHOU, Luowei ZHOU, Caiming XIONG, Richard SOCHER
  • Publication number: 20190130897
    Abstract: The disclosed technology teaches a deep end-to-end speech recognition model, including using multi-objective learning criteria to train the model on training data comprising speech samples temporally labeled with ground truth transcriptions.
    Type: Application
    Filed: January 23, 2018
    Publication date: May 2, 2019
    Applicant: salesforce.com, inc.
    Inventors: Yingbo Zhou, Caiming Xiong
  • Publication number: 20190130896
    Abstract: The disclosed technology teaches regularizing a deep end-to-end speech recognition model to reduce overfitting and improve generalization: synthesizing sample speech variations on original speech samples labeled with text transcriptions, and modifying a particular original speech sample to independently vary tempo and pitch of the original speech sample while retaining the labeled text transcription of the original speech sample, thereby producing multiple sample speech variations having multiple degrees of variation from the original speech sample. The disclosed technology includes training a deep end-to-end speech recognition model, on thousands to millions of original speech samples and the sample speech variations on the original speech samples, that outputs recognized text transcriptions corresponding to speech detected in the original speech samples and the sample speech variations.
    Type: Application
    Filed: December 21, 2017
    Publication date: May 2, 2019
    Applicant: salesforce.com, inc.
    Inventors: Yingbo ZHOU, Caiming XIONG, Richard SOCHER
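
A sketch of the described augmentation using librosa (an assumed tool; the publication does not name one). Tempo and pitch are varied independently, and every variant keeps the original transcription as its label; the specific tempo factors and semitone steps are assumed degrees of variation.

```python
import librosa

def speech_variations(waveform, sample_rate, transcription):
    """Independently vary tempo and pitch, retaining the text label."""
    variants = []
    for rate in (0.9, 1.0, 1.1):        # tempo factors (assumed degrees)
        for steps in (-2, 0, 2):        # pitch shift in semitones (assumed)
            y = librosa.effects.time_stretch(waveform, rate=rate)
            y = librosa.effects.pitch_shift(y, sr=sample_rate, n_steps=steps)
            variants.append((y, transcription))  # label is retained
    return variants
```
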
  • Patent number: 9432671
    Abstract: A method, non-transitory computer readable medium, and apparatus for classifying machine printed text and handwritten text in an input are disclosed. For example, the method defines a perspective for an auto-encoder, receives the input for the auto-encoder, wherein the input comprises a document comprising the machine printed text and the handwritten text, performs an encoding on the input using the auto-encoder to generate a classifier, applies the classifier to the input, and generates an output that separates the machine printed text and the handwritten text in the input based on the classifier in accordance with the perspective.
    Type: Grant
    Filed: May 22, 2014
    Date of Patent: August 30, 2016
    Assignee: Xerox Corporation
    Inventors: Michael Robert Campanelli, Safwan R. Wshah, Yingbo Zhou
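
One plausible reading of the abstract above, sketched below: an auto-encoder trained from a chosen "perspective" (say, on machine-printed patches) reconstructs printed text well, so high reconstruction error flags handwriting. The architecture, patch size, and threshold rule are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Tiny auto-encoder over flattened 28x28 patches (size is an assumption).
autoencoder = nn.Sequential(
    nn.Linear(784, 64), nn.ReLU(),    # encoder
    nn.Linear(64, 784), nn.Sigmoid()  # decoder
)

def classify_patch(patch, threshold=0.05):
    """'printed' if the patch reconstructs well, else 'handwritten'."""
    with torch.no_grad():
        error = torch.mean((autoencoder(patch) - patch) ** 2).item()
    return "printed" if error < threshold else "handwritten"

print(classify_patch(torch.rand(784)))  # untrained net: arbitrary result
```
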
  • Publication number: 20150339543
    Abstract: A method, non-transitory computer readable medium, and apparatus for classifying machine printed text and handwritten text in an input are disclosed. For example, the method defines a perspective for an auto-encoder, receives the input for the auto-encoder, wherein the input comprises a document comprising the machine printed text and the handwritten text, performs an encoding on the input using the auto-encoder to generate a classifier, applies the classifier to the input, and generates an output that separates the machine printed text and the handwritten text in the input based on the classifier in accordance with the perspective.
    Type: Application
    Filed: May 22, 2014
    Publication date: November 26, 2015
    Applicant: Xerox Corporation
    Inventors: Michael Robert Campanelli, Safwan R. Wshah, Yingbo Zhou
  • Patent number: 8872909
    Abstract: A system for simultaneously extracting finger vein and finger texture images from a finger of a person, the system including an image capture device configured to capture at least one image of at least one finger in a contactless manner, a feature extraction module configured to extract unique finger vein features and finger texture features from the at least one captured image, and a processing module configured to normalize the at least one captured image and integrate the extracted finger vein features and finger texture features.
    Type: Grant
    Filed: June 1, 2011
    Date of Patent: October 28, 2014
    Assignee: The Hong Kong Polytechnic University
    Inventors: Ajay Kumar, Yingbo Zhou
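
A coarse sketch of the pipeline the abstract describes: normalize the contactless finger image, derive vein and texture feature maps, and integrate them. The feature extractors and fusion weights below are crude stand-ins, not the patented method.

```python
import numpy as np

def process_finger_image(image, w_vein=0.6, w_texture=0.4):
    """image: 2-D grayscale array from a contactless capture (assumed)."""
    norm = (image - image.mean()) / (image.std() + 1e-8)  # normalization
    vein_map = np.clip(-norm, 0.0, None)         # dark, line-like structures
    texture_map = np.abs(np.gradient(norm)[0])   # coarse texture proxy
    return w_vein * vein_map + w_texture * texture_map  # feature integration

fused = process_finger_image(np.random.rand(120, 280))
```
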
  • Publication number: 20110304720
    Abstract: A system for simultaneously extracting finger vein and finger texture images from a finger of a person, the system including an image capture device configured to capture at least one image of at least one finger in a contactless manner, a feature extraction module configured to extract unique finger vein features and finger texture features from the at least one captured image, and a processing module configured to normalize the at least one captured image and integrate the extracted finger vein features and finger texture features.
    Type: Application
    Filed: June 1, 2011
    Publication date: December 15, 2011
    Applicant: The Hong Kong Polytechnic University
    Inventors: Ajay Kumar, Yingbo Zhou