Patents by Inventor Zhi-Jie Yan

Zhi-Jie Yan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10579922
    Abstract: The use of the alternating direction method of multipliers (ADMM) algorithm to train a classifier may reduce classifier training time with little degradation in classifier accuracy. The training involves partitioning the training data for the classifier into multiple data blocks. The partitions may preserve the joint distribution of the input features and the output class of the training data. The training may further include performing an ADMM iteration on the multiple data blocks in an initial order using multiple worker nodes. The training of the classifier is complete if a stop criterion is satisfied following the ADMM iteration. Otherwise, one or more additional ADMM iterations may be performed on the multiple data blocks in different orders until the stop criterion is satisfied.
    Type: Grant
    Filed: April 8, 2014
    Date of Patent: March 3, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Qiang Huo, Zhi-Jie Yan, Kai Chen
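The block-partitioned ADMM training loop described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: ridge regression stands in for the classifier (its local updates have a closed form), the parallel consensus form of ADMM is used, and all names are illustrative.

```python
import numpy as np

def admm_ridge(blocks, lam=0.1, rho=1.0, max_iter=300, tol=1e-6):
    """Consensus ADMM over data blocks: each block (A_i, b_i) keeps a local
    model x_i, all tied to a shared consensus model z, with a stop criterion
    on the primal residual. Ridge regression stands in for the classifier."""
    n_blocks = len(blocks)
    d = blocks[0][0].shape[1]
    xs = [np.zeros(d) for _ in range(n_blocks)]
    us = [np.zeros(d) for _ in range(n_blocks)]
    z = np.zeros(d)
    # Pre-factor each block's local system (A_i^T A_i + rho I).
    facts = [np.linalg.inv(A.T @ A + rho * np.eye(d)) for A, _ in blocks]
    for _ in range(max_iter):
        # Local update on each "worker node".
        for i, (A, b) in enumerate(blocks):
            xs[i] = facts[i] @ (A.T @ b + rho * (z - us[i]))
        # Consensus update: minimises (lam/2)||z||^2 + (rho/2) sum ||x_i - z + u_i||^2.
        z = rho * sum(x + u for x, u in zip(xs, us)) / (lam + n_blocks * rho)
        # Dual updates.
        for i in range(n_blocks):
            us[i] = us[i] + xs[i] - z
        # Stop criterion: all local models agree with the consensus model.
        if np.sqrt(sum(np.sum((x - z) ** 2) for x in xs)) < tol:
            break
    return z
```

Reordering the blocks between iterations, as the abstract describes, matters for variants that sweep blocks sequentially; the parallel consensus form above updates every block in each iteration.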
  • Publication number: 20170147920
    Abstract: The use of the alternating direction method of multipliers (ADMM) algorithm to train a classifier may reduce classifier training time with little degradation in classifier accuracy. The training involves partitioning the training data for the classifier into multiple data blocks. The partitions may preserve the joint distribution of the input features and the output class of the training data. The training may further include performing an ADMM iteration on the multiple data blocks in an initial order using multiple worker nodes. The training of the classifier is complete if a stop criterion is satisfied following the ADMM iteration. Otherwise, one or more additional ADMM iterations may be performed on the multiple data blocks in different orders until the stop criterion is satisfied.
    Type: Application
    Filed: April 8, 2014
    Publication date: May 25, 2017
    Inventors: Qiang Huo, Zhi-Jie Yan, Kai Chen
  • Publication number: 20150199960
    Abstract: Methods and systems for i-vector based clustering of training data in speech recognition are described. An i-vector may be extracted from a segment of the speech training data to represent its acoustic information. The extracted i-vectors may then be clustered into multiple clusters using a hierarchical divisive clustering algorithm. Using a cluster of the multiple clusters, an acoustic model may be trained, and this trained acoustic model may be used in speech recognition.
    Type: Application
    Filed: August 24, 2012
    Publication date: July 16, 2015
    Inventors: Qiang Huo, Zhi-Jie Yan, Yu Zhang, Jian Xu
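The hierarchical divisive clustering named in the abstract above can be sketched as repeated two-way splits of the largest cluster. This is only an illustration of the top-down scheme; the patent's actual split and stop criteria are not disclosed in the abstract, and the deterministic 2-means initialisation here is an assumption.

```python
import numpy as np

def two_means(X, iters=20):
    """Split the rows of X into two clusters; initialise the centroids from
    the two most distant points so the split is deterministic."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    i, j = np.unravel_index(np.argmax(d2), d2.shape)
    centers = np.stack([X[i], X[j]])
    for _ in range(iters):
        assign = ((X[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        for k in (0, 1):
            if np.any(assign == k):
                centers[k] = X[assign == k].mean(0)
    return assign

def divisive_cluster(X, n_clusters):
    """Hierarchical divisive clustering of i-vectors: repeatedly split the
    largest remaining cluster with 2-means until n_clusters remain.
    Returns a list of index arrays, one per cluster."""
    clusters = [np.arange(len(X))]
    while len(clusters) < n_clusters:
        clusters.sort(key=len)
        idx = clusters.pop()                    # largest cluster
        assign = two_means(X[idx])
        clusters += [idx[assign == 0], idx[assign == 1]]
    return clusters
```

Each resulting index set would correspond to one cluster of training segments from which a cluster-specific acoustic model could be trained.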
  • Publication number: 20130185070
    Abstract: A speech recognition system trains a plurality of feature transforms and a plurality of acoustic models using irrelevant variability normalization (IVN) based discriminative training. The speech recognition system employs the trained feature transforms to absorb or ignore variability within an unknown speech input that is irrelevant to phonetic classification. The speech recognition system may then recognize the unknown speech using the trained acoustic models. The speech recognition system may further perform unsupervised adaptation to adapt the feature transforms to the unknown speech and thus increase recognition accuracy.
    Type: Application
    Filed: January 12, 2012
    Publication date: July 18, 2013
    Applicant: Microsoft Corporation
    Inventors: Qiang Huo, Zhi-Jie Yan, Yu Zhang
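The abstract above does not disclose how the feature transforms are estimated or applied. As a loudly hypothetical stand-in, the sketch below estimates a single affine feature transform by least squares so that transformed frames land near their class means, which illustrates the general idea of a transform absorbing variability that is irrelevant to classification; it is not the patent's discriminative training procedure.

```python
import numpy as np

def estimate_affine_transform(X, targets):
    """Least-squares affine transform (A, b) minimising ||A x + b - target||^2
    over training frames X (rows) with per-frame target vectors (e.g. class
    means). A toy stand-in for a trained feature transform."""
    Xa = np.hstack([X, np.ones((len(X), 1))])          # append bias column
    W, *_ = np.linalg.lstsq(Xa, targets, rcond=None)   # W has shape (d+1, d)
    return W[:-1].T, W[-1]                             # A is (d, d), b is (d,)

def apply_transform(A, b, X):
    """Normalise observed features before recognition."""
    return X @ A.T + b
```

In this toy setting, if the observed features are a fixed affine distortion of class-centred features, the estimated transform approximately inverts the distortion, so nearest-class-mean scoring on the transformed features recovers the labels.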
  • Patent number: 8340965
    Abstract: Embodiments of rich context modeling for speech synthesis are disclosed. In operation, a text-to-speech engine refines a plurality of rich context models based on decision tree-tied Hidden Markov Models (HMMs) to produce a plurality of refined rich context models. The text-to-speech engine then generates synthesized speech for an input text based at least on some of the plurality of refined rich context models.
    Type: Grant
    Filed: December 2, 2009
    Date of Patent: December 25, 2012
    Assignee: Microsoft Corporation
    Inventors: Zhi-Jie Yan, Yao Qian, Frank Kao-Ping Soong
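The abstract above says rich context models are refined "based on" decision-tree-tied HMMs but does not spell out the refinement. One common way such a refinement could work, shown purely as an assumption, is MAP-style interpolation: each rich-context unit's statistics are smoothed toward the tied HMM state it falls under, weighted by how much data the unit has.

```python
def refine_rich_context_mean(unit_sum, unit_count, tied_mean, tau=10.0):
    """MAP-style refinement (an assumed scheme, not the patent's): interpolate
    a rich-context unit's accumulated feature sum with the mean of its
    decision-tree-tied HMM state. With no unit data the tied mean is returned;
    with abundant data the unit's own sample mean dominates."""
    return [(s + tau * m) / (unit_count + tau)
            for s, m in zip(unit_sum, tied_mean)]
```

The smoothing weight tau is a hypothetical tuning parameter controlling how strongly sparse units fall back on the tied model.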
  • Publication number: 20120143611
    Abstract: Hidden Markov Model (HMM) trajectory tiling (HTT) based approaches may be used to synthesize speech from text. In operation, a set of HMMs and a set of waveform units may be obtained from a speech corpus. The set of HMMs is further refined via minimum generation error (MGE) training to generate a refined set of HMMs. Subsequently, a speech parameter trajectory may be generated by applying the refined set of HMMs to an input text. A unit lattice of candidate waveform units may be selected from the set of waveform units based at least on the speech parameter trajectory. A normalized cross-correlation (NCC) based search on the unit lattice may then be performed to obtain a minimal concatenation cost sequence of candidate waveform units, which is concatenated into a waveform sequence from which speech is synthesized.
    Type: Application
    Filed: December 7, 2010
    Publication date: June 7, 2012
    Applicant: Microsoft Corporation
    Inventors: Yao Qian, Zhi-Jie Yan, Yi-Jian Wu, Frank Kao-Ping Soong
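The NCC-based lattice search in the abstract above amounts to dynamic programming over candidate waveform units. The sketch below is a minimal illustration: the concatenation cost between adjacent units is taken as 1 minus the zero-lag NCC of their boundary windows, and the HMM-trajectory target costs are omitted for brevity; window sizes and structure are assumptions, not from the patent.

```python
import numpy as np

def ncc(a, b):
    """Zero-lag normalized cross-correlation of two equal-length windows."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def best_unit_path(lattice, window=32):
    """Pick one candidate waveform unit per lattice slot so that the summed
    concatenation cost (1 - NCC between adjacent units' boundary windows)
    is minimal, then concatenate the chosen units."""
    n = len(lattice)
    cost = [np.zeros(len(lattice[0]))]
    back = []
    for t in range(1, n):
        prev, cur = lattice[t - 1], lattice[t]
        # step[c, p]: cost of following candidate p with candidate c.
        step = np.array([[1.0 - ncc(p[-window:], c[:window]) for p in prev]
                         for c in cur])
        total = step + cost[-1][None, :]
        back.append(total.argmin(1))
        cost.append(total.min(1))
    # Backtrack the minimal-cost sequence of candidates.
    j = int(cost[-1].argmin())
    path = [j]
    for bk in reversed(back):
        j = int(bk[j])
        path.append(j)
    path.reverse()
    return path, np.concatenate([lattice[t][path[t]] for t in range(n)])
```

Units whose boundary windows line up (NCC near 1) incur near-zero concatenation cost, so the search prefers smoothly joinable sequences.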
  • Publication number: 20110071835
    Abstract: Embodiments of a small footprint text-to-speech engine are disclosed. In operation, the small footprint text-to-speech engine generates a set of feature parameters for an input text. The set of feature parameters includes static feature parameters and delta feature parameters. The small footprint text-to-speech engine then derives a saw-tooth stochastic trajectory that represents the speech characteristics of the input text based on the static feature parameters and the delta feature parameters. Finally, the small footprint text-to-speech engine produces a smoothed trajectory from the saw-tooth stochastic trajectory, and generates synthesized speech based on the smoothed trajectory.
    Type: Application
    Filed: September 22, 2009
    Publication date: March 24, 2011
    Applicant: Microsoft Corporation
    Inventors: Yi-Ning Chen, Zhi-Jie Yan, Frank Kao-Ping Soong
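The two-stage pipeline in the abstract above (derive a jagged trajectory from static and delta parameters, then smooth it) can be illustrated as follows. This is a toy reading: the per-frame blend of static means with delta-predicted values and the short FIR smoother are assumptions, not the patent's derivation.

```python
import numpy as np

def sawtooth_trajectory(static_means, delta_means, alpha=0.5):
    """Frame-wise trajectory: each static mean is nudged toward the value
    the previous frame's delta (velocity) mean predicts, x_t ~ x_{t-1} + d_{t-1}.
    Without a full trajectory optimisation this yields a jagged,
    saw-tooth-like track."""
    c = np.asarray(static_means, dtype=float)
    d = np.asarray(delta_means, dtype=float)
    traj = c.copy()
    traj[1:] = (1 - alpha) * c[1:] + alpha * (c[:-1] + d[:-1])
    return traj

def smooth_trajectory(traj, taps=(0.25, 0.5, 0.25)):
    """Low-pass the saw-tooth track with a short symmetric FIR filter;
    the endpoints are repeated so the output keeps the input length."""
    padded = np.concatenate([traj[:1], traj, traj[-1:]])
    return np.convolve(padded, np.asarray(taps), mode="valid")
```

The smoothed track has visibly less frame-to-frame jitter than the saw-tooth one while following the same overall contour, which is the property the abstract's final synthesis step relies on.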
  • Publication number: 20110054903
    Abstract: Embodiments of rich context modeling for speech synthesis are disclosed. In operation, a text-to-speech engine refines a plurality of rich context models based on decision tree-tied Hidden Markov Models (HMMs) to produce a plurality of refined rich context models. The text-to-speech engine then generates synthesized speech for an input text based at least on some of the plurality of refined rich context models.
    Type: Application
    Filed: December 2, 2009
    Publication date: March 3, 2011
    Applicant: Microsoft Corporation
    Inventors: Zhi-Jie Yan, Yao Qian, Frank Kao-Ping Soong