Patents by Inventor Frank Soong

Frank Soong has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9728203
    Abstract: Audiovisual data of an individual reading a known script is obtained and stored in an audio library and an image library. The audiovisual data is processed to extract feature vectors used to train a statistical model. An input audio feature vector corresponding to desired speech with which a synthesized image sequence will be synchronized is provided. The statistical model is used to generate a trajectory of visual feature vectors that corresponds to the input audio feature vector. These visual feature vectors are used to identify a matching image sequence from the image library. The resulting sequence of images, concatenated from the image library, provides a photorealistic image sequence with lip movements synchronized with the desired speech.
    Type: Grant
    Filed: May 2, 2011
    Date of Patent: August 8, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lijuan Wang, Frank Soong
  • Patent number: 9613450
    Abstract: Dynamic texture mapping is used to create a photorealistic three dimensional animation of an individual with facial features synchronized with desired speech. Audiovisual data of an individual reading a known script is obtained and stored in an audio library and an image library. The audiovisual data is processed to extract feature vectors used to train a statistical model. An input audio feature vector corresponding to desired speech with which the animation will be synchronized is provided. The statistical model is used to generate a trajectory of visual feature vectors that corresponds to the input audio feature vector. These visual feature vectors are used to identify a matching image sequence from the image library. The resulting sequence of images, concatenated from the image library, provides a photorealistic image sequence with facial features, such as lip movements, synchronized with the desired speech. This image sequence is applied to the three-dimensional model.
    Type: Grant
    Filed: May 3, 2011
    Date of Patent: April 4, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lijuan Wang, Frank Soong, Qiang Huo, Zhengyou Zhang
  • Publication number: 20120284029
    Abstract: Audiovisual data of an individual reading a known script is obtained and stored in an audio library and an image library. The audiovisual data is processed to extract feature vectors used to train a statistical model. An input audio feature vector corresponding to desired speech with which a synthesized image sequence will be synchronized is provided. The statistical model is used to generate a trajectory of visual feature vectors that corresponds to the input audio feature vector. These visual feature vectors are used to identify a matching image sequence from the image library. The resulting sequence of images, concatenated from the image library, provides a photorealistic image sequence with lip movements synchronized with the desired speech.
    Type: Application
    Filed: May 2, 2011
    Publication date: November 8, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Lijuan Wang, Frank Soong
  • Publication number: 20120280974
    Abstract: Dynamic texture mapping is used to create a photorealistic three dimensional animation of an individual with facial features synchronized with desired speech. Audiovisual data of an individual reading a known script is obtained and stored in an audio library and an image library. The audiovisual data is processed to extract feature vectors used to train a statistical model. An input audio feature vector corresponding to desired speech with which the animation will be synchronized is provided. The statistical model is used to generate a trajectory of visual feature vectors that corresponds to the input audio feature vector. These visual feature vectors are used to identify a matching image sequence from the image library. The resulting sequence of images, concatenated from the image library, provides a photorealistic image sequence with facial features, such as lip movements, synchronized with the desired speech. This image sequence is applied to the three-dimensional model.
    Type: Application
    Filed: May 3, 2011
    Publication date: November 8, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Lijuan Wang, Frank Soong, Qiang Huo, Zhengyou Zhang
  • Publication number: 20110231193
    Abstract: Various technologies for generating a synthesized singing voice waveform. In one implementation, the computer program may receive a request from a user to create a synthesized singing voice using the lyrics of a song and a digital file containing its melody as inputs. The computer program may then dissect the lyrics' text and its melody file into its corresponding sub-phonemic units and musical score respectively. The musical score may be further dissected into a sequence of musical notes and duration times for each musical note. The computer program may then determine a fundamental frequency (F0), or pitch, of each musical note.
    Type: Application
    Filed: June 2, 2011
    Publication date: September 22, 2011
    Applicant: Microsoft Corporation
    Inventors: Yao Qian, Frank Soong
  • Patent number: 7977562
    Abstract: Various technologies for generating a synthesized singing voice waveform. In one implementation, the computer program may receive a request from a user to create a synthesized singing voice using the lyrics of a song and a digital file containing its melody as inputs. The computer program may then dissect the lyrics' text and its melody file into its corresponding sub-phonemic units and musical score respectively. The musical score may be further dissected into a sequence of musical notes and duration times for each musical note. The computer program may then determine a fundamental frequency (F0), or pitch, of each musical note.
    Type: Grant
    Filed: June 20, 2008
    Date of Patent: July 12, 2011
    Assignee: Microsoft Corporation
    Inventors: Yao Qian, Frank Soong
  • Patent number: 7903877
    Abstract: Exemplary methods, systems, and computer-readable media for developing, training and/or using models for online handwriting recognition of characters are described. An exemplary method for building a trainable radical-based HMM for use in character recognition includes defining radical nodes, where a radical node represents a structural element of an character, and defining connection nodes, where a connection node represents a spatial relationship between two or more radicals. Such a method may include determining a number of paths in the radical-based HMM using subsequence direction histogram vector (SDHV) clustering and determining a number of states in the radical-based HMM using curvature scale space-based (CSS) corner detection.
    Type: Grant
    Filed: March 6, 2007
    Date of Patent: March 8, 2011
    Assignee: Microsoft Corporation
    Inventors: Shi Han, Yu Zou, Ming Chang, Peng Liu, Yi-Jian Wu, Lei Ma, Frank Soong, Dongmei Zhang, Jian Wang
  • Patent number: 7805004
    Abstract: Exemplary techniques are described for selecting radical sets for use in probabilistic East Asian character recognition algorithms. An exemplary technique includes applying a decomposition rule to each East Asian character of the set to generate a progressive splitting graph where the progressive splitting graph comprises radicals as nodes, formulating an optimization problem to find an optimal set of radicals to represent the set of East Asian characters using maximum likelihood and minimum description length and solving the optimization problem for the optimal set of radicals. Another exemplary technique includes selecting an optimal set of radicals by using a general function that characterizes a radical with respect to other East Asian characters and a complex function that characterizes complexity of a radical.
    Type: Grant
    Filed: February 28, 2007
    Date of Patent: September 28, 2010
    Assignee: Microsoft Corporation
    Inventors: Shi Han, Yu Zou, Ming Chang, Peng Liu, Yi-Jian Wu, Lei Ma, Frank Soong, Dongmei Zhang, Jian Wang
  • Publication number: 20090314155
    Abstract: Various technologies for generating a synthesized singing voice waveform. In one implementation, the computer program may receive a request from a user to create a synthesized singing voice using the lyrics of a song and a digital file containing its melody as inputs. The computer program may then dissect the lyrics' text and its melody file into its corresponding sub-phonemic units and musical score respectively. The musical score may be further dissected into a sequence of musical notes and duration times for each musical note. The computer program may then determine a fundamental frequency (F0), or pitch, of each musical note.
    Type: Application
    Filed: June 20, 2008
    Publication date: December 24, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Yao Qian, Frank Soong
  • Publication number: 20090099847
    Abstract: Detailed herein is a technology which, among other things, reduces errors introduced in recording and transcription data. In one approach to this technology, a method of detecting audio transcription errors is utilized. This method includes selected a focus unit, and selecting a context template corresponding to the focus unit. A hypothesis set is then determined, with reference to the context template and the focus unit. A probability is calculated corresponding to the focus unit, across the hypothesis set.
    Type: Application
    Filed: October 10, 2007
    Publication date: April 16, 2009
    Applicant: Microsoft Corporation
    Inventors: Frank Soong, Lijuan Wang
  • Publication number: 20080219556
    Abstract: Exemplary methods, systems, and computer-readable media for developing, training and/or using models for online handwriting recognition of characters are described. An exemplary method for building a trainable radical-based HMM for use in character recognition includes defining radical nodes, where a radical node represents a structural element of an character, and defining connection nodes, where a connection node represents a spatial relationship between two or more radicals. Such a method may include determining a number of paths in the radical-based HMM using subsequence direction histogram vector (SDHV) clustering and determining a number of states in the radical-based HMM using curvature scale space-based (CSS) corner detection.
    Type: Application
    Filed: March 6, 2007
    Publication date: September 11, 2008
    Applicant: Microsoft Corporation
    Inventors: Shi Han, Yu Zou, Ming Chang, Peng Liu, Yi-Jian Wu, Lei Ma, Frank Soong, Dongmei Zhang, Jian Wang
  • Publication number: 20080205761
    Abstract: Exemplary techniques are described for selecting radical sets for use in probabilistic East Asian character recognition algorithms. An exemplary technique includes applying a decomposition rule to each East Asian character of the set to generate a progressive splitting graph where the progressive splitting graph comprises radicals as nodes, formulating an optimization problem to find an optimal set of radicals to represent the set of East Asian characters using maximum likelihood and minimum description length and solving the optimization problem for the optimal set of radicals. Another exemplary technique includes selecting an optimal set of radicals by using a general function that characterizes a radical with respect to other East Asian characters and a complex function that characterizes complexity of a radical.
    Type: Application
    Filed: February 28, 2007
    Publication date: August 28, 2008
    Applicant: Microsoft Corporation
    Inventors: Shi Han, Yu Zou, Ming Chang, Peng Liu, Yi-Jian Wu, Lei Ma, Frank Soong, Dongmei Zhang, Jian Wang
  • Publication number: 20070239432
    Abstract: Multiple input modalities are selectively used by a user or process to prune a word graph. Pruning initiates rescoring in order to generate a new word graph with a revised best path.
    Type: Application
    Filed: March 30, 2006
    Publication date: October 11, 2007
    Applicant: Microsoft Corporation
    Inventors: Frank Soong, Jian-Lai Zhou, Peng Liu
  • Publication number: 20070219796
    Abstract: A Weighted Likelihood Ratio Hidden Markov Model is utilized for speech processing. The model emphasizes spectral peaks when comparing spectra. Probability density functions for states in the model can be developed with weights based on the comparison.
    Type: Application
    Filed: March 20, 2006
    Publication date: September 20, 2007
    Applicant: Microsoft Corporation
    Inventors: Chao Huang, Frank Soong, Jian-lai Zhou
  • Publication number: 20070219797
    Abstract: Speech recognition such as command and control speech recognition generally use a context free grammar to constrain the decoding process. Word or subword background model are constructed to repopulate dynamic hypothesis space, especially when word spareness is at issue. The background models can be later used in speech recognition. During speech recognition, background and conventional context free grammar decoding are used to measure confidence. The discussion above is merely provided for general background information and is not intended to be used as an aid in determining the scope of the claimed subject matter.
    Type: Application
    Filed: March 16, 2006
    Publication date: September 20, 2007
    Applicant: Microsoft Corporation
    Inventors: Peng Liu, Ye Tian, Jian-Lai Zhou, Frank Soong
  • Publication number: 20060290656
    Abstract: Input is received from at least two different input sources. Information from these sources are combined together to provide a result. In a particular example, input from one source corresponds to potential recognition candidates, and input from another source corresponds to other potential candidates. These candidates are combined together to select a result.
    Type: Application
    Filed: June 28, 2005
    Publication date: December 28, 2006
    Applicant: Microsoft Corporation
    Inventors: Frank Soong, Jian-Lai Zhou, Ye Tian