Patents by Inventor Tianyan ZHOU

Tianyan ZHOU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11776548
    Abstract: Embodiments may include determination, for each of a plurality of speech frames associated with an acoustic feature, of a phonetic feature based on the associated acoustic feature, generation of one or more two-dimensional feature maps based on the plurality of phonetic features, input of the one or more two-dimensional feature maps to a trained neural network to generate a plurality of speaker embeddings, and aggregation of the plurality of speaker embeddings into a speaker embedding based on respective weights determined for each of the plurality of speaker embeddings, wherein the speaker embedding is associated with an identity of the speaker.
    Type: Grant
    Filed: February 7, 2022
    Date of Patent: October 3, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yong Zhao, Tianyan Zhou, Jinyu Li, Yifan Gong, Jian Wu, Zhuo Chen
  • Publication number: 20220157324
    Abstract: Embodiments may include determination, for each of a plurality of speech frames associated with an acoustic feature, of a phonetic feature based on the associated acoustic feature, generation of one or more two-dimensional feature maps based on the plurality of phonetic features, input of the one or more two-dimensional feature maps to a trained neural network to generate a plurality of speaker embeddings, and aggregation of the plurality of speaker embeddings into a speaker embedding based on respective weights determined for each of the plurality of speaker embeddings, wherein the speaker embedding is associated with an identity of the speaker.
    Type: Application
    Filed: February 7, 2022
    Publication date: May 19, 2022
    Inventors: Yong ZHAO, Tianyan ZHOU, Jinyu LI, Yifan GONG, Jian WU, Zhuo CHEN
  • Patent number: 11276410
    Abstract: Embodiments may include reception of a plurality of speech frames, determination of a multi-dimensional acoustic feature associated with each of the plurality of speech frames, determination of a plurality of multi-dimensional phonetic features, each of the plurality of multi-dimensional phonetic features determined based on a respective one of the plurality of speech frames, generation of a plurality of two-dimensional feature maps based on the phonetic features, input of the feature maps and the plurality of acoustic features to a convolutional neural network, the convolutional neural network to generate a plurality of speaker embeddings based on the plurality of feature maps and the plurality of acoustic features, aggregation of the plurality of speaker embeddings into a first speaker embedding based on respective weights determined for each of the plurality of speaker embeddings, and determination of a speaker associated with the plurality of speech frames based on the first speaker embedding.
    Type: Grant
    Filed: November 13, 2019
    Date of Patent: March 15, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Yong Zhao, Tianyan Zhou, Jinyu Li, Yifan Gong, Jian Wu, Zhuo Chen
  • Publication number: 20210082438
    Abstract: Embodiments may include reception of a plurality of speech frames, determination of a multi-dimensional acoustic feature associated with each of the plurality of speech frames, determination of a plurality of multi-dimensional phonetic features, each of the plurality of multi-dimensional phonetic features determined based on a respective one of the plurality of speech frames, generation of a plurality of two-dimensional feature maps based on the phonetic features, input of the feature maps and the plurality of acoustic features to a convolutional neural network, the convolutional neural network to generate a plurality of speaker embeddings based on the plurality of feature maps and the plurality of acoustic features, aggregation of the plurality of speaker embeddings into a first speaker embedding based on respective weights determined for each of the plurality of speaker embeddings, and determination of a speaker associated with the plurality of speech frames based on the first speaker embedding.
    Type: Application
    Filed: November 13, 2019
    Publication date: March 18, 2021
    Inventors: Yong ZHAO, Tianyan ZHOU, Jinyu LI, Yifan GONG, Jian WU, Zhuo CHEN