Patents by Inventor Vyacheslav Shechtman

Vyacheslav Shechtman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11842728
    Abstract: An example system includes a processor to receive training targets. The training targets include an observed prosody info vector. The processor can train a neural network to predict acoustic sequences based on the training targets. The processor can train a prosody info generator to predict combined prosody info.
    Type: Grant
    Filed: April 28, 2022
    Date of Patent: December 12, 2023
    Assignee: International Business Machines Corporation
    Inventor: Vyacheslav Shechtman
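
    As a rough illustration of the "observed prosody info vector" mentioned in the abstract above, the sketch below summarizes one utterance's observed pitch and duration as a small fixed-size vector. The specific features (log-F0 statistics, speaking rate, duration variability) are assumptions made for the example and are not taken from the patent.

```python
# Illustrative only: one plausible way to build a sentence-level "prosody info"
# vector from observed pitch and duration, as a training target for a TTS model.
# The feature choices here are assumptions, not the patented formulation.
import numpy as np

def observed_prosody_info(f0_hz: np.ndarray, phone_durations_sec: np.ndarray) -> np.ndarray:
    """Summarize observed prosody of one utterance as a small fixed-size vector."""
    voiced = f0_hz[f0_hz > 0]                      # ignore unvoiced frames
    log_f0 = np.log(voiced) if voiced.size else np.zeros(1)
    speaking_rate = len(phone_durations_sec) / max(phone_durations_sec.sum(), 1e-6)
    return np.array([
        log_f0.mean(),                             # pitch level
        log_f0.std(),                              # pitch variability
        speaking_rate,                             # phones per second
        phone_durations_sec.std(),                 # duration variability
    ], dtype=np.float32)

# Example: a short utterance with some unvoiced frames
f0 = np.array([0.0, 110.0, 120.0, 0.0, 130.0, 125.0])
durs = np.array([0.08, 0.12, 0.10, 0.09])
print(observed_prosody_info(f0, durs))
```
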
  • Patent number: 11556782
    Abstract: In a trained attentive decoder of a trained Sequence-to-Sequence (seq2seq) Artificial Neural Network (ANN): obtaining an encoded input vector sequence; generating, using a trained primary attention mechanism of the trained attentive decoder, a primary attention vectors sequence; for each primary attention vector of the primary attention vectors sequence: (a) generating a set of attention vector candidates corresponding to the respective primary attention vector, (b) evaluating, for each attention vector candidate of the set of attention vector candidates, a structure fit measure that quantifies a similarity of the respective attention vector candidate to a desired attention vector structure, (c) generating, using a trained soft-selection ANN, a secondary attention vector based on said evaluation and on state variables of the trained attentive decoder; and generating, using the trained attentive decoder, an output sequence based on the encoded input vector sequence and the secondary attention vectors.
    Type: Grant
    Filed: September 19, 2019
    Date of Patent: January 17, 2023
    Assignee: International Business Machines Corporation
    Inventors: Vyacheslav Shechtman, Alexander Sorin
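
    The sketch below illustrates, in simplified form, the candidate-generation, structure-fit scoring, and soft-selection steps described in the abstract above. The candidate construction, the entropy-based structure-fit proxy (favoring concentrated, near one-hot attention), and the softmax weighting are assumptions; the patent uses a trained soft-selection network that also conditions on decoder state.

```python
# Illustrative only: a simplified stand-in for structure-aware secondary attention.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def secondary_attention(primary: np.ndarray, previous: np.ndarray) -> np.ndarray:
    # (a) a few candidate attention vectors derived from the primary one
    sharpened = softmax(np.log(primary + 1e-9) * 2.0)   # more peaked version
    shifted = np.roll(previous, 1)                       # advance previous focus one step
    shifted[0] = 0.0
    shifted = shifted / max(shifted.sum(), 1e-9)
    candidates = [primary, sharpened, shifted]

    # (b) structure-fit proxy: prefer concentrated (near one-hot) distributions
    entropies = np.array([-(c * np.log(c + 1e-9)).sum() for c in candidates])
    weights = softmax(-entropies)        # lower entropy -> better fit -> higher weight

    # (c) soft selection: weighted combination of the candidates
    return sum(w * c for w, c in zip(weights, candidates))

prev = np.array([1.0, 0.0, 0.0, 0.0])
prim = np.array([0.1, 0.6, 0.2, 0.1])
print(secondary_attention(prim, prev))
```
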
  • Patent number: 11557274
    Abstract: Embodiments may provide improved techniques to assess model checkpoint stability on unseen data on-the-fly, so as to prevent unstable checkpoints from being saved, and to avoid or reduce the need for an expensive thorough evaluation. For example, a method may comprise passing a set of input sequences through a checkpoint of a sequence to sequence model in inference mode to obtain a set of generated sequences of feature vectors, determining whether each of a plurality of generated sequences of feature vectors is complete, counting a number of incomplete generated sequences of feature vectors among the plurality of generated sequences of feature vectors, generating a score indicating a stability of the model based on the count of incomplete generated sequences of feature vectors, and storing the model checkpoint when the score indicating the stability of the model is above a predetermined threshold.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: January 17, 2023
    Assignee: International Business Machines Corporation
    Inventor: Vyacheslav Shechtman
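
    A minimal sketch of the checkpoint-gating logic described in the abstract above. Treating a sequence that never produces a stop decision as "incomplete", and using the completed fraction as the stability score, are assumptions made for illustration.

```python
# Illustrative only: score a checkpoint by how many validation inputs it can
# synthesize to completion, and keep it only if the score clears a threshold.
from typing import Callable, Iterable, List, Tuple

def stability_score(
    synthesize: Callable[[str], Tuple[List[list], bool]],  # returns (frames, stopped)
    validation_texts: Iterable[str],
) -> float:
    texts = list(validation_texts)
    incomplete = sum(1 for t in texts if not synthesize(t)[1])
    return 1.0 - incomplete / max(len(texts), 1)

def maybe_save_checkpoint(score: float, threshold: float = 0.95) -> bool:
    """Only keep the checkpoint if the stability score clears the threshold."""
    return score >= threshold

# Example with a fake synthesizer that fails to terminate on long inputs
fake = lambda text: ([[0.0]] * 10, len(text) < 20)
score = stability_score(fake, ["hi there", "a very long sentence that never stops"])
print(score, maybe_save_checkpoint(score))
```
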
  • Publication number: 20220328041
    Abstract: An example system includes a processor to receive training targets. The training targets include an observed prosody info vector. The processor can train a neural network to predict acoustic sequences based on the training targets. The processor can train a prosody info generator to predict combined prosody info.
    Type: Application
    Filed: April 28, 2022
    Publication date: October 13, 2022
    Inventor: Vyacheslav Shechtman
  • Publication number: 20220293083
    Abstract: Embodiments may provide improved techniques to assess model checkpoint stability on unseen data on-the-fly, so as to prevent unstable checkpoints from being saved, and to avoid or reduce the need for an expensive thorough evaluation. For example, a method may comprise passing a set of input sequences through a checkpoint of a sequence to sequence model in inference mode to obtain a set of generated sequences of feature vectors, determining whether each of a plurality of generated sequences of feature vectors is complete, counting a number of incomplete generated sequences of feature vectors among the plurality of generated sequences of feature vectors, generating a score indicating a stability of the model based on the count of incomplete generated sequences of feature vectors, and storing the model checkpoint when the score indicating the stability of the model is above a predetermined threshold.
    Type: Application
    Filed: March 15, 2021
    Publication date: September 15, 2022
    Inventor: Vyacheslav Shechtman
  • Patent number: 11322135
    Abstract: An example system includes a processor to receive a linguistic sequence and a prosody info offset. The processor can generate, via a trained prosody info predictor, combined prosody info including a number of observations based on the linguistic sequence. The number of observations include linear combinations of statistical measures evaluating a prosodic component over a predetermined period of time. The processor can generate, via a trained neural network, an acoustic sequence based on the combined prosody info, the prosody info offset, and the linguistic sequence.
    Type: Grant
    Filed: September 12, 2019
    Date of Patent: May 3, 2022
    Assignee: International Business Machines Corporation
    Inventor: Vyacheslav Shechtman
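
    The following PyTorch sketch shows one possible shape of the inference path described in the abstract above: a predicted prosody info vector is adjusted by a user-supplied offset before conditioning acoustic generation. The module sizes, the simple additive offset, and the GRU decoder are assumptions, not the patented architecture.

```python
# Illustrative only: prosody-offset control at synthesis time.
import torch
import torch.nn as nn

class ProsodyInfoPredictor(nn.Module):
    def __init__(self, text_dim=64, prosody_dim=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(text_dim, 32), nn.Tanh(), nn.Linear(32, prosody_dim))

    def forward(self, text_enc):                 # (T, text_dim)
        return self.net(text_enc.mean(dim=0))    # pooled -> (prosody_dim,)

class AcousticDecoder(nn.Module):
    def __init__(self, text_dim=64, prosody_dim=4, mel_dim=80):
        super().__init__()
        self.rnn = nn.GRU(text_dim + prosody_dim, 128, batch_first=True)
        self.out = nn.Linear(128, mel_dim)

    def forward(self, text_enc, prosody):        # broadcast prosody to every step
        x = torch.cat([text_enc, prosody.expand(text_enc.size(0), -1)], dim=-1)
        h, _ = self.rnn(x.unsqueeze(0))
        return self.out(h).squeeze(0)            # (T, mel_dim)

text_enc = torch.randn(12, 64)                   # stand-in for an encoded linguistic sequence
offset = torch.tensor([0.5, 0.0, 0.0, 0.0])      # e.g. nudge one prosodic observation upward
prosody = ProsodyInfoPredictor()(text_enc) + offset
mel = AcousticDecoder()(text_enc, prosody)
print(mel.shape)                                 # torch.Size([12, 80])
```
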
  • Publication number: 20210304783
    Abstract: Method, system and computer program product, the method comprising: receiving a first audio, wherein the first audio is a conversion of an audio by a first source to a second source, wherein the first audio has embedded therein first information characterizing the first source of the audio; extracting from the first audio the first information of the first source embedded within the first audio; obtaining second information characterizing a third source; comparing the first information to the second information to obtain comparison results; and subject to the comparison results indicating that the first source is the same as the third source, initiating an action.
    Type: Application
    Filed: March 31, 2020
    Publication date: September 30, 2021
    Inventors: Zvi Kons, Vyacheslav Shechtman
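
    As a simplified illustration of the comparison-and-act step in the abstract above, the sketch below compares an extracted source-information vector against reference information for a claimed source and triggers an action on a match. How that information is embedded in and extracted from the converted audio is not shown here, and the cosine-similarity threshold is an assumption.

```python
# Illustrative only: compare extracted source info with reference info and act on a match.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def check_source(extracted_info: np.ndarray, reference_info: np.ndarray,
                 on_match, threshold: float = 0.85) -> bool:
    """Initiate an action only if the embedded source info matches the reference source."""
    if cosine(extracted_info, reference_info) >= threshold:
        on_match()
        return True
    return False

rng = np.random.default_rng(0)
ref = rng.normal(size=128)                          # info characterizing the claimed source
extracted = ref + 0.05 * rng.normal(size=128)       # info extracted from the converted audio
check_source(extracted, ref, on_match=lambda: print("source verified: proceeding"))
```
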
  • Patent number: 11062691
    Abstract: Embodiments of the present systems and methods may provide techniques that provide the capability to automatically generate allowance intervals for voice personas that meet desired requirements for realism and fidelity. For example, a method for voice persona generation may be implemented in a computer system comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, the method comprising: displaying to a user, a plurality of user-selectable voice persona parameters that control features of a synthesized voice signal, and displaying, in conjunction with each of at least some of the plurality of user-selectable voice persona parameters, voice transformation allowance intervals of the voice persona parameters, accepting from a user, a selection of at least one user-selectable voice persona parameter, and generating a synthesized voice signal based on the selected at least one user-selectable voice persona parameter.
    Type: Grant
    Filed: May 13, 2019
    Date of Patent: July 13, 2021
    Assignee: International Business Machines Corporation
    Inventors: Vyacheslav Shechtman, Alexander Sorin
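
    The sketch below illustrates how the displayed allowance intervals described above might constrain user-selected voice persona parameters before synthesis. The parameter names and interval values are invented for the example and are not taken from the patent.

```python
# Illustrative only: keep requested voice persona parameters inside their allowance intervals.
from dataclasses import dataclass

@dataclass
class AllowanceInterval:
    low: float
    high: float

    def clamp(self, value: float) -> float:
        return min(max(value, self.low), self.high)

# Intervals within which transformations are assumed to stay natural-sounding
ALLOWANCES = {
    "pitch_shift_semitones": AllowanceInterval(-3.0, 3.0),
    "tempo_ratio": AllowanceInterval(0.8, 1.25),
}

def apply_persona(user_params: dict) -> dict:
    """Clamp every requested parameter to its displayed allowance interval."""
    return {name: ALLOWANCES[name].clamp(value) for name, value in user_params.items()}

print(apply_persona({"pitch_shift_semitones": 7.0, "tempo_ratio": 1.1}))
# {'pitch_shift_semitones': 3.0, 'tempo_ratio': 1.1}
```
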
  • Publication number: 20210089877
    Abstract: In a trained attentive decoder of a trained Sequence-to-Sequence (seq2seq) Artificial Neural Network (ANN): obtaining an encoded input vector sequence; generating, using a trained primary attention mechanism of the trained attentive decoder, a primary attention vectors sequence; for each primary attention vector of the primary attention vectors sequence: (a) generating a set of attention vector candidates corresponding to the respective primary attention vector, (b) evaluating, for each attention vector candidate of the set of attention vector candidates, a structure fit measure that quantifies a similarity of the respective attention vector candidate to a desired attention vector structure, (c) generating, using a trained soft-selection ANN, a secondary attention vector based on said evaluation and on state variables of the trained attentive decoder; and generating, using the trained attentive decoder, an output sequence based on the encoded input vector sequence and the secondary attention vectors.
    Type: Application
    Filed: September 19, 2019
    Publication date: March 25, 2021
    Inventors: Vyacheslav Shechtman, Alexander Sorin
  • Publication number: 20210082408
    Abstract: An example system includes a processor to receive a linguistic sequence and a prosody info offset. The processor can generate, via a trained prosody info predictor, combined prosody info including a number of observations based on the linguistic sequence. The number of observations include linear combinations of statistical measures evaluating a prosodic component over a predetermined period of time. The processor can generate, via a trained neural network, an acoustic sequence based on the combined prosody info, the prosody info offset, and the linguistic sequence.
    Type: Application
    Filed: September 12, 2019
    Publication date: March 18, 2021
    Inventor: Vyacheslav Shechtman
  • Publication number: 20200365135
    Abstract: Embodiments of the present systems and methods may provide techniques that provide the capability to automatically generate allowance intervals for voice personas that meet desired requirements for realism and fidelity. For example, a method for voice persona generation may be implemented in a computer system comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, the method comprising: displaying to a user, a plurality of user-selectable voice persona parameters that control features of a synthesized voice signal, and displaying, in conjunction with each of at least some of the plurality of user-selectable voice persona parameters, voice transformation allowance intervals of the voice persona parameters, accepting from a user, a selection of at least one user-selectable voice persona parameter, and generating a synthesized voice signal based on the selected at least one user-selectable voice persona parameter.
    Type: Application
    Filed: May 13, 2019
    Publication date: November 19, 2020
    Inventors: Vyacheslav Shechtman, Alexander Sorin