Patents by Inventor Vyacheslav Shechtman

Vyacheslav Shechtman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11842728
    Abstract: An example system includes a processor to receive training targets. The training targets include an observed prosody info vector. The processor can train a neural network to predict acoustic sequences based on the training targets. The processor can train a prosody info generator to predict combined prosody info.
    Type: Grant
    Filed: April 28, 2022
    Date of Patent: December 12, 2023
    Assignee: International Business Machines Corporation
    Inventor: Vyacheslav Shechtman
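
    As a rough illustration of the "observed prosody info vector" mentioned in the abstract above, the sketch below summarizes one utterance's observed pitch and duration as a small fixed-size vector. The specific features (log-F0 statistics, speaking rate, duration variability) are assumptions made for the example and are not taken from the patent.

```python
# Illustrative only: one plausible way to build a sentence-level "prosody info"
# vector from observed pitch and duration, as a training target for a TTS model.
# The feature choices here are assumptions, not the patented formulation.
import numpy as np

def observed_prosody_info(f0_hz: np.ndarray, phone_durations_sec: np.ndarray) -> np.ndarray:
    """Summarize observed prosody of one utterance as a small fixed-size vector."""
    voiced = f0_hz[f0_hz > 0]                      # ignore unvoiced frames
    log_f0 = np.log(voiced) if voiced.size else np.zeros(1)
    speaking_rate = len(phone_durations_sec) / max(phone_durations_sec.sum(), 1e-6)
    return np.array([
        log_f0.mean(),                             # pitch level
        log_f0.std(),                              # pitch variability
        speaking_rate,                             # phones per second
        phone_durations_sec.std(),                 # duration variability
    ], dtype=np.float32)

# Example: a short utterance with some unvoiced frames
f0 = np.array([0.0, 110.0, 120.0, 0.0, 130.0, 125.0])
durs = np.array([0.08, 0.12, 0.10, 0.09])
print(observed_prosody_info(f0, durs))
```
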
  • Patent number: 11556782
    Abstract: In a trained attentive decoder of a trained Sequence-to-Sequence (seq2seq) Artificial Neural Network (ANN): obtaining an encoded input vector sequence; generating, using a trained primary attention mechanism of the trained attentive decoder, a primary attention vectors sequence; for each primary attention vector of the primary attention vectors sequence: (a) generating a set of attention vector candidates corresponding to the respective primary attention vector, (b) evaluating, for each attention vector candidate of the set of attention vector candidates, a structure fit measure that quantifies a similarity of the respective attention vector candidate to a desired attention vector structure, (c) generating, using a trained soft-selection ANN, a secondary attention vector based on said evaluation and on state variables of the trained attentive decoder; and generating, using the trained attentive decoder, an output sequence based on the encoded input vector sequence and the secondary attention vectors.
    Type: Grant
    Filed: September 19, 2019
    Date of Patent: January 17, 2023
    Assignee: International Business Machines Corporation
    Inventors: Vyacheslav Shechtman, Alexander Sorin
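
    The sketch below illustrates, in simplified form, the candidate-generation, structure-fit scoring, and soft-selection steps described in the abstract above. The candidate construction, the entropy-based structure-fit proxy (favoring concentrated, near one-hot attention), and the softmax weighting are assumptions; the patent uses a trained soft-selection network that also conditions on decoder state.

```python
# Illustrative only: a simplified stand-in for structure-aware secondary attention.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def secondary_attention(primary: np.ndarray, previous: np.ndarray) -> np.ndarray:
    # (a) a few candidate attention vectors derived from the primary one
    sharpened = softmax(np.log(primary + 1e-9) * 2.0)   # more peaked version
    shifted = np.roll(previous, 1)                       # advance previous focus one step
    shifted[0] = 0.0
    shifted = shifted / max(shifted.sum(), 1e-9)
    candidates = [primary, sharpened, shifted]

    # (b) structure-fit proxy: prefer concentrated (near one-hot) distributions
    entropies = np.array([-(c * np.log(c + 1e-9)).sum() for c in candidates])
    weights = softmax(-entropies)        # lower entropy -> better fit -> higher weight

    # (c) soft selection: weighted combination of the candidates
    return sum(w * c for w, c in zip(weights, candidates))

prev = np.array([1.0, 0.0, 0.0, 0.0])
prim = np.array([0.1, 0.6, 0.2, 0.1])
print(secondary_attention(prim, prev))
```
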
  • Patent number: 11557274
    Abstract: Embodiments may provide improved techniques to assess model checkpoint stability on unseen data on-the-fly, so as to prevent unstable checkpoints from being saved, and to avoid or reduce the need for an expensive thorough evaluation. For example, a method may comprise passing a set of input sequences through a checkpoint of a sequence to sequence model in inference mode to obtain a set of generated sequences of feature vectors, determining whether each of a plurality of generated sequences of feature vectors is complete, counting a number of incomplete generated sequences of feature vectors among the plurality of generated sequences of feature vectors, generating a score indicating a stability of the model based on the count of incomplete generated sequences of feature vectors, and storing the model checkpoint when the score indicating the stability of the model is above a predetermined threshold.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: January 17, 2023
    Assignee: International Business Machines Corporation
    Inventor: Vyacheslav Shechtman
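
    A minimal sketch of the checkpoint-gating logic described in the abstract above. Treating a sequence that never produces a stop decision as "incomplete", and using the completed fraction as the stability score, are assumptions made for illustration.

```python
# Illustrative only: score a checkpoint by how many validation inputs it can
# synthesize to completion, and keep it only if the score clears a threshold.
from typing import Callable, Iterable, List, Tuple

def stability_score(
    synthesize: Callable[[str], Tuple[List[list], bool]],  # returns (frames, stopped)
    validation_texts: Iterable[str],
) -> float:
    texts = list(validation_texts)
    incomplete = sum(1 for t in texts if not synthesize(t)[1])
    return 1.0 - incomplete / max(len(texts), 1)

def maybe_save_checkpoint(score: float, threshold: float = 0.95) -> bool:
    """Only keep the checkpoint if the stability score clears the threshold."""
    return score >= threshold

# Example with a fake synthesizer that fails to terminate on long inputs
fake = lambda text: ([[0.0]] * 10, len(text) < 20)
score = stability_score(fake, ["hi there", "a very long sentence that never stops"])
print(score, maybe_save_checkpoint(score))
```
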
  • Publication number: 20220328041
    Abstract: An example system includes a processor to receive training targets. The training targets include an observed prosody info vector. The processor can train a neural network to predict acoustic sequences based on the training targets. The processor can train a prosody info generator to predict combined prosody info.
    Type: Application
    Filed: April 28, 2022
    Publication date: October 13, 2022
    Inventor: Vyacheslav Shechtman
  • Publication number: 20220293083
    Abstract: Embodiments may provide improved techniques to assess model checkpoint stability on unseen data on-the-fly, so as to prevent unstable checkpoints from being saved, and to avoid or reduce the need for an expensive thorough evaluation. For example, a method may comprise passing a set of input sequences through a checkpoint of a sequence to sequence model in inference mode to obtain a set of generated sequences of feature vectors, determining whether each of a plurality of generated sequences of feature vectors is complete, counting a number of incomplete generated sequences of feature vectors among the plurality of generated sequences of feature vectors, generating a score indicating a stability of the model based on the count of incomplete generated sequences of feature vectors, and storing the model checkpoint when the score indicating the stability of the model is above a predetermined threshold.
    Type: Application
    Filed: March 15, 2021
    Publication date: September 15, 2022
    Inventor: Vyacheslav Shechtman
  • Patent number: 11322135
    Abstract: An example system includes a processor to receive a linguistic sequence and a prosody info offset. The processor can generate, via a trained prosody info predictor, combined prosody info including a number of observations based on the linguistic sequence. The number of observations include linear combinations of statistical measures evaluating a prosodic component over a predetermined period of time. The processor can generate, via a trained neural network, an acoustic sequence based on the combined prosody info, the prosody info offset, and the linguistic sequence.
    Type: Grant
    Filed: September 12, 2019
    Date of Patent: May 3, 2022
    Assignee: International Business Machines Corporation
    Inventor: Vyacheslav Shechtman
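
    The following PyTorch sketch shows one possible shape of the inference path described in the abstract above: a predicted prosody info vector is adjusted by a user-supplied offset before conditioning acoustic generation. The module sizes, the simple additive offset, and the GRU decoder are assumptions, not the patented architecture.

```python
# Illustrative only: prosody-offset control at synthesis time.
import torch
import torch.nn as nn

class ProsodyInfoPredictor(nn.Module):
    def __init__(self, text_dim=64, prosody_dim=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(text_dim, 32), nn.Tanh(), nn.Linear(32, prosody_dim))

    def forward(self, text_enc):                 # (T, text_dim)
        return self.net(text_enc.mean(dim=0))    # pooled -> (prosody_dim,)

class AcousticDecoder(nn.Module):
    def __init__(self, text_dim=64, prosody_dim=4, mel_dim=80):
        super().__init__()
        self.rnn = nn.GRU(text_dim + prosody_dim, 128, batch_first=True)
        self.out = nn.Linear(128, mel_dim)

    def forward(self, text_enc, prosody):        # broadcast prosody to every step
        x = torch.cat([text_enc, prosody.expand(text_enc.size(0), -1)], dim=-1)
        h, _ = self.rnn(x.unsqueeze(0))
        return self.out(h).squeeze(0)            # (T, mel_dim)

text_enc = torch.randn(12, 64)                   # stand-in for an encoded linguistic sequence
offset = torch.tensor([0.5, 0.0, 0.0, 0.0])      # e.g. nudge one prosodic observation upward
prosody = ProsodyInfoPredictor()(text_enc) + offset
mel = AcousticDecoder()(text_enc, prosody)
print(mel.shape)                                 # torch.Size([12, 80])
```
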
  • Publication number: 20210304783
    Abstract: Method, system and computer program product, the method comprising: receiving a first audio, wherein the first audio is a conversion of an audio by a first source to a second source, wherein the first audio has embedded therein first information characterizing the first source of the audio; extracting from the first audio the first information of the first source embedded within the first audio; obtaining second information characterizing a third source; comparing the first information to the second information to obtain comparison results; and subject to the comparison results indicating that the first source is the same as the third source, initiating an action.
    Type: Application
    Filed: March 31, 2020
    Publication date: September 30, 2021
    Inventors: Zvi Kons, Vyacheslav Shechtman
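
    As a simplified illustration of the comparison-and-act step in the abstract above, the sketch below compares an extracted source-information vector against reference information for a claimed source and triggers an action on a match. How that information is embedded in and extracted from the converted audio is not shown here, and the cosine-similarity threshold is an assumption.

```python
# Illustrative only: compare extracted source info with reference info and act on a match.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def check_source(extracted_info: np.ndarray, reference_info: np.ndarray,
                 on_match, threshold: float = 0.85) -> bool:
    """Initiate an action only if the embedded source info matches the reference source."""
    if cosine(extracted_info, reference_info) >= threshold:
        on_match()
        return True
    return False

rng = np.random.default_rng(0)
ref = rng.normal(size=128)                          # info characterizing the claimed source
extracted = ref + 0.05 * rng.normal(size=128)       # info extracted from the converted audio
check_source(extracted, ref, on_match=lambda: print("source verified: proceeding"))
```
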
  • Patent number: 11062691
    Abstract: Embodiments of the present systems and methods may provide techniques that provide the capability to automatically generate allowance intervals for voice personas that meet desired requirements for realism and fidelity. For example, a method for voice persona generation may be implemented in a computer system comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, the method comprising: displaying to a user, a plurality of user-selectable voice persona parameters that control features of a synthesized voice signal, and displaying, in conjunction with each of at least some of the plurality of user-selectable voice persona parameters, voice transformation allowance intervals of the voice persona parameters, accepting from a user, a selection of at least one user-selectable voice persona parameter, and generating a synthesized voice signal based on the selected at least one user-selectable voice persona parameter.
    Type: Grant
    Filed: May 13, 2019
    Date of Patent: July 13, 2021
    Assignee: International Business Machines Corporation
    Inventors: Vyacheslav Shechtman, Alexander Sorin
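
    The sketch below illustrates how the displayed allowance intervals described above might constrain user-selected voice persona parameters before synthesis. The parameter names and interval values are invented for the example and are not taken from the patent.

```python
# Illustrative only: keep requested voice persona parameters inside their allowance intervals.
from dataclasses import dataclass

@dataclass
class AllowanceInterval:
    low: float
    high: float

    def clamp(self, value: float) -> float:
        return min(max(value, self.low), self.high)

# Intervals within which transformations are assumed to stay natural-sounding
ALLOWANCES = {
    "pitch_shift_semitones": AllowanceInterval(-3.0, 3.0),
    "tempo_ratio": AllowanceInterval(0.8, 1.25),
}

def apply_persona(user_params: dict) -> dict:
    """Clamp every requested parameter to its displayed allowance interval."""
    return {name: ALLOWANCES[name].clamp(value) for name, value in user_params.items()}

print(apply_persona({"pitch_shift_semitones": 7.0, "tempo_ratio": 1.1}))
# {'pitch_shift_semitones': 3.0, 'tempo_ratio': 1.1}
```
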
  • Publication number: 20210089877
    Abstract: In a trained attentive decoder of a trained Sequence-to-Sequence (seq2seq) Artificial Neural Network (ANN): obtaining an encoded input vector sequence; generating, using a trained primary attention mechanism of the trained attentive decoder, a primary attention vectors sequence; for each primary attention vector of the primary attention vectors sequence: (a) generating a set of attention vector candidates corresponding to the respective primary attention vector, (b) evaluating, for each attention vector candidate of the set of attention vector candidates, a structure fit measure that quantifies a similarity of the respective attention vector candidate to a desired attention vector structure, (c) generating, using a trained soft-selection ANN, a secondary attention vector based on said evaluation and on state variables of the trained attentive decoder; and generating, using the trained attentive decoder, an output sequence based on the encoded input vector sequence and the secondary attention vectors.
    Type: Application
    Filed: September 19, 2019
    Publication date: March 25, 2021
    Inventors: Vyacheslav Shechtman, Alexander Sorin
  • Publication number: 20210082408
    Abstract: An example system includes a processor to receive a linguistic sequence and a prosody info offset. The processor can generate, via a trained prosody info predictor, combined prosody info including a number of observations based on the linguistic sequence. The number of observations include linear combinations of statistical measures evaluating a prosodic component over a predetermined period of time. The processor can generate, via a trained neural network, an acoustic sequence based on the combined prosody info, the prosody info offset, and the linguistic sequence.
    Type: Application
    Filed: September 12, 2019
    Publication date: March 18, 2021
    Inventor: Vyacheslav Shechtman
  • Publication number: 20200365135
    Abstract: Embodiments of the present systems and methods may provide techniques that provide the capability to automatically generate allowance intervals for voice personas that meet desired requirements for realism and fidelity. For example, a method for voice persona generation may be implemented in a computer system comprising a processor, memory accessible by the processor, and computer program instructions stored in the memory and executable by the processor, the method comprising: displaying to a user, a plurality of user-selectable voice persona parameters that control features of a synthesized voice signal, and displaying, in conjunction with each of at least some of the plurality of user-selectable voice persona parameters, voice transformation allowance intervals of the voice persona parameters, accepting from a user, a selection of at least one user-selectable voice persona parameter, and generating a synthesized voice signal based on the selected at least one user-selectable voice persona parameter.
    Type: Application
    Filed: May 13, 2019
    Publication date: November 19, 2020
    Inventors: Vyacheslav Shechtman, Alexander Sorin