Patents by Inventor Ranniery MAIA

Ranniery MAIA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
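Minimal illustrative code sketches of the main signal-processing ideas described in these abstracts appear after the listing.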

  • Patent number: 10446133
    Abstract: There is provided a speech synthesizer comprising a processor configured to receive one or more linguistic units, convert said one or more linguistic units into a sequence of speech vectors for synthesizing speech, and output the sequence of speech vectors. Said conversion comprises modeling higher and lower spectral frequencies of the speech data as separate high and low spectral streams by applying a first set of one or more statistical models to the higher spectral frequencies and a second set of one or more statistical models to the lower spectral frequencies.
    Type: Grant
    Filed: February 24, 2017
    Date of Patent: October 15, 2019
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kayoko Yanagisawa, Ranniery Maia, Yannis Stylianou
  • Publication number: 20170263239
    Abstract: There is provided a speech synthesiser comprising a processor configured to receive one or more linguistic units, convert said one or more linguistic units into a sequence of speech vectors for synthesising speech, and output the sequence of speech vectors. Said conversion comprises modelling higher and lower spectral frequencies of the speech data as separate high and low spectral streams by applying a first set of one or more statistical models to the higher spectral frequencies and a second set of one or more statistical models to the lower spectral frequencies.
    Type: Application
    Filed: February 24, 2017
    Publication date: September 14, 2017
    Applicant: Kabushiki Kaisha Toshiba
    Inventors: Kayoko YANAGISAWA, Ranniery MAIA, Yannis STYLIANOU
  • Patent number: 9466285
    Abstract: A method of deriving speech synthesis parameters from an input speech audio signal, wherein the audio signal is segmented on the basis of estimated positions of glottal closure instants and the resulting segments are processed to obtain the complex cepstrum used to derive a synthesis filter. A reconstructed speech signal is produced by passing a pulsed excitation signal derived from the positions of the glottal closure instants through the synthesis filter, and compared with the input speech audio signal. The pulsed excitation signal and the complex cepstrum are then iteratively modified to minimize the difference between the reconstructed speech signal and the input speech audio signal, by optimizing the positions of the pulses in the excitation signal to reduce the mean squared error between the reconstructed speech signal and the input speech audio signal, and recalculating the complex cepstrum using the optimized pulse positions.
    Type: Grant
    Filed: November 26, 2013
    Date of Patent: October 11, 2016
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Ranniery Maia
  • Patent number: 9361722
    Abstract: A method of animating a computer generation of a head and displaying the text of an electronic book, such that the head has a mouth which moves in accordance with the speech of the text of the electronic book to be output by the head and a word or group of words from the text is displayed while simultaneously being mimed by the mouth, wherein input text is divided into a sequence of acoustic units, which are converted into a sequence of image vectors and a sequence of text display indicators. The sequence of image vectors is output as video such that the mouth of said head moves to mime the speech associated with the input text with a selected expression, and the sequence of text display indicators is output as video which is synchronized with the lip movement of the head.
    Type: Grant
    Filed: August 8, 2014
    Date of Patent: June 7, 2016
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Balakrishna Venkata Jagannadha Kolluru, Ioannis Stylianou, Robert Arthur Blokland, Norbert Braunschweiler, Kayoko Yanagisawa, Langzhou Chen, Ranniery Maia, Robert Anderson, Bjorn Stenger, Roberto Cipolla, Neil Baker
  • Publication number: 20150042662
    Abstract: A method of animating a computer generation of a head and displaying the text of an electronic book, such that the head has a mouth which moves in accordance with the speech of the text of the electronic book to be output by the head and a word or group of words from the text is displayed while simultaneously being mimed by the mouth, said method comprising: inputting the text of said book; dividing said input text into a sequence of acoustic units; determining expression characteristics for the input text; calculating a duration for each acoustic unit using a duration model; converting said sequence of acoustic units to a sequence of image vectors using a statistical model, wherein said model has a plurality of model parameters describing probability distributions which relate an acoustic unit to an image vector, said image vector comprising a plurality of parameters which define a face of said head; converting said sequence of acoustic units into a sequence of text display indicators using a text dis…
    Type: Application
    Filed: August 8, 2014
    Publication date: February 12, 2015
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Balakrishna Venkata Jagannadha Kolluru, Ioannis Stylianou, Robert Arthur Blokland, Norbert Braunschweiler, Kayoko Yanagisawa, Langzhou Chen, Ranniery MAIA, Robert Anderson, Bjorn Stenger, Roberto Cipolla, Neil Baker
  • Publication number: 20110276332
    Abstract: A speech synthesis method comprising: receiving a text input and outputting speech corresponding to said text input using a stochastic model, said stochastic model comprising an acoustic model and an excitation model, said acoustic model having a plurality of model parameters describing probability distributions which relate a word or part thereof to a feature, said excitation model comprising excitation model parameters which are used to model the vocal cords and lungs to output the speech using said features; wherein said acoustic model parameters and excitation model parameters have been jointly estimated; and outputting said speech.
    Type: Application
    Filed: May 6, 2011
    Publication date: November 10, 2011
    Applicant: Kabushiki Kaisha Toshiba
    Inventors: Ranniery MAIA, Byung Ha Chun
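
Illustrative sketches

The abstracts above describe signal-processing procedures in prose. The sketches below restate the core ideas in toy Python (NumPy only); every function name, constant, and piece of data in them is an illustrative assumption, not taken from the patents.

Patent 10446133 and publication 20170263239 model the low and high spectral frequencies of speech as separate statistical streams. A minimal sketch, assuming a 16 kHz sampling rate, a 4 kHz band boundary, and a single diagonal Gaussian per stream, none of which the abstract prescribes:

```python
import numpy as np

SAMPLE_RATE = 16000   # assumed sampling rate (Hz)
N_FFT = 512           # assumed analysis FFT size
SPLIT_HZ = 4000       # hypothetical boundary between the low and high streams

def frame_log_spectra(frames):
    """Log-magnitude spectra of windowed frames, one row per frame."""
    window = np.hanning(frames.shape[1])
    spectra = np.abs(np.fft.rfft(frames * window, n=N_FFT))
    return np.log(spectra + 1e-8)

def fit_two_stream_model(log_spectra):
    """Fit one diagonal Gaussian per band, standing in for the per-stream statistical models."""
    split_bin = int(SPLIT_HZ / (SAMPLE_RATE / 2) * (N_FFT // 2))
    streams = {"low": log_spectra[:, :split_bin], "high": log_spectra[:, split_bin:]}
    model = {name: {"mean": s.mean(axis=0), "var": s.var(axis=0) + 1e-6}
             for name, s in streams.items()}
    return model, split_bin

def generate_frame(model):
    """'Generate' one spectral frame by concatenating the two streams' means."""
    return np.concatenate([model["low"]["mean"], model["high"]["mean"]])

# toy usage: 100 random 400-sample frames standing in for speech
frames = np.random.randn(100, 400)
model, split_bin = fit_two_stream_model(frame_log_spectra(frames))
print(generate_frame(model).shape, "bins; streams split at bin", split_bin)
```

Keeping the bands in separate streams is the point of the split: each can be modelled, trained, or replaced independently.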
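Patent 9466285 segments speech at glottal closure instants, derives a synthesis filter from the complex cepstrum, excites it with a pulse train, and iteratively nudges pulse positions to reduce the mean squared reconstruction error. A minimal sketch of that loop, assuming textbook complex-cepstrum formulas, an invented lifter length, and a greedy one-pulse-at-a-time search; the patent specifies none of these:

```python
import numpy as np

def complex_cepstrum(segment, n_fft=1024):
    """Complex cepstrum: inverse FFT of log magnitude plus j * unwrapped phase."""
    spectrum = np.fft.fft(segment, n_fft)
    log_spectrum = np.log(np.abs(spectrum) + 1e-10) + 1j * np.unwrap(np.angle(spectrum))
    return np.fft.ifft(log_spectrum).real

def impulse_response(ccep, n_keep=60, n_out=256):
    """Synthesis-filter impulse response from a crudely liftered complex cepstrum."""
    c = np.zeros_like(ccep)
    c[:n_keep] = ccep[:n_keep]     # low positive quefrencies (minimum-phase part)
    c[-n_keep:] = ccep[-n_keep:]   # low negative quefrencies (maximum-phase part)
    return np.fft.ifft(np.exp(np.fft.fft(c))).real[:n_out]

def synthesize(pulse_pos, h, n):
    """Pass a pulse train through the filter: overlap-add one response per pulse."""
    y = np.zeros(n + len(h))
    for p in pulse_pos:
        y[p:p + len(h)] += h
    return y[:n]

def refine_pulses(x, pulse_pos, h, radius=2):
    """Greedily shift each pulse within +-radius samples to reduce the MSE."""
    pos, n = list(pulse_pos), len(x)
    for i in range(len(pos)):
        best_p = pos[i]
        best_err = np.mean((x - synthesize(pos, h, n)) ** 2)
        for d in range(-radius, radius + 1):
            p = min(n - 1, max(0, pos[i] + d))
            err = np.mean((x - synthesize(pos[:i] + [p] + pos[i + 1:], h, n)) ** 2)
            if err < best_err:
                best_p, best_err = p, err
        pos[i] = best_p
    return pos

# toy usage: one analysis/refinement pass
x = np.random.randn(2000)              # stand-in for a speech signal
gci = list(range(100, 1700, 160))      # stand-in for detected glottal closure instants
segment = x[gci[0]:gci[0] + 256] * np.hanning(256)
h = impulse_response(complex_cepstrum(segment))
gci = refine_pulses(x, gci, h)         # then re-estimate the cepstrum and repeat
```

A fuller implementation would re-window a segment at each refined glottal closure instant and alternate cepstrum re-estimation with pulse refinement until the error stops improving, as the abstract describes.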
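Patent 9361722 and publication 20150042662 convert acoustic units into two synchronized streams: image vectors that drive the mouth and text display indicators that pick the word shown on screen, with per-unit durations from a duration model keeping the two aligned. A minimal sketch of that frame-alignment bookkeeping, with the statistical model reduced to one mean image vector per unit and an invented frame rate, dimensionality, and utterance:

```python
import numpy as np

FPS = 25  # assumed video frame rate

def render_sequences(words, phones, durations, unit_means):
    """
    words:      (text, first_unit_index, last_unit_index) per word
    phones:     acoustic-unit labels, one per unit
    durations:  per-unit durations in seconds (stand-in for the duration model)
    unit_means: unit label -> mean image vector (stand-in for the statistical model)
    Returns frame-aligned image vectors and per-frame text-display indicators.
    """
    image_frames, text_frames = [], []
    for i, (ph, dur) in enumerate(zip(phones, durations)):
        word_idx = next(k for k, (_, a, b) in enumerate(words) if a <= i <= b)
        for _ in range(max(1, round(dur * FPS))):
            image_frames.append(unit_means[ph])  # mouth/face parameters this frame
            text_frames.append(word_idx)         # which word to display this frame
    return np.array(image_frames), text_frames

# toy usage: "hi there" as two words spanning four acoustic units
words = [("hi", 0, 1), ("there", 2, 3)]
phones = ["h", "ai", "dh", "ea"]
durations = [0.08, 0.20, 0.06, 0.24]
unit_means = {p: np.random.randn(30) for p in set(phones)}  # 30-dim toy image vectors
imgs, text = render_sequences(words, phones, durations, unit_means)
print(imgs.shape, [words[t][0] for t in text])
```

Because both streams are expanded from the same per-unit durations, the displayed word and the mouth movement stay synchronized frame by frame, which is the behaviour the abstract claims.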
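Publication 20110276332 jointly estimates acoustic and excitation model parameters. The sketch below shows the shape of such joint estimation in miniature: alternating least squares on the single shared objective ||s - h*g*p||^2 (with * denoting convolution), where p is a pulse train, g is an FIR filter standing in for the excitation model ("vocal cords and lungs"), and h is an FIR filter standing in for the acoustic side. Replacing the publication's probability-distribution models with two FIR filters, and its estimation procedure with alternating least squares, are simplifications made for illustration:

```python
import numpy as np

def convmat(x, n_tap, n_out):
    """Matrix C such that C @ a == np.convolve(x, a)[:n_out] for len(a) == n_tap."""
    C = np.zeros((n_out, n_tap))
    for k in range(n_tap):
        C[k:, k] = x[:n_out - k]
    return C

def joint_estimate(s, pulses, n_h=32, n_g=16, n_iter=10):
    """
    Alternating least squares on the shared objective ||s - h*g*p||^2:
    each pass re-solves one filter with the other held fixed, so both
    parameter sets are fitted against a single joint reconstruction error.
    """
    n = len(s)
    p = np.zeros(n)
    p[pulses] = 1.0
    h = np.zeros(n_h); h[0] = 1.0   # initialise both filters as pass-through
    g = np.zeros(n_g); g[0] = 1.0
    for _ in range(n_iter):
        e = np.convolve(p, g)[:n]   # current excitation signal
        h, *_ = np.linalg.lstsq(convmat(e, n_h, n), s, rcond=None)
        u = np.convolve(p, h)[:n]   # pulses pre-shaped by h (convolution commutes)
        g, *_ = np.linalg.lstsq(convmat(u, n_g, n), s, rcond=None)
    return h, g

# toy usage
s = np.random.randn(1000)                              # stand-in for a speech signal
h, g = joint_estimate(s, pulses=list(range(40, 1000, 80)))
print(h.shape, g.shape)
```

Each least-squares step minimizes the same reconstruction error with the other filter held fixed, so the shared error is non-increasing across iterations; that is the sense in which the two parameter sets are estimated jointly here.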