Time Element Patents (Class 704/267)

Synthesis of speech from text in a voice of a target speaker using neural networks

Patent number: 12175963

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech synthesis. The methods, systems, and apparatus include actions of obtaining an audio representation of speech of a target speaker, obtaining input text for which speech is to be synthesized in a voice of the target speaker, generating a speaker vector by providing the audio representation to a speaker encoder engine that is trained to distinguish speakers from one another, generating an audio representation of the input text spoken in the voice of the target speaker by providing the input text and the speaker vector to a spectrogram generation engine that is trained using voices of reference speakers to generate audio representations, and providing the audio representation of the input text spoken in the voice of the target speaker for output.

Type: Grant

Filed: November 30, 2023

Date of Patent: December 24, 2024

Assignee: Google LLC

Inventors: Ye Jia, Zhifeng Chen, Yonghui Wu, Jonathan Shen, Ruoming Pang, Ron J. Weiss, Ignacio Lopez Moreno, Fei Ren, Yu Zhang, Quan Wang, Patrick An Phu Nguyen
Text-to-speech synthesis system and method

Patent number: 12118979

Abstract: A method, computer program product, and computer system for text-to-speech synthesis is disclosed. Synthetic speech data for an input text may be generated. The synthetic speech data may be compared to recorded reference speech data corresponding to the input text. Based on, at least in part, the comparison of the synthetic speech data to the recorded reference speech data, at least one feature indicative of at least one difference between the synthetic speech data and the recorded reference speech data may be extracted. A speech gap filling model may be generated based on, at least in part, the at least one feature extracted. A speech output may be generated based on, at least in part, the speech gap filling model.

Type: Grant

Filed: July 3, 2023

Date of Patent: October 15, 2024

Assignee: Telepathy Labs, Inc.

Inventors: Piero Perucci, Martin Reber, Vijeta Avijeet
Key frame networks

Patent number: 12046227

Abstract: A method for generating frame values using a key frame network includes receiving a text utterance having at least one phoneme, and for each respective phoneme of the at least one phoneme, predicting, using a predictive model, a fixed quantity of key frames. Each respective key frame of the fixed quantity of key frames includes a representation of a component of the respective phoneme. The method also includes generating, using the fixed quantity of key frames, a plurality of frame values. Here, each respective frame value of the plurality of frame values is representative of a fixed-duration of audio.

Type: Grant

Filed: April 19, 2022

Date of Patent: July 23, 2024

Assignee: Google LLC

Inventors: Tom Marius Kenter, Tobias Alexander Hawker, Robert Clark
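As a rough illustration of the key frame abstract above (patent 12046227), the sketch below expands a fixed quantity of key frames per phoneme into fixed-duration frame values. The linear interpolation, the `expand_key_frames` name, and the per-phoneme frame counts are assumptions made for illustration; the abstract does not say how frame values are derived from key frames, and the predictive model is replaced here by hand-written numbers.

```python
def expand_key_frames(key_frames, num_frames):
    """Linearly interpolate a fixed set of key frame values to num_frames values.

    key_frames: list of floats (for example, pitch at the start, middle, and end
        of a phoneme). In the patent these come from a predictive model; here
        they are supplied by hand.
    num_frames: how many fixed-duration frames the phoneme spans (hypothetical input).
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    if len(key_frames) == 1 or num_frames == 1:
        return [float(key_frames[0])] * num_frames
    last = len(key_frames) - 1
    frames = []
    for i in range(num_frames):
        # Position of frame i on the key frame axis, running from 0 to last.
        pos = i * last / (num_frames - 1)
        lo = min(int(pos), last - 1)
        frac = pos - lo
        frames.append(key_frames[lo] * (1 - frac) + key_frames[lo + 1] * frac)
    return frames


# Three key frames per phoneme (a fixed quantity), with different durations.
phonemes = [
    {"key_frames": [100.0, 120.0, 110.0], "num_frames": 5},
    {"key_frames": [110.0, 90.0, 95.0], "num_frames": 3},
]
frame_values = [
    value
    for p in phonemes
    for value in expand_key_frames(p["key_frames"], p["num_frames"])
]
```

Every phoneme contributes the same number of key frames regardless of its length, so only the interpolation step depends on duration.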
Synthesis of speech from text in a voice of a target speaker using neural networks

Patent number: 11848002

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech synthesis. The methods, systems, and apparatus include actions of obtaining an audio representation of speech of a target speaker, obtaining input text for which speech is to be synthesized in a voice of the target speaker, generating a speaker vector by providing the audio representation to a speaker encoder engine that is trained to distinguish speakers from one another, generating an audio representation of the input text spoken in the voice of the target speaker by providing the input text and the speaker vector to a spectrogram generation engine that is trained using voices of reference speakers to generate audio representations, and providing the audio representation of the input text spoken in the voice of the target speaker for output.

Type: Grant

Filed: July 19, 2022

Date of Patent: December 19, 2023

Assignee: Google LLC

Inventors: Ye Jia, Zhifeng Chen, Yonghui Wu, Jonathan Shen, Ruoming Pang, Ron J. Weiss, Ignacio Lopez Moreno, Fei Ren, Yu Zhang, Quan Wang, Patrick An Phu Nguyen
Text-to-speech synthesis system and method

Patent number: 11741942

Abstract: A method, computer program product, and computer system for text-to-speech synthesis is disclosed. Synthetic speech data for an input text may be generated. The synthetic speech data may be compared to recorded reference speech data corresponding to the input text. Based on, at least in part, the comparison of the synthetic speech data to the recorded reference speech data, at least one feature indicative of at least one difference between the synthetic speech data and the recorded reference speech data may be extracted. A speech gap filling model may be generated based on, at least in part, the at least one feature extracted. A speech output may be generated based on, at least in part, the speech gap filling model.

Type: Grant

Filed: August 3, 2022

Date of Patent: August 29, 2023

Assignee: Telepathy Labs, Inc.

Inventors: Piero Perucci, Martin Reber, Vijeta Avijeet
Learnable speed control for speech synthesis

Patent number: 11302301

Abstract: A method, computer program, and computer system is provided for synthesizing speech at one or more speeds. A context associated with one or more phonemes corresponding to a speaking voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a voice sample corresponding to the speaking voice is synthesized using the generated mel-spectrogram features.

Type: Grant

Filed: March 3, 2020

Date of Patent: April 12, 2022

Assignee: TENCENT AMERICA LLC

Inventors: Chengzhu Yu, Dong Yu
Method for operating a test apparatus and a test apparatus

Patent number: 11243254

Abstract: A method for operating a test apparatus including a plurality of shared resources is shown, wherein the plurality of shared resources can be used in different instruments. The method includes blocking a first set of resource blockers when a first instrument, which requires a first subset of the shared resources, is to be executed. Furthermore, the method tries to block a second set of resource blockers, when a second instrument, which requires a second subset of the shared resources, is to be executed. Therefore, the first set of resource blockers is different from the second set of resource blockers and a plurality of resource blockers are assigned to a shared resource, which is involved in a conflicting combination of instruments and in a non-conflicting combination of instruments.

Type: Grant

Filed: September 27, 2017

Date of Patent: February 8, 2022

Assignee: ADVANTEST CORPORATION

Inventor: Wolfgang Horn
Command processing using multimodal signal analysis

Patent number: 10832031

Abstract: A first set of signals corresponding to a first signal modality (such as the direction of a gaze) during a time interval is collected from an individual. A second set of signals corresponding to a different signal modality (such as hand-pointing gestures made by the individual) is also collected. In response to a command, where the command does not identify a particular object to which the command is directed, the first and second set of signals is used to identify candidate objects of interest, and an operation associated with a selected object from the candidates is performed.

Type: Grant

Filed: August 14, 2017

Date of Patent: November 10, 2020

Assignee: Apple Inc.

Inventors: Wolf Kienzle, Douglas A. Bowman
System and method for preserving privacy of data in the cloud

Patent number: 10810313

Abstract: A system and method for preserving the privacy of data while processing of the data in a cloud. The system comprises a computer program application and a client encryption key. The system is operable to encrypt the computer program application and data using the client encryption key; upload the encrypted computer program application and encrypted data in the cloud; enable the computer platform to undertake processing of the encrypted data in the cloud using the encrypted computer program application; output encrypted processing results; and, enable decryption of the encrypted processing results using the client encryption key.

Type: Grant

Filed: October 3, 2016

Date of Patent: October 20, 2020

Inventors: Nigel Henry Cannings, Gerard Chollet, Cornelius Glackin, Muttukrishnan Rajarajan
Method and system for text-to-speech synthesis

Patent number: 10685644

Abstract: There is disclosed a method of generating a text-to-speech (TTS) training set for training a Machine Learning Algorithm (MLA) for generating machine-spoken utterances. The method is executable by a server. The method includes generating a synthetic word based on merging separate phonemes from each of two words of a corpus of pre-recorded utterances, the merging being done using the common phoneme as a merging anchor, the merging resulting in at least two synthetic words. The synthetic words and assessor labels are used to train a classifier to predict a quality parameter associated with a new synthetic phonemes-based word, the quality parameter being representative of whether the new synthetic phonemes-based word is naturally sounding (based on acoustic features of generated synthetic words utterances). The classifier is then used to generate training objects for the MLA and to use the MLA to process the corpus of pre-recorded utterances into their respective vectors.

Type: Grant

Filed: July 4, 2018

Date of Patent: June 16, 2020

Assignee: YANDEX EUROPE AG

Inventors: Vladimir Vladimirovich Kirichenko, Petr Vladislavovich Luferenko
Determining phonetic relationships

Patent number: 10650810

Abstract: Systems and methods of determining phonetic relationships are provided. For instance, data indicative of an input text phrase input by a user can be received. An audio output corresponding to a spoken rendering of the input text phrase can be determined. A text transcription of the audio output of the input text phrase can be determined. The text transcription can be a textual representation of the audio output. The text transcription can be compared against a plurality of test phrases to identify a match between the text transcription and at least one test phrase.

Type: Grant

Filed: September 29, 2017

Date of Patent: May 12, 2020

Assignee: GOOGLE LLC

Inventors: Nikhil Chandru Rao, Saisuresh Krishnakumaran
Sound control device, sound control method, and sound control program

Patent number: 10504502

Abstract: A sound control device includes: a detection unit that detects a first operation on an operator and a second operation on the operator, the second operation being performed after the first operation; and a control unit that causes output of a second sound to be started, in response to the second operation being detected. The control unit causes output of a first sound to be started before causing the output of the second sound to be started, in response to the first operation being detected.

Type: Grant

Filed: September 20, 2017

Date of Patent: December 10, 2019

Assignee: YAMAHA CORPORATION

Inventors: Keizo Hamano, Yoshitomo Ota, Kazuki Kashiwase
Phrase labeling within spoken audio recordings

Patent number: 10469623

Abstract: A system and method for multi-language phrase identification within spoken interaction audio capable of adjusting for regional pronunciation (accents), cadence differences, and homologs. In this system, a spoken interaction audio data store supplies spoken audio data such as contact center call recordings to be analyzed for a specific phrase or set of phrases. Phrases are entered as natural language text and converted to the phonemes representative of the phrase audio using the invention's language packs and stored in a data store. Spoken interaction and phrase audio are converted to a digital format allowing comparison using multiple characteristics. Phrase matches are stored for subsequent post analysis display and analytics generation.

Type: Grant

Filed: November 27, 2018

Date of Patent: November 5, 2019

Assignee: ZOOM International a.s.

Inventor: Vaclav Slovacek
Natural language processing to merge related alert messages for accessibility

Patent number: 10395638

Abstract: An apparatus and a computer program product for merging incoming alerts for accessibility are described. Two input alerts intended for presentation by a screen reader are received. If the two input alerts have arrived within a specified time interval, the two input alerts are combined into an output alert. The output alert is sent to a screen reader for presentation.

Type: Grant

Filed: July 8, 2017

Date of Patent: August 27, 2019

Assignee: International Business Machines Corporation

Inventors: Stephen A Boxwell, Kyle M Brake, Keith G Frost, Stanley J Vernier
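The time-window rule in the alert-merging abstract above (patent 10395638) can be sketched in a few lines. This is a minimal sketch under stated assumptions: the `Alert` class, the `merge_alerts` function, the one-second default window, and the plain concatenation are all hypothetical. The patent title points to natural language processing for the combining step, which is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    text: str
    arrived_at: float  # arrival time in seconds (hypothetical field)


def merge_alerts(first, second, max_gap_seconds=1.0):
    """Return the list of alerts a screen reader should present.

    If the two alerts arrived within max_gap_seconds of each other, they are
    combined into a single output alert; otherwise both are kept as they are.
    Concatenation stands in for whatever combining logic a real system uses.
    """
    if abs(second.arrived_at - first.arrived_at) <= max_gap_seconds:
        combined = Alert(
            text=f"{first.text} {second.text}",
            arrived_at=min(first.arrived_at, second.arrived_at),
        )
        return [combined]
    return [first, second]


close_together = merge_alerts(
    Alert("Download complete.", 10.0),
    Alert("Battery low.", 10.4),
)
far_apart = merge_alerts(
    Alert("Download complete.", 10.0),
    Alert("Battery low.", 15.0),
)
```

With these inputs, `close_together` holds one combined alert and `far_apart` holds the two original alerts.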
Musical sound generation device

Patent number: 10373595

Abstract: A musical sound generation device including a first memory having a plurality of waveform data, a second memory which stores waveform data read out from the first memory, and a control processor which controls such that, when a sound emission instruction is provided and specified waveform data is in the second memory, the waveform data is read out by the sound source processor, or controls such that, when a sound emission instruction is provided and specified waveform data is not in the second memory, the specified waveform data is transferred from the first memory to the second memory and read out by the sound source processor, in which the control processor controls such that waveform data satisfying a set condition is not subjected to a waveform data change by the transfer and waveform data not satisfying the set condition is subjected to the waveform data change by the transfer.

Type: Grant

Filed: March 2, 2018

Date of Patent: August 6, 2019

Assignee: CASIO COMPUTER CO., LTD.

Inventors: Hiroki Sato, Hajime Kawashima
Wireless communication system

Patent number: 10205587

Abstract: Provided is a wireless communication system which enables wireless communication in which crosstalk due to multiple access is canceled while using a large number of inexpensive wireless terminals. In order to generate an interference component, an analysis data sequence is generated by applying a Hilbert transform to a subcarrier data sequence obtained by extracting a target subcarrier wave component from a finite length data sequence, while a carrier phase difference is estimated by using the regression analysis by a carrier wave phase estimation unit. After rotation calculation configured to return the analysis data sequence by the carrier phase difference is performed, conversion into an angle is performed. Further, a multiplication by a desired odd number of multiplication is performed, and then an inverse Hilbert transform is applied.

Type: Grant

Filed: April 24, 2017

Date of Patent: February 12, 2019

Assignee: KYOWA ELECTRONIC INSTRUMENTS CO., LTD.

Inventors: Jin Mitsugi, Yuki Igarashi, Haruhisa Ichikawa, Yuusuke Kawakita, Kiyoshi Egawa
Systems and methods for name pronunciation

Patent number: 10134385

Abstract: Systems and methods are provided for associating a phonetic pronunciation with a name by receiving the name, mapping the name to a plurality of monosyllabic components that are combinable to construct the phonetic pronunciation of the name, receiving a user input to select one or more of the plurality, and combining the selected one or more of the plurality of monosyllabic components to construct the phonetic pronunciation of the name.

Type: Grant

Filed: March 2, 2012

Date of Patent: November 20, 2018

Assignee: Apple Inc.

Inventor: Devang K. Naik
Speech simulation device

Patent number: 10019982

Abstract: A speech simulation system adapted for a user to communicate with others. The system has at least one sensor to sense controlled and coordinated body movement. The system has a computer processor connected to the at least one sensor. The system has a database memory connected to the computer processor. The system has software programming to operate the computer processor. The system has a feedback device connected to the computer processor and directed to the user. The system has an outward audio output device connected to the computer processor to provide sound and a speaker connected to the outward audio output device.

Type: Grant

Filed: October 18, 2016

Date of Patent: July 10, 2018

Inventor: Mary Elizabeth McCulloch
Vehicular display input apparatus

Patent number: 10019069

Abstract: A vehicular display input apparatus includes a gesture detection unit, a determiner, and a controller. The gesture detection unit detects a gesture made by a hand of the driver. The determiner determines whether a visual line of the driver is directed within a visual line detection area, which is preliminarily defined to include at least partial display region. The controller switches to one of operations listed in an operation menu, which is to be correlated with the gesture, according to a determination result of the determiner. The determination result indicates whether the visual line is directed within the visual line detection area.

Type: Grant

Filed: March 12, 2015

Date of Patent: July 10, 2018

Assignee: DENSO CORPORATION

Inventor: Youichi Naruse
Method and apparatus for exemplary morphing computer system background

Patent number: 10008216

Abstract: Method and apparatus for reducing a size of databases required for recorded speech data.

Type: Grant

Filed: April 15, 2014

Date of Patent: June 26, 2018

Assignee: SPEECH MORPHING SYSTEMS, INC.

Inventors: Fathy Yassa, Benjamin Reaves, Steve Pearson
System, hearing aid, and method for improving synchronization of an acoustic signal to a video display

Patent number: 9924282

Abstract: A system for improving synchronization of an acoustic signal to a video display includes a hearing aid comprising a hearing loss processor configured for signal processing in accordance with a hearing loss of a user of the hearing aid, the hearing aid being configured for receiving a first audio signal for synchronous presentation to the user viewing the video display, the hearing aid being configured for generating a first acoustic signal to be presented to the user of the hearing aid, the first acoustic signal comprising at least a first part being generated in response to the first audio signal. The system also includes a delay unit configured for applying a delay, such that synchronization of the at least first part of the first acoustic signal to the video display is improved.

Type: Grant

Filed: January 20, 2012

Date of Patent: March 20, 2018

Assignee: GN RESOUND A/S

Inventors: Søren C. Ell, Jesper L. Nielsen
System and method of voice annunciation of signal strength, quality of service, and sensor status for wireless devices

Patent number: 9900115

Abstract: Systems and methods of voice annunciation of signal strength, quality of service, and sensor status for wireless devices are provided. Some methods can include determining a signal strength or range of a radio, determining quality of service events and statistics for a wireless device, or determining a status of a sensor and then verbally annunciating information or instructions relating to the determined signal strength or range, the determined quality of service events and statistics, or the determined sensor status.

Type: Grant

Filed: February 20, 2015

Date of Patent: February 20, 2018

Assignee: HONEYWELL INTERNATIONAL INC.

Inventors: Timothy A. Rauworth, Douglas L. Hoeferle, Robert J. Selepa, Pardeep Verma
Sound synthesizing apparatus and method, sound processing apparatus, by arranging plural waveforms on two successive processing periods

Patent number: 9640172

Abstract: A sound synthesizing apparatus includes a waveform storing section which stores a plurality of unit waveforms extracted from different positions, on a time axis, of a sound waveform indicating a voiced sound, and a waveform generating section which generates, for each of a first processing period and a second processing period, a synthesized waveform by arranging the plurality of unit waveforms on the time axis, wherein the second processing period is an immediately succeeding processing period after the first processing period.

Type: Grant

Filed: August 30, 2012

Date of Patent: May 2, 2017

Assignee: Yamaha Corporation

Inventor: Hiraku Kayama
Sound synthesizing apparatus

Patent number: 9552806

Abstract: A sound synthesizing apparatus includes a processor coupled to a memory. The processor configured to execute computer-executable units comprising: an information acquirer adapted to acquire synthesis information which specifies a duration and an utterance content for each unit sound; a prolongation setter adapted to set whether prolongation is permitted or inhibited for each of a plurality of phonemes corresponding to the utterance content of the each unit sound; and a sound synthesizer adapted to generate a synthesized sound corresponding to the synthesis information by connecting a plurality of sound fragments corresponding to the utterance content of the each unit sound. The sound synthesizer prolongs a sound fragment corresponding to the phoneme the prolongation of which is permitted in accordance with the duration of the unit sound.

Type: Grant

Filed: February 26, 2013

Date of Patent: January 24, 2017

Assignee: Yamaha Corporation

Inventors: Hiraku Kayama, Motoki Ogasawara
Speech simulation system

Patent number: 9484013

Abstract: A speech simulation system adapted for a user to communicate with others. The system has at least one sensor to sense controlled and coordinated body movement. The system has a computer processor connected to the at least one sensor. The system has a database memory connected to the computer processor. The system has software programming to operate the computer processor. The system has a feedback device connected to the computer processor and directed to the user. The system has an outward audio output device connected to the computer processor to provide sound and a speaker connected to the outward audio output device.

Type: Grant

Filed: February 19, 2013

Date of Patent: November 1, 2016

Inventor: Mary Elizabeth McCulloch
Assessing speech prosody

Patent number: 9368126

Abstract: A method, system and computer readable storage medium for assessing speech prosody. The method includes the steps of: receiving input speech data; acquiring a prosody constraint; assessing prosody of the input speech data according to the prosody constraint; and providing an assessment result, where at least one of the steps is carried out using a computer device.

Type: Grant

Filed: April 29, 2011

Date of Patent: June 14, 2016

Assignee: Nuance Communications, Inc.

Inventors: Yong Qin, Qin Shi, Zhiwei Shuang, Shi Lei Zhang
Voice assist device and program in electronic musical instrument

Patent number: 9218798

Abstract: Provided is a voice assist device in an electronic musical instrument in which tone selection or a sound setting corresponding to a key is performed in advance by pressing the operation button 1 while pressing one of the keys in the keyboard 2, including a changed state recognizing unit 3 that recognizes from a pressed key a changed state of tone selection or a sound setting determined corresponding to the key in advance, a setting item name storing unit 4 that stores a setting item name of the tone selection or sound setting as voice data, and a sound emitting unit 5 that emits a setting item name corresponding to the changed state, and the changed state recognizing unit 3 includes a voice assist recognizing unit 6 that detects a depression for a preset time or more of the operation button 1 prior to a depression of the key.

Type: Grant

Filed: August 5, 2015

Date of Patent: December 22, 2015

Assignee: KAWAI MUSICAL INSTRUMENTS MANUFACTURING CO., LTD.

Inventors: Takuya Satoh, Kohtaro Ilimura, Sachie Ilimura
Automated generation of phonemic lexicon for voice activated cockpit management systems

Patent number: 9135911

Abstract: A system, method and program for acquiring from an input text a character string set and generating the pronunciation thereof which should be recognized as a word is disclosed.

Type: Grant

Filed: September 26, 2014

Date of Patent: September 15, 2015

Assignees: NEXGEN FLIGHT LLC, DOINITA DIANE SERBAN

Inventors: Doinita Serban, Bhupat Raigaga
Apparatus, process, and program for combining speech and audio data

Patent number: 8983842

Abstract: There is provided a speech processing apparatus including: a data obtaining unit which obtains music progression data defining a property of one or more time points or one or more time periods along progression of music; a determining unit which determines an output time point at which a speech is to be output during reproducing the music by utilizing the music progression data obtained by the data obtaining unit; and an audio output unit which outputs the speech at the output time point determined by the determining unit during reproducing the music.

Type: Grant

Filed: August 12, 2010

Date of Patent: March 17, 2015

Assignee: Sony Corporation

Inventors: Tetsuo Ikeda, Ken Miyashita, Tatsushi Nashida
Method and system for enhancing a speech database

Patent number: 8977552

Abstract: A system, method and computer readable medium that enhances a speech database for speech synthesis is disclosed. The method may include labeling audio files in a primary speech database, identifying segments in the labeled audio files that have varying pronunciations based on language differences, identifying replacement segments in a secondary speech database, enhancing the primary speech database by substituting the identified secondary speech database segments for the corresponding identified segments in the primary speech database, and storing the enhanced primary speech database for use in speech synthesis.

Type: Grant

Filed: May 28, 2014

Date of Patent: March 10, 2015

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Alistair D. Conkie, Ann K. Syrdal
Information providing apparatus and information providing method

Patent number: 8977550

Abstract: Part units of speech information are arranged in a predetermined order to generate a sentence unit of a speech information set. To each of a plurality of speech part units of the speech information, an attribute of “interrupt possible after reproduction” with which reproduction of priority interrupt information can be started after the speech part unit of the speech information is reproduced or another attribute of “interrupt impossible after reproduction” with which reproduction of the priority interrupt information cannot be started even after the speech part unit of the speech information is reproduced is set. When the priority interrupt information having a higher priority rank than the speech information set being currently reproduced is inputted, if the attribute of the speech information being reproduced at the point in time is “interrupt impossible after reproduction,” then the priority interrupt information is reproduced after the speech information is reproduced.

Type: Grant

Filed: May 6, 2011

Date of Patent: March 10, 2015

Assignee: Honda Motor Co., Ltd.

Inventor: Tokujiro Kizaki
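One possible reading of the interrupt attributes in the abstract above (patent 8977550) is sketched below. The `playback_order` function, the tuple layout, and the sample sentence are hypothetical. The sketch also assumes that a deferred interrupt waits for the next part marked "interrupt possible after reproduction" and that the remaining parts resume afterwards, neither of which the abstract states explicitly.

```python
def playback_order(parts, interrupt, arrives_during):
    """Return the order in which speech parts and the interrupt are reproduced.

    parts: list of (text, interrupt_possible_after) tuples in sentence order.
    interrupt: text of the priority interrupt information.
    arrives_during: index of the part being reproduced when the interrupt arrives.
    """
    order = []
    pending = False
    for index, (text, interrupt_possible_after) in enumerate(parts):
        if index == arrives_during:
            pending = True
        order.append(text)  # the current part is always reproduced to its end
        if pending and interrupt_possible_after:
            order.append(interrupt)
            pending = False
    if pending:
        # No interruptible boundary was reached, so play the interrupt last.
        order.append(interrupt)
    return order


sentence = [
    ("In 300 meters,", False),    # interrupt impossible after reproduction
    ("turn right", True),         # interrupt possible after reproduction
    ("onto Main Street.", True),
]
order = playback_order(sentence, "Traffic jam ahead.", arrives_during=0)
```

In this example the interrupt arrives during the first part, which forbids interruption, so it is held back until "turn right" has been reproduced.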
Parametric speech synthesis method and system

Patent number: 8977551

Abstract: The present invention provides a parametric speech synthesis method and a parametric speech synthesis system.

Type: Grant

Filed: October 27, 2011

Date of Patent: March 10, 2015

Assignee: Goertek Inc.

Inventors: Fengliang Wu, Zhenhua Wu
Method and system for template-based personalized singing synthesis

Publication number: 20150025892

Abstract: A system and method for speech-to-singing synthesis is provided. The method includes deriving characteristics of a singing voice for a first individual and modifying vocal characteristics of a voice for a second individual in response to the characteristics of the singing voice of the first individual to generate a synthesized singing voice for the second individual.

Type: Application

Filed: March 6, 2013

Publication date: January 22, 2015

Applicant: Agency for Science, Technology and Research

Inventors: Siu Wa Lee, Ling Cen, Haizhou Li, Yaozhu Paul Chan, Minghui Dong
Enhanced interface for use with speech recognition

Patent number: 8909538

Abstract: Improved methods of presenting speech prompts to a user as part of an automated system that employs speech recognition or other voice input are described. The invention improves the user interface by providing in combination with at least one user prompt seeking a voice response, an enhanced user keyword prompt intended to facilitate the user selecting a keyword to speak in response to the user prompt. The enhanced keyword prompts may be the same words as those a user can speak as a reply to the user prompt but presented using a different audio presentation method, e.g., speech rate, audio level, or speaker voice, than used for the user prompt. In some cases, the user keyword prompts are different words from the expected user response keywords, or portions of words, e.g., truncated versions of keywords.

Type: Grant

Filed: November 11, 2013

Date of Patent: December 9, 2014

Assignee: Verizon Patent and Licensing Inc.

Inventor: James Mark Kondziela
Encoding apparatus, decoding apparatus and methods thereof

Patent number: 8898057

Abstract: Disclosed is an encoding apparatus that can efficiently encode a signal that is a broad or extra-broad band signal or the like, thereby improving the quality of a decoded signal. This encoding apparatus includes a band establishing unit (301) that generates, based on the characteristic of the input signal, band establishment information to be used for dividing the band of the input signal to establish a first band part of lower frequency side and a second band part of higher frequency side; a lower frequency encoding unit (302) for encoding, based on the band establishment information, the input signal of the first band part to generate encoded lower frequency part information; and a higher frequency encoding unit (303) for encoding, based on the band establishment information, the input signal of the second band part to generate encoded higher frequency part information.

Type: Grant

Filed: October 22, 2010

Date of Patent: November 25, 2014

Assignee: Panasonic Intellectual Property Corporation of America

Inventor: Tomofumi Yamanashi
Interactive environment for performing arts scripts

Patent number: 8888494

Abstract: One or more embodiments present a script to a user in an interactive script environment. A digital representation of a manuscript is analyzed. This digital representation includes a set of roles and a set of information associated with each role in the set of roles. An active role in the set of roles that is associated with a given user is identified based on the analyzing. At least a portion of the manuscript is presented to the given user via a user interface. The portion includes at least a subset of information in the set of information. Information within the set of information that is associated with the active role is presented in a visually different manner than information within the set of information that is associated with a non-active role, which is a role that is associated with a user other than the given user.

Type: Grant

Filed: June 27, 2011

Date of Patent: November 18, 2014

Inventor: Randall Lee Threewits
Recognition dictionary creation device and voice recognition device

Patent number: 8868431

Abstract: A recognition dictionary creation device identifies the language of a reading of an inputted text which is a target to be registered and adds a reading with phonemes in the language identified thereby to the target text to be registered, and also converts the reading of the target text to be registered from the phonemes in the language identified thereby to phonemes in a language to be recognized which is handled in voice recognition to create a recognition dictionary in which the converted reading of the target text to be registered is registered.

Type: Grant

Filed: February 5, 2010

Date of Patent: October 21, 2014

Assignee: Mitsubishi Electric Corporation

Inventors: Michihiro Yamazaki, Jun Ishii, Yasushi Ishikawa
Menu hierarchy skipping dialog for directed dialog speech recognition

Patent number: 8862477

Abstract: A method and a processing device for managing an interactive speech recognition system are provided. Whether a voice input relates to expected input, at least partially, of any one of a group of menus different from a current menu is determined. If the voice input relates to the expected input, at least partially, of any one of the group of menus different from the current menu, skipping to the one of the group of menus is performed. The group of menus different from the current menu includes menus at multiple hierarchical levels.

Type: Grant

Filed: June 3, 2013

Date of Patent: October 14, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventor: Hary E. Blanchard
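
The menu-skipping check described in the abstract of patent 8862477 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the menu names, the expected phrases, and the word-overlap test used for "relates at least partially" are all assumptions made for the example.

```python
# Hypothetical menus and expected phrases; none of these come from the patent.
# "password" stands in for a menu that sits one level below "support".
MENUS = {
    "main": {"billing", "technical support"},
    "billing": {"pay bill", "check balance"},
    "support": {"reset password", "slow connection"},
    "password": {"forgot username", "unlock account"},
}


def partially_matches(voice_input, phrase):
    """Assumed matching rule: the input shares at least one word with the phrase."""
    return bool(set(voice_input.lower().split()) & set(phrase.lower().split()))


def next_menu(current, voice_input):
    """Return the menu to skip to, or the current menu if no other menu matches."""
    for name, expected in MENUS.items():
        if name == current:
            continue  # only menus different from the current one are candidates
        if any(partially_matches(voice_input, phrase) for phrase in expected):
            return name
    return current
```

With these assumed menus, a caller in the "billing" menu who says "my account needs an unlock" would be moved straight to the "password" menu, regardless of its depth in the hierarchy.
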
Training and applying prosody models

Patent number: 8856008

Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.

Type: Grant

Filed: September 18, 2013

Date of Patent: October 7, 2014

Assignee: Morphism LLC

Inventor: James H. Stephens, Jr.
Time warped modified transform coding of audio signals

Patent number: 8838441

Abstract: A representation of an audio signal having a first, a second and a third frame is derived by estimating first warp information for the first and second frames and second warp information for the second and third frames, the warp information describing pitch information of the audio signal. First or second spectral coefficients for first and second frames or second and third frames are derived using first or second warp information and a first or second weighted representation of the first and second frames or second and third frames, the first or second weighted representation derived by applying a first or second window function to the first and second frames or second and third frames, wherein the first or second window function depends on the first or second warp information. The representation of the audio signal is generated including the first and the second spectral coefficients.

Type: Grant

Filed: February 14, 2013

Date of Patent: September 16, 2014

Assignee: Dolby International AB

Inventor: Lars Villemoes
METHOD AND APPARATUS FOR COMMUNICATING MESSAGES AMONGST A NODE, DEVICE AND A USER OF A DEVICE

Publication number: 20140236586

Abstract: A method and apparatus that modifies static media, such as music files being played to a user of the device, upon the generation or receipt of an alert, notification or message, so that information in the alert, notification or message can be incorporated into the media files then communicated to the user. In a further embodiment, a user's response to the communicated information can be sensed using one or more sensors and transducers so as to provide feedback to the device, and then optionally to a node in a system.

Type: Application

Filed: February 18, 2013

Publication date: August 21, 2014

Applicant: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)

Inventor: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)
Speech samples library for text-to-speech and methods and apparatus for generating and using same

Patent number: 8775185

Abstract: A method for converting text into speech with a speech sample library is provided. The method comprises converting an input text to a sequence of triphones; determining musical parameters of each phoneme in the sequence of triphones; detecting, in the speech sample library, speech segments having at least the determined musical parameters; and concatenating the detected speech segments.

Type: Grant

Filed: November 27, 2012

Date of Patent: July 8, 2014

Inventors: Gershon Silbert, Andres Hakim
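
As a rough illustration of the first step in the abstract of patent 8775185, the sketch below turns a phoneme sequence into triphones, that is, each phoneme together with its left and right neighbors. The function name, the phoneme symbols, and the use of a "sil" marker to pad the utterance edges are assumptions for the example; the text-to-phoneme step, the musical parameters, and the library lookup are not modeled.

```python
def to_triphones(phonemes, boundary="sil"):
    """Return one (left, center, right) triple per phoneme.

    The boundary marker is a hypothetical padding symbol used so that the
    first and last phonemes also have a neighbor on each side.
    """
    padded = [boundary] + list(phonemes) + [boundary]
    return [tuple(padded[i - 1:i + 2]) for i in range(1, len(padded) - 1)]


# Example: the phonemes of "cat" in an illustrative notation.
triphones = to_triphones(["k", "ae", "t"])
```
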
METHOD FOR SPEECH CODING, METHOD FOR SPEECH DECODING AND THEIR APPARATUSES

Publication number: 20140180696

Abstract: High-quality speech is reproduced with a small amount of data in speech coding and decoding for performing compression coding and decoding of a speech signal to a digital signal.

Type: Application

Filed: February 25, 2014

Publication date: June 26, 2014

Applicant: BlackBerry Limited

Inventor: Tadashi YAMAURA
Text-to-speech device and text-to-speech method

Patent number: 8751237

Abstract: A sound control section (114) selects and outputs a text-to-speech item from items included in program information multiplexed with a broadcast signal; and starts or stops outputting the text-to-speech item, based on a request from a remote controller control section (113). A sound generation section (115) converts the text-to-speech item to a sound signal. A speaker (109) reproduces the sound signal. The sound control section (114) compares each item of information about a program currently selected by the user's operation of the remote controller, with each item of information about the previous program selected just before the user's operation. If an item of the currently selected program information is the same as the corresponding item of the operation-prior program information, and text-to-speech processing has already been completed for the item after the last change in the item, the sound control section (114) stops outputting the item to the sound generation section (115).

Type: Grant

Filed: February 23, 2011

Date of Patent: June 10, 2014

Assignee: Panasonic Corporation

Inventor: Koumei Kubota
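
The comparison rule in the abstract of patent 8751237 can be illustrated with a small sketch. The item names, the dictionary layout, and the `spoken` set (items already read aloud since they last changed) are assumptions made for the example, not details from the patent.

```python
def items_to_speak(current, previous, spoken):
    """Return the names of the items that should still be sent to speech output.

    An item is skipped only when its text is unchanged from the previously
    selected program AND it was already read aloud since its last change.
    """
    return [
        name
        for name, text in current.items()
        if not (previous.get(name) == text and name in spoken)
    ]


# Hypothetical program information before and after a channel change.
previous_program = {"channel": "4", "title": "News", "time": "19:00"}
current_program = {"channel": "5", "title": "News", "time": "19:00"}
pending = items_to_speak(current_program, previous_program, spoken={"title"})
```

Here the unchanged, already-spoken title is skipped, while the changed channel and the unchanged but not yet spoken time are still read out.
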
Method and system for enhancing a speech database

Patent number: 8744851

Abstract: A system, method and computer readable medium that enhances a speech database for speech synthesis is disclosed. The method may include labeling audio files in a primary speech database, identifying segments in the labeled audio files that have varying pronunciations based on language differences, identifying replacement segments in a secondary speech database, enhancing the primary speech database by substituting the identified secondary speech database segments for the corresponding identified segments in the primary speech database, and storing the enhanced primary speech database for use in speech synthesis.

Type: Grant

Filed: August 13, 2013

Date of Patent: June 3, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Alistair Conkie, Ann K Syrdal
Adaptive time and/or frequency-based encoding mode determination apparatus and method of determining encoding mode of the apparatus

Patent number: 8744841

Abstract: An adaptive time/frequency-based encoding mode determination apparatus including a time domain feature extraction unit to generate a time domain feature by analysis of a time domain signal of an input audio signal, a frequency domain feature extraction unit to generate a frequency domain feature corresponding to each frequency band generated by division of a frequency domain corresponding to a frame of the input audio signal into a plurality of frequency domains, by analysis of a frequency domain signal of the input audio signal, and a mode determination unit to determine any one of a time-based encoding mode and a frequency-based encoding mode, with respect to each frequency band, by use of the time domain feature and the frequency domain feature.

Type: Grant

Filed: September 21, 2006

Date of Patent: June 3, 2014

Assignee: SAMSUNG Electronics Co., Ltd.

Inventors: Eun Mi Oh, Ki Hyun Choo, Jung-Hoe Kim, Chang Yong Son
Speech synthesis apparatus and method utilizing acquisition of at least two speech unit waveforms acquired from a continuous memory region by one access

Patent number: 8731933

Abstract: A speech synthesizing apparatus includes a selector configured to select a plurality of speech units for synthesizing a speech of a phoneme sequence by referring to speech unit information stored in an information memory. Speech unit waveforms corresponding to the speech units are acquired from a plurality of speech unit waveforms stored in a waveform memory, and the speech is synthesized by utilizing the speech unit waveforms acquired. When acquiring the speech unit waveforms, at least two speech unit waveforms from a continuous region of the waveform memory are copied onto a buffer by one access, wherein a data quantity of the at least two speech unit waveforms is less than or equal to a size of the buffer.

Type: Grant

Filed: April 10, 2013

Date of Patent: May 20, 2014

Assignee: Kabushiki Kaisha Toshiba

Inventor: Takehiko Kagoshima
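
The single-access idea in the abstract of patent 8731933 can be sketched as a read planner: waveforms that sit next to each other in waveform memory are fetched together as long as their combined size fits the buffer. The `(offset, size)` layout and the function name are assumptions for the example, and the sketch does not handle a single waveform larger than the buffer.

```python
def plan_reads(units, buffer_size):
    """Group selected speech units into memory reads.

    units: list of (offset, size) pairs in synthesis order (hypothetical layout).
    Returns a list of (start, length) reads. Adjacent units are merged into one
    read while the merged length stays within buffer_size.
    """
    reads = []
    for offset, size in units:
        if reads:
            start, length = reads[-1]
            contiguous = offset == start + length
            if contiguous and length + size <= buffer_size:
                reads[-1] = (start, length + size)  # extend the previous access
                continue
        reads.append((offset, size))
    return reads


# Four selected units; the first two are contiguous and fit one 256-byte buffer.
reads = plan_reads([(0, 100), (100, 50), (400, 80), (480, 300)], buffer_size=256)
```

In this example the first two units are fetched by one access, while the last two are contiguous but too large together for the buffer, so they are read separately.
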
System and method for speech synthesis

Patent number: 8719030

Abstract: The present invention is a method and system to convert a speech signal into a parametric representation in terms of timbre vectors, and to recover the speech signal therefrom. The speech signal is first segmented into non-overlapping frames using the glottal closure instant information; each frame is converted into an amplitude spectrum using a Fourier analyzer, and Laguerre functions are then used to generate a set of coefficients which constitute a timbre vector. A sequence of timbre vectors can be subject to a variety of manipulations. The new timbre vectors are converted back into voice signals by first transforming into amplitude spectra using Laguerre functions, then generating phase spectra from the amplitude spectra using Kramers-Kronig relations. A Fourier transformer converts the amplitude spectra and phase spectra into elementary acoustic waves, which are then superposed to become the output voice. The method and system can be used for voice transformation, speech synthesis, and automatic speech recognition.

Type: Grant

Filed: December 3, 2012

Date of Patent: May 6, 2014

Inventor: Chengjun Julian Chen
Controllable prosody re-estimation system and method and computer program product thereof

Patent number: 8706493

Abstract: In one embodiment of a controllable prosody re-estimation system, a TTS/STS engine consists of a prosody prediction/estimation module, a prosody re-estimation module and a speech synthesis module. The prosody prediction/estimation module generates predicted or estimated prosody information. The prosody re-estimation module then re-estimates the predicted or estimated prosody information and produces new prosody information, according to a set of controllable parameters provided by a controllable prosody parameter interface. The new prosody information is provided to the speech synthesis module to produce synthesized speech.

Type: Grant

Filed: July 11, 2011

Date of Patent: April 22, 2014

Assignee: Industrial Technology Research Institute

Inventors: Cheng-Yuan Lin, Chien-Hung Huang, Chih-Chung Kuo
Speech signal restoration device and speech signal restoration method

Patent number: 8706497

Abstract: A synthesis filter 106 synthesizes a plurality of wide-band speech signals by combining wide-band phoneme signals and sound source signals from a speech signal code book 105, and a distortion evaluation unit 107 selects one of the wide-band speech signals with a minimum waveform distortion with respect to an up-sampled narrow-band speech signal output from a sampling conversion unit 101. A first bandpass filter 103 extracts a frequency component outside a narrow-band of the wide-band speech signal and a band synthesis unit 104 combines it with the up-sampled narrow-band speech signal.

Type: Grant

Filed: October 22, 2010

Date of Patent: April 22, 2014

Assignee: Mitsubishi Electric Corporation

Inventors: Satoru Furuta, Hirohisa Tasaki
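
The selection step in the abstract of patent 8706497 (choosing the candidate wide-band signal with minimum waveform distortion relative to the up-sampled narrow-band signal) can be sketched as below. The sum of squared sample differences is an assumed distortion measure chosen for illustration; the abstract does not say which measure the distortion evaluation unit uses, and the sample values are made up.

```python
def squared_error(reference, candidate):
    """Assumed distortion measure: sum of squared sample differences."""
    return sum((r - c) ** 2 for r, c in zip(reference, candidate))


def select_candidate(reference, candidates):
    """Return the index of the candidate with the smallest distortion."""
    return min(
        range(len(candidates)),
        key=lambda i: squared_error(reference, candidates[i]),
    )


# Hypothetical up-sampled narrow-band frame and three synthesized candidates.
upsampled = [0.0, 1.0, 0.0, -1.0]
candidates = [
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.0, -1.1],
    [1.0, 1.0, 1.0, 1.0],
]
best = select_candidate(upsampled, candidates)
```
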
