Synthesis Patents (Class 704/258)
  • Patent number: 9026445
    Abstract: A system and method to allow an author of an instant message to enable and control the production of audible speech to the recipient of the message. The voice of the author of the message is characterized into parameters compatible with a formative or articulative text-to-speech engine such that upon receipt, the receiving client device can generate audible speech signals from the message text according to the characterization of the author's voice. Alternatively, the author can store samples of his or her actual voice in a server so that, upon transmission of a message by the author to a recipient, the server extracts only the samples needed to synthesize the words in the text message and delivers them to the receiving client device, where a client-side concatenative text-to-speech engine uses them to generate audible speech signals having a close likeness to the actual voice of the author.
    Type: Grant
    Filed: March 20, 2013
    Date of Patent: May 5, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Terry Wade Niemeyer, Liliana Orozco
  • Publication number: 20150120303
    Abstract: According to an embodiment, a sentence set generating device includes an importance degree storage, a frequency storage, a calculator, and a selector. The importance degree storage is configured to store therein a degree of importance of each of a plurality of acoustic units. The frequency storage is configured to store therein a frequency of appearance of each of the acoustic units in a second sentence set. The calculator is configured to calculate a score of a first sentence included in a first sentence set, from a degree of rarity corresponding to the frequency of appearance of each acoustic unit in the first sentence and from a degree of importance of each acoustic unit. The selector is configured to, from sentences included in the first sentence set, select a sentence having a score higher than other sentences, and add the selected sentence to the second sentence set.
    Type: Application
    Filed: September 12, 2014
    Publication date: April 30, 2015
    Inventor: Yusuke Shinohara
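The scoring scheme in the abstract above amounts to a greedy corpus-selection loop. A minimal sketch in Python, assuming a simple 1/(1+frequency) rarity function and illustrative importance values (neither is specified by the publication):

```python
def rarity(freq):
    # Rarer acoustic units (lower frequency in the selected set) score higher.
    return 1.0 / (1 + freq)

def select_sentences(candidates, importance, rounds):
    # Greedily move the highest-scoring sentence from the candidate (first)
    # set into the selected (second) set, then update unit frequencies so
    # already-covered units become less valuable in later rounds.
    freq = {}
    selected = []
    pool = [list(s) for s in candidates]
    for _ in range(min(rounds, len(pool))):
        best = max(pool, key=lambda s: sum(
            importance.get(u, 0.0) * rarity(freq.get(u, 0)) for u in s))
        pool.remove(best)
        selected.append(best)
        for u in best:
            freq[u] = freq.get(u, 0) + 1
    return selected

importance = {"a": 1.0, "b": 2.0, "c": 0.5}   # per-unit importance degrees
candidates = [["a", "b"], ["b", "b"], ["c"]]  # sentences as acoustic-unit lists
print(select_sentences(candidates, importance, 2))
```

Note how the second round no longer favors the "b"-heavy sentence, because "b" has already been covered and its rarity has dropped.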
  • Patent number: 9020821
    Abstract: An acquisition unit analyzes a text, and acquires phonemic and prosodic information. An editing unit edits a part of the phonemic and prosodic information. A speech synthesis unit converts the phonemic and prosodic information before editing the part to a first speech waveform, and converts the phonemic and prosodic information after editing the part to a second speech waveform. A period calculation unit calculates a contrast period corresponding to the part in the first speech waveform and the second speech waveform. A speech generation unit generates an output waveform by connecting a first partial waveform and a second partial waveform. The first partial waveform contains the contrast period of the first speech waveform. The second partial waveform contains the contrast period of the second speech waveform.
    Type: Grant
    Filed: September 19, 2011
    Date of Patent: April 28, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Osamu Nishiyama
  • Publication number: 20150112686
    Abstract: A time difference calculation unit calculates a time difference between its own terminal and another terminal, based on the time at which output of the first sound from the audio output module is started, the time at which input of a sound corresponding to the audio data to the audio input module is started, the time indicated by the first information, and the time indicated by the second information.
    Type: Application
    Filed: September 26, 2014
    Publication date: April 23, 2015
    Applicant: OLYMPUS CORPORATION
    Inventor: Ryuichi Kiyoshige
  • Patent number: 9015032
    Abstract: Embodiments of the present invention provide a system, method, and program product to deliver an announcement to people, such as a public announcement. A computer receives input representative of audio from one or more people speaking in one or more natural languages. The computer processes the input to identify the languages being spoken, and identifies a relative proportion of each of the identified languages. Using these proportions, the computer determines one or more languages in which to deliver the announcement. The computer then causes the announcement to be delivered in the determined languages. In other embodiments, the computer can also determine an order in which to deliver the announcement. Further, the computer can transmit the announcement in the determined languages and order for delivery in aural or visual form.
    Type: Grant
    Filed: November 28, 2011
    Date of Patent: April 21, 2015
    Assignee: International Business Machines Corporation
    Inventors: Sheri G. Daye, Peeyush Jaiswal, Aleksas J. Vitenas
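The proportion-based language choice above can be sketched as follows; the 20% cutoff and the language labels are assumptions for illustration, not values from the patent:

```python
from collections import Counter

def announcement_languages(detected, threshold=0.2):
    # Keep languages whose share of the detected speech meets an assumed
    # threshold, ordered by decreasing proportion (this ordering also gives
    # the delivery order mentioned in the abstract).
    counts = Counter(detected)
    total = sum(counts.values())
    return [lang for lang, n in counts.most_common() if n / total >= threshold]

# Six English utterances, three Spanish, one German detected in the crowd:
print(announcement_languages(["en"] * 6 + ["es"] * 3 + ["de"]))
```

German falls below the 20% cutoff, so the announcement would go out in English, then Spanish.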
  • Patent number: 9009051
    Abstract: According to one embodiment, a reading aloud support apparatus includes a reception unit, a first extraction unit, a second extraction unit, an acquisition unit, a generation unit, and a presentation unit. The reception unit is configured to receive an instruction. The first extraction unit is configured to extract, as a partial document, a part of a document which corresponds to a range of words. The second extraction unit is configured to perform morphological analysis and to extract words as candidate words. The acquisition unit is configured to acquire attribute information items related to the candidate words. The generation unit is configured to perform weighting relating to a value corresponding to a distance and to determine each of the candidate words to be preferentially presented, to generate a presentation order. The presentation unit is configured to present the candidate words and the attribute information items in accordance with the presentation order.
    Type: Grant
    Filed: March 22, 2011
    Date of Patent: April 14, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kosei Fume, Masaru Suzuki, Yuji Shimizu, Tatsuya Izuha
  • Publication number: 20150100318
    Abstract: A method for decoding a speech signal is described. The method includes obtaining a packet. The method also includes obtaining a previous lag value. The method further includes limiting the previous lag value if the previous lag value is greater than a maximum lag threshold. The method additionally includes disallowing an adjustment to a number of synthesized peaks if a combination of the number of synthesized peaks and an estimated number of peaks is not valid.
    Type: Application
    Filed: October 4, 2013
    Publication date: April 9, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Venkatraman Rajagopalan, Venkatesh Krishnan, Alok K. Gupta
  • Publication number: 20150097706
    Abstract: A micro-computer based aircraft system that creates aural messages based upon system-detected threats (e.g., low oil pressure). The messages are unique to the make and model of aircraft. Speech recognition allows the pilot to request aircraft-specific, customized aurally-delivered checklists and to respond via a challenge and response protocol. This permits a hands-free, timely, complete and prudent response to the threat or hazardous situation, while allowing the pilot the relative freedom to do what is paramount: first, fly the airplane (with minimum distraction).
    Type: Application
    Filed: October 9, 2013
    Publication date: April 9, 2015
    Applicant: CMX Avionics, LLC
    Inventors: Warren F. Perger, William G. Abbatt
  • Patent number: 9002703
    Abstract: The community-based generation of audio narrations for a text-based work leverages collaboration of a community of people to provide human-voiced audio readings. During the community-based generation, a collection of audio recordings for the text-based work may be collected from multiple human readers in a community. An audio recording for each section in the text-based work may be selected from the collection of audio recordings. The selected audio recordings may then be combined to produce an audio reading of at least a portion of the text-based work.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: April 7, 2015
    Assignee: Amazon Technologies, Inc.
    Inventor: Jay A. Crosley
  • Patent number: 9002711
    Abstract: According to an embodiment, a speech synthesis apparatus includes a selecting unit configured to select speaker's parameters one by one for respective speakers and obtain a plurality of speakers' parameters, the speaker's parameters being prepared for respective pitch waveforms corresponding to speaker's speech sounds, the speaker's parameters including formant frequencies, formant phases, formant powers, and window functions concerning respective formants that are contained in the respective pitch waveforms. The apparatus includes a mapping unit configured to make formants correspond to each other between the plurality of speakers' parameters using a cost function based on the formant frequencies and the formant powers. The apparatus includes a generating unit configured to generate an interpolated speaker's parameter by interpolating, at desired interpolation ratios, the formant frequencies, formant phases, formant powers, and window functions of formants which are made to correspond to each other.
    Type: Grant
    Filed: December 16, 2010
    Date of Patent: April 7, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Ryo Morinaka, Takehiko Kagoshima
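The interpolation step in the abstract above reduces to a weighted average of corresponding formant parameters. A minimal sketch, assuming the formants have already been put in correspondence by the mapping unit, and showing only frequencies and powers (phases and window functions would be blended the same way); all values are illustrative:

```python
def interpolate_speakers(speakers, ratios):
    # Interpolate corresponding formant frequencies and powers of several
    # speakers at the desired ratios (ratios are assumed to sum to 1).
    n = len(speakers[0]["freq"])
    return {
        "freq": [sum(r * s["freq"][i] for s, r in zip(speakers, ratios))
                 for i in range(n)],
        "power": [sum(r * s["power"][i] for s, r in zip(speakers, ratios))
                  for i in range(n)],
    }

a = {"freq": [700.0, 1200.0], "power": [1.0, 0.5]}   # speaker A's formants
b = {"freq": [500.0, 1000.0], "power": [0.8, 0.7]}   # speaker B's formants
print(interpolate_speakers([a, b], [0.5, 0.5]))
```

An equal-ratio blend lands each interpolated formant halfway between the two speakers, yielding a voice "between" them.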
  • Patent number: 8996387
    Abstract: To clear transaction data selected for processing, a transaction acoustic signal (003; 103; 203) is generated in a portable data carrier (1) (S007; S107; S207); upon its acoustic reproduction by an end device (10), at least the transaction data selected for processing are reproduced acoustically, superimposed with a melody specific to a user of the data carrier (1) (S009; S109; S209). The generated transaction acoustic signal (003; 103; 203) is electronically transferred to an end device (10) (S108; S208), which processes the selected transaction data (S011; S121; S216) only when the user of the data carrier (1) confirms to the end device (10) an at least partial match both of the acoustically reproduced melody with the user-specific melody and of the acoustically reproduced transaction data with the selected transaction data (S010; S110, S116; S210).
    Type: Grant
    Filed: September 8, 2009
    Date of Patent: March 31, 2015
    Assignee: Giesecke & Devrient GmbH
    Inventors: Thomas Stocker, Michael Baldischweiler
  • Patent number: 8996377
    Abstract: A text-to-speech (TTS) engine combines recorded speech with synthesized speech from a TTS synthesizer based on text input. The TTS engine receives the text input and identifies the domain for the speech (e.g. navigation, dialing, ...). The identified domain is used in selecting domain specific speech recordings (e.g. pre-recorded static phrases such as "turn left", "turn right", ...) from the input text. The speech recordings are obtained based on the static phrases for the domain that are identified from the input text. The TTS engine blends the static phrases with the TTS output to smooth the acoustic trajectory of the input text. The prosody of the static phrases is used to create similar prosody in the TTS output.
    Type: Grant
    Filed: July 12, 2012
    Date of Patent: March 31, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Sheng Zhao, Peng Wang, Difei Gao, Yijian Wu, Binggong Ding, Shenghua Ye, Max Leung
  • Publication number: 20150088520
    Abstract: A candidate voice segment sequence generator 1 generates candidate voice segment sequences 102 for an input language information sequence 101 by using DB voice segments 105 in a voice segment database 4. An output voice segment sequence determinator 2 calculates a degree of match between the input language information sequence 101 and each of the candidate voice segment sequences 102 by using a parameter 107 showing a value according to a cooccurrence criterion 106 for cooccurrence between the input language information sequence 101 and a sound parameter showing the attribute of each of a plurality of candidate voice segments in each of the candidate voice segment sequences 102, and determines an output voice segment sequence 103 on the basis of the degree of match.
    Type: Application
    Filed: February 21, 2014
    Publication date: March 26, 2015
    Applicant: Mitsubishi Electric Corporation
    Inventors: Takahiro OTSUKA, Keigo KAWASHIMA, Satoru FURUTA, Tadashi YAMAURA
  • Publication number: 20150088521
    Abstract: A speech server includes: a speech terminal-specifying information management unit configured to manage speech terminal-specifying information; a reception unit configured to receive, from an external server, (i) the speech terminal-specifying information or user-specifying information and (ii) speech information indicative of speech content to be outputted as speech; and a speech instruction unit configured to instruct a speech terminal specified by the speech terminal-specifying information to output the speech content as speech.
    Type: Application
    Filed: September 25, 2014
    Publication date: March 26, 2015
    Applicant: SHARP KABUSHIKI KAISHA
    Inventors: Masahiro CHIBA, Kazunori SHIBATA
  • Patent number: 8990087
    Abstract: A method for providing text to speech from digital content in an electronic device is described. Digital content including a plurality of words and a pronunciation database is received. Pronunciation instructions are determined for each word using the digital content. Audio or speech is played for each word using the pronunciation instructions. As a result, the method provides text to speech on the electronic device based on the digital content.
    Type: Grant
    Filed: September 30, 2008
    Date of Patent: March 24, 2015
    Assignee: Amazon Technologies, Inc.
    Inventors: John Lattyak, John T. Kim, Robert Wai-Chi Chu, Laurent An Minh Nguyen
  • Patent number: 8990089
    Abstract: A speech output is generated from a text input written in a first language and containing inclusions in a second language. Words in the native language are pronounced with a native pronunciation and words in the foreign language are pronounced with a proficient foreign pronunciation. Language dependent phoneme symbols generated for words of the second language are replaced with language dependent phoneme symbols of the first language, where said replacing includes the steps of assigning to each language dependent phoneme symbol of the second language a language independent target phoneme symbol, mapping each language independent target phoneme symbol to a language independent substitute phoneme symbol assignable to a language dependent substitute phoneme symbol of the first language, and substituting the language dependent phoneme symbols of the second language with the language dependent substitute phoneme symbols of the first language.
    Type: Grant
    Filed: November 19, 2012
    Date of Patent: March 24, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Johan Wouters, Christof Traber, David Hagstrand, Alexis Wilpert, Jürgen Keller, Igor Nozhov
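The three-stage substitution described above can be sketched as chained lookups. The tiny phoneme inventories below are hypothetical stand-ins (English inclusions in German text), not the patent's actual tables:

```python
# Stage 1: second-language (English) phoneme -> language-independent target.
L2_TO_TARGET = {"TH": "ind_theta", "W": "ind_w"}
# Stage 2: target -> language-independent substitute the first language can realize.
TARGET_TO_SUBSTITUTE = {"ind_theta": "ind_s", "ind_w": "ind_v"}
# Stage 3: language-independent substitute -> first-language (German) phoneme.
SUBSTITUTE_TO_L1 = {"ind_s": "s", "ind_v": "v"}

def substitute_phonemes(l2_phonemes):
    # Route each foreign phoneme through the language-independent layer to
    # the closest phoneme the native voice can actually produce.
    return [SUBSTITUTE_TO_L1[TARGET_TO_SUBSTITUTE[L2_TO_TARGET[p]]]
            for p in l2_phonemes]

print(substitute_phonemes(["W", "TH"]))  # → ['v', 's']
```

The intermediate language-independent layer is what lets one substitution table serve many language pairs.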
  • Patent number: 8983841
    Abstract: A network communication node includes an audio outputter that outputs an audible representation of data to be provided to a requester. The network communication node also includes a processor that determines a categorization of the data to be provided to the requester and that varies a pause between segments of the audible representation of the data in accordance with the categorization of the data to be provided to the requester.
    Type: Grant
    Filed: July 15, 2008
    Date of Patent: March 17, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Gregory Pulz, Steven Lewis, Charles Rajnai
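The pause-variation idea above can be sketched as a category-to-pause lookup applied between segments of the audible output; the categories and durations are assumed for illustration, not taken from the patent:

```python
# Assumed pause lengths in seconds per data categorization.
PAUSE_BY_CATEGORY = {"phone_number": 0.6, "account_balance": 0.45, "sentence": 0.3}

def build_playback_plan(segments, category):
    # Interleave spoken segments with a pause whose length depends on the
    # categorization of the data being read out to the requester.
    pause = PAUSE_BY_CATEGORY.get(category, 0.3)
    plan = []
    for i, seg in enumerate(segments):
        plan.append(("say", seg))
        if i < len(segments) - 1:
            plan.append(("pause", pause))
    return plan

print(build_playback_plan(["555", "0199"], "phone_number"))
```

A phone number gets longer pauses than running prose, giving the listener time to write each group down.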
  • Patent number: 8983835
    Abstract: An electronic device includes a voice processing unit, a wireless communication unit, and a combining unit. The voice processing unit receives speech signals. The wireless communication unit sends the speech signals to a server. The server converts the speech signals into a text message. The wireless communication unit receives the text message from the server. The combining unit combines the text message and the speech signals into a combined message. The wireless communication unit further sends the combined message to a recipient. A related server is also provided.
    Type: Grant
    Filed: June 30, 2011
    Date of Patent: March 17, 2015
    Assignees: Fu Tai Hua Industry (Shenzhen) Co., Ltd, Hon Hai Precision Industry Co., Ltd.
    Inventors: Shih-Fang Wong, Tsung-Jen Chuang, Bo Zhang
  • Patent number: 8977550
    Abstract: Part units of speech information are arranged in a predetermined order to generate a sentence unit of a speech information set. Each of a plurality of speech part units of the speech information is given either an attribute of "interrupt possible after reproduction," with which reproduction of priority interrupt information can be started after that speech part unit is reproduced, or an attribute of "interrupt impossible after reproduction," with which reproduction of the priority interrupt information cannot be started even after that speech part unit is reproduced. When priority interrupt information having a higher priority rank than the speech information set being currently reproduced is inputted, if the attribute of the speech information being reproduced at that point in time is "interrupt impossible after reproduction," then the priority interrupt information is reproduced after the speech information is reproduced.
    Type: Grant
    Filed: May 6, 2011
    Date of Patent: March 10, 2015
    Assignee: Honda Motor Co., Ltd.
    Inventor: Tokujiro Kizaki
  • Patent number: 8977551
    Abstract: The present invention provides a parametric speech synthesis method and a parametric speech synthesis system.
    Type: Grant
    Filed: October 27, 2011
    Date of Patent: March 10, 2015
    Assignee: Goertek Inc.
    Inventors: Fengliang Wu, Zhenhua Wu
  • Patent number: 8977552
    Abstract: A system, method and computer readable medium that enhances a speech database for speech synthesis is disclosed. The method may include labeling audio files in a primary speech database, identifying segments in the labeled audio files that have varying pronunciations based on language differences, identifying replacement segments in a secondary speech database, enhancing the primary speech database by substituting the identified secondary speech database segments for the corresponding identified segments in the primary speech database, and storing the enhanced primary speech database for use in speech synthesis.
    Type: Grant
    Filed: May 28, 2014
    Date of Patent: March 10, 2015
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 8972248
    Abstract: A band broadening apparatus includes a processor configured to analyze a fundamental frequency based on an input signal bandlimited to a first band, generate a signal that includes a second band different from the first band based on the input signal, control a frequency response of the second band based on the fundamental frequency, reflect the frequency response of the second band on the signal that includes the second band and generate a frequency-response-adjusted signal that includes the second band, and synthesize the input signal and the frequency-response-adjusted signal.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: March 3, 2015
    Assignee: Fujitsu Limited
    Inventors: Takeshi Otani, Taro Togawa, Masanao Suzuki, Shusaku Ito
  • Patent number: 8972265
    Abstract: A content customization service is disclosed. The content customization service may identify one or more speakers in an item of content, and map one or more portions of the item of content to a speaker. A speaker may also be mapped to a voice. In one embodiment, the content customization service obtains portions of audio content synchronized to the mapped portions of the item of content. Each portion of audio content may be associated with a voice to which the speaker of the portion of the item of content is mapped. These portions of audio content may be combined to produce a combined item of audio content with multiple voices.
    Type: Grant
    Filed: June 18, 2012
    Date of Patent: March 3, 2015
    Assignee: Audible, Inc.
    Inventor: Kevin S. Lester
  • Patent number: 8965773
    Abstract: A method is provided for hierarchical coding of a digital audio signal comprising, for a current frame of the input signal: a core coding, delivering a scalar quantization index for each sample of the current frame, and at least one enhancement coding delivering indices of scalar quantization for each coded sample of an enhancement signal. The enhancement coding comprises a step of obtaining a filter for shaping the coding noise, which is used to determine a target signal; the indices of scalar quantization of said enhancement signal are determined by minimizing the error between a set of possible values of scalar quantization and said target signal. The coding method can also comprise a shaping of the coding noise for the core bitrate coding. A coder implementing the coding method is also provided.
    Type: Grant
    Filed: November 17, 2009
    Date of Patent: February 24, 2015
    Assignee: Orange
    Inventors: Balazs Kovesi, Stéphane Ragot, Alain Le Guyader
  • Patent number: 8965768
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for detecting and correcting abnormal stress patterns in unit-selection speech synthesis. A system practicing the method detects incorrect stress patterns in selected acoustic units representing speech to be synthesized, and corrects the incorrect stress patterns in the selected acoustic units to yield corrected stress patterns. The system can further synthesize speech based on the corrected stress patterns. In one aspect, the system also classifies the incorrect stress patterns using a machine learning algorithm such as a classification and regression tree, adaptive boosting, support vector machine, and maximum entropy. In this way a text-to-speech unit selection speech synthesizer can produce more natural sounding speech with suitable stress patterns regardless of the stress of units in a unit selection database.
    Type: Grant
    Filed: August 6, 2010
    Date of Patent: February 24, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Yeon-Jun Kim, Mark Charles Beutnagel, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 8965764
    Abstract: Disclosed are an electronic apparatus and a voice recognition method for the same. The voice recognition method for the electronic apparatus includes: receiving an input voice of a user; determining characteristics of the user; and recognizing the input voice based on the determined characteristics of the user.
    Type: Grant
    Filed: January 7, 2010
    Date of Patent: February 24, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hee-seob Ryu, Seung-kwon Park, Jong-ho Lea, Jong-hyuk Jang
  • Patent number: 8965767
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating a synthetic voice. A system configured to practice the method combines a first database of a first text-to-speech voice and a second database of a second text-to-speech voice to generate a combined database, selects from the combined database, based on a policy, voice units of a phonetic category for the synthetic voice to yield selected voice units, and synthesizes speech based on the selected voice units. The system can synthesize speech without parameterizing the first text-to-speech voice and the second text-to-speech voice. A policy can define, for a particular phonetic category, from which text-to-speech voice to select voice units. The combined database can include multiple text-to-speech voices from different speakers. The combined database can include voices of a single speaker speaking in different styles. The combined database can include voices of different languages.
    Type: Grant
    Filed: May 20, 2014
    Date of Patent: February 24, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Alistair D. Conkie, Ann K. Syrdal
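The policy-driven selection described above can be sketched as a per-category routing table over the combined database; the categories, voices, and unit names below are illustrative assumptions:

```python
# Policy: for each phonetic category, which voice database to draw units from.
POLICY = {"vowel": "voice_a", "nasal": "voice_b"}

def select_units(targets, combined_db, policy, default="voice_a"):
    # For each target (phoneme, phonetic category), pick a unit from the
    # voice the policy prescribes for that category; no parameterization of
    # either voice is needed.
    out = []
    for phoneme, category in targets:
        voice = policy.get(category, default)
        out.append(combined_db[voice][phoneme])
    return out

db = {
    "voice_a": {"aa": "unit_a_aa", "n": "unit_a_n"},
    "voice_b": {"aa": "unit_b_aa", "n": "unit_b_n"},
}
print(select_units([("aa", "vowel"), ("n", "nasal")], db, POLICY))
```

The resulting unit sequence mixes the two source voices category by category, which is the essence of the combined-database approach.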
  • Patent number: 8965769
    Abstract: According to one embodiment, a markup assistance apparatus includes an acquisition unit, a first calculation unit, a detection unit and a presentation unit. The acquisition unit acquires a feature amount for respective tags, each of the tags being used to control text-to-speech processing of a markup text. The first calculation unit calculates, for respective character strings, a variance of feature amounts of the tags which are assigned to the character string in a markup text. The detection unit detects a first character string assigned a first tag having the variance not less than a first threshold value as a first candidate including the tag to be corrected. The presentation unit presents the first candidate.
    Type: Grant
    Filed: September 24, 2012
    Date of Patent: February 24, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kouichirou Mori, Masahiro Morita
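The variance test in the abstract above can be sketched directly with the standard library; the feature amounts and threshold below are illustrative, not values from the patent:

```python
from statistics import pvariance

def correction_candidates(markup, threshold):
    # markup: (character string, tag feature amount) pairs collected from a
    # markup text. A string whose tag feature amounts vary widely across
    # occurrences is flagged as a candidate whose tag may need correcting.
    by_string = {}
    for s, feat in markup:
        by_string.setdefault(s, []).append(feat)
    return [s for s, feats in by_string.items()
            if len(feats) > 1 and pvariance(feats) >= threshold]

marked = [("hello", 1.0), ("hello", 5.0), ("world", 2.0), ("world", 2.1)]
print(correction_candidates(marked, 1.0))
```

"hello" is tagged very differently in its two occurrences (variance 4.0), so it is presented for review; "world" is tagged consistently and passes.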
  • Patent number: 8959022
    Abstract: A method for determining a relatedness between a query video and a database video is provided. A processor extracts an audio stream from the query video to produce a query audio stream, extracts an audio stream from the database video to produce a database audio stream, produces a first-sized snippet from the query audio stream, and produces a first-sized snippet from the database audio stream. An estimation is made of a first most probable sequence of latent evidence probability vectors generating the first-sized audio snippet of the query audio stream. An estimation is made of a second most probable sequence of latent evidence probability vectors generating the first-sized audio snippet of the database audio stream. A similarity is measured between the first sequence and the second sequence, producing a score of relatedness between the two snippets. Finally, a relatedness is determined between the query video and the database video.
    Type: Grant
    Filed: November 19, 2012
    Date of Patent: February 17, 2015
    Assignee: Motorola Solutions, Inc.
    Inventors: Yang M. Cheng, Dusan Macho
  • Patent number: 8959021
    Abstract: Features are disclosed for providing a consistent interface for local and distributed text to speech (TTS) systems. Some portions of the TTS system, such as voices and TTS engine components, may be installed on a client device, and some may be present on a remote system accessible via a network link. Determinations can be made regarding which TTS system components to implement on the client device and which to implement on the remote server. The consistent interface facilitates connecting to or otherwise employing the TTS system through use of the same methods and techniques regardless of which TTS system configuration is implemented.
    Type: Grant
    Filed: December 19, 2012
    Date of Patent: February 17, 2015
    Assignee: IVONA Software Sp. z.o.o.
    Inventors: Michal T. Kaszczuk, Lukasz M. Osowski
  • Patent number: 8954328
    Abstract: Disclosed are techniques and systems to provide a narration of a text in multiple different voices. Further disclosed are techniques and systems for providing a plurality of characters at least some of the characters having multiple associated moods for use in document narration.
    Type: Grant
    Filed: January 14, 2010
    Date of Patent: February 10, 2015
    Assignee: K-NFB Reading Technology, Inc.
    Inventors: Raymond C. Kurzweil, Paul Albrecht, Peter Chapman
  • Patent number: 8954335
    Abstract: Appropriate processing results or appropriate apparatuses can be selected with a control device that selects the most probable speech recognition result by using speech recognition scores received with speech recognition results from two or more speech recognition apparatuses; sends the selected speech recognition result to two or more translation apparatuses respectively; selects the most probable translation result by using translation scores received with translation results from the two or more translation apparatuses; sends the selected translation result to two or more speech synthesis apparatuses respectively; receives a speech synthesis processing result including a speech synthesis result and a speech synthesis score from each of the two or more speech synthesis apparatuses; selects the most probable speech synthesis result by using the scores; and sends the selected speech synthesis result to a second terminal apparatus.
    Type: Grant
    Filed: March 3, 2010
    Date of Patent: February 10, 2015
    Assignee: National Institute of Information and Communications Technology
    Inventors: Satoshi Nakamura, Eiichiro Sumita, Yutaka Ashikari, Noriyuki Kimura, Chiori Hori
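The control device's select-and-forward behavior described above can be sketched as an argmax over (result, score) pairs at each stage; the stand-in translate and synthesize functions below are assumptions for illustration, in place of querying real remote apparatuses:

```python
def best(results):
    # results: (output, score) pairs returned by competing apparatuses;
    # keep the most probable output.
    return max(results, key=lambda r: r[1])[0]

def cascade(recognition_results, translate, synthesize):
    # Select the most probable result at each stage and forward it to the
    # next group of apparatuses; `translate` and `synthesize` each return a
    # list of (output, score) pairs.
    text = best(recognition_results)
    translated = best(translate(text))
    return best(synthesize(translated))

asr = [("hello world", 0.9), ("hollow word", 0.4)]
fake_translate = lambda t: [("bonjour le monde", 0.8), ("salut monde", 0.5)]
fake_synth = lambda t: [("wav:" + t, 0.7)]
print(cascade(asr, fake_translate, fake_synth))  # → wav:bonjour le monde
```

Each stage is free to use different vendors' apparatuses, since only the scored outputs cross the boundary.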
  • Patent number: 8949123
    Abstract: The voice conversion method of a display apparatus includes: in response to the receipt of a first video frame, detecting one or more entities from the first video frame; in response to the selection of one of the detected entities, storing the selected entity; in response to the selection of one of a plurality of previously-stored voice samples, storing the selected voice sample in connection with the selected entity; and in response to the receipt of a second video frame including the selected entity, changing a voice of the selected entity based on the selected voice sample and outputting the changed voice.
    Type: Grant
    Filed: April 11, 2012
    Date of Patent: February 3, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Aditi Garg, Kasthuri Jayachand Yadlapalli
  • Patent number: 8949128
    Abstract: Techniques for providing speech output for speech-enabled applications. A synthesis system receives from a speech-enabled application a text input including a text transcription of a desired speech output. The synthesis system selects one or more audio recordings corresponding to one or more portions of the text input. In one aspect, the synthesis system selects from audio recordings provided by a developer of the speech-enabled application. In another aspect, the synthesis system selects an audio recording of a speaker speaking a plurality of words. The synthesis system forms a speech output including the one or more selected audio recordings and provides the speech output for the speech-enabled application.
    Type: Grant
    Filed: February 12, 2010
    Date of Patent: February 3, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Darren C. Meyer, Corinne Bos-Plachez, Martine Marguerite Staessen
  • Patent number: 8942983
    Abstract: The present invention relates to a method of text-based speech synthesis, wherein at least one portion of a text is specified; the intonation of each portion is determined; target speech sounds are associated with each portion; physical parameters of the target speech sounds are determined; speech sounds most similar in terms of the physical parameters to the target speech sounds are found in a speech database; and speech is synthesized as a sequence of the found speech sounds. The physical parameters of said target speech sounds are determined in accordance with the determined intonation. The present method, when used in a speech synthesizer, allows improved quality of synthesized speech due to precise reproduction of intonation.
    Type: Grant
    Filed: November 23, 2011
    Date of Patent: January 27, 2015
    Assignee: Speech Technology Centre, Limited
    Inventor: Mikhail Vasilievich Khitrov
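The selection step this abstract describes, matching target speech sounds to database sounds by physical parameters, can be sketched as a nearest-neighbor search. A minimal illustration, assuming the parameters are a pitch (f0) and a duration value; the field names and mini-database are hypothetical, not from the patent:

```python
import math

def nearest_unit(target, database):
    """Return the database unit whose physical parameters (e.g. f0 in Hz,
    duration in ms) are closest to the intonation-derived target,
    by Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(database, key=lambda unit: dist(unit["params"], target))

# Hypothetical mini-database of recorded variants of one speech sound.
db = [
    {"id": "a_low",  "params": (110.0, 80.0)},   # (f0 Hz, duration ms)
    {"id": "a_mid",  "params": (160.0, 95.0)},
    {"id": "a_high", "params": (220.0, 120.0)},  # e.g. question intonation
]

# A rising-intonation target selects the high-pitched variant.
best = nearest_unit((210.0, 115.0), db)
print(best["id"])  # -> a_high
```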
  • Publication number: 20150025892
    Abstract: A system and method for speech-to-singing synthesis is provided. The method includes deriving characteristics of a singing voice for a first individual and modifying vocal characteristics of a voice for a second individual in response to the characteristics of the singing voice of the first individual to generate a synthesized singing voice for the second individual.
    Type: Application
    Filed: March 6, 2013
    Publication date: January 22, 2015
    Applicant: Agency for Science, Technology and Research
    Inventors: Siu Wa Lee, Ling Cen, Haizhou Li, Yaozhu Paul Chan, Minghui Dong
  • Patent number: 8930200
    Abstract: A vector joint encoding/decoding method and a vector joint encoder/decoder are provided. More than two vectors are jointly encoded, and the encoding index of at least one vector is split and then recombined across different vectors, so that the idle encoding spaces of different vectors can be pooled, which saves encoding bits. Because an encoding index is split into shorter indexes before recombination, the required bit width of the operating parts in encoding/decoding calculation is also reduced.
    Type: Grant
    Filed: July 24, 2013
    Date of Patent: January 6, 2015
    Assignee: Huawei Technologies Co., Ltd
    Inventors: Fuwei Ma, Dejun Zhang, Lei Miao, Fengyan Qi
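The bit saving from joint encoding follows from simple arithmetic: packing two indices from codebooks of sizes N1 and N2 into one joint index needs ceil(log2(N1·N2)) bits, which can be less than the ceil(log2 N1) + ceil(log2 N2) bits of encoding them separately. A toy sketch of this idea (the patent's actual index-splitting scheme is more involved):

```python
import math

def joint_encode(i1, i2, n2):
    """Pack two codebook indices into one joint index: i1 * n2 + i2."""
    return i1 * n2 + i2

def joint_decode(idx, n2):
    """Recover the two indices from the joint index."""
    return divmod(idx, n2)

N1, N2 = 5, 5
separate_bits = math.ceil(math.log2(N1)) + math.ceil(math.log2(N2))  # 3 + 3 = 6
joint_bits = math.ceil(math.log2(N1 * N2))                           # ceil(log2 25) = 5

idx = joint_encode(3, 4, N2)
assert joint_decode(idx, N2) == (3, 4)
print(separate_bits, joint_bits)  # 6 5: one bit saved per vector pair
```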
  • Publication number: 20150006180
    Abstract: A process and system for enhancing and customizing movie theatre sound includes receiving an input audio signal and enhancing the voice audio input in two or more harmonic and dynamic ranges by re-synthesizing the audio into a full-range PCM wave. The enhancement includes processing the input audio in parallel through a low pass filter with dynamic offset, an envelope controlled bandpass filter, and a high pass filter, adding an amount of dynamic synthesized sub bass to the audio, and combining the four treated audio signals with the original audio in a summing mixer.
    Type: Application
    Filed: February 21, 2014
    Publication date: January 1, 2015
    Applicant: Max Sound Corporation
    Inventor: Lloyd Trammell
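The parallel-branch structure described above, separate low-pass, band-pass, and high-pass paths recombined with the original in a summing mixer, can be sketched with one-pole filters. The coefficients and mix weights below are illustrative, and the dynamic offset, envelope control, and synthesized sub-bass of the abstract are omitted:

```python
def one_pole_lowpass(signal, alpha):
    """Simple one-pole low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out

def enhance(signal):
    """Sum of parallel branches, loosely mirroring the abstract's
    low-pass, band-pass, and high-pass paths plus the original audio."""
    low = one_pole_lowpass(signal, 0.1)
    wide = one_pole_lowpass(signal, 0.5)
    band = [w - l for w, l in zip(wide, low)]     # crude band-pass
    high = [x - w for x, w in zip(signal, wide)]  # crude high-pass
    # Summing mixer: weighted recombination with the untouched input.
    return [0.4 * l + 0.3 * b + 0.3 * h + x
            for l, b, h, x in zip(low, band, high, signal)]

y = enhance([0.0, 1.0, 0.5, -0.5, -1.0, 0.0])
print(len(y))  # same length as the input
```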
  • Patent number: 8924217
    Abstract: A communication converter is described for converting among speech signals and textual information, permitting communication between telephone users and textual instant communications users.
    Type: Grant
    Filed: April 28, 2011
    Date of Patent: December 30, 2014
    Assignee: Verizon Patent and Licensing Inc.
    Inventors: Richard G. Moore, Gregory L. Mumford, Duraisamy Gunasekar
  • Patent number: 8918323
    Abstract: A contextual conversion platform, and a method for converting text-to-speech, are described that can convert content of a target to spoken content. Embodiments of the contextual conversion platform can identify certain contextual characteristics of the content, from which a spoken content input can be generated. This spoken content input can include tokens, e.g., words and abbreviations, to be converted to the spoken content, as well as substitution tokens that are selected from contextual repositories based on the context identified by the contextual conversion platform.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: December 23, 2014
    Inventor: Daniel Ben-Ezri
  • Patent number: 8918322
    Abstract: A personalized text-to-speech (pTTS) system provides a method for converting text data to speech data utilizing a pTTS template representing the voice characteristics of an individual. A memory stores executable program code that converts text data to speech data. Text data represents a textual message directed to a system user and speech data represents a spoken form of text data having the characteristics of an individual's voice. A processor executes the program code, and a storage device stores a pTTS template and may store speech data. The pTTS system can be used to provide various services that provide immediate spoken presentation of the speech data converted from text data and/or combine stored speech data with generated speech data for spoken presentation.
    Type: Grant
    Filed: June 20, 2007
    Date of Patent: December 23, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Edmund Gale Acker, Frederick Murray Burg
  • Patent number: 8918313
    Abstract: A method of selectively performing signal processing in a first mode and in a second mode. In the first mode, a noise cancel signal having a signal characteristic to cancel an external noise component is generated based on a voice signal supplied from a microphone, and an input digital audio signal and the noise cancel signal are combined into a voice signal to be output through a speaker. In the second mode, a sound process for vocal voice is performed on a voice signal supplied from a microphone, a vocal voice component is canceled from a digital audio signal of input music to generate a karaoke signal, and the karaoke signal and the vocal signal are combined into a voice signal to be output through a speaker. The first mode corresponds to an audio replay operation accompanied by noise cancel, and the second mode corresponds to a karaoke operation.
    Type: Grant
    Filed: May 16, 2012
    Date of Patent: December 23, 2014
    Assignee: Sony Corporation
    Inventors: Kazunobu Ookuri, Kohei Asada, Yasunobu Murata
  • Patent number: 8917876
    Abstract: SPL monitoring systems are provided. An SPL monitoring system includes an audio transducer configured to receive sound pressure, a logic circuit that calculates a safe time duration over which a user can receive current sound pressure values, and an indicator element that produces a notification when an indicator level occurs. An SPL monitoring information system includes a database that stores data such as a list of earpiece devices and associated instrument response functions. The logic circuit compares a request with the data in the database, retrieves a subset of the data, and sends it to an output control unit. The output control unit sends the subset of data to a sending unit.
    Type: Grant
    Filed: June 14, 2007
    Date of Patent: December 23, 2014
    Assignee: Personics Holdings, LLC.
    Inventor: Steven W. Goldstein
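The abstract does not specify how the logic circuit computes the safe time duration, but a standard occupational-safety rule, the NIOSH exchange-rate formula, illustrates the idea: permissible exposure time starts at 8 hours at a criterion level and halves for every 3 dB increase in sound pressure level. A sketch under that assumption:

```python
def safe_hours(spl_db, criterion_db=85.0, exchange_rate_db=3.0):
    """NIOSH-style permissible exposure time: 8 hours at the criterion
    level, halved for every `exchange_rate_db` dB increase in SPL.
    (The patent does not state its formula; this is a standard rule
    used here purely for illustration.)"""
    return 8.0 / (2.0 ** ((spl_db - criterion_db) / exchange_rate_db))

print(safe_hours(85))   # 8.0 hours
print(safe_hours(88))   # 4.0 hours
print(safe_hours(100))  # 0.25 hours (15 minutes)
```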
  • Patent number: 8914290
    Abstract: Method and apparatus that dynamically adjusts operational parameters of a text-to-speech engine in a speech-based system. A voice engine or other application of a device provides a mechanism to alter the adjustable operational parameters of the text-to-speech engine. In response to one or more environmental conditions, the adjustable operational parameters of the text-to-speech engine are modified to increase the intelligibility of synthesized speech.
    Type: Grant
    Filed: May 18, 2012
    Date of Patent: December 16, 2014
    Assignee: Vocollect, Inc.
    Inventors: James Hendrickson, Debra Drylie Scott, Duane Littleton, John Pecorari, Arkadiusz Slusarczyk
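A minimal sketch of the environment-driven adjustment this abstract describes, assuming ambient noise level is the environmental condition being monitored; the parameter names, thresholds, and step sizes are illustrative, not taken from the patent:

```python
def adjust_tts_params(noise_db, params=None):
    """Raise volume and slow the speaking rate as ambient noise grows,
    within clamped bounds, to keep synthesized speech intelligible.
    All names and numbers here are hypothetical."""
    params = dict(params or {"volume": 0.5, "rate": 1.0})
    if noise_db > 70:    # loud environment: louder and slower
        params["volume"] = min(1.0, params["volume"] + 0.3)
        params["rate"] = max(0.8, params["rate"] - 0.2)
    elif noise_db < 40:  # quiet environment: ease volume back down
        params["volume"] = max(0.3, params["volume"] - 0.1)
    return params

print(adjust_tts_params(80))  # louder, slower speech in a noisy setting
```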
  • Patent number: 8914291
    Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.
    Type: Grant
    Filed: September 24, 2013
    Date of Patent: December 16, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Darren C. Meyer, Stephen R. Springer
  • Patent number: 8909538
    Abstract: Improved methods of presenting speech prompts to a user as part of an automated system that employs speech recognition or other voice input are described. The invention improves the user interface by providing in combination with at least one user prompt seeking a voice response, an enhanced user keyword prompt intended to facilitate the user selecting a keyword to speak in response to the user prompt. The enhanced keyword prompts may be the same words as those a user can speak as a reply to the user prompt but presented using a different audio presentation method, e.g., speech rate, audio level, or speaker voice, than used for the user prompt. In some cases, the user keyword prompts are different words from the expected user response keywords, or portions of words, e.g., truncated versions of keywords.
    Type: Grant
    Filed: November 11, 2013
    Date of Patent: December 9, 2014
    Assignee: Verizon Patent and Licensing Inc.
    Inventor: James Mark Kondziela
  • Publication number: 20140350940
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for unit selection synthesis. The method causes a computing device to add a supplemental phoneset to a speech synthesizer front end having an existing phoneset, modify a unit preselection process based on the supplemental phoneset, preselect units from the supplemental phoneset and the existing phoneset based on the modified unit preselection process, and generate speech based on the preselected units. The supplemental phoneset can be a variation of the existing phoneset, can include a word boundary feature, can include a cluster feature where initial consonant clusters and some word boundaries are marked with diacritics, can include a function word feature which marks units as originating from a function word or a content word, and/or can include a pre-vocalic or post-vocalic feature. The speech synthesizer front end can incorporate the supplemental phoneset as an extra feature.
    Type: Application
    Filed: August 7, 2014
    Publication date: November 27, 2014
    Inventors: Alistair D. CONKIE, Mark BEUTNAGEL, Yeon-Jun KIM, Ann K. SYRDAL
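A rough sketch of how a preselection pass might be modified by a supplemental feature such as a word-boundary mark: candidates must match the target on the extra feature as well as on phone identity. All field names are hypothetical:

```python
def preselect_units(target, units, use_supplemental=True):
    """Keep candidate units matching the target phone; when the
    supplemental phoneset is enabled, additionally require a match on
    the word-boundary feature. A simplification of the patent's
    modified preselection process."""
    chosen = []
    for u in units:
        if u["phone"] != target["phone"]:
            continue
        if not use_supplemental or u["word_boundary"] == target["word_boundary"]:
            chosen.append(u)
    return chosen

units = [
    {"phone": "t", "word_boundary": True},
    {"phone": "t", "word_boundary": False},
    {"phone": "d", "word_boundary": True},
]
print(len(preselect_units({"phone": "t", "word_boundary": True}, units)))  # 1
```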
  • Patent number: 8898066
    Abstract: A multi-lingual text-to-speech system and method process a text to be synthesized via an acoustic-prosodic model selection module and an acoustic-prosodic model mergence module, and obtain a phonetic unit transformation table. In an online phase, the acoustic-prosodic model selection module, according to the text and a phonetic unit transcription corresponding to the text, uses at least one controllable accent weighting parameter to select a transformation combination and find a second and a first acoustic-prosodic model. The acoustic-prosodic model mergence module merges the two acoustic-prosodic models into a merged acoustic-prosodic model according to the at least one controllable accent weighting parameter, processes all transformations in the transformation combination, and generates a merged acoustic-prosodic model sequence. A speech synthesizer and the merged acoustic-prosodic model sequence are further applied to synthesize the text into L1-accented L2 speech.
    Type: Grant
    Filed: August 25, 2011
    Date of Patent: November 25, 2014
    Assignee: Industrial Technology Research Institute
    Inventors: Jen-Yu Li, Jia-Jang Tu, Chih-Chung Kuo
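The mergence of two acoustic-prosodic models under an accent weighting parameter can be illustrated, in grossly simplified form, as linear interpolation of per-parameter statistics; the parameter names and values below are hypothetical, and the patent's actual mergence operates on full model sequences:

```python
def merge_models(l1_model, l2_model, accent_weight):
    """Interpolate each model parameter, with accent_weight in [0, 1]
    pulling the merged model toward the L1-accented statistics."""
    return {k: accent_weight * l1_model[k] + (1 - accent_weight) * l2_model[k]
            for k in l1_model}

l1 = {"mean_f0": 180.0, "phone_dur": 90.0}  # hypothetical L1 model statistics
l2 = {"mean_f0": 140.0, "phone_dur": 70.0}  # hypothetical L2 model statistics
print(merge_models(l1, l2, 0.25))  # {'mean_f0': 150.0, 'phone_dur': 75.0}
```

Setting the weight to 0 yields pure L2 statistics and 1 yields pure L1, matching the idea of a controllable degree of accent.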
  • Patent number: 8892442
    Abstract: Disclosed herein are systems, methods, and computer readable-media for answering a communication notification. The method for answering a communication notification comprises receiving a notification of communication from a user, converting information related to the notification to speech, outputting the information as speech to the user, and receiving from the user an instruction to accept or ignore the incoming communication associated with the notification. In one embodiment, information related to the notification comprises one or more of a telephone number, an area code, a geographic origin of the request, caller id, a voice message, address book information, a text message, an email, a subject line, an importance level, a photograph, a video clip, metadata, an IP address, or a domain name. In another embodiment, the notification is assigned an importance level, and notification attempts are repeated if it is of high importance.
    Type: Grant
    Filed: February 17, 2014
    Date of Patent: November 18, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst J. Schroeter
  • Patent number: 8892230
    Abstract: A multicore system 2 includes a main system program 610 that operates on a first processor core 61 and stores synthesized audio data, which is mixed audio data, in a buffer for DMA transfer 63; a standby system program 620 that operates on a second processor core 62; and an audio output unit 64 that sequentially stores the synthesized audio data transferred from the buffer for DMA transfer 63 and plays the stored synthesized audio data. When the amount of synthesized audio data stored in the buffer for DMA transfer 63 has not reached a predetermined amount of data, determined according to the amount of synthesized audio data stored in the audio output unit 64, the standby system program 620 takes over and executes the mixing and storage of the synthesized audio data otherwise executed by the main system program 610.
    Type: Grant
    Filed: August 4, 2010
    Date of Patent: November 18, 2014
    Assignee: NEC Corporation
    Inventor: Kentaro Sasagawa
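The takeover condition described above, buffer fill below a required amount that shrinks as the audio output unit's own backlog grows, can be sketched as a simple threshold test; the buffer sizes and threshold are illustrative, not from the patent:

```python
def should_standby_take_over(buffer_fill, playback_backlog, full_threshold=4096):
    """Return True when the DMA-transfer buffer holds less than the
    predetermined amount, where the requirement is relaxed by however
    much audio the output unit already has queued for playback."""
    required = max(0, full_threshold - playback_backlog)
    return buffer_fill < required

print(should_standby_take_over(1000, 512))   # True: buffer is running dry
print(should_standby_take_over(4000, 2048))  # False: enough audio buffered
```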