Speech Synthesis; Text To Speech Systems (epo) Patents (Class 704/E13.001)
-
Publication number: 20110283190
Abstract: An interface device and method of use, comprising audio and image inputs; a processor for determining topics of interest, and receiving information of interest to the user from a remote resource; an audiovisual output for presenting an anthropomorphic object conveying the received information, having a selectively defined and adaptively alterable mood; an external communication device adapted to remotely communicate at least a voice conversation with a human user of the personal interface device. Also provided is a system and method adapted to receive logic for, synthesize, and engage in conversation dependent on received conversational logic and a personality.
Type: Application
Filed: May 12, 2011
Publication date: November 17, 2011
Inventor: Alexander POLTORAK
-
Publication number: 20110276332
Abstract: A speech synthesis method comprising: receiving a text input and outputting speech corresponding to said text input using a stochastic model, said stochastic model comprising an acoustic model and an excitation model, said acoustic model having a plurality of model parameters describing probability distributions which relate a word or part thereof to a feature, said excitation model comprising excitation model parameters which are used to model the vocal cords and lungs to output the speech using said features; wherein said acoustic parameters and excitation parameters have been jointly estimated; and outputting said speech.
Type: Application
Filed: May 6, 2011
Publication date: November 10, 2011
Applicant: Kabushiki Kaisha Toshiba
Inventors: Ranniery MAIA, Byung Ha Chun
-
Publication number: 20110270614
Abstract: A method and an apparatus for switching speech or audio signals. The method includes, when switching a speech or audio signal, weighting a first high frequency band signal of the current frame of the speech or audio signal and a second high frequency band signal of the previous M frames of speech or audio signals to obtain a processed first high frequency band signal, where M is greater than or equal to 1, and synthesizing the processed first high frequency band signal and a first low frequency band signal of the current frame of the speech or audio signal into a wide frequency band signal. In this way, speech or audio signals with different bandwidths can be smoothly switched, thus improving the quality of audio signals received by a user.
Type: Application
Filed: June 16, 2011
Publication date: November 3, 2011
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Zexin Liu, Lei Miao, Chen Hu, Wenhai Wu, Yue Lang, Qing Zhang
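The blending step in the abstract above can be sketched as a per-sample weighted average of the current frame's high band against the previous M frames' high bands. This is a minimal illustration, not the patented algorithm: the function names, the averaging rule, and the 0.5 weight are assumptions, and real sub-band synthesis would use a filterbank rather than concatenation.

```python
def smooth_switch(current_high, previous_highs, weight_current=0.5):
    """Blend the current frame's high-band samples with the mean of the
    previous M frames' high-band samples (illustrative weighting rule)."""
    m = len(previous_highs)          # M >= 1 previous frames
    n = len(current_high)
    # average the previous frames sample by sample
    avg_prev = [sum(frame[i] for frame in previous_highs) / m for i in range(n)]
    w = weight_current
    return [w * c + (1 - w) * p for c, p in zip(current_high, avg_prev)]

def synthesize_wideband(low_band, processed_high):
    # stand-in for a real synthesis filterbank: just join the sub-bands
    return low_band + processed_high
```

With a weight of 0.5, a switch from silence to a full-scale high band is halved on the first frame, which is the smoothing effect the abstract describes.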
-
Publication number: 20110270613
Abstract: The disclosed solution includes a method for dynamically switching modalities based upon inferred conditions in a dialogue session involving a speech application. The method establishes a dialogue session between a user and the speech application. During the dialogue session, the user interacts using an original modality and a second modality. The speech application interacts using a speech modality only. A set of conditions indicative of interaction problems using the original modality can be inferred. Responsive to the inferring step, the original modality can be changed to the second modality. A modality transition to the second modality can be transparent to the speech application and can occur without interrupting the dialogue session. The original modality and the second modality can be different modalities; one including a text exchange modality and another including a speech modality.
Type: Application
Filed: July 8, 2011
Publication date: November 3, 2011
Applicant: Nuance Communications, Inc.
Inventors: William V. Da Palma, Baiju D. Mandalia, Victor S. Moore, Wendi L. Nusbickel
-
Publication number: 20110264452
Abstract: Example embodiments disclosed herein relate to audio output of speech data using speech control commands. In particular, example embodiments include a mechanism for accessing text data. Example embodiments may also include a mechanism for outputting the text data as audio by converting the text data to speech audio data and transmitting the speech audio data over an audio output. Example embodiments may also include a mechanism for receiving speech control commands that allow for voice control of the output of the audio data.
Type: Application
Filed: April 27, 2010
Publication date: October 27, 2011
Inventors: Ramya Venkataramu, Molly Joy
-
Publication number: 20110260832
Abstract: In one embodiment, a method includes enrolling a potential enrollee for an identity-monitoring service. The enrolling includes acquiring personally-identifying information (PII) and capturing a voiceprint. Following successful completion of the enrolling, the potential enrollee is an enrollee. The method further includes, responsive to an identified suspicious event related to the PII, creating an identity alert, establishing voice communication with an individual purporting to be the enrollee, and performing voice-biometric verification of the individual. The voice-biometric verification includes comparing one or more spoken utterances with the voiceprint. Following successful completion of the voice-biometric verification, the individual is a verified enrollee. In addition, the method includes authorizing delivery of the identity alert to the verified enrollee.
Type: Application
Filed: April 25, 2011
Publication date: October 27, 2011
Inventors: Joe Ross, Isaac Chapa, Adrian Cruz, Harold E. Gottschalk, JR.
-
Publication number: 20110243447
Abstract: Method and apparatus for synthesizing speech from a plurality of portions of text data, each portion having at least one associated attribute. The invention is achieved by determining (25, 35, 45) a value of the attribute for each of the portions of text data, selecting (27, 37, 47) a voice from a plurality of candidate voices on the basis of each of said determined attribute values, and converting (29, 39, 49) each portion of text data into synthesized speech using said respective selected voice.
Type: Application
Filed: December 7, 2009
Publication date: October 6, 2011
Applicant: KONINKLIJKE PHILIPS ELECTRONICS N.V.
Inventor: Franciscus Johannes Henricus Maria Meulenbroeks
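The determine/select/convert loop described in the abstract above can be sketched in a few lines. Everything here is illustrative: the attribute names, the voice table, and the placeholder `synthesize()` stand in for a real TTS backend and are not drawn from the patent.

```python
# hypothetical mapping from attribute value to candidate voice
VOICE_TABLE = {"narration": "voice_a", "dialogue": "voice_b"}

def synthesize(text, voice):
    # placeholder for an actual text-to-speech engine call
    return f"[{voice}] {text}"

def synthesize_portions(portions, default_voice="voice_a"):
    """portions: list of (text, attribute) pairs; one voice is selected
    per portion and the portion is converted with that voice."""
    out = []
    for text, attribute in portions:
        voice = VOICE_TABLE.get(attribute, default_voice)  # select a voice
        out.append(synthesize(text, voice))                # convert the portion
    return out
```

A call such as `synthesize_portions([("Once upon a time", "narration"), ("Hello!", "dialogue")])` yields each portion rendered with its attribute-selected voice.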
-
Publication number: 20110246201
Abstract: While performing a function, a mobile device identifies that it is idle while it is downloading content or performing another task. During that idle time, it gathers one or more parameters (e.g., location, time, gender of user, age of user, etc.) and sends a request for an audio message (e.g., audio advertisement). One or more servers at a remote facility receive the request with the one or more parameters, and use the parameters to identify a targeted message. In some cases, the targeted message will include one or more dynamic variables (e.g., distance to store, time to event, etc.) that will be replaced based on the parameters received from the mobile device, so that the audio message is dynamically updated and customized for the mobile device. In one embodiment, the targeted message is transmitted to the mobile device as text. After being received at the mobile device, the text is optionally displayed and converted to an audio format and played for the user.
Type: Application
Filed: April 6, 2010
Publication date: October 6, 2011
Inventor: Andre F. Hawit
-
Publication number: 20110246199
Abstract: According to one embodiment, a speech synthesizer generates a speech segment sequence and synthesizes speech by connecting speech segments of the generated speech segment sequence. If a speech segment of a synthesized first speech segment sequence is different from the speech segment of a synthesized second speech segment sequence having the same synthesis unit as the first speech segment sequence, the speech synthesizer disables the speech segment of the first speech segment sequence that is different from the speech segment of the second speech segment sequence.
Type: Application
Filed: September 14, 2010
Publication date: October 6, 2011
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Osamu NISHIYAMA, Takehiko Kagoshima
-
Publication number: 20110243310
Abstract: A system may include a database configured to selectively store and retrieve data. The system may further include a call record parser configured to receive a plurality of call records, each call record being associated with a respective call, parse the plurality of call records to identify periods of resource usage and types of resource usage for the associated calls, create parsed data based on the identified periods of resource usage and the types of resource usage, and store the parsed data in the database indexed according to the type of the identified resource and including the start and end times for the identified periods of usage.
Type: Application
Filed: March 30, 2010
Publication date: October 6, 2011
Applicant: Verizon Patent and Licensing Inc.
Inventors: Belinda Franklin-Barr, John Rivera
-
Publication number: 20110238420
Abstract: According to one embodiment, a method for editing speech is disclosed. The method can generate speech information from a text. The speech information includes phonologic information and prosody information. The method can divide the speech information into a plurality of speech units, based on at least one of the phonologic information and the prosody information. The method can search at least two speech units from the plurality of speech units. At least one of the phonologic information and the prosody information in the at least two speech units is identical or similar. In addition, the method can store a speech unit waveform corresponding to one of the at least two speech units as a representative speech unit into a memory.
Type: Application
Filed: September 13, 2010
Publication date: September 29, 2011
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Gou Hirabayashi, Takehiko Kagoshima
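The storage step described in the abstract above amounts to keeping one representative waveform per group of matching units. The sketch below illustrates the idea under a simplifying assumption not taken from the patent: units are grouped by an exact phonologic/prosody key, and the first unit seen for a key becomes the representative.

```python
def store_representatives(units):
    """units: list of (key, waveform) pairs, where `key` summarizes the
    phonologic and prosody information of a speech unit.
    Returns a memory mapping each key to one representative waveform."""
    memory = {}
    for key, waveform in units:
        # keep the first occurrence as the representative; later
        # identical/similar units reuse it instead of being stored
        memory.setdefault(key, waveform)
    return memory
```

A real system would use a similarity measure rather than exact key equality, but the space saving comes from the same deduplication step.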
-
Publication number: 20110231192
Abstract: A system and method for generating audio content. Content is automatically retrieved from an original website according to a predetermined schedule to generate retrieved content. The retrieved content is converted to one or more audio files. A hierarchy is assigned to the one or more audio files to provide an audible website that mimics a hierarchy of the retrieved content as represented at the original website. The audible website is stored in a database for retrieval by one or more users. A first user input is received indicating an attempt to access the original website. The audible website is indicated as being associated with the original website in response to the user selection. Portions of the audible website are played in response to a second user input.
Type: Application
Filed: May 2, 2011
Publication date: September 22, 2011
Inventors: William C. O'Conor, Nathan T. Bradley
-
Publication number: 20110231193
Abstract: Various technologies for generating a synthesized singing voice waveform. In one implementation, the computer program may receive a request from a user to create a synthesized singing voice using the lyrics of a song and a digital file containing its melody as inputs. The computer program may then dissect the lyrics' text and its melody file into their corresponding sub-phonemic units and musical score respectively. The musical score may be further dissected into a sequence of musical notes and duration times for each musical note. The computer program may then determine a fundamental frequency (F0), or pitch, of each musical note.
Type: Application
Filed: June 2, 2011
Publication date: September 22, 2011
Applicant: Microsoft Corporation
Inventors: Yao Qian, Frank Soong
-
Publication number: 20110205849
Abstract: A digital calendar device involving an electronic device having a graphic user interface. The electronic device is capable of receiving at least one note from a user, and retrieving and displaying the at least one note, as well as running on at least one operating system. The digital calendar device further includes features, such as retrieval and displaying features, a touch screen interface, a handwriting support feature, a speech recognition feature, a reminder feature, a frame feature, a mechanical support feature, and a fortune-teller software program.
Type: Application
Filed: February 23, 2010
Publication date: August 25, 2011
Applicant: SONY CORPORATION, A JAPANESE CORPORATION
Inventor: Feng Kang
-
Publication number: 20110200214
Abstract: A hearing aid includes a microphone to convert audible sounds into sound-related electrical signals and a memory configured to store a plurality of hearing aid profiles. Each hearing aid profile has an associated audio label. The hearing aid further includes a processor coupled to the microphone and to the memory and configured to select one of the plurality of hearing aid profiles. The processor applies the one of the plurality of hearing aid profiles to the sound-related electrical signals to produce a shaped output signal to compensate for a hearing impairment of a user. The processor is configured to insert the associated audio label into the shaped output signal. The hearing aid also includes a speaker coupled to the processor and configured to convert the shaped output signal into an audible sound.
Type: Application
Filed: February 8, 2011
Publication date: August 18, 2011
Applicant: AUDIOTONIQ, INC.
Inventors: John Michael Page Knox, David Matthew Landry, Samir Ibrahim, Andrew Lawrence Eisenberg
-
Publication number: 20110202344
Abstract: Techniques for providing speech output for speech-enabled applications. A synthesis system receives from a speech-enabled application a text input including a text transcription of a desired speech output. The synthesis system selects one or more audio recordings corresponding to one or more portions of the text input. In one aspect, the synthesis system selects from audio recordings provided by a developer of the speech-enabled application. In another aspect, the synthesis system selects an audio recording of a speaker speaking a plurality of words. The synthesis system forms a speech output including the one or more selected audio recordings and provides the speech output for the speech-enabled application.
Type: Application
Filed: February 12, 2010
Publication date: August 18, 2011
Inventors: Darren C. Meyer, Corinne Bos-Plachez, Martine Marguerite Staessen
-
Publication number: 20110202347
Abstract: A communication converter is described for converting among speech signals and textual information, permitting communication between telephone users and textual instant communications users.
Type: Application
Filed: April 28, 2011
Publication date: August 18, 2011
Applicant: VERIZON BUSINESS GLOBAL LLC
Inventors: Richard G. Moore, Gregory L. Mumford, Duraisamy Gunasekar
-
Publication number: 20110196680
Abstract: When a system (100) is used for synthesizing speech having prosody serving as a reference, the system stores speech element information representing a speech element capable of synthesizing speech having a degree of naturalness indicating a degree of similarity to speech uttered by a human higher than a predetermined reference value (speech element information storage (115)). The system accepts requested prosody information representing prosody requested by the user (requested prosody information accepting part (113)). The system generates intermediate prosody information representing intermediate prosody between the reference prosody and the requested prosody (intermediate prosody information generator (114)). The system executes a speech synthesis process to synthesize speech based on the generated intermediate prosody information and the stored speech element information (speech synthesizer (116)).
Type: Application
Filed: August 21, 2009
Publication date: August 11, 2011
Applicant: NEC CORPORATION
Inventor: Masanori Kato
-
Publication number: 20110184738
Abstract: TTS is a well-known technology that has been used for decades in applications ranging from artificial call-center attendants to PC software that allows people with visual impairments or reading disabilities to listen to written works on a home computer. However, to date TTS has not been widely adopted by PC and mobile users for daily reading tasks such as reading emails, PDF and Word documents, website content, and books. The present invention offers a new user experience for operating TTS in day-to-day usage. More specifically, this invention describes a synchronization technique for following text being read by TTS engines, and specific interfaces for touch pads and touch and multi-touch screens. This invention also describes the use of other input methods such as the touchpad, mouse, and keyboard.
Type: Application
Filed: January 25, 2011
Publication date: July 28, 2011
Inventors: Dror KALISKY, Sharon CARMEL
-
Publication number: 20110179452
Abstract: A device for providing a television sequence has a database interface, a search request receiver, a television sequence rendition module and an output interface. The database interface accesses at least one database, using a search request. The search request receiver is formed to control the database interface so as to acquire at least audio content and at least image content separate therefrom via the database interface for the search request. The television sequence rendition module combines the separate audio content and the image content to generate the television sequence based on the audio content and the image content. The output interface outputs the television sequence to a television sequence distributor.
Type: Application
Filed: January 21, 2011
Publication date: July 21, 2011
Inventors: Peter Dunker, Uwe Kuehhirt, Andreas Haupt, Christian Dittmar, Holger Grossman
-
Publication number: 20110153754
Abstract: In various embodiments, a method for receiving alerts through a network includes providing a device having a pop-up management module and a display; providing a communications interface between the device and one or more database systems located outside the network; providing a user interface configured to allow the user to selectively choose to display, on the display, one or more message types generated by the one or more database systems, wherein said one or more message types are received by said pop-up management module via the network and displayed on the display as a pop-up message. A related system includes a device registered in the network having a processor, a memory device, a transceiver, a user interface, and a display, wherein the processor is configured to control a pop-up management module for displaying one or more message types as a pop-up message. The device may be a WiMAX-enabled device and the network may be a WiMAX network.
Type: Application
Filed: December 22, 2009
Publication date: June 23, 2011
Applicant: CLEAR WIRELESS, LLC
Inventor: Don GUNASEKARA
-
Publication number: 20110153314
Abstract: A method for dynamically adjusting the spectral content of an audio signal, which increases the harmonic content of said audio signal, said method comprising translating an encoded digital signal into data bands, creating a psychoacoustic model to identify sections of said data bands that are deficient in harmonic quality, analyzing the fundamental frequency and amplitude of said harmonically deficient data bands, creating additional higher order harmonics for said harmonically deficient data bands, adding said higher order harmonics back to said encoded digital signal to form a newly enhanced signal, inverse filtering said newly enhanced signal, and converting said inverse filtered signal to an analog waveform for consumption by the listener.
Type: Application
Filed: February 28, 2011
Publication date: June 23, 2011
Inventors: J. Craig Oxford, Patrick Taylor, D. Michael Shields
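The harmonic-creation step in the abstract above can be sketched as summing sinusoids at integer multiples of the estimated fundamental and adding them back to the signal. This is only an illustration: the 1/k amplitude rolloff and the choice of harmonic orders are assumptions for the example, not details from the patent.

```python
import math

def add_harmonics(signal, f0, amplitude, sample_rate, orders=(2, 3)):
    """Add higher-order harmonics of fundamental `f0` (Hz) to a
    harmonically deficient band, sample by sample."""
    enhanced = list(signal)
    for n, s in enumerate(signal):
        t = n / sample_rate
        # synthesize the chosen higher-order partials with 1/k rolloff
        boost = sum((amplitude / k) * math.sin(2 * math.pi * k * f0 * t)
                    for k in orders)
        enhanced[n] = s + boost
    return enhanced
```

At t = 0 all sinusoids are zero, so the first sample is unchanged; later samples carry the added partials.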
-
Publication number: 20110144997
Abstract: A voice synthesis model generation device, a voice synthesis model generation system, a communication terminal device, and a method for generating a voice synthesis model, all of which are capable of preferably acquiring a user's voice. A voice synthesis model generation system is configured to include a mobile communication terminal device and a voice synthesis model generation device. The mobile communication terminal device includes a characteristic amount extraction portion that extracts a characteristic amount of input voice, and a text data acquisition portion that acquires text data from the voice.
Type: Application
Filed: July 7, 2009
Publication date: June 16, 2011
Applicant: NTT DOCOMO, INC.
Inventor: Noriko Mizuguchi
-
Publication number: 20110137655
Abstract: A speech synthesis system includes a server device and a client device. The server device stores speech element information and speech element identification information in association with each other so that, in a case that speech element information representing respective speech elements included in speech uttered by a speech registering user are arranged in the order of arrangement of the speech elements in the speech, at least one of the speech element identification information identifying the respective speech element information has different information from information arranged in accordance with a predetermined rule. The client device transmits speech element identification information to the server device based on accepted text information. The client device executes a speech synthesis process based on the speech element information received from the server device.
Type: Application
Filed: June 22, 2009
Publication date: June 9, 2011
Inventors: Reishi Kondo, Masanori Kato, Yasuyuki Mitsui
-
Publication number: 20110131516
Abstract: Provided are a content display device and a content display method each capable of reliably providing, even if a plurality of content items displayed on a single screen are to be read aloud consecutively, a user with voice reading each article, a program therefor, and a storage medium storing the program. A television (100) is a content display device capable of displaying a plurality of content items in a single screen and sequentially reading aloud by voice text strings relating to the respective content items. The television (100) includes a setting section (114) for setting the screen to have a display condition displaying a content item, among the content items, which has a text string relating to the content item and being currently read aloud in order to notify a user of the content item in such a manner that the content item is distinguishable from the other content item(s).
Type: Application
Filed: July 16, 2009
Publication date: June 2, 2011
Applicant: SHARP KABUSHIKI KAISHA
Inventors: Hirofumi Furukawa, Kiyotaka Kashito
-
Publication number: 20110111805
Abstract: A communication device establishes an audio connection with a far-end user via a communication network. The communication device receives text input from a near-end user, and converts the text input into speech signals. The speech signals are transmitted to the far-end user using the established audio connection while muting audio input to its microphone. Other embodiments are also described and claimed.
Type: Application
Filed: November 6, 2009
Publication date: May 12, 2011
Applicant: Apple Inc.
Inventors: Baptiste P. Paquier, Aram M. Lindahl, Phillip G. Tamchina
-
Publication number: 20110112836
Abstract: Electronic device and method for obtaining a digital speech signal and a control command relating to the digital speech signal while obtaining the digital speech signal, and for temporally associating the control command with a substantially corresponding time instant in the digital speech signal to which the control command was directed, wherein the control command determines one or more punctuation marks or other, optionally symbolic, elements to be at least logically positioned at a text location corresponding to the communication instant relative to the digital speech signal so as to cultivate the speech to text conversion procedure.
Type: Application
Filed: July 3, 2008
Publication date: May 12, 2011
Applicant: MOBITER DICTA OY
Inventors: Risto Kurki-Suonio, Andrew Cotton
-
Publication number: 20110106538
Abstract: This speech synthesis system includes a server device and a client device. The client device accepts text information representing text, and transmits a speech element request to the server device. The server device stores speech element information. The server device receives the speech element request transmitted by the client device and, in response to the received speech element request, transmits speech element information to the client device so that the speech element information is received by the client device in a different order from an order of arrangement of speech elements in speech corresponding to the text. The client device executes a speech synthesis process by rearranging the speech element information so that speech elements represented by the received speech element information are arranged in the same order as the order of arrangement of the speech elements in the speech corresponding to the text.
Type: Application
Filed: June 22, 2009
Publication date: May 5, 2011
Inventors: Reishi Kondo, Masanori Kato, Yasuyuki Mitsui
-
Publication number: 20110106537
Abstract: Embodiments of the invention address the deficiencies of the prior art by providing a method, apparatus, and program product for converting components of a web page to voice prompts for a user. In some embodiments, the method comprises selectively determining at least one HTML component from a plurality of HTML components of a web page to transform into a voice prompt for a mobile system based upon a voice attribute file associated with the web page. The method further comprises transforming the at least one HTML component into parameterized data suitable for use by the mobile system based upon at least a portion of the voice attribute file associated with the at least one HTML component and transmitting the parameterized data to the mobile system.
Type: Application
Filed: October 30, 2009
Publication date: May 5, 2011
Inventors: Paul M. Funyak, Norman J. Connors, Paul E. Kolonay, Matthew Aaron Nichols
-
Publication number: 20110099014
Abstract: Systems and methods are described for performing packet loss concealment (PLC) to mitigate the effect of one or more lost frames within a series of frames that represent a speech signal. In accordance with the exemplary systems and methods, PLC is performed by searching a codebook of speech-related parameter profiles to identify content that is being spoken and by selecting a profile associated with the identified content for use in predicting or estimating speech-related parameter information associated with one or more lost frames of a speech signal. The predicted/estimated speech-related parameter information is then used to synthesize one or more frames to replace the lost frame(s) of the speech signal.
Type: Application
Filed: September 21, 2010
Publication date: April 28, 2011
Applicant: BROADCOM CORPORATION
Inventor: Robert W. Zopf
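The codebook search in the abstract above can be sketched as a nearest-neighbor lookup over stored parameter profiles, followed by reading the profile's next value as the estimate for the lost frame. The squared-error distance, the profile layout, and the function names are assumptions for illustration; the patent's actual parameters and matching criterion may differ.

```python
def nearest_profile(history, codebook):
    """Return the codebook profile whose leading parameters best match
    the recent parameter history (squared-error distance, illustrative)."""
    def dist(profile):
        return sum((h - p) ** 2 for h, p in zip(history, profile))
    return min(codebook, key=dist)

def conceal_lost_frame(history, codebook):
    """Estimate the lost frame's parameter from the best-matching profile."""
    profile = nearest_profile(history, codebook)
    # predict the lost frame as the profile's continuation of the history
    return profile[len(history)]
```

Given `codebook = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]` and a received history `[1.0, 2.0]`, the first profile matches and its continuation, 3.0, is used for the missing frame.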
-
Publication number: 20110083075
Abstract: An emotive advisory system for use by one or more occupants of an automotive vehicle includes a directional speaker array, and a computer. The computer is configured to determine an audio direction, and output data representing an avatar for visual display. The computer is further configured to output data representing a spoken statement for the avatar for audio play from the speaker array such that the audio from the speaker array is directed in the determined audio direction. A visual appearance of the avatar and the spoken statement for the avatar convey a simulated emotional state.
Type: Application
Filed: October 2, 2009
Publication date: April 7, 2011
Applicant: FORD GLOBAL TECHNOLOGIES, LLC
Inventors: Perry Robinson MacNeille, Oleg Yurievitch Gusikhin, Kacie Alane Theisen
-
Patent number: 7921014
Abstract: A system for generating high-quality synthesized text-to-speech includes a learning data generating unit, a frequency data generating unit, and a setting unit. The learning data generating unit recognizes inputted speech, and then generates first learning data in which wordings of phrases are associated with readings thereof. The frequency data generating unit generates, based on the first learning data, frequency data indicating appearance frequencies of both wordings and readings of phrases. The setting unit sets the thus generated frequency data for a language processing unit in order to approximate outputted speech of text-to-speech to the inputted speech. Furthermore, the language processing unit generates, from a wording of text, a reading corresponding to the wording, on the basis of the appearance frequencies.
Type: Grant
Filed: July 9, 2007
Date of Patent: April 5, 2011
Assignee: Nuance Communications, Inc.
Inventors: Gakuto Kurata, Toru Nagano, Masafumi Nishimura, Ryuki Tachibana
-
Publication number: 20110077945
Abstract: This invention relates to a method, a computer program product, apparatuses and a system for extracting a coded parameter set from an encoded audio/speech stream, said audio/speech stream being distributed over a sequence of packets, and generating a time scaled encoded audio/speech stream in the parameter coded domain using said extracted coded parameter set.
Type: Application
Filed: June 6, 2007
Publication date: March 31, 2011
Applicant: NOKIA CORPORATION
Inventors: Pasi Sakari Ojala, Ari Kalevi Lakaniemi
-
Publication number: 20110077048
Abstract: The invention relates to a system for data correlation, having: a receiving device (1) having an image acquisition element (10) and a data set generator (12) for generating at least one object data set from at least one acquired first image, which represents a physical object, and an identification label, which uniquely determines an object-related acquisition procedure, and at least one information data set from at least one acquired second image, which represents coded information related to the physical object, and the identification label; a correlation device (2) for the extraction (20) of the coded information from the information data set, for the semantic analysis (22) of the extracted information, and for the generation of at least one combination data set from the results of the semantic analysis, the extracted information, and the at least one object data set with the same identification label as the extracted information data set; and a user device (3) for the storage and further use of the combination data.
Type: Application
Filed: March 3, 2009
Publication date: March 31, 2011
Applicant: Linguatec Sprachtechnologien GmbH
Inventor: Reinhard Busch
-
Publication number: 20110060590
Abstract: A synthetic speech text-input device is provided that allows a user to intuitively know an amount of an input text that can be fit in a desired duration. A synthetic speech text-input device 1 includes: an input unit that receives a set duration in which a speech to be synthesized is to be fit, and a text for a synthetic speech; a text amount calculation unit that calculates an acceptable text amount based on the set duration received by the input unit, the acceptable text amount being an amount of a text acceptable as a synthetic speech of the set duration; and a text amount output unit that outputs the acceptable text amount calculated by the text amount calculation unit, when the input unit receives the text.
Type: Application
Filed: September 10, 2010
Publication date: March 10, 2011
Applicant: FUJITSU LIMITED
Inventors: Nobuyuki Katae, Kentaro Murase
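The acceptable-text-amount calculation described above reduces, in its simplest form, to multiplying the set duration by a speaking rate. The sketch below assumes a fixed characters-per-second constant purely for illustration; a real device would derive the rate from the synthesizer's actual speaking speed and the language involved.

```python
def acceptable_text_amount(duration_seconds, chars_per_second=15.0):
    """Estimate how many characters of text fit in the set duration
    (chars_per_second is an assumed, illustrative rate)."""
    return int(duration_seconds * chars_per_second)

def fits(text, duration_seconds, chars_per_second=15.0):
    """Report whether the input text is within the acceptable amount."""
    return len(text) <= acceptable_text_amount(duration_seconds, chars_per_second)
```

For a 10-second slot at 15 characters per second, the device would report an acceptable amount of 150 characters and flag longer inputs as over the limit.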
-
Publication number: 20110050594
Abstract: A user interface for a touch-screen display of a dedicated handheld electronic book reader device is described. The user interface detects human gestures manifest as pressure being applied by a finger or stylus to regions on the touch-screen display. In one implementation, the touch-screen user interface enables a user to turn one or more pages in response to applying a force or pressure to the touch-screen display. In another implementation, the touch-screen user interface is configured to bookmark a page temporarily by applying a pressure to the display, then allowing a user to turn pages to a new page, but reverting back to a previously-displayed page when the pressure is removed. In another implementation, the touch-screen user interface identifies and filters electronic books based on book size and/or a time available to read a book. In another implementation, the touch-screen user interface converts text to speech in response to a user touching the touch-screen display.
Type: Application
Filed: September 2, 2009
Publication date: March 3, 2011
Inventors: John T. Kim, Christopher Green, Joseph J. Hebenstreit, Kevin E. Keller
-
Publication number: 20110054880Abstract: Techniques and systems for content transformation between devices are disclosed. In one aspect, a system includes a host device that sends content to client devices, and client devices that receive content from the host device in one format and transform the content into a different format. The client devices present the transformed content to users. In another aspect, the host device presents content in a native format, determines that a client device requires the content to be in a different format, converts the content to a reference format, and sends the converted content to the client device.Type: ApplicationFiled: September 2, 2009Publication date: March 3, 2011Inventor: Christopher B. Fleizach
-
Publication number: 20110046957Abstract: Techniques are disclosed for frequency splicing in which speech segments used in the creation of a final speech waveform are constructed, at least in part, by combining (e.g., summing) a small number (e.g., two) of component speech segments that overlap substantially, or entirely, in time but have spectral energy that occupies disjoint, or substantially disjoint, frequency ranges. The component speech segments may be derived from speech segments produced by different speakers or from different speech segments produced by the same speaker. Depending on the embodiment, frequency splicing may supplement rule-based, concatenative, hybrid, or limited-vocabulary speech synthesis systems to provide various advantages.Type: ApplicationFiled: August 24, 2010Publication date: February 24, 2011Applicant: NovaSpeech, LLCInventors: Susan R. Hertz, Harold G. Mills
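A toy illustration of the frequency-splicing idea in the abstract above: two component segments that overlap in time but occupy disjoint frequency ranges are combined by summing. The sine-tone "segments", sample rate, and band choices are assumptions for illustration only:

```python
import math

# Hedged toy sketch of frequency splicing: sum two time-aligned component
# segments whose spectral energy occupies disjoint frequency ranges.
# Real systems would use recorded speech segments, not pure tones.

SAMPLE_RATE = 8000

def tone(freq_hz: float, n_samples: int):
    """A band-limited stand-in for a component speech segment."""
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(n_samples)]

def frequency_splice(low_band_segment, high_band_segment):
    """Combine two overlapping segments by sample-wise summation."""
    assert len(low_band_segment) == len(high_band_segment)
    return [a + b for a, b in zip(low_band_segment, high_band_segment)]

low = tone(200.0, 160)    # e.g. low-frequency energy from one source
high = tone(3000.0, 160)  # e.g. high-frequency energy from another source
spliced = frequency_splice(low, high)
```

Because the two components are (substantially) disjoint in frequency, each can come from a different speaker or a different utterance without the sum sounding like two overlaid voices.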
-
Publication number: 20110046943Abstract: A data processing method and apparatus that may set emotion based on development of a story are provided. The method and apparatus may set emotion without inputting emotion for each sentence of text data. Emotion setting information is generated based on development of the story and the like, and may be applied to the text data.Type: ApplicationFiled: April 5, 2010Publication date: February 24, 2011Applicant: SAMSUNG ELECTRONICS CO., LTD.Inventors: Dong Yeol Lee, Seung Seop Park, Jae Hyun Ahn
-
Publication number: 20110040554Abstract: A procedure to automatically evaluate the spoken fluency of a speaker by prompting the speaker to talk on a given topic, recording the speaker's speech to get a recorded sample of speech, and then analyzing the patterns of disfluencies in the speech to compute a numerical score to quantify the spoken fluency skills of the speaker. The numerical fluency score accounts for various prosodic and lexical features, including formant-based filled-pause detection, closely-occurring exact and inexact repeat N-grams, and the normalized average distance between consecutive occurrences of N-grams. The lexical features and prosodic features are combined to classify the speaker with a C-class classification and develop a rating for the speaker.Type: ApplicationFiled: August 15, 2009Publication date: February 17, 2011Applicant: International Business Machines CorporationInventors: Kartik Audhkhasi, Om D. Deshmukh, Kundan Kandhway, Ashish Verma
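A hedged sketch of combining disfluency features into a single score and a C-class rating, as the abstract describes. The feature names, weights, and score range are illustrative assumptions, not the patent's values:

```python
# Hedged sketch: combine prosodic and lexical disfluency features into a
# fluency score, then bucket the score into one of C classes.
# All weights below are made-up placeholders, not the patent's parameters.

def fluency_score(filled_pause_rate: float,
                  repeat_ngram_rate: float,
                  avg_ngram_distance: float) -> float:
    """Higher disfluency rates lower the score; a larger average distance
    between repeated N-grams raises it. Clamped to [0, 100]."""
    score = (100.0
             - 400.0 * filled_pause_rate   # formant-based filled pauses
             - 300.0 * repeat_ngram_rate   # closely-occurring repeat N-grams
             + 2.0 * avg_ngram_distance)   # spread-out repeats penalized less
    return max(0.0, min(100.0, score))

def c_class_rating(score: float, c: int = 4) -> int:
    """Map a 0-100 score onto one of `c` fluency classes (0 = least fluent)."""
    return min(c - 1, int(score / (100.0 / c)))
```

The patent's classifier would be trained rather than hand-weighted; this only shows the shape of a feature-combination-plus-C-class pipeline.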
-
Publication number: 20110035222Abstract: Systems and methods for selecting one of several audio clips associated with a text item for playback are provided. The electronic device can determine which audio clip to play back at any point in time using different approaches, including for example receiving a user selection or randomly selecting audio clips. In some embodiments, the electronic device can intelligently select audio clips based on attributes of the media item, the electronic device operations, or the environment of the electronic device. The attributes can include, for example, metadata values of the media item, the type of ongoing operations of the electronic device, and environmental characteristics that can be measured or detected using sensors of or coupled to the electronic device. Different audio clips can be associated with particular attribute values, such that an audio clip corresponding to the detected or received attribute values is played back.Type: ApplicationFiled: August 4, 2009Publication date: February 10, 2011Applicant: Apple Inc.Inventor: Jon Schiller
-
Publication number: 20110035223Abstract: Systems and methods for retrieving and playing back audio clips for streamed or remotely received media items are provided. An electronic device can provide audio clips identifying media items at any suitable time, including for example to identify media items that are currently played back or available for playback. When the media items played back are not locally stored, the electronic device may not have a corresponding audio clip locally stored. In such cases, the electronic device can identify a streamed media item, and retrieve an audio clip corresponding to text items associated with the media item. For example, the electronic device can retrieve audio clips corresponding to the artist, title and album of the received media item. The electronic device can retrieve audio clips from any suitable source, such as a dedicated audio clip server or other remote source, a remote text-to-speech engine, or a locally stored text-to-speech engine.Type: ApplicationFiled: August 4, 2009Publication date: February 10, 2011Applicant: Apple Inc.Inventor: Jon Schiller
-
Publication number: 20110018889Abstract: A media processing comparison system (“MPCS”) and techniques facilitate concurrent, subjective quality comparisons between media presentations produced by different instances of media processing components performing the same functions (for example, instances of media processing components in the form of hardware, software, and/or firmware, such as parsers, codecs, decryptors, and/or demultiplexers, supplied by the same or different entities) in a particular media content player. The MPCS receives an ordered stream of encoded media samples from a media source, and decodes a particular encoded media sample using two or more different instances of media processing components. A single renderer renders and/or coordinates the synchronous presentation of decoded media samples from each instance of media processing component(s) as separate media presentations. The media presentations may be subjectively compared and/or selected for storage by a user in a sample-by-sample manner.Type: ApplicationFiled: July 23, 2009Publication date: January 27, 2011Applicant: MICROSOFT CORPORATIONInventors: Firoz Dalal, Shyam Sadhwani
-
Publication number: 20110015930Abstract: A unified communication system is disclosed that allows a variety of end point types to participate in a communication event using a common, unified communication system. In some implementations, a calling party interacts with a client application residing on an endpoint to make a communication request to another endpoint. A communication event manager residing in the unified communication system selects a script from a repository of scripts based on the communication event and the capabilities of the endpoints. A communication event execution engine receives a user profile associated with at least one of the endpoints. The user profile can be configured by the user to describe the user's preferences for how the communication should be processed by the unified communication system.Type: ApplicationFiled: September 7, 2010Publication date: January 20, 2011Applicant: INTELEPEER, INC.Inventors: John Ward, Haydar Haba, Charles Studt, Peter Antypas, Jonathan Green
-
Publication number: 20110015929Abstract: A contextual input device includes a plurality of tactually discernable keys disposed in a predetermined configuration which replicates a particular relationship among a plurality of items associated with a known physical object. The tactually discernable keys are typically labeled with Braille type. The known physical object is typically a collection of related items grouped together by some common relationship. A computer-implemented process determines whether an input signal represents a selection of an item from among a plurality of items or an attribute pertaining to an item among the plurality of items. Once the selected item or attribute pertaining to an item is determined, the computer-implemented process transforms a user's selection from the input signal into an analog audio signal which is then audibly output as human speech with an electro-acoustic transducer.Type: ApplicationFiled: July 17, 2009Publication date: January 20, 2011Applicant: Calpoly CorporationInventors: Dennis Fantin, C. Arthur MacCarley
-
Publication number: 20110010179Abstract: A method and an apparatus for voice synthesis and processing have been presented. In one exemplary method, a first audio recording of a human speech in a natural language is received. A speech analysis-synthesis algorithm is then applied to the first audio recording to synthesize a second audio recording from the first audio recording such that the second audio recording sounds humanistic and consistent, but unintelligible.Type: ApplicationFiled: July 13, 2009Publication date: January 13, 2011Inventor: Devang K. Naik
-
Publication number: 20100332232Abstract: A method and device for updating statuses of synthesis filters are provided. The method includes: exciting a synthesis filter corresponding to a first encoding rate by using an excitation signal of the first encoding rate, outputting reconstructed signal information, and updating status information of the synthesis filter and a synthesis filter corresponding to a second encoding rate. In the present disclosure, the status of the synthesis filter corresponding to the current rate and the statuses of the synthesis filters at other rates are updated. Thus, synchronization between the statuses of the synthesis filters corresponding to different rates at the encoding terminal may be realized, thereby facilitating the consistency of the reconstructed signals of the encoding and decoding terminals when the encoding rate is switched, and improving the quality of the reconstructed signal of the decoding terminal.Type: ApplicationFiled: September 16, 2010Publication date: December 30, 2010Inventor: Jinliang DAI
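A minimal sketch of the filter-status synchronization described in the abstract above: the filter for the active encoding rate is excited, and its resulting state is copied into the filter for the other rate so a later rate switch starts from a consistent state. The one-pole filter and all names are illustrative assumptions, not the patent's filter structure:

```python
# Hedged sketch: keep two synthesis filters' states in sync across rates.
# A real codec uses LPC synthesis filters; a one-pole filter stands in here.

class OnePoleSynthesisFilter:
    def __init__(self, coeff: float):
        self.coeff = coeff
        self.state = 0.0  # filter status (memory)

    def process(self, excitation):
        """Filter the excitation, updating the internal state as we go."""
        out = []
        for x in excitation:
            self.state = x + self.coeff * self.state
            out.append(self.state)
        return out

def excite_and_sync(active, inactive, excitation):
    """Excite the filter for the current rate, output the reconstructed
    signal, and update the other rate's filter status to match."""
    reconstructed = active.process(excitation)
    inactive.state = active.state  # synchronize the idle filter's status
    return reconstructed
```

Keeping the idle filter's memory synchronized is what lets the decoder produce a consistent reconstructed signal at the moment the encoding rate switches.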
-
Publication number: 20100330975Abstract: The invention provides an internet radio interface for use in vehicles. The interface allows a device unit, with wireless capability and voice interface technology, to communicate with a vehicle, mobile phone, and portal in order to manage and upload various user preferences to the device unit as set out by the user prior to getting into the vehicle. The device unit interacts with the user to permit various functions and access preferred channels as well as managing secondary functions of the user, including cell phone communications.Type: ApplicationFiled: June 28, 2010Publication date: December 30, 2010Inventor: Otman A. Basir
-
Publication number: 20100324907Abstract: The invention proposes the synthesis of a signal consisting of consecutive blocks. It proposes more particularly, on receipt of such a signal, to replace, by synthesis, lost or erroneous blocks of this signal. To this end, it proposes an attenuation of the overvoicing during the generation of a signal synthesis. More particularly, a voiced excitation is generated on the basis of the pitch period (T) estimated or transmitted at the previous block, by optionally applying a correction of plus or minus a sample of the duration of this period (counted in terms of number of samples), by constituting groups (A′,B′,C′,D′) of at least two samples and inverting positions of samples in the groups, randomly (B′,C′) or in a forced manner. An over-harmonicity in the excitation generated is thus broken and the effect of overvoicing in the synthesis of the generated signal is thereby attenuated.Type: ApplicationFiled: October 17, 2007Publication date: December 23, 2010Applicant: France TelecomInventors: David Virette, Balazs Kovesi
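A toy sketch of the over-harmonicity attenuation described above: a voiced excitation is built by repeating the previous block's pitch period, then sample positions are inverted within small groups, randomly, to break the exact periodicity. The group size, repetition scheme, and seeding are assumptions for illustration:

```python
import random

# Hedged sketch: generate a voiced excitation from the previous block's pitch
# period, then break its over-harmonicity by randomly reversing the order of
# samples inside consecutive small groups.

def generate_excitation(prev_period, n_samples):
    """Repeat the previous block's pitch-period samples to fill n_samples."""
    return [prev_period[i % len(prev_period)] for i in range(n_samples)]

def break_harmonicity(excitation, group_size=2, rng=None):
    """Randomly invert sample positions inside consecutive groups, leaving
    the set of sample values (hence the energy) unchanged."""
    rng = rng or random.Random(0)  # seeded here only for reproducibility
    out = list(excitation)
    for start in range(0, len(out) - group_size + 1, group_size):
        if rng.random() < 0.5:
            out[start:start + group_size] = reversed(
                out[start:start + group_size])
    return out
```

Because samples are only reordered locally, the excitation keeps its overall envelope while the exact period-to-period repetition (the source of the buzzy overvoiced quality) is disturbed.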
-
Publication number: 20100324902Abstract: Disclosed are techniques and systems to provide a narration of a text in multiple different voices. In some aspects, systems and methods described herein can include receiving a user-based selection of a first portion of words in a document where the document has a pre-associated first voice model and overwriting the association of the first voice model, by the one or more computers, with a second voice model for the first portion of words.Type: ApplicationFiled: January 14, 2010Publication date: December 23, 2010Inventors: Raymond Kurzweil, Paul Albrecht, Peter Chapman