Details Of Speech Synthesis Systems, E.g., Synthesizer Architecture, Memory Management, Etc. (epo) Patents (Class 704/E13.005)

E Subclasses

Architecture of speech synthesizers (epo) (Class 704/E13.006)

Excitation (epo) (Class 704/E13.007)

Systems using speech synthesizers (epo) (Class 704/E13.008)

Electronic Apparatus

Publication number: 20120197645

Abstract: An electronic apparatus includes a communication module, a storage module, a manipulation module, a voice output control module, and a control module. The communication module receives book data delivered externally. The storage module stores the received book data. The manipulation module converts a manipulation of a user into an electrical signal. The voice output control module reproduces, as a voice, the book data based on the manipulation while controlling the reproduction speed of the voice. The control module determines a part that is important to the user, stores, in the storage module, a position of voice reproduction of the book data, and synchronizes the position of the voice reproduction with a reproduction position in the book data.

Type: Application

Filed: September 22, 2011

Publication date: August 2, 2012

Inventor: Midori Nakamae
Method and System for Construction and Rendering of Annotations Associated with an Electronic Image

Publication number: 20120166175

Abstract: A method and system for construction and rendering of annotations associated with an electronic image is disclosed. The system comprises a first data repository for storing the electronic image, which has a plurality of pixels, with one or more pixels annotated at a plurality of levels, which contain descriptive characteristics of the pixel, in ascending magnitude, such that the descriptive characteristics at a subsequent level are with reference to descriptive characteristics of one or more pixels surrounding the pixel. The system comprises a second data repository for storing the annotations. An image display module is configured to display the electronic image. A pixel and level identification module is configured to receive pixel and level selection details from a user interface. An annotation retrieval module is configured to retrieve annotations corresponding to the pixel and level selection from the second repository and to render the retrieved annotations for the electronic image.

Type: Application

Filed: December 21, 2011

Publication date: June 28, 2012

Applicant: Tata Consultancy Services Ltd.

Inventor: Sunil Kumar Kopparapu
SPEAKER-ADAPTIVE SYNTHESIZED VOICE

Publication number: 20120059654

Abstract: An objective is to provide a technique for accurately reproducing features of a fundamental frequency of a target-speaker's voice on the basis of only a small amount of learning data. A learning apparatus learns shift amounts from a reference source F0 pattern to a target F0 pattern of a target-speaker's voice. The learning apparatus associates a source F0 pattern of a learning text with a target F0 pattern of the same learning text by associating their peaks and troughs. For each point on the target F0 pattern, the learning apparatus obtains shift amounts in a time-axis direction and in a frequency-axis direction from a corresponding point on the source F0 pattern with reference to a result of the association, and learns a decision tree using, as an input feature vector, linguistic information obtained by parsing the learning text, and using, as an output feature vector, the calculated shift amounts.

Type: Application

Filed: March 16, 2010

Publication date: March 8, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Masafumi Nishimura, Ryuki Tachibana
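The shift amounts described in this abstract can be illustrated with a minimal sketch. It assumes the peaks and troughs of the source and target F0 patterns have already been paired; the function name `shift_amounts`, the `(time, F0)` tuple layout, and the use of plain Hz differences are all hypothetical, and the decision-tree learning step is left out.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (time in seconds, F0 in Hz); this layout is an assumption


def shift_amounts(source: List[Point], target: List[Point]) -> List[Tuple[float, float]]:
    """Return (time shift, frequency shift) from each source point to its paired target point."""
    if len(source) != len(target):
        raise ValueError("source and target must contain the same number of paired points")
    return [(tt - st, tf - sf) for (st, sf), (tt, tf) in zip(source, target)]


# Toy data: one peak and one trough per pattern, already paired by index
source = [(0.10, 120.0), (0.25, 100.0)]
target = [(0.12, 180.0), (0.30, 150.0)]
shifts = shift_amounts(source, target)
```

In the scheme the abstract describes, pairs like these would serve as the output feature vectors of the decision tree, with linguistic features of the learning text as inputs.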
SPEECH SYSTEM USED FOR ROBOT AND ROBOT WITH SPEECH SYSTEM

Publication number: 20120046788

Abstract: A speech system (2) used for a robot (1) and a robot (1) with the speech system (2) are provided. The speech system (2) includes an audio file storage unit (200) and a speech control unit (300). The audio file storage unit (200) stores audio files obtained from an audio file preparation unit (100), which is located outside the robot (1). The data stored in the audio file storage unit (200) can be prepared, modified or replaced according to the requirement of a user. According to the received robot state information, the speech control unit (300) converts the audio data of the audio file, which corresponds to the state information and is stored in the audio file storage unit (200), to a corresponding analog signal, and then plays the analog signal.

Type: Application

Filed: January 22, 2010

Publication date: February 23, 2012

Applicant: TEK ELECTRICAL (SUZHOU) CO., LTD.

Inventor: Dongqi Qian
SYSTEM AND METHOD FOR AUDIBLE TEXT CENTER SUBSYSTEM

Publication number: 20110144989

Abstract: Disclosed herein are systems, methods, and computer-readable storage media for sending a spoken message as a text message. The method includes initiating a connection with a first subscriber, receiving from the first subscriber a spoken message and spoken information associated with at least one recipient address. The method further includes converting the spoken message to text via an audible text center subsystem (ATCS), and delivering the text to the recipient address. The method can also include verifying a subscription status of the first subscriber, or delivering the text to the recipient address based on retrieved preferences of the first subscriber. The preferences can be retrieved from a consolidated network repository or embedded within the spoken message. Text and the spoken message can be delivered to the same or different recipient addresses. The method can include updating recipient addresses based on a received oral command from the first subscriber.

Type: Application

Filed: December 15, 2009

Publication date: June 16, 2011

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Sangar DOWLATKHAH
MULTI CHANNEL, AUTOMATED COMMUNICATION AND RESOURCE SYNCHRONIZATION

Publication number: 20110047221

Abstract: Methods and systems that allow multiple channels of communication between multiple users via a platform that automatically integrates and synchronizes the resources of each user during the communication are described. The systems comprise a platform capable of handling multiple types of communications with multiple users and systems. The platform contains a browser, one or more servers for handling communications between the platform and user devices that are external to the platform, a speech engine for converting text to speech and vice versa, a chat server, an email server, a text server, a data warehouse, a scheduler, a workflow/rules engine, a reports server, and integration APIs that can be integrated with 3rd party systems and allow those systems to be integrated with the platform. The platform is linked to multiple users (and their devices or systems) through a communications network.

Type: Application

Filed: August 24, 2009

Publication date: February 24, 2011

Inventors: Timothy Watanabe, Kenneth Poray, Craig So, Ryan Menda
SYSTEMS AND METHODS FOR SELECTION AND USE OF MULTIPLE CHARACTERS FOR DOCUMENT NARRATION

Publication number: 20100318364

Abstract: Disclosed are techniques and systems to provide a narration of a text in multiple different voices. Further disclosed are techniques and systems for generating an audible output in which different portions of a text are narrated using voice models associated with different characters.

Type: Application

Filed: January 14, 2010

Publication date: December 16, 2010

Inventors: Raymond C. Kurzweil, Paul Albrecht, Peter Chapman
INTERACTIVE TTS OPTIMIZATION TOOL

Publication number: 20100312565

Abstract: An interactive prompt generation and TTS optimization tool with a user-friendly graphical user interface is provided. The tool accepts HTS abstraction or speech recognition processed input from a user to generate an enhanced initial waveform for synthesis. Acoustic features of the waveform are presented to the user with graphical visualizations enabling the user to modify various parameters of the speech synthesis process and listen to modified versions until an acceptable end product is reached.

Type: Application

Filed: June 9, 2009

Publication date: December 9, 2010

Applicant: Microsoft Corporation

Inventors: Jian-Chao Wang, Lu-Jun Yuan, Sheng Zhao, Fileno A. Alleva, Jingyang Xu, Chiwei Che
FREQUENCY AXIS WARPING FACTOR ESTIMATION APPARATUS, SYSTEM, METHOD AND PROGRAM

Publication number: 20100204985

Abstract: A warping factor estimation system comprises a label information generation unit that outputs voice/non-voice label information, a warp model storage unit in which a probability model representing voice and non-voice occurrence probabilities is stored, and a warp estimation unit that calculates a warping factor in the frequency axis direction using the probability model representing voice and non-voice occurrence probabilities, voice and non-voice labels, and a cepstrum.

Type: Application

Filed: September 22, 2008

Publication date: August 12, 2010

Inventor: Tadashi Emori
Electromagnetic/acoustic underwater communications system

Publication number: 20100134319

Abstract: An underwater communications system is provided that transmits electromagnetic and/or magnetic signals to a remote receiver. The transmitter includes a data input. A digital data compressor compresses data to be transmitted. A modulator modulates compressed data onto a carrier signal. An electrically insulated, magnetic coupled antenna transmits the compressed, modulated signals. The receiver has an electrically insulated, magnetic coupled antenna for receiving a compressed, modulated signal. A demodulator is provided for demodulating the signal to reveal compressed data. A de-compressor de-compresses the data. An appropriate human interface is provided to present transmitted data in text/audio/visible form. Similarly, the transmit system comprises appropriate audio/visual/text entry mechanisms.

Type: Application

Filed: February 3, 2010

Publication date: June 3, 2010

Inventors: Mark Rhodes, Derek Wolfe, Brendan Hyland
SPEECH SAMPLES LIBRARY FOR TEXT-TO-SPEECH AND METHODS AND APPARATUS FOR GENERATING AND USING SAME

Publication number: 20100131267

Abstract: A method of recording speech for use in a speech samples library. In an exemplary embodiment, the method comprises recording a speaker pronouncing a phoneme with musical parameters characterizing pronunciation of another phoneme by the same or another speaker. For example, in one embodiment the method comprises: providing a recording of a first speaker pronouncing a first phoneme in a phonemic context. The pronunciation is characterized by some musical parameters. A second reader, who may be the same as the first reader, is then recorded pronouncing a second phoneme (different from the first phoneme) with the musical parameters that characterize pronunciation of the first phoneme by the first speaker. The recordings made by the second reader are used for compiling a speech samples library.

Type: Application

Filed: March 19, 2008

Publication date: May 27, 2010

Applicant: Vivo Text Ltd.

Inventors: Gershon Silbert, Andres Hakim
APPARATUS AND METHOD FOR SYNTHESIZING AN OUTPUT SIGNAL

Publication number: 20100094631

Abstract: An apparatus for synthesizing a rendered output signal having a first audio channel and a second audio channel includes a decorrelator stage for generating a decorrelator signal based on a downmix signal, and a combiner for performing a weighted combination of the downmix signal and a decorrelated signal based on parametric audio object information, downmix information and target rendering information. The combiner solves the problem of optimally combining matrixing with decorrelation for a high quality stereo scene reproduction of a number of individual audio objects using a multichannel downmix.

Type: Application

Filed: April 23, 2008

Publication date: April 15, 2010

Inventors: Jonas Engdegard, Heiko Purnhagen, Barbara Resch, Lars Villemoes, Cornelia Falch, Juergen Herre, Johannes Hilpert, Andreas Hoelzer, Leonid Terentiev
VARIABLE TEXT-TO-SPEECH FOR AUTOMOTIVE APPLICATION

Publication number: 20100057465

Abstract: A text-to-speech (TTS) system implemented in an automotive vehicle is dynamically tuned to improve intelligibility over a wide variety of vehicle operating states and environmental conditions. In one embodiment of the present invention, a TTS system is interfaced to one or more vehicle sensors to measure parameters including vehicle speed, interior noise, visibility conditions, and road roughness, among others. In response to measurements of these operating parameters, TTS voice volume, pitch, and speed, among other parameters, may be tuned in order to improve intelligibility of the TTS voice system and increase its effectiveness for the operator of the vehicle.

Type: Application

Filed: September 3, 2008

Publication date: March 4, 2010

Inventors: DAVID MICHAEL KIRSCH, Ritchie Winson Huang
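As a rough illustration of the sensor-driven tuning this abstract describes, the sketch below maps two measurements to TTS volume and speaking rate. Everything in it is assumed for illustration: the `TtsSettings` type, the `tune_tts` function, the choice of inputs, and every threshold and slope. The abstract does not state any particular tuning rule.

```python
from dataclasses import dataclass


@dataclass
class TtsSettings:
    volume: float  # 0.0 to 1.0
    rate: float    # 1.0 is the normal speaking rate


def tune_tts(speed_kmh: float, cabin_noise_db: float) -> TtsSettings:
    """Map vehicle measurements to TTS parameters using hypothetical rules."""
    # Hypothetical rule: raise volume as cabin noise rises above 50 dB, capped at 1.0
    volume = min(1.0, 0.5 + max(0.0, cabin_noise_db - 50.0) * 0.02)
    # Hypothetical rule: slow speech slightly at highway speeds
    rate = 0.9 if speed_kmh > 100.0 else 1.0
    return TtsSettings(volume=volume, rate=rate)


quiet = tune_tts(speed_kmh=30.0, cabin_noise_db=45.0)
loud = tune_tts(speed_kmh=120.0, cabin_noise_db=80.0)
```

A real system would presumably re-run such a mapping whenever new sensor readings arrive, and could add pitch and other parameters in the same way.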
METHODS, APPARATUSES, AND SYSTEMS FOR PROVIDING TIMELY USER CUES PERTAINING TO SPEECH RECOGNITION

Publication number: 20100049525

Abstract: A method is provided of providing cues from an electronic communication device to a user while capturing an utterance. A plurality of cues associated with the user utterance are provided by the device to the user in at least near real-time. For each of a plurality of portions of the utterance, data representative of the respective portion of the user utterance is communicated from the electronic communication device to a remote electronic device. In response to this communication, data, representative of at least one parameter associated with the respective portion of the user utterance, is received at the electronic communication device. The electronic communication device provides one or more cues to the user based on the at least one parameter. At least one of the cues is provided by the electronic communication device to the user prior to completion of the step of capturing the user utterance.

Type: Application

Filed: August 24, 2009

Publication date: February 25, 2010

Applicant: YAP, INC.

Inventor: Scott Edward Paden
SPEECH SYNTHESIZING DEVICE, SPEECH SYNTHESIZING SYSTEM, LANGUAGE PROCESSING DEVICE, SPEECH SYNTHESIZING METHOD AND RECORDING MEDIUM

Publication number: 20090319275

Abstract: A speech synthesizing device includes: a text accepting unit for accepting text data; an extracting unit for extracting a special character including a pictographic character, a face mark or a symbol from text data accepted by the text accepting unit; a dictionary database in which a plurality of special characters and a plurality of phonetic expressions for each special character are registered; a selecting unit for selecting a phonetic expression of an extracted special character from the dictionary database when the extracting unit extracts the special character; a converting unit for converting the text data accepted by the accepting unit to a phonogram in accordance with a phonetic expression selected by the selecting unit in association with the extracted special character; and a speech synthesizing unit for synthesizing a voice from a phonogram obtained by the converting

Type: Application

Filed: August 31, 2009

Publication date: December 24, 2009

Applicant: FUJITSU LIMITED

Inventor: Takuya Noda
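The dictionary lookup at the heart of this abstract can be sketched as follows. The dictionary contents, the `style` key used to choose among several readings, and the function name `to_readable` are hypothetical; the abstract does not say how the selecting unit picks one phonetic expression over another.

```python
from typing import Dict

# Hypothetical dictionary: each special character maps to several readings keyed by style
DICTIONARY: Dict[str, Dict[str, str]] = {
    "♪": {"literal": "musical note", "expressive": "la la la"},
    ":-)": {"literal": "smiley face", "expressive": "ha ha"},
}


def to_readable(text: str, style: str = "literal") -> str:
    """Replace each registered special character with the reading for the chosen style."""
    for special, readings in DICTIONARY.items():
        if special in text:
            text = text.replace(special, readings[style])
    return text


spoken = to_readable("See you soon :-)", style="expressive")
literal = to_readable("♪")
```

The resulting string would then go to the phonogram conversion and synthesis stages, which are outside this sketch.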
SYSTEM AND METHOD FOR AUDIBLY OUTPUTTING TEXT MESSAGES

Publication number: 20090313022

Abstract: A method and system for audibly outputting text messages includes: setting a vocalizing function for audibly outputting text messages, searching a character speech library for each character of a received text message, and acquiring pronunciation data of each character of the received text message. The method and the system further include vocalizing the pronunciation data of each character of the received text message, generating a voice message, and audibly outputting the generated voice message.

Type: Application

Filed: December 23, 2008

Publication date: December 17, 2009

Applicant: CHI MEI COMMUNICATION SYSTEMS, INC.

Inventor: CHI-MING HSIAO
VOICE DATA CREATION SYSTEM, PROGRAM, SEMICONDUCTOR INTEGRATED CIRCUIT DEVICE, AND METHOD FOR PRODUCING SEMICONDUCTOR INTEGRATED CIRCUIT DEVICE

Publication number: 20090281808

Abstract: A voice data creation system includes a dictionary data memory section that stores dictionary data for generating synthesized voice data corresponding to text data; an edition processing section that displays an edition screen for editing a voice guidance message as a sentence including a plurality of phrases to receive edition input information so as to perform an edition processing based on the edition input information; a list information generation processing section that generates list information relating to each sentence and phrases included in each sentence based on a result of the edition processing; a phrase voice data generating section that determines a target phrase for voice data creation based on the list information to generate and maintain voice data corresponding to the target phrase determined for voice data creation based on the dictionary data; and a memory write information generating section that determines a target phrase to be stored in a voice data memory based on the list informat

Type: Application

Filed: April 28, 2009

Publication date: November 12, 2009

Applicant: SEIKO EPSON CORPORATION

Inventors: Jun NAKAMURA, Fumihito BAISHO
PROACTIVE COMPLETION OF INPUT FIELDS FOR AUTOMATED VOICE ENABLEMENT OF A WEB PAGE

Publication number: 20090254347

Abstract: Embodiments of the present invention provide a method and computer program product for the proactive completion of input fields for automated voice enablement of a Web page. In an embodiment of the invention, a method for proactively completing empty input fields for voice enabling a Web page can be provided. The method can include receiving speech input for an input field in a Web page and inserting a textual equivalent to the speech input into the input field in a Web page. The method further can include locating an empty input field remaining in the Web page and generating a speech grammar for the input field based upon permitted terms in a core attribute of the empty input field and prompting for speech input for the input field. Finally, the method can include posting the received speech input and the grammar to an automatic speech recognition (ASR) engine and inserting a textual equivalent to the speech input provided by the ASR engine into the empty input field.

Type: Application

Filed: April 7, 2008

Publication date: October 8, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Victor S. Moore, Wendi L. Nusbickel
PHONETIC PRONUNCIATION TRAINING DEVICE, PHONETIC PRONUNCIATION TRAINING METHOD AND PHONETIC PRONUNCIATION TRAINING PROGRAM

Publication number: 20090239201

Abstract: A phonetic pronunciation training device, phonetic pronunciation training method, and phonetic pronunciation training program are provided wherein pronunciation and sounds in language acquisition can be self-learned and listening skills, spelling skills and vocabulary can be enhanced. The present invention comprises at least a database for storing phonetic pronunciation data associated with phonetic data and phonetic symbol data indicating this phonetic data, a selection function block for receiving an instruction signal from an input means and randomly selecting phonetic pronunciation data, a phonetic pronunciation data reproducing function for reproducing selected phonetic pronunciation data, and a phonetic symbol data correct/error determination function block for comparing phonetic symbol data input by the input means and phonetic symbol data corresponding to the selected phonetic pronunciation data and recording the correct/error result to a memory means.

Type: Application

Filed: July 15, 2005

Publication date: September 24, 2009

Inventor: Richard A Moe
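The select, reproduce, and compare loop in this abstract can be sketched in a few lines. The database contents, the file names, and the function names `pick_symbol` and `check_answer` are hypothetical, audio playback is omitted, and a plain list stands in for the memory means that records results.

```python
import random
from typing import Dict, List

# Hypothetical database: phonetic symbol -> audio file holding its pronunciation
DATABASE: Dict[str, str] = {"æ": "ae.wav", "ʃ": "sh.wav", "θ": "th.wav"}
results: List[bool] = []  # stands in for the memory that records correct/error results


def pick_symbol(rng: random.Random) -> str:
    """Randomly select one entry whose pronunciation would be played to the learner."""
    return rng.choice(sorted(DATABASE))


def check_answer(expected: str, typed: str) -> bool:
    """Compare the learner's typed symbol with the selected one and record the outcome."""
    correct = expected == typed.strip()
    results.append(correct)
    return correct


rng = random.Random(0)
symbol = pick_symbol(rng)
first = check_answer(symbol, symbol)   # learner answers correctly
second = check_answer(symbol, "?")     # learner answers incorrectly
```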
Navigation Device and Method for Receiving and Playing Sound Samples

Publication number: 20090234565

Abstract: A navigation device is disclosed including a processor unit, a memory device, and a speaker. The memory device includes a plurality of sound samples. In at least one embodiment, the navigation device is arranged to play a selection of the sound samples over the speaker to provide navigation instructions. In at least one embodiment, the navigation device further includes an input device for receiving sound samples and is arranged for storing the received sound samples in the memory device for subsequent playback over the speaker for providing navigation instructions.

Type: Application

Filed: February 19, 2007

Publication date: September 17, 2009

Inventor: Pieter Andreas Geelen
METHOD, APPARATUS AND PROGRAM FOR SPEECH SYNTHESIS

Publication number: 20090204405

Abstract: Apparatus and method for generating high quality synthesized speech having smooth waveform concatenation. The apparatus includes a pitch frequency calculation section, a pitch synchronization position calculation section, a unit waveform storage, a unit waveform selection section, a unit waveform generation section, and a waveform synthesis section. The unit waveform generation section includes a conversion ratio calculation section, a sampling rate conversion section, and a unit waveform re-selection section. The conversion ratio calculation section calculates a sampling rate conversion ratio from the pitch information and the position of pitch synchronization, and the sampling rate conversion section converts the sampling rate of the unit waveform, delivered as input, based on the sampling rate conversion ratio.

Type: Application

Filed: September 4, 2006

Publication date: August 13, 2009

Applicant: NEC CORPORATION

Inventors: Masanori Kato, Satoshi Tsukada
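One way to picture the sampling rate conversion step is to resample a unit waveform by a ratio derived from pitch. The sketch below uses linear interpolation and an assumed ratio definition (unit pitch divided by target pitch). The function names are hypothetical, the pitch synchronization position that the abstract also feeds into the ratio is ignored, and the actual conversion method in the application may differ.

```python
from typing import List


def conversion_ratio(unit_pitch_hz: float, target_pitch_hz: float) -> float:
    # Assumed definition: output samples per input sample needed to reach the target pitch
    return unit_pitch_hz / target_pitch_hz


def resample(unit: List[float], ratio: float) -> List[float]:
    """Resample by linear interpolation; the output has round(len(unit) * ratio) samples."""
    if len(unit) < 2:
        raise ValueError("unit waveform needs at least two samples")
    n_out = max(2, round(len(unit) * ratio))
    step = (len(unit) - 1) / (n_out - 1)
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), len(unit) - 2)  # index of the left neighbour
        frac = pos - lo
        out.append(unit[lo] * (1.0 - frac) + unit[lo + 1] * frac)
    return out


unit = [0.0, 1.0, 0.0, -1.0, 0.0]        # one pitch period of a toy waveform
ratio = conversion_ratio(200.0, 100.0)   # lower the pitch by an octave
stretched = resample(unit, ratio)
```

Stretching the unit to twice as many samples lowers its pitch when played back at the original rate, which is the effect the ratio is meant to control.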
METHOD AND APPARATUS FOR CREATING CUSTOMIZED PODCASTS WITH MULTIPLE TEXT-TO-SPEECH VOICES

Publication number: 20090204402

Abstract: Method and apparatus for creating customized podcasts with multiple voices, where text content is converted into audio content, and where the voices are selected based at least in part on words in the text content suggestive of the type of voice. Types of voice include at least male and female, accent, language, and speed.

Type: Application

Filed: January 9, 2009

Publication date: August 13, 2009

Inventors: Harpreet MARWAHA, Brett ROBINSON
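Selecting a voice from words in the text can be sketched with simple keyword counting. The cue-word lists, the `neutral` fallback, and the function name `pick_voice` are hypothetical; the abstract only says that the choice depends at least in part on words suggestive of the voice type.

```python
from typing import Dict, List

# Hypothetical cue words that suggest a voice type
VOICE_CUES: Dict[str, List[str]] = {
    "female": ["she", "her", "mother"],
    "male": ["he", "his", "father"],
}


def pick_voice(text: str, default: str = "neutral") -> str:
    """Return the voice type whose cue words occur most often, or the default if none occur."""
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    counts = {voice: sum(words.count(cue) for cue in cues) for voice, cues in VOICE_CUES.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else default


voice = pick_voice("Her mother said she would call.")
fallback = pick_voice("The weather is mild.")
```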
SPEECH PROCESSING APPARATUS AND PROGRAM

Publication number: 20090177474

Abstract: A speech synthesizer includes a periodic component fusing unit and an aperiodic component fusing unit, and fuses periodic components and aperiodic components of a plurality of speech units for each segment, which are selected by a unit selector, by a periodic component fusing unit and an aperiodic component fusing unit, respectively. The speech synthesizer is further provided with an adder, so that the adder adds, edits, and concatenates the periodic components and the aperiodic components of the fused speech units to generate a speech waveform.

Type: Application

Filed: September 18, 2008

Publication date: July 9, 2009

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Masahiro Morita, Takehiko Kagoshima
Methods, Apparatuses, and Computer Program Products for Semantic Media Conversion From Source Files to Audio/Video Files

Publication number: 20090157407

Abstract: An apparatus for semantic media conversion from source data to audio/video data may include a processor. The processor may be configured to parse source data having text and one or more tags and create a semantic structure model representative of the source data, and generate audio data comprising at least one of speech converted from parsed text of the source data contained in the semantic structure model and applied audio effects. Corresponding methods and computer program products are also provided.

Type: Application

Filed: December 12, 2007

Publication date: June 18, 2009

Inventors: Tetsuo Yamabe, Kiyotaka Takahashi
Context-aware unit selection

Publication number: 20090132253

Abstract: Methods and apparatuses to perform context-aware unit selection for natural language processing are described. Streams of information associated with input units are received. The streams of information are analyzed in a context associated with first candidate units to determine a first set of weights of the streams of information. A first candidate unit is selected from the first candidate units based on the first set of weights of the streams of information. The streams of information are analyzed in the context associated with second candidate units to determine a second set of weights of the streams of information. A second candidate unit is selected from second candidate units to concatenate with the first candidate unit based on the second set of weights of the streams of information.

Type: Application

Filed: November 20, 2007

Publication date: May 21, 2009

Inventor: Jerome Bellegarda
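The idea of re-weighting information streams for each selection step can be sketched as a weighted-cost minimization. The stream names, cost values, and weights below are invented for illustration, and the weights are simply given rather than derived from context analysis as the abstract describes.

```python
from typing import Dict, List


def select_unit(candidates: List[Dict[str, float]], weights: Dict[str, float]) -> int:
    """Return the index of the candidate with the lowest weighted cost across streams."""
    def cost(candidate: Dict[str, float]) -> float:
        return sum(weights[stream] * candidate[stream] for stream in weights)

    return min(range(len(candidates)), key=lambda i: cost(candidates[i]))


# Hypothetical per-stream costs for two candidate units
candidates = [
    {"pitch": 0.1, "duration": 0.9},
    {"pitch": 0.5, "duration": 0.1},
]

# A context that emphasizes pitch versus one that emphasizes duration
pitch_heavy = select_unit(candidates, {"pitch": 0.8, "duration": 0.2})
duration_heavy = select_unit(candidates, {"pitch": 0.2, "duration": 0.8})
```

The same candidates yield different winners under the two weight sets, which is the point of determining a separate set of weights for each group of candidate units.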
Screen reader remote access system

Publication number: 20090100150

Abstract: The present invention provides an assistive technology screen reader in a distributed network computer system. The screen reader, on a server computer system, receives display information output from one or more applications. The screen reader converts the text and symbolic content of the display information into a performant format for transmission across a network. The screen reader, on a client computer system, receives the performant format. The received performant format is converted to a device type file, by the screen reader. The screen reader then presents the device type file to a device driver, for output to a speaker, braille reader, or the like.

Type: Application

Filed: June 14, 2002

Publication date: April 16, 2009

Inventor: David Yee
Generalized Object Recognition for Portable Reading Machine

Publication number: 20090048842

Abstract: Techniques for operating a reading machine are disclosed. The techniques include forming an N-dimensional features vector based on features of an image, the features corresponding to characteristics of at least one object depicted in the image; representing the features vector as a point in n-dimensional space, where n corresponds to N, the number of features in the features vector; and comparing the point in n-dimensional space to a centroid that represents a cluster of points in the n-dimensional space corresponding to a class of objects to determine whether the point belongs in the class of objects corresponding to the centroid.

Type: Application

Filed: April 28, 2008

Publication date: February 19, 2009

Inventors: Paul Albrecht, Rafael Maya Zetune, Lucy Gibson, Raymond C. Kurzweil
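The centroid comparison in this abstract can be sketched as a distance test. The use of Euclidean distance, the `radius` threshold, and the function name `belongs_to_class` are assumptions; the abstract says only that the point is compared to a centroid to decide class membership.

```python
import math
from typing import Sequence


def belongs_to_class(point: Sequence[float], centroid: Sequence[float], radius: float) -> bool:
    """Return True if the feature point lies within `radius` of the class centroid."""
    if len(point) != len(centroid):
        raise ValueError("point and centroid must have the same number of dimensions")
    return math.dist(point, centroid) <= radius


features = (1.0, 2.0, 2.0)   # toy 3-dimensional features vector
centroid = (0.0, 0.0, 0.0)   # toy centroid of one object class
inside = belongs_to_class(features, centroid, radius=3.5)
outside = belongs_to_class(features, centroid, radius=2.0)
```

With several classes, the same distance could be computed against each centroid and the nearest one chosen, subject to the threshold.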