Text Analysis, Generation Of Parameters For Speech Synthesis Out Of Text, E.g., Grapheme To Phoneme Translation, Prosody Generation, Stress, Or Intonation Determination, Etc. (epo) Patents (Class 704/E13.011)
-
Publication number: 20110010178
Abstract: Provided is a system and method for transforming vernacular pronunciation with respect to Hanja using a statistical method. In a system for transforming vernacular pronunciation, a vernacular pronunciation extracting unit extracts a vernacular pronunciation with respect to a Hanja character string, a statistical data determining unit determines statistical data with respect to the Hanja character string by using statistical data of features related to a Hanja-vernacular pronunciation transformation, and a vernacular pronunciation transforming unit transforms the Hanja character string into a vernacular pronunciation using the extracted vernacular pronunciation and the determined statistical data.
Type: Application
Filed: July 7, 2010
Publication date: January 13, 2011
Applicant: NHN Corporation
Inventors: Hyunjung Lee, Taeil Kim, Hee-Cheol Seo, Ji Hye Lee
-
Publication number: 20100329505
Abstract: An image processing apparatus includes: a storage module configured to store a plurality of pieces of comment data; an analyzing module configured to analyze an expression of a person contained in image data; a generating module configured to select target comment data from among the comment data stored in the storage module based on the expression of the person analyzed by the analyzing module, and to generate voice data using the target comment data; and an output module configured to output reproduction data to be used for displaying the image data together with the voice data generated by the generating module.
Type: Application
Filed: June 1, 2010
Publication date: December 30, 2010
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Kousuke Imoji, Yuki Kaneko, Junichi Takahashi
-
Publication number: 20100332224
Abstract: In accordance with an example embodiment of the present invention, an apparatus comprises a controller configured to process punctuated text data, and to identify punctuation in said punctuated text data; and an output unit configured to generate audio output corresponding to said punctuated text data, and to generate tactile output corresponding to said identified punctuation.
Type: Application
Filed: June 30, 2009
Publication date: December 30, 2010
Applicant: NOKIA CORPORATION
Inventors: Jakke Sakari Mäkelä, Jukka Pekka Naula, Niko Santeri Porjo
-
Publication number: 20100324905
Abstract: Disclosed are techniques and systems to provide a narration of a text in multiple different voices. Further disclosed are techniques and systems for modifying a voice model associated with a selected character based on data received from a user.
Type: Application
Filed: January 14, 2010
Publication date: December 23, 2010
Inventors: Raymond C. Kurzweil, Paul Albrecht, Peter Chapman
-
Publication number: 20100318360
Abstract: The present invention is a method and system for extracting messages from a person using the body features presented by a user. The present invention captures a set of images and extracts a first set of body features, along with a set of contexts, and a set of meanings. From the first set of body features, the set of contexts, and the set of meanings, the present invention generates a set of words corresponding to the message that the person is attempting to convey. The present invention can also use the body features of the person in addition to the voice of the person to further improve the accuracy of extracting the person's message.
Type: Application
Filed: June 10, 2009
Publication date: December 16, 2010
Applicant: Toyota Motor Engineering & Manufacturing North America, Inc.
Inventor: Yasuo Uehara
-
Publication number: 20100318361
Abstract: Assistive, context-relevant images may be provided. First, text may be received. Then a spell check indication may be received and a spelling check may be performed on the received text in response to the received spell check indication. Next, in response to the performed spelling check, a misspelling indication may be provided configured to indicate that at least one word in the received text is misspelled. A selection of the misspelling indication may then be received. Then, on a display device in response to the received selection of the misspelling indication, a plurality of suggested spellings for the at least one word and an image corresponding to a first one of the plurality of suggested spellings for the at least one word may be displayed.
Type: Application
Filed: June 11, 2009
Publication date: December 16, 2010
Applicant: Microsoft Corporation
Inventors: Roderick C. Paulino, Jimmy Y. Sun
-
Publication number: 20100312564
Abstract: A local text to speech feedback loop is utilized to modify algorithms used in speech synthesis to provide a user with an improved experience. A remote text to speech feedback loop is utilized to aggregate local feedback loop data and incorporate the best solutions into a new, improved text to speech engine for deployment.
Type: Application
Filed: June 5, 2009
Publication date: December 9, 2010
Applicant: Microsoft Corporation
Inventor: Michael D. Plumpe
-
Publication number: 20100312562
Abstract: A rope-jumping algorithm is employed in a Hidden Markov Model based text to speech system to determine start and end models and to modify the start and end models by setting small co-variances. Disordered acoustic parameters due to violation of parameter constraints are avoided through the modification, resulting in a stable line frequency spectrum for the generated speech.
Type: Application
Filed: June 4, 2009
Publication date: December 9, 2010
Applicant: Microsoft Corporation
Inventors: Wenlin Wang, Guoliang Zhang, Jingyang Xu
-
Publication number: 20100299147
Abstract: Systems and methods for facilitating communication including recognizing speech in a first language represented in a first audio signal; forming a first text representation of the speech; processing the first text representation to form data representing a second audio signal; and causing presentation of the second audio signal to a second user while responsive to an interrupt signal from a first user. In some embodiments, processing the first text representation includes translating the first text representation to a second text representation in a second language and processing the second text representation to form the data representing the second audio signal. Some embodiments include accepting an interrupt signal from the first user and interrupting the presentation of the second audio signal.
Type: Application
Filed: May 20, 2009
Publication date: November 25, 2010
Applicant: BBN Technologies Corp.
Inventor: David G. Stallard
-
Publication number: 20100250254
Abstract: An acquiring unit acquires pattern sentences, which are similar to one another and include fixed segments and non-fixed segments, and substitution words that are substituted for the non-fixed segments. A sentence generating unit generates target sentences by replacing the non-fixed segments with the substitution words for each of the pattern sentences. A first synthetic-sound generating unit generates a first synthetic sound, a synthetic sound of the fixed segment, and a second synthetic-sound generating unit generates a second synthetic sound, a synthetic sound of the substitution word, for each of the target sentences. A calculating unit calculates a discontinuity value of a boundary between the first synthetic sound and the second synthetic sound for each of the target sentences, and a selecting unit selects the target sentence having the smallest discontinuity value. A connecting unit connects the first synthetic sound and the second synthetic sound of the selected target sentence.
Type: Application
Filed: September 15, 2009
Publication date: September 30, 2010
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventor: Nobuaki Mizutani
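The selection step described above (pick the candidate whose synthetic sound joins the fixed segment with the smallest boundary discontinuity) can be sketched as follows. This is an illustrative simplification, not the patent's implementation: the discontinuity measure here is a hypothetical stand-in that compares one scalar frame value on each side of the boundary.

```python
# Illustrative sketch: choose, among candidate substitution words, the one
# whose synthetic sound joins the fixed segment with the smallest boundary
# discontinuity.  fixed_tail / sub_head are hypothetical scalar frame
# values at the concatenation boundary.

def discontinuity(fixed_tail: float, sub_head: float) -> float:
    """Hypothetical boundary-discontinuity measure between two synthetic sounds."""
    return abs(fixed_tail - sub_head)

def select_best_substitution(fixed_tail, candidates):
    """candidates: list of (word, head_frame_value) pairs.
    Return the word with the minimum boundary discontinuity."""
    return min(candidates, key=lambda c: discontinuity(fixed_tail, c[1]))[0]

best = select_best_substitution(0.50, [("Tokyo", 0.9), ("Osaka", 0.55), ("Nara", 0.1)])
print(best)  # -> Osaka (boundary gap 0.05 is the smallest)
```

A real system would compare vector-valued acoustic features (e.g. spectra) at the boundary rather than single scalars, but the argmin structure is the same.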
-
Publication number: 20100250253
Abstract: A speech-directed user interface system includes at least one speaker for delivering an audio signal to a user and at least one microphone for capturing speech utterances of a user. An interface device interfaces with the speaker and microphone and provides a plurality of audio signals to the speaker to be heard by the user. A control circuit is operably coupled with the interface device and is configured for selecting at least one of the plurality of audio signals as a foreground audio signal for delivery to the user through the speaker. The control circuit is operable for recognizing speech utterances of a user and using the recognized speech utterances to control the selection of the foreground audio signal.
Type: Application
Filed: March 27, 2009
Publication date: September 30, 2010
Inventor: Yangmin Shen
-
Publication number: 20100211392
Abstract: The speech synthesizing device acquires numerical data at regular time intervals, each piece of the numerical data representing a value having a plurality of digits, detects a change between two values represented by the numerical data that is acquired at two consecutive times, determines which digit of the value represented by the numerical data is used to generate speech data depending on the detected change, generates numerical information that indicates the determined digit of the value represented by the numerical data, and generates speech data from the digit indicated by the numerical information.
Type: Application
Filed: September 21, 2009
Publication date: August 19, 2010
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Ryutaro Tokuda, Takehiko Kagoshima
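The digit-selection idea above (speak only the digits that changed between two consecutive readings) can be sketched like this. The zero-padded width and the speak-from-highest-changed-digit policy are illustrative assumptions, not details taken from the patent.

```python
def changed_digit_index(prev: int, curr: int, width: int = 6) -> int:
    """Return the position (0 = most significant) of the highest digit that
    differs between two zero-padded readings, or -1 if nothing changed."""
    a, b = str(prev).zfill(width), str(curr).zfill(width)
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return -1

def digits_to_speak(prev: int, curr: int, width: int = 6) -> str:
    """Hypothetical policy: speak from the highest changed digit onward,
    so an unchanged prefix of the value is not re-read every interval."""
    i = changed_digit_index(prev, curr, width)
    return "" if i < 0 else str(curr).zfill(width)[i:]

print(digits_to_speak(120450, 120470))  # -> "70": only the tail that changed
```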
-
Publication number: 20100211393
Abstract: A speech synthesis device is provided with: a central segment selection unit for selecting a central segment from among a plurality of speech segments; a prosody generation unit for generating prosody information based on the central segment; a non-central segment selection unit for selecting a non-central segment, which is a segment outside of a central segment section, based on the central segment and the prosody information; and a waveform generation unit for generating a synthesized speech waveform based on the prosody information, the central segment, and the non-central segment. The speech synthesis device first selects a central segment that forms a basis for prosody generation and generates prosody information based on the central segment, so that it is possible to sufficiently reduce both concatenation distortion and sound quality degradation accompanying prosody control in the section of the central segment.
Type: Application
Filed: April 28, 2008
Publication date: August 19, 2010
Inventors: Masanori Kato, Yasuyuki Mitsui, Reishi Kondo
-
Publication number: 20100201793
Abstract: A reading device includes a computing device and an image input device coupled to the computing device for capturing low resolution images and high resolution images. The reading machine also includes a computer program product residing on a computer readable medium. The medium is in communication with the computing device and includes instructions to operate in a plurality of modes to optimize performance for specific uses of the reading device and process low and high resolution images during operation of at least one of the plurality of modes.
Type: Application
Filed: February 9, 2010
Publication date: August 12, 2010
Inventors: Raymond C. Kurzweil, Paul Albrecht, James Gashel, Lucy Gibson
-
Publication number: 20100198595
Abstract: In a system comprising a voice recognition module, a session manager, and a voice generator module, a method for providing a service to a user comprises receiving an utterance via the voice recognition module; converting the utterance into one or more structures using a lexicon tied to an ontology; identifying concepts in the utterance using the structures; provided the utterance includes sufficient information, selecting a service based on the concepts; generating a text message based on the selected service; and converting the text message to a voice message using the voice generator.
Type: Application
Filed: February 3, 2009
Publication date: August 5, 2010
Applicant: SoftHUS Sp.z.o.o
Inventor: Eugeniusz Wlasiuk
-
Publication number: 20100198594
Abstract: Mobile phone signals may be corrupted by noise, fading, interference with other signals, and low strength field coverage of a transmitting and/or a receiving mobile phone as they pass through the communication network (e.g., free space). Because of the corruption of the mobile phone signal, a voice conversation between a caller and a receiver may be interrupted, and there may be gaps in a received oral communication from one or more participants in the voice conversation, forcing either or both the caller and the receiver to repeat the conversation. Transmitting a transcript of the oral communication along with a voice signal comprising the oral communication can help ensure that the voice conversation is not interrupted due to a corrupted voice signal. The transcript of the oral communication can be used to retrieve parts of the oral communication lost in transmission (e.g., by fading, etc.) to make the conversation more fluid.
Type: Application
Filed: February 3, 2009
Publication date: August 5, 2010
Applicant: International Business Machines Corporation
Inventors: Rosario Gangemi, Giuseppe Longobardi
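The gap-recovery idea above (use a transmitted transcript to restore words lost from the voice signal) can be sketched as follows, under the simplifying assumption that the received word stream and the transcript are already word-aligned; a real system would need alignment and would re-synthesize the recovered words rather than return text.

```python
def fill_gaps(received_words, transcript_words):
    """Replace dropped words (represented as None) in the received stream
    with the corresponding words from the transmitted transcript.
    Assumes the two streams are word-aligned (an illustrative simplification)."""
    return [t if r is None else r
            for r, t in zip(received_words, transcript_words)]

heard = ["meet", None, "at", None, "pm"]   # words lost to fading marked None
sent  = ["meet", "me", "at", "three", "pm"]
print(" ".join(fill_gaps(heard, sent)))  # -> meet me at three pm
```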
-
Publication number: 20100161327
Abstract: A computer-implemented method for automatically analyzing, predicting, and/or modifying acoustic units of prosodic human speech utterances for use in speech synthesis or speech recognition. Possible steps include: initiating analysis of acoustic wave data representing the human speech utterances, via the phase state of the acoustic wave data; using one or more phase state defined acoustic wave metrics as common elements for analyzing, and optionally modifying, pitch, amplitude, duration, and other measurable acoustic parameters of the acoustic wave data, at predetermined time intervals; analyzing acoustic wave data representing a selected acoustic unit to determine the phase state of the acoustic unit; and analyzing the acoustic wave data representing the selected acoustic unit to determine at least one acoustic parameter of the acoustic unit with reference to the determined phase state of the selected acoustic unit. Also included are systems for implementing the described and related methods.
Type: Application
Filed: December 16, 2009
Publication date: June 24, 2010
Inventors: Nishant Chandra, Reiner Wilhelms-Tricarico, Rattima Nitisaroj, Brian Mottershead, Gary A. Marple, John B. Reichenbach
-
Publication number: 20100131267
Abstract: A method of recording speech for use in a speech samples library. In an exemplary embodiment, the method comprises recording a speaker pronouncing a phoneme with musical parameters characterizing pronunciation of another phoneme by the same or another speaker. For example, in one embodiment the method comprises: providing a recording of a first speaker pronouncing a first phoneme in a phonemic context. The pronunciation is characterized by some musical parameters. A second reader, who may be the same as the first speaker, is then recorded pronouncing a second phoneme (different from the first phoneme) with the musical parameters that characterize pronunciation of the first phoneme by the first speaker. The recordings made by the second reader are used for compiling a speech samples library.
Type: Application
Filed: March 19, 2008
Publication date: May 27, 2010
Applicant: Vivo Text Ltd.
Inventors: Gershon Silbert, Andres Hakim
-
Publication number: 20100125459
Abstract: Exemplary embodiments provide for determining a sequence of words in a TTS system. An input text is analyzed using two models, a word n-gram model and an accent class n-gram model. A list of all possible words for each word in the input is generated for each model. Each word in each list for each model is given a score based on the probability that the word is the correct word in the sequence, based on the particular model. The two lists are combined and the two scores are combined for each word. A set of sequences of words is generated, where each sequence of words comprises a unique combination of an attribute and associated word for each word in the input. The combined scores of the words in each sequence are summed, and the sequence of words having the highest score is selected and presented to a user.
Type: Application
Filed: July 1, 2009
Publication date: May 20, 2010
Applicant: Nuance Communications, Inc.
Inventors: Nobuyasu Itoh, Tohru Nagano, Masafumi Nishimura, Ryuki Tachibana
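The per-word score combination described above (merge each model's candidate list, then combine the two models' scores for each word) might look like this in outline. The log-probability values, the equal weighting, and the floor penalty for words one model never proposed are all illustrative assumptions, not details from the patent.

```python
def combine_scores(word_ngram: dict, accent_ngram: dict, w: float = 0.5) -> dict:
    """Merge the candidate lists of two models; a word's combined score is
    a weighted sum of its log-probability under each model.  Words missing
    from one model get a hypothetical floor penalty."""
    floor = -10.0  # assumed penalty for an unseen candidate
    words = set(word_ngram) | set(accent_ngram)
    return {wd: w * word_ngram.get(wd, floor) + (1 - w) * accent_ngram.get(wd, floor)
            for wd in words}

def best_word(word_ngram: dict, accent_ngram: dict) -> str:
    """Pick the candidate with the highest combined score."""
    scores = combine_scores(word_ngram, accent_ngram)
    return max(scores, key=scores.get)

# The word model slightly prefers "record", but the accent-class model
# strongly prefers "records"; the combination picks "records".
print(best_word({"record": -1.0, "records": -2.0},
                {"record": -2.0, "records": -0.5}))  # -> records
```

The patent scores whole sequences, not single positions; extending this sketch to sequences would sum the combined per-word scores along each candidate sequence and take the argmax.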
-
Publication number: 20100121629
Abstract: A translation platform allows a client using a first language to communicate via translated voice and/or text to at least a second client using a second language. A control server uses various speech recognition engines, text translation engines and text to speech engines to accomplish real-time or near-real time translations.
Type: Application
Filed: May 28, 2009
Publication date: May 13, 2010
Inventor: Sanford H. Cohen
-
Publication number: 20100100317
Abstract: A method and apparatus for determining the manner in which a processor-enabled device should produce sounds from data is described. In at least one embodiment, the device includes a first device for synthesizing sounds digitally and re-producing pre-recorded sounds, a second device for audible delivery thereof, memory in which is stored a database of a plurality of data at least some of which is in the form of text-based indicators, and one or more pre-recorded sounds, a data transfer device by which the data is transferred between the processor of the device and the memory, and operating system software which controls the processing and flow of data between a processor and the memory, and whether the sounds are audibly reproduced. In accordance with at least one embodiment of the invention, the device is further capable of repeatedly determining one or more physical conditions, e.g.
Type: Application
Filed: March 21, 2007
Publication date: April 22, 2010
Inventors: Rory Jones, Sven Jurgens
-
Publication number: 20100094632
Abstract: Disclosed herein are various aspects of a toolkit used for generating a TTS voice for use in a spoken dialog system. The embodiments in each case may be in the form of the system, a computer-readable medium or a method for generating the TTS voice. An embodiment of the invention relates to a method of tracking progress in developing a text-to-speech (TTS) voice. The method comprises ensuring that a corpus of recorded speech contains reading errors and matches an associated written text, creating a tuple for each utterance in the corpus and tracking progress for each utterance utilizing the tuple. Various parameters may be tracked using the tuple, but the tuple provides a means for enabling multiple workers to efficiently process a database of utterances in preparation of a TTS voice.
Type: Application
Filed: December 15, 2009
Publication date: April 15, 2010
Applicant: AT&T Corp.
Inventors: Steven Lawrence Davis, Shane Fetters, David Eugene Schultz, Beverly Gustafson, Louise Loney
-
Publication number: 20100076766
Abstract: The present invention discloses a method for producing graphical indicators and interactive systems for utilizing the graphical indicators. On the surface of an object, visually negligible graphical indicators are provided. The graphical indicators and main information, i.e. text or pictures, co-exist on the surface of the object. The graphical indicators do not interfere with the main information where the perception of human eyes is concerned. With the graphical indicators, further information other than the main information on the surface of the object is carried. In addition to the main information on the surface of the object, one is able to obtain additional information through an auxiliary electronic device or trigger an interactive operation.
Type: Application
Filed: November 19, 2009
Publication date: March 25, 2010
Applicant: Sonix Technology Co., Ltd.
Inventor: Yao-Hung Tsai
-
Publication number: 20100063821
Abstract: Technologies are described herein for providing a hands-free and non-visually occluding interaction with object information. In one method, a visual capture of a portion of an object is received through a hands-free and non-visually occluding visual capture device. An audio capture is also received from a user through a hands-free and non-visually occluding audio capture device. The audio capture may include a request for information about a portion of the object in the visual capture. The information is retrieved and is transmitted to the user for playback through a hands-free and non-visually occluding audio output device.
Type: Application
Filed: September 9, 2008
Publication date: March 11, 2010
Inventors: Joseph C. Marsh, Eric M. Smith
-
Publication number: 20100030557
Abstract: The disclosure relates to systems, methods and apparatus to convert speech to text and vice versa. One apparatus comprises a vocoder, a speech to text conversion engine, a text to speech conversion engine, and a user interface. The vocoder is operable to convert speech signals into packets and convert packets into speech signals. The speech to text conversion engine is operable to convert speech to text. The text to speech conversion engine is operable to convert text to speech. The user interface is operable to receive a user selection of a mode from among a plurality of modes, wherein a first mode enables the speech to text conversion engine, a second mode enables the text to speech conversion engine, and a third mode enables the speech to text conversion engine and the text to speech conversion engine.
Type: Application
Filed: July 31, 2006
Publication date: February 4, 2010
Inventors: Stephen Molloy, Khaled Helmi El-Maleh
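The three-mode interface above can be sketched as a simple dispatcher. The placeholder engines below (lowercasing for speech-to-text, uppercasing for text-to-speech) only mark where real conversion engines would plug in; the mode names are invented for illustration.

```python
STT, TTS, BOTH = "stt", "tts", "both"  # hypothetical mode identifiers

def route(mode: str, speech=None, text=None) -> dict:
    """Run the input through whichever conversion engines the selected
    mode enables.  The engines here are trivial placeholders."""
    speech_to_text = lambda s: s.lower()   # placeholder STT engine
    text_to_speech = lambda t: t.upper()   # placeholder TTS engine
    out = {}
    if mode in (STT, BOTH) and speech is not None:
        out["text"] = speech_to_text(speech)
    if mode in (TTS, BOTH) and text is not None:
        out["speech"] = text_to_speech(text)
    return out

print(route(BOTH, speech="Hello", text="bye"))  # -> {'text': 'hello', 'speech': 'BYE'}
```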
-
Publication number: 20090326951
Abstract: Ratios of powers at the peaks of respective formants of the spectrum of a pitch-cycle waveform and powers at boundaries between the formants are obtained and, when the ratios are large, the bandwidths of window functions are widened and the formant waveforms are generated by multiplying generated sinusoidal waveforms from the formant parameter sets on the basis of pitch-cycle waveform generating data by the window functions of the widened bandwidth, whereby a pitch-cycle waveform is generated by the sum of these formant waveforms.
Type: Application
Filed: April 14, 2009
Publication date: December 31, 2009
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Ryo Morinaka, Takehiko Kagoshima
-
Publication number: 20090306987
Abstract: There is provided a singing synthesis parameter data estimation system that automatically estimates singing synthesis parameter data for automatically synthesizing a human-like singing voice from an audio signal of input singing voice. A pitch parameter estimating section 9 estimates a pitch parameter, by which the pitch feature of an audio signal of synthesized singing voice is brought closer to the pitch feature of the audio signal of input singing voice, based on at least both the pitch feature and lyric data with specified syllable boundaries of the audio signal of input singing voice.
Type: Application
Filed: May 21, 2009
Publication date: December 10, 2009
Applicant: NATIONAL INSTITUTE OF ADVANCED INDUSTRIAL SCIENCE AND TECHNOLOGY
Inventors: Tomoyasu Nakano, Masataka Goto
-
Publication number: 20090295735
Abstract: An electronic device and a method for automatically converting text to be displayed on a display screen of an electronic device into a speech signal when ambient light conditions affect viewing of the text. The method is performed by the electronic device and includes receiving a command to display text on the display screen and determining if an ambient light signal provided by an ambient light sensor is above a pre-determined viewing threshold. This ambient light signal corresponds to ambient light conditions adjacent the display screen. The method also includes automatically converting the text to a speech signal when the ambient light signal is above the pre-determined viewing threshold. Suitably, there is performed a step of emitting the speech signal in an audible form from a speaker.
Type: Application
Filed: May 27, 2008
Publication date: December 3, 2009
Applicant: Motorola, Inc.
Inventors: Wang Wang, Wei Guo, Kan Ni, Danilo Tan
-
Publication number: 20090278766
Abstract: An adequate display operation control in accordance with the external world situation is realized. For example, where a user wears the wearing unit of a spectacle-shaped or head-worn unit, the user is able to view any type of image on the display section immediately in front of the eyes, and is provided with taken images, reproduced images, and received images. At that point, control of various display operations, such as on/off of the display operation, display operation mode, and source change, is carried out based on external world information.
Type: Application
Filed: August 17, 2007
Publication date: November 12, 2009
Applicant: SONY CORPORATION
Inventors: Yoichiro Sako, Masaaki Tsuruta, Taiji Ito, Masamichi Asukai
-
Publication number: 20090271202
Abstract: A speech synthesis apparatus includes a content selection unit that selects a text content item to be converted into speech; a related information selection unit that selects related information which can be at least converted into text and which is related to the text content item selected by the content selection unit; a data addition unit that converts the related information selected by the related information selection unit into text and adds text data of the text to text data of the text content item selected by the content selection unit; a text-to-speech conversion unit that converts the text data supplied from the data addition unit into a speech signal; and a speech output unit that outputs the speech signal supplied from the text-to-speech conversion unit.
Type: Application
Filed: March 25, 2009
Publication date: October 29, 2009
Applicant: SONY ERICSSON MOBILE COMMUNICATIONS JAPAN, INC.
Inventor: Susumu Takatsuka
-
Publication number: 20090271176
Abstract: Methods, systems, and computer program products are provided for multilingual administration of enterprise data. Embodiments include retrieving enterprise data; extracting text from the enterprise data for rendering from a digital media file, the extracted text being in a source language; identifying that the source language is not a predetermined default target language for rendering the enterprise data; translating the extracted text in the source language to translated text in the default target language; converting the translated text to synthesized speech in the default target language; and storing the synthesized speech in the default target language in a digital media file.
Type: Application
Filed: April 24, 2008
Publication date: October 29, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: William K. Bodin, David Jaramillo, Ann Marie Maynard
-
Publication number: 20090259473
Abstract: Methods and apparatus to present a video program to a visually impaired person are disclosed. An example method comprises receiving a video stream and an associated audio stream of a video program, detecting a portion of the video program that is not readily consumable by a visually impaired person, obtaining text associated with the portion of the video program, converting the text to a second audio stream, and combining the second audio stream with the associated audio stream.
Type: Application
Filed: April 14, 2008
Publication date: October 15, 2009
Inventors: Hisao M. Chang, Horst Schroeter
-
Publication number: 20090259471
Abstract: A universal pattern processing system receives input data and produces output patterns that are best associated with said data. The system uses input means receiving and processing input data, a universal pattern decoder means transforming models using the input data and associating output patterns with original models that are changed least during transforming, and output means outputting best associated patterns chosen by a pattern decoder means.
Type: Application
Filed: April 11, 2008
Publication date: October 15, 2009
Applicant: International Business Machines Corporation
Inventors: Dimitri Kanevsky, David Nahamoo, Tara N. Sainath
-
Publication number: 20090240501
Abstract: Described is a technology by which artificial words are generated based on seed words, and then used with a letter-to-sound conversion model. To generate an artificial word, a stressed syllable of a seed word is replaced with a different syllable, such as a candidate (artificial) syllable, when the phonemic structure and/or graphonemic structure of the stressed syllable and the candidate syllable match one another. In one aspect, the artificial words are provided for use with a letter-to-sound conversion model, which may be used to generate artificial phonemes from a source of words, such as in conjunction with other models. If the phonemes provided by the various models for a selected source word are in agreement relative to one another, the selected source word and an associated artificial phoneme may be added to a training set which may then be used to retrain the letter-to-sound conversion model.
Type: Application
Filed: March 19, 2008
Publication date: September 24, 2009
Applicant: MICROSOFT CORPORATION
Inventors: Yi Ning Chen, Jia Li You, Frank Kao-ping Soong
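The syllable-replacement step above (swap the stressed syllable for a candidate only when their structures match) can be sketched as follows. The consonant/vowel structure function is a deliberately crude stand-in for the phonemic/graphonemic matching the abstract describes; everything here is illustrative.

```python
def cv_structure(syllable: str) -> str:
    """Crude structural signature: map each letter to C (consonant) or V (vowel).
    A stand-in for the patent's phonemic/graphonemic structure comparison."""
    return "".join("V" if ch in "aeiou" else "C" for ch in syllable.lower())

def make_artificial_word(seed_syllables, stressed_idx, candidate):
    """Replace the stressed syllable of a seed word with a candidate syllable,
    but only when their structures match; otherwise reject (return None)."""
    if cv_structure(seed_syllables[stressed_idx]) != cv_structure(candidate):
        return None
    out = list(seed_syllables)
    out[stressed_idx] = candidate
    return "".join(out)

# "fer" and "ver" share the structure CVC, so the swap is accepted:
print(make_artificial_word(["con", "fer"], 1, "ver"))  # -> conver
# "ia" (VV) does not match "fer" (CVC), so the swap is rejected:
print(make_artificial_word(["con", "fer"], 1, "ia"))   # -> None
```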
-
Publication number: 20090187407
Abstract: The present invention relates to a system and methods for preparing reports, such as medical reports. The system and methods advantageously can verbalize information, using speech synthesis (text-to-speech), to support a dialogue between a user and the reporting system during the course of the preparation of the report in order that the user can avoid inefficient visual distractions.
Type: Application
Filed: January 18, 2008
Publication date: July 23, 2009
Inventors: Jeffrey Soble, James Roberge
-
Publication number: 20090187408
Abstract: A temporary child set is generated. An elastic ratio of an elastic section of a model pattern is calculated. A temporary typical pattern of the set is generated by combining the pattern belonging to the set with the model pattern having the elastic section expanded or contracted. A distortion between the temporary typical pattern of the set and the pattern belonging to the set is calculated, and a child set is determined as the set when the distortion is below a threshold. A typical pattern as the temporary typical pattern of the child set is stored with a classification rule as the classification item of the context of the pattern belonging to the child set.
Type: Application
Filed: January 23, 2009
Publication date: July 23, 2009
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventor: Nobuaki Mizutani
-
Publication number: 20090177475
Abstract: Even when a pitch cycle has a large fluctuation and the pitch cycle string changes abruptly, it is possible to suppress the effect of the pitch cycle fluctuation and generate high-quality synthesized speech. A speech synthesis device generates a synthesized speech corresponding to an input text sentence according to an original speech waveform stored in an original speech waveform information storage unit (25). The speech synthesis device includes a pitch cycle correction unit (40) which extracts a fluctuation component of the pitch cycle of the original speech waveform obtained from the original speech waveform information storage unit (25) in order to generate the synthesized speech, and which corrects, based on the extracted fluctuation component, the pitch cycle of the synthesized speech obtained by analyzing the input text sentence. The pitch cycle correction unit (40) connects the pitch cycle waveform of the original speech waveform at the pitch cycle of the corrected synthesized speech.
Type: Application
Filed: July 4, 2007
Publication date: July 9, 2009
Applicant: NEC CORPORATION
Inventor: Masanori Kato
-
Publication number: 20090157408. Abstract: The present invention relates to a speech synthesizing method and apparatus based on a hidden Markov model (HMM). Among code words obtained by quantizing speech parameter instances for each state of an HMM model, the code word closest to a speech parameter generated from an input text using a known method is searched for. When the distance between the found code word and the speech parameter generated by the known method is smaller than or equal to a threshold value, the found code word is output as the final speech parameter. When the distance exceeds the threshold value, the speech parameter generated by the known method is output as the final speech parameter. The final speech parameter is processed to generate the final synthesized speech for the input text. Type: Application. Filed: June 27, 2008. Publication date: June 18, 2009. Applicant: Electronics and Telecommunications Research Institute. Inventor: Sanghun KIM
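The selection rule in this abstract is a nearest-neighbour test with a fallback. A minimal sketch, assuming Euclidean distance (the abstract does not specify the metric) and the hypothetical name `choose_parameter`:

```python
def choose_parameter(generated, codebook, threshold):
    """Pick the codebook vector nearest to the generated parameter;
    fall back to the generated parameter if the nearest codeword is
    farther away than the threshold."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    nearest = min(codebook, key=lambda c: dist(c, generated))
    return nearest if dist(nearest, generated) <= threshold else generated
```

A parameter close to a codeword snaps to it; an outlier is passed through unchanged, which is the fallback behaviour the abstract describes.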
-
Publication number: 20090150157. Abstract: A word dictionary including sets of a character string constituting a word, a phoneme sequence constituting the pronunciation of the word, and a part of speech of the word is referenced. An entered text is analyzed and divided into one or more subtexts, and a phoneme sequence and a part of speech sequence are generated for each subtext. The part of speech sequence of the subtext is collated with a list of part of speech sequences to determine whether the phonetic sounds of the subtext are to be converted, and the phonetic sounds of the phoneme sequence in each subtext so determined are converted. Type: Application. Filed: September 15, 2008. Publication date: June 11, 2009. Applicant: KABUSHIKI KAISHA TOSHIBA. Inventors: Takehiko KAGOSHIMA, Noriko YAMANAKA, Makoto YAJIMA
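The collation step can be sketched as a dictionary lookup followed by a part-of-speech-sequence membership test that gates a phoneme conversion. The toy dictionary, the POS-sequence list, and the name `process_subtext` are all illustrative assumptions, not the patent's data:

```python
# Hypothetical word dictionary: word -> (phoneme sequence, part of speech)
DICT = {
    "hello": (["HH", "AH", "L", "OW"], "interjection"),
    "world": (["W", "ER", "L", "D"], "noun"),
}

# POS sequences whose subtexts should have their phonetic sounds converted
CONVERT_POS_SEQUENCES = {("interjection", "noun")}

def process_subtext(words, convert):
    """Build the phoneme and POS sequences for one subtext, then apply
    the conversion only if the POS sequence is in the collation list."""
    phonemes, pos_seq = [], []
    for w in words:
        p, pos = DICT[w]
        phonemes.extend(p)
        pos_seq.append(pos)
    if tuple(pos_seq) in CONVERT_POS_SEQUENCES:
        phonemes = [convert(p) for p in phonemes]
    return phonemes, tuple(pos_seq)
```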
-
Publication number: 20090125309. Abstract: Methods, systems, and products are disclosed for synthesizing speech. Text is received for translation to speech. The text is correlated to phrases, and each phrase is converted into a corresponding string of phonemes. A phoneme identifier is retrieved that uniquely represents each phoneme in the string of phonemes. Each phoneme identifier is concatenated to produce a sequence of phoneme identifiers, with each phoneme identifier separated by a comma. Each sequence of phoneme identifiers is concatenated and separated by a semi-colon. Type: Application. Filed: January 22, 2009. Publication date: May 14, 2009. Inventor: Steve Tischer
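The encoding this abstract describes (comma-separated identifiers within a phrase, semicolon-separated phrases) is easy to sketch; the identifier table and the name `encode_phrases` are illustrative assumptions:

```python
def encode_phrases(phrases, phoneme_ids):
    """Encode each phrase as comma-separated phoneme identifiers, then
    join the per-phrase sequences with semicolons."""
    sequences = []
    for phonemes in phrases:
        sequences.append(",".join(str(phoneme_ids[p]) for p in phonemes))
    return ";".join(sequences)
```

For example, two phrases with identifiers 1,2 and 3,4 encode as `"1,2;3,4"`.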
-
Publication number: 20090119091. Abstract: A system and method for automated language translation comprising a database of pre-translated patterns produced by human translators, providing a transparent and seamless translation service. Whenever a user issues a translation request, the system offers suitable translated sentences from the aforementioned database. It does so by separating the submitted text into elements and using a pattern recognition mechanism to identify a matching translation for each element. If there is no matching translated pattern in the database, or if the user does not approve the translated sentence, the system transparently engages a suitable registered human translator. The new translation is stored in the database, thus enriching the database, and the translation request is delivered. Type: Application. Filed: October 14, 2008. Publication date: May 7, 2009. Inventor: Eitan Chaim Sarig
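The lookup-with-fallback flow reads like a translation memory: a minimal sketch, with a dictionary standing in for the pattern database and a callback standing in for the registered human translator (all names are assumptions, and elements are naively taken to be whitespace-separated tokens):

```python
def translate(sentence, memory, human_translate):
    """Look each element up in the pre-translated pattern database;
    fall back to a human translator and cache the new translation."""
    out = []
    for element in sentence.split():
        if element not in memory:
            memory[element] = human_translate(element)  # fallback, then store
        out.append(memory[element])
    return " ".join(out)
```

The side effect on `memory` models the abstract's point that each human translation enriches the database for later requests.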
-
Publication number: 20090083036. Abstract: Described is a technology by which synthesized speech generated from text is evaluated against a prosody model (trained offline) to determine whether the speech will sound unnatural. If so, the speech is regenerated with modified data. The evaluation and regeneration may be iterative until deemed natural sounding. For example, text is built into a lattice that is then (e.g., Viterbi) searched to find a best path. The sections (e.g., units) of data on the path are evaluated via a prosody model. If the evaluation deems a section to correspond to unnatural prosody, that section is replaced, e.g., by modifying/pruning the lattice and re-performing the search. Replacement may be iterative until all sections pass the evaluation. Unnatural prosody detection may be biased such that during evaluation, unnatural prosody is falsely detected at a higher rate relative to a rate at which unnatural prosody is missed. Type: Application. Filed: September 20, 2007. Publication date: March 26, 2009. Applicant: Microsoft Corporation. Inventors: Yong Zhao, Frank Kao-ping Soong, Min Chu, Lijuan Wang
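The prune-and-research loop can be sketched with a greedy stand-in for the Viterbi search: pick the best-scoring unit per slot, flag unnatural units with the prosody check, prune them from the lattice, and search again. The names and the per-slot greedy search are simplifying assumptions, not the patent's actual lattice algorithm:

```python
def synthesize_with_check(candidates_per_slot, is_natural, max_iters=10):
    """Iteratively select units, prune any flagged as unnatural by the
    prosody check, and re-search until all selected units pass."""
    candidates = [list(slot) for slot in candidates_per_slot]
    path = []
    for _ in range(max_iters):
        path = [max(slot, key=lambda u: u["score"]) for slot in candidates]
        bad = [i for i, u in enumerate(path) if not is_natural(u)]
        if not bad:
            return path
        for i in bad:
            if len(candidates[i]) > 1:   # keep at least one candidate per slot
                candidates[i].remove(path[i])
    return path
```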
-
Publication number: 20090063128. Abstract: Provided are a device and method for interactive machine translation. The device includes a machine translation engine and a user interface module. The engine has a morphological/syntactic analyzer for analyzing the morphemes and sentences of an original text and generating original text analysis information, and a translation generator for generating a translation and translation generation information on the basis of the original text analysis information. The user interface module displays the sentence structures of the original text and the translation, and the relationship between them, to a user on the basis of the original text analysis information and the translation generation information, and receives corrections to the original text or the translation from the user. The device and method provide a user interface whereby the user can effectively recognize and correct a mistranslated part and the cause of the mistranslation, and rapidly provide a re-translated result according to the correction. Type: Application. Filed: September 5, 2008. Publication date: March 5, 2009. Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE. Inventors: Young Ae SEO, Chang Hyun Kim, Seong Il Yang, Young Sook Hwang, Chang Hao Yin, Eun Jin Park, Sung Kwon Choi, Ki Young Lee, Oh Woog Kwon, Yoon Hyung Roh, Young Kil Kim
-
Publication number: 20090063154. Abstract: Information about a device may be emotively conveyed to a user of the device. Input indicative of an operating state of the device may be received. The input may be transformed into data representing a simulated emotional state. Data representing an avatar that expresses the simulated emotional state may be generated and displayed. A query from the user regarding the simulated emotional state expressed by the avatar may be received. The query may be responded to. Type: Application. Filed: November 5, 2008. Publication date: March 5, 2009. Applicant: Ford Global Technologies, LLC. Inventors: Oleg Yurievitch Gusikhin, Perry Robinson MacNeille, Erica Klampfl, Kacie Alane Theisen, Dimitar Petrov Filev, Yifan Chen, Basavaraj Tonshal
-
Publication number: 20090055158. Abstract: A speech translation apparatus includes a speech recognition unit configured to recognize input speech of a first language to generate a first text of the first language, an extraction unit configured to compare original prosody information of the input speech with first synthesized prosody information based on the first text to extract paralinguistic information about each of the first words of the first text, a machine translation unit configured to translate the first text into a second text of a second language, a mapping unit configured to allocate the paralinguistic information about each of the first words to each of the second words of the second text in accordance with synonymity, a generating unit configured to generate second synthesized prosody information based on the paralinguistic information allocated to each of the second words, and a speech synthesis unit configured to synthesize output speech based on the second synthesized prosody information. Type: Application. Filed: August 21, 2008. Publication date: February 26, 2009. Applicant: KABUSHIKI KAISHA TOSHIBA. Inventors: Dawei Xu, Takehiko Kagoshima
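The mapping unit's job, carrying per-word paralinguistic tags (e.g. emphasis) across a translation, can be sketched with a precomputed word alignment standing in for the synonymity matching; the alignment dictionary and the name `map_paralinguistic` are assumptions:

```python
def map_paralinguistic(source_tags, alignment):
    """Carry per-word paralinguistic tags from source words to their
    aligned (synonymous) target words."""
    target_tags = {}
    for src_word, tgt_word in alignment.items():
        if src_word in source_tags:
            target_tags[tgt_word] = source_tags[src_word]
    return target_tags
```

The target-side synthesizer would then build its prosody from the mapped tags, per the abstract's generating unit.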
-
Publication number: 20090043585. Abstract: Disclosed are systems, methods, and computer readable media for performing speech synthesis. The method embodiment comprises applying a first part of a speech synthesizer to a text corpus to obtain a plurality of phoneme sequences, the first part of the speech synthesizer only identifying possible phoneme sequences; for each of the obtained plurality of phoneme sequences, identifying the joins that would be calculated to synthesize each of the respective phoneme sequences; and adding the identified joins to a cache for use in speech synthesis. Type: Application. Filed: August 9, 2007. Publication date: February 12, 2009. Applicant: AT&T Corp. Inventor: Alistair D. CONKIE
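The caching step amounts to enumerating the adjacent-unit joins each phoneme sequence would need and computing each join cost once. A minimal sketch under assumed names (`precompute_join_cache`, a caller-supplied `join_cost`):

```python
def precompute_join_cache(phoneme_sequences, join_cost):
    """Enumerate the joins each phoneme sequence would require and
    cache their costs for reuse at synthesis time."""
    cache = {}
    for seq in phoneme_sequences:
        for a, b in zip(seq, seq[1:]):
            if (a, b) not in cache:
                cache[(a, b)] = join_cost(a, b)
    return cache
```

Joins shared between sequences (here, the `("e", "l")` join) are computed only once, which is the point of building the cache offline.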
-
Publication number: 20090030670. Abstract: Embodiments of the present invention provide a method, system, and computer program product for real-time multi-lingual adaptation of manufacturing instructions in a manufacturing management system. In one embodiment of the invention, a manufacturing language adaptation method can be provided. The method can include identifying an operator receiving manufacturing instruction, determining a primary language preference for the operator, and determining whether or not the manufacturing instructions have been translated into the primary language preference. If it is determined that the manufacturing instructions have been translated into the primary language preference, the manufacturing instructions can be presented to the operator in the primary language preference. Otherwise, the manufacturing instructions can be submitted to a translation engine for translation into the primary language preference. Type: Application. Filed: July 25, 2007. Publication date: January 29, 2009. Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION. Inventors: Ivory W. Knipfer, John W. Marreel, Kay M. Momsen, Ryan T. Paske
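The check-then-translate flow is a simple cache-aside pattern over per-language translations. A sketch with hypothetical names (`instructions_for`, a `translations` dict keyed by language code with an assumed `"en"` source entry, and a caller-supplied `translate` engine):

```python
def instructions_for(operator, preferences, translations, translate):
    """Return manufacturing instructions in the operator's preferred
    language, submitting to the translation engine (and caching the
    result) only when no translation exists yet."""
    lang = preferences[operator]
    if lang not in translations:
        translations[lang] = translate(translations["en"], lang)
    return translations[lang]
```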
-
Publication number: 20090024385. Abstract: A method and an apparatus for semantic parsing of electronic text documents. The electronic text documents can comprise a plurality of sentences with several language components. The method comprises analyzing at least one sentence of the electronic text document and dynamically generating a graph from the analyzed sentence of the text document. The graph represents a semantic representation of the analyzed one or more sentences. The analysis continues until an ambiguous sentence is detected, which is then analyzed by evaluating at least a portion of the generated graph. Type: Application. Filed: July 16, 2007. Publication date: January 22, 2009. Applicant: SEMGINE, GMBH. Inventor: Martin Christian Hirsch
-
Publication number: 20090006096. Abstract: Described is a voice persona service by which users convert text into speech waveforms, based on user-provided parameters and voice data from a service data store. The service may be remotely accessed, such as via the Internet. The user may provide text tagged with parameters, with the text sent to a text-to-speech engine along with base or custom voice data, and the resulting waveform morphed based on the tags. The user may also provide speech. Once created, a voice persona corresponding to the speech waveform may be persisted, exchanged, made public, shared, and so forth. In one example, the voice persona service receives user input and parameters, and retrieves a base or custom voice that may be edited by the user via a morphing algorithm. The service outputs a waveform, such as a .wav file for embedding in a software program, and persists the voice persona corresponding to that waveform. Type: Application. Filed: June 27, 2007. Publication date: January 1, 2009. Applicant: Microsoft Corporation. Inventors: Yusheng Li, Min Chu, Xin Zou, Frank Kao-ping Soong
-
Publication number: 20080319755. Abstract: According to an aspect of an embodiment, an apparatus for converting text data into a sound signal comprises: a phoneme determiner for determining phoneme data corresponding to a plurality of phonemes and pause data corresponding to a plurality of pauses to be inserted among a series of phonemes in the text data to be converted into the sound signal; a phoneme length adjuster for modifying the phoneme data and the pause data by determining the lengths of the phonemes in accordance with a speed of the sound signal and selectively adjusting the length of at least one of the phonemes which is placed immediately after one of the pauses, so that this phoneme is relatively extended timewise as compared to the other phonemes; and an output unit for outputting the sound signal on the basis of the phoneme data and pause data adjusted by the phoneme length adjuster. Type: Application. Filed: June 24, 2008. Publication date: December 25, 2008. Applicant: FUJITSU LIMITED. Inventors: Rika Nishiike, Hitoshi Sasaki, Nobuyuki Katae, Kentaro Murase, Takuya Noda
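The adjuster's rule, scale all lengths by the speaking rate but stretch the phoneme that immediately follows a pause, can be sketched as a single pass over a phoneme/pause stream. The stream representation, the factor value, and the name `assign_phoneme_lengths` are illustrative assumptions:

```python
def assign_phoneme_lengths(items, base_length, rate, post_pause_factor=1.3):
    """Assign lengths (scaled by speaking rate) to a stream of
    ("phoneme", x) / ("pause", x) items, extending the phoneme that
    immediately follows a pause."""
    lengths = []
    after_pause = False
    for kind, _ in items:
        if kind == "pause":
            lengths.append(base_length / rate)
            after_pause = True
        else:
            factor = post_pause_factor if after_pause else 1.0
            lengths.append(base_length * factor / rate)
            after_pause = False
    return lengths
```

Only the first phoneme after each pause gets the extension; doubling the rate halves every length, matching the speed-dependent adjustment in the abstract.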