Systems Using Speech Synthesizers (epo) Patents (Class 704/E13.008)

Multilingual transcription at customer endpoint for optimizing interaction results in a contact center

Patent number: 11862169

Abstract: Providing speech-to-text (STT) transcription by a user endpoint device includes initiating an audio communication between an enterprise server and the user endpoint device, the audio communication comprising a voice interaction between a user associated with the user endpoint device and an agent associated with an agent device to which the enterprise server routes the audio communication; performing a first STT of at least a portion of the voice interaction to produce a first transcribed speech in a first language; concurrent with performing the first STT, performing, by the user endpoint device, a second STT of the at least the portion of the voice interaction to produce a second transcribed speech in a second language different than the first language, and transmitting the at least the portion of the voice interaction and at least the first transcribed speech from the user endpoint device to the enterprise server.

Type: Grant

Filed: September 11, 2020

Date of Patent: January 2, 2024

Assignee: Avaya Management L.P.

Inventors: Valentine C. Matula, Pushkar Yashavant Deole, Sandesh Chopdekar, Navin Daga
SCREEN READER WITH FOCUS-BASED SPEECH VERBOSITY

Publication number: 20140058733

Abstract: The amount of speech output to a blind or low-vision user using a screen reader application is automatically adjusted based on how the user navigates to a control in a graphic user interface. Navigation by mouse presumes the user has greater knowledge of the identity of the control than navigation by tab keystroke which is more indicative of a user searching for a control. In addition, accelerator keystrokes indicate a higher level of specificity to set focus on a control and thus less verbosity is required to sufficiently inform the screen reader user.

Type: Application

Filed: August 23, 2012

Publication date: February 27, 2014

Applicant: FREEDOM SCIENTIFIC, INC.

Inventors: Garald Lee Voorhees, Glen Gordon, Eric Damery
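The focus-based verbosity idea in the abstract above can be illustrated with a minimal Python sketch. The `Navigation` enum, the `FIELDS_BY_NAVIGATION` table, and the control dictionary keys are all hypothetical names chosen for illustration. The abstract says only that tab navigation warrants more speech than mouse navigation and that accelerator keystrokes need less verbosity, so the exact field sets are assumptions.

```python
from enum import Enum


class Navigation(Enum):
    MOUSE = "mouse"
    TAB = "tab"
    ACCELERATOR = "accelerator"


# Assumed mapping: which pieces of a control's description are spoken for
# each way of reaching it. Tab navigation (searching) gets the most detail.
FIELDS_BY_NAVIGATION = {
    Navigation.TAB: ("name", "role", "state", "hint"),
    Navigation.MOUSE: ("name", "role"),
    Navigation.ACCELERATOR: ("name",),
}


def announcement(control: dict, how: Navigation) -> str:
    """Build the text to speak for a control, given how focus arrived."""
    fields = FIELDS_BY_NAVIGATION[how]
    # Skip fields the control does not define or leaves empty.
    return ", ".join(control[f] for f in fields if control.get(f))
```

A screen reader following this sketch would pass the resulting string to its speech engine, so the same control produces shorter output when reached by an accelerator key than when reached by tabbing.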
Wearing State Based Device Operation

Publication number: 20130238340

Abstract: Methods and apparatuses for wearing state device operation are disclosed. In one example, a headset includes a sensor for detecting a headset donned state or a headset doffed state. The headset operation is modified based on whether the headset is donned or doffed.

Type: Application

Filed: March 9, 2012

Publication date: September 12, 2013

Applicant: Plantronics, Inc.

Inventor: Scott Walsh
NATURAL QUERY INTERFACE FOR CONNECTED CAR

Publication number: 20130030811

Abstract: Sensors within the vehicle monitor driver movement, such as face and head movement to ascertain the direction a driver is looking, and gestural movement to ascertain what the driver may be pointing at. This information is combined with video camera data taken of the external vehicle surroundings. The apparatus uses these data to assist the speech dialogue processor to disambiguate phrases uttered by the driver. The apparatus can issue informative responses or control vehicular functions based on queries automatically generated based on the disambiguated phrases.

Type: Application

Filed: July 29, 2011

Publication date: January 31, 2013

Applicant: PANASONIC CORPORATION

Inventors: Jules Olleon, Rohit Talati, David Kryze, Akihiko Sugiura
METHOD, APPARATUS FOR SYNTHESIZING SPEECH AND ACOUSTIC MODEL TRAINING METHOD FOR SPEECH SYNTHESIS

Publication number: 20120221339

Abstract: According to one embodiment, a method and apparatus for synthesizing speech, and a method for training an acoustic model used in speech synthesis, are provided. The method for synthesizing speech may include determining data generated by text analysis as fuzzy heteronym data, performing fuzzy heteronym prediction on the fuzzy heteronym data to output a plurality of candidate pronunciations of the fuzzy heteronym data and probabilities thereof, generating fuzzy context feature labels based on the plurality of candidate pronunciations and probabilities thereof, determining model parameters for the fuzzy context feature labels based on an acoustic model with a fuzzy decision tree, generating speech parameters from the model parameters, and synthesizing the speech parameters via a synthesizer as speech.

Type: Application

Filed: February 22, 2012

Publication date: August 30, 2012

Inventors: Xi Wang, Xiaoyan Lou, Jian Li
SPEECH TRANSLATION SYSTEM, DICTIONARY SERVER, AND PROGRAM

Publication number: 20120166176

Abstract: A conventional speech recognition dictionary, translation dictionary and speech synthesis dictionary used in speech translation have inconsistencies.

Type: Application

Filed: March 3, 2010

Publication date: June 28, 2012

Inventors: Satoshi Nakamura, Eiichiro Sumita, Yutaka Ashikari, Noriyuki Kimura, Chiori Hori
SPEECH OUTPUT WITH CONFIDENCE INDICATION

Publication number: 20110313762

Abstract: A method, system, and computer program product are provided for speech output with confidence indication. The method includes receiving a confidence score for segments of speech or text to be synthesized to speech. The method includes modifying a speech segment by altering one or more parameters of the speech proportionally to the confidence score.

Type: Application

Filed: June 20, 2010

Publication date: December 22, 2011

Applicant: International Business Machines Corporation

Inventors: Shay Ben-David, Ron Hoory
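As a rough illustration of altering a speech parameter in proportion to a confidence score, as the abstract above describes, the following Python sketch scales a single parameter linearly. The function name `scale_parameter`, the `[0, 1]` confidence range, and the `min_ratio` floor are assumptions made for this example and are not taken from the application.

```python
def scale_parameter(base: float, confidence: float, min_ratio: float = 0.5) -> float:
    """Scale a speech parameter (e.g. volume) linearly with confidence.

    Assumes confidence lies in [0, 1]. At confidence 1.0 the parameter is
    unchanged; at 0.0 it is reduced to min_ratio * base.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return base * (min_ratio + (1.0 - min_ratio) * confidence)
```

For instance, a segment with confidence 0.5 and a base volume of 2.0 would be rendered at 1.5 under these assumptions, which lets a listener hear that the segment is less certain.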
SPEECH ADAPTATION IN SPEECH SYNTHESIS

Publication number: 20110282668

Abstract: A method of and system for speech synthesis. First and second text inputs are received in a text-to-speech system, and processed into respective first and second speech outputs corresponding to stored speech respectively from first and second speakers using a processor of the system. The second speech output of the second speaker is adapted to sound like the first speech output of the first speaker.

Type: Application

Filed: May 14, 2010

Publication date: November 17, 2011

Applicant: GENERAL MOTORS LLC

Inventors: Jeffrey M. Stefan, Gaurav Talwar, Rathinavelu Chengalvarayan
SIGN LANGUAGE RECOGNITION SYSTEM AND METHOD

Publication number: 20110274311

Abstract: A sign language recognition method includes a camera capturing an image of a gesture from a signer, comparing the image of the gesture with a number of gestures to find out the meanings of the gesture, and displaying or vocalizing the meanings of the gestures.

Type: Application

Filed: August 8, 2010

Publication date: November 10, 2011

Applicant: HON HAI PRECISION INDUSTRY CO., LTD.

Inventors: HOU-HSIEN LEE, CHANG-JUNG LEE, CHIH-PING LO
VOICE SYNTHESIS DEVICE, NAVIGATION DEVICE HAVING THE SAME, AND METHOD FOR SYNTHESIZING VOICE MESSAGE

Publication number: 20110218809

Abstract: A voice synthesis device includes: a memory for storing a plurality of recorded voice data; a dividing unit for dividing a text into a plurality of words or phrases, wherein the text is to be converted into a voice message; a verifying unit for verifying whether one of the recorded voice data corresponding to each word or phrase is disposed in the memory; and a voice synthesizing unit for preparing a whole of the text with the recorded voice data when all of the recorded voice data corresponding to all of the plurality of words or phrases are disposed in the memory, and for preparing the whole of the text with rule-based synthesized voice data when at least one of the recorded voice data corresponding to one of the plurality of words or phrases is not disposed in the memory.

Type: Application

Filed: February 8, 2011

Publication date: September 8, 2011

Applicant: DENSO CORPORATION

Inventors: Ryuichi Suzuki, Takashi Ooi
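The all-or-nothing choice between recorded and rule-based voice data described in the abstract above can be sketched in Python as follows. Splitting the text on whitespace and representing the memory as a dictionary are simplifying assumptions, and `choose_synthesis` is a hypothetical name. The rule-based synthesizer itself is not modeled; the sketch only returns the text that would be handed to it.

```python
def choose_synthesis(text: str, recorded: dict) -> tuple:
    """Decide how to voice a text.

    Returns ("recorded", clips) only when every word has a recorded clip;
    otherwise returns ("rule-based", [text]) so the whole message is
    synthesized in one consistent voice.
    """
    words = text.split()  # assumed dividing rule: whitespace-separated words
    if all(word in recorded for word in words):
        return "recorded", [recorded[word] for word in words]
    return "rule-based", [text]
```

Falling back for the whole text, rather than only for the missing word, avoids mixing recorded and synthesized voices within a single message.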
User Profile Based Speech To Text Conversion For Visual Voice Mail

Publication number: 20110116610

Abstract: Messages in a message system are converted from one format to another format in accordance with preferred message formats and/or conditions. Message formats can include text messages, multimedia messages, visual voicemail messages, and/or other audio/visual messages. Based on conditions such as recipient device location or velocity and a preferred message format, a message can be converted into an appropriate transmission format and transmitted and/or communicated to the recipient in its appropriate format (e.g., text, multimedia, audio, etc.).

Type: Application

Filed: November 19, 2009

Publication date: May 19, 2011

Applicant: AT&T MOBILITY II LLC

Inventors: Venson Shaw, Robert Z. Evora
RICH CONTEXT MODELING FOR TEXT-TO-SPEECH ENGINES

Publication number: 20110054903

Abstract: Embodiments of rich text modeling for speech synthesis are disclosed. In operation, a text-to-speech engine refines a plurality of rich context models based on decision tree-tied Hidden Markov Models (HMMs) to produce a plurality of refined rich context models. The text-to-speech engine then generates synthesized speech for an input text based at least on some of the plurality of refined rich context models.

Type: Application

Filed: December 2, 2009

Publication date: March 3, 2011

Applicant: MICROSOFT CORPORATION

Inventors: Zhi-Jie Yan, Yao Qian, Frank Kao-Ping Soong
METHODS AND APPARATUS TO ENHANCE HEALTHCARE INFORMATION ANALYSES

Publication number: 20110029325

Abstract: Methods and apparatus to enhance healthcare information analyses are disclosed herein.

Type: Application

Filed: July 28, 2009

Publication date: February 3, 2011

Applicant: General Electric Company, a New York Corporation

Inventors: Emil Markov Georgiev, Erik Paul Kemper
SPEECH SYNTHESIS DEVICE, SPEECH SYNTHESIS METHOD, AND SPEECH SYNTHESIS PROGRAM

Publication number: 20100223058

Abstract: A speech synthesis device includes a pitch pattern generation unit (104) which generates a pitch pattern by combining, based on pitch pattern target data including phonemic information formed from at least syllables, phonemes, and words, a standard pattern which approximately expresses the rough shape of the pitch pattern and an original utterance pattern which expresses the pitch pattern of a recorded speech, a unit waveform selection unit (106) which selects unit waveform data based on the generated pitch pattern and upon selection, selects original utterance unit waveform data corresponding to the original utterance pattern in a section where the original utterance pattern is used, and a speech waveform generation unit (107) which generates a synthetic speech by editing the selected unit waveform data so as to reproduce prosody represented by the generated pitch pattern.

Type: Application

Filed: August 28, 2008

Publication date: September 2, 2010

Inventors: Yasuyuki Mitsui, Reishi Kondo
Speech Synthesizer

Publication number: 20100088089

Abstract: Synthesizing a set of digital speech samples corresponding to a selected voicing state includes dividing speech model parameters into frames, with a frame of speech model parameters including pitch information, voicing information determining the voicing state in one or more frequency regions, and spectral information. First and second digital filters are computed using, respectively, first and second frames of speech model parameters, with the frequency responses of the digital filters corresponding to the spectral information in frequency regions for which the voicing state equals the selected voicing state. A set of pulse locations are determined, and sets of first and second signal samples are produced using the pulse locations and, respectively, the first and second digital filters. Finally, the sets of first and second signal samples are combined to produce a set of digital speech samples corresponding to the selected voicing state.

Type: Application

Filed: August 21, 2009

Publication date: April 8, 2010

Applicant: DIGITAL VOICE SYSTEMS, INC.

Inventor: John C. Hardwick
SPEECH AND TEXT DRIVEN HMM-BASED BODY ANIMATION SYNTHESIS

Publication number: 20100082345

Abstract: An “Animation Synthesizer” uses trainable probabilistic models, such as Hidden Markov Models (HMM), Artificial Neural Networks (ANN), etc., to provide speech and text driven body animation synthesis. Probabilistic models are trained using synchronized motion and speech inputs (e.g., live or recorded audio/video feeds) at various speech levels, such as sentences, phrases, words, phonemes, sub-phonemes, etc., depending upon the available data, and the motion type or body part being modeled. The Animation Synthesizer then uses the trainable probabilistic model for selecting animation trajectories for one or more different body parts (e.g., face, head, hands, arms, etc.) based on an arbitrary text and/or speech input. These animation trajectories are then used to synthesize a sequence of animations for digital avatars, cartoon characters, computer generated anthropomorphic persons or creatures, actual motions for physical robots, etc.

Type: Application

Filed: September 26, 2008

Publication date: April 1, 2010

Applicant: MICROSOFT CORPORATION

Inventors: Lijuan Wang, Lei Ma, Frank Kao-Ping Soong
METHOD AND SYSTEM FOR PROVIDING SYNTHESIZED SPEECH

Publication number: 20100082350

Abstract: An approach is provided for the efficient use of speech synthesis in rendering text content as audio in a communications network. The communications network can include a telephony network and a data network in support of, for example, Voice over Internet Protocol (VoIP) services. A speech synthesis system receives a text string from either a telephony network or a data network. The speech synthesis system determines whether a rendered audio file of the text string is stored in a database and, if the rendered audio is determined not to exist, renders the text string to output the rendered audio file. The rendered audio file is stored in the database for re-use according to a hash value generated by the speech synthesis system based on the text string.

Type: Application

Filed: December 8, 2009

Publication date: April 1, 2010

Applicant: VERIZON BUSINESS GLOBAL LLC

Inventors: Paul T. Schultz, Robert A. Sartini
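The hash-keyed re-use of rendered audio described in the abstract above resembles a simple cache, sketched below in Python. The class name `AudioCache`, the in-memory dictionary standing in for the database, the choice of SHA-256, and the caller-supplied `render` function are all assumptions for illustration. The abstract does not specify a hash function or storage layout.

```python
import hashlib
from typing import Callable, Dict


class AudioCache:
    """Render each distinct text string once and re-use the stored audio."""

    def __init__(self, render: Callable[[str], bytes]):
        self._render = render              # hypothetical text-to-audio function
        self._store: Dict[str, bytes] = {}  # stands in for the database
        self.render_calls = 0

    def get(self, text: str) -> bytes:
        # The key is a hash of the text string (SHA-256 is an assumed choice).
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._render(text)
            self.render_calls += 1
        return self._store[key]
```

Repeated prompts, such as a fixed greeting, are then synthesized only on the first request and served from storage afterwards.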
SYSTEM AND METHOD FOR AUDIBLY PRESENTING SELECTED TEXT

Publication number: 20100070281

Abstract: Disclosed herein are methods for presenting speech from a selected text that is on a computing device. This method includes presenting text on a touch-sensitive display and having that text size within a threshold level so that the computing device can accurately determine the intent of the user when the user touches the touch screen. Once the user touch has been received, the computing device identifies and interprets the portion of text that is to be selected, and subsequently presents the text audibly to the user.

Type: Application

Filed: October 24, 2008

Publication date: March 18, 2010

Applicant: AT&T Intellectual Property I, L.P.

Inventors: Alistair D. CONKIE, Horst Schroeter
VARIABLE TEXT-TO-SPEECH FOR AUTOMOTIVE APPLICATION

Publication number: 20100057465

Abstract: A text-to-speech (TTS) system implemented in an automotive vehicle is dynamically tuned to improve intelligibility over a wide variety of vehicle operating states and environmental conditions. In one embodiment of the present invention, a TTS system is interfaced to one or more vehicle sensors to measure parameters including vehicle speed, interior noise, visibility conditions, and road roughness, among others. In response to measurements of these operating parameters, TTS voice volume, pitch, and speed, among other parameters, may be tuned in order to improve intelligibility of the TTS voice system and increase its effectiveness for the operator of the vehicle.

Type: Application

Filed: September 3, 2008

Publication date: March 4, 2010

Inventors: DAVID MICHAEL KIRSCH, Ritchie Winson Huang
Automatic Creation of Audio Files

Publication number: 20100042411

Abstract: A method of building an audio description of a particular product of a class of products includes providing a plurality of human voice recordings, wherein each of the human voice recordings includes audio corresponding to an attribute value common to many of the products. The method also includes automatically obtaining attribute values of the particular product, wherein the attribute values reside electronically. The method also includes automatically applying a plurality of rules for selecting a subset of the human voice recordings that correspond to the obtained attribute values and automatically stitching the selected subset of human voice recordings together to provide a voiceover product description of the particular product. A similar method is used to build an audio description of a particular process.

Type: Application

Filed: August 15, 2008

Publication date: February 18, 2010

Inventors: Jamie M. Addessi, Mark Paul Bonfigli, Richard F. Gibbs, JR., Christopher Nathaniel Scott
ANNOTATING PHONEMES AND ACCENTS FOR TEXT-TO-SPEECH SYSTEM

Publication number: 20100030561

Abstract: A system that outputs phonemes and accents of texts. The system has a storage section storing a first corpus in which spellings, phonemes, and accents of a text input beforehand are recorded separately for individual segmentations of the words that are contained in the text. A text for which phonemes and accents are to be output is acquired and the first corpus is searched to retrieve at least one set of spellings that match the spellings in the text from among sets of contiguous spellings. Then, the combination of a phoneme and an accent that has a higher probability of occurrence in the first corpus than a predetermined reference probability is selected as the phonemes and accent of the text.

Type: Application

Filed: August 3, 2009

Publication date: February 4, 2010

Applicant: Nuance Communications, Inc.

Inventors: Shinsuke Mori, Toru Nagano, Masafumi Nishimura
Multilingual text-to-speech system

Publication number: 20090313023

Abstract: The invention converts raw data in a base language (e.g. English) into conversational formatted messages in multiple languages. The process converts input data rows into related sequences to a set of prerecorded audio phrase files. The sequences reference both recorded phrases of input data components and user-created text phrases inserted before and after the input data. When the audio sequences are played in sequence, a coherent conversational message in the language of the caller results. An IVR server responding to a caller's menu selection uses the invention's output data to generate the coherent response. Two embodiments are presented: a simple embodiment that responds to messages, and a more complex embodiment that converts enterprise demographic and member-event data collected over a period into audio sentences played in response to a menu item selection by a caller in the caller's language.

Type: Application

Filed: June 15, 2009

Publication date: December 17, 2009

Inventor: Ralph Jones
Method and system for providing speech synthesis on user terminals over a communications network

Publication number: 20090306986

Abstract: Service architecture for providing to a user terminal of a communications network textual information and relative speech synthesis, the user terminal being provided with a speech synthesis engine and a basic database of speech waveforms includes: a content server for downloading textual information requested by means of a browser application on the user terminal; a context manager for extracting context information from the textual information requested by the user terminal; a context selector for selecting an incremental database of speech waveforms associated with extracted context information and for downloading the incremental database into the user terminal; a database manager on the user terminal for managing the composition of an enlarged database of speech waveforms for the speech synthesis engine including the basic and the incremental databases of speech waveforms.

Type: Application

Filed: May 31, 2005

Publication date: December 10, 2009

Inventors: Alessio Cervone, Ivano Salvatore Collotta, Paolo Coppo, Donato Ettorre, Maurizio Fodrini, Maura Turolla
METHOD AND SYSTEM FOR SPEECH SYNTHESIS

Publication number: 20090299746

Abstract: A method for performing speech synthesis to a textual content at a client. The method includes the steps of: performing speech synthesis to the textual content based on a current acoustical unit set Scurrent in a corpus at the client; analyzing the textual content and generating a list of target units with corresponding context features, selecting multiple acoustical unit candidates for each target unit according to the context features based on an acoustical unit set Stotal that is more plentiful than the current acoustical unit set Scurrent in the corpus at the client, and determining acoustical units suitable for speech synthesis for the textual content according to the multiple unit candidates; and updating the current acoustical unit set Scurrent in the corpus at the client based on the determined acoustical units.

Type: Application

Filed: May 27, 2009

Publication date: December 3, 2009

Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhiwei Shuang
Natural-language processing system and dictionary registration system

Publication number: 20090281786

Abstract: A natural-language processing system (10) includes a registration-candidate storage section (32) that stores therein registration-candidate dictionary data, a judgment means (22) that compares input data against the registration-candidate dictionary data to thereby judge whether or not the input data includes a word corresponding to the registration-candidate dictionary data, an inquiry means (23) that inquires to a user whether or not corresponding dictionary data is to be registered in a dictionary storage section (31) to accept a user's instruction if it is judged that a corresponding word exists, a dictionary registration means (24) that registers the corresponding dictionary data in the dictionary storage section based on the input instruction, and a natural language processing means (25) that executes a natural-language processing onto the input data by using the dictionary data registered in the dictionary storage section.

Type: Application

Filed: September 6, 2007

Publication date: November 12, 2009

Inventors: Shinichi Ando, Kunihiko Sadamasa, Shinichi Doi
VOICE SYNTHESIS DEVICE

Publication number: 20090234652

Abstract: The voice synthesis device includes: an emotion input unit (202) which obtains an utterance mode of a voice waveform for which voice synthesis is to be performed; a prosody generation unit (205) which generates a prosody which is used when a language-processed text is uttered in the obtained utterance mode; a characteristic tone selection unit (203) which selects a characteristic tone based on the utterance mode, the characteristic tone being observed when the text is uttered in the obtained utterance mode; a characteristic tone temporal position estimation unit (604) which (i) judges whether or not each of phonemes included in a phonologic sequence of the text is to be uttered with the characteristic tone, based on the phonologic sequence, the characteristic tone, and the prosody, and (ii) decides a phoneme which is an utterance position where the text is uttered with the characteristic tone; and an element selection unit (606) and an element connection unit (209) which generates the voice waveform based on the p

Type: Application

Filed: May 2, 2006

Publication date: September 17, 2009

Inventors: Yumiko Kato, Takahiro Kamai
SENTENCE READING ALOUD APPARATUS, CONTROL METHOD FOR CONTROLLING THE SAME, AND CONTROL PROGRAM FOR CONTROLLING THE SAME

Publication number: 20090222269

Abstract: An apparatus for voice synthesis includes: a word database for storing words and voices; a syllable database for storing syllables and voices; a processor for executing a process including: extracting a word from a document, generating a voice signal based on the extracted voice when the extracted word is included in the word database, and synthesizing a voice signal based on the extracted voice associated with the one or more syllables corresponding to the extracted word when the extracted word is not found in the word database; a speaker for producing a voice based on either of the generated and the synthesized voice signal; and a display for selectively displaying the extracted word when the voice based on the synthesized voice signal is produced by the speaker.

Type: Application

Filed: May 11, 2009

Publication date: September 3, 2009

Inventor: Shinichiro MORI
SYSTEM AND METHOD OF PROVIDING MACHINE TRANSLATION FROM A SOURCE LANGUAGE TO A TARGET LANGUAGE

Publication number: 20090192781

Abstract: A machine translation method, system for using the method, and computer readable media are disclosed. The method includes the steps of receiving a source language sentence, selecting a set of target language n-grams using a lexical classifier and based on the source language sentence. When selecting the set of target language n-grams, in at least one n-gram, n is greater than 1. The method continues by combining the selected set of target language n-grams as a finite state acceptor (FSA), weighting the FSA with data from the lexical classifier, and generating an n-best list of target sentences from the FSA. As an alternate to using the FSA, N strings may be generated from the n-grams and ranked using a language model. The N strings may be represented by an FSA for efficiency but it is not necessary.

Type: Application

Filed: January 30, 2008

Publication date: July 30, 2009

Applicant: AT&T Labs

Inventors: Srinivas BANGALORE, Emil Ettelaie
SPEECH PROCESSING APPARATUS AND PROGRAM

Publication number: 20090177474

Abstract: A speech synthesizer includes a periodic component fusing unit and an aperiodic component fusing unit, and fuses periodic components and aperiodic components of a plurality of speech units for each segment, which are selected by a unit selector, by a periodic component fusing unit and an aperiodic component fusing unit, respectively. The speech synthesizer is further provided with an adder, so that the adder adds, edits, and concatenates the periodic components and the aperiodic components of the fused speech units to generate a speech waveform.

Type: Application

Filed: September 18, 2008

Publication date: July 9, 2009

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Masahiro Morita, Takehiko Kagoshima
METHOD AND APPARATUS FOR CREATING AND MODIFYING NAVIGATION VOICE SYNTAX

Publication number: 20090171665

Abstract: Techniques are described for enabling flexible and dynamic creation and/or modification of voice data for a position-determining device. In some embodiments, a voice package is provided that includes a language database and a plurality of audio files. The language database specifies appropriate syntax and vocabulary for information that is intended for audio output by a position-determining device. The audio files include words and/or phrases that may be accessed by the position-determining device to communicate the information via audible output. Some embodiments utilize a voice package toolkit to construct and/or customize one or more parts of a voice package.

Type: Application

Filed: December 18, 2008

Publication date: July 2, 2009

Applicant: GARMIN LTD.

Inventors: Scott D. Hammerschmidt, Jacob W. Caire, Michael P. Russell, David W. Wiskur, Scott J. Brunk
Recursive Adaptive Interaction Management System

Publication number: 20090171668

Abstract: A management system for guiding an agent in a media-specific dialogue has a conversion engine for instantiating ongoing dialogue as machine-readable text, if the dialogue is in voice media, a context analysis engine for determining facts from the text, a rules engine for asserting rules based on fact input, and a presentation engine for presenting information to the agent to guide the agent in the dialogue. The context analysis engine passes determined facts to the rules engine, which selects and asserts to the presentation engine rules based on the facts, and the presentation engine provides periodically updated guidance to the agent based on the rules asserted.

Type: Application

Filed: December 28, 2007

Publication date: July 2, 2009

Inventors: Dave Sneyders, Brian Galvin, S. Michael Perlmutter
METHOD AND APPARATUS FOR FAST SEARCH IN CALL-CENTER MONITORING

Publication number: 20090150152

Abstract: A method and apparatus for indexing one or more audio signals using a speech to text engine and a phoneme detection engine, and generating a combined lattice comprising a text part and a phoneme part. A word to be searched is searched for in the text part, and if it is not found, or is found with low certainty, it is divided into phonemes and searched for in the phoneme part of the lattice.

Type: Application

Filed: November 18, 2007

Publication date: June 11, 2009

Applicant: Nice Systems

Inventors: Moshe WASSERBLAT, Barak Eilam, Yuval Lubowich, Maor Nissan
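The two-stage search in the abstract above (text part first, phoneme part as a fallback) can be sketched in Python. This is a deliberately simplified model: the text part is a dictionary of word confidences and the phoneme part is a flat list, whereas a real lattice is a graph of alternatives. The names `find_word`, `contains_sequence`, the `lexicon` argument, and the `min_confidence` threshold are assumptions for illustration.

```python
from typing import Dict, List, Optional


def contains_sequence(haystack: List[str], needle: List[str]) -> bool:
    """Return True if needle occurs as a contiguous run inside haystack."""
    n = len(needle)
    return n > 0 and any(
        haystack[i:i + n] == needle for i in range(len(haystack) - n + 1)
    )


def find_word(
    word: str,
    text_part: Dict[str, float],
    phoneme_part: List[str],
    lexicon: Dict[str, List[str]],
    min_confidence: float = 0.6,
) -> Optional[str]:
    """Search the text part first; fall back to the phoneme part."""
    confidence = text_part.get(word)
    if confidence is not None and confidence >= min_confidence:
        return "text"
    # Not found, or found with low certainty: divide the word into phonemes.
    phonemes = lexicon.get(word)
    if phonemes and contains_sequence(phoneme_part, phonemes):
        return "phoneme"
    return None
```

The fallback lets a search succeed for words the speech-to-text engine transcribed with low certainty or missed entirely, at the cost of a less precise match.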
Systems and Methods of Performing Speech Recognition with Barge-In for use in a Bluetooth System

Publication number: 20090132255

Abstract: Embodiments of the present invention improve methods of performing speech recognition with barge-in. In one embodiment, the present invention includes a speech recognition method comprising starting a synthesis of recorded speech, receiving a user speech input signal providing information regarding a user choice, detecting an initial portion of the user speech input signal, selectively altering the synthesis of recorded speech, and recognizing the user choice.

Type: Application

Filed: November 19, 2007

Publication date: May 21, 2009

Applicant: Sensory, Incorporated

Inventor: Younan Lu
VOICE EDITION DEVICE, VOICE EDITION METHOD, AND VOICE EDITION PROGRAM

Publication number: 20090106027

Abstract: An object of the invention is to conveniently increase standard patterns registered in a voice recognition device to efficiently extend the amount of words that can be voice-recognized. New standard patterns are generated by modifying a part of an existing standard pattern. A pattern matching unit 16 of a modifying-part specifying unit 14 performs pattern matching process to specify a part to be modified in the existing standard pattern of a usage source. A standard pattern generating unit 18 generates the new standard patterns by cutting or deleting voice data of the modifying part of the usage-source standard pattern, substituting the voice data of the modifying part of the usage-source standard pattern for another voice data, or combining the voice data of the modifying part of the usage-source standard pattern with another voice data. A standard pattern database update unit 20 adds the new standard patterns to a standard pattern database 24.

Type: Application

Filed: May 25, 2006

Publication date: April 23, 2009

Applicant: Matsushita Electric Industrial Co., Ltd.

Inventors: Toshiyuki Teranishi, Kouji Hatano
SYSTEM AND METHOD FOR BLENDING SYNTHETIC VOICES

Publication number: 20090063153

Abstract: A system and method for generating a synthetic text-to-speech (TTS) voice are disclosed. A user is presented with at least one TTS voice and at least one voice characteristic. A new synthetic TTS voice is generated by blending a plurality of existing TTS voices according to the selected voice characteristics. The blending of voices involves interpolating segmented parameters of each TTS voice. Segmented parameters may be, for example, prosodic characteristics of the speech such as pitch, volume, phone durations, accents, stress, mis-pronunciations and emotion.

Type: Application

Filed: November 4, 2008

Publication date: March 5, 2009

Applicant: AT&T Corp.

Inventors: David A. Kapilow, Kenneth H. Rosen, Juergen Schroeter
DEVICES FOR USE BY DEAF AND/OR BLIND PEOPLE

Publication number: 20090055192

Abstract: A device for use by a deafblind person is disclosed. The device comprises a first key for manually inputting a series of words in the form of a code, a second key for manually inputting an action to be performed by the device, a third key for manually inputting a user preference, and a fourth key for manually inputting communication instructions. The device further has an internal processor programmed to carry out communication functions and search and guide functions. The device has various safety and security functions for pedestrians or persons in transit. In a preferred embodiment, the device comprises an electronic cane known as an eCane. Also disclosed is a system for allowing a deafblind person to enjoy television programs.

Type: Application

Filed: November 3, 2008

Publication date: February 26, 2009

Inventor: Raanan Liebermann
SPEECH-TO-SPEECH GENERATION SYSTEM AND METHOD

Publication number: 20080312920

Abstract: An expressive speech-to-speech generation system generates expressive speech output by using expressive parameters extracted from the original speech signal to drive a standard TTS system. The system comprises: speech recognition means, machine translation means, text-to-speech generation means, expressive parameter detection means for extracting expressive parameters from the speech of language A, and expressive parameter mapping means for mapping the expressive parameters extracted by the expressive parameter detection means from language A to language B and driving the text-to-speech generation means with the mapping results to synthesize expressive speech.
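
The data flow among the five components listed in this abstract can be sketched as a simple composition of functions. This is a hedged outline only: each stage is passed in as a callable, and the names `speech_to_speech`, `recognize`, `translate`, `detect_expression`, `map_expression`, and `synthesize` are hypothetical placeholders rather than anything defined in the application.

```python
from typing import Callable, Dict

# Hypothetical representation of expressive parameters (e.g. pitch, energy).
Params = Dict[str, float]


def speech_to_speech(
    audio_a: bytes,
    recognize: Callable[[bytes], str],
    translate: Callable[[str], str],
    detect_expression: Callable[[bytes], Params],
    map_expression: Callable[[Params], Params],
    synthesize: Callable[[str, Params], bytes],
) -> bytes:
    """Turn speech in language A into expressive speech in language B."""
    text_a = recognize(audio_a)            # speech recognition
    text_b = translate(text_a)             # machine translation
    params_a = detect_expression(audio_a)  # expressive parameters of language A
    params_b = map_expression(params_a)    # mapped to language B
    return synthesize(text_b, params_b)    # TTS driven by the mapped parameters
```

Note that the expressive parameters are taken from the original audio, not from the recognized text, which is what lets the output carry over how something was said as well as what was said.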

Type: Application

Filed: August 23, 2008

Publication date: December 18, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Shen Liqin, Shi Qin, Donald T. Tang, Zhang Wei
Autonomously moving robot management system

Patent number: 7467026

Abstract: An autonomous robot is controlled by a local robot information controller connected to a robot application network, to which a transceiver that communicates with the autonomous robot is attached. The robot application network, a user LAN adaptive controller, an information distribution manager, and a third party information provider subsystem are linked with a public network. The information distribution manager acquires information from the third party information provider subsystem on a schedule set by the user LAN adaptive controller. The local robot information controller receives the information from the information distribution manager and converts it into data that generates robot gestures. The robot performs actions in accordance with the gesture data received from the local robot information controller.

Type: Grant

Filed: August 13, 2004

Date of Patent: December 16, 2008

Assignee: Honda Motor Co. Ltd.

Inventors: Yoshiaki Sakagami, Shinichi Matsunaga, Naoaki Sumida
COARTICULATION METHOD FOR AUDIO-VISUAL TEXT-TO-SPEECH SYNTHESIS

Publication number: 20080221904

Abstract: A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. The processor reads first data comprising one or more parameters associated with noise-producing orifice images of sequences of at least three concatenated phonemes which correspond to an input stimulus. The processor reads, based on the first data, second data comprising images of a noise-producing entity. The processor generates an animated sequence of the noise-producing entity.

Type: Application

Filed: May 19, 2008

Publication date: September 11, 2008

Applicant: AT&T Corp.

Inventors: Eric Cosatto, Hans Peter Graf, Juergen Schroeter
Method of representing information

Publication number: 20080126099

Abstract: A method of representing information to a person comprising displaying an image viewable by a person, the image comprising visual markers representative of portions of a human body minimally necessary to communicate with the person, the visual markers, when viewed by the person, causing the person to extrapolate the human body, a remainder of the image being visually silent with respect to the person. The method is particularly applicable to represent information so as to be perceivable by a hearing-impaired person (e.g. deaf person) wherein a plurality of images, when displayed, one after another on a display device, represent information perceivable by the hearing-impaired person via sign language.

Type: Application

Filed: October 25, 2007

Publication date: May 29, 2008

Applicant: UNIVERSITE DE SHERBROOKE

Inventors: Denis Belisle, Johanne Deschenes
System and method for characterizing, synthesizing, and/or canceling out acoustic signals from inanimate sound sources

Publication number: 20080004861

Abstract: A system and method for characterizing, synthesizing, and/or canceling out acoustic signals from inanimate sound sources is disclosed. Propagating wave electromagnetic sensors monitor excitation sources in sound producing systems, such as machines, musical instruments, and various other structures. Acoustical output from these sound producing systems is also monitored. From such information, a transfer function characterizing the sound producing system is generated. From the transfer function, acoustical output from the sound producing system may be synthesized or canceled. The methods disclosed enable accurate calculation of matched transfer functions relating specific excitations to specific acoustical outputs. Knowledge of such signals and functions can be used to effect various sound replication, sound source identification, and sound cancellation applications.
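
The idea of deriving a transfer function from a monitored excitation and the resulting acoustic output, then using it to synthesize or cancel sound, can be sketched with a small frequency-domain example. The sketch rests on stated assumptions: signals are short, equal-length sample lists, the system is treated as linear with periodic (circular) behavior, and the function names are hypothetical. It is not the method claimed in the application.

```python
import cmath
from typing import List


def dft(x: List[complex]) -> List[complex]:
    """Naive discrete Fourier transform (fine for short illustrative signals)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum: List[complex]) -> List[complex]:
    """Inverse of dft."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def estimate_transfer(excitation: List[float], output: List[float],
                      eps: float = 1e-12) -> List[complex]:
    """Estimate H[k] = Y[k] / X[k]; bins with no excitation energy are set to 0."""
    X, Y = dft(excitation), dft(output)
    return [y / x if abs(x) > eps else 0j for x, y in zip(X, Y)]


def synthesize(H: List[complex], excitation: List[float]) -> List[float]:
    """Predict the acoustic output for a new excitation of the same length."""
    X = dft(excitation)
    return [v.real for v in idft([h * x for h, x in zip(H, X)])]


def cancellation_signal(H: List[complex], excitation: List[float]) -> List[float]:
    """Sign-inverted prediction; adding it to the real output would cancel it."""
    return [-v for v in synthesize(H, excitation)]
```

The same estimated `H` serves both uses: applied to a new excitation it predicts the sound, and the negated prediction is the signal that would cancel that sound.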

Type: Application

Filed: September 6, 2007

Publication date: January 3, 2008

Inventors: John Holzrichter, Greg Burnett, Lawrence Ng
Voice-controlled navigation device utilizing wireless data transmission for obtaining maps and real-time overlay information

Patent number: RE45262

Abstract: A navigation system and method involving wireless communications technology and speech processing technology is presented. In accordance with an embodiment of the invention, the navigation system includes a subscriber unit communicating with a service provider. The subscriber unit includes a global positioning system mechanism to determine subscriber position information and a speech processing mechanism to receive destination information spoken by a subscriber. The subscriber unit transmits the subscriber position and destination information to the service provider, which gathers navigation information, including a map and a route from the subscriber position to the specified destination. The service provider transmits the navigation information to the subscriber unit. The subscriber unit conveys the received navigation information to the subscriber via an output mechanism, such as a speech synthesis unit or a graphical display.
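
The exchange between subscriber unit and service provider described in this abstract can be outlined as a request and a response. The sketch below is illustrative only: the class and field names (`NavigationRequest`, `NavigationInfo`, `SubscriberUnit`, `map_id`, `route`) are hypothetical, and the GPS reader, speech recognizer, wireless link, and output mechanism are injected as plain callables rather than modeled in any detail.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Position = Tuple[float, float]  # assumed (latitude, longitude)


@dataclass
class NavigationRequest:
    position: Position   # from the positioning mechanism
    destination: str     # recognized from the subscriber's speech


@dataclass
class NavigationInfo:
    map_id: str          # hypothetical handle for the returned map
    route: List[str]     # route instructions, one per step


class SubscriberUnit:
    def __init__(
        self,
        read_gps: Callable[[], Position],
        recognize_speech: Callable[[bytes], str],
        send_to_provider: Callable[[NavigationRequest], NavigationInfo],
        output: Callable[[str], None],
    ) -> None:
        self.read_gps = read_gps
        self.recognize_speech = recognize_speech
        self.send_to_provider = send_to_provider
        self.output = output

    def navigate(self, spoken_audio: bytes) -> NavigationInfo:
        """Send position and spoken destination, then convey the returned route."""
        request = NavigationRequest(
            position=self.read_gps(),
            destination=self.recognize_speech(spoken_audio),
        )
        info = self.send_to_provider(request)
        for step in info.route:
            # The output callable could be a speech synthesizer or a display.
            self.output(step)
        return info
```

Passing the output mechanism in as a callable mirrors the abstract's point that the route may be conveyed either by speech synthesis or on a graphical display.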

Type: Grant

Filed: December 2, 2004

Date of Patent: November 25, 2014

Assignee: Intel Corporation

Inventor: Christopher R. Wiener