Systems Using Speech Synthesizers (epo) Patents (Class 704/E13.008)
  • Patent number: 11862169
    Abstract: Providing speech-to-text (STT) transcription by a user endpoint device includes initiating an audio communication between an enterprise server and the user endpoint device, the audio communication comprising a voice interaction between a user associated with the user endpoint device and an agent associated with an agent device to which the enterprise server routes the audio communication; performing a first STT of at least a portion of the voice interaction to produce a first transcribed speech in a first language; concurrent with performing the first STT, performing, by the user endpoint device, a second STT of the at least the portion of the voice interaction to produce a second transcribed speech in a second language different than the first language, and transmitting the at least the portion of the voice interaction and at least the first transcribed speech from the user endpoint device to the enterprise server.
    Type: Grant
    Filed: September 11, 2020
    Date of Patent: January 2, 2024
    Assignee: Avaya Management L.P.
    Inventors: Valentine C. Matula, Pushkar Yashavant Deole, Sandesh Chopdekar, Navin Daga
  • Publication number: 20140058733
    Abstract: The amount of speech output to a blind or low-vision user using a screen reader application is automatically adjusted based on how the user navigates to a control in a graphic user interface. Navigation by mouse presumes the user has greater knowledge of the identity of the control than navigation by tab keystroke which is more indicative of a user searching for a control. In addition, accelerator keystrokes indicate a higher level of specificity to set focus on a control and thus less verbosity is required to sufficiently inform the screen reader user.
    Type: Application
    Filed: August 23, 2012
    Publication date: February 27, 2014
    Applicant: FREEDOM SCIENTIFIC, INC.
    Inventors: Garald Lee Voorhees, Glen Gordon, Eric Damery
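
The verbosity idea in the abstract above can be illustrated with a minimal sketch. It assumes three navigation methods and three levels of detail; the names `Navigation`, `VERBOSITY`, and `announcement` are hypothetical, and the relative ranking of mouse and accelerator navigation is an assumption, since the abstract only says that both need less speech than tab navigation.

```python
from enum import Enum


class Navigation(Enum):
    TAB = "tab"                  # user is likely searching for a control
    MOUSE = "mouse"              # user likely already knows the control
    ACCELERATOR = "accelerator"  # user targeted the control directly


# Assumed mapping from navigation method to number of spoken parts.
VERBOSITY = {Navigation.TAB: 3, Navigation.MOUSE: 2, Navigation.ACCELERATOR: 1}


def announcement(name, control_type, hint, navigation):
    """Build the text a screen reader would speak for a focused control."""
    parts = [name, control_type, hint]
    return ", ".join(parts[:VERBOSITY[navigation]])


print(announcement("Save", "button", "press Enter to activate", Navigation.TAB))
print(announcement("Save", "button", "press Enter to activate", Navigation.ACCELERATOR))
```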
  • Publication number: 20130238340
    Abstract: Methods and apparatuses for wearing state device operation are disclosed. In one example, a headset includes a sensor for detecting a headset donned state or a headset doffed state. The headset operation is modified based on whether the headset is donned or doffed.
    Type: Application
    Filed: March 9, 2012
    Publication date: September 12, 2013
    Applicant: Plantronics, Inc.
    Inventor: Scott Walsh
  • Publication number: 20130030811
    Abstract: Sensors within the vehicle monitor driver movement, such as face and head movement to ascertain the direction a driver is looking, and gestural movement to ascertain what the driver may be pointing at. This information is combined with video camera data taken of the external vehicle surroundings. The apparatus uses these data to assist the speech dialogue processor to disambiguate phrases uttered by the driver. The apparatus can issue informative responses or control vehicular functions based on queries automatically generated based on the disambiguated phrases.
    Type: Application
    Filed: July 29, 2011
    Publication date: January 31, 2013
    Applicant: PANASONIC CORPORATION
    Inventors: Jules Olleon, Rohit Talati, David Kryze, Akihiko Sugiura
  • Publication number: 20120221339
    Abstract: According to one embodiment, a method and apparatus for synthesizing speech, and a method for training an acoustic model used in speech synthesis, are provided. The method for synthesizing speech may include determining data generated by text analysis as fuzzy heteronym data, performing fuzzy heteronym prediction on the fuzzy heteronym data to output a plurality of candidate pronunciations of the fuzzy heteronym data and probabilities thereof, generating fuzzy context feature labels based on the plurality of candidate pronunciations and probabilities thereof, determining model parameters for the fuzzy context feature labels based on an acoustic model with a fuzzy decision tree, generating speech parameters from the model parameters, and synthesizing the speech parameters via a synthesizer as speech.
    Type: Application
    Filed: February 22, 2012
    Publication date: August 30, 2012
    Inventors: Xi Wang, Xiaoyan Lou, Jian Li
  • Publication number: 20120166176
    Abstract: A conventional speech recognition dictionary, translation dictionary and speech synthesis dictionary used in speech translation have inconsistencies.
    Type: Application
    Filed: March 3, 2010
    Publication date: June 28, 2012
    Inventors: Satoshi Nakamura, Eiichiro Sumita, Yutaka Ashikari, Noriyuki Kimura, Chiori Hori
  • Publication number: 20110313762
    Abstract: A method, system, and computer program product are provided for speech output with confidence indication. The method includes receiving a confidence score for segments of speech or text to be synthesized to speech. The method includes modifying a speech segment by altering one or more parameters of the speech proportionally to the confidence score.
    Type: Application
    Filed: June 20, 2010
    Publication date: December 22, 2011
    Applicant: International Business Machines Corporation
    Inventors: Shay Ben-David, Ron Hoory
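
As an illustration of the abstract above, the following minimal sketch scales one speech parameter (volume) linearly with a confidence score. The `Segment` type, the 0.0 to 1.0 score range, and the `min_factor` floor are assumptions made for the example; the abstract does not say which parameters are altered or how.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    confidence: float  # assumed range: 0.0 (unsure) to 1.0 (certain)


def scale_volume(segment, base_volume=1.0, min_factor=0.5):
    """Return a volume that grows linearly with the segment's confidence."""
    c = min(max(segment.confidence, 0.0), 1.0)  # clamp to the assumed range
    return base_volume * (min_factor + (1.0 - min_factor) * c)


for seg in (Segment("flight 212", 1.0), Segment("gate B7", 0.5)):
    print(seg.text, scale_volume(seg))
```

Low-confidence segments are spoken more quietly here; pitch or speaking rate could be varied the same way.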
  • Publication number: 20110282668
    Abstract: A method of and system for speech synthesis. First and second text inputs are received in a text-to-speech system, and processed into respective first and second speech outputs corresponding to stored speech respectively from first and second speakers using a processor of the system. The second speech output of the second speaker is adapted to sound like the first speech output of the first speaker.
    Type: Application
    Filed: May 14, 2010
    Publication date: November 17, 2011
    Applicant: GENERAL MOTORS LLC
    Inventors: Jeffrey M. Stefan, Gaurav Talwar, Rathinavelu Chengalvarayan
  • Publication number: 20110274311
    Abstract: A sign language recognition method includes a camera capturing an image of a gesture from a signer, comparing the image of the gesture with a number of gestures to find out the meanings of the gesture, and displaying or vocalizing the meanings of the gestures.
    Type: Application
    Filed: August 8, 2010
    Publication date: November 10, 2011
    Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
    Inventors: HOU-HSIEN LEE, CHANG-JUNG LEE, CHIH-PING LO
  • Publication number: 20110218809
    Abstract: A voice synthesis device includes: a memory for storing a plurality of recorded voice data; a dividing unit for dividing a text into a plurality of words or phrases, wherein the text is to be converted into a voice message; a verifying unit for verifying whether one of the recorded voice data corresponding to each word or phrase is disposed in the memory; and a voice synthesizing unit for preparing a whole of the text with the recorded voice data when all of the recorded voice data corresponding to all of the plurality of words or phrases are disposed in the memory, and for preparing the whole of the text with rule-based synthesized voice data when at least one of the recorded voice data corresponding to one of the plurality of words or phrases is not disposed in the memory.
    Type: Application
    Filed: February 8, 2011
    Publication date: September 8, 2011
    Applicant: DENSO CORPORATION
    Inventors: Ryuichi Suzuki, Takashi Ooi
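
The all-or-nothing choice described in the abstract above can be sketched as follows. This is an illustration only: `choose_synthesis`, the whitespace-based splitter, and the dictionary of recordings are hypothetical stand-ins for the dividing unit, verifying unit, and memory.

```python
def choose_synthesis(text, recorded, split=str.split):
    """Use recorded voice data only if every word or phrase has a recording.

    Returns a (mode, payload) pair: the recordings to play in order, or the
    whole text to hand to a rule-based synthesizer.
    """
    parts = split(text)
    if parts and all(part in recorded for part in parts):
        return "recorded", [recorded[part] for part in parts]
    # At least one part is missing: synthesize the whole text by rule.
    return "rule_based", [text]


recordings = {"turn": "turn.wav", "left": "left.wav"}
print(choose_synthesis("turn left", recordings))
print(choose_synthesis("turn right", recordings))
```

Falling back for the whole text, rather than only for the missing part, avoids mixing recorded and synthesized voices within one message.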
  • Publication number: 20110116610
    Abstract: Messages in a message system are converted from one format to another format in accordance with preferred message formats and/or conditions. Message formats can include text messages, multimedia messages, visual voicemail messages, and/or other audio/visual messages. Based on conditions such as recipient device location or velocity and a preferred message format a message can be converted into an appropriate transmission format and transmitted and/or communicated to the recipient in its appropriate format (e.g., text, multimedia, audio, etc.).
    Type: Application
    Filed: November 19, 2009
    Publication date: May 19, 2011
    Applicant: AT&T MOBILITY II LLC
    Inventors: Venson Shaw, Robert Z. Evora
  • Publication number: 20110054903
    Abstract: Embodiments of rich text modeling for speech synthesis are disclosed. In operation, a text-to-speech engine refines a plurality of rich context models based on decision tree-tied Hidden Markov Models (HMMs) to produce a plurality of refined rich context models. The text-to-speech engine then generates synthesized speech for an input text based at least on some of the plurality of refined rich context models.
    Type: Application
    Filed: December 2, 2009
    Publication date: March 3, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Zhi-Jie Yan, Yao Qian, Frank Kao-Ping Soong
  • Publication number: 20110029325
    Abstract: Methods and apparatus to enhance healthcare information analyses are disclosed herein.
    Type: Application
    Filed: July 28, 2009
    Publication date: February 3, 2011
    Applicant: General Electric Company, a New York Corporation
    Inventors: Emil Markov Georgiev, Erik Paul Kemper
  • Publication number: 20100223058
    Abstract: A speech synthesis device includes a pitch pattern generation unit (104) which generates a pitch pattern by combining, based on pitch pattern target data including phonemic information formed from at least syllables, phonemes, and words, a standard pattern which approximately expresses the rough shape of the pitch pattern and an original utterance pattern which expresses the pitch pattern of a recorded speech, a unit waveform selection unit (106) which selects unit waveform data based on the generated pitch pattern and upon selection, selects original utterance unit waveform data corresponding to the original utterance pattern in a section where the original utterance pattern is used, and a speech waveform generation unit (107) which generates a synthetic speech by editing the selected unit waveform data so as to reproduce prosody represented by the generated pitch pattern.
    Type: Application
    Filed: August 28, 2008
    Publication date: September 2, 2010
    Inventors: Yasuyuki Mitsui, Reishi Kondo
  • Publication number: 20100088089
    Abstract: Synthesizing a set of digital speech samples corresponding to a selected voicing state includes dividing speech model parameters into frames, with a frame of speech model parameters including pitch information, voicing information determining the voicing state in one or more frequency regions, and spectral information. First and second digital filters are computed using, respectively, first and second frames of speech model parameters, with the frequency responses of the digital filters corresponding to the spectral information in frequency regions for which the voicing state equals the selected voicing state. A set of pulse locations are determined, and sets of first and second signal samples are produced using the pulse locations and, respectively, the first and second digital filters. Finally, the sets of first and second signal samples are combined to produce a set of digital speech samples corresponding to the selected voicing state.
    Type: Application
    Filed: August 21, 2009
    Publication date: April 8, 2010
    Applicant: DIGITAL VOICE SYSTEMS, INC.
    Inventor: John C. Hardwick
  • Publication number: 20100082345
    Abstract: An “Animation Synthesizer” uses trainable probabilistic models, such as Hidden Markov Models (HMM), Artificial Neural Networks (ANN), etc., to provide speech and text driven body animation synthesis. Probabilistic models are trained using synchronized motion and speech inputs (e.g., live or recorded audio/video feeds) at various speech levels, such as sentences, phrases, words, phonemes, sub-phonemes, etc., depending upon the available data, and the motion type or body part being modeled. The Animation Synthesizer then uses the trainable probabilistic model for selecting animation trajectories for one or more different body parts (e.g., face, head, hands, arms, etc.) based on an arbitrary text and/or speech input. These animation trajectories are then used to synthesize a sequence of animations for digital avatars, cartoon characters, computer generated anthropomorphic persons or creatures, actual motions for physical robots, etc.
    Type: Application
    Filed: September 26, 2008
    Publication date: April 1, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Lijuan Wang, Lei Ma, Frank Kao-Ping Soong
  • Publication number: 20100082350
    Abstract: An approach providing the efficient use of speech synthesis in rendering text content as audio in a communications network. The communications network can include a telephony network and a data network in support of, for example, Voice over Internet Protocol (VoIP) services. A speech synthesis system receives a text string from either a telephony network, or a data network. The speech synthesis system determines whether a rendered audio file of the text string is stored in a database and to render the text string to output the rendered audio file, if the rendered audio is determined not to exist. The rendered audio file is stored in the database for re-use according to a hash value generated by the speech synthesis system based on the text string.
    Type: Application
    Filed: December 8, 2009
    Publication date: April 1, 2010
    Applicant: VERIZON BUSINESS GLOBAL LLC
    Inventors: Paul T. Schultz, Robert A. Sartini
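
The caching scheme in the abstract above can be sketched as a lookup keyed by a hash of the text string. This is a minimal illustration under stated assumptions: `AudioCache` is a hypothetical name, an in-memory dictionary stands in for the database, SHA-256 is an arbitrary choice of hash, and `render` is whatever synthesis function the caller supplies.

```python
import hashlib


class AudioCache:
    """Render a text string to audio once, then reuse the stored result."""

    def __init__(self, render):
        self._render = render  # caller-supplied text-to-audio function
        self._store = {}       # stand-in for the database

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            # No rendered audio exists yet: render and store it for reuse.
            self._store[key] = self._render(text)
        return self._store[key]


def fake_render(text):
    # Placeholder for a real speech synthesizer.
    return b"audio:" + text.encode("utf-8")


cache = AudioCache(fake_render)
print(cache.get("Your call is important to us."))
```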
  • Publication number: 20100070281
    Abstract: Disclosed herein are methods for presenting speech from a selected text that is on a computing device. This method includes presenting text on a touch-sensitive display and having that text size within a threshold level so that the computing device can accurately determine the intent of the user when the user touches the touch screen. Once the user touch has been received, the computing device identifies and interprets the portion of text that is to be selected, and subsequently presents the text audibly to the user.
    Type: Application
    Filed: October 24, 2008
    Publication date: March 18, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Alistair D. CONKIE, Horst Schroeter
  • Publication number: 20100057465
    Abstract: A text-to-speech (TTS) system implemented in an automotive vehicle is dynamically tuned to improve intelligibility over a wide variety of vehicle operating states and environmental conditions. In one embodiment of the present invention, a TTS system is interfaced to one or more vehicle sensors to measure parameters including vehicle speed, interior noise, visibility conditions, and road roughness, among others. In response to measurements of these operating parameters, TTS voice volume, pitch, and speed, among other parameters, may be tuned in order to improve intelligibility of the TTS voice system and increase its effectiveness for the operator of the vehicle.
    Type: Application
    Filed: September 3, 2008
    Publication date: March 4, 2010
    Inventors: DAVID MICHAEL KIRSCH, Ritchie Winson Huang
  • Publication number: 20100042411
    Abstract: A method of building an audio description of a particular product of a class of products includes providing a plurality of human voice recordings, wherein each of the human voice recordings includes audio corresponding to an attribute value common to many of the products. The method also includes automatically obtaining attribute values of the particular product, wherein the attribute values reside electronically. The method also includes automatically applying a plurality of rules for selecting a subset of the human voice recordings that correspond to the obtained attribute values and automatically stitching the selected subset of human voice recordings together to provide a voiceover product description of the particular product. A similar method is used to build an audio description of a particular process.
    Type: Application
    Filed: August 15, 2008
    Publication date: February 18, 2010
    Inventors: Jamie M. Addessi, Mark Paul Bonfigli, Richard F. Gibbs, JR., Christopher Nathaniel Scott
  • Publication number: 20100030561
    Abstract: A system that outputs phonemes and accents of texts. The system has a storage section storing a first corpus in which spellings, phonemes, and accents of a text input beforehand are recorded separately for individual segmentations of the words that are contained in the text. A text for which phonemes and accents are to be output is acquired and the first corpus is searched to retrieve at least one set of spellings that match the spellings in the text from among sets of contiguous spellings. Then, the combination of a phoneme and an accent that has a higher probability of occurrence in the first corpus than a predetermined reference probability is selected as the phonemes and accent of the text.
    Type: Application
    Filed: August 3, 2009
    Publication date: February 4, 2010
    Applicant: Nuance Communications, Inc.
    Inventors: Shinsuke Mori, Toru Nagano, Masafumi Nishimura
  • Publication number: 20090313023
    Abstract: The invention converts raw data in a base language (e.g. English) into conversational formatted messages in multiple languages. The process converts input data rows into related sequences to a set of prerecorded audio phrase files. The sequences reference both recorded phrases of input data components and user-created text phrases inserted before and after the input data. When the audio sequences are played in sequence, a coherent conversational message in the language of the caller results. An IVR server responding to a caller's menu selection uses the invention's output data to generate the coherent response. Two embodiments are presented, a simple embodiment that responds to messages, and a more complex embodiment that converts enterprise demographic and member-event data collected over a period into audio sentences played in response to a menu item selection by a caller in the caller's language.
    Type: Application
    Filed: June 15, 2009
    Publication date: December 17, 2009
    Inventor: Ralph Jones
  • Publication number: 20090306986
    Abstract: Service architecture for providing to a user terminal of a communications network textual information and relative speech synthesis, the user terminal being provided with a speech synthesis engine and a basic database of speech waveforms includes: a content server for downloading textual information requested by means of a browser application on the user terminal; a context manager for extracting context information from the textual information requested by the user terminal; a context selector for selecting an incremental database of speech waveforms associated with extracted context information and for downloading the incremental database into the user terminal; a database manager on the user terminal for managing the composition of an enlarged database of speech waveforms for the speech synthesis engine including the basic and the incremental databases of speech waveforms.
    Type: Application
    Filed: May 31, 2005
    Publication date: December 10, 2009
    Inventors: Alessio Cervone, Ivano Salvatore Collotta, Paolo Coppo, Donato Ettorre, Maurizio Fodrini, Maura Turolla
  • Publication number: 20090299746
    Abstract: A method for performing speech synthesis to a textual content at a client. The method includes the steps of: performing speech synthesis to the textual content based on a current acoustical unit set Scurrent in a corpus at the client; analyzing the textual content and generating a list of target units with corresponding context features, selecting multiple acoustical unit candidates for each target unit according to the context features based on an acoustical unit set Stotal that is more plentiful than the current acoustical unit set Scurrent in the corpus at the client, and determining acoustical units suitable for speech synthesis for the textual content according to the multiple unit candidates; and updating the current acoustical unit set Scurrent in the corpus at the client based on the determined acoustical units.
    Type: Application
    Filed: May 27, 2009
    Publication date: December 3, 2009
    Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhiwei Shuang
  • Publication number: 20090281786
    Abstract: A natural-language processing system (10) includes a registration-candidate storage section (32) that stores therein registration-candidate dictionary data, a judgment means (22) that compares input data against the registration-candidate dictionary data to thereby judge whether or not the input data includes a word corresponding to the registration-candidate dictionary data, an inquiry means (23) that inquires to a user whether or not corresponding dictionary data is to be registered in a dictionary storage section (31) to accept a user's instruction if it is judged that a corresponding word exists, a dictionary registration means (24) that registers the corresponding dictionary data in the dictionary storage section based on the input instruction, and a natural language processing means (25) that executes a natural-language processing onto the input data by using the dictionary data registered in the dictionary storage section.
    Type: Application
    Filed: September 6, 2007
    Publication date: November 12, 2009
    Inventors: Shinichi Ando, Kunihiko Sadamasa, Shinichi Doi
  • Publication number: 20090234652
    Abstract: The voice synthesis device includes: an emotion input unit (202) which obtains an utterance mode of a voice waveform for which voice synthesis is to be performed; a prosody generation unit (205) which generates a prosody which is used when a language-processed text is uttered in the obtained utterance mode; a characteristic tone selection unit (203) which selects a characteristic tone based on the utterance mode, the characteristic tone being observed when the text is uttered in the obtained utterance mode; a characteristic tone temporal position estimation unit (604) which (i) judges whether or not each of phonemes included in a phonologic sequence of the text is to be uttered with the characteristic tone, based on the phonologic sequence, the characteristic tone, and the prosody, and (ii) decides a phoneme which is an utterance position where the text is uttered with the characteristic tone; and an element selection unit (606) and an element connection unit (209) which generates the voice waveform based on the p
    Type: Application
    Filed: May 2, 2006
    Publication date: September 17, 2009
    Inventors: Yumiko Kato, Takahiro Kamai
  • Publication number: 20090222269
    Abstract: An apparatus for voice synthesis includes: a word database for storing words and voices; a syllable database for storing syllables and voices; a processor for executing a process including: extracting a word from a document, generating a voice signal based on the extracted voice when the extracted word is included in the word database, and synthesizing a voice signal based on the extracted voice associated with the one or more syllables corresponding to the extracted word when the extracted word is not found in the word database; a speaker for producing a voice based on either of the generated and the synthesized voice signal; and a display for selectively displaying the extracted word when the voice based on the synthesized voice signal is produced by the speaker.
    Type: Application
    Filed: May 11, 2009
    Publication date: September 3, 2009
    Inventor: Shinichiro MORI
  • Publication number: 20090192781
    Abstract: A machine translation method, system for using the method, and computer readable media are disclosed. The method includes the steps of receiving a source language sentence, selecting a set of target language n-grams using a lexical classifier and based on the source language sentence. When selecting the set of target language n-grams, in at least one n-gram, n is greater than 1. The method continues by combining the selected set of target language n-grams as a finite state acceptor (FSA), weighting the FSA with data from the lexical classifier, and generating an n-best list of target sentences from the FSA. As an alternate to using the FSA, N strings may be generated from the n-grams and ranked using a language model. The N strings may be represented by an FSA for efficiency but it is not necessary.
    Type: Application
    Filed: January 30, 2008
    Publication date: July 30, 2009
    Applicant: AT&T Labs
    Inventors: Srinivas BANGALORE, Emil Ettelaie
  • Publication number: 20090177474
    Abstract: A speech synthesizer includes a periodic component fusing unit and an aperiodic component fusing unit, and fuses periodic components and aperiodic components of a plurality of speech units for each segment, which are selected by a unit selector, by a periodic component fusing unit and an aperiodic component fusing unit, respectively. The speech synthesizer is further provided with an adder, so that the adder adds, edits, and concatenates the periodic components and the aperiodic components of the fused speech units to generate a speech waveform.
    Type: Application
    Filed: September 18, 2008
    Publication date: July 9, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masahiro Morita, Takehiko Kagoshima
  • Publication number: 20090171665
    Abstract: Techniques are described for enabling flexible and dynamic creation and/or modification of voice data for a position-determining device. In some embodiments, a voice package is provided that includes a language database and a plurality of audio files. The language database specifies appropriate syntax and vocabulary for information that is intended for audio output by a position-determining device. The audio files include words and/or phrases that may be accessed by the position-determining device to communicate the information via audible output. Some embodiments utilize a voice package toolkit to construct and/or customize one or more parts of a voice package.
    Type: Application
    Filed: December 18, 2008
    Publication date: July 2, 2009
    Applicant: GARMIN LTD.
    Inventors: Scott D. Hammerschmidt, Jacob W. Caire, Michael P. Russell, David W. Wiskur, Scott J. Brunk
  • Publication number: 20090171668
    Abstract: A management system for guiding an agent in a media-specific dialogue has a conversion engine for instantiating ongoing dialogue as machine-readable text, if the dialogue is in voice media, a context analysis engine for determining facts from the text, a rules engine for asserting rules based on fact input, and a presentation engine for presenting information to the agent to guide the agent in the dialogue. The context analysis engine passes determined facts to the rules engine, which selects and asserts to the presentation engine rules based on the facts, and the presentation engine provides periodically updated guidance to the agent based on the rules asserted.
    Type: Application
    Filed: December 28, 2007
    Publication date: July 2, 2009
    Inventors: Dave Sneyders, Brian Galvin, S. Michael Perlmutter
  • Publication number: 20090150152
    Abstract: A method and apparatus for indexing one or more audio signals using a speech to text engine and a phoneme detection engine, and generating a combined lattice comprising a text part and a phoneme part. A word to be searched is searched for in the text part and, if it is not found or is found with low certainty, is divided into phonemes and searched for in the phoneme part of the lattice.
    Type: Application
    Filed: November 18, 2007
    Publication date: June 11, 2009
    Applicant: Nice Systems
    Inventors: Moshe WASSERBLAT, Barak Eilam, Yuval Lubowich, Maor Nissan
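
The two-stage search described in the abstract above can be sketched as a text lookup with a phoneme fallback. This is an illustration under simplifying assumptions: plain dictionaries stand in for the text and phoneme parts of the lattice, and `find_word`, `to_phonemes`, and the `min_certainty` threshold are hypothetical.

```python
def find_word(word, text_part, phoneme_part, to_phonemes, min_certainty=0.6):
    """Search the text part first; fall back to the phoneme part.

    text_part maps a word to (position, certainty); phoneme_part maps a
    tuple of phonemes to a position. Both are simplified stand-ins.
    """
    hit = text_part.get(word)
    if hit is not None:
        position, certainty = hit
        if certainty >= min_certainty:
            return ("text", position)
    # Not found, or found with low certainty: search by phonemes instead.
    position = phoneme_part.get(tuple(to_phonemes(word)))
    if position is None:
        return None
    return ("phoneme", position)


text_part = {"refund": (12, 0.9), "avaya": (40, 0.2)}
phoneme_part = {("a", "v", "a", "y", "a"): 41}
# Letters are used as a crude stand-in for a real grapheme-to-phoneme step.
print(find_word("avaya", text_part, phoneme_part, to_phonemes=list))
```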
  • Publication number: 20090132255
    Abstract: Embodiments of the present invention improve methods of performing speech recognition with barge-in. In one embodiment, the present invention includes a speech recognition method comprising starting a synthesis of recorded speech, receiving a user speech input signal providing information regarding a user choice, detecting an initial portion of the user speech input signal, selectively altering the synthesis of recorded speech, and recognizing the user choice.
    Type: Application
    Filed: November 19, 2007
    Publication date: May 21, 2009
    Applicant: Sensory, Incorporated
    Inventor: Younan Lu
  • Publication number: 20090106027
    Abstract: An object of the invention is to conveniently increase standard patterns registered in a voice recognition device to efficiently extend the amount of words that can be voice-recognized. New standard patterns are generated by modifying a part of an existing standard pattern. A pattern matching unit 16 of a modifying-part specifying unit 14 performs pattern matching process to specify a part to be modified in the existing standard pattern of a usage source. A standard pattern generating unit 18 generates the new standard patterns by cutting or deleting voice data of the modifying part of the usage-source standard pattern, substituting the voice data of the modifying part of the usage-source standard pattern for another voice data, or combining the voice data of the modifying part of the usage-source standard pattern with another voice data. A standard pattern database update unit 20 adds the new standard patterns to a standard pattern database 24.
    Type: Application
    Filed: May 25, 2006
    Publication date: April 23, 2009
    Applicant: Matsushita Electric Industrial Co., Ltd.
    Inventors: Toshiyuki Teranishi, Kouji Hatano
  • Publication number: 20090063153
    Abstract: A system and method for generating a synthetic text-to-speech (TTS) voice are disclosed. A user is presented with at least one TTS voice and at least one voice characteristic. A new synthetic TTS voice is generated by blending a plurality of existing TTS voices according to the selected voice characteristics. The blending of voices involves interpolating segmented parameters of each TTS voice. Segmented parameters may be, for example, prosodic characteristics of the speech such as pitch, volume, phone durations, accents, stress, mis-pronunciations and emotion.
    Type: Application
    Filed: November 4, 2008
    Publication date: March 5, 2009
    Applicant: AT&T Corp.
    Inventors: David A. Kapilow, Kenneth H. Rosen, Juergen Schroeter
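
The interpolation step mentioned in the abstract above can be illustrated with a weighted average of numeric voice parameters. This is a minimal sketch, not the disclosed method: `blend_voices`, the parameter names, and the use of a single value per parameter (rather than per-segment values) are assumptions.

```python
def blend_voices(voices, weights):
    """Blend voices by a weighted average of their numeric parameters.

    Each voice is a dict with the same parameter names; weights need not
    sum to one because they are normalized here.
    """
    if len(voices) != len(weights) or not voices:
        raise ValueError("need one weight per voice")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {
        name: sum(w * voice[name] for voice, w in zip(voices, weights)) / total
        for name in voices[0]
    }


voice_a = {"pitch_hz": 100.0, "rate": 1.0}
voice_b = {"pitch_hz": 200.0, "rate": 2.0}
print(blend_voices([voice_a, voice_b], [1, 1]))
```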
  • Publication number: 20090055192
    Abstract: A device for use by a deafblind person is disclosed. The device comprises a first key for manually inputting a series of words in the form of a code, a second key for manually inputting an action to be performed by the device, a third key for manually inputting a user preference, and a fourth key for manually inputting communication instructions. The device further has an internal processor programmed to carry out communication functions and search and guide functions. The device has various safety and security functions for pedestrians or persons in transit. In a preferred embodiment, the device comprises an electronic cane known as an eCane. Also disclosed is a system for allowing a deafblind person to enjoy television programs.
    Type: Application
    Filed: November 3, 2008
    Publication date: February 26, 2009
    Inventor: Raanan Liebermann
  • Publication number: 20080312920
    Abstract: An expressive speech-to-speech generation system which can generate expressive speech output by using expressive parameters extracted from the original speech signal to drive the standard TTS system. The system comprises: speech recognition means, machine translation means, text-to-speech generation means, expressive parameter detection means for extracting expressive parameters from the speech of language A, and expressive parameter mapping means for mapping the expressive parameters extracted by the expressive parameter detection means from language A to language B, and driving the text-to-speech generation means by the mapping results to synthesize expressive speech.
    Type: Application
    Filed: August 23, 2008
    Publication date: December 18, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shen Liqin, Shi Qin, Donald T. Tang, Zhang Wei
  • Patent number: 7467026
    Abstract: An autonomous robot is controlled by a local robot information controller connected to a robot application network, to which a transceiver that communicates with the autonomous robot is attached. The robot application network, a user LAN adaptive controller, an information distribution manager, and a third-party information provider subsystem are linked by a public network. The information distribution manager acquires information from the third-party information provider subsystem on a schedule set by the user LAN adaptive controller. The local robot information controller receives the information from the information distribution manager and converts it into data that generates robot gestures. The robot performs actions in accordance with the gesture data received from the local robot information controller.
    Type: Grant
    Filed: August 13, 2004
    Date of Patent: December 16, 2008
    Assignee: Honda Motor Co. Ltd.
    Inventors: Yoshiaki Sakagami, Shinichi Matsunaga, Naoaki Sumida
  • Publication number: 20080221904
    Abstract: A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. The processor reads first data comprising one or more parameters associated with noise-producing orifice images of sequences of at least three concatenated phonemes which correspond to an input stimulus. The processor reads, based on the first data, second data comprising images of a noise-producing entity. The processor generates an animated sequence of the noise-producing entity.
    Type: Application
    Filed: May 19, 2008
    Publication date: September 11, 2008
    Applicant: AT&T Corp.
    Inventors: Eric Cosatto, Hans Peter Graf, Juergen Schroeter
  • Publication number: 20080126099
    Abstract: A method of representing information to a person comprising displaying an image viewable by a person, the image comprising visual markers representative of portions of a human body minimally necessary to communicate with the person, the visual markers, when viewed by the person, causing the person to extrapolate the human body, a remainder of the image being visually silent with respect to the person. The method is particularly applicable to represent information so as to be perceivable by a hearing-impaired person (e.g. deaf person) wherein a plurality of images, when displayed, one after another on a display device, represent information perceivable by the hearing-impaired person via sign language.
    Type: Application
    Filed: October 25, 2007
    Publication date: May 29, 2008
    Applicant: UNIVERSITE DE SHERBROOKE
    Inventors: Denis Belisle, Johanne Deschenes
  • Publication number: 20080004861
    Abstract: A system and method for characterizing, synthesizing, and/or canceling out acoustic signals from inanimate sound sources is disclosed. Propagating wave electromagnetic sensors monitor excitation sources in sound producing systems, such as machines, musical instruments, and various other structures. Acoustical output from these sound producing systems is also monitored. From such information, a transfer function characterizing the sound producing system is generated. From the transfer function, acoustical output from the sound producing system may be synthesized or canceled. The methods disclosed enable accurate calculation of matched transfer functions relating specific excitations to specific acoustical outputs. Knowledge of such signals and functions can be used to effect various sound replication, sound source identification, and sound cancellation applications.
    Type: Application
    Filed: September 6, 2007
    Publication date: January 3, 2008
    Inventors: John Holzrichter, Greg Burnett, Lawrence Ng
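    Illustration: The abstract above describes deriving a transfer function from a monitored excitation and the measured acoustic output, then using it to synthesize or cancel the output. A minimal sketch of that general idea, assuming a linear time-invariant system, noise-free signals of equal length, and circular (periodic) convolution so that a per-bin spectral ratio is exact. All function names are hypothetical, and the naive DFT is used only to keep the example dependency-free; this is not the method claimed in the application.

```python
import cmath
from typing import List, Sequence


def dft(x: Sequence[float]) -> List[complex]:
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum: Sequence[complex]) -> List[float]:
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def estimate_transfer_function(
    excitation: Sequence[float], output: Sequence[float], eps: float = 1e-12
) -> List[complex]:
    """Estimate H[k] = Y[k] / X[k]; bins with no excitation energy are set to 0."""
    X, Y = dft(excitation), dft(output)
    return [y / x if abs(x) > eps else 0j for x, y in zip(X, Y)]


def synthesize_output(excitation: Sequence[float], H: Sequence[complex]) -> List[float]:
    """Predict the acoustic output for a new excitation using H."""
    X = dft(excitation)
    return idft([h * x for h, x in zip(H, X)])


def circular_convolve(x: Sequence[float], h: Sequence[float]) -> List[float]:
    """Reference model of the 'sound producing system' for this example."""
    n = len(x)
    return [sum(x[m] * h[(t - m) % n] for m in range(n)) for t in range(n)]


# Stand-in system: a short impulse response (made up for the example).
system_response = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# Calibration: excite with an impulse and record the output.
calibration_excitation = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
calibration_output = circular_convolve(calibration_excitation, system_response)
H = estimate_transfer_function(calibration_excitation, calibration_output)

# Later: a new excitation is observed; predict the sound and cancel it.
new_excitation = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
measured = circular_convolve(new_excitation, system_response)
predicted = synthesize_output(new_excitation, H)
residual = [m - p for m, p in zip(measured, predicted)]
```

    With real, noisy measurements the plain ratio is fragile; averaged cross-spectral estimates are a common alternative, which this sketch does not attempt.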
  • Patent number: RE45262
    Abstract: A navigation system and method involving wireless communications technology and speech processing technology is presented. In accordance with an embodiment of the invention, the navigation system includes a subscriber unit communicating with a service provider. The subscriber unit includes a global positioning system mechanism to determine subscriber position information and a speech processing mechanism to receive destination information spoken by a subscriber. The subscriber unit transmits the subscriber position and destination information to the service provider, which gathers navigation information, including a map and a route from the subscriber position to the specified destination. The service provider transmits the navigation information to the subscriber unit. The subscriber unit conveys the received navigation information to the subscriber via an output mechanism, such as a speech synthesis unit or a graphical display.
    Type: Grant
    Filed: December 2, 2004
    Date of Patent: November 25, 2014
    Assignee: Intel Corporation
    Inventor: Christopher R. Wiener