Synthesis Patents (Class 704/258)
-
Patent number: 8478597
Abstract: The present disclosure presents a useful metric for assessing the relative difficulty which non-native speakers face in pronouncing a given utterance and a method and systems for using such a metric in the evaluation and assessment of the utterances of non-native speakers. In an embodiment, the metric may be based on both known sources of difficulty for language learners and a corpus-based measure of cross-language sound differences. The method may be applied to speakers who primarily speak a first language speaking utterances in any non-native second language.
Type: Grant
Filed: January 10, 2006
Date of Patent: July 2, 2013
Assignee: Educational Testing Service
Inventors: Derrick Higgins, Klaus Zechner, Yoko Futagi, Rene Lawless
-
Patent number: 8478582
Abstract: A server is disclosed for computing a score of an opinion that a message in a text file is expected to convey regarding a subject to be evaluated, wherein the message is written using literal strings and pictorial symbols. In this server, by the use of a pictorial-symbol dictionary memory storing a correspondence between designated pictorial-symbols to be rated and scores of opinions expressed by the respective pictorial-symbols, at least one of the used pictorial-symbols in the message which is coincident with at least one of the designated pictorial-symbols stored in the pictorial-symbol dictionary memory, is extracted from the message, at least one of the opinion scores which corresponds to the at least one extracted pictorial-symbol is retrieved within the pictorial-symbol dictionary memory, and an aggregate net opinion score for the message is calculated, based on an aggregate opinion score for the at least one extracted pictorial-symbol.
Type: Grant
Filed: February 2, 2010
Date of Patent: July 2, 2013
Assignee: KDDI Corporation
Inventors: Yukiko Habu, Ryoichi Kawada, Nobuhide Kotsuka, Sung Jiae, Koki Uchiyama, Santi Saeyor, Hirosuke Asano, Toshiaki Shimamura
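The scoring pipeline this abstract describes — extract the designated pictorial symbols present in a message, look each up in a dictionary of opinion scores, and aggregate — can be sketched as below. The symbol set and the score values are invented for illustration, not taken from the patent.

```python
# Hypothetical pictorial-symbol dictionary: symbol -> opinion score.
EMOJI_SCORES = {"😊": 1.0, "😍": 2.0, "😢": -1.0, "😠": -2.0}

def opinion_score(message: str) -> float:
    """Extract designated pictorial symbols from the message, retrieve
    each one's opinion score, and return the aggregate net score."""
    found = [ch for ch in message if ch in EMOJI_SCORES]
    return sum(EMOJI_SCORES[ch] for ch in found)

score = opinion_score("great movie 😊😍")  # aggregates 1.0 + 2.0
```

A real system would also weight the literal-string portion of the message; this sketch covers only the pictorial-symbol path.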
-
Publication number: 20130166303
Abstract: A computer-implemented method includes receiving, in a computer system, a user query comprising at least a first term, parsing the user query to at least determine whether the user query assigns a field to the first term, the parsing resulting in a parsed query that conforms to a predefined format, performing a search in a metadata repository using the parsed query, the metadata repository embodied in a computer readable medium and including triplets generated based on multiple modes of metadata for video content, the search identifying a set of candidate scenes from the video content, ranking the set of candidate scenes according to a scoring metric into a ranked scene list, and generating an output from the computer system that includes at least part of the ranked scene list, the output generated in response to the user query.
Type: Application
Filed: November 13, 2009
Publication date: June 27, 2013
Applicant: ADOBE SYSTEMS INCORPORATED
Inventors: Walter Chang, Michael J. Welch
-
Patent number: 8468017
Abstract: The invention discloses a multi-stage quantization method, which includes the following steps: obtaining a reference codebook according to a previous stage codebook; obtaining a current stage codebook according to the reference codebook and a scaling factor; and quantizing an input vector by using the current stage codebook. The invention also discloses a multi-stage quantization device. With the invention, the current stage codebook may be obtained according to the previous stage codebook, by using the correlation between the current stage codebook and the previous stage codebook. As a result, it does not require an independent codebook space for the current stage codebook, which saves the storage space and improves the resource usage efficiency.
Type: Grant
Filed: May 1, 2010
Date of Patent: June 18, 2013
Assignee: Huawei Technologies Co., Ltd.
Inventors: Eyal Shlomot, Jiliang Dai, Fuliang Yin, Xin Ma, Jun Zhang
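The space-saving idea above — derive the current-stage codebook from the previous stage's codebook and a scaling factor rather than storing it separately — can be sketched as follows. The vectors and the single uniform scaling factor are illustrative assumptions; the patent's actual derivation of the reference codebook may be more involved.

```python
def scale_codebook(prev_codebook, factor):
    """Current-stage codebook reuses the previous stage's vectors,
    scaled down, so no independent codebook storage is needed."""
    return [[factor * x for x in vec] for vec in prev_codebook]

def quantize(vec, codebook):
    """Return the index of the nearest codebook vector (squared error)."""
    def err(cand):
        return sum((a - b) ** 2 for a, b in zip(vec, cand))
    return min(range(len(codebook)), key=lambda i: err(codebook[i]))

prev = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]
stage2 = scale_codebook(prev, 0.25)  # later-stage residuals are smaller
idx = quantize([0.2, 0.3], stage2)   # index of the nearest scaled vector
```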
-
Patent number: 8468020
Abstract: An apparatus for synthesizing speech, including a waveform memory that stores a plurality of speech unit waveforms, an information memory that correspondingly stores speech unit information and an address of each of the speech unit waveforms, a selector that selects a speech unit sequence corresponding to the input phoneme sequence by referring to the speech unit information, a speech unit waveform acquisition unit that acquires a speech unit waveform corresponding to each speech unit of the speech unit sequence from the waveform memory by referring to the address, and a speech unit concatenation unit that generates the speech by concatenating the acquired speech unit waveforms.
Type: Grant
Filed: May 8, 2007
Date of Patent: June 18, 2013
Assignee: Kabushiki Kaisha Toshiba
Inventor: Takehiko Kagoshima
-
Publication number: 20130151243
Abstract: A voice modulation apparatus is provided. The voice modulation apparatus includes an audio signal input unit which receives an audio signal from an external source; an extraction unit which extracts property information relating to a voice from the audio signal; a storage unit which stores the extracted property information; a control unit which modulates a target voice based on the extracted property information; and an output unit which outputs the modulated target voice.
Type: Application
Filed: December 7, 2012
Publication date: June 13, 2013
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Samsung Electronics Co., Ltd.
-
Patent number: 8457967
Abstract: A procedure to automatically evaluate the spoken fluency of a speaker by prompting the speaker to talk on a given topic, recording the speaker's speech to get a recorded sample of speech, and then analyzing the patterns of disfluencies in the speech to compute a numerical score to quantify the spoken fluency skills of the speaker. The numerical fluency score accounts for various prosodic and lexical features, including formant-based filled-pause detection, closely-occurring exact and inexact repeat N-grams, and normalized average distance between consecutive occurrences of N-grams. The lexical features and prosodic features are combined to classify the speaker with a C-class classification and develop a rating for the speaker.
Type: Grant
Filed: August 15, 2009
Date of Patent: June 4, 2013
Assignee: Nuance Communications, Inc.
Inventors: Kartik Audhkhasi, Om D. Deshmukh, Kundan Kandhway, Ashish Verma
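One of the lexical features named above — closely-occurring exact repeat N-grams, which catch stuttered restarts like "I went to the, I went to the store" — can be sketched as a simple counter. The proximity window of 5 tokens is an illustrative assumption, not a value from the patent.

```python
def close_repeat_ngrams(words, n=2, window=5):
    """Count n-grams that recur within `window` positions of a previous
    occurrence of the same n-gram, a typical disfluency pattern."""
    last_seen = {}
    count = 0
    for i in range(len(words) - n + 1):
        gram = tuple(words[i:i + n])
        if gram in last_seen and i - last_seen[gram] <= window:
            count += 1
        last_seen[gram] = i  # always remember the latest occurrence
    return count

utterance = "i went to the i went to the store".split()
repeats = close_repeat_ngrams(utterance, n=2)
```

A full fluency scorer would combine this with the prosodic features (e.g. filled-pause detection) before classification.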
-
Patent number: 8456420
Abstract: Many embodiments may comprise logic such as hardware and/or code to implement a user interface for traversal of long sorted lists, via audible mapping of the lists, using sensor-based gesture recognition, audio and tactile feedback, and button selection while on the go. In several embodiments, such user interface modalities are physically small in size, enabling a user to be truly mobile by reducing the cognitive load required to operate the device. For some embodiments, the user interface may be divided across multiple worn devices, such as a mobile device, watch, earpiece, and ring. Rotation of the watch may be translated into navigation instructions, allowing the user to traverse the list while the user receives audio feedback via the earpiece to describe items in the list as well as audio feedback regarding the navigation state. Many embodiments offer the user a simple user interface to traverse the list without visual feedback.
Type: Grant
Filed: December 31, 2008
Date of Patent: June 4, 2013
Assignee: Intel Corporation
Inventors: Lama Nachman, David L. Graumann, Giuseppe Raffa, Jennifer Healey
-
Patent number: 8452600
Abstract: An electronic reading device for reading ebooks and other digital media items combines a touch surface electronic reading device with accessibility technology to provide a visually impaired user more control over his or her reading experience. In some implementations, the reading device can be configured to operate in at least two modes: a continuous reading mode and an enhanced reading mode.
Type: Grant
Filed: August 18, 2010
Date of Patent: May 28, 2013
Assignee: Apple Inc.
Inventor: Christopher B. Fleizach
-
Patent number: 8447609
Abstract: Embodiments may be a standalone module or part of mobile devices, desktop computers, servers, stereo systems, or any other systems that might benefit from condensed audio presentations of item structures such as lists or tables. Embodiments may comprise logic such as hardware and/or code to adjust the temporal characteristics of items comprising words. The items may be included in a structure such as a text listing or table, an audio listing or table, or a combination thereof, or may be individual words or phrases. For instance, embodiments may comprise a keyword extractor to extract keywords from the items and an abbreviations generator to generate abbreviations based upon the keywords. Further embodiments may comprise a text-to-speech generator to generate audible items based upon the abbreviations to render to a user while traversing the item structure.
Type: Grant
Filed: December 31, 2008
Date of Patent: May 21, 2013
Assignee: Intel Corporation
Inventors: Giuseppe Raffa, Lama Nachman, David L. Graumann, Michael E. Deisher
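The keyword-extractor / abbreviations-generator pair described above can be sketched as two small functions: keep the content words of an item, then shorten each for faster audible rendering. The stopword list and the vowel-dropping abbreviation rule are assumptions for illustration; the patent does not specify these particular rules.

```python
# Assumed stopword list for keyword extraction.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in"}

def extract_keywords(item: str):
    """Keep only content words from a list item."""
    return [w for w in item.lower().split() if w not in STOPWORDS]

def abbreviate(word: str, max_len: int = 4):
    """Shorten a keyword: keep the first letter, drop later vowels,
    cap the length. The result feeds a text-to-speech generator."""
    head, tail = word[0], word[1:]
    squeezed = head + "".join(c for c in tail if c not in "aeiou")
    return squeezed[:max_len]

item = "the best of the rolling stones"
condensed = [abbreviate(w) for w in extract_keywords(item)]
```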
-
Patent number: 8447613
Abstract: A method for optimizing message transmission and decoding comprises: reading data from a memory of an originating device, the data comprising information regarding the originating device; encoding the data by converting the data to a subset of words having a ranked recognition accuracy higher than the remainder of words; transmitting the encoded data from the originating device to a receiving system audibly as words via a telephone connection; utilizing a voice recognition software to recognize the words; decoding the words back to the data; and taking a predetermined action based on the data.
Type: Grant
Filed: April 28, 2009
Date of Patent: May 21, 2013
Assignee: iRobot Corporation
Inventors: Patrick Alan Hussey, Maryellen Abreu
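The encode/decode steps above amount to mapping device data onto words that speech recognizers handle reliably, speaking them over the phone, and mapping the recognized words back to data. A minimal sketch, assuming a tiny four-word alphabet (2 bits per word) standing in for the patent's ranked high-accuracy subset:

```python
# Assumed high-recognition-accuracy word subset (NATO-style, 2 bits each).
WORDS = ["alpha", "bravo", "charlie", "delta"]
INDEX = {w: i for i, w in enumerate(WORDS)}

def encode(data: bytes):
    """Split each byte into four 2-bit symbols, high bits first,
    and emit one word per symbol for audible transmission."""
    out = []
    for b in data:
        for shift in (6, 4, 2, 0):
            out.append(WORDS[(b >> shift) & 0b11])
    return out

def decode(words):
    """Reassemble the original bytes from the recognized word stream."""
    vals = [INDEX[w] for w in words]
    return bytes(
        (vals[i] << 6) | (vals[i + 1] << 4) | (vals[i + 2] << 2) | vals[i + 3]
        for i in range(0, len(vals), 4)
    )

spoken = encode(b"OK")  # eight words, ready for text-to-speech
```

The round trip `decode(encode(data)) == data` holds by construction; a real deployment would also add redundancy against misrecognition.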
-
Patent number: 8447604
Abstract: Provided in some embodiments is a method including receiving ordered script words that are indicative of dialogue words to be spoken, receiving audio data corresponding to at least a portion of the dialogue words to be spoken and including timecodes associated with dialogue words, generating a matrix of the ordered script words versus the dialogue words, aligning the matrix to determine hard alignment points that include matching consecutive sequences of ordered script words with corresponding sequences of dialogue words, partitioning the matrix of ordered script words into sub-matrices bounded by adjacent hard-alignment points and including corresponding sub-sets of the script and dialogue words between the hard-alignment points, and aligning each of the sub-matrices.
Type: Grant
Filed: May 28, 2010
Date of Patent: May 21, 2013
Assignee: Adobe Systems Incorporated
Inventor: Walter W. Chang
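The hard-alignment idea above — long exact word runs shared by the script and the dialogue transcript become anchors, and the stretches between anchors are aligned separately as sub-problems — can be sketched with the standard library's sequence matcher. The minimum run length of 3 is an illustrative assumption.

```python
from difflib import SequenceMatcher

def hard_alignment_points(script, dialogue, min_run=3):
    """Return (script_index, dialogue_index, length) for each matching
    consecutive run of at least `min_run` words: the hard anchors."""
    sm = SequenceMatcher(a=script, b=dialogue, autojunk=False)
    return [(m.a, m.b, m.size) for m in sm.get_matching_blocks()
            if m.size >= min_run]

def partitions(script, dialogue, anchors):
    """Yield the (script_span, dialogue_span) gaps between anchors;
    each gap is a sub-matrix to be aligned on its own."""
    si = di = 0
    for a, b, size in anchors:
        yield script[si:a], dialogue[di:b]
        si, di = a + size, b + size
    yield script[si:], dialogue[di:]

script = "to be or not to be that is the question".split()
dialog = "to be or not to be um that is question".split()
anchors = hard_alignment_points(script, dialog)
```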
-
Patent number: 8447610
Abstract: Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.
Type: Grant
Filed: August 9, 2010
Date of Patent: May 21, 2013
Assignee: Nuance Communications, Inc.
Inventors: Darren C. Meyer, Stephen R. Springer
-
Patent number: 8447592
Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
Type: Grant
Filed: September 13, 2005
Date of Patent: May 21, 2013
Assignee: Nuance Communications, Inc.
Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen
-
Patent number: 8442423
Abstract: A digital media item, such as an electronic book (eBook), may include testing content. The testing content may include questions about the content of the digital media item. When a user is viewing the digital media item on an electronic device, such as an eBook reader, the user may be allowed to select whether the testing content is displayed. The user may also be allowed to select a particular mode of testing, such as automatic testing, selective testing, etc. If the user chooses to display the testing content, the user may also be allowed to provide answers to the testing questions.
Type: Grant
Filed: January 26, 2009
Date of Patent: May 14, 2013
Assignee: Amazon Technologies, Inc.
Inventors: Thomas A. Ryan, Edward J. Gayles, Laurent An Minh Nguyen, Steven K. Weiss, Martin Görner
-
Patent number: 8433369
Abstract: A mobile terminal has a sound obtaining unit configured to obtain a sound signal; a voice recognition unit configured to recognize the sound signal and convert the sound signal into text data; a display unit configured to display the text data divided into a plurality of units; a selection unit configured to receive a selection of one of the units from the text data divided into the plurality of the units displayed on the display unit; and a control unit configured to perform a predetermined process corresponding to each of the units selected by the selection unit.
Type: Grant
Filed: September 15, 2009
Date of Patent: April 30, 2013
Assignee: Fujitsu Mobile Communications Limited
Inventor: Yasuhito Ambiru
-
Patent number: 8433575
Abstract: A system and method is described in which a multimedia story is rendered to a consumer in dependence on features extracted from an audio signal representing, for example, a musical selection of the consumer. Features such as key changes and tempo of the music selection are related to dramatic parameters defined by and associated with story arcs, narrative story rules and film or story structure. In one example a selection of a few music tracks provides input audio signals (602) from which musical features are extracted (604), following which a dramatic parameter list and timeline are generated (606). Media fragments are then obtained (608), the fragments having story content associated with the dramatic parameters, and the fragments output (610) with the music selection.
Type: Grant
Filed: December 10, 2003
Date of Patent: April 30, 2013
Assignee: AMBX UK Limited
Inventors: David A. Eves, Richard S. Cole, Christopher Thorne
-
Patent number: 8433573
Abstract: A prosody modification device includes: a real voice prosody input part that receives real voice prosody information extracted from an utterance of a human; a regular prosody generating part that generates regular prosody information having a regular phoneme boundary that determines a boundary between phonemes and a regular phoneme length of a phoneme by using data representing a regular or statistical phoneme length in an utterance of a human with respect to a section including at least a phoneme or a phoneme string to be modified in the real voice prosody information; and a real voice prosody modification part that resets a real voice phoneme boundary by using the generated regular prosody information so that the real voice phoneme boundary and a real voice phoneme length of the phoneme or the phoneme string to be modified in the real voice prosody information are approximate to an actual phoneme boundary and an actual phoneme length of the utterance of the human, thereby modifying the real voice prosody information.
Type: Grant
Filed: February 11, 2008
Date of Patent: April 30, 2013
Assignee: Fujitsu Limited
Inventors: Kentaro Murase, Nobuyuki Katae
-
Patent number: 8433574
Abstract: Methods, systems, and software for converting the audio input of a user of a hand-held client device or mobile phone into a textual representation by means of a backend server accessed by the device through a communications network. The text is then inserted into or used by an application of the client device to send a text message, instant message, email, or to insert a request into a web-based application or service. In one embodiment, the method includes the steps of initializing or launching the application on the device; recording and transmitting the recorded audio message from the client device to the backend server through a client-server communication protocol; converting the transmitted audio message into the textual representation in the backend server; and sending the converted text message back to the client device or forwarding it on to an alternate destination directly from the server.
Type: Grant
Filed: February 13, 2012
Date of Patent: April 30, 2013
Assignee: Canyon IP Holdings, LLC
Inventors: Victor R. Jablokov, Igor R. Jablokov, Marc White
-
Patent number: 8428952
Abstract: A system and method to allow an author of an instant message to enable and control the production of audible speech to the recipient of the message. The voice of the author of the message is characterized into parameters compatible with a formative or articulative text-to-speech engine such that upon receipt, the receiving client device can generate audible speech signals from the message text according to the characterization of the author's voice. Alternatively, the author can store samples of his or her actual voice in a server so that, upon transmission of a message by the author to a recipient, the server extracts the samples needed only to synthesize the words in the text message, and delivers those to the receiving client device so that they are used by a client-side concatenative text-to-speech engine to generate audible speech signals having a close likeness to the actual voice of the author.
Type: Grant
Filed: June 12, 2012
Date of Patent: April 23, 2013
Assignee: Nuance Communications, Inc.
Inventors: Terry Wade Niemeyer, Liliana Orozco
-
Patent number: 8423366
Abstract: A method includes receiving, by a system, a voice recording associated with a user, transcribing the voice recording into text that includes a group of words, and storing an association between a portion of each respective word and a corresponding portion of the voice recording. The corresponding portion of the voice recording is the portion of the voice recording from which the portion of the respective word was transcribed. The method may also include determining a modification to a speech synthesis voice associated with the user based at least in part on the association.
Type: Grant
Filed: July 18, 2012
Date of Patent: April 16, 2013
Assignee: Google Inc.
Inventors: Marcus Alexander Foster, Richard Zarek Cohen
-
Patent number: 8422641
Abstract: Devices, systems, and methods for recording call sessions over a VoIP network using a distributed record server architecture are disclosed. An example recording device for recording segments of a call session includes a record server configured to receive an agent voice data stream and an external caller voice data stream from an agent telephone station, and a file repository configured to store voice data and call data associated with each recorded segment of the call session. The recording device is configured to tag recorded segments of each call session, which can be later used by a third-party application or database to check the status and/or integrity of the recorded call session.
Type: Grant
Filed: June 15, 2009
Date of Patent: April 16, 2013
Assignee: Calabrio, Inc.
Inventor: James Paul Martin, II
-
Patent number: 8423365
Abstract: A contextual conversion platform, and method for converting text-to-speech, are described that can convert content of a target to spoken content. Embodiments of the contextual conversion platform can identify certain contextual characteristics of the content, from which can be generated a spoken content input. This spoken content input can include tokens, e.g., words and abbreviations, to be converted to the spoken content, as well as substitution tokens that are selected from contextual repositories based on the context identified by the contextual conversion platform.
Type: Grant
Filed: May 28, 2010
Date of Patent: April 16, 2013
Inventor: Daniel Ben-Ezri
-
Patent number: 8412528
Abstract: The present invention relates to computer-generated text-to-speech conversion. It relates in particular to a method and system for updating a Concatenative Text-To-Speech (CTTS) system with a speech database from a base version to a new version. The present invention performs an application-specific re-organization of a synthesizer's speech database by means of certain decision tree modifications. By that reorganization, certain synthesis units are made available for the new application, which are not available in prior art without a new speech session. This allows the creation of application-specific synthesizers with improved output speech quality for arbitrary domains and applications at very low cost.
Type: Grant
Filed: May 2, 2006
Date of Patent: April 2, 2013
Assignee: Nuance Communications, Inc.
Inventors: Volker Fischer, Siegfried Kunzmann
-
Patent number: 8412529
Abstract: An approach is provided for enhancing verbal communication sessions. A verbal component of a communication session is converted into textual information. The converted textual information is scanned for a text string to trigger an application. The application is invoked to provide supplemental information about the textual information or to perform an action in response to the textual information for or on behalf of a party of the communication session. The supplemental information or a confirmation of the action is transmitted to the party.
Type: Grant
Filed: October 29, 2008
Date of Patent: April 2, 2013
Assignee: Verizon Patent and Licensing Inc.
Inventors: Martin W. McKee, Paul T. Schultz, Robert A. Sartini
-
Patent number: 8401856
Abstract: A very common problem is that when people speak a language other than the language to which they are accustomed, syllables can be spoken for longer or shorter than the listener would regard as appropriate. An example of this can be observed when people who have a heavy Japanese accent speak English. Since Japanese words end with vowels, there is a tendency for native Japanese to add a vowel sound to the end of English words that should end with a consonant. Illustratively, native Japanese speakers often pronounce “orange” as “orenji.” An aspect provides an automatic speech-correcting process that would not necessarily need to know that fruit is being discussed; the system would only need to know that the speaker is accustomed to Japanese, that the listener is accustomed to English, that “orenji” is not a word in English, and that “orenji” is a typical Japanese mispronunciation of the English word “orange.”
Type: Grant
Filed: May 17, 2010
Date of Patent: March 19, 2013
Assignee: Avaya Inc.
Inventors: Terry Jennings, Paul Roller Michaelis
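The "orenji" → "orange" example above reduces to a lookup: if a recognized token is not a listener-language word but matches a known L1-typical mispronunciation, substitute the intended word. A minimal sketch, with tiny assumed sample tables standing in for a real lexicon and a real Japanese-to-English mispronunciation table:

```python
# Assumed sample of the listener-language lexicon.
ENGLISH_WORDS = {"orange", "salad", "hot"}

# Assumed sample of typical Japanese-accent mispronunciations of English.
JA_EN_FIXES = {"orenji": "orange", "sarada": "salad", "hotto": "hot"}

def correct(word: str) -> str:
    """Pass through valid English words; map known Japanese-typical
    mispronunciations to the intended word; leave everything else alone."""
    if word in ENGLISH_WORDS:
        return word
    return JA_EN_FIXES.get(word, word)

fixed = correct("orenji")
```

Note the sketch needs no topic knowledge (no idea that fruit is being discussed), mirroring the claim in the abstract.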
-
Publication number: 20130066631
Abstract: The present invention provides a parametric speech synthesis method and a parametric speech synthesis system.
Type: Application
Filed: October 27, 2011
Publication date: March 14, 2013
Applicant: GOERTEK INC.
Inventors: Fengliang Wu, Zhenhua Wu
-
Patent number: 8396708
Abstract: An avatar facial expression representation technology is provided. The avatar facial expression representation technology estimates changes in emotion and emphasis in a user's voice from vocal information, and changes in mouth shape of the user from pronunciation information of the voice. The avatar facial expression technology tracks a user's facial movements and changes in facial expression from image information and may represent avatar facial expressions based on the results of these operations. Accordingly, avatar facial expressions can be obtained which are similar to the actual facial expressions of the user.
Type: Grant
Filed: January 28, 2010
Date of Patent: March 12, 2013
Assignee: Samsung Electronics Co., Ltd.
Inventors: Chi-youn Park, Young-kyoo Hwang, Jung-bae Kim
-
Patent number: 8392191
Abstract: The present invention provides a method and apparatus of forming Chinese prosodic words, which method comprises the steps of inputting Chinese text; performing word segmentation and part-of-speech annotation for the input Chinese text to generate an initial prosodic word sequence; inserting grids representing prosodic word boundaries for all the words in the initial prosodic word sequence to generate a grid prosodic word sequence; annotating the grids ready to be deleted in the grid prosodic word sequence based on the prosodic word forming means; judging the grids which actually need to be deleted among the grids ready to be deleted based on the prosodic word forming means; and deleting the grids which actually need to be deleted in the grid prosodic word sequence, and forming the words between every two remaining grids into prosodic words.
Type: Grant
Filed: December 10, 2007
Date of Patent: March 5, 2013
Assignee: Fujitsu Limited
Inventors: Guo Qing, Nobuyuki Katae
-
Patent number: 8392194
Abstract: A method for effecting a machine-based determination of speech intelligibility in an aircraft during flight operations includes: (a) in no particular order: (1) providing a representation of a machine-based speech evaluating signal; and (2) providing a representation of in-flight noise; (b) combining the representation of a machine-based speech evaluation signal and the representation of in-flight noise to obtain a combined noise signal; and (c) employing the combined noise signal to present the machine-based determination of speech intelligibility in an aircraft during flight operations.
Type: Grant
Filed: October 15, 2008
Date of Patent: March 5, 2013
Assignee: The Boeing Company
Inventor: Naval Kishore Agarwal
-
Patent number: 8380484
Abstract: A method (50) of dynamically changing a sentence structure of a message can include the steps of receiving (51) a user request for information, retrieving (52) data based on the information requested, and altering (53) an intonation and/or the language conveying the information based on the context of the information to be presented. The intonation can optionally be altered by altering (54) a volume, a speed, and/or a pitch based on the information to be presented. The language can be altered by selecting (55) among a finite set of synonyms based on the information to be presented to the user or by selecting (56) among key verbs, adjectives or adverbs that vary along a continuum.
Type: Grant
Filed: August 10, 2004
Date of Patent: February 19, 2013
Assignee: International Business Machines Corporation
Inventors: Brent L. Davis, Stephen W. Hanley, Vanessa V. Michelini, Melanie D. Polkosky
-
Patent number: 8374873
Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
Type: Grant
Filed: August 11, 2009
Date of Patent: February 12, 2013
Assignee: Morphism, LLC
Inventor: James H. Stephens, Jr.
-
Patent number: 8374859
Abstract: An automatic answering device and an automatic answering method for automatically answering a user utterance are configured: to prepare a conversation scenario that is a set of input sentences and reply sentences, the input sentences each corresponding to a user utterance assumed to be uttered by a user, the reply sentences each being an automatic reply to the inputted sentence; to accept a user utterance; to determine the reply sentence to the accepted user utterance on the basis of the conversation scenario; and to present the determined reply sentence to the user. Data of the conversation scenario have a data structure that enables the inputted sentences and the reply sentences to be expressed in a state transition diagram in which each of the inputted sentences is defined as a morphism and the reply sentence corresponding to the inputted sentence is defined as an object.
Type: Grant
Filed: August 17, 2009
Date of Patent: February 12, 2013
Assignee: Universal Entertainment Corporation
Inventors: Shengyang Huang, Hiroshi Katukura
-
Patent number: 8374872
Abstract: A device provides a question to a user, and receives, from the user, an unrecognized voice response to the question. The device also provides the unrecognized voice response to an utterance agent for determination of the unrecognized voice response without user involvement, and provides an additional question to the user prior to receiving the determination of the unrecognized voice response from the utterance agent.
Type: Grant
Filed: November 4, 2008
Date of Patent: February 12, 2013
Assignee: Verizon Patent and Licensing Inc.
Inventor: Manohar R. Kesireddy
-
Patent number: 8374876
Abstract: A system and a method for speech generation which assist the speech of those with a disability or a medical condition such as cerebral palsy, motor neurone disease or dysarthria following a stroke. The system has a user interface having a multiplicity of states, each of which corresponds to a sound, and a selector for making a selection of a state or a combination of states. The system also has a processor for processing the selected state or combination of states and an audio output for outputting the sound or combination of sounds. The sounds associated with the states can be phonemes or phonics, and the user interface is typically a manually operable device such as a mouse, trackball, joystick or other device that allows a user to distinguish between states by manipulating the interface to a number of positions.
Type: Grant
Filed: February 1, 2007
Date of Patent: February 12, 2013
Assignee: The University of Dundee
Inventors: Rolf Black, Annalu Waller, Eric Abel, Iain Murray, Graham Pullin
-
Publication number: 20130035940
Abstract: The invention provides an electrolaryngeal speech reconstruction method and a system thereof. Firstly, model parameters are extracted from the collected speech as a parameter library, then facial images of a speaker are acquired and then transmitted to an image analyzing and processing module to obtain the voice onset and offset times and the vowel classes, then a waveform of a voice source is synthesized by a voice source synthesis module, finally, the waveform of the above voice source is output by an electrolarynx vibration output module, wherein the voice source synthesis module firstly sets the model parameters of a glottal voice source so as to synthesize the waveform of the glottal voice source, and then a waveguide model is used to simulate sound transmission in a vocal tract and select shape parameters of the vocal tract according to the vowel classes.
Type: Application
Filed: September 4, 2012
Publication date: February 7, 2013
Applicant: XI'AN JIAOTONG UNIVERSITY
Inventors: Mingxi Wan, Liang Wu, Supin Wang, Zhifeng Niu, Congying Wan
-
Patent number: 8370150Abstract: The text information presentation device calculates an optimum readout speed on the basis of the content of the input text information, its arrival time, and the previous arrival time; speech-synthesizes the input text information at the calculated readout speed; and outputs it as an audio signal, or alternatively controls the speed at which a video signal is output according to the output state of the speech synthesizing unit.Type: GrantFiled: July 15, 2008Date of Patent: February 5, 2013Assignee: Panasonic CorporationInventors: Keiichi Toiyama, Mitsuteru Kataoka, Kohsuke Yamamoto
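The core idea, picking a readout rate so that speech for one text chunk finishes before the next chunk arrives, can be sketched as follows. The base rate and clamping bounds are assumptions for illustration, not values from the patent:

```python
# Illustrative sketch: choose a speech-rate multiplier so that synthesized
# speech for a caption-like text chunk fits in the time before the next
# chunk arrives. BASE_RATE and the clamp bounds are assumed values.

BASE_RATE = 15.0            # characters per second at normal speed
MIN_MULT, MAX_MULT = 0.75, 2.0

def readout_speed(text, arrival_time, prev_arrival_time):
    """Return a playback-speed multiplier for text-to-speech readout."""
    window = max(arrival_time - prev_arrival_time, 1e-6)
    needed_rate = len(text) / window      # chars/sec required to fit
    mult = needed_rate / BASE_RATE
    return min(max(mult, MIN_MULT), MAX_MULT)

# A 30-character caption with only 1 second before the next one:
print(readout_speed("x" * 30, 11.0, 10.0))  # 2.0 (clamped to the maximum)
```

A real device would feed this multiplier to its speech synthesizer, or slow the video output instead when the speech cannot keep up.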
-
Patent number: 8370148Abstract: Disclosed herein are systems, methods, and computer readable-media for answering a communication notification. The method for answering a communication notification comprises receiving a notification of communication from a user, converting information related to the notification to speech, outputting the information as speech to the user, and receiving from the user an instruction to accept or ignore the incoming communication associated with the notification. In one embodiment, information related to the notification comprises one or more of a telephone number, an area code, a geographic origin of the request, caller id, a voice message, address book information, a text message, an email, a subject line, an importance level, a photograph, a video clip, metadata, an IP address, or a domain name. Another embodiment involves notification assigned an importance level and repeat attempts at notification if it is of high importance.Type: GrantFiled: April 14, 2008Date of Patent: February 5, 2013Assignee: AT&T Intellectual Property I, L.P.Inventor: Horst Schroeter
-
Patent number: 8370151Abstract: Disclosed are techniques and systems to provide a narration of a text in multiple different voices where the portions of the text narrated using the different voices are selected by a user.Type: GrantFiled: January 14, 2010Date of Patent: February 5, 2013Assignee: K-NFB Reading Technology, Inc.Inventors: Raymond C. Kurzweil, Paul Albrecht, Peter Chapman
-
Patent number: 8370149Abstract: Waveform concatenation speech synthesis with high sound quality. Prosody with both high accuracy and high sound quality is achieved by performing a two-path search including a speech segment search and a prosody modification value search. An accurate accent is secured by evaluating the consistency of the prosody using a statistical model of prosody variations (the slope of the fundamental frequency) for both paths: the speech segment selection and the modification value search. In the prosody modification value search, a prosody modification value sequence that minimizes a modified prosody cost is searched for. This allows a search for a modification value sequence that raises the likelihood of the absolute values or variations of the prosody under the statistical model as high as possible while keeping the modification values minimal.Type: GrantFiled: August 15, 2008Date of Patent: February 5, 2013Assignee: Nuance Communications, Inc.Inventors: Ryuki Tachibana, Masafumi Nishimura
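The modification-value search balances two terms: how well the modified F0 contour's slopes match a statistical slope model, and how large the modifications themselves are. A toy brute-force sketch of that trade-off (the candidate offsets, squared-error costs, and weight are stand-ins, not the patent's actual cost process):

```python
# Toy sketch of the prosody-modification-value search: for each unit, pick a
# pitch offset from a small candidate set so that the modified contour's
# slopes stay close to a target slope model while the offsets stay small.
# Candidate values, the squared-error costs, and the weight w are assumed.

import itertools

def search_modifications(f0, target_slopes, candidates=(-10, 0, 10), w=0.1):
    """Brute-force the offset sequence minimizing the modified prosody cost."""
    best, best_cost = None, float("inf")
    for offsets in itertools.product(candidates, repeat=len(f0)):
        modified = [f + o for f, o in zip(f0, offsets)]
        slope_cost = sum((modified[i + 1] - modified[i] - s) ** 2
                         for i, s in enumerate(target_slopes))
        mod_cost = sum(o * o for o in offsets)
        cost = slope_cost + w * mod_cost
        if cost < best_cost:
            best, best_cost = offsets, cost
    return best

# A flat contour forced toward a rise-then-fall slope model:
print(search_modifications([100, 100, 100], [10, -10]))  # (0, 10, 0)
```

The real search would use dynamic programming rather than exhaustive enumeration, but the cost structure (slope-model likelihood versus modification magnitude) is the point of the sketch.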
-
Patent number: 8364488Abstract: Disclosed are techniques and systems to provide a narration of a text in multiple different voices. Further disclosed are techniques and systems for modifying a voice model associated with a selected character based on data received from a user.Type: GrantFiled: January 14, 2010Date of Patent: January 29, 2013Assignee: K-NFB Reading Technology, Inc.Inventors: Raymond C. Kurzweil, Paul Albrecht, Peter Chapman
-
Patent number: 8364487Abstract: A language processing system may determine a display form of a spoken word by analyzing the spoken form using a language model that includes dictionary entries for display forms of homonyms. The homonyms may include trade names as well as given names and other phrases. The language processing system may receive spoken language and produce a display form of the language while displaying the proper form of the homonym. Such a system may be used in search systems where audio input is converted to a graphical display of a portion of the spoken input.Type: GrantFiled: October 21, 2008Date of Patent: January 29, 2013Assignee: Microsoft CorporationInventors: Yun-Cheng Ju, Julian J. Odell
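Choosing the display form of a homonym from dictionary entries can be sketched by scoring each candidate form against the surrounding words. The dictionary, context sets, and scoring rule below are invented for illustration:

```python
# Illustrative sketch: pick the display form of a spoken homonym by counting
# overlaps between each dictionary entry's context words and the words
# surrounding the spoken word. Entries and context sets are made up.

HOMONYMS = {
    "nike": [("Nike", {"shoes", "brand"}), ("nike", set())],
    "apple": [("Apple", {"iphone", "mac"}), ("apple", {"pie", "fruit"})],
}

def display_form(spoken, context):
    """Return the best display form for a spoken word given context words."""
    entries = HOMONYMS.get(spoken)
    if not entries:
        return spoken
    context = set(context)
    return max(entries, key=lambda e: len(e[1] & context))[0]

print(display_form("apple", ["fresh", "pie"]))   # the fruit sense
print(display_form("apple", ["new", "iphone"]))  # the trade name
```

A production language model would score forms statistically over much larger contexts; the overlap count stands in for that score.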
-
Patent number: 8364466Abstract: The teachings described herein generally relate to a multilingual electronic translation of a source phrase to a destination language selected from multiple languages, and this can be accomplished through the use of a network environment. The electronic translation can occur as a spoken translation, can be in real-time, and can mimic the voice of the user of the system.Type: GrantFiled: June 16, 2012Date of Patent: January 29, 2013Assignee: NewTalk, Inc.Inventors: Bruce W. Nash, Craig A. Robinson, Martha P. Robinson, Robert H. Clemons
-
Patent number: 8364472Abstract: Provided is an audio encoding device which can detect an optimal pitch pulse when using pitch pulse information as redundant information.Type: GrantFiled: February 29, 2008Date of Patent: January 29, 2013Assignee: Panasonic CorporationInventor: Hiroyuki Ehara
-
Patent number: 8359202Abstract: Disclosed are techniques and systems to provide a narration of a text in multiple different voices where the portions of the text narrated using the different voices are selected by a user. Also disclosed are techniques and systems for associating characters with portions of a sequence of words selected by a user. Different characters having different voice models can be associated with different portions of a sequence of words.Type: GrantFiled: January 14, 2010Date of Patent: January 22, 2013Assignee: K-NFB Reading Technology, Inc.Inventors: Raymond C. Kurzweil, Paul Albrecht, Peter Chapman
-
Publication number: 20130013312Abstract: A system and method for improving the response time of text-to-speech synthesis using triphone contexts. The method includes receiving input text and selecting a plurality of N phoneme units from a triphone unit selection database as candidate phonemes for synthesized speech based on the input text, wherein the triphone unit selection database comprises triphone units each comprising three phones. If the candidate phonemes are available in the triphone unit selection database, the method includes applying a cost process to select a set of phonemes from the candidate phonemes. If no candidate phonemes are available in the triphone unit selection database, the method includes applying a single phoneme approach to select single phonemes for synthesis, the single phonemes used in synthesis being independent of a triphone structure.Type: ApplicationFiled: July 16, 2012Publication date: January 10, 2013Applicant: AT&T Intellectual Property II, L.P.Inventor: Alistair D. Conkie
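The triphone-first selection with a single-phoneme fallback can be sketched as a lookup with two inventories. The databases and unit names are toy stand-ins, and taking the first candidate stands in for the abstract's cost process:

```python
# Sketch of triphone-first unit selection: look up each target triphone
# context in a (toy) triphone database; when no candidates exist, fall back
# to a single-phoneme inventory keyed by the center phone.

TRIPHONE_DB = {
    ("sil", "h", "eh"): ["h_unit_1", "h_unit_2"],
    ("h", "eh", "l"): ["eh_unit_1"],
}
SINGLE_PHONE_DB = {"h": "h_single", "eh": "eh_single", "l": "l_single"}

def select_units(triphones):
    """Pick one unit per (left, center, right) triphone, with fallback."""
    chosen = []
    for left, center, right in triphones:
        candidates = TRIPHONE_DB.get((left, center, right))
        if candidates:
            chosen.append(candidates[0])   # stand-in for the cost process
        else:
            chosen.append(SINGLE_PHONE_DB[center])
    return chosen

print(select_units([("sil", "h", "eh"), ("h", "eh", "l"), ("eh", "l", "ow")]))
```

The last target context is missing from the triphone database, so its center phone is synthesized from the single-phoneme inventory, independent of triphone structure.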
-
Patent number: 8352269Abstract: Disclosed are techniques and systems to provide a narration of a text in multiple different voices. Further disclosed are techniques and systems for processing indicia in a document to determine a portion of words and associating a particular voice model with the portion of words based on the indicia. During a readback process, an audible output corresponding to the words in the portion of words is generated using the voice model associated with the portion of words.Type: GrantFiled: January 14, 2010Date of Patent: January 8, 2013Assignee: K-NFB Reading Technology, Inc.Inventors: Raymond C. Kurzweil, Paul Albrecht, Peter Chapman
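Associating voice models with portions of text based on indicia can be sketched with one concrete kind of indicium, a speaker tag. The tag format (`NAME:`) and the voice names are assumptions for the example, not the patent's actual indicia:

```python
# Illustrative sketch: scan text for speaker indicia like "ALICE:" and
# associate each tagged portion with a voice model, so a readback pass can
# synthesize each portion in its assigned voice. Untagged text gets a
# default narrator voice. Tag format and voice names are assumed.

import re

VOICES = {"ALICE": "female_voice_1", "BOB": "male_voice_1"}
DEFAULT_VOICE = "narrator_voice"

def assign_voices(text):
    """Split text on speaker indicia; return (voice, words) portions."""
    portions = []
    for chunk in re.split(r"(\b[A-Z]+:)", text):
        chunk = chunk.strip()
        if not chunk:
            continue
        if chunk.endswith(":") and chunk[:-1] in VOICES:
            portions.append([VOICES[chunk[:-1]], ""])   # open a tagged portion
        elif portions and portions[-1][1] == "":
            portions[-1][1] = chunk                      # fill the open portion
        else:
            portions.append([DEFAULT_VOICE, chunk])
    return [tuple(p) for p in portions]

print(assign_voices("Once upon a time. ALICE: Hello! BOB: Hi."))
```

Each returned pair names the voice model a synthesizer would use for that portion of words during readback.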
-
Patent number: 8352271Abstract: To facilitate text-to-speech conversion of a username, a first or last name of a user associated with the username may be retrieved, and a pronunciation of the username may be determined based at least in part on whether the name forms at least part of the username. To facilitate text-to-speech conversion of a domain name having a top level domain and at least one other level domain, a pronunciation for the top level domain may be determined based at least in part upon whether the top level domain is one of a predetermined set of top level domains. Each other level domain may be searched for one or more recognized words therewithin, and a pronunciation of the other level domain may be determined based at least in part on an outcome of the search. The username and domain name may form part of a network address such as an email address, URL or URI.Type: GrantFiled: February 23, 2012Date of Patent: January 8, 2013Assignee: Research In Motion LimitedInventors: Matthew Bells, Jennifer Elizabeth Lhotak, Michael Angelo Nanni
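The domain-name side of this can be sketched as a per-level decision: read a level as a word when it matches a recognized-word lexicon, otherwise spell it out, and give known top-level domains their conventional reading. The lexicon and TLD table are assumptions for illustration:

```python
# Sketch of a domain-name reading heuristic: known TLDs get a conventional
# spoken form; each other level is read as a word if it is in a small
# lexicon, else spelled letter by letter. Lexicon and TLD table are assumed.

KNOWN_TLDS = {"com": "dot com", "org": "dot org", "net": "dot net"}
LEXICON = {"research", "motion", "example", "mail"}

def read_domain(domain):
    """Return a spoken rendering of a dotted domain name."""
    *others, tld = domain.lower().split(".")
    parts = []
    for level in others:
        if level in LEXICON:
            parts.append(level)            # pronounce as a word
        else:
            parts.append(" ".join(level))  # spell letter by letter
        parts.append("dot")
    parts[-1] = KNOWN_TLDS.get(tld, "dot " + " ".join(tld))
    return " ".join(parts)

print(read_domain("example.com"))  # "example dot com"
print(read_domain("rim.net"))      # "r i m dot net"
```

The username side would work analogously, searching the local part for the user's first or last name before deciding between word-like and spelled-out pronunciation.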
-
Patent number: 8352267Abstract: A plurality of input devices each includes a speaker, an operation data transmitter, a voice data receiver, and a voice controller. An information processing apparatus includes a voice storing area, object displaying programmed logic circuitry, operation data acquiring programmed logic circuitry, pointing position determining programmed logic circuitry, object specifying programmed logic circuitry, voice reading programmed logic circuitry, and voice data transmitting programmed logic circuitry. The pointing position determining programmed logic circuitry specifies, for each of the input devices, a pointing position on a screen based on operation data transmitted from the operation data transmitter. The voice reading programmed logic circuitry reads voice data corresponding to the pointing position for each of the input devices. The voice data transmitting programmed logic circuitry transmits the voice data to each of the input devices.Type: GrantFiled: June 27, 2007Date of Patent: January 8, 2013Assignee: Nintendo Co., Ltd.Inventor: Toshiaki Suzuki
-
Systems and methods for selective rate of speech and speech preferences for text to speech synthesis
Patent number: 8352268Abstract: Algorithms for synthesizing speech used to identify media assets are provided. Speech may be selectively synthesized from text strings associated with media assets. A text string may be normalized and its native language determined for obtaining a target phoneme for providing human-sounding speech in a language (e.g., dialect or accent) that is familiar to a user. The algorithms may be implemented on a system including several dedicated render engines. The system may be part of a back end coupled to a front end including storage for media assets and associated synthesized speech, and a request processor for receiving and processing requests that result in providing the synthesized speech. The front end may communicate media assets and associated synthesized speech content over a network to host devices coupled to portable electronic devices on which the media assets and synthesized speech are played back.Type: GrantFiled: September 29, 2008Date of Patent: January 8, 2013Assignee: Apple Inc.Inventors: DeVang Naik, Kim Silverman, Jerome Bellegarda
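The normalization and native-language-detection step for a media asset's text string can be sketched simply. The cleanup rules and the script-range language guess below are illustrative assumptions, not the patented implementation:

```python
# Illustrative sketch of the front-end step: clean up a track title and make
# a rough script-based guess at its native language, so a later stage can
# pick a matching voice and target phonemes. All rules here are assumed.

import re
import unicodedata

def normalize(title):
    """Strip parenthetical decorations and collapse whitespace in a title."""
    title = re.sub(r"\s*[\(\[].*?[\)\]]", "", title)  # drop "(Remastered)" etc.
    title = unicodedata.normalize("NFKC", title)
    return re.sub(r"\s+", " ", title).strip()

def guess_language(text):
    """Very rough script-range language guess for voice selection."""
    if any("\u3040" <= ch <= "\u30ff" for ch in text):   # kana
        return "ja"
    if any("\u0400" <= ch <= "\u04ff" for ch in text):   # Cyrillic
        return "ru"
    return "en"

t = normalize("Yesterday  (Remastered 2009)")
print(t, guess_language(t))  # "Yesterday en"
```

In the described system, the resulting language tag would steer phoneme and voice selection in the render engines so titles are spoken in a dialect familiar to the user.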