Speech Synthesis; Text To Speech Systems (epo) Patents (Class 704/E13.001)
  • Publication number: 20110283190
    Abstract: An interface device and method of use, comprising audio and image inputs; a processor for determining topics of interest, and receiving information of interest to the user from a remote resource; an audiovisual output for presenting an anthropomorphic object conveying the received information, having a selectively defined and adaptively alterable mood; an external communication device adapted to remotely communicate at least a voice conversation with a human user of the personal interface device. Also provided are a system and method adapted to receive conversational logic and to synthesize and engage in conversation dependent on the received conversational logic and a personality.
    Type: Application
    Filed: May 12, 2011
    Publication date: November 17, 2011
    Inventor: Alexander POLTORAK
  • Publication number: 20110276332
    Abstract: A speech synthesis method comprising: receiving a text input and outputting speech corresponding to said text input using a stochastic model, said stochastic model comprising an acoustic model and an excitation model, said acoustic model having a plurality of model parameters describing probability distributions which relate a word or part thereof to a feature, said excitation model comprising excitation model parameters which are used to model the vocal cords and lungs to output the speech using said features; wherein said acoustic parameters and excitation parameters have been jointly estimated; and outputting said speech.
    Type: Application
    Filed: May 6, 2011
    Publication date: November 10, 2011
    Applicant: Kabushiki Kaisha Toshiba
    Inventors: Ranniery MAIA, Byung Ha Chun
  • Publication number: 20110270614
    Abstract: A method and an apparatus for switching speech or audio signals. The method includes, when switching speech or audio signals, weighting a first high frequency band signal of a current frame of the speech or audio signal and second high frequency band signals of the previous M frames, where M is greater than or equal to 1, to obtain a processed first high frequency band signal, and synthesizing the processed first high frequency band signal and a first low frequency band signal of the current frame into a wide frequency band signal. In this way, speech or audio signals with different bandwidths can be switched smoothly, improving the quality of the audio signals received by a user.
    Type: Application
    Filed: June 16, 2011
    Publication date: November 3, 2011
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Zexin Liu, Lei Miao, Chen Hu, Wenhai Wu, Yue Lang, Qing Zhang
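    The band-switching scheme above can be sketched as a weighted sum of the current and previous high-band frames, recombined with the unchanged low band. This is a minimal illustration; the weight values and the additive recombination are assumptions, as the abstract leaves the exact weighting and synthesis steps open.

    ```python
    def smooth_band_switch(curr_high, prev_highs, weights=None):
        """Weighted sum of the current frame's high-band samples with the
        previous M frames' high-band samples to avoid an abrupt change."""
        frames = [curr_high] + list(prev_highs)
        if weights is None:
            # Emphasize the current frame, split the rest evenly (assumed scheme).
            weights = [0.5] + [0.5 / len(prev_highs)] * len(prev_highs)
        return [sum(w * f[i] for w, f in zip(weights, frames))
                for i in range(len(curr_high))]

    def synthesize_wideband(low_band, smoothed_high):
        # Recombine the unchanged low band with the smoothed high band.
        return [lo + hi for lo, hi in zip(low_band, smoothed_high)]
    ```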
  • Publication number: 20110270613
    Abstract: The disclosed solution includes a method for dynamically switching modalities based upon inferred conditions in a dialogue session involving a speech application. The method establishes a dialogue session between a user and the speech application. During the dialogue session, the user interacts using an original modality and a second modality. The speech application interacts using a speech modality only. A set of conditions indicative of interaction problems using the original modality can be inferred. Responsive to the inferring step, the original modality can be changed to the second modality. A modality transition to the second modality can be transparent to the speech application and can occur without interrupting the dialogue session. The original modality and the second modality can be different modalities; one including a text exchange modality and another including a speech modality.
    Type: Application
    Filed: July 8, 2011
    Publication date: November 3, 2011
    Applicant: Nuance Communications, Inc.
    Inventors: William V. Da Palma, Baiju D. Mandalia, Victor S. Moore, Wendi L. Nusbickel
  • Publication number: 20110264452
    Abstract: Example embodiments disclosed herein relate to audio output of speech data using speech control commands. In particular, example embodiments include a mechanism for accessing text data. Example embodiments may also include a mechanism for outputting the text data as audio by converting the text data to speech audio data and transmitting the speech audio data over an audio output. Example embodiments may also include a mechanism for receiving speech control commands that allow for voice control of the output of the audio data.
    Type: Application
    Filed: April 27, 2010
    Publication date: October 27, 2011
    Inventors: Ramya Venkataramu, Molly Joy
  • Publication number: 20110260832
    Abstract: In one embodiment, a method includes enrolling a potential enrollee for an identity-monitoring service. The enrolling includes acquiring personally-identifying information (PII) and capturing a voiceprint. Following successful completion of the enrolling, the potential enrollee is an enrollee. The method further includes, responsive to an identified suspicious event related to the PII, creating an identity alert, establishing voice communication with an individual purporting to be the enrollee, and performing voice-biometric verification of the individual. The voice-biometric verification includes comparing one or more spoken utterances with the voiceprint. Following successful completion of the voice-biometric verification, the individual is a verified enrollee. In addition, the method includes authorizing delivery of the identity alert to the verified enrollee.
    Type: Application
    Filed: April 25, 2011
    Publication date: October 27, 2011
    Inventors: Joe Ross, Isaac Chapa, Adrian Cruz, Harold E. Gottschalk, JR.
  • Publication number: 20110243447
    Abstract: Method and apparatus for synthesizing speech from a plurality of portions of text data, each portion having at least one associated attribute. The invention is achieved by determining (25, 35, 45) a value of the attribute for each of the portions of text data, selecting (27, 37, 47) a voice from a plurality of candidate voices on the basis of each of said determined attribute values, and converting (29, 39, 49) each portion of text data into synthesized speech using said respective selected voice.
    Type: Application
    Filed: December 7, 2009
    Publication date: October 6, 2011
    Applicant: KONINKLIJKE PHILIPS ELECTRONICS N.V.
    Inventor: Franciscus Johannes Henricus Maria Meulenbroeks
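    The attribute-driven voice selection described above can be sketched as a simple mapping step ahead of the TTS conversion. The attribute-to-voice mapping and the default voice are assumptions for illustration; the abstract only requires that selection be based on the determined attribute values.

    ```python
    def select_voices(portions, voice_for_attr, default_voice="neutral"):
        """Map each (text, attribute_value) portion to a candidate voice,
        producing (text, voice) pairs ready for the conversion step."""
        return [(text, voice_for_attr.get(attr, default_voice))
                for text, attr in portions]
    ```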
  • Publication number: 20110246201
    Abstract: While performing a function, a mobile device identifies that it is idle while it is downloading content or performing another task. During that idle time, it gathers one or more parameters (e.g., location, time, gender of user, age of user, etc.) and sends a request for an audio message (e.g., audio advertisement). One or more servers at a remote facility receive the request with the one or more parameters, and use the parameters to identify a targeted message. In some cases, the targeted message will include one or more dynamic variables (e.g., distance to store, time to event, etc.) that will be replaced based on the parameters received from the mobile device, so that the audio message is dynamically updated and customized for the mobile device. In one embodiment, the targeted message is transmitted to the mobile device as text. After being received at the mobile device, the text is optionally displayed and converted to an audio format and played for the user.
    Type: Application
    Filed: April 6, 2010
    Publication date: October 6, 2011
    Inventor: Andre F. Hawit
  • Publication number: 20110246199
    Abstract: According to one embodiment, a speech synthesizer generates a speech segment sequence and synthesizes speech by connecting speech segments of the generated speech segment sequence. If a speech segment of a synthesized first speech segment sequence is different from the speech segment of a synthesized second speech segment sequence having the same synthesis unit as the first speech segment sequence, the speech synthesizer disables the speech segment of the first speech segment sequence that is different from the speech segment of the second speech segment sequence.
    Type: Application
    Filed: September 14, 2010
    Publication date: October 6, 2011
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Osamu NISHIYAMA, Takehiko Kagoshima
  • Publication number: 20110243310
    Abstract: A system may include a database configured to selectively store and retrieve data. The system may further include a call record parser configured to receive a plurality of call records, each call record being associated with a respective call, parse the plurality of call records to identify periods of resource usage and types of resource usage for the associated calls, create parsed data based on the identified periods of resource usage and the types of resource usage, and store the parsed data in the database indexed according to the type of the identified resource and including the start and end times for the identified periods of usage.
    Type: Application
    Filed: March 30, 2010
    Publication date: October 6, 2011
    Applicant: Verizon Patent and Licensing Inc.
    Inventors: Belinda Franklin-Barr, John Rivera
  • Publication number: 20110238420
    Abstract: According to one embodiment, a method for editing speech is disclosed. The method can generate speech information from a text. The speech information includes phonologic information and prosody information. The method can divide the speech information into a plurality of speech units, based on at least one of the phonologic information and the prosody information. The method can search at least two speech units from the plurality of speech units. At least one of the phonologic information and the prosody information in the at least two speech units are identical or similar. In addition, the method can store a speech unit waveform corresponding to one of the at least two speech units as a representative speech unit into a memory.
    Type: Application
    Filed: September 13, 2010
    Publication date: September 29, 2011
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Gou Hirabayashi, Takehiko Kagoshima
  • Publication number: 20110231192
    Abstract: A system and method for generating audio content. Content is automatically retrieved from an original website according to a predetermined schedule to generate retrieved content. The retrieved content is converted to one or more audio files. A hierarchy is assigned to the one or more audio files to provide an audible website that mimics a hierarchy of the retrieved content as represented at the original website. The audible website is stored in a database for retrieval by one or more users. A first user input is received indicating an attempt to access the original website. The audible website is indicated as being associated with the original website in response to the user selection. Portions of the audible website are played in response to a second user input.
    Type: Application
    Filed: May 2, 2011
    Publication date: September 22, 2011
    Inventors: William C. O'Conor, Nathan T. Bradley
  • Publication number: 20110231193
    Abstract: Various technologies for generating a synthesized singing voice waveform. In one implementation, the computer program may receive a request from a user to create a synthesized singing voice using the lyrics of a song and a digital file containing its melody as inputs. The computer program may then dissect the lyrics' text and the melody file into their corresponding sub-phonemic units and musical score, respectively. The musical score may be further dissected into a sequence of musical notes and duration times for each musical note. The computer program may then determine a fundamental frequency (F0), or pitch, of each musical note.
    Type: Application
    Filed: June 2, 2011
    Publication date: September 22, 2011
    Applicant: Microsoft Corporation
    Inventors: Yao Qian, Frank Soong
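    Determining the fundamental frequency of a musical note, as the last step above describes, follows directly from 12-tone equal temperament. A minimal sketch, assuming the notes are given as MIDI note numbers (an assumption; the abstract does not specify the note representation):

    ```python
    def note_to_f0(midi_note, a4_hz=440.0):
        """Fundamental frequency (F0) of a musical note in 12-tone equal
        temperament, with A4 (MIDI note 69) as the reference pitch."""
        return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)
    ```

    For example, MIDI note 60 (middle C) maps to roughly 261.63 Hz.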
  • Publication number: 20110205849
    Abstract: A digital calendar device involving an electronic device having a graphic user interface. The electronic device is capable of receiving at least one note from a user, and retrieving and displaying the at least one note, as well as running on at least one operating system. The digital calendar device further includes features such as retrieval and display features, a touch screen interface, a handwriting support feature, a speech recognition feature, a reminder feature, a frame feature, a mechanical support feature, and a fortune-teller software program.
    Type: Application
    Filed: February 23, 2010
    Publication date: August 25, 2011
    Applicant: SONY CORPORATION, A JAPANESE CORPORATION
    Inventor: Feng Kang
  • Publication number: 20110200214
    Abstract: A hearing aid includes a microphone to convert audible sounds into sound-related electrical signals and a memory configured to store a plurality of hearing aid profiles. Each hearing aid profile has an associated audio label. The hearing aid further includes a processor coupled to the microphone and to the memory and configured to select one of the plurality of hearing aid profiles. The processor applies the one of the plurality of hearing aid profiles to the sound-related electrical signals to produce a shaped output signal to compensate for a hearing impairment of a user. The processor is configured to insert the associated audio label into the shaped output signal. The hearing aid also includes a speaker coupled to the processor and configured to convert the shaped output signal into an audible sound.
    Type: Application
    Filed: February 8, 2011
    Publication date: August 18, 2011
    Applicant: AUDIOTONIQ, INC.
    Inventors: John Michael Page Knox, David Matthew Landry, Samir Ibrahim, Andrew Lawrence Eisenberg
  • Publication number: 20110202344
    Abstract: Techniques for providing speech output for speech-enabled applications. A synthesis system receives from a speech-enabled application a text input including a text transcription of a desired speech output. The synthesis system selects one or more audio recordings corresponding to one or more portions of the text input. In one aspect, the synthesis system selects from audio recordings provided by a developer of the speech-enabled application. In another aspect, the synthesis system selects an audio recording of a speaker speaking a plurality of words. The synthesis system forms a speech output including the one or more selected audio recordings and provides the speech output for the speech-enabled application.
    Type: Application
    Filed: February 12, 2010
    Publication date: August 18, 2011
    Inventors: Darren C. Meyer, Corinne Bos-Plachez, Martine Marguerite Staessen
  • Publication number: 20110202347
    Abstract: A communication converter is described for converting among speech signals and textual information, permitting communication between telephone users and textual instant communications users.
    Type: Application
    Filed: April 28, 2011
    Publication date: August 18, 2011
    Applicant: VERIZON BUSINESS GLOBAL LLC
    Inventors: Richard G. Moore, Gregory L. Mumford, Duraisamy Gunasekar
  • Publication number: 20110196680
    Abstract: When a system (100) is used for synthesizing speech having prosody serving as a reference, the system stores speech element information representing a speech element capable of synthesizing speech having a degree of naturalness indicating a degree of similarity to speech uttered by a human higher than a predetermined reference value (speech element information storage (115)). The system accepts requested prosody information representing prosody requested by the user (requested prosody information accepting part (113)). The system generates intermediate prosody information representing intermediate prosody between the reference prosody and the requested prosody (intermediate prosody information generator (114)). The system executes a speech synthesis process to synthesize speech based on the generated intermediate prosody information and the stored speech element information (speech synthesizer (116)).
    Type: Application
    Filed: August 21, 2009
    Publication date: August 11, 2011
    Applicant: NEC CORPORATION
    Inventor: Masanori Kato
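    The intermediate-prosody generation above can be illustrated as a blend between the reference prosody and the user-requested prosody. The linear interpolation and per-value representation are assumptions for illustration; the abstract leaves the blending rule open.

    ```python
    def intermediate_prosody(reference, requested, alpha=0.5):
        """Blend reference and requested prosody values (e.g. per-phoneme
        F0 or durations). alpha=0 keeps the reference, alpha=1 the request."""
        return [(1 - alpha) * r + alpha * q
                for r, q in zip(reference, requested)]
    ```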
  • Publication number: 20110184738
    Abstract: TTS is a well-known technology that has been used for decades in applications ranging from artificial call-center attendants to PC software that allows people with visual impairments or reading disabilities to listen to written works on a home computer. However, to date TTS has not been widely adopted by PC and mobile users for daily reading tasks such as reading emails, PDF and Word documents, website content, and books. The present invention offers a new user experience for operating TTS for day-to-day usage. More specifically, this invention describes a synchronization technique for following text being read by TTS engines, and specific interfaces for touch pads and touch and multi-touch screens. This invention also describes the use of other input methods such as touchpad, mouse, and keyboard.
    Type: Application
    Filed: January 25, 2011
    Publication date: July 28, 2011
    Inventors: Dror KALISKY, Sharon CARMEL
  • Publication number: 20110179452
    Abstract: A device for providing a television sequence has a database interface, a search request receiver, a television sequence rendition module and an output interface. The database interface accesses at least one database, using a search request. The search request receiver is formed to control the database interface so as to acquire at least audio content and at least image content separate therefrom via the database interface for the search request. The television sequence rendition module combines the separate audio content and the image content to generate the television sequence based on the audio content and the image content. The output interface outputs the television sequence to a television sequence distributor.
    Type: Application
    Filed: January 21, 2011
    Publication date: July 21, 2011
    Inventors: Peter Dunker, Uwe Kuehhirt, Andreas Haupt, Christian Dittmar, Holger Grossman
  • Publication number: 20110153754
    Abstract: In various embodiments, a method for receiving alerts through a network includes providing a device having a pop-up management module and a display; providing a communications interface between the device and one or more database systems located outside the network; providing a user interface configured to allow the user to selectively choose to display, on the display, one or more message types generated by the one or more database systems, wherein said one or more message types are received by said pop-up management module via the network and displayed on the display as a pop-up message. A related system includes a device registered in the network having a processor, a memory device, a transceiver, a user interface, and a display, wherein the processor is configured to control a pop-up management module for displaying one or more message types as a pop-up message. The device may be a WiMAX-enabled device and the network may be a WiMAX network.
    Type: Application
    Filed: December 22, 2009
    Publication date: June 23, 2011
    Applicant: CLEAR WIRELESS, LLC
    Inventor: Don GUNASEKARA
  • Publication number: 20110153314
    Abstract: A method for dynamically adjusting the spectral content of an audio signal, which increases the harmonic content of said audio signal, said method comprising translating an encoded digital signal into data bands, creating a psychoacoustic model to identify sections of said data bands that are deficient in harmonic quality, analyzing the fundamental frequency and amplitude of said harmonically deficient data bands, creating additional higher order harmonics for said harmonically deficient data bands, adding said higher order harmonics back to said encoded digital signal to form a newly enhanced signal, inverse filtering said newly enhanced signal, and converting said inverse filtered signal to an analog waveform for consumption by the listener.
    Type: Application
    Filed: February 28, 2011
    Publication date: June 23, 2011
    Inventors: J. Craig Oxford, Patrick Taylor, D. Michael Shields
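    The harmonic-enhancement step above (creating higher-order harmonics for a deficient band and adding them back) can be sketched with a simple sinusoidal model. The sinusoidal synthesis, harmonic orders, and gain are illustrative assumptions; the abstract does not specify how the harmonics are generated.

    ```python
    import math

    def add_harmonics(signal, f0, sample_rate, orders=(2, 3), gain=0.1):
        """Mix synthetic higher-order harmonics of a deficient band's
        fundamental f0 back into the signal."""
        out = []
        for i, sample in enumerate(signal):
            t = i / sample_rate
            h = sum(gain * math.sin(2 * math.pi * n * f0 * t) for n in orders)
            out.append(sample + h)
        return out
    ```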
  • Publication number: 20110144997
    Abstract: A voice synthesis model generation device, a voice synthesis model generation system, a communication terminal device, and a method for generating a voice synthesis model, all of which are capable of suitably acquiring a user's voice. A voice synthesis model generation system is configured to include a mobile communication terminal device and a voice synthesis model generation device. The mobile communication terminal device includes a characteristic amount extraction portion that extracts a characteristic amount of input voice, and a text data acquisition portion that acquires text data from the voice.
    Type: Application
    Filed: July 7, 2009
    Publication date: June 16, 2011
    Applicant: NTT DOCOMO, INC
    Inventor: Noriko Mizuguchi
  • Publication number: 20110137655
    Abstract: A speech synthesis system includes a server device and a client device. The server device stores speech element information and speech element identification information in association with each other so that, in a case that speech element information representing respective speech elements included in speech uttered by a speech registering user are arranged in the order of arrangement of the speech elements in the speech, at least one of speech element identification information identifying the respective speech element information has different information from information arranged in accordance with a predetermined rule. The client device transmits speech element identification information to the server device based on accepted text information. The client device executes a speech synthesis process based on the speech element information received from the server device.
    Type: Application
    Filed: June 22, 2009
    Publication date: June 9, 2011
    Inventors: Reishi Kondo, Masanori Kato, Yasuyuki Mitsui
  • Publication number: 20110131516
    Abstract: Provided are a content display device and a content display method, each capable of reliably providing a user with voice reading of each article even when a plurality of content items displayed on a single screen are to be read aloud consecutively, together with a program therefor and a storage medium storing the program. A television (100) is a content display device capable of displaying a plurality of content items in a single screen and sequentially reading aloud, by voice, text strings relating to the respective content items. The television (100) includes a setting section (114) for setting the screen to a display condition that displays the content item whose related text string is currently being read aloud in such a manner that it is distinguishable from the other content item(s), in order to notify the user.
    Type: Application
    Filed: July 16, 2009
    Publication date: June 2, 2011
    Applicant: SHARP KABUSHIKI KAISHA
    Inventors: Hirofumi Furukawa, Kiyotaka Kashito
  • Publication number: 20110111805
    Abstract: A communication device establishes an audio connection with a far-end user via a communication network. The communication device receives text input from a near-end user, and converts the text input into speech signals. The speech signals are transmitted to the far-end user using the established audio connection while muting audio input to its microphone. Other embodiments are also described and claimed.
    Type: Application
    Filed: November 6, 2009
    Publication date: May 12, 2011
    Applicant: Apple Inc.
    Inventors: Baptiste P. Paquier, Aram M. Lindahl, Phillip G. Tamchina
  • Publication number: 20110112836
    Abstract: Electronic device and method for obtaining a digital speech signal and a control command relating to the digital speech signal while obtaining the digital speech signal, and for temporally associating the control command with a substantially corresponding time instant in the digital speech signal to which the control command was directed, wherein the control command determines one or more punctuation marks or other, optionally symbolic, elements to be at least logically positioned at a text location corresponding to the communication instant relative to the digital speech signal, so as to cultivate the speech-to-text conversion procedure.
    Type: Application
    Filed: July 3, 2008
    Publication date: May 12, 2011
    Applicant: MOBITER DICTA OY
    Inventors: Risto Kurki-Suonio, Andrew Cotton
  • Publication number: 20110106538
    Abstract: This speech synthesis system includes a server device and a client device. The client device accepts text information representing text, and transmits a speech element request to the server device. The server device stores speech element information. The server device receives the speech element request transmitted by the client device and, in response to the received speech element request, transmits speech element information to the client device so that the speech element information is received by the client device in a different order from an order of arrangement of speech elements in speech corresponding to the text. The client device executes a speech synthesis process by rearranging the speech element information so that speech elements represented by the received speech element information are arranged in the same order as the order of arrangement of the speech elements in the speech corresponding to the text.
    Type: Application
    Filed: June 22, 2009
    Publication date: May 5, 2011
    Inventors: Reishi Kondo, Masanori Kato, Yasuyuki Mitsui
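    The client-side rearrangement described above can be sketched as a lookup keyed by speech-element ID. The field names and the ID-list representation of the utterance order are assumptions for illustration; the abstract only requires that elements received out of order be rearranged into the order of the speech for the text.

    ```python
    def reassemble_waveform(received_elements, utterance_order):
        """Rearrange speech elements that arrived in an unrelated order
        into the order in which they occur in the target utterance."""
        by_id = {e["id"]: e["waveform"] for e in received_elements}
        return [by_id[i] for i in utterance_order]
    ```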
  • Publication number: 20110106537
    Abstract: Embodiments of the invention address the deficiencies of the prior art by providing a method, apparatus, and program product for converting components of a web page to voice prompts for a user. In some embodiments, the method comprises selectively determining at least one HTML component from a plurality of HTML components of a web page to transform into a voice prompt for a mobile system based upon a voice attribute file associated with the web page. The method further comprises transforming the at least one HTML component into parameterized data suitable for use by the mobile system based upon at least a portion of the voice attribute file associated with the at least one HTML component and transmitting the parameterized data to the mobile system.
    Type: Application
    Filed: October 30, 2009
    Publication date: May 5, 2011
    Inventors: Paul M. Funyak, Norman J. Connors, Paul E. Kolonay, Matthew Aaron Nichols
  • Publication number: 20110099014
    Abstract: Systems and methods are described for performing packet loss concealment (PLC) to mitigate the effect of one or more lost frames within a series of frames that represent a speech signal. In accordance with the exemplary systems and methods, PLC is performed by searching a codebook of speech-related parameter profiles to identify content that is being spoken and by selecting a profile associated with the identified content for use in predicting or estimating speech-related parameter information associated with one or more lost frames of a speech signal. The predicted/estimated speech-related parameter information is then used to synthesize one or more frames to replace the lost frame(s) of the speech signal.
    Type: Application
    Filed: September 21, 2010
    Publication date: April 28, 2011
    Applicant: BROADCOM CORPORATION
    Inventor: Robert W. Zopf
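    The codebook search at the heart of the PLC scheme above can be sketched as a nearest-profile lookup over speech-related parameters. The squared-distance metric and the profile layout (`context`/`prediction` fields) are assumptions for illustration, not the patent's actual data structures.

    ```python
    def conceal_lost_frame(last_good_params, codebook):
        """Pick the codebook profile whose context best matches the
        parameters of the last correctly received frames, and use its
        prediction to estimate the lost frame's parameters."""
        def distance(profile):
            return sum((a - b) ** 2
                       for a, b in zip(profile["context"], last_good_params))
        best = min(codebook, key=distance)
        return best["prediction"]
    ```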
  • Publication number: 20110083075
    Abstract: An emotive advisory system for use by one or more occupants of an automotive vehicle includes a directional speaker array, and a computer. The computer is configured to determine an audio direction, and output data representing an avatar for visual display. The computer is further configured to output data representing a spoken statement for the avatar for audio play from the speaker array such that the audio from the speaker array is directed in the determined audio direction. A visual appearance of the avatar and the spoken statement for the avatar convey a simulated emotional state.
    Type: Application
    Filed: October 2, 2009
    Publication date: April 7, 2011
    Applicant: FORD GLOBAL TECHNOLOGIES, LLC
    Inventors: Perry Robinson MacNeille, Oleg Yurievitch Gusikhin, Kacie Alane Theisen
  • Patent number: 7921014
    Abstract: A system for generating high-quality synthesized text-to-speech includes a learning data generating unit, a frequency data generating unit, and a setting unit. The learning data generating unit recognizes inputted speech, and then generates first learning data in which wordings of phrases are associated with readings thereof. The frequency data generating unit generates, based on the first learning data, frequency data indicating appearance frequencies of both wordings and readings of phrases. The setting unit sets the thus generated frequency data for a language processing unit in order to approximate outputted speech of text-to-speech to the inputted speech. Furthermore, the language processing unit generates, from a wording of text, a reading corresponding to the wording, on the basis of the appearance frequencies.
    Type: Grant
    Filed: July 9, 2007
    Date of Patent: April 5, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Gakuto Kurata, Toru Nagano, Masafumi Nishimura, Ryuki Tachibana
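    The frequency-based reading selection the grant describes can be sketched as choosing the most frequently observed reading for a wording, with a rule-based fallback. The dict-of-counts layout and the fallback callable are assumed representations of the frequency data.

    ```python
    def pick_reading(wording, frequency_data, rule_based_reading):
        """Choose the reading most frequently observed for a wording in
        the learning data; otherwise fall back to a rule-based reading."""
        readings = frequency_data.get(wording)
        if readings:
            return max(readings, key=readings.get)
        return rule_based_reading(wording)
    ```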
  • Publication number: 20110077945
    Abstract: This invention relates to a method, a computer program product, apparatuses and a system for extracting a coded parameter set from an encoded audio/speech stream, said audio/speech stream being distributed over a sequence of packets, and generating a time-scaled encoded audio/speech stream in the parameter-coded domain using said extracted coded parameter set.
    Type: Application
    Filed: June 6, 2007
    Publication date: March 31, 2011
    Applicant: NOKIA CORPORATION
    Inventors: Pasi Sakari Ojala, Ari Kalevi Lakaniemi
  • Publication number: 20110077048
    Abstract: The invention relates to a system for data correlation, having: a receiving device 1 having an image acquisition element 10 and a data set generator 12 for generating at least one object data set from at least one acquired first image, which represents a physical object, and an identification label, which uniquely determines an object-related acquisition procedure, and at least one information data set from at least one acquired second image, which represents coded information related to the physical object, and the identification label; a correlation device 2 for the extraction 20 of the coded information from the information data set, for the semantic analysis 22 of the extracted information, and for the generation of at least one combination data set from the results of the semantic analysis, the extracted information, and the at least one object data set with the same identification label as the extracted information data set; and a user device 3 for the storage and further use of the combination data
    Type: Application
    Filed: March 3, 2009
    Publication date: March 31, 2011
    Applicant: Linguatec Sprachtechnologien GmbH
    Inventor: Reinhard Busch
  • Publication number: 20110060590
    Abstract: A synthetic speech text-input device is provided that allows a user to intuitively know an amount of an input text that can be fit in a desired duration. A synthetic speech text-input device 1 includes: an input unit that receives a set duration in which a speech to be synthesized is to be fit, and a text for a synthetic speech; a text amount calculation unit that calculates an acceptable text amount based on the set duration received by the input unit, the acceptable text amount being an amount of a text acceptable as a synthetic speech of the set duration; and a text amount output unit that outputs the acceptable text amount calculated by the text amount calculation unit, when the input unit receives the text.
    Type: Application
    Filed: September 10, 2010
    Publication date: March 10, 2011
    Applicant: FUJITSU LIMITED
    Inventors: Nobuyuki Katae, Kentaro Murase
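    The duration-to-text calculation this abstract describes can be sketched as below. The speaking rate of 7 characters per second is an illustrative assumption, not a value from the patent; a real system would calibrate it against the synthesizer's actual output.

    ```python
    def acceptable_text_amount(duration_seconds: float, chars_per_second: float = 7.0) -> int:
        """Estimate how many characters of text fit in the set duration.

        chars_per_second is an assumed average speaking rate.
        """
        if duration_seconds <= 0:
            raise ValueError("duration must be positive")
        return int(duration_seconds * chars_per_second)

    def remaining_amount(text: str, duration_seconds: float) -> int:
        """Characters still acceptable after the user has typed `text`."""
        return acceptable_text_amount(duration_seconds) - len(text)
    ```

    Outputting `remaining_amount` as the user types gives the intuitive "how much still fits" feedback the device provides.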
  • Publication number: 20110050594
    Abstract: A user interface for a touch-screen display of a dedicated handheld electronic book reader device is described. The user interface detects human gestures manifest as pressure being applied by a finger or stylus to regions on the touch-screen display. In one implementation, the touch-screen user interface enables a user to turn one or more pages in response to applying a force or pressure to the touch-screen display. In another implementation, the touch-screen user interface is configured to bookmark a page temporarily by applying a pressure to the display, then allowing a user to turn pages to a new page, but reverting back to a previously-displayed page when the pressure is removed. In another implementation, the touch-screen user interface identifies and filters electronic books based on book size and/or a time available to read a book. In another implementation, the touch-screen user interface converts text to speech in response to a user touching the touch-screen display.
    Type: Application
    Filed: September 2, 2009
    Publication date: March 3, 2011
    Inventors: John T. Kim, Christopher Green, Joseph J. Hebenstreit, Kevin E. Keller
  • Publication number: 20110054880
    Abstract: Techniques and systems for content transformation between devices are disclosed. In one aspect, a system includes a host device that sends content to client devices, and client devices that receive content from the host device in one format and transform the content into a different format. The client devices present the transformed content to users. In another aspect, the host device presents content in a native format, determines that a client device requires the content to be in a different format, converts the content to a reference format, and sends the converted content to the client device.
    Type: Application
    Filed: September 2, 2009
    Publication date: March 3, 2011
    Inventor: Christopher B. Fleizach
  • Publication number: 20110046957
    Abstract: Techniques are disclosed for frequency splicing in which speech segments used in the creation of a final speech waveform are constructed, at least in part, by combining (e.g., summing) a small number (e.g., two) of component speech segments that overlap substantially, or entirely, in time but have spectral energy that occupies disjoint, or substantially disjoint, frequency ranges. The component speech segments may be derived from speech segments produced by different speakers or from different speech segments produced by the same speaker. Depending on the embodiment, frequency splicing may supplement rule-based, concatenative, hybrid, or limited-vocabulary speech synthesis systems to provide various advantages.
    Type: Application
    Filed: August 24, 2010
    Publication date: February 24, 2011
    Applicant: NovaSpeech, LLC
    Inventors: Susan R. Hertz, Harold G. Mills
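    The core operation the abstract describes — summing two time-overlapping segments whose spectral energy occupies roughly disjoint frequency ranges — can be illustrated with a crude band split. The moving-average low-pass filter here is a stand-in for whatever filtering the actual system uses.

    ```python
    def moving_average_lowpass(signal, window=5):
        """Crude low-pass filter: centered moving average (illustrative only)."""
        half = window // 2
        out = []
        for i in range(len(signal)):
            lo, hi = max(0, i - half), min(len(signal), i + half + 1)
            out.append(sum(signal[lo:hi]) / (hi - lo))
        return out

    def frequency_splice(segment_a, segment_b, window=5):
        """Combine the low band of segment_a with the high band of segment_b.

        The high band is the residual (original minus its low-pass version),
        so the two components occupy approximately disjoint frequency ranges
        and are combined by sample-wise summation.
        """
        assert len(segment_a) == len(segment_b)
        low = moving_average_lowpass(segment_a, window)
        high = [b - lb for b, lb in zip(segment_b, moving_average_lowpass(segment_b, window))]
        return [x + h for x, h in zip(low, high)]
    ```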
  • Publication number: 20110046943
    Abstract: A data processing method and apparatus that may set emotion based on development of a story are provided. The method and apparatus may set emotion without inputting emotion for each sentence of text data. Emotion setting information is generated based on development of the story and the like, and may be applied to the text data.
    Type: Application
    Filed: April 5, 2010
    Publication date: February 24, 2011
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Dong Yeol Lee, Seung Seop Park, Jae Hyun Ahn
  • Publication number: 20110040554
    Abstract: A procedure to automatically evaluate the spoken fluency of a speaker by prompting the speaker to talk on a given topic, recording the speaker's speech to obtain a recorded sample, and then analyzing the patterns of disfluencies in the speech to compute a numerical score that quantifies the speaker's spoken fluency skills. The numerical fluency score accounts for various prosodic and lexical features, including formant-based filled-pause detection, closely occurring exact and inexact repeat N-grams, and the normalized average distance between consecutive occurrences of N-grams. The lexical and prosodic features are combined to classify the speaker with a C-class classification and develop a rating for the speaker.
    Type: Application
    Filed: August 15, 2009
    Publication date: February 17, 2011
    Applicant: International Business Machines Corporation
    Inventors: Kartik Audhkhasi, Om D. Deshmukh, Kundan Kandhway, Ashish Verma
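    Two of the lexical features the abstract names — counts of closely occurring repeat N-grams and the normalized average distance between consecutive N-gram occurrences — can be sketched as follows. This is a simplified illustration; the patent's actual feature definitions may differ.

    ```python
    from collections import defaultdict

    def ngram_positions(words, n):
        """Map each n-gram to the list of positions where it occurs."""
        pos = defaultdict(list)
        for i in range(len(words) - n + 1):
            pos[tuple(words[i:i + n])].append(i)
        return pos

    def repeat_features(transcript, n=2):
        """Return (number of repeated n-grams, average gap between
        consecutive occurrences normalized by transcript length)."""
        words = transcript.lower().split()
        pos = ngram_positions(words, n)
        repeats = {g: p for g, p in pos.items() if len(p) > 1}
        gaps = [b - a for p in repeats.values() for a, b in zip(p, p[1:])]
        avg_gap = (sum(gaps) / len(gaps) / max(len(words), 1)) if gaps else 0.0
        return len(repeats), avg_gap
    ```

    A disfluent restart such as "i went to the, to the store" yields one repeated bigram with a small normalized gap, the kind of signal the classifier would consume.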
  • Publication number: 20110035222
    Abstract: Systems and methods for selecting one of several audio clips associated with a text item for playback are provided. The electronic device can determine which audio clip to play back at any point in time using different approaches, including for example receiving a user selection or randomly selecting audio clips. In some embodiments, the electronic device can intelligently select audio clips based on attributes of the media item, the electronic device operations, or the environment of the electronic device. The attributes can include, for example, metadata values of the media item, the type of ongoing operations of the electronic device, and environmental characteristics that can be measured or detected using sensors of or coupled to the electronic device. Different audio clips can be associated with particular attribute values, such that an audio clip corresponding to the detected or received attribute values is played back.
    Type: Application
    Filed: August 4, 2009
    Publication date: February 10, 2011
    Applicant: Apple Inc.
    Inventor: Jon Schiller
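    The attribute-matching selection with a random fallback that the abstract describes could look like the sketch below. The clip names and attribute keys are hypothetical.

    ```python
    import random

    def select_audio_clip(clips, attributes):
        """Pick the clip whose associated attribute values best match the
        detected attributes; fall back to a random choice when nothing matches.

        `clips` maps clip names to dicts of attribute values, e.g.
        {"upbeat.mp3": {"activity": "running"}}.
        """
        def score(clip_attrs):
            return sum(1 for k, v in attributes.items() if clip_attrs.get(k) == v)

        best = max(clips, key=lambda name: score(clips[name]))
        if score(clips[best]) == 0:
            return random.choice(list(clips))  # no attribute match: random selection
        return best
    ```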
  • Publication number: 20110035223
    Abstract: Systems and methods for retrieving and playing back audio clips for streamed or remotely received media items are provided. An electronic device can provide audio clips identifying media items at any suitable time, including for example to identify media items that are currently played back or available for playback. When the media items played back are not locally stored, the electronic device may not have a corresponding audio clip locally stored. In such cases, the electronic device can identify a streamed media item, and retrieve an audio clip corresponding to text items associated with the media item. For example, the electronic device can retrieve audio clips corresponding to the artist, title and album of the received media item. The electronic device can retrieve audio clips from any suitable source, such as a dedicated audio clip server or other remote source, a remote text-to-speech engine, or a locally stored text-to-speech engine.
    Type: Application
    Filed: August 4, 2009
    Publication date: February 10, 2011
    Applicant: Apple Inc.
    Inventor: Jon Schiller
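    The abstract lists several possible clip sources (a dedicated audio clip server, a remote text-to-speech engine, a locally stored one). One simple way to arrange them is a priority-ordered fallback chain; the callables below are placeholders, not an API from the patent.

    ```python
    def retrieve_clip(text, sources):
        """Query each clip source in priority order; the first source that
        can produce a clip wins. Each source is a callable standing in for
        a clip server, a remote TTS engine, or a local TTS engine, and
        returns None when it cannot supply a clip."""
        for source in sources:
            clip = source(text)
            if clip is not None:
                return clip
        return None
    ```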
  • Publication number: 20110018889
    Abstract: A media processing comparison system (“MPCS”) and techniques facilitate concurrent, subjective quality comparisons between media presentations produced by different instances of media processing components performing the same functions (for example, instances of media processing components in the form of hardware, software, and/or firmware, such as parsers, codecs, decryptors, and/or demultiplexers, supplied by the same or different entities) in a particular media content player. The MPCS receives an ordered stream of encoded media samples from a media source, and decodes a particular encoded media sample using two or more different instances of media processing components. A single renderer renders and/or coordinates the synchronous presentation of decoded media samples from each instance of media processing component(s) as separate media presentations. The media presentations may be subjectively compared and/or selected for storage by a user in a sample-by-sample manner.
    Type: Application
    Filed: July 23, 2009
    Publication date: January 27, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Firoz Dalal, Shyam Sadhwani
  • Publication number: 20110015930
    Abstract: A unified communication system is disclosed that allows a variety of end point types to participate in a communication event using a common, unified communication system. In some implementations, a calling party interacts with a client application residing on an endpoint to make a communication request to another endpoint. A communication event manager residing in the unified communication system selects a script from a repository of scripts based on the communication event and the capabilities of the endpoints. A communication event execution engine receives a user profile associated with at least one of the endpoints. The user profile can be configured by the user to describe the user's preferences for how the communication should be processed by the unified communication system.
    Type: Application
    Filed: September 7, 2010
    Publication date: January 20, 2011
    Applicant: INTELEPEER, INC.
    Inventors: John Ward, Haydar Haba, Charles Studt, Peter Antypas, Jonathan Green
  • Publication number: 20110015929
    Abstract: A contextual input device includes a plurality of tactually discernable keys disposed in a predetermined configuration which replicates a particular relationship among a plurality of items associated with a known physical object. The tactually discernable keys are typically labeled with Braille type. The known physical object is typically a collection of related items grouped together by some common relationship. A computer-implemented process determines whether an input signal represents a selection of an item from among a plurality of items or an attribute pertaining to an item among the plurality of items. Once the selected item or attribute pertaining to an item is determined, the computer-implemented process transforms the user's selection from the input signal into an analog audio signal which is then audibly output as human speech with an electro-acoustic transducer.
    Type: Application
    Filed: July 17, 2009
    Publication date: January 20, 2011
    Applicant: Calpoly Corporation
    Inventors: Dennis Fantin, C. Arthur MacCarley
  • Publication number: 20110010179
    Abstract: A method and an apparatus for voice synthesis and processing are presented. In one exemplary method, a first audio recording of human speech in a natural language is received. A speech analysis-synthesis algorithm is then applied to the first audio recording to synthesize a second audio recording that sounds humanistic and consistent, but unintelligible.
    Type: Application
    Filed: July 13, 2009
    Publication date: January 13, 2011
    Inventor: Devang K. Naik
  • Publication number: 20100332232
    Abstract: A method and device for updating statuses of synthesis filters are provided. The method includes: exciting a synthesis filter corresponding to a first encoding rate by using an excitation signal of the first encoding rate, outputting reconstructed signal information, and updating status information of the synthesis filter and a synthesis filter corresponding to a second encoding rate. In the present disclosure, the status of the synthesis filter corresponding to the current rate and the statuses of the synthesis filters at other rates are updated. Thus, synchronization between the statuses of the synthesis filters corresponding to different rates at the encoding terminal may be realized, thereby facilitating the consistency of the reconstructed signals of the encoding and decoding terminals when the encoding rate is switched, and improving the quality of the reconstructed signal of the decoding terminal.
    Type: Application
    Filed: September 16, 2010
    Publication date: December 30, 2010
    Inventor: Jinliang DAI
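    The state-synchronization step the abstract describes — updating both the active filter and the filter for the other rate after each frame — can be sketched with a minimal all-pole synthesis filter. The single-coefficient filter and class names are illustrative assumptions.

    ```python
    class SynthesisFilter:
        """Minimal all-pole synthesis filter with explicit memory (state)."""

        def __init__(self, coeffs):
            self.coeffs = coeffs              # LPC coefficients a_1..a_p
            self.memory = [0.0] * len(coeffs) # past output samples

        def synthesize(self, excitation):
            out = []
            for e in excitation:
                y = e + sum(a * m for a, m in zip(self.coeffs, self.memory))
                self.memory = [y] + self.memory[:-1]
                out.append(y)
            return out

    def synthesize_and_sync(active, inactive, excitation):
        """Run the filter for the current encoding rate, then copy its
        updated state into the filter for the other rate, so a later rate
        switch starts from a consistent memory at encoder and decoder."""
        signal = active.synthesize(excitation)
        inactive.memory = list(active.memory)
        return signal
    ```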
  • Publication number: 20100330975
    Abstract: The invention provides an internet radio interface for use in vehicles. The interface allows a device unit, with wireless capability and voice interface technology, to communicate with a vehicle, mobile phone, and portal in order to manage and upload to the device unit various user preferences set by the user prior to getting into the vehicle. The device unit interacts with the user to permit various functions and access preferred channels, as well as managing secondary functions of the user, including cell phone communications.
    Type: Application
    Filed: June 28, 2010
    Publication date: December 30, 2010
    Inventor: Otman A. Basir
  • Publication number: 20100324907
    Abstract: The invention proposes the synthesis of a signal consisting of consecutive blocks. It proposes more particularly, on receipt of such a signal, to replace, by synthesis, lost or erroneous blocks of this signal. To this end, it proposes an attenuation of the overvoicing during the generation of a synthesized signal. More particularly, a voiced excitation is generated on the basis of the pitch period (T) estimated or transmitted at the previous block, by optionally applying a correction of plus or minus one sample of the duration of this period (counted in terms of number of samples), by constituting groups (A′, B′, C′, D′) of at least two samples and inverting positions of samples in the groups, randomly (B′, C′) or in a forced manner. An over-harmonicity in the generated excitation is thus broken, and the effect of overvoicing in the synthesis of the generated signal is thereby attenuated.
    Type: Application
    Filed: October 17, 2007
    Publication date: December 23, 2010
    Applicant: France Telecom
    Inventors: David Virette, Balazs Kovesi
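    The two steps the abstract describes — generating a voiced excitation by repeating the last pitch period, then swapping sample positions within small groups to break the exact periodicity — can be sketched as below. The group size of two and the 50% swap probability are illustrative assumptions.

    ```python
    import random

    def voiced_excitation(past_excitation, pitch_period, length):
        """Fill `length` samples by repeating the last pitch period of the
        past excitation (the usual voiced excitation for a lost block)."""
        period = past_excitation[-pitch_period:]
        return [period[i % pitch_period] for i in range(length)]

    def break_overharmonicity(excitation, group_size=2, rng=None):
        """Randomly invert sample positions inside consecutive groups,
        breaking the exact over-harmonicity that causes overvoicing while
        keeping the same sample values."""
        rng = rng or random.Random(0)
        out = list(excitation)
        for i in range(0, len(out) - group_size + 1, group_size):
            if rng.random() < 0.5:
                out[i:i + group_size] = reversed(out[i:i + group_size])
        return out
    ```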
  • Publication number: 20100324902
    Abstract: Disclosed are techniques and systems to provide a narration of a text in multiple different voices. In some aspects, systems and methods described herein can include receiving a user-based selection of a first portion of words in a document, where the document has a pre-associated first voice model, and overwriting, by the one or more computers, the association of the first voice model with a second voice model for the first portion of words.
    Type: Application
    Filed: January 14, 2010
    Publication date: December 23, 2010
    Inventors: Raymond Kurzweil, Paul Albrecht, Peter Chapman
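    The overwrite behavior this last abstract describes — a document-wide default voice model with user-selected word ranges reassigned to other voices — can be sketched as a simple assignment table. The function and voice names are illustrative, not from the patent.

    ```python
    def narration_plan(words, default_voice, overrides):
        """Assign a voice model to every word: start from the document's
        pre-associated default voice, then overwrite user-selected ranges.

        `overrides` is a list of (start, end, voice) with `end` exclusive;
        later overrides win, mirroring the overwrite semantics.
        """
        voices = [default_voice] * len(words)
        for start, end, voice in overrides:
            for i in range(start, min(end, len(words))):
                voices[i] = voice
        return list(zip(words, voices))
    ```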