Abstract: A computer-based virtual assistant includes a virtual assistant application running on a computer capable of receiving human voice communications from a user of a remote user interface and transmitting a vocalization to the remote user interface, the virtual assistant application enabling the user to access email and voicemail messages of the user, the virtual assistant application selecting a responsive action to a verbal query or instruction received from the remote user interface and transmitting a vocalization characterizing the selected responsive action to the remote user interface, and the virtual assistant waiting a predetermined period of time, and if no canceling indication is received from the remote user interface, proceeding to perform the selected responsive action, and if a canceling indication is received from the remote user interface, halting the selected responsive action and transmitting a new vocalization to the remote user interface. Also disclosed is a method of using the virtual assistant.
Type:
Application
Filed:
September 23, 2008
Publication date:
January 15, 2009
Inventors:
Robert S. Cooper, Derek Sanders, Richard M. Ulmer
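The announce, wait, then act-or-cancel flow described in the virtual assistant abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicants' implementation: the function name confirm_then_act, the use of a queue to carry the canceling indication, and the wording of the spoken strings are all hypothetical, and text strings stand in for vocalizations.

```python
import queue


def confirm_then_act(action, describe, cancel_events, wait_seconds=3.0):
    """Announce an action, wait for a cancel, then perform or halt it.

    action:        zero-argument callable performing the selected response
    describe:      text characterizing the action (stands in for speech)
    cancel_events: queue.Queue that receives an item if the user cancels
    wait_seconds:  the predetermined waiting period (assumed value)
    Returns (spoken_lines, performed).
    """
    spoken = [f"I will {describe}."]
    try:
        # Block for at most wait_seconds waiting for a canceling indication.
        cancel_events.get(timeout=wait_seconds)
    except queue.Empty:
        # No cancel arrived in time: proceed with the selected action.
        action()
        return spoken, True
    # A cancel arrived: skip the action and send a new vocalization.
    spoken.append(f"Cancelled: {describe}. What would you like instead?")
    return spoken, False
```

Passing the cancel channel in as a queue keeps the sketch independent of how voice input is actually recognized; any producer that puts an item on the queue counts as a canceling indication.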
Abstract: Provided are a user interface for processing digital data, a method for processing a media interface, and a recording medium thereof. The user interface is used for converting a selected script into voice to generate digital data having a form of a voice file corresponding to the script, or for managing the generated digital data. In the method, the user interface is displayed. The user interface includes at least a text window on which a script to be converted into voice is written, and an icon to be selected for converting the script written on the text window into voice.
Type:
Application
Filed:
July 10, 2008
Publication date:
January 15, 2009
Applicant:
LG Electronics Inc.
Inventors:
Tae Hee Ahn, Sung Hun Kim, Dong Hoon Lee
Abstract: In a speech synthesis, a selecting unit selects one string from first speech unit strings corresponding to a first segment sequence obtained by dividing a phoneme string corresponding to target speech into segments. The selecting unit performs repeatedly generating, based on maximum W second speech unit strings corresponding to a second segment sequence as a partial sequence of the first sequence, third speech unit strings corresponding to a third segment sequence obtained by adding a segment to the second sequence, and selecting maximum W strings from the third strings based on an evaluation value of each of the third strings. The value is obtained by correcting a total cost of each third string candidate with a penalty coefficient for each of the third strings. The coefficient is based on a restriction concerning quickness of speech unit data acquisition, and depends on the extent to which the restriction is approached.
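The selection procedure in the abstract above resembles a beam search of width W in which each candidate's total cost is corrected by a penalty before pruning. The sketch below is one possible reading, with several assumptions that are not in the abstract: the restriction is modeled as a cap on the number of "slow" units (for example, units that must be fetched from slow storage), the penalty is a linear multiplier, and all names (penalty, beam_select, slow_limit, strength) are hypothetical.

```python
def penalty(slow_count, slow_limit, strength=1.0):
    """Hypothetical penalty coefficient: 1.0 when no slow units are used,
    growing linearly as the count approaches the limit."""
    return 1.0 + strength * (slow_count / slow_limit)


def beam_select(candidates_per_segment, beam_width, slow_limit):
    """Pick one unit string using a penalized beam search.

    candidates_per_segment: one list per segment of
        (unit_id, cost, is_slow) tuples.
    beam_width: maximum number of partial strings kept (W).
    slow_limit: assumed restriction on how many slow units are allowed.
    Returns a tuple of unit ids, or None if no string satisfies the limit.
    """
    # Each beam entry is (unit_ids, total_cost, slow_count).
    beam = [((), 0.0, 0)]
    for candidates in candidates_per_segment:
        extended = []
        for units, total, slow in beam:
            for unit_id, cost, is_slow in candidates:
                new_slow = slow + (1 if is_slow else 0)
                if new_slow > slow_limit:
                    continue  # would violate the restriction
                extended.append((units + (unit_id,), total + cost, new_slow))
        # Evaluation value: total cost corrected by the penalty coefficient.
        extended.sort(key=lambda e: e[1] * penalty(e[2], slow_limit))
        beam = extended[:beam_width]
    return beam[0][0] if beam else None
```

With this scoring, a string that is cheap but close to the limit can lose to a slightly more expensive string that leaves headroom, which is the effect the abstract attributes to the penalty coefficient.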
Abstract: A method is provided for creating a phonebook for a hands-free telephone system in a vehicle using phonebook entries retrieved from a remote phonebook of a mobile phone over a wireless communication link between a control module of the hands-free telephone system and the mobile phone. The method includes receiving a remote phonebook from the mobile phone, the remote phonebook including a plurality of entries, each entry including text data and numeric data, identifying the text data in each entry, generating an acoustic baseform for each entry based on the text data for each entry, storing the acoustic baseform for each entry in a baseform list, and storing the plurality of entries in a mobile phonebook associated with the baseform list.
Type:
Application
Filed:
January 6, 2006
Publication date:
January 8, 2009
Inventors:
Brian L. Douthitt, Steven G. Schultz, Ted W. Ringold, Jeffrey N. Golden, Mark L. Zeinstra
Abstract: According to an aspect of an embodiment, an apparatus for converting text data into sound signal, comprises: a phoneme determiner for determining phoneme data corresponding to a plurality of phonemes and pause data corresponding to a plurality of pauses to be inserted among a series of phonemes in the text data to be converted into sound signal; a phoneme length adjuster for modifying the phoneme data and the pause data by determining lengths of the phonemes, respectively in accordance with a speed of the sound signal and selectively reducing the length of at least one of the pauses in the text data to a pause length which is less than the pause length corresponding to the speed of the sound signal; and an output unit for outputting sound signal on the basis of the adjusted phoneme data and pause data by the phoneme length adjuster.
Abstract: According to an aspect of an embodiment, an apparatus for converting text data into sound signal, comprises: a phoneme determiner for determining phoneme data corresponding to a plurality of phonemes and pause data corresponding to a plurality of pauses to be inserted among a series of phonemes in the text data to be converted into sound signal; a phoneme length adjuster for modifying the phoneme data and the pause data by determining lengths of the phonemes, respectively in accordance with a speed of the sound signal and selectively adjusting the length of at least one of the phonemes which is placed immediately after one of the pauses so that the at least one of the phonemes is relatively extended timewise as compared to other phonemes; and an output unit for outputting sound signal on the basis of the adjusted phoneme data and pause data by the phoneme length adjuster.
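The two length-adjustment abstracts above can be illustrated together with a small sketch. It assumes, purely for illustration, that every base length is divided by the speech speed, that pauses are then shortened further by a fixed factor, and that a phoneme immediately following a pause is stretched by a fixed factor. The function name adjust_lengths and the factor values are hypothetical; the abstracts do not specify them.

```python
def adjust_lengths(items, speed, pause_shrink=0.5, post_pause_stretch=1.25):
    """Scale phoneme and pause lengths for a target speech speed.

    items: list of (kind, base_ms), kind being "phoneme" or "pause".
    speed: speech rate multiplier (2.0 means twice as fast).
    pause_shrink: assumed extra reduction applied to pauses.
    post_pause_stretch: assumed extension of a phoneme right after a pause.
    Returns a list of (kind, adjusted_ms).
    """
    adjusted = []
    previous_kind = None
    for kind, base_ms in items:
        length = base_ms / speed  # length corresponding to the speed
        if kind == "pause":
            length *= pause_shrink  # shorter than the speed alone implies
        elif previous_kind == "pause":
            length *= post_pause_stretch  # relatively extended phoneme
        adjusted.append((kind, length))
        previous_kind = kind
    return adjusted
```

For example, at speed 2.0 a 200 ms pause becomes 50 ms rather than 100 ms, while the 100 ms phoneme that follows becomes 62.5 ms rather than 50 ms.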
Abstract: The present invention discloses a method for training an exception-limited phonetic decision tree. An initial subset of data can be selected and used for creating an initial phonetic decision tree. Additional terms can then be incorporated into the subset. The enlarged subset can be used to evaluate the phonetic decision tree with the results being categorized as either correctly or incorrectly phonetized. An exception-limited phonetic tree can be generated from the set of correctly phonetized terms. If the termination conditions for the method have been determined to be unsatisfactorily met, then steps of the method can be repeated.
Type:
Application
Filed:
June 25, 2007
Publication date:
December 25, 2008
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Abstract: A speech synthesis system stores a group of speech units in a memory, selects a plurality of speech units from the group based on prosodic information of target speech, the speech units selected corresponding to each of segments which are obtained by segmenting a phoneme string of the target speech and minimizing distortion of synthetic speech generated from the speech units selected to the target speech, generates a new speech unit corresponding to the each of the segments, by fusing the speech units selected, to obtain a plurality of new speech units corresponding to the segments respectively, and generates synthetic speech by concatenating the new speech units.
Abstract: A method for operating a voice mail system connected to a telecommunication system, having a recorder with a message storage connected thereto, shall enable a particularly flexible use of a speech message stored in the voice mail system. For this purpose, according to the invention, a speech message collected by the recorder, storable as a speech information in the message storage is converted by means of a speech-to-text conversion module into a text message.
Abstract: An information dissemination system comprises an Internet-connected server adapted for gathering information from plural sources, and sorting the information according to subscriber preferences. The sorted information is transmitted via the Internet to a subscriber's Internet Appliance (IA) as electronic documents, where the documents are downloaded to a connected playback device. The playback device may be disconnected from the PC, and the information electronic documents rendered as speech to a speaker in the playback device by a text-to-speech system. In a preferred embodiment annotation is added at the Internet-connected server to control speech characteristics, such as inflection, upon playback. In some embodiments updates may be made by radio with the playback device disconnected from the IA.
Abstract: A technique for producing speech output in a text-to-speech system is provided. A message is created for communication to a user in a natural language generator of the text-to-speech system. The message is annotated in the natural language generator with a synthetic speech output style. The message is conveyed to the user through a speech synthesis system in communication with the natural language generator, wherein the message is conveyed in accordance with the synthetic speech output style.
Type:
Application
Filed:
July 1, 2008
Publication date:
December 4, 2008
Applicant:
International Business Machines Corporation
Abstract: Provided is a mobile phone which converts an input first word to a second word, displays the second word, extracts voice data corresponding to the second word and outputs the extracted voice data and a method of converting a word and outputting the converted word as a voice in the mobile phone. Furthermore, there is also provided a mobile phone which receives and stores a composite video signal including an audio signal and a video signal, inputs a playback point of the stored composite video signal and a playback speed exceeding 1X and plays the sound and image corresponding to the stored composite video signal from the input playback point at the input playback speed and a composite image processing method of the mobile phone.
Abstract: A method includes obtaining digital content comprising text content; obtaining at least one speech parameter associated with the digital content; and using the speech parameters as an input, generating a speech output corresponding to at least part of the text content. Corresponding apparatuses, system and computer program products are also presented.
Abstract: A text-to-speech system that includes an arrangement for accepting text input, an arrangement for providing synthetic speech output, and an arrangement for imparting emotion-based features to synthetic speech output. The arrangement for imparting emotion-based features includes an arrangement for accepting instruction for imparting at least one emotion-based paradigm to synthetic speech output, as well as an arrangement for applying at least one emotion-based paradigm to synthetic speech output.
Type:
Application
Filed:
July 14, 2008
Publication date:
November 27, 2008
Applicant:
International Business Machines Corporation
Abstract: A subscription-based system provides transcribed audio information to one or more mobile devices. Some techniques feature a system for providing subscription services for currently-generated (e.g., not stored) information (e.g., caption information, transcribed audio) for one or more mobile devices for a live/current audio event. There can be a communication network for communicating to the one or more mobile devices, a transcriber configured for transcribing the event to generate information (e.g., caption information, transcribed audio). Caption data includes transcribed data and control code data. The system includes a subscription gateway configured for live/current transfer of the transcribed data to the one or more mobile devices. The subscription gateway is configured to provide access for the transcribed data to the one or more mobile devices.
Abstract: The present invention provides a speech analysis method comprising steps of obtaining a speech signal and a corresponding DEGG/EGG signal; regarding the speech signal as the output of a vocal tract filter in a source-filter model taking the DEGG/EGG signal as the input; and estimating the features of the vocal tract filter from the speech signal as the output and the DEGG/EGG signal as the input, wherein the features of the vocal tract filter are expressed by the state vectors of the vocal tract filter at selected time points, and the step of estimating is performed using Kalman filtering.
Type:
Application
Filed:
April 3, 2008
Publication date:
November 20, 2008
Applicant:
International Business Machines Corporation
Inventors:
Dan Ning Jiang, Fan Ping Meng, Yong Qin, Zhi Wei Shuang
Abstract: The invention relates to a method for transmitting data to at least one communications end system, and to a communications device for carrying out said method.
Abstract: A text to speech interactive voice response system is operable within a personal computer having a processor, data storage means and an operating system. The system comprises an input subsystem for receiving a text data stream from a source device in a predetermined format; a process control subsystem for converting the text data stream into corresponding output data items; an audio record subsystem for recording audio data to be associated with each output data item; and, a broadcast control subsystem for generating an audio broadcast based on the output data items. There is also disclosed a system management and control subsystem for user interface with the system.
Type:
Application
Filed:
April 27, 2007
Publication date:
October 30, 2008
Inventors:
Craig B. Dickson, Stephen J. Eady, James R. Woolsey
Abstract: The present invention provides a method and apparatus for text to speech conversion, and a method and apparatus for adjusting a corpus. The method for text to speech comprises: a text analysis step for parsing the text to obtain descriptive prosody annotations of the text based on a TTS model generated from a first corpus; a prosody parameter prediction step for predicting the prosody parameter of the text according to the result of the text analysis step; and a speech synthesis step for synthesizing speech of said text based on said prosody parameter of the text; wherein the descriptive prosody annotations of the text include a prosody structure for the text, and the prosody structure of the text is adjusted according to a target speech speed for the synthesized speech. The present invention adjusts the prosody structure of the text according to the target speech speed. The synthesized speech will have improved quality.
Type:
Application
Filed:
July 3, 2008
Publication date:
October 30, 2008
Inventors:
Qin Shi, Wei Zhang, Wei Bin Zhu, Hai Xin Chai
Abstract: An email system for mobile devices, such as cellular phones and PDAs, is disclosed which allows email messages to be played back on the mobile device as voice messages on demand by way of a media player, thus eliminating the need for a unified messaging system. Email messages are received by the mobile device in a known manner. In accordance with an important aspect of the invention, the email messages are identified by the mobile device as they are received. After the message is identified, the mobile device sends the email message in text format to a server for conversion to speech or voice format. After the message is converted to speech format, the server sends the messages back to the user's mobile device and notifies the user of the email message and then plays the message back to the user through a media player upon demand.
Abstract: A method for facilitating cooperation between humans and remote vehicles comprises creating image data, detecting humans within the image data, extracting gesture information from the image data, mapping the gesture information to a remote vehicle behavior, and activating the remote vehicle behavior. Alternatively, voice commands can be used to activate the remote vehicle behavior.
Type:
Application
Filed:
April 11, 2008
Publication date:
October 16, 2008
Inventors:
Christopher Vernon Jones, Odest Chadwicke Jenkins, Matthew M. Loper
Abstract: A speech module (13) comprises an independent self-contained connector module or unit which is adapted to be releasably connected in series with the input to, or output from, a signal sensing apparatus (1). The module is provided with plugs and/or sockets (14a,14b,20) compatible with those of the apparatus (1) so that the module (13) is capable of forming a connector in series with the signal input or output leads (7,8,14). The module is further provided with plugs and/or sockets (30,31) and leads (3,4) to replace the signal input or output leads (7,8,14). The module is connected to a data output socket (12) by means of a lead (14); in the alternative, it is connected to the input connectors (30,31) of the apparatus and is further connected by leads to probes (3,4) equivalent to the standard probes used by the apparatus, which is preferably an electrical multimeter.
Type:
Application
Filed:
June 9, 2008
Publication date:
October 2, 2008
Inventors:
Milton Bernard Hollander, Shahin Baghai
Abstract: The present invention is a speech synthesizer that generates speech data of text including a fixed part and a variable part, in combination with recorded speech and rule-based synthetic speech. The speech synthesizer is a high-quality one in which recorded speech and synthetic speech are concatenated with the discontinuity of timbres and prosodies not perceived.
Abstract: Methods, apparatus, and products are disclosed for supporting multi-lingual user interaction with a multimodal application, the application including a plurality of VoiceXML dialogs, each dialog characterized by a particular language, supporting multi-lingual user interaction implemented with a plurality of speech engines, each speech engine having a grammar and characterized by a language corresponding to one of the dialogs, with the application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the application operatively coupled to the speech engines through a VoiceXML interpreter, the VoiceXML interpreter: receiving a voice utterance from a user; determining in parallel, using the speech engines, recognition results for each dialog in dependence upon the voice utterance and the grammar for each speech engine; administering the recognition results for the dialogs; and selecting a language for user interaction in dependence upon
Abstract: A software language including language constructs for disambiguating text that is to be converted to speech using configurable lexeme based rules. The language can include at least one conditional statement and a significance indicator. The conditional statement can define a sense of usage for a lexeme. The significance indicator can define a criteria for selecting an associated sense of usage. The language can also include an action expression that is associated with a conditional statement that defines a set of programmatic actions to be executed upon a selection of the associated usage sense. The conditional statement can include a context range specification that defines a scope of an input string for examination when evaluating the conditional statement. Further, the conditional statement can include a directive that represents a defined condition of the lexeme or the text surrounding the lexeme.
Type:
Application
Filed:
March 21, 2007
Publication date:
September 25, 2008
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
OSWALDO GAGO, STEVEN M. HANCOCK, MARIA E. SMITH
Abstract: A prosody modification device includes: a real voice prosody input part that receives real voice prosody information extracted from an utterance of a human; a regular prosody generating part that generates regular prosody information having a regular phoneme boundary that determines a boundary between phonemes and a regular phoneme length of a phoneme by using data representing a regular or statistical phoneme length in an utterance of a human with respect to a section including at least a phoneme or a phoneme string to be modified in the real voice prosody information; and a real voice prosody modification part that resets a real voice phoneme boundary by using the generated regular prosody information so that the real voice phoneme boundary and a real voice phoneme length of the phoneme or the phoneme string to be modified in the real voice prosody information are approximate to an actual phoneme boundary and an actual phoneme length of the utterance of the human, thereby modifying the real voice prosody in
Abstract: A Language Generating System (“LGS”) for generating and outputting natural language data for informing a user of a predetermined event in a plurality of different languages is provided. The LGS may include a database including grammar data sets corresponding to each of a plurality of languages, the grammar data including transformation rules that may be used to obtain a sequence of words having an information content corresponding to the predetermined event. In addition, a universal speech driver may be provided, which constructs a grammatically correct sequence of words having the information content corresponding to the predetermined event on the basis of a grammar data set. The language generating system may additionally include an information unit that may generate an auditory output via, for example, a loudspeaker, or a visual output via, for example, a display.
Abstract: A method of modifying an audio signal comprises the steps of analyzing the input audio signal (x) so as to produce a set of filter parameters (p) and a residual signal (r), modifying the set of filter parameters (p) so as to produce a modified set of filter parameters (p'), and synthesizing an output audio signal (y) using the modified set of filter parameters (p') and the residual signal (r). The set of filter parameters (p) comprises poles (?A) and coefficients (a; c). The step of modifying the filter parameters (p) involves interpolating lattice filter reflection coefficients (c) so as to scale the spectral envelope of the audio signal.
Type:
Application
Filed:
July 18, 2006
Publication date:
September 4, 2008
Applicant:
KONINKLIJKE PHILIPS ELECTRONICS, N.V.
Inventors:
Aki Sakari Harma, Albertus Cornelis Den Brinker
Abstract: A variable voice rate apparatus to control a reproduction rate of voice, includes a voice data generation unit configured to generate voice data from the voice, a text data generation unit configured to generate text data indicating a content of the voice data, a division information generation unit configured to generate division information used for dividing the text data into a plurality of linguistic units each of which is characterized by a linguistic form, a reproduction information generation unit configured to generate reproduction information set for each of the linguistic units, and a voice reproduction controller which controls reproduction of each of the linguistic units, based on the reproduction information and the division information.
Abstract: A bandwidth extension system extends the bandwidth of an acoustic signal. By shifting a portion of the signal by a frequency value, the system generates an upper bandwidth extension signal. An extended bandwidth acoustic signal may be generated from the acoustic signal, the upper bandwidth extension signal, and/or a lower bandwidth extension signal.
Type:
Application
Filed:
January 17, 2008
Publication date:
August 14, 2008
Inventors:
Bernd Iser, Gerhard Nussle, Gerhard Uwe Schmidt
Abstract: A system for delivering customized audio content to customers. A central processing site (120) is coupled with content providers (110) through a network (142). The central processing site consists of a number of components, namely content classification system (200), user preference management (400), content conversion system (500), content delivery system (600), and user authentication (300).
Abstract: In an apparatus and a method for automatically indicating time in a text file, a receiver module receives a text file and a speech file, in which the text file is composed of a plurality of sentences; a speech recognition module transforms the sentences in the text file into a speech model, divides the speech file into a plurality of sound frames and assigns numbers to them in sequence in accordance with a time interval, turns speech data of the sound frames into feature parameters through speech capturing, and calculates the best speech route matching the sound frames with the speech model; an indicator module captures the assigned number of the sound frame corresponding to the beginning of each sentence in accordance with the best speech route, obtains a starting time of the speech file corresponding to the beginning of each sentence through the assigned number of the sound frame and a time interval and indicates the starting time in the text file.
Type:
Application
Filed:
August 8, 2007
Publication date:
August 7, 2008
Applicant:
MICRO-STAR INT'L CO., LTD.
Inventors:
Ming Hsiang Yen, Jui Yu Yen, Ping-Hsia Chao
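The final step of the time-indication abstract above, turning a frame number into a sentence start time, is simple arithmetic: start time = frame number × frame interval. The sketch below assumes the alignment has already been computed, so each sentence comes with the number of the sound frame where it begins. It also assumes frames are numbered from 0 and that the interval is given in whole milliseconds. The function name annotate_sentences and the timestamp layout are hypothetical.

```python
def annotate_sentences(sentences, start_frames, frame_interval_ms):
    """Prefix each sentence with its start time in the speech file.

    sentences:         list of sentence strings from the text file
    start_frames:      frame number where each sentence begins
                       (assumed to come from a prior alignment step)
    frame_interval_ms: duration of one sound frame in milliseconds
    Returns a list of "[MM:SS.mmm] sentence" strings.
    """
    lines = []
    for sentence, frame in zip(sentences, start_frames):
        start_ms = frame * frame_interval_ms
        minutes, remainder_ms = divmod(start_ms, 60_000)
        seconds = remainder_ms / 1000
        lines.append(f"[{minutes:02d}:{seconds:06.3f}] {sentence}")
    return lines
```

With 10 ms frames, a sentence starting at frame 150 is marked at 1.5 seconds into the speech file.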
Abstract: A process to record or create sound bites or sound files in test format on a handheld device. A preferred embodiment includes a process that can change text to sound bites or files and play back the sound file in the form of a test. This includes having one question and two or more possible answers for the user to choose. The process also has a method of redistributing the test into one file.
Abstract: An apparatus, method and computer readable medium are disclosed. In at least one embodiment, the apparatus includes a keyboard including keys, a plurality of the keys each being associated with a polysemous symbol relating to a concept represented by a Chinese radical; and a processor, to determine whether or not a plurality of symbols, associated with a plurality of selected keys, form a sequence of symbols associated with at least one Chinese character, and, in response to determining that the plurality of selected symbols form a sequence of symbols associated with at least one Chinese character, to instruct output of the at least one Chinese character. A plurality of the keys may include each of a polysemous symbol, a Chinese radical, a Chinese measure word character and a Pinyin/Bopomofo letter, each associated with one another.
Type:
Application
Filed:
December 17, 2007
Publication date:
July 31, 2008
Inventors:
Bruce R. Baker, Tianxue Yao, Paul Andres, Jutta Hermann, Sarah Yong, Zen Koh, Eric Nyberg, Katharine J. Hill, Mark A. Zucco
Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
Type:
Application
Filed:
March 25, 2008
Publication date:
July 17, 2008
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
William Kress Bodin, Michael John Burkhart, Daniel G. Elsenhauer, Daniel Mark Schumacher, Thomas J. Watson
Abstract: A mobile radio terminal includes a radio circuit for communicating with a communications network so as to receive at least one of telephone calls or messages. The mobile radio terminal also includes a broadcast transmitter controlled to broadcast a signal corresponding to an alert sound when at least one of an incoming call is received or an incoming message is received via the radio circuit. The signal is configured for receipt by a compatible receiver and the alert sound is configured to alert a user to the receipt of the call or message by the mobile radio terminal.
Abstract: The present invention discloses an automotive mobile electronic apparatus, installed and applied in an automobile, and its operation method, in which a big button is operated with a voice reminder to assure the safety and efficiency of driving. The automotive mobile electronic apparatus includes a database module, a display module, a voice broadcasting module and a processing module. The database module records the information of points of interest (POI). The display module selectively displays the information of the points of interest. The voice broadcasting module broadcasts the information of the points of interest sequentially by text to speech (TTS). The processing module receives a control signal from the big button while the voice broadcasting module broadcasts the information of the points of interest, to confirm that the point of interest currently being broadcast is the one selected by the user.
Abstract: A device and method for assisting a human user in performing processes includes a speaker that provides audible instructions to the user corresponding to multiple tasks associated with performing the process. A storage device stores data corresponding to the audible instructions. A processor converts the stored data to the audible instructions, and an input device is adapted to enable the user to control the provision of the audible instructions.
Abstract: The invention relates to a method of informing a user about a category (152) of a media content item. The method comprises the steps of: identifying the category of the media content item, and enabling a user to obtain an audible signal (156) having an audio parameter (153) in accordance with the category of the media content item. The invention further relates to a device, which is capable of functioning in accordance with the method. The invention also relates to audio data comprising an audible signal informing a user about a category of a media content item, a database comprising a plurality of the audio data, and a computer program product. In a recommender system, the audible signal may be reproduced by the recommender system when a user interaction with the recommender system relates to the media content item of a particular genre. The invention may be used in the EPG user interface.
Type:
Application
Filed:
October 10, 2005
Publication date:
June 12, 2008
Applicant:
KONINKLIJKE PHILIPS ELECTRONICS, N.V.
Inventors:
Dzevdet Burazerovic, Declan Patrick Kelly
Abstract: A multi-factor authentication solution implements a recognizable voice in conjunction with a user address to increase login security and reduce user inconvenience. A user creates an online account, providing an address such as a telephone number or email address to which voice messages may be sent. The user selects a recognizable voice such as the user's own voice or the voice of a famous or well-known figure. When the user attempts to login to the online account, a random passphrase is generated and converted to a voice message employing the user's pre-selected voice and the voice message is sent to the user's address. The user listens to the voice message and if the user recognizes the voice rendering the passphrase the user's login request is granted.
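The challenge step in the authentication abstract above (generate a random passphrase, render it in the user's pre-selected voice, send it to the user's address) can be sketched as below. Everything beyond that outline is an assumption: the word list, the record layout, and the synthesize callable are hypothetical placeholders, and no real text-to-speech or messaging API is implied. Python's standard secrets module is used for the random choice because it is intended for security-sensitive values.

```python
import secrets

# Hypothetical word list; a real deployment would use a much larger one.
WORDS = ["amber", "cedar", "delta", "ember", "fjord", "grove", "harbor", "iris"]


def make_passphrase(n_words=4, words=WORDS):
    """Return a random passphrase of n_words drawn from the word list."""
    return " ".join(secrets.choice(words) for _ in range(n_words))


def build_voice_challenge(user_record, synthesize):
    """Build a voice message carrying a fresh passphrase.

    user_record: dict with "address" and "voice" keys (assumed layout)
    synthesize:  callable (text, voice) -> audio; stands in for a TTS
                 engine that can speak in the pre-selected voice
    Returns a dict with the destination address, audio, and passphrase.
    """
    passphrase = make_passphrase()
    audio = synthesize(passphrase, user_record["voice"])
    return {
        "address": user_record["address"],
        "audio": audio,
        "passphrase": passphrase,
    }
```

How the login request is then granted is left out, since the abstract only says it happens once the user recognizes the voice rendering the passphrase.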
Abstract: A vehicle communication system facilitates hands-free interaction with a mobile device in a vehicle or elsewhere. The invention also provides remote access to information such as existing news sources (i.e. existing RSS feeds) and supported websites. This also includes subscription to value-added services including: weather, custom alerts (i.e. stock price triggers), traffic conditions, personalized news, e-books (not limited to audio books, but any e-book), personalized audio feeds, and personalized image or video feeds for passengers. The system obtains, translates, and provides personalized news content in audible form within a vehicle without explicit user requests. An individual may set their preferences by selecting from a set of common sources of information, or by specifying custom search criteria. When new information is available and relevant to the individual's preferences, it is read out loud to the individual when appropriate.
Abstract: A multi-language attraction communication device is provided wherein a guest selects a particular language file corresponding to an audio file coordinated with at least one entertainment activity. The communication device includes a memory for storing multiple language files for at least one entertainment activity and a processor. The processor is configured to access the memory, in response to a selected language file for the at least one entertainment activity, and to communicate the language file for review by a guest in real time coordination with the at least one entertainment activity. A method of providing audio to guests at an entertainment activity is also provided.
Type:
Application
Filed:
December 8, 2006
Publication date:
June 12, 2008
Inventors:
Matthew Preston Jones, Steven C. Blum, Justin Michael Schwartz, Brian McQuillian
Abstract: Methods and systems for information retrieval during communication for use in a device having telecommunication capability. The device performs a communication. An instruction is received during the communication. Information is retrieved according to the instruction. The information is converted to speech using a text-to-speech technology, and the speech is provided to at least one party corresponding to the communication.
Abstract: Web pages and other text documents displayed on a computer are reformatted to allow a user who has difficulty reading to navigate between and among such documents and to have such documents, or portions of them, read aloud by the computer using a text-to-speech engine in their original or translated form while preserving the original layout of the document. A “point-and-read” paradigm allows a user to cause the text to be read solely by moving a pointing device over graphical icons or text without requiring the user to click on anything in the document. Hyperlink navigation and other program functions are accomplished in a similar manner.
Abstract: Manual preparation of Braille books and audio books costs far more than that of general books, because of the labor required for recitation or Braille translation (printing), composition, examination and the like, and because of distribution costs. As a further problem, books that visually handicapped people wish to read cannot be delivered immediately at an inexpensive price. The user (reader) himself participates in editing or correcting both the audio book, which is produced by reading an electronic book aloud using speech synthesis, and the pronunciation symbol text on which the audio book is based. This reduces the cost of producing the audio book and the pronunciation symbol text and improves the quality of both.
Abstract: A computer-assisted method of assisting a user of a computer to memorize includes steps of displaying an image indicating a content to be memorized, on a left-hand side of a computer screen for a particular period; displaying text data indicating language information related to the image displayed in this displaying step, on a right-hand side of the computer screen for a particular period; and playing back a voice pronouncing the text data displayed in the text data displaying step. The image displaying step, the text displaying step, and the voice playing back step are performed repeatedly, thereby assisting the user of the computer to memorize.
Abstract: A method is disclosed for centrally storing data in a remote server (4). On the remote server (4), a voice memo (42) that has been recorded by a user (1) and converted into a text with the aid of a voice recognition system (13, 43) is stored as a text. The memory area (42) is user-specific, and the user (1) is previously identified in the remote server (4). Simultaneously, the contents of the memory area (42) can be searched by the user (1) for a particular item of information. The invention is characterized in that additional data are transmitted over an interface (14, 24) working at close range to a telecommunication device (11) of the user (1) and over the telecommunication device (11) to the remote server (4), and additionally stored in the memory area (42) allocated to the user (1). The invention also relates to a remote server (4) with equivalent characteristics.
Abstract: A method for providing a message-based communications infrastructure for automated call center operation is described. A call from a telephony interface is accepted. The accepted call includes an incoming stream of verbal speech. The incoming stream of verbal speech from a caller into a call center is converted into incoming text. The call is automatically assigned at a session manager to a session and to a live agent. The incoming text is progressively processed through an agent application during the session through a customer support scenario interactively monitored and controlled by the live agent. The live agent sends outgoing text messages that are converted into an outgoing stream of synthesized speech to the caller.
Type:
Application
Filed:
October 19, 2007
Publication date:
March 6, 2008
Inventors:
Gilad Odinak, Alastair Sutherland, William Tolhurst
Abstract: A mobile device includes a display device, a modem interface to communicate with a network, an input interface to receive data, and processing logic responsive to the input interface. The mobile device also includes a memory accessible to the processing logic. The memory includes a plurality of instructions executable by the processing logic to provide a user interface to the display device. The user interface includes a first area to receive a text message and a second area to receive an identifier associated with an addressee device. The memory also includes instructions executable by the processing logic to receive the text message and to submit the text message for conversion into an audio message and for transmission of the audio message to the addressee device.
Abstract: An electronic appliance includes a speaker which outputs a first sound wave based on a first voice signal generated from the electronic appliance, and a microphone to detect a second sound wave on which a sound wave generated for control of the electronic appliance is superimposed to output a second voice signal. A first waveform generator generates a first waveform signal based on the first voice signal, and a second waveform generator generates a second waveform signal based on the second voice signal. A waveform shaping unit outputs a third waveform signal in which the first waveform signal is enlarged in a time axis direction, and a subtracter subtracts the third waveform signal from the second waveform signal.
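One possible reading of the waveform shaping and subtraction steps is sketched below. It assumes the waveform signals are simple pulse trains and interprets "enlarged in a time axis direction" as widening each pulse with a sliding-window maximum, so that a slightly delayed copy of the appliance's own sound is still covered. The `widen` and `subtract` helpers, the `margin` parameter, and the sample values are hypothetical.

```python
def widen(waveform, margin):
    """Enlarge each pulse along the time axis using a sliding-window maximum."""
    n = len(waveform)
    return [
        max(waveform[max(0, i - margin): min(n, i + margin + 1)])
        for i in range(n)
    ]


def subtract(second, third):
    """Subtract the widened speaker waveform from the microphone waveform."""
    return [a - b for a, b in zip(second, third)]


# First waveform: derived from the signal sent to the speaker.
first = [0, 0, 1, 1, 0, 0, 0, 0]
# Second waveform: the same pulse picked up one sample late by the
# microphone, plus a control sound at index 6.
second = [0, 0, 0, 1, 1, 0, 1, 0]

third = widen(first, margin=1)
residue = subtract(second, third)
# Positive residue marks sound that did not come from the speaker.
control_positions = [i for i, v in enumerate(residue) if v > 0]
```

In this toy case only the control sound at index 6 survives the subtraction, since the widened speaker pulse covers the delayed echo.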