Abstract: A speech recognition apparatus that improves the sound quality of speech output as a speech recognition result is provided. The speech recognition apparatus includes a recognition unit, which recognizes speech based on a recognition dictionary, and a registration unit, which registers a dictionary entry for a new recognition word in the recognition dictionary. The registration unit includes a generation unit, which generates a dictionary entry including the speech of the new recognition word and feature parameters of that speech, and a modification unit, which modifies the speech included in the generated dictionary entry to improve its sound quality. The recognition unit includes a speech output unit, which outputs the speech that is included in the dictionary entry corresponding to the recognition result of input speech and has been modified by the modification unit.
Abstract: A plurality of prompting layers configured to provide varying levels of detailed assistance in prompting a user are maintained. A prompt from a current prompting layer is presented to a user. Input is received from the user. A level of detail in prompting the user is adaptively changed based on user behavior. Upon the user making a hesitant verbal gesture that reaches a threshold duration, a transition is made from the current prompting layer to a more detailed prompting layer. Upon the user interrupting the prompt with a valid input, a transition is made from the current prompting layer to a less detailed prompting layer.
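The layer-transition logic in this abstract can be sketched as a small state machine. This is a minimal illustration only: the layer texts, the hesitation threshold value, and the event-callback interface are all assumptions, not taken from the patent.

```python
# Hypothetical sketch of the adaptive prompting-layer transitions described
# above. Layer wording, the threshold, and the event model are invented.

HESITATION_THRESHOLD = 2.0  # assumed seconds of hesitant verbal gesture

class PromptLayers:
    def __init__(self, layers):
        self.layers = layers   # ordered from least to most detailed
        self.current = 0       # index of the current prompting layer

    def prompt(self):
        return self.layers[self.current]

    def on_hesitation(self, duration):
        # Hesitant gesture reaching the threshold: move to a MORE detailed layer.
        if duration >= HESITATION_THRESHOLD and self.current < len(self.layers) - 1:
            self.current += 1

    def on_valid_interrupt(self):
        # User barged in with valid input: move to a LESS detailed layer.
        if self.current > 0:
            self.current -= 1

ui = PromptLayers([
    "Say a command.",
    "Say a command, for example 'call' or 'navigate'.",
    "Say 'call' to phone a contact, or 'navigate' to set a destination.",
])
ui.on_hesitation(2.5)  # user trailed off long enough: offer more help
```

A valid barge-in would then call `on_valid_interrupt()`, stepping back toward the terser layer.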
Abstract: A system comprises a user interface configured to receive natural language input from a user. An input module couples to the user interface and is configured to process the received natural language input for selected words and phrases. A user skill determination module couples to the input module and is configured to determine a skill level of the user based on the selected words and phrases.
Type: Grant
Filed: July 14, 2008
Date of Patent: April 10, 2012
Assignee: International Business Machines Corporation
Inventors: Julio E. Ruano, Seth M. Holloway, Christopher Laffoon, Abraao Lourenco, Carl A. Timm, IV
Abstract: An apparatus, and an associated method, detects spam and other fraudulent messages sent to a recipient station. The textual portion of a received message is analyzed to determine whether the message includes errors made by non-native language speakers when authoring a text message. A text analysis engine analyzes the text using rules sets that identify grammatical errors made by non-native language speakers, usage errors made by non-native language speakers, and other errors.
Type: Grant
Filed: August 15, 2008
Date of Patent: April 3, 2012
Assignee: Hewlett-Packard Development Company, L.P.
Abstract: Provided are a user interface for processing digital data, a method for processing a media interface, and a recording medium thereof. The user interface is used for converting a selected script into voice to generate digital data having a form of a voice file corresponding to the script, or for managing the generated digital data. In the method, the user interface is displayed. The user interface includes at least a text window on which a script to be converted into voice is written, and an icon to be selected for converting the script written on the text window into voice.
Type: Grant
Filed: July 10, 2008
Date of Patent: March 27, 2012
Assignee: LG Electronics Inc.
Inventors: Tae Hee Ahn, Sung Hun Kim, Dong Hoon Lee
Abstract: Systems, methods, and apparatus described include waveform alignment operations in which a single set of evaluated cosines and sines is used to calculate cross-correlations of two periodic waveforms at two different phase shifts.
Type: Grant
Filed: December 1, 2006
Date of Patent: March 27, 2012
Inventors: Sharath Manjunath, Ananthapadmanabhan A. Kandhadai
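The trick in the waveform-alignment abstract above rests on a standard identity: once C = Σ x[n]·cos(ωn) and S = Σ x[n]·sin(ωn) are evaluated, the cross-correlation with cos(ωn + φ) at any phase shift φ is C·cosφ − S·sinφ, so two shifts cost only one pass of cosine/sine evaluations. A minimal sketch of that identity (not the patented implementation):

```python
import math

def correlations_two_shifts(x, omega, phi1, phi2):
    """Illustrative only: cross-correlate x with cos(omega*n + phi) at TWO
    phase shifts while evaluating cos/sin once per sample, using
    sum x[n]*cos(omega*n + phi) = C*cos(phi) - S*sin(phi)."""
    C = sum(xn * math.cos(omega * n) for n, xn in enumerate(x))
    S = sum(xn * math.sin(omega * n) for n, xn in enumerate(x))
    corr = lambda phi: C * math.cos(phi) - S * math.sin(phi)
    return corr(phi1), corr(phi2)
```

Each additional phase shift is then two multiplies and an add, rather than a fresh pass over the waveform.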
Abstract: An apparatus, a server, a method, and a tangible machine-readable medium thereof for processing and recognizing a sound signal are provided. The apparatus is configured to sense the sound signal of the environment, to dynamically derive a feature signal and a sound feature message from the sound signal, and to transmit them to the server. After receiving the feature signal and the sound feature message, the server retrieves stored sound models according to the sound feature message and compares each of the sound models with the feature signal to determine whether the sound signal is abnormal.
Abstract: A method for translating stenographic strokes includes the steps of receiving a series of stenographic strokes, creating a table of translations of one or more strokes within the series of strokes, sequentially assigning a score to each of the one or more strokes, determining at least one alternate translation to at least one of the translations in the table of translations, ranking the translations and alternate translations based on an accumulation of the score of the strokes within, and selecting one of the ranked translations or one of the ranked alternate translations based on a best score.
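The scoring-and-ranking steps in this abstract can be sketched as follows. Everything concrete here is invented for illustration: the stroke dictionary, the per-stroke scores, and the fallback behavior are assumptions, not the patent's data.

```python
# Hypothetical sketch of scoring, ranking, and selecting steno translations.
# Candidate translations per stroke, each with an assumed per-stroke score.
STROKE_DICT = {
    "TKOG": [("dog", 0.9), ("dug", 0.4)],
    "RAN":  [("ran", 0.8), ("wren", 0.3)],
}

def translate(strokes):
    # Build a table of translations, accumulating a score per full translation.
    candidates = [("", 0.0)]
    for stroke in strokes:
        options = STROKE_DICT.get(stroke, [(stroke, 0.0)])  # fall back to raw stroke
        candidates = [(text + " " + word if text else word, score + s)
                      for text, score in candidates
                      for word, s in options]
    # Rank translations and alternates by accumulated score; select the best.
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[0][0]
```

In practice the candidate set would be pruned per stroke; the exhaustive product above is kept only for clarity.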
Abstract: In a device configurable to encode speech, performing an open-loop re-decision may comprise representing a speech signal by amplitude components and phase components for a current frame and a past frame. During the current frame, there may be an extraction of uncompressed amplitude components and uncompressed phase components. The amplitude components and the phase components from the past frame may then be retrieved. A set of features may be generated based on the uncompressed amplitude components from the current frame, the uncompressed phase components from the current frame, the amplitude components from the past frame, and the phase components from the past frame. The set of features may be checked as part of the open-loop re-decision, and a final encoding decision may be determined based on the checking. The final encoding decision may be an encoding mode and/or encoding rate.
Type: Grant
Filed: January 22, 2007
Date of Patent: January 3, 2012
Assignee: QUALCOMM Incorporated
Inventors: Sharath Manjunath, Ananthapadmanabhan Arasanipalai Kandhadai, Eddie L. T. Choy
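The open-loop re-decision abstract above leaves the feature set and thresholds unspecified, so the following sketch uses invented stand-ins (energy ratio and mean phase drift, with made-up thresholds and mode names) purely to show the shape of the flow: generate features from current and past frames, check them, and confirm or revise the encoding decision.

```python
# Illustrative only: features, thresholds, and mode names are assumptions.
def frame_features(amp_cur, phase_cur, amp_past, phase_past):
    """Generate a small feature set from current-frame (uncompressed) and
    past-frame amplitude/phase components."""
    energy_cur = sum(a * a for a in amp_cur)
    energy_past = sum(a * a for a in amp_past)
    energy_ratio = energy_cur / energy_past if energy_past else float("inf")
    phase_drift = sum(abs(c - p) for c, p in zip(phase_cur, phase_past)) / len(phase_cur)
    return {"energy_ratio": energy_ratio, "phase_drift": phase_drift}

def open_loop_redecision(features, preliminary_mode):
    # Check the features; keep or revise the preliminary mode/rate decision.
    if features["energy_ratio"] < 0.1:
        return "low_rate_mode"    # e.g. energy collapsed: cheaper mode suffices
    if features["phase_drift"] > 1.0:
        return "full_rate_mode"   # unstable phase: fall back to a robust mode
    return preliminary_mode
```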
Abstract: A speech recognition method includes a model selection step, which selects a recognition model and translation dictionary information based on characteristic information of input speech; a speech recognition step, which translates input speech into text data based on the selected recognition model; and a translation step, which translates the text data based on the selected translation dictionary information.
Abstract: Systems, methods, and computer readable media providing a speech input interface. The interface can receive speech input and non-speech input from a user through a user interface. The speech input can be converted to text data and the text data can be combined with the non-speech input for presentation to a user.
Abstract: The fundamental frequency of a harmonic signal is estimated by forming a fundamental frequency hypothesis (f0). A comb filter is provided based on the fundamental frequency hypothesis. The harmonic signal is filtered using the comb filter. The fundamental frequency hypothesis is tested for each tooth in the comb filter. A signal indicating an estimated fundamental frequency of the provided harmonic signal may be outputted based on the testing.
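The comb-based testing described in this abstract can be illustrated with a simple hypothesis search: for each candidate f0, measure the signal energy at every comb tooth (harmonic) and output the hypothesis whose comb captures the most energy. This is a generic sketch, not the patented filter design; the candidate list and tooth count are assumptions.

```python
import math

def estimate_f0(signal, sample_rate, candidates, n_teeth=5):
    """Minimal comb-style f0 estimation sketch (details assumed)."""
    N = len(signal)

    def energy_at(freq):
        # Project the signal onto cos/sin at `freq` (one DFT-like bin).
        w = 2 * math.pi * freq / sample_rate
        c = sum(x * math.cos(w * n) for n, x in enumerate(signal))
        s = sum(x * math.sin(w * n) for n, x in enumerate(signal))
        return (c * c + s * s) / N

    best_f0, best_score = None, -1.0
    for f0 in candidates:
        # Test the hypothesis at each tooth of an ideal comb: f0, 2*f0, ...
        score = sum(energy_at(k * f0) for k in range(1, n_teeth + 1))
        if score > best_score:
            best_f0, best_score = f0, score
    return best_f0
```

A production estimator would also guard against octave errors (a comb at f0/2 also passes all harmonics of f0), which per-tooth testing helps detect.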
Abstract: Normalization parameters are generated at a normalization-parameter generating unit by calculating the mean values and the standard deviations of an initial prosody pattern and a prosody pattern of a training sentence of a speech corpus. Then, the variance range or variance width of the initial prosody pattern is normalized at the prosody-pattern normalizing unit in accordance with the normalization parameters. As a result, a prosody pattern similar to speech of human beings and improved in naturalness can be generated with a small amount of calculation.
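The normalization in this abstract is a mean/standard-deviation rescaling. A minimal sketch, with illustrative variable names and data (the real units would be prosodic values such as F0 over time):

```python
import statistics

def normalize_prosody(initial_pattern, target_pattern):
    """Rescale the initial prosody pattern so its mean and variance width
    match those of the training-sentence pattern from the speech corpus."""
    mu_i, sd_i = statistics.mean(initial_pattern), statistics.pstdev(initial_pattern)
    mu_t, sd_t = statistics.mean(target_pattern), statistics.pstdev(target_pattern)
    if sd_i == 0:
        return [mu_t] * len(initial_pattern)  # flat pattern: just shift the mean
    return [(x - mu_i) / sd_i * sd_t + mu_t for x in initial_pattern]
```

Only two parameter pairs (mean, standard deviation) per pattern are needed, which is the "small amount of calculation" the abstract claims.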
Abstract: A CPU of a speech ECU acquires vehicle position information. If it is determined from the position information and map data stored in a memory that the vehicle has moved between areas where different languages are spoken as dialects or official languages, the CPU determines a language corresponding to the vehicle position information and transmits a request signal to a speech information center to transmit speech information in the language. By receiving the speech information from the speech information center, the CPU updates speech information pre-stored in the memory with the speech information transmitted from the speech information center.
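The update flow in this abstract can be sketched as below. The area-to-language map, the area names, and the request interface are all invented for illustration; the patent's map data and request signal are not specified in the abstract.

```python
# Hypothetical sketch of the position-driven speech-language update.
AREA_LANGUAGE = {"bavaria": "de", "tyrol": "de", "lombardy": "it", "provence": "fr"}

class SpeechECU:
    def __init__(self, start_area):
        self.language = AREA_LANGUAGE[start_area]
        self.requests = []  # request signals sent to the speech information center

    def on_position_update(self, new_area):
        # Act only when the vehicle crossed into an area with a different language.
        new_lang = AREA_LANGUAGE[new_area]
        if new_lang != self.language:
            self.requests.append(new_lang)  # request speech info in the new language
            self.language = new_lang        # update the pre-stored speech information
```

Note that crossing between two areas sharing a language triggers no request, matching the abstract's "different languages are spoken" condition.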
Abstract: The present invention features a hand-held language translation device comprising a microprocessor configured to receive (i) an audio input signal from a foreign-language speaker, and (ii) a simultaneous visual input signal generated by a camera that captures the facial expression and body language of the foreign-language speaker while speaking. Upon receiving the audio input signal, the microprocessor translates the spoken foreign language into written form in the language of the user, and the written translation segment is stored in a searchable and retrievable manner.
Abstract: Each of N audio signals is filtered with a unique decorrelating filter (38) characteristic, the characteristic being a causal linear time-invariant characteristic in the time domain or the equivalent thereof in the frequency domain, and, for each decorrelating filter characteristic, its input (Zi) and output (Z-i) signals are combined (40, 44, 46), in a time- and frequency-varying manner, to provide a set of N processed signals (Xi). The set of decorrelation filter characteristics is designed so that all of the input and output signals are approximately mutually decorrelated. The set of N audio signals may be synthesized from M audio signals by upmixing (36), where M is one or more and N is greater than M.
Abstract: An information-processing device and method that perform speech recognition to recognize data input via speech. The information-processing device and method include analyzing speech-recognition-grammar data, generating data on a template used to input data by speech based on the analysis results, and displaying the generated speech-input-template data.
Abstract: A computer implemented system and method for processing text are disclosed. Partially processed text, in which named entities have been extracted by a standard named entity system, is processed to identify attributive relations between a named entity or proper noun and a corresponding attribute. A concept for the attribute is identified and, in the case of a named entity, compared with the named entity's context, enabling a confirmation or conflict between the two to be determined. In the case of a proper name, the attribute's context can be associated with the proper name, allowing the proper name to be recognized as a new named entity.
Abstract: A system and method to assist a singer or other user. An audio source is processed to extract the lead vocals from the audio signal. This vocal signal is fed to a pitch detection processor which estimates the pitch at each moment in time. A user singing into a microphone provides a user vocal signal that is also pitch detected. The pitch of the lead vocal signal and the user vocal signal are compared and any difference is provided to a pitch shifting module, which then can correct the pitch of the user vocal signal. The corrected user vocal signal may be combined with a background signal comprising a signal from the audio source without the lead vocal signal, and then provided to headphones or loudspeakers to the user and/or an audience. This system and method may be used for Karaoke performances.
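The compare-and-correct step in this abstract reduces, per analysis frame, to computing the ratio between the detected lead-vocal pitch and the user's pitch and handing that ratio to the pitch shifter. The sketch below abstracts away the actual pitch detection and audio resampling; pitch tracks are assumed to be lists of per-frame values in Hz.

```python
# Illustrative sketch of the pitch comparison/correction step only.
def correction_ratio(lead_pitch, user_pitch):
    """Per-frame ratio the pitch shifter should apply so the user's vocal
    matches the extracted lead vocal; unvoiced frames (pitch 0) pass through."""
    return [lp / up if up else 1.0 for lp, up in zip(lead_pitch, user_pitch)]

def correct_user_pitch(user_pitch, ratios):
    # A real pitch shifter resamples audio; here we just scale the pitch track.
    return [up * r for up, r in zip(user_pitch, ratios)]
```

The corrected vocal would then be mixed with the lead-vocal-free background signal before playback.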
Abstract: Electronic devices and methods are disclosed that adaptively filter a microphone signal responsive to recognition of a targeted speaker's voice. An electronic device can include a microphone, a speaker characterization circuit, an adaptive sound filter circuit, and a speaker recognition circuit. The speaker characterization circuit operates in a training mode to learn characteristics of the targeted speaker's voice component in the microphone signal, and to store the learned characteristics. The adaptive sound filter circuit adaptively filters the microphone signal responsive to a control signal. The speaker recognition circuit uses the learned characteristics to recognize the presence of the targeted speaker's voice in the microphone signal and to regulate the control signal to cause the adaptive sound filter circuit to adapt the filtering to increase the targeted speaker's voice component of the microphone signal relative to other components.