Speech Recognition (epo) Patents (Class 704/E15.001)

  • Publication number: 20090287489
    Abstract: A mobile communication device configured to communicate over a wireless network has an audio processing circuit that is adaptable based on a pattern of the speaker's voice to provide improved audio quality and intelligibility. The audio processing circuit is configured to receive a voice signal from an individual speaker, to determine a pattern associated with the speaker's voice, and to adjust a filter based on the determined pattern.
    Type: Application
    Filed: May 15, 2008
    Publication date: November 19, 2009
    Inventor: Sagar Savant
  • Publication number: 20090287484
    Abstract: A system and method of targeted tuning of a speech recognition system are disclosed. In a particular embodiment, a method includes determining a frequency of occurrence of a particular type of utterance and determining whether the frequency of occurrence exceeds a threshold. The method further includes tuning a speech recognition system to improve recognition of the particular type of utterance when the frequency of occurrence of the particular type of utterance exceeds the threshold.
    Type: Application
    Filed: July 15, 2009
    Publication date: November 19, 2009
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Robert R. Bushey, Benjamin Anthony Knott, John Mills Martin
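
The threshold test in the abstract above can be illustrated with a minimal sketch. It assumes that "frequency of occurrence" means the relative frequency of an utterance type in a log of classified utterances; the abstract does not say whether a relative frequency or a raw count is meant. The function name `utterance_types_to_tune` and the sample labels are hypothetical.

```python
from collections import Counter


def utterance_types_to_tune(utterance_types, threshold):
    """Return the utterance types whose relative frequency exceeds `threshold`.

    Assumption: frequency is measured as a share of all logged utterances.
    """
    counts = Counter(utterance_types)
    total = sum(counts.values())
    if total == 0:
        return []
    # Sorted only so that the result is deterministic.
    return sorted(t for t, n in counts.items() if n / total > threshold)


# Hypothetical log of utterance types from a call-routing application.
log = ["billing", "billing", "outage", "billing", "other"]
print(utterance_types_to_tune(log, threshold=0.5))  # prints ['billing']
```

Only the types returned here would be candidates for tuning; how the recognizer is then tuned is not described in the abstract and is not sketched.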
  • Publication number: 20090287483
    Abstract: A method for speech recognition includes: prompting a user with a first query to input speech into a speech recognition engine; determining if the inputted speech is correctly recognized; wherein in the event the inputted speech is correctly recognized proceeding to a new task; wherein in the event the inputted speech is not correctly recognized, prompting the user repeatedly with the first query to input speech into the speech recognition engine, and determining if the inputted speech is correctly recognized until a predefined limit on repetitions has been met; wherein in the event the predefined limit has been met without correctly recognizing the inputted user speech, prompting speech input from the user with a secondary query for redundant information; and cross-referencing the user's n-best result from the first query with the n-best result from the second query to obtain a top hypothesis.
    Type: Application
    Filed: May 14, 2008
    Publication date: November 19, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Raymond L. Co, Ee-ee Jan, David M. Lubensky
  • Publication number: 20090281804
    Abstract: A processing unit is provided which executes speech recognition on speech signals captured by a microphone for capturing sounds uttered in an environment. The processing unit has: an initial reflection component extraction portion that extracts initial reflection components by removing diffuse reverberation components from a reverberation pattern of an impulse response generated in the environment; and an acoustic model learning portion that learns an acoustic model for the speech recognition by reflecting the initial reflection components to speech data for learning.
    Type: Application
    Filed: November 20, 2008
    Publication date: November 12, 2009
    Applicants: TOYOTA JIDOSHA KABUSHIKI KAISHA, NATIONAL UNIVERSITY CORPORATION NARA INSTITUTE OF SCIENCE AND TECHNOLOGY
    Inventors: Narimasa Watanabe, Kiyohiro Shikano, Randy Gomez
  • Publication number: 20090276216
    Abstract: A method for speech recognition includes: extracting time-frequency speech features from a series of reference speech elements in a first series of sampling windows; aligning reference speech elements that are not of equal time span duration; constructing a common subspace for the aligned speech features; determining a first set of coefficient vectors; extracting a time-frequency feature image from a test speech stream spanned by a second sampling window; approximating the extracted image in the common subspace for the aligned extracted time-frequency speech features with a second coefficient vector; computing a similarity measure between the first and the second coefficient vector; determining if the similarity measure is below a predefined threshold; and wherein a match between the reference speech elements and a portion of the test speech stream is made in response to a similarity measure below a predefined threshold.
    Type: Application
    Filed: May 2, 2008
    Publication date: November 5, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lisa Amini, Pascal Frossard, Effrosyni Kokiopoulou, Oliver Verscheure
  • Publication number: 20090276213
    Abstract: A voice activity detection process is robust to low and high signal-to-noise ratio speech and to signal loss. The process divides an aural signal into one or more bands. Signal magnitudes of frequency components and the respective noise components are estimated. A noise adaptation rate modifies estimates of noise components based on differences between the signal and the estimated noise and on signal variability.
    Type: Application
    Filed: April 23, 2009
    Publication date: November 5, 2009
    Inventor: Phillip A. Hetherington
  • Publication number: 20090276330
    Abstract: A low power subsystem for a portable computer is described. In one example, the portable computer includes a computer system and low power multimedia center. The computer system includes a central processing unit, a system memory, a mass storage device, and a user interface, the computer system having a low-power mode in which the CPU, system memory, and user interface are inactive. The low-power multimedia center includes a low power processor coupled to the mass storage device, a low power memory coupled to the low power processor, a miniature display to display multimedia from the mass storage device, and an external user interface coupled to the processor, independent of the computer system to control the displaying of multimedia.
    Type: Application
    Filed: July 22, 2009
    Publication date: November 5, 2009
    Inventors: Pankaj Kedia, James Kardach
  • Publication number: 20090271002
    Abstract: A system and method for monitoring and controlling electric devices and electronic devices connected to an automation system in a first location from a different and remote location. The system can include at least one electric or electronic device connected to an automation system, the automation system being communicatively linked to a computer. The computer can be connected to a server via a communications network. An access device can be used by a user to submit control instructions for the electric or electronic device to the server via the communications network. The server can transmit the control instructions to the computer, which can use control software to translate the control instructions into a form that is readable by the automation system, thereby controlling one or more features of the electric or electronic device.
    Type: Application
    Filed: April 29, 2009
    Publication date: October 29, 2009
    Inventor: David Asofsky
  • Publication number: 20090271189
    Abstract: Methods, systems, and products for testing a grammar used in speech recognition for reliability in a plurality of operating environments having different background noise that include: receiving recorded background noise for each of the plurality of operating environments; generating a test speech utterance for recognition by a speech recognition engine using a grammar; mixing the test speech utterance with each recorded background noise, resulting in a plurality of mixed test speech utterances, each mixed test speech utterance having different background noise; performing, for each of the mixed test speech utterances, speech recognition using the grammar and the mixed test speech utterance, resulting in speech recognition results for each of the mixed test speech utterances; and evaluating, for each recorded background noise, speech recognition reliability of the grammar in dependence upon the speech recognition results for the mixed test speech utterance having that recorded background noise.
    Type: Application
    Filed: April 24, 2008
    Publication date: October 29, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES
    Inventors: Ciprian Agapi, William K. Bodin, Charles W. Cross, JR., Michael H. Mirt
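
The mixing step in the abstract above can be sketched as follows. The abstract only says that the test utterance is mixed with each recorded background noise; scaling the noise to a target signal-to-noise ratio is an assumption added here for illustration, and `mix_at_snr` is a hypothetical name. Signals are plain lists of samples of equal length.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the speech-to-noise power ratio is `snr_db`.

    Assumption: the SNR target is illustrative; the abstract does not specify
    how the mixing level is chosen.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    if not speech:
        return []
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        return list(speech)  # silent noise recording: nothing to mix in
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


# One mixed utterance per (hypothetical) operating environment.
speech = [1.0, -1.0, 1.0, -1.0]
environments = {"car": [0.5, 0.5, -0.5, -0.5], "office": [0.0, 0.0, 0.0, 0.0]}
mixed = {name: mix_at_snr(speech, noise, snr_db=0) for name, noise in environments.items()}
print(mixed["car"])  # prints [2.0, 0.0, 0.0, -2.0]
```

Each mixed utterance would then be passed to the recognizer with the grammar under test, and the results compared per environment.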
  • Publication number: 20090271190
    Abstract: In accordance with an example embodiment of the invention, there is provided an apparatus for detecting voice activity in an audio signal. The apparatus comprises a first voice activity detector for making a first voice activity detection decision based at least in part on the voice activity of a first audio signal received from a first microphone. The apparatus also comprises a second voice activity detector for making a second voice activity detection decision based at least in part on an estimate of a direction of the first audio signal and an estimate of a direction of a second audio signal received from a second microphone. The apparatus further comprises a classifier for making a third voice activity detection decision based at least in part on the first and second voice activity detection decisions.
    Type: Application
    Filed: April 25, 2008
    Publication date: October 29, 2009
    Applicant: NOKIA CORPORATION
    Inventors: Riitta Elina Niemisto, Paivi Marianna Valve
  • Publication number: 20090271201
    Abstract: A standard model creating apparatus which provides a high-precision standard model used for pattern recognition such as speech recognition, character recognition, or image recognition using a probability model based on a hidden Markov model, Bayesian theory, or linear discrimination analysis; intention interpretation using a probability model such as a Bayesian net; data-mining performed using a probability model; and so forth. The standard model creating apparatus includes a reference model preparing unit that prepares at least one reference model; a reference model storing unit that stores the reference model prepared by the reference model preparing unit; and a standard model creating unit that creates a standard model by calculating statistics of the standard model so as to maximize or locally maximize the probability or likelihood with respect to the reference model stored in the reference model storing unit.
    Type: Application
    Filed: July 8, 2009
    Publication date: October 29, 2009
    Inventor: Shinichi YOSHIZAWA
  • Publication number: 20090271200
    Abstract: The invention relates to a speech recognition assembly for acoustically controlling a function of a motor vehicle, wherein the speech recognition assembly comprises a microphone disposed in the motor vehicle for inputting a voice command, a data base disposed in the motor vehicle in which respectively at least one meaning is allocated to phonetic representations of voice commands and an on-board-speech-recognition-system disposed in the motor vehicle for determining a meaning of the voice command by use of a meaning of a phonetic representation of a voice command stored in the data base, and wherein the speech recognition assembly further comprises an off-board-speech-recognition-system disposed spatially separated from the motor vehicle for determining a meaning of the voice command.
    Type: Application
    Filed: March 24, 2009
    Publication date: October 29, 2009
    Applicant: Volkswagen Group of America, Inc.
    Inventors: Rohit Mishra, Edward Kim
  • Publication number: 20090271199
    Abstract: Methods, apparatus, and products are disclosed for record disambiguation in a multimodal application operating on a multimodal device, the multimodal device supporting multiple modes of interaction including at least a voice mode and a visual mode, that include: prompting, by the multimodal application, a user to identify a particular record among a plurality of records; receiving, by the multimodal application in response to the prompt, a voice utterance from the user; determining, by the multimodal application, that the voice utterance ambiguously identifies more than one of the plurality of records; generating, by the multimodal application, a user interaction to disambiguate the records ambiguously identified by the voice utterance in dependence upon record attributes of the records ambiguously identified by the voice utterance; and selecting, by the multimodal application for further processing, one of the records ambiguously identified by the voice utterance in dependence upon the user interaction.
    Type: Application
    Filed: April 24, 2008
    Publication date: October 29, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES
    Inventors: Ciprian Agapi, William K. Bodin, Charles W. Cross, JR., Pradeep P. Mansey
  • Publication number: 20090271188
    Abstract: Methods, apparatus, and products are disclosed for adjusting a speech engine for a mobile computing device based on background noise, the mobile computing device operatively coupled to a microphone, that include: sampling, through the microphone, background noise for a plurality of operating environments in which the mobile computing device operates; generating, for each operating environment, a noise model in dependence upon the sampled background noise for that operating environment; and configuring the speech engine for the mobile computing device with the noise model for the operating environment in which the mobile computing device currently operates.
    Type: Application
    Filed: April 24, 2008
    Publication date: October 29, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ciprian Agapi, William K. Bodin, Charles W. Cross, JR., Paritosh D. Patel
  • Publication number: 20090265169
    Abstract: A technique of operating a communication device includes dividing a frequency band associated with a background noise signal into respective sub-bands. Respective individual level estimates for each of the respective sub-bands are then determined. A total level estimate for the background noise signal is determined. Finally, a comfort noise signal (whose characteristics are based on the respective individual level estimates and the total level estimate) is provided.
    Type: Application
    Filed: April 18, 2008
    Publication date: October 22, 2009
    Inventors: Roman A. Dyba, Perry P. He, Brad L. Zwernemann
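
The level-estimation steps in the abstract above can be sketched under simple assumptions: the background noise is given as a list of per-bin power values, the sub-bands are contiguous and roughly equal in width, and a "level estimate" is the mean power. None of these choices comes from the abstract, and `sub_band_levels` is a hypothetical name.

```python
def sub_band_levels(power_bins, num_bands):
    """Return (per-sub-band mean power, total mean power) for a noise spectrum.

    Assumption: contiguous, near-equal-width sub-bands and mean power as the
    level measure; the abstract does not specify either.
    """
    n = len(power_bins)
    if num_bands < 1 or num_bands > n:
        raise ValueError("num_bands must be between 1 and the number of bins")
    # Integer band edges; every band holds at least one bin because num_bands <= n.
    edges = [i * n // num_bands for i in range(num_bands + 1)]
    levels = [sum(power_bins[a:b]) / (b - a) for a, b in zip(edges, edges[1:])]
    total = sum(power_bins) / n
    return levels, total


levels, total = sub_band_levels([1.0, 1.0, 3.0, 3.0, 5.0, 5.0], num_bands=3)
print(levels, total)  # prints [1.0, 3.0, 5.0] 3.0
```

A comfort noise generator could shape its output from these two kinds of estimate; that synthesis step is not described in enough detail in the abstract to sketch.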
  • Publication number: 20090264789
    Abstract: A set of therapy parameter values is selected based on a patient state, where the patient state comprises a speech state or a mixed patient state including the speech state and at least one of a movement state or a sleep state. In this way, therapy delivery is tailored to the patient state, which may include one or more patient symptoms specific to the patient state. In some examples, a medical device determines whether the patient is in the speech state or a mixed patient state including the speech state based on a signal generated by a voice activity sensor. The voice activity sensor detects the use of the patient's voice, and may include a microphone, a vibration detector or an accelerometer.
    Type: Application
    Filed: April 28, 2009
    Publication date: October 22, 2009
    Inventors: Gregory F. Molnar, Richard T. Stone, Xuan Wei
  • Publication number: 20090259469
    Abstract: A method and apparatus for performing speech recognition receives an audio signal, generates a sequence of frames of the audio signal, transforms each frame of the audio signal into a set of narrow band feature vectors using a narrow passband, couples the narrow band feature vectors to a speech model, and determines whether the audio signal is a wide band signal. When the audio signal is determined to be a wide band signal, a pass band parameter of each of one or more passbands that are outside the narrow passband is generated for each frame and the one or more band energy parameters are coupled to the speech model.
    Type: Application
    Filed: April 14, 2008
    Publication date: October 15, 2009
    Applicant: MOTOROLA, INC.
    Inventors: Changxue Ma, Yuan-Jun Wei
  • Publication number: 20090259464
    Abstract: A system and method for facilitating cognitive processing of simultaneous remote voice conversations is provided. A plurality of remote voice conversations participated in by distributed participants are provided over a shared communication channel. A main conversation between at least two of the distributed participants and one or more subconversations between at least two other of the distributed participants are identified from within the remote voice conversations. Segments of interest to one of the distributed participants are defined including a conversation excerpt having a lower attention activation threshold for the one distributed participant. Each of the subconversations is parsed into conversation excerpts. The conversation excerpts are compared to the segments of interest. One or more gaps between conversation flow in the main conversation are predicted.
    Type: Application
    Filed: April 11, 2008
    Publication date: October 15, 2009
    Applicant: PALO ALTO RESEARCH CENTER INCORPORATED
    Inventors: Nicolas B. Ducheneaut, Trevor F. Smith
  • Publication number: 20090252345
    Abstract: A wearable computer system has a user interface with at least an audio-only mode of operating, and that is natural in appearance and facilitates natural interactions with the system and the user's surroundings. The wearable computer system may retrieve information from the user's voice or surroundings using a passive user interface. The audio-only user interface for the wearable computer system may include two audio receivers and a single output device, such as a speaker, that provides audio data directly to the user. The two audio receivers may be miniature microphones that collaborate to input audio signals from the user's surroundings while also accurately inputting voice commands from the user. Additionally, the user may enter natural voice commands to the wearable computer system in a manner that blends in with the natural phrases and terminology spoken by the user.
    Type: Application
    Filed: June 11, 2009
    Publication date: October 8, 2009
    Applicant: ACCENTURE GLOBAL SERVICES GMBH
    Inventors: Dana Le, Lucian P. Hughes, Owen E. Richter
  • Publication number: 20090253463
    Abstract: A mobile terminal including an input unit configured to receive an input to activate a voice recognition function on the mobile terminal and a memory configured to store multiple domains related to menus and operations of the mobile terminal. It further includes a controller configured to access a specific domain among the multiple domains included in the memory based on the received input to activate the voice recognition function, to recognize user speech based on a language model and an acoustic model of the accessed domain, and to determine at least one menu and operation of the mobile terminal based on the accessed specific domain and the recognized user speech.
    Type: Application
    Filed: June 16, 2008
    Publication date: October 8, 2009
    Inventors: Jong-Ho SHIN, Jong-Keun YOUN, Dae-Sung JUNG, Jae-Hoon YU, Tae-Jun KIM, Jae-Min JOH, Jae-Do KWAK
  • Publication number: 20090254343
    Abstract: Embodiments of a system for identifying audio content are described. During operation, the system receives a data stream from an electronic device via a communication network. Then, the system distorts a set of target patterns which are used to identify the audio content based on characteristics of the electronic device and/or the communication network. Next, the system identifies the audio content in the data stream based on the set of distorted target patterns.
    Type: Application
    Filed: April 4, 2008
    Publication date: October 8, 2009
    Applicant: INTUIT INC.
    Inventor: Matt E. Hart
  • Publication number: 20090254351
    Abstract: A mobile terminal including an input unit configured to receive an input to activate a voice recognition function on the mobile terminal, a memory configured to store information related to operations performed on the mobile terminal, and a controller configured to activate the voice recognition function upon receiving the input to activate the voice recognition function, to determine a meaning of an input voice instruction based on at least one prior operation performed on the mobile terminal and a language included in the voice instruction, and to provide operations related to the determined meaning of the input voice instruction based on the at least one prior operation performed on the mobile terminal and the language included in the voice instruction and based on a probability that the determined meaning of the input voice instruction matches the information related to the operations of the mobile terminal.
    Type: Application
    Filed: June 16, 2008
    Publication date: October 8, 2009
    Inventors: Jong-Ho Shin, Jea-Do Kwak, Jong-Keun Youn
  • Publication number: 20090254341
    Abstract: A spectrum calculating unit calculates, for each of the frames, a spectrum by performing a frequency analysis on an acoustic signal. An estimating unit estimates a noise spectrum. An energy calculating unit calculates an energy characteristic amount. An entropy calculating unit calculates a normalized spectral entropy value. A generating unit generates a characteristic vector based on the energy characteristic amounts and the normalized spectral entropy values that have been calculated for a plurality of frames. A likelihood calculating unit calculates a speech likelihood value of a target frame that corresponds to the characteristic vector. In a case where the speech likelihood value is larger than a threshold value, a judging unit judges that the target frame is a speech frame.
    Type: Application
    Filed: September 22, 2008
    Publication date: October 8, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Koichi Yamamoto, Masami Akamine
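
One of the per-frame features named in the abstract above, the normalized spectral entropy, can be illustrated with the textbook definition: the Shannon entropy of the power spectrum treated as a probability distribution, divided by its maximum value log(N). This is an assumption; the application may normalize differently (for example, using the estimated noise spectrum), and `normalized_spectral_entropy` is a hypothetical name.

```python
import math


def normalized_spectral_entropy(power_spectrum):
    """Shannon entropy of a power spectrum, scaled to the range [0, 1].

    Assumption: textbook definition, not necessarily the one in the application.
    """
    n = len(power_spectrum)
    total = sum(power_spectrum)
    if n < 2 or total <= 0:
        return 0.0
    entropy = 0.0
    for p in power_spectrum:
        if p > 0:  # bins with zero power contribute nothing
            q = p / total
            entropy -= q * math.log(q)
    return entropy / math.log(n)


flat = normalized_spectral_entropy([1.0, 1.0, 1.0, 1.0])    # noise-like frame
peaked = normalized_spectral_entropy([1.0, 0.0, 0.0, 0.0])  # single spectral peak
print(round(flat, 6), round(peaked, 6))
```

A flat, noise-like spectrum gives a value near 1 and a strongly peaked spectrum gives a value near 0, which is why the feature is useful alongside energy when separating speech frames from noise frames.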
  • Publication number: 20090247245
    Abstract: An electronic headset comprising connectivity functionality, the connectivity functionality arranged to allow the headset to be operatively coupled to a remote associated electronic device for the transmission of signalling between the headset and the device, the headset further comprising camera functionality arranged to capture one or more images, and wherein the camera functionality is arranged to be controlled by control signalling transmitted via the connectivity functionality, the control signalling being processed by the remote associated electronic device and being sent to the electronic headset to control the camera functionality.
    Type: Application
    Filed: December 14, 2004
    Publication date: October 1, 2009
    Inventors: Andrew Strawn, Brian Davidson, Tuomas Matila
  • Publication number: 20090248411
    Abstract: VoIP phones according to the present invention include a microphone, which may be internal or external, and allow the user to communicate unobtrusively, check voice mail and conduct other activities in an environment which can be noisy in general and extremely noisy sometimes. Speech recognition functionality may also be used to generate and send touch tone or DTMF tones such as in response to call trees or voice recognition functionality used by airlines, credit card companies, voice mail systems, and other applications. A system and method of audio processing which provides enhanced speech recognition is provided. Audio input is received at the microphone which is processed by adaptive noise cancellation to generate an enhanced audio signal. The operation of the speech recognition engine and the adaptive noise canceller may be advantageously controlled based on Voice Activity Detection (VAD).
    Type: Application
    Filed: March 27, 2009
    Publication date: October 1, 2009
    Inventors: Alon Konchitsky, Alberto D. Berstein, Hariharan Ganapathy Kathirvelu, Sandeep Kulakcherla, William Martin Ribble
  • Publication number: 20090244000
    Abstract: An integrated communication interface is provided for composing and sending messages. The interface is multi-configurable to seamlessly switch between different communication methods, e.g., electronic mail, instant messaging, SMS, chat, voice, and the like, without loss of message content. The interface allows a user to begin composing a message to be sent using one communication method, such as electronic mail, and subsequently change the communication method and send the message via a second communication method, such as instant messaging. When the communication method is changed, the user interface may also change to include elements specific to a particular communication method. The integrated communication interface may display information about participants in the communication, such as the participants' presence, i.e., whether they are online and available for communication, and may automatically choose the best method of communication based on the preferences and online presence of the participants.
    Type: Application
    Filed: February 9, 2009
    Publication date: October 1, 2009
    Applicant: YAHOO! INC.
    Inventors: Brooke THOMPSON, Greg Rosenberg, Ryan Michael Olshavsky, Brian Kobashikawa
  • Publication number: 20090248398
    Abstract: A system and method for instructing dynamic nodes in a dynamically changing mobile network how to maneuver. A receiver receives situation data indicative of a respective situation of each dynamic node in space and a situation unit coupled to the receiver determines the respective situation of each dynamic node. An analysis unit coupled to the situation unit analyzes the respective situation of each dynamic node in combination with specified criteria to generate respective situation awareness data for each dynamic node. A dynamic selector unit coupled to the analysis unit determines from the respective situation awareness data appropriate action to be performed by each node; and a communication unit coupled to the dynamic selector unit conveys to the respective dynamic node command data to permit rendering of a personalized command for informing the respective node of appropriate action to be carried out thereby.
    Type: Application
    Filed: September 13, 2006
    Publication date: October 1, 2009
    Applicant: Elta Systems Ltd
    Inventors: Moshe Aviran, Alexander Zussman
  • Publication number: 20090243794
    Abstract: A camera module includes an image sensor, a first signal processor, a bus interface, and a security device interface. The image sensor acquires an image data input. The first signal processor is coupled to the image sensor to receive the image data input, exchange data with a security device, and exchange data with a computer system which includes a second signal processor. The bus interface is coupled to the first signal processor to exchange data between the first signal processor and the second signal processor. The security device interface is coupled to the first signal processor to exchange data between the first signal processor and the security device.
    Type: Application
    Filed: March 23, 2009
    Publication date: October 1, 2009
    Inventor: Neil MORROW
  • Publication number: 20090248420
    Abstract: A voice interaction system includes one or more independent, concurrent state charts, which are used to model the behavior of each of a plurality of participants. The model simplifies the notation and provides a clear description of the interactions between multiple participants. These state charts capture the flow of voice prompts, the impact of externally initiated events and voice commands, and capture the progress of audio through each prompt. This system enables a method to prioritize conflicting and concurrent events leveraging historical patterns and the progress of in-progress prompts.
    Type: Application
    Filed: March 25, 2009
    Publication date: October 1, 2009
    Inventors: Otman A. Basir, William Ben Miners
  • Publication number: 20090245063
    Abstract: A digital memory organizer including a digital memory with a connection to a computer and designated software associated with the digital memory to carry out functions such as automatically copying digital data, burning digital data onto hardware, and organizing digital data according to the instructions of the user or, absent specific instructions, according to a default instruction. The designated software would also support voice recognition technology for the purpose of memory organization and memory retrieval.
    Type: Application
    Filed: March 27, 2008
    Publication date: October 1, 2009
    Inventors: Horin Kobalo, Uzi Ezra Havosha
  • Publication number: 20090240488
    Abstract: A method for facilitating the updating of a language model includes receiving, at a client device, via a microphone, an audio message corresponding to speech of a user; communicating the audio message to a first remote server; receiving, at the client device, a result, transcribed at the first remote server using an automatic speech recognition system (“ASR”), from the audio message; receiving, at the client device from the user, an affirmation of the result; storing, at the client device, the result in association with an identifier corresponding to the audio message; and communicating, to a second remote server, the stored result together with the identifier.
    Type: Application
    Filed: March 19, 2009
    Publication date: September 24, 2009
    Applicant: Yap, Inc.
    Inventors: Marc White, Igor Roditis Jablokov, Victor Roditis Jablokov
  • Publication number: 20090238348
    Abstract: A method of defining a voice browser for browsing a plurality of voice sites, at least some of the voice sites having different telephone numbers, the voice sites being configured to be accessed by telephone, is provided including storing information relating to voice sites visited by a voice user; and providing forward and back functions, comprising transferring a user from one voice site to another, in response to commands by the user. Computer program code and systems are also provided.
    Type: Application
    Filed: March 19, 2008
    Publication date: September 24, 2009
    Applicant: International Business Machines Corporation
    Inventors: Sheetal K. Agarwal, Dipanjan Chakraborty, Arun Kumar, Amit Anil Nanavati, Nitendra Rajput
  • Publication number: 20090240673
    Abstract: There is provided an information processing system comprising a device and a terminal. The device comprises a recording unit to record telephone communication; a voice obtaining unit to obtain voice in response to detection of disconnection from a phone line; a transmission unit to transmit recorded data; a transmission unit to transmit voice data; a reception unit to receive a keyword candidate; a display control unit to display the keyword candidate; a selection unit to select a keyword candidate; and a transmission unit to transmit the selected keyword candidate. The terminal comprises a reception unit to receive the recorded data; a keyword generation unit to generate the keyword candidate based on the voice data; a transmission unit to transmit the keyword candidate; a registration unit to register the keyword candidate as a search keyword while associating the keyword candidate with the recorded data.
    Type: Application
    Filed: March 16, 2009
    Publication date: September 24, 2009
    Applicant: BROTHER KOGYO KABUSHIKI KAISHA
    Inventor: Takeshi Nagasaki
  • Publication number: 20090240499
    Abstract: A speech recognition system comprising: an analog to digital converter, a time to frequency transformer, a noise filter, a context preprocessor, an acoustic word classifier, an initial acoustic model generator, a textual search module, and a trainer. The system recognizes speech initially, prior to training, because the context preprocessor classifies words of identical sound by the context of a leading and trailing neighboring group of words and the acoustic model generator creates an initial acoustic model derived from an acoustic word statistical analysis ‘average’. Applications of the system include voice activated computer games, command and control systems and text dictation.
    Type: Application
    Filed: March 19, 2008
    Publication date: September 24, 2009
    Inventors: Zohar Dvir, Ben-Zion Elishakov, Eitan Broukman, Yoel Shor
  • Publication number: 20090234647
    Abstract: A method, program storage device and mobile device provide speech disambiguation. Audio for speech recognition processing is transmitted by the mobile device. Results representing alternates identified to match the transmitted audio are received. The alternates are displayed in a disambiguation dialog screen for making corrections to the alternates. Corrections are made to the alternates using the disambiguation dialog screen until a correct result is displayed. The correct result is selected. Content associated with the selected correct result is received in parallel with the receiving of the results representing alternates identified to match the transmitted audio.
    Type: Application
    Filed: March 14, 2008
    Publication date: September 17, 2009
    Applicant: Microsoft Corporation
    Inventors: Oliver Scholz, Robert L. Chambers, Julian James Odell
  • Publication number: 20090234649
    Abstract: Aspects of the invention relate to an audio matching system for detecting matching of audio signals. In one embodiment, the system comprises a digital signature generation device configured to generate signatures from an audio signal. The device may comprise a signal energy analyser, an event detector, a signature generator, and a transmitter. Embodiments may further include an audio signal comparing device that comprises a receiver, an event locator arranged to use the event data of the received digital signature to locate corresponding portions of another sampled data segment of another audio signal, and an event analyser arranged to analyse the corresponding portions located by the event locator and to determine whether they match the predetermined events of the event data, to thereby determine whether the audio signals match.
    Type: Application
    Filed: December 16, 2008
    Publication date: September 17, 2009
    Applicant: Taylor Nelson Sofres plc
    Inventor: Iain Goodhew
  • Publication number: 20090232471
    Abstract: An information recording/reproducing apparatus capable of simplifying settings of a scene breakpoint includes a voice recognition unit and a control unit. At a timing when the voice recognition unit extracts a feature during recording, the control unit sets a scene breakpoint and generates a thumbnail at the same time. During reproduction, the thumbnail and voices when the feature was extracted are output at the same time.
    Type: Application
    Filed: February 6, 2009
    Publication date: September 17, 2009
    Applicant: Hitachi, Ltd.
    Inventors: Hironori KOMI, Keisuke INATA, Daisuke YOSHIDA, Yusuke YATABE, Mitsuhiro OKADA, Tomoyuki NONAKA
  • Publication number: 20090228126
    Abstract: To facilitate the use of audio files for annotation purposes, an audio file format, which includes audio data for playback purposes, is augmented with a parallel data channel of line identifiers, or with a map associating time codes for the audio data with line numbers on the original document. The line number-time code information in the audio file is used to navigate within the audio file, and also to associate bookmark links and captured audio annotation files with line numbers of the original text document. An annotation device may provide an output document wherein links to audio and/or text annotation files are embedded at corresponding line numbers. Also, a navigation index may be generated, having links to annotation files and associated document line numbers, as well as bookmark links to selected document line numbers.
    Type: Application
    Filed: February 27, 2009
    Publication date: September 10, 2009
    Inventors: Steven Spielberg, Samuel Gustman
  • Publication number: 20090228270
    Abstract: Semantically distinct items are extracted from a single utterance by repeatedly recognizing the same utterance using constraints provided by semantic items already recognized. User feedback for selection or correction of a partially recognized utterance may be used in a hierarchical, multi-modal, or single step manner. An accuracy of recognition is preserved while the less structured and more natural single utterance recognition form is allowed to be used.
    Type: Application
    Filed: March 5, 2008
    Publication date: September 10, 2009
    Applicant: Microsoft Corporation
    Inventors: Julian J. Odell, Robert L. Chambers, Oliver Scholz
  • Publication number: 20090228269
    Abstract: Method of synchronization between an operation for processing, by automatic speech recognition, a voice sequence (Sv) uttered by a speaker and an action of said speaker intended to trigger said processing. According to the invention, said processing operation is effected from a given time (t0) preceding said action of the speaker. Application to automatic speech recognition.
    Type: Application
    Filed: April 6, 2006
    Publication date: September 10, 2009
    Applicant: France Telecom
    Inventors: Jean Monne, Alexandre Ferrieux
  • Publication number: 20090216529
    Abstract: Electronic devices and methods are disclosed that adaptively filter a microphone signal responsive to recognition of a targeted speaker's voice. An electronic device can include a microphone, a speaker characterization circuit, an adaptive sound filter circuit, and a speaker recognition circuit. The speaker characterization circuit operates in a training mode to learn characteristics of the targeted speaker's voice component in the microphone signal, and to store the learned characteristics. The adaptive sound filter circuit adaptively filters the microphone signal responsive to a control signal. The speaker recognition circuit uses the learned characteristics to recognize the presence of the targeted speaker's voice in the microphone signal and to regulate the control signal to cause the adaptive sound filter circuit to adapt the filtering to increase the targeted speaker's voice component of the microphone signal relative to other components.
    Type: Application
    Filed: February 27, 2008
    Publication date: August 27, 2009
    Inventor: Henrik Sven Bengtsson
  • Publication number: 20090216530
    Abstract: A system improves speech detection or processing by identifying registration signals. The system encodes a limited frequency band by varying the amplitude of a pulse width modulated signal between predefined values. The signal is separated into frequency bins that identify amplitude and phase. The registration signal is measured by comparing a difference in average acoustic power in a plurality of adjacent bins over time.
    Type: Application
    Filed: February 21, 2008
    Publication date: August 27, 2009
    Applicant: QNX Software Systems (Wavemakers), Inc.
    Inventors: Mark Fallat, Derek Sahota
  • Publication number: 20090216540
    Abstract: A system and method for processing voice requests from a user for accessing information on a computerized network and delivering information from a script server and an audio server in the network in audio format. A voice user interface subsystem includes: a dialog engine that is operable to interpret requests from users from the user input, communicate the requests to the script server and the audio server, and receive information from the script server and the audio server; a media telephony services (MTS) server, wherein the MTS server is operable to receive user input via a telephony system, and to transfer the user input to the dialog engine; and a broker coupled between the dialog engine and the MTS server. The broker establishes a session between the MTS server and the dialog engine and controls telephony functions with the telephony system.
    Type: Application
    Filed: February 20, 2009
    Publication date: August 27, 2009
    Applicant: Ben Franklin Patent Holding, LLC
    Inventors: Marianna Tessel, Danny Lange, Eugene Ponomarenko, Mitsuru Oshima, Daniel Burkes, Tjoen Min Tjong
  • Publication number: 20090205662
    Abstract: A method of operating a device for treating sleep disordered breathing (SDB), wherein the device provides continuous positive airway pressure during sleep, includes applying a treatment pressure to a patient, monitoring the patient for speech output, generating a signal in response to detected speech of the patient, and, in response to the signal, reducing the treatment pressure applied to the patient.
    Type: Application
    Filed: June 14, 2006
    Publication date: August 20, 2009
    Inventors: Philip Rodney Kwok, Ron Richard, Rohan Mullins, Chee Keong Phuah, Karthikeyan Selvarajan, Adrian Barnes, Christopher Kingsley Blunsden, Benriah Goeldi
  • Publication number: 20090210229
    Abstract: A voice message processing system shortens received voice messages to reduce the time a user must spend in reviewing the user's voice messages. In some embodiments, a data file associated with a caller is created and updated with words and associated audio files that may be used to replace longer words or phrases in future voice messages from the caller. A user may manually configure preferences to aggressively shorten messages in some embodiments. A speech synthesizer may be employed to replace text in messages when sufficient audio files are not stored to provide sufficient processing of messages. An audible indicator may be played with a revised message to allow a user to play back at least a portion of the original, received message without the substituted portions. Such systems provide a user the opportunity to review messages in a reduced time.
    Type: Application
    Filed: February 18, 2008
    Publication date: August 20, 2009
    Applicant: AT&T KNOWLEDGE VENTURES, L.P.
    Inventors: Brian Scott Amento, Christopher Harrison, Larry Stead
  • Publication number: 20090210233
    Abstract: One or more commands are configured to cause content to be stored for retrieval. The content to be stored includes one or more entries. The content may include event-triggered content stored for retrieval upon an occurrence of a specified event or other content. The content is retrieved in response to a retrieval command specifying a given pattern by comparing the given pattern with the stored content and, upon finding a match for the given pattern, wherein the match corresponds with the given pattern within a predetermined variance, retrieving additional content stored with the match for the given pattern. The content also may be retrieved by identifying the occurrence of the specified event and retrieving the event-triggered content upon the occurrence of the specified event.
    Type: Application
    Filed: February 15, 2008
    Publication date: August 20, 2009
    Applicant: Microsoft Corporation
    Inventors: Ralph Donald Thompson, III, Russell I. Sanchez
  • Publication number: 20090210232
    Abstract: A plurality of prompting layers configured to provide varying levels of detailed assistance in prompting a user are maintained. A prompt from a current prompting layer is presented to a user. Input is received from the user. A level of detail in prompting the user is adaptively changed based on user behavior. Upon the user making a hesitant verbal gesture that reaches a threshold duration, a transition is made from the current prompting layer to a more detailed prompting layer. Upon the user interrupting the prompt with a valid input, a transition is made from the current prompting layer to a less detailed prompting layer.
    Type: Application
    Filed: February 15, 2008
    Publication date: August 20, 2009
    Applicant: Microsoft Corporation
    Inventor: Russell I. Sanchez
  • Publication number: 20090207247
    Abstract: A surveillance system includes an event module that receives an event detection signal that indicates an event external to the surveillance system. The system also includes a power control module that selectively operates a data recording device at least partially concurrent with the event based on the event detection signal. The system also includes a selection module that selects a portion of data recorded by the data recording device based on content of the data. A data transmission module transmits the portion over a remote communication link.
    Type: Application
    Filed: February 15, 2008
    Publication date: August 20, 2009
    Inventors: Jeffrey Zampieron, Joseph Presicci, Mark Sanders, Robert Post
  • Publication number: 20090210217
    Abstract: A gaming apparatus of the present invention comprises: a microphone; a speaker; a display; a memory storing effect image data for each language type; and a controller. The controller is programmed to conduct the processing of: (A) recognizing a language type from a sound inputted from said microphone by executing a language recognition program; (B) conducting a conversation with a player by recognizing a voice inputted from said microphone, in addition to outputting a voice from said speaker by executing a conversation program corresponding to the language recognized in said processing (A); and (C) displaying to said display an image based on effect image data corresponding to the language type recognized in said processing (A) according to progress of a game, said effect image data read from said memory.
    Type: Application
    Filed: October 1, 2008
    Publication date: August 20, 2009
    Applicant: ARUZE GAMING AMERICA, INC.
    Inventor: Kazuo Okada
  • Publication number: 20090204408
    Abstract: A multi-modal system providing for a single point of contact that can allow users to manage their personal contact information and contact lists, and connect to other users and businesses in a personalized, efficient, location-sensitive and organized manner. By accessing the system using any type of telephony-based device, a user can manage all of their personal and business contacts as well as perform generalized searches in public databases, such as white page and/or yellow page listings, or more personalized searches through databases of their business or personal contacts. A user may also, during a generalized search, go to a personalized search, and vice-versa. The system may also provide users with the opportunity to select certain businesses from their contact lists and allow these businesses to provide them with personalized data, either on demand or based on user-controlled permissions or areas of interest through various technologies including presence technologies.
    Type: Application
    Filed: February 4, 2009
    Publication date: August 13, 2009
    Inventors: Todd Garrett Simpson, Christopher Edward Lugg