Speech Classification Or Search (epo) Patents (Class 704/E15.014)

E Subclasses

Using distance or distortion measures between unknown speech and reference templates (epo) (Class 704/E15.015)

Using dynamic programming techniques, e.g., dynamic time warping (DTW), etc. (epo) (Class 704/E15.016)

Using artificial neural networks (epo) (Class 704/E15.017)

Using natural language modeling (epo) (Class 704/E15.018)

Using statistical models, e.g., hidden Markov models (HMMs), etc. (epo) (Class 704/E15.027)

Recognition networks (epo) (Class 704/E15.038)

Speech recognition word dictionary/language model making system, method, and program, and speech recognition system

Publication number: 20090106023

Abstract: A speech recognition word dictionary/language model making system for creating a word dictionary for recognizing a word not appearing in a learning text by selecting a word-generation-model-learning-method-by-word-class according to the word to be added which does not appear in the learning text and for making a language model. The speech recognition word dictionary/language model making system (100) includes a language model estimating device (111) for selecting estimating method information from a learning-method-knowledge-by-word-class storing section (109) for each word class of an addition word generating model which is a word generating model of the addition word according to the selected estimating method information and a database combining device (112) for adding an addition word to a word dictionary (105) and adding an addition word generating model to a word-generation-model-by-word-class database (107).

Type: Application

Filed: November 30, 2007

Publication date: April 23, 2009

Inventor: Kiyokazu Miki

CLIENT DEVICE FOR INTERACTING WITH A MIXED MEDIA REALITY RECOGNITION SYSTEM

Publication number: 20090100050

Abstract: The mobile device includes a client that has a number of modules, and the MMR Gateway and MMR matching unit are implemented as a server that has a number of modules. The implementation of the MMR system as a client and a server is advantageous because the modules may be distributed among the client and the server in a variety of configurations. The present invention includes a capture module, a preprocessing module, a feature extraction module, a retrieval module, a send message module, an action module, a prediction module, a feedback module, a sending module, an MMR database, a streaming module, an e-mail module, a voice recognition system and an audio database. These modules and systems are operational upon the client or the server. In one embodiment, the client includes only the capture module with the remaining modules operational on the server. In a second embodiment, the server includes the action module with the remaining modules operational on the client.

Type: Application

Filed: December 19, 2008

Publication date: April 16, 2009

Inventors: Berna Erol, Jorge Moraleda, Jonathan J. Hull

Creating A Voice Response Grammar From A Presentation Grammar

Publication number: 20090099842

Abstract: Methods, systems, and products are disclosed for creating a voice response grammar in a voice response server including identifying presentation documents for a presentation, each presentation document having a presentation grammar. Typical embodiments include storing each presentation grammar in a voice response grammar on a voice response server. In typical embodiments, identifying presentation documents for a presentation includes creating a data structure representing a presentation and listing at least one presentation document in the data structure representing a presentation. In typical embodiments listing the at least one presentation document includes storing a location of the presentation document in the data structure representing a presentation and storing each presentation grammar includes retrieving a presentation grammar of the presentation document in dependence upon the location of the presentation document.

Type: Application

Filed: December 23, 2008

Publication date: April 16, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: William Kress Bodin, Michael John Burkhart, Daniel G. Eisenhauer, Daniel Mark Schumacher, Thomas J. Watson

SIGNAL PRESENCE DETECTION USING BI-DIRECTIONAL COMMUNICATION DATA

Publication number: 20090043577

Abstract: A system and method for using bi-directional conversation data to improve signal presence detection are disclosed. The detector module is adapted to communicate with a signal enhancement module. The detector module collects data from a transmit direction of the connection and a receive direction of a data connection. The collected data from the transmit and the receive direction is used to classify at least one of data in the transmit direction and data in the receive direction. Responsive to the classification, the signal enhancement module enhances data in one of the transmit direction and the receive direction. Hence, data classification accuracy is improved by using data from both the transmit and receive directions. In one embodiment, the detector module applies a voice activity detection module (VAD) process to detect the presence or absence of voice data in the collected data.

Type: Application

Filed: August 10, 2007

Publication date: February 12, 2009

Applicant: DITECH NETWORKS, INC.

Inventor: MAHESH GODAVARTI

Method and System for Grouping Voice Messages

Publication number: 20080215323

Abstract: A method for grouping voice messages includes extracting a voice signature from a voice message and tagging the voice message with an identification associated with the voice signature. The method also includes grouping the voice message based on the identification.

Type: Application

Filed: March 2, 2007

Publication date: September 4, 2008

Applicant: Cisco Technology, Inc.

Inventors: Shmuel Shaffer, Labhesh Patel, Mukul Jain, Sanjeev Kumar
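
The abstract above describes three steps: extract a voice signature, tag the message with an identification associated with that signature, and group messages by the identification. A minimal Python sketch of that flow follows. It assumes each message already carries a numeric signature vector (the abstract does not say how signatures are extracted or compared), and the names `identify`, `group_messages`, the Euclidean distance, and the `threshold` value are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def identify(signature, known_signatures, threshold=1.0):
    """Return the ID of the closest known signature, or "unknown".

    Assumption: signatures are equal-length numeric tuples compared by
    Euclidean distance; the abstract does not specify a comparison method.
    """
    best_id, best_dist = "unknown", threshold
    for speaker_id, reference in known_signatures.items():
        dist = math.dist(signature, reference)
        if dist < best_dist:
            best_id, best_dist = speaker_id, dist
    return best_id


def group_messages(messages, known_signatures):
    """Tag each message with an identification, then group by that tag."""
    groups = defaultdict(list)
    for message in messages:
        message["id"] = identify(message["signature"], known_signatures)
        groups[message["id"]].append(message["name"])
    return dict(groups)


# Hypothetical data: two enrolled speakers and four incoming messages.
known = {"alice": (0.0, 0.0), "bob": (5.0, 5.0)}
inbox = [
    {"name": "m1", "signature": (0.1, 0.2)},
    {"name": "m2", "signature": (4.9, 5.1)},
    {"name": "m3", "signature": (0.2, 0.0)},
    {"name": "m4", "signature": (20.0, 20.0)},
]
print(group_messages(inbox, known))
```

Messages whose signature is not within the threshold of any enrolled speaker fall into an "unknown" group in this sketch; a real system might instead enroll a new identification.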

SYSTEM AND METHOD FOR SELECTING AND PRESENTING ADVERTISEMENTS BASED ON NATURAL LANGUAGE PROCESSING OF VOICE-BASED INPUT

Publication number: 20080189110

Abstract: A system and method for selecting and presenting advertisements based on natural language processing of voice-based inputs is provided. A user utterance may be received at an input device, and a conversational, natural language processor may identify a request from the utterance. At least one advertisement may be selected and presented to the user based on the identified request. The advertisement may be presented as a natural language response, thereby creating a conversational feel to the presentation of advertisements. The request and the user's subsequent interaction with the advertisement may be tracked to build user statistical profiles, thus enhancing subsequent selection and presentation of advertisements.

Type: Application

Filed: February 6, 2007

Publication date: August 7, 2008

Inventors: Tom Freeman, Mike Kennewick

System and method for modifying and updating a speech recognition program

Publication number: 20080167860

Abstract: An embodiment provides a system and method for updating a speech recognition program. The system provides a speech recognition program, an update website for updating a speech recognition program, and a database for storing data that may be used to update a speech recognition program. A user may utilize the update website to add, modify, and delete speech recognition program information that may include speech commands, DLLs, multimedia files, executable code, and other information. The speech recognition program may communicate with the update website to request information about possible updates. The update website may send a response consisting of information to the speech recognition program. The speech recognition program may utilize the received information to decide what speech commands, DLLs, multimedia files, executable code, and other information it may want to download.

Type: Application

Filed: January 10, 2007

Publication date: July 10, 2008

Inventors: Michael D. Goller, Stuart E Goller

NETWORK ENTITY, METHOD AND COMPUTER PROGRAM PRODUCT FOR MIXING SIGNALS DURING A CONFERENCE SESSION

Publication number: 20080162127

Abstract: A network entity, method and computer program product are provided for effectuating a conference session. The method may include receiving a plurality of signals representative of voice communication of the participants. In this regard, the signals may be received from a plurality of terminals of a respective plurality of participants at one of the locations, each of at least some of the terminals otherwise being configured for voice communication independent of at least some of the other terminals. The method of this aspect also includes classifying speech activity of the conference session according to a speech pause, or one or more actively-speaking participants, during the conference session. The signals of the respective participants may then be mixed into at least one mixed signal for output to one or more other participants at one or more other locations, the signals being mixed based upon classification of the speech activity.

Type: Application

Filed: December 27, 2006

Publication date: July 3, 2008

Inventors: Laura Laaksonen, Jussi Virolainen, Paivi Valve

MULTIPLE SOUND FRAGMENTS PROCESSING AND LOAD BALANCING

Publication number: 20080147403

Abstract: A method, system and article of manufacture of recognizing a voice command. One embodiment of the invention comprises: receiving a voice input; using the number of sound fragments, determining a number of sound fragments to be processed in a first set of sound fragments; determining whether the first set of sound fragments of the voice input matches with the first set of sound fragments of a voice command; and if the first set of sound fragments matches with the first set of sound fragments of the voice command, then determining whether one or more remaining sound fragments matches with one or more remaining sound fragments of the voice command.

Type: Application

Filed: March 3, 2008

Publication date: June 19, 2008

Inventors: Joseph Herbert McIntyre, Victor S. Moore
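
The two-stage comparison described in the abstract above (match a first set of sound fragments, and only then compare the remaining fragments) can be illustrated with a short Python sketch. It assumes fragments are represented as plain strings and that the size of the first set is supplied by the caller; the abstract does not say how fragments are encoded or how that size is chosen, and the names `matches_command` and `recognize` are hypothetical.

```python
def matches_command(input_fragments, command_fragments, first_set_size):
    """Compare a first set of fragments, then the remaining fragments.

    Assumption: fragments are comparable with ==; a real recognizer would
    use an acoustic similarity measure instead of exact equality.
    """
    n = first_set_size
    if input_fragments[:n] != command_fragments[:n]:
        return False  # first set differs, so the remainder is never examined
    return input_fragments[n:] == command_fragments[n:]


def recognize(input_fragments, commands, first_set_size):
    """Return the name of the first command that matches, or None."""
    for name, fragments in commands.items():
        if matches_command(input_fragments, fragments, first_set_size):
            return name
    return None


# Hypothetical command table keyed by command name.
commands = {
    "open file": ["o", "pen", "fi", "le"],
    "open folder": ["o", "pen", "fol", "der"],
    "close": ["clo", "se"],
}
print(recognize(["o", "pen", "fol", "der"], commands, first_set_size=2))
print(recognize(["sa", "ve"], commands, first_set_size=2))
```

The early return on a first-set mismatch is the point of the split: most non-matching commands are rejected after comparing only a few fragments.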

Method and system for high-speed speech recognition

Publication number: 20080140399

Abstract: Provided is a method and system for high-speed speech recognition. On the basis of a continuous density hidden Markov model (CDHMM) using a Gaussian mixture model (GMM) for an observation probability, the method and system add only K Gaussian components highly contributing to a state-specific observation probability for an input feature vector and calculate the state-specific observation probability. Thus, in the aspect of the recognition ratio, the degree of approximation of a state-specific observation probability increases, thereby minimizing deterioration of speech recognition performance. In addition, in the aspect of the amount of computation, the number of addition operations required for computing an observation probability is reduced, in comparison with conventional speech recognition that adds all Gaussian probabilities of an input feature vector and uses it for a state-specific observation probability, thereby reducing the total amount of computation required for speech recognition.

Type: Application

Filed: July 30, 2007

Publication date: June 12, 2008

Inventor: Hoon Chung
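
The approximation described in the abstract above, summing only the K Gaussian components that contribute most to a state's observation probability, can be sketched in Python as follows. This is an illustration under assumptions, not the patented method: it uses diagonal covariances, and it picks the top K terms by first evaluating every component, which demonstrates the accuracy of the approximation but not the computational saving (the abstract does not state how the K components are selected). The function names and the example parameters are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian at feature vector x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def observation_prob(x, weights, means, variances, k=None):
    """GMM observation probability for one state.

    With k=None all weighted component densities are summed; otherwise
    only the k largest weighted terms are kept.
    """
    terms = [
        w * gaussian_pdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    if k is not None:
        terms = sorted(terms, reverse=True)[:k]
    return sum(terms)


# Hypothetical one-dimensional state with three mixture components.
weights = [0.6, 0.3, 0.1]
means = [[0.0], [1.0], [5.0]]
variances = [[1.0], [1.0], [1.0]]
x = [0.2]

full = observation_prob(x, weights, means, variances)
top2 = observation_prob(x, weights, means, variances, k=2)
print(full, top2, full - top2)
```

For this input the third component lies far from `x`, so dropping it changes the result only slightly, which is the behavior the abstract relies on to limit the loss in recognition performance.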

METHOD TO TRAIN THE LANGUAGE MODEL OF A SPEECH RECOGNITION SYSTEM TO CONVERT AND INDEX VOICEMAILS ON A SEARCH ENGINE

Publication number: 20080133235

Abstract: A method and a related system to index voicemail documents by training a language model for a speaker or group of speakers by using existing emails and contact information on available repositories.

Type: Application

Filed: December 3, 2007

Publication date: June 5, 2008

Inventors: Laurent SIMONEAU, Pascal Soucy

SYSTEM AND METHOD FOR COMPUTERIZED PSYCHOLOGICAL CONTENT ANALYSIS OF COMPUTER AND MEDIA GENERATED COMMUNICATIONS TO PRODUCE COMMUNICATIONS MANAGEMENT SUPPORT, INDICATIONS AND WARNINGS OF DANGEROUS BEHAVIOR, ASSESSMENT OF MEDIA IMAGES, AND PERSONNEL SELECTION SUPPORT

Publication number: 20080109214

Abstract: At least one computer-mediated communication produced by or received by an author is collected and parsed to identify categories of information within it. The categories of information are processed with at least one analysis to quantify at least one type of information in each category. A first output communication is generated regarding the at least one computer-mediated communication, describing the psychological state, attitudes or characteristics of the author of the communication. A second output communication is generated when a difference between the quantification of at least one type of information for at least one category and a reference for the at least one category is detected involving a psychological state, attitude or characteristic of the author to which a responsive action should be taken.

Type: Application

Filed: January 7, 2008

Publication date: May 8, 2008

Inventor: Eric Shaw

Apparatus and method for processing information, and program

Publication number: 20080077398

Abstract: An information processing apparatus includes an extracting unit that extracts a plurality of words serving as keywords of content from content information that describes the content, a property dictionary storage unit that stores a property dictionary containing the properties of the words, a property searching unit that searches the property dictionary for the properties of the plurality of words, a property determining unit that determines whether each of the properties of a target word to be processed and selected from among the words serving as keywords matches any of the different words other than the target word among the words serving as keywords or whether each of the properties of a target word matches any of the properties of the different words, and a determination unit that determines the representative property of the target word on the basis of a match count determined by the property determining unit.

Type: Application

Filed: September 19, 2007

Publication date: March 27, 2008

Inventors: Motoki Tsunokawa, Kenichiro Kobayashi

Apparatus and Methods for the Detection of Emotions in Audio Interactions

Publication number: 20080040110

Abstract: An apparatus and method for detecting an emotional state of a speaker participating in an audio signal. The apparatus and method are based on the distance in voice features between a person being in an emotional state and the same person being in a neutral state. The apparatus and method comprise a training phase in which a training feature vector is determined, and an ongoing stage in which the training feature vector is used to determine emotional states in a working environment. Multiple types of emotions can be detected, and the method and apparatus are speaker-independent, i.e., no prior voice sample or information about the speaker is required.

Type: Application

Filed: August 8, 2005

Publication date: February 14, 2008

Applicant: NICE SYSTEMS LTD.

Inventors: Oren Pereg, Moshe Wasserblat
