Speech Recognition (epo) Patents (Class 704/E15.001)

  • Publication number: 20110238419
    Abstract: A binaural configuration and an associated method utilize first and second hearing devices for the voice control of the hearing devices by voice commands. The configuration contains a first voice recognition module in the first hearing device and a second voice recognition module in the second hearing device. The second voice recognition module uses information data from the first voice recognition module for recognition of the voice commands. This advantageously reduces the rate of erroneously recognized voice commands (“false alarms”).
    Type: Application
    Filed: March 24, 2011
    Publication date: September 29, 2011
    Applicant: SIEMENS MEDICAL INSTRUMENTS PTE. LTD.
    Inventor: Roland Barthel
  • Publication number: 20110230229
    Abstract: Techniques for organizing information in a user-interactive system based on user interest are provided. In one aspect, a method for operating a system having a plurality of resources through which a user can navigate is provided. The method includes the following steps. When the user accesses the system, the resources are presented to the user in a particular order. Interests of the user in the resources presented are determined. The interests of the user are compared to interests of other users to find one or more subsets of users to which the user belongs by virtue of having similar interests. Upon one or more subsequent accesses to the system by the user, the order in which the resources are presented to the user is based on interests common to the one or more subsets of users to which the user belongs.
    Type: Application
    Filed: March 20, 2010
    Publication date: September 22, 2011
    Applicant: International Business Machines Corporation
    Inventors: Rajarshi Das, Robert George Farrell, Nitendra Rajput
  • Publication number: 20110231188
    Abstract: The system and method described herein may provide an acoustic grammar to dynamically sharpen speech interpretation. In particular, the acoustic grammar may be used to map one or more phonemes identified in a user verbalization to one or more syllables or words, wherein the acoustic grammar may have one or more linking elements to reduce a search space associated with mapping the phonemes to the syllables or words. As such, the acoustic grammar may be used to generate one or more preliminary interpretations associated with the verbalization, wherein one or more post-processing techniques may then be used to sharpen accuracy associated with the preliminary interpretations. For example, a heuristic model may assign weights to the preliminary interpretations based on context, user profiles, or other knowledge and a probable interpretation may be identified based on confidence scores associated with one or more candidate interpretations generated with the heuristic model.
    Type: Application
    Filed: June 1, 2011
    Publication date: September 22, 2011
    Applicant: VoiceBox Technologies, Inc.
    Inventors: Robert A. Kennewick, Min Ke, Michael Tjalve, Philippe Di Cristo
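The heuristic weighting step in this abstract, where candidate interpretations are re-scored using context before the most probable one is selected, could be sketched as follows. The function name, the multiplicative combination of confidence and weight, and the example phrases are all illustrative assumptions, not the patent's actual model:

```python
def pick_interpretation(candidates, context_weights):
    """Combine base confidence scores with heuristic context weights and
    return the most probable interpretation.

    `candidates` maps candidate text to a base confidence score;
    `context_weights` maps candidate text to a heuristic weight derived
    from context or user profiles (default weight 1.0). The real heuristic
    model described in the abstract uses richer knowledge than this.
    """
    scored = {text: conf * context_weights.get(text, 1.0)
              for text, conf in candidates.items()}
    return max(scored, key=scored.get)
```

With no context weights the highest-confidence candidate wins; a strong contextual weight can promote a lower-confidence candidate instead.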
  • Publication number: 20110224987
    Abstract: A method for identifying end of voiced speech within an audio stream of a noisy environment employs a speech discriminator. The discriminator analyzes each window of the audio stream, producing an output corresponding to the window. The output is used to classify the window in one of several classes, for example, (1) speech, (2) silence, or (3) noise. A state machine processes the window classifications, incrementing counters as each window is classified: speech counter for speech windows, silence counter for silence, and noise counter for noise. If the speech counter indicates a predefined number of windows, the state machine clears all counters. Otherwise, the state machine appropriately weights the values in the silence and noise counters, adds the weighted values, and compares the sum to a limit imposed on the number of non-voice windows. When the non-voice limit is reached, the state machine terminates processing of the audio stream.
    Type: Application
    Filed: June 3, 2010
    Publication date: September 15, 2011
    Applicant: Applied Voice & Speech Technologies, Inc.
    Inventor: Karl Daniel Gierach
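The counter-based state machine in this abstract could be sketched as below. The class name, the specific weights, and the limits are illustrative assumptions; the abstract specifies only the counters, the speech-counter reset, the weighting of silence and noise counts, and the comparison against a non-voice limit:

```python
SPEECH, SILENCE, NOISE = "speech", "silence", "noise"

class EndOfSpeechDetector:
    """Sketch of the end-of-voiced-speech state machine (parameters assumed)."""

    def __init__(self, speech_reset=3, silence_weight=1.0,
                 noise_weight=0.5, non_voice_limit=10.0):
        self.speech_reset = speech_reset      # speech windows needed to clear counters
        self.silence_weight = silence_weight  # weight applied to the silence counter
        self.noise_weight = noise_weight      # weight applied to the noise counter
        self.non_voice_limit = non_voice_limit
        self.speech = self.silence = self.noise = 0

    def process(self, label):
        """Take one window classification; return True when the non-voice
        limit is reached and processing should terminate."""
        if label == SPEECH:
            self.speech += 1
            if self.speech >= self.speech_reset:
                # sustained speech: clear all counters and keep listening
                self.speech = self.silence = self.noise = 0
            return False
        if label == SILENCE:
            self.silence += 1
        else:
            self.noise += 1
        weighted = (self.silence * self.silence_weight +
                    self.noise * self.noise_weight)
        return weighted >= self.non_voice_limit
```

Each classified window is fed to `process`; a `True` return corresponds to the state machine terminating processing of the audio stream.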
  • Publication number: 20110224978
    Abstract: An information processing device includes an audio-based speech recognition processing unit which is input with audio information as observation information of a real space and executes an audio-based speech recognition process, thereby generating word information that is determined to have a high probability of being spoken; an image-based speech recognition processing unit which is input with image information as observation information of the real space and analyzes mouth movements of each user included in the input image, thereby generating mouth movement information; an audio-image-combined speech recognition score calculating unit which is input with the word information and the mouth movement information and executes a score setting process in which a mouth movement close to the word information is set with a high score; and an information integration processing unit which is input with the score and executes a speaker specification process.
    Type: Application
    Filed: March 1, 2011
    Publication date: September 15, 2011
    Inventor: Tsutomu SAWADA
  • Publication number: 20110224980
    Abstract: A speech recognition system according to the present invention includes a sound source separating section which separates mixed speeches from multiple sound sources from one another; a mask generating section which generates a soft mask which can take continuous values between 0 and 1 for each frequency spectral component of a separated speech signal using distributions of speech signal and noise against separation reliability of the separated speech signal; and a speech recognizing section which recognizes speeches separated by the sound source separating section using soft masks generated by the mask generating section.
    Type: Application
    Filed: March 10, 2011
    Publication date: September 15, 2011
    Applicant: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Toru Takahashi, Hiroshi Okuno
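The soft mask in this abstract takes continuous values between 0 and 1 for each frequency spectral component, driven by the separation reliability of that component. A minimal sketch follows, using a logistic function of reliability as one plausible mapping; the patent instead derives the mask from distributions of speech and noise against separation reliability, which this sketch does not model, and the midpoint and steepness parameters are assumptions:

```python
import math

def soft_mask(reliabilities, midpoint=0.5, steepness=10.0):
    """Map per-frequency separation reliabilities to soft mask values in (0, 1).

    Higher reliability yields a mask value nearer 1 (component trusted for
    recognition); lower reliability yields a value nearer 0.
    """
    return [1.0 / (1.0 + math.exp(-steepness * (r - midpoint)))
            for r in reliabilities]
```

Unlike a hard binary mask, every spectral component keeps a graded contribution, which is the property the abstract emphasizes.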
  • Publication number: 20110224967
    Abstract: Method and apparatus for capturing a text based source image 5, 5A provided on an object 3 supported on a surface. Positioned above the object 3 is a camera 7 for capturing a view of the text based image 5, 5A. The camera 7, through lens 9 generates a focused image of at least part of the object 3 and transmits this image to a processor 11 for magnification of the image captured by the camera 7 to a size specified for display on a display device 15. In the processor 11 the magnification is effected to a rate that is controlled by the second predefined size for display 19, 19A of the font to appear on the display 15 and is independent of the first font size.
    Type: Application
    Filed: June 16, 2009
    Publication date: September 15, 2011
    Inventor: Michiel Jeroen Van Schaik
  • Publication number: 20110218806
    Abstract: Systems and methods are provided for automatically building a native phonetic lexicon for a speech-based application trained to process a native (base) language, wherein the native phonetic lexicon includes native phonetic transcriptions (base forms) for non-native (foreign) words which are automatically derived from non-native phonetic transcriptions of the non-native words.
    Type: Application
    Filed: May 18, 2011
    Publication date: September 8, 2011
    Applicant: Nuance Communications, Inc.
    Inventors: Neal Alewine, Eric Janke, Paul Sharp, Roberto Sicconi
  • Publication number: 20110218803
    Abstract: A method for assessing intelligibility of speech represented by a speech signal includes providing a speech signal and performing a feature extraction on at least one frame of the speech signal so as to obtain a feature vector for each of the at least one frame of the speech signal. The feature vector is input to a statistical machine learning model so as to obtain an estimated posterior probability of phonemes in the at least one frame as an output including a vector of phoneme posterior probabilities of different phonemes for each of the at least one frame of the speech signal. An entropy estimation is performed on the vector of phoneme posterior probabilities of the at least one frame of the speech signal so as to evaluate intelligibility of the at least one frame of the speech signal. An intelligibility measure is output for the at least one frame of the speech signal.
    Type: Application
    Filed: March 4, 2011
    Publication date: September 8, 2011
    Applicant: DEUTSCHE TELEKOM AG
    Inventors: Hamed Ketabdar, Juan-Pablo Ramirez
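The entropy step in this abstract, where a vector of phoneme posterior probabilities per frame is reduced to an intelligibility measure, could be sketched as follows. The normalization of entropy into a [0, 1] score is an assumption of this sketch; the abstract specifies only that entropy of the posterior vector is used to evaluate intelligibility:

```python
import math

def frame_entropy(posteriors):
    """Shannon entropy (bits) of a phoneme posterior vector for one frame.

    Low entropy (one phoneme clearly dominates) suggests the frame is
    intelligible; entropy near log2(N) means the model is uncertain.
    """
    return -sum(p * math.log2(p) for p in posteriors if p > 0.0)

def intelligibility_score(posteriors):
    """Normalize entropy into [0, 1], where 1 is most intelligible (a sketch)."""
    max_entropy = math.log2(len(posteriors))
    return 1.0 - frame_entropy(posteriors) / max_entropy
```

A frame whose posterior mass sits on a single phoneme scores 1; a uniform posterior over all phonemes scores 0.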
  • Publication number: 20110218804
    Abstract: A speech recognition method, the method involving: receiving a speech input from a known speaker of a sequence of observations; and determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model, the acoustic model having a plurality of model parameters describing probability distributions which relate a word or part thereof to an observation, the acoustic model having been trained using first training data and adapted using second training data to said speaker, the speech recognition method also determining the likelihood of a sequence of observations occurring in a given language using a language model; and combining the likelihoods determined by the acoustic model and the language model and outputting a sequence of words identified from said speech input signal, wherein said acoustic model is context based for said speaker, said context based information being contained in said model using a plurality of decision trees, wherein the structure of said d
    Type: Application
    Filed: January 26, 2011
    Publication date: September 8, 2011
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Byung Ha Chun
  • Publication number: 20110218805
    Abstract: In a spoken term detection apparatus, processing performed by a processor includes a feature extraction process extracting an acoustic feature from speech data accumulated in an accumulation part and storing the extracted acoustic feature in an acoustic feature storage part, a first calculation process calculating a standard score from a similarity between an acoustic feature stored in the acoustic feature storage part and an acoustic model stored in an acoustic model storage part, a second calculation process comparing an acoustic model corresponding to an input keyword with the acoustic feature stored in the acoustic feature storage part to calculate a score of the keyword, and a retrieval process retrieving speech data including the keyword from speech data accumulated in the accumulation part based on the score of the keyword calculated by the second calculation process and the standard score stored in a standard score storage part.
    Type: Application
    Filed: March 3, 2011
    Publication date: September 8, 2011
    Applicant: FUJITSU LIMITED
    Inventors: Nobuyuki Washio, Shouji Harada
  • Publication number: 20110213614
    Abstract: A method of analysing an audio signal is disclosed. A digital representation of an audio signal is received and a first output function is generated based on a response of a physiological model to the digital representation. At least one property of the first output function may be determined. One or more values are determined for use in analysing the audio signal, based on the determined property of the first output function.
    Type: Application
    Filed: September 11, 2009
    Publication date: September 1, 2011
    Applicant: NEWSOUTH INNOVATIONS PTY LIMITED
    Inventors: Wenliang Lu, Dipanjan Sen
  • Publication number: 20110213616
    Abstract: A speech recognition system includes a natural language processing component and an automated speech recognition component distinct from each other such that uncertainty in speech recognition is isolated from uncertainty in natural language understanding, wherein the natural language processing component and an automated speech recognition component communicate corresponding weighted meta-information representative of the uncertainty.
    Type: Application
    Filed: September 23, 2010
    Publication date: September 1, 2011
    Inventors: Robert E. Williams, John E. Keane
  • Publication number: 20110208521
    Abstract: A method, system and apparatus are shown for identifying non-language speech sounds in a speech or audio signal. An audio signal is segmented and feature vectors are extracted from the segments of the audio signal. The segment is classified using a hidden Markov model (HMM) that has been trained on sequences of these feature vectors. Post-processing components can be utilized to enhance classification. An embodiment is described in which the hidden Markov model is used to classify a segment as a language speech sound or one of a variety of non-language speech sounds. Another embodiment is described in which the hidden Markov model is trained using discriminative learning.
    Type: Application
    Filed: August 13, 2009
    Publication date: August 25, 2011
    Applicant: 21CT, INC.
    Inventor: Matthew McClain
  • Publication number: 20110208519
    Abstract: A method of operation of a real-time data-pattern analysis system includes: providing a memory module, a computational unit, and an integrated data transfer module arranged within an integrated circuit die; storing a data pattern within the memory module; transferring the data pattern from the memory module to the computational unit using the integrated data transfer module; and comparing processed data to the data pattern using the computational unit.
    Type: Application
    Filed: October 7, 2009
    Publication date: August 25, 2011
    Inventor: Richard M. Fastow
  • Publication number: 20110208604
    Abstract: System and method for delivering media assets to a user of an automobile is disclosed. The system comprises a voice input device and a voice recognition device. The system further comprises a head-up display device for displaying metadata of the media assets on a windshield of the automobile. After a user's input for one or a group of media assets is received, a list of metadata of the media assets is displayed on the windshield. The system plays back the selected media asset through the use of a media delivery unit after the selection is received by the voice input device. A driving route may also be displayed on the windshield.
    Type: Application
    Filed: February 20, 2010
    Publication date: August 25, 2011
    Inventor: Yang Pan
  • Publication number: 20110208524
    Abstract: This is directed to processing voice inputs received by an electronic device. In particular, this is directed to receiving a voice input and identifying the user providing the voice input. The voice input can be processed using a subset of words from a library used to identify the words or phrases of the voice input. The particular subset can be selected such that voice inputs provided by the user are more likely to include words from the subset. The subset of the library can be selected using any suitable approach, including for example based on the user's interests and words that relate to those interests. For example, the subset can include one or more words related to media items selected by the user for storage on the electronic device, names of the user's contacts, applications or processes used by the user, or any other words relating to the user's interactions with the device.
    Type: Application
    Filed: February 25, 2010
    Publication date: August 25, 2011
    Applicant: Apple Inc.
    Inventor: Allen P. Haughay
  • Publication number: 20110208526
    Abstract: A method for variable resolution and error control in spoken language understanding (SLU) allows arranging the categories of the SLU into a hierarchy of different levels of specificity. The pre-determined hierarchy is used to identify different types of errors such as high-cost errors and low-cost errors and trade, if necessary, high cost errors for low cost errors.
    Type: Application
    Filed: May 6, 2011
    Publication date: August 25, 2011
    Inventors: Roberto PIERACCINI, Krishna Dayanidhi
  • Publication number: 20110202350
    Abstract: A system for remotely and interactively controlling visual and multimedia content displayed on and rendered by a web browser using a telephony device. In particular, the system relates to receiving a voice input (e.g., dual-tone multi-frequency (DTMF) input, spoken input, etc.) from a telephony device (e.g., a landline, a cellular telephone, or other system with telephone functionality, etc.) via a wide-area network to an intermediary computer that is configured to control the rendering of one or more web pages (or other web data) by a standard web browser.
    Type: Application
    Filed: October 15, 2009
    Publication date: August 18, 2011
    Inventor: Troy Barnes
  • Publication number: 20110202338
    Abstract: Voice recognition technology is combined with external information sources and/or contextual information to enhance the quality of voice recognition results specifically for the use case of reading out or speaking an alphanumeric identifier. The alphanumeric identifier may be associated with a good, service, person, account, or other entity. For example, the identifier may be a vehicle license plate number.
    Type: Application
    Filed: February 14, 2011
    Publication date: August 18, 2011
    Inventor: Philip INGHELBRECHT
  • Publication number: 20110202351
    Abstract: A system includes a hands free mobile communication device. Software stored on a machine readable storage device is executed to cause the hands free mobile communication device to communicate audibly with a field operator performing field operations. The operator receives instructions regarding operations to be performed. Oral communications are received from the operator and are processed automatically to provide further instructions in response to the received oral communications.
    Type: Application
    Filed: February 16, 2010
    Publication date: August 18, 2011
    Applicant: Honeywell International Inc.
    Inventors: Tom Plocher, Emmanuel Letsu-Dake, Robert E. De Mers, Paul Derby
  • Publication number: 20110196677
    Abstract: According to one illustrative embodiment, a method is provided for analyzing an audio interaction. At least one change in an emotion of a speaker in an audio interaction and at least one aspect of the audio interaction are identified. The at least one change in an emotion is analyzed in conjunction with the at least one aspect to determine a relationship between the at least one change in an emotion and the at least one aspect, and a result of the analysis is provided.
    Type: Application
    Filed: February 11, 2010
    Publication date: August 11, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Om D. Deshmukh, Chitra Dorai, Shailesh Joshi, Maureen E. Rzasa, Ashish Verma, Karthik Visweswariah, Gary J. Wright, Sai Zeng
  • Publication number: 20110191104
    Abstract: A method for measuring a disparity between two speech samples is disclosed that may include determining upon a speech granularity level at which to compare the rhythm of a student speech sample and a reference speech sample; determining a duration disparity between a first speech unit and a second, non-adjacent speech unit in the student speech sample; determining a duration disparity between a first speech unit and a second, non-adjacent speech unit in the reference speech sample; and calculating the difference between the student speech-unit duration disparity and the reference speech-unit disparity.
    Type: Application
    Filed: January 29, 2010
    Publication date: August 4, 2011
    Applicant: Rosetta Stone, Ltd.
    Inventors: Joseph Tepperman, Theban Stanley, Kadri Hacioglu
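The disparity computation in this abstract could be sketched as below. Durations are assumed to be in seconds, the speech units (phonemes, syllables, or words, per the chosen granularity level) are represented simply as indexed duration lists, and taking the absolute difference is an assumption of this sketch:

```python
def duration_disparity(durations, i, j):
    """Duration disparity between two (possibly non-adjacent) speech units."""
    return durations[i] - durations[j]

def rhythm_difference(student_durations, reference_durations, i, j):
    """Difference between the student's and the reference's duration
    disparities for the same pair of speech units i and j."""
    student = duration_disparity(student_durations, i, j)
    reference = duration_disparity(reference_durations, i, j)
    return abs(student - reference)
```

A larger result indicates the student's rhythm between those two units deviates more from the reference speaker's.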
  • Publication number: 20110184724
    Abstract: Presented is a method and system for speech recognition. The method includes determining noise level in an environment, comparing the determined noise level with a predetermined noise level threshold value, using a first set of grammar for speech recognition, if the determined noise level is below the predetermined noise level threshold value, and using a second set of grammar for speech recognition, if the determined noise level is above the predetermined noise level threshold value.
    Type: Application
    Filed: April 6, 2010
    Publication date: July 28, 2011
    Inventor: Amit RANJAN
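The grammar-switching logic in this abstract reduces to a single threshold comparison. In the sketch below, the threshold value and the grammar names are illustrative assumptions; the abstract specifies only that one grammar set is used below the threshold and another above it:

```python
def select_grammar(noise_level_db, threshold_db=50.0,
                   quiet_grammar="large_vocabulary",
                   noisy_grammar="restricted_commands"):
    """Pick a grammar set by comparing the measured noise level to a
    predetermined threshold (all parameter values are assumptions)."""
    if noise_level_db < threshold_db:
        return quiet_grammar
    return noisy_grammar
```

A typical design rationale: in quiet conditions a large, flexible grammar is affordable, while in noise a smaller, more constrained grammar keeps recognition accuracy acceptable.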
  • Publication number: 20110184736
    Abstract: Automated methods are provided for recognizing inputted information items and selecting information items. The recognition and selection processes are performed by selecting category designations that the information items belong to. The category designations improve the accuracy and speed of the inputting and selection processes.
    Type: Application
    Filed: January 25, 2011
    Publication date: July 28, 2011
    Inventor: Benjamin SLOTZNICK
  • Publication number: 20110184737
    Abstract: A speech recognition apparatus includes a speech input unit that receives input speech, a phoneme recognition unit that recognizes phonemes of the input speech and generates a first phoneme sequence representing corrected speech, a matching unit that matches the first phoneme sequence with a second phoneme sequence representing original speech, and a phoneme correcting unit that corrects phonemes of the second phoneme sequence based on the matching result.
    Type: Application
    Filed: January 27, 2011
    Publication date: July 28, 2011
    Applicant: HONDA MOTOR CO., LTD.
    Inventors: Mikio NAKANO, Naoto IWAHASHI, Kotaro FUNAKOSHI, Taisuke SUMII
  • Publication number: 20110178801
    Abstract: A system for access to multimedia structures has telephone sets capable of connecting to a telephone network, a storage device capable of storing a plurality of multimedia structures representing messages and/or data and/or commands, and a network access server that can be associated with the telephone sets and is capable of selectively instantiating the multimedia structures via an interconnection network. There is also a voice-recognition and speech-synthesis system that can be associated with the network access server and that comprises modules for reading files in XML format and for processing the files so as to obtain files in a format that can be synthesized by a speech synthesizer.
    Type: Application
    Filed: January 14, 2011
    Publication date: July 21, 2011
    Applicants: TELECOM ITALIA S.P.A., LOQUENDO S.P.A.
    Inventors: Pierpaolo Anselmetti, Mauro Cociglio, Simone Toniolo, Diego Zanin, Nadia Zerba
  • Publication number: 20110179304
    Abstract: One example embodiment includes a method for providing multi-tenancy in a computing environment. The method includes receiving a script in a computing environment, where the script includes one or more actions to be completed by the computing environment. The method further includes providing one or more computing resources in the computing environment and building an action list for the one or more computing resources, where the action list is a data structure that contains a list of one or more actions to be executed by the one or more computing resources. The method further includes transmitting a first action to one of the one or more computing resources, where the first action is one of the one or more actions. The method further includes executing the first action in the one of the one or more computing resources and indicating to the action list the completion of the first action.
    Type: Application
    Filed: January 15, 2010
    Publication date: July 21, 2011
    Applicant: INCONTACT, INC.
    Inventor: David Owen Peterson
  • Patent number: 7983912
    Abstract: A speech recognition apparatus includes a generation unit generating a recognition candidate associated with a speech utterance and a likelihood; a storing unit storing the recognition candidate; a selecting unit selecting the recognition candidate as a recognition result of a first speech utterance; an utterance relation determining unit determining whether a second speech utterance which is input after the input of the first speech utterance is a speech re-utterance of a whole of the first speech utterance or a speech re-utterance of a part of the first speech utterance; a whole correcting unit correcting the recognition candidate of the whole of the first speech utterance when the second speech utterance is the whole of the first speech utterance; and a part correcting unit correcting the recognition candidate for the part of the first speech utterance when the second speech utterance is the part of the first speech utterance.
    Type: Grant
    Filed: March 15, 2006
    Date of Patent: July 19, 2011
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Hideki Hirakawa, Tetsuro Chino
  • Publication number: 20110168499
    Abstract: In a destination floor registration device 800, a boarding detection unit 804 detects a boarding of an elevator user; when no button operation by destination floor registration buttons 500 is performed after a predetermined period of time has elapsed since detection of the boarding, a voice destination floor registration unit 805 outputs from a voice output device 400 a message for prompting a passenger to pronounce a destination floor; a voice recognition unit 806 recognizes the destination floor pronounced by the passenger; and a voice destination floor registration unit 805 requests an elevator car control device 200 to register the destination floor recognized by the voice recognition unit 806.
    Type: Application
    Filed: October 3, 2008
    Publication date: July 14, 2011
    Applicant: MITSUBISHI ELECTRIC CORPORATION
    Inventor: Nobukazu Takeuchi
  • Publication number: 20110172999
    Abstract: A system, method and computer-readable medium for practicing a method of emotion detection during a natural language dialog between a human and a computing device are disclosed. The method includes receiving an utterance from a user in a natural language dialog, receiving contextual information regarding the natural language dialog which is related to changes of emotion over time in the dialog, and detecting an emotion of the user based on the received contextual information. Examples of contextual information include, for example, differential statistics, joint statistics and distance statistics.
    Type: Application
    Filed: March 21, 2011
    Publication date: July 14, 2011
    Applicant: AT&T Corp.
    Inventors: Dilek Z. Hakkani-Tur, Jackson J. Liscombe, Guiseppe Riccardi
  • Publication number: 20110173000
    Abstract: A word category estimation apparatus (100) includes a word category model (5) which is formed from a probability model having a plurality of kinds of information about a word category as features, and includes information about an entire word category graph as at least one of the features. A word category estimation unit (4) receives the word category graph of a speech recognition hypothesis to be processed, computes scores by referring to the word category model for respective arcs that form the word category graph, and outputs a word category sequence candidate based on the scores.
    Type: Application
    Filed: December 19, 2008
    Publication date: July 14, 2011
    Inventors: Hitoshi Yamamoto, Miki Kiyokazu
  • Publication number: 20110173002
    Abstract: A storage unit stores a correspondence between a voice command and a display mode modification operation. When a control unit determines that a vehicle is traveling according to a traveling state of the vehicle obtained by a traveling state acquisition unit, when a voice recognition unit recognizes a voice, which is uttered by a user and received by a voice input unit, and when the control unit determines that the recognized voice corresponds to a voice command stored in the storage unit, the control unit performs a display mode change operation corresponding to the voice command and modifies a display mode of an icon indicated on an indication screen of an indication unit.
    Type: Application
    Filed: January 10, 2011
    Publication date: July 14, 2011
    Applicant: DENSO CORPORATION
    Inventors: Masahiro FUJII, Yuji SHINKAI
  • Publication number: 20110166857
    Abstract: A human voice distinguishing method and device are provided. The method involves: taking every n sampling points of the current frame of audio signals as one subsection, wherein n is a positive integer, judging whether two adjacent subsections have transition relative to a distinguishing threshold, wherein the sliding maximum absolute value of the two adjacent subsections is more and less than the distinguishing threshold respectively, if so, then determining the current frame to be human voice, where the sliding maximum absolute value of the subsection is obtained by the following method: taking the maximum value of absolute intensity of every sampling point in this subsection as the initial maximum absolute value of this subsection, and taking the maximum value of the initial maximum absolute value of this subsection and m subsections following this subsection as the sliding maximum absolute value of this subsection, wherein m is a positive integer.
    Type: Application
    Filed: September 15, 2009
    Publication date: July 7, 2011
    Applicant: ACTIONS SEMICONDUCTOR CO. LTD.
    Inventors: Xiangyong Xie, Zhan Chen
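The sliding-maximum computation in this abstract could be sketched as follows. The helper names and the strict-inequality threshold comparison are assumptions; the abstract's rule is that a frame is judged human voice when two adjacent subsections sit on opposite sides of the distinguishing threshold:

```python
def sliding_max_abs(samples, n, m):
    """Per-subsection sliding maximum absolute value: split the frame into
    subsections of n samples, take each subsection's maximum absolute sample
    value, then take the maximum over that subsection and the m following."""
    subsections = [samples[i:i + n] for i in range(0, len(samples), n)]
    initial = [max(abs(s) for s in sub) for sub in subsections]
    return [max(initial[i:i + m + 1]) for i in range(len(initial))]

def is_human_voice(samples, n, m, threshold):
    """Judge the frame as human voice when two adjacent subsections have a
    transition: one sliding maximum above the threshold, the other below."""
    sliding = sliding_max_abs(samples, n, m)
    for a, b in zip(sliding, sliding[1:]):
        if (a > threshold) != (b > threshold):
            return True
    return False
```

Intuitively, a frame containing a burst of energy against a quieter background produces such a transition, while uniformly quiet (or uniformly loud) frames do not.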
  • Publication number: 20110166860
    Abstract: Systems and methods are disclosed to operate a mobile device by capturing user input; transmitting the user input over a wireless channel to an engine, analyzing at the engine music clip or video in a multimedia data stream and sending an analysis wirelessly to the mobile device.
    Type: Application
    Filed: July 12, 2010
    Publication date: July 7, 2011
    Inventor: Bao Q. Tran
  • Publication number: 20110166862
    Abstract: A method and system for altering an operational mode of evaluating and responding to verbal input from a user to a mobile device if conditions make such evaluation incompatible with a favorable user experience. Automated speech recognition (ASR) evaluation of verbal input may be performed on a mobile platform to continue a flow of the user experience. Evaluation of the verbal input may continue at a backend when conditions allow for transmission of recorded input to the backend.
    Type: Application
    Filed: January 4, 2011
    Publication date: July 7, 2011
    Inventors: Eyal ESHED, Ariel Velikovsky, Sherrie Ellen Shammass
  • Publication number: 20110166855
    Abstract: In one embodiment the present invention includes a method comprising receiving an acoustic input signal and processing the acoustic input signal with a plurality of acoustic recognition processes configured to recognize the same target sound. Different acoustic recognition processes start processing different segments of the acoustic input signal at different time points in the acoustic input signal. In one embodiment, initial states in the recognition processes may be configured on each time step.
    Type: Application
    Filed: July 6, 2010
    Publication date: July 7, 2011
    Applicant: SENSORY, INCORPORATED
    Inventors: Pieter J. Vermeulen, Jonathan Shaw, Todd F. Mozer
  • Publication number: 20110161076
    Abstract: A smart phone senses audio, imagery, and/or other stimulus from a user's environment, and acts autonomously to fulfill inferred or anticipated user desires. In one aspect, the detailed technology concerns phone-based cognition of a scene viewed by the phone's camera. The image processing tasks applied to the scene can be selected from among various alternatives by reference to resource costs, resource constraints, other stimulus information (e.g., audio), task substitutability, etc. The phone can apply more or less resources to an image processing task depending on how successfully the task is proceeding, or based on the user's apparent interest in the task. In some arrangements, data may be referred to the cloud for analysis, or for gleaning. Cognition, and identification of appropriate device response(s), can be aided by collateral information, such as context. A great number of other features and arrangements are also detailed.
    Type: Application
    Filed: June 9, 2010
    Publication date: June 30, 2011
    Inventors: Bruce L. Davis, Tony F. Rodriguez, William Y. Conwell, Geoffrey B. Rhoads
  • Publication number: 20110161082
    Abstract: A method for assessing a performance of a speech recognition system may include determining a grade, corresponding to either recognition of instances of a word or recognition of instances of various words among a set of words, wherein the grade indicates a level of the performance of the system and the grade is based on a recognition rate and at least one recognition factor. An apparatus for assessing a performance of a speech recognition system may include a processor that determines a grade, corresponding to either recognition of instances of a word or recognition of instances of various words among a set of words, wherein the grade indicates a level of the performance of the system and wherein the grade is based on a recognition rate and at least one recognition factor.
    Type: Application
    Filed: March 9, 2011
    Publication date: June 30, 2011
    Inventors: Keith Braho, Jeffrey Pike, Amro El-Jaroudi, Lori Pike, Michael Laughery
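A minimal sketch of the grading idea: combine a recognition rate with one recognition factor (here, the number of observed instances) into a grade. The thresholds and the choice of factor are assumptions; the patent leaves both open.

```python
# Hedged sketch: grade per-word recognition from rate plus one factor.
def grade_word(recognition_rate: float, n_instances: int,
               min_instances: int = 20) -> str:
    """Grade indicates performance level; too few instances is inconclusive."""
    if n_instances < min_instances:
        return "insufficient data"
    if recognition_rate >= 0.95:
        return "good"
    if recognition_rate >= 0.85:
        return "fair"
    return "poor"
```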
  • Publication number: 20110161075
    Abstract: A method and apparatus for implementation of real-time speech recognition using a handheld computing apparatus are provided. The handheld computing apparatus receives an audio signal, such as a user's voice. The handheld computing apparatus ultimately transmits the voice data to a remote or distal computing device with greater processing power and operating a speech recognition software application. The speech recognition software application processes the signal and outputs a set of instructions for implementation either by the computing device or the handheld apparatus. The instructions can include a variety of items including instructing the presentation of a textual representation of dictation, or a function or command to be executed by the handheld device (such as linking to a website, opening a file, cutting, pasting, saving, or other file menu type functionalities), or by the computing device itself.
    Type: Application
    Filed: December 1, 2010
    Publication date: June 30, 2011
    Inventor: Eric Hon-Anderson
  • Publication number: 20110161083
    Abstract: A method for assessing a performance of a speech recognition system may include determining a grade, corresponding to either recognition of instances of a word or recognition of instances of various words among a set of words, wherein the grade indicates a level of the performance of the system and the grade is based on a recognition rate and at least one recognition factor. An apparatus for assessing a performance of a speech recognition system may include a processor that determines a grade, corresponding to either recognition of instances of a word or recognition of instances of various words among a set of words, wherein the grade indicates a level of the performance of the system and wherein the grade is based on a recognition rate and at least one recognition factor.
    Type: Application
    Filed: March 9, 2011
    Publication date: June 30, 2011
    Inventors: Keith Braho, Jeffrey Pike, Amro El-Jaroudi, Lori Pike, Michael Laughery
  • Publication number: 20110161077
    Abstract: A method of and system for accurately determining a caller response by processing speech-recognition results and returning that result to a directed-dialog application for further interaction with the caller. Multiple speech-recognition engines are provided that process the caller response in parallel. Returned speech-recognition results comprising confidence-score values and word-score values from each of the speech-recognition engines may be modified based on context information provided by the directed-dialog application and grammars associated with each speech-recognition engine. An optional context database may be used to further reduce or add weight to confidence-score values and word-score values, remove phrases and/or words, and add phrases and/or words to the speech-recognition engine results. In situations where a predefined threshold-confidence-score value is not exceeded, a new dynamic grammar may be created.
    Type: Application
    Filed: December 30, 2010
    Publication date: June 30, 2011
    Inventor: Gregory J. Bielby
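The parallel-engine flow above can be sketched as score combination: each engine returns a hypothesis with a confidence score, context information adjusts the scores, and a sub-threshold best result signals a fallback (the patent's new dynamic grammar). The boost and threshold values are assumptions.

```python
# Hedged sketch: merge parallel engine results under a context boost.
def pick_result(engine_results, context_words, boost=0.1, threshold=0.6):
    """engine_results: list of (phrase, confidence). Returns phrase or None."""
    best_phrase, best_score = None, -1.0
    for phrase, score in engine_results:
        if any(w in phrase.split() for w in context_words):
            score = min(1.0, score + boost)   # context adds weight
        if score > best_score:
            best_phrase, best_score = phrase, score
    if best_score < threshold:
        return None   # caller would build a new dynamic grammar here
    return best_phrase
```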
  • Publication number: 20110151782
    Abstract: The present invention relates to a system and a method for communication with a hands-free profile. The method includes the steps of: providing illumination devices on which microphones and speakers are respectively disposed; providing a detection mechanism for detecting the position of the specific illumination device corresponding to the position of a user; and, when a communication device receives an incoming telegram signal: sounding a preset ring through the speakers; and transmitting the sound from the communication device to the speaker of the specific illumination device and the sound received by the microphone of the specific illumination device back to the communication device, so that the communication is carried out hands-free.
    Type: Application
    Filed: August 18, 2010
    Publication date: June 23, 2011
    Inventor: Chun-I SUN
  • Publication number: 20110153323
    Abstract: A method and system are provided that control an external output function of a mobile device according to control interactions received via the microphone. The method includes activating a microphone according to preset optional information when the mobile device enters an external output mode, performing an external output operation in the external output mode, detecting an interaction based on sound information in the external output mode, and controlling the external output according to the interaction.
    Type: Application
    Filed: December 10, 2010
    Publication date: June 23, 2011
    Applicant: SAMSUNG ELECTRONICS CO. LTD.
    Inventors: Hee Woon KIM, Si Hak JANG
  • Publication number: 20110153328
    Abstract: Provided is an obscene content analysis apparatus and method. The obscene content analysis apparatus includes a content input unit that receives content, an input data buffering unit that buffers the received content, wherein buffering is performed on content corresponding to a length of a previously set analysis section or a length longer than the analysis section, an obscenity analysis determining unit that determines whether or not the analysis section of audio data extracted from the buffered content is obscene by using a previously generated audio-based obscenity determining model and marks the analysis section with an obscenity mark when the analysis section is determined as obscene, a reproduction data buffering unit that accumulates and stores content in which obscenity has been determined by the obscenity analysis determining unit, and a content reproducing unit that reproduces the content while blocking the analysis section marked with the obscenity mark.
    Type: Application
    Filed: November 17, 2010
    Publication date: June 23, 2011
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Jae Deok LIM, Seung Wan HAN, Byeong Cheol CHOI, Byung Ho CHUNG
  • Publication number: 20110153321
    Abstract: Systems and methods for detecting features in spoken speech and processing speech sounds based on the features are provided. One or more features may be identified in a speech sound. The speech sound may be modified to enhance or reduce the degree to which the feature affects the sound ultimately heard by a listener. Systems and methods according to embodiments of the invention may allow for automatic speech recognition devices that enhance detection and recognition of spoken sounds, such as by a user of a hearing aid or other device.
    Type: Application
    Filed: July 2, 2009
    Publication date: June 23, 2011
    Applicant: THE BOARD OF TRUSTEES OF THE UNIVERSITY OF ILLINOIS
    Inventors: Jont B. Allen, Feipeng LI
  • Publication number: 20110153326
    Abstract: A system and method for extracting acoustic features and speech activity on a device and transmitting them in a distributed voice recognition system. The distributed voice recognition system includes a local VR engine in a subscriber unit and a server VR engine on a server. The local VR engine comprises a feature extraction (FE) module that extracts features from a speech signal, and a voice activity detection module (VAD) that detects voice activity within a speech signal. The system includes filters, framing and windowing modules, power spectrum analyzers, a neural network, a nonlinear element, and other components to selectively provide an advanced front end vector including predetermined portions of the voice activity detection indication and extracted features from the subscriber unit to the server. The system also includes a module to generate additional feature vectors on the server from the received features using a feed-forward multilayer perceptron (MLP) and providing the same to the speech server.
    Type: Application
    Filed: February 9, 2011
    Publication date: June 23, 2011
    Applicant: QUALCOMM INCORPORATED
    Inventors: HARINATH GARUDADRI, HYNEK HERMANSKY, LUKAS BURGET, PRATIBHA JAIN, SACHIN KAJAREKAR, SUNIL SIVADAS, STEPHANE N. DUPONT, MARIA CARMEN BENITEZ ORTUZAR, NELSON H. MORGAN
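The front end described above can be sketched in miniature: frame the signal, compute a per-frame feature (log energy here, standing in for the patent's richer feature set), and flag voice activity so only useful frames are sent to the server. The frame length and VAD threshold are assumptions.

```python
# Hedged sketch: framing plus an energy-based voice activity flag.
import math

def frame_features(samples, frame_len=160, vad_threshold=1e-4):
    """Return per-frame feature dicts; 'voiced' frames would go to the server."""
    frames = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        frames.append({"log_energy": math.log(energy + 1e-12),
                       "voiced": energy > vad_threshold})
    return frames
```

The patent's actual front end uses filters, power spectrum analysis, and a neural network; this captures only the frame-then-gate structure.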
  • Publication number: 20110153322
    Abstract: A dialog management apparatus and method for processing an information-seeking dialogue with a user and providing a service to the user by prompting the user for a task-oriented dialogue may be provided. A hierarchical topic plan in which pieces of information are organized in a hierarchy according to topics corresponding to services may be used to prompt the user to change an information-seeking dialogue to a task-oriented dialogue, and the user may be provided with a service.
    Type: Application
    Filed: October 26, 2010
    Publication date: June 23, 2011
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Byung-Kwan KWAK, Jeong-Mi Cho
  • Publication number: 20110144978
    Abstract: A system and method for providing vocabulary information includes one or more computer processors that, for each of a plurality of words of a text, determine a relevance of the word to the text, and, for each of at least a subset of the plurality of words, output an indication of the respective determined relevance of the word to the text, where, for each of the plurality of words, the determination includes comparing a frequency of the word in the text to a frequency threshold.
    Type: Application
    Filed: December 15, 2009
    Publication date: June 16, 2011
    Inventor: Marc TINKLER
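The core step of this abstract, comparing each word's frequency in the text to a threshold, is simple enough to sketch directly. The threshold policy and labels are assumptions; the patent leaves the comparison rule unspecified.

```python
# Hedged sketch: word relevance from frequency vs. a threshold.
from collections import Counter

def word_relevance(text, threshold=2):
    """Map each word to a relevance label based on its in-text frequency."""
    counts = Counter(text.lower().split())
    return {word: ("relevant" if count >= threshold else "incidental")
            for word, count in counts.items()}
```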
  • Publication number: 20110144995
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for performing a search. A system configured to practice the method first receives from an automatic speech recognition (ASR) system a word lattice based on a speech query and receives indexed documents from an information repository. The system composes, based on the word lattice and the indexed documents, at least one triple including a query word, a selected indexed document, and a weight. The system generates an N-best path through the word lattice based on the at least one triple and re-ranks the ASR output based on the N-best path. The system aggregates each weight across the query words to generate N-best listings and returns search results to the speech query based on the re-ranked ASR output and the N-best listings. The lattice can be a confusion network, the arc density of which can be adjusted for a desired performance level.
    Type: Application
    Filed: December 15, 2009
    Publication date: June 16, 2011
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Srinivas BANGALORE, Taniya MISHRA
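The weight-aggregation step in this abstract can be sketched as summing per-(word, document) weights across query words to rank documents. The triple layout and scoring details here are assumptions; the patent additionally re-ranks the lattice paths themselves.

```python
# Hedged sketch: aggregate (query_word, doc, weight) triples into a ranking.
from collections import defaultdict

def rank_documents(triples):
    """triples: iterable of (query_word, doc_id, weight); returns docs by score."""
    scores = defaultdict(float)
    for _word, doc, weight in triples:
        scores[doc] += weight       # aggregate each weight across query words
    return sorted(scores, key=scores.get, reverse=True)
```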