Speech Recognition (epo) Patents (Class 704/E15.001)

  • Publication number: 20130144627
    Abstract: A control circuit employed in an electronic device includes a microphone, a level conversion circuit, and a voice processing circuit. The voice processing circuit includes a voice operated switch connected between the microphone and the level conversion circuit. The microphone picks up voice commands; the voice operated switch receives the voice commands from the microphone and outputs a high voltage signal when the volume of the voice commands is greater than or equal to a predetermined volume threshold or is within a predetermined volume range; and the level conversion circuit converts the high voltage signal into a low voltage signal for turning on the electronic device.
    Type: Application
    Filed: March 9, 2012
    Publication date: June 6, 2013
    Applicants: HON HAI PRECISION INDUSTRY CO., LTD., HONG FU JIN PRECISION INDUSTRY (ShenZhen) CO., LTD.
    Inventor: JIE LI
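A minimal sketch of the thresholding logic this entry describes, assuming digitized, normalized microphone samples; the names (`volume_rms`, `VOLUME_THRESHOLD`, `VOLUME_RANGE`) and the threshold values are illustrative, not from the patent:

```python
import math

VOLUME_THRESHOLD = 0.2     # assumed normalized RMS threshold
VOLUME_RANGE = (0.2, 0.9)  # assumed acceptable volume range

def volume_rms(samples):
    """Root-mean-square volume of a non-empty block of normalized samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def switch_output(samples):
    """Emit a 'high' signal when volume meets the threshold or falls in range."""
    vol = volume_rms(samples)
    in_range = VOLUME_RANGE[0] <= vol <= VOLUME_RANGE[1]
    return "high" if (vol >= VOLUME_THRESHOLD or in_range) else "low"
```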
  • Publication number: 20130144623
    Abstract: Techniques for ability enhancement are described. Some embodiments provide an ability enhancement facilitator system (“AEFS”) configured to determine and present speaker-related information based on speaker utterances. In one embodiment, the AEFS receives data that represents an utterance of a speaker received by a hearing device of the user, such as a hearing aid, smart phone, media player/device, or the like. The AEFS identifies the speaker based on the received data, such as by performing speaker recognition. The AEFS determines speaker-related information associated with the identified speaker, such as by determining an identifier (e.g., name or title) of the speaker, by locating an information item (e.g., an email message, document) associated with the speaker, or the like. The AEFS then informs the user of the speaker-related information, such as by presenting the speaker-related information on a display of the hearing device or some other device accessible to the user.
    Type: Application
    Filed: December 13, 2011
    Publication date: June 6, 2013
    Inventors: Richard T. Lord, Robert W. Lord, Nathan P. Myhrvold, Clarence T. Tegreene, Roderick A. Hyde, Lowell L. Wood, JR., Muriel Y. Ishikawa, Victoria Y.H. Wood, Charles Whitmer, Paramvir Bahl, Douglas C. Burger, Ranveer Chandra, William H. Gates, III, Paul Holman, Jordin T. Kare, Craig J. Mundie, Tim Paek, Desney S. Tan, Lin Zhong, Matthew G. Dyor
  • Publication number: 20130138439
    Abstract: An interactive user interface is described for setting confidence score thresholds in a language processing system. There is a display of a first system confidence score curve characterizing system recognition performance associated with a high confidence threshold, a first user control for adjusting the high confidence threshold and an associated visual display highlighting a point on the first system confidence score curve representing the selected high confidence threshold, a display of a second system confidence score curve characterizing system recognition performance associated with a low confidence threshold, and a second user control for adjusting the low confidence threshold and an associated visual display highlighting a point on the second system confidence score curve representing the selected low confidence threshold. The operation of the second user control is constrained to require that the low confidence threshold must be less than or equal to the high confidence threshold.
    Type: Application
    Filed: November 29, 2011
    Publication date: May 30, 2013
    Applicant: Nuance Communications, Inc.
    Inventors: Jeffrey N. Marcus, Amy E. Ulug, William Bridges Smith, JR.
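A sketch of the constraint described in the abstract above, where adjusting either control may never leave the low threshold above the high one; the class shape and clamping policy are assumptions:

```python
class ConfidenceThresholds:
    """Paired recognition thresholds with the invariant low <= high."""

    def __init__(self, low=0.3, high=0.7):
        assert low <= high
        self.low, self.high = low, high

    def set_high(self, value):
        self.high = value
        self.low = min(self.low, value)   # keep the invariant after moving high

    def set_low(self, value):
        self.low = min(value, self.high)  # the second control is constrained, not free
```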
  • Publication number: 20130132082
    Abstract: Methods and systems for recognition of concurrent, superimposed, or otherwise overlapping signals are described. A Markov Selection Model is introduced that, together with probabilistic decomposition methods, enables recognition of simultaneously emitted signals from various sources. For example, a signal mixture may include overlapping speech from different persons. In some instances, recognition may be performed without the need to separate signals or sources. As such, some of the techniques described herein may be useful in automatic transcription, noise reduction, teaching, electronic games, audio search and retrieval, medical and scientific applications, etc.
    Type: Application
    Filed: February 21, 2011
    Publication date: May 23, 2013
    Inventor: Paris Smaragdis
  • Publication number: 20130124193
    Abstract: One embodiment includes a computer implemented method of processing documents. The method includes generating a text analysis task object that includes instructions regarding a document processing pipeline and a document identifier. The method further includes accessing, by a worker system, the text analysis task object and generating the document processing pipeline according to the instructions. The method further includes performing text analysis using the document processing pipeline on a document identified by the document identifier.
    Type: Application
    Filed: November 15, 2011
    Publication date: May 16, 2013
    Applicant: BUSINESS OBJECTS SOFTWARE LIMITED
    Inventor: Greg Holmberg
  • Publication number: 20130110511
    Abstract: A method for customized voice communication comprises receiving a speech signal, retrieving a user account including a user profile corresponding to an identifier of the caller producing the speech signal, and determining whether the user profile includes a speech profile with at least one dialect. If the user profile includes a speech profile, the method further comprises analyzing the speech signal with a speech analyzer to classify it into a classified dialect, comparing the classified dialect with each of the dialects in the user profile to select one of them, and using the selected dialect for subsequent voice communication with the user. The selected dialect can be used for subsequent recognition and response speech synthesis. Moreover, a method is described for storing a user's own pronunciation of names and addresses, whereby a user may be greeted by the communication device using their own specific pronunciation.
    Type: Application
    Filed: October 31, 2011
    Publication date: May 2, 2013
    Applicant: TELCORDIA TECHNOLOGIES, INC.
    Inventors: Murray Spiegel, John R. Wullert, II
  • Publication number: 20130103402
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for combining frame and segment level processing, via temporal pooling, for phonetic classification. A frame processor unit receives an input and extracts the time-dependent features from the input. A plurality of pooling interface units generates a plurality of feature vectors based on pooling the time-dependent features and selecting a plurality of time-dependent features according to a plurality of selection strategies. Next, a plurality of segmental classification units generates scores for the feature vectors. Each segmental classification unit (SCU) can be dedicated to a specific pooling interface unit (PIU) to form a PIU-SCU combination. Multiple PIU-SCU combinations can be further combined to form an ensemble of combinations, and the ensemble can be diversified by varying the pooling operations used by the PIU-SCU combinations.
    Type: Application
    Filed: October 25, 2011
    Publication date: April 25, 2013
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Sumit CHOPRA, Dimitrios Dimitriadis, Patrick Haffner
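A toy sketch of the temporal-pooling idea in the entry above: frame-level features are pooled several ways, and each pooled vector is scored by its own classifier, one pooling/classifier pair per PIU-SCU combination; the pooling strategies and scorer interface here are assumed for illustration:

```python
import numpy as np

POOLING = {
    "mean": lambda frames: frames.mean(axis=0),
    "max":  lambda frames: frames.max(axis=0),
    "std":  lambda frames: frames.std(axis=0),
}

def ensemble_scores(frames, classifiers):
    """frames: (time, features) array; classifiers: one scorer per pooling strategy.

    Varying the pooling operation across pairs diversifies the ensemble."""
    return {name: clf(POOLING[name](frames)) for name, clf in classifiers.items()}

# usage with dummy linear scorers
frames = np.random.rand(50, 13)  # 50 frames of 13-dim features
scorers = {name: (lambda v: float(v.sum())) for name in POOLING}
print(ensemble_scores(frames, scorers))
```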
  • Publication number: 20130090930
    Abstract: Various embodiments provide techniques for implementing speech recognition for context switching. In at least some embodiments, the techniques can enable a user to switch between different contexts and/or user interfaces of an application via speech commands. In at least some embodiments, a context menu is provided that lists available contexts for an application that may be navigated to via speech commands. In implementations, the contexts presented in the context menu are a subset of a larger set of contexts, filtered based on a variety of context filtering criteria. A user can speak one of the contexts presented in the context menu to cause navigation to a user interface associated with that context.
    Type: Application
    Filed: October 10, 2011
    Publication date: April 11, 2013
    Inventors: Matthew J. Monson, William P. Giese, Daniel J. Greenawalt
  • Publication number: 20130080159
    Abstract: This disclosure relates to systems and methods for proactively determining identification information for a plurality of audio segments within a plurality of broadcast media streams, and providing identification information associated with specific audio portions of a broadcast media stream automatically or upon request.
    Type: Application
    Filed: September 27, 2011
    Publication date: March 28, 2013
    Applicant: GOOGLE INC.
    Inventors: Matthew Sharifi, Ant Oztaskent, Yaroslav Volovich
  • Publication number: 20130080165
    Abstract: Online histogram equalization may be provided. Upon receiving a spoken phrase from a user, a histogram/frequency distribution may be estimated for the spoken phrase according to a prior distribution. The histogram distribution may be equalized and then provided to a spoken language understanding application.
    Type: Application
    Filed: September 24, 2011
    Publication date: March 28, 2013
    Applicant: Microsoft Corporation
    Inventors: Shizen Wang, Yifan Gong
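A compact sketch of one standard reading of the abstract above: equalize the utterance's feature histogram to a prior distribution by quantile matching; the Gaussian prior and function names are assumptions, not from the patent:

```python
import numpy as np

def histogram_equalize(features, prior_samples):
    """Map each feature to the prior's value at the same quantile rank."""
    ranks = np.argsort(np.argsort(features))      # rank within the utterance
    quantiles = (ranks + 0.5) / len(features)
    return np.quantile(prior_samples, quantiles)

# usage: pull shifted/scaled noisy features toward a clean (prior) distribution
noisy = np.random.rand(200) * 3.0 + 1.0
prior = np.random.randn(10_000)                   # assumed Gaussian prior
equalized = histogram_equalize(noisy, prior)
```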
  • Publication number: 20130080169
    Abstract: An audio analysis system includes a terminal apparatus and a host system. The terminal apparatus acquires an audio signal of a sound containing utterances of a user and another person, discriminates between portions of the audio signal corresponding to the utterances of the user and the other person, detects an utterance feature based on the portion corresponding to the utterance of the user or the other person, and transmits utterance information including the discrimination and detection results to the host system. The host system detects a part corresponding to a conversation from the received utterance information, detects portions of the part of the utterance information corresponding to the user and the other person, compares a combination of plural utterance features corresponding to the portions of the part of the utterance information of the user and the other person with relation information to estimate an emotion, and outputs estimation information.
    Type: Application
    Filed: February 10, 2012
    Publication date: March 28, 2013
    Applicant: FUJI XEROX Co., Ltd.
    Inventors: Haruo HARADA, Hirohito YONEYAMA, Kei SHIMOTANI, Yohei NISHINO, Kiyoshi IIDA, Takao NAITO
  • Publication number: 20130080171
    Abstract: In one embodiment, a method receives an acoustic input signal at a speech recognizer configured to recognize the acoustic input signal in an always-on mode. A set of responses based on the recognized acoustic input signal is determined and ranked based on criteria. A computing device determines whether a response should be output based on its ranking. The method selects an output method from a plurality of output methods based on the ranking of the response and outputs the response using that method if it is determined that the response should be output.
    Type: Application
    Filed: September 27, 2011
    Publication date: March 28, 2013
    Applicant: SENSORY, INCORPORATED
    Inventors: Todd F. Mozer, Pieter J. Vermeulen
  • Publication number: 20130080161
    Abstract: According to one embodiment, a speech recognition apparatus includes the following units. The service estimation unit estimates a service being performed by a user, using non-speech information, and generates service information. The speech recognition unit performs speech recognition on speech information in accordance with a speech recognition technique corresponding to the service information. The feature quantity extraction unit extracts a feature quantity related to the service of the user from the speech recognition result. The service estimation unit re-estimates the service by using the feature quantity, and the speech recognition unit performs speech recognition based on the re-estimation result.
    Type: Application
    Filed: September 27, 2012
    Publication date: March 28, 2013
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kenji IWATA, Kentaro TORRI, Naoshi UCHIHIRA, Tetsuro CHINO
  • Publication number: 20130066629
    Abstract: The present invention relates to means and methods of classifying speech and music signals in voice communication systems, devices, telephones, and methods, and more specifically, to systems, devices, and methods that automate control when either speech or music is detected over communication links. The present invention provides a novel system and method for monitoring the audio signal, analyzing selected audio signal components, comparing the results of the analysis with a predetermined threshold value, and classifying the audio signal as either speech or music.
    Type: Application
    Filed: November 12, 2012
    Publication date: March 14, 2013
    Inventor: Alon Konchitsky
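A minimal sketch of threshold-based speech/music classification in the spirit of the abstract above; the abstract does not name the analyzed components, so the energy-modulation feature and the threshold value are assumptions:

```python
import numpy as np

def classify_audio(frame_energies, threshold=0.6):
    """Speech typically shows stronger frame-to-frame energy modulation than
    music (assumed heuristic); compare the modulation ratio to a threshold."""
    modulation = np.std(frame_energies) / (np.mean(frame_energies) + 1e-9)
    return "speech" if modulation > threshold else "music"
```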
  • Publication number: 20130066637
    Abstract: Information processor 1 includes display unit 30 for displaying an interface screen having function execution key unit 23 indicating a prescribed function for each function type, and interface screen change key unit 22 for switching each function type; interface screen control unit 20 for controlling display switching of the screen on the display unit 30 in response to an input operation signal; interface screen operation history recording unit 110 for recording, as continuous operation information, operation time and operation contents of the function execution key unit 23 or interface screen change key unit 22 in response to the input operation signal; likelihood value providing unit 120 for calculating and adding, to each function the function execution key unit 23 indicates, a likelihood value indicating a degree of a user desire from the continuous operation information recorded; priority recognition word setting unit 130 for outputting word information corresponding to the function whose likelihood value…
    Type: Application
    Filed: August 9, 2010
    Publication date: March 14, 2013
    Applicant: Mitsubishi Electric Corporation
    Inventors: Yusuke Seto, Tadashi Suzuki, Ryo Iwamiya
  • Publication number: 20130060571
    Abstract: A system for integrating local speech recognition with cloud-based speech recognition in order to provide an efficient natural user interface is described. In some embodiments, a computing device determines a direction associated with a particular person within an environment and generates an audio recording associated with the direction. The computing device then performs local speech recognition on the audio recording in order to detect a first utterance spoken by the particular person and to detect one or more keywords within the first utterance. The first utterance may be detected by applying voice activity detection techniques to the audio recording. The first utterance and the one or more keywords are subsequently transferred to a server which may identify speech sounds within the first utterance associated with the one or more keywords and adapt one or more speech recognition techniques based on the identified speech sounds.
    Type: Application
    Filed: September 2, 2011
    Publication date: March 7, 2013
    Applicant: Microsoft Corporation
    Inventors: Thomas M. Soemo, Leo Soong, Michael H. Kim, Chad R. Heinemann, Dax H. Hawkins
  • Publication number: 20130060566
    Abstract: This invention realizes a speech communication system and method, and a robot apparatus, capable of significantly improving entertainment value. A speech communication system with a function to converse with a conversation partner is provided with a speech recognition means for recognizing the speech of the conversation partner, a conversation control means for controlling the conversation with the partner based on the recognition result of the speech recognition means, an image recognition means for recognizing the face of the conversation partner, and a tracking control means for tracking the conversation partner based on one or both of the recognition results of the image recognition means and the speech recognition means. The conversation control means controls the conversation so that it continues in accordance with the tracking by the tracking control means.
    Type: Application
    Filed: November 2, 2012
    Publication date: March 7, 2013
    Inventors: Kazumi AOYAMA, Hideki Shimomura
  • Publication number: 20130060570
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for advanced turn-taking in an interactive spoken dialog system. A system configured according to this disclosure can incrementally process speech prior to completion of the speech utterance, and can communicate partial speech recognition results upon finding particular conditions. A first condition which, if found, allows the system to communicate partial speech recognition results, is that the most recent word found in the partial results is statistically likely to be the termination of the utterance, also known as a terminal node. A second condition is the determination that all search paths within a speech lattice converge to a common node, also known as a pinch node, before branching out again. Upon finding either condition, the system can communicate the partial speech recognition results. Stability and correctness probabilities can also determine which partial results are communicated.
    Type: Application
    Filed: September 1, 2011
    Publication date: March 7, 2013
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Jason WILLIAMS, Ethan Selfridge
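A sketch of the second condition in the abstract above (pinch nodes): treating the lattice as a directed graph, a node through which every start-to-end path must pass is exactly a node whose removal disconnects start from end; the graph encoding is assumed for illustration:

```python
from collections import defaultdict, deque

def pinch_nodes(edges, start, end):
    """Nodes every start->end lattice path must pass through."""
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)

    def end_reachable_without(blocked):
        # BFS from start, skipping the blocked node
        seen, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt != blocked and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return end in seen

    nodes = {n for e in edges for n in e} - {start, end}
    return [n for n in nodes if not end_reachable_without(n)]

# usage: two hypotheses converge at 'p' before branching again
edges = [("s", "a"), ("s", "b"), ("a", "p"), ("b", "p"),
         ("p", "c"), ("p", "d"), ("c", "e"), ("d", "e")]
print(pinch_nodes(edges, "s", "e"))  # -> ['p']
```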
  • Publication number: 20130060569
    Abstract: A voice authentication system using a removable voice ID card comprises: at server side, a voiceprint database for storing the voiceprints of all authorized users; a voiceprint updating means for updating the voiceprints in said voiceprint database; and a voiceprint digest generator for generating a voiceprint digest according to a request from a client; at client side, a voice ID card for storing the voiceprint of an authorized user; a validation means for validating the voiceprint in the voice ID card on the basis of the voiceprint digest from the server; an audio device for performing voice interaction with a user; and a voice authentication means for determining whether the voiceprint from said voice ID card is of the same speaker as the voice from said audio device.
    Type: Application
    Filed: March 2, 2012
    Publication date: March 7, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Guo Kang Fu, Feng F. Shi, Zhi Jun Wang, Yu Chen Zhou
  • Publication number: 20130054235
    Abstract: Embodiments of the present invention improve content manipulation systems and methods using speech recognition. In one embodiment, the present invention includes a method comprising configuring a recognizer to recognize utterances in the presence of a background audio signal having particular audio characteristics. A composite signal comprising a first audio signal and a spoken utterance of a user is received by the recognizer, where the first audio signal comprises the particular audio characteristics used to configure the recognizer so that the recognizer is desensitized to the first audio signal. The spoken utterance is recognized in the presence of the first audio signal when the spoken utterance is one of the predetermined utterances. An operation is performed on the first audio signal.
    Type: Application
    Filed: August 24, 2011
    Publication date: February 28, 2013
    Applicant: Sensory, Incorporated
    Inventors: Todd F. Mozer, Jeff Rogers, Pieter J. Vermeulen, Jonathan Shaw
  • Publication number: 20130054243
    Abstract: Provided are an electronic device and a control method that attain a simple interface for using voice recognition. A cellular phone (1) is provided with a voice recognition unit (30), an execution unit (40) that executes a prescribed application, and an OS (50) that controls the voice recognition unit (30) and the execution unit (40). Upon receiving an instruction from the OS (50) to start up the prescribed application, the execution unit (40) assesses whether the start-up instruction was based on a result of voice recognition conducted by the voice recognition unit (30), and selects the content to be processed according to the result of this assessment.
    Type: Application
    Filed: September 28, 2010
    Publication date: February 28, 2013
    Applicant: KYOCERA Corporation
    Inventor: Hajime Ichikawa
  • Publication number: 20130054242
    Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.
    Type: Application
    Filed: August 24, 2011
    Publication date: February 28, 2013
    Applicant: SENSORY, INCORPORATED
    Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
  • Publication number: 20130041665
    Abstract: There are disclosed an electronic device and a method of controlling the electronic device. The electronic device according to an aspect of the present invention includes a display unit, a voice input unit, and a control unit configured to output a plurality of contents through the electronic device, receive a voice command through the voice input unit for performing a command, determine which of the plurality of contents correspond to the received voice command, and perform the command on one or more of the plurality of contents that correspond to the received voice command. According to the present invention, multi-tasking performed in an electronic device can be efficiently controlled through a voice command.
    Type: Application
    Filed: September 23, 2011
    Publication date: February 14, 2013
    Inventors: Seokbok Jang, Jongse Park, Joonyup Lee, Jungkyo Choi
  • Publication number: 20130030802
    Abstract: Techniques for maintaining and supplying a plurality of speech models are provided. A plurality of speech models and metadata for each speech model are stored. A query for a speech model is received from a source. The query includes one or more conditions. The speech model whose metadata most closely matches the supplied conditions is determined and provided to the source. A refined speech model is then received from the source and stored.
    Type: Application
    Filed: July 9, 2012
    Publication date: January 31, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bin Jia, Ying Liu, E. Feng Lu, Jia Wu, Zhen Zhang
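A toy sketch of serving the query described above by picking the stored model whose metadata most closely matches the supplied conditions, scoring by the number of agreeing key/value pairs; the metadata schema is invented for illustration:

```python
def best_model(models, conditions):
    """models: list of (model_id, metadata dict); pick the closest metadata match."""
    def match_score(metadata):
        return sum(1 for k, v in conditions.items() if metadata.get(k) == v)
    return max(models, key=lambda m: match_score(m[1]))

# usage with hypothetical metadata
models = [
    ("model_a", {"language": "en-US", "domain": "automotive", "noise": "high"}),
    ("model_b", {"language": "en-US", "domain": "dictation", "noise": "low"}),
]
print(best_model(models, {"language": "en-US", "noise": "low"}))  # -> model_b
```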
  • Publication number: 20130030808
    Abstract: Systems and methods are provided for scoring non-native speech. Two or more speech samples are received, where each sample is of speech spoken by a non-native speaker in response to a distinct prompt. The two or more samples are concatenated to generate a concatenated response for the non-native speaker, based on the speech samples elicited using the distinct prompts. A concatenated speech proficiency metric is computed from the concatenated response and provided to a scoring model, which generates a speaking score based on the concatenated speech proficiency metric.
    Type: Application
    Filed: July 24, 2012
    Publication date: January 31, 2013
    Inventors: Klaus Zechner, Su-Youn Yoon, Lei Chen, Shasha Xie, Xiaoming Xi, Chaitanya Ramineni
  • Publication number: 20130030803
    Abstract: A microphone-array-based speech recognition system combines a noise-cancelling technique for cancelling noise in input speech signals from an array of microphones according to at least one input threshold. The system receives noise-cancelled speech signals outputted by a noise masking module through at least one speech model and at least one filler model, then computes a confidence measure score with the speech and filler models for each threshold and each noise-cancelled speech signal, and adjusts the threshold to continue the noise cancelling so as to achieve a maximum confidence measure score, thereby outputting the speech recognition result associated with the maximum confidence measure score.
    Type: Application
    Filed: October 12, 2011
    Publication date: January 31, 2013
    Applicant: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE
    Inventor: Hsien-Cheng Liao
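A sketch of the outer loop the abstract above implies: sweep the noise-cancelling threshold and keep the setting that maximizes the confidence measure score; the `cancel_noise` and `confidence` callables are placeholders standing in for the noise masking module and the speech/filler model scoring:

```python
def best_threshold(signal, cancel_noise, confidence, thresholds):
    """Return the (threshold, score) pair with the highest confidence.

    cancel_noise(signal, t) -> cleaned signal; confidence(cleaned) -> score."""
    scored = ((t, confidence(cancel_noise(signal, t))) for t in thresholds)
    return max(scored, key=lambda pair: pair[1])
```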
  • Publication number: 20130021362
    Abstract: An apparatus includes an input unit, a microphone, a control unit, and a voice recognition unit. The input unit is configured to receive a first type input and a second type input. The microphone is configured to receive an input sound signal. The control unit is configured to control a display to display feedback according to a type of input. The voice recognition unit is configured to perform recognition processing on the input sound signal.
    Type: Application
    Filed: July 10, 2012
    Publication date: January 24, 2013
    Applicant: SONY CORPORATION
    Inventors: Akiko SAKURADA, Osamu Shigeta, Nariaki Sato, Yasuyuki Koga, Kazuyuki Yamamoto
  • Publication number: 20130024197
    Abstract: An electronic device and a method for controlling an electronic device are disclosed. The electronic device includes: a display unit; a voice input unit; and a controller displaying a plurality of contents on the display unit, receiving a voice command for controlling any one of the plurality of contents through the voice input unit, and controlling content corresponding to the received voice command. Multitasking performed by the electronic device can be effectively controlled through a voice command.
    Type: Application
    Filed: October 13, 2011
    Publication date: January 24, 2013
    Applicant: LG ELECTRONICS INC.
    Inventors: Seokbok JANG, Jongse PARK, Joonyup LEE, Jungkyu CHOI
  • Publication number: 20130013308
    Abstract: An approach is provided for determining a user age range. An age estimator causes, at least in part, acquisition of voice data. Next, the age estimator calculates a first set of probability values, wherein each of the probability values represents a probability that the voice data is in a respective one of a plurality of predefined age ranges, and the predefined age ranges are segments of a lifespan. Then, the age estimator derives a second set of probability values by applying a correlation matrix to the first set of probability values, wherein the correlation matrix associates the first set of probability values with probabilities of the voice data matching individual ages over the lifespan. Then, the age estimator, for each of the predefined age ranges, calculates a sum of the probabilities in the second set of probability values corresponding to the individual ages within the respective predefined age ranges.
    Type: Application
    Filed: March 23, 2010
    Publication date: January 10, 2013
    Applicant: NOKIA CORPORATION
    Inventors: Yang Cao, Feng Ding, Jilei Tian
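A worked sketch of the two-stage calculation in the abstract above, using a hypothetical 100-year lifespan, four example age ranges, and a correlation matrix that simply spreads each range's probability uniformly over its ages (the patent's matrix would be learned, not uniform):

```python
import numpy as np

# hypothetical lifespan of 100 years split into four predefined ranges
ranges = [(0, 18), (18, 40), (40, 65), (65, 100)]
p_range = np.array([0.10, 0.45, 0.35, 0.10])  # first set: per-range probabilities

# assumed correlation matrix: each range's mass spread uniformly over its ages
C = np.zeros((100, len(ranges)))
for j, (lo, hi) in enumerate(ranges):
    C[lo:hi, j] = 1.0 / (hi - lo)

p_age = C @ p_range                            # second set: per-individual-age values
sums = [p_age[lo:hi].sum() for lo, hi in ranges]
print(ranges[int(np.argmax(sums))])            # most probable predefined range
```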
  • Publication number: 20130013289
    Abstract: Provided are a method of extracting an experience-revealing sentence from a blog document and a method of classifying verbs into activity verbs and state verbs in a sentence recorded in a blog document. The method of extracting an experience sentence from a blog document includes generating a sentence classifier using a machine learning algorithm based on grammatical features, and classifying experience sentences that represent actual experiences of users and non-experience sentences that represent no experience in the blog document using the sentence classifier. By classifying sentences in a blog document into experience sentences and non-experience sentences, it is possible to extract experiences that a user has actually had or that have actually happened to a user from the document.
    Type: Application
    Filed: July 7, 2011
    Publication date: January 10, 2013
    Applicant: KOREA ADVANCED INSTITUTE OF SCIENCE AND TECHNOLOGY
    Inventors: Sung Hyon Myaeng, Keun Chan Park, Yoon Jae Jeong
  • Publication number: 20130013297
    Abstract: A message service method using speech recognition includes a message server recognizing speech transmitted from a transmission terminal and generating and transmitting a recognition result of the speech and N-best results based on a confusion network to the transmission terminal; when a message is selected through the recognition result and the N-best results and an evaluation result reflecting the accuracy of the message is decided, the transmission terminal transmitting the message and the evaluation result to a reception terminal; and the reception terminal displaying the message and the evaluation result.
    Type: Application
    Filed: July 5, 2012
    Publication date: January 10, 2013
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Hwa Jeon SONG, YunKeun Lee, Jeon Gue Park, Jong Jin Kim, Ki-Young Park, Hoon Chung, Hyung-Bae Jeon, Ho Young Jung, Euisok Chung, Jeom Ja Kang, Byung Ok Kang, Sang Kyu Park, Sung Joo Lee, Yoo Rhee Oh
  • Publication number: 20130013319
    Abstract: Methods and apparatus for initiating an action using a voice-controlled human interface. The interface provides a hands-free, voice-driven environment to control processes and applications. According to one embodiment, a method comprises electronically receiving first user input, parsing the first user input to determine whether it contains a command activation statement that cues a voice-controlled human interface to enter a command mode in which a second user input comprising a voice signal is processed to identify at least one executable command and, in response to determining that the first user input comprises the command activation statement, identifying the at least one executable command in the second user input.
    Type: Application
    Filed: September 14, 2012
    Publication date: January 10, 2013
    Applicant: Nuance Communications, Inc.
    Inventors: Richard Grant, Pedro E. McGregor
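A minimal sketch of the command-activation flow described above: the first input is parsed for an activation statement, and only then is the second input searched for an executable command; the wake phrase and command set are examples, not from the patent:

```python
ACTIVATION = "computer listen"  # example activation statement

def handle(first_input, second_input, commands):
    """Enter command mode only if the first input contains the activation statement."""
    if ACTIVATION not in first_input.lower():
        return None  # stay in passive mode
    spoken = second_input.lower()
    return next((cmd for cmd in commands if cmd in spoken), None)

print(handle("hey computer listen", "please open mail",
             {"open mail", "close window"}))  # -> 'open mail'
```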
  • Patent number: 8352273
    Abstract: There is provided a device for performing interaction between a user and a machine. The device includes a plurality of domains corresponding to a plurality of stages in the interaction. Each of the domains has voice comprehension means which understands the content of the user's voice. The device includes: means for recognizing the user's voice; means for selecting the domain enabling the best voice comprehension results as the current domain; means for referencing task knowledge of the domain and extracting a task correlated to the voice comprehension result; means for obtaining a subtask sequence correlated to the extracted task; means for setting the first subtask of the subtask sequence as the current subtask and updating the domain to which the subtask belongs as the current domain; means for extracting a behavior or subtask end flag correlated to the voice comprehension result and the subtask; and means for causing the machine to execute the extracted behavior.
    Type: Grant
    Filed: July 26, 2006
    Date of Patent: January 8, 2013
    Assignee: Honda Motor Co., Ltd.
    Inventor: Mikio Nakano
  • Publication number: 20130006621
    Abstract: Methods, apparatus, and computer program products for providing a context-based grammar for automatic speech recognition, including creating by a multimodal application a context, the context comprising words associated with user activity in the multimodal application, and supplementing by the multimodal application a grammar for automatic speech recognition in dependence upon the context.
    Type: Application
    Filed: September 13, 2012
    Publication date: January 3, 2013
    Applicant: Nuance Communications, Inc.
    Inventors: Charles W. Cross, JR., Frank L. Jania
  • Publication number: 20130006631
    Abstract: Environmental recognition systems may improve recognition accuracy by leveraging local and nonlocal features in a recognition target. A local decoder may be used to analyze local features, and a nonlocal decoder may be used to analyze nonlocal features. Local and nonlocal estimates may then be exchanged to improve the accuracy of the local and nonlocal decoders. Additional iterations of analysis and exchange may be performed until a predetermined threshold is reached. In some embodiments, the system may comprise extrinsic information extractors to prevent positive feedback loops from causing the system to adhere to erroneous previous decisions.
    Type: Application
    Filed: June 28, 2012
    Publication date: January 3, 2013
    Applicant: UTAH STATE UNIVERSITY
    Inventors: Jacob Gunther, Todd Moon
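A skeleton of the iterative local/nonlocal exchange suggested by the abstract above, in a turbo-decoding style: each decoder re-estimates using the other's latest output until the change falls below a threshold; the decoder callables (which must accept a `None` prior on the first pass) and the stopping rule are placeholders:

```python
import numpy as np

def iterative_decode(signal, local_step, nonlocal_step, max_iters=10, tol=1e-3):
    """Alternate local/nonlocal decoding, exchanging estimates until they settle.

    local_step(signal, prior) and nonlocal_step(signal, prior) return
    probability vectors; tol is the predetermined stopping threshold."""
    local_est = nonlocal_est = None
    for _ in range(max_iters):
        new_local = local_step(signal, nonlocal_est)
        new_nonlocal = nonlocal_step(signal, new_local)
        if local_est is not None and np.abs(new_local - local_est).max() < tol:
            break
        local_est, nonlocal_est = new_local, new_nonlocal
    return local_est, nonlocal_est
```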
  • Publication number: 20130006620
    Abstract: A system and method for providing automatic and coordinated sharing of conversational resources, e.g., functions and arguments, between network-connected servers and devices and their corresponding applications. In one aspect, a system for providing automatic and coordinated sharing of conversational resources includes a network having a first and second network device, the first and second network device each comprising a set of conversational resources, a dialog manager for managing a conversation and executing calls requesting a conversational service, and a communication stack for communicating messages over the network using conversational protocols, wherein the conversational protocols establish coordinated network communication between the dialog managers of the first and second network device to automatically share the set of conversational resources of the first and second network device, when necessary, to perform their respective requested conversational service.
    Type: Application
    Filed: September 11, 2012
    Publication date: January 3, 2013
    Applicant: Nuance Communications, Inc.
    Inventors: Stephane H. Maes, Ponani Gopalakrishnan
  • Publication number: 20130006625
    Abstract: A system, method, and computer program product for automatically analyzing multimedia data audio content are disclosed. Embodiments receive multimedia data, detect portions having specified audio features, and output a corresponding subset of the multimedia data and generated metadata. Audio content features including voices, non-voice sounds, and closed captioning, from downloaded or streaming movies or video clips are identified as a human probably would do, but in essentially real time. Particular speakers and the most meaningful content sounds and words and corresponding time-stamps are recognized via database comparison, and may be presented in order of match probability. Embodiments responsively pre-fetch related data, recognize locations, and provide related advertisements. The content features may be also sent to search engines so that further related content may be identified. User feedback and verification may improve the embodiments over time.
    Type: Application
    Filed: June 28, 2011
    Publication date: January 3, 2013
    Applicant: Sony Corporation
    Inventors: Priyan Gunatilake, Djung Nguyen, Abhishek Patil, Dipendu Saha
  • Publication number: 20130006629
    Abstract: The present invention relates to a searching device, searching method, and program whereby searching for a word string corresponding to input voice can be performed in a robust manner. A voice recognition unit 11 subjects an input voice to voice recognition. A matching unit 16 matches, for each of multiple search-result word strings (the word strings that are to serve as search results for the input voice), a search-result pronunciation symbol string, which is an array of pronunciation symbols expressing the pronunciation of that word string, against a recognition-result pronunciation symbol string, which is an array of pronunciation symbols expressing the pronunciation of the voice recognition result of the input voice.
    Type: Application
    Filed: December 2, 2010
    Publication date: January 3, 2013
    Applicant: SONY CORPORATION
    Inventors: Hitoshi Honda, Yoshinori Maeda, Satoshi Asakawa
  • Publication number: 20120330662
    Abstract: An input supporting system (1) includes a database (10) which accumulates data for a plurality of items therein, an extraction unit (104) which compares, with the data for the items in the database (10), input data which is obtained as a result of a speech recognition process on speech data (D0), and extracts data similar to the input data from the database, and a presentation unit (106) which presents the extracted data as candidates to be registered in the database (10).
    Type: Application
    Filed: January 17, 2011
    Publication date: December 27, 2012
    Applicant: NEC CORPORATION
    Inventor: Masahiro Saikou
  • Publication number: 20120328086
    Abstract: A method and system for improving user satisfaction with a computer system that includes a computer. The computer prompts a user at a user machine to select a language usage pattern preference from at least two language usage pattern preference choices respectively including at least two text passages, each text passage expressing different text. After the prompting, the computer receives from the user machine a language usage pattern preference selected by the user from the at least two language usage pattern preference choices. The computer stores, in a user profile of the user located in a database accessible to the computer, a flag indicative of the selected language usage pattern preference.
    Type: Application
    Filed: September 5, 2012
    Publication date: December 27, 2012
    Applicant: International Business Machines Corporation
    Inventors: Nathan Raymond Hughes, Nishant Srinath Rao, Michelle Ann Uretsky
  • Publication number: 20120330654
    Abstract: A computer implemented method, system, and/or computer program product generates an audio cohort. Audio data from a set of audio sensors is received by an audio analysis engine. The audio data, which is associated with a plurality of objects, comprises a set of audio patterns. The audio data is processed to identify audio attributes associated with the plurality of objects to form digital audio data. This digital audio data comprises metadata that describes the audio attributes of the set of objects. A set of audio cohorts is generated using the audio attributes associated with the digital audio data and cohort criteria, where each audio cohort in the set of audio cohorts is a cohort of accompanied customers in a store, and where processing the audio data identifies a type of zoological creature that is accompanying each of the accompanied customers.
    Type: Application
    Filed: September 6, 2012
    Publication date: December 27, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: ROBERT LEE ANGELL, ROBERT R. FRIEDLANDER, JAMES R. KRAEMER
  • Publication number: 20120323576
    Abstract: Event audio data that is based on verbal utterances associated with a pharmaceutical event for a patient may be received. Medical history information associated with the patient may be obtained, based on information included in a medical history repository. At least one text string that matches at least one interpretation of the event audio data may be obtained, based on information included in a pharmaceutical speech repository, information included in a speech accent repository, and a drug matching function, the at least one text string being associated with a pharmaceutical drug. One or more adverse drug event (ADE) alerts may be determined based on matching the at least one text string and medical history attributes associated with the patient against ADE attributes obtained from an ADE repository. An ADE alert report may be generated based on the determined one or more ADE alerts.
    Type: Application
    Filed: June 17, 2011
    Publication date: December 20, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Tao Wang, Bin Zhou
  • Publication number: 20120323577
    Abstract: Methods of automatic speech recognition that accommodate premature enunciation. In one method, a) a user is prompted to input speech, then b) a listening period is initiated to monitor audio via a microphone, such that there is no pause between the end of step a) and the beginning of step b), and then a begin-speaking audible indicator is communicated to the user during the listening period. In another method, a) at least one audio file is played including both a prompt for a user to input speech and a begin-speaking audible indicator, b) a microphone is activated to monitor audio after playing the prompt but before playing the begin-speaking audible indicator in step a), and c) speech is received from the user via the microphone.
    Type: Application
    Filed: June 16, 2011
    Publication date: December 20, 2012
    Applicant: GENERAL MOTORS LLC
    Inventors: John J. Correia, Rathinavelu Chengalvarayan, Gaurav Talwar, Xufang Zhao
  • Publication number: 20120323573
    Abstract: A method for scoring non-native speech includes receiving a speech sample spoken by a non-native speaker and performing automatic speech recognition and metric extraction on the speech sample to generate a transcript of the speech sample and a speech metric associated with the speech sample. The method further includes determining whether the speech sample is scorable or non-scorable based upon the transcript and speech metric, where the determination is based on an audio quality of the speech sample, an amount of speech of the speech sample, a degree to which the speech sample is off-topic, whether the speech sample includes speech from an incorrect language, or whether the speech sample includes plagiarized material. When the sample is determined to be non-scorable, an indication of non-scorability is associated with the speech sample. When the sample is determined to be scorable, the sample is provided to a scoring model for scoring.
    Type: Application
    Filed: March 23, 2012
    Publication date: December 20, 2012
    Inventors: Su-Youn Yoon, Derrick Higgins, Klaus Zechner, Shasha Xie, Je Hun Jeon, Keelan Evanini
  • Publication number: 20120316875
    Abstract: Embodiments of the invention provide systems and methods for speech signal handling. Speech handling according to one embodiment of the present invention can be performed via a hosted architecture. An electrical signal representing human speech can be analyzed with an Automatic Speech Recognizer (ASR) hosted on a different server from the media server or other server hosting a service utilizing speech input. Neither server need be located at the same location as the user. The spoken sounds can be accepted as input to, and handled by, a media server which identifies the parts of the electrical signal that contain a representation of speech. This architecture can serve any user who has a web browser and Internet access, on a PC, PDA, cell phone, tablet, or any other computing device.
    Type: Application
    Filed: June 8, 2012
    Publication date: December 13, 2012
    Applicant: Red Shift Company, LLC
    Inventors: JOEL NYQUIST, Matthew Robinson
  • Publication number: 20120316870
    Abstract: A communication unit, a voice input unit, a storage unit, and a processor are included in a communication device. The communication unit enables communication between the device and other communication devices. The voice input unit receives voice signals, which may correspond to a stored speech command and a related operation. The processor detects a match and executes the desired operation. A related communication method is also provided.
    Type: Application
    Filed: August 24, 2011
    Publication date: December 13, 2012
    Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
    Inventors: YING-CHUAN YU, YING-XIONG HUANG
  • Publication number: 20120316877
    Abstract: A dynamic exponential, feature-based, language model is continually adjusted per utterance by a user, based on the user's usage history. This adjustment of the model is done incrementally per user, over a large number of users, each with a unique history. The user history can include previously recognized utterances, text queries, and other user inputs. The history data for a user is processed to derive features. These features are then added into the language model dynamically for that user.
    Type: Application
    Filed: June 12, 2011
    Publication date: December 13, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Geoffrey Zweig, Shuangyu Chang
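A small sketch of a feature-based exponential (log-linear) language model with a per-user history feature added dynamically, as the abstract above describes; the feature set, weights, and names here are illustrative:

```python
import math

def lm_score(word, prev, weights, history):
    """Log-linear score; 'in_history' is the dynamic per-user feature."""
    feats = {
        ("unigram", word): 1.0,
        ("bigram", prev, word): 1.0,
        ("in_history", word): 1.0 if history.get(word, 0) > 0 else 0.0,
    }
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())

def lm_probs(vocab, prev, weights, history):
    """Normalize exp(scores) into a distribution over the vocabulary."""
    exp_scores = {w: math.exp(lm_score(w, prev, weights, history)) for w in vocab}
    z = sum(exp_scores.values())
    return {w: s / z for w, s in exp_scores.items()}

# usage: 'call' is boosted because it appears in this user's history
weights = {("unigram", "play"): 0.2, ("in_history", "call"): 1.5}
print(lm_probs(["play", "pause", "call"], "<s>", weights, {"call": 3}))
```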
  • Publication number: 20120316876
    Abstract: A display system, a display device, a control method for the display device, and a voice recognition system are disclosed. A display device according to one embodiment of the present invention can carry out voice recognition on a voice received from at least one speaker through at least one voice input device and display the voice recognition result on the display unit. Accordingly, effective voice recognition is made possible for TV environments, which involve constraints different from those of mobile terminal environments.
    Type: Application
    Filed: September 23, 2011
    Publication date: December 13, 2012
    Inventors: Seokbok Jang, Jongse Park, Joonyup Lee, Jungkyu Choi
  • Publication number: 20120316871
    Abstract: An automatic speech recognition system includes an audio capture component, a speech recognition processing component, and a result processing component which are distributed among two or more logical devices and/or two or more physical devices. In particular, the audio capture component may be located on a different logical device and/or physical device from the result processing component. For example, the audio capture component may be on a computer connected to a microphone into which a user speaks, while the result processing component may be on a terminal server which receives speech recognition results from a speech recognition processing server.
    Type: Application
    Filed: June 8, 2012
    Publication date: December 13, 2012
    Inventors: Detlef Koll, Michael Finke
  • Publication number: 20120316868
    Abstract: Methods and systems are described for changing a communication quality of a communication session based on a meaning of speech data. Speech data exchanged between clients participating in a communication session is parsed. A meaning of the parsed speech data is determined to determine a communication quality of the communication session. An action is performed to change the communication quality of the communication session based on the meaning of the parsed speech data.
    Type: Application
    Filed: August 27, 2012
    Publication date: December 13, 2012
    Inventor: Mona Singh