Voice Recognition Patents (Class 704/246)
-
Patent number: 8583434
Abstract: Computer-implemented methods and apparatus are provided to facilitate the recognition of the content of a body of speech data. In one embodiment, a method for analyzing verbal communication is provided, comprising acts of producing an electronic recording of a plurality of spoken words; processing the electronic recording to identify a plurality of word alternatives for each of the spoken words, each of the plurality of word alternatives being identified by comparing a portion of the electronic recording with a lexicon, and each of the plurality of word alternatives being assigned a probability of correctly identifying a spoken word; loading the word alternatives and the probabilities to a database for subsequent analysis; and examining the word alternatives and the probabilities to determine at least one characteristic of the plurality of spoken words.
Type: Grant
Filed: January 29, 2008
Date of Patent: November 12, 2013
Assignee: CallMiner, Inc.
Inventor: Jeffrey A. Gallino
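The per-word alternatives structure this abstract describes can be sketched as follows. The candidate words and raw acoustic-match scores are illustrative stand-ins, not taken from the patent; a real recognizer would derive them from acoustic comparison against the lexicon.

```python
# Sketch: each spoken word maps to several lexicon candidates, each with
# a probability of being the correct transcription. The raw scores below
# stand in for real acoustic-match scores.

def word_alternatives(raw_scores):
    """Normalize raw acoustic-match scores into per-word probabilities."""
    total = sum(raw_scores.values())
    return {word: score / total for word, score in raw_scores.items()}

# One "spoken word" with three candidate interpretations from the lexicon.
alts = word_alternatives({"recognize": 6.0, "wreck a nice": 3.0, "recognise": 1.0})
best = max(alts, key=alts.get)  # most probable alternative
```

Rows like `alts` would then be loaded into a database and queried during the later analysis step.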
-
Patent number: 8583443
Abstract: Disclosed is a recording and reproducing apparatus comprising: an apparatus main body; and a remote controller to perform remote control of the apparatus main body, wherein the remote controller comprises: a key operating section to receive a key operation by a user; a sound information inputting section to input sound information; and a transmitting section to transmit sound data based on the sound information to the apparatus main body, and the apparatus main body comprises: a recording section to record input content data on a recording medium; a reproducing section to reproduce the content data; a receiving section to receive the sound data; a sound information recording section to record the sound data so as to be associated with a piece of the content data; and a sound information outputting section to reproduce the sound data to output the reproduced sound data.
Type: Grant
Filed: April 10, 2008
Date of Patent: November 12, 2013
Assignee: Funai Electric Co., Ltd.
Inventor: Masayuki Misawa
-
Patent number: 8577675
Abstract: In one aspect thereof the invention provides a method for noise suppression of a speech signal that includes, for a speech signal having a frequency domain representation dividable into a plurality of frequency bins, determining a value of a scaling gain for at least some of said frequency bins and calculating smoothed scaling gain values. Calculating smoothed scaling gain values includes, for the at least some of the frequency bins, combining a currently determined value of the scaling gain and a previously determined value of the smoothed scaling gain. In another aspect a method partitions the plurality of frequency bins into a first set of contiguous frequency bins and a second set of contiguous frequency bins having a boundary frequency therebetween, where the boundary frequency differentiates between noise suppression techniques, and changes a value of the boundary frequency as a function of the spectral content of the speech signal.
Type: Grant
Filed: December 22, 2004
Date of Patent: November 5, 2013
Assignee: Nokia Corporation
Inventor: Milan Jelinek
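The per-bin gain smoothing described here is a simple recursive combination of the current gain with the previous smoothed gain. A minimal sketch, with an assumed smoothing factor `alpha` (the patent does not fix a value):

```python
# Recursive smoothing of per-bin noise-suppression gains:
# g_s[k] = alpha * g_s_prev[k] + (1 - alpha) * g[k] for each bin k.
# alpha is an assumed parameter, purely for illustration.

def smooth_gains(current_gains, prev_smoothed, alpha=0.7):
    """Combine the current scaling gain with the previous smoothed gain."""
    return [alpha * p + (1.0 - alpha) * c
            for c, p in zip(current_gains, prev_smoothed)]

smoothed = smooth_gains([1.0, 0.2, 0.5], [0.8, 0.8, 0.5])
```

Higher `alpha` values track the previous smoothed gains more closely, suppressing frame-to-frame gain fluctuations (musical noise) at the cost of slower adaptation.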
-
Patent number: 8577680
Abstract: A method, article of manufacture, and apparatus for monitoring data traffic on a network is disclosed. In an embodiment, this includes obtaining intrinsic data from at least a portion of the traffic, obtaining extrinsic data from at least a portion of the traffic, associating the intrinsic data with the extrinsic data, and logging the intrinsic data and extrinsic data. The portion of the traffic from which the intrinsic data and extrinsic data are derived may not be stored, or may be stored in encrypted form.
Type: Grant
Filed: December 30, 2006
Date of Patent: November 5, 2013
Assignee: EMC Corporation
Inventors: Christopher Hercules Claudatos, William Dale Andruss, Scott R. Bevan
-
Publication number: 20130289991
Abstract: According to a present invention embodiment, a system utilizes a voice tag to automatically tag one or more entities within a social media environment, and comprises a computer system including at least one processor. The system analyzes the voice tag to identify one or more entities, where the voice tag includes voice signals providing information pertaining to one or more entities. One or more characteristics of each identified entity are determined based on the information within the voice tag. One or more entities appropriate for tagging within the social media environment are determined based on the characteristics and user settings within the social media environment of the identified entities, and automatically tagged. Embodiments of the present invention further include a method and computer program product for utilizing a voice tag to automatically tag one or more entities within a social media environment in substantially the same manner described above.
Type: Application
Filed: April 30, 2012
Publication date: October 31, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bhavani K. Eshwar, Martin A. Oberhofer, Sushain Pandit
-
Patent number: 8571867
Abstract: A method (700) and system (900) for authenticating a user is provided. The method can include receiving one or more spoken utterances from a user (702), recognizing a phrase corresponding to one or more spoken utterances (704), identifying a biometric voice print of the user from one or more spoken utterances of the phrase (706), determining a device identifier associated with the device (708), and authenticating the user based on the phrase, the biometric voice print, and the device identifier (710). A location of the handset or the user can be employed as criteria for granting access to one or more resources (712).
Type: Grant
Filed: September 13, 2012
Date of Patent: October 29, 2013
Assignee: Porticus Technology, Inc.
Inventors: Germano Di Mambro, Bernardas Salna
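The three-factor decision in this abstract (phrase, voice print, device identifier) can be illustrated with a toy check. Everything here is an assumption for illustration: real systems score biometric similarity rather than comparing exact values, and the enrolled record names are invented.

```python
# Toy three-factor authentication: the spoken phrase, a biometric voice
# print, and the device identifier must all match the enrolled record.
# Exact equality stands in for real fuzzy/biometric matching.

ENROLLED = {"device-42": {"phrase": "open sesame", "voice_print": "vp-alice"}}

def authenticate(device_id, phrase, voice_print, enrolled=ENROLLED):
    record = enrolled.get(device_id)
    return (record is not None
            and record["phrase"] == phrase
            and record["voice_print"] == voice_print)
```

A failure in any single factor (wrong device, wrong phrase, or mismatched voice print) denies access, which is the combined-criteria behavior the claim describes.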
-
Patent number: 8571865
Abstract: Systems, methods performed by data processing apparatus and computer storage media encoded with computer programs for receiving information relating to (i) a communication device that has received an utterance and (ii) a voice associated with the received utterance, comparing the received voice information with voice signatures in a comparison group, the comparison group including one or more individuals identified from one or more connections arising from the received information relating to the communication device, attempting to identify the voice associated with the utterance as matching one of the individuals in the comparison group, and based on a result of the attempt to identify, selectively providing the communication device with access to one or more resources associated with the matched individual.
Type: Grant
Filed: August 10, 2012
Date of Patent: October 29, 2013
Assignee: Google Inc.
Inventor: Philip Hewinson
-
Patent number: 8571858
Abstract: For classifying different segments of a signal which has segments of at least a first type and second type, e.g. audio and speech segments, the signal is short-term classified on the basis of the at least one short-term feature extracted from the signal and a short-term classification result is delivered. The signal is also long-term classified on the basis of the at least one short-term feature and at least one long-term feature extracted from the signal and a long-term classification result is delivered. The short-term classification result and the long-term classification result are combined to provide an output signal indicating whether a segment of the signal is of the first type or of the second type.
Type: Grant
Filed: January 11, 2011
Date of Patent: October 29, 2013
Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
Inventors: Guillaume Fuchs, Stefan Bayer, Jens Hirschfeld, Juergen Herre, Jeremie Lecomte, Frederik Nagel, Nikolaus Rettelbach, Stefan Wabnik, Yoshikazu Yokotani
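One simple way to combine the two classification results, sketched below as a weighted vote; the weighting scheme and its parameter are assumptions for illustration, not the patent's actual combination rule.

```python
# Sketch: combine a short-term and a long-term classification score
# (each in [0, 1], where >0.5 leans "speech") with a weighted vote.
# The long_weight parameter is an assumed value, not from the patent.

def combine(short_score, long_score, long_weight=0.6):
    """Return the segment type from the two classifier scores."""
    mixed = long_weight * long_score + (1.0 - long_weight) * short_score
    return "speech" if mixed > 0.5 else "music"
```

The long-term result, built from more context, gets the larger weight, so a momentary short-term disagreement does not flip the decision unless it is strong.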
-
Patent number: 8571869
Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
Type: Grant
Filed: May 15, 2008
Date of Patent: October 29, 2013
Assignee: Nuance Communications, Inc.
Inventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
-
Patent number: 8566097
Abstract: A lexical acquisition apparatus includes: a phoneme recognition section 2 for preparing a phoneme sequence candidate from an inputted speech; a word matching section 3 for preparing a plurality of word sequences based on the phoneme sequence candidate; a discrimination section 4 for selecting, from among a plurality of word sequences, a word sequence having a high likelihood in a recognition result; an acquisition section 5 for acquiring a new word based on the word sequence selected by the discrimination section 4; a teaching word list 4A used to teach a name; and a probability model 4B of the teaching word and an unknown word, wherein the discrimination section 4 calculates, for each word sequence, a first evaluation value showing how much words in the word sequence correspond to teaching words in the list 4A and a second evaluation value showing a probability at which the words in the word sequence are adjacent to one another and selects a word sequence for which a sum of the first evaluation value and the
Type: Grant
Filed: June 1, 2010
Date of Patent: October 22, 2013
Assignees: Honda Motor Co., Ltd., Advanced Telecommunications Research Institute International
Inventors: Mikio Nakano, Takashi Nose, Ryo Taguchi, Kotaro Funakoshi, Naoto Iwahashi
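The selection rule in this abstract (pick the word sequence with the best combined evaluation) can be sketched directly; the candidate sequences and their scores below are invented for illustration.

```python
# Sketch: among candidate word sequences, select the one maximizing the
# sum of the first evaluation value (match against the teaching-word
# list) and the second evaluation value (word-adjacency probability).
# Candidates and scores are illustrative, not from the patent.

def select_sequence(candidates):
    """candidates: list of (word_sequence, eval1, eval2) tuples."""
    return max(candidates, key=lambda c: c[1] + c[2])[0]

chosen = select_sequence([
    (["this", "is", "rex"], 0.9, 0.4),
    (["this", "is", "wrecks"], 0.5, 0.6),
])
```

Summing the two values lets a strong teaching-word match outweigh a slightly better adjacency probability, which is how the apparatus separates taught names from unknown words.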
-
Patent number: 8566092
Abstract: The present invention discloses a method and an apparatus for extracting a prosodic feature of a speech signal, the method including: dividing the speech signal into speech frames; transforming the speech frames from time domain to frequency domain; and extracting respective prosodic features for different frequency ranges. According to the above technical solution of the present invention, it is possible to effectively extract prosodic features that can be combined with traditional acoustic features without difficulty.
Type: Grant
Filed: August 16, 2010
Date of Patent: October 22, 2013
Assignee: Sony Corporation
Inventors: Kun Liu, Weiguo Wu
-
Patent number: 8566093
Abstract: A method for compensating inter-session variability for automatic extraction of information from an input voice signal representing an utterance of a speaker, includes: processing the input voice signal to provide feature vectors each formed by acoustic features extracted from the input voice signal at a time frame; computing an intersession variability compensation feature vector; and computing compensated feature vectors based on the extracted feature vectors and the intersession variability compensation feature vector.
Type: Grant
Filed: May 16, 2006
Date of Patent: October 22, 2013
Assignee: Loquendo S.p.A.
Inventors: Claudio Vair, Daniele Colibro, Pietro Laface
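A minimal sketch of the compensation step, assuming the simplest combination (subtracting the compensation vector from each feature vector); the patent's actual computation of the compensation vector is not reproduced here, and the numbers are illustrative.

```python
# Sketch: each acoustic feature vector has the estimated intersession
# variability compensation vector removed. Vectors are plain lists;
# real systems estimate the compensation per session from data.

def compensate(feature_vectors, compensation):
    """Subtract the compensation vector from every feature vector."""
    return [[f - c for f, c in zip(vec, compensation)]
            for vec in feature_vectors]

comp = compensate([[1.0, 2.0], [0.5, -1.0]], [0.5, 1.0])
```

After compensation, the feature vectors from different recording sessions of the same speaker line up more closely, which is what makes downstream extraction (e.g. speaker verification) session-robust.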
-
Patent number: 8566103
Abstract: A system, apparatus, and method is disclosed for receiving user input at a client device, interpreting the user input to identify a selection of at least one of a plurality of web interaction modes, producing a corresponding client request based in part on the user input and the web interaction mode; and sending the client request to a server via a network.
Type: Grant
Filed: December 22, 2010
Date of Patent: October 22, 2013
Assignee: Intel Corporation
Inventor: Liang He
-
Patent number: 8566104
Abstract: A method for a speech response system to automatically transfer users to human agents. The method can establish an interactive dialog session between a user and an automated speech response system. An error score can be established when the interactive dialog session is initiated. During the interactive dialog session, responses to dialog prompts can be received. Error weights can be assigned to received responses determined to be non-valid responses. Different non-valid responses can be assigned different error weights. For each non-valid response, the assigned error weight can be added to the error score. When a value of the error score exceeds a previously established error threshold, a user can be automatically transferred from the automated speech response system to a human agent.
Type: Grant
Filed: November 1, 2011
Date of Patent: October 22, 2013
Assignee: Nuance Communications, Inc.
Inventors: Vanessa V. Michelini, Melanie D. Polkosky
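The escalation logic described here reduces to accumulating per-type error weights and comparing against a threshold. A minimal sketch; the specific error categories, weights, and threshold value are assumptions for illustration.

```python
# Sketch: non-valid responses add per-type error weights to a running
# score; crossing the threshold transfers the caller to a human agent.
# Categories, weights, and threshold are illustrative assumptions.

ERROR_WEIGHTS = {"no_input": 1.0, "out_of_grammar": 2.0, "repeated_error": 3.0}

def should_transfer(responses, threshold=4.0, weights=ERROR_WEIGHTS):
    """Return True once accumulated error weight exceeds the threshold."""
    score = 0.0
    for r in responses:
        score += weights.get(r, 0.0)  # valid responses add nothing
        if score > threshold:
            return True
    return False
```

Weighting error types differently (rather than counting errors uniformly) lets severe failures escalate a caller faster than minor ones, which is the point of the per-response weights in the claim.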
-
Patent number: 8560373
Abstract: A method for direct marketing comprising establishing a first communications link between a prospective customer using a device having a unique identification number and a communications device, automatically transmitting the unique identification number associated with the prospective customer's device to the communications device, establishing a second communications link between the communications device and a computer operably connected to a memory apparatus having a prospective customer database comprising prospective customer information associated with the unique identification number of the prospective customer's device, in which the information in the database determines prospective customer value which can be used to determine subsequent operations and marketing actions with the prospective customer.
Type: Grant
Filed: September 25, 2008
Date of Patent: October 15, 2013
Inventor: Eileen A. Fraser
-
Patent number: 8560315
Abstract: A conference support device includes an image receiving portion that receives captured images from conference terminals, a voice receiving portion that receives, from one of the conference terminals, a voice that is generated by a first participant, a first storage portion that stores the captured images and the voice, a voice recognition portion that recognizes the voice, a text data creation portion that creates text data that express the words that are included in the voice, an addressee specification portion that specifies a second participant, whom the voice is addressing, an image creation portion that creates a display image that is configured from the captured images and in which the text data are associated with the first participant and a specified image is associated with at least one of the first participant and the second participant, and a transmission portion that transmits the display image to the conference terminals.
Type: Grant
Filed: March 12, 2010
Date of Patent: October 15, 2013
Assignee: Brother Kogyo Kabushiki Kaisha
Inventor: Mizuho Yasoshima
-
Patent number: 8560327
Abstract: A method for synchronizing sound data and text data, said text data being obtained by manual transcription of said sound data during playback of the latter. The proposed method comprises the steps of repeatedly querying said sound data and said text data to obtain a current time position corresponding to a currently played sound datum and a currently transcribed text datum, respectively, correcting said current time position by applying a time correction value in accordance with a transcription delay, and generating at least one association datum indicative of a synchronization association between said corrected time position and said currently transcribed text datum. Thus, the proposed method achieves cost-effective synchronization of sound and text in connection with the manual transcription of sound data.
Type: Grant
Filed: August 18, 2006
Date of Patent: October 15, 2013
Assignee: Nuance Communications, Inc.
Inventors: Andreas Neubacher, Miklos Papai
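The correction-and-association step can be sketched in a few lines: the playback position at the moment a word is typed lags the word's true position by the typist's reaction delay, so that delay is subtracted before storing the link. The fixed delay value here is an illustrative assumption; the patent allows the correction value to reflect the actual transcription delay.

```python
# Sketch: subtract the transcription delay from the current playback
# position, then record the association between the corrected time
# and the just-transcribed text. Delay value is illustrative.

def associate(play_position_s, word, transcription_delay_s=1.5):
    """Build one association datum linking corrected time to text."""
    corrected = max(0.0, play_position_s - transcription_delay_s)
    return {"time": corrected, "text": word}

assoc = associate(12.0, "hello")
```

Collecting these association data during transcription yields sound/text synchronization without a separate alignment pass, which is the cost saving the abstract claims.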
-
Patent number: 8560316
Abstract: The present invention relates to a system and method of making a verification decision within a speaker recognition system. A speech sample is gathered from a speaker over a period of time, and a verification score is then produced for the sample over that period. Once the verification score is determined, a confidence measure is produced based on frame score observations from the sample over the period, with the confidence measure calculated based on the standard Gaussian distribution. If the confidence measure indicates with a set level of confidence that the verification score is below the verification threshold, the speaker is rejected and the gathering process terminated.
Type: Grant
Filed: December 19, 2007
Date of Patent: October 15, 2013
Inventors: Robert Vogt, Michael Mason, Sridaran Subramanian
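The early-rejection test can be sketched as a Gaussian confidence interval on the running frame-score mean; the exact statistic the patent uses may differ, so treat this as an assumed formulation.

```python
import math

# Sketch: frame scores observed so far estimate the final verification
# score; a Gaussian confidence bound on that estimate lets the system
# reject early when the score is below threshold with the required
# confidence. The 0.95 confidence level is an assumed parameter.

def confident_reject(frame_scores, threshold, confidence=0.95):
    """True if P(true score < threshold) >= confidence, Gaussian model."""
    n = len(frame_scores)
    mean = sum(frame_scores) / n
    var = sum((s - mean) ** 2 for s in frame_scores) / n
    stderr = math.sqrt(var / n) or 1e-12  # guard against zero variance
    z = (threshold - mean) / stderr
    p_below = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return p_below >= confidence
```

When the test fires, gathering stops immediately, so clearly non-matching speakers are rejected without consuming the full sample, which is the latency benefit the abstract describes.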
-
Patent number: 8560301
Abstract: A language expression apparatus and a method based on context and intent awareness are provided. The apparatus and method may recognize a context and an intent of a user and may generate a language expression based on the recognized context and the recognized intent, thereby providing an interpretation/translation service and/or providing an education service for learning a language.
Type: Grant
Filed: March 2, 2010
Date of Patent: October 15, 2013
Assignee: Samsung Electronics Co., Ltd.
Inventor: Yeo Jin Kim
-
Patent number: 8560310
Abstract: A method, apparatus and computer program product for providing improved voice activated functions is presented. A grammar is provided from a collection of names for use in a voice activated operation, the grammar including the names and variations of the names. A preferred one of the variations of a name is associated with a name in the grammar. A preferred one of the variations of the name is received and is used to perform a task.
Type: Grant
Filed: May 8, 2012
Date of Patent: October 15, 2013
Assignee: Nuance Communications, Inc.
Inventors: Ya-Xin Zhang, Qing-Feng Bao
-
Patent number: 8554563
Abstract: A method and system for speaker diarization are provided. Pre-trained acoustic models of individual speaker and/or groups of speakers are obtained. Speech data with multiple speakers is received and divided into frames. For a frame, an acoustic feature vector is determined extended to include log-likelihood ratios of the pre-trained models in relation to a background population model. The extended acoustic feature vector is used in segmentation and clustering algorithms.
Type: Grant
Filed: September 11, 2012
Date of Patent: October 8, 2013
Assignee: Nuance Communications, Inc.
Inventor: Hagai Aronowitz
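The extended feature vector can be sketched with toy one-dimensional Gaussian speaker models: the base acoustic features for a frame are augmented with one log-likelihood ratio per pre-trained model, each taken relative to a background model. The 1-D Gaussians and all numbers here are illustrative assumptions; real systems use GMMs over full feature vectors.

```python
import math

# Sketch: append per-speaker log-likelihood ratios (speaker model vs.
# background population model) to the frame's base feature vector.
# Models are toy 1-D Gaussians (mean, variance) for illustration only.

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def extend(features, speaker_models, background, x):
    """Extend base features with one LLR per pre-trained speaker model."""
    llrs = [log_gauss(x, m, v) - log_gauss(x, background[0], background[1])
            for m, v in speaker_models]
    return features + llrs

vec = extend([0.1, 0.2], [(0.0, 1.0), (3.0, 1.0)], (1.5, 4.0), x=0.1)
```

Because the LLR dimensions directly encode "how much does this frame sound like each known speaker," the downstream segmentation and clustering get speaker-discriminative evidence for free.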
-
Patent number: 8554566
Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
Type: Grant
Filed: November 29, 2012
Date of Patent: October 8, 2013
Assignee: Morphism LLC
Inventor: James H. Stephens, Jr.
-
Patent number: 8554562
Abstract: A method and system for speaker diarization are provided. Pre-trained acoustic models of individual speaker and/or groups of speakers are obtained. Speech data with multiple speakers is received and divided into frames. For a frame, an acoustic feature vector is determined extended to include log-likelihood ratios of the pre-trained models in relation to a background population model. The extended acoustic feature vector is used in segmentation and clustering algorithms.
Type: Grant
Filed: November 15, 2009
Date of Patent: October 8, 2013
Assignee: Nuance Communications, Inc.
Inventor: Hagai Aronowitz
-
Patent number: 8554541
Abstract: A virtual pet system includes: a virtual pet client, adapted to receive a sentence in natural language and send the sentence to a Q&A server; the Q&A server, adapted to receive the sentence, process the sentence through natural language comprehension, generate an answer in natural language based on a result of natural language comprehension and reasoning knowledge, and send the answer in natural language to the virtual pet client. A method for virtual pet chatting includes: receiving a sentence in natural language, performing natural language comprehension for the sentence, and generating an answer in natural language based on a result of natural language comprehension and reasoning knowledge.
Type: Grant
Filed: September 18, 2008
Date of Patent: October 8, 2013
Assignee: Tencent Technology (Shenzhen) Company Ltd.
Inventors: Haisong Yang, Zhiyuan Liu, Yunfeng Liu, Rongling Yu
-
Publication number: 20130262115
Abstract: A computerized alert mode management method of a communication device, the communication device includes a sound capture unit. Vocal sounds of the environment around the communication device are captured at regular intervals using the sound capture unit. Voice characteristic information of the captured vocal sounds is extracted using a speech recognition method and/or a voice recognition method. The communication device is controlled to work at one of a plurality of predetermined alert modes according to the extracted voice characteristic information.
Type: Application
Filed: December 6, 2012
Publication date: October 3, 2013
Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
Inventor: TSUNG-JEN CHUANG
-
Patent number: 8548809
Abstract: A voice guidance system for providing a guidance by voice concerning operations of an information processing apparatus, comprises a detector that detects that a predetermined function of the information processing apparatus is disabled, and a voice guidance unit that outputs a voice message reporting a reason why the predetermined function of the information processing apparatus is disabled, in response to the detection output of the detector.
Type: Grant
Filed: June 16, 2005
Date of Patent: October 1, 2013
Assignee: Fuji Xerox Co., Ltd.
Inventors: Kanji Itaki, Michihiro Kawamura, Nozomi Noguchi
-
Patent number: 8548807
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
Type: Grant
Filed: June 9, 2009
Date of Patent: October 1, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
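The restructuring step, each dictionary phoneme's model becoming a weighted sum of the models of its plausible phonemes, can be sketched with scalar "models". Representing an acoustic model by a single mean value is a gross simplification for illustration; real acoustic models are GMM/HMM state distributions, and the weights would come from the phoneme lattice.

```python
# Sketch: each dictionary phoneme's acoustic model becomes a weighted
# sum of the models of all plausible phonemes observed for the new
# speaker. A "model" here is just a scalar mean, purely illustrative.

def restructure(phoneme_models, weights):
    """weights: {dict_phoneme: {plausible_phoneme: weight}}, each inner
    map summing to 1; returns a blended model per dictionary phoneme."""
    return {ph: sum(w * phoneme_models[p] for p, w in wmap.items())
            for ph, wmap in weights.items()}

models = {"t": 1.0, "d": 3.0}
# Speaker tends to realize /t/ partly like /d/ (weights from a lattice).
blended = restructure(models, {"t": {"t": 0.8, "d": 0.2}})
```

The pronouncing dictionary itself never changes; only the acoustic space behind each dictionary phoneme shifts toward the speaker's actual realizations, exactly as the abstract states.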
-
Publication number: 20130253933
Abstract: A voice recognition device 100 includes: a vehicle state detecting unit 7 for detecting a vehicle state of a vehicle having the voice recognition device mounted thereon; an acoustic data matching unit 5 for matching an acoustic feature value converted by a sound analyzing unit 3 with a recognition dictionary stored in a recognition dictionary storage unit 4 to recognize a voice input to a microphone 1; a recognition parameter setting unit 10 for setting a recognition parameter at the time when the voice input to the microphone 1 is recognized; and a control unit 9 for instructing the recognition parameter setting unit 10 to change the recognition parameter when the vehicle state detected by the vehicle state detecting unit 7 satisfies a predetermined condition.
Type: Application
Filed: April 8, 2011
Publication date: September 26, 2013
Applicant: MITSUBISHI ELECTRIC CORPORATION
Inventor: Yuzo Maruta
-
Publication number: 20130253932
Abstract: A conversation supporting device of an embodiment of the present disclosure has an information storage unit, a recognition resource constructing unit, and a voice recognition unit. Here, the information storage unit stores the information disclosed by a speaker. The recognition resource constructing unit uses the disclosed information to construct the recognition resource including a voice model and a language model for recognition of voice data. The voice recognition unit uses the recognition resource to recognize the voice data.
Type: Application
Filed: February 25, 2013
Publication date: September 26, 2013
Applicant: Kabushiki Kaisha Toshiba
Inventors: Masahide ARIU, Kazuo Sumita, Akinori Kawamura
-
Patent number: 8542802
Abstract: A system for detecting three-way calls in a monitored telephone conversation includes a speech recognition processor that transcribes the monitored telephone conversation and associates characteristics of the monitored telephone conversation with a transcript thereof, a database to store the transcript and the characteristics associated therewith, and a three-way call detection processor to analyze the characteristics of the conversation and to detect therefrom the addition of one or more parties to the conversation. The system preferably includes at least one domain-specific language model that the speech recognition processor utilizes to transcribe the conversation. The system may operate in real-time or on previously recorded conversations. A query and retrieval system may be used to retrieve and review call records from the database.
Type: Grant
Filed: February 15, 2007
Date of Patent: September 24, 2013
Assignee: Global Tel*Link Corporation
Inventor: Andreas M. Olligschlaeger
-
Patent number: 8543402
Abstract: System and methods for robust multiple speaker segmentation in noisy conversational speech are presented. Robust voice activity detection is applied to detect temporal speech events. In order to get robust speech features and detect speech events in a noisy environment, a noise reduction algorithm is applied, using noise tracking. After noise reduction and voice activity detection, the incoming audio/speech is initially labeled as speech segments or silence segments. With no prior knowledge of the number of speakers, the system identifies one reliable speech segment near the beginning of the conversational speech and extracts speech features with a short latency, then learns a statistical model from the selected speech segment. This initial statistical model is used to identify the succeeding speech segments in a conversation. The statistical model is also continuously adapted and expanded with newly identified speech segments that match well to the model.
Type: Grant
Filed: April 29, 2011
Date of Patent: September 24, 2013
Assignee: The Intellisis Corporation
Inventor: Jiyong Ma
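The bootstrap-then-adapt loop in this abstract can be sketched with a deliberately tiny "statistical model" (a running mean over one summary feature). The matching tolerance and all values are assumptions for illustration; the patent's model is a real statistical model over speech features.

```python
# Sketch: a model learned from one reliable initial segment is used to
# test succeeding segments, and is adapted with each segment that
# matches well. The model here is just a running mean, illustrative only.

def matches(model_mean, segment, tol=1.0):
    """Does this segment's mean feature fall near the model?"""
    seg_mean = sum(segment) / len(segment)
    return abs(seg_mean - model_mean) <= tol

def adapt(model_mean, n_seen, segment):
    """Fold a newly matched segment into the running model."""
    seg_mean = sum(segment) / len(segment)
    total = n_seen + len(segment)
    new_mean = (model_mean * n_seen + seg_mean * len(segment)) / total
    return new_mean, total

mean, n = adapt(0.0, 10, [1.0, 1.0])
```

Because the model both filters and grows with matching segments, it needs no prior knowledge of how many speakers are present, which is the property the abstract emphasizes.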
-
Publication number: 20130246066
Abstract: A method of providing services using voice recognition in a POS system includes loading an execution command set for each service subject provided by the POS system for each group; registering item-based voice pattern information on the execution command for each group; detecting operation mode of the POS system and activating a microphone by driving voice recognition mode in response to the detected operation mode; converting the received signal of the activated microphone into digital data, detecting properties of a sound wave from the digital data, and extracting sound wave analysis data from the detected properties; checking whether the sound wave analysis data has been registered and assigning a service use right to the received signal according to a result of the check; and performing voice recognition conversion on the received signal, searching for an execution command having a maximum likelihood for the resulting data, and performing services corresponding to the retrieved execution command.
Type: Application
Filed: October 30, 2012
Publication date: September 19, 2013
Applicant: POSBANK CO., LTD.
Inventor: Jae Seob Choi
-
Patent number: 8538756
Abstract: A storage unit stores a correspondence between a voice command and a display mode modification operation. When a control unit determines that a vehicle is traveling according to a traveling state of the vehicle obtained by a traveling state acquisition unit, when a voice recognition unit recognizes a voice, which is uttered by a user and received by a voice input unit, and when the control unit determines that the recognized voice corresponds to a voice command stored in the storage unit, the control unit performs a display mode change operation corresponding to the voice command and modifies a display mode of an icon indicated on an indication screen of an indication unit.
Type: Grant
Filed: January 10, 2011
Date of Patent: September 17, 2013
Assignee: DENSO CORPORATION
Inventors: Masahiro Fujii, Yuji Shinkai
-
Publication number: 20130238312
Abstract: Computer-implemented systems and methods for extracting information during a human-to-human mono-lingual or multi-lingual dialog between two speakers are disclosed. Information from either the recognized speech (or the translation thereof) by the second speaker and/or the recognized speech by the first speaker (or the translation thereof) is extracted. The extracted information is then entered into an electronic form stored in a data store.
Type: Application
Filed: February 6, 2013
Publication date: September 12, 2013
Applicant: MOBILE TECHNOLOGIES, LLC
Inventor: Alexander Waibel
-
Patent number: 8532995
Abstract: A method, system and machine-readable medium are provided. Speech input is received at a speech recognition component and recognized output is produced. A common dialog cue from the received speech input or input from a second source is recognized. An action is performed corresponding to the recognized common dialog cue. The performed action includes sending a communication from the speech recognition component to the speech generation component while bypassing a dialog component.
Type: Grant
Filed: May 21, 2012
Date of Patent: September 10, 2013
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Vincent J. Goffin, Sarangarajan Parthasarathy
-
Patent number: 8532871
Abstract: A vehicle operating device includes: a voice operation unit (3) for recognizing an uttered voice and outputting a voice recognition result; a spatial operation unit (2) for recognizing a movement performed within a predetermined space and outputting a spatial recognition result; a main processing unit (4) for executing a processing corresponding to the voice recognition result and the spatial recognition result; and a display unit (5) for displaying an image generated in accordance with an instruction from the main processing unit, the image being superimposed on an actual scene that can be viewed through a windshield.
Type: Grant
Filed: March 14, 2008
Date of Patent: September 10, 2013
Assignee: Mitsubishi Electric Company
Inventors: Reiko Okada, Kiyoshi Matsutani, Atsushi Kohno, Fumitaka Sato, Yuta Kawana, Wataru Yamazaki
-
Patent number: 8532674
Abstract: A method of operating a vehicle telematics unit includes determining the location of a vehicle equipped with a vehicle telematics unit; determining if telematics dialing software operated by the vehicle telematics unit includes a verbal dialing protocol used at the determined vehicle location; if not, identifying one or more verbal dialing protocols used at the determined location of the vehicle; requesting telematics dialing software that includes the one or more identified verbal dialing protocols; receiving the requested telematics dialing software from a central facility; and storing the received telematics dialing software at the vehicle.
Type: Grant
Filed: December 10, 2010
Date of Patent: September 10, 2013
Assignee: General Motors LLC
Inventors: Uma Arun, Rathinavelu Chengalvarayan, Kevin R. Krause, Eray Yasan, Gaurav Talwar, Xufang Zhao, Michael A. Wuergler
-
Publication number: 20130231933Abstract: A method and system for addressee identification of speech includes defining several time intervals and utilizing one or more function evaluations to classify each of the several participants as addressing speech to an automated character or not addressing speech to the automated character during each of the several time intervals. A first function evaluation includes computing values for a predetermined set of features for each of the participants during a particular time interval and assigning a first addressing status to each of the several participants in the particular time interval, based on the values of each of the predetermined sets of features determined during the particular time interval. A second function evaluation may assign a second addressing status to each of the several participants in the particular time interval utilizing results of the first function evaluation for the particular time interval and for one or more additional contiguous time intervals.Type: ApplicationFiled: March 2, 2012Publication date: September 5, 2013Applicant: DISNEY ENTERPRISES, INC.Inventors: Hannaneh Hajishirzi, Jill Fain Lehman
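The two-pass scheme described in this abstract, a per-interval feature classifier followed by a smoothing pass over contiguous intervals, can be sketched as follows. The single gaze-like feature, the 0.5 threshold, and the majority-vote smoothing are assumptions for illustration, not the patented function evaluations.

```python
# Illustrative two-pass addressee classification over time intervals.

def first_pass(features):
    """Assign an initial addressing status per participant per interval.

    `features` maps participant -> list of per-interval feature values
    (here a single toward-character score in [0, 1]).
    """
    return {p: [score > 0.5 for score in scores]
            for p, scores in features.items()}

def second_pass(status, window=1):
    """Smooth each participant's statuses using adjacent intervals."""
    smoothed = {}
    for p, seq in status.items():
        out = []
        for i in range(len(seq)):
            lo, hi = max(0, i - window), min(len(seq), i + window + 1)
            votes = seq[lo:hi]
            out.append(sum(votes) > len(votes) / 2)  # majority vote
        smoothed[p] = out
    return smoothed

features = {"alice": [0.9, 0.2, 0.8, 0.9], "bob": [0.1, 0.1, 0.9, 0.2]}
initial = first_pass(features)
final = second_pass(initial)
```

Note how the second pass overrides isolated first-pass decisions (alice's second interval, bob's third) using the neighboring intervals' results, which is the role the abstract assigns to the second function evaluation.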
-
Patent number: 8527271Abstract: A method for voice recognition of a spoken expression comprising a plurality of expression parts to be recognized. Partial voice recognition is performed on a first selected expression part and, depending on a selection of hits for the first expression part detected by the partial voice recognition, voice recognition is executed on the first and further expression parts.Type: GrantFiled: June 18, 2008Date of Patent: September 3, 2013Assignee: Nuance Communications, Inc.Inventors: Michael Wandinger, Jesus Fernando Guitarte Perez, Bernhard Littel
-
Patent number: 8527268Abstract: A system for identification of video content in a video signal is provided via a sound track audio signal. The audio signal is processed with filtering, frequency translation, and/or nonlinear transformations to extract voice signals from the sound track channel. The extracted voice signals are coupled to a speech recognition system to provide, in text form, the words of the video content, which are later compared with a reference library of words or dialog from known video programs or movies. Other attributes of the video signal or transport stream may be combined with closed caption data or closed caption text for identification purposes. Example attributes include DVS/SAP information, time code information, histograms, and/or rendered video or pictures.Type: GrantFiled: June 30, 2010Date of Patent: September 3, 2013Assignee: Rovi Technologies CorporationInventor: Ronald Quan
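The comparison step described above, matching transcribed soundtrack words against a reference library of known programs, can be sketched with a simple word-overlap score. The speech-recognition front end is omitted, and the library contents and scoring rule are invented for the example.

```python
# Simplified sketch of matching a soundtrack transcript against word sets
# from known programs; the best-overlapping title wins.

REFERENCE_LIBRARY = {
    "Casablanca": {"play", "it", "again", "sam", "louis", "friendship"},
    "Jaws": {"bigger", "boat", "shark", "beach"},
}

def identify_program(transcript_words, library=REFERENCE_LIBRARY):
    """Return (title, overlap) of the best word-overlap match, or (None, 0)."""
    words = set(transcript_words)
    best_title, best_overlap = None, 0
    for title, vocab in library.items():
        overlap = len(words & vocab)
        if overlap > best_overlap:
            best_title, best_overlap = title, overlap
    return best_title, best_overlap

title, score = identify_program(["we're", "gonna", "need", "a", "bigger", "boat"])
```

A production matcher would weight rare words and phrases rather than counting raw overlap, but the shape of the comparison is the same.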
-
Publication number: 20130226581Abstract: A communication method includes: capturing analog sound signals output by an audio output unit and analyzing the captured analog sound signals to obtain corresponding digital audio information; comparing the obtained digital audio information with digital feature information stored in a storage unit to determine whether the obtained digital audio information includes the stored digital feature information; and playing reply information stored in the storage unit if the obtained digital audio information includes the stored digital feature information.Type: ApplicationFiled: September 26, 2012Publication date: August 29, 2013Applicants: HON HAI PRECISION INDUSTRY CO., LTD., HONG FU JIN PRECISION INDUSTRY (Shenzhen) CO., LTD.
-
Patent number: 8521527Abstract: A computer-implemented system and method for processing audio in a voice response environment is provided. A database of host scripts is maintained, each comprising signature files of audio phrases and actions to take when one of the audio phrases is recognized. The host scripts are loaded and a call to a voice mail server is initiated. Incoming audio buffers are received during the call from voice messages stored on the voice mail server. The incoming audio buffers are processed. A signature data structure is created for each audio buffer. The signature data structure is compared with the signatures of expected phrases in the host scripts. The actions stored in the host scripts are executed when the signature data structure matches the signature of an expected phrase.Type: GrantFiled: September 10, 2012Date of Patent: August 27, 2013Assignee: Intellisist, Inc.Inventor: Martin R. M. Dunsmuir
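The host-script mechanism above pairs phrase signatures with actions and executes an action when an incoming buffer's signature matches. A hedged sketch, using plain strings as stand-ins for the audio-derived signature data structures:

```python
# Sketch of signature/action host scripts for processing voicemail audio.
# Signatures here are plain strings; the real system derives a signature
# data structure from each incoming audio buffer.

class HostScript:
    def __init__(self, signature, action):
        self.signature = signature
        self.action = action  # callable to run on a match

def process_buffer(buffer_signature, scripts, executed):
    """Run the action of any script whose signature matches the buffer."""
    for script in scripts:
        if script.signature == buffer_signature:
            executed.append(script.action(buffer_signature))
            return True
    return False

executed = []
scripts = [
    HostScript("press_one_prompt", lambda s: "send DTMF 1"),
    HostScript("end_of_messages", lambda s: "hang up"),
]
matched = process_buffer("end_of_messages", scripts, executed)
```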
-
Patent number: 8521537Abstract: A computer-driven device assists a user in self-regulating speech control of the device. The device processes an input signal representing human speech to compute acoustic signal quality indicators indicating conditions likely to be problematic to speech recognition, and advises the user of those conditions.Type: GrantFiled: April 3, 2007Date of Patent: August 27, 2013Assignee: Promptu Systems CorporationInventors: Naren Chittar, Vikas Gulati, Matthew Pratt, Harry Printz
-
Patent number: 8521526Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing spoken query terms. In one aspect, a method includes performing speech recognition on an audio signal to select two or more textual, candidate transcriptions that match a spoken query term, and to establish a speech recognition confidence value for each candidate transcription, obtaining a search history for a user who spoke the spoken query term, where the search history references one or more past search queries that have been submitted by the user, generating one or more n-grams from each candidate transcription, where each n-gram is a subsequence of n phonemes, syllables, letters, characters, words or terms from a respective candidate transcription, and determining, for each n-gram, a frequency with which the n-gram occurs in the past search queries, and a weighting value that is based on the respective frequency.Type: GrantFiled: July 28, 2010Date of Patent: August 27, 2013Assignee: Google Inc.Inventors: Matthew I. Lloyd, Johan Schalkwyk, Pankaj Risbood
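The re-ranking idea in this abstract, scoring candidate transcriptions by recognizer confidence plus how often their n-grams occur in the user's past queries, can be sketched directly. The additive `confidence + alpha * frequency` weighting is an assumption for illustration; the patent only states that a weighting value is based on the frequency.

```python
# Minimal sketch: re-rank speech-recognition candidates using word bigram
# frequencies from the user's search history.

def ngrams(words, n):
    """Subsequences of n items from a word list."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def rerank(candidates, history, n=2, alpha=0.1):
    """candidates: list of (transcription, confidence). Best first."""
    history_grams = []
    for query in history:
        history_grams += ngrams(query.split(), n)

    def score(candidate):
        text, confidence = candidate
        grams = ngrams(text.split(), n)
        freq = sum(history_grams.count(g) for g in grams)
        return confidence + alpha * freq

    return sorted(candidates, key=score, reverse=True)

history = ["weather in new york", "new york pizza"]
candidates = [("new work deli", 0.55), ("new york deli", 0.50)]
best = rerank(candidates, history)[0][0]
```

Here "new york deli" overtakes the acoustically higher-scoring "new work deli" because the bigram ("new", "york") appears twice in the history, which is exactly the correction the abstract is after.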
-
Publication number: 20130218561Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating responses to a user speech query in voice-enabled search based on metadata that include demographic features of the speaker. A system practicing the method recognizes received speech from a speaker to generate recognized speech, identifies metadata about the speaker from the received speech, and feeds the recognized speech and the metadata to a question-answering engine. Identifying the metadata about the speaker is based on voice characteristics of the received speech. The demographic features can include age, gender, socio-economic group, nationality, and/or region. The metadata identified about the speaker from the received speech can be combined with or override self-reported speaker demographic information.Type: ApplicationFiled: March 19, 2013Publication date: August 22, 2013Applicant: AT&T Intellectual Property I, L.P.
-
Publication number: 20130218553Abstract: According to an embodiment, an information notification supporting device includes an analyzer configured to analyze an input voice so as to identify voice information indicating information related to speech; a storage unit configured to store therein a history of the voice information; an output controller configured to determine, using the history of the voice information, whether a user is able to listen to a message of which the user should be notified; and an output unit configured to output the message when it is determined that the user is in a state in which the user is able to listen to the message.Type: ApplicationFiled: December 28, 2012Publication date: August 22, 2013Applicant: KABUSHIKI KAISHA TOSHIBAInventors: Hiroko Fujii, Masaru Suzuki, Kazuo Sumita, Masahide Ariu
-
Patent number: 8515756Abstract: A method and device are configured to receive voice data from a user and perform speech recognition on the received voice data. A confidence score is calculated that represents the likelihood that the received voice data has been accurately recognized. A likely age range associated with the user is determined based on the confidence score.Type: GrantFiled: November 30, 2011Date of Patent: August 20, 2013Assignee: Verizon Patent and Licensing Inc.Inventor: Kevin R. Witzman
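The inference above can be shown as a toy mapping: lower recognition confidence against a recognizer tuned for typical adult speech is taken as evidence the speaker may fall outside that group. The thresholds and labels below are invented for illustration; the patent does not specify them.

```python
# Toy sketch: map a speech-recognition confidence score to a coarse,
# hypothetical age range. Thresholds are illustrative assumptions.

def likely_age_range(confidence):
    """Map a recognition confidence score in [0, 1] to a coarse age range."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if confidence >= 0.8:
        return "adult"
    if confidence >= 0.5:
        return "teen"
    return "child"

guess = likely_age_range(0.42)
```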
-
Patent number: 8515974Abstract: A method is presented for generating a list of frequently used words for an email application on a server computer. When a request is received for a word frequency list for emails stored in a user's mailbox, a word frequency list is returned if one exists. If the word frequency list does not exist, an asynchronous process is started on the server computer to generate a word frequency list. If the word frequency list exists but it is older than an aging limit, an asynchronous process is started on the server computer to regenerate the word frequency list. The word frequency list is stored in the user's mailbox along with a timestamp indicating the date and time that the list was created or updated.Type: GrantFiled: September 2, 2011Date of Patent: August 20, 2013Assignee: Microsoft CorporationInventors: Ashish Consul, Suryanarayana M. Gorti, Michael Geoffrey Andrew Wilson, James C. Kleewein
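The caching policy in this abstract, return the stored list if one exists and is fresh, otherwise rebuild it and stamp it with the build time, is easy to sketch. For simplicity the rebuild happens inline here rather than as the asynchronous server-side process the patent describes, and the mailbox model is invented.

```python
# Sketch of a per-mailbox word frequency list with a timestamp and an
# aging limit; stale or missing lists are regenerated on request.

import time
from collections import Counter

AGING_LIMIT_SECONDS = 24 * 3600  # illustrative aging limit

class Mailbox:
    def __init__(self, emails):
        self.emails = emails
        self.word_freq = None   # cached (word, count) list
        self.timestamp = None   # when the list was built

    def get_word_frequency_list(self, now=None):
        now = time.time() if now is None else now
        stale = (self.word_freq is None or
                 now - self.timestamp > AGING_LIMIT_SECONDS)
        if stale:
            counts = Counter(w for mail in self.emails
                             for w in mail.lower().split())
            self.word_freq = counts.most_common()
            self.timestamp = now
        return self.word_freq

box = Mailbox(["meeting today", "meeting notes", "lunch today today"])
top_word, top_count = box.get_word_frequency_list(now=0)[0]
```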
-
Publication number: 20130211836Abstract: A global speech user interface (GSUI) comprises an input system to receive a user's spoken command, a feedback system along with a set of feedback overlays to give the user information on the progress of his spoken requests, a set of visual cues on the television screen to help the user understand what he can say, a help system, and a model for navigation among applications. The interface is extensible to make it easy to add new applications.Type: ApplicationFiled: March 6, 2013Publication date: August 15, 2013Applicant: PROMPTU SYSTEMS CORPORATION
-
Patent number: 8510104Abstract: A system and method are provided to authenticate a voice in the frequency domain. A voice in the time domain is transformed to a signal in the frequency domain. The first harmonic is set to a predetermined frequency and the other harmonic components are equalized. Similarly, the amplitude of the first harmonic is set to a predetermined amplitude, and the harmonic components are also equalized. The voice signal is then filtered. The amplitudes of each of the harmonic components are then digitized into bits to form at least part of a voice ID. In another system and method, a voice is authenticated in the time domain. The initial rise time, initial fall time, second rise time, second fall time and final oscillation time are digitized into bits to form at least part of a voice ID. The voice IDs are used to authenticate a user's voice.Type: GrantFiled: September 14, 2012Date of Patent: August 13, 2013Assignee: Research In Motion LimitedInventor: Sasan Adibi
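The frequency-domain digitization step above, quantizing each normalized harmonic amplitude into bits and concatenating them into a voice ID, can be sketched as follows. The 4-bit width and normalizing against the first harmonic are assumptions; the patent only says the amplitudes are digitized into bits.

```python
# Illustrative sketch: quantize harmonic amplitudes (relative to the first
# harmonic) into fixed-width bit fields and concatenate into a voice ID.

def amplitudes_to_voice_id(amplitudes, bits_per_harmonic=4):
    """Quantize normalized harmonic amplitudes into a bit-string voice ID."""
    first = amplitudes[0]
    levels = (1 << bits_per_harmonic) - 1        # e.g. 15 for 4 bits
    voice_id = ""
    for amp in amplitudes:
        normalized = min(amp / first, 1.0)       # equalize against H1
        level = round(normalized * levels)
        voice_id += format(level, "0{}b".format(bits_per_harmonic))
    return voice_id

# First harmonic plus three weaker ones, as relative amplitudes.
vid = amplitudes_to_voice_id([1.0, 0.5, 0.25, 0.1])
```

Authentication would then compare a freshly computed voice ID against an enrolled one, typically with some tolerance per bit field rather than exact equality.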