Voice Recognition Patents (Class 704/246)
-
Patent number: 8583434
Abstract: Computer-implemented methods and apparatus are provided to facilitate the recognition of the content of a body of speech data. In one embodiment, a method for analyzing verbal communication is provided, comprising acts of producing an electronic recording of a plurality of spoken words; processing the electronic recording to identify a plurality of word alternatives for each of the spoken words, each of the plurality of word alternatives being identified by comparing a portion of the electronic recording with a lexicon, and each of the plurality of word alternatives being assigned a probability of correctly identifying a spoken word; loading the word alternatives and the probabilities to a database for subsequent analysis; and examining the word alternatives and the probabilities to determine at least one characteristic of the plurality of spoken words.
Type: Grant
Filed: January 29, 2008
Date of Patent: November 12, 2013
Assignee: CallMiner, Inc.
Inventor: Jeffrey A. Gallino
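The per-word alternatives structure this abstract describes can be sketched as follows. The candidate words and raw acoustic-match scores are illustrative stand-ins, not taken from the patent; a real recognizer would derive them from acoustic comparison against the lexicon.

```python
# Sketch: each spoken word maps to several lexicon candidates, each with
# a probability of being the correct transcription. The raw scores below
# stand in for real acoustic-match scores.

def word_alternatives(raw_scores):
    """Normalize raw acoustic-match scores into per-word probabilities."""
    total = sum(raw_scores.values())
    return {word: score / total for word, score in raw_scores.items()}

# One "spoken word" with three candidate interpretations from the lexicon.
alts = word_alternatives({"recognize": 6.0, "wreck a nice": 3.0, "recognise": 1.0})
best = max(alts, key=alts.get)  # most probable alternative
```

Rows like `alts` would then be loaded into a database and queried during the later analysis step.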
-
Patent number: 8583443
Abstract: Disclosed is a recording and reproducing apparatus comprising: an apparatus main body; and a remote controller to perform remote control of the apparatus main body, wherein the remote controller comprises: a key operating section to receive a key operation by a user; a sound information inputting section to input sound information; and a transmitting section to transmit sound data based on the sound information to the apparatus main body, and the apparatus main body comprises: a recording section to record input content data on a recording medium; a reproducing section to reproduce the content data; a receiving section to receive the sound data; a sound information recording section to record the sound data so as to be associated with a piece of the content data; and a sound information outputting section to reproduce the sound data to output the reproduced sound data.
Type: Grant
Filed: April 10, 2008
Date of Patent: November 12, 2013
Assignee: Funai Electric Co., Ltd.
Inventor: Masayuki Misawa
-
Patent number: 8577675
Abstract: In one aspect thereof the invention provides a method for noise suppression of a speech signal that includes, for a speech signal having a frequency domain representation dividable into a plurality of frequency bins, determining a value of a scaling gain for at least some of said frequency bins and calculating smoothed scaling gain values. Calculating smoothed scaling gain values includes, for the at least some of the frequency bins, combining a currently determined value of the scaling gain and a previously determined value of the smoothed scaling gain. In another aspect a method partitions the plurality of frequency bins into a first set of contiguous frequency bins and a second set of contiguous frequency bins having a boundary frequency therebetween, where the boundary frequency differentiates between noise suppression techniques, and changes a value of the boundary frequency as a function of the spectral content of the speech signal.
Type: Grant
Filed: December 22, 2004
Date of Patent: November 5, 2013
Assignee: Nokia Corporation
Inventor: Milan Jelinek
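The per-bin gain smoothing described here is a simple recursive combination of the current gain with the previous smoothed gain. A minimal sketch, with an assumed smoothing factor `alpha` (the patent does not fix a value):

```python
# Recursive smoothing of per-bin noise-suppression gains:
# g_s[k] = alpha * g_s_prev[k] + (1 - alpha) * g[k] for each bin k.
# alpha is an assumed parameter, purely for illustration.

def smooth_gains(current_gains, prev_smoothed, alpha=0.7):
    """Combine the current scaling gain with the previous smoothed gain."""
    return [alpha * p + (1.0 - alpha) * c
            for c, p in zip(current_gains, prev_smoothed)]

smoothed = smooth_gains([1.0, 0.2, 0.5], [0.8, 0.8, 0.5])
```

Higher `alpha` values track the previous smoothed gains more closely, suppressing frame-to-frame gain fluctuations (musical noise) at the cost of slower adaptation.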
-
Patent number: 8577680
Abstract: A method, article of manufacture, and apparatus for monitoring data traffic on a network is disclosed. In an embodiment, this includes obtaining intrinsic data from at least a portion of the traffic, obtaining extrinsic data from at least a portion of the traffic, associating the intrinsic data with the extrinsic data, and logging the intrinsic data and extrinsic data. The portion of the traffic from which the intrinsic data and extrinsic data are derived may not be stored, or may be stored in encrypted form.
Type: Grant
Filed: December 30, 2006
Date of Patent: November 5, 2013
Assignee: EMC Corporation
Inventors: Christopher Hercules Claudatos, William Dale Andruss, Scott R. Bevan
-
Publication number: 20130289991
Abstract: According to a present invention embodiment, a system utilizes a voice tag to automatically tag one or more entities within a social media environment, and comprises a computer system including at least one processor. The system analyzes the voice tag to identify one or more entities, where the voice tag includes voice signals providing information pertaining to one or more entities. One or more characteristics of each identified entity are determined based on the information within the voice tag. One or more entities appropriate for tagging within the social media environment are determined based on the characteristics and user settings within the social media environment of the identified entities, and automatically tagged. Embodiments of the present invention further include a method and computer program product for utilizing a voice tag to automatically tag one or more entities within a social media environment in substantially the same manner described above.
Type: Application
Filed: April 30, 2012
Publication date: October 31, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bhavani K. Eshwar, Martin A. Oberhofer, Sushain Pandit
-
Patent number: 8571867
Abstract: A method (700) and system (900) for authenticating a user is provided. The method can include receiving one or more spoken utterances from a user (702), recognizing a phrase corresponding to one or more spoken utterances (704), identifying a biometric voice print of the user from one or more spoken utterances of the phrase (706), determining a device identifier associated with the device (708), and authenticating the user based on the phrase, the biometric voice print, and the device identifier (710). A location of the handset or the user can be employed as criteria for granting access to one or more resources (712).
Type: Grant
Filed: September 13, 2012
Date of Patent: October 29, 2013
Assignee: Porticus Technology, Inc.
Inventors: Germano Di Mambro, Bernardas Salna
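The three-factor decision in this abstract (phrase, voice print, device identifier) can be illustrated with a toy check. Everything here is an assumption for illustration: real systems score biometric similarity rather than comparing exact values, and the enrolled record names are invented.

```python
# Toy three-factor authentication: the spoken phrase, a biometric voice
# print, and the device identifier must all match the enrolled record.
# Exact equality stands in for real fuzzy/biometric matching.

ENROLLED = {"device-42": {"phrase": "open sesame", "voice_print": "vp-alice"}}

def authenticate(device_id, phrase, voice_print, enrolled=ENROLLED):
    record = enrolled.get(device_id)
    return (record is not None
            and record["phrase"] == phrase
            and record["voice_print"] == voice_print)
```

A failure in any single factor (wrong device, wrong phrase, or mismatched voice print) denies access, which is the combined-criteria behavior the claim describes.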
-
Patent number: 8571865
Abstract: Systems, methods performed by data processing apparatus and computer storage media encoded with computer programs for receiving information relating to (i) a communication device that has received an utterance and (ii) a voice associated with the received utterance, comparing the received voice information with voice signatures in a comparison group, the comparison group including one or more individuals identified from one or more connections arising from the received information relating to the communication device, attempting to identify the voice associated with the utterance as matching one of the individuals in the comparison group, and based on a result of the attempt to identify, selectively providing the communication device with access to one or more resources associated with the matched individual.
Type: Grant
Filed: August 10, 2012
Date of Patent: October 29, 2013
Assignee: Google Inc.
Inventor: Philip Hewinson
-
Patent number: 8571858
Abstract: For classifying different segments of a signal which has segments of at least a first type and second type, e.g. audio and speech segments, the signal is short-term classified on the basis of the at least one short-term feature extracted from the signal and a short-term classification result is delivered. The signal is also long-term classified on the basis of the at least one short-term feature and at least one long-term feature extracted from the signal and a long-term classification result is delivered. The short-term classification result and the long-term classification result are combined to provide an output signal indicating whether a segment of the signal is of the first type or of the second type.
Type: Grant
Filed: January 11, 2011
Date of Patent: October 29, 2013
Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
Inventors: Guillaume Fuchs, Stefan Bayer, Jens Hirschfeld, Juergen Herre, Jeremie Lecomte, Frederik Nagel, Nikolaus Rettelbach, Stefan Wabnik, Yoshikazu Yokotani
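One simple way to combine the two classification results, sketched below as a weighted vote; the weighting scheme and its parameter are assumptions for illustration, not the patent's actual combination rule.

```python
# Sketch: combine a short-term and a long-term classification score
# (each in [0, 1], where >0.5 leans "speech") with a weighted vote.
# The long_weight parameter is an assumed value, not from the patent.

def combine(short_score, long_score, long_weight=0.6):
    """Return the segment type from the two classifier scores."""
    mixed = long_weight * long_score + (1.0 - long_weight) * short_score
    return "speech" if mixed > 0.5 else "music"
```

The long-term result, built from more context, gets the larger weight, so a momentary short-term disagreement does not flip the decision unless it is strong.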
-
Patent number: 8571869
Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
Type: Grant
Filed: May 15, 2008
Date of Patent: October 29, 2013
Assignee: Nuance Communications, Inc.
Inventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
-
Patent number: 8566097
Abstract: A lexical acquisition apparatus includes: a phoneme recognition section 2 for preparing a phoneme sequence candidate from an inputted speech; a word matching section 3 for preparing a plurality of word sequences based on the phoneme sequence candidate; a discrimination section 4 for selecting, from among a plurality of word sequences, a word sequence having a high likelihood in a recognition result; an acquisition section 5 for acquiring a new word based on the word sequence selected by the discrimination section 4; a teaching word list 4A used to teach a name; and a probability model 4B of the teaching word and an unknown word, wherein the discrimination section 4 calculates, for each word sequence, a first evaluation value showing how much words in the word sequence correspond to teaching words in the list 4A and a second evaluation value showing a probability at which the words in the word sequence are adjacent to one another and selects a word sequence for which a sum of the first evaluation value and the
Type: Grant
Filed: June 1, 2010
Date of Patent: October 22, 2013
Assignees: Honda Motor Co., Ltd., Advanced Telecommunications Research Institute International
Inventors: Mikio Nakano, Takashi Nose, Ryo Taguchi, Kotaro Funakoshi, Naoto Iwahashi
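The selection rule in this abstract (pick the word sequence with the best combined evaluation) can be sketched directly; the candidate sequences and their scores below are invented for illustration.

```python
# Sketch: among candidate word sequences, select the one maximizing the
# sum of the first evaluation value (match against the teaching-word
# list) and the second evaluation value (word-adjacency probability).
# Candidates and scores are illustrative, not from the patent.

def select_sequence(candidates):
    """candidates: list of (word_sequence, eval1, eval2) tuples."""
    return max(candidates, key=lambda c: c[1] + c[2])[0]

chosen = select_sequence([
    (["this", "is", "rex"], 0.9, 0.4),
    (["this", "is", "wrecks"], 0.5, 0.6),
])
```

Summing the two values lets a strong teaching-word match outweigh a slightly better adjacency probability, which is how the apparatus separates taught names from unknown words.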
-
Patent number: 8566092
Abstract: The present invention discloses a method and an apparatus for extracting a prosodic feature of a speech signal, the method including: dividing the speech signal into speech frames; transforming the speech frames from time domain to frequency domain; and extracting respective prosodic features for different frequency ranges. According to the above technical solution of the present invention, it is possible to effectively extract prosodic features that can be combined with traditional acoustic features without difficulty.
Type: Grant
Filed: August 16, 2010
Date of Patent: October 22, 2013
Assignee: Sony Corporation
Inventors: Kun Liu, Weiguo Wu
-
Patent number: 8566093
Abstract: A method for compensating inter-session variability for automatic extraction of information from an input voice signal representing an utterance of a speaker, includes: processing the input voice signal to provide feature vectors each formed by acoustic features extracted from the input voice signal at a time frame; computing an intersession variability compensation feature vector; and computing compensated feature vectors based on the extracted feature vectors and the intersession variability compensation feature vector.
Type: Grant
Filed: May 16, 2006
Date of Patent: October 22, 2013
Assignee: Loquendo S.p.A.
Inventors: Claudio Vair, Daniele Colibro, Pietro Laface
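A minimal sketch of the compensation step, assuming the simplest combination (subtracting the compensation vector from each feature vector); the patent's actual computation of the compensation vector is not reproduced here, and the numbers are illustrative.

```python
# Sketch: each acoustic feature vector has the estimated intersession
# variability compensation vector removed. Vectors are plain lists;
# real systems estimate the compensation per session from data.

def compensate(feature_vectors, compensation):
    """Subtract the compensation vector from every feature vector."""
    return [[f - c for f, c in zip(vec, compensation)]
            for vec in feature_vectors]

comp = compensate([[1.0, 2.0], [0.5, -1.0]], [0.5, 1.0])
```

After compensation, the feature vectors from different recording sessions of the same speaker line up more closely, which is what makes downstream extraction (e.g. speaker verification) session-robust.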
-
Patent number: 8566103
Abstract: A system, apparatus, and method is disclosed for receiving user input at a client device, interpreting the user input to identify a selection of at least one of a plurality of web interaction modes, producing a corresponding client request based in part on the user input and the web interaction mode; and sending the client request to a server via a network.
Type: Grant
Filed: December 22, 2010
Date of Patent: October 22, 2013
Assignee: Intel Corporation
Inventor: Liang He
-
Patent number: 8566104
Abstract: A method for a speech response system to automatically transfer users to human agents. The method can establish an interactive dialog session between a user and an automated speech response system. An error score can be established when the interactive dialog session is initiated. During the interactive dialog session, responses to dialog prompts can be received. Error weights can be assigned to received responses determined to be non-valid responses. Different non-valid responses can be assigned different error weights. For each non-valid response, the assigned error weight can be added to the error score. When a value of the error score exceeds a previously established error threshold, a user can be automatically transferred from the automated speech response system to a human agent.
Type: Grant
Filed: November 1, 2011
Date of Patent: October 22, 2013
Assignee: Nuance Communications, Inc.
Inventors: Vanessa V. Michelini, Melanie D. Polkosky
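The escalation logic described here reduces to accumulating per-type error weights and comparing against a threshold. A minimal sketch; the specific error categories, weights, and threshold value are assumptions for illustration.

```python
# Sketch: non-valid responses add per-type error weights to a running
# score; crossing the threshold transfers the caller to a human agent.
# Categories, weights, and threshold are illustrative assumptions.

ERROR_WEIGHTS = {"no_input": 1.0, "out_of_grammar": 2.0, "repeated_error": 3.0}

def should_transfer(responses, threshold=4.0, weights=ERROR_WEIGHTS):
    """Return True once accumulated error weight exceeds the threshold."""
    score = 0.0
    for r in responses:
        score += weights.get(r, 0.0)  # valid responses add nothing
        if score > threshold:
            return True
    return False
```

Weighting error types differently (rather than counting errors uniformly) lets severe failures escalate a caller faster than minor ones, which is the point of the per-response weights in the claim.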
-
Patent number: 8560373
Abstract: A method for direct marketing comprising establishing a first communications link between a prospective customer using a device having a unique identification number and a communications device, automatically transmitting the unique identification number associated with the prospective customer's device to the communications device, establishing a second communications link between the communications device and a computer operably connected to a memory apparatus having a prospective customer database comprising prospective customer information associated with the unique identification number of the prospective customer's device, in which the information in the database determines prospective customer value which can be used to determine subsequent operations and marketing actions with the prospective customer.
Type: Grant
Filed: September 25, 2008
Date of Patent: October 15, 2013
Inventor: Eileen A. Fraser
-
Patent number: 8560315
Abstract: A conference support device includes an image receiving portion that receives captured images from conference terminals, a voice receiving portion that receives, from one of the conference terminals, a voice that is generated by a first participant, a first storage portion that stores the captured images and the voice, a voice recognition portion that recognizes the voice, a text data creation portion that creates text data that express the words that are included in the voice, an addressee specification portion that specifies a second participant, whom the voice is addressing, an image creation portion that creates a display image that is configured from the captured images and in which the text data are associated with the first participant and a specified image is associated with at least one of the first participant and the second participant, and a transmission portion that transmits the display image to the conference terminals.
Type: Grant
Filed: March 12, 2010
Date of Patent: October 15, 2013
Assignee: Brother Kogyo Kabushiki Kaisha
Inventor: Mizuho Yasoshima
-
Patent number: 8560327
Abstract: A method for synchronizing sound data and text data, said text data being obtained by manual transcription of said sound data during playback of the latter. The proposed method comprises the steps of repeatedly querying said sound data and said text data to obtain a current time position corresponding to a currently played sound datum and a currently transcribed text datum, respectively, correcting said current time position by applying a time correction value in accordance with a transcription delay, and generating at least one association datum indicative of a synchronization association between said corrected time position and said currently transcribed text datum. Thus, the proposed method achieves cost-effective synchronization of sound and text in connection with the manual transcription of sound data.
Type: Grant
Filed: August 18, 2006
Date of Patent: October 15, 2013
Assignee: Nuance Communications, Inc.
Inventors: Andreas Neubacher, Miklos Papai
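The correction-and-association step can be sketched in a few lines: the playback position at the moment a word is typed lags the word's true position by the typist's reaction delay, so that delay is subtracted before storing the link. The fixed delay value here is an illustrative assumption; the patent allows the correction value to reflect the actual transcription delay.

```python
# Sketch: subtract the transcription delay from the current playback
# position, then record the association between the corrected time
# and the just-transcribed text. Delay value is illustrative.

def associate(play_position_s, word, transcription_delay_s=1.5):
    """Build one association datum linking corrected time to text."""
    corrected = max(0.0, play_position_s - transcription_delay_s)
    return {"time": corrected, "text": word}

assoc = associate(12.0, "hello")
```

Collecting these association data during transcription yields sound/text synchronization without a separate alignment pass, which is the cost saving the abstract claims.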
-
Patent number: 8560316
Abstract: The present invention relates to a system and method of making a verification decision within a speaker recognition system. A speech sample is gathered from a speaker over a period of time, and a verification score is then produced for the sample over that period. Once the verification score is determined, a confidence measure is produced based on frame score observations from the sample over the period, with the confidence measure calculated based on the standard Gaussian distribution. If the confidence measure indicates with a set level of confidence that the verification score is below the verification threshold, the speaker is rejected and the gathering process terminated.
Type: Grant
Filed: December 19, 2007
Date of Patent: October 15, 2013
Inventors: Robert Vogt, Michael Mason, Sridaran Subramanian
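The early-rejection test can be sketched as a Gaussian confidence interval on the running frame-score mean; the exact statistic the patent uses may differ, so treat this as an assumed formulation.

```python
import math

# Sketch: frame scores observed so far estimate the final verification
# score; a Gaussian confidence bound on that estimate lets the system
# reject early when the score is below threshold with the required
# confidence. The 0.95 confidence level is an assumed parameter.

def confident_reject(frame_scores, threshold, confidence=0.95):
    """True if P(true score < threshold) >= confidence, Gaussian model."""
    n = len(frame_scores)
    mean = sum(frame_scores) / n
    var = sum((s - mean) ** 2 for s in frame_scores) / n
    stderr = math.sqrt(var / n) or 1e-12  # guard against zero variance
    z = (threshold - mean) / stderr
    p_below = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return p_below >= confidence
```

When the test fires, gathering stops immediately, so clearly non-matching speakers are rejected without consuming the full sample, which is the latency benefit the abstract describes.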
-
Patent number: 8560301
Abstract: A language expression apparatus and a method based on context and intent awareness are provided. The apparatus and method may recognize a context and an intent of a user and may generate a language expression based on the recognized context and the recognized intent, thereby providing an interpretation/translation service and/or providing an education service for learning a language.
Type: Grant
Filed: March 2, 2010
Date of Patent: October 15, 2013
Assignee: Samsung Electronics Co., Ltd.
Inventor: Yeo Jin Kim
-
Patent number: 8560310
Abstract: A method, apparatus and computer program product for providing improved voice activated functions is presented. A grammar is provided from a collection of names for use in a voice activated operation, the grammar including the names and variations of the names. A preferred one of the variations of a name is associated with a name in the grammar. A preferred one of the variations of the name is received and is used to perform a task.
Type: Grant
Filed: May 8, 2012
Date of Patent: October 15, 2013
Assignee: Nuance Communications, Inc.
Inventors: Ya-Xin Zhang, Qing-Feng Bao
-
Patent number: 8554563
Abstract: A method and system for speaker diarization are provided. Pre-trained acoustic models of individual speaker and/or groups of speakers are obtained. Speech data with multiple speakers is received and divided into frames. For a frame, an acoustic feature vector is determined extended to include log-likelihood ratios of the pre-trained models in relation to a background population model. The extended acoustic feature vector is used in segmentation and clustering algorithms.
Type: Grant
Filed: September 11, 2012
Date of Patent: October 8, 2013
Assignee: Nuance Communications, Inc.
Inventor: Hagai Aronowitz
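The extended feature vector can be sketched with toy one-dimensional Gaussian speaker models: the base acoustic features for a frame are augmented with one log-likelihood ratio per pre-trained model, each taken relative to a background model. The 1-D Gaussians and all numbers here are illustrative assumptions; real systems use GMMs over full feature vectors.

```python
import math

# Sketch: append per-speaker log-likelihood ratios (speaker model vs.
# background population model) to the frame's base feature vector.
# Models are toy 1-D Gaussians (mean, variance) for illustration only.

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def extend(features, speaker_models, background, x):
    """Extend base features with one LLR per pre-trained speaker model."""
    llrs = [log_gauss(x, m, v) - log_gauss(x, background[0], background[1])
            for m, v in speaker_models]
    return features + llrs

vec = extend([0.1, 0.2], [(0.0, 1.0), (3.0, 1.0)], (1.5, 4.0), x=0.1)
```

Because the LLR dimensions directly encode "how much does this frame sound like each known speaker," the downstream segmentation and clustering get speaker-discriminative evidence for free.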
-
Patent number: 8554566
Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
Type: Grant
Filed: November 29, 2012
Date of Patent: October 8, 2013
Assignee: Morphism LLC
Inventor: James H. Stephens, Jr.
-
Patent number: 8554562
Abstract: A method and system for speaker diarization are provided. Pre-trained acoustic models of individual speaker and/or groups of speakers are obtained. Speech data with multiple speakers is received and divided into frames. For a frame, an acoustic feature vector is determined extended to include log-likelihood ratios of the pre-trained models in relation to a background population model. The extended acoustic feature vector is used in segmentation and clustering algorithms.
Type: Grant
Filed: November 15, 2009
Date of Patent: October 8, 2013
Assignee: Nuance Communications, Inc.
Inventor: Hagai Aronowitz
-
Patent number: 8554541
Abstract: A virtual pet system includes: a virtual pet client, adapted to receive a sentence in natural language and send the sentence to a Q&A server; the Q&A server, adapted to receive the sentence, process the sentence through natural language comprehension, generate an answer in natural language based on a result of natural language comprehension and reasoning knowledge, and send the answer in natural language to the virtual pet client. A method for virtual pet chatting includes: receiving a sentence in natural language, performing natural language comprehension for the sentence, and generating an answer in natural language based on a result of natural language comprehension and reasoning knowledge.
Type: Grant
Filed: September 18, 2008
Date of Patent: October 8, 2013
Assignee: Tencent Technology (Shenzhen) Company Ltd.
Inventors: Haisong Yang, Zhiyuan Liu, Yunfeng Liu, Rongling Yu
-
Publication number: 20130262115
Abstract: A computerized alert mode management method of a communication device, the communication device includes a sound capture unit. Vocal sounds of the environment around the communication device are captured at regular intervals using the sound capture unit. Voice characteristic information of the captured vocal sounds is extracted using a speech recognition method and/or a voice recognition method. The communication device is controlled to work at one of a plurality of predetermined alert modes according to the extracted voice characteristic information.
Type: Application
Filed: December 6, 2012
Publication date: October 3, 2013
Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
Inventor: TSUNG-JEN CHUANG
-
Patent number: 8548809
Abstract: A voice guidance system for providing a guidance by voice concerning operations of an information processing apparatus, comprises a detector that detects that a predetermined function of the information processing apparatus is disabled, and a voice guidance unit that outputs a voice message reporting a reason why the predetermined function of the information processing apparatus is disabled, in response to the detection output of the detector.
Type: Grant
Filed: June 16, 2005
Date of Patent: October 1, 2013
Assignee: Fuji Xerox Co., Ltd.
Inventors: Kanji Itaki, Michihiro Kawamura, Nozomi Noguchi
-
Patent number: 8548807
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
Type: Grant
Filed: June 9, 2009
Date of Patent: October 1, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
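The restructuring step, each dictionary phoneme's model becoming a weighted sum of the models of its plausible phonemes, can be sketched with scalar "models". Representing an acoustic model by a single mean value is a gross simplification for illustration; real acoustic models are GMM/HMM state distributions, and the weights would come from the phoneme lattice.

```python
# Sketch: each dictionary phoneme's acoustic model becomes a weighted
# sum of the models of all plausible phonemes observed for the new
# speaker. A "model" here is just a scalar mean, purely illustrative.

def restructure(phoneme_models, weights):
    """weights: {dict_phoneme: {plausible_phoneme: weight}}, each inner
    map summing to 1; returns a blended model per dictionary phoneme."""
    return {ph: sum(w * phoneme_models[p] for p, w in wmap.items())
            for ph, wmap in weights.items()}

models = {"t": 1.0, "d": 3.0}
# Speaker tends to realize /t/ partly like /d/ (weights from a lattice).
blended = restructure(models, {"t": {"t": 0.8, "d": 0.2}})
```

The pronouncing dictionary itself never changes; only the acoustic space behind each dictionary phoneme shifts toward the speaker's actual realizations, exactly as the abstract states.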
-
Publication number: 20130253933
Abstract: A voice recognition device 100 includes: a vehicle state detecting unit 7 for detecting a vehicle state of a vehicle having the voice recognition device mounted thereon; an acoustic data matching unit 5 for matching an acoustic feature value converted by a sound analyzing unit 3 with a recognition dictionary stored in a recognition dictionary storage unit 4 to recognize a voice input to a microphone 1; a recognition parameter setting unit 10 for setting a recognition parameter at the time when the voice input to the microphone 1 is recognized; and a control unit 9 for instructing the recognition parameter setting unit 10 to change the recognition parameter when the vehicle state detected by the vehicle state detecting unit 7 satisfies a predetermined condition.
Type: Application
Filed: April 8, 2011
Publication date: September 26, 2013
Applicant: MITSUBISHI ELECTRIC CORPORATION
Inventor: Yuzo Maruta
-
Publication number: 20130253932
Abstract: A conversation supporting device of an embodiment of the present disclosure has an information storage unit, a recognition resource constructing unit, and a voice recognition unit. Here, the information storage unit stores the information disclosed by a speaker. The recognition resource constructing unit uses the disclosed information to construct the recognition resource including a voice model and a language model for recognition of voice data. The voice recognition unit uses the recognition resource to recognize the voice data.
Type: Application
Filed: February 25, 2013
Publication date: September 26, 2013
Applicant: Kabushiki Kaisha Toshiba
Inventors: Masahide ARIU, Kazuo Sumita, Akinori Kawamura
-
Patent number: 8542802
Abstract: A system for detecting three-way calls in a monitored telephone conversation includes a speech recognition processor that transcribes the monitored telephone conversation and associates characteristics of the monitored telephone conversation with a transcript thereof, a database to store the transcript and the characteristics associated therewith, and a three-way call detection processor to analyze the characteristics of the conversation and to detect therefrom the addition of one or more parties to the conversation. The system preferably includes at least one domain-specific language model that the speech recognition processor utilizes to transcribe the conversation. The system may operate in real-time or on previously recorded conversations. A query and retrieval system may be used to retrieve and review call records from the database.
Type: Grant
Filed: February 15, 2007
Date of Patent: September 24, 2013
Assignee: Global Tel*Link Corporation
Inventor: Andreas M. Olligschlaeger
-
Patent number: 8543402
Abstract: System and methods for robust multiple speaker segmentation in noisy conversational speech are presented. Robust voice activity detection is applied to detect temporal speech events. In order to get robust speech features and detect speech events in a noisy environment, a noise reduction algorithm is applied, using noise tracking. After noise reduction and voice activity detection, the incoming audio/speech is initially labeled as speech segments or silence segments. With no prior knowledge of the number of speakers, the system identifies one reliable speech segment near the beginning of the conversational speech and extracts speech features with a short latency, then learns a statistical model from the selected speech segment. This initial statistical model is used to identify the succeeding speech segments in a conversation. The statistical model is also continuously adapted and expanded with newly identified speech segments that match well to the model.
Type: Grant
Filed: April 29, 2011
Date of Patent: September 24, 2013
Assignee: The Intellisis Corporation
Inventor: Jiyong Ma
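The bootstrap-then-adapt loop in this abstract can be sketched with a deliberately tiny "statistical model" (a running mean over one summary feature). The matching tolerance and all values are assumptions for illustration; the patent's model is a real statistical model over speech features.

```python
# Sketch: a model learned from one reliable initial segment is used to
# test succeeding segments, and is adapted with each segment that
# matches well. The model here is just a running mean, illustrative only.

def matches(model_mean, segment, tol=1.0):
    """Does this segment's mean feature fall near the model?"""
    seg_mean = sum(segment) / len(segment)
    return abs(seg_mean - model_mean) <= tol

def adapt(model_mean, n_seen, segment):
    """Fold a newly matched segment into the running model."""
    seg_mean = sum(segment) / len(segment)
    total = n_seen + len(segment)
    new_mean = (model_mean * n_seen + seg_mean * len(segment)) / total
    return new_mean, total

mean, n = adapt(0.0, 10, [1.0, 1.0])
```

Because the model both filters and grows with matching segments, it needs no prior knowledge of how many speakers are present, which is the property the abstract emphasizes.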
-
Publication number: 20130246066
Abstract: A method of providing services using voice recognition in a POS system includes loading an execution command set for each service subject provided by the POS system for each group; registering item-based voice pattern information on the execution command for each group; detecting operation mode of the POS system and activating a microphone by driving voice recognition mode in response to the detected operation mode; converting the received signal of the activated microphone into digital data, detecting properties of a sound wave from the digital data, and extracting sound wave analysis data from the detected properties; checking whether the sound wave analysis data has been registered and assigning a service use right to the received signal according to a result of the check; and performing voice recognition conversion on the received signal, searching for an execution command having a maximum likelihood for the resulting data, and performing services corresponding to the retrieved execution command.
Type: Application
Filed: October 30, 2012
Publication date: September 19, 2013
Applicant: POSBANK CO., LTD.
Inventor: Jae Seob Choi
-
Patent number: 8538756
Abstract: A storage unit stores a correspondence between a voice command and a display mode modification operation. When a control unit determines that a vehicle is traveling according to a traveling state of the vehicle obtained by a traveling state acquisition unit, when a voice recognition unit recognizes a voice, which is uttered by a user and received by a voice input unit, and when the control unit determines that the recognized voice corresponds to a voice command stored in the storage unit, the control unit performs a display mode change operation corresponding to the voice command and modifies a display mode of an icon indicated on an indication screen of an indication unit.
Type: Grant
Filed: January 10, 2011
Date of Patent: September 17, 2013
Assignee: DENSO CORPORATION
Inventors: Masahiro Fujii, Yuji Shinkai
-
Publication number: 20130238312
Abstract: Computer-implemented systems and methods for extracting information during a human-to-human mono-lingual or multi-lingual dialog between two speakers are disclosed. Information from either the recognized speech (or the translation thereof) by the second speaker and/or the recognized speech by the first speaker (or the translation thereof) is extracted. The extracted information is then entered into an electronic form stored in a data store.
Type: Application
Filed: February 6, 2013
Publication date: September 12, 2013
Applicant: MOBILE TECHNOLOGIES, LLC
Inventor: Alexander Waibel
-
Patent number: 8532995
Abstract: A method, system and machine-readable medium are provided. Speech input is received at a speech recognition component and recognized output is produced. A common dialog cue from the received speech input or input from a second source is recognized. An action is performed corresponding to the recognized common dialog cue. The performed action includes sending a communication from the speech recognition component to the speech generation component while bypassing a dialog component.
Type: Grant
Filed: May 21, 2012
Date of Patent: September 10, 2013
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Vincent J. Goffin, Sarangarajan Parthasarathy
-
Patent number: 8532871
Abstract: A vehicle operating device includes: a voice operation unit (3) for recognizing an uttered voice and outputting a voice recognition result; a spatial operation unit (2) for recognizing a movement performed within a predetermined space and outputting a spatial recognition result; a main processing unit (4) for executing a processing corresponding to the voice recognition result and the spatial recognition result; and a display unit (5) for displaying an image generated in accordance with an instruction from the main processing unit, the image being superimposed on an actual scene that can be viewed through a windshield.
Type: Grant
Filed: March 14, 2008
Date of Patent: September 10, 2013
Assignee: Mitsubishi Electric Company
Inventors: Reiko Okada, Kiyoshi Matsutani, Atsushi Kohno, Fumitaka Sato, Yuta Kawana, Wataru Yamazaki
-
Patent number: 8532674
Abstract: A method of operating a vehicle telematics unit includes determining the location of a vehicle equipped with a vehicle telematics unit; determining if telematics dialing software operated by the vehicle telematics unit includes a verbal dialing protocol used at the determined vehicle location; if not, identifying one or more verbal dialing protocols used at the determined location of the vehicle; requesting telematics dialing software that includes the one or more identified verbal dialing protocols; receiving the requested telematics dialing software from a central facility; and storing the received telematics dialing software at the vehicle.
Type: Grant
Filed: December 10, 2010
Date of Patent: September 10, 2013
Assignee: General Motors LLC
Inventors: Uma Arun, Rathinavelu Chengalvarayan, Kevin R. Krause, Eray Yasan, Gaurav Talwar, Xufang Zhao, Michael A. Wuergler
-
Publication number: 20130231933Abstract: A method and system for addressee identification of speech includes defining several time intervals and utilizing one or more function evaluations to classify each of the several participants as addressing speech to an automated character or not addressing speech to the automated character during each of the several time intervals. A first function evaluation includes computing values for a predetermined set of features for each of the participants during a particular time interval and assigning a first addressing status to each of the several participants in the particular time interval, based on the values of each of the predetermined sets of features determined during the particular time interval. A second function evaluation may assign a second addressing status to each of the several participants in the particular time interval utilizing results of the first function evaluation for the particular time interval and for one or more additional contiguous time intervals.Type: ApplicationFiled: March 2, 2012Publication date: September 5, 2013Applicant: DISNEY ENTERPRISES, INC.Inventors: Hannaneh Hajishirzi, Jill Fain Lehman
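The two-pass scheme described in this abstract, a per-interval feature classifier followed by a smoothing pass over contiguous intervals, can be sketched as follows. The single gaze-like feature, the 0.5 threshold, and the majority-vote smoothing are assumptions for illustration, not the patented function evaluations.

```python
# Illustrative two-pass addressee classification over time intervals.

def first_pass(features):
    """Assign an initial addressing status per participant per interval.

    `features` maps participant -> list of per-interval feature values
    (here a single toward-character score in [0, 1]).
    """
    return {p: [score > 0.5 for score in scores]
            for p, scores in features.items()}

def second_pass(status, window=1):
    """Smooth each participant's statuses using adjacent intervals."""
    smoothed = {}
    for p, seq in status.items():
        out = []
        for i in range(len(seq)):
            lo, hi = max(0, i - window), min(len(seq), i + window + 1)
            votes = seq[lo:hi]
            out.append(sum(votes) > len(votes) / 2)  # majority vote
        smoothed[p] = out
    return smoothed

features = {"alice": [0.9, 0.2, 0.8, 0.9], "bob": [0.1, 0.1, 0.9, 0.2]}
initial = first_pass(features)
final = second_pass(initial)
```

Note how the second pass overrides isolated first-pass decisions (alice's second interval, bob's third) using the neighboring intervals' results, which is the role the abstract assigns to the second function evaluation.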
-
Patent number: 8527271Abstract: A method for voice recognition of a spoken expression comprising a plurality of expression parts to be recognized. Partial voice recognition is performed on a first selected expression part and, depending on a selection of hits for the first expression part detected by the partial voice recognition, voice recognition is executed on the first and further expression parts.Type: GrantFiled: June 18, 2008Date of Patent: September 3, 2013Assignee: Nuance Communications, Inc.Inventors: Michael Wandinger, Jesus Fernando Guitarte Perez, Bernhard Littel
-
Patent number: 8527268Abstract: A system for identification of video content in a video signal is provided via a sound track audio signal. The audio signal is processed with filtering, frequency translation, and/or nonlinear transformations to extract voice signals from the sound track channel. The extracted voice signals are coupled to a speech recognition system to provide, in text form, the words of the video content, which are later compared with a reference library of words or dialog from known video programs or movies. Other attributes of the video signal or transport stream may be combined with closed caption data or closed caption text for identification purposes. Example attributes include DVS/SAP information, time code information, histograms, and/or rendered video or pictures.Type: GrantFiled: June 30, 2010Date of Patent: September 3, 2013Assignee: Rovi Technologies CorporationInventor: Ronald Quan
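The comparison step described above, matching transcribed soundtrack words against a reference library of known programs, can be sketched with a simple word-overlap score. The speech-recognition front end is omitted, and the library contents and scoring rule are invented for the example.

```python
# Simplified sketch of matching a soundtrack transcript against word sets
# from known programs; the best-overlapping title wins.

REFERENCE_LIBRARY = {
    "Casablanca": {"play", "it", "again", "sam", "louis", "friendship"},
    "Jaws": {"bigger", "boat", "shark", "beach"},
}

def identify_program(transcript_words, library=REFERENCE_LIBRARY):
    """Return (title, overlap) of the best word-overlap match, or (None, 0)."""
    words = set(transcript_words)
    best_title, best_overlap = None, 0
    for title, vocab in library.items():
        overlap = len(words & vocab)
        if overlap > best_overlap:
            best_title, best_overlap = title, overlap
    return best_title, best_overlap

title, score = identify_program(["we're", "gonna", "need", "a", "bigger", "boat"])
```

A production matcher would weight rare words and phrases rather than counting raw overlap, but the shape of the comparison is the same.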
-
Publication number: 20130226581Abstract: A communication method includes: capturing analog sound signals output by an audio output unit and analyzing the captured analog sound signals to obtain corresponding digital audio information; comparing the obtained digital audio information with digital feature information stored in a storage unit to determine whether the obtained digital audio information includes the stored digital feature information; and playing reply information stored in the storage unit if the obtained digital audio information includes the stored digital feature information.Type: ApplicationFiled: September 26, 2012Publication date: August 29, 2013Applicants: HON HAI PRECISION INDUSTRY CO., LTD., HONG FU JIN PRECISION INDUSTRY (Shenzhen) CO., LTD.
-
Patent number: 8521527Abstract: A computer-implemented system and method for processing audio in a voice response environment is provided. A database of host scripts is maintained, each comprising signature files of audio phrases and actions to take when one of the audio phrases is recognized. The host scripts are loaded and a call to a voice mail server is initiated. Incoming audio buffers are received during the call from voice messages stored on the voice mail server. The incoming audio buffers are processed. A signature data structure is created for each audio buffer. The signature data structure is compared with the signatures of expected phrases in the host scripts. The actions stored in the host scripts are executed when the signature data structure matches the signature of an expected phrase.Type: GrantFiled: September 10, 2012Date of Patent: August 27, 2013Assignee: Intellisist, Inc.Inventor: Martin R. M. Dunsmuir
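The host-script mechanism above pairs phrase signatures with actions and executes an action when an incoming buffer's signature matches. A hedged sketch, using plain strings as stand-ins for the audio-derived signature data structures:

```python
# Sketch of signature/action host scripts for processing voicemail audio.
# Signatures here are plain strings; the real system derives a signature
# data structure from each incoming audio buffer.

class HostScript:
    def __init__(self, signature, action):
        self.signature = signature
        self.action = action  # callable to run on a match

def process_buffer(buffer_signature, scripts, executed):
    """Run the action of any script whose signature matches the buffer."""
    for script in scripts:
        if script.signature == buffer_signature:
            executed.append(script.action(buffer_signature))
            return True
    return False

executed = []
scripts = [
    HostScript("press_one_prompt", lambda s: "send DTMF 1"),
    HostScript("end_of_messages", lambda s: "hang up"),
]
matched = process_buffer("end_of_messages", scripts, executed)
```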
-
Patent number: 8521537Abstract: A computer-driven device assists a user in self-regulating speech control of the device. The device processes an input signal representing human speech to compute acoustic signal quality indicators indicating conditions likely to be problematic to speech recognition, and advises the user of those conditions.Type: GrantFiled: April 3, 2007Date of Patent: August 27, 2013Assignee: Promptu Systems CorporationInventors: Naren Chittar, Vikas Gulati, Matthew Pratt, Harry Printz
-
Patent number: 8521526Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing spoken query terms. In one aspect, a method includes performing speech recognition on an audio signal to select two or more textual, candidate transcriptions that match a spoken query term, and to establish a speech recognition confidence value for each candidate transcription, obtaining a search history for a user who spoke the spoken query term, where the search history references one or more past search queries that have been submitted by the user, generating one or more n-grams from each candidate transcription, where each n-gram is a subsequence of n phonemes, syllables, letters, characters, words or terms from a respective candidate transcription, and determining, for each n-gram, a frequency with which the n-gram occurs in the past search queries, and a weighting value that is based on the respective frequency.Type: GrantFiled: July 28, 2010Date of Patent: August 27, 2013Assignee: Google Inc.Inventors: Matthew I. Lloyd, Johan Schalkwyk, Pankaj Risbood
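The re-ranking idea in this abstract, scoring candidate transcriptions by recognizer confidence plus how often their n-grams occur in the user's past queries, can be sketched directly. The additive `confidence + alpha * frequency` weighting is an assumption for illustration; the patent only states that a weighting value is based on the frequency.

```python
# Minimal sketch: re-rank speech-recognition candidates using word bigram
# frequencies from the user's search history.

def ngrams(words, n):
    """Subsequences of n items from a word list."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def rerank(candidates, history, n=2, alpha=0.1):
    """candidates: list of (transcription, confidence). Best first."""
    history_grams = []
    for query in history:
        history_grams += ngrams(query.split(), n)

    def score(candidate):
        text, confidence = candidate
        grams = ngrams(text.split(), n)
        freq = sum(history_grams.count(g) for g in grams)
        return confidence + alpha * freq

    return sorted(candidates, key=score, reverse=True)

history = ["weather in new york", "new york pizza"]
candidates = [("new work deli", 0.55), ("new york deli", 0.50)]
best = rerank(candidates, history)[0][0]
```

Here "new york deli" overtakes the acoustically higher-scoring "new work deli" because the bigram ("new", "york") appears twice in the history, which is exactly the correction the abstract is after.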
-
Publication number: 20130218561Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating responses to a user speech query in voice-enabled search based on metadata that include demographic features of the speaker. A system practicing the method recognizes received speech from a speaker to generate recognized speech, identifies metadata about the speaker from the received speech, and feeds the recognized speech and the metadata to a question-answering engine. Identifying the metadata about the speaker is based on voice characteristics of the received speech. The demographic features can include age, gender, socio-economic group, nationality, and/or region. The metadata identified about the speaker from the received speech can be combined with or override self-reported speaker demographic information.Type: ApplicationFiled: March 19, 2013Publication date: August 22, 2013Applicant: AT&T Intellectual Property I, L.P.
-
Publication number: 20130218553Abstract: According to an embodiment, an information notification supporting device includes an analyzer configured to analyze an input voice so as to identify voice information indicating information related to speech; a storage unit configured to store therein a history of the voice information; an output controller configured to determine, using the history of the voice information, whether a user is able to listen to a message of which the user should be notified; and an output unit configured to output the message when it is determined that the user is in a state in which the user is able to listen to the message.Type: ApplicationFiled: December 28, 2012Publication date: August 22, 2013Applicant: KABUSHIKI KAISHA TOSHIBAInventors: Hiroko Fujii, Masaru Suzuki, Kazuo Sumita, Masahide Ariu
-
Patent number: 8515756Abstract: A method and device are configured to receive voice data from a user and perform speech recognition on the received voice data. A confidence score is calculated that represents the likelihood that the received voice data has been accurately recognized. A likely age range associated with the user is determined based on the confidence score.Type: GrantFiled: November 30, 2011Date of Patent: August 20, 2013Assignee: Verizon Patent and Licensing Inc.Inventor: Kevin R. Witzman
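The inference above can be shown as a toy mapping: lower recognition confidence against a recognizer tuned for typical adult speech is taken as evidence the speaker may fall outside that group. The thresholds and labels below are invented for illustration; the patent does not specify them.

```python
# Toy sketch: map a speech-recognition confidence score to a coarse,
# hypothetical age range. Thresholds are illustrative assumptions.

def likely_age_range(confidence):
    """Map a recognition confidence score in [0, 1] to a coarse age range."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if confidence >= 0.8:
        return "adult"
    if confidence >= 0.5:
        return "teen"
    return "child"

guess = likely_age_range(0.42)
```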
-
Patent number: 8515974Abstract: A method is presented for generating a list of frequently used words for an email application on a server computer. When a request is received for a word frequency list for emails stored in a user's mailbox, a word frequency list is returned if one exists. If the word frequency list does not exist, an asynchronous process is started on the server computer to generate a word frequency list. If the word frequency list exists but it is older than an aging limit, an asynchronous process is started on the server computer to regenerate the word frequency list. The word frequency list is stored in the user's mailbox along with a timestamp indicating the date and time that the list was created or updated.Type: GrantFiled: September 2, 2011Date of Patent: August 20, 2013Assignee: Microsoft CorporationInventors: Ashish Consul, Suryanarayana M. Gorti, Michael Geoffrey Andrew Wilson, James C. Kleewein
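The caching policy in this abstract, return the stored list if one exists and is fresh, otherwise rebuild it and stamp it with the build time, is easy to sketch. For simplicity the rebuild happens inline here rather than as the asynchronous server-side process the patent describes, and the mailbox model is invented.

```python
# Sketch of a per-mailbox word frequency list with a timestamp and an
# aging limit; stale or missing lists are regenerated on request.

import time
from collections import Counter

AGING_LIMIT_SECONDS = 24 * 3600  # illustrative aging limit

class Mailbox:
    def __init__(self, emails):
        self.emails = emails
        self.word_freq = None   # cached (word, count) list
        self.timestamp = None   # when the list was built

    def get_word_frequency_list(self, now=None):
        now = time.time() if now is None else now
        stale = (self.word_freq is None or
                 now - self.timestamp > AGING_LIMIT_SECONDS)
        if stale:
            counts = Counter(w for mail in self.emails
                             for w in mail.lower().split())
            self.word_freq = counts.most_common()
            self.timestamp = now
        return self.word_freq

box = Mailbox(["meeting today", "meeting notes", "lunch today today"])
top_word, top_count = box.get_word_frequency_list(now=0)[0]
```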
-
Publication number: 20130211836Abstract: A global speech user interface (GSUI) comprises an input system to receive a user's spoken command, a feedback system along with a set of feedback overlays to give the user information on the progress of his spoken requests, a set of visual cues on the television screen to help the user understand what he can say, a help system, and a model for navigation among applications. The interface is extensible to make it easy to add new applications.Type: ApplicationFiled: March 6, 2013Publication date: August 15, 2013Applicant: PROMPTU SYSTEMS CORPORATION
-
Patent number: 8510104Abstract: A system and method are provided to authenticate a voice in the frequency domain. A voice in the time domain is transformed to a signal in the frequency domain. The first harmonic is set to a predetermined frequency and the other harmonic components are equalized. Similarly, the amplitude of the first harmonic is set to a predetermined amplitude, and the harmonic components are also equalized. The voice signal is then filtered. The amplitudes of each of the harmonic components are then digitized into bits to form at least part of a voice ID. In another system and method, a voice is authenticated in the time domain. The initial rise time, initial fall time, second rise time, second fall time and final oscillation time are digitized into bits to form at least part of a voice ID. The voice IDs are used to authenticate a user's voice.Type: GrantFiled: September 14, 2012Date of Patent: August 13, 2013Assignee: Research In Motion LimitedInventor: Sasan Adibi
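The frequency-domain digitization step above, quantizing each normalized harmonic amplitude into bits and concatenating them into a voice ID, can be sketched as follows. The 4-bit width and normalizing against the first harmonic are assumptions; the patent only says the amplitudes are digitized into bits.

```python
# Illustrative sketch: quantize harmonic amplitudes (relative to the first
# harmonic) into fixed-width bit fields and concatenate into a voice ID.

def amplitudes_to_voice_id(amplitudes, bits_per_harmonic=4):
    """Quantize normalized harmonic amplitudes into a bit-string voice ID."""
    first = amplitudes[0]
    levels = (1 << bits_per_harmonic) - 1        # e.g. 15 for 4 bits
    voice_id = ""
    for amp in amplitudes:
        normalized = min(amp / first, 1.0)       # equalize against H1
        level = round(normalized * levels)
        voice_id += format(level, "0{}b".format(bits_per_harmonic))
    return voice_id

# First harmonic plus three weaker ones, as relative amplitudes.
vid = amplitudes_to_voice_id([1.0, 0.5, 0.25, 0.1])
```

Authentication would then compare a freshly computed voice ID against an enrolled one, typically with some tolerance per bit field rather than exact equality.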