Voice Recognition Patents (Class 704/246)
  • Publication number: 20150058016
    Abstract: A method and system for voice identification and validation is provided. The system registers one or more users on a social media platform with login information during a social media session, acquires a voice sample at any time during the social media session or a continuation of the social media session, associates the login information and the voice sample in a profile for each of the one or more users, stores the profile as a voice print in a voice print identifier database, and identifies at least one talker from an interfacing of the social media platform with the voice print identifier database. Other embodiments are provided.
    Type: Application
    Filed: August 22, 2014
    Publication date: February 26, 2015
    Inventor: Steve W. Goldstein
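The register/identify flow this abstract describes can be sketched roughly as follows. The `VoicePrintDB` class, the frame-averaging feature extraction, and the distance-based similarity are all illustrative assumptions, not the patent's actual implementation.

```python
# Illustrative sketch of the register/identify flow from the abstract;
# the profile store and the similarity metric are stand-in assumptions.

class VoicePrintDB:
    def __init__(self):
        self.profiles = {}  # login -> voice-print feature vector

    def register(self, login, voice_sample):
        # Associate login information with a voice sample in a profile.
        self.profiles[login] = self._embed(voice_sample)

    def identify(self, voice_sample):
        # Return the registered user whose stored voice print is closest.
        probe = self._embed(voice_sample)
        return max(self.profiles,
                   key=lambda login: self._similarity(self.profiles[login], probe),
                   default=None)

    @staticmethod
    def _embed(sample):
        # Stand-in feature extraction: average amplitude per fixed-size frame.
        frames = [sample[i:i + 4] for i in range(0, len(sample), 4)]
        return [sum(f) / len(f) for f in frames]

    @staticmethod
    def _similarity(a, b):
        # Negative Euclidean distance as a toy similarity score.
        return -sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```

A real system would replace `_embed` with a speaker-embedding model; the point is only the association of login information with a stored voice print and the later nearest-match lookup.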
  • Patent number: 8965761
    Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
    Type: Grant
    Filed: February 27, 2014
    Date of Patent: February 24, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: William Kress Bodin, Michael John Burkhart, Daniel G. Eisenhauer, Thomas James Watson, Daniel Mark Schumacher
  • Patent number: 8965816
    Abstract: Provided is a non-transitory computer readable medium storing a program causing a computer to function as a learning data acquiring unit that acquires learning data, a memory unit that performs machine learning using the learning data about cluster division where Markov chains of transition via a link from a node to a node on a network formed from plural nodes are divided into plural clusters each of which is indicated by a biased Markov chain and calculates a steady state of each biased Markov chain, a search condition receiving unit that receives a search condition from a user, a cluster extracting unit that extracts clusters suitable for the search condition, a partial network cutting unit that cuts a partial network formed by a node group belonging to the clusters, and an importance calculating unit that calculates importance of each node on the partial network.
    Type: Grant
    Filed: December 13, 2012
    Date of Patent: February 24, 2015
    Assignee: Fuji Xerox Co., Ltd.
    Inventor: Hiroshi Okamoto
  • Patent number: 8965764
    Abstract: Disclosed are an electronic apparatus and a voice recognition method for the same. The voice recognition method for the electronic apparatus includes: receiving an input voice of a user; determining characteristics of the user; and recognizing the input voice based on the determined characteristics of the user.
    Type: Grant
    Filed: January 7, 2010
    Date of Patent: February 24, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hee-seob Ryu, Seung-kwon Park, Jong-ho Lea, Jong-hyuk Jang
  • Patent number: 8958848
    Abstract: A mobile terminal including an input unit configured to receive an input to activate a voice recognition function on the mobile terminal and a memory configured to store multiple domains related to menus and operations of the mobile terminal. It further includes a controller configured to access a specific domain among the multiple domains included in the memory based on the received input to activate the voice recognition function, to recognize user speech based on a language model and an acoustic model of the accessed domain, and to determine at least one menu and operation of the mobile terminal based on the accessed specific domain and the recognized user speech.
    Type: Grant
    Filed: June 16, 2008
    Date of Patent: February 17, 2015
    Assignee: LG Electronics Inc.
    Inventors: Jong-Ho Shin, Jong-Keun Youn, Dae-Sung Jung, Jae-Hoon Yu, Tae-Jun Kim, Jae-Min Joh, Jae-Do Kwak
  • Patent number: 8959022
    Abstract: A method for determining a relatedness between a query video and a database video is provided. A processor extracts an audio stream from the query video to produce a query audio stream, extracts an audio stream from the database video to produce a database audio stream, produces a first-sized snippet from the query audio stream, and produces a first-sized snippet from the database audio stream. An estimation is made of a first most probable sequence of latent evidence probability vectors generating the first-sized audio snippet of the query audio stream. An estimation is made of a second most probable sequence of latent evidence probability vectors generating the first-sized audio snippet of the database audio stream. A similarity is measured between the first sequence and the second sequence, producing a score of relatedness between the two snippets. Finally, a relatedness is determined between the query video and the database video.
    Type: Grant
    Filed: November 19, 2012
    Date of Patent: February 17, 2015
    Assignee: Motorola Solutions, Inc.
    Inventors: Yang M. Cheng, Dusan Macho
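A minimal sketch of the snippet-scoring step: each snippet is a sequence of probability vectors, and the two sequences are reduced to a single relatedness score. The frame-wise cosine comparison used here is an assumption for illustration; the patent's actual similarity measure over latent evidence sequences is not specified in the abstract.

```python
# Toy relatedness score between two audio snippets, each represented as a
# sequence of probability vectors; frame-wise cosine similarity is assumed.

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    nu = sum(x * x for x in u) ** 0.5
    nv = sum(x * x for x in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def snippet_relatedness(seq_a, seq_b):
    # Compare the sequences frame by frame over their common length and
    # average the per-frame similarities into one relatedness score.
    n = min(len(seq_a), len(seq_b))
    return sum(cosine(a, b) for a, b in zip(seq_a[:n], seq_b[:n])) / n
```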
  • Patent number: 8959360
    Abstract: Methods, systems, and apparatus for voice authentication and command. In an aspect, a method comprises: receiving, by a data processing apparatus that is operating in a locked mode, audio data that encodes an utterance of a user, wherein the locked mode prevents the data processing apparatus from performing at least one action; providing, while the data processing apparatus is operating in the locked mode, the audio data to a voice biometric engine and a voice action engine; receiving, while the data processing apparatus is operating in the locked mode, an indication from the voice biometric engine that the user has been biometrically authenticated; and in response to receiving the indication, triggering the voice action engine to process a voice action that is associated with the utterance.
    Type: Grant
    Filed: August 15, 2013
    Date of Patent: February 17, 2015
    Assignee: Google Inc.
    Inventor: Hugo B. Barra
  • Publication number: 20150046161
    Abstract: An aspect provides a method, including: collecting, at one or more device sensors, one or more inputs selected from the group of inputs consisting of audio inputs from a learning environment and visual inputs from a learning environment; processing, using one or more processors, the one or more inputs to detect an unauthorized behavior pattern; mapping, using the one or more processors, the unauthorized behavior pattern to a predetermined action; and executing the predetermined action. Other aspects are described and claimed.
    Type: Application
    Filed: August 7, 2013
    Publication date: February 12, 2015
    Applicant: Lenovo (Singapore) Pte. Ltd.
    Inventors: Howard Locker, Richard Wayne Cheston, Goran Hans Wibran, John Weldon Nicholson
  • Patent number: 8954326
    Abstract: Provided are a voice command recognition apparatus and method capable of determining the intention of a voice command input through a voice dialog interface by combining a rule-based dialog model and a statistical dialog model. The voice command recognition apparatus includes a command intention determining unit configured to correct an error in recognizing a voice command of a user, and an application processing unit configured to check whether the final command intention determined in the command intention determining unit includes the input factors for execution of an application.
    Type: Grant
    Filed: September 26, 2011
    Date of Patent: February 10, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Byung-Kwan Kwak, Chi-Youn Park, Jeong-Su Kim, Jeong-Mi Cho
  • Patent number: 8954327
    Abstract: A voice data analyzing device comprises speaker model deriving means, which derives speaker models as models each specifying the character of the voice of each speaker from voice data including a plurality of utterances to each of which a speaker label (information for identifying a speaker) has been assigned, and speaker co-occurrence model deriving means, which derives a speaker co-occurrence model as a model representing the strength of the co-occurrence relationship among the speakers from session data obtained by segmenting the voice data in units of sequences of conversation by use of the speaker models derived by the speaker model deriving means.
    Type: Grant
    Filed: June 3, 2010
    Date of Patent: February 10, 2015
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
  • Publication number: 20150039312
    Abstract: Methods and systems are provided for managing speech dialog of a speech system. In one embodiment, a method includes: receiving information determined from a non-speech related sensor; using the information in a turn-taking function to confirm at least one of if and when a user is speaking; and generating a command to at least one of a speech recognition module and a speech generation module based on the confirmation.
    Type: Application
    Filed: July 31, 2013
    Publication date: February 5, 2015
    Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: ELI TZIRKEL-HANCOCK, JAN H. AASE, ROBERT D. SIMS, III, IGAL BILIK, MOSHE LAIFENFELD
  • Publication number: 20150039313
    Abstract: The illustrative embodiments described herein provide systems and methods for authenticating a speaker. In one embodiment, a method includes receiving reference speech input including a reference passphrase to form a reference recording, and receiving test speech input including a test passphrase to form a test recording. The method includes determining whether the test passphrase matches the reference passphrase, and determining whether one or more voice features of the speaker of the test passphrase matches one or more voice features of the speaker of the reference passphrase. The method authenticates the speaker of the test speech input in response to determining that the reference passphrase matches the test passphrase and that one or more voice features of the speaker of the test passphrase matches one or more voice features of the speaker of the reference passphrase.
    Type: Application
    Filed: June 11, 2014
    Publication date: February 5, 2015
    Applicant: SENAM CONSULTING, INC.
    Inventor: Serge Olegovich Seyfetdinov
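The two-factor check this abstract describes (passphrase match AND voice feature match) can be sketched as below. The feature representation and the tolerance value are illustrative assumptions, not the patent's method.

```python
# Sketch of the abstract's authentication rule: accept the speaker only
# when the test passphrase matches the reference passphrase AND the voice
# features match within a tolerance. Features and tolerance are assumed.

def features_match(ref, test, tolerance=0.1):
    # Accept when every voice feature differs by at most `tolerance`.
    return all(abs(r - t) <= tolerance for r, t in zip(ref, test))

def authenticate(ref_phrase, ref_features, test_phrase, test_features):
    return (ref_phrase == test_phrase
            and features_match(ref_features, test_features))
```

Note that both conditions are required: a correct passphrase spoken by the wrong voice, or the right voice speaking the wrong passphrase, should both fail.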
  • Patent number: 8949126
    Abstract: Methods for creating statistical language models (SLMs) for spoken Completely Automated Turing Tests for Telling Computers and Humans Apart (CAPTCHAs) are disclosed. In these methods, candidate challenge items including one or more words are automatically selected from a document corpus. Selected ones of the challenge items are articulated by a machine text-to-speech (TTS) system as candidate articulations. Those articulations are ranked based on a human listener score indicating whether a candidate articulation originated from a machine. The SLM is then trained to recognize machine TTS articulations according to those rankings, by using a subset of the plurality of candidate challenge items identified as machine articulations as a seed set.
    Type: Grant
    Filed: April 21, 2014
    Date of Patent: February 3, 2015
    Assignee: The John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Patent number: 8942356
    Abstract: A system for detecting three-way calls in a monitored telephone conversation includes a speech recognition processor that transcribes the monitored telephone conversation and associates characteristics of the monitored telephone conversation with a transcript thereof, a database to store the transcript and the characteristics associated therewith, and a three-way call detection processor to analyze the characteristics of the conversation and to detect therefrom the addition of one or more parties to the conversation. The system preferably includes at least one domain-specific language model that the speech recognition processor utilizes to transcribe the conversation. The system may operate in real-time or on previously recorded conversations. A query and retrieval system may be used to retrieve and review call records from the database.
    Type: Grant
    Filed: August 20, 2013
    Date of Patent: January 27, 2015
    Assignee: DSI-ITI, LLC
    Inventor: Andreas M. Olligschlaeger
  • Publication number: 20150025888
    Abstract: A method of enabling speaker identification, the method comprising receiving an identifier having a limited number of potential speakers associated with it, processing speech data received from a speaker, and, when the speaker is recognized, tagging the speaker and displaying the speaker's identity.
    Type: Application
    Filed: October 22, 2013
    Publication date: January 22, 2015
    Applicant: Nuance Communications, Inc.
    Inventor: Robert Douglas Sharp
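The key idea in this abstract is that the identifier narrows the search space before matching. A hedged sketch, where the roster lookup, scoring function, and threshold are all illustrative assumptions:

```python
# Identifier-constrained speaker identification: the received identifier
# (e.g. a conference ID) limits the candidate speakers before matching.
# The roster structure, score, and threshold are stand-in assumptions.

def identify_speaker(identifier, speech_features, rosters, voiceprints,
                     threshold=0.8):
    candidates = rosters.get(identifier, [])
    best, best_score = None, 0.0
    for name in candidates:
        score = match_score(voiceprints[name], speech_features)
        if score > best_score:
            best, best_score = name, score
    # Tag and display the speaker only when recognition is confident.
    return best if best_score >= threshold else None

def match_score(stored, probe):
    # Toy similarity: 1 minus mean absolute difference, floored at 0.
    diffs = [abs(a - b) for a, b in zip(stored, probe)]
    return max(0.0, 1.0 - sum(diffs) / len(diffs))
```

Restricting the comparison to the identifier's roster is what makes identification tractable: the system matches against a handful of known voices rather than an open population.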
  • Publication number: 20150025889
    Abstract: A biometric audio security system comprises providing an input voice audio source. The input audio is enhanced in two or more harmonic and dynamic ranges by re-synthesizing the audio into a full-range PCM wave. A hardware key with a set of audio frequency spikes (identifiers) with varying amplitude and frequency values is provided. The enhanced voice audio input and the key are summed using additive resynthesis. The voice and the spike set are compared against the user's identification signature to verify the user's identity. The set of audio spikes is user specific. The spikes are stored on the protected key device as a template, which plugs into the system. The template is determined by the owner/manufacturer of the system. The spikes are created and identified using the additive synthesis technique with a predetermined number of partials (harmonics). The identifiers include both positive and negative values. The amplitude and frequency values are spaced at very fine intervals.
    Type: Application
    Filed: February 19, 2014
    Publication date: January 22, 2015
    Applicant: Max Sound Corporation
    Inventors: Lloyd Trammell, Lawrence Schwartz
  • Patent number: 8938382
    Abstract: An item of information (212) is transmitted to a distal computer (220), translated to a different sense modality and/or language (222) in substantially real time, and the translation (222) is transmitted back to the location (211) from which the item was sent. The device sending the item is preferably a wireless device, and more preferably a cellular or other telephone (210). The device receiving the translation is also preferably a wireless device, and more preferably a cellular or other telephone, and may advantageously be the same device as the sending device. The item of information (212) preferably comprises a sentence of human speech having at least ten words, and the translation is a written expression of the sentence. All of the steps of transmitting the item of information, executing the program code, and transmitting the translated information preferably occur in less than 60 seconds of elapsed time.
    Type: Grant
    Filed: March 21, 2012
    Date of Patent: January 20, 2015
    Assignee: Ulloa Research Limited Liability Company
    Inventor: Robert D. Fish
  • Patent number: 8938393
    Abstract: A system, method, and computer program product for automatically analyzing multimedia data audio content are disclosed. Embodiments receive multimedia data, detect portions having specified audio features, and output a corresponding subset of the multimedia data and generated metadata. Audio content features including voices, non-voice sounds, and closed captioning, from downloaded or streaming movies or video clips are identified as a human probably would do, but in essentially real time. Particular speakers and the most meaningful content sounds and words and corresponding time-stamps are recognized via database comparison, and may be presented in order of match probability. Embodiments responsively pre-fetch related data, recognize locations, and provide related advertisements. The content features may be also sent to search engines so that further related content may be identified. User feedback and verification may improve the embodiments over time.
    Type: Grant
    Filed: June 28, 2011
    Date of Patent: January 20, 2015
    Assignee: Sony Corporation
    Inventors: Priyan Gunatilake, Djung Nguyen, Abhishek Patil, Dipendu Saha
  • Patent number: 8938388
    Abstract: Maintaining and supplying a plurality of speech models is provided. A plurality of speech models and metadata for each speech model are stored. A query for a speech model, including one or more conditions, is received from a source. The speech model with metadata most closely matching the supplied one or more conditions is determined and provided to the source. A refined speech model is received from the source, and the refined speech model is stored.
    Type: Grant
    Filed: July 9, 2012
    Date of Patent: January 20, 2015
    Assignee: International Business Machines Corporation
    Inventors: Bin Jia, Ying Liu, E. Feng Lu, Jia Wu, Zhen Zhang
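The store-and-query cycle in this abstract maps naturally onto a small metadata-matching store. The sketch below scores candidates by how many query conditions their metadata satisfies; this scoring rule, and the `SpeechModelStore` class itself, are assumptions for illustration.

```python
# Hedged sketch of the abstract's model store: models are stored with
# metadata, and a query returns the model whose metadata most closely
# matches the supplied conditions (here, count of matching keys).

class SpeechModelStore:
    def __init__(self):
        self._entries = []  # list of (model, metadata) pairs

    def store(self, model, metadata):
        self._entries.append((model, metadata))

    def query(self, **conditions):
        # Score each model by how many query conditions its metadata meets,
        # and return the best-matching model.
        def score(entry):
            _, meta = entry
            return sum(1 for k, v in conditions.items() if meta.get(k) == v)
        model, _ = max(self._entries, key=score)
        return model
```

The abstract's "refined model" step would simply be another `store` call with updated metadata, closing the loop between supplying and maintaining models.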
  • Publication number: 20150019221
    Abstract: A speech recognition system includes a server, a data transmission interface and a speech recognition device. The speech recognition device builds a connection with the server through the data transmission interface. The speech recognition device includes a microphone, an output unit and a processing unit. The processing unit transmits received user information to the server through the data transmission interface to obtain a corresponding personal dictionary file. The personal dictionary file is generated according to the history of speech recognition results and related data recently used by others. The processing unit receives a voice signal to be recognized through the microphone and converts it into a digital characteristic file according to a voiceprint file of the user. The processing unit searches the personal dictionary file according to the digital characteristic file to obtain a speech recognition result for output through the output unit.
    Type: Application
    Filed: November 4, 2013
    Publication date: January 15, 2015
    Applicant: CHUNGHWA PICTURE TUBES, LTD.
    Inventors: Guan-Liang LEE, Chih-Yin CHIANG, Che-Wei CHANG
  • Publication number: 20150019222
    Abstract: A method for using voiceprint identification to operate voice recognition and electronic device thereof are provided. The method includes the following steps: receiving a specific voice fragment; cutting the received specific voice fragment into a plurality of specific sub-voice clips; performing a voiceprint identification flow to the specific sub-voice clips, respectively; determining whether each of the specific sub-voice clips is an appropriate sub-voice clip according to a result of the voiceprint identification flow; and capturing the appropriate sub-voice clips and operating a voice recognition thereto.
    Type: Application
    Filed: April 9, 2014
    Publication date: January 15, 2015
    Applicant: VIA TECHNOLOGIES, INC.
    Inventor: Guo-Feng Zhang
  • Publication number: 20150019224
    Abstract: A voice synthesis device according to the present invention regularly recognizes the contents of an utterance made by a passenger or the like, and uses a facility name or the like included in the utterance contents to specify the unabbreviated word corresponding to an abbreviation in that facility name. The voice synthesis device can therefore read the abbreviation out loud using a reading method familiar to and appropriate for the passenger, without forcing the passenger to perform a burdensome operation such as registering the unabbreviated word corresponding to the abbreviation.
    Type: Application
    Filed: May 2, 2012
    Publication date: January 15, 2015
    Applicant: MITSUBISHI ELECTRIC CORPORATION
    Inventors: Masanobu Osawa, Tomohiro Iwasaki
  • Publication number: 20150019223
    Abstract: Provided is a method for triggering an action on a second device. It comprises the steps of obtaining audio of a multimedia content presented on a first device; comparing the obtained audio with reference audio data in a database; if the obtained audio is found in the database, determining an action corresponding to the matched reference audio; and triggering the action on the second device.
    Type: Application
    Filed: December 31, 2011
    Publication date: January 15, 2015
    Inventors: Jianfeng Chen, Xiaojun Ma, Zhigang Zhang
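The second-screen flow in this abstract reduces to a fingerprint lookup: hash the captured audio, match it against reference entries, and fire the associated action. The coarse quantization used as a "fingerprint" below is a deliberate stand-in for a real acoustic hash, and the database shape is an assumption.

```python
# Sketch of the second-screen flow: audio captured from the first device
# is fingerprinted, matched against reference audio, and the matching
# entry's action is triggered on the second device. The fingerprint
# (coarse quantization) is a stand-in for a real acoustic hash.

def fingerprint(samples):
    # Quantize each sample to one decimal place as a toy fingerprint.
    return tuple(round(s, 1) for s in samples)

def trigger_action(captured_audio, reference_db):
    # reference_db maps fingerprints of reference audio to actions.
    key = fingerprint(captured_audio)
    action = reference_db.get(key)
    if action is not None:
        return action()          # trigger the action on the second device
    return None                  # no reference matched
```

The quantization step matters: the microphone capture never matches the reference samples exactly, so the fingerprint must map near-identical audio to the same key.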
  • Patent number: 8934874
    Abstract: A system for transmitting voice messages from a caller location to a receiver location using a plurality of computers each coupled to another through a network such as the Internet. The system also has a plurality of handheld portable recording-delivery devices which are coupled to the network. Each handheld portable recording-delivery device can convert voice input into digital data for transmission through the network. Destination information for the digital data being transmitted is generated using speech recognition of voice input.
    Type: Grant
    Filed: July 28, 2014
    Date of Patent: January 13, 2015
    Assignee: Stage Innovations, Inc.
    Inventors: George Krucik, Taka Migimatsu
  • Patent number: 8935169
    Abstract: According to one embodiment, an electronic apparatus includes an acquiring module and a display process module. The acquiring module is configured to acquire information regarding a plurality of persons using information of video content data, the plurality of persons appearing in a plurality of sections in the video content data. The display process module is configured to display (i) a time bar representative of a sequence of the video content data, (ii) information regarding a first person appearing in a first section of the sections, and (iii) information regarding a second person different from the first person, the second person appearing in a second section of the sections. A first area of the time bar, corresponding to the first section, is displayed in a first form, and a second area of the time bar, corresponding to the second section, is displayed in a second form different from the first form.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: January 13, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Tetsuya Fujii
  • Patent number: 8935168
    Abstract: A state detecting device includes an input unit that receives an input voice sound; an analyzer that calculates a feature parameter of each of a plurality of frames extracted from the voice sound; a calculator that calculates the average of the feature parameters of the frames, determines a threshold on the basis of the average and statistical data representing relationships between other averages of other feature parameters obtained from a plurality of speakers and cumulative frequencies of the other feature parameters, and calculates an appearance frequency of a frame that is among the plurality of frames and whose feature parameter is larger than the threshold; a determining unit that determines, on the basis of the appearance frequency, a strained state of a vocal cord that has made the voice sound; and an output unit that outputs a result of the determination.
    Type: Grant
    Filed: January 23, 2012
    Date of Patent: January 13, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
  • Patent number: 8935166
    Abstract: Some embodiments disclosed herein store a target application and a dictation application. The target application may be configured to receive input from a user. The dictation application interface may include a full overlay mode option, where in response to selection of the full overlay mode option, the dictation application interface is automatically sized and positioned over the target application interface to fully cover a text area of the target application interface to appear as if the dictation application interface is part of the target application interface. The dictation application may be further configured to receive an audio dictation from the user, convert the audio dictation into text, provide the text in the dictation application interface and in response to receiving a first user command to complete the dictation, automatically copy the text from the dictation application interface and inserting the text into the target application interface.
    Type: Grant
    Filed: October 16, 2013
    Date of Patent: January 13, 2015
    Assignee: Dolbey & Company, Inc.
    Inventors: Curtis A. Weeks, Aaron G. Weeks, Stephen E. Barton
  • Publication number: 20150012276
    Abstract: A link table is generated, voice information is associated by dot patterns, and then, voice information associated with the dot pattern is reproduced from a speaker when the dot pattern is read by means of a scanner. In this manner, the dot pattern is printed on a surface of a material such as a picture book or a card, making it possible to play back voice information corresponding to a pattern or a story of a picture book and to play back voice information corresponding to a character described on the card. In addition, by means of a link table, new voice information can be associated with, dissociated from, or changed to, a new dot pattern.
    Type: Application
    Filed: September 19, 2014
    Publication date: January 8, 2015
    Inventor: Kenji Yoshida
  • Patent number: 8929866
    Abstract: A system for transmitting voice messages from a caller location to a receiver location using a plurality of computers each coupled to another through a network such as the Internet. The system also has a plurality of handheld portable recording-delivery devices which are coupled to the network. Each handheld portable recording-delivery device can convert voice input into digital data for transmission through the network. Destination information for the digital data being transmitted is based on input from a touch pad on the handheld portable recording-delivery device.
    Type: Grant
    Filed: July 28, 2014
    Date of Patent: January 6, 2015
    Assignee: Stage Innovations, Inc.
    Inventors: George Krucik, Taka Migimatsu
  • Patent number: 8930191
    Abstract: Methods, systems, and computer readable storage medium related to operating an intelligent digital assistant are disclosed. A user request is received, the user request including at least a speech input received from a user. In response to the user request, (1) an echo of the speech input based on a textual interpretation of the speech input, and (2) a paraphrase of the user request based at least in part on a respective semantic interpretation of the speech input are presented to the user.
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: January 6, 2015
    Assignee: Apple Inc.
    Inventors: Thomas Robert Gruber, Harry Joseph Saddler, Adam John Cheyer, Dag Kittlaus, Christopher Dean Brigham, Richard Donald Giuli, Didier Rene Guzzoni, Marcello Bastea-Forte
  • Patent number: 8930187
    Abstract: An apparatus for utilizing textual data and acoustic data corresponding to speech data to detect sentiment may include a processor and memory storing executable computer code causing the apparatus to at least perform operations including evaluating textual data and acoustic data corresponding to voice data associated with captured speech content. The computer program code may further cause the apparatus to analyze the textual data and the acoustic data to detect whether the textual data or the acoustic data includes one or more words indicating at least one sentiment of a user that spoke the speech content. The computer program code may further cause the apparatus to assign at least one predefined sentiment to at least one of the words in response to detecting that the word(s) indicates the sentiment of the user. Corresponding methods and computer program products are also provided.
    Type: Grant
    Filed: January 3, 2012
    Date of Patent: January 6, 2015
    Assignee: Nokia Corporation
    Inventors: Imre Attila Kiss, Joseph Polifroni, Francois Mairesse, Mark Adler
  • Patent number: 8924197
    Abstract: Disclosed are systems, methods, and computer readable media for converting a natural language query into a logical query. The method embodiment comprises receiving a natural language query and converting the natural language query using an extensible engine to generate a logical query, the extensible engine being linked to the toolkit and knowledge base. In one embodiment, a natural language query can be processed in a domain independent method to generate a logical query.
    Type: Grant
    Filed: October 30, 2007
    Date of Patent: December 30, 2014
    Assignee: Semantifi, Inc.
    Inventors: Sreenivasa Rao Pragada, Viswanath Dasari, Abhijit A Patil
  • Patent number: 8924214
    Abstract: A method for detecting and recognizing speech is provided that remotely detects body motions from a speaker during vocalization with one or more radar sensors. Specifically, the radar sensors include a transmit aperture that transmits one or more waveforms towards the speaker, and each of the waveforms has a distinct wavelength. A receiver aperture is configured to receive the scattered radio frequency energy from the speaker. Doppler signals correlated with the speaker vocalization are extracted with a receiver. Digital signal processors are configured to develop feature vectors utilizing the vocalization Doppler signals, and words associated with the feature vectors are recognized with a word classifier.
    Type: Grant
    Filed: June 7, 2011
    Date of Patent: December 30, 2014
    Assignee: The United States of America, as represented by the Secretary of the Navy
    Inventors: Jefferson M Willey, Todd Stephenson, Hugh Faust, James P. Hansen, George J Linde, Carol Chang, Justin Nevitt, James A Ballas, Thomas Herne Crystal, Vincent Michael Stanford, Jean W. De Graaf
  • Publication number: 20140379344
    Abstract: Method and apparatus for a user to access a systems interface to back-end legacy systems using voice inputs. Generally, a user such as a technician accesses a systems interface to legacy systems via a front-end voice server. The user dials in to the voice server using a portable access device. Preferably, the portable access device is a cellular phone. Preferably, the voice recognition server performs voice authentication, speech recognition, and speech synthesis. The voice server authenticates the user based on a voice exemplar provided by the user. Using speech synthesis, the voice server provides a menu of operations from which the user can select. By speaking into the access device, the user selects an operation and provides any additional data needed for the operation. Using speech recognition, the voice server prepares a user request based on the spoken user input. The user request is forwarded to the systems interface to the legacy systems.
    Type: Application
    Filed: September 8, 2014
    Publication date: December 25, 2014
    Inventors: Steven G. Smith, Ralph J. Mills, Roland T. Morton, Mitchell E. Davis
  • Publication number: 20140379339
    Abstract: Methods, systems, computer-readable media, and apparatuses for selecting authentication questions based on a voice biometric confidence score are presented. In some embodiments, a computing device may receive a voice sample. Subsequently, the computing device may determine a voice biometric confidence score based on the voice sample. The computing device then may select one or more authentication questions based on the voice biometric confidence score.
    Type: Application
    Filed: June 20, 2013
    Publication date: December 25, 2014
    Inventors: Joseph Timem, Donald Perry, Jenny Rosenberger, David Karpey
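The core idea in the abstract above, picking authentication questions from a voice biometric confidence score, can be sketched as a simple threshold policy. The thresholds, difficulty tiers, and question bank below are illustrative assumptions, not details from the publication:

```python
def select_auth_questions(confidence, question_bank):
    """Pick authentication questions based on a voice biometric
    confidence score in [0.0, 1.0]; thresholds are illustrative."""
    if confidence >= 0.9:
        # High confidence: a single low-difficulty question suffices.
        needed = [("easy", 1)]
    elif confidence >= 0.6:
        needed = [("easy", 1), ("medium", 1)]
    else:
        # Low confidence: require more, harder knowledge-based questions.
        needed = [("medium", 1), ("hard", 2)]
    selected = []
    for difficulty, count in needed:
        selected.extend(question_bank[difficulty][:count])
    return selected

bank = {
    "easy": ["What is your ZIP code?"],
    "medium": ["What was your last transaction amount?"],
    "hard": ["Which branch opened your account?", "What is your loan number?"],
}
print(select_auth_questions(0.95, bank))   # one easy question
print(len(select_auth_questions(0.4, bank)))  # three harder questions
```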
  • Publication number: 20140379340
    Abstract: Methods, systems, computer-readable media, and apparatuses for utilizing voice biometrics to prevent unauthorized access are presented. In some embodiments, a computing device may receive a voice sample. Subsequently, the computing device may determine a voice biometric confidence score based on the voice sample. The computing device then may evaluate the voice biometric confidence score in combination with one or more other factors to identify an attempt to access an account without authorization.
    Type: Application
    Filed: June 20, 2013
    Publication date: December 25, 2014
    Inventors: Joseph Timem, Donald Perry, Jenny Rosenberger, David Karpey
  • Publication number: 20140379343
    Abstract: A method and apparatus that filters audio data received from a speaking person using a filter specific to that speaker. The audio characteristics of the speaker's voice may be collected and the speaker-specific filter formed to reduce noise while also enhancing voice quality. For instance, if a speaker's voice does not contain certain frequencies, the filter may cancel noise at those frequencies outright, easing noise cancellation and avoiding processing of parts of the sound spectrum that need no cleaning. Additionally, the strongest frequencies of the speaker's voice may be identified from the collected audio characteristics, and those spectral regions filtered with finer granularity, yielding a speaker-specific filter that enhances the voice quality of the speaker's voice data transmitted or output by a communication device. The audio data may also be output based upon a user's predefined hearing spectrum.
    Type: Application
    Filed: November 20, 2012
    Publication date: December 25, 2014
    Applicant: Unify GmbH & Co. KG
    Inventors: Bizhan Karimi-Cherkandi, Farrokh Mohammadzadeh Kouchri, Schah Walli Ali
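The abstract above describes zeroing out frequencies a speaker never uses. A minimal sketch of that idea, assuming per-frequency-bin energies collected during enrollment (the threshold and bin layout are hypothetical, not from the publication):

```python
def build_speaker_mask(speaker_energy, threshold=0.05):
    """Given per-frequency-bin average energy for a speaker's voice
    (collected during enrollment), return a 0/1 mask that zeroes out
    bins the speaker never uses, so noise there can be cancelled
    outright instead of processed."""
    peak = max(speaker_energy)
    return [1.0 if e >= threshold * peak else 0.0 for e in speaker_energy]

def apply_mask(frame_spectrum, mask):
    """Apply the speaker-specific mask to one spectral frame."""
    return [s * m for s, m in zip(frame_spectrum, mask)]

# Enrollment: this speaker has energy only in the middle bins.
profile = [0.0, 0.01, 0.8, 1.0, 0.6, 0.02, 0.0]
mask = build_speaker_mask(profile)
noisy = [0.3, 0.3, 0.9, 1.1, 0.7, 0.3, 0.3]  # broadband noise added
print(apply_mask(noisy, mask))  # noise outside the voice bins is removed
```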
  • Publication number: 20140379342
    Abstract: Embodiments of the invention are directed to systems and methods for voice filtering. In some embodiments, an original voice segment from a user may be received. The received original voice segment may be modified using a first predetermined algorithm. The modified voice segment may be sent to an authentication server. At the authentication server, the modified voice segment may be reconstructed into the original voice segment using a second predetermined algorithm. The user may be authenticated for a transaction based at least in part on the reconstructed original voice segment.
    Type: Application
    Filed: June 25, 2014
    Publication date: December 25, 2014
    Inventors: Shaw Li, Dhiraj Sharda, Douglas Fisher
  • Publication number: 20140379338
    Abstract: In a conditional multipass automatic speech recognition system, one or more intent templates may be received from an application. A spoken utterance is received and audio frames are generated from the utterance. The audio frames are compared to a first grammar. Recognized speech results are generated and unrecognized audio frames or low confidence frames are collected. One of one or more intent templates and one or more corresponding intent parameters may be determined based on the recognized speech results. The unrecognized audio frames may be conditionally compared to a second grammar in instances when additional information is requested, relative to the determined intent template or the corresponding intent parameters.
    Type: Application
    Filed: June 20, 2013
    Publication date: December 25, 2014
    Inventor: Darrin Kenneth John Fry
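The conditional multipass flow described above (first-pass grammar, collect low-confidence frames, consult a second grammar only when the intent template still needs parameters) can be sketched as follows. The grammars are toy lookup tables and the frame identifiers are placeholders, not the system's actual representation:

```python
def first_pass(frames, grammar, threshold=0.7):
    """Match each audio frame against the first grammar; collect
    frames whose best score falls below the confidence threshold."""
    recognized, low_confidence = [], []
    for frame in frames:
        word, score = grammar.get(frame, (None, 0.0))
        if score >= threshold:
            recognized.append(word)
        else:
            low_confidence.append(frame)
    return recognized, low_confidence

def conditional_second_pass(low_confidence, second_grammar, need_more):
    """Only consult the larger second grammar when the selected
    intent template still lacks required parameters."""
    if not need_more:
        return []
    return [second_grammar[f] for f in low_confidence if f in second_grammar]

grammar1 = {"f1": ("call", 0.9), "f2": ("mom", 0.4)}
grammar2 = {"f2": "mom"}
words, leftovers = first_pass(["f1", "f2"], grammar1)
intent_needs_callee = True  # the "call <contact>" template lacks its parameter
extra = conditional_second_pass(leftovers, grammar2, intent_needs_callee)
print(words + extra)  # ['call', 'mom']
```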
  • Publication number: 20140379341
    Abstract: The present disclosure relates to a portable terminal, and more particularly, to a portable terminal and a method of detecting a gesture and controlling a function. A method of controlling a function of a portable terminal includes: detecting a gesture; activating a voice recognition module in response to the detected gesture; and analyzing a voice input into the activated voice recognition module, and executing a function corresponding to the input voice.
    Type: Application
    Filed: June 20, 2014
    Publication date: December 25, 2014
    Inventors: Ho-Seong Seo, Shi-Yun Cho
  • Patent number: 8918321
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for enhancing speech recognition accuracy. The method includes dividing a system dialog turn into segments based on timing of probable user responses, generating a weighted grammar for each segment, exclusively activating the weighted grammar generated for a current segment of the dialog turn during the current segment of the dialog turn, and recognizing user speech received during the current segment using the activated weighted grammar generated for the current segment. The method can further include assigning probability to the weighted grammar based on historical user responses and activating each weighted grammar is based on the assigned probability. Weighted grammars can be generated based on a user profile. A weighted grammar can be generated for two or more segments.
    Type: Grant
    Filed: April 13, 2012
    Date of Patent: December 23, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Michael Czahor
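The abstract above assigns probabilities to a segment's grammar from historical user responses. A minimal sketch of that weighting step, with a toy grammar and history (not from the patent):

```python
from collections import Counter

def weight_grammar(phrases, history):
    """Assign each grammar phrase a probability weight from how often
    users historically said it during this dialog segment."""
    counts = Counter(h for h in history if h in phrases)
    total = sum(counts.values()) or 1
    return {p: counts[p] / total for p in phrases}

segment_grammar = ["yes", "no", "repeat"]
history = ["yes", "yes", "no", "yes"]
weights = weight_grammar(segment_grammar, history)
print(weights)  # {'yes': 0.75, 'no': 0.25, 'repeat': 0.0}
```

During the corresponding segment of the dialog turn, only this weighted grammar would be active, biasing recognition toward historically likely responses.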
  • Patent number: 8918320
    Abstract: An apparatus for generating a review based in part on detected sentiment may include a processor and memory storing executable computer code causing the apparatus to at least perform operations including determining a location(s) of the apparatus and a time(s) that the location(s) was determined responsive to capturing voice data of speech content associated with spoken reviews of entities. The computer program code may further cause the apparatus to analyze textual and acoustic data corresponding to the voice data to detect whether the textual or acoustic data includes words indicating a sentiment(s) of a user speaking the speech content. The computer program code may further cause the apparatus to generate a review of an entity corresponding to a spoken review(s) based on assigning a predefined sentiment to a word(s) responsive to detecting that the word indicates the sentiment of the user. Corresponding methods and computer program products are also provided.
    Type: Grant
    Filed: January 3, 2012
    Date of Patent: December 23, 2014
    Assignee: Nokia Corporation
    Inventors: Mark Adler, Imre Attila Kiss, Francois Mairesse, Joseph Polifroni
  • Patent number: 8918316
    Abstract: The content of a media program is recognized by analyzing its audio content to extract therefrom prescribed features, which are compared to a database of features associated with identified content. The identity of the content within the database that has features that most closely match the features of the media program being played is supplied as the identity of the program being played. The features are extracted from a frequency domain version of the media program by a) filtering the coefficients to reduce their number, e.g., using triangular filters; b) grouping a number of consecutive outputs of triangular filters into segments; and c) selecting those segments that meet prescribed criteria, such as those segments that have the largest minimum segment energy with prescribed constraints that prevent the segments from being too close to each other. The triangular filters may be log-spaced and their output may be normalized.
    Type: Grant
    Filed: July 29, 2003
    Date of Patent: December 23, 2014
    Assignee: Alcatel Lucent
    Inventors: Jan I. Ben, Christopher J. Burges, Madjid Sam Mousavi, Craig R. Nohl
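The segment-selection criterion in the abstract above, picking segments with the largest minimum energy while keeping them apart, can be sketched as a greedy pass over candidate windows. Segment length, count, and the gap constraint are illustrative parameters:

```python
def select_segments(energies, seg_len, num_segments, min_gap):
    """Score each candidate segment by its minimum per-frame energy,
    then greedily pick the highest-scoring segments whose start
    positions are at least `min_gap` frames apart."""
    candidates = []
    for start in range(len(energies) - seg_len + 1):
        score = min(energies[start:start + seg_len])
        candidates.append((score, start))
    candidates.sort(reverse=True)  # best minimum-energy segments first
    chosen = []
    for score, start in candidates:
        if all(abs(start - s) >= min_gap for s in chosen):
            chosen.append(start)
        if len(chosen) == num_segments:
            break
    return sorted(chosen)

energies = [1, 5, 6, 7, 2, 1, 8, 9, 9, 1]
print(select_segments(energies, seg_len=2, num_segments=2, min_gap=3))
```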
  • Patent number: 8917853
    Abstract: A method and system for enhancing problem resolution at a call center based on speech recognition of a caller includes, receiving an incoming call and generating call data based on speech recognition of the incoming call using a computer. The method generates and associates annotated metadata about the call data. A historical record is created which includes the call data and the annotated metadata. The historical record may be stored in a storage medium communicating with the computer. Context data is generated for the incoming call by analyzing the historical record to identify: a caller, a topic, a date and a stress level of the caller. The method compares the context data to historical records of previous calls. A topic probabilities analysis is conducted by comparing the context data to the historical records of previous calls, and a solution is determined for the topic based on the probabilities analysis.
    Type: Grant
    Filed: June 19, 2012
    Date of Patent: December 23, 2014
    Assignee: International Business Machines Corporation
    Inventors: Vijay Dheap, Nicholas E. Poore, Lee M. Surprenant, Michael D. Whitley
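The topic probabilities analysis mentioned above, comparing a call's context data against historical records, could be approximated by a naive frequency model. The record structure and word-overlap scoring here are assumptions for illustration only:

```python
from collections import Counter

def topic_probabilities(context_words, historical_calls):
    """Estimate P(topic) for the current call by counting word overlap
    between its context data and each previous call's record."""
    matches = Counter()
    for call in historical_calls:
        overlap = len(context_words & call["words"])
        if overlap:
            matches[call["topic"]] += overlap
    total = sum(matches.values()) or 1
    return {topic: n / total for topic, n in matches.items()}

history = [
    {"topic": "billing", "words": {"invoice", "charge", "refund"}},
    {"topic": "outage", "words": {"down", "offline", "router"}},
]
context = {"charge", "refund", "router"}
print(topic_probabilities(context, history))  # billing is most probable
```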
  • Patent number: 8918406
    Abstract: A method of processing content files may include receiving the content file, employing processing circuitry to determine an identity score of a source of at least a portion of the content file, to determine a word score for the content file and to determine a metadata score for the content file, determining a composite priority score based on the identity score, the word score and the metadata score, and associating the composite priority score with the content file for electronic provision of the content file together with the composite priority score to a human analyst.
    Type: Grant
    Filed: December 14, 2012
    Date of Patent: December 23, 2014
    Assignee: Second Wind Consulting LLC
    Inventor: Donna Rober
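The composite priority score above combines three per-file scores into one ranking value for the analyst. One plausible combination is a weighted sum; the weights below are illustrative, not specified by the patent:

```python
def composite_priority(identity_score, word_score, metadata_score,
                       weights=(0.5, 0.3, 0.2)):
    """Combine identity, word, and metadata scores into one composite
    priority for the analyst queue (weights are illustrative)."""
    wi, ww, wm = weights
    return wi * identity_score + ww * word_score + wm * metadata_score

files = {"a.wav": (0.9, 0.2, 0.5), "b.wav": (0.1, 0.9, 0.9)}
ranked = sorted(files, key=lambda f: composite_priority(*files[f]), reverse=True)
print(ranked)  # files presented to the analyst in priority order
```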
  • Publication number: 20140372119
    Abstract: In general, the subject matter described in this specification can be embodied in methods, systems, and program products for performing compounded text segmentation. Compounded text that is extracted from one or more search queries submitted to a search engine is received. The compounded text includes a plurality of individual words that are joined together without intervening spaces. An electronic dictionary including words is accessed. A data structure representing possible segmentations of the compounded text is generated based on whether words in the possible segmentations occur in the electronic dictionary. A data store comprising data associated with a same field of usage as the compounded text is accessed to determine a frequency of occurrence for possible segmentations of the data structure. A segmentation of the compounded text that is most probable based on the data is determined. A language model is trained using the determined segmentation of the compounded text.
    Type: Application
    Filed: September 28, 2009
    Publication date: December 18, 2014
    Inventors: Carolina Parada, Boulos Harb, Johan Schalkwyk
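The segmentation step above, choosing the most probable split of compounded query text given a dictionary and usage frequencies, is a classic dynamic program over split points. The frequency values below are made up for illustration:

```python
def best_segmentation(compound, freq):
    """Dynamic program over split points: best[i] holds the
    highest-scoring segmentation of compound[:i], scoring each word
    by its corpus frequency (out-of-dictionary words are disallowed)."""
    n = len(compound)
    best = [None] * (n + 1)
    best[0] = (0.0, [])
    for i in range(1, n + 1):
        for j in range(i):
            word = compound[j:i]
            if best[j] is not None and word in freq:
                score = best[j][0] + freq[word]
                if best[i] is None or score > best[i][0]:
                    best[i] = (score, best[j][1] + [word])
    return best[n][1] if best[n] else None

freq = {"funny": 10.0, "pictures": 9.0, "pic": 2.0, "tures": 0.1}
print(best_segmentation("funnypictures", freq))  # ['funny', 'pictures']
```

The winning segmentation could then feed language-model training, as the abstract describes.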
  • Patent number: 8913720
    Abstract: A method includes receiving a communication from a party at a voice response system and capturing verbal communication spoken by the party. Then a processor creates a voice model associated with the party, the voice model being created by processing the captured verbal communication spoken by the party. The creation of the voice model is imperceptible to the party. The voice model is then stored to provide voice verification of the party during a subsequent communication.
    Type: Grant
    Filed: February 14, 2013
    Date of Patent: December 16, 2014
    Assignee: AT&T Intellectual Property, L.P.
    Inventor: Mazin Gilbert
  • Patent number: 8914290
    Abstract: Method and apparatus that dynamically adjusts operational parameters of a text-to-speech engine in a speech-based system. A voice engine or other application of a device provides a mechanism to alter the adjustable operational parameters of the text-to-speech engine. In response to one or more environmental conditions, the adjustable operational parameters of the text-to-speech engine are modified to increase the intelligibility of synthesized speech.
    Type: Grant
    Filed: May 18, 2012
    Date of Patent: December 16, 2014
    Assignee: Vocollect, Inc.
    Inventors: James Hendrickson, Debra Drylie Scott, Duane Littleton, John Pecorari, Arkadiusz Slusarczyk
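The environment-driven parameter adjustment described above might look like the following sketch, where ambient noise level drives volume and speaking rate. The parameter names, thresholds, and ranges are hypothetical, not the engine's actual API:

```python
def adjust_tts(noise_db, params):
    """Adjust TTS engine parameters in response to ambient noise to
    improve intelligibility (names and thresholds are hypothetical)."""
    adjusted = dict(params)
    if noise_db > 80:
        # Loud environment: speak louder and slower.
        adjusted["volume"] = min(params["volume"] + 2, 10)
        adjusted["rate"] = max(params["rate"] - 20, 80)
    elif noise_db < 50:
        # Quiet environment: back the volume off.
        adjusted["volume"] = max(params["volume"] - 1, 1)
    return adjusted

defaults = {"volume": 5, "rate": 160}
print(adjust_tts(85, defaults))  # {'volume': 7, 'rate': 140}
```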
  • Patent number: 8914287
    Abstract: One embodiment may take the form of a voice control system. The system may include a first apparatus with a processing unit configured to execute a voice recognition module and one or more executable commands, and a receiver coupled to the processing unit and configured to receive a first audio file from a remote control device. The first audio file may include at least one voice command. The first apparatus may further include a communication component coupled to the processing unit and configured to receive programming content, and one or more storage media storing the voice recognition module. The voice recognition module may be configured to convert voice commands into text.
    Type: Grant
    Filed: January 28, 2011
    Date of Patent: December 16, 2014
    Assignee: EchoStar Technologies L.L.C.
    Inventors: Jeremy Mickelsen, Nathan A. Hale, Benjamin Mauser, David A. Innes, Brad Bylund
  • Patent number: 8909516
    Abstract: Computing functionality converts an input linguistic item into a normalized linguistic item, representing a normalized counterpart of the input linguistic item. In one environment, the input linguistic item corresponds to a complaint by a person receiving medical care, and the normalized linguistic item corresponds to a definitive and error-free version of that complaint. In operation, the computing functionality uses plural reference resources to expand the input linguistic item, creating an expanded linguistic item. The computing functionality then forms a graph based on candidate tokens that appear in the expanded linguistic item, and then finds a shortest path through the graph; that path corresponds to the normalized linguistic item. The computing functionality may use a statistical language model to assign weights to edges in the graph, and to determine whether the normalized linguistic item incorporates two or more component linguistic items.
    Type: Grant
    Filed: December 7, 2011
    Date of Patent: December 9, 2014
    Assignee: Microsoft Corporation
    Inventors: Julie A. Medero, Daniel S. Morris, Lucretia H. Vanderwende, Michael Gamon
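The shortest-path normalization above can be sketched with a small search over a token graph, where edge weights stand in for the statistical language model scores. The candidate tokens and bigram costs below are toy values for illustration:

```python
import heapq

def best_normalization(candidates, edge_cost):
    """Find the lowest-cost left-to-right path through candidate tokens
    at each position; edge_cost(prev, nxt) plays the role of the
    language-model edge weight from the abstract."""
    heap = [(0.0, 0, tok, [tok]) for tok in candidates[0]]
    heapq.heapify(heap)
    while heap:
        cost, pos, tok, path = heapq.heappop(heap)
        if pos == len(candidates) - 1:
            return path  # cheapest complete path = normalized item
        for nxt in candidates[pos + 1]:
            heapq.heappush(
                heap, (cost + edge_cost(tok, nxt), pos + 1, nxt, path + [nxt]))
    return None

# Toy bigram costs: lower means the language model prefers the pair.
bigram = {("chest", "pain"): 0.1, ("chess", "pain"): 5.0,
          ("chest", "pane"): 4.0, ("chess", "pane"): 6.0}
candidates = [["chest", "chess"], ["pain", "pane"]]
print(best_normalization(candidates, lambda a, b: bigram[(a, b)]))
```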