Patents Examined by Jonathan Kim
  • Patent number: 9799338
    Abstract: Systems and methods providing for secure voice print authentication over a network are disclosed herein. During an enrollment stage, a client's voice is recorded and characteristics of the recording are used to create and store a voice print. When an enrolled client seeks access to secure information over a network, a sample voice recording is created. The sample voice recording is compared to at least one voice print. If a match is found, the client is authenticated and granted access to secure information. Systems and methods providing for a dual use voice analysis system are disclosed herein. Speech recognition is achieved by comparing characteristics of words spoken by a speaker to one or more templates of human language words. Speaker identification is achieved by comparing characteristics of a speaker's speech to one or more templates, or voice prints. The system is adapted to increase or decrease matching constraints depending on whether speaker identification or speech recognition is desired.
    Type: Grant
    Filed: April 28, 2014
    Date of Patent: October 24, 2017
    Assignee: Voicelt Technology
    Inventor: Noel Grover
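    The enroll-then-match flow above can be sketched with a toy matcher: compare a sample's feature vector against stored voice prints by cosine similarity and accept the best match above a threshold. The feature representation, the threshold value, and the `authenticate` helper are illustrative assumptions, not the patented method.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def authenticate(sample_features, enrolled_prints, threshold=0.9):
    """Return the ID of the best-matching enrolled voice print, or None."""
    best_id, best_score = None, threshold
    for client_id, print_features in enrolled_prints.items():
        score = cosine_similarity(sample_features, print_features)
        if score >= best_score:
            best_id, best_score = client_id, score
    return best_id
```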
  • Patent number: 9754608
    Abstract: A noise estimation apparatus which estimates a non-stationary noise component on the basis of the likelihood maximization criterion is provided. The noise estimation apparatus obtains the variance of a noise signal that causes a large value to be obtained by weighted addition of the sums each of which is obtained by adding the product of the log likelihood of a model of an observed signal expressed by a Gaussian distribution in a speech segment and a speech posterior probability in each frame, and the product of the log likelihood of a model of an observed signal expressed by a Gaussian distribution in a non-speech segment and a non-speech posterior probability in each frame, by using complex spectra of a plurality of observed signals up to the current frame.
    Type: Grant
    Filed: January 30, 2013
    Date of Patent: September 5, 2017
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Mehrez Souden, Keisuke Kinoshita, Tomohiro Nakatani, Marc Delcroix, Takuya Yoshioka
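    A heavily simplified sketch of the posterior-weighted idea: average observed frame power under the non-speech posterior to produce a noise-power estimate over the frames seen so far. The actual patent maximizes a likelihood over complex spectra of multiple observed signals; this scalar version and its names are assumptions.

```python
def estimate_noise_variance(frame_powers, nonspeech_posteriors):
    """Posterior-weighted noise power estimate.

    frame_powers: |X(t)|^2 per frame up to the current frame.
    nonspeech_posteriors: P(non-speech | frame t) per frame.
    """
    num = sum(p * x for p, x in zip(nonspeech_posteriors, frame_powers))
    den = sum(nonspeech_posteriors)
    return num / den if den > 0 else 0.0
```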
  • Patent number: 9741337
    Abstract: In some embodiments, the present invention provides for an exemplary computer system which includes at least the following components: an adaptive self-trained computer engine programmed, during a training stage, to electronically receive an initial speech audio data generated by a microphone of a computing device; dynamically segment the initial speech audio data and the corresponding initial text into a plurality of user phonemes; dynamically associate a plurality of first timestamps with the plurality of user-specific subject-specific phonemes; and, during a transcription stage, electronically receive to-be-transcribed speech audio data of at least one user; dynamically split the to-be-transcribed speech audio data into a plurality of to-be-transcribed speech audio segments; dynamically assign each timestamped to-be-transcribed speech audio segment to a particular core of the multi-core processor; and dynamically transcribe, in parallel, the plurality of timestamped to-be-transcribed speech audio segments.
    Type: Grant
    Filed: April 3, 2017
    Date of Patent: August 22, 2017
    Assignee: Green Key Technologies LLC
    Inventors: Tejas Shastry, Anthony Tassone, Patrick Kuca, Svyatoslav Vergun
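    The split-assign-transcribe-in-parallel stage can be sketched with a worker pool: each timestamped segment is handed to a worker, and results are reassembled in timestamp order. The `transcribe_segment` body is a placeholder (real ASR is assumed), and a thread pool stands in for per-core assignment.

```python
from concurrent.futures import ThreadPoolExecutor

def transcribe_segment(segment):
    """Placeholder per-segment recognizer (hypothetical stand-in for ASR)."""
    timestamp, audio = segment
    return timestamp, audio.upper()  # fake "transcript"

def transcribe_parallel(segments, workers=4):
    """Fan segments out across workers, then reassemble by timestamp."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(transcribe_segment, segments))
    return [text for _, text in sorted(results)]
```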
  • Patent number: 9734845
    Abstract: In a speech-based system, a wake word or other trigger expression is used to preface user speech that is intended as a command. The system receives multiple directional audio signals, each of which emphasizes sound from a different direction. The signals are monitored and analyzed to detect the directions of interfering audio sources such as televisions or other types of electronic audio players. One of the directional signals having the strongest presence of speech is selected to be monitored for the trigger expression. If the directional signal corresponds to the direction of an interfering audio source, a more strict standard is used to detect the trigger expression. In addition, the directional audio signal having the second strongest presence of speech may also be monitored to detect the trigger expression.
    Type: Grant
    Filed: June 26, 2015
    Date of Patent: August 15, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Yue Liu, Praveen Jayakumar, Amit Singh Chhetri, Ramya Gopalan
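    The direction-dependent strictness described above can be sketched as threshold selection: pick the beam with the strongest wake-word confidence, and require a higher confidence when that beam points at a known interfering source. The two threshold values and the scoring interface are assumptions.

```python
def trigger_detected(beam_scores, interference_dirs,
                     base_threshold=0.5, strict_threshold=0.8):
    """beam_scores: {direction: wake-word confidence}. Apply a stricter
    threshold when the strongest beam faces an interfering audio source."""
    direction = max(beam_scores, key=beam_scores.get)
    threshold = strict_threshold if direction in interference_dirs else base_threshold
    return beam_scores[direction] >= threshold
```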
  • Patent number: 9721580
    Abstract: Provided are methods and systems for providing situation-dependent transient noise suppression for audio signals. Different strategies (e.g., levels of aggressiveness) of transient suppression and signal restoration are applied to audio signals associated with participants in a video/audio conference depending on whether or not each participant is speaking (e.g., whether a voiced segment or an unvoiced/non-speech segment of audio is present). If no participants are speaking or there is an unvoiced/non-speech sound present, a more aggressive strategy for transient suppression and signal restoration is utilized. On the other hand, where voiced audio is detected (e.g., a participant is speaking), the methods and systems apply a softer, less aggressive suppression and restoration process.
    Type: Grant
    Filed: March 31, 2014
    Date of Patent: August 1, 2017
    Assignee: Google Inc.
    Inventors: Jan Skoglund, Alejandro Luebs
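    The voiced-versus-unvoiced strategy switch reduces, at its simplest, to choosing how much of a detected transient to remove. The two gain values below are illustrative assumptions, not figures from the patent.

```python
def suppress_transient(sample, is_voiced):
    """Attenuate a transient sample: softer suppression while a participant
    is speaking, aggressive suppression otherwise (gains are assumed)."""
    removed_fraction = 0.3 if is_voiced else 0.9
    return sample * (1.0 - removed_fraction)
```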
  • Patent number: 9704477
    Abstract: A method is disclosed that provides text-to-speech (TTS) functionality to a telematics unit of a telematics-equipped vehicle. The method includes: receiving text content to be played back by an audio system of the telematics-equipped vehicle; determining, by a processor, a TTS rendering process to be used for the text content from a plurality of TTS rendering processes, wherein the plurality of TTS rendering processes include local TTS rendering using a local TTS engine at the telematics-equipped vehicle and remote TTS rendering using a remote TTS engine at a communications center; and causing, by the processor, the text content to be rendered as an audio signal for playback by the telematics-equipped vehicle using the determined TTS rendering process.
    Type: Grant
    Filed: September 5, 2014
    Date of Patent: July 11, 2017
    Assignee: General Motors LLC
    Inventors: Xufang Zhao, Omer Tsimhoni, Gaurav Talwar
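    Choosing between the local and remote TTS engines can be sketched as a routing decision. The selection criteria here (local vocabulary coverage, network availability) are plausible assumptions; the patent leaves the determination to the processor without fixing a rule.

```python
def choose_tts_engine(text, network_available, local_vocab):
    """Route in-vocabulary text to the in-vehicle engine; fall back to the
    remote engine only when the network is up (criteria are assumptions)."""
    words = text.lower().split()
    in_vocab = all(w in local_vocab for w in words)
    if in_vocab or not network_available:
        return "local"
    return "remote"
```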
  • Patent number: 9697828
    Abstract: Features are disclosed for detecting words in audio using environmental information and/or contextual information in addition to acoustic features associated with the words to be detected. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
    Type: Grant
    Filed: June 20, 2014
    Date of Patent: July 4, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister
  • Patent number: 9679566
    Abstract: The apparatus for synchronously processing text data and voice data, comprises: a storing unit for storing text data and voice data; a text data dividing section for dividing the text data; a text data phoneme converting section for phonemically converting the divided text data; a text data phoneme conversion accumulated value calculating section for calculating accumulated values of text data phoneme conversion values; a voice data dividing section for dividing the voice data; a reading data phoneme converting section for phonemically converting the divided voice data; a voice data phoneme conversion accumulated value calculating section for calculating accumulated values of voice data phoneme conversion values; a phrase corresponding data producing section for producing phrase corresponding data; and an output section for synchronously outputting the text data and the divided voice data.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: June 13, 2017
    Assignee: Shinano Kenshi Kabushiki Kaisha
    Inventors: Tomoki Kodaira, Tatsuo Nishizawa
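    The phrase-corresponding-data step can be sketched as nearest-neighbor matching on the accumulated phoneme-conversion values: each text accumulation point is paired with the voice accumulation point closest to it. The pairing rule is a simplifying assumption.

```python
def pair_phrases(text_cum, voice_cum):
    """Match each text-side accumulated value to the nearest voice-side
    accumulated value, yielding (text_index, voice_index) pairs."""
    pairs = []
    for i, t in enumerate(text_cum):
        j = min(range(len(voice_cum)), key=lambda k: abs(voice_cum[k] - t))
        pairs.append((i, j))
    return pairs
```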
  • Patent number: 9645999
    Abstract: Provided is a process of modifying semantic similarity graphs representative of pair-wise similarity between documents in a corpus, the method comprising obtaining a semantic similarity graph that comprises more than 500 nodes and more than 1000 weighted edges, each node representing a document of a corpus, and each edge weight indicating an amount of similarity between a pair of documents corresponding to the respective nodes connected by the respective edge; obtaining an n-gram indicating that edge weights affected by the n-gram are to be increased or decreased; expanding the n-gram to produce a set of expansion n-grams; adjusting edge weights of edges between pairs of documents in which members of the expanded n-gram set co-occur.
    Type: Grant
    Filed: August 2, 2016
    Date of Patent: May 9, 2017
    Assignee: Quid, Inc.
    Inventors: Fabio Ciulla, Wojciech Musial, Ruggero Altair Tacchi
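    The edge-adjustment step can be sketched on a toy graph: boost the weight of an edge whenever some member of the expanded n-gram set occurs in both endpoint documents. The multiplicative factor and the co-occurrence test are simplifying assumptions.

```python
def adjust_edges(edges, doc_ngrams, expanded, factor=1.5):
    """edges: {(doc_a, doc_b): weight}; doc_ngrams: {doc: set of n-grams}.
    Increase an edge weight when an expansion n-gram co-occurs in both docs."""
    out = {}
    for (a, b), w in edges.items():
        cooccur = any(g in doc_ngrams[a] and g in doc_ngrams[b] for g in expanded)
        out[(a, b)] = w * factor if cooccur else w
    return out
```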
  • Patent number: 9646602
    Abstract: There is provided a method and an apparatus for processing a disordered voice. A method for processing a disordered voice according to an exemplary embodiment of the present invention includes: receiving a voice signal; recognizing the voice signal by phoneme; extracting multiple voice components from the voice signal; acquiring restored voice components by processing at least some disordered voice components of the multiple voice components by phoneme; and synthesizing a restored voice signal based on at least the restored voice components.
    Type: Grant
    Filed: June 20, 2014
    Date of Patent: May 9, 2017
    Inventors: Myung Whun Sung, Tack Kyun Kwon, Hee Jin Kim, Wook Eun Kim, Woo Il Kim, Mee Young Sung, Dong Wook Kim
  • Patent number: 9613029
    Abstract: Computer implemented techniques for performing transliteration of input text in a first character set to a second character set are disclosed. The techniques include receiving input text and determining a set of possible transliterations of the input text based on a plurality of mapping standards. Each mapping standard defines a mapping of characters in the first character set to characters in the second character set. The techniques further include determining a set of candidate words in the target language based on the possible transliterations and a text corpus. The techniques also include determining a likelihood score for each one of the candidate words based on a language model in the target language and previously received words. The techniques also include providing one or more candidate words based on the likelihood scores and receiving a user selection indicating one of the candidate words.
    Type: Grant
    Filed: February 28, 2012
    Date of Patent: April 4, 2017
    Assignee: Google Inc.
    Inventors: Fan Yang, Kirill Buryak, Feng Yuan, Baohua Liao
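    The candidate-scoring step can be sketched with toy stand-ins: score each possible transliteration by its corpus frequency times a context language-model probability, and return the nonzero-scoring candidates highest first. Both scoring tables and the combination rule are assumptions.

```python
def rank_candidates(possible, corpus_freq, context_lm):
    """Rank candidate target-language words by corpus frequency times a
    context LM probability (toy scoring; real models are assumed)."""
    scored = [(corpus_freq.get(w, 0) * context_lm.get(w, 1e-6), w) for w in possible]
    scored.sort(reverse=True)
    return [w for s, w in scored if s > 0]
```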
  • Patent number: 9613019
    Abstract: Techniques for automatically generating test data solve various problems in test data generation. A technique of automatically generating test data includes receiving a signature to be embedded in at least one character string to be generated and determining a total sum of attribute values intrinsic to characters in the character string. The sum is associated with each element of the signature. At least one of the characters in the character string may be selected from a character table describing characters prepared to create the test data so as to achieve the determined total sum for each element of the signature. The generated test data contains the character string including the selected character.
    Type: Grant
    Filed: September 5, 2014
    Date of Patent: April 4, 2017
    Assignee: International Business Machines Corporation
    Inventors: Eisuke Kanzaki, Kaori Maruyama, Tetsuo Namba, Hideo Takeda
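    The core idea, selecting characters from a table so their intrinsic attribute values total one element of the signature, can be sketched greedily. The a=1 … z=26 value table and the greedy selection are illustrative assumptions.

```python
# Hypothetical character table: each character's intrinsic attribute value.
CHAR_VALUES = {c: i + 1 for i, c in enumerate("abcdefghijklmnopqrstuvwxyz")}

def embed_signature(target_sum, length):
    """Pick `length` characters whose attribute values total `target_sum`
    (one signature element); returns None if no such string exists."""
    chars, remaining = [], target_sum
    for pos in range(length):
        slots_left = length - pos - 1
        # Largest value that leaves a remainder the remaining slots can reach.
        for c in "zyxwvutsrqponmlkjihgfedcba":
            rem = remaining - CHAR_VALUES[c]
            if slots_left <= rem <= 26 * slots_left:
                chars.append(c)
                remaining = rem
                break
        else:
            return None
    return "".join(chars)
```

Verifying the signature at test time is the reverse operation: re-sum the attribute values of the generated string.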
  • Patent number: 9607102
    Abstract: Disclosed methods and systems are directed to task switching in dialog processing. The methods and systems may include activating a primary task, receiving one or more ambiguous natural language commands, and identifying a first candidate task for each of the one or more ambiguous natural language commands. The methods and systems may also include identifying, for each of the one or more ambiguous natural language commands and based on one or more task switching rules, a second candidate task of the plurality of tasks corresponding to the ambiguous natural language command, determining whether to modify at least one of the one or more rules-based task switching rules based on whether a quality metric satisfies a threshold quantity, and, when the quality metric satisfies the threshold quantity, changing the task switching rule for the corresponding candidate task from a rules-based model to an optimized statistical task switching model.
    Type: Grant
    Filed: September 5, 2014
    Date of Patent: March 28, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Jean-Francois Lavallee, Jacques-Olivier Goussard, Richard Beaufort
  • Patent number: 9600231
    Abstract: A revised support vector machine (SVM) classifier is offered to distinguish between true keywords and false positives based on output from a keyword spotting component of a speech recognition system. The SVM operates on a reduced set of feature dimensions, where the feature dimensions are selected based on their ability to distinguish between true keywords and false positives. Further, support vector pairs are merged to create a reduced set of re-weighted support vectors. These techniques result in an SVM that may be operated using reduced computing resources, thus improving system performance.
    Type: Grant
    Filed: June 26, 2015
    Date of Patent: March 21, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Ming Sun, Björn Hoffmeister, Shiv Naga Prasad Vitaladevuni, Varun Kumar Nagaraja
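    The support-vector merging step can be sketched as replacing a pair with its coefficient-weighted centroid and summing the pair's coefficients, halving the vector count. The consecutive-pair pairing below is a simplification; how pairs are chosen in the patent is not specified here.

```python
def merge_support_vectors(vectors, alphas):
    """Merge consecutive support-vector pairs into weighted centroids,
    summing their coefficients (pairing heuristic is an assumption)."""
    merged_v, merged_a = [], []
    for i in range(0, len(vectors) - 1, 2):
        a1, a2 = alphas[i], alphas[i + 1]
        w = a1 + a2
        centroid = [(a1 * x + a2 * y) / w for x, y in zip(vectors[i], vectors[i + 1])]
        merged_v.append(centroid)
        merged_a.append(w)
    if len(vectors) % 2:  # odd leftover vector is kept unchanged
        merged_v.append(vectors[-1])
        merged_a.append(alphas[-1])
    return merged_v, merged_a
```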
  • Patent number: 9589560
    Abstract: Features are disclosed for estimating a false rejection rate in a detection system. The false rejection rate can be estimated by fitting a model to a distribution of detection confidence scores. An estimated false rejection rate can then be computed for confidence scores that fall below a threshold. The false rejection rate and model can be verified once the detection system has been deployed by obtaining additional data with confidence scores falling below the threshold. Adjustments to the model or other operational parameters can be implemented based on the verified false rejection rate, model, or additional data.
    Type: Grant
    Filed: December 19, 2013
    Date of Patent: March 7, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister, Rohit Prasad
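    The model-fitting idea can be sketched with a normal fit: estimate the confidence-score distribution from accepted detections, then take the probability mass below the accept threshold as the estimated false rejection rate. The choice of a Gaussian is an illustrative assumption.

```python
import math
import statistics

def estimated_false_rejections(accepted_scores, threshold):
    """Fit a normal distribution to observed confidence scores and estimate
    the fraction of true events that would score below the threshold."""
    mu = statistics.mean(accepted_scores)
    sigma = statistics.pstdev(accepted_scores)
    z = (threshold - mu) / sigma
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # normal CDF at threshold
```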
  • Patent number: 9570093
    Abstract: In accordance with an embodiment of the present invention, a method for speech processing includes determining an unvoicing/voicing parameter reflecting a characteristic of unvoiced/voicing speech in a current frame of a speech signal comprising a plurality of frames. A smoothed unvoicing/voicing parameter is determined to include information of the unvoicing/voicing parameter in a frame prior to the current frame of the speech signal. A difference between the unvoicing/voicing parameter and the smoothed unvoicing/voicing parameter is computed. The method further includes generating an unvoiced/voiced decision point for determining whether the current frame comprises unvoiced speech or voiced speech using the computed difference as a decision parameter.
    Type: Grant
    Filed: September 3, 2014
    Date of Patent: February 14, 2017
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
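    The per-frame decision described above can be sketched directly: maintain an exponentially smoothed copy of the unvoicing/voicing parameter and declare a frame voiced when the raw parameter exceeds the smoothed value by a margin. The smoothing constant and margin are assumed values.

```python
def voicing_decisions(params, smooth=0.9, margin=0.1):
    """Per-frame voiced/unvoiced decisions from the difference between the
    raw voicing parameter and its running smoothed value."""
    smoothed, decisions = params[0], []
    for p in params:
        diff = p - smoothed
        decisions.append(diff > margin)  # decision parameter vs. margin
        smoothed = smooth * smoothed + (1 - smooth) * p
    return decisions
```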
  • Patent number: 9564127
    Abstract: The present invention relates to a speech recognition method and system based on user personalized information. The method comprises the following steps: receiving a speech signal; decoding the speech signal according to a basic static decoding network to obtain a decoding path on each active node in the basic static decoding network, wherein the basic static decoding network is a decoding network associated with a basic name language model; if a decoding path enters a name node in the basic static decoding network, network extending is carried out on the name node according to an affiliated static decoding network of a user, wherein the affiliated static decoding network is a decoding network associated with a name language model of a particular user; and returning a recognition result after the decoding is completed. The recognition accuracy rate of user personalized information in continuous speech recognition may be raised by using the present invention.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: February 7, 2017
    Assignee: iFLYTEK Co., Ltd.
    Inventors: Qinghua Pan, Tingting He, Guoping Hu, Yu Hu, Qingfeng Liu
  • Patent number: 9530406
    Abstract: An apparatus and a method for recognizing a voice include a plurality of array microphones configured to have at least one microphone, and a seat controller configured to check a position of a seat provided in a vehicle. A microphone controller is configured to set a beam forming region based on the checked position of the seat and controls an array microphone so as to obtain sound source data from the set beam forming region.
    Type: Grant
    Filed: April 3, 2014
    Date of Patent: December 27, 2016
    Assignee: Hyundai Motor Company
    Inventor: Seok Min Oh
  • Patent number: 9514751
    Abstract: Described herein is a speech recognition device comprising: a communication module receiving speech data corresponding to speech input from a speech recognition terminal and multi-sensor data corresponding to input environment of the speech; a model selection module selecting a language and acoustic model corresponding to the multi-sensor data among a plurality of language and acoustic models classified according to the speech input environment on the basis of previous multi-sensor data; and a speech recognition module controlling the communication module to apply a feature vector extracted from the speech data to the language and acoustic model and transmit speech recognition result for the speech data to the speech recognition terminal.
    Type: Grant
    Filed: March 25, 2014
    Date of Patent: December 6, 2016
    Assignee: Electronics and Telecommunications Research Institute
    Inventor: Dong-Hyun Kim
  • Patent number: 9508341
    Abstract: Features are disclosed for active learning to identify the words which are likely to improve the guessing and automatic speech recognition (ASR) after manual annotation. When a speech recognition system needs pronunciations for words, a lexicon is typically used. For unknown words, pronunciation-guessing (G2P) may be included to provide pronunciations in an unattended (e.g., automatic) fashion. However, having manually (e.g., by a human) annotated pronunciations provides better ASR than having automatic pronunciations that may, in some instances, be wrong. The included active learning features help to direct these limited annotation resources.
    Type: Grant
    Filed: September 3, 2014
    Date of Patent: November 29, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Alok Ulhas Parlikar, Andrew Jake Rosenbaum, Jeffrey Paul Lilly, Jeffrey Penrod Adams
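    The active-learning selection can be sketched as a ranking problem: spend the annotation budget on words where manual pronunciations are expected to help ASR most, here scored as word frequency times G2P uncertainty. The scoring heuristic is an assumption, not the patented criterion.

```python
def select_for_annotation(words, g2p_confidence, frequency, budget):
    """Pick the `budget` unknown words most worth manual pronunciation
    annotation: frequent words with low G2P confidence rank first."""
    scored = sorted(words,
                    key=lambda w: frequency[w] * (1 - g2p_confidence[w]),
                    reverse=True)
    return scored[:budget]
```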