Patents by Inventor Nikko Strom

Nikko Strom has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Speech recognizer with multi-directional decoding

Patent number: 9286897

Abstract: In an automatic speech recognition (ASR) processing system, ASR processing may be configured to process speech based on multiple channels of audio received from a beamformer. The ASR processing system may include a microphone array and the beamformer to output multiple channels of audio such that each channel isolates audio in a particular direction. The multichannel audio signals may include spoken utterances/speech from one or more speakers as well as undesired audio, such as noise from a household appliance. The ASR device may simultaneously perform speech recognition on the multi-channel audio to provide more accurate speech recognition results.

Type: Grant

Filed: September 27, 2013

Date of Patent: March 15, 2016

Assignee: AMAZON TECHNOLOGIES, INC.

Inventors: Michael Maximilian Emanuel Bisani, Nikko Strom, Bjorn Hoffmeister, Ryan Paul Thomas
SPEECH RECOGNIZER WITH MULTI-DIRECTIONAL DECODING

Publication number: 20150095026

Abstract: In an automatic speech recognition (ASR) processing system, ASR processing may be configured to process speech based on multiple channels of audio received from a beamformer. The ASR processing system may include a microphone array and the beamformer to output multiple channels of audio such that each channel isolates audio in a particular direction. The multichannel audio signals may include spoken utterances/speech from one or more speakers as well as undesired audio, such as noise from a household appliance. The ASR device may simultaneously perform speech recognition on the multi-channel audio to provide more accurate speech recognition results.

Type: Application

Filed: September 27, 2013

Publication date: April 2, 2015

Applicant: Amazon Technologies, Inc.

Inventors: Michael Maximilian Emanuel Bisani, Nikko Strom, Bjorn Hoffmeister, Ryan Paul Thomas
Front-end difference coding for distributed speech recognition

Patent number: 8990076

Abstract: In automated speech recognition (ASR), multiple devices may be employed to perform the ASR in a distributed environment. To reduce bandwidth use in transmitting between devices, ASR information is compressed prior to transmission. To counteract fidelity loss that may accompany such compression, two versions of an audio signal are processed by an acoustic front end (AFE): one version is unaltered and one is compressed and decompressed prior to AFE processing. The two versions are compared, and the comparison data is sent to a recipient for further ASR processing. The recipient uses the comparison data and a received version of the compressed audio signal to recreate the post-AFE processing results from the received audio signal. The result is improved ASR results and decreased bandwidth usage between distributed ASR devices.

Type: Grant

Filed: September 10, 2012

Date of Patent: March 24, 2015

Assignee: Amazon Technologies, Inc.

Inventor: Nikko Strom
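
The difference-coding idea in the abstract above can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the patented implementation: the "codec" is plain uniform quantization, the "AFE" is a per-frame log energy, and the names `afe`, `compress`, and `decompress` are hypothetical.

```python
import math

FRAME = 4  # samples per frame (illustrative value)

def afe(samples):
    """Toy acoustic front end: log energy of each frame."""
    return [
        math.log(1e-9 + sum(s * s for s in samples[i:i + FRAME]))
        for i in range(0, len(samples), FRAME)
    ]

def compress(samples, step=0.25):
    """Toy lossy codec: uniform quantization to integer codes."""
    return [round(s / step) for s in samples]

def decompress(codes, step=0.25):
    """Inverse of the toy codec (the quantization loss remains)."""
    return [c * step for c in codes]

# Sender side: run the AFE on the unaltered audio and on the
# compressed-then-decompressed audio, and keep the difference.
audio = [0.11, -0.42, 0.37, 0.05, 0.81, -0.66, 0.12, 0.29]
codes = compress(audio)
clean_features = afe(audio)
lossy_features = afe(decompress(codes))
difference = [c - l for c, l in zip(clean_features, lossy_features)]

# Recipient side: only `codes` and `difference` are received.
received_features = afe(decompress(codes))
restored = [r + d for r, d in zip(received_features, difference)]
```

In this sketch `restored` matches `clean_features` up to floating-point rounding. A real system would presumably also encode the difference compactly, which this sketch does not attempt.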
Transmission of noise parameters for improving automatic speech recognition

Patent number: 8983844

Abstract: Methods and systems for transmission of noise parameters for improving automatic speech recognition are disclosed. A system includes one or more microphones, wherein each microphone is configured to produce an audio signal. The system also includes a noise reduction module configured to generate a noise-reduced audio signal and a noise parameter. Furthermore, the system includes a transmitter configured to transmit, to a computing device, the noise-reduced audio signal and a noise parameter. The computing device may use the noise parameter in obtaining a model to use for performing automatic speech recognition.

Type: Grant

Filed: July 31, 2012

Date of Patent: March 17, 2015

Assignee: Amazon Technologies, Inc.

Inventors: Ryan P. Thomas, Nikko Strom
Parse information encoding in a finite state transducer

Patent number: 8972243

Abstract: In automatic speech recognition, certain parsing information, such as rules and tags, may be embedded into a finite state transducer (FST) to produce FST output that includes speech recognition results along with codes indicating parsing results of the recognized speech. The codes in the FST output may be formatted using a markup language, such as XML or JSON, for processing by a later application. The FST may be constructed according to a grammar defining the parsing information. The codes for inclusion in the FST output may be embedded into arcs of the FST and then included in the FST output when the speech recognition engine traverses the arcs of the FST.

Type: Grant

Filed: November 20, 2012

Date of Patent: March 3, 2015

Assignee: Amazon Technologies, Inc.

Inventors: Nikko Strom, Karthik Ramakrishnan
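
As a rough illustration of embedding parse codes into the arcs of a transducer, as the abstract above describes, the sketch below uses a hand-written dictionary as the FST. The grammar, the tag names, and the `transduce` helper are hypothetical and chosen only for this example; the sketch does not use a real FST toolkit.

```python
# Each state maps an input word to (next_state, output_string).
# The output strings carry XML-style tags, so the markup is emitted
# as a side effect of traversing the arcs.
ARCS = {
    0: {"call": (1, "<action>call</action>")},
    1: {
        "home": (2, "<target>home</target>"),
        "work": (2, "<target>work</target>"),
    },
}
FINAL_STATES = {2}

def transduce(words):
    """Follow one arc per word; return the tagged output or None."""
    state, output = 0, []
    for word in words:
        arcs = ARCS.get(state, {})
        if word not in arcs:
            return None  # no matching arc: the input is rejected
        state, emitted = arcs[word]
        output.append(emitted)
    return "".join(output) if state in FINAL_STATES else None

tagged = transduce(["call", "home"])
```

A later application could then parse `tagged` as markup without re-analyzing the recognized words.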
Identifying candidate passwords from captured audio

Patent number: 8898064

Abstract: A computing device configured to request a password from a user, capture audio after issuing the request, and determine a number of alternative candidate passwords most likely represented by the audio. After identifying the number of candidate passwords, the computing device may submit these candidate passwords, one at a time, to an entity until the entity grants the device access to an account associated with the user or until the device has submitted each candidate password. The account may comprise a network account (e.g., a wired or wireless network account), an online account (e.g., an email account, an account with an online merchant, etc.), or the like.

Type: Grant

Filed: March 19, 2012

Date of Patent: November 25, 2014

Assignee: Rawles LLC

Inventors: Ryan P. Thomas, Nikko Strom
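
The submit-until-accepted loop from the abstract above can be sketched as follows. The candidate list and the `grants_access` callback are hypothetical stand-ins for a recognizer's alternatives and for the entity that checks the password; this is a minimal sketch, not the patented method.

```python
from typing import Callable, Iterable, Optional

def try_candidates(
    candidates: Iterable[str],
    grants_access: Callable[[str], bool],
) -> Optional[str]:
    """Submit candidates one at a time; stop at the first accepted one."""
    for candidate in candidates:
        if grants_access(candidate):
            return candidate
    return None  # every candidate was submitted and rejected

# Hypothetical alternatives for one spoken password, most likely first.
n_best = ["blue horse 42", "blue hoarse 42", "blew horse 42"]
accepted = try_candidates(n_best, lambda pw: pw == "blue hoarse 42")
```

Here the second candidate is accepted, so the third is never submitted.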
Progressive application of knowledge sources in multistage speech recognition

Patent number: 8386251

Abstract: A speech recognition system is provided with iteratively refined multiple passes through the received data to enhance the accuracy of the results by introducing constraints and adaptation from initial passes into subsequent recognition operations. The multiple passes are performed on an initial utterance received from a user. The iteratively enhanced subsequent passes are also performed on following utterances received from the user increasing an overall system efficiency and accuracy.

Type: Grant

Filed: June 8, 2009

Date of Patent: February 26, 2013

Assignee: Microsoft Corporation

Inventors: Nikko Strom, Julian Odell, Jon Hamaker
Audio human verification

Patent number: 8224655

Abstract: A system generates an audio challenge that includes a first voice and one or more second voices, the first voice being audibly distinguishable, by a human, from the one or more second voices. The first voice conveys first information and the second voice conveys second information. The system provides the audio challenge to a user and verifies that the user is human based on whether the user can identify the first information in the audio challenge.

Type: Grant

Filed: September 12, 2011

Date of Patent: July 17, 2012

Assignee: Tell Me Networks

Inventors: Nikko Strom, Dylan F. Salisbury
AUDIO HUMAN VERIFICATION

Publication number: 20120004914

Abstract: A system generates an audio challenge that includes a first voice and one or more second voices, the first voice being audibly distinguishable, by a human, from the one or more second voices. The first voice conveys first information and the second voice conveys second information. The system provides the audio challenge to a user and verifies that the user is human based on whether the user can identify the first information in the audio challenge.

Type: Application

Filed: September 12, 2011

Publication date: January 5, 2012

Applicant: Tell Me Networks c/o Microsoft Corporation

Inventors: Nikko Strom, Dylan F. Salisbury
Audio human verification

Patent number: 8036902

Abstract: A system generates an audio challenge that includes a first voice and one or more second voices, the first voice being audibly distinguishable, by a human, from the one or more second voices. The first voice conveys first information and the second voice conveys second information. The system provides the audio challenge to a user and verifies that the user is human based on whether the user can identify the first information in the audio challenge.

Type: Grant

Filed: June 21, 2006

Date of Patent: October 11, 2011

Assignee: TellMe Networks, Inc.

Inventors: Nikko Strom, Dylan F. Salisbury
PROGRESSIVE APPLICATION OF KNOWLEDGE SOURCES IN MULTISTAGE SPEECH RECOGNITION

Publication number: 20100312557

Abstract: A speech recognition system is provided with iteratively refined multiple passes through the received data to enhance the accuracy of the results by introducing constraints and adaptation from initial passes into subsequent recognition operations. The multiple passes are performed on an initial utterance received from a user. The iteratively enhanced subsequent passes are also performed on following utterances received from the user increasing an overall system efficiency and accuracy.

Type: Application

Filed: June 8, 2009

Publication date: December 9, 2010

Applicant: Microsoft Corporation

Inventors: Nikko Strom, Julian Odell, Jon Hamaker
Adding audio effects to spoken utterance

Patent number: 7644000

Abstract: A system receives a spoken utterance, identifies at least one keyword within the spoken utterance, and identifies a function using the identified at least one keyword. The system further performs the identified function on at least a portion of the spoken utterance to create a voice file.

Type: Grant

Filed: December 29, 2005

Date of Patent: January 5, 2010

Assignee: TellMe Networks, Inc.

Inventor: Nikko Strom
Method and system for selecting grammars based on geographic information associated with a caller

Patent number: 7630900

Abstract: A computer implemented method for automatically processing a data request comprising accessing a voice signal from a caller, determining geographic information associated with the caller from telephone network information associated with a call made by the caller, and retrieving a speech recognition grammar customized to the geographic information associated with the caller. The method further includes recognizing the voice signal by matching the voice signal to an entry of the speech recognition grammar and providing a directory listing to the caller based on the entry of the speech recognition grammar. The selected grammar may be customized to bias speech recognition to more frequently recognize cities local to the geographic information.

Type: Grant

Filed: December 1, 2004

Date of Patent: December 8, 2009

Assignee: TellMe Networks, Inc.

Inventor: Nikko Strom
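
A minimal sketch of the grammar selection described in the abstract above, assuming the geographic information is a telephone area code and each grammar is a table of city weights. The area codes, weights, acoustic scores, and function names are all invented for illustration.

```python
# Hypothetical grammars: each biases recognition toward local cities.
GRAMMARS = {
    "206": {"seattle": 0.6, "bellevue": 0.3, "portland": 0.1},
    "503": {"portland": 0.7, "salem": 0.2, "seattle": 0.1},
}
DEFAULT_GRAMMAR = {"seattle": 0.25, "bellevue": 0.25,
                   "portland": 0.25, "salem": 0.25}

def select_grammar(caller_number: str) -> dict:
    """Pick the grammar for the caller's area code, if one exists."""
    return GRAMMARS.get(caller_number[:3], DEFAULT_GRAMMAR)

def recognize(acoustic_scores: dict, grammar: dict) -> str:
    """Return the entry with the best acoustic score times grammar weight."""
    return max(grammar,
               key=lambda city: acoustic_scores.get(city, 0.0) * grammar[city])

# The same ambiguous audio resolves differently per caller location.
scores = {"portland": 0.5, "bellevue": 0.45}
from_seattle = recognize(scores, select_grammar("2065550100"))
from_portland = recognize(scores, select_grammar("5035550100"))
```

With these made-up numbers, the caller from area code 206 gets "bellevue" and the caller from 503 gets "portland", which shows the local bias at work.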
Coarticulated concatenated speech

Patent number: 7269557

Abstract: Described are methods and systems for reducing the audible gap in concatenated recorded speech, resulting in more natural sounding speech in voice applications. The sound of concatenated, recorded speech is improved by also coarticulating the recorded speech. The resulting message is smooth, natural sounding and lifelike. Existing libraries of regularly recorded bulk prompts can be used by coarticulating the user interface prompt occurring just before the bulk prompt. Applications include phone-based applications as well as non-phone-based applications.

Type: Grant

Filed: November 19, 2004

Date of Patent: September 11, 2007

Assignee: Tellme Networks, Inc.

Inventors: Scott J. Bailey, Nikko Strom
Histogram grammar weighting and error corrective training of grammar weights

Patent number: 6985862

Abstract: A multi-level method for estimating and training weights associated with grammar options is presented. The implementation of the method implemented differs depending on the amount of utterance data available for each option to be tuned. A first implementation, modified maximum likelihood estimation (MLE), can be used to estimate weights for a grammar option when few utterances are available for the option. Option weights are then estimated using an obtainable statistic that creates a basis for the predictability model. A second implementation, error corrective training (ECT), can be used to estimate option weight when a sufficiently large number of utterances are available. The ECT method minimizes the errors in the score of the correct interpretation of the utterance and the highest scoring incorrect interpretation in an utterance training set. The ECT method is iterated to converge on a solution for option weights.

Type: Grant

Filed: March 22, 2001

Date of Patent: January 10, 2006

Assignee: Tellme Networks, Inc.

Inventors: Nikko Ström, Nicholas Kibre
Coarticulated concatenated speech

Patent number: 6873952

Abstract: Described are methods and systems for reducing the audible gap in concatenated recorded speech, resulting in more natural sounding speech in voice applications. The sound of concatenated, recorded speech is improved by also coarticulating the recorded speech. The resulting message is smooth, natural sounding and lifelike. Existing libraries of regularly recorded bulk prompts can be used by coarticulating the user interface prompt occurring just before the bulk prompt. Applications include phone-based applications as well as non-phone-based applications.

Type: Grant

Filed: May 16, 2003

Date of Patent: March 29, 2005

Assignee: Tellme Networks, Inc.

Inventors: Scott J. Bailey, Nikko Strom
Histogram grammar weighting and error corrective training of grammar weights

Publication number: 20030004717

Abstract: A multi-level method for estimating and training weights associated with grammar options is presented. The implementation of the method implemented differs depending on the amount of utterance data available for each option to be tuned. A first implementation, modified maximum likelihood estimation (MLE), can be used to estimate weights for a grammar option when few utterances are available for the option. Option weights are then estimated using an obtainable statistic that creates a basis for the predictability model. A second implementation, error corrective training (ECT), can be used to estimate option weight when a sufficiently large number of utterances are available. The ECT method minimizes the errors in the score of the correct interpretation of the utterance and the highest scoring incorrect interpretation in an utterance training set. The ECT method is iterated to converge on a solution for option weights.

Type: Application

Filed: March 22, 2001

Publication date: January 2, 2003

Inventors: Nikko Strom, Nicholas Kibre
