Patents by Inventor Johan Schalkwyk
Johan Schalkwyk has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20150279351
Abstract: Embodiments pertain to automatic speech recognition on mobile devices to detect the presence of a keyword. An audio waveform is received at a mobile device. Front-end feature extraction is performed on the audio waveform, followed by acoustic modeling, high-level feature extraction, and output classification to detect the keyword. Acoustic modeling may use a neural network or Gaussian mixture modeling, and high-level feature extraction may be done by aligning the results of the acoustic modeling with expected event vectors that correspond to a keyword.
Type: Application
Filed: April 11, 2013
Publication date: October 1, 2015
Inventors: Patrick An Phu Nguyen, Maria Carolina Parada San Martin, Johan Schalkwyk
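The pipeline in the abstract above (acoustic-model outputs aligned left-to-right against expected event vectors, then thresholded) can be sketched roughly as follows. The function name, the dot-product similarity, and the threshold value are all illustrative assumptions, not the patented method.

```python
def detect_keyword(frame_posteriors, event_vectors, threshold=0.5):
    """Toy high-level feature extraction for keyword spotting.

    frame_posteriors: per-frame acoustic-model output vectors (list of lists).
    event_vectors:    expected event vectors for the keyword, in order.

    Greedily aligns each expected event to the best-matching later frame,
    multiplies the match scores, and classifies against a threshold.
    """
    t = 0
    score = 1.0
    for ev in event_vectors:
        if t >= len(frame_posteriors):
            return False, 0.0          # ran out of frames before all events fired
        # Dot product of this event vector against each remaining frame.
        sims = [sum(f * e for f, e in zip(frame, ev))
                for frame in frame_posteriors[t:]]
        best = max(range(len(sims)), key=sims.__getitem__)
        score *= sims[best]            # confidence that this event occurred
        t += best + 1                  # enforce left-to-right alignment
    return score >= threshold, score
```

With perfectly peaked posteriors the events align in order and the keyword fires; with the events out of order, the left-to-right constraint rejects it.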
-
Patent number: 9117451
Abstract: Methods and systems for sharing of adapted voice profiles are provided. The method may comprise receiving, at a computing system, one or more speech samples, and the one or more speech samples may include a plurality of spoken utterances. The method may further comprise determining, at the computing system, a voice profile associated with a speaker of the plurality of spoken utterances, and including an adapted voice of the speaker. Still further, the method may comprise receiving, at the computing system, an authorization profile associated with the determined voice profile, and the authorization profile may include one or more user identifiers associated with one or more respective users. Yet still further, the method may comprise the computing system providing the voice profile to at least one computing device associated with the one or more respective users, based at least in part on the authorization profile.
Type: Grant
Filed: April 29, 2013
Date of Patent: August 25, 2015
Assignee: Google Inc.
Inventors: Javier Gonzalvo Fructuoso, Johan Schalkwyk
-
Patent number: 9047870
Abstract: Methods, computer program products and systems are described for speech-to-text conversion. A voice input is received from a user of an electronic device and contextual metadata is received that describes a context of the electronic device at a time when the voice input is received. Multiple base language models are identified, where each base language model corresponds to a distinct textual corpus of content. Using the contextual metadata, an interpolated language model is generated based on contributions from the base language models. The contributions are weighted according to a weighting for each of the base language models. The interpolated language model is used to convert the received voice input to a textual output. The voice input is received at a computer server system that is remote to the electronic device. The textual output is transmitted to the electronic device.
Type: Grant
Filed: September 29, 2011
Date of Patent: June 2, 2015
Assignee: Google Inc.
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen, Michael D. Riley
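The interpolation step described above (a weighted mix of base language models, with weights chosen from contextual metadata) reduces to a linear combination of probabilities. A minimal sketch, with hypothetical toy models and hand-picked weights standing in for metadata-derived ones:

```python
import math

def interpolated_log_prob(word, context, base_models, weights):
    """Linear interpolation of base language models.

    base_models: callables p(word | context) -> probability.
    weights:     per-model weights (assumed chosen from contextual
                 metadata and normalized to sum to 1).
    """
    p = sum(w * m(word, context) for m, w in zip(base_models, weights))
    return math.log(p)

# Two toy "base models", each from a distinct (hypothetical) corpus.
sms_model = lambda w, c: {"lol": 0.5, "meeting": 0.1}.get(w, 0.01)
web_model = lambda w, c: {"lol": 0.05, "meeting": 0.3}.get(w, 0.01)

# Contextual metadata (say, the user is in a messaging app) favors the SMS model.
weights = [0.8, 0.2]
```

For the word "lol" this yields probability 0.8 * 0.5 + 0.2 * 0.05 = 0.41; swapping the weights would instead favor the web corpus.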
-
Publication number: 20150149167
Abstract: Aspects of this disclosure are directed to accurately transforming speech data into one or more word strings that represent the speech data. A speech recognition device may receive the speech data from a user device and an indication of the user device. The speech recognition device may execute a speech recognition algorithm using one or more user and acoustic condition specific transforms that are specific to the user device and an acoustic condition of the speech data. The execution of the speech recognition algorithm may transform the speech data into one or more word strings that represent the speech data. The speech recognition device may estimate which one of the one or more word strings more accurately represents the received speech data.
Type: Application
Filed: September 30, 2011
Publication date: May 28, 2015
Applicant: GOOGLE INC.
Inventors: Françoise Beaufays, Johan Schalkwyk, Vincent Olivier Vanhoucke, Petar Stanisa Aleksic
-
Patent number: 9031830
Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
Type: Grant
Filed: December 22, 2010
Date of Patent: May 12, 2015
Assignee: Google Inc.
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau
-
Publication number: 20150039299
Abstract: A processing system receives an audio signal encoding a portion of an utterance. The processing system receives context information associated with the utterance, wherein the context information is not derived from the audio signal or any other audio signal. The processing system provides, as input to a neural network, data corresponding to the audio signal and the context information, and generates a transcription for the utterance based on at least an output of the neural network.
Type: Application
Filed: September 18, 2013
Publication date: February 5, 2015
Applicant: Google Inc.
Inventors: Eugene Weinstein, Pedro J. Moreno Mengibar, Johan Schalkwyk
-
Publication number: 20150006144
Abstract: The present disclosure describes a teleconferencing system that may use a virtual participant processor to translate language content of the teleconference into each participant's spoken language without additional user inputs. The virtual participant processor may connect to the teleconference just as the other participants do. Text or audio data that was previously exchanged directly between the participants may now be intercepted by the virtual participant processor. Upon obtaining a partial or complete language recognition result or making a language preference determination, the virtual participant processor may call a translation engine appropriate for each of the participants. The virtual participant processor may send the resulting translation to a teleconference management processor. The teleconference management processor may deliver the respective translated text or audio data to the appropriate participant.
Type: Application
Filed: September 15, 2014
Publication date: January 1, 2015
Applicant: GOOGLE INC.
Inventors: Jakob David Uszkoreit, Ashish Venugopal, Johan Schalkwyk, Joshua James Estelle
-
Publication number: 20140372119
Abstract: In general, the subject matter described in this specification can be embodied in methods, systems, and program products for performing compounded text segmentation. Compounded text that is extracted from one or more search queries submitted to a search engine is received. The compounded text includes a plurality of individual words that are joined together without intervening spaces. An electronic dictionary including words is accessed. A data structure representing possible segmentations of the compounded text is generated based on whether words in the possible segmentations occur in the electronic dictionary. A data store comprising data associated with a same field of usage as the compounded text is accessed to determine a frequency of occurrence for possible segmentations of the data structure. A segmentation of the compounded text that is most probable based on the data is determined. A language model is trained using the determined segmentation of the compounded text.
Type: Application
Filed: September 28, 2009
Publication date: December 18, 2014
Inventors: Carolina Parada, Boulos Harb, Johan Schalkwyk
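The segmentation described above (dictionary-constrained splits ranked by frequency in an in-domain data store) is essentially a word-break dynamic program. A minimal sketch under that reading; the unigram log-frequency scoring and the function name are assumptions, not the claimed method:

```python
import math

def best_segmentation(compound, dictionary, freq):
    """Most probable segmentation of compounded text.

    dictionary: set of known words.
    freq:       word -> occurrence count in an in-domain data store,
                used here as a toy unigram score for ranking splits.
    Returns the highest-scoring segmentation as a list of words, or None.
    """
    total = sum(freq.values())
    n = len(compound)
    # best[i] = (score, segmentation of compound[:i]) or None if unreachable
    best = [None] * (n + 1)
    best[0] = (0.0, [])
    for i in range(1, n + 1):
        for j in range(i):
            word = compound[j:i]
            if word in dictionary and best[j] is not None:
                score = best[j][0] + math.log(freq.get(word, 1) / total)
                if best[i] is None or score > best[i][0]:
                    best[i] = (score, best[j][1] + [word])
    return best[n][1] if best[n] else None
```

Because each extra word multiplies in another probability below 1, frequent long dictionary words ("icecream") tend to beat chains of shorter ones ("ice" + "cream").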
-
Publication number: 20140288929
Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
Type: Application
Filed: June 9, 2014
Publication date: September 25, 2014
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson
-
Publication number: 20140278407
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language modeling of complete language sequences. Training data indicating language sequences is accessed, and counts for a number of times each language sequence occurs in the training data are determined. A proper subset of the language sequences is selected, and a first component of a language model is trained. The first component includes first probability data for assigning scores to the selected language sequences. A second component of the language model is trained based on the training data, where the second component includes second probability data for assigning scores to language sequences that are not included in the selected language sequences. Adjustment data that normalizes the second probability data with respect to the first probability data is generated, and the first component, the second component, and the adjustment data are stored.
Type: Application
Filed: May 2, 2013
Publication date: September 18, 2014
Applicant: Google Inc.
Inventors: Ciprian I. Chelba, Hasim Sak, Johan Schalkwyk
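One way to read the two-component model above: exact probabilities for a selected subset of sequences, a second distribution over the remainder, and an adjustment factor that renormalizes the second component against the first. A toy sketch of that idea; the frequency-based `top_k` selection rule is an assumption:

```python
from collections import Counter

def train_two_component_lm(training_sequences, top_k=2):
    """Toy two-component model over complete sequences.

    First component: exact probabilities for the top_k most frequent
    sequences. Second component: relative frequencies over the rest,
    scaled by adjustment data so both components together form one
    normalized distribution.
    """
    counts = Counter(training_sequences)
    total = sum(counts.values())
    selected = dict(counts.most_common(top_k))
    first = {s: c / total for s, c in selected.items()}

    rest = {s: c for s, c in counts.items() if s not in selected}
    rest_total = sum(rest.values())
    second = {s: c / rest_total for s, c in rest.items()} if rest_total else {}
    adjustment = 1.0 - sum(first.values())   # mass left for the second component

    def prob(seq):
        if seq in first:
            return first[seq]
        return adjustment * second.get(seq, 0.0)

    return prob
```

With counts 5/3/2 over three sequences and top_k=2, the first component holds 0.5 and 0.3 exactly, and the adjustment factor 0.2 scales the second component so the whole distribution still sums to 1.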
-
Patent number: 8838459
Abstract: The present disclosure describes a teleconferencing system that may use a virtual participant processor to translate language content of the teleconference into each participant's spoken language without additional user inputs. The virtual participant processor may connect to the teleconference just as the other participants do. Text or audio data that was previously exchanged directly between the participants may now be intercepted by the virtual participant processor. Upon obtaining a partial or complete language recognition result or making a language preference determination, the virtual participant processor may call a translation engine appropriate for each of the participants. The virtual participant processor may send the resulting translation to a teleconference management processor. The teleconference management processor may deliver the respective translated text or audio data to the appropriate participant.
Type: Grant
Filed: April 30, 2012
Date of Patent: September 16, 2014
Assignee: Google Inc.
Inventors: Jakob David Uszkoreit, Ashish Venugopal, Johan Schalkwyk, Joshua James Estelle
-
Patent number: 8751217
Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
Type: Grant
Filed: September 29, 2011
Date of Patent: June 10, 2014
Assignee: Google Inc.
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau
-
Patent number: 8682661
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech input. In one aspect, a method includes receiving a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar, retrieving third-party statistical speech recognition information, the statistical speech recognition information being transmitted over a network, generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar, processing the user input using the SLM to generate one or more results, comparing the one or more results to candidates provided in the grammar, identifying a particular candidate of the grammar based on the comparing, and providing the particular candidate for input to an application executed on a computing device.
Type: Grant
Filed: August 31, 2010
Date of Patent: March 25, 2014
Assignee: Google Inc.
Inventors: Johan Schalkwyk, Bjorn Bringert, David P. Singleton
-
Publication number: 20140012586
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on that evaluation, and providing a representation of the hotword suitability score for display to the user.
Type: Application
Filed: August 6, 2012
Publication date: January 9, 2014
Applicant: GOOGLE INC.
Inventors: Andrew E. Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin
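For illustration, a suitability score could combine a few simple transcription-level criteria such as length, word count, and avoidance of words common in everyday speech. The specific criteria, weights, and word list below are invented for the sketch; the abstract does not specify them:

```python
def hotword_suitability(transcript):
    """Score a candidate hotword on a few illustrative criteria,
    scaled to the range [0, 100].

    Criteria (each worth up to one point):
      - length: longer hotwords are easier to spot reliably
      - word count: multi-word phrases are more distinctive
      - rarity: common words would trigger falsely in normal speech
    """
    common = {"the", "ok", "yes", "no", "hello"}
    words = transcript.lower().split()
    length_score = min(len(transcript) / 12.0, 1.0)
    word_score = min(len(words) / 2.0, 1.0)
    rarity_score = 0.0 if set(words) & common else 1.0
    return round(100 * (length_score + word_score + rarity_score) / 3)
```

A long, unusual phrase scores near 100, while a short everyday word like "ok" scores low, matching the intuition that it would make a poor hotword.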
-
Publication number: 20130226557
Abstract: The present disclosure describes a teleconferencing system that may use a virtual participant processor to translate language content of the teleconference into each participant's spoken language without additional user inputs. The virtual participant processor may connect to the teleconference just as the other participants do. Text or audio data that was previously exchanged directly between the participants may now be intercepted by the virtual participant processor. Upon obtaining a partial or complete language recognition result or making a language preference determination, the virtual participant processor may call a translation engine appropriate for each of the participants. The virtual participant processor may send the resulting translation to a teleconference management processor. The teleconference management processor may deliver the respective translated text or audio data to the appropriate participant.
Type: Application
Filed: April 30, 2012
Publication date: August 29, 2013
Applicant: Google Inc.
Inventors: Jakob David Uszkoreit, Ashish Venugopal, Johan Schalkwyk, Joshua James Estelle
-
Patent number: 8521526
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing spoken query terms. In one aspect, a method includes performing speech recognition on an audio signal to select two or more textual, candidate transcriptions that match a spoken query term, and to establish a speech recognition confidence value for each candidate transcription, obtaining a search history for a user who spoke the spoken query term, where the search history references one or more past search queries that have been submitted by the user, generating one or more n-grams from each candidate transcription, where each n-gram is a subsequence of n phonemes, syllables, letters, characters, words or terms from a respective candidate transcription, and determining, for each n-gram, a frequency with which the n-gram occurs in the past search queries, and a weighting value that is based on the respective frequency.
Type: Grant
Filed: July 28, 2010
Date of Patent: August 27, 2013
Assignee: Google Inc.
Inventors: Matthew I. Lloyd, Johan Schalkwyk, Pankaj Risbood
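The n-gram weighting above can be illustrated with character trigrams: count how often each n-gram of a candidate transcription appears in the user's past queries, and use the total as a rescoring boost on top of the recognizer's confidence. The function names and the `alpha` mixing weight are assumptions for the sketch:

```python
from collections import Counter

def history_score(candidate, past_queries, n=3):
    """Sum, over each character n-gram of the candidate transcription,
    the frequency with which that n-gram occurs in past search queries."""
    history = Counter()
    for q in past_queries:
        for i in range(len(q) - n + 1):
            history[q[i:i + n]] += 1
    return sum(history[candidate[i:i + n]]
               for i in range(len(candidate) - n + 1))

def rerank(candidates, past_queries, asr_scores, alpha=0.1):
    """Pick the candidate maximizing recognizer confidence plus a
    history-frequency boost."""
    return max(candidates,
               key=lambda c: asr_scores[c] + alpha * history_score(c, past_queries))
```

A user who often searches for "pizza" thereby pulls an acoustically ambiguous utterance toward "pizza" even if the recognizer slightly preferred "piazza".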
-
Publication number: 20130085753
Abstract: A computing device is able to use an embedded speech recognizer and a network speech recognizer for speech recognition of captured audio. In response to detecting speech in the captured audio, the computing device may forward the captured audio to its embedded speech recognizer and to a speech client for the network speech recognizer. The embedded speech recognizer provides an embedded-recognizer result for the captured audio. If a network-recognition criterion is met, the speech client forwards the captured audio to the network speech recognizer and receives a network-recognizer result for the captured audio from the network speech recognizer. A speech recognition result for the captured audio is forwarded to at least one application, wherein the speech recognition result is based on at least one of the embedded-recognizer result and the network-recognizer result.
Type: Application
Filed: August 15, 2012
Publication date: April 4, 2013
Applicant: GOOGLE INC.
Inventors: Bjorn Erik Bringert, Johan Schalkwyk, Michael J. LeBeau, Richard Zarek Cohen, Luca Zanolin, Simon Tickner
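The embedded/network arbitration described above can be sketched as a simple confidence comparison. The callable interface and the boolean flag standing in for the "network-recognition criterion" (e.g. connectivity) are assumptions; the patent leaves the combination rule open:

```python
def recognize(audio, embedded, network, network_ok):
    """Combine an on-device recognizer with a network recognizer.

    embedded, network: callables mapping audio -> (text, confidence).
    network_ok: whether the network-recognition criterion is met.
    Returns the text of the higher-confidence result; falls back to
    the embedded result when the network recognizer is unavailable.
    """
    text, conf = embedded(audio)
    if network_ok:
        net_text, net_conf = network(audio)
        if net_conf > conf:
            return net_text
    return text
```

This keeps recognition working offline while letting a (typically more accurate) server-side result win whenever the network path is usable.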
-
Patent number: 8370146
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech input. In one aspect, a method includes receiving a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar, retrieving third-party statistical speech recognition information, the statistical speech recognition information being transmitted over a network, generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar, processing the user input using the SLM to generate one or more results, comparing the one or more results to candidates provided in the grammar, identifying a particular candidate of the grammar based on the comparing, and providing the particular candidate for input to an application executed on a computing device.
Type: Grant
Filed: September 30, 2011
Date of Patent: February 5, 2013
Assignee: Google Inc.
Inventors: Johan Schalkwyk, Bjorn Bringert, David P. Singleton
-
Publication number: 20120022867
Abstract: Methods, computer program products and systems are described for speech-to-text conversion. A voice input is received from a user of an electronic device and contextual metadata is received that describes a context of the electronic device at a time when the voice input is received. Multiple base language models are identified, where each base language model corresponds to a distinct textual corpus of content. Using the contextual metadata, an interpolated language model is generated based on contributions from the base language models. The contributions are weighted according to a weighting for each of the base language models. The interpolated language model is used to convert the received voice input to a textual output. The voice input is received at a computer server system that is remote to the electronic device. The textual output is transmitted to the electronic device.
Type: Application
Filed: September 29, 2011
Publication date: January 26, 2012
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen, Michael D. Riley
-
Publication number: 20120022853
Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
Type: Application
Filed: September 29, 2011
Publication date: January 26, 2012
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau