Patents by Inventor Johan Schalkwyk

Johan Schalkwyk has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150279351
    Abstract: Embodiments pertain to automatic speech recognition in mobile devices to establish the presence of a keyword. An audio waveform is received at a mobile device. Front-end feature extraction is performed on the audio waveform, followed by acoustic modeling, high level feature extraction, and output classification to detect the keyword. Acoustic modeling may use a neural network or Gaussian mixture modeling, and high level feature extraction may be done by aligning the results of the acoustic modeling with expected event vectors that correspond to a keyword.
    Type: Application
    Filed: April 11, 2013
    Publication date: October 1, 2015
    Inventors: Patrick An Phu Nguyen, Maria Carolina Parada San Martin, Johan Schalkwyk
  • Patent number: 9117451
    Abstract: Methods and systems for sharing of adapted voice profiles are provided. The method may comprise receiving, at a computing system, one or more speech samples, and the one or more speech samples may include a plurality of spoken utterances. The method may further comprise determining, at the computing system, a voice profile associated with a speaker of the plurality of spoken utterances, and including an adapted voice of the speaker. Still further, the method may comprise receiving, at the computing system, an authorization profile associated with the determined voice profile, and the authorization profile may include one or more user identifiers associated with one or more respective users. Yet still further, the method may comprise the computing system providing the voice profile to at least one computing device associated with the one or more respective users, based at least in part on the authorization profile.
    Type: Grant
    Filed: April 29, 2013
    Date of Patent: August 25, 2015
    Assignee: Google Inc.
    Inventors: Javier Gonzalvo Fructuoso, Johan Schalkwyk
  • Patent number: 9047870
    Abstract: Methods, computer program products and systems are described for speech-to-text conversion. A voice input is received from a user of an electronic device and contextual metadata is received that describes a context of the electronic device at a time when the voice input is received. Multiple base language models are identified, where each base language model corresponds to a distinct textual corpus of content. Using the contextual metadata, an interpolated language model is generated based on contributions from the base language models. The contributions are weighted according to a weighting for each of the base language models. The interpolated language model is used to convert the received voice input to a textual output. The voice input is received at a computer server system that is remote to the electronic device. The textual output is transmitted to the electronic device.
    Type: Grant
    Filed: September 29, 2011
    Date of Patent: June 2, 2015
    Assignee: Google Inc.
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen, Michael D. Riley
  • Publication number: 20150149167
    Abstract: Aspects of this disclosure are directed to accurately transforming speech data into one or more word strings that represent the speech data. A speech recognition device may receive the speech data from a user device and an indication of the user device. The speech recognition device may execute a speech recognition algorithm using one or more user and acoustic condition specific transforms that are specific to the user device and an acoustic condition of the speech data. The execution of the speech recognition algorithm may transform the speech data into one or more word strings that represent the speech data. The speech recognition device may estimate which one of the one or more word strings more accurately represents the received speech data.
    Type: Application
    Filed: September 30, 2011
    Publication date: May 28, 2015
    Applicant: Google Inc.
    Inventors: Françoise Beaufays, Johan Schalkwyk, Vincent Olivier Vanhoucke, Petar Stanisa Aleksic
  • Patent number: 9031830
    Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
    Type: Grant
    Filed: December 22, 2010
    Date of Patent: May 12, 2015
    Assignee: Google Inc.
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau
  • Publication number: 20150039299
    Abstract: A processing system receives an audio signal encoding a portion of an utterance. The processing system receives context information associated with the utterance, wherein the context information is not derived from the audio signal or any other audio signal. The processing system provides, as input to a neural network, data corresponding to the audio signal and the context information, and generates a transcription for the utterance based on at least an output of the neural network.
    Type: Application
    Filed: September 18, 2013
    Publication date: February 5, 2015
    Applicant: Google Inc.
    Inventors: Eugene Weinstein, Pedro J. Moreno Mengibar, Johan Schalkwyk
  • Publication number: 20150006144
    Abstract: The present disclosure describes a teleconferencing system that may use a virtual participant processor to translate language content of the teleconference into each participant's spoken language without additional user inputs. The virtual participant processor may connect to the teleconference as do the other participants. All text or audio data that was previously exchanged between the participants may now be intercepted by the virtual participant processor. Upon obtaining a partial or complete language recognition result or making a language preference determination, the virtual participant processor may call a translation engine appropriate for each of the participants. The virtual participant processor may send the resulting translation to a teleconference management processor. The teleconference management processor may deliver the respective translated text or audio data to the appropriate participant.
    Type: Application
    Filed: September 15, 2014
    Publication date: January 1, 2015
    Applicant: Google Inc.
    Inventors: Jakob David Uszkoreit, Ashish Venugopal, Johan Schalkwyk, Joshua James Estelle
  • Publication number: 20140372119
    Abstract: In general, the subject matter described in this specification can be embodied in methods, systems, and program products for performing compounded text segmentation. Compounded text that is extracted from one or more search queries submitted to a search engine is received. The compounded text includes a plurality of individual words that are joined together without intervening spaces. An electronic dictionary including words is accessed. A data structure representing possible segmentations of the compounded text is generated based on whether words in the possible segmentations occur in the electronic dictionary. A data store comprising data associated with a same field of usage as the compounded text is accessed to determine a frequency of occurrence for possible segmentations of the data structure. A segmentation of the compounded text that is most probable based on the data is determined. A language model is trained using the determined segmentation of the compounded text.
    Type: Application
    Filed: September 28, 2009
    Publication date: December 18, 2014
    Inventors: Carolina Parada, Boulos Harb, Johan Schalkwyk
  • Publication number: 20140288929
    Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
    Type: Application
    Filed: June 9, 2014
    Publication date: September 25, 2014
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson
  • Publication number: 20140278407
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language modeling of complete language sequences. Training data indicating language sequences is accessed, and counts for a number of times each language sequence occurs in the training data are determined. A proper subset of the language sequences is selected, and a first component of a language model is trained. The first component includes first probability data for assigning scores to the selected language sequences. A second component of the language model is trained based on the training data, where the second component includes second probability data for assigning scores to language sequences that are not included in the selected language sequences. Adjustment data that normalizes the second probability data with respect to the first probability data is generated, and the first component, the second component, and the adjustment data are stored.
    Type: Application
    Filed: May 2, 2013
    Publication date: September 18, 2014
    Applicant: Google Inc.
    Inventors: Ciprian I. Chelba, Hasim Sak, Johan Schalkwyk
  • Patent number: 8838459
    Abstract: The present disclosure describes a teleconferencing system that may use a virtual participant processor to translate language content of the teleconference into each participant's spoken language without additional user inputs. The virtual participant processor may connect to the teleconference as do the other participants. All text or audio data that was previously exchanged between the participants may now be intercepted by the virtual participant processor. Upon obtaining a partial or complete language recognition result or making a language preference determination, the virtual participant processor may call a translation engine appropriate for each of the participants. The virtual participant processor may send the resulting translation to a teleconference management processor. The teleconference management processor may deliver the respective translated text or audio data to the appropriate participant.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: September 16, 2014
    Assignee: Google Inc.
    Inventors: Jakob David Uszkoreit, Ashish Venugopal, Johan Schalkwyk, Joshua James Estelle
  • Patent number: 8751217
    Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
    Type: Grant
    Filed: September 29, 2011
    Date of Patent: June 10, 2014
    Assignee: Google Inc.
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau
  • Patent number: 8682661
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech input. In one aspect, a method includes receiving a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar, retrieving third-party statistical speech recognition information, the statistical speech recognition information being transmitted over a network, generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar, processing the user input using the SLM to generate one or more results, comparing the one or more results to candidates provided in the grammar, identifying a particular candidate of the grammar based on the comparing, and providing the particular candidate for input to an application executed on a computing device.
    Type: Grant
    Filed: August 31, 2010
    Date of Patent: March 25, 2014
    Assignee: Google Inc.
    Inventors: Johan Schalkwyk, Bjorn Bringert, David P. Singleton
  • Publication number: 20140012586
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on that evaluation, and providing a representation of the hotword suitability score for display to the user.
    Type: Application
    Filed: August 6, 2012
    Publication date: January 9, 2014
    Applicant: Google Inc.
    Inventors: Andrew E. Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin
  • Publication number: 20130226557
    Abstract: The present disclosure describes a teleconferencing system that may use a virtual participant processor to translate language content of the teleconference into each participant's spoken language without additional user inputs. The virtual participant processor may connect to the teleconference as do the other participants. All text or audio data that was previously exchanged between the participants may now be intercepted by the virtual participant processor. Upon obtaining a partial or complete language recognition result or making a language preference determination, the virtual participant processor may call a translation engine appropriate for each of the participants. The virtual participant processor may send the resulting translation to a teleconference management processor. The teleconference management processor may deliver the respective translated text or audio data to the appropriate participant.
    Type: Application
    Filed: April 30, 2012
    Publication date: August 29, 2013
    Applicant: Google Inc.
    Inventors: Jakob David Uszkoreit, Ashish Venugopal, Johan Schalkwyk, Joshua James Estelle
  • Patent number: 8521526
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing spoken query terms. In one aspect, a method includes performing speech recognition on an audio signal to select two or more textual, candidate transcriptions that match a spoken query term, and to establish a speech recognition confidence value for each candidate transcription, obtaining a search history for a user who spoke the spoken query term, where the search history references one or more past search queries that have been submitted by the user, generating one or more n-grams from each candidate transcription, where each n-gram is a subsequence of n phonemes, syllables, letters, characters, words or terms from a respective candidate transcription, and determining, for each n-gram, a frequency with which the n-gram occurs in the past search queries, and a weighting value that is based on the respective frequency.
    Type: Grant
    Filed: July 28, 2010
    Date of Patent: August 27, 2013
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Johan Schalkwyk, Pankaj Risbood
  • Publication number: 20130085753
    Abstract: A computing device is able to use an embedded speech recognizer and a network speech recognizer for speech recognition. In response to detecting speech in the captured audio, the computing device may forward the captured audio to its embedded speech recognizer and to a speech client for the network speech recognizer. The embedded speech recognizer provides an embedded-recognizer result for the captured audio. If a network-recognition criterion is met, the speech client forwards the captured audio to the network speech recognizer and receives a network-recognizer result for the captured audio from the network speech recognizer. A speech recognition result for the captured audio is forwarded to at least one application, wherein the speech recognition result is based on at least one of the embedded-recognizer result and the network-recognizer result.
    Type: Application
    Filed: August 15, 2012
    Publication date: April 4, 2013
    Applicant: Google Inc.
    Inventors: Bjorn Erik Bringert, Johan Schalkwyk, Michael J. LeBeau, Richard Zarek Cohen, Luca Zanolin, Simon Tickner
  • Patent number: 8370146
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech input. In one aspect, a method includes receiving a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar, retrieving third-party statistical speech recognition information, the statistical speech recognition information being transmitted over a network, generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar, processing the user input using the SLM to generate one or more results, comparing the one or more results to candidates provided in the grammar, identifying a particular candidate of the grammar based on the comparing, and providing the particular candidate for input to an application executed on a computing device.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: February 5, 2013
    Assignee: Google Inc.
    Inventors: Johan Schalkwyk, Bjorn Bringert, David P. Singleton
  • Publication number: 20120022867
    Abstract: Methods, computer program products and systems are described for speech-to-text conversion. A voice input is received from a user of an electronic device and contextual metadata is received that describes a context of the electronic device at a time when the voice input is received. Multiple base language models are identified, where each base language model corresponds to a distinct textual corpus of content. Using the contextual metadata, an interpolated language model is generated based on contributions from the base language models. The contributions are weighted according to a weighting for each of the base language models. The interpolated language model is used to convert the received voice input to a textual output. The voice input is received at a computer server system that is remote to the electronic device. The textual output is transmitted to the electronic device.
    Type: Application
    Filed: September 29, 2011
    Publication date: January 26, 2012
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen, Michael D. Riley
  • Publication number: 20120022853
    Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
    Type: Application
    Filed: September 29, 2011
    Publication date: January 26, 2012
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau
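
Illustrative sketch: the abstracts for patent 9047870 and publication 20120022867 describe generating an interpolated language model by weighting contributions from several base language models according to contextual metadata. The toy example below shows one way such a weighted mixture could look. It is not taken from the patent: the unigram models, their probabilities, the `active_app` metadata key, and the weighting rule are all hypothetical, chosen only to make the idea concrete.

```python
from typing import Dict, List

# Hypothetical unigram base models, one per text corpus (toy probabilities).
BASE_MODELS: Dict[str, Dict[str, float]] = {
    "web_search": {"pizza": 0.5, "weather": 0.3, "call": 0.2},
    "sms": {"pizza": 0.1, "weather": 0.1, "call": 0.8},
}


def context_weights(metadata: Dict[str, str]) -> Dict[str, float]:
    """Map contextual metadata to one weight per base model.

    The rule here is an assumption made for illustration only.
    """
    if metadata.get("active_app") == "messaging":
        return {"web_search": 0.25, "sms": 0.75}
    return {"web_search": 0.75, "sms": 0.25}


def interpolate(
    base_models: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> Dict[str, float]:
    """Build a mixture model: P(w) = sum_i weight_i * P_i(w)."""
    total = sum(weights.values())
    vocab = set()
    for model in base_models.values():
        vocab.update(model)
    return {
        word: sum(
            (weights[name] / total) * model.get(word, 0.0)
            for name, model in base_models.items()
        )
        for word in vocab
    }


def pick_transcription(candidates: List[str], model: Dict[str, float]) -> str:
    """Choose the candidate word the interpolated model scores highest."""
    return max(candidates, key=lambda word: model.get(word, 0.0))


if __name__ == "__main__":
    model = interpolate(BASE_MODELS, context_weights({"active_app": "messaging"}))
    print(pick_transcription(["pizza", "call"], model))
```

With these toy numbers, the same pair of candidates is ranked differently depending on the context weights, which is the effect the abstracts attribute to contextual metadata. A real recognizer would combine such language model scores with acoustic scores over full word sequences rather than single words.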