Patents by Inventor Michael H. Cohen

Michael H. Cohen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20130006640
Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
    Type: Application
    Filed: September 14, 2012
    Publication date: January 3, 2013
    Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno
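The dictionary-generation step in the abstract above (align transcript to audio, extract a word's spoken pronunciation, add a dictionary entry) can be sketched as follows. This is a minimal illustration, not the patented method: it assumes a forced aligner has already produced word-level audio segments, and `phonemize` is a hypothetical stand-in for the acoustic phoneme-extraction step.

```python
def phonemize(audio_segment):
    """Hypothetical stand-in: derive a pronunciation string from an
    aligned audio slice (a real system would use an acoustic model)."""
    return "-".join(audio_segment)

def build_pronunciation_dictionary(aligned_words):
    """aligned_words: (word, audio_segment) pairs produced by
    synchronizing the transcript with the recording."""
    dictionary = {}
    for word, segment in aligned_words:
        # Each observed pronunciation becomes a dictionary entry.
        dictionary.setdefault(word, set()).add(phonemize(segment))
    return dictionary

# Two recordings of the same word yield two dictionary entries.
aligned = [("tomato", ["t", "ah", "m", "ey", "t", "ow"]),
           ("tomato", ["t", "ah", "m", "aa", "t", "ow"])]
entries = build_pronunciation_dictionary(aligned)
```
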
  • Publication number: 20120022873
Abstract: Methods, computer program products and systems are described for forming a speech recognition language model. Multiple query-website relationships are determined by identifying websites that are determined to be relevant to queries using one or more search engines. Clusters are identified in the query-website relationships by connecting common queries and connecting common websites. A speech recognition language model is created for a particular website based on at least one of analyzing queries in a cluster that includes the website or analyzing webpage content of web pages in the cluster that includes the website.
    Type: Application
    Filed: September 29, 2011
    Publication date: January 26, 2012
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen
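The clustering step in the abstract above, connecting common queries and common websites, amounts to finding connected components in a bipartite query-website graph. A minimal sketch under that reading (the example queries and site names are invented):

```python
from collections import defaultdict

def cluster_relationships(pairs):
    """Group (query, website) pairs into clusters by connecting
    shared queries and shared websites (connected components)."""
    graph = defaultdict(set)
    for query, site in pairs:
        # Tag nodes so a query and a site with the same text don't collide.
        graph[("q", query)].add(("s", site))
        graph[("s", site)].add(("q", query))
    seen, clusters = set(), []
    for node in graph:
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:  # iterative depth-first search
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            component.add(n)
            stack.extend(graph[n])
        clusters.append(component)
    return clusters

pairs = [("weather today", "weather.example"),
         ("forecast", "weather.example"),
         ("pizza near me", "pizza.example")]
clusters = cluster_relationships(pairs)
# Two clusters: the two weather queries join via their shared website.
```

A per-website language model would then be trained on the queries (or page text) in the cluster containing that website.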
  • Publication number: 20120022853
    Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
    Type: Application
    Filed: September 29, 2011
    Publication date: January 26, 2012
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau
  • Publication number: 20120022867
    Abstract: Methods, computer program products and systems are described for speech-to-text conversion. A voice input is received from a user of an electronic device and contextual metadata is received that describes a context of the electronic device at a time when the voice input is received. Multiple base language models are identified, where each base language model corresponds to a distinct textual corpus of content. Using the contextual metadata, an interpolated language model is generated based on contributions from the base language models. The contributions are weighted according to a weighting for each of the base language models. The interpolated language model is used to convert the received voice input to a textual output. The voice input is received at a computer server system that is remote to the electronic device. The textual output is transmitted to the electronic device.
    Type: Application
    Filed: September 29, 2011
    Publication date: January 26, 2012
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen, Michael D. Riley
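The interpolation described in the abstract above, a weighted combination of base language models with context-derived weights, can be sketched for unigram probabilities as below. The models, words, and weights are invented for illustration; a production system would interpolate full n-gram models.

```python
def interpolate(base_models, weights, word):
    """Linear interpolation of per-word probabilities across base
    language models; weights are assumed to sum to 1."""
    return sum(w * model.get(word, 0.0)
               for model, w in zip(base_models, weights))

# Two base models, each built from a distinct textual corpus.
navigation_lm = {"left": 0.4, "right": 0.4, "pizza": 0.0}
dining_lm     = {"left": 0.05, "right": 0.05, "pizza": 0.3}

# Contextual metadata (say, a maps app in the foreground) weights
# the navigation corpus more heavily.
weights = [0.8, 0.2]
p = interpolate([navigation_lm, dining_lm], weights, "pizza")
```
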
  • Publication number: 20120022866
    Abstract: Methods, computer program products and systems are described for converting speech to text. Sound information is received at a computer server system from an electronic device, where the sound information is from a user of the electronic device. A context identifier indicates a context within which the user provided the sound information. The context identifier is used to select, from among multiple language models, a language model appropriate for the context. Speech in the sound information is converted to text using the selected language model. The text is provided for use by the electronic device.
    Type: Application
    Filed: September 29, 2011
    Publication date: January 26, 2012
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen
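In contrast to the interpolation approach, the abstract above selects a single model per context. At its simplest, the server-side selection step is a keyed lookup with a general-purpose fallback; the context identifiers and model names below are invented for illustration.

```python
def select_language_model(models, context_id, default="general"):
    """Pick the language model registered for the context identifier
    sent by the device, falling back to a general-purpose model."""
    return models.get(context_id, models[default])

models = {
    "maps_search":     {"name": "maps"},
    "email_dictation": {"name": "email"},
    "general":         {"name": "general"},
}
chosen = select_language_model(models, "maps_search")
fallback = select_language_model(models, "unknown_widget")
```
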
  • Publication number: 20110213613
Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
    Type: Application
    Filed: May 24, 2010
    Publication date: September 1, 2011
    Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno
  • Publication number: 20110161080
    Abstract: Methods, computer program products and systems are described for speech-to-text conversion. A voice input is received from a user of an electronic device and contextual metadata is received that describes a context of the electronic device at a time when the voice input is received. Multiple base language models are identified, where each base language model corresponds to a distinct textual corpus of content. Using the contextual metadata, an interpolated language model is generated based on contributions from the base language models. The contributions are weighted according to a weighting for each of the base language models. The interpolated language model is used to convert the received voice input to a textual output. The voice input is received at a computer server system that is remote to the electronic device. The textual output is transmitted to the electronic device.
    Type: Application
    Filed: December 22, 2010
    Publication date: June 30, 2011
    Applicant: GOOGLE INC.
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen, Michael D. Riley
  • Publication number: 20110161081
Abstract: Methods, computer program products and systems are described for forming a speech recognition language model. Multiple query-website relationships are determined by identifying websites that are determined to be relevant to queries using one or more search engines. Clusters are identified in the query-website relationships by connecting common queries and connecting common websites. A speech recognition language model is created for a particular website based on at least one of analyzing queries in a cluster that includes the website or analyzing webpage content of web pages in the cluster that includes the website.
    Type: Application
    Filed: December 22, 2010
    Publication date: June 30, 2011
    Applicant: GOOGLE INC.
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen
  • Publication number: 20110153324
    Abstract: Methods, computer program products and systems are described for converting speech to text. Sound information is received at a computer server system from an electronic device, where the sound information is from a user of the electronic device. A context identifier indicates a context within which the user provided the sound information. The context identifier is used to select, from among multiple language models, a language model appropriate for the context. Speech in the sound information is converted to text using the selected language model. The text is provided for use by the electronic device.
    Type: Application
    Filed: December 22, 2010
    Publication date: June 23, 2011
    Applicant: GOOGLE INC.
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen
  • Publication number: 20110153325
    Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
    Type: Application
    Filed: December 22, 2010
    Publication date: June 23, 2011
    Applicant: GOOGLE INC.
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau
  • Patent number: 7756708
Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
    Type: Grant
    Filed: April 3, 2006
    Date of Patent: July 13, 2010
    Assignee: Google Inc.
    Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno
  • Publication number: 20100121704
    Abstract: A computer-implemented method for advertisement distribution includes receiving, in a computer system, an input from an advertiser that has previously registered an advertisement for on-demand activation. The input is generated based on the advertiser having an immediate availability and directs the computer system to initiate the on-demand activation substantially in real time with receiving the input. The method includes determining, using the computer system, a geographic location of the advertiser that corresponds to the immediate availability. The method includes defining, using the computer system, a target group to which the advertisement is to be presented, the target group identified based on at least the geographic location and the immediate availability. The method includes initiating the on-demand activation using the computer system, for receipt of the advertisement by at least part of the target group, the on-demand activation initiated substantially in real time with receiving the input.
    Type: Application
    Filed: November 12, 2009
    Publication date: May 13, 2010
    Inventors: Vincent Vanhoucke, Michael H. Cohen, Manish G. Patel, Gudmundur Hafsteinsson
  • Publication number: 20100057460
    Abstract: Verbal labels for electronic messages, as well as systems and methods for making and using such labels, are disclosed. A verbal label is a label containing audio data (such as a digital audio file of a user's voice and/or a speaker template thereof) that is associated with one or more electronic messages. Verbal labels permit a user to more efficiently manipulate e-mail and other electronic messages by voice. For example, a user can add such labels verbally to an e-mail or to a group of e-mails, thereby permitting these messages to be sorted and retrieved more easily.
    Type: Application
    Filed: November 13, 2009
    Publication date: March 4, 2010
Inventor: Michael H. Cohen
  • Patent number: 7627638
    Abstract: Verbal labels for electronic messages, as well as systems and methods for making and using such labels, are disclosed. A verbal label is a label containing audio data (such as a digital audio file of a user's voice and/or a speaker template thereof) that is associated with one or more electronic messages. Verbal labels permit a user to more efficiently manipulate e-mail and other electronic messages by voice. For example, a user can add such labels verbally to an e-mail or to a group of e-mails, thereby permitting these messages to be sorted and retrieved more easily.
    Type: Grant
    Filed: December 20, 2004
    Date of Patent: December 1, 2009
    Assignee: Google Inc.
    Inventor: Michael H. Cohen
  • Patent number: 7263489
    Abstract: A system which uses automatic speech recognition to provide dialogs with human speakers automatically detects one or more characteristics, which may be characteristics of a speaker, his speech, his environment, or the speech channel used to communicate with the speaker. The characteristic may be detected either during the dialog or at a later time based on stored data representing the dialog. If the characteristic is detected during the dialog, the dialog can be customized for the speaker at an application level, based on the detected characteristic. The customization may include customization of operations and features such as call routing, error recovery, call flow, content selection, system prompts, or system persona. Data indicative of detected characteristics can be stored and accumulated for many speakers and/or dialogs and analyzed offline to generate a demographic or other type of analysis of the speakers or dialogs with respect to one or more detected characteristics.
    Type: Grant
    Filed: January 11, 2002
    Date of Patent: August 28, 2007
    Assignee: Nuance Communications, Inc.
    Inventors: Michael H. Cohen, Larry P. Heck, Jennifer E. Balogh, James M. Riseman, Naghmeh N. Mirghafori
  • Patent number: 7082397
    Abstract: The present invention allows a user to audibly and interactively browse through a network of audio information, forming a seamless integration of the world wide web and the entire telephone network browsable from any telephone set. Preferably a browser controller allows the user to receive audio information and to transmit verbal instructions. The browser controller links the user to voice pages, which can be any telephone station or world wide web page, in response to voice commands. Upon linking, certain information is played with an audio indicia which identifies a linking capability. If the user repeats the information set off by the audio indicia, the telephone number or URL of the selected link is transmitted to the browser controller. The browser controller establishes a new link with the identified telephone number or URL, and if successful, disconnects the previous link.
    Type: Grant
    Filed: December 1, 1998
    Date of Patent: July 25, 2006
    Assignee: Nuance Communications, Inc.
    Inventors: Michael H. Cohen, Tracy Demian Wax
  • Patent number: 6859776
Abstract: A network comprises a number of speech-enabled sites maintaining a number of voice pages. A central server on the network executes a voice browser which provides users with access to the sites using voice-activated hyperlinks. The server also maintains and brokers information associated with the users based on spoken dialogs between the users and the sites. In response to a user accessing a given ASR site, information about that user is provided by the server for use by that ASR site. The information is used by the ASR site to optimize a spoken dialog between the user and the ASR site by reducing the amount of information the user is required to provide during the dialog. Information about the user can thereby be shared between separate speech-enabled sites, in a manner which is transparent to the user, in order to expedite the user's interaction with those sites.
    Type: Grant
    Filed: October 4, 1999
    Date of Patent: February 22, 2005
    Assignee: Nuance Communications
    Inventors: Michael H. Cohen, Tracy D. Wax, Michael A. Prince, Steven C. Ehrlich
  • Patent number: 6629066
Abstract: A computerized method for building and running natural language understanding systems, wherein a natural language understanding system takes a sentence as input and returns some representation of the possible meanings of the sentence as output (the “interpretation”) using a run-time interpreter that assigns interpretations to sentences and a compiler that produces (in a computer memory) an internal specification needed for the run-time interpreter from a user specification of the semantics of the application. The compiler builds a natural language system, while the run-time interpreter runs the system.
    Type: Grant
    Filed: September 7, 1999
    Date of Patent: September 30, 2003
    Assignee: Nuance Communications
    Inventors: Eric G. Jackson, Michael H. Cohen, Fuliang Weng
  • Patent number: 6560576
Abstract: A voice-enabled application, which may be a voice browser, is configured to provide active help to a user. The application maintains a usage history of each user with respect to dialog states. The usage history includes various user-specific variables, some of which are valid across multiple sessions. The application maintains a number of active help prompts capable of being played to a user as speech, each containing information on a different, specific help topic. The application further maintains a number of sets of conditions, each set corresponding to a different one of the active help prompts. The application monitors dialog states during a session of a user and generates an event based on the dialog states. The application applies certain ones of the conditions to certain ones of the user-specific variables in response to the event. The application then plays an active help prompt containing information on a specific help topic to the user if the applied conditions are satisfied.
    Type: Grant
    Filed: April 25, 2000
    Date of Patent: May 6, 2003
    Assignee: Nuance Communications
    Inventors: Michael H. Cohen, Jennifer E. Balogh, Tracy D. Wax, Madhavan S. Thirumalai, Debajit Ghosh
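The condition-gated mechanism in the abstract above (an event fires, per-prompt conditions are checked against user-specific variables, and a matching help prompt is played) can be sketched as below. The prompt text, event name, and variables are invented for illustration.

```python
def active_help(prompts, conditions, user_vars, event):
    """Return the help prompts whose condition set is fully satisfied
    by the user's dialog-history variables when `event` fires."""
    fired = []
    for topic, prompt in prompts.items():
        checks = conditions.get(topic, [])
        # Every condition in the set must hold for the prompt to play.
        if checks and all(check(user_vars, event) for check in checks):
            fired.append(prompt)
    return fired

prompts = {"barge_in": "You can interrupt me at any time."}
conditions = {"barge_in": [
    lambda vars, event: event == "prompt_replayed",
    lambda vars, event: vars["replay_count"] >= 3,
]}
# A user who has replayed the same prompt three times gets the hint.
hints = active_help(prompts, conditions,
                    {"replay_count": 3}, "prompt_replayed")
```
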
  • Publication number: 20020164000
    Abstract: The present invention allows a user to audibly and interactively browse through a network of audio information, forming a seamless integration of the world wide web and the entire telephone network browsable from any telephone set. Preferably a browser controller allows the user to receive audio information and to transmit verbal instructions. The browser controller links the user to voice pages, which can be any telephone station or world wide web page, in response to voice commands. Upon linking, certain information is played with an audio indicia which identifies a linking capability. If the user repeats the information set off by the audio indicia, the telephone number or URL of the selected link is transmitted to the browser controller. The browser controller establishes a new link with the identified telephone number or URL, and if successful, disconnects the previous link.
    Type: Application
    Filed: December 1, 1998
    Publication date: November 7, 2002
Inventors: Michael H. Cohen, Tracy Demian Wax