Patents by Inventor Michael H. Cohen
Michael H. Cohen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20130006640
Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
Type: Application
Filed: September 14, 2012
Publication date: January 3, 2013
Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno
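The first method above (revising a baseline model from recent query usage) can be illustrated with a minimal sketch. Everything here is hypothetical: the abstract does not specify model form or update rule, so this assumes a unigram model and a simple linear interpolation with word frequencies counted from recent queries.

```python
from collections import Counter

def revise_unigram_model(baseline, recent_queries, mix=0.3):
    """Blend a baseline unigram model with word frequencies observed in
    recent search queries. `mix` weights the recent-usage distribution
    (an illustrative choice, not from the patent)."""
    counts = Counter()
    for query in recent_queries:
        counts.update(query.lower().split())
    total = sum(counts.values())
    revised = {}
    for word in set(baseline) | set(counts):
        p_base = baseline.get(word, 0.0)
        p_recent = counts[word] / total if total else 0.0
        revised[word] = (1 - mix) * p_base + mix * p_recent
    return revised

baseline = {"weather": 0.5, "news": 0.5}
model = revise_unigram_model(baseline, ["weather today", "weather radar"])
```

Because both inputs are proper distributions, the revised model still sums to one, and words trending in queries ("weather") gain probability mass relative to those that are not ("news").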
-
Publication number: 20120022873
Abstract: Methods, computer program products and systems are described for forming a speech recognition language model. Multiple query-website relationships are determined by identifying websites that are determined to be relevant to queries using one or more search engines. Clusters are identified in the query-website relationships by connecting common queries and connecting common websites. A speech recognition language model is created for a particular website based on at least one of analyzing queries in a cluster that includes the website or analyzing webpage content of web pages in the cluster that includes the website.
Type: Application
Filed: September 29, 2011
Publication date: January 26, 2012
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen
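The clustering step described above (connecting common queries and common websites) amounts to finding connected components in a bipartite query-website graph. A minimal union-find sketch, with made-up example data, might look like this; the patent does not prescribe this algorithm.

```python
from collections import defaultdict

def cluster_relationships(pairs):
    """Group (query, website) relevance pairs into clusters by linking
    shared queries and shared websites (connected components)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for query, site in pairs:
        # Prefix nodes so a query and a website with the same text differ.
        root_q, root_w = find(("q", query)), find(("w", site))
        if root_q != root_w:
            parent[root_q] = root_w

    clusters = defaultdict(lambda: {"queries": set(), "websites": set()})
    for kind, name in list(parent):
        key = "queries" if kind == "q" else "websites"
        clusters[find((kind, name))][key].add(name)
    return list(clusters.values())

pairs = [("pizza near me", "pizzahut.example"),
         ("pizza delivery", "pizzahut.example"),
         ("tax forms", "irs.example")]
clusters = cluster_relationships(pairs)
```

Each resulting cluster then supplies the queries and page content from which a per-website language model could be built.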
-
Publication number: 20120022853
Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
Type: Application
Filed: September 29, 2011
Publication date: January 26, 2012
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau
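The flow above (spoken input converted to text, then handed to any application as ordinary text input) can be sketched as follows. The recognizer here is a stand-in callable and the class names are invented for illustration; no real speech API is implied.

```python
class SpeechInputMethodEditor:
    """Sketch of an application-independent IME: accepts spoken input,
    converts it to text, and delivers the text to the target application
    through the same path as typed input."""

    def __init__(self, recognizer):
        self.recognizer = recognizer  # callable: audio bytes -> text

    def handle_spoken_input(self, audio, application):
        text = self.recognizer(audio)   # speech-to-text conversion
        application.receive_text(text)  # application sees plain text
        return text

class EchoApp:
    """Toy application that just buffers whatever text it receives."""
    def __init__(self):
        self.buffer = []
    def receive_text(self, text):
        self.buffer.append(text)

ime = SpeechInputMethodEditor(lambda audio: "hello world")  # fake recognizer
app = EchoApp()
ime.handle_spoken_input(b"\x00\x01", app)
```

The key design point is that the application never needs speech support itself: it only ever receives text.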
-
Publication number: 20120022867
Abstract: Methods, computer program products and systems are described for speech-to-text conversion. A voice input is received from a user of an electronic device and contextual metadata is received that describes a context of the electronic device at a time when the voice input is received. Multiple base language models are identified, where each base language model corresponds to a distinct textual corpus of content. Using the contextual metadata, an interpolated language model is generated based on contributions from the base language models. The contributions are weighted according to a weighting for each of the base language models. The interpolated language model is used to convert the received voice input to a textual output. The voice input is received at a computer server system that is remote to the electronic device. The textual output is transmitted to the electronic device.
Type: Application
Filed: September 29, 2011
Publication date: January 26, 2012
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen, Michael D. Riley
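The interpolation described above is a weighted mixture of base models. A minimal sketch, assuming unigram base models and a deliberately simple context-to-weight heuristic (boost models whose tag appears in the contextual metadata; all names and numbers are illustrative):

```python
def interpolate(base_models, weights, word):
    """Probability of `word` under a weighted mixture of base models."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * m.get(word, 0.0) for w, m in zip(weights, base_models))

def weights_from_context(context, base_tags):
    """Derive mixture weights from contextual metadata: double the raw
    weight of any base model whose tag appears in the context, then
    normalize (an illustrative heuristic, not the patented method)."""
    raw = [2.0 if tag in context else 1.0 for tag in base_tags]
    total = sum(raw)
    return [r / total for r in raw]

web_lm = {"search": 0.6, "navigate": 0.4}   # built from a web corpus
sms_lm = {"lol": 0.7, "search": 0.3}        # built from a messaging corpus
weights = weights_from_context({"messaging": True}, ["web", "messaging"])
p = interpolate([web_lm, sms_lm], weights, "search")
```

With the device in a messaging context, the messaging model contributes two thirds of the mixture, so p = (1/3)(0.6) + (2/3)(0.3) = 0.4.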
-
Publication number: 20120022866
Abstract: Methods, computer program products and systems are described for converting speech to text. Sound information is received at a computer server system from an electronic device, where the sound information is from a user of the electronic device. A context identifier indicates a context within which the user provided the sound information. The context identifier is used to select, from among multiple language models, a language model appropriate for the context. Speech in the sound information is converted to text using the selected language model. The text is provided for use by the electronic device.
Type: Application
Filed: September 29, 2011
Publication date: January 26, 2012
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen
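Unlike the interpolation approach, this method selects a single model per context identifier. At its simplest that is a registry lookup with a fallback; the identifiers and model names below are invented for illustration.

```python
def select_language_model(context_id, models, default="general"):
    """Pick the language model registered for a context identifier,
    falling back to a general-purpose model for unknown contexts."""
    return models.get(context_id, models[default])

models = {
    "search_box": {"name": "web-search LM"},
    "sms_field":  {"name": "messaging LM"},
    "general":    {"name": "general LM"},
}
chosen = select_language_model("sms_field", models)
fallback = select_language_model("unknown_ctx", models)
```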
-
Publication number: 20110213613
Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
Type: Application
Filed: May 24, 2010
Publication date: September 1, 2011
Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno
-
Publication number: 20110161080
Abstract: Methods, computer program products and systems are described for speech-to-text conversion. A voice input is received from a user of an electronic device and contextual metadata is received that describes a context of the electronic device at a time when the voice input is received. Multiple base language models are identified, where each base language model corresponds to a distinct textual corpus of content. Using the contextual metadata, an interpolated language model is generated based on contributions from the base language models. The contributions are weighted according to a weighting for each of the base language models. The interpolated language model is used to convert the received voice input to a textual output. The voice input is received at a computer server system that is remote to the electronic device. The textual output is transmitted to the electronic device.
Type: Application
Filed: December 22, 2010
Publication date: June 30, 2011
Applicant: Google Inc.
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen, Michael D. Riley
-
Publication number: 20110161081
Abstract: Methods, computer program products and systems are described for forming a speech recognition language model. Multiple query-website relationships are determined by identifying websites that are determined to be relevant to queries using one or more search engines. Clusters are identified in the query-website relationships by connecting common queries and connecting common websites. A speech recognition language model is created for a particular website based on at least one of analyzing queries in a cluster that includes the website or analyzing webpage content of web pages in the cluster that includes the website.
Type: Application
Filed: December 22, 2010
Publication date: June 30, 2011
Applicant: Google Inc.
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen
-
Publication number: 20110153324
Abstract: Methods, computer program products and systems are described for converting speech to text. Sound information is received at a computer server system from an electronic device, where the sound information is from a user of the electronic device. A context identifier indicates a context within which the user provided the sound information. The context identifier is used to select, from among multiple language models, a language model appropriate for the context. Speech in the sound information is converted to text using the selected language model. The text is provided for use by the electronic device.
Type: Application
Filed: December 22, 2010
Publication date: June 23, 2011
Applicant: Google Inc.
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen
-
Publication number: 20110153325
Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
Type: Application
Filed: December 22, 2010
Publication date: June 23, 2011
Applicant: Google Inc.
Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau
-
Patent number: 7756708
Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
Type: Grant
Filed: April 3, 2006
Date of Patent: July 13, 2010
Assignee: Google Inc.
Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno
-
Publication number: 20100121704
Abstract: A computer-implemented method for advertisement distribution includes receiving, in a computer system, an input from an advertiser that has previously registered an advertisement for on-demand activation. The input is generated based on the advertiser having an immediate availability and directs the computer system to initiate the on-demand activation substantially in real time with receiving the input. The method includes determining, using the computer system, a geographic location of the advertiser that corresponds to the immediate availability. The method includes defining, using the computer system, a target group to which the advertisement is to be presented, the target group identified based on at least the geographic location and the immediate availability. The method includes initiating the on-demand activation using the computer system, for receipt of the advertisement by at least part of the target group, the on-demand activation initiated substantially in real time with receiving the input.
Type: Application
Filed: November 12, 2009
Publication date: May 13, 2010
Inventors: Vincent Vanhoucke, Michael H. Cohen, Manish G. Patel, Gudmundur Hafsteinsson
-
Publication number: 20100057460
Abstract: Verbal labels for electronic messages, as well as systems and methods for making and using such labels, are disclosed. A verbal label is a label containing audio data (such as a digital audio file of a user's voice and/or a speaker template thereof) that is associated with one or more electronic messages. Verbal labels permit a user to more efficiently manipulate e-mail and other electronic messages by voice. For example, a user can add such labels verbally to an e-mail or to a group of e-mails, thereby permitting these messages to be sorted and retrieved more easily.
Type: Application
Filed: November 13, 2009
Publication date: March 4, 2010
Inventor: Michael H. Cohen
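The core data structure implied above is an association between audio label data and message identifiers. A minimal sketch (class and field names are invented; real systems would also need speaker-template matching to recognize a spoken label):

```python
from collections import defaultdict

class VerbalLabelStore:
    """Associates audio label data (e.g. a recording or speaker
    template) with message IDs so messages can be grouped and
    retrieved by the label later."""

    def __init__(self):
        self.audio = {}                   # label name -> audio data
        self.messages = defaultdict(set)  # label name -> message IDs

    def add_label(self, name, audio_data, message_ids):
        self.audio[name] = audio_data
        self.messages[name].update(message_ids)

    def retrieve(self, name):
        """Return the IDs of all messages carrying this label."""
        return sorted(self.messages[name])

store = VerbalLabelStore()
store.add_label("urgent", b"<audio bytes>", ["msg1", "msg3"])
store.add_label("urgent", b"<audio bytes>", ["msg2"])
```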
-
Patent number: 7627638
Abstract: Verbal labels for electronic messages, as well as systems and methods for making and using such labels, are disclosed. A verbal label is a label containing audio data (such as a digital audio file of a user's voice and/or a speaker template thereof) that is associated with one or more electronic messages. Verbal labels permit a user to more efficiently manipulate e-mail and other electronic messages by voice. For example, a user can add such labels verbally to an e-mail or to a group of e-mails, thereby permitting these messages to be sorted and retrieved more easily.
Type: Grant
Filed: December 20, 2004
Date of Patent: December 1, 2009
Assignee: Google Inc.
Inventor: Michael H. Cohen
-
Patent number: 7263489
Abstract: A system which uses automatic speech recognition to provide dialogs with human speakers automatically detects one or more characteristics, which may be characteristics of a speaker, his speech, his environment, or the speech channel used to communicate with the speaker. The characteristic may be detected either during the dialog or at a later time based on stored data representing the dialog. If the characteristic is detected during the dialog, the dialog can be customized for the speaker at an application level, based on the detected characteristic. The customization may include customization of operations and features such as call routing, error recovery, call flow, content selection, system prompts, or system persona. Data indicative of detected characteristics can be stored and accumulated for many speakers and/or dialogs and analyzed offline to generate a demographic or other type of analysis of the speakers or dialogs with respect to one or more detected characteristics.
Type: Grant
Filed: January 11, 2002
Date of Patent: August 28, 2007
Assignee: Nuance Communications, Inc.
Inventors: Michael H. Cohen, Larry P. Heck, Jennifer E. Balogh, James M. Riseman, Naghmeh N. Mirghafori
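One customization named above, call routing based on detected characteristics, can be sketched as an ordered rule list. The characteristic names and destination queues are purely illustrative; the abstract does not enumerate specific characteristics or routes.

```python
def route_call(detected, routes, default="general_queue"):
    """Return the first routing destination whose detected
    characteristic is present; otherwise fall back to the default."""
    for characteristic, destination in routes:
        if detected.get(characteristic):
            return destination
    return default

# Ordered rules: earlier entries take priority (illustrative only).
routes = [("speech_impairment", "human_agent"),
          ("noisy_channel", "robust_asr_flow")]
dest = route_call({"noisy_channel": True}, routes)
```

The same detected-characteristic records could be logged and aggregated offline for the demographic analysis the abstract mentions.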
-
Patent number: 7082397
Abstract: The present invention allows a user to audibly and interactively browse through a network of audio information, forming a seamless integration of the world wide web and the entire telephone network browsable from any telephone set. Preferably a browser controller allows the user to receive audio information and to transmit verbal instructions. The browser controller links the user to voice pages, which can be any telephone station or world wide web page, in response to voice commands. Upon linking, certain information is played with an audio indicia which identifies a linking capability. If the user repeats the information set off by the audio indicia, the telephone number or URL of the selected link is transmitted to the browser controller. The browser controller establishes a new link with the identified telephone number or URL, and if successful, disconnects the previous link.
Type: Grant
Filed: December 1, 1998
Date of Patent: July 25, 2006
Assignee: Nuance Communications, Inc.
Inventors: Michael H. Cohen, Tracy Demian Wax
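The link-following behavior described above (establish the new link first, then disconnect the previous one) can be sketched as follows. The class, the phrase-to-target table, and the example targets are all invented for illustration.

```python
class VoiceBrowserController:
    """Sketch: when the user repeats a phrase that was marked with an
    audio indicia, follow the associated telephone number or URL, and
    only after the new link succeeds drop the old one."""

    def __init__(self, link_table):
        self.link_table = link_table  # spoken phrase -> tel number or URL
        self.current_link = None
        self.disconnected = []        # record of dropped links

    def follow(self, spoken_phrase):
        target = self.link_table.get(spoken_phrase)
        if target is None:
            return False                        # not a link; stay put
        previous = self.current_link
        self.current_link = target              # establish new link first
        if previous is not None:
            self.disconnected.append(previous)  # then drop the old one
        return True

browser = VoiceBrowserController({"sports scores": "tel:+15550100",
                                  "weather": "https://weather.example"})
browser.follow("weather")
browser.follow("sports scores")
```

Ordering the connect before the disconnect means a failed link attempt leaves the user on the page they were already hearing.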
-
Patent number: 6859776
Abstract: A network comprises a number of speech-enabled sites maintaining a number of voice pages. A central server on the network executes a voice browser which provides users with access to the sites using voice-activated hyperlinks. The server also maintains and brokers information associated with the users based on spoken dialogs between the users and the sites. In response to a user accessing a given ASR site, information about that user is provided by the server for use by that ASR site. The information is used by the ASR site to optimize a spoken dialog between the user and the ASR site by reducing the amount of information the user is required to provide during the dialog. Information about the user can thereby be shared between separate speech enabled sites, in a manner which is transparent to the user, in order to expedite the user's interaction with those sites.
Type: Grant
Filed: October 4, 1999
Date of Patent: February 22, 2005
Assignee: Nuance Communications
Inventors: Michael H. Cohen, Tracy D. Wax, Michael A. Prince, Steven C. Ehrlich
-
Patent number: 6629066
Abstract: A computerized method for building and running natural language understanding systems, wherein a natural language understanding system takes a sentence as input and returns some representation of the possible meanings of the sentence as output (the "interpretation") using a run-time interpreter that assigns interpretations to sentences and a compiler that produces (in a computer memory) an internal specification needed for the run-time interpreter from a user specification of the semantics of the application. The compiler builds a natural language system, while the run-time interpreter runs the system.
Type: Grant
Filed: September 7, 1999
Date of Patent: September 30, 2003
Assignee: Nuance Communications
Inventors: Eric G. Jackson, Michael H. Cohen, Fuliang Weng
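The compiler/run-time split above can be illustrated with a deliberately tiny sketch: the "compiler" turns a user specification (pattern templates paired with meaning functions) into an internal rule table, and the "interpreter" matches sentences against it. The template syntax and example spec are invented; the actual patent covers far richer semantics.

```python
import re

def compile_spec(spec):
    """'Compiler' sketch: convert a user specification of semantics
    (template string -> meaning function) into internal regex rules."""
    rules = []
    for template, meaning in spec:
        pattern = re.compile("^" + template.replace("<num>", r"(\d+)") + "$")
        rules.append((pattern, meaning))
    return rules

def interpret(rules, sentence):
    """Run-time interpreter sketch: return all possible meanings
    (interpretations) of the input sentence."""
    meanings = []
    for pattern, meaning in rules:
        m = pattern.match(sentence)
        if m:
            meanings.append(meaning(*m.groups()))
    return meanings

spec = [("transfer <num> dollars", lambda n: {"act": "transfer", "amount": int(n)}),
        ("check balance", lambda: {"act": "balance"})]
rules = compile_spec(spec)
result = interpret(rules, "transfer 50 dollars")
```

Separating compilation from interpretation lets the same run-time engine serve any application whose semantics the user has specified.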
-
Patent number: 6560576
Abstract: A voice-enabled application, which may be a voice browser, is configured to provide active help to a user. The application maintains a usage history of each user with respect to dialog states. The usage history includes various user-specific variables, some of which are valid across multiple sessions. The application maintains a number of active help prompts capable of being played to a user as speech, each containing information on a different, specific help topic. The application further maintains a number of sets of conditions, each set corresponding to a different one of the active help prompts. The application monitors dialog states during a session of a user and generates an event based on the dialog states. The application applies certain ones of the conditions to certain ones of the user-specific variables in response to the event. The application then plays an active help prompt containing information on a specific help topic to the user if the applied conditions are satisfied.
Type: Grant
Filed: April 25, 2000
Date of Patent: May 6, 2003
Assignee: Nuance Communications
Inventors: Michael H. Cohen, Jennifer E. Balogh, Tracy D. Wax, Madhavan S. Thirumalai, Debajit Ghosh
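The prompt-selection logic described above (condition sets evaluated against user-specific variables when an event fires) reduces to something like this sketch. The topic name, condition predicates, and variable names are invented for illustration.

```python
def maybe_play_help(prompts, conditions, user_vars):
    """On a dialog event, return the first active-help prompt whose
    entire condition set is satisfied by the user's usage-history
    variables; return None if no prompt applies."""
    for topic, prompt in prompts.items():
        if all(cond(user_vars) for cond in conditions[topic]):
            return prompt
    return None

prompts = {"barge_in": "You can speak over the prompt at any time."}
conditions = {
    # Offer barge-in help after repeated failures, but only once.
    "barge_in": [lambda v: v.get("failed_turns", 0) >= 3,
                 lambda v: not v.get("heard_barge_in_help", False)],
}
help_text = maybe_play_help(prompts, conditions, {"failed_turns": 4})
```

Persisting variables like `heard_barge_in_help` across sessions is what keeps the help "active" without becoming repetitive.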
-
Publication number: 20020164000
Abstract: The present invention allows a user to audibly and interactively browse through a network of audio information, forming a seamless integration of the world wide web and the entire telephone network browsable from any telephone set. Preferably a browser controller allows the user to receive audio information and to transmit verbal instructions. The browser controller links the user to voice pages, which can be any telephone station or world wide web page, in response to voice commands. Upon linking, certain information is played with an audio indicia which identifies a linking capability. If the user repeats the information set off by the audio indicia, the telephone number or URL of the selected link is transmitted to the browser controller. The browser controller establishes a new link with the identified telephone number or URL, and if successful, disconnects the previous link.
Type: Application
Filed: December 1, 1998
Publication date: November 7, 2002
Inventors: Michael H. Cohen, Tracy Demian Wax