Speech Recognition (epo) Patents (Class 704/E15.001)

  • Publication number: 20100198592
    Abstract: This invention maps possibly noisy digital input from any of a number of different hardware or software sources such as keyboards, automatic speech recognition systems, cell phones, smart phones or the web onto an interpretation consisting of an action and one or more physical objects, such as robots, machinery, vehicles, etc. or digital objects such as data files, tables and databases. Tables and lists of (i) homonyms and misrecognitions, (ii) thematic relation patterns, and (iii) lexicons are used to generate alternative forms of the input which are scored to determine the best interpretation of the noisy input. The actions may be executed internally or output to any device which contains a digital component such as, but not limited to, a computer, a robot, a cell phone, a smart phone or the web. This invention may be implemented on sequential and parallel compute engines and systems.
    Type: Application
    Filed: January 27, 2010
    Publication date: August 5, 2010
    Inventor: Jerry Lee Potter
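    A minimal sketch of the alternative-form idea in 20100198592: a toy homonym/misrecognition table expands each input token, and a toy lexicon of plausible word pairs scores the alternatives. Both tables are hypothetical, and the full invention additionally uses thematic relation patterns and maps the winning form onto an action plus objects.

```python
from itertools import product

HOMONYMS = {"two": ["two", "to", "too"], "write": ["write", "right"]}   # hypothetical table
LEXICON_SCORE = {("turn", "right"): 3, ("turn", "write"): 0,            # hypothetical pair scores
                 ("right", "now"): 2, ("write", "now"): 0}

def alternatives(tokens, table=HOMONYMS):
    """Expand each token into its homonym/misrecognition variants and
    enumerate every alternative form of the noisy input."""
    options = [table.get(t, [t]) for t in tokens]
    return [list(combo) for combo in product(*options)]

def best_interpretation(tokens):
    """Score each alternative by how many of its adjacent word pairs are
    plausible according to the lexicon table and keep the best-scoring one."""
    def score(alt):
        return sum(LEXICON_SCORE.get(pair, 0) for pair in zip(alt, alt[1:]))
    return max(alternatives(tokens), key=score)

print(best_interpretation(["turn", "write", "now"]))   # -> ['turn', 'right', 'now']
```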
  • Publication number: 20100191519
    Abstract: A runtime framework and authoring tool are provided for enabling linguistic experts to author text normalization maps and grammar libraries without requiring a high level of technical or programming skill. Authors define or select terminals, map the terminals, and define rules for the mapping. The tool enables an author to validate their work by executing the map in the same way the recognition engine does, ensuring consistency in results from authoring to user operations. The runtime is used by the speech engines and by the tools to provide consistent normalization for supported scenarios.
    Type: Application
    Filed: January 28, 2009
    Publication date: July 29, 2010
    Applicant: Microsoft Corporation
    Inventors: Rachel I. Morton, Nicholas J. Gedge, Heiko W. Rahmel
  • Publication number: 20100191531
    Abstract: A system, method and computer program product for classification of an analog electrical signal using statistical models of training data. A technique is described to quantize the analog electrical signal in a manner which maximizes the compression of the signal while simultaneously minimizing the diminution in the ability to classify the compressed signal. These goals are achieved by utilizing a quantizer designed to minimize the loss in a power of the log-likelihood ratio. A further technique is described to enhance the quantization process by optimally allocating a number of bits for each dimension of the quantized feature vector subject to a maximum number of bits available across all dimensions.
    Type: Application
    Filed: April 3, 2010
    Publication date: July 29, 2010
    Applicant: International Business Machines Corporation
    Inventors: Upendra V. CHAUDHARI, Hsin I. Tseng, Deepak S. Turaga, Olivier Verscheure
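    A minimal greedy bit-allocation sketch for the per-dimension budget described in 20100191531. The patent's criterion is the loss in a power of the log-likelihood ratio; this sketch substitutes a simple variance-based distortion proxy, var * 2**(-2*bits), purely for illustration, and the variances and budget are made up.

```python
import numpy as np

def allocate_bits(variances, total_bits):
    """Assign a total bit budget across feature-vector dimensions one bit at a
    time, always giving the next bit to the dimension whose distortion proxy
    (var * 2**(-2*bits)) drops the most."""
    variances = np.asarray(variances, dtype=float)
    bits = np.zeros(len(variances), dtype=int)
    for _ in range(total_bits):
        current = variances * 2.0 ** (-2 * bits)
        with_one_more = variances * 2.0 ** (-2 * (bits + 1))
        gain = current - with_one_more          # distortion saved by one extra bit
        bits[int(np.argmax(gain))] += 1
    return bits

if __name__ == "__main__":
    print(allocate_bits([4.0, 1.0, 0.25, 0.0625], total_bits=8))   # -> [4 3 1 0]
```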
  • Publication number: 20100191658
    Abstract: A customer service issue prediction engine uses one or more models of issue probability. A method of multi-phase customer issue prediction includes a modeling phase, an application phase, and a learning phase. A telephonic interactive voice response (IVR) system predicts customer issues.
    Type: Application
    Filed: January 25, 2010
    Publication date: July 29, 2010
    Inventors: Pallipuram V. Kannan, Mohit Jain, Ravi Vijayaraghavan
  • Publication number: 20100191528
    Abstract: A speech signal processing apparatus comprising: a control signal output unit configured to receive as an input signal either one of a first speech signal corresponding to a sound uttered by a user and a second speech signal corresponding to a sound output from an eardrum of the user when the user utters a sound, and output a control signal corresponding to a noise level of the input signal; and a speech signal output unit configured to output either one of the first speech signal and the second speech signal according to the control signal.
    Type: Application
    Filed: January 26, 2010
    Publication date: July 29, 2010
    Applicants: SANYO ELECTRIC CO., LTD., SANYO SEMICONDUCTOR CO., LTD.
    Inventors: Kozo Okuda, Kenji Morimoto
  • Publication number: 20100191520
    Abstract: A system and method are provided for recognizing a user's speech input. The method includes the steps of detecting the user's speech input, recognizing the user's speech input by comparing the speech input to a list of entries using language model statistics to determine the most likely entry matching the user's speech input, and detecting navigation information of a trip to a predetermined destination, where the most likely entry is determined by modifying the language model statistics taking into account the navigation information. A system and method are further provided that take into account navigation trip information to determine the most likely entry using language model statistics for recognizing text input.
    Type: Application
    Filed: January 25, 2010
    Publication date: July 29, 2010
    Applicant: Harman Becker Automotive Systems GmbH
    Inventors: Rainer Gruhn, Andreas Marcel Riechert
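    A minimal sketch of biasing language model scores with navigation information as described in 20100191520: entries that lie on the current route to the destination receive a log-domain bonus before the candidate list is re-ranked. The route set, entry names, and boost value are hypothetical.

```python
import math

def rescore_entries(entries, base_logprob, on_route, boost=2.0):
    """Re-rank recognizer candidates by adding a log-domain bonus to the
    language model score of entries found on the current navigation route."""
    scored = []
    for name in entries:
        bonus = math.log(boost) if name in on_route else 0.0
        scored.append((base_logprob[name] + bonus, name))
    return [name for _, name in sorted(scored, reverse=True)]

if __name__ == "__main__":
    base = {"Main Street": -2.0, "Maine Street": -1.8, "Mill Street": -2.5}
    print(rescore_entries(list(base), base, on_route={"Main Street"}))
    # -> ['Main Street', 'Maine Street', 'Mill Street']
```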
  • Publication number: 20100185946
    Abstract: A system, method, and program product for instantiating and executing a bot on an interface system are disclosed. A bot is an agent for the user and includes an animated visual personification. The system includes an interface system including a graphical user interface, a system for instantiating a bot and displaying the bot on the graphical user interface, and a command processing system for causing the bot to execute one of a plurality of actions in response to a user command. The plurality of actions includes at least one local capability and at least one remote capability. The at least one remote capability also includes a system for transferring the bot to a second interface at a remote location.
    Type: Application
    Filed: January 21, 2009
    Publication date: July 22, 2010
    Inventors: Lisa Seacat DeLuca, Lydia Mai DO, Steven M. Miller, Pamela A. Nesbitt
  • Publication number: 20100185445
    Abstract: A machine, system and method for user-guided teaching and modifications of voice commands and actions to be executed by a conversational learning system. The machine includes a system bus for communicating data and control signals received from the conversational learning system to a computer system, a vehicle data and control bus for connecting devices and sensors in the machine, a bridge module for connecting the vehicle data and control bus to the system bus, machine subsystems coupled to the vehicle data and control bus having a respective user interface for receiving a voice command or input signal from a user, a memory coupled to the system bus for storing action command sequences learned for a new voice command and a processing unit coupled to the system bus for automatically executing the action command sequences learned when the new voice command is spoken.
    Type: Application
    Filed: January 21, 2009
    Publication date: July 22, 2010
    Applicant: International Business Machines Corporation
    Inventors: Liam D. Comerford, Mahesh Viswanathan
  • Publication number: 20100185443
    Abstract: Systems and methods for processing speech are provided. A system may include a speech recognition interface and a processor. The processor may convert speech received from a call at the speech recognition interface to at least one word string. The processor may parse each word string of the at least one word string into first objects and first actions. The processor may access a synonym table to determine second objects and second actions based on the first objects and the first actions. The processor may also select a preferred object and a preferred action from the second objects and the second actions.
    Type: Application
    Filed: March 31, 2010
    Publication date: July 22, 2010
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Robert R. Bushey, Benjamin Anthony Knott, John Mills Martin, Sarah Korth
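    A minimal sketch of the synonym-table step in 20100185443, with a hypothetical table mapping first objects and actions onto second (canonical) objects and actions; the preferred pair here is simply the first mapped result.

```python
SYNONYMS = {  # hypothetical synonym table: surface form -> canonical form
    "objects": {"bill": "billing statement", "invoice": "billing statement"},
    "actions": {"check": "inquire", "look at": "inquire", "pay": "pay"},
}

def prefer(first_objects, first_actions, table=SYNONYMS):
    """Map parsed first objects/actions onto second objects/actions via the
    synonym table and select a preferred object and a preferred action."""
    second_objects = [table["objects"].get(obj, obj) for obj in first_objects]
    second_actions = [table["actions"].get(act, act) for act in first_actions]
    return second_objects[0], second_actions[0]

print(prefer(["invoice"], ["look at"]))   # -> ('billing statement', 'inquire')
```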
  • Publication number: 20100185444
    Abstract: An apparatus for providing compound models for speech recognition adaptation includes a processor. The processor may be configured to receive a speech signal corresponding to a particular speaker, select a cluster model including both a speaker independent portion and a speaker dependent portion based at least in part on a characteristic of speech of the particular speaker, and process the speech signal using the selected cluster model. A corresponding method and computer program product are also provided.
    Type: Application
    Filed: January 21, 2009
    Publication date: July 22, 2010
    Inventor: Jesper Olsen
  • Publication number: 20100179813
    Abstract: The present invention relates to a method of providing voice recognition. The method comprises the steps of receiving a packetised voice data of a person to be identified over a packet-switched network, comparing the voice data with a stored voice data of a user and, based on the comparison, providing an indication of the likelihood that the person to be identified is the user, wherein the step of receiving the voice data comprises waiting for sufficient voice data to be received.
    Type: Application
    Filed: January 22, 2008
    Publication date: July 15, 2010
    Inventors: Clive Summerfield, Joel Moss
  • Publication number: 20100179810
    Abstract: A customer for music distributed over the internet may select a composition from a menu of written identifiers (such as the song title and singer or group) and then confirm that the composition is indeed the one desired by listening to a corrupted version of the composition. If the customer has forgotten the song title or the singer or other words that provide the identifier, he or she may hum or otherwise vocalize a few bars of the desired composition, or pick the desired composition out on a simulated keyboard. A music-recognition system then locates candidates for the selected composition and displays identifiers for these candidates to the customer.
    Type: Application
    Filed: March 18, 2010
    Publication date: July 15, 2010
    Inventor: Lawson A. Wood
  • Publication number: 20100172517
    Abstract: A microphone preamplifier circuit is provided. An amplifier comprises a first input end, a second input end, and an output end. A bias voltage is provided by a bias voltage source. A first sensor is coupled to the first input end and the bias voltage source for sensing a first physical parameter and a second physical parameter. A second sensor is coupled to the second input end and the bias voltage source for sensing the first physical parameter, wherein the second sensor is insensitive to the second physical parameter. The output end of the amplifier outputs the difference between the first and second input ends, whereby noise and interference are reduced.
    Type: Application
    Filed: January 8, 2009
    Publication date: July 8, 2010
    Applicant: FORTEMEDIA, INC.
    Inventors: Li-Te Wu, Jui-Te Chiu
  • Publication number: 20100169088
    Abstract: A method of generating demographic information relating to an individual is provided. The method includes monitoring an environment for a voice activity of an individual and detecting the voice activity of the individual. The method further includes analyzing the detected voice activity of the individual and determining, based on the detected voice activity of the individual, a demographic descriptor of the individual.
    Type: Application
    Filed: December 29, 2008
    Publication date: July 1, 2010
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Michael JOHNSTON, Hisao M. CHANG, Harry E. BLANCHARD, Bernard S. RENGER, Linda ROBERTS
  • Publication number: 20100169091
    Abstract: An aspect of the present invention is drawn to an audio data processing device for use by a user to control a system and for use with a microphone, a user demographic profiles database and a content/ad database. The microphone may be operable to detect speech and to generate speech data based on the detected speech. The user demographic profiles database may be capable of having demographic data stored therein. The content/ad database may be capable of having at least one of content data and advertisement data stored therein. The audio data processing device includes a voice recognition portion, a voice analysis portion and a speech to text portion. The voice recognition portion may be operable to process user instructions based on the speech data. The voice analysis portion may be operable to determine characteristics of the user based on the speech data. The speech to text portion may be operable to determine interests of the user.
    Type: Application
    Filed: December 30, 2008
    Publication date: July 1, 2010
    Applicant: MOTOROLA, INC.
    Inventors: Robert A. Zurek, James P. Ashley
  • Publication number: 20100169246
    Abstract: A multimodal system and an input processing method thereof are disclosed. The multimodal system includes an input combination constructing unit holding pre-constructed input combinations and an input combination selection unit for selecting an input combination corresponding to an input signal from a user or a sensor. The system performs learning for selecting an input combination from the pre-constructed input combinations. Through this learning, the system offers suitable input combinations, resulting in high satisfaction with the processed result.
    Type: Application
    Filed: December 2, 2009
    Publication date: July 1, 2010
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jun Won Jang, Tae Sin Ha
  • Publication number: 20100161333
    Abstract: In one embodiment, an adaptive personal name grammar improves speech recognition by limiting or weighting the scope of potential addressable names based upon meta-information relative to the communications patterns, environmental considerations, or sociological/professional hierarchy of a user to increase the likelihood of a positive match.
    Type: Application
    Filed: December 23, 2008
    Publication date: June 24, 2010
    Applicant: CISCO TECHNOLOGY, INC.
    Inventors: MICHAEL T. MAAS, KEVIN L. CHESTNUT
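    A minimal sketch of the weighting idea in 20100161333, assuming contact frequency and recency stand in for the meta-information about communication patterns; the half-life, contact records, and names are illustrative. The resulting weights could be attached to name entries in the recognition grammar.

```python
import time

def name_weights(contacts, now=None, half_life_days=30.0):
    """Weight addressable names by how often and how recently the user has
    contacted them, then normalize so the grammar weights sum to 1."""
    now = time.time() if now is None else now
    raw = {}
    for name, (call_count, last_contact_ts) in contacts.items():
        age_days = (now - last_contact_ts) / 86400.0
        recency = 0.5 ** (age_days / half_life_days)     # exponential decay
        raw[name] = (1 + call_count) * recency
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}

if __name__ == "__main__":
    now = time.time()
    contacts = {"Alice Chen": (42, now - 1 * 86400),     # frequent, recent
                "Alice Chan": (2, now - 90 * 86400)}     # rare, long ago
    print(name_weights(contacts, now))
```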
  • Publication number: 20100161446
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for placing an order for a user. The method includes receiving a search from a user, identifying a product category based on the search, presenting to the user a general ordering screen based on the identified product category, selecting and activating a speech recognition grammar tuned for the identified product category, recognizing a first received user utterance with the activated tuned grammar to identify a vendor who offers items in the identified product category, recognizing a second received user utterance with the activated tuned grammar to identify a specific item from the identified vendor, and placing an order for the specific item with the identified vendor for the user. In one aspect, the method further offers to sell the user additional items ancillary to the specific item.
    Type: Application
    Filed: December 19, 2008
    Publication date: June 24, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Joseph Anderson Alfred, Joseph M. Sommer
  • Publication number: 20100161337
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for recognizing speech. The method includes receiving speech from a user, perceiving at least one speech dialect in the received speech, selecting at least one grammar from a plurality of optimized dialect grammars based on at least one score associated with the perceived speech dialect and the perceived at least one speech dialect, and recognizing the received speech with the selected at least one grammar. Selecting at least one grammar can be further based on a user profile. Multiple grammars can be blended. Predefined parameters can include pronunciation differences, vocabulary, and sentence structure. Optimized dialect grammars can be domain specific. The method can further include recognizing initial received speech with a generic grammar until an optimized dialect grammar is selected. Selecting at least one grammar from a plurality of optimized dialect grammars can be based on a certainty threshold.
    Type: Application
    Filed: December 23, 2008
    Publication date: June 24, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Gregory PULZ, Harry E. BLANCHARD, Steven H. LEWIS, Lan ZHANG
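    A minimal sketch of selecting among optimized dialect grammars as in 20100161337: the top-scoring dialect is used once it clears a certainty threshold, nearby runners-up may be blended in, and recognition otherwise stays with a generic grammar. The threshold, margin, and dialect names are hypothetical.

```python
def select_grammars(dialect_scores, certainty_threshold=0.6, blend_margin=0.15):
    """Return the dialect grammar(s) to activate: the best-scoring dialect if it
    clears the certainty threshold, blended with runners-up within a margin,
    or the generic grammar as a fallback."""
    ranked = sorted(dialect_scores.items(), key=lambda kv: kv[1], reverse=True)
    best_dialect, best_score = ranked[0]
    if best_score < certainty_threshold:
        return ["generic"]          # keep recognizing with the generic grammar
    selected = [best_dialect]
    selected += [d for d, s in ranked[1:] if best_score - s <= blend_margin]
    return selected

print(select_grammars({"southern_us": 0.72, "midwest_us": 0.61, "scottish": 0.20}))
# -> ['southern_us', 'midwest_us']
```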
  • Publication number: 20100161336
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for enhancing speech recognition accuracy. The method includes dividing a system dialog turn into segments based on timing of probable user responses, generating a weighted grammar for each segment, exclusively activating the weighted grammar generated for a current segment of the dialog turn during the current segment of the dialog turn, and recognizing user speech received during the current segment using the activated weighted grammar generated for the current segment. The method can further include assigning probability to the weighted grammar based on historical user responses and activating each weighted grammar is based on the assigned probability. Weighted grammars can be generated based on a user profile. A weighted grammar can be generated for two or more segments.
    Type: Application
    Filed: December 19, 2008
    Publication date: June 24, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventor: Michael Czahor
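    A minimal sketch of exclusively activating one weighted grammar per timed segment of a system dialog turn, as described in 20100161336; the segment boundaries and grammar descriptions are hypothetical.

```python
def active_grammar(segments, elapsed_seconds):
    """Return the single weighted grammar whose segment contains the elapsed
    time within the dialog turn; grammars of other segments stay inactive."""
    for start, end, grammar in segments:
        if start <= elapsed_seconds < end:
            return grammar
    return None

# Toy prompt: "Say your account number, then your zip code."
segments = [
    (0.0, 4.0, {"type": "digits", "length": 10, "weight": 0.9}),   # account number
    (4.0, 8.0, {"type": "digits", "length": 5, "weight": 0.8}),    # zip code
]
print(active_grammar(segments, 5.2))   # -> the zip-code grammar
```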
  • Publication number: 20100161329
    Abstract: A Viterbi decoder includes: an observation vector sequence generator for generating an observation vector sequence by converting an input speech to a sequence of observation vectors; a local optimal state calculator for obtaining a partial state sequence having a maximum similarity up to a current observation vector as an optimal state; an observation probability calculator for obtaining, as a current observation probability, a probability for observing the current observation vector in the optimal state; a buffer for storing therein a specific number of previous observation probabilities; a non-linear filter for calculating a filtered probability by using the previous observation probabilities stored in the buffer and the current observation probability; and a maximum likelihood calculator for calculating a partial maximum likelihood by using the filtered probability.
    Type: Application
    Filed: July 21, 2009
    Publication date: June 24, 2010
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Hoon CHUNG, Jeon Gue PARK, Yunkeun LEE, Ho-Young JUNG, Hyung-Bae JEON, Jeom Ja KANG, Sung Joo LEE, Euisok CHUNG, Ji Hyun WANG, Byung Ok KANG, Ki-young PARK, Jong Jin KIM
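    A toy log-domain Viterbi pass in the spirit of 20100161329: each state's observation log-likelihood is passed through a non-linear filter (a median over a short buffer of recent frames, chosen here as one possible non-linear filter) before it is added to the partial path score. Model sizes and scores are illustrative.

```python
from collections import deque
from statistics import median

def filtered_viterbi(obs_loglik, log_trans, log_init, buffer_len=3):
    """Viterbi decoding in the log domain where each state's observation
    log-likelihood is median-filtered over a buffer of recent frames."""
    n_states = len(log_init)
    buffers = [deque(maxlen=buffer_len) for _ in range(n_states)]
    score = []
    for s in range(n_states):                      # initialization (first frame)
        buffers[s].append(obs_loglik[0][s])
        score.append(log_init[s] + median(buffers[s]))
    back = []
    for t in range(1, len(obs_loglik)):            # recursion over frames
        new_score, ptr = [], []
        for s in range(n_states):
            buffers[s].append(obs_loglik[t][s])
            best_prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + median(buffers[s]))
            ptr.append(best_prev)
        score = new_score
        back.append(ptr)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]                                 # backtrace of the best path
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return list(reversed(path)), max(score)

if __name__ == "__main__":
    obs = [[-1.0, -2.0], [-1.2, -0.3], [-2.5, -0.4]]   # frames x states
    trans = [[-0.2, -1.6], [-1.6, -0.2]]               # log transition scores
    init = [-0.7, -0.7]
    print(filtered_viterbi(obs, trans, init))
```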
  • Publication number: 20100161339
    Abstract: Methods and systems for operating an avionics system with voice command capability are provided. A first voice command is received. A first type of avionics system function is performed in response to the receiving of the first voice command. A second voice command is received. A second type of avionics system function that has a hazard level higher than that of the first type of avionics system function is performed in response to the receiving of the second voice command only after a condition is detected that is indicative of a confirmation of the request to perform the second type of avionics function. The avionics system may also have the capability to test whether or not the voice command feature is functioning properly.
    Type: Application
    Filed: December 19, 2008
    Publication date: June 24, 2010
    Applicant: Honeywell International Inc.
    Inventors: Robert E. De Mers, Steve Grothe, Joseph J. Nutaro
  • Publication number: 20100150321
    Abstract: A system and method for remotely enabled voice activated dialing. Generation of a special dial tone indicating that a user may give the voice identifier is initiated. A voice identifier is received over a network from a wired telephone utilized by a user. Dialing information associated with the voice identifier is determined. One or more receiving parties associated with the voice identifier are dialed. The wired telephone is connected to the one or more receiving parties.
    Type: Application
    Filed: December 15, 2008
    Publication date: June 17, 2010
    Inventors: Robert Harris, Don L. Briscoe, Jasen D. Ott, John Zeigler, Michael Schmidt
  • Publication number: 20100153111
    Abstract: Provided is an input device for a mobile body, the input device allowing a safe input operation when operating equipment such as a car, regardless of whether the mobile body is traveling or stopped.
    Type: Application
    Filed: December 11, 2006
    Publication date: June 17, 2010
    Inventors: Takuya Hirai, Atsushi Yamashita, Tomohiro Terada
  • Publication number: 20100150333
    Abstract: A text/voice system comprises a device configured to receive an incoming voice call intended for a called party, and detect, in response to receiving the voice call, the current status of the called party on a text messaging system, where the current status may include active or inactive. The device is also configured to establish a communication session between the calling party and the called party via the text messaging system, where speech from the calling party is translated to text and delivered to the called party during the communication session, and responsive text from the called party is translated to speech and delivered as speech to the calling party during the communication session.
    Type: Application
    Filed: December 15, 2008
    Publication date: June 17, 2010
    Applicant: Verizon Data Services LLC
    Inventors: Lee N. GOODMAN, Sujin C. Chang
  • Publication number: 20100153105
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for referring to entities. The method includes receiving domain-specific training data of sentences describing a target entity in a context, extracting a speaker history and a visual context from the training data, selecting attributes of the target entity based on at least one of the speaker history, the visual context, and speaker preferences, generating a text expression referring to the target entity based on at least one of the selected attributes, the speaker history, and the context, and outputting the generated text expression. The weighted finite-state automaton can represent partial orderings of word pairs in the domain-specific training data. The weighted finite-state automaton can be speaker specific or speaker independent. The weighted finite-state automaton can include a set of weighted partial orderings of the training data for each possible realization.
    Type: Application
    Filed: December 12, 2008
    Publication date: June 17, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Giuseppe DI FABBRIZIO, Srinivas Bangalore, Amanda Stent
  • Publication number: 20100153117
    Abstract: An interface protocol for the functional manipulation of complex devices such as consumer electronic devices without the necessity of visual feedback via textual or graphic data, wherein the sensor functions change with time rather than placement, so that a user action biases a binary state switch, which is correlated to a timed audible audio data stream, the correlation indicating the desired action selected by the user.
    Type: Application
    Filed: February 22, 2010
    Publication date: June 17, 2010
    Inventor: Paul M. Toupin
  • Publication number: 20100153112
    Abstract: Disclosed are editing methods that are added to speech-based searching to allow users to better understand textual queries submitted to a search engine and to easily edit their speech queries. According to some embodiments, the user begins to speak. The user's speech is translated into a textual query and submitted to a search engine. The results of the search are presented to the user. As the user continues to speak, the user's speech query is refined based on the user's further speech. The refined speech query is converted to a textual query which is again submitted to the search engine. The refined results are presented to the user. This process continues as long as the user continues to refine the query. Some embodiments present the textual query to the user and allow the user to use both speech-based and non-speech-based tools to edit the textual query.
    Type: Application
    Filed: December 16, 2008
    Publication date: June 17, 2010
    Applicant: MOTOROLA, INC.
    Inventors: W. Garland Phillips, Harry M. Bliss, Bashar Jano, Changxue Ma
  • Publication number: 20100153106
    Abstract: A method may include receiving communications associated with a communication session. The communication session may correspond to a telephone conversation, text-based conversation or a multimedia conversation. The method may also include identifying portions of the communication session and storing the identified portions. The method may further include receiving a request to retrieve information associated with the communication session and providing to a display, information associated with the identified portions.
    Type: Application
    Filed: December 15, 2008
    Publication date: June 17, 2010
    Applicant: VERIZON DATA SERVICES LLC
    Inventors: Kristopher T. Frazier, Brian F. Roberts, Heath Stallings
  • Publication number: 20100145680
    Abstract: A speech recognition method using a domain ontology includes: constructing a domain ontology DB; forming a speech recognition grammar using the constructed domain ontology DB; extracting a feature vector from a speech signal; and modeling the speech signal using an acoustic model. The method performs speech recognition by using the acoustic model, the speech recognition dictionary, and the speech recognition grammar on the basis of the feature vector.
    Type: Application
    Filed: September 1, 2009
    Publication date: June 10, 2010
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Seung YUN, Soo Jong Lee, Jeong Se Kim, Il Bin Lee, Jun Park, Sang Kyu Park
  • Publication number: 20100145684
    Abstract: A system and method for processing a narrowband speech signal comprising speech samples in a first range of frequencies. The method comprises: generating from the narrowband speech signal a highband speech signal in a second range of frequencies above the first range of frequencies; determining a pitch of the highband speech signal; using the pitch to generate a pitch-dependent tonality measure from samples of the highband speech signal; and filtering the speech samples using a gain factor derived from the tonality measure and selected to reduce the amplitude of harmonics in the highband speech signal.
    Type: Application
    Filed: June 10, 2009
    Publication date: June 10, 2010
    Inventors: Mattias Nilsson, Soren Vang Anderson
  • Publication number: 20100145709
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for voice authentication. The method includes receiving a speech sample from a user through an Internet browser for authentication as part of a request for a restricted-access resource, performing a comparison of the received speech sample to a previously established speech profile associated with the user, transmitting an authentication to the network client if the comparison is equal to or greater than a certainty threshold, and transmitting a denial to the network client if the comparison is less than the certainty threshold.
    Type: Application
    Filed: December 4, 2008
    Publication date: June 10, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventor: Saurabh Kumar
  • Publication number: 20100145699
    Abstract: Methods and systems for adapting acoustic models are disclosed. A user terminal may determine a phoneme distribution of a text corpus, determine an acoustic model gain distribution of phonemes of an acoustic model before and after adaptation of the acoustic model, determine a desired phoneme distribution based on the phoneme distribution and the acoustic model gain distribution, generate an adaptation sentence based on the desired phoneme distribution, and generate a prompt requesting that the user speak the adaptation sentence.
    Type: Application
    Filed: December 9, 2008
    Publication date: June 10, 2010
    Applicant: NOKIA CORPORATION
    Inventor: Jilei Tian
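    A minimal sketch for 20100145699, under the simplifying assumption that the adaptation sentence is selected from a small candidate pool rather than freely generated: the desired phoneme distribution combines corpus frequency with a per-phoneme adaptation gain, and the candidate with the lowest cross-entropy against that distribution wins. All distributions and phoneme strings are illustrative.

```python
from collections import Counter
import math

def desired_distribution(corpus_phoneme_freq, model_gain):
    """Combine corpus phoneme frequency with per-phoneme acoustic model gain
    (how much each phoneme improves after adaptation) and renormalize."""
    raw = {p: corpus_phoneme_freq.get(p, 0.0) * model_gain.get(p, 1.0)
           for p in set(corpus_phoneme_freq) | set(model_gain)}
    total = sum(raw.values())
    return {p: v / total for p, v in raw.items() if v > 0}

def pick_adaptation_sentence(candidates, desired, eps=1e-6):
    """Pick the candidate whose phoneme makeup has the lowest cross-entropy
    against the desired phoneme distribution."""
    def cost(phonemes):
        counts, n = Counter(phonemes), len(phonemes)
        return -sum(w * math.log(counts[p] / n + eps) for p, w in desired.items())
    return min(candidates, key=lambda sentence: cost(candidates[sentence]))

if __name__ == "__main__":
    freq = {"AA": 0.3, "IY": 0.3, "S": 0.4}
    gain = {"AA": 1.0, "IY": 2.0, "S": 0.5}     # IY benefits most from adaptation
    candidates = {"see the sea": ["S", "IY", "DH", "AH", "S", "IY"],
                  "saw a star": ["S", "AO", "AH", "S", "T", "AA", "R"]}
    print(pick_adaptation_sentence(candidates, desired_distribution(freq, gain)))
```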
  • Publication number: 20100145700
    Abstract: Mobile systems and methods that overcome the deficiencies of prior art speech-based interfaces for telematics applications through the use of a complete speech-based information query, retrieval, presentation, and local or remote command environment. This environment makes significant use of context, prior information, domain knowledge, and user-specific profile data to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created. The invention creates, stores, and uses extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command. The invention may organize domain-specific behavior and information into agents that are distributable or updateable over a wide area network.
    Type: Application
    Filed: February 12, 2010
    Publication date: June 10, 2010
    Applicant: VoiceBox Technologies, Inc.
    Inventors: Robert A. Kennewick, David Locke, Michael R. Kennewick, SR., Michael R. Kennewick, JR., Richard Kennewick, Tom Freeman, Stephen F. Elston
  • Publication number: 20100144435
    Abstract: Game data and speech transfer to and from a wireless portable game terminal is disclosed. The wireless portable game terminal includes a radio transceiver configured to transfer speech and game data through a radio connection to a telecommunication system, a loudspeaker, a microphone, and a processing unit. The processing unit is configured to process the game data, to transfer the game data to and from another game terminal or a game server through the radio connection, to receive captured speech of another user through the radio connection, to output the audio part of the game data and the captured speech of the other user through the loudspeaker, to capture speech of a user with the microphone, and to transfer the captured speech of the user to another game terminal or to a game server through the radio connection.
    Type: Application
    Filed: February 12, 2010
    Publication date: June 10, 2010
    Applicant: Nokia Corporation
    Inventor: Kari NIEMELA
  • Publication number: 20100145693
    Abstract: A method for extracting verbal cues is presented which enhances a speech signal to increase the saliency and recognition of verbal cues including emotive verbal cues. In a further embodiment of the method, the method works in conjunction with a computer that displays a face which gestures and articulates non-verbal cues in accord with speech patterns that are also modified to enhance their verbal cues. The methods work to provide a means for allowing non-fluent speakers to better understand and learn foreign languages.
    Type: Application
    Filed: March 20, 2008
    Publication date: June 10, 2010
    Inventor: Martin L Lenhardt
  • Publication number: 20100138150
    Abstract: Provided is a navigation device technology capable of identifying an intersection or the like from the designation of an incomplete street name. An input of a first keyword and an input of a second keyword are received, and a connection point of a first street whose name includes at least part of the first keyword and a second street whose name includes at least part of the second keyword is identified. This saves the user, who is not always familiar with the geography of the search target area, from the inconvenience of inputting the complete name of the first street, based on which the second street would otherwise be retrieved and selected and the intersection point of the first and second streets identified.
    Type: Application
    Filed: November 25, 2009
    Publication date: June 3, 2010
    Applicant: Clarion Co., Ltd.
    Inventors: Norio WATARAI, Chiharu HIRAI
  • Publication number: 20100138224
    Abstract: Information is exchanged between a user of a communications device and an application during an ongoing conversation between the user using the communications device and a party, without disrupting the conversation. An application associated with the communications device is accessed via the communications device in response to a command and keyword spoken by the user during the communications session. Information is retrieved from the application according to the keyword spoken by the user. When the information is retrieved from the application, the user is prompted in a manner transparent to the party, after which a response is sent to the user.
    Type: Application
    Filed: December 3, 2008
    Publication date: June 3, 2010
    Applicant: AT&T Intellectual Property I, LP.
    Inventor: James Carlton Bedingfield, SR.
  • Publication number: 20100137037
    Abstract: A vehicle communication system facilitates hands-free interaction with a mobile device in a vehicle or elsewhere. Users interact with the system by speaking to it. The system processes text and processes commands. The system supports Bluetooth wireless technology for hands-free use. The system handles telephone calls, email, and SMS text messages. The user can customize the device via a user profile stored on an Internet web server.
    Type: Application
    Filed: February 8, 2010
    Publication date: June 3, 2010
    Inventor: Otman A. Basir
  • Publication number: 20100131262
    Abstract: Embodiments of the invention relate to methods for generating a multilingual acoustic model. A main acoustic model having probability distribution functions and a probabilistic state sequence model including first states is provided to a processor. At least one second acoustic model including probability distribution functions and a probabilistic state sequence model including states is also provided to the processor. The processor replaces each of the probability distribution functions of the at least one second acoustic model by one of the probability distribution functions of the main acoustic model and/or each of the states of the probabilistic state sequence model of the at least one second acoustic model with a state of the probabilistic state sequence model of the main acoustic model, based on a criteria set, to obtain at least one modified second acoustic model. The criteria set may be a distance measurement.
    Type: Application
    Filed: November 25, 2009
    Publication date: May 27, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Rainer Gruhn, Martin Raab, Raymond Brueckner
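    A minimal sketch of the replacement step in 20100131262, assuming each probability distribution function is summarized by a Gaussian mean vector and that the criteria set is a plain Euclidean distance; the real criteria and model structure are richer.

```python
import numpy as np

def map_to_main_model(main_means, second_means):
    """For each distribution of the second-language model, find the closest
    main-model distribution under Euclidean distance, so the merged model
    reuses only main-model distributions."""
    mapping = {}
    for j, mean in enumerate(second_means):
        distances = np.linalg.norm(main_means - mean, axis=1)
        mapping[j] = int(np.argmin(distances))
    return mapping   # second-model pdf index -> main-model pdf index

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    main = rng.normal(size=(5, 3))      # 5 main-model Gaussian means, 3-dim
    second = rng.normal(size=(3, 3))    # 3 second-model Gaussian means
    print(map_to_main_model(main, second))
```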
  • Publication number: 20100131270
    Abstract: The invention relates to a method for determining a characteristic pattern for a speech message that is supplied in the form of a numerically encoded audio signal generated by means of a sampling process.
    Type: Application
    Filed: July 13, 2007
    Publication date: May 27, 2010
    Applicant: Nokia Siemens Networks GmbH & Co.
    Inventor: Joachim Charzinski
  • Publication number: 20100131276
    Abstract: A device (2) for changing the pitch of an audio signal (r), such as a speech signal, comprises a sinusoidal analysis unit (21) for determining sinusoidal parameters of the audio signal (r), a parameter production unit (22) for predicting the phase of a sinusoidal component, and a sinusoidal synthesis unit (23) for synthesizing the parameters to produce a reconstructed signal (r′). The parameter production unit (22) receives, for each time segment of the audio signal, the phase of the previous time segment to predict the phase of the current time segment.
    Type: Application
    Filed: July 6, 2006
    Publication date: May 27, 2010
    Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.
    Inventors: Albertus Cornelis Den Brinker, Robert Johannes Sluijter
  • Publication number: 20100131277
    Abstract: There is provided a device for performing interaction between a user and a machine. The device includes a plurality of domains corresponding to a plurality of stages in the interaction. Each of the domains has voice comprehension means which understands the content of the user's voice.
    Type: Application
    Filed: July 26, 2006
    Publication date: May 27, 2010
    Applicant: Honda Motor Co., Ltd.
    Inventor: Mikio Nakano
  • Publication number: 20100131274
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for dialog modeling. The method includes receiving spoken dialogs annotated to indicate dialog acts and task/subtask information, parsing the spoken dialogs with a hierarchical, parse-based dialog model which operates incrementally from left to right and which only analyzes a preceding dialog context to generate parsed spoken dialogs, and constructing a functional task structure of the parsed spoken dialogs. The method can further either interpret user utterances with the functional task structure of the parsed spoken dialogs or plan system responses to user utterances with the functional task structure of the parsed spoken dialogs. The parse-based dialog model can be a shift-reduce model, a start-complete model, or a connection path model.
    Type: Application
    Filed: November 26, 2008
    Publication date: May 27, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Amanda Stent, Srinivas Bangalore
  • Publication number: 20100131264
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.
    Type: Application
    Filed: November 21, 2008
    Publication date: May 27, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie
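    A minimal sketch of choosing the best hypothesis for a missing segment in 20100131264 when, as the abstract notes, the acoustic scores can be identical and the decision rests on the surrounding context and a language model. The bigram table, vocabulary, and backoff value are hypothetical.

```python
import math

BIGRAM_LOGPROB = {   # hypothetical bigram scores; a real system would use a full LM
    ("flight", "to"): -0.5, ("to", "boston"): -1.0, ("to", "austin"): -1.4,
    ("boston", "tomorrow"): -1.1, ("austin", "tomorrow"): -1.2,
}

def lm_score(words, backoff=math.log(1e-4)):
    """Sum bigram log-probabilities over the word sequence, backing off to a
    small constant for unseen pairs."""
    return sum(BIGRAM_LOGPROB.get(pair, backoff) for pair in zip(words, words[1:]))

def best_fill(left_context, right_context, hypotheses):
    """Insert each hypothesis for the missing segment into the utterance and
    keep the one with the highest language model score."""
    return max(hypotheses, key=lambda h: lm_score(left_context + [h] + right_context))

print(best_fill(["flight", "to"], ["tomorrow"], ["boston", "austin"]))   # -> boston
```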
  • Publication number: 20100131275
    Abstract: Multimodal interaction with grammar-based speech applications may be facilitated with a device by presenting permissible phrases that are in-grammar based on acceptable terms that are in-vocabulary and that have been recognized from a spoken utterance. In an example embodiment, a spoken utterance having two or more terms is received. The two or more terms include one or more acceptable terms. An index is searched using the acceptable terms as query terms. From the searching of the index, permissible phrase(s) are produced that include the acceptable terms. The index is a searchable data structure that represents multiple possible grammar paths that are ascertainable based on acceptable values for each term position of a grammar-based speech application. The permissible phrase(s) are presented to a user as option(s) that may be selected to conduct multimodal interaction with the device.
    Type: Application
    Filed: November 26, 2008
    Publication date: May 27, 2010
    Applicant: MICROSOFT CORPORATION
    Inventor: Timothy Seung Yoon Paek
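    A minimal sketch of an index over possible grammar paths as in 20100131275: every path of a toy slot grammar is expanded into a phrase and indexed by its terms, and the in-vocabulary (acceptable) terms recognized so far retrieve the permissible phrases that contain them all. The command grammar is hypothetical.

```python
from itertools import product

def build_index(slot_values):
    """Expand every grammar path (one value per slot position) into a phrase
    and index the phrases by the terms they contain."""
    index = {}
    for combo in product(*slot_values):
        phrase = " ".join(combo)
        for term in combo:
            index.setdefault(term.lower(), set()).add(phrase)
    return index

def permissible_phrases(index, acceptable_terms):
    """Return the phrases containing every acceptable term recognized so far."""
    hits = [index.get(term.lower(), set()) for term in acceptable_terms]
    return set.intersection(*hits) if hits else set()

# Toy command grammar: <action> <object> <place>
grammar = [["play", "pause"], ["music", "video"], ["upstairs", "downstairs"]]
index = build_index(grammar)
print(sorted(permissible_phrases(index, ["play", "upstairs"])))
# -> ['play music upstairs', 'play video upstairs']
```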
  • Publication number: 20100125460
    Abstract: A voice assistant system is disclosed which directs the voice prompts delivered to a first user of a voice assistant to also be communicated wirelessly to the voice assistant of a second user so that the second user can hear the voice prompts as delivered to the first user.
    Type: Application
    Filed: November 12, 2009
    Publication date: May 20, 2010
    Inventors: Mark B. Mellott, Richard Anthony Bates, Michael Laughery, James R. Logan
  • Publication number: 20100125457
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by (1) identifying word and phone alignments and corresponding likelihood scores, and (2) discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function.
    Type: Application
    Filed: November 19, 2008
    Publication date: May 20, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
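    A minimal sketch for 20100125457 of per-word pronunciation weights normalized to sum to 1, with a simplified discriminative-style update (a multiplicative boost and penalty stands in for the patent's error-minimizing adaptation over word and phone alignments). The pronunciation strings and learning rate are illustrative.

```python
def normalize(weights):
    """Rescale a word's pronunciation weights so they sum to 1."""
    total = sum(weights.values())
    return {pron: w / total for pron, w in weights.items()}

def update_weights(weights, winning_pron, competing_pron, eta=0.1):
    """Boost the pronunciation that aligned with the correct word, penalize the
    competing one, and renormalize at the word (unit of speech) level."""
    new = dict(weights)
    new[winning_pron] *= (1 + eta)
    new[competing_pron] = max(new[competing_pron] * (1 - eta), 1e-6)
    return normalize(new)

# Two candidate pronunciations of "tomato" (illustrative ARPAbet-style strings).
w = normalize({"T AH M EY T OW": 1.0, "T AH M AA T OW": 1.0})
w = update_weights(w, "T AH M EY T OW", "T AH M AA T OW")
print(w)   # weights still sum to 1, shifted toward the winning pronunciation
```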
  • Publication number: 20100125456
    Abstract: Embodiments of a dialog system that utilizes contextual information to perform recognition of proper names are described. Unlike present name recognition methods for large name lists, which generally focus strictly on the static aspect of the names, embodiments of the present system take into account the temporal, recency, and context effects when names are used, and formulate new questions to further constrain the search space or grammar for recognition of past and current utterances.
    Type: Application
    Filed: November 19, 2008
    Publication date: May 20, 2010
    Applicant: ROBERT BOSCH GMBH
    Inventors: Fuliang Weng, Zhongnan Shen, Zhe Feng
  • Publication number: 20100121641
    Abstract: An external voice identification system and its identification process are disclosed. The external voice identification system of a multimedia electronic device is activated by identifying and analyzing an input voice message. The multimedia electronic device can be an iPod player having a storage module storing a plurality of voice files and a transmission interface for electrically connecting to a voice identification system. The voice identification system is electrically connected to the transmission interface and has a built-in identification module whose identification unit can identify and analyze the voice signals. An adapting interface is connected to a voice input unit to receive the external voice signal, so that the identification module can identify and analyze the external voice signals to activate the multimedia electronic device to play the voice signals (songs), and to select, adjust, and switch the playing content according to the input external voice signal.
    Type: Application
    Filed: November 11, 2008
    Publication date: May 13, 2010
    Applicant: AIBELIVE CO., LTD
    Inventors: TSUNG-HAN TSAI, Chen-Wei Su, Chun-Ping Fang, Min-Ching Wu