Speech Recognition (epo) Patents (Class 704/E15.001)

  • Publication number: 20100198592
    Abstract: This invention maps possibly noisy digital input from any of a number of different hardware or software sources such as keyboards, automatic speech recognition systems, cell phones, smart phones or the web onto an interpretation consisting of an action and one or more physical objects, such as robots, machinery, vehicles, etc. or digital objects such as data files, tables and databases. Tables and lists of (i) homonyms and misrecognitions, (ii) thematic relation patterns, and (iii) lexicons are used to generate alternative forms of the input which are scored to determine the best interpretation of the noisy input. The actions may be executed internally or output to any device which contains a digital component such as, but not limited to, a computer, a robot, a cell phone, a smart phone or the web. This invention may be implemented on sequential and parallel compute engines and systems.
    Type: Application
    Filed: January 27, 2010
    Publication date: August 5, 2010
    Inventor: Jerry Lee Potter
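    A minimal sketch of the alternative-form idea in 20100198592: a toy homonym/misrecognition table expands each input token, and a toy lexicon of plausible word pairs scores the alternatives. Both tables are hypothetical, and the full invention additionally uses thematic relation patterns and maps the winning form onto an action plus objects.

```python
from itertools import product

HOMONYMS = {"two": ["two", "to", "too"], "write": ["write", "right"]}   # hypothetical table
LEXICON_SCORE = {("turn", "right"): 3, ("turn", "write"): 0,            # hypothetical pair scores
                 ("right", "now"): 2, ("write", "now"): 0}

def alternatives(tokens, table=HOMONYMS):
    """Expand each token into its homonym/misrecognition variants and
    enumerate every alternative form of the noisy input."""
    options = [table.get(t, [t]) for t in tokens]
    return [list(combo) for combo in product(*options)]

def best_interpretation(tokens):
    """Score each alternative by how many of its adjacent word pairs are
    plausible according to the lexicon table and keep the best-scoring one."""
    def score(alt):
        return sum(LEXICON_SCORE.get(pair, 0) for pair in zip(alt, alt[1:]))
    return max(alternatives(tokens), key=score)

print(best_interpretation(["turn", "write", "now"]))   # -> ['turn', 'right', 'now']
```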
  • Publication number: 20100191519
    Abstract: A runtime framework and authoring tool are provided for enabling linguistic experts to author text normalization maps and grammar libraries without requiring a high level of technical or programming skill. Authors define or select terminals, map the terminals, and define rules for the mapping. The tool enables an author to validate their work by executing the map in the same way the recognition engine does, ensuring consistency in results from authoring to user operations. The runtime is used by the speech engines and by the tools to provide consistent normalization for supported scenarios.
    Type: Application
    Filed: January 28, 2009
    Publication date: July 29, 2010
    Applicant: Microsoft Corporation
    Inventors: Rachel I. Morton, Nicholas J. Gedge, Heiko W. Rahmel
  • Publication number: 20100191531
    Abstract: A system, method and computer program product for classification of an analog electrical signal using statistical models of training data. A technique is described to quantize the analog electrical signal in a manner which maximizes the compression of the signal while simultaneously minimizing the diminution in the ability to classify the compressed signal. These goals are achieved by utilizing a quantizer designed to minimize the loss in a power of the log-likelihood ratio. A further technique is described to enhance the quantization process by optimally allocating a number of bits for each dimension of the quantized feature vector subject to a maximum number of bits available across all dimensions.
    Type: Application
    Filed: April 3, 2010
    Publication date: July 29, 2010
    Applicant: International Business Machines Corporation
    Inventors: Upendra V. CHAUDHARI, Hsin I. Tseng, Deepak S. Turaga, Olivier Verscheure
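    A minimal greedy bit-allocation sketch for the per-dimension budget described in 20100191531. The patent's criterion is the loss in a power of the log-likelihood ratio; this sketch substitutes a simple variance-based distortion proxy, var * 2**(-2*bits), purely for illustration, and the variances and budget are made up.

```python
import numpy as np

def allocate_bits(variances, total_bits):
    """Assign a total bit budget across feature-vector dimensions one bit at a
    time, always giving the next bit to the dimension whose distortion proxy
    (var * 2**(-2*bits)) drops the most."""
    variances = np.asarray(variances, dtype=float)
    bits = np.zeros(len(variances), dtype=int)
    for _ in range(total_bits):
        current = variances * 2.0 ** (-2 * bits)
        with_one_more = variances * 2.0 ** (-2 * (bits + 1))
        gain = current - with_one_more          # distortion saved by one extra bit
        bits[int(np.argmax(gain))] += 1
    return bits

if __name__ == "__main__":
    print(allocate_bits([4.0, 1.0, 0.25, 0.0625], total_bits=8))   # -> [4 3 1 0]
```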
  • Publication number: 20100191658
    Abstract: A customer service issue prediction engine uses one or more models of issue probability. A method of multi-phase customer issue prediction includes a modeling phase, an application phase, and a learning phase. A telephonic interactive voice response (IVR) system predicts customer issues.
    Type: Application
    Filed: January 25, 2010
    Publication date: July 29, 2010
    Inventors: Pallipuram V. Kannan, Mohit Jain, Ravi Vijayaraghavan
  • Publication number: 20100191528
    Abstract: A speech signal processing apparatus comprising: a control signal output unit configured to receive as an input signal either one of a first speech signal corresponding to a sound uttered by a user and a second speech signal corresponding to a sound output from an eardrum of the user when the user utters a sound, and output a control signal corresponding to a noise level of the input signal; and a speech signal output unit configured to output either one of the first speech signal and the second speech signal according to the control signal.
    Type: Application
    Filed: January 26, 2010
    Publication date: July 29, 2010
    Applicants: SANYO ELECTRIC CO., LTD., SANYO SEMICONDUCTOR CO., LTD.
    Inventors: Kozo Okuda, Kenji Morimoto
  • Publication number: 20100191520
    Abstract: A system and method are provided for recognizing a user's speech input. The method includes the steps of detecting the user's speech input, recognizing the user's speech input by comparing the speech input to a list of entries using language model statistics to determine the most likely entry matching the user's speech input, and detecting navigation information of a trip to a predetermined destination, where the most likely entry is determined by modifying the language model statistics taking into account the navigation information. A system and method are further provided that take into account navigation trip information to determine the most likely entry using language model statistics for recognizing text input.
    Type: Application
    Filed: January 25, 2010
    Publication date: July 29, 2010
    Applicant: Harman Becker Automotive Systems GmbH
    Inventors: Rainer Gruhn, Andreas Marcel Riechert
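    A minimal sketch of biasing language model scores with navigation information as described in 20100191520: entries that lie on the current route to the destination receive a log-domain bonus before the candidate list is re-ranked. The route set, entry names, and boost value are hypothetical.

```python
import math

def rescore_entries(entries, base_logprob, on_route, boost=2.0):
    """Re-rank recognizer candidates by adding a log-domain bonus to the
    language model score of entries found on the current navigation route."""
    scored = []
    for name in entries:
        bonus = math.log(boost) if name in on_route else 0.0
        scored.append((base_logprob[name] + bonus, name))
    return [name for _, name in sorted(scored, reverse=True)]

if __name__ == "__main__":
    base = {"Main Street": -2.0, "Maine Street": -1.8, "Mill Street": -2.5}
    print(rescore_entries(list(base), base, on_route={"Main Street"}))
    # -> ['Main Street', 'Maine Street', 'Mill Street']
```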
  • Publication number: 20100185946
    Abstract: A system, method, and program product for instantiating and executing a bot on an interface system are disclosed. A bot is an agent for the user and includes an animated visual personification. The system includes an interface system including a graphical user interface, a system for instantiating a bot and displaying the bot on the graphical user interface, and a command processing system for causing the bot to execute one of a plurality of actions in response to a user command. The plurality of actions includes at least one local capability and at least one remote capability. The at least one remote capability also includes a system for transferring the bot to a second interface at a remote location.
    Type: Application
    Filed: January 21, 2009
    Publication date: July 22, 2010
    Inventors: Lisa Seacat DeLuca, Lydia Mai DO, Steven M. Miller, Pamela A. Nesbitt
  • Publication number: 20100185445
    Abstract: A machine, system and method for user-guided teaching and modifications of voice commands and actions to be executed by a conversational learning system. The machine includes a system bus for communicating data and control signals received from the conversational learning system to a computer system, a vehicle data and control bus for connecting devices and sensors in the machine, a bridge module for connecting the vehicle data and control bus to the system bus, machine subsystems coupled to the vehicle data and control bus having a respective user interface for receiving a voice command or input signal from a user, a memory coupled to the system bus for storing action command sequences learned for a new voice command and a processing unit coupled to the system bus for automatically executing the action command sequences learned when the new voice command is spoken.
    Type: Application
    Filed: January 21, 2009
    Publication date: July 22, 2010
    Applicant: International Business Machines Corporation
    Inventors: Liam D. Comerford, Mahesh Viswanathan
  • Publication number: 20100185443
    Abstract: Systems and methods for processing speech are provided. A system may include a speech recognition interface and a processor. The processor may convert speech received from a call at the speech recognition interface to at least one word string. The processor may parse each word string of the at least one word string into first objects and first actions. The processor may access a synonym table to determine second objects and second actions based on the first objects and the first actions. The processor may also select a preferred object and a preferred action from the second objects and the second actions.
    Type: Application
    Filed: March 31, 2010
    Publication date: July 22, 2010
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Robert R. Bushey, Benjamin Anthony Knott, John Mills Martin, Sarah Korth
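    A minimal sketch of the synonym-table step in 20100185443, with a hypothetical table mapping first objects and actions onto second (canonical) objects and actions; the preferred pair here is simply the first mapped result.

```python
SYNONYMS = {  # hypothetical synonym table: surface form -> canonical form
    "objects": {"bill": "billing statement", "invoice": "billing statement"},
    "actions": {"check": "inquire", "look at": "inquire", "pay": "pay"},
}

def prefer(first_objects, first_actions, table=SYNONYMS):
    """Map parsed first objects/actions onto second objects/actions via the
    synonym table and select a preferred object and a preferred action."""
    second_objects = [table["objects"].get(obj, obj) for obj in first_objects]
    second_actions = [table["actions"].get(act, act) for act in first_actions]
    return second_objects[0], second_actions[0]

print(prefer(["invoice"], ["look at"]))   # -> ('billing statement', 'inquire')
```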
  • Publication number: 20100185444
    Abstract: An apparatus for providing compound models for speech recognition adaptation includes a processor. The processor may be configured to receive a speech signal corresponding to a particular speaker, select a cluster model including both a speaker independent portion and a speaker dependent portion based at least in part on a characteristic of speech of the particular speaker, and process the speech signal using the selected cluster model. A corresponding method and computer program product are also provided.
    Type: Application
    Filed: January 21, 2009
    Publication date: July 22, 2010
    Inventor: Jesper Olsen
  • Publication number: 20100179813
    Abstract: The present invention relates to a method of providing voice recognition. The method comprises the steps of receiving a packetised voice data of a person to be identified over a packet-switched network, comparing the voice data with a stored voice data of a user and, based on the comparison, providing an indication of the likelihood that the person to be identified is the user, wherein the step of receiving the voice data comprises waiting for sufficient voice data to be received.
    Type: Application
    Filed: January 22, 2008
    Publication date: July 15, 2010
    Inventors: Clive Summerfield, Joel Moss
  • Publication number: 20100179810
    Abstract: A customer for music distributed over the internet may select a composition from a menu of written identifiers (such as the song title and singer or group) and then confirm that the composition is indeed the one desired by listening to a corrupted version of the composition. If the customer has forgotten the song title or the singer or other words that provide the identifier, he or she may hum or otherwise vocalize a few bars of the desired composition, or pick the desired composition out on a simulated keyboard. A music-recognition system then locates candidates for the selected composition and displays identifiers for these candidates to the customer.
    Type: Application
    Filed: March 18, 2010
    Publication date: July 15, 2010
    Inventor: Lawson A. Wood
  • Publication number: 20100172517
    Abstract: A microphone preamplifier circuit is provided. An amplifier comprises a first input end, a second input end, and an output end. A bias voltage is provided by a bias voltage source. A first sensor is coupled to the first input end and the bias voltage source for sensing a first physical parameter and a second physical parameter. A second sensor is coupled to the second input end and the bias voltage source for sensing the first physical parameter, wherein the second sensor is insensitive to the second physical parameter. The output end of the amplifier outputs the difference between the first and second input ends, whereby noise and interference are reduced.
    Type: Application
    Filed: January 8, 2009
    Publication date: July 8, 2010
    Applicant: FORTEMEDIA, INC.
    Inventors: Li-Te Wu, Jui-Te Chiu
  • Publication number: 20100169088
    Abstract: A method of generating demographic information relating to an individual is provided. The method includes monitoring an environment for a voice activity of an individual and detecting the voice activity of the individual. The method further includes analyzing the detected voice activity of the individual and determining, based on the detected voice activity of the individual, a demographic descriptor of the individual.
    Type: Application
    Filed: December 29, 2008
    Publication date: July 1, 2010
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Michael JOHNSTON, Hisao M. CHANG, Harry E. BLANCHARD, Bernard S. RENGER, Linda ROBERTS
  • Publication number: 20100169091
    Abstract: An aspect of the present invention is drawn to an audio data processing device for use by a user to control a system and for use with a microphone, a user demographic profiles database and a content/ad database. The microphone may be operable to detect speech and to generate speech data based on the detected speech. The user demographic profiles database may be capable of having demographic data stored therein. The content/ad database may be capable of having at least one of content data and advertisement data stored therein. The audio data processing device includes a voice recognition portion, a voice analysis portion and a speech to text portion. The voice recognition portion may be operable to process user instructions based on the speech data. The voice analysis portion may be operable to determine characteristics of the user based on the speech data. The speech to text portion may be operable to determine interests of the user.
    Type: Application
    Filed: December 30, 2008
    Publication date: July 1, 2010
    Applicant: MOTOROLA, INC.
    Inventors: Robert A. Zurek, James P. Ashley
  • Publication number: 20100169246
    Abstract: A multimodal system and an input processing method thereof are disclosed. The multimodal system includes an input combination constructing unit holding pre-constructed input combinations and an input combination selection unit for selecting an input combination corresponding to an input signal from a user or a sensor. The system performs learning for selecting an input combination from the pre-constructed input combinations. Through this learning, the system offers suitable input combinations, resulting in high satisfaction with the processed result.
    Type: Application
    Filed: December 2, 2009
    Publication date: July 1, 2010
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jun Won Jang, Tae Sin Ha
  • Publication number: 20100161333
    Abstract: In one embodiment, an adaptive personal name grammar improves speech recognition by limiting or weighting the scope of potential addressable names based upon meta-information relative to the communications patterns, environmental considerations, or sociological/professional hierarchy of a user to increase the likelihood of a positive match.
    Type: Application
    Filed: December 23, 2008
    Publication date: June 24, 2010
    Applicant: CISCO TECHNOLOGY, INC.
    Inventors: MICHAEL T. MAAS, KEVIN L. CHESTNUT
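    A minimal sketch of the weighting idea in 20100161333, assuming contact frequency and recency stand in for the meta-information about communication patterns; the half-life, contact records, and names are illustrative. The resulting weights could be attached to name entries in the recognition grammar.

```python
import time

def name_weights(contacts, now=None, half_life_days=30.0):
    """Weight addressable names by how often and how recently the user has
    contacted them, then normalize so the grammar weights sum to 1."""
    now = time.time() if now is None else now
    raw = {}
    for name, (call_count, last_contact_ts) in contacts.items():
        age_days = (now - last_contact_ts) / 86400.0
        recency = 0.5 ** (age_days / half_life_days)     # exponential decay
        raw[name] = (1 + call_count) * recency
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}

if __name__ == "__main__":
    now = time.time()
    contacts = {"Alice Chen": (42, now - 1 * 86400),     # frequent, recent
                "Alice Chan": (2, now - 90 * 86400)}     # rare, long ago
    print(name_weights(contacts, now))
```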
  • Publication number: 20100161446
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for placing an order for a user. The method includes receiving a search from a user, identifying a product category based on the search, presenting to the user a general ordering screen based on the identified product category, selecting and activating a speech recognition grammar tuned for the identified product category, recognizing a first received user utterance with the activated tuned grammar to identify a vendor who offers items in the identified product category, recognizing a second received user utterance with the activated tuned grammar to identify a specific item from the identified vendor, and placing an order for the specific item with the identified vendor for the user. In one aspect, the method further offers to sell the user additional items ancillary to the specific item.
    Type: Application
    Filed: December 19, 2008
    Publication date: June 24, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Joseph Anderson Alfred, Joseph M. Sommer
  • Publication number: 20100161337
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for recognizing speech. The method includes receiving speech from a user, perceiving at least one speech dialect in the received speech, selecting at least one grammar from a plurality of optimized dialect grammars based on at least one score associated with the perceived speech dialect and the perceived at least one speech dialect, and recognizing the received speech with the selected at least one grammar. Selecting at least one grammar can be further based on a user profile. Multiple grammars can be blended. Predefined parameters can include pronunciation differences, vocabulary, and sentence structure. Optimized dialect grammars can be domain specific. The method can further include recognizing initial received speech with a generic grammar until an optimized dialect grammar is selected. Selecting at least one grammar from a plurality of optimized dialect grammars can be based on a certainty threshold.
    Type: Application
    Filed: December 23, 2008
    Publication date: June 24, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Gregory PULZ, Harry E. BLANCHARD, Steven H. LEWIS, Lan ZHANG
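    A minimal sketch of selecting among optimized dialect grammars as in 20100161337: the top-scoring dialect is used once it clears a certainty threshold, nearby runners-up may be blended in, and recognition otherwise stays with a generic grammar. The threshold, margin, and dialect names are hypothetical.

```python
def select_grammars(dialect_scores, certainty_threshold=0.6, blend_margin=0.15):
    """Return the dialect grammar(s) to activate: the best-scoring dialect if it
    clears the certainty threshold, blended with runners-up within a margin,
    or the generic grammar as a fallback."""
    ranked = sorted(dialect_scores.items(), key=lambda kv: kv[1], reverse=True)
    best_dialect, best_score = ranked[0]
    if best_score < certainty_threshold:
        return ["generic"]          # keep recognizing with the generic grammar
    selected = [best_dialect]
    selected += [d for d, s in ranked[1:] if best_score - s <= blend_margin]
    return selected

print(select_grammars({"southern_us": 0.72, "midwest_us": 0.61, "scottish": 0.20}))
# -> ['southern_us', 'midwest_us']
```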
  • Publication number: 20100161336
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for enhancing speech recognition accuracy. The method includes dividing a system dialog turn into segments based on timing of probable user responses, generating a weighted grammar for each segment, exclusively activating the weighted grammar generated for a current segment of the dialog turn during the current segment of the dialog turn, and recognizing user speech received during the current segment using the activated weighted grammar generated for the current segment. The method can further include assigning probability to the weighted grammar based on historical user responses and activating each weighted grammar is based on the assigned probability. Weighted grammars can be generated based on a user profile. A weighted grammar can be generated for two or more segments.
    Type: Application
    Filed: December 19, 2008
    Publication date: June 24, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventor: Michael Czahor
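    A minimal sketch of exclusively activating one weighted grammar per timed segment of a system dialog turn, as described in 20100161336; the segment boundaries and grammar descriptions are hypothetical.

```python
def active_grammar(segments, elapsed_seconds):
    """Return the single weighted grammar whose segment contains the elapsed
    time within the dialog turn; grammars of other segments stay inactive."""
    for start, end, grammar in segments:
        if start <= elapsed_seconds < end:
            return grammar
    return None

# Toy prompt: "Say your account number, then your zip code."
segments = [
    (0.0, 4.0, {"type": "digits", "length": 10, "weight": 0.9}),   # account number
    (4.0, 8.0, {"type": "digits", "length": 5, "weight": 0.8}),    # zip code
]
print(active_grammar(segments, 5.2))   # -> the zip-code grammar
```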
  • Publication number: 20100161329
    Abstract: A Viterbi decoder includes: an observation vector sequence generator for generating an observation vector sequence by converting an input speech to a sequence of observation vectors; a local optimal state calculator for obtaining a partial state sequence having a maximum similarity up to a current observation vector as an optimal state; an observation probability calculator for obtaining, as a current observation probability, a probability for observing the current observation vector in the optimal state; a buffer for storing therein a specific number of previous observation probabilities; a non-linear filter for calculating a filtered probability by using the previous observation probabilities stored in the buffer and the current observation probability; and a maximum likelihood calculator for calculating a partial maximum likelihood by using the filtered probability.
    Type: Application
    Filed: July 21, 2009
    Publication date: June 24, 2010
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Hoon CHUNG, Jeon Gue PARK, Yunkeun LEE, Ho-Young JUNG, Hyung-Bae JEON, Jeom Ja KANG, Sung Joo LEE, Euisok CHUNG, Ji Hyun WANG, Byung Ok KANG, Ki-young PARK, Jong Jin KIM
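    A toy log-domain Viterbi pass in the spirit of 20100161329: each state's observation log-likelihood is passed through a non-linear filter (a median over a short buffer of recent frames, chosen here as one possible non-linear filter) before it is added to the partial path score. Model sizes and scores are illustrative.

```python
from collections import deque
from statistics import median

def filtered_viterbi(obs_loglik, log_trans, log_init, buffer_len=3):
    """Viterbi decoding in the log domain where each state's observation
    log-likelihood is median-filtered over a buffer of recent frames."""
    n_states = len(log_init)
    buffers = [deque(maxlen=buffer_len) for _ in range(n_states)]
    score = []
    for s in range(n_states):                      # initialization (first frame)
        buffers[s].append(obs_loglik[0][s])
        score.append(log_init[s] + median(buffers[s]))
    back = []
    for t in range(1, len(obs_loglik)):            # recursion over frames
        new_score, ptr = [], []
        for s in range(n_states):
            buffers[s].append(obs_loglik[t][s])
            best_prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + median(buffers[s]))
            ptr.append(best_prev)
        score = new_score
        back.append(ptr)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]                                 # backtrace of the best path
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return list(reversed(path)), max(score)

if __name__ == "__main__":
    obs = [[-1.0, -2.0], [-1.2, -0.3], [-2.5, -0.4]]   # frames x states
    trans = [[-0.2, -1.6], [-1.6, -0.2]]               # log transition scores
    init = [-0.7, -0.7]
    print(filtered_viterbi(obs, trans, init))
```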
  • Publication number: 20100161339
    Abstract: Methods and systems for operating an avionics system with voice command capability are provided. A first voice command is received. A first type of avionics system function is performed in response to the receiving of the first voice command. A second voice command is received. A second type of avionics system function that has a hazard level higher than that of the first type of avionics system function is performed in response to the receiving of the second voice command only after a condition is detected that is indicative of a confirmation of the request to perform the second type of avionics function. The avionics system may also have the capability to test whether or not the voice command feature is functioning properly.
    Type: Application
    Filed: December 19, 2008
    Publication date: June 24, 2010
    Applicant: Honeywell International Inc.
    Inventors: Robert E. De Mers, Steve Grothe, Joseph J. Nutaro
  • Publication number: 20100150321
    Abstract: A system and method for remotely enabled voice activated dialing. Generation of a special dial tone indicating that a user may give the voice identifier is initiated. A voice identifier is received over a network from a wired telephone utilized by a user. Dialing information associated with the voice identifier is determined. One or more receiving parties associated with the voice identifier are dialed. The wired telephone is connected to the one or more receiving parties.
    Type: Application
    Filed: December 15, 2008
    Publication date: June 17, 2010
    Inventors: Robert Harris, Don L. Briscoe, Jasen D. Ott, John Zeigler, Michael Schmidt
  • Publication number: 20100153111
    Abstract: Provided is an input device for a mobile body, the input device allowing a safe input operation when operating equipment such as a car, regardless of whether the mobile body is traveling or stopped.
    Type: Application
    Filed: December 11, 2006
    Publication date: June 17, 2010
    Inventors: Takuya Hirai, Atsushi Yamashita, Tomohiro Terada
  • Publication number: 20100150333
    Abstract: A text/voice system comprises a device configured to receive an incoming voice call intended for a called party, and detect, in response to receiving the voice call, the current status of the called party on a text messaging system, where the current status may include active or inactive. The device is also configured to establish a communication session between the calling party and the called party via the text messaging system, where speech from the calling party is translated to text and delivered to the called party during the communication session, and responsive text from the called party is translated to speech and delivered as speech to the calling party during the communication session.
    Type: Application
    Filed: December 15, 2008
    Publication date: June 17, 2010
    Applicant: Verizon Data Services LLC
    Inventors: Lee N. GOODMAN, Sujin C. Chang
  • Publication number: 20100153105
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for referring to entities. The method includes receiving domain-specific training data of sentences describing a target entity in a context, extracting a speaker history and a visual context from the training data, selecting attributes of the target entity based on at least one of the speaker history, the visual context, and speaker preferences, generating a text expression referring to the target entity based on at least one of the selected attributes, the speaker history, and the context, and outputting the generated text expression. The weighted finite-state automaton can represent partial orderings of word pairs in the domain-specific training data. The weighted finite-state automaton can be speaker specific or speaker independent. The weighted finite-state automaton can include a set of weighted partial orderings of the training data for each possible realization.
    Type: Application
    Filed: December 12, 2008
    Publication date: June 17, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Giuseppe DI FABBRIZIO, Srinivas Bangalore, Amanda Stent
  • Publication number: 20100153117
    Abstract: An interface protocol for the functional manipulation of complex devices such as consumer electronic devices without the necessity of visual feedback via textual or graphic data, wherein the sensor functions change with time rather than placement, so that a user action biases a binary state switch, which is correlated to a timed audible audio data stream, the correlation indicating the desired action selected by the user.
    Type: Application
    Filed: February 22, 2010
    Publication date: June 17, 2010
    Inventor: Paul M. Toupin
  • Publication number: 20100153112
    Abstract: Disclosed are editing methods that are added to speech-based searching to allow users to better understand textual queries submitted to a search engine and to easily edit their speech queries. According to some embodiments, the user begins to speak. The user's speech is translated into a textual query and submitted to a search engine. The results of the search are presented to the user. As the user continues to speak, the user's speech query is refined based on the user's further speech. The refined speech query is converted to a textual query which is again submitted to the search engine. The refined results are presented to the user. This process continues as long as the user continues to refine the query. Some embodiments present the textual query to the user and allow the user to use both speech-based and non-speech-based tools to edit the textual query.
    Type: Application
    Filed: December 16, 2008
    Publication date: June 17, 2010
    Applicant: MOTOROLA, INC.
    Inventors: W. Garland Phillips, Harry M. Bliss, Bashar Jano, Changxue Ma
  • Publication number: 20100153106
    Abstract: A method may include receiving communications associated with a communication session. The communication session may correspond to a telephone conversation, text-based conversation or a multimedia conversation. The method may also include identifying portions of the communication session and storing the identified portions. The method may further include receiving a request to retrieve information associated with the communication session and providing to a display, information associated with the identified portions.
    Type: Application
    Filed: December 15, 2008
    Publication date: June 17, 2010
    Applicant: VERIZON DATA SERVICES LLC
    Inventors: Kristopher T. Frazier, Brian F. Roberts, Heath Stallings
  • Publication number: 20100145680
    Abstract: A speech recognition method using a domain ontology includes: constructing a domain ontology DB; forming a speech recognition grammar using the constructed domain ontology DB; extracting a feature vector from a speech signal; and modeling the speech signal using an acoustic model. The method performs speech recognition by using the acoustic model, the speech recognition dictionary, and the speech recognition grammar on the basis of the feature vector.
    Type: Application
    Filed: September 1, 2009
    Publication date: June 10, 2010
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Seung YUN, Soo Jong Lee, Jeong Se Kim, Il Bin Lee, Jun Park, Sang Kyu Park
  • Publication number: 20100145684
    Abstract: A system and method for processing a narrowband speech signal comprising speech samples in a first range of frequencies. The method comprises: generating from the narrowband speech signal a highband speech signal in a second range of frequencies above the first range of frequencies; determining a pitch of the highband speech signal; using the pitch to generate a pitch-dependent tonality measure from samples of the highband speech signal; and filtering the speech samples using a gain factor derived from the tonality measure and selected to reduce the amplitude of harmonics in the highband speech signal.
    Type: Application
    Filed: June 10, 2009
    Publication date: June 10, 2010
    Inventors: Mattias Nilsson, Soren Vang Anderson
  • Publication number: 20100145709
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for voice authentication. The method includes receiving a speech sample from a user through an Internet browser for authentication as part of a request for a restricted-access resource, performing a comparison of the received speech sample to a previously established speech profile associated with the user, transmitting an authentication to the network client if the comparison is equal to or greater than a certainty threshold, and transmitting a denial to the network client if the comparison is less than the certainty threshold.
    Type: Application
    Filed: December 4, 2008
    Publication date: June 10, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventor: Saurabh Kumar
  • Publication number: 20100145699
    Abstract: Methods and systems for adapting acoustic models are disclosed. A user terminal may determine a phoneme distribution of a text corpus, determine an acoustic model gain distribution of phonemes of an acoustic model before and after adaptation of the acoustic model, determine a desired phoneme distribution based on the phoneme distribution and the acoustic model gain distribution, generate an adaptation sentence based on the desired phoneme distribution, and generate a prompt requesting that the user speak the adaptation sentence.
    Type: Application
    Filed: December 9, 2008
    Publication date: June 10, 2010
    Applicant: NOKIA CORPORATION
    Inventor: Jilei Tian
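    A minimal sketch for 20100145699, under the simplifying assumption that the adaptation sentence is selected from a small candidate pool rather than freely generated: the desired phoneme distribution combines corpus frequency with a per-phoneme adaptation gain, and the candidate with the lowest cross-entropy against that distribution wins. All distributions and phoneme strings are illustrative.

```python
from collections import Counter
import math

def desired_distribution(corpus_phoneme_freq, model_gain):
    """Combine corpus phoneme frequency with per-phoneme acoustic model gain
    (how much each phoneme improves after adaptation) and renormalize."""
    raw = {p: corpus_phoneme_freq.get(p, 0.0) * model_gain.get(p, 1.0)
           for p in set(corpus_phoneme_freq) | set(model_gain)}
    total = sum(raw.values())
    return {p: v / total for p, v in raw.items() if v > 0}

def pick_adaptation_sentence(candidates, desired, eps=1e-6):
    """Pick the candidate whose phoneme makeup has the lowest cross-entropy
    against the desired phoneme distribution."""
    def cost(phonemes):
        counts, n = Counter(phonemes), len(phonemes)
        return -sum(w * math.log(counts[p] / n + eps) for p, w in desired.items())
    return min(candidates, key=lambda sentence: cost(candidates[sentence]))

if __name__ == "__main__":
    freq = {"AA": 0.3, "IY": 0.3, "S": 0.4}
    gain = {"AA": 1.0, "IY": 2.0, "S": 0.5}     # IY benefits most from adaptation
    candidates = {"see the sea": ["S", "IY", "DH", "AH", "S", "IY"],
                  "saw a star": ["S", "AO", "AH", "S", "T", "AA", "R"]}
    print(pick_adaptation_sentence(candidates, desired_distribution(freq, gain)))
```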
  • Publication number: 20100145700
    Abstract: Mobile systems and methods that overcome the deficiencies of prior art speech-based interfaces for telematics applications through the use of a complete speech-based information query, retrieval, presentation, and local or remote command environment. This environment makes significant use of context, prior information, domain knowledge, and user-specific profile data to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created. The invention creates, stores, and uses extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command. The invention may organize domain-specific behavior and information into agents that are distributable or updateable over a wide area network.
    Type: Application
    Filed: February 12, 2010
    Publication date: June 10, 2010
    Applicant: VoiceBox Technologies, Inc.
    Inventors: Robert A. Kennewick, David Locke, Michael R. Kennewick, SR., Michael R. Kennewick, JR., Richard Kennewick, Tom Freeman, Stephen F. Elston
  • Publication number: 20100144435
    Abstract: Game data and speech transfer to and from a wireless portable game terminal is disclosed. The wireless portable game terminal includes a radio transceiver configured to transfer speech and game data through a radio connection to a telecommunication system, a loudspeaker, a microphone, and a processing unit. The processing unit is configured to process the game data, to transfer the game data to and from another game terminal or a game server through the radio connection, to receive captured speech of another user through the radio connection, to output the audio part of the game data and the captured speech of the other user through the loudspeaker, to capture speech of a user with the microphone, and to transfer the captured speech of the user to another game terminal or to a game server through the radio connection.
    Type: Application
    Filed: February 12, 2010
    Publication date: June 10, 2010
    Applicant: Nokia Corporation
    Inventor: Kari NIEMELA
  • Publication number: 20100145693
    Abstract: A method for extracting verbal cues is presented which enhances a speech signal to increase the saliency and recognition of verbal cues including emotive verbal cues. In a further embodiment of the method, the method works in conjunction with a computer that displays a face which gestures and articulates non-verbal cues in accord with speech patterns that are also modified to enhance their verbal cues. The methods work to provide a means for allowing non-fluent speakers to better understand and learn foreign languages.
    Type: Application
    Filed: March 20, 2008
    Publication date: June 10, 2010
    Inventor: Martin L Lenhardt
  • Publication number: 20100138150
    Abstract: Provided is a navigation device technology capable of identifying an intersection or the like from the designation of an incomplete street name. An input of a first keyword and an input of a second keyword are received, and a connection point of a first street whose name includes at least part of the first keyword and a second street whose name includes at least part of the second keyword is identified. This saves the user, who is not always familiar with the geography of the search target area, from the inconvenience of inputting the complete name of the first street, based on which the second street would otherwise be retrieved and selected and the intersection point of the first and second streets identified.
    Type: Application
    Filed: November 25, 2009
    Publication date: June 3, 2010
    Applicant: Clarion Co., Ltd.
    Inventors: Norio WATARAI, Chiharu HIRAI
  • Publication number: 20100138224
    Abstract: Information is exchanged between a user of a communications device and an application during an ongoing conversation between the user using the communications device and a party, without disrupting the conversation. An application associated with the communications device is accessed via the communications device in response to a command and keyword spoken by the user during the communications session. Information is retrieved from the application according to the keyword spoken by the user. When the information is retrieved from the application, the user is prompted in a manner transparent to the party, after which a response is sent to the user.
    Type: Application
    Filed: December 3, 2008
    Publication date: June 3, 2010
    Applicant: AT&T Intellectual Property I, LP.
    Inventor: James Carlton Bedingfield, SR.
  • Publication number: 20100137037
    Abstract: A vehicle communication system facilitates hands-free interaction with a mobile device in a vehicle or elsewhere. Users interact with the system by speaking to it. The system processes text and processes commands. The system supports Bluetooth wireless technology for hands-free use. The system handles telephone calls, email, and SMS text messages. The user can customize the device via a user profile stored on an Internet web server.
    Type: Application
    Filed: February 8, 2010
    Publication date: June 3, 2010
    Inventor: Otman A. Basir
  • Publication number: 20100131262
    Abstract: Embodiments of the invention relate to methods for generating a multilingual acoustic model. A main acoustic model having probability distribution functions and a probabilistic state sequence model including first states is provided to a processor. At least one second acoustic model including probability distribution functions and a probabilistic state sequence model including states is also provided to the processor. The processor replaces each of the probability distribution functions of the at least one second acoustic model by one of the probability distribution functions of the main acoustic model and/or each of the states of the probabilistic state sequence model of the at least one second acoustic model with a state of the probabilistic state sequence model of the main acoustic model, based on a criteria set, to obtain at least one modified second acoustic model. The criteria set may be a distance measurement.
    Type: Application
    Filed: November 25, 2009
    Publication date: May 27, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Rainer Gruhn, Martin Raab, Raymond Brueckner
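    A minimal sketch of the replacement step in 20100131262, assuming each probability distribution function is summarized by a Gaussian mean vector and that the criteria set is a plain Euclidean distance; the real criteria and model structure are richer.

```python
import numpy as np

def map_to_main_model(main_means, second_means):
    """For each distribution of the second-language model, find the closest
    main-model distribution under Euclidean distance, so the merged model
    reuses only main-model distributions."""
    mapping = {}
    for j, mean in enumerate(second_means):
        distances = np.linalg.norm(main_means - mean, axis=1)
        mapping[j] = int(np.argmin(distances))
    return mapping   # second-model pdf index -> main-model pdf index

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    main = rng.normal(size=(5, 3))      # 5 main-model Gaussian means, 3-dim
    second = rng.normal(size=(3, 3))    # 3 second-model Gaussian means
    print(map_to_main_model(main, second))
```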
  • Publication number: 20100131270
    Abstract: The invention relates to a method for determining a characteristic pattern for a speech message that is supplied in the form of a numerically encoded audio signal generated by means of a sampling process.
    Type: Application
    Filed: July 13, 2007
    Publication date: May 27, 2010
    Applicant: Nokia Siemens Networks GmbH & Co.
    Inventor: Joachim Charzinski
  • Publication number: 20100131276
    Abstract: A device (2) for changing the pitch of an audio signal (r), such as a speech signal, comprises a sinusoidal analysis unit (21) for determining sinusoidal parameters of the audio signal (r), a parameter production unit (22) for predicting the phase of a sinusoidal component, and a sinusoidal synthesis unit (23) for synthesizing the parameters to produce a reconstructed signal (r′). The parameter production unit (22) receives, for each time segment of the audio signal, the phase of the previous time segment to predict the phase of the current time segment.
    Type: Application
    Filed: July 6, 2006
    Publication date: May 27, 2010
    Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.
    Inventors: Albertus Cornelis Den Brinker, Robert Johannes Sluijter
  • Publication number: 20100131277
    Abstract: There is provided a device for performing interaction between a user and a machine. The device includes a plurality of domains corresponding to a plurality of stages in the interaction. Each of the domains has voice comprehension means which understands the content of the user's voice.
    Type: Application
    Filed: July 26, 2006
    Publication date: May 27, 2010
    Applicant: Honda Motor Co., Ltd.
    Inventor: Mikio Nakano
  • Publication number: 20100131274
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for dialog modeling. The method includes receiving spoken dialogs annotated to indicate dialog acts and task/subtask information, parsing the spoken dialogs with a hierarchical, parse-based dialog model which operates incrementally from left to right and which only analyzes a preceding dialog context to generate parsed spoken dialogs, and constructing a functional task structure of the parsed spoken dialogs. The method can further either interpret user utterances with the functional task structure of the parsed spoken dialogs or plan system responses to user utterances with the functional task structure of the parsed spoken dialogs. The parse-based dialog model can be a shift-reduce model, a start-complete model, or a connection path model.
    Type: Application
    Filed: November 26, 2008
    Publication date: May 27, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Amanda Stent, Srinivas Bangalore
  • Publication number: 20100131264
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.
    Type: Application
    Filed: November 21, 2008
    Publication date: May 27, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie
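    A minimal sketch of choosing the best hypothesis for a missing segment in 20100131264 when, as the abstract notes, the acoustic scores can be identical and the decision rests on the surrounding context and a language model. The bigram table, vocabulary, and backoff value are hypothetical.

```python
import math

BIGRAM_LOGPROB = {   # hypothetical bigram scores; a real system would use a full LM
    ("flight", "to"): -0.5, ("to", "boston"): -1.0, ("to", "austin"): -1.4,
    ("boston", "tomorrow"): -1.1, ("austin", "tomorrow"): -1.2,
}

def lm_score(words, backoff=math.log(1e-4)):
    """Sum bigram log-probabilities over the word sequence, backing off to a
    small constant for unseen pairs."""
    return sum(BIGRAM_LOGPROB.get(pair, backoff) for pair in zip(words, words[1:]))

def best_fill(left_context, right_context, hypotheses):
    """Insert each hypothesis for the missing segment into the utterance and
    keep the one with the highest language model score."""
    return max(hypotheses, key=lambda h: lm_score(left_context + [h] + right_context))

print(best_fill(["flight", "to"], ["tomorrow"], ["boston", "austin"]))   # -> boston
```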
  • Publication number: 20100131275
    Abstract: Multimodal interaction with grammar-based speech applications may be facilitated with a device by presenting permissible phrases that are in-grammar based on acceptable terms that are in-vocabulary and that have been recognized from a spoken utterance. In an example embodiment, a spoken utterance having two or more terms is received. The two or more terms include one or more acceptable terms. An index is searched using the acceptable terms as query terms. From the searching of the index, permissible phrase(s) are produced that include the acceptable terms. The index is a searchable data structure that represents multiple possible grammar paths that are ascertainable based on acceptable values for each term position of a grammar-based speech application. The permissible phrase(s) are presented to a user as option(s) that may be selected to conduct multimodal interaction with the device.
    Type: Application
    Filed: November 26, 2008
    Publication date: May 27, 2010
    Applicant: MICROSOFT CORPORATION
    Inventor: Timothy Seung Yoon Paek
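    A minimal sketch of an index over possible grammar paths as in 20100131275: every path of a toy slot grammar is expanded into a phrase and indexed by its terms, and the in-vocabulary (acceptable) terms recognized so far retrieve the permissible phrases that contain them all. The command grammar is hypothetical.

```python
from itertools import product

def build_index(slot_values):
    """Expand every grammar path (one value per slot position) into a phrase
    and index the phrases by the terms they contain."""
    index = {}
    for combo in product(*slot_values):
        phrase = " ".join(combo)
        for term in combo:
            index.setdefault(term.lower(), set()).add(phrase)
    return index

def permissible_phrases(index, acceptable_terms):
    """Return the phrases containing every acceptable term recognized so far."""
    hits = [index.get(term.lower(), set()) for term in acceptable_terms]
    return set.intersection(*hits) if hits else set()

# Toy command grammar: <action> <object> <place>
grammar = [["play", "pause"], ["music", "video"], ["upstairs", "downstairs"]]
index = build_index(grammar)
print(sorted(permissible_phrases(index, ["play", "upstairs"])))
# -> ['play music upstairs', 'play video upstairs']
```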
  • Publication number: 20100125460
    Abstract: A voice assistant system is disclosed which directs the voice prompts delivered to a first user of a voice assistant to also be communicated wirelessly to the voice assistant of a second user so that the second user can hear the voice prompts as delivered to the first user.
    Type: Application
    Filed: November 12, 2009
    Publication date: May 20, 2010
    Inventors: Mark B. Mellott, Richard Anthony Bates, Michael Laughery, James R. Logan
  • Publication number: 20100125457
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by (1) identifying word and phone alignments and corresponding likelihood scores, and (2) discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function.
    Type: Application
    Filed: November 19, 2008
    Publication date: May 20, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
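    A minimal sketch for 20100125457 of per-word pronunciation weights normalized to sum to 1, with a simplified discriminative-style update (a multiplicative boost and penalty stands in for the patent's error-minimizing adaptation over word and phone alignments). The pronunciation strings and learning rate are illustrative.

```python
def normalize(weights):
    """Rescale a word's pronunciation weights so they sum to 1."""
    total = sum(weights.values())
    return {pron: w / total for pron, w in weights.items()}

def update_weights(weights, winning_pron, competing_pron, eta=0.1):
    """Boost the pronunciation that aligned with the correct word, penalize the
    competing one, and renormalize at the word (unit of speech) level."""
    new = dict(weights)
    new[winning_pron] *= (1 + eta)
    new[competing_pron] = max(new[competing_pron] * (1 - eta), 1e-6)
    return normalize(new)

# Two candidate pronunciations of "tomato" (illustrative ARPAbet-style strings).
w = normalize({"T AH M EY T OW": 1.0, "T AH M AA T OW": 1.0})
w = update_weights(w, "T AH M EY T OW", "T AH M AA T OW")
print(w)   # weights still sum to 1, shifted toward the winning pronunciation
```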
  • Publication number: 20100125456
    Abstract: Embodiments of a dialog system that utilizes contextual information to perform recognition of proper names are described. Unlike present name recognition methods for large name lists, which generally focus strictly on the static aspect of the names, embodiments of the present system take into account the temporal, recency, and context effects when names are used, and formulate new questions to further constrain the search space or grammar for recognition of past and current utterances.
    Type: Application
    Filed: November 19, 2008
    Publication date: May 20, 2010
    Applicant: ROBERT BOSCH GMBH
    Inventors: Fuliang Weng, Zhongnan Shen, Zhe Feng
  • Publication number: 20100121641
    Abstract: An external voice identification system and its identification process are disclosed. The external voice identification system of a multimedia electronic device is activated by identifying and analyzing an input voice message. The multimedia electronic device can be an iPod player having a storage module storing a plurality of voice files and a transmission interface for electrically connecting to a voice identification system. The voice identification system is electrically connected to the transmission interface and has a built-in identification module whose identification unit can identify and analyze the voice signals. An adapting interface is connected to a voice input unit to receive the external voice signal, so that the identification module can identify and analyze the external voice signals to activate the multimedia electronic device to play the voice signals (songs), and to select, adjust, and switch the playing content according to the input external voice signal.
    Type: Application
    Filed: November 11, 2008
    Publication date: May 13, 2010
    Applicant: AIBELIVE CO., LTD
    Inventors: TSUNG-HAN TSAI, Chen-Wei Su, Chun-Ping Fang, Min-Ching Wu