Segmentation Or Word Limit Detection (epo) Patents (Class 704/E15.005)
-
Publication number: 20120116768
Abstract: Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more of the other modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.
Type: Application
Filed: November 8, 2011
Publication date: May 10, 2012
Applicant: AT&T INTELLECTUAL PROPERTY II, L.P.
Inventors: Srinivas Bangalore, Michael J. Johnston
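The gesture-constrains-speech flow described in this abstract can be illustrated with a toy rescoring step. This is a minimal sketch, not the patented method: the gesture vocabulary, hypothesis scores, and boost weight below are all invented for illustration, and a real system would operate on recognition lattices rather than flat n-best lists.

```python
# Toy illustration: a recognized gesture biases the scoring of competing
# speech hypotheses (all data and weights here are hypothetical).
def gesture_biased_best(speech_hypotheses, gesture_label, gesture_vocab, boost=2.0):
    """Return the best speech hypothesis after boosting words tied to the gesture."""
    related = gesture_vocab.get(gesture_label, set())
    def score(hyp):
        text, base = hyp
        return base + sum(boost for word in text.split() if word in related)
    return max(speech_hypotheses, key=score)[0]

# A circling gesture over a map makes place-related words more plausible,
# so the speech recognizer prefers "restaurants" over the acoustic near-miss.
gesture_vocab = {"circle_area": {"restaurants", "hotels", "zoom"}}
hyps = [("show restaurants here", 4.0), ("show rest rooms hear", 4.5)]
best = gesture_biased_best(hyps, "circle_area", gesture_vocab)  # "show restaurants here"
```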
-
Publication number: 20120101823
Abstract: Embodiments of a dialog system that utilizes contextual information to perform recognition of proper names are described. Unlike present name recognition methods for large name lists, which generally focus strictly on the static aspect of the names, embodiments of the present system take into account the temporal, recency, and context effects that arise when names are used, and formulate new questions to further constrain the search space or grammar for recognition of past and current utterances.
Type: Application
Filed: December 28, 2011
Publication date: April 26, 2012
Applicant: Robert Bosch GmbH
Inventors: Fuliang Weng, Zhongnan Shen, Zhe Feng
-
Publication number: 20120095765
Abstract: A method for alleviating ambiguity issues of new user-defined speech commands. An original command for a user-defined speech command can be received. It can then be determined if the original command is likely to be confused with a set of existing speech commands. When confusion is unlikely, the original command can be automatically stored. When confusion is likely, a substitute command that is unlikely to be confused with existing commands can be automatically determined. The substitute can be presented as an alternative to the original command and can be selectively stored as the user-defined speech command.
Type: Application
Filed: December 22, 2011
Publication date: April 19, 2012
Applicant: Nuance Communications, Inc.
Inventors: William K. Bodin, James R. Lewis, Leslie R. Wilson
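One way to approximate the confusability check described above is a similarity comparison of the new command against the existing command set. This is a hedged sketch: the patent does not specify the measure, a production recognizer would compare phoneme sequences rather than spellings, and the threshold below is arbitrary.

```python
import difflib

def is_confusable(candidate, existing, threshold=0.75):
    """Flag a new command whose spelling is very close to an existing one.
    String similarity stands in for the acoustic/phonetic comparison a
    real recognizer would use."""
    return any(
        difflib.SequenceMatcher(None, candidate, cmd).ratio() >= threshold
        for cmd in existing
    )

commands = {"call home", "open mail"}
is_confusable("call homes", commands)  # True: nearly identical strings
is_confusable("play music", commands)  # False: clearly distinct
```

When the check fires, the system would propose a substitute command and store it only after the user accepts it.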
-
Publication number: 20120089396
Abstract: A system that incorporates teachings of the present disclosure may include, for example, an interface for receiving an utterance of speech and converting the utterance into a speech signal, such as a digital representation including a waveform and/or spectrum; and a processor for dividing the speech signal into segments and detecting the emotional information in the speech. The system is designed by comparing the speech segments to a baseline to identify the emotion or emotions from the suprasegmental information (i.e., paralinguistic information) in speech, wherein the baseline is determined from acoustic characteristics of a plurality of emotion categories. Other embodiments are disclosed.
Type: Application
Filed: June 16, 2010
Publication date: April 12, 2012
Applicant: University of Florida Research Foundation, Inc.
Inventors: Sona Patel, Rahul Shrivastav
-
Publication number: 20120078634
Abstract: A voice dialogue system executing an operation through a voice dialogue with a user includes a history storage unit storing the operation name of each operation executed by the voice dialogue system and an operation history recording the number of times each operation has been executed; a voice storage unit storing voice data corresponding to each operation name; a detection unit detecting a voice skip signal indicating that the user's voice input is to be skipped; an acquisition unit acquiring, when the detection unit detects the voice skip signal, the operation name of the operation having a high priority based on the number of execution times from said history storage unit; and a generation unit reading the voice data corresponding to the acquired operation name from said voice storage unit and generating a voice signal corresponding to the read voice data.
Type: Application
Filed: March 15, 2011
Publication date: March 29, 2012
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventor: Masahide Ariu
-
Publication number: 20120078631
Abstract: Target word recognition includes: obtaining a candidate word set and corresponding characteristic computation data, the candidate word set comprising text data, and characteristic computation data being associated with the candidate word set; performing segmentation of the characteristic computation data to generate a plurality of text segments; combining the plurality of text segments to form a text data combination set; determining an intersection of the candidate word set and the text data combination set, the intersection comprising a plurality of text data combinations; determining a plurality of designated characteristic values for the plurality of text data combinations; based at least in part on the plurality of designated characteristic values and according to at least a criterion, recognizing among the plurality of text data combinations target words whose characteristic values fulfill the criterion.
Type: Application
Filed: September 22, 2011
Publication date: March 29, 2012
Applicant: ALIBABA GROUP HOLDING LIMITED
Inventors: Haibo Sun, Yang Yang, Yining Chen
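The segment → combine → intersect → filter pipeline in this abstract can be sketched concretely. This is only an illustrative reading: token frequency stands in for the "designated characteristic value", the minimum-count criterion is invented, and a real implementation would use richer segmentation and features.

```python
def find_target_words(candidates, text, min_count=2):
    """Segment the characteristic computation data, combine adjacent
    segments, intersect with the candidate word set, and keep combinations
    whose frequency (a stand-in for the designated characteristic value)
    meets the criterion."""
    tokens = text.split()
    combos = []
    for n in (1, 2):  # unigram and bigram combinations of segments
        combos += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    hits = set(combos) & set(candidates)          # the intersection step
    return {w for w in hits if combos.count(w) >= min_count}

find_target_words({"red shoe", "bag"}, "red shoe and red shoe and bag")
# {'red shoe'}: it occurs twice, while 'bag' occurs only once
```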
-
Publication number: 20120078629
Abstract: According to one embodiment, a meeting support apparatus includes a storage unit, a determination unit, and a generation unit. The storage unit is configured to store storage information for each of a set of words, the storage information indicating a word, pronunciation information on the word, and a pronunciation recognition frequency. The determination unit is configured to generate emphasis determination information including an emphasis level that represents whether a first word should be highlighted and the degree of highlighting, determined in accordance with the pronunciation recognition frequency of a second word when the first word is highlighted, based on whether the storage information includes a second set corresponding to a first set and, when the second set is included, based on the pronunciation recognition frequency of the second word. The generation unit is configured to generate an emphasis character string based on the emphasis determination information when the first word is highlighted.
Type: Application
Filed: March 25, 2011
Publication date: March 29, 2012
Inventors: Tomoo Ikeda, Nobuhiro Shimogori, Kouji Ueno
-
Publication number: 20120072219
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating responses to a user speech query in voice-enabled search based on metadata that include demographic features of the speaker. A system practicing the method recognizes received speech from a speaker to generate recognized speech, identifies metadata about the speaker from the received speech, and feeds the recognized speech and the metadata to a question-answering engine. Identifying the metadata about the speaker is based on voice characteristics of the received speech. The demographic features can include age, gender, socio-economic group, nationality, and/or region. The metadata identified about the speaker from the received speech can be combined with or override self-reported speaker demographic information.
Type: Application
Filed: September 22, 2010
Publication date: March 22, 2012
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Michael JOHNSTON, Srinivas Bangalore, Junlan Feng, Taniya Mishra
-
Publication number: 20120072221
Abstract: A distributed voice user interface system includes a local device which receives speech input issued from a user. Such speech input may specify a command or a request by the user. The local device performs preliminary processing of the speech input and determines whether it is able to respond to the command or request by itself. If not, the local device initiates communication with a remote system for further processing of the speech input.
Type: Application
Filed: November 23, 2011
Publication date: March 22, 2012
Applicant: Ben Franklin Patent Holding, LLC
Inventors: George M. WHITE, James J. Buteau, Glen E. Shires, Kevin J. Surace, Steven Markman
-
Publication number: 20120059658
Abstract: Embodiments of the present invention relate to searching for content on the Internet. A user may supply a search query to a device, and the device may issue the search query to a plurality of search engines, including at least one general purpose search engine and at least one site-specific search engine. In this way, the user need not separately issue search queries to each of the plurality of search engines.
Type: Application
Filed: September 8, 2010
Publication date: March 8, 2012
Applicant: Nuance Communications, Inc.
Inventors: Vladimir Sejnoha, Gary B. Clayton, Victor S. Chen, Steven Hatch, William F. Ganong, III, Gunnar Evermann, Marc W. Regan, Stephen W. Laverty, Paul J. Vozila, Nathan M. Bodenstab, Yik-Cheung Tam
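The fan-out described above (one query, many engines) reduces to a simple dispatch loop. This sketch uses stub engine functions in place of real general-purpose and site-specific backends; the engine names and result strings are illustrative only.

```python
def federated_search(query, engines):
    """Send one query to several engines (general-purpose and site-specific)
    and collect results per engine, so the user issues the query only once."""
    return {name: search(query) for name, search in engines.items()}

# Stub engines standing in for real backends.
engines = {
    "general": lambda q: [f"web result for {q}"],
    "video_site": lambda q: [f"video result for {q}"],
}
results = federated_search("jazz concerts", engines)
```

A production version would issue the requests concurrently and merge or rank the per-engine result lists before display.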
-
Publication number: 20120059657
Abstract: A method for detecting and recognizing speech is provided that remotely detects body motions from a speaker during vocalization with one or more radar sensors. Specifically, the radar sensors include a transmit aperture that transmits one or more waveforms towards the speaker, and each of the waveforms has a distinct wavelength. A receiver aperture is configured to receive the scattered radio frequency energy from the speaker. Doppler signals correlated with the speaker vocalization are extracted with a receiver. Digital signal processors are configured to develop feature vectors utilizing the vocalization Doppler signals, and words associated with the feature vectors are recognized with a word classifier.
Type: Application
Filed: June 7, 2011
Publication date: March 8, 2012
Inventors: Jefferson M. Willey, Todd Stephenson, Hugh Faust, James P. Hansen, George J. Linde, Carol Chang, Justin Nevitt, James A. Ballas, Thomas Herne Crystal, Vincent Michael Stanford, Jean W. de Graaf
-
Publication number: 20120053943
Abstract: A voice dialing method includes the steps of receiving an utterance from a user, decoding the utterance to identify a recognition result for the utterance, and communicating the recognition result to the user. If an indication is received from the user that the communicated recognition result is incorrect, then the incorrect result is added to a rejection reference. Then, when the user repeats the misunderstood utterance, the rejection reference can be used to eliminate the incorrect recognition result as a potential subsequent recognition result. The method can be used for single or multiple digits or digit strings.
Type: Application
Filed: November 7, 2011
Publication date: March 1, 2012
Applicant: GENERAL MOTORS LLC
Inventors: Jason W. Clark, Rathinavelu Chengalvarayan, Timothy J. Grost, Dana B. Fecher, Jeremy M. Spaulding
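The rejection-reference loop above can be sketched in a few lines: on a retry, any hypothesis the user already rejected is excluded before picking the best result. The digit strings and scores are hypothetical, and a real dialer would work with the recognizer's full n-best list.

```python
def recognize_with_rejection(hypotheses, rejected):
    """Return the best-scoring (digits, score) hypothesis not already in the
    rejection reference; None if every hypothesis has been rejected."""
    allowed = [h for h in hypotheses if h[0] not in rejected]
    return max(allowed, key=lambda h: h[1])[0] if allowed else None

rejected = set()
hyps = [("5551234", 0.9), ("5551284", 0.8)]
first = recognize_with_rejection(hyps, rejected)   # "5551234"
rejected.add(first)                                # user marks it incorrect
second = recognize_with_rejection(hyps, rejected)  # "5551284" on the retry
```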
-
Publication number: 20120035930
Abstract: A conferencing system is disclosed in which a participant in a conference call can program the system to listen for one or more "keywords" in the call. The keywords might be the participant's name, words associated with him or her, or words associated with his or her area of knowledge. The system uses speech recognition technology to listen for those words. When the system detects that those words have been spoken, it alerts the participant, using audible, visual, and/or tactile signals, that the participant's attention to the call is warranted. When the keywords are chosen wisely, the benefit can be great.
Type: Application
Filed: September 26, 2011
Publication date: February 9, 2012
Applicant: AVAYA INC.
Inventors: Ezra Raphael Gilbert, Vipul Kishore Lalka, Venkat R. Gilakattula
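The alerting logic reduces to keyword spotting over the recognized call audio. In this sketch the transcript is already text (a real system would run ASR on the live audio), and the participants and watchlists are invented for illustration.

```python
def keyword_alerts(transcript_words, watchlists):
    """Map each participant to the registered keywords that were spoken,
    i.e. the participants who should be alerted and why."""
    spoken = {w.lower() for w in transcript_words}
    return {who: sorted(spoken & kws)
            for who, kws in watchlists.items() if spoken & kws}

watch = {"alice": {"budget", "alice"}, "bob": {"deploy"}}
alerts = keyword_alerts("When Alice returns we will discuss the budget".split(), watch)
# {'alice': ['alice', 'budget']}: only Alice's keywords were spoken
```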
-
Publication number: 20120035931
Abstract: In one implementation, a computer-implemented method includes detecting a current context associated with a mobile computing device and determining, based on the current context, whether to switch the mobile computing device from a current mode of operation to a second mode of operation during which the mobile computing device monitors ambient sounds for voice input that indicates a request to perform an operation. The method can further include, in response to determining whether to switch to the second mode of operation, activating one or more microphones and a speech analysis subsystem associated with the mobile computing device so that the mobile computing device receives a stream of audio data. The method can also include providing output on the mobile computing device that is responsive to voice input that is detected in the stream of audio data and that indicates a request to perform an operation.
Type: Application
Filed: September 29, 2011
Publication date: February 9, 2012
Inventors: Michael J. LeBeau, John Nicholas Jitkoff, Dave Burke
-
Publication number: 20120029919
Abstract: One embodiment of the present invention provides a system for placing linguistically-aware variables in computer-generated text. During operation, the system receives a sentence at a computer system, wherein the sentence comprises two or more words. Next, the system analyzes the sentence to identify a first variable, wherein the first variable is a place-holder for a first word. The system then receives the first word. After that, the system automatically determines a gender of the first word. Next, the system analyzes the sentence to identify a first dependent word that is dependent on the first word, wherein a spelling of the first dependent word is dependent on the gender of the first word. The system then determines the spelling of the first dependent word that corresponds to the gender of the first word. Next, the system replaces the first variable in the sentence with the first word.
Type: Application
Filed: July 29, 2010
Publication date: February 2, 2012
Applicant: INTUIT INC.
Inventor: Peter J. Harris
-
Publication number: 20110320202
Abstract: A system using sound templates is presented that may receive a first template for an audio signal and compare it to templates from different sound sources to determine a correlation between them. A location history database is created that assists in identifying the location of a user in response to audio templates generated by the user over time and at different locations. Comparisons can be made using templates of different richness to achieve confidence levels, and confidence levels may be represented based on the results of the comparisons. Queries may be run against the database to track users by templates generated from their voices. In addition, background information may be filtered out of the voice signal and separately compared against the database to assist in identifying a location based on the background noise.
Type: Application
Filed: June 22, 2011
Publication date: December 29, 2011
Inventor: John D. KAUFMAN
-
Publication number: 20110320201
Abstract: An audio signal verification system is presented for verifying that a sound is from a predetermined source. Various methods for analyzing the sound are presented, and these methods may be combined to varying degrees to determine an appropriate correlation with a predefined pattern. Moreover, a confidence level or other indication may be used to indicate that the determination was successful. The sound may be reduced to templates of varying degrees of richness. Different templates may also be created using the same sound source, and different sounds from the same source may be aggregated to form a single template. Comparisons may be made between a sound, or a template derived from that sound, and stored sounds or templates derived from those stored sounds. Moreover, comparisons can be made using templates of different richness to achieve confidence levels, and confidence levels may be represented based on the results of the comparisons.
Type: Application
Filed: June 2, 2011
Publication date: December 29, 2011
Inventor: John D. KAUFMAN
-
Gaming machine with dialog output method according to victory or defeat of game, and control method thereof
Patent number: 8083587
Abstract: A slot machine 1 of the present invention performs control so as to: sequentially store the number of game values consumed per unit game; sequentially store the number of game values given per unit game; calculate a difference between the total number of game values given and the total number of game values consumed as a self game value difference; transmit the self game value difference to the outside; receive another machine's game value difference from the outside; when the self game value difference and the other game value difference are in a predetermined relationship, voice-output, by the conversation controller 91, an answer at a volume corresponding to the predetermined relationship from the speaker 23 in response to a voice input through the microphone 90; and delete the stored numbers of game values given and consumed under a predetermined condition.
Type: Grant
Filed: January 21, 2009
Date of Patent: December 27, 2011
Assignee: Aruze Gaming America, Inc.
Inventor: Kazuo Okada
-
Publication number: 20110307257
Abstract: A method and system for indicating in real time that an interaction is associated with a problem or issue, comprising: receiving a segment of an interaction in which a representative of the organization participates; extracting a feature from the segment; extracting a global feature associated with the interaction; aggregating the feature and the global feature; and classifying the segment or the interaction in association with the problem or issue by applying a model to the feature and the global feature. The method and system may also use features extracted from earlier segments within the interaction. The method and system can also evaluate the model based on features extracted from training interactions and manual tagging assigned to the interactions or segments thereof.
Type: Application
Filed: June 10, 2010
Publication date: December 15, 2011
Applicant: Nice Systems Ltd.
Inventors: Oren PEREG, Moshe WASSERBLAT, Yuval LUBOWICH, Ronen LAPERDON, Dori SHAPIRA, Vladislav FEIGIN, Oz FOX-KAHANA
-
Publication number: 20110301955
Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to the list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.
Type: Application
Filed: June 7, 2010
Publication date: December 8, 2011
Applicant: GOOGLE INC.
Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas Beeferman
-
Publication number: 20110282650
Abstract: A very common problem arises when people speak a language other than the one to which they are accustomed: syllables can be spoken for longer or shorter than the listener would regard as appropriate. An example of this can be observed when people who have a heavy Japanese accent speak English. Since Japanese words end with vowels, there is a tendency for native Japanese speakers to add a vowel sound to the end of English words that should end with a consonant. Illustratively, native Japanese speakers often pronounce "orange" as "orenji." An aspect provides an automatic speech-correcting process that would not necessarily need to know that fruit is being discussed; the system would only need to know that the speaker is accustomed to Japanese, that the listener is accustomed to English, that "orenji" is not a word in English, and that "orenji" is a typical Japanese mispronunciation of the English word "orange."
Type: Application
Filed: May 17, 2010
Publication date: November 17, 2011
Applicant: AVAYA INC.
Inventors: Terry Jennings, Paul Roller Michaelis
-
Publication number: 20110282667
Abstract: A plurality of statements are received from within a grammar structure. Each of the statements is formed by a number of word sets. A number of alignment regions across the statements are identified by aligning the statements on a word set basis. Each aligned word set represents an alignment region. A number of potential confusion zones are identified across the statements. Each potential confusion zone is defined by words from two or more of the statements at corresponding positions outside the alignment regions. For each of the identified potential confusion zones, phonetic pronunciations of the words within the potential confusion zone are analyzed to determine a measure of confusion probability between the words when audibly processed by a speech recognition system during the computing event. An identity of the potential confusion zones across the statements and their corresponding measure of confusion probability are reported to facilitate grammar structure improvement.
Type: Application
Filed: May 14, 2010
Publication date: November 17, 2011
Applicant: Sony Computer Entertainment Inc.
Inventor: Gustavo A. Hernandez-Abrego
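The alignment-region / confusion-zone idea can be shown on a toy grammar. This sketch assumes equal-length statements aligned word by word, and uses string similarity as a crude stand-in for the phonetic confusability analysis the abstract describes; the example statements and the rounding are illustrative.

```python
import difflib
from itertools import combinations

def confusion_zones(statements):
    """Align equal-length statements word by word. Positions where all
    statements agree are alignment regions; the remaining positions are
    potential confusion zones, scored by the highest pairwise string
    similarity among the differing words."""
    columns = zip(*(s.split() for s in statements))
    zones = []
    for pos, col in enumerate(columns):
        words = sorted(set(col))
        if len(words) > 1:  # the statements disagree at this position
            score = max(difflib.SequenceMatcher(None, a, b).ratio()
                        for a, b in combinations(words, 2))
            zones.append((pos, words, round(score, 2)))
    return zones

confusion_zones(["press the red button", "press the bed button"])
# [(2, ['bed', 'red'], 0.67)]: the statements differ only at position 2
```

A high score in a zone suggests the grammar author should reword one of the statements to keep the recognizer from confusing them.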
-
Publication number: 20110270612
Abstract: Systems and methods are provided for scoring non-native, spontaneous speech. A spontaneous speech sample is received, where the sample is of spontaneous speech spoken by a non-native speaker. Automatic speech recognition is performed on the sample using an automatic speech recognition system to generate a transcript of the sample, where a speech recognizer metric is determined by the automatic speech recognition system. A word accuracy rate estimate is determined for the transcript of the sample generated by the automatic speech recognition system based on the speech recognizer metric. The spontaneous speech sample is scored using a preferred scoring model when the word accuracy rate estimate satisfies a threshold, and the spontaneous speech sample is scored using an alternate scoring model when the word accuracy rate estimate fails to satisfy the threshold.
Type: Application
Filed: April 28, 2011
Publication date: November 3, 2011
Inventors: Su-Youn Yoon, Lei Chen, Klaus Zechner
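The model-routing step described above is a simple threshold branch. In this sketch the two scoring models, their feature weights, and the threshold are all invented placeholders; the only part taken from the abstract is the idea of switching models on an estimated word accuracy rate.

```python
def score_speech_sample(features, war_estimate, threshold=0.5):
    """Route scoring on the estimated word accuracy rate (WAR): when the
    ASR transcript looks reliable, use the preferred (transcript-based)
    model; otherwise fall back to an alternate model that uses only
    transcript-independent features. Both models are illustrative stubs."""
    preferred = lambda f: 0.6 * f["fluency"] + 0.4 * f["content"]
    alternate = lambda f: f["fluency"]
    model = preferred if war_estimate >= threshold else alternate
    return round(model(features), 2)

feats = {"fluency": 3.0, "content": 4.0}
score_speech_sample(feats, war_estimate=0.8)  # 3.4 via the preferred model
score_speech_sample(feats, war_estimate=0.3)  # 3.0 via the alternate model
```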
-
Publication number: 20110218802
Abstract: A computerized method for continuous speech recognition using a speech recognition engine and a phoneme model. The computerized method inputs a speech signal into the speech recognition engine. Based on the phoneme model, the speech signal is indexed by scoring for the phonemes of the phoneme model, and a time-ordered list of phoneme candidates and respective scores resulting from the scoring are produced. The phoneme candidates are input with the scores from the time-ordered list. Word transcription candidates are typically input from a dictionary, and words are built by selecting from the word transcription candidates based on the scores. A stream of transcriptions is outputted corresponding to the input speech signal. The stream of transcriptions is re-scored by searching for and detecting anomalous word transcriptions in the stream of transcriptions to produce second scores.
Type: Application
Filed: March 8, 2010
Publication date: September 8, 2011
Inventors: Shlomi Hai Bouganim, Boris Levant
-
Publication number: 20110218807
Abstract: The invention relates to a method for sentence planning (120) in a task classification system that interacts with a user. The method may include recognizing symbols in the user's input communication and determining whether the user's input communication can be understood. If the user's communication can be understood, understanding data may be generated (220). The method may further include generating communicative goals (3010) based on the recognized symbols and understanding data. The generated communicative goals (3010) may be related to information needed to be obtained from the user. The method may also include automatically planning one or more sentences (3020) based on the generated communicative goals and outputting at least one of the sentence plans to the user (3080).
Type: Application
Filed: May 18, 2011
Publication date: September 8, 2011
Applicant: AT&T Intellectual Property II, L.P.
Inventors: Marilyn A. WALKER, Owen Christopher RAMBOW, Monica ROGATI
-
Publication number: 20110210822
Abstract: A refrigerator is provided. The refrigerator includes a voice recognition unit for recognizing a voice of a name of food, a memory for storing location information of the food received in a storage chamber, a controller for determining the voice recognized by the voice recognition unit and searching a storage location of the food voice-recognized in accordance with the recognized voice, and a voice output unit for outputting a voice message on the storage location information of the food searched by the controller.
Type: Application
Filed: September 11, 2008
Publication date: September 1, 2011
Applicant: LG Electronics Inc.
Inventors: Sung-Ae Lee, Min-Kyeong Kim
-
Publication number: 20110202341
Abstract: A device receives a voice recognition statistic from a voice recognition application and applies a grammar improvement rule based on the voice recognition statistic. The device also automatically adjusts a weight of the voice recognition statistic based on the grammar improvement rule, and outputs the weight adjusted voice recognition statistic for use in the voice recognition application.
Type: Application
Filed: April 29, 2011
Publication date: August 18, 2011
Applicant: VERIZON PATENT AND LICENSING INC.
Inventor: Kevin W. BROWN
-
Publication number: 20110191106
Abstract: One-to-many comparisons of callers' words and/or voice prints with known words and/or voice prints to identify any substantial matches between them. When a customer communicates with a particular entity, such as a customer service center, the system makes a recording of the real-time call including both the customer's and agent's voices. The system segments the recording to extract different words, such as words of anger. The system may also segment at least a portion of the customer's voice to create a tone profile, and it formats the segmented words and tone profiles for network transmission to a server. The server compares the customer's words and/or tone profiles with multiple known words and/or tone profiles stored on a database to determine any substantial matches. The identification of any matches may be used for a variety of purposes, such as providing representative feedback or customer follow-up.
Type: Application
Filed: April 12, 2011
Publication date: August 4, 2011
Applicant: American Express Travel Related Services Company, Inc.
Inventors: Chin H. Khor, Marcel Leyva, Vernon Marshall
-
Publication number: 20110184735
Abstract: Embodiments are disclosed that relate to the use of identity information to help avoid the occurrence of false positive speech recognition events in a speech recognition system. One embodiment provides a method comprising receiving speech recognition data comprising a recognized speech segment, acoustic locational data related to a location of origin of the recognized speech segment as determined via signals from the microphone array, and confidence data comprising a recognition confidence value, and also receiving image data comprising visual locational information related to a location of each person in an image. The acoustic locational data is compared to the visual locational data to determine whether the recognized speech segment originated from a person in the field of view of the image sensor, and the confidence data is adjusted depending on this determination.
Type: Application
Filed: January 22, 2010
Publication date: July 28, 2011
Applicant: MICROSOFT CORPORATION
Inventors: Jason Flaks, Dax Hawkins, Christian Klein, Mitchell Stephen Dernis, Tommer Leyvand, Ali M. Vassigh, Duncan McKay
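The acoustic-versus-visual comparison above amounts to checking whether the sound's direction of arrival lines up with a person the camera sees, then moving the confidence value accordingly. This is a hedged sketch: the angle tolerance, the boost and penalty factors, and the rounding are invented for illustration.

```python
def adjust_confidence(confidence, sound_angle, person_angles, tolerance=10.0):
    """Boost recognition confidence when the sound's direction of arrival
    (from the microphone array) matches the bearing of a person the camera
    sees; penalize it otherwise. Angles are in degrees; the 1.2x boost and
    0.5x penalty are illustrative values, capped at 1.0."""
    matched = any(abs(sound_angle - a) <= tolerance for a in person_angles)
    return round(min(confidence * (1.2 if matched else 0.5), 1.0), 2)

adjust_confidence(0.7, sound_angle=32.0, person_angles=[30.0, 75.0])   # 0.84
adjust_confidence(0.7, sound_angle=-40.0, person_angles=[30.0, 75.0])  # 0.35
```

Dropping the confidence when no person is visible in the sound's direction is what suppresses false positives from televisions, echoes, or off-screen speech.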
-
Publication number: 20110166860
Abstract: Systems and methods are disclosed to operate a mobile device by capturing user input; transmitting the user input over a wireless channel to an engine; analyzing, at the engine, a music clip or video in a multimedia data stream; and sending the analysis wirelessly to the mobile device.
Type: Application
Filed: July 12, 2010
Publication date: July 7, 2011
Inventor: Bao Q. Tran
-
Publication number: 20110144992
Abstract: Described is a technology for performing unsupervised learning using global features extracted from unlabeled examples. The unsupervised learning process may be used to train a log-linear model, such as for use in morphological segmentation of words. For example, segmentations of the examples are sampled based upon the global features to produce a segmented corpus and log-linear model, which are then iteratively reprocessed to produce a final segmented corpus and a log-linear model.
Type: Application
Filed: December 15, 2009
Publication date: June 16, 2011
Applicant: Microsoft Corporation
Inventors: Kristina N. Toutanova, Colin Andrew Cherry, Hoifung Poon
-
Publication number: 20110144973
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for a speech recognition application for directory assistance that is based on a user's spoken search query. The spoken search query is received by a portable device, and the portable device then determines its present location. Upon determining the location of the portable device, that information is incorporated into a local language model that is used to process the search query. Finally, the portable device outputs the results of the search query based on the local language model.
Type: Application
Filed: December 15, 2009
Publication date: June 16, 2011
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Enrico Bocchieri, Diamantino Antonio Caseiro
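The effect of folding location into the language model can be shown with a toy rescoring pass: hypotheses naming a business listed near the device get a score boost. The listings, scores, and boost weight below are hypothetical; a real local language model would adjust n-gram probabilities rather than rescore an n-best list.

```python
def local_rescore(hypotheses, device_location, listings, boost=1.5):
    """Prefer recognition hypotheses that match a business listed near the
    device's present location."""
    nearby = set(listings.get(device_location, []))
    return max(hypotheses,
               key=lambda h: h[1] + (boost if h[0] in nearby else 0.0))[0]

# Two acoustically similar business names; location breaks the tie.
listings = {"Austin": ["Joe's Coffee"], "Boston": ["Joes Copies"]}
hyps = [("Joes Copies", 2.0), ("Joe's Coffee", 1.8)]
local_rescore(hyps, "Austin", listings)  # "Joe's Coffee" wins near Austin
```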
-
Publication number: 20110144993
Abstract: A disfluent-utterance tracking system includes a speech transducer; one or more targeted-disfluent-utterance records stored in a memory; a real-time speech recording mechanism operatively connected with the speech transducer for recording a real-time utterance; and an analyzer operatively coupled with the targeted-disfluent-utterance record and with the real-time speech recording mechanism, the analyzer configured to compare one or more real-time snippets of the recorded speech with the targeted-disfluent-utterance record to determine and indicate to a user a level of correlation therebetween.
Type: Application
Filed: December 15, 2009
Publication date: June 16, 2011
Inventor: David Ruby
-
Publication number: 20110131045
Abstract: Systems and methods are provided for receiving speech and non-speech communications of natural language questions and/or commands, transcribing the speech and non-speech communications to textual messages, and executing the questions and/or commands. The invention applies context, prior information, domain knowledge, and user-specific profile data to achieve a natural environment for one or more users presenting questions or commands across multiple domains. The systems and methods create, store, and use extensive personal profile information for each user, thereby improving the reliability of determining the context of the speech and non-speech communications and presenting the expected results for a particular question or command.
Type: Application
Filed: February 2, 2011
Publication date: June 2, 2011
Applicant: VoiceBox Technologies, Inc.
Inventors: Philippe Di Cristo, Min Ke, Robert A. Kennewick, Lynn Elise Armstrong
-
Publication number: 20110131043
Abstract: The present invention enables high-speed recognition even when the grammar includes a large amount of garbage. A first voice recognition processing unit executes a voice recognition process on a voice feature amount of the input voice based on a first grammar, generating a recognition hypothesis graph that indicates the structure of the hypotheses derived according to the first grammar, together with a score associated with each connection of a recognition unit. A second voice recognition processing unit executes a voice recognition process according to a second grammar, which is specified to accept the sections of the input voice other than keywords as garbage sections; it acquires the structure and scores of the garbage sections from the recognition hypothesis graph and outputs the recognition result from the total score of a hypothesis derived according to the second grammar.
Type: Application
Filed: December 22, 2008
Publication date: June 2, 2011
Inventors: Fumihiro Adachi, Ryosuke Isotani, Ken Hanazawa
-
Publication number: 20110093268
Abstract: An apparatus, a method, and a machine-readable medium are provided for characterizing differences between two language models. A group of utterances from each of a group of time domains is examined. One of a significant word change or a significant word class change within the utterances is determined. A first cluster of utterances including a word or a word class corresponding to the significant word change or the significant word class change is generated from the utterances. A second cluster of utterances not including that word or word class is generated from the utterances.
Type: Application
Filed: September 14, 2010
Publication date: April 21, 2011
Applicant: AT&T Intellectual Property II, L.P.
Inventors: Allen Louis Gorin, John Grothendieck, Jeremy Huntley Greet Wright
-
Publication number: 20110093269
Abstract: A speech recognition system receives and analyzes speech input from a user in order to recognize and accept a response from the user. Under certain conditions, information about the response expected from the user may be available. In these situations, the available information about the expected response is used to modify the behavior of the speech recognition system by taking this information into account. The modified behavior of the speech recognition system comprises adjusting the rejection threshold when speech input matches the predetermined expected response.
Type: Application
Filed: December 30, 2010
Publication date: April 21, 2011
Inventors: Keith Braho, Amro El-Jaroudi, Jeffrey Pike
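The threshold-adjustment behavior described above is easy to sketch. The two threshold constants are illustrative, not values from the patent:

```python
# Sketch: relax the rejection threshold when the top hypothesis matches
# the response the dialog currently expects.

BASE_THRESHOLD = 0.70     # illustrative default rejection threshold
RELAXED_THRESHOLD = 0.50  # illustrative relaxed threshold

def accept(hypothesis, confidence, expected=None):
    """Accept a recognition result, using a lower bar for expected responses."""
    if expected is not None and hypothesis == expected:
        threshold = RELAXED_THRESHOLD
    else:
        threshold = BASE_THRESHOLD
    return confidence >= threshold
```

A mid-confidence "yes" that would normally be rejected is accepted when the dialog is expecting "yes".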
-
Publication number: 20110054901
Abstract: A method and apparatus for aligning texts. The method includes acquiring a target text and a reference text and aligning the target text and the reference text at the word level based on phoneme similarity. The method can be applied to automatically archiving a multimedia resource and to automatically searching a multimedia resource.
Type: Application
Filed: August 27, 2010
Publication date: March 3, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Yong Qin, Qin Shi, Zhiwei Shuang, Shi Lei Zhang, Jie Zhou
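Word-level alignment by phoneme similarity can be sketched with a word-pair cost built from phoneme edit distance. The tiny `PHONEMES` lexicon below is a made-up stand-in for a real grapheme-to-phoneme step, and the cost function is only one plausible realization of the abstract's idea:

```python
# Toy word-similarity cost based on phoneme edit distance.

PHONEMES = {  # placeholder ARPAbet-style entries
    "write": "R AY T", "right": "R AY T",
    "red": "R EH D", "read": "R EH D", "cat": "K AE T",
}

def edit_distance(a, b):
    """Classic Levenshtein distance over two sequences."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    return d[m][n]

def word_cost(w1, w2):
    """Alignment cost: edit distance between the words' phoneme strings."""
    p1 = PHONEMES.get(w1, w1).split()
    p2 = PHONEMES.get(w2, w2).split()
    return edit_distance(p1, p2)
```

Note that homophones like "write"/"right" get cost 0, which is exactly what makes phoneme-based alignment robust to transcription spelling differences.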
-
Publication number: 20110044438
Abstract: During voice communication between multiple telecommunications devices, a shareable application facilitates concurrent sharing of data and processes between the devices. The application may be configured to monitor the voice communication and execute a predetermined function upon detecting a predetermined condition in the voice communication. The application may further facilitate sharing of functionality and user interface displays during the voice communication. In some implementations, a server computing device on a communications network may facilitate functions of shareable applications on one or more telecommunications devices.
Type: Application
Filed: August 20, 2009
Publication date: February 24, 2011
Applicant: T-Mobile USA, Inc.
Inventors: Winston Wang, Adam Holt, Jean-Luc Bouthemy, Michael Kemery
-
Publication number: 20110029313
Abstract: Methods are disclosed for identifying possible errors made by a speech recognition system without using a transcript of the words input to the system. A method for model adaptation for a speech recognition system includes determining an error rate, corresponding to either recognition of instances of a word or recognition of instances of various words, without using a transcript of the words input to the system. The method may further include adjusting an adaptation of the model for the word, or of the various models for the various words, based on the error rate. An apparatus for model adaptation is also disclosed, including a processor adapted to estimate such an error rate without using a transcript of the words input to the system.
Type: Application
Filed: October 11, 2010
Publication date: February 3, 2011
Applicant: VOCOLLECT, INC.
Inventors: Keith P. Braho, Jeffrey P. Pike, Lori A. Pike
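One way to estimate an error rate without a transcript is to treat low-confidence recognitions as likely errors. The abstract does not commit to this exact heuristic, so the sketch below is an assumption, with illustrative threshold values:

```python
# Sketch: transcript-free error-rate estimate for one vocabulary word,
# using recognition confidence as a proxy for correctness.

def estimate_error_rate(recognitions, confidence_floor=0.5):
    """recognitions: list of (word, confidence) pairs for one word."""
    if not recognitions:
        return 0.0
    suspect = sum(1 for _, conf in recognitions if conf < confidence_floor)
    return suspect / len(recognitions)

def should_adapt(recognitions, error_threshold=0.2):
    """Trigger model adaptation only when the estimated error rate is high."""
    return estimate_error_rate(recognitions) > error_threshold
```

Gating adaptation on the estimated error rate, as the abstract describes, avoids needlessly re-training models for words the system already recognizes well.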
-
Publication number: 20110010175
Abstract: Provided are a text data processing apparatus, method, and program for adding a symbol at an appropriate position. The apparatus according to this embodiment executes editing of symbols in input text and includes symbol edit determination means 52, which determines whether a symbol edit is necessary based on the frequency of symbol insertion in a block consisting of a plurality of divided texts, and symbol edit position calculation means 53, which calculates the likelihood of the symbol edit based on the likelihood of symbol insertion for a word and the distance between symbols, and which calculates a symbol edit position in the block in accordance with the likelihood of the symbol edit or a word in the block when the symbol edit determination means determines that the symbol edit is necessary.
Type: Application
Filed: February 13, 2009
Publication date: January 13, 2011
Inventors: Tasuku Kitade, Takafumi Koshinaka
-
Publication number: 20110004462
Abstract: Speech recognition may be improved by generating and using a topic-specific language model. A topic-specific language model may be created by performing an initial pass on an audio signal using a generic or basis language model. A speech recognition device may then determine topics relating to the audio signal based on the words identified in the initial pass and retrieve a corpus of text relating to those topics. Using the retrieved corpus of text, the speech recognition device may create a topic-specific language model. In one example, the speech recognition device may adapt or otherwise modify the generic language model based on the retrieved corpus of text.
Type: Application
Filed: July 1, 2009
Publication date: January 6, 2011
Applicant: COMCAST INTERACTIVE MEDIA, LLC
Inventors: David F. Houghton, Seth Michael Murray, Sibley Verbeck Simon
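The topic-detection step of the two-pass scheme above can be sketched by counting salient words from the first-pass output. The stopword list and "top-N most frequent content words" rule are illustrative assumptions, not the patent's actual topic detector:

```python
# Sketch: pick topics from first-pass recognition output by frequency,
# ignoring stopwords; the topics would then drive corpus retrieval and
# language-model adaptation for the second pass.
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "to", "is"}  # illustrative list

def detect_topics(first_pass_words, top_n=2):
    """Return the top-N most frequent non-stopword words as topics."""
    counts = Counter(w for w in first_pass_words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]
```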
-
Publication number: 20100332230
Abstract: Phonetic distances are empirically measured as a function of speech recognition engine recognition error rates. The error rates are determined by comparing a recognized speech file with a reference file. The phonetic distances can be normalized to earlier measurements. The phonetic distances/error rates can also be used to improve speech recognition engine grammar selection, as an aid in language training and evaluation, and in other applications.
Type: Application
Filed: June 25, 2009
Publication date: December 30, 2010
Applicant: ADACEL SYSTEMS, INC.
Inventor: Chang-Qing Shu
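The core idea, that phonemes confused often by the recognizer are phonetically "close", can be sketched from a confusion count table. The counts and the specific distance formula below are fabricated for illustration:

```python
# Sketch: empirical phonetic distance from recognition confusion counts.
# confusions[(x, y)] = number of times reference phoneme x was
# recognized as phoneme y when comparing output against a reference file.

def phonetic_distance(confusions, a, b):
    """1.0 means never confused; lower means more often confused."""
    total = sum(n for (x, _), n in confusions.items() if x == a)
    if total == 0:
        return 1.0  # no data for phoneme a; treat as maximally distant
    confused = confusions.get((a, b), 0)
    return 1.0 - confused / total
```

If "P" is recognized as "B" 30% of the time, its distance to "B" is 0.7, while a never-confused pair stays at 1.0.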
-
Publication number: 20100332231
Abstract: A lexical acquisition apparatus includes: a phoneme recognition section 2 for preparing a phoneme sequence candidate from an inputted speech; a word matching section 3 for preparing a plurality of word sequences based on the phoneme sequence candidate; a discrimination section 4 for selecting, from among the plurality of word sequences, a word sequence having a high likelihood in a recognition result; an acquisition section 5 for acquiring a new word based on the word sequence selected by the discrimination section 4; a teaching word list 4A used to teach a name; and a probability model 4B of the teaching word and an unknown word, wherein the discrimination section 4 calculates, for each word sequence, a first evaluation value showing how well the words in the word sequence correspond to teaching words in the list 4A and a second evaluation value showing the probability at which the words in the word sequence are adjacent to one another, and selects a word sequence for which a sum of the first evaluation value and the …
Type: Application
Filed: June 1, 2010
Publication date: December 30, 2010
Applicants: Honda Motor Co., Ltd., Advanced Telecommunications Research Institute International
Inventors: Mikio Nakano, Takashi Nose, Ryo Taguchi, Kotaro Funakoshi, Naoto Iwahashi
-
Publication number: 20100328066
Abstract: Methods, systems, and articles of manufacture are provided for administering sobriety tests to online gamblers, as well as for determining whether, when, and to whom to administer a sobriety test. Various mediation events to be initiated upon certain results of such sobriety tests are also disclosed.
Type: Application
Filed: June 24, 2010
Publication date: December 30, 2010
Inventors: Jay S. Walker, Zachary T. Smith, Magdalena M. Fincham
-
Publication number: 20100324900
Abstract: A computerized method of detecting a target word in a speech signal. A speech recognition engine and a previously constructed phoneme model are provided. The speech signal is input into the speech recognition engine and, based on the phoneme model, is indexed. A time-ordered list is stored representing the n-best phoneme candidates of the input speech signal and the phonemes of the input speech signal in multiple phoneme frames. The target word is transcribed into a transcription of target phonemes. The time-ordered list of n-best phoneme candidates is searched for a locus of said target phonemes. While searching, scoring is based on the ranking of the phoneme candidates among the n-best phoneme candidates and on the number of the target phonemes found. A composite score of the probability of an occurrence of the target word is produced. When the composite score is higher than a threshold, start and finish times that bound the locus are output.
Type: Application
Filed: June 19, 2009
Publication date: December 23, 2010
Inventors: Ronen Faifkov, Rabin Cohen-Tov, Adam Simone
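The rank-based composite scoring described above can be sketched as follows. The rank-to-credit mapping (1/rank) and the threshold are illustrative choices; the patent's actual scoring is more involved:

```python
# Toy keyword spotting over per-frame n-best phoneme lists: each target
# phoneme earns credit inversely proportional to its rank in that
# frame's candidate list, and the average is compared to a threshold.

def composite_score(target_phonemes, nbest_frames):
    """nbest_frames: one ranked phoneme candidate list per frame."""
    score = 0.0
    for phone, frame in zip(target_phonemes, nbest_frames):
        if phone in frame:
            score += 1.0 / (frame.index(phone) + 1)  # rank-based credit
    return score / max(len(target_phonemes), 1)

def detect(target_phonemes, nbest_frames, threshold=0.5):
    """True when the composite score clears the detection threshold."""
    return composite_score(target_phonemes, nbest_frames) >= threshold
```

For the target "K AE T", two first-rank hits plus one second-rank hit give a score of about 0.83, well above the 0.5 threshold.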
-
Publication number: 20100318356
Abstract: Textual transcription of speech is generated and formatted according to user-specified transformation and behavior requirements for a speech recognition system having input grammars and transformations. An apparatus may include a speech recognition platform configured to receive a user-specified transformation requirement, recognize speech in speech data into recognized speech according to a set of recognition grammars, and apply transformations to the recognized speech according to the user-specified transformation requirement. The apparatus may further be configured to receive a user-specified behavior requirement and transform the recognized speech according to the behavior requirement. Other embodiments are described and claimed.
Type: Application
Filed: June 12, 2009
Publication date: December 16, 2010
Applicant: MICROSOFT CORPORATION
Inventors: Jonathan E. Hamaker, Keith C. Herold
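Applying user-specified transformations to recognized text could look like the sketch below, where each requirement is represented as an ordered regex rewrite rule. The rule format is an assumption; the patent's actual requirement format is not given in the abstract:

```python
# Sketch: apply an ordered list of user-specified rewrite rules
# (pattern, replacement) to recognized speech text.
import re

def apply_transformations(text, rules):
    """Run each regex rule over the text in order and return the result."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text
```

For example, a user might require spoken numbers and currency words to be formatted as symbols in the transcript.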
-
Publication number: 20100292989
Abstract: The invention enables symbol insertion evaluation in consideration of differences in speaking style features between speakers. For a word sequence transcribing voice information, the symbol insertion likelihood calculation means 113 obtains a symbol insertion likelihood for each of a plurality of symbol insertion models supplied for different speaking style features. The speaking style feature similarity calculation means 112 obtains a similarity between the speaking style feature of the word sequence and the plurality of speaking style feature models. The symbol insertion evaluation means 114 weights the symbol insertion likelihood obtained for the word sequence by each of the plurality of symbol insertion models according to that similarity and the relevance between the symbol insertion model and the speaking style feature model, and performs symbol insertion evaluation on the word sequence.
Type: Application
Filed: January 19, 2009
Publication date: November 18, 2010
Inventors: Tasuku Kitade, Takafumi Koshinaka
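The weighting step can be sketched as a similarity-weighted average of the style-specific models' likelihoods. The normalization scheme and all numbers are illustrative assumptions:

```python
# Sketch: combine per-style symbol-insertion likelihoods, weighting each
# model by how similar the speaker's style is to that model's style.

def weighted_insertion_likelihood(model_likelihoods, style_similarities):
    """Both arguments are dicts keyed by style-model name."""
    total_sim = sum(style_similarities.values())
    if total_sim == 0:
        return 0.0
    return sum(model_likelihoods[m] * style_similarities[m] / total_sim
               for m in model_likelihoods)
```

A speaker whose style closely matches the "formal" model is thus scored mostly by the formal model's insertion likelihood.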
-
Publication number: 20100286984
Abstract: A method for the voice recognition of a spoken expression comprising a plurality of expression parts to be recognized. Partial voice recognition takes place on a first selected expression part and, depending on a selection of hits for the first expression part detected by the partial voice recognition, voice recognition on the first and further expression parts is executed.
Type: Application
Filed: June 18, 2008
Publication date: November 11, 2010
Inventors: Michael Wandinger, Jesus Fernando Guitarte Perez, Bernhard Littel
-
Publication number: 20100280828
Abstract: Techniques are described that generally relate to systems, methods, and devices designed to selectively filter offensive communications in accordance with a user's intentions. Example methods may be designed to filter (such as by deleting, blocking, replacing, and/or modifying) various offensive words, phrases, and/or sounds that have been identified as having offensive meanings.
Type: Application
Filed: April 30, 2009
Publication date: November 4, 2010
Inventors: Gene Fein, Edward Merritt
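The delete/block/replace modes described above can be sketched directly. The blocklist entry is a mild placeholder, and the three mode names are illustrative labels, not the patent's terminology:

```python
# Sketch: selectively filter flagged words per the user's chosen policy.

BLOCKLIST = {"darn": "dang"}  # placeholder word -> replacement entries

def filter_text(text, mode="mask"):
    """mode: 'delete' drops the word, 'mask' stars it out,
    anything else substitutes the replacement word."""
    words = []
    for w in text.split():
        if w.lower() in BLOCKLIST:
            if mode == "delete":
                continue
            words.append("*" * len(w) if mode == "mask"
                         else BLOCKLIST[w.lower()])
        else:
            words.append(w)
    return " ".join(words)
```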