Training (epo) Patents (Class 704/E15.008)
-
Patent number: 12242643
Abstract: A method, computer program product, and computing system for receiving an input speech signal. A transcription of the input speech signal may be received. One or more sensitive content portions may be identified from the transcription of the input speech signal. The one or more sensitive content portions from the transcription of the input speech signal may be obscured, thus defining an obscured transcription of the input speech signal. An obscured speech signal may be generated based upon, at least in part, the input speech signal, the transcription of the input speech signal, and the obscured transcription of the input speech signal.
Type: Grant. Filed: June 3, 2022. Date of Patent: March 4, 2025. Assignee: Microsoft Technology Licensing, LLC. Inventors: William F. Ganong, III, Uwe Helmut Jost
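The pipeline this abstract describes (transcribe, locate sensitive spans, obscure them) can be illustrated with a toy redaction step. This is a hypothetical sketch, not the patented method; the pattern list and mask token are invented for illustration.

```python
import re

def obscure_transcription(transcript, patterns, mask="[REDACTED]"):
    """Replace any span matching a sensitive-content pattern with a mask token."""
    for pattern in patterns:
        transcript = re.sub(pattern, mask, transcript)
    return transcript

# Hypothetical sensitive-content patterns: SSN-like and card-like digit runs.
SENSITIVE = [r"\b\d{3}-\d{2}-\d{4}\b", r"\b(?:\d{4}[ -]?){3}\d{4}\b"]

print(obscure_transcription("my ssn is 123-45-6789 thanks", SENSITIVE))
# In the patented pipeline, the obscured transcription would then drive
# generation of an obscured *speech* signal; that step is omitted here.
```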
-
Patent number: 12230261
Abstract: A method for controlling an electronic device is provided. The method includes identifying one or more user interface (UI) elements displayed on a screen of an electronic device; determining at least one characteristic of the one or more identified UI elements; generating a database based on the at least one characteristic, where the database is used to predict natural language (NL) utterances for the one or more identified UI elements, the NL utterances being predicted based on the at least one characteristic; receiving a voice input of a user of the electronic device, where the voice input comprises an utterance indicative of the at least one characteristic of a UI element present in the database; and automatically accessing one or more of the UI elements in response to determining that the utterance of the received voice input matches the predicted NL utterances of the one or more identified UI elements.
Type: Grant. Filed: October 27, 2021. Date of Patent: February 18, 2025. Assignee: Samsung Electronics Co., Ltd. Inventors: Ranjan Kumar Samal, Praveen Kumar Guvvakallu Sivamoorthy, Purushothama Chowdari Gonuguntla, Rituraj Laxminarayan Kabra, Manjunath Belgod Lokanath
-
Publication number: 20130332158
Abstract: The technology of the present application provides a speech recognition system with at least two different speech recognition engines, or a single speech recognition engine with at least two different modes of operation. The first engine or mode is used to match audio to text, which text may be words or phrases. The matched audio and text are used by a training module to train a user profile for a natural language speech recognition engine, which is at least one of the two different engines or modes. An evaluation module evaluates when the user profile is sufficiently trained to convert the system from the first speech recognition engine or mode to the natural language speech recognition engine or mode.
Type: Application. Filed: June 8, 2012. Publication date: December 12, 2013. Applicant: NVOQ INCORPORATED. Inventors: Charles Corfield, Brian Marquette
-
Publication number: 20130262106
Abstract: A system and method for adapting a language model to a specific environment by receiving interactions captured in the specific environment, generating a collection of documents from documents retrieved from external resources, detecting in the collection of documents terms related to the environment that are not included in an initial language model, and adapting the initial language model to include the detected terms.
Type: Application. Filed: March 29, 2012. Publication date: October 3, 2013. Inventors: Eyal Hurvitz, Ezra Daya, Oren Pereg, Moshe Wasserblat
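At its simplest, the adaptation step described here (detect domain terms absent from the initial model, then add them) reduces to a frequency-filtered vocabulary diff. A minimal sketch under that simplification; the whitespace tokenizer and `min_count` filter are assumptions, not the patented procedure.

```python
def adapt_vocabulary(initial_vocab, domain_documents, min_count=2):
    """Return the initial vocabulary extended with domain terms that occur
    at least `min_count` times in the documents but are missing from it."""
    counts = {}
    for doc in domain_documents:
        for term in doc.lower().split():
            counts[term] = counts.get(term, 0) + 1
    new_terms = {t for t, c in counts.items()
                 if c >= min_count and t not in initial_vocab}
    return set(initial_vocab) | new_terms

vocab = adapt_vocabulary({"call", "the", "agent"},
                         ["escalate the ticket", "escalate the case"])
# 'escalate' occurs twice and is missing, so it is added; 'ticket'
# occurs only once and is filtered out.
```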
-
Publication number: 20130262114
Abstract: Different advantageous embodiments provide a crowdsourcing method for modeling user intent in conversational interfaces. One or more stimuli are presented to a plurality of describers. One or more sets of describer data are captured from the plurality of describers using a data collection mechanism. The one or more sets of describer data are processed to generate one or more models. Each of the one or more models is associated with a specific stimulus from the one or more stimuli.
Type: Application. Filed: April 3, 2012. Publication date: October 3, 2013. Applicant: MICROSOFT CORPORATION. Inventors: Christopher John Brockett, Piali Choudhury, William Brennan Dolan, Yun-Cheng Ju, Patrick Pantel, Noelle Mallory Sophy, Svitlana Volkova
-
Publication number: 20130132077
Abstract: Systems and methods for semi-supervised source separation using non-negative techniques are described. In some embodiments, various techniques disclosed herein may enable the separation of signals present within a mixture, where one or more of the signals may be emitted by one or more different sources. In audio-related applications, for instance, a signal mixture may include speech (e.g., from a human speaker) and noise (e.g., background noise). In some cases, speech may be separated from noise using a speech model developed from training data. A noise model may be created, for example, during the separation process (e.g., “on-the-fly”) and in the absence of corresponding training data.
Type: Application. Filed: May 27, 2011. Publication date: May 23, 2013. Inventors: Gautham J. Mysore, Paris Smaragdis
-
Publication number: 20130006633
Abstract: Techniques are provided to recognize a speaker's voice. In one embodiment, received audio data may be separated into a plurality of signals. For each signal, the signal may be associated with values for one or more features (e.g., Mel-frequency cepstral coefficients). The received data may be clustered (e.g., by clustering features associated with the signals). A predominant voice cluster may be identified and associated with a user. A speech model (e.g., a Gaussian mixture model or hidden Markov model) may be trained based on data associated with the predominant cluster. A received audio signal may then be processed using the speech model to, e.g.: determine who was speaking; determine whether the user was speaking; determine whether anyone was speaking; and/or determine what words were said. A context of the device or the user may then be inferred based at least partly on the processed signal.
Type: Application. Filed: January 5, 2012. Publication date: January 3, 2013. Applicant: QUALCOMM Incorporated. Inventors: Leonard Henry Grokop, Vidya Narayanan
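The "cluster frames, pick the predominant cluster, train a model on it" flow can be sketched in miniature. This is an illustrative toy, assuming one-dimensional features, a caller-supplied assignment function in place of real clustering, and a mean vector in place of GMM/HMM training.

```python
def predominant_cluster(frames, assign):
    """Group feature frames by a cluster-assignment function and return
    (cluster_id, member_frames) for the most populated cluster."""
    clusters = {}
    for f in frames:
        clusters.setdefault(assign(f), []).append(f)
    return max(clusters.items(), key=lambda kv: len(kv[1]))

def train_mean_model(frames):
    """Stand-in for GMM training: the mean vector of the cluster's frames."""
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]

# Toy 1-D "features"; the assignment simply thresholds the coefficient.
frames = [[0.1], [0.2], [0.15], [5.0], [5.2]]
cid, members = predominant_cluster(frames, lambda f: f[0] > 1.0)
model = train_mean_model(members)  # model for the predominant (low) cluster
```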
-
Publication number: 20130006630
Abstract: A state detecting apparatus includes: a processor to execute acquiring utterance data related to uttered speech, computing a plurality of statistical quantities for feature parameters regarding features of the utterance data, creating, on the basis of the plurality of statistical quantities regarding the utterance data and another plurality of statistical quantities regarding reference utterance data based on other uttered speech, pseudo-utterance data having at least one statistical quantity equal to a statistical quantity in the other plurality of statistical quantities, computing a plurality of statistical quantities for synthetic utterance data synthesized on the basis of the pseudo-utterance data and the utterance data, and determining, on the basis of a comparison between statistical quantities of the synthetic utterance data and statistical quantities of the reference utterance data, whether the speaker who produced the uttered speech is in a first state or a second state; and a memory.
Type: Application. Filed: April 13, 2012. Publication date: January 3, 2013. Applicant: FUJITSU LIMITED. Inventors: Shoji Hayakawa, Naoshi Matsuo
-
Publication number: 20130006632
Abstract: The invention involves the loading and unloading of dynamic section grammars and language models in a speech recognition system. The values of the sections of the structured document are either determined in advance from a collection of documents of the same domain, document type, and speaker; or collected incrementally from documents of the same domain, document type, and speaker; or added incrementally to an already existing set of values. Speech recognition in the context of the given field is constrained to the contents of these dynamic values. If speech recognition fails or produces a poor match within this grammar or section language model, speech recognition against a larger, more general vocabulary that is not constrained to the given section is performed.
Type: Application. Filed: September 12, 2012. Publication date: January 3, 2013. Inventors: Alwin B. Carus, Larissa Lapshina, Raghu Vemula
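The constrained-then-fallback strategy above can be mimicked with string similarity standing in for acoustic scoring. A hedged sketch: the use of `difflib` ratios and the 0.8 threshold are invented here, not taken from the patent.

```python
import difflib

def recognize(utterance, section_values, general_vocab, threshold=0.8):
    """Match first against the section's dynamic values; if no candidate
    scores above the threshold, fall back to the general vocabulary."""
    def best(candidates):
        scored = [(difflib.SequenceMatcher(None, utterance, c).ratio(), c)
                  for c in candidates]
        return max(scored)

    score, value = best(section_values)
    if score >= threshold:
        return value            # constrained recognition succeeded
    return best(general_vocab)[1]  # poor match: widen to general vocabulary
```

Usage: `recognize("aspirin", ["aspirin", "ibuprofen"], general_vocab)` resolves within the medication section, while an utterance unrelated to the section falls through to the general vocabulary.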
-
Publication number: 20120290302
Abstract: A Chinese speech recognition system and method is disclosed. Firstly, a speech signal is received and recognized to output a word lattice. Next, the word lattice is received, and word arcs of the word lattice are rescored and reranked with a prosodic break model, a prosodic state model, a syllable prosodic-acoustic model, a syllable-juncture prosodic-acoustic model and a factored language model, so as to output a language tag, a prosodic tag and a phonetic segmentation tag, which correspond to the speech signal. The present invention performs rescoring in two stages to improve the recognition rate of basic speech information, and labels the language tag, prosodic tag and phonetic segmentation tag to provide prosodic structure and language information for rear-stage voice conversion and voice synthesis.
Type: Application. Filed: April 13, 2012. Publication date: November 15, 2012. Inventors: Jyh-Her Yang, Chen-Yu Chiang, Ming-Chieh Liu, Yih-Ru Wang, Yuan-Fu Liao, Sin-Horng Chen
-
Publication number: 20120245939
Abstract: A speech recognition system receives and analyzes speech input from a user in order to recognize and accept a response from the user. Under certain conditions, information about the response expected from the user may be available. In these situations, the available information about the expected response is used to modify the behavior of the speech recognition system by taking this information into account. The modified behavior of the speech recognition system according to the invention has several embodiments including: comparing the observed speech features to the models of the expected response separately from the usual hypothesis search in order to speed up the recognition system; modifying the usual hypothesis search to emphasize the expected response; updating and adapting the models when the recognized speech matches the expected response to improve the accuracy of the recognition system.
Type: Application. Filed: June 8, 2012. Publication date: September 27, 2012. Inventors: Keith Braho, Amro El-Jaroudi
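The second embodiment listed (emphasizing the expected response in the hypothesis search) can be reduced to a score bias at rescoring time. A minimal sketch under that simplification; the fixed additive `boost` on log scores is an assumption for illustration.

```python
def rescore(hypotheses, expected, boost=0.2):
    """Add a fixed log-score boost to any hypothesis matching the expected
    response, then return the best-scoring hypothesis."""
    rescored = {h: s + (boost if h == expected else 0.0)
                for h, s in hypotheses.items()}
    return max(rescored, key=rescored.get)

# The acoustic scores slightly prefer "fifteen", but the workflow expects
# "fifty", so the boost tips the decision toward the expectation.
best = rescore({"fifteen": -1.0, "fifty": -1.1}, expected="fifty")
```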
-
Publication number: 20120232902
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating an acoustic model for use in speech recognition. A system configured to practice the method first receives training data and identifies non-contextual lexical-level features in the training data. Then the system infers sentence-level features from the training data and generates a set of decision trees by node-splitting based on the non-contextual lexical-level features and the sentence-level features. The system decorrelates training vectors, based on the training data, for each decision tree in the set of decision trees to approximate full-covariance Gaussian models, and then can train an acoustic model for use in speech recognition based on the training data, the set of decision trees, and the training vectors.
Type: Application. Filed: March 8, 2011. Publication date: September 13, 2012. Applicant: AT&T Intellectual Property I, L.P. Inventors: Enrico Bocchieri, Diamantino Antonio Caseiro, Dimitrios Dimitriadis
-
Publication number: 20120221333
Abstract: Techniques are disclosed for using phonetic features for speech recognition. For example, a method comprises the steps of obtaining a first dictionary and a training data set associated with a speech recognition system, computing one or more support parameters from the training data set, transforming the first dictionary into a second dictionary, wherein the second dictionary is a function of one or more phonetic labels of the first dictionary, and using the one or more support parameters to select one or more samples from the second dictionary to create a set of one or more exemplar-based class identification features for a pattern recognition task.
Type: Application. Filed: February 24, 2011. Publication date: August 30, 2012. Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION. Inventors: Dimitri Kanevsky, David Nahamoo, Bhuvana Ramabhadran, Tara N. Sainath
-
Publication number: 20120101820
Abstract: A method is disclosed for applying a multi-state barge-in acoustic model in a spoken dialogue system. The method includes receiving an audio speech input from the user during the presentation of a prompt, accumulating the audio speech input from the user, applying a non-speech component having at least two one-state Hidden Markov Models (HMMs) to the audio speech input from the user, applying a speech component having at least five three-state HMMs to the audio speech input from the user, in which each of the five three-state HMMs represents a different phonetic category, determining whether the audio speech input is a barge-in-speech input from the user, and if the audio speech input is determined to be the barge-in-speech input from the user, terminating the presentation of the prompt.
Type: Application. Filed: October 24, 2011. Publication date: April 26, 2012. Applicant: AT&T Intellectual Property I, L.P. Inventor: Andrej Ljolje
-
Publication number: 20120059654
Abstract: An objective is to provide a technique for accurately reproducing features of a fundamental frequency of a target-speaker's voice on the basis of only a small amount of learning data. A learning apparatus learns shift amounts from a reference source F0 pattern to a target F0 pattern of a target-speaker's voice. The learning apparatus associates a source F0 pattern of a learning text to a target F0 pattern of the same learning text by associating their peaks and troughs. For each of points on the target F0 pattern, the learning apparatus obtains shift amounts in a time-axis direction and in a frequency-axis direction from a corresponding point on the source F0 pattern in reference to a result of the association, and learns a decision tree using, as an input feature vector, linguistic information obtained by parsing the learning text, and using, as an output feature vector, the calculated shift amounts.
Type: Application. Filed: March 16, 2010. Publication date: March 8, 2012. Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION. Inventors: Masafumi Nishimura, Ryuki Tachibana
-
Publication number: 20120035928
Abstract: A phonetic vocabulary for a speech recognition system is adapted to a particular speaker's pronunciation. A speaker can be attributed specific pronunciation styles, which can be identified from specific pronunciation examples. Consequently, a phonetic vocabulary can be reduced in size, which can improve recognition accuracy and recognition speed.
Type: Application. Filed: October 13, 2011. Publication date: February 9, 2012. Applicant: Nuance Communications, Inc. Inventors: Nitendra Rajput, Ashish Verma
-
Publication number: 20120022869
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.
Type: Application. Filed: September 30, 2011. Publication date: January 26, 2012. Applicant: GOOGLE, INC. Inventors: Matthew I. Lloyd, Trausti Kristjansson
-
Publication number: 20120010885
Abstract: A system and method is provided for combining active and unsupervised learning for automatic speech recognition. This process enables a reduction in the amount of human supervision required for training acoustic and language models and an increase in the performance given the transcribed and un-transcribed data.
Type: Application. Filed: September 19, 2011. Publication date: January 12, 2012. Applicant: AT&T Intellectual Property II, L.P. Inventors: Dilek Zeynep Hakkani-Tür, Giuseppe Riccardi
-
Publication number: 20110144973
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for a speech recognition application for directory assistance that is based on a user's spoken search query. The spoken search query is received by a portable device, and the portable device then determines its present location. Upon determining the location of the portable device, that information is incorporated into a local language model that is used to process the search query. Finally, the portable device outputs the results of the search query based on the local language model.
Type: Application. Filed: December 15, 2009. Publication date: June 16, 2011. Applicant: AT&T Intellectual Property I, L.P. Inventors: Enrico Bocchieri, Diamantino Antonio Caseiro
-
Publication number: 20110144992
Abstract: Described is a technology for performing unsupervised learning using global features extracted from unlabeled examples. The unsupervised learning process may be used to train a log-linear model, such as for use in morphological segmentation of words. For example, segmentations of the examples are sampled based upon the global features to produce a segmented corpus and log-linear model, which are then iteratively reprocessed to produce a final segmented corpus and a log-linear model.
Type: Application. Filed: December 15, 2009. Publication date: June 16, 2011. Applicant: Microsoft Corporation. Inventors: Kristina N. Toutanova, Colin Andrew Cherry, Hoifung Poon
-
Publication number: 20110144991
Abstract: Methods for compressing a transform associated with a feature space are presented. For example, a method for compressing a transform associated with a feature space includes obtaining the transform including a plurality of transform parameters, assigning each of a plurality of quantization levels for the plurality of transform parameters to one of a plurality of quantization values, and assigning each of the plurality of transform parameters to one of the plurality of quantization values to which one of the plurality of quantization levels is assigned. One or more of obtaining the transform, assigning of each of the plurality of quantization levels, and assigning of each of the transform parameters are implemented as instruction code executed on a processor device. Further, a Viterbi algorithm may be employed for use in non-uniform level/value assignments.
Type: Application. Filed: December 11, 2009. Publication date: June 16, 2011. Applicant: International Business Machines Corporation. Inventors: Petr Fousek, Vaibhava Goel, Etienne Marcheret, Peder Andreas Olsen
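The core compression idea (map each transform parameter to one of a small set of quantization values, storing only indices) can be sketched with a nearest-value assignment. This is an illustrative uniform-assignment toy; the patent's Viterbi-based non-uniform assignment is not reproduced here.

```python
def quantize(params, qvalues):
    """Assign each transform parameter to its nearest quantization value,
    returning (indices, reconstruction) so only indices need be stored."""
    indices = [min(range(len(qvalues)), key=lambda i: abs(qvalues[i] - p))
               for p in params]
    return indices, [qvalues[i] for i in indices]

# Three parameters compressed against a 3-value codebook.
idx, rec = quantize([0.12, 0.48, 0.91], [0.0, 0.5, 1.0])
```

With k quantization values, each parameter costs only ceil(log2(k)) bits plus the shared codebook, which is the source of the compression.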
-
Publication number: 20110035216
Abstract: The invention can recognize several languages at the same time without using samples. The important skill is that features of known words in any language are extracted from unknown words or continuous voices. These unknown words, represented by matrices, are spread in the 144-dimensional space. The feature of a known word of any language, represented by a matrix, is simulated by the surrounding unknown words. The invention includes 12 elastic frames of equal length, without filter and without overlap, to normalize the signal waveform of variable length for a word, which has one to several syllables, into a 12×12 matrix as a feature of the word. The invention can improve the feature such that the speech recognition of an unknown sentence is correct. The invention can correctly recognize any language without samples, such as English, Chinese, German, French, Japanese, Korean, Russian, Cantonese, Taiwanese, etc.
Type: Application. Filed: August 5, 2009. Publication date: February 10, 2011. Inventors: Tze Fen Li, Tai-Jan Lee Li, Shih-Tzung Li, Shih-Hon Li, Li-Chuan Liao
-
Publication number: 20100268536
Abstract: A method and apparatus for continuously improving the performance of semantic classifiers in the scope of spoken dialog systems are disclosed. Rule-based or statistical classifiers are replaced with better performing rule-based or statistical classifiers and/or certain parameters of existing classifiers are modified. The replacement classifiers or new parameters are trained and tested on a collection of transcriptions and annotations of utterances which are generated manually or in a partially automated fashion. Automated quality assurance leads to more accurate training and testing data, higher classification performance, and feedback into the design of the spoken dialog system by suggesting changes to improve system behavior.
Type: Application. Filed: April 17, 2009. Publication date: October 21, 2010. Inventors: David Suendermann, Keelan Evanini, Jackson Liscombe, Krishna Dayanidhi, Roberto Pieraccini
-
Publication number: 20100204988
Abstract: A speech recognition method includes receiving a speech input signal in a first noise environment which includes a sequence of observations, determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model, adapting the model trained in a second noise environment to that of the first environment, wherein adapting the model trained in the second environment to that of the first environment includes using second order or higher order Taylor expansion coefficients derived for a group of probability distributions and the same expansion coefficient is used for the whole group.
Type: Application. Filed: April 20, 2010. Publication date: August 12, 2010. Inventors: Haitian Xu, Kean Kheong Chin
-
Publication number: 20100161331
Abstract: In many application environments, it is desirable to provide voice access to tables on Internet pages, where the user asks a subject-related question in a natural language and receives an adequate answer from the table read out to him in a natural language. A method is disclosed for preparing information presented in a tabular form for a speech dialogue system so that the information of the table can be consulted in a user dialogue in a targeted manner.
Type: Application. Filed: October 25, 2006. Publication date: June 24, 2010. Applicant: Siemens Aktiengesellschaft. Inventors: Hans-Ulrich Block, Manfred Gehrke, Stefanie Schachchti
-
Publication number: 20100161332
Abstract: A method and apparatus are provided that use narrowband data and wideband data to train a wideband acoustic model.
Type: Application. Filed: March 8, 2010. Publication date: June 24, 2010. Applicant: MICROSOFT CORPORATION. Inventors: Michael L. Seltzer, Alejandro Acero
-
Publication number: 20100153109
Abstract: Machine-readable media, methods, apparatus and system for speech segmentation are described. In some embodiments, a fuzzy rule may be determined to discriminate a speech segment from a non-speech segment. An antecedent of the fuzzy rule may include an input variable and an input variable membership. A consequent of the fuzzy rule may include an output variable and an output variable membership. An instance of the input variable may be extracted from a segment. An input variable membership function associated with the input variable membership and an output variable membership function associated with the output variable membership may be trained. The instance of the input variable, the input variable membership function, the output variable, and the output variable membership function may be operated, to determine whether the segment is the speech segment or the non-speech segment.
Type: Application. Filed: December 27, 2006. Publication date: June 17, 2010. Inventors: Robert Du, Ye Tao, Daren Zu
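A single fuzzy rule of the kind described ("IF energy is high THEN segment is speech") can be evaluated with triangular membership functions. A hedged sketch: the energy feature, the membership parameters, and the comparison-based defuzzification are all invented for illustration, not taken from the publication.

```python
def triangular(x, a, b, c):
    """Triangular membership function peaking at b on support (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def is_speech(energy):
    """Toy rule base: compare membership in 'high energy' (speech) against
    membership in 'low energy' (non-speech). Parameters are hypothetical."""
    speech_degree = triangular(energy, 0.3, 1.0, 1.7)     # antecedent: high
    non_speech_degree = triangular(energy, -0.7, 0.0, 0.7)  # antecedent: low
    return speech_degree > non_speech_degree

print(is_speech(0.9), is_speech(0.1))
```

In the publication these membership functions are trained from data rather than fixed by hand.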
-
Publication number: 20100094629
Abstract: A weighting factor learning system includes an audio recognition section that recognizes learning audio data and outputs the recognition result; a weighting factor updating section that updates a weighting factor applied to a score obtained from an acoustic model and a language model so that the difference between a correct-answer score calculated with the use of a correct-answer text of the learning audio data and a score of the recognition result becomes large; a convergence determination section that determines, with the use of the score after updating, whether to return to the weighting factor updating section to update the weighting factor again; and a weighting factor convergence determination section that determines, with the use of the score after updating, whether to return to the audio recognition section to perform the process again and update the weighting factor using the weighting factor updating section.
Type: Application. Filed: February 19, 2008. Publication date: April 15, 2010. Inventors: Tadashi Emori, Yoshifumi Onishi
-
Publication number: 20100088088
Abstract: An automated emotional recognition system is adapted to determine emotional states of a speaker based on the analysis of a speech signal. The emotional recognition system includes at least one server function and at least one client function in communication with the at least one server function for receiving assistance in determining the emotional states of the speaker. The at least one client function includes an emotional features calculator adapted to receive the speech signal and to extract therefrom a set of speech features indicative of the emotional state of the speaker. The emotional state recognition system further includes at least one emotional state decider adapted to determine the emotional state of the speaker exploiting the set of speech features based on a decision model. The server function includes at least a decision model trainer adapted to update the selected decision model according to the speech signal.
Type: Application. Filed: January 31, 2007. Publication date: April 8, 2010. Inventors: Gianmario Bollano, Donato Ettorre, Antonio Esiliato
-
Publication number: 20100057453
Abstract: Discrimination between at least two classes of events in an input signal is carried out in the following way. A set of frames containing an input signal is received, and at least two different feature vectors are determined for each of said frames. Said at least two different feature vectors are classified using respective sets of preclassifiers trained for said at least two classes of events. Values for at least one weighting factor are determined based on outputs of said preclassifiers for each of said frames. A combined feature vector is calculated for each of said frames by applying said at least one weighting factor to said at least two different feature vectors. Said combined feature vector is classified using a set of classifiers trained for said at least two classes of events.
Type: Application. Filed: November 16, 2006. Publication date: March 4, 2010. Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION. Inventor: Zica Valsan
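The per-frame combination step (apply a weighting factor to two feature vectors to form one combined vector) is a convex blend. A minimal sketch; in the publication the weight comes from preclassifier outputs, whereas here it is passed in directly.

```python
def combine(vec_a, vec_b, weight):
    """Weighted per-frame combination of two feature vectors. `weight`
    would be derived from preclassifier confidences: 1.0 keeps only the
    first feature stream, 0.0 keeps only the second."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(vec_a, vec_b)]

# A frame where the preclassifiers favour the first feature stream 3:1.
combined = combine([1.0, 0.0], [0.0, 1.0], weight=0.75)
```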
-
Publication number: 20100042404
Abstract: A method of generating a natural language understanding (NLU) model for use in a spoken dialog system is disclosed. The method comprises using sample utterances and creating a number of hand-crafted rules for each call type defined in a labeling guide. A first NLU model is generated and tested using the hand-crafted rules and sample utterances. A second NLU model is built using the sample utterances as new training data and using the hand-crafted rules. The second NLU model is tested for performance using a first batch of labeled data. A series of NLU models is built by adding a previous batch of labeled data to the training data and using a new batch of labeling data as test data, generating the series of NLU models with constantly increasing training data. If not all the labeling data has been received, the method comprises repeating the step of building a series of NLU models until all labeling data is received.
Type: Application. Filed: October 20, 2009. Publication date: February 18, 2010. Applicant: AT&T Corp. Inventors: Narendra K. Gupta, Mazin G. Rahim, Gokhan Tur, Antony Van der Mude
-
Publication number: 20090259469
Abstract: A method and apparatus for performing speech recognition receives an audio signal, generates a sequence of frames of the audio signal, transforms each frame of the audio signal into a set of narrow band feature vectors using a narrow passband, couples the narrow band feature vectors to a speech model, and determines whether the audio signal is a wide band signal. When the audio signal is determined to be a wide band signal, a pass band parameter of each of one or more passbands that are outside the narrow passband is generated for each frame and the one or more band energy parameters are coupled to the speech model.
Type: Application. Filed: April 14, 2008. Publication date: October 15, 2009. Applicant: MOTOROLA, INC. Inventors: Changxue Ma, Yuan-Jun Wei
-
Publication number: 20090132249
Abstract: A modifying method for a speech model and a modifying module thereof are provided. The modifying method is as follows. First, a correct sequence of a speech is generated according to a correct sequence generating method and the speech model. Next, a candidate sequence generating method is selected from a plurality of candidate sequence generating methods, and a candidate sequence of the speech is generated according to the selected candidate sequence generating method and the speech model. Finally, the speech model is modified according to the correct sequence and the candidate sequence. Therefore, the present invention increases the discrimination of the speech model.
Type: Application. Filed: January 10, 2008. Publication date: May 21, 2009. Applicant: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE. Inventors: Jia-Jang Tu, Yuan-Fu Liao
-
Publication number: 20090063145
Abstract: Combined active and semi-supervised learning to reduce an amount of manual labeling when training a spoken language understanding model classifier. The classifier may be trained with human-labeled utterance data. Ones of a group of unselected utterance data may be selected for manual labeling via active learning. The classifier may be changed, via semi-supervised learning, based on the selected ones of the unselected utterance data.
Type: Application. Filed: January 12, 2005. Publication date: March 5, 2009. Applicant: AT&T Corp. Inventors: Dilek Z. Hakkani-Tur, Robert Elias Schapire, Gokhan Tur
-
Publication number: 20090043576
Abstract: Systems and methods for improving the performance of a speech recognition system. In some embodiments a tuner module and/or a tester module are configured to cooperate with a speech recognition system. The tester and tuner modules can be configured to cooperate with each other. In one embodiment, the tuner module may include a module for playing back a selected portion of a digital data audio file, a module for creating and/or editing a transcript of the selected portion, and/or a module for displaying information associated with a decoding of the selected portion, the decoding generated by a speech recognition engine. In other embodiments, the tester module can include an editor for creating and/or modifying a grammar, a module for receiving a selected portion of a digital audio file and its corresponding transcript, and a scoring module for producing scoring statistics of the decoding based at least in part on the transcript.
Type: Application. Filed: October 21, 2008. Publication date: February 12, 2009. Applicant: LumenVox, LLC. Inventors: Edward S. Miller, James F. Blake, II, Keith C. Herold, Michael D. Bergman, Kyle N. Danielson, Alexandra L. Auckland
-
Publication number: 20080319746
Abstract: A keyword analysis device obtains word vectors represented by the documents by analyzing keywords contained in each of the documents input in a designated period. A topic cluster extraction device extracts topic clusters belonging to the same topic from a plurality of documents. A keyword extraction device extracts, as a characteristic keyword group, a predetermined number of keywords from the topic cluster in descending order of appearance frequency. A topic structurization determination device determines whether the topic can be structurized, by segmenting the topic cluster into subtopic clusters with reference to the number of documents, the variance of dates contained in the documents, or the C-value of keywords contained in the documents, as a determination criterion. Finally, a keyword presentation device presents the characteristic keyword group in the subtopic cluster, arranging the keyword group on the basis of the date information.
Type: Application. Filed: March 25, 2008. Publication date: December 25, 2008. Inventors: Masayuki Okamoto, Masaaki Kikuchi, Kazuyuki Goto
-
Publication number: 20080195387. Abstract: A method and apparatus for determining whether a speaker uttering an utterance belongs to a predetermined set of known speakers, where a training utterance is available for each known speaker. The method and apparatus test whether features extracted from the tested utterance yield a score exceeding a threshold when matched against one or more models constructed from voice samples of each known speaker. The method and apparatus further provide optional enhancements such as determining, using, and updating model normalization parameters, a fast scoring algorithm, summed-call handling, and quality evaluation of the tested utterance. Type: Application. Filed: October 19, 2006. Publication date: August 14, 2008. Applicant: NICE SYSTEMS LTD. Inventors: Yaniv Zigel, Moshe Wasserblat
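A minimal sketch of the accept/reject test this abstract describes: score the tested utterance against each known speaker's model, apply score normalization (here a simple z-norm with impostor statistics), and match if any normalized score exceeds a threshold. The scoring function below (negative mean absolute difference of feature vectors) is a placeholder assumption, not the patent's model:

```python
def raw_score(model: list[float], features: list[float]) -> float:
    """Placeholder similarity: higher (closer to 0) means a better match."""
    return -sum(abs(m - f) for m, f in zip(model, features)) / len(model)

def is_known_speaker(models: list[list[float]], features: list[float],
                     norm_mean: float, norm_std: float,
                     threshold: float) -> bool:
    """True if any z-normalized model score exceeds the threshold."""
    for model in models:
        z = (raw_score(model, features) - norm_mean) / norm_std
        if z > threshold:
            return True
    return False
```

The normalization parameters (norm_mean, norm_std) would be estimated from impostor trials and, per the abstract's optional enhancement, updated over time.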
-
Publication number: 20080167873. Abstract: A method for indicating the pronunciation of English letters (alphas) using marks placed at different positions around a letter, comprising the steps of: dividing the area around a letter into six sections; indicating short, long, and strong sounds with points, lines, and slashes; placing a short line (at various angles) or a point on a letter to indicate that it takes the pronunciation of another letter; using underlines to indicate the long and short sounds of the phonetic symbols of a double-letter set; using a delete line to indicate that a letter is not pronounced; using a space to divide the syllables of a word; using a vertical cut line to indicate that one letter is pronounced with two sounds; marking an original-sound line above the first stroke to indicate that the letter is pronounced with its original sound; and placing a “?” under a double-letter set to indicate that it is pronounced with a reverse sound. Type: Application. Filed: January 8, 2007. Publication date: July 10, 2008. Inventor: Wei-Chou Su
-
Publication number: 20080147404. Abstract: Speech that may be colored by accent is processed. A method for recognizing speech includes maintaining a model of speech accent established from training speech data, where the training speech data includes at least a first set of training speech data, and where establishing the accent model does not use any phone or phone-class transcription of that first set. A system for recognizing speech includes an accent identification module configured to identify the accent of the speech to be recognized, and a recognizer configured to use models to recognize that speech, where the models include at least an acoustic model that has been adapted for the identified accent using training speech data of a language, other than the primary language of the speech to be recognized, that is associated with the identified accent. Related methods and systems are also presented. Type: Application. Filed: May 15, 2001. Publication date: June 19, 2008. Applicant: NuSuara Technologies SDN BHD. Inventors: Wai Kat Liu, Pascale Fung
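The two-stage flow described here (identify the accent, then recognize with an accent-adapted acoustic model, falling back to a baseline) can be sketched as a simple dispatcher. All class, method, and model names below are illustrative assumptions; the accent classifier is stubbed out:

```python
class AccentAwareRecognizer:
    def __init__(self, default_model: str, adapted_models: dict[str, str]):
        self.default_model = default_model
        self.adapted_models = adapted_models  # accent label -> model id

    def identify_accent(self, audio: dict) -> str:
        # Placeholder: a real accent identification module would
        # classify the audio signal itself.
        return audio.get("accent", "unknown")

    def choose_model(self, audio: dict) -> str:
        """Pick the accent-adapted acoustic model when one exists,
        otherwise fall back to the baseline model."""
        accent = self.identify_accent(audio)
        return self.adapted_models.get(accent, self.default_model)
```

The key design point from the abstract is that each adapted model is trained with speech from a language associated with the accent, rather than requiring phone-level transcriptions of accented speech in the target language.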
-
Publication number: 20080126089. Abstract: Efficient empirical determination, computation, and use of an acoustic confusability measure comprises: (1) an empirically derived acoustic confusability measure, comprising a means for determining the acoustic confusability between any two textual phrases in a given language, where the measure is derived empirically from examples of the application of a specific speech recognition technology, where the procedure does not require access to the internal computational models of the speech recognition technology and does not depend on any particular internal structure or modeling technique, and where the procedure is based on iterative improvement from an initial estimate; (2) techniques for efficient computation of the empirically derived acoustic confusability measure, comprising means for efficient application of an acoustic confusability score, allowing practical application to very large-scale problems; and (3) a method for using acoustic confusability measures to make principled … Type: Application. Filed: October 31, 2007. Publication date: May 29, 2008. Inventors: Harry Printz, Narren Chittar
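The patent's measure is derived empirically from recognizer output, which cannot be reproduced here. As a stand-in to make the idea concrete, this sketch scores the confusability of two phrases as normalized edit-distance similarity over their character sequences, a crude proxy for comparing phone sequences; the function name, scale, and proxy are all assumptions:

```python
def confusability(phrase_a: str, phrase_b: str) -> float:
    """Return a score in [0, 1]: 1.0 for identical phrases, 0.0 for
    maximally different ones, via normalized Levenshtein distance."""
    a, b = phrase_a.lower(), phrase_b.lower()
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i
    for j in range(len(b) + 1):
        dp[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1,
                           dp[i - 1][j - 1] + cost)
    dist = dp[len(a)][len(b)]
    return 1.0 - dist / max(len(a), len(b), 1)
```

A character-level proxy misses true acoustic confusions (it scores "write" and "right" as fairly different), which is exactly why the patent derives its measure from observed recognizer behavior instead.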
-
Publication number: 20080120105. Abstract: Methods and apparatus to operate an audience metering device with voice commands are described herein. An example method to identify audience members based on voice includes: obtaining an audio input signal including a program audio signal and a human voice signal; receiving an audio line signal from an audio output line of a monitored media device; processing the audio line signal with a filter having adaptive weights to generate a delayed and attenuated line signal; subtracting the delayed and attenuated line signal from the audio input signal to develop a residual audio signal; identifying the person who spoke to create the human voice signal based on the residual audio signal; and logging the identity of that person as an audience member. Type: Application. Filed: February 1, 2008. Publication date: May 22, 2008. Inventor: Venugopal Srinivasan
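The filtering step described here is adaptive interference cancellation: the known line signal is scaled by adaptive weights and subtracted from the microphone input so that the residual approximates only the human voice. A minimal sketch assuming a single-tap least-mean-squares (LMS) filter; real metering devices would use a multi-tap filter with delay estimation, and the function name and step size are illustrative:

```python
def cancel_line_signal(mic: list[float], line: list[float],
                       mu: float = 0.01) -> tuple[list[float], float]:
    """One-tap LMS: learn the attenuation of the line signal as picked up
    by the microphone, subtract it, and return (residual, learned weight)."""
    w = 0.0
    residual = []
    for m, x in zip(mic, line):
        e = m - w * x      # residual after cancelling the line signal
        residual.append(e)
        w += mu * e * x    # LMS weight update
    return residual, w
```

When the microphone picks up the program audio at half amplitude with no voice present, the weight converges toward 0.5 and the residual decays toward zero; any remaining energy is then attributed to the speaker.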
-
Publication number: 20080091410. Abstract: A method of forming words utilizing a character actuator unit in which the character actuators are segregated into certain categories. First and second categories are employed and activated simultaneously to generate the beginning and ending of a word. First and second actuating categories may be combined with third and fourth categories of actuators to further form and modify words in any language. Type: Application. Filed: January 4, 2007. Publication date: April 17, 2008. Inventor: Sherrie Benson