Abstract: A method of providing weighted concepts related to a sequence of one or more words, including: providing on a computer an encyclopedia with concepts and a document explaining each concept, forming a vector, which contains the frequency of the word for each concept, for each word in the encyclopedia, arranging the vector according to the frequency of appearance of the word for each concept, selecting the concepts with the highest frequencies for each word from the vector, truncating the rest of the vector, inducing a feature generator using the truncated vectors; wherein the feature generator is adapted to receive as input one or more words and provide a list of weighted concepts, which are most related to the one or more words provided as input.
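The steps in this abstract (per-word concept frequency vectors, sorting, truncation, then a generator over the truncated vectors) can be sketched roughly as follows. This is a minimal illustration, not the patented method: the function name `build_feature_generator`, the `top_k` cutoff, whitespace tokenization, and the choice to combine words by summing frequencies are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_feature_generator(encyclopedia, top_k=2):
    """Return a function mapping input words to weighted concepts.

    encyclopedia: dict of concept name -> explanatory document text.
    top_k: hypothetical cutoff; concepts beyond it are truncated per word.
    """
    # For each word, count how often it appears in each concept's document.
    word_vectors = defaultdict(Counter)
    for concept, text in encyclopedia.items():
        for word in text.lower().split():
            word_vectors[word][concept] += 1

    # Order each vector by frequency and keep only the top_k concepts.
    truncated = {
        word: dict(vector.most_common(top_k))
        for word, vector in word_vectors.items()
    }

    def generate(words):
        # Assumption: weights of multiple input words are simply summed.
        scores = Counter()
        for word in words.lower().split():
            for concept, freq in truncated.get(word, {}).items():
                scores[concept] += freq
        return scores.most_common()

    return generate


encyclopedia = {
    "Cat": "cat cat pet animal",
    "Dog": "dog dog pet animal animal",
    "Car": "car engine",
}
generate = build_feature_generator(encyclopedia, top_k=2)
print(generate("dog animal"))
```

A real system would likely weight by something like TF-IDF rather than raw counts; raw frequency is used here only because it is what the abstract names.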
Abstract: To enter Chinese text, a user enters the corresponding phonetic spelling via a telephone-style keypad. Some or all keys represent multiple phonetic letters. In disambiguating entered key presses to yield a valid phonetic spelling, a computer divides the key presses into segments, while still preserving key press order. Each segment must correspond to an entry in a dictionary of Chinese characters, character phrases, and/or character components such as radicals or other predetermined stroke groupings. Upon arrival of a new key press that cannot form a valid entry when appended to the current segment, key presses are incrementally reallocated from the previous segment. Already-resolved segments occurring prior to the previous and current segments are left intact. After each shifting attempt, the computer reinterprets key presses of the last two segments, and accepts the new segmentation if the segments form valid dictionary entries.
Abstract: Operations for weighted and non-weighted multi-tape automata are described for use in natural language processing tasks such as morphological analysis, disambiguation, and entity extraction.
Type:
Grant
Filed:
November 9, 2009
Date of Patent:
January 10, 2012
Assignee:
Xerox Corporation
Inventors:
Andre Kempe, Franck Guingne, Florent Nicart
Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate compound text input. The device is able to assemble language objects in the memory to generate compound language solutions. The device is able to analyze the combinations of language objects in light of N-gram data stored on the device to avoid proposing low-probability compound language solutions.
Abstract: A method of prioritizing the automated translation of communications relating to a predetermined topic includes capturing and inputting into a data processing system a translation-candidate communication rendered in a first human language. A first data set representative of the translation-candidate communication is stored in computer memory and parsed into communication sub-portions. Communication sub-portions are algorithmically selected for translation depending on their relatedness to the predetermined topic as determined by first-language extraction rules. Each selected communication sub-portion is translated to a translated-data-set sub-portion representative of that selected communication sub-portion in the second human language. Translated-data-set sub-portions are subjected to a secondary filtration process in accordance with which their relatedness to the predetermined topic is determined by second-language extraction rules.
Abstract: Embodiments for manipulating characters displayed on a display screen are provided, wherein one example method includes identifying a selected word, wherein the selected word includes at least one character to be modified. The method further includes correlating each of the at least one character with a unique numerical value and receiving a selection command and a modification command, wherein the selection command is the unique numerical value corresponding to a selected character. Furthermore, the method includes modifying the selected character responsive to the modification command to generate a modified word.
Type:
Grant
Filed:
June 28, 2010
Date of Patent:
December 20, 2011
Assignee:
Microsoft Corporation
Inventors:
David Mowatt, Robert Chambers, Felix GTI Andrew
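The character-manipulation method described in this record (number each character of the selected word, take a numeric selection command, then apply a modification command) might look something like the sketch below. The function names and the two modification commands, `"capitalize"` and `"delete"`, are hypothetical; the abstract does not say which modifications are supported.

```python
def number_characters(word):
    """Correlate each character of the selected word with a unique number (1-based)."""
    return {index + 1: char for index, char in enumerate(word)}


def modify_character(word, selection, modification):
    """Apply a modification command to the character chosen by its number.

    selection: the unique numerical value of the character to modify.
    modification: hypothetical command name, "capitalize" or "delete".
    """
    if not 1 <= selection <= len(word):
        raise ValueError(f"no character numbered {selection} in {word!r}")
    chars = list(word)
    position = selection - 1
    if modification == "capitalize":
        chars[position] = chars[position].upper()
    elif modification == "delete":
        del chars[position]
    else:
        raise ValueError(f"unknown modification command: {modification!r}")
    return "".join(chars)


print(number_characters("hello"))
print(modify_character("hello", 1, "capitalize"))
```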
Abstract: In a speech translation apparatus, a correspondence storage unit stores therein identifiers of terminals and usage languages used in the terminals associated with each other. A receiving unit receives a source speech from one of the terminals. A translating unit acquires usage languages from the correspondence storage unit, and generates a translated speech by using each of the acquired usage languages as a target language. When the translated speech is generated in any one of the target languages, a determining unit determines whether it has been generated in all the target languages. If the translated speech has been generated in all the target languages, an output unit outputs the translated speech. A sending unit sends the translated speech to each of the terminals.
Abstract: An information display control apparatus includes an example sentence storage unit that stores a plurality of example sentences, an input unit that accepts a user's operation of inputting a string of characters, an example sentence search unit that, when a compound word consisting of a plurality of constituting words which are combinable and splittable is input via the input unit, searches the example sentences in the example sentence storage unit for an example sentence containing the compound word in a combined state where the plurality of constituting words are combined and an example sentence containing the compound word in a split state where the plurality of constituting words are split, and a display control unit that displays the example sentences searched by the example sentence search unit.
Abstract: Systems for translating text messages in an instant messaging system comprise a translation engine for translating text messages into a preferred language of a recipient of the text messages. The systems are configured to send and receive the text messages and to determine whether the text messages that are received in a source language are in the preferred language of the recipients so that the text messages are displayed in the preferred language of the recipients of the text messages, including an indication of whether or not the received message is translated. Other systems and methods are also provided.
Type:
Grant
Filed:
December 28, 2005
Date of Patent:
September 27, 2011
Assignee:
AT&T Intellectual Property I, L.P.
Inventors:
Brian K. Daigle, Larry G. Kent, Jr., W. Todd Daniell, Wendy H. Eason, Samuel N. Zellner, Jerry C. Liu, Robert A. Koch
Abstract: The invention concerns a method of source-language comprehension for a listener who has mastered a target language. The listener hears a statement in the source language that consists of a series of several contents. While the statement plays, a succession of notations marking the succession of contents in the source language is displayed, each notation appearing only from the moment its content is heard, together with a series of inscriptions in the target language that corresponds to the full statement. The method is characterized in that the whole series of inscriptions is shown before the statement begins.
Abstract: A system offers potential completions for fragments of text. The system may obtain a text fragment and identify documents that include the text fragment. The system may locate sentences within the documents that include at least a portion of the text fragment, identify sentence endings associated with the located sentences, and present the sentence endings as potential completions for the text fragment.
Type:
Grant
Filed:
December 14, 2009
Date of Patent:
September 20, 2011
Assignee:
Google Inc.
Inventors:
Georges R Harik, Simon Tong, David R Cheng
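As a rough illustration of the completion approach in this record (find documents containing the fragment, locate the matching sentences, return their endings), consider the sketch below. It is a simplification under stated assumptions: it matches the whole fragment rather than "at least a portion" of it, splits sentences with a naive regular expression, and the function name `sentence_completions` is made up for the example.

```python
import re


def sentence_completions(fragment, documents):
    """Return sentence endings that follow `fragment` in the given documents."""
    pattern = re.compile(re.escape(fragment), re.IGNORECASE)
    completions = []
    for document in documents:
        # Step 1: keep only documents that include the text fragment.
        if not pattern.search(document):
            continue
        # Step 2: locate sentences within the document that include it.
        for sentence in re.split(r"(?<=[.!?])\s+", document):
            match = pattern.search(sentence)
            if match is None:
                continue
            # Step 3: the text after the fragment is a candidate completion.
            ending = sentence[match.end():].strip()
            if ending and ending not in completions:
                completions.append(ending)
    return completions


docs = [
    "The quick brown fox jumps over the lazy dog. It was a sunny day.",
    "A quick brown fox is rare. Nothing else.",
]
print(sentence_completions("quick brown fox", docs))
```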
Abstract: One embodiment generally pertains to a method of prediction. The method includes generating a set of affixes from a selected input sequence and comparing the set of affixes with a predictive set of affixes. The method also includes selecting an affix from the predictive set of affixes. The invention uses various input data sets and makes it possible to reproduce the original data set exactly while keeping the predictive set of affixes at a minimal size.
Type:
Grant
Filed:
February 27, 2004
Date of Patent:
September 20, 2011
Assignee:
Dictaphone Corporation
Inventors:
Alwin B. Carus, Thomas J. Deplonty, III
Abstract: A translation method for properly recognizing and automatically translating a sentence containing an emphasized word that includes two or more successive identical characters. First, words in a source text to be translated are looked up in a dictionary (step S201) to determine whether the text includes an unregistered word (step S203). Then, it is determined whether an unregistered word contains successive identical characters (step S205). If it does, the number of those characters is reduced (step S207) and it is determined whether the modified word thus obtained is contained in the dictionary (step S209). If the modified word is contained in the dictionary, the unregistered word is identified as the modified word (step S215), the part of speech and the attribute of the modified word are determined (step S217), and the unregistered word is replaced with the modified word for translation.
Type:
Grant
Filed:
March 25, 2009
Date of Patent:
August 23, 2011
Assignee:
International Business Machines Corporation
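The lookup-and-reduce steps of this record (S201 through S215) can be sketched as below. The abstract does not say how the number of repeated characters is reduced, so the strategy here, trying each run of identical characters at length two and then one, is an assumption, as is the function name `normalize_emphasized`.

```python
from itertools import groupby, product


def normalize_emphasized(word, dictionary):
    """Map an emphasized word such as "soooo" to a dictionary word, if any.

    Returns the registered word, or None when no reduction matches.
    """
    # Steps S201/S203: a registered word needs no modification.
    if word in dictionary:
        return word

    # Step S205: split the word into runs of successive identical characters.
    runs = [(char, len(list(group))) for char, group in groupby(word)]
    if all(length == 1 for _, length in runs):
        return None  # no repeated characters to reduce

    # Steps S207/S209: try reduced run lengths (assumed: 2, then 1) and
    # accept the first candidate found in the dictionary.
    options = [
        range(min(length, 2), 0, -1) if length > 1 else (1,)
        for _, length in runs
    ]
    for lengths in product(*options):
        candidate = "".join(
            char * count for (char, _), count in zip(runs, lengths)
        )
        if candidate in dictionary:
            return candidate  # step S215
    return None


dictionary = {"so", "cool", "hello"}
print(normalize_emphasized("helllooo", dictionary))
```

The number of candidates grows as 2 to the power of the number of repeated runs, which is acceptable for single words but is one reason a production system might prefer a different search.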
Abstract: A method for determining a sentiment associated with an entity includes inputting a plurality of texts associated with the entity, labeling seed words in the plurality of texts as positive or negative, determining a score estimate for the plurality of words based on the labeling, re-enumerating paths of the plurality of words and determining a number of sentiment alternations, determining a final score for the plurality of words using only paths whose number of alternations is within a threshold, converting the final scores to corresponding z-scores for each of the plurality of words, and outputting the sentiment associated with the entity.
Type:
Grant
Filed:
April 24, 2007
Date of Patent:
August 9, 2011
Assignee:
The Research Foundation of the State University of New York
Inventors:
Namrata Godbole, Steven Skiena, Manjunath Srinivasaiah
Abstract: A user adaptive speech recognition method and apparatus is disclosed that controls user confirmation of a recognition candidate using a new threshold value adapted to a user. The user adaptive speech recognition method includes calculating a confidence score of a recognition candidate according to the result of speech recognition, setting a new threshold value adapted to the user based on a result of user confirmation of the recognition candidate and the confidence score of the recognition candidate, and outputting a corresponding recognition candidate as a result of the speech recognition if the calculated confidence score is higher than the new threshold value. Thus, the need for user confirmation of the result of speech recognition is reduced and the probability of speech recognition success is increased.
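One way to picture the user-adapted threshold in this abstract is the sketch below. The abstract states only that the new threshold depends on the user's confirmation result and the candidate's confidence score; the fixed-step update rule, the class name `AdaptiveConfirmation`, and the parameter values are assumptions for illustration.

```python
class AdaptiveConfirmation:
    """Decide whether a recognition candidate needs user confirmation."""

    def __init__(self, threshold=0.5, step=0.1):
        self.threshold = threshold  # current user-adapted threshold
        self.step = step            # hypothetical adjustment size

    def needs_confirmation(self, confidence):
        # Candidates scoring above the threshold are output directly.
        return confidence <= self.threshold

    def update(self, confidence, user_accepted):
        """Adapt the threshold after the user confirms or rejects a candidate.

        Assumed rule: acceptance lowers the threshold (fewer confirmations),
        rejection raises it. A fuller rule would also use `confidence`.
        """
        if user_accepted:
            self.threshold = max(0.0, self.threshold - self.step)
        else:
            self.threshold = min(1.0, self.threshold + self.step)


recognizer = AdaptiveConfirmation(threshold=0.5, step=0.1)
print(recognizer.needs_confirmation(0.45))  # asks the user at first
recognizer.update(0.45, user_accepted=True)
print(recognizer.needs_confirmation(0.45))  # no longer asks
```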
Abstract: Methods, systems, and computer-readable media provide for the management of an audio environment with multiple audio sources. According to various embodiments described herein, real-time audio from multiple sources is received. A speaker is identified for each of the audio sources. Upon detecting a change from a first audio source to a second audio source, an identification of the speaker associated with the second audio source is provided. According to various embodiments, a recording of the real-time audio may be made and descriptors inserted to identify each speaker as the audio source changes. Real-time feedback from the speakers regarding characteristics of the audio may be received and corresponding adjustments to the audio made.
Type:
Grant
Filed:
October 4, 2007
Date of Patent:
August 9, 2011
Assignee:
AT&T Intellectual Property I, LP
Inventors:
Robert Koch, Robert Starr, Steven Tischer
Abstract: A system is disclosed for checking grammar and usage using a flexible portfolio of different mechanisms, and automatically providing a variety of different examples of standard usage, selected from analogous Web content. The system can be used for checking the grammar and usage in any application that involves natural language text, such as word processing, email, and presentation applications. The grammar and usage can be evaluated using several complementary evaluation modules, which may include one based on a trained classifier, one based on regular expressions, and one based on comparative searches of the Web or a local corpus. The evaluation modules can provide a set of suggested alternative segments with corrected grammar and usage. A followup, screened Web search based on the alternative segments, in context, may provide several different in-context examples of proper grammar and usage that the user can consider and select from.
Type:
Grant
Filed:
February 28, 2007
Date of Patent:
August 2, 2011
Assignee:
Microsoft Corporation
Inventors:
Chris Brockett, William Dolan, Michael Gamon, Jianfeng Gao, Lucy Vanderwende, Hsiao-Wen Hon, Ming Zhou, Gary Kacmarcik, Alexandre Klementiev
Abstract: In a method and device for outputting information and/or messages from at least one device using speech, the information and/or messages required for vocal output are provided in a voice memory, the information and/or messages are read by a processing device according to a demand, and the information and/or messages are output via an acoustic output device. The information and/or messages are output with a varying intonation according to their relevance.
Abstract: A system and method for generating grammatically correct text in a target language based on one or more text templates and corresponding context in a source language comprises a software module configured to select one or more source language text templates and corresponding context in the source language. The system also includes a localization engine configured to obtain the selected one or more source language text templates and corresponding context in the target language from memory, apply the target language context to the one or more target language text templates, and apply one or more grammatical rules for the target language, thereby generating a grammatically correct text string in the target language. The system further includes a display configured to display the grammatically correct text string in the target language.
Type:
Grant
Filed:
November 9, 2004
Date of Patent:
July 19, 2011
Assignee:
Sony Online Entertainment LLC
Inventors:
Robert A. McEntee, William M. Mauer, Steven J. Riley
Abstract: Systems, methods, computer-readable media and other embodiments are provided for automatically determining a language of a document from a set of candidate languages. In one embodiment, a system includes a logic for setting an assumption value associated with each of the languages of the set of candidate languages where the assumption value indicates that the document is not in the language. A language analyzer determines the language and generates an output that indicates that the document is one language of the candidate languages when the assumption value for the one language passes a threshold value.