Storage Or Retrieval Of Data Patents (Class 704/7)
  • Publication number: 20150088488
    Abstract: A method and system for transforming a source document to an output document is disclosed. The method includes preparing a first file that contains a fixed text and a variable text by generating a regular-expression code for the text in the source document. The variable text in the first file is translated from a source language to an output language, wherein the translation is performed on the basis of at least one of a translation dictionary look-up, and a phonetic transliteration. The method then generates the output document in a pre-decided format as required in the output document from the first file.
    Type: Application
    Filed: September 23, 2013
    Publication date: March 26, 2015
    Applicant: Lingua Next Technologies Pvt. Ltd.
    Inventor: Rajeevlochan Phadke
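    Illustration: a minimal sketch of the fixed-text/variable-text split described in the abstract above, assuming a single hypothetical line template and a toy dictionary. The names TEMPLATE, DICTIONARY, translate_variable, and transform_line are invented for this example and do not come from the application; phonetic transliteration is not shown.

```python
import re

# Hypothetical line template (not from the application): the literal prefix is
# the fixed text, and the named group captures the variable text.
TEMPLATE = re.compile(r"^(?P<fixed>Item: )(?P<variable>.+)$")

# Toy dictionary standing in for the translation look-up (assumption).
DICTIONARY = {"red": "rouge", "house": "maison"}


def translate_variable(text, dictionary):
    """Translate word by word; unknown words pass through unchanged."""
    return " ".join(dictionary.get(word.lower(), word) for word in text.split())


def transform_line(line):
    """Keep the fixed text and translate only the variable text."""
    match = TEMPLATE.match(line)
    if match is None:
        return line  # line does not fit the template; leave it untouched
    return match.group("fixed") + translate_variable(match.group("variable"), DICTIONARY)
```

    Calling transform_line on a line that matches the template translates only the captured group; any other line is returned unchanged.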
  • Patent number: 8990065
    Abstract: A set of stories may be related in a set of messages (e.g., news articles, weblog posts, or messages exchanged in a social network). Presented herein are techniques for automatically generating a summary of respective stories that may be used as a headline or title. These techniques involve identifying the entities referenced in each message, clustering the messages based on similarly referenced entities to generate a cluster of messages associated with each story, and identifying facts of the story that appear in many of the messages, which may be used to generate a perspective-independent summary of the story. Additionally, metadata regarding each message may be detected (e.g., the entities involved in each story, a meta-story of the story, and the perspective of the author in relating the story), and may be used to fulfill requests to filter the set of stories and/or messages based on these criteria.
    Type: Grant
    Filed: January 11, 2011
    Date of Patent: March 24, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: David Arthur Raskino, Matthew Bret MacLaurin
  • Patent number: 8977536
    Abstract: A system with a nonstatistical translation component integrated with a statistical translation component engine. The same corpus may be used for training the statistical engine and also for determining when to use the statistical engine and when to use the translation component. This training may use probabilistic techniques. Both the statistical engine and the translation components may be capable of translating the same information; however, the system determines which component to use based on the training. Retraining can be carried out to add additional components, or after additional translator training.
    Type: Grant
    Filed: June 3, 2008
    Date of Patent: March 10, 2015
    Assignee: University of Southern California
    Inventor: Franz Josef Och
  • Patent number: 8972424
    Abstract: A system and related method for the electronic processing of text onto a two-dimensional coordinate system to analyze the attitudinal mindset associated with the text. The system and related method may also be employed to generate text based on a desired attitudinal mindset to impart. The system includes a computer system embodying functions that enable a user to analyze the text. The system includes one or more functions to parse attitudinal words and objective words and associate two-dimensional coordinates with the subjective words. The system further includes one or more functions for mapping the associated two-dimensional coordinates to show the geographic locations of each attitudinal word of the text in relation to each other attitudinal word of the text. The system decomposes attitudinal words into attitudinal equivalence and reference category and enables the generation of a report of the mindset associated with the analyzed text.
    Type: Grant
    Filed: November 9, 2012
    Date of Patent: March 3, 2015
    Inventor: Peter Snell
  • Patent number: 8959084
    Abstract: A computer-implemented method includes receiving in a query a location identifier from a user of a remote device, parsing the input location identifier to generate one or more location-related tokens, querying a repository of location information with the one or more location-related tokens to identify locations for one or more documents having a substantial match to the tokens, scoring the one or more documents using a mass of location for each document that represents the geographical size of a location associated with the document, and presenting information relating to the one or more documents for display using the mass of location.
    Type: Grant
    Filed: July 13, 2006
    Date of Patent: February 17, 2015
    Assignee: Google Inc.
    Inventor: Christopher M. Atenasio
  • Patent number: 8959013
    Abstract: A method, including presenting, by a computer system executing a non-tactile three dimensional user interface, a virtual keyboard on a display, the virtual keyboard including multiple virtual keys, and capturing a sequence of depth maps over time of a body part of a human subject. On the display, a cursor is presented at positions indicated by the body part in the captured sequence of depth maps, and one of the multiple virtual keys is selected in response to an interruption of a motion of the presented cursor in proximity to the one of the multiple virtual keys.
    Type: Grant
    Filed: September 25, 2011
    Date of Patent: February 17, 2015
    Assignee: Apple Inc.
    Inventors: Micha Galor, Ofir Or, Shai Litvak, Erez Sali
  • Patent number: 8954335
    Abstract: Appropriate processing results or appropriate apparatuses can be selected with a control device that selects the most probable speech recognition result by using speech recognition scores received with speech recognition results from two or more speech recognition apparatuses; sends the selected speech recognition result to two or more translation apparatuses respectively; selects the most probable translation result by using translation scores received with translation results from the two or more translation apparatuses; sends the selected translation result to two or more speech synthesis apparatuses respectively; receives a speech synthesis processing result including a speech synthesis result and a speech synthesis score from each of the two or more speech synthesis apparatuses; selects the most probable speech synthesis result by using the scores; and sends the selected speech synthesis result to a second terminal apparatus.
    Type: Grant
    Filed: March 3, 2010
    Date of Patent: February 10, 2015
    Assignee: National Institute of Information and Communications Technology
    Inventors: Satoshi Nakamura, Eiichiro Sumita, Yutaka Ashikari, Noriyuki Kimura, Chiori Hori
  • Patent number: 8949112
    Abstract: One embodiment of the present invention is an XML application module that processes an XML character stream, which module includes an XML interface module, a parallel bit stream module, a lexical item stream module, a parser and a parsed data receiver. The XML interface module applies the XML character stream as input to the parallel bit stream module and the parser; the parallel bit stream module forms parallel bit streams and applies them as input to the lexical item stream module; the lexical item stream module forms lexical item streams and applies them as input to the parser; the parser forms a stream of parsed XML data and applies it as input to the parsed data receiver; and the parsed data receiver processes the stream of parsed XML data. The parsed data receiver may be, for example, a communication module of a portable communication device.
    Type: Grant
    Filed: February 6, 2013
    Date of Patent: February 3, 2015
    Assignee: International Characters, Inc.
    Inventor: Robert D. Cameron
  • Patent number: 8949266
    Abstract: In embodiments of the present invention improved capabilities are described for multiple web-based content category searching for web content on a mobile communication facility comprising capturing speech presented by a user using a resident capture facility on the mobile communication facility; transmitting at least a portion of the captured speech as data through a wireless communication facility to a speech recognition facility; generating speech-to-text results for the captured speech utilizing the speech recognition facility; and transmitting the text results and a plurality of formatting rules specifying how search text may be used to form a query for a search capability on the mobile communications facility, wherein each formatting rule is associated with a category of content to be searched.
    Type: Grant
    Filed: August 27, 2010
    Date of Patent: February 3, 2015
    Assignee: Vlingo Corporation
    Inventors: Michael S. Phillips, John N. Nguyen
  • Patent number: 8949111
    Abstract: A method includes accessing text that includes a plurality of words, tagging each of the plurality of words with one of a plurality of parts of speech (POS) tags, and creating a plurality of tokens, each token comprising one of the plurality of words and its associated POS tag. The method further includes clustering one or more of the created tokens into a chunk of tokens, the one or more tokens clustered into the chunk of tokens based on the POS tags of the one or more tokens, and forming a phrase based on the chunk of tokens, the phrase comprising the words of the one or more tokens clustered into the chunk of tokens.
    Type: Grant
    Filed: December 14, 2011
    Date of Patent: February 3, 2015
    Assignee: Brainspace Corporation
    Inventor: Paul A. Jakubik
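    Illustration: the tag, tokenize, chunk, and phrase steps in the abstract above can be sketched as follows. This is a rough sketch under stated assumptions: the tiny TAGS lexicon stands in for a real part-of-speech tagger, and the rule that adjectives and nouns cluster into a chunk (CHUNK_TAGS) is a choice made for this example, not the patented rule.

```python
# Hypothetical mini tag lexicon; a real system would use a trained POS tagger.
TAGS = {"the": "DT", "quick": "JJ", "brown": "JJ", "fox": "NN",
        "jumps": "VB", "over": "IN", "lazy": "JJ", "dog": "NN"}

# Assumption for this sketch: adjectives and nouns cluster into one chunk.
CHUNK_TAGS = {"JJ", "NN"}


def tag(words):
    """Pair each word with a POS tag, producing (word, tag) tokens."""
    return [(word, TAGS.get(word.lower(), "NN")) for word in words]


def chunk_phrases(tokens):
    """Cluster maximal runs of tokens whose tags are in CHUNK_TAGS into phrases."""
    phrases, current = [], []
    for word, pos in tokens:
        if pos in CHUNK_TAGS:
            current.append(word)
        elif current:
            phrases.append(" ".join(current))
            current = []
    if current:  # flush a chunk that runs to the end of the text
        phrases.append(" ".join(current))
    return phrases
```

    Each phrase is built only from the words of the tokens in its chunk, which mirrors the last step of the abstract.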
  • Patent number: 8938465
    Abstract: A method and system for providing access to content on an electronic device is provided. One embodiment includes receiving contextual information and querying a packaged content source for content related to the contextual information. Available content relating to the contextual information from the packaged content source is then indicated for user access.
    Type: Grant
    Filed: August 20, 2009
    Date of Patent: January 20, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Alan Messer
  • Patent number: 8935152
    Abstract: A frame represents a concept with a set of roles and a set of linguistic rules. If a linguistic rule is satisfied, by a unit of natural language discourse (UNLD), the frame is invoked and a frame instance produced. A frame instance specifies how the UNLD, with particular values drawn from the UNLD, fulfills the roles of the frame. A frame-based search, of target content, can produce a search result comprised of records and corresponding frame instances. The values of such frame instances can be presented to the user as a role-value oriented search result. Multiple values of a role-value oriented search result, sufficiently similar in meaning, can be merged. Merged values can be represented, in a role-value oriented search result, by a single value. Selection of a value, of a role-value oriented search result, can cause the records, for which the value occurs in the corresponding instance, to be displayed to the user.
    Type: Grant
    Filed: July 21, 2008
    Date of Patent: January 13, 2015
    Assignee: NetBase Solutions, Inc.
    Inventors: Wei Li, Michael Jacob Osofsky, Lokesh Pooranmal Bajaj
  • Patent number: 8933827
    Abstract: A data processing apparatus that is capable of reducing the garbling of characters caused by the difference among the character codes when setting data are transferred to another apparatus by the import-export function. A storage unit stores setting data for the data processing apparatus. A receiving unit receives an instruction for exporting the setting data stored in the storage unit. A converting unit converts Unicode data included in the setting data into character code data of language, which is set to the data processing apparatus. An export unit exports the character code data converted by the converting unit and the Unicode data.
    Type: Grant
    Filed: May 24, 2013
    Date of Patent: January 13, 2015
    Assignee: Canon Kabushiki Kaisha
    Inventor: Noritsugu Okayama
  • Patent number: 8930373
    Abstract: An aspect includes phrase searching using exclusion tokens. A token division unit is configured to divide an input character string to be searched into a plurality of tokens. A token position definition unit is configured to set each token to be excluded in an occurrence position calculation as an exclusion token and to set each token to be included in the occurrence position calculation as a headword token, and define an occurrence position for each headword token. A position offset information assigning unit is configured to assign, to each of the exclusion tokens, position information obtained with the headword token followed by the exclusion tokens and to assign the headword token followed by the exclusion tokens as a starting point. An indexing processing unit is configured to perform indexing on the plurality of tokens such that whether or not the exclusion tokens follow one of the plurality of tokens is identifiable.
    Type: Grant
    Filed: June 22, 2012
    Date of Patent: January 6, 2015
    Assignee: International Business Machines Corporation
    Inventors: Masaki Komedani, Fumihiko Terui
  • Patent number: 8930176
    Abstract: Techniques for interactively presenting word-alignments of multilingual translations and automatically improving those translations based upon user feedback are described herein. With one or more implementations of the techniques described herein, a word-alignment user-interface (UI) concurrently displays a pair of bilingual sentences, where one is a translation of the other, and interactively highlights linked (i.e., “word-aligned”) words and phrases of the pair. Other implementations of the techniques described herein offer an option for a user to provide feedback about the existing word-alignments or realign the words or phrases. In still other described implementations, word-alignment is automatically improved based upon that user feedback.
    Type: Grant
    Filed: April 1, 2010
    Date of Patent: January 6, 2015
    Assignee: Microsoft Corporation
    Inventors: Henry Li, Matthew Robert Scott, Xiaohua Liu, Hao Wei, Ming Zhou
  • Patent number: 8924195
    Abstract: In a machine-translation apparatus, an example storage unit stores therein target language examples in a target language and source language examples in a source language, while keeping the target language examples and the source language examples in correspondence with one another. An input receiving unit receives an input sentence in the source language. A searching unit conducts a search in the example storage unit for one of the target language examples corresponding to one of the source language examples that either matches or is similar to the input sentence. A translating unit generates a reverse-translated sentence by translating the one of the target language examples found in the search into the source language. A detecting unit detects a difference portion between the reverse-translated sentence and the input sentence. An output unit outputs the difference portion.
    Type: Grant
    Filed: September 19, 2008
    Date of Patent: December 30, 2014
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Satoshi Kamatani, Tetsuro Chino, Kentaro Furihata
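    Illustration: the detecting unit in the abstract above compares the reverse-translated sentence with the input sentence. A minimal sketch of that comparison, assuming whitespace tokenization and using Python's standard difflib module (the function name difference_portions is invented for this example; the patent does not name a diff algorithm):

```python
import difflib


def difference_portions(input_sentence, reverse_translated):
    """Return (input_words, reverse_words) pairs for each non-matching region."""
    a = input_sentence.split()
    b = reverse_translated.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    diffs = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":  # replace, insert, or delete
            diffs.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return diffs
```

    An empty result means the retrieved example round-trips to the input exactly; each returned pair marks a portion the user may need to check.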
  • Patent number: 8924194
    Abstract: In an embodiment of a messaging system, a method for presenting a commercial message to a user is provided. A target language in which the user is comfortable communicating may be determined based on at least one communication received by the user or at least one communication provided by the user. The commercial message may be presented to the user in the target language.
    Type: Grant
    Filed: June 20, 2006
    Date of Patent: December 30, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Srinivas Bangalore
  • Patent number: 8924363
    Abstract: A method for correcting service manual textual inconsistencies. Extracting textual procedures from service documents stored in a memory of a service document storage device. Each term of an extracted textual procedure terminology is compared to a correlating target name terminology for identifying any matching terms by a processor. An overlap similarity is computed as a function of the identified matching terms from the extracted textual procedure terminology and the correlating target name terminology. A determination is made whether the overlap similarity is greater than a predetermined similarity threshold. The service documents are modified to change the extracted textual procedure terminology to the correlating target name terminology in response to the overlap similarity being greater than the predetermined similarity threshold and the extracted textual procedure terminology not exactly matching the correlating target name terminology.
    Type: Grant
    Filed: November 7, 2012
    Date of Patent: December 30, 2014
    Assignee: GM Global Technology Operations LLC
    Inventors: Satnam Singh, Sachin Raviram, Keith D. Armitage, Steven W. Holland, Frederick J. Vondrak, David N. Nowak, David B. Miller
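    Illustration: a minimal sketch of the threshold test described in the abstract above. The abstract does not define the overlap similarity, so this example assumes the overlap coefficient (shared terms divided by the size of the smaller term set); the function names and the 0.6 threshold are likewise assumptions.

```python
def overlap_similarity(extracted, target):
    """Overlap coefficient on lowercase term sets (assumed definition)."""
    a, b = set(extracted.lower().split()), set(target.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def normalize_term(extracted, target_names, threshold=0.6):
    """Return the best-matching target name if it clears the threshold.

    The extracted text is returned unchanged when it already matches exactly
    or when no target name is similar enough.
    """
    best = max(target_names, key=lambda t: overlap_similarity(extracted, t), default=None)
    if best is None or extracted == best:
        return extracted
    if overlap_similarity(extracted, best) > threshold:
        return best
    return extracted
```

    The replacement happens only when both conditions from the abstract hold: similarity above the threshold, and no exact match.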
  • Patent number: 8914402
    Abstract: In embodiments of the present invention improved capabilities are described for multiple web-based content category searching for web content on a mobile communication facility comprising capturing speech presented by a user using a resident capture facility on the mobile communication facility; transmitting at least a portion of the captured speech as data through a wireless communication facility to a speech recognition facility; generating speech-to-text results for the captured speech utilizing the speech recognition facility; and transmitting the text results and a plurality of formatting rules specifying how search text may be used to form a query for a search capability on the mobile communications facility, wherein each formatting rule is associated with a category of content to be searched.
    Type: Grant
    Filed: August 27, 2010
    Date of Patent: December 16, 2014
    Assignee: Vlingo Corporation
    Inventors: Michael S. Phillips, John N. Nguyen
  • Patent number: 8909516
    Abstract: Computing functionality converts an input linguistic item into a normalized linguistic item, representing a normalized counterpart of the input linguistic item. In one environment, the input linguistic item corresponds to a complaint by a person receiving medical care, and the normalized linguistic item corresponds to a definitive and error-free version of that complaint. In operation, the computing functionality uses plural reference resources to expand the input linguistic item, creating an expanded linguistic item. The computing functionality then forms a graph based on candidate tokens that appear in the expanded linguistic item, and then finds a shortest path through the graph; that path corresponds to the normalized linguistic item. The computing functionality may use a statistical language model to assign weights to edges in the graph, and to determine whether the normalized linguistic item incorporates two or more component linguistic items.
    Type: Grant
    Filed: December 7, 2011
    Date of Patent: December 9, 2014
    Assignee: Microsoft Corporation
    Inventors: Julie A. Medero, Daniel S. Morris, Lucretia H. Vanderwende, Michael Gamon
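    Illustration: the graph search in the abstract above can be sketched as a shortest path through a lattice of candidate tokens. This is a simplified sketch, assuming one list of candidates per input position and a hand-written bigram cost table standing in for the statistical language model; BIGRAM_COST, DEFAULT_COST, and the function names are hypothetical.

```python
# Hypothetical edge weights; lower cost means a more plausible word pair.
BIGRAM_COST = {("<s>", "chest"): 1.0, ("chest", "pain"): 1.0}
DEFAULT_COST = 5.0


def cost(prev, cur):
    """Edge weight between two adjacent candidate tokens."""
    return BIGRAM_COST.get((prev, cur), DEFAULT_COST)


def shortest_path(candidates):
    """Return the min-cost token sequence through a non-empty candidate lattice.

    candidates[i] lists the candidate tokens for input position i.
    """
    # best maps each token at the current position to (total cost, path so far).
    best = {tok: (cost("<s>", tok), [tok]) for tok in candidates[0]}
    for position in candidates[1:]:
        nxt = {}
        for cur in position:
            prev = min(best, key=lambda p: best[p][0] + cost(p, cur))
            nxt[cur] = (best[prev][0] + cost(prev, cur), best[prev][1] + [cur])
        best = nxt
    return min(best.values(), key=lambda entry: entry[0])[1]
```

    With candidates such as "pane" and "pain" expanded from a misspelled complaint, the path with the lowest total cost is taken as the normalized item.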
  • Patent number: 8903715
    Abstract: A mechanism is provided for accelerating data exchange language parsing. An input data stream is loaded into a first in, first out (FIFO) memory. A tokenization bit corresponding to a next byte to be read is extracted from a FIFO. A determination is made as to whether the tokenization bit corresponding to the next byte to be read from the FIFO indicates a control character or a non-control character located in an associated FIFO memory location in the FIFO. Responsive to the tokenization bit indicating the control character, the control character that causes a state change in a state machine is processed. Responsive to the tokenization bit indicating the non-control character, a length associated with the tokenization bit is identified and a set of non-control characters that do not cause a state change in the state machine are processed based on the length associated with the tokenization bit.
    Type: Grant
    Filed: May 4, 2012
    Date of Patent: December 2, 2014
    Assignee: International Business Machines Corporation
    Inventor: Kanak B. Agarwal
  • Patent number: 8903718
    Abstract: The present invention relates to methods and systems for storing words and phrases in a data structure, and retrieving and displaying said words and phrases from said data structure. In particular, the present invention relates to a method and system of predictively suggesting words and/or phrases to a user entering a string of characters into a user interface, which may be a limited user interface.
    Type: Grant
    Filed: January 19, 2009
    Date of Patent: December 2, 2014
    Inventor: Ugochukwu Akuwudike
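    Illustration: the abstract above does not say which data structure is used. As one plausible sketch, assuming a character trie (a common choice for prefix look-up, not confirmed by the patent), stored words and phrases can be suggested for a typed prefix like this:

```python
class Trie:
    """Character trie; each node marks whether a stored entry ends there."""

    def __init__(self):
        self.children = {}
        self.is_entry = False

    def insert(self, text):
        """Store a word or phrase."""
        node = self
        for ch in text:
            node = node.children.setdefault(ch, Trie())
        node.is_entry = True

    def suggest(self, prefix, limit=5):
        """Return up to `limit` stored entries starting with `prefix`, in sorted order."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_entry:
                results.append(text)
            # Push children in reverse order so the smallest is popped first.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results
```

    The limit parameter reflects the "limited user interface" case, where only a few suggestions fit on screen.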
  • Patent number: 8886535
    Abstract: A method of optimizing the calculation of matching scores between phone states and acoustic frames across a matrix of an expected progression of phone states aligned with an observed progression of acoustic frames within an utterance is provided. The matrix has a plurality of cells associated with a characteristic acoustic frame and a characteristic phone state. A first set and second set of cells that meet a threshold probability of matching a first phone state or a second phone state, respectively, are determined. The phone states are stored on a local cache of a first core and a second core, respectively. The first and second sets of cells are also provided to the first core and second core, respectively. Further, matching scores of each characteristic state and characteristic observation of each cell of the first set of cells and of the second set of cells are calculated.
    Type: Grant
    Filed: January 23, 2014
    Date of Patent: November 11, 2014
    Assignee: Accumente, LLC
    Inventors: Jike Chong, Ian Richard Lane, Senaka Wimal Buthpitiya
  • Patent number: 8880389
    Abstract: A method, computer program product and system are disclosed for determining the semantic density of textualized digital media (a measure of how much information is conveyed in a sentence or clause relative to its length). The more semantically dense text is, the more information it conveys in a given space. Users input a topic, a timeline, and one or more target web media sources for analysis. Text in the target media sources is deconstructed to determine density, and a density rating assigned to the web media source. Over time, users can track trends in the density of text media relative to a given topic, and determine how much information is being conveyed in connection with the topic, such as a political campaign. Line graphs, pie charts, and other time-elapsed output graphic representations of the semantic density are generated and rendered for the user.
    Type: Grant
    Filed: December 9, 2011
    Date of Patent: November 4, 2014
    Inventor: Igor Iofinov
  • Patent number: 8880388
    Abstract: In an automated Question Answer (QA) system architecture for automatic open-domain Question Answering, a system, method and computer program product for predicting the Lexical Answer Type (LAT) of a question. The approach is completely unsupervised and is based on a large-scale lexical knowledge base automatically extracted from a Web corpus. This approach for predicting the LAT can be implemented as a specific subtask of a QA process, and/or used for general purpose knowledge acquisition tasks such as frame induction from text.
    Type: Grant
    Filed: August 28, 2012
    Date of Patent: November 4, 2014
    Assignee: International Business Machines Corporation
    Inventors: David A. Ferrucci, Alfio M. Gliozzo, Aditya A. Kalyanpur
  • Patent number: 8874426
    Abstract: A method of translating a computer generated log output message from a first language to a second language, including receiving a log output containing a plurality of messages in a first language and matching words and phrases in the log output messages to pre-established codes in a matched message index. Ambiguous matches are resolved by removing codes matched to ones of the words and phrases that have overlap with words and phrases matched to different codes. The codes in the matched message index are translated into a second language different than the first language to a corresponding second log output message in the second language and then the second log output message is output in the second language.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: October 28, 2014
    Assignee: International Business Machines Corporation
    Inventor: Hugh P. Williams
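    Illustration: a minimal sketch of the match, resolve, and translate steps in the abstract above. The phrase and translation tables are hypothetical, and the abstract does not say which of two overlapping matches is removed; this example assumes the longer match wins.

```python
import re

# Hypothetical phrase-to-code and code-to-translation tables.
PHRASE_CODES = {"disk": "M1", "disk full": "M2", "connection lost": "M3"}
TRANSLATIONS = {"M1": "disque", "M2": "disque plein", "M3": "connexion perdue"}


def match_codes(message):
    """Find every phrase match as a (start, end, code) span."""
    spans = []
    for phrase, code in PHRASE_CODES.items():
        for m in re.finditer(r"\b" + re.escape(phrase) + r"\b", message):
            spans.append((m.start(), m.end(), code))
    return spans


def resolve_overlaps(spans):
    """Keep longer matches first and drop any span overlapping a kept one."""
    kept = []
    for span in sorted(spans, key=lambda s: (s[0] - s[1], s[0])):
        if all(span[1] <= k[0] or span[0] >= k[1] for k in kept):
            kept.append(span)
    return sorted(kept)


def translate(message):
    """Replace each matched phrase with the translation of its code."""
    out, pos = [], 0
    for start, end, code in resolve_overlaps(match_codes(message)):
        out.append(message[pos:start])
        out.append(TRANSLATIONS[code])
        pos = end
    out.append(message[pos:])
    return "".join(out)
```

    Text that matches no code passes through untranslated, so the output keeps the original punctuation and layout of the log line.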
  • Patent number: 8868405
    Abstract: A system and method are presented for the comparative analysis of textual documents. In an exemplary embodiment of the present invention the method includes accessing two or more documents, performing a linguistic analysis on each document, outputting a quantified representation of a semantic content of each document, and comparing the quantified representations using a defined metric. In exemplary embodiments of the present invention such a metric can measure relative semantic closeness or distance of two documents. In exemplary embodiments of the present invention the semantic content of a document can be expressed as a semantic vector. The format of a semantic vector is flexible, and in exemplary embodiments of the present invention it and any metric used to operate on it can be adapted and optimized to the type and/or domain of documents being analyzed and the goals of the comparison.
    Type: Grant
    Filed: January 27, 2004
    Date of Patent: October 21, 2014
    Assignee: Hewlett-Packard Development Company, L. P.
    Inventors: Kas Kasravi, Walter B. Novinger
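    Illustration: the abstract above leaves both the semantic vector format and the metric open. As a simple sketch, assuming lowercase term counts as the vector and cosine similarity as the metric (both are choices made for this example, not the patent's definitions):

```python
import math
from collections import Counter


def semantic_vector(text):
    """Toy stand-in for linguistic analysis: lowercase term counts."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(u[term] * v[term] for term in u.keys() & v.keys())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0
```

    A value near 1 indicates semantically close documents under this toy representation, and 0 indicates no shared terms.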
  • Patent number: 8868419
    Abstract: A text content summary is created from speech content. A focus more signal is issued by a user while receiving the speech content. The focus more signal is associated with a time window, and the time window is associated with a part of the speech content. It is determined whether to use the part of the speech content associated with the time window to generate a text content summary based on a number of the focus more signals that are associated with the time window. The user may express relative significance to different portions of speech content, so as to generate a personal text content summary.
    Type: Grant
    Filed: August 22, 2011
    Date of Patent: October 21, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Bao Hua Cao, Le He, Xing Jin, Qing Bo Wang, Xin Zhou
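    Illustration: the selection step in the abstract above counts focus more signals per time window. A minimal sketch, assuming fixed-length windows and assuming each signal belongs to the window that contains its timestamp (the window length, the threshold, and the name select_windows are hypothetical):

```python
from collections import Counter


def select_windows(signal_times, window_seconds=10.0, min_signals=2):
    """Return the start times of windows with at least `min_signals` signals."""
    counts = Counter(int(t // window_seconds) for t in signal_times)
    return [index * window_seconds
            for index, n in sorted(counts.items())
            if n >= min_signals]
```

    The speech in the selected windows would then be transcribed and used for the summary; that step is outside this sketch.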
  • Patent number: 8868404
    Abstract: A system, method, and computer-program product providing a customer-centric model of translation memory management including use of relevancy in an on-demand multi-tenant environment, and enabling selective multi-tenant searches for pre-existing translations.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: October 21, 2014
    Assignee: Cloudwords, Inc.
    Inventors: Scott William Yancey, Craig David Weissman
  • Patent number: 8863220
    Abstract: A method for rendering text onto moving image content. The method comprises receiving a request to translate dialog associated with moving image content, transmitting an interface, transmitting a time-stamped transcription, and receiving a translation of the dialog.
    Type: Grant
    Filed: June 22, 2011
    Date of Patent: October 14, 2014
    Assignee: Dotsub Inc.
    Inventor: Thor Sigvaldason
  • Patent number: 8855995
    Abstract: Systems, methods, and apparatuses including computer program products for machine translation. A method is provided that includes generating a plurality of machine translation systems using a single machine translation engine, and generating a consensus translation from a plurality of candidate translations for a source sentence, where each candidate translation of the plurality of candidate translations is an output of a respective machine translation system of the plurality of machine translation systems.
    Type: Grant
    Filed: December 3, 2012
    Date of Patent: October 7, 2014
    Assignee: Google Inc.
    Inventors: Wolfgang Macherey, Franz Josef Och
  • Patent number: 8843360
    Abstract: Disclosed are various embodiments for client-side internationalization of network pages. A network page and code that localizes the network page are obtained from a server. The code that localizes the network page is executed in a client and determines a locale associated with the client. One or more internationalized elements are identified in the network page. The internationalized elements are replaced with corresponding localized translations. The network page is rendered for display in the client after the network page has been localized.
    Type: Grant
    Filed: March 4, 2011
    Date of Patent: September 23, 2014
    Assignee: Amazon Technologies, Inc.
    Inventors: Simon K. Johnston, Margaux Eng, James K. Keiger, Gideon Shavit
  • Publication number: 20140278348
    Abstract: A system permitting communications between different organisms or species is disclosed.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 18, 2014
    Inventor: A. Christian Tahan
  • Patent number: 8838434
    Abstract: Techniques disclosed herein include systems and methods for creating a bootstrap call router for other languages by using selected N-best translations. Techniques include using N-best translations from a machine translation system so as to increase a possibility that desired keywords in a target language are covered in the machine translation output. A 1-best translation is added to a new text corpus. This is followed by selecting a subset that provides a varied set of translations for a given source transcribed utterance for better translation coverage. Additional translations are added to the new text corpus based on a measure of possible translations having words not yet seen for the selected transcribed utterances, and also based on possible translations having words that are not associated with very many or any semantic tags in the new text corpus. Candidate translations can be selected from a window of N-best translations calculated based on machine translation accuracy.
    Type: Grant
    Filed: July 29, 2011
    Date of Patent: September 16, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Ding Liu
  • Patent number: 8825472
    Abstract: Embodiments are directed towards an automated machine learning framework to extract keywords within a message that are relevant to an attachment to the message. The machine learning model finds a set of relevant sentences within the message determined to be relevant to the one or more attachments based on identification of one or more sentence level features within a given sentence. The sentence level features include, for example, anchor features, noisy sentence features, short message features, threading features, anaphora detections, and lexicon features. From the set of relevant sentences, useful keywords may be extracted using a sequence of heuristics to convert the sentence set into the set of useful keywords. The set of useful keywords may then be associated to at least one attachment such that the keywords may subsequently be used to perform various indexing, searching, sorting, and to provide further context to the attachment.
    Type: Grant
    Filed: May 28, 2010
    Date of Patent: September 2, 2014
    Assignee: Yahoo! Inc.
    Inventor: Aravindam Raghuveer
  • Patent number: 8825470
    Abstract: A system and method of providing a response with different language options for a data communication protocol, such as Session Initiation Protocol, are disclosed. For example, data communication is controlled between at least two endpoints. A response code indicative of a condition of the data communication is transmitted to one of the at least two endpoints. The response code is associated with a reason phrase operable to be displayed at the one of the at least two endpoints in a language selected from an option of a plurality of languages.
    Type: Grant
    Filed: September 27, 2007
    Date of Patent: September 2, 2014
    Assignee: Siemens Enterprise Communications Inc.
    Inventors: Mallikarjuna Samayamantry Rao, Dennis Kucmerowski
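    Illustrative sketch: one simple way to picture a response code whose reason phrase is shown in a selected language is a lookup table keyed by code and language, with a fallback to English. The table `REASON_PHRASES`, the function `status_line`, and the translated phrases are hypothetical and are not taken from the patent.

```python
# Hypothetical table: response code -> language tag -> reason phrase.
REASON_PHRASES = {
    404: {"en": "Not Found", "de": "Nicht gefunden", "es": "No encontrado"},
    486: {"en": "Busy Here", "de": "Besetzt", "es": "Ocupado"},
}


def status_line(code, language="en"):
    """Build a SIP status line with the reason phrase in the requested
    language, falling back to English when no translation is listed."""
    phrases = REASON_PHRASES[code]
    phrase = phrases.get(language, phrases["en"])
    return f"SIP/2.0 {code} {phrase}"


print(status_line(486, "de"))
print(status_line(486, "fr"))  # no French entry, so English is used
```

    The numeric code stays the same across languages, so an endpoint can still act on it regardless of which phrase is displayed.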
  • Patent number: 8825648
    Abstract: Techniques for utilizing data mining technology to extract universal topics with multilingual representations from a multilingual database, and to organize existing or new documents in different languages by analyzing their respective topic distributions.
    Type: Grant
    Filed: April 15, 2010
    Date of Patent: September 2, 2014
    Assignee: Microsoft Corporation
    Inventors: Xiaochuan Ni, Jian-Tao Sun, Zheng Chen, Jian Hu
  • Patent number: 8825466
    Abstract: Systems and methods for automatically modifying an annotated bilingual segment pair are provided. An annotated bilingual segment pair (“Pair”) may be modified to generate improved translation rules used in machine translation of documents from a source language to a target language. Because a single Pair may be used to translate a phrase, many Pairs are used in a machine translation system and manual correction of each model is impractical. Each Pair may be modified by re-labeling syntactic categories within the Pair, re-structuring a tree within the Pair, and/or re-aligning source words to target words within the Pair. In exemplary embodiments, many alternate Pairs (or portions thereof) are generated automatically, rule sequences corresponding to each are derived, and one or more rule sequences are selected. Using the selected rule sequence, a modified Pair is distilled.
    Type: Grant
    Filed: June 8, 2007
    Date of Patent: September 2, 2014
    Assignees: Language Weaver, Inc., University of Southern California
    Inventors: Wei Wang, Jonathan May, Kevin Knight
  • Patent number: 8818792
    Abstract: An apparatus and a method for constructing a verb phrase translation pattern using a bilingual parallel corpus. Each of the apparatus or method recognizes a predicate and an argument using a syntax analysis result and a word alignment result of a source sentence from a plurality of bilingual parallel corpus, extracts a translation pattern candidate and an occurrence frequency using the recognized predicate and argument, and then generates a basic verb phrase translation pattern by verifying the translation pattern candidate, and generalizes the generated basic verb phrase translation pattern to generate a general verb phrase pattern so as to be applied to various language pairs and minimize an error in the verb phrase translation pattern and determine an appropriate generalization level using a co-occurrence frequency of a predicate and an argument of the verb phrase translation pattern and a translation probability of the predicate.
    Type: Grant
    Filed: July 8, 2011
    Date of Patent: August 26, 2014
    Assignee: SK Planet Co., Ltd.
    Inventors: Young Sook Hwang, Sang-Bum Kim, Chang Hao Yin, Hae Chang Rim, Joo Young Lee
  • Patent number: 8810581
    Abstract: A system, method and apparatus are described herein for input of characters into a mobile device. In one implementation, a user can input representations of character strokes of logographic characters, such as Chinese characters, using a trackpad module. The system can then associate the character strokes with a character the user desires to input based on the received inputs and a series of well-known rules for writing the logographic characters. One implementation of the trackpad includes an optical trackpad comprising a plurality of sub-sections that can be used to determine the direction of movement of an object over the optical trackpad, for example, a finger over the optical trackpad.
    Type: Grant
    Filed: October 20, 2010
    Date of Patent: August 19, 2014
    Assignee: BlackBerry Limited
    Inventors: Archer Chi Kwong Wun, Kwok Ching Leung
  • Patent number: 8812303
    Abstract: Indexing and querying in multiple languages is accomplished using an ordered chain of filters and/or other such components. When receiving information to be indexed or for a query, the information can be tokenized and typed based at least in part on the language of each token. The character types can be adjusted if appropriate for the languages, and the tokens can be further segmented using a dictionary for the respective language types. Once appropriate tokens are determined, relevant synonyms in each appropriate language can be determined and typed accordingly. If necessary the case of the tokens and synonyms can be adjusted and further segmented based on punctuation. The terms and synonyms then can be used as part of the index or as part of the search query to include other terms or phrases based on relevance to the original information.
    Type: Grant
    Filed: January 17, 2013
    Date of Patent: August 19, 2014
    Assignee: Amazon Technologies, Inc.
    Inventors: Yuhui Jin, Cheng He, Peng Kang, Yang Wenbo
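    Illustrative sketch: an ordered chain of filters can be modeled as a sequence of functions, each consuming the previous stage's output. The filter names, the `SYNONYMS` dictionary, and the stage order are assumptions for illustration; language typing and dictionary-based segmentation are left out.

```python
import re

SYNONYMS = {"laptop": ["notebook"]}   # hypothetical synonym dictionary


def tokenize(text):
    """Split raw text into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text)


def lowercase(tokens):
    """Normalize the case of every token."""
    return [t.lower() for t in tokens]


def add_synonyms(tokens):
    """Follow each token with any synonyms listed for it."""
    out = []
    for t in tokens:
        out.append(t)
        out.extend(SYNONYMS.get(t, []))
    return out


def run_chain(text, filters=(tokenize, lowercase, add_synonyms)):
    """Apply each filter in order; the same chain can serve both
    indexing and querying."""
    data = text
    for f in filters:
        data = f(data)
    return data


print(run_chain("Cheap Laptop, new!"))
```

    Running the same chain over documents and queries keeps the indexed terms and the search terms comparable.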
  • Patent number: 8812297
    Abstract: Determining synonyms of words in a set of documents. Particularly, when provided with a word or phrase as input, in exemplary embodiments there is afforded the return of a predetermined number of “top” synonym words (or phrases) for an input word (or phrase) in a specific collection of text documents. Further, a user is able to provide ongoing and iterative positive or negative feedback on the returned synonym words, by manually accepting or rejecting such words as the process is underway.
    Type: Grant
    Filed: April 9, 2010
    Date of Patent: August 19, 2014
    Assignee: International Business Machines Corporation
    Inventors: Jeffrey M. Achtermann, Indrajit Bhattacharya, Kevin W. English, Shantanu R. Godbole, Ajay K. Gupta, Ashish Verma
  • Patent number: 8805669
    Abstract: A translation system for translating source text from a first language to target text in a second language. The system comprises a translation memory (TM) module that stores translation segments. The TM module is operable to generate a TM target text output in response to source text. A statistical machine translation (SMT) module is configured to generate translations on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. The SMT module is operable to generate an SMT target text output in response to source text. An extractor is configured to extract features from the TM target text output and the SMT target text output. A vector generator is configured to generate a vector with a unified feature set derived from the extracted features and features associated with the SMT module and the TM module. A recommender is operable to read the vector and determine whether the TM target text output or the SMT target text output is optimum for post editing.
    Type: Grant
    Filed: July 13, 2011
    Date of Patent: August 12, 2014
    Assignee: Dublin City University
    Inventors: Yifan He, Yanjun Ma, Andrew Way, Josef van Genabith
  • Patent number: 8805672
    Abstract: Techniques for client side translation cache prediction are provided. The techniques include obtaining meta data associated with a request, applying a cache prediction model to the meta data to automatically predict one or more translations associated with the request, and storing the one or more translations in a client translation cache.
    Type: Grant
    Filed: June 30, 2011
    Date of Patent: August 12, 2014
    Assignee: International Business Machines Corporation
    Inventors: Sasha P. Caskey, Sameer Maskey
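    Illustrative sketch: the three steps in the abstract above (obtain metadata, predict translations, store them in a client cache) might look as follows. The "model" here is only a hypothetical lookup from metadata values to likely phrases, and `translate` stands in for whatever translation service the client would call; none of these names come from the patent.

```python
def predict_phrases(meta, model):
    """Return the phrases the model associates with any metadata value,
    in first-seen order and without duplicates."""
    phrases = []
    for value in meta.values():
        for phrase in model.get(value, []):
            if phrase not in phrases:
                phrases.append(phrase)
    return phrases


def warm_cache(meta, model, translate, cache):
    """Translate each predicted phrase once and store it client-side."""
    for phrase in predict_phrases(meta, model):
        if phrase not in cache:
            cache[phrase] = translate(phrase)
    return cache


# Hypothetical model and a stand-in translator used only for the demo.
model = {"airport": ["where is the gate", "boarding pass"],
         "morning": ["good morning"]}
meta = {"location": "airport", "time_of_day": "morning"}
cache = warm_cache(meta, model, translate=str.upper, cache={})
print(cache)
```

    Later requests for a predicted phrase can then be answered from `cache` without a round trip to the server.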
  • Patent number: 8805670
    Abstract: The present invention provides an interactive system and method for effective and convenient language translation. The system and method provides a translation window that is opened in conjunction with a Web page window containing Web pages hosted on the Internet. The translation window and Web page window are automatically adjusted in size and position so that they fit on one user-viewable screen without overlapping. The translation window is linked to a translation dictionary database accessible through the Internet which provides accurate and comprehensive definitions of the words that are identified to be translated.
    Type: Grant
    Filed: April 2, 2013
    Date of Patent: August 12, 2014
    Assignee: GlobalEnglish Corporation
    Inventors: Brent E. Pearson, Scott T. Silliman, Peter A. Richter, Samuel N. Neff
  • Patent number: 8805671
    Abstract: Methods and apparatus, including computer program products, implementing and using techniques for translating digital content from a source language to a target language. A message is displayed to a user. The message contains digital content to be translated from the source language to the target language, as well as the context of the digital content in the source language and/or a reference to a context in which the digital content occurs. A proposed translation of the digital content into the target language is received from the user. The proposed translation is submitted to a translation server.
    Type: Grant
    Filed: August 21, 2013
    Date of Patent: August 12, 2014
    Assignee: Google Inc.
    Inventors: Przemyslaw Broniek, Joanna K. Chwastowska, Brendan Clavin, Dawid Duda, Terence Haddock, Marcin Mikosik, Maciej Molerus, Michal Pociecha-Los, Jan Wrobel
  • Patent number: 8799256
    Abstract: Including search result based content in a webpage is disclosed. One or more search criteria and an indication that a search result based content associated with the search criteria is to be included in a web page are received. A computer script or code configured to enable the search result based content to be retrieved in accordance with the search criteria is generated automatically for the web page.
    Type: Grant
    Filed: January 27, 2011
    Date of Patent: August 5, 2014
    Assignee: EMC Corporation
    Inventors: Gary Tang, Igor Shmulevich, Peggy Ringhausen
  • Patent number: 8798986
    Abstract: Portable, real time voice translation systems, and associated methods of use, are provided. The systems include a translation system for use on a single unit, portable computing device and operable for accessing a multilanguage database, selecting a source language from a plurality of source languages and a destination language from a plurality of destination languages, inputting a source phrase, transmitting the source phrase to a speech recognition module, a translation engine, and a template look-up engine for finding the phrase template associated with the destination phrase from among the multiple languages. The spoken translation is then output in the selected destination language. The translation system has a total time between the input of the source phrase and output of the destination phrase that is no slower than 0.010 seconds, and a communications interface operable for communicating with a second computer system.
    Type: Grant
    Filed: December 26, 2012
    Date of Patent: August 5, 2014
    Assignee: NewTalk, Inc.
    Inventors: Robert H. Clemons, Bruce W. Nash, Martha P. Robinson, Craig A. Robinson
  • Patent number: 8793130
    Abstract: A method of generating a confidence measure generator is provided for use in a voice search system, the voice search system including voice search components comprising a speech recognition system, a dialog manager and a search system. The method includes selecting voice search features, from a plurality of the voice search components, to be considered by the confidence measure generator in generating a voice search confidence measure. The method includes training a model, using a computer processor, to generate the voice search confidence measure based on selected voice search features.
    Type: Grant
    Filed: March 23, 2012
    Date of Patent: July 29, 2014
    Assignee: Microsoft Corporation
    Inventors: Ye-Yi Wang, Yun-Cheng Ju, Dong Yu
  • Patent number: 8788260
    Abstract: Systems, methods, and computer storage media having computer-executable instructions embodied thereon that facilitate generation of snippets. In embodiments, text features within a keyword-sentence window are identified. The text features are utilized to determine break features that indicate favorability of breaking at a particular location of the keyword-sentence window. The break features are used to recognize features of partial snippets such that a snippet score to indicate the strength of the partial snippet can be calculated. Snippet scores associated with partial snippets are compared to select an optimal snippet, that is, the snippet having the highest snippet score.
    Type: Grant
    Filed: May 11, 2010
    Date of Patent: July 22, 2014
    Assignee: Microsoft Corporation
    Inventors: Valerie Rose Nygaard, Riccardo Turchetto, Joanna Mun Yee Chan, Christian Biemann, David Dongjah Ahn, Andrea Ryerson Burbank, Feng Pan, Timothy McDonnell Converse, James Michael Reinhold, Tracy Holloway King
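    Illustrative sketch: snippet selection by break features can be pictured as scoring every candidate span around the keyword by how favorable its two boundaries are, then keeping the highest-scoring span. The punctuation-based `break_score`, the additive snippet score, and the `max_len` limit are assumptions for illustration, not the features or scoring used in the patent.

```python
def break_score(token):
    """Favorability of placing a snippet boundary right after this token."""
    if token.endswith((".", "!", "?")):
        return 2.0                      # sentence end: best place to break
    if token.endswith((",", ";", ":")):
        return 1.0                      # clause boundary: acceptable
    return 0.0                          # mid-phrase: unfavorable


def best_snippet(tokens, keyword, max_len=6):
    """Score every span of at most max_len tokens that contains the
    keyword and return the highest-scoring one with its score."""
    k = tokens.index(keyword)
    best, best_score = None, float("-inf")
    for start in range(max(0, k - max_len + 1), k + 1):
        for end in range(k + 1, min(len(tokens), start + max_len) + 1):
            # The start of the text counts as a clean left boundary.
            left = 2.0 if start == 0 else break_score(tokens[start - 1])
            right = break_score(tokens[end - 1])
            score = left + right
            if score > best_score:
                best, best_score = tokens[start:end], score
    return " ".join(best), best_score


tokens = "We met today. The router translates calls. It works well".split()
print(best_snippet(tokens, "router"))
```

    In this example the winning span starts right after a sentence end and stops at the next one, so the snippet reads as a complete sentence.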